SQL to Mongo Query : A quick guide

Rajesh Vinayagam
17 min readMar 22, 2023

--

Very often the first step in migrating data from SQL to MongoDB is to define a schema that is suitable for MongoDB’s document-based data model. This involves identifying the entities, relationships, and attributes in the data and mapping them to the corresponding document structures in MongoDB. To know more on this you can refer to my previous articles on Analysis and Schema Design

  1. Schema Analysis : Quick Peek on Pre-Migration Analysis in Oracle
  2. Schema Design : Quick guide on Schema Design.

Once the schema has been defined, the next step is to analyze all the queries used in SQL and convert them to MongoDB queries. This process requires a deep understanding of both SQL and MongoDB query languages and the differences between them. Below is a simple methodology of categorising the queries

  1. Identify the queries: Start by identifying all the SQL queries that your application is currently running. This can be done by analyzing your codebase or by using tools like SQL Profiler.
  2. Categorize the queries: Once you have identified the queries, categorize them based on their purpose and complexity. For example, you can group queries based on whether they are used for data retrieval( all select queries) or data updates ( all insertion, deletion, or update queries). This will help you diagnose read heavy vs write heavy application.
  3. Query patterns refer to commonly used ways of querying data from a database, including single table queries, simple and complex joins involving two or more tables, and other types of queries.

SQL Profiler is a tool used to monitor and analyze SQL Server database activity. It allows users to capture and analyze SQL Server events such as SQL queries, stored procedure executions, and system events.

It is important to note that not all SQL queries can be directly translated to MongoDB queries, and some queries may need to be restructured as per the refined Schema in MongoDB.

Once you have defined the MongoDB schema, you can begin to convert the stored procedures, queries, and custom functions from SQL to MongoDB. In the below article we will take into consideration some of the commonly used SQL operators and their equivalent in MongoDB.

SQL to Mongo Query

In order to rewrite a SQL query for MongoDB, it’s important to identify the common query operators and functions used in both systems. This can help ensure that the rewritten query produces the same results as the original SQL query.

Query Keywords

Below are some commonly used MSSQL query keywords and their equivalents in MongoDB.

MS SQL

  • SELECT: used to specify which columns to include in the query results
  • AS: used to alias a column or table with a different name
  • FROM: used to specify the table(s) from which to select data
  • WHERE: used to filter the query results based on specified conditions
  • GROUP BY: used to group the query results based on one or more columns
  • HAVING: used to filter the grouped query results based on specified conditions
  • ORDER BY: used to sort the query results based on one or more columns
  • JOIN: used to combine rows from two or more tables based on a related column
  • INNER JOIN: used to combine rows from two tables where the join condition is met
  • LEFT JOIN: used to combine all rows from the left table with matching rows from the right table where the join condition is met
  • RIGHT JOIN: used to combine all rows from the right table with matching rows from the left table where the join condition is met
  • FULL OUTER JOIN: used to combine all rows from both tables, including those that do not have a match in the other table
  • ON: used to specify the join condition for a JOIN
  • UNION: used to combine the results of two or more SELECT statements into a single result set, excluding duplicates
  • UNION ALL: used to combine the results of two or more SELECT statements into a single result set, including duplicates
  • LIMIT: used to limit the number of rows returned in a result set
  • OFFSET: used to specify the starting point for a result set when using LIMIT
  • INSERT INTO: used to insert new rows into a table
  • UPDATE: used to modify existing rows in a table
  • DELETE FROM: used to delete rows from a table

MongoDB

  • SELECT: $project — used to specify which fields to include or exclude in the query results.
  • AS: $project (with field alias) — used to alias a field with a different name
  • WHERE: $match — used to filter the query results based on specified conditions.
  • GROUP BY: $group — used to group the query results based on one or more fields.
  • HAVING: $match (after group with aggregation) — used to filter the grouped query results based on specified conditions.
  • ORDER BY: $sort -used to sort the query results based on one or more fields.
  • JOIN, INNER JOIN: $lookup — used to combine documents from two or more collections based on a related field.
  • INNER JOIN: -used to combine documents from two collections where the join condition is met.
  • LEFT JOIN: $lookup (with from)- used to combine all documents from the left collection with matching documents from the right collection where the join condition is met.
  • RIGHT JOIN: $lookup (with localField and foreignField reversed) -used to combine all documents from the right collection with matching documents from the left collection where the join condition is met.
  • FULL OUTER JOIN: $lookup (with from and localField and foreignField reversed) — used to combine all documents from both collections, including those that do not have a match in the other collection.
  • UNION: $unionWith — used to combine the results of two or more queries into a single result set, excluding duplicates.
  • UNION ALL: $unionWith — used to combine the results of two or more queries into a single result set, including duplicates.
  • LIMIT: $limit — used to limit the number of documents returned in a query result set.
  • OFFSET: $skip -used to specify the starting point for a query result set when using $limit.
  • INSERT INTO: insertOne, insertMany -used to insert one or many documents into a collection.
  • UPDATE: updateOne, updateMany -used to update one or many documents in a collection.
  • DELETE FROM: deleteOne, deleteMany — used to delete one or many documents from a collection.

Query Operators

Below are some commonly used SQL operators and their equivalents in MongoDB.

Comparison Operators

Comparison Operators used to compare values in a query and return a boolean value (TRUE or FALSE)

  • =: equal to
  • <> or !=: not equal to
  • <: less than
  • >: greater than
  • <=: less than or equal to
  • >=: greater than or equal to

In MongoDB, the equivalent operators for comparison operations are:

  • = : $eq — used to match documents where the value of a field equals a specified value.
  • <> or != : $ne — used to match documents where the value of a field is not equal to a specified value.
  • < : $lt — used to match documents where the value of a field is less than a specified value.
  • > : $gt — used to match documents where the value of a field is greater than a specified value.
  • <= : $lte — used to match documents where the value of a field is less than or equal to a specified value.
  • >= : $gte — used to match documents where the value of a field is greater than or equal to a specified value.

Logical Operators

Logical Operators: used to combine multiple conditions in a query and return a boolean value

  • AND: returns TRUE if both conditions are TRUE
  • OR: returns TRUE if either condition is TRUE
  • NOT: negates the result of a condition (returns TRUE if the condition is FALSE, and vice versa)

In MongoDB, the equivalent operators for logical operations are:

  • AND: $and — used to combine multiple expressions into a logical AND.
  • OR: $or -used to combine multiple expressions into a logical OR.
  • NOT: $not — used to invert the result of an expression.

Arithmetic Operators

Arithmetic Operators used to perform arithmetic operations on numerical data

  • +: addition
  • -: subtraction
  • *: multiplication
  • /: division
  • %: modulo (returns the remainder of a division operation)

In MongoDB, the equivalent operators for basic arithmetic operations are:

  • +: $add -used to add numbers or arrays of numbers.
  • -: $subtract — used to subtract numbers or arrays of numbers.
  • *: $multiply — used to multiply numbers or arrays of numbers.
  • /: $divide — used to divide numbers.
  • %: $mod — used to calculate the remainder of a division operation.

Bitwise Operators

Bitwise Operators: used to perform bitwise operations on binary data

  • &: bitwise AND
  • |: bitwise OR
  • ^: bitwise XOR
  • ~: bitwise NOT

In MongoDB, we don’t have corresponding bitwise operators but has bitwise operators that can be leveraged

$bitsAllClear,$bitsAllSet,$bitsAnyClear,$bitsAnySet

Array Operators

In SQL, some common array operators are:

  • IN — used to check if a value is in a list of values.
  • NOT IN — used to check if a value is not in a list of values.
  • ANY — used to check if a value is in any of the elements of an array.
  • ALL — used to check if a value is in all of the elements of an array.

In MongoDB, arrays are a native data type and there are several operators available to work with them. The equivalent MongoDB array operators for the SQL operators listed above are:

  • $in — used to check if a value is in a list of values.
  • $nin — used to check if a value is not in a list of values.
  • $elemMatch — used to check if a value matches any of the elements of an array.
  • $all — used to check if a value matches all of the elements of an array.

Query Functions

SQL provides a wide range of built-in functions that can be used to perform various operations on data in a database. Some of the most commonly used types of functions in MSSQL include:

Aggregate Functions

MSSQL provides several aggregate functions that can be used to perform calculations on groups of rows in a table. Below are some commonly used aggregate functions in MSSQL:

  • COUNT: This function is used to count the number of rows in a table or the number of non-null values in a specific column.
  • SUM: This function is used to calculate the sum of values in a specific column.
  • AVG: This function is used to calculate the average of values in a specific column.
  • MAX: This function is used to find the maximum value in a specific column.
  • MIN: This function is used to find the minimum value in a specific column.

MongoDB provides a similar aggregation function that can be used to perform calculations on groups of documents in a collection.

  • $count: This function is used to count the number of documents in a collection that match a specified condition.
  • $sum: This function is used to calculate the sum of values in a specific field in a collection.
  • $avg: This function is used to calculate the average of values in a specific field in a collection.
  • $max: This function is used to find the maximum value in a specific field in a collection.
  • $min: This function is used to find the minimum value in a specific field in a collection.

Convert Functions

In MSSQL, the CONVERT function is used to convert an expression of one data type to another. Below are some commonly used conversion functions in MSSQL:

  • CONVERT: This function is used to convert an expression of one data type to another.
  • CAST: This function is used to convert an expression of one data type to another.
  • PARSE: This function is used to convert a string to a specific data type.
  • TRY_CONVERT: This function is used to convert an expression of one data type to another. If the conversion fails, it returns NULL.

MongoDB provides similar functions that can be used to convert data types.

  • $toInt: This function is used to convert a value to an integer data type.
  • $toLong: This function is used to convert a value to a long data type.
  • $toDouble: This function is used to convert a value to a double data type.
  • $toDecimal: This function is used to convert a value to a decimal data type.
  • $toString: This function is used to convert a value to a string data type.
  • $toDate: This function is used to convert a string value to a date data type.
  • $toBool: This function is used to convert a value to a boolean data type.

String Functions

MSSQL provides a variety of string functions that can be used to manipulate and transform string data. Below are some commonly used string functions in MSSQL:

  • CONCAT: This function is used to concatenate two or more strings into a single string.
  • SUBSTRING: This function is used to extract a substring from a larger string.
  • CHARINDEX: This function is used to find the position of a specified substring within a larger string.
  • LEN: This function is used to return the length of a string.
  • REPLACE: This function is used to replace all occurrences of a specified substring in a larger string with another string.
  • LOWER: This function is used to convert all characters in a string to lowercase.
  • UPPER: This function is used to convert all characters in a string to uppercase.
  • LTRIM: This function is used to remove leading spaces from a string.
  • RTRIM: This function is used to remove trailing spaces from a string.

MongoDB provides similar string functions that can be used to manipulate and transform string data. Below are some commonly used string functions in MongoDB:

  • $concat: This function is used to concatenate two or more strings into a single string.
  • $substr: This function is used to extract a substring from a larger string.
  • $indexOfBytes: This function is used to find the position of a specified substring within a larger string.
  • $strLenCP: This function is used to return the length of a string.
  • $replaceAll: This function is used to replace all occurrences of a specified substring in a larger string with another string.
  • $toLower: This function is used to convert all characters in a string to lowercase.
  • $toUpper: This function is used to convert all characters in a string to uppercase.
  • $trim: This function is used to remove leading and/or trailing spaces from a string.

DateTime Functions

MSSQL provides a number of built-in functions that can be used to manipulate and format date and time data. Below are some commonly used date and time functions in MSSQL:

  • GETDATE(): This function returns the current system date and time.
  • DATEPART(): This function returns a specific part of a date and time, such as the year, month, or day.
  • DATEDIFF(): This function calculates the difference between two dates or times, in a specified unit of time such as days, hours, minutes, or seconds.
  • DATEADD(): This function adds a specified number of units to a date or time value.
  • YEAR(): This function returns the year of a specified date or time value.
  • MONTH(): This function returns the month of a specified date or time value.
  • DAY(): This function returns the day of the month of a specified date or time value.

MongoDB provides similar built-in functions that can be used to manipulate and format date and time data.

  • new Date(): This function returns the current date and time in ISODate format.
  • $dateToString: This operator converts a date to a string in a specified format.
  • $year, $month, $dayOfMonth: These operators extract the year, month, and day of the month from a date.
  • $dateFromParts: This operator constructs a date from individual parts such as year, month, and day.
  • $dateAdd: This operator adds a specified number of units to a date.
  • $dateSubtract: This operator subtracts a specified number of units from a date.
  • $toDate: This operator converts a string to a date.
  • $isoWeekYear, $isoWeek, $dayOfWeek: These operators extract the ISO week year, ISO week, and day of the week from a date.

Mathematical Functions

MSSQL provides a variety of arithmetic functions that can be used to perform mathematical calculations on numeric data. Below dare some of the commonly used arithmetic functions in MSSQL:

  • ABS: This function is used to return the absolute value of a number. For example, ABS(-5) returns 5.
  • CEILING: This function is used to round a number up to the nearest integer. For example, CEILING(3.5) returns 4.
  • FLOOR: This function is used to round a number down to the nearest integer. For example, FLOOR(3.5) returns 3.
  • ROUND: This function is used to round a number to a specified number of decimal places. For example, ROUND(3.14159, 2) returns 3.14.
  • SQRT: This function is used to return the square root of a number. For example, SQRT(16) returns 4.
  • POWER: This function is used to raise a number to a specified power. For example, POWER(2, 3) returns 8.
  • RAND: This function is used to generate a random number between 0 and 1. For example, RAND() returns a random number between 0 and 1.

MongoDB provides a similar arithmetic operators that can be used to perform mathematical operations on numeric data.

  • $abs: This operator returns the absolute value of a number.
  • $ceil: This operator rounds a number up to the nearest integer.
  • $floor: This operator rounds a number down to the nearest integer.
  • $round: This operator rounds a number to a specified number of decimal places.
  • $sqrt: This operator returns the square root of a number.
  • $pow: This operator raises a number to a specified power.
  • $rand: returns a random float between 0 and 1.

Control Flow statements

In SQL, control flow statements like IF, CASE, and WHILE are used to conditionally execute code or perform iterative tasks. MongoDB, achieves the similar functionality using Aggregation Framework.

IF Statement

SQL

This example demonstrates a simple IF statement, it checks if the "quantity" field is greater than 0 and assigns the value "In Stock" or "Out Of Stock" to the "stock_status" field accordingly.

SELECT name,
IF(quantity > 0, 'In Stock', 'Out of Stock') AS stock_status
FROM products;

In MongoDB, you can use the $cond operator within the aggregation framework to implement conditional logic and get the similiar results

MongoDB

The equivalent MongoDB query using the $cond operator within the $project stage.

db.products.aggregate([
{
$project: {
name: 1,
stock_status: {
$cond: {
if: { $gt: [ "$quantity", 0 ] },
then: "In Stock",
else: "Out of Stock"
}
}
}
}
])

In this MongoDB query, we use the $project stage to include the "name" field and create a new field called "stock_status". The $cond operator is used to conditionally assign values to the "stock_status" field based on the comparison of the "quantity" field with 0. If the "quantity" is greater than 0, the value "In Stock" is assigned; otherwise, "Out of Stock" is assigned.

CASE Statement

SQL

In this example, the CASE statement evaluates the value of the “Salary” column for each row. Depending on the salary value, it assigns the corresponding category (‘Low Salary’, ‘Medium Salary’, or ‘High Salary’) to the new column “SalaryCategory.”

SELECT EmployeeID,
FirstName,
Salary,
CASE
WHEN Salary < 50000 THEN 'Low Salary'
WHEN Salary >= 50000 AND Salary < 100000 THEN 'Medium Salary'
ELSE 'High Salary'
END AS SalaryCategory
FROM Employees;

MongoDB

In MongoDB, the equivalent of the CASE statement in SQL can be achieved using the $switch operator within the Aggregation Framework.

db.Employees.aggregate([
{
$project: {
EmployeeID: 1,
FirstName: 1,
Salary: 1,
SalaryCategory: {
$switch: {
branches: [
{ case: { $lt: ["$Salary", 50000] }, then: "Low Salary" },
{ case: { $and: [{ $gte: ["$Salary", 50000] }, { $lt: ["$Salary", 100000] }] }, then: "Medium Salary" },
],
default: "High Salary"
}
}
}
}
]);

The $project stage is used to select the desired fields and create the computed field "SalaryCategory" using the $switch operator. The $switch operator evaluates different conditions using the $lt , $gteand $and operators, similar to the SQL CASE statement.

WHILE Statement

SQL

In SQL, cursor-based iteration allows you to process rows one at a time in a result set. Cursors provide a way to move through the result set sequentially, perform operations on each row, and retrieve data as needed.

DECLARE @employee_id INT;
DECLARE @salary DECIMAL(10,2);

DECLARE employee_cursor CURSOR FOR
SELECT employee_id, salary
FROM employees;

OPEN employee_cursor;

FETCH NEXT FROM employee_cursor INTO @employee_id, @salary;

WHILE @@FETCH_STATUS = 0
BEGIN
-- Process the current row
-- Perform operations using @employee_id and @salary

FETCH NEXT FROM employee_cursor INTO @employee_id, @salary;
END

CLOSE employee_cursor;
DEALLOCATE employee_cursor;

In this SQL example, a cursor named employee_cursor is declared and opened, and the SELECT statement defines the result set that the cursor will iterate over. The FETCH NEXT statement retrieves the values of employee_id and salary into variables, and the WHILE loop processes each row in the result set until there are no more rows (@@FETCH_STATUS = 0).

MongoDB

MongoDB does not provide direct cursor-based iteration like SQL. By combining the querying capabilities of MongoDB with the cursor-based iteration in your programming language, you can achieve similar functionality to SQL’s cursor-based iteration in a MongoDB environment.

import com.mongodb.MongoClient;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoCursor;
import com.mongodb.client.MongoDatabase;
import org.bson.Document;

public class MongoDBCursorIterationExample {
public static void main(String[] args) {
// Establish connection to MongoDB
MongoClient mongoClient = new MongoClient("localhost", 27017);
MongoDatabase database = mongoClient.getDatabase("your_database");
MongoCollection<Document> collection = database.getCollection("your_collection");

// Query documents
MongoCursor<Document> cursor = collection.find().iterator();

// Iterate over the cursor
while (cursor.hasNext()) {
Document document = cursor.next();
// Process the current document
// Perform operations using the fields of the document
int employeeId = document.getInteger("employee_id");
double salary = document.getDouble("salary");

// Rest of your processing logic goes here
}

// Close the cursor and the connection
cursor.close();
mongoClient.close();
}
}

In this Java example, the MongoDB Java Driver is used to establish a connection to the MongoDB server, access the desired database, and retrieve a cursor to iterate over the result set. The collection.find().iterator() query retrieves all documents from the specified collection and returns a cursor. The while loop then iterates over each document in the cursor, allowing you to access the document fields and perform operations as needed.

In some cases where you may need to iterate over a collection of documents, you can leverage the $map or $reduceoperator within the aggregation framework to perform transformations on each document in an array field.

Sample Query Conversion

Below we will see some examples of SQL queries and their equivalent MongoDB queries.

Query 1

Below is a sample query with the simple join of two tables and later grouping by certain fields to return only the results with some filter using HAVING.

SELECT i.INVOICE_NUM, i.SUPPLIER_NUM, SUM(l.INVOICE_LINE_AMOUNT)
FROM AP_INVOICES i
JOIN AP_INVOICE_LINES l ON i.INVOICE_ID = l.INVOICE_ID
GROUP BY i.INVOICE_NUM, i.SUPPLIER_NUM
HAVING SUM(l.INVOICE_LINE_AMOUNT) > 1000;

The corresponding MongoDB query with $lookup, $match, $group and $project stages in aggregation pipeline

db.ap_invoices.aggregate([
{
$lookup: {
from: "ap_invoice_lines",
localField: "invoice_id",
foreignField: "invoice_id",
as: "invoice_lines"
}
},
{
$group: {
_id: { invoice_num: "$invoice_num", supplier_num: "$supplier_num" },
total_amount: { $sum: "$invoice_lines.invoice_line_amount" }
}
},
{
$match: { total_amount: { $gt: 1000 } }
},
{
$project: {
_id: 0,
invoice_num: "$_id.invoice_num",
supplier_num: "$_id.supplier_num",
total_amount: 1
}
}
])

Query 2

Below is a sample query where we join using more than two tables and later grouping by certain fields to return only the results in sorted manner with some filter using HAVING and ORDER BY

SELECT i.INVOICE_NUM, s.SUPPLIER_NAME, i.SUPPLIER_NUM, TO_CHAR(i.INVOICE_DATE, 'MM/DD/YYYY') AS INVOICE_DATE, SUM(l.INVOICE_LINE_AMOUNT) AS "Invoice Amount"
FROM AP_INVOICES i
JOIN AP_INVOICE_LINES l ON i.INVOICE_ID = l.INVOICE_ID
JOIN AP_SUPPLIERS s ON i.SUPPLIER_ID = s.SUPPLIER_ID
WHERE i.INVOICE_DATE >= SYSDATE - 7
GROUP BY i.INVOICE_NUM, s.SUPPLIER_NAME, i.SUPPLIER_NUM, i.INVOICE_DATE
HAVING SUM(l.INVOICE_LINE_AMOUNT) > 1000
ORDER BY SUM(l.INVOICE_LINE_AMOUNT) DESC, INVOICE_NUM ASC;

The corresponding MongoDB query with $lookup, $match, $group and $project and $sort stages in aggregation pipeline

db.ap_invoices.aggregate([
{
$lookup: {
from: "ap_invoice_lines",
localField: "invoice_id",
foreignField: "invoice_id",
as: "invoice_lines"
}
},
{
$lookup: {
from: "ap_suppliers",
localField: "supplier_id",
foreignField: "supplier_id",
as: "supplier"
}
},
{
$match: {
invoice_date: {
$gte: ISODate(new Date(Date.now() - 7 * 24 * 60 * 60 * 1000))
}
}
},
{
$group: {
_id: { invoice_num: "$invoice_num", supplier_name: "$supplier.supplier_name", supplier_num: "$supplier_num", invoice_date: "$invoice_date" },
total_amount: { $sum: "$invoice_lines.invoice_line_amount" }
}
},
{
$match: { total_amount: { $gt: 1000 } }
},
{
$project: {
_id: 0,
invoice_num: "$_id.invoice_num",
supplier_name: "$_id.supplier_name",
supplier_num: "$_id.supplier_num",
invoice_date: { $dateToString: { format: "%m/%d/%Y", date: "$_id.invoice_date" } },
"Invoice Amount": "$total_amount"
}
},
{
$sort: { "Invoice Amount": -1, invoice_num: 1 }
}
])

Conclusion

Converting SQL queries to MongoDB queries can be a challenging task due to the fundamental differences in the data models and query languages. However, with a solid understanding of both SQL and MongoDB, it is possible to convert most SQL queries to their MongoDB equivalent.

Some key considerations include understanding the differences between relational and document-based data models to come-up with refined schema and later using the appropriate MongoDB query operators.

This article highlights some of the most commonly used SQL operators and their corresponding MongoDB operators, there are cases where the translation can be done directly and other cases where you need to take into consideration the MongoDB schema.

Happy Migrations!!!

--

--

No responses yet