Azure Cosmos DB for MongoDB : Data Migration from Azure Cosmos DB to Atlas — Part 2

Rajesh Vinayagam
5 min read · Mar 13, 2021

In the first part of this article we created an Azure Cosmos DB for Mongo API account populated with test data, and then used a .NET console application with the Mongo driver to perform some basic CRUD operations.

In this article we will see how to migrate data from Azure Cosmos DB to an Atlas cluster, and then show that simply switching the connection string is enough for the application to run its CRUD operations against the Atlas cluster.

Prerequisites

MongoDB Atlas (Azure)

  1. Create an Atlas account.
  2. Create an organisation with one project.
  3. Create an Atlas cluster (the peering option is available only on M10 and above).
  4. Create a DB user with access to any database.

Create Atlas Cluster

Under the project, choose Create Cluster and follow the wizard. Select the required cloud provider (this article uses Azure) and then choose the required region.


For Cluster Tier use M0 as it is free forever.

Cluster creation takes about 10 minutes. Once the cluster is created, let's create a DB user.

Add a New DB User

Add a DB user with read and write access to the database. This user will be used in the connection string for connecting to the database.

DBUser: Add a DB user for connecting to the database from the application.

Data Export

The current data in Azure Cosmos DB consists of two collections in the test database:

  1. example: This collection has 1079 documents.
  2. PersonCollection: This collection has 2 documents.

Use mongodump to export data from Azure Cosmos DB for Mongo API.

mongodump --host <HOST>:10255 --authenticationDatabase test -u <USERNAME> -p <PRIMARY PASSWORD> --ssl --sslAllowInvalidCertificates

Note: Replace <HOST>, <USERNAME>, and <PRIMARY PASSWORD> with the actual values shown in Figure 1.
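When the export has to be scripted (for example from a scheduled job), the same command can be assembled programmatically. The sketch below is a minimal illustration in Python and not part of the original walkthrough: the helper names build_mongodump_args and run_mongodump are hypothetical, the host and credentials are placeholders, and the flags are the ones used in the command above plus mongodump's --out option.

```python
import subprocess


def build_mongodump_args(host, username, password, port=10255,
                         auth_db="test", out_dir="dump"):
    """Return the mongodump argument list used in this article.

    The helper name and defaults are illustrative assumptions; the flags
    mirror the command shown above.
    """
    return [
        "mongodump",
        "--host", f"{host}:{port}",
        "--authenticationDatabase", auth_db,
        "-u", username,
        "-p", password,
        "--ssl",
        "--sslAllowInvalidCertificates",
        "--out", out_dir,  # "dump" is also mongodump's default output folder
    ]


def run_mongodump(args):
    """Run mongodump and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(args, check=True)


if __name__ == "__main__":
    # Placeholder values: replace with the real host, user name and key.
    args = build_mongodump_args("<HOST>", "<USERNAME>", "<PRIMARY PASSWORD>")
    print(" ".join(args))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the Cosmos DB key contains characters that the shell would otherwise interpret.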

Running the above command prints the number of documents exported from each collection.

The output is written to a dump folder, with a separate subfolder for each database. In this example we have only one database, named test, so the files are written under the test folder, with each collection having one BSON file and one metadata file, e.g. example.bson and example.metadata.json.
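Before restoring, it can be reassuring to check that the dump files hold the expected number of documents. The sketch below is one possible way to do that in Python using only the standard library; the function names are hypothetical, and it assumes an uncompressed dump (mongodump run without --gzip). It relies on the BSON format itself: every document starts with a little-endian 32-bit integer giving its total size in bytes.

```python
import struct
from pathlib import Path


def count_bson_documents(path):
    """Count the documents in a .bson dump file without decoding them.

    Each BSON document starts with a little-endian int32 holding its total
    size in bytes (including those four bytes), so the file can be walked
    by jumping from one length prefix to the next.
    """
    count = 0
    with open(path, "rb") as handle:
        while True:
            prefix = handle.read(4)
            if not prefix:
                break  # clean end of file
            if len(prefix) < 4:
                raise ValueError("truncated length prefix")
            (size,) = struct.unpack("<i", prefix)
            if size < 5:
                raise ValueError(f"invalid document size: {size}")
            body = handle.read(size - 4)
            if len(body) < size - 4:
                raise ValueError("truncated document")
            count += 1
    return count


def count_dump(db_dir):
    """Map collection name -> document count for every .bson file in db_dir."""
    return {
        bson_file.stem: count_bson_documents(bson_file)
        for bson_file in sorted(Path(db_dir).glob("*.bson"))
    }
```

For the data used in this article, calling count_dump("dump/test") should report 1079 documents for example and 2 for PersonCollection.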

Data Import

Use the mongorestore command to restore the data exported from Azure Cosmos DB to Atlas.

mongorestore --uri=<connectionstring> -u <username> -p <password> dump

Once the command has run successfully, the documents are restored to Atlas. mongorestore reports the total number of documents restored.

The Atlas portal will now show the restored collections.
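Beyond the totals printed by mongorestore, the restore can be double-checked by comparing per-collection counts between the source and the target. The following is a minimal sketch rather than a prescribed procedure: fetch_counts and compare_counts are hypothetical helper names, and fetch_counts assumes the third-party pymongo package is installed and that both endpoints are reachable from the machine running it.

```python
def fetch_counts(uri, db_name):
    """Return {collection name: document count} for one database.

    Assumes the third-party pymongo package is installed; it is imported
    here so the comparison helper below works without it.
    """
    from pymongo import MongoClient

    client = MongoClient(uri)
    try:
        db = client[db_name]
        return {name: db[name].count_documents({})
                for name in db.list_collection_names()}
    finally:
        client.close()


def compare_counts(source, target):
    """List (collection, source_count, target_count) for every mismatch."""
    mismatches = []
    for name in sorted(set(source) | set(target)):
        src, dst = source.get(name), target.get(name)
        if src != dst:
            mismatches.append((name, src, dst))
    return mismatches


if __name__ == "__main__":
    # Counts taken from this article; in practice both dictionaries would
    # come from fetch_counts(<connection string>, "test").
    expected = {"example": 1079, "PersonCollection": 2}
    restored = {"example": 1079, "PersonCollection": 2}
    print(compare_counts(expected, restored))  # prints []
```

An empty list means every collection has the same number of documents on both sides; a collection missing on one side shows up with None as its count.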

Application Migration

Download the code from the Git location below. This is the same code that was used in Part 1 of this series to connect to Azure Cosmos DB using the Mongo API.

Connection String

Update the connection string with the connection string from Atlas and run the application.

Updating the connection string alone is enough for the application to connect to Atlas and retrieve data seamlessly.

Note: For this demo the connection is established by whitelisting the required IPs; ideally this would be done with peering.
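The switch-over is easiest when the connection string lives in configuration rather than in code. The article's application is a .NET console app; the sketch below only illustrates the idea in Python. The environment variable name MONGO_CONNECTION_STRING and both function names are hypothetical, the host suffixes are assumptions based on the default Atlas and Cosmos DB host names, and credentials in the URI are assumed to be URL-encoded.

```python
import os
from urllib.parse import urlsplit

# Host suffixes used to recognise each service (an assumption based on the
# default host names; custom DNS names would not match).
ATLAS_SUFFIX = ".mongodb.net"
COSMOS_SUFFIXES = (".mongo.cosmos.azure.com", ".documents.azure.com")


def get_connection_string(env_var="MONGO_CONNECTION_STRING"):
    """Read the connection string from the environment.

    The variable name is a hypothetical choice for this sketch.
    """
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set")
    return value


def describe_target(uri):
    """Return 'atlas', 'cosmos' or 'other' based on the host in the URI."""
    host = (urlsplit(uri).hostname or "").lower()
    if host.endswith(ATLAS_SUFFIX):
        return "atlas"
    if host.endswith(COSMOS_SUFFIXES):
        return "cosmos"
    return "other"
```

Logging the result of describe_target(get_connection_string()) at start-up is a cheap way to confirm which backend the application is actually talking to after the switch.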

Scenarios

Scenario 1: Cosmos DB to Atlas hosted in Azure

In this experiment the amount of data migrated is very small, so we can connect over the internet and run the commands from a local PC. In production, when the volume of data is large (in GBs), we can instead create a VM in the same region as Cosmos DB, mount blob storage as a drive, run mongodump to write the dump files to blob storage, and then use mongorestore to migrate the data to Atlas.

Scenario 2: Cosmos DB to Atlas hosted on another cloud platform (GCP)

Scenario 2 involves moving a large volume of data across clouds. We can still follow the same steps as in Scenario 1: create a VM in the same region as Cosmos DB and mount blob storage as a drive to hold the data exported with mongodump.

Optionally, we can use the data transfer service from Google Cloud to copy the data from Blob Storage to Cloud Storage, and then create a VM in the same GCP region as Atlas to restore the data. This can avoid some of the network latency incurred when running mongorestore directly from an Azure VM. Again, this is optional; options such as direct VPN connectivity from Azure to GCP can also be used.

Below is a short video showcasing the data migration service in GCP:

https://youtu.be/516WeME6DQo

This article covered a simple migration of data from Azure Cosmos DB to Atlas hosted in Azure or GCP. In the next article, Part 3, we will explore more complex migrations with sharded data.
