Learn how to easily synchronize your MongoDB data into PostgreSQL.
MongoDB is a distributed document database built for modern transactional and analytical applications, and it is well suited to rapidly changing, multi-structured data. PostgreSQL, by contrast, is an SQL database that provides the features expected of a relational database. If you are unsure of the differences between these systems, the MongoDB website has an article that compares PostgreSQL and MongoDB.
Choosing between MongoDB and PostgreSQL may not be your only option. In fact, because each database has different strengths, you may wish to use them side by side. If so, you may need to sync data between them.
Custom building a data pipeline to replicate data from MongoDB to PostgreSQL is time-consuming and tedious. On the other hand, Airbyte is designed exactly for this task. This article will demonstrate how to use Airbyte to replicate and synchronize data from MongoDB to PostgreSQL!
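This tutorial makes use of the following tools: Clever Cloud (to host the MongoDB and PostgreSQL databases), the mongosh and mongoimport command-line tools, the psql client, and Airbyte.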
In this section, you will use Clever Cloud to create a MongoDB instance. Once you sign up, choose the option to create an add-on from your personal space.
From the available list of add-ons, choose the MongoDB add-on.
For the instance size, choose the DEV plan, which is free to use.
Enter an add-on name, select a region, and then click Next.
You should now have a new MongoDB database, along with all the details needed to connect to it. Copy the mongo CLI connection command from the Clever Cloud add-on dashboard.
Replace “mongo” with “mongosh” in the copied command before executing it in your terminal.
You should now be connected to the PRIMARY replica of the MongoDB replica set (indicated by [primary] in the shell).
In the Airbyte connection to MongoDB, you will make use of the URL for the primary replica. This can be retrieved by running rs.isMaster().primary in the MongoDB shell, which will respond with a string in the format of [hostname]:[port]. In our case, the URL returned by this command is n2-c2-mongodb-clevercloud-customers.services.clever-cloud.com:27017.
For this demo, we download a sample restaurant collection and then import it using the mongoimport database tool.
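Because Airbyte asks for the host and port as separate values, it can help to split the returned string. The following is a minimal sketch of that step; the helper name split_host_port is made up for illustration and is not part of MongoDB or Airbyte.

```python
from typing import Tuple


def split_host_port(address: str) -> Tuple[str, int]:
    """Split a 'hostname:port' string into a host and an integer port."""
    # Partition on the last colon so the port is always the final component.
    host, _, port = address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'hostname:port', got {address!r}")
    return host, int(port)


# The value returned by rs.isMaster().primary in this tutorial.
primary = "n2-c2-mongodb-clevercloud-customers.services.clever-cloud.com:27017"
host, port = split_host_port(primary)
print(host)
print(port)
```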
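As a rough sketch of what the import step might look like, the snippet below assembles a mongoimport command line. The connection URI, the file name restaurant.json, and the helper name mongoimport_args are placeholders assumed for illustration; substitute the values from your own Clever Cloud dashboard and the file you downloaded.

```python
import shlex
from typing import List


def mongoimport_args(uri: str, collection: str, json_file: str) -> List[str]:
    """Build the argument list for a mongoimport invocation."""
    return [
        "mongoimport",
        "--uri", uri,                # connection string, including the database name
        "--collection", collection,  # target collection
        "--file", json_file,         # local file holding the sample documents
    ]


# Placeholder values (assumptions, not real credentials).
args = mongoimport_args(
    "mongodb://user:password@example-host:27017/exampledb",
    "restaurant",
    "restaurant.json",
)

# Print the command for review; to run it for real, pass `args` to
# subprocess.run(args, check=True) on a machine with mongoimport installed.
print(shlex.join(args))
```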
Create an Airbyte MongoDB source by choosing Sources from your Airbyte dashboard and clicking the New source button. Then choose MongoDB from the list of sources to open the source configuration form.
To keep this tutorial simple, and for demonstration purposes only, we have selected a Standalone MongoDB instance. However, you may consider selecting one of the alternative MongoDB instance types if you wish to have a more resilient connection to your MongoDB cluster.
Enter the Host, Port, Username, DB Name, and Password that were shown earlier in the Clever Cloud MongoDB configuration UI. Then choose Set up source.
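Airbyte takes these values as separate form fields, but it can be useful to see how they relate to a standard MongoDB connection string, for example to test the same credentials with mongosh first. The sketch below assumes placeholder credentials, and the helper name mongodb_uri is made up for illustration.

```python
from urllib.parse import quote_plus


def mongodb_uri(host: str, port: int, username: str, password: str, db_name: str) -> str:
    """Combine the individual connection fields into a mongodb:// URI."""
    # Usernames and passwords must be percent-encoded in case they contain
    # reserved characters such as '@', ':' or '/'.
    user = quote_plus(username)
    pwd = quote_plus(password)
    return f"mongodb://{user}:{pwd}@{host}:{port}/{db_name}"


# Placeholder values (assumptions, not real credentials).
print(mongodb_uri("example-host", 27017, "demo_user", "p@ss:word", "demo_db"))
```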
To set up a PostgreSQL database, create a new add-on on your Clever Cloud dashboard, and choose PostgreSQL from the available add-ons.
For the plan, choose the DEV option, which provides 256 MB of storage.
Give a name to your add-on, and choose a location.
Click on Next once you are satisfied with your configurations, and then Clever Cloud should show you the PostgreSQL database credentials with information that will be required by Airbyte, including host, user, password, and database name.
To connect to the newly created database, copy the Connection URI and pass it as an argument to the psql CLI tool.
Go to Destinations in your Airbyte dashboard, choose to create a new destination, and select PostgreSQL from the list to open the destination configuration form.
Enter the PostgreSQL parameters that were returned by Clever Cloud, and click Set up destination.
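If you only have the Connection URI at hand, the individual fields for the destination form can be recovered from it. The following is a minimal sketch using Python's standard library; the URI is a placeholder and the helper name parse_postgres_uri is made up for illustration.

```python
from typing import Dict, Union
from urllib.parse import unquote, urlsplit


def parse_postgres_uri(uri: str) -> Dict[str, Union[str, int, None]]:
    """Break a postgresql:// connection URI into its individual fields."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "port": parts.port or 5432,  # fall back to the PostgreSQL default port
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "database": parts.path.lstrip("/"),
    }


# Placeholder URI (an assumption, not real credentials).
fields = parse_postgres_uri("postgresql://demo_user:s3cret@example-host:5432/demo_db")
for name, value in fields.items():
    print(f"{name}: {value}")
```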
Go to Connections in your Airbyte dashboard and choose New connection. Select the source and the destination that you just created to open the connection configuration form.
Airbyte has correctly detected the restaurant collection as a stream, and you can choose how it should be replicated to PostgreSQL. For sync mode, choose one of the available modes. For more information, you may wish to consult the blog post An overview of Airbyte’s replication modes.
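To build some intuition for what the sync modes mean, here is a deliberately simplified, in-memory model of two of them: a full refresh that overwrites the destination, and an incremental sync that appends only records newer than a saved cursor. This is an illustrative sketch and not Airbyte's implementation; the record layout and the updated_at cursor field are assumptions.

```python
from typing import Any, Dict, List, Optional, Tuple

Record = Dict[str, Any]


def full_refresh_overwrite(source: List[Record]) -> List[Record]:
    """Each sync replaces the destination with a fresh copy of the source."""
    return list(source)


def incremental_append(
    source: List[Record],
    destination: List[Record],
    cursor_field: str,
    last_cursor: Optional[int],
) -> Tuple[List[Record], Optional[int]]:
    """Append only records whose cursor value is newer than the saved cursor."""
    new_records = [
        r for r in source
        if last_cursor is None or r[cursor_field] > last_cursor
    ]
    # Remember the highest cursor value seen so the next sync can resume from it.
    new_cursor = max((r[cursor_field] for r in new_records), default=last_cursor)
    return destination + new_records, new_cursor


# Hypothetical sample data.
source = [{"name": "A", "updated_at": 1}, {"name": "B", "updated_at": 2}]
dest, cursor = incremental_append(source, [], "updated_at", None)

# A new record arrives before the second sync; only it is copied this time.
source.append({"name": "C", "updated_at": 3})
dest, cursor = incremental_append(source, dest, "updated_at", cursor)
print(len(dest), cursor)
```

In this model, a full refresh re-reads everything on every run, while the incremental variant trades that simplicity for less data movement by tracking a cursor between runs.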
For Replication frequency, specify the interval between sync runs. Once you are done with the configurations, choose Set up connection and Airbyte will start its first sync. Once complete, you will be able to see how many records were replicated.
Log in to the PostgreSQL host to see the replicated data. Note that you must change the search_path according to the DB Name that you specified when you set up the PostgreSQL destination in Airbyte.
You should now be able to view the replicated data using standard SQL commands.
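As an illustration, the sketch below generates the kind of statements you might paste into psql to preview the data. The schema name demo_schema and the table name restaurant are assumptions; the tables Airbyte actually creates may be named differently, so list them first with \dt in psql.

```python
from typing import List


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def preview_statements(schema: str, table: str, limit: int = 5) -> List[str]:
    """Build statements that switch the search_path and preview a table."""
    return [
        f"SET search_path TO {quote_ident(schema)};",
        f"SELECT * FROM {quote_ident(table)} LIMIT {int(limit)};",
    ]


# Hypothetical schema and table names.
for statement in preview_statements("demo_schema", "restaurant"):
    print(statement)
```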
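In summary, in this tutorial you have learned how to create a MongoDB instance on Clever Cloud and import sample data into it, configure a MongoDB source and a PostgreSQL destination in Airbyte, and set up an Airbyte connection that replicates data from MongoDB to PostgreSQL.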
With Airbyte, the data integration possibilities are endless, and we look forward to seeing you use it! We invite you to join the conversation on our community Slack Channel, participate in discussions on Airbyte’s discourse, or sign up for our newsletter. You may also be interested in other Airbyte tutorials and Airbyte’s blog!