Create a connection in Airbyte OSS to synchronize data from Microsoft SQL Server to Snowflake.
Microsoft SQL Server has ranked among the top three databases in the DBMS rankings for the last two years. It is designed for operational use cases, not for big data and analytics. A platform such as Snowflake, on the other hand, is purpose-built for big data and analytics. Airbyte can replicate your data from SQL Server to Snowflake, which lets you use each system for its strengths.
Benefits of moving data from Microsoft SQL Server to Snowflake
Moving data from MSSQL to Snowflake may be part of an overall data integration strategy, which will provide your organization with:
In addition to the benefits listed above, Snowflake is designed for storing massive amounts of data. MSSQL replication into Snowflake can therefore also be used for backups, or for archiving historical MSSQL data to meet compliance or regulatory requirements.
What you will learn in this tutorial
Airbyte makes it easy to replicate data from Microsoft SQL Server to Snowflake. This tutorial will go through the steps required to set up an Airbyte OSS connection, which will copy data from MSSQL to Snowflake. Because of the similarity between Airbyte Cloud and Airbyte Open-Source, the instructions should be applicable to both platforms.
Let's get started!
In this tutorial Airbyte OSS will be used to replicate your Microsoft SQL Server data to Snowflake. You will therefore need the following prerequisites:
Depending on your operating system, you can either use the Microsoft SQL Server Docker image or install SQL Server on Windows by downloading the .exe installer. In this example, we will set up SQL Server on macOS using Docker. First, download the latest version of the Microsoft SQL Server image by running the following command.
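A minimal sketch of the pull step. The image tag shown here is an assumption; substitute the tag for the SQL Server version you want to run.

```shell
# Image tag is an assumption; choose the tag that matches the version you want.
MSSQL_IMAGE="mcr.microsoft.com/mssql/server:2019-latest"

# Pull the image only if the Docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  docker pull "$MSSQL_IMAGE"
else
  echo "Docker CLI not found; install Docker first." >&2
fi
```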
Once the image is downloaded, choose a password, set it in the command, and start an instance by running the following command.
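A sketch of what starting the container might look like. The container name `mssql`, the image tag, and the placeholder password are all assumptions made for illustration; replace the password with your own before running.

```shell
MSSQL_IMAGE="mcr.microsoft.com/mssql/server:2019-latest"   # assumed image tag
CONTAINER="mssql"                                           # hypothetical container name

# Placeholder password, used only if MSSQL_SA_PASSWORD is not already set.
# SQL Server expects at least 8 characters drawn from several character
# classes (upper case, lower case, digits, symbols).
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"

# Start the container in the background and expose the default port 1433.
if command -v docker >/dev/null 2>&1; then
  docker run -d --name "$CONTAINER" \
    -e "ACCEPT_EULA=Y" \
    -e "MSSQL_SA_PASSWORD=$MSSQL_SA_PASSWORD" \
    -p 1433:1433 \
    "$MSSQL_IMAGE"
fi
```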
You can also run T-SQL commands inside the container by connecting with sqlcmd, a command-line shell for SQL Server. Again, use the same password configured in the previous step.
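One way to open a sqlcmd session inside the container is sketched below. The container name and the sqlcmd path are assumptions: some newer image builds ship the tool under `/opt/mssql-tools18/bin/sqlcmd`, where the `-C` flag may also be needed to trust the server's self-signed certificate.

```shell
CONTAINER="mssql"                      # hypothetical container name from the run step
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

# Open an interactive session only when attached to a terminal
# and the container is actually running.
if [ -t 0 ] && docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec -it "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD"
fi
```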
Create a new database by running the following commands:
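As an illustration, the statement can be written to a file and piped into sqlcmd. The database name `airbyte_demo`, the container name, and the sqlcmd path are hypothetical choices for this sketch.

```shell
CONTAINER="mssql"                      # hypothetical container name
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

# Write the T-SQL to a file so it can be reviewed before it is run.
cat > create_database.sql <<'SQL'
-- "airbyte_demo" is a hypothetical database name.
CREATE DATABASE airbyte_demo;
GO
SQL

# Run the file only if the container is up.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec -i "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD" < create_database.sql
fi
```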
You can verify that the database is created by running the following:
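A sketch of one such check, listing database names from the `sys.databases` catalog view. The container name, sqlcmd path, and placeholder password are the same assumptions as in the earlier steps.

```shell
CONTAINER="mssql"                      # hypothetical container name
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

# sys.databases holds one row per database on the instance.
LIST_DATABASES_SQL="SELECT name FROM sys.databases;"

# -Q runs the query and exits instead of opening an interactive session.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD" -Q "$LIST_DATABASES_SQL"
fi
```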
Which should respond with the following:
Run the following T-SQL statements to create the schema and tables that will be used as our sample data:
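The sketch below shows one possible shape for the sample schema. The schema name `sales`, the database name `airbyte_demo`, and every column definition are assumptions made for illustration; only the table names `customers` and `stores` come from this tutorial.

```shell
CONTAINER="mssql"                      # hypothetical container name
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

cat > create_tables.sql <<'SQL'
USE airbyte_demo;
GO
-- CREATE SCHEMA must be the only statement in its batch, hence the GO separators.
CREATE SCHEMA sales;
GO
-- Column lists are illustrative assumptions.
CREATE TABLE sales.customers (
    customer_id INT IDENTITY (1, 1) PRIMARY KEY,
    first_name  VARCHAR (255) NOT NULL,
    last_name   VARCHAR (255) NOT NULL,
    email       VARCHAR (255) NOT NULL
);
CREATE TABLE sales.stores (
    store_id   INT IDENTITY (1, 1) PRIMARY KEY,
    store_name VARCHAR (255) NOT NULL,
    city       VARCHAR (255)
);
GO
SQL

# Run the file only if the container is up.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec -i "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD" < create_tables.sql
fi
```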
Add some rows to the customers table by running the following:
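For example, a couple of placeholder rows could be inserted as follows. The rows are made-up illustration data, and the `sales.customers` columns match the hypothetical definition assumed for this sketch rather than a schema given in the tutorial.

```shell
CONTAINER="mssql"                      # hypothetical container name
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

cat > insert_customers.sql <<'SQL'
USE airbyte_demo;
GO
-- Placeholder rows; customer_id is generated by the IDENTITY column.
INSERT INTO sales.customers (first_name, last_name, email) VALUES
    ('Ada', 'Example', 'ada@example.com'),
    ('Grace', 'Sample', 'grace@example.com');
GO
SQL

# Run the file only if the container is up.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec -i "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD" < insert_customers.sql
fi
```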
Add rows to the stores table by executing:
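A matching sketch for the stores table, again with made-up placeholder rows and the hypothetical `sales.stores` columns assumed for illustration.

```shell
CONTAINER="mssql"                      # hypothetical container name
SQLCMD="/opt/mssql-tools/bin/sqlcmd"   # assumed path; may differ in newer images
MSSQL_SA_PASSWORD="${MSSQL_SA_PASSWORD:-ChangeMe_Str0ng_1}"   # placeholder password

cat > insert_stores.sql <<'SQL'
USE airbyte_demo;
GO
-- Placeholder rows; store_id is generated by the IDENTITY column.
INSERT INTO sales.stores (store_name, city) VALUES
    ('Example Store North', 'Springfield'),
    ('Example Store South', 'Riverton');
GO
SQL

# Run the file only if the container is up.
if docker ps --format '{{.Names}}' 2>/dev/null | grep -qx "$CONTAINER"; then
  docker exec -i "$CONTAINER" "$SQLCMD" -S localhost -U SA -P "$MSSQL_SA_PASSWORD" < insert_stores.sql
fi
```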
If you don’t already have a Snowflake account, you can create a trial account. When creating a Snowflake account, you’ll need to pick a Snowflake edition and a cloud provider as part of the account creation process.
Once your account is successfully created, you'll be redirected to the Snowflake dashboard. The worksheet area will be the primary place you’ll run scripts for creating and modifying resources. You will need to set up the destination database, user, role, and schema on Snowflake for the sync.
Airbyte provides a convenient script in the Snowflake destination connector documentation which you should copy into your Snowflake worksheet area. After you have copied the script into your Snowflake worksheet select ‘All queries’ and run the script by clicking on the run button as shown below.
ℹ️ Before running the script, be sure to change the airbyte_password variable to your preferred password value.
Go to Airbyte and create a new source connection. Give the connection a name and select Microsoft SQL Server as the Source Type.
ℹ️ See the Microsoft SQL Server source connector documentation for additional information.
Enter the values for the host and port you configured when setting up your SQL Server Docker container, as shown below:
Set Snowflake as the destination, give the destination a name, and select Snowflake as the destination type.
ℹ️ See the Snowflake destination connector documentation for more information.
Enter the values for the fields based on the values set in the script in Step 2. For example, enter the URL you received by email for the host when signing up for Snowflake. If you updated the password in your script, enter the new password.
Once the source and destination connectors are configured, you can access your connection settings. You should be able to see the tables that are available in your Microsoft SQL Server as shown below.
Set the sync frequency and choose your Sync mode. In this example, the Full refresh | Append mode has been selected.
Save the connection and select Sync now.
Once the sync is complete, you can go to the Database section in the Snowflake UI to see the tables that have been copied. Snowflake should contain the normalized data in the same format as the SQL Server tables. The replica also includes the raw data in separate tables named _AIRBYTE_RAW_{TABLE_NAME}.
You can view the structure of the table and the data types for each of the fields. Airbyte automatically maps the data types in the SQL Server tables to the corresponding data types in Snowflake.
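The same checks can also be run from a terminal with the SnowSQL CLI, as sketched below. The database, schema, user, role, and warehouse names are assumed to be the defaults from Airbyte's setup script, and the account identifier is a placeholder. Depending on the namespace settings of your connection, the tables may land in a different schema.

```shell
# Placeholder account identifier; replace with your own.
SNOWFLAKE_ACCOUNT="${SNOWFLAKE_ACCOUNT:-your_account_identifier}"

# Object names are assumed defaults from Airbyte's Snowflake setup script.
cat > check_sync.sql <<'SQL'
-- Row counts for the normalized table and its raw counterpart.
SELECT COUNT(*) FROM AIRBYTE_DATABASE.AIRBYTE_SCHEMA.CUSTOMERS;
SELECT COUNT(*) FROM AIRBYTE_DATABASE.AIRBYTE_SCHEMA._AIRBYTE_RAW_CUSTOMERS;
-- Column names and the Snowflake data types they were mapped to.
DESCRIBE TABLE AIRBYTE_DATABASE.AIRBYTE_SCHEMA.CUSTOMERS;
SQL

# SnowSQL prompts for the password unless SNOWSQL_PWD is set.
if command -v snowsql >/dev/null 2>&1; then
  snowsql -a "$SNOWFLAKE_ACCOUNT" -u AIRBYTE_USER -r AIRBYTE_ROLE \
    -w AIRBYTE_WAREHOUSE -f check_sync.sql
fi
```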
This tutorial has shown you how easy it is to replicate data from Microsoft SQL Server to Snowflake using Airbyte. The replicated data can then be used in Snowflake to improve your analytics and insights. To summarize, in this tutorial you have:
If you have enjoyed this tutorial, you may be interested in other Airbyte tutorials, or in Airbyte’s blog. You can also join the conversation on our community Slack channel, participate in discussions on Airbyte’s Discourse, or sign up for our newsletter. Furthermore, if you want to use Airbyte to replicate your Microsoft SQL Server data to Snowflake, try out our fully managed solution Airbyte Cloud for free!