Build a Data Ingestion Pipeline from Microsoft SQL Server to Snowflake

Create a connection in Airbyte OSS to synchronize data from Microsoft SQL Server to Snowflake.

Microsoft SQL Server has ranked among the top three databases in the DBMS rankings for the last two years and is designed for operational use cases, but it is not designed for big data and analytics. A platform such as Snowflake, on the other hand, is purpose-built for big data and analytics. Airbyte can replicate your data from MSSQL Server to Snowflake, which lets you leverage each system for its strengths.

Benefits of moving data from Microsoft SQL Server to Snowflake

Moving data from MSSQL to Snowflake may be part of an overall data integration strategy, which will provide your organization with: 

  • A unified view of data and a single source of truth – achieved by copying data from MSSQL Server and other operational systems into Snowflake.
  • Improved analytics capabilities – Snowflake is purpose-built for running large analytics jobs.
  • The ability to transform data in a single location – moving data from multiple systems into Snowflake allows you to transform and join data from multiple disparate systems.
  • Improved security – limit the number of people that require access to your MSSQL Server, as they can analyze your MSSQL data in Snowflake.

In addition to the benefits listed above, Snowflake is designed for storing massive amounts of data. MSSQL replication into Snowflake may therefore also be used for backups, or for archiving historical MSSQL data to meet compliance or regulatory requirements.

What you will learn in this tutorial

Airbyte makes it easy to replicate data from Microsoft SQL Server to Snowflake. This tutorial goes through the steps required to set up an Airbyte OSS connection that copies data from MSSQL to Snowflake. Because Airbyte Cloud and Airbyte Open-Source are so similar, the instructions should apply to both platforms.

Let's get started!

Prerequisites

In this tutorial Airbyte OSS will be used to replicate your Microsoft SQL Server data to Snowflake. You will therefore need the following prerequisites:

  1. Microsoft SQL Server
  2. Airbyte OSS
  3. Snowflake

Step 1: Configuring Microsoft SQL Server

Depending on your operating system, you can either use the Microsoft SQL Server Docker image or install SQL Server on Windows by downloading the .exe installer file. In this example, we will set up SQL Server on macOS using Docker. First, download the latest version of the Microsoft SQL Server image by running the following command.


docker pull mcr.microsoft.com/mssql/server

Once downloaded, start an instance by running the following command. Before you run it, choose a password and put it in place of the {YOUR_PASSWORD} placeholder.


docker run --name airbyte-mssql -e "ACCEPT_EULA=Y" -e "SA_PASSWORD={YOUR_PASSWORD}" -e "MSSQL_AGENT_ENABLED=True" -p 1433:1433 -d mcr.microsoft.com/mssql/server:latest
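
If the SA password does not satisfy SQL Server's default password policy (at least 8 characters, drawn from at least three of these four sets: uppercase letters, lowercase letters, digits, and symbols), the container tends to stop shortly after starting. The following is a minimal Python sketch that checks a candidate password against that policy and assembles the same docker run arguments as above. The function names are hypothetical helpers introduced for illustration, not part of Docker or Airbyte.

```python
import re
import shlex


def meets_sa_password_policy(password: str) -> bool:
    """Check length >= 8 and at least three of the four character classes."""
    classes = [
        re.search(r"[A-Z]", password),         # uppercase letters
        re.search(r"[a-z]", password),         # lowercase letters
        re.search(r"[0-9]", password),         # digits
        re.search(r"[^A-Za-z0-9]", password),  # symbols
    ]
    return len(password) >= 8 and sum(1 for c in classes if c) >= 3


def docker_run_args(password: str) -> list[str]:
    """Build the argument list for the docker run command used in this step."""
    if not meets_sa_password_policy(password):
        raise ValueError("SA password does not meet the complexity policy")
    return [
        "docker", "run", "--name", "airbyte-mssql",
        "-e", "ACCEPT_EULA=Y",
        "-e", f"SA_PASSWORD={password}",
        "-e", "MSSQL_AGENT_ENABLED=True",
        "-p", "1433:1433",
        "-d", "mcr.microsoft.com/mssql/server:latest",
    ]


if __name__ == "__main__":
    # Print a shell-quoted command line; the password here is only an example.
    print(shlex.join(docker_run_args("Str0ng!Passw0rd")))
```

Passing the resulting list to subprocess.run would start the container. Printing it first lets you inspect the command before anything is executed.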

You can run T-SQL commands inside the container by connecting with sqlcmd, which is a command-line shell for SQL Server. Use the same password you configured in the previous step.


docker exec -it airbyte-mssql /opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P {YOUR_PASSWORD}

Create a new database by running the following commands:


CREATE DATABASE Airbyte;
GO

You can verify that the database is created by running the following:


SELECT Name FROM sys.Databases;
GO

The output should list the Airbyte database alongside the SQL Server system databases.

Run the following T-SQL statements to create the schema and tables that will hold our sample data:


USE Airbyte;
GO

CREATE SCHEMA sales;
GO

CREATE TABLE sales.customers (
   customer_id INT IDENTITY (1, 1) PRIMARY KEY,
   first_name VARCHAR (255) NOT NULL,
   last_name VARCHAR (255) NOT NULL,
   phone VARCHAR (25),
   email VARCHAR (255) NOT NULL,
   street VARCHAR (255),
   city VARCHAR (50),
   state VARCHAR (25),
   zip_code VARCHAR (5)
);

CREATE TABLE sales.stores (
   store_id INT IDENTITY (1, 1) PRIMARY KEY,
   store_name VARCHAR (255) NOT NULL,
   phone VARCHAR (25),
   email VARCHAR (255),
   street VARCHAR (255),
   city VARCHAR (255),
   state VARCHAR (10),
   zip_code VARCHAR (5)
);

Add some rows to the customers table by running the following:



INSERT INTO sales.customers(first_name, last_name, phone, email, street, city, state, zip_code) 
VALUES('Debra','Burks',NULL,'debra.burks@yahoo.com','9273 Thorne Ave. ','Orchard Park','NY',14127);
INSERT INTO sales.customers(first_name, last_name, phone, email, street, city, state, zip_code) VALUES('Kasha','Todd',NULL,'kasha.todd@yahoo.com','910 Vine Street ','Campbell','CA',95008);
INSERT INTO sales.customers(first_name, last_name, phone, email, street, city, state, zip_code) VALUES('Tameka','Fisher',NULL,'tameka.fisher@aol.com','769C Honey Creek St. ','Redondo Beach','CA',90278);
INSERT INTO sales.customers(first_name, last_name, phone, email, street, city, state, zip_code) VALUES('Daryl','Spence',NULL,'daryl.spence@aol.com','988 Pearl Lane ','Uniondale','NY',11553);
INSERT INTO sales.customers(first_name, last_name, phone, email, street, city, state, zip_code) VALUES('Charlotte','Rice','(916) 381-6003','charolette.rice@msn.com','107 River Dr. ','Sacramento','CA',95820);

GO

Add rows to the stores table by executing:



INSERT INTO sales.stores(store_name, phone, email, street, city, state, zip_code)
VALUES('Santa Cruz Bikes','(831) 476-4321','santacruz@bikes.shop','3700 Portola Drive', 'Santa Cruz','CA',95060),
     ('Baldwin Bikes','(516) 379-8888','baldwin@bikes.shop','4200 Chestnut Lane', 'Baldwin','NY',11432),
     ('Rowlett Bikes','(972) 530-5555','rowlett@bikes.shop','8000 Fairway Avenue', 'Rowlett','TX',75088);

GO

Step 2: Setting up your Snowflake Account

If you don’t already have a Snowflake account, you can create a trial account. When creating a Snowflake account, you’ll need to pick a Snowflake edition and a cloud provider as part of the account creation process. 

Once your account is successfully created, you'll be redirected to the Snowflake dashboard. The worksheet area will be the primary place you’ll run scripts for creating and modifying resources. You will need to set up the destination database, user, role, and schema on Snowflake for the sync. 

Airbyte provides a convenient script in the Snowflake destination connector documentation, which you should copy into your Snowflake worksheet area. After you have copied the script into your Snowflake worksheet, select ‘All queries’ and run the script by clicking on the run button.

ℹ️ Before running the script, be sure to change the airbyte_password variable to your preferred password value. 

Step 3: Setting up Microsoft SQL Server as the Airbyte Source

Go to Airbyte and create a new source connection. Give the connection a name and select Microsoft SQL Server as the Source Type.

ℹ️ See the Microsoft SQL Server source connector documentation for additional information.

Enter the values for the host and port you configured when setting up your MSSQL Server Docker container.
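
Before saving the source, it can help to confirm that the SQL Server port is actually reachable. The sketch below is a small Python check that opens and closes a TCP connection. The helper name can_connect is hypothetical, and it only proves that something is listening on the port, not that the credentials are valid.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For the container started in Step 1, can_connect("localhost", 1433) should return True from your own machine. If Airbyte itself runs in Docker, localhost inside its containers may not point at your machine. On Docker Desktop, host.docker.internal is commonly used as the host instead, but the right value depends on your setup.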

Step 4: Setting up Snowflake Destination in Airbyte

Create a new destination, give it a name, and select Snowflake as the destination type.

ℹ️ See the Snowflake destination connector documentation for more information. 

Enter the values for the fields based on the values set in the script in Step 2. For example, enter the URL you received by email for the host when signing up for Snowflake. If you updated the password in your script, enter the new password.
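
The URL in the sign-up email usually includes a protocol and sometimes a path, while the host field generally expects only the bare domain ending in snowflakecomputing.com. As a hedged illustration, this Python sketch reduces such a URL to that host name. The function name and the example account identifier are made up for this tutorial.

```python
from urllib.parse import urlparse

SNOWFLAKE_SUFFIX = ".snowflakecomputing.com"


def snowflake_host(url_or_host: str) -> str:
    """Strip the protocol and path from an account URL and return the host."""
    value = url_or_host.strip()
    if "://" not in value:
        # Add a scheme so urlparse treats the value as a URL, not a path.
        value = "https://" + value
    host = (urlparse(value).hostname or "").lower()
    if not host.endswith(SNOWFLAKE_SUFFIX):
        raise ValueError(f"not a Snowflake account host: {url_or_host!r}")
    return host


if __name__ == "__main__":
    # "ab12345" is a placeholder account identifier, not a real account.
    print(snowflake_host("https://ab12345.us-east-2.aws.snowflakecomputing.com/console/login"))
```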

Step 5: Setting up the MSSQL to Snowflake Airbyte Connection

Once the source and destination connectors are configured, you can access your connection settings. You should see the tables that are available in your Microsoft SQL Server database, in this case customers and stores from the sales schema.

Set the sync frequency and choose your Sync mode. In this example, the Full refresh | Append mode has been selected.
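
To make the choice of sync mode more concrete, here is a toy Python simulation of what Full refresh | Append does compared with Full refresh | Overwrite. It models the destination as a plain list and is only meant to illustrate the behavior. It is not how Airbyte is implemented.

```python
def full_refresh_append(source_rows, destination_rows):
    """Re-read every source row and append it to what is already stored."""
    return destination_rows + list(source_rows)


def full_refresh_overwrite(source_rows, destination_rows):
    """Re-read every source row and replace the destination contents."""
    return list(source_rows)


# First names from the five sample rows in sales.customers.
customers = ["Debra", "Kasha", "Tameka", "Daryl", "Charlotte"]

appended = []
overwritten = []
for _ in range(2):  # run two syncs without changing the source
    appended = full_refresh_append(customers, appended)
    overwritten = full_refresh_overwrite(customers, overwritten)

print(len(appended), len(overwritten))  # prints "10 5"
```

With Append, every sync adds a full copy of the source, so running the sync repeatedly on unchanged data produces duplicate rows in the destination. Overwrite keeps only the latest copy.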

Step 6: Sync your data

Save the connection and select Sync now.
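
If you would rather trigger the sync from a script than from the UI, Airbyte OSS also exposes an HTTP API. The sketch below only builds the request and does not send it. The endpoint path /api/v1/connections/sync, the base URL http://localhost:8000, and the placeholder connection ID are assumptions that you should verify against the API documentation for your Airbyte version.

```python
import json
import urllib.request


def build_sync_request(base_url: str, connection_id: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks Airbyte to start a sync."""
    body = json.dumps({"connectionId": connection_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/connections/sync",  # assumed path
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder ID; use the ID of your own connection.
    request = build_sync_request("http://localhost:8000", "<your-connection-id>")
    print(request.get_method(), request.full_url)
```

Sending the request with urllib.request.urlopen(request) would start the sync. Depending on your deployment, you may also need to add authentication headers.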

Once the sync is complete, you can go to the Database section in the Snowflake UI to see the tables that have been copied. Snowflake should contain the normalized data in the same format as the SQL Server tables. The replica also includes the raw data in a separate table for each source table, named _AIRBYTE_RAW_{TABLE_NAME}.
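
As a quick reference for what to look for in Snowflake, this short Python sketch lists the table names you might expect for the two sample tables, following the _AIRBYTE_RAW_{TABLE_NAME} pattern described above. The uppercase names are an assumption based on Snowflake storing unquoted identifiers in uppercase, and the helper itself is hypothetical.

```python
def expected_tables(table_names):
    """Map each source table to its (normalized, raw) table names in Snowflake."""
    return {
        name: (name.upper(), f"_AIRBYTE_RAW_{name.upper()}")
        for name in table_names
    }


for source, (normalized, raw) in expected_tables(["customers", "stores"]).items():
    print(f"{source}: {normalized}, {raw}")
```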

You can view the structure of the table and the data types for each of the fields. Airbyte automatically maps the data types in the SQL Server tables to the corresponding data types in Snowflake.

Wrapping up

This tutorial has shown you how easy it is to replicate data from Microsoft SQL Server to Snowflake using Airbyte. The replicated data can then be used in Snowflake to improve your analytics and insights. To summarize, in this tutorial you have:

  1. Started a Docker container running Microsoft SQL Server.
  2. Set up Snowflake as a destination.
  3. Configured a Microsoft SQL Server Airbyte source.
  4. Configured a Snowflake Airbyte destination.
  5. Created an Airbyte connection that replicates data from SQL Server to Snowflake.
  6. Synchronized Microsoft SQL Server data to Snowflake.

If you have enjoyed this tutorial, you may be interested in other Airbyte tutorials, or in Airbyte’s blog. You can also join the conversation on our community Slack Channel, participate in discussions on Airbyte’s discourse, or sign up for our newsletter. Furthermore, if you want to use Airbyte to replicate your Microsoft SQL Server data to Snowflake without hosting it yourself, try out our fully managed solution Airbyte Cloud for free!
