Building your pipeline or Using Airbyte
Airbyte is the only open solution empowering data teams to meet all their growing custom business demands in the new AI era.
- Inconsistent and inaccurate data
- Laborious and expensive
- Brittle and inflexible
- Reliable and accurate
- Extensible and scalable for all your needs
- Deployed and governed your way
Start syncing with Airbyte in 3 easy steps within 10 minutes
Take a virtual tour
Demo video of Airbyte Cloud
Demo video of AI Connector Builder
What sets Airbyte Apart
Modern GenAI Workflows
Move Large Volumes, Fast
An Extensible Open-Source Standard
Full Control & Security
Fully Featured & Integrated
Enterprise Support with SLAs
What our users say
"The intake layer of Datadog’s self-serve analytics platform is largely built on Airbyte.Airbyte’s ease of use and extensibility allowed any team in the company to push their data into the platform - without assistance from the data team!"
“Airbyte helped us accelerate our progress by years, compared to our competitors. We don’t need to worry about connectors and focus on creating value for our users instead of building infrastructure. That’s priceless. The time and energy saved allows us to disrupt and grow faster.”
“We chose Airbyte for its ease of use, its pricing scalability and its absence of vendor lock-in. Having a lean team makes them our top criteria. The value of being able to scale and execute at a high level by maximizing resources is immense”
a. Query Data
- Log in to your Salesforce account.
- Use Salesforce's SOQL (Salesforce Object Query Language) to query the data you want to export. You can do this through the Developer Console, Workbench, or any tool that allows you to run SOQL queries.
b. Export Data
- Once you have the SOQL query, you can export the data. If you're using the Developer Console or Workbench, you can usually export the results as a CSV file directly.
- If you need to automate this process, you could use Salesforce's Data Loader command-line interface (CLI) to export the data to CSV. You can schedule a cron job (on Unix-like systems) or a scheduled task (on Windows) to run the Data Loader CLI with the appropriate SOQL query and export parameters.
a. Clean Data
- Open the exported CSV file and ensure the data types and formats align with what Snowflake expects. For example, dates may need to be formatted appropriately, and strings sanitized to escape special characters.
b. Split Large Files
- If your CSV file is very large, consider splitting it into smaller files to make the upload process more manageable and to avoid timeouts or memory issues.
a. Choose a Staging Area
- Snowflake supports various staging options such as internal (Snowflake) stages, Amazon S3, Google Cloud Storage, or Microsoft Azure. Choose the one that best suits your needs.
b. Upload Data to the Staging Area
- Use the command-line tools provided by your chosen cloud storage provider to upload the CSV files to your staging area.You can use aws s3 cp for Amazon S3, gsutil cp for Google Cloud Storage, or azcopy for Azure Blob Storage.
- Log in to your Snowflake account.
- Use the Snowflake web interface or SnowSQL to create a file format that matches the CSV file structure. This will ensure Snowflake can correctly parse the data.
CREATE OR REPLACE FILE FORMAT my_csv_format
TYPE = 'CSV'
FIELD_DELIMITER = ','
SKIP_HEADER = 1
NULL_IF = ('NULL', 'null')
EMPTY_FIELD_AS_NULL = TRUE
TRIM_SPACE = TRUE;
- Create a database and schema in Snowflake if you haven't already.
CREATE DATABASE IF NOT EXISTS salesforce_data;
CREATE SCHEMA IF NOT EXISTS salesforce_data_schema;
- Create a table in Snowflake that matches the structure of the Salesforce data you exported.
CREATE TABLE salesforce_data_schema.my_table (
Column1 DataType,
Column2 DataType,
-- Add all columns as per the CSV file
);
a. Copy Command
- Use the COPY INTO command to load data from the staged files into the Snowflake table.
COPY INTO salesforce_data_schema.my_table
FROM '@my_stage/path_to_files/'
FILE_FORMAT = (FORMAT_NAME = my_csv_format)
ON_ERROR = 'CONTINUE';
b. Verify Data
- After the COPY INTO command completes, verify that the data was loaded correctly by querying the table.
SELECT * FROM salesforce_data_schema.my_table LIMIT 10;
- To automate this process, you can create a script that combines the data extraction, preparation, and upload steps.
- Schedule the script to run at regular intervals using cron jobs, scheduled tasks, or Snowflake tasks, depending on your preference and the tools at your disposal.
Remember to handle any security considerations, such as encrypting data during transfer and storing credentials securely. Also, monitor the process for any failures or issues, and set up alerts to notify you if something goes wrong.
By following these steps, you can move data from Salesforce to Snowflake without third-party connectors. It requires some setup and maintenance, but it gives you complete control over the data transfer process.
Salesforce is a cloud-based CRM platform that has become the go-to solution for businesses to understand and manage their customers. Its low-code admin tools, powerful data integration capabilities, and dynamic dashboards have made it the market leader in the CRM space.
FAQs
What is ETL?
ETL, an acronym for Extract, Transform, Load, is a vital data integration process. It involves extracting data from diverse sources, transforming it into a usable format, and loading it into a database, data warehouse or data lake. This process enables meaningful data analysis, enhancing business intelligence.
Salesforce is a cloud-based customer relationship management (CRM) platform providing business solutions software on a subscription basis. Salesforce is a huge force in the ecommerce world, helping businesses with marketing, commerce, service and sales, and enabling enterprises’ IT teams to collaborate easily from anywhere. Salesforces is the force behind many industries, offering healthcare, automotive, finance, media, communications, and manufacturing multichannel support. Its services are wide-ranging, with access to customer, partner, and developer communities as well as an app exchange marketplace.
Salesforce's API provides access to a wide range of data types, including:
1. Accounts: Information about customer accounts, including contact details, billing information, and purchase history.
2. Leads: Data on potential customers, including contact information, lead source, and lead status.
3. Opportunities: Information on potential sales deals, including deal size, stage, and probability of closing.
4. Contacts: Details on individual contacts associated with customer accounts, including contact information and activity history.
5. Cases: Information on customer service cases, including case details, status, and resolution.
6. Products: Data on products and services offered by the company, including pricing, availability, and product descriptions.
7. Campaigns: Information on marketing campaigns, including campaign details, status, and results.
8. Reports and Dashboards: Access to pre-built and custom reports and dashboards that provide insights into sales, marketing, and customer service performance.
9. Custom Objects: Ability to access and manipulate custom objects created by the organization to store specific types of data.
Overall, Salesforce's API provides access to a comprehensive set of data types that enable organizations to manage and analyze their customer relationships, sales processes, and marketing campaigns.
What is ELT?
ELT, standing for Extract, Load, Transform, is a modern take on the traditional ETL data integration process. In ELT, data is first extracted from various sources, loaded directly into a data warehouse, and then transformed. This approach enhances data processing speed, analytical flexibility and autonomy.
Difference between ETL and ELT?
ETL and ELT are critical data integration strategies with key differences. ETL (Extract, Transform, Load) transforms data before loading, ideal for structured data. In contrast, ELT (Extract, Load, Transform) loads data before transformation, perfect for processing large, diverse data sets in modern data warehouses. ELT is becoming the new standard as it offers a lot more flexibility and autonomy to data analysts.
Salesforce is a cloud-based customer relationship management (CRM) platform providing business solutions software on a subscription basis. Salesforce is a huge force in the ecommerce world, helping businesses with marketing, commerce, service and sales, and enabling enterprises’ IT teams to collaborate easily from anywhere. Salesforces is the force behind many industries, offering healthcare, automotive, finance, media, communications, and manufacturing multichannel support. Its services are wide-ranging, with access to customer, partner, and developer communities as well as an app exchange marketplace.
A cloud data platform, Snowflake Data Cloud provides a warehouse-as-a-service built specifically for the cloud. The Snowflake platform is designed to empower many types of data workloads, and offers secure, immediate, governed access to a comprehensive network of data. Snowflake’s innovative technology goes above the capabilities of the ordinary database, supplying users all the functionality of database storage, query processing, and cloud services in one package.
1. Open the Airbyte platform and navigate to the "Sources" tab on the left-hand side of the screen.
2. Click on the "Salesforce" source connector and select "Create new connection."
3. Enter a name for your connection and click "Next."
4. Enter your Salesforce credentials, including your username, password, and security token.
5. Click "Test connection" to ensure that your credentials are correct and that Airbyte can connect to your Salesforce account.
6. Once the connection is successful, select the objects you want to replicate from Salesforce.
7. Choose the replication frequency and any other settings you want to apply to your connection.
8. Click "Create connection" to save your settings and start replicating data from Salesforce to Airbyte.
9. You can monitor the progress of your replication in the "Connections" tab and view the data in the "Dashboard" tab.
1. First, navigate to the Airbyte website and log in to your account.
2. Once you are logged in, click on the "Destinations" tab on the left-hand side of the screen.
3. Scroll down until you find the Snowflake Data Cloud destination connector and click on it.
4. You will be prompted to enter your Snowflake account information, including your account name, username, and password.
5. After entering your account information, click on the "Test" button to ensure that the connection is successful.
6. If the test is successful, click on the "Save" button to save your Snowflake Data Cloud destination connector settings.
7. You can now use the Snowflake Data Cloud destination connector to transfer data from your Airbyte sources to your Snowflake account.
8. To set up a data transfer, navigate to the "Sources" tab on the left-hand side of the screen and select the source you want to transfer data from.
9. Click on the "Create New Connection" button and select the Snowflake Data Cloud destination connector as your destination.
10. Follow the prompts to set up your data transfer, including selecting the tables or data sources you want to transfer and setting up any necessary transformations or mappings.
11. Once you have set up your data transfer, click on the "Run" button to start the transfer process.
With Airbyte, creating data pipelines take minutes, and the data integration possibilities are endless. Airbyte supports the largest catalog of API tools, databases, and files, among other sources. Airbyte's connectors are open-source, so you can add any custom objects to the connector, or even build a new connector from scratch without any local dev environment or any data engineer within 10 minutes with the no-code connector builder.
We look forward to seeing you make use of it! We invite you to join the conversation on our community Slack Channel, or sign up for our newsletter. You should also check out other Airbyte tutorials, and Airbyte’s content hub!
What should you do next?
Hope you enjoyed the reading. Here are the 3 ways we can help you in your data journey:
Salesforce is a cloud-based CRM platform that has become the go-to solution for businesses to understand and manage their customers. Its low-code admin tools, powerful data integration capabilities, and dynamic dashboards have made it the market leader in the CRM space.
By moving your Salesforce data to a cloud-based data warehouse like Snowflake you can combine sales data with other business sources to build a 360-degree customer view and build custom reporting.
With Airbyte Cloud, you can seamlessly export, backup and replicate data between any supported business application, database and data warehouse. In this tutorial, we'll take you through the steps to set up Airbyte Cloud, and copy over your data from Salesforce to Snowflake.
Prerequisites
1. An Airbyte Cloud account to perform data replication. You can create an Airbyte Cloud account here.
2. A Salesforce account to export the data. You can create a Salesforce account here. You will need at least a developer account for Airbyte to be able to access Salesforce REST APIs.
3. A Snowflake account to load the Salesforce data. You can create a Snowflake account here.
{{COMPONENT_CTA}}
Methods to Move Data From Salesforce to Snowflake
- Method 1: Connecting Salesforce to Snowflake using Airbyte.
- Method 2: Connecting Salesforce to Snowflake manually.
Method 1: Connecting Salesforce to Snowflake using Airbyte
Step 1: Set up a Salesforce account
The first step is to configure your Salesforce account to allow external applications to connect and access data from the account. Login to Salesforce, go to Setup > Create > Apps, and select New in the Connected Apps section.
Fill in the required fields and click on Save to add the new app.
Next, we will generate and import sample data into Salesforce. For this example, we will create Leads data. Go to https://www.mockaroo.com/ and create the Leads data set with the following fields:
The Mockaroo tool will generate a CSV with sample data.
Login to Salesforce and go to Setup > Data > Data Import Wizard. Choose the Leads object, upload the CSV and follow the Wizard steps to upload the data.
Once complete, you can view the data by going to the Salesforce homepage > Leads.
Step 2: Set up a Snowflake account
When creating a Snowflake account, you’ll need to pick a Snowflake edition and a cloud provider as part of the account creation process. Next, you’ll receive an email containing your login URL. Make sure to bookmark this URL for future logins.
Once your account is successfully created, you'll be redirected to the Snowflake dashboard. The worksheet area will be the primary place you’ll run scripts for creating and modifying resources. You will need to set up the destination database, user, role, and schema on Snowflake for the sync.
The good news is that Airbyte provides a script in the Snowflake destination documentation for automating this process. Paste the snippet below into your worksheet area on Snowflake. Select ‘All queries’ and run the script by clicking on the run button.
Once executed, you should see the following message:
Step 3: Set up a Salesforce connector as a source
Next, log in to Airbyte Cloud and create a new source. Give the source a name and choose Salesforce as the Source type. You can find more information about the Salesforce connector from the Airbyte documentation here.
Next, click on the Authenticate your account button to give Airbyte Cloud the required permissions to copy your data.
Clicking on the button will launch a pop-up window where you will be prompted to sign in with your Salesforce credentials.
Once authenticated, you will be redirected back to Airbyte Cloud. Then enter the ‘Start Date’ from when Airbyte should copy data and click on Set up source.
Step 4: Set up a Snowflake connector as the destination
The next step is to use the Snowflake connector. Select Snowflake as the destination type and give it a name. You can find more information about the Snowflake connector in the Airbyte docs here.
Enter the values for the fields based on the values set in the script in Step 2. For the host, enter the URL you received when signing up for Snowflake. If you had updated the password in your script, enter the new password. Once ready, click on the Set up destination button.
Step 5: Set up a Salesforce to Snowflake connection
Once the source and destination are configured, you can access your connection settings. Here you can set the replication frequency, the destination namespace, and also set a destination stream prefix if required. You can learn more about this configuration settings in the Airbyte Cloud getting started guide.
You can also see the various source data streams that are available to be copied. In this example, we will select the Lead stream. Depending on the features you use on Salesforce, you can select different objects to copy. Each objects will replicated into a table in Snowflake
You can set the sync mode for each data stream individually. In this case, we will set the Lead stream to use the Incremental | Append sync mode. If you want to save each version of your Salesforce data you can instead select the Full Refresh | Append sync mode that will append all your data to each table. In this case, you will need to deduplicate the data yourself when querying the data.
By selecting Normalized tabular data Airbyte will automatically create tables with one column for each field in the source data.
Once configured, save the connection and select Sync now.
Once the sync is complete, you can go to the Database section in the Snowflake UI to see the tables that have been created. The copied data also contains the raw data in a separate table with the name _AIRBYTE_RAW_{TABLE_NAME}
You can view the structure of the table as well as the data types for each of the fields. The Lead data has the following structure:
To test the incremental sync, you can add more data to the Salesforce and run another sync to copy the new data.
Once the sync has been completed, you can go back to Snowflake to see the updated counts for the appended rows.
Method 2: Connecting Salesforce to Snowflake manually.
Moving data from Salesforce to Snowflake without using third-party connectors can be a bit challenging, but it's certainly doable by leveraging the capabilities of both platforms. Here's a step-by-step guide to help you accomplish this task:
Step 1: Extract Data from Salesforce
a. Query Data
- Log in to your Salesforce account.
- Use Salesforce's SOQL (Salesforce Object Query Language) to query the data you want to export. You can do this through the Developer Console, Workbench, or any tool that allows you to run SOQL queries.
b. Export Data
- Once you have the SOQL query, you can export the data. If you're using the Developer Console or Workbench, you can usually export the results as a CSV file directly.
- If you need to automate this process, you could use Salesforce's Data Loader command-line interface (CLI) to export the data to CSV. You can schedule a cron job (on Unix-like systems) or a scheduled task (on Windows) to run the Data Loader CLI with the appropriate SOQL query and export parameters.
Step 2: Prepare Data for Snowflake
a. Clean Data
- Open the exported CSV file and ensure the data types and formats align with what Snowflake expects. For example, dates may need to be formatted appropriately, and strings sanitized to escape special characters.
b. Split Large Files
- If your CSV file is very large, consider splitting it into smaller files to make the upload process more manageable and to avoid timeouts or memory issues.
Step 3: Stage Data for Snowflake
a. Choose a Staging Area
- Snowflake supports various staging options such as internal (Snowflake) stages, Amazon S3, Google Cloud Storage, or Microsoft Azure. Choose the one that best suits your needs.
b. Upload Data to the Staging Area
- Use the command-line tools provided by your chosen cloud storage provider to upload the CSV files to your staging area.You can use aws s3 cp for Amazon S3, gsutil cp for Google Cloud Storage, or azcopy for Azure Blob Storage.
Step 4: Create File Format in Snowflake
- Log in to your Snowflake account.
- Use the Snowflake web interface or SnowSQL to create a file format that matches the CSV file structure. This will ensure Snowflake can correctly parse the data.
CREATE OR REPLACE FILE FORMAT my_csv_format
TYPE = 'CSV'
FIELD_DELIMITER = ','
SKIP_HEADER = 1
NULL_IF = ('NULL', 'null')
EMPTY_FIELD_AS_NULL = TRUE
TRIM_SPACE = TRUE;
Step 5: Create a Database and Schema in Snowflake
- Create a database and schema in Snowflake if you haven't already.
CREATE DATABASE IF NOT EXISTS salesforce_data;
CREATE SCHEMA IF NOT EXISTS salesforce_data_schema;
Step 6: Create a Table in Snowflake
- Create a table in Snowflake that matches the structure of the Salesforce data you exported.
CREATE TABLE salesforce_data_schema.my_table (
Column1 DataType,
Column2 DataType,
-- Add all columns as per the CSV file
);
Step 7: Copy Data into Snowflake
a. Copy Command
- Use the COPY INTO command to load data from the staged files into the Snowflake table.
COPY INTO salesforce_data_schema.my_table
FROM '@my_stage/path_to_files/'
FILE_FORMAT = (FORMAT_NAME = my_csv_format)
ON_ERROR = 'CONTINUE';
b. Verify Data
- After the COPY INTO command completes, verify that the data was loaded correctly by querying the table.
SELECT * FROM salesforce_data_schema.my_table LIMIT 10;
Step 8: Automate and Schedule
- To automate this process, you can create a script that combines the data extraction, preparation, and upload steps.
- Schedule the script to run at regular intervals using cron jobs, scheduled tasks, or Snowflake tasks, depending on your preference and the tools at your disposal.
Remember to handle any security considerations, such as encrypting data during transfer and storing credentials securely. Also, monitor the process for any failures or issues, and set up alerts to notify you if something goes wrong.
By following these steps, you can move data from Salesforce to Snowflake without third-party connectors. It requires some setup and maintenance, but it gives you complete control over the data transfer process.
Salesforce is a cloud-based CRM platform that has become the go-to solution for businesses to understand and manage their customers. Its low-code admin tools, powerful data integration capabilities, and dynamic dashboards have made it the market leader in the CRM space.
Conclusion
To summarize how we can move data between Salesforce and Snowflake with Airbyte Cloud:
- Set up a Salesforce source in Airbyte Cloud.
- Set up a Snowflake destination in Airbyte Cloud.
- Set up a connection between Salesforce and Snowflake.
- Run a sync to move your sales data.
We know that development and operations teams working on fast-moving projects with tight timelines need quick answers to their questions from developers who are actively developing Airbyte. They also want to share their learnings with experienced community members who have “been there and done that.”
Join the conversation at Airbyte’s community Slack Channel to share your ideas with over 4000 data engineers and help make everyone’s project a success. With Airbyte, the integration possibilities are endless, and we can't wait to see what you're going to build!
What should you do next?
Hope you enjoyed the reading. Here are the 3 ways we can help you in your data journey:
Ready to get started?
Frequently Asked Questions
Salesforce's API provides access to a wide range of data types, including:
1. Accounts: Information about customer accounts, including contact details, billing information, and purchase history.
2. Leads: Data on potential customers, including contact information, lead source, and lead status.
3. Opportunities: Information on potential sales deals, including deal size, stage, and probability of closing.
4. Contacts: Details on individual contacts associated with customer accounts, including contact information and activity history.
5. Cases: Information on customer service cases, including case details, status, and resolution.
6. Products: Data on products and services offered by the company, including pricing, availability, and product descriptions.
7. Campaigns: Information on marketing campaigns, including campaign details, status, and results.
8. Reports and Dashboards: Access to pre-built and custom reports and dashboards that provide insights into sales, marketing, and customer service performance.
9. Custom Objects: Ability to access and manipulate custom objects created by the organization to store specific types of data.
Overall, Salesforce's API provides access to a comprehensive set of data types that enable organizations to manage and analyze their customer relationships, sales processes, and marketing campaigns.
What should you do next?
Hope you enjoyed the reading. Here are the 3 ways we can help you in your data journey: