Inserting data into tables is an essential database operation that helps you organize your data in a structured way. Snowflake, a fully managed cloud-based data warehouse, streamlines this process with its INSERT INTO command.
This guide provides a detailed explanation of how to use the Snowflake INSERT INTO command with examples. By learning this command, you can efficiently load single or multiple rows into your Snowflake tables.
What Is the Snowflake INSERT INTO Command?
The Snowflake INSERT INTO command enables you to add rows of data to an existing or newly created table in the Snowflake data warehouse. You can insert one or more rows at a time by either specifying values directly or using data retrieved from another query.
Before inserting data into a Snowflake table, you must create it using the CREATE TABLE statement. Once the table is ready in the Snowflake data warehouse, you can utilize the INSERT INTO command to populate it with the required data. This process allows you to prepare structured data for further analysis and processing.
Snowflake INSERT INTO Syntax
Here is the syntax of the Snowflake INSERT INTO command:
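In outline, the command takes the following general form (a sketch based on Snowflake's documented syntax):

```sql
INSERT [ OVERWRITE ] INTO <target_table_name> [ ( <target_column_name> [ , ... ] ) ]
       {
         VALUES ( { <value> | DEFAULT | NULL } [ , ... ] ) [ , ( ... ) ]
         | <query>
       }
```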
This syntax is similar to the Redshift INSERT INTO command. Let’s understand each of these parameters:
- INSERT: Initiates the command to add data to a table.
- [OVERWRITE]: An optional keyword. When specified, it replaces all existing data in the target table with the new rows. Without OVERWRITE, the command appends the data to the table.
- target_table_name: You can specify the table name into which you need to insert rows.
- target_column_name: An optional parameter that enables you to fill in the target columns for inserting data. If omitted, data is inserted into all columns of the table in their defined order.
- VALUES ( { value | DEFAULT | NULL } [ , ... ] ) [ , ( ... ) ]: The VALUES clause lets you insert one or more rows of values into the corresponding columns of a table. Inside this clause, you can provide explicit values, use DEFAULT to insert the column's default value, or specify NULL for an empty entry. To insert multiple rows, include additional sets of values, each enclosed in parentheses and separated by commas.
- query: An alternative to the VALUES clause, allowing you to insert data retrieved from a query to a desired column in the target table. You can also add rows from one or more source tables to the desired table.
How Do You Load Data Into Snowflake Using INSERT INTO Command?
Using the INSERT INTO command, you can load a single row, multiple rows, data for multiple tables, or even JSON data into Snowflake tables.
Single Row Insert
You can insert one row at a time into a table by directly specifying the values you want to insert. It is useful when inserting a small amount of known data.
Syntax:
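A minimal sketch of a single-row insert:

```sql
INSERT INTO <table_name> [ ( <column1>, <column2>, ... ) ]
VALUES ( <value1>, <value2>, ... );
```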
Consider an employee table present in Snowflake. Now, let’s start adding a single row to the table as follows:
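Assuming the employee table has id, name, and department columns (the specific values here are illustrative), a single row can be added like this:

```sql
INSERT INTO employee (id, name, department)
VALUES (1001, 'John Doe', 'Engineering');
```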
In this example, you can see that values for id, name, and department of only one employee are provided explicitly.
Run SELECT * FROM employee to see the results:
Multi-Row Insert
Like a single-row insert, you can add multiple rows into a Snowflake table by explicitly specifying the values for each row.
Syntax:
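A multi-row insert simply repeats the parenthesized value sets, separated by commas:

```sql
INSERT INTO <table_name> [ ( <column1>, <column2>, ... ) ]
VALUES ( <value1>, <value2>, ... ),
       ( <value1>, <value2>, ... );
```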
Example:
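A sketch against the same employee table, with illustrative values:

```sql
INSERT INTO employee (id, name, department)
VALUES (1002, 'Jane Miller', 'Marketing'),
       (1003, 'Sam Lee', 'Finance');
```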
In this example, two rows (highlighted in pink) are added to the employee table by directly providing the values for each column.
Multi-Table Insert
With the INSERT INTO command, you can insert values into multiple tables simultaneously.
Syntax:
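An unconditional multi-table insert uses the INSERT ALL form, with one INTO clause per target table and a subquery supplying the source rows:

```sql
INSERT ALL
  INTO <table1> [ ( <columns> ) ] [ VALUES ( <values> ) ]
  INTO <table2> [ ( <columns> ) ] [ VALUES ( <values> ) ]
<subquery>;
```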
Example:
Consider another Snowflake table named employee_hire_info.
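A sketch of inserting one source row into both tables at once, assuming employee(id, name, department) and employee_hire_info(id, hire_date, salary) — the column names and values are illustrative:

```sql
INSERT ALL
  INTO employee (id, name, department)
    VALUES (id, name, department)
  INTO employee_hire_info (id, hire_date, salary)
    VALUES (id, hire_date, salary)
SELECT 1004 AS id, 'Maria Garcia' AS name, 'Engineering' AS department,
       '2024-02-15'::DATE AS hire_date, 50000 AS salary;
```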
The updated employee table:
The updated employee_hire_info table:
JSON Data Insert
Snowflake supports adding JSON data to tables with the help of the VARIANT data type, which can store a value of any supported type up to 16 MB in size. To insert JSON into a Snowflake table, you must first convert the JSON text into a VARIANT value using the PARSE_JSON() function, which lets you store the JSON object in a column of VARIANT data type.
Here is an example of inserting multiple JSON objects into a VARIANT column json_column1 in a Snowflake table named customer:
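A sketch with illustrative JSON fields; note that Snowflake does not allow PARSE_JSON() directly inside a VALUES clause, so the JSON strings are parsed in a SELECT instead:

```sql
INSERT INTO customer (json_column1)
SELECT PARSE_JSON(column1)
FROM VALUES
  ('{ "id": 1, "name": "Arun",  "city": "Chennai" }'),
  ('{ "id": 2, "name": "Priya", "city": "Mumbai"  }');
```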
You can extract required fields in a tabular format from the VARIANT column json_column1 consisting of JSON objects.
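A sketch of such an extraction query, assuming the JSON objects contain id, name, and city fields as in the insert above:

```sql
SELECT json_column1:id::NUMBER   AS id,
       json_column1:name::STRING AS name,
       json_column1:city::STRING AS city
FROM customer;
```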
In this query, the values are explicitly converted to the desired data type using the typecast operator (::).
Conditional Multi-Table Insert
A conditional multi-table insert in Snowflake enables you to insert data into multiple tables based on specific conditions.
Syntax:
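The general shape of a conditional multi-table insert, per Snowflake's documented syntax:

```sql
INSERT [ OVERWRITE ] { FIRST | ALL }
  WHEN <condition> THEN
    INTO <target_table> [ ( <columns> ) ] [ VALUES ( <values> ) ]
  [ WHEN <condition> THEN INTO ... ]
  [ ELSE INTO <target_table> ]
<subquery>;
```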
Let’s understand syntax parameters:
- FIRST: A keyword to help you insert data into only the first table that matches the conditions specified in the WHEN clause.
- ALL: A keyword that allows you to add data into every table that matches the conditions specified in the WHEN clause.
- WHEN condition: Defines a condition that is evaluated against each row returned by the source SELECT statement. When the condition evaluates to TRUE, the row is inserted into the table and columns specified in the THEN INTO clause that follows.
- ELSE: Optionally, you can specify an action if no WHEN condition is met.
An example with ALL keyword:
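A sketch matching the walkthrough below, assuming a Source_Table with a single marks column:

```sql
INSERT ALL
  WHEN marks > 100 THEN
    INTO table1
  WHEN marks > 10 THEN
    INTO table1
    INTO table2
  ELSE
    INTO table2
SELECT marks FROM Source_Table;
```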
Suppose the Source_Table has three rows with values 1, 12, and 102 in the marks column. In the given INSERT statement, the first WHEN clause matches only the row with 102 marks and inserts it into table1. The rows with values 12 and 102 satisfy the second WHEN clause, which inserts them into both table1 and table2. Finally, the row with a value of 1 does not meet any WHEN clause condition and is stored in table2 via the ELSE clause.
Therefore, table1 will contain 102, 102, and 12, while table2 will include 102, 12, and 1.
Another example with the FIRST keyword:
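The same sketch with FIRST in place of ALL, again assuming Source_Table has a marks column:

```sql
INSERT FIRST
  WHEN marks > 100 THEN
    INTO table1
  WHEN marks > 10 THEN
    INTO table1
    INTO table2
  ELSE
    INTO table2
SELECT marks FROM Source_Table;
```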
This example uses the FIRST keyword, which ensures that only the first matching WHEN clause is executed for each row. If a row satisfies the first condition (marks > 100), it is inserted into table1 and all subsequent conditions are skipped. Only when that condition fails is the next condition (marks > 10) evaluated, inserting matching rows into both table1 and table2. The ELSE clause applies to rows that meet none of the conditions, inserting them into table2. Thus, FIRST ensures each row is handled by at most one WHEN clause.
Now, table1 contains 102 and 12, whereas table2 has the values 12 and 1.
Insert Using SELECT Statement
You can add data by selecting it from another table using a SELECT statement.
Syntax:
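The general form replaces the VALUES clause with a query:

```sql
INSERT INTO <target_table> [ ( <columns> ) ]
SELECT <columns>
FROM <source_table>
[ WHERE <condition> ];
```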
Example:
Consider a temp_employees table:
Case 1: Inserting a Single Row
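A sketch, assuming temp_employees shares the id, name, and department columns of the employee table:

```sql
INSERT INTO employee (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE id = 4001;
```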
The above query filters rows from the temp_employees table based on a condition (id = 4001). Since id is the primary key of the temp_employees table, only one row will be added to the employee table.
After executing the above query, a row (highlighted in pink) will be newly added to the employee table.
Case 2: Inserting Multiple Rows
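A sketch, additionally assuming temp_employees has a hire_date column:

```sql
INSERT INTO employee (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE hire_date > '2024-03-01';
```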
The SELECT statement retrieves all employees' information from the temp_employees table who were hired after March 1, 2024.
Once you execute the query, rows (highlighted in pink color) will be added to the employee table:
Case 3: Inserting Into Multiple Tables
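A sketch using two separate INSERT INTO ... SELECT statements, assuming the column layouts shown earlier for employee and employee_hire_info:

```sql
-- First query: copy Engineering employees into the employee table
INSERT INTO employee (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE department = 'Engineering';

-- Second query: record their hire info with a fixed salary of 45000
INSERT INTO employee_hire_info (id, hire_date, salary)
SELECT id, hire_date, 45000
FROM temp_employees
WHERE department = 'Engineering';
```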
The second INSERT INTO query selects data from the temp_employees table where the department is Engineering. The salary is explicitly set to 45000 for each row.
The updated employee table:
The updated employee_hire_info table:
INSERT OVERWRITE
The INSERT with OVERWRITE command enables you to replace the data in the target table with new data. It is usually used when you want to overwrite existing data entirely.
Consider the employee table.
Suppose you have one more table named sample_employees, containing only one row. Let's see how to insert data into the sample_employees table from the employee table using the OVERWRITE clause.
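A sketch of the overwrite, assuming sample_employees shares the employee table's schema:

```sql
INSERT OVERWRITE INTO sample_employees
SELECT *
FROM employee
WHERE department = 'Engineering';
```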
This query allows you to replace all existing data in the sample_employees table with the data returned by the SELECT statement. The sample_employees table now contains information about employees who worked in the Engineering department.
INSERT INTO vs. COPY INTO Command - Tabular Comparison
How to Handle Duplicates During Inserts?
When inserting data into a table, Snowflake helps you remove duplicates by first extracting unique records using DISTINCT, GROUP BY, or the ROW_NUMBER() window function. Once this is done, the next step is to replace the duplicated data by leveraging the SWAP WITH or INSERT OVERWRITE command.
Assume you have the following employee_info table that includes duplicate records.
Follow the below steps to handle duplicates from tables using DISTINCT and SWAP WITH commands:
- Create an empty table named employee_duplicate with the same schema as the employee_info table.
- Insert unique records into the employee_duplicate table using the DISTINCT keyword.
- Interchange the data between two tables using the SWAP WITH command.
This ensures that the employee_info table retains only unique records.
- Finally, you can drop the employee_duplicate table, which now holds all the data from the source table employee_info.
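The steps above can be sketched as follows, assuming the table is named employee_info as in the example:

```sql
-- 1. Create an empty table with the same schema (no data is copied)
CREATE TABLE employee_duplicate LIKE employee_info;

-- 2. Insert only the unique records
INSERT INTO employee_duplicate
SELECT DISTINCT * FROM employee_info;

-- 3. Interchange the data between the two tables
ALTER TABLE employee_duplicate SWAP WITH employee_info;

-- 4. Drop the table that now holds the duplicated data
DROP TABLE employee_duplicate;
```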
How Do You Verify Data After INSERT INTO Command?
To verify data after using the INSERT INTO table command in Snowflake, you can run a SELECT query on the target table to ensure the data is correctly inserted.
Another option is to use a COUNT(*) to check the total number of records and compare it with the expected results.
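For example, using the employee table from earlier:

```sql
SELECT * FROM employee;         -- inspect the inserted rows
SELECT COUNT(*) FROM employee;  -- compare the row count with the expected total
```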
You may also use constraints and audit logs for further verification.
Automating Data Ingestion Into Snowflake with Airbyte
When loading data into Snowflake, its built-in methods, such as INSERT INTO and COPY INTO, are often used. However, both of these Snowflake features have limitations that can affect performance and scalability. The INSERT INTO method becomes slow when dealing with large datasets because rows are inserted statement by statement. On the other hand, the COPY INTO command requires files to be staged in cloud storage, adding extra complexity to managing these files.
Airbyte, a no-code data movement platform, helps you address these challenges by offering a more efficient solution for automating data ingestion into Snowflake. It provides 550+ pre-built connectors, allowing you to extract data from files, databases, APIs, SaaS applications, or any other systems and load it into Snowflake.
If you cannot find a suitable connector for your needs, you can build one using the no-code Connector Builder with its AI Assistant, which fills in the required configuration fields automatically.
Here are the simple steps you need to follow while moving data into Snowflake using Airbyte:
- Sign in to your Airbyte Cloud.
- Search and select your source connector from the Sources section, specify the mandatory fields accordingly, and click the Set up source button.
- Browse your Snowflake connector from the Destinations section, configure the necessary fields by following the setup guide, and click the Set up destination button.
- Finally, go to the Connections tab and set up a connection between your chosen source and Snowflake destination.
- During the connection configuration, you can specify the sync mode, frequency, and required data streams and then click the Finish and Sync button.
To leverage Airbyte for your specific use cases, you can connect with experts.
Conclusion
The INSERT INTO command is a powerful Snowflake feature that enables you to insert rows into tables as required. However, manual intervention is needed to add each batch of records, which becomes complex and slow for large datasets. With an efficient tool like Airbyte, you can automate data ingestion into Snowflake or any other destination you prefer.