How to Use the Snowflake INSERT INTO Command?

Jim Kutz
August 12, 2025
20 min read


Inserting data into Snowflake tables is one of the most fundamental operations in modern data engineering workflows. While the INSERT INTO command looks straightforward on the surface, data professionals working with enterprise-scale Snowflake deployments frequently run into performance bottlenecks, concurrency limits, and cost optimization challenges that can dramatically affect pipeline efficiency and operating expenses. The complexity deepens when organizations use INSERT INTO for high-volume data operations without understanding Snowflake's columnar architecture and its optimization requirements.

Succeeding with INSERT INTO takes more than basic syntax knowledge. It requires an understanding of performance optimization strategies, transaction management, and integration with Snowflake features such as clustering, materialized views, and automated ingestion, all of which turn simple insertion tasks into scalable data processing workflows.

What Is the Snowflake INSERT INTO Command?

The Snowflake INSERT INTO command enables you to add rows of data to an existing or newly-created table in the Snowflake data warehouse. You can insert one or more rows at a time by either specifying values directly or using data retrieved from another query.


Before inserting data into a Snowflake table, you must create it using the CREATE TABLE statement. Once the table is ready, you can utilize the INSERT INTO command to populate it with the required data. This process allows you to prepare structured data for further analysis and processing.
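
For example, the employee table used in the examples below could be created like this; the column types shown are assumptions for illustration:

CREATE TABLE employee (
  id         NUMBER,
  name       VARCHAR,
  department VARCHAR
);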

What Is the Syntax for Snowflake INSERT INTO?

INSERT [ OVERWRITE ] INTO <target_table_name> [ ( <target_column_name> [ , ... ] ) ]
  {
    VALUES ( { <value> | DEFAULT | NULL } [ , ... ] ) [ , ( ... ) ]
    | <query>
  }

This syntax is similar to the Redshift INSERT INTO command.

  • INSERT – initiates the command.
  • [OVERWRITE] – optional; replaces existing data when specified, otherwise appends.
  • target_table_name – name of the destination table.
  • target_column_name – optional list of destination columns.
  • VALUES (…) – supplies explicit values, DEFAULT, or NULL; multiple rows may be listed.
  • query – alternative to VALUES; inserts rows returned by a sub-query.

How Do You Load Data Into Snowflake Using INSERT INTO?

Using the INSERT INTO command, you can load a single row, multiple rows, multiple tables, or even JSON data into Snowflake tables.

Single-Row Insert

INSERT INTO table_name (column1, column2, ...) VALUES (value1, value2, ...);
INSERT INTO employee (id, name, department) VALUES (1011, 'Thompson', 'Engineering');

Multi-Row Insert

INSERT INTO table_name (column1, column2, ...)
VALUES
  (value11, value12, ...),
  (value21, value22, ...);

INSERT INTO employee (id, name, department)
VALUES
  (1017, 'Franklin', 'Finance'),
  (1018, 'Kim', 'HR');

Multi-Table Insert

INSERT INTO employee (id, name, department)
VALUES (1019, 'Ann', 'Sales');

INSERT INTO employee_hire_info (id, name, hire_date, department, salary)
VALUES (1010, 'Alen', '2024-06-22', 'Mechanics', 30000);

JSON Data Insert

Snowflake supports JSON via the VARIANT data type. Convert JSON strings with PARSE_JSON() before inserting.

INSERT INTO customer
SELECT PARSE_JSON(json_column1)
FROM VALUES
('{
  "Id": "C00012",
  "name": { "first": "Ken", "last": "Williams" },
  "company": "ABC",
  "email": "ken.williams@abc.info",
  "phone": "+1 (914) 486-3535",
  "address": "218 Tech Street, Innovation City, California, 90001"
}'),
('{
  "Id": "C00017",
  "name": { "first": "Dennis", "last": "Green" },
  "company": "XYZ",
  "email": "dennis.green@xyz.net",
  "phone": "+1 (979) 587-1011",
  "address": "421 Greenway Palace, Boulevard, Oregon, 97035"
}') AS v(json_column1);

Extracting fields:

SELECT
  json_column1:Id::STRING          AS Id,
  json_column1:name.first::STRING  AS first_name,
  json_column1:name.last::STRING   AS last_name,
  json_column1:company::STRING     AS company,
  json_column1:email::STRING       AS email,
  json_column1:phone::STRING       AS phone,
  json_column1:address::STRING     AS address
FROM customer;

Conditional Multi-Table Insert

INSERT [ OVERWRITE ] { FIRST | ALL }
  { WHEN <condition> THEN intoClause [ ... ] }
  [ ... ]
  [ ELSE intoClause ]
<subquery>

Example (ALL) – every matching WHEN clause fires, so a single row can be inserted into multiple tables:

INSERT ALL
  WHEN marks > 100 THEN INTO table1
  WHEN marks > 10 THEN INTO table1 INTO table2
  ELSE INTO table2
SELECT marks FROM Source_Table;

Example (FIRST) – only the first matching WHEN clause fires for each row:

INSERT FIRST
  WHEN marks > 100 THEN INTO table1
  WHEN marks > 10 THEN INTO table1 INTO table2
  ELSE INTO table2
SELECT marks FROM Source_Table;

Insert Using SELECT Statement

INSERT INTO employee (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE id = 4001;

INSERT INTO employees (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE hire_date > '2024-03-01';

INSERT INTO employee (id, name, department)
SELECT id, name, department
FROM temp_employees
WHERE department = 'Marketing';

INSERT INTO employee_hire_info (id, name, hire_date, department, salary)
SELECT id, name, hire_date, department, 45000
FROM temp_employees
WHERE department = 'Engineering';

INSERT OVERWRITE

Replace all data in the target table:

INSERT OVERWRITE INTO sample_employees
SELECT *
FROM employee
WHERE department = 'Engineering';

What Are the Common Challenges and Solutions When Using Snowflake INSERT INTO?

Data professionals frequently encounter several critical challenges when implementing Snowflake INSERT INTO operations at enterprise scale. Understanding these challenges and their solutions is essential for building reliable, efficient data pipelines that can handle production workloads without compromising performance or cost-effectiveness.

Performance Bottlenecks and Optimization Strategies

The most significant challenge with Snowflake INSERT INTO operations stems from performance degradation during large-scale data insertion tasks. Unlike traditional row-based databases, Snowflake's columnar architecture creates unique performance characteristics that can surprise experienced database professionals. INSERT INTO operations suffer substantial performance penalties when used inappropriately for bulk data loading, as the underlying columnar storage system is optimized for bulk operations rather than row-by-row processing.

Multi-row insertion strategies provide the most effective solution for improving INSERT performance. Instead of executing individual INSERT statements for each record, consolidating multiple rows into single statements dramatically reduces network overhead and transaction costs. This approach can improve performance by orders of magnitude compared to single-row insertion patterns, particularly when processing thousands or millions of records.

For scenarios requiring bulk data movement, leveraging Snowflake's COPY INTO command often provides superior performance compared to INSERT INTO operations. COPY INTO is specifically designed for high-volume data ingestion from staged files and can process gigabytes of data more efficiently than equivalent INSERT operations. When bulk loading is the primary requirement, staging data in cloud storage and using COPY INTO typically delivers better performance and cost efficiency.
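
As a minimal sketch of that pattern, the statement below bulk-loads CSV files from a named stage; the stage name, file pattern, and format options are assumptions for illustration:

-- Bulk load staged CSV files into the target table.
-- @employee_csv_stage, the pattern, and the file format are hypothetical.
COPY INTO employee
FROM @employee_csv_stage/daily/
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
PATTERN = '.*employees_.*[.]csv'
ON_ERROR = 'CONTINUE';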

Concurrency Limitations and Architectural Solutions

Snowflake imposes a strict limit of 20 queued DML statements targeting the same table simultaneously, creating significant operational bottlenecks in high-concurrency environments. This limitation becomes particularly problematic for applications with multiple parallel processes or microservices architectures where several components attempt to write to the same table concurrently.

Implementing table partitioning strategies and staging table architectures provides effective solutions for concurrency challenges. By distributing writes across multiple staging tables and periodically consolidating data into target tables, applications can work within Snowflake's concurrency constraints while maintaining high throughput. Queue-based architectures that serialize writes to individual tables can also eliminate concurrency conflicts while providing predictable performance characteristics.
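
A hedged sketch of the staging-table pattern looks like this; the staging table name is an assumption, and in practice the consolidation step would run on a schedule:

-- Each parallel writer inserts into its own staging table, so concurrent DML
-- never queues against the shared target table.
CREATE TABLE IF NOT EXISTS employee_stage_01 LIKE employee;

INSERT INTO employee_stage_01 (id, name, department)
VALUES (1021, 'Rivera', 'Sales');

-- A periodic job consolidates staged rows into the target and clears the stage.
INSERT INTO employee
SELECT * FROM employee_stage_01;

TRUNCATE TABLE employee_stage_01;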

For real-time data ingestion scenarios, leveraging Snowpipe's automated ingestion capabilities often provides better results than direct INSERT operations. Snowpipe can handle continuous data loading without the concurrency limitations that affect direct DML operations, making it ideal for high-frequency data insertion requirements.
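
A minimal Snowpipe definition might look like the sketch below; the pipe name, stage, and file format are assumptions, and AUTO_INGEST = TRUE additionally requires cloud event notifications to be configured:

-- Hypothetical pipe that continuously loads new staged files into employee.
CREATE OR REPLACE PIPE employee_pipe
  AUTO_INGEST = TRUE
AS
  COPY INTO employee
  FROM @employee_csv_stage
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);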

Data Consistency and Validation Complexities

Snowflake's approach to constraint enforcement creates unique challenges for data professionals accustomed to traditional relational databases. While Snowflake supports the definition of primary key, unique, and foreign key constraints, it does not enforce them during INSERT operations. This fundamental difference can lead to data quality issues and unexpected behavior when duplicate records or constraint violations occur.

Implementing comprehensive data validation and quality control processes outside of INSERT operations provides the most reliable solution for maintaining data integrity. Applications must implement business logic to ensure data consistency, including duplicate detection, referential integrity validation, and data type consistency checking. This approach requires additional development effort but provides more control over data quality outcomes than relying on database constraints.
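
One way to apply that logic, sketched under the assumption that id is the intended (unenforced) primary key and that new rows arrive in a staging table, is to de-duplicate within the batch and skip keys that already exist in the target, all inside the INSERT's source query:

-- Insert only ids that do not already exist in the target, keeping a single
-- row per id from the staging data; Snowflake itself will not reject duplicates.
INSERT INTO employee (id, name, department)
SELECT s.id, s.name, s.department
FROM employee_stage_01 s
LEFT JOIN employee e
  ON e.id = s.id
WHERE e.id IS NULL
QUALIFY ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY s.name) = 1;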

Multi-row INSERT operations in Snowflake require data types to be consistent across all rows, using the first row as a guide for type inference. This can lead to unexpected failures when inserting mixed data types, even when the target column could accommodate all values through implicit conversion. Implementing proper data type validation and conversion logic before insertion prevents these issues and ensures reliable data loading operations.
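
A hedged example of that validation step, assuming raw values land as strings in a hypothetical employee_raw table:

-- TRY_CAST returns NULL instead of raising an error when conversion fails,
-- so malformed rows can be filtered out (or routed to a reject table) first.
INSERT INTO employee (id, name, department)
SELECT TRY_CAST(raw_id AS NUMBER), raw_name, raw_department
FROM employee_raw
WHERE TRY_CAST(raw_id AS NUMBER) IS NOT NULL;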

How Can You Optimize Performance for Snowflake INSERT INTO Operations?

Performance optimization for Snowflake INSERT INTO operations requires understanding the platform's unique architecture and implementing strategies that align with its columnar storage design and cloud-native scaling capabilities. Effective optimization involves multiple complementary approaches that address both query-level performance and broader architectural considerations.

Warehouse Sizing and Resource Management

Virtual warehouse sizing represents one of the most critical factors affecting INSERT INTO performance. Larger warehouses provide more compute power for processing INSERT operations, particularly when dealing with complex transformations or high-volume data loads. However, warehouse size directly impacts costs, requiring careful balance between performance requirements and budget constraints.

Dynamic warehouse scaling strategies enable automatic adjustment of compute resources based on workload demands. Implementing separate warehouses for different types of INSERT workloads provides better resource isolation and optimization opportunities. Dedicated warehouses for ETL processing, streaming ingestion, and interactive operations enable fine-tuned optimization for each use case while preventing resource contention between different processing patterns.

Multi-cluster warehouse configurations provide automatic scaling capabilities that adapt to variable workload demands while maintaining consistent performance levels. This approach is particularly valuable for INSERT-heavy applications with unpredictable data volume patterns, as it ensures adequate resources are available during peak processing periods without over-provisioning during low-activity periods.
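
As an illustrative sketch, a dedicated multi-cluster warehouse for INSERT-heavy ETL might be defined as follows; the size, cluster counts, and scaling policy are assumptions to tune against your own workload and budget:

-- Hypothetical warehouse reserved for ETL inserts, isolated from BI queries.
CREATE OR REPLACE WAREHOUSE etl_load_wh
  WAREHOUSE_SIZE    = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3          -- scale out when concurrent loads queue up
  SCALING_POLICY    = 'STANDARD'
  AUTO_SUSPEND      = 60         -- seconds of inactivity before suspending
  AUTO_RESUME       = TRUE;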

Query Optimization and Execution Strategies

Query optimization for INSERT operations requires careful attention to data filtering, transformation logic, and join strategies that minimize computational overhead. When INSERT statements include complex WHERE clauses or JOIN operations, optimizing these components can dramatically improve overall performance. Leveraging Snowflake's query profiling capabilities helps identify bottlenecks and optimization opportunities within INSERT operations.

Batch processing strategies that group related INSERT operations into larger transactions can improve performance by reducing transaction overhead and enabling more efficient resource utilization. However, batch sizes must be balanced against transaction duration and error recovery requirements to avoid creating excessively long-running transactions that could impact system performance.
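
A minimal sketch of grouping related inserts into a single explicit transaction (the values are placeholders):

-- Related inserts commit together, so readers never see a partial batch.
BEGIN TRANSACTION;

INSERT INTO employee (id, name, department)
VALUES
  (1031, 'Osei', 'Engineering'),
  (1032, 'Tanaka', 'Engineering');

INSERT INTO employee_hire_info (id, name, hire_date, department, salary)
VALUES (1031, 'Osei', '2025-01-15', 'Engineering', 52000);

COMMIT;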

Utilizing prepared statements and parameterized queries for repetitive INSERT operations can reduce query compilation overhead and improve execution efficiency. This approach is particularly valuable for applications that execute similar INSERT patterns repeatedly with different data values, as it eliminates redundant query planning operations.

Advanced Data Organization Techniques

Clustering key implementation provides significant performance benefits for tables that receive frequent INSERT operations followed by analytical queries. Proper clustering ensures that inserted data is physically organized to optimize subsequent query performance, reducing scan costs and improving overall system efficiency. The selection of appropriate clustering keys should align with common query patterns and filtering requirements.
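
For example, if most analytical queries filter on department (an assumption about your query patterns), a clustering key could be added and inspected like this:

-- Cluster the table on the column analytical queries filter by most often.
ALTER TABLE employee CLUSTER BY (department);

-- Inspect how well the table is clustered on that key.
SELECT SYSTEM$CLUSTERING_INFORMATION('employee', '(department)');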

Leveraging materialized views and dynamic tables can improve performance for scenarios where INSERT operations are followed by complex analytical queries. By precomputing aggregations and transformations, these features reduce the computational load on the system while ensuring that analytical results remain current with inserted data. This approach is particularly valuable for real-time dashboard and reporting scenarios that require immediate availability of inserted data.
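
A hedged example of a dynamic table that keeps a departmental headcount current as rows are inserted; the target lag and warehouse name are assumptions:

-- Dynamic table that refreshes automatically as new rows land in employee.
CREATE OR REPLACE DYNAMIC TABLE employee_headcount_by_dept
  TARGET_LAG = '5 minutes'
  WAREHOUSE  = etl_load_wh
AS
  SELECT department, COUNT(*) AS headcount
  FROM employee
  GROUP BY department;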

Table design optimization, including appropriate data type selection and column ordering, can improve INSERT performance by reducing storage overhead and optimizing data compression. Snowflake's automatic optimization features work more effectively when table schemas are designed to support the platform's internal optimization algorithms.

How Does INSERT INTO Compare to COPY INTO?

Factor        | INSERT INTO                                    | COPY INTO
Purpose       | Row-level inserts                              | Bulk loading from stages or external locations
Syntax        | INSERT INTO <table> ... VALUES (...);          | COPY INTO <table> FROM <stage or location>;
Options       | Single-row, multi-row, conditional, overwrite  | File formats, pattern matching, transformations, error handling
Typical Usage | Transactional operations                       | High-volume ELT/ETL workflows

How Do You Handle Duplicates During Inserts?

  1. Create a duplicate table with the same schema:
   CREATE OR REPLACE TABLE employee_duplicate LIKE employee_info;
  2. Insert unique records using DISTINCT:
   INSERT INTO employee_duplicate
   SELECT DISTINCT *
   FROM employee_info;
  3. Swap the tables:
   ALTER TABLE employee_duplicate SWAP WITH employee_info;
  4. Drop the temporary table:
   DROP TABLE employee_duplicate;

How Do You Verify Data After INSERT INTO?

-- View inserted rows
SELECT * FROM target_table;

-- Row count check
SELECT COUNT(*) FROM target_table;

Audit logs and data-quality checks provide additional assurance; keep in mind that Snowflake records constraints as metadata but does not enforce them during inserts.
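
As a simple illustration of such a check, the query below flags duplicate ids after a load, under the assumption that id is meant to be unique:

-- Any rows returned are duplicate ids that Snowflake did not reject on insert.
SELECT id, COUNT(*) AS occurrences
FROM employee
GROUP BY id
HAVING COUNT(*) > 1;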

How Can You Automate Data Ingestion Into Snowflake with Airbyte?

While Snowflake's native INSERT INTO and COPY INTO commands are powerful, they can be cumbersome for large-scale or highly distributed data pipelines. Airbyte transforms how organizations approach data integration by solving the fundamental problem of effectively managing and integrating data across diverse enterprise environments. As the leading open-source data integration platform, Airbyte provides more than 600 pre-built connectors that eliminate the traditional trade-off between expensive, inflexible proprietary solutions and complex, resource-intensive custom integrations.

Organizations choose Airbyte because it generates open-standard code and provides deployment flexibility across cloud, hybrid, and on-premises environments while maintaining enterprise-grade security and governance capabilities. Rather than forcing teams into proprietary ecosystems, Airbyte enables data sovereignty and prevents vendor lock-in while delivering superior performance and cost efficiency compared to legacy ETL platforms.

If a needed connector doesn't exist, you can create one with the AI-assisted Connector Builder.

Basic steps:

  1. Sign in to Airbyte Cloud.
  2. Configure your Source connector.
  3. Configure your Snowflake Destination.
  4. Create a Connection, choose sync mode & frequency, then click Finish and Sync.

To explore custom use-cases, talk to Airbyte experts.

Conclusion

The INSERT INTO command is a versatile feature in Snowflake that lets you load data row-by-row, in bulk, conditionally, or by overwriting existing records. However, achieving optimal performance and reliability requires understanding Snowflake's unique architectural characteristics and implementing appropriate optimization strategies. While INSERT INTO provides essential capabilities for data manipulation, recognizing when to leverage alternative approaches like COPY INTO for bulk operations or Snowpipe for continuous ingestion ensures efficient data processing workflows. For larger pipelines, automating ingestion with a tool like Airbyte eliminates manual effort and scales seamlessly while maintaining data sovereignty and preventing vendor lock-in.


Frequently Asked Questions

What is the difference between INSERT INTO and INSERT OVERWRITE in Snowflake?

INSERT INTO appends new rows to the existing data in a table, while INSERT OVERWRITE replaces all existing data in the target table with the new data being inserted. Use INSERT OVERWRITE when you want to completely refresh a table's contents rather than adding to existing data.

Can I use INSERT INTO for real-time data ingestion in Snowflake?

While INSERT INTO can handle real-time insertions, it's not optimal for high-volume streaming scenarios due to Snowflake's concurrent write limitations and performance characteristics. For real-time ingestion, consider using Snowpipe Streaming or staging data with COPY INTO operations for better performance and scalability.

How does Snowflake handle constraint violations during INSERT operations?

Snowflake supports defining constraints like primary keys and foreign keys but does not enforce them during INSERT operations. The constraints serve as metadata for query optimization and documentation purposes. You must implement application-level validation to ensure data integrity and prevent constraint violations.

What are the best practices for optimizing INSERT INTO performance in Snowflake?

Use multi-row INSERT statements instead of single-row operations, implement appropriate warehouse sizing for your workload, leverage clustering keys for frequently queried tables, and consider using COPY INTO for bulk loading scenarios. Additionally, optimize your queries by minimizing complex transformations during insertion and using batch processing strategies.

When should I use INSERT INTO versus COPY INTO in Snowflake?

Use INSERT INTO for transactional operations, small data volumes, and when you need conditional logic or data transformations during insertion. Use COPY INTO for bulk loading large datasets from staged files, as it provides better performance and cost efficiency for high-volume data ingestion scenarios.
