How to Set Up Schema Change Handling in Snowflake
Your pipeline ran fine for months. Then someone added a column to the source table, and now your Snowflake destination is throwing errors. Schema changes are one of the most common causes of pipeline failures, and most teams discover them only after data stops flowing.
This guide walks through how to detect and handle schema changes in Snowflake, from manual approaches using native SQL to automated schema propagation.
TL;DR: How to Set Up Schema Change Handling in Snowflake at a Glance
- Query INFORMATION_SCHEMA to capture current column names, types, and positions for any table
- Create a schema snapshot table to store historical schema states and track changes over time
- Build a comparison query using FULL OUTER JOIN to detect added columns, removed columns, and type changes
- Apply changes with ALTER TABLE statements to add columns, modify types, or handle removals
- Automate with Airbyte's schema propagation to detect and apply schema changes automatically without custom SQL maintenance
Why Do Schema Changes Break Snowflake Pipelines?
Source systems evolve constantly. Product teams add new fields to track features. Marketing renames columns to match new campaign structures. Engineers change data types to handle edge cases. Each of these changes can break downstream pipelines that expect a fixed schema.
When incoming data doesn't match your Snowflake destination table, inserts fail. A new column in the source has nowhere to land. A renamed column leaves the old destination column empty while the new data gets rejected. Type mismatches cause casting errors that halt entire syncs.
The challenge splits into two parts: detecting when source schemas change, and propagating those changes to your destination tables. Most teams handle both manually, which means schema drift goes unnoticed until something breaks in production.
What Are Your Options for Handling Schema Changes in Snowflake?
Three main approaches exist for managing schema evolution in Snowflake pipelines.
1. Manual Schema Updates
Query INFORMATION_SCHEMA periodically to detect drift between source and destination. When you find mismatches, write ALTER TABLE statements to modify Snowflake tables. This approach gives you full control but requires constant attention. Most teams fall behind on schema maintenance until a pipeline failure forces them to catch up.
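To make the manual workflow concrete, here is one possible drift check, assuming a staging copy of the source table already lands in Snowflake (the STAGING_TABLE and DESTINATION_TABLE names are placeholders):

```sql
-- Columns present in the staging copy of the source but missing from the
-- destination; each result row corresponds to an ALTER TABLE ... ADD COLUMN
SELECT s.column_name, s.data_type
FROM information_schema.columns s
LEFT JOIN information_schema.columns d
    ON  d.table_schema = 'YOUR_SCHEMA'
    AND d.table_name   = 'DESTINATION_TABLE'
    AND d.column_name  = s.column_name
WHERE s.table_schema = 'YOUR_SCHEMA'
  AND s.table_name   = 'STAGING_TABLE'
  AND d.column_name IS NULL;
```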
2. Automated Schema Propagation
Data integration tools can detect source schema changes and apply them to destinations automatically. This removes the manual toil of monitoring and updating schemas. Airbyte's schema propagation feature, for example, detects new columns, type changes, and removed fields, then updates your Snowflake destination accordingly.
3. Schema Evolution with Semi-Structured Data
Load everything into a VARIANT column and flatten at query time. This absorbs any schema change without pipeline modifications. The trade-off is query performance and complexity. Every downstream consumer needs to handle the semi-structured data, and you lose the benefits of typed columns for filtering and aggregation.
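As a sketch of this pattern (the table and field names here are illustrative): land each record in a single VARIANT column, then extract typed fields at query time.

```sql
-- Land every record as-is; source schema changes never break the load
CREATE TABLE IF NOT EXISTS raw_events (
    payload   VARIANT,
    loaded_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);

-- Flatten at query time; new source fields simply appear under
-- payload without any ALTER TABLE
SELECT
    payload:id::INTEGER               AS id,
    payload:event_name::STRING        AS event_name,
    payload:created_at::TIMESTAMP_NTZ AS created_at
FROM raw_events;
```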
How to Configure Schema Change Detection in Snowflake?
If you're building schema change detection manually, start by capturing and comparing schema states over time.
1. Query INFORMATION_SCHEMA to Capture Current State
Snowflake's INFORMATION_SCHEMA contains metadata about every table in your database. Query it to get the current column structure:
```sql
SELECT
    column_name,
    data_type,
    ordinal_position,
    is_nullable
FROM information_schema.columns
WHERE table_schema = 'YOUR_SCHEMA'
  AND table_name = 'YOUR_TABLE'
ORDER BY ordinal_position;
```

This returns the column names, types, and positions for any table. Run this against both your source system (if accessible) and your Snowflake destination to compare structures.
2. Create a Schema Snapshot Table
Store historical schema states to track changes over time:
```sql
CREATE TABLE IF NOT EXISTS schema_snapshots (
    snapshot_id INTEGER AUTOINCREMENT,
    table_schema STRING,
    table_name STRING,
    column_name STRING,
    data_type STRING,
    ordinal_position INTEGER,
    captured_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);

INSERT INTO schema_snapshots (table_schema, table_name, column_name, data_type, ordinal_position)
SELECT
    table_schema,
    table_name,
    column_name,
    data_type,
    ordinal_position
FROM information_schema.columns
WHERE table_schema = 'YOUR_SCHEMA'
  AND table_name = 'YOUR_TABLE';
```

3. Build a Comparison Query to Detect Drift
Compare the current schema against your last snapshot to find additions, removals, and type changes:
```sql
WITH current_schema AS (
    SELECT column_name, data_type, ordinal_position
    FROM information_schema.columns
    WHERE table_schema = 'YOUR_SCHEMA'
      AND table_name = 'YOUR_TABLE'
),
last_snapshot AS (
    SELECT column_name, data_type, ordinal_position
    FROM schema_snapshots
    WHERE table_schema = 'YOUR_SCHEMA'
      AND table_name = 'YOUR_TABLE'
      AND captured_at = (
          SELECT MAX(captured_at)
          FROM schema_snapshots
          WHERE table_schema = 'YOUR_SCHEMA'
            AND table_name = 'YOUR_TABLE'
      )
)
SELECT
    COALESCE(c.column_name, l.column_name) AS column_name,
    CASE
        WHEN l.column_name IS NULL THEN 'ADDED'
        WHEN c.column_name IS NULL THEN 'REMOVED'
        WHEN c.data_type != l.data_type THEN 'TYPE_CHANGED'
        ELSE 'UNCHANGED'
    END AS change_type,
    l.data_type AS old_type,
    c.data_type AS new_type
FROM current_schema c
FULL OUTER JOIN last_snapshot l ON c.column_name = l.column_name
WHERE l.column_name IS NULL
   OR c.column_name IS NULL
   OR c.data_type != l.data_type;
```

4. Set Up Alerts for Schema Changes
Use a scheduled Snowflake task to capture snapshots automatically, so the comparison query can detect drift whenever it runs:

```sql
CREATE OR REPLACE TASK check_schema_changes
    WAREHOUSE = YOUR_WAREHOUSE
    SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
    -- Capture a fresh snapshot each run; the comparison query above
    -- then reveals drift since the previous capture
    INSERT INTO schema_snapshots (table_schema, table_name, column_name, data_type, ordinal_position)
    SELECT table_schema, table_name, column_name, data_type, ordinal_position
    FROM information_schema.columns
    WHERE table_schema = 'YOUR_SCHEMA'
      AND table_name = 'YOUR_TABLE';

-- Tasks are created suspended; resume to start the schedule
ALTER TASK check_schema_changes RESUME;
```

Connect alerts to Slack, email, or your incident management system using Snowflake's notification integrations.
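For a fully in-Snowflake path, a Snowflake Alert can run a drift check on a schedule and send an email when it fires. A minimal sketch, assuming a hypothetical email integration named schema_alerts_email and a placeholder recipient:

```sql
-- Hypothetical email integration; the name and recipients are placeholders
CREATE NOTIFICATION INTEGRATION schema_alerts_email
    TYPE = EMAIL
    ENABLED = TRUE
    ALLOWED_RECIPIENTS = ('data-team@example.com');

CREATE OR REPLACE ALERT schema_change_alert
    WAREHOUSE = YOUR_WAREHOUSE
    SCHEDULE = '60 MINUTE'
    IF (EXISTS (
        -- Fires when a column exists that no snapshot has recorded yet;
        -- extend with the full comparison query from step 3 to also
        -- cover removals and type changes
        SELECT 1
        FROM information_schema.columns c
        WHERE c.table_schema = 'YOUR_SCHEMA'
          AND c.table_name = 'YOUR_TABLE'
          AND NOT EXISTS (
              SELECT 1
              FROM schema_snapshots s
              WHERE s.table_schema = c.table_schema
                AND s.table_name = c.table_name
                AND s.column_name = c.column_name
          )
    ))
    THEN CALL SYSTEM$SEND_EMAIL(
        'schema_alerts_email',
        'data-team@example.com',
        'Schema drift detected',
        'INFORMATION_SCHEMA no longer matches the latest snapshot.'
    );

-- Alerts are created suspended; resume to activate
ALTER ALERT schema_change_alert RESUME;
```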
How to Propagate Schema Changes to Snowflake Destination Tables?
Once you detect a schema change, you need to apply it to your destination table.
1. Handle New Columns
Adding columns is the safest schema change. Existing data remains intact, and the new column accepts NULL values until populated:
```sql
ALTER TABLE destination_table
ADD COLUMN new_column_name VARCHAR;
```

Match the data type to your source. For typed sources, map types directly. For semi-structured sources, choose types that accommodate the expected data range.
2. Handle Column Type Changes
Type changes require more care. Snowflake allows some implicit conversions, but others require explicit handling:
```sql
-- For compatible type changes (e.g., VARCHAR to larger VARCHAR)
ALTER TABLE destination_table
MODIFY COLUMN column_name VARCHAR(500);

-- For incompatible changes, create a new column and migrate
ALTER TABLE destination_table
ADD COLUMN column_name_new INTEGER;

UPDATE destination_table
SET column_name_new = TRY_CAST(column_name AS INTEGER);

ALTER TABLE destination_table
DROP COLUMN column_name;

ALTER TABLE destination_table
RENAME COLUMN column_name_new TO column_name;
```

Use TRY_CAST to handle values that don't convert cleanly. Decide whether to fail on conversion errors or accept NULL values for incompatible rows.
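Before dropping the original column, it helps to see how many rows would lose data. A quick check, using the same hypothetical column_name and INTEGER target as above:

```sql
-- Rows with a non-NULL value that TRY_CAST cannot convert to INTEGER;
-- these become NULL after the migration above
SELECT COUNT(*) AS unconvertible_rows
FROM destination_table
WHERE column_name IS NOT NULL
  AND TRY_CAST(column_name AS INTEGER) IS NULL;
```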
3. Handle Column Removals
Removing columns is a business decision as much as a technical one. Options include:
- Drop the column if the data is no longer needed and no downstream systems depend on it
- Keep the column with NULL values going forward to preserve historical data
- Rename with a deprecated prefix to signal the column is no longer populated
```sql
-- Option 1: Drop
ALTER TABLE destination_table
DROP COLUMN removed_column;

-- Option 2: Keep (no action needed, column receives NULLs)

-- Option 3: Deprecate
ALTER TABLE destination_table
RENAME COLUMN removed_column TO _deprecated_removed_column;
```

How Does Airbyte Automate Schema Change Handling for Snowflake?
Building and maintaining custom schema detection adds ongoing work to your pipeline operations. Airbyte handles schema changes automatically through its schema propagation feature.
When you connect a source to Snowflake in Airbyte, the platform detects the source schema and creates corresponding destination tables. As the source schema evolves, Airbyte detects changes and offers three handling modes:
- Propagate all changes: automatically applies additions, type changes, and removals to your Snowflake destination
- Propagate safe changes: only applies new columns, flagging potentially breaking changes for review
- Ignore changes: maintains your existing destination schema regardless of source evolution
Column selection lets you control exactly which fields sync to Snowflake. Exclude sensitive columns, ignore deprecated fields, or focus on the subset of data your analytics actually needs.
Configuration takes minutes in the Airbyte UI. Select your source from the 600+ available connectors, choose Snowflake as your destination, and set your schema propagation preference. No custom SQL to maintain, no scheduled tasks to monitor.
Ready to Automate Your Snowflake Schema Management?
Schema changes will happen in any production data pipeline. You can manage them manually with periodic INFORMATION_SCHEMA queries and ALTER TABLE statements, or automate the process with tools built for schema propagation.
Start moving data to Snowflake with automatic schema change handling. Try Airbyte and connect your first data source in minutes.
Frequently Asked Questions
Does Snowflake support automatic schema evolution?
Snowflake supports schema evolution for tables using the VARIANT data type and for Parquet, Avro, and ORC file formats loaded via COPY INTO. For standard relational tables, schema changes require explicit ALTER TABLE statements or external tooling to detect and propagate changes automatically.
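For the file-loading path, schema evolution can be enabled at the table level. A minimal sketch, assuming a hypothetical stage named my_stage holding Parquet files:

```sql
-- Let COPY INTO add columns it finds in incoming files
CREATE TABLE IF NOT EXISTS raw_orders (
    order_id NUMBER
)
ENABLE_SCHEMA_EVOLUTION = TRUE;

COPY INTO raw_orders
FROM @my_stage
FILE_FORMAT = (TYPE = PARQUET)
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;
```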
How do I check if a column was added or removed in Snowflake?
Query the INFORMATION_SCHEMA.COLUMNS view to get the current schema, then compare against a stored snapshot of the previous schema state. A FULL OUTER JOIN between current and historical schema data reveals added columns (present only in current), removed columns (present only in snapshot), and type changes (mismatched data_type values).
Can I roll back a schema change in Snowflake?
Snowflake's Time Travel feature lets you query historical data, but it doesn't automatically restore schema changes. To roll back an ALTER TABLE operation, you need to execute a reverse statement manually. For example, if you added a column, run ALTER TABLE DROP COLUMN. Keep a log of schema changes to enable rollbacks when needed.
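If data was lost along with the schema change, such as a dropped column, Time Travel can recover the prior state by cloning the table as it existed before the change. A sketch, assuming the change happened within the retention window (here, in the last hour):

```sql
-- Recreate the table as it existed one hour ago, then swap it
-- in place of the current version
CREATE OR REPLACE TABLE destination_table_restored
    CLONE destination_table AT (OFFSET => -3600);

ALTER TABLE destination_table_restored
    SWAP WITH destination_table;
```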
What happens to existing data when I change a column type in Snowflake?
Snowflake allows some in-place type modifications, such as increasing VARCHAR length or changing NUMBER precision. For incompatible type changes, you need to create a new column, migrate data using CAST or TRY_CAST, then drop the original column. Rows that fail type conversion return NULL when using TRY_CAST, or cause the entire operation to fail when using CAST.