DynamoDB Replication: Easy Step-by-Step Guide

Jim Kutz
January 16, 2026

Teams replicating DynamoDB data to analytics warehouses typically face two painful choices: write custom Lambda functions that break when schemas change, or run expensive full-table exports that leave data hours stale. Neither approach scales without significant engineering overhead.

This guide walks through three approaches to DynamoDB replication: native AWS Streams, S3 exports, and managed connectors. You can set up continuous replication to Snowflake, BigQuery, or Redshift in under 30 minutes without writing custom code.

TL;DR: DynamoDB Replication at a Glance

  • Streams + Lambda: Sub-second latency, but requires custom code and ongoing maintenance
  • Export to S3: Simple batch snapshots, but data is hours stale and not continuous
  • Managed connectors: Continuous warehouse replication without custom code, minutes-level latency
  • Best for most teams: Managed connectors if you need warehouse sync; Streams if you need real-time triggers

What Are Your DynamoDB Replication Options?

DynamoDB offers native replication mechanisms, but each comes with trade-offs around latency, cost, and maintenance. The right choice depends on your data freshness requirements and how much custom infrastructure you want to maintain.

1. DynamoDB Streams + Lambda

DynamoDB Streams captures item-level changes as they happen, triggering Lambda functions for each modification. Stream records persist for 24 hours, giving you a rolling window to process changes.

  • Best for: Real-time triggers, event-driven architectures, small-scale internal processing
  • Latency: Sub-second when Lambda is warm
  • Limitations: Requires custom code, 24-hour data retention window, Lambda cold starts can add latency

2. DynamoDB Export to S3

Point-in-time exports dump your entire table to S3 in DynamoDB JSON or Amazon Ion format. This approach works well for batch analytics where you can tolerate data that's hours old.

  • Best for: Periodic snapshots, batch analytics, data lake ingestion
  • Latency: Hours (export duration depends on table size)
  • Limitations: Not continuous, export costs scale with table size, requires Point-in-Time Recovery enabled

3. Managed Connectors

Data integration platforms like Airbyte provide pre-built DynamoDB connectors that handle schema detection, incremental syncs, and destination mapping without custom code.

  • Best for: Continuous warehouse replication, teams without bandwidth for custom infrastructure
  • Latency: Minutes to hours (configurable sync frequency)
  • Limitations: Additional tooling cost, depends on connector quality

Talk to sales to understand how managed connectors reduce pipeline maintenance, adapt to DynamoDB schema changes, and keep warehouse syncs reliable as data volumes grow.

How Do You Set Up DynamoDB Replication Using Streams?

DynamoDB Streams provides the foundation for event-driven replication. This approach requires custom Lambda code but delivers the lowest latency option.

1. Enable DynamoDB Streams on Your Table

Navigate to your DynamoDB table in the AWS Console, select the Exports and streams tab, and enable DynamoDB Streams. Choose NEW_AND_OLD_IMAGES as your stream view type to capture both the before and after state of each item change.

Alternatively, enable streams via AWS CLI:

aws dynamodb update-table \
  --table-name <your-table-name> \
  --stream-specification StreamEnabled=true,StreamViewType=NEW_AND_OLD_IMAGES

Stream view type options:

  • KEYS_ONLY: Only the key attributes of the modified item
  • NEW_IMAGE: The entire item after modification
  • OLD_IMAGE: The entire item before modification
  • NEW_AND_OLD_IMAGES: Both before and after states (recommended for most replication use cases)

2. Create a Lambda Function to Process Stream Events

Create a Lambda function with a DynamoDB Streams trigger. The function receives batches of stream records and writes them to your destination.

Basic Python handler structure:

def lambda_handler(event, context):
    # Each invocation receives a batch of stream records from a single shard.
    for record in event["Records"]:
        event_name = record["eventName"]
        dynamodb = record["dynamodb"]

        if event_name == "INSERT":
            new_image = dynamodb["NewImage"]
            # Transform and send the new item to your destination

        elif event_name == "MODIFY":
            new_image = dynamodb["NewImage"]
            old_image = dynamodb["OldImage"]
            # Handle updates

        elif event_name == "REMOVE":
            old_image = dynamodb["OldImage"]
            # Handle deletes

    return {"statusCode": 200}
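
The NewImage and OldImage payloads arrive in DynamoDB's typed attribute format (for example, {"S": "..."} for strings and {"N": "..."} for numbers). If your destination expects plain values, boto3's TypeDeserializer can unwrap them. A minimal sketch; the unmarshal helper is illustrative, not part of the AWS SDK:

from boto3.dynamodb.types import TypeDeserializer

deserializer = TypeDeserializer()

def unmarshal(image):
    # Convert {"attr": {"S": "value"}} into {"attr": "value"}; numbers come back as Decimal.
    return {name: deserializer.deserialize(value) for name, value in image.items()}

# Example: unmarshal({"pk": {"S": "user#1"}, "score": {"N": "42"}})
# returns {"pk": "user#1", "score": Decimal("42")}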

Required IAM permissions for your Lambda execution role:

  • dynamodb:GetRecords
  • dynamodb:GetShardIterator
  • dynamodb:DescribeStream
  • dynamodb:ListStreams
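
With those permissions on the execution role, the function still needs to be attached to the stream. The console's "Add trigger" flow handles this, and a rough boto3 equivalent looks like the sketch below; the table and function names are placeholders:

import boto3

dynamodb = boto3.client("dynamodb")
lambda_client = boto3.client("lambda")

# Look up the stream ARN that DynamoDB assigned when streams were enabled.
stream_arn = dynamodb.describe_table(TableName="YourTable")["Table"]["LatestStreamArn"]

# Wire the stream to the function; the Lambda service polls the shards on your behalf.
lambda_client.create_event_source_mapping(
    EventSourceArn=stream_arn,
    FunctionName="dynamodb-replication-handler",  # placeholder function name
    StartingPosition="LATEST",  # or TRIM_HORIZON to start from the oldest retained record
    BatchSize=100,
)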

3. Configure Error Handling and Monitoring

Stream processing failures can cause data loss if not handled properly. Configure a dead-letter queue (DLQ) to capture failed records and set up CloudWatch alarms for iterator age.
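
For stream triggers, retry behavior and the failure destination live on the event source mapping rather than on the function itself. A hedged boto3 sketch, assuming an existing SQS queue and the mapping's UUID (both placeholders):

import boto3

lambda_client = boto3.client("lambda")

# Cap retries, split failing batches, and route records that still fail to an SQS DLQ.
lambda_client.update_event_source_mapping(
    UUID="<your-event-source-mapping-uuid>",
    MaximumRetryAttempts=3,
    BisectBatchOnFunctionError=True,  # isolate the poison record instead of retrying the whole batch
    DestinationConfig={
        "OnFailure": {"Destination": "arn:aws:sqs:us-east-1:123456789012:replication-dlq"}  # placeholder ARN
    },
)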

Key CloudWatch metrics to monitor:

  • IteratorAge: Time between record creation and processing. High values indicate falling behind.
  • Errors: Lambda invocation failures
  • Throttles: Stream read throttling events
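
As one illustration, an IteratorAge alarm could be created with boto3 along these lines; the function name, SNS topic, and five-minute threshold are assumptions to adjust:

import boto3

cloudwatch = boto3.client("cloudwatch")

# Alert when the stream consumer falls more than five minutes behind the table's writes.
cloudwatch.put_metric_alarm(
    AlarmName="dynamodb-replication-iterator-age",
    Namespace="AWS/Lambda",
    MetricName="IteratorAge",
    Dimensions=[{"Name": "FunctionName", "Value": "dynamodb-replication-handler"}],  # placeholder
    Statistic="Maximum",
    Period=60,
    EvaluationPeriods=5,
    Threshold=300_000,  # IteratorAge is reported in milliseconds
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:replication-alerts"],  # placeholder SNS topic
)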

How Do You Set Up DynamoDB Replication Using Export to S3?

S3 exports work well for batch analytics workflows where data freshness measured in hours is acceptable. This approach requires minimal custom code but produces point-in-time snapshots rather than continuous replication.

1. Enable Point-in-Time Recovery

DynamoDB exports require Point-in-Time Recovery (PITR) enabled on your table. Enable it in the AWS Console under the Backups tab, or via CLI:

aws dynamodb update-continuous-backups \
  --table-name <your-table-name> \
  --point-in-time-recovery-specification PointInTimeRecoveryEnabled=true

2. Run the Export

Create an S3 bucket for your exports if you don't have one, then initiate the export:

aws dynamodb export-table-to-point-in-time \
  --table-arn arn:aws:dynamodb:us-east-1:123456789012:table/YourTable \
  --s3-bucket <your-export-bucket> \
  --s3-prefix dynamodb-exports/ \
  --export-format DYNAMODB_JSON

Export format options include DYNAMODB_JSON and ION. For most warehouse destinations, DYNAMODB_JSON provides easier parsing.
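
DYNAMODB_JSON exports are typically written as gzip-compressed, newline-delimited files under an AWSDynamoDB/<export-id>/data/ prefix, with each line wrapping one item in an "Item" key; check the export's manifest files for the exact object keys. A rough parsing sketch (bucket and key are placeholders):

import gzip
import io
import json

import boto3
from boto3.dynamodb.types import TypeDeserializer

s3 = boto3.client("s3")
deserializer = TypeDeserializer()

# Download one export data file; each line looks like {"Item": {"pk": {"S": "..."}, ...}}.
obj = s3.get_object(
    Bucket="<your-export-bucket>",
    Key="dynamodb-exports/AWSDynamoDB/<export-id>/data/<file>.json.gz",
)
with gzip.open(io.BytesIO(obj["Body"].read()), mode="rt") as lines:
    for line in lines:
        item = json.loads(line)["Item"]
        row = {name: deserializer.deserialize(value) for name, value in item.items()}
        # Load `row` into a staging table in your warehouse here.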

3. Schedule Recurring Exports (Optional)

For automated daily or hourly exports, create an EventBridge rule that triggers a Lambda function to initiate exports on your schedule. Keep in mind that export costs scale with table size, and each export is a full table snapshot.
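
A minimal sketch of that Lambda using boto3's export API; the table ARN, bucket, and prefix below are placeholders:

import boto3

dynamodb = boto3.client("dynamodb")

def lambda_handler(event, context):
    # Kick off a fresh point-in-time export; the EventBridge schedule controls how often this runs.
    response = dynamodb.export_table_to_point_in_time(
        TableArn="arn:aws:dynamodb:us-east-1:123456789012:table/YourTable",  # placeholder
        S3Bucket="<your-export-bucket>",
        S3Prefix="dynamodb-exports/",
        ExportFormat="DYNAMODB_JSON",
    )
    return {"exportArn": response["ExportDescription"]["ExportArn"]}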

How Do You Replicate DynamoDB to a Data Warehouse Without Custom Code?

Managed data integration platforms eliminate the custom code requirement while providing continuous replication to warehouses like Snowflake, BigQuery, and Redshift. This approach trades platform costs for reduced engineering overhead.

1. Set Up Your DynamoDB Source Connection

Create an IAM user or role with the following permissions for your DynamoDB connector:

  • dynamodb:Scan (required for full table reads)
  • dynamodb:DescribeTable (required for schema discovery)
  • dynamodb:ListTables (optional, for table discovery)
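
Rather than attaching broad permissions, you can scope the policy to the tables being replicated. A sketch using boto3; the policy name and table ARN are placeholders:

import json

import boto3

iam = boto3.client("iam")

# Minimal read-only policy for a DynamoDB source connector, scoped to a single table.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["dynamodb:Scan", "dynamodb:DescribeTable"],
            "Resource": "arn:aws:dynamodb:us-east-1:123456789012:table/YourTable",  # placeholder
        },
        {"Effect": "Allow", "Action": "dynamodb:ListTables", "Resource": "*"},
    ],
}

iam.create_policy(
    PolicyName="dynamodb-connector-read-only",
    PolicyDocument=json.dumps(policy_document),
)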

Configure the source connection with your AWS access key, secret key, and the region where your DynamoDB tables reside.

2. Configure Your Destination

Select your target warehouse and provide connection credentials. Common destinations include:

  • Snowflake: Account identifier, warehouse, database, schema, and credentials
  • BigQuery: Project ID, dataset, and service account JSON
  • Redshift: Host, port, database, schema, and credentials
  • Databricks: Workspace URL, HTTP path, and personal access token

3. Select Sync Mode and Frequency

Choose between full refresh (complete table replacement each sync) and incremental (only changed records). For DynamoDB tables with a consistent primary key, incremental syncs reduce data transfer and warehouse compute costs.

Set your sync frequency based on data freshness requirements. Options typically range from every hour to once daily.

4. Run Your First Sync

Trigger an initial sync and monitor the job status. The first sync performs a full table scan, so expect longer runtimes for large tables. Subsequent incremental syncs complete faster.

Verify data in your destination by comparing record counts and spot-checking a few rows against the source table.
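
On the DynamoDB side, a quick count can be pulled with boto3. DescribeTable's ItemCount only refreshes every few hours and a COUNT scan reads the whole table, so treat both as sanity checks rather than exact reconciliation; the table name is a placeholder:

import boto3

dynamodb = boto3.client("dynamodb")

# Approximate count without scanning (ItemCount refreshes roughly every six hours).
approx = dynamodb.describe_table(TableName="YourTable")["Table"]["ItemCount"]

# Exact count via a paginated COUNT scan (consumes read capacity on large tables).
exact = 0
for page in dynamodb.get_paginator("scan").paginate(TableName="YourTable", Select="COUNT"):
    exact += page["Count"]

print(f"approximate={approx} exact={exact}")  # compare against SELECT COUNT(*) in the warehouse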

Which DynamoDB Replication Method Should You Choose?

The right approach depends on your latency requirements, engineering capacity, and maintenance tolerance.

  • Choose Streams + Lambda if you need sub-second latency and have engineering capacity to build and maintain custom code. This approach gives you the most control but requires ongoing maintenance as schemas evolve.
  • Choose Export to S3 if you need periodic snapshots for batch analytics and can tolerate data that's hours old. This is the simplest option but doesn't support continuous replication.
  • Choose a managed connector if you need continuous warehouse replication without maintaining custom infrastructure. This approach trades platform costs for reduced engineering overhead.

Start replicating DynamoDB to your warehouse in minutes. Try Airbyte free and connect your first data source today.

Frequently Asked Questions

How does DynamoDB replication differ from DynamoDB Global Tables?

DynamoDB replication for analytics focuses on copying data out of DynamoDB into external systems like warehouses or data lakes. Global Tables, by contrast, replicate data between DynamoDB tables across AWS regions for low-latency application access. Global Tables are designed for multi-region availability, not analytics, and they do not help you query DynamoDB data in Snowflake, BigQuery, or Redshift.

Does DynamoDB Streams guarantee exactly-once delivery?

No. DynamoDB Streams provides at-least-once delivery. This means your Lambda or downstream consumer must handle potential duplicate records. Production-grade replication pipelines implement idempotent writes or deduplication logic to avoid double-counting updates in the destination.
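
One common pattern is to derive an idempotency key from each record's item keys and per-shard sequence number and have the destination writer skip keys it has already applied. A minimal sketch; the helper is illustrative:

import json

def dedupe_key(record):
    # Combine the item's key attributes with the stream record's sequence number;
    # store applied keys in the destination so repeated deliveries become no-ops.
    dynamodb = record["dynamodb"]
    return json.dumps(dynamodb["Keys"], sort_keys=True) + ":" + dynamodb["SequenceNumber"]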

Can DynamoDB replication handle schema changes automatically?

Native approaches do not handle schema evolution automatically. With Streams + Lambda, you must update transformation logic when attributes change. S3 exports simply dump raw items without enforcing schemas. Managed connectors typically detect new attributes, map them to new columns, and propagate changes to the destination with minimal manual intervention.

What is the main cost driver for DynamoDB replication?

Costs vary by method. Streams-based replication incurs DynamoDB Streams read costs and Lambda execution costs that scale with write volume. S3 exports scale with table size and export frequency. Managed connectors add a platform cost but often reduce overall spend by lowering engineering time, operational overhead, and reprocessing caused by brittle custom pipelines.
