How to load data from SFTP Bulk to Redshift
Learn how to use Airbyte to synchronize your SFTP Bulk data into Redshift within minutes.


How to Sync SFTP Bulk to Redshift Manually
First, ensure you have the necessary credentials and permissions to access the SFTP server. This includes obtaining the hostname, port number, username, password, and any SSH keys required for authentication. Verify that you can connect to the SFTP server using an SFTP client or command-line tool to ensure your credentials are correct.
Use a command-line tool such as `sftp` or `scp` to download the data files from the SFTP server to a local or intermediary server. For example, using the `sftp` command, you can connect to the server and use `get` or `mget` to download files:
```
sftp user@hostname
sftp> get /path/to/remote/file /path/to/local/directory
```
Repeat this step for each file or automate the process using a shell script.
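One way to automate the download is to generate an `sftp` batch file and run it non-interactively with `sftp -b`. The sketch below only builds the batch text and the command line; the helper names `build_sftp_batch` and `build_sftp_command` are hypothetical, and it assumes paths without whitespace and key-based authentication, since batch mode cannot prompt for a password.

```python
def build_sftp_batch(remote_files, local_dir):
    """Return the text of an sftp batch file that downloads each remote file."""
    lines = []
    for path in remote_files:
        # Assumption: paths contain no whitespace, so no quoting is needed.
        if any(ch.isspace() for ch in path):
            raise ValueError(f"path contains whitespace: {path!r}")
        lines.append(f"get {path} {local_dir}")
    lines.append("bye")
    return "\n".join(lines) + "\n"


def build_sftp_command(user, host, batch_path, port=22, identity_file=None):
    """Return the argv list for a non-interactive sftp run."""
    cmd = ["sftp", "-b", batch_path, "-P", str(port)]
    if identity_file:
        cmd += ["-i", identity_file]  # private key used for authentication
    cmd.append(f"{user}@{host}")
    return cmd
```

To run it, write the batch text to a file and pass the command list to `subprocess.run(cmd, check=True)`; a scheduler such as cron can then repeat the download.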
Ensure that the downloaded data files are in a format compatible with Amazon Redshift, such as CSV or TSV. If necessary, convert or transform the data using tools like `awk`, `sed`, or Python scripts. Also, clean the data by removing duplicates, handling missing values, or applying any necessary transformations to match your Redshift schema.
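As an illustration of the cleaning step, the following sketch uses Python's standard `csv` module to trim whitespace, fill empty cells with a placeholder, and drop exact duplicate rows. The function name `clean_csv` and the `missing` placeholder are assumptions made for this example; real data will usually need rules specific to your schema.

```python
import csv
import io


def clean_csv(text, missing=""):
    """Trim cells, replace empty cells with `missing`, and drop duplicate rows."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    seen = set()
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} columns, got {len(row)}")
        cleaned = tuple(cell.strip() or missing for cell in row)
        if cleaned in seen:
            continue  # exact duplicate of an earlier row
        seen.add(cleaned)
        writer.writerow(cleaned)
    return out.getvalue()
```

The function works on text held in memory, which is fine for small files; larger files would be better processed as a stream, row by row.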
Before loading the data into Redshift, you need to upload it to an Amazon S3 bucket. Use the AWS CLI to copy the files from your local server to S3:
```
aws s3 cp /path/to/local/file s3://your-bucket-name/path/to/destination/
```
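Keep the S3 bucket in the same AWS region as your Redshift cluster to avoid cross-region data transfer charges. If the bucket is in a different region, the `COPY` command must name that region with its `REGION` option.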
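When there are many files, the upload can be scripted. This sketch builds one `aws s3 cp` command per local file; the helpers `s3_uri` and `build_upload_commands` are hypothetical names, and the bucket and prefix are the placeholders used above.

```python
from pathlib import Path


def s3_uri(bucket, prefix, filename):
    """Join bucket, key prefix, and file name into an s3:// URI."""
    prefix = prefix.strip("/")
    key = f"{prefix}/{filename}" if prefix else filename
    return f"s3://{bucket}/{key}"


def build_upload_commands(local_files, bucket, prefix, region=None):
    """Return one `aws s3 cp` argv list per local file."""
    commands = []
    for path in local_files:
        cmd = ["aws", "s3", "cp", str(path), s3_uri(bucket, prefix, Path(path).name)]
        if region:
            cmd += ["--region", region]  # global AWS CLI option
        commands.append(cmd)
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that a failed upload stops the script instead of passing unnoticed.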
Set up an IAM role that Amazon Redshift can assume to access the S3 bucket where your data resides, and attach this role to your Redshift cluster. Ensure the IAM policy attached to the role grants read access to the data files: `s3:GetObject` on the objects and `s3:ListBucket` on the bucket.
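A read-only policy for this role might look like the document produced by the sketch below. The function name `redshift_s3_read_policy` is hypothetical. The policy grants `s3:GetObject` on the objects and also `s3:ListBucket` on the bucket, because `COPY` lists the objects under a prefix before reading them. Treat it as a starting point to review against your own security requirements.

```python
import json


def redshift_s3_read_policy(bucket, prefix=""):
    """Build a minimal read-only S3 policy document for the Redshift role."""
    prefix = prefix.strip("/")
    object_arn = (
        f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            # Read the data files themselves.
            {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": [object_arn]},
            # List the bucket so a key prefix can be expanded into objects.
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
        ],
    }


policy_json = json.dumps(
    redshift_s3_read_policy("your-bucket-name", "path/to/destination"), indent=2
)
```

The resulting JSON can be pasted into the IAM console or saved to a file for the AWS CLI. The role also needs a trust relationship that lets the Redshift service assume it.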
Before loading data, create a table in Redshift that matches the structure of your data. Use the `CREATE TABLE` SQL command in the Redshift query editor, specifying appropriate data types for each column. Ensure the table schema aligns with your data to prevent loading errors.
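To draft the `CREATE TABLE` statement, you could infer rough column types from a sample of the CSV. The sketch below is deliberately crude: the helpers `infer_type` and `create_table_sql` are hypothetical, every row is assumed to have as many cells as the header, and anything that is not clearly an integer or a float falls back to `VARCHAR(256)`, an arbitrary length chosen for illustration. Review the generated DDL before running it.

```python
import csv
import io


def infer_type(values):
    """Guess a Redshift column type from sample string values."""
    non_empty = [v for v in values if v != ""]

    def all_parse(cast):
        try:
            for v in non_empty:
                cast(v)
        except ValueError:
            return False
        return True

    if non_empty and all_parse(int):
        return "BIGINT"
    if non_empty and all_parse(float):
        return "DOUBLE PRECISION"
    return "VARCHAR(256)"  # assumption: 256 bytes is enough for text columns


def create_table_sql(table, csv_text):
    """Build a CREATE TABLE statement from a CSV sample with a header row."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = [
        f'    "{name}" {infer_type([row[i] for row in data])}'
        for i, name in enumerate(header)
    ]
    return f"CREATE TABLE {table} (\n" + ",\n".join(columns) + "\n);"
```

For a sample with columns `id`, `price`, and `name`, this yields `BIGINT`, `DOUBLE PRECISION`, and `VARCHAR(256)` respectively. Dates, timestamps, and decimals are not detected and need manual adjustment.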
Use the `COPY` command in Redshift to load data from the S3 bucket into your Redshift table. The `FROM` path is treated as a key prefix, so `COPY` reads every object under it and appends the rows to the specified table:
```sql
COPY your_table_name
FROM 's3://your-bucket-name/path/to/destination/'
IAM_ROLE 'arn:aws:iam::your-account-id:role/your-redshift-role'
FORMAT AS CSV;
```
Adjust the command based on your data format (e.g., add `DELIMITER '\t'` for TSV) and any additional options (e.g., `IGNOREHEADER 1` to skip a single header row in CSV files).
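If you load several tables or formats, a small helper can assemble the `COPY` statement with the right options. This is a sketch under stated assumptions: `build_copy_sql` is a hypothetical name, values are interpolated directly into the SQL text and so must come from trusted configuration, and the optional `region` argument maps to the `REGION` option that `COPY` needs when the bucket is in a different region from the cluster.

```python
def build_copy_sql(table, s3_path, iam_role, delimiter=",", header_rows=0, region=None):
    """Assemble a Redshift COPY statement for delimited text files in S3."""
    parts = [
        f"COPY {table}",
        f"FROM '{s3_path}'",
        f"IAM_ROLE '{iam_role}'",
        "FORMAT AS CSV",
    ]
    if delimiter != ",":
        # Redshift expects a tab to be written as the two characters \t.
        escaped = "\\t" if delimiter == "\t" else delimiter
        parts.append(f"DELIMITER '{escaped}'")
    if header_rows:
        parts.append(f"IGNOREHEADER {header_rows}")
    if region:
        parts.append(f"REGION '{region}'")
    return "\n".join(parts) + ";"
```

With the default arguments the helper reproduces the CSV statement shown above; passing `delimiter="\t", header_rows=1` adds `DELIMITER '\t'` and `IGNOREHEADER 1` for a tab-separated file with a header row.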
By following these steps, you can move data from an SFTP server to Amazon Redshift without relying on third-party connectors or integrations.