

Building your pipeline or Using Airbyte
Airbyte is the only open-source solution empowering data teams to meet growing custom business demands in the new AI era.
Building your own pipeline:
- Inconsistent and inaccurate data
- Laborious and expensive
- Brittle and inflexible
Using Airbyte:
- Reliable and accurate
- Extensible and scalable for all your needs
- Deployed and governed your way
Start syncing with Airbyte in 3 easy steps within 10 minutes.



What sets Airbyte Apart
- Modern GenAI Workflows
- Move Large Volumes, Fast
- An Extensible Open-Source Standard
- Full Control & Security
- Fully Featured & Integrated
- Enterprise Support with SLAs
What our users say


"The intake layer of Datadog’s self-serve analytics platform is largely built on Airbyte.Airbyte’s ease of use and extensibility allowed any team in the company to push their data into the platform - without assistance from the data team!"


“Airbyte helped us accelerate our progress by years, compared to our competitors. We don’t need to worry about connectors and can focus on creating value for our users instead of building infrastructure. That’s priceless. The time and energy saved allows us to disrupt and grow faster.”


“We chose Airbyte for its ease of use, its pricing scalability, and its absence of vendor lock-in. With a lean team, those were our top criteria. The value of being able to scale and execute at a high level by maximizing resources is immense.”
Step 1: Understand the public API
- Identify the public API you want to use and understand the data it provides.
- Read the API documentation to learn the endpoints, request parameters, response format, rate limits, and authentication mechanism.
Step 2: Set up Starburst Galaxy
- Sign up for Starburst Galaxy if you haven't already.
- Create a new cluster in Starburst Galaxy, or use an existing one.
- Make sure you have the necessary permissions to create catalogs and schemas within Starburst Galaxy.
Step 3: Write a script to fetch the data
- Choose a programming language that you are comfortable with (e.g., Python, Node.js).
- Write a script that makes requests to the API endpoint(s) and handles the responses.
- Include error handling to manage API rate limits and possible downtimes.
- Parse the API response and transform the data into a format suitable for ingestion into Starburst Galaxy (e.g., CSV, JSON); a minimal sketch follows this list.
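As an illustration, here is a minimal Python sketch of such a script. The endpoint URL, auth header, and response fields (`results`, `next_page`) are hypothetical placeholders; adapt them to the API you actually chose.

```python
import csv
import time

import requests

API_URL = "https://api.example.com/v1/records"  # hypothetical endpoint
API_KEY = "YOUR_API_KEY"                        # match the API's real auth scheme


def fetch_page(url, params, retries=3):
    """GET one page, backing off on rate limits (HTTP 429) and retrying."""
    for attempt in range(retries):
        resp = requests.get(
            url,
            params=params,
            headers={"Authorization": f"Bearer {API_KEY}"},
            timeout=30,
        )
        if resp.status_code == 429:  # rate limited: honor Retry-After if present
            time.sleep(int(resp.headers.get("Retry-After", 2 ** attempt)))
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError(f"giving up on {url} after {retries} attempts")


def fetch_all():
    """Walk a hypothetical page-numbered API and collect all records."""
    records, page = [], 1
    while True:
        payload = fetch_page(API_URL, {"page": page})
        records.extend(payload["results"])  # field names depend on the real API
        if not payload.get("next_page"):
            break
        page += 1
    return records


def to_csv(records, path="api_data.csv"):
    """Flatten the records into a CSV file that Starburst Galaxy can ingest."""
    if not records:
        return
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=records[0].keys())
        writer.writeheader()
        writer.writerows(records)


if __name__ == "__main__":
    to_csv(fetch_all())
```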
Step 4: Stage the data temporarily
- Depending on the size of the data, you may need to store it temporarily before loading it into Starburst Galaxy.
- You can use a local file system, cloud storage, or a database as temporary storage.
- Ensure that the storage medium you choose is accessible from Starburst Galaxy; an example upload follows this list.
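If you stage the file in Amazon S3, for instance, the upload is a few lines with boto3. The bucket and key below are hypothetical, and credentials are read from the environment:

```python
import boto3  # AWS SDK for Python; other object stores reachable by Galaxy work similarly

s3 = boto3.client("s3")  # picks up credentials from the environment or ~/.aws

# Hypothetical bucket and key; the catalog you create in Step 5 will point here.
s3.upload_file("api_data.csv", "my-staging-bucket", "api_data/api_data.csv")
```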
Step 5: Create a catalog and schema in Starburst Galaxy
- Log into Starburst Galaxy and navigate to the "Catalogs" section.
- Create a new catalog that connects to the storage system where you temporarily stored the data.
- Within the catalog, create a schema that will hold the data tables.
Step 6: Define the table structure
- Define the table structure that matches the data you fetched from the API.
- Create the table within the schema you created in the previous step using a DDL (Data Definition Language) statement.
- Make sure the table columns correspond to the data fields from the API; a sketch of the schema and table DDL follows this list.
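One way to issue the DDL programmatically is the trino Python client, which works against Starburst Galaxy over HTTPS. The hostname, credentials, catalog, and column names below are hypothetical placeholders:

```python
from trino.dbapi import connect
from trino.auth import BasicAuthentication

conn = connect(
    host="example-cluster.galaxy.starburst.io",  # hypothetical Galaxy hostname
    port=443,
    http_scheme="https",
    user="user@example.com",
    auth=BasicAuthentication("user@example.com", "<password>"),
)
cur = conn.cursor()

# Hypothetical catalog/schema/table names; use the catalog you created in Step 5.
cur.execute("CREATE SCHEMA IF NOT EXISTS api_catalog.api_data")
cur.fetchall()  # consume the result so the statement completes

cur.execute("""
    CREATE TABLE IF NOT EXISTS api_catalog.api_data.records (
        id BIGINT,
        name VARCHAR,
        value DOUBLE
    )
""")
cur.fetchall()
```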
Step 7: Load the data
- Write a script or use a command-line tool to load the data from your temporary storage into the table you defined in Starburst Galaxy.
- You may use the `INSERT INTO` statement or a bulk loading mechanism provided by Starburst Galaxy (see the sketch after this list).
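Continuing with the cursor from the previous sketch, a parameterized `INSERT INTO` might look like this. The trino client uses `?` placeholders; for large volumes, a bulk path such as a CREATE TABLE AS over the files you staged in object storage is usually faster:

```python
# `records` is the list returned by fetch_all(); names remain hypothetical.
rows = [(r["id"], r["name"], r["value"]) for r in records]

# If your client version lacks executemany, loop over cur.execute instead.
cur.executemany(
    "INSERT INTO api_catalog.api_data.records (id, name, value) VALUES (?, ?, ?)",
    rows,
)
```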
Step 8: Verify the data
- Once the data is loaded, run some queries to ensure that the data has been loaded correctly and completely.
- Check for any discrepancies or data loss during the transfer process; a couple of sanity checks are sketched below.
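Two quick checks, again using the hypothetical cursor and table names from the earlier sketches:

```python
# Row-count check: the table should hold exactly what was fetched.
cur.execute("SELECT count(*) FROM api_catalog.api_data.records")
loaded = cur.fetchone()[0]
assert loaded == len(records), f"expected {len(records)} rows, found {loaded}"

# Spot-check for rows that lost their key during the transfer.
cur.execute("SELECT count(*) FROM api_catalog.api_data.records WHERE id IS NULL")
print("rows with NULL id:", cur.fetchone()[0])
```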
Step 9: Automate and maintain the pipeline
- To keep the data in Starburst Galaxy up to date, automate the data fetching and loading process.
- Set up a cron job, a cloud function, or use a workflow orchestration tool to run your scripts at regular intervals (a sketch follows this list).
- Monitor the automated process for any failures or issues.
- Update your scripts and API requests as needed to accommodate any changes in the public API.
- Regularly check the performance of your queries in Starburst Galaxy and optimize as necessary.
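A simple way to wire this up is a single entry point that chains the earlier steps, then schedule it with cron. All helper names here are the hypothetical ones from this guide:

```python
# pipeline.py -- one entry point chaining the sketches from the earlier steps,
# suitable for a scheduler to invoke.

def run_pipeline():
    records = fetch_all()        # Step 3: pull every page from the API
    to_csv(records)              # Step 3: write the staging CSV
    upload_to_storage()          # Step 4: push the CSV to object storage (hypothetical helper)
    load_into_galaxy(records)    # Step 7: INSERT the rows into the table (hypothetical helper)
    validate(records)            # Step 8: row-count sanity checks (hypothetical helper)

if __name__ == "__main__":
    run_pipeline()

# Example crontab entry to refresh the data hourly:
#   0 * * * * /usr/bin/python3 /opt/pipelines/pipeline.py >> /var/log/pipeline.log 2>&1
```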
FAQs
What is ETL?
ETL, an acronym for Extract, Transform, Load, is a vital data integration process. It involves extracting data from diverse sources, transforming it into a usable format, and loading it into a database, data warehouse or data lake. This process enables meaningful data analysis, enhancing business intelligence.
What is a Public API connector?
A public API connector gives users the flexibility to connect to almost any existing REST API and quickly extract the data they need. You define the HTTP endpoint URL and the authentication mechanism for the API call, and the connector handles pulling the data into your destination.
What data can you get from a Public API?
Public APIs provide access to a wide range of data, including:
1. Weather data: real-time conditions such as temperature, humidity, wind speed, and precipitation.
2. Financial data: stock prices, exchange rates, and economic indicators.
3. Social media data: user profiles, posts, and comments.
4. Geographic data: maps, geocoding, and routing.
5. Government data: census data, crime statistics, and public health data.
6. News data: headlines, articles, and trending topics.
7. Sports data: scores, schedules, and player statistics.
8. Entertainment data: movie and TV show information, music data, and gaming data.
Overall, Public APIs provide access to a vast array of data, making it easier for developers to build applications and services that leverage this data to create innovative solutions.
What is ELT?
ELT, standing for Extract, Load, Transform, is a modern take on the traditional ETL data integration process. In ELT, data is first extracted from various sources, loaded directly into a data warehouse, and then transformed. This approach enhances data processing speed, analytical flexibility and autonomy.
Difference between ETL and ELT?
ETL and ELT are critical data integration strategies with key differences. ETL (Extract, Transform, Load) transforms data before loading, ideal for structured data. In contrast, ELT (Extract, Load, Transform) loads data before transformation, perfect for processing large, diverse data sets in modern data warehouses. ELT is becoming the new standard as it offers greater flexibility and autonomy to data analysts.