Announcing Airbyte 0.50: Checkpointing, Column Selection, and Schema Propagation
John Lafleur
June 8, 2023
Airbyte started as a project in July 2020, less than 3 years ago. Since then, we have grown into the platform with the most connectors, with more than 3,000 companies syncing data on a daily basis. In our minds, we’re still early in our mission to commoditize data integration. But today, we’re getting one step closer with the release of 3 significant features.
Building on the 3 announcements we made in recent weeks (the no-code connector builder, the State of Data 2023, and the content hub), today we’re releasing 3 features we consider essential to data integration: checkpointing, column selection, and schema propagation. All 3 are now available on Airbyte Cloud and Airbyte Open Source.
Checkpointing for uninterrupted data syncs
Checkpointing is a data processing technique that enables recovery from interruptions. By creating a saved state of a data stream at regular intervals, checkpointing allows the system to resume from the most recent checkpoint in the event of a failure, avoiding data loss or redundant processing.
With checkpointing, Airbyte users get a higher level of data reliability: an interrupted sync can resume from its most recent checkpoint rather than starting over. This is most valuable for large-scale data integration tasks, and it reflects the platform's commitment to secure, reliable data integration.
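To make the idea concrete, here is a minimal sketch of checkpointing in Python. It is an illustration only, not Airbyte's implementation: the `sync` function, the `cursor` state key, the `checkpoint_every` and `fail_at` parameters, and the dictionary standing in for a destination are all hypothetical, and records are assumed to arrive sorted by an increasing `id`.

```python
from typing import Dict, Iterable, Optional


def sync(
    records: Iterable[dict],
    destination: Dict[int, dict],
    state: Dict[str, int],
    checkpoint_every: int = 2,
    fail_at: Optional[int] = None,
) -> None:
    """Copy records into destination, resuming after state["cursor"]."""
    cursor = state.get("cursor", 0)
    last_written = cursor
    since_checkpoint = 0
    for record in records:
        if record["id"] <= cursor:
            continue  # already covered by an earlier checkpoint
        if fail_at is not None and record["id"] == fail_at:
            raise ConnectionError("simulated interruption")
        destination[record["id"]] = record  # keyed write, so a repeat is harmless
        last_written = record["id"]
        since_checkpoint += 1
        if since_checkpoint >= checkpoint_every:
            state["cursor"] = last_written  # save a checkpoint
            since_checkpoint = 0
    state["cursor"] = last_written  # final checkpoint once the stream is exhausted


records = [{"id": i} for i in range(1, 6)]
destination: Dict[int, dict] = {}
state: Dict[str, int] = {}

try:
    sync(records, destination, state, fail_at=4)  # first attempt is interrupted
except ConnectionError:
    pass

sync(records, destination, state)  # second attempt resumes from the checkpoint
```

In this sketch the interrupted run leaves the cursor at 2, so the second run skips records 1 and 2 and only repeats record 3, which was written after the last checkpoint. Keying the destination by `id` is one way to keep that small amount of repeated work harmless.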
If you want to know more about how we built it for Airbyte, here’s an article detailing this.
Column Selection to control the exact data to sync
Beyond checkpointing, we are now also unveiling column selection. This feature allows users to select specific columns from a source for integration, rather than being required to integrate the entire dataset. With this added flexibility, users can now focus on the data most relevant to their needs, optimizing resource usage and efficiency. This new feature, once again, positions Airbyte as a serious contender in the data integration market, bridging the gap with the capabilities of industry leaders such as Fivetran.
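Conceptually, column selection is a projection applied to each record before it reaches the destination. The short Python sketch below illustrates that idea; the `select_columns` helper and the sample field names are hypothetical and do not reflect how Airbyte implements the feature.

```python
from typing import Iterable, Iterator, Set


def select_columns(records: Iterable[dict], selected: Set[str]) -> Iterator[dict]:
    """Yield each record restricted to the selected column names."""
    for record in records:
        yield {name: value for name, value in record.items() if name in selected}


source_rows = [
    {"id": 1, "email": "ada@example.com", "notes": "internal only"},
    {"id": 2, "email": "alan@example.com", "notes": "internal only"},
]

# Only "id" and "email" are selected, so "notes" never leaves the source.
synced_rows = list(select_columns(source_rows, {"id", "email"}))
```

Dropping unselected columns before the write step is what would save bandwidth and storage in a design like this one.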
Here’s an article detailing how it works and how it was built, as well as a short video demo:
[Video: column selection demo]
Enhancing Usability with Schema Propagation
Last but not least, we are introducing schema propagation. This feature automatically propagates changes from the source schema to the destination schema, saving users from manually making these changes. As data schemas change frequently in real-world scenarios, this capability greatly enhances the usability and convenience of Airbyte's platform. By automating the process, it allows for seamless schema evolution over time, reducing the administrative overhead associated with data integration tasks.
Airbyte users no longer need to run a manual reset each time the source schema changes, and they can control how schema changes are propagated for each connection.
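As a rough illustration of the idea, the Python sketch below compares a source schema with a destination schema and applies the difference according to a per-connection policy. Everything here is an assumption made for the example: the `diff_schema` and `propagate` helpers, the `"propagate"` and `"ignore"` policy names, and the representation of a schema as a mapping from column name to type are hypothetical, and the sketch handles only added and removed columns.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, str]  # column name -> type name (hypothetical representation)


def diff_schema(source: Schema, destination: Schema) -> Tuple[Schema, List[str]]:
    """Return columns added in the source and columns removed from it."""
    added = {name: kind for name, kind in source.items() if name not in destination}
    removed = [name for name in destination if name not in source]
    return added, removed


def propagate(source: Schema, destination: Schema, policy: str = "propagate") -> Schema:
    """Return the destination schema after applying the chosen policy."""
    if policy == "ignore":
        return dict(destination)  # leave the destination untouched
    added, removed = diff_schema(source, destination)
    updated = {name: kind for name, kind in destination.items() if name not in removed}
    updated.update(added)
    return updated


source_schema = {"id": "integer", "email": "string", "plan": "string"}
destination_schema = {"id": "integer", "email": "string", "legacy_flag": "boolean"}

new_schema = propagate(source_schema, destination_schema)
```

With the `"propagate"` policy the new `plan` column is added and the dropped `legacy_flag` column is removed; with `"ignore"` the destination schema is returned unchanged.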
Here’s an article detailing how it works and how it was built.
Getting one step closer to our mission
In just three years since our inception, our team has narrowed the feature gap with the best in class in our industry, and we’re just getting started. A lot more is coming in the next 3 months: our ambition is to provide the best experience for data engineers on an ELT platform. This might include a Terraform SDK, higher throughput, and more. So stay tuned!