Hey everyone, welcome to the May edition of The Drip, where we take you downstream to cover highlights from our changelog, our community, and anything Airbyte-related.
Connector Builder is Live 🚢
In May, we launched the Connector Builder, a game-changing no-code tool that allows users to create new API connectors in as little as 15 minutes. This self-serve tool does away with the need for coding experience or a development environment, making it easier than ever to extract data from unsupported or niche ELT solutions. Since its soft launch two months ago, the Connector Builder has been a hit with our customers, with over 100 connectors built and deployed to production to support critical data movement workloads.
We designed the Connector Builder to simplify the process of building API connectors, which are often formulaic and unnecessarily complex. By providing a no-code solution, we're empowering non-engineers to work independently, while also offering engineers a higher-leverage way to build and maintain connectors. Currently, the tool is best suited for synchronous HTTP API connectors, but as it evolves, we expect it to support the vast majority of API connector needs.
Azure Blob Storage Breaking Changes 🚨
In pull request #25739, we made a significant change to the Snowflake Destination: we removed the Azure Blob Storage loading option due to some issues and the lack of certification for this functionality. We made this decision after our tracking showed no use of this loading variant on our Cloud offering. The change will impact OSS customers who use Azure Blob Storage as a loading method for the Snowflake Destination.
However, it will not affect customers who use the Azure Blob Storage Destination, as that uses different classes. The PR removed the integration tests for Snowflake using Azure Blob Storage as a loading method, along with all associated test and supporting classes.
State of Data
In the past 2 years, the data ecosystem has been evolving rapidly. New tools have been emerging every month in the modern data stack. In a hype cycle, it becomes hard to distinguish the signal from the noise. Which of those tools will end up as simple features, and which will become actual products that we are still using in a few years?
To make sense of it all, we ran the largest data engineering survey to date, State of Data 2023, with 886 respondents. The survey helps us take a step back and understand what the community is using and feeling excited about, and what is noise or signal in the modern data stack.
Brand New Content Hub 📚
We also introduced our new content hub, a comprehensive online destination for all things related to data engineering. The hub features a variety of content formats, including articles, videos, shorts, podcast episodes, tutorials, and even courses. We aim to cater to all learning styles and preferences, and to provide insightful content for every level, from beginners to seasoned data engineers.
The content hub covers a wide range of topics related to data engineering, including databases, data orchestration, transformation, modeling, warehousing, AI, and more. We believe in the power of community, which is why our content hub is open to contributions from data professionals. We offer a payment of $900 per article for approved drafts, and we also provide feedback and advice to improve your writing skills.
We're excited for you to explore Airbyte's content hub and immerse yourself in the wealth of resources we've put together just for you. Don't miss out on this opportunity to learn, grow, and contribute to the data engineering community. Visit our content hub today and consider contributing your own insights and experiences. Your voice matters, and we can't wait to hear from you!
At the beginning of the month, our team spent 2 full days hacking prototypes on whatever projects they found most interesting and impactful, or just fun to do! Here are a few projects our team built during the Hack Days (which we will have every quarter from now on):
- Using the Airbyte API to make an iOS App
- Supercharging e2e Testing with Cypress and Airbyte’s Config API
- An Easier Way to Understand Airbyte Synchronization
- Implement AI data pipelines with Langchain, Airbyte, and Dagster
And that’s all we have for May’s edition of The Drip. Thanks for reading through. If you have any questions:
- Please join our Slack community to talk to us on the Airbyte team as well as other fantastic folks in the community!
- Also sign up for our Newsletter to keep up with the state of the art in Data Integration and the broader Data Engineering Ecosystem!
✨ New and improved features
- New Sources and Promotions
- 🎉 New Source: FullStory [Low code CDK] (#25465)
- 🎉 New Source: Yotpo [Low code CDK] (#25532)
- 🎉 New Source: Merge [Low code CDK] (#25342)
- New Features for Existing Connectors
- Source Marketo: New Stream Segmentation (#23956)
- 🎉 Categorized Config Errors Accurately for Google Analytics 4 (GA4) and Google Ads (#25987)
- 🎉 Source Amplitude: added missing attrs in events schema, enabled default availability strategy (#25842)
- 🎉 Source Bing Ads: add campaignlabels col (#24223)
- ✨ Source Amazon Ads: add availability strategy for basic streams (#25792)
- 🎉 Source Bing Ads: added undeclared fields to schemas (#25668)
- 🎉 Source Hubspot: Add oauth scope for goals and custom objects stream (#5820)
- New Features in Airbyte Platform
- Normalization: Better handling for CDC transactional updates (#25993)
- 🎉 Connector builder: Keep testing values around when leaving connector builder (#6336)
- 🎉 Connector builder: Copy from new stream modal (#6582)
- 🎉 Schema auto-propagation UI (#6700)
- 🎉 Connector builder: Client credentials flow for oauth authenticator (#6555)
- 🎉 Add support for source/destination LD contexts in UI (#6586)
- 🎉 Workspaces can be opened in a new tab (#6565)
🚨 Security & Breaking changes
- 🚨 Removes defunct Azure Blob Storage loading option for Snowflake 🚨 (#25739)
🐛 Bug fixes
- 🐛 Source Close.com, Source Hubspot, Source GitHub, Source TikTok-Marketing, Source SurveyMonkey, Source SmartSheets: fix builds (#26024)
- 🐛 Source Google Analytics V4 Data API: handle 429 - potentiallyThresholdedRequestsPerHour (#26008)
- 🐛 Fix date-time for airbyte types (#25965)
- 🐛 Source Shopify: validate shop input (name only, reject urls) (#25961)
- 🐛 Source Mixpanel, Source Pinterest, Source Freshdesk: fix builds (#25915)
- 🐛 Source Gitlab, Source Hubspot, Source Snapchat-Marketing: fix builds (#25948)
- 🐛 Source Close.com: extend roles schema with missing properties (#25868)
- 🐛 Source Jira: add sprint information from team-managed project (#25798)
- 🐛 Source Hubspot: update expected records (#25869)
- 🐛 Source Trello: extend organizations schema (#25870)
- 🐛 Source Stripe: fixed subscription_schedule.canceled_at type issues + update expected_records (#25795)
- 🐛 Destination S3 Glue: Fix decimal type syntax (#25813)
- 🐛 Source MySQL/MsSQL: Disable index logging for MySQL (#25740)
- 🐛 CAT: fix close-com, confluence, gitlab, pipedrive, slack, xero expected records (#25780)
- 🐛 Source Xero: fix expected records for CAT (#25758)
- 🐛 Source Zendesk Support: stream sla_policies fix data type error (events.value) (#24053)
- 🐛 Source Notion: fix ai_block is unsupported by API issue, while fetching Blocks stream (#25709)
- 🐛 CAT: updated expected records for Zendesk-Support, Faker, Harvest, Freshdesk (#25707)
- 🐛 Source S3: remove minimum block size (#25706)
- 🐛 Correct connection overview status to not pull from an active job (#6426)
- 🐛 Connector builder: Always save yaml based manifest (#6486)
- 🐛 Allow users to cancel a sync on a disabled connection (#6496)
- 🐛 Asynchronously fetch connector update notifications (#6396)
- 🐛 Don't show connector builder prompt in destinations (#6321)