How Airbyte Raised Its Series-A Round 2 Months after Its Seed
In early March 2021, we announced our $5.2M seed round with Accel for Airbyte - our open-source data integration platform; 2 months later, we announced our $26M Series-A round led by Benchmark.
In this article, we want to tell you about what happened behind the scenes, including the deck we presented to the Benchmark team. We hope this will give you some insights about the fundraising process, even though our case might be atypical.
If you like what we do, don’t hesitate to subscribe to our newsletter or star our GitHub project!
If you're mostly interested in the deck, here's a Slideshare link to the Series-A deck.
Some additional context: We actually signed the term sheet for our seed round with Accel in December 2020, but we only closed the round (after the due diligence and all the negotiations on the long-form contract) in early February. It then took us about a month to prepare the announcement for the seed, which happened on March 2, 2021.
The First Call
Benchmark contacted us at the end of February, and we had our first call with them on March 10. Although our seed round wasn’t public information, they already knew about it, as most VCs did. Private news travels fast in the VC world. We witnessed that again during the Series-A, as we got contacted by a lot of VCs after we signed the term sheet, even though we kept it confidential. One hypothesis is that the information leaks during the vetting process. We’ll come back to that later.
How did the first contact happen? Chetan Puttagunta and Peter Fenton both reached out to Michel and John, via different mutual connections. Having both of them try to reach out to us was a strong signal about how serious they were about talking with us.
Leading up to the Partnership Meeting
After the first call on March 10, we had several video calls over the next month, each lasting from one to several hours. These long-form conversations allowed us to get to know Benchmark and allowed Benchmark to get to know us as founders, as well as our plans for Airbyte. We knew that we were going to take on a board member with the Series-A, and it’s critical to have clarity before starting a 10+ year journey together. We felt that this discussion had to happen over several weeks.
We presented to the Benchmark partnership on April 12, using the deck below. We sent it the day before so they could prepare their questions and we could make the most of that hour.
After that, we had calls on April 13 and 14 to negotiate the term sheet, and we got it signed on April 15.
During that time, we didn’t interact with any VCs other than Accel. We didn’t need to raise money. And Accel was a great sounding board and was putting our company first.
While you can be sure the VC will contact people you have worked with in the past, you want to talk to companies they have invested in as well. We tried to get feedback from companies in various states of progress: hugely successful, successful but in a slow growth mode, and some that didn’t work out. The goal is to obtain a well-rounded view of the VC’s support and behavior in all scenarios. We made all these calls between the 7th and 14th of April.
The information leak may well happen at this stage. After all, the CEOs that you’re talking to are well connected with other VCs.
The Term Sheet
A piece of advice here: Try to get as much information as possible about the terms of the round written into the term sheet.
Once you have signed the term sheet, most of the negotiation you will do will be with the VC’s lawyers rather than the VC partner. So, if you get it written in the term sheet, it’s one less thing that is open to negotiation. And don’t accept a one-page term sheet. Try to get the full set of terms clearly written.
This is the only way to know as entrepreneurs that there will not be an egregious term in the final documents. Benchmark clearly spelled out the terms of their financing in their term sheet, which made the closing process smooth and on schedule.
As founders, you have two jobs to do after the term sheet is signed:
- Negotiate the details of the long-form investment agreement with the VC’s lawyers. That’s when having great lawyers helps, as they can make sure to represent your best interests.
- Complete the remainder of the round. You might have pro-rata rights to abide by for your earlier investor, and you might want to involve some high-profile business angels who can bring a ton of value, too.
In the end, we got about a dozen more investors in the round, including Shay Banon (co-founder & CEO of Elastic), Dev Ittycheria (CEO of MongoDB), Auren Hoffman (co-founder of LiveRamp and SafeGraph), and SV Angel.
Why We Decided to Raise Even Though We Didn’t Need the Money
Entrepreneurs often think in terms of valuation and the equity they own in the company. But there is a third factor that is the most important of all: the success factor, which goes from 0 (bankruptcy) to 1 (IPO or huge exit).
success factor * ownership * valuation = value you own
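To make the formula concrete, here is a minimal Python sketch. All of the numbers are hypothetical and chosen only for illustration (they are not Airbyte's), and the function name `value_owned` is an assumption of this example. The valuation is held constant so that only the trade-off between success factor and ownership is visible.

```python
def value_owned(success_factor: float, ownership: float, valuation: float) -> float:
    """Return success factor * ownership * valuation."""
    if not 0.0 <= success_factor <= 1.0:
        raise ValueError("success_factor must be between 0 (bankruptcy) and 1 (IPO or huge exit)")
    if not 0.0 <= ownership <= 1.0:
        raise ValueError("ownership must be a fraction between 0 and 1")
    return success_factor * ownership * valuation


# Hypothetical scenarios, for illustration only.
VALUATION = 80_000_000

# Scenario A: keep more equity, but with a lower chance of success.
without_partner = value_owned(success_factor=0.25, ownership=0.50, valuation=VALUATION)

# Scenario B: give up equity to a partner who doubles the success factor.
with_partner = value_owned(success_factor=0.50, ownership=0.375, valuation=VALUATION)

print(f"Without partner: ${without_partner:,.0f}")
print(f"With partner:    ${with_partner:,.0f}")
```

With these made-up inputs, the diluted scenario still comes out ahead, because the gain in success factor outweighs the loss in ownership.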
We did YC in January-March 2020 because we thought it would have a significant impact on our success factor. We chose to raise with Accel (we didn’t need that money then as we still had a full year of runway) because we thought they were the ideal partner to bring us closer to the 1 for our success factor.
This is what happened again with the Benchmark team. At the start, we were not at all convinced about raising a Series-A so soon, but each interaction with Chetan and the Benchmark team brought immense value and extremely relevant advice. After the third call, we could no longer see Airbyte without them working closely with us to bring us closer to the 1.
Now that we knew whom we wanted to partner with, it was all about getting terms (valuation, etc.) that worked for both us and Benchmark. And that would be our advice in fundraising:
Optimize first for whom you want to partner with. And then, optimize the terms of the fundraising (valuation included) for that partner.
As the Airbyte seed deck was public, we couldn’t just reuse it. And a lot of progress had been made since we signed the seed term sheet in December, and we wanted to reflect that.
So here’s the Series-A deck we presented to the Benchmark partnership team: https://www.slideshare.net/jeanlaf/airbyte-seriesa-deck
Our deck was pretty standard, though some might consider it a bit short. That’s why we included the appendices along with the deck.
The Structure and Our Pitch on Each Slide
Here are more details on our slides - why we included them and what we were saying during our pitches.
Although you might only spend 10 seconds discussing the cover slide, it is important. It’s about showing that you have identified your positioning. In our case, we identify EL(T) as the future of the ETL industry, so we essentially want to position ourselves as that future.
1. Industry Context
This is the only slide that is common to our seed deck.
Thanks to Snowflake’s IPO and Segment’s acquisition, we knew that data infrastructure was hot from an investor point of view. We decided to start with some context on the industry and how we fit into this ecosystem.
What we were saying:
When you look at the data infrastructure industry, there is often a new category that emerges through a commercial product. Once the market matures, an open-source alternative gets created and ends up taking over the category. This behavior is often seen because data infrastructure requires privacy, security and scale, which cloud-based solutions can’t offer as well as open-source ones. There are many examples, such as Kafka, Spark, and now dbt. We want to be the open-source solution for data integration.
You might wonder why an open-source approach would also win in data integration; after all, sometimes a closed-source, cloud-based approach works. This last sentence is a transition to the next slide.
2. Problem We’re Solving
This is a rephrasing of our seed round slide with more details.
What we were saying:
In June and July 2020, we reached out to 250 of Fivetran’s, StitchData’s and Matillion’s customers. We ultimately managed to talk to 45 of them. We wanted to know whether an open-source approach would make sense to address data integration. What we learned is that a cloud-based closed-source solution will never be able to fully address the data integration problem. It has several inherent issues.
100% of the companies we talked to were using Fivetran, StitchData or other solutions, while also building and maintaining their own connectors. They did so because either (a) the ETL solution didn’t support the connector they wanted, or (b) the solution supported it, but not in the way they needed.
When you look at Fivetran, for instance, you’ll see that after 8 years, they only support 150 connectors. The hard part about ETL/ELT is not building the connectors, but maintaining them. It is costly, and any cloud-based closed-source solution will be restricted by an ROI (return on investment) consideration. It isn’t profitable for them to support the long tail of connectors, so they only focus on the most popular integrations.
During those 45 interactions, we also identified a third issue. Some of the companies were about to stop using Fivetran for some connectors because it was becoming too pricey. The value of an ELT solution is about replacing a paid data engineer who builds and maintains a connector in-house. The amount of work required from an engineer is about the same, whether the volume of data being moved is low or high. So with volume-based pricing, at some point it just stops making sense to use an external solution.
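The break-even reasoning behind this point can be sketched in a few lines of Python. The figures below are hypothetical placeholders, not actual Fivetran prices or real engineering costs, and the function names are assumptions of this example. The only premise taken from the argument is that the in-house cost stays roughly flat while the vendor bill grows with volume.

```python
def break_even_volume(in_house_monthly_cost: float, price_per_unit: float) -> float:
    """Volume at which a volume-priced vendor costs as much as doing it in-house."""
    if price_per_unit <= 0:
        raise ValueError("price_per_unit must be positive")
    return in_house_monthly_cost / price_per_unit


def cheaper_option(volume: float, in_house_monthly_cost: float, price_per_unit: float) -> str:
    """Compare a flat in-house cost with a vendor bill that scales with volume."""
    vendor_cost = volume * price_per_unit
    # At exactly the break-even volume, the two costs are equal and we report "in-house".
    return "vendor" if vendor_cost < in_house_monthly_cost else "in-house"


# Hypothetical numbers, for illustration only.
IN_HOUSE_MONTHLY_COST = 2_000   # engineer time to maintain one connector, per month
PRICE_PER_MILLION_ROWS = 500    # vendor price per million rows synced

threshold = break_even_volume(IN_HOUSE_MONTHLY_COST, PRICE_PER_MILLION_ROWS)
print(f"Break-even at {threshold} million rows per month")
print(cheaper_option(1, IN_HOUSE_MONTHLY_COST, PRICE_PER_MILLION_ROWS))
print(cheaper_option(10, IN_HOUSE_MONTHLY_COST, PRICE_PER_MILLION_ROWS))
```

Below the threshold the vendor is cheaper; above it, the flat in-house cost wins, which is the point at which an external solution stops making sense under volume-based pricing.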
And the last inherent issue with a cloud-based approach: Cloud data warehouses are winning the enterprise market, but only because they are considered part of the data infrastructure. All other solutions must go through a rigorous privacy compliance process that takes several months.
3. Our Solution
That’s the slide where we start to get deeper into our differences from our seed deck.
What we were saying:
There are two main differences with our approach.
1. Being open-source:
- Enables us to address the long tail of integrations: We built a Connector Development Kit (CDK) that enables our users to build new connectors in a matter of hours (instead of days!) for our community. The idea is to standardize the way connectors are being built, thus making them much easier to maintain by us and the community.
- Enables our users to customize any connector to their specific needs, and to leverage Airbyte the way they need to, along with other tools in their data stack.
- Enables our users to start testing Airbyte without asking upper management for permission. So the speed to value in a testing use case is way better than any closed-source solution.
- Enables us to address the use case of all companies that can’t use a third-party vendor to handle their sensitive data. This includes healthcare, fintech and insurance, as well as enterprises.
2. Having value-based pricing
Current solutions have adopted volume-based pricing, which has always been counterproductive for data.
We won’t go into much detail yet about our intentions here, as this is something that we would like to keep under wraps.
4. Our Product
What we were saying:
Airbyte addresses two different audiences.
With our easy-to-use UI, we want to empower data consumers, such as data analysts and scientists, to start replicating data in a no-code way. Airbyte essentially makes them autonomous; they are no longer dependent on the priorities of the data or software engineering team. It takes literally 2 minutes for a data analyst to replicate data from Salesforce to Snowflake, including the deployment through our Docker Compose.
5. Our Product 
What we were saying:
The second audience we’re addressing is data and software engineers. With our API and soon our CLI, we enable them to leverage Airbyte to power the data replication they need in their software and workflow. Some will even leverage Airbyte to offer data integrations on their own platform to their customers.
Airbyte integrates with dbt, Airflow and Kubernetes for now. But our intention is to integrate with the whole stack by the end of the year. Our goal is that Airbyte just works - whatever your data volume, architecture, infrastructure, or connector needs - by the end of the year.
6. Our Team
In this slide, we review the expertise of the team. In our case, we have six people coming from LiveRamp, where they built and maintained more than 1,000 connectors and were moving more than 100TB of data every day. Our goal is to make investors understand that we’re experienced and have already done it in the past, and that the goal now is to enable every company to do it.
Since we presented these slides we’ve hired three more people: Natalie K. (Director of Operations), Jenny B. (Senior Software Engineer) and George C. (Senior Software Engineer).
By the way, we are hiring!
7. Our Velocity
This is possibly the most important slide, as it validates everything we’ve said with facts. Please keep in mind that these metrics are the ones we had on April 11, when we presented the deck to the Benchmark partnership team.
The point we wanted to make is how far we’ve come since the seed round’s term sheet was signed. We deliberately didn’t choose the seed closing date, because Accel made the decision to invest in Airbyte in December, not in February.
The number of companies using you is most likely the most important data point for investors.
What we were saying:
We actually started working on Airbyte at the end of July. We soft launched an MVP 2 months later (at the end of September), with only 6 connectors and no support for incremental replications. We wanted to have feedback as early as possible. Since then, we’ve been used by X companies to sync data, we have Y PR contributors and Z issue contributors, and we’ve ramped up the number of connectors to 60.
8. Some Growth Metrics
Investors will always ask for the names of companies that are using you. So here, we’re just anticipating the question.
Our point on this slide is to show that the conversion rate from users having synced data to users who are using us in prod every day is growing every week!
9. Our Roadmap
Now that we’ve shown our velocity up to this point, it’s time to show what we can accomplish in the next few quarters. Investors want to see where you will be when it is time to raise the next round. That is this slide’s goal.
10. Our Ask
This is the last slide of the pitch. The goal here is that you shouldn’t need to say much! It shouldn’t come as a surprise, and is a direct consequence of the roadmap slide.
You need some sort of slide to separate the main deck from the appendix. No surprise here! We only showed the appendix slides if asked relevant questions, but we sent the deck with the appendix.
12. The Competitive Landscape
What we were saying:
Here are a few choices we made that distinguish Airbyte from other open-source solutions:
- The velocity in terms of execution, team and community growth is not comparable.
- Airbyte’s connectors are usable out of the box through a UI and an API, with monitoring, scheduling and orchestration. It takes literally 2 minutes to start replicating data from Salesforce to Snowflake.
- Airbyte runs connectors as Docker containers, so they can be built in the language of your choice.
- Airbyte’s components are modular, and you can decide to use subsets of the features to better fit in with your data infrastructure (e.g., orchestration with Airflow or Kubernetes or Airbyte…)
- We integrate with dbt for the transformation piece and will soon let the community contribute normalization schemas for all connectors.
- Unlike Singer, Airbyte uses one single open-source repo to standardize and consolidate all developments from the community, leading to high quality connectors. We built a compatibility layer with Singer so that the Singer community can easily migrate their work to Airbyte.
The point for us here was to demonstrate that we are well on our way to becoming the industry’s open-source standard.
13. Some Growth Metrics 
Even though GitHub stars are essentially vanity metrics, they are an indicator of the awareness you create. They also add support when demonstrating growth to your investors.
14. Some Growth Metrics 
These metrics are much more important to show, as our defensibility will come through the community. The exponential growth was what was exciting to investors.
15. Some Growth Metrics 
We anticipated that Series-A investors might need more metrics than we included in the seed deck. We showed the growing usage per user. If we make our prod users successful, they will naturally put more and more of their connectors on our platform when moving away from other solutions.
16. Some Growth Metrics 
Even though we never actually presented this slide, it was important for us to include it in the deck. As we sent the deck before the meeting, it was already answering the question, “Who is using you in prod?”
17. Our Investors
This is probably the slide we showed the least often, as serious VCs will already know who your main investors are. But you should still have it ready just in case.
That’s it, folks! Now you know why we included the slides we did, and what we were trying to accomplish with each one. Hope this helps you understand us better, and maybe helps you prepare for your own investor encounters. Let us know what you think or if you have any questions in the comments below!