
In February 2021, we closed our $5.2M seed round with Accel. Three months later, we announced our $26M Series-A round with Benchmark. And in mid-December of that same year, we announced our $150M Series-B, led by Altimeter and Coatue, at a $1.5B valuation. Raising 3 rounds of funding in the same year is definitely not common, and can trigger different types of reactions, such as TechCrunch’s “infinite revenues multiple” article.
In this article, we want to tell you why and how we raised the Series-B. As we did with our Seed deck and our Series-A deck, we will share the deck we used for the Series-B and explain how we pitched it. We hope this will give you some insight into Airbyte’s vision and the fundraising process, even though our case might be unique.
We’ve also recorded how we pitched the Series-B deck in this community call. Don’t hesitate to check the pitch here:
If you like what we do, don’t hesitate to subscribe to our newsletter or star our GitHub project!
Our initial plan was to raise the Series-B at the end of 2022. But we decided to test the waters in November of this year. Our thinking was that if we could get the Series-B now under the same conditions with the partners we would ideally seek next year, then we would do it now. Why?
Our next company milestone is a high revenue goal. To get there, in 2022, we would need to grow the team:
So raising the Series-B now would:
In the end, the Series-B would help us get to the next company milestone a lot faster with both of the above compounding effects.
So all this makes sense. The question now becomes how we reached our $1.5B valuation, since Airbyte, after all, was only founded in July 2020 – just 16 months ago.
We started working on our deck with the assumption that new potential investors were already familiar with both our seed deck and our Series-A deck.
In comparison to other Series-B decks, ours could only be lighter in terms of revenue reporting (as we were just getting started with the private Beta of Airbyte Cloud), so we focused on community and usage metrics, as well as on how the vision evolved, to show that Airbyte will become the big leader in all data movement in the future.
Here’s the Series-B deck:
Let’s dive into how we presented it.
Here are more details on our slides - why we included them and what we said during our pitches.
Although you might only spend 10 seconds discussing the cover slide, it’s important. It’s about showing that you have identified your positioning. In our case, it’s not only about ELT, but about all data integration and movement.
Data infrastructure has become one of the hottest industries, thanks to Snowflake’s IPO and Databricks’ impressive growth. So we decided to start with some context on the industry and how we fit into this ecosystem.
What we were saying:
When you look at the data infrastructure industry, quite a few verticals have now become mature. This is the case for the data repository and warehousing, for data transformation (with dbt leading the charge), and for data visualization / analytics.
But, unfortunately, anything related to data movement is not mature at all. Reverse-ETL is essentially 2 years old, and the current ETL/ELT solutions only cover a small fraction of the possible connectors. Airbyte is first focusing on the ELT use case, but will soon expand to all data movement use cases.
We removed the names of the incumbents that we would mention on this slide, so as to avoid any problems with them. If you recognize your provider, it’s because the limitations detailed here are the same across the industry.
What we were saying:
In the last year, more than 10k companies used Airbyte to sync data, most of which already had an ETL/ELT platform in place. This enabled us to learn a ton about their use cases and the problems they were experiencing with these solutions. We identified 3 main issues with the current solutions:
All this is true for all data movement, including both streaming and reverse-ETL use cases.
That’s the slide where we start to get deeper into our differences from our seed deck.
What we were saying:
Airbyte addresses all three issues in the following manner:
What we were saying:
Airbyte simplifies the life of all practitioners in the data team:
This slide enabled us to make sure that investors didn’t have any more questions on the product, before switching to our achievements.
The goal of this slide is to show them the extreme velocity of the team. We would also mention that we introduced the vision of the participative model at the beginning of Q4 2021, and you can see the growth in connector contributions.
The blue line shows the number of connectors contributed by the community, and the black one, the number of total connectors available on the platform. You can see that our internal team is now only focused on improving the quality and reliability of the connectors. We will also provide better tools for the community to build and maintain new connectors more easily.
In the end, within 16 months, we reached the number of connectors that the incumbents offer.
So you can imagine how many connectors we will have in the years to come.
The main point of this slide is to show that our deployment monthly growth went up 6x within only 9 months, and to emphasize the fact that this growth was strictly organic. No money was spent on acquisition.
The goal of this slide is to show how deployments translate into activated companies, i.e., companies that have successfully synced data with us, and therefore understood how Airbyte could be leveraged.
We define a prod user as a user that syncs data more than once a day for a period of several weeks.
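To make that definition concrete, here is a minimal sketch of how such a classification could be computed from sync timestamps. The thresholds are assumptions on our part: "more than once a day" is read as at least 2 syncs on a given day, and "several weeks" as 21 consecutive days. The function name `is_prod_user` and its parameters are hypothetical and only serve this illustration; they are not part of Airbyte.

```python
from collections import Counter
from datetime import date, timedelta


def is_prod_user(sync_dates, min_syncs_per_day=2, min_days=21):
    """Return True if some run of `min_days` consecutive calendar days
    each contains at least `min_syncs_per_day` syncs.

    Both thresholds are illustrative assumptions, not official values.
    `sync_dates` holds one `date` per sync (repeats allowed).
    """
    syncs_per_day = Counter(sync_dates)
    # Keep only the days that meet the per-day threshold, in order.
    qualifying_days = sorted(
        day for day, count in syncs_per_day.items()
        if count >= min_syncs_per_day
    )

    streak = 0
    previous = None
    for day in qualifying_days:
        # Extend the streak only if this day directly follows the last one.
        if previous is not None and day - previous == timedelta(days=1):
            streak += 1
        else:
            streak = 1
        if streak >= min_days:
            return True
        previous = day
    return False
```

For example, a user with two syncs a day for three straight weeks would count as a prod user under these assumptions, while a user syncing once a day for a month would not.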
What we point out in this slide is that prod users are growing at a faster rate than activated users. The percentage of activated users that become prod users keeps increasing.
The last point we mention is that, among the churned activated users we talked to, most want to use our Airbyte Cloud solution. They’re just not interested in hosting and operating Airbyte by themselves. That’s okay when you need 5 to 10 connectors, but most companies need to replicate data from 20 to 100 different sources, and this number keeps increasing. And that is actually the reason why we worked on Airbyte Cloud this early on. Airbyte Cloud is not a revenue optimization strategy, but an adoption optimization strategy.
Our goal is that Airbyte becomes the most used data integration solution fast, and this is actually happening.
Growth in terms of users is great, but it needs to be accompanied by a growing usage to be fully convincing. So this slide is the (big) cherry on the cake. It would prove our point of “land and expand.” Airbyte users will test Airbyte first on a few use cases before building trust in the solution and expanding their usage.
What we would say:
This is possibly one of the most important slides. Let us tell you why.
We’re comparing Airbyte with other open-source projects related to data integration: Grouparoo (addresses open-source reverse-ETL), Rudderstack (addresses the open-source real-time use case of Segment), and Meltano (started more as a competitor to us but is now positioning itself more and more as an orchestrator for data pipelines).
The big difference in growth rate in terms of contributors stems from the fact that we’re the only ones with a huge overlap between our user community and our contributor community. Our users are data analysts and data engineers. Those same data engineers are tasked with building and maintaining connectors and are our contributors. In the case of Grouparoo and Rudderstack, their users are product, marketing, customer success, and sales people; they are not the data engineers that would contribute back. This lack of overlap means they will never be able to build a community of contributors or to address the long tail of connectors.
So, in the end, Airbyte is the only project out there able to address the long tail of connectors for ELT, reverse-ETL and streaming use cases. Our data engineering community is also tasked with building the other types of pipelines. So when we enable streaming and reverse-ETL use cases in 2022, we will start addressing the long tail for those use cases with the same participative model.
And this shows Airbyte’s true ambition: we want to power all data movement. ETL/ELT is only the first step. People compare us to Fivetran today, but we will do much more by the end of 2022.
You can only become a standard if the ecosystem considers you one. The goal here is to emphasize the fact that we’re still at the beginning of the journey but have already seen very promising adoption from the ecosystem.
Some companies have started to build their own product on top of our data protocol and connectors. This is especially interesting to us as they will have high stakes in helping us maintain the connectors to an SLA they want to provide to their own customers. This is the virtuous circle we want to get to.
Airbyte is not just a company; it’s a community of companies and individuals solving the data movement problem once and for all.
Series-B means you need to show revenue metrics, and this slide is a transition towards revenue discussions. We already mentioned ETL/ELT is only a first step for Airbyte, so we would give an indication of the total addressable market (TAM) for that first step alone.
Investors will only invest if they’re convinced of your business model and have a good understanding of your ACV (Annual Contract Value), which (together) will give them a better sense of your TAM.
What we would say:
In terms of what should be open source or in Airbyte Cloud, here’s our philosophy:
This gives us a lot of levers for conversion to premium, the biggest of which is actually just hosting and management, as most companies don’t want to do that by themselves for dozens of connectors. Typical premium features would also include user access management, which is important when it relates to data.
As mentioned before, our compute-based pricing model enables us to be up to 10x cheaper for databases. You might think 10x is a lot, but we’re actually creating a new category here, as most companies don’t use other ETL solutions for that use case, so their pricing can’t serve as references.
Even though Airbyte Cloud had only been released as invite-only a few weeks earlier, we still needed to show the traction of our new model, and the level of contracts we were discussing. This is the slide that would enable investors to have some validation on the TAM. We would also emphasize the fact that the ACV will grow with the new use cases we would enable.
Now that we had shared the vision of the opportunity, we needed to show them how we were going to achieve it.
What we would say:
Airbyte has a bottom-up approach. To leverage this to the fullest, we will focus on a self-serve product-led approach. Cloud users would start with a free trial to get to the Aha moment as easily and quickly as possible. This will give us some usage insights that will complement firmographics data to determine whether we should favor an entirely self-serve motion or a hand-off to sales. The way we’re thinking right now is that deals larger than $10k will need a human hand.
What we would say:
Since April, we have grown from 7 to 25 (we’re already up to 35 at the time of writing). We’ve been focusing on hiring senior and leadership roles (“head of”), so those people can grow their own team. The best way to grow by 6x next year is to have many swimlanes.
By the way, we are hiring!
At this point, investors would often ask about our vision for 2022. So this slide would serve as a transition.
What we would say:
Our focus in 2022 is to build a ubiquitous and reliable integration standard. Here’s what that would entail:
We want to address these 3 points in 2022. Our goal is to become the new standard for any kind of data movement. Then we can make those data pipelines smarter by addressing what is on top of them: data quality, privacy and compliance, data discovery, observability, etc.
Our next company milestone is to reach $XM (we don’t communicate the numbers publicly unfortunately) in annualized projected revenues. How does that translate? We believe we will need at least 300 employees. So, our goal in 2022 would be to grow from 30 to 200. Most of the money we raised will go into growing the team.
The last slide of our presentation is always about the ask. We wouldn’t mention how much we needed, because at the time, as we explained, we were testing the waters. We were mostly focused on telling them why we’re considering raising now if we’re given the opportunity with the right partners and at the conditions we envisioned for our Series-B.
We displayed this slide when presenting the participative model.
Airbyte’s internal team won’t focus on building more new connectors, but on enabling the community to do so. We will also maintain around 100 connectors ourselves, and give the tools to the contributors to do the same. We can only commoditize data integration if our connectors stay up to date and high quality.
We would show this slide only if asked more detailed questions on go-to-market, which happened only once.
The emphasis we give in this slide is that Airbyte is a movement powered by a community, and that’s the difference between us and the incumbents. We will build a community of local meetup organizers and writers, in addition to connector builders.
We didn’t actually show those slides even once, as serious VCs will already know who your main investors are. But you should still have them ready just in case.
--
That’s it, folks! Now you know why we included the slides we did, and what we were trying to accomplish with each one. Hope this helps you understand us better, and maybe helps you prepare for your own investor encounters. Let us know what you think or if you have any questions in the comments below!
Also, if you’re curious about joining the team, please don’t hesitate to check our careers page and apply!