The Deck We Used to Raise our $150M Series-B

In February 2021, we closed our $5.2M seed round with Accel. Three months later, we announced our $26M Series-A round with Benchmark. And in mid-December of that same year, we announced our $150M Series-B, led by Altimeter and Coatue, at a $1.5B valuation. Raising three rounds of funding in the same year is definitely not common, and can trigger different types of reactions, such as TechCrunch’s “infinite revenues multiple” article.

In this article, we want to tell you why and how we raised the Series-B. As we did with our seed and Series-A decks, we will share the deck we used for the Series-B and how we pitched it. We hope this will give you some insight into Airbyte’s vision and the fundraising process, even though our case might be unique.

We’ve also recorded how we pitched the Series-B deck in this community call. Don’t hesitate to check the pitch here:

If you like what we do, don’t hesitate to subscribe to our newsletter or star our GitHub project!

Why We Raised in December 2021

Our initial plan was to raise the Series-B at the end of 2022. But we decided to test the waters in November of this year. Our thinking was that if we could get the Series-B now under the same conditions with the partners we would ideally seek next year, then we would do it now. Why?

Our next company milestone is a high revenue goal. To get there, in 2022, we would need to grow:

  • from 30 team members (that’s what we call ourselves internally) to 200. 
  • from 16k deployments to 100k
  • from 150 connectors to 500 high-quality ones

So raising the Series-B now would: 

  • Boost our recruiting. Some very high-potential profiles would rather join a Series-B startup than a Series-A one. The perceived risk is not the same. Those are also the profiles we need, as they’ve experienced a hypergrowth setting. So raising now would enable us to fill our recruiting needs faster and help us reach our 200-employee goal in 2022. 
  • Boost our brand perception, usage and revenue. Some companies were waiting for Airbyte to become more mature and financially secure before deciding to use the technology. So raising now would unblock a certain portion of them and get us to 100k deployments faster, which is one of our next milestones. This would reinforce our leading position in the open-source data integration market.

In the end, the Series-B would help us get to the next company milestone a lot faster with both of the above compounding effects. 

So all this makes sense. The question now becomes how we reached our $1.5B valuation, since Airbyte, after all, was only founded in July 2020 – just 16 months ago.

Our Deck and Pitch

We started working on our deck with the assumption that new potential investors were already familiar with both our seed deck and our Series-A deck. 

Compared to other Series-B decks, ours would inevitably be lighter on revenue reporting (as we were just getting started with the private Beta of Airbyte Cloud), so we focused on community and usage metrics, as well as on how the vision evolved, to show that Airbyte will become the leader in all data movement in the future.

Here’s the Series-B deck:


Let’s dive into how we presented it.  

The Structure and Our Pitch on Each Slide

Here are more details on our slides - why we included them and what we said during our pitches. 

0. Cover

Although you might only spend 10 seconds discussing the cover slide, it’s important. It’s about showing that you have identified your positioning. In our case, it’s not only about ELT, but about all data integration and movement. 

1. Industry Context

Data infrastructure has become one of the hottest industries, thanks to Snowflake’s IPO and Databricks’ impressive growth. So we decided to start with some context on the industry and how we fit into this ecosystem.

What we were saying: 

When you look at the data infrastructure industry, quite a few verticals have now become mature. This is the case for the data repository and warehousing, for data transformation (with dbt leading the charge), and for data visualization / analytics. 

But, unfortunately, anything related to data movement is not mature at all. Reverse-ETL is essentially 2 years old, and the current ETL/ELT solutions only cover a small fraction of the possible connectors. Airbyte is first focusing on the ELT use case, but will soon expand to all data movement use cases. 

2. Problem We’re Solving

We removed the names of the incumbents that we mentioned on this slide, so as to avoid any problems with them. If you recognize your provider, it’s because the limitations detailed here are the same across the industry.

What we were saying:

In the last year, more than 10k companies used Airbyte to sync data, most of which already had an ETL/ELT platform in place. This enabled us to learn a ton about their use cases and the problems they were experiencing with these solutions. We identified 3 main issues with the current solutions: 

  1. Most of these companies are only using them for their most common APIs, and only in the case where they are satisfied with the use cases addressed by the ETL/ELT platform. As soon as these companies start to have custom use cases that go beyond the limited extensibility of those platforms, the companies will most likely need to have an in-house data engineering team build and maintain those custom connectors. 
  2. Furthermore, after 8 years, those platforms have plateaued at 170-200 connectors. The hard part about ETL/ELT is not about building the connectors, but maintaining them. And doing it all in-house and closed-source is costly. In addition to this, those closed-source platforms will be restricted by ROI (return on investment) considerations. It isn’t profitable for them to support the long tail of connectors, so they only focus on the most popular integrations. In the end, most companies using those platforms will also have long-tail connector needs that can only be addressed by building and maintaining connectors in-house with a data engineering team.
  3. The last point where most companies would rather leverage in-house data engineering teams is for high-volume databases. Those ETL platforms have a row-based pricing model that is incompatible with the hundreds of millions of rows you have in databases. 

All this is true for all data movement, including both streaming and reverse-ETL use cases. 

3. Our Solution

That’s the slide where we start to get deeper into our differences from our seed deck. 

What we were saying: 

Airbyte addresses all three issues in the following manner: 

  1. Airbyte will be the only solution addressing the long tail of connectors with its community-powered and open-source approach. We announced a participative model where we will do revenue sharing with our connector contributors as soon as they provide high-quality connectors. We’ve already seen huge adoption of this model with 40 connectors contributed in October and November. Our goal is to reach 500+ high-quality connectors by the end of 2022, and then reach thousands more with time. Plus, even if we don’t support the connector you need now, you can leverage our Connector Development Kit (CDK) to build a new connector from scratch within 2 hours, instead of 2 days. Our moonshot is to get that time below 30 minutes for most use cases in the future.
  2. Airbyte is non-opinionated, meaning it will enable you to replicate the data not only wherever you want it, but also however you want it. We don’t enforce schemas, we let you change the connector as much as you need, and we let you use whichever tools you want for orchestration and transformation. We’re also not opinionated in terms of sources and destinations. Data infrastructures are unique, so you need a solution that will adapt to your unique needs.
  3. When you look at data infrastructure, most verticals offer compute-based pricing models, but for some reason ELT did not until Airbyte. Data teams are used to paying based on compute costs. It gives them control and predictability. In our case, Airbyte’s compute-based pricing model will also enable you to replicate data from databases, as database replication throughput is 100x higher than for APIs, on average.

4. Our Audience / Personas

What we were saying: 

Airbyte simplifies the life of all practitioners in the data team: 

  1. Data analysts and scientists will have a lot more autonomy. With time, as our CDK becomes simpler and simpler to use, the analytics engineer of tomorrow will even become completely autonomous at moving data for their needs. 
  2. Data engineers have a LOT less work to do on building and maintaining connectors, a redundant task that they dread doing. 
  3. Engineers most often turn towards open-source solutions to address their unique use case before building something in-house. Since Airbyte will eventually address all data movement use cases, it will become a more and more attractive solution for them.

5. Transition to Metrics

This slide enabled us to make sure that investors didn’t have any more questions on the product, before switching to our achievements. 

6. Our Execution Velocity 


The goal of this slide is to show them the extreme velocity of the team. We would also mention that we introduced the vision of the participative model at the beginning of Q4 2021, and you can see the growth in connector contributions. 

The blue line shows the number of connectors contributed by the community, and the black one, the number of total connectors available on the platform. You can see that our internal team is now only focused on improving the quality and reliability of the connectors. We will also provide better tools for the community to build and maintain new connectors more easily. 

In the end, within 16 months, we reached the number of connectors that the incumbents offer. 

So you can imagine how many connectors we will have in the years to come. 

7. Our Deployment Velocity

The main point of this slide is to show that our deployment monthly growth went up 6x within only 9 months, and to emphasize the fact that this growth was strictly organic. No money was spent on acquisition. 

8. Our Activations

The goal of this slide is to show how deployments translate into activated companies, i.e., companies that have successfully synced data with us, and therefore understood how Airbyte could be leveraged.

9. Our Prod Usage

We define a prod user as a user that syncs data more than once a day for a period of several weeks. 
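To make that definition concrete, here is a minimal sketch of how such a check could be written. It is only an illustration, not our actual metric code: the function name, the three-week window, and the reading of “more than once a day” as “at least two syncs on every day of the window” are all assumptions made for the example.

```python
from collections import Counter
from datetime import date, timedelta


def is_prod_user(sync_dates, window_end, min_weeks=3, min_syncs_per_day=2):
    """Return True if every day in the trailing window has enough syncs.

    sync_dates: iterable of datetime.date, one entry per completed sync.
    min_weeks and min_syncs_per_day are illustrative assumptions, not
    thresholds stated in the article.
    """
    per_day = Counter(sync_dates)  # missing days count as 0 syncs
    window_days = min_weeks * 7
    return all(
        per_day[window_end - timedelta(days=offset)] >= min_syncs_per_day
        for offset in range(window_days)
    )


end = date(2021, 11, 30)
# Two syncs on each of the last 21 days.
steady = [end - timedelta(days=d) for d in range(21) for _ in range(2)]
# One sync every third day.
sporadic = [end - timedelta(days=d) for d in range(0, 21, 3)]

print(is_prod_user(steady, end))    # True
print(is_prod_user(sporadic, end))  # False
```

A stricter or looser definition only changes the two keyword arguments; the shape of the check stays the same.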

What we point out in this slide is that prod users are growing faster than activated users: the percentage of activated users that become prod users keeps increasing.

The last point we mention is that, based on our interactions with activated users that churned, most of them want to use our Airbyte Cloud solution. They’re just not interested in hosting and operating Airbyte by themselves. That’s okay when you need 5 to 10 connectors, but most companies need to replicate data from 20 to 100 different sources, and this number keeps increasing. And that is actually the reason why we worked on Airbyte Cloud this early on. Airbyte Cloud is not a revenue optimization strategy, but an adoption optimization strategy.

Our goal is that Airbyte becomes the most used data integration solution fast, and this is actually happening. 

10. Our Data Usage

Growth in terms of users is great, but it needs to be accompanied by a growing usage to be fully convincing. So this slide is the (big) cherry on the cake. It would prove our point of “land and expand.” Airbyte users will test Airbyte first on a few use cases before building trust in the solution and expanding their usage. 

11. Our Community Growth

What we would say: 

This is possibly one of the most important slides. Let us tell you why. 

We’re comparing Airbyte with other open-source projects related to data integration: Grouparoo (addresses open-source reverse-ETL), Rudderstack (addresses the open-source real-time use case of Segment), and Meltano (started more as a competitor to us but is now positioning itself more and more as an orchestrator for data pipelines).

The big difference in growth rate in terms of contributors stems from the fact that we’re the only ones with a huge overlap between our user community and our contributor community. Our users are data analysts and data engineers. Those same data engineers are tasked with building and maintaining connectors and are our contributors. In the case of Grouparoo and Rudderstack, their users are product, marketing, customer success, and sales people; they are not the data engineers that would contribute back. This lack of overlap means they will never be able to build a community of contributors or to address the long tail of connectors. 

So, in the end, Airbyte is the only project out there able to address the long tail of connectors for ELT, reverse-ETL and streaming use cases. Our data engineering community is also tasked with building the other types of pipelines. So when we enable streaming and reverse-ETL use cases in 2022, we will start addressing the long tail for those use cases with the same participative model.

And this shows Airbyte’s true ambition: we want to power all data movement. ETL/ELT is only the first step. People compare us to Fivetran today, but we will do much more by the end of 2022.

12. Our Ecosystem Adoption

You can only become a standard if the ecosystem considers you one. The goal here is to emphasize the fact that we’re still at the beginning of the journey but have already seen very promising adoption from the ecosystem. 

Some companies have started to build their own product on top of our data protocol and connectors. This is especially interesting to us as they will have high stakes in helping us maintain the connectors to an SLA they want to provide to their own customers. This is the virtuous circle we want to get to. 

Airbyte is not just a company; it’s a community of companies and individuals solving the data movement problem once and for all. 

13. Transition Towards Revenue Metrics

Series-B means you need to show revenue metrics, and this slide is a transition towards revenue discussions. We already mentioned ETL/ELT is only a first step for Airbyte, so we would give an indication on the total addressable market (TAM) for that first step alone. 

14. Our Paid Solution - Airbyte Cloud

Investors will only invest if they’re convinced of your business model and have a good understanding of your ACV (Annual Contract Value), which (together) will give them a better sense of your TAM. 

What we would say: 

In terms of what should be open source or in Airbyte Cloud, here’s our philosophy: 

  • Any feature that addresses the needs of the team or the company should be in Airbyte Cloud. 
  • Any feature that enables an individual to move data seamlessly should be open source.

This gives us a lot of levers for conversion to premium, the biggest of which is actually just hosting and management, as most companies don’t want to do that by themselves for dozens of connectors. Typical premium features would also include user access management, which is important when it relates to data. 

As mentioned before, our compute-based pricing model enables us to be up to 10x cheaper for databases. You might think 10x is a lot, but we’re actually creating a new category here, as most companies don’t use other ETL solutions for that use case, so their pricing can’t serve as a reference.

15. Adoption Metrics

Even though Airbyte Cloud had only been released as invite-only a few weeks earlier, we still needed to show the traction of our new model, and the level of contracts we were discussing. This is the slide that would enable investors to have some validation on the TAM. We would also emphasize the fact that the ACV will grow with the new use cases we would enable.

16. Our GTM Strategy - Product-Led Motion

Now that we had shared the vision of the opportunity, we needed to show them how we were going to achieve it. 

What we would say: 

Airbyte has a bottom-up approach. To leverage this to the fullest, we will focus on a self-serve product-led approach. Cloud users would start with a free trial to get to the Aha moment as easily and quickly as possible. This will give us some usage insights that will complement firmographic data to determine whether we would favor an entirely self-serve motion or a hand-off to sales. The way we’re thinking right now is that deals larger than $10k will need a human hand.
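As a rough illustration of that routing rule, here is a minimal sketch. Only the $10k threshold comes from the pitch; the `TrialAccount` type, its `estimated_deal_usd` field, and the `route` function are hypothetical names invented for this example, and how the estimate is derived from usage and firmographic data is left out entirely.

```python
from dataclasses import dataclass

# From the pitch: deals larger than $10k need a human hand.
SALES_ASSIST_THRESHOLD_USD = 10_000


@dataclass
class TrialAccount:
    name: str
    # Hypothetical field: an estimate built from usage and firmographic signals.
    estimated_deal_usd: float


def route(account: TrialAccount) -> str:
    """Return 'sales' for deals above the threshold, else 'self-serve'."""
    if account.estimated_deal_usd > SALES_ASSIST_THRESHOLD_USD:
        return "sales"
    return "self-serve"


print(route(TrialAccount("small-team", 2_500)))   # self-serve
print(route(TrialAccount("enterprise", 40_000)))  # sales
```

Keeping the threshold as a single constant makes it easy to revisit once real conversion data comes in.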

17. Our Team

What we would say: 

Since April, we grew from 7 to 25 (we’re already up to 35 now at the time of writing). We’ve been focusing on hiring senior and leadership roles (“head of”), so those people can grow their own team. The best way to grow by 6x next year is to have many swimlanes. 

By the way, we are hiring!

18. Transition to Long-Term Vision

At this point, investors would often ask about our vision for 2022. So this slide would serve as a transition.

19. Our Next Company Milestones

What we would say: 

Our focus in 2022 is to build a ubiquitous and reliable integration standard. Here’s what that would entail: 

  1. Connectivity with the long tail of open-source and easily extensible connectors. This means we want Airbyte to work for your use case in a reliable way, even if you’re replicating hundreds of GBs every day from your databases. 
  2. Localization: we will start by releasing Airbyte Cloud in the US, then in Europe 3 months later, and APAC after that. We will also separate the data plane from the control plane, so that companies will be able to keep the data plane inside their private VPC if they want to, and Airbyte would never have access to the data itself. That’s what we call hybrid deployment.
  3. New types of data movement, including both streaming and reverse-ETL use cases. 

We want to address these 3 points in 2022. Our goal is to become the new standard for any kind of data movement. Then we can make those data pipelines smarter by addressing what is on top of them: data quality, privacy and compliance, data discovery, observability, etc.

Our next company milestone is to reach $XM (we don’t communicate the numbers publicly unfortunately) in annualized projected revenues. How does that translate? We believe we will need at least 300 employees. So, our goal in 2022 would be to grow from 30 to 200. Most of the money we raised will go into growing the team.

20. Our Ask

The last slide of our presentation is always about the ask. We wouldn’t mention how much we needed, because at the time, as we explained, we were testing the waters. We were mostly focused on telling them why we were considering raising now, if given the opportunity with the right partners and under the conditions we envisioned for our Series-B.

21. Appendix


22. The Participative Model

We displayed this slide when presenting the participative model. 

Airbyte’s internal team won’t focus on building more new connectors, but on enabling the community to do so. We will also maintain around 100 connectors ourselves, and give the tools to the contributors to do the same. We can only commoditize data integration if our connectors stay up to date and high quality.

23. GTM Strategy [2]

We would show this slide only if asked more detailed questions on go-to-market, which happened only once. 

The emphasis we give in this slide is that Airbyte is a movement powered by a community, and that’s the difference between us and the incumbents. We will build a community of local meetup organizers and writers, in addition to connector builders. 

24. Our Investors

We didn’t actually show this slide even once, as serious VCs will already know who your main investors are. But you should still have it ready just in case.

--

That’s it, folks! Now you know why we included the slides we did, and what we were trying to accomplish with each one. Hope this helps you understand us better, and maybe helps you prepare for your own investor encounters. Let us know what you think or if you have any questions in the comments below!

Also, if you’re curious about joining the team, please don’t hesitate to check our careers page and apply!
