Confluent, a leading infrastructure software company commercializing Apache Kafka, has filed for an IPO. Per the most recent S-1/A, the company plans to raise $759M at $33/share, the high end of the range, and that range is likely to increase as we get closer to trading. Morgan Stanley is leading the IPO and Confluent plans to trade under the ticker “CFLT”. Confluent’s CEO and co-founder, Jay Kreps, co-created Apache Kafka while working as an engineer at LinkedIn in 2011. I’ll refer to Confluent throughout this post, but their products are centered around the massively popular open source project, Apache Kafka. As more software is used inside companies, the need for real-time (or streaming) data becomes ever more important. Confluent becomes the central nervous system that connects disparate data repositories, enabling companies to react in real-time. While modern databases are great for data at rest, Confluent is best for real-time data, and their mission statement is to “set data in motion.” Confluent’s commercial offering can be deployed on premise, hosted in the cloud, and works across hybrid cloud infrastructure environments. The company discusses the need for its product here:
“The operation of the business needs to happen in real-time and cut across infrastructure silos. Organizations can no longer have disconnected applications around the edges of their business with piles of data stored and siloed in separate databases. These sources of data need to integrate in real-time in order to be relevant, and applications need to be able to react continuously to everything happening in the business as it occurs. To accomplish this, businesses need data infrastructure that provides connectivity across the entire organization with real-time flow and processing of data, and the ability to build applications that react and respond to that data flow. As companies increasingly become software, they need a central nervous system that connects all of their disparate software systems, unifying their business and enabling them to react intelligently in real-time.”
Confluent does not report ARR (annual recurring revenue), but as of last quarter the company was at $272M of implied ARR (quarterly subscription revenue * 4), growing 55% year-over-year, with a 117% dollar-based net retention rate in the same period. The company has over 2,500 customers, including 561 paying more than $100K in ARR and 60 paying more than $1M in ARR. The company is also spending heavily to fund that growth: it burned $21.2M of cash (free cash flow) last quarter, implying a (28)% free cash flow margin. Confluent was founded in 2014 under the name Infinitem and changed their name to Confluent in the same year. According to Pitchbook, Confluent has raised $455.9M in equity financings to date across seed → series E rounds. The most recent round was a series E led by Coatue in April 2020 at a $4.5B post-money valuation (according to Pitchbook). At the high end of the range of $33/share, the fully-diluted market cap would be $10.7B, with the share price up 120% from the series E price of $14.97 just 15 months ago. More information on investors and funding can be found later in this post. Confluent is headquartered in Mountain View, California, and has 1,473 full-time employees across 20 countries (as of March 2021).
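To make the arithmetic behind two of those figures explicit, here is a minimal sketch of the implied ARR definition used in this post and of the share price change since the series E. The $68M quarterly subscription revenue is an assumption back-solved from the $272M implied ARR quoted above (272 / 4), not a reported number, and the function names are just illustrative.

```python
def implied_arr(quarterly_subscription_revenue: float) -> float:
    """Annualize one quarter of subscription revenue (this post's implied ARR)."""
    return quarterly_subscription_revenue * 4


def pct_change(old: float, new: float) -> float:
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100


# Assumption: $68M is back-solved from the $272M implied ARR, not a reported figure.
quarterly_sub_rev_m = 68.0
arr_m = implied_arr(quarterly_sub_rev_m)

# Series E price and the high end of the IPO range, both taken from the post.
uplift = pct_change(14.97, 33.0)

print(f"Implied ARR: ${arr_m:.0f}M")
print(f"Share price change since series E: {uplift:.0f}%")
```

Implied ARR annualizes a single quarter, so it will run ahead of trailing revenue for a company growing this quickly.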
Below is a timeline graphic from the S-1.
Confluent is pioneering a new category of data infrastructure focused on what they call data in motion, which is essentially data streaming (the S-1 uses the term data in motion throughout). The traditional approach to how applications are built and deployed is no longer sufficient, as companies need integrated systems that operate in real-time. Confluent’s pitch is integration and connectivity, all in real-time, with Kafka at the core.
Confluent aims to be the intelligent connective tissue by having real-time data from multiple sources constantly streamed across an enterprise for analysis. The solution enables organizations to deploy production-ready applications that run across cloud infrastructure environments and data centers, paired with features for security and compliance. The company believes that, over time, it can become the central nervous system for modern digital enterprises, providing ubiquitous real-time connectivity and powering real-time applications.
Confluent’s software can be consumed in two ways, Confluent Cloud and Confluent Platform, which customers can leverage independently or together.
Confluent believes they have 3 main differentiating elements to their platform:
In addition to the above, Confluent highlights the "network effects" its product creates as customers leverage the solution, helping speed adoption and creating a defensible competitive advantage. These network effects are created as the first application that utilizes the platform brings in data streams that can then be used by all future applications, subsequently bringing value to the entire ecosystem and attracting more applications. The company also describes how Confluent Cloud will be important for their go-forward growth and Confluent calls out the Q1 Confluent Cloud year-over-year revenue growth of 124%. Confluent Cloud is no doubt the future of the company and while it only represented 20% of subscription revenue last quarter, it’s growing over 100% a year and it’s no surprise Confluent listed Confluent Cloud first in their product section.
Below is an S-1 graphic of the current state of enterprise infrastructure.
And below is what it looks like when you are running with Confluent.
Summary Metrics and GTM (Go-to-Market)
Below are a few high-level metrics on Confluent’s financial performance and metrics before getting into their GTM.
Given the broad awareness of Apache Kafka, Confluent can target users who are already on Kafka for their proprietary Confluent suite of products. The company did not opt for a traditional open core model, but the core offering, Confluent Server, is highly differentiated from the open source Kafka. Confluent’s goal is to convert small pockets of Kafka usage into Confluent deployments in a low-friction way and to expand those relationships until they become enterprise-wide Confluent deployments. Confluent derives almost all of their revenue from subscriptions to Confluent Platform (a self-managed product that can be deployed on premise, or in a public or private cloud) and Confluent Cloud, their fully-managed cloud-native SaaS offering. Confluent Cloud was launched in Nov-2017, and in late 2019 the company transitioned the cloud product from a defined configuration paid annually in advance to a usage-based pricing model. The pay-as-you-go model has become quite popular in SaaS, so it is not surprising that Confluent opted for this approach. Confluent also offers a 3-month free trial of the cloud product. Most deals are 1 year in length.
Confluent likely sees a mix of customers: small customers or a pocket of an enterprise starting out on Confluent Cloud, and also large, wall-to-wall deployments. Customers with $1M+ in ARR were at 60 last quarter, up 82% year-over-year. Customers with $100K+ in ARR were at 561 last quarter, up 50% year-over-year. The sales cycles for enterprise customers can be long, and Confluent calls this out in the risk factors but doesn’t mention the exact length. During the second quarter of 2020 (during COVID), the company temporarily reduced hiring and believes this had a negative impact on net retention and growth. As of last quarter, the company had 684 FTEs in sales and marketing (sales development, inside sales, field sales, sales engineering, and marketing personnel), which represents almost 50% of the company’s entire headcount. Salespeople take a few quarters to ramp (the company doesn't give specifics) and most of the company’s business is sold in the fourth quarter of each year.
Below are more stats on the business and industry from the S-1:
Other industry stats:
Confluent’s market is large and supports both on premise and cloud environments. The company believes they touch a few of Gartner’s segments: 1) Application Infrastructure & Middleware 2) Database Management Systems 3) Data Integration Tools and Data Quality Tools, and 4) Analytics and Business Intelligence. According to Gartner’s 2021 estimates, these total markets represent $149B in spend. With that said, Confluent believes they only touch a portion of these markets (~$50B) as outlined below:
This market is also expected to grow at a 22% CAGR (compounded annual growth rate) through 2024 to reach $91B. The market they operate in is massive irrespective of how the data is cut.
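As a quick sanity check on that projection, the sketch below compounds the ~$50B addressable market at 22% per year. It assumes the $50B figure is a 2021 base and that "through 2024" means three years of compounding; the S-1 may define the window differently.

```python
def compound(value: float, annual_rate: float, years: int) -> float:
    """Grow `value` at a constant annual rate for a whole number of years."""
    return value * (1 + annual_rate) ** years


# Assumptions: ~$50B is the 2021 base and growth compounds for 3 years (2022-2024).
base_2021_b = 50.0
projected_2024_b = compound(base_2021_b, 0.22, 3)

print(f"Projected 2024 market: ${projected_2024_b:.1f}B")
```

Under those assumptions the result rounds to the $91B cited above.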
Not surprisingly, Confluent states their primary competition is internal IT teams that develop their own data in motion infrastructure, which in many cases is based on Apache Kafka, the open source project the Confluent founders developed. So teams using the open source product and not paying Confluent are a large threat, which is not uncommon for massively popular open source projects and the companies commercializing them. Confluent also calls out a few other groups of competitors: 1) the cloud providers with products like Azure Event Hubs, AWS Kinesis and DynamoDB Streams, and Google’s Cloud Pub/Sub and Cloud Dataflow products, 2) legacy products that have pivoted into this space, including TIBCO Streaming, Cloudera Dataflow, Red Hat (IBM) AMQ Streams, and Oracle Cloud Infrastructure Streaming, and 3) other competing open source projects, which they do not name, although Apache Pulsar is another open source alternative to Kafka that is gaining traction. You can see the GitHub star history below; Pulsar started later but is on a similar trajectory to Kafka. GitHub stars are not a source of truth for traction, but they are a good proxy for a project’s popularity. Moreover, Confluent launched a cloud product that is offered on top of the cloud providers, and if those providers offered competing products Confluent would see more competition on the cloud side as well.
Investors and Ownership
Confluent has raised $456M in equity financing according to Pitchbook. Investors include Benchmark, Index, Sequoia, Altimeter, Coatue, and others. Institutional shareholders with stakes of 5% or more include Benchmark (15.3%), Index (13.0%), and Sequoia (9.3%). Jay Kreps, CEO and co-founder, holds a 12.6% stake. The last round was a $250M series E in April 2020 that was led by Coatue at a $4.5B post-money valuation (per Pitchbook) at a $14.97 price per share. In July 2020 the company did a tender offer and sold ~$110M worth of shares at a $14.97 price per share. Jay Kreps and his spouse sold $39.5M of secondary shares in that tender offer. Another co-founder and board member, Neha Narkhede, sold ~$77.8M of secondary shares in September of 2020.
See the cap table output of major shareholders below.
Major Shareholder Summary Cap Table Ownership %
Price per Share Disclosure
The following chart shows the preferred share prices over time. At the high end of the range ($33 per share), Confluent would be worth $10.7B, with the share price up 120% from the series E share price 15 months ago.
Shareholder Value at High End of IPO Range ($M)
Assuming a $33 price per share (the high end of the current range), the chart below shows the dollar value of the major shareholders’ stakes. Benchmark stands to be the largest winner with almost a $1.2B stake. CEO Jay Kreps' stake would be worth almost $1B. Note this only includes investors that have their shares disclosed, and doesn't include other major investors (such as Coatue) who own less than 5% and are not on the board.
Confluent vs. Elastic (NYSE:ESTC)
Given the similarities in business model, I indexed Elastic (NYSE:ESTC) and Confluent’s implied ARR (quarterly subscription revenue x 4) and their non-GAAP operating margins to the quarter they both crossed ~$100M of implied ARR. Elastic commercialized the popular open source search and analytics engine Elasticsearch and went public in 2018 at a $2.5B valuation and now has a market cap of $12.5B (16-Jun-2021).
Indexed Implied ARR ($M)
The chart below compares Confluent and Elastic’s implied ARR, indexed to the quarter each company did ~$25M of subscription revenue / $100M of implied ARR. The growth trajectories are quite similar.
Indexed Non-GAAP Operating Margin
While they were growing at similar rates, Elastic was more efficient on a non-GAAP operating margin-basis than Confluent thus far. Margins are indexed to the quarter the companies crossed ~$100M of implied ARR.
Financials and Other Metrics Outputs
Confluent is almost at $300M of implied ARR, growing 50%+ year-over-year in a massive market. Their cloud revenue is growing over 100% year-over-year and offers an easy way for the masses to onboard to Kafka. Kafka isn’t an easy-to-use solution on its own, and Confluent Cloud is creating a great on-ramp for a larger part of the market that previously might not have decided to use Kafka at all. While non-GAAP operating margins are hovering around (40)%, investing for growth at this stage of the market is the right move for Confluent. While COVID did hurt them and the company slowed hiring, the past few quarters have been strong. The following charts dive deeper into the company’s business and metrics.
Historical P&L & Key Metrics ($000's)
Quarterly Revenue ($M)
Subscription and Services Revenue Mix
Implied Ending ARR ($M)
Confluent added $19.1M of implied net new ARR over the past quarter and $96.2M over the past year.
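The net new ARR figures can be cross-checked against the growth rate quoted earlier. The sketch below backs out the starting balances from the $272M ending implied ARR and the net new ARR numbers in the sentence above; the variable names are illustrative, and because $272M is itself a rounded figure the results are approximate.

```python
ending_arr_m = 272.0        # implied ARR at the end of last quarter (from this post)
net_new_quarter_m = 19.1    # implied net new ARR over the past quarter
net_new_year_m = 96.2       # implied net new ARR over the past year

# Back out the starting balances by subtracting the net new ARR.
prior_quarter_arr_m = ending_arr_m - net_new_quarter_m
prior_year_arr_m = ending_arr_m - net_new_year_m

# Growth is net new ARR divided by the starting balance.
qoq_growth = net_new_quarter_m / prior_quarter_arr_m * 100
yoy_growth = net_new_year_m / prior_year_arr_m * 100

print(f"Implied ARR a year ago: ${prior_year_arr_m:.1f}M")
print(f"Year-over-year growth: {yoy_growth:.0f}%")
print(f"Quarter-over-quarter growth: {qoq_growth:.1f}%")
```

The year-over-year result lines up with the 55% implied ARR growth cited at the top of the post.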
Confluent Cloud vs Platform Revenue ($M)
Confluent's Cloud revenue is growing much faster than their self-managed product and while a smaller part of overall subscription revenue today, it's likely to make up significantly more in the future given the growth rate. Confluent Cloud is growing 124% year-over-year vs Confluent Platform growing at 43% year-over-year.
Here is another view; Confluent Cloud revenue now makes up 20% of subscription revenue.
Quarterly Non-GAAP Gross Margins and Operating Expenses as a % of Revenue
Quarterly GAAP and Non-GAAP Operating Margins
Given there is some disclosure around customer segments, the following looks at the percentages of customers by bucket. It appears that larger customers and cloud customers (which likely tend to start smaller) are both growing at high rates.
Net Dollar Retention
Net dollar retention has come down for Confluent over the past few quarters. The company calls out a few reasons for this: 1) the impact of existing customers becoming a larger portion of both the overall customer base and ARR, 2) large initial deal sizes that incorporate potential growth, 3) the impact of COVID-19, and 4) the initial impact of existing customers transitioning to the usage-based Confluent Cloud offering.
Confluent also released ARR (annual recurring revenue) cohorts by year in the chart below. For example, the 2017 cohort represents all customers that made their initial subscription purchase from the company between January 1, 2017 and December 31, 2017. The fiscal year 2017 cohort grew from $15M of initial ARR to $62M in fiscal year 2020, representing a multiple of 4.1x.
Contribution Margin Analysis
Confluent released a contribution margin analysis of their 2018 cohort, which they believe to be a good representation of the entire business. There is a longer definition of what Contribution Margin is in the S-1 on page 88, but it shows that on a cohort-basis, Confluent’s base of ARR becomes very profitable over time. Not surprising for a software company.
Sales Efficiency and Payback Periods
Confluent doesn’t release customer counts by quarter, but the below output plots their implied months to payback using the inverse of a CAC ratio (net new implied ARR multiplied by non-GAAP gross margin divided by non-GAAP sales and marketing spend of the prior quarter). The magic number is defined as implied net new ARR divided by non-GAAP sales and marketing spend of the prior quarter. The median months-to-pay-back over the disclosure period is 21 months (8 quarters). The COVID impact on sales efficiency in Q1 and Q2 of last year is clear.
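The two efficiency metrics described above can be sketched as follows. The formulas follow the definitions in the paragraph; converting the CAC ratio to months as 12 divided by the ratio is my assumption (ARR is an annualized figure), and the gross margin and sales and marketing inputs are hypothetical placeholders rather than Confluent's reported numbers.

```python
import math


def magic_number(net_new_arr: float, prior_q_sm_spend: float) -> float:
    """Implied net new ARR divided by the prior quarter's non-GAAP S&M spend."""
    return net_new_arr / prior_q_sm_spend


def cac_ratio(net_new_arr: float, gross_margin: float, prior_q_sm_spend: float) -> float:
    """Gross-margin-adjusted net new ARR per dollar of prior-quarter S&M spend."""
    return net_new_arr * gross_margin / prior_q_sm_spend


def payback_months(net_new_arr: float, gross_margin: float, prior_q_sm_spend: float) -> float:
    """Months to pay back S&M spend; assumes months = 12 / CAC ratio."""
    ratio = cac_ratio(net_new_arr, gross_margin, prior_q_sm_spend)
    # With flat or shrinking ARR the spend is never paid back.
    return 12 / ratio if ratio > 0 else math.inf


# $19.1M net new implied ARR is from this post; the other inputs are hypothetical.
net_new_arr_m = 19.1
gross_margin = 0.70      # placeholder non-GAAP gross margin
prior_q_sm_m = 35.0      # placeholder prior-quarter non-GAAP S&M spend

print(f"Magic number: {magic_number(net_new_arr_m, prior_q_sm_m):.2f}")
print(f"Payback: {payback_months(net_new_arr_m, gross_margin, prior_q_sm_m):.1f} months")
```

Because both metrics use net new ARR rather than gross new ARR, churn and contraction in the quarter drag them down, which is part of why the COVID quarters look so inefficient.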
U.S. vs. International Revenue Mix Percentage
Cash Flows ($M)
Quarterly P&L ($000's)
Confluent will trade like other high-growth SaaS companies: on a multiple of forward revenue. The output below uses the NTM (next-twelve-months) revenue based on an illustrative range of growth rates and comparable EV (enterprise value)/NTM revenue multiples from other public, high-growth SaaS businesses. It also includes an implied ARR multiple range. As mentioned in other posts, companies do not release projections or guidance in S-1's. It’s likely Confluent will trade in the range of other high-growth SaaS companies, which would put the valuation at more than 2x their last private round of $4.5B. Elastic’s current NTM revenue multiple is to the right in the table, and it appears Confluent will see a significant premium to Elastic. IPO pops have been the subject of much debate, which this post won’t get into, but bankers are now pushing the pricing of tech IPOs higher to avoid those optics (pricing at a lower multiple for a higher first-day pop), so I wouldn’t be surprised to see the price move above the high end of the current range of $33/share, with less of a first-day pop when the company starts trading. The fully-diluted enterprise value at $33/share is ~$9.7B. Clearly the interest in the IPO is quite high given the illustrative multiples below.
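For readers who want to reproduce this kind of multiple table, here is a minimal sketch. The ~$9.7B enterprise value and $272M implied ARR come from this post; the $250M trailing revenue and the growth rates are hypothetical placeholders chosen only to show the mechanics, not Confluent's reported or projected figures.

```python
def ntm_revenue(ltm_revenue: float, growth_rate: float) -> float:
    """Next-twelve-months revenue assuming a single illustrative growth rate."""
    return ltm_revenue * (1 + growth_rate)


def ev_multiple(enterprise_value: float, revenue: float) -> float:
    """Enterprise value divided by a revenue (or ARR) figure."""
    return enterprise_value / revenue


ev_m = 9700.0            # ~$9.7B fully-diluted EV at $33/share (from this post)
implied_arr_m = 272.0    # implied ARR (from this post)
ltm_revenue_m = 250.0    # hypothetical placeholder, not a reported figure

arr_multiple = ev_multiple(ev_m, implied_arr_m)
print(f"EV / implied ARR: {arr_multiple:.1f}x")

# Illustrative growth scenarios; swap in your own assumptions.
for growth in (0.40, 0.50, 0.60):
    multiple = ev_multiple(ev_m, ntm_revenue(ltm_revenue_m, growth))
    print(f"NTM growth {growth:.0%}: EV / NTM revenue = {multiple:.1f}x")
```

The higher the assumed growth rate, the larger the NTM revenue and the lower the resulting multiple at a fixed enterprise value.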
Confluent is the leading independent player commercializing Apache Kafka, one of the largest and most successful open source projects. While competition is likely to increase from cloud providers like AWS, Microsoft, and Google, as well as from other emerging open source projects, Confluent is well positioned given their scale and market awareness. Moreover, as most large enterprises move towards hybrid and multi-cloud environments, they will need to standardize their real-time data streaming infrastructure stack on one independent vendor, and the cloud providers don’t make products that interoperate well with each other, leaving significant white space for Confluent to grow. Confluent Cloud should also be a big driver of growth in the future as it unlocks a large portion of the market by offering an easy way for the masses to onboard to Kafka. Based on the price range that’s already been set for the IPO, Confluent is on its way to being a successful outcome for all of its shareholders.
Special thanks to Chris Gaertner for the help on this post.