The hype around rare sneakers has never been bigger. Founded in 2018, SoleSavy is an exclusive marketplace where you can buy trendy sneakers and accessories that are hard to acquire in the retail market. Sneaker enthusiasts can join this passionate community to connect with other fans, get expert advice, and access tools to stay on top of every important sneaker release.
SoleSavy's business revolves around Slack, with various Slack communities dedicated to helping members stay on top of sneaker releases, share experiences with one another, and keep in tune with the sneaker world. As its core platform grew, SoleSavy encountered several challenges that needed to be solved:
Hard to measure Slack community growth metrics
SoleSavy has developed a new marketplace for sneaker enthusiasts, allowing them to network globally and get early access to a wide range of sneakers. However, with its rapidly growing network of over 11,000 members, SoleSavy needed an effective way to extract growth and churn metrics from its multiple Slack communities.
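To make "growth and churn metrics" concrete, here is a minimal sketch of how such figures could be derived from two membership snapshots of one community. The function name, the `PeriodMetrics` type, and the snapshot-based definitions are illustrative assumptions, not SoleSavy's actual method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodMetrics:
    joined: int         # members present now but not in the previous snapshot
    left: int           # members present previously but missing now
    growth_rate: float  # net change relative to the previous member count
    churn_rate: float   # share of previous members who left


def membership_metrics(previous: set[str], current: set[str]) -> PeriodMetrics:
    """Compare two snapshots of member IDs (hypothetical input format)."""
    joined = current - previous
    left = previous - current
    base = len(previous)
    if base == 0:
        # No earlier members to compare against, so rates are reported as 0.
        return PeriodMetrics(len(joined), len(left), 0.0, 0.0)
    return PeriodMetrics(
        joined=len(joined),
        left=len(left),
        growth_rate=(len(current) - base) / base,
        churn_rate=len(left) / base,
    )
```

Running this per community and per period gives comparable numbers, but only if the snapshots for every community are available in one place, which is the problem described next.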
360-degree, community-wide data analytics
SoleSavy manages enormous amounts of member data spread over 10 different Slack communities. To quickly and effectively analyze this data and extract critical insights, SoleSavy needed to build data pipelines that pull information from several community databases and bring it into a centralized space.
We needed metrics on these Slack communities and had to be able to cross-reference Community A with Community B. We have north of 10 communities that we want to be able to measure growth for. It’s important that we have this information in a centralized space where we can see all accounts at once and measure their corresponding growth.
SoleSavy needed to efficiently consolidate and analyze data from multiple Slack communities to determine growth points and obtain relevant metrics. Its old data architecture failed to meet these requirements and had several problems:
Unable to quickly organize and process large amounts of data
For SoleSavy, it quickly became evident that using an ELK stack as its primary search and analytics engine did not adequately meet its business needs. With over 100 MB of raw unstructured data added from each Slack community daily, it became increasingly challenging to analyze the data. Additionally, due to the lack of SQL language support in Elasticsearch, an open source search and analytics engine, it became difficult to extract critical information from the large amounts of collected data.
I used ELK stacks in the past for several things. However, it didn’t meet the needs for this use case. We had to process almost a gigabyte of data a day. After a month of data collection, that was a very slow query for Metabase to graph, making the solution impractical.
Difficulty consolidating different databases
In SoleSavy's data architecture, a Slack community was mapped one-to-one with a database, so data corresponding to that community could be quickly identified, filtered, and graphed on demand. Unfortunately, as the number of communities and the volume of data from these communities grew, so did the number and size of the corresponding databases. This made it harder to perform cross-database joins and correlate data stored in multiple databases. To bring data together from numerous databases, SoleSavy considered several options, such as Meltano. However, Meltano was hard to deploy at scale and could not seamlessly integrate with Slack. Setting up Meltano also took time, and given the tremendous growth the company was experiencing, scaling it for quick and efficient data analysis was a struggle.
Cloud infrastructure and Docker containers are critical components of SoleSavy’s architecture. To address some of the key challenges with the old architecture, SoleSavy bet their infrastructure on Airbyte.
Airbyte was simple to deploy in a Docker container and use, making it an ideal technology to utilize.
Simpler data aggregation for analytics and reporting
For SoleSavy, data aggregation across its different Slack communities was a serious problem. With Airbyte, SoleSavy automatically discovered all the available Slack entities, whether users or channels, and captured only the subset of data needed for analytics. In the new architecture, data from multiple sources was loaded into Postgres and then queried from a single database instance. With all the data now in a consolidated database, data analytics became easier: by simply pointing Metabase, the company's analytics dashboard tool, to the newly consolidated database, rich data insights could be extracted and reported.
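As a rough illustration of why a single consolidated database simplifies cross-community analysis, the sketch below puts members from several communities into one table and answers two questions with plain SQL: how many members each community has, and which members belong to both of two communities. It uses Python's built-in `sqlite3` as a stand-in for Postgres. The `slack_users` table, its columns, and the sample rows are hypothetical and do not reflect the schema Airbyte actually produces.

```python
import sqlite3


def build_consolidated_db(rows):
    """Load (community, user_email, joined_on) tuples into one in-memory table."""
    conn = sqlite3.connect(":memory:")  # stand-in for the consolidated Postgres instance
    conn.execute(
        "CREATE TABLE slack_users (community TEXT, user_email TEXT, joined_on TEXT)"
    )
    conn.executemany("INSERT INTO slack_users VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def members_per_community(conn):
    """Member count for every community, from a single query."""
    cur = conn.execute(
        "SELECT community, COUNT(*) FROM slack_users "
        "GROUP BY community ORDER BY community"
    )
    return cur.fetchall()


def members_in_both(conn, community_a, community_b):
    """Emails that appear in both communities (a self-join on one table)."""
    cur = conn.execute(
        """
        SELECT x.user_email
        FROM slack_users AS x
        JOIN slack_users AS y ON x.user_email = y.user_email
        WHERE x.community = ? AND y.community = ?
        ORDER BY x.user_email
        """,
        (community_a, community_b),
    )
    return [row[0] for row in cur.fetchall()]


if __name__ == "__main__":
    sample = [
        ("community_a", "ann@example.com", "2021-01-05"),
        ("community_a", "bob@example.com", "2021-01-09"),
        ("community_b", "bob@example.com", "2021-02-01"),
        ("community_b", "cy@example.com", "2021-02-03"),
    ]
    db = build_consolidated_db(sample)
    print(members_per_community(db))
    print(members_in_both(db, "community_a", "community_b"))
```

With one database per community, the second question would require a cross-database join. Here it is an ordinary self-join, and a dashboard tool can run the same kind of query directly.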
I just wanted to use an easy-to-implement tool to start collecting data from Slack right away. Using Airbyte was incredibly simple and just took a few clicks. Once you put in the credentials, it connected to the database, did the thinking, and the task was soon complete. So after trying out Airbyte, we decided that this was the tool we had to use.
Looking into the future
Today, Airbyte is an essential component of SoleSavy’s data processing systems. As SoleSavy grows and scales its use cases, it is confident in Airbyte's capabilities and the support that it can get from experts in the community.
SoleSavy has a very young engineering team, and we want to be part of open source. It’s a good thing for our team to be involved, and also a good thing for the community as well.