Deploy a Self-service Business Intelligence Project With Whaly & Airbyte

Join our newsletter to get all the insights on the data stack

Self-service BI is fundamental to a company’s ability to make critical business decisions when it counts. For these decisions to be made in a timely manner, all stakeholders need to have an ongoing pulse on what’s going on in the business.

The challenge with the process of turning raw data into insights is that the end data consumers are Sales reps, CMOs, Heads of Growth, and others - who tend not to come from technical backgrounds, and aren’t responsible for the technical majority of the data process: collection, integration with Airbyte, transformation, and modeling.

As a result, business teams are dependent on data teams for their dashboards and reports. This model isn’t scalable because the data teams end up being a bottleneck, and business teams aren’t autonomous, which could slow business growth.

BI platforms with a self-service layer, like Whaly, empower business teams to serve themselves the analytics they need, when they need it, in a drag & drop interface that’s intuitive to non-technical users.

This is a game-changer because:

Data teams are no longer a bottleneck. Business teams are autonomous, with the right tools for ongoing data visibility.
Prevents “gut feeling” decisions - all decisions are backed by data, and made on time.
Data adoption grows across the organization.
Trust in data is fostered, moving towards a truly data-driven company.

How Airbyte and Whaly provide a self-service experience

Airbyte benefits:

Airbyte has an impressive catalog of 300+ connectors, so that your data projects are always comprehensive
Airbyte allows you to easily get data into your data warehouse
One-stop shop for connecting all your data sources, without having to hire full-time engineers or more resources

Whaly benefits:

Fast to implement and use, and connects directly to your data warehouse
Whaly’s platform consists of 2 tools that are highly in sync:
A bespoke tool built for data teams: create the semantic layer and model data, setting the business teams up for successful explorations.
A bespoke tool built for business teams: intuitive interface in which they can autonomously run queries and answer their own questions.
A self-service experience that ensures high governance. Whaly enables data teams and business teams to collaborate in a way that they never could before.
Data teams save time, and business teams are empowered to create their own dashboards.
Boost in data adoption and trust in data company-wide, leading to business growth.

How they’re better together:

1 data source implies a whole project that involves dashboard creation & maintenance. Adding new sources can pose a challenge because it requires data team resources to build reporting from scratch for new use cases.
Lack of bandwidth from data teams means they won’t consider connecting to new sources, since they’ll be busy building dashboards for their existing sources.
Whaly’s self-service layer is a time-saver for the data teams, who will have less maintenance work, and won’t have to be the dashboard builders.
This widens their scope, and allows them to work on new use cases and different projects by connecting more data sources.
Airbyte has the largest catalog of sources, ensuring that your data projects are comprehensive and connected to every source that’s relevant to your company.
Airbyte + Whaly ensure that you have a wider set of analyzed data, for more insights.

Getting started with Whaly and Airbyte is simple! Use Airbyte to get all your data into your data warehouse, and perform analytics on it with Whaly’s BI platform. Follow our step-by-step guide below:

Airbyte & Whaly Tutorial

Airbyte

Step 1: Set up your data sources

Create a Airbyte Cloud account to get started. Through its user interface (UI), Airbyte can connect to a wide variety of data sources, from business applications to databases. From the Sources tab, select your data source from the drop-down menu.

In our example, we’ll use the Slack data source.

Once you have selected your data source, Airbyte will run you through a set-up guide to connect to your data source.

‍

Step 2: Set up a Warehouse destination

Once you have your source, the next step is to set a destination to which Airbyte can connect and write your data. Airbyte supports a large number of destinations, including cloud data warehouses like Google BigQuery and Snowflake.

‍

Whaly supports both Snowflake and BigQuery. We’ll use BigQuery in this example, as it has a nice free tier.

Step 3: Set up connections between your sources and BigQuery destination

Fill out the proper information as per Airbyte documentation, in order to connect to the BigQuery instance.

‍

How to set up Airbyte to feed Slack data into BigQuery

Once the BigQuery credentials are validated, we’re redirected to the Slack configuration panel where we can select which tables we want to sync.

‍

Once the connection is set up, the sync begins. Grab yourself a cup of coffee while it runs ☕

‍

Whaly

Step 4: Connect Whaly to your BigQuery instance

When creating your Whaly account, you’ll be asked for your BigQuery credentials. The full instructions for getting those credentials is explained in our documentation: https://docs.whaly.io/warehouse/bigquery/connect-your-bigquery

Step 5: Import the data to be queried

Now, we’ll go through the “Workbench” interface to import the data tables that Airbyte has created.

Workbench is a tool within Whaly’s platform that only data practitioners can access to create models and produce “Explorations” that will be published in the Workspace. The Workspace is the tool in which data consumers (business teams) can build and access their dashboards.

This line of separation is important for ensuring strong governance in the business intelligence platform.

Once you’re in the Workbench, click on “+” button next to the Source to import our Airbyte datasets.

Step 6: Create a self-service layer

By clicking on “Import Dataset”, we’re able to list all the tables that Airbyte has created and import the relevant ones into Whaly.

We can now see a preview of the tables and their schema to understand how the data is shaped and structured.

Now that we have a solid overview of our data, we’ll create a “self-service layer” on top of it. This can be considered a lightweight semantic layer, which serves as the interface and link between the data producers and practitioners, and the data consumers.

In the first step, we’ll declare the relationship between the tables:

A Channel has many Threads
A Channel has many Channel Message
A Channel has many Channel Members

For each relationship, we select the matching columns that will be used to combine the tables together.

In the end, we have declared all the relationships ⤵️

We can now create an “Exploration” on top of these tables to declare the dimensions and the metrics that will be made available to our data consumers.

As you can see, a lot of dimensions (in 🟩) and metrics (in 🟦 ) have been automatically created by analyzing the table schema. We can quickly clean them up, keeping only the most interesting ones, and discarding those that don’t add value.

We can also rename and add descriptions to the important dimensions. For example our table provided by Airbyte has a ts column. This is the abbreviation for “timestamp” which is an engineering term. Our business users will prefer to call this column “Message date”.

At any time, we can click “Open exploration” to preview what the Query Builder will look like for data consumers and dashboards builders.

Now that we have some interesting dimensions and metrics on the “Message” table, let’s add the “Channel” table, to understand where the messages are being sent within Slack. To do that, we can add a related table in our Exploration.

Note: A dynamic JOIN will be executed when needed, as a result of the relationships that we have previously declared. As Whaly knows everything about the schema, the nature of the relationships and of the matching columns, this JOIN will be optimized to avoid the “fan-out” problem that can break some reports.

Now that we have added the “Channel” table, we can use it to break down the number of messages per Channel in the query builder interface.

We can see that most messages are sent in 2 Slack channels: poseidon/prod, which contain automatic notifications. We will remove them when building dashboards.

To finalize our Exploration, we’ll configure the Drills to let people interact with the charts that they will create or access.

Now, whenever an end user charts something from this Exploration, he or she will be able to click on the chart to dig deeper and understand the details behind the figures - like content of the messages, the channel it was sent to, etc.

Based on the messages, we can see that the last week of December was a celebration week for the team 👍🍾

Step 7: Create a dashboard

Now that we have an Exploration ready, with information from the Message table and the Channel table, we can create our first dashboard.

This time, we’ll do everything within the “Workspace” environment as data consumers (C-Level / Managers / Business users). The exploration has been created to enable self-service reporting across your organization, so now, non-technical business users can access analytics autonomously.

In the Workspace, the data consumer can access the Explorations that we built in the Workbench in the “Explore” part of the left menu.

Let’s start by asking ourselves a question that we want to answer, “what is the breakdown of messages per day and per channel?”

Here, we can run a query based on the relevant metrics and dimensions on the left hand side. Once we’re happy with the outcome of the chart, we can save it to a Dashboard. This will redirect us to the dashboard studio.

In the dashboard studio, we can add other charts, re-position them, add filters and customize everything. Now, let’s add a time filter and a title.

This way, when data consumers access the dashboard directly from the Workbench, they will be able to change the filter value to match their needs.

We can also store the Dashboard inside a “Slack” folder on the menu on the left hand side, so that people can find it more easily when browsing the Workbench.

Now you have the beginning of your Slack dashboard, with the data connected into Whaly through Airbyte.

Whaly is a full-blown BI platform, so this is only the beginning. To better analyze our Slack data, you can share this dashboard via a password-protected link, manage the dashboard’s access control, blend other data sources, create new models (SQL or no-code) - but that’s the gist of it, and enough for today!

How to get started

Getting started with Airbyte & Whaly is simple. Airbyte Cloud offers a free trial so you can start syncing your data into your data warehouse. You can also sign up for a free trial of Whaly here. Once your data is in your warehouse, you can load the data easily into Whaly's self-service BI platform, to start visualizing and learning from your data. If you have any questions or feedback, please reach out to anna@whaly.io.

About the Author

Anna Lorentz is the Head of marketing at Whaly.

About the Author

Pierre de Poulpiquet is the Co-founder & CTO at Whaly

Should you build or buy your data pipelines?

Download our free guide and discover the best approach for your needs, whether it's building your ELT solution in-house or opting for Airbyte Open Source or Airbyte Cloud.

Download now

Deploy a self-service business intelligence project with Whaly & Airbyte

Join our newsletter to get all the insights on the data stack

Should you build or buy your data pipelines?

About the Author

About the Author

Join our newsletter to get all the insights on the data stack

How Airbyte and Whaly provide a self-service experience

Airbyte & Whaly Tutorial

Airbyte

Whaly

How to get started

About the Author

About the Author

Should you build or buy your data pipelines?

Similar use cases

Version control Airbyte configurations with Octavia CLI

Orchestrate data ingestion and transformation pipelines with Dagster

How to Build a Slack Analytics Dashboard Using Apache Superset