No items found.

One-click Data Pipeline for Profitability in E-commerce

In a world where e-commerce business models are relatively uniform, lies a huge opportunity in analytics of building modular, reusable data transformation models. This tutorial is about open sourcing the full end to end pipeline around a critical use case for every e-commerce: profitability calculation!

Should you build or buy your data pipelines?

Download our free guide and discover the best approach for your needs, whether it's building your ELT solution in-house or opting for Airbyte Open Source or Airbyte Cloud.

Download now

Should you build or buy your data pipelines?

Download our free guide and discover the best approach for your needs, whether it's building your ELT solution in-house or opting for Airbyte Open Source or Airbyte Cloud.

Download now

In a world where e-commerce business models are relatively uniform, where few operational tools dominate the market, lies a huge opportunity in analytics of building modular, reusable data transformation models. A lot of dbt packages so far have focused on modelling isolated data sources from extraction tools like Airbyte. We see the opportunities of going one step further, by open sourcing the full end to end pipeline around a critical use case for every e-commerce: profitability calculation!

Open sourcing Data Pipeline for calculating Profitability in E-commerce

Recognizing this opportunity, we've embarked on a project to create a fully modular data pipeline aimed at calculating e-commerce profitability. This pipeline is designed to be open-source and easily reusable, enabling data teams to quickly deploy a robust profitability model without starting from scratch.

Open Source GitHub Repository

End result of the tutorial

Here are the few dashboards that were built on top of the data model built from this tutorial's repo.

1. Overview of Revenue and Profits

This view gives the business the ability to understand Revenue and Profits, with possibility to drill down at the level of an order.

Architecture

2. Overview of Impact of different Cost Factors in EUR and % of Revenue

This view gives the business the ability to understand how does each cost factor impacts overall profitability, with possibility to drill down at the level of an order.

Architecture

3. Map of Countries with highest Profit Margins after deducting all the costs from Order Revenues

This view gives the business the ability to understand which countries see highest margins, with possibility to drill down at the level of an order.

Architecture

4. Overview of Performance of Paid Marketing and ROI (Profits made from 1 EUR of Marketing Investment)

This view gives the business the ability to optmise marketing spent not on ROAS which takes into consideration sales but on ROI (Return being Profits). In Revenue a lot of country specific costs like VAT, shipping can lead to widely different Profit outcome.

Architecture

Overview of the Repo

This project goal is to create a fully modular data pipeline aimed at calculating e-commerce profitability. This pipeline is designed to be open-source and easily reusable, enabling data teams to quickly deploy a robust profitability model without starting from scratch.

This repository is a production dbt pipeline example that models the profitability of an e-commerce business. Data is extracted and loaded to a BigQuery data warehouse by Airbyte. The main goal of this repo is to show a production pipeline for a common analytics use case (improving profitability):

  1. Data Extraction: Using tools like Airbyte, we extract data from common e-commerce platforms (e.g., Shopify) and payment gateways (e.g., PayPal, Klarna).
  2. Data Loading: The extracted data is loaded into a cloud data warehouse, in this case we chose BigQuery.
  3. Data Transformation: With dbt, we model the raw data into a reporting layer that includes all necessary metrics for profitability analysis. This involves creating separate branches for revenue calculation and various cost components.
  4. Revenue Calculation: We build a branch that aggregates order data to calculate total revenue.
  5. Cost Calculation: Multiple branches are created to calculate different types of costs, including shipping (from vendors like GLS, Spring, or FedEx), manufacturing costs, marketing costs (from online advertising like Facebook Ads), VAT costs from different countries of purchase, and commission fees (for transactions and currency conversion from payment gateways and e-commerce platforms like Klarna, Shopify, and PayPal).
  6. Profitability Calculation: The final step involves combining revenue and cost data to calculate profitability at the order level and across different dimensions such as time and product category.

Architecture

Data Sources

  • Shopify: E-commerce platform data (including Shopify Orders, Shopify Balance Transactions, Shopify Transactions)
  • Facebook Ads: Paid Marketing (Facebook Advertising) data
  • Paypal: Payment processing data (coming from Shopify)
  • FedEx and GLS: Shipping data (coming from GSheets)
  • Manufacturing Costs: Cost data from the manufacturing process (coming from Gsheets)

Models

Detail the dbt models included in the project and their relationships.

  • staging models: Models raw data from source systems
  • intermediate models: Models Revenue, and all costs separately. Join and aggregate data from staging models.
  • final models: Calculates profitability metrics and other business KPIs and build fact_orders table including all costs at order level

More details about each specific part of the pipeline

Documentation of model to combine all costs at the level of an order once each cost has been modeled:
https://blog.getdot.ai/automating-end-to-end-profitability-reporting-for-your-e-commerce-business-shopify-fedex-paypal-c65c31a69441

Documentation of model to get VAT costs at the level of an order:
https://blog.getdot.ai/how-to-model-vat-costs-to-the-level-of-order-for-profitability-calculation-in-e-commerce-e315c1d47f95

Documentation of model to get Manufacturing Costs at the level of an order:
https://blog.getdot.ai/how-to-model-manufacturing-costs-to-the-level-of-order-for-profitability-calculation-in-e-commerce-c553ad327339

Documentation of model to get Shipping costs at the level of an order:
https://blog.getdot.ai/how-to-model-shipping-costs-to-the-level-of-order-for-profitability-calculation-in-e-commerce-b5cd3b400ace

Documentation of model to get Commission Fees at the level of an order:
https://blog.getdot.ai/how-to-model-commission-fees-shopify-paypal-to-orders-for-profitability-calculation-in-937d4a8b0fe9

Documentation of model to get Marketing Costs at the level of an order:
https://blog.getdot.ai/how-to-model-marketing-costs-to-the-level-of-order-for-profitability-calculation-in-e-commerce-06fa6bade89f

Data lineage

Architecture

Combining Data for Profitability Analysis

Once we have detailed revenue and cost data (see both steps below), we combine these metrics to calculate profitability. This involves creating a unified model that allows for the analysis of profitability at various levels of granularity, from individual orders to overall business performance over time.

Complete step by step guide in this blog post.

Revenue Calculation

Our revenue calculation branch aggregates order data to provide a comprehensive view of total revenue generated. This includes handling various payment methods and ensuring that all revenue streams are accurately captured.

Cost Calculation

To accurately calculate profitability, we need to consider all relevant costs:

  • Shipping Costs: We track expenses from various shipping vendors and incorporate them into the overall cost model. 
  • Manufacturing Costs: These are accounted for by analysing the cost of goods sold (COGS) data. 
  • Marketing Costs: We include costs from online advertising platforms such as Facebook Ads to account for customer acquisition expenses. 
  • VAT Costs: Value-added tax (VAT) costs are considered, taking into account different rates from various countries of purchase. 
  • Commission Fees: Fees from payment gateways and e-commerce platforms, including transaction fees and currency conversion fees from services like Klarna, Shopify, and PayPal, are included. 

Conclusion

Our goal with this project is to provide a comprehensive, modular, and reusable data pipeline for calculating profitability in e-commerce. By open-sourcing this pipeline, we aim to empower data teams to quickly and efficiently build robust analytics infrastructure. This approach not only saves time and resources but also ensures consistency and accuracy in profitability analysis, ultimately helping e-commerce businesses make more informed decisions and drive growth.

Disclaimer

This repository can be installed to create a "one click pipeline." However, you will need to recreate the sources for cost data because they are custom to the business. The Facebook, Shopify, and Paypal transformations should be plug and play.

Similar use cases

No similar recipes were found, but check back soon!