Skip to main content

OLAP in the modern data stack

Burak Kabakci

Burak Kabakci

Core Team

OLAP is almost 30 years old and has often been used in different contexts over the last 10 years. It's a huge market, and there are many enterprise software in the space. OLAP is simply pre-aggregated data from your raw data, and it's not for everyone. If you're working for a startup that doesn't have terabytes of data, you probably don't need OLAP because you can efficiently run ad-hoc queries on your raw data (also known as fact tables). However, when you have petabytes of data and have tens of people who rely on it to make decisions, you don't want to make them wait 10 minutes for a simple query. Let's breakdown the OLAP into two categories:

Common Metric Types

Burak Kabakci

Burak Kabakci

Core Team

We've been working on building the first open-source metrics store for the last months. We worked closely with many data teams from different industries and markets to help them define their metrics as code. While working closely with various data teams, we've identified 3 types of metrics definitions that are widely accepted by data teams. Let me explain what these 3 common metric types are and how they are defined in BI tools today:

Introducing Metriql: Open-source metrics store

Burak Kabakci

Burak Kabakci

Core Team

Introduction

Today, most data-led/driven companies successfully establish the data warehouse as the single source of truth for their data and collect all their data into their data warehouse through ETL jobs. However, ETL is only one part of the whole process. Companies want to collect and own their data because they want to consume it in various ways to grow their business, whether by building analytical dashboards for stakeholders or building an ML application. To consume the data via BI or data tools, companies need to model their data and define business metrics and KPIs so stakeholders can agree on the definition of the data. Establishing the single source of truth for metadata is very difficult because tools and teams consume the data in different methods. This topic has also been addressed by many data folks recently: Ben has written The missing piece of the modern data stack, and Basecase came up with the Headless BI concept, Airbnb introduced Minerva.