Creating Datasets
A dataset represents a dataset in your data warehouse. Metriql automatically creates datasets from your dbt models
, sources
, and seeds
. A dataset has three important properties:
dimension lets you drill down into the dataset, it can be used to filter query results. It can be either a column in your table or an SQL expression that processes a single row and returns a value.
measure is a field that uses an SQL aggregate function, such as count
, sum
, or average
. Measures can be used to aggregate dimensions. For example, measures for a Sales
model might include total items sold (a count), total sales price (a sum), and the average sales price (an average).
relation defines a join in between the datasets. Metriql automatically generates SQL join logic to bring in all fields from another dataset correctly then the user analyzes a specific dataset.
#
Create datasets from dbt metricsinfo
The feature is in beta and the API is unstable for now.
We support dbt's metrics natively. The datasets that is created via dbt metrics only have single measures and the dimensions are inherited from the base model / source / seed if you use meta.metriql
in them. Please refer to dbt's metrics feature to learn more about the definitions.
#
Create datasets from dbt modelsYou can map your columns as dimensions and measures under the meta
property of columns
as follows:
models:- name: customerscolumns:- name: country_codemeta:metriql.dimension:type: string- name: citymeta:metriql.dimension:type: string- name: total_customersdescription: total number of customers defined as count(*) in sqlmeta:metriql.measure:aggregation: sum
In addition to column mapping, you can also create custom measures and dimensions by defining them under meta
of the model:
models:- name: customersmeta:metriql:measures:total_rows:aggregation: countdimensions:full_location:sql: CONCAT({TABLE}.country, {TABLE}.city)type: string
You can see the full list of properties that you can use under model.meta
and column.meta
here.
#
Create datasets from dbt sourcesIn case you want to create models that point to tables in your database, you can make use of dbt's source properties as follows:
sources:- name: raw_eventstables:- name: pageviewmeta:metriql:measures:total_pageviews:aggregation: countcolumns:- name: urlmeta:metriql.dimension:type: string
If you're analyzing the time-series data, you can also define mappings
so that Metriql understands your data in a better way and lets you access specific features such as funnel and retention:
models:- name: eventsmeta:metriql:mappings:event_timestamp: event_occurred_atuserId: user_id
#
Create datasets from dbt seeds#
Exposing your seeds to the end-usersSince seeds are usually used to enrich the data, we don't create models from seeds
by default. However; you can expose them to the end-users if you enable the seeds
to have appeared in dbt docs:
seeds:- name: countriesdocs:show: truemeta:metriql:measures:#columns:- name: iso_codemeta:metriql.dimension:- name: user_friendly_namemeta:metriql.dimension:type: stringmodels:- name: pageviewsmeta:metriql:relations:countries:to: ref('countries')sql: "{TABLE}.country = {TARGET}.iso_code"type: left_joinrelationship: many_to_many
Metriql automatically creates the relation from pageview
to a hidden countries
model so that the users can drill down by user_friendly_name
when they're analyzing pageviews
.
#
Supported testsWe make use of your dbt test definitions in order to understand your data in a better way. Here is the list of column tests we currently support:
unique
marks the dimension as primary key.not_null
hides theis not set
filter to the user in the user interface.accepted_values
makes Metriql suggests the values to the end-user in the user interface. It disables the automatic suggestion in favor of performance.