Skip to main content

Metrics as dimensions with metric filters

Metrics provide users with valuable insights into their data, like number of active users and overall performance trends to inform business decisions. Dimensions, on the other hand, help categorize data through attributes, like user type or number of orders placed by a customer.

To make informed business decisions, some metrics need the value of another metric as part of the metric definition, leading us to "metrics as dimensions".

This document explains how you can use metrics as dimensions with metric filters, enabling you to create more complex metrics and gain more insights. Available in dbt Cloud for versions 1.8 or higher.

Reference a metric in a filter

Use the Metric() object syntax to reference a metric in the where filter for another metric. The function for referencing a metric accepts a metric name and exactly one entity:

{{ Metric('metric_name', group_by=['entity_name']) }}

Usage example

As an example, a Software as a service (SaaS) company wants to count activated accounts. In this case, the definition of an activated account is an account with more than five data model runs.

To express this metric in SQL, the company will:

  • Write a query to calculate the number of data model runs per account.
  • Then count the number of accounts who have more than five data model runs.
models/model_name.sql
with data_models_per_user as (
select
account_id as account,
count(model_runs) as data_model_runs
from
{{ ref('fct_model_runs') }}
group by
account_id
),

activated_accounts as (
select
count(distinct account_id) as activated_accounts
from
{{ ref('dim_accounts') }}
left join
data_models_per_user
on
{{ ref('dim_accounts') }}.account_id = data_models_per_user.account
where
data_models_per_user.data_model_runs > 5
)

select
*
from
activated_accounts

This SQL query calculates the number of activated_accounts by using the data_model_runs metric as a dimension for the user entity. It filters based on the metric value scoped to the account entity. You can express this logic natively in the MetricFlow specification.

YAML configuration

Using the same activated_accounts example mentioned in the usage example, the following YAML example explains how a company can create semantic models and metrics, and use the Metric() object to reference the data_model_runs metric in the activated_accounts metric filter:

  • Create two semantic models: model_runs and accounts.

  • Create a measure and metric to count data model runs, and another measure to count users.

  • Specify the foreign entity account in the model_runs semantic model.

  • Then create the Activated Accounts metric by filtering accounts that have more than five data model runs.

    models/metrics/semantic_model.yml
    semantic_models:
    - name: model_runs
    ... # Placeholder for other configurations
    entities:
    - name: model_run
    type: primary
    - name: account
    type: foreign
    measures:
    - name: data_model_runs
    agg: sum
    expr: 1
    create_metric: true # The 'create_metric: true' attribute automatically creates the 'data_model_runs' metric.

    - name: accounts
    ... # Placeholder for other configurations
    entities:
    - name: account
    type: primary
    measures:
    - name: accounts
    agg: sum
    expr: 1
    metrics:
    - name: activated_accounts
    label: Activated Accounts
    type: simple
    type_params:
    measure: accounts
    filter: |
    {{ Metric('data_model_runs', group_by=['account']) }} > 5

    Let’s break down the SQL the system generates based on the metric definition when you run dbt sl query --metrics activated_accounts from the command line interface:

  • The filter {{ Metric('data_model_runs', group_by=['account']) }} generates SQL similar to the data_models_per_user sub-query shown earlier:

    select
    sum(1) as data_model_runs,
    account
    from
    data_model_runs
    group by
    account
  • MetricFlow joins this query to the query generated by the accounts measure on the group by elements and applies the filter conditions:

    select
    sum(1) as activated_accounts
    from accounts
    left join (
    select
    sum(1) as data_model_runs,
    account
    from data_model_runs
    group by
    account
    ) as subq on accounts.account = subq.account
    where data_model_runs > 5

    The intermediate tables used to create this metric is: Accounts with the data_model_runs dimension

    accountdata_model runs
    14
    27
    39
    41

    MetricFlow then filters this table to accounts with more than 5 data model runs and counts the number of accounts that meet this criteria:

    activated_accounts
    2

Considerations

  • When using a metric filter, ensure the sub-query can join to the outer query without fanning out the result (unexpectedly increasing the number of rows).
    • The example that filters the accounts measure using {{ Metric('data_model_runs', group_by=['account']) }} is valid because it aggregates the model runs to the account level.
    • However, filtering the 'accounts' measure by {{ Metric('data_model_runs', group_by=['model']) }} isn't valid due to a one-to-many relationship between accounts and model runs, leading to duplicate data.
  • You can only group a metric by one entity. The ability to support grouping by multiple entities and dimensions is pending.
  • In the future, you can use metrics as dimensions for some of the following example use cases:
    • User segments: Segment users by using the number of orders placed by a user in the last 7 days as a dimension.
    • Churn prediction: Use the number of support tickets an account submitted in the first 30 days to predict potential churn.
    • Activation tracking: Define account or user activation based on the specific actions taken within a certain number of days after signing up.
    • Support for metric filters requiring multi-hop joins is pending.
0