Dbt Quick Patterns — Run at Date
How many of you run a nightly job for yesterday’s data? If not all of us, definitely a large percentage. It is the fundamental pattern in data. Wait until the end of the day, then collect data for the previous day. Sometimes, we need to backfill a different day. Wouldn’t it be great to have a macro for that simple logic? Enter get_run_at_date().
The Challenge: We want to pull data for one complete day in models and custom tests.
The solution: Create a macro that uses var() to pass in the date we want to run for.
{# Generates a string to be be used in date logic #}
{% macro get_run_at_date() %}
{% if not execute %}{{return('')}}{% endif %}
{# Use your default date here. current_date() - 1 is common in nightly runs #}
{% set run_at_date = var('run_at_date', 'current_date() - 1') %}
{% if 'current_date()' in run_at_date %}
{{ log(run_at_date, info=true) }} {# For demonstration purposes #}
{{ return(run_at_date) }}
{% else %}
{# We use cast here, but it may not be needed in your DW. #}
{% set run_at_date_string = "cast('" + run_at_date + "' as date)" %}
{{ log(run_at_date_string, info=true) }} {# for demonstration purposes #}
{{ return(run_at_date_string) }}
{% endif %}
{% endmacro %}
In this macro, we simply set a default of current_date()-1 or a string representing the date we pass in. Let’s see it in action. With no vars passed in, the macro returns “current_date() -1”
When we pass in a date using vars, we get a cast statement with the date specified.
As simple as it gets, right? With this macro, we can easily get the date to use in where clauses. This is useful in models, snapshots and custom tests. Here is an example using it in a model.
{{
config(
materialized = 'view',
)
}}
{% set run_at_date = get_run_at_date() %}
select
*
from {{ ref('orders') }}
where order_date = {{ run_at_date }}
And the compiled SQL generated.
select
*
from `some-database`.`enterprise_sales`.`orders`
where order_date = cast('2023-11-02' as date)
Thoughts
Get_run_at_date is a macro all of us should have in our projects. It ensures we consistently use the same date logic for nightly runs. This is the basic starting point of many pipelines. What happens when we don’t use daily logic? Maybe we have weekly, monthly, or something more complex to handle late-arriving data?
Stay tuned. In a future article, we will cover these use cases and learn why dbt’s inbuilt python modules might be the right solution for complex date logic.
Want to learn more about dbt, check this out.