Dbt Quick Patterns — Macros Part 1
You really listened to Doogie Howser and countless special episodes of family sitcoms. Right? You always wrap your Jinja code in {% if execute %} blocks. If not, stop, do not pass go, and precede directly to this article to learn why you should. Now that you understand, let’s find a better way to prevent Jinja code in your macros from running before the execution phase of a dbt invocation.
Let’s say we have the macro below. We definitely do not want this running during parse phase. We’ve all seen that dreaded ‘None’ has no attribute ‘table’ error. Can we improve this macro?
{% macro run_long_query() %}
{% set sql %}
select
count(*) as num_rows
from {{ ref('orders')}}
{% endset %}
{% set result = run_query(sql) %}
{% set num_rows = result.rows[0].values()[0] %}
{{ log('Num rows: ' + num_rows | string) }}
{{ return(num_rows)}}
{% endmacro %}
Can we do it? Yes we can! We wrap the code in an {% if execute %} block. But there is a problem we can easily see. We are starting all code inside an if block. This hurts readability and makes it more difficult to write our logic. The additional if block adds another level of complexity. We will need to be very careful about indenting to remember which endif is which. It is fine for simple macros, but more complicated ones will suffer.
{% macro run_long_query() %}
{% if execute %}
{% set sql %}
select
count(*) as num_rows
from {{ ref('orders')}}
{% endset %}
{% set result = run_query(sql) %}
{% set num_rows = result.rows[0].values()[0] %}
{{ log('Num rows: ' + num_rows | string) }}
{{ return(num_rows)}}
{% endif %}
{% endmacro %}
Enter this little gem I found in the Elementary-dbt repo that goes against everything my seventh-grade health teacher ever taught me. Don’t wrap it. Just pull out early. Instead of using {% if execute %} we use {% if not execute %} then exit the macro. We get the same functionality, but better readability and one less level of nesting.
{% macro run_long_query() %}
{# Example 1 #}
{% if not execute %} {{ return(none) }} {% endif %}
{# Example 2 #}
{% if not execute %}
return(none)
{% endif %}
{# Code starts here #}
{% set sql %}
select
count(*) as num_rows
from {{ ref('orders')}}
{% endset %}
{% set result = run_query(sql) %}
{% set num_rows = result.rows[0].values()[0] %}
{{ log('Num rows: ' + num_rows | string) }}
{{ return(num_rows)}}
{% endmacro %}
Which example do you prefer? I can’t decide between 1 and 2. Please discuss.
One simple change simplifies our code and improves the developer experience a little bit. Hey, small things add up over time. Now this is where I would normally close with a catchy line relating to the title of the article, but let’s face it, you’ve all had enough of my sophomoric humor for one day. So… Keep on learning.
This article is part of the Understanding Dbt for Data series. The short list of articles is growing, so be sure save the main thread to find what’s new.