Fix missing sql_header for incremental models #2200

Merged · 6 commits · May 1, 2020
@@ -18,6 +18,7 @@
{% macro default__get_merge_sql(target, source, unique_key, dest_columns, predicates) -%}
{%- set predicates = [] if predicates is none else [] + predicates -%}
{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}
{%- set sql_header = config.get('sql_header', none) -%}

{% if unique_key %}
{% set unique_key_match %}
@@ -28,6 +29,8 @@
{% do predicates.append('FALSE') %}
{% endif %}

{{ sql_header if sql_header is not none }}

merge into {{ target }} as DBT_INTERNAL_DEST
using {{ source }} as DBT_INTERNAL_SOURCE
on {{ predicates | join(' and ') }}
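For context, with this change a rendered incremental merge would look roughly like the following sketch. The table names, the temporary UDF, and the join predicate are illustrative, not taken from the PR:

```sql
-- Hypothetical sql_header: a temporary UDF defined for this script
CREATE TEMPORARY FUNCTION a_to_b(str STRING)
RETURNS STRING AS (
  CASE WHEN LOWER(str) = 'a' THEN 'b' ELSE str END
);

-- The header above is emitted immediately before the merge statement
merge into `project.dataset.my_model` as DBT_INTERNAL_DEST
using `project.dataset.my_model__dbt_tmp` as DBT_INTERNAL_SOURCE
on DBT_INTERNAL_SOURCE.id = DBT_INTERNAL_DEST.id
when matched then update ...
when not matched then insert ...
```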
@@ -87,6 +90,9 @@
{% macro default__get_insert_overwrite_merge_sql(target, source, dest_columns, predicates) -%}
{%- set predicates = [] if predicates is none else [] + predicates -%}
{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute="name")) -%}
{%- set sql_header = config.get('sql_header', none) -%}

{{ sql_header if sql_header is not none }}
Contributor:
I think, if we do this, then the sql_header will run twice for "dynamic" (scripting) insert_overwrite materializations. (The first time will be when creating the temp table.) I assume that's ok?

Contributor (author):
This is a good call. Temporary UDFs can't be defined twice in the same BQ script. Think we're going to need to find an alternative approach here.

I kicked around the idea of making this a materialization-level thing, and while I think that might be a good approach long-term, I don't want to touch every dbt materialization in support of this change. Going to try to figure out a nice way to make this sensible for a very specific use case (insert_overwrite on BigQuery incremental models) and localize the jank as much as possible.
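To make the double-emission concern concrete: BigQuery rejects a script that defines the same temporary function twice, so rendering the sql_header once when creating the temp table and again before the merge would fail. A sketch of the failing script, with illustrative names:

```sql
-- Emission 1: header rendered before the temp-table DDL
CREATE TEMPORARY FUNCTION a_to_b(str STRING)
RETURNS STRING AS (CASE WHEN LOWER(str) = 'a' THEN 'b' ELSE str END);

create temp table my_model__dbt_tmp as (
  select a_to_b(dupe) as dupe from some_view
);

-- Emission 2: header rendered again before the merge.
-- BigQuery errors here: a_to_b is already defined in this script.
CREATE TEMPORARY FUNCTION a_to_b(str STRING)
RETURNS STRING AS (CASE WHEN LOWER(str) = 'a' THEN 'b' ELSE str END);

merge into my_model as DBT_INTERNAL_DEST
using my_model__dbt_tmp as DBT_INTERNAL_SOURCE
on FALSE
...
```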


merge into {{ target }} as DBT_INTERNAL_DEST
using {{ source }} as DBT_INTERNAL_SOURCE
@@ -7,9 +7,12 @@
#}

{%- set dest_cols_csv = get_quoted_csv(dest_columns | map(attribute='name')) -%}
{%- set sql_header = config.get('sql_header', none) -%}

{%- if unique_key is none -%}

{{ sql_header if sql_header is not none }}

insert into {{ target }} ({{ dest_cols_csv }})
(
select {{ dest_cols_csv }}
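With this change, an incremental run with no unique_key renders the header ahead of the insert, roughly like this sketch (table names are illustrative; the temporary UDF stands in for an arbitrary sql_header):

```sql
CREATE TEMPORARY FUNCTION a_to_b(str STRING)
RETURNS STRING AS (
  CASE WHEN LOWER(str) = 'a' THEN 'b' ELSE str END
);

insert into `project.dataset.my_model` (`dupe`)
(
    select `dupe`
    from `project.dataset.my_model__dbt_tmp`
)
```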
@@ -0,0 +1,15 @@

{{ config(materialized="incremental") }}

{# This will fail if it is not extracted correctly #}
{% call set_sql_header(config) %}
CREATE TEMPORARY FUNCTION a_to_b(str STRING)
RETURNS STRING AS (
CASE
WHEN LOWER(str) = 'a' THEN 'b'
ELSE str
END
);
{% endcall %}

select a_to_b(dupe) as dupe from {{ ref('view_model') }}
@@ -53,7 +53,7 @@ def test__bigquery_simple_run(self):
self.run_dbt(['seed', '--full-refresh'])
results = self.run_dbt()
# Bump expected number of results when adding new model
self.assertEqual(len(results), 8)
self.assertEqual(len(results), 9)
self.assert_nondupes_pass()


@@ -64,7 +64,7 @@ class TestUnderscoreBigQueryRun(TestBaseBigQueryRun):
def test_bigquery_run_twice(self):
self.run_dbt(['seed'])
results = self.run_dbt()
self.assertEqual(len(results), 8)
self.assertEqual(len(results), 9)
results = self.run_dbt()
self.assertEqual(len(results), 8)
self.assertEqual(len(results), 9)
self.assert_nondupes_pass()