Is this a new bug in dbt-athena?
I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
Running dbt clone on a Python model raises the following error:
$ dbt clone --select reporting.ratio_stats --state master-cache
16:46:38 Running with dbt=1.7.11
16:46:38 Registered adapter: athena=1.7.1
16:46:39 Found 82 models, 5 seeds, 415 tests, 136 sources, 10 exposures, 0 metrics, 595 macros, 0 groups, 0 semantic models
16:46:39
16:46:44 Concurrency: 5 threads (target='dev')
16:46:44
Failed to execute query.
Traceback (most recent call last):
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/pyathena/common.py", line 522, in _execute
query_id = retry_api_call(
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/pyathena/util.py", line 85, in retry_api_call
return retry(func, *args, **kwargs)
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/tenacity/__init__.py", line 379, in __call__
do = self.iter(retry_state=retry_state)
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/tenacity/__init__.py", line 314, in iter
return fut.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/tenacity/__init__.py", line 382, in __call__
result = fn(*args, **kwargs)
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/home/jecochr/data-architecture/dbt/venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 5:5: mismatched input 'None'. Expecting: <query>
Failed to execute query.
16:46:48
16:46:48 Completed with 1 error and 0 warnings:
16:46:48
16:46:48 Runtime Error in model reporting.ratio_stats (models/reporting/reporting.ratio_stats.py)
An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 5:5: mismatched input 'None'. Expecting: <query>
16:46:48
16:46:48 Done. PASS=0 WARN=0 ERROR=1 SKIP=0 TOTAL=1
The root cause of the error is that dbt's builtin clone materialization macro calls the dbt-athena view materialization macro, which in turn calls create_or_replace_view. That macro references the sql context object rather than compiled_code, and sql returns None for Python models. The result is a clone view query of the form create or replace view <clone_view> as None, which raises the above error. Here's the line in create_or_replace_view that causes the error:
https://github.com/dbt-athena/dbt-athena/blob/59d005a46c4a97d8438d050a55ab499e669c0c04/dbt/include/athena/macros/materializations/models/view/create_or_replace_view.sql#L32
And here are the definitions of compiled_code and sql in dbt-core (source):
@contextproperty()
def compiled_code(self) -> Optional[str]:
    # TODO: avoid routing on args.which if possible
    if getattr(self.model, "defer_relation", None) and self.config.args.which == "clone":
        # TODO https://github.com/dbt-labs/dbt-core/issues/7976
        return f"select * from {self.model.defer_relation.relation_name or str(self.defer_relation)}"  # type: ignore[union-attr]
    elif getattr(self.model, "extra_ctes_injected", None):
        # TODO CT-211
        return self.model.compiled_code  # type: ignore[union-attr]
    else:
        return None

@contextproperty()
def sql(self) -> Optional[str]:
    # only set this for sql models, for backward compatibility
    if self.model.language == ModelLanguage.sql:  # type: ignore[union-attr]
        return self.compiled_code
    else:
        return None
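To make the difference between the two properties concrete, here is a minimal standalone sketch that mimics them outside of dbt. FakeModel, FakeContext, and the relation name are hypothetical stand-ins rather than dbt-core classes, and the extra_ctes_injected branch is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeModel:
    """Hypothetical stand-in for a dbt model node; not a dbt-core class."""
    language: str                  # "sql" or "python"
    defer_relation: Optional[str]  # relation name taken from the --state manifest


@dataclass
class FakeContext:
    """Mimics only the two context properties quoted above."""
    model: FakeModel
    which: str  # the dbt subcommand being run, e.g. "clone"

    @property
    def compiled_code(self) -> Optional[str]:
        # During `dbt clone`, the compiled code is a select from the deferred relation.
        if self.model.defer_relation and self.which == "clone":
            return f"select * from {self.model.defer_relation}"
        return None

    @property
    def sql(self) -> Optional[str]:
        # Only populated for SQL models, as in the excerpt.
        if self.model.language == "sql":
            return self.compiled_code
        return None


# Assumed relation name, for illustration only.
relation = '"reporting"."ratio_stats"'
py_ctx = FakeContext(FakeModel("python", relation), which="clone")
sql_ctx = FakeContext(FakeModel("sql", relation), which="clone")

print(py_ctx.sql)            # nothing usable for a Python model
print(py_ctx.compiled_code)  # the select statement the clone view needs
print(sql_ctx.sql)           # SQL models get the same select through `sql`
```

Under these assumptions, a Python model only exposes a usable query through compiled_code, while a SQL model exposes the same query through both properties.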
Expected Behavior
dbt clone should not raise an error when cloning Python models. It should support clone materialization by referencing the compiled_code context object when generating the clone view query rather than the sql context object.
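As a rough illustration of the current and expected behavior, the sketch below interpolates each value into a simplified one-line version of the view statement. The render_clone_view helper and the relation names are hypothetical, and the real macro is a Jinja template with more lines than this, but a None value is likewise rendered as the literal text None.

```python
from typing import Optional


def render_clone_view(view_name: str, body: Optional[str]) -> str:
    """Interpolate the body into the statement; None becomes the text 'None'."""
    return f"create or replace view {view_name} as {body}"


# Values a Python model would see during `dbt clone` (relation names are assumed).
sql = None
compiled_code = "select * from prod.ratio_stats"

current = render_clone_view("dev.ratio_stats", sql)
expected = render_clone_view("dev.ratio_stats", compiled_code)

print(current)   # the statement Athena rejects
print(expected)  # the statement the clone should issue
```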
Steps To Reproduce
1. Set up a dbt config with two targets, dev and prod
2. Define and build a dummy Python model that just runs print("hello world") in the prod target
3. Copy the target/ directory to prod-state/
4. Run dbt clone --state prod-state
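The dummy Python model from the steps above might look something like the sketch below. It assumes that session is a Spark session (so createDataFrame is available) and that the model is materialized as a table; the greeting column name is made up for illustration.

```python
def model(dbt, session):
    # Assumption: materialize the dummy model as a table.
    dbt.config(materialized="table")

    print("hello world")

    # dbt expects a DataFrame to be returned; `session` is assumed to be a Spark session.
    return session.createDataFrame([("hello world",)], ["greeting"])
```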
Additional Context
This particular bug is blocking our use of clone materialization, but I think it also implicates the create_table_csv_upload macro and a few snapshot macros like hive_snapshot_merge_sql, which also reference the sql context variable instead of compiled_code. I see that the docs for Python models explicitly list the lack of snapshot materialization support as a limitation, so I'm wondering if there's a deeper reason why these parts of the codebase haven't yet been transitioned from sql to compiled_code?
Either way, we've tested out switching to compiled_code for clone materialization in our environment and it seems to work, so I'm happy to put up a patch with our changes. I just want to make sure that I'm not barging into a discussion that has been put on the backburner for a good reason.
@jeancochrane it seems that you have found the likely root cause. Feel free to propose a bug fix, ideally covered by functional tests (I can help you with that part).