[Feature] Reflect macro dependencies on model nodes in `depends_on` in `manifest.json` #9982

epapineau · 2024-04-18T21:23:04Z

Is this a new bug in dbt-core?

I believe this is a new bug in dbt-core
I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am not 100% sure this is a 🐛, but it doesn't meet my assumptions of what I'd find in manifest.json:

If I have a macro, defined as:

{%- macro generate_model_sql() -%}
    select
        column_name_a,
        column_name_b
    from {{ ref('some_model') }}
{%- endmacro -%}

I would expect macros.macro.project.generate_model_sql.depends_on to contain: {'node': ['model.project.generate_model_sql']}. However, AFAICT, this depends_on only contains metadata about referenced macros and not referenced nodes / models.

Expected Behavior

Ref'd models to be included in the depends_on object

Steps To Reproduce

Create a macro using ref
dbt parse
Inspect this macro's depends_on value in manifest.json

Relevant log output

No response

Environment

- OS:
- Python:
- dbt: 1.7.5

Which database adapter are you using with dbt?

bigquery

Additional Context

No response

The text was updated successfully, but these errors were encountered:

dbeatty10 · 2024-04-18T21:41:21Z

Heya @epapineau 👋

This looks the same to me as needing to force dependencies like this:

 -- depends_on: {{ ref('some_model') }}

 {{ generate_model_sql() }}

If you don't do the -- depends_on: {{ ref('some_model') }} bit, does your project display the expected lineage graph and run in the correct order?

epapineau · 2024-04-18T22:15:50Z

@dbeatty10 thanks for the suggestion! The issue is not lineage or ordering - that is working as expected. However, after adding the suggested --depends_on and executing dbt parse, the ref'd models are still not present in the depends_on object for the macro in manifest.json.

dbeatty10 · 2024-04-18T23:03:33Z

After touching base with @gshank, I can confirm that it's known behavior that we don't track model refs when they are in a macro. So I'm going to update this to be a feature request.

jtcohen6 · 2024-04-19T10:38:54Z

Correct!

Macros can only depend on other macros — this is part of our spec
However, a model which calls this macro transitively gets dependencies on the model ref'd in the macro

Conceptually, this is a bit similar to "vars referenced in macros are resolved based on the scope (project) of the model calling the macro, rather than the scope (package) where the macro is defined":

[CT-965] [Bug] Variable scoping ignored for macros #5587

epapineau · 2024-04-25T21:02:14Z

@jtcohen6 interesting. Is there an alternative recommendation to understand macro dependencies on models from manifest.json?

jtcohen6 · 2024-05-08T11:35:06Z

@epapineau Thinking aloud - could you add an ephemeral model which calls the macro, and then look at the ephemeral model's depends_on.nodes in manifest.json?

epapineau · 2024-05-08T20:07:53Z

That use case would work for one or two macros, but would quickly become unfeasible at the scale of 100+ macros 🤔

jtcohen6 · 2024-05-09T10:54:26Z

Fair! Okay, taking a step back — why not track model dependencies within macros?

Conceptually, macros are resources in your project, but they aren't actually nodes in the DAG. They're functions that return code, which can call (and depend on) other functions that return code — whereas depends_on.nodes describes actual edges within the DAG. A model is only directly depended on by another DAG node, not by the function called within that DAG node which returns code.

You're allowed to define a macro that references a model that doesn't exist:

{%- macro generate_bad_model_sql() -%}
    select
        column_name_a,
        column_name_b
    from {{ ref('does_not_exist') }}
{%- endmacro -%}

Until you call that macro within an actual model (DAG node), it doesn't pose a problem for constructing a viable DAG. As soon as you do, you'll see a Compilation Error.

It's also tricky to think about: How would we track model → macro dependencies when that macro takes an argument, to modify the reference based on some input or parse-time condition?

{%- macro generate_model_sql(model_name='some_model') -%}
    select
        column_name_a,
        column_name_b
    from {{ ref(model_name) }}
{%- endmacro -%}

I do see the value in tracking which macros depend on ref within their depends_on.macros. I don't think we support that for builtin ref right now (it's not a macro per se), but it does work for custom ref overrides. Not actually recommending this, just to illustrate how you might be able to programmatically determine which macros / how many macros directly reference ref:

{%- macro generate_model_sql() -%}
    select
        column_name_a,
        column_name_b
    from {{ my_project.ref('some_model') }}
{%- endmacro -%}

{%- macro custom_ref() -%}
    {{ return(builtins.ref(*varargs, **kwargs)) }}
{%- endmacro -%}

Then in manifest.json:

    "macro.my_project.generate_model_sql": {
      "name": "generate_model_sql",
      "resource_type": "macro",
      ...
      "depends_on": {
        "macros": [
          "macro.my_project.ref"
        ]
      },

github-actions · 2024-11-06T01:58:45Z

This issue has been marked as Stale because it has been open for 180 days with no activity. If you would like the issue to remain open, please comment on the issue or else it will be closed in 7 days.

github-actions · 2024-11-13T01:59:35Z

Although we are closing this issue as stale, it's not gone forever. Issues can be reopened if there is renewed community interest. Just add a comment to notify the maintainers.

epapineau added bug Something isn't working triage labels Apr 18, 2024

dbeatty10 added awaiting_response and removed triage labels Apr 18, 2024

github-actions bot added triage and removed awaiting_response labels Apr 18, 2024

dbeatty10 added enhancement New feature or request and removed bug Something isn't working labels Apr 18, 2024

dbeatty10 changed the title ~~[Bug] Macro node dependency not reflected in manifest.json~~ [Feature] Reflect macro dependencies on model nodes in depends_on in manifest.json Apr 18, 2024

dbeatty10 added the manifest label Apr 23, 2024

jtcohen6 removed the triage label May 8, 2024

github-actions bot added the stale Issues that have gone stale label Nov 6, 2024

github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] Reflect macro dependencies on model nodes in `depends_on` in `manifest.json` #9982

[Feature] Reflect macro dependencies on model nodes in `depends_on` in `manifest.json` #9982

epapineau commented Apr 18, 2024 •

edited

Loading

dbeatty10 commented Apr 18, 2024

epapineau commented Apr 18, 2024 •

edited

Loading

dbeatty10 commented Apr 18, 2024

jtcohen6 commented Apr 19, 2024

epapineau commented Apr 25, 2024

jtcohen6 commented May 8, 2024

epapineau commented May 8, 2024

jtcohen6 commented May 9, 2024

github-actions bot commented Nov 6, 2024

github-actions bot commented Nov 13, 2024

[Feature] Reflect macro dependencies on model nodes in depends_on in manifest.json #9982

[Feature] Reflect macro dependencies on model nodes in depends_on in manifest.json #9982

Comments

epapineau commented Apr 18, 2024 • edited Loading

Is this a new bug in dbt-core?

Current Behavior

Expected Behavior

Steps To Reproduce

Relevant log output

Environment

Which database adapter are you using with dbt?

Additional Context

dbeatty10 commented Apr 18, 2024

epapineau commented Apr 18, 2024 • edited Loading

dbeatty10 commented Apr 18, 2024

jtcohen6 commented Apr 19, 2024

epapineau commented Apr 25, 2024

jtcohen6 commented May 8, 2024

epapineau commented May 8, 2024

jtcohen6 commented May 9, 2024

github-actions bot commented Nov 6, 2024

github-actions bot commented Nov 13, 2024

[Feature] Reflect macro dependencies on model nodes in `depends_on` in `manifest.json` #9982

[Feature] Reflect macro dependencies on model nodes in `depends_on` in `manifest.json` #9982

epapineau commented Apr 18, 2024 •

edited

Loading

epapineau commented Apr 18, 2024 •

edited

Loading