[CT-2050] Use of "path" and "original_file_path" in nodes is confused #6879

gshank · 2023-02-06T20:30:35Z

The "original_file_path" field identifies the file from which this node was parsed, such as "models/schema.yml" for a schema file, or "models/my_model.sql" for a model. The "path" field, however, means a number of different things depending on where the node came from.

For generic tests, original_file_path lists the schema file, but "path" contains the name of the generic test, e.g. "unique_raw_customers_id.sql". If the same schema file implements the same generic test in multiple places, these tests would overwrite each other in the compiled "target/compiled/<project_name>/models/schema.yml" directory. (I think that at one point these written test names contained the generated unique id name, such as "unique_stg_customers_customer_id.c7614daada", but that seems to have been lost somewhere.)

In addition, files that can produce multiple nodes, such as macros, may have a name in "path" to distinguish the various blocks.

The so-called "path" is used to build the file path for writing out the compiled files (in the "compiled" and "run" directories). The code checks whether "path" is equal to original_file_path; if it isn't (it usually is), it appends "path" to original_file_path to get the path for writing the compiled file.
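To make the behavior described above concrete, here is a minimal sketch. It is not the dbt-core implementation: the function name `compiled_file_path` and its parameters are made up for illustration, and it only covers the "compiled" directory case.

```python
import posixpath


def compiled_file_path(target_dir, project_name, original_file_path, path):
    """Hypothetical helper mirroring the described logic; not dbt-core code."""
    if path == original_file_path:
        # Usual case (e.g. a model): "path" already is the file path.
        relative = original_file_path
    else:
        # Generic tests, named blocks: "path" is appended to the file path.
        relative = posixpath.join(original_file_path, path)
    return posixpath.join(target_dir, "compiled", project_name, relative)


model = compiled_file_path(
    "target", "my_project", "models/my_model.sql", "models/my_model.sql"
)
test_a = compiled_file_path(
    "target", "my_project", "models/schema.yml", "unique_raw_customers_id.sql"
)
# A second generic test with the same "path" from the same schema file
# resolves to the same location, which is the overwrite described above.
test_b = compiled_file_path(
    "target", "my_project", "models/schema.yml", "unique_raw_customers_id.sql"
)
```

Under these assumptions, `test_a` and `test_b` are the same string, so whichever node is written second replaces the first.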

So sometimes the "path" field has the normal path of the file, sometimes it has the filename of a generic test, sometimes it has a block name.

I propose that we change "path" to always have the path of the file, and that we create a new field (path_extra) to contain the necessary extra pieces of the compiled file paths. That way we can more easily construct the "build_path" and "compiled_path" (for example, see "write_node" in ParsedNode).
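A rough sketch of what that proposal might look like follows. Only the field name `path_extra` comes from this issue; the `NodePaths` class, its `compiled_path` method, and the argument names are hypothetical and do not reflect how ParsedNode is actually structured.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodePaths:
    """Hypothetical container for the two proposed fields."""

    path: str  # always the path of the source file
    path_extra: Optional[str] = None  # extra piece(s), e.g. a generic test filename

    def compiled_path(self, target_dir: str, project_name: str) -> str:
        # No comparison against original_file_path is needed any more:
        # the extra piece is appended only when it is present.
        parts = [target_dir, "compiled", project_name, self.path]
        if self.path_extra:
            parts.append(self.path_extra)
        return posixpath.join(*parts)


model = NodePaths(path="models/my_model.sql")
generic_test = NodePaths(
    path="models/schema.yml", path_extra="unique_raw_customers_id.sql"
)
model_out = model.compiled_path("target", "my_project")
test_out = generic_test.compiled_path("target", "my_project")
```

Note that this only separates the two meanings of "path". Avoiding the overwrite between identically named generic tests would still require `path_extra` to be unique per node.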

jtcohen6 · 2023-02-07T10:16:59Z

@gshank Thanks for opening! I'm reading this as good-old-fashioned tech debt: a thing that's currently inconsistent, and ought to be more consistent. I don't see this as blocking for or blocked by any of the work in #6873.

github-actions bot changed the title ~~Use of "path" and "original_file_path" in nodes is confused~~ [CT-2050] Use of "path" and "original_file_path" in nodes is confused Feb 6, 2023

gshank added the Team:Language and tech_debt (Behind-the-scenes changes, with little direct impact on end-user functionality) labels Feb 6, 2023

jtcohen6 added the file_system (How dbt-core interoperates with file systems to read/write data) label Feb 10, 2023

jtcohen6 removed the Team:Language label Jul 19, 2023
