[CT-2050] Use of "path" and "original_file_path" in nodes is confused #6879
Labels
file_system
How dbt-core interoperates with file systems to read/write data
tech_debt
Behind-the-scenes changes, with little direct impact on end-user functionality
The "original_file_path" field identifies the file from which this node was parsed, such as "models/schema.yml" for a schema file, or "models/my_model.sql" for a model. The "path" field, however, means a number of different things depending on where the node came from.
For generic tests, original_file_path lists the schema file, but "path" contains the file name of the generic test, e.g. "unique_raw_customers_id.sql". If the same schema file implements the same generic test in multiple places, these tests would overwrite each other in the compiled "target/compiled/<project_name>/models/schema.yml" directory. (I think that at one point these written test names contained the generated unique id name, such as "unique_stg_customers_customer_id.c7614daada", but that seems to have been lost somewhere.)
In addition, files that can produce multiple nodes, such as macros, may have a name in "path" to distinguish the various blocks.
The so-called "path" is used to build the file path for writing out the compiled files (in the "compiled" and "run" directories). The code checks whether "path" is equal to original_file_path; if it isn't (it usually is), it appends "path" to original_file_path to get the path for writing the compiled file.
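To make the behavior described above concrete, here is a minimal sketch of that logic. It is not the actual dbt-core code: the function name `compiled_write_path`, its parameters, and the project name `jaffle_shop` are hypothetical and only illustrate the comparison-then-append step.

```python
import posixpath


def compiled_write_path(target_dir, project_name, original_file_path, path):
    """Hypothetical illustration of the current behavior, not dbt-core's code."""
    if path == original_file_path:
        # Usual case (e.g. a model): "path" already is the file path.
        relative = original_file_path
    else:
        # Generic test or named block: "path" is appended to the
        # original file path, which then acts like a directory.
        relative = posixpath.join(original_file_path, path)
    return posixpath.join(target_dir, "compiled", project_name, relative)


# A model: both fields agree, so the compiled file mirrors the source file.
model_out = compiled_write_path(
    "target", "jaffle_shop", "models/my_model.sql", "models/my_model.sql"
)

# Two generic tests with the same "path" defined in the same schema file
# resolve to the same output location, so one overwrites the other.
test_a = compiled_write_path(
    "target", "jaffle_shop", "models/schema.yml", "unique_raw_customers_id.sql"
)
test_b = compiled_write_path(
    "target", "jaffle_shop", "models/schema.yml", "unique_raw_customers_id.sql"
)
```

Under these assumptions, `test_a` and `test_b` are identical strings, which is the overwrite problem described for generic tests.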
So sometimes the "path" field has the normal path of the file, sometimes it has the filename of a generic test, sometimes it has a block name.
I propose that we change "path" to always have the path of the file, and that we create a new field (path_extra) to contain the necessary extra pieces of the compiled file paths. That way we can more easily construct the "build_path" and "compiled_path" (for example, see "write_node" in ParsedNode).
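A rough sketch of how the proposal could look, under stated assumptions: only the field name path_extra comes from the proposal itself; the `NodeSketch` class, the `compiled_path` helper, and the choice to put the unique-id-style name in path_extra are hypothetical, not a design that dbt-core has adopted.

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeSketch:
    """Hypothetical stand-in for a parsed node; not dbt-core's ParsedNode."""

    original_file_path: str
    path: str  # proposal: always the path of the file
    path_extra: Optional[str] = None  # proposal: extra piece for compiled paths


def compiled_path(node, target_dir, project_name):
    # No comparison against original_file_path is needed any more:
    # the extra piece is appended only when it is present.
    parts = [target_dir, "compiled", project_name, node.path]
    if node.path_extra:
        parts.append(node.path_extra)
    return posixpath.join(*parts)


model = NodeSketch("models/my_model.sql", "models/my_model.sql")
generic_test = NodeSketch(
    "models/schema.yml",
    "models/schema.yml",
    # Assumption: a distinguishing name keeps repeated tests from colliding.
    path_extra="unique_stg_customers_customer_id.c7614daada.sql",
)

model_compiled = compiled_path(model, "target", "jaffle_shop")
test_compiled = compiled_path(generic_test, "target", "jaffle_shop")
```

With this split, "path" means the same thing for every node type, and whatever distinguishes multiple nodes from one file lives in a single, clearly named field.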