[Feature] Add functionality to only run tests #1279

luis-fnogueira · 2024-10-23T16:59:17Z

Description

Hi y'all!

I'd like to periodically run only the dbt tests of my project with Cosmos, using an approach similar to:


        render_config=RenderConfig(
            test_behavior=TestBehavior.ONLY_TESTS,
        ),

Would that be possible in a future release? Or is there already a workaround to do that? I could not find one in the documentation.

Use case/motivation

I'd like to periodically receive an alert on Slack summarizing the status of my dbt tests.
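As a rough illustration of the kind of summary meant here, the sketch below counts test outcomes from a parsed `run_results.json`, the artifact dbt writes to the project's `target/` directory after an invocation. The function name `summarize_dbt_tests` and the sample data are assumptions made for this example, and sending the resulting text to Slack is left out.

```python
from collections import Counter


def summarize_dbt_tests(run_results: dict) -> str:
    """Build a one-line summary of test outcomes from a parsed run_results.json."""
    counts = Counter(
        result["status"]
        for result in run_results.get("results", [])
        # dbt prefixes the unique_id of test nodes with "test."
        if result["unique_id"].startswith("test.")
    )
    total = sum(counts.values())
    if not total:
        return "dbt tests: none found"
    parts = ", ".join(f"{status}={n}" for status, n in sorted(counts.items()))
    return f"dbt tests: {total} total ({parts})"


# Illustrative data only, shaped like dbt's run_results.json artifact.
sample = {
    "results": [
        {"unique_id": "model.my_dbt_project.model_a", "status": "success"},
        {"unique_id": "test.my_dbt_project.unique_model_a_id", "status": "pass"},
        {"unique_id": "test.my_dbt_project.not_null_combined_model_id", "status": "fail"},
    ]
}
print(summarize_dbt_tests(sample))
```

In a real project the dictionary would come from `json.load` on the artifact file rather than from inline sample data.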

Related issues

No response

Are you willing to submit a PR?

Yes, I am willing to submit a PR!


tatiana · 2024-10-29T13:07:15Z

@luis-fnogueira This feels like a duplicate of #1242 and potentially related to #959.

Would it be an option to use the DbtTestLocalOperator, or an equivalent, directly to run tests only? Would this solve your use-case?

The operator is defined in astronomer-cosmos/cosmos/operators/local.py (line 720 in 4c9b28f):

    class DbtTestLocalOperator(DbtTestMixin, DbtLocalBaseOperator):

As illustrated in:

astronomer-cosmos/tests/operators/test_local.py

Lines 458 to 465 in 4c9b28f

    
    test_operator = DbtTestLocalOperator(
        profile_config=real_profile_config,
        project_dir=DBT_PROJ_DIR,
        task_id="test",
        dbt_cmd_flags=["--models", "stg_customers"],
        install_deps=True,
        append_env=True,
    )
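A minimal sketch of how the operator in the excerpt above might be scheduled on its own, so that a DAG runs only `dbt test`. The helper name `build_tests_only_dag`, the DAG id, and the `@daily` schedule are illustrative assumptions, not part of Cosmos; the operator arguments mirror those in the excerpt. The imports are deferred so the snippet can be loaded without Airflow or Cosmos installed, and the `schedule` argument assumes Airflow 2.4 or later.

```python
from datetime import datetime


def build_tests_only_dag(project_dir, profile_config, schedule="@daily"):
    """Return a DAG whose single task runs `dbt test` (hypothetical helper)."""
    # Deferred imports: both packages are only needed when the DAG is built.
    from airflow import DAG
    from cosmos.operators.local import DbtTestLocalOperator

    with DAG(
        dag_id="dbt_tests_only",  # illustrative name
        start_date=datetime(2024, 1, 1),
        schedule=schedule,  # assumes Airflow >= 2.4
        catchup=False,
    ) as dag:
        DbtTestLocalOperator(
            task_id="dbt_test",
            project_dir=project_dir,
            profile_config=profile_config,
            install_deps=True,
        )
    return dag
```

In a DAG file, the result would need to be assigned to a module-level variable (for example `dag = build_tests_only_dag(...)`) so that Airflow can discover it.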

github-actions · 2024-11-29T11:04:05Z

This issue is stale because it has been open for 30 days with no activity.

luis-fnogueira · 2024-12-02T12:57:01Z

DbtTestLocalOperator

Hi! I'll have to test it, since I couldn't find documentation for it at https://astronomer.github.io/astronomer-cosmos/. Thanks a lot for the suggestion!

dosubot · 2024-12-02T12:57:08Z

Thank you for closing the issue, luis-fnogueira! We appreciate your contribution to the Cosmos project.

If these two circumstances are met:
1. The dbt project has tests that rely on multiple parent models; and
2. The `DbtDag` or `DbtTaskGroup` uses `TestBehavior.AFTER_EACH` (default) or `TestBehavior.BUILD`

Cosmos 1.8.0 and previous versions would attempt to run the same test multiple times, once after each parent model run, likely failing if any of the parents hadn't been run yet.

This PR aims to fix this behaviour by not running tests with multiple dependencies within each task group / build task, and by adding those tests to run only once, after all parents have run.

# Related issues

Closes: #978
Closes: #1365

This change also sets the ground for adding support to tests that don't have any dependencies, a problem discussed in the following tickets:
* #959
* #1242
* #1279

# How to reproduce

There are two steps to reproduce this problem:
1. To create a representative dbt project
2. To create a Cosmos `DbtDag` that uses this dbt project to reproduce the original problem

## Representative dbt project

We created a dbt project named `multiple_parents_test` that has a test called `custom_test_combined_model` that depends on two models:
* combined_model
* model_a

The expectation from a user perspective is that, since `combined_model` depends on `model_a`, the `custom_test_combined_model` test will be run only once, after both models have run.

Definition of the test:

```
{% test custom_test_combined_model(model) %}
WITH source_data AS (
    SELECT id FROM {{ ref('model_a') }}
),
combined_data AS (
    SELECT id FROM {{ model }}
)
SELECT s.id
FROM source_data s
LEFT JOIN combined_data c ON s.id = c.id
WHERE c.id IS NULL
{% endtest %}
```

By running the following `dbt build` command, we confirm that the test depends on both models:

```
dbt build --select "+custom_test_combined_model_combined_model_"
11:59:29 Running with dbt=1.8.2
11:59:29 Registered adapter: postgres=1.8.1
11:59:29 Found 3 models, 6 data tests, 414 macros
11:59:29
11:59:30 Concurrency: 4 threads (target='dev')
11:59:30
11:59:30 1 of 9 START sql view model public.model_a ..................................... [RUN]
11:59:30 2 of 9 START sql view model public.model_b ..................................... [RUN]
11:59:30 1 of 9 OK created sql view model public.model_a ................................ [CREATE VIEW in 0.18s]
11:59:30 2 of 9 OK created sql view model public.model_b ................................ [CREATE VIEW in 0.18s]
11:59:30 3 of 9 START test unique_model_a_id ............................................ [RUN]
11:59:30 4 of 9 START test unique_model_b_id ............................................ [RUN]
11:59:30 4 of 9 PASS unique_model_b_id .................................................. [PASS in 0.05s]
11:59:30 3 of 9 PASS unique_model_a_id .................................................. [PASS in 0.06s]
11:59:30 5 of 9 START sql view model public.combined_model .............................. [RUN]
11:59:30 5 of 9 OK created sql view model public.combined_model ......................... [CREATE VIEW in 0.03s]
11:59:30 6 of 9 START test custom_test_combined_model_combined_model_ ................... [RUN]
11:59:30 7 of 9 START test not_null_combined_model_created_at ........................... [RUN]
11:59:30 8 of 9 START test not_null_combined_model_id ................................... [RUN]
11:59:30 9 of 9 START test not_null_combined_model_name ................................. [RUN]
11:59:30 7 of 9 PASS not_null_combined_model_created_at ................................. [PASS in 0.07s]
11:59:30 9 of 9 PASS not_null_combined_model_name ....................................... [PASS in 0.07s]
11:59:30 8 of 9 PASS not_null_combined_model_id ......................................... [PASS in 0.07s]
11:59:30 6 of 9 PASS custom_test_combined_model_combined_model_ ......................... [PASS in 0.08s]
11:59:30
11:59:30 Finished running 3 view models, 6 data tests in 0 hours 0 minutes and 0.50 seconds (0.50s).
11:59:30
11:59:30 Completed successfully
11:59:30
11:59:30 Done. PASS=9 WARN=0 ERROR=0 SKIP=0 TOTAL=9
```

This is what the pipeline topology looks like:

<img width="1020" alt="Screenshot 2024-12-27 at 11 39 31" src="https://github.com/user-attachments/assets/d8a8e628-2fd7-4959-b13f-3d289e7250ed" />

The source code structure for this dbt project:

```
├── dbt_project.yml
├── macros
│   └── custom_test_combined_model.sql
├── models
│   ├── combined_model.sql
│   ├── model_a.sql
│   ├── model_b.sql
│   └── schema.yml
└── profiles.yml
```

When running `dbt ls`, it displays:

```
dbt ls
11:40:58 Running with dbt=1.8.2
11:40:58 Registered adapter: postgres=1.8.1
11:40:58 Unable to do partial parsing because saved manifest not found. Starting full parse.
11:40:59 [WARNING]: Deprecated functionality
The `tests` config has been renamed to `data_tests`. Please see https://docs.getdbt.com/docs/build/data-tests#new-data_tests-syntax for more information.
11:40:59 Found 3 models, 6 data tests, 414 macros
my_dbt_project.combined_model
my_dbt_project.model_a
my_dbt_project.model_b
my_dbt_project.custom_test_combined_model_combined_model_
my_dbt_project.not_null_combined_model_created_at
my_dbt_project.not_null_combined_model_id
my_dbt_project.not_null_combined_model_name
my_dbt_project.unique_model_a_id
my_dbt_project.unique_model_b_id
```

## Behavior in Cosmos

The DAG `example_multiple_parents_test` uses this new dbt project:

```
import os
from datetime import datetime
from pathlib import Path

from cosmos import DbtDag, ProfileConfig, ProjectConfig
from cosmos.profiles import PostgresUserPasswordProfileMapping

DEFAULT_DBT_ROOT_PATH = Path(__file__).parent / "dbt"
DBT_ROOT_PATH = Path(os.getenv("DBT_ROOT_PATH", DEFAULT_DBT_ROOT_PATH))

profile_config = ProfileConfig(
    profile_name="default",
    target_name="dev",
    profile_mapping=PostgresUserPasswordProfileMapping(
        conn_id="example_conn",
        profile_args={"schema": "public"},
        disable_event_tracking=True,
    ),
)

example_multiple_parents_test = DbtDag(
    # dbt/cosmos-specific parameters
    project_config=ProjectConfig(
        DBT_ROOT_PATH / "multiple_parents_test",
    ),
    profile_config=profile_config,
    # normal dag parameters
    start_date=datetime(2023, 1, 1),
    dag_id="example_multiple_parents_test",
)
```

When trying to run it using:

```
airflow dags test example_multiple_parents_test
```

Users face the original error because Cosmos attempts to run the test after `model_a` has run but before `combined_model` has run:

<img width="861" alt="Screenshot 2024-12-27 at 12 10 36" src="https://github.com/user-attachments/assets/33ea7b71-ba49-4418-b194-4d3590fff1b8" />

Excerpt from the logs of the failing task:

```
[2024-12-27T12:07:33.564+0000] {taskinstance.py:2905} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/venvpy39/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 465, in _execute_task
    result = _execute_callable(context=context, **execute_callable_kwargs)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/venvpy39/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 432, in _execute_callable
    return execute_callable(context=context, **execute_callable_kwargs)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/venvpy39/lib/python3.9/site-packages/airflow/models/baseoperator.py", line 401, in wrapper
    return func(self, *args, **kwargs)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/operators/local.py", line 796, in execute
    result = self.build_and_run_cmd(context=context, cmd_flags=self.add_cmd_flags())
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/operators/local.py", line 654, in build_and_run_cmd
    result = self.run_command(cmd=dbt_cmd, env=env, context=context)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/operators/local.py", line 509, in run_command
    self.handle_exception(result)
  File "/Users/tati/Code/cosmos-clean/astronomer-cosmos/cosmos/operators/local.py", line 237, in handle_exception_dbt_runner
    raise AirflowException(f"dbt invocation completed with errors: {error_message}")
airflow.exceptions.AirflowException: dbt invocation completed with errors: custom_test_combined_model_combined_model_: Database Error in test custom_test_combined_model_combined_model_ (models/schema.yml)
  relation "public.combined_model" does not exist
  LINE 12: SELECT id FROM "postgres"."public"."combined_model"
  ^
  compiled Code at target/run/my_dbt_project/models/schema.yml/custom_test_combined_model_combined_model_.sql
```

## Behaviour after this change

With this change, running the DAG mentioned above results in:

<img width="1264" alt="Screenshot 2024-12-27 at 15 44 17" src="https://github.com/user-attachments/assets/e0395a4d-dbae-4b63-a3c3-69ca79ad0b04" />

And it can successfully be run.

## Breaking Change?

This PR slightly changes the behaviour of Cosmos DAG rendering when using `TestBehavior.AFTER_EACH` or `TestBehavior.BUILD` when there are tests with multiple parents. Some may consider it a breaking change, but a bug fix is a better classification, since Cosmos did not support rendering many dbt projects that met these circumstances.

The behaviour change in those cases is that we're isolating tests that depend on multiple parents and running them outside of the `TestBehavior.AFTER_EACH` dbt node Cosmos TaskGroup or the `TestBehavior.BUILD` task. This change will likely highlight any tests that depended on multiple models and were not failing previously, but were running as part of the tests of both models.
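The behaviour change described above can be pictured with a small, self-contained sketch. This is not the Cosmos implementation: the function `split_tests_by_parents` and its input shape (a mapping from test name to parent models) are assumptions made for illustration, using names from the example dbt project.

```python
def split_tests_by_parents(
    tests: dict[str, list[str]],
) -> tuple[dict[str, list[str]], dict[str, list[str]]]:
    """Separate tests with exactly one parent from all other tests.

    `tests` maps a test name to the models it depends on.
    """
    attached: dict[str, list[str]] = {}  # model -> tests run right after it
    detached: dict[str, list[str]] = {}  # test -> parents it must wait for
    for test, parents in tests.items():
        if len(parents) == 1:
            attached.setdefault(parents[0], []).append(test)
        else:
            # Multi-parent tests run once, after all parents have finished.
            # Tests with no parent also land here in this sketch.
            detached[test] = sorted(parents)
    return attached, detached


# Dependencies taken from the example project described above.
tests = {
    "unique_model_a_id": ["model_a"],
    "unique_model_b_id": ["model_b"],
    "not_null_combined_model_id": ["combined_model"],
    "custom_test_combined_model_combined_model_": ["combined_model", "model_a"],
}
attached, detached = split_tests_by_parents(tests)
print(attached)
print(detached)
```

Here only `custom_test_combined_model_combined_model_` ends up detached, which matches the intent that it runs a single time once both `model_a` and `combined_model` exist.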

luis-fnogueira added the enhancement and triage-needed labels Oct 23, 2024

dosubot bot added the area:rendering and dbt:test labels Oct 23, 2024

tatiana mentioned this issue Oct 29, 2024

Support rendering test tasks even when they are not attached to models/seeds/snapshots #959

Open

github-actions bot added the stale label Nov 29, 2024

luis-fnogueira closed this as not planned Dec 2, 2024

dosubot bot removed the stale label Dec 2, 2024

tatiana reopened this Dec 27, 2024

tatiana mentioned this issue Dec 27, 2024

Fix rendering dbt tests with multiple parents #1433

Merged
