Test Workflows Shared

A template test workflows repository that works with CPG Flow

Background • Key Features • How To Use • Editing in an IDE • Related • License

Background

The test_workflows_shared repository is a dedicated testing space for both cpg_flow developers and pipeline developers. It supports manual, integrated end-to-end (E2E) validation of the cpg_flow package in a production-like environment. By interfacing with Metamist and using a cohort from the fewgenomes project, the repository lets new builds and modifications be tested before deployment.

For pipeline developers who are new to cpg_flow, this repository provides a practical trial workflow, offering a hands-on introduction to its core functionalities and best practices. This dual-purpose approach not only supports continuous improvement of cpg_flow but also accelerates onboarding and skill development for new contributors.

Beyond its primary focus on testing, the repository promotes standardization through:

  • Enforcement of consistent naming conventions aligned with CPG standards.
  • Automated package and dependency updates using Renovate.
  • Dependency management facilitated by uv.

By combining testing capabilities with a standardised development framework, test_workflows_shared supports high-quality pipeline development and a consistent developer experience.

Key Features

  • Uses uv to manage dependencies
  • Uses Renovate for package upgrades
  • Uses analysis-runner to run the test workflow
  • The jobs and stages are defined in separate files:
    • The cpg_flow_test/jobs/ directory contains the job definitions that can be reused across stages.
    • The cpg_flow_test/stages.py file contains the stage definitions, which call the jobs.
  • The cpg_flow_test/workflow.py file contains the test workflow definition.
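The separation of jobs, stages, and workflow described above can be illustrated with a minimal, self-contained sketch. It deliberately does not use the cpg_flow API: the `Stage` class, the job functions, and `run_workflow` are all hypothetical names chosen for illustration, and the real files in cpg_flow_test/ will look different.

```python
from dataclasses import dataclass, field
from typing import Callable


# --- jobs/: small, reusable units of work (hypothetical examples) ---
def count_lines_job(text: str) -> int:
    """Count the non-empty lines in a block of text."""
    return sum(1 for line in text.splitlines() if line.strip())


def total_job(counts: list[int]) -> int:
    """Add up a list of per-input counts."""
    return sum(counts)


# --- stages.py: stages decide which jobs to call and on what data ---
@dataclass
class Stage:
    name: str
    run: Callable[[dict], object]
    requires: list[str] = field(default_factory=list)


def count_stage(results: dict) -> list[int]:
    # Reuses the same job once per input.
    return [count_lines_job(text) for text in results["inputs"]]


def summarise_stage(results: dict) -> int:
    # Consumes the output of the "count" stage.
    return total_job(results["count"])


STAGES = [
    Stage("count", count_stage),
    Stage("summarise", summarise_stage, requires=["count"]),
]


# --- workflow.py: the workflow wires the stages together in order ---
def run_workflow(inputs: list[str]) -> dict:
    results: dict = {"inputs": inputs}
    for stage in STAGES:
        missing = [name for name in stage.requires if name not in results]
        if missing:
            raise RuntimeError(f"{stage.name} is missing upstream stages: {missing}")
        results[stage.name] = stage.run(results)
    return results
```

Keeping jobs free of any knowledge about stages is what makes them reusable: a second stage could call `count_lines_job` on different inputs without any change to the job itself.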

How To Use

From your command line:

# Clone this repository
$ git clone https://github.com/populationgenomics/test_workflows_shared

# Go into the repository
$ cd test_workflows_shared

# Go to the test folder
$ cd cpg_flow_test

# Make the run script executable
$ chmod +x run-test-workflow.sh

# Run the test workflow against a cpg_flow image.
# See the notes below on how to find a valid path/tag.
# The default path is australia-southeast1-docker.pkg.dev/cpg-common/images/cpg_flow:0.1.0-alpha.14
$ ./run-test-workflow.sh --image "australia-southeast1-docker.pkg.dev/cpg-common/images/cpg_flow:<tag_id>"

If the job is created successfully, the analysis-runner output includes a job URL. This driver job triggers additional jobs, which can be monitored on the /batches page in Hail Batch. When all expected jobs complete without errors, the test workflow has run successfully and the cpg_flow package is functioning as intended.

Notes

  • You will need analysis-runner installed in your environment. See the analysis-runner repository for more information, or install it with pipx install analysis-runner.

  • Testing with Different Image Tags: Running the pipeline against different tags of the cpg_flow image is useful for validating functionality that has not yet been merged in the cpg_flow repository. When you want a stable baseline instead, use a recent release tag.

  • Finding a Valid Tag: A valid tag can be obtained from the most recent cpg-flow Docker workflow runs. Look under the print docker tag job of the workflow. Be mindful of the distinction between images (stable) and images-tmp (test images, pruned fortnightly).
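As a rough illustration of how a tag and the images / images-tmp distinction combine into the value passed to --image, here is a small hypothetical helper. The function names are invented for this sketch, and it assumes that images-tmp uses the same path layout as the default images path shown above, which should be checked against the workflow output.

```python
REGISTRY = "australia-southeast1-docker.pkg.dev/cpg-common"
REPOSITORIES = ("images", "images-tmp")  # stable vs. test images


def cpg_flow_image(tag: str, repository: str = "images") -> str:
    """Build the full image path for a cpg_flow tag (hypothetical helper)."""
    if repository not in REPOSITORIES:
        raise ValueError(f"repository must be one of {REPOSITORIES}")
    if not tag or ":" in tag or any(ch.isspace() for ch in tag):
        raise ValueError(f"not a usable tag: {tag!r}")
    # Assumption: images-tmp mirrors the layout of the images path.
    return f"{REGISTRY}/{repository}/cpg_flow:{tag}"


def run_command(tag: str, repository: str = "images") -> list[str]:
    """Argument list for running the test script with the chosen image."""
    return ["./run-test-workflow.sh", "--image", cpg_flow_image(tag, repository)]
```

For example, `cpg_flow_image("0.1.0-alpha.14")` reproduces the default path quoted in the usage instructions, and the list from `run_command` can be passed to `subprocess.run` from inside the cpg_flow_test directory.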

Editing in an IDE

To enable syntax highlighting in your IDE, you will need to install dependencies.

# Install dependencies
# `uv` documentation: https://docs.astral.sh/uv/
$ uv sync

# Activate the virtual environment
$ source .venv/bin/activate

Related

cpg-flow - supports various stages of genomic data processing, from raw data ingestion to final analysis outputs, making it easier for researchers to manage and scale their population genomics workflows.

License

MIT
