* One workflow for align&qc and variant calling (#428)
* merge qc and variant calling for tga/wes into one
* delete Alignment.smk and VariantCalling.smk snakefiles
* Add RuleException and WorkflowError exceptions
* include before final rule
* fix typo
* fix typo
* include rules before delivery
* remove VariantCalling_sentieon.smk and merge with balsamic.smk
* one sv caller set to rule for all workflows
* sequencing_type key fixed
* add sentieon callers exclusive to wes and tga
* remove redundant addition
* remove vardict single from wgs single
* syntax Watson, syntax!
* analysis specific result for wes and tga
* extend variant calling rule list
* move env vars into try..except, move all align and qc rules to their path
* Pathlib for rule directory
* remove rule directory from config
* code style
* move sentieon variant call rules next to others
* remove unused cutadapt path
* common qc_rules and some style fixes
* add fastqc to multiqc.rule
* add fastqc to multiqc.rule
* codefactor issues
* remove unused import and function
* Features: Adds qc rules to UMI workflow (#431)
* changed umiextract aligned output sam->bam
* added wildcards {step} to variantcalling and vep rules
* added qc rules for UMIworkflow
* add sentieon license to config
* added QC rules for umi-workflow
* update threads for umi-rules in cluster.json
* small fixes to umi rules
* fix redundant lines and alignment
* small fixes to rules and setting sentieon env variables
* add densityplot script directly to qc rule
* remove previous assignment of sentieon variables
* add umi tag to picard hs metrics
* fixed errors causing workflow fail in qc rules
* Add gnomad genome to GenerateRef.smk (#432)
* Add gnomad genome to GenerateRef.smk
* gnomad hg38 and add to models
* correct cli utils
* add gnomad to valid downloadble list
* Correct genome version VEP
* refactor urls
* code format
* more format
* revert import change
* Refactor/disable multiple variant callers (#433)
* disable multiple variant callers
* fill paragraph
* typo fix!
* Refactor rules: align rules (#434)
* refactor bwa_mem align
* str
* Path it!
* clean up imports
* replace sentieon install dir with sentieon exec in config
* proper path
* Path for missed path, move picard markdup to bwa-mem rule
* str concat to comma
* Fix/metric mkdups (#436)
* proper new line
* remove white space
* refactor imports (#437)
* refactor imports
* cleanup imports from rules
* refactor reference, remove run reference (#438)
* refactor reference, remove run reference
* rename GenerateRef.smk to reference.smk, add genome ver to final json
* missing import
* cleanup shebang!
* syntax fix
* Create new cli-group
* Create new cli-group
* cleanup tests
* always one cores for reference generation
* more clean up on tests
* formatting
* separate directories for hg19 and hg38
* quiet reference!
* proper structure for tests
* quiet mode decouple from sm_opt
* put it back!
* import it!
* always append cores number to workflow
* fstring instead of format
* formatting
* update docs for new reference generation
* remove redundant genome version path
* test --quiet is in shell command
* formatting!
* remove redundant ".log" extension
* Refactor/vardict muect2 resources (#440)
* sort cluster.json by key name, and rename sambamba rules to match cluster.json
* __default__ and all rule on top of json
* increase max memory for VarDictJava and Mutect2
* Add threads to vardict tumor and tumor-normal
* add vcfanno (#446)
* Refactor/no gunzip gnomad, keep az bgzip (#445)
* add gnomad tbi and no gunzip, keep original file
* add gnomad_tbi to list of downloads
* add gnomad_tbi to models
* outfiles in rule all
* index file to rule all
* redundant gzip
* get files based on gzip status
* only get gzip True vcfs
* indexable vcf files
* some cleanup
* remove trailing commas
* annotate with gnomad_genome (#462)
* formating, convert shell to triple quote
* add vcfanno.toml to MANIFEST.in and setup.py
* vcfanno config to constants.py
* extend varcaller filter with pop_freq
* doc prettify
* add vcfanno annotate to vep
* add pop freq to constants.py
* add dummy gnomad.genomes.r2.1.1.sites.vcf.bgz reference file
* fix tests
* extend vardict_tumor_only filter
* Feat/init container (#464)
* placeholder to initialize container
* init container
* add to init base
* show default
* image name to download
* formatting
* default name with a v!
* path for image
* comma for /
* resolve properly
* download container
* remove extra v
* add force download, and some log
* remove redundant log
* add dry run mode
* fix condition for dry run
* construct cmd for singularity pull
* formatting
* docker path to consants
* str concat to format
* log message fix
* update some comment
* test for init container
* test force download command
* add pycharm .idea to gitignore
* test dummy tag
* mock exit code
* capture raise in a test
* move cmd out, change to LOG.error
* caplog and mock check_output
* formatting
* clean up tests
* some error raise!
* mock checkoutput with sideeffect
* fix some formatting
* update manta to 1.6.0 (#470)
* decouple develop dockerfile from master
* Feat/filterconsensus called reads (#469)
* added feature for filtering consensus reads
* Feat:refactor umirules (#477)
* fixed styling and added params as constants in umi rules
* fixed threads for UMI workflow rules
* update umiworkflow constants
* fixed model classes for UMI constants
* fixed test models for UMI constants
* updated umimodel class names
* fix vep default params to constants
* fixed rule and params names
* fixed threads for qc rules
* added vep params as constants
* update rulenames with proper function names
* fix missing import in test models
* update requirements for plotting densityplots
* format old code from old PRs (#481)
* increase cnvkit queue time request from 8hour, 10 core to 10hour 10 core (#482)
* Refactor/pip freeze version (#480)
* freeze version
* minimum version
* format old code from old PRs
* crush redundant variables in rules and have one in the main workflow (#471)
* crush redundant variables in rules and have one in the main workflow
* add missing import
* normal sample only if it is paired
* get panel if not wgs
* add fasta and refflat to cnvkit paired
* remove all redundant imports, and move them to main workflow
* move more rules
* Feat/refactor densityplot script (#483)
* move density plot related lines to python script
* python script to plot density
* add tests for density plot functions
* test files related to densityplots
* update seaborn version
* fix docstrings
* lock seaborn version
* fix correct variable names
* fix code smell for test_result
Co-authored-by: Hassan Foroughi <[email protected]>
* formatting
* Disable UMI trim option for WGS (#486)
* do not trim umi for WGS
* sample name in params
* quote variable
* proper logic
* remove condition from bash and always disable UMI for WGS.
* remove redundant logic on disabling umi for WGS
* DRAGEN addition (#488)
* first commit dragen rule
* add dragen option, update snakemake class, update dragen dna rule, add resources for cluster.json
* add partition for scheduler.py
* missing parition attribute
* missing parition attribute
* keep sm config key-val list until the end
* typo
* some tests
* quote! two of them!
* add dragen to workflow
* prepend to snakemake config k-v
* include dragen_dna.rule in workflows
* dragen in its own path for results
* correct path in analysis specific results
* rename partition from dragen to cg-dragen
* create reference, create tmpdirs, and run varcall
* use BALSAMIC's reference file for DRAGEN
* dragen vc enable, and relocate result path
* correct bam path
* remove gatk3-register command (#496)
* remove gatk3-register command
* remove gatk3 resources
* do not check sacct/jobid file if in local mode (#497)
* Feat/add binding paths (#498)
* add assets and background file bindpaths
* add scripts balsamic varaible
* formatting fix
* Update UMIworkflow.smk
Co-authored-by: Ashwini Jeggari <[email protected]>
Co-authored-by: hassanfa <[email protected]>
Co-authored-by: Hassan Foroughi <[email protected]>
* Feat/add tests umiworkflow (#502)
* tests for background variant file check
* tests for balsamic run analysis with analysis-type umi
* fixture to create umi config file with background variant file
* tests to run umi workflow using sample config
* additional tests files for umi workflow runs
* remove redundant test_reference variable
* fix tumor only umi dag success
* assign initial value before assigning background file
* small fix
* change test_reference to reference
* Refactor/umiworkflow rules (#503)
* fix sentieon exec path
* change outputs to consensusfilter instead of consensalign
* UMIworkflow.smk
* added values for HSMETRICS_QC_CHECK (#401)
* added values for HSMETRICS_QC_CHECK
* new structure with panel name as keys
* new script for qc check
* added typehint and Docstring
* edited docstring
* edited type hinting for df
* changed the check_qc_criteria func with new conditions for passing criteria
* new function to print if the QC failed
* new function to provide csv-file with the QC critera
* removed unnecessary variables, added extra click option, renamed function
* Added new script to check_qc_criteria() and edited main()
* removed print
* added empty lines to make the script more readable.
* added METRIC_CRITERIA
* removed unnecessary modules and made minor laytout changes
* modified curly brackets
* added unit test
* .idea
* added whitespaces
* changed read_qc_table func to import the qc table from constants. Removed click function for qc_table Changed main functionn
* changed type hint for read_qc_table function
* remove pycharm files
* added "_hg19_design.bed" extension
* Rewrote get_bait_name function Edited main() Added required modules
* Added new function get_sample_name to extract sample names from confing Changed main(), failed_qc(), and check_qc_criteria() to include sample names returned from get_sample_name() Removed unnecessary variables
* edited doc string for several functions
* Removed main() and click and added new get_qc_check() Removed unnecessary modules
* removed unused lines
* format constants.py and qc_check.py. Move test_qc_check to tests directory
* added pandas
* added file for testing
* rewrote the tests
* added new test function to increase the cov
* changed variable name
* added new function test_get_sample_name
* added new test function test_get_qc_criteria
* added new test func test_check_qc_criteria
* merged two test func to one, added extra test to test_check_qc_criteria_and_output_csv():
* changed failed_qc() to return qc status as value
* edited test_check_qc_criteria_output_csv_and_qc() to test three functions to increase cov
* yapf formated
* yapf formated
* changed config-file name and modified test_read_qc_table()
* modifed test_get_bait_and_sample_name() to include scope fixure
* modifed function to include Path module
* edited with yapf
* added extra comment
* Removed extra comment
* mimic correct panel name
Co-authored-by: Hassan Foroughi <[email protected]>
Co-authored-by: hassanfa <[email protected]>
* Refactor analysis/scheduler (#491)
* move scheduler.py to utils
* minor reformating
* add quiet option to run analysis.py
* capture early if Sentieon executable is missing
* better log formatting
* properly handle env variables and existing condition for sentieon
* mock env path
* remove env variables from travis
* formatting
* some formtting
* mocks extended
* formatting
* remove commented out code
* remove commented out code
* os.environ mock constant
* os.environ mock constant
* do something in except claus
* do something in except claus
* formatting fix
* missing fixtures
* some formatting
* correct call!
* formatting
* Feat/disable var caller report (#439)
* disable variant caller to status and report
* import VCF_DICT
* add config as dict instead of string (key=value)
* Remove disable_var_caller from status report. Snakemake API doesn't like config dicts
* add tmp files for normal/tumor bam to check missing
* add tmp files for normal/tumor bam to check missing
* test disable variant caller
* Decouple containers into their own definition (#511)
* decoupled and tested containers
* simply dockerfile a bit, and remove old ones
* align_qc container
* some action
* restore old containers
* some name calling
* Update docker.yml
* Update docker.yml
* add missing conda file
* buildit!
* typo
* only on push
* correct path
* move dockerfile to root
* tipsy clean
* long option format
* cnvkit env
* remove pip
* some bash script corrections
* proper env path
* activate env before pip
* proper conda env path
* update picarad version to 2.23.8
* use dockerhub action
* correct secrects
* correct head name
* get_branch name id
* two for build push and build test
* string for master/develop
* ignore m&d
* rename action
* don't fail non-master/develop
* Update docker_build_push.yml
* Remove Strelka and Mutect2 from somatic variant calling (#513)
* remove strelka and mutect2
* coveralls
* stepS
* test only for certain path
* refactor env
* remove cython
* coveralls version unlock
* remove --user from pip install
* remove unused run
* pip as module
* remove activate
* put activate back
* good old source activate instead of conda
* login bash
* proper py.test
* test help
* get germline dynamic
* remove unused lines
* replace coveralls with codecov
* codecov simpler
* Refactor/umi wildcards (#514)
* change of wildcard {sample} to {casename} in umi snakemake rules
* change of {sample} to {casename} in umi params constants
* clean up of umi workflow
* removed umi specific vep rule
* Fix/container add bash (#525)
* run as root and remove user
* add executable to workflow
* docker as root
* empty line for beauty
* wording
* Feat/umi integration (#517)
* small fixes to umi rules
* move tables and plots to qc_dir and fix expand rules
* incorporate umi rules into main balsamic workflow
* reset to main code line
* rename wildcards order in umirules
* rename analysis output wildcards names in umi workflow
* add umi_workflow to balsamic
* add cli option umiworkflow
* update vcfdict for umi vcf files
* add umiworkflow otion to analysismodel
* fix indent space
* fix indent space
* fix vcf_dict for umi vcfs
* add vcfattributes for new added vcf keys
* small fixes to umipart in balsamic workflow
* add umiworkflow cli command to conftest
* fix umiworkflow fixture in models
* revert conftest
* add collecths metric umi rule to multiqc
* fix umi qc rules
* fix multiqc umi qc
* tmpdir refactor (#516)
* Use python to generate final output file
* refactor variant calling rules
* missing comma!
* create a root temporary directory in workflow
* review comments
* review comments
* review comments and refactor
* Update BALSAMIC/snakemake_rules/variant_calling/sentieon_tn_varcall.rule
Co-authored-by: ashwini06 <[email protected]>
* Update BALSAMIC/snakemake_rules/variant_calling/cnvkit_paired.rule
Co-authored-by: ashwini06 <[email protected]>
* fix tmpdir assignment
* refactor cnvkit_paired
* fix wildcard
* Unify tmp dirs
* remove redundant key
* create a temp dir per rule
* get correct manta env
* refactor cnvkit rule
* collect files from new path for manta
* no need to remove workdir, as it is a tempdir
* trailing quote
* write to file via python and not shell
* tmp for sentieon align as well
* fix wording in docker build test
Co-authored-by: ashwini06 <[email protected]>
* Feat/init containers (#522)
* move outdir to base, only limited choice of containers to pull
* refactor container names and complete decouple for containers
* correct path to yaml file
* details Watson! details
* return version!
* dont join, set!
* better set!
* unused zero in version
* only take conda envs
* formatting
* replace conda_env_yaml key with bioinfo_tools (new key)
* delete obsolete rules
* directly use config instead of middleman
* some more replacements
* bioinfo tools in reference.smk as well
* remove get_conda_env
* pass outdir to reference from base
* formatting
* fix tests for cli
* use context for container init
* specific tag to develop
* specific tag to develop
* use singularity image path, download all images from docker, and add a list of valid containers, etc
* formatting
* fix tests and correct vep name
* manta resource
* haplotpyer back
* fix some tests
* remove unused try-except
* remove unused var
* shell executable to workflow
* some tests
* remove unused module
* formatting
* more conflict fix
* formatting
* Feat/genmod (#531)
* formatting
* add genmod
* libgxx for gcc compiled
* fastqc version bump (#532)
* readlink command fix according to container (#533)
* bump up bcftools (#537)
* bump up bcftools
* samtools bump
* fix tests for biinfo tools
* Refactor tmpdir, resources, and some minor fixes to UMI workflow (#534)
* modify jobs for better resources
* formatting
* use picard instead of picard.jar
* some minor refactoring, and formatting
* refactor UMI rules for sentieon
* correct container selection for qc_umi rules
* refactor
* update requirements versions
* use snakemake class instead
* generate and check executable for snakemake workflow
* close multiple open fig files
* move plots to workflow level
* some unittests
* if's best friend: else
* pathlib
* better var names
* move some stuff to workflow
* move tests to their own place
* formatting
* add case name
* add dummy files
* some more tests, a bit refactoring, and move files where they belong
* sometimes remember, sometimes forget. This round: forgot to add file
* increase covearge
* more tests
* move some code into their own functions, and some tests
* beautify
* review comments2
* bump requirements
* more test
* final nail on the tests
* var caller filter for tumor-normal
* variant filter for tumor-normal and tumor-only in panel
* tumor-normal filters
* use bcftools env instead of vep
* formatting
* merge container and reference into one (#538)
* merge container and reference into one
* fix coverage and tests
* mocks and tests
* remove comments2
* genome version test
* formatting
* formatting
* final coverage attempt
* more test
* test script
* remove print statement
* rankscore for genmod as reference and rule (#539)
* rankscore for genmod as reference
* add to set of downloaded file
* correct rankscore version
* fix container path
* add rankscore model
* dummy file for rank model
* rankscore model and some tests
* smelly code
* merge conflict
* conflict
* missing import
* Fix/gsutil dep temp (#550)
* fixes #541
* fixes #549
* Feat/split snv sv indel (#540)
* split variant base rule
* wip
* undo the changes
* split WGS variants into SV and SNV
* remove redundant rule2
* move split into its own rule
* review comments
* fix sentieon command issue
* docs
* docs (#553)
* docs
* Feat/wgsfilter rules (#548)
* new rules for ngs filtering for tnscope and tnhaplotyper
* changed VarCallerFilter DP as optional
* add new condition for filtering wgs and single analysis
* merged tnscope and tnhaplotyper into one rule
* modify sentieon var calling constants
* add sentieon varcalling variable to balsamic workflow
* Delete varcaller_filter_tumor_only_bckup.rule wrong file
* restore UMIworkflow script
* fix sentieon varcall filters according to review comments
* change filename of wgs varcalling rule in balsamic workflow
* new rule file for wgs sentieon varcall filters
* removed sentieon wgs related varcall rule
* fix AD sum formula
* rules for sentieon varcall filtering and isec
* add new fields for filtering as constants
* update attributes of VarCallerFilter
* fix wgs filter output vcf file calling
* fix review comments
* changelog
* TN wgs filters for tnscope (#558)
* TN wgs filters for tnscope
* fix review comments
* add hk tag
* small fix to expand rules for wgs filtered pass vcf
* fix bcftools filter and workflow conditional statement
* bump genmod (#565)
* Feat/umi tn (#563)
* umiextract rules changed to sample names
* rules consenuscall wildcrads set to sample names
* seperate two input files commandlines
* TNscope rules for paired analysis
* fix umi collect hsmetric for multiqc
* fix wildcards in qc_umi
* fix align header to sample name
* fix rules in main and umi workflows
* fix conditional statement for umi rules
* removed get_densityplot function from workflowscripts.py
* removed matplotlib and searborn packages
* using only consensusfilter file for variantcalling
* remove vardict variantcalling rule
* removed rules bcftools_query_calculatenoiseAF_umi, seaborn_densityplot_umi and wildcard {step}
* remove wildcard step for umi_collect_hsmetric
* change umi variantcaller name to TNscope_umi in VCFdict
* fix missing . in tnscope output
* remove wildcard {step} and simplification in UMI workflow
* fix VCFModel for TNscope_umi
* delete densityplot related test data files
* remove get_density_plot function from pytests
* refactor of umi related code lines in balsamic workflow
* remove get_densityplot from test_utils
* remove density plots related test lines
* fix import name for get_file_contents
* fix back the matplotlib.pyplot
* Feat/refactor fastp (#570)
* split fastp rule for umi workflow input
* use fastp umi optimized as file inputs for umi extract
* add fastp rule to umi workflow
* small fix
* remove unused params
* fixes #571 (#572)
* changelog
Co-authored-by: ashwini06 <[email protected]>
Co-authored-by: Ashwini Jeggari <[email protected]>
Co-authored-by: keyvanelhami <[email protected]>