Replace the sed snippets with Python ones #168
Added the site_representation_cutoff
Added the script which replaces the sed commands used to create the FASTA file for phylogeny
Moved site representation to EXPERIENCED USERS section
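For readers unfamiliar with the change, here is a minimal sketch of what swapping a sed one-liner for Python can look like. The actual sed commands are not shown in this thread, so the substitution below (stripping a suffix from FASTA header lines) and the function name `clean_fasta_headers` are purely hypothetical stand-ins. The code avoids Python 3-only syntax, since the thread mentions a python2 environment.

```python
import re


def clean_fasta_headers(lines, pattern=r"\.suffix$", replacement=""):
    """Apply a regex substitution to FASTA header lines only.

    Hypothetical stand-in for a command such as: sed '/^>/ s/\.suffix$//'
    Sequence lines are passed through unchanged.
    """
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Only header lines are rewritten, mirroring a sed address of /^>/.
            line = re.sub(pattern, replacement, line)
        cleaned.append(line)
    return cleaned


if __name__ == "__main__":
    example = [">sample1.suffix", "ACGT", ">sample2.suffix", "TTGA"]
    print("\n".join(clean_fasta_headers(example)))
```

Keeping the substitution in a function rather than an inline shell snippet makes it possible to test it outside the pipeline before wiring it into a process.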
They should both remain. The median coverage is used to filter samples but was mistakenly also used here. I don't understand the issue with the files not being generated.
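To illustrate why both cutoffs can coexist, here is a rough sketch in which one filters samples and the other filters sites. The definition of site representation used here (the fraction of samples with a non-missing call at a site), the `>=` comparisons, and all function names are assumptions made for illustration; the thread does not spell them out and the pipeline may implement this differently.

```python
def filter_samples_by_median_coverage(median_coverage, cutoff):
    """Keep sample names whose median coverage is at least `cutoff`.

    `median_coverage` maps sample name -> median coverage (hypothetical input).
    """
    return [name for name, cov in sorted(median_coverage.items()) if cov >= cutoff]


def site_representation(calls):
    """Fraction of samples with a non-missing call (None means missing).

    This definition is an assumption, not taken from the pipeline.
    """
    if not calls:
        return 0.0
    called = sum(1 for call in calls if call is not None)
    return called / float(len(calls))


def filter_sites_by_representation(site_calls, cutoff):
    """Keep positions whose representation is at least `cutoff`.

    `site_calls` maps position -> list of per-sample calls (hypothetical input).
    """
    return [pos for pos, calls in sorted(site_calls.items())
            if site_representation(calls) >= cutoff]


if __name__ == "__main__":
    print(filter_samples_by_median_coverage({"s1": 12, "s2": 5}, 10))
    print(filter_sites_by_representation(
        {100: ["A", "A", None, None], 200: ["A", "C", "G", None]}, 0.7))
```

Under these assumptions the two parameters act on different axes of the data, which is consistent with the comment above that both should remain.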
I see - therefore I assume that we don't need the cohort stats file to reflect this cut-off, right?
Yeah, strange for me too. But what I suggest is that you run the pipeline and then do debugging based on the
I suggest you copy the Python script there and then update the
Added a newline that I forgot here to fix the issue with the snpeff replacement of sed
Okay, I tested again with the change, but the result is still the same. The downstream
gatk IndexFeatureFile --java-options "-Xmx4G" \
\
-I joint.raw_variants.annotated.vcf.gz
Error log
Okay, this looks functional now, had to tweak the
I'm opening this PR for review. Please do another pipeline-level test on your side to check the final analysis, and then we can merge.
Hey @LennertVerboven and @TimHHH,
I've accommodated both Python scripts within the two processes. Here's some context:
Continuing from #163 (comment), I tweaked that script for the python2 environment which is in magma-env-2 // magma-container-2
Continuing from #163 (comment), the generated results are a bit different from the previous file.
Question: Is params.median_coverage_cutoff no longer necessary as it has been substituted by params.site_representation_cutoff?
However, one problem I noticed is that the exact expected file isn't being generated in both steps, hence I created a DRAFT PR. Once the debugging is done we can merge this PR.