Allele-specific VQSR convergence fix #6262
problems for exomes with AS annotations Add VQSR debug arg
925d407 to e6de27a
@ldgauthier Can you nominate a reviewer for this one?
A couple of minor changes requested.
private static final long serialVersionUID = 0L;

public VQSRNegativeModelFailure(String message) { super(message); }
}
Since these exception classes are specific to VQSR, and don't appear to be handled/caught anywhere anyway, they seem to be unnecessary. The call sites can just throw UserException directly.
I'm looking for one of those in the tests now. I don't like to test for UserException because when the input files are missing, that's a user exception. (I think Louis and I cleaned that up, but I'm still in favor of being specific.)
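A minimal sketch of the specificity argument, under stated assumptions: `UserExceptionStandIn` is a hypothetical stand-in for GATK's `UserException` so that the example compiles on its own, and `trainNegativeModel` and `thrownType` are invented helpers. Only the `VQSRNegativeModelFailure` name comes from the diff.

```java
public class SpecificExceptionSketch {
    // Hypothetical stand-in for GATK's UserException, so the sketch is self-contained.
    static class UserExceptionStandIn extends RuntimeException {
        private static final long serialVersionUID = 0L;
        UserExceptionStandIn(String message) { super(message); }
    }

    // Mirrors the class in the diff: a VQSR-specific subclass of the generic type.
    static class VQSRNegativeModelFailure extends UserExceptionStandIn {
        private static final long serialVersionUID = 0L;
        VQSRNegativeModelFailure(String message) { super(message); }
    }

    // Invented helper: fails the way a model-training step might.
    public static void trainNegativeModel(int numBadVariants) {
        if (numBadVariants == 0) {
            throw new VQSRNegativeModelFailure("No data found for the negative model.");
        }
    }

    // Returns the simple class name of whatever the action throws, or "none".
    public static String thrownType(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // A missing input file would surface as the generic type ...
        System.out.println(thrownType(() -> { throw new UserExceptionStandIn("missing file"); }));
        // ... so a test that expects the subclass can tell the two failures apart.
        System.out.println(thrownType(() -> trainNegativeModel(0)));
    }
}
```

A test that expects only the generic type would pass for either failure, which is the concern raised above.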
src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantDataManager.java
src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantRecalibratorEngine.java
src/main/java/org/broadinstitute/hellbender/tools/walkers/vqsr/VariantDataManager.java
"--use-allele-specific-annotations",
"-mode", "SNP",
"--" + StandardArgumentDefinitions.ADD_OUTPUT_VCF_COMMANDLINE, "false",
"--max-gaussians", "6"
Presumably this test fails if the number of gaussians is not reduced. Can you add this same test case, but without this arg, as a negative test case? I don't think we actually have any of those.
It turns out it doesn't fail, so I took the arg out and updated the results. Added two different failing tests.
value = Double.NaN; // The VQSR works with missing data by marginalizing over the missing dimension when evaluating the Gaussian mixture model
if( jitter && (annotationKey.equalsIgnoreCase(GATKVCFConstants.AS_RMS_MAPPING_QUALITY_KEY))){
value += vrac.MQ_JITTER * Utils.getRandomGenerator().nextGaussian();
}
Is it worth updating the MQ_CAP doc to state explicitly that it applies to MQ only and doesn't affect AS_MQ? Up to you. It might actually be more correct now.
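The code comment in the snippet above mentions marginalizing over a missing dimension. As a rough illustration only, the sketch below evaluates a single Gaussian with a diagonal covariance (an assumption made here for brevity; it is not taken from the VQSR code), where marginalizing out a `NaN` dimension reduces to skipping that dimension's term. `MarginalizeSketch` and `logDensity` are hypothetical names.

```java
public class MarginalizeSketch {
    // Log density of a diagonal-covariance Gaussian.
    // Dimensions whose value is NaN are treated as missing and integrated out,
    // which for a diagonal covariance means dropping their term from the sum.
    public static double logDensity(double[] x, double[] mean, double[] var) {
        double logp = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (Double.isNaN(x[i])) {
                continue; // missing annotation: contributes nothing to the marginal
            }
            double d = x[i] - mean[i];
            logp += -0.5 * (Math.log(2.0 * Math.PI * var[i]) + d * d / var[i]);
        }
        return logp;
    }

    public static void main(String[] args) {
        double[] mean = {0.0, 60.0};
        double[] var = {1.0, 4.0};
        // Second annotation is missing, so only the first dimension is scored.
        System.out.println(logDensity(new double[]{0.5, Double.NaN}, mean, var));
        // Same first dimension scored on its own gives the same value.
        System.out.println(logDensity(new double[]{0.5}, new double[]{0.0}, new double[]{1.0}));
    }
}
```

With a full covariance matrix the same idea applies, but the missing row and column are dropped from the covariance before evaluating the density.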
@cmnbroad I did some cleanup and added a few more tests. Anything else?
Thanks @ldgauthier . 👍 when tests pass.
Ops reported several instances in which the allele-specific filtering failed. In the case I examined, the MQ distribution is much tighter around the mode at 60, which causes linear algebra failures because that variable is effectively constant. Added more jitter, which has served well in the past.
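A small sketch of the failure mode described above, not the PR's actual code: when every value of an annotation sits at 60, its sample variance is zero, so any covariance matrix that includes it is singular. Adding zero-mean Gaussian noise restores a nonzero variance. The class name, the `sigma` value, and the seed are assumptions chosen for illustration and are not GATK's `MQ_JITTER` setting.

```java
import java.util.Arrays;
import java.util.Random;

public class JitterSketch {
    // Sample variance of a single annotation column.
    public static double variance(double[] x) {
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= x.length;
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / (x.length - 1);
    }

    // Returns a copy of x with zero-mean Gaussian noise of standard deviation sigma added.
    public static double[] jitter(double[] x, double sigma, Random rng) {
        double[] out = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            out[i] = x[i] + sigma * rng.nextGaussian();
        }
        return out;
    }

    public static void main(String[] args) {
        double[] mq = new double[1000];
        Arrays.fill(mq, 60.0);          // every site sits exactly at the mode
        Random rng = new Random(42L);   // fixed seed so runs are repeatable
        double sigma = 0.05;            // hypothetical jitter scale

        System.out.println("variance before jitter: " + variance(mq));
        System.out.println("variance after jitter:  " + variance(jitter(mq, sigma, rng)));
    }
}
```

The jitter scale is a trade-off: it needs to be large enough to keep the covariance well conditioned, yet small relative to the real spread of the annotation.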