Multiple rsID in Haplotype Caller results #6690

Closed
Stikus opened this issue Jul 2, 2020 · 2 comments

Stikus commented Jul 2, 2020

Hello, thanks for the great software.
After the GATK 4.1.8.0 release we updated our internal Docker containers (from GATK v4.1.7.0) and noticed changes in the HaplotypeCaller results:

check_against_37.woRandomLine.vcf.txt
test_v37.haplotypecaller.woRandomLine.vcf.txt

Here is the difference:

17	7571487	rs17880560	A	AGCCGTG	166.10	.	AC=2;AF=1.00;AN=2;DB;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=28.73;SOR=0.693	GT:AD:DP:GQ:PL	1/1:0,4:4:12:180,12,0
17	7571487	rs17880560;rs79948390	A	AGCCGTG	166.10	.	AC=2;AF=1.00;AN=2;DB;DP=4;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=28.73;SOR=0.693	GT:AD:DP:GQ:PL	1/1:0,4:4:12:180,12,0
17	7578711	rs141204613	CTTT	C	232.93	.	AC=2;AF=1.00;AN=2;DB;DP=13;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.83;QD=30.97;SOR=1.329	GT:AD:DP:GQ:PL	1/1:0,6:6:18:247,18,0
17	7578711	rs141204613;rs5819162	CTTT	C	232.93	.	AC=2;AF=1.00;AN=2;DB;DP=13;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.83;QD=30.97;SOR=1.329	GT:AD:DP:GQ:PL	1/1:0,6:6:18:247,18,0
17	7579643	rs150200764	CCCCCAGCCCTCCAGGT	C	1834.03	.	AC=2;AF=1.00;AN=2;DB;DP=52;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=61.03;QD=27.24;SOR=0.843	GT:AD:DP:GQ:PL	1/1:0,41:41:99:1848,125,0
17	7579643	rs150200764;rs146534833;rs59758982	CCCCCAGCCCTCCAGGT	C	1834.03	.	AC=2;AF=1.00;AN=2;DB;DP=52;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=61.03;QD=27.24;SOR=0.843	GT:AD:DP:GQ:PL	1/1:0,41:41:99:1848,125,0

The new file (test_v37) contains multiple rsIDs in the ID field. Is that expected behavior or not?
I can't find any info about it in the changelog.

Contributor

droazen commented Jul 2, 2020

@Stikus Yes, this is expected, and is mentioned in the release notes for 4.1.8.0:

  • More flexible matching of dbSNP variants during variant annotation (More flexible matching of dbSNP variants #6626)
    • Add all dbsnp id's which match a particular variant to the variant's id, instead of just the first one found in the dbsnp vcf.
    • Be less brittle to variant normalization issues, and match differing variant representations of the same underlying variant. This is implemented by splitting and trimming multiallelics before checking for a match, which I suspect are the predominant cause of these types of matching failures.

For more details see the original pull request here: #6626
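
To make the two release-note bullets concrete, here is a minimal Python sketch of the idea: split multiallelic records into single-allele variants, trim shared bases, then collect every dbSNP ID whose trimmed form matches the call. This is not GATK's actual code. The function names (`trim`, `split_and_trim`, `annotate_ids`) and the tuple layout are hypothetical, and the way the two dbSNP records are represented below is invented for illustration. Only the rsIDs and the call at 17:7571487 come from the output quoted above.

```python
from typing import Iterable, List, Set, Tuple

# (chrom, pos, ref, alt) for a single biallelic, trimmed variant
Key = Tuple[str, int, str, str]
# (rsid, chrom, pos, ref, alts) for a dbSNP record; hypothetical layout
DbsnpRecord = Tuple[str, str, int, str, List[str]]


def trim(pos: int, ref: str, alt: str) -> Tuple[int, str, str]:
    """Drop bases shared by ref and alt, keeping at least one base in each."""
    # Trim the common suffix first.
    while len(ref) > 1 and len(alt) > 1 and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    # Then trim the common prefix, advancing the position.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


def split_and_trim(chrom: str, pos: int, ref: str, alts: Iterable[str]) -> Set[Key]:
    """Split a (possibly multiallelic) record into trimmed biallelic keys."""
    return {(chrom, *trim(pos, ref, alt)) for alt in alts}


def annotate_ids(chrom: str, pos: int, ref: str, alts: List[str],
                 dbsnp: Iterable[DbsnpRecord]) -> str:
    """Join every matching dbSNP ID with ';', or return '.' if none match."""
    call_keys = split_and_trim(chrom, pos, ref, alts)
    ids: List[str] = []
    for rsid, d_chrom, d_pos, d_ref, d_alts in dbsnp:
        if call_keys & split_and_trim(d_chrom, d_pos, d_ref, d_alts):
            if rsid not in ids:
                ids.append(rsid)
    return ";".join(ids) if ids else "."


# Illustrative dbSNP records; the second one is written as a multiallelic
# purely as an assumption, to show why splitting matters.
dbsnp_records: List[DbsnpRecord] = [
    ("rs17880560", "17", 7571487, "A", ["AGCCGTG"]),
    ("rs79948390", "17", 7571487, "A", ["AGCCGTG", "AGCCGTGGCCGTG"]),
]

print(annotate_ids("17", 7571487, "A", ["AGCCGTG"], dbsnp_records))
```

Under these assumptions, taking only the first match would give the single-ID behavior seen with 4.1.7.0, while collecting all matches gives the semicolon-joined ID field seen with 4.1.8.0.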

Author

Stikus commented Jul 2, 2020

Thank you for the fast answer; it looks like I didn't read the release notes carefully. :)

Stikus closed this as completed Jul 2, 2020