sites file optimized for RNA-Seq, bug-fixes and ancestry improvements

@brentp brentp released this 01 May 22:24

v0.2.10

  • added a new sites file restricted to sites likely to be expressed in GTEx, to improve kinship estimation in RNA-Seq data (see below for a link to the new sites file).
  • fix extra output column in pairs.tsv (#47)
  • change output file name of ancestry to include "somalier"
  • fix for gvcf with empty alts (#46)
  • add include regions and exclude sites to find-sites
  • add --min-ab option to somalier relate to restrict het sites to those with allele balance between min_ab and 1-min_ab; the default is 0.3
  • html output for sample plot defaults to number of het sites on X (was hom-alt)
  • better estimates in somalier ancestry when incoming samples are of a different ancestry from the training samples (thousand genomes)
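
To illustrate the --min-ab bullet above, here is a minimal Python sketch of an allele-balance window. It is not somalier's actual code: the function name is_het_by_allele_balance, the definition of allele balance as alt / (ref + alt), the inclusive bounds and the zero-depth handling are all assumptions made for the example.

```python
def is_het_by_allele_balance(ref_count: int, alt_count: int, min_ab: float = 0.3) -> bool:
    """Return True if the allele balance falls inside [min_ab, 1 - min_ab].

    Hypothetical helper for illustration only; allele balance is assumed
    to be alt_count / (ref_count + alt_count).
    """
    depth = ref_count + alt_count
    if depth == 0:
        # No reads at this site, so there is nothing to call (assumption).
        return False
    allele_balance = alt_count / depth
    return min_ab <= allele_balance <= 1 - min_ab


# With the default of 0.3, a 10/10 site is kept and an 18/2 site is not.
balanced = is_het_by_allele_balance(10, 10)
skewed = is_het_by_allele_balance(18, 2)
```

Under these assumptions, raising min_ab toward 0.5 narrows the window and keeps only sites with close to an even split of reference and alternate reads.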

example output is now at: https://brentp.github.io/somalier/ex.html
and: https://brentp.github.io/somalier/ex.somalier-ancestry.html

Installation

grab the static binary, or use docker via brentp/somalier:v0.2.10

sites files (unchanged from previous releases)

These sites files are build-specific, but as of this release, once the sites are extracted, the resulting somalier files can be used to compare samples even across genome builds.

sites.hg19.vcf.gz
sites.hg38.nochr.vcf.gz
sites.GRCh37.vcf.gz
sites.hg38.vcf.gz

sites.hg38.rna.vcf.gz only includes sites likely to be expressed in GTEx