forked from Illumina/Polaris

Data and information about the Polaris study

map2085/Polaris

Polaris

HiSeqX data

Summary

The Polaris project provides

  • Population sequencing resources on high throughput Illumina sequencing platforms
  • Variant calls validated using population genetic and Mendelian methods

The variant calls currently provided in Polaris are breakpoint-resolved deletion and insertion structural variants (SVs).

Further details of the sequencing resources, input data sources, genotyping methods and validation methods can be found in the project wiki.

Latest variant call release — VC1.0

Our latest variant call release set is VC1.0. This call set contains 70,706 validated breakpoint-resolved SV calls, selected from a candidate set of 184,988.

Candidates were identified from 4 sources:

All candidates were jointly re-called using our breakpoint joint caller suite, paragraph.

Validation consisted of:

PASS calls were either:

  • Pedigree consistent, or
  • Homozygous in pedigree, with MAF > 0.05 and HWE p-value > 0.05 in the Polaris panels
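
The PASS rule above can be sketched as a small shell function. This is only an illustration of the logic as stated in this README, not the project's validation code: the function name `is_pass`, the argument order, and the `yes`/`no` encoding are hypothetical.

```shell
# is_pass PEDIGREE_CONSISTENT HOMOZYGOUS_IN_PEDIGREE MAF HWE_P
# Prints PASS or FAIL for one call.
# The first two arguments are "yes" or "no" (hypothetical encoding);
# MAF and HWE_P are decimal numbers measured in the Polaris panels.
is_pass() {
    awk -v ped="$1" -v hom="$2" -v maf="$3" -v hwe="$4" 'BEGIN {
        # Criterion 1: the call is consistent with the pedigree.
        if (ped == "yes") { print "PASS"; exit }
        # Criterion 2: homozygous in the pedigree, and both population
        # statistics exceed 0.05.
        if (hom == "yes" && maf + 0 > 0.05 && hwe + 0 > 0.05) { print "PASS"; exit }
        print "FAIL"
    }'
}
```

For example, `is_pass no yes 0.10 0.50` prints `PASS`, while `is_pass no yes 0.02 0.50` prints `FAIL` because the MAF threshold is not met.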

Complete release notes for VC1.0 can be found here.

Download the VCF

The VC1.0 VCF can be downloaded using either the AWS CLI or wget, and can also be viewed in this S3 bucket display. Using wget is currently the easier of the two command line options.

AWS CLI

Polaris datasets are stored in an AWS S3 bucket called illumina-polaris, and can be downloaded into the current directory using the AWS CLI:

$: aws s3 cp s3://illumina-polaris/vc1_0.vcf.gz .
$: aws s3 cp s3://illumina-polaris/vc1_0.vcf.gz.tbi .

wget

If you don't have AWS credentials, you can use wget or a similar tool to download VC1.0:

$: wget https://s3.amazonaws.com/illumina-polaris/vc1_0.vcf.gz
$: wget https://s3.amazonaws.com/illumina-polaris/vc1_0.vcf.gz.tbi
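
Once the file is downloaded, a quick sanity check is to count the PASS records. The sketch below assumes that the PASS label is stored in the standard VCF FILTER column (the seventh tab-separated field), which this README does not state explicitly. The function name `count_pass` is hypothetical.

```shell
# count_pass FILE.vcf.gz
# Prints the number of non-header records whose FILTER column is PASS.
# Assumption: VC1.0 records its validation status in the FILTER field.
count_pass() {
    # bgzip-compressed files are valid gzip streams, so gzip -dc can read them.
    gzip -dc "$1" |
        awk -F '\t' '!/^#/ && $7 == "PASS" { n++ } END { print n + 0 }'
}

# Usage, after downloading the VCF as shown above:
#   count_pass vc1_0.vcf.gz
```

If the assumption holds, the printed count should be comparable to the number of validated calls quoted for VC1.0.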

Sequencing resources

Population panels with unrestricted access sequenced as part of Polaris are available through BaseSpace or the European Nucleotide Archive (ENA).

Additional panels are available through the EGA or dbGaP with restricted access subject to approval through a Data Access Committee. No variant calls are ever reported in Polaris for restricted access panels.

Further information on the sequencing resources described below can be found in the project wiki.

HiSeqX PCR-Free Data (Polaris 1)

All HiSeqX PCR-Free data was generated by Illumina Laboratory Services (ILS) with a target whole genome coverage of 30X.

There are currently two unrestricted access panels available in Polaris, with a third pending.

  1. Diversity panel (BaseSpace (pending), ENA) — 150 samples selected to represent a diversity of populations
  2. PGx panel (BaseSpace (pending), ENA) — 70 samples with orthogonally validated genotypes for 28 genes relevant for PGx [4]
  3. Trio panel (pending) — 51 children whose parents were sequenced as part of the diversity panel

There is also a restricted access repeat expansion panel available through EGA.

Associated resources

HiSeq2000 PCR-free

HiSeqX PCR-Free

  • Parents & grandparents
    • ENA — pending
    • BaseSpace — pending
  • Children
    • dbGaP — pending

Coming soon

HiSeqX PCR-Free

  • Platinum Genomes pedigree
  • NIST Ashkenazi Jewish trio

10X

  • Platinum Genomes Pedigree

NovaSeq S4 PCR-Free

  • Platinum Genomes pedigree
  • NIST Ashkenazi Jewish trio

Issues

Please open an issue to provide feedback or ask questions.

References

  1. Eberle, et al (2017) A reference data set of 5.4 million phased human variants validated by genetic inheritance from sequencing a three-generation 17-member pedigree. Genome Res. 27:157-164. doi:10.1101/gr.210500.116
  2. English, et al (2015) Assessing structural variation in a personal genome-towards a human reference diploid genome. BMC Genomics. 16:286. doi:10.1186/s12864-015-1479-3
  3. Kehr, et al (2017) Diversity in non-repetitive human sequences not found in the reference genome. Nat Genet. 49(4):588-593. doi:10.1038/ng.3801
  4. Pratt, et al (2016) Characterization of 137 Genomic DNA Reference Materials for 28 Pharmacogenetic Genes: A GeT-RM Collaborative Project. J Mol Diagn. 18(1):109-23. doi:10.1016/j.jmoldx.2015.08.005
