This repository has been archived by the owner on Jan 16, 2019. It is now read-only.
pmorrell edited this page Nov 26, 2014 · 39 revisions

Introduction

angsd-wrapper is a project that aims to streamline analyses in ANGSD with reproducibility and ease of use in mind. It wraps ANGSD methods such as the calculation of Tajima's D, thetas, and (2D) site frequency spectra.

Each script comes with a number of variables that control how the method is carried out. Every variable has a default value, which can be overridden by setting the variable to the desired value in the .conf file. For example, to change the minimum mapping quality from the default of 30 to 25, put MIN_MAPQ=25 in the .conf file.
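The override can be pictured with a minimal sketch. It assumes the wrapper scripts set their defaults and then source the .conf file as ordinary shell assignments; the actual scripts may do this differently. The temporary file only stands in for a real .conf file.

```shell
# Assumed default, as stated above: minimum mapping quality of 30.
MIN_MAPQ=30

# Create a throwaway file that plays the role of the .conf file
# (hypothetical stand-in, not a file shipped with angsd-wrapper).
CONF_FILE=$(mktemp)
echo 'MIN_MAPQ=25' > "$CONF_FILE"

# Source the .conf file: any variable it sets replaces the default.
. "$CONF_FILE"

echo "MIN_MAPQ is now ${MIN_MAPQ}"

# Clean up the temporary file.
rm -f "$CONF_FILE"
```

Variables that the .conf file does not mention keep their default values under this pattern.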


Installation/Usage

ANGSD and ngsPopGen must be compiled before use. Build each one by changing into its directory and running make:

$ cd angsd
$ make
$ cd ..
$ cd ngsPopGen
$ make

Note: running make test in ngsPopGen will likely fail, because the test script does not find all of the directories it expects.

Scripts should be invoked with bash and run from the top-level directory so that their relative paths resolve correctly.

$ bash scripts/ANGSD_SFS.sh scripts/SFS_example.conf
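
One way to guard against running from the wrong directory is sketched below. The function name run_from_toplevel is hypothetical and not part of angsd-wrapper; it only checks that the script and .conf paths exist relative to the current directory before handing them to bash.

```shell
# Hypothetical helper (not part of angsd-wrapper): run a wrapper script
# only if both paths resolve from the current working directory.
run_from_toplevel() {
    script="$1"   # e.g. scripts/ANGSD_SFS.sh
    conf="$2"     # e.g. scripts/SFS_example.conf

    # If either path is missing, we are probably not in the top-level directory.
    if [ ! -f "$script" ] || [ ! -f "$conf" ]; then
        echo "Cannot find $script or $conf; run this from the top-level directory." >&2
        return 1
    fi

    # Invoke the script with bash, passing the .conf file as its argument.
    bash "$script" "$conf"
}
```

Called from the top-level directory, run_from_toplevel scripts/ANGSD_SFS.sh scripts/SFS_example.conf would behave like the command above; called from anywhere else it prints an error and returns a non-zero status.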

Methods

Each method included in angsd-wrapper has an in-depth description on its respective page.

## ANGSD-related publications

### Methods used in ANGSD are explained in:

Nielsen R, Korneliussen T, Albrechtsen A, Li Y, Wang J (2012) SNP calling, genotype calling, and sample allele frequency estimation from New-Generation Sequencing data. PLoS One 7: e37558.

### FST and PCA methods are detailed in:

Fumagalli M, Vieira FG, Korneliussen TS, Linderoth T, Huerta-Sanchez E, Albrechtsen A, Nielsen R (2013) Quantifying population genetic differentiation from next-generation sequencing data. Genetics 195: 979-992.

### Estimation of individual inbreeding coefficients is detailed in:

Vieira FG, Fumagalli M, Albrechtsen A, Nielsen R (2013) Estimating inbreeding coefficients from NGS data: Impact on genotype calling and allele frequency estimation. Genome Res 23: 1852-1861.

### A comparison of the accuracy of population genetic analyses using ANGSD versus various SNP calling approaches is given in:

Han E, Sinsheimer JS, Novembre J (2014) Characterizing bias in population genetic inferences from low-coverage sequencing data. Mol Biol Evol 31: 723-735.