C program that parses a pileup file/stream generated by samtools mpileup into a tabular file of base counts.
It is experimental, and only reads from stdin and writes to stdout.
Installation:
compile the main.c file with:
gcc main.c -o pileupParser
Usage:
cat file.pileup | pileupParser {arguments} > bases.counts.out
The arguments are not named and must come in this order:
1. minimal number of reads that must carry a base for this base to be output (integer, default 1)
2. minimal number of different bases (alleles) required to output a position (default is 2, because I coded this to investigate SNPs)
3. minimal quality score of a base for its read to be included in the count (default 13, as in samtools mpileup)
4. number of bam files to scan (defaults to all, but cannot be more than 100)
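As an illustration of how unnamed, ordered arguments with defaults can be handled, here is a minimal sketch. It is not taken from main.c: the Options struct, the parse_args function and the convention that 0 means "all bam files" are assumptions made for this example. Only the defaults and the limit of 100 come from the description above.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAX_BAMS 100 /* upper limit stated in this README */

/* Hypothetical container for the four positional arguments. */
typedef struct {
    int min_reads;   /* minimal reads carrying a base (default 1) */
    int min_alleles; /* minimal number of different bases (default 2) */
    int min_qual;    /* minimal base quality (default 13) */
    int n_bams;      /* bam files to scan; 0 is used here to mean "all" */
} Options;

/* Fill opts from argv. Arguments that are not given keep their defaults.
   Returns 0 on success, -1 on a malformed or out-of-range value. */
static int parse_args(int argc, char **argv, Options *opts)
{
    int *fields[4];

    assert(opts != NULL);
    opts->min_reads = 1;
    opts->min_alleles = 2;
    opts->min_qual = 13;
    opts->n_bams = 0;

    fields[0] = &opts->min_reads;
    fields[1] = &opts->min_alleles;
    fields[2] = &opts->min_qual;
    fields[3] = &opts->n_bams;

    /* argv[0] is the program name; positional arguments start at 1. */
    for (int i = 1; i < argc && i <= 4; i++) {
        char *end;
        long v = strtol(argv[i], &end, 10);

        /* Reject empty strings, trailing garbage and negative values. */
        if (end == argv[i] || *end != '\0' || v < 0 || v > INT_MAX)
            return -1;
        *fields[i - 1] = (int)v;
    }

    if (opts->n_bams > MAX_BAMS)
        return -1;
    return 0;
}
```

With this sketch, calling parse_args(argc, argv, &opts) at the start of main would turn "pileupParser 2 1 25" into min_reads 2, min_alleles 1, min_qual 25, with the number of bam files left at "all".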
Example:
cat file.pileup | pileupParser 2 1 25 > bases.counts.out //this scans all bam files in the pileup (if ≤ 100)
The output file lists: chromosome, position, bases found (separated by commas), number of different bases found, comma-separated counts for each base in the scanned bam files. There is a final field listing the number of reads showing indels at this position (for information). Fields are separated by tabs.
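To give an idea of how such counts can be derived from one sample of a pileup line, here is a minimal sketch. It is not the code in main.c: the count_bases and base_index functions are hypothetical, and the sketch assumes the usual samtools pileup encoding ('.' and ',' for a match to the reference, '^' followed by a mapping quality character, '$' at a read end, '+N'/'-N' followed by N inserted or deleted bases, and Phred+33 base qualities).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Map A, C, G, T (either case) to 0..3; anything else gives -1. */
static int base_index(char c)
{
    switch (toupper((unsigned char)c)) {
    case 'A': return 0;
    case 'C': return 1;
    case 'G': return 2;
    case 'T': return 3;
    default:  return -1;
    }
}

/* Count A/C/G/T in the read-bases column of one sample.
   bases:    read-bases string from the pileup line
   quals:    matching base-quality string (Phred+33 assumed)
   ref:      reference base at this position
   min_qual: bases with a lower quality are ignored
   counts:   incremented in A, C, G, T order
   indels:   incremented once per insertion or deletion marker */
static void count_bases(const char *bases, const char *quals, char ref,
                        int min_qual, int counts[4], int *indels)
{
    size_t q = 0; /* index of the next unused quality character */
    size_t nq;

    assert(bases && quals && counts && indels);
    nq = strlen(quals);

    for (const char *p = bases; *p; p++) {
        if (*p == '^') {        /* read start: skip the mapping quality */
            if (p[1])
                p++;
            continue;
        }
        if (*p == '$')          /* read end: carries no base */
            continue;
        if (*p == '+' || *p == '-') {
            char *end;
            long len = strtol(p + 1, &end, 10);

            (*indels)++;
            p = end - 1;        /* last digit of the length, if any */
            while (len-- > 0 && p[1])
                p++;            /* skip the inserted or deleted bases */
            continue;
        }

        /* Every remaining symbol consumes one quality character. */
        char b = (*p == '.' || *p == ',') ? ref : *p;
        int idx = base_index(b);
        int qual = (q < nq) ? quals[q] - 33 : 0;

        q++;
        if (idx >= 0 && qual >= min_qual)
            counts[idx]++;
    }
}
```

For instance, with the reference base G, the bases ".,A^~.+2AGt$", the qualities "IIII#" and a minimal quality of 13, this sketch counts 1 A and 3 G, ignores the low-quality t, and reports 1 read with an indel.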