C program that parses a pileup file/stream generated by samtools mpileup into a tabular file of base counts.
It is experimental and only reads from stdin and writes to stdout.
Compile the main.c file with:
gcc main.c -o pileupParser
cat file.pileup | pileupParser {arguments} > bases.counts.out
The arguments are not named and must come in this order:
1. minimal number of reads that must carry a base for this base to be output (integer, default 1)
2. minimal number of different bases (alleles) required to output a position (default is 2, because I coded this to investigate SNPs)
3. minimal quality score of a base for its read to be counted (default 13, as in samtools mpileup)
4. number of bam files to scan (defaults to all, but cannot exceed 100)
cat file.pileup | pileupParser 2 1 25 > bases.counts.out # this scans all bam files in the pileup (if ≤ 100)
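For readers who want to see how unnamed, ordered arguments with defaults can be handled, here is a minimal sketch. It is not the code from main.c: the `Settings` struct, the `parse_settings` function and the use of 0 to mean "all bam files" are assumptions made for illustration only.

```c
#include <stdlib.h>

/* Hypothetical container for the four positional settings (not from main.c). */
typedef struct {
    int min_reads;   /* minimal reads carrying a base (default 1) */
    int min_alleles; /* minimal number of different bases (default 2) */
    int min_qual;    /* minimal base quality (default 13) */
    int n_bams;      /* bam files to scan; 0 means "all" (assumed sentinel) */
} Settings;

/* Fill the settings from argv, keeping the default for every argument that
   is not supplied. Returns 0 on success, -1 if a value is not a
   non-negative integer or if more than 100 bam files are requested. */
int parse_settings(int argc, char **argv, Settings *s)
{
    int *slots[4];
    int i;

    s->min_reads = 1;
    s->min_alleles = 2;
    s->min_qual = 13;
    s->n_bams = 0;

    slots[0] = &s->min_reads;
    slots[1] = &s->min_alleles;
    slots[2] = &s->min_qual;
    slots[3] = &s->n_bams;

    /* argv[0] is the program name; at most four values are read. */
    for (i = 1; i < argc && i <= 4; i++) {
        char *end;
        long v = strtol(argv[i], &end, 10);
        if (end == argv[i] || *end != '\0' || v < 0 || v > 1000000)
            return -1; /* empty, trailing junk, negative or absurdly large */
        *slots[i - 1] = (int)v;
    }
    if (s->n_bams > 100)
        return -1;
    return 0;
}
```

With the command line above (2 1 25), this sketch would set the first three values and leave the number of bam files at its default.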
The output lists: chromosome, position, bases found (separated by commas), number of different bases found, and comma-separated counts for each base in the scanned bam files. There is a final field listing the number of reads showing indels at this position (for information). Fields are separated by tabs.
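To give an idea of how the per-base counts and the indel field can be derived from one sample column of a pileup line, here is a rough sketch. It is not taken from main.c: the function name `count_bases`, its signature and the choice to count one indel per `+`/`-` marker are assumptions. It relies on the standard mpileup encoding (`.` and `,` for the reference base, `^` followed by a mapping quality character, `$` at read ends, `+N`/`-N` followed by N bases for indels, and Phred+33 base qualities).

```c
#include <ctype.h>
#include <stdlib.h>
#include <stddef.h>

/* Count A, C, G, T (counts[0..3]) in one sample's base string, ignoring
   bases whose quality is below min_qual. Returns the number of indel
   markers seen. Hypothetical helper, not the author's implementation. */
int count_bases(const char *bases, const char *quals, char ref,
                int min_qual, int counts[4])
{
    const char *p;
    size_t q = 0;
    int indels = 0;

    counts[0] = counts[1] = counts[2] = counts[3] = 0;

    for (p = bases; *p; p++) {
        char b;
        int qual;

        if (*p == '^') {          /* read start: skip the mapping quality char */
            if (p[1]) p++;
            continue;
        }
        if (*p == '$')            /* read end marker, no quality attached */
            continue;
        if (*p == '+' || *p == '-') {
            char *end;
            long len = strtol(p + 1, &end, 10);
            const char *e = end;
            indels++;
            while (len-- > 0 && *e) e++;  /* skip the inserted/deleted bases */
            p = e - 1;                    /* the loop's p++ lands on the next symbol */
            continue;
        }

        if (quals[q] == '\0')     /* quality string shorter than expected */
            break;
        qual = quals[q++] - 33;   /* Phred+33 encoding */
        if (qual < min_qual)
            continue;

        b = (*p == '.' || *p == ',') ? ref : *p;
        switch (toupper((unsigned char)b)) {
        case 'A': counts[0]++; break;
        case 'C': counts[1]++; break;
        case 'G': counts[2]++; break;
        case 'T': counts[3]++; break;
        default: break;           /* '*', 'N', '<', '>' and so on are not counted */
        }
    }
    return indels;
}
```

In this sketch a base would then be reported only if its count reaches the first argument, and a position only if enough different bases pass that test, following the argument descriptions above.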