Discrepancy between number of reads from collapse --generate_map and quantification module output #400

Open
pgupta3005 opened this issue Dec 23, 2024 · 2 comments
Labels
Documentation Update ReadTheDocs manual mod; quantify

Comments

@pgupta3005

Hi, I ran the FLAIR correct, collapse, and quantify modules after aligning Nanopore FASTQ reads to the respective genome with minimap2. I used the following commands:

# FLAIR collapse
python /home/pallavi/softwares/flair-1.5/flair.py collapse \
-r PF/${i}.fastq \
-q $output_dir/${i}_all_corrected.bed \
-g $genome_fasta -f $genome_gtf \
--generate_map \
-t 12 --temp_dir /nfs_master/pallavi/tmp_work/ -o $output_dir/${i}_collapse 2> $output_dir/${i}_collapse.log

# FLAIR quantify
python /home/pallavi/softwares/flair-1.5/flair.py quantify \
-i $output_dir/${i}.isoforms.fa \
-r manifest_${i} \
-t 16 --tpm \
--temp_dir /nfs_master/pallavi/tmp_work -o $output_dir/${i}_quantify 2> $output_dir/${i}_quantify.log

For most of the identified isoforms, the number of reads assigned to the isoform in the read map matches the count in the quantification output, but for some isoforms the quantification count is higher. For example:
image

As you can see, 5 reads are mapped to this isoform, but quantification gives a count of 6. Why could this be happening, and what is the solution?

Thanks!
-Pallavi

@pgupta3005
Author

pgupta3005 commented Dec 23, 2024

Just to add: of the 2815 isoforms identified for our sample, the numbers matched for 1955 isoforms, while quantification gave higher counts for 787 isoforms and lower counts for 73 isoforms.

image

In IGV, it looks like this, where the red ticks indicate the reads that were represented in the read-to-isoform map:

image

@cafelton cafelton added Need more information Need more information from user mod; quantify labels Jan 8, 2025
@cafelton
Collaborator

cafelton commented Jan 8, 2025

If I understand your situation correctly, I think this is not a bug, just something that needs better documentation. If you are comparing the reads listed in the isoform.read.map.txt file generated by FLAIR collapse against the counts from FLAIR quantify, it is possible for the two not to match. This is because de novo isoform identification in FLAIR collapse is a distinct process from isoform quantification in FLAIR quantify. The two share a lot of similarities and should produce similar, but not necessarily identical, results. If you have to pick one to trust, I would pick the quantify output. You can also tell quantify to produce a read.map.txt file if you need that for your downstream pipeline.
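
To see which isoforms differ between the two outputs, the per-isoform tallies can be compared directly. The following is a minimal Python sketch, not part of FLAIR. It assumes the read map has one line per isoform in the form `isoform<TAB>read1,read2,...`, that the quantify output is a tab-separated matrix with a header row, isoform IDs in the first column and one sample's counts in the second, and that isoform IDs are spelled identically in both files. The function names and the toy data are hypothetical and only illustrate the comparison.

```python
import csv
import io


def read_map_counts(handle):
    """Count reads per isoform from lines of the form: isoform<TAB>read1,read2,..."""
    counts = {}
    for line in handle:
        line = line.rstrip("\n")
        if not line:
            continue
        isoform, _, reads = line.partition("\t")
        counts[isoform] = len([r for r in reads.split(",") if r])
    return counts


def quantify_counts(handle, column=1):
    """Read one sample column from a tab-separated count matrix with a header row."""
    rows = csv.reader(handle, delimiter="\t")
    next(rows, None)  # skip the header row (assumed present)
    return {row[0]: float(row[column]) for row in rows if row}


def compare(map_counts, quant_counts):
    """Classify each isoform in the read map by how its quantify count compares."""
    summary = {"equal": 0, "higher": 0, "lower": 0}
    for isoform, n_map in map_counts.items():
        n_quant = quant_counts.get(isoform, 0)  # missing isoforms count as 0
        if n_quant == n_map:
            summary["equal"] += 1
        elif n_quant > n_map:
            summary["higher"] += 1
        else:
            summary["lower"] += 1
    return summary


# Toy data standing in for the real files (hypothetical isoform and read names).
toy_map = io.StringIO("isoA\tr1,r2,r3,r4,r5\nisoB\tr6,r7\nisoC\tr8,r9,r10\n")
toy_quant = io.StringIO("ids\tsample1\nisoA\t6\nisoB\t2\nisoC\t1\n")

summary = compare(read_map_counts(toy_map), quantify_counts(toy_quant))
print(summary)
```

With real data, the two `io.StringIO` objects would be replaced by `open(...)` calls on the read map and the count matrix. Listing the isoforms that fall in the "higher" and "lower" groups, rather than only counting them, makes it easier to inspect individual cases in IGV.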

@cafelton cafelton added Documentation Update ReadTheDocs manual and removed Need more information Need more information from user labels Jan 8, 2025