FLAIR collapse 5'/3' UTR and splice isoform issue #409

Open
mckalem opened this issue Jan 21, 2025 · 3 comments
Labels
bug (Something isn't working), mod; collapse

Comments

mckalem commented Jan 21, 2025

Hello FLAIR Team,

I am annotating the transcriptome of a fungal organism using Nanopore direct RNA sequencing (DRS). I enriched for full-length transcripts with a 5' adapter ligation strategy, and I am passing the filtered full-length reads to FLAIR2. My organism of interest has many single-exon transcripts and many regions with antisense transcripts in densely packed genomic loci.

I struggle to get consistent 5' and 3' end isoforms, and FLAIR2 is missing some splice isoforms. I tried running FLAIR2 using less stringent parameters to get the most isoforms.

This is how I am invoking FLAIR2:

flair align -g /home/muratcankalem/genomes/G217BV3/HcG217Bv3_contig.fasta -r /home/muratcankalem/Histo_Isoform_Analsis/minimap2/fasta/merged_filtered_1_13_25.fasta \
--threads 32 \
--nvrna \
--quality 5 \
--output "/home/muratcankalem/Histo_Isoform_Analsis/flair2_H_Y_combined/H_Y_ALIGN"

flair correct -q /home/muratcankalem/Histo_Isoform_Analsis/flair2_H_Y_combined/H_Y_ALIGN.bed \
--gtf /home/muratcankalem/genomes/G217BV3/HcG217BV3_mRNA.gtf \
-g /home/muratcankalem/genomes/G217BV3/HcG217Bv3_contig.fasta \
--output /home/muratcankalem/Histo_Isoform_Analsis/flair2_H_Y_combined/H_Y_CORRECT \
--nvrna \
--threads 32 \
--ss_window 10 \
--print_check

#Collapse_C

flair collapse -q /home/muratcankalem/Histo_Isoform_Analsis/flair2_H_Y_combined/H_Y_CORRECT_all_corrected.bed \
-g /home/muratcankalem/genomes/G217BV3/HcG217Bv3_contig.fasta \
-r /home/muratcankalem/Histo_Isoform_Analsis/minimap2/fasta/merged_filtered_1_13_25.fasta \
--gtf /home/muratcankalem/genomes/G217BV3/HcG217BV3_mRNA.gtf \
--output /home/muratcankalem/Histo_Isoform_Analsis/flair2_H_Y_combined/collapse_1_15_25/H_Y_COLLAPSE_1_15_25_C \
--threads 32 \
--mm2_args=-G2000 \
--trust_ends \
--no_gtf_end_adjustment \
--generate_map \
--max_ends 5 \
--isoformtss \
--end_window 20 \
--support 2 \
--filter ginormous

I attached 2 examples of missing either 5'/3' UTR annotation or splice isoforms:

  I7I48_02682: Based on the reads, I expect this transcript to have 5' and 3' UTR isoforms, but FLAIR2 does not identify the correct TSS/TES for them.

  I7I48_04800: I expect an isoform arising from mis-splicing of intron 1, but FLAIR2 does not identify it.

I am unsure if there's a reason that these are not called as isoforms. There are many instances like these that I observe where 5' and 3' ends are not correctly called (happy to share my fasta files and flair outputs). How can I modify my FLAIR2 invocation to have the most inclusive list of isoforms (especially the 5' and 3' UTR isoforms)?
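
A rough way to check whether the expected ends have read support is to tally the 5' and 3' ends of the corrected reads over a locus and group nearby positions. The sketch below assumes a tab-separated BED file with at least six columns, such as the `_all_corrected.bed` file passed to `flair collapse` above. The function names `tally_read_ends` and `cluster_positions` are hypothetical helpers, and the single-linkage grouping is only an illustration, not FLAIR's collapse logic.

```python
from collections import Counter


def tally_read_ends(bed_lines, chrom, locus_start, locus_end):
    """Count 5' and 3' end coordinates of reads overlapping a locus.

    Keys are (strand, position) using raw BED coordinates. For '+' reads
    the 5' end is the BED start; for '-' reads it is the BED end.
    """
    five, three = Counter(), Counter()
    for line in bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or fields[0] != chrom:
            continue  # skip short lines and other contigs
        start, end, strand = int(fields[1]), int(fields[2]), fields[5]
        if end <= locus_start or start >= locus_end:
            continue  # read does not overlap the locus
        if strand == "+":
            five[(strand, start)] += 1
            three[(strand, end)] += 1
        elif strand == "-":
            five[(strand, end)] += 1
            three[(strand, start)] += 1
    return five, three


def cluster_positions(counts, window):
    """Group same-strand positions that lie within `window` of the
    previous position in the group (single linkage), summing reads."""
    clusters = []
    for (strand, pos), n in sorted(counts.items()):
        last = clusters[-1] if clusters else None
        if last and last["strand"] == strand and pos - last["end"] <= window:
            last["end"] = pos
            last["reads"] += n
        else:
            clusters.append(
                {"strand": strand, "start": pos, "end": pos, "reads": n}
            )
    return clusters


if __name__ == "__main__":
    # Made-up reads for illustration; not data from this issue.
    demo = [
        "ctg1\t100\t500\tr1\t0\t+",
        "ctg1\t105\t500\tr2\t0\t+",
        "ctg1\t160\t498\tr3\t0\t+",
        "ctg1\t120\t480\tr4\t0\t-",
    ]
    five, three = tally_read_ends(demo, "ctg1", 0, 1000)
    for cluster in cluster_positions(five, window=20):
        print(cluster)
```

Passing an open handle on the corrected BED file, together with the contig name and coordinates of a gene such as I7I48_02682 (not given in this post), yields per-strand end clusters. Their read counts can then be compared with the `--support` and `--end_window` values used above as a sanity check on which TSS/TES groups have enough reads.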

I'd appreciate any advice you might have for me.

Thank you,
-Murat

I7I48_04800 FLAIR_C.pdf
I7I48_02682 FLAIR_C.pdf

diekhans (Collaborator) commented Jan 22, 2025 via email

Data files to reproduce the problem are in /private/home/markd/pub/for-colette/issue-409.zip

diekhans added the bug and mod; collapse labels on Jan 22, 2025
mckalem (Author) commented Jan 27, 2025

Hello - Providing short-read data solved the issue with missing splice isoforms. I still have problems identifying 5' and 3' UTR isoforms.
