Home
FusionInspector is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT). FusionInspector assists in fusion transcript discovery by performing a supervised analysis of fusion predictions, attempting to recover and re-score evidence for such predictions.
Given a list of candidate fusion genes (as derived from running any fusion transcript prediction tool, such as PRADA, FusionCatcher, SOAPfuse, TopHat-Fusion, DISCASM/GMAP-Fusion, STAR-Fusion, or others), FusionInspector extracts the genomic regions for the fusion partners and constructs mini-fusion-contigs containing the pairs of genes in their proposed fused orientation. The original reads are then aligned to these candidate fusion contigs. Fusion-supporting reads that would normally align as discordant pairs or split reads should instead align as concordant 'normal' reads in this fusion-gene context. The reads supporting each fusion (spanning fragments and fusion-breakpoint-containing reads) are identified, reported, and scored accordingly.
Optionally, Trinity de novo transcriptome assembly can be executed as part of the FusionInspector routine in order to de novo reconstruct fusion transcripts from the mapped reads.
Outputs generated by FusionInspector are easily viewed in a genome browser such as IGV so that the evidence for fusion transcripts can be manually assessed for read and alignment quality.
FusionInspector requires the following companion software tools to be installed:
- STAR (obtain the very latest development version of STAR via 'git clone')
- Trinity (even if you do not wish to de novo assemble fusion transcripts, components of the Trinity software are still required)
- [bgzip from the htslib package](https://github.com/samtools/htslib/releases/tag/1.3)
- And the following non-standard Perl modules:
  - URI::Escape
  - Set::IntervalTree
The cpanm tool is useful for local installations of these.
Make sure STAR, samtools, and bgzip are available via your PATH environment variable, and set the environment variable TRINITY_HOME to the Trinity installation directory.
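As a rough sketch of that setup, the snippet below exports the variables and reports anything it cannot find. The directory paths are placeholders (assumptions, not real install locations), and the checks are only a convenience; they are not part of FusionInspector itself.

```shell
# Placeholder install locations (assumptions); replace with the real paths on your system.
export TRINITY_HOME="/path/to/trinityrnaseq"
export PATH="/path/to/STAR/bin:/path/to/htslib/bin:$PATH"

# Collect the names of any required tools that cannot be found via PATH.
missing_tools=""
for tool in STAR samtools bgzip; do
    if ! command -v "$tool" >/dev/null 2>&1; then
        missing_tools="$missing_tools $tool"
    fi
done

if [ -n "$missing_tools" ]; then
    echo "Not found on PATH:$missing_tools" >&2
fi

# Check that the non-standard Perl modules can be loaded.
# Missing modules can usually be installed with: cpanm URI::Escape Set::IntervalTree
for module in URI::Escape Set::IntervalTree; do
    perl -M"$module" -e 1 2>/dev/null || echo "Perl module not available: $module" >&2
done
```

Adding the two `export` lines to your shell startup file makes the settings persist across sessions.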
As with the other fusion-transcriptome components of CTAT, FusionInspector leverages the FusionFilter data resources. Visit the FusionFilter website for links to existing data resources for human fusion transcript detection, or for instructions on how to build your own data resources for use with CTAT.
FusionInspector requires one or more lists of fusion candidates, with each formatted like so, as geneA--geneB:
B3GNT1--NPSR1
ZNF709--DYRK1A
ZNF844--NCBP2
RBX1--HAPLN2
FAM180B--TRIM60
CASP9--ADCYAP1
HS3ST3A1--C1QTNF2
OPTC--AP000347.4
GRIA2--ZW10
We'll call the file containing this list 'fusions.listA.txt'. Let's assume we have another such list from another source, and we'll call it 'fusions.listB.txt'.
It's ok to have a tab-delimited file containing other attributes (such as the raw output from some fusion-prediction tool) as long as the first column fits the above format.
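Before launching a run, it can help to confirm that the first column really does follow the geneA--geneB format. The sketch below does this with awk on a small made-up file; the file names `raw_predictions.tsv` and `fusions.checked.txt` are hypothetical, and the check itself is an optional convenience rather than something FusionInspector requires.

```shell
# Hypothetical raw predictions file (tab-delimited; fusion name in column 1).
# The third line is deliberately malformed to show what the check reports.
printf 'B3GNT1--NPSR1\t12\t3\nZNF709--DYRK1A\t8\t1\nnot_a_fusion\t0\t0\n' > raw_predictions.tsv

# Keep first-column values that split into exactly two non-empty gene names
# around "--"; report everything else on stderr with its line number.
awk -F'\t' '{
    n = split($1, parts, "--")
    if (n == 2 && parts[1] != "" && parts[2] != "")
        print $1
    else
        print "line " NR ": unexpected fusion name: " $1 > "/dev/stderr"
}' raw_predictions.tsv > fusions.checked.txt

cat fusions.checked.txt
```

Splitting on "--" rather than matching a stricter pattern keeps gene names that themselves contain a single hyphen or a dot (such as AP000347.4) valid.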
Given this list of fusions, we'll run FusionInspector like so:
FusionInspector --fusions fusions.listA.txt,fusions.listB.txt \
--genome_lib /path/to/CTAT_genome_lib \
--left_fq rnaseq_1.fq --right_fq rnaseq_2.fq \
--out_dir my_FusionInspector_outdir \
--out_prefix finspector \
--prep_for_IGV
The final output of FusionInspector is a file called 'finspector.fusion_predictions.final', which you'll find in the directory specified by --out_dir. The file is tab-delimited. Its fields are listed below by column index, followed by an example record:
0 #fusion_name
1 JunctionReads # the number of breakpoint-junction reads
2 SpanningFrags # the number of breakpoint-spanning fragments
3 Splice_type
4 LeftGene
5 LeftBreakpoint
6 RightGene
7 RightBreakpoint
8 JunctionReads # the names of the breakpoint-junction reads
9 SpanningFrags # the names of the breakpoint-spanning fragments
10 Annotations
0 HS3ST3A1--C1QTNF2
1 106
2 1254
3 ONLY_REF_SPLICE
4 HS3ST3A1
5 chr17:13503848:-
6 C1QTNF2
7 chr5:159776788:-
8 fragBp1383/1,fragBp1365/1,... # the exact reads identified as breakpoint-junction reads
9 fragBp692,fragBp277,fragBp389,... # the names of the fragments containing breakpoint-spanning paired reads.
10 .
The .final files can be large and difficult to navigate because they list the names of all the evidence reads. Instead, try navigating the .final.abridged file, which contains all the above information but excludes the names of the reads.
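Because the output is tab-delimited, standard tools such as awk can be used to pull out the better-supported candidates. The sketch below runs on a toy file and assumes the abridged file keeps the column order listed above with the two read-name columns dropped. The first data row reuses the example record above, the second row (GENEX--GENEY) is made up, and the thresholds are arbitrary illustrations, not recommended cutoffs.

```shell
# Toy stand-in for finspector.fusion_predictions.final.abridged.
# Assumption: same column order as the .final file, minus the read-name columns.
{
    printf '#fusion_name\tJunctionReads\tSpanningFrags\tSplice_type\tLeftGene\tLeftBreakpoint\tRightGene\tRightBreakpoint\n'
    printf 'HS3ST3A1--C1QTNF2\t106\t1254\tONLY_REF_SPLICE\tHS3ST3A1\tchr17:13503848:-\tC1QTNF2\tchr5:159776788:-\n'
    printf 'GENEX--GENEY\t1\t0\tONLY_REF_SPLICE\tGENEX\tchr1:1000:+\tGENEY\tchr2:2000:+\n'
} > toy.fusion_predictions.final.abridged

# Hypothetical thresholds, chosen only for this illustration.
min_junction=2
min_spanning=1

# Skip the header line, then keep rows whose junction-read and
# spanning-fragment counts (columns 2 and 3, counting from 1) meet the thresholds.
strong=$(awk -F'\t' -v mj="$min_junction" -v ms="$min_spanning" '
    /^#/ { next }
    $2 >= mj && $3 >= ms { print $1 "\t" $2 "\t" $3 }
' toy.fusion_predictions.final.abridged)

printf '%s\n' "$strong"
```

Note that awk numbers fields from 1, so the columns labeled 1 and 2 in the listing above are `$2` and `$3` here.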
When the '--prep_for_IGV' parameter is specified, the following files are generated for viewing in IGV (or another genome browser):
finspector.fa : the candidate fusion-gene contigs
finspector.bed : the reference gene structure annotations for fusion partners
finspector.junction_reads.bam : alignments of the breakpoint-junction supporting reads.
finspector.spanning_reads.bam : alignments of the breakpoint-spanning paired-end reads.
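Before opening IGV, a quick check that all four files were written can save some confusion. The helper below is a hypothetical convenience function (it is not shipped with FusionInspector) that prints the name of each expected file that is missing or empty, given the output directory and prefix used in the example command.

```shell
# Print the name of each expected IGV file that is missing or empty.
# Arguments: output directory, output prefix.
missing_igv_files() {
    dir=$1
    prefix=$2
    for suffix in fa bed junction_reads.bam spanning_reads.bam; do
        [ -s "$dir/$prefix.$suffix" ] || echo "$prefix.$suffix"
    done
}

# Directory and prefix as used in the example run above.
missing_igv_files my_FusionInspector_outdir finspector
```

If nothing is printed, all four files are present and non-empty.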
An example of viewing a fusion candidate with recovered read evidence using IGV is shown below.
See the 'test/' subdirectory and examine the README.txt file included. Example data and command execution info are provided.
Contact us via our Google group: https://groups.google.com/forum/#!forum/trinity_ctat_users