Brian Haas edited this page Jul 5, 2023 · 34 revisions

CTAT-LR-Fusion : Detect Fusion Transcripts from Long Reads (PacBio Iso-seq or ONT transcriptomes)

CTAT-LR-Fusion is a component of the Trinity Cancer Transcriptome Analysis Toolkit (CTAT) that detects fusion transcripts in long-read transcriptome sequencing data, including PacBio Iso-seq and Oxford Nanopore Technologies (ONT) transcriptomes. If matched Illumina RNA-seq data are available, they can also be used to further explore and quantify the fusions initially detected from the long reads.

CTAT-LR-Fusion was developed in the Broad Institute's Methods Development Laboratory (MDL) for characterizing long-read transcriptome sequences such as those derived from MAS-seq.

Installing CTAT-LR-Fusion

Obtaining CTAT-LR-Fusion software

Docker and Singularity images are available and recommended.

If you would prefer to install from source code, download the latest 'FULL' release tarball from the CTAT-LR-Fusion release site. Unpack it, and run 'make' in the base installation directory.

Obtaining and configuring the CTAT Genome Lib

The CTAT genome lib is the same one used by the other CTAT tools and can be downloaded from https://data.broadinstitute.org/Trinity/CTAT_RESOURCE_LIB/. If you have companion Illumina short reads, the CTAT genome lib software compatibility matrix indicates which version of STAR to use.

Configuring the CTAT Genome Lib for CTAT-LR-Fusion

The ctat-LR-fusion software comes with a customized version of minimap2 named ctat-minimap2, and CTAT-LR-Fusion requires a minimap2 index of the reference genome. To build this, initially run ctat-LR-fusion like so:

ctat-LR-fusion -T long_reads.fq \
               --genome_lib_dir  /path/to/ctat_genome_lib_build_dir \
               --prep_reference

and it will first build the minimap2 genome index before running ctat-LR-fusion to find fusion transcripts.

If you run with --prep_reference_only, it will stop after building the index.

For future runs, drop the --prep_reference argument, as the index only needs to be built once. If you forget, no worries. It'll only build it once anyway.
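The three ways of invoking the tool described above (first run with index building, index building only, and later runs) can be sketched as a small bash helper that assembles and prints the command line. The helper name build_ctat_cmd and its mode names are hypothetical, introduced only for illustration; the flags are the ones shown on this page. Whether -T is needed, or even accepted, together with --prep_reference_only is an assumption, so check the tool's usage message before relying on it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of ctat-LR-fusion): print the command line
# for one of three modes, using only the flags documented on this page.
#   first       build the minimap2 index, then detect fusions
#   index-only  build the minimap2 index and stop
#   later       index already built, detect fusions only
build_ctat_cmd() {
    local mode="$1" reads="$2" genome_lib="$3"
    local cmd=(ctat-LR-fusion -T "$reads" --genome_lib_dir "$genome_lib")

    case "$mode" in
        first)      cmd+=(--prep_reference) ;;
        # Assumption: -T is tolerated in index-only mode.
        index-only) cmd+=(--prep_reference_only) ;;
        later)      ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac

    # Joined with single spaces; paths containing spaces are not re-quoted.
    printf '%s\n' "${cmd[*]}"
}

# Print the command for each mode, with placeholder paths.
build_ctat_cmd first      long_reads.fq /path/to/ctat_genome_lib_build_dir
build_ctat_cmd index-only long_reads.fq /path/to/ctat_genome_lib_build_dir
build_ctat_cmd later      long_reads.fq /path/to/ctat_genome_lib_build_dir
```

The helper only prints; paste the printed line into your shell (or a job script) to actually run it. Keeping the index build as a separate index-only step can be convenient when many samples share one genome lib, since the later runs then never need to touch the index.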

Running CTAT-LR-Fusion

Once you have the CTAT genome lib installed and configured as above, you can run CTAT-LR-Fusion on your long reads.
