GitHub - ctheodoris/MIRA: Python package for analysis of multiomic single cell RNA-seq and ATAC-seq.

MIRA (Probabilistic Multimodal Models for Integrated Regulatory Analysis) is a comprehensive methodology that systematically contrasts single cell transcription and accessibility to infer the regulatory circuitry driving cells along developmental trajectories.

MIRA leverages joint topic modeling of cell states and regulatory potential modeling at individual gene loci to:

jointly represent cell states in an efficient and interpretable latent space
infer high fidelity lineage trees
determine key regulators of fate decisions at branch points
expose the variable influence of local accessibility on transcription at distinct loci

See our manuscript for details.

Install

MIRA can be installed from either PyPI or conda-forge:

pip install mira-multiome

or

conda install -c conda-forge mira-multiome

Getting Started

MIRA takes count matrices of transcripts and accessible regions measured by single cell multimodal RNA-seq and ATAC-seq from any platform as input data. MIRA output integrates with the AnnData data structure for interoperability with Scanpy. Initial model training is faster on GPU hardware but can also be run on a CPU.

Please refer to our tutorial for an overview of analyses that can be achieved with MIRA using an example 10x Multiome embryonic brain dataset.

Gallery

With MIRA, you can analyze single cell multimodal transcriptional (RNA-seq) and accessibility (ATAC-seq) data to:

Construct biologically meaningful joint representations of cells progressing through developmental trajectories¹:

Infer high fidelity lineage trees defining developmental fate decisions¹:

Learn the "topics" describing cell transcriptional and accessibility states¹:

Contrast transcriptional and accessibility topics on stream graphs and determine the pathways and regulators governing each cell state¹:

Identify the transcription factors driving poised genes down diverging developmental paths, predict transcription factor targets via in silico deletion of putative regulatory elements, plot heatmaps of transcriptional and accessibility dynamics, and compare expression and motif scores of key factors on MIRA's joint representation¹:

Explore gene expression within lineage trajectories and compare expression to motif score of key factors with stream graphs¹:

Determine the transcription factors driving fate decisions at key lineage branch points²:

Elucidate genes with local chromatin accessibility-influenced transcriptional expression (LITE) versus non-local chromatin accessibility-influenced transcriptional expression (NITE) and plot "chromatin differential" to highlight cells where transcription is decoupled from shifts in local chromatin accessibility²:

Quantify NITE regulation of topics or cells across the developmental continuum to reveal how variable circuitry regulates fate commitment and terminal identity¹,²:

Overall, MIRA combines probabilistic cell-level topic modeling with gene-level RP modeling to expose the key regulators driving fate decisions at lineage branch points. By contrasting the spatiotemporal dynamics of transcription and local chromatin accessibility, it reveals the distinct circuitry regulating fate commitment versus terminal identity.

Methodology

MIRA Topic Model

MIRA harnesses a variational autoencoder approach to model both transcription and chromatin accessibility topics defining each cell’s identity while accounting for their distinct statistical properties and employing a sparsity constraint to ensure topics are coherent and interpretable. MIRA’s hyperparameter tuning scheme learns the appropriate number of topics needed to comprehensively yet non-redundantly describe each dataset. MIRA next combines the expression and accessibility topics into a joint representation used to calculate a k-nearest neighbors (KNN) graph. This output can then be leveraged for visualization and clustering, construction of high fidelity lineage trajectories, and rigorous topic analysis to determine regulators driving key fate decisions at lineage branch points.
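The last two steps, combining per-cell topic compositions into a joint representation and building a KNN graph on it, can be illustrated with a minimal standard-library sketch. This is not MIRA's implementation: the function names, the L2 normalization of each modality, and the plain Euclidean distance are assumptions chosen for illustration, and MIRA's own transformation and weighting of the two modalities may differ.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumed normalization, not MIRA's)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def joint_representation(expr_topics, acc_topics):
    """Concatenate each cell's expression and accessibility topic vectors."""
    return [l2_normalize(e) + l2_normalize(a)
            for e, a in zip(expr_topics, acc_topics)]

def knn_graph(points, k):
    """For each cell, return the indices of its k nearest neighbors."""
    graph = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j)
                       for j, q in enumerate(points) if j != i)
        graph.append([j for _, j in dists[:k]])
    return graph

# Toy topic compositions for four cells (hypothetical values).
expr_topics = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
acc_topics = [[0.7, 0.3], [0.6, 0.4], [0.2, 0.8], [0.3, 0.7]]

joint = joint_representation(expr_topics, acc_topics)
neighbors = knn_graph(joint, k=1)
print(neighbors)
```

In this toy input, cells 0 and 1 share one state and cells 2 and 3 share another, so each cell's nearest neighbor is its partner. A brute-force search like this scales quadratically with the number of cells; real datasets would need an approximate nearest-neighbor method.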

MIRA RP Model

MIRA’s regulatory potential (RP) model integrates transcriptional and chromatin accessibility data at each gene locus to determine how regulatory elements surrounding each gene influence its expression. Regulatory influence of enhancers is modeled to decay exponentially with genomic distance at a rate learned by the MIRA RP model from the joint multimodal data. MIRA learns independent upstream and downstream decay rates and includes parameters to weight upstream, downstream, and promoter effects. The RP of each gene is scored as the sum of the contributions of individual regulatory elements. Using probabilistic in silico deletion (ISD), MIRA predicts key regulators at each locus by examining transcription factor motif enrichment, or occupancy if chromatin immunoprecipitation sequencing (ChIP-seq) data are provided, within elements predicted to highly influence transcription at that locus.
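As a rough illustration of the scoring described above, the sketch below sums accessibility contributions that decay exponentially with distance from the transcription start site (TSS), with separate upstream and downstream decay rates and separate weights for upstream, downstream, and promoter elements. Everything here is an assumption for illustration: the function names, the `exp(-distance / decay)` form, the fixed promoter window, the plus-strand orientation, and the hard-coded parameter values, which MIRA instead learns from the data. The deletion helper is a simplified, deterministic stand-in for MIRA's probabilistic ISD.

```python
import math

def regulatory_potential(peaks, tss, upstream_decay, downstream_decay,
                         w_up=1.0, w_down=1.0, w_promoter=1.0,
                         promoter_radius=1500):
    """Sum the contributions of (position, accessibility) peaks around a gene.

    Assumes a plus-strand gene, so positions below the TSS are upstream.
    """
    rp = 0.0
    for position, accessibility in peaks:
        offset = position - tss
        if abs(offset) <= promoter_radius:
            # Promoter peaks contribute without distance decay.
            rp += w_promoter * accessibility
        elif offset < 0:
            rp += w_up * accessibility * math.exp(offset / upstream_decay)
        else:
            rp += w_down * accessibility * math.exp(-offset / downstream_decay)
    return rp

def in_silico_deletion(peaks, deleted_positions, **rp_kwargs):
    """Drop in RP after removing the peaks at the given positions."""
    kept = [(p, a) for p, a in peaks if p not in deleted_positions]
    return (regulatory_potential(peaks, **rp_kwargs)
            - regulatory_potential(kept, **rp_kwargs))

# Hypothetical locus: one upstream, one promoter, one downstream peak.
peaks = [(-10_000, 2.0), (0, 1.0), (5_000, 4.0)]
params = dict(tss=0, upstream_decay=10_000, downstream_decay=5_000)

rp = regulatory_potential(peaks, **params)
effect = in_silico_deletion(peaks, {5_000}, **params)
print(f"RP = {rp:.4f}, effect of deleting downstream peak = {effect:.4f}")
```

Peaks whose deletion causes a large drop in RP are the elements predicted to most influence transcription at the locus, which is where motif enrichment or occupancy would then be examined.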

MIRA LITE vs NITE Models

MIRA quantifies the regulatory influence of local chromatin accessibility by comparing the local RP model with a second, expanded model that augments the local RP model with genome-wide accessibility states encoded by MIRA’s chromatin accessibility topics. Genes whose expression is significantly better described by this expanded model are defined as non-local chromatin accessibility-influenced transcriptional expression (NITE) genes. Genes whose transcription is sufficiently predicted by the RP model based on local accessibility alone are defined as local chromatin accessibility-influenced transcriptional expression (LITE) genes. While LITE genes appear tightly regulated by local chromatin accessibility, the transcription of NITE genes appears to be titrated without requiring extensive local chromatin remodeling. MIRA defines the extent to which the LITE model over- or under-estimates expression in each cell as “chromatin differential”, highlighting cells where transcription is decoupled from shifts in local chromatin accessibility. MIRA examines chromatin differential across the developmental continuum to reveal how variable circuitry regulates fate commitment and terminal identity.
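The comparison between the two models can be sketched as follows. This is a hedged illustration rather than MIRA's actual test: the Poisson likelihood, the likelihood-ratio-style score, the pseudocount, and all function names are assumptions, and the chromatin differential is computed here against observed counts, whereas MIRA may compare the LITE prediction against a different reference.

```python
import math

def poisson_log_likelihood(observed, predicted):
    """Sum of Poisson log-probabilities of counts given predicted rates."""
    return sum(k * math.log(lam) - lam - math.lgamma(k + 1)
               for k, lam in zip(observed, predicted))

def nite_score(observed, lite_pred, nite_pred):
    """Twice the log-likelihood gain of the expanded model over the local one."""
    return 2.0 * (poisson_log_likelihood(observed, nite_pred)
                  - poisson_log_likelihood(observed, lite_pred))

def chromatin_differential(observed, lite_pred, pseudocount=1.0):
    """Per-cell log2 ratio of the LITE prediction to observed expression.

    Positive values mean the local model over-estimates expression,
    negative values mean it under-estimates expression.
    """
    return [math.log2((p + pseudocount) / (o + pseudocount))
            for o, p in zip(observed, lite_pred)]

# Hypothetical counts for one gene in three cells.
observed = [2, 8, 1]
lite_pred = [4.0, 4.0, 4.0]   # local accessibility alone
nite_pred = [2.0, 8.0, 1.0]   # local model plus accessibility topics

score = nite_score(observed, lite_pred, nite_pred)
differential = chromatin_differential(observed, lite_pred)
print(score, differential)
```

A gene with a large score is better described by the expanded model and so behaves like a NITE gene in this sketch; how large the gain must be to count as significant is a threshold this example does not set.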

Citations

MIRA was created by researchers in the X. Shirley Liu Lab at Dana-Farber Cancer Institute. If you use MIRA in your research, we would appreciate citation of our manuscript (bibtex).

Public datasets used for analyses in gallery and tutorial:

1. Ma, S. et al. Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin. Cell (2020).
2. Datasets - 10x Genomics. https://support.10xgenomics.com/single-cell-multiome-atac-gex/datasets.
