Causal inference, differential expression, and co-expression for scRNA-seq

Normalisr

Normalisr is a parameter-free normalization and statistical association testing framework that unifies single-cell differential expression, co-expression, and pooled single-cell CRISPR screen analyses with linear models. By systematically detecting and removing nonlinear confounders arising from library size at mean and variance levels, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased p-value estimation.

Normalisr first removes confounding technical noise from raw read counts to recover biological variation. Linear association testing then provides a unified inferential framework with several advantages: (i) exact P-value estimation without permutation, (ii) native removal of covariates (e.g. batches, housekeeping programs, and untested gRNAs) as fixed effects, (iii) robustness against read count distribution distortions given enough (> 100) cells, and (iv) computational efficiency.
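To make the linear testing idea concrete, here is a minimal sketch of a linear association test with covariates treated as fixed effects. It is not Normalisr's implementation and skips the normalization step entirely. All function names (`linear_association`, `residualize`, and so on) and the toy data are made up for illustration. The sketch residualizes both variables against an intercept plus the covariates, correlates the residuals, and converts the resulting t statistic into a two-sided P-value from Student's t distribution, so no permutation is needed. The P-value is exact only under the usual Gaussian assumptions of the linear model, and it is computed here by numerical integration to stay within the standard library; in practice something like `scipy.stats.t.sf` would be the natural choice.

```python
import math
import random


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def residualize(v, basis):
    """Remove from v its projection onto each vector of an orthogonal basis."""
    for b in basis:
        coef = dot(v, b) / dot(b, b)
        v = [vi - coef * bi for vi, bi in zip(v, b)]
    return v


def orthogonal_basis(columns):
    """Gram-Schmidt orthogonalization, skipping linearly dependent columns."""
    basis = []
    for col in columns:
        r = residualize(list(col), basis)
        if dot(r, r) > 1e-12:
            basis.append(r)
    return basis


def t_two_sided_pvalue(t, df, panels=2000):
    """Two-sided Student's t P-value via Simpson integration of the tail.

    With t = sqrt(df) * tan(theta), the tail probability becomes an integral
    of cos(theta) ** (df - 1) over [theta0, pi / 2].
    """
    theta0 = math.atan(abs(t) / math.sqrt(df))
    h = (math.pi / 2 - theta0) / panels

    def f(theta):
        return math.cos(theta) ** (df - 1)

    total = f(theta0) + f(math.pi / 2)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(theta0 + i * h)
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return 2 / math.sqrt(math.pi) * math.exp(log_coef) * total * h / 3


def linear_association(x, y, covariates=()):
    """Test association between x and y after removing covariates from both.

    Returns (partial correlation, t statistic, degrees of freedom, P-value).
    """
    n = len(y)
    # The intercept is always included as a fixed effect.
    basis = orthogonal_basis([[1.0] * n] + [list(c) for c in covariates])
    rx = residualize(list(x), basis)
    ry = residualize(list(y), basis)
    r = dot(rx, ry) / math.sqrt(dot(rx, rx) * dot(ry, ry))
    df = n - len(basis) - 1
    t = r * math.sqrt(df / (1.0 - r * r))
    return r, t, df, t_two_sided_pvalue(t, df)


# Toy data (made up): expression depends on batch and on the tested group.
rng = random.Random(0)
batch = [float(i % 2) for i in range(200)]
group = [float((i // 2) % 2) for i in range(200)]
expr = [2.0 * b + 1.0 * g + rng.gauss(0.0, 1.0) for b, g in zip(batch, group)]
r, t, df, p = linear_association(group, expr, covariates=[batch])
print(f"partial r={r:.3f}, t={t:.2f}, df={df}, P={p:.3g}")
```

Here `group` plays the role of the tested variable (for example a condition or gRNA indicator) and `batch` the role of a covariate. Passing more covariate columns removes them in the same way.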

Normalisr is written in Python and provides both a command-line interface and a Python functional interface. Normalisr is published in Nature Communications (2021).

Installation

Normalisr is on PyPI and can be installed with pip: pip install normalisr. You can also install Normalisr from GitHub: pip install git+https://github.com/lingfeiwang/normalisr.git. Before using the command-line interface, make sure Normalisr's install path has been added to your PATH environment variable (see FAQ). Installation should take less than a minute.

More advanced installation methods exist, but if you need them, you most likely already know how to use them. If not, give me a shout (see Issues).

Usage

Normalisr provides a command-line interface and a Python functional interface, both described below. The examples provided can guide you through Normalisr's use. Sphinx-based documentation is underway.

  • Command-line interface

    You can run Normalisr by typing normalisr on the command line. Normalisr uses submodules for different analysis steps. Type normalisr or normalisr -h for general help, or, for example, normalisr de -h for help on the differential expression submodule 'de'.

    Normalisr uses the tsv (tab-separated values) file format for input and output matrices, and plain text files for row and column names, such as cells and genes, one per line. For initial input, Normalisr also accepts the sparse mtx format (Cell Ranger output) for the raw read count matrix. Gzipped input/output files are automatically recognized when the file name has the suffix '.gz'.

  • Python functional interface

    Normalisr's Python functional interface is more flexible than the command line, but requires knowledge of Python programming. Documentation of any function can be obtained with ? in IPython or a Jupyter notebook, such as:

    import normalisr.normalisr as norm
    ?norm.de
    

    The example Jupyter notebooks also illustrate the scope of functions Normalisr provides.
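As a rough illustration of the file conventions described for the command-line interface, the sketch below reads and writes a tab-separated matrix and a one-name-per-line text file, switching to gzip when the file name ends in '.gz'. It uses only the Python standard library and is not Normalisr's own I/O code. The helper names are hypothetical, and it assumes the matrix file carries no header row or column because names live in separate files.

```python
import csv
import gzip


def open_maybe_gzip(path, mode="rt"):
    """Open a text file, using gzip when the file name ends with '.gz'."""
    opener = gzip.open if str(path).endswith(".gz") else open
    return opener(path, mode, newline="")


def write_matrix(path, rows):
    """Write a matrix (list of rows) as tab-separated values."""
    with open_maybe_gzip(path, "wt") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)


def read_matrix(path):
    """Read a tab-separated numeric matrix into a list of float rows."""
    with open_maybe_gzip(path) as f:
        return [[float(v) for v in row]
                for row in csv.reader(f, delimiter="\t") if row]


def write_names(path, names):
    """Write row or column names (e.g. genes or cells), one per line."""
    with open_maybe_gzip(path, "wt") as f:
        f.writelines(name + "\n" for name in names)


def read_names(path):
    """Read names, one per line, ignoring empty lines."""
    with open_maybe_gzip(path) as f:
        return [line.rstrip("\r\n") for line in f if line.strip()]
```

For example, `write_matrix("reads.tsv.gz", counts)` would produce a gzipped file, while `write_matrix("reads.tsv", counts)` would produce plain text. Both file names are placeholders.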

Documentation

Documentation is available as HTML and PDF.

Examples and pipelines

You can find several examples in the 'examples' folder, covering all functions Normalisr currently provides. The example datasets have been scaled down to run on a personal computer with 16GB of memory. Although they serve only as demonstrations here, the pipelines should be transferable to a different, full-scale dataset. Since Normalisr is non-parametric, the only adjustable parameters are for quality control and the final cutoffs for differential expression or co-expression. To run the full datasets on a larger computer, change the down-sampling parameters in the examples.

You can find more details in the respective examples.

Issues

Please raise an issue on GitHub.

References

FAQ

  • What does Normalisr stand for?
    Normalisr Offers Robust Modelling of Associations Linearly In Single-cell RNA-seq. Yes, it's a recursive acronym. See GNU and pip.
  • I installed Normalisr but typing normalisr says 'command not found'.
    See below.
  • How do I use a specific python version for Normalisr's command-line interface?
    You can always run Normalisr through the python command, for example python3 -m normalisr in place of the normalisr command. You can also use a specific path or version of Python, such as python3.7 -m normalisr or /usr/bin/python3.7 -m normalisr. Make sure you have installed Normalisr for that Python version.
  • Why don't the examples work?
    Please make sure you followed every step in the README.md of the respective example folder with an Internet connection. If the problem persists, submit an issue report detailing the executed line at which the error occurred, with its input and output.
  • Does Normalisr run on Windows?
    I have not tested Normalisr on Windows. However, it is written purely in Python and should function properly.
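The 'command not found' and Python version answers above can be combined into a small helper when Normalisr is driven from a script. The sketch below is only an illustration, and `normalisr_command` is a hypothetical name that is not part of Normalisr. It prefers the normalisr executable if it is found on PATH, and otherwise falls back to running the module with a chosen interpreter, as in python3 -m normalisr.

```python
import shutil
import sys


def normalisr_command(args=(), python=None):
    """Build an argument list that invokes Normalisr's command-line interface.

    If no interpreter is requested and a `normalisr` executable is on PATH,
    use it directly. Otherwise use `<python> -m normalisr`, defaulting to the
    interpreter running this script.
    """
    exe = shutil.which("normalisr") if python is None else None
    base = [exe] if exe else [python or sys.executable, "-m", "normalisr"]
    return base + list(args)
```

The returned list can be passed to `subprocess.run`, for instance `subprocess.run(normalisr_command(["de", "-h"]))` to show help for the 'de' submodule. This only works if Normalisr is installed for the interpreter that ends up being used.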