Downstreamer
Downstreamer can be used to perform key gene prioritization using GWAS summary statistics. We do this using 57 tissue-specific co-expression networks derived from the Recount3 data.
1️⃣ Getting started
2️⃣ Running PascalX to obtain gene p-values
3️⃣ Tissue enrichment
4️⃣ Key gene enrichment
5️⃣ Code availability
Download the tool and reference data here: https://downloads.molgeniscloud.org/downloads/downstreamerRelease2.tar.gz
This includes the files that are needed for PascalX.
Downstreamer needs gene-level p-values for the analysis. PascalX can be used to convert the variant-level summary statistics of a GWAS to gene-level summary statistics.
The instructions for doing so are listed here: PascalX for Downstreamer
In principle, Downstreamer can also use gene p-values from another source. This is, however, not recommended, as you would then also need to create a new null distribution for the gene p-values.
The expected format of gene p-values is a tab-separated file with 4 columns:
Column name | Description |
---|---|
gene | The name of the gene |
pvalue | The gene p-value |
nsnps | The number of SNPs on which the p-value is based. Can be 1 for all if not applicable |
min_pvalue | The smallest SNP p-value. Can be zero for all if not applicable |
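
The format above can be illustrated with a short Python sketch that writes and sanity-checks such a file. This is only an illustration and not part of Downstreamer: the function names, the placeholder gene name, and the assumption that the file starts with a header row containing the four column names are all hypothetical.

```python
import csv
import io

# Column order taken from the table above.
EXPECTED_COLUMNS = ["gene", "pvalue", "nsnps", "min_pvalue"]


def write_gene_pvalues(rows, handle):
    """Write dict rows as tab-separated text (header row is an assumption)."""
    writer = csv.DictWriter(
        handle, fieldnames=EXPECTED_COLUMNS, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


def validate_gene_pvalues(handle):
    """Return a list of problems; an empty list means no problem was found."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):
        try:
            pvalue = float(row["pvalue"])
            min_pvalue = float(row["min_pvalue"])
            nsnps = int(row["nsnps"])
        except (TypeError, ValueError):
            problems.append(f"line {line_no}: missing or non-numeric value")
            continue
        if not 0.0 <= pvalue <= 1.0:
            problems.append(f"line {line_no}: pvalue outside [0, 1]")
        if not 0.0 <= min_pvalue <= 1.0:
            problems.append(f"line {line_no}: min_pvalue outside [0, 1]")
        if nsnps < 1:
            problems.append(f"line {line_no}: nsnps must be at least 1")
    return problems


if __name__ == "__main__":
    buffer = io.StringIO()
    # GENE_A is a placeholder; nsnps=1 and min_pvalue=0 are the
    # "not applicable" fallbacks described in the table.
    write_gene_pvalues(
        [{"gene": "GENE_A", "pvalue": 0.01, "nsnps": 1, "min_pvalue": 0}], buffer
    )
    print(buffer.getvalue(), end="")
    print(validate_gene_pvalues(io.StringIO(buffer.getvalue())))
```
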
The gene-gene correlations of the null GWAS p-values are stored per chromosome arm and use the following naming scheme: NAME_1_q_correlations.datg
The .datg files and the corresponding .rows.txt.gz and .cols.txt.gz files can be created from a tab-separated .txt file using the CONVERT_TXT mode of Downstreamer.
Note: without updated null distributions the results might not be reliable.
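
If you do create your own null correlations, a small helper can check that every expected file is present before starting a run. The sketch below builds file names from the naming scheme described above. It assumes that the .rows.txt.gz and .cols.txt.gz companions share the base name of the .datg file, and that you supply the list of chromosome arms yourself; both the function names and these assumptions are hypothetical and not taken from Downstreamer.

```python
from pathlib import Path


def correlation_files(prefix, chromosome, arm):
    """Return the three file names expected for one chromosome arm.

    Follows the NAME_1_q_correlations.datg pattern; the names of the
    row and column companion files are an assumption.
    """
    if arm not in ("p", "q"):
        raise ValueError(f"arm must be 'p' or 'q', got {arm!r}")
    base = f"{prefix}_{chromosome}_{arm}_correlations"
    return [f"{base}.datg", f"{base}.rows.txt.gz", f"{base}.cols.txt.gz"]


def missing_correlation_files(directory, prefix, arms):
    """List expected files that do not exist in the given directory.

    `arms` is an iterable of (chromosome, arm) pairs chosen by the caller.
    """
    directory = Path(directory)
    return [
        name
        for chromosome, arm in arms
        for name in correlation_files(prefix, chromosome, arm)
        if not (directory / name).exists()
    ]


if __name__ == "__main__":
    # Example: check both arms of chromosomes 1 and 2 in the current directory.
    arms = [(chrom, arm) for chrom in (1, 2) for arm in ("p", "q")]
    for name in missing_correlation_files(".", "NAME", arms):
        print("missing:", name)
```
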
First, we use Downstreamer to determine which tissues express the genes implicated by the GWAS, using a tissue enrichment analysis. By doing this, we make sure that the key gene predictions are driven by relevant co-expression instead of tissue-specific expression.
For this, first run: runDownstreamerTissueEnrichment.sh followed by the R code in: selectSignficantTissues.R
This will prepare a parameter specifying which tissue-specific networks Downstreamer should use in the next step.
We are now ready to run the actual key gene prioritization. This is done by running: runDownstreamerKeygenePrediction.sh
The resulting key gene prioritizations per tissue are found in: _keygene_enrichtments.xlsx
If needed, the Z-scores of the different tissues can be meta-analyzed to obtain the final key gene prioritization score.
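
The page does not specify which meta-analysis method to use. As one possible illustration, the sketch below applies Stouffer's method, which combines Z-scores as the weighted sum divided by the square root of the sum of squared weights. The function name and the example values are hypothetical. Stouffer's method also assumes independent Z-scores, which may not hold for tissue networks derived from overlapping data, so treat the combined score as indicative only.

```python
import math


def stouffer_z(z_scores, weights=None):
    """Combine Z-scores with Stouffer's method.

    With no weights given, every tissue contributes equally.
    """
    if not z_scores:
        raise ValueError("at least one Z-score is required")
    if weights is None:
        weights = [1.0] * len(z_scores)
    if len(weights) != len(z_scores):
        raise ValueError("weights and z_scores must have the same length")
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical per-tissue key gene Z-scores for a single gene.
    per_tissue = [2.0, 2.0, 2.0, 2.0]
    print(stouffer_z(per_tissue))  # 8 / sqrt(4) = 4.0
```
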
https://github.com/molgenis/systemsgenetics/tree/master/Downstreamer