Downstreamer
Downstreamer can be used to perform key gene prioritization using GWAS summary statistics. We do this using 57 tissue-specific co-expression networks derived from the Recount3 data.
1️⃣ Getting started
2️⃣ Running PascalX to obtain gene p-values
3️⃣ Tissue enrichment
4️⃣ Key gene enrichment
5️⃣ Code availability
Download tool and reference data here: https://downloads.molgeniscloud.org/downloads/downstreamerRelease2.tar.gz
This includes the files that are needed for PascalX
Downstreamer needs gene level p-values for the analysis. PascalX can be used to convert the variant level summary statistics of GWAS to gene level summary statistics.
The instructions for doing so are listed here: PascalX for Downstreamer
In principle, Downstreamer can also use gene p-values from another source. This is, however, not recommended, as you would then also need to create a new null distribution for the gene p-values.
The expected format of gene p-values is a tab-separated file with 4 columns:
Column name | Description |
---|---|
gene | The name of the gene |
pvalue | The gene p-value |
nsnps | The number of SNPs on which p-value is based. Can be 1 for all if not applicable |
min_pvalue | The smallest SNP p-value. Can be zero for all if not applicable |
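Before passing a gene p-value file from another source to Downstreamer, it can help to check that it matches the table above. The following is a minimal sketch of such a check, not part of Downstreamer itself. It assumes that the file starts with a header line holding the four column names in the order listed; the function name `read_gene_pvalues` and the gene names in the example are hypothetical.

```python
import csv
import io

# Column names taken from the table above; their order and the presence
# of a header line are assumptions of this sketch.
EXPECTED_COLUMNS = ["gene", "pvalue", "nsnps", "min_pvalue"]


def read_gene_pvalues(handle):
    """Parse a tab-separated gene p-value table and run basic sanity checks."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(
            f"expected columns {EXPECTED_COLUMNS}, got {reader.fieldnames}"
        )
    rows = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        pvalue = float(row["pvalue"])
        min_pvalue = float(row["min_pvalue"])
        nsnps = int(row["nsnps"])
        if not 0.0 <= pvalue <= 1.0:
            raise ValueError(f"line {line_no}: pvalue {pvalue} outside [0, 1]")
        if not 0.0 <= min_pvalue <= 1.0:
            raise ValueError(f"line {line_no}: min_pvalue {min_pvalue} outside [0, 1]")
        if nsnps < 0:
            raise ValueError(f"line {line_no}: nsnps {nsnps} is negative")
        rows.append(
            {"gene": row["gene"], "pvalue": pvalue, "nsnps": nsnps, "min_pvalue": min_pvalue}
        )
    return rows


# Hypothetical two-gene example using the "not applicable" defaults
# (nsnps = 1, min_pvalue = 0) described in the table.
example = (
    "gene\tpvalue\tnsnps\tmin_pvalue\n"
    "GENE_A\t0.0123\t1\t0\n"
    "GENE_B\t0.5\t1\t0\n"
)
rows = read_gene_pvalues(io.StringIO(example))
print(len(rows), "genes parsed")
```

For a file on disk, the same function can be called with `open(path, newline="")` in place of the in-memory example.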
The gene-gene correlations of the null GWAS p-values are stored per chromosome arm and use the following naming scheme: `NAME_1_q_correlations.datg`
The `.datg` files and the corresponding `.rows.txt.gz` and `.cols.txt.gz` files can be created from a tab-separated `.txt` file using the `CONVERT_TXT` mode of Downstreamer.
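To illustrate the naming scheme, the sketch below lists the correlation file names that would be expected for a given prefix. It assumes autosomes 1 to 22 and the arm labels `p` and `q`; the set of chromosome arms actually present in the reference data may differ (for example, some arms may be absent). The function name `correlation_file_names` is hypothetical.

```python
def correlation_file_names(name, chromosomes=range(1, 23), arms=("p", "q")):
    """Build file names following the NAME_<chromosome>_<arm>_correlations.datg pattern.

    The default chromosome range and arm labels are assumptions of this
    sketch, not something confirmed by the Downstreamer documentation.
    """
    return [
        f"{name}_{chrom}_{arm}_correlations.datg"
        for chrom in chromosomes
        for arm in arms
    ]


names = correlation_file_names("NAME")
print(names[:2])
```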
Note: without updated null distributions the results might not be reliable.
First, we use Downstreamer to determine which tissues express the genes implicated by the GWAS, using a tissue enrichment analysis. By doing this, we make sure that the key gene predictions are driven by relevant co-expression rather than by tissue-specific expression.
First change the variables at the top of `runDownstreamerTissueEnrichment.sh` and `selectSignficantTissues.R`
Then run:
sh runDownstreamerTissueEnrichment.sh
Rscript selectSignficantTissues.R
This will prepare a parameter specifying which tissue-specific networks Downstreamer should use in the next step.
We are now ready to run the actual key gene prioritization.
Again change the variables, this time in `runDownstreamerKeygenePrediction.sh`, then run:
sh runDownstreamerKeygenePrediction.sh
The resulting key gene prioritizations per tissue can be found in: _keygene_enrichtments.xlsx
If needed the Z-scores of the different tissues can be meta-analyzed to obtain the final key gene prioritization score.
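The page does not state which meta-analysis method to use. As one common option, the sketch below combines per-tissue Z-scores for a single gene with the unweighted Stouffer method, Z = sum(z_i) / sqrt(k). This assumes the per-tissue Z-scores are independent, which may not hold for networks built from related tissues, so treat the result as illustrative only. The function name `stouffer_z`, the tissue labels, and the values are hypothetical.

```python
import math


def stouffer_z(z_scores):
    """Combine Z-scores with the unweighted Stouffer method.

    Assumes independent Z-scores; this is an illustrative choice, not a
    method confirmed by the Downstreamer documentation.
    """
    z = list(z_scores)
    if not z:
        raise ValueError("at least one Z-score is required")
    return sum(z) / math.sqrt(len(z))


# Hypothetical Z-scores of one gene in four tissue-specific networks.
tissue_z = {"tissue_a": 2.0, "tissue_b": 3.0, "tissue_c": 1.0, "tissue_d": 2.0}
combined = stouffer_z(tissue_z.values())
print(combined)
```

If the tissues differ strongly in sample size or relevance, a weighted variant may be more appropriate; the choice of weights is left open here.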
https://github.com/molgenis/systemsgenetics/tree/master/Downstreamer