---
title: "README"
author: "Pourya Naderi"
date: "2/17/2020"
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
Centrality in Pathways Manual and Walk Through
================================================
Overview
===============================================
This document is a walk-through of the code accompanying the publication "Revisiting the Use of Graph Centrality Models in Biological Pathway Analysis".
Citation
========
TBA
Dependencies and Installation guide
===================================
The provided code has the following dependencies: `graph`, `stringr`, `dplyr`, `magrittr`, `CHRONOS`, `pathview`, `KEGGgraph`, `Rgraphviz`, `igraph`, `org.Mm.eg.db`, `tm`, `stringi`, `sna`, `RBGL`, `tidyr`, `AnnotationDbi`, `org.Hs.eg.db`, `annotate`, `biomaRt`, `Hmisc`, `broom`, `xtable`. Make sure these packages are installed before running the scripts. To install them, you can use the following code chunk.
``` r
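# All packages listed above, from both CRAN and Bioconductor.
libList <- c("graph", "magrittr", "CHRONOS", "pathview", "KEGGgraph", "Rgraphviz",
             "igraph", "org.Mm.eg.db", "tm", "stringi", "stringr", "dplyr", "sna",
             "RBGL", "tidyr", "AnnotationDbi", "org.Hs.eg.db", "annotate",
             "biomaRt", "Hmisc", "broom", "xtable")
# BiocManager replaces the retired biocLite.R installer.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager", repos = "http://cran.us.r-project.org")
}
for (i in libList) {
  # Try CRAN first for packages that are not installed yet.
  if (!requireNamespace(i, quietly = TRUE)) {
    install.packages(i, repos = "http://cran.us.r-project.org")
  }
  # Fall back to Bioconductor if the package is still missing.
  if (!requireNamespace(i, quietly = TRUE)) {
    BiocManager::install(i)
  }
}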
```
File guides
-----------
The folders `human_data` and `mouse_data`, together with their subfolders, contain the background data, preprocessed pathways, and precomputed centrality values that accompany the code and the original manuscript. The folders `HUMAN_PATHWAYLIST` and `MOUSE_PATHWAYLIST` contain the raw, unprocessed XML files of the original pathways.
The folders `Human Processing` and `Mouse Processing` contain the scripts. Since the files in the two folders are identical, we will only review the contents of the `Human Processing` folder.
- `00-basicFunctions.R`: Contains basic functions used throughout the scripts.
- `01-00-PathwayParsing.R`: Retrieves KEGG pathways and processes them into graph objects. This script also calculates some basic statistics about the pathways.
- `01-01-CentralityTable.R`: Produces centrality values for all qualified pathways using the centrality models, as described in the manuscript.
- `02-pathwayCleaning.R`: Calculates the quantile score as described in the manuscript and filters out the pathways containing fewer than 5 cancer genes.
- `03-1-regressionAnalysis.R`: Computes the regression models for the percentage of cancer genes within each quantile-normalized score and generates reports.
- `03-2-KStests.R`: Computes Kolmogorov-Smirnov test statistics for comparing the distributions of cancer genes and non-cancer genes across all pathways.
- `03-3-pathwayWiseTesting.R`: Contains two-sample tests between the centrality values of cancer genes and non-cancer genes within each pathway.
- `04-01-pgr_sensitivity.R`: This script generates PageRank centrality values across different alpha values.
- `04-02-pgr_sensitivity_fit.R`: Repeats the regression analysis for the PageRank centrality values from the previous script.
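To give a feel for what the sensitivity and testing scripts do, here is a minimal sketch on a toy graph. It is not the code from the repository. It assumes that "alpha" refers to the PageRank damping factor (the `damping` argument of `igraph::page_rank`), and the graph, the node names, the alpha grid, and the cancer-gene labels are all made up for illustration.

``` r
library(igraph)

# Toy directed "pathway"; node names are hypothetical, not real genes.
edges <- data.frame(
  from = c("A", "A", "B", "B", "C", "D", "E"),
  to   = c("B", "C", "C", "D", "D", "E", "A")
)
g <- graph_from_data_frame(edges, directed = TRUE)

# Hypothetical grid of alpha (damping) values to scan.
alphas <- c(0.5, 0.7, 0.85, 0.95)

# One column of PageRank scores per alpha value, one row per node.
pgr <- sapply(alphas, function(a) page_rank(g, damping = a)$vector)
colnames(pgr) <- paste0("alpha_", alphas)

# Hypothetical labels marking which nodes count as cancer genes.
is_cancer <- V(g)$name %in% c("A", "D")

# Two-sample Kolmogorov-Smirnov test on the scores for one alpha value.
ks <- ks.test(pgr[is_cancer, "alpha_0.85"], pgr[!is_cancer, "alpha_0.85"])

print(round(pgr, 3))
print(ks$p.value)
```

With only five nodes the test has no real power. The sketch only shows the shape of the computation: a score matrix indexed by gene and alpha, then a comparison of the two groups of genes.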
References
==========