
Welcome to EPIraction 1.6, nextflow implementation.

The pipeline is computationally demanding. To run it efficiently and smoothly, all the scripts are wrapped in nextflow so that they can be executed on an HPC cluster. Running the pipeline requires downloading 1,179 Gb of bigWig files and 443 Gb of Hi-C data; links to these data are given below. Please estimate your resources carefully before downloading the data and attempting to run the pipeline.
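
The download sizes above add up to 1,179 + 443 = 1,622 Gb before any processed data or predictions are written. A minimal sketch of a free-space check follows; `has_free_space` is a hypothetical helper that is not part of EPIraction, and it assumes the sizes are gigabytes of 1024^3 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EPIraction.
# Usage: has_free_space <directory> <required_gigabytes>
# Returns 0 if the filesystem holding <directory> has at least that much free space.
has_free_space() {
    local dir=$1 required_gb=$2
    local available_kb
    # "df -P" prints one POSIX-format line per filesystem;
    # with -k the fourth column is the available space in 1024-byte blocks.
    available_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$available_kb" -ge $(( required_gb * 1024 * 1024 )) ]
}

# Example (commented out so that sourcing this file has no side effects):
# has_free_space /path/to/storage 1622 || echo "Not enough free space" >&2
```

The figure 1,622 covers only the bigWig and Hi-C downloads, so the real requirement is higher once intermediate and final files are included.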


Requirements:

Linux operating system with bash, perl, sort and other default executables.
nextflow - version 24.10.3 or above, with support for DSL2 and arrays
bedtools - any reasonably recent version
bigWigAverageOverBed -
bedToBigBed -
R - version 4.0.0 or above
pigz - any working version
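
Before going further, it can help to confirm that the executables listed above are on the PATH. The sketch below uses a hypothetical helper, `missing_tools`, which is not shipped with the pipeline. It only tests for presence and does not compare versions, and the name `Rscript` in the example is an assumption about how R is invoked.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EPIraction.
# Prints "missing: <name>" for every argument that is not found on the PATH
# and returns non-zero if at least one is missing.
missing_tools() {
    local tool status=0
    for tool in "$@"; do
        if ! command -v "$tool" > /dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Example (commented out so that sourcing this file has no side effects):
# missing_tools bash perl sort nextflow bedtools bigWigAverageOverBed bedToBigBed Rscript pigz
```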

R libraries:

R.utils 2.12.2 -
data.table 1.14.6 -
Rfast 2.0.6 -

Obtaining EPIraction data:

Create a dedicated folder ("$work_folder") that will store all the input data, all the processed data and all the predictions.
Download this file:
put it into $work_folder and extract its contents:
tar xzvf EPIraction.tar.gz

You need to download all bigWig files from: into "$work_folder/BigWig"
You need to download all Hi-C files from: into "$work_folder/HiC.files"
You need to download all RSEM files from: into "$work_folder/RSEM"
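
The three download targets above can be prepared and inspected with a few lines of bash. This is only a sketch: `make_data_folders` and `count_downloaded` are hypothetical helpers, and the archive extracted earlier may already create these subfolders, in which case `mkdir -p` simply leaves them untouched.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of EPIraction.

# Create the three download subfolders named in this README under <work_folder>.
make_data_folders() {
    local work_folder=$1 sub
    for sub in BigWig HiC.files RSEM; do
        mkdir -p "$work_folder/$sub" || return 1
    done
}

# Print the number of regular files directly inside <folder>.
count_downloaded() {
    local folder=$1
    echo $(( $(find "$folder" -maxdepth 1 -type f | wc -l) ))
}

# Example (commented out so that sourcing this file has no side effects):
# make_data_folders "$work_folder"
# count_downloaded "$work_folder/BigWig"
```

A file count is only a rough progress indicator and does not show whether the downloaded set is complete.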

You need to run:

to test if all necessary files are present.

Downloading and running the pipeline:

We recommend downloading the pipeline into a folder that is different from "$work_folder" above:
git clone

Edit the EPIraction.config.slurm file and set the variable
data_folder  = '*****'
to the actual full path of "$work_folder".
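
The data_folder edit can also be scripted. The sketch below assumes GNU sed (for in-place editing with `-i`) and that the value is written in single quotes on one line, as shown above. `set_data_folder` is a hypothetical helper, and it does not handle paths containing `|`, `&` or a backslash.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EPIraction.
# Usage: set_data_folder <config_file> <full_path_to_work_folder>
set_data_folder() {
    local config=$1 folder=$2
    # "|" is used as the sed delimiter so that slashes in the path need no escaping.
    # The pattern matches data_folder = '<anything>' with any spacing around "=".
    sed -i "s|data_folder *= *'[^']*'|data_folder  = '${folder}'|" "$config"
}

# Example (commented out so that sourcing this file has no side effects):
# set_data_folder EPIraction.config.slurm "$work_folder"
```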

Edit the "executor" section:
Set the "name" value, consult
The other parameters are less essential, consult

Edit the "process" section:
If you use environment modules, as we do, or a conda environment, please use the "beforeScript" variable. Please check that the job parameters "cpu", "memory" and "time" are compatible with your executor and are in the correct format.

Run the pipeline:
nextflow -c EPIraction.config.slurm run EPIraction

A complete run of the pipeline takes several hours. Please make sure that your Linux administrators do not kill the nextflow process during that time.
Nextflow stores and runs its own files within the "work" folder. However, the main location of all temporary and final files is "$work_folder".
The scripts are internally configured to locate the proper intermediate files. Nextflow caching is disabled, so do not rely on it.
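
Because the run lasts several hours, one common precaution is to detach the nextflow process from the terminal so that it survives a logout. The sketch below uses the standard `nohup` utility. `run_detached` and the log file name are hypothetical, and this guards only against terminal hangups, not against site policies that limit long-running processes on login nodes.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of EPIraction.
# Usage: run_detached <log_file> <command> [arguments...]
# Starts the command in the background, immune to hangups, with all output
# written to <log_file>. The process ID is stored in DETACHED_PID.
run_detached() {
    local log=$1
    shift
    nohup "$@" > "$log" 2>&1 < /dev/null &
    DETACHED_PID=$!
}

# Example (commented out so that sourcing this file has no side effects):
# run_detached EPIraction.log nextflow -c EPIraction.config.slurm run EPIraction
# echo "nextflow is running with PID $DETACHED_PID"
```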

All the output files will be stored within the "$work_folder/report" folder.