Workflows
Common workflows are described here.
Downloads the files specified for each sample in data.csv.
Inputs: data.csv
Outputs: nanopore-reads, illumina-reads
Options: --config data_csv=<path> (Default is data.csv)
--resources connections=<int>
The connections resource controls how many download jobs run simultaneously. The default is 1. Be careful not to set it so high that you overload your system!
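To illustrate what a cap on simultaneous downloads means, here is a minimal Python sketch. It is not how brefito or Snakemake implements the connections resource; the function name `run_downloads` and the `fetch` callable are hypothetical and exist only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_downloads(urls, fetch, connections=1):
    """Call fetch(url) for every URL, with at most `connections` calls running at once.

    `fetch` is a hypothetical stand-in for whatever actually downloads a file.
    """
    if connections < 1:
        raise ValueError("connections must be at least 1")
    # max_workers caps how many fetch() calls are in flight at the same time.
    with ThreadPoolExecutor(max_workers=connections) as pool:
        # map() returns results in the same order as the input URLs.
        return list(pool.map(fetch, urls))
```

With `connections=1` the downloads run one after another; raising the value trades system and network load for speed.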
Merges the raw reads corresponding to each sample into one file per type of read.
Inputs: data.csv
Outputs: merged-reads
Options: <same options as download-data>
--config breseq_options="<breseq_options>"
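The merge step can be pictured as grouping read files by sample and read type and concatenating each group. The sketch below assumes each input is described by a `(sample, read_type, path)` tuple and uses a made-up output naming scheme; neither is taken from brefito, and the real workflow may handle compression and naming differently.

```python
import shutil
from collections import defaultdict
from pathlib import Path


def merge_reads(read_files, output_dir):
    """Concatenate read files into one file per (sample, read_type).

    read_files: iterable of (sample, read_type, path) tuples (hypothetical layout).
    Returns a dict mapping (sample, read_type) to the merged file path.
    """
    groups = defaultdict(list)
    for sample, read_type, path in read_files:
        groups[(sample, read_type)].append(Path(path))

    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    merged = {}
    for (sample, read_type), paths in groups.items():
        # Hypothetical naming scheme, chosen only for this example.
        target = output_dir / f"{sample}.{read_type}.fastq"
        with open(target, "wb") as out:
            for path in paths:
                # Byte-wise concatenation, in the order the files were listed.
                with open(path, "rb") as src:
                    shutil.copyfileobj(src, out)
        merged[(sample, read_type)] = target
    return merged
```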
Merges the trimmed reads corresponding to each sample into one file per type of read.
Inputs: data.csv
Outputs: merged-reads-trimmed
Options: <same options as download-data>
--config breseq_options="<breseq_options>"
Runs breseq using the reference files and trimmed read files.
Inputs: data.csv
Outputs: breseq-references/data, breseq-references/html, breseq-references/gd
Options: <same options as download-data>
--config BRESEQ_OPTIONS="<breseq_options>"
Options that get passed to breseq
--config BRESEQ_THREADS=<int>
Override the default number of threads for each breseq job.
--config No_DEFAULT_BRESEQ_OPTIONS=<bool>
Don't pass the default option of -x to breseq when using nanopore reads
Runs breseq using the reference files and trimmed read files. Then runs breseq CL-TABULATE on the aligned reads to create a CSV file that counts how many reads have different numbers of bases in each mononucleotide repeat with at least a certain minimum length in the reference file.
Inputs: data.csv
Outputs: breseq-references/ssrs
Options: <same options as download-data>
--config ssr_minimum_length=<int>
Minimum length (--minimum-length) parameter passed to `breseq CL-TABULATE`
--config ssr_strict_mode=<bool>
Pass the `--strict` parameter to `breseq CL-TABULATE`.
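To make "mononucleotide repeat with at least a certain minimum length" concrete, the following Python sketch lists such repeats in a reference sequence. It only illustrates the concept; it is not what `breseq CL-TABULATE` does internally, and the function name is hypothetical.

```python
import re


def find_mononucleotide_repeats(sequence, minimum_length):
    """Return (start, base, length) for every run of one base that is
    at least `minimum_length` long. `start` is a 1-based coordinate.
    """
    if minimum_length < 1:
        raise ValueError("minimum_length must be at least 1")
    # One base followed by at least (minimum_length - 1) copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (minimum_length - 1))
    return [
        (match.start() + 1, match.group(1), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Runs of 5 A's and 6 T's; prints [(4, 'A', 5), (9, 'T', 6)]
    print(find_mononucleotide_repeats("ACGAAAAATTTTTTGC", 5))
```

Raising the minimum length shrinks the list: with a value of 6, only the run of T's in the example is reported.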
Runs predict-mutations-breseq and then generates HTML compare tables to summarize similarities and differences between samples. Different compare table files are created for each set of samples that were compared against different reference sequences.
Inputs: data.csv
Outputs: breseq-references/compare[_#].html, breseq-references/html, breseq-references/gd
Options: <same options as predict-mutations-breseq>
Runs breseq BAM2COV to create coverage plots tiling the reference genome.
Inputs: breseq-references/data
Outputs: breseq-references/cov
Options: <same options as predict-mutations-breseq>
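As a rough picture of what "tiling the reference genome" means, the Python sketch below splits a genome of a given length into consecutive plot windows. The function, its parameters, and the optional overlap are assumptions made for this example and do not describe the actual tiling logic of `breseq BAM2COV`.

```python
def tile_regions(genome_length, tile_size, overlap=0):
    """Return 1-based, inclusive (start, end) windows covering the genome.

    Consecutive windows share `overlap` positions; the last window is
    truncated at the end of the genome.
    """
    if tile_size <= 0 or not 0 <= overlap < tile_size:
        raise ValueError("need tile_size > 0 and 0 <= overlap < tile_size")
    step = tile_size - overlap
    regions = []
    start = 1
    while start <= genome_length:
        end = min(start + tile_size - 1, genome_length)
        regions.append((start, end))
        if end == genome_length:
            break  # the genome is fully covered
        start += step
    return regions


if __name__ == "__main__":
    # prints [(1, 4), (5, 8), (9, 10)]
    print(tile_regions(10, 4))
```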
Uses gdtools from breseq to apply the GenomeDiff files in genome_diff to generate updated reference genomes that include those mutations. One GenomeDiff file is expected per sample with the *.gd file ending. These could be copied from a breseq-*/gd directory and then manually edited to curate the mutations they describe.
Inputs: data.csv, genome-diffs/*.gd
Outputs: mutants
Options: <same options as download-data>
You can use predict-mutations-breseq-mutants after this command to re-run breseq using the input reads against the hypothesized mutant genome sequences. If their lists of mutations are correct and complete, the output should now show no predicted mutations.
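The idea of applying a list of mutations to a reference to obtain a mutant sequence can be sketched in a few lines of Python. This toy version is not gdtools and does not parse GenomeDiff files; it assumes a simplified tuple format with 1-based positions, non-overlapping mutations, and insertions placed after the given position.

```python
def apply_mutations(reference, mutations):
    """Apply simple, non-overlapping mutations to a sequence string.

    Each mutation is a tuple with a 1-based position (simplified, hypothetical format):
      ("SNP", position, new_base)   replace one base
      ("DEL", position, size)       delete `size` bases starting at position
      ("INS", position, new_seq)    insert new_seq after position
    """
    seq = reference
    # Work from the rightmost mutation to the leftmost so that earlier
    # edits never shift the coordinates of mutations still to be applied.
    for kind, position, value in sorted(mutations, key=lambda m: m[1], reverse=True):
        index = position - 1
        if kind == "SNP":
            seq = seq[:index] + value + seq[index + 1:]
        elif kind == "DEL":
            seq = seq[:index] + seq[index + value:]
        elif kind == "INS":
            seq = seq[:position] + value + seq[position:]
        else:
            raise ValueError(f"unsupported mutation type: {kind}")
    return seq


if __name__ == "__main__":
    # prints ATGCGGGT
    print(apply_mutations("ACGTACGT", [("SNP", 2, "T"), ("DEL", 4, 2), ("INS", 7, "GG")]))
```

If the mutation list is correct and complete, reads from the evolved sample should match the resulting sequence, which is why re-running breseq against it should report no mutations.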
Generates files that can be loaded in IGV to view sequences (FASTA/FAI), reads (BAM/BAI), and annotations (GFF). Maps reads to the provided reference using minimap2 for nanopore reads and bowtie2 for Illumina reads.
Inputs: data.csv
Outputs: align-reads-references/data
Options: <same options as download-data>
Analyzes and plots soft-clipped reads after mapping.
Inputs: align-reads-references/data
Outputs: align-reads-references/soft-clipping
Options: <same options as download-data>
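For readers unfamiliar with soft clipping: an aligner marks read ends that do not match the reference with `S` operations in the CIGAR string of the alignment. The Python sketch below extracts the soft-clipped length at each end of a read from a CIGAR string. It shows the concept only and is not the analysis this workflow performs; the function name is hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code from the SAM format.
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clipped_lengths(cigar):
    """Return (left, right) soft-clipped base counts for a CIGAR string."""
    ops = [(int(length), op) for length, op in _CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips at either end; skip them.
    ops = [(length, op) for length, op in ops if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" for an unaligned read
    left = ops[0][0] if ops[0][1] == "S" else 0
    right = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (left, right)


if __name__ == "__main__":
    # prints (5, 12)
    print(soft_clipped_lengths("5S83M12S"))
```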
Combines annotations of genes from prokka with annotations of IS elements from isescan into a final GenBank file for each sample.
Inputs: data.csv, references
Outputs: annotated-references
Options: <same options as download-data>
- Autocycler is not available on bioconda. Download the release for your OS from the Autocycler GitHub.
- DO NOT download the binary into the brefito folder. This interferes with the execution of Snakemake.
- Add the path to the folder that contains the autocycler binary to your $PATH variable.
- Clone the Autocycler repository anywhere on your system. DO NOT clone it into the brefito folder.
  git clone https://github.com/rrwick/Autocycler.git
- Add the path to the scripts/ folder of this repository to your $PATH.
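After editing $PATH, a quick check that the binary can actually be found may save a failed run. The Python sketch below uses the standard library's `shutil.which` for this; the helper name `missing_tools` is hypothetical, and the only tool name taken from the steps above is `autocycler`.

```python
import shutil


def missing_tools(tool_names, search_path=None):
    """Return the tool names that cannot be found as executables.

    search_path: a PATH-style string to search; None means the current $PATH.
    """
    return [
        name for name in tool_names
        if shutil.which(name, path=search_path) is None
    ]


if __name__ == "__main__":
    missing = missing_tools(["autocycler"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found.")
```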
Use autocycler to generate a consensus assembly for each sample.
Inputs: data.csv
Outputs: autocycler/{sample}/output/consensus_assembly.fasta
Options: --config genome_size=<int>
Required. The estimated genome size in bases (e.g., 4600000).