Module documentation format #1
Comments
Suggestions: don’t prefix each line with |
With some x-talk with @ewels, let's try simple YAML. It is no effort at all to parse in most languages, and with a regex like
Everything that does not look like YAML can easily be ignored (it is probably a usual code comment). What information do we want to display? I will start with a list:
Description: Just a general description of the purpose of the process / function.
Keywords: One or more keywords, so that processes can be grouped by keyword.
Tools: A list of tool objects used in a process. A tool object can contain fields like
Input: A list of Nextflow input definitions, which follow the format
Maybe two fields here: the definition and a description?
Output: Same as input.
Authors: A list of GitHub users who contributed to the process.
Example: What would this look like?
/*
description: Simply FASTQC
keywords:
  - Quality Control
  - QC
tools:
  - fastqc:
      description: <description here>
      homepage: https://superhomepage.edu
      doi: <doi here>
input:
  - reads:
      type: file
      description: <description here>
  - sample_id:
      type: string
      description: <description here>
output:
  - report:
      type: file
      # quoted: a plain YAML scalar cannot start with "*"
      schema: "*_fastqc.{zip,html}"
authors:
  # quoted: a plain YAML scalar cannot start with "@"
  - "@sven1103"
  - "@drharshil"
*/
process fastqc {
    tag "$sample_id"
    publishDir "${params.outdir}/fastqc", mode: 'copy',
        saveAs: {filename -> filename.indexOf(".zip") > 0 ? "zips/$filename" : "$filename"}

    input:
    set val(sample_id), file(reads)

    output:
    file "*_fastqc.{zip,html}"

    script:
    """
    fastqc -q $reads
    fastqc --version &> fastqc.version.txt
    """
}
This is just an example, we can work out the details. But seeing the code makes it easier to communicate what we are talking about :D |
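A minimal sketch of how a tool might pull the documentation block out of a module file, assuming the block is the first /* ... */ comment in the source. The regex mentioned earlier in the thread is not preserved here, so the pattern below and the helper name extract_doc_block are illustrative assumptions, not the proposed implementation.

```python
import re
from typing import Optional

# Hypothetical pattern (not the regex from the thread): capture the body of
# the first /* ... */ block comment in a Nextflow source file.
DOC_BLOCK = re.compile(r"/\*(.*?)\*/", re.DOTALL)


def extract_doc_block(source: str) -> Optional[str]:
    """Return the text inside the first /* ... */ comment, or None if absent."""
    match = DOC_BLOCK.search(source)
    if match is None:
        return None
    # Drop only the newlines that directly follow "/*" and precede "*/".
    return match.group(1).strip("\n")


example = '''/*
description: Simply FASTQC
keywords:
  - QC
*/
process fastqc {
}
'''

block = extract_doc_block(example)
print(block)
```

The returned string could then be handed to any YAML parser, for example PyYAML's yaml.safe_load if that package is installed.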
I think we should try to parse everything inside the comment block as YAML. Guessing which bits are YAML and which bits are comment is a bit of a faff (there can always be yaml comments!). Otherwise, I think this all looks great! Only thing I notice is that the inputs should be a list of a list, as there can be multiple input channels, each of which can have multiple definitions. So more like:
input:
  - - reads:
        type: file
        description: <description here>
    - sample_id:
        type: string
        description: <description here>
Then you can have, for example:
input:
  -
    - reads:
        type: file
        description: <description here>
    - sample_id:
        type: string
        description: <description here>
  -
    - index:
        type: file
        description: Second input channel for a reference or whatever
This YAML syntax is a bit confusing to look at, so will definitely need some linting with nice helpful error messages 😉 |
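As a rough illustration of the linting idea above, here is a sketch that checks the list-of-lists shape on already-parsed data (plain Python lists and dicts, as a YAML parser would produce). The function name lint_inputs, the message wording, and the rule that every definition needs a type field are assumptions made for the example, not an agreed specification.

```python
def lint_inputs(inputs):
    """Check that `inputs` is a list of channels, each a list of one-key mappings.

    Returns a list of human-readable messages; an empty list means no problems.
    """
    if not isinstance(inputs, list):
        return ["'input' must be a list of channels"]
    errors = []
    for c, channel in enumerate(inputs, start=1):
        if not isinstance(channel, list):
            errors.append(
                f"input channel {c}: expected a list of definitions, "
                f"got {type(channel).__name__}"
            )
            continue
        for d, definition in enumerate(channel, start=1):
            if not (isinstance(definition, dict) and len(definition) == 1):
                errors.append(
                    f"input channel {c}, definition {d}: "
                    "expected a single 'name:' mapping"
                )
                continue
            name, fields = next(iter(definition.items()))
            # Assumption for this sketch: every definition must declare a type.
            if not isinstance(fields, dict) or "type" not in fields:
                errors.append(
                    f"input channel {c}, definition {d} ({name}): missing 'type'"
                )
    return errors


# Two input channels, mirroring the second YAML example above.
two_channels = [
    [
        {"reads": {"type": "file", "description": "<description here>"}},
        {"sample_id": {"type": "string", "description": "<description here>"}},
    ],
    [
        {"index": {"type": "file", "description": "Second input channel"}},
    ],
]

# A flat list (the shape from the first proposal) is reported per entry.
flat = [{"reads": {"type": "file"}}, "sample_id"]

print(lint_inputs(two_channels))
for message in lint_inputs(flat):
    print(message)
```

Collecting messages in a list rather than raising on the first problem lets a linter report every issue in one pass.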
OK, I agree. All-or-nothing parsing :) But people could still have usual comment blocks, and we should not restrict them from doing so. So I suggest letting the linter throw a warning if a comment block cannot be parsed as YAML. |
Discussing at the hackathon - suggestion is that we should have this meta information as a separate file so that it is easier to parse by other tools (including nextflow itself). If it's in a comment then it will be very difficult to get in to nextflow. We could copy bioconda and have a Note that we need things to be organised in directories for this. But we should probably have that anyway. |
Addressed in #9 |
In the context of the discussion in #8, I was wondering if the
|
Discussion at another hackathon - general consensus was that the current system of using separate YAML files is probably best. I think that we can close this issue now. |
We need to decide how best to be able to document each individual module itself e.g. what is this module doing, keywords for findability, links to homepage per tool used in the process etc. @sven and I came up with a rudimentary version of this but I think we will need more discussion to get this right.
It would also be good to be able to generate automated docs for the types of objects that are required as input: and output: for each module, the script: section and any other information that may be useful. @sven suggested we may be able to get this directly by plugging into NF.
This is all still open for discussion so please chime in if you have some ideas.