Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify namespacing #340

Closed
wants to merge 9 commits into from
Closed

Conversation

DavyCats
Copy link
Contributor

@DavyCats DavyCats commented Nov 8, 2019

This PR aims to clarify the namespacing of calls to workflows. Currently miniwdl and cromwell handle this differently (at least in the input JSON; see also #329), so some clarification seems to be needed in the spec.

Pinging @mlin and @cjllanwarne as engine implementers.

@mlin
Copy link
Member

mlin commented Nov 10, 2019

Hi @DavyCats, thanks for pushing this forward! To be clear, is this saying the JSON key for a subworkflow input should not include the subworkflow name following the call name (as Cromwell currently expects)? If so, I like that on first principles but we probably need to talk about how breaking this is for Cromwell and existing input JSON files.

@patmagee
Copy link
Member

@DavyCats I like this as well. The namespacing has always felt a little weird. There are a considerable number of breaking changes currently in the propsosed spec, I wonder if we could do this in a non-breaking way: ie Propose that <wf>.<task>.<subworkflow>.<input> is actually an alias for <wf>.<task>.<input> where the optional <subworkflow> will be deprecated in a future version of the WDL specification. @mlin @cjllanwarne as implementors would this feel better?

@mlin
Copy link
Member

mlin commented Nov 12, 2019

Generally sounds reasonable to me. There might be some dumb corner cases to worry about, for example if the subworkflow has a call given the same name as the subworkflow itself:

  • lib.wdl:
version 1.0

workflow subsubwf {
  input {
    Int bar
  }
}
  • subwf.wdl:
version 1.0

import "lib.wdl"

workflow subwf {
  input {
    Int bar
  }

  call lib.subsubwf as subwf
}

Then with the aliasing, wf.callname.subwf.bar would be an ambiguous reference. I'm not saying this dumb case should shoot down the overall proposal, just that we'd need a rule for resolving it.

@DavyCats
Copy link
Contributor Author

@mlin I feel like that should be a name collision:

"[...]The names of the contained namespaces, tasks, and workflow need to be unique within that namespace. For example, one cannot import two workflows while they have the same namespace identifier. Additionally, a workflow and a namespace both named foo cannot exist inside a common namespace. Similarly there cannot be a task foo in a workflow also named foo. [...]"

last part of https://github.com/openwdl/wdl/blob/master/versions/development/SPEC.md#namespaces

@patmagee
Copy link
Member

Hey @mlin, in the example you gave, wouldnt wf.subwf.subsubwf.bar always evauluate to the bar input of subsubwf even when you dealias and turn it back intowf.subwf.bar?

@DavyCats
Copy link
Contributor Author

@patmagee I made some adjustments based on your comment. Is this what you had in mind?

@patmagee
Copy link
Member

@DavyCats that makes alot sense to me. @mlin can you verify that this works?

@patmagee patmagee added the spec change A suggested change to the WDL specification. label Nov 20, 2019
@patmagee patmagee added this to the v2.0 milestone Nov 20, 2019
@mlin
Copy link
Member

mlin commented Nov 21, 2019

My crappy example #340 (comment) currently validates with both womtool and miniwdl. I'm open to simply declaring them both in spec violation 😎 The section @DavyCats cited might be clarified

Similarly there cannot be a task call foo in a workflow also named foo

Def interested in Cromwell team input here

@rhpvorderman
Copy link
Contributor

rhpvorderman commented Nov 22, 2019

Nice work @DavyCats!

The names of the contained namespaces, tasks, and workflow need to be unique within that namespace.

Should we also specify inputs here? For example

version 1.0

import foo_workflows.wdl as foo_wf
workflow DirtyNamespace {
    input {
        String foo
    }
    call foo_wf.foo as foo {
          input:
               foo=foo
     }
}

DirtyNamespace.foo as both a call and an input here. So how does that work according to the spec?

@DavyCats
Copy link
Contributor Author

DavyCats commented Dec 4, 2019

After some consideration of @rhpvorderman's comment and reading the namespaces section a few times, I found that it was a bit hard to follow (it doesn't clearly define what a namespace is) and thus decided to rewrite it entirely. It also contained some notes about imports, which seemed redundant and out of place, which I have now removed.

My intend is for the specifications to be the same as before, just spelled out a bit more clearly. Please let me know if this should be changed (or reverted).

@DavyCats DavyCats changed the title clarify subworkflow namespacing clarify namespacing Dec 4, 2019
@mlin
Copy link
Member

mlin commented Dec 6, 2019

chanzuckerberg/miniwdl#309

There's a branch on miniwdl which (i) prevents calls from having the same name as their containing workflow and (ii) attempts to delete the subworkflow name from the input JSON keys. Seems OK so far

mlin added a commit to chanzuckerberg/miniwdl that referenced this pull request Dec 17, 2019
To supply an optional input for a subworkflow call, Cromwell input JSON expects the key to include (i) the workflow name, (ii) the call name, (iii) the subworkflow name, and finally (iv) the input name. The need to include (iii) the subworkflow name is questionable, as discussed on openwdl/wdl#340

This change makes miniwdl accept such inputs either with or without the subworkflow name, by removing it if it's present. To ensure unambiguous resolution, also forbids collision between a call's name and that of its containing workflow.

#193
@cjllanwarne
Copy link
Contributor

Sorry that I didn't spot this PR earlier - what you found in Cromwell looks like a bug rather than intended design since Cromwell (and WDL!) never intentionally allowed passing inputs through to subworkflow calls from input json.

Since #347 is addressing that - and in theory we'll be "adding" it for the first time, I'd be inclined to remove the "backwards-compatible-with-a-bug" option and define this as the only logical (at least to me...) syntax of wf.call.subcall.subcall.inputname.

@DavyCats
Copy link
Contributor Author

DavyCats commented Jan 6, 2020

Thanks @cjllanwarne, I guess it was a common misconception that inputs could be passed to sub workflows through the inputs json. I have removed the "backwards-compatible-with-a-bug" option.

@rhpvorderman
Copy link
Contributor

rhpvorderman commented Jan 14, 2020

Can #347 be closed if this is accepted? I think so. @DavyCats what do you think?

Also. I think we should get this into voting. I believe Cromwell 48 49 is waiting on a decision to support namespaced inputs in broadinstitute/cromwell#5317. I believe @cjllanwarne coded the input JSON parsing to easily allow support for namespaced inputs on purpose. But currently it is not turned on because the current development spec does not allow it.

EDIT: Cromwell 48 has already been released yesterday.

@DavyCats
Copy link
Contributor Author

I'm not sure, the changes in #347 should still be made I think but we could fold it into this PR. This PR doesn't (in its current state) alter the fact that the spec explicitly disallows the provision of inputs to subworkflows by the end-user.

@rhpvorderman
Copy link
Contributor

Well in that case we can also take the time to carefully evaluate this PR and make sure #347 is merged quickly.

@jdidion jdidion changed the base branch from master to main June 29, 2020 14:06
@patmagee
Copy link
Member

patmagee commented Aug 28, 2020

@DavyCats is this ready for voting?

@DavyCats
Copy link
Contributor Author

@patmagee Yes, I believe so

@patmagee patmagee added the voting active Changes for which voting is active. label Aug 31, 2020
@aednichols
Copy link
Contributor

👍

@patmagee
Copy link
Member

patmagee commented Nov 3, 2020

@mlin @cjllanwarne @jdidion before I pass this and merge, can you please give your final thoughts on it

@@ -185,7 +185,7 @@ workflow wf {

This describes a task, called 'hello', which has two parameters (`String pattern` and `File in`). A `task` definition is a way of **encapsulating a UNIX command and environment and presenting them as functions**. Tasks have both inputs and outputs. Inputs are declared as declarations at the top of the `task` definition, while outputs are defined in the `output` section.

The user must provide a value for these two parameters in order for this task to be runnable. Implementations of WDL should accept their [inputs as JSON format](#specifying-workflow-inputs-in-json). For example, the above task needs values for two parameters: `String pattern` and `File in`:
The user must provide a value for these two parameters in order for this task to be runnable. Implementations of WDL should accept their [inputs as JSON format](#specifying-workflow-inputs-in-json). The inputs described in such a JSON file should be fully qualified according to the namespacing rules described in the [Fully Qualified Names & Namespaced Identifiers](#fully-qualified-names--namespaced-identifiers) section. For example, the above task needs values for two parameters: `String pattern` and `File in`:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should/must/

}
```
The following namespaces exist:
* A WDL file: When imported the name equals that of the basename of the file by default, but may be aliased using the `as identifier` syntax.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add link to relevant place in the spec

* A WDL file: When imported the name equals that of the basename of the file by default, but may be aliased using the `as identifier` syntax.
* May contain namespaces, tasks, structs (please see the notes at [Importing Structs](#importing-structs)) and at most one workflow.
* A call (of a task or workflow): The name equals that of the called task or workflow by default, but may be aliased using the `as identifier` syntax.
* May contain inputs, outputs, runtime_attributes (if the call is to a task), variables (accessibility limited by [scope](#scope)) and calls (if the call is to a workflow).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add link to relevant place in the spec

@jdidion
Copy link
Collaborator

jdidion commented Nov 3, 2020

👍 with minor comments

@mlin
Copy link
Member

mlin commented Nov 4, 2020

LGTM, my only comment re @jdidion on the JSON is that the first namespace -- the name of the top-level workflow being run -- doesn't constrain or disambiguate anything, so I think it's nice to leave a little wiggle room to omit it from the JSON keys.

@DavyCats
Copy link
Contributor Author

DavyCats commented Nov 4, 2020

I addressed your comments @jdidion.

@jdidion
Copy link
Collaborator

jdidion commented Feb 1, 2021

This is in v1.1 and will be propagated to development.

@jdidion jdidion closed this Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec change A suggested change to the WDL specification. voting active Changes for which voting is active.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants