-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify optional and nonempty type constraints #290
Conversation
versions/development/SPEC.md
Outdated
Sometimes, optional parameters need a string prefix. Consider this task: | ||
These constraints also apply to function arguments, with exceptions for string concatenation, described below, and the equality/inequality operators `==` and `!=`, which can compare two values of types differing only in their quantifiers, considering `None` equal to itself but no other value. | ||
|
||
Quantifiers may apply within compound types; for example, if `a : Array[T?]+`, then `a[0] : T?`. An array type can be both non-empty and optional, `Array[T]+?`, such that the value can be `None` or a non-empty array, but not the empty array. (Reliance on this subtle distinction is discouraged, but it arises from other constructs such as a conditional call with an `Array[T]+` output.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's valid to discourage using Array[T]+?
, consider you might have an optional input which may occur multiple times (eg. giving multiple bed/interval files to a variantcaller). It's optional, thus the ?
, and giving an empty array might result in an invalid command.
eg:
There are two valid ways of calling a script: someScript
or somScript -i 1.txt -i 2.txt ...
The command block might looks like this: someScript ~{true="-i " false="" defined(inputs)}~{sep="-i " inputs}
- Using
Array[File]
orArray[File]+
would not allow for the first option. - Using
Array[File]?
would allow for an empty array which would prduce the invalid commandsomeScript -i
- So your only option in to use
Array[File]+?
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think its fair to say the parentheses statement is more of my editorial opinion and can be stricken.
versions/development/SPEC.md
Outdated
|
||
Interpolations with `~{}` and `${}` accept optional string expressions and substitute the empty string for `None` at runtime. | ||
|
||
String concatenation with the `+` operator has special typing properties. When applied to two non-optional operands, the result is a non-optional `String`. However, if either operand has an optional type, then the concatenation has type `String?`, and the runtime result is `None` if either operand is `None`. These unusual semantics facilitate command-line flag interpolation. Consider this task: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not obvious to me, thank you for pointing it out. For ease of reference, this is the task:
task test {
input {
String? val
}
command {
python script.py ${"--val=" + val}
}
}
I would much rather not overload the +
operator to handle optional strings in a special way. Perhaps the die is already cast, because ${"--val=" + val}
is a much loved, common idiom. If at all possible, I would rather use a different symbol than +
, to avoid overloading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is technically new, FWIW - right now it's only permissible as a special case in string interpolation, and an error elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with the special case although I speculate it's actually a bit trickier to implement, propagating the flag of whether we're inside an interpolation or not when typechecking expressions...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also seems to open the Pandora's box for operators (although that might not be a bad thing), IE should we allow the same logic to be applied to other Types and operators?
Int? one
Int? two
Int? subtract = one - two
Int? add = one + two
Int? multiply = one * two
Int? all = ((one * two) + one) - two
To me, it seems like it could be a bad idea to overload the +
operator and not do the same overloading on all other operators. I think it will likely cause confusion, and I am sure people will inevitably ask for it if in other contexts if we introduce it to string concatenation.
versions/development/SPEC.md
Outdated
@@ -2327,9 +2327,17 @@ Then the command would be instantiated as: | |||
/bin/mycmd /path/to/c.txt | |||
``` | |||
|
|||
## Prepending a String to an Optional Parameter | |||
In general, it's invalid to supply an expression of optional type `T?` for a non-optional input or variable of type `T`, as the latter cannot accept `None`. (The idiom `select_first([expr, default])` coerces `expr : T?` to `T` by substituting `default : T` when `expr` is undefined; if `default` is omitted, then the coercion fails at runtime.) Similarly, an `Array[T]` expression does not satisfy an `Array[T]+` input. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A way of coercing Array[T]
to Array[T]+
will need to be added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, it will be problematic to statically enforce +
without also adding such a coercion. We can also contemplate leaving +
as more of a runtime assertion, with the downside of nonuniformly treating ?
and +
. Discussion on this welcome. cc @orodeh
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mlin, thanks for straightening out these issues.
My personal opinion is that having +
as a non-empty array qualifier is fine, but I prefer that it's implementation be done as a runtime assertion. If it is part of the type system, then we run into a combinatorial explosion of types. For example:
Array[String], Array[String]+
Array[Array[String]], Array[Array[String]]+, Array[Array[String]+], Array[Array[String]+]+
However, I would argue that most people find two types sufficient here: Array[String]
and Array[Array[String]]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't mind (/actively like) the fact that you can nest this as a type constraint inside other types. It makes the logic much easier to implement, at least...
I agree though that it should be a coercion (and thus enforced via a runtime check) to go from Array[X]
to Array[X]+
. Anything else (eg engine functions) is going to be a nightmare to apply correctly in the more complex cases here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cjllanwarne just clarifying -- do you want to see +
enforced as a runtime assertion rather than statically? (I'm on the fence at this point)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+
to me has always seemed like a strange type. To me its a more of a runtime decorator that is an assertion check as opposed to a Type
check. I for one do not even mind if it is removed from the specification entirely, however that likely would cause people to raise their pitchforks in protest.
Realistically, the only guarantee that Array[X]+
provides is that Array[0]
will not be None
. This is not a strong enough argument to assert that it is a separate type completely IMO, and it SHOULD be treated as a normal Array[]
type. Ie the following should be valid with no need to coercion:
Array[X]+ myArray
Array[X] otherArray = myArray
The one thing we need to be cognisant of however, is that be relegating the +
type to a decorator
and not a type, we essentially are introducing a new concept to WDL. That of Decorator
expressions, which serve as runtime hints
that should be enforced by the engine, but should not change the typeing of the variable they are on
versions/development/SPEC.md
Outdated
Sometimes, optional parameters need a string prefix. Consider this task: | ||
These constraints also apply to function arguments, with exceptions for string concatenation, described below, and the equality/inequality operators `==` and `!=`, which can compare two values of types differing only in their quantifiers, considering `None` equal to itself but no other value. | ||
|
||
Quantifiers may apply within compound types; for example, if `a : Array[T?]+`, then `a[0] : T?`. An array type can be both non-empty and optional, `Array[T]+?`, such that the value can be `None` or a non-empty array, but not the empty array. (Reliance on this subtle distinction is discouraged, but it arises from other constructs such as a conditional call with an `Array[T]+` output.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the only valid form Array[T]+?
or can it also be written Array[T]?+
? The spec isn't explicit about this as far as I can see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that technically you can only apply +
to arrays. Thus, since Array[T]?
is not an array (it's an optional), the form Array[T]?+
would be invalid
@cjllanwarne @orodeh @DavyCats @pshapiro4broad to try the feedback on for size, I changed nonempty-array to be a runtime-only check, constrained the special string concatenation behavior to apply inside interpolations only, (and struck my editorial comment about |
Looks good to me. I am still not a fan of allowing |
@mlin This looks good to me. As you've done here, it would be nice to make the spec explicit everywhere about run-time vs compile-time behaviors; you shouldn't have to define terminology here for that. That would something for another PR though.
I agree that it's messy but am not sure what a better solution would be. Maybe a different syntax for string interpolation that makes it clear that a list of fields must all not |
@mlin where does this stand? |
@geoffjentry I think it's ready for a 'last call' for comments before voting -- I'm happy to address anything that comes in |
+1 |
2 similar comments
👍 |
👍 |
The ayes have it and this is moving to |
Cromwell conforms to this behavior, making it implemented. Merging |
Following discussion on issue #271, this PR is a straw-man resolution to current ambiguity over how/when the optional and nonempty type quantifiers are "enforced." It will definitely need debate and discussion, so fire away!
A couple of comments to start:
select_first
idiom is workable but not exactly loved; there's some discussion on Ease optional access with scatter-like block #187 about nicer sugar which could follow on.Array[T]
toArray[T]+
right now, which is needed if we want to enforce nonempty in the same way we propose to statically validate optionals. (And it would be a little weird to enforce one of them statically and the other only at runtime, right?)