-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mini-RFC: Finalize syntax of impl Trait
and dyn Trait
with multiple bounds
#2250
Conversation
( The rendered link should probably point to: https://github.com/petrochenkov/rfcs/blob/impldyn/text/0000-finalize-impl-dyn-syntax.md to sync with additional future commits if there are any ) |
It seems like the proposed alternative of always needing parentheses is the only option that allows us to backwards-compatibly tweak these rules later and allow dropping parenthesis in certain situations. Is this true? (the fact that the RFC never mentions this makes me think I'm missing a reason why it doesn't actually work) |
Seems like a home run to me. Even if one could argue that the difference between I guess the worst we can expect from something like this is impl (Fn() -> impl (Iterator<Item=u8> + 'a) + 'a) |
Actually, I too am wondering along the same lines as @Ixrec especially for types like Of course, whatever rules are drawn up (if any) for allowing these parentheses to be omitted should apply equally to |
Always requiring parentheses like this I is indeed the most forward compatible, but it's really weird, so I didn't listed it in the RFC. // WHY!?
type A = dyn A + B; // ERORR, interpreted as `(dyn A) + B`
type B = (dyn A + B); // OK, dyn is immediately in parentheses II is a bit more complex, it introduces new syntax - parentheses after Suppose we always require parens now: fn f() -> impl (A + B) { ... } but then relax them in the future: fn f() -> impl A + B { ... } // OK What happens? As a result, if rules for the "Alternative 3" are going to be relaxed in the future, then it will become "Alternative 2" + additional legacy syntax. It's better to immediately start with "Alternative 2" in this case. |
Using your example, is there any reason we could not also support this: fn f() -> &A + B { ... } That is to say, these extensions don't have to be a hack exclusively for |
The same reason why there's desire to change the status quo syntax - operator priorities. |
I guess that I feel that having consistency between expressions and types is not quite as critical as having consistency within types. My thought is, because there are no valid types that contain a unary operator nested within a binary operator (i.e. To make
That said, I can think of a few issues with my approach:
Again, I feel the details of an extension like this could be ironed out later. Perhaps it's a terrible idea that's just not possible; and if this is the case, then all else considered, I'd rather have to write |
Thanks @petrochenkov for putting this together! It's really helpful seeing all the different options (and their effects) laid out clearly in the RFC. Unfortunately, I'm personally frustrated by all of the options-- clearly there's no perfect solution. Alternative 3 feels the most logically consistent to me, but it's also the most verbose, and I'm sure I'd quickly become confused and annoyed at the visual clutter it adds. After trying out a few examples, I personally prefer Alternative 1. The vast majority of the time The drawbacks: It usually doesn't make sense to see Nested You might well ask, "Why not Alternative 2? Most of the clear cases you brought up would be handled properly there, and most of the unclear cases would require parentheses?" I've waffled about this a bit, and in the end I decided against Alternative 2 because of how it handles Edit: I think I would also be in support of a variant of Alternative 2 which added |
It's been a while since I looked at the details of types parsing, but isn't there also a potential for ambiguity with tuples when adding/requiring brackets in some of the examples? I.e., wouldn't |
@rolandsteiner A 1-tuple (why do we have such a thing in the type-system anyways...) is always |
Hmmm, I assumed Can this work? It doesn't need parenthesis. |
@pornel No, then it wouldn't really mean the same thing. The |
for<'a> A<'a> + B // (for<'a> A<'a>) + B, not for<'a> (A<'a> + B)
?A + B // (?A) + B, not `?(A + B)` so &A + B // (&A) + B, not &(A + B) is consistent with them as well. |
So, I wanted to look how these alternatives look on real code and tested them on rustc and libstd - https://github.com/petrochenkov/rust/commits/impldyntest. The first commit is the baseline (Alternative 1) and the next commits show changes required to move to Alternatives 2 and 3. |
From #2250 (comment) I'm pretty much convinced that the Alternative 3 is just impractical, So, I'm going to change the RFC to use the Alternative 2 instead.
I don't want this because it's technically unnecessary and creates more (resolvable) parsing ambiguities (it's basically repeating the mistake from rust-lang/rust#41077), but it's a fully backward-compatible extension that can be implemented later on top of the Alternative 2 if necessary. |
The RFC is updated. |
Can you say a bit more about these ambiguities? I am trying to page that back in -- maybe some links to prior conversations? :) |
@nikomatsakis There was some previous discussion in this thread. |
OK, I didn't find that especially enlightening. I'd like some examples of the sort of conflict we expect. So I could get behind alternative 2, though like @cramertj I really feel like I've also been pondering a few questions. What do people expect from something like I tried making a little toy model of types plus bounds here https://github.com/nikomatsakis/rust-grammar-toy using LALRPOP. This is intended to model Alternative 2 except that we do permit parens in lists of bounds too (e.g., I'd like, before we settle on something, to be able to express what we expect in fairly concrete terms like this. The grammar as I wrote it there is a bit peculiar in that it does't support parens like |
@rfcbot fcp merge While I would prefer that we end up with a grammar that supports |
Team member @cramertj has proposed to merge this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
OK, so, we had a discussion about this in the lang team meeting. There was general consensus, I believe, on a few points: First, the basic idea of "alternative 2" seems like the right starting point. Second, for those cases where disambiguation is needed, everybody prefers However, there was some disagreement about what to do when
@cramertj felt like we should interpret That said, we don't currently accept impl trait in the return position anyway (for other reasons), so in a way this dispute is moot, right? |
Don't have strong opinions about the remaining controversy & see both arguments. Reading it, interpreting it as grouped with the return type seems obvious and natural, so parens seem superfluous. However, I could see someone getting a compiler error, adding Possibly just linting could solve that. I think maybe we should conservatively start with requiring the parens, and see if we can introduce a lint someday that would make requiring them unnecessary. |
Has using angle brackets already been discussed at some point? Given the keywordness of
This would seem to align with the idea that a usage of impl Trait is akin to an anonymous generic type (since the traits listed inside the angle brackets are bounds on some unnamed/anonymous type). There would also be alignment with (EDIT: A downside, of course, would be that as with Box the brackets would always be required.) |
@mark-i-m It takes a type, and right now if you use a trait name in the type position it can only name the trait object - there's no such thing as "being generic over a trait" so far, the What @scottmcm wrote makes |
Thanks for the explanation :) that's makes sense. |
I don't think this is meant to be a serious suggestion, but I think the main advantages of dyn are not supported by something like this:
|
I'm not sure whether it was intended as a serious suggestion, but the ease with which it would slide in is interesting for other reasons. In particular: it demonstrates that As a thought exercise: what if the way to accomplish the requirement that
If |
Parentheses on the first bound in a trait object, for example, especially if the first bound is a lifetime. ((...(u8,)...)) + Trait // Not a legal type
((...('a)...)) + Trait // Legal type
((...('a)...)) // Not a legal type We start parsing this as a type and only after burrowing deep enough into the parens we realize this is actually a bound and not a type, and we still don't know if this bound represents a legal type or not until we parse all the closing parentheses. Can it be disambiguated in principle? Yes. I wish bound lists used some other separator and not fn f((a: u8, b: u16), c: u32) {} // Such a beautiful sight If parentheses in
without changing grammar for bounds in other contexts: fn f<T: (A + B)>() {} // Still an error |
They are used, they are just not needed very often, for example: &(Write + Send) I see mandatory angle brackets as an analogue of Alternative 3 (too noisy) and optional angle brackets as an analogue of
What if
This is great 😄 |
@petrochenkov : Thank you for the thorough response! I agree that Alternative 3 is noisier, although "too noisy" is subjective, and whether the alignment with similar concepts ( If Thanks again. |
I strongly object to this rationale for triggering a final comment period. Why would we choose to accept this now only to force our users to suffer a transition sometime in the future? I agree with the apparent consensus that |
@bstrie Everyone on the lang team is in consensus that we should include |
I won't amend the RFC because it's a completely unnecessary feature complicating the grammar. |
SGTM 😄 |
The final comment period is now complete. |
So, during implementation, some further questions arose. The TL;DR is that we feel like we ought to be somewhat more conservative than this RFC suggests, and in particular try to make it an error to use a Let me try to clarify what I think we want. We already have two grammatical precedence levels for types. I'll them T0 and T1. T0 includes all types, including the "n-ary" operators like dyn A + B and A + B. T1 excludes those n-ary operators, and is used primarily for & types (which take a T1 argument). Currently, in the parser, if we are parsing a T1 type and we see a + we stop, leaving it to be consumed from some enclosing context. I think that we want to have a rule such that T1 = dyn Identifier parses, but if we see a + after that, we error out (in the parser). This means that &dyn A + B is a parse error, as is where F: Fn() -> dyn A + B, and (I imagine) dyn Fn() -> dyn A + B. I think of this as being analogous to non-associativity: there are basically multiple interpretations of the + operator. It can be adding bounds to a type parameter in a where-clause list; it can be adding types in a (old-style, A+B) sum-type; and it can be adding bounds to a dyn or impl type. In a fn item context, or in the T in Box, once we see dyn or impl, there is only one valid interpretation (since we are not in a where clause list) -- . These are exactly the cases (I believe) where we use the grammatical nonterminal T0. We can parse eagerly there. In other contexts, there are multiple contending interpretations. We are aiming not to choose between them (hence "non-associative"). |
If the concern is getting impl Trait stabilized ASAP (which I empathize with), then I agree that it's far better to take the conservative path and simply forbid the ambiguous case rather than to stabilize a short-lived hack. If it really turns out to be such a problem in practice then that also implies that far more people would otherwise be using the temporary bad syntax, which further implies a much more painful transition than expected. |
That is the priority. That said, it's unclear that we would ever want to permit
since it seems like it will be always be somewhat unclear what it means, leading to style guides that recommend using parentheses to clarify. In that case, perhaps we should just make it an error and hence require you to rewrite to one of the following:
I could go either way on this point. As @petrochenkov put it at some point, nobody really knows the precedence of most binary operators but we still have precedence rules rather than requiring parens to clarify; so the precedent (pun intended) is to define some rule and stick with it. But I think @joshtriplett's feeling is that we don't have to repeat the mistakes of the past here. In any case, it's a conservative choice to error out, leaves us room to revisit in the future, so it's obviously the way to go imo. |
Given the challenges encountered, is it worth re-evaluating whether Alternative 3 with required brackets/parens would be a better path forward? Using keywords in a type position seems very error prone, and none of the examples in #2250 (comment) (even the unambiguous ones) seem as readable as As much as we want this feature, it really seems like caution is in order. |
It was certainly discussed. I think that nobody found a syntax we were happy with. For example, I was considering This isn't something I really know how to "rigorously evaluate" -- of course, technically, we could always move towards such a syntax in the future. |
I agree that it is heavier, but... only by one character overall. The result of both |
And with epochs we could plausibly even fix them in new code, making confusing cases non-associative. |
Note: for trait objects without // ambiguous, `(x as usize) + y` or `x as (usize + y)`,
// but code like this is very common and we must accept it (with low priority `+`)
x as usize + y |
Huzzah! The RFC is merged! Tracking issue: rust-lang/rust#34511 |
Priority of
+
inimpl Trait1 + Trait2
anddyn Trait1 + Trait2
is inconsistent with the rest of the grammar.This RFC lists possible solutions to this problem and chooses one of them for future stabilization.
Note that the chosen alternative is not unambiguously better than others, so please bikeshed the syntax!
This was previously submitted as a PR, but reformatted as an RFC specifically to get feedback from wider audience.
Rendered
Tracking issue: rust-lang/rust#34511