repr(tag = ...) for type aliases #3659
This seems like a straightforward change that we should support to improve type definitions for interfaces that depend on type aliases, particularly those that vary by target.

@rfcbot merge

That said, I think we should go with the mentioned alternative of allowing type aliases to shadow reprs (with a lint) in a future edition, to avoid forcing people to write

@rfcbot concern allow-type-aliases-to-shadow-reprs

(We may even consider allowing that in current editions on the basis of a crater run turning up no conflicts.)
Team member @joshtriplett has proposed to merge this. The next step is review by the rest of the tagged team members: Concerns:
Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! cc @rust-lang/lang-advisors: FCP proposed for lang, please feel free to register concerns. |
@rfcbot concern ambiguity

This makes things like `type transparent = u64; #[repr(transparent)] enum Foo { … }` legal. That seems particularly bad for any other proc macros that want to look at the attribute as part of things like their own soundness checks, since they'd no longer know what's going on from what's in the attribute.

Also, I'm always a bit sad when we have "real" stuff in an attribute, in the sense that it's something that we have a proper grammar construction for rather than just needing to deal in tokens. It makes me think of changing it to be `enum Foo : c_int { … }` or something instead.
@rfcbot resolve ambiguity I'm an idiot and somehow missed the "you need That does solve the hard blocker, but it still makes me wish for a better thing instead so that |
Just an idea for another syntax that is not ambiguous: And can I use complex types such as
Hmm, making the existing Although, I'm not sure if "type = " is the right name, since we're really talking about the enum discriminant here, but "discriminant" is unfortunately a very, very long name. Perhaps In terms of allowing shadowing, the main reason why I'm totally against it is because there's no way to override the shadowing. For example, you can shadow |
"type" or "base" or several other possible names could work there. |
I also really like |
Decided to just update the RFC to use the |
This is not straightforward, we currently never have "real code" (types, expressions, patterns) in inert attributes. For the purpose of this feature we'll need treat the (In other words, this is not what inert attributes are generally supposed to be used for.) |
Discriminant might be long, but on the other hand it describes exactly what this is, and makes it very clear and obvious. It is also consistent with So if it was me, I'd go with It doesn't look bad at all: #[derive(Clone, Copy, Eq, PartialEq)]
#[repr(discriminant = u32)]
#[non_exhaustive]
enum Foo {
//...
} |
@petrochenkov How difficult is it to make |
@kennytm |
I'm a bit confused about the concept of inert attributes versus attribute macros, and also why particularly this would be a problem to implement. I assume you're totally correct about the specific issues with implementing this, but could use a bit more info to fully understand what we're going for. From my (relatively naïve) perspective, when the definition is expanded in the HIR, we just resolve the type alias when we put together the enum definition, since we should have the information at that point. Since we're literally just listing allowed types for the alias, it shouldn't need much extra logic. But this appears to be wrong, and I'm not sure why. |
text/0000-repr-type-aliases.md
In addition to the primitive types themselves, you can also use the path to a type alias in the `repr` attribute instead, and it will resolve the primitive type of the type alias. However, to ensure compatibility as new potential representations are added, the path to the alias must contain a double-colon: you can access an alias `Alias` defined in the same module by using `self::Alias`.

For example, `#[repr(core::ffi::c_int)]` is valid because it contains a double-colon, but a `use core::ffi::c_int` followed by `#[repr(c_int)]` is not. If you wanted to `use core::ffi::c_int` first, then you could still do `#[repr(self::c_int)]` to reference the type.
To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`.
I think `type u32 = u8` seems needlessly obfuscating; I think it'd be a more readable example to write `type C = u8` and then `#[repr(type = C)]`, which is equivalent to `#[repr(u8)]` rather than `#[repr(C)]`.
I guess that I was intentionally pointing out the obfuscation here because it feels more likely: `#[repr(type = C)]` is obviously going to mean whatever `C` type you have, but `#[repr(type = u32)]` meaning `#[repr(u8)]` is more likely to occur in something like proc macros if someone is doing something nefarious. So, genuinely, there is a preference to do `#[repr(u32)]` over `#[repr(type = u32)]` when you don't necessarily trust the parent scope.
I've since updated this section a bit to elaborate a bit better, including what a type alias `C` might look like. Does this feel like it addresses your concerns?
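The shadowing concern in this review thread can be reproduced with today's stable Rust, without the proposed syntax. The following is a minimal sketch, not part of the RFC: the module `shadowed`, the enum `Fieldless`, and the helper functions are hypothetical names. It shows that an alias named `u32` changes what `u32` means in type position, while the existing `#[repr(u32)]` attribute is matched by name and does not go through name resolution.

```rust
use std::mem::size_of;

mod shadowed {
    use std::mem::size_of;
    // The real primitive stays reachable through `std::primitive`.
    use std::primitive::u32 as real_u32;

    // Hypothetical alias that shadows the primitive name inside this module.
    #[allow(non_camel_case_types)]
    pub type u32 = u8;

    // In type position, `u32` now resolves to the alias, i.e. to `u8`.
    pub fn alias_size() -> usize {
        size_of::<u32>()
    }

    // Size of the actual primitive, imported under another name.
    pub fn primitive_size() -> usize {
        size_of::<real_u32>()
    }

    // The existing `repr` attribute is not resolved as a path,
    // so this still requests a 4-byte integer representation.
    #[repr(u32)]
    #[allow(dead_code)]
    pub enum Fieldless {
        A,
        B,
    }
}

fn main() {
    println!("alias named u32: {} byte(s)", shadowed::alias_size());
    println!("primitive u32:   {} byte(s)", shadowed::primitive_size());
    println!("#[repr(u32)] enum: {} byte(s)", size_of::<shadowed::Fieldless>());
}
```

Under the form discussed above, a `repr` that names a type explicitly would follow the alias instead, which is why the bare primitive form remains the safer choice when the surrounding scope is not trusted.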
@rfcbot resolve allow-type-aliases-to-shadow-reprs Thank you to @ogoffart for the better |
@rfcbot reviewed

This seems like a great idea to me. As a future extension, I would like it if we could support types like the following: `#[repr(transparent)] pub struct SecretInteger(u16);` That would allow "delegating representations" without allowing code to rely on the specific value of a type alias (this is especially useful when it changes on rarely-tested platforms). If you agree that this is sensible, could you please add it as a future possibility?
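As background for this suggestion, here is a minimal sketch of what `#[repr(transparent)]` already guarantees on stable Rust: the wrapper has the same size and alignment as its single field. Using such a wrapper as an enum's representation type is the proposed future extension and is not shown here; the `new` constructor and `same_layout` helper are hypothetical names added for illustration.

```rust
use std::mem::{align_of, size_of};

// The newtype from the comment above: its layout is that of `u16`,
// but the field is private, so outside code cannot name the inner type.
#[repr(transparent)]
#[allow(dead_code)]
pub struct SecretInteger(u16);

impl SecretInteger {
    // Hypothetical constructor, only here so the type can be built.
    pub fn new(value: u16) -> Self {
        SecretInteger(value)
    }
}

// Checks the layout guarantee that `repr(transparent)` provides.
fn same_layout() -> bool {
    size_of::<SecretInteger>() == size_of::<u16>()
        && align_of::<SecretInteger>() == align_of::<u16>()
}

fn main() {
    let _secret = SecretInteger::new(7);
    println!("same layout as u16: {}", same_layout());
}
```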
…dowing, and alter the recommended list of lints
So, taking in additional suggestions:
Still need to go through more existing feedback though, so, there will probably be more changes. I don't expect to change the syntax again, though. |
Title changed from "repr(type) for type aliases" to "repr(discriminant = ...) for type aliases"
If
All these cases require having the So what are the possible ways to put
enum E: TYPE { ... } // E.g. something like this (the specific syntax is taken from C++).
// Before expansion
#[repr_discriminant(TYPE)]
enum E { ... }
// After expansion, you may see this with `-Zunpretty=expanded`.
enum E builtin#repr TYPE { ... }
// Before expansion
#[repr(discriminant = TYPE, packed)]
enum E { ... }
// After expansion, you may see this with `-Zunpretty=expanded`.
#[repr(packed)]
enum E builtin#repr TYPE { ... } |
I didn't want to have to elaborate more on the context of syntax, but since the FCP is not happening any more, I might as well. First, I'll start with the obvious. A colon is a terrible syntax. There's a reason why, when given the chance to design a new language from scratch instead of bolting on additional syntax like C++ does, Java and many other languages chose to use a keyword to describe the relationship between the name of the type and whatever occurs after the colon. Sure, Java also has distinctions like With Second, the ship has already sailed as far as Like, I personally have no issues with adding a dedicated syntax for this in the future. I just don't think that I think that an attribute fits this feature because, as mentioned before, any syntax will likely be less descriptive and more confusing. Elevating |
I went to mark this waiting-on-team, but it looks like that label doesn't exist here. Please treat this RFC as waiting on us to give better feedback. |
I should add: I was having a sour day before I saw this, so, that is definitely reflected in my response. I think that it's absolutely reasonable to want to pause the FCP because you have additional concerns without fully fleshing out those concerns, since you are on a time limit. I just wish that were the reason stated for pausing the FCP, instead of explicitly providing not-fleshed-out concerns as an excuse instead. Because as far as I'm concerned, the reasons mentioned were addressed before the FCP started, and no additional reasons were cited. |
(We were actually talking in the lang design meeting about marking this waiting-on-team before I even saw your response. Under the usual "no new rationale", we always need to put detailed rationale and concerns in the thread.) |
I would like to second @clarfonthey’s point about clarity. What I really appreciate about Rust is being able to write more “self-documenting” code, where types and variable names (and ? short-circuiting) make many more comments superfluous. A big part of that is writing code that looks like it explains itself. Yes, attributes are always a bit more magic than I would like, but otherwise In contrast, the If Rust gains dedicated syntax for this feature at some point (and perhaps a stable trait to reason about the discriminant type in the type system), that would be nice. But it should be as easy to read as the proposed attribute syntax. As this syntax is also not a new attribute but merely extends an existing one, I feel like not having it would at some point feel increasingly like a lack in feature with the repr attribute. |
@rustbot labels +S-waiting-on-team We decided in our design meeting today that we're going with So if this RFC were to be accepted, we'd want it updated to |
Title changed from "repr(discriminant = ...) for type aliases" to "repr(tag = ...) for type aliases"
"tag" might not be a good choice of terminology. As I already wrote over there: The use of "tag" as apparently a synonym for "discriminant" is unfortunate insofar as "tag" exists as a term in the compiler and it is not equivalent to "discriminant" there. It refers to how the discriminant is encoded in memory. For instance, for an Granted, with this largely being internal compiler terminology, it can be changed. But it will certainly be confusing to people that have worked with enums in the compiler in the past. We have also occasionally used this terminology in opsem and other language discussions, to my knowledge. And we would need a new word for what is currently called "tag". |
"Tag" could make sense here since |
@RalfJung Here, tag is appropriate. This RFC can be thought of as an extension of RFC2195: Really Tagged Unions, allowing the primitive repr type to be a type alias. Variants of enums with explicit primitive reprs are defined to always have tags. By contrast, the feature proposed by #3607 only concerns discriminants. I agree entirely with your comment there that the use of "tag" in that case is inappropriate. It's extremely useful to draw a distinction between discriminants (part of a variant's logical representation, and something all variants have) and tags (part of the physical representation, and only had by the variants of some enums). Using these terms synonymously will make talking about enum representation much more challenging. @traviscross I hope there is still time and procedural flexibility for the lang team to reconsider using the same terminology for both of these RFCs. |
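The distinction drawn here between a discriminant (logical) and a tag (physical) can be observed with stable Rust. This is an illustrative sketch rather than anything from the RFC text: the enum `Tagged` and the helper `discriminants_differ` are hypothetical names. It relies on two documented guarantees: `Option<NonZeroU8>` is the same size as `u8`, and an enum with fields and a primitive `repr` is laid out with an explicit leading tag as described in RFC 2195.

```rust
use std::mem::{discriminant, size_of};
use std::num::NonZeroU8;

// With an explicit primitive repr, every variant carries a `u8` tag
// in memory, followed by the variant's fields.
#[repr(u8)]
#[allow(dead_code)]
enum Tagged {
    Value(u8),
    Empty,
}

// `None` and `Some(_)` have different discriminants even though
// `Option<NonZeroU8>` has no room for a separate tag byte.
fn discriminants_differ() -> bool {
    let none: Option<NonZeroU8> = None;
    let some: Option<NonZeroU8> = NonZeroU8::new(5);
    discriminant(&none) != discriminant(&some)
}

fn main() {
    // No tag: `None` is encoded in the niche (the forbidden value 0).
    println!("Option<NonZeroU8>: {} byte(s)", size_of::<Option<NonZeroU8>>());
    // Explicit tag: one byte of tag plus one byte of payload.
    println!("Tagged: {} byte(s)", size_of::<Tagged>());
    println!("discriminants differ: {}", discriminants_differ());
}
```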
We'll take the required time and consider all the feedback. |
@jswrenn that's fair, if this here is indeed about controlling the representation, and maybe even saying that With |
I think unless we change https://doc.rust-lang.org/reference/type-layout.html#primitive-representation-of-enums-with-fields-less-enums, a |
Hmm, that distinction is not one we were thinking of. We were thinking of "tag" as a synonym for "discriminant" -- one that may or may not be represented with actual bytes in memory.
|
Yeah, I'm also of the opinion that discriminant is a much better term in general. The way we tell variants apart is using a discriminant. That discriminant may be a dedicated tag field, but it could be niche values. However, I think that calling this tag is acceptable because, well, it's an explicit tag representation. Take this example: enum Even {
Zero = 0,
Two = 2,
}
enum Odd {
One = 1,
Three = 3,
}
enum Both {
Odd(Odd),
Even(Even),
} There is a sense where all of these could have a The "discriminant" here would fit into a So, calling this representation flag |
if this gets accepted, this discriminant/tag definition should be added to the glossary of the rust reference |
I agree that the decisions made, if final, should be solidified in the reference and similar documentation. Personally, I'm fine with deciding that this particular representation can be called a "tag" in all cases and in the reference, and thus using that name in the attribute makes sense. However, whether enum discriminants in general should be called tags is something I particularly disagree with, and which I'm not sure is actually finalised given the current discussion. That said, it is the teams responsible that have the final say in these discussions, and while I do hope they listen to the feedback presented to them, I ultimately can't force them to decide one way or another. As far as I'm concerned, the discussion left to be had is largely external to this particular RFC, and I've also already made the name changes in the RFC text itself so that we can restart the FCP. I'm not sure what the best place to continue the discussion is, but considering how I don't have anything particular to add to it, I'll wait for others to decide and share links and such. |
I think the concern for naming-and-syntax is not just "tag" vs "discriminant" but because of implementation constraint of rustc 🤷 the compiler currently simply cannot support |
I'm fine with commenting that down, although based upon @petrochenkov's comments (#3659 (comment) and #3659 (comment)) I was under the impression that this was less a design concern and more of an implementation detail. Like, I agree that implementing this now would require a few invasive changes to the way To me, approving this RFC means that:
If, later down the road, someone proposes a better syntax for this feature and decides to make a new RFC amending this one to ditch the attribute-based syntax, I'm fine with that. I just think that inheriting C++'s messy To me, attributes are just a predefined syntax we can use instead of sprinkling other sigils in the language, and are a way to easily stabilise features without giving them a permanently stable notation. The fact that the compiler is unable to keep up with this perception is something that would be better to change, IMHO, or maybe it's just I really shouldn't go off my own personal interpretation of the opinion of one person when that interpretation could be wrong, or that person's opinion doesn't reflect the entire team, but at least without any other input on this, that's what I'm doing for now. As stated, if this is actually a serious problem to consider, or if the team(s) would prefer to postpone this RFC instead of merging it, I can update the RFC text to reflect that. |
@clarfonthey: We've just been a bit busy recently, with the edition coming up and whatnot. We'll circle back to these design questions when we can. |
My entire point here was that I'm against rushing out a syntax before it's fully designed, so, it would be hypocritical to try and rush this out while folks have other, more important work to do. Thank you all for the 2024 efforts. |
@rfcbot fcp cancel This FCP is stale, so let's cancel it. This work is blocked on us deciding what we want to do about RFC #3607 and about the syntax in general, as reflected in the concern filed in #3659 (comment), which we should of course settle if we are to later repropose FCP here. |
@traviscross proposal cancelled. |
Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(tag = core::ffi::c_int)]` and `#[repr(tag = my_type)]` are now accepted.

Internals discussion: https://internals.rust-lang.org/t/pre-rfc-type-aliases-in-repr/20956

Last comment on RFC under first version (`self::`): #3659 (comment)

Last comment on RFC under second version (`type = ...`): #3659 (comment)

Last comment on RFC under third version (`discriminant = …`): #3659 (comment)

Rendered
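For comparison, this is roughly what has to be written on stable Rust today when an enum must match a target-dependent alias such as `c_int`. It is a hedged sketch under one loud assumption: that `c_int` is `i32` on the target being compiled, which is not true everywhere. The enum `Status` and the function `to_c` are hypothetical names. The compile-time assertion turns a mismatch into a build error instead of a silent layout bug.

```rust
use std::ffi::c_int;
use std::mem::size_of;

// Assumption: `c_int` is `i32` on this target, so the primitive is
// hard-coded here because `repr` cannot currently name the alias.
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Ok = 0,
    Failed = 1,
}

// Fails to compile on targets where the assumption above does not hold.
const _: () = assert!(size_of::<Status>() == size_of::<c_int>());

// Converts the enum to the integer type a C interface would expect.
fn to_c(status: Status) -> c_int {
    status as c_int
}

fn main() {
    println!("{:?} -> {}", Status::Ok, to_c(Status::Ok));
    println!("{:?} -> {}", Status::Failed, to_c(Status::Failed));
}
```

With the attribute form described in this pull request, the hard-coded primitive and the assertion would presumably both be replaced by naming the alias directly in `repr`.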