Should niches/ABI be part of the layout of a type? #122
@RalfJung Niches can have at least two different sources: invalid representations, which are part of validity (e.g. in The current definition of layout in the glossary includes padding by omission (e.g. by saying that field offsets are part of layout), but does not explicitly mention it. Maybe we should explicitly mention padding in the definition of layout, and mention that padding bits introduce niches in the representation. |
So I think that This would narrow this question to whether |
It would be a very useful property if two types with the same layout, when used as the concrete type of a generic type, produced the same layout of the overall type. In other words, I think this makes for a rule that is very easy to understand and teach. I don't think there's any optimization possibilities lost. However, if we want to have such a rule then niche needs to count as part of the layout. |
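To make the point about generic wrappers concrete, here is a minimal sketch (my own illustration, not taken from the glossary). It compares `u32` and `NonZeroU32`, which agree on size and alignment but differ in niche; the helper name `probe` is hypothetical.

```rust
use std::mem::{align_of, size_of};
use std::num::NonZeroU32;

/// Returns (size of T, alignment of T, size of Option<T>).
fn probe<T>() -> (usize, usize, usize) {
    (size_of::<T>(), align_of::<T>(), size_of::<Option<T>>())
}

fn main() {
    let plain = probe::<u32>();
    let nonzero = probe::<NonZeroU32>();

    // On their own, the two types agree on size and alignment.
    assert_eq!((plain.0, plain.1), (nonzero.0, nonzero.1));

    // The forbidden value 0 is a niche, so Option<NonZeroU32> stays at 4 bytes.
    assert_eq!(nonzero.2, 4);

    // u32 has no invalid bit pattern, so Option<u32> needs room for a discriminant.
    assert!(plain.2 > 4);

    println!("u32: {:?}, NonZeroU32: {:?}", plain, nonzero);
}
```

If "same layout" is meant to imply "same layout once wrapped in a generic type", then (size, align) alone is not enough to tell these two apart.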
All representations are valid for padding bytes, so they cannot introduce niches. |
Just to be clear: the latter is correct, the former is not. Padding cannot be used for niches.
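A small sketch of why padding does not help, assuming a target where `u16` is 2-byte aligned (the struct `Padded` is a made-up example). The struct has one padding byte, yet wrapping it in `Option` still grows the type, because the padding byte may hold any value and so cannot encode `None`.

```rust
use std::mem::size_of;

/// Under repr(C), one padding byte sits between `a` and `b`
/// (assuming u16 has alignment 2 on the target).
#[repr(C)]
#[derive(Clone, Copy)]
struct Padded {
    a: u8,
    b: u16,
}

fn main() {
    // 1 byte for `a`, 1 byte of padding, 2 bytes for `b`.
    assert_eq!(size_of::<Padded>(), 4);

    // Neither field has an invalid bit pattern, and the padding byte cannot
    // be used as a niche, so Option<Padded> must be larger than Padded.
    assert!(size_of::<Option<Padded>>() > size_of::<Padded>());

    println!(
        "Padded: {} bytes, Option<Padded>: {} bytes",
        size_of::<Padded>(),
        size_of::<Option<Padded>>()
    );
}
```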
That is indeed the justification for a niche. But not every ruled-out bit pattern is a niche. For example,
Yes, that's basically my point. We need some name for "all the things of So, what do we call that thing? "Layout" seems like a reasonable term. So "layout" would, by definition, consist of the size, alignment and niche of a type. But there are other things that are relevant for a type in this context, that are also sometimes considered to be included in "layout", namely the function call ABI and the offsets of the fields (if the type has any). In particular, So maybe (size, align, niche) should be called something else, to capture precisely the property @Lokathor was mentioning? Or maybe |
What seems really odd though is to include ABI and fields, but exclude the niche. I think that is just an oversight. So my proposal is to update the docs to include "niche" in the definition of "layout". Or does anyone have a case where talking about (size, align, fields, abi) and excluding the niche is useful? @gnzlbg you seem to have that in mind when suggesting that That still leaves open the question of how to call (size, align, niche), though. I would certainly not include the validity invariant in whatever a layout is, that's way more information than we actually need. Abstracting it to a "niche" the way rustc does is useful I think. |
I would prefer to exclude "ABI" (meaning how it is passed by value, not the more general sense of Application Binary Interface) from "layout":
That is, I propose "layout = (size, align, fields, niche)". This would maybe entail renaming I also think it's fine that this is more information than is needed for "computing the layout of |
We'd definitely want to include the ABI in that one though.
Well, type punning is okay between |
Sure, just say "ABI and layout is deterministic [w.r.t. ...]".
Yes, there is lots of type punning between types that aren't comparable wrt fields, and there are also other things (e.g., validity and safety invariants) to keep in mind when type-punning. Field offsets are just part of the story. |
I feel it makes sense to exclude fields, because then equality of layout is a necessary condition for type punning of Turns out the docs already have a definition of layout, and it doesn't agree with the glossary: https://doc.rust-lang.org/stable/reference/type-layout.html says
So this does not include ABI, nor niche. I think this is unlike anything that any one of us has been proposing. ;) |
I can understand that, but there are many incompatible substitutions for X, Y in "if we define layout as X then property Y can be stated concisely in terms of layout". I don't know how to resolve that, arguing about which is more important seems miserable and unlikely to help. But I don't have very strong opinions about most of the definition anyway. I am very serious about this one point though: specification terms that Rust users recognize from informal/pre-formal discussion should bear some resemblance to this informal/preexisting meaning. From that angle, "layout" absolutely must include field offsets. After all, where fields are located is a major part of how the type is laid out in memory, and fiddling with that is a large source of layout optimizations. I care less about whether niche and ABI are in or out, definitely not enough to argue at length about it. Users think about those comparatively rarely. But telling users that |
That's fair. I think you convinced me that field offsets should be included, also for consistency with existing docs. So the contentious points seem to be whether ABI and/or niche are included -- and if not, what we call the thing that includes them as well.
IMO it is pretty clear that when you define where the fields are, that also defines the gaps between the fields, i.e., padding. Once the field offsets are fixed, there is no freedom left for where to put padding. But it might make sense to state this explicitly. |
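As a sketch of "field offsets determine padding": given the offsets and sizes of the fields plus the total size, the gaps fall out mechanically. The struct `Fields` and the helpers `field_offsets` and `padding_gaps` are hypothetical names for illustration; the numbers in the comments assume `u32` is 4-byte aligned.

```rust
use std::mem::size_of;
use std::ptr::addr_of;

#[repr(C)]
struct Fields {
    a: u8,
    b: u32,
    c: u16,
}

/// Byte offsets of the three fields, measured on a live value.
fn field_offsets() -> [usize; 3] {
    let v = Fields { a: 0, b: 0, c: 0 };
    let base = addr_of!(v) as usize;
    [
        addr_of!(v.a) as usize - base,
        addr_of!(v.b) as usize - base,
        addr_of!(v.c) as usize - base,
    ]
}

/// Given (offset, size) pairs sorted by offset and the total size of the type,
/// returns every gap as (start, length). Nothing else needs to be known.
fn padding_gaps(fields: &[(usize, usize)], total: usize) -> Vec<(usize, usize)> {
    let mut gaps = Vec::new();
    let mut cursor = 0;
    for &(offset, size) in fields {
        if offset > cursor {
            gaps.push((cursor, offset - cursor));
        }
        cursor = offset + size;
    }
    if total > cursor {
        // Trailing padding that rounds the size up to the alignment.
        gaps.push((cursor, total - cursor));
    }
    gaps
}

fn main() {
    let [a, b, c] = field_offsets();
    let gaps = padding_gaps(&[(a, 1), (b, 4), (c, 2)], size_of::<Fields>());
    // With a 4-byte-aligned u32 this is expected to show offsets 0, 4, 8.
    println!("offsets: {:?}, padding: {:?}", [a, b, c], gaps);
}
```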
What do you mean by type punning? The alignment of |
People don't just pun with transmute, they also pun with slice casting stuff. I actually had to put it in a crate recently, link, because people kept wanting to do it but getting it wrong. However, as you say, even a slice cast doesn't require layout equality. |
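For readers who have not seen a slice cast, here is a minimal sketch of one direction that is sound (this is my own illustration, not the code from the crate mentioned above; `as_bytes` is a hypothetical name). Note that `u32` and `u8` differ in both size and alignment, so this pun clearly does not rely on layout equality.

```rust
use std::mem::size_of;
use std::slice;

/// Views a slice of u32 as its underlying bytes (in native byte order).
fn as_bytes(words: &[u32]) -> &[u8] {
    let len = words.len() * size_of::<u32>();
    // SAFETY: u32 has no padding and every byte of it is initialized,
    // u8 has alignment 1 so any address is suitably aligned, the length
    // covers exactly the same memory, and the returned borrow shares the
    // lifetime of `words`.
    unsafe { slice::from_raw_parts(words.as_ptr().cast::<u8>(), len) }
}

fn main() {
    let words = [0xFFFF_FFFFu32, 0];
    let bytes = as_bytes(&words);
    assert_eq!(bytes.len(), 8);
    println!("{:?}", bytes);
}
```

The reverse direction (bytes to `u32`) is where people tend to go wrong: it additionally needs the pointer to be 4-byte aligned and the length to be a multiple of 4.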
Nit: I think we can maybe refer to this as "call ABI"? As in, the ABI of a given type wrt calling conventions? |
I'd love that. "ABI" is such an overloaded term... |
For the reader semi-confused by the phrase "the ABI w.r.t. calling conventions": is "call ABI" actually different from "calling convention"? Or are they just synonymous? Or is one of them a superset of the other? |
@Ixrec It's... complicated. One way to look at it is that each calling convention takes in argument/return types' "call ABI" and lowers that to passing those argument/return values in registers and/or the stack. E.g. the "call ABI" of It gets trickier with an aggregate "call ABI", because some calling conventions introspect the layout, even recursing through the fields (x86_64 SysV being the most complex AFAIK). I guess the confusing bit is I could say "call ABI (of a type)" and "call ABI (of a platform)", the latter being more or less a "calling convention" (but I'm less sure here). There's also the more gnarly distinction between what LLVM lowers itself, and what the frontend has to lower, but I would think we'd paper over that in this context. |
They are different. The "calling convention" is an agreement between the caller (of a function) and the callee (the function) about how to interface, for example, the function arguments (amongst many other things). Both need to agree on this. In Rust, most functions follow the Rust calling convention, but you can also choose These calling conventions classify the types that you can pass around into different categories, e.g., in some calling conventions, an These categories are what is meant here by "call ABI" of the type, and the same type can be a SCALAR in one calling convention, and an AGGREGATE in another. This is fine: the calling convention is part of the function type so both the caller and the callee agree on the category. People often want to use, e.g., wrappers like |
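A hedged sketch of the wrapper situation: the newtype `Meters` and both functions are hypothetical, and I am assuming the wrapper in question is the `repr(transparent)` kind. As I understand the standard library documentation on `fn` pointer ABI compatibility, a `repr(transparent)` type is ABI-compatible with its single non-trivial field, so the same `extern "C"` function can be called through either signature.

```rust
use std::mem::{align_of, size_of, transmute};

/// Hypothetical newtype; repr(transparent) asks for the same layout
/// and the same call ABI as the wrapped f32.
#[repr(transparent)]
#[derive(Clone, Copy, Debug, PartialEq)]
struct Meters(f32);

/// A function using the C calling convention on the bare scalar.
extern "C" fn double_raw(x: f32) -> f32 {
    x * 2.0
}

/// Calls `double_raw` through a pointer typed in terms of the wrapper.
fn double_wrapped(x: Meters) -> Meters {
    // SAFETY: this relies on Meters being repr(transparent) over f32, which
    // makes the two function pointer types ABI-compatible. Without that
    // attribute, a one-field struct might be classified differently from
    // the scalar by the calling convention.
    let f: extern "C" fn(Meters) -> Meters =
        unsafe { transmute(double_raw as extern "C" fn(f32) -> f32) };
    f(x)
}

fn main() {
    assert_eq!(size_of::<Meters>(), size_of::<f32>());
    assert_eq!(align_of::<Meters>(), align_of::<f32>());
    println!("{:?}", double_wrapped(Meters(1.5)));
}
```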
This is inaccurate, or at least misleading, as we have a scalar/aggregate distinction that the calling convention can't contest: it might still pass an aggregate in registers or a scalar on the stack, but it can't tell apart |
@eddyb proposed in private conversation to use "ABI" (instead of layout) as the overarching term here, and then categorize that into memory ABI (= layout?), call ABI, and maybe more. |
I like that. What would "memory ABI" be? Just size+align? If so, I don't think we should call that "layout". The term is a bit overloaded, and we use it, e.g., in the context of "layout optimizations", which do apply to "niche"s as well. Also, where do "niche"s go there? I suppose we could have a "value ABI" that includes padding+niches, where the difference is that "niche"s can be used for layout optimization while "padding" cannot. Maybe we need a better word for this than "padding", and this could also tie nicely with "value representation". |
For example, the "value ABI" for |
One way to be more precise about this is to talk about "memory" vs "immediate". As for what "memory ABI" would be: size, align, field offsets and "largest niche" offset/range. You could implement an entirely compliant Rust compiler with it except for FFI calling conventions. |
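To illustrate how little information such a "memory ABI" would need, here is a toy model. Everything in it is an assumption for illustration: the types `Niche` and `MemoryAbi`, the function `option_like`, and the tag placement rule are all made up and are not rustc's actual algorithm. It only shows that size, align and the largest niche suffice to lay out an `Option`-like wrapper.

```rust
/// Hypothetical description of the largest niche: its byte offset and how
/// many invalid bit patterns are still available there.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Niche {
    offset: usize,
    available: u128,
}

/// Hypothetical "memory ABI" record: size, align, field offsets, largest niche.
#[derive(Clone, Debug, PartialEq)]
struct MemoryAbi {
    size: usize,
    align: usize,
    field_offsets: Vec<usize>,
    largest_niche: Option<Niche>,
}

/// Toy layout of an Option-like wrapper around `inner`.
/// Assumes `inner.size` is a multiple of `inner.align`.
fn option_like(inner: &MemoryAbi) -> MemoryAbi {
    match inner.largest_niche {
        // A spare invalid value encodes `None`: no growth, one value used up.
        Some(n) if n.available > 0 => MemoryAbi {
            size: inner.size,
            align: inner.align,
            field_offsets: vec![0],
            largest_niche: Some(Niche { offset: n.offset, available: n.available - 1 }),
        },
        // Otherwise prepend a one-byte tag and place the payload at the
        // next aligned offset; the tag byte has 254 unused values.
        _ => MemoryAbi {
            size: inner.size + inner.align,
            align: inner.align,
            field_offsets: vec![inner.align],
            largest_niche: Some(Niche { offset: 0, available: 254 }),
        },
    }
}

fn main() {
    let plain_u32 = MemoryAbi { size: 4, align: 4, field_offsets: vec![], largest_niche: None };
    let nonzero_u32 = MemoryAbi {
        largest_niche: Some(Niche { offset: 0, available: 1 }),
        ..plain_u32.clone()
    };
    println!("Option-like over u32:        {:?}", option_like(&plain_u32));
    println!("Option-like over NonZeroU32: {:?}", option_like(&nonzero_u32));
}
```

Note that the call ABI never appears in the model, which matches the remark that only FFI calling conventions would need more than this.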
The issue with these "X ABIs" is that often people will say just "ABI" when they mean "call ABI". At least that's my experience. But otherwise, I do like this proposal. Maybe we also need a "nesting ABI" that includes the niche, whereas "memory ABI" only includes size + alignment? |
I think this has not been mentioned in this issue yet, quoting from #153 (but said meeting was more than a year ago):
|
Closing as a duplicate in favor of #304. |
The current definition of layout (https://github.com/rust-lang/unsafe-code-guidelines/blob/master/reference/src/glossary.md#layout) does not consider "niches" part of the type layout.
In this thread (#120 (comment)) it was argued that maybe we might want to change that and make them part of the layout of a type.
If we do that, we need to change the glossary, and distinguish that &mut T and *mut T don't have the same layout, because they don't have the same "niches".
cc @eddyb
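A quick way to observe the difference, as a minimal sketch (the helper `pointer_sizes` is a hypothetical name, and `u8` is an arbitrary pointee): the two pointer types agree on size and alignment, but only the reference has the null niche.

```rust
use std::mem::{align_of, size_of};

/// Sizes of: &mut u8, *mut u8, Option<&mut u8>, Option<*mut u8>.
fn pointer_sizes() -> [usize; 4] {
    [
        size_of::<&mut u8>(),
        size_of::<*mut u8>(),
        size_of::<Option<&mut u8>>(),
        size_of::<Option<*mut u8>>(),
    ]
}

fn main() {
    let [reference, raw, opt_reference, opt_raw] = pointer_sizes();

    // Same size and alignment when looked at in isolation.
    assert_eq!(reference, raw);
    assert_eq!(align_of::<&mut u8>(), align_of::<*mut u8>());

    // A reference is never null, so None can reuse the null bit pattern.
    assert_eq!(opt_reference, reference);

    // Every bit pattern is a valid raw pointer, so Option needs extra space.
    assert!(opt_raw > raw);

    println!("{:?}", pointer_sizes());
}
```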