Primitive bit validity and padding guarantees #1292

Closed
wants to merge 2 commits into from

joshlf
Contributor

@joshlf joshlf commented Nov 2, 2022

Also fix a few typos.

Resolves #1291

@ehuss
Contributor

ehuss commented Nov 2, 2022

Thanks for the PR! Can you say more about why it seems that this should be part of the layout chapter? My perspective is that the layout chapter is relevant to the alignment and padding of fields within a nominal type (and a few other circumstances), but the internal representation of a type is not relevant here. The types chapter defines what the valid bit patterns are (and by extension, the UB chapter explains invalid values and safety).

I'm reluctant to define this multiple times, or go into details about what kind of transmutes are sound within the layout chapter.

aligned to 32 bits.

### Primitive Bit Validity and Padding

For each primitive type, `T`, in the preceding table other than `char`, any
Member

Suggested change
For each primitive type, `T`, in the preceding table other than `char`, any
For each primitive type, `T`, in the preceding table other than `char` and `bool`, any
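To illustrate why the suggestion excludes `bool` as well as `char`, here is a minimal sketch of the bytes-to-`T` direction. The helper names `u32_from_bytes` and `char_from_bytes` are hypothetical and exist only for this example; they are not part of the PR or of the standard library. The sketch assumes only that every 4-byte pattern is a valid `u32`, while `char` has to be checked first.

```rust
use std::mem::transmute;

/// Reinterprets four arbitrary bytes as a `u32` (hypothetical helper).
fn u32_from_bytes(bytes: [u8; 4]) -> u32 {
    // SAFETY: `[u8; 4]` and `u32` have the same size, and every bit
    // pattern is a valid `u32`.
    unsafe { transmute::<[u8; 4], u32>(bytes) }
}

/// Checked conversion for `char` (hypothetical helper): not every 4-byte
/// pattern is a valid `char`, so this goes through `char::from_u32`
/// instead of `transmute`.
fn char_from_bytes(bytes: [u8; 4]) -> Option<char> {
    char::from_u32(u32::from_ne_bytes(bytes))
}

fn main() {
    // Any byte pattern is fine for an integer.
    println!("{:#x}", u32_from_bytes([0xFF; 4]));
    // A surrogate code point is not a valid `char`, so the checked path
    // returns `None` rather than producing an invalid value.
    println!("{:?}", char_from_bytes(0xD800u32.to_ne_bytes()));
}
```

The same reasoning applies to `bool`: a byte such as `0x02` is not a valid `bool`, so an unchecked transmute from `[u8; 1]` would not be sound for arbitrary input.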

`transmute::<[u8; size_of::<T>()], T>(...)` is guaranteed to be sound.

Similarly, for each primitive type, `T`, in the preceding table (including
`char`), `T` contains no padding or otherwise uninitialized bytes. In other
Member

Suggested change
`char`), `T` contains no padding or otherwise uninitialized bytes. In other
`char` and `bool`), `T` contains no padding or otherwise uninitialized bytes. In other
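The opposite direction, from `T` to bytes, can be sketched the same way. This is an illustration under the assumption stated in the proposed text (no padding or uninitialized bytes in these primitives); the helper names `char_to_bytes`, `f64_to_bytes`, and `bool_to_bytes` are hypothetical.

```rust
use std::mem::transmute;

/// Views a `char` as its four underlying bytes (hypothetical helper).
fn char_to_bytes(c: char) -> [u8; 4] {
    // SAFETY: `char` is 4 bytes wide and, per the proposed text, has no
    // padding, so every byte is initialized.
    unsafe { transmute::<char, [u8; 4]>(c) }
}

/// Views an `f64` as its eight underlying bytes (hypothetical helper).
fn f64_to_bytes(x: f64) -> [u8; 8] {
    // SAFETY: same size, no padding bytes in `f64`.
    unsafe { transmute::<f64, [u8; 8]>(x) }
}

/// Views a `bool` as its single underlying byte (hypothetical helper).
fn bool_to_bytes(b: bool) -> [u8; 1] {
    // SAFETY: `bool` is one byte wide and always initialized.
    unsafe { transmute::<bool, [u8; 1]>(b) }
}

fn main() {
    // Byte order depends on the target's endianness, so the printed
    // arrays are not shown here.
    println!("{:?}", char_to_bytes('a'));
    println!("{:?}", f64_to_bytes(1.5));
    println!("{:?}", bool_to_bytes(true));
}
```

In everyday code the safe equivalents (`u32::from(c).to_ne_bytes()`, `f64::to_ne_bytes`, `b as u8`) are usually preferable; the transmutes above only mirror the wording under review.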

Similarly, for each primitive type, `T`, in the preceding table (including
`char`), `T` contains no padding or otherwise uninitialized bytes. In other
words, `transmute::<T, [u8; size_of::<T>()]>(...)` is guaranteed to be sound.

Member

Suggested change
#### `bool` Bit Validity
A `bool`'s numerical value is guaranteed to be either 0x00 or 0x01.
It is undefined behavior to construct a `bool` with a value outside
this range. See the [`bool` docs][bool-docs] for more information.
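As a rough sketch of what this guarantee means in practice: converting a `bool` to a byte is always fine, while going the other way needs a check. The helpers `bool_to_byte` and `bool_from_byte` below are hypothetical names used for illustration only.

```rust
/// Converts a `bool` to its numerical value, which is 0 or 1.
fn bool_to_byte(b: bool) -> u8 {
    b as u8
}

/// Checked conversion from a byte (hypothetical helper). Any value other
/// than 0x00 or 0x01 is rejected instead of being turned into a `bool`,
/// since constructing such a `bool` would be undefined behavior.
fn bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0x00 => Some(false),
        0x01 => Some(true),
        _ => None,
    }
}

fn main() {
    println!("{}", bool_to_byte(true));
    println!("{:?}", bool_from_byte(0x02));
}
```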

@@ -583,6 +599,7 @@ used with any other representation.
[`Sized`]: ../std/marker/trait.Sized.html
[`Copy`]: ../std/marker/trait.Copy.html
[dynamically sized types]: dynamically-sized-types.md
[char-docs]: ../std/primitive.char.html
Member

Suggested change
[char-docs]: ../std/primitive.char.html
[bool-docs]: ../std/primitive.bool.html
[char-docs]: ../std/primitive.char.html

@@ -53,9 +53,25 @@ target platform. For example, on a 32 bit target, this is 4 bytes and on a 64
bit target, this is 8 bytes.

Most primitives are generally aligned to their size, although this is
platform-specific behavior. In particular, on x86 u64 and f64 are only
platform-specific behavior. In particular, on x86, `u64` and `f64` are only
Member

Hm, this phrasing struck me as potentially confusing, since "x86" is often used as a shorthand to refer to "x86-64". (I did a brief double take, at least.) Per Wikipedia's chronology, "IA-32" might be a more appropriate term:

Suggested change
platform-specific behavior. In particular, on x86, `u64` and `f64` are only
platform-specific behavior. In particular, on IA-32, `u64` and `f64` are only

...but I think you're referring to target_arch value "x86" (which is distinct from x86_64). Rust inherits this shorthand from LLVM:

    x86,            // X86: i[3-9]86

...so you could instead disambiguate this by writing:

Suggested change
platform-specific behavior. In particular, on x86, `u64` and `f64` are only
platform-specific behavior. In particular, on `target_arch = x86` (which has a word size of 32 bits), `u64` and `f64` are only
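For anyone who wants to check the alignment on their own target, a small sketch follows. The helper `size_and_align` is a hypothetical name; `cfg!(target_arch = "x86")` is the spelling Rust uses for the 32-bit target discussed here (note the quotes around the value). The result is target-dependent, so no output is shown.

```rust
use std::mem::{align_of, size_of};

/// Returns `(size, alignment)` in bytes for `T` on the current target
/// (hypothetical helper).
fn size_and_align<T>() -> (usize, usize) {
    (size_of::<T>(), align_of::<T>())
}

fn main() {
    // `cfg!` evaluates to a compile-time boolean for the target being built.
    let is_32_bit_x86 = cfg!(target_arch = "x86");
    println!("target_arch = \"x86\": {is_32_bit_x86}");

    let (size, align) = size_and_align::<u64>();
    println!("u64: size = {size} bytes, align = {align} bytes");

    let (size, align) = size_and_align::<f64>();
    println!("f64: size = {size} bytes, align = {align} bytes");
}
```

Building this for a 32-bit x86 target and for `x86_64` and comparing the printed alignments is one way to see the difference the sentence describes.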

@joshlf
Contributor Author

joshlf commented Nov 2, 2022

@ehuss I didn't realize that validity was documented elsewhere! I've put up #1293, which adds a bit of documentation to the pages in the "Types" section. Thanks for the feedback!

@joshlf joshlf closed this Nov 2, 2022
@joshlf joshlf deleted the patch-3 branch November 2, 2022 20:40
Development

Successfully merging this pull request may close these issues.

Describe bit validity and padding for primitive types
3 participants