Add endianness as one of the pointer properties #649
Comments
Wouldn't it be easier to provide a handful of functions converting basic datatypes back and forth? Perhaps also a function that does it for structs and arrays, with support from the compiler.
How does that address bit fields?
AFAIK endianness only matters for 16, 32 and 64 bit integer data (not sure about 128 bit and floats). Bytes and byte streams are the same everywhere. Unless Zig is doing something really, really strange, bitfields could be transferred over the wire "as is". I am not sure about the "parent integer" concept; it feels like a bad idea. Since bitfields are what one really wants here, they should probably be treated as a byte stream.
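A minimal sketch of the claim above, using Python only as a neutral illustration (none of this is Zig API): a single byte serializes identically in both byte orders, while a 32-bit integer does not. The float case is included because the comment leaves it open; IEEE 754 single-precision values are also stored in a byte order.

```python
import struct

# A single byte serializes identically in either byte order.
one_byte_le = (0xAB).to_bytes(1, "little")
one_byte_be = (0xAB).to_bytes(1, "big")

# A 32-bit integer does not: the two byte sequences are mirror images.
u32_le = (0x11223344).to_bytes(4, "little")
u32_be = (0x11223344).to_bytes(4, "big")

# A 32-bit IEEE 754 float is byte-order dependent as well.
f32_le = struct.pack("<f", 1.0)
f32_be = struct.pack(">f", 1.0)
```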
It is common to think that there are only two ways of representing multibyte values, big endian and little endian. However, that is not the case. There are definitely others. I would request that this be rethought so that it is powerful enough to allow any arbitrary byte order. I deal with alternate byte orders in the embedded/automation field.
@PavelVozenilek, it is not that simple in a lot of areas. Floats are definitely subject to byte ordering. However, they are not always subject to it in the same ways. I have heard of a common manufacturer of programmable logic controllers that uses a byte order of 1032 for 32-bit ints (two big-endian 16-bit words with the words themselves in little-endian order!) and 3210 for 32-bit floats (big-endian). The numbers are the byte offsets from the start of the value in memory.
I like the ability to add these onto pointer values (that makes more sense than what I was thinking a few months ago with the integers themselves having byte order).
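The 1032 layout described above can be sketched as a decoder driven by a byte-order string. This is an illustration in Python, not a proposed Zig API; the function name `decode_uint` is hypothetical. It assumes that the digit at position i of the order string is the significance index (0 = least significant byte) of the byte stored at memory offset i, which is one reading of the notation that agrees with both examples in the comment.

```python
def decode_uint(raw: bytes, order: str) -> int:
    """Assemble an unsigned integer from `raw`, laid out according to `order`.

    Assumption: the digit at position i of `order` is the significance index
    (0 = least significant byte) of the byte stored at memory offset i.
    """
    expected = [str(i) for i in range(len(raw))]
    if sorted(order) != expected:
        raise ValueError("order must be a permutation of the byte indices")
    value = 0
    for offset, digit in enumerate(order):
        # Move the byte found at this memory offset to its place in the value.
        value |= raw[offset] << (8 * int(digit))
    return value


# 0x11223344 stored as two big-endian 16-bit words, low word first.
plc_int = decode_uint(bytes([0x33, 0x44, 0x11, 0x22]), "1032")
# The same value stored in plain big-endian and little-endian order.
big = decode_uint(bytes([0x11, 0x22, 0x33, 0x44]), "3210")
little = decode_uint(bytes([0x44, 0x33, 0x22, 0x11]), "0123")
```

All three calls recover the same value, which suggests that big and little endian are just two of the possible permutations.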
I have been playing with ideas for bit structs where all the fields are specified in bits.
I am not very happy about the syntax. What I am trying to achieve is the ability to state exactly which bits belong to each field. In the above example, you can see that the
Excited about this.
To me this feels like hidden control flow, and if you are going to add hidden control flow, like with bit-fields, then why only provide for a few use-cases, as @kyle-github pointed out? Why not just allow overloading the assignment/load operators with pure functions? This way you can do:
...Or just don't use the load/assignment operators for these use-cases. Bit-fields certainly don't need those operators, as the algorithms work better if you copy out of the bit-field before and back in after. The only case here that does need them is circular power-of-two-sized buffers.
This issue was first posed back in 2017. Now that we have In/OutBitStream, and to a lesser extent PackedIntArray/Slice, I suspect the reasonable use cases for this feature are already covered without a new language feature. Can anyone speak to use cases I may not have considered?
One potential use case for this that I've been running into lately would be UTF-16. Being able to have a
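For context on why UTF-16 is sensitive to byte order, here is a small Python sketch (an illustration only, unrelated to any Zig API). The same text yields different byte sequences depending on the order of each 16-bit code unit, and decoding with the wrong order silently produces different characters.

```python
text = "Zig"

# Each UTF-16 code unit is two bytes, so its byte order must be known.
utf16_le = text.encode("utf-16-le")
utf16_be = text.encode("utf-16-be")

# Decoding with the matching order round-trips; the wrong order does not.
round_trip = utf16_be.decode("utf-16-be")
misread = utf16_le.decode("utf-16-be")
```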
A pointer is a memory address with metadata.
Here's the metadata a pointer currently has:
- type
- `const` or mutable
- `volatile` or no-side-effects for load/store
- `align(x)` - guaranteed alignment of the address. If unspecified, it is the ABI alignment of the type.
- `:a:b` - indicates that the value is `a` bits offset from the address. I think we can remove `b` because it should always be `@bitSizeOf(T)`. If `:a:b` is omitted, `a` is 0 and `b` is `@bitSizeOf(T)`.
Here is metadata we plan on adding in accepted proposals:
- `null` or `0` to indicate that the pointer is null or 0 terminated (see proposal: type for null terminated pointer #265)
This proposal is to add yet another piece of metadata to pointers, which is endianness.
&.Endian.Little u32
&.Endian.Big u32
A target has a native endianness. When pointer endianness is unspecified, it means the native endianness. So on x86_64, `&.Endian.Little u32` is the same as `&u32`.
The value `Endian` here can be obtained from `@import("builtin").Endian`. We may decide to automatically import `builtin` into the global namespace, so it would become `builtin.Endian`.
Just like the type of a pointer, endianness can be a comptime value:
These pointer concepts can be combined, and make sense together:
&.Endian.Big const volatile :2 u4
Here we have a memory address that:
- `const` - we should not write through
- `volatile` - there are side effects from reading from
- `:2` - we must bit shift the loaded value
- `u4` - we must mask only 4 bits from the loaded value
- `.Endian.Big` - bit shift and mask assuming the loaded value is big endian
So how does this work with packed structs? (See #307)
Here, if you take the address of each field, you get respectively:
- `a` - `&.Endian.Big u32`
- `b` - `&.Endian.Big u32`
- `c` - `&.Endian.Big :0 u4`
- `d` - `&.Endian.Big :4 u4`
- `e` - `&.Endian.Big :0 u4`
- `f` - `&.Endian.Big :4 u4`
What happened here is that the sub-byte fields have a parent integer, which Zig automatically determines based on byte boundaries.
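The byte-boundary rule above can be sketched as a small Python model (an illustration, not compiler behaviour). The helper `field_addresses` is hypothetical, and the field widths are inferred from the pointer types in the list, since the struct definition itself is not shown here. It assumes fields are packed in declaration order and that each sub-byte field's bit offset is its position within the byte that contains it.

```python
def field_addresses(fields):
    """Return (name, byte_offset, bit_offset, bit_width) per packed field.

    Assumption: fields are laid out back to back in declaration order, and a
    field's parent integer starts at the byte containing its first bit.
    """
    result = []
    bit_pos = 0
    for name, width in fields:
        result.append((name, bit_pos // 8, bit_pos % 8, width))
        bit_pos += width
    return result


# Widths inferred from the pointer types listed above (u32, u32, then four u4).
layout = field_addresses(
    [("a", 32), ("b", 32), ("c", 4), ("d", 4), ("e", 4), ("f", 4)]
)
```

Under these assumptions `c` and `d` share the byte at offset 8 with bit offsets 0 and 4, and `e` and `f` share the byte at offset 9, matching the `:0` and `:4` annotations in the list.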
Now we have explicitly decided the parent integer.
- `data.c` - `&.Endian.Big :0 u4`
- `data.d` - `&.Endian.Big :4 u4`
- `data.e` - `&.Endian.Big :8 u4`
- `data.f` - `&.Endian.Big :12 u4`
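A load through one of these pointers could be modelled roughly as follows, again in Python as a sketch rather than as Zig semantics. The helper `load_field` is hypothetical. The proposal does not pin down where bit offsets are counted from; this sketch assumes they are counted from the least significant bit of the parent integer after it has been decoded with the pointer's endianness.

```python
def load_field(memory: bytes, address: int, parent_bytes: int,
               byte_order: str, bit_offset: int, bit_width: int) -> int:
    """Load a sub-byte field from its parent integer.

    Assumption: bit_offset counts from the least significant bit of the
    parent integer after decoding it with byte_order ("big" or "little").
    """
    parent = int.from_bytes(memory[address:address + parent_bytes], byte_order)
    # Shift the field down to bit 0, then mask off everything above it.
    return (parent >> bit_offset) & ((1 << bit_width) - 1)


memory = bytes([0xAB, 0xCD])  # read as a big-endian 16-bit parent: 0xABCD

c = load_field(memory, 0, 2, "big", 0, 4)
d = load_field(memory, 0, 2, "big", 4, 4)
e = load_field(memory, 0, 2, "big", 8, 4)
f = load_field(memory, 0, 2, "big", 12, 4)

# The same bytes read through a little-endian parent give a different field.
c_little = load_field(memory, 0, 2, "little", 0, 4)
```

The endianness only affects how the parent integer is assembled; the shift and mask are the same in both cases, which is why it fits alongside the bit offset as pointer metadata.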