Zero sized string having address null have unexpected results when slicing and taking the pointer #1831
Comments
The ptr field of a 0-len slice has an Another way to think about this is "where in ram can you find 0 consecutive u8s?", to which the answer is "everywhere". Starting at literally any address you have at least 0 consecutive objects of every type. (This might not be the most helpful explanation.) The main issue here is that Zig pointers are not necessarily memory addresses at all. The semantics of a pointer are that a pointer tells you where you can find the thing you're pointing at. If you're pointing at nothing, then the pointer doesn't need to do anything. This is also the reasoning behind |
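The "0 consecutive u8s can be found everywhere" remark can be illustrated with a small C sketch. The helper names `sum_bytes` and `sum_empty_range_at` are hypothetical and not part of Zig or of this thread; the point is only that a loop over a zero-length range never reads through its pointer, so the starting address does not affect the result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums `len` bytes starting at `ptr`.
 * When `len` is 0 the loop body never runs, so `ptr` is never dereferenced. */
unsigned sum_bytes(const unsigned char *ptr, size_t len) {
    unsigned total = 0;
    for (size_t i = 0; i < len; i++) {
        total += ptr[i];
    }
    return total;
}

/* Hypothetical helper: sums the empty range that starts `offset` bytes
 * into a small buffer. Any offset up to one past the end is a valid
 * starting address for an empty range. */
unsigned sum_empty_range_at(size_t offset) {
    static const unsigned char buf[4] = {1, 2, 3, 4};
    assert(offset <= sizeof buf);
    return sum_bytes(buf + offset, 0);
}
```

Every starting offset yields the same empty sum, which is why the address stored in a zero-length slice carries no information.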
To further clarify Josh's point, this code is equivalent to: test "" {
var data: ?[*]const u8 = undefined;
@import("std").debug.assert(data != null);
} |
Actually no, my example should be equal to this:
Which it isn't for some reason (this code works). This code doesn't work either:
Which doesn't really make sense. The values of the slice should be |
Anyways, I think this is a problem with the fact that |
All of your code examples branch on an undefined value, which is undefined behavior in Zig and C. The equivalent C code is: #include <assert.h>
#include <stddef.h>

int main(void) {
    char *p; /* never initialized */
    assert(p != NULL); /* reads an indeterminate value */
} The fact that optional types have special semantics for pointers is for C compatibility. If a C function takes or returns a pointer that might be NULL, then Zig needs a way to express that. If optional pointers in Zig used the normal optional semantics where it becomes a struct of #1059 will introduce a special pointer type for C compatibility, so maybe this special case handling for optional pointers could be isolated to that type and avoid this confusion, but that remains to be seen. Another thing that might be adding confusion to all this is that I see in the original post you're taking the address of an empty string. In C an empty string is not really empty; it includes the @Hejsil how did you run into this issue? |
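On the C side of the comparison above, a minimal sketch shows why an empty string literal is "not really empty": it still occupies one byte of storage for its terminator, so it has a real, readable address. The function names here are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the storage size of the literal "", terminator included. */
size_t empty_literal_storage(void) {
    return sizeof("");
}

/* Returns the first byte stored at the address of "". */
char empty_literal_first_byte(void) {
    const char *s = "";
    assert(s != NULL); /* the literal has a real address */
    return s[0];       /* reading the terminator is well defined */
}
```

So `strlen("")` is 0 while `sizeof("")` is 1; the pointer to a C empty string always points at an actual byte.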
@thejoshwolfe I have a struct like this: struct {
ptr: ?[*]const u8,
len: u28,
} and depending on the API used (streaming vs non streaming) T{ .ptr = ""[0..].ptr, .len = 0 } I understand that our pointer optimization is what is causing this weird behavior. I was just wondering if we could do something about it. If we can't then that is fine, and I've just pointed out something unsafe :) |
@Hejsil in the case where |
Hmmmm. That's an idea. I just did |
No, actually |
I could use a |
My mistake, you're right. This is an important issue which I'm glad you brought up. I'm still thinking about how to best resolve it, but I think it will be good to figure it out before the next release and document the intended semantics. My first idea is that behavior should remain status quo, and we document how it is supposed to work. Which is that casting from I want to note that although in Zig, pointers cannot be |
Discussed this a bit with @andrewrk, and I feel confident about the plan:
This fits in nicely with the rules in #1947. |
Related: #1952 |
When a pointer may not be address 0, an optional pointer shall have the same bit pattern. Therefore, if a pointer which may not be address 0 has an undefined value (see #1947), the bit pattern may nevertheless be 0, and when it is implicitly cast to an optional pointer, the optional pointer has an undefined value as well. However, an implicit cast from T to ?T, even if T is undefined, is expected to produce a non-null value. So we have three choices: Possible Solution 1. Implicit cast from a pointer which may not be address 0 to an optional pointer is not a no-op; if the value is 0, then Zig chooses any non-zero bit pattern for the optional pointer. This is sub-optimal in terms of generated machine code, but avoids confusing semantics. Here's the LLVM it could codegen to: %1 = load i8*, i8** %x, align 8
%2 = icmp eq i8* %1, null ; true only for the all-zero pattern
%3 = zext i1 %2 to i64
%4 = ptrtoint i8* %1 to i64
%5 = or i64 %4, %3 ; 0 becomes 1, everything else is unchanged
%6 = inttoptr i64 %5 to i8*
store i8* %6, i8** %y, align 8 Possible Solution 2. Make it safety-checked undefined behavior to implicitly cast an undefined value of a pointer which may not have address 0 to an optional pointer. In other words, if your pointer is undefined or illegally address 0 and you try to cast it to an optional pointer, boom. Possible Solution 3. Define the cast from T to ?T, when T is a pointer which may not have address 0, to result in an unchanged bit pattern. This means implicit casting |
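Possible Solution 1 can be modeled in C as a rough sketch. It assumes that both the non-optional pointer and the optional pointer are plain address-sized integers with 0 encoding null; the function name `to_optional_bits` is hypothetical, and this is not what the Zig compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the cast in Possible Solution 1.
 * `ptr_bits` is the bit pattern of a pointer that may not be address 0
 * (but might hold 0 anyway if it was left undefined). */
uintptr_t to_optional_bits(uintptr_t ptr_bits) {
    /* (ptr_bits == 0) is 1 only for the all-zero pattern, so OR-ing it in
     * turns 0 into 1 and leaves every other value unchanged. */
    uintptr_t result = ptr_bits | (uintptr_t)(ptr_bits == 0);
    assert(result != 0); /* the optional is never null after the cast */
    return result;
}
```

The extra compare and OR on every cast is the machine-code cost the proposal describes; Possible Solution 3 avoids it by leaving the bits untouched.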
After some discussion, @andrewrk, @thejoshwolfe, and I came to the following conclusions: As a sanity check, we analyzed the behavior of So we've decided to accept this rule and close this issue. It is superseded by #63 and #211, which cover adding debug checks for these cases where possible. |
String literals are null-terminated now, so that's not an empty slice anymore. The pointer is defined in this case. But in the case of an empty slice, the The other ways to create an empty slice are as follows:
So I don't think there's an easy way left to accidentally end up with an undefined slice pointer. |