sentinel-terminated pointers #3728

Merged
merged 25 commits into from
Nov 25, 2019

Conversation

andrewrk
Member

@andrewrk andrewrk commented Nov 20, 2019

This implements #265.

@daurnimator
Contributor

* make string literals null terminated pointers to arrays instead of array values

Does this mean that you can't do "hello".len any more? Does that end up invoking a comptime strlen?

What does this mean for concatenating two strings?

const a = "foo";
const b = "bar";
const c = a ++ b; // is this null terminated? do I end up with a null in the middle?

MakeMemNoAccess = valgrind.ToolBase("MC".*),

Are we going to end up with .* everywhere? I'm not entirely clear what the .* implies here (is it a pointer dereference?)


Idea: instead of [*]null u8, [*]sentinel(0) u8. It's not obvious that null means 0: nowhere else in Zig do we call 0 "null". We could open things up to allow arbitrary values in the future.

@Rocknest
Contributor

This implementation does not seem quite right. It gives the impression of a hack. Not sure.

Maybe there should be a type comptime_string that can coerce to [*]u8, [*]null u8 and slices.

Also, I like @daurnimator's idea: [*]sentinel(0) u8 or [*]zero_sentinel u8.

@andrewrk
Member Author

Does this mean that you can't do "hello".len any more? Does that end up invoking a comptime strlen?

This code works in the master branch and in this branch: "hello".len == 5. No, there is no comptime strlen. String literals have a comptime-known length and are also null terminated. They safely cast to both []const u8 and [*]null const u8.
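A minimal sketch of the point above, not taken from the PR itself. It assumes a recent Zig release and the `[*:0]` sentinel spelling (the `[:0]` form appears in the LLVM log further down this thread) rather than the `[*]null` spelling used in this comment; the test name and variable names are illustrative.

```zig
const std = @import("std");

test "string literal: comptime-known length plus a 0 sentinel" {
    const s = "hello";

    // The length is part of the literal's type, so no strlen is involved.
    try std.testing.expect(s.len == 5);
    // The terminator sits one past the last byte.
    try std.testing.expect(s[s.len] == 0);

    // The same literal coerces to a slice ...
    const slice: []const u8 = s;
    // ... and to a sentinel-terminated many-item pointer.
    const many: [*:0]const u8 = s;

    try std.testing.expect(slice.len == 5);
    try std.testing.expect(many[5] == 0);
}
```

Running the file with `zig test` exercises both coercions.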

What does this mean for concatenating two strings?

In the expression a ++ b, if either of a or b is a null terminated array, then the result is a null terminated array. It works with single-item pointers to arrays as well. If either is a single-item pointer to an array then the result is a single-item pointer to an array.

One of the main benefits of having null termination in the type system is that operations such as ++, **, and the len property can work correctly.
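The concatenation question can be checked with a short test. This is a sketch under the assumption of a recent Zig release, where both operands are comptime-known string literals; it is not code from this PR.

```zig
const std = @import("std");

test "concatenating two literals keeps one trailing sentinel" {
    const a = "foo";
    const b = "bar";
    const c = a ++ b;

    // The terminator of `a` is not carried into the middle of the result.
    try std.testing.expect(c.len == 6);
    try std.testing.expectEqualStrings("foobar", c);
    try std.testing.expect(c[3] == 'b');

    // The result is itself terminated by a single 0.
    try std.testing.expect(c[c.len] == 0);
}
```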

Are we going to end up with .* everywhere? I'm not entirely clear what the .* implies here (is it a pointer dereference?)

typeOf("MC") == *const [2]null u8. So, dereferencing it gives you an array value of type [2]null u8. The question of whether to allow a type coercion from *[N]T to [N]T is a separate issue.
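To make the `.*` concrete, here is a small sketch. It assumes a recent Zig release, where the types are spelled `*const [2:0]u8` and `[2:0]u8` and the builtin is `@TypeOf`; the variable names are illustrative.

```zig
const std = @import("std");

test "dereferencing a string literal yields an array value" {
    const p = "MC"; // single-item pointer to a sentinel-terminated array
    const arr = p.*; // the array value itself, copied out of the pointer

    try std.testing.expect(@TypeOf(p) == *const [2:0]u8);
    try std.testing.expect(@TypeOf(arr) == [2:0]u8);
    try std.testing.expect(arr.len == 2);
    try std.testing.expect(arr[0] == 'M' and arr[1] == 'C');
}
```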

Idea: instead of [*]null u8, [*]sentinel(0) u8. It's not obvious that null means 0: nowhere else in Zig do we call 0 "null". We could open things up to allow arbitrary values in the future.

Maybe you are onto something here, but please open a separate proposal. This is an implementation of the accepted-for-over-two-years proposal #265.

@daurnimator daurnimator added the breaking label Nov 20, 2019
@daurnimator
Contributor

Maybe you are onto something here, but please open a separate proposal. This is an implementation of the accepted-for-over-two-years proposal #265.

Created as #3731

However now I'm curious: for non-pointer non-integer elements, what does null mean? e.g. [*]null enum{Foo, Bar}

[*]allowzero u8, [*]allowzero const u8,
[*]allowzero volatile u8, [*]allowzero const volatile u8,
[*]allowzero align(4) u8, [*]allowzero align(4) const u8,
[*]allowzero align(4) volatile u8, [*]allowzero align(4) const volatile u8,
Contributor

Don't we need an extra bit of the test matrix here adding null pointers?

@andrewrk
Member Author

However now I'm curious: for non-pointer non-integer elements, what does null mean? e.g. [*]null enum{Foo, Bar}

Not allowed, with the current implementation. error: type "E" has no null value, cannot be used in null terminated pointer.

this also deletes C string literals from the language, and then makes
the std lib changes and compiler changes necessary to get the behavior
tests and std lib tests passing again.
@andrewrk andrewrk force-pushed the null-terminated-pointers branch from 974e44b to 7597735 November 23, 2019 09:47
@andrewrk andrewrk changed the title Null terminated pointers sentinel-terminated pointers Nov 24, 2019
@daurnimator
Contributor

A good update to make would be for std.mem.toSlice and std.mem.toSliceConst to return null terminated slices.
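As a rough illustration of what a sentinel-preserving conversion looks like, the sketch below uses `std.mem.span` from later Zig releases, not the `std.mem.toSlice` discussed here. That substitution, the `[:0]` spelling, and the sample string are assumptions for the example.

```zig
const std = @import("std");

test "recovering a sentinel-terminated slice from a C-style pointer" {
    const ptr: [*:0]const u8 = "hostname";

    // span scans for the 0 terminator and keeps the sentinel in the type.
    const s = std.mem.span(ptr);

    try std.testing.expect(@TypeOf(s) == [:0]const u8);
    try std.testing.expect(s.len == 8);
    try std.testing.expect(s[s.len] == 0);

    // A sentinel-terminated slice still coerces to a plain slice.
    const plain: []const u8 = s;
    try std.testing.expectEqualStrings("hostname", plain);
}
```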

@daurnimator
Contributor

A good update to make would be for std.mem.toSlice and std.mem.toSliceConst to return null terminated slices.

Attempting this I got:

broken LLVM module found: Call parameter type does not match function signature!
  %18 = getelementptr inbounds { %"[]u8", i16 }, { %"[]u8", i16 }* %0, i32 0, i32 0, !dbg !126586
 %"[:0]u8"*  call fastcc void @mem.toSlice(%"[]u8"* sret %18, i8* %16), !dbg !126586
Call parameter type does not match function signature!
  %name = alloca %"[:0]u8", align 8
 %"[]u8"*  %48 = call fastcc i1 @mem.eql(%"[:0]u8"* %name, %"[]u8"* @5875), !dbg !211774
Call parameter type does not match function signature!
  %name = alloca %"[:0]u8", align 8
 %"[]u8"*  %84 = call fastcc i1 @mem.eql(%"[:0]u8"* %name, %"[]u8"* @5874), !dbg !211810
Call parameter type does not match function signature!
  %hostname = alloca %"[:0]u8", align 8
 %"[]u8"*  call fastcc void @mem.copy(%"[]u8"* %9, %"[:0]u8"* %hostname), !dbg !251182
Call parameter type does not match function signature!
  %27 = getelementptr inbounds %"?[]const u8", %"?[]const u8"* %0, i32 0, i32 0, !dbg !284345
 %"[:0]u8"*  call fastcc void @mem.toSlice(%"[]u8"* sret %27, i8* %25), !dbg !284345

@andrewrk andrewrk merged commit 5a98dd4 into master Nov 25, 2019
@andrewrk andrewrk deleted the null-terminated-pointers branch November 25, 2019 07:20
iguessthislldo added a commit to iguessthislldo/georgios that referenced this pull request Nov 28, 2019
Removed C literals
Labels
breaking Implementing this issue could cause existing code to no longer compile or have different behavior.