-
-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implement a ubsan runtime for better error messages #22488
Merged
Merged
Changes from all commits
Commits
Show all changes
22 commits
Select commit
Hold shift + click to select a range
5e0073c
ubsan: add a basic runtime
Rexicon226 eef8d4f
ubsan: switch to using `std.builtin.panicExtra` to log errors
Rexicon226 c27b797
Compilation: use the minimal runtime in `ReleaseSafe`
Rexicon226 babee5f
ubsan: implement some more checks
Rexicon226 95720f0
move libubsan to `lib/` and integrate it into `-fubsan-rt`
Rexicon226 fc77678
mem: add `@branchHint` to `indexOfSentinel`
Rexicon226 590c613
ubsan: implement more checks
Rexicon226 658fba9
ubsan: extend `ptr` before adding to avoid overflow
Rexicon226 50b9556
ubsan: clean-up and remove the unused handlers
Rexicon226 a468929
ubsan: resolve the last of the TODOs
Rexicon226 2d4574a
Compilation: always import ubsan if a ZCU exists
Rexicon226 1417847
main: add `-f{no-}ubsan-rt` to the usage text
Rexicon226 d669b95
ubsan: clean-up a bit more
Rexicon226 9432a9b
build: add `bundle_ubsan_rt`
Rexicon226 9311784
Compilation: correct when to include ubsan
Rexicon226 35b9db3
correct some bugs
Rexicon226 44d3b5a
build: add comments explaining why we disable ubsan
Rexicon226 d4413e3
ubsan: avoid depending on `@returnAddress` combined with `inline`
andrewrk e18c7f9
ubsan: don't create ubsan in every static lib by default
andrewrk faf256e
std.mem.indexOfSentinel: don't ask the OS the page size
andrewrk 2447b87
std.heap.page_size_min: relax freestanding restriction
andrewrk ca83f52
ubsan: update wording
Rexicon226 File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate on your thinking here?
freestanding
means there is no OS so there is no guarantee that the concept of a "page" is even meaningful. For all the standard library knows, paging might not be set up at all. So I don't see how this change is correct.For
other
there's a similar problem: The implication is that there is an OS (as opposed tofreestanding
), but we still don't know what that OS is, so we can't make any valid claims about its minimum page size.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding is that the page size is a CPU architecture feature - the OS has a limited set of choices available based on the CPU architecture. The OS can't pick any arbitrary value for page size. Do you have reason to believe this is not the case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interpretation of "minimum and maximum page size" that we're using here is basically "the minimum and maximum granularity that
mmap
can be configured to operate at" (e.g. at kernel build time). We chose this interpretation specifically because if we just went by the CPU architecture's minimum/maximum, we would get into absurd territory quickly; most CPU architectures support page sizes that are completely impractical as the normal page size of a general-purpose OS, and so these page sizes are either not supported at all by OSs, or only supported with special opt-in flags (e.g.MAP_HUGE*
).Obviously this interpretation is not very meaningful in
freestanding
because for all we know there's nommap
syscall/function at all. But that's fine, because for all we know there may not be any paging at all anyway. And that implies that the standard library must not exploit knowledge about the supposed safety of overlong reads within a page, for example. So forfreestanding
specifically, it just doesn't really make sense to speak of page size. If there actually is one in effect, the programmer needs to tell us about it. Otherwise we're making a potentially invalid assumption.For
other
, the situation is similar. We know there's an OS, but not much else; there's a good chance that it does havemmap
of some description, but it could certainly also be an OS without paging. We just don't know. (This makes me think that there might not be as much utility in theother
OS tag as was originally thought, but I digress.)(I suppose you could also argue that it's possible for an OS to have
mmap
operate at a granularity that has nothing to do with the page sizes supported by the CPU architecture. But I don't know why you would do something that silly, and I don't think anyone has, so this probably need not be a consideration.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm arguing precisely the opposite: it's not possible for an OS to have mmap operate at a granularity different than the page sizes supported by the CPU architecture, which means that
page_size_min
being 4096 on x86_64 is correct, including on freestanding.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make room for this PR to land, and then hash this out separately.
std.mem.indexOfPos
is an important function, so we need to give it the appropriate amount of engineering effort it deserves. That means even though I'm tempted to simply say, "this violates pointer provenance, fuck it," I still need to consider whether a page-aware implementation should be considered. It absolutely needs to work on freestanding by default however.@tiehuis, if available, should be involved in that discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be clear, what I was getting at with that paragraph is that you can do some incredibly silly stuff with page protection and instruction emulation/resumption at the supervisor level to make it appear to userspace as if the machine has page sizes that don't look anything like the architectural page sizes. It'd be very slow and very stupid, which is why I also said it probably shouldn't be a real consideration.
👍 But just out of curiosity, why were these changes needed in this PR? That's not clear from the commit messages.
Agree; I just think the answer here is to avoid the page-size-dependent code paths altogether in the freestanding case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, sorry I thought you saw - ubsan_rt depends on
std.mem.indexOfSentinel
, which was callingstd.heap.pageSize()
. When that function was implemented, it was accessing a comptime value; in the runtime page size changeset, it was carelessly updated to be a runtime-known value, regressing the function on freestanding/other targets.https://github.com/ziglang/zig/actions/runs/13497343926/job/37707452929
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Wait, was there ever a fully comptime
pageSize()
function? I thought it was just apage_size
constant before. FWIW, it's intentional thatpageSize()
is potentially runtime: The actual page size in effect can be different (and often is, e.g. on aarch64) from thepage_size_min
/page_size_max
constants. The latter only tell you the bounds of possible values forpageSize()
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, by "that function", I meant
std.mem.indexOfSentinel
.