make overflow arithmetic builtins return a tuple instead of using a pointer parameter and bool return value #10248
Comments
Any reason not to have `@addOverflow(a: T, b: T) Overflow!T`, which seems to be pretty much made for this kind of thing?
Yes. This builtin returns not only the overflow bit but the other bits as well.
@RogierBrussee Here are a couple of use cases for getting both the overflowed result and the excess carry bit:
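(The original examples are not preserved in this capture. As one illustrative sketch of such a use case, not taken from the comment itself: multi-limb big-integer addition needs both the wrapped sum and the carry bit of every limb. The sketch below assumes the tuple-returning form of `@addWithOverflow` and a hypothetical helper name `addLimbs`.)

```zig
// Hypothetical illustration: add two little-endian u64 limb slices of equal
// length, propagating the carry bit, and return the final carry.
fn addLimbs(result: []u64, a: []const u64, b: []const u64) u1 {
    var carry: u1 = 0;
    for (result, a, b) |*r, ai, bi| {
        const s1 = @addWithOverflow(ai, bi);
        const s2 = @addWithOverflow(s1[0], @as(u64, carry));
        r.* = s2[0];
        // At most one of the two additions can carry out, so `|` is enough.
        carry = s1[1] | s2[1];
    }
    return carry;
}
```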
I can confirm the findings: https://gist.github.com/matu3ba/f848765e3c67ee74781b57858049d692

'./addo_noptr' ran
1.06 ± 0.03 times faster than './addo_fast'
1.15 ± 0.04 times faster than './addo_simple'

Since the benchmarks show a roughly 1.10x speedup of the Hacker's Delight solution vs the simple overflow test everyone does, I will provide
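(The gist's actual benchmark variants are not reproduced here. As a rough illustration of the two styles being compared, under my own labeling and assumptions rather than the gist's: a "simple" pre-check against the maximum value versus a wrapping add followed by a carry test in the Hacker's Delight style.)

```zig
const std = @import("std");

// "Simple" overflow test: pre-check against maxInt before adding.
fn addCheckedSimple(a: u64, b: u64) ?u64 {
    if (a > std.math.maxInt(u64) - b) return null;
    return a + b;
}

// Wrapping add, then derive the carry from the wrapped sum:
// an unsigned sum wrapped iff it is smaller than either operand.
fn addCheckedWrap(a: u64, b: u64) ?u64 {
    const s = a +% b;
    if (s < a) return null;
    return s;
}
```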
change signature of arithmetic operations @addWithOverflow, @subWithOverflow, @mulWithOverflow, @shlWithOverflow from @operation(comptime T: type, a: T, b: T, result: *T) bool to @operation(comptime T: type, a: T, b: T) anytype, with anytype being a tuple struct { res: T, ov: bool }. This removes the pointer store and load for efficiency of codegen. Comptime operation is accordingly kept in sync. Closes ziglang#10248
What is the new signature for the builtin functions? The initial issue comment contains two differing versions in the specification and in the example code. Is it @addWithOverflow(a: T, b: T) tuple {T, u1} or @addWithOverflow(comptime T: type, a: T, b: T) tuple {T, u1}? Asking because I'm working on implementing this change in the self-hosted compiler (on top of #10854).
I could be missing something, but was it ever discussed how the result of these builtins would be assigned? The examples look like this:

```zig
x, const add_overflow = @addWithOverflow(T, x, digit);
self.x, const overflow = @addWithOverflow(u32, self.x, freq);
val.*, carry = @addWithOverflow(u8, val.*, 1);
```

which doesn't look like Zig code at all to me. Surely this is not how it's going to look anyway? Or is this un-Ziggy assignment syntax part of #4335 and I'm missing it?

I don't know what the accepted way was to access the members of a tuple, but I think it was:

```zig
const op = @addWithOverflow(@as(u8, 128), @as(u8, 64));
_ = op.@"0";
_ = op.@"1";
```

I think this is not optimal in terms of readability. I would prefer something like @addWithOverflow(a: T, b: T) struct { result: T, overflow: u1 }, used like this:

```zig
const op = @addWithOverflow(@as(u8, 128), @as(u8, 64));
_ = op.result;
_ = op.overflow;
```

which I think is just much easier to understand. Compared to this, using a tuple with unnamed fields could require you to consult the langref.
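(For context, and speculative at the time of this comment: if the destructuring syntax proposed in #4335 were also accepted, consuming the unnamed-field tuple could avoid the `.@"0"`/`.@"1"` accesses entirely. A minimal sketch assuming both features:)

```zig
// Assumes the tuple-returning builtin plus #4335-style destructuring.
const sum, const overflow = @addWithOverflow(@as(u8, 128), @as(u8, 64));
_ = sum;
_ = overflow;
```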
It should return a vector of bools for compatibility with scalar operands and stage1 until ziglang#10248 can be implemented. Closes ziglang#13201
@andrewrk Is there a proposal or discussion for this syntax?

```zig
x, const mul_overflow = @mulWithOverflow(T, x, radix);
```
Let's examine integer parsing without `@addWithOverflow` and `@mulWithOverflow`:
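(The original snippet is not preserved in this capture. A minimal sketch of such a parser, assuming a base-10 `u64` parser with manual overflow checks and a hypothetical name `parseU64`:)

```zig
const std = @import("std");

fn parseU64(buf: []const u8) error{ InvalidCharacter, Overflow }!u64 {
    var x: u64 = 0;
    for (buf) |c| {
        if (c < '0' or c > '9') return error.InvalidCharacter;
        const digit: u64 = c - '0';
        // Manual pre-check: overflow iff x * 10 + digit would exceed maxInt(u64).
        if (x > (std.math.maxInt(u64) - digit) / 10) return error.Overflow;
        x = x * 10 + digit;
    }
    return x;
}
```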
With `@addWithOverflow` and `@mulWithOverflow`:
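(Again a sketch rather than the issue's original code, using the status quo pointer-result signatures of the builtins as they existed at the time:)

```zig
fn parseU64(buf: []const u8) error{ InvalidCharacter, Overflow }!u64 {
    var x: u64 = 0;
    for (buf) |c| {
        if (c < '0' or c > '9') return error.InvalidCharacter;
        const digit: u64 = c - '0';
        // Status quo form: the wrapped result is written through the pointer,
        // and the builtin returns `true` on overflow.
        if (@mulWithOverflow(u64, x, 10, &x)) return error.Overflow;
        if (@addWithOverflow(u64, x, digit, &x)) return error.Overflow;
    }
    return x;
}
```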
godbolt link
Observations: Even with `-OReleaseFast` optimizations on, LLVM is able to generate better code with `@addWithOverflow` and `@mulWithOverflow`. The reason for this is that the backend that generates machine code can special-case the result of these functions, and turn struct field accesses followed by jumps into `jo` rather than actually doing multiplication with more bits. Certainly in a debug build, this has the potential to generate much better runtime code.
So that's why these builtins exist: they help the programmer avoid integer overflow bugs, while generating efficient code.
There is a problem though, which is that the current function signature sabotages the potential for debug code to be efficient. The status quo function signature is:
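(The declaration itself is not preserved in this capture, but it is the pointer-result form quoted in the commits linked above:)

```zig
// Status quo (stage1) signature, as referenced in the linked commits:
@addWithOverflow(comptime T: type, a: T, b: T, result: *T) bool
```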
The problem is the result pointer. As a point of comparison, here is the corresponding LLVM builtin signature:
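(Also not preserved in this capture; for reference, LLVM's documented overflow intrinsics return an aggregate of the result and the overflow bit, shown here for 32-bit signed addition as in the LLVM LangRef:)

```llvm
; The intrinsic returns {result, overflow-bit} rather than writing through a pointer.
declare {i32, i1} @llvm.sadd.with.overflow.i32(i32 %a, i32 %b)
```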
Now that we have started to get into writing our own backends and not relying exclusively on LLVM, I'm seeing the flaw: writing the result through a pointer parameter makes it too hard to use a special value returned from the builtin and detect the pattern that allows lowering to the efficient code.
For example, in the x86 backend code, the MCValues tagged union can represent a value that is partially in the condition flags and partially in a register, which is exactly what would be the return type of one of these arithmetic overflow functions. However, loading through the result pointer messes this mechanism up.
Likewise in a debug build even with the LLVM backend, it messes up LLVM's ability to do the same. It ends up writing the overflow flag to a stack value, which then gets fetched via the result pointer.
Furthermore, the result pointer does not harmonize with SIMD vectors (related: #6835).
I propose that the arithmetic overflow functions return a tuple, like this (related: #6771, #4335):
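(The proposed declaration is not preserved here. Per the later comments in this thread, it is a tuple-returning form along these lines; the exact spelling of the return type was still being discussed:)

```zig
// One tuple-returning form discussed in this thread (exact spelling debated):
@addWithOverflow(a: T, b: T) struct { T, u1 }
```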
With #498 we can now look at our `parseInt` function again:
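(The original rewrite is not preserved in this capture. A minimal reconstruction, assuming the tuple-returning builtins: the intermediate results become consts and no result pointer or extra mutable variable is needed for them.)

```zig
fn parseU64(buf: []const u8) error{ InvalidCharacter, Overflow }!u64 {
    var x: u64 = 0;
    for (buf) |c| {
        if (c < '0' or c > '9') return error.InvalidCharacter;
        const digit: u64 = c - '0';
        // Each intermediate is a const tuple of {wrapped result, overflow bit}.
        const mul = @mulWithOverflow(x, @as(u64, 10));
        if (mul[1] != 0) return error.Overflow;
        const add = @addWithOverflow(mul[0], digit);
        if (add[1] != 0) return error.Overflow;
        x = add[0];
    }
    return x;
}
```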
It's effectively the same code, except we have eliminated the use of a mutable variable, and it is now possible for even simple, unoptimizing backends to quickly lower this to efficient runtime code.