Add `{u,s}{add,sub,mul}_overflow` instructions #5784

T0b1-iOS · 2023-02-15T03:41:33Z

Currently, there is not really a way to efficiently detect arithmetic over/underflow which some ISAs (in particular x64) allow and can be important. For example, I'm currently trialing Cranelift as a backend in a database system which needs efficient overflow handling.
There also seems to be more general interest in overflow detection (see #1044).

This PR adds instructions for unsigned and signed add/sub/mul which return a second output indicating overflow.
add/sub support 8 through 128 bit integers while the mul variants only support up to 64 bit integers. (That's because the most efficient implementation for umul_overflow with 128bit integers is along the lines of 50 instructions on x64 at which point it is probably better to emit a function call).
I left iadd_cout and friends untouched as the semantics are a bit confusing right now, e.g. iadd_cout seems to be indicating signed overflow while the pseudocode in the description describes unsigned overflow).

In detail, this PR adds:

{u,s}{add,sub,mul}_overflow with an implementation in the interpreter (I hope the naming convention is ok)
runtests for these instructions
support for 8/16 bit operands for AluRmiR/AluRM instructions in the x64 backend
support for the usigned mul instruction on x64
support for UMAddL/SMAddL instructions in the aarch64 backend
lowerings of {u,s}{add,sub,mul}_overflow for x64 and aarch64

Currently, there are no emit tests since they were not important for the initial testing and I would like to know whether there is interest in these instructions being included first.
Additionally, the 128bit lowerings currently forego the flag helpers since they are, as far as I can tell, not really designed to produce three outputs so getting both the low/high part of an addition and the dst reg for a setcc seemed a bit complicated.
I also do not have a lot of knowledge about RV64/S390X so I don't think I can provide lowerings there.

cfallin · 2023-02-16T21:46:04Z

FWIW, I think that these instructions are a reasonable addition -- thanks for proposing and prototyping this!

I think, as you suggest as well, it's worth clarifying how this interacts with the existing operators; in particular

I left iadd_cout and friends untouched as the semantics are a bit confusing right now, e.g. iadd_cout seems to be indicating signed overflow while the pseudocode in the description describes unsigned overflow).

does indeed seem to be true (e.g. here for x86-64), but is pretty surprising to me, and IMHO is incorrect. It doesn't appear that iadd_cout is the target of any rewrite in cranelift-codegen proper, and is not used by cranelift-wasm; if at least cg_clif can adapt to these operators instead (cc @bjorn3 ?) then I like the explicitness of these opcode names much better and we should remove iadd_cout.

If others agree, I'm happy to do a full review here!

bjorn3 · 2023-02-16T22:40:33Z

cg_clif currently has all overflow checks manually due to iadd_cout not working in combinations of target archs and integer sizes. I did be more than happy to switch to the _overflow instructions added by this PR.

T0b1-iOS · 2023-03-25T22:32:41Z

Okay, sorry for the long wait but university got a bit busy the past month. I did the emit tests now (and fixed a few bugs along the way) and I think this is now in a somewhat cleaned up stage and could be reviewed.
The CI currently fails because of the s390x tests failing which is because QEMU incorrectly emulates the overflow flag (see https://gitlab.com/qemu-project/qemu/-/issues/618). This was fixed in QEMU 7.0.0 so the version in the CI would need to be updated but I don't know if I can do that.

bjorn3 · 2023-03-27T11:04:10Z

Tried this with cg_clif. Didn't find any failing tests. https://github.com/bjorn3/rustc_codegen_cranelift/tree/clif_overflow_insts

afonso360

The interpreter implementation LGTM! Thanks!

I'm running the fuzzer now, I'll report back if it finds anything.

One small note, I've had to disable the unimplemented ops in the fuzzer can you include that commit with these changes? Otherwise it crashes because it tries to generate those instructions when the architectures don't have lowerings for them.

cranelift/codegen/meta/src/shared/instructions.rs

cfallin

I made an initial pass over this -- thank you very much for a thorough implementation job on both x64 and aarch64!

I left a bunch of comments that are "nits", but also a few issues I think we should resolve before merging -- the with_flags fix is the most important one I think. I also notice you basically added 8/16-bit support to ALU ops generally in x64. It might be useful to split that out as a separate PR so we can review it (and so others can take a look more easily -- @abrown maybe in particular). Overall though, the shape of this looks right and it's close to mergeable -- thanks again!

cranelift/codegen/src/isa/aarch64/inst/mod.rs

cranelift/codegen/src/isa/aarch64/inst.isle

cranelift/codegen/src/isa/aarch64/lower.isle

cranelift/codegen/src/isa/x64/inst.isle

cranelift/codegen/src/isa/x64/lower.isle

cranelift/filetests/filetests/runtests/sadd_overflow.clif

cranelift/codegen/src/isa/aarch64/lower.isle

cfallin

This looks good now -- thanks so much for your patience as we iterated on the implementation!

I think once the merge conflict in the interpreter is resolved, this should be good to merge.

T0b1-iOS · 2023-04-11T19:21:07Z

This looks good now -- thanks so much for your patience as we iterated on the implementation!

No problem, I was the slow one, after all ^^

I think once the merge conflict in the interpreter is resolved, this should be good to merge.

I'll rebase the branch if that's okay.

* add `{u,s}{add,sub,mul}_overflow` with interpreter * add `{u,s}{add,sub,mul}_overflow` for x64 * add `{u,s}{add,sub,mul}_overflow` for aarch64 * 128bit filetests for `{u,s}{add,sub,mul}_overflow` * `{u,s}{add,sub,mul}_overflow` emit tests for x64 * `{u,s}{add,sub,mul}_overflow` emit tests for aarch64 * Initial review changes * add `with_flags_extended` helper * add `with_flags_chained` helper

github-actions bot added cranelift Issues related to the Cranelift code generator cranelift:area:aarch64 Issues related to AArch64 backend. cranelift:area:x64 Issues related to x64 codegen cranelift:meta Everything related to the meta-language. labels Feb 15, 2023

T0b1-iOS force-pushed the arith_of branch 4 times, most recently from 3e4c8bb to 23f75de Compare March 25, 2023 21:12

T0b1-iOS marked this pull request as ready for review March 25, 2023 22:32

afonso360 mentioned this pull request Mar 26, 2023

CI: Upgrade QEMU to 7.2.0 #6102

Merged

T0b1-iOS force-pushed the arith_of branch from 23f75de to d5aaa3e Compare March 26, 2023 23:47

afonso360 approved these changes Mar 27, 2023

View reviewed changes

cranelift/codegen/meta/src/shared/instructions.rs Show resolved Hide resolved

cfallin reviewed Mar 27, 2023

View reviewed changes

T0b1-iOS requested a review from a team as a code owner April 3, 2023 20:06

T0b1-iOS requested review from elliottt and removed request for a team April 3, 2023 20:06

github-actions bot added the cranelift:area:machinst Issues related to instruction selection and the new MachInst backend. label Apr 3, 2023

cfallin reviewed Apr 3, 2023

View reviewed changes

cranelift/codegen/src/isa/aarch64/lower.isle Outdated Show resolved Hide resolved

elliottt removed their request for review April 3, 2023 22:18

cfallin approved these changes Apr 11, 2023

View reviewed changes

T0b1-iOS added 5 commits April 11, 2023 21:42

add {u,s}{add,sub,mul}_overflow with interpreter

8c80d3e

add {u,s}{add,sub,mul}_overflow for x64

0f84349

add {u,s}{add,sub,mul}_overflow for aarch64

3aeb330

128bit filetests for {u,s}{add,sub,mul}_overflow

639d6b3

{u,s}{add,sub,mul}_overflow emit tests for x64

ebeb3aa

T0b1-iOS added 4 commits April 11, 2023 21:42

{u,s}{add,sub,mul}_overflow emit tests for aarch64

3d2c419

Initial review changes

7407eb2

add with_flags_extended helper

2589d7c

add with_flags_chained helper

afa26e9

T0b1-iOS force-pushed the arith_of branch from 7ceab60 to afa26e9 Compare April 11, 2023 19:44

cfallin enabled auto-merge April 11, 2023 20:15

cfallin added this pull request to the merge queue Apr 11, 2023

Merged via the queue into bytecodealliance:main with commit 569089e Apr 11, 2023

T0b1-iOS mentioned this pull request Apr 11, 2023

Remove iadd_cout and isub_bout #6198

Merged

afonso360 mentioned this pull request Apr 12, 2023

cranelift: Rename iadd_carry/ iadd_cout instructions #5165

Closed

jameysharp mentioned this pull request Apr 14, 2023

Missing variants of umul/smul/ushl/sshl/ushr/sshr returning if it overflows #1044

Closed

scottmcm mentioned this pull request Apr 18, 2023

Cranelift: handle carry flags as bools in backends #2860

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add `{u,s}{add,sub,mul}_overflow` instructions #5784

Add `{u,s}{add,sub,mul}_overflow` instructions #5784

T0b1-iOS commented Feb 15, 2023

cfallin commented Feb 16, 2023

bjorn3 commented Feb 16, 2023

T0b1-iOS commented Mar 25, 2023

bjorn3 commented Mar 27, 2023 •

edited

Loading

afonso360 left a comment

cfallin left a comment

cfallin left a comment

T0b1-iOS commented Apr 11, 2023

Add {u,s}{add,sub,mul}_overflow instructions #5784

Add {u,s}{add,sub,mul}_overflow instructions #5784

Conversation

T0b1-iOS commented Feb 15, 2023

cfallin commented Feb 16, 2023

bjorn3 commented Feb 16, 2023

T0b1-iOS commented Mar 25, 2023

bjorn3 commented Mar 27, 2023 • edited Loading

afonso360 left a comment

Choose a reason for hiding this comment

cfallin left a comment

Choose a reason for hiding this comment

cfallin left a comment

Choose a reason for hiding this comment

T0b1-iOS commented Apr 11, 2023

Add `{u,s}{add,sub,mul}_overflow` instructions #5784

Add `{u,s}{add,sub,mul}_overflow` instructions #5784

bjorn3 commented Mar 27, 2023 •

edited

Loading