Add `{u,s}{add,sub,mul}_overflow` instructions #5784
FWIW, I think that these instructions are a reasonable addition -- thanks for proposing and prototyping this! I think, as you suggest as well, it's worth clarifying how this interacts with the existing operators; in particular
does indeed seem to be true (e.g. here for x86-64), but is pretty surprising to me, and IMHO is incorrect. It doesn't appear that If others agree, I'm happy to do a full review here!
cg_clif currently does all overflow checks manually because `iadd_cout` does not work for certain combinations of target archs and integer sizes. I'd be more than happy to switch to the `_overflow` instructions added by this PR.
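For readers unfamiliar with what a "manual" overflow check looks like, here is a minimal sketch in plain Rust. It is not taken from cg_clif; the function names `uadd_manual` and `sadd_manual` are hypothetical, and 8-bit types are used only to keep the example small. It shows the kind of extra comparison or bit manipulation that a dedicated overflow-returning instruction would make unnecessary.

```rust
/// Unsigned add with a manual overflow check (hypothetical helper).
/// The wrapped sum is smaller than an operand exactly when a carry
/// out of the top bit occurred.
fn uadd_manual(a: u8, b: u8) -> (u8, bool) {
    let sum = a.wrapping_add(b);
    (sum, sum < a)
}

/// Signed add with a manual overflow check (hypothetical helper).
/// Overflow happened exactly when both operands share a sign and the
/// wrapped sum has the opposite sign, i.e. the sign bit of
/// `(a ^ sum) & (b ^ sum)` is set.
fn sadd_manual(a: i8, b: i8) -> (i8, bool) {
    let sum = a.wrapping_add(b);
    (sum, ((a ^ sum) & (b ^ sum)) < 0)
}
```

Both helpers agree with Rust's built-in `overflowing_add` for every pair of 8-bit inputs, which is a convenient way to sanity-check such hand-written checks.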
3e4c8bb to 23f75de
Okay, sorry for the long wait, but university got a bit busy the past month. I've added the emit tests now (and fixed a few bugs along the way), and I think this is now in a somewhat cleaned-up state and could be reviewed.
Tried this with cg_clif. Didn't find any failing tests. https://github.com/bjorn3/rustc_codegen_cranelift/tree/clif_overflow_insts
The interpreter implementation LGTM! Thanks!
I'm running the fuzzer now, I'll report back if it finds anything.
One small note: I've had to disable the unimplemented ops in the fuzzer. Can you include that commit with these changes? Otherwise it crashes because it tries to generate those instructions when the architectures don't have lowerings for them.
I made an initial pass over this -- thank you very much for a thorough implementation job on both x64 and aarch64!
I left a bunch of comments that are "nits", but also a few issues I think we should resolve before merging -- the `with_flags` fix is the most important one I think. I also notice you basically added 8/16-bit support to ALU ops generally in x64. It might be useful to split that out as a separate PR so we can review it (and so others can take a look more easily -- @abrown maybe in particular). Overall though, the shape of this looks right and it's close to mergeable -- thanks again!
This looks good now -- thanks so much for your patience as we iterated on the implementation!
I think once the merge conflict in the interpreter is resolved, this should be good to merge.
No problem, I was the slow one, after all ^^
I'll rebase the branch if that's okay.
* add `{u,s}{add,sub,mul}_overflow` with interpreter
* add `{u,s}{add,sub,mul}_overflow` for x64
* add `{u,s}{add,sub,mul}_overflow` for aarch64
* 128bit filetests for `{u,s}{add,sub,mul}_overflow`
* `{u,s}{add,sub,mul}_overflow` emit tests for x64
* `{u,s}{add,sub,mul}_overflow` emit tests for aarch64
* Initial review changes
* add `with_flags_extended` helper
* add `with_flags_chained` helper
Currently, there is not really a way to efficiently detect arithmetic over/underflow, even though some ISAs (in particular x64) support this directly and it can be important. For example, I'm currently trialing Cranelift as a backend in a database system which needs efficient overflow handling.
There also seems to be more general interest in overflow detection (see #1044).
This PR adds instructions for unsigned and signed add/sub/mul which return a second output indicating overflow.
`add`/`sub` support 8- through 128-bit integers, while the `mul` variants only support up to 64-bit integers. (That's because the most efficient implementation for `umul_overflow` with 128-bit integers is along the lines of 50 instructions on x64, at which point it is probably better to emit a function call.)

I left `iadd_cout` and friends untouched as the semantics are a bit confusing right now (e.g. `iadd_cout` seems to be indicating signed overflow while the pseudocode in the description describes unsigned overflow).

In detail, this PR adds:
* `{u,s}{add,sub,mul}_overflow` with an implementation in the interpreter (I hope the naming convention is ok)
* `AluRmiR`/`AluRM` instructions in the x64 backend
* `mul` instruction on x64
* `UMAddL`/`SMAddL` instructions in the aarch64 backend
* `{u,s}{add,sub,mul}_overflow` for x64 and aarch64

Currently, there are no emit tests since they were not important for the initial testing and I would like to know whether there is interest in these instructions being included first.
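As a rough model of the "result plus overflow flag" shape described for the new instructions, Rust's standard `overflowing_add`, `overflowing_sub`, and `overflowing_mul` methods return a wrapped result together with a boolean. The sketch below is plain Rust rather than Cranelift code: the function names merely mirror the instruction names, the 8-bit width is chosen for readability, and the assumption that the instructions return the wrapped value alongside the flag is mine, not something the PR text spells out. The helper `flags_for_bits` (hypothetical) shows why the signed/unsigned distinction that makes `iadd_cout` confusing matters: the same bit patterns can overflow in one interpretation and not in the other.

```rust
// Illustrative model only: names mirror the instruction names from this PR,
// but the bodies are ordinary Rust standard-library calls.
fn uadd_overflow(a: u8, b: u8) -> (u8, bool) { a.overflowing_add(b) }
fn sadd_overflow(a: i8, b: i8) -> (i8, bool) { a.overflowing_add(b) }
fn usub_overflow(a: u8, b: u8) -> (u8, bool) { a.overflowing_sub(b) }
fn ssub_overflow(a: i8, b: i8) -> (i8, bool) { a.overflowing_sub(b) }
fn umul_overflow(a: u8, b: u8) -> (u8, bool) { a.overflowing_mul(b) }
fn smul_overflow(a: i8, b: i8) -> (i8, bool) { a.overflowing_mul(b) }

/// Adds the same two bit patterns as unsigned and as signed values and
/// returns (unsigned overflow flag, signed overflow flag).
fn flags_for_bits(a: u8, b: u8) -> (bool, bool) {
    let (_, unsigned_flag) = uadd_overflow(a, b);
    let (_, signed_flag) = sadd_overflow(a as i8, b as i8);
    (unsigned_flag, signed_flag)
}
```

For instance, `0x7f + 0x01` overflows only when read as signed, while `0xff + 0x01` overflows only when read as unsigned, so a single "carry out" flag cannot serve both interpretations.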
Additionally, the 128-bit lowerings currently forgo the flag helpers since they are, as far as I can tell, not really designed to produce three outputs, so getting both the low/high parts of an addition and the dst reg for a `setcc` seemed a bit complicated.
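To make the "three outputs" point concrete, here is a minimal sketch of an unsigned 128-bit add built from 64-bit halves. It is plain Rust, not the lowering from this PR; the function name `uadd128_overflow` and the argument order are hypothetical. It only illustrates that a 128-bit overflow-checked add naturally yields a low half, a high half, and a flag.

```rust
/// Unsigned 128-bit add from 64-bit halves (hypothetical helper).
/// Returns (low half, high half, overflow flag).
fn uadd128_overflow(a_lo: u64, a_hi: u64, b_lo: u64, b_hi: u64) -> (u64, u64, bool) {
    // Low halves: an ordinary add whose carry feeds into the high half.
    let (lo, carry) = a_lo.overflowing_add(b_lo);
    // High halves: an add-with-carry, written here as two separate steps.
    let (hi_partial, overflow_a) = a_hi.overflowing_add(b_hi);
    let (hi, overflow_b) = hi_partial.overflowing_add(carry as u64);
    // At most one of the two steps can overflow, so OR-ing them is enough.
    (lo, hi, overflow_a || overflow_b)
}
```

Calling `uadd128_overflow(u64::MAX, u64::MAX, 1, 0)` models `u128::MAX + 1`: both halves wrap to zero and the flag is set.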
I also do not have a lot of knowledge about RV64/S390X so I don't think I can provide lowerings there.