-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
aarch64 add basic i128 bit ops #2959
Conversation
v5, v6 = isplit v4 | ||
return v5, v6 | ||
} | ||
; run: %ishl_i128_i8(0x01010101_01010101, 0x01010101_01010101, 2) == [0x04040404_04040404, 0x04040404_04040404] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cfallin I pretty much copied the testing code from x64/shift-i128-run.clif. And added a few more test cases where my implementation had issues.
Do you think it would be a good idea to merge run tests so that they could be multi arch? I've been thinking about something along the lines of:
testfiles/runtests/i128-bitops.clif
:
test run
target x86_64
target aarch64
target s390x
function %rest_of_the_tests() {}
I think we should be able to do this for all runtests? This way we could reuse all of these test cases for all arches.
The down side of this is that we may have to be more granular with the testfiles. For example this file right here fails for x86_64 because some of the bit ops fail. So we would have to split this into shift_run and bitops_run or something along those lines.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sharing run-tests makes a lot of sense! In general I actually want to try to shift tests from golden-code (compile tests) to golden-output (run tests) as this makes our suite more robust against cross-cutting backend changes, such as regalloc optimizations; so there are additional benefits, aside from sharing across architectures.
Could you be more specific about the failures you're seeing on x86_64, though? That's somewhat concerning -- unimplemented opcode, incorrect result, or something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Ill try to make a PR changing this soon.
About x86_64:
BandNot is failling on this assert.
BorNot / BxorNot / Cls (not on this PR yet) are not implemented at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK, for tests like those for BandNot
where we don't have an implementation on x86_64, we can just omit the target
line in the test file, perhaps with a comment saying "not yet implemented on $PLATFORM".
98295ae
to
c99c02c
Compare
c99c02c
to
67ab750
Compare
67ab750
to
1eb4259
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
I'll plan to merge your runtests-consolidation PR (#2964) first, then I think this should be rebased over it, if I'm not mistaken?
Yeah, sure, merge that first, and then I'll update this |
Currently we just basically use a two instruction version of the same i64 ops. IMMLogic doesn't really support multiple register inputs, so its left as a TODO for future optimizations.
8383d71
to
988ddfd
Compare
988ddfd
to
b1475f3
Compare
Hey, some more ops implemented for i128 support.
I think the shifts can be reduced by a few instructions (especially the last csel), but I'm not seeing how right now.
I also didn't implement support for immlogic, changing it to support multiple registers didn't seem like a simple change.
This adds support for: