-
Notifications
You must be signed in to change notification settings - Fork 277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add x86 AES-NI vendor intrinsics #311
Conversation
I would just use that as "inspiration" for the tests. For the docs you can check the clang-docs. IIRC @alexcrichton suggested that we could base our docs on those and then just add attribution on the readme. |
coresimd/src/x86/x86_64/aes.rs
Outdated
|
||
use x86::*; | ||
|
||
#[simd_test = "aex"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be aes
instead of aex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
coresimd/src/x86/x86_64/aes.rs
Outdated
#[target_feature(enable = "aes")] | ||
#[cfg_attr(test, assert_instr(aeskeygenassist))] | ||
pub unsafe fn _mm_aeskeygenassist_si128(a: __m128i, imm8: u8) -> __m128i { | ||
aeskeygenassist(a, imm8) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't probably work because imm8
must be a constant but here it is a run-time value.
You need a constify
macro for this, from the description I'd guess that you need constify_imm8!
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With constify it does work, indeed.
Thanks! These are currently in |
The tests now pass on
Can they? Intel does not list any 32-bit CPUs with AES-NI, unless I am missing something.
The constants in the tests should not be an issue. The documentation comments in Clang’s |
@ruuda ah that's true, but making making code compatible across x86/x86_64 tends to be easier if it's available everywhere. The I think it also matches what clang/gcc do, making the intrinsics available on both platforms. |
coresimd/src/x86/x86_64/aes.rs
Outdated
unsafe fn test_mm_aeskeygenassist_si128() { | ||
// Constants taken from https://msdn.microsoft.com/en-us/library/cc714138.aspx. | ||
let a = __m128i(0x0123456789abcdef, 0x8899aabbccddeeff); | ||
let e = __m128i(0x857c266b7c266e85, 0xeac4eea9c4eeacea); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this use _mm_set_epi64x
to avoid making an assumption about the internal layout of __m128i
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Ah, I see, that makes sense. Moved them into i686. |
64-bit CPUs can run 32-bit binaries. AFAIK 32-bit binaries can call 64-bit CPU intrinsics without issues, but they obviously won't be able to run on CPUs that don't have these intrinsics. |
Looks like stdsimd's runtime detection may need to be taught the aes feature? |
@alexcrichton yes, I am on it. |
These tests are based on the examples in Microsoft's documentation. Same input should result in the same output in any case.
Use _mm_set_epi64x instead to construct constants.
Although i686 does not have the AES New Instructions, making code compatible across x86 and x64_64 tends to be easier if the intrinsics are available everywhere.
Looks like that constify macro expanded to an enormous function ... is there anything we can do about that? |
coresimd/src/x86/i686/aes.rs
Outdated
/// a round key for encryption cipher using data from `a` and an 8-bit round constant. | ||
#[inline] | ||
#[target_feature(enable = "aes")] | ||
#[cfg_attr(test, assert_instr(aeskeygenassist))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this
assert_instr(aeskeygenassist)
to (from the top of my head):
assert_instr(aeskeygenassist, imm8 = 0)
That should fix the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to explain what's going on, imm8
must be a compile-time argument, and while testing the disassembly, we need to pass it a value. If one doesn't specify it in the assert_instr
macro, it will pass it a run-time value, and that won't work for this intrinsic.
We don't have a way of expressing that function arguments must be const yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the explanation. Should be fixed now.
I was a bit worried that this meant that the entire match statement would end up in the final binary, so I also verified the disassembly of test_mm_aeskeygenassist_si128
in release mode, and fortunately it does get const-folded:
0000000000041720 <coresimd::x86::i686::aes::tests::test_mm_aeskeygenassist_si128::test_mm_aeskeygenassist_si128>:
41720: 55 push %rbp
41721: 48 89 e5 mov %rsp,%rbp
41724: 48 81 ec 80 00 00 00 sub $0x80,%rsp
4172b: 66 0f 3a df 05 3b c7 aeskeygenassist $0x5,0x11c73b(%rip),%xmm0 # 15de70 <str.4y+0x20>
41732: 11 00 05
41735: 66 0f 7f 45 90 movdqa %xmm0,-0x70(%rbp)
4173a: 0f 28 05 9f c7 11 00 movaps 0x11c79f(%rip),%xmm0 # 15dee0 <str.4y+0x90>
41741: 0f 29 45 80 movaps %xmm0,-0x80(%rbp)
In debug mode it doesn’t though, instead there is a call to _mm_aeskeygenassist_si128
which contains the entire match statement. Something along the lines of rust-lang/rust#47980 would be good to have indeed.
Pass a particular value for the disassembly test, so we end up with one instruction, instead of the match arm that matches on all 256 values.
The only thing failing at this point, is that the Intel xml file claims that the |
@ruuda for now I'd stick exactly to what Intel says, and Intel says |
Bah it looks like that yield an LLVM error. Maybe the LLVM intrinsic needs a |
Intel documentation specifies it as an "8-bit round constant", but then goes on to give it a type "const int", which translates to i32 in Rust. The test that verifies the Rust signatures against Intel documentation failed on this. For now we will replicate the C API verbatim. Even when Rust could have a more accurate type signature that makes passing values more than 8 bits impossible, rather than silently mapping out-of-range values to 255.
It compiles 🎉 |
Nice! Now you just need to fix the minor formatting and clippy issues and this is good to go! (see the last two build bots!) |
Thanks @ruuda! |
Fixes #295.
I have not tested this yet, waiting for a nightly that includes rust-lang/rust@2497d10 to be released.
Also, the MSDN documentation is a bit clearer and more comprehensive than the Intel intrinsics guide (thanks @gnzlbg for pointing this out), but I am not sure if we could copy it into the docs license-wise. In the worst case we could add links to the MSDN docs.