Update montgomery multiplication to use s2n-bignum's verified scalar bignum functions #1135
…omery multiplication
Nice improvements. Can we get perf numbers for GV3 and possibly Apple M1/M2?
@dkostic For microarchitectures with sufficiently fast multipliers (such as Graviton 3 and M1), the current version of s2n-bignum did not bring a meaningful performance improvement. To selectively apply s2n-bignum to Graviton 2, this patch invokes CRYPTO_is_ARMv8_wide_multiplier_capable():
// The Neoverse V1 and Apple M1 micro-architectures are detected to enable
// high unrolling factor of AES-GCM and other algorithms that leverage a
// wide crypto pipeline and fast multiplier.
#define ARMV8_NEOVERSE_V1 (1 << 12)
#define ARMV8_APPLE_M1 (1 << 13)
From the CI failure, it seems that on ARM the s2n-bignum assembly files are not linked to executables if …
Relevant CMakeLists.txt lines: https://github.com/aws/aws-lc/blob/main/crypto/fipsmodule/CMakeLists.txt#L170-L171
Should we update the CMakeLists.txt lines in this PR?
=> After a discussion, crypto/fipsmodule/CMakeLists.txt was updated to do so.
…Y_ASSEMBLER_IS_TOO_OLD_FOR_AVX is set
Thank you, aqjune-aws.
Description of changes:
This patch updates aws-lc's Montgomery multiplication to use s2n-bignum's verified bignum functions.
This is a follow-up to #1114, and is split out from #1108.
To selectively apply s2n-bignum to Graviton 2, this patch invokes CRYPTO_is_ARMv8_wide_multiplier_capable() and runs the s2n-bignum functions only when the CPU is not wide-multiplier capable.
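The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not the code in the patch: it assumes that the capability check amounts to testing the two flags quoted in the conversation, and the names `is_wide_multiplier_capable`, `mont_backend` and `select_mont_backend` are hypothetical stand-ins introduced only for this example.

```c
#include <stdint.h>

// Flag values as quoted in the conversation for this PR.
#define ARMV8_NEOVERSE_V1 (1 << 12)
#define ARMV8_APPLE_M1 (1 << 13)

// Hypothetical stand-in for CRYPTO_is_ARMv8_wide_multiplier_capable().
// ASSUMPTION: the real check tests these two micro-architecture flags;
// its actual body is not shown in this PR text.
static int is_wide_multiplier_capable(uint32_t caps) {
  return (caps & (ARMV8_NEOVERSE_V1 | ARMV8_APPLE_M1)) != 0;
}

// Hypothetical labels for the two Montgomery multiplication code paths.
typedef enum {
  MONT_BACKEND_EXISTING,   // the pre-existing implementation
  MONT_BACKEND_S2N_BIGNUM  // the s2n-bignum functions
} mont_backend;

// Pick s2n-bignum only when the CPU is NOT wide-multiplier capable
// (for example Graviton 2); otherwise keep the existing path.
static mont_backend select_mont_backend(uint32_t caps) {
  return is_wide_multiplier_capable(caps) ? MONT_BACKEND_EXISTING
                                          : MONT_BACKEND_S2N_BIGNUM;
}
```

With a capability word of 0 (neither flag set, as assumed here for a Graviton 2 class core) the sketch selects the s2n-bignum path; with either flag set it keeps the existing path.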
The performance numbers for RSA signing were measured on Graviton 2 using
tool/bssl speed -filter RSA
(unit: ops/sec). The change is only adopted for AArch64 CPUs that have narrow multiplication instruction bandwidth.
Testing:
Tested via bssl speed -filter RSA
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.