Add VAES + AVX2 optimized AES-GCM
Add a VAES-based AES-GCM implementation that is optimized for AMD
Zen 3 processors, using AVX2 instead of AVX512 / AVX10.  With AVX2, only
16 vector registers are available and some instructions are missing,
which is inconvenient and makes the code hard to share with the
AVX512 / AVX10 version.  However, using VAES still gives a significant
performance improvement: about 80-85% on long messages, as shown by the
following tables, which give AES-256-GCM throughput in MB/s before and
after this change on a Zen 3 "Milan" processor for various message
lengths in bytes.

Encryption:

            | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    --------+-------+-------+-------+-------+-------+-------+
    Before  |  3955 |  3749 |  3597 |  3054 |  2411 |  2038 |
    After   |  7128 |  6631 |  5975 |  4788 |  3807 |  2676 |

            |   300 |   200 |    64 |    63 |    16 |
    --------+-------+-------+-------+-------+-------+
    Before  |  1757 |  1405 |   856 |   602 |   356 |
    After   |  1885 |  1430 |   940 |   593 |   381 |

Decryption:

            | 16384 |  4096 |  4095 |  1420 |   512 |   500 |
    --------+-------+-------+-------+-------+-------+-------+
    Before  |  3962 |  3774 |  3593 |  2978 |  2510 |  1998 |
    After   |  7378 |  6836 |  6282 |  4826 |  3868 |  2753 |

            |   300 |   200 |    64 |    63 |    16 |
    --------+-------+-------+-------+-------+-------+
    Before  |  1742 |  1428 |   856 |   535 |   383 |
    After   |  1940 |  1534 |   940 |   573 |   383 |
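
As a rough check of the "about 80-85% on long messages" figure, the relative gain can be recomputed from a few rows of the tables above. This is only an illustrative sketch: the helper name `percent_gain` and the choice of rows are assumptions made here, not part of the commit.

```python
# Throughput pairs (before, after) in MB/s, copied from the tables above
# for AES-256-GCM on a Zen 3 "Milan" processor, keyed by message length.
ENCRYPT = {16384: (3955, 7128), 4096: (3749, 6631), 16: (356, 381)}
DECRYPT = {16384: (3962, 7378), 4096: (3774, 6836), 16: (383, 383)}


def percent_gain(before, after):
    """Relative throughput change in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


for name, table in (("encrypt", ENCRYPT), ("decrypt", DECRYPT)):
    for length, (before, after) in table.items():
        gain = percent_gain(before, after)
        print(f"{name} {length:>5} bytes: {gain:+.1f}%")
```

The gain shrinks toward zero for very short messages, which matches the tables: at 16 bytes the before and after numbers are nearly identical.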

Change-Id: I583dd6b48b81ab3c6df51bfe8729366cad500537
Reviewed-on: https://boringssl-review.googlesource.com/c/boringssl/+/74368
Reviewed-by: David Benjamin <[email protected]>
Commit-Queue: David Benjamin <[email protected]>
ebiggers authored and Boringssl LUCI CQ committed Jan 6, 2025
1 parent e869bfb commit 3b6e1be
Showing 15 changed files with 5,324 additions and 14 deletions.
1 change: 1 addition & 0 deletions build.json
@@ -141,6 +141,7 @@
 "perlasm_x86_64": [
 {"src": "crypto/fipsmodule/modes/asm/aesni-gcm-x86_64.pl"},
 {"src": "crypto/fipsmodule/modes/asm/aes-gcm-avx10-x86_64.pl"},
+{"src": "crypto/fipsmodule/modes/asm/aes-gcm-avx2-x86_64.pl"},
 {"src": "crypto/fipsmodule/aes/asm/aesni-x86_64.pl"},
 {"src": "crypto/fipsmodule/modes/asm/ghash-ssse3-x86_64.pl"},
 {"src": "crypto/fipsmodule/modes/asm/ghash-x86_64.pl"},
2 changes: 1 addition & 1 deletion crypto/crypto.cc
@@ -54,7 +54,7 @@ static_assert(sizeof(ossl_ssize_t) == sizeof(size_t),
 // archive, linking on OS X will fail to resolve common symbols. By
 // initialising it to zero, it becomes a "data symbol", which isn't so
 // affected.
-HIDDEN uint8_t BORINGSSL_function_hit[8] = {0};
+HIDDEN uint8_t BORINGSSL_function_hit[9] = {0};
 #endif

 #if defined(OPENSSL_X86) || defined(OPENSSL_X86_64)