author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2022-03-05 16:09:33 +0200 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2022-03-06 18:46:49 +0200 |
commit | d857e85cb4d4cb9702a59364ce9a4b9d81328cb5 (patch) | |
tree | 4fcc48013497a931b0f9df54fb973f87fb3a1896 /cipher/mac.c | |
parent | 47cafffb09d8a224f07e0750f4ba882bb86cb15a (diff) | |
download | libgcrypt-d857e85cb4d4cb9702a59364ce9a4b9d81328cb5.tar.gz |
ghash|polyval: add x86_64 VPCLMUL/AVX2 accelerated implementation
* cipher/cipher-gcm-intel-pclmul.c (GCM_INTEL_USE_VPCLMUL_AVX2)
(GCM_INTEL_AGGR8_TABLE_INITIALIZED)
(GCM_INTEL_AGGR16_TABLE_INITIALIZED): New.
(gfmul_pclmul): Fixes to comments.
[GCM_USE_INTEL_VPCLMUL_AVX2] (GFMUL_AGGR16_ASM_VPCMUL_AVX2)
(gfmul_vpclmul_avx2_aggr16, gfmul_vpclmul_avx2_aggr16_le)
(gfmul_pclmul_avx2, gcm_lsh_avx2, load_h1h2_to_ymm1)
(ghash_setup_aggr8_avx2, ghash_setup_aggr16_avx2): New.
(_gcry_ghash_setup_intel_pclmul): Add 'hw_features' parameter; Set up
ghash and polyval function pointers for context; Add VPCLMUL/AVX2 code
path; Defer aggr8 and aggr16 table initialization until first use in
'_gcry_ghash_intel_pclmul' or '_gcry_polyval_intel_pclmul'.
[__x86_64__] (ghash_setup_aggr8): New.
(_gcry_ghash_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
(_gcry_polyval_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
* cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL] (_gcry_ghash_intel_pclmul)
(_gcry_polyval_intel_pclmul): Remove.
[GCM_USE_INTEL_PCLMUL] (_gcry_ghash_setup_intel_pclmul): Add
'hw_features' parameter.
(setupM) [GCM_USE_INTEL_PCLMUL]: Pass HW features to
'_gcry_ghash_setup_intel_pclmul'; Let '_gcry_ghash_setup_intel_pclmul'
set up function pointers.
* cipher/cipher-internal.h (GCM_USE_INTEL_VPCLMUL_AVX2): New.
(gcry_cipher_handle): Add member 'gcm.hw_impl_flags'.
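The two structural changes listed above (setup choosing the function pointers from the HW features, and aggr8/aggr16 tables being built on first use rather than at setup) can be pictured roughly as follows. This is only a sketch under stated assumptions: the flag names mirror the ChangeLog, but their values, the `struct gcm_ctx` layout, the `HW_VPCLMUL_AVX2` bit, the block-count thresholds and all function names are hypothetical stand-ins, not libgcrypt's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Flag names follow the ChangeLog; the bit values are assumptions. */
#define GCM_INTEL_USE_VPCLMUL_AVX2          (1u << 0)
#define GCM_INTEL_AGGR8_TABLE_INITIALIZED   (1u << 1)
#define GCM_INTEL_AGGR16_TABLE_INITIALIZED  (1u << 2)

/* Hypothetical HW feature bit standing in for real CPU detection. */
#define HW_VPCLMUL_AVX2  (1u << 0)

struct gcm_ctx;
typedef size_t (*ghash_fn_t)(struct gcm_ctx *c, const uint8_t *buf,
                             size_t nblocks);

/* Hypothetical context; 'hw_impl_flags' plays the role of the new
   'gcm.hw_impl_flags' member. */
struct gcm_ctx {
  unsigned int hw_impl_flags;
  ghash_fn_t ghash_fn;
  ghash_fn_t polyval_fn;
  int aggr8_setups;   /* counters exist only to make the sketch observable */
  int aggr16_setups;
};

static void setup_aggr8(struct gcm_ctx *c)
{
  c->aggr8_setups++;  /* real code would compute powers of H here */
  c->hw_impl_flags |= GCM_INTEL_AGGR8_TABLE_INITIALIZED;
}

static void setup_aggr16(struct gcm_ctx *c)
{
  c->aggr16_setups++;
  c->hw_impl_flags |= GCM_INTEL_AGGR16_TABLE_INITIALIZED;
}

/* Stand-in for the bulk hash function: tables are built lazily, and only
   when the input is long enough to use them (thresholds are assumptions). */
static size_t hash_bulk(struct gcm_ctx *c, const uint8_t *buf, size_t nblocks)
{
  assert(c != NULL);
  (void)buf;
  if (nblocks >= 16
      && (c->hw_impl_flags & GCM_INTEL_USE_VPCLMUL_AVX2)
      && !(c->hw_impl_flags & GCM_INTEL_AGGR16_TABLE_INITIALIZED))
    setup_aggr16(c);
  if (nblocks >= 8
      && !(c->hw_impl_flags & GCM_INTEL_AGGR8_TABLE_INITIALIZED))
    setup_aggr8(c);
  return 0;  /* block processing omitted in this sketch */
}

/* Setup records the code path and installs the function pointers; it does
   not build any aggregation table. */
static void hash_setup(struct gcm_ctx *c, unsigned int hw_features)
{
  assert(c != NULL);
  c->hw_impl_flags = 0;
  c->aggr8_setups = 0;
  c->aggr16_setups = 0;
  c->ghash_fn = hash_bulk;
  c->polyval_fn = hash_bulk;  /* same stand-in for both in this sketch */
  if (hw_features & HW_VPCLMUL_AVX2)
    c->hw_impl_flags |= GCM_INTEL_USE_VPCLMUL_AVX2;
}
```

In this sketch a context that only ever hashes short inputs never pays for table setup, which is the point of deferring the initialization.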
--
This patch adds a VPCLMUL/AVX2-accelerated implementation of GHASH (GCM)
and POLYVAL (GCM-SIV).
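For orientation, the operation being accelerated is a multiplication in GF(2^128). A portable, bit-at-a-time reference for the GHASH variant, following the right-shift algorithm in NIST SP 800-38D, might look like the sketch below. It is not code from this patch; the names `gf128_mul` and `ghash_block` are hypothetical, and the routine is far slower than a carry-less-multiply implementation. POLYVAL uses a different byte and bit convention and is not covered here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* out = x * y in GF(2^128), GCM bit order: bit 0 is the most significant
   bit of byte 0.  'out' may alias 'x' or 'y'. */
static void gf128_mul(uint8_t out[16], const uint8_t x[16],
                      const uint8_t y[16])
{
  uint8_t z[16] = {0};
  uint8_t v[16];
  int i, j;

  memcpy(v, y, 16);
  for (i = 0; i < 128; i++) {
    /* If bit i of x is set, add (XOR) the current multiple of y. */
    if ((x[i / 8] >> (7 - (i % 8))) & 1)
      for (j = 0; j < 16; j++)
        z[j] ^= v[j];
    /* Multiply v by t: shift right one bit; if a bit falls off the end,
       fold it back in with R = 0xe1 || 0^120. */
    int carry = v[15] & 1;
    for (j = 15; j > 0; j--)
      v[j] = (uint8_t)((v[j] >> 1) | (v[j - 1] << 7));
    v[0] >>= 1;
    if (carry)
      v[0] ^= 0xe1;
  }
  memcpy(out, z, 16);
}

/* One GHASH step: acc = (acc XOR block) * h. */
static void ghash_block(uint8_t acc[16], const uint8_t block[16],
                        const uint8_t h[16])
{
  int j;
  assert(acc != NULL && block != NULL && h != NULL);
  for (j = 0; j < 16; j++)
    acc[j] ^= block[j];
  gf128_mul(acc, acc, h);
}
```

Aggregated implementations process 8 or 16 blocks per reduction by multiplying each block with a different precomputed power of H, which is why the patch needs the aggr8 and aggr16 tables.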
Benchmark on AMD Ryzen 5800X (zen3):

Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM auth     |     0.088 ns/B     10825 MiB/s     0.427 c/B      4850
 GCM-SIV auth |     0.083 ns/B     11472 MiB/s     0.403 c/B      4850

After: (~1.93x faster)
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM auth     |     0.045 ns/B     21098 MiB/s     0.219 c/B      4850
 GCM-SIV auth |     0.043 ns/B     22181 MiB/s     0.209 c/B      4850

AES128-GCM / AES128-GCM-SIV encryption:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM enc      |     0.079 ns/B     12073 MiB/s     0.383 c/B      4850
 GCM-SIV enc  |     0.076 ns/B     12500 MiB/s     0.370 c/B      4850

Benchmark on Intel Core i3-1115G4 (tigerlake):

Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM auth     |     0.080 ns/B     11919 MiB/s     0.327 c/B      4090
 GCM-SIV auth |     0.075 ns/B     12643 MiB/s     0.309 c/B      4090

After: (~1.28x faster)
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM auth     |     0.062 ns/B     15348 MiB/s     0.254 c/B      4090
 GCM-SIV auth |     0.058 ns/B     16381 MiB/s     0.238 c/B      4090

AES128-GCM / AES128-GCM-SIV encryption:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 GCM enc      |     0.101 ns/B      9441 MiB/s     0.413 c/B      4090
 GCM-SIV enc  |     0.098 ns/B      9692 MiB/s     0.402 c/B      4089
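
The columns in the tables above are related by simple arithmetic, which the following helpers spell out. This is an illustrative sketch only; the function names are hypothetical and are not part of the benchmark tool.

```c
#include <assert.h>

/* Speedup factor from two nanosecs/byte figures (lower is faster). */
static double speedup(double before_ns_per_byte, double after_ns_per_byte)
{
  assert(after_ns_per_byte > 0.0);
  return before_ns_per_byte / after_ns_per_byte;
}

/* cycles/byte = (ns/byte) * (cycles/ns), where 1 MHz = 0.001 cycles/ns. */
static double cycles_per_byte(double ns_per_byte, double mhz)
{
  return ns_per_byte * mhz / 1000.0;
}
```

For example, `speedup(0.083, 0.043)` gives roughly 1.93 for GCM-SIV auth on zen3 and `speedup(0.080, 0.062)` roughly 1.29 for GCM auth on tigerlake, in line with the quoted factors. Because the ns/B inputs are rounded to three decimals, ratios derived from them can differ slightly from ratios derived from the MiB/s column.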
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>