author: Jussi Kivilinna <jussi.kivilinna@iki.fi> 2022-03-05 16:09:33 +0200
committer: Jussi Kivilinna <jussi.kivilinna@iki.fi> 2022-03-06 18:46:49 +0200
commit: d857e85cb4d4cb9702a59364ce9a4b9d81328cb5
tree: 4fcc48013497a931b0f9df54fb973f87fb3a1896 /cipher/mac.c
parent: 47cafffb09d8a224f07e0750f4ba882bb86cb15a
ghash|polyval: add x86_64 VPCLMUL/AVX2 accelerated implementation

* cipher/cipher-gcm-intel-pclmul.c (GCM_INTEL_USE_VPCLMUL_AVX2)
(GCM_INTEL_AGGR8_TABLE_INITIALIZED)
(GCM_INTEL_AGGR16_TABLE_INITIALIZED): New.
(gfmul_pclmul): Fixes to comments.
[GCM_USE_INTEL_VPCLMUL_AVX2] (GFMUL_AGGR16_ASM_VPCMUL_AVX2)
(gfmul_vpclmul_avx2_aggr16, gfmul_vpclmul_avx2_aggr16_le)
(gfmul_pclmul_avx2, gcm_lsh_avx2, load_h1h2_to_ymm1)
(ghash_setup_aggr8_avx2, ghash_setup_aggr16_avx2): New.
(_gcry_ghash_setup_intel_pclmul): Add 'hw_features' parameter; Setup
ghash and polyval function pointers for context; Add VPCLMUL/AVX2 code
path; Defer aggr8 and aggr16 table initialization until first use in
'_gcry_ghash_intel_pclmul' or '_gcry_polyval_intel_pclmul'.
[__x86_64__] (ghash_setup_aggr8): New.
(_gcry_ghash_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
(_gcry_polyval_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
* cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL] (_gcry_ghash_intel_pclmul)
(_gcry_polyval_intel_pclmul): Remove.
[GCM_USE_INTEL_PCLMUL] (_gcry_ghash_setup_intel_pclmul): Add
'hw_features' parameter.
(setupM) [GCM_USE_INTEL_PCLMUL]: Pass HW features to
'_gcry_ghash_setup_intel_pclmul'; Let '_gcry_ghash_setup_intel_pclmul'
setup function pointers.
* cipher/cipher-internal.h (GCM_USE_INTEL_VPCLMUL_AVX2): New.
(gcry_cipher_handle): Add member 'gcm.hw_impl_flags'.
--

Patch adds VPCLMUL/AVX2 accelerated implementation for GHASH (GCM) and
POLYVAL (GCM-SIV).

Benchmark on AMD Ryzen 5800X (zen3):

Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM auth |     0.088 ns/B     10825 MiB/s     0.427 c/B      4850
 GCM-SIV auth |     0.083 ns/B     11472 MiB/s     0.403 c/B      4850

After: (~1.93x faster)
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM auth |     0.045 ns/B     21098 MiB/s     0.219 c/B      4850
 GCM-SIV auth |     0.043 ns/B     22181 MiB/s     0.209 c/B      4850

AES128-GCM / AES128-GCM-SIV encryption:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
      GCM enc |     0.079 ns/B     12073 MiB/s     0.383 c/B      4850
  GCM-SIV enc |     0.076 ns/B     12500 MiB/s     0.370 c/B      4850

Benchmark on Intel Core i3-1115G4 (tigerlake):

Before:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM auth |     0.080 ns/B     11919 MiB/s     0.327 c/B      4090
 GCM-SIV auth |     0.075 ns/B     12643 MiB/s     0.309 c/B      4090

After: (~1.28x faster)
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM auth |     0.062 ns/B     15348 MiB/s     0.254 c/B      4090
 GCM-SIV auth |     0.058 ns/B     16381 MiB/s     0.238 c/B      4090

AES128-GCM / AES128-GCM-SIV encryption:
              |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
      GCM enc |     0.101 ns/B      9441 MiB/s     0.413 c/B      4090
  GCM-SIV enc |     0.098 ns/B      9692 MiB/s     0.402 c/B      4089

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
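
Illustration (not part of the patch): the ChangeLog says the setup routine now takes a 'hw_features' argument, installs the function pointers itself, and defers the aggregation table initialization until first use. The following is a minimal C sketch of that general pattern only. Every name in it (`sketch_ctx`, `sketch_setup`, the `SKETCH_*` flags) is hypothetical, the "table" is a placeholder rather than real GHASH key powers, and the eight-block threshold for triggering initialization is an assumption, not taken from libgcrypt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; libgcrypt's real HWF_* values are different. */
#define SKETCH_HWF_PCLMUL   (1u << 0)
#define SKETCH_HWF_VPCLMUL  (1u << 1)
#define SKETCH_HWF_AVX2     (1u << 2)

/* Hypothetical per-context flags, loosely modelled on 'gcm.hw_impl_flags'. */
#define SKETCH_USE_VPCLMUL_AVX2   (1u << 0)
#define SKETCH_AGGR8_TABLE_READY  (1u << 1)

struct sketch_ctx;
typedef size_t (*sketch_hash_fn) (struct sketch_ctx *ctx,
                                  const uint8_t *buf, size_t nblocks);

struct sketch_ctx
{
  sketch_hash_fn hash_fn;      /* installed by the setup routine */
  unsigned int hw_impl_flags;  /* code path choice and table state */
  unsigned int table_inits;    /* counts table builds (for the demo only) */
  uint8_t table[8];            /* placeholder, not real key powers */
};

/* Build the placeholder table and mark it as ready. */
static void
sketch_setup_aggr8 (struct sketch_ctx *ctx)
{
  size_t i;

  for (i = 0; i < sizeof (ctx->table); i++)
    ctx->table[i] = (uint8_t)(i + 1);
  ctx->table_inits++;
  ctx->hw_impl_flags |= SKETCH_AGGR8_TABLE_READY;
}

/* Placeholder hash routine: builds the table lazily, computes nothing. */
static size_t
sketch_hash (struct sketch_ctx *ctx, const uint8_t *buf, size_t nblocks)
{
  (void)buf;

  /* Assumed threshold: only bulk input needs the aggregation table. */
  if (nblocks >= 8 && !(ctx->hw_impl_flags & SKETCH_AGGR8_TABLE_READY))
    sketch_setup_aggr8 (ctx);

  return 0;
}

/* Pick the code path from the feature bits and install the pointer.
   The table is deliberately not built here. */
static void
sketch_setup (struct sketch_ctx *ctx, unsigned int hw_features)
{
  assert (ctx != NULL);

  ctx->hw_impl_flags = 0;
  ctx->table_inits = 0;
  ctx->hash_fn = sketch_hash;

  if ((hw_features & SKETCH_HWF_VPCLMUL) && (hw_features & SKETCH_HWF_AVX2))
    ctx->hw_impl_flags |= SKETCH_USE_VPCLMUL_AVX2;
}
```

In this sketch a context that only ever hashes short inputs never pays for the table build, and the caller needs no knowledge of which code path was selected because it only calls through `hash_fn`.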
Diffstat (limited to 'cipher/mac.c')
0 files changed, 0 insertions, 0 deletions