author     Jussi Kivilinna <jussi.kivilinna@iki.fi>  2022-05-01 16:01:41 +0300
committer  Jussi Kivilinna <jussi.kivilinna@iki.fi>  2022-05-11 20:14:33 +0300
commit     9ab61ba24b72bc109b7578a7868716910d2ea9d1
tree       6f0d22f7e6fe010535e78bbec83e2fc428c1f1a6  /configure.ac
parent     a611e3a25d61505698e2bb38ec2db38bc6a74820
camellia: add amd64 GFNI/AVX512 implementation

* cipher/Makefile.am: Add 'camellia-gfni-avx512-amd64.S'.
* cipher/bulkhelp.h (bulk_ocb_prepare_L_pointers_array_blk64): New.
* cipher/camellia-aesni-avx2-amd64.h: Rename internal functions from
"__camellia_???" to "FUNC_NAME(???)"; Minor changes to comments.
* cipher/camellia-gfni-avx512-amd64.S: New.
* cipher/camellia-glue.c (USE_GFNI_AVX512): New.
(CAMELLIA_context): Add 'use_gfni_avx512'.
(_gcry_camellia_gfni_avx512_ctr_enc, _gcry_camellia_gfni_avx512_cbc_dec)
(_gcry_camellia_gfni_avx512_cfb_dec, _gcry_camellia_gfni_avx512_ocb_enc)
(_gcry_camellia_gfni_avx512_ocb_dec)
(_gcry_camellia_gfni_avx512_enc_blk64)
(_gcry_camellia_gfni_avx512_dec_blk64, avx512_burn_stack_depth): New.
(camellia_setkey): Use GFNI/AVX512 if supported by CPU.
(camellia_encrypt_blk1_64, camellia_decrypt_blk1_64): New.
(_gcry_camellia_ctr_enc, _gcry_camellia_cbc_dec, _gcry_camellia_cfb_dec)
(_gcry_camellia_ocb_crypt) [USE_GFNI_AVX512]: Add GFNI/AVX512 code path.
(_gcry_camellia_xts_crypt): Change parallel block size from 32 to 64.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Increase test
block size.
* cipher/chacha20-amd64-avx512.S: Clear k-mask registers with xor.
* cipher/poly1305-amd64-avx512.S: Likewise.
* cipher/sha512-avx512-amd64.S: Likewise.
---

Benchmark on Intel i3-1115G4 (tigerlake):

Before (GFNI/AVX2):
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC dec |     0.356 ns/B      2679 MiB/s      1.46 c/B      4089
        CFB dec |     0.374 ns/B      2547 MiB/s      1.53 c/B      4089
        CTR enc |     0.409 ns/B      2332 MiB/s      1.67 c/B      4089
        CTR dec |     0.406 ns/B      2347 MiB/s      1.66 c/B      4089
        XTS enc |     0.430 ns/B      2216 MiB/s      1.76 c/B      4090
        XTS dec |     0.433 ns/B      2201 MiB/s      1.77 c/B      4090
        OCB enc |     0.460 ns/B      2071 MiB/s      1.88 c/B      4089
        OCB dec |     0.492 ns/B      1939 MiB/s      2.01 c/B      4089

After (GFNI/AVX512):
 CAMELLIA128    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
        CBC dec |     0.207 ns/B      4600 MiB/s     0.827 c/B      3989
        CFB dec |     0.207 ns/B      4610 MiB/s     0.825 c/B      3989
        CTR enc |     0.218 ns/B      4382 MiB/s     0.868 c/B      3990
        CTR dec |     0.217 ns/B      4389 MiB/s     0.867 c/B      3990
        XTS enc |     0.330 ns/B      2886 MiB/s      1.35 c/B    4097±4
        XTS dec |     0.328 ns/B      2904 MiB/s      1.35 c/B    4097±3
        OCB enc |     0.246 ns/B      3879 MiB/s     0.981 c/B      3990
        OCB dec |     0.247 ns/B      3855 MiB/s     0.987 c/B      3990

CBC dec: 70% faster
CFB dec: 80% faster
CTR: 87% faster
XTS: 31% faster
OCB: 92% faster

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

Diffstat (limited to 'configure.ac')
-rw-r--r--  configure.ac | 3
1 files changed, 3 insertions, 0 deletions
diff --git a/configure.ac b/configure.ac
index e63a7d6d..a7482cf3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2758,6 +2758,9 @@ if test "$found" = "1" ; then
         # Build with the GFNI/AVX2 implementation
         GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS camellia-gfni-avx2-amd64.lo"
+
+        # Build with the GFNI/AVX512 implementation
+        GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS camellia-gfni-avx512-amd64.lo"
       fi
    fi
 fi
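
The hunk above appends one more object file to GCRYPT_ASM_CIPHERS inside the same nested tests that already guard the GFNI/AVX2 object. It adds no separate configure-time test for AVX512 in this hunk, which fits the ChangeLog note that camellia_setkey selects GFNI/AVX512 at run time if the CPU supports it. The following is a minimal stand-alone sketch of that accumulation pattern in plain shell. Only "$found" and the two object names come from the diff. The function name and the avx2support and gfnisupport flags are hypothetical stand-ins for the enclosing conditions, which lie outside the hunk.

```shell
#!/bin/sh
# Sketch only. collect_camellia_objs, avx2support and gfnisupport are
# hypothetical; the real enclosing conditions are not visible in the hunk.
collect_camellia_objs () {
  found="$1"
  avx2support="$2"
  gfnisupport="$3"
  GCRYPT_ASM_CIPHERS=""

  if test "$found" = "1" ; then
    if test "$avx2support" = "yes" ; then
      if test "$gfnisupport" = "yes" ; then
        # Build with the GFNI/AVX2 implementation
        GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS camellia-gfni-avx2-amd64.lo"

        # Build with the GFNI/AVX512 implementation; appended under the
        # same guard, so both objects are built together or not at all
        GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS camellia-gfni-avx512-amd64.lo"
      fi
    fi
  fi

  # Print the accumulated, space-separated object list
  echo "$GCRYPT_ASM_CIPHERS"
}

collect_camellia_objs 1 yes yes
```

Calling the function with any guard unsatisfied, for example `collect_camellia_objs 1 yes no`, leaves the list empty. In the sketch, neither object is built unless all of the guards pass.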