diff options
author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2023-02-18 00:14:44 +0200 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2023-02-22 20:27:55 +0200 |
commit | 45351e6474cbbe5baaa4c488222610edc417176e (patch) | |
tree | 224a4452c9aefe6e4af996af5d322328c15f4902 /cipher/rijndael-tables.h | |
parent | f4268a8f51a89a7c0374a23f669d7a19cad304ae (diff) | |
download | libgcrypt-45351e6474cbbe5baaa4c488222610edc417176e.tar.gz |
aria: add x86_64 GFNI/AVX512 accelerated implementation
* cipher/Makefile.am: Add 'aria-gfni-avx512-amd64.S'.
* cipher/aria-gfni-avx512-amd64.S: New.
* cipher/aria.c (USE_GFNI_AVX512): New.
[USE_GFNI_AVX512] (MAX_PARALLEL_BLKS): New.
(ARIA_context): Add 'use_gfni_avx512'.
(_gcry_aria_gfni_avx512_ecb_crypt_blk64)
(_gcry_aria_gfni_avx512_ctr_crypt_blk64)
(aria_gfni_avx512_ecb_crypt_blk64)
(aria_gfni_avx512_ctr_crypt_blk64): New.
(aria_crypt_blocks) [USE_GFNI_AVX512]: Add 64 parallel block
AVX512/GFNI processing.
(_gcry_aria_ctr_enc) [USE_GFNI_AVX512]: Add 64 parallel block
AVX512/GFNI processing.
(aria_setkey): Enable GFNI/AVX512 based on HW features.
* configure.ac: Add 'aria-gfni-avx512-amd64.lo'.
--
This patch adds AVX512/GFNI accelerated ARIA block cipher
implementation for libgcrypt. This implementation is based on
work by Taehee Yoo, with following notable changes:
- Integration to libgcrypt, use of 'aes-common-amd64.h'.
- Use round loop instead of unrolling for smaller code size and
increased performance.
- Use stack for temporary storage instead of external buffers.
- Add byte-addition fast path for CTR.
===
Benchmark on AMD Ryzen 9 7900X (zen4, turbo-freq off):
GFNI/AVX512:
ARIA128 | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
ECB enc | 0.203 ns/B 4703 MiB/s 0.953 c/B 4700
ECB dec | 0.204 ns/B 4675 MiB/s 0.959 c/B 4700
CTR enc | 0.207 ns/B 4609 MiB/s 0.973 c/B 4700
CTR dec | 0.207 ns/B 4608 MiB/s 0.973 c/B 4700
===
Benchmark on Intel Core i3-1115G4 (tiger-lake, turbo-freq off):
GFNI/AVX512:
ARIA128 | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
ECB enc | 0.362 ns/B 2635 MiB/s 1.08 c/B 2992
ECB dec | 0.361 ns/B 2639 MiB/s 1.08 c/B 2992
CTR enc | 0.362 ns/B 2633 MiB/s 1.08 c/B 2992
CTR dec | 0.362 ns/B 2633 MiB/s 1.08 c/B 2992
[v2]:
- Add byte-addition fast path for CTR.
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Diffstat (limited to 'cipher/rijndael-tables.h')
0 files changed, 0 insertions, 0 deletions