author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2020-06-12 22:36:32 +0300 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2020-06-20 14:48:10 +0300 |
commit | 35a78eb248d6bacd2a58477a122a0020d796ce63 | |
tree | 6d397d5f4e9d4385c7d9f5385ca3dc5b25ffb864 /cipher/Makefile.am | |
parent | c9a3f1bb91e63033e3bf3e06bdd6075622626d0d | |
Add SM4 x86-64/AES-NI/AVX2 implementation
* cipher/Makefile.am: Add 'sm4-aesni-avx2-amd64.S'.
* cipher/sm4-aesni-avx2-amd64.S: New.
* cipher/sm4.c (USE_AESNI_AVX2): New.
(SM4_context) [USE_AESNI_AVX2]: Add 'use_aesni_avx2'.
[USE_AESNI_AVX2] (_gcry_sm4_aesni_avx2_ctr_enc)
(_gcry_sm4_aesni_avx2_cbc_dec, _gcry_sm4_aesni_avx2_cfb_dec)
(_gcry_sm4_aesni_avx2_ocb_enc, _gcry_sm4_aesni_avx2_ocb_dec)
(_gcry_sm4_aesni_avx2_ocb_auth): New.
(sm4_setkey): Enable AES-NI/AVX2 if supported by HW.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth) [USE_AESNI_AVX2]: Add
AES-NI/AVX2 bulk functions.
* configure.ac: Add 'sm4-aesni-avx2-amd64.lo'.
--
This patch adds x86-64/AES-NI/AVX2 bulk encryption/decryption. Bulk
functions process 16 blocks in parallel.
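
As a rough illustration of what "16 blocks in parallel" means for the calling code, the sketch below hands full groups of 16 blocks to a bulk routine and processes the remainder one block at a time. It is not the libgcrypt code: 'toy_ctx', 'toy_xor' and 'toy_crypt_blocks' are hypothetical names, and the XOR transform is only a stand-in for the cipher so the control flow can be exercised.

```c
#include <stddef.h>

#define BLOCK_SIZE  16  /* SM4 block size in bytes */
#define BULK_BLOCKS 16  /* blocks consumed per bulk call */

/* Hypothetical context; 'use_bulk' plays a role similar in spirit to
   a "use accelerated path" flag set once at key setup. */
struct toy_ctx
{
  int use_bulk;
  size_t bulk_calls;    /* how many 16-block groups were processed */
  size_t single_calls;  /* how many blocks took the fallback path */
};

/* Placeholder transform (XOR with a constant). This is NOT SM4. */
static void
toy_xor (unsigned char *out, const unsigned char *in, size_t len)
{
  size_t i;

  for (i = 0; i < len; i++)
    out[i] = in[i] ^ 0xAA;
}

/* Process NBLOCKS blocks: full groups of 16 go through the bulk path
   when it is enabled, whatever is left goes block by block. */
void
toy_crypt_blocks (struct toy_ctx *ctx, unsigned char *out,
                  const unsigned char *in, size_t nblocks)
{
  if (ctx->use_bulk)
    {
      while (nblocks >= BULK_BLOCKS)
        {
          toy_xor (out, in, BULK_BLOCKS * BLOCK_SIZE);
          ctx->bulk_calls++;
          out += BULK_BLOCKS * BLOCK_SIZE;
          in += BULK_BLOCKS * BLOCK_SIZE;
          nblocks -= BULK_BLOCKS;
        }
    }

  while (nblocks > 0)
    {
      toy_xor (out, in, BLOCK_SIZE);
      ctx->single_calls++;
      out += BLOCK_SIZE;
      in += BLOCK_SIZE;
      nblocks--;
    }
}
```

With 35 input blocks and the bulk path enabled, this takes two bulk calls (32 blocks) and three single-block calls. A 16-block path only helps once a request is at least 16 blocks long.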
Benchmark on AMD Ryzen 7 3700X:
Before:
SM4      | nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
CBC enc  |     8.98 ns/B     106.2 MiB/s     38.62 c/B      4300
CBC dec  |     1.55 ns/B     613.7 MiB/s      6.64 c/B      4275
CFB enc  |     8.96 ns/B     106.4 MiB/s     38.52 c/B      4300
CFB dec  |     1.54 ns/B     617.4 MiB/s      6.60 c/B      4275
CTR enc  |     1.57 ns/B     607.8 MiB/s      6.75 c/B      4300
CTR dec  |     1.57 ns/B     608.9 MiB/s      6.74 c/B      4300
OCB enc  |     1.58 ns/B     603.8 MiB/s      6.75 c/B      4275
OCB dec  |     1.57 ns/B     605.7 MiB/s      6.73 c/B      4275
OCB auth |     1.53 ns/B     624.5 MiB/s      6.57 c/B      4300
After (~56% faster than AES-NI/AVX impl.):
SM4      | nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
CBC enc  |     8.93 ns/B     106.8 MiB/s     38.61 c/B      4326
CBC dec  |    0.984 ns/B     969.5 MiB/s      4.23 c/B      4300
CFB enc  |     8.93 ns/B     106.8 MiB/s     38.62 c/B      4325
CFB dec  |    0.983 ns/B     970.3 MiB/s      4.23 c/B      4300
CTR enc  |    0.998 ns/B     955.1 MiB/s      4.29 c/B      4300
CTR dec  |    0.996 ns/B     957.4 MiB/s      4.28 c/B      4300
OCB enc  |     1.00 ns/B     951.8 MiB/s      4.31 c/B      4300
OCB dec  |     1.00 ns/B     951.8 MiB/s      4.31 c/B      4300
OCB auth |    0.993 ns/B     960.2 MiB/s      4.28 c/B    4304±2
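
The "~56% faster" figure can be cross-checked against the two tables. The sketch below assumes the "Before" rows reflect the existing AES-NI/AVX implementation and that the comparison is made on cycles/byte; how the figure was actually derived is not stated. CBC enc and CFB enc are left out because their numbers are essentially unchanged between the two tables. The function names are hypothetical.

```c
#include <stddef.h>

/* Cycles/byte copied from the tables above, in this order:
   CBC dec, CFB dec, CTR enc, CTR dec, OCB enc, OCB dec, OCB auth. */
static const double before_cpb[] =
  { 6.64, 6.60, 6.75, 6.74, 6.75, 6.73, 6.57 };
static const double after_cpb[] =
  { 4.23, 4.23, 4.29, 4.28, 4.31, 4.31, 4.28 };

/* Throughput gain in percent when cost drops from BEFORE to AFTER
   cycles per byte. */
double
speedup_percent (double before, double after)
{
  return (before / after - 1.0) * 100.0;
}

/* Arithmetic mean of the per-mode gains listed above. */
double
mean_speedup_percent (void)
{
  size_t n = sizeof before_cpb / sizeof before_cpb[0];
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += speedup_percent (before_cpb[i], after_cpb[i]);

  return sum / (double) n;
}
```

Under these assumptions the per-mode gains fall in the mid-50s percent range, in line with the quoted figure.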
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
Diffstat (limited to 'cipher/Makefile.am')
-rw-r--r-- | cipher/Makefile.am | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/cipher/Makefile.am b/cipher/Makefile.am
index 427922c6..4798d456 100644
--- a/cipher/Makefile.am
+++ b/cipher/Makefile.am
@@ -107,7 +107,7 @@ EXTRA_libcipher_la_SOURCES = \
 	scrypt.c \
 	seed.c \
 	serpent.c serpent-sse2-amd64.S \
-	sm4.c sm4-aesni-avx-amd64.S \
+	sm4.c sm4-aesni-avx-amd64.S sm4-aesni-avx2-amd64.S \
 	serpent-avx2-amd64.S serpent-armv7-neon.S \
 	sha1.c sha1-ssse3-amd64.S sha1-avx-amd64.S sha1-avx-bmi2-amd64.S \
 	sha1-avx2-bmi2-amd64.S sha1-armv7-neon.S sha1-armv8-aarch32-ce.S \