author | Tianjia Zhang <tianjia.zhang@linux.alibaba.com> | 2022-03-01 17:56:55 +0800 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2022-03-02 20:45:54 +0200 |
commit | 47cafffb09d8a224f07e0750f4ba882bb86cb15a (patch) | |
tree | 50c0a551cc41fb30a3721ebe6089e3752bf3d09e /configure.ac | |
parent | 7d2983979866223d96aad4806af0311671585f64 (diff) | |
download | libgcrypt-47cafffb09d8a224f07e0750f4ba882bb86cb15a.tar.gz |
Add SM4 ARMv8/AArch64/CE assembly implementation
* cipher/Makefile.am: Add 'sm4-armv8-aarch64-ce.S'.
* cipher/sm4-armv8-aarch64-ce.S: New.
* cipher/sm4.c (USE_ARM_CE): New.
(SM4_context) [USE_ARM_CE]: Add 'use_arm_ce'.
[USE_ARM_CE] (_gcry_sm4_armv8_ce_expand_key)
(_gcry_sm4_armv8_ce_crypt, _gcry_sm4_armv8_ce_ctr_enc)
(_gcry_sm4_armv8_ce_cbc_dec, _gcry_sm4_armv8_ce_cfb_dec)
(_gcry_sm4_armv8_ce_crypt_blk1_8, sm4_armv8_ce_crypt_blk1_8): New.
(sm4_expand_key) [USE_ARM_CE]: Use ARMv8/AArch64/CE key setup.
(sm4_setkey): Enable ARMv8/AArch64/CE if supported by HW.
(sm4_encrypt) [USE_ARM_CE]: Use SM4 CE encryption.
(sm4_decrypt) [USE_ARM_CE]: Use SM4 CE decryption.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth) [USE_ARM_CE]: Add
ARMv8/AArch64/CE bulk functions.
* configure.ac: Add 'sm4-armv8-aarch64-ce.lo'.
--
This patch adds ARMv8/AArch64/CE bulk encryption/decryption. Bulk
functions process eight blocks in parallel.
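The eight-block bulk pattern described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from cipher/sm4.c: `toy_ctx`, `toy_crypt_blocks`, and `toy_bulk_crypt` are hypothetical names, the "cipher" is a placeholder XOR rather than SM4, and only the `use_arm_ce` flag name is taken from the changelog.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE  16  /* SM4 block size in bytes */
#define BULK_BLOCKS 8   /* blocks per bulk call, as stated in the commit message */

/* Hypothetical context; only the 'use_arm_ce' name comes from the changelog. */
struct toy_ctx {
    unsigned int  use_arm_ce;  /* set at key setup if the HW supports the CE path */
    unsigned char key_byte;    /* placeholder standing in for the round keys */
};

/* Placeholder transform (NOT SM4): XOR every byte of 'nblocks' blocks. */
static void toy_crypt_blocks(const struct toy_ctx *ctx, unsigned char *out,
                             const unsigned char *in, size_t nblocks)
{
    for (size_t i = 0; i < nblocks * BLOCK_SIZE; i++)
        out[i] = in[i] ^ ctx->key_byte;
}

/* Handle full groups of eight blocks first, then the 1..7 leftover blocks
   in one final call. Returns how many calls were made into the block routine. */
static size_t toy_bulk_crypt(const struct toy_ctx *ctx, unsigned char *out,
                             const unsigned char *in, size_t nblocks)
{
    size_t calls = 0;

    if (!ctx->use_arm_ce) {
        /* Fallback path: one block per call. */
        for (; nblocks > 0; nblocks--, calls++) {
            toy_crypt_blocks(ctx, out, in, 1);
            in += BLOCK_SIZE;
            out += BLOCK_SIZE;
        }
        return calls;
    }

    while (nblocks >= BULK_BLOCKS) {
        toy_crypt_blocks(ctx, out, in, BULK_BLOCKS);
        in += BULK_BLOCKS * BLOCK_SIZE;
        out += BULK_BLOCKS * BLOCK_SIZE;
        nblocks -= BULK_BLOCKS;
        calls++;
    }
    if (nblocks > 0) {
        toy_crypt_blocks(ctx, out, in, nblocks);
        calls++;
    }
    return calls;
}
```

With the flag set, a 19-block buffer would be handled in three calls (8 + 8 + 3) instead of nineteen, which is the kind of batching that lets an assembly routine keep several blocks in flight at once.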
Benchmark on T-Head Yitian-710 2.75 GHz:
Before:
SM4      | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC enc  |    12.10 ns/B    78.79 MiB/s    33.28 c/B      2750
CBC dec  |     4.63 ns/B    205.9 MiB/s    12.74 c/B      2749
CFB enc  |    12.14 ns/B    78.58 MiB/s    33.37 c/B      2750
CFB dec  |     4.64 ns/B    205.5 MiB/s    12.76 c/B      2750
CTR enc  |     4.69 ns/B    203.3 MiB/s    12.90 c/B      2750
CTR dec  |     4.69 ns/B    203.3 MiB/s    12.90 c/B      2750
GCM enc  |     4.88 ns/B    195.4 MiB/s    13.42 c/B      2750
GCM dec  |     4.88 ns/B    195.5 MiB/s    13.42 c/B      2750
GCM auth |    0.189 ns/B     5048 MiB/s    0.520 c/B      2750
OCB enc  |     4.86 ns/B    196.0 MiB/s    13.38 c/B      2750
OCB dec  |     4.90 ns/B    194.7 MiB/s    13.47 c/B      2750
OCB auth |     4.79 ns/B    199.0 MiB/s    13.18 c/B      2750
After (10x - 19x faster than ARMv8/AArch64 impl):
SM4      | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC enc  |     1.25 ns/B    762.7 MiB/s     3.44 c/B      2749
CBC dec  |    0.243 ns/B     3927 MiB/s    0.668 c/B      2750
CFB enc  |     1.25 ns/B    763.1 MiB/s     3.44 c/B      2750
CFB dec  |    0.245 ns/B     3899 MiB/s    0.673 c/B      2750
CTR enc  |    0.298 ns/B     3199 MiB/s    0.820 c/B      2750
CTR dec  |    0.298 ns/B     3198 MiB/s    0.820 c/B      2750
GCM enc  |    0.487 ns/B     1957 MiB/s     1.34 c/B      2749
GCM dec  |    0.487 ns/B     1959 MiB/s     1.34 c/B      2750
GCM auth |    0.189 ns/B     5048 MiB/s    0.519 c/B      2750
OCB enc  |    0.443 ns/B     2150 MiB/s     1.22 c/B      2749
OCB dec  |    0.486 ns/B     1964 MiB/s     1.34 c/B      2750
OCB auth |    0.369 ns/B     2585 MiB/s     1.01 c/B      2749
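The "10x - 19x" figure can be checked by dividing each "before" ns/B value by its "after" value. The C sketch below does that arithmetic with the numbers copied from the two tables; the type and function names (`bench_row`, `speedup`, `speedup_range`) are made up for illustration, and the GCM auth row is left out on the assumption that it measures only the authentication step, whose timing is unchanged by this patch.

```c
#include <stddef.h>

/* One benchmark row: ns/B before and after the patch (values from the tables). */
struct bench_row {
    const char *mode;
    double before_ns_per_byte;
    double after_ns_per_byte;
};

static const struct bench_row sm4_rows[] = {
    { "CBC enc",  12.10, 1.25  },
    { "CBC dec",  4.63,  0.243 },
    { "CFB enc",  12.14, 1.25  },
    { "CFB dec",  4.64,  0.245 },
    { "CTR enc",  4.69,  0.298 },
    { "CTR dec",  4.69,  0.298 },
    { "GCM enc",  4.88,  0.487 },
    { "GCM dec",  4.88,  0.487 },
    { "OCB enc",  4.86,  0.443 },
    { "OCB dec",  4.90,  0.486 },
    { "OCB auth", 4.79,  0.369 },
};

#define SM4_ROW_COUNT (sizeof sm4_rows / sizeof sm4_rows[0])

/* Speedup factor for one row: time before divided by time after. */
static double speedup(const struct bench_row *row)
{
    return row->before_ns_per_byte / row->after_ns_per_byte;
}

/* Smallest and largest speedup across all rows. */
static void speedup_range(double *lo, double *hi)
{
    *lo = *hi = speedup(&sm4_rows[0]);
    for (size_t i = 1; i < SM4_ROW_COUNT; i++) {
        double s = speedup(&sm4_rows[i]);
        if (s < *lo) *lo = s;
        if (s > *hi) *hi = s;
    }
}
```

On these numbers the range runs from roughly 9.7x (CBC enc, 12.10 / 1.25) to roughly 19.1x (CBC dec, 4.63 / 0.243), consistent with the rounded figure in the heading above.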
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Diffstat (limited to 'configure.ac')
-rw-r--r-- | configure.ac | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/configure.ac b/configure.ac
index f5363f22..e20f9d13 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2755,6 +2755,7 @@ if test "$found" = "1" ; then
       aarch64-*-*)
          # Build with the assembly implementation
          GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS sm4-aarch64.lo"
+         GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS sm4-armv8-aarch64-ce.lo"
       esac
    fi
 