| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
| |
--
|
| |
|
|
|
|
|
|
|
|
|
| |
* configure.ac (LIBGCRYPT_CONFIG_LIBS): Remove DL_LIBS.
* src/libgcrypt.c.in: Distinguish static link use case.
* tests/Makefile.am: Fix use of -lgpg-error.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
(cherry picked from commit 9b8ac13761f0407bd701e43b0a65fbada204958f)
|
|
|
|
| |
--
| |
* configure.ac (*-apple-darwin*): Set _DARWIN_C_SOURCE 1.
--
Cherry-pick master commit of:
b9a14725ec13747dab1d96658b2f7ce09b1ec874
GnuPG-bug-id: 5440
Reported-by: Jay Freeman
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* configure.ac [*-apple-darwin*] (USE_POSIX_SPAWN_FOR_TESTS): New.
* tests/random.c [USE_POSIX_SPAWN_FOR_TESTS] (run_all_rng_tests): New.
--
Cherry-pick master commit of:
9769b40b54cf010a0c41c4ab05a7a88e17d70613
GnuPG-bug-id: 5159
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
|
|
| |
--
|
| |
|
| |
|
|
|
|
| |
--
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* configure.ac: Generate src/libgcrypt.pc.
* src/Makefile.am (pkgconfigdir, pkgconfig_DATA): New.
(EXTRA_DIST): Add libgcrypt.pc.in.
* src/libgcrypt.pc.in: New.
--
Backported from master.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
|
|
| |
--
|
|
|
|
| |
--
|
|
|
|
|
|
|
| |
* configure.ac: Try to use -fno-delete-null-pointer-checks.
Signed-off-by: Werner Koch <wk@gnupg.org>
(cherry picked from commit 61dbb7c08ab11c10060e193b52e3e1d2ec6dd062)
|
|
|
|
|
|
|
|
|
|
|
|
| |
* configure.ac (gcry_cv_gcc_inline_asm_bmi2): New assembly test.
--
Use actual assembly snippets from keccak.c to check that the compiler
properly supports the BMI2 instructions used.
GnuPG-bug-id: 3408
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
(cherry picked from commit 135250e3060e79be698d4f36a819aa8a880789f8)
|
|
|
|
| |
--
|
|
|
|
| |
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
--
Ported from GnuPG 1.4.
All /dev/*random devices have been equivalent since OpenBSD 4.9, on
purpose (/dev/random doesn't block). /dev/srandom has been removed in
the OpenBSD 6.3 development cycle, /dev/arandom will likely follow.
Signed-off-by: Jeremie Courreges-Anglas <jca@wxcvbn.org>
|
|
|
|
| |
--
|
|
|
|
|
|
| |
* configure.ac: Set LT version to C22/A2/R1.
Signed-off-by: Werner Koch <wk@gnupg.org>
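The C/A/R triple is libtool's current/age/revision. As a minimal sketch, assuming the usual libtool mapping on GNU/Linux (major number is current minus age, followed by age and revision), the resulting file name can be derived as below. The helper name `lt_linux_filename` is hypothetical and not part of libgcrypt or libtool.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build the shared library file name that libtool
   typically produces on GNU/Linux from a current/age/revision triple.
   Returns the number of characters snprintf would have written. */
static int
lt_linux_filename (char *buf, size_t buflen, const char *lib,
                   unsigned int current, unsigned int age,
                   unsigned int revision)
{
  /* major = current - age; then age and revision follow. */
  return snprintf (buf, buflen, "%s.so.%u.%u.%u",
                   lib, current - age, age, revision);
}
```

Under that assumption, C22/A2/R1 maps to libgcrypt.so.20.2.1.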
| |
* cipher/cipher-gcm-armv8-aarch32-ce.S: Select ARMv8 architecture.
* cipher/rijndael-armv8-aarch32-ce.S: Ditto.
* cipher/sha1-armv8-aarch32-ce.S: Ditto.
* cipher/sha256-armv8-aarch32-ce.S: Ditto.
* configure.ac (gcry_cv_gcc_inline_asm_aarch32_crypto): Ditto.
--
The Raspbian distribution defaults to the ARMv6 architecture, so the
'rbit' instruction is not available with default compiler flags. This
patch adds explicit architecture selection for ARMv8 to enable 'rbit'
usage with the ARMv8/AArch32-CE assembly implementations of SHA,
GHASH and AES.
Reported-by: Chris Horry <zerbey@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
|
|
| |
--
|
|
|
|
| |
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
|
|
|
|
| |
--
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
|
|
|
|
|
|
| |
--
This is required to allow installation of 1.7 and 1.8.
Signed-off-by: Werner Koch <wk@gnupg.org>
| |
* random/rndjent.c: New.
* random/rndlinux.c (_gcry_rndlinux_gather_random): Use rndjent.
* random/rndw32.c (_gcry_rndw32_gather_random): Use rndjent.
(slow_gatherer): Fix compiler warning.
* random/Makefile.am (librandom_la_SOURCES): Add rndjent.c
(EXTRA_librandom_la_SOURCES): Add jitterentropy-base.c and
jitterentropy.h.
(rndjent.o, rndjent.lo): New rules.
* configure.ac: New option --disable-jent-support
(ENABLE_JENT_SUPPORT): New ac-define.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
|
|
|
|
|
|
|
| |
--
GnuPG-bug-id: 3120
Reported-by: ka7 (klemens)
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
* configure.ac: On macOS, use the compatibility macros to expose every
feature of the libc. This is the equivalent of _GNU_SOURCE on GNU
libc.
--
Not defining this leads to compilation errors or superfluous warnings
on macOS.
GnuPG-bug-id: 2910
Signed-off-by: Justus Winter <justus@g10code.com>
| |
* cipher/blake2.c: New.
* cipher/Makefile.am: Add 'blake2.c'.
* cipher/md.c (digest_list, prepare_macpads): Add BLAKE2.
(md_setkey): New.
(_gcry_md_setkey): Call 'md_setkey' for non-HMAC md.
* configure.ac: Add BLAKE2 digest.
* doc/gcrypt.texi: Add BLAKE2.
* src/cipher.h (_gcry_blake2_init_with_key)
(_gcry_digest_spec_blake2b_512, _gcry_digest_spec_blake2b_384)
(_gcry_digest_spec_blake2b_256, _gcry_digest_spec_blake2b_160)
(_gcry_digest_spec_blake2s_256, _gcry_digest_spec_blake2s_224)
(_gcry_digest_spec_blake2s_160, _gcry_digest_spec_blake2s_128): New.
* src/gcrypt.h.in (GCRY_MD_BLAKE2B_512, GCRY_MD_BLAKE2B_384)
(GCRY_MD_BLAKE2B_256, GCRY_MD_BLAKE2B_160, GCRY_MD_BLAKE2S_256)
(GCRY_MD_BLAKE2S_224, GCRY_MD_BLAKE2S_160, GCRY_MD_BLAKE2S_128): New.
* tests/basic.c (check_one_md): Add testing for keyed hashes.
(check_digests): Add BLAKE2 test vectors; Add testing for keyed hashes.
* tests/blake2b.h: New.
* tests/blake2s.h: New.
* tests/Makefile.am: Add 'blake2b.h' and 'blake2s.h'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/cipher-gcm-armv8-aarch64-ce.S: Use '.cpu generic+simd+crypto'
instead of '.arch armv8-a+crypto'.
* cipher/rijndael-armv8-aarch64-ce.S: Ditto.
* cipher/sha1-armv8-aarch64-ce.S: Ditto.
* cipher/sha256-armv8-aarch64-ce.S: Ditto.
* configure.ac (gcry_cv_gcc_inline_asm_aarch64_neon): Ditto.
(gcry_cv_gcc_inline_asm_aarch64_crypto): Ditto; and include NEON
instructions to crypto instructions check.
--
GnuPG-bug-id: 2975
Reported-by: Kirill Ponomarev <kp@krion.cc>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
|
|
|
|
|
|
|
|
| |
* configure.ac: Add -Werror flag for attribute checks.
--
The compiler ignores unknown attributes and only emits a warning.
Therefore attribute checks need to be run with -Werror.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
|
|
|
|
|
| |
* configure.ac: Test may_alias attribute on type, not on variable.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* configure.ac (gcry_cv_gcc_attribute_may_alias)
(HAVE_GCC_ATTRIBUTE_MAY_ALIAS): New check for 'may_alias' attribute.
* cipher/bufhelp.h (BUFHELP_FAST_UNALIGNED_ACCESS): Enable only if
HAVE_GCC_ATTRIBUTE_MAY_ALIAS is defined.
[BUFHELP_FAST_UNALIGNED_ACCESS] (bufhelp_int_t, bufhelp_u32_t)
(bufhelp_u64_t): Add 'may_alias' attribute.
* src/g10lib.h (fast_wipememory_t): Add HAVE_GCC_ATTRIBUTE_MAY_ALIAS
defined check; Add 'may_alias' attribute.
--
The 'may_alias' attribute was missing from the bufhelp unaligned memory
access pointer types, which caused problems with newer GCC versions
(with more aggressive optimization). This patch fixes broken
Camellia-CFB with '-O3 -flto' flags with GCC-6 on x86-64 and generic
GCM with default '-O2' on x32.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
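The aliasing issue described in this commit can be illustrated with a minimal sketch, assuming a GCC-compatible compiler. The names `u32_alias_t`, `xor_words` and `xor_words_selftest` are hypothetical and are not taken from cipher/bufhelp.h. The buffers are kept 4-byte aligned, so only the aliasing rule is at issue here, not alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: 'may_alias' tells GCC that accesses through
   this type may alias objects of any other type, much like 'char'. */
typedef uint32_t __attribute__((may_alias)) u32_alias_t;

/* XOR NWORDS 32-bit words of SRC into DST.  Both buffers are assumed
   to be 4-byte aligned by the caller. */
static void
xor_words (void *dst, const void *src, size_t nwords)
{
  u32_alias_t *d = dst;
  const u32_alias_t *s = src;
  size_t i;

  for (i = 0; i < nwords; i++)
    d[i] ^= s[i];
}

/* Self-check: XOR two aligned byte buffers word by word and compare
   against a bytewise reference.  Returns 1 on success. */
static int
xor_words_selftest (void)
{
  _Alignas(4) unsigned char a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  _Alignas(4) unsigned char b[8] = { 0xff, 0, 0xff, 0, 0xff, 0, 0xff, 0 };
  unsigned char ref[8];
  size_t i;

  for (i = 0; i < sizeof a; i++)
    ref[i] = a[i] ^ b[i];
  xor_words (a, b, sizeof a / 4);
  return memcmp (a, ref, sizeof ref) == 0;
}
```

Without the attribute, the compiler may assume that a store through a plain `uint32_t *` cannot modify the `unsigned char` arrays' values as seen through other types, which is the kind of assumption that aggressive optimization levels exploit.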
| |
* cipher/Makefile.am: Add 'rijndael-ssse3-amd64-asm.S'.
* cipher/rijndael-ssse3-amd64-asm.S: Moved assembly functions
here ...
* cipher/rijndael-ssse3-amd64.c: ... from this file.
(_gcry_aes_ssse3_enc_preload, _gcry_aes_ssse3_dec_preload)
(_gcry_aes_ssse3_shedule_core, _gcry_aes_ssse3_encrypt_core)
(_gcry_aes_ssse3_decrypt_core): New.
(vpaes_ssse3_prepare_enc, vpaes_ssse3_prepare_dec)
(_gcry_aes_ssse3_do_setkey, _gcry_aes_ssse3_prepare_decryption)
(do_vpaes_ssse3_enc, do_vpaes_ssse3_dec): Update to use external
assembly functions; remove 'aes_const_ptr' variable usage.
(_gcry_aes_ssse3_encrypt, _gcry_aes_ssse3_decrypt)
(_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc)
(_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec, ssse3_ocb_enc, ssse3_ocb_dec)
(_gcry_aes_ssse3_ocb_auth): Remove 'aes_const_ptr' variable usage.
* configure.ac: Add 'rijndael-ssse3-amd64-asm.lo'.
--
After this change, libgcrypt can be compiled with -flto optimization
enabled on x86-64.
GnuPG-bug-id: 2882
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'twofish-avx2-amd64.S'.
* cipher/twofish-avx2-amd64.S: New.
* cipher/twofish.c (USE_AVX2): New.
(TWOFISH_context) [USE_AVX2]: Add 'use_avx2' member.
(ASM_FUNC_ABI): New.
(twofish_setkey): Add check for AVX2 and fast VPGATHER HW features.
(_gcry_twofish_avx2_ctr_enc, _gcry_twofish_avx2_cbc_dec)
(_gcry_twofish_avx2_cfb_dec, _gcry_twofish_avx2_ocb_enc)
(_gcry_twofish_avx2_ocb_dec, _gcry_twofish_avx2_ocb_auth): New.
(_gcry_twofish_ctr_enc, _gcry_twofish_cbc_dec, _gcry_twofish_cfb_dec)
(_gcry_twofish_ocb_crypt, _gcry_twofish_ocb_auth): Add AVX2 bulk
handling.
(selftest_ctr, selftest_cbc, selftest_cfb): Increase nblocks from
3+X to 16+X.
* configure.ac: Add 'twofish-avx2-amd64.lo'.
* src/g10lib.h (HWF_INTEL_FAST_VPGATHER): New.
* src/hwf-x86.c (detect_x86_gnuc): Add detection for
HWF_INTEL_FAST_VPGATHER.
* src/hwfeatures.c (HWF_INTEL_FAST_VPGATHER): Add
"intel-fast-vpgather" for HWF_INTEL_FAST_VPGATHER.
--
Benchmark on Intel Core i3-6100 (3.7 GHz):
Before:
TWOFISH | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 4.25 ns/B 224.5 MiB/s 15.71 c/B
ECB dec | 4.16 ns/B 229.5 MiB/s 15.38 c/B
CBC enc | 4.53 ns/B 210.4 MiB/s 16.77 c/B
CBC dec | 2.71 ns/B 351.6 MiB/s 10.04 c/B
CFB enc | 4.60 ns/B 207.3 MiB/s 17.02 c/B
CFB dec | 2.70 ns/B 353.5 MiB/s 9.98 c/B
OFB enc | 4.25 ns/B 224.2 MiB/s 15.74 c/B
OFB dec | 4.24 ns/B 225.0 MiB/s 15.68 c/B
CTR enc | 2.72 ns/B 350.6 MiB/s 10.06 c/B
CTR dec | 2.72 ns/B 350.7 MiB/s 10.06 c/B
CCM enc | 7.25 ns/B 131.5 MiB/s 26.83 c/B
CCM dec | 7.25 ns/B 131.5 MiB/s 26.83 c/B
CCM auth | 4.57 ns/B 208.9 MiB/s 16.89 c/B
GCM enc | 3.02 ns/B 315.3 MiB/s 11.19 c/B
GCM dec | 3.02 ns/B 315.6 MiB/s 11.18 c/B
GCM auth | 0.297 ns/B 3208.4 MiB/s 1.10 c/B
OCB enc | 2.73 ns/B 349.7 MiB/s 10.09 c/B
OCB dec | 2.82 ns/B 338.3 MiB/s 10.43 c/B
OCB auth | 2.77 ns/B 343.7 MiB/s 10.27 c/B
After (CBC-dec & CFB-dec & CTR & OCB, ~1.5x faster):
TWOFISH | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 4.25 ns/B 224.2 MiB/s 15.74 c/B
ECB dec | 4.15 ns/B 229.5 MiB/s 15.37 c/B
CBC enc | 4.61 ns/B 206.8 MiB/s 17.06 c/B
CBC dec | 1.75 ns/B 544.0 MiB/s 6.49 c/B
CFB enc | 4.52 ns/B 211.0 MiB/s 16.72 c/B
CFB dec | 1.72 ns/B 554.1 MiB/s 6.37 c/B
OFB enc | 4.27 ns/B 223.3 MiB/s 15.80 c/B
OFB dec | 4.28 ns/B 222.7 MiB/s 15.84 c/B
CTR enc | 1.73 ns/B 549.9 MiB/s 6.42 c/B
CTR dec | 1.75 ns/B 545.1 MiB/s 6.47 c/B
CCM enc | 6.31 ns/B 151.2 MiB/s 23.34 c/B
CCM dec | 6.42 ns/B 148.5 MiB/s 23.76 c/B
CCM auth | 4.56 ns/B 208.9 MiB/s 16.89 c/B
GCM enc | 1.90 ns/B 502.8 MiB/s 7.02 c/B
GCM dec | 2.00 ns/B 477.8 MiB/s 7.38 c/B
GCM auth | 0.300 ns/B 3178.6 MiB/s 1.11 c/B
OCB enc | 1.76 ns/B 542.2 MiB/s 6.51 c/B
OCB dec | 1.76 ns/B 540.7 MiB/s 6.53 c/B
OCB auth | 1.76 ns/B 542.8 MiB/s 6.50 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* .gitignore: Add 'tests/basic-disable-all-hwf'.
* configure.ac: Ditto.
* tests/Makefile.am: Ditto.
* src/hwfeatures.c (_gcry_disable_hw_feature): Match 'all' for
masking all HW features off.
(parse_hwf_deny_file): Use '_gcry_disable_hw_feature' for matching.
* tests/basic-disable-all-hwf.in: New.
--
Also add a new test to run 'basic' with all HWF disabled. With the
current assembly implementations and build servers using new CPUs, the
generic implementations are no longer tested enough and compiler
problems might go unnoticed.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
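The 'all' matching described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code in src/hwfeatures.c: the table contents, the bit values and the names `hwf_table`, `disable_hw_feature` and `usable_hw_features` are hypothetical.

```c
#include <string.h>

/* Hypothetical feature table; names and bit values are illustrative
   and not copied from src/hwfeatures.c. */
struct hwf_entry
{
  unsigned int flag;
  const char *name;
};

static const struct hwf_entry hwf_table[] = {
  { 1u << 0, "feature-a" },
  { 1u << 1, "feature-b" },
  { 1u << 2, "feature-c" }
};

static unsigned int disabled_hwf;   /* Bitmask of features turned off. */

/* Disable the feature called NAME; the special name "all" masks every
   feature off.  Returns 0 on success, -1 if NAME is unknown. */
static int
disable_hw_feature (const char *name)
{
  size_t i;

  if (!strcmp (name, "all"))
    {
      disabled_hwf = ~0u;
      return 0;
    }
  for (i = 0; i < sizeof hwf_table / sizeof hwf_table[0]; i++)
    if (!strcmp (hwf_table[i].name, name))
      {
        disabled_hwf |= hwf_table[i].flag;
        return 0;
      }
  return -1;
}

/* Features still usable after applying the deny mask to DETECTED. */
static unsigned int
usable_hw_features (unsigned int detected)
{
  return detected & ~disabled_hwf;
}
```

Routing both the runtime request and the deny-file parser through one matching function, as the ChangeLog entry describes, keeps the handling of 'all' in a single place.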
| |
* src/gcrypt.h.in (GCRYCTL_REINIT_SYSCALL_CLAMP): New.
* configure.ac: Require Libgpg-error 1.25. Set version number to
1.8.0.
* src/gcrypt-int.h: Remove error code emulation.
* src/global.c (pre_syscall_func, post_syscall_func): New.
(global_init): Call gpgrt_get_syscall_clamp.
(_gcry_vcontrol) <GCRYCTL_REINIT_SYSCALL_CLAMP>: Ditto.
(_gcry_pre_syscall, _gcry_post_syscall): New.
* random/rndlinux.c (_gcry_rndlinux_gather_random): Use the new
functions.
Signed-off-by: Werner Koch <wk@gnupg.org>
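The pre/post syscall functions added here follow a clamp pattern: optional hooks run immediately before and after a potentially blocking system call. A minimal sketch of that pattern, assuming nothing about the actual Libgpg-error interface, is shown below. All names (`set_syscall_clamp`, `pre_syscall`, `post_syscall`, the demo hooks and `clamped_operation`) are hypothetical.

```c
#include <stddef.h>

/* Hooks run around a potentially blocking system call; NULL means
   that no clamp has been installed. */
static void (*pre_syscall_hook) (void);
static void (*post_syscall_hook) (void);

/* Install (or clear, with NULL) the pair of clamp functions. */
static void
set_syscall_clamp (void (*pre) (void), void (*post) (void))
{
  pre_syscall_hook = pre;
  post_syscall_hook = post;
}

static void
pre_syscall (void)
{
  if (pre_syscall_hook)
    pre_syscall_hook ();
}

static void
post_syscall (void)
{
  if (post_syscall_hook)
    post_syscall_hook ();
}

/* Demo hooks that only record how they were called. */
static int clamp_depth;
static int clamp_calls;

static void demo_pre (void)  { clamp_depth++; clamp_calls++; }
static void demo_post (void) { clamp_depth--; }

/* Stand-in for a blocking operation.  Returns the clamp depth that
   was observed between the two hooks. */
static int
clamped_operation (void)
{
  int seen;

  pre_syscall ();
  seen = clamp_depth;   /* A real caller would block here, e.g. in read(). */
  post_syscall ();
  return seen;
}
```

A threading library could use such hooks to release a global lock before the call and reacquire it afterwards, so that other threads can run while one thread blocks.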
| |
* cipher/Makefile.am: Add 'twofish-aarch64.S'.
* cipher/twofish-aarch64.S: New.
* cipher/twofish.c: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac [host=aarch64]: Add 'twofish-aarch64.lo'.
--
Patch adds ARMv8/AArch64 implementation of Twofish.
Benchmark on Cortex-A53 (1152 MHz):
Before:
TWOFISH | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 27.51 ns/B 34.67 MiB/s 31.69 c/B
ECB dec | 26.37 ns/B 36.17 MiB/s 30.38 c/B
CBC enc | 28.64 ns/B 33.29 MiB/s 33.00 c/B
CBC dec | 26.21 ns/B 36.39 MiB/s 30.19 c/B
CFB enc | 28.54 ns/B 33.42 MiB/s 32.88 c/B
CFB dec | 27.40 ns/B 34.81 MiB/s 31.56 c/B
OFB enc | 28.38 ns/B 33.61 MiB/s 32.69 c/B
OFB dec | 28.37 ns/B 33.61 MiB/s 32.69 c/B
CTR enc | 27.57 ns/B 34.60 MiB/s 31.76 c/B
CTR dec | 27.57 ns/B 34.60 MiB/s 31.76 c/B
CCM enc | 55.28 ns/B 17.25 MiB/s 63.69 c/B
CCM dec | 55.29 ns/B 17.25 MiB/s 63.70 c/B
CCM auth | 27.83 ns/B 34.27 MiB/s 32.06 c/B
GCM enc | 28.86 ns/B 33.04 MiB/s 33.25 c/B
GCM dec | 28.87 ns/B 33.04 MiB/s 33.25 c/B
GCM auth | 1.30 ns/B 731.9 MiB/s 1.50 c/B
OCB enc | 29.69 ns/B 32.12 MiB/s 34.20 c/B
OCB dec | 28.50 ns/B 33.47 MiB/s 32.83 c/B
OCB auth | 29.04 ns/B 32.84 MiB/s 33.45 c/B
=
After (~1.3x faster):
TWOFISH | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 19.97 ns/B 47.77 MiB/s 23.00 c/B
ECB dec | 18.29 ns/B 52.16 MiB/s 21.06 c/B
CBC enc | 20.94 ns/B 45.54 MiB/s 24.13 c/B
CBC dec | 18.34 ns/B 52.00 MiB/s 21.13 c/B
CFB enc | 20.83 ns/B 45.77 MiB/s 24.00 c/B
CFB dec | 19.97 ns/B 47.76 MiB/s 23.00 c/B
OFB enc | 20.94 ns/B 45.54 MiB/s 24.13 c/B
OFB dec | 20.94 ns/B 45.54 MiB/s 24.13 c/B
CTR enc | 20.19 ns/B 47.24 MiB/s 23.26 c/B
CTR dec | 20.19 ns/B 47.24 MiB/s 23.26 c/B
CCM enc | 40.53 ns/B 23.53 MiB/s 46.69 c/B
CCM dec | 40.53 ns/B 23.53 MiB/s 46.69 c/B
CCM auth | 20.40 ns/B 46.74 MiB/s 23.50 c/B
GCM enc | 21.49 ns/B 44.39 MiB/s 24.75 c/B
GCM dec | 21.48 ns/B 44.39 MiB/s 24.75 c/B
GCM auth | 1.30 ns/B 731.8 MiB/s 1.50 c/B
OCB enc | 22.15 ns/B 43.05 MiB/s 25.52 c/B
OCB dec | 20.47 ns/B 46.58 MiB/s 23.59 c/B
OCB auth | 21.64 ns/B 44.07 MiB/s 24.93 c/B
=
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'camellia-aarch64.S'.
* cipher/camellia-aarch64.S: New.
* cipher/camellia-glue.c [USE_ARM_ASM][__aarch64__]: Set stack burn
size to zero.
* cipher/camellia.h: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac [host=aarch64]: Add 'camellia-aarch64.lo'.
--
Patch adds ARMv8/AArch64 implementation of Camellia.
Benchmark on Cortex-A53 (1152 MHz):
Before:
CAMELLIA128 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 39.71 ns/B 24.01 MiB/s 45.75 c/B
ECB dec | 39.72 ns/B 24.01 MiB/s 45.75 c/B
CBC enc | 40.80 ns/B 23.38 MiB/s 47.00 c/B
CBC dec | 39.66 ns/B 24.05 MiB/s 45.69 c/B
CFB enc | 40.69 ns/B 23.44 MiB/s 46.88 c/B
CFB dec | 39.66 ns/B 24.05 MiB/s 45.69 c/B
OFB enc | 40.69 ns/B 23.44 MiB/s 46.88 c/B
OFB dec | 40.69 ns/B 23.44 MiB/s 46.88 c/B
CTR enc | 39.88 ns/B 23.91 MiB/s 45.94 c/B
CTR dec | 39.88 ns/B 23.91 MiB/s 45.94 c/B
CCM enc | 79.97 ns/B 11.92 MiB/s 92.13 c/B
CCM dec | 79.97 ns/B 11.93 MiB/s 92.13 c/B
CCM auth | 40.20 ns/B 23.72 MiB/s 46.31 c/B
GCM enc | 41.18 ns/B 23.16 MiB/s 47.44 c/B
GCM dec | 41.18 ns/B 23.16 MiB/s 47.44 c/B
GCM auth | 1.30 ns/B 732.7 MiB/s 1.50 c/B
OCB enc | 42.04 ns/B 22.69 MiB/s 48.43 c/B
OCB dec | 42.03 ns/B 22.69 MiB/s 48.42 c/B
OCB auth | 41.38 ns/B 23.05 MiB/s 47.67 c/B
=
CAMELLIA256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 52.36 ns/B 18.22 MiB/s 60.31 c/B
ECB dec | 52.36 ns/B 18.22 MiB/s 60.31 c/B
CBC enc | 53.39 ns/B 17.86 MiB/s 61.50 c/B
CBC dec | 52.14 ns/B 18.29 MiB/s 60.06 c/B
CFB enc | 53.28 ns/B 17.90 MiB/s 61.38 c/B
CFB dec | 52.14 ns/B 18.29 MiB/s 60.06 c/B
OFB enc | 53.17 ns/B 17.94 MiB/s 61.25 c/B
OFB dec | 53.17 ns/B 17.94 MiB/s 61.25 c/B
CTR enc | 52.36 ns/B 18.21 MiB/s 60.32 c/B
CTR dec | 52.36 ns/B 18.21 MiB/s 60.32 c/B
CCM enc | 105.0 ns/B 9.08 MiB/s 120.9 c/B
CCM dec | 105.0 ns/B 9.08 MiB/s 120.9 c/B
CCM auth | 52.74 ns/B 18.08 MiB/s 60.75 c/B
GCM enc | 53.66 ns/B 17.77 MiB/s 61.81 c/B
GCM dec | 53.66 ns/B 17.77 MiB/s 61.82 c/B
GCM auth | 1.30 ns/B 732.3 MiB/s 1.50 c/B
OCB enc | 54.54 ns/B 17.49 MiB/s 62.83 c/B
OCB dec | 54.48 ns/B 17.50 MiB/s 62.77 c/B
OCB auth | 53.89 ns/B 17.70 MiB/s 62.09 c/B
=
After (~1.7x faster):
CAMELLIA128 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 22.25 ns/B 42.87 MiB/s 25.63 c/B
ECB dec | 22.25 ns/B 42.87 MiB/s 25.63 c/B
CBC enc | 23.27 ns/B 40.97 MiB/s 26.81 c/B
CBC dec | 22.14 ns/B 43.08 MiB/s 25.50 c/B
CFB enc | 23.17 ns/B 41.17 MiB/s 26.69 c/B
CFB dec | 22.14 ns/B 43.08 MiB/s 25.50 c/B
OFB enc | 23.11 ns/B 41.26 MiB/s 26.63 c/B
OFB dec | 23.11 ns/B 41.26 MiB/s 26.63 c/B
CTR enc | 22.36 ns/B 42.65 MiB/s 25.76 c/B
CTR dec | 22.36 ns/B 42.65 MiB/s 25.76 c/B
CCM enc | 44.87 ns/B 21.26 MiB/s 51.69 c/B
CCM dec | 44.87 ns/B 21.25 MiB/s 51.69 c/B
CCM auth | 22.62 ns/B 42.15 MiB/s 26.06 c/B
GCM enc | 23.66 ns/B 40.31 MiB/s 27.25 c/B
GCM dec | 23.66 ns/B 40.31 MiB/s 27.25 c/B
GCM auth | 1.30 ns/B 732.0 MiB/s 1.50 c/B
OCB enc | 24.32 ns/B 39.21 MiB/s 28.02 c/B
OCB dec | 24.32 ns/B 39.21 MiB/s 28.02 c/B
OCB auth | 23.75 ns/B 40.15 MiB/s 27.36 c/B
=
CAMELLIA256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 29.08 ns/B 32.79 MiB/s 33.50 c/B
ECB dec | 29.19 ns/B 32.67 MiB/s 33.63 c/B
CBC enc | 30.11 ns/B 31.67 MiB/s 34.69 c/B
CBC dec | 29.05 ns/B 32.83 MiB/s 33.47 c/B
CFB enc | 30.00 ns/B 31.79 MiB/s 34.56 c/B
CFB dec | 28.97 ns/B 32.91 MiB/s 33.38 c/B
OFB enc | 29.95 ns/B 31.84 MiB/s 34.50 c/B
OFB dec | 29.95 ns/B 31.84 MiB/s 34.50 c/B
CTR enc | 29.19 ns/B 32.67 MiB/s 33.63 c/B
CTR dec | 29.19 ns/B 32.67 MiB/s 33.63 c/B
CCM enc | 58.54 ns/B 16.29 MiB/s 67.43 c/B
CCM dec | 58.54 ns/B 16.29 MiB/s 67.44 c/B
CCM auth | 29.46 ns/B 32.37 MiB/s 33.94 c/B
GCM enc | 30.49 ns/B 31.28 MiB/s 35.12 c/B
GCM dec | 30.49 ns/B 31.27 MiB/s 35.13 c/B
GCM auth | 1.30 ns/B 731.6 MiB/s 1.50 c/B
OCB enc | 31.16 ns/B 30.61 MiB/s 35.90 c/B
OCB dec | 31.22 ns/B 30.55 MiB/s 35.96 c/B
OCB auth | 30.59 ns/B 31.18 MiB/s 35.24 c/B
=
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'rijndael-armv8-aarch64-ce.S'.
* cipher/rijndael-armv8-aarch64-ce.S: New.
* cipher/rijndael-internal.h (USE_ARM_CE): Enable for ARMv8/AArch64.
* configure.ac: Add 'rijndael-armv8-aarch64-ce.lo' and
'rijndael-armv8-ce.lo' for ARMv8/AArch64.
--
Improvement vs AArch64 assembly on Cortex-A53:
AES-128 AES-192 AES-256
CBC enc: 13.19x 13.53x 13.76x
CBC dec: 20.53x 21.91x 22.60x
CFB enc: 14.29x 14.50x 14.63x
CFB dec: 20.42x 21.69x 22.50x
CTR: 18.29x 19.61x 20.53x
OCB enc: 15.21x 16.32x 17.12x
OCB dec: 14.95x 16.11x 16.88x
OCB auth: 16.73x 17.93x 18.66x
Benchmark on Cortex-A53 (1152 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 21.86 ns/B 43.62 MiB/s 25.19 c/B
ECB dec | 22.68 ns/B 42.05 MiB/s 26.13 c/B
CBC enc | 18.66 ns/B 51.10 MiB/s 21.50 c/B
CBC dec | 18.72 ns/B 50.95 MiB/s 21.56 c/B
CFB enc | 18.61 ns/B 51.25 MiB/s 21.44 c/B
CFB dec | 18.61 ns/B 51.25 MiB/s 21.44 c/B
OFB enc | 22.84 ns/B 41.75 MiB/s 26.31 c/B
OFB dec | 22.84 ns/B 41.75 MiB/s 26.31 c/B
CTR enc | 18.89 ns/B 50.50 MiB/s 21.76 c/B
CTR dec | 18.89 ns/B 50.50 MiB/s 21.76 c/B
CCM enc | 37.55 ns/B 25.40 MiB/s 43.25 c/B
CCM dec | 37.55 ns/B 25.40 MiB/s 43.25 c/B
CCM auth | 18.77 ns/B 50.80 MiB/s 21.63 c/B
GCM enc | 20.18 ns/B 47.25 MiB/s 23.25 c/B
GCM dec | 20.18 ns/B 47.25 MiB/s 23.25 c/B
GCM auth | 1.30 ns/B 732.5 MiB/s 1.50 c/B
OCB enc | 19.67 ns/B 48.48 MiB/s 22.66 c/B
OCB dec | 19.73 ns/B 48.34 MiB/s 22.72 c/B
OCB auth | 19.46 ns/B 49.00 MiB/s 22.42 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 25.39 ns/B 37.56 MiB/s 29.25 c/B
ECB dec | 26.15 ns/B 36.47 MiB/s 30.13 c/B
CBC enc | 22.08 ns/B 43.19 MiB/s 25.44 c/B
CBC dec | 22.25 ns/B 42.87 MiB/s 25.63 c/B
CFB enc | 22.03 ns/B 43.30 MiB/s 25.38 c/B
CFB dec | 22.03 ns/B 43.29 MiB/s 25.38 c/B
OFB enc | 26.26 ns/B 36.32 MiB/s 30.25 c/B
OFB dec | 26.26 ns/B 36.32 MiB/s 30.25 c/B
CTR enc | 22.30 ns/B 42.76 MiB/s 25.69 c/B
CTR dec | 22.30 ns/B 42.76 MiB/s 25.69 c/B
CCM enc | 44.38 ns/B 21.49 MiB/s 51.13 c/B
CCM dec | 44.38 ns/B 21.49 MiB/s 51.13 c/B
CCM auth | 22.20 ns/B 42.97 MiB/s 25.57 c/B
GCM enc | 23.60 ns/B 40.41 MiB/s 27.19 c/B
GCM dec | 23.60 ns/B 40.41 MiB/s 27.19 c/B
GCM auth | 1.30 ns/B 732.4 MiB/s 1.50 c/B
OCB enc | 23.09 ns/B 41.31 MiB/s 26.60 c/B
OCB dec | 23.21 ns/B 41.09 MiB/s 26.74 c/B
OCB auth | 22.88 ns/B 41.68 MiB/s 26.36 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 28.76 ns/B 33.17 MiB/s 33.13 c/B
ECB dec | 29.46 ns/B 32.37 MiB/s 33.94 c/B
CBC enc | 25.45 ns/B 37.48 MiB/s 29.31 c/B
CBC dec | 25.50 ns/B 37.40 MiB/s 29.38 c/B
CFB enc | 25.39 ns/B 37.56 MiB/s 29.25 c/B
CFB dec | 25.39 ns/B 37.56 MiB/s 29.25 c/B
OFB enc | 29.62 ns/B 32.19 MiB/s 34.13 c/B
OFB dec | 29.62 ns/B 32.19 MiB/s 34.13 c/B
CTR enc | 25.67 ns/B 37.15 MiB/s 29.57 c/B
CTR dec | 25.67 ns/B 37.15 MiB/s 29.57 c/B
CCM enc | 51.11 ns/B 18.66 MiB/s 58.88 c/B
CCM dec | 51.11 ns/B 18.66 MiB/s 58.88 c/B
CCM auth | 25.56 ns/B 37.32 MiB/s 29.44 c/B
GCM enc | 26.96 ns/B 35.37 MiB/s 31.06 c/B
GCM dec | 26.98 ns/B 35.35 MiB/s 31.08 c/B
GCM auth | 1.30 ns/B 733.4 MiB/s 1.50 c/B
OCB enc | 26.45 ns/B 36.05 MiB/s 30.47 c/B
OCB dec | 26.53 ns/B 35.95 MiB/s 30.56 c/B
OCB auth | 26.24 ns/B 36.34 MiB/s 30.23 c/B
=
After:
Cipher:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 4.83 ns/B 197.5 MiB/s 5.56 c/B
ECB dec | 4.99 ns/B 191.1 MiB/s 5.75 c/B
CBC enc | 1.41 ns/B 675.5 MiB/s 1.63 c/B
CBC dec | 0.911 ns/B 1046.9 MiB/s 1.05 c/B
CFB enc | 1.30 ns/B 732.2 MiB/s 1.50 c/B
CFB dec | 0.911 ns/B 1046.7 MiB/s 1.05 c/B
OFB enc | 5.81 ns/B 164.3 MiB/s 6.69 c/B
OFB dec | 5.81 ns/B 164.3 MiB/s 6.69 c/B
CTR enc | 1.03 ns/B 924.0 MiB/s 1.19 c/B
CTR dec | 1.03 ns/B 924.1 MiB/s 1.19 c/B
CCM enc | 2.50 ns/B 381.8 MiB/s 2.88 c/B
CCM dec | 2.50 ns/B 381.7 MiB/s 2.88 c/B
CCM auth | 1.57 ns/B 606.1 MiB/s 1.81 c/B
GCM enc | 2.33 ns/B 408.5 MiB/s 2.69 c/B
GCM dec | 2.34 ns/B 408.4 MiB/s 2.69 c/B
GCM auth | 1.30 ns/B 732.1 MiB/s 1.50 c/B
OCB enc | 1.29 ns/B 736.6 MiB/s 1.49 c/B
OCB dec | 1.32 ns/B 724.4 MiB/s 1.52 c/B
OCB auth | 1.16 ns/B 819.6 MiB/s 1.34 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.48 ns/B 174.0 MiB/s 6.31 c/B
ECB dec | 5.64 ns/B 169.0 MiB/s 6.50 c/B
CBC enc | 1.63 ns/B 585.8 MiB/s 1.88 c/B
CBC dec | 1.02 ns/B 935.8 MiB/s 1.17 c/B
CFB enc | 1.52 ns/B 627.7 MiB/s 1.75 c/B
CFB dec | 1.02 ns/B 935.9 MiB/s 1.17 c/B
OFB enc | 6.46 ns/B 147.7 MiB/s 7.44 c/B
OFB dec | 6.46 ns/B 147.7 MiB/s 7.44 c/B
CTR enc | 1.14 ns/B 836.1 MiB/s 1.31 c/B
CTR dec | 1.14 ns/B 835.9 MiB/s 1.31 c/B
CCM enc | 2.83 ns/B 337.6 MiB/s 3.25 c/B
CCM dec | 2.82 ns/B 338.0 MiB/s 3.25 c/B
CCM auth | 1.79 ns/B 532.7 MiB/s 2.06 c/B
GCM enc | 2.44 ns/B 390.3 MiB/s 2.82 c/B
GCM dec | 2.44 ns/B 390.2 MiB/s 2.82 c/B
GCM auth | 1.30 ns/B 731.9 MiB/s 1.50 c/B
OCB enc | 1.41 ns/B 674.7 MiB/s 1.63 c/B
OCB dec | 1.44 ns/B 662.0 MiB/s 1.66 c/B
OCB auth | 1.28 ns/B 746.1 MiB/s 1.47 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 6.13 ns/B 155.5 MiB/s 7.06 c/B
ECB dec | 6.29 ns/B 151.5 MiB/s 7.25 c/B
CBC enc | 1.85 ns/B 516.8 MiB/s 2.13 c/B
CBC dec | 1.13 ns/B 845.6 MiB/s 1.30 c/B
CFB enc | 1.74 ns/B 549.5 MiB/s 2.00 c/B
CFB dec | 1.13 ns/B 846.1 MiB/s 1.30 c/B
OFB enc | 7.11 ns/B 134.2 MiB/s 8.19 c/B
OFB dec | 7.11 ns/B 134.2 MiB/s 8.19 c/B
CTR enc | 1.25 ns/B 763.5 MiB/s 1.44 c/B
CTR dec | 1.25 ns/B 763.4 MiB/s 1.44 c/B
CCM enc | 3.15 ns/B 302.9 MiB/s 3.63 c/B
CCM dec | 3.15 ns/B 302.9 MiB/s 3.63 c/B
CCM auth | 2.01 ns/B 474.2 MiB/s 2.32 c/B
GCM enc | 2.55 ns/B 374.2 MiB/s 2.94 c/B
GCM dec | 2.55 ns/B 373.7 MiB/s 2.94 c/B
GCM auth | 1.30 ns/B 732.2 MiB/s 1.50 c/B
OCB enc | 1.54 ns/B 617.6 MiB/s 1.78 c/B
OCB dec | 1.57 ns/B 606.8 MiB/s 1.81 c/B
OCB auth | 1.40 ns/B 679.8 MiB/s 1.62 c/B
=
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'sha256-armv8-aarch64-ce.S'.
* cipher/sha256-armv8-aarch64-ce.S: New.
* cipher/sha256-armv8-aarch32-ce.S: Move round macros to correct
section.
* cipher/sha256.c (USE_ARM_CE): Enable on ARMv8/AArch64.
* configure.ac: Add 'sha256-armv8-aarch64-ce.lo'; Swap places for
'sha512-arm.lo' and 'sha256-armv8-aarch32-ce.lo'.
--
Benchmark on Cortex-A53 (1152 MHz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 13.34 ns/B 71.51 MiB/s 15.36 c/B
After (7.2x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 1.85 ns/B 516.3 MiB/s 2.13 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
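The relation between the benchmark columns and the quoted speedup can be checked with a small sketch. The function names `speedup` and `cycles_per_byte` are hypothetical; the inputs are the SHA256 figures and the 1152 MHz clock quoted in this commit.

```c
/* Speedup factor from two nanoseconds-per-byte measurements. */
static double
speedup (double ns_per_byte_before, double ns_per_byte_after)
{
  return ns_per_byte_before / ns_per_byte_after;
}

/* Cycles per byte from nanoseconds per byte and the clock in MHz:
   ns/B * (cycles per ns), where cycles per ns = MHz / 1000. */
static double
cycles_per_byte (double ns_per_byte, double clock_mhz)
{
  return ns_per_byte * clock_mhz / 1000.0;
}
```

With 13.34 ns/B before and 1.85 ns/B after, the ratio is about 7.2, and 1.85 ns/B at 1152 MHz gives about 2.13 c/B, in line with the table. Small differences in the last digit are expected because the table values are themselves rounded.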
| |
* cipher/Makefile.am: Add 'sha1-armv8-aarch64-ce.S'.
* cipher/sha1-armv8-aarch64-ce.S: New.
* cipher/sha1.c (USE_ARM_CE): Enable on ARMv8/AArch64.
* configure.ac: Add 'sha1-armv8-aarch64-ce.lo'.
--
Benchmark on Cortex-A53 (1152 MHz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA1 | 7.54 ns/B 126.4 MiB/s 8.69 c/B
After (4.3x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA1 | 1.72 ns/B 553.0 MiB/s 1.99 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'rijndael-aarch64.S'.
* cipher/rijndael-aarch64.S: New.
* cipher/rijndael-internal.h: Enable USE_ARM_ASM if __AARCH64EL__ and
HAVE_COMPATIBLE_GCC_AARCH64_PLATFORM_AS defined.
* configure.ac (gcry_cv_gcc_aarch64_platform_as_ok): New check.
[host=aarch64]: Add 'rijndael-aarch64.lo'.
--
Patch adds ARMv8/AArch64 implementation of AES.
Benchmark on Cortex-A53 (1536 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 19.37 ns/B 49.22 MiB/s 29.76 c/B
ECB dec | 19.85 ns/B 48.03 MiB/s 30.50 c/B
CBC enc | 16.84 ns/B 56.62 MiB/s 25.87 c/B
CBC dec | 16.81 ns/B 56.74 MiB/s 25.82 c/B
CFB enc | 16.80 ns/B 56.75 MiB/s 25.81 c/B
CFB dec | 16.81 ns/B 56.75 MiB/s 25.81 c/B
OFB enc | 20.02 ns/B 47.64 MiB/s 30.75 c/B
OFB dec | 20.02 ns/B 47.64 MiB/s 30.75 c/B
CTR enc | 17.06 ns/B 55.91 MiB/s 26.20 c/B
CTR dec | 17.06 ns/B 55.92 MiB/s 26.20 c/B
CCM enc | 33.94 ns/B 28.10 MiB/s 52.13 c/B
CCM dec | 33.94 ns/B 28.10 MiB/s 52.14 c/B
CCM auth | 16.97 ns/B 56.18 MiB/s 26.07 c/B
GCM enc | 28.70 ns/B 33.23 MiB/s 44.09 c/B
GCM dec | 28.70 ns/B 33.23 MiB/s 44.09 c/B
GCM auth | 11.66 ns/B 81.81 MiB/s 17.90 c/B
OCB enc | 17.66 ns/B 53.99 MiB/s 27.13 c/B
OCB dec | 17.61 ns/B 54.16 MiB/s 27.05 c/B
OCB auth | 17.44 ns/B 54.69 MiB/s 26.78 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 21.82 ns/B 43.71 MiB/s 33.51 c/B
ECB dec | 22.55 ns/B 42.30 MiB/s 34.63 c/B
CBC enc | 19.33 ns/B 49.33 MiB/s 29.70 c/B
CBC dec | 19.50 ns/B 48.91 MiB/s 29.95 c/B
CFB enc | 19.29 ns/B 49.44 MiB/s 29.63 c/B
CFB dec | 19.28 ns/B 49.46 MiB/s 29.61 c/B
OFB enc | 22.49 ns/B 42.40 MiB/s 34.55 c/B
OFB dec | 22.50 ns/B 42.38 MiB/s 34.56 c/B
CTR enc | 19.53 ns/B 48.83 MiB/s 30.00 c/B
CTR dec | 19.54 ns/B 48.80 MiB/s 30.02 c/B
CCM enc | 38.91 ns/B 24.51 MiB/s 59.77 c/B
CCM dec | 38.90 ns/B 24.51 MiB/s 59.76 c/B
CCM auth | 19.45 ns/B 49.02 MiB/s 29.88 c/B
GCM enc | 31.13 ns/B 30.63 MiB/s 47.82 c/B
GCM dec | 31.14 ns/B 30.63 MiB/s 47.82 c/B
GCM auth | 11.66 ns/B 81.80 MiB/s 17.91 c/B
OCB enc | 20.15 ns/B 47.33 MiB/s 30.95 c/B
OCB dec | 20.30 ns/B 46.98 MiB/s 31.18 c/B
OCB auth | 19.92 ns/B 47.88 MiB/s 30.59 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 24.33 ns/B 39.19 MiB/s 37.38 c/B
ECB dec | 25.23 ns/B 37.80 MiB/s 38.76 c/B
CBC enc | 21.82 ns/B 43.71 MiB/s 33.51 c/B
CBC dec | 22.18 ns/B 42.99 MiB/s 34.07 c/B
CFB enc | 21.77 ns/B 43.80 MiB/s 33.44 c/B
CFB dec | 21.77 ns/B 43.81 MiB/s 33.44 c/B
OFB enc | 24.99 ns/B 38.16 MiB/s 38.39 c/B
OFB dec | 24.99 ns/B 38.17 MiB/s 38.38 c/B
CTR enc | 22.02 ns/B 43.32 MiB/s 33.82 c/B
CTR dec | 22.02 ns/B 43.31 MiB/s 33.82 c/B
CCM enc | 43.86 ns/B 21.74 MiB/s 67.38 c/B
CCM dec | 43.87 ns/B 21.74 MiB/s 67.39 c/B
CCM auth | 21.94 ns/B 43.48 MiB/s 33.69 c/B
GCM enc | 33.66 ns/B 28.33 MiB/s 51.71 c/B
GCM dec | 33.66 ns/B 28.33 MiB/s 51.70 c/B
GCM auth | 11.69 ns/B 81.59 MiB/s 17.95 c/B
OCB enc | 22.90 ns/B 41.65 MiB/s 35.17 c/B
OCB dec | 23.25 ns/B 41.02 MiB/s 35.71 c/B
OCB auth | 22.69 ns/B 42.03 MiB/s 34.85 c/B
=
After (~1.2x faster):
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 16.40 ns/B 58.16 MiB/s 25.19 c/B
ECB dec | 17.01 ns/B 56.07 MiB/s 26.13 c/B
CBC enc | 13.99 ns/B 68.15 MiB/s 21.49 c/B
CBC dec | 14.04 ns/B 67.94 MiB/s 21.56 c/B
CFB enc | 13.96 ns/B 68.32 MiB/s 21.44 c/B
CFB dec | 13.95 ns/B 68.34 MiB/s 21.43 c/B
OFB enc | 17.14 ns/B 55.65 MiB/s 26.32 c/B
OFB dec | 17.13 ns/B 55.67 MiB/s 26.31 c/B
CTR enc | 14.17 ns/B 67.31 MiB/s 21.76 c/B
CTR dec | 14.17 ns/B 67.29 MiB/s 21.77 c/B
CCM enc | 28.16 ns/B 33.86 MiB/s 43.26 c/B
CCM dec | 28.16 ns/B 33.87 MiB/s 43.26 c/B
CCM auth | 14.08 ns/B 67.71 MiB/s 21.63 c/B
GCM enc | 25.82 ns/B 36.94 MiB/s 39.66 c/B
GCM dec | 25.82 ns/B 36.94 MiB/s 39.65 c/B
GCM auth | 11.67 ns/B 81.74 MiB/s 17.92 c/B
OCB enc | 14.78 ns/B 64.55 MiB/s 22.69 c/B
OCB dec | 14.80 ns/B 64.43 MiB/s 22.74 c/B
OCB auth | 14.59 ns/B 65.36 MiB/s 22.41 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 19.05 ns/B 50.07 MiB/s 29.25 c/B
ECB dec | 19.62 ns/B 48.62 MiB/s 30.13 c/B
CBC enc | 16.56 ns/B 57.59 MiB/s 25.44 c/B
CBC dec | 16.69 ns/B 57.14 MiB/s 25.64 c/B
CFB enc | 16.52 ns/B 57.71 MiB/s 25.38 c/B
CFB dec | 16.52 ns/B 57.73 MiB/s 25.37 c/B
OFB enc | 19.70 ns/B 48.41 MiB/s 30.26 c/B
OFB dec | 19.69 ns/B 48.43 MiB/s 30.24 c/B
CTR enc | 16.73 ns/B 57.00 MiB/s 25.70 c/B
CTR dec | 16.73 ns/B 57.01 MiB/s 25.70 c/B
CCM enc | 33.29 ns/B 28.65 MiB/s 51.13 c/B
CCM dec | 33.29 ns/B 28.65 MiB/s 51.13 c/B
CCM auth | 16.65 ns/B 57.29 MiB/s 25.57 c/B
GCM enc | 28.39 ns/B 33.60 MiB/s 43.60 c/B
GCM dec | 28.39 ns/B 33.59 MiB/s 43.60 c/B
GCM auth | 11.64 ns/B 81.92 MiB/s 17.88 c/B
OCB enc | 17.33 ns/B 55.03 MiB/s 26.62 c/B
OCB dec | 17.40 ns/B 54.82 MiB/s 26.72 c/B
OCB auth | 17.16 ns/B 55.59 MiB/s 26.35 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 21.56 ns/B 44.23 MiB/s 33.12 c/B
ECB dec | 22.09 ns/B 43.17 MiB/s 33.93 c/B
CBC enc | 19.09 ns/B 49.97 MiB/s 29.31 c/B
CBC dec | 19.13 ns/B 49.86 MiB/s 29.38 c/B
CFB enc | 19.04 ns/B 50.09 MiB/s 29.24 c/B
CFB dec | 19.04 ns/B 50.08 MiB/s 29.25 c/B
OFB enc | 22.22 ns/B 42.93 MiB/s 34.13 c/B
OFB dec | 22.22 ns/B 42.92 MiB/s 34.13 c/B
CTR enc | 19.25 ns/B 49.53 MiB/s 29.57 c/B
CTR dec | 19.25 ns/B 49.55 MiB/s 29.57 c/B
CCM enc | 38.33 ns/B 24.88 MiB/s 58.88 c/B
CCM dec | 38.34 ns/B 24.88 MiB/s 58.88 c/B
CCM auth | 19.17 ns/B 49.76 MiB/s 29.44 c/B
GCM enc | 30.91 ns/B 30.86 MiB/s 47.47 c/B
GCM dec | 30.91 ns/B 30.85 MiB/s 47.48 c/B
GCM auth | 11.71 ns/B 81.47 MiB/s 17.98 c/B
OCB enc | 19.85 ns/B 48.04 MiB/s 30.49 c/B
OCB dec | 19.89 ns/B 47.95 MiB/s 30.55 c/B
OCB auth | 19.67 ns/B 48.48 MiB/s 30.22 c/B
=
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
--
|
* configure.ac: Set LT version to C21/A1/R3.
Signed-off-by: Werner Koch <wk@gnupg.org>
|
* cipher/Makefile.am: Add 'rijndael-armv8-ce.c' and
'rijndael-armv8-aarch32-ce.S'.
* cipher/rijndael-armv8-aarch32-ce.S: New.
* cipher/rijndael-armv8-ce.c: New.
* cipher/rijndael-internal.h (USE_ARM_CE): New.
(RIJNDAEL_context_s): Add 'use_arm_ce'.
* cipher/rijndael.c [USE_ARM_CE] (_gcry_aes_armv8_ce_setkey)
(_gcry_aes_armv8_ce_prepare_decryption)
(_gcry_aes_armv8_ce_encrypt, _gcry_aes_armv8_ce_decrypt)
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_ocb_auth): New.
(do_setkey) [USE_ARM_CE]: Add ARM CE/AES HW feature check and key
setup for ARM CE.
(prepare_decryption, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec)
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_ARM_CE]: Add
ARM CE support.
* configure.ac: Add 'rijndael-armv8-ce.lo' and
'rijndael-armv8-aarch32-ce.lo'.
--
Improvement vs ARM assembly on Cortex-A53:
           AES-128  AES-192  AES-256
CBC enc:    14.8x    12.8x    11.4x
CBC dec:    21.4x    20.5x    19.4x
CFB enc:    16.2x    13.6x    11.6x
CFB dec:    21.6x    20.5x    19.4x
CTR:        19.1x    18.6x    17.8x
OCB enc:    16.0x    16.2x    16.1x
OCB dec:    15.6x    15.9x    15.8x
OCB auth:   18.3x    18.4x    18.0x
Benchmark on Cortex-A53 (1152 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 24.42 ns/B 39.06 MiB/s 28.13 c/B
ECB dec | 25.07 ns/B 38.05 MiB/s 28.88 c/B
CBC enc | 21.05 ns/B 45.30 MiB/s 24.25 c/B
CBC dec | 21.16 ns/B 45.07 MiB/s 24.38 c/B
CFB enc | 21.05 ns/B 45.31 MiB/s 24.25 c/B
CFB dec | 21.38 ns/B 44.61 MiB/s 24.62 c/B
OFB enc | 26.15 ns/B 36.47 MiB/s 30.13 c/B
OFB dec | 26.15 ns/B 36.47 MiB/s 30.13 c/B
CTR enc | 21.17 ns/B 45.06 MiB/s 24.38 c/B
CTR dec | 21.16 ns/B 45.06 MiB/s 24.38 c/B
CCM enc | 42.32 ns/B 22.53 MiB/s 48.75 c/B
CCM dec | 42.32 ns/B 22.53 MiB/s 48.75 c/B
CCM auth | 21.17 ns/B 45.06 MiB/s 24.38 c/B
GCM enc | 22.08 ns/B 43.19 MiB/s 25.44 c/B
GCM dec | 22.08 ns/B 43.18 MiB/s 25.44 c/B
GCM auth | 0.923 ns/B 1032.8 MiB/s 1.06 c/B
OCB enc | 26.20 ns/B 36.40 MiB/s 30.18 c/B
OCB dec | 25.97 ns/B 36.73 MiB/s 29.91 c/B
OCB auth | 24.52 ns/B 38.90 MiB/s 28.24 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 27.83 ns/B 34.26 MiB/s 32.06 c/B
ECB dec | 28.54 ns/B 33.42 MiB/s 32.88 c/B
CBC enc | 24.47 ns/B 38.97 MiB/s 28.19 c/B
CBC dec | 25.27 ns/B 37.74 MiB/s 29.11 c/B
CFB enc | 25.08 ns/B 38.02 MiB/s 28.89 c/B
CFB dec | 25.31 ns/B 37.68 MiB/s 29.16 c/B
OFB enc | 29.57 ns/B 32.25 MiB/s 34.06 c/B
OFB dec | 29.57 ns/B 32.25 MiB/s 34.06 c/B
CTR enc | 25.24 ns/B 37.78 MiB/s 29.08 c/B
CTR dec | 25.24 ns/B 37.79 MiB/s 29.08 c/B
CCM enc | 49.81 ns/B 19.15 MiB/s 57.38 c/B
CCM dec | 49.80 ns/B 19.15 MiB/s 57.37 c/B
CCM auth | 24.58 ns/B 38.80 MiB/s 28.32 c/B
GCM enc | 26.15 ns/B 36.47 MiB/s 30.13 c/B
GCM dec | 26.11 ns/B 36.52 MiB/s 30.08 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 29.59 ns/B 32.23 MiB/s 34.09 c/B
OCB dec | 29.42 ns/B 32.42 MiB/s 33.89 c/B
OCB auth | 27.92 ns/B 34.16 MiB/s 32.16 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 31.20 ns/B 30.57 MiB/s 35.94 c/B
ECB dec | 31.80 ns/B 29.99 MiB/s 36.63 c/B
CBC enc | 27.83 ns/B 34.27 MiB/s 32.06 c/B
CBC dec | 27.87 ns/B 34.21 MiB/s 32.11 c/B
CFB enc | 27.88 ns/B 34.20 MiB/s 32.12 c/B
CFB dec | 28.16 ns/B 33.87 MiB/s 32.44 c/B
OFB enc | 32.93 ns/B 28.96 MiB/s 37.94 c/B
OFB dec | 32.93 ns/B 28.96 MiB/s 37.94 c/B
CTR enc | 27.95 ns/B 34.13 MiB/s 32.19 c/B
CTR dec | 27.95 ns/B 34.12 MiB/s 32.20 c/B
CCM enc | 55.88 ns/B 17.07 MiB/s 64.38 c/B
CCM dec | 55.88 ns/B 17.07 MiB/s 64.38 c/B
CCM auth | 27.95 ns/B 34.12 MiB/s 32.20 c/B
GCM enc | 28.86 ns/B 33.05 MiB/s 33.25 c/B
GCM dec | 28.87 ns/B 33.04 MiB/s 33.25 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 32.96 ns/B 28.94 MiB/s 37.97 c/B
OCB dec | 32.73 ns/B 29.14 MiB/s 37.70 c/B
OCB auth | 31.29 ns/B 30.48 MiB/s 36.04 c/B
After:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.10 ns/B 187.0 MiB/s 5.88 c/B
ECB dec | 5.27 ns/B 181.0 MiB/s 6.07 c/B
CBC enc | 1.41 ns/B 675.8 MiB/s 1.63 c/B
CBC dec | 0.992 ns/B 961.7 MiB/s 1.14 c/B
CFB enc | 1.30 ns/B 732.4 MiB/s 1.50 c/B
CFB dec | 0.991 ns/B 962.7 MiB/s 1.14 c/B
OFB enc | 7.05 ns/B 135.2 MiB/s 8.13 c/B
OFB dec | 7.05 ns/B 135.2 MiB/s 8.13 c/B
CTR enc | 1.11 ns/B 856.9 MiB/s 1.28 c/B
CTR dec | 1.11 ns/B 857.0 MiB/s 1.28 c/B
CCM enc | 2.58 ns/B 369.8 MiB/s 2.97 c/B
CCM dec | 2.58 ns/B 369.5 MiB/s 2.97 c/B
CCM auth | 1.58 ns/B 605.2 MiB/s 1.82 c/B
GCM enc | 2.04 ns/B 467.9 MiB/s 2.35 c/B
GCM dec | 2.04 ns/B 466.6 MiB/s 2.35 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 1.64 ns/B 579.8 MiB/s 1.89 c/B
OCB dec | 1.66 ns/B 574.5 MiB/s 1.91 c/B
OCB auth | 1.33 ns/B 715.5 MiB/s 1.54 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.64 ns/B 169.0 MiB/s 6.50 c/B
ECB dec | 5.81 ns/B 164.3 MiB/s 6.69 c/B
CBC enc | 1.90 ns/B 502.1 MiB/s 2.19 c/B
CBC dec | 1.24 ns/B 771.7 MiB/s 1.42 c/B
CFB enc | 1.84 ns/B 517.1 MiB/s 2.12 c/B
CFB dec | 1.23 ns/B 772.5 MiB/s 1.42 c/B
OFB enc | 7.60 ns/B 125.5 MiB/s 8.75 c/B
OFB dec | 7.60 ns/B 125.6 MiB/s 8.75 c/B
CTR enc | 1.36 ns/B 702.7 MiB/s 1.56 c/B
CTR dec | 1.36 ns/B 702.5 MiB/s 1.56 c/B
CCM enc | 3.31 ns/B 287.8 MiB/s 3.82 c/B
CCM dec | 3.31 ns/B 288.0 MiB/s 3.81 c/B
CCM auth | 2.06 ns/B 462.1 MiB/s 2.38 c/B
GCM enc | 2.28 ns/B 418.4 MiB/s 2.63 c/B
GCM dec | 2.28 ns/B 418.0 MiB/s 2.63 c/B
GCM auth | 0.923 ns/B 1032.8 MiB/s 1.06 c/B
OCB enc | 1.83 ns/B 520.1 MiB/s 2.11 c/B
OCB dec | 1.84 ns/B 517.8 MiB/s 2.12 c/B
OCB auth | 1.52 ns/B 626.1 MiB/s 1.75 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.86 ns/B 162.7 MiB/s 6.75 c/B
ECB dec | 6.02 ns/B 158.3 MiB/s 6.94 c/B
CBC enc | 2.44 ns/B 390.5 MiB/s 2.81 c/B
CBC dec | 1.45 ns/B 656.4 MiB/s 1.67 c/B
CFB enc | 2.39 ns/B 399.5 MiB/s 2.75 c/B
CFB dec | 1.45 ns/B 656.8 MiB/s 1.67 c/B
OFB enc | 7.81 ns/B 122.1 MiB/s 9.00 c/B
OFB dec | 7.81 ns/B 122.1 MiB/s 9.00 c/B
CTR enc | 1.57 ns/B 605.8 MiB/s 1.81 c/B
CTR dec | 1.57 ns/B 605.9 MiB/s 1.81 c/B
CCM enc | 4.07 ns/B 234.3 MiB/s 4.69 c/B
CCM dec | 4.07 ns/B 234.1 MiB/s 4.69 c/B
CCM auth | 2.61 ns/B 365.7 MiB/s 3.00 c/B
GCM enc | 2.50 ns/B 381.9 MiB/s 2.88 c/B
GCM dec | 2.49 ns/B 382.3 MiB/s 2.87 c/B
GCM auth | 0.926 ns/B 1029.7 MiB/s 1.07 c/B
OCB enc | 2.05 ns/B 465.6 MiB/s 2.36 c/B
OCB dec | 2.06 ns/B 462.0 MiB/s 2.38 c/B
OCB auth | 1.74 ns/B 548.4 MiB/s 2.00 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
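The improvement factors quoted in this commit can be roughly re-derived by dividing the "Before" cycles/byte by the "After" cycles/byte. The sketch below is only an illustration of that arithmetic; the function name `speedup` is an assumption and is not part of libgcrypt. Because the printed tables are rounded, the result may differ from the quoted factor in the last digit.

```c
#include <assert.h>

/* Hypothetical helper (not a libgcrypt function): returns how many
   times fewer cycles per byte the new code needs than the old code.  */
double
speedup (double before_cpb, double after_cpb)
{
  assert (after_cpb > 0.0);
  return before_cpb / after_cpb;
}
```

For example, `speedup (24.38, 1.14)` uses the AES-128 CBC dec rows and gives roughly 21.4, and `speedup (28.24, 1.54)` uses the AES-128 OCB auth rows and gives roughly 18.3.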
|
* cipher/Makefile.am: Add 'sha256-armv8-aarch32-ce.S'.
* cipher/sha256-armv8-aarch32-ce.S: New.
* cipher/sha256.c (USE_ARM_CE): New.
(sha256_init, sha224_init): Check features for HWF_ARM_SHA2.
[USE_ARM_CE] (_gcry_sha256_transform_armv8_ce): New.
(transform) [USE_ARM_CE]: Use ARMv8 CE implementation if HW supports
it.
(SHA256_CONTEXT): Add 'use_arm_ce'.
* configure.ac: Add 'sha256-armv8-aarch32-ce.lo'.
--
Benchmark on Cortex-A53 (1152 MHz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 17.38 ns/B 54.88 MiB/s 20.02 c/B
After (~9.3x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA256 | 1.85 ns/B 515.7 MiB/s 2.13 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
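The pattern described in this entry (check hardware features at init, remember the result in a 'use_arm_ce' context flag, and let the transform pick the implementation) might look roughly like the sketch below. Every name prefixed with `demo_` or `DEMO_` is hypothetical and stands in for libgcrypt's internal code; the two transform functions are stubs that only report which path ran.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical feature bit; the real HWF_* constants are internal to
   libgcrypt and are not reproduced here.  */
#define DEMO_HWF_SHA_CE (1u << 0)

enum demo_path { DEMO_PATH_GENERIC = 1, DEMO_PATH_ARM_CE = 2 };

typedef struct
{
  unsigned int use_arm_ce:1;   /* Set once at init time.  */
} demo_sha256_context;

/* Stub standing in for the portable C transform.  */
enum demo_path
demo_transform_generic (const unsigned char *data, size_t nblks)
{
  (void)data; (void)nblks;
  return DEMO_PATH_GENERIC;
}

/* Stub standing in for the ARMv8 Crypto Extension transform.  */
enum demo_path
demo_transform_arm_ce (const unsigned char *data, size_t nblks)
{
  (void)data; (void)nblks;
  return DEMO_PATH_ARM_CE;
}

/* Record whether the detected features include the CE bit.  */
void
demo_sha256_init (demo_sha256_context *ctx, unsigned int features)
{
  assert (ctx != NULL);
  ctx->use_arm_ce = (features & DEMO_HWF_SHA_CE) != 0;
}

/* Dispatch on the flag chosen at init time.  */
enum demo_path
demo_transform (const demo_sha256_context *ctx,
                const unsigned char *data, size_t nblks)
{
  if (ctx->use_arm_ce)
    return demo_transform_arm_ce (data, nblks);
  return demo_transform_generic (data, nblks);
}
```

Doing the feature test once at init keeps the per-block path down to a single flag check, which is presumably why the flag lives in the context.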
|
* cipher/Makefile.am: Add 'sha1-armv8-aarch32-ce.S'.
* cipher/sha1-armv7-neon.S (_gcry_sha1_transform_armv7_neon): Add
missing size.
* cipher/sha1-armv8-aarch32-ce.S: New.
* cipher/sha1.c (USE_ARM_CE): New.
(sha1_init): Check features for HWF_ARM_SHA1.
[USE_ARM_CE] (_gcry_sha1_transform_armv8_ce): New.
(transform) [USE_ARM_CE]: Use ARMv8 CE implementation if HW supports
it.
* cipher/sha1.h (SHA1_CONTEXT): Add 'use_arm_ce'.
* configure.ac: Add 'sha1-armv8-aarch32-ce.lo'.
--
Benchmark on Cortex-A53 (1152 MHz):
Before (SHA-1 NEON):
| nanosecs/byte mebibytes/sec cycles/byte
SHA1 | 6.62 ns/B 144.2 MiB/s 7.62 c/B
After (~3.8x faster):
| nanosecs/byte mebibytes/sec cycles/byte
SHA1 | 1.73 ns/B 552.2 MiB/s 1.99 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
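The three benchmark columns are related by simple unit conversions, so one column can be sanity-checked against the others. The sketch below assumes the 1152 MHz clock stated above; the function names are illustrative and not part of libgcrypt or its benchmark tool. The published figures come from unrounded measurements, so expect small differences in the last digits.

```c
#include <assert.h>

/* Convert nanoseconds per byte to mebibytes per second.
   bytes/s = 1e9 / (ns/B); one MiB is 1048576 bytes.  */
double
ns_per_byte_to_mib_per_sec (double ns_per_byte)
{
  assert (ns_per_byte > 0.0);
  return 1e9 / ns_per_byte / 1048576.0;
}

/* Convert nanoseconds per byte to CPU cycles per byte.
   A clock of F MHz executes F / 1000 cycles per nanosecond.  */
double
ns_per_byte_to_cycles_per_byte (double ns_per_byte, double cpu_mhz)
{
  return ns_per_byte * cpu_mhz / 1000.0;
}
```

With the "After" SHA1 row, `ns_per_byte_to_mib_per_sec (1.73)` gives about 551 MiB/s and `ns_per_byte_to_cycles_per_byte (1.73, 1152.0)` gives about 1.99 c/B, close to the 552.2 MiB/s and 1.99 c/B listed.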
|