path: root/cipher/cipher-gcm-armv8-aarch32-ce.S
Commit message | Author | Age | Files | Lines
* Add armv8/pmull accelerated POLYVAL for GCM-SIV | Jussi Kivilinna | 2022-01-11 | 1 | -0/+155

  * cipher/cipher-gcm-armv8-aarch32-ce.S (_gcry_polyval_armv8_ce_pmull): New.
  * cipher/cipher-gcm-armv8-aarch64-ce.S (_gcry_polyval_armv8_ce_pmull): New.
  * cipher/cipher-gcm.c (_gcry_polyval_armv8_ce_pmull)
  (polyval_armv8_ce_pmull): New.
  (setupM) [GCM_USE_ARM_PMULL]: Setup 'polyval_armv8_ce_pmull' as POLYVAL
  function.
  --

  Benchmark on Cortex-A53 (aarch64):

  Before:
   AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM-SIV auth |      1.74 ns/B     547.6 MiB/s      2.01 c/B      1152

  After (76% faster):
   AES            |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     GCM-SIV auth |     0.990 ns/B     963.2 MiB/s      1.14 c/B      1152

  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
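  For orientation, the POLYVAL function that this commit accelerates can be written as a slow bit-by-bit reference in C. The sketch below follows the definition in RFC 8452 (dot(a, b) = a * b * x^-128 in GF(2^128) modulo x^128 + x^127 + x^126 + x^121 + 1). The names `fe128`, `fe_load_le`, `polyval_dot` and `polyval_ref` are hypothetical and do not appear in libgcrypt. The code is not constant-time and is only meant to illustrate the arithmetic, not the PMULL implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a GF(2^128) element where bit i of (hi:lo)
 * is the coefficient of x^i (POLYVAL's little-endian convention). */
typedef struct { uint64_t lo, hi; } fe128;

/* Load 16 bytes as a little-endian 128-bit value. */
static fe128 fe_load_le(const uint8_t in[16])
{
  fe128 r = { 0, 0 };
  for (int i = 7; i >= 0; i--) {
    r.lo = (r.lo << 8) | in[i];
    r.hi = (r.hi << 8) | in[8 + i];
  }
  return r;
}

/* dot(a, b) = a * b * x^-128 mod (x^128 + x^127 + x^126 + x^121 + 1).
 * Each iteration adds b when bit i of a is set, then divides the
 * accumulator by x. After 128 iterations the term for bit i has been
 * divided by x exactly (128 - i) times. Not constant-time. */
static fe128 polyval_dot(fe128 a, fe128 b)
{
  fe128 r = { 0, 0 };
  for (int i = 0; i < 128; i++) {
    uint64_t abit = ((i < 64) ? (a.lo >> i) : (a.hi >> (i - 64))) & 1;
    if (abit) {
      r.lo ^= b.lo;
      r.hi ^= b.hi;
    }
    /* Divide by x: if the x^0 coefficient is set, add the modulus first
     * (clearing that bit), then shift right. The net effect of the
     * modulus is x^127 + x^126 + x^125 + x^120 after the shift. */
    uint64_t carry = r.lo & 1;
    r.lo = (r.lo >> 1) | (r.hi << 63);
    r.hi >>= 1;
    if (carry)
      r.hi ^= UINT64_C(0xE100000000000000);
  }
  return r;
}

/* POLYVAL over whole 16-byte blocks: S_j = dot(S_{j-1} + X_j, H). */
static fe128 polyval_ref(fe128 h, const uint8_t *buf, size_t len)
{
  fe128 s = { 0, 0 };
  assert(len % 16 == 0);
  for (size_t off = 0; off < len; off += 16) {
    fe128 x = fe_load_le(buf + off);
    s.lo ^= x.lo;
    s.hi ^= x.hi;
    s = polyval_dot(s, h);
  }
  return s;
}
```

  Because dot() carries the extra x^-128 factor, its neutral element is x^128 mod p (lo = 1, hi = 0xC200000000000000) rather than 1, which gives a quick sanity check for the sketch.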
* Use 'vmov' and 'movi' for vector register clearing in ARM assembly | Jussi Kivilinna | 2022-01-11 | 1 | -1/+1

  * cipher/chacha20-aarch64.S (clear): Use 'movi'.
  * cipher/chacha20-armv7-neon.S (clear): Use 'vmov'.
  * cipher/cipher-gcm-armv7-neon.S (clear): Use 'vmov'.
  * cipher/cipher-gcm-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'.
  * cipher/cipher-gcm-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'.
  * cipher/rijndael-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'.
  * cipher/sha1-armv7-neon.S (clear): Use 'vmov'.
  * cipher/sha1-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'.
  * cipher/sha1-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'.
  * cipher/sha256-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'.
  * cipher/sha256-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'.
  * cipher/sha512-armv7-neon.S (CLEAR_REG): New using 'vmov'.
  (_gcry_sha512_transform_armv7_neon): Use CLEAR_REG for clearing
  registers.
  --

  Use 'vmov reg, #0' on 32-bit and 'movi reg.16b, #0' on 64-bit instead of
  self-xoring the register, to break the false register dependency.

  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Fix building AArch32 CE implementations when target is ARMv6 arch | Jussi Kivilinna | 2017-07-29 | 1 | -0/+1

  * cipher/cipher-gcm-armv8-aarch32-ce.S: Select ARMv8 architecture.
  * cipher/rijndael-armv8-aarch32-ce.S: Ditto.
  * cipher/sha1-armv8-aarch32-ce.S: Ditto.
  * cipher/sha256-armv8-aarch32-ce.S: Ditto.
  * configure.ac (gcry_cv_gcc_inline_asm_aarch32_crypto): Ditto.
  --

  The Raspbian distribution defaults to the ARMv6 architecture, so the
  'rbit' instruction is not available with the default compiler flags.
  This patch adds explicit architecture selection for ARMv8 to enable
  'rbit' usage in the ARMv8/AArch32-CE assembly implementations of SHA,
  GHASH and AES.

  Reported-by: Chris Horry <zerbey@gmail.com>
  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* GCM: Add bulk processing for ARMv8/AArch32 implementation | Jussi Kivilinna | 2016-10-09 | 1 | -62/+259

  * cipher/cipher-gcm-armv8-aarch32-ce.S: Add 4 blocks bulk processing.
  * tests/basic.c (check_digests): Print correct data length for "?"
  tests.
  (check_one_mac): Add large 1000000 bytes tests, when input is "!" or
  "?".
  (check_mac): Add "?" tests vectors for HMAC, CMAC, GMAC and POLY1305.
  --

  Benchmark on Cortex-A53 (1152 Mhz):

  Before:
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   GMAC_AES       |     0.924 ns/B    1032.2 MiB/s      1.06 c/B

  After (1.21x faster):
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   GMAC_AES       |     0.764 ns/B    1248.2 MiB/s     0.880 c/B

  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
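  The commit message only says that four blocks are processed per iteration. A common way to do this for GHASH, assumed here rather than taken from the assembly, is to precompute H^2, H^3 and H^4 and fold four blocks at once: Y' = (Y + X1)*H^4 + X2*H^3 + X3*H^2 + X4*H. The C sketch below checks that identity against the one-block-at-a-time form, using the bitwise multiplication from NIST SP 800-38D. All names (`be128`, `ghash_mul`, `ghash_serial`, `ghash_4way`) are hypothetical. Unlike an optimized implementation, this sketch reduces after every product and is not constant-time.

```c
#include <stdint.h>

/* Hypothetical helper type: a GHASH block, hi = bytes 0..7 and
 * lo = bytes 8..15, both read big-endian. */
typedef struct { uint64_t hi, lo; } be128;

static be128 gf_xor(be128 a, be128 b)
{
  be128 r = { a.hi ^ b.hi, a.lo ^ b.lo };
  return r;
}

/* Bitwise GF(2^128) multiply in the GHASH bit order (SP 800-38D,
 * Algorithm 1): bit 0 is the most significant bit of hi. */
static be128 ghash_mul(be128 x, be128 y)
{
  be128 z = { 0, 0 }, v = y;
  for (int i = 0; i < 128; i++) {
    uint64_t xbit = (i < 64) ? (x.hi >> (63 - i)) & 1
                             : (x.lo >> (127 - i)) & 1;
    if (xbit)
      z = gf_xor(z, v);
    /* v = v * x: shift right, fold in R = 0xE1 || 0^120 on carry-out. */
    uint64_t lsb = v.lo & 1;
    v.lo = (v.lo >> 1) | (v.hi << 63);
    v.hi >>= 1;
    if (lsb)
      v.hi ^= UINT64_C(0xE100000000000000);
  }
  return z;
}

/* One block per step: Y = (Y + X_i) * H. */
static be128 ghash_serial(be128 y, be128 h, const be128 *x, int nblocks)
{
  for (int i = 0; i < nblocks; i++)
    y = ghash_mul(gf_xor(y, x[i]), h);
  return y;
}

/* Four blocks per step using precomputed powers of H; any remaining
 * blocks are handled one at a time. */
static be128 ghash_4way(be128 y, be128 h, const be128 *x, int nblocks)
{
  be128 h2 = ghash_mul(h, h);
  be128 h3 = ghash_mul(h2, h);
  be128 h4 = ghash_mul(h3, h);
  int i = 0;
  for (; i + 4 <= nblocks; i += 4) {
    be128 t = ghash_mul(gf_xor(y, x[i]), h4);
    t = gf_xor(t, ghash_mul(x[i + 1], h3));
    t = gf_xor(t, ghash_mul(x[i + 2], h2));
    t = gf_xor(t, ghash_mul(x[i + 3], h));
    y = t;
  }
  for (; i < nblocks; i++)
    y = ghash_mul(gf_xor(y, x[i]), h);
  return y;
}
```

  The four products inside the loop are independent of each other, which is what lets a vectorized implementation overlap them and, typically, share a single reduction across the sum.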
* Add ARMv8/AArch32 Crypto Extension implementation of GCM | Jussi Kivilinna | 2016-07-14 | 1 | -0/+235

  * cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch32-ce.S'.
  * cipher/cipher-gcm-armv8-aarch32-ce.S: New.
  * cipher/cipher-gcm.c [GCM_USE_ARM_PMULL]
  (_gcry_ghash_setup_armv8_ce_pmull, _gcry_ghash_armv8_ce_pmull)
  (ghash_setup_armv8_ce_pmull, ghash_armv8_ce_pmull): New.
  (setupM) [GCM_USE_ARM_PMULL]: Enable ARM PMULL implementation if
  HWF_ARM_PMULL HW feature flag is enabled.
  * cipher/cipher-gcm.h (GCM_USE_ARM_PMULL): New.
  --

  Benchmark on Cortex-A53 (1152 Mhz):

  Before:
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   GMAC_AES       |     24.10 ns/B     39.57 MiB/s     27.76 c/B

  After (~26x faster):
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   GMAC_AES       |     0.924 ns/B    1032.2 MiB/s      1.06 c/B

  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>