summaryrefslogtreecommitdiff
path: root/cipher/sha1-armv7-neon.S
Commit message (Collapse)AuthorAgeFilesLines
* Use 'vmov' and 'movi' for vector register clearing in ARM assemblyJussi Kivilinna2022-01-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | * cipher/chacha20-aarch64.S (clear): Use 'movi'. * cipher/chacha20-armv7-neon.S (clear): Use 'vmov'. * cipher/cipher-gcm-armv7-neon.S (clear): Use 'vmov'. * cipher/cipher-gcm-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'. * cipher/cipher-gcm-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'. * cipher/rijndael-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'. * cipher/sha1-armv7-neon.S (clear): Use 'vmov'. * cipher/sha1-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'. * cipher/sha1-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'. * cipher/sha256-armv8-aarch32-ce.S (CLEAR_REG): Use 'vmov'. * cipher/sha256-armv8-aarch64-ce.S (CLEAR_REG): Use 'movi'. * cipher/sha512-armv7-neon.S (CLEAR_REG): New using 'vmov'. (_gcry_sha512_transform_armv7_neon): Use CLEAR_REG for clearing registers. -- Use 'vmov reg, #0' on 32-bit and 'movi reg.16b, #0' instead of self-xoring register to break false register dependency. Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add ARMv8/AArch32 Crypto Extension implementation of SHA-1Jussi Kivilinna2016-07-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | * cipher/Makefile.am: Add 'sha1-armv8-aarch32-ce.S'. * cipher/sha1-armv7-neon.S (_gcry_sha1_transform_armv7_neon): Add missing size. * cipher/sha1-armv8-aarch32-ce.S: New. * cipher/sha1.c (USE_ARM_CE): New. (sha1_init): Check features for HWF_ARM_SHA1. [USE_ARM_CE] (_gcry_sha1_transform_armv8_ce): New. (transform) [USE_ARM_CE]: Use ARMv8 CE implementation if HW supports it. * cipher/sha1.h (SHA1_CONTEXT): Add 'use_arm_ce'. * configure.ac: Add 'sha1-armv8-aarch32-ce.lo'. -- Benchmark on Cortex-A53 (1152 Mhz): Before (SHA-1 NEON): | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 6.62 ns/B 144.2 MiB/s 7.62 c/B After (~3.8x faster): | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 1.73 ns/B 552.2 MiB/s 1.99 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Speed-up SHA-1 NEON assembly implementationJussi Kivilinna2014-06-291-73/+82
| | | | | | | | | | | | | | | | | * cipher/sha1-armv7-neon.S: Tweak implementation for speed-up. -- Benchmark on Cortex-A8 1008Mhz: New: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 7.04 ns/B 135.4 MiB/s 7.10 c/B Old: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 7.79 ns/B 122.4 MiB/s 7.85 c/B Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Fix ARM assembly when building __PIC__Jussi Kivilinna2014-05-201-2/+17
| | | | | | | | | | | | | | | | | | | * cipher/camellia-arm.S (GET_DATA_POINTER): New. (_gcry_camellia_arm_encrypt_block): Use GET_DATA_POINTER. (_gcry_camellia_arm_decrypt_block): Ditto. * cipher/cast5-arm.S (GET_DATA_POINTER): New. (_gcry_cast5_arm_encrypt_block, _gcry_cast5_arm_decrypt_block) (_gcry_cast5_arm_enc_blk2, _gcry_cast5_arm_dec_blk2): Use GET_DATA_POINTER. * cipher/rijndael-arm.S (GET_DATA_POINTER): New. (_gcry_aes_arm_encrypt_block, _gcry_aes_arm_decrypt_block): Use GET_DATA_POINTER. * cipher/sha1-armv7-neon.S (GET_DATA_POINTER): New. (.LK_VEC): Move from .text to .data section. (_gcry_sha1_transform_armv7_neon): Use GET_DATA_POINTER. -- Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* Add ARM/NEON implementation for SHA-1Jussi Kivilinna2013-12-181-0/+501
* cipher/Makefile.am: Add 'sha1-armv7-neon.S'. * cipher/sha1-armv7-neon.S: New. * cipher/sha1.c (USE_NEON): New. (SHA1_CONTEXT, sha1_init) [USE_NEON]: Add and initialize 'use_neon'. [USE_NEON] (_gcry_sha1_transform_armv7_neon): New. (transform) [USE_NEON]: Use ARM/NEON assembly if enabled. * configure.ac: Add 'sha1-armv7-neon.lo'. -- Patch adds ARM/NEON implementation for SHA-1. Benchmarks show 1.72x improvement on ARM Cortex-A8, 1008 Mhz: jussi@cubie:~/libgcrypt$ tests/bench-slope --cpu-mhz 1008 hash sha1 Hash: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 7.80 ns/B 122.3 MiB/s 7.86 c/B = jussi@cubie:~/libgcrypt$ tests/bench-slope --disable-hwf arm-neon --cpu-mhz 1008 hash sha1 Hash: | nanosecs/byte mebibytes/sec cycles/byte SHA1 | 13.41 ns/B 71.10 MiB/s 13.52 c/B = Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>