author | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2014-11-02 16:01:11 +0200 |
---|---|---|
committer | Jussi Kivilinna <jussi.kivilinna@iki.fi> | 2014-11-02 16:26:53 +0200 |
commit | 0b520128551054d83fb0bb2db8873394f38de498 (patch) | |
tree | cba613c83ce9a044417a2084573211ee254654eb /cipher/poly1305-internal.h | |
parent | c584f44543883346d5a565581ff99a0afce9c5e1 (diff) | |
download | libgcrypt-0b520128551054d83fb0bb2db8873394f38de498.tar.gz |
Add ARM/NEON implementation of Poly1305
* cipher/Makefile.am: Add 'poly1305-armv7-neon.S'.
* cipher/poly1305-armv7-neon.S: New.
* cipher/poly1305-internal.h (POLY1305_USE_NEON)
(POLY1305_NEON_BLOCKSIZE, POLY1305_NEON_STATESIZE)
(POLY1305_NEON_ALIGNMENT): New.
* cipher/poly1305.c [POLY1305_USE_NEON]
(_gcry_poly1305_armv7_neon_init_ext)
(_gcry_poly1305_armv7_neon_finish_ext)
(_gcry_poly1305_armv7_neon_blocks, poly1305_armv7_neon_ops): New.
(_gcry_poly1305_init) [POLY1305_USE_NEON]: Select NEON implementation
if HWF_ARM_NEON set.
* configure.ac [neonsupport=yes]: Add 'poly1305-armv7-neon.lo'.
--
Add Andrew Moon's public domain NEON implementation of Poly1305. Original
source is available at: https://github.com/floodyberry/poly1305-opt
Benchmark on Cortex-A8 (--cpu-mhz 1008):

Old:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |     12.34 ns/B     77.27 MiB/s     12.44 c/B

New:
          |  nanosecs/byte   mebibytes/sec   cycles/byte
 POLY1305 |      2.12 ns/B     450.7 MiB/s      2.13 c/B

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
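
The ChangeLog entry for `_gcry_poly1305_init` describes runtime selection: the NEON code is used only if the NEON hardware-feature flag is set. A minimal sketch of that pattern follows. Every name in it (`demo_ops_t`, `DEMO_HWF_NEON`, `demo_select_ops`) is a hypothetical stand-in and not libgcrypt's actual definition; only the block sizes (16 bytes for a plain Poly1305 block, 32 for `POLY1305_NEON_BLOCKSIZE`) come from the algorithm and from this commit.

```c
#include <stddef.h>

/* Hypothetical feature bit standing in for a flag such as HWF_ARM_NEON.
   The value is invented for this sketch. */
#define DEMO_HWF_NEON (1u << 0)

/* Hypothetical operations table: one instance per implementation. */
typedef struct {
  const char *name;    /* label, for illustration only */
  size_t block_size;   /* bytes consumed per call to the block function */
} demo_ops_t;

static const demo_ops_t demo_generic_ops = { "generic", 16 };
static const demo_ops_t demo_neon_ops    = { "neon", 32 };

/* Choose the implementation once, at init time, from the detected
   hardware features.  Falls back to the generic code otherwise. */
static const demo_ops_t *demo_select_ops(unsigned int hw_features)
{
  if (hw_features & DEMO_HWF_NEON)
    return &demo_neon_ops;
  return &demo_generic_ops;
}
```

With this shape, `demo_select_ops(DEMO_HWF_NEON)` returns the NEON table and `demo_select_ops(0)` returns the generic one, so callers never test the feature flag again after initialization.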
Diffstat (limited to 'cipher/poly1305-internal.h')
-rw-r--r-- | cipher/poly1305-internal.h | 18 |
1 files changed, 18 insertions, 0 deletions
    diff --git a/cipher/poly1305-internal.h b/cipher/poly1305-internal.h
    index 0299c430..dfc0c048 100644
    --- a/cipher/poly1305-internal.h
    +++ b/cipher/poly1305-internal.h
    @@ -65,10 +65,24 @@
     #endif


    +/* POLY1305_USE_NEON indicates whether to enable ARM NEON assembly code. */
    +#undef POLY1305_USE_NEON
    +#if defined(ENABLE_NEON_SUPPORT) && defined(HAVE_ARM_ARCH_V6) && \
    +    defined(__ARMEL__) && defined(HAVE_COMPATIBLE_GCC_ARM_PLATFORM_AS) && \
    +    defined(HAVE_GCC_INLINE_ASM_NEON)
    +# define POLY1305_USE_NEON 1
    +# define POLY1305_NEON_BLOCKSIZE 32
    +# define POLY1305_NEON_STATESIZE 128
    +# define POLY1305_NEON_ALIGNMENT 16
    +#endif
    +
    +
     /* Largest block-size used in any implementation (optimized implementations
      * might use block-size multiple of 16). */
     #ifdef POLY1305_USE_AVX2
     # define POLY1305_LARGEST_BLOCKSIZE POLY1305_AVX2_BLOCKSIZE
    +#elif defined(POLY1305_USE_NEON)
    +# define POLY1305_LARGEST_BLOCKSIZE POLY1305_NEON_BLOCKSIZE
     #elif defined(POLY1305_USE_SSE2)
     # define POLY1305_LARGEST_BLOCKSIZE POLY1305_SSE2_BLOCKSIZE
     #else
    @@ -78,6 +92,8 @@
     /* Largest state-size used in any implementation. */
     #ifdef POLY1305_USE_AVX2
     # define POLY1305_LARGEST_STATESIZE POLY1305_AVX2_STATESIZE
    +#elif defined(POLY1305_USE_NEON)
    +# define POLY1305_LARGEST_STATESIZE POLY1305_NEON_STATESIZE
     #elif defined(POLY1305_USE_SSE2)
     # define POLY1305_LARGEST_STATESIZE POLY1305_SSE2_STATESIZE
     #else
    @@ -87,6 +103,8 @@
     /* Minimum alignment for state pointer passed to implementations. */
     #ifdef POLY1305_USE_AVX2
     # define POLY1305_STATE_ALIGNMENT POLY1305_AVX2_ALIGNMENT
    +#elif defined(POLY1305_USE_NEON)
    +# define POLY1305_STATE_ALIGNMENT POLY1305_NEON_ALIGNMENT
     #elif defined(POLY1305_USE_SSE2)
     # define POLY1305_STATE_ALIGNMENT POLY1305_SSE2_ALIGNMENT
     #else
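
The header defines a state size and a minimum alignment "for state pointer passed to implementations". One common way to satisfy such a requirement, shown here only as a sketch, is to over-allocate the raw storage by the alignment and round the pointer up. The constants mirror the NEON values in the diff above (128 and 16); `demo_ctx_t` and `demo_aligned_state` are hypothetical names, and this is not claimed to be how `cipher/poly1305.c` lays out its context.

```c
#include <stddef.h>
#include <stdint.h>

/* Values copied from the NEON branch of the diff. */
#define DEMO_STATESIZE 128
#define DEMO_ALIGNMENT 16   /* must be a power of two for the mask below */

/* Hypothetical context: the raw buffer is larger than the state by
   DEMO_ALIGNMENT bytes, so an aligned DEMO_STATESIZE-byte region
   always fits inside it wherever the struct happens to be placed. */
typedef struct {
  unsigned char raw[DEMO_STATESIZE + DEMO_ALIGNMENT];
} demo_ctx_t;

/* Round the start of the raw buffer up to the next multiple of
   DEMO_ALIGNMENT and return it as the state pointer. */
static void *demo_aligned_state(demo_ctx_t *ctx)
{
  uintptr_t p = (uintptr_t)ctx->raw;
  p = (p + DEMO_ALIGNMENT - 1) & ~(uintptr_t)(DEMO_ALIGNMENT - 1);
  return (void *)p;
}
```

The rounding skips at most `DEMO_ALIGNMENT - 1` bytes, which is why the extra `DEMO_ALIGNMENT` bytes of padding are enough.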