path: root/cipher/crc-ppc.c
Commit message | Author | Age | Files | Lines
* ppc: avoid using vec_vsx_ld/vec_vsx_st for 2x64-bit vectors | Jussi Kivilinna | 2020-04-04 | 1 | -6/+43
  * cipher/crc-ppc.c (CRC_VEC_U64_LOAD, CRC_VEC_U64_LOAD_LE)
  (CRC_VEC_U64_LOAD_BE): Remove vec_vsx_ld usage.
  (asm_vec_u64_load, asm_vec_u64_load_le): New.
  * cipher/sha512-ppc.c (vec_vshasigma_u64): Use '__asm__' instead of
  'asm' for assembly block.
  (vec_u64_load, vec_u64_store): New.
  (_gcry_sha512_transform_ppc8): Use vec_u64_load/store instead of
  vec_vsx_ld/vec_vsx_st.
  * configure.ac (gcy_cv_cc_ppc_altivec)
  (gcy_cv_cc_ppc_altivec_cflags): Add check for vec_vsx_ld with
  'unsigned int *' pointer type.
  --

  GCC 7.5 and clang 8.0 do not support vec_vsx_ld with
  'unsigned long long *' pointer type. Switch code to use inline
  assembly instead. As vec_vsx_ld is still used with 'unsigned int *'
  pointers, add new check for this in configure.ac.

  GnuPG-bug-id: 4906
  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* crc-ppc: fix bad register used for vector load/store assembly | Jussi Kivilinna | 2020-02-02 | 1 | -13/+28
  * cipher/crc-ppc.c (CRC_VEC_U64_LOAD_BE): Move implementation to...
  (asm_vec_u64_load_be): ...here; Add "r0" to clobber list for load
  instruction when offset is not zero; Add zero offset path.
  --

  Register r0 must not be used for RA input for vector load/store
  instructions as r0 is not read as register but as value '0'.

  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
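
  The r0 rule described above can be illustrated with a small portable
  model. This is only a sketch, not code from crc-ppc.c: the function
  name model_indexed_ea and the gpr array are hypothetical, and the
  model assumes the usual indexed-form address rule, where the
  effective address is (RA|0) + (RB), so an RA field of 0 contributes
  the constant 0 rather than the contents of r0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of indexed-form addressing: EA = (RA|0) + (RB).
 * 'gpr' stands in for the 32 general-purpose registers; 'ra' and 'rb'
 * are the register numbers encoded in the instruction. */
static uint64_t
model_indexed_ea (const uint64_t gpr[32], unsigned ra, unsigned rb)
{
  assert (ra < 32 && rb < 32);

  /* An RA field of 0 means the literal value 0, not the value in r0. */
  uint64_t base = (ra == 0) ? 0 : gpr[ra];

  return base + gpr[rb];
}
```

  With an offset of 0x40 in r3 and a pointer of 0x1000 in r4, the model
  gives 0x1040. If the same offset were allocated to r0 instead, the
  model gives 0x1000: the offset is silently dropped. Keeping r0 away
  from the RA operand, as the commit does through the clobber list,
  avoids that case.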
* Add PowerPC vpmsum implementation of CRC | Jussi Kivilinna | 2019-09-15 | 1 | -0/+604
  * cipher/Makefile.am: Add 'crc-ppc.c'.
  * cipher/crc-armv8-ce.c: Remove 'USE_INTEL_PCLMUL' comment.
  * cipher/crc-ppc.c: New.
  * cipher/crc.c (USE_PPC_VPMSUM): New.
  (CRC_CONTEXT): Add 'use_vpmsum'.
  (_gcry_crc32_ppc8_vpmsum, _gcry_crc24rfc2440_ppc8_vpmsum): New.
  (crc32_init, crc24rfc2440_init): Add HWF check for 'use_vpmsum'.
  (crc32_write, crc24rfc2440_write): Add 'use_vpmsum' code-path.
  * configure.ac: Add 'vpmsumd' instruction to PowerPC VSX inline
  assembly check; Add 'crc-ppc.lo'.
  --

  Benchmark on POWER8 (ppc64le, ~3.8Ghz):

  Before:
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   CRC32          |     0.978 ns/B     975.0 MiB/s      3.72 c/B
   CRC24RFC2440   |     0.974 ns/B     978.8 MiB/s      3.70 c/B

  After (~22x faster):
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   CRC32          |     0.044 ns/B     21878 MiB/s     0.166 c/B
   CRC24RFC2440   |     0.043 ns/B     22077 MiB/s     0.164 c/B

  Benchmark on POWER9 (ppc64le, ~3.8Ghz):

  Before:
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   CRC32          |      1.01 ns/B     943.7 MiB/s      3.84 c/B
   CRC24RFC2440   |     0.993 ns/B     960.6 MiB/s      3.77 c/B

  After (~20x faster):
                  |  nanosecs/byte   mebibytes/sec   cycles/byte
   CRC32          |     0.046 ns/B     20675 MiB/s     0.175 c/B
   CRC24RFC2440   |     0.048 ns/B     19691 MiB/s     0.184 c/B

  GnuPG-bug-id: 4460
  Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
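
  The benchmark columns can be cross-checked with simple arithmetic.
  The helpers below are a sketch for that purpose only; the names
  speedup and cycles_per_byte are hypothetical and are not part of
  libgcrypt or its benchmark tool. They assume the speedup is the ratio
  of the ns/B figures and that cycles/byte is ns/B multiplied by the
  clock rate in GHz.

```c
#include <assert.h>

/* Ratio of the "before" to the "after" time per byte. */
static double
speedup (double before_ns_per_byte, double after_ns_per_byte)
{
  assert (after_ns_per_byte > 0.0);
  return before_ns_per_byte / after_ns_per_byte;
}

/* One nanosecond at N GHz is N cycles, so c/B = ns/B * GHz. */
static double
cycles_per_byte (double ns_per_byte, double clock_ghz)
{
  return ns_per_byte * clock_ghz;
}
```

  For the POWER8 CRC32 row, speedup (0.978, 0.044) is about 22.2, in
  line with the "~22x" label, and cycles_per_byte (0.978, 3.8) is about
  3.72. On POWER9 the same ratio gives about 22.0 for CRC32 and about
  20.7 for CRC24RFC2440, so "~20x" there is a rounded figure.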