path: root/cipher/blake2b-amd64-avx512.S
Commit message | Author | Age | Files | Lines
* amd64-asm: move constant data to read-only section for hash/mac algos | Jussi Kivilinna | 2023-01-19 | 1 | -4/+6

    * cipher/asm-common-amd64.h (SECTION_RODATA): New.
    * cipher/blake2b-amd64-avx2.S: Use read-only section for constant data.
    * cipher/blake2b-amd64-avx512.S: Likewise.
    * cipher/blake2s-amd64-avx.S: Likewise.
    * cipher/blake2s-amd64-avx512.S: Likewise.
    * cipher/poly1305-amd64-avx512.S: Likewise.
    * cipher/sha1-avx-amd64.S: Likewise.
    * cipher/sha1-avx-bmi2-amd64.S: Likewise.
    * cipher/sha1-avx2-bmi2-amd64.S: Likewise.
    * cipher/sha1-ssse3-amd64.S: Likewise.
    * cipher/sha256-avx-amd64.S: Likewise.
    * cipher/sha256-avx2-bmi2-amd64.S: Likewise.
    * cipher/sha256-ssse3-amd64.S: Likewise.
    * cipher/sha512-avx-amd64.S: Likewise.
    * cipher/sha512-avx2-bmi2-amd64.S: Likewise.
    * cipher/sha512-avx512-amd64.S: Likewise.
    * cipher/sha512-ssse3-amd64.S: Likewise.
    * cipher/sha3-avx-bmi2-amd64.S: Likewise.
    --

    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

* avx512: tweak AVX512 spec stop, use common macro in assembly | Jussi Kivilinna | 2022-12-12 | 1 | -0/+2

    * cipher/cipher-gcm-intel-pclmul.c: Use xmm registers for AVX512 spec stop.
    * cipher/asm-common-amd64.h (spec_stop_avx512): New.
    * cipher/blake2b-amd64-avx512.S: Use spec_stop_avx512.
    * cipher/blake2s-amd64-avx512.S: Likewise.
    * cipher/camellia-gfni-avx512-amd64.S: Likewise.
    * cipher/chacha20-avx512-amd64.S: Likewise.
    * cipher/keccak-amd64-avx512.S: Likewise.
    * cipher/poly1305-amd64-avx512.S: Likewise.
    * cipher/sha512-avx512-amd64.S: Likewise.
    * cipher/sm4-gfni-avx512-amd64.S: Likewise.
    ---

    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

* blake2: add AVX512 accelerated implementations | Jussi Kivilinna | 2022-07-25 | 1 | -0/+312

    * cipher/Makefile.am: Add 'blake2b-amd64-avx512.S' and 'blake2s-amd64-avx512.S'.
    * cipher/blake2.c (USE_AVX512): New.
      (ASM_FUNC_ABI): Setup attribute if USE_AVX2 or USE_AVX512 enabled in addition to USE_AVX.
      (BLAKE2B_CONTEXT_S, BLAKE2S_CONTEXT_S): Add 'use_avx512'.
      (_gcry_blake2b_transform_amd64_avx512)
      (_gcry_blake2s_transform_amd64_avx512): New.
      (blake2b_transform, blake2s_transform) [USE_AVX512]: Add AVX512 path.
      (blake2b_init_ctx, blake2s_init_ctx) [USE_AVX512]: Use AVX512 if HW feature available.
    * cipher/blake2b-amd64-avx512.S: New.
    * cipher/blake2s-amd64-avx512.S: New.
    * configure.ac: Add 'blake2b-amd64-avx512.lo' and 'blake2s-amd64-avx512.lo'.
    --

    Benchmark on Intel Core i3-1115G4 (tigerlake):

    Before (AVX/AVX2 implementations):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     BLAKE2B_512    |     0.841 ns/B      1134 MiB/s      3.44 c/B      4089
     BLAKE2S_256    |      1.29 ns/B     741.2 MiB/s      5.26 c/B      4089

    After (blake2s ~19% faster, blake2b ~25% faster):
                    |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     BLAKE2B_512    |     0.705 ns/B      1353 MiB/s      2.88 c/B      4088
     BLAKE2S_256    |      1.02 ns/B     933.3 MiB/s      4.18 c/B      4088

    Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
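
The speedup percentages quoted with the benchmark can be rechecked from the MiB/s columns. The following is a minimal sketch, not part of the commit: the names `RESULTS`, `throughput_gain_percent` and `speedups` are hypothetical, and the numbers are copied from the Before/After figures above. Computed this way, the table gives roughly 19% for BLAKE2B_512 and roughly 26% for BLAKE2S_256, so the two labels in the "After" heading appear to be swapped relative to the table rows.

```python
# Figures copied from the benchmark in the commit message
# (Intel Core i3-1115G4). Names in this sketch are hypothetical.
# Each entry: (ns/byte before, MiB/s before, ns/byte after, MiB/s after).
RESULTS = {
    "BLAKE2B_512": (0.841, 1134.0, 0.705, 1353.0),
    "BLAKE2S_256": (1.29, 741.2, 1.02, 933.3),
}


def throughput_gain_percent(mib_before, mib_after):
    """Relative throughput increase, in percent."""
    return (mib_after / mib_before - 1.0) * 100.0


def speedups(results):
    """Map each algorithm name to its throughput gain in percent."""
    return {
        name: throughput_gain_percent(row[1], row[3])
        for name, row in results.items()
    }


if __name__ == "__main__":
    for name, gain in speedups(RESULTS).items():
        print(f"{name}: {gain:.1f}% higher throughput")
```

Using the ns/B columns instead gives slightly different values, because the published figures are already rounded.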