From 9a63cfd61753b2c7ef7a872a01565154f10a72c0 Mon Sep 17 00:00:00 2001
From: Jussi Kivilinna
Date: Sat, 26 Mar 2022 19:48:08 +0200
Subject: chacha20: add AVX512 implementation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* cipher/Makefile.am: Add 'chacha20-amd64-avx512.S'.
* cipher/chacha20-amd64-avx512.S: New.
* cipher/chacha20.c (USE_AVX512): New.
(CHACHA20_context_s): Add 'use_avx512'.
[USE_AVX512] (_gcry_chacha20_amd64_avx512_blocks16): New.
(chacha20_do_setkey) [USE_AVX512]: Setup 'use_avx512' based on
HW features.
(do_chacha20_encrypt_stream_tail) [USE_AVX512]: Use AVX512
implementation if supported.
(_gcry_chacha20_poly1305_encrypt) [USE_AVX512]: Disable stitched
chacha20-poly1305 implementations if AVX512 implementation is used.
(_gcry_chacha20_poly1305_decrypt) [USE_AVX512]: Disable stitched
chacha20-poly1305 implementations if AVX512 implementation is used.
--

Benchmark on Intel Core i3-1115G4 (tigerlake):

 Before:
                |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     STREAM enc |     0.276 ns/B      3451 MiB/s      1.13 c/B      4090
     STREAM dec |     0.284 ns/B      3359 MiB/s      1.16 c/B      4090
   POLY1305 enc |     0.411 ns/B      2320 MiB/s      1.68 c/B    4098±3
   POLY1305 dec |     0.408 ns/B      2338 MiB/s      1.67 c/B    4091±1
  POLY1305 auth |     0.060 ns/B     15785 MiB/s     0.247 c/B    4090±1

 After (stream 1.7x faster, poly1305-aead 1.8x faster):
                |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
     STREAM enc |     0.162 ns/B      5869 MiB/s     0.665 c/B    4092±1
     STREAM dec |     0.162 ns/B      5884 MiB/s     0.664 c/B    4096±3
   POLY1305 enc |     0.221 ns/B      4306 MiB/s     0.907 c/B    4097±3
   POLY1305 dec |     0.220 ns/B      4342 MiB/s     0.900 c/B    4096±3
  POLY1305 auth |     0.060 ns/B     15797 MiB/s     0.247 c/B    4085±2

Signed-off-by: Jussi Kivilinna
---
 configure.ac | 1 +
 1 file changed, 1 insertion(+)

(limited to 'configure.ac')

diff --git a/configure.ac b/configure.ac
index eb149a51..9f0c10f9 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2759,6 +2759,7 @@ if test "$found" = "1" ; then
          # Build with the assembly implementation
          GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS chacha20-amd64-ssse3.lo"
          GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS chacha20-amd64-avx2.lo"
+         GCRYPT_ASM_CIPHERS="$GCRYPT_ASM_CIPHERS chacha20-amd64-avx512.lo"
       ;;
       aarch64-*-*)
          # Build with the assembly implementation
-- 
cgit v1.2.1