delta/libgcrypt.git - dev.gnupg.org: source/libgcrypt.git

	Commit message	Author	Age	Files	Lines
*	sha3: Add x86-64 AVX512 accelerated implementation	Jussi Kivilinna	2022-07-25	1	-0/+1
	* LICENSES: Add 'cipher/keccak-amd64-avx512.S'.
	* configure.ac: Add 'keccak-amd64-avx512.lo'.
	* cipher/Makefile.am: Add 'keccak-amd64-avx512.S'.
	* cipher/keccak-amd64-avx512.S: New.
	* cipher/keccak.c (USE_64BIT_AVX512, ASM_FUNC_ABI): New.
	[USE_64BIT_AVX512] (_gcry_keccak_f1600_state_permute64_avx512)
	(_gcry_keccak_absorb_blocks_avx512, keccak_f1600_state_permute64_avx512)
	(keccak_absorb_lanes64_avx512, keccak_avx512_64_ops): New.
	(keccak_init) [USE_64BIT_AVX512]: Enable x86-64 AVX512 implementation if
	supported by HW features.
	--
	Benchmark on Intel Core i3-1115G4 (tigerlake):
	Before (BMI2 instructions):
	         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA3-224 |     1.77 ns/B    540.3 MiB/s     7.22 c/B      4088
	SHA3-256 |     1.86 ns/B    514.0 MiB/s     7.59 c/B      4089
	SHA3-384 |     2.43 ns/B    393.1 MiB/s     9.92 c/B      4089
	SHA3-512 |     3.49 ns/B    273.2 MiB/s    14.27 c/B      4088
	SHAKE128 |     1.52 ns/B    629.1 MiB/s     6.20 c/B      4089
	SHAKE256 |     1.86 ns/B    511.6 MiB/s     7.62 c/B      4089
	After (~33% faster):
	         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA3-224 |     1.32 ns/B    721.8 MiB/s     5.40 c/B      4089
	SHA3-256 |     1.40 ns/B    681.7 MiB/s     5.72 c/B      4089
	SHA3-384 |     1.83 ns/B    522.5 MiB/s     7.46 c/B      4089
	SHA3-512 |     2.63 ns/B    362.1 MiB/s    10.77 c/B      4088
	SHAKE128 |     1.13 ns/B    840.4 MiB/s     4.64 c/B      4089
	SHAKE256 |     1.40 ns/B    682.1 MiB/s     5.72 c/B      4089
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
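The "~33% faster" figure in the commit message above can be cross-checked from its throughput columns. The following is a minimal sketch, assuming "faster" means the relative gain in MiB/s; the helper name `speedup_percent` is hypothetical and is not part of libgcrypt or its benchmark tool.

```python
# Throughput in MiB/s, copied from the SHA3 AVX512 commit message above.
before = {
    "SHA3-224": 540.3, "SHA3-256": 514.0, "SHA3-384": 393.1,
    "SHA3-512": 273.2, "SHAKE128": 629.1, "SHAKE256": 511.6,
}
after = {
    "SHA3-224": 721.8, "SHA3-256": 681.7, "SHA3-384": 522.5,
    "SHA3-512": 362.1, "SHAKE128": 840.4, "SHAKE256": 682.1,
}


def speedup_percent(old_mib_s, new_mib_s):
    """Relative throughput gain in percent (hypothetical helper)."""
    return (new_mib_s / old_mib_s - 1.0) * 100.0


# Per-algorithm gain and the plain average over all six rows.
gains = {name: speedup_percent(before[name], after[name]) for name in before}
mean_gain = sum(gains.values()) / len(gains)

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}%")
print(f"mean: {mean_gain:.1f}%")
```

Under this assumption every row lands between 32% and 34%, which is consistent with the rounded figure quoted in the message.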
*	poly1305: add AVX512 implementation	Jussi Kivilinna	2022-04-06	1	-0/+30
	* LICENSES: Add 3-clause BSD license for poly1305-amd64-avx512.S.
	* cipher/Makefile.am: Add 'poly1305-amd64-avx512.S'.
	* cipher/poly1305-amd64-avx512.S: New.
	* cipher/poly1305-internal.h (POLY1305_USE_AVX512): New.
	(poly1305_context_s): Add 'use_avx512'.
	* cipher/poly1305.c (ASM_FUNC_ABI, ASM_FUNC_WRAPPER_ATTR): New.
	[POLY1305_USE_AVX512] (_gcry_poly1305_amd64_avx512_blocks)
	(poly1305_amd64_avx512_blocks): New.
	(poly1305_init): Use AVX512 if HW feature available (set use_avx512).
	[USE_MPI_64BIT] (poly1305_blocks): Rename to ...
	[USE_MPI_64BIT] (poly1305_blocks_generic): ... this.
	[USE_MPI_64BIT] (poly1305_blocks): New.
	--
	Patch adds AMD64 AVX512-FMA52 implementation for Poly1305.
	Benchmark on Intel Core i3-1115G4 (tigerlake):
	Before:
	         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	POLY1305 |    0.306 ns/B     3117 MiB/s     1.25 c/B      4090
	After (5.0x faster):
	         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	POLY1305 |    0.061 ns/B    15699 MiB/s    0.249 c/B    4095±3
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
*	SHA512: Add AVX512 implementation	Jussi Kivilinna	2022-03-10	1	-0/+1
	* LICENSES: Add 'cipher/sha512-avx512-amd64.S'.
	* cipher/Makefile.am: Add 'sha512-avx512-amd64.S'.
	* cipher/sha512-avx512-amd64.S: New.
	* cipher/sha512.c (USE_AVX512): New.
	(do_sha512_transform_amd64_ssse3, do_sha512_transform_amd64_avx)
	(do_sha512_transform_amd64_avx2): Add ASM_EXTRA_STACK to return value only
	if assembly routine returned non-zero value.
	[USE_AVX512] (_gcry_sha512_transform_amd64_avx512)
	(do_sha512_transform_amd64_avx512): New.
	(sha512_init_common) [USE_AVX512]: Use AVX512 implementation if HW feature
	supported.
	---
	Benchmark on Intel Core i3-1115G4 (tigerlake):
	Before:
	       | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA512 |     1.51 ns/B    631.6 MiB/s     6.17 c/B      4089
	After (~29% faster):
	       | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA512 |     1.16 ns/B    819.0 MiB/s     4.76 c/B      4090
	GnuPG-bug-id: T4460
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
*	doc: Update LICENSES for jitterentropy 3.3.0.	NIIBE Yutaka	2021-11-17	1	-36/+43
	--
	In the files of the implementation (.h, .c), it says:
	  License: see LICENSE file in root directory
	I think that users may more easily look at the LICENSES file instead.
	GnuPG-bug-id 5523
	Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
*	VPMSUMD acceleration for GCM mode on PPC	Shawn Landden	2021-03-07	1	-1/+42
	* cipher/Makefile.am: Add 'cipher-gcm-ppc.c'.
	* cipher/cipher-gcm-ppc.c: New.
	* cipher/cipher-gcm.c [GCM_USE_PPC_VPMSUM] (_gcry_ghash_setup_ppc_vpmsum)
	(_gcry_ghash_ppc_vpmsum, ghash_setup_ppc_vpsum, ghash_ppc_vpmsum): New.
	(setupM) [GCM_USE_PPC_VPMSUM]: Select ppc-vpmsum implementation if HW
	feature "ppc-vcrypto" is available.
	* cipher/cipher-internal.h (GCM_USE_PPC_VPMSUM): New.
	(gcry_cipher_handle): Move 'ghash_fn' at end of 'gcm' block to align
	'gcm_table' to 16 bytes.
	* configure.ac: Add 'cipher-gcm-ppc.lo'.
	* tests/basic.c (_check_gcm_cipher): New AES256 test vector.
	* AUTHORS: Add 'CRYPTOGAMS'.
	* LICENSES: Add original license to 3-clause-BSD section.
	--
	https://dev.gnupg.org/D501:
	10-20X speed.
	However this Power 9 machine is faster than the last Power 9 benchmarks on
	the optimized versions, so while better than the last patch, it is not all
	due to the code.
	Before:
	GCM enc  |  4.23 ns/B  225.3 MiB/s  - c/B
	GCM dec  |  3.58 ns/B  266.2 MiB/s  - c/B
	GCM auth |  3.34 ns/B  285.3 MiB/s  - c/B
	After:
	GCM enc  | 0.370 ns/B   2578 MiB/s  - c/B
	GCM dec  | 0.371 ns/B   2571 MiB/s  - c/B
	GCM auth | 0.159 ns/B   6003 MiB/s  - c/B
	Signed-off-by: Shawn Landden <shawn@git.icu>
	[jk: coding style fixes, Makefile.am integration, patch from Differential
	to git, commit changelog, fixed few compiler warnings]
	GnuPG-bug-id: 5040
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
*	Add i386/SSSE3 implementation of SHA512	Jussi Kivilinna	2019-11-05	1	-0/+1
	* LICENSES: Add 'sha512-ssse3-i386.c'.
	* configure.ac: Add 'sha512-ssse3-i386.lo'.
	* cipher/Makefile.am: Add 'sha512-ssse3-i386.c'.
	* cipher/sha512-ssse3-i386.c: New.
	* cipher/sha512.c (USE_SSSE3_I386, _gcry_sha512_transform_i386_ssse3)
	(do_sha512_transform_i386_ssse3): New.
	(_gcry_sha512_transform_arm) [USE_SSSE3_I386]: Use i386/SSSE3 transform
	function if supported by CPU.
	--
	Benchmark on AMD Ryzen 7 3700X:
	Before:
	       | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA512 |    12.58 ns/B    75.79 MiB/s    55.06 c/B      4375
	After (~4.5x faster):
	       | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
	SHA512 |     2.78 ns/B    343.3 MiB/s    12.09 c/B      4351
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
*	random: Allow building rndjent.c with stats collecting enabled.	Werner Koch	2017-06-16	1	-1/+1
	* random/rndjent.c: Change license to the one used by jitterentropy.h.
	(jent_init_statistic): New.
	(jent_bit_count): New.
	(jent_statistic_copy_stat): New.
	(jent_calc_statistic): New.
	--
	New code taken from Stephan's jitterentropy-stat.c. This now builds with
	CONFIG_CRYPTO_CPU_JITTERENTROPY_STAT defined; not sure whether this is
	already useful. Changed the license due to the new code.
	Signed-off-by: Werner Koch <wk@gnupg.org>
*	random: Add original Jitter RNG implementation	Stephan Mueller	2017-06-13	1	-0/+45
	* random/jitterentropy-base-user.h: New.
	* random/jitterentropy-base.c: New.
	* random/jitterentropy.h: New.
	--
	Signed-off-by: Werner Koch <wk@gnupg.org>
	- Tabs and trailing white spaces removed from original source.
	- Source received by mail dated Fri, 27 Jan 2017 17:52:38 +0100 from Stephan
*	Document more non LGPL-licensed code.	Andreas Metzler	2016-02-12	1	-0/+107
	--
	Add license and copyright statement for cipher/arcfour-amd64.S (public
	domain) and cipher/cipher-ocb.c (OCB license 1)
*	Update license information for CRC	Jussi Kivilinna	2015-11-18	1	-50/+0
	* LICENSES: Remove 'Simple permissive' and 'IETF permissive' licenses for
	'cipher/crc.c' as result of rewrite of CRC implementations.
	--
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
*	Add AVX and AVX2/BMI implementations for SHA-256	Jussi Kivilinna	2013-12-18	1	-0/+2
	* LICENSES: Add 'cipher/sha256-avx-amd64.S' and
	'cipher/sha256-avx2-bmi2-amd64.S'.
	* cipher/Makefile.am: Add 'sha256-avx-amd64.S' and
	'sha256-avx2-bmi2-amd64.S'.
	* cipher/sha256-avx-amd64.S: New.
	* cipher/sha256-avx2-bmi2-amd64.S: New.
	* cipher/sha256-ssse3-amd64.S: Use 'lea' instead of 'add' in few places for
	tiny speed improvement.
	* cipher/sha256.c (USE_AVX, USE_AVX2): New.
	(SHA256_CONTEXT) [USE_AVX, USE_AVX2]: Add 'use_avx' and 'use_avx2'.
	(sha256_init, sha224_init) [USE_AVX, USE_AVX2]: Initialize above new
	context members.
	[USE_AVX] (_gcry_sha256_transform_amd64_avx): New.
	[USE_AVX2] (_gcry_sha256_transform_amd64_avx2): New.
	(transform) [USE_AVX2]: Use AVX2 assembly if enabled.
	(transform) [USE_AVX]: Use AVX assembly if enabled.
	* configure.ac: Add 'sha256-avx-amd64.lo' and 'sha256-avx2-bmi2-amd64.lo'.
	--
	Patch adds fast AVX and AVX2/BMI2 implementations of SHA-256 by Intel
	Corporation. The assembly source is licensed under 3-clause BSD license,
	thus compatible with LGPL2.1+. Original source can be accessed at:
	http://www.intel.com/p/en_US/embedded/hwsw/technology/packet-processing#docs
	Implementation is described in white paper
	"Fast SHA-256 Implementations on Intel® Architecture Processors"
	http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/sha-256-implementations-paper.html
	Note: AVX implementation uses SHLD instruction to emulate RORQ, since it's
	faster on Intel Sandy-Bridge. However, on non-Intel CPUs SHLD is much
	slower than RORQ, so therefore AVX implementation is (for now) limited to
	Intel CPUs.
	Note: AVX2 implementation also uses BMI2 instruction rorx, thus additional
	HWF flag.
	Benchmarks:
	cpu             C-lang      SSSE3       AVX/AVX2    C vs AVX/AVX2  vs SSSE3
	Intel i5-4570   13.86 c/B   10.27 c/B   8.70 c/B    1.59x          1.18x
	Intel i5-2450M  17.25 c/B   12.36 c/B   10.31 c/B   1.67x          1.19x
	Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
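The note above about using SHLD in place of a rotate relies on a property of the x86 double-shift instruction: when both operands are the same register, `shld reg, reg, count` rotates that register left by `count`, and a rotate right by `n` equals a rotate left by `width - n`. The sketch below models that in Python to illustrate the identity only; it is not the assembly from the commit, and the function names and the 32-bit default width (SHA-256 operates on 32-bit words) are assumptions made for illustration.

```python
def shld(dst, src, count, width=32):
    """Model of x86 SHLD: shift dst left, filling low bits from the top of src."""
    mask = (1 << width) - 1
    count %= width  # the CPU masks the count to the operand width (32 or 64)
    if count == 0:
        return dst & mask
    return ((dst << count) | (src >> (width - count))) & mask


def ror(value, n, width=32):
    """Reference rotate right by n bits."""
    mask = (1 << width) - 1
    n %= width
    return ((value >> n) | (value << (width - n))) & mask


def ror_via_shld(value, n, width=32):
    """Rotate right by n, expressed as SHLD with dst == src (hypothetical helper)."""
    return shld(value, value, width - n, width)


# Spot check: both routes agree for every rotation amount of one sample word.
word = 0x12345678
assert all(ror_via_shld(word, n) == ror(word, n) for n in range(32))
```

The model says nothing about speed; the commit message's point is that the two instructions produce the same result while differing in cost across CPU vendors.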
*	Update license information	Werner Koch	2013-12-13	1	-0/+134
	* LICENSES: New.
	* Makefile.am (EXTRA_DIST): Add LICENSES.
	* AUTHORS: Add list of copyright holders.
	* README: Reference AUTHORS.
	Signed-off-by: Werner Koch <wk@gnupg.org>