* cipher/cipher-gcm-intel-pclmul.c (GCM_INTEL_USE_VPCLMUL_AVX2)
(GCM_INTEL_AGGR8_TABLE_INITIALIZED)
(GCM_INTEL_AGGR16_TABLE_INITIALIZED): New.
(gfmul_pclmul): Fixes to comments.
[GCM_USE_INTEL_VPCLMUL_AVX2] (GFMUL_AGGR16_ASM_VPCLMUL_AVX2)
(gfmul_vpclmul_avx2_aggr16, gfmul_vpclmul_avx2_aggr16_le)
(gfmul_pclmul_avx2, gcm_lsh_avx2, load_h1h2_to_ymm1)
(ghash_setup_aggr8_avx2, ghash_setup_aggr16_avx2): New.
(_gcry_ghash_setup_intel_pclmul): Add 'hw_features' parameter; Setup
ghash and polyval function pointers for context; Add VPCLMUL/AVX2 code
path; Defer aggr8 and aggr16 table initialization until first use in
'_gcry_ghash_intel_pclmul' or '_gcry_polyval_intel_pclmul'.
[__x86_64__] (ghash_setup_aggr8): New.
(_gcry_ghash_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
(_gcry_polyval_intel_pclmul): Add VPCLMUL/AVX2 code path; Add call for
aggr8 table initialization.
* cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL] (_gcry_ghash_intel_pclmul)
(_gcry_polyval_intel_pclmul): Remove.
[GCM_USE_INTEL_PCLMUL] (_gcry_ghash_setup_intel_pclmul): Add
'hw_features' parameter.
(setupM) [GCM_USE_INTEL_PCLMUL]: Pass HW features to
'_gcry_ghash_setup_intel_pclmul'; Let '_gcry_ghash_setup_intel_pclmul'
setup function pointers.
* cipher/cipher-internal.h (GCM_USE_INTEL_VPCLMUL_AVX2): New.
(gcry_cipher_handle): Add member 'gcm.hw_impl_flags'.
--
Patch adds VPCLMUL/AVX2 accelerated implementation for GHASH (GCM) and
POLYVAL (GCM-SIV).
Benchmark on AMD Ryzen 5800X (zen3):
Before:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM auth | 0.088 ns/B 10825 MiB/s 0.427 c/B 4850
GCM-SIV auth | 0.083 ns/B 11472 MiB/s 0.403 c/B 4850
After: (~1.93x faster)
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM auth | 0.045 ns/B 21098 MiB/s 0.219 c/B 4850
GCM-SIV auth | 0.043 ns/B 22181 MiB/s 0.209 c/B 4850
AES128-GCM / AES128-GCM-SIV encryption:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM enc | 0.079 ns/B 12073 MiB/s 0.383 c/B 4850
GCM-SIV enc | 0.076 ns/B 12500 MiB/s 0.370 c/B 4850
Benchmark on Intel Core i3-1115G4 (tigerlake):
Before:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM auth | 0.080 ns/B 11919 MiB/s 0.327 c/B 4090
GCM-SIV auth | 0.075 ns/B 12643 MiB/s 0.309 c/B 4090
After: (~1.28x faster)
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM auth | 0.062 ns/B 15348 MiB/s 0.254 c/B 4090
GCM-SIV auth | 0.058 ns/B 16381 MiB/s 0.238 c/B 4090
AES128-GCM / AES128-GCM-SIV encryption:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM enc | 0.101 ns/B 9441 MiB/s 0.413 c/B 4090
GCM-SIV enc | 0.098 ns/B 9692 MiB/s 0.402 c/B 4089
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
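
As an illustration of the deferred setup described above, a minimal
standalone sketch of the first-use pattern (flag and function names
follow the changelog; the context structure here is hypothetical):

    #include <stddef.h>

    enum { GCM_INTEL_AGGR8_TABLE_INITIALIZED = 1u << 1 };

    struct gcm_ctx {
      unsigned int hw_impl_flags;
      unsigned char h_powers[8][16];   /* H^1..H^8, filled lazily */
    };

    static void ghash_setup_aggr8 (struct gcm_ctx *c)
    {
      /* ... compute H^5..H^8 into c->h_powers ... */
      c->hw_impl_flags |= GCM_INTEL_AGGR8_TABLE_INITIALIZED;
    }

    static void ghash_blocks (struct gcm_ctx *c, const unsigned char *buf,
                              size_t nblocks)
    {
      /* The table fill is paid only by callers that actually reach
         8-block aggregation, not by every setkey. */
      if (nblocks >= 8
          && !(c->hw_impl_flags & GCM_INTEL_AGGR8_TABLE_INITIALIZED))
        ghash_setup_aggr8 (c);
      /* ... aggregated processing of BUF ... */
      (void)buf;
    }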
* cipher/chacha20.c (_gcry_chacha20_poly1305_encrypt)
(_gcry_chacha20_poly1305_decrypt): Process in 24KiB chunks if input
larger than 32KiB.
* cipher/cipher-ccm.c (_gcry_cipher_ccm_encrypt)
(_gcry_cipher_ccm_decrypt): Likewise.
* cipher/cipher-eax.c (_gcry_cipher_eax_encrypt)
(_gcry_cipher_eax_decrypt): Likewise.
* cipher/cipher-gcm.c (gcm_crypt_inner): Likewise.
* cipher/cipher-ocb.c (ocb_crypt): Likewise.
* cipher/cipher-poly1305.c (_gcry_cipher_poly1305_encrypt)
(_gcry_cipher_poly1305_decrypt): Likewise.
--
Splitting an input whose length is just above 24KiB is not beneficial.
Instead, split only when the input is longer than 32KiB, to ensure that
the last chunk is also a large buffer.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
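
A standalone sketch of the resulting split rule (an assumed
simplification of the actual loops):

    #include <stddef.h>

    /* Inputs up to 32 KiB are processed in one call; longer inputs are
       fed through in 24 KiB chunks, which leaves a final chunk larger
       than 8 KiB instead of a potentially tiny tail. */
    static void
    crypt_chunked (void (*crypt_fn) (const unsigned char *, size_t),
                   const unsigned char *in, size_t len)
    {
      while (len > 32 * 1024)
        {
          crypt_fn (in, 24 * 1024);
          in += 24 * 1024;
          len -= 24 * 1024;
        }
      crypt_fn (in, len);
    }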
* cipher/cipher-gcm.c (setupM): Remove ifdef around 'features'.
--
GnuPG-bug-id: 5796
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm-armv8-aarch32-ce.S
(_gcry_polyval_armv8_ce_pmull): New.
* cipher/cipher-gcm-armv8-aarch64-ce.S
(_gcry_polyval_armv8_ce_pmull): New.
* cipher/cipher-gcm.c (_gcry_polyval_armv8_ce_pmull)
(polyval_armv8_ce_pmull): New.
(setupM) [GCM_USE_ARM_PMULL]: Setup 'polyval_armv8_ce_pmull' as POLYVAL
function.
--
Benchmark on Cortex-A53 (aarch64):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV auth | 1.74 ns/B 547.6 MiB/s 2.01 c/B 1152
After (76% faster):
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV auth | 0.990 ns/B 963.2 MiB/s 1.14 c/B 1152
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm-intel-pclmul.c (gfmul_pclmul_aggr4)
(gfmul_pclmul_aggr8): Move assembly to new GFMUL_AGGRx_ASM* macros.
(GFMUL_AGGR4_ASM_1, GFMUL_AGGR4_ASM_2, gfmul_pclmul_aggr4_le)
(GFMUL_AGGR8_ASM, gfmul_pclmul_aggr8_le)
(_gcry_polyval_intel_pclmul): New.
* cipher/cipher-gcm-siv.c (do_polyval_buf): Use polyval function
if available.
* cipher/cipher-gcm.c (_gcry_polyval_intel_pclmul): New.
(setupM): Setup 'c->u_mode.gcm.polyval_fn' with accelerated polyval
function if available.
* cipher/cipher-internal.h (gcry_cipher_handle): Add member
'u_mode.gcm.polyval_fn'.
--
Benchmark on AMD Ryzen 7 5800X:
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV enc | 0.150 ns/B 6337 MiB/s 0.730 c/B 4849
GCM-SIV dec | 0.163 ns/B 5862 MiB/s 0.789 c/B 4850
GCM-SIV auth | 0.119 ns/B 8022 MiB/s 0.577 c/B 4850
After (enc/dec ~26% faster, auth ~43% faster):
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV enc | 0.117 ns/B 8138 MiB/s 0.568 c/B 4850
GCM-SIV dec | 0.128 ns/B 7429 MiB/s 0.623 c/B 4850
GCM-SIV auth | 0.083 ns/B 11507 MiB/s 0.402 c/B 4851
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
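
For context: POLYVAL (RFC 8452, used by GCM-SIV) and GHASH are the same
GF(2^128) multiplication with mirrored bit and byte ordering; RFC 8452
Appendix A gives the exact mapping. That mirroring is why the new '_le'
(little-endian) aggregated variants above let one PCLMUL core serve
both without extra byte swapping. The per-block reflection, as a
trivial standalone helper:

    /* Reverse a 16-byte block in place; POLYVAL reads blocks in the
       reflected order relative to GHASH. */
    static void block_reverse (unsigned char b[16])
    {
      int i;

      for (i = 0; i < 8; i++)
        {
          unsigned char t = b[i];
          b[i] = b[15 - i];
          b[15 - i] = t;
        }
    }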
* cipher/Makefile.am: Add 'cipher-gcm-siv.c'.
* cipher/cipher-gcm-siv.c: New.
* cipher/cipher-gcm.c (_gcry_cipher_gcm_setupM): New.
* cipher/cipher-internal.h (gcry_cipher_handle): Add 'siv_keylen'.
(_gcry_cipher_gcm_setupM, _gcry_cipher_gcm_siv_encrypt)
(_gcry_cipher_gcm_siv_decrypt, _gcry_cipher_gcm_siv_set_nonce)
(_gcry_cipher_gcm_siv_authenticate)
(_gcry_cipher_gcm_siv_set_decryption_tag)
(_gcry_cipher_gcm_siv_get_tag, _gcry_cipher_gcm_siv_check_tag)
(_gcry_cipher_gcm_siv_setkey): New prototypes.
(cipher_block_bswap): New helper function.
* cipher/cipher.c (_gcry_cipher_open_internal): Add
'GCRY_CIPHER_MODE_GCM_SIV'; Refactor mode requirement checks for
better size optimization (check pointers & blocksize in same order
for all).
(cipher_setkey, cipher_reset, _gcry_cipher_setup_mode_ops)
(_gcry_cipher_setup_mode_ops, _gcry_cipher_info): Add GCM-SIV.
(_gcry_cipher_ctl): Handle 'set decryption tag' for GCM-SIV.
* doc/gcrypt.texi: Add GCM-SIV.
* src/gcrypt.h.in (GCRY_CIPHER_MODE_GCM_SIV): New.
(GCRY_SIV_BLOCK_LEN, gcry_cipher_set_decryption_tag): Add to comment
that these are also for GCM-SIV in addition to SIV mode.
* tests/basic.c (check_gcm_siv_cipher): New.
(check_cipher_modes): Check for GCM-SIV.
* tests/bench-slope.c (bench_gcm_siv_encrypt_do_bench)
(bench_gcm_siv_decrypt_do_bench, bench_gcm_siv_authenticate_do_bench)
(gcm_siv_encrypt_ops, gcm_siv_decrypt_ops)
(gcm_siv_authenticate_ops): New.
(cipher_modes): Add GCM-SIV.
(cipher_bench_one): Check key length requirement for GCM-SIV.
--
GnuPG-bug-id: T4485
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm-ppc.c (ALIGNED_16): New.
(vec_store_he, vec_load_he): Remove WORDS_BIGENDIAN ifdef.
(vec_dup_byte_elem): New.
(_gcry_ghash_setup_ppc_vpmsum): Match function declaration with
prototype in cipher-gcm.c; Load C2 with VEC_LOAD_BE; Use
vec_dup_byte_elem; Align constants to 16 bytes.
(_gcry_ghash_ppc_vpmsum): Match function declaration with
prototype in cipher-gcm.c; Align constant to 16 bytes.
* cipher/cipher-gcm.c (ghash_ppc_vpmsum): Return value from
_gcry_ghash_ppc_vpmsum.
* cipher/cipher-internal.h (GCM_USE_PPC_VPMSUM): Remove requirement
for !WORDS_BIGENDIAN.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/Makefile.am: Add 'cipher-gcm-ppc.c'.
* cipher/cipher-gcm-ppc.c: New.
* cipher/cipher-gcm.c [GCM_USE_PPC_VPMSUM] (_gcry_ghash_setup_ppc_vpmsum)
(_gcry_ghash_ppc_vpmsum, ghash_setup_ppc_vpmsum, ghash_ppc_vpmsum): New.
(setupM) [GCM_USE_PPC_VPMSUM]: Select ppc-vpmsum implementation if
HW feature "ppc-vcrypto" is available.
* cipher/cipher-internal.h (GCM_USE_PPC_VPMSUM): New.
(gcry_cipher_handle): Move 'ghash_fn' at end of 'gcm' block to align
'gcm_table' to 16 bytes.
* configure.ac: Add 'cipher-gcm-ppc.lo'.
* tests/basic.c (_check_gcm_cipher): New AES256 test vector.
* AUTHORS: Add 'CRYPTOGAMS'.
* LICENSES: Add original license to 3-clause-BSD section.
--
From https://dev.gnupg.org/D501: roughly a 10-20x speed-up.
However, this POWER9 machine is faster than the one used for the
previous POWER9 benchmarks of the optimized versions, so while the
numbers are better than for the last patch, not all of the improvement
is due to the code.
Before:
GCM enc | 4.23 ns/B 225.3 MiB/s - c/B
GCM dec | 3.58 ns/B 266.2 MiB/s - c/B
GCM auth | 3.34 ns/B 285.3 MiB/s - c/B
After:
GCM enc | 0.370 ns/B 2578 MiB/s - c/B
GCM dec | 0.371 ns/B 2571 MiB/s - c/B
GCM auth | 0.159 ns/B 6003 MiB/s - c/B
Signed-off-by: Shawn Landden <shawn@git.icu>
[jk: coding style fixes, Makefile.am integration, patch from Differential
to git, commit changelog, fixed few compiler warnings]
GnuPG-bug-id: 5040
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/Makefile.am: Add 'asm-inline-s390x.h'.
* cipher/asm-inline-s390x.h: New.
* cipher/cipher-gcm.c [GCM_USE_S390X_CRYPTO] (ghash_s390x_kimd): New.
(setupM) [GCM_USE_S390X_CRYPTO]: Add setup for s390x GHASH function.
* cipher/cipher-internal.h (GCM_USE_S390X_CRYPTO): New.
* cipher/rijndael-s390x.c (u128_t, km_functions_e): Move to
'asm-inline-s390x.h'.
(aes_s390x_gcm_crypt): New.
(_gcry_aes_s390x_setup_acceleration): Use 'km_function_to_mask'; Add
setup for GCM bulk function.
--
This patch adds zSeries acceleration for GHASH and AES-GCM.
Benchmarks (z15, 5.2 GHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
GCM enc | 2.64 ns/B 361.6 MiB/s 13.71 c/B
GCM dec | 2.64 ns/B 361.3 MiB/s 13.72 c/B
GCM auth | 2.58 ns/B 370.1 MiB/s 13.40 c/B
After:
AES | nanosecs/byte mebibytes/sec cycles/byte
GCM enc | 0.059 ns/B 16066 MiB/s 0.309 c/B
GCM dec | 0.059 ns/B 16114 MiB/s 0.308 c/B
GCM auth | 0.057 ns/B 16747 MiB/s 0.296 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (do_ghash_buf): Properly handle the case where
'unused' gets filled to the full blocksize.
(gcm_crypt_inner): New.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Use
'gcm_crypt_inner'.
* cipher/cipher-internal.h (cipher_bulk_ops_t): Add 'gcm_crypt'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (ATTR_ALIGNED_64): New.
(gcmR): Move to 'gcm_table' structure.
(gcm_table): New structure for look-up table with counters before and
after.
(gcmR): New macro.
(prefetch_table): Handle input with length not multiple of 256.
(do_prefetch_tables): Modify pre- and post-table counters to unshare
look-up table pages between processes.
--
GnuPG-bug-id: 4541
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
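
The idea, as a standalone sketch (layout simplified; the real structure
and counter handling may differ): mutating counters placed in the same
pages as the look-up table make those pages differ between processes,
so the kernel's page deduplication cannot merge them, and prefetching
the whole table on each use masks data-dependent access patterns.

    #include <stddef.h>

    struct gcm_table_s {
      volatile unsigned int counter_head;  /* bumped before table use */
      unsigned long long M[64 * 2];        /* the actual look-up table */
      volatile unsigned int counter_tail;  /* bumped after table use */
    };

    static void do_prefetch_tables (struct gcm_table_s *t)
    {
      const volatile unsigned char *p =
        (const volatile unsigned char *)t->M;
      size_t i;

      t->counter_head++;                   /* dirty the leading page */
      for (i = 0; i < sizeof (t->M); i += 64)
        (void)p[i];                        /* touch every cache line */
      t->counter_tail++;                   /* dirty the trailing page */
    }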
* cipher/cipher-gcm.c (prefetch_table, do_prefetch_tables)
(prefetch_tables): New.
(ghash_internal): Call prefetch_tables.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c [GCM_TABLES_USE_U64] (do_fillM): Precalculate
M[32..63] values.
[GCM_TABLES_USE_U64] (do_ghash): Split processing of the two 64-bit halves
of the input to two separate loops; Use precalculated M[] values.
[GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_fillM): Precalculate
M[64..127] values.
[GCM_USE_TABLES && !GCM_TABLES_USE_U64] (do_ghash): Use precalculated
M[] values.
[GCM_USE_TABLES] (bshift): Avoid conditional execution for mask
calculation.
* cipher/cipher-internal.h (gcry_cipher_handle): Double gcm_table size.
--
Benchmark on Intel Haswell (amd64, --disable-hwf all):
Before:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GMAC_AES | 2.79 ns/B 341.3 MiB/s 11.17 c/B 3998
After (~36% faster):
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GMAC_AES | 2.05 ns/B 464.7 MiB/s 8.20 c/B 3998
Benchmark on Intel Haswell (win32, --disable-hwf all):
Before:
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GMAC_AES | 4.90 ns/B 194.8 MiB/s 19.57 c/B 3997
After (~36% faster):
| nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GMAC_AES | 3.58 ns/B 266.4 MiB/s 14.31 c/B 3999
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/Makefile.am: Add 'cipher-gcm-armv7-neon.S'.
* cipher/cipher-gcm-armv7-neon.S: New.
* cipher/cipher-gcm.c [GCM_USE_ARM_NEON] (_gcry_ghash_setup_armv7_neon)
(_gcry_ghash_armv7_neon, ghash_setup_armv7_neon)
(ghash_armv7_neon): New.
(setupM) [GCM_USE_ARM_NEON]: Use the armv7/neon implementation if
HWF_ARM_NEON is available.
* cipher/cipher-internal.h (GCM_USE_ARM_NEON): New.
--
Benchmark on Cortex-A53 (816 Mhz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
GMAC_AES | 34.81 ns/B 27.40 MiB/s 28.41 c/B
After (3.0x faster):
| nanosecs/byte mebibytes/sec cycles/byte
GMAC_AES | 11.49 ns/B 82.99 MiB/s 9.38 c/B
Reported-by: Yuriy M. Kaminskiy <yumkam@gmail.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-ccm.c (do_cbc_mac): Replace buffer setting loop with memset call.
* cipher/cipher-gcm.c (do_ghash_buf): Ditto.
* cipher/poly1305.c (poly1305_final): Ditto.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
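
The transformation, shown as a representative sketch (not the exact
code):

    #include <string.h>

    /* Zero-pad the unused tail of a block.  The former loop
         for (i = unused; i < blocksize; i++) buf[i] = 0;
       becomes a single call, which is smaller and typically faster. */
    static void
    zero_pad (unsigned char *buf, size_t unused, size_t blocksize)
    {
      memset (buf + unused, 0, blocksize - unused);
    }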
* cipher/cipher-ccm.c (_gcry_cipher_ccm_encrypt)
(_gcry_cipher_ccm_decrypt): Process data in 24 KiB chunks.
* cipher/cipher-eax.c (_gcry_cipher_eax_encrypt)
(_gcry_cipher_eax_decrypt): Ditto.
* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_decrypt): Ditto.
* cipher/cipher-poly1305.c (_gcry_cipher_poly1305_encrypt)
(_gcry_cipher_poly1305_decrypt): Ditto.
--
Patch changes AEAD modes to process input in 24 KiB chunks to improve
cache locality when processing large buffers.
The huge-buffer test in tests/benchmark shows a 0.7% improvement for
AES-CCM and AES-EAX, 6% for AES-GCM and 4% for ChaCha20-Poly1305 on an
Intel Core i7-4790K.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/bufhelp.h (buf_get_he32, buf_put_he32, buf_get_he64)
(buf_put_he64): New.
* cipher/cipher-internal.h (cipher_block_cpy, cipher_block_xor)
(cipher_block_xor_1, cipher_block_xor_2dst, cipher_block_xor_n_copy_2)
(cipher_block_xor_n_copy): New.
* cipher/cipher-gcm-intel-pclmul.c
(_gcry_ghash_setup_intel_pclmul): Use assembly for swapping endianness
instead of buf_get_be64 and buf_cpy.
* cipher/blowfish.c: Use new cipher_block_* functions for cipher block
sized buf_cpy/xor* operations.
* cipher/camellia-glue.c: Ditto.
* cipher/cast5.c: Ditto.
* cipher/cipher-aeswrap.c: Ditto.
* cipher/cipher-cbc.c: Ditto.
* cipher/cipher-ccm.c: Ditto.
* cipher/cipher-cfb.c: Ditto.
* cipher/cipher-cmac.c: Ditto.
* cipher/cipher-ctr.c: Ditto.
* cipher/cipher-eax.c: Ditto.
* cipher/cipher-gcm.c: Ditto.
* cipher/cipher-ocb.c: Ditto.
* cipher/cipher-ofb.c: Ditto.
* cipher/cipher-xts.c: Ditto.
* cipher/des.c: Ditto.
* cipher/rijndael.c: Ditto.
* cipher/serpent.c: Ditto.
* cipher/twofish.c: Ditto.
--
This commit adds size-optimized functions for copying and xoring
cipher block sized buffers. These functions also allow GCC to use
inline auto-vectorization for block cipher copying and xoring on
higher optimization levels.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
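
A minimal sketch of the idea, reduced to the 16-byte case (the real
helpers also cover other block sizes and the xor-and-copy
combinations): fixing the length at the block size lets the compiler
open-code and auto-vectorize the operation.

    #include <stdint.h>
    #include <string.h>

    static void
    cipher_block_xor_16 (void *dst, const void *src1, const void *src2)
    {
      uint64_t a[2], b[2];

      memcpy (a, src1, 16);
      memcpy (b, src2, 16);
      a[0] ^= b[0];
      a[1] ^= b[1];
      memcpy (dst, a, 16);
    }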
* cipher/cipher-gcm.c (gcm_ctr_encrypt): New function to handle
32-bit CTR increment for GCM.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Do not use
generic CTR implementation directly, use gcm_ctr_encrypt instead.
* tests/basic.c (_check_gcm_cipher): Add test-vectors for 32-bit
CTR overflow.
(check_gcm_cipher): Add 'split input to 15 bytes and 17 bytes'
test-runs.
--
Reported-by: Clemens Lang <Clemens.Lang@bmw.de>
> I believe we have found what seems to be a bug in counter overflow
> handling in AES-GCM in libgcrypt's implementation. This leads to
> incorrect results when using a non-12-byte IV and decrypting payloads
> encrypted with other AES-GCM implementations, such as OpenSSL.
>
> According to the NIST Special Publication 800-38D "Recommendation for
> Block Cipher Modes of Operation: Galois/Counter Mode (GCM) and GMAC",
> section 7.1, algorithm 4, step 3 [NIST38D], the counter increment is
> defined as inc_32. Section 6.2 of the same document defines the
> incrementing function inc_s for positive integers s as follows:
>
> | the function increments the right-most s bits of the string, regarded
> | as the binary representation of an integer, modulo 2^s; the remaining,
> | left-most len(X) - s bits remain unchanged
>
> (X is the complete counter value in this case)
>
> This problem does not occur when using a 12-byte IV, because AES-GCM has
> a special case for the initial counter value with 12-byte IVs:
>
> | If len(IV)=96, then J_0 = IV || 0^31 || 1
>
> i.e., one would have to encrypt (UINT_MAX - 1) * blocksize of data to
> hit an overflow. However, for non-12-byte IVs, the initial counter value
> is the output of a hash function, which makes hitting an overflow much
> more likely.
>
> In practice, we have found that using
>
> iv = 9e 79 18 8c ff 09 56 1e c9 90 99 cc 6d 5d f6 d3
> key = 26 56 e5 73 76 03 c6 95 0d 22 07 31 5d 32 5c 6b a5 54 5f 40 23 98 60 f6 f7 06 6f 7a 4f c2 ca 40
>
> will reliably trigger an overflow when encrypting 10 MiB of data. It
> seems that this is caused by re-using the AES-CTR implementation for
> incrementing the counter.
Bug was introduced by commit bd4bd23a2511a4bce63c3217cca0d4ecf0c79532
"GCM: Use counter mode code for speed-up".
GnuPG-bug-id: 3764
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
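
For reference, inc_32 in portable C, matching the SP 800-38D definition
quoted above (a sketch, not libgcrypt's actual code): only the
right-most 32 bits of the big-endian counter block wrap; bytes 0..11
must never change.

    #include <stdint.h>

    static void gcm_inc32 (unsigned char ctr[16])
    {
      uint32_t val = ((uint32_t)ctr[12] << 24) | ((uint32_t)ctr[13] << 16)
                   | ((uint32_t)ctr[14] << 8)  |  (uint32_t)ctr[15];

      val++;   /* wraps modulo 2^32 */
      ctr[12] = val >> 24;
      ctr[13] = val >> 16;
      ctr[14] = val >> 8;
      ctr[15] = val;
    }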
* cipher/Makefile.am: Add 'cipher-gcm-armv8-aarch32-ce.S'.
* cipher/cipher-gcm-armv8-aarch32-ce.S: New.
* cipher/cipher-gcm.c [GCM_USE_ARM_PMULL]
(_gcry_ghash_setup_armv8_ce_pmull, _gcry_ghash_armv8_ce_pmull)
(ghash_setup_armv8_ce_pmull, ghash_armv8_ce_pmull): New.
(setupM) [GCM_USE_ARM_PMULL]: Enable ARM PMULL implementation if
HWF_ARM_PMULL HW feature flag is enabled.
* cipher/cipher-gcm.h (GCM_USE_ARM_PMULL): New.
--
Benchmark on Cortex-A53 (1152 Mhz):
Before:
| nanosecs/byte mebibytes/sec cycles/byte
GMAC_AES | 24.10 ns/B 39.57 MiB/s 27.76 c/B
After (~26x faster):
| nanosecs/byte mebibytes/sec cycles/byte
GMAC_AES | 0.924 ns/B 1032.2 MiB/s 1.06 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (is_tag_length_valid): New.
(_gcry_cipher_gcm_tag): Check that 'outbuflen' has valid tag length.
* tests/basic.c (_check_gcm_cipher): Add test-vectors with different
valid tag lengths and negative test vectors with invalid lengths.
--
NIST SP 800-38D allows the following tag lengths:
128, 120, 112, 104, 96, 64 and 32 bits.
[v2: allow larger buffer when outputting tag. 128-bit tag is written
to target buffer in this case]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
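
The allowed lengths, as a standalone check (a sketch; the actual
is_tag_length_valid may be written differently):

    #include <stdbool.h>
    #include <stddef.h>

    static bool is_tag_length_valid (size_t taglen)
    {
      switch (taglen)
        {
        case 16: case 15: case 14: case 13: case 12:  /* 128..96 bits */
        case 8:                                       /* 64 bits */
        case 4:                                       /* 32 bits */
          return true;
        default:
          return false;
        }
    }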
* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt): Check that GHASH_FN
has been initialized.
(_gcry_cipher_gcm_decrypt): Ditto.
(_gcry_cipher_gcm_authenticate): Ditto.
(_gcry_cipher_gcm_initiv): Ditto.
(_gcry_cipher_gcm_tag): Ditto.
--
Avoid a crash if certain functions are used before setkey.
Reported-by: Peter Wu <peter@lekensteyn.nl>
One crash is not fixed, that is the crash when setkey is not invoked
before using the GCM ciphers (introduced in the 1.7.0 cycle). Either
these functions should check that the key is present, or they should
initialize the ghash table earlier. Affected functions:
_gcry_cipher_gcm_encrypt
_gcry_cipher_gcm_decrypt
_gcry_cipher_gcm_authenticate
_gcry_cipher_gcm_initiv
(via _gcry_cipher_gcm_setiv)
_gcry_cipher_gcm_tag
(via _gcry_cipher_gcm_get_tag, _gcry_cipher_gcm_check_tag)
Regression-due-to: 4a0795af021305f9240f23626a3796157db46bd7
Signed-off-by: Werner Koch <wk@gnupg.org>
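
The shape of the added check, sketched standalone (field and error
value simplified):

    #include <stddef.h>

    struct gcm_state {
      void (*ghash_fn) (void *ctx, const void *buf, size_t nblocks);
    };

    /* Refuse to run before setkey has installed a GHASH implementation,
       instead of dereferencing a NULL function pointer. */
    static int gcm_check_ready (const struct gcm_state *g)
    {
      return g->ghash_fn ? 0 : -1;
    }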
* cipher/cipher-gcm.c (_gcry_cipher_gcm_tag): Check that the provided
tag length matches the actual tag length. Avoid gratuitous return
statements.
--
Signed-off-by: Werner Koch <wk@gnupg.org>
* cipher/cipher-gcm.c: copy a fixed length instead of the user-supplied
number.
--
The outbuflen is used to check the minimum size; the real tag is always
of fixed length.
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Actually this is not a buffer overrun because we copy no more than
has been allocated for OUTBUF. However, a too-long OUTBUFLEN accesses
data outside of the source buffer. -wk
* configure.ac (available_digests_64): Merge with available_digests.
(available_kdfs_64): Merge with available_kdfs.
<64 bit datatype test>: Bail out if no such type is available.
* src/types.h: Emit #error if no u64 can be defined.
(PROPERLY_ALIGNED_TYPE): Always add u64 type.
* cipher/bithelp.h: Remove all code paths which handle the
case of !HAVE_U64_TYPEDEF.
* cipher/bufhelp.h: Ditto.
* cipher/cipher-ccm.c: Ditto.
* cipher/cipher-gcm.c: Ditto.
* cipher/cipher-internal.h: Ditto.
* cipher/cipher.c: Ditto.
* cipher/hash-common.h: Ditto.
* cipher/md.c: Ditto.
* cipher/poly1305.c: Ditto.
* cipher/scrypt.c: Ditto.
* cipher/tiger.c: Ditto.
* src/g10lib.h: Ditto.
* tests/basic.c: Ditto.
* tests/bench-slope.c: Ditto.
* tests/benchmark.c: Ditto.
--
Given that SHA-2 and some other algorithms require a 64-bit type, it
no longer makes sense to conditionally compile some parts when the
platform does not provide such a type.
GnuPG-bug-id: 1815.
Signed-off-by: Werner Koch <wk@gnupg.org>
* cipher/cipher-gcm.c: Do not copy zero bytes from an empty buffer. Let
the function continue to add padding as needed though.
* cipher/mac-poly1305.c: If the caller requested to finish the hash
function without a copy of the result, return immediately.
--
Caught by UndefinedBehaviorSanitizer.
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
* cipher/cipher-gcm-intel-pclmul.c
(_gcry_ghash_setup_intel_pclmul): Remove 'h' parameter.
* cipher/cipher-gcm.c (_gcry_ghash_setup_intel_pclmul): Ditto.
(fillM): Get 'h' pointer from 'c'.
(setupM): Remove 'h' parameter.
(_gcry_cipher_gcm_setkey): Only pass 'c' to setupM.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/Makefile.am: Add 'cipher-gcm-intel-pclmul.c'.
* cipher/cipher-gcm-intel-pclmul.c: New.
* cipher/cipher-gcm.c [GCM_USE_INTEL_PCLMUL]
(_gcry_ghash_setup_intel_pclmul, _gcry_ghash_intel_pclmul): New
prototypes.
[GCM_USE_INTEL_PCLMUL] (gfmul_pclmul, gfmul_pclmul_aggr4): Move
to 'cipher-gcm-intel-pclmul.c'.
(ghash): Rename to...
(ghash_internal): ...this and move GCM_USE_INTEL_PCLMUL part to new
function in 'cipher-gcm-intel-pclmul.c'.
(setupM): Move GCM_USE_INTEL_PCLMUL part to new function in
'cipher-gcm-intel-pclmul.c'; Add selection of ghash function based
on available HW acceleration.
(do_ghash_buf): Change use of 'ghash' to 'c->u_mode.gcm.ghash_fn'.
* cipher/cipher-internal.h (ghash_fn_t): New.
(gcry_cipher_handle): Remove 'use_intel_pclmul'; Add 'ghash_fn'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* configure.ac (NEED_GPG_ERROR_VERSION): Require 1.13.
(gl_LOCK): Remove.
* src/ath.c, src/ath.h: Remove. Remove from all files. Replace all
mutexes by gpgrt based statically initialized locks.
* src/global.c (global_init): Remove ath_init.
(_gcry_vcontrol): Make ath install a dummy function.
(print_config): Remove threads info line.
* doc/gcrypt.texi: Simplify the multi-thread related documentation.
--
The current code only works on ELF systems with weak symbol
support. In particular no locks were used under Windows. With the
new gpgrt_lock functions from the soon to be released libgpg-error
1.13 we have a better portable scheme which also allows for static
initialized mutexes.
Signed-off-by: Werner Koch <wk@gnupg.org>
cipher/blowfish-amd64.S: Change utf-8 encoded copyright character to
'(C)'.
cipher/blowfish-arm.S: Ditto.
cipher/bufhelp.h: Ditto.
cipher/camellia-aesni-avx-amd64.S: Ditto.
cipher/camellia-aesni-avx2-amd64.S: Ditto.
cipher/camellia-arm.S: Ditto.
cipher/cast5-amd64.S: Ditto.
cipher/cast5-arm.S: Ditto.
cipher/cipher-ccm.c: Ditto.
cipher/cipher-cmac.c: Ditto.
cipher/cipher-gcm.c: Ditto.
cipher/cipher-selftest.c: Ditto.
cipher/cipher-selftest.h: Ditto.
cipher/mac-cmac.c: Ditto.
cipher/mac-gmac.c: Ditto.
cipher/mac-hmac.c: Ditto.
cipher/mac-internal.h: Ditto.
cipher/mac.c: Ditto.
cipher/rijndael-amd64.S: Ditto.
cipher/rijndael-arm.S: Ditto.
cipher/salsa20-amd64.S: Ditto.
cipher/salsa20-armv7-neon.S: Ditto.
cipher/serpent-armv7-neon.S: Ditto.
cipher/serpent-avx2-amd64.S: Ditto.
cipher/serpent-sse2-amd64.S: Ditto.
--
Avoid use of '©' for easier parsing of source for copyright information.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c: Change all 'c->u_iv.iv' to
'c->u_mode.gcm.u_ghash_key.key'.
(_gcry_cipher_gcm_setkey): New.
(_gcry_cipher_gcm_initiv): Move ghash initialization to function above.
* cipher/cipher-internal.h (gcry_cipher_handle): Add
'u_mode.gcm.u_ghash_key'; Reorder 'u_mode.gcm' members for partial
clearing in gcry_cipher_reset.
(_gcry_cipher_gcm_setkey): New prototype.
* cipher/cipher.c (cipher_setkey): Add GCM setkey.
(cipher_reset): Clear 'u_mode' only partially for GCM.
--
GHASH tables can be generated at setkey time. No need to regenerate
for every new IV.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (do_ghash_buf): Add buffering for less than
blocksize length input and padding handling.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt): Add handling
for AAD padding and check whether data has already been padded.
(_gcry_cipher_gcm_authenticate): Check that AAD or data has not been
padded yet.
(_gcry_cipher_gcm_initiv): Clear padding marks.
(_gcry_cipher_gcm_tag): Add finalization and padding; Clear sensitive
data from cipher handle, since they are not used after generating tag.
* cipher/cipher-internal.h (gcry_cipher_handle): Add 'u_mode.gcm.macbuf',
'u_mode.gcm.mac_unused', 'u_mode.gcm.ghash_data_finalized' and
'u_mode.gcm.ghash_aad_finalized'.
* tests/basic.c (check_gcm_cipher): Rename to...
(_check_gcm_cipher): ...this and add handling for different buffer step
lengths; Enable per byte buffer testing.
(check_gcm_cipher): Call _check_gcm_cipher with different buffer step
sizes.
--
Until now, GCM expected the full data to be input in one go. This patch
adds support for feeding data continuously (for encryption, decryption
and AAD).
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
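
A standalone sketch of the buffering scheme (simplified; the actual
code keeps separate state and finalization flags for AAD and data):

    #include <stddef.h>
    #include <string.h>

    struct ghash_stream {
      unsigned char macbuf[16];
      size_t mac_unused;          /* bytes pending in macbuf */
    };

    /* Assumed to exist elsewhere: consume one full 16-byte block. */
    extern void ghash_block (const unsigned char blk[16]);

    static void
    ghash_update (struct ghash_stream *s, const unsigned char *buf,
                  size_t len)
    {
      while (len)
        {
          size_t n = 16 - s->mac_unused;

          if (n > len)
            n = len;
          memcpy (s->macbuf + s->mac_unused, buf, n);
          s->mac_unused += n;
          buf += n;
          len -= n;
          if (s->mac_unused == 16)
            {
              ghash_block (s->macbuf);
              s->mac_unused = 0;
            }
        }
    }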
* cipher/cipher-gcm.c (ghash, gcm_bytecounter_add, do_ghash_buf)
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_geniv)
(_gcry_cipher_gcm_tag): Use size_t for buffer lengths.
* cipher/cipher-internal.h (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_decrypt, _gcry_cipher_gcm_authenticate): Use size_t
for buffer lengths.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_get_tag): Do not allow use in FIPS mode if setiv
was invoked directly.
(_gcry_cipher_gcm_setiv): Rename to...
(_gcry_cipher_gcm_initiv): ...this.
(_gcry_cipher_gcm_setiv): New setiv function with check for FIPS mode.
[TODO] (_gcry_cipher_gcm_getiv): New.
* cipher/cipher-internal.h (gcry_cipher_handle): Add
'u_mode.gcm.disallow_encryption_because_of_setiv_in_fips_mode'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (_gcry_cipher_gcm_encrypt)
(_gcry_cipher_gcm_decrypt, _gcry_cipher_gcm_authenticate): Make sure
that tag has not been finalized yet.
(_gcry_cipher_gcm_setiv): Clear 'marks.tag'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (do_ghash, ghash): Return stack burn depth.
(setupM): Wipe 'tmp' buffer.
(do_ghash_buf): Wipe 'tmp' buffer and add stack burning.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c [__x86_64__] (gfmul_pclmul_aggr4): New.
(ghash) [GCM_USE_INTEL_PCLMUL]: Add aggregated bulk processing
for __x86_64__.
(setupM) [__x86_64__]: Add initialization for aggregated bulk
processing.
--
Intel Haswell (x86-64):
Old:
AES GCM enc | 0.990 ns/B 963.3 MiB/s 3.17 c/B
GCM dec | 0.982 ns/B 970.9 MiB/s 3.14 c/B
GCM auth | 0.711 ns/B 1340.8 MiB/s 2.28 c/B
New:
AES GCM enc | 0.535 ns/B 1783.8 MiB/s 1.71 c/B
GCM dec | 0.531 ns/B 1796.2 MiB/s 1.70 c/B
GCM auth | 0.255 ns/B 3736.4 MiB/s 0.817 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
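
The algebra behind the aggregation is the standard GHASH rearrangement
(here '+' is XOR and '*' is multiplication in GF(2^128)):

    Y_4 = ((((Y_0 + X_1)*H + X_2)*H + X_3)*H + X_4)*H
        = (Y_0 + X_1)*H^4 + X_2*H^3 + X_3*H^2 + X_4*H

The four carry-less multiplications become independent, their results
are XORed together, and only one reduction is needed per four blocks.
This requires the precomputed powers H^2..H^4, which is what the new
setupM initialization provides.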
* cipher/cipher-gcm.c (do_ghash): Mark 'inline'.
[GCM_USE_INTEL_PCLMUL] (do_ghash_pclmul): Rename to...
[GCM_USE_INTEL_PCLMUL] (gfmul_pclmul): ..this and make inline function.
(ghash) [GCM_USE_INTEL_PCLMUL]: Preload data before ghash-pclmul loop.
--
Intel Haswell:
Old:
AES GCM enc | 1.12 ns/B 853.5 MiB/s 3.58 c/B
GCM dec | 1.12 ns/B 853.4 MiB/s 3.58 c/B
GCM auth | 0.843 ns/B 1131.5 MiB/s 2.70 c/B
New:
AES GCM enc | 0.990 ns/B 963.3 MiB/s 3.17 c/B
GCM dec | 0.982 ns/B 970.9 MiB/s 3.14 c/B
GCM auth | 0.711 ns/B 1340.8 MiB/s 2.28 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/cipher-gcm.c (ghash): Add process for multiple blocks.
(gcm_bytecounter_add, gcm_add32_be128, gcm_check_datalen)
(gcm_check_aadlen_or_ivlen, do_ghash_buf): New functions.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_set_iv)
(_gcry_cipher_gcm_tag): Adjust to use above new functions and
counter mode functions for encryption/decryption.
* cipher/cipher-internal.h (gcry_cipher_handle): Remove 'length'; Add
'u_mode.gcm.(aadlen|datalen|tagiv|datalen_over_limits)'.
(_gcry_cipher_gcm_setiv): Return gcry_err_code_t.
* cipher/cipher.c (cipher_setiv): Return error code.
(_gcry_cipher_setiv): Handle error code from 'cipher_setiv'.
--
Patch changes GCM to use the counter mode code for a bulk speed-up and
also adds data length checks as given in NIST SP 800-38D section 5.2.1.1.
Bit length requirements from section 5.2.1.1:
len(plaintext) <= 2^39-256 bits == 2^36-32 bytes == 2^32-2 blocks
len(aad) <= 2^64-1 bits ~= 2^61-1 bytes
len(iv) <= 2^64-1 bits ~= 2^61-1 bytes
Intel Haswell:
Old:
AES GCM enc | 3.00 ns/B 317.4 MiB/s 9.61 c/B
GCM dec | 1.96 ns/B 486.9 MiB/s 6.27 c/B
GCM auth | 0.848 ns/B 1124.7 MiB/s 2.71 c/B
New:
AES GCM enc | 1.12 ns/B 851.8 MiB/s 3.58 c/B
GCM dec | 1.12 ns/B 853.7 MiB/s 3.57 c/B
GCM auth | 0.843 ns/B 1131.4 MiB/s 2.70 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
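
The plaintext bound above, as a direct check (a sketch over a running
64-bit byte count; the actual gcm_check_datalen works on a split
counter, and the AAD/IV bounds are checked analogously):

    #include <stdbool.h>
    #include <stdint.h>

    static bool gcm_datalen_ok (uint64_t total_bytes)
    {
      /* 2^39 - 256 bits == 2^36 - 32 bytes. */
      return total_bytes <= (((uint64_t)1 << 36) - 32);
    }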
* cipher/cipher-gcm.c (fillM): Rename...
(do_fillM): ...to this.
(ghash): Remove.
(fillM): New macro.
(GHASH): Use 'do_ghash' instead of 'ghash'.
[GCM_USE_INTEL_PCLMUL] (do_ghash_pclmul): New.
(ghash): New.
(setupM): New.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_authenticate, _gcry_cipher_gcm_setiv)
(_gcry_cipher_gcm_tag): Use 'ghash' instead of 'GHASH' and
'c->u_mode.gcm.u_tag.tag' instead of 'c->u_tag.tag'.
* cipher/cipher-internal.h (GCM_USE_INTEL_PCLMUL): New.
(gcry_cipher_handle): Move 'u_tag' and 'gcm_table' under
'u_mode.gcm'.
* configure.ac (pclmulsupport, gcry_cv_gcc_inline_asm_pclmul): New.
* src/g10lib.h (HWF_INTEL_PCLMUL): New.
* src/global.c: Add "intel-pclmul".
* src/hwf-x86.c (detect_x86_gnuc): Add check for Intel PCLMUL.
--
Speed up GCM for Intel CPUs.
Intel Haswell (x86-64):
Old:
AES GCM enc | 5.17 ns/B 184.4 MiB/s 16.55 c/B
GCM dec | 4.38 ns/B 218.0 MiB/s 14.00 c/B
GCM auth | 3.17 ns/B 300.4 MiB/s 10.16 c/B
New:
AES GCM enc | 3.01 ns/B 317.2 MiB/s 9.62 c/B
GCM dec | 1.96 ns/B 486.9 MiB/s 6.27 c/B
GCM auth | 0.848 ns/B 1124.8 MiB/s 2.71 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
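
For orientation, the instruction this code path is built on, via the
compiler intrinsic (compile with -mpclmul; a full GHASH multiply
combines four such 64x64 carry-less products, or three with Karatsuba,
plus a reduction modulo the GHASH polynomial, not shown here):

    #include <wmmintrin.h>

    /* One 64x64 -> 128-bit carry-less (polynomial) multiplication;
       selector 0x00 multiplies the low 64-bit halves of a and b. */
    static __m128i clmul_lo (__m128i a, __m128i b)
    {
      return _mm_clmulepi64_si128 (a, b, 0x00);
    }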
* cipher/cipher-gcm.c [GCM_USE_TABLES] (gcmR, ghash): Replace with new.
[GCM_USE_TABLES] [GCM_TABLES_USE_U64] (bshift, fillM, do_ghash): New.
[GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (bshift, fillM): Replace with
new.
[GCM_USE_TABLES] [!GCM_TABLES_USE_U64] (do_ghash): New.
(_gcry_cipher_gcm_tag): Remove extra memcpy to outbuf and use
buf_eq_const for comparing authentication tag.
* cipher/cipher-internal.h (gcry_cipher_handle): Different 'gcm_table'
for 32-bit and 64-bit platforms.
--
Patch improves GHASH speed.
Intel Haswell (x86-64):
Old:
GCM auth | 26.22 ns/B 36.38 MiB/s 83.89 c/B
New:
GCM auth | 3.18 ns/B 300.0 MiB/s 10.17 c/B
Intel Haswell (mingw32):
Old:
GCM auth | 27.27 ns/B 34.97 MiB/s 87.27 c/B
New:
GCM auth | 7.58 ns/B 125.7 MiB/s 24.27 c/B
Cortex-A8:
Old:
GCM auth | 231.4 ns/B 4.12 MiB/s 233.3 c/B
New:
GCM auth | 30.82 ns/B 30.94 MiB/s 31.07 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
* cipher/Makefile.am: Add 'cipher-gcm.c'.
* cipher/cipher-ccm.c (_gcry_ciphert_ccm_set_lengths)
(_gcry_cipher_ccm_authenticate, _gcry_cipher_ccm_tag)
(_gcry_cipher_ccm_encrypt, _gcry_cipher_ccm_decrypt): Change
'c->u_mode.ccm.tag' to 'c->marks.tag'.
* cipher/cipher-gcm.c: New.
* cipher/cipher-internal.h (GCM_USE_TABLES): New.
(gcry_cipher_handle): Add 'marks.tag', 'u_tag', 'length' and
'gcm_table'; Remove 'u_mode.ccm.tag'.
(_gcry_cipher_gcm_encrypt, _gcry_cipher_gcm_decrypt)
(_gcry_cipher_gcm_setiv, _gcry_cipher_gcm_authenticate)
(_gcry_cipher_gcm_get_tag, _gcry_cipher_gcm_check_tag): New.
* cipher/cipher.c (_gcry_cipher_open_internal, cipher_setkey)
(cipher_encrypt, cipher_decrypt, _gcry_cipher_authenticate)
(_gcry_cipher_gettag, _gcry_cipher_checktag): Add GCM mode handling.
* src/gcrypt.h.in (gcry_cipher_modes): Add GCRY_CIPHER_MODE_GCM.
(GCRY_GCM_BLOCK_LEN): New.
* tests/basic.c (check_gcm_cipher): New.
(check_ciphers): Add GCM check.
(check_cipher_modes): Call 'check_gcm_cipher'.
* tests/bench-slope.c (bench_gcm_encrypt_do_bench)
(bench_gcm_decrypt_do_bench, bench_gcm_authenticate_do_bench)
(gcm_encrypt_ops, gcm_decrypt_ops, gcm_authenticate_ops): New.
(cipher_modes): Add GCM enc/dec/auth.
(cipher_bench_one): Limit GCM to block ciphers with 16 byte block-size.
* tests/benchmark.c (cipher_bench): Add GCM.
--
Currently it is still quite slow.
Still no support for generate_iv(). Is it really necessary?
TODO: Merge/reuse cipher-internal state used by CCM.
Changelog entry will be present in final patch submission.
Changes since v1:
- 6x-7x speedup.
- added bench-slope support
Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
[jk: mangle new file through 'indent -nut']
[jk: few fixes]
[jk: changelog]