| Commit message | Author | Age | Files | Lines |
| |
* cipher/chacha20.c (chacha20_do_setkey) [USE_PPC_VEC]: Enable
P10 assembly for HWF_PPC_ARCH_3_00 if ENABLE_FORCE_SOFT_HWFEATURES is
defined.
* cipher/poly1305.c (poly1305_init) [POLY1305_USE_PPC_VEC]: Likewise.
* cipher/rijndael.c (do_setkey) [USE_PPC_CRYPTO_WITH_PPC9LE]: Likewise.
---
This change allows testing P10 implementations with P9 and with QEMU-PPC.
GnuPG-bug-id: 6006
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* configure.ac: Added chacha20 and poly1305 assembly implementations.
* cipher/chacha20-p10le-8x.s: (New) - support 8 blocks (512 bytes)
unrolling.
* cipher/poly1305-p10le.s: (New) - support 4 blocks (128 bytes)
unrolling.
* cipher/Makefile.am: Added new chacha20 and poly1305 files.
* cipher/chacha20.c: Added PPC p10 le support for 8x chacha20.
* cipher/poly1305.c: Added PPC p10 le support for 4x poly1305.
* cipher/poly1305-internal.h: Added PPC p10 le support for poly1305.
---
GnuPG-bug-id: 6006
Signed-off-by: Danny Tsen <dtsen@us.ibm.com>
[jk: cosmetic changes to C code]
[jk: fix building on ppc64be]
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* src/gcrypt.h.in (GCRY_KDF_ONESTEP_KDF_MAC): New.
* cipher/kdf.c (onestep_kdf_mac_open, onestep_kdf_mac_compute): New.
(onestep_kdf_mac_final, onestep_kdf_mac_close): New.
(_gcry_kdf_open, _gcry_kdf_compute, _gcry_kdf_final, _gcry_kdf_close):
Add support for GCRY_KDF_ONESTEP_KDF_MAC.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/gcrypt.h.in (GCRY_KDF_ONESTEP_KDF): New.
* cipher/kdf.c (onestep_kdf_open, onestep_kdf_compute): New.
(onestep_kdf_final): New.
(_gcry_kdf_open, _gcry_kdf_compute, _gcry_kdf_final): Add
GCRY_KDF_ONESTEP_KDF support.
* tests/t-kdf.c (check_onestep_kdf): Add the test.
(main): Call check_onestep_kdf.
--
GnuPG-bug-id: 5964
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/gcrypt.h.in (struct gcry_thread_cbs): Since it is no longer used,
even internally, use _GCRY_GCC_ATTR_DEPRECATED instead.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/secmem.c [__riscos__]: Remove.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/secmem.c (lock_pool_pages): Use ERR only for the return value
from mlock.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/secmem.c (lock_pool_pages): Remove escalation of the capability.
--
With CAP_SETPCAP, this might have made sense before Linux 2.6.24, when
file capabilities were not supported, but it no longer does.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* tests/basic.c (check_ocb_cipher_checksum): Check the right value for
errors.
--
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
| |
* tests/aeswrap.c (check_one_with_padding): Free hd on error paths
* tests/basic.c (check_ccm_cipher): Free context on error paths
(check_ocb_cipher_checksum): Ditto.
(do_check_xts_cipher): Ditto.
(check_gost28147_cipher_basic): Ditto.
* tests/bench-slope.c (bench_ecc_init): Free memory on invalid input.
* tests/t-cv25519.c (test_it): Free memory on error path
* tests/t-dsa.c (hex2buffer): Free memory on error path
* tests/t-ecdsa.c (hex2buffer): Free memory on error path
(one_test_sexp): Cleanup memory on exit
* tests/t-mpi-point.c (check_ec_mul): Free memory on error
(check_ec_mul_reduction): Ditto
* tests/t-rsa-15.c (hex2buffer): Ditto
* tests/t-rsa-pss.c (hex2buffer): Ditto
* tests/t-x448.c (test_it): Free memory on error path
* tests/testdrv.c (my_spawn): Free memory on error paths
--
GnuPG-bug-id: 5973
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
| |
* cipher/rsa.c (rsa_check_keysize): Formatting.
(rsa_check_verify_keysize): New function.
(rsa_verify): Allow using smaller keys for verification.
--
GnuPG-bug-id: 5975
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
| |
* src/gcrypt-int.h (_gcry_kdf_compute): Return gcry_err_code_t.
--
GnuPG-bug-id: 5980
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* mpi/longlong.h [__hppa] (udiv_qrnnd): Only define
when assembler is enabled.
--
GnuPG-bug-id: 5976
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* cipher/asm-common-aarch64.h (GET_DATA_POINTER): Remove.
(GET_LOCAL_POINTER): New.
* cipher/camellia-aarch64.S: Use GET_LOCAL_POINTER instead of ADR
instruction directly.
* cipher/chacha20-aarch64.S: Use GET_LOCAL_POINTER instead of
GET_DATA_POINTER.
* cipher/cipher-gcm-armv8-aarch64-ce.S: Likewise.
* cipher/crc-armv8-aarch64-ce.S: Likewise.
* cipher/sha1-armv8-aarch64-ce.S: Likewise.
* cipher/sha256-armv8-aarch64-ce.S: Likewise.
* cipher/sm3-aarch64.S: Likewise.
* cipher/sm3-armv8-aarch64-ce.S: Likewise.
* cipher/sm4-aarch64.S: Likewise.
---
Switch to use ADR instead of ADRP/LDR or ADRP/ADD for getting
data pointers within assembly files. ADR is more portable across
targets and does not require labels to be declared in GOT tables.
Reviewed-and-tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Remove 'cipher-selftest.c' and 'cipher-selftest.h'.
* cipher/cipher-selftest.c: Remove (refactor these tests to
tests/basic.c).
* cipher/cipher-selftest.h: Remove.
* cipher/blowfish.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/cast5.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/des.c (bulk_selftest_setkey, selftest_ctr, selftest_cbc)
(selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/rijndael.c (selftest_basic_128, selftest_basic_192)
(selftest_basic_256): Allocate context from stack instead of heap and
handle alignment manually.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/serpent.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/sm4.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/twofish.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* tests/basic.c (buf_xor, cipher_cbc_bulk_test, buf_xor_2dst)
(cipher_cfb_bulk_test, cipher_ctr_bulk_test): New.
(check_ciphers): Run cipher_cbc_bulk_test(), cipher_cfb_bulk_test() and
cipher_ctr_bulk_test() for block ciphers.
---
CBC/CFB/CTR bulk self-tests are quite computationally heavy and
slow down use cases where an application opens a cipher context once,
does its processing and exits. A better place for these tests is
`tests/basic`.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
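The entry above mentions allocating a self-test context on the stack and handling alignment manually. A minimal sketch of that idea follows; `toy_ctx_t`, `CTX_ALIGN`, `align_up` and `selftest_ctx_on_stack` are hypothetical names for illustration and are not taken from `cipher/rijndael.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CTX_ALIGN 16 /* assumed alignment requirement of the context */

/* Hypothetical stand-in for a cipher context. */
typedef struct {
  uint32_t keyschedule[60];
  int rounds;
} toy_ctx_t;

/* Round pointer P up to the next multiple of ALIGN (a power of two). */
void *align_up(void *p, size_t align)
{
  uintptr_t u = (uintptr_t)p;

  assert(align != 0 && (align & (align - 1)) == 0);
  return (void *)((u + align - 1) & ~(uintptr_t)(align - 1));
}

/* Carve an aligned context out of an over-sized stack buffer.
   Returns 1 when the context is aligned and usable. */
int selftest_ctx_on_stack(void)
{
  /* Reserve CTX_ALIGN - 1 spare bytes so an aligned start always fits. */
  unsigned char ctxmem[sizeof(toy_ctx_t) + CTX_ALIGN - 1];
  toy_ctx_t *ctx = align_up(ctxmem, CTX_ALIGN);
  int ok;

  memset(ctx, 0, sizeof *ctx);
  ctx->rounds = 10;
  ok = ((uintptr_t)ctx % CTX_ALIGN) == 0 && ctx->rounds == 10;

  /* Real code would wipe with a routine the compiler cannot elide. */
  memset(ctxmem, 0, sizeof ctxmem);
  return ok;
}
```

Over-allocating by `CTX_ALIGN - 1` bytes avoids a heap allocation while still satisfying the alignment that vectorized code paths typically expect.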
| |
* cipher/Makefile.am: Add 'camellia-gfni-avx512-amd64.S'.
* cipher/bulkhelp.h (bulk_ocb_prepare_L_pointers_array_blk64): New.
* cipher/camellia-aesni-avx2-amd64.h: Rename internal functions from
"__camellia_???" to "FUNC_NAME(???)"; Minor changes to comments.
* cipher/camellia-gfni-avx512-amd64.S: New.
* cipher/camellia-glue.c (USE_GFNI_AVX512): New.
(CAMELLIA_context): Add 'use_gfni_avx512'.
(_gcry_camellia_gfni_avx512_ctr_enc, _gcry_camellia_gfni_avx512_cbc_dec)
(_gcry_camellia_gfni_avx512_cfb_dec, _gcry_camellia_gfni_avx512_ocb_enc)
(_gcry_camellia_gfni_avx512_ocb_dec)
(_gcry_camellia_gfni_avx512_enc_blk64)
(_gcry_camellia_gfni_avx512_dec_blk64, avx512_burn_stack_depth): New.
(camellia_setkey): Use GFNI/AVX512 if supported by CPU.
(camellia_encrypt_blk1_64, camellia_decrypt_blk1_64): New.
(_gcry_camellia_ctr_enc, _gcry_camellia_cbc_dec, _gcry_camellia_cfb_dec)
(_gcry_camellia_ocb_crypt) [USE_GFNI_AVX512]: Add GFNI/AVX512 code path.
(_gcry_camellia_xts_crypt): Change parallel block size from 32 to 64.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Increase test
block size.
* cipher/chacha20-amd64-avx512.S: Clear k-mask registers with xor.
* cipher/poly1305-amd64-avx512.S: Likewise.
* cipher/sha512-avx512-amd64.S: Likewise.
---
Benchmark on Intel i3-1115G4 (tigerlake):
Before (GFNI/AVX2):
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC dec     | 0.356 ns/B     2679 MiB/s     1.46 c/B     4089
CFB dec     | 0.374 ns/B     2547 MiB/s     1.53 c/B     4089
CTR enc     | 0.409 ns/B     2332 MiB/s     1.67 c/B     4089
CTR dec     | 0.406 ns/B     2347 MiB/s     1.66 c/B     4089
XTS enc     | 0.430 ns/B     2216 MiB/s     1.76 c/B     4090
XTS dec     | 0.433 ns/B     2201 MiB/s     1.77 c/B     4090
OCB enc     | 0.460 ns/B     2071 MiB/s     1.88 c/B     4089
OCB dec     | 0.492 ns/B     1939 MiB/s     2.01 c/B     4089
After (GFNI/AVX512):
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC dec     | 0.207 ns/B     4600 MiB/s     0.827 c/B    3989
CFB dec     | 0.207 ns/B     4610 MiB/s     0.825 c/B    3989
CTR enc     | 0.218 ns/B     4382 MiB/s     0.868 c/B    3990
CTR dec     | 0.217 ns/B     4389 MiB/s     0.867 c/B    3990
XTS enc     | 0.330 ns/B     2886 MiB/s     1.35 c/B     4097±4
XTS dec     | 0.328 ns/B     2904 MiB/s     1.35 c/B     4097±3
OCB enc     | 0.246 ns/B     3879 MiB/s     0.981 c/B    3990
OCB dec     | 0.247 ns/B     3855 MiB/s     0.987 c/B    3990
CBC dec: 70% faster
CFB dec: 80% faster
CTR: 87% faster
XTS: 31% faster
OCB: 92% faster
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* mpi/mpih-const-time.c (_gcry_mpih_cmp_ui): Compare 64-bit
value correctly.
--
Reported-by: Guido Vranken <guidovranken@gmail.com>
GnuPG-bug-id: 5970
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
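The log entry does not show the fix itself. As an illustration of the kind of routine involved, here is a minimal sketch of a branch-free comparison between a multi-limb number and a single 64-bit word. The names `limb_t`, `ct_is_nonzero`, `ct_lt` and `ct_cmp_ui` are hypothetical and do not reflect the actual code in `mpi/mpih-const-time.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t; /* assumed 64-bit limb */

/* 1 if W is nonzero, else 0; no data-dependent branch. */
static limb_t ct_is_nonzero(limb_t w)
{
  return (w | (0 - w)) >> 63;
}

/* 1 if A < B, else 0: the borrow out of A - B, computed without
   branching and valid for the full 64-bit range. */
static limb_t ct_lt(limb_t a, limb_t b)
{
  return ((~a & b) | ((~a | b) & (a - b))) >> 63;
}

/* Compare the little-endian limb array UP[0..NLIMBS-1] with V.
   Returns 1 if greater, 0 if equal, -1 if less.  The limb count is
   treated as public; only the limb values are handled branch-free. */
int ct_cmp_ui(const limb_t *up, size_t nlimbs, limb_t v)
{
  limb_t high = 0;
  limb_t gt, lt;
  size_t i;

  assert(nlimbs > 0);

  for (i = 1; i < nlimbs; i++)
    high |= up[i];
  high = ct_is_nonzero(high);

  gt = ct_lt(v, up[0]) | high;        /* any high limb set means greater */
  lt = ct_lt(up[0], v) & (high ^ 1);  /* less only if high limbs are zero */
  return (int)gt - (int)lt;
}
```

Comparing through the borrow bit rather than through a truncated difference keeps the result correct when the operands differ only above bit 31.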
| |
* random/jitterentropy-base-user.h [HAVE_W32_SYSTEM] (jent_ncpu):
Implement.
* random/rndjent.c (_WIN32_WINNT): Define for GetNativeSystemInfo.
(EOPNOTSUPP): Define when not available.
--
Reported-by: Eli Zaretskii
GnuPG-bug-id: 5891
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* tests/basic.c (check_one_cipher_core): Add 'split_mode' parameter and
handling for split_mode==1.
(check_one_cipher): Use split_mode==0 for existing check_one_cipher_core
calls; Add new large buffer check with split_mode==1.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
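The exact meaning of `split_mode==1` is not spelled out in the entry. One plausible reading is that a large buffer is fed to the cipher in several pieces and the result is compared with a single-call run. The sketch below shows that pattern with a toy keystream generator; `toy_stream_t`, `toy_crypt` and `check_split_matches_oneshot` are hypothetical and unrelated to the real `tests/basic.c` helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_LEN 4096

/* Toy keystream generator standing in for a real cipher handle. */
typedef struct { uint32_t state; } toy_stream_t;

static uint8_t toy_next(toy_stream_t *s)
{
  s->state = s->state * 1664525u + 1013904223u;
  return (uint8_t)(s->state >> 24);
}

static void toy_crypt(toy_stream_t *s, uint8_t *out,
                      const uint8_t *in, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    out[i] = in[i] ^ toy_next(s);
}

/* Process IN once in a single call and once in uneven pieces
   (1, 3, 7, 15, ... bytes); return 1 if both outputs agree. */
int check_split_matches_oneshot(const uint8_t *in, size_t len, uint32_t key)
{
  uint8_t oneshot[MAX_LEN], split[MAX_LEN];
  toy_stream_t a = { key }, b = { key };
  size_t pos = 0, piece = 1;

  assert(len <= MAX_LEN);

  toy_crypt(&a, oneshot, in, len);
  while (pos < len)
    {
      size_t n = (len - pos < piece) ? len - pos : piece;

      toy_crypt(&b, split + pos, in + pos, n);
      pos += n;
      piece = piece * 2 + 1;
    }
  return memcmp(oneshot, split, len) == 0;
}
```

Uneven piece sizes exercise the paths where a bulk implementation has to carry partial-block state between calls.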
| |
* cipher/sm4-aesni-avx2-amd64.S: Remove unnecessary vzeroupper at
function entries.
(_gcry_sm4_aesni_avx2_crypt_blk1_16): New.
* cipher/sm4.c (_gcry_sm4_aesni_avx2_crypt_blk1_16)
(sm4_aesni_avx2_crypt_blk1_16): New.
(sm4_get_crypt_blk1_16_fn) [USE_AESNI_AVX2]: Add
'sm4_aesni_avx2_crypt_blk1_16'.
--
Benchmark on AMD Ryzen 5800X:
Before:
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 1.48 ns/B      643.2 MiB/s    7.19 c/B     4850
XTS dec | 1.48 ns/B      644.3 MiB/s    7.18 c/B     4850
After (1.37x faster):
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 1.07 ns/B      888.7 MiB/s    5.21 c/B     4850
XTS dec | 1.07 ns/B      889.4 MiB/s    5.20 c/B     4850
Benchmark on Intel i5-6200U 2.30GHz:
Before:
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 2.95 ns/B      323.0 MiB/s    8.25 c/B     2792
XTS dec | 2.95 ns/B      323.0 MiB/s    8.24 c/B     2792
After (1.64x faster):
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 1.79 ns/B      531.4 MiB/s    5.01 c/B     2791
XTS dec | 1.79 ns/B      531.6 MiB/s    5.01 c/B     2791
Reviewed-and-tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add 'sm4-gfni-avx2-amd64.S'.
* cipher/sm4-gfni-avx2-amd64.S: New.
* cipher/sm4.c (USE_GFNI_AVX2): New.
(SM4_context): Add 'use_gfni_avx2'.
(crypt_blk1_8_fn_t): Rename to...
(crypt_blk1_16_fn_t): ...this.
(sm4_aesni_avx_crypt_blk1_8): Rename to...
(sm4_aesni_avx_crypt_blk1_16): ...this and add handling for 9 to 16
input blocks.
(_gcry_sm4_gfni_avx_expand_key, _gcry_sm4_gfni_avx2_ctr_enc)
(_gcry_sm4_gfni_avx2_cbc_dec, _gcry_sm4_gfni_avx2_cfb_dec)
(_gcry_sm4_gfni_avx2_ocb_enc, _gcry_sm4_gfni_avx2_ocb_dec)
(_gcry_sm4_gfni_avx2_ocb_auth, _gcry_sm4_gfni_avx2_crypt_blk1_16)
(sm4_gfni_avx2_crypt_blk1_16): New.
(sm4_aarch64_crypt_blk1_8): Rename to...
(sm4_aarch64_crypt_blk1_16): ...this and add handling for 9 to 16
input blocks.
(sm4_armv8_ce_crypt_blk1_8): Rename to...
(sm4_armv8_ce_crypt_blk1_16): ...this and add handling for 9 to 16
input blocks.
(sm4_expand_key): Add GFNI/AVX2 path.
(sm4_setkey): Enable GFNI/AVX2 implementation if HW features
available; Disable AESNI implementations when GFNI implementation is
enabled.
(sm4_encrypt) [USE_GFNI_AVX2]: New.
(sm4_decrypt) [USE_GFNI_AVX2]: New.
(sm4_get_crypt_blk1_8_fn): Rename to...
(sm4_get_crypt_blk1_16_fn): ...this; Update to use *_blk1_16 functions;
Add GFNI/AVX2 selection.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth): Add GFNI/AVX2 path; Widen
generic bulk processing from 8 blocks to 16 blocks.
(_gcry_sm4_xts_crypt): Widen generic bulk processing from 8 blocks to
16 blocks.
--
Benchmark on Intel i3-1115G4 (tigerlake):
Before:
SM4      | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
ECB enc  | 10.34 ns/B     92.21 MiB/s    42.29 c/B    4089
ECB dec  | 10.34 ns/B     92.24 MiB/s    42.29 c/B    4090
CBC enc  | 11.06 ns/B     86.26 MiB/s    45.21 c/B    4090
CBC dec  | 1.13 ns/B      844.8 MiB/s    4.62 c/B     4090
CFB enc  | 11.06 ns/B     86.27 MiB/s    45.22 c/B    4090
CFB dec  | 1.13 ns/B      846.0 MiB/s    4.61 c/B     4090
CTR enc  | 1.14 ns/B      834.3 MiB/s    4.67 c/B     4089
CTR dec  | 1.14 ns/B      834.5 MiB/s    4.67 c/B     4089
XTS enc  | 1.93 ns/B      494.1 MiB/s    7.89 c/B     4090
XTS dec  | 1.94 ns/B      492.5 MiB/s    7.92 c/B     4090
OCB enc  | 1.16 ns/B      823.3 MiB/s    4.74 c/B     4090
OCB dec  | 1.16 ns/B      818.8 MiB/s    4.76 c/B     4089
OCB auth | 1.15 ns/B      831.0 MiB/s    4.69 c/B     4089
After:
SM4      | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
ECB enc  | 8.39 ns/B      113.6 MiB/s    34.33 c/B    4090
ECB dec  | 8.40 ns/B      113.5 MiB/s    34.35 c/B    4090
CBC enc  | 9.45 ns/B      101.0 MiB/s    38.63 c/B    4089
CBC dec  | 0.650 ns/B     1468 MiB/s     2.66 c/B     4090
CFB enc  | 9.44 ns/B      101.1 MiB/s    38.59 c/B    4090
CFB dec  | 0.660 ns/B     1444 MiB/s     2.70 c/B     4090
CTR enc  | 0.664 ns/B     1437 MiB/s     2.71 c/B     4090
CTR dec  | 0.664 ns/B     1437 MiB/s     2.71 c/B     4090
XTS enc  | 0.756 ns/B     1262 MiB/s     3.09 c/B     4090
XTS dec  | 0.757 ns/B     1260 MiB/s     3.10 c/B     4090
OCB enc  | 0.673 ns/B     1417 MiB/s     2.75 c/B     4090
OCB dec  | 0.675 ns/B     1413 MiB/s     2.76 c/B     4090
OCB auth | 0.672 ns/B     1418 MiB/s     2.75 c/B     4090
ECB: 1.2x faster
CBC-enc / CFB-enc: 1.17x faster
CBC-dec / CFB-dec / CTR / OCB: 1.7x faster
XTS: 2.5x faster
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/sm4.c (_gcry_sm4_xts_crypt): New.
(sm4_setkey): Set XTS bulk function.
--
Benchmark on Ryzen 5800X:
Before:
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 7.28 ns/B      131.0 MiB/s    35.31 c/B    4850
XTS dec | 7.29 ns/B      130.9 MiB/s    35.34 c/B    4850
After (4.8x faster):
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 1.49 ns/B      638.6 MiB/s    7.24 c/B     4850
XTS dec | 1.49 ns/B      639.3 MiB/s    7.24 c/B     4850
Benchmark on Intel i5-6200U 2.30GHz:
Before:
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 13.41 ns/B     71.10 MiB/s    37.45 c/B    2792
XTS dec | 13.43 ns/B     71.03 MiB/s    37.49 c/B    2792
After (4.54x faster):
SM4     | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc | 2.96 ns/B      322.7 MiB/s    8.25 c/B     2792
XTS dec | 2.96 ns/B      322.5 MiB/s    8.26 c/B     2792
Reviewed-and-tested-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/bulkhelp.h (bulk_xts_crypt_128): New.
* cipher/camellia-glue.c (_gcry_camellia_xts_crypt): New.
(camellia_setkey) [USE_AESNI_AVX2]: Set XTS bulk function if AVX2
implementation is available.
--
Benchmark on AMD Ryzen 5800X:
Before:
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc     | 3.79 ns/B      251.8 MiB/s    18.37 c/B    4850
XTS dec     | 3.77 ns/B      253.2 MiB/s    18.27 c/B    4850
After (6.8x faster):
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
XTS enc     | 0.554 ns/B     1720 MiB/s     2.69 c/B     4850
XTS dec     | 0.541 ns/B     1762 MiB/s     2.63 c/B     4850
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
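A bulk XTS helper has to produce one tweak per block before it can hand several blocks to a parallel cipher routine. The sketch below shows the standard XTS tweak update (multiplication by x in GF(2^128), little-endian byte order, reduction constant 0x87) and a loop that precomputes tweaks. The function names are hypothetical and the code is not taken from `bulk_xts_crypt_128`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define XTS_BLOCK 16

/* Multiply the 128-bit tweak by x in GF(2^128), as XTS defines it:
   shift left by one bit across the little-endian byte array and fold
   the carried-out bit back in with the constant 0x87. */
void xts_mul_x(uint8_t t[XTS_BLOCK])
{
  uint8_t carry = 0;
  int i;

  for (i = 0; i < XTS_BLOCK; i++)
    {
      uint8_t next = t[i] >> 7;

      t[i] = (uint8_t)((t[i] << 1) | carry);
      carry = next;
    }
  t[0] ^= (uint8_t)(0x87 & (0 - carry)); /* branch-free conditional XOR */
}

/* Store NBLOCKS consecutive tweaks into OUT (NBLOCKS * 16 bytes) and
   leave TWEAK advanced past them, ready for the next chunk. */
void xts_gen_tweaks(uint8_t tweak[XTS_BLOCK], uint8_t *out, size_t nblocks)
{
  size_t i;

  assert(out != NULL);
  for (i = 0; i < nblocks; i++)
    {
      memcpy(out + i * XTS_BLOCK, tweak, XTS_BLOCK);
      xts_mul_x(tweak);
    }
}
```

With the tweaks laid out in a flat array, a caller can XOR them into the input blocks, run a multi-block cipher call, and XOR them into the output again.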
| |
* cipher/camellia-aesni-avx2-amd64.h: Remove unnecessary vzeroupper
from function entry.
(enc_blk1_32, dec_blk1_32): New.
* cipher/camellia-glue.c (avx_burn_stack_depth)
(avx2_burn_stack_depth): Move outside of bulk functions to deduplicate.
(camellia_setkey): Disable AESNI & VAES implementation when GFNI
implementation is enabled.
(_gcry_camellia_aesni_avx2_enc_blk1_32)
(_gcry_camellia_aesni_avx2_dec_blk1_32)
(_gcry_camellia_vaes_avx2_enc_blk1_32)
(_gcry_camellia_vaes_avx2_dec_blk1_32)
(_gcry_camellia_gfni_avx2_enc_blk1_32)
(_gcry_camellia_gfni_avx2_dec_blk1_32, camellia_encrypt_blk1_32)
(camellia_decrypt_blk1_32): New.
(_gcry_camellia_ctr_enc, _gcry_camellia_cbc_dec, _gcry_camellia_cfb_dec)
(_gcry_camellia_ocb_crypt, _gcry_camellia_ocb_auth): Use new bulk
processing helpers from 'bulkhelp.h' and 'camellia_encrypt_blk1_32'
and 'camellia_decrypt_blk1_32' for partial parallel processing.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/bulkhelp.h (bulk_crypt_fn_t, bulk_ctr_enc_128)
(bulk_cbc_dec_128, bulk_cfb_dec_128, bulk_ocb_crypt_128)
(bulk_ocb_auth_128): New.
* cipher/sm4.c (_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec)
(_gcry_sm4_cfb_dec, _gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth): Switch
to use helper functions from 'bulkhelp.h'.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
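The shared helpers take a per-cipher "crypt N blocks" callback and drive it chunk by chunk. The following is a rough sketch of how a CTR helper of that shape could work, assuming a callback that processes 1 to 16 blocks per call. The callback type, `sketch_bulk_ctr_enc` and the toy cipher are hypothetical; the real signatures in `cipher/bulkhelp.h` may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16
#define MAX_PAR 16 /* assumed maximum number of blocks per callback call */

/* Hypothetical callback type: process NBLKS (1..MAX_PAR) blocks. */
typedef void (*crypt_blk1_n_fn)(const void *ctx, uint8_t *out,
                                const uint8_t *in, size_t nblks);

/* Placeholder "cipher" (NOT secure): XOR every byte with 0x5A. */
void toy_crypt_blk1_n(const void *ctx, uint8_t *out,
                      const uint8_t *in, size_t nblks)
{
  size_t i;

  (void)ctx;
  assert(nblks >= 1 && nblks <= MAX_PAR);
  for (i = 0; i < nblks * BLK; i++)
    out[i] = in[i] ^ 0x5A;
}

/* Increment a 128-bit big-endian counter. */
static void ctr_inc(uint8_t ctr[BLK])
{
  int i;

  for (i = BLK - 1; i >= 0; i--)
    if (++ctr[i] != 0)
      break;
}

/* CTR mode over NBLKS blocks, at most MAX_PAR blocks per callback. */
void sketch_bulk_ctr_enc(const void *ctx, crypt_blk1_n_fn fn, uint8_t *out,
                         const uint8_t *in, size_t nblks, uint8_t ctr[BLK])
{
  uint8_t ks[MAX_PAR * BLK];
  size_t i;

  while (nblks > 0)
    {
      size_t cur = nblks < MAX_PAR ? nblks : MAX_PAR;

      for (i = 0; i < cur; i++) /* lay out the counter blocks */
        {
          memcpy(ks + i * BLK, ctr, BLK);
          ctr_inc(ctr);
        }
      fn(ctx, ks, ks, cur);     /* encrypt the counters in place */
      for (i = 0; i < cur * BLK; i++)
        out[i] = in[i] ^ ks[i];

      out += cur * BLK;
      in += cur * BLK;
      nblks -= cur;
    }
}
```

Keeping the mode logic in one helper means each cipher only has to supply its multi-block primitive, which is the deduplication the entry describes.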
| |
* cipher/bulkhelp.h: New.
* cipher/camellia-glue.c (_gcry_camellia_ocb_crypt)
(_gcry_camellia_ocb_auth): Use new
`bulk_ocb_prepare_L_pointers_array_blkXX` function for OCB L pointer
array setup.
* cipher/serpent.c (_gcry_serpent_ocb_crypt)
(_gcry_serpent_ocb_auth): Likewise.
* cipher/sm4.c (_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth): Likewise.
* cipher/twofish.c (_gcry_twofish_ocb_crypt)
(_gcry_twofish_ocb_auth): Likewise.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/sm4.c (crypt_blk1_8_fn_t): New.
(sm4_aesni_avx_crypt_blk1_8, sm4_aarch64_crypt_blk1_8)
(sm4_armv8_ce_crypt_blk1_8, sm4_crypt_blocks): Change first parameter
to void pointer type.
(sm4_get_crypt_blk1_8_fn): New.
(_gcry_sm4_ctr_enc, _gcry_sm4_cbc_dec, _gcry_sm4_cfb_dec)
(_gcry_sm4_ocb_crypt, _gcry_sm4_ocb_auth): Use sm4_get_crypt_blk1_8_fn
for selecting crypt_blk1_8.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Add "camellia-gfni-avx2-amd64.S".
* cipher/camellia-aesni-avx2-amd64.h [CAMELLIA_GFNI_BUILD]: Add GFNI
support.
* cipher/camellia-gfni-avx2-amd64.S: New.
* cipher/camellia-glue.c (USE_GFNI_AVX2): New.
(CAMELLIA_context) [USE_AESNI_AVX2]: New member "use_gfni_avx2".
[USE_GFNI_AVX2] (_gcry_camellia_gfni_avx2_ctr_enc)
(_gcry_camellia_gfni_avx2_cbc_dec, _gcry_camellia_gfni_avx2_cfb_dec)
(_gcry_camellia_gfni_avx2_ocb_enc, _gcry_camellia_gfni_avx2_ocb_dec)
(_gcry_camellia_gfni_avx2_ocb_auth): New.
(camellia_setkey) [USE_GFNI_AVX2]: Enable GFNI if supported by HW.
(_gcry_camellia_ctr_enc) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_cbc_dec) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_cfb_dec) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_ocb_crypt) [USE_GFNI_AVX2]: Add GFNI support.
(_gcry_camellia_ocb_auth) [USE_GFNI_AVX2]: Add GFNI support.
* configure.ac: Add "camellia-gfni-avx2-amd64.lo".
--
Benchmark on Intel Core i3-1115G4 (tigerlake):
Before (VAES/AVX2 implementation):
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC dec     | 0.579 ns/B     1646 MiB/s     2.37 c/B     4090
CFB dec     | 0.579 ns/B     1648 MiB/s     2.37 c/B     4089
CTR enc     | 0.586 ns/B     1628 MiB/s     2.40 c/B     4090
CTR dec     | 0.587 ns/B     1626 MiB/s     2.40 c/B     4090
OCB enc     | 0.607 ns/B     1570 MiB/s     2.48 c/B     4089
OCB dec     | 0.611 ns/B     1561 MiB/s     2.50 c/B     4089
OCB auth    | 0.602 ns/B     1585 MiB/s     2.46 c/B     4089
After (~80% faster):
CAMELLIA128 | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
CBC dec     | 0.299 ns/B     3186 MiB/s     1.22 c/B     4090
CFB dec     | 0.314 ns/B     3039 MiB/s     1.28 c/B     4089
CTR enc     | 0.322 ns/B     2962 MiB/s     1.32 c/B     4090
CTR dec     | 0.321 ns/B     2970 MiB/s     1.31 c/B     4090
OCB enc     | 0.339 ns/B     2817 MiB/s     1.38 c/B     4089
OCB dec     | 0.346 ns/B     2756 MiB/s     1.41 c/B     4089
OCB auth    | 0.337 ns/B     2831 MiB/s     1.38 c/B     4089
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* configure.ac (gfnisupport, gcry_cv_gcc_inline_asm_gfni)
(ENABLE_GFNI_SUPPORT): New.
* src/g10lib.h (HWF_INTEL_GFNI): New.
* src/hwf-x86.c (detect_x86_gnuc): Add GFNI detection.
* src/hwfeatures.c (hwflist): Add "intel-gfni".
* doc/gcrypt.texi: Add "intel-gfni" to HW features list.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* tests/basic.c (check_pubkey_crypt): Expect RSA PKCS #1.5 encryption to
fail in FIPS mode. Expect failure when wrong padding is selected
* tests/pkcs1v2.c (check_v15crypt): Expect RSA PKCS #1.5 encryption to
fail in FIPS mode
--
GnuPG-bug-id: 5918
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
| |
* tests/basic.c (global): New flag FLAG_SPECIAL
(check_pubkey_crypt): Change to use bitfield flags
--
GnuPG-bug-id: 5918
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
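Moving a test table from separate integer fields to bitfield flags usually looks like the sketch below. `FLAG_SPECIAL` is named in the entry; every other flag name, all the numeric values, and the helper functions are assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical flag values; the real test file defines its own. */
enum {
  FLAG_CRYPT   = 1 << 0,
  FLAG_SIGN    = 1 << 1,
  FLAG_NOFIPS  = 1 << 2,
  FLAG_SPECIAL = 1 << 3,
  FLAG_ALL     = (1 << 4) - 1
};

/* One row of a test table: all per-key properties share one field. */
struct key_test {
  const char *name;
  unsigned int flags;
};

/* Return 1 if FLAG is set for test T. */
int test_has(const struct key_test *t, unsigned int flag)
{
  assert((t->flags & ~(unsigned int)FLAG_ALL) == 0); /* no unknown bits */
  return (t->flags & flag) != 0;
}

/* Example policy: a key marked FLAG_NOFIPS is expected to fail
   when the library runs in FIPS mode. */
int expect_failure(const struct key_test *t, int in_fips_mode)
{
  return in_fips_mode && test_has(t, FLAG_NOFIPS);
}
```

A single flags word lets new per-test properties be added without widening the table or touching every existing row.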
| |
* cipher/pubkey-util.c (_gcry_pk_util_data_to_mpi): Block PKCS #1.5
padding for encryption in FIPS mode
* cipher/rsa.c (rsa_decrypt): Block PKCS #1.5 decryption in FIPS mode
--
GnuPG-bug-id: 5918
Signed-off-by: Jakub Jelen <jjelen@redhat.com>
| |
* random/random-drbg.c (drbg_instance): New at BSS.
(_drbg_init_internal): Don't allocate at secure memory.
(_gcry_rngdrbg_close_fds): Follow the change.
--
GnuPG-bug-id: 5933
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* cipher/rsa.c (generate_fips): Use 10 for p, 20 for q.
--
Constants from FIPS 186-5-draft.
GnuPG-bug-id: 5919
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* src/secmem.c (_gcry_secmem_realloc_internal): Use offsetof.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
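For context, `offsetof` is the portable way to step back from a payload pointer to the header that precedes it, which is what a realloc routine needs in order to find the block size. The sketch below uses a hypothetical `block_t` layout and plain `malloc`; it does not reproduce the actual `memblock_t` structure or the pool allocator in `src/secmem.c`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical block layout: a small header followed by the payload.
   The union forces suitable alignment for the user data. */
typedef struct block {
  unsigned size;
  int flags;
  union { long l; double d; void *p; char c[1]; } payload;
} block_t;

#define BLOCK_HEAD_SIZE offsetof(block_t, payload)

/* Allocate N payload bytes; return a pointer to the payload. */
void *block_alloc(size_t n)
{
  size_t body = n < sizeof(((block_t *)0)->payload)
                ? sizeof(((block_t *)0)->payload) : n;
  block_t *b = malloc(BLOCK_HEAD_SIZE + body);

  if (!b)
    return NULL;
  b->size = (unsigned)n;
  b->flags = 0;
  return b->payload.c;
}

/* Recover the header from a payload pointer using offsetof rather
   than a hand-computed constant. */
block_t *block_from_ptr(void *p)
{
  assert(p != NULL);
  return (block_t *)(void *)((char *)p - offsetof(block_t, payload));
}
```

Deriving the offset from the type keeps the arithmetic right even if header fields are added or the compiler inserts padding.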
| |
* cipher/rsa.c (generate_fips): Set the least significant bit.
--
GnuPG-bug-id: 5919
Fixes-commit: 5f9b3c2e220ca6d0eaff32324a973ef67933a844
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
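Setting the least significant bit matters because an even candidate larger than 2 can never be prime, so testing one only wastes an iteration. A minimal sketch of preparing a random candidate follows, assuming a big-endian byte buffer; `prepare_candidate` is a hypothetical helper, and forcing the top bit here is an illustrative assumption rather than a description of `generate_fips`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Prepare a big-endian random buffer as a prime candidate:
   force the top bit so the value has full bit length, and force
   the low bit so the value is odd. */
void prepare_candidate(uint8_t *buf, size_t len)
{
  assert(buf != NULL && len > 0);
  buf[0] |= 0x80;       /* most significant bit */
  buf[len - 1] |= 0x01; /* least significant bit: make it odd */
}
```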
| |
* m4/Makefile.am: Remove socklen.m4 from EXTRA_DIST
--
Signed-off-by: Clemens Lang <cllang@redhat.com>
| |
* configure.ac (gl_TYPE_SOCKLEN_T): Remove.
* m4/socklen.m4: Remove.
--
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* doc/gcrypt.texi: Add sha3/sm3/sm4/sha512 to ARM hardware features.
--
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
| |
* configure.ac: Correct wrong variable names.
--
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
| |
* cipher/Makefile.am: Add 'chacha20-amd64-avx512.S'.
* cipher/chacha20-amd64-avx512.S: New.
* cipher/chacha20.c (USE_AVX512): New.
(CHACHA20_context_s): Add 'use_avx512'.
[USE_AVX512] (_gcry_chacha20_amd64_avx512_blocks16): New.
(chacha20_do_setkey) [USE_AVX512]: Setup 'use_avx512' based on
HW features.
(do_chacha20_encrypt_stream_tail) [USE_AVX512]: Use AVX512
implementation if supported.
(_gcry_chacha20_poly1305_encrypt) [USE_AVX512]: Disable stitched
chacha20-poly1305 implementations if AVX512 implementation is used.
(_gcry_chacha20_poly1305_decrypt) [USE_AVX512]: Disable stitched
chacha20-poly1305 implementations if AVX512 implementation is used.
--
Benchmark on Intel Core i3-1115G4 (tigerlake):
Before:
              | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
STREAM enc    | 0.276 ns/B     3451 MiB/s     1.13 c/B     4090
STREAM dec    | 0.284 ns/B     3359 MiB/s     1.16 c/B     4090
POLY1305 enc  | 0.411 ns/B     2320 MiB/s     1.68 c/B     4098±3
POLY1305 dec  | 0.408 ns/B     2338 MiB/s     1.67 c/B     4091±1
POLY1305 auth | 0.060 ns/B     15785 MiB/s    0.247 c/B    4090±1
After (stream 1.7x faster, poly1305-aead 1.8x faster):
              | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
STREAM enc    | 0.162 ns/B     5869 MiB/s     0.665 c/B    4092±1
STREAM dec    | 0.162 ns/B     5884 MiB/s     0.664 c/B    4096±3
POLY1305 enc  | 0.221 ns/B     4306 MiB/s     0.907 c/B    4097±3
POLY1305 dec  | 0.220 ns/B     4342 MiB/s     0.900 c/B    4096±3
POLY1305 auth | 0.060 ns/B     15797 MiB/s    0.247 c/B    4085±2
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* LICENSES: Add 3-clause BSD license for poly1305-amd64-avx512.S.
* cipher/Makefile.am: Add 'poly1305-amd64-avx512.S'.
* cipher/poly1305-amd64-avx512.S: New.
* cipher/poly1305-internal.h (POLY1305_USE_AVX512): New.
(poly1305_context_s): Add 'use_avx512'.
* cipher/poly1305.c (ASM_FUNC_ABI, ASM_FUNC_WRAPPER_ATTR): New.
[POLY1305_USE_AVX512] (_gcry_poly1305_amd64_avx512_blocks)
(poly1305_amd64_avx512_blocks): New.
(poly1305_init): Use AVX512 if HW feature available (set use_avx512).
[USE_MPI_64BIT] (poly1305_blocks): Rename to ...
[USE_MPI_64BIT] (poly1305_blocks_generic): ... this.
[USE_MPI_64BIT] (poly1305_blocks): New.
--
Patch adds AMD64 AVX512-FMA52 implementation for Poly1305.
Benchmark on Intel Core i3-1115G4 (tigerlake):
Before:
         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
POLY1305 | 0.306 ns/B     3117 MiB/s     1.25 c/B     4090
After (5.0x faster):
         | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
POLY1305 | 0.061 ns/B     15699 MiB/s    0.249 c/B    4095±3
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* doc/yat2m.c: Update.
--
Stderr output of "writing '<THE PAGE NAME>'" will be suppressed
unless --verbose is specified.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* cipher/Makefile.am: Add 'sm3-armv8-aarch64-ce.S'.
* cipher/sm3-armv8-aarch64-ce.S: New.
* cipher/sm3.c (USE_ARM_CE): New.
[USE_ARM_CE] (_gcry_sm3_transform_armv8_ce)
(do_sm3_transform_armv8_ce): New.
(sm3_init) [USE_ARM_CE]: New.
* configure.ac: Add 'sm3-armv8-aarch64-ce.lo'.
--
Benchmark on T-Head Yitian-710 2.75 GHz:
Before:
    | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
SM3 | 2.84 ns/B      335.3 MiB/s    7.82 c/B     2749
After (~55% faster):
    | nanosecs/byte  mebibytes/sec  cycles/byte  auto Mhz
SM3 | 1.84 ns/B      518.1 MiB/s    5.06 c/B     2749
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
| |
* src/hwf-ppc.c (ppc_features): Add HWF_PPC_ARCH_3_10.
--
GnuPG-bug-id: T5913
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* random/random-drbg.c (_gcry_rngdrbg_randomize): Update change of PID
detection.
--
Without this change, a child process calls drbg_reseed again and
again.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
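The behaviour described above, reseeding on every call in a child, is what happens when fork detection compares against a stored PID but never updates it. A minimal sketch of the corrected pattern follows. All `toy_*` names are hypothetical, and the PID is passed in as a parameter only so the logic can be exercised without actually forking.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

static pid_t seeded_pid;          /* PID recorded at the last (re)seed */
static unsigned int reseed_count; /* counts reseeds, for illustration */

static void toy_reseed(void)
{
  reseed_count++;
}

void toy_rng_init(void)
{
  seeded_pid = getpid();
  reseed_count = 0;
}

/* Call before producing output.  CURRENT is normally getpid().
   Returns 1 if a reseed was triggered. */
int toy_rng_check_fork(pid_t current)
{
  assert(current > 0);
  if (current == seeded_pid)
    return 0;
  /* Record the new PID first; otherwise every later call in the
     child would see a mismatch and reseed again. */
  seeded_pid = current;
  toy_reseed();
  return 1;
}

unsigned int toy_rng_reseed_count(void)
{
  return reseed_count;
}
```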
| |
* configure.ac (USE_GPGRT_CONFIG): New.
* src/Makefile.am [USE_GPGRT_CONFIG]: Conditionalize the install
of libgcrypt-config.
--
Once a system migrates to gpgrt-config and removes gpg-error-config,
libgcrypt-config will not be installed; gpgrt-config uses libgcrypt.pc
instead.
Signed-off-by: NIIBE Yutaka <gniibe@fsij.org>
| |
* tests/bench-slope.c (ECC_ALGO_BRAINP256R1): New.
(ecc_algo_fips_allowed): Support this curve.
(ecc_algo_name): Ditto.
(ecc_algo_curve): Ditto.
(ecc_nbits): Ditto.
(bench_ecc_init): Ditto.
| |
* configure.ac (gcry_cv_gcc_inline_asm_avx512): Do not use ZMM22
register; Check for broadcast memory source.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* configure.ac: Correctly set value for avx512support.
--
Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>