| Commit message | Author | Age | Files | Lines |
| |
* cipher/cipher-internal.h (cipher_bulk_ops): Add 'ecb_crypt'.
* cipher/cipher.c (do_ecb_crypt): Use bulk function if available.
* cipher/rijndael-aesni.c (do_aesni_enc_vec8): Change asm label
'.Ldeclast' to '.Lenclast'.
(_gcry_aes_aesni_ecb_crypt): New.
* cipher/rijndael-armv8-aarch32-ce.S (_gcry_aes_ecb_enc_armv8_ce)
(_gcry_aes_ecb_dec_armv8_ce): New.
* cipher/rijndael-armv8-aarch64-ce.S (_gcry_aes_ecb_enc_armv8_ce)
(_gcry_aes_ecb_dec_armv8_ce): New.
* cipher/rijndael-armv8-ce.c (_gcry_aes_ocb_enc_armv8_ce)
(_gcry_aes_ocb_dec_armv8_ce, _gcry_aes_ocb_auth_armv8_ce): Change
return value from void to size_t.
(ocb_crypt_fn_t, xts_crypt_fn_t): Remove.
(_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_xts_crypt): Remove
indirect function call; Return value from called function (allows tail
call optimization).
(_gcry_aes_armv8_ce_ocb_auth): Return value from called function (allows
tail call optimization).
(_gcry_aes_ecb_enc_armv8_ce, _gcry_aes_ecb_dec_armv8_ce)
(_gcry_aes_armv8_ce_ecb_crypt): New.
* cipher/rijndael-vaes-avx2-amd64.S
(_gcry_vaes_avx2_ecb_crypt_amd64): New.
* cipher/rijndael-vaes.c (_gcry_vaes_avx2_ecb_crypt_amd64)
(_gcry_aes_vaes_ecb_crypt): New.
* cipher/rijndael.c (_gcry_aes_aesni_ecb_crypt)
(_gcry_aes_vaes_ecb_crypt, _gcry_aes_armv8_ce_ecb_crypt): New.
(do_setkey): Setup ECB bulk function for x86 AESNI/VAES and ARM CE.
--
Benchmark on AMD Ryzen 9 7900X:
Before (OCB for reference):
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
ECB enc | 0.128 ns/B 7460 MiB/s 0.720 c/B 5634±1
ECB dec | 0.134 ns/B 7103 MiB/s 0.753 c/B 5608
OCB enc | 0.029 ns/B 32930 MiB/s 0.163 c/B 5625
OCB dec | 0.029 ns/B 32738 MiB/s 0.164 c/B 5625
After:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
ECB enc | 0.028 ns/B 33761 MiB/s 0.159 c/B 5625
ECB dec | 0.028 ns/B 33917 MiB/s 0.158 c/B 5625
GnuPG-bug-id: T6242
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
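
The "use bulk function if available" pattern from the ChangeLog entry above can be sketched roughly as follows. This is an illustration only, not the libgcrypt code: `cipher_handle_sketch`, `ecb_crypt_dispatch` and the toy XOR "cipher" are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 16

/* Hypothetical, simplified stand-ins for a cipher handle and its hooks. */
typedef void (*block_fn_t)(void *ctx, unsigned char *out,
                           const unsigned char *in);
typedef void (*ecb_bulk_fn_t)(void *ctx, unsigned char *out,
                              const unsigned char *in, size_t nblocks,
                              int encrypt);

struct cipher_handle_sketch {
  void *ctx;
  block_fn_t encrypt_block;
  block_fn_t decrypt_block;
  ecb_bulk_fn_t ecb_crypt;      /* NULL when no bulk implementation exists */
};

/* Prefer the bulk routine; otherwise fall back to one call per block. */
static void
ecb_crypt_dispatch(struct cipher_handle_sketch *h, unsigned char *out,
                   const unsigned char *in, size_t nblocks, int encrypt)
{
  assert(h && h->encrypt_block && h->decrypt_block);
  if (h->ecb_crypt) {
    h->ecb_crypt(h->ctx, out, in, nblocks, encrypt);
    return;
  }
  block_fn_t fn = encrypt ? h->encrypt_block : h->decrypt_block;
  for (size_t i = 0; i < nblocks; i++)
    fn(h->ctx, out + i * BLOCK_SIZE, in + i * BLOCK_SIZE);
}

/* Toy "cipher" (XOR with one key byte), used only to exercise the dispatch. */
static void
toy_block(void *ctx, unsigned char *out, const unsigned char *in)
{
  unsigned char k = *(unsigned char *)ctx;
  for (int i = 0; i < BLOCK_SIZE; i++)
    out[i] = in[i] ^ k;
}

static void
toy_bulk(void *ctx, unsigned char *out, const unsigned char *in,
         size_t nblocks, int encrypt)
{
  unsigned char k = *(unsigned char *)ctx;
  (void)encrypt;                /* XOR is its own inverse */
  for (size_t i = 0; i < nblocks * BLOCK_SIZE; i++)
    out[i] = in[i] ^ k;
}
```

Both paths must produce identical output; the bulk path only saves the per-block call overhead, which is the kind of gain the ECB figures above reflect.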
| |
* cipher/rijndael-armv8-ce.c (_gcry_aes_armv8_ce_setkey): New key
schedule with simplified structure and less stack usage.
* cipher/rijndael-internal.h (RIJNDAEL_context_s): Add
'keyschedule32b'.
(keyschenc32b): New.
* cipher/rijndael-ppc-common.h (vec_u32): New.
* cipher/rijndael-ppc.c (vec_bswap32_const): Remove.
(_gcry_aes_sbox4_ppc8): Optimize to emit fewer instructions.
(keysched_idx): New.
(_gcry_aes_ppc8_setkey): New key schedule with simplified structure.
* cipher/rijndael-tables.h (rcon): Remove.
* cipher/rijndael.c (sbox4): New.
(do_setkey): New key schedule with simplified structure and less
stack usage.
--
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/Makefile.am: Remove 'cipher-selftest.c' and 'cipher-selftest.h'.
* cipher/cipher-selftest.c: Remove (refactor these tests to
tests/basic.c).
* cipher/cipher-selftest.h: Remove.
* cipher/blowfish.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/camellia-glue.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/cast5.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/des.c (bulk_selftest_setkey, selftest_ctr, selftest_cbc)
(selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/rijndael.c (selftest_basic_128, selftest_basic_192)
(selftest_basic_256): Allocate context from stack instead of heap and
handle alignment manually.
(selftest_ctr_128, selftest_cbc_128, selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/serpent.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/sm4.c (selftest_ctr_128, selftest_cbc_128)
(selftest_cfb_128): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* cipher/twofish.c (selftest_ctr, selftest_cbc, selftest_cfb): Remove.
(selftest): Remove CTR/CBC/CFB bulk self-tests.
* tests/basic.c (buf_xor, cipher_cbc_bulk_test, buf_xor_2dst)
(cipher_cfb_bulk_test, cipher_ctr_bulk_test): New.
(check_ciphers): Run cipher_cbc_bulk_test(), cipher_cfb_bulk_test() and
cipher_ctr_bulk_test() for block ciphers.
--
CBC/CFB/CTR bulk self-tests are computationally heavy and slow down
use cases where an application opens a cipher context once, does its
processing, and exits. A better place for these tests is
`tests/basic`.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
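
The entry for `selftest_basic_128` and its siblings mentions allocating the context on the stack and handling alignment manually. The commit message does not show how; one common way, sketched here with hypothetical helper names (`align_pad`, `align_in_buffer`) and an assumed 16-byte requirement, is to over-allocate a byte buffer and skip forward to the first aligned address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes to skip from addr to reach the next multiple of align.
   align must be a power of two. */
static size_t
align_pad(uintptr_t addr, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (align - (addr & (align - 1))) & (align - 1);
}

/* First align-aligned address inside buf.  The caller must reserve
   align - 1 spare bytes beyond the size of the object stored there. */
static unsigned char *
align_in_buffer(unsigned char *buf, size_t align)
{
  return buf + align_pad((uintptr_t)buf, align);
}
```

A caller would declare something like `unsigned char ctxmem[sizeof(context) + 16]` as a local variable and use the pointer returned by `align_in_buffer(ctxmem, 16)` as the start of its context, avoiding a heap allocation.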
| |
* cipher/rijndael-armv8-aarch32-ce.S
(_gcry_aes_ctr32le_enc_armv8_ce): New.
* cipher/rijndael-armv8-aarch64-ce.S
(_gcry_aes_ctr32le_enc_armv8_ce): New.
* cipher/rijndael-armv8-ce.c
(_gcry_aes_ctr32le_enc_armv8_ce)
(_gcry_aes_armv8_ce_ctr32le_enc): New.
* cipher/rijndael.c
(_gcry_aes_armv8_ce_ctr32le_enc): New prototype.
(do_setkey): Add setup of 'bulk_ops->ctr32le_enc' for ARMv8-CE.
--
Benchmark on Cortex-A53 (aarch64):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV enc | 11.77 ns/B 81.03 MiB/s 7.63 c/B 647.9
GCM-SIV dec | 11.92 ns/B 79.98 MiB/s 7.73 c/B 647.9
GCM-SIV auth | 2.99 ns/B 318.9 MiB/s 1.94 c/B 648.0
After (~2.4x faster):
AES | nanosecs/byte mebibytes/sec cycles/byte auto Mhz
GCM-SIV enc | 4.66 ns/B 204.5 MiB/s 3.02 c/B 647.9
GCM-SIV dec | 4.82 ns/B 198.0 MiB/s 3.12 c/B 647.9
GCM-SIV auth | 3.00 ns/B 318.4 MiB/s 1.94 c/B 648.0
GnuPG-bug-id: T4485
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
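
For readers unfamiliar with the 'ctr32le' mode used by GCM-SIV: the counter is the first 32 bits of the 16-byte counter block, read as a little-endian integer and incremented modulo 2^32, while the remaining 12 bytes stay fixed. A minimal sketch of that increment follows; `ctr32le_inc` is a hypothetical name, not the libgcrypt function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Increment the little-endian 32-bit counter held in bytes 0..3 of a
   16-byte counter block, wrapping at 2^32.  Bytes 4..15 are untouched. */
static void
ctr32le_inc(unsigned char ctr[16])
{
  assert(ctr != NULL);
  uint32_t c = (uint32_t)ctr[0]
             | ((uint32_t)ctr[1] << 8)
             | ((uint32_t)ctr[2] << 16)
             | ((uint32_t)ctr[3] << 24);
  c++;                          /* unsigned arithmetic wraps by definition */
  ctr[0] = (unsigned char)(c & 0xff);
  ctr[1] = (unsigned char)((c >> 8) & 0xff);
  ctr[2] = (unsigned char)((c >> 16) & 0xff);
  ctr[3] = (unsigned char)((c >> 24) & 0xff);
}
```

A bulk implementation such as the one added here would generate several consecutive counter blocks at once and encrypt them in parallel rather than calling a helper like this per block.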
| |
* cipher/rijndael-aesni.c
(_gcry_aes_aesni_prepare_decryption): Rename...
(do_aesni_prepare_decryption): .. to this.
(_gcry_aes_aesni_prepare_decryption): New.
(_gcry_aes_aesni_cfb_enc, _gcry_aes_aesni_cbc_enc)
(_gcry_aes_aesni_ctr_enc, _gcry_aes_aesni_cfb_dec)
(_gcry_aes_aesni_cbc_dec): Reorder parameters to match bulk
operations.
(_gcry_aes_aesni_cbc_dec, aesni_ocb_dec)
(_gcry_aes_aesni_xts_dec): Check and prepare decryption.
(_gcry_aes_aesni_ocb_crypt, _gcry_aes_aesni_ocb_auth): Change return
type to size_t.
* cipher/rijndael-armv8-ce.c
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec): Reorder parameters to match bulk
operations.
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_xts_dec): Check and prepare decryption.
(_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Change
return type to size_t.
* cipher/rijndael-ssse3-amd64.c
(_gcry_ssse3_prepare_decryption): Rename...
(do_ssse3_prepare_decryption): .. to this.
(_gcry_ssse3_prepare_decryption): New.
(_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc)
(_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec): Reorder parameters to match bulk
operations.
(_gcry_aes_ssse3_cbc_dec, ssse3_ocb_dec): Check and prepare decryption.
(_gcry_aes_ssse3_ocb_crypt, _gcry_aes_ssse3_ocb_auth): Change return
type to size_t.
* cipher/rijndael.c
(_gcry_aes_aesni_cfb_enc, _gcry_aes_aesni_cbc_enc)
(_gcry_aes_aesni_ctr_enc, _gcry_aes_aesni_cfb_dec)
(_gcry_aes_aesni_cbc_dec, _gcry_aes_aesni_ocb_crypt)
(_gcry_aes_aesni_ocb_auth, _gcry_aes_aesni_xts_crypt)
(_gcry_aes_ssse3_cfb_enc, _gcry_aes_ssse3_cbc_enc)
(_gcry_aes_ssse3_ctr_enc, _gcry_aes_ssse3_cfb_dec)
(_gcry_aes_ssse3_cbc_dec, _gcry_aes_ssse3_ocb_crypt)
(_gcry_aes_ssse3_ocb_auth, _gcry_aes_ssse3_xts_crypt)
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_ocb_auth, _gcry_aes_armv8_ce_xts_crypt): Change
prototypes to match bulk operations.
(do_setkey): Setup bulk operations with optimized implementations.
(_gcry_aes_cfb_enc, _gcry_aes_cbc_enc, _gcry_aes_ctr_enc)
(_gcry_aes_cfb_dec, _gcry_aes_cbc_dec, _gcry_aes_ocb_crypt)
(_gcry_aes_ocb_auth, _gcry_aes_xts_crypt): Update usage to match new
prototypes; avoid prefetch and decryption preparation on optimized code
paths.
--
Replace the bulk operation functions of the cipher object with faster
versions to reduce per-call overhead.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
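
The "Check and prepare decryption" entries above describe deferring the decryption key setup until a decryption routine first needs it. A rough sketch of that lazy pattern, using hypothetical names and a call counter that exists only for illustration:

```c
#include <assert.h>
#include <stddef.h>

struct aes_ctx_sketch {
  int decryption_prepared;      /* set once decryption round keys exist */
  int prepare_calls;            /* instrumentation for this sketch only */
};

static void
prepare_decryption_sketch(struct aes_ctx_sketch *ctx)
{
  /* A real implementation would derive the decryption round keys here. */
  ctx->prepare_calls++;
}

/* Run the (comparatively costly) preparation at most once per key. */
static void
check_decryption_preparation_sketch(struct aes_ctx_sketch *ctx)
{
  assert(ctx != NULL);
  if (!ctx->decryption_prepared) {
    prepare_decryption_sketch(ctx);
    ctx->decryption_prepared = 1;
  }
}

/* Each decrypting bulk routine performs the check itself, so the generic
   wrapper does not need to do it before dispatching. */
static size_t
cbc_dec_sketch(struct aes_ctx_sketch *ctx, size_t nblocks)
{
  check_decryption_preparation_sketch(ctx);
  /* Block processing omitted. */
  return nblocks;
}
```

Contexts that only ever encrypt never pay for the decryption key schedule under this scheme.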
| |
* cipher/rijndael-armv8-aarch32-ce.S (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce): New.
* cipher/rijndael-armv8-aarch64-ce.S (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce): New.
* cipher/rijndael-armv8-ce.c (_gcry_aes_xts_enc_armv8_ce)
(_gcry_aes_xts_dec_armv8_ce, xts_crypt_fn_t)
(_gcry_aes_armv8_ce_xts_crypt): New.
* cipher/rijndael.c (_gcry_aes_armv8_ce_xts_crypt): New.
(_gcry_aes_xts_crypt) [USE_ARM_CE]: New.
--
Benchmark on Cortex-A53 (AArch64, 1152 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 4.88 ns/B 195.5 MiB/s 5.62 c/B
XTS dec | 4.94 ns/B 192.9 MiB/s 5.70 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 5.55 ns/B 171.8 MiB/s 6.39 c/B
XTS dec | 5.61 ns/B 169.9 MiB/s 6.47 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 6.22 ns/B 153.3 MiB/s 7.17 c/B
XTS dec | 6.29 ns/B 151.7 MiB/s 7.24 c/B
=
After (~2.6x faster):
AES | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 1.83 ns/B 520.9 MiB/s 2.11 c/B
XTS dec | 1.82 ns/B 524.9 MiB/s 2.09 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 1.97 ns/B 483.3 MiB/s 2.27 c/B
XTS dec | 1.96 ns/B 486.9 MiB/s 2.26 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 2.11 ns/B 450.9 MiB/s 2.44 c/B
XTS dec | 2.10 ns/B 453.8 MiB/s 2.42 c/B
=
Benchmark on Cortex-A53 (AArch32, 1152 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 6.52 ns/B 146.2 MiB/s 7.51 c/B
XTS dec | 6.57 ns/B 145.2 MiB/s 7.57 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 7.10 ns/B 134.3 MiB/s 8.18 c/B
XTS dec | 7.11 ns/B 134.2 MiB/s 8.19 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 7.30 ns/B 130.7 MiB/s 8.41 c/B
XTS dec | 7.38 ns/B 129.3 MiB/s 8.50 c/B
=
After (~2.7x faster):
AES | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 2.33 ns/B 409.6 MiB/s 2.68 c/B
XTS dec | 2.35 ns/B 405.3 MiB/s 2.71 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 2.53 ns/B 377.6 MiB/s 2.91 c/B
XTS dec | 2.54 ns/B 375.5 MiB/s 2.93 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
XTS enc | 2.75 ns/B 346.8 MiB/s 3.17 c/B
XTS dec | 2.76 ns/B 345.2 MiB/s 3.18 c/B
=
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
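
As background for the XTS routines above: between consecutive blocks, XTS multiplies the 128-bit tweak by x in GF(2^128), treating the tweak as a little-endian value and reducing with the constant 0x87. The following is a portable sketch of that single step, not the assembly added by this commit; `xts_mul_x` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply a 16-byte XTS tweak by x in GF(2^128).  Byte 0 is the least
   significant byte; a carry out of bit 127 is folded back in as 0x87. */
static void
xts_mul_x(unsigned char t[16])
{
  assert(t != NULL);
  unsigned int carry = t[15] >> 7;
  /* Walk from the top byte down so t[i - 1] is still unshifted when read. */
  for (int i = 15; i > 0; i--)
    t[i] = (unsigned char)((t[i] << 1) | (t[i - 1] >> 7));
  t[0] = (unsigned char)((t[0] << 1) ^ (carry ? 0x87 : 0x00));
}
```

Vectorised implementations typically compute several successive tweaks up front so that multiple blocks can be processed in parallel.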
| |
* cipher/rijndael-armv8-aarch32-ce.S: Add OCB 'L_{ntz(i)}' calculation.
* cipher/rijndael-armv8-aarch64-ce.S: Ditto.
* cipher/rijndael-armv8-ce.c (_gcry_aes_ocb_enc_armv8_ce)
(_gcry_aes_ocb_dec_armv8_ce, _gcry_aes_ocb_auth_armv8_ce)
(ocb_crypt_fn_t): Update arguments.
(_gcry_aes_armv8_ce_ocb_crypt, _gcry_aes_armv8_ce_ocb_auth): Remove
'ocb_get_l' handling and the splitting of input into 32-block chunks;
pass full buffers to assembly instead.
--
Performance on Cortex-A53 (AArch32):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
OCB enc | 1.63 ns/B 583.8 MiB/s 1.88 c/B
OCB dec | 1.67 ns/B 572.1 MiB/s 1.92 c/B
OCB auth | 1.33 ns/B 717.1 MiB/s 1.53 c/B
After (~12% faster):
AES | nanosecs/byte mebibytes/sec cycles/byte
OCB enc | 1.47 ns/B 650.2 MiB/s 1.69 c/B
OCB dec | 1.48 ns/B 644.5 MiB/s 1.70 c/B
OCB auth | 1.19 ns/B 798.2 MiB/s 1.38 c/B
Performance on Cortex-A53 (AArch64):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
OCB enc | 1.29 ns/B 738.5 MiB/s 1.49 c/B
OCB dec | 1.32 ns/B 723.5 MiB/s 1.52 c/B
OCB auth | 1.15 ns/B 827.0 MiB/s 1.33 c/B
After (~8% faster):
AES | nanosecs/byte mebibytes/sec cycles/byte
OCB enc | 1.21 ns/B 789.1 MiB/s 1.39 c/B
OCB dec | 1.21 ns/B 789.2 MiB/s 1.39 c/B
OCB auth | 1.10 ns/B 867.0 MiB/s 1.27 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
| |
* cipher/cipher-ocb.c (_gcry_cipher_ocb_get_l): Remove.
(ocb_get_L_big): New.
(_gcry_cipher_ocb_authenticate): L-big handling done in upper
processing loop, so that lower level never sees the case where
'aad_nblocks % 65536 == 0'; Add missing stack burn.
(ocb_aad_finalize): Add missing stack burn.
(ocb_crypt): L-big handling done in upper processing loop, so that
lower level never sees the case where 'data_nblocks % 65536 == 0'.
* cipher/cipher-internal.h (_gcry_cipher_ocb_get_l): Remove.
(ocb_get_l): Remove 'l_tmp' usage and simplify since input
is more limited now, 'N is not multiple of 65536'.
* cipher/rijndael-aesni.c (get_l): Remove.
(aesni_ocb_enc, aesni_ocb_dec, _gcry_aes_aesni_ocb_auth): Remove
l_tmp; Use 'ocb_get_l'.
* cipher/rijndael-ssse3-amd64.c (get_l): Remove.
(ssse3_ocb_enc, ssse3_ocb_dec, _gcry_aes_ssse3_ocb_auth): Remove
l_tmp; Use 'ocb_get_l'.
* cipher/camellia-glue.c: Remove OCB l_tmp usage.
* cipher/rijndael-armv8-ce.c: Ditto.
* cipher/rijndael.c: Ditto.
* cipher/serpent.c: Ditto.
* cipher/twofish.c: Ditto.
--
Move large L value generation to the topmost level so that the lower-level
'ocb_get_l' becomes simpler and faster. This also helps when implementing
OCB in assembly, since 'ocb_get_l' no longer makes a function call on its
slow path.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
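
To illustrate why the restriction "N is not a multiple of 65536" simplifies the lookup: OCB selects the offset value L_{ntz(n)}, where ntz(n) is the number of trailing zero bits of the block index. If n is never a multiple of 65536, then ntz(n) is at most 15, so a fixed 16-entry table is enough and no slow path is needed. The sketch below uses hypothetical names and an assumed table size of 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define L_TABLE_SIZE_SKETCH 16  /* assumed: covers ntz values 0..15 */

struct ocb_state_sketch {
  unsigned char L[L_TABLE_SIZE_SKETCH][16];     /* precomputed L_0..L_15 */
};

/* Number of trailing zero bits; n must be non-zero. */
static unsigned int
ntz_u64(uint64_t n)
{
  assert(n != 0);
  unsigned int count = 0;
  while ((n & 1) == 0) {
    n >>= 1;
    count++;
  }
  return count;
}

/* Table-only lookup, valid because the caller guarantees that n is not
   a multiple of 65536 (those indices are handled at a higher level). */
static const unsigned char *
ocb_get_l_sketch(const struct ocb_state_sketch *s, uint64_t n)
{
  assert(s != NULL && n % 65536 != 0);
  return s->L[ntz_u64(n)];
}
```

Production code would normally use a compiler builtin or a single instruction for the trailing-zero count instead of the loop shown here.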
|
|
* cipher/Makefile.am: Add 'rijndael-armv8-ce.c' and
'rijndael-armv8-aarch32-ce.S'.
* cipher/rijndael-armv8-aarch32-ce.S: New.
* cipher/rijndael-armv8-ce.c: New.
* cipher/rijndael-internal.h (USE_ARM_CE): New.
(RIJNDAEL_context_s): Add 'use_arm_ce'.
* cipher/rijndael.c [USE_ARM_CE] (_gcry_aes_armv8_ce_setkey)
(_gcry_aes_armv8_ce_prepare_decryption)
(_gcry_aes_armv8_ce_encrypt, _gcry_aes_armv8_ce_decrypt)
(_gcry_aes_armv8_ce_cfb_enc, _gcry_aes_armv8_ce_cbc_enc)
(_gcry_aes_armv8_ce_ctr_enc, _gcry_aes_armv8_ce_cfb_dec)
(_gcry_aes_armv8_ce_cbc_dec, _gcry_aes_armv8_ce_ocb_crypt)
(_gcry_aes_armv8_ce_ocb_auth): New.
(do_setkey) [USE_ARM_CE]: Add ARM CE/AES HW feature check and key
setup for ARM CE.
(prepare_decryption, _gcry_aes_cfb_enc, _gcry_aes_cbc_enc)
(_gcry_aes_ctr_enc, _gcry_aes_cfb_dec, _gcry_aes_cbc_dec)
(_gcry_aes_ocb_crypt, _gcry_aes_ocb_auth) [USE_ARM_CE]: Add
ARM CE support.
* configure.ac: Add 'rijndael-armv8-ce.lo' and
'rijndael-armv8-aarch32-ce.lo'.
--
Improvement vs ARM assembly on Cortex-A53:
            AES-128  AES-192  AES-256
 CBC enc:    14.8x    12.8x    11.4x
 CBC dec:    21.4x    20.5x    19.4x
 CFB enc:    16.2x    13.6x    11.6x
 CFB dec:    21.6x    20.5x    19.4x
 CTR:        19.1x    18.6x    17.8x
 OCB enc:    16.0x    16.2x    16.1x
 OCB dec:    15.6x    15.9x    15.8x
 OCB auth:   18.3x    18.4x    18.0x
Benchmark on Cortex-A53 (1152 MHz):
Before:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 24.42 ns/B 39.06 MiB/s 28.13 c/B
ECB dec | 25.07 ns/B 38.05 MiB/s 28.88 c/B
CBC enc | 21.05 ns/B 45.30 MiB/s 24.25 c/B
CBC dec | 21.16 ns/B 45.07 MiB/s 24.38 c/B
CFB enc | 21.05 ns/B 45.31 MiB/s 24.25 c/B
CFB dec | 21.38 ns/B 44.61 MiB/s 24.62 c/B
OFB enc | 26.15 ns/B 36.47 MiB/s 30.13 c/B
OFB dec | 26.15 ns/B 36.47 MiB/s 30.13 c/B
CTR enc | 21.17 ns/B 45.06 MiB/s 24.38 c/B
CTR dec | 21.16 ns/B 45.06 MiB/s 24.38 c/B
CCM enc | 42.32 ns/B 22.53 MiB/s 48.75 c/B
CCM dec | 42.32 ns/B 22.53 MiB/s 48.75 c/B
CCM auth | 21.17 ns/B 45.06 MiB/s 24.38 c/B
GCM enc | 22.08 ns/B 43.19 MiB/s 25.44 c/B
GCM dec | 22.08 ns/B 43.18 MiB/s 25.44 c/B
GCM auth | 0.923 ns/B 1032.8 MiB/s 1.06 c/B
OCB enc | 26.20 ns/B 36.40 MiB/s 30.18 c/B
OCB dec | 25.97 ns/B 36.73 MiB/s 29.91 c/B
OCB auth | 24.52 ns/B 38.90 MiB/s 28.24 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 27.83 ns/B 34.26 MiB/s 32.06 c/B
ECB dec | 28.54 ns/B 33.42 MiB/s 32.88 c/B
CBC enc | 24.47 ns/B 38.97 MiB/s 28.19 c/B
CBC dec | 25.27 ns/B 37.74 MiB/s 29.11 c/B
CFB enc | 25.08 ns/B 38.02 MiB/s 28.89 c/B
CFB dec | 25.31 ns/B 37.68 MiB/s 29.16 c/B
OFB enc | 29.57 ns/B 32.25 MiB/s 34.06 c/B
OFB dec | 29.57 ns/B 32.25 MiB/s 34.06 c/B
CTR enc | 25.24 ns/B 37.78 MiB/s 29.08 c/B
CTR dec | 25.24 ns/B 37.79 MiB/s 29.08 c/B
CCM enc | 49.81 ns/B 19.15 MiB/s 57.38 c/B
CCM dec | 49.80 ns/B 19.15 MiB/s 57.37 c/B
CCM auth | 24.58 ns/B 38.80 MiB/s 28.32 c/B
GCM enc | 26.15 ns/B 36.47 MiB/s 30.13 c/B
GCM dec | 26.11 ns/B 36.52 MiB/s 30.08 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 29.59 ns/B 32.23 MiB/s 34.09 c/B
OCB dec | 29.42 ns/B 32.42 MiB/s 33.89 c/B
OCB auth | 27.92 ns/B 34.16 MiB/s 32.16 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 31.20 ns/B 30.57 MiB/s 35.94 c/B
ECB dec | 31.80 ns/B 29.99 MiB/s 36.63 c/B
CBC enc | 27.83 ns/B 34.27 MiB/s 32.06 c/B
CBC dec | 27.87 ns/B 34.21 MiB/s 32.11 c/B
CFB enc | 27.88 ns/B 34.20 MiB/s 32.12 c/B
CFB dec | 28.16 ns/B 33.87 MiB/s 32.44 c/B
OFB enc | 32.93 ns/B 28.96 MiB/s 37.94 c/B
OFB dec | 32.93 ns/B 28.96 MiB/s 37.94 c/B
CTR enc | 27.95 ns/B 34.13 MiB/s 32.19 c/B
CTR dec | 27.95 ns/B 34.12 MiB/s 32.20 c/B
CCM enc | 55.88 ns/B 17.07 MiB/s 64.38 c/B
CCM dec | 55.88 ns/B 17.07 MiB/s 64.38 c/B
CCM auth | 27.95 ns/B 34.12 MiB/s 32.20 c/B
GCM enc | 28.86 ns/B 33.05 MiB/s 33.25 c/B
GCM dec | 28.87 ns/B 33.04 MiB/s 33.25 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 32.96 ns/B 28.94 MiB/s 37.97 c/B
OCB dec | 32.73 ns/B 29.14 MiB/s 37.70 c/B
OCB auth | 31.29 ns/B 30.48 MiB/s 36.04 c/B
After:
AES | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.10 ns/B 187.0 MiB/s 5.88 c/B
ECB dec | 5.27 ns/B 181.0 MiB/s 6.07 c/B
CBC enc | 1.41 ns/B 675.8 MiB/s 1.63 c/B
CBC dec | 0.992 ns/B 961.7 MiB/s 1.14 c/B
CFB enc | 1.30 ns/B 732.4 MiB/s 1.50 c/B
CFB dec | 0.991 ns/B 962.7 MiB/s 1.14 c/B
OFB enc | 7.05 ns/B 135.2 MiB/s 8.13 c/B
OFB dec | 7.05 ns/B 135.2 MiB/s 8.13 c/B
CTR enc | 1.11 ns/B 856.9 MiB/s 1.28 c/B
CTR dec | 1.11 ns/B 857.0 MiB/s 1.28 c/B
CCM enc | 2.58 ns/B 369.8 MiB/s 2.97 c/B
CCM dec | 2.58 ns/B 369.5 MiB/s 2.97 c/B
CCM auth | 1.58 ns/B 605.2 MiB/s 1.82 c/B
GCM enc | 2.04 ns/B 467.9 MiB/s 2.35 c/B
GCM dec | 2.04 ns/B 466.6 MiB/s 2.35 c/B
GCM auth | 0.923 ns/B 1033.0 MiB/s 1.06 c/B
OCB enc | 1.64 ns/B 579.8 MiB/s 1.89 c/B
OCB dec | 1.66 ns/B 574.5 MiB/s 1.91 c/B
OCB auth | 1.33 ns/B 715.5 MiB/s 1.54 c/B
=
AES192 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.64 ns/B 169.0 MiB/s 6.50 c/B
ECB dec | 5.81 ns/B 164.3 MiB/s 6.69 c/B
CBC enc | 1.90 ns/B 502.1 MiB/s 2.19 c/B
CBC dec | 1.24 ns/B 771.7 MiB/s 1.42 c/B
CFB enc | 1.84 ns/B 517.1 MiB/s 2.12 c/B
CFB dec | 1.23 ns/B 772.5 MiB/s 1.42 c/B
OFB enc | 7.60 ns/B 125.5 MiB/s 8.75 c/B
OFB dec | 7.60 ns/B 125.6 MiB/s 8.75 c/B
CTR enc | 1.36 ns/B 702.7 MiB/s 1.56 c/B
CTR dec | 1.36 ns/B 702.5 MiB/s 1.56 c/B
CCM enc | 3.31 ns/B 287.8 MiB/s 3.82 c/B
CCM dec | 3.31 ns/B 288.0 MiB/s 3.81 c/B
CCM auth | 2.06 ns/B 462.1 MiB/s 2.38 c/B
GCM enc | 2.28 ns/B 418.4 MiB/s 2.63 c/B
GCM dec | 2.28 ns/B 418.0 MiB/s 2.63 c/B
GCM auth | 0.923 ns/B 1032.8 MiB/s 1.06 c/B
OCB enc | 1.83 ns/B 520.1 MiB/s 2.11 c/B
OCB dec | 1.84 ns/B 517.8 MiB/s 2.12 c/B
OCB auth | 1.52 ns/B 626.1 MiB/s 1.75 c/B
=
AES256 | nanosecs/byte mebibytes/sec cycles/byte
ECB enc | 5.86 ns/B 162.7 MiB/s 6.75 c/B
ECB dec | 6.02 ns/B 158.3 MiB/s 6.94 c/B
CBC enc | 2.44 ns/B 390.5 MiB/s 2.81 c/B
CBC dec | 1.45 ns/B 656.4 MiB/s 1.67 c/B
CFB enc | 2.39 ns/B 399.5 MiB/s 2.75 c/B
CFB dec | 1.45 ns/B 656.8 MiB/s 1.67 c/B
OFB enc | 7.81 ns/B 122.1 MiB/s 9.00 c/B
OFB dec | 7.81 ns/B 122.1 MiB/s 9.00 c/B
CTR enc | 1.57 ns/B 605.8 MiB/s 1.81 c/B
CTR dec | 1.57 ns/B 605.9 MiB/s 1.81 c/B
CCM enc | 4.07 ns/B 234.3 MiB/s 4.69 c/B
CCM dec | 4.07 ns/B 234.1 MiB/s 4.69 c/B
CCM auth | 2.61 ns/B 365.7 MiB/s 3.00 c/B
GCM enc | 2.50 ns/B 381.9 MiB/s 2.88 c/B
GCM dec | 2.49 ns/B 382.3 MiB/s 2.87 c/B
GCM auth | 0.926 ns/B 1029.7 MiB/s 1.07 c/B
OCB enc | 2.05 ns/B 465.6 MiB/s 2.36 c/B
OCB dec | 2.06 ns/B 462.0 MiB/s 2.38 c/B
OCB auth | 1.74 ns/B 548.4 MiB/s 2.00 c/B
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|