| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* cipher/asm-inline-s390x.h (PCC_FUNCTION_*): New.
(pcc_query, pcc_scalar_multiply): New.
* mpi/Makefile.am: Add 'ec-hw-s390x.c'.
* mpi/ec-hw-s390x.c: New.
* mpi/ec-internal.h (_gcry_s390x_ec_hw_mul_point)
(mpi_ec_hw_mul_point): New.
* mpi/ec.c (_gcry_mpi_ec_mul_point): Call 'mpi_ec_hw_mul_point'.
* src/g10lib.h (HWF_S390X_MSA_9): New.
* src/hwf-s390x.c (s390x_features): Add MSA9.
* src/hwfeatures.c (hwflist): Add 's390x-msa-9'.
--
Patch adds ECC scalar multiplication acceleration using
s390x's PCC instruction. Following curves are supported:
- Ed25519
- Ed448
- X25519
- X448
- NIST curves P-256, P-384 and P-521
Benchmark on z15 (5.2Ghz):
Before:
Ed25519 | nanosecs/iter cycles/iter
mult | 389791 2026916
keygen | 572017 2974487
sign | 636603 3310336
verify | 1189097 6183305
=
X25519 | nanosecs/iter cycles/iter
mult | 296805 1543385
=
Ed448 | nanosecs/iter cycles/iter
mult | 1693373 8805541
keygen | 2382473 12388858
sign | 2609562 13569725
verify | 5177606 26923552
=
X448 | nanosecs/iter cycles/iter
mult | 1136178 5908127
=
NIST-P256 | nanosecs/iter cycles/iter
mult | 792620 4121625
keygen | 4627835 24064740
sign | 1528268 7946991
verify | 1678205 8726664
=
NIST-P384 | nanosecs/iter cycles/iter
mult | 1766418 9185373
keygen | 10158485 52824123
sign | 3341172 17374095
verify | 3694750 19212700
=
NIST-P521 | nanosecs/iter cycles/iter
mult | 3172566 16497346
keygen | 18184747 94560683
sign | 6039956 31407771
verify | 6480882 33700588
After:
Ed25519 | nanosecs/iter cycles/iter speed-up
mult | 25913 134746 15x
keygen | 44447 231124 12x
sign | 106928 556028 6x
verify | 164681 856341 7x
=
X25519 | nanosecs/iter cycles/iter speed-up
mult | 17761 92358 16x
=
Ed448 | nanosecs/iter cycles/iter speed-up
mult | 50808 264199 33x
keygen | 68644 356951 34x
sign | 317446 1650720 8x
verify | 457115 2376997 11x
=
X448 | nanosecs/iter cycles/iter speed-up
mult | 35637 185313 31x
=
NIST-P256 | nanosecs/iter cycles/iter speed-up
mult | 30678 159528 25x
keygen | 323722 1683356 14x
sign | 114176 593713 13x
verify | 169901 883487 9x
=
NIST-P384 | nanosecs/iter cycles/iter speed-up
mult | 59966 311822 29x
keygen | 607778 3160445 16x
sign | 209832 1091128 16x
verify | 329506 1713431 11x
=
NIST-P521 | nanosecs/iter cycles/iter speed-up
mult | 98230 510797 32x
keygen | 1131686 5884765 16x
sign | 397777 2068442 15x
verify | 623076 3239998 10x
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* configure.ac (ASM_DISABLED): New.
* mpi/Makefile.am: Add 'ec-nist.c' and 'ec-inline.h'.
* mpi/ec-nist.c: New.
* mpi/ec-inline.h: New.
* mpi/ec-internal.h (_gcry_mpi_ec_nist192_mod)
(_gcry_mpi_ec_nist224_mod, _gcry_mpi_ec_nist256_mod)
(_gcry_mpi_ec_nist384_mod, _gcry_mpi_ec_nist521_mod): New.
* mpi/ec.c (ec_addm, ec_subm, ec_mulm, ec_mul2): Use
'ctx->mod'.
(field_table): Add 'mod' function; Add NIST reduction
functions.
(ec_p_init): Setup ctx->mod; Setup function pointers
from field_table only if pointer is not NULL; Resize
ctx->a and ctx->b only if set.
* mpi/mpi-internal.h (RESIZE_AND_CLEAR_IF_NEEDED): New.
* mpi/mpiutil.c (_gcry_mpi_resize): Clear all unused
limbs also in realloc case.
* src/ec-context.h (mpi_ec_ctx_s): Add 'mod' function.
--
Benchmark on AMD Ryzen 7 5800X (x86_64):
Before:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz
mult | 283346 1369473 4833
keygen | 1688442 8185744 4848
sign | 549683 2662984 4845
verify | 615284 2984325 4850
=
NIST-P224 | nanosecs/iter cycles/iter auto Mhz
mult | 516443 2501173 4843
keygen | 2859746 13866802 4849
sign | 918472 4455043 4850
verify | 1057940 5131372 4850
=
NIST-P256 | nanosecs/iter cycles/iter auto Mhz
mult | 423536 2054040 4850
keygen | 2383097 11557572 4850
sign | 774346 3754243 4848
verify | 864934 4196315 4852
=
NIST-P384 | nanosecs/iter cycles/iter auto Mhz
mult | 929985 4511881 4852
keygen | 5230788 25367299 4850
sign | 1671432 8109726 4852
verify | 1902729 9228568 4850
=
NIST-P521 | nanosecs/iter cycles/iter auto Mhz
mult | 2123546 10300952 4851
keygen | 12019340 58297774 4850
sign | 3886988 18853054 4850
verify | 4507885 21864015 4850
After:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 186679 905603 4851 +51%
keygen | 1161423 5623822 4842 +46%
sign | 389531 1887557 4846 +41%
verify | 412936 2000461 4844 +49%
=
NIST-P224 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 260621 1256327 4821 +99%
keygen | 1557845 7531677 4835 +84%
sign | 521678 2527083 4844 +76%
verify | 554084 2677949 4833 +92%
=
NIST-P256 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 319045 1542061 4833 +33%
keygen | 1834822 8898950 4850 +30%
sign | 612866 2972630 4850 +26%
verify | 664821 3222597 4847 +30%
=
NIST-P384 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 593894 2875260 4841 +57%
keygen | 3526600 17089717 4846 +48%
sign | 1178098 5710151 4847 +42%
verify | 1260185 6107449 4846 +51%
=
NIST-P521 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 1160220 5621946 4846 +83%
keygen | 6862975 33247351 4844 +75%ยด
sign | 2287366 11096711 4851 +70%
verify | 2455858 11888045 4841 +84%
Benchmark on AMD Ryzen 7 5800X (i386):
Before:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz
mult | 648039 3143236 4850
keygen | 3554452 17244822 4852
sign | 1163173 5641932 4850
verify | 1300076 6305673 4850
=
NIST-P224 | nanosecs/iter cycles/iter auto Mhz
mult | 798607 3874405 4851
keygen | 4657604 22589864 4850
sign | 1515803 7352049 4850
verify | 1635470 7935373 4852
=
NIST-P256 | nanosecs/iter cycles/iter auto Mhz
mult | 927033 4496283 4850
keygen | 5313601 25771983 4850
sign | 1735795 8418514 4850
verify | 1945804 9438212 4851
=
NIST-P384 | nanosecs/iter cycles/iter auto Mhz
mult | 2301781 11164473 4850
keygen | 12856001 62353242 4850
sign | 4161041 20180651 4850
verify | 4705961 22827478 4851
=
NIST-P521 | nanosecs/iter cycles/iter auto Mhz
mult | 6066635 29422721 4850
keygen | 32995868 160046407 4850
sign | 10503306 50945387 4850
verify | 12225252 59294323 4850
After:
NIST-P192 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 413605 2007498 4854 +57%
keygen | 2479429 12010926 4844 +44%
sign | 825111 3997147 4844 +41%
verify | 890206 4318723 4851 +46%
=
NIST-P224 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 551703 2676454 4851 +45%
keygen | 3257022 15781844 4845 +43%
sign | 1085678 5258894 4844 +40%
verify | 1172195 5678499 4844 +40%
=
NIST-P256 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 720395 3497486 4855 +29%
keygen | 4217758 20461257 4851 +26%
sign | 1404350 6814131 4852 +24%
verify | 1515136 7353955 4854 +28%
=
NIST-P384 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 1525742 7400771 4851 +51%
keygen | 9046660 43877889 4850 +42%
sign | 2974641 14408703 4844 +40%
verify | 3265285 15834951 4849 +44%
=
NIST-P521 | nanosecs/iter cycles/iter auto Mhz speed-up
mult | 3289348 15968678 4855 +84%
keygen | 19354174 93873531 4850 +70%
sign | 6351493 30830140 4854 +65%
verify | 6979292 33854215 4851 +75%
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>
|
|
* mpi/ec-ed25519.c: New but empty file.
* mpi/ec-internal.h: New.
* mpi/ec.c: Include ec-internal.h.
(ec_mod): New.
(ec_addm): Use ec_mod.
(ec_mulm): Remove commented code. Use ec_mod.
(ec_subm): Call simple sub.
(ec_pow2): Use ec_mulm.
(ec_mul2): New.
(dup_point_weierstrass): Use ec_mul2.
(dup_point_twistededwards): Add special case for a == -1. Use
ec_mul2.
(add_points_weierstrass): Use ec_mul2.
(add_points_twistededwards): Add special case for a == -1.
(_gcry_mpi_ec_curve_point): Ditto.
(ec_p_init): Add hack to test Barrett functions.
* src/ec-context.h (mpi_ec_ctx_s): Add P_BARRETT.
* mpi/mpi-mod.c (_gcry_mpi_mod_barrett): Fix sign problem.
Signed-off-by: Werner Koch <wk@gnupg.org>
|