author    Jussi Kivilinna <jussi.kivilinna@iki.fi>  2023-04-10 22:24:43 +0300
committer Jussi Kivilinna <jussi.kivilinna@iki.fi>  2023-04-23 21:19:09 +0300
commit    fdf2e8ba654a4dcfee25586dd7e0749f2b7a92c0
tree      2f7e68db261f9fc9bde89efc3e5fe190a6316028 /acinclude.m4
parent    ad4ee8d52f7199ba8bdee767044337060529069f
download  libgcrypt-fdf2e8ba654a4dcfee25586dd7e0749f2b7a92c0.tar.gz

mpi: optimize mpi_rshift and mpi_lshift to avoid extra MPI copying

* mpi/mpi-bit.c (_gcry_mpi_rshift): Refactor so that _gcry_mpih_rshift
is used to do the copying along with shifting when copying is needed
and refactor so that same code-path is used for both in-place and
copying operation.
(_gcry_mpi_lshift): Refactor so that _gcry_mpih_lshift is used to do
the copying along with shifting when copying is needed and refactor
so that same code-path is used for both in-place and copying
operation.
--

Benchmark on AMD Ryzen 9 7900X:

Before:
           |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 rshift3   |     0.039 ns/B     24662 MiB/s     0.182 c/B      4700
 lshift3   |     0.108 ns/B      8832 MiB/s     0.508 c/B      4700
 rshift65  |     0.137 ns/B      6968 MiB/s     0.643 c/B      4700
 lshift65  |     0.109 ns/B      8776 MiB/s     0.511 c/B      4700

After:
           |  nanosecs/byte   mebibytes/sec   cycles/byte  auto Mhz
 rshift3   |     0.038 ns/B     25049 MiB/s     0.179 c/B      4700
 lshift3   |     0.039 ns/B     24709 MiB/s     0.181 c/B      4700
 rshift65  |     0.038 ns/B     24942 MiB/s     0.180 c/B      4700
 lshift65  |     0.040 ns/B     23671 MiB/s     0.189 c/B      4700

Signed-off-by: Jussi Kivilinna <jussi.kivilinna@iki.fi>

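
The idea described in the commit message, doing the copy as part of the shift and sharing one code path between in-place and copying calls, can be illustrated with a minimal standalone sketch. This is not the libgcrypt code: `limb_t`, `LIMB_BITS` and `sketch_rshift` are hypothetical names introduced here, limbs are assumed to be 64-bit and stored least significant first, and the real `_gcry_mpi_rshift` / `_gcry_mpih_rshift` signatures and internals may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t limb_t;            /* hypothetical limb type (assumption) */
#define LIMB_BITS 64

/* Shift the n-limb number at SRC right by NBITS and store the result at
   DST.  Limbs are least significant first.  DST may be the same array
   as SRC, so in-place and copying callers share this single loop and no
   separate copy pass is needed.  Returns the number of significant
   limbs written to DST.  */
static size_t
sketch_rshift (limb_t *dst, const limb_t *src, size_t n, unsigned int nbits)
{
  size_t nlimbs = nbits / LIMB_BITS;    /* whole limbs shifted out */
  unsigned int nb = nbits % LIMB_BITS;  /* remaining bit shift */
  size_t outn, i;

  assert (dst != NULL && src != NULL);

  if (nlimbs >= n)
    return 0;                           /* everything is shifted out */

  outn = n - nlimbs;
  src += nlimbs;                        /* whole-limb shift is just an offset */

  if (nb == 0)
    {
      /* Pure limb move; ascending order is safe when DST aliases SRC.  */
      for (i = 0; i < outn; i++)
        dst[i] = src[i];
    }
  else
    {
      /* Each output limb combines two neighbouring input limbs.  Reads
         are always at an index >= the write index, so aliasing is safe.  */
      for (i = 0; i + 1 < outn; i++)
        dst[i] = (src[i] >> nb) | (src[i + 1] << (LIMB_BITS - nb));
      dst[outn - 1] = src[outn - 1] >> nb;
    }

  /* Drop leading zero limbs so the result is normalized.  */
  while (outn > 0 && dst[outn - 1] == 0)
    outn--;
  return outn;
}
```

In this sketch a caller that wants a shifted copy passes a distinct `dst`, while a caller that wants an in-place shift passes the same pointer for both arguments. Either way the data is traversed once, which is the kind of saving the benchmark figures above suggest for the cases that previously needed an extra MPI copy.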
Diffstat (limited to 'acinclude.m4')
0 files changed, 0 insertions, 0 deletions