author | Niels Möller <nisse@lysator.liu.se> | 2019-09-05 21:10:05 +0200 |
---|---|---|
committer | Niels Möller <nisse@lysator.liu.se> | 2019-09-05 21:10:05 +0200 |
commit | 9c489cead96f7d38dd428d274805fc9cd48514fa | |
tree | 82af0b57717216c477c939c01f7f461ab043777a /tune/speed.h | |
parent | 95d545683e958f411d896cc9ae55507c144a063a | |
For hgcd2, add a div1 function handling q <= 7 specially.
* mpn/generic/hgcd2.c (div1): Return both r and q as a
mp_double_limb_t, replacing the DIV1 macro.
(div1) [HGCD2_METHOD == 3]: New implementation handling q <= 7
specially and without branches. Based on Torbjörn's mail to the
gmp-devel list.
* tune/speed.c, tune/speed.h, tune/common.c, tune/Makefile.am: Add
corresponding speed support.
* tune/hgcd2-3.c: New file.
* tune/tuneup.c (print_define_with_speedup): New function, to
output a comment with speedup compared to next-best method.
(tune_hgcd2): Update tuning.
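
To illustrate the idea described in the message above, here is a minimal sketch of a single-limb division that returns quotient and remainder together and handles small quotients (q <= 7) with masked conditional subtractions instead of a general divide. It is not the code from mpn/generic/hgcd2.c: the names `qr_pair` and `div1_sketch` are hypothetical, and `uint64_t` stands in for `mp_limb_t` on the assumption of a 64-bit limb. Whether the compiled fast path is truly branch-free depends on the compiler.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a (remainder, quotient) pair; not GMP's
   mp_double_limb_t. */
typedef struct {
    uint64_t r; /* remainder */
    uint64_t q; /* quotient */
} qr_pair;

/* Sketch: divide n by d, returning both quotient and remainder. */
static qr_pair div1_sketch(uint64_t n, uint64_t d)
{
    qr_pair res;
    uint64_t q = 0;
    uint64_t mask;

    assert(d != 0);

    /* General case: 8*d would overflow, or the quotient is 8 or more. */
    if ((d >> 61) != 0 || n >= (d << 3)) {
        res.q = n / d;
        res.r = n - res.q * d;
        return res;
    }

    /* Here n < 8*d, so q <= 7. Build q bit by bit (4, 2, 1). Each mask is
       all ones when the comparison holds and all zeros otherwise. */
    d <<= 2;
    mask = -(uint64_t)(n >= d);
    n -= d & mask;
    q |= 4 & mask;

    d >>= 1;
    mask = -(uint64_t)(n >= d);
    n -= d & mask;
    q |= 2 & mask;

    d >>= 1;
    mask = -(uint64_t)(n >= d);
    n -= d & mask;
    q |= 1 & mask;

    res.r = n;
    res.q = q;
    return res;
}
```

Returning both values in one small struct lets the caller use the quotient and remainder from a single call, which is the role the commit message gives to the mp_double_limb_t return value.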
Diffstat (limited to 'tune/speed.h')
-rw-r--r-- | tune/speed.h | 3 |
1 files changed, 3 insertions, 0 deletions
diff --git a/tune/speed.h b/tune/speed.h
index 968bccac7..5df155841 100644
--- a/tune/speed.h
+++ b/tune/speed.h
@@ -217,6 +217,7 @@ double speed_mpn_matrix22_mul (struct speed_params *);
 double speed_mpn_hgcd2 (struct speed_params *);
 double speed_mpn_hgcd2_1 (struct speed_params *);
 double speed_mpn_hgcd2_2 (struct speed_params *);
+double speed_mpn_hgcd2_3 (struct speed_params *);
 double speed_mpn_hgcd (struct speed_params *);
 double speed_mpn_hgcd_lehmer (struct speed_params *);
 double speed_mpn_hgcd_appr (struct speed_params *);
@@ -487,6 +488,8 @@ int mpn_hgcd2_1 (mp_limb_t ah, mp_limb_t al, mp_limb_t bh, mp_limb_t bl,
                  struct hgcd_matrix1 *M);
 int mpn_hgcd2_2 (mp_limb_t ah, mp_limb_t al, mp_limb_t bh, mp_limb_t bl,
                  struct hgcd_matrix1 *M);
+int mpn_hgcd2_3 (mp_limb_t ah, mp_limb_t al, mp_limb_t bh, mp_limb_t bl,
+                 struct hgcd_matrix1 *M);
 
 mp_limb_t mpn_mod_1_div (mp_srcptr, mp_size_t, mp_limb_t);
 mp_limb_t mpn_mod_1_inv (mp_srcptr, mp_size_t, mp_limb_t);