author Kevin Ryde <user42@zip.com.au> 2001-03-15 00:33:55 +0100
committer Kevin Ryde <user42@zip.com.au> 2001-03-15 00:33:55 +0100
commit 2cae80c6232029f3f9c4c07788844acb9525021b
tree bf50c36ffefe282ce3552bacd8b4c4e0bcb9cacf /tune/README
parent 8b3495b569ef6138c432896167dbe646a6e378dd
* tune/README: Notes on the 1x1 div threshold for mpn_gcd_1.
Diffstat (limited to 'tune/README')
-rw-r--r-- tune/README 23
1 files changed, 23 insertions, 0 deletions
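
For readers unfamiliar with the trade-off this README note describes, the following is a minimal sketch of a single-limb gcd with an optional initial x%y step. It is not the code in mpn/generic/gcd_1.c: the function name gcd_1x1, the macro DIV_THRESHOLD_BITS, the 32-bit limb type and the exact form of the threshold test are all assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical threshold: do an initial x%y only when x is larger than y
   by more than this many bits.  A value of 31 would effectively disable
   the division for 32-bit inputs. */
#define DIV_THRESHOLD_BITS 3

/* Illustrative 1x1 gcd; not GMP's implementation. */
uint32_t gcd_1x1(uint32_t x, uint32_t y)
{
    uint32_t t;
    unsigned shift = 0;

    if (x == 0) return y;
    if (y == 0) return x;
    if (x < y) { t = x; x = y; y = t; }

    /* Initial reduction: one division brings x down to below y. */
    if ((x >> DIV_THRESHOLD_BITS) > y) {
        x %= y;
        if (x == 0) return y;
    }

    /* Binary gcd: strip common factors of two, then subtract and shift. */
    while (((x | y) & 1) == 0) { x >>= 1; y >>= 1; shift++; }
    while ((x & 1) == 0) x >>= 1;
    while (y != 0) {
        while ((y & 1) == 0) y >>= 1;
        if (x > y) { t = x; x = y; y = t; }
        y -= x;
    }
    return x << shift;
}
```

With a small threshold, a call such as gcd_1x1(1000, 24) takes the division path first and then finishes with a few subtract-and-shift steps; with a large threshold the same call would go straight to the subtract-and-shift loop.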
diff --git a/tune/README b/tune/README
index 181af7579..52ff6689d 100644
--- a/tune/README
+++ b/tune/README
@@ -349,6 +349,29 @@ optimize some multiplications by 10.
+EXAMPLE COMPARISONS - GCDs
+
+mpn_gcd_1 has a threshold for when to reduce using an initial x%y when both
+x and y are single limbs. This isn't tuned currently, but a value can be
+established by a measurement like
+
+ ./speed -s 10-32 mpn_gcd_1.10
+
+This runs src[0] from 10 to 32 bits, with y fixed at 10 bits. If the div
+threshold is high, say 31 so it's effectively disabled, then a 32x10 bit gcd
+is done by nibbling away at the 32-bit operands bit-by-bit. When the
+threshold is small, say 1 bit, then an initial x%y is done to reduce it to a
+10x10 bit operation.
+
+The threshold in mpn/generic/gcd_1.c or the various assembler
+implementations can be tweaked up or down until there are no more speedups on
+interesting combinations of sizes. Note that this affects only a 1x1 limb
+operation and so isn't very important. (An Nx1 limb operation always does
+an initial modular reduction, using mpn_mod_1 or mpn_modexact_1_odd.)
+
+
+
+
SPEED PROGRAM EXTENSIONS
Potentially lots of things could be made available in the program, but it's