author | Kevin Ryde <user42@zip.com.au> | 2000-04-28 23:12:50 +0200 |
---|---|---|
committer | Kevin Ryde <user42@zip.com.au> | 2000-04-28 23:12:50 +0200 |
commit | 9d7a20e626acc647b830dabc0f5d36fe682a5f3d (patch) | |
tree | 976a08cdd583ab4158003428a30d945f92d4693e /tune/README | |
parent | 0023d4c1eed59f7f0c56b58b7c84bfa9f8c6c1cb (diff) | |
A bit extra on setting thresholds temporarily bigger.
Diffstat (limited to 'tune/README')
-rw-r--r-- | tune/README | 21 |
1 files changed, 14 insertions, 7 deletions
diff --git a/tune/README b/tune/README
index c9bd2c381..d8c73d965 100644
--- a/tune/README
+++ b/tune/README
@@ -249,10 +249,8 @@ When examining the toom3 threshold, remember it depends on the karatsuba
 threshold, so the right karatsuba threshold needs to be compiled into the
 library first. The tune program uses special recompiled versions of
 mpn/mul_n.c etc for this reason, but the speed program simply uses the
-normal libgmp.la.
-
-The BZ threshold depends on both the karatsuba and toom3 multiply
-thresholds.
+normal libgmp.la. The BZ threshold depends on both the karatsuba and toom3
+multiply thresholds.

 Note further that the various routines may recurse into themselves on sizes
 far enough above applicable thresholds. For example, mpn_kara_mul_n will
@@ -262,10 +260,14 @@ KARATSUBA_MUL_THRESHOLD.
 When doing the above comparison between mul_basecase and kara_mul_n what's
 probably of interest is mul_basecase versus a kara_mul_n that does one level
 of karatsuba then calls to mul_basecase, but this only happens on sizes less
-than twice the compiled KARATSUBA_MUL_THRESHOLD. A large value for that
-setting can be compiled-in to avoid the problem if necessary.
+than twice the compiled KARATSUBA_MUL_THRESHOLD. A larger value for that
+setting can be compiled-in to avoid the problem if necessary. The same
+applies to toom3 and BZ, though in a trickier fashion.

-The same applies to toom3 and BZ, though in a trickier fashion.
+There are some upper limits on some of the thresholds, arising from arrays
+dimensioned according to a threshold (mpn_mul_n), or asm code with certain
+size displacements (some x86 versions of sqr_basecase). So putting huge
+values for the thresholds, even just for testing, may fail.



@@ -282,6 +284,11 @@ Measuring of udiv_qrnnd, udiv_qrnnd_preinv and udiv_qrnnd_preinv2norm to see
 which is better. Watch out for function call overhead when udiv_qrnnd is
 actually an mpn_udiv_qrnnd subroutine.

+Make an option in struct speed_parameters to specify the overlap, 0 for
+none, 1 for dst=src1, 2 for dst=src2, 3 for dst1=src1 dst2=src2, 4 for
+dst1=src2 dst2=src1. This would be better than lots of _inplace versions of
+measuring functions.
+
 When speed_measure() does a division of total time measured by repetitions
 performed, it divides the fixed overheads imposed by speed_starttime() and
 speed_endtime(). When different routines are run with different repetitions
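Note on the second hunk: the remark that kara_mul_n does exactly one level of karatsuba only on sizes below twice the compiled KARATSUBA_MUL_THRESHOLD can be checked with a small model. The sketch below is not GMP code. The function name kara_levels_direct is hypothetical, and it assumes each level works on halves of size n/2 rounded down, with any half below the threshold going straight to the basecase.

```c
/* Simplified model, not GMP code: count the karatsuba levels used when
   a karatsuba routine is called directly at size n.
   Assumptions (hypothetical, for illustration only):
     - a direct call always performs one karatsuba level itself;
     - each level recurses on halves of size n/2, rounded down;
     - a half smaller than `threshold` is handled by the basecase;
     - threshold >= 1.  */
int
kara_levels_direct (long n, long threshold)
{
  int levels = 1;          /* the direct call is itself one level */

  n /= 2;                  /* size of the sub-products */
  while (n > 0 && n >= threshold)
    {
      levels++;            /* sub-product is still above the threshold */
      n /= 2;
    }
  return levels;
}
```

In this model kara_levels_direct(63, 32) gives one level, while kara_levels_direct(64, 32) gives two, so the one-level comparison against mul_basecase stops at twice the threshold. Raising the threshold, for instance kara_levels_direct(200, 128), brings a size of 200 back to a single level, which is the effect the README describes for compiling in a larger value.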