From e3ba8d9b2ac1c0333c83d46320a65aa45a585214 Mon Sep 17 00:00:00 2001
From: Kevin Ryde
Date: Mon, 8 Oct 2001 23:49:13 +0200
Subject: Add PP factors of 2. Add PP possible factors of 3 or 5. Add Cray vectorize mpn_com_n. Add hppa 2.0w on gcc configs. Fix some < and > to &lt; and &gt;.
---
 doc/tasks.html | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/doc/tasks.html b/doc/tasks.html
index 2cffc0ef8..31420495b 100644
--- a/doc/tasks.html
+++ b/doc/tasks.html
@@ -173,6 +173,19 @@ Copyright 2000, 2001 Free Software Foundation, Inc.
  • Change PP/PP_INVERTED into an array of such pairs, listing several hundred primes. Perhaps actually make the products larger than one limb each. +
  • PP can have factors of 2 introduced in order to get the high + bit set and therefore a PP_INVERTED existing. The factors + of 2 don't affect the way the remainder r = a % ((x*y*z)*2^n) is used, + further remainders r%x, r%y, etc, are the same since x, y, etc are odd. + The advantage of this is that mpn_preinv_mod_1 can then be + used if it's faster than plain mpn_mod_1. This would be a + change only for 16-bit limbs, all the rest already have PP + in the right form. +
  • PP could have factors of 3 or 5 or whatever introduced if + they fit, and final remainders mod 9 or 25 or whatever used, thereby + making more efficient use of the mpn_mod_1 done. On a + 16-bit limb it looks like PP could take an extra factor of + 3.
  • mpz_probab_prime_p, mpn_perfect_square_p and mpz_perfect_power_p could take a remainder mod 2^24-1 to quickly get remainders mod 3, 5, 7, 13 and 17 (factors of 2^24-1). A @@ -374,6 +387,8 @@ Copyright 2000, 2001 Free Software Foundation, Inc. mpn_rshift already provided.
  • Cray T3E: Experiment with optimization options. In particular, -hpipeline3 seems promising. We should at least up -O to -O2 or -O3. +
  • Cray: mpn_com_n and mpn_and_n etc very probably + wants a pragma like MPN_COPY_INCR.
  • Cray: Variable length arrays seem to be faster than the stack-alloc.c scheme. Not sure why, maybe they merely give the compiler more information about aliasing (or the lack thereof). Would like to modify @@ -421,9 +436,9 @@ Copyright 2000, 2001 Free Software Foundation, Inc. is the first allocated, rather than allocating a small size and then reallocing it. Update functions that rely on a single limb like mpz_set_ui, mpz_{t,f,c}div_{qr,r}_ui, and - others. Would need the initial z->_mp_d to point to a dummy - initial location, so it can be fetched from by mpz_odd_p and - similar macros. + others. Would need the initial z-&gt;_mp_d to point to a + dummy initial location, so it can be fetched from by + mpz_odd_p and similar macros.
  • Add mpf_out_raw and mpf_inp_raw. Make sure format is portable between 32-bit and 64-bit machines, and between little-endian and big-endian machines. @@ -497,6 +512,10 @@ Copyright 2000, 2001 Free Software Foundation, Inc. configfsf.guess does. (Not that we've got anything specific for them right now.)
  • HPPA: config.guess should recognize 7000, 7100, 7200, and 8x00. +
  • HPPA 2.0w: gcc is rumoured to support 2.0w as of version 3, though + perhaps just as a build-time choice. In any case, figure out how to + identify a suitable gcc or put it in the right mode, for the GMP compiler + choices.
  • Mips: config.guess should say mipsr3000, mipsr4000, mipsr10000, etc. "hinv -c processor" gives lots of information on Irix. Standard config.guess etc append "el" to indicate endianness, but GMP probably @@ -549,8 +568,8 @@ Copyright 2000, 2001 Free Software Foundation, Inc. what range of values the generated cofactors can take, and preferrably ensure the definition uniquely specifies the cofactors for given inputs. A basic extended Euclidean algorithm or multi-step variant leads to - |x|<|b| and |y|<|a| or something like that, but there's probably two - solutions under just those restrictions. + |x|&lt;|b| and |y|&lt;|a| or something like that, but there's probably + two solutions under just those restrictions.
  • mpz_invert should call mpn_gcdext directly.
  • Merge mpn/pa64 and pa64w.
  • mpz_urandomm should do something for n<=0 (but what?), @@ -665,7 +684,7 @@ near future, but are at least worth thinking about. native multiply is only 16x16. Could have this as an ABI option, selecting _SHORT_LIMB in gmp.h. Naturally a new set of asm subroutines would be necessary. Would need new - mpz_set_ui etc since the current code assumes limb>=long, + mpz_set_ui etc since the current code assumes limb&gt;=long, but 2-limb operand forms would find a use for long long on other processors too. -- cgit v1.2.1
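
The PP items added in the first hunk rest on one fact: if d divides M, then (a % M) % d equals a % d. A minimal C sketch of that fact follows. It is an illustration, not GMP code. The constants are assumptions chosen to match the text (15015 = 3*5*7*11*13 as a 16-bit product of odd primes, not taken from GMP's headers), and the helper names `residues_preserved`, `high_bit_set_16` and `pp_selfcheck` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; assumed, not copied from GMP's headers. */
#define PP16_ODD     15015u             /* 3*5*7*11*13, fits in 16 bits      */
#define PP16_SHIFTED (PP16_ODD << 2)    /* 60060: two factors of 2 added     */
#define PP16_TIMES3  (PP16_ODD * 3u)    /* 45045 = 3^2*5*7*11*13, below 2^16 */
#define M24          ((1u << 24) - 1u)  /* 2^24-1 = 3^2*5*7*13*17*241        */

/* Hypothetical helper: returns 1 if reducing a modulo `modulus` first
   leaves every remainder a % d[i] unchanged, else 0.  This is guaranteed
   whenever each d[i] divides `modulus`. */
static int residues_preserved(uint32_t a, uint32_t modulus,
                              const uint32_t *d, size_t n)
{
    uint32_t r = a % modulus;           /* the single "big" remainder */
    for (size_t i = 0; i < n; i++)
        if (r % d[i] != a % d[i])
            return 0;
    return 1;
}

/* Hypothetical helper: 1 if the top bit of a 16-bit limb value is set. */
static int high_bit_set_16(uint32_t v)
{
    return (int) ((v >> 15) & 1u);
}

/* Hypothetical self-check tying the helpers to the three task items. */
static void pp_selfcheck(uint32_t a)
{
    static const uint32_t odd[]  = { 3, 5, 7, 11, 13 };
    static const uint32_t sq[]   = { 9, 5, 7, 11, 13 };
    static const uint32_t f24[]  = { 3, 5, 7, 13, 17 };

    /* Factors of 2 set the high bit but leave the odd residues alone. */
    assert(!high_bit_set_16(PP16_ODD));
    assert(high_bit_set_16(PP16_SHIFTED));
    assert(residues_preserved(a, PP16_SHIFTED, odd, 5));

    /* An extra factor of 3 makes the residue mod 9 available as well. */
    assert(residues_preserved(a, PP16_TIMES3, sq, 5));

    /* One remainder mod 2^24-1 yields remainders mod 3, 5, 7, 13, 17. */
    assert(residues_preserved(a, M24, f24, 5));
}
```

Calling `pp_selfcheck` with any 32-bit value exercises all three cases. The sketch uses the plain `%` operator throughout; whether mpn_preinv_mod_1 actually beats mpn_mod_1 on a given machine is a separate measurement that this example does not address.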