author | Richard Maw <richard.maw@codethink.co.uk> | 2012-01-19 10:33:31 +0000 |
---|---|---|
committer | Richard Maw <richard.maw@codethink.co.uk> | 2012-01-19 10:33:31 +0000 |
commit | 29137c6ff7a9e370e2332d855ab46616ad4e9cc9 (patch) | |
tree | fbca7aa7cfa645df1b059aeba7e81739620b013c /mpn/x86/k7/addlsh1_n.asm | |
parent | 962de8d4b353178d38c2c70e952944686b9fd47b (diff) | |
parent | 2c033efc02631f22e6e180ce737a2faf81b09ccc (diff) | |
download | gmp-29137c6ff7a9e370e2332d855ab46616ad4e9cc9.tar.gz |
Merge branch 'master' into baserock/morph
Diffstat (limited to 'mpn/x86/k7/addlsh1_n.asm')
-rw-r--r-- | mpn/x86/k7/addlsh1_n.asm | 6 |
1 files changed, 3 insertions, 3 deletions
diff --git a/mpn/x86/k7/addlsh1_n.asm b/mpn/x86/k7/addlsh1_n.asm
index e5163b676..05df4a740 100644
--- a/mpn/x86/k7/addlsh1_n.asm
+++ b/mpn/x86/k7/addlsh1_n.asm
@@ -44,14 +44,14 @@ C AMD K8
 C This is a basic addlsh1_n for k7, atom, and perhaps some other x86-32
 C processors. It uses 2*3-way unrolling, for good reasons. Unfortunately,
 C that means we need an initial magic multiply.
-C 
+C
 C It is not clear how to do sublsh1_n or rsblsh1_n using the same pattern. We
 C cannot do rsblsh1_n since we feed carry from the shift blocks to the
 C add/subtract blocks, which is right for addition but reversed for
 C subtraction. We could perhaps do sublsh1_n, with some extra move insns,
 C without losing any time, since we're not issue limited but carry recurrency
 C latency.
-C 
+C
 C Breaking carry recurrency might be a good idea. We would then need separate
 C registers for the shift carry and add/subtract carry, which in turn would
 C force is to 2*2-way unrolling.
@@ -120,7 +120,7 @@ ifdef(`CPU_P6',`
 L(exact):
  incl VAR_COUNT
  jz L(end)
-
+
  ALIGN(16)
 L(top):
 ifdef(`CPU_P6',`
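
For readers unfamiliar with the routine whose comments this diff touches: addlsh1_n computes {rp,n} = {up,n} + 2*{vp,n} and returns the carry out, which is 0, 1 or 2. The following is only a minimal reference sketch in C, assuming 32-bit limbs stored least significant first. The name `ref_addlsh1_n` is hypothetical and this is not the code in the .asm file; it exists to show the two carry chains the comment block talks about (the bit shifted out of each vp limb, and the carry out of each limb addition).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference helper, not GMP code: rp[] = up[] + 2*vp[] over n
   32-bit limbs, least significant limb first.  Returns the carry out (0..2). */
static uint32_t ref_addlsh1_n(uint32_t *rp, const uint32_t *up,
                              const uint32_t *vp, size_t n)
{
    uint32_t shift_cy = 0;  /* bit shifted out of the previous vp limb */
    uint32_t add_cy = 0;    /* carry out of the previous limb addition */

    for (size_t i = 0; i < n; i++) {
        uint32_t v = vp[i];

        /* Shift step: double the vp limb, pulling in the previous top bit. */
        uint32_t shifted = (v << 1) | shift_cy;
        shift_cy = v >> 31;

        /* Add step: widen to 64 bits so the carry out is easy to extract. */
        uint64_t sum = (uint64_t)up[i] + shifted + add_cy;
        rp[i] = (uint32_t)sum;
        add_cy = (uint32_t)(sum >> 32);
    }

    /* Both chains can carry out of the top limb, hence a result of 0..2. */
    return shift_cy + add_cy;
}
```

The sketch keeps the shift carry and the add carry in separate variables, which is the arrangement the comment suggests for breaking the carry recurrency. The assembly described in the comment instead feeds the carry from the shift blocks into the add blocks.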