Diffstat (limited to 'rts/gmp/mpn/cray/README')
-rw-r--r--  rts/gmp/mpn/cray/README  14
1 file changed, 14 insertions, 0 deletions
diff --git a/rts/gmp/mpn/cray/README b/rts/gmp/mpn/cray/README
new file mode 100644
index 0000000000..8195c67e21
--- /dev/null
+++ b/rts/gmp/mpn/cray/README
@@ -0,0 +1,14 @@
+The (poorly optimized) code in this directory was originally written for a
+J90 system, but finished on a C90. It should work on all Cray vector
+computers. For the T3E and T3D systems, the code in the `alpha' subdirectory
+(a sibling of the directory containing this file) is much better.
+
+* `+' seems to be faster than `|' when combining carries.
+
+* It is possible that the best multiply performance would be achieved by
+ storing only 24 bits per element, and using lazy carry propagation. Before
+ calling i24mult, full carry propagation would be needed.
+
+* Supply tasking versions of the C loops.
+
+
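The 24-bits-per-element idea in the README can be illustrated with a minimal, portable C sketch. This is not the code from this directory. The names lazy_add, normalize and mul24 are hypothetical, 64-bit words are assumed, and mul24 is only a stand-in for a 24x24-bit multiply such as the i24mult the README mentions. Each word holds 24 significant bits, so additions can skip carry propagation and let the spare high bits absorb the overflow. A single full propagation pass then restores every element to below 2^24 before any multiply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LIMB_BITS 24
#define LIMB_MASK ((UINT64_C(1) << LIMB_BITS) - 1)

/* Lazy add: r[i] = a[i] + b[i] with no carry propagation.  Elements may
   temporarily exceed 24 bits; the unused high bits of each 64-bit word
   hold the excess until normalize() is called. */
static void lazy_add(uint64_t *r, const uint64_t *a, const uint64_t *b,
                     size_t n)
{
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] + b[i];
}

/* Full carry propagation: reduce every element to below 2^24, moving the
   excess into the next element.  Returns the carry out of the top element. */
static uint64_t normalize(uint64_t *r, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = r[i] + carry;
        r[i] = t & LIMB_MASK;
        carry = t >> LIMB_BITS;
    }
    return carry;
}

/* Hypothetical stand-in for a 24x24-bit multiply.  Both operands must be
   normalized (below 2^24), which is why full carry propagation has to
   happen first. */
static uint64_t mul24(uint64_t x, uint64_t y)
{
    assert(x <= LIMB_MASK && y <= LIMB_MASK);
    return x * y;
}
```

In this sketch many lazy_add calls can be chained before a single normalize, as long as the accumulated values still fit in a 64-bit word. Whether that trade pays off on Cray vector hardware is exactly the open question the README raises.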