The (poorly optimized) code in this directory was originally written for a
J90 system, but finished on a C90. It should work on all Cray vector
computers. For the T3E and T3D systems, the code in the `alpha' subdirectory
(at the same level as the directory containing this file) is much better.
* `+' seems to be faster than `|' when combining carries.
* It is possible that the best multiply performance would be achieved by
storing only 24 bits per element, and using lazy carry propagation. Before
calling i24mult, full carry propagation would be needed.
* Supply tasking versions of the C loops.
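
The first two notes above can be illustrated with a small portable C sketch.
It is not the code in this directory: the names add_with_carry, lazy_add,
normalize, LIMB_BITS and LIMB_MASK are hypothetical, and 64-bit storage
elements are an assumption. The first function shows why `+' and `|' give the
same result when combining the two carries of an add-with-carry step. The
other two show 24 bits stored per element, with additions done lazily and a
separate full carry propagation pass of the kind that would have to run
before a multiply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 24 significant bits are kept in each 64-bit element. */
#define LIMB_BITS 24
#define LIMB_MASK ((UINT64_C(1) << LIMB_BITS) - 1)

/* Full-word add with carry in and carry out.  The two partial carries
   c1 and c2 can never both be 1: if a + b wraps, the wrapped sum is at
   most 2^64 - 2, so adding cin (0 or 1) cannot wrap again.  That is why
   c1 + c2 and c1 | c2 are interchangeable here. */
static uint64_t add_with_carry(uint64_t a, uint64_t b, uint64_t cin,
                               uint64_t *cout)
{
    assert(cin <= 1);
    uint64_t s = a + b;
    uint64_t c1 = s < a;      /* carry out of a + b */
    uint64_t r = s + cin;
    uint64_t c2 = r < s;      /* carry out of s + cin */
    *cout = c1 + c2;          /* same value as c1 | c2 */
    return r;
}

/* Lazy add: element-wise sum with no carry propagation.  Elements may
   grow past 24 bits; there is no dependency between iterations, so the
   loop is a candidate for vectorization. */
static void lazy_add(uint64_t *r, const uint64_t *a, const uint64_t *b,
                     size_t n)
{
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] + b[i];
}

/* Full carry propagation: reduce every element to 24 bits again and
   return the carry out of the most significant element. */
static uint64_t normalize(uint64_t *r, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t t = r[i] + carry;
        r[i] = t & LIMB_MASK;
        carry = t >> LIMB_BITS;
    }
    return carry;
}
```

With 40 spare bits per element, many lazy additions can be accumulated before
normalize has to run; the serial carry chain is then paid once rather than on
every add.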