author: Simon Marlow <marlowsd@gmail.com> 2010-02-04 10:48:49 +0000
committer: Simon Marlow <marlowsd@gmail.com> 2010-02-04 10:48:49 +0000
commit: 335b9f366ac440259318777c4c07e4fa42fbbec6
tree: 6eaa6bee7a0af467c18ed1d42eb47b38c52a9169 /compiler/nativeGen/X86/RegInfo.hs
parent: d9f7177402769968e8f42b49c1941661e18c5773
Implement SSE2 floating-point support in the x86 native code generator (#594)
The new flag -msse2 enables code generation for SSE2 on x86. It results in substantially faster floating-point performance; the main reason for doing this was that our x87 code generation is appallingly bad, and since we plan to drop -fvia-C soon, we need a way to generate half-decent floating-point code.

The catch is that SSE2 is only available on CPUs that support it (P4+, AMD K8+). We'll have to think hard about whether we should enable it by default for the libraries we ship. In the meantime, at least -msse2 should be an acceptable replacement for "-fvia-C -optc-ffast-math -fexcess-precision".

SSE2 also has the advantage of performing all operations at the correct precision, so floating-point results are consistent with other platforms.

I also tweaked the x87 code generation a bit while I was here; it is now slightly less bad than before.
Diffstat (limited to 'compiler/nativeGen/X86/RegInfo.hs')
-rw-r--r-- compiler/nativeGen/X86/RegInfo.hs 34
1 files changed, 18 insertions, 16 deletions
diff --git a/compiler/nativeGen/X86/RegInfo.hs b/compiler/nativeGen/X86/RegInfo.hs
index ed420a41b0..eb8e82c82f 100644
--- a/compiler/nativeGen/X86/RegInfo.hs
+++ b/compiler/nativeGen/X86/RegInfo.hs
@@ -23,12 +23,11 @@ import X86.Regs
mkVirtualReg :: Unique -> Size -> VirtualReg
mkVirtualReg u size
- | not (isFloatSize size) = VirtualRegI u
- | otherwise
= case size of
- FF32 -> VirtualRegD u
- FF64 -> VirtualRegD u
- _ -> panic "mkVirtualReg"
+ FF32 -> VirtualRegSSE u
+ FF64 -> VirtualRegSSE u
+ FF80 -> VirtualRegD u
+ _other -> VirtualRegI u
-- reg colors for x86
@@ -44,15 +43,8 @@ regColors
$ [ (eax, "#00ff00")
, (ebx, "#0000ff")
, (ecx, "#00ffff")
- , (edx, "#0080ff")
-
- , (fake0, "#ff00ff")
- , (fake1, "#ff00aa")
- , (fake2, "#aa00ff")
- , (fake3, "#aa00aa")
- , (fake4, "#ff0055")
- , (fake5, "#5500ff") ]
-
+ , (edx, "#0080ff") ]
+ ++ fpRegColors
-- reg colors for x86_64
#elif x86_64_TARGET_ARCH
@@ -76,9 +68,19 @@ regColors
, (r13, "#004080")
, (r14, "#004040")
, (r15, "#002080") ]
-
- ++ zip (map regSingle [16..31]) (repeat "red")
+ ++ fpRegColors
#else
regDotColor :: Reg -> SDoc
regDotColor = panic "not defined"
#endif
+
+fpRegColors :: [(Reg,String)]
+fpRegColors =
+ [ (fake0, "#ff00ff")
+ , (fake1, "#ff00aa")
+ , (fake2, "#aa00ff")
+ , (fake3, "#aa00aa")
+ , (fake4, "#ff0055")
+ , (fake5, "#5500ff") ]
+
+ ++ zip (map regSingle [24..39]) (repeat "red")