author | Simon Marlow <marlowsd@gmail.com> | 2010-02-04 10:48:49 +0000 |
---|---|---|
committer | Simon Marlow <marlowsd@gmail.com> | 2010-02-04 10:48:49 +0000 |
commit | 335b9f366ac440259318777c4c07e4fa42fbbec6 (patch) | |
tree | 6eaa6bee7a0af467c18ed1d42eb47b38c52a9169 /compiler/nativeGen/X86/RegInfo.hs | |
parent | d9f7177402769968e8f42b49c1941661e18c5773 (diff) | |
download | haskell-335b9f366ac440259318777c4c07e4fa42fbbec6.tar.gz |
Implement SSE2 floating-point support in the x86 native code generator (#594)

The new flag -msse2 enables code generation for SSE2 on x86. It
results in substantially faster floating-point performance; the main
reason for doing this was that our x87 code generation is appallingly
bad, and since we plan to drop -fvia-C soon, we need a way to generate
half-decent floating-point code.

The catch is that SSE2 is not available on every x86 CPU (it needs P4+
or AMD K8+). We'll have to think hard about whether we should enable it
by default for the libraries we ship. In the meantime, at least
-msse2 should be an acceptable replacement for "-fvia-C
-optc-ffast-math -fexcess-precision".

SSE2 also has the advantage of performing all operations at the
correct precision, so floating-point results are consistent with other
platforms.

I also tweaked the x87 code generation a bit while I was here; it's
now slightly less bad than before.
Diffstat (limited to 'compiler/nativeGen/X86/RegInfo.hs')
-rw-r--r-- | compiler/nativeGen/X86/RegInfo.hs | 34 |
1 files changed, 18 insertions, 16 deletions
diff --git a/compiler/nativeGen/X86/RegInfo.hs b/compiler/nativeGen/X86/RegInfo.hs
index ed420a41b0..eb8e82c82f 100644
--- a/compiler/nativeGen/X86/RegInfo.hs
+++ b/compiler/nativeGen/X86/RegInfo.hs
@@ -23,12 +23,11 @@ import X86.Regs
 
 mkVirtualReg :: Unique -> Size -> VirtualReg
 mkVirtualReg u size
-   | not (isFloatSize size) = VirtualRegI u
-   | otherwise
    = case size of
-        FF32    -> VirtualRegD u
-        FF64    -> VirtualRegD u
-        _       -> panic "mkVirtualReg"
+        FF32    -> VirtualRegSSE u
+        FF64    -> VirtualRegSSE u
+        FF80    -> VirtualRegD u
+        _other  -> VirtualRegI u
 
 
 -- reg colors for x86
@@ -44,15 +43,8 @@ regColors
  $      [ (eax, "#00ff00")
         , (ebx, "#0000ff")
         , (ecx, "#00ffff")
-        , (edx, "#0080ff")
-
-        , (fake0, "#ff00ff")
-        , (fake1, "#ff00aa")
-        , (fake2, "#aa00ff")
-        , (fake3, "#aa00aa")
-        , (fake4, "#ff0055")
-        , (fake5, "#5500ff") ]
-
+        , (edx, "#0080ff") ]
+        ++ fpRegColors
 
 -- reg colors for x86_64
 #elif x86_64_TARGET_ARCH
@@ -76,9 +68,19 @@ regColors
         , (r13, "#004080")
         , (r14, "#004040")
         , (r15, "#002080") ]
-
-        ++ zip (map regSingle [16..31]) (repeat "red")
+        ++ fpRegColors
 #else
 regDotColor :: Reg -> SDoc
 regDotColor = panic "not defined"
 #endif
+
+fpRegColors :: [(Reg,String)]
+fpRegColors =
+        [ (fake0, "#ff00ff")
+        , (fake1, "#ff00aa")
+        , (fake2, "#aa00ff")
+        , (fake3, "#aa00aa")
+        , (fake4, "#ff0055")
+        , (fake5, "#5500ff") ]
+
+        ++ zip (map regSingle [24..39]) (repeat "red")
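
The effect of the mkVirtualReg change can be tried outside GHC with a minimal, self-contained sketch. The Unique, Size and VirtualReg types below are simplified stand-ins written for illustration (assumptions, not GHC's real definitions); only the case expression mirrors the new code in the diff above.

```haskell
-- Hypothetical stand-in for GHC's Unique; the real type is opaque.
newtype Unique = Unique Int
  deriving (Eq, Show)

-- Simplified stand-in for the operand sizes: a few integer sizes plus
-- the three float sizes named in the diff.
data Size = II8 | II16 | II32 | II64 | FF32 | FF64 | FF80
  deriving (Eq, Show)

-- Simplified stand-in for the virtual register classes used in the diff.
data VirtualReg
  = VirtualRegI   Unique  -- integer class
  | VirtualRegD   Unique  -- class the old code used for FF32 and FF64
  | VirtualRegSSE Unique  -- class the new code uses for FF32 and FF64
  deriving (Eq, Show)

-- Same shape as the new mkVirtualReg: 32- and 64-bit floats go to the
-- SSE class, FF80 keeps VirtualRegD, and every other size falls
-- through to the integer class instead of hitting a panic.
mkVirtualReg :: Unique -> Size -> VirtualReg
mkVirtualReg u size =
  case size of
    FF32   -> VirtualRegSSE u
    FF64   -> VirtualRegSSE u
    FF80   -> VirtualRegD   u
    _other -> VirtualRegI   u
```

Loaded into GHCi, `mkVirtualReg (Unique 1) FF64` evaluates to `VirtualRegSSE (Unique 1)`, while the old code would have produced `VirtualRegD` for the same size. The catch-all `_other` branch also replaces the old `isFloatSize` guard, so the function no longer has a panicking case.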