summaryrefslogtreecommitdiff
path: root/ghc/compiler/nativeGen/AsmCodeGen.lhs
diff options
context:
space:
mode:
authorsewardj <unknown>2000-06-15 08:38:25 +0000
committersewardj <unknown>2000-06-15 08:38:25 +0000
commit665229e50c7704d70e8b8966c9f0e8d4623cf293 (patch)
tree7004c6a7692c702502a87fa8e2d74e5247240999 /ghc/compiler/nativeGen/AsmCodeGen.lhs
parent1364aa0bb9152ca00f433359487ec83ed7136df0 (diff)
downloadhaskell-665229e50c7704d70e8b8966c9f0e8d4623cf293.tar.gz
[project @ 2000-06-15 08:38:25 by sewardj]
Major thing: new register allocator. Brief description follows. Should correctly handle code with loops in, even though we don't generate any such at the moment. A lot of comments. The previous machinery for spilling is retained, as is the idea of a fast-and-easy initial allocation attempt intended to deal with the majority of code blocks (about 60% on x86) very cheaply. Many comments explaining in detail how it works :-) The Stix inliner is now on by default. Integer code seems to run within about 1% of that -fvia-C. x86 fp code is significantly worse, up to about 30% slower, depending on the amount of fp activity. Minor thing: lazyfication of the top-level NCG plumbing, so that the NCG doesn't require any greater residency than compiling to C, just a bit more time. Created lazyThenUs and lazyMapUs for this purpose. The new allocator is somewhat, although not catastophically, slower than the old one. Fixing of the long-standing NCG space leak more than makes up for it; overall hsc run-time is down about 5%, due to significantly reduced GC time. -------------------------------------------------------------------- Instructions are numbered sequentially, starting at zero. A flow edge (FE) is a pair of insn numbers (MkFE Int Int) denoting a possible flow of control from the first insn to the second. The input to the register allocator is a list of instructions, which mention Regs. A Reg can be a RealReg -- a real machine reg -- or a VirtualReg, which carries a unique. After allocation, all the VirtualReg references will have been converted into RealRegs, and possibly some spill code will have been inserted. The heart of the register allocator works in four phases. 1. (find_flow_edges) Calculate all the FEs for the code list. Return them not as a [FE], but implicitly, as a pair of Array Int [Int], being the successor and predecessor maps for instructions. 2. (calc_liveness) Returns a FiniteMap FE RegSet. For each FE, indicates the set of registers live on that FE. Note that the set includes both RealRegs and VirtualRegs. The former appear because the code could mention fixed register usages, and we need to take them into account from the start. 3. (calc_live_range_sets) Invert the above mapping, giving a FiniteMap Reg FeSet, indicating, for each virtual and real reg mentioned in the code, which FEs it is live on. 4. (calc_vreg_to_rreg_mapping) For virtual reg, try and find an allocatable real register for it. Each real register has a "current commitment", indicating the set of FEs it is currently live on. A virtual reg v can be assigned to real reg r iff v's live-fe-set does not intersect with r's current commitment fe-set. If the assignment is made, v's live-fe-set is union'd into r's current commitment fe-set. There is also the minor restriction that v and r must be of the same register class (integer or floating). Once this mapping is established, we simply apply it to the input insns, and that's it. If no suitable real register can be found, the vreg is mapped to itself, and we deem allocation to have failed. The partially allocated code is returned. The higher echelons of the allocator (doGeneralAlloc and runRegAlloc) then cooperate to insert spill code and re-run allocation, until a successful allocation is found.
Diffstat (limited to 'ghc/compiler/nativeGen/AsmCodeGen.lhs')
-rw-r--r--ghc/compiler/nativeGen/AsmCodeGen.lhs103
1 files changed, 51 insertions, 52 deletions
diff --git a/ghc/compiler/nativeGen/AsmCodeGen.lhs b/ghc/compiler/nativeGen/AsmCodeGen.lhs
index f483095f82..d85bc69133 100644
--- a/ghc/compiler/nativeGen/AsmCodeGen.lhs
+++ b/ghc/compiler/nativeGen/AsmCodeGen.lhs
@@ -18,9 +18,10 @@ import PprMach
import AbsCStixGen ( genCodeAbstractC )
import AbsCSyn ( AbstractC, MagicId )
+import AbsCUtils ( mkAbsCStmtList )
import AsmRegAlloc ( runRegAllocate )
import PrimOp ( commutableOp, PrimOp(..) )
-import RegAllocInfo ( mkMRegsState, MRegsState, findReservedRegs )
+import RegAllocInfo ( findReservedRegs )
import Stix ( StixTree(..), StixReg(..),
pprStixTrees, pprStixTree, CodeSegment(..),
stixCountTempUses, stixSubst,
@@ -29,7 +30,8 @@ import Stix ( StixTree(..), StixReg(..),
uniqOfNatM_State, deltaOfNatM_State )
import PrimRep ( isFloatingRep, PrimRep(..) )
import UniqSupply ( returnUs, thenUs, mapUs, initUs,
- initUs_, UniqSM, UniqSupply )
+ initUs_, UniqSM, UniqSupply,
+ lazyThenUs, lazyMapUs )
import MachMisc ( IF_ARCH_i386(i386_insert_ffrees,) )
import OrdList ( fromOL, concatOL )
@@ -87,38 +89,47 @@ So, here we go:
\begin{code}
nativeCodeGen :: AbstractC -> UniqSupply -> (SDoc, SDoc)
nativeCodeGen absC us
- = let (stixRaw, us1) = initUs us (genCodeAbstractC absC)
- stixOpt = map genericOpt stixRaw
- insns = initUs_ us1 (codeGen stixOpt)
- debug_stix = vcat (map pprStixTrees stixOpt)
- in {- trace "nativeGen: begin" -} (debug_stix, insns)
-\end{code}
-
-@codeGen@ is the top-level code-generation function:
-\begin{code}
-codeGen :: [[StixTree]] -> UniqSM SDoc
-
-codeGen stixFinal
- = mapUs genMachCode stixFinal `thenUs` \ dynamic_codes ->
- let
- fp_kludge :: [Instr] -> [Instr]
- fp_kludge = IF_ARCH_i386(i386_insert_ffrees,id)
-
- static_instrss :: [[Instr]]
- static_instrss = map fp_kludge (scheduleMachCode dynamic_codes)
- docs = map (vcat . map pprInstr) static_instrss
-
- -- for debugging only
- docs_prealloc = map (vcat . map pprInstr . fromOL)
- dynamic_codes
- text_prealloc = vcat (intersperse (char ' ' $$ char ' ') docs_prealloc)
- in
- --trace (showSDoc text_prealloc) (
- returnUs (vcat (intersperse (char ' '
- $$ ptext SLIT("# ___stg_split_marker")
- $$ char ' ')
- docs))
- --)
+ = let absCstmts = mkAbsCStmtList absC
+ (sdoc_pairs, us1) = initUs us (lazyMapUs absCtoNat absCstmts)
+ stix_sdocs = map fst sdoc_pairs
+ insn_sdocs = map snd sdoc_pairs
+
+ insn_sdoc = my_vcat insn_sdocs
+ stix_sdoc = vcat stix_sdocs
+
+# if DEBUG
+ my_trace m x = trace m x
+ my_vcat sds = vcat (intersperse (char ' '
+ $$ ptext SLIT("# ___stg_split_marker")
+ $$ char ' ')
+ sds)
+# else
+ my_vcat sds = vcat sds
+ my_trace m x = x
+# endif
+ in
+ my_trace "nativeGen: begin"
+ (stix_sdoc, insn_sdoc)
+
+
+absCtoNat :: AbstractC -> UniqSM (SDoc, SDoc)
+absCtoNat absC
+ = genCodeAbstractC absC `thenUs` \ stixRaw ->
+ genericOpt stixRaw `bind` \ stixOpt ->
+ genMachCode stixOpt `thenUs` \ pre_regalloc ->
+ regAlloc pre_regalloc `bind` \ almost_final ->
+ x86fp_kludge almost_final `bind` \ final_mach_code ->
+ vcat (map pprInstr final_mach_code) `bind` \ final_sdoc ->
+ pprStixTrees stixOpt `bind` \ stix_sdoc ->
+ returnUs (stix_sdoc, final_sdoc)
+ where
+ bind f x = x f
+
+ x86fp_kludge :: [Instr] -> [Instr]
+ x86fp_kludge = IF_ARCH_i386(i386_insert_ffrees,id)
+
+ regAlloc :: InstrBlock -> [Instr]
+ regAlloc = runRegAllocate allocatableRegs findReservedRegs
\end{code}
Top level code generator for a chunk of stix code. For this part of
@@ -154,20 +165,6 @@ genMachCode stmts initial_us
(int final_delta)
\end{code}
-The next bit does the code scheduling. The scheduler must also deal
-with register allocation of temporaries. Much parallelism can be
-exposed via the OrdList, but more might occur, so further analysis
-might be needed.
-
-\begin{code}
-scheduleMachCode :: [InstrBlock] -> [[Instr]]
-
-scheduleMachCode
- = map (runRegAllocate freeRegsState findReservedRegs)
- where
- freeRegsState = mkMRegsState (extractMappedRegNos freeRegs)
-\end{code}
-
%************************************************************************
%* *
\subsection[NCOpt]{The Generic Optimiser}
@@ -197,24 +194,26 @@ stixPeep :: [StixTree] -> [StixTree]
-- second assignment would be substituted for, giving nonsense
-- code. As far as I can see, StixTemps are only ever assigned
-- to once. It would be nice to be sure!
-{-
+
stixPeep ( t1@(StAssign pka (StReg (StixTemp u pk)) rhs)
: t2
: ts )
| stixCountTempUses u t2 == 1
&& sum (map (stixCountTempUses u) ts) == 0
- = trace ("nativeGen: stixInline: " ++ showSDoc (pprStixTree rhs))
+ =
+# ifdef DEBUG
+ trace ("nativeGen: inlining " ++ showSDoc (pprStixTree rhs))
+# endif
(stixPeep (stixSubst u rhs t2 : ts))
stixPeep (t1:t2:ts) = t1 : stixPeep (t2:ts)
stixPeep [t1] = [t1]
stixPeep [] = []
--}
-- disable stix inlining until we figure out how to fix the
-- latent bugs in the register allocator which are exposed by
-- the inliner.
-stixPeep = id
+--stixPeep = id
\end{code}
For most nodes, just optimize the children.