Optimize self-recursive tail calls

This patch implements the loopification optimization. It was described in "Low-level code optimisations in the Glasgow Haskell Compiler" by Krzysztof Woś, but we use a different approach here. Krzysztof's approach was to perform the optimization as a Cmm-to-Cmm pass. Our approach is to generate properly optimized tail calls in the code generator, which saves us the trouble of processing Cmm. This idea was proposed by Simon Marlow.

Implementation details are explained in Note [Self-recursive tail calls].

Performance of most nofib benchmarks is not affected. Some benchmarks show a 5-7% improvement, with an average improvement of 2.6%. Further investigation is needed to check whether this is benchmarking noise or whether this optimization really makes some class of programs faster.

As a minor cleanup, this patch renames forkProc to forkLneBody. It also moves some data declarations from StgCmmMonad to StgCmmClosure, because they are needed there and StgCmmClosure seems to be at the top of the whole StgCmm* hierarchy.
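For illustration only, here is a minimal sketch of the kind of source function this optimization targets. The function name sumTo is hypothetical and does not appear in the patch. Its second equation calls sumTo itself in tail position, which is the shape that the code generator can, under this patch, turn into a local jump back to a label emitted at function entry instead of a regular call through the function's entry point. The comments describe the intended effect as stated in the commit message, not the exact Cmm that GHC produces.

```haskell
-- Hypothetical example, not part of the patch.
-- The recursive call below is a self-recursive tail call: it is the
-- last thing the function does, and it calls the function being defined.
sumTo :: Int -> Int -> Int
sumTo acc 0 = acc
sumTo acc n = sumTo (acc + n) (n - 1)
-- Conceptually, with loopification this behaves like a loop:
-- update acc and n, then jump back to the start of the function body.
-- Note: this sketch assumes n >= 0; a negative n never terminates.

main :: IO ()
main = print (sumTo 0 100)  -- prints 5050
```

Whether a given call is actually compiled to a local jump depends on the conditions described in Note [Self-recursive tail calls].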
author: Jan Stolarek <jan.stolarek@p.lodz.pl> 2013-08-29 10:57:04 +0100
committer: Jan Stolarek <jan.stolarek@p.lodz.pl> 2013-08-29 12:56:09 +0100
commit: d61c3ac186c94021c851f7a2a6d20631e35fc1ba (patch)
tree: ff43791dfcd729fb9951feb6568be5306292fc9d /compiler/codeGen/StgCmmBind.hs
parent: 1d1ab12d084c07bd6aee03177ef6008c7ab08127 (diff)
download: haskell-d61c3ac186c94021c851f7a2a6d20631e35fc1ba.tar.gz
1 files changed, 14 insertions, 2 deletions
diff --git a/compiler/codeGen/StgCmmBind.hs b/compiler/codeGen/StgCmmBind.hs
index ce5491dc10..dccefd0fb0 100644
--- a/compiler/codeGen/StgCmmBind.hs
+++ b/compiler/codeGen/StgCmmBind.hs
@@ -30,6 +30,7 @@ import StgCmmForeign    (emitPrimCall)
 import MkGraph
 import CoreSyn          ( AltCon(..) )
 import SMRep
+import BlockId
 import Cmm
 import CmmInfo
 import CmmUtils
@@ -476,7 +477,17 @@ closureCodeBody top_lvl bndr cl_info cc args arity body fv_details
                 ; let node_points = nodeMustPointToIt dflags lf_info
                       node' = if node_points then Just node else Nothing
                 ; when node_points (ldvEnterClosure cl_info)
-
+                -- Emit new label that might potentially be a header
+                -- of a self-recursive tail call. See Note
+                -- [Self-recursive tail calls] in StgCmmExpr
+                ; u <- newUnique
+                ; let loop_header_id = mkBlockId u
+                ; emitLabel loop_header_id
+                -- Extend reader monad with information that
+                -- self-recursive tail calls can be optimized into local
+                -- jumps
+                ; withSelfLoop (bndr, loop_header_id, arg_regs) $ do
+                {
                 -- Main payload
                 ; entryHeapCheck cl_info node' arity arg_regs $ do
                 { -- ticky after heap check to avoid double counting
@@ -490,7 +501,8 @@ closureCodeBody top_lvl bndr cl_info cc args arity body fv_details
                 -- heap check, to reduce live vars over check
                 ; when node_points $ load_fvs node lf_info fv_bindings
                 ; void $ cgExpr body
-                }}
+                }}}
+
   }
 
 -- A function closure pointer may be tagged, so we