Fix a small Float-Out bug

The float-out pass uses a heuristic based on strictness info. But it was getting the strictness info mis-aligned; I'd forgotten that strictness flags only apply to /value/ arguments. This patch fixes it. It has some surprising effects! -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- integer -0.1% +9.9% +0.2% +0.2% +0.0% lcss +0.0% +0.0% -11.9% -11.9% +0.0% queens -0.2% +29.0% 0.02 0.02 +0.0% simple -0.1% -22.6% -21.7% -21.7% -3.6% treejoin +0.0% +0.0% -12.3% -12.6% +0.0% -------------------------------------------------------------------------------- Min -0.2% -22.6% -21.7% -21.7% -10.0% Max +3.3% +29.0% +19.2% +19.2% +50.0% Geometric Mean +0.0% +0.1% -2.1% -2.1% +0.2% The 'queens' and 'integer' allocation regressions are because, just before let-floatting, we get \v -> foldr k z (case x of I# y -> build ..y..) Becase of Note [Case MFEs] we don't float the build; so fusion happens. This increases allocation in queens because the build isn't shared; but actaully runtime improves solidly. Situation is similar in integer, although I think runtime gets a bit worse. The bug meant that, because of foldr's type arguments, the mis-aligned strictness info meant that the entire (case x ...) was floated, so fusion failed, but sharing happened. This is all very artificial-benchmark-ish so I'm not losing sleep over it. I did see some runtime numbers increasd, but I think it's noise; the differnce went away when I tried them one by one afterwards.
author: Simon Peyton Jones <simonpj@microsoft.com> 2017-04-26 17:43:24 +0100
committer: Simon Peyton Jones <simonpj@microsoft.com> 2017-05-02 09:00:14 +0100
commit: ff239787f7170a93f1015bd0f5582772b7b87f0a (patch)
tree: bbc2614bb72f76b85b075afdbc2a797113c3bedf
parent: 71037b61597d8e80ba5acebc8ad2295e5266dc07 (diff)
download: haskell-ff239787f7170a93f1015bd0f5582772b7b87f0a.tar.gz
1 files changed, 31 insertions, 8 deletions
diff --git a/compiler/simplCore/SetLevels.hs b/compiler/simplCore/SetLevels.hs
index afca7ae3b9..2b533b73bd 100644
--- a/compiler/simplCore/SetLevels.hs
+++ b/compiler/simplCore/SetLevels.hs
@@ -81,7 +81,7 @@ import Var
 import VarSet
 import VarEnv
 import Literal          ( litIsTrivial )
-import Demand           ( StrictSig, isStrictDmd, splitStrictSig, increaseStrictSigArity )
+import Demand           ( StrictSig, Demand, isStrictDmd, splitStrictSig, increaseStrictSigArity )
 import Name             ( getOccName, mkSystemVarName )
 import OccName          ( occNameString )
 import Type             ( Type, mkLamTypes, splitTyConApp_maybe )
@@ -95,7 +95,7 @@ import FastString
 import UniqDFM
 import FV
 import Data.Maybe
-import Control.Monad    ( zipWithM )
+import MonadUtils       ( mapAccumLM )
 
 {-
 ************************************************************************
@@ -402,7 +402,7 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
         ; return (foldl App lapp' rargs') }
 
   | otherwise
-  = do { args' <- zipWithM (lvlMFE env) stricts args
+  = do { (_, args') <- mapAccumLM lvl_arg stricts args
             -- Take account of argument strictness; see
             -- Note [Floating to the top]
        ; return (foldl App (lookupVar env fn) args') }
@@ -410,12 +410,12 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
     n_val_args = count (isValArg . deAnnotate) args
     arity      = idArity fn
 
-    stricts :: [Bool]   -- True for strict argument
+    stricts :: [Demand]   -- True for strict /value/ arguments
     stricts = case splitStrictSig (idStrictness fn) of
-                (arg_ds, _) | not (arg_ds `lengthExceeds` n_val_args)
-                            -> map isStrictDmd arg_ds ++ repeat False
+                (arg_ds, _) | arg_ds `lengthExceeds` n_val_args
+                            -> []
                             | otherwise
-                            -> repeat False
+                            -> arg_ds
 
     -- Separate out the PAP that we are floating from the extra
     -- arguments, by traversing the spine until we have collected
@@ -428,6 +428,19 @@ lvlApp env orig_expr ((_,AnnVar fn), args)
        | otherwise               = left n     f (a:rargs)
     left _ _ _                   = panic "SetLevels.lvlExpr.left"
 
+    is_val_arg :: CoreExprWithFVs -> Bool
+    is_val_arg (_, AnnType {}) = False
+    is_val_arg _               = True
+
+    lvl_arg :: [Demand] -> CoreExprWithFVs -> LvlM ([Demand], LevelledExpr)
+    lvl_arg strs arg | (str1 : strs') <- strs
+                     , is_val_arg arg
+                     = do { arg' <- lvlMFE env (isStrictDmd str1) arg
+                          ; return (strs', arg') }
+                     | otherwise
+                     = do { arg' <- lvlMFE env False arg
+                          ; return (strs, arg') }
+
 lvlApp env _ (fun, args)
   =  -- No PAPs that we can float: just carry on with the
      -- arguments and the function.
@@ -893,7 +906,17 @@ in exchange we build a thunk, which is bad.  This case reduces allocation
 by 7% in spectral/puzzle (a rather strange benchmark) and 1.2% in real/fem.
 Doesn't change any other allocation at all.
 
-We will make a separate decision for the scrutinees and alternatives.
+We will make a separate decision for the scrutinee and alternatives.
+
+However this can have a knock-on effect for fusion: consider
+    \v -> foldr k z (case x of I# y -> build ..y..)
+Perhaps we can float the entire (case x of ...) out of the \v.  Then
+fusion will not happen, but we will get more sharing.  But if we don't
+float the case (as advocated here) we won't float the (build ...y..)
+either, so fusion will happen.  It can be a big effect, esp in some
+artificial benchmarks (e.g. integer, queens), but there is no perfect
+answer.
+
 -}
 
 annotateBotStr :: Id -> Arity -> Maybe (Arity, StrictSig) -> Id
author	Simon Peyton Jones <simonpj@microsoft.com>	2017-04-26 17:43:24 +0100
committer	Simon Peyton Jones <simonpj@microsoft.com>	2017-05-02 09:00:14 +0100
commit	ff239787f7170a93f1015bd0f5582772b7b87f0a (patch)
tree	bbc2614bb72f76b85b075afdbc2a797113c3bedf
parent	71037b61597d8e80ba5acebc8ad2295e5266dc07 (diff)
download	haskell-ff239787f7170a93f1015bd0f5582772b7b87f0a.tar.gz