Make worker/wrapper work on INLINEABLE things

This fixes a long-standing bug: Trac #6056. The trouble was that INLINEABLE "used up" the unfolding for the Id, so it couldn't be worker/wrapper'd by the strictness analyser. This patch allows the w/w to go ahead, and makes the *worker* INLINEABLE instead, so it can later be specialised. However, that doesn't completely solve the problem, because the dictionary argument (which the specialiser treats specially) may be strict and hence unpacked by w/w, so now the worker won't be specilialised after all. Solution: never unpack dictionary arguments, which is done by the isClassTyCon test in WwLib.deepSplitProductType_maybe
author: Simon Peyton Jones <simonpj@microsoft.com> 2014-05-09 11:15:33 +0100
committer: Simon Peyton Jones <simonpj@microsoft.com> 2014-08-28 11:14:08 +0100
commit: 9cf5906b692c31b7ec67856b0859cb0e33770651 (patch)
tree: 3e8dd54d2e37356dc700e4bd46735dcc443ba702 /compiler/stranal
parent: 8f09937426a40b9c638d63a2d726c3b755f88f82 (diff)
download: haskell-9cf5906b692c31b7ec67856b0859cb0e33770651.tar.gz
2 files changed, 78 insertions, 37 deletions
diff --git a/compiler/stranal/WorkWrap.lhs b/compiler/stranal/WorkWrap.lhs
index 4e4b1aee99..f845151098 100644
--- a/compiler/stranal/WorkWrap.lhs
+++ b/compiler/stranal/WorkWrap.lhs
@@ -8,7 +8,7 @@
 module WorkWrap ( wwTopBinds ) where
 
 import CoreSyn
-import CoreUnfold       ( certainlyWillInline, mkInlineUnfolding, mkWwInlineRule )
+import CoreUnfold	( certainlyWillInline, mkWwInlineRule, mkWorkerUnfolding )
 import CoreUtils        ( exprType, exprIsHNF )
 import CoreArity        ( exprArity )
 import Var
@@ -163,19 +163,33 @@ Notice that we refrain from w/w'ing an INLINE function even if it is
 in a recursive group.  It might not be the loop breaker.  (We could
 test for loop-breaker-hood, but I'm not sure that ever matters.)
 
-Note [Don't w/w INLINABLE things]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Note [Worker-wrapper for INLINABLE functions]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 If we have
   {-# INLINABLE f #-}
-  f x y = ....
-then in principle we might get a more efficient loop by w/w'ing f.
-But that would make a new unfolding which would overwrite the old
-one.  So we leave INLINABLE things alone too.
+  f :: Ord a => [a] -> Int -> a
+  f x y = ....f....
 
-This is a slight infelicity really, because it means that adding
-an INLINABLE pragma could make a program a bit less efficient,
-because you lose the worker/wrapper stuff.  But I don't see a way
-to avoid that.
+where f is strict in y, we might get a more efficient loop by w/w'ing
+f.  But that would make a new unfolding which would overwrite the old
+one! So the function would no longer be ININABLE, and in particular
+will not be specialised at call sites in other modules.
+
+This comes in practice (Trac #6056).
+
+Solution: do the w/w for strictness analysis, but transfer the Stable
+unfolding to the *worker*.  So we will get something like this:
+
+  {-# INLINE f #-}
+  f :: Ord a => [a] -> Int -> a
+  f d x y = case y of I# y' -> fw d x y'
+
+  {-# INLINABLE fw #-}
+  fw :: Ord a => [a] -> Int# -> a
+  fw d x y' = let y = I# y' in ...f...
+
+How do we "transfer the unfolding"? Easy: by using the old one, wrapped
+in work_fn! See CoreUnfold.mkWorkerUnfolding.
 
 Note [Don't w/w inline small non-loop-breaker things]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -253,19 +267,21 @@ tryWW dflags fam_envs is_rec fn_id rhs
         -- Furthermore, don't even expose strictness info
   = return [ (fn_id, rhs) ]
 
+
+{-
   | isStableUnfolding (realIdUnfolding fn_id)
   = return [ (fn_id, rhs) ]
       -- See Note [Don't w/w INLINE things]
-      -- and Note [Don't w/w INLINABLE things]
       -- NB: use realIdUnfolding because we want to see the unfolding
       --     even if it's a loop breaker!
+-}
 
-  | certainlyWillInline dflags (idUnfolding fn_id)
-  = let inline_rule = mkInlineUnfolding Nothing rhs
-    in  return [ (fn_id `setIdUnfolding` inline_rule, rhs) ]
-        -- Note [Don't w/w inline small non-loop-breaker things]
-        -- NB: use idUnfolding because we don't want to apply
-        --     this criterion to a loop breaker!
+  | not loop_breaker
+  , Just stable_unf <- certainlyWillInline dflags fn_unf
+  = return [ (fn_id `setIdUnfolding` stable_unf, rhs) ]
+	-- Note [Don't w/w inline small non-loop-breaker, or INLINE, things]
+	-- NB: use idUnfolding because we don't want to apply
+	--     this criterion to a loop breaker!
 
   | is_fun
   = splitFun dflags fam_envs new_fn_id fn_info wrap_dmds res_info rhs
@@ -277,8 +293,10 @@ tryWW dflags fam_envs is_rec fn_id rhs
   = return [ (new_fn_id, rhs) ]
 
   where
-    fn_info      = idInfo fn_id
+    loop_breaker = isStrongLoopBreaker (occInfo fn_info)
+    fn_info	 = idInfo fn_id
     inline_act   = inlinePragmaActivation (inlinePragInfo fn_info)
+    fn_unf       = unfoldingInfo fn_info
 
         -- In practice it always will have a strictness
         -- signature, even if it's a uninformative one
@@ -309,25 +327,28 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds res_info rhs
       Just (work_demands, wrap_fn, work_fn) -> do
         work_uniq <- getUniqueM
         let work_rhs = work_fn rhs
-            work_id  = mkWorkerId work_uniq fn_id (exprType work_rhs)
-                        `setIdOccInfo` occInfo fn_info
-                                -- Copy over occurrence info from parent
-                                -- Notably whether it's a loop breaker
-                                -- Doesn't matter much, since we will simplify next, but
-                                -- seems right-er to do so
-
-                        `setInlineActivation` (inlinePragmaActivation inl_prag)
-                                -- Any inline activation (which sets when inlining is active)
-                                -- on the original function is duplicated on the worker
-                                -- It *matters* that the pragma stays on the wrapper
-                                -- It seems sensible to have it on the worker too, although we
-                                -- can't think of a compelling reason. (In ptic, INLINE things are
-                                -- not w/wd). However, the RuleMatchInfo is not transferred since
+	    work_id  = mkWorkerId work_uniq fn_id (exprType work_rhs)
+		        `setIdOccInfo` occInfo fn_info
+				-- Copy over occurrence info from parent
+				-- Notably whether it's a loop breaker
+				-- Doesn't matter much, since we will simplify next, but
+				-- seems right-er to do so
+
+			`setInlinePragma` inl_prag
+				-- Any inline activation (which sets when inlining is active)
+				-- on the original function is duplicated on the worker
+				-- It *matters* that the pragma stays on the wrapper
+				-- It seems sensible to have it on the worker too, although we
+				-- can't think of a compelling reason. (In ptic, INLINE things are
+				-- not w/wd). However, the RuleMatchInfo is not transferred since
                                 -- it does not make sense for workers to be constructorlike.
 
-                        `setIdStrictness` mkClosedStrictSig work_demands work_res_info
-                                -- Even though we may not be at top level,
-                                -- it's ok to give it an empty DmdEnv
+                        `setIdUnfolding` mkWorkerUnfolding dflags work_fn (unfoldingInfo fn_info)
+                                -- See Note [Worker-wrapper for INLINABLE functions]
+
+			`setIdStrictness` mkClosedStrictSig work_demands work_res_info
+				-- Even though we may not be at top level,
+				-- it's ok to give it an empty DmdEnv
 
                         `setIdArity` (exprArity work_rhs)
                                 -- Set the arity so that the Core Lint check that the
@@ -348,9 +369,9 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds res_info rhs
                               `setIdOccInfo` NoOccInfo
                                 -- Zap any loop-breaker-ness, to avoid bleating from Lint
                                 -- about a loop breaker with an INLINE rule
+
         return $ [(work_id, work_rhs), (wrap_id, wrap_rhs)]
             -- Worker first, because wrapper mentions it
-            -- mkWwBodies has already built a wrap_rhs with an INLINE pragma wrapped around it
 
       Nothing -> return [(fn_id, rhs)]
   where
diff --git a/compiler/stranal/WwLib.lhs b/compiler/stranal/WwLib.lhs
index 7a9845b3d7..82c310710f 100644
--- a/compiler/stranal/WwLib.lhs
+++ b/compiler/stranal/WwLib.lhs
@@ -516,6 +516,25 @@ bug.  The fix here is simply to decline to do w/w if that happens.
 %*                                                                      *
 %************************************************************************
 
+Note [Do not unpack class dictionaries]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we have
+   f :: Ord a => [a] -> Int -> a
+   {-# INLINABLE f #-}
+and we worker/wrapper f, we'll get a worker with an INLINALBE pragma
+(see Note [Worker-wrapper for INLINABLE functions] in WorkWrap), which
+can still be specialised by the type-class specialiser, something like
+   fw :: Ord a => [a] -> Int# -> a
+
+BUT if f is strict in the Ord dictionary, we might unpack it, to get
+   fw :: (a->a->Bool) -> [a] -> Int# -> a
+and the type-class specialiser can't specialise that.
+
+Moreover, dictinoaries can have a lot of fields, so unpacking them can
+increase closure sizes.
+
+Conclusion: don't unpack dictionaries.
+
 \begin{code}
 deepSplitProductType_maybe :: FamInstEnvs -> Type -> Maybe (DataCon, [Type], [Type], Coercion)
 -- If    deepSplitProductType_maybe ty = Just (dc, tys, arg_tys, co)
@@ -526,6 +545,7 @@ deepSplitProductType_maybe fam_envs ty
                     `orElse` (mkReflCo Representational ty, ty)
   , Just (tc, tc_args) <- splitTyConApp_maybe ty1
   , Just con <- isDataProductTyCon_maybe tc
+  , not (isClassTyCon tc)  -- See Note [Do not unpack class dictionaries]
   = Just (con, tc_args, dataConInstArgTys con tc_args, co)
 deepSplitProductType_maybe _ _ = Nothing
author	Simon Peyton Jones <simonpj@microsoft.com>	2014-05-09 11:15:33 +0100
committer	Simon Peyton Jones <simonpj@microsoft.com>	2014-08-28 11:14:08 +0100
commit	9cf5906b692c31b7ec67856b0859cb0e33770651 (patch)
tree	3e8dd54d2e37356dc700e4bd46735dcc443ba702 /compiler/stranal
parent	8f09937426a40b9c638d63a2d726c3b755f88f82 (diff)
download	haskell-9cf5906b692c31b7ec67856b0859cb0e33770651.tar.gz