Fix a bug in CSE, for INLINE/INLNEABLE things

Previusly we simply weren't doing CSE at all on things whose unfolding were not always-active, for reasons explained in Note [CSE for INLINE and NOINLINE]. But that was bad! Making something INLNEABLE meant that its RHS was no longer CSE'd, and that made some nofib programs worse. And it's entirely unnecessary. I thoguht it through again, wrote new comments (under the same Note), and things are better again.
author: Simon Peyton Jones <simonpj@microsoft.com> 2014-08-29 15:11:19 +0100
committer: Simon Peyton Jones <simonpj@microsoft.com> 2014-08-29 17:16:53 +0100
commit: 4e0e7746344ca684af3dde216fa95a76df380cf1 (patch)
tree: 68638099be4c4520e5b6aaf0ea41125aaa023cc8
parent: f0db1857b053597e9ac43d9ce578e5f5fa0545cb (diff)
download: haskell-4e0e7746344ca684af3dde216fa95a76df380cf1.tar.gz
1 files changed, 40 insertions, 44 deletions
diff --git a/compiler/simplCore/CSE.lhs b/compiler/simplCore/CSE.lhs
index 90715737c2..f47c89b939 100644
--- a/compiler/simplCore/CSE.lhs
+++ b/compiler/simplCore/CSE.lhs
@@ -90,26 +90,16 @@ to the substitution
 
 Note [CSE for INLINE and NOINLINE]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We are careful to do no CSE inside functions that the user has marked as
-INLINE or NOINLINE.  In terms of Core, that means
+We are careful to with CSE inside functions that the user has marked as
+INLINE or NOINLINE. (Examples from Roman Leshchinskiy.)  Consider
 
-        a) we do not do CSE inside an InlineRule
-
-        b) we do not do CSE on the RHS of a binding b=e
-           unless b's InlinePragma is AlwaysActive
-
-Here's why (examples from Roman Leshchinskiy).  Consider
-
-        yes :: Int
-        {-# NOINLINE yes #-}
+        yes :: Int  {-# NOINLINE yes #-}
         yes = undefined
 
-        no :: Int
-        {-# NOINLINE no #-}
+        no :: Int   {-# NOINLINE no #-}
         no = undefined
 
-        foo :: Int -> Int -> Int
-        {-# NOINLINE foo #-}
+        foo :: Int -> Int -> Int  {-# NOINLINE foo #-}
         foo m n = n
 
         {-# RULES "foo/no" foo no = id #-}
@@ -117,35 +107,36 @@ Here's why (examples from Roman Leshchinskiy).  Consider
         bar :: Int -> Int
         bar = foo yes
 
-We do not expect the rule to fire.  But if we do CSE, then we get
-yes=no, and the rule does fire.  Worse, whether we get yes=no or
-no=yes depends on the order of the definitions.
+We do not expect the rule to fire.  But if we do CSE, then we risk
+getting yes=no, and the rule does fire.  Actually, it won't becuase
+NOINLINE means that 'yes' will never be inlined, not even if we have
+yes=no.  So that's fine (now; perhpas in the olden days, yes=no would
+have substituted even if 'yes' was NOINLINE.
 
-In general, CSE should probably never touch things with INLINE pragmas
-as this could lead to surprising results.  Consider
-
-        {-# INLINE foo #-}
-        foo = <rhs>
+But we do need to take care.  Consider
 
         {-# NOINLINE bar #-}
         bar = <rhs>     -- Same rhs as foo
 
+        foo = <rhs>
+
 If CSE produces
         foo = bar
-then foo will never be inlined (when it should be); but if it produces
-        bar = foo
-bar will be inlined (when it should not be). Even if we remove INLINE foo,
-we'd still like foo to be inlined if rhs is small. This won't happen
-with foo = bar.
-
-Not CSE-ing inside INLINE also solves an annoying bug in CSE. Consider
-a worker/wrapper, in which the worker has turned into a single variable:
-        $wf = h
-        f = \x -> ...$wf...
-Now CSE may transform to
-        f = \x -> ...h...
-But the WorkerInfo for f still says $wf, which is now dead!  This won't
-happen now that we don't look inside INLINEs (which wrappers are).
+then foo will never be inlined to <rhs> (when it should be, if <rhs>
+is small).  The conclusion here is this:
+
+   We should not add
+       <rhs> :-> bar
+  to the CSEnv if 'bar' has any constraints on when it can inline;
+  that is, if its 'activation' not always active.  Otherwise we
+  might replace <rhs> by 'bar', and then later be unable to see that it
+  really was <rhs>.
+
+Note that we do not (currently) do CSE on the unfolding stored inside
+an Id, even if is a 'stable' unfolding.  That means that when an
+unfolding happens, it is always faithful to what the stable unfolding
+originally was.
+
 
 Note [CSE for case expressions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -188,8 +179,12 @@ cseBind env (Rec pairs)
 cseRhs :: CSEnv -> (OutBndr, InExpr) -> (CSEnv, OutExpr)
 cseRhs env (id',rhs)
   = case lookupCSEnv env rhs' of
-        Nothing -> (extendCSEnv env rhs' id', rhs')
-        Just id -> (extendCSSubst env id' id, Var id)
+        Nothing
+          | always_active -> (extendCSEnv env rhs' id', rhs')
+          | otherwise     -> (env,                      rhs')
+        Just id
+          | always_active -> (extendCSSubst env id' id, Var id)
+          | otherwise     -> (env,                      Var id)
           -- In the Just case, we have
           --        x = rhs
           --        ...
@@ -199,9 +194,10 @@ cseRhs env (id',rhs)
           -- that subsequent uses of x' are replaced with x,
           -- See Trac #5996
   where
-    rhs' | isAlwaysActive (idInlineActivation id') = cseExpr env rhs
-         | otherwise                               = rhs
-                -- See Note [CSE for INLINE and NOINLINE]
+    rhs' = cseExpr env rhs
+
+    always_active = isAlwaysActive (idInlineActivation id')
+         -- See Note [CSE for INLINE and NOINLINE]
 
 tryForCSE :: CSEnv -> InExpr -> OutExpr
 tryForCSE env expr
@@ -259,8 +255,8 @@ cseAlts env scrut' bndr bndr' alts
         = (DataAlt con, args', tryForCSE new_env rhs)
         where
           (env', args') = addBinders alt_env args
-          new_env       = extendCSEnv env' (mkAltExpr (DataAlt con) args' arg_tys)
-                                           con_target
+          new_env       = extendCSEnv env' con_expr con_target
+          con_expr      = mkAltExpr (DataAlt con) args' arg_tys
 
     cse_alt (con, args, rhs)
         = (con, args', tryForCSE env' rhs)
author	Simon Peyton Jones <simonpj@microsoft.com>	2014-08-29 15:11:19 +0100
committer	Simon Peyton Jones <simonpj@microsoft.com>	2014-08-29 17:16:53 +0100
commit	4e0e7746344ca684af3dde216fa95a76df380cf1 (patch)
tree	68638099be4c4520e5b6aaf0ea41125aaa023cc8
parent	f0db1857b053597e9ac43d9ce578e5f5fa0545cb (diff)
download	haskell-4e0e7746344ca684af3dde216fa95a76df380cf1.tar.gz