diff options
author | Sylvain Henry <sylvain@haskus.fr> | 2020-03-17 09:45:29 +0100 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2020-03-18 10:06:43 -0400 |
commit | 528df8ecb4e2f9c78b1ae4ab7ff8230644e9b643 (patch) | |
tree | 86cd4522d35c4c8fd3a17db5f4e6b138f8be70df /compiler/simplCore | |
parent | 53ff2cd0c49735e8f709ac8a5ceab68483eb89df (diff) | |
download | haskell-528df8ecb4e2f9c78b1ae4ab7ff8230644e9b643.tar.gz |
Modules: Core operations (#13009)
Diffstat (limited to 'compiler/simplCore')
-rw-r--r-- | compiler/simplCore/CSE.hs | 799 | ||||
-rw-r--r-- | compiler/simplCore/CallArity.hs | 763 | ||||
-rw-r--r-- | compiler/simplCore/CoreMonad.hs | 829 | ||||
-rw-r--r-- | compiler/simplCore/CoreMonad.hs-boot | 30 | ||||
-rw-r--r-- | compiler/simplCore/Exitify.hs | 499 | ||||
-rw-r--r-- | compiler/simplCore/FloatIn.hs | 772 | ||||
-rw-r--r-- | compiler/simplCore/FloatOut.hs | 757 | ||||
-rw-r--r-- | compiler/simplCore/LiberateCase.hs | 442 | ||||
-rw-r--r-- | compiler/simplCore/OccurAnal.hs | 2898 | ||||
-rw-r--r-- | compiler/simplCore/SAT.hs | 433 | ||||
-rw-r--r-- | compiler/simplCore/SetLevels.hs | 1771 | ||||
-rw-r--r-- | compiler/simplCore/SimplCore.hs | 1037 | ||||
-rw-r--r-- | compiler/simplCore/SimplEnv.hs | 938 | ||||
-rw-r--r-- | compiler/simplCore/SimplMonad.hs | 252 | ||||
-rw-r--r-- | compiler/simplCore/SimplUtils.hs | 2324 | ||||
-rw-r--r-- | compiler/simplCore/Simplify.hs | 3666 | ||||
-rw-r--r-- | compiler/simplCore/simplifier.tib | 771 |
17 files changed, 0 insertions, 18981 deletions
diff --git a/compiler/simplCore/CSE.hs b/compiler/simplCore/CSE.hs deleted file mode 100644 index 81cb825e68..0000000000 --- a/compiler/simplCore/CSE.hs +++ /dev/null @@ -1,799 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section{Common subexpression} --} - -{-# LANGUAGE CPP #-} - -{-# OPTIONS_GHC -Wno-incomplete-record-updates #-} -{-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-} - -module CSE (cseProgram, cseOneExpr) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Core.Subst -import Var ( Var ) -import VarEnv ( mkInScopeSet ) -import Id ( Id, idType, idHasRules - , idInlineActivation, setInlineActivation - , zapIdOccInfo, zapIdUsageInfo, idInlinePragma - , isJoinId, isJoinId_maybe ) -import GHC.Core.Utils ( mkAltExpr, eqExpr - , exprIsTickedString - , stripTicksE, stripTicksT, mkTicks ) -import GHC.Core.FVs ( exprFreeVars ) -import GHC.Core.Type ( tyConAppArgs ) -import GHC.Core -import Outputable -import BasicTypes -import GHC.Core.Map -import Util ( filterOut, equalLength, debugIsOn ) -import Data.List ( mapAccumL ) - -{- - Simple common sub-expression - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we see - x1 = C a b - x2 = C x1 b -we build up a reverse mapping: C a b -> x1 - C x1 b -> x2 -and apply that to the rest of the program. - -When we then see - y1 = C a b - y2 = C y1 b -we replace the C a b with x1. But then we *dont* want to -add x1 -> y1 to the mapping. Rather, we want the reverse, y1 -> x1 -so that a subsequent binding - y2 = C y1 b -will get transformed to C x1 b, and then to x2. - -So we carry an extra var->var substitution which we apply *before* looking up in the -reverse mapping. - - -Note [Shadowing] -~~~~~~~~~~~~~~~~ -We have to be careful about shadowing. -For example, consider - f = \x -> let y = x+x in - h = \x -> x+x - in ... - -Here we must *not* do CSE on the inner x+x! The simplifier used to guarantee no -shadowing, but it doesn't any more (it proved too hard), so we clone as we go. -We can simply add clones to the substitution already described. - - -Note [CSE for bindings] -~~~~~~~~~~~~~~~~~~~~~~~ -Let-bindings have two cases, implemented by addBinding. - -* SUBSTITUTE: applies when the RHS is a variable - - let x = y in ...(h x).... - - Here we want to extend the /substitution/ with x -> y, so that the - (h x) in the body might CSE with an enclosing (let v = h y in ...). - NB: the substitution maps InIds, so we extend the substitution with - a binding for the original InId 'x' - - How can we have a variable on the RHS? Doesn't the simplifier inline them? - - - First, the original RHS might have been (g z) which has CSE'd - with an enclosing (let y = g z in ...). This is super-important. - See #5996: - x1 = C a b - x2 = C x1 b - y1 = C a b - y2 = C y1 b - Here we CSE y1's rhs to 'x1', and then we must add (y1->x1) to - the substitution so that we can CSE the binding for y2. - - - Second, we use addBinding for case expression scrutinees too; - see Note [CSE for case expressions] - -* EXTEND THE REVERSE MAPPING: applies in all other cases - - let x = h y in ...(h y)... - - Here we want to extend the /reverse mapping (cs_map)/ so that - we CSE the (h y) call to x. - - Note that we use EXTEND even for a trivial expression, provided it - is not a variable or literal. In particular this /includes/ type - applications. This can be important (#13156); e.g. - case f @ Int of { r1 -> - case f @ Int of { r2 -> ... - Here we want to common-up the two uses of (f @ Int) so we can - remove one of the case expressions. - - See also Note [Corner case for case expressions] for another - reason not to use SUBSTITUTE for all trivial expressions. - -Notice that - - The SUBSTITUTE situation extends the substitution (cs_subst) - - The EXTEND situation extends the reverse mapping (cs_map) - -Notice also that in the SUBSTITUTE case we leave behind a binding - x = y -even though we /also/ carry a substitution x -> y. Can we just drop -the binding instead? Well, not at top level! See SimplUtils -Note [Top level and postInlineUnconditionally]; and in any case CSE -applies only to the /bindings/ of the program, and we leave it to the -simplifier to propate effects to the RULES. Finally, it doesn't seem -worth the effort to discard the nested bindings because the simplifier -will do it next. - -Note [CSE for case expressions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - case scrut_expr of x { ...alts... } -This is very like a strict let-binding - let !x = scrut_expr in ... -So we use (addBinding x scrut_expr) to process scrut_expr and x, and as a -result all the stuff under Note [CSE for bindings] applies directly. - -For example: - -* Trivial scrutinee - f = \x -> case x of wild { - (a:as) -> case a of wild1 { - (p,q) -> ...(wild1:as)... - - Here, (wild1:as) is morally the same as (a:as) and hence equal to - wild. But that's not quite obvious. In the rest of the compiler we - want to keep it as (wild1:as), but for CSE purpose that's a bad - idea. - - By using addBinding we add the binding (wild1 -> a) to the substitution, - which does exactly the right thing. - - (Notice this is exactly backwards to what the simplifier does, which - is to try to replaces uses of 'a' with uses of 'wild1'.) - - This is the main reason that addBinding is called with a trivial rhs. - -* Non-trivial scrutinee - case (f x) of y { pat -> ...let z = f x in ... } - - By using addBinding we'll add (f x :-> y) to the cs_map, and - thereby CSE the inner (f x) to y. - -Note [CSE for INLINE and NOINLINE] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -There are some subtle interactions of CSE with functions that the user -has marked as INLINE or NOINLINE. (Examples from Roman Leshchinskiy.) -Consider - - yes :: Int {-# NOINLINE yes #-} - yes = undefined - - no :: Int {-# NOINLINE no #-} - no = undefined - - foo :: Int -> Int -> Int {-# NOINLINE foo #-} - foo m n = n - - {-# RULES "foo/no" foo no = id #-} - - bar :: Int -> Int - bar = foo yes - -We do not expect the rule to fire. But if we do CSE, then we risk -getting yes=no, and the rule does fire. Actually, it won't because -NOINLINE means that 'yes' will never be inlined, not even if we have -yes=no. So that's fine (now; perhaps in the olden days, yes=no would -have substituted even if 'yes' was NOINLINE). - -But we do need to take care. Consider - - {-# NOINLINE bar #-} - bar = <rhs> -- Same rhs as foo - - foo = <rhs> - -If CSE produces - foo = bar -then foo will never be inlined to <rhs> (when it should be, if <rhs> -is small). The conclusion here is this: - - We should not add - <rhs> :-> bar - to the CSEnv if 'bar' has any constraints on when it can inline; - that is, if its 'activation' not always active. Otherwise we - might replace <rhs> by 'bar', and then later be unable to see that it - really was <rhs>. - -An except to the rule is when the INLINE pragma is not from the user, e.g. from -WorkWrap (see Note [Wrapper activation]). We can tell because noUserInlineSpec -is then true. - -Note that we do not (currently) do CSE on the unfolding stored inside -an Id, even if it is a 'stable' unfolding. That means that when an -unfolding happens, it is always faithful to what the stable unfolding -originally was. - -Note [CSE for stable unfoldings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - {-# Unf = Stable (\pq. build blah) #-} - foo = x - -Here 'foo' has a stable unfolding, but its (optimised) RHS is trivial. -(Turns out that this actually happens for the enumFromTo method of -the Integer instance of Enum in GHC.Enum.) Suppose moreover that foo's -stable unfolding originates from an INLINE or INLINEABLE pragma on foo. -Then we obviously do NOT want to extend the substitution with (foo->x), -because we promised to inline foo as what the user wrote. See similar -SimplUtils Note [Stable unfoldings and postInlineUnconditionally]. - -Nor do we want to change the reverse mapping. Suppose we have - - {-# Unf = Stable (\pq. build blah) #-} - foo = <expr> - bar = <expr> - -There could conceivably be merit in rewriting the RHS of bar: - bar = foo -but now bar's inlining behaviour will change, and importing -modules might see that. So it seems dodgy and we don't do it. - -Stable unfoldings are also created during worker/wrapper when we decide -that a function's definition is so small that it should always inline. -In this case we still want to do CSE (#13340). Hence the use of -isAnyInlinePragma rather than isStableUnfolding. - -Note [Corner case for case expressions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Here is another reason that we do not use SUBSTITUTE for -all trivial expressions. Consider - case x |> co of (y::Array# Int) { ... } - -We do not want to extend the substitution with (y -> x |> co); since y -is of unlifted type, this would destroy the let/app invariant if (x |> -co) was not ok-for-speculation. - -But surely (x |> co) is ok-for-speculation, because it's a trivial -expression, and x's type is also unlifted, presumably. Well, maybe -not if you are using unsafe casts. I actually found a case where we -had - (x :: HValue) |> (UnsafeCo :: HValue ~ Array# Int) - -Note [CSE for join points?] -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We must not be naive about join points in CSE: - join j = e in - if b then jump j else 1 + e -The expression (1 + jump j) is not good (see Note [Invariants on join points] in -GHC.Core). This seems to come up quite seldom, but it happens (first seen -compiling ppHtml in Haddock.Backends.Xhtml). - -We could try and be careful by tracking which join points are still valid at -each subexpression, but since join points aren't allocated or shared, there's -less to gain by trying to CSE them. (#13219) - -Note [Look inside join-point binders] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Another way how CSE for join points is tricky is - - let join foo x = (x, 42) - join bar x = (x, 42) - in … jump foo 1 … jump bar 2 … - -naively, CSE would turn this into - - let join foo x = (x, 42) - join bar = foo - in … jump foo 1 … jump bar 2 … - -but now bar is a join point that claims arity one, but its right-hand side -is not a lambda, breaking the join-point invariant (this was #15002). - -So `cse_bind` must zoom past the lambdas of a join point (using -`collectNBinders`) and resume searching for CSE opportunities only in -the body of the join point. - -Note [CSE for recursive bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - f = \x ... f.... - g = \y ... g ... -where the "..." are identical. Could we CSE them? In full generality -with mutual recursion it's quite hard; but for self-recursive bindings -(which are very common) it's rather easy: - -* Maintain a separate cs_rec_map, that maps - (\f. (\x. ...f...) ) -> f - Note the \f in the domain of the mapping! - -* When we come across the binding for 'g', look up (\g. (\y. ...g...)) - Bingo we get a hit. So we can replace the 'g' binding with - g = f - -We can't use cs_map for this, because the key isn't an expression of -the program; it's a kind of synthetic key for recursive bindings. - - -************************************************************************ -* * -\section{Common subexpression} -* * -************************************************************************ --} - -cseProgram :: CoreProgram -> CoreProgram -cseProgram binds = snd (mapAccumL (cseBind TopLevel) emptyCSEnv binds) - -cseBind :: TopLevelFlag -> CSEnv -> CoreBind -> (CSEnv, CoreBind) -cseBind toplevel env (NonRec b e) - = (env2, NonRec b2 e2) - where - (env1, b1) = addBinder env b - (env2, (b2, e2)) = cse_bind toplevel env1 (b,e) b1 - -cseBind toplevel env (Rec [(in_id, rhs)]) - | noCSE in_id - = (env1, Rec [(out_id, rhs')]) - - -- See Note [CSE for recursive bindings] - | Just previous <- lookupCSRecEnv env out_id rhs'' - , let previous' = mkTicks ticks previous - out_id' = delayInlining toplevel out_id - = -- We have a hit in the recursive-binding cache - (extendCSSubst env1 in_id previous', NonRec out_id' previous') - - | otherwise - = (extendCSRecEnv env1 out_id rhs'' id_expr', Rec [(zapped_id, rhs')]) - - where - (env1, [out_id]) = addRecBinders env [in_id] - rhs' = cseExpr env1 rhs - rhs'' = stripTicksE tickishFloatable rhs' - ticks = stripTicksT tickishFloatable rhs' - id_expr' = varToCoreExpr out_id - zapped_id = zapIdUsageInfo out_id - -cseBind toplevel env (Rec pairs) - = (env2, Rec pairs') - where - (env1, bndrs1) = addRecBinders env (map fst pairs) - (env2, pairs') = mapAccumL do_one env1 (zip pairs bndrs1) - - do_one env (pr, b1) = cse_bind toplevel env pr b1 - --- | Given a binding of @in_id@ to @in_rhs@, and a fresh name to refer --- to @in_id@ (@out_id@, created from addBinder or addRecBinders), --- first try to CSE @in_rhs@, and then add the resulting (possibly CSE'd) --- binding to the 'CSEnv', so that we attempt to CSE any expressions --- which are equal to @out_rhs@. -cse_bind :: TopLevelFlag -> CSEnv -> (InId, InExpr) -> OutId -> (CSEnv, (OutId, OutExpr)) -cse_bind toplevel env (in_id, in_rhs) out_id - | isTopLevel toplevel, exprIsTickedString in_rhs - -- See Note [Take care with literal strings] - = (env', (out_id', in_rhs)) - - | Just arity <- isJoinId_maybe in_id - -- See Note [Look inside join-point binders] - = let (params, in_body) = collectNBinders arity in_rhs - (env', params') = addBinders env params - out_body = tryForCSE env' in_body - in (env, (out_id, mkLams params' out_body)) - - | otherwise - = (env', (out_id'', out_rhs)) - where - (env', out_id') = addBinding env in_id out_id out_rhs - (cse_done, out_rhs) = try_for_cse env in_rhs - out_id'' | cse_done = delayInlining toplevel out_id' - | otherwise = out_id' - -delayInlining :: TopLevelFlag -> Id -> Id --- Add a NOINLINE[2] if the Id doesn't have an INLNE pragma already --- See Note [Delay inlining after CSE] -delayInlining top_lvl bndr - | isTopLevel top_lvl - , isAlwaysActive (idInlineActivation bndr) - , idHasRules bndr -- Only if the Id has some RULES, - -- which might otherwise get lost - -- These rules are probably auto-generated specialisations, - -- since Ids with manual rules usually have manually-inserted - -- delayed inlining anyway - = bndr `setInlineActivation` activeAfterInitial - | otherwise - = bndr - -addBinding :: CSEnv -- Includes InId->OutId cloning - -> InVar -- Could be a let-bound type - -> OutId -> OutExpr -- Processed binding - -> (CSEnv, OutId) -- Final env, final bndr --- Extend the CSE env with a mapping [rhs -> out-id] --- unless we can instead just substitute [in-id -> rhs] --- --- It's possible for the binder to be a type variable (see --- Note [Type-let] in GHC.Core), in which case we can just substitute. -addBinding env in_id out_id rhs' - | not (isId in_id) = (extendCSSubst env in_id rhs', out_id) - | noCSE in_id = (env, out_id) - | use_subst = (extendCSSubst env in_id rhs', out_id) - | otherwise = (extendCSEnv env rhs' id_expr', zapped_id) - where - id_expr' = varToCoreExpr out_id - zapped_id = zapIdUsageInfo out_id - -- Putting the Id into the cs_map makes it possible that - -- it'll become shared more than it is now, which would - -- invalidate (the usage part of) its demand info. - -- This caused #100218. - -- Easiest thing is to zap the usage info; subsequently - -- performing late demand-analysis will restore it. Don't zap - -- the strictness info; it's not necessary to do so, and losing - -- it is bad for performance if you don't do late demand - -- analysis - - -- Should we use SUBSTITUTE or EXTEND? - -- See Note [CSE for bindings] - use_subst = case rhs' of - Var {} -> True - _ -> False - --- | Given a binder `let x = e`, this function --- determines whether we should add `e -> x` to the cs_map -noCSE :: InId -> Bool -noCSE id = not (isAlwaysActive (idInlineActivation id)) && - not (noUserInlineSpec (inlinePragmaSpec (idInlinePragma id))) - -- See Note [CSE for INLINE and NOINLINE] - || isAnyInlinePragma (idInlinePragma id) - -- See Note [CSE for stable unfoldings] - || isJoinId id - -- See Note [CSE for join points?] - - -{- Note [Take care with literal strings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this example: - - x = "foo"# - y = "foo"# - ...x...y...x...y.... - -We would normally turn this into: - - x = "foo"# - y = x - ...x...x...x...x.... - -But this breaks an invariant of Core, namely that the RHS of a top-level binding -of type Addr# must be a string literal, not another variable. See Note -[Core top-level string literals] in GHC.Core. - -For this reason, we special case top-level bindings to literal strings and leave -the original RHS unmodified. This produces: - - x = "foo"# - y = "foo"# - ...x...x...x...x.... - -Now 'y' will be discarded as dead code, and we are done. - -The net effect is that for the y-binding we want to - - Use SUBSTITUTE, by extending the substitution with y :-> x - - but leave the original binding for y undisturbed - -This is done by cse_bind. I got it wrong the first time (#13367). - -Note [Delay inlining after CSE] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Suppose (#15445) we have - f,g :: Num a => a -> a - f x = ...f (x-1)..... - g y = ...g (y-1) .... - -and we make some specialisations of 'g', either automatically, or via -a SPECIALISE pragma. Then CSE kicks in and notices that the RHSs of -'f' and 'g' are identical, so we get - f x = ...f (x-1)... - g = f - {-# RULES g @Int _ = $sg #-} - -Now there is terrible danger that, in an importing module, we'll inline -'g' before we have a chance to run its specialisation! - -Solution: during CSE, after a "hit" in the CSE cache - * when adding a binding - g = f - * for a top-level function g - * and g has specialisation RULES -add a NOINLINE[2] activation to it, to ensure it's not inlined -right away. - -Notes: -* Why top level only? Because for nested bindings we are already past - phase 2 and will never return there. - -* Why "only if g has RULES"? Because there is no point in - doing this if there are no RULES; and other things being - equal it delays optimisation to delay inlining (#17409) - - ----- Historical note --- - -This patch is simpler and more direct than an earlier -version: - - commit 2110738b280543698407924a16ac92b6d804dc36 - Author: Simon Peyton Jones <simonpj@microsoft.com> - Date: Mon Jul 30 13:43:56 2018 +0100 - - Don't inline functions with RULES too early - -We had to revert this patch because it made GHC itself slower. - -Why? It delayed inlining of /all/ functions with RULES, and that was -very bad in TcFlatten.flatten_ty_con_app - -* It delayed inlining of liftM -* That delayed the unravelling of the recursion in some dictionary - bindings. -* That delayed some eta expansion, leaving - flatten_ty_con_app = \x y. let <stuff> in \z. blah -* That allowed the float-out pass to put sguff between - the \y and \z. -* And that permanently stopped eta expansion of the function, - even once <stuff> was simplified. - --} - -tryForCSE :: CSEnv -> InExpr -> OutExpr -tryForCSE env expr = snd (try_for_cse env expr) - -try_for_cse :: CSEnv -> InExpr -> (Bool, OutExpr) --- (False, e') => We did not CSE the entire expression, --- but we might have CSE'd some sub-expressions, --- yielding e' --- --- (True, te') => We CSE'd the entire expression, --- yielding the trivial expression te' -try_for_cse env expr - | Just e <- lookupCSEnv env expr'' = (True, mkTicks ticks e) - | otherwise = (False, expr') - -- The varToCoreExpr is needed if we have - -- case e of xco { ...case e of yco { ... } ... } - -- Then CSE will substitute yco -> xco; - -- but these are /coercion/ variables - where - expr' = cseExpr env expr - expr'' = stripTicksE tickishFloatable expr' - ticks = stripTicksT tickishFloatable expr' - -- We don't want to lose the source notes when a common sub - -- expression gets eliminated. Hence we push all (!) of them on - -- top of the replaced sub-expression. This is probably not too - -- useful in practice, but upholds our semantics. - --- | Runs CSE on a single expression. --- --- This entry point is not used in the compiler itself, but is provided --- as a convenient entry point for users of the GHC API. -cseOneExpr :: InExpr -> OutExpr -cseOneExpr e = cseExpr env e - where env = emptyCSEnv {cs_subst = mkEmptySubst (mkInScopeSet (exprFreeVars e)) } - -cseExpr :: CSEnv -> InExpr -> OutExpr -cseExpr env (Type t) = Type (substTy (csEnvSubst env) t) -cseExpr env (Coercion c) = Coercion (substCo (csEnvSubst env) c) -cseExpr _ (Lit lit) = Lit lit -cseExpr env (Var v) = lookupSubst env v -cseExpr env (App f a) = App (cseExpr env f) (tryForCSE env a) -cseExpr env (Tick t e) = Tick t (cseExpr env e) -cseExpr env (Cast e co) = Cast (tryForCSE env e) (substCo (csEnvSubst env) co) -cseExpr env (Lam b e) = let (env', b') = addBinder env b - in Lam b' (cseExpr env' e) -cseExpr env (Let bind e) = let (env', bind') = cseBind NotTopLevel env bind - in Let bind' (cseExpr env' e) -cseExpr env (Case e bndr ty alts) = cseCase env e bndr ty alts - -cseCase :: CSEnv -> InExpr -> InId -> InType -> [InAlt] -> OutExpr -cseCase env scrut bndr ty alts - = Case scrut1 bndr3 ty' $ - combineAlts alt_env (map cse_alt alts) - where - ty' = substTy (csEnvSubst env) ty - scrut1 = tryForCSE env scrut - - bndr1 = zapIdOccInfo bndr - -- Zapping the OccInfo is needed because the extendCSEnv - -- in cse_alt may mean that a dead case binder - -- becomes alive, and Lint rejects that - (env1, bndr2) = addBinder env bndr1 - (alt_env, bndr3) = addBinding env1 bndr bndr2 scrut1 - -- addBinding: see Note [CSE for case expressions] - - con_target :: OutExpr - con_target = lookupSubst alt_env bndr - - arg_tys :: [OutType] - arg_tys = tyConAppArgs (idType bndr3) - - -- See Note [CSE for case alternatives] - cse_alt (DataAlt con, args, rhs) - = (DataAlt con, args', tryForCSE new_env rhs) - where - (env', args') = addBinders alt_env args - new_env = extendCSEnv env' con_expr con_target - con_expr = mkAltExpr (DataAlt con) args' arg_tys - - cse_alt (con, args, rhs) - = (con, args', tryForCSE env' rhs) - where - (env', args') = addBinders alt_env args - -combineAlts :: CSEnv -> [OutAlt] -> [OutAlt] --- See Note [Combine case alternatives] -combineAlts env alts - | (Just alt1, rest_alts) <- find_bndr_free_alt alts - , (_,bndrs1,rhs1) <- alt1 - , let filtered_alts = filterOut (identical_alt rhs1) rest_alts - , not (equalLength rest_alts filtered_alts) - = ASSERT2( null bndrs1, ppr alts ) - (DEFAULT, [], rhs1) : filtered_alts - - | otherwise - = alts - where - in_scope = substInScope (csEnvSubst env) - - find_bndr_free_alt :: [CoreAlt] -> (Maybe CoreAlt, [CoreAlt]) - -- The (Just alt) is a binder-free alt - -- See Note [Combine case alts: awkward corner] - find_bndr_free_alt [] - = (Nothing, []) - find_bndr_free_alt (alt@(_,bndrs,_) : alts) - | null bndrs = (Just alt, alts) - | otherwise = case find_bndr_free_alt alts of - (mb_bf, alts) -> (mb_bf, alt:alts) - - identical_alt rhs1 (_,_,rhs) = eqExpr in_scope rhs1 rhs - -- Even if this alt has binders, they will have been cloned - -- If any of these binders are mentioned in 'rhs', then - -- 'rhs' won't compare equal to 'rhs1' (which is from an - -- alt with no binders). - -{- Note [CSE for case alternatives] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider case e of x - K1 y -> ....(K1 y)... - K2 -> ....K2.... - -We definitely want to CSE that (K1 y) into just x. - -But what about the lone K2? At first you would think "no" because -turning K2 into 'x' increases the number of live variables. But - -* Turning K2 into x increases the chance of combining identical alts. - Example case xs of - (_:_) -> f xs - [] -> f [] - See #17901 and simplCore/should_compile/T17901 for more examples - of this kind. - -* The next run of the simplifier will turn 'x' back into K2, so we won't - permanently bloat the free-var count. - - -Note [Combine case alternatives] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -combineAlts is just a more heavyweight version of the use of -combineIdenticalAlts in SimplUtils.prepareAlts. The basic idea is -to transform - - DEFAULT -> e1 - K x -> e1 - W y z -> e2 -===> - DEFAULT -> e1 - W y z -> e2 - -In the simplifier we use cheapEqExpr, because it is called a lot. -But here in CSE we use the full eqExpr. After all, two alternatives usually -differ near the root, so it probably isn't expensive to compare the full -alternative. It seems like the same kind of thing that CSE is supposed -to be doing, which is why I put it here. - -I actually saw some examples in the wild, where some inlining made e1 too -big for cheapEqExpr to catch it. - -Note [Combine case alts: awkward corner] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We would really like to check isDeadBinder on the binders in the -alternative. But alas, the simplifer zaps occ-info on binders in case -alternatives; see Note [Case alternative occ info] in Simplify. - -* One alternative (perhaps a good one) would be to do OccAnal - just before CSE. Then perhaps we could get rid of combineIdenticalAlts - in the Simplifier, which might save work. - -* Another would be for CSE to return free vars as it goes. - -* But the current solution is to find a nullary alternative (including - the DEFAULT alt, if any). This will not catch - case x of - A y -> blah - B z p -> blah - where no alternative is nullary or DEFAULT. But the current - solution is at least cheap. - - -************************************************************************ -* * -\section{The CSE envt} -* * -************************************************************************ --} - -data CSEnv - = CS { cs_subst :: Subst -- Maps InBndrs to OutExprs - -- The substitution variables to - -- /trivial/ OutExprs, not arbitrary expressions - - , cs_map :: CoreMap OutExpr -- The reverse mapping - -- Maps a OutExpr to a /trivial/ OutExpr - -- The key of cs_map is stripped of all Ticks - - , cs_rec_map :: CoreMap OutExpr - -- See Note [CSE for recursive bindings] - } - -emptyCSEnv :: CSEnv -emptyCSEnv = CS { cs_map = emptyCoreMap, cs_rec_map = emptyCoreMap - , cs_subst = emptySubst } - -lookupCSEnv :: CSEnv -> OutExpr -> Maybe OutExpr -lookupCSEnv (CS { cs_map = csmap }) expr - = lookupCoreMap csmap expr - -extendCSEnv :: CSEnv -> OutExpr -> OutExpr -> CSEnv -extendCSEnv cse expr triv_expr - = cse { cs_map = extendCoreMap (cs_map cse) sexpr triv_expr } - where - sexpr = stripTicksE tickishFloatable expr - -extendCSRecEnv :: CSEnv -> OutId -> OutExpr -> OutExpr -> CSEnv --- See Note [CSE for recursive bindings] -extendCSRecEnv cse bndr expr triv_expr - = cse { cs_rec_map = extendCoreMap (cs_rec_map cse) (Lam bndr expr) triv_expr } - -lookupCSRecEnv :: CSEnv -> OutId -> OutExpr -> Maybe OutExpr --- See Note [CSE for recursive bindings] -lookupCSRecEnv (CS { cs_rec_map = csmap }) bndr expr - = lookupCoreMap csmap (Lam bndr expr) - -csEnvSubst :: CSEnv -> Subst -csEnvSubst = cs_subst - -lookupSubst :: CSEnv -> Id -> OutExpr -lookupSubst (CS { cs_subst = sub}) x = lookupIdSubst (text "CSE.lookupSubst") sub x - -extendCSSubst :: CSEnv -> Id -> CoreExpr -> CSEnv -extendCSSubst cse x rhs = cse { cs_subst = extendSubst (cs_subst cse) x rhs } - --- | Add clones to the substitution to deal with shadowing. See --- Note [Shadowing] for more details. You should call this whenever --- you go under a binder. -addBinder :: CSEnv -> Var -> (CSEnv, Var) -addBinder cse v = (cse { cs_subst = sub' }, v') - where - (sub', v') = substBndr (cs_subst cse) v - -addBinders :: CSEnv -> [Var] -> (CSEnv, [Var]) -addBinders cse vs = (cse { cs_subst = sub' }, vs') - where - (sub', vs') = substBndrs (cs_subst cse) vs - -addRecBinders :: CSEnv -> [Id] -> (CSEnv, [Id]) -addRecBinders cse vs = (cse { cs_subst = sub' }, vs') - where - (sub', vs') = substRecBndrs (cs_subst cse) vs diff --git a/compiler/simplCore/CallArity.hs b/compiler/simplCore/CallArity.hs deleted file mode 100644 index 84d62e4ad9..0000000000 --- a/compiler/simplCore/CallArity.hs +++ /dev/null @@ -1,763 +0,0 @@ --- --- Copyright (c) 2014 Joachim Breitner --- - -module CallArity - ( callArityAnalProgram - , callArityRHS -- for testing - ) where - -import GhcPrelude - -import VarSet -import VarEnv -import GHC.Driver.Session ( DynFlags ) - -import BasicTypes -import GHC.Core -import Id -import GHC.Core.Arity ( typeArity ) -import GHC.Core.Utils ( exprIsCheap, exprIsTrivial ) -import UnVarGraph -import Demand -import Util - -import Control.Arrow ( first, second ) - - -{- -%************************************************************************ -%* * - Call Arity Analysis -%* * -%************************************************************************ - -Note [Call Arity: The goal] -~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The goal of this analysis is to find out if we can eta-expand a local function, -based on how it is being called. The motivating example is this code, -which comes up when we implement foldl using foldr, and do list fusion: - - let go = \x -> let d = case ... of - False -> go (x+1) - True -> id - in \z -> d (x + z) - in go 1 0 - -If we do not eta-expand `go` to have arity 2, we are going to allocate a lot of -partial function applications, which would be bad. - -The function `go` has a type of arity two, but only one lambda is manifest. -Furthermore, an analysis that only looks at the RHS of go cannot be sufficient -to eta-expand go: If `go` is ever called with one argument (and the result used -multiple times), we would be doing the work in `...` multiple times. - -So `callArityAnalProgram` looks at the whole let expression to figure out if -all calls are nice, i.e. have a high enough arity. It then stores the result in -the `calledArity` field of the `IdInfo` of `go`, which the next simplifier -phase will eta-expand. - -The specification of the `calledArity` field is: - - No work will be lost if you eta-expand me to the arity in `calledArity`. - -What we want to know for a variable ------------------------------------ - -For every let-bound variable we'd like to know: - 1. A lower bound on the arity of all calls to the variable, and - 2. whether the variable is being called at most once or possible multiple - times. - -It is always ok to lower the arity, or pretend that there are multiple calls. -In particular, "Minimum arity 0 and possible called multiple times" is always -correct. - - -What we want to know from an expression ---------------------------------------- - -In order to obtain that information for variables, we analyze expression and -obtain bits of information: - - I. The arity analysis: - For every variable, whether it is absent, or called, - and if called, which what arity. - - II. The Co-Called analysis: - For every two variables, whether there is a possibility that both are being - called. - We obtain as a special case: For every variables, whether there is a - possibility that it is being called twice. - -For efficiency reasons, we gather this information only for a set of -*interesting variables*, to avoid spending time on, e.g., variables from pattern matches. - -The two analysis are not completely independent, as a higher arity can improve -the information about what variables are being called once or multiple times. - -Note [Analysis I: The arity analysis] ------------------------------------- - -The arity analysis is quite straight forward: The information about an -expression is an - VarEnv Arity -where absent variables are bound to Nothing and otherwise to a lower bound to -their arity. - -When we analyze an expression, we analyze it with a given context arity. -Lambdas decrease and applications increase the incoming arity. Analysizing a -variable will put that arity in the environment. In lets or cases all the -results from the various subexpressions are lubed, which takes the point-wise -minimum (considering Nothing an infinity). - - -Note [Analysis II: The Co-Called analysis] ------------------------------------------- - -The second part is more sophisticated. For reasons explained below, it is not -sufficient to simply know how often an expression evaluates a variable. Instead -we need to know which variables are possibly called together. - -The data structure here is an undirected graph of variables, which is provided -by the abstract - UnVarGraph - -It is safe to return a larger graph, i.e. one with more edges. The worst case -(i.e. the least useful and always correct result) is the complete graph on all -free variables, which means that anything can be called together with anything -(including itself). - -Notation for the following: -C(e) is the co-called result for e. -G₁∪G₂ is the union of two graphs -fv is the set of free variables (conveniently the domain of the arity analysis result) -S₁×S₂ is the complete bipartite graph { {a,b} | a ∈ S₁, b ∈ S₂ } -S² is the complete graph on the set of variables S, S² = S×S -C'(e) is a variant for bound expression: - If e is called at most once, or it is and stays a thunk (after the analysis), - it is simply C(e). Otherwise, the expression can be called multiple times - and we return (fv e)² - -The interesting cases of the analysis: - * Var v: - No other variables are being called. - Return {} (the empty graph) - * Lambda v e, under arity 0: - This means that e can be evaluated many times and we cannot get - any useful co-call information. - Return (fv e)² - * Case alternatives alt₁,alt₂,...: - Only one can be execuded, so - Return (alt₁ ∪ alt₂ ∪...) - * App e₁ e₂ (and analogously Case scrut alts), with non-trivial e₂: - We get the results from both sides, with the argument evaluated at most once. - Additionally, anything called by e₁ can possibly be called with anything - from e₂. - Return: C(e₁) ∪ C(e₂) ∪ (fv e₁) × (fv e₂) - * App e₁ x: - As this is already in A-normal form, CorePrep will not separately lambda - bind (and hence share) x. So we conservatively assume multiple calls to x here - Return: C(e₁) ∪ (fv e₁) × {x} ∪ {(x,x)} - * Let v = rhs in body: - In addition to the results from the subexpressions, add all co-calls from - everything that the body calls together with v to everything that is called - by v. - Return: C'(rhs) ∪ C(body) ∪ (fv rhs) × {v'| {v,v'} ∈ C(body)} - * Letrec v₁ = rhs₁ ... vₙ = rhsₙ in body - Tricky. - We assume that it is really mutually recursive, i.e. that every variable - calls one of the others, and that this is strongly connected (otherwise we - return an over-approximation, so that's ok), see note [Recursion and fixpointing]. - - Let V = {v₁,...vₙ}. - Assume that the vs have been analysed with an incoming demand and - cardinality consistent with the final result (this is the fixed-pointing). - Again we can use the results from all subexpressions. - In addition, for every variable vᵢ, we need to find out what it is called - with (call this set Sᵢ). There are two cases: - * If vᵢ is a function, we need to go through all right-hand-sides and bodies, - and collect every variable that is called together with any variable from V: - Sᵢ = {v' | j ∈ {1,...,n}, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) } - * If vᵢ is a thunk, then its rhs is evaluated only once, so we need to - exclude it from this set: - Sᵢ = {v' | j ∈ {1,...,n}, j≠i, {v',vⱼ} ∈ C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ C(body) } - Finally, combine all this: - Return: C(body) ∪ - C'(rhs₁) ∪ ... ∪ C'(rhsₙ) ∪ - (fv rhs₁) × S₁) ∪ ... ∪ (fv rhsₙ) × Sₙ) - -Using the result: Eta-Expansion -------------------------------- - -We use the result of these two analyses to decide whether we can eta-expand the -rhs of a let-bound variable. - -If the variable is already a function (exprIsCheap), and all calls to the -variables have a higher arity than the current manifest arity (i.e. the number -of lambdas), expand. - -If the variable is a thunk we must be careful: Eta-Expansion will prevent -sharing of work, so this is only safe if there is at most one call to the -function. Therefore, we check whether {v,v} ∈ G. - - Example: - - let n = case .. of .. -- A thunk! - in n 0 + n 1 - - vs. - - let n = case .. of .. - in case .. of T -> n 0 - F -> n 1 - - We are only allowed to eta-expand `n` if it is going to be called at most - once in the body of the outer let. So we need to know, for each variable - individually, that it is going to be called at most once. - - -Why the co-call graph? ----------------------- - -Why is it not sufficient to simply remember which variables are called once and -which are called multiple times? It would be in the previous example, but consider - - let n = case .. of .. - in case .. of - True -> let go = \y -> case .. of - True -> go (y + n 1) - False > n - in go 1 - False -> n - -vs. - - let n = case .. of .. - in case .. of - True -> let go = \y -> case .. of - True -> go (y+1) - False > n - in go 1 - False -> n - -In both cases, the body and the rhs of the inner let call n at most once. -But only in the second case that holds for the whole expression! The -crucial difference is that in the first case, the rhs of `go` can call -*both* `go` and `n`, and hence can call `n` multiple times as it recurses, -while in the second case find out that `go` and `n` are not called together. - - -Why co-call information for functions? --------------------------------------- - -Although for eta-expansion we need the information only for thunks, we still -need to know whether functions are being called once or multiple times, and -together with what other functions. - - Example: - - let n = case .. of .. - f x = n (x+1) - in f 1 + f 2 - - vs. - - let n = case .. of .. - f x = n (x+1) - in case .. of T -> f 0 - F -> f 1 - - Here, the body of f calls n exactly once, but f itself is being called - multiple times, so eta-expansion is not allowed. - - -Note [Analysis type signature] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The work-hourse of the analysis is the function `callArityAnal`, with the -following type: - - type CallArityRes = (UnVarGraph, VarEnv Arity) - callArityAnal :: - Arity -> -- The arity this expression is called with - VarSet -> -- The set of interesting variables - CoreExpr -> -- The expression to analyse - (CallArityRes, CoreExpr) - -and the following specification: - - ((coCalls, callArityEnv), expr') = callArityEnv arity interestingIds expr - - <=> - - Assume the expression `expr` is being passed `arity` arguments. Then it holds that - * The domain of `callArityEnv` is a subset of `interestingIds`. - * Any variable from `interestingIds` that is not mentioned in the `callArityEnv` - is absent, i.e. not called at all. - * Every call from `expr` to a variable bound to n in `callArityEnv` has at - least n value arguments. - * For two interesting variables `v1` and `v2`, they are not adjacent in `coCalls`, - then in no execution of `expr` both are being called. - Furthermore, expr' is expr with the callArity field of the `IdInfo` updated. - - -Note [Which variables are interesting] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The analysis would quickly become prohibitive expensive if we would analyse all -variables; for most variables we simply do not care about how often they are -called, i.e. variables bound in a pattern match. So interesting are variables that are - * top-level or let bound - * and possibly functions (typeArity > 0) - -Note [Taking boring variables into account] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -If we decide that the variable bound in `let x = e1 in e2` is not interesting, -the analysis of `e2` will not report anything about `x`. To ensure that -`callArityBind` does still do the right thing we have to take that into account -every time we would be lookup up `x` in the analysis result of `e2`. - * Instead of calling lookupCallArityRes, we return (0, True), indicating - that this variable might be called many times with no arguments. - * Instead of checking `calledWith x`, we assume that everything can be called - with it. - * In the recursive case, when calclulating the `cross_calls`, if there is - any boring variable in the recursive group, we ignore all co-call-results - and directly go to a very conservative assumption. - -The last point has the nice side effect that the relatively expensive -integration of co-call results in a recursive groups is often skipped. This -helped to avoid the compile time blowup in some real-world code with large -recursive groups (#10293). - -Note [Recursion and fixpointing] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -For a mutually recursive let, we begin by - 1. analysing the body, using the same incoming arity as for the whole expression. - 2. Then we iterate, memoizing for each of the bound variables the last - analysis call, i.e. incoming arity, whether it is called once, and the CallArityRes. - 3. We combine the analysis result from the body and the memoized results for - the arguments (if already present). - 4. For each variable, we find out the incoming arity and whether it is called - once, based on the current analysis result. If this differs from the - memoized results, we re-analyse the rhs and update the memoized table. - 5. If nothing had to be reanalyzed, we are done. - Otherwise, repeat from step 3. - - -Note [Thunks in recursive groups] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We never eta-expand a thunk in a recursive group, on the grounds that if it is -part of a recursive group, then it will be called multiple times. - -This is not necessarily true, e.g. it would be safe to eta-expand t2 (but not -t1) in the following code: - - let go x = t1 - t1 = if ... then t2 else ... - t2 = if ... then go 1 else ... - in go 0 - -Detecting this would require finding out what variables are only ever called -from thunks. While this is certainly possible, we yet have to see this to be -relevant in the wild. - - -Note [Analysing top-level binds] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We can eta-expand top-level-binds if they are not exported, as we see all calls -to them. The plan is as follows: Treat the top-level binds as nested lets around -a body representing “all external calls”, which returns a pessimistic -CallArityRes (the co-call graph is the complete graph, all arityies 0). - -Note [Trimming arity] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -In the Call Arity papers, we are working on an untyped lambda calculus with no -other id annotations, where eta-expansion is always possible. But this is not -the case for Core! - 1. We need to ensure the invariant - callArity e <= typeArity (exprType e) - for the same reasons that exprArity needs this invariant (see Note - [exprArity invariant] in GHC.Core.Arity). - - If we are not doing that, a too-high arity annotation will be stored with - the id, confusing the simplifier later on. - - 2. Eta-expanding a right hand side might invalidate existing annotations. In - particular, if an id has a strictness annotation of <...><...>b, then - passing two arguments to it will definitely bottom out, so the simplifier - will throw away additional parameters. This conflicts with Call Arity! So - we ensure that we never eta-expand such a value beyond the number of - arguments mentioned in the strictness signature. - See #10176 for a real-world-example. - -Note [What is a thunk] -~~~~~~~~~~~~~~~~~~~~~~ - -Originally, everything that is not in WHNF (`exprIsWHNF`) is considered a -thunk, not eta-expanded, to avoid losing any sharing. This is also how the -published papers on Call Arity describe it. - -In practice, there are thunks that do a just little work, such as -pattern-matching on a variable, and the benefits of eta-expansion likely -outweigh the cost of doing that repeatedly. Therefore, this implementation of -Call Arity considers everything that is not cheap (`exprIsCheap`) as a thunk. - -Note [Call Arity and Join Points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -The Call Arity analysis does not care about join points, and treats them just -like normal functions. This is ok. - -The analysis *could* make use of the fact that join points are always evaluated -in the same context as the join-binding they are defined in and are always -one-shot, and handle join points separately, as suggested in -https://gitlab.haskell.org/ghc/ghc/issues/13479#note_134870. -This *might* be more efficient (for example, join points would not have to be -considered interesting variables), but it would also add redundant code. So for -now we do not do that. - -The simplifier never eta-expands join points (it instead pushes extra arguments from -an eta-expanded context into the join point’s RHS), so the call arity -annotation on join points is not actually used. As it would be equally valid -(though less efficient) to eta-expand join points, this is the simplifier's -choice, and hence Call Arity sets the call arity for join points as well. --} - --- Main entry point - -callArityAnalProgram :: DynFlags -> CoreProgram -> CoreProgram -callArityAnalProgram _dflags binds = binds' - where - (_, binds') = callArityTopLvl [] emptyVarSet binds - --- See Note [Analysing top-level-binds] -callArityTopLvl :: [Var] -> VarSet -> [CoreBind] -> (CallArityRes, [CoreBind]) -callArityTopLvl exported _ [] - = ( calledMultipleTimes $ (emptyUnVarGraph, mkVarEnv $ [(v, 0) | v <- exported]) - , [] ) -callArityTopLvl exported int1 (b:bs) - = (ae2, b':bs') - where - int2 = bindersOf b - exported' = filter isExportedId int2 ++ exported - int' = int1 `addInterestingBinds` b - (ae1, bs') = callArityTopLvl exported' int' bs - (ae2, b') = callArityBind (boringBinds b) ae1 int1 b - - -callArityRHS :: CoreExpr -> CoreExpr -callArityRHS = snd . callArityAnal 0 emptyVarSet - --- The main analysis function. See Note [Analysis type signature] -callArityAnal :: - Arity -> -- The arity this expression is called with - VarSet -> -- The set of interesting variables - CoreExpr -> -- The expression to analyse - (CallArityRes, CoreExpr) - -- How this expression uses its interesting variables - -- and the expression with IdInfo updated - --- The trivial base cases -callArityAnal _ _ e@(Lit _) - = (emptyArityRes, e) -callArityAnal _ _ e@(Type _) - = (emptyArityRes, e) -callArityAnal _ _ e@(Coercion _) - = (emptyArityRes, e) --- The transparent cases -callArityAnal arity int (Tick t e) - = second (Tick t) $ callArityAnal arity int e -callArityAnal arity int (Cast e co) - = second (\e -> Cast e co) $ callArityAnal arity int e - --- The interesting case: Variables, Lambdas, Lets, Applications, Cases -callArityAnal arity int e@(Var v) - | v `elemVarSet` int - = (unitArityRes v arity, e) - | otherwise - = (emptyArityRes, e) - --- Non-value lambdas are ignored -callArityAnal arity int (Lam v e) | not (isId v) - = second (Lam v) $ callArityAnal arity (int `delVarSet` v) e - --- We have a lambda that may be called multiple times, so its free variables --- can all be co-called. -callArityAnal 0 int (Lam v e) - = (ae', Lam v e') - where - (ae, e') = callArityAnal 0 (int `delVarSet` v) e - ae' = calledMultipleTimes ae --- We have a lambda that we are calling. decrease arity. -callArityAnal arity int (Lam v e) - = (ae, Lam v e') - where - (ae, e') = callArityAnal (arity - 1) (int `delVarSet` v) e - --- Application. Increase arity for the called expression, nothing to know about --- the second -callArityAnal arity int (App e (Type t)) - = second (\e -> App e (Type t)) $ callArityAnal arity int e -callArityAnal arity int (App e1 e2) - = (final_ae, App e1' e2') - where - (ae1, e1') = callArityAnal (arity + 1) int e1 - (ae2, e2') = callArityAnal 0 int e2 - -- If the argument is trivial (e.g. a variable), then it will _not_ be - -- let-bound in the Core to STG transformation (CorePrep actually), - -- so no sharing will happen here, and we have to assume many calls. - ae2' | exprIsTrivial e2 = calledMultipleTimes ae2 - | otherwise = ae2 - final_ae = ae1 `both` ae2' - --- Case expression. -callArityAnal arity int (Case scrut bndr ty alts) - = -- pprTrace "callArityAnal:Case" - -- (vcat [ppr scrut, ppr final_ae]) - (final_ae, Case scrut' bndr ty alts') - where - (alt_aes, alts') = unzip $ map go alts - go (dc, bndrs, e) = let (ae, e') = callArityAnal arity int e - in (ae, (dc, bndrs, e')) - alt_ae = lubRess alt_aes - (scrut_ae, scrut') = callArityAnal 0 int scrut - final_ae = scrut_ae `both` alt_ae - --- For lets, use callArityBind -callArityAnal arity int (Let bind e) - = -- pprTrace "callArityAnal:Let" - -- (vcat [ppr v, ppr arity, ppr n, ppr final_ae ]) - (final_ae, Let bind' e') - where - int_body = int `addInterestingBinds` bind - (ae_body, e') = callArityAnal arity int_body e - (final_ae, bind') = callArityBind (boringBinds bind) ae_body int bind - --- Which bindings should we look at? --- See Note [Which variables are interesting] -isInteresting :: Var -> Bool -isInteresting v = not $ null (typeArity (idType v)) - -interestingBinds :: CoreBind -> [Var] -interestingBinds = filter isInteresting . bindersOf - -boringBinds :: CoreBind -> VarSet -boringBinds = mkVarSet . filter (not . isInteresting) . bindersOf - -addInterestingBinds :: VarSet -> CoreBind -> VarSet -addInterestingBinds int bind - = int `delVarSetList` bindersOf bind -- Possible shadowing - `extendVarSetList` interestingBinds bind - --- Used for both local and top-level binds --- Second argument is the demand from the body -callArityBind :: VarSet -> CallArityRes -> VarSet -> CoreBind -> (CallArityRes, CoreBind) --- Non-recursive let -callArityBind boring_vars ae_body int (NonRec v rhs) - | otherwise - = -- pprTrace "callArityBind:NonRec" - -- (vcat [ppr v, ppr ae_body, ppr int, ppr ae_rhs, ppr safe_arity]) - (final_ae, NonRec v' rhs') - where - is_thunk = not (exprIsCheap rhs) -- see note [What is a thunk] - -- If v is boring, we will not find it in ae_body, but always assume (0, False) - boring = v `elemVarSet` boring_vars - - (arity, called_once) - | boring = (0, False) -- See Note [Taking boring variables into account] - | otherwise = lookupCallArityRes ae_body v - safe_arity | called_once = arity - | is_thunk = 0 -- A thunk! Do not eta-expand - | otherwise = arity - - -- See Note [Trimming arity] - trimmed_arity = trimArity v safe_arity - - (ae_rhs, rhs') = callArityAnal trimmed_arity int rhs - - - ae_rhs'| called_once = ae_rhs - | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once - | otherwise = calledMultipleTimes ae_rhs - - called_by_v = domRes ae_rhs' - called_with_v - | boring = domRes ae_body - | otherwise = calledWith ae_body v `delUnVarSet` v - final_ae = addCrossCoCalls called_by_v called_with_v $ ae_rhs' `lubRes` resDel v ae_body - - v' = v `setIdCallArity` trimmed_arity - - --- Recursive let. See Note [Recursion and fixpointing] -callArityBind boring_vars ae_body int b@(Rec binds) - = -- (if length binds > 300 then - -- pprTrace "callArityBind:Rec" - -- (vcat [ppr (Rec binds'), ppr ae_body, ppr int, ppr ae_rhs]) else id) $ - (final_ae, Rec binds') - where - -- See Note [Taking boring variables into account] - any_boring = any (`elemVarSet` boring_vars) [ i | (i, _) <- binds] - - int_body = int `addInterestingBinds` b - (ae_rhs, binds') = fix initial_binds - final_ae = bindersOf b `resDelList` ae_rhs - - initial_binds = [(i,Nothing,e) | (i,e) <- binds] - - fix :: [(Id, Maybe (Bool, Arity, CallArityRes), CoreExpr)] -> (CallArityRes, [(Id, CoreExpr)]) - fix ann_binds - | -- pprTrace "callArityBind:fix" (vcat [ppr ann_binds, ppr any_change, ppr ae]) $ - any_change - = fix ann_binds' - | otherwise - = (ae, map (\(i, _, e) -> (i, e)) ann_binds') - where - aes_old = [ (i,ae) | (i, Just (_,_,ae), _) <- ann_binds ] - ae = callArityRecEnv any_boring aes_old ae_body - - rerun (i, mbLastRun, rhs) - | i `elemVarSet` int_body && not (i `elemUnVarSet` domRes ae) - -- No call to this yet, so do nothing - = (False, (i, Nothing, rhs)) - - | Just (old_called_once, old_arity, _) <- mbLastRun - , called_once == old_called_once - , new_arity == old_arity - -- No change, no need to re-analyze - = (False, (i, mbLastRun, rhs)) - - | otherwise - -- We previously analyzed this with a different arity (or not at all) - = let is_thunk = not (exprIsCheap rhs) -- see note [What is a thunk] - - safe_arity | is_thunk = 0 -- See Note [Thunks in recursive groups] - | otherwise = new_arity - - -- See Note [Trimming arity] - trimmed_arity = trimArity i safe_arity - - (ae_rhs, rhs') = callArityAnal trimmed_arity int_body rhs - - ae_rhs' | called_once = ae_rhs - | safe_arity == 0 = ae_rhs -- If it is not a function, its body is evaluated only once - | otherwise = calledMultipleTimes ae_rhs - - i' = i `setIdCallArity` trimmed_arity - - in (True, (i', Just (called_once, new_arity, ae_rhs'), rhs')) - where - -- See Note [Taking boring variables into account] - (new_arity, called_once) | i `elemVarSet` boring_vars = (0, False) - | otherwise = lookupCallArityRes ae i - - (changes, ann_binds') = unzip $ map rerun ann_binds - any_change = or changes - --- Combining the results from body and rhs, (mutually) recursive case --- See Note [Analysis II: The Co-Called analysis] -callArityRecEnv :: Bool -> [(Var, CallArityRes)] -> CallArityRes -> CallArityRes -callArityRecEnv any_boring ae_rhss ae_body - = -- (if length ae_rhss > 300 then pprTrace "callArityRecEnv" (vcat [ppr ae_rhss, ppr ae_body, ppr ae_new]) else id) $ - ae_new - where - vars = map fst ae_rhss - - ae_combined = lubRess (map snd ae_rhss) `lubRes` ae_body - - cross_calls - -- See Note [Taking boring variables into account] - | any_boring = completeGraph (domRes ae_combined) - -- Also, calculating cross_calls is expensive. Simply be conservative - -- if the mutually recursive group becomes too large. - | lengthExceeds ae_rhss 25 = completeGraph (domRes ae_combined) - | otherwise = unionUnVarGraphs $ map cross_call ae_rhss - cross_call (v, ae_rhs) = completeBipartiteGraph called_by_v called_with_v - where - is_thunk = idCallArity v == 0 - -- What rhs are relevant as happening before (or after) calling v? - -- If v is a thunk, everything from all the _other_ variables - -- If v is not a thunk, everything can happen. - ae_before_v | is_thunk = lubRess (map snd $ filter ((/= v) . fst) ae_rhss) `lubRes` ae_body - | otherwise = ae_combined - -- What do we want to know from these? - -- Which calls can happen next to any recursive call. - called_with_v - = unionUnVarSets $ map (calledWith ae_before_v) vars - called_by_v = domRes ae_rhs - - ae_new = first (cross_calls `unionUnVarGraph`) ae_combined - --- See Note [Trimming arity] -trimArity :: Id -> Arity -> Arity -trimArity v a = minimum [a, max_arity_by_type, max_arity_by_strsig] - where - max_arity_by_type = length (typeArity (idType v)) - max_arity_by_strsig - | isBotDiv result_info = length demands - | otherwise = a - - (demands, result_info) = splitStrictSig (idStrictness v) - ---------------------------------------- --- Functions related to CallArityRes -- ---------------------------------------- - --- Result type for the two analyses. --- See Note [Analysis I: The arity analysis] --- and Note [Analysis II: The Co-Called analysis] -type CallArityRes = (UnVarGraph, VarEnv Arity) - -emptyArityRes :: CallArityRes -emptyArityRes = (emptyUnVarGraph, emptyVarEnv) - -unitArityRes :: Var -> Arity -> CallArityRes -unitArityRes v arity = (emptyUnVarGraph, unitVarEnv v arity) - -resDelList :: [Var] -> CallArityRes -> CallArityRes -resDelList vs ae = foldr resDel ae vs - -resDel :: Var -> CallArityRes -> CallArityRes -resDel v (g, ae) = (g `delNode` v, ae `delVarEnv` v) - -domRes :: CallArityRes -> UnVarSet -domRes (_, ae) = varEnvDom ae - --- In the result, find out the minimum arity and whether the variable is called --- at most once. -lookupCallArityRes :: CallArityRes -> Var -> (Arity, Bool) -lookupCallArityRes (g, ae) v - = case lookupVarEnv ae v of - Just a -> (a, not (g `hasLoopAt` v)) - Nothing -> (0, False) - -calledWith :: CallArityRes -> Var -> UnVarSet -calledWith (g, _) v = neighbors g v - -addCrossCoCalls :: UnVarSet -> UnVarSet -> CallArityRes -> CallArityRes -addCrossCoCalls set1 set2 = first (completeBipartiteGraph set1 set2 `unionUnVarGraph`) - --- Replaces the co-call graph by a complete graph (i.e. no information) -calledMultipleTimes :: CallArityRes -> CallArityRes -calledMultipleTimes res = first (const (completeGraph (domRes res))) res - --- Used for application and cases -both :: CallArityRes -> CallArityRes -> CallArityRes -both r1 r2 = addCrossCoCalls (domRes r1) (domRes r2) $ r1 `lubRes` r2 - --- Used when combining results from alternative cases; take the minimum -lubRes :: CallArityRes -> CallArityRes -> CallArityRes -lubRes (g1, ae1) (g2, ae2) = (g1 `unionUnVarGraph` g2, ae1 `lubArityEnv` ae2) - -lubArityEnv :: VarEnv Arity -> VarEnv Arity -> VarEnv Arity -lubArityEnv = plusVarEnv_C min - -lubRess :: [CallArityRes] -> CallArityRes -lubRess = foldl' lubRes emptyArityRes diff --git a/compiler/simplCore/CoreMonad.hs b/compiler/simplCore/CoreMonad.hs deleted file mode 100644 index cb17f33b88..0000000000 --- a/compiler/simplCore/CoreMonad.hs +++ /dev/null @@ -1,829 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section[CoreMonad]{The core pipeline monad} --} - -{-# LANGUAGE CPP #-} -{-# LANGUAGE DeriveFunctor #-} - -{-# OPTIONS_GHC -Wno-incomplete-record-updates #-} - -module CoreMonad ( - -- * Configuration of the core-to-core passes - CoreToDo(..), runWhen, runMaybe, - SimplMode(..), - FloatOutSwitches(..), - pprPassDetails, - - -- * Plugins - CorePluginPass, bindsOnlyPass, - - -- * Counting - SimplCount, doSimplTick, doFreeSimplTick, simplCountN, - pprSimplCount, plusSimplCount, zeroSimplCount, - isZeroSimplCount, hasDetailedCounts, Tick(..), - - -- * The monad - CoreM, runCoreM, - - -- ** Reading from the monad - getHscEnv, getRuleBase, getModule, - getDynFlags, getPackageFamInstEnv, - getVisibleOrphanMods, getUniqMask, - getPrintUnqualified, getSrcSpanM, - - -- ** Writing to the monad - addSimplCount, - - -- ** Lifting into the monad - liftIO, liftIOWithCount, - - -- ** Dealing with annotations - getAnnotations, getFirstAnnotations, - - -- ** Screen output - putMsg, putMsgS, errorMsg, errorMsgS, warnMsg, - fatalErrorMsg, fatalErrorMsgS, - debugTraceMsg, debugTraceMsgS, - dumpIfSet_dyn - ) where - -import GhcPrelude hiding ( read ) - -import GHC.Core -import GHC.Driver.Types -import Module -import GHC.Driver.Session -import BasicTypes ( CompilerPhase(..) ) -import Annotations - -import IOEnv hiding ( liftIO, failM, failWithM ) -import qualified IOEnv ( liftIO ) -import Var -import Outputable -import FastString -import ErrUtils( Severity(..), DumpFormat (..), dumpOptionsFromFlag ) -import UniqSupply -import MonadUtils -import NameEnv -import SrcLoc -import Data.Bifunctor ( bimap ) -import ErrUtils (dumpAction) -import Data.List (intersperse, groupBy, sortBy) -import Data.Ord -import Data.Dynamic -import Data.Map (Map) -import qualified Data.Map as Map -import qualified Data.Map.Strict as MapStrict -import Data.Word -import Control.Monad -import Control.Applicative ( Alternative(..) ) -import Panic (throwGhcException, GhcException(..)) - -{- -************************************************************************ -* * - The CoreToDo type and related types - Abstraction of core-to-core passes to run. -* * -************************************************************************ --} - -data CoreToDo -- These are diff core-to-core passes, - -- which may be invoked in any order, - -- as many times as you like. - - = CoreDoSimplify -- The core-to-core simplifier. - Int -- Max iterations - SimplMode - | CoreDoPluginPass String CorePluginPass - | CoreDoFloatInwards - | CoreDoFloatOutwards FloatOutSwitches - | CoreLiberateCase - | CoreDoPrintCore - | CoreDoStaticArgs - | CoreDoCallArity - | CoreDoExitify - | CoreDoDemand - | CoreDoCpr - | CoreDoWorkerWrapper - | CoreDoSpecialising - | CoreDoSpecConstr - | CoreCSE - | CoreDoRuleCheck CompilerPhase String -- Check for non-application of rules - -- matching this string - | CoreDoNothing -- Useful when building up - | CoreDoPasses [CoreToDo] -- lists of these things - - | CoreDesugar -- Right after desugaring, no simple optimisation yet! - | CoreDesugarOpt -- CoreDesugarXXX: Not strictly a core-to-core pass, but produces - -- Core output, and hence useful to pass to endPass - - | CoreTidy - | CorePrep - | CoreOccurAnal - -instance Outputable CoreToDo where - ppr (CoreDoSimplify _ _) = text "Simplifier" - ppr (CoreDoPluginPass s _) = text "Core plugin: " <+> text s - ppr CoreDoFloatInwards = text "Float inwards" - ppr (CoreDoFloatOutwards f) = text "Float out" <> parens (ppr f) - ppr CoreLiberateCase = text "Liberate case" - ppr CoreDoStaticArgs = text "Static argument" - ppr CoreDoCallArity = text "Called arity analysis" - ppr CoreDoExitify = text "Exitification transformation" - ppr CoreDoDemand = text "Demand analysis" - ppr CoreDoCpr = text "Constructed Product Result analysis" - ppr CoreDoWorkerWrapper = text "Worker Wrapper binds" - ppr CoreDoSpecialising = text "Specialise" - ppr CoreDoSpecConstr = text "SpecConstr" - ppr CoreCSE = text "Common sub-expression" - ppr CoreDesugar = text "Desugar (before optimization)" - ppr CoreDesugarOpt = text "Desugar (after optimization)" - ppr CoreTidy = text "Tidy Core" - ppr CorePrep = text "CorePrep" - ppr CoreOccurAnal = text "Occurrence analysis" - ppr CoreDoPrintCore = text "Print core" - ppr (CoreDoRuleCheck {}) = text "Rule check" - ppr CoreDoNothing = text "CoreDoNothing" - ppr (CoreDoPasses passes) = text "CoreDoPasses" <+> ppr passes - -pprPassDetails :: CoreToDo -> SDoc -pprPassDetails (CoreDoSimplify n md) = vcat [ text "Max iterations =" <+> int n - , ppr md ] -pprPassDetails _ = Outputable.empty - -data SimplMode -- See comments in SimplMonad - = SimplMode - { sm_names :: [String] -- Name(s) of the phase - , sm_phase :: CompilerPhase - , sm_dflags :: DynFlags -- Just for convenient non-monadic - -- access; we don't override these - , sm_rules :: Bool -- Whether RULES are enabled - , sm_inline :: Bool -- Whether inlining is enabled - , sm_case_case :: Bool -- Whether case-of-case is enabled - , sm_eta_expand :: Bool -- Whether eta-expansion is enabled - } - -instance Outputable SimplMode where - ppr (SimplMode { sm_phase = p, sm_names = ss - , sm_rules = r, sm_inline = i - , sm_eta_expand = eta, sm_case_case = cc }) - = text "SimplMode" <+> braces ( - sep [ text "Phase =" <+> ppr p <+> - brackets (text (concat $ intersperse "," ss)) <> comma - , pp_flag i (sLit "inline") <> comma - , pp_flag r (sLit "rules") <> comma - , pp_flag eta (sLit "eta-expand") <> comma - , pp_flag cc (sLit "case-of-case") ]) - where - pp_flag f s = ppUnless f (text "no") <+> ptext s - -data FloatOutSwitches = FloatOutSwitches { - floatOutLambdas :: Maybe Int, -- ^ Just n <=> float lambdas to top level, if - -- doing so will abstract over n or fewer - -- value variables - -- Nothing <=> float all lambdas to top level, - -- regardless of how many free variables - -- Just 0 is the vanilla case: float a lambda - -- iff it has no free vars - - floatOutConstants :: Bool, -- ^ True <=> float constants to top level, - -- even if they do not escape a lambda - floatOutOverSatApps :: Bool, - -- ^ True <=> float out over-saturated applications - -- based on arity information. - -- See Note [Floating over-saturated applications] - -- in SetLevels - floatToTopLevelOnly :: Bool -- ^ Allow floating to the top level only. - } -instance Outputable FloatOutSwitches where - ppr = pprFloatOutSwitches - -pprFloatOutSwitches :: FloatOutSwitches -> SDoc -pprFloatOutSwitches sw - = text "FOS" <+> (braces $ - sep $ punctuate comma $ - [ text "Lam =" <+> ppr (floatOutLambdas sw) - , text "Consts =" <+> ppr (floatOutConstants sw) - , text "OverSatApps =" <+> ppr (floatOutOverSatApps sw) ]) - --- The core-to-core pass ordering is derived from the DynFlags: -runWhen :: Bool -> CoreToDo -> CoreToDo -runWhen True do_this = do_this -runWhen False _ = CoreDoNothing - -runMaybe :: Maybe a -> (a -> CoreToDo) -> CoreToDo -runMaybe (Just x) f = f x -runMaybe Nothing _ = CoreDoNothing - -{- - -************************************************************************ -* * - Types for Plugins -* * -************************************************************************ --} - --- | A description of the plugin pass itself -type CorePluginPass = ModGuts -> CoreM ModGuts - -bindsOnlyPass :: (CoreProgram -> CoreM CoreProgram) -> ModGuts -> CoreM ModGuts -bindsOnlyPass pass guts - = do { binds' <- pass (mg_binds guts) - ; return (guts { mg_binds = binds' }) } - -{- -************************************************************************ -* * - Counting and logging -* * -************************************************************************ --} - -getVerboseSimplStats :: (Bool -> SDoc) -> SDoc -getVerboseSimplStats = getPprDebug -- For now, anyway - -zeroSimplCount :: DynFlags -> SimplCount -isZeroSimplCount :: SimplCount -> Bool -hasDetailedCounts :: SimplCount -> Bool -pprSimplCount :: SimplCount -> SDoc -doSimplTick :: DynFlags -> Tick -> SimplCount -> SimplCount -doFreeSimplTick :: Tick -> SimplCount -> SimplCount -plusSimplCount :: SimplCount -> SimplCount -> SimplCount - -data SimplCount - = VerySimplCount !Int -- Used when don't want detailed stats - - | SimplCount { - ticks :: !Int, -- Total ticks - details :: !TickCounts, -- How many of each type - - n_log :: !Int, -- N - log1 :: [Tick], -- Last N events; <= opt_HistorySize, - -- most recent first - log2 :: [Tick] -- Last opt_HistorySize events before that - -- Having log1, log2 lets us accumulate the - -- recent history reasonably efficiently - } - -type TickCounts = Map Tick Int - -simplCountN :: SimplCount -> Int -simplCountN (VerySimplCount n) = n -simplCountN (SimplCount { ticks = n }) = n - -zeroSimplCount dflags - -- This is where we decide whether to do - -- the VerySimpl version or the full-stats version - | dopt Opt_D_dump_simpl_stats dflags - = SimplCount {ticks = 0, details = Map.empty, - n_log = 0, log1 = [], log2 = []} - | otherwise - = VerySimplCount 0 - -isZeroSimplCount (VerySimplCount n) = n==0 -isZeroSimplCount (SimplCount { ticks = n }) = n==0 - -hasDetailedCounts (VerySimplCount {}) = False -hasDetailedCounts (SimplCount {}) = True - -doFreeSimplTick tick sc@SimplCount { details = dts } - = sc { details = dts `addTick` tick } -doFreeSimplTick _ sc = sc - -doSimplTick dflags tick - sc@(SimplCount { ticks = tks, details = dts, n_log = nl, log1 = l1 }) - | nl >= historySize dflags = sc1 { n_log = 1, log1 = [tick], log2 = l1 } - | otherwise = sc1 { n_log = nl+1, log1 = tick : l1 } - where - sc1 = sc { ticks = tks+1, details = dts `addTick` tick } - -doSimplTick _ _ (VerySimplCount n) = VerySimplCount (n+1) - - -addTick :: TickCounts -> Tick -> TickCounts -addTick fm tick = MapStrict.insertWith (+) tick 1 fm - -plusSimplCount sc1@(SimplCount { ticks = tks1, details = dts1 }) - sc2@(SimplCount { ticks = tks2, details = dts2 }) - = log_base { ticks = tks1 + tks2 - , details = MapStrict.unionWith (+) dts1 dts2 } - where - -- A hackish way of getting recent log info - log_base | null (log1 sc2) = sc1 -- Nothing at all in sc2 - | null (log2 sc2) = sc2 { log2 = log1 sc1 } - | otherwise = sc2 - -plusSimplCount (VerySimplCount n) (VerySimplCount m) = VerySimplCount (n+m) -plusSimplCount lhs rhs = - throwGhcException . PprProgramError "plusSimplCount" $ vcat - [ text "lhs" - , pprSimplCount lhs - , text "rhs" - , pprSimplCount rhs - ] - -- We use one or the other consistently - -pprSimplCount (VerySimplCount n) = text "Total ticks:" <+> int n -pprSimplCount (SimplCount { ticks = tks, details = dts, log1 = l1, log2 = l2 }) - = vcat [text "Total ticks: " <+> int tks, - blankLine, - pprTickCounts dts, - getVerboseSimplStats $ \dbg -> if dbg - then - vcat [blankLine, - text "Log (most recent first)", - nest 4 (vcat (map ppr l1) $$ vcat (map ppr l2))] - else Outputable.empty - ] - -{- Note [Which transformations are innocuous] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -At one point (Jun 18) I wondered if some transformations (ticks) -might be "innocuous", in the sense that they do not unlock a later -transformation that does not occur in the same pass. If so, we could -refrain from bumping the overall tick-count for such innocuous -transformations, and perhaps terminate the simplifier one pass -earlier. - -But alas I found that virtually nothing was innocuous! This Note -just records what I learned, in case anyone wants to try again. - -These transformations are not innocuous: - -*** NB: I think these ones could be made innocuous - EtaExpansion - LetFloatFromLet - -LetFloatFromLet - x = K (let z = e2 in Just z) - prepareRhs transforms to - x2 = let z=e2 in Just z - x = K xs - And now more let-floating can happen in the - next pass, on x2 - -PreInlineUnconditionally - Example in spectral/cichelli/Auxil - hinsert = ...let lo = e in - let j = ...lo... in - case x of - False -> () - True -> case lo of I# lo' -> - ...j... - When we PreInlineUnconditionally j, lo's occ-info changes to once, - so it can be PreInlineUnconditionally in the next pass, and a - cascade of further things can happen. - -PostInlineUnconditionally - let x = e in - let y = ...x.. in - case .. of { A -> ...x...y... - B -> ...x...y... } - Current postinlineUnconditinaly will inline y, and then x; sigh. - - But PostInlineUnconditionally might also unlock subsequent - transformations for the same reason as PreInlineUnconditionally, - so it's probably not innocuous anyway. - -KnownBranch, BetaReduction: - May drop chunks of code, and thereby enable PreInlineUnconditionally - for some let-binding which now occurs once - -EtaExpansion: - Example in imaginary/digits-of-e1 - fail = \void. e where e :: IO () - --> etaExpandRhs - fail = \void. (\s. (e |> g) s) |> sym g where g :: IO () ~ S -> (S,()) - --> Next iteration of simplify - fail1 = \void. \s. (e |> g) s - fail = fail1 |> Void#->sym g - And now inline 'fail' - -CaseMerge: - case x of y { - DEFAULT -> case y of z { pi -> ei } - alts2 } - ---> CaseMerge - case x of { pi -> let z = y in ei - ; alts2 } - The "let z=y" case-binder-swap gets dealt with in the next pass --} - -pprTickCounts :: Map Tick Int -> SDoc -pprTickCounts counts - = vcat (map pprTickGroup groups) - where - groups :: [[(Tick,Int)]] -- Each group shares a common tag - -- toList returns common tags adjacent - groups = groupBy same_tag (Map.toList counts) - same_tag (tick1,_) (tick2,_) = tickToTag tick1 == tickToTag tick2 - -pprTickGroup :: [(Tick, Int)] -> SDoc -pprTickGroup group@((tick1,_):_) - = hang (int (sum [n | (_,n) <- group]) <+> text (tickString tick1)) - 2 (vcat [ int n <+> pprTickCts tick - -- flip as we want largest first - | (tick,n) <- sortBy (flip (comparing snd)) group]) -pprTickGroup [] = panic "pprTickGroup" - -data Tick -- See Note [Which transformations are innocuous] - = PreInlineUnconditionally Id - | PostInlineUnconditionally Id - - | UnfoldingDone Id - | RuleFired FastString -- Rule name - - | LetFloatFromLet - | EtaExpansion Id -- LHS binder - | EtaReduction Id -- Binder on outer lambda - | BetaReduction Id -- Lambda binder - - - | CaseOfCase Id -- Bndr on *inner* case - | KnownBranch Id -- Case binder - | CaseMerge Id -- Binder on outer case - | AltMerge Id -- Case binder - | CaseElim Id -- Case binder - | CaseIdentity Id -- Case binder - | FillInCaseDefault Id -- Case binder - - | SimplifierDone -- Ticked at each iteration of the simplifier - -instance Outputable Tick where - ppr tick = text (tickString tick) <+> pprTickCts tick - -instance Eq Tick where - a == b = case a `cmpTick` b of - EQ -> True - _ -> False - -instance Ord Tick where - compare = cmpTick - -tickToTag :: Tick -> Int -tickToTag (PreInlineUnconditionally _) = 0 -tickToTag (PostInlineUnconditionally _) = 1 -tickToTag (UnfoldingDone _) = 2 -tickToTag (RuleFired _) = 3 -tickToTag LetFloatFromLet = 4 -tickToTag (EtaExpansion _) = 5 -tickToTag (EtaReduction _) = 6 -tickToTag (BetaReduction _) = 7 -tickToTag (CaseOfCase _) = 8 -tickToTag (KnownBranch _) = 9 -tickToTag (CaseMerge _) = 10 -tickToTag (CaseElim _) = 11 -tickToTag (CaseIdentity _) = 12 -tickToTag (FillInCaseDefault _) = 13 -tickToTag SimplifierDone = 16 -tickToTag (AltMerge _) = 17 - -tickString :: Tick -> String -tickString (PreInlineUnconditionally _) = "PreInlineUnconditionally" -tickString (PostInlineUnconditionally _)= "PostInlineUnconditionally" -tickString (UnfoldingDone _) = "UnfoldingDone" -tickString (RuleFired _) = "RuleFired" -tickString LetFloatFromLet = "LetFloatFromLet" -tickString (EtaExpansion _) = "EtaExpansion" -tickString (EtaReduction _) = "EtaReduction" -tickString (BetaReduction _) = "BetaReduction" -tickString (CaseOfCase _) = "CaseOfCase" -tickString (KnownBranch _) = "KnownBranch" -tickString (CaseMerge _) = "CaseMerge" -tickString (AltMerge _) = "AltMerge" -tickString (CaseElim _) = "CaseElim" -tickString (CaseIdentity _) = "CaseIdentity" -tickString (FillInCaseDefault _) = "FillInCaseDefault" -tickString SimplifierDone = "SimplifierDone" - -pprTickCts :: Tick -> SDoc -pprTickCts (PreInlineUnconditionally v) = ppr v -pprTickCts (PostInlineUnconditionally v)= ppr v -pprTickCts (UnfoldingDone v) = ppr v -pprTickCts (RuleFired v) = ppr v -pprTickCts LetFloatFromLet = Outputable.empty -pprTickCts (EtaExpansion v) = ppr v -pprTickCts (EtaReduction v) = ppr v -pprTickCts (BetaReduction v) = ppr v -pprTickCts (CaseOfCase v) = ppr v -pprTickCts (KnownBranch v) = ppr v -pprTickCts (CaseMerge v) = ppr v -pprTickCts (AltMerge v) = ppr v -pprTickCts (CaseElim v) = ppr v -pprTickCts (CaseIdentity v) = ppr v -pprTickCts (FillInCaseDefault v) = ppr v -pprTickCts _ = Outputable.empty - -cmpTick :: Tick -> Tick -> Ordering -cmpTick a b = case (tickToTag a `compare` tickToTag b) of - GT -> GT - EQ -> cmpEqTick a b - LT -> LT - -cmpEqTick :: Tick -> Tick -> Ordering -cmpEqTick (PreInlineUnconditionally a) (PreInlineUnconditionally b) = a `compare` b -cmpEqTick (PostInlineUnconditionally a) (PostInlineUnconditionally b) = a `compare` b -cmpEqTick (UnfoldingDone a) (UnfoldingDone b) = a `compare` b -cmpEqTick (RuleFired a) (RuleFired b) = a `compare` b -cmpEqTick (EtaExpansion a) (EtaExpansion b) = a `compare` b -cmpEqTick (EtaReduction a) (EtaReduction b) = a `compare` b -cmpEqTick (BetaReduction a) (BetaReduction b) = a `compare` b -cmpEqTick (CaseOfCase a) (CaseOfCase b) = a `compare` b -cmpEqTick (KnownBranch a) (KnownBranch b) = a `compare` b -cmpEqTick (CaseMerge a) (CaseMerge b) = a `compare` b -cmpEqTick (AltMerge a) (AltMerge b) = a `compare` b -cmpEqTick (CaseElim a) (CaseElim b) = a `compare` b -cmpEqTick (CaseIdentity a) (CaseIdentity b) = a `compare` b -cmpEqTick (FillInCaseDefault a) (FillInCaseDefault b) = a `compare` b -cmpEqTick _ _ = EQ - -{- -************************************************************************ -* * - Monad and carried data structure definitions -* * -************************************************************************ --} - -data CoreReader = CoreReader { - cr_hsc_env :: HscEnv, - cr_rule_base :: RuleBase, - cr_module :: Module, - cr_print_unqual :: PrintUnqualified, - cr_loc :: SrcSpan, -- Use this for log/error messages so they - -- are at least tagged with the right source file - cr_visible_orphan_mods :: !ModuleSet, - cr_uniq_mask :: !Char -- Mask for creating unique values -} - --- Note: CoreWriter used to be defined with data, rather than newtype. If it --- is defined that way again, the cw_simpl_count field, at least, must be --- strict to avoid a space leak (#7702). -newtype CoreWriter = CoreWriter { - cw_simpl_count :: SimplCount -} - -emptyWriter :: DynFlags -> CoreWriter -emptyWriter dflags = CoreWriter { - cw_simpl_count = zeroSimplCount dflags - } - -plusWriter :: CoreWriter -> CoreWriter -> CoreWriter -plusWriter w1 w2 = CoreWriter { - cw_simpl_count = (cw_simpl_count w1) `plusSimplCount` (cw_simpl_count w2) - } - -type CoreIOEnv = IOEnv CoreReader - --- | The monad used by Core-to-Core passes to register simplification statistics. --- Also used to have common state (in the form of UniqueSupply) for generating Uniques. -newtype CoreM a = CoreM { unCoreM :: CoreIOEnv (a, CoreWriter) } - deriving (Functor) - -instance Monad CoreM where - mx >>= f = CoreM $ do - (x, w1) <- unCoreM mx - (y, w2) <- unCoreM (f x) - let w = w1 `plusWriter` w2 - return $ seq w (y, w) - -- forcing w before building the tuple avoids a space leak - -- (#7702) - -instance Applicative CoreM where - pure x = CoreM $ nop x - (<*>) = ap - m *> k = m >>= \_ -> k - -instance Alternative CoreM where - empty = CoreM Control.Applicative.empty - m <|> n = CoreM (unCoreM m <|> unCoreM n) - -instance MonadPlus CoreM - -instance MonadUnique CoreM where - getUniqueSupplyM = do - mask <- read cr_uniq_mask - liftIO $! mkSplitUniqSupply mask - - getUniqueM = do - mask <- read cr_uniq_mask - liftIO $! uniqFromMask mask - -runCoreM :: HscEnv - -> RuleBase - -> Char -- ^ Mask - -> Module - -> ModuleSet - -> PrintUnqualified - -> SrcSpan - -> CoreM a - -> IO (a, SimplCount) -runCoreM hsc_env rule_base mask mod orph_imps print_unqual loc m - = liftM extract $ runIOEnv reader $ unCoreM m - where - reader = CoreReader { - cr_hsc_env = hsc_env, - cr_rule_base = rule_base, - cr_module = mod, - cr_visible_orphan_mods = orph_imps, - cr_print_unqual = print_unqual, - cr_loc = loc, - cr_uniq_mask = mask - } - - extract :: (a, CoreWriter) -> (a, SimplCount) - extract (value, writer) = (value, cw_simpl_count writer) - -{- -************************************************************************ -* * - Core combinators, not exported -* * -************************************************************************ --} - -nop :: a -> CoreIOEnv (a, CoreWriter) -nop x = do - r <- getEnv - return (x, emptyWriter $ (hsc_dflags . cr_hsc_env) r) - -read :: (CoreReader -> a) -> CoreM a -read f = CoreM $ getEnv >>= (\r -> nop (f r)) - -write :: CoreWriter -> CoreM () -write w = CoreM $ return ((), w) - --- \subsection{Lifting IO into the monad} - --- | Lift an 'IOEnv' operation into 'CoreM' -liftIOEnv :: CoreIOEnv a -> CoreM a -liftIOEnv mx = CoreM (mx >>= (\x -> nop x)) - -instance MonadIO CoreM where - liftIO = liftIOEnv . IOEnv.liftIO - --- | Lift an 'IO' operation into 'CoreM' while consuming its 'SimplCount' -liftIOWithCount :: IO (SimplCount, a) -> CoreM a -liftIOWithCount what = liftIO what >>= (\(count, x) -> addSimplCount count >> return x) - -{- -************************************************************************ -* * - Reader, writer and state accessors -* * -************************************************************************ --} - -getHscEnv :: CoreM HscEnv -getHscEnv = read cr_hsc_env - -getRuleBase :: CoreM RuleBase -getRuleBase = read cr_rule_base - -getVisibleOrphanMods :: CoreM ModuleSet -getVisibleOrphanMods = read cr_visible_orphan_mods - -getPrintUnqualified :: CoreM PrintUnqualified -getPrintUnqualified = read cr_print_unqual - -getSrcSpanM :: CoreM SrcSpan -getSrcSpanM = read cr_loc - -addSimplCount :: SimplCount -> CoreM () -addSimplCount count = write (CoreWriter { cw_simpl_count = count }) - -getUniqMask :: CoreM Char -getUniqMask = read cr_uniq_mask - --- Convenience accessors for useful fields of HscEnv - -instance HasDynFlags CoreM where - getDynFlags = fmap hsc_dflags getHscEnv - -instance HasModule CoreM where - getModule = read cr_module - -getPackageFamInstEnv :: CoreM PackageFamInstEnv -getPackageFamInstEnv = do - hsc_env <- getHscEnv - eps <- liftIO $ hscEPS hsc_env - return $ eps_fam_inst_env eps - -{- -************************************************************************ -* * - Dealing with annotations -* * -************************************************************************ --} - --- | Get all annotations of a given type. This happens lazily, that is --- no deserialization will take place until the [a] is actually demanded and --- the [a] can also be empty (the UniqFM is not filtered). --- --- This should be done once at the start of a Core-to-Core pass that uses --- annotations. --- --- See Note [Annotations] -getAnnotations :: Typeable a => ([Word8] -> a) -> ModGuts -> CoreM (ModuleEnv [a], NameEnv [a]) -getAnnotations deserialize guts = do - hsc_env <- getHscEnv - ann_env <- liftIO $ prepareAnnotations hsc_env (Just guts) - return (deserializeAnns deserialize ann_env) - --- | Get at most one annotation of a given type per annotatable item. -getFirstAnnotations :: Typeable a => ([Word8] -> a) -> ModGuts -> CoreM (ModuleEnv a, NameEnv a) -getFirstAnnotations deserialize guts - = bimap mod name <$> getAnnotations deserialize guts - where - mod = mapModuleEnv head . filterModuleEnv (const $ not . null) - name = mapNameEnv head . filterNameEnv (not . null) - -{- -Note [Annotations] -~~~~~~~~~~~~~~~~~~ -A Core-to-Core pass that wants to make use of annotations calls -getAnnotations or getFirstAnnotations at the beginning to obtain a UniqFM with -annotations of a specific type. This produces all annotations from interface -files read so far. However, annotations from interface files read during the -pass will not be visible until getAnnotations is called again. This is similar -to how rules work and probably isn't too bad. - -The current implementation could be optimised a bit: when looking up -annotations for a thing from the HomePackageTable, we could search directly in -the module where the thing is defined rather than building one UniqFM which -contains all annotations we know of. This would work because annotations can -only be given to things defined in the same module. However, since we would -only want to deserialise every annotation once, we would have to build a cache -for every module in the HTP. In the end, it's probably not worth it as long as -we aren't using annotations heavily. - -************************************************************************ -* * - Direct screen output -* * -************************************************************************ --} - -msg :: Severity -> WarnReason -> SDoc -> CoreM () -msg sev reason doc - = do { dflags <- getDynFlags - ; loc <- getSrcSpanM - ; unqual <- getPrintUnqualified - ; let sty = case sev of - SevError -> err_sty - SevWarning -> err_sty - SevDump -> dump_sty - _ -> user_sty - err_sty = mkErrStyle dflags unqual - user_sty = mkUserStyle dflags unqual AllTheWay - dump_sty = mkDumpStyle dflags unqual - ; liftIO $ putLogMsg dflags reason sev loc sty doc } - --- | Output a String message to the screen -putMsgS :: String -> CoreM () -putMsgS = putMsg . text - --- | Output a message to the screen -putMsg :: SDoc -> CoreM () -putMsg = msg SevInfo NoReason - --- | Output an error to the screen. Does not cause the compiler to die. -errorMsgS :: String -> CoreM () -errorMsgS = errorMsg . text - --- | Output an error to the screen. Does not cause the compiler to die. -errorMsg :: SDoc -> CoreM () -errorMsg = msg SevError NoReason - -warnMsg :: WarnReason -> SDoc -> CoreM () -warnMsg = msg SevWarning - --- | Output a fatal error to the screen. Does not cause the compiler to die. -fatalErrorMsgS :: String -> CoreM () -fatalErrorMsgS = fatalErrorMsg . text - --- | Output a fatal error to the screen. Does not cause the compiler to die. -fatalErrorMsg :: SDoc -> CoreM () -fatalErrorMsg = msg SevFatal NoReason - --- | Output a string debugging message at verbosity level of @-v@ or higher -debugTraceMsgS :: String -> CoreM () -debugTraceMsgS = debugTraceMsg . text - --- | Outputs a debugging message at verbosity level of @-v@ or higher -debugTraceMsg :: SDoc -> CoreM () -debugTraceMsg = msg SevDump NoReason - --- | Show some labelled 'SDoc' if a particular flag is set or at a verbosity level of @-v -ddump-most@ or higher -dumpIfSet_dyn :: DumpFlag -> String -> DumpFormat -> SDoc -> CoreM () -dumpIfSet_dyn flag str fmt doc - = do { dflags <- getDynFlags - ; unqual <- getPrintUnqualified - ; when (dopt flag dflags) $ liftIO $ do - let sty = mkDumpStyle dflags unqual - dumpAction dflags sty (dumpOptionsFromFlag flag) str fmt doc } diff --git a/compiler/simplCore/CoreMonad.hs-boot b/compiler/simplCore/CoreMonad.hs-boot deleted file mode 100644 index 74c21e8216..0000000000 --- a/compiler/simplCore/CoreMonad.hs-boot +++ /dev/null @@ -1,30 +0,0 @@ --- Created this hs-boot file to remove circular dependencies from the use of --- Plugins. Plugins needs CoreToDo and CoreM types to define core-to-core --- transformations. --- However CoreMonad does much more than defining these, and because Plugins are --- activated in various modules, the imports become circular. To solve this I --- extracted CoreToDo and CoreM into this file. --- I needed to write the whole definition of these types, otherwise it created --- a data-newtype conflict. - -module CoreMonad ( CoreToDo, CoreM ) where - -import GhcPrelude - -import IOEnv ( IOEnv ) - -type CoreIOEnv = IOEnv CoreReader - -data CoreReader - -newtype CoreWriter = CoreWriter { - cw_simpl_count :: SimplCount -} - -data SimplCount - -newtype CoreM a = CoreM { unCoreM :: CoreIOEnv (a, CoreWriter) } - -instance Monad CoreM - -data CoreToDo diff --git a/compiler/simplCore/Exitify.hs b/compiler/simplCore/Exitify.hs deleted file mode 100644 index cbcacfa465..0000000000 --- a/compiler/simplCore/Exitify.hs +++ /dev/null @@ -1,499 +0,0 @@ -module Exitify ( exitifyProgram ) where - -{- -Note [Exitification] -~~~~~~~~~~~~~~~~~~~~ - -This module implements Exitification. The goal is to pull as much code out of -recursive functions as possible, as the simplifier is better at inlining into -call-sites that are not in recursive functions. - -Example: - - let t = foo bar - joinrec go 0 x y = t (x*x) - go (n-1) x y = jump go (n-1) (x+y) - in … - -We’d like to inline `t`, but that does not happen: Because t is a thunk and is -used in a recursive function, doing so might lose sharing in general. In -this case, however, `t` is on the _exit path_ of `go`, so called at most once. -How do we make this clearly visible to the simplifier? - -A code path (i.e., an expression in a tail-recursive position) in a recursive -function is an exit path if it does not contain a recursive call. We can bind -this expression outside the recursive function, as a join-point. - -Example result: - - let t = foo bar - join exit x = t (x*x) - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - in … - -Now `t` is no longer in a recursive function, and good things happen! --} - -import GhcPrelude -import Var -import Id -import IdInfo -import GHC.Core -import GHC.Core.Utils -import State -import Unique -import VarSet -import VarEnv -import GHC.Core.FVs -import FastString -import GHC.Core.Type -import Util( mapSnd ) - -import Data.Bifunctor -import Control.Monad - --- | Traverses the AST, simply to find all joinrecs and call 'exitify' on them. --- The really interesting function is exitifyRec -exitifyProgram :: CoreProgram -> CoreProgram -exitifyProgram binds = map goTopLvl binds - where - goTopLvl (NonRec v e) = NonRec v (go in_scope_toplvl e) - goTopLvl (Rec pairs) = Rec (map (second (go in_scope_toplvl)) pairs) - -- Top-level bindings are never join points - - in_scope_toplvl = emptyInScopeSet `extendInScopeSetList` bindersOfBinds binds - - go :: InScopeSet -> CoreExpr -> CoreExpr - go _ e@(Var{}) = e - go _ e@(Lit {}) = e - go _ e@(Type {}) = e - go _ e@(Coercion {}) = e - go in_scope (Cast e' c) = Cast (go in_scope e') c - go in_scope (Tick t e') = Tick t (go in_scope e') - go in_scope (App e1 e2) = App (go in_scope e1) (go in_scope e2) - - go in_scope (Lam v e') - = Lam v (go in_scope' e') - where in_scope' = in_scope `extendInScopeSet` v - - go in_scope (Case scrut bndr ty alts) - = Case (go in_scope scrut) bndr ty (map go_alt alts) - where - in_scope1 = in_scope `extendInScopeSet` bndr - go_alt (dc, pats, rhs) = (dc, pats, go in_scope' rhs) - where in_scope' = in_scope1 `extendInScopeSetList` pats - - go in_scope (Let (NonRec bndr rhs) body) - = Let (NonRec bndr (go in_scope rhs)) (go in_scope' body) - where - in_scope' = in_scope `extendInScopeSet` bndr - - go in_scope (Let (Rec pairs) body) - | is_join_rec = mkLets (exitifyRec in_scope' pairs') body' - | otherwise = Let (Rec pairs') body' - where - is_join_rec = any (isJoinId . fst) pairs - in_scope' = in_scope `extendInScopeSetList` bindersOf (Rec pairs) - pairs' = mapSnd (go in_scope') pairs - body' = go in_scope' body - - --- | State Monad used inside `exitify` -type ExitifyM = State [(JoinId, CoreExpr)] - --- | Given a recursive group of a joinrec, identifies “exit paths” and binds them as --- join-points outside the joinrec. -exitifyRec :: InScopeSet -> [(Var,CoreExpr)] -> [CoreBind] -exitifyRec in_scope pairs - = [ NonRec xid rhs | (xid,rhs) <- exits ] ++ [Rec pairs'] - where - -- We need the set of free variables of many subexpressions here, so - -- annotate the AST with them - -- see Note [Calculating free variables] - ann_pairs = map (second freeVars) pairs - - -- Which are the recursive calls? - recursive_calls = mkVarSet $ map fst pairs - - (pairs',exits) = (`runState` []) $ do - forM ann_pairs $ \(x,rhs) -> do - -- go past the lambdas of the join point - let (args, body) = collectNAnnBndrs (idJoinArity x) rhs - body' <- go args body - let rhs' = mkLams args body' - return (x, rhs') - - --------------------- - -- 'go' is the main working function. - -- It goes through the RHS (tail-call positions only), - -- checks if there are no more recursive calls, if so, abstracts over - -- variables bound on the way and lifts it out as a join point. - -- - -- ExitifyM is a state monad to keep track of floated binds - go :: [Var] -- ^ Variables that are in-scope here, but - -- not in scope at the joinrec; that is, - -- we must potentially abstract over them. - -- Invariant: they are kept in dependency order - -> CoreExprWithFVs -- ^ Current expression in tail position - -> ExitifyM CoreExpr - - -- We first look at the expression (no matter what it shape is) - -- and determine if we can turn it into a exit join point - go captured ann_e - | -- An exit expression has no recursive calls - let fvs = dVarSetToVarSet (freeVarsOf ann_e) - , disjointVarSet fvs recursive_calls - = go_exit captured (deAnnotate ann_e) fvs - - -- We could not turn it into a exit join point. So now recurse - -- into all expression where eligible exit join points might sit, - -- i.e. into all tail-call positions: - - -- Case right hand sides are in tail-call position - go captured (_, AnnCase scrut bndr ty alts) = do - alts' <- forM alts $ \(dc, pats, rhs) -> do - rhs' <- go (captured ++ [bndr] ++ pats) rhs - return (dc, pats, rhs') - return $ Case (deAnnotate scrut) bndr ty alts' - - go captured (_, AnnLet ann_bind body) - -- join point, RHS and body are in tail-call position - | AnnNonRec j rhs <- ann_bind - , Just join_arity <- isJoinId_maybe j - = do let (params, join_body) = collectNAnnBndrs join_arity rhs - join_body' <- go (captured ++ params) join_body - let rhs' = mkLams params join_body' - body' <- go (captured ++ [j]) body - return $ Let (NonRec j rhs') body' - - -- rec join point, RHSs and body are in tail-call position - | AnnRec pairs <- ann_bind - , isJoinId (fst (head pairs)) - = do let js = map fst pairs - pairs' <- forM pairs $ \(j,rhs) -> do - let join_arity = idJoinArity j - (params, join_body) = collectNAnnBndrs join_arity rhs - join_body' <- go (captured ++ js ++ params) join_body - let rhs' = mkLams params join_body' - return (j, rhs') - body' <- go (captured ++ js) body - return $ Let (Rec pairs') body' - - -- normal Let, only the body is in tail-call position - | otherwise - = do body' <- go (captured ++ bindersOf bind ) body - return $ Let bind body' - where bind = deAnnBind ann_bind - - -- Cannot be turned into an exit join point, but also has no - -- tail-call subexpression. Nothing to do here. - go _ ann_e = return (deAnnotate ann_e) - - --------------------- - go_exit :: [Var] -- Variables captured locally - -> CoreExpr -- An exit expression - -> VarSet -- Free vars of the expression - -> ExitifyM CoreExpr - -- go_exit deals with a tail expression that is floatable - -- out as an exit point; that is, it mentions no recursive calls - go_exit captured e fvs - -- Do not touch an expression that is already a join jump where all arguments - -- are captured variables. See Note [Idempotency] - -- But _do_ float join jumps with interesting arguments. - -- See Note [Jumps can be interesting] - | (Var f, args) <- collectArgs e - , isJoinId f - , all isCapturedVarArg args - = return e - - -- Do not touch a boring expression (see Note [Interesting expression]) - | not is_interesting - = return e - - -- Cannot float out if local join points are used, as - -- we cannot abstract over them - | captures_join_points - = return e - - -- We have something to float out! - | otherwise - = do { -- Assemble the RHS of the exit join point - let rhs = mkLams abs_vars e - avoid = in_scope `extendInScopeSetList` captured - -- Remember this binding under a suitable name - ; v <- addExit avoid (length abs_vars) rhs - -- And jump to it from here - ; return $ mkVarApps (Var v) abs_vars } - - where - -- Used to detect exit expressions that are already proper exit jumps - isCapturedVarArg (Var v) = v `elem` captured - isCapturedVarArg _ = False - - -- An interesting exit expression has free, non-imported - -- variables from outside the recursive group - -- See Note [Interesting expression] - is_interesting = anyVarSet isLocalId $ - fvs `minusVarSet` mkVarSet captured - - -- The arguments of this exit join point - -- See Note [Picking arguments to abstract over] - abs_vars = snd $ foldr pick (fvs, []) captured - where - pick v (fvs', acc) | v `elemVarSet` fvs' = (fvs' `delVarSet` v, zap v : acc) - | otherwise = (fvs', acc) - - -- We are going to abstract over these variables, so we must - -- zap any IdInfo they have; see #15005 - -- cf. SetLevels.abstractVars - zap v | isId v = setIdInfo v vanillaIdInfo - | otherwise = v - - -- We cannot abstract over join points - captures_join_points = any isJoinId abs_vars - - --- Picks a new unique, which is disjoint from --- * the free variables of the whole joinrec --- * any bound variables (captured) --- * any exit join points created so far. -mkExitJoinId :: InScopeSet -> Type -> JoinArity -> ExitifyM JoinId -mkExitJoinId in_scope ty join_arity = do - fs <- get - let avoid = in_scope `extendInScopeSetList` (map fst fs) - `extendInScopeSet` exit_id_tmpl -- just cosmetics - return (uniqAway avoid exit_id_tmpl) - where - exit_id_tmpl = mkSysLocal (fsLit "exit") initExitJoinUnique ty - `asJoinId` join_arity - -addExit :: InScopeSet -> JoinArity -> CoreExpr -> ExitifyM JoinId -addExit in_scope join_arity rhs = do - -- Pick a suitable name - let ty = exprType rhs - v <- mkExitJoinId in_scope ty join_arity - fs <- get - put ((v,rhs):fs) - return v - -{- -Note [Interesting expression] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We do not want this to happen: - - joinrec go 0 x y = x - go (n-1) x y = jump go (n-1) (x+y) - in … -==> - join exit x = x - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - in … - -because the floated exit path (`x`) is simply a parameter of `go`; there are -not useful interactions exposed this way. - -Neither do we want this to happen - - joinrec go 0 x y = x+x - go (n-1) x y = jump go (n-1) (x+y) - in … -==> - join exit x = x+x - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - in … - -where the floated expression `x+x` is a bit more complicated, but still not -intersting. - -Expressions are interesting when they move an occurrence of a variable outside -the recursive `go` that can benefit from being obviously called once, for example: - * a local thunk that can then be inlined (see example in note [Exitification]) - * the parameter of a function, where the demand analyzer then can then - see that it is called at most once, and hence improve the function’s - strictness signature - -So we only hoist an exit expression out if it mentiones at least one free, -non-imported variable. - -Note [Jumps can be interesting] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -A jump to a join point can be interesting, if its arguments contain free -non-exported variables (z in the following example): - - joinrec go 0 x y = jump j (x+z) - go (n-1) x y = jump go (n-1) (x+y) - in … -==> - join exit x y = jump j (x+z) - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - - -The join point itself can be interesting, even if none if its -arguments have free variables free in the joinrec. For example - - join j p = case p of (x,y) -> x+y - joinrec go 0 x y = jump j (x,y) - go (n-1) x y = jump go (n-1) (x+y) y - in … - -Here, `j` would not be inlined because we do not inline something that looks -like an exit join point (see Note [Do not inline exit join points]). But -if we exitify the 'jump j (x,y)' we get - - join j p = case p of (x,y) -> x+y - join exit x y = jump j (x,y) - joinrec go 0 x y = jump exit x y - go (n-1) x y = jump go (n-1) (x+y) y - in … - -and now 'j' can inline, and we get rid of the pair. Here's another -example (assume `g` to be an imported function that, on its own, -does not make this interesting): - - join j y = map f y - joinrec go 0 x y = jump j (map g x) - go (n-1) x y = jump go (n-1) (x+y) - in … - -Again, `j` would not be inlined because we do not inline something that looks -like an exit join point (see Note [Do not inline exit join points]). - -But after exitification we have - - join j y = map f y - join exit x = jump j (map g x) - joinrec go 0 x y = jump j (map g x) - go (n-1) x y = jump go (n-1) (x+y) - in … - -and now we can inline `j` and this will allow `map/map` to fire. - - -Note [Idempotency] -~~~~~~~~~~~~~~~~~~ - -We do not want this to happen, where we replace the floated expression with -essentially the same expression: - - join exit x = t (x*x) - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - in … -==> - join exit x = t (x*x) - join exit' x = jump exit x - joinrec go 0 x y = jump exit' x - go (n-1) x y = jump go (n-1) (x+y) - in … - -So when the RHS is a join jump, and all of its arguments are captured variables, -then we leave it in place. - -Note that `jump exit x` in this example looks interesting, as `exit` is a free -variable. Therefore, idempotency does not simply follow from floating only -interesting expressions. - -Note [Calculating free variables] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We have two options where to annotate the tree with free variables: - - A) The whole tree. - B) Each individual joinrec as we come across it. - -Downside of A: We pay the price on the whole module, even outside any joinrecs. -Downside of B: We pay the price per joinrec, possibly multiple times when -joinrecs are nested. - -Further downside of A: If the exitify function returns annotated expressions, -it would have to ensure that the annotations are correct. - -We therefore choose B, and calculate the free variables in `exitify`. - - -Note [Do not inline exit join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we have - - let t = foo bar - join exit x = t (x*x) - joinrec go 0 x y = jump exit x - go (n-1) x y = jump go (n-1) (x+y) - in … - -we do not want the simplifier to simply inline `exit` back in (which it happily -would). - -To prevent this, we need to recognize exit join points, and then disable -inlining. - -Exit join points, recognizeable using `isExitJoinId` are join points with an -occurrence in a recursive group, and can be recognized (after the occurrence -analyzer ran!) using `isExitJoinId`. -This function detects joinpoints with `occ_in_lam (idOccinfo id) == True`, -because the lambdas of a non-recursive join point are not considered for -`occ_in_lam`. For example, in the following code, `j1` is /not/ marked -occ_in_lam, because `j2` is called only once. - - join j1 x = x+1 - join j2 y = join j1 (y+2) - -To prevent inlining, we check for isExitJoinId -* In `preInlineUnconditionally` directly. -* In `simplLetUnfolding` we simply give exit join points no unfolding, which - prevents inlining in `postInlineUnconditionally` and call sites. - -Note [Placement of the exitification pass] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -I (Joachim) experimented with multiple positions for the Exitification pass in -the Core2Core pipeline: - - A) Before the `simpl_phases` - B) Between the `simpl_phases` and the "main" simplifier pass - C) After demand_analyser - D) Before the final simplification phase - -Here is the table (this is without inlining join exit points in the final -simplifier run): - - Program | Allocs | Instrs - | ABCD.log A.log B.log C.log D.log | ABCD.log A.log B.log C.log D.log -----------------|---------------------------------------------------|------------------------------------------------- - fannkuch-redux | -99.9% +0.0% -99.9% -99.9% -99.9% | -3.9% +0.5% -3.0% -3.9% -3.9% - fasta | -0.0% +0.0% +0.0% -0.0% -0.0% | -8.5% +0.0% +0.0% -0.0% -8.5% - fem | 0.0% 0.0% 0.0% 0.0% +0.0% | -2.2% -0.1% -0.1% -2.1% -2.1% - fish | 0.0% 0.0% 0.0% 0.0% +0.0% | -3.1% +0.0% -1.1% -1.1% -0.0% - k-nucleotide | -91.3% -91.0% -91.0% -91.3% -91.3% | -6.3% +11.4% +11.4% -6.3% -6.2% - scs | -0.0% -0.0% -0.0% -0.0% -0.0% | -3.4% -3.0% -3.1% -3.3% -3.3% - simple | -6.0% 0.0% -6.0% -6.0% +0.0% | -3.4% +0.0% -5.2% -3.4% -0.1% - spectral-norm | -0.0% 0.0% 0.0% -0.0% +0.0% | -2.7% +0.0% -2.7% -5.4% -5.4% -----------------|---------------------------------------------------|------------------------------------------------- - Min | -95.0% -91.0% -95.0% -95.0% -95.0% | -8.5% -3.0% -5.2% -6.3% -8.5% - Max | +0.2% +0.2% +0.2% +0.2% +1.5% | +0.4% +11.4% +11.4% +0.4% +1.5% - Geometric Mean | -4.7% -2.1% -4.7% -4.7% -4.6% | -0.4% +0.1% -0.1% -0.3% -0.2% - -Position A is disqualified, as it does not get rid of the allocations in -fannkuch-redux. -Position A and B are disqualified because it increases instructions in k-nucleotide. -Positions C and D have their advantages: C decreases allocations in simpl, but D instructions in fasta. - -Assuming we have a budget of _one_ run of Exitification, then C wins (but we -could get more from running it multiple times, as seen in fish). - -Note [Picking arguments to abstract over] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -When we create an exit join point, so we need to abstract over those of its -free variables that are be out-of-scope at the destination of the exit join -point. So we go through the list `captured` and pick those that are actually -free variables of the join point. - -We do not just `filter (`elemVarSet` fvs) captured`, as there might be -shadowing, and `captured` may contain multiple variables with the same Unique. I -these cases we want to abstract only over the last occurrence, hence the `foldr` -(with emphasis on the `r`). This is #15110. - --} diff --git a/compiler/simplCore/FloatIn.hs b/compiler/simplCore/FloatIn.hs deleted file mode 100644 index c1121e16e2..0000000000 --- a/compiler/simplCore/FloatIn.hs +++ /dev/null @@ -1,772 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - -************************************************************************ -* * -\section[FloatIn]{Floating Inwards pass} -* * -************************************************************************ - -The main purpose of @floatInwards@ is floating into branches of a -case, so that we don't allocate things, save them on the stack, and -then discover that they aren't needed in the chosen branch. --} - -{-# LANGUAGE CPP #-} -{-# OPTIONS_GHC -fprof-auto #-} -{-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-} - -module FloatIn ( floatInwards ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Core -import GHC.Core.Make hiding ( wrapFloats ) -import GHC.Driver.Types ( ModGuts(..) ) -import GHC.Core.Utils -import GHC.Core.FVs -import CoreMonad ( CoreM ) -import Id ( isOneShotBndr, idType, isJoinId, isJoinId_maybe ) -import Var -import GHC.Core.Type -import VarSet -import Util -import GHC.Driver.Session -import Outputable --- import Data.List ( mapAccumL ) -import BasicTypes ( RecFlag(..), isRec ) - -{- -Top-level interface function, @floatInwards@. Note that we do not -actually float any bindings downwards from the top-level. --} - -floatInwards :: ModGuts -> CoreM ModGuts -floatInwards pgm@(ModGuts { mg_binds = binds }) - = do { dflags <- getDynFlags - ; return (pgm { mg_binds = map (fi_top_bind dflags) binds }) } - where - fi_top_bind dflags (NonRec binder rhs) - = NonRec binder (fiExpr dflags [] (freeVars rhs)) - fi_top_bind dflags (Rec pairs) - = Rec [ (b, fiExpr dflags [] (freeVars rhs)) | (b, rhs) <- pairs ] - - -{- -************************************************************************ -* * -\subsection{Mail from Andr\'e [edited]} -* * -************************************************************************ - -{\em Will wrote: What??? I thought the idea was to float as far -inwards as possible, no matter what. This is dropping all bindings -every time it sees a lambda of any kind. Help! } - -You are assuming we DO DO full laziness AFTER floating inwards! We -have to [not float inside lambdas] if we don't. - -If we indeed do full laziness after the floating inwards (we could -check the compilation flags for that) then I agree we could be more -aggressive and do float inwards past lambdas. - -Actually we are not doing a proper full laziness (see below), which -was another reason for not floating inwards past a lambda. - -This can easily be fixed. The problem is that we float lets outwards, -but there are a few expressions which are not let bound, like case -scrutinees and case alternatives. After floating inwards the -simplifier could decide to inline the let and the laziness would be -lost, e.g. - -\begin{verbatim} -let a = expensive ==> \b -> case expensive of ... -in \ b -> case a of ... -\end{verbatim} -The fix is -\begin{enumerate} -\item -to let bind the algebraic case scrutinees (done, I think) and -the case alternatives (except the ones with an -unboxed type)(not done, I think). This is best done in the -SetLevels.hs module, which tags things with their level numbers. -\item -do the full laziness pass (floating lets outwards). -\item -simplify. The simplifier inlines the (trivial) lets that were - created but were not floated outwards. -\end{enumerate} - -With the fix I think Will's suggestion that we can gain even more from -strictness by floating inwards past lambdas makes sense. - -We still gain even without going past lambdas, as things may be -strict in the (new) context of a branch (where it was floated to) or -of a let rhs, e.g. -\begin{verbatim} -let a = something case x of -in case x of alt1 -> case something of a -> a + a - alt1 -> a + a ==> alt2 -> b - alt2 -> b - -let a = something let b = case something of a -> a + a -in let b = a + a ==> in (b,b) -in (b,b) -\end{verbatim} -Also, even if a is not found to be strict in the new context and is -still left as a let, if the branch is not taken (or b is not entered) -the closure for a is not built. - -************************************************************************ -* * -\subsection{Main floating-inwards code} -* * -************************************************************************ --} - -type FreeVarSet = DIdSet -type BoundVarSet = DIdSet - -data FloatInBind = FB BoundVarSet FreeVarSet FloatBind - -- The FreeVarSet is the free variables of the binding. In the case - -- of recursive bindings, the set doesn't include the bound - -- variables. - -type FloatInBinds = [FloatInBind] - -- In reverse dependency order (innermost binder first) - -fiExpr :: DynFlags - -> FloatInBinds -- Binds we're trying to drop - -- as far "inwards" as possible - -> CoreExprWithFVs -- Input expr - -> CoreExpr -- Result - -fiExpr _ to_drop (_, AnnLit lit) = wrapFloats to_drop (Lit lit) - -- See Note [Dead bindings] -fiExpr _ to_drop (_, AnnType ty) = ASSERT( null to_drop ) Type ty -fiExpr _ to_drop (_, AnnVar v) = wrapFloats to_drop (Var v) -fiExpr _ to_drop (_, AnnCoercion co) = wrapFloats to_drop (Coercion co) -fiExpr dflags to_drop (_, AnnCast expr (co_ann, co)) - = wrapFloats (drop_here ++ co_drop) $ - Cast (fiExpr dflags e_drop expr) co - where - [drop_here, e_drop, co_drop] - = sepBindsByDropPoint dflags False - [freeVarsOf expr, freeVarsOfAnn co_ann] - to_drop - -{- -Applications: we do float inside applications, mainly because we -need to get at all the arguments. The next simplifier run will -pull out any silly ones. --} - -fiExpr dflags to_drop ann_expr@(_,AnnApp {}) - = wrapFloats drop_here $ wrapFloats extra_drop $ - mkTicks ticks $ - mkApps (fiExpr dflags fun_drop ann_fun) - (zipWith (fiExpr dflags) arg_drops ann_args) - where - (ann_fun, ann_args, ticks) = collectAnnArgsTicks tickishFloatable ann_expr - fun_ty = exprType (deAnnotate ann_fun) - fun_fvs = freeVarsOf ann_fun - arg_fvs = map freeVarsOf ann_args - - (drop_here : extra_drop : fun_drop : arg_drops) - = sepBindsByDropPoint dflags False - (extra_fvs : fun_fvs : arg_fvs) - to_drop - -- Shortcut behaviour: if to_drop is empty, - -- sepBindsByDropPoint returns a suitable bunch of empty - -- lists without evaluating extra_fvs, and hence without - -- peering into each argument - - (_, extra_fvs) = foldl' add_arg (fun_ty, extra_fvs0) ann_args - extra_fvs0 = case ann_fun of - (_, AnnVar _) -> fun_fvs - _ -> emptyDVarSet - -- Don't float the binding for f into f x y z; see Note [Join points] - -- for why we *can't* do it when f is a join point. (If f isn't a - -- join point, floating it in isn't especially harmful but it's - -- useless since the simplifier will immediately float it back out.) - - add_arg :: (Type,FreeVarSet) -> CoreExprWithFVs -> (Type,FreeVarSet) - add_arg (fun_ty, extra_fvs) (_, AnnType ty) - = (piResultTy fun_ty ty, extra_fvs) - - add_arg (fun_ty, extra_fvs) (arg_fvs, arg) - | noFloatIntoArg arg arg_ty - = (res_ty, extra_fvs `unionDVarSet` arg_fvs) - | otherwise - = (res_ty, extra_fvs) - where - (arg_ty, res_ty) = splitFunTy fun_ty - -{- Note [Dead bindings] -~~~~~~~~~~~~~~~~~~~~~~~ -At a literal we won't usually have any floated bindings; the -only way that can happen is if the binding wrapped the literal -/in the original input program/. e.g. - case x of { DEFAULT -> 1# } -But, while this may be unusual it is not actually wrong, and it did -once happen (#15696). - -Note [Do not destroy the let/app invariant] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Watch out for - f (x +# y) -We don't want to float bindings into here - f (case ... of { x -> x +# y }) -because that might destroy the let/app invariant, which requires -unlifted function arguments to be ok-for-speculation. - -Note [Join points] -~~~~~~~~~~~~~~~~~~ -Generally, we don't need to worry about join points - there are places we're -not allowed to float them, but since they can't have occurrences in those -places, we're not tempted. - -We do need to be careful about jumps, however: - - joinrec j x y z = ... in - jump j a b c - -Previous versions often floated the definition of a recursive function into its -only non-recursive occurrence. But for a join point, this is a disaster: - - (joinrec j x y z = ... in - jump j) a b c -- wrong! - -Every jump must be exact, so the jump to j must have three arguments. Hence -we're careful not to float into the target of a jump (though we can float into -the arguments just fine). - -Note [Floating in past a lambda group] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -* We must be careful about floating inside a value lambda. - That risks losing laziness. - The float-out pass might rescue us, but then again it might not. - -* We must be careful about type lambdas too. At one time we did, and - there is no risk of duplicating work thereby, but we do need to be - careful. In particular, here is a bad case (it happened in the - cichelli benchmark: - let v = ... - in let f = /\t -> \a -> ... - ==> - let f = /\t -> let v = ... in \a -> ... - This is bad as now f is an updatable closure (update PAP) - and has arity 0. - -* Hack alert! We only float in through one-shot lambdas, - not (as you might guess) through lone big lambdas. - Reason: we float *out* past big lambdas (see the test in the Lam - case of FloatOut.floatExpr) and we don't want to float straight - back in again. - - It *is* important to float into one-shot lambdas, however; - see the remarks with noFloatIntoRhs. - -So we treat lambda in groups, using the following rule: - - Float in if (a) there is at least one Id, - and (b) there are no non-one-shot Ids - - Otherwise drop all the bindings outside the group. - -This is what the 'go' function in the AnnLam case is doing. - -(Join points are handled similarly: a join point is considered one-shot iff -it's non-recursive, so we float only into non-recursive join points.) - -Urk! if all are tyvars, and we don't float in, we may miss an - opportunity to float inside a nested case branch - - -Note [Floating coercions] -~~~~~~~~~~~~~~~~~~~~~~~~~ -We could, in principle, have a coercion binding like - case f x of co { DEFAULT -> e1 e2 } -It's not common to have a function that returns a coercion, but nothing -in Core prohibits it. If so, 'co' might be mentioned in e1 or e2 -/only in a type/. E.g. suppose e1 was - let (x :: Int |> co) = blah in blah2 - - -But, with coercions appearing in types, there is a complication: we -might be floating in a "strict let" -- that is, a case. Case expressions -mention their return type. We absolutely can't float a coercion binding -inward to the point that the type of the expression it's about to wrap -mentions the coercion. So we include the union of the sets of free variables -of the types of all the drop points involved. If any of the floaters -bind a coercion variable mentioned in any of the types, that binder must -be dropped right away. - --} - -fiExpr dflags to_drop lam@(_, AnnLam _ _) - | noFloatIntoLam bndrs -- Dump it all here - -- NB: Must line up with noFloatIntoRhs (AnnLam...); see #7088 - = wrapFloats to_drop (mkLams bndrs (fiExpr dflags [] body)) - - | otherwise -- Float inside - = mkLams bndrs (fiExpr dflags to_drop body) - - where - (bndrs, body) = collectAnnBndrs lam - -{- -We don't float lets inwards past an SCC. - ToDo: keep info on current cc, and when passing - one, if it is not the same, annotate all lets in binds with current - cc, change current cc to the new one and float binds into expr. --} - -fiExpr dflags to_drop (_, AnnTick tickish expr) - | tickish `tickishScopesLike` SoftScope - = Tick tickish (fiExpr dflags to_drop expr) - - | otherwise -- Wimp out for now - we could push values in - = wrapFloats to_drop (Tick tickish (fiExpr dflags [] expr)) - -{- -For @Lets@, the possible ``drop points'' for the \tr{to_drop} -bindings are: (a)~in the body, (b1)~in the RHS of a NonRec binding, -or~(b2), in each of the RHSs of the pairs of a @Rec@. - -Note that we do {\em weird things} with this let's binding. Consider: -\begin{verbatim} -let - w = ... -in { - let v = ... w ... - in ... v .. w ... -} -\end{verbatim} -Look at the inner \tr{let}. As \tr{w} is used in both the bind and -body of the inner let, we could panic and leave \tr{w}'s binding where -it is. But \tr{v} is floatable further into the body of the inner let, and -{\em then} \tr{w} will also be only in the body of that inner let. - -So: rather than drop \tr{w}'s binding here, we add it onto the list of -things to drop in the outer let's body, and let nature take its -course. - -Note [extra_fvs (1): avoid floating into RHS] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider let x=\y....t... in body. We do not necessarily want to float -a binding for t into the RHS, because it'll immediately be floated out -again. (It won't go inside the lambda else we risk losing work.) -In letrec, we need to be more careful still. We don't want to transform - let x# = y# +# 1# - in - letrec f = \z. ...x#...f... - in ... -into - letrec f = let x# = y# +# 1# in \z. ...x#...f... in ... -because now we can't float the let out again, because a letrec -can't have unboxed bindings. - -So we make "extra_fvs" which is the rhs_fvs of such bindings, and -arrange to dump bindings that bind extra_fvs before the entire let. - -Note [extra_fvs (2): free variables of rules] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - let x{rule mentioning y} = rhs in body -Here y is not free in rhs or body; but we still want to dump bindings -that bind y outside the let. So we augment extra_fvs with the -idRuleAndUnfoldingVars of x. No need for type variables, hence not using -idFreeVars. --} - -fiExpr dflags to_drop (_,AnnLet bind body) - = fiExpr dflags (after ++ new_float : before) body - -- to_drop is in reverse dependency order - where - (before, new_float, after) = fiBind dflags to_drop bind body_fvs - body_fvs = freeVarsOf body - -{- Note [Floating primops] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -We try to float-in a case expression over an unlifted type. The -motivating example was #5658: in particular, this change allows -array indexing operations, which have a single DEFAULT alternative -without any binders, to be floated inward. - -SIMD primops for unpacking SIMD vectors into an unboxed tuple of unboxed -scalars also need to be floated inward, but unpacks have a single non-DEFAULT -alternative that binds the elements of the tuple. We now therefore also support -floating in cases with a single alternative that may bind values. - -But there are wrinkles - -* Which unlifted cases do we float? See PrimOp.hs - Note [PrimOp can_fail and has_side_effects] which explains: - - We can float-in can_fail primops, but we can't float them out. - - But we can float a has_side_effects primop, but NOT inside a lambda, - so for now we don't float them at all. - Hence exprOkForSideEffects - -* Because we can float can-fail primops (array indexing, division) inwards - but not outwards, we must be careful not to transform - case a /# b of r -> f (F# r) - ===> - f (case a /# b of r -> F# r) - because that creates a new thunk that wasn't there before. And - because it can't be floated out (can_fail), the thunk will stay - there. Disaster! (This happened in nofib 'simple' and 'scs'.) - - Solution: only float cases into the branches of other cases, and - not into the arguments of an application, or the RHS of a let. This - is somewhat conservative, but it's simple. And it still hits the - cases like #5658. This is implemented in sepBindsByJoinPoint; - if is_case is False we dump all floating cases right here. - -* #14511 is another example of why we want to restrict float-in - of case-expressions. Consider - case indexArray# a n of (# r #) -> writeArray# ma i (f r) - Now, floating that indexing operation into the (f r) thunk will - not create any new thunks, but it will keep the array 'a' alive - for much longer than the programmer expected. - - So again, not floating a case into a let or argument seems like - the Right Thing - -For @Case@, the possible drop points for the 'to_drop' -bindings are: - (a) inside the scrutinee - (b) inside one of the alternatives/default (default FVs always /first/!). - --} - -fiExpr dflags to_drop (_, AnnCase scrut case_bndr _ [(con,alt_bndrs,rhs)]) - | isUnliftedType (idType case_bndr) - , exprOkForSideEffects (deAnnotate scrut) - -- See Note [Floating primops] - = wrapFloats shared_binds $ - fiExpr dflags (case_float : rhs_binds) rhs - where - case_float = FB (mkDVarSet (case_bndr : alt_bndrs)) scrut_fvs - (FloatCase scrut' case_bndr con alt_bndrs) - scrut' = fiExpr dflags scrut_binds scrut - rhs_fvs = freeVarsOf rhs `delDVarSetList` (case_bndr : alt_bndrs) - scrut_fvs = freeVarsOf scrut - - [shared_binds, scrut_binds, rhs_binds] - = sepBindsByDropPoint dflags False - [scrut_fvs, rhs_fvs] - to_drop - -fiExpr dflags to_drop (_, AnnCase scrut case_bndr ty alts) - = wrapFloats drop_here1 $ - wrapFloats drop_here2 $ - Case (fiExpr dflags scrut_drops scrut) case_bndr ty - (zipWith fi_alt alts_drops_s alts) - where - -- Float into the scrut and alts-considered-together just like App - [drop_here1, scrut_drops, alts_drops] - = sepBindsByDropPoint dflags False - [scrut_fvs, all_alts_fvs] - to_drop - - -- Float into the alts with the is_case flag set - (drop_here2 : alts_drops_s) - | [ _ ] <- alts = [] : [alts_drops] - | otherwise = sepBindsByDropPoint dflags True alts_fvs alts_drops - - scrut_fvs = freeVarsOf scrut - alts_fvs = map alt_fvs alts - all_alts_fvs = unionDVarSets alts_fvs - alt_fvs (_con, args, rhs) - = foldl' delDVarSet (freeVarsOf rhs) (case_bndr:args) - -- Delete case_bndr and args from free vars of rhs - -- to get free vars of alt - - fi_alt to_drop (con, args, rhs) = (con, args, fiExpr dflags to_drop rhs) - ------------------- -fiBind :: DynFlags - -> FloatInBinds -- Binds we're trying to drop - -- as far "inwards" as possible - -> CoreBindWithFVs -- Input binding - -> DVarSet -- Free in scope of binding - -> ( FloatInBinds -- Land these before - , FloatInBind -- The binding itself - , FloatInBinds) -- Land these after - -fiBind dflags to_drop (AnnNonRec id ann_rhs@(rhs_fvs, rhs)) body_fvs - = ( extra_binds ++ shared_binds -- Land these before - -- See Note [extra_fvs (1,2)] - , FB (unitDVarSet id) rhs_fvs' -- The new binding itself - (FloatLet (NonRec id rhs')) - , body_binds ) -- Land these after - - where - body_fvs2 = body_fvs `delDVarSet` id - - rule_fvs = bndrRuleAndUnfoldingVarsDSet id -- See Note [extra_fvs (2): free variables of rules] - extra_fvs | noFloatIntoRhs NonRecursive id rhs - = rule_fvs `unionDVarSet` rhs_fvs - | otherwise - = rule_fvs - -- See Note [extra_fvs (1): avoid floating into RHS] - -- No point in floating in only to float straight out again - -- We *can't* float into ok-for-speculation unlifted RHSs - -- But do float into join points - - [shared_binds, extra_binds, rhs_binds, body_binds] - = sepBindsByDropPoint dflags False - [extra_fvs, rhs_fvs, body_fvs2] - to_drop - - -- Push rhs_binds into the right hand side of the binding - rhs' = fiRhs dflags rhs_binds id ann_rhs - rhs_fvs' = rhs_fvs `unionDVarSet` floatedBindsFVs rhs_binds `unionDVarSet` rule_fvs - -- Don't forget the rule_fvs; the binding mentions them! - -fiBind dflags to_drop (AnnRec bindings) body_fvs - = ( extra_binds ++ shared_binds - , FB (mkDVarSet ids) rhs_fvs' - (FloatLet (Rec (fi_bind rhss_binds bindings))) - , body_binds ) - where - (ids, rhss) = unzip bindings - rhss_fvs = map freeVarsOf rhss - - -- See Note [extra_fvs (1,2)] - rule_fvs = mapUnionDVarSet bndrRuleAndUnfoldingVarsDSet ids - extra_fvs = rule_fvs `unionDVarSet` - unionDVarSets [ rhs_fvs | (bndr, (rhs_fvs, rhs)) <- bindings - , noFloatIntoRhs Recursive bndr rhs ] - - (shared_binds:extra_binds:body_binds:rhss_binds) - = sepBindsByDropPoint dflags False - (extra_fvs:body_fvs:rhss_fvs) - to_drop - - rhs_fvs' = unionDVarSets rhss_fvs `unionDVarSet` - unionDVarSets (map floatedBindsFVs rhss_binds) `unionDVarSet` - rule_fvs -- Don't forget the rule variables! - - -- Push rhs_binds into the right hand side of the binding - fi_bind :: [FloatInBinds] -- one per "drop pt" conjured w/ fvs_of_rhss - -> [(Id, CoreExprWithFVs)] - -> [(Id, CoreExpr)] - - fi_bind to_drops pairs - = [ (binder, fiRhs dflags to_drop binder rhs) - | ((binder, rhs), to_drop) <- zipEqual "fi_bind" pairs to_drops ] - ------------------- -fiRhs :: DynFlags -> FloatInBinds -> CoreBndr -> CoreExprWithFVs -> CoreExpr -fiRhs dflags to_drop bndr rhs - | Just join_arity <- isJoinId_maybe bndr - , let (bndrs, body) = collectNAnnBndrs join_arity rhs - = mkLams bndrs (fiExpr dflags to_drop body) - | otherwise - = fiExpr dflags to_drop rhs - ------------------- -noFloatIntoLam :: [Var] -> Bool -noFloatIntoLam bndrs = any bad bndrs - where - bad b = isId b && not (isOneShotBndr b) - -- Don't float inside a non-one-shot lambda - -noFloatIntoRhs :: RecFlag -> Id -> CoreExprWithFVs' -> Bool --- ^ True if it's a bad idea to float bindings into this RHS -noFloatIntoRhs is_rec bndr rhs - | isJoinId bndr - = isRec is_rec -- Joins are one-shot iff non-recursive - - | otherwise - = noFloatIntoArg rhs (idType bndr) - -noFloatIntoArg :: CoreExprWithFVs' -> Type -> Bool -noFloatIntoArg expr expr_ty - | isUnliftedType expr_ty - = True -- See Note [Do not destroy the let/app invariant] - - | AnnLam bndr e <- expr - , (bndrs, _) <- collectAnnBndrs e - = noFloatIntoLam (bndr:bndrs) -- Wrinkle 1 (a) - || all isTyVar (bndr:bndrs) -- Wrinkle 1 (b) - -- See Note [noFloatInto considerations] wrinkle 2 - - | otherwise -- Note [noFloatInto considerations] wrinkle 2 - = exprIsTrivial deann_expr || exprIsHNF deann_expr - where - deann_expr = deAnnotate' expr - -{- Note [noFloatInto considerations] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When do we want to float bindings into - - noFloatIntoRHs: the RHS of a let-binding - - noFloatIntoArg: the argument of a function application - -Definitely don't float in if it has unlifted type; that -would destroy the let/app invariant. - -* Wrinkle 1: do not float in if - (a) any non-one-shot value lambdas - or (b) all type lambdas - In both cases we'll float straight back out again - NB: Must line up with fiExpr (AnnLam...); see #7088 - - (a) is important: we /must/ float into a one-shot lambda group - (which includes join points). This makes a big difference - for things like - f x# = let x = I# x# - in let j = \() -> ...x... - in if <condition> then normal-path else j () - If x is used only in the error case join point, j, we must float the - boxing constructor into it, else we box it every time which is very - bad news indeed. - -* Wrinkle 2: for RHSs, do not float into a HNF; we'll just float right - back out again... not tragic, but a waste of time. - - For function arguments we will still end up with this - in-then-out stuff; consider - letrec x = e in f x - Here x is not a HNF, so we'll produce - f (letrec x = e in x) - which is OK... it's not that common, and we'll end up - floating out again, in CorePrep if not earlier. - Still, we use exprIsTrivial to catch this case (sigh) - - -************************************************************************ -* * -\subsection{@sepBindsByDropPoint@} -* * -************************************************************************ - -This is the crucial function. The idea is: We have a wad of bindings -that we'd like to distribute inside a collection of {\em drop points}; -insides the alternatives of a \tr{case} would be one example of some -drop points; the RHS and body of a non-recursive \tr{let} binding -would be another (2-element) collection. - -So: We're given a list of sets-of-free-variables, one per drop point, -and a list of floating-inwards bindings. If a binding can go into -only one drop point (without suddenly making something out-of-scope), -in it goes. If a binding is used inside {\em multiple} drop points, -then it has to go in a you-must-drop-it-above-all-these-drop-points -point. - -We have to maintain the order on these drop-point-related lists. --} - --- pprFIB :: FloatInBinds -> SDoc --- pprFIB fibs = text "FIB:" <+> ppr [b | FB _ _ b <- fibs] - -sepBindsByDropPoint - :: DynFlags - -> Bool -- True <=> is case expression - -> [FreeVarSet] -- One set of FVs per drop point - -- Always at least two long! - -> FloatInBinds -- Candidate floaters - -> [FloatInBinds] -- FIRST one is bindings which must not be floated - -- inside any drop point; the rest correspond - -- one-to-one with the input list of FV sets - --- Every input floater is returned somewhere in the result; --- none are dropped, not even ones which don't seem to be --- free in *any* of the drop-point fvs. Why? Because, for example, --- a binding (let x = E in B) might have a specialised version of --- x (say x') stored inside x, but x' isn't free in E or B. - -type DropBox = (FreeVarSet, FloatInBinds) - -sepBindsByDropPoint dflags is_case drop_pts floaters - | null floaters -- Shortcut common case - = [] : [[] | _ <- drop_pts] - - | otherwise - = ASSERT( drop_pts `lengthAtLeast` 2 ) - go floaters (map (\fvs -> (fvs, [])) (emptyDVarSet : drop_pts)) - where - n_alts = length drop_pts - - go :: FloatInBinds -> [DropBox] -> [FloatInBinds] - -- The *first* one in the argument list is the drop_here set - -- The FloatInBinds in the lists are in the reverse of - -- the normal FloatInBinds order; that is, they are the right way round! - - go [] drop_boxes = map (reverse . snd) drop_boxes - - go (bind_w_fvs@(FB bndrs bind_fvs bind) : binds) drop_boxes@(here_box : fork_boxes) - = go binds new_boxes - where - -- "here" means the group of bindings dropped at the top of the fork - - (used_here : used_in_flags) = [ fvs `intersectsDVarSet` bndrs - | (fvs, _) <- drop_boxes] - - drop_here = used_here || cant_push - - n_used_alts = count id used_in_flags -- returns number of Trues in list. - - cant_push - | is_case = n_used_alts == n_alts -- Used in all, don't push - -- Remember n_alts > 1 - || (n_used_alts > 1 && not (floatIsDupable dflags bind)) - -- floatIsDupable: see Note [Duplicating floats] - - | otherwise = floatIsCase bind || n_used_alts > 1 - -- floatIsCase: see Note [Floating primops] - - new_boxes | drop_here = (insert here_box : fork_boxes) - | otherwise = (here_box : new_fork_boxes) - - new_fork_boxes = zipWithEqual "FloatIn.sepBinds" insert_maybe - fork_boxes used_in_flags - - insert :: DropBox -> DropBox - insert (fvs,drops) = (fvs `unionDVarSet` bind_fvs, bind_w_fvs:drops) - - insert_maybe box True = insert box - insert_maybe box False = box - - go _ _ = panic "sepBindsByDropPoint/go" - - -{- Note [Duplicating floats] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -For case expressions we duplicate the binding if it is reasonably -small, and if it is not used in all the RHSs This is good for -situations like - let x = I# y in - case e of - C -> error x - D -> error x - E -> ...not mentioning x... - -If the thing is used in all RHSs there is nothing gained, -so we don't duplicate then. --} - -floatedBindsFVs :: FloatInBinds -> FreeVarSet -floatedBindsFVs binds = mapUnionDVarSet fbFVs binds - -fbFVs :: FloatInBind -> DVarSet -fbFVs (FB _ fvs _) = fvs - -wrapFloats :: FloatInBinds -> CoreExpr -> CoreExpr --- Remember FloatInBinds is in *reverse* dependency order -wrapFloats [] e = e -wrapFloats (FB _ _ fl : bs) e = wrapFloats bs (wrapFloat fl e) - -floatIsDupable :: DynFlags -> FloatBind -> Bool -floatIsDupable dflags (FloatCase scrut _ _ _) = exprIsDupable dflags scrut -floatIsDupable dflags (FloatLet (Rec prs)) = all (exprIsDupable dflags . snd) prs -floatIsDupable dflags (FloatLet (NonRec _ r)) = exprIsDupable dflags r - -floatIsCase :: FloatBind -> Bool -floatIsCase (FloatCase {}) = True -floatIsCase (FloatLet {}) = False diff --git a/compiler/simplCore/FloatOut.hs b/compiler/simplCore/FloatOut.hs deleted file mode 100644 index 8c2b4c93e0..0000000000 --- a/compiler/simplCore/FloatOut.hs +++ /dev/null @@ -1,757 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - -\section[FloatOut]{Float bindings outwards (towards the top level)} - -``Long-distance'' floating of bindings towards the top level. --} - -{-# LANGUAGE CPP #-} - -module FloatOut ( floatOutwards ) where - -import GhcPrelude - -import GHC.Core -import GHC.Core.Utils -import GHC.Core.Make -import GHC.Core.Arity ( etaExpand ) -import CoreMonad ( FloatOutSwitches(..) ) - -import GHC.Driver.Session -import ErrUtils ( dumpIfSet_dyn, DumpFormat (..) ) -import Id ( Id, idArity, idType, isBottomingId, - isJoinId, isJoinId_maybe ) -import SetLevels -import UniqSupply ( UniqSupply ) -import Bag -import Util -import Maybes -import Outputable -import GHC.Core.Type -import qualified Data.IntMap as M - -import Data.List ( partition ) - -#include "HsVersions.h" - -{- - ----------------- - Overall game plan - ----------------- - -The Big Main Idea is: - - To float out sub-expressions that can thereby get outside - a non-one-shot value lambda, and hence may be shared. - - -To achieve this we may need to do two things: - - a) Let-bind the sub-expression: - - f (g x) ==> let lvl = f (g x) in lvl - - Now we can float the binding for 'lvl'. - - b) More than that, we may need to abstract wrt a type variable - - \x -> ... /\a -> let v = ...a... in .... - - Here the binding for v mentions 'a' but not 'x'. So we - abstract wrt 'a', to give this binding for 'v': - - vp = /\a -> ...a... - v = vp a - - Now the binding for vp can float out unimpeded. - I can't remember why this case seemed important enough to - deal with, but I certainly found cases where important floats - didn't happen if we did not abstract wrt tyvars. - -With this in mind we can also achieve another goal: lambda lifting. -We can make an arbitrary (function) binding float to top level by -abstracting wrt *all* local variables, not just type variables, leaving -a binding that can be floated right to top level. Whether or not this -happens is controlled by a flag. - - -Random comments -~~~~~~~~~~~~~~~ - -At the moment we never float a binding out to between two adjacent -lambdas. For example: - -@ - \x y -> let t = x+x in ... -===> - \x -> let t = x+x in \y -> ... -@ -Reason: this is less efficient in the case where the original lambda -is never partially applied. - -But there's a case I've seen where this might not be true. Consider: -@ -elEm2 x ys - = elem' x ys - where - elem' _ [] = False - elem' x (y:ys) = x==y || elem' x ys -@ -It turns out that this generates a subexpression of the form -@ - \deq x ys -> let eq = eqFromEqDict deq in ... -@ -which might usefully be separated to -@ - \deq -> let eq = eqFromEqDict deq in \xy -> ... -@ -Well, maybe. We don't do this at the moment. - -Note [Join points] -~~~~~~~~~~~~~~~~~~ -Every occurrence of a join point must be a tail call (see Note [Invariants on -join points] in GHC.Core), so we must be careful with how far we float them. The -mechanism for doing so is the *join ceiling*, detailed in Note [Join ceiling] -in SetLevels. For us, the significance is that a binder might be marked to be -dropped at the nearest boundary between tail calls and non-tail calls. For -example: - - (< join j = ... in - let x = < ... > in - case < ... > of - A -> ... - B -> ... - >) < ... > < ... > - -Here the join ceilings are marked with angle brackets. Either side of an -application is a join ceiling, as is the scrutinee position of a case -expression or the RHS of a let binding (but not a join point). - -Why do we *want* do float join points at all? After all, they're never -allocated, so there's no sharing to be gained by floating them. However, the -other benefit of floating is making RHSes small, and this can have a significant -impact. In particular, stream fusion has been known to produce nested loops like -this: - - joinrec j1 x1 = - joinrec j2 x2 = - joinrec j3 x3 = ... jump j1 (x3 + 1) ... jump j2 (x3 + 1) ... - in jump j3 x2 - in jump j2 x1 - in jump j1 x - -(Assume x1 and x2 do *not* occur free in j3.) - -Here j1 and j2 are wholly superfluous---each of them merely forwards its -argument to j3. Since j3 only refers to x3, we can float j2 and j3 to make -everything one big mutual recursion: - - joinrec j1 x1 = jump j2 x1 - j2 x2 = jump j3 x2 - j3 x3 = ... jump j1 (x3 + 1) ... jump j2 (x3 + 1) ... - in jump j1 x - -Now the simplifier will happily inline the trivial j1 and j2, leaving only j3. -Without floating, we're stuck with three loops instead of one. - -************************************************************************ -* * -\subsection[floatOutwards]{@floatOutwards@: let-floating interface function} -* * -************************************************************************ --} - -floatOutwards :: FloatOutSwitches - -> DynFlags - -> UniqSupply - -> CoreProgram -> IO CoreProgram - -floatOutwards float_sws dflags us pgm - = do { - let { annotated_w_levels = setLevels float_sws pgm us ; - (fss, binds_s') = unzip (map floatTopBind annotated_w_levels) - } ; - - dumpIfSet_dyn dflags Opt_D_verbose_core2core "Levels added:" - FormatCore - (vcat (map ppr annotated_w_levels)); - - let { (tlets, ntlets, lams) = get_stats (sum_stats fss) }; - - dumpIfSet_dyn dflags Opt_D_dump_simpl_stats "FloatOut stats:" - FormatText - (hcat [ int tlets, text " Lets floated to top level; ", - int ntlets, text " Lets floated elsewhere; from ", - int lams, text " Lambda groups"]); - - return (bagToList (unionManyBags binds_s')) - } - -floatTopBind :: LevelledBind -> (FloatStats, Bag CoreBind) -floatTopBind bind - = case (floatBind bind) of { (fs, floats, bind') -> - let float_bag = flattenTopFloats floats - in case bind' of - -- bind' can't have unlifted values or join points, so can only be one - -- value bind, rec or non-rec (see comment on floatBind) - [Rec prs] -> (fs, unitBag (Rec (addTopFloatPairs float_bag prs))) - [NonRec b e] -> (fs, float_bag `snocBag` NonRec b e) - _ -> pprPanic "floatTopBind" (ppr bind') } - -{- -************************************************************************ -* * -\subsection[FloatOut-Bind]{Floating in a binding (the business end)} -* * -************************************************************************ --} - -floatBind :: LevelledBind -> (FloatStats, FloatBinds, [CoreBind]) - -- Returns a list with either - -- * A single non-recursive binding (value or join point), or - -- * The following, in order: - -- * Zero or more non-rec unlifted bindings - -- * One or both of: - -- * A recursive group of join binds - -- * A recursive group of value binds - -- See Note [Floating out of Rec rhss] for why things get arranged this way. -floatBind (NonRec (TB var _) rhs) - = case (floatRhs var rhs) of { (fs, rhs_floats, rhs') -> - - -- A tiresome hack: - -- see Note [Bottoming floats: eta expansion] in SetLevels - let rhs'' | isBottomingId var = etaExpand (idArity var) rhs' - | otherwise = rhs' - - in (fs, rhs_floats, [NonRec var rhs'']) } - -floatBind (Rec pairs) - = case floatList do_pair pairs of { (fs, rhs_floats, new_pairs) -> - let (new_ul_pairss, new_other_pairss) = unzip new_pairs - (new_join_pairs, new_l_pairs) = partition (isJoinId . fst) - (concat new_other_pairss) - -- Can't put the join points and the values in the same rec group - new_rec_binds | null new_join_pairs = [ Rec new_l_pairs ] - | null new_l_pairs = [ Rec new_join_pairs ] - | otherwise = [ Rec new_l_pairs - , Rec new_join_pairs ] - new_non_rec_binds = [ NonRec b e | (b, e) <- concat new_ul_pairss ] - in - (fs, rhs_floats, new_non_rec_binds ++ new_rec_binds) } - where - do_pair :: (LevelledBndr, LevelledExpr) - -> (FloatStats, FloatBinds, - ([(Id,CoreExpr)], -- Non-recursive unlifted value bindings - [(Id,CoreExpr)])) -- Join points and lifted value bindings - do_pair (TB name spec, rhs) - | isTopLvl dest_lvl -- See Note [floatBind for top level] - = case (floatRhs name rhs) of { (fs, rhs_floats, rhs') -> - (fs, emptyFloats, ([], addTopFloatPairs (flattenTopFloats rhs_floats) - [(name, rhs')]))} - | otherwise -- Note [Floating out of Rec rhss] - = case (floatRhs name rhs) of { (fs, rhs_floats, rhs') -> - case (partitionByLevel dest_lvl rhs_floats) of { (rhs_floats', heres) -> - case (splitRecFloats heres) of { (ul_pairs, pairs, case_heres) -> - let pairs' = (name, installUnderLambdas case_heres rhs') : pairs in - (fs, rhs_floats', (ul_pairs, pairs')) }}} - where - dest_lvl = floatSpecLevel spec - -splitRecFloats :: Bag FloatBind - -> ([(Id,CoreExpr)], -- Non-recursive unlifted value bindings - [(Id,CoreExpr)], -- Join points and lifted value bindings - Bag FloatBind) -- A tail of further bindings --- The "tail" begins with a case --- See Note [Floating out of Rec rhss] -splitRecFloats fs - = go [] [] (bagToList fs) - where - go ul_prs prs (FloatLet (NonRec b r) : fs) | isUnliftedType (idType b) - , not (isJoinId b) - = go ((b,r):ul_prs) prs fs - | otherwise - = go ul_prs ((b,r):prs) fs - go ul_prs prs (FloatLet (Rec prs') : fs) = go ul_prs (prs' ++ prs) fs - go ul_prs prs fs = (reverse ul_prs, prs, - listToBag fs) - -- Order only matters for - -- non-rec - -installUnderLambdas :: Bag FloatBind -> CoreExpr -> CoreExpr --- Note [Floating out of Rec rhss] -installUnderLambdas floats e - | isEmptyBag floats = e - | otherwise = go e - where - go (Lam b e) = Lam b (go e) - go e = install floats e - ---------------- -floatList :: (a -> (FloatStats, FloatBinds, b)) -> [a] -> (FloatStats, FloatBinds, [b]) -floatList _ [] = (zeroStats, emptyFloats, []) -floatList f (a:as) = case f a of { (fs_a, binds_a, b) -> - case floatList f as of { (fs_as, binds_as, bs) -> - (fs_a `add_stats` fs_as, binds_a `plusFloats` binds_as, b:bs) }} - -{- -Note [Floating out of Rec rhss] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider Rec { f<1,0> = \xy. body } -From the body we may get some floats. The ones with level <1,0> must -stay here, since they may mention f. Ideally we'd like to make them -part of the Rec block pairs -- but we can't if there are any -FloatCases involved. - -Nor is it a good idea to dump them in the rhs, but outside the lambda - f = case x of I# y -> \xy. body -because now f's arity might get worse, which is Not Good. (And if -there's an SCC around the RHS it might not get better again. -See #5342.) - -So, gruesomely, we split the floats into - * the outer FloatLets, which can join the Rec, and - * an inner batch starting in a FloatCase, which are then - pushed *inside* the lambdas. -This loses full-laziness the rare situation where there is a -FloatCase and a Rec interacting. - -If there are unlifted FloatLets (that *aren't* join points) among the floats, -we can't add them to the recursive group without angering Core Lint, but since -they must be ok-for-speculation, they can't actually be making any recursive -calls, so we can safely pull them out and keep them non-recursive. - -(Why is something getting floated to <1,0> that doesn't make a recursive call? -The case that came up in testing was that f *and* the unlifted binding were -getting floated *to the same place*: - - \x<2,0> -> - ... <3,0> - letrec { f<F<2,0>> = - ... let x'<F<2,0>> = x +# 1# in ... - } in ... - -Everything gets labeled "float to <2,0>" because it all depends on x, but this -makes f and x' look mutually recursive when they're not. - -The test was shootout/k-nucleotide, as compiled using commit 47d5dd68 on the -wip/join-points branch. - -TODO: This can probably be solved somehow in SetLevels. The difference between -"this *is at* level <2,0>" and "this *depends on* level <2,0>" is very -important.) - -Note [floatBind for top level] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We may have a *nested* binding whose destination level is (FloatMe tOP_LEVEL), thus - letrec { foo <0,0> = .... (let bar<0,0> = .. in ..) .... } -The binding for bar will be in the "tops" part of the floating binds, -and thus not partioned by floatBody. - -We could perhaps get rid of the 'tops' component of the floating binds, -but this case works just as well. - - -************************************************************************ - -\subsection[FloatOut-Expr]{Floating in expressions} -* * -************************************************************************ --} - -floatBody :: Level - -> LevelledExpr - -> (FloatStats, FloatBinds, CoreExpr) - -floatBody lvl arg -- Used rec rhss, and case-alternative rhss - = case (floatExpr arg) of { (fsa, floats, arg') -> - case (partitionByLevel lvl floats) of { (floats', heres) -> - -- Dump bindings are bound here - (fsa, floats', install heres arg') }} - ------------------ - -{- Note [Floating past breakpoints] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We used to disallow floating out of breakpoint ticks (see #10052). However, I -think this is too restrictive. - -Consider the case of an expression scoped over by a breakpoint tick, - - tick<...> (let x = ... in f x) - -In this case it is completely legal to float out x, despite the fact that -breakpoint ticks are scoped, - - let x = ... in (tick<...> f x) - -The reason here is that we know that the breakpoint will still be hit when the -expression is entered since the tick still scopes over the RHS. - --} - -floatExpr :: LevelledExpr - -> (FloatStats, FloatBinds, CoreExpr) -floatExpr (Var v) = (zeroStats, emptyFloats, Var v) -floatExpr (Type ty) = (zeroStats, emptyFloats, Type ty) -floatExpr (Coercion co) = (zeroStats, emptyFloats, Coercion co) -floatExpr (Lit lit) = (zeroStats, emptyFloats, Lit lit) - -floatExpr (App e a) - = case (atJoinCeiling $ floatExpr e) of { (fse, floats_e, e') -> - case (atJoinCeiling $ floatExpr a) of { (fsa, floats_a, a') -> - (fse `add_stats` fsa, floats_e `plusFloats` floats_a, App e' a') }} - -floatExpr lam@(Lam (TB _ lam_spec) _) - = let (bndrs_w_lvls, body) = collectBinders lam - bndrs = [b | TB b _ <- bndrs_w_lvls] - bndr_lvl = asJoinCeilLvl (floatSpecLevel lam_spec) - -- All the binders have the same level - -- See SetLevels.lvlLamBndrs - -- Use asJoinCeilLvl to make this the join ceiling - in - case (floatBody bndr_lvl body) of { (fs, floats, body') -> - (add_to_stats fs floats, floats, mkLams bndrs body') } - -floatExpr (Tick tickish expr) - | tickish `tickishScopesLike` SoftScope -- not scoped, can just float - = case (atJoinCeiling $ floatExpr expr) of { (fs, floating_defns, expr') -> - (fs, floating_defns, Tick tickish expr') } - - | not (tickishCounts tickish) || tickishCanSplit tickish - = case (atJoinCeiling $ floatExpr expr) of { (fs, floating_defns, expr') -> - let -- Annotate bindings floated outwards past an scc expression - -- with the cc. We mark that cc as "duplicated", though. - annotated_defns = wrapTick (mkNoCount tickish) floating_defns - in - (fs, annotated_defns, Tick tickish expr') } - - -- Note [Floating past breakpoints] - | Breakpoint{} <- tickish - = case (floatExpr expr) of { (fs, floating_defns, expr') -> - (fs, floating_defns, Tick tickish expr') } - - | otherwise - = pprPanic "floatExpr tick" (ppr tickish) - -floatExpr (Cast expr co) - = case (atJoinCeiling $ floatExpr expr) of { (fs, floating_defns, expr') -> - (fs, floating_defns, Cast expr' co) } - -floatExpr (Let bind body) - = case bind_spec of - FloatMe dest_lvl - -> case (floatBind bind) of { (fsb, bind_floats, binds') -> - case (floatExpr body) of { (fse, body_floats, body') -> - let new_bind_floats = foldr plusFloats emptyFloats - (map (unitLetFloat dest_lvl) binds') in - ( add_stats fsb fse - , bind_floats `plusFloats` new_bind_floats - `plusFloats` body_floats - , body') }} - - StayPut bind_lvl -- See Note [Avoiding unnecessary floating] - -> case (floatBind bind) of { (fsb, bind_floats, binds') -> - case (floatBody bind_lvl body) of { (fse, body_floats, body') -> - ( add_stats fsb fse - , bind_floats `plusFloats` body_floats - , foldr Let body' binds' ) }} - where - bind_spec = case bind of - NonRec (TB _ s) _ -> s - Rec ((TB _ s, _) : _) -> s - Rec [] -> panic "floatExpr:rec" - -floatExpr (Case scrut (TB case_bndr case_spec) ty alts) - = case case_spec of - FloatMe dest_lvl -- Case expression moves - | [(con@(DataAlt {}), bndrs, rhs)] <- alts - -> case atJoinCeiling $ floatExpr scrut of { (fse, fde, scrut') -> - case floatExpr rhs of { (fsb, fdb, rhs') -> - let - float = unitCaseFloat dest_lvl scrut' - case_bndr con [b | TB b _ <- bndrs] - in - (add_stats fse fsb, fde `plusFloats` float `plusFloats` fdb, rhs') }} - | otherwise - -> pprPanic "Floating multi-case" (ppr alts) - - StayPut bind_lvl -- Case expression stays put - -> case atJoinCeiling $ floatExpr scrut of { (fse, fde, scrut') -> - case floatList (float_alt bind_lvl) alts of { (fsa, fda, alts') -> - (add_stats fse fsa, fda `plusFloats` fde, Case scrut' case_bndr ty alts') - }} - where - float_alt bind_lvl (con, bs, rhs) - = case (floatBody bind_lvl rhs) of { (fs, rhs_floats, rhs') -> - (fs, rhs_floats, (con, [b | TB b _ <- bs], rhs')) } - -floatRhs :: CoreBndr - -> LevelledExpr - -> (FloatStats, FloatBinds, CoreExpr) -floatRhs bndr rhs - | Just join_arity <- isJoinId_maybe bndr - , Just (bndrs, body) <- try_collect join_arity rhs [] - = case bndrs of - [] -> floatExpr rhs - (TB _ lam_spec):_ -> - let lvl = floatSpecLevel lam_spec in - case floatBody lvl body of { (fs, floats, body') -> - (fs, floats, mkLams [b | TB b _ <- bndrs] body') } - | otherwise - = atJoinCeiling $ floatExpr rhs - where - try_collect 0 expr acc = Just (reverse acc, expr) - try_collect n (Lam b e) acc = try_collect (n-1) e (b:acc) - try_collect _ _ _ = Nothing - -{- -Note [Avoiding unnecessary floating] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In general we want to avoid floating a let unnecessarily, because -it might worsen strictness: - let - x = ...(let y = e in y+y).... -Here y is demanded. If we float it outside the lazy 'x=..' then -we'd have to zap its demand info, and it may never be restored. - -So at a 'let' we leave the binding right where the are unless -the binding will escape a value lambda, e.g. - -(\x -> let y = fac 100 in y) - -That's what the partitionByMajorLevel does in the floatExpr (Let ...) -case. - -Notice, though, that we must take care to drop any bindings -from the body of the let that depend on the staying-put bindings. - -We used instead to do the partitionByMajorLevel on the RHS of an '=', -in floatRhs. But that was quite tiresome. We needed to test for -values or trivial rhss, because (in particular) we don't want to insert -new bindings between the "=" and the "\". E.g. - f = \x -> let <bind> in <body> -We do not want - f = let <bind> in \x -> <body> -(a) The simplifier will immediately float it further out, so we may - as well do so right now; in general, keeping rhss as manifest - values is good -(b) If a float-in pass follows immediately, it might add yet more - bindings just after the '='. And some of them might (correctly) - be strict even though the 'let f' is lazy, because f, being a value, - gets its demand-info zapped by the simplifier. -And even all that turned out to be very fragile, and broke -altogether when profiling got in the way. - -So now we do the partition right at the (Let..) itself. - -************************************************************************ -* * -\subsection{Utility bits for floating stats} -* * -************************************************************************ - -I didn't implement this with unboxed numbers. I don't want to be too -strict in this stuff, as it is rarely turned on. (WDP 95/09) --} - -data FloatStats - = FlS Int -- Number of top-floats * lambda groups they've been past - Int -- Number of non-top-floats * lambda groups they've been past - Int -- Number of lambda (groups) seen - -get_stats :: FloatStats -> (Int, Int, Int) -get_stats (FlS a b c) = (a, b, c) - -zeroStats :: FloatStats -zeroStats = FlS 0 0 0 - -sum_stats :: [FloatStats] -> FloatStats -sum_stats xs = foldr add_stats zeroStats xs - -add_stats :: FloatStats -> FloatStats -> FloatStats -add_stats (FlS a1 b1 c1) (FlS a2 b2 c2) - = FlS (a1 + a2) (b1 + b2) (c1 + c2) - -add_to_stats :: FloatStats -> FloatBinds -> FloatStats -add_to_stats (FlS a b c) (FB tops ceils others) - = FlS (a + lengthBag tops) - (b + lengthBag ceils + lengthBag (flattenMajor others)) - (c + 1) - -{- -************************************************************************ -* * -\subsection{Utility bits for floating} -* * -************************************************************************ - -Note [Representation of FloatBinds] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The FloatBinds types is somewhat important. We can get very large numbers -of floating bindings, often all destined for the top level. A typical example -is x = [4,2,5,2,5, .... ] -Then we get lots of small expressions like (fromInteger 4), which all get -lifted to top level. - -The trouble is that - (a) we partition these floating bindings *at every binding site* - (b) SetLevels introduces a new bindings site for every float -So we had better not look at each binding at each binding site! - -That is why MajorEnv is represented as a finite map. - -We keep the bindings destined for the *top* level separate, because -we float them out even if they don't escape a *value* lambda; see -partitionByMajorLevel. --} - -type FloatLet = CoreBind -- INVARIANT: a FloatLet is always lifted -type MajorEnv = M.IntMap MinorEnv -- Keyed by major level -type MinorEnv = M.IntMap (Bag FloatBind) -- Keyed by minor level - -data FloatBinds = FB !(Bag FloatLet) -- Destined for top level - !(Bag FloatBind) -- Destined for join ceiling - !MajorEnv -- Other levels - -- See Note [Representation of FloatBinds] - -instance Outputable FloatBinds where - ppr (FB fbs ceils defs) - = text "FB" <+> (braces $ vcat - [ text "tops =" <+> ppr fbs - , text "ceils =" <+> ppr ceils - , text "non-tops =" <+> ppr defs ]) - -flattenTopFloats :: FloatBinds -> Bag CoreBind -flattenTopFloats (FB tops ceils defs) - = ASSERT2( isEmptyBag (flattenMajor defs), ppr defs ) - ASSERT2( isEmptyBag ceils, ppr ceils ) - tops - -addTopFloatPairs :: Bag CoreBind -> [(Id,CoreExpr)] -> [(Id,CoreExpr)] -addTopFloatPairs float_bag prs - = foldr add prs float_bag - where - add (NonRec b r) prs = (b,r):prs - add (Rec prs1) prs2 = prs1 ++ prs2 - -flattenMajor :: MajorEnv -> Bag FloatBind -flattenMajor = M.foldr (unionBags . flattenMinor) emptyBag - -flattenMinor :: MinorEnv -> Bag FloatBind -flattenMinor = M.foldr unionBags emptyBag - -emptyFloats :: FloatBinds -emptyFloats = FB emptyBag emptyBag M.empty - -unitCaseFloat :: Level -> CoreExpr -> Id -> AltCon -> [Var] -> FloatBinds -unitCaseFloat (Level major minor t) e b con bs - | t == JoinCeilLvl - = FB emptyBag floats M.empty - | otherwise - = FB emptyBag emptyBag (M.singleton major (M.singleton minor floats)) - where - floats = unitBag (FloatCase e b con bs) - -unitLetFloat :: Level -> FloatLet -> FloatBinds -unitLetFloat lvl@(Level major minor t) b - | isTopLvl lvl = FB (unitBag b) emptyBag M.empty - | t == JoinCeilLvl = FB emptyBag floats M.empty - | otherwise = FB emptyBag emptyBag (M.singleton major - (M.singleton minor floats)) - where - floats = unitBag (FloatLet b) - -plusFloats :: FloatBinds -> FloatBinds -> FloatBinds -plusFloats (FB t1 c1 l1) (FB t2 c2 l2) - = FB (t1 `unionBags` t2) (c1 `unionBags` c2) (l1 `plusMajor` l2) - -plusMajor :: MajorEnv -> MajorEnv -> MajorEnv -plusMajor = M.unionWith plusMinor - -plusMinor :: MinorEnv -> MinorEnv -> MinorEnv -plusMinor = M.unionWith unionBags - -install :: Bag FloatBind -> CoreExpr -> CoreExpr -install defn_groups expr - = foldr wrapFloat expr defn_groups - -partitionByLevel - :: Level -- Partitioning level - -> FloatBinds -- Defns to be divided into 2 piles... - -> (FloatBinds, -- Defns with level strictly < partition level, - Bag FloatBind) -- The rest - -{- --- ---- partitionByMajorLevel ---- --- Float it if we escape a value lambda, --- *or* if we get to the top level --- *or* if it's a case-float and its minor level is < current --- --- If we can get to the top level, say "yes" anyway. This means that --- x = f e --- transforms to --- lvl = e --- x = f lvl --- which is as it should be - -partitionByMajorLevel (Level major _) (FB tops defns) - = (FB tops outer, heres `unionBags` flattenMajor inner) - where - (outer, mb_heres, inner) = M.splitLookup major defns - heres = case mb_heres of - Nothing -> emptyBag - Just h -> flattenMinor h --} - -partitionByLevel (Level major minor typ) (FB tops ceils defns) - = (FB tops ceils' (outer_maj `plusMajor` M.singleton major outer_min), - here_min `unionBags` here_ceil - `unionBags` flattenMinor inner_min - `unionBags` flattenMajor inner_maj) - - where - (outer_maj, mb_here_maj, inner_maj) = M.splitLookup major defns - (outer_min, mb_here_min, inner_min) = case mb_here_maj of - Nothing -> (M.empty, Nothing, M.empty) - Just min_defns -> M.splitLookup minor min_defns - here_min = mb_here_min `orElse` emptyBag - (here_ceil, ceils') | typ == JoinCeilLvl = (ceils, emptyBag) - | otherwise = (emptyBag, ceils) - --- Like partitionByLevel, but instead split out the bindings that are marked --- to float to the nearest join ceiling (see Note [Join points]) -partitionAtJoinCeiling :: FloatBinds -> (FloatBinds, Bag FloatBind) -partitionAtJoinCeiling (FB tops ceils defs) - = (FB tops emptyBag defs, ceils) - --- Perform some action at a join ceiling, i.e., don't let join points float out --- (see Note [Join points]) -atJoinCeiling :: (FloatStats, FloatBinds, CoreExpr) - -> (FloatStats, FloatBinds, CoreExpr) -atJoinCeiling (fs, floats, expr') - = (fs, floats', install ceils expr') - where - (floats', ceils) = partitionAtJoinCeiling floats - -wrapTick :: Tickish Id -> FloatBinds -> FloatBinds -wrapTick t (FB tops ceils defns) - = FB (mapBag wrap_bind tops) (wrap_defns ceils) - (M.map (M.map wrap_defns) defns) - where - wrap_defns = mapBag wrap_one - - wrap_bind (NonRec binder rhs) = NonRec binder (maybe_tick rhs) - wrap_bind (Rec pairs) = Rec (mapSnd maybe_tick pairs) - - wrap_one (FloatLet bind) = FloatLet (wrap_bind bind) - wrap_one (FloatCase e b c bs) = FloatCase (maybe_tick e) b c bs - - maybe_tick e | exprIsHNF e = tickHNFArgs t e - | otherwise = mkTick t e - -- we don't need to wrap a tick around an HNF when we float it - -- outside a tick: that is an invariant of the tick semantics - -- Conversely, inlining of HNFs inside an SCC is allowed, and - -- indeed the HNF we're floating here might well be inlined back - -- again, and we don't want to end up with duplicate ticks. diff --git a/compiler/simplCore/LiberateCase.hs b/compiler/simplCore/LiberateCase.hs deleted file mode 100644 index 1347cf37bf..0000000000 --- a/compiler/simplCore/LiberateCase.hs +++ /dev/null @@ -1,442 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1994-1998 - -\section[LiberateCase]{Unroll recursion to allow evals to be lifted from a loop} --} - -{-# LANGUAGE CPP #-} -module LiberateCase ( liberateCase ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Driver.Session -import GHC.Core -import GHC.Core.Unfold ( couldBeSmallEnoughToInline ) -import TysWiredIn ( unitDataConId ) -import Id -import VarEnv -import Util ( notNull ) - -{- -The liberate-case transformation -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This module walks over @Core@, and looks for @case@ on free variables. -The criterion is: - if there is case on a free on the route to the recursive call, - then the recursive call is replaced with an unfolding. - -Example - - f = \ t -> case v of - V a b -> a : f t - -=> the inner f is replaced. - - f = \ t -> case v of - V a b -> a : (letrec - f = \ t -> case v of - V a b -> a : f t - in f) t -(note the NEED for shadowing) - -=> Simplify - - f = \ t -> case v of - V a b -> a : (letrec - f = \ t -> a : f t - in f t) - -Better code, because 'a' is free inside the inner letrec, rather -than needing projection from v. - -Note that this deals with *free variables*. SpecConstr deals with -*arguments* that are of known form. E.g. - - last [] = error - last (x:[]) = x - last (x:xs) = last xs - - -Note [Scrutinee with cast] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this: - f = \ t -> case (v `cast` co) of - V a b -> a : f t - -Exactly the same optimisation (unrolling one call to f) will work here, -despite the cast. See mk_alt_env in the Case branch of libCase. - - -To think about (Apr 94) -~~~~~~~~~~~~~~ -Main worry: duplicating code excessively. At the moment we duplicate -the entire binding group once at each recursive call. But there may -be a group of recursive calls which share a common set of evaluated -free variables, in which case the duplication is a plain waste. - -Another thing we could consider adding is some unfold-threshold thing, -so that we'll only duplicate if the size of the group rhss isn't too -big. - -Data types -~~~~~~~~~~ -The ``level'' of a binder tells how many -recursive defns lexically enclose the binding -A recursive defn "encloses" its RHS, not its -scope. For example: -\begin{verbatim} - letrec f = let g = ... in ... - in - let h = ... - in ... -\end{verbatim} -Here, the level of @f@ is zero, the level of @g@ is one, -and the level of @h@ is zero (NB not one). - - -************************************************************************ -* * - Top-level code -* * -************************************************************************ --} - -liberateCase :: DynFlags -> CoreProgram -> CoreProgram -liberateCase dflags binds = do_prog (initEnv dflags) binds - where - do_prog _ [] = [] - do_prog env (bind:binds) = bind' : do_prog env' binds - where - (env', bind') = libCaseBind env bind - -{- -************************************************************************ -* * - Main payload -* * -************************************************************************ - -Bindings -~~~~~~~~ --} - -libCaseBind :: LibCaseEnv -> CoreBind -> (LibCaseEnv, CoreBind) - -libCaseBind env (NonRec binder rhs) - = (addBinders env [binder], NonRec binder (libCase env rhs)) - -libCaseBind env (Rec pairs) - = (env_body, Rec pairs') - where - binders = map fst pairs - - env_body = addBinders env binders - - pairs' = [(binder, libCase env_rhs rhs) | (binder,rhs) <- pairs] - - -- We extend the rec-env by binding each Id to its rhs, first - -- processing the rhs with an *un-extended* environment, so - -- that the same process doesn't occur for ever! - env_rhs | is_dupable_bind = addRecBinds env dup_pairs - | otherwise = env - - dup_pairs = [ (localiseId binder, libCase env_body rhs) - | (binder, rhs) <- pairs ] - -- localiseID : see Note [Need to localiseId in libCaseBind] - - is_dupable_bind = small_enough && all ok_pair pairs - - -- Size: we are going to duplicate dup_pairs; to find their - -- size, build a fake binding (let { dup_pairs } in (), - -- and find the size of that - -- See Note [Small enough] - small_enough = case bombOutSize env of - Nothing -> True -- Infinity - Just size -> couldBeSmallEnoughToInline (lc_dflags env) size $ - Let (Rec dup_pairs) (Var unitDataConId) - - ok_pair (id,_) - = idArity id > 0 -- Note [Only functions!] - && not (isBottomingId id) -- Note [Not bottoming ids] - -{- Note [Not bottoming Ids] -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Do not specialise error-functions (this is unusual, but I once saw it, -(actually in Data.Typable.Internal) - -Note [Only functions!] -~~~~~~~~~~~~~~~~~~~~~~ -Consider the following code - - f = g (case v of V a b -> a : t f) - -where g is expensive. If we aren't careful, liberate case will turn this into - - f = g (case v of - V a b -> a : t (letrec f = g (case v of V a b -> a : f t) - in f) - ) - -Yikes! We evaluate g twice. This leads to a O(2^n) explosion -if g calls back to the same code recursively. - -Solution: make sure that we only do the liberate-case thing on *functions* - -Note [Small enough] -~~~~~~~~~~~~~~~~~~~ -Consider - \fv. letrec - f = \x. BIG...(case fv of { (a,b) -> ...g.. })... - g = \y. SMALL...f... - -Then we *can* in principle do liberate-case on 'g' (small RHS) but not -for 'f' (too big). But doing so is not profitable, because duplicating -'g' at its call site in 'f' doesn't get rid of any cases. So we just -ask for the whole group to be small enough. - -Note [Need to localiseId in libCaseBind] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The call to localiseId is needed for two subtle reasons -(a) Reset the export flags on the binders so - that we don't get name clashes on exported things if the - local binding floats out to top level. This is most unlikely - to happen, since the whole point concerns free variables. - But resetting the export flag is right regardless. - -(b) Make the name an Internal one. External Names should never be - nested; if it were floated to the top level, we'd get a name - clash at code generation time. - -Expressions -~~~~~~~~~~~ --} - -libCase :: LibCaseEnv - -> CoreExpr - -> CoreExpr - -libCase env (Var v) = libCaseApp env v [] -libCase _ (Lit lit) = Lit lit -libCase _ (Type ty) = Type ty -libCase _ (Coercion co) = Coercion co -libCase env e@(App {}) | let (fun, args) = collectArgs e - , Var v <- fun - = libCaseApp env v args -libCase env (App fun arg) = App (libCase env fun) (libCase env arg) -libCase env (Tick tickish body) = Tick tickish (libCase env body) -libCase env (Cast e co) = Cast (libCase env e) co - -libCase env (Lam binder body) - = Lam binder (libCase (addBinders env [binder]) body) - -libCase env (Let bind body) - = Let bind' (libCase env_body body) - where - (env_body, bind') = libCaseBind env bind - -libCase env (Case scrut bndr ty alts) - = Case (libCase env scrut) bndr ty (map (libCaseAlt env_alts) alts) - where - env_alts = addBinders (mk_alt_env scrut) [bndr] - mk_alt_env (Var scrut_var) = addScrutedVar env scrut_var - mk_alt_env (Cast scrut _) = mk_alt_env scrut -- Note [Scrutinee with cast] - mk_alt_env _ = env - -libCaseAlt :: LibCaseEnv -> (AltCon, [CoreBndr], CoreExpr) - -> (AltCon, [CoreBndr], CoreExpr) -libCaseAlt env (con,args,rhs) = (con, args, libCase (addBinders env args) rhs) - -{- -Ids -~~~ - -To unfold, we can't just wrap the id itself in its binding if it's a join point: - - jump j a b c => (joinrec j x y z = ... in jump j) a b c -- wrong!!! - -Every jump must provide all arguments, so we have to be careful to wrap the -whole jump instead: - - jump j a b c => joinrec j x y z = ... in jump j a b c -- right - --} - -libCaseApp :: LibCaseEnv -> Id -> [CoreExpr] -> CoreExpr -libCaseApp env v args - | Just the_bind <- lookupRecId env v -- It's a use of a recursive thing - , notNull free_scruts -- with free vars scrutinised in RHS - = Let the_bind expr' - - | otherwise - = expr' - - where - rec_id_level = lookupLevel env v - free_scruts = freeScruts env rec_id_level - expr' = mkApps (Var v) (map (libCase env) args) - -freeScruts :: LibCaseEnv - -> LibCaseLevel -- Level of the recursive Id - -> [Id] -- Ids that are scrutinised between the binding - -- of the recursive Id and here -freeScruts env rec_bind_lvl - = [v | (v, scrut_bind_lvl, scrut_at_lvl) <- lc_scruts env - , scrut_bind_lvl <= rec_bind_lvl - , scrut_at_lvl > rec_bind_lvl] - -- Note [When to specialise] - -- Note [Avoiding fruitless liberate-case] - -{- -Note [When to specialise] -~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - f = \x. letrec g = \y. case x of - True -> ... (f a) ... - False -> ... (g b) ... - -We get the following levels - f 0 - x 1 - g 1 - y 2 - -Then 'x' is being scrutinised at a deeper level than its binding, so -it's added to lc_sruts: [(x,1)] - -We do *not* want to specialise the call to 'f', because 'x' is not free -in 'f'. So here the bind-level of 'x' (=1) is not <= the bind-level of 'f' (=0). - -We *do* want to specialise the call to 'g', because 'x' is free in g. -Here the bind-level of 'x' (=1) is <= the bind-level of 'g' (=1). - -Note [Avoiding fruitless liberate-case] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider also: - f = \x. case top_lvl_thing of - I# _ -> let g = \y. ... g ... - in ... - -Here, top_lvl_thing is scrutinised at a level (1) deeper than its -binding site (0). Nevertheless, we do NOT want to specialise the call -to 'g' because all the structure in its free variables is already -visible at the definition site for g. Hence, when considering specialising -an occurrence of 'g', we want to check that there's a scruted-var v st - - a) v's binding site is *outside* g - b) v's scrutinisation site is *inside* g - - -************************************************************************ -* * - Utility functions -* * -************************************************************************ --} - -addBinders :: LibCaseEnv -> [CoreBndr] -> LibCaseEnv -addBinders env@(LibCaseEnv { lc_lvl = lvl, lc_lvl_env = lvl_env }) binders - = env { lc_lvl_env = lvl_env' } - where - lvl_env' = extendVarEnvList lvl_env (binders `zip` repeat lvl) - -addRecBinds :: LibCaseEnv -> [(Id,CoreExpr)] -> LibCaseEnv -addRecBinds env@(LibCaseEnv {lc_lvl = lvl, lc_lvl_env = lvl_env, - lc_rec_env = rec_env}) pairs - = env { lc_lvl = lvl', lc_lvl_env = lvl_env', lc_rec_env = rec_env' } - where - lvl' = lvl + 1 - lvl_env' = extendVarEnvList lvl_env [(binder,lvl) | (binder,_) <- pairs] - rec_env' = extendVarEnvList rec_env [(binder, Rec pairs) | (binder,_) <- pairs] - -addScrutedVar :: LibCaseEnv - -> Id -- This Id is being scrutinised by a case expression - -> LibCaseEnv - -addScrutedVar env@(LibCaseEnv { lc_lvl = lvl, lc_lvl_env = lvl_env, - lc_scruts = scruts }) scrut_var - | bind_lvl < lvl - = env { lc_scruts = scruts' } - -- Add to scruts iff the scrut_var is being scrutinised at - -- a deeper level than its defn - - | otherwise = env - where - scruts' = (scrut_var, bind_lvl, lvl) : scruts - bind_lvl = case lookupVarEnv lvl_env scrut_var of - Just lvl -> lvl - Nothing -> topLevel - -lookupRecId :: LibCaseEnv -> Id -> Maybe CoreBind -lookupRecId env id = lookupVarEnv (lc_rec_env env) id - -lookupLevel :: LibCaseEnv -> Id -> LibCaseLevel -lookupLevel env id - = case lookupVarEnv (lc_lvl_env env) id of - Just lvl -> lvl - Nothing -> topLevel - -{- -************************************************************************ -* * - The environment -* * -************************************************************************ --} - -type LibCaseLevel = Int - -topLevel :: LibCaseLevel -topLevel = 0 - -data LibCaseEnv - = LibCaseEnv { - lc_dflags :: DynFlags, - - lc_lvl :: LibCaseLevel, -- Current level - -- The level is incremented when (and only when) going - -- inside the RHS of a (sufficiently small) recursive - -- function. - - lc_lvl_env :: IdEnv LibCaseLevel, - -- Binds all non-top-level in-scope Ids (top-level and - -- imported things have a level of zero) - - lc_rec_env :: IdEnv CoreBind, - -- Binds *only* recursively defined ids, to their own - -- binding group, and *only* in their own RHSs - - lc_scruts :: [(Id, LibCaseLevel, LibCaseLevel)] - -- Each of these Ids was scrutinised by an enclosing - -- case expression, at a level deeper than its binding - -- level. - -- - -- The first LibCaseLevel is the *binding level* of - -- the scrutinised Id, - -- The second is the level *at which it was scrutinised*. - -- (see Note [Avoiding fruitless liberate-case]) - -- The former is a bit redundant, since you could always - -- look it up in lc_lvl_env, but it's just cached here - -- - -- The order is insignificant; it's a bag really - -- - -- There's one element per scrutinisation; - -- in principle the same Id may appear multiple times, - -- although that'd be unusual: - -- case x of { (a,b) -> ....(case x of ...) .. } - } - -initEnv :: DynFlags -> LibCaseEnv -initEnv dflags - = LibCaseEnv { lc_dflags = dflags, - lc_lvl = 0, - lc_lvl_env = emptyVarEnv, - lc_rec_env = emptyVarEnv, - lc_scruts = [] } - --- Bomb-out size for deciding if --- potential liberatees are too big. --- (passed in from cmd-line args) -bombOutSize :: LibCaseEnv -> Maybe Int -bombOutSize = liberateCaseThreshold . lc_dflags diff --git a/compiler/simplCore/OccurAnal.hs b/compiler/simplCore/OccurAnal.hs deleted file mode 100644 index aa8c5730e9..0000000000 --- a/compiler/simplCore/OccurAnal.hs +++ /dev/null @@ -1,2898 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - -************************************************************************ -* * -\section[OccurAnal]{Occurrence analysis pass} -* * -************************************************************************ - -The occurrence analyser re-typechecks a core expression, returning a new -core expression with (hopefully) improved usage information. --} - -{-# LANGUAGE CPP, BangPatterns, MultiWayIf, ViewPatterns #-} - -{-# OPTIONS_GHC -Wno-incomplete-record-updates #-} - -module OccurAnal ( - occurAnalysePgm, occurAnalyseExpr, occurAnalyseExpr_NoBinderSwap - ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Core -import GHC.Core.FVs -import GHC.Core.Utils ( exprIsTrivial, isDefaultAlt, isExpandableApp, - stripTicksTopE, mkTicks ) -import GHC.Core.Arity ( joinRhsArity ) -import Id -import IdInfo -import Name( localiseName ) -import BasicTypes -import Module( Module ) -import GHC.Core.Coercion -import GHC.Core.Type - -import VarSet -import VarEnv -import Var -import Demand ( argOneShots, argsOneShots ) -import Digraph ( SCC(..), Node(..) - , stronglyConnCompFromEdgedVerticesUniq - , stronglyConnCompFromEdgedVerticesUniqR ) -import Unique -import UniqFM -import UniqSet -import Util -import Outputable -import Data.List -import Control.Arrow ( second ) - -{- -************************************************************************ -* * - occurAnalysePgm, occurAnalyseExpr, occurAnalyseExpr_NoBinderSwap -* * -************************************************************************ - -Here's the externally-callable interface: --} - -occurAnalysePgm :: Module -- Used only in debug output - -> (Id -> Bool) -- Active unfoldings - -> (Activation -> Bool) -- Active rules - -> [CoreRule] - -> CoreProgram -> CoreProgram -occurAnalysePgm this_mod active_unf active_rule imp_rules binds - | isEmptyDetails final_usage - = occ_anald_binds - - | otherwise -- See Note [Glomming] - = WARN( True, hang (text "Glomming in" <+> ppr this_mod <> colon) - 2 (ppr final_usage ) ) - occ_anald_glommed_binds - where - init_env = initOccEnv { occ_rule_act = active_rule - , occ_unf_act = active_unf } - - (final_usage, occ_anald_binds) = go init_env binds - (_, occ_anald_glommed_binds) = occAnalRecBind init_env TopLevel - imp_rule_edges - (flattenBinds binds) - initial_uds - -- It's crucial to re-analyse the glommed-together bindings - -- so that we establish the right loop breakers. Otherwise - -- we can easily create an infinite loop (#9583 is an example) - -- - -- Also crucial to re-analyse the /original/ bindings - -- in case the first pass accidentally discarded as dead code - -- a binding that was actually needed (albeit before its - -- definition site). #17724 threw this up. - - initial_uds = addManyOccsSet emptyDetails - (rulesFreeVars imp_rules) - -- The RULES declarations keep things alive! - - -- Note [Preventing loops due to imported functions rules] - imp_rule_edges = foldr (plusVarEnv_C unionVarSet) emptyVarEnv - [ mapVarEnv (const maps_to) $ - getUniqSet (exprFreeIds arg `delVarSetList` ru_bndrs imp_rule) - | imp_rule <- imp_rules - , not (isBuiltinRule imp_rule) -- See Note [Plugin rules] - , let maps_to = exprFreeIds (ru_rhs imp_rule) - `delVarSetList` ru_bndrs imp_rule - , arg <- ru_args imp_rule ] - - go :: OccEnv -> [CoreBind] -> (UsageDetails, [CoreBind]) - go _ [] - = (initial_uds, []) - go env (bind:binds) - = (final_usage, bind' ++ binds') - where - (bs_usage, binds') = go env binds - (final_usage, bind') = occAnalBind env TopLevel imp_rule_edges bind - bs_usage - -occurAnalyseExpr :: CoreExpr -> CoreExpr - -- Do occurrence analysis, and discard occurrence info returned -occurAnalyseExpr = occurAnalyseExpr' True -- do binder swap - -occurAnalyseExpr_NoBinderSwap :: CoreExpr -> CoreExpr -occurAnalyseExpr_NoBinderSwap = occurAnalyseExpr' False -- do not do binder swap - -occurAnalyseExpr' :: Bool -> CoreExpr -> CoreExpr -occurAnalyseExpr' enable_binder_swap expr - = snd (occAnal env expr) - where - env = initOccEnv { occ_binder_swap = enable_binder_swap } - -{- Note [Plugin rules] -~~~~~~~~~~~~~~~~~~~~~~ -Conal Elliott (#11651) built a GHC plugin that added some -BuiltinRules (for imported Ids) to the mg_rules field of ModGuts, to -do some domain-specific transformations that could not be expressed -with an ordinary pattern-matching CoreRule. But then we can't extract -the dependencies (in imp_rule_edges) from ru_rhs etc, because a -BuiltinRule doesn't have any of that stuff. - -So we simply assume that BuiltinRules have no dependencies, and filter -them out from the imp_rule_edges comprehension. --} - -{- -************************************************************************ -* * - Bindings -* * -************************************************************************ - -Note [Recursive bindings: the grand plan] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we come across a binding group - Rec { x1 = r1; ...; xn = rn } -we treat it like this (occAnalRecBind): - -1. Occurrence-analyse each right hand side, and build a - "Details" for each binding to capture the results. - - Wrap the details in a Node (details, node-id, dep-node-ids), - where node-id is just the unique of the binder, and - dep-node-ids lists all binders on which this binding depends. - We'll call these the "scope edges". - See Note [Forming the Rec groups]. - - All this is done by makeNode. - -2. Do SCC-analysis on these Nodes. Each SCC will become a new Rec or - NonRec. The key property is that every free variable of a binding - is accounted for by the scope edges, so that when we are done - everything is still in scope. - -3. For each Cyclic SCC of the scope-edge SCC-analysis in (2), we - identify suitable loop-breakers to ensure that inlining terminates. - This is done by occAnalRec. - -4. To do so we form a new set of Nodes, with the same details, but - different edges, the "loop-breaker nodes". The loop-breaker nodes - have both more and fewer dependencies than the scope edges - (see Note [Choosing loop breakers]) - - More edges: if f calls g, and g has an active rule that mentions h - then we add an edge from f -> h - - Fewer edges: we only include dependencies on active rules, on rule - RHSs (not LHSs) and if there is an INLINE pragma only - on the stable unfolding (and vice versa). The scope - edges must be much more inclusive. - -5. The "weak fvs" of a node are, by definition: - the scope fvs - the loop-breaker fvs - See Note [Weak loop breakers], and the nd_weak field of Details - -6. Having formed the loop-breaker nodes - -Note [Dead code] -~~~~~~~~~~~~~~~~ -Dropping dead code for a cyclic Strongly Connected Component is done -in a very simple way: - - the entire SCC is dropped if none of its binders are mentioned - in the body; otherwise the whole thing is kept. - -The key observation is that dead code elimination happens after -dependency analysis: so 'occAnalBind' processes SCCs instead of the -original term's binding groups. - -Thus 'occAnalBind' does indeed drop 'f' in an example like - - letrec f = ...g... - g = ...(...g...)... - in - ...g... - -when 'g' no longer uses 'f' at all (eg 'f' does not occur in a RULE in -'g'). 'occAnalBind' first consumes 'CyclicSCC g' and then it consumes -'AcyclicSCC f', where 'body_usage' won't contain 'f'. - ------------------------------------------------------------- -Note [Forming Rec groups] -~~~~~~~~~~~~~~~~~~~~~~~~~ -We put bindings {f = ef; g = eg } in a Rec group if "f uses g" -and "g uses f", no matter how indirectly. We do a SCC analysis -with an edge f -> g if "f uses g". - -More precisely, "f uses g" iff g should be in scope wherever f is. -That is, g is free in: - a) the rhs 'ef' - b) or the RHS of a rule for f (Note [Rules are extra RHSs]) - c) or the LHS or a rule for f (Note [Rule dependency info]) - -These conditions apply regardless of the activation of the RULE (eg it might be -inactive in this phase but become active later). Once a Rec is broken up -it can never be put back together, so we must be conservative. - -The principle is that, regardless of rule firings, every variable is -always in scope. - - * Note [Rules are extra RHSs] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ - A RULE for 'f' is like an extra RHS for 'f'. That way the "parent" - keeps the specialised "children" alive. If the parent dies - (because it isn't referenced any more), then the children will die - too (unless they are already referenced directly). - - To that end, we build a Rec group for each cyclic strongly - connected component, - *treating f's rules as extra RHSs for 'f'*. - More concretely, the SCC analysis runs on a graph with an edge - from f -> g iff g is mentioned in - (a) f's rhs - (b) f's RULES - These are rec_edges. - - Under (b) we include variables free in *either* LHS *or* RHS of - the rule. The former might seems silly, but see Note [Rule - dependency info]. So in Example [eftInt], eftInt and eftIntFB - will be put in the same Rec, even though their 'main' RHSs are - both non-recursive. - - * Note [Rule dependency info] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ - The VarSet in a RuleInfo is used for dependency analysis in the - occurrence analyser. We must track free vars in *both* lhs and rhs. - Hence use of idRuleVars, rather than idRuleRhsVars in occAnalBind. - Why both? Consider - x = y - RULE f x = v+4 - Then if we substitute y for x, we'd better do so in the - rule's LHS too, so we'd better ensure the RULE appears to mention 'x' - as well as 'v' - - * Note [Rules are visible in their own rec group] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - We want the rules for 'f' to be visible in f's right-hand side. - And we'd like them to be visible in other functions in f's Rec - group. E.g. in Note [Specialisation rules] we want f' rule - to be visible in both f's RHS, and fs's RHS. - - This means that we must simplify the RULEs first, before looking - at any of the definitions. This is done by Simplify.simplRecBind, - when it calls addLetIdInfo. - ------------------------------------------------------------- -Note [Choosing loop breakers] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Loop breaking is surprisingly subtle. First read the section 4 of -"Secrets of the GHC inliner". This describes our basic plan. -We avoid infinite inlinings by choosing loop breakers, and -ensuring that a loop breaker cuts each loop. - -See also Note [Inlining and hs-boot files] in GHC.Core.ToIface, which -deals with a closely related source of infinite loops. - -Fundamentally, we do SCC analysis on a graph. For each recursive -group we choose a loop breaker, delete all edges to that node, -re-analyse the SCC, and iterate. - -But what is the graph? NOT the same graph as was used for Note -[Forming Rec groups]! In particular, a RULE is like an equation for -'f' that is *always* inlined if it is applicable. We do *not* disable -rules for loop-breakers. It's up to whoever makes the rules to make -sure that the rules themselves always terminate. See Note [Rules for -recursive functions] in Simplify.hs - -Hence, if - f's RHS (or its INLINE template if it has one) mentions g, and - g has a RULE that mentions h, and - h has a RULE that mentions f - -then we *must* choose f to be a loop breaker. Example: see Note -[Specialisation rules]. - -In general, take the free variables of f's RHS, and augment it with -all the variables reachable by RULES from those starting points. That -is the whole reason for computing rule_fv_env in occAnalBind. (Of -course we only consider free vars that are also binders in this Rec -group.) See also Note [Finding rule RHS free vars] - -Note that when we compute this rule_fv_env, we only consider variables -free in the *RHS* of the rule, in contrast to the way we build the -Rec group in the first place (Note [Rule dependency info]) - -Note that if 'g' has RHS that mentions 'w', we should add w to -g's loop-breaker edges. More concretely there is an edge from f -> g -iff - (a) g is mentioned in f's RHS `xor` f's INLINE rhs - (see Note [Inline rules]) - (b) or h is mentioned in f's RHS, and - g appears in the RHS of an active RULE of h - or a transitive sequence of active rules starting with h - -Why "active rules"? See Note [Finding rule RHS free vars] - -Note that in Example [eftInt], *neither* eftInt *nor* eftIntFB is -chosen as a loop breaker, because their RHSs don't mention each other. -And indeed both can be inlined safely. - -Note again that the edges of the graph we use for computing loop breakers -are not the same as the edges we use for computing the Rec blocks. -That's why we compute - -- rec_edges for the Rec block analysis -- loop_breaker_nodes for the loop breaker analysis - - * Note [Finding rule RHS free vars] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Consider this real example from Data Parallel Haskell - tagZero :: Array Int -> Array Tag - {-# INLINE [1] tagZeroes #-} - tagZero xs = pmap (\x -> fromBool (x==0)) xs - - {-# RULES "tagZero" [~1] forall xs n. - pmap fromBool <blah blah> = tagZero xs #-} - So tagZero's RHS mentions pmap, and pmap's RULE mentions tagZero. - However, tagZero can only be inlined in phase 1 and later, while - the RULE is only active *before* phase 1. So there's no problem. - - To make this work, we look for the RHS free vars only for - *active* rules. That's the reason for the occ_rule_act field - of the OccEnv. - - * Note [Weak loop breakers] - ~~~~~~~~~~~~~~~~~~~~~~~~~ - There is a last nasty wrinkle. Suppose we have - - Rec { f = f_rhs - RULE f [] = g - - h = h_rhs - g = h - ...more... - } - - Remember that we simplify the RULES before any RHS (see Note - [Rules are visible in their own rec group] above). - - So we must *not* postInlineUnconditionally 'g', even though - its RHS turns out to be trivial. (I'm assuming that 'g' is - not chosen as a loop breaker.) Why not? Because then we - drop the binding for 'g', which leaves it out of scope in the - RULE! - - Here's a somewhat different example of the same thing - Rec { g = h - ; h = ...f... - ; f = f_rhs - RULE f [] = g } - Here the RULE is "below" g, but we *still* can't postInlineUnconditionally - g, because the RULE for f is active throughout. So the RHS of h - might rewrite to h = ...g... - So g must remain in scope in the output program! - - We "solve" this by: - - Make g a "weak" loop breaker (OccInfo = IAmLoopBreaker True) - iff g is a "missing free variable" of the Rec group - - A "missing free variable" x is one that is mentioned in an RHS or - INLINE or RULE of a binding in the Rec group, but where the - dependency on x may not show up in the loop_breaker_nodes (see - note [Choosing loop breakers} above). - - A normal "strong" loop breaker has IAmLoopBreaker False. So - - Inline postInlineUnconditionally - strong IAmLoopBreaker False no no - weak IAmLoopBreaker True yes no - other yes yes - - The **sole** reason for this kind of loop breaker is so that - postInlineUnconditionally does not fire. Ugh. (Typically it'll - inline via the usual callSiteInline stuff, so it'll be dead in the - next pass, so the main Ugh is the tiresome complication.) - -Note [Rules for imported functions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this - f = /\a. B.g a - RULE B.g Int = 1 + f Int -Note that - * The RULE is for an imported function. - * f is non-recursive -Now we -can get - f Int --> B.g Int Inlining f - --> 1 + f Int Firing RULE -and so the simplifier goes into an infinite loop. This -would not happen if the RULE was for a local function, -because we keep track of dependencies through rules. But -that is pretty much impossible to do for imported Ids. Suppose -f's definition had been - f = /\a. C.h a -where (by some long and devious process), C.h eventually inlines to -B.g. We could only spot such loops by exhaustively following -unfoldings of C.h etc, in case we reach B.g, and hence (via the RULE) -f. - -Note that RULES for imported functions are important in practice; they -occur a lot in the libraries. - -We regard this potential infinite loop as a *programmer* error. -It's up the programmer not to write silly rules like - RULE f x = f x -and the example above is just a more complicated version. - -Note [Preventing loops due to imported functions rules] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider: - import GHC.Base (foldr) - - {-# RULES "filterList" forall p. foldr (filterFB (:) p) [] = filter p #-} - filter p xs = build (\c n -> foldr (filterFB c p) n xs) - filterFB c p = ... - - f = filter p xs - -Note that filter is not a loop-breaker, so what happens is: - f = filter p xs - = {inline} build (\c n -> foldr (filterFB c p) n xs) - = {inline} foldr (filterFB (:) p) [] xs - = {RULE} filter p xs - -We are in an infinite loop. - -A more elaborate example (that I actually saw in practice when I went to -mark GHC.List.filter as INLINABLE) is as follows. Say I have this module: - {-# LANGUAGE RankNTypes #-} - module GHCList where - - import Prelude hiding (filter) - import GHC.Base (build) - - {-# INLINABLE filter #-} - filter :: (a -> Bool) -> [a] -> [a] - filter p [] = [] - filter p (x:xs) = if p x then x : filter p xs else filter p xs - - {-# NOINLINE [0] filterFB #-} - filterFB :: (a -> b -> b) -> (a -> Bool) -> a -> b -> b - filterFB c p x r | p x = x `c` r - | otherwise = r - - {-# RULES - "filter" [~1] forall p xs. filter p xs = build (\c n -> foldr - (filterFB c p) n xs) - "filterList" [1] forall p. foldr (filterFB (:) p) [] = filter p - #-} - -Then (because RULES are applied inside INLINABLE unfoldings, but inlinings -are not), the unfolding given to "filter" in the interface file will be: - filter p [] = [] - filter p (x:xs) = if p x then x : build (\c n -> foldr (filterFB c p) n xs) - else build (\c n -> foldr (filterFB c p) n xs - -Note that because this unfolding does not mention "filter", filter is not -marked as a strong loop breaker. Therefore at a use site in another module: - filter p xs - = {inline} - case xs of [] -> [] - (x:xs) -> if p x then x : build (\c n -> foldr (filterFB c p) n xs) - else build (\c n -> foldr (filterFB c p) n xs) - - build (\c n -> foldr (filterFB c p) n xs) - = {inline} foldr (filterFB (:) p) [] xs - = {RULE} filter p xs - -And we are in an infinite loop again, except that this time the loop is producing an -infinitely large *term* (an unrolling of filter) and so the simplifier finally -dies with "ticks exhausted" - -Because of this problem, we make a small change in the occurrence analyser -designed to mark functions like "filter" as strong loop breakers on the basis that: - 1. The RHS of filter mentions the local function "filterFB" - 2. We have a rule which mentions "filterFB" on the LHS and "filter" on the RHS - -So for each RULE for an *imported* function we are going to add -dependency edges between the *local* FVS of the rule LHS and the -*local* FVS of the rule RHS. We don't do anything special for RULES on -local functions because the standard occurrence analysis stuff is -pretty good at getting loop-breakerness correct there. - -It is important to note that even with this extra hack we aren't always going to get -things right. For example, it might be that the rule LHS mentions an imported Id, -and another module has a RULE that can rewrite that imported Id to one of our local -Ids. - -Note [Specialising imported functions] (referred to from Specialise) -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -BUT for *automatically-generated* rules, the programmer can't be -responsible for the "programmer error" in Note [Rules for imported -functions]. In particular, consider specialising a recursive function -defined in another module. If we specialise a recursive function B.g, -we get - g_spec = .....(B.g Int)..... - RULE B.g Int = g_spec -Here, g_spec doesn't look recursive, but when the rule fires, it -becomes so. And if B.g was mutually recursive, the loop might -not be as obvious as it is here. - -To avoid this, - * When specialising a function that is a loop breaker, - give a NOINLINE pragma to the specialised function - -Note [Glomming] -~~~~~~~~~~~~~~~ -RULES for imported Ids can make something at the top refer to something at the bottom: - f = \x -> B.g (q x) - h = \y -> 3 - - RULE: B.g (q x) = h x - -Applying this rule makes f refer to h, although f doesn't appear to -depend on h. (And, as in Note [Rules for imported functions], the -dependency might be more indirect. For example, f might mention C.t -rather than B.g, where C.t eventually inlines to B.g.) - -NOTICE that this cannot happen for rules whose head is a -locally-defined function, because we accurately track dependencies -through RULES. It only happens for rules whose head is an imported -function (B.g in the example above). - -Solution: - - When simplifying, bring all top level identifiers into - scope at the start, ignoring the Rec/NonRec structure, so - that when 'h' pops up in f's rhs, we find it in the in-scope set - (as the simplifier generally expects). This happens in simplTopBinds. - - - In the occurrence analyser, if there are any out-of-scope - occurrences that pop out of the top, which will happen after - firing the rule: f = \x -> h x - h = \y -> 3 - then just glom all the bindings into a single Rec, so that - the *next* iteration of the occurrence analyser will sort - them all out. This part happens in occurAnalysePgm. - ------------------------------------------------------------- -Note [Inline rules] -~~~~~~~~~~~~~~~~~~~ -None of the above stuff about RULES applies to Inline Rules, -stored in a CoreUnfolding. The unfolding, if any, is simplified -at the same time as the regular RHS of the function (ie *not* like -Note [Rules are visible in their own rec group]), so it should be -treated *exactly* like an extra RHS. - -Or, rather, when computing loop-breaker edges, - * If f has an INLINE pragma, and it is active, we treat the - INLINE rhs as f's rhs - * If it's inactive, we treat f as having no rhs - * If it has no INLINE pragma, we look at f's actual rhs - - -There is a danger that we'll be sub-optimal if we see this - f = ...f... - [INLINE f = ..no f...] -where f is recursive, but the INLINE is not. This can just about -happen with a sufficiently odd set of rules; eg - - foo :: Int -> Int - {-# INLINE [1] foo #-} - foo x = x+1 - - bar :: Int -> Int - {-# INLINE [1] bar #-} - bar x = foo x + 1 - - {-# RULES "foo" [~1] forall x. foo x = bar x #-} - -Here the RULE makes bar recursive; but it's INLINE pragma remains -non-recursive. It's tempting to then say that 'bar' should not be -a loop breaker, but an attempt to do so goes wrong in two ways: - a) We may get - $df = ...$cfoo... - $cfoo = ...$df.... - [INLINE $cfoo = ...no-$df...] - But we want $cfoo to depend on $df explicitly so that we - put the bindings in the right order to inline $df in $cfoo - and perhaps break the loop altogether. (Maybe this - b) - - -Example [eftInt] -~~~~~~~~~~~~~~~ -Example (from GHC.Enum): - - eftInt :: Int# -> Int# -> [Int] - eftInt x y = ...(non-recursive)... - - {-# INLINE [0] eftIntFB #-} - eftIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> r - eftIntFB c n x y = ...(non-recursive)... - - {-# RULES - "eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y) - "eftIntList" [1] eftIntFB (:) [] = eftInt - #-} - -Note [Specialisation rules] -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this group, which is typical of what SpecConstr builds: - - fs a = ....f (C a).... - f x = ....f (C a).... - {-# RULE f (C a) = fs a #-} - -So 'f' and 'fs' are in the same Rec group (since f refers to fs via its RULE). - -But watch out! If 'fs' is not chosen as a loop breaker, we may get an infinite loop: - - the RULE is applied in f's RHS (see Note [Self-recursive rules] in Simplify - - fs is inlined (say it's small) - - now there's another opportunity to apply the RULE - -This showed up when compiling Control.Concurrent.Chan.getChanContents. - ------------------------------------------------------------- -Note [Finding join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's the occurrence analyser's job to find bindings that we can turn into join -points, but it doesn't perform that transformation right away. Rather, it marks -the eligible bindings as part of their occurrence data, leaving it to the -simplifier (or to simpleOptPgm) to actually change the binder's 'IdDetails'. -The simplifier then eta-expands the RHS if needed and then updates the -occurrence sites. Dividing the work this way means that the occurrence analyser -still only takes one pass, yet one can always tell the difference between a -function call and a jump by looking at the occurrence (because the same pass -changes the 'IdDetails' and propagates the binders to their occurrence sites). - -To track potential join points, we use the 'occ_tail' field of OccInfo. A value -of `AlwaysTailCalled n` indicates that every occurrence of the variable is a -tail call with `n` arguments (counting both value and type arguments). Otherwise -'occ_tail' will be 'NoTailCallInfo'. The tail call info flows bottom-up with the -rest of 'OccInfo' until it goes on the binder. - -Note [Rules and join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Things get fiddly with rules. Suppose we have: - - let j :: Int -> Int - j y = 2 * y - k :: Int -> Int -> Int - {-# RULES "SPEC k 0" k 0 = j #-} - k x y = x + 2 * y - in ... - -Now suppose that both j and k appear only as saturated tail calls in the body. -Thus we would like to make them both join points. The rule complicates matters, -though, as its RHS has an unapplied occurrence of j. *However*, if we were to -eta-expand the rule, all would be well: - - {-# RULES "SPEC k 0" forall a. k 0 a = j a #-} - -So conceivably we could notice that a potential join point would have an -"undersaturated" rule and account for it. This would mean we could make -something that's been specialised a join point, for instance. But local bindings -are rarely specialised, and being overly cautious about rules only -costs us anything when, for some `j`: - - * Before specialisation, `j` has non-tail calls, so it can't be a join point. - * During specialisation, `j` gets specialised and thus acquires rules. - * Sometime afterward, the non-tail calls to `j` disappear (as dead code, say), - and so now `j` *could* become a join point. - -This appears to be very rare in practice. TODO Perhaps we should gather -statistics to be sure. - ------------------------------------------------------------- -Note [Adjusting right-hand sides] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -There's a bit of a dance we need to do after analysing a lambda expression or -a right-hand side. In particular, we need to - - a) call 'markAllInsideLam' *unless* the binding is for a thunk, a one-shot - lambda, or a non-recursive join point; and - b) call 'markAllNonTailCalled' *unless* the binding is for a join point. - -Some examples, with how the free occurrences in e (assumed not to be a value -lambda) get marked: - - inside lam non-tail-called - ------------------------------------------------------------ - let x = e No Yes - let f = \x -> e Yes Yes - let f = \x{OneShot} -> e No Yes - \x -> e Yes Yes - join j x = e No No - joinrec j x = e Yes No - -There are a few other caveats; most importantly, if we're marking a binding as -'AlwaysTailCalled', it's *going* to be a join point, so we treat it as one so -that the effect cascades properly. Consequently, at the time the RHS is -analysed, we won't know what adjustments to make; thus 'occAnalLamOrRhs' must -return the unadjusted 'UsageDetails', to be adjusted by 'adjustRhsUsage' once -join-point-hood has been decided. - -Thus the overall sequence taking place in 'occAnalNonRecBind' and -'occAnalRecBind' is as follows: - - 1. Call 'occAnalLamOrRhs' to find usage information for the RHS. - 2. Call 'tagNonRecBinder' or 'tagRecBinders', which decides whether to make - the binding a join point. - 3. Call 'adjustRhsUsage' accordingly. (Done as part of 'tagRecBinders' when - recursive.) - -(In the recursive case, this logic is spread between 'makeNode' and -'occAnalRec'.) --} - ------------------------------------------------------------------- --- occAnalBind ------------------------------------------------------------------- - -occAnalBind :: OccEnv -- The incoming OccEnv - -> TopLevelFlag - -> ImpRuleEdges - -> CoreBind - -> UsageDetails -- Usage details of scope - -> (UsageDetails, -- Of the whole let(rec) - [CoreBind]) - -occAnalBind env lvl top_env (NonRec binder rhs) body_usage - = occAnalNonRecBind env lvl top_env binder rhs body_usage -occAnalBind env lvl top_env (Rec pairs) body_usage - = occAnalRecBind env lvl top_env pairs body_usage - ------------------ -occAnalNonRecBind :: OccEnv -> TopLevelFlag -> ImpRuleEdges -> Var -> CoreExpr - -> UsageDetails -> (UsageDetails, [CoreBind]) -occAnalNonRecBind env lvl imp_rule_edges binder rhs body_usage - | isTyVar binder -- A type let; we don't gather usage info - = (body_usage, [NonRec binder rhs]) - - | not (binder `usedIn` body_usage) -- It's not mentioned - = (body_usage, []) - - | otherwise -- It's mentioned in the body - = (body_usage' `andUDs` rhs_usage', [NonRec tagged_binder rhs']) - where - (body_usage', tagged_binder) = tagNonRecBinder lvl body_usage binder - mb_join_arity = willBeJoinId_maybe tagged_binder - - (bndrs, body) = collectBinders rhs - - (rhs_usage1, bndrs', body') = occAnalNonRecRhs env tagged_binder bndrs body - rhs' = mkLams (markJoinOneShots mb_join_arity bndrs') body' - -- For a /non-recursive/ join point we can mark all - -- its join-lambda as one-shot; and it's a good idea to do so - - -- Unfoldings - -- See Note [Unfoldings and join points] - rhs_usage2 = case occAnalUnfolding env NonRecursive binder of - Just unf_usage -> rhs_usage1 `andUDs` unf_usage - Nothing -> rhs_usage1 - - -- Rules - -- See Note [Rules are extra RHSs] and Note [Rule dependency info] - rules_w_uds = occAnalRules env mb_join_arity NonRecursive tagged_binder - rule_uds = map (\(_, l, r) -> l `andUDs` r) rules_w_uds - rhs_usage3 = foldr andUDs rhs_usage2 rule_uds - rhs_usage4 = case lookupVarEnv imp_rule_edges binder of - Nothing -> rhs_usage3 - Just vs -> addManyOccsSet rhs_usage3 vs - -- See Note [Preventing loops due to imported functions rules] - - -- Final adjustment - rhs_usage' = adjustRhsUsage mb_join_arity NonRecursive bndrs' rhs_usage4 - ------------------ -occAnalRecBind :: OccEnv -> TopLevelFlag -> ImpRuleEdges -> [(Var,CoreExpr)] - -> UsageDetails -> (UsageDetails, [CoreBind]) -occAnalRecBind env lvl imp_rule_edges pairs body_usage - = foldr (occAnalRec env lvl) (body_usage, []) sccs - -- For a recursive group, we - -- * occ-analyse all the RHSs - -- * compute strongly-connected components - -- * feed those components to occAnalRec - -- See Note [Recursive bindings: the grand plan] - where - sccs :: [SCC Details] - sccs = {-# SCC "occAnalBind.scc" #-} - stronglyConnCompFromEdgedVerticesUniq nodes - - nodes :: [LetrecNode] - nodes = {-# SCC "occAnalBind.assoc" #-} - map (makeNode env imp_rule_edges bndr_set) pairs - - bndr_set = mkVarSet (map fst pairs) - -{- -Note [Unfoldings and join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -We assume that anything in an unfolding occurs multiple times, since unfoldings -are often copied (that's the whole point!). But we still need to track tail -calls for the purpose of finding join points. --} - ------------------------------ -occAnalRec :: OccEnv -> TopLevelFlag - -> SCC Details - -> (UsageDetails, [CoreBind]) - -> (UsageDetails, [CoreBind]) - - -- The NonRec case is just like a Let (NonRec ...) above -occAnalRec _ lvl (AcyclicSCC (ND { nd_bndr = bndr, nd_rhs = rhs - , nd_uds = rhs_uds, nd_rhs_bndrs = rhs_bndrs })) - (body_uds, binds) - | not (bndr `usedIn` body_uds) - = (body_uds, binds) -- See Note [Dead code] - - | otherwise -- It's mentioned in the body - = (body_uds' `andUDs` rhs_uds', - NonRec tagged_bndr rhs : binds) - where - (body_uds', tagged_bndr) = tagNonRecBinder lvl body_uds bndr - rhs_uds' = adjustRhsUsage (willBeJoinId_maybe tagged_bndr) NonRecursive - rhs_bndrs rhs_uds - - -- The Rec case is the interesting one - -- See Note [Recursive bindings: the grand plan] - -- See Note [Loop breaking] -occAnalRec env lvl (CyclicSCC details_s) (body_uds, binds) - | not (any (`usedIn` body_uds) bndrs) -- NB: look at body_uds, not total_uds - = (body_uds, binds) -- See Note [Dead code] - - | otherwise -- At this point we always build a single Rec - = -- pprTrace "occAnalRec" (vcat - -- [ text "weak_fvs" <+> ppr weak_fvs - -- , text "lb nodes" <+> ppr loop_breaker_nodes]) - (final_uds, Rec pairs : binds) - - where - bndrs = map nd_bndr details_s - bndr_set = mkVarSet bndrs - - ------------------------------ - -- See Note [Choosing loop breakers] for loop_breaker_nodes - final_uds :: UsageDetails - loop_breaker_nodes :: [LetrecNode] - (final_uds, loop_breaker_nodes) - = mkLoopBreakerNodes env lvl bndr_set body_uds details_s - - ------------------------------ - weak_fvs :: VarSet - weak_fvs = mapUnionVarSet nd_weak details_s - - --------------------------- - -- Now reconstruct the cycle - pairs :: [(Id,CoreExpr)] - pairs | isEmptyVarSet weak_fvs = reOrderNodes 0 bndr_set weak_fvs loop_breaker_nodes [] - | otherwise = loopBreakNodes 0 bndr_set weak_fvs loop_breaker_nodes [] - -- If weak_fvs is empty, the loop_breaker_nodes will include - -- all the edges in the original scope edges [remember, - -- weak_fvs is the difference between scope edges and - -- lb-edges], so a fresh SCC computation would yield a - -- single CyclicSCC result; and reOrderNodes deals with - -- exactly that case - - ------------------------------------------------------------------- --- Loop breaking ------------------------------------------------------------------- - -type Binding = (Id,CoreExpr) - -loopBreakNodes :: Int - -> VarSet -- All binders - -> VarSet -- Binders whose dependencies may be "missing" - -- See Note [Weak loop breakers] - -> [LetrecNode] - -> [Binding] -- Append these to the end - -> [Binding] -{- -loopBreakNodes is applied to the list of nodes for a cyclic strongly -connected component (there's guaranteed to be a cycle). It returns -the same nodes, but - a) in a better order, - b) with some of the Ids having a IAmALoopBreaker pragma - -The "loop-breaker" Ids are sufficient to break all cycles in the SCC. This means -that the simplifier can guarantee not to loop provided it never records an inlining -for these no-inline guys. - -Furthermore, the order of the binds is such that if we neglect dependencies -on the no-inline Ids then the binds are topologically sorted. This means -that the simplifier will generally do a good job if it works from top bottom, -recording inlinings for any Ids which aren't marked as "no-inline" as it goes. --} - --- Return the bindings sorted into a plausible order, and marked with loop breakers. -loopBreakNodes depth bndr_set weak_fvs nodes binds - = -- pprTrace "loopBreakNodes" (ppr nodes) $ - go (stronglyConnCompFromEdgedVerticesUniqR nodes) binds - where - go [] binds = binds - go (scc:sccs) binds = loop_break_scc scc (go sccs binds) - - loop_break_scc scc binds - = case scc of - AcyclicSCC node -> mk_non_loop_breaker weak_fvs node : binds - CyclicSCC nodes -> reOrderNodes depth bndr_set weak_fvs nodes binds - ----------------------------------- -reOrderNodes :: Int -> VarSet -> VarSet -> [LetrecNode] -> [Binding] -> [Binding] - -- Choose a loop breaker, mark it no-inline, - -- and call loopBreakNodes on the rest -reOrderNodes _ _ _ [] _ = panic "reOrderNodes" -reOrderNodes _ _ _ [node] binds = mk_loop_breaker node : binds -reOrderNodes depth bndr_set weak_fvs (node : nodes) binds - = -- pprTrace "reOrderNodes" (vcat [ text "unchosen" <+> ppr unchosen - -- , text "chosen" <+> ppr chosen_nodes ]) $ - loopBreakNodes new_depth bndr_set weak_fvs unchosen $ - (map mk_loop_breaker chosen_nodes ++ binds) - where - (chosen_nodes, unchosen) = chooseLoopBreaker approximate_lb - (nd_score (node_payload node)) - [node] [] nodes - - approximate_lb = depth >= 2 - new_depth | approximate_lb = 0 - | otherwise = depth+1 - -- After two iterations (d=0, d=1) give up - -- and approximate, returning to d=0 - -mk_loop_breaker :: LetrecNode -> Binding -mk_loop_breaker (node_payload -> ND { nd_bndr = bndr, nd_rhs = rhs}) - = (bndr `setIdOccInfo` strongLoopBreaker { occ_tail = tail_info }, rhs) - where - tail_info = tailCallInfo (idOccInfo bndr) - -mk_non_loop_breaker :: VarSet -> LetrecNode -> Binding --- See Note [Weak loop breakers] -mk_non_loop_breaker weak_fvs (node_payload -> ND { nd_bndr = bndr - , nd_rhs = rhs}) - | bndr `elemVarSet` weak_fvs = (setIdOccInfo bndr occ', rhs) - | otherwise = (bndr, rhs) - where - occ' = weakLoopBreaker { occ_tail = tail_info } - tail_info = tailCallInfo (idOccInfo bndr) - ----------------------------------- -chooseLoopBreaker :: Bool -- True <=> Too many iterations, - -- so approximate - -> NodeScore -- Best score so far - -> [LetrecNode] -- Nodes with this score - -> [LetrecNode] -- Nodes with higher scores - -> [LetrecNode] -- Unprocessed nodes - -> ([LetrecNode], [LetrecNode]) - -- This loop looks for the bind with the lowest score - -- to pick as the loop breaker. The rest accumulate in -chooseLoopBreaker _ _ loop_nodes acc [] - = (loop_nodes, acc) -- Done - - -- If approximate_loop_breaker is True, we pick *all* - -- nodes with lowest score, else just one - -- See Note [Complexity of loop breaking] -chooseLoopBreaker approx_lb loop_sc loop_nodes acc (node : nodes) - | approx_lb - , rank sc == rank loop_sc - = chooseLoopBreaker approx_lb loop_sc (node : loop_nodes) acc nodes - - | sc `betterLB` loop_sc -- Better score so pick this new one - = chooseLoopBreaker approx_lb sc [node] (loop_nodes ++ acc) nodes - - | otherwise -- Worse score so don't pick it - = chooseLoopBreaker approx_lb loop_sc loop_nodes (node : acc) nodes - where - sc = nd_score (node_payload node) - -{- -Note [Complexity of loop breaking] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The loop-breaking algorithm knocks out one binder at a time, and -performs a new SCC analysis on the remaining binders. That can -behave very badly in tightly-coupled groups of bindings; in the -worst case it can be (N**2)*log N, because it does a full SCC -on N, then N-1, then N-2 and so on. - -To avoid this, we switch plans after 2 (or whatever) attempts: - Plan A: pick one binder with the lowest score, make it - a loop breaker, and try again - Plan B: pick *all* binders with the lowest score, make them - all loop breakers, and try again -Since there are only a small finite number of scores, this will -terminate in a constant number of iterations, rather than O(N) -iterations. - -You might thing that it's very unlikely, but RULES make it much -more likely. Here's a real example from #1969: - Rec { $dm = \d.\x. op d - {-# RULES forall d. $dm Int d = $s$dm1 - forall d. $dm Bool d = $s$dm2 #-} - - dInt = MkD .... opInt ... - dInt = MkD .... opBool ... - opInt = $dm dInt - opBool = $dm dBool - - $s$dm1 = \x. op dInt - $s$dm2 = \x. op dBool } -The RULES stuff means that we can't choose $dm as a loop breaker -(Note [Choosing loop breakers]), so we must choose at least (say) -opInt *and* opBool, and so on. The number of loop breakders is -linear in the number of instance declarations. - -Note [Loop breakers and INLINE/INLINABLE pragmas] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Avoid choosing a function with an INLINE pramga as the loop breaker! -If such a function is mutually-recursive with a non-INLINE thing, -then the latter should be the loop-breaker. - -It's vital to distinguish between INLINE and INLINABLE (the -Bool returned by hasStableCoreUnfolding_maybe). If we start with - Rec { {-# INLINABLE f #-} - f x = ...f... } -and then worker/wrapper it through strictness analysis, we'll get - Rec { {-# INLINABLE $wf #-} - $wf p q = let x = (p,q) in ...f... - - {-# INLINE f #-} - f x = case x of (p,q) -> $wf p q } - -Now it is vital that we choose $wf as the loop breaker, so we can -inline 'f' in '$wf'. - -Note [DFuns should not be loop breakers] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's particularly bad to make a DFun into a loop breaker. See -Note [How instance declarations are translated] in TcInstDcls - -We give DFuns a higher score than ordinary CONLIKE things because -if there's a choice we want the DFun to be the non-loop breaker. Eg - -rec { sc = /\ a \$dC. $fBWrap (T a) ($fCT @ a $dC) - - $fCT :: forall a_afE. (Roman.C a_afE) => Roman.C (Roman.T a_afE) - {-# DFUN #-} - $fCT = /\a \$dC. MkD (T a) ((sc @ a $dC) |> blah) ($ctoF @ a $dC) - } - -Here 'sc' (the superclass) looks CONLIKE, but we'll never get to it -if we can't unravel the DFun first. - -Note [Constructor applications] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's really really important to inline dictionaries. Real -example (the Enum Ordering instance from GHC.Base): - - rec f = \ x -> case d of (p,q,r) -> p x - g = \ x -> case d of (p,q,r) -> q x - d = (v, f, g) - -Here, f and g occur just once; but we can't inline them into d. -On the other hand we *could* simplify those case expressions if -we didn't stupidly choose d as the loop breaker. -But we won't because constructor args are marked "Many". -Inlining dictionaries is really essential to unravelling -the loops in static numeric dictionaries, see GHC.Float. - -Note [Closure conversion] -~~~~~~~~~~~~~~~~~~~~~~~~~ -We treat (\x. C p q) as a high-score candidate in the letrec scoring algorithm. -The immediate motivation came from the result of a closure-conversion transformation -which generated code like this: - - data Clo a b = forall c. Clo (c -> a -> b) c - - ($:) :: Clo a b -> a -> b - Clo f env $: x = f env x - - rec { plus = Clo plus1 () - - ; plus1 _ n = Clo plus2 n - - ; plus2 Zero n = n - ; plus2 (Succ m) n = Succ (plus $: m $: n) } - -If we inline 'plus' and 'plus1', everything unravels nicely. But if -we choose 'plus1' as the loop breaker (which is entirely possible -otherwise), the loop does not unravel nicely. - - -@occAnalUnfolding@ deals with the question of bindings where the Id is marked -by an INLINE pragma. For these we record that anything which occurs -in its RHS occurs many times. This pessimistically assumes that this -inlined binder also occurs many times in its scope, but if it doesn't -we'll catch it next time round. At worst this costs an extra simplifier pass. -ToDo: try using the occurrence info for the inline'd binder. - -[March 97] We do the same for atomic RHSs. Reason: see notes with loopBreakSCC. -[June 98, SLPJ] I've undone this change; I don't understand it. See notes with loopBreakSCC. - - -************************************************************************ -* * - Making nodes -* * -************************************************************************ --} - -type ImpRuleEdges = IdEnv IdSet -- Mapping from FVs of imported RULE LHSs to RHS FVs - -noImpRuleEdges :: ImpRuleEdges -noImpRuleEdges = emptyVarEnv - -type LetrecNode = Node Unique Details -- Node comes from Digraph - -- The Unique key is gotten from the Id -data Details - = ND { nd_bndr :: Id -- Binder - , nd_rhs :: CoreExpr -- RHS, already occ-analysed - , nd_rhs_bndrs :: [CoreBndr] -- Outer lambdas of RHS - -- INVARIANT: (nd_rhs_bndrs nd, _) == - -- collectBinders (nd_rhs nd) - - , nd_uds :: UsageDetails -- Usage from RHS, and RULES, and stable unfoldings - -- ignoring phase (ie assuming all are active) - -- See Note [Forming Rec groups] - - , nd_inl :: IdSet -- Free variables of - -- the stable unfolding (if present and active) - -- or the RHS (if not) - -- but excluding any RULES - -- This is the IdSet that may be used if the Id is inlined - - , nd_weak :: IdSet -- Binders of this Rec that are mentioned in nd_uds - -- but are *not* in nd_inl. These are the ones whose - -- dependencies might not be respected by loop_breaker_nodes - -- See Note [Weak loop breakers] - - , nd_active_rule_fvs :: IdSet -- Free variables of the RHS of active RULES - - , nd_score :: NodeScore - } - -instance Outputable Details where - ppr nd = text "ND" <> braces - (sep [ text "bndr =" <+> ppr (nd_bndr nd) - , text "uds =" <+> ppr (nd_uds nd) - , text "inl =" <+> ppr (nd_inl nd) - , text "weak =" <+> ppr (nd_weak nd) - , text "rule =" <+> ppr (nd_active_rule_fvs nd) - , text "score =" <+> ppr (nd_score nd) - ]) - --- The NodeScore is compared lexicographically; --- e.g. lower rank wins regardless of size -type NodeScore = ( Int -- Rank: lower => more likely to be picked as loop breaker - , Int -- Size of rhs: higher => more likely to be picked as LB - -- Maxes out at maxExprSize; we just use it to prioritise - -- small functions - , Bool ) -- Was it a loop breaker before? - -- True => more likely to be picked - -- Note [Loop breakers, node scoring, and stability] - -rank :: NodeScore -> Int -rank (r, _, _) = r - -makeNode :: OccEnv -> ImpRuleEdges -> VarSet - -> (Var, CoreExpr) -> LetrecNode --- See Note [Recursive bindings: the grand plan] -makeNode env imp_rule_edges bndr_set (bndr, rhs) - = DigraphNode details (varUnique bndr) (nonDetKeysUniqSet node_fvs) - -- It's OK to use nonDetKeysUniqSet here as stronglyConnCompFromEdgedVerticesR - -- is still deterministic with edges in nondeterministic order as - -- explained in Note [Deterministic SCC] in Digraph. - where - details = ND { nd_bndr = bndr - , nd_rhs = rhs' - , nd_rhs_bndrs = bndrs' - , nd_uds = rhs_usage3 - , nd_inl = inl_fvs - , nd_weak = node_fvs `minusVarSet` inl_fvs - , nd_active_rule_fvs = active_rule_fvs - , nd_score = pprPanic "makeNodeDetails" (ppr bndr) } - - -- Constructing the edges for the main Rec computation - -- See Note [Forming Rec groups] - (bndrs, body) = collectBinders rhs - (rhs_usage1, bndrs', body') = occAnalRecRhs env bndrs body - rhs' = mkLams bndrs' body' - rhs_usage2 = foldr andUDs rhs_usage1 rule_uds - -- Note [Rules are extra RHSs] - -- Note [Rule dependency info] - rhs_usage3 = case mb_unf_uds of - Just unf_uds -> rhs_usage2 `andUDs` unf_uds - Nothing -> rhs_usage2 - node_fvs = udFreeVars bndr_set rhs_usage3 - - -- Finding the free variables of the rules - is_active = occ_rule_act env :: Activation -> Bool - - rules_w_uds :: [(CoreRule, UsageDetails, UsageDetails)] - rules_w_uds = occAnalRules env (Just (length bndrs)) Recursive bndr - - rules_w_rhs_fvs :: [(Activation, VarSet)] -- Find the RHS fvs - rules_w_rhs_fvs = maybe id (\ids -> ((AlwaysActive, ids):)) - (lookupVarEnv imp_rule_edges bndr) - -- See Note [Preventing loops due to imported functions rules] - [ (ru_act rule, udFreeVars bndr_set rhs_uds) - | (rule, _, rhs_uds) <- rules_w_uds ] - rule_uds = map (\(_, l, r) -> l `andUDs` r) rules_w_uds - active_rule_fvs = unionVarSets [fvs | (a,fvs) <- rules_w_rhs_fvs - , is_active a] - - -- Finding the usage details of the INLINE pragma (if any) - mb_unf_uds = occAnalUnfolding env Recursive bndr - - -- Find the "nd_inl" free vars; for the loop-breaker phase - inl_fvs = case mb_unf_uds of - Nothing -> udFreeVars bndr_set rhs_usage1 -- No INLINE, use RHS - Just unf_uds -> udFreeVars bndr_set unf_uds - -- We could check for an *active* INLINE (returning - -- emptyVarSet for an inactive one), but is_active - -- isn't the right thing (it tells about - -- RULE activation), so we'd need more plumbing - -mkLoopBreakerNodes :: OccEnv -> TopLevelFlag - -> VarSet - -> UsageDetails -- for BODY of let - -> [Details] - -> (UsageDetails, -- adjusted - [LetrecNode]) --- Does four things --- a) tag each binder with its occurrence info --- b) add a NodeScore to each node --- c) make a Node with the right dependency edges for --- the loop-breaker SCC analysis --- d) adjust each RHS's usage details according to --- the binder's (new) shotness and join-point-hood -mkLoopBreakerNodes env lvl bndr_set body_uds details_s - = (final_uds, zipWith mk_lb_node details_s bndrs') - where - (final_uds, bndrs') = tagRecBinders lvl body_uds - [ ((nd_bndr nd) - ,(nd_uds nd) - ,(nd_rhs_bndrs nd)) - | nd <- details_s ] - mk_lb_node nd@(ND { nd_bndr = bndr, nd_rhs = rhs, nd_inl = inl_fvs }) bndr' - = DigraphNode nd' (varUnique bndr) (nonDetKeysUniqSet lb_deps) - -- It's OK to use nonDetKeysUniqSet here as - -- stronglyConnCompFromEdgedVerticesR is still deterministic with edges - -- in nondeterministic order as explained in - -- Note [Deterministic SCC] in Digraph. - where - nd' = nd { nd_bndr = bndr', nd_score = score } - score = nodeScore env bndr bndr' rhs lb_deps - lb_deps = extendFvs_ rule_fv_env inl_fvs - - rule_fv_env :: IdEnv IdSet - -- Maps a variable f to the variables from this group - -- mentioned in RHS of active rules for f - -- Domain is *subset* of bound vars (others have no rule fvs) - rule_fv_env = transClosureFV (mkVarEnv init_rule_fvs) - init_rule_fvs -- See Note [Finding rule RHS free vars] - = [ (b, trimmed_rule_fvs) - | ND { nd_bndr = b, nd_active_rule_fvs = rule_fvs } <- details_s - , let trimmed_rule_fvs = rule_fvs `intersectVarSet` bndr_set - , not (isEmptyVarSet trimmed_rule_fvs) ] - - ------------------------------------------- -nodeScore :: OccEnv - -> Id -- Binder has old occ-info (just for loop-breaker-ness) - -> Id -- Binder with new occ-info - -> CoreExpr -- RHS - -> VarSet -- Loop-breaker dependencies - -> NodeScore -nodeScore env old_bndr new_bndr bind_rhs lb_deps - | not (isId old_bndr) -- A type or coercion variable is never a loop breaker - = (100, 0, False) - - | old_bndr `elemVarSet` lb_deps -- Self-recursive things are great loop breakers - = (0, 0, True) -- See Note [Self-recursion and loop breakers] - - | not (occ_unf_act env old_bndr) -- A binder whose inlining is inactive (e.g. has - = (0, 0, True) -- a NOINLINE pragma) makes a great loop breaker - - | exprIsTrivial rhs - = mk_score 10 -- Practically certain to be inlined - -- Used to have also: && not (isExportedId bndr) - -- But I found this sometimes cost an extra iteration when we have - -- rec { d = (a,b); a = ...df...; b = ...df...; df = d } - -- where df is the exported dictionary. Then df makes a really - -- bad choice for loop breaker - - | DFunUnfolding { df_args = args } <- id_unfolding - -- Never choose a DFun as a loop breaker - -- Note [DFuns should not be loop breakers] - = (9, length args, is_lb) - - -- Data structures are more important than INLINE pragmas - -- so that dictionary/method recursion unravels - - | CoreUnfolding { uf_guidance = UnfWhen {} } <- id_unfolding - = mk_score 6 - - | is_con_app rhs -- Data types help with cases: - = mk_score 5 -- Note [Constructor applications] - - | isStableUnfolding id_unfolding - , can_unfold - = mk_score 3 - - | isOneOcc (idOccInfo new_bndr) - = mk_score 2 -- Likely to be inlined - - | can_unfold -- The Id has some kind of unfolding - = mk_score 1 - - | otherwise - = (0, 0, is_lb) - - where - mk_score :: Int -> NodeScore - mk_score rank = (rank, rhs_size, is_lb) - - is_lb = isStrongLoopBreaker (idOccInfo old_bndr) - rhs = case id_unfolding of - CoreUnfolding { uf_src = src, uf_tmpl = unf_rhs } - | isStableSource src - -> unf_rhs - _ -> bind_rhs - -- 'bind_rhs' is irrelevant for inlining things with a stable unfolding - rhs_size = case id_unfolding of - CoreUnfolding { uf_guidance = guidance } - | UnfIfGoodArgs { ug_size = size } <- guidance - -> size - _ -> cheapExprSize rhs - - can_unfold = canUnfold id_unfolding - id_unfolding = realIdUnfolding old_bndr - -- realIdUnfolding: Ignore loop-breaker-ness here because - -- that is what we are setting! - - -- Checking for a constructor application - -- Cheap and cheerful; the simplifier moves casts out of the way - -- The lambda case is important to spot x = /\a. C (f a) - -- which comes up when C is a dictionary constructor and - -- f is a default method. - -- Example: the instance for Show (ST s a) in GHC.ST - -- - -- However we *also* treat (\x. C p q) as a con-app-like thing, - -- Note [Closure conversion] - is_con_app (Var v) = isConLikeId v - is_con_app (App f _) = is_con_app f - is_con_app (Lam _ e) = is_con_app e - is_con_app (Tick _ e) = is_con_app e - is_con_app _ = False - -maxExprSize :: Int -maxExprSize = 20 -- Rather arbitrary - -cheapExprSize :: CoreExpr -> Int --- Maxes out at maxExprSize -cheapExprSize e - = go 0 e - where - go n e | n >= maxExprSize = n - | otherwise = go1 n e - - go1 n (Var {}) = n+1 - go1 n (Lit {}) = n+1 - go1 n (Type {}) = n - go1 n (Coercion {}) = n - go1 n (Tick _ e) = go1 n e - go1 n (Cast e _) = go1 n e - go1 n (App f a) = go (go1 n f) a - go1 n (Lam b e) - | isTyVar b = go1 n e - | otherwise = go (n+1) e - go1 n (Let b e) = gos (go1 n e) (rhssOfBind b) - go1 n (Case e _ _ as) = gos (go1 n e) (rhssOfAlts as) - - gos n [] = n - gos n (e:es) | n >= maxExprSize = n - | otherwise = gos (go1 n e) es - -betterLB :: NodeScore -> NodeScore -> Bool --- If n1 `betterLB` n2 then choose n1 as the loop breaker -betterLB (rank1, size1, lb1) (rank2, size2, _) - | rank1 < rank2 = True - | rank1 > rank2 = False - | size1 < size2 = False -- Make the bigger n2 into the loop breaker - | size1 > size2 = True - | lb1 = True -- Tie-break: if n1 was a loop breaker before, choose it - | otherwise = False -- See Note [Loop breakers, node scoring, and stability] - -{- Note [Self-recursion and loop breakers] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have - rec { f = ...f...g... - ; g = .....f... } -then 'f' has to be a loop breaker anyway, so we may as well choose it -right away, so that g can inline freely. - -This is really just a cheap hack. Consider - rec { f = ...g... - ; g = ..f..h... - ; h = ...f....} -Here f or g are better loop breakers than h; but we might accidentally -choose h. Finding the minimal set of loop breakers is hard. - -Note [Loop breakers, node scoring, and stability] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To choose a loop breaker, we give a NodeScore to each node in the SCC, -and pick the one with the best score (according to 'betterLB'). - -We need to be jolly careful (#12425, #12234) about the stability -of this choice. Suppose we have - - let rec { f = ...g...g... - ; g = ...f...f... } - in - case x of - True -> ...f.. - False -> ..f... - -In each iteration of the simplifier the occurrence analyser OccAnal -chooses a loop breaker. Suppose in iteration 1 it choose g as the loop -breaker. That means it is free to inline f. - -Suppose that GHC decides to inline f in the branches of the case, but -(for some reason; eg it is not saturated) in the rhs of g. So we get - - let rec { f = ...g...g... - ; g = ...f...f... } - in - case x of - True -> ...g...g..... - False -> ..g..g.... - -Now suppose that, for some reason, in the next iteration the occurrence -analyser chooses f as the loop breaker, so it can freely inline g. And -again for some reason the simplifier inlines g at its calls in the case -branches, but not in the RHS of f. Then we get - - let rec { f = ...g...g... - ; g = ...f...f... } - in - case x of - True -> ...(...f...f...)...(...f..f..)..... - False -> ..(...f...f...)...(..f..f...).... - -You can see where this is going! Each iteration of the simplifier -doubles the number of calls to f or g. No wonder GHC is slow! - -(In the particular example in comment:3 of #12425, f and g are the two -mutually recursive fmap instances for CondT and Result. They are both -marked INLINE which, oddly, is why they don't inline in each other's -RHS, because the call there is not saturated.) - -The root cause is that we flip-flop on our choice of loop breaker. I -always thought it didn't matter, and indeed for any single iteration -to terminate, it doesn't matter. But when we iterate, it matters a -lot!! - -So The Plan is this: - If there is a tie, choose the node that - was a loop breaker last time round - -Hence the is_lb field of NodeScore - -************************************************************************ -* * - Right hand sides -* * -************************************************************************ --} - -occAnalRhs :: OccEnv -> RecFlag -> Id -> [CoreBndr] -> CoreExpr - -> (UsageDetails, [CoreBndr], CoreExpr) - -- Returned usage details covers only the RHS, - -- and *not* the RULE or INLINE template for the Id -occAnalRhs env Recursive _ bndrs body - = occAnalRecRhs env bndrs body -occAnalRhs env NonRecursive id bndrs body - = occAnalNonRecRhs env id bndrs body - -occAnalRecRhs :: OccEnv -> [CoreBndr] -> CoreExpr -- Rhs lambdas, body - -> (UsageDetails, [CoreBndr], CoreExpr) - -- Returned usage details covers only the RHS, - -- and *not* the RULE or INLINE template for the Id -occAnalRecRhs env bndrs body = occAnalLamOrRhs (rhsCtxt env) bndrs body - -occAnalNonRecRhs :: OccEnv - -> Id -> [CoreBndr] -> CoreExpr -- Binder; rhs lams, body - -- Binder is already tagged with occurrence info - -> (UsageDetails, [CoreBndr], CoreExpr) - -- Returned usage details covers only the RHS, - -- and *not* the RULE or INLINE template for the Id -occAnalNonRecRhs env bndr bndrs body - = occAnalLamOrRhs rhs_env bndrs body - where - env1 | is_join_point = env -- See Note [Join point RHSs] - | certainly_inline = env -- See Note [Cascading inlines] - | otherwise = rhsCtxt env - - -- See Note [Sources of one-shot information] - rhs_env = env1 { occ_one_shots = argOneShots dmd } - - certainly_inline -- See Note [Cascading inlines] - = case occ of - OneOcc { occ_in_lam = NotInsideLam, occ_one_br = InOneBranch } - -> active && not_stable - _ -> False - - is_join_point = isAlwaysTailCalled occ - -- Like (isJoinId bndr) but happens one step earlier - -- c.f. willBeJoinId_maybe - - occ = idOccInfo bndr - dmd = idDemandInfo bndr - active = isAlwaysActive (idInlineActivation bndr) - not_stable = not (isStableUnfolding (idUnfolding bndr)) - -occAnalUnfolding :: OccEnv - -> RecFlag - -> Id - -> Maybe UsageDetails - -- Just the analysis, not a new unfolding. The unfolding - -- got analysed when it was created and we don't need to - -- update it. -occAnalUnfolding env rec_flag id - = case realIdUnfolding id of -- ignore previous loop-breaker flag - CoreUnfolding { uf_tmpl = rhs, uf_src = src } - | not (isStableSource src) - -> Nothing - | otherwise - -> Just $ markAllMany usage - where - (bndrs, body) = collectBinders rhs - (usage, _, _) = occAnalRhs env rec_flag id bndrs body - - DFunUnfolding { df_bndrs = bndrs, df_args = args } - -> Just $ zapDetails (delDetailsList usage bndrs) - where - usage = andUDsList (map (fst . occAnal env) args) - - _ -> Nothing - -occAnalRules :: OccEnv - -> Maybe JoinArity -- If the binder is (or MAY become) a join - -- point, what its join arity is (or WOULD - -- become). See Note [Rules and join points]. - -> RecFlag - -> Id - -> [(CoreRule, -- Each (non-built-in) rule - UsageDetails, -- Usage details for LHS - UsageDetails)] -- Usage details for RHS -occAnalRules env mb_expected_join_arity rec_flag id - = [ (rule, lhs_uds, rhs_uds) | rule@Rule {} <- idCoreRules id - , let (lhs_uds, rhs_uds) = occ_anal_rule rule ] - where - occ_anal_rule (Rule { ru_bndrs = bndrs, ru_args = args, ru_rhs = rhs }) - = (lhs_uds, final_rhs_uds) - where - lhs_uds = addManyOccsSet emptyDetails $ - (exprsFreeVars args `delVarSetList` bndrs) - (rhs_bndrs, rhs_body) = collectBinders rhs - (rhs_uds, _, _) = occAnalRhs env rec_flag id rhs_bndrs rhs_body - -- Note [Rules are extra RHSs] - -- Note [Rule dependency info] - final_rhs_uds = adjust_tail_info args $ markAllMany $ - (rhs_uds `delDetailsList` bndrs) - occ_anal_rule _ - = (emptyDetails, emptyDetails) - - adjust_tail_info args uds -- see Note [Rules and join points] - = case mb_expected_join_arity of - Just ar | args `lengthIs` ar -> uds - _ -> markAllNonTailCalled uds -{- Note [Join point RHSs] -~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - x = e - join j = Just x - -We want to inline x into j right away, so we don't want to give -the join point a RhsCtxt (#14137). It's not a huge deal, because -the FloatIn pass knows to float into join point RHSs; and the simplifier -does not float things out of join point RHSs. But it's a simple, cheap -thing to do. See #14137. - -Note [Cascading inlines] -~~~~~~~~~~~~~~~~~~~~~~~~ -By default we use an rhsCtxt for the RHS of a binding. This tells the -occ anal n that it's looking at an RHS, which has an effect in -occAnalApp. In particular, for constructor applications, it makes -the arguments appear to have NoOccInfo, so that we don't inline into -them. Thus x = f y - k = Just x -we do not want to inline x. - -But there's a problem. Consider - x1 = a0 : [] - x2 = a1 : x1 - x3 = a2 : x2 - g = f x3 -First time round, it looks as if x1 and x2 occur as an arg of a -let-bound constructor ==> give them a many-occurrence. -But then x3 is inlined (unconditionally as it happens) and -next time round, x2 will be, and the next time round x1 will be -Result: multiple simplifier iterations. Sigh. - -So, when analysing the RHS of x3 we notice that x3 will itself -definitely inline the next time round, and so we analyse x3's rhs in -an ordinary context, not rhsCtxt. Hence the "certainly_inline" stuff. - -Annoyingly, we have to approximate SimplUtils.preInlineUnconditionally. -If (a) the RHS is expandable (see isExpandableApp in occAnalApp), and - (b) certainly_inline says "yes" when preInlineUnconditionally says "no" -then the simplifier iterates indefinitely: - x = f y - k = Just x -- We decide that k is 'certainly_inline' - v = ...k... -- but preInlineUnconditionally doesn't inline it -inline ==> - k = Just (f y) - v = ...k... -float ==> - x1 = f y - k = Just x1 - v = ...k... - -This is worse than the slow cascade, so we only want to say "certainly_inline" -if it really is certain. Look at the note with preInlineUnconditionally -for the various clauses. - - -************************************************************************ -* * - Expressions -* * -************************************************************************ --} - -occAnal :: OccEnv - -> CoreExpr - -> (UsageDetails, -- Gives info only about the "interesting" Ids - CoreExpr) - -occAnal _ expr@(Type _) = (emptyDetails, expr) -occAnal _ expr@(Lit _) = (emptyDetails, expr) -occAnal env expr@(Var _) = occAnalApp env (expr, [], []) - -- At one stage, I gathered the idRuleVars for the variable here too, - -- which in a way is the right thing to do. - -- But that went wrong right after specialisation, when - -- the *occurrences* of the overloaded function didn't have any - -- rules in them, so the *specialised* versions looked as if they - -- weren't used at all. - -occAnal _ (Coercion co) - = (addManyOccsSet emptyDetails (coVarsOfCo co), Coercion co) - -- See Note [Gather occurrences of coercion variables] - -{- -Note [Gather occurrences of coercion variables] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We need to gather info about what coercion variables appear, so that -we can sort them into the right place when doing dependency analysis. --} - -occAnal env (Tick tickish body) - | SourceNote{} <- tickish - = (usage, Tick tickish body') - -- SourceNotes are best-effort; so we just proceed as usual. - -- If we drop a tick due to the issues described below it's - -- not the end of the world. - - | tickish `tickishScopesLike` SoftScope - = (markAllNonTailCalled usage, Tick tickish body') - - | Breakpoint _ ids <- tickish - = (usage_lam `andUDs` foldr addManyOccs emptyDetails ids, Tick tickish body') - -- never substitute for any of the Ids in a Breakpoint - - | otherwise - = (usage_lam, Tick tickish body') - where - !(usage,body') = occAnal env body - -- for a non-soft tick scope, we can inline lambdas only - usage_lam = markAllNonTailCalled (markAllInsideLam usage) - -- TODO There may be ways to make ticks and join points play - -- nicer together, but right now there are problems: - -- let j x = ... in tick<t> (j 1) - -- Making j a join point may cause the simplifier to drop t - -- (if the tick is put into the continuation). So we don't - -- count j 1 as a tail call. - -- See #14242. - -occAnal env (Cast expr co) - = case occAnal env expr of { (usage, expr') -> - let usage1 = zapDetailsIf (isRhsEnv env) usage - -- usage1: if we see let x = y `cast` co - -- then mark y as 'Many' so that we don't - -- immediately inline y again. - usage2 = addManyOccsSet usage1 (coVarsOfCo co) - -- usage2: see Note [Gather occurrences of coercion variables] - in (markAllNonTailCalled usage2, Cast expr' co) - } - -occAnal env app@(App _ _) - = occAnalApp env (collectArgsTicks tickishFloatable app) - --- Ignore type variables altogether --- (a) occurrences inside type lambdas only not marked as InsideLam --- (b) type variables not in environment - -occAnal env (Lam x body) - | isTyVar x - = case occAnal env body of { (body_usage, body') -> - (markAllNonTailCalled body_usage, Lam x body') - } - --- For value lambdas we do a special hack. Consider --- (\x. \y. ...x...) --- If we did nothing, x is used inside the \y, so would be marked --- as dangerous to dup. But in the common case where the abstraction --- is applied to two arguments this is over-pessimistic. --- So instead, we just mark each binder with its occurrence --- info in the *body* of the multiple lambda. --- Then, the simplifier is careful when partially applying lambdas. - -occAnal env expr@(Lam _ _) - = case occAnalLamOrRhs env binders body of { (usage, tagged_binders, body') -> - let - expr' = mkLams tagged_binders body' - usage1 = markAllNonTailCalled usage - one_shot_gp = all isOneShotBndr tagged_binders - final_usage | one_shot_gp = usage1 - | otherwise = markAllInsideLam usage1 - in - (final_usage, expr') } - where - (binders, body) = collectBinders expr - -occAnal env (Case scrut bndr ty alts) - = case occ_anal_scrut scrut alts of { (scrut_usage, scrut') -> - case mapAndUnzip occ_anal_alt alts of { (alts_usage_s, alts') -> - let - alts_usage = foldr orUDs emptyDetails alts_usage_s - (alts_usage1, tagged_bndr) = tagLamBinder alts_usage bndr - total_usage = markAllNonTailCalled scrut_usage `andUDs` alts_usage1 - -- Alts can have tail calls, but the scrutinee can't - in - total_usage `seq` (total_usage, Case scrut' tagged_bndr ty alts') }} - where - alt_env = mkAltEnv env scrut bndr - occ_anal_alt = occAnalAlt alt_env - - occ_anal_scrut (Var v) (alt1 : other_alts) - | not (null other_alts) || not (isDefaultAlt alt1) - = (mkOneOcc env v IsInteresting 0, Var v) - -- The 'True' says that the variable occurs in an interesting - -- context; the case has at least one non-default alternative - occ_anal_scrut (Tick t e) alts - | t `tickishScopesLike` SoftScope - -- No reason to not look through all ticks here, but only - -- for soft-scoped ticks we can do so without having to - -- update returned occurrence info (see occAnal) - = second (Tick t) $ occ_anal_scrut e alts - - occ_anal_scrut scrut _alts - = occAnal (vanillaCtxt env) scrut -- No need for rhsCtxt - -occAnal env (Let bind body) - = case occAnal env body of { (body_usage, body') -> - case occAnalBind env NotTopLevel - noImpRuleEdges bind - body_usage of { (final_usage, new_binds) -> - (final_usage, mkLets new_binds body') }} - -occAnalArgs :: OccEnv -> [CoreExpr] -> [OneShots] -> (UsageDetails, [CoreExpr]) -occAnalArgs _ [] _ - = (emptyDetails, []) - -occAnalArgs env (arg:args) one_shots - | isTypeArg arg - = case occAnalArgs env args one_shots of { (uds, args') -> - (uds, arg:args') } - - | otherwise - = case argCtxt env one_shots of { (arg_env, one_shots') -> - case occAnal arg_env arg of { (uds1, arg') -> - case occAnalArgs env args one_shots' of { (uds2, args') -> - (uds1 `andUDs` uds2, arg':args') }}} - -{- -Applications are dealt with specially because we want -the "build hack" to work. - -Note [Arguments of let-bound constructors] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - f x = let y = expensive x in - let z = (True,y) in - (case z of {(p,q)->q}, case z of {(p,q)->q}) -We feel free to duplicate the WHNF (True,y), but that means -that y may be duplicated thereby. - -If we aren't careful we duplicate the (expensive x) call! -Constructors are rather like lambdas in this way. --} - -occAnalApp :: OccEnv - -> (Expr CoreBndr, [Arg CoreBndr], [Tickish Id]) - -> (UsageDetails, Expr CoreBndr) -occAnalApp env (Var fun, args, ticks) - | null ticks = (uds, mkApps (Var fun) args') - | otherwise = (uds, mkTicks ticks $ mkApps (Var fun) args') - where - uds = fun_uds `andUDs` final_args_uds - - !(args_uds, args') = occAnalArgs env args one_shots - !final_args_uds - | isRhsEnv env && is_exp = markAllNonTailCalled $ - markAllInsideLam args_uds - | otherwise = markAllNonTailCalled args_uds - -- We mark the free vars of the argument of a constructor or PAP - -- as "inside-lambda", if it is the RHS of a let(rec). - -- This means that nothing gets inlined into a constructor or PAP - -- argument position, which is what we want. Typically those - -- constructor arguments are just variables, or trivial expressions. - -- We use inside-lam because it's like eta-expanding the PAP. - -- - -- This is the *whole point* of the isRhsEnv predicate - -- See Note [Arguments of let-bound constructors] - - n_val_args = valArgCount args - n_args = length args - fun_uds = mkOneOcc env fun (if n_val_args > 0 then IsInteresting else NotInteresting) n_args - is_exp = isExpandableApp fun n_val_args - -- See Note [CONLIKE pragma] in BasicTypes - -- The definition of is_exp should match that in Simplify.prepareRhs - - one_shots = argsOneShots (idStrictness fun) guaranteed_val_args - guaranteed_val_args = n_val_args + length (takeWhile isOneShotInfo - (occ_one_shots env)) - -- See Note [Sources of one-shot information], bullet point A'] - -occAnalApp env (fun, args, ticks) - = (markAllNonTailCalled (fun_uds `andUDs` args_uds), - mkTicks ticks $ mkApps fun' args') - where - !(fun_uds, fun') = occAnal (addAppCtxt env args) fun - -- The addAppCtxt is a bit cunning. One iteration of the simplifier - -- often leaves behind beta redexs like - -- (\x y -> e) a1 a2 - -- Here we would like to mark x,y as one-shot, and treat the whole - -- thing much like a let. We do this by pushing some True items - -- onto the context stack. - !(args_uds, args') = occAnalArgs env args [] - -zapDetailsIf :: Bool -- If this is true - -> UsageDetails -- Then do zapDetails on this - -> UsageDetails -zapDetailsIf True uds = zapDetails uds -zapDetailsIf False uds = uds - -{- -Note [Sources of one-shot information] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The occurrence analyser obtains one-shot-lambda information from two sources: - -A: Saturated applications: eg f e1 .. en - - In general, given a call (f e1 .. en) we can propagate one-shot info from - f's strictness signature into e1 .. en, but /only/ if n is enough to - saturate the strictness signature. A strictness signature like - - f :: C1(C1(L))LS - - means that *if f is applied to three arguments* then it will guarantee to - call its first argument at most once, and to call the result of that at - most once. But if f has fewer than three arguments, all bets are off; e.g. - - map (f (\x y. expensive) e2) xs - - Here the \x y abstraction may be called many times (once for each element of - xs) so we should not mark x and y as one-shot. But if it was - - map (f (\x y. expensive) 3 2) xs - - then the first argument of f will be called at most once. - - The one-shot info, derived from f's strictness signature, is - computed by 'argsOneShots', called in occAnalApp. - -A': Non-obviously saturated applications: eg build (f (\x y -> expensive)) - where f is as above. - - In this case, f is only manifestly applied to one argument, so it does not - look saturated. So by the previous point, we should not use its strictness - signature to learn about the one-shotness of \x y. But in this case we can: - build is fully applied, so we may use its strictness signature; and from - that we learn that build calls its argument with two arguments *at most once*. - - So there is really only one call to f, and it will have three arguments. In - that sense, f is saturated, and we may proceed as described above. - - Hence the computation of 'guaranteed_val_args' in occAnalApp, using - '(occ_one_shots env)'. See also #13227, comment:9 - -B: Let-bindings: eg let f = \c. let ... in \n -> blah - in (build f, build f) - - Propagate one-shot info from the demanand-info on 'f' to the - lambdas in its RHS (which may not be syntactically at the top) - - This information must have come from a previous run of the demanand - analyser. - -Previously, the demand analyser would *also* set the one-shot information, but -that code was buggy (see #11770), so doing it only in on place, namely here, is -saner. - -Note [OneShots] -~~~~~~~~~~~~~~~ -When analysing an expression, the occ_one_shots argument contains information -about how the function is being used. The length of the list indicates -how many arguments will eventually be passed to the analysed expression, -and the OneShotInfo indicates whether this application is once or multiple times. - -Example: - - Context of f occ_one_shots when analysing f - - f 1 2 [OneShot, OneShot] - map (f 1) [OneShot, NoOneShotInfo] - build f [OneShot, OneShot] - f 1 2 `seq` f 2 1 [NoOneShotInfo, OneShot] - -Note [Binders in case alternatives] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - case x of y { (a,b) -> f y } -We treat 'a', 'b' as dead, because they don't physically occur in the -case alternative. (Indeed, a variable is dead iff it doesn't occur in -its scope in the output of OccAnal.) It really helps to know when -binders are unused. See esp the call to isDeadBinder in -Simplify.mkDupableAlt - -In this example, though, the Simplifier will bring 'a' and 'b' back to -life, because it binds 'y' to (a,b) (imagine got inlined and -scrutinised y). --} - -occAnalLamOrRhs :: OccEnv -> [CoreBndr] -> CoreExpr - -> (UsageDetails, [CoreBndr], CoreExpr) -occAnalLamOrRhs env [] body - = case occAnal env body of (body_usage, body') -> (body_usage, [], body') - -- RHS of thunk or nullary join point -occAnalLamOrRhs env (bndr:bndrs) body - | isTyVar bndr - = -- Important: Keep the environment so that we don't inline into an RHS like - -- \(@ x) -> C @x (f @x) - -- (see the beginning of Note [Cascading inlines]). - case occAnalLamOrRhs env bndrs body of - (body_usage, bndrs', body') -> (body_usage, bndr:bndrs', body') -occAnalLamOrRhs env binders body - = case occAnal env_body body of { (body_usage, body') -> - let - (final_usage, tagged_binders) = tagLamBinders body_usage binders' - -- Use binders' to put one-shot info on the lambdas - in - (final_usage, tagged_binders, body') } - where - (env_body, binders') = oneShotGroup env binders - -occAnalAlt :: (OccEnv, Maybe (Id, CoreExpr)) - -> CoreAlt - -> (UsageDetails, Alt IdWithOccInfo) -occAnalAlt (env, scrut_bind) (con, bndrs, rhs) - = case occAnal env rhs of { (rhs_usage1, rhs1) -> - let - (alt_usg, tagged_bndrs) = tagLamBinders rhs_usage1 bndrs - -- See Note [Binders in case alternatives] - (alt_usg', rhs2) = wrapAltRHS env scrut_bind alt_usg tagged_bndrs rhs1 - in - (alt_usg', (con, tagged_bndrs, rhs2)) } - -wrapAltRHS :: OccEnv - -> Maybe (Id, CoreExpr) -- proxy mapping generated by mkAltEnv - -> UsageDetails -- usage for entire alt (p -> rhs) - -> [Var] -- alt binders - -> CoreExpr -- alt RHS - -> (UsageDetails, CoreExpr) -wrapAltRHS env (Just (scrut_var, let_rhs)) alt_usg bndrs alt_rhs - | occ_binder_swap env - , scrut_var `usedIn` alt_usg -- bndrs are not be present in alt_usg so this - -- handles condition (a) in Note [Binder swap] - , not captured -- See condition (b) in Note [Binder swap] - = ( alt_usg' `andUDs` let_rhs_usg - , Let (NonRec tagged_scrut_var let_rhs') alt_rhs ) - where - captured = any (`usedIn` let_rhs_usg) bndrs -- Check condition (b) - - -- The rhs of the let may include coercion variables - -- if the scrutinee was a cast, so we must gather their - -- usage. See Note [Gather occurrences of coercion variables] - -- Moreover, the rhs of the let may mention the case-binder, and - -- we want to gather its occ-info as well - (let_rhs_usg, let_rhs') = occAnal env let_rhs - - (alt_usg', tagged_scrut_var) = tagLamBinder alt_usg scrut_var - -wrapAltRHS _ _ alt_usg _ alt_rhs - = (alt_usg, alt_rhs) - -{- -************************************************************************ -* * - OccEnv -* * -************************************************************************ --} - -data OccEnv - = OccEnv { occ_encl :: !OccEncl -- Enclosing context information - , occ_one_shots :: !OneShots -- See Note [OneShots] - , occ_gbl_scrut :: GlobalScruts - - , occ_unf_act :: Id -> Bool -- Which Id unfoldings are active - - , occ_rule_act :: Activation -> Bool -- Which rules are active - -- See Note [Finding rule RHS free vars] - - , occ_binder_swap :: !Bool -- enable the binder_swap - -- See CorePrep Note [Dead code in CorePrep] - } - -type GlobalScruts = IdSet -- See Note [Binder swap on GlobalId scrutinees] - ------------------------------ --- OccEncl is used to control whether to inline into constructor arguments --- For example: --- x = (p,q) -- Don't inline p or q --- y = /\a -> (p a, q a) -- Still don't inline p or q --- z = f (p,q) -- Do inline p,q; it may make a rule fire --- So OccEncl tells enough about the context to know what to do when --- we encounter a constructor application or PAP. - -data OccEncl - = OccRhs -- RHS of let(rec), albeit perhaps inside a type lambda - -- Don't inline into constructor args here - | OccVanilla -- Argument of function, body of lambda, scruintee of case etc. - -- Do inline into constructor args here - -instance Outputable OccEncl where - ppr OccRhs = text "occRhs" - ppr OccVanilla = text "occVanilla" - --- See note [OneShots] -type OneShots = [OneShotInfo] - -initOccEnv :: OccEnv -initOccEnv - = OccEnv { occ_encl = OccVanilla - , occ_one_shots = [] - , occ_gbl_scrut = emptyVarSet - -- To be conservative, we say that all - -- inlines and rules are active - , occ_unf_act = \_ -> True - , occ_rule_act = \_ -> True - , occ_binder_swap = True } - -vanillaCtxt :: OccEnv -> OccEnv -vanillaCtxt env = env { occ_encl = OccVanilla, occ_one_shots = [] } - -rhsCtxt :: OccEnv -> OccEnv -rhsCtxt env = env { occ_encl = OccRhs, occ_one_shots = [] } - -argCtxt :: OccEnv -> [OneShots] -> (OccEnv, [OneShots]) -argCtxt env [] - = (env { occ_encl = OccVanilla, occ_one_shots = [] }, []) -argCtxt env (one_shots:one_shots_s) - = (env { occ_encl = OccVanilla, occ_one_shots = one_shots }, one_shots_s) - -isRhsEnv :: OccEnv -> Bool -isRhsEnv (OccEnv { occ_encl = OccRhs }) = True -isRhsEnv (OccEnv { occ_encl = OccVanilla }) = False - -oneShotGroup :: OccEnv -> [CoreBndr] - -> ( OccEnv - , [CoreBndr] ) - -- The result binders have one-shot-ness set that they might not have had originally. - -- This happens in (build (\c n -> e)). Here the occurrence analyser - -- linearity context knows that c,n are one-shot, and it records that fact in - -- the binder. This is useful to guide subsequent float-in/float-out transformations - -oneShotGroup env@(OccEnv { occ_one_shots = ctxt }) bndrs - = go ctxt bndrs [] - where - go ctxt [] rev_bndrs - = ( env { occ_one_shots = ctxt, occ_encl = OccVanilla } - , reverse rev_bndrs ) - - go [] bndrs rev_bndrs - = ( env { occ_one_shots = [], occ_encl = OccVanilla } - , reverse rev_bndrs ++ bndrs ) - - go ctxt@(one_shot : ctxt') (bndr : bndrs) rev_bndrs - | isId bndr = go ctxt' bndrs (bndr': rev_bndrs) - | otherwise = go ctxt bndrs (bndr : rev_bndrs) - where - bndr' = updOneShotInfo bndr one_shot - -- Use updOneShotInfo, not setOneShotInfo, as pre-existing - -- one-shot info might be better than what we can infer, e.g. - -- due to explicit use of the magic 'oneShot' function. - -- See Note [The oneShot function] - - -markJoinOneShots :: Maybe JoinArity -> [Var] -> [Var] --- Mark the lambdas of a non-recursive join point as one-shot. --- This is good to prevent gratuitous float-out etc -markJoinOneShots mb_join_arity bndrs - = case mb_join_arity of - Nothing -> bndrs - Just n -> go n bndrs - where - go 0 bndrs = bndrs - go _ [] = [] -- This can legitimately happen. - -- e.g. let j = case ... in j True - -- This will become an arity-1 join point after the - -- simplifier has eta-expanded it; but it may not have - -- enough lambdas /yet/. (Lint checks that JoinIds do - -- have enough lambdas.) - go n (b:bs) = b' : go (n-1) bs - where - b' | isId b = setOneShotLambda b - | otherwise = b - -addAppCtxt :: OccEnv -> [Arg CoreBndr] -> OccEnv -addAppCtxt env@(OccEnv { occ_one_shots = ctxt }) args - = env { occ_one_shots = replicate (valArgCount args) OneShotLam ++ ctxt } - -transClosureFV :: UniqFM VarSet -> UniqFM VarSet --- If (f,g), (g,h) are in the input, then (f,h) is in the output --- as well as (f,g), (g,h) -transClosureFV env - | no_change = env - | otherwise = transClosureFV (listToUFM new_fv_list) - where - (no_change, new_fv_list) = mapAccumL bump True (nonDetUFMToList env) - -- It's OK to use nonDetUFMToList here because we'll forget the - -- ordering by creating a new set with listToUFM - bump no_change (b,fvs) - | no_change_here = (no_change, (b,fvs)) - | otherwise = (False, (b,new_fvs)) - where - (new_fvs, no_change_here) = extendFvs env fvs - -------------- -extendFvs_ :: UniqFM VarSet -> VarSet -> VarSet -extendFvs_ env s = fst (extendFvs env s) -- Discard the Bool flag - -extendFvs :: UniqFM VarSet -> VarSet -> (VarSet, Bool) --- (extendFVs env s) returns --- (s `union` env(s), env(s) `subset` s) -extendFvs env s - | isNullUFM env - = (s, True) - | otherwise - = (s `unionVarSet` extras, extras `subVarSet` s) - where - extras :: VarSet -- env(s) - extras = nonDetFoldUFM unionVarSet emptyVarSet $ - -- It's OK to use nonDetFoldUFM here because unionVarSet commutes - intersectUFM_C (\x _ -> x) env (getUniqSet s) - -{- -************************************************************************ -* * - Binder swap -* * -************************************************************************ - -Note [Binder swap] -~~~~~~~~~~~~~~~~~~ -The "binder swap" transformation swaps occurrence of the -scrutinee of a case for occurrences of the case-binder: - - (1) case x of b { pi -> ri } - ==> - case x of b { pi -> let x=b in ri } - - (2) case (x |> co) of b { pi -> ri } - ==> - case (x |> co) of b { pi -> let x = b |> sym co in ri } - -In both cases, the trivial 'let' can be eliminated by the -immediately following simplifier pass. - -There are two reasons for making this swap: - -(A) It reduces the number of occurrences of the scrutinee, x. - That in turn might reduce its occurrences to one, so we - can inline it and save an allocation. E.g. - let x = factorial y in case x of b { I# v -> ...x... } - If we replace 'x' by 'b' in the alternative we get - let x = factorial y in case x of b { I# v -> ...b... } - and now we can inline 'x', thus - case (factorial y) of b { I# v -> ...b... } - -(B) The case-binder b has unfolding information; in the - example above we know that b = I# v. That in turn allows - nested cases to simplify. Consider - case x of b { I# v -> - ...(case x of b2 { I# v2 -> rhs })... - If we replace 'x' by 'b' in the alternative we get - case x of b { I# v -> - ...(case b of b2 { I# v2 -> rhs })... - and now it is trivial to simplify the inner case: - case x of b { I# v -> - ...(let b2 = b in rhs)... - - The same can happen even if the scrutinee is a variable - with a cast: see Note [Case of cast] - -In both cases, in a particular alternative (pi -> ri), we only -add the binding if - (a) x occurs free in (pi -> ri) - (ie it occurs in ri, but is not bound in pi) - (b) the pi does not bind b (or the free vars of co) -We need (a) and (b) for the inserted binding to be correct. - -For the alternatives where we inject the binding, we can transfer -all x's OccInfo to b. And that is the point. - -Notice that - * The deliberate shadowing of 'x'. - * That (a) rapidly becomes false, so no bindings are injected. - -The reason for doing these transformations /here in the occurrence -analyser/ is because it allows us to adjust the OccInfo for 'x' and -'b' as we go. - - * Suppose the only occurrences of 'x' are the scrutinee and in the - ri; then this transformation makes it occur just once, and hence - get inlined right away. - - * If instead we do this in the Simplifier, we don't know whether 'x' - is used in ri, so we are forced to pessimistically zap b's OccInfo - even though it is typically dead (ie neither it nor x appear in - the ri). There's nothing actually wrong with zapping it, except - that it's kind of nice to know which variables are dead. My nose - tells me to keep this information as robustly as possible. - -The Maybe (Id,CoreExpr) passed to occAnalAlt is the extra let-binding -{x=b}; it's Nothing if the binder-swap doesn't happen. - -There is a danger though. Consider - let v = x +# y - in case (f v) of w -> ...v...v... -And suppose that (f v) expands to just v. Then we'd like to -use 'w' instead of 'v' in the alternative. But it may be too -late; we may have substituted the (cheap) x+#y for v in the -same simplifier pass that reduced (f v) to v. - -I think this is just too bad. CSE will recover some of it. - -Note [Case of cast] -~~~~~~~~~~~~~~~~~~~ -Consider case (x `cast` co) of b { I# -> - ... (case (x `cast` co) of {...}) ... -We'd like to eliminate the inner case. That is the motivation for -equation (2) in Note [Binder swap]. When we get to the inner case, we -inline x, cancel the casts, and away we go. - -Note [Binder swap on GlobalId scrutinees] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When the scrutinee is a GlobalId we must take care in two ways - - i) In order to *know* whether 'x' occurs free in the RHS, we need its - occurrence info. BUT, we don't gather occurrence info for - GlobalIds. That's the reason for the (small) occ_gbl_scrut env in - OccEnv is for: it says "gather occurrence info for these". - - ii) We must call localiseId on 'x' first, in case it's a GlobalId, or - has an External Name. See, for example, SimplEnv Note [Global Ids in - the substitution]. - -Note [Zap case binders in proxy bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -From the original - case x of cb(dead) { p -> ...x... } -we will get - case x of cb(live) { p -> let x = cb in ...x... } - -Core Lint never expects to find an *occurrence* of an Id marked -as Dead, so we must zap the OccInfo on cb before making the -binding x = cb. See #5028. - -NB: the OccInfo on /occurrences/ really doesn't matter much; the simplifier -doesn't use it. So this is only to satisfy the perhaps-over-picky Lint. - -Historical note [no-case-of-case] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We *used* to suppress the binder-swap in case expressions when --fno-case-of-case is on. Old remarks: - "This happens in the first simplifier pass, - and enhances full laziness. Here's the bad case: - f = \ y -> ...(case x of I# v -> ...(case x of ...) ... ) - If we eliminate the inner case, we trap it inside the I# v -> arm, - which might prevent some full laziness happening. I've seen this - in action in spectral/cichelli/Prog.hs: - [(m,n) | m <- [1..max], n <- [1..max]] - Hence the check for NoCaseOfCase." -However, now the full-laziness pass itself reverses the binder-swap, so this -check is no longer necessary. - -Historical note [Suppressing the case binder-swap] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This old note describes a problem that is also fixed by doing the -binder-swap in OccAnal: - - There is another situation when it might make sense to suppress the - case-expression binde-swap. If we have - - case x of w1 { DEFAULT -> case x of w2 { A -> e1; B -> e2 } - ...other cases .... } - - We'll perform the binder-swap for the outer case, giving - - case x of w1 { DEFAULT -> case w1 of w2 { A -> e1; B -> e2 } - ...other cases .... } - - But there is no point in doing it for the inner case, because w1 can't - be inlined anyway. Furthermore, doing the case-swapping involves - zapping w2's occurrence info (see paragraphs that follow), and that - forces us to bind w2 when doing case merging. So we get - - case x of w1 { A -> let w2 = w1 in e1 - B -> let w2 = w1 in e2 - ...other cases .... } - - This is plain silly in the common case where w2 is dead. - - Even so, I can't see a good way to implement this idea. I tried - not doing the binder-swap if the scrutinee was already evaluated - but that failed big-time: - - data T = MkT !Int - - case v of w { MkT x -> - case x of x1 { I# y1 -> - case x of x2 { I# y2 -> ... - - Notice that because MkT is strict, x is marked "evaluated". But to - eliminate the last case, we must either make sure that x (as well as - x1) has unfolding MkT y1. The straightforward thing to do is to do - the binder-swap. So this whole note is a no-op. - -It's fixed by doing the binder-swap in OccAnal because we can do the -binder-swap unconditionally and still get occurrence analysis -information right. --} - -mkAltEnv :: OccEnv -> CoreExpr -> Id -> (OccEnv, Maybe (Id, CoreExpr)) --- Does three things: a) makes the occ_one_shots = OccVanilla --- b) extends the GlobalScruts if possible --- c) returns a proxy mapping, binding the scrutinee --- to the case binder, if possible -mkAltEnv env@(OccEnv { occ_gbl_scrut = pe }) scrut case_bndr - = case stripTicksTopE (const True) scrut of - Var v -> add_scrut v case_bndr' - Cast (Var v) co -> add_scrut v (Cast case_bndr' (mkSymCo co)) - -- See Note [Case of cast] - _ -> (env { occ_encl = OccVanilla }, Nothing) - - where - add_scrut v rhs - | isGlobalId v = (env { occ_encl = OccVanilla }, Nothing) - | otherwise = ( env { occ_encl = OccVanilla - , occ_gbl_scrut = pe `extendVarSet` v } - , Just (localise v, rhs) ) - -- ToDO: this isGlobalId stuff is a TEMPORARY FIX - -- to avoid the binder-swap for GlobalIds - -- See #16346 - - case_bndr' = Var (zapIdOccInfo case_bndr) - -- See Note [Zap case binders in proxy bindings] - - -- Localise the scrut_var before shadowing it; we're making a - -- new binding for it, and it might have an External Name, or - -- even be a GlobalId; Note [Binder swap on GlobalId scrutinees] - -- Also we don't want any INLINE or NOINLINE pragmas! - localise scrut_var = mkLocalIdOrCoVar (localiseName (idName scrut_var)) - (idType scrut_var) - -{- -************************************************************************ -* * -\subsection[OccurAnal-types]{OccEnv} -* * -************************************************************************ - -Note [UsageDetails and zapping] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -On many occasions, we must modify all gathered occurrence data at once. For -instance, all occurrences underneath a (non-one-shot) lambda set the -'occ_in_lam' flag to become 'True'. We could use 'mapVarEnv' to do this, but -that takes O(n) time and we will do this often---in particular, there are many -places where tail calls are not allowed, and each of these causes all variables -to get marked with 'NoTailCallInfo'. - -Instead of relying on `mapVarEnv`, then, we carry three 'IdEnv's around along -with the 'OccInfoEnv'. Each of these extra environments is a "zapped set" -recording which variables have been zapped in some way. Zapping all occurrence -info then simply means setting the corresponding zapped set to the whole -'OccInfoEnv', a fast O(1) operation. --} - -type OccInfoEnv = IdEnv OccInfo -- A finite map from ids to their usage - -- INVARIANT: never IAmDead - -- (Deadness is signalled by not being in the map at all) - -type ZappedSet = OccInfoEnv -- Values are ignored - -data UsageDetails - = UD { ud_env :: !OccInfoEnv - , ud_z_many :: ZappedSet -- apply 'markMany' to these - , ud_z_in_lam :: ZappedSet -- apply 'markInsideLam' to these - , ud_z_no_tail :: ZappedSet } -- apply 'markNonTailCalled' to these - -- INVARIANT: All three zapped sets are subsets of the OccInfoEnv - -instance Outputable UsageDetails where - ppr ud = ppr (ud_env (flattenUsageDetails ud)) - -------------------- --- UsageDetails API - -andUDs, orUDs - :: UsageDetails -> UsageDetails -> UsageDetails -andUDs = combineUsageDetailsWith addOccInfo -orUDs = combineUsageDetailsWith orOccInfo - -andUDsList :: [UsageDetails] -> UsageDetails -andUDsList = foldl' andUDs emptyDetails - -mkOneOcc :: OccEnv -> Id -> InterestingCxt -> JoinArity -> UsageDetails -mkOneOcc env id int_cxt arity - | isLocalId id - = singleton $ OneOcc { occ_in_lam = NotInsideLam - , occ_one_br = InOneBranch - , occ_int_cxt = int_cxt - , occ_tail = AlwaysTailCalled arity } - | id `elemVarSet` occ_gbl_scrut env - = singleton noOccInfo - - | otherwise - = emptyDetails - where - singleton info = emptyDetails { ud_env = unitVarEnv id info } - -addOneOcc :: UsageDetails -> Id -> OccInfo -> UsageDetails -addOneOcc ud id info - = ud { ud_env = extendVarEnv_C plus_zapped (ud_env ud) id info } - `alterZappedSets` (`delVarEnv` id) - where - plus_zapped old new = doZapping ud id old `addOccInfo` new - -addManyOccsSet :: UsageDetails -> VarSet -> UsageDetails -addManyOccsSet usage id_set = nonDetFoldUniqSet addManyOccs usage id_set - -- It's OK to use nonDetFoldUFM here because addManyOccs commutes - --- Add several occurrences, assumed not to be tail calls -addManyOccs :: Var -> UsageDetails -> UsageDetails -addManyOccs v u | isId v = addOneOcc u v noOccInfo - | otherwise = u - -- Give a non-committal binder info (i.e noOccInfo) because - -- a) Many copies of the specialised thing can appear - -- b) We don't want to substitute a BIG expression inside a RULE - -- even if that's the only occurrence of the thing - -- (Same goes for INLINE.) - -delDetails :: UsageDetails -> Id -> UsageDetails -delDetails ud bndr - = ud `alterUsageDetails` (`delVarEnv` bndr) - -delDetailsList :: UsageDetails -> [Id] -> UsageDetails -delDetailsList ud bndrs - = ud `alterUsageDetails` (`delVarEnvList` bndrs) - -emptyDetails :: UsageDetails -emptyDetails = UD { ud_env = emptyVarEnv - , ud_z_many = emptyVarEnv - , ud_z_in_lam = emptyVarEnv - , ud_z_no_tail = emptyVarEnv } - -isEmptyDetails :: UsageDetails -> Bool -isEmptyDetails = isEmptyVarEnv . ud_env - -markAllMany, markAllInsideLam, markAllNonTailCalled, zapDetails - :: UsageDetails -> UsageDetails -markAllMany ud = ud { ud_z_many = ud_env ud } -markAllInsideLam ud = ud { ud_z_in_lam = ud_env ud } -markAllNonTailCalled ud = ud { ud_z_no_tail = ud_env ud } - -zapDetails = markAllMany . markAllNonTailCalled -- effectively sets to noOccInfo - -lookupDetails :: UsageDetails -> Id -> OccInfo -lookupDetails ud id - | isCoVar id -- We do not currently gather occurrence info (from types) - = noOccInfo -- for CoVars, so we must conservatively mark them as used - -- See Note [DoO not mark CoVars as dead] - | otherwise - = case lookupVarEnv (ud_env ud) id of - Just occ -> doZapping ud id occ - Nothing -> IAmDead - -usedIn :: Id -> UsageDetails -> Bool -v `usedIn` ud = isExportedId v || v `elemVarEnv` ud_env ud - -udFreeVars :: VarSet -> UsageDetails -> VarSet --- Find the subset of bndrs that are mentioned in uds -udFreeVars bndrs ud = restrictUniqSetToUFM bndrs (ud_env ud) - -{- Note [Do not mark CoVars as dead] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's obviously wrong to mark CoVars as dead if they are used. -Currently we don't traverse types to gather usase info for CoVars, -so we had better treat them as having noOccInfo. - -This showed up in #15696 we had something like - case eq_sel d of co -> ...(typeError @(...co...) "urk")... - -Then 'd' was substituted by a dictionary, so the expression -simpified to - case (Coercion <blah>) of co -> ...(typeError @(...co...) "urk")... - -But then the "drop the case altogether" equation of rebuildCase -thought that 'co' was dead, and discarded the entire case. Urk! - -I have no idea how we managed to avoid this pitfall for so long! --} - -------------------- --- Auxiliary functions for UsageDetails implementation - -combineUsageDetailsWith :: (OccInfo -> OccInfo -> OccInfo) - -> UsageDetails -> UsageDetails -> UsageDetails -combineUsageDetailsWith plus_occ_info ud1 ud2 - | isEmptyDetails ud1 = ud2 - | isEmptyDetails ud2 = ud1 - | otherwise - = UD { ud_env = plusVarEnv_C plus_occ_info (ud_env ud1) (ud_env ud2) - , ud_z_many = plusVarEnv (ud_z_many ud1) (ud_z_many ud2) - , ud_z_in_lam = plusVarEnv (ud_z_in_lam ud1) (ud_z_in_lam ud2) - , ud_z_no_tail = plusVarEnv (ud_z_no_tail ud1) (ud_z_no_tail ud2) } - -doZapping :: UsageDetails -> Var -> OccInfo -> OccInfo -doZapping ud var occ - = doZappingByUnique ud (varUnique var) occ - -doZappingByUnique :: UsageDetails -> Unique -> OccInfo -> OccInfo -doZappingByUnique ud uniq - = (if | in_subset ud_z_many -> markMany - | in_subset ud_z_in_lam -> markInsideLam - | otherwise -> id) . - (if | in_subset ud_z_no_tail -> markNonTailCalled - | otherwise -> id) - where - in_subset field = uniq `elemVarEnvByKey` field ud - -alterZappedSets :: UsageDetails -> (ZappedSet -> ZappedSet) -> UsageDetails -alterZappedSets ud f - = ud { ud_z_many = f (ud_z_many ud) - , ud_z_in_lam = f (ud_z_in_lam ud) - , ud_z_no_tail = f (ud_z_no_tail ud) } - -alterUsageDetails :: UsageDetails -> (OccInfoEnv -> OccInfoEnv) -> UsageDetails -alterUsageDetails ud f - = ud { ud_env = f (ud_env ud) } - `alterZappedSets` f - -flattenUsageDetails :: UsageDetails -> UsageDetails -flattenUsageDetails ud - = ud { ud_env = mapUFM_Directly (doZappingByUnique ud) (ud_env ud) } - `alterZappedSets` const emptyVarEnv - -------------------- --- See Note [Adjusting right-hand sides] -adjustRhsUsage :: Maybe JoinArity -> RecFlag - -> [CoreBndr] -- Outer lambdas, AFTER occ anal - -> UsageDetails -> UsageDetails -adjustRhsUsage mb_join_arity rec_flag bndrs usage - = maybe_mark_lam (maybe_drop_tails usage) - where - maybe_mark_lam ud | one_shot = ud - | otherwise = markAllInsideLam ud - maybe_drop_tails ud | exact_join = ud - | otherwise = markAllNonTailCalled ud - - one_shot = case mb_join_arity of - Just join_arity - | isRec rec_flag -> False - | otherwise -> all isOneShotBndr (drop join_arity bndrs) - Nothing -> all isOneShotBndr bndrs - - exact_join = case mb_join_arity of - Just join_arity -> bndrs `lengthIs` join_arity - _ -> False - -type IdWithOccInfo = Id - -tagLamBinders :: UsageDetails -- Of scope - -> [Id] -- Binders - -> (UsageDetails, -- Details with binders removed - [IdWithOccInfo]) -- Tagged binders -tagLamBinders usage binders - = usage' `seq` (usage', bndrs') - where - (usage', bndrs') = mapAccumR tagLamBinder usage binders - -tagLamBinder :: UsageDetails -- Of scope - -> Id -- Binder - -> (UsageDetails, -- Details with binder removed - IdWithOccInfo) -- Tagged binders --- Used for lambda and case binders --- It copes with the fact that lambda bindings can have a --- stable unfolding, used for join points -tagLamBinder usage bndr - = (usage2, bndr') - where - occ = lookupDetails usage bndr - bndr' = setBinderOcc (markNonTailCalled occ) bndr - -- Don't try to make an argument into a join point - usage1 = usage `delDetails` bndr - usage2 | isId bndr = addManyOccsSet usage1 (idUnfoldingVars bndr) - -- This is effectively the RHS of a - -- non-join-point binding, so it's okay to use - -- addManyOccsSet, which assumes no tail calls - | otherwise = usage1 - -tagNonRecBinder :: TopLevelFlag -- At top level? - -> UsageDetails -- Of scope - -> CoreBndr -- Binder - -> (UsageDetails, -- Details with binder removed - IdWithOccInfo) -- Tagged binder - -tagNonRecBinder lvl usage binder - = let - occ = lookupDetails usage binder - will_be_join = decideJoinPointHood lvl usage [binder] - occ' | will_be_join = -- must already be marked AlwaysTailCalled - ASSERT(isAlwaysTailCalled occ) occ - | otherwise = markNonTailCalled occ - binder' = setBinderOcc occ' binder - usage' = usage `delDetails` binder - in - usage' `seq` (usage', binder') - -tagRecBinders :: TopLevelFlag -- At top level? - -> UsageDetails -- Of body of let ONLY - -> [(CoreBndr, -- Binder - UsageDetails, -- RHS usage details - [CoreBndr])] -- Lambdas in new RHS - -> (UsageDetails, -- Adjusted details for whole scope, - -- with binders removed - [IdWithOccInfo]) -- Tagged binders --- Substantially more complicated than non-recursive case. Need to adjust RHS --- details *before* tagging binders (because the tags depend on the RHSes). -tagRecBinders lvl body_uds triples - = let - (bndrs, rhs_udss, _) = unzip3 triples - - -- 1. Determine join-point-hood of whole group, as determined by - -- the *unadjusted* usage details - unadj_uds = foldr andUDs body_uds rhs_udss - will_be_joins = decideJoinPointHood lvl unadj_uds bndrs - - -- 2. Adjust usage details of each RHS, taking into account the - -- join-point-hood decision - rhs_udss' = map adjust triples - adjust (bndr, rhs_uds, rhs_bndrs) - = adjustRhsUsage mb_join_arity Recursive rhs_bndrs rhs_uds - where - -- Can't use willBeJoinId_maybe here because we haven't tagged the - -- binder yet (the tag depends on these adjustments!) - mb_join_arity - | will_be_joins - , let occ = lookupDetails unadj_uds bndr - , AlwaysTailCalled arity <- tailCallInfo occ - = Just arity - | otherwise - = ASSERT(not will_be_joins) -- Should be AlwaysTailCalled if - Nothing -- we are making join points! - - -- 3. Compute final usage details from adjusted RHS details - adj_uds = foldr andUDs body_uds rhs_udss' - - -- 4. Tag each binder with its adjusted details - bndrs' = [ setBinderOcc (lookupDetails adj_uds bndr) bndr - | bndr <- bndrs ] - - -- 5. Drop the binders from the adjusted details and return - usage' = adj_uds `delDetailsList` bndrs - in - (usage', bndrs') - -setBinderOcc :: OccInfo -> CoreBndr -> CoreBndr -setBinderOcc occ_info bndr - | isTyVar bndr = bndr - | isExportedId bndr = if isManyOccs (idOccInfo bndr) - then bndr - else setIdOccInfo bndr noOccInfo - -- Don't use local usage info for visible-elsewhere things - -- BUT *do* erase any IAmALoopBreaker annotation, because we're - -- about to re-generate it and it shouldn't be "sticky" - - | otherwise = setIdOccInfo bndr occ_info - --- | Decide whether some bindings should be made into join points or not. --- Returns `False` if they can't be join points. Note that it's an --- all-or-nothing decision, as if multiple binders are given, they're --- assumed to be mutually recursive. --- --- It must, however, be a final decision. If we say "True" for 'f', --- and then subsequently decide /not/ make 'f' into a join point, then --- the decision about another binding 'g' might be invalidated if (say) --- 'f' tail-calls 'g'. --- --- See Note [Invariants on join points] in GHC.Core. -decideJoinPointHood :: TopLevelFlag -> UsageDetails - -> [CoreBndr] - -> Bool -decideJoinPointHood TopLevel _ _ - = False -decideJoinPointHood NotTopLevel usage bndrs - | isJoinId (head bndrs) - = WARN(not all_ok, text "OccurAnal failed to rediscover join point(s):" <+> - ppr bndrs) - all_ok - | otherwise - = all_ok - where - -- See Note [Invariants on join points]; invariants cited by number below. - -- Invariant 2 is always satisfiable by the simplifier by eta expansion. - all_ok = -- Invariant 3: Either all are join points or none are - all ok bndrs - - ok bndr - | -- Invariant 1: Only tail calls, all same join arity - AlwaysTailCalled arity <- tailCallInfo (lookupDetails usage bndr) - - , -- Invariant 1 as applied to LHSes of rules - all (ok_rule arity) (idCoreRules bndr) - - -- Invariant 2a: stable unfoldings - -- See Note [Join points and INLINE pragmas] - , ok_unfolding arity (realIdUnfolding bndr) - - -- Invariant 4: Satisfies polymorphism rule - , isValidJoinPointType arity (idType bndr) - = True - - | otherwise - = False - - ok_rule _ BuiltinRule{} = False -- only possible with plugin shenanigans - ok_rule join_arity (Rule { ru_args = args }) - = args `lengthIs` join_arity - -- Invariant 1 as applied to LHSes of rules - - -- ok_unfolding returns False if we should /not/ convert a non-join-id - -- into a join-id, even though it is AlwaysTailCalled - ok_unfolding join_arity (CoreUnfolding { uf_src = src, uf_tmpl = rhs }) - = not (isStableSource src && join_arity > joinRhsArity rhs) - ok_unfolding _ (DFunUnfolding {}) - = False - ok_unfolding _ _ - = True - -willBeJoinId_maybe :: CoreBndr -> Maybe JoinArity -willBeJoinId_maybe bndr - = case tailCallInfo (idOccInfo bndr) of - AlwaysTailCalled arity -> Just arity - _ -> isJoinId_maybe bndr - - -{- Note [Join points and INLINE pragmas] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - f x = let g = \x. not -- Arity 1 - {-# INLINE g #-} - in case x of - A -> g True True - B -> g True False - C -> blah2 - -Here 'g' is always tail-called applied to 2 args, but the stable -unfolding captured by the INLINE pragma has arity 1. If we try to -convert g to be a join point, its unfolding will still have arity 1 -(since it is stable, and we don't meddle with stable unfoldings), and -Lint will complain (see Note [Invariants on join points], (2a), in -GHC.Core. #13413. - -Moreover, since g is going to be inlined anyway, there is no benefit -from making it a join point. - -If it is recursive, and uselessly marked INLINE, this will stop us -making it a join point, which is annoying. But occasionally -(notably in class methods; see Note [Instances and loop breakers] in -TcInstDcls) we mark recursive things as INLINE but the recursion -unravels; so ignoring INLINE pragmas on recursive things isn't good -either. - -See Invariant 2a of Note [Invariants on join points] in GHC.Core - - -************************************************************************ -* * -\subsection{Operations over OccInfo} -* * -************************************************************************ --} - -markMany, markInsideLam, markNonTailCalled :: OccInfo -> OccInfo - -markMany IAmDead = IAmDead -markMany occ = ManyOccs { occ_tail = occ_tail occ } - -markInsideLam occ@(OneOcc {}) = occ { occ_in_lam = IsInsideLam } -markInsideLam occ = occ - -markNonTailCalled IAmDead = IAmDead -markNonTailCalled occ = occ { occ_tail = NoTailCallInfo } - -addOccInfo, orOccInfo :: OccInfo -> OccInfo -> OccInfo - -addOccInfo a1 a2 = ASSERT( not (isDeadOcc a1 || isDeadOcc a2) ) - ManyOccs { occ_tail = tailCallInfo a1 `andTailCallInfo` - tailCallInfo a2 } - -- Both branches are at least One - -- (Argument is never IAmDead) - --- (orOccInfo orig new) is used --- when combining occurrence info from branches of a case - -orOccInfo (OneOcc { occ_in_lam = in_lam1, occ_int_cxt = int_cxt1 - , occ_tail = tail1 }) - (OneOcc { occ_in_lam = in_lam2, occ_int_cxt = int_cxt2 - , occ_tail = tail2 }) - = OneOcc { occ_one_br = MultipleBranches -- because it occurs in both branches - , occ_in_lam = in_lam1 `mappend` in_lam2 - , occ_int_cxt = int_cxt1 `mappend` int_cxt2 - , occ_tail = tail1 `andTailCallInfo` tail2 } - -orOccInfo a1 a2 = ASSERT( not (isDeadOcc a1 || isDeadOcc a2) ) - ManyOccs { occ_tail = tailCallInfo a1 `andTailCallInfo` - tailCallInfo a2 } - -andTailCallInfo :: TailCallInfo -> TailCallInfo -> TailCallInfo -andTailCallInfo info@(AlwaysTailCalled arity1) (AlwaysTailCalled arity2) - | arity1 == arity2 = info -andTailCallInfo _ _ = NoTailCallInfo diff --git a/compiler/simplCore/SAT.hs b/compiler/simplCore/SAT.hs deleted file mode 100644 index 23fdff540b..0000000000 --- a/compiler/simplCore/SAT.hs +++ /dev/null @@ -1,433 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - - -************************************************************************ - - Static Argument Transformation pass - -************************************************************************ - -May be seen as removing invariants from loops: -Arguments of recursive functions that do not change in recursive -calls are removed from the recursion, which is done locally -and only passes the arguments which effectively change. - -Example: -map = /\ ab -> \f -> \xs -> case xs of - [] -> [] - (a:b) -> f a : map f b - -as map is recursively called with the same argument f (unmodified) -we transform it to - -map = /\ ab -> \f -> \xs -> let map' ys = case ys of - [] -> [] - (a:b) -> f a : map' b - in map' xs - -Notice that for a compiler that uses lambda lifting this is -useless as map' will be transformed back to what map was. - -We could possibly do the same for big lambdas, but we don't as -they will eventually be removed in later stages of the compiler, -therefore there is no penalty in keeping them. - -We only apply the SAT when the number of static args is > 2. This -produces few bad cases. See - should_transform -in saTransform. - -Here are the headline nofib results: - Size Allocs Runtime -Min +0.0% -13.7% -21.4% -Max +0.1% +0.0% +5.4% -Geometric Mean +0.0% -0.2% -6.9% - -The previous patch, to fix polymorphic floatout demand signatures, is -essential to make this work well! --} - -{-# LANGUAGE CPP #-} -module SAT ( doStaticArgs ) where - -import GhcPrelude - -import Var -import GHC.Core -import GHC.Core.Utils -import GHC.Core.Type -import GHC.Core.Coercion -import Id -import Name -import VarEnv -import UniqSupply -import Util -import UniqFM -import VarSet -import Unique -import UniqSet -import Outputable - -import Data.List (mapAccumL) -import FastString - -#include "HsVersions.h" - -doStaticArgs :: UniqSupply -> CoreProgram -> CoreProgram -doStaticArgs us binds = snd $ mapAccumL sat_bind_threaded_us us binds - where - sat_bind_threaded_us us bind = - let (us1, us2) = splitUniqSupply us - in (us1, fst $ runSAT us2 (satBind bind emptyUniqSet)) - --- We don't bother to SAT recursive groups since it can lead --- to massive code expansion: see Andre Santos' thesis for details. --- This means we only apply the actual SAT to Rec groups of one element, --- but we want to recurse into the others anyway to discover other binds -satBind :: CoreBind -> IdSet -> SatM (CoreBind, IdSATInfo) -satBind (NonRec binder expr) interesting_ids = do - (expr', sat_info_expr, expr_app) <- satExpr expr interesting_ids - return (NonRec binder expr', finalizeApp expr_app sat_info_expr) -satBind (Rec [(binder, rhs)]) interesting_ids = do - let interesting_ids' = interesting_ids `addOneToUniqSet` binder - (rhs_binders, rhs_body) = collectBinders rhs - (rhs_body', sat_info_rhs_body) <- satTopLevelExpr rhs_body interesting_ids' - let sat_info_rhs_from_args = unitVarEnv binder (bindersToSATInfo rhs_binders) - sat_info_rhs' = mergeIdSATInfo sat_info_rhs_from_args sat_info_rhs_body - - shadowing = binder `elementOfUniqSet` interesting_ids - sat_info_rhs'' = if shadowing - then sat_info_rhs' `delFromUFM` binder -- For safety - else sat_info_rhs' - - bind' <- saTransformMaybe binder (lookupUFM sat_info_rhs' binder) - rhs_binders rhs_body' - return (bind', sat_info_rhs'') -satBind (Rec pairs) interesting_ids = do - let (binders, rhss) = unzip pairs - rhss_SATed <- mapM (\e -> satTopLevelExpr e interesting_ids) rhss - let (rhss', sat_info_rhss') = unzip rhss_SATed - return (Rec (zipEqual "satBind" binders rhss'), mergeIdSATInfos sat_info_rhss') - -data App = VarApp Id | TypeApp Type | CoApp Coercion -data Staticness a = Static a | NotStatic - -type IdAppInfo = (Id, SATInfo) - -type SATInfo = [Staticness App] -type IdSATInfo = IdEnv SATInfo -emptyIdSATInfo :: IdSATInfo -emptyIdSATInfo = emptyUFM - -{- -pprIdSATInfo id_sat_info = vcat (map pprIdAndSATInfo (Map.toList id_sat_info)) - where pprIdAndSATInfo (v, sat_info) = hang (ppr v <> colon) 4 (pprSATInfo sat_info) --} - -pprSATInfo :: SATInfo -> SDoc -pprSATInfo staticness = hcat $ map pprStaticness staticness - -pprStaticness :: Staticness App -> SDoc -pprStaticness (Static (VarApp _)) = text "SV" -pprStaticness (Static (TypeApp _)) = text "ST" -pprStaticness (Static (CoApp _)) = text "SC" -pprStaticness NotStatic = text "NS" - - -mergeSATInfo :: SATInfo -> SATInfo -> SATInfo -mergeSATInfo l r = zipWith mergeSA l r - where - mergeSA NotStatic _ = NotStatic - mergeSA _ NotStatic = NotStatic - mergeSA (Static (VarApp v)) (Static (VarApp v')) - | v == v' = Static (VarApp v) - | otherwise = NotStatic - mergeSA (Static (TypeApp t)) (Static (TypeApp t')) - | t `eqType` t' = Static (TypeApp t) - | otherwise = NotStatic - mergeSA (Static (CoApp c)) (Static (CoApp c')) - | c `eqCoercion` c' = Static (CoApp c) - | otherwise = NotStatic - mergeSA _ _ = pprPanic "mergeSATInfo" $ - text "Left:" - <> pprSATInfo l <> text ", " - <> text "Right:" - <> pprSATInfo r - -mergeIdSATInfo :: IdSATInfo -> IdSATInfo -> IdSATInfo -mergeIdSATInfo = plusUFM_C mergeSATInfo - -mergeIdSATInfos :: [IdSATInfo] -> IdSATInfo -mergeIdSATInfos = foldl' mergeIdSATInfo emptyIdSATInfo - -bindersToSATInfo :: [Id] -> SATInfo -bindersToSATInfo vs = map (Static . binderToApp) vs - where binderToApp v | isId v = VarApp v - | isTyVar v = TypeApp $ mkTyVarTy v - | otherwise = CoApp $ mkCoVarCo v - -finalizeApp :: Maybe IdAppInfo -> IdSATInfo -> IdSATInfo -finalizeApp Nothing id_sat_info = id_sat_info -finalizeApp (Just (v, sat_info')) id_sat_info = - let sat_info'' = case lookupUFM id_sat_info v of - Nothing -> sat_info' - Just sat_info -> mergeSATInfo sat_info sat_info' - in extendVarEnv id_sat_info v sat_info'' - -satTopLevelExpr :: CoreExpr -> IdSet -> SatM (CoreExpr, IdSATInfo) -satTopLevelExpr expr interesting_ids = do - (expr', sat_info_expr, expr_app) <- satExpr expr interesting_ids - return (expr', finalizeApp expr_app sat_info_expr) - -satExpr :: CoreExpr -> IdSet -> SatM (CoreExpr, IdSATInfo, Maybe IdAppInfo) -satExpr var@(Var v) interesting_ids = do - let app_info = if v `elementOfUniqSet` interesting_ids - then Just (v, []) - else Nothing - return (var, emptyIdSATInfo, app_info) - -satExpr lit@(Lit _) _ = do - return (lit, emptyIdSATInfo, Nothing) - -satExpr (Lam binders body) interesting_ids = do - (body', sat_info, this_app) <- satExpr body interesting_ids - return (Lam binders body', finalizeApp this_app sat_info, Nothing) - -satExpr (App fn arg) interesting_ids = do - (fn', sat_info_fn, fn_app) <- satExpr fn interesting_ids - let satRemainder = boring fn' sat_info_fn - case fn_app of - Nothing -> satRemainder Nothing - Just (fn_id, fn_app_info) -> - -- TODO: remove this use of append somehow (use a data structure with O(1) append but a left-to-right kind of interface) - let satRemainderWithStaticness arg_staticness = satRemainder $ Just (fn_id, fn_app_info ++ [arg_staticness]) - in case arg of - Type t -> satRemainderWithStaticness $ Static (TypeApp t) - Coercion c -> satRemainderWithStaticness $ Static (CoApp c) - Var v -> satRemainderWithStaticness $ Static (VarApp v) - _ -> satRemainderWithStaticness $ NotStatic - where - boring :: CoreExpr -> IdSATInfo -> Maybe IdAppInfo -> SatM (CoreExpr, IdSATInfo, Maybe IdAppInfo) - boring fn' sat_info_fn app_info = - do (arg', sat_info_arg, arg_app) <- satExpr arg interesting_ids - let sat_info_arg' = finalizeApp arg_app sat_info_arg - sat_info = mergeIdSATInfo sat_info_fn sat_info_arg' - return (App fn' arg', sat_info, app_info) - -satExpr (Case expr bndr ty alts) interesting_ids = do - (expr', sat_info_expr, expr_app) <- satExpr expr interesting_ids - let sat_info_expr' = finalizeApp expr_app sat_info_expr - - zipped_alts' <- mapM satAlt alts - let (alts', sat_infos_alts) = unzip zipped_alts' - return (Case expr' bndr ty alts', mergeIdSATInfo sat_info_expr' (mergeIdSATInfos sat_infos_alts), Nothing) - where - satAlt (con, bndrs, expr) = do - (expr', sat_info_expr) <- satTopLevelExpr expr interesting_ids - return ((con, bndrs, expr'), sat_info_expr) - -satExpr (Let bind body) interesting_ids = do - (body', sat_info_body, body_app) <- satExpr body interesting_ids - (bind', sat_info_bind) <- satBind bind interesting_ids - return (Let bind' body', mergeIdSATInfo sat_info_body sat_info_bind, body_app) - -satExpr (Tick tickish expr) interesting_ids = do - (expr', sat_info_expr, expr_app) <- satExpr expr interesting_ids - return (Tick tickish expr', sat_info_expr, expr_app) - -satExpr ty@(Type _) _ = do - return (ty, emptyIdSATInfo, Nothing) - -satExpr co@(Coercion _) _ = do - return (co, emptyIdSATInfo, Nothing) - -satExpr (Cast expr coercion) interesting_ids = do - (expr', sat_info_expr, expr_app) <- satExpr expr interesting_ids - return (Cast expr' coercion, sat_info_expr, expr_app) - -{- -************************************************************************ - - Static Argument Transformation Monad - -************************************************************************ --} - -type SatM result = UniqSM result - -runSAT :: UniqSupply -> SatM a -> a -runSAT = initUs_ - -newUnique :: SatM Unique -newUnique = getUniqueM - -{- -************************************************************************ - - Static Argument Transformation Monad - -************************************************************************ - -To do the transformation, the game plan is to: - -1. Create a small nonrecursive RHS that takes the - original arguments to the function but discards - the ones that are static and makes a call to the - SATed version with the remainder. We intend that - this will be inlined later, removing the overhead - -2. Bind this nonrecursive RHS over the original body - WITH THE SAME UNIQUE as the original body so that - any recursive calls to the original now go via - the small wrapper - -3. Rebind the original function to a new one which contains - our SATed function and just makes a call to it: - we call the thing making this call the local body - -Example: transform this - - map :: forall a b. (a->b) -> [a] -> [b] - map = /\ab. \(f:a->b) (as:[a]) -> body[map] -to - map :: forall a b. (a->b) -> [a] -> [b] - map = /\ab. \(f:a->b) (as:[a]) -> - letrec map' :: [a] -> [b] - -- The "worker function - map' = \(as:[a]) -> - let map :: forall a' b'. (a -> b) -> [a] -> [b] - -- The "shadow function - map = /\a'b'. \(f':(a->b) (as:[a]). - map' as - in body[map] - in map' as - -Note [Shadow binding] -~~~~~~~~~~~~~~~~~~~~~ -The calls to the inner map inside body[map] should get inlined -by the local re-binding of 'map'. We call this the "shadow binding". - -But we can't use the original binder 'map' unchanged, because -it might be exported, in which case the shadow binding won't be -discarded as dead code after it is inlined. - -So we use a hack: we make a new SysLocal binder with the *same* unique -as binder. (Another alternative would be to reset the export flag.) - -Note [Binder type capture] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -Notice that in the inner map (the "shadow function"), the static arguments -are discarded -- it's as if they were underscores. Instead, mentions -of these arguments (notably in the types of dynamic arguments) are bound -by the *outer* lambdas of the main function. So we must make up fresh -names for the static arguments so that they do not capture variables -mentioned in the types of dynamic args. - -In the map example, the shadow function must clone the static type -argument a,b, giving a',b', to ensure that in the \(as:[a]), the 'a' -is bound by the outer forall. We clone f' too for consistency, but -that doesn't matter either way because static Id arguments aren't -mentioned in the shadow binding at all. - -If we don't we get something like this: - -[Exported] -[Arity 3] -GHC.Base.until = - \ (@ a_aiK) - (p_a6T :: a_aiK -> GHC.Types.Bool) - (f_a6V :: a_aiK -> a_aiK) - (x_a6X :: a_aiK) -> - letrec { - sat_worker_s1aU :: a_aiK -> a_aiK - [] - sat_worker_s1aU = - \ (x_a6X :: a_aiK) -> - let { - sat_shadow_r17 :: forall a_a3O. - (a_a3O -> GHC.Types.Bool) -> (a_a3O -> a_a3O) -> a_a3O -> a_a3O - [] - sat_shadow_r17 = - \ (@ a_aiK) - (p_a6T :: a_aiK -> GHC.Types.Bool) - (f_a6V :: a_aiK -> a_aiK) - (x_a6X :: a_aiK) -> - sat_worker_s1aU x_a6X } in - case p_a6T x_a6X of wild_X3y [ALWAYS Dead Nothing] { - GHC.Types.False -> GHC.Base.until @ a_aiK p_a6T f_a6V (f_a6V x_a6X); - GHC.Types.True -> x_a6X - }; } in - sat_worker_s1aU x_a6X - -Where sat_shadow has captured the type variables of x_a6X etc as it has a a_aiK -type argument. This is bad because it means the application sat_worker_s1aU x_a6X -is not well typed. --} - -saTransformMaybe :: Id -> Maybe SATInfo -> [Id] -> CoreExpr -> SatM CoreBind -saTransformMaybe binder maybe_arg_staticness rhs_binders rhs_body - | Just arg_staticness <- maybe_arg_staticness - , should_transform arg_staticness - = saTransform binder arg_staticness rhs_binders rhs_body - | otherwise - = return (Rec [(binder, mkLams rhs_binders rhs_body)]) - where - should_transform staticness = n_static_args > 1 -- THIS IS THE DECISION POINT - where - n_static_args = count isStaticValue staticness - -saTransform :: Id -> SATInfo -> [Id] -> CoreExpr -> SatM CoreBind -saTransform binder arg_staticness rhs_binders rhs_body - = do { shadow_lam_bndrs <- mapM clone binders_w_staticness - ; uniq <- newUnique - ; return (NonRec binder (mk_new_rhs uniq shadow_lam_bndrs)) } - where - -- Running example: foldr - -- foldr \alpha \beta c n xs = e, for some e - -- arg_staticness = [Static TypeApp, Static TypeApp, Static VarApp, Static VarApp, NonStatic] - -- rhs_binders = [\alpha, \beta, c, n, xs] - -- rhs_body = e - - binders_w_staticness = rhs_binders `zip` (arg_staticness ++ repeat NotStatic) - -- Any extra args are assumed NotStatic - - non_static_args :: [Var] - -- non_static_args = [xs] - -- rhs_binders_without_type_capture = [\alpha', \beta', c, n, xs] - non_static_args = [v | (v, NotStatic) <- binders_w_staticness] - - clone (bndr, NotStatic) = return bndr - clone (bndr, _ ) = do { uniq <- newUnique - ; return (setVarUnique bndr uniq) } - - -- new_rhs = \alpha beta c n xs -> - -- let sat_worker = \xs -> let sat_shadow = \alpha' beta' c n xs -> - -- sat_worker xs - -- in e - -- in sat_worker xs - mk_new_rhs uniq shadow_lam_bndrs - = mkLams rhs_binders $ - Let (Rec [(rec_body_bndr, rec_body)]) - local_body - where - local_body = mkVarApps (Var rec_body_bndr) non_static_args - - rec_body = mkLams non_static_args $ - Let (NonRec shadow_bndr shadow_rhs) rhs_body - - -- See Note [Binder type capture] - shadow_rhs = mkLams shadow_lam_bndrs local_body - -- nonrec_rhs = \alpha' beta' c n xs -> sat_worker xs - - rec_body_bndr = mkSysLocal (fsLit "sat_worker") uniq (exprType rec_body) - -- rec_body_bndr = sat_worker - - -- See Note [Shadow binding]; make a SysLocal - shadow_bndr = mkSysLocal (occNameFS (getOccName binder)) - (idUnique binder) - (exprType shadow_rhs) - -isStaticValue :: Staticness App -> Bool -isStaticValue (Static (VarApp _)) = True -isStaticValue _ = False diff --git a/compiler/simplCore/SetLevels.hs b/compiler/simplCore/SetLevels.hs deleted file mode 100644 index 7132b2f596..0000000000 --- a/compiler/simplCore/SetLevels.hs +++ /dev/null @@ -1,1771 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - -\section{SetLevels} - - *************************** - Overview - *************************** - -1. We attach binding levels to Core bindings, in preparation for floating - outwards (@FloatOut@). - -2. We also let-ify many expressions (notably case scrutinees), so they - will have a fighting chance of being floated sensible. - -3. Note [Need for cloning during float-out] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - We clone the binders of any floatable let-binding, so that when it is - floated out it will be unique. Example - (let x=2 in x) + (let x=3 in x) - we must clone before floating so we get - let x1=2 in - let x2=3 in - x1+x2 - - NOTE: this can't be done using the uniqAway idea, because the variable - must be unique in the whole program, not just its current scope, - because two variables in different scopes may float out to the - same top level place - - NOTE: Very tiresomely, we must apply this substitution to - the rules stored inside a variable too. - - We do *not* clone top-level bindings, because some of them must not change, - but we *do* clone bindings that are heading for the top level - -4. Note [Binder-swap during float-out] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - In the expression - case x of wild { p -> ...wild... } - we substitute x for wild in the RHS of the case alternatives: - case x of wild { p -> ...x... } - This means that a sub-expression involving x is not "trapped" inside the RHS. - And it's not inconvenient because we already have a substitution. - - Note that this is EXACTLY BACKWARDS from the what the simplifier does. - The simplifier tries to get rid of occurrences of x, in favour of wild, - in the hope that there will only be one remaining occurrence of x, namely - the scrutinee of the case, and we can inline it. --} - -{-# LANGUAGE CPP, MultiWayIf #-} - -{-# OPTIONS_GHC -Wno-incomplete-uni-patterns #-} -module SetLevels ( - setLevels, - - Level(..), LevelType(..), tOP_LEVEL, isJoinCeilLvl, asJoinCeilLvl, - LevelledBind, LevelledExpr, LevelledBndr, - FloatSpec(..), floatSpecLevel, - - incMinorLvl, ltMajLvl, ltLvl, isTopLvl - ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Core -import CoreMonad ( FloatOutSwitches(..) ) -import GHC.Core.Utils ( exprType, exprIsHNF - , exprOkForSpeculation - , exprIsTopLevelBindable - , isExprLevPoly - , collectMakeStaticArgs - ) -import GHC.Core.Arity ( exprBotStrictness_maybe ) -import GHC.Core.FVs -- all of it -import GHC.Core.Subst -import GHC.Core.Make ( sortQuantVars ) - -import Id -import IdInfo -import Var -import VarSet -import UniqSet ( nonDetFoldUniqSet ) -import UniqDSet ( getUniqDSet ) -import VarEnv -import Literal ( litIsTrivial ) -import Demand ( StrictSig, Demand, isStrictDmd, splitStrictSig, increaseStrictSigArity ) -import Cpr ( mkCprSig, botCpr ) -import Name ( getOccName, mkSystemVarName ) -import OccName ( occNameString ) -import GHC.Core.Type ( Type, mkLamTypes, splitTyConApp_maybe, tyCoVarsOfType - , mightBeUnliftedType, closeOverKindsDSet ) -import BasicTypes ( Arity, RecFlag(..), isRec ) -import GHC.Core.DataCon ( dataConOrigResTy ) -import TysWiredIn -import UniqSupply -import Util -import Outputable -import FastString -import UniqDFM -import FV -import Data.Maybe -import MonadUtils ( mapAccumLM ) - -{- -************************************************************************ -* * -\subsection{Level numbers} -* * -************************************************************************ --} - -type LevelledExpr = TaggedExpr FloatSpec -type LevelledBind = TaggedBind FloatSpec -type LevelledBndr = TaggedBndr FloatSpec - -data Level = Level Int -- Level number of enclosing lambdas - Int -- Number of big-lambda and/or case expressions and/or - -- context boundaries between - -- here and the nearest enclosing lambda - LevelType -- Binder or join ceiling? -data LevelType = BndrLvl | JoinCeilLvl deriving (Eq) - -data FloatSpec - = FloatMe Level -- Float to just inside the binding - -- tagged with this level - | StayPut Level -- Stay where it is; binding is - -- tagged with this level - -floatSpecLevel :: FloatSpec -> Level -floatSpecLevel (FloatMe l) = l -floatSpecLevel (StayPut l) = l - -{- -The {\em level number} on a (type-)lambda-bound variable is the -nesting depth of the (type-)lambda which binds it. The outermost lambda -has level 1, so (Level 0 0) means that the variable is bound outside any lambda. - -On an expression, it's the maximum level number of its free -(type-)variables. On a let(rec)-bound variable, it's the level of its -RHS. On a case-bound variable, it's the number of enclosing lambdas. - -Top-level variables: level~0. Those bound on the RHS of a top-level -definition but ``before'' a lambda; e.g., the \tr{x} in (levels shown -as ``subscripts'')... -\begin{verbatim} -a_0 = let b_? = ... in - x_1 = ... b ... in ... -\end{verbatim} - -The main function @lvlExpr@ carries a ``context level'' (@le_ctxt_lvl@). -That's meant to be the level number of the enclosing binder in the -final (floated) program. If the level number of a sub-expression is -less than that of the context, then it might be worth let-binding the -sub-expression so that it will indeed float. - -If you can float to level @Level 0 0@ worth doing so because then your -allocation becomes static instead of dynamic. We always start with -context @Level 0 0@. - - -Note [FloatOut inside INLINE] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -@InlineCtxt@ very similar to @Level 0 0@, but is used for one purpose: -to say "don't float anything out of here". That's exactly what we -want for the body of an INLINE, where we don't want to float anything -out at all. See notes with lvlMFE below. - -But, check this out: - --- At one time I tried the effect of not floating anything out of an InlineMe, --- but it sometimes works badly. For example, consider PrelArr.done. It --- has the form __inline (\d. e) --- where e doesn't mention d. If we float this to --- __inline (let x = e in \d. x) --- things are bad. The inliner doesn't even inline it because it doesn't look --- like a head-normal form. So it seems a lesser evil to let things float. --- In SetLevels we do set the context to (Level 0 0) when we get to an InlineMe --- which discourages floating out. - -So the conclusion is: don't do any floating at all inside an InlineMe. -(In the above example, don't float the {x=e} out of the \d.) - -One particular case is that of workers: we don't want to float the -call to the worker outside the wrapper, otherwise the worker might get -inlined into the floated expression, and an importing module won't see -the worker at all. - -Note [Join ceiling] -~~~~~~~~~~~~~~~~~~~ -Join points can't float very far; too far, and they can't remain join points -So, suppose we have: - - f x = (joinrec j y = ... x ... in jump j x) + 1 - -One may be tempted to float j out to the top of f's RHS, but then the jump -would not be a tail call. Thus we keep track of a level called the *join -ceiling* past which join points are not allowed to float. - -The troublesome thing is that, unlike most levels to which something might -float, there is not necessarily an identifier to which the join ceiling is -attached. Fortunately, if something is to be floated to a join ceiling, it must -be dropped at the *nearest* join ceiling. Thus each level is marked as to -whether it is a join ceiling, so that FloatOut can tell which binders are being -floated to the nearest join ceiling and which to a particular binder (or set of -binders). --} - -instance Outputable FloatSpec where - ppr (FloatMe l) = char 'F' <> ppr l - ppr (StayPut l) = ppr l - -tOP_LEVEL :: Level -tOP_LEVEL = Level 0 0 BndrLvl - -incMajorLvl :: Level -> Level -incMajorLvl (Level major _ _) = Level (major + 1) 0 BndrLvl - -incMinorLvl :: Level -> Level -incMinorLvl (Level major minor _) = Level major (minor+1) BndrLvl - -asJoinCeilLvl :: Level -> Level -asJoinCeilLvl (Level major minor _) = Level major minor JoinCeilLvl - -maxLvl :: Level -> Level -> Level -maxLvl l1@(Level maj1 min1 _) l2@(Level maj2 min2 _) - | (maj1 > maj2) || (maj1 == maj2 && min1 > min2) = l1 - | otherwise = l2 - -ltLvl :: Level -> Level -> Bool -ltLvl (Level maj1 min1 _) (Level maj2 min2 _) - = (maj1 < maj2) || (maj1 == maj2 && min1 < min2) - -ltMajLvl :: Level -> Level -> Bool - -- Tells if one level belongs to a difft *lambda* level to another -ltMajLvl (Level maj1 _ _) (Level maj2 _ _) = maj1 < maj2 - -isTopLvl :: Level -> Bool -isTopLvl (Level 0 0 _) = True -isTopLvl _ = False - -isJoinCeilLvl :: Level -> Bool -isJoinCeilLvl (Level _ _ t) = t == JoinCeilLvl - -instance Outputable Level where - ppr (Level maj min typ) - = hcat [ char '<', int maj, char ',', int min, char '>' - , ppWhen (typ == JoinCeilLvl) (char 'C') ] - -instance Eq Level where - (Level maj1 min1 _) == (Level maj2 min2 _) = maj1 == maj2 && min1 == min2 - -{- -************************************************************************ -* * -\subsection{Main level-setting code} -* * -************************************************************************ --} - -setLevels :: FloatOutSwitches - -> CoreProgram - -> UniqSupply - -> [LevelledBind] - -setLevels float_lams binds us - = initLvl us (do_them init_env binds) - where - init_env = initialEnv float_lams - - do_them :: LevelEnv -> [CoreBind] -> LvlM [LevelledBind] - do_them _ [] = return [] - do_them env (b:bs) - = do { (lvld_bind, env') <- lvlTopBind env b - ; lvld_binds <- do_them env' bs - ; return (lvld_bind : lvld_binds) } - -lvlTopBind :: LevelEnv -> Bind Id -> LvlM (LevelledBind, LevelEnv) -lvlTopBind env (NonRec bndr rhs) - = do { rhs' <- lvl_top env NonRecursive bndr rhs - ; let (env', [bndr']) = substAndLvlBndrs NonRecursive env tOP_LEVEL [bndr] - ; return (NonRec bndr' rhs', env') } - -lvlTopBind env (Rec pairs) - = do { let (env', bndrs') = substAndLvlBndrs Recursive env tOP_LEVEL - (map fst pairs) - ; rhss' <- mapM (\(b,r) -> lvl_top env' Recursive b r) pairs - ; return (Rec (bndrs' `zip` rhss'), env') } - -lvl_top :: LevelEnv -> RecFlag -> Id -> CoreExpr -> LvlM LevelledExpr -lvl_top env is_rec bndr rhs - = lvlRhs env is_rec - (isBottomingId bndr) - Nothing -- Not a join point - (freeVars rhs) - -{- -************************************************************************ -* * -\subsection{Setting expression levels} -* * -************************************************************************ - -Note [Floating over-saturated applications] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we see (f x y), and (f x) is a redex (ie f's arity is 1), -we call (f x) an "over-saturated application" - -Should we float out an over-sat app, if can escape a value lambda? -It is sometimes very beneficial (-7% runtime -4% alloc over nofib -O2). -But we don't want to do it for class selectors, because the work saved -is minimal, and the extra local thunks allocated cost money. - -Arguably we could float even class-op applications if they were going to -top level -- but then they must be applied to a constant dictionary and -will almost certainly be optimised away anyway. --} - -lvlExpr :: LevelEnv -- Context - -> CoreExprWithFVs -- Input expression - -> LvlM LevelledExpr -- Result expression - -{- -The @le_ctxt_lvl@ is, roughly, the level of the innermost enclosing -binder. Here's an example - - v = \x -> ...\y -> let r = case (..x..) of - ..x.. - in .. - -When looking at the rhs of @r@, @le_ctxt_lvl@ will be 1 because that's -the level of @r@, even though it's inside a level-2 @\y@. It's -important that @le_ctxt_lvl@ is 1 and not 2 in @r@'s rhs, because we -don't want @lvlExpr@ to turn the scrutinee of the @case@ into an MFE ---- because it isn't a *maximal* free expression. - -If there were another lambda in @r@'s rhs, it would get level-2 as well. --} - -lvlExpr env (_, AnnType ty) = return (Type (GHC.Core.Subst.substTy (le_subst env) ty)) -lvlExpr env (_, AnnCoercion co) = return (Coercion (substCo (le_subst env) co)) -lvlExpr env (_, AnnVar v) = return (lookupVar env v) -lvlExpr _ (_, AnnLit lit) = return (Lit lit) - -lvlExpr env (_, AnnCast expr (_, co)) = do - expr' <- lvlNonTailExpr env expr - return (Cast expr' (substCo (le_subst env) co)) - -lvlExpr env (_, AnnTick tickish expr) = do - expr' <- lvlNonTailExpr env expr - let tickish' = substTickish (le_subst env) tickish - return (Tick tickish' expr') - -lvlExpr env expr@(_, AnnApp _ _) = lvlApp env expr (collectAnnArgs expr) - --- We don't split adjacent lambdas. That is, given --- \x y -> (x+1,y) --- we don't float to give --- \x -> let v = x+1 in \y -> (v,y) --- Why not? Because partial applications are fairly rare, and splitting --- lambdas makes them more expensive. - -lvlExpr env expr@(_, AnnLam {}) - = do { new_body <- lvlNonTailMFE new_env True body - ; return (mkLams new_bndrs new_body) } - where - (bndrs, body) = collectAnnBndrs expr - (env1, bndrs1) = substBndrsSL NonRecursive env bndrs - (new_env, new_bndrs) = lvlLamBndrs env1 (le_ctxt_lvl env) bndrs1 - -- At one time we called a special version of collectBinders, - -- which ignored coercions, because we don't want to split - -- a lambda like this (\x -> coerce t (\s -> ...)) - -- This used to happen quite a bit in state-transformer programs, - -- but not nearly so much now non-recursive newtypes are transparent. - -- [See SetLevels rev 1.50 for a version with this approach.] - -lvlExpr env (_, AnnLet bind body) - = do { (bind', new_env) <- lvlBind env bind - ; body' <- lvlExpr new_env body - -- No point in going via lvlMFE here. If the binding is alive - -- (mentioned in body), and the whole let-expression doesn't - -- float, then neither will the body - ; return (Let bind' body') } - -lvlExpr env (_, AnnCase scrut case_bndr ty alts) - = do { scrut' <- lvlNonTailMFE env True scrut - ; lvlCase env (freeVarsOf scrut) scrut' case_bndr ty alts } - -lvlNonTailExpr :: LevelEnv -- Context - -> CoreExprWithFVs -- Input expression - -> LvlM LevelledExpr -- Result expression -lvlNonTailExpr env expr - = lvlExpr (placeJoinCeiling env) expr - -------------------------------------------- -lvlApp :: LevelEnv - -> CoreExprWithFVs - -> (CoreExprWithFVs, [CoreExprWithFVs]) -- Input application - -> LvlM LevelledExpr -- Result expression -lvlApp env orig_expr ((_,AnnVar fn), args) - | floatOverSat env -- See Note [Floating over-saturated applications] - , arity > 0 - , arity < n_val_args - , Nothing <- isClassOpId_maybe fn - = do { rargs' <- mapM (lvlNonTailMFE env False) rargs - ; lapp' <- lvlNonTailMFE env False lapp - ; return (foldl' App lapp' rargs') } - - | otherwise - = do { (_, args') <- mapAccumLM lvl_arg stricts args - -- Take account of argument strictness; see - -- Note [Floating to the top] - ; return (foldl' App (lookupVar env fn) args') } - where - n_val_args = count (isValArg . deAnnotate) args - arity = idArity fn - - stricts :: [Demand] -- True for strict /value/ arguments - stricts = case splitStrictSig (idStrictness fn) of - (arg_ds, _) | arg_ds `lengthExceeds` n_val_args - -> [] - | otherwise - -> arg_ds - - -- Separate out the PAP that we are floating from the extra - -- arguments, by traversing the spine until we have collected - -- (n_val_args - arity) value arguments. - (lapp, rargs) = left (n_val_args - arity) orig_expr [] - - left 0 e rargs = (e, rargs) - left n (_, AnnApp f a) rargs - | isValArg (deAnnotate a) = left (n-1) f (a:rargs) - | otherwise = left n f (a:rargs) - left _ _ _ = panic "SetLevels.lvlExpr.left" - - is_val_arg :: CoreExprWithFVs -> Bool - is_val_arg (_, AnnType {}) = False - is_val_arg _ = True - - lvl_arg :: [Demand] -> CoreExprWithFVs -> LvlM ([Demand], LevelledExpr) - lvl_arg strs arg | (str1 : strs') <- strs - , is_val_arg arg - = do { arg' <- lvlMFE env (isStrictDmd str1) arg - ; return (strs', arg') } - | otherwise - = do { arg' <- lvlMFE env False arg - ; return (strs, arg') } - -lvlApp env _ (fun, args) - = -- No PAPs that we can float: just carry on with the - -- arguments and the function. - do { args' <- mapM (lvlNonTailMFE env False) args - ; fun' <- lvlNonTailExpr env fun - ; return (foldl' App fun' args') } - -------------------------------------------- -lvlCase :: LevelEnv -- Level of in-scope names/tyvars - -> DVarSet -- Free vars of input scrutinee - -> LevelledExpr -- Processed scrutinee - -> Id -> Type -- Case binder and result type - -> [CoreAltWithFVs] -- Input alternatives - -> LvlM LevelledExpr -- Result expression -lvlCase env scrut_fvs scrut' case_bndr ty alts - -- See Note [Floating single-alternative cases] - | [(con@(DataAlt {}), bs, body)] <- alts - , exprIsHNF (deTagExpr scrut') -- See Note [Check the output scrutinee for exprIsHNF] - , not (isTopLvl dest_lvl) -- Can't have top-level cases - , not (floatTopLvlOnly env) -- Can float anywhere - = -- Always float the case if possible - -- Unlike lets we don't insist that it escapes a value lambda - do { (env1, (case_bndr' : bs')) <- cloneCaseBndrs env dest_lvl (case_bndr : bs) - ; let rhs_env = extendCaseBndrEnv env1 case_bndr scrut' - ; body' <- lvlMFE rhs_env True body - ; let alt' = (con, map (stayPut dest_lvl) bs', body') - ; return (Case scrut' (TB case_bndr' (FloatMe dest_lvl)) ty' [alt']) } - - | otherwise -- Stays put - = do { let (alts_env1, [case_bndr']) = substAndLvlBndrs NonRecursive env incd_lvl [case_bndr] - alts_env = extendCaseBndrEnv alts_env1 case_bndr scrut' - ; alts' <- mapM (lvl_alt alts_env) alts - ; return (Case scrut' case_bndr' ty' alts') } - where - ty' = substTy (le_subst env) ty - - incd_lvl = incMinorLvl (le_ctxt_lvl env) - dest_lvl = maxFvLevel (const True) env scrut_fvs - -- Don't abstract over type variables, hence const True - - lvl_alt alts_env (con, bs, rhs) - = do { rhs' <- lvlMFE new_env True rhs - ; return (con, bs', rhs') } - where - (new_env, bs') = substAndLvlBndrs NonRecursive alts_env incd_lvl bs - -{- Note [Floating single-alternative cases] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this: - data T a = MkT !a - f :: T Int -> blah - f x vs = case x of { MkT y -> - let f vs = ...(case y of I# w -> e)...f.. - in f vs - -Here we can float the (case y ...) out, because y is sure -to be evaluated, to give - f x vs = case x of { MkT y -> - case y of I# w -> - let f vs = ...(e)...f.. - in f vs - -That saves unboxing it every time round the loop. It's important in -some DPH stuff where we really want to avoid that repeated unboxing in -the inner loop. - -Things to note: - - * The test we perform is exprIsHNF, and /not/ exprOkForSpeculation. - - - exrpIsHNF catches the key case of an evaluated variable - - - exprOkForSpeculation is /false/ of an evaluated variable; - See Note [exprOkForSpeculation and evaluated variables] in GHC.Core.Utils - So we'd actually miss the key case! - - - Nothing is gained from the extra generality of exprOkForSpeculation - since we only consider floating a case whose single alternative - is a DataAlt K a b -> rhs - - * We can't float a case to top level - - * It's worth doing this float even if we don't float - the case outside a value lambda. Example - case x of { - MkT y -> (case y of I# w2 -> ..., case y of I# w2 -> ...) - If we floated the cases out we could eliminate one of them. - - * We only do this with a single-alternative case - - -Note [Setting levels when floating single-alternative cases] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Handling level-setting when floating a single-alternative case binding -is a bit subtle, as evidenced by #16978. In particular, we must keep -in mind that we are merely moving the case and its binders, not the -body. For example, suppose 'a' is known to be evaluated and we have - - \z -> case a of - (x,_) -> <body involving x and z> - -After floating we may have: - - case a of - (x,_) -> \z -> <body involving x and z> - {- some expression involving x and z -} - -When analysing <body involving...> we want to use the /ambient/ level, -and /not/ the destination level of the 'case a of (x,-) ->' binding. - -#16978 was caused by us setting the context level to the destination -level of `x` when analysing <body>. This led us to conclude that we -needed to quantify over some of its free variables (e.g. z), resulting -in shadowing and very confusing Core Lint failures. - - -Note [Check the output scrutinee for exprIsHNF] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this: - case x of y { - A -> ....(case y of alts).... - } - -Because of the binder-swap, the inner case will get substituted to -(case x of ..). So when testing whether the scrutinee is in HNF we -must be careful to test the *result* scrutinee ('x' in this case), not -the *input* one 'y'. The latter *is* in HNF here (because y is -evaluated), but the former is not -- and indeed we can't float the -inner case out, at least not unless x is also evaluated at its binding -site. See #5453. - -That's why we apply exprIsHNF to scrut' and not to scrut. - -See Note [Floating single-alternative cases] for why -we use exprIsHNF in the first place. --} - -lvlNonTailMFE :: LevelEnv -- Level of in-scope names/tyvars - -> Bool -- True <=> strict context [body of case - -- or let] - -> CoreExprWithFVs -- input expression - -> LvlM LevelledExpr -- Result expression -lvlNonTailMFE env strict_ctxt ann_expr - = lvlMFE (placeJoinCeiling env) strict_ctxt ann_expr - -lvlMFE :: LevelEnv -- Level of in-scope names/tyvars - -> Bool -- True <=> strict context [body of case or let] - -> CoreExprWithFVs -- input expression - -> LvlM LevelledExpr -- Result expression --- lvlMFE is just like lvlExpr, except that it might let-bind --- the expression, so that it can itself be floated. - -lvlMFE env _ (_, AnnType ty) - = return (Type (GHC.Core.Subst.substTy (le_subst env) ty)) - --- No point in floating out an expression wrapped in a coercion or note --- If we do we'll transform lvl = e |> co --- to lvl' = e; lvl = lvl' |> co --- and then inline lvl. Better just to float out the payload. -lvlMFE env strict_ctxt (_, AnnTick t e) - = do { e' <- lvlMFE env strict_ctxt e - ; let t' = substTickish (le_subst env) t - ; return (Tick t' e') } - -lvlMFE env strict_ctxt (_, AnnCast e (_, co)) - = do { e' <- lvlMFE env strict_ctxt e - ; return (Cast e' (substCo (le_subst env) co)) } - -lvlMFE env strict_ctxt e@(_, AnnCase {}) - | strict_ctxt -- Don't share cases in a strict context - = lvlExpr env e -- See Note [Case MFEs] - -lvlMFE env strict_ctxt ann_expr - | floatTopLvlOnly env && not (isTopLvl dest_lvl) - -- Only floating to the top level is allowed. - || anyDVarSet isJoinId fvs -- If there is a free join, don't float - -- See Note [Free join points] - || isExprLevPoly expr - -- We can't let-bind levity polymorphic expressions - -- See Note [Levity polymorphism invariants] in GHC.Core - || notWorthFloating expr abs_vars - || not float_me - = -- Don't float it out - lvlExpr env ann_expr - - | float_is_new_lam || exprIsTopLevelBindable expr expr_ty - -- No wrapping needed if the type is lifted, or is a literal string - -- or if we are wrapping it in one or more value lambdas - = do { expr1 <- lvlFloatRhs abs_vars dest_lvl rhs_env NonRecursive - (isJust mb_bot_str) - join_arity_maybe - ann_expr - -- Treat the expr just like a right-hand side - ; var <- newLvlVar expr1 join_arity_maybe is_mk_static - ; let var2 = annotateBotStr var float_n_lams mb_bot_str - ; return (Let (NonRec (TB var2 (FloatMe dest_lvl)) expr1) - (mkVarApps (Var var2) abs_vars)) } - - -- OK, so the float has an unlifted type (not top-level bindable) - -- and no new value lambdas (float_is_new_lam is False) - -- Try for the boxing strategy - -- See Note [Floating MFEs of unlifted type] - | escapes_value_lam - , not expr_ok_for_spec -- Boxing/unboxing isn't worth it for cheap expressions - -- See Note [Test cheapness with exprOkForSpeculation] - , Just (tc, _) <- splitTyConApp_maybe expr_ty - , Just dc <- boxingDataCon_maybe tc - , let dc_res_ty = dataConOrigResTy dc -- No free type variables - [bx_bndr, ubx_bndr] = mkTemplateLocals [dc_res_ty, expr_ty] - = do { expr1 <- lvlExpr rhs_env ann_expr - ; let l1r = incMinorLvlFrom rhs_env - float_rhs = mkLams abs_vars_w_lvls $ - Case expr1 (stayPut l1r ubx_bndr) dc_res_ty - [(DEFAULT, [], mkConApp dc [Var ubx_bndr])] - - ; var <- newLvlVar float_rhs Nothing is_mk_static - ; let l1u = incMinorLvlFrom env - use_expr = Case (mkVarApps (Var var) abs_vars) - (stayPut l1u bx_bndr) expr_ty - [(DataAlt dc, [stayPut l1u ubx_bndr], Var ubx_bndr)] - ; return (Let (NonRec (TB var (FloatMe dest_lvl)) float_rhs) - use_expr) } - - | otherwise -- e.g. do not float unboxed tuples - = lvlExpr env ann_expr - - where - expr = deAnnotate ann_expr - expr_ty = exprType expr - fvs = freeVarsOf ann_expr - fvs_ty = tyCoVarsOfType expr_ty - is_bot = isBottomThunk mb_bot_str - is_function = isFunction ann_expr - mb_bot_str = exprBotStrictness_maybe expr - -- See Note [Bottoming floats] - -- esp Bottoming floats (2) - expr_ok_for_spec = exprOkForSpeculation expr - dest_lvl = destLevel env fvs fvs_ty is_function is_bot False - abs_vars = abstractVars dest_lvl env fvs - - -- float_is_new_lam: the floated thing will be a new value lambda - -- replacing, say (g (x+4)) by (lvl x). No work is saved, nor is - -- allocation saved. The benefit is to get it to the top level - -- and hence out of the body of this function altogether, making - -- it smaller and more inlinable - float_is_new_lam = float_n_lams > 0 - float_n_lams = count isId abs_vars - - (rhs_env, abs_vars_w_lvls) = lvlLamBndrs env dest_lvl abs_vars - - join_arity_maybe = Nothing - - is_mk_static = isJust (collectMakeStaticArgs expr) - -- Yuk: See Note [Grand plan for static forms] in main/StaticPtrTable - - -- A decision to float entails let-binding this thing, and we only do - -- that if we'll escape a value lambda, or will go to the top level. - float_me = saves_work || saves_alloc || is_mk_static - - -- We can save work if we can move a redex outside a value lambda - -- But if float_is_new_lam is True, then the redex is wrapped in a - -- a new lambda, so no work is saved - saves_work = escapes_value_lam && not float_is_new_lam - - escapes_value_lam = dest_lvl `ltMajLvl` (le_ctxt_lvl env) - -- See Note [Escaping a value lambda] - - -- See Note [Floating to the top] - saves_alloc = isTopLvl dest_lvl - && floatConsts env - && (not strict_ctxt || is_bot || exprIsHNF expr) - -isBottomThunk :: Maybe (Arity, s) -> Bool --- See Note [Bottoming floats] (2) -isBottomThunk (Just (0, _)) = True -- Zero arity -isBottomThunk _ = False - -{- Note [Floating to the top] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We are keen to float something to the top level, even if it does not -escape a value lambda (and hence save work), for two reasons: - - * Doing so makes the function smaller, by floating out - bottoming expressions, or integer or string literals. That in - turn makes it easier to inline, with less duplication. - - * (Minor) Doing so may turn a dynamic allocation (done by machine - instructions) into a static one. Minor because we are assuming - we are not escaping a value lambda. - -But do not so if: - - the context is a strict, and - - the expression is not a HNF, and - - the expression is not bottoming - -Exammples: - -* Bottoming - f x = case x of - 0 -> error <big thing> - _ -> x+1 - Here we want to float (error <big thing>) to top level, abstracting - over 'x', so as to make f's RHS smaller. - -* HNF - f = case y of - True -> p:q - False -> blah - We may as well float the (p:q) so it becomes a static data structure. - -* Case scrutinee - f = case g True of .... - Don't float (g True) to top level; then we have the admin of a - top-level thunk to worry about, with zero gain. - -* Case alternative - h = case y of - True -> g True - False -> False - Don't float (g True) to the top level - -* Arguments - t = f (g True) - If f is lazy, we /do/ float (g True) because then we can allocate - the thunk statically rather than dynamically. But if f is strict - we don't (see the use of idStrictness in lvlApp). It's not clear - if this test is worth the bother: it's only about CAFs! - -It's controlled by a flag (floatConsts), because doing this too -early loses opportunities for RULES which (needless to say) are -important in some nofib programs (gcd is an example). [SPJ note: -I think this is obsolete; the flag seems always on.] - -Note [Floating join point bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Mostly we only float a join point if it can /stay/ a join point. But -there is one exception: if it can go to the top level (#13286). -Consider - f x = joinrec j y n = <...j y' n'...> - in jump j x 0 - -Here we may just as well produce - j y n = <....j y' n'...> - f x = j x 0 - -and now there is a chance that 'f' will be inlined at its call sites. -It shouldn't make a lot of difference, but these tests - perf/should_run/MethSharing - simplCore/should_compile/spec-inline -and one nofib program, all improve if you do float to top, because -of the resulting inlining of f. So ok, let's do it. - -Note [Free join points] -~~~~~~~~~~~~~~~~~~~~~~~ -We never float a MFE that has a free join-point variable. You might think -this can never occur. After all, consider - join j x = ... - in ....(jump j x).... -How might we ever want to float that (jump j x)? - * If it would escape a value lambda, thus - join j x = ... in (\y. ...(jump j x)... ) - then 'j' isn't a valid join point in the first place. - -But consider - join j x = .... in - joinrec j2 y = ...(jump j x)...(a+b).... - -Since j2 is recursive, it /is/ worth floating (a+b) out of the joinrec. -But it is emphatically /not/ good to float the (jump j x) out: - (a) 'j' will stop being a join point - (b) In any case, jumping to 'j' must be an exit of the j2 loop, so no - work would be saved by floating it out of the \y. - -Even if we floated 'j' to top level, (b) would still hold. - -Bottom line: never float a MFE that has a free JoinId. - -Note [Floating MFEs of unlifted type] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Suppose we have - case f x of (r::Int#) -> blah -we'd like to float (f x). But it's not trivial because it has type -Int#, and we don't want to evaluate it too early. But we can instead -float a boxed version - y = case f x of r -> I# r -and replace the original (f x) with - case (case y of I# r -> r) of r -> blah - -Being able to float unboxed expressions is sometimes important; see -#12603. I'm not sure how /often/ it is important, but it's -not hard to achieve. - -We only do it for a fixed collection of types for which we have a -convenient boxing constructor (see boxingDataCon_maybe). In -particular we /don't/ do it for unboxed tuples; it's better to float -the components of the tuple individually. - -I did experiment with a form of boxing that works for any type, namely -wrapping in a function. In our example - - let y = case f x of r -> \v. f x - in case y void of r -> blah - -It works fine, but it's 50% slower (based on some crude benchmarking). -I suppose we could do it for types not covered by boxingDataCon_maybe, -but it's more code and I'll wait to see if anyone wants it. - -Note [Test cheapness with exprOkForSpeculation] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We don't want to float very cheap expressions by boxing and unboxing. -But we use exprOkForSpeculation for the test, not exprIsCheap. -Why? Because it's important /not/ to transform - f (a /# 3) -to - f (case bx of I# a -> a /# 3) -and float bx = I# (a /# 3), because the application of f no -longer obeys the let/app invariant. But (a /# 3) is ok-for-spec -due to a special hack that says division operators can't fail -when the denominator is definitely non-zero. And yet that -same expression says False to exprIsCheap. Simplest way to -guarantee the let/app invariant is to use the same function! - -If an expression is okay for speculation, we could also float it out -*without* boxing and unboxing, since evaluating it early is okay. -However, it turned out to usually be better not to float such expressions, -since they tend to be extremely cheap things like (x +# 1#). Even the -cost of spilling the let-bound variable to the stack across a call may -exceed the cost of recomputing such an expression. (And we can't float -unlifted bindings to top-level.) - -We could try to do something smarter here, and float out expensive yet -okay-for-speculation things, such as division by non-zero constants. -But I suspect it's a narrow target. - -Note [Bottoming floats] -~~~~~~~~~~~~~~~~~~~~~~~ -If we see - f = \x. g (error "urk") -we'd like to float the call to error, to get - lvl = error "urk" - f = \x. g lvl - -But, as ever, we need to be careful: - -(1) We want to float a bottoming - expression even if it has free variables: - f = \x. g (let v = h x in error ("urk" ++ v)) - Then we'd like to abstract over 'x' can float the whole arg of g: - lvl = \x. let v = h x in error ("urk" ++ v) - f = \x. g (lvl x) - To achieve this we pass is_bot to destLevel - -(2) We do not do this for lambdas that return - bottom. Instead we treat the /body/ of such a function specially, - via point (1). For example: - f = \x. ....(\y z. if x then error y else error z).... - ===> - lvl = \x z y. if b then error y else error z - f = \x. ...(\y z. lvl x z y)... - (There is no guarantee that we'll choose the perfect argument order.) - -(3) If we have a /binding/ that returns bottom, we want to float it to top - level, even if it has free vars (point (1)), and even it has lambdas. - Example: - ... let { v = \y. error (show x ++ show y) } in ... - We want to abstract over x and float the whole thing to top: - lvl = \xy. errror (show x ++ show y) - ...let {v = lvl x} in ... - - Then of course we don't want to separately float the body (error ...) - as /another/ MFE, so we tell lvlFloatRhs not to do that, via the is_bot - argument. - -See Maessen's paper 1999 "Bottom extraction: factoring error handling out -of functional programs" (unpublished I think). - -When we do this, we set the strictness and arity of the new bottoming -Id, *immediately*, for three reasons: - - * To prevent the abstracted thing being immediately inlined back in again - via preInlineUnconditionally. The latter has a test for bottoming Ids - to stop inlining them, so we'd better make sure it *is* a bottoming Id! - - * So that it's properly exposed as such in the interface file, even if - this is all happening after strictness analysis. - - * In case we do CSE with the same expression that *is* marked bottom - lvl = error "urk" - x{str=bot) = error "urk" - Here we don't want to replace 'x' with 'lvl', else we may get Lint - errors, e.g. via a case with empty alternatives: (case x of {}) - Lint complains unless the scrutinee of such a case is clearly bottom. - - This was reported in #11290. But since the whole bottoming-float - thing is based on the cheap-and-cheerful exprIsBottom, I'm not sure - that it'll nail all such cases. - -Note [Bottoming floats: eta expansion] c.f Note [Bottoming floats] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Tiresomely, though, the simplifier has an invariant that the manifest -arity of the RHS should be the same as the arity; but we can't call -etaExpand during SetLevels because it works over a decorated form of -CoreExpr. So we do the eta expansion later, in FloatOut. - -Note [Case MFEs] -~~~~~~~~~~~~~~~~ -We don't float a case expression as an MFE from a strict context. Why not? -Because in doing so we share a tiny bit of computation (the switch) but -in exchange we build a thunk, which is bad. This case reduces allocation -by 7% in spectral/puzzle (a rather strange benchmark) and 1.2% in real/fem. -Doesn't change any other allocation at all. - -We will make a separate decision for the scrutinee and alternatives. - -However this can have a knock-on effect for fusion: consider - \v -> foldr k z (case x of I# y -> build ..y..) -Perhaps we can float the entire (case x of ...) out of the \v. Then -fusion will not happen, but we will get more sharing. But if we don't -float the case (as advocated here) we won't float the (build ...y..) -either, so fusion will happen. It can be a big effect, esp in some -artificial benchmarks (e.g. integer, queens), but there is no perfect -answer. - --} - -annotateBotStr :: Id -> Arity -> Maybe (Arity, StrictSig) -> Id --- See Note [Bottoming floats] for why we want to add --- bottoming information right now --- --- n_extra are the number of extra value arguments added during floating -annotateBotStr id n_extra mb_str - = case mb_str of - Nothing -> id - Just (arity, sig) -> id `setIdArity` (arity + n_extra) - `setIdStrictness` (increaseStrictSigArity n_extra sig) - `setIdCprInfo` mkCprSig (arity + n_extra) botCpr - -notWorthFloating :: CoreExpr -> [Var] -> Bool --- Returns True if the expression would be replaced by --- something bigger than it is now. For example: --- abs_vars = tvars only: return True if e is trivial, --- but False for anything bigger --- abs_vars = [x] (an Id): return True for trivial, or an application (f x) --- but False for (f x x) --- --- One big goal is that floating should be idempotent. Eg if --- we replace e with (lvl79 x y) and then run FloatOut again, don't want --- to replace (lvl79 x y) with (lvl83 x y)! - -notWorthFloating e abs_vars - = go e (count isId abs_vars) - where - go (Var {}) n = n >= 0 - go (Lit lit) n = ASSERT( n==0 ) - litIsTrivial lit -- Note [Floating literals] - go (Tick t e) n = not (tickishIsCode t) && go e n - go (Cast e _) n = go e n - go (App e arg) n - -- See Note [Floating applications to coercions] - | Type {} <- arg = go e n - | n==0 = False - | is_triv arg = go e (n-1) - | otherwise = False - go _ _ = False - - is_triv (Lit {}) = True -- Treat all literals as trivial - is_triv (Var {}) = True -- (ie not worth floating) - is_triv (Cast e _) = is_triv e - is_triv (App e (Type {})) = is_triv e -- See Note [Floating applications to coercions] - is_triv (Tick t e) = not (tickishIsCode t) && is_triv e - is_triv _ = False - -{- -Note [Floating literals] -~~~~~~~~~~~~~~~~~~~~~~~~ -It's important to float Integer literals, so that they get shared, -rather than being allocated every time round the loop. -Hence the litIsTrivial. - -Ditto literal strings (LitString), which we'd like to float to top -level, which is now possible. - -Note [Floating applications to coercions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We don’t float out variables applied only to type arguments, since the -extra binding would be pointless: type arguments are completely erased. -But *coercion* arguments aren’t (see Note [Coercion tokens] in -CoreToStg.hs and Note [Count coercion arguments in boring contexts] in -CoreUnfold.hs), so we still want to float out variables applied only to -coercion arguments. - -Note [Escaping a value lambda] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We want to float even cheap expressions out of value lambdas, -because that saves allocation. Consider - f = \x. .. (\y.e) ... -Then we'd like to avoid allocating the (\y.e) every time we call f, -(assuming e does not mention x). An example where this really makes a -difference is simplrun009. - -Another reason it's good is because it makes SpecContr fire on functions. -Consider - f = \x. ....(f (\y.e)).... -After floating we get - lvl = \y.e - f = \x. ....(f lvl)... -and that is much easier for SpecConstr to generate a robust -specialisation for. - -However, if we are wrapping the thing in extra value lambdas (in -abs_vars), then nothing is saved. E.g. - f = \xyz. ...(e1[y],e2).... -If we float - lvl = \y. (e1[y],e2) - f = \xyz. ...(lvl y)... -we have saved nothing: one pair will still be allocated for each -call of 'f'. Hence the (not float_is_lam) in float_me. - - -************************************************************************ -* * -\subsection{Bindings} -* * -************************************************************************ - -The binding stuff works for top level too. --} - -lvlBind :: LevelEnv - -> CoreBindWithFVs - -> LvlM (LevelledBind, LevelEnv) - -lvlBind env (AnnNonRec bndr rhs) - | isTyVar bndr -- Don't do anything for TyVar binders - -- (simplifier gets rid of them pronto) - || isCoVar bndr -- Difficult to fix up CoVar occurrences (see extendPolyLvlEnv) - -- so we will ignore this case for now - || not (profitableFloat env dest_lvl) - || (isTopLvl dest_lvl && not (exprIsTopLevelBindable deann_rhs bndr_ty)) - -- We can't float an unlifted binding to top level (except - -- literal strings), so we don't float it at all. It's a - -- bit brutal, but unlifted bindings aren't expensive either - - = -- No float - do { rhs' <- lvlRhs env NonRecursive is_bot mb_join_arity rhs - ; let bind_lvl = incMinorLvl (le_ctxt_lvl env) - (env', [bndr']) = substAndLvlBndrs NonRecursive env bind_lvl [bndr] - ; return (NonRec bndr' rhs', env') } - - -- Otherwise we are going to float - | null abs_vars - = do { -- No type abstraction; clone existing binder - rhs' <- lvlFloatRhs [] dest_lvl env NonRecursive - is_bot mb_join_arity rhs - ; (env', [bndr']) <- cloneLetVars NonRecursive env dest_lvl [bndr] - ; let bndr2 = annotateBotStr bndr' 0 mb_bot_str - ; return (NonRec (TB bndr2 (FloatMe dest_lvl)) rhs', env') } - - | otherwise - = do { -- Yes, type abstraction; create a new binder, extend substitution, etc - rhs' <- lvlFloatRhs abs_vars dest_lvl env NonRecursive - is_bot mb_join_arity rhs - ; (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr] - ; let bndr2 = annotateBotStr bndr' n_extra mb_bot_str - ; return (NonRec (TB bndr2 (FloatMe dest_lvl)) rhs', env') } - - where - bndr_ty = idType bndr - ty_fvs = tyCoVarsOfType bndr_ty - rhs_fvs = freeVarsOf rhs - bind_fvs = rhs_fvs `unionDVarSet` dIdFreeVars bndr - abs_vars = abstractVars dest_lvl env bind_fvs - dest_lvl = destLevel env bind_fvs ty_fvs (isFunction rhs) is_bot is_join - - deann_rhs = deAnnotate rhs - mb_bot_str = exprBotStrictness_maybe deann_rhs - is_bot = isJust mb_bot_str - -- NB: not isBottomThunk! See Note [Bottoming floats] point (3) - - n_extra = count isId abs_vars - mb_join_arity = isJoinId_maybe bndr - is_join = isJust mb_join_arity - -lvlBind env (AnnRec pairs) - | floatTopLvlOnly env && not (isTopLvl dest_lvl) - -- Only floating to the top level is allowed. - || not (profitableFloat env dest_lvl) - || (isTopLvl dest_lvl && any (mightBeUnliftedType . idType) bndrs) - -- This mightBeUnliftedType stuff is the same test as in the non-rec case - -- You might wonder whether we can have a recursive binding for - -- an unlifted value -- but we can if it's a /join binding/ (#16978) - -- (Ultimately I think we should not use SetLevels to - -- float join bindings at all, but that's another story.) - = -- No float - do { let bind_lvl = incMinorLvl (le_ctxt_lvl env) - (env', bndrs') = substAndLvlBndrs Recursive env bind_lvl bndrs - lvl_rhs (b,r) = lvlRhs env' Recursive is_bot (isJoinId_maybe b) r - ; rhss' <- mapM lvl_rhs pairs - ; return (Rec (bndrs' `zip` rhss'), env') } - - -- Otherwise we are going to float - | null abs_vars - = do { (new_env, new_bndrs) <- cloneLetVars Recursive env dest_lvl bndrs - ; new_rhss <- mapM (do_rhs new_env) pairs - ; return ( Rec ([TB b (FloatMe dest_lvl) | b <- new_bndrs] `zip` new_rhss) - , new_env) } - --- ToDo: when enabling the floatLambda stuff, --- I think we want to stop doing this - | [(bndr,rhs)] <- pairs - , count isId abs_vars > 1 - = do -- Special case for self recursion where there are - -- several variables carried around: build a local loop: - -- poly_f = \abs_vars. \lam_vars . letrec f = \lam_vars. rhs in f lam_vars - -- This just makes the closures a bit smaller. If we don't do - -- this, allocation rises significantly on some programs - -- - -- We could elaborate it for the case where there are several - -- mutually recursive functions, but it's quite a bit more complicated - -- - -- This all seems a bit ad hoc -- sigh - let (rhs_env, abs_vars_w_lvls) = lvlLamBndrs env dest_lvl abs_vars - rhs_lvl = le_ctxt_lvl rhs_env - - (rhs_env', [new_bndr]) <- cloneLetVars Recursive rhs_env rhs_lvl [bndr] - let - (lam_bndrs, rhs_body) = collectAnnBndrs rhs - (body_env1, lam_bndrs1) = substBndrsSL NonRecursive rhs_env' lam_bndrs - (body_env2, lam_bndrs2) = lvlLamBndrs body_env1 rhs_lvl lam_bndrs1 - new_rhs_body <- lvlRhs body_env2 Recursive is_bot (get_join bndr) rhs_body - (poly_env, [poly_bndr]) <- newPolyBndrs dest_lvl env abs_vars [bndr] - return (Rec [(TB poly_bndr (FloatMe dest_lvl) - , mkLams abs_vars_w_lvls $ - mkLams lam_bndrs2 $ - Let (Rec [( TB new_bndr (StayPut rhs_lvl) - , mkLams lam_bndrs2 new_rhs_body)]) - (mkVarApps (Var new_bndr) lam_bndrs1))] - , poly_env) - - | otherwise -- Non-null abs_vars - = do { (new_env, new_bndrs) <- newPolyBndrs dest_lvl env abs_vars bndrs - ; new_rhss <- mapM (do_rhs new_env) pairs - ; return ( Rec ([TB b (FloatMe dest_lvl) | b <- new_bndrs] `zip` new_rhss) - , new_env) } - - where - (bndrs,rhss) = unzip pairs - is_join = isJoinId (head bndrs) - -- bndrs is always non-empty and if one is a join they all are - -- Both are checked by Lint - is_fun = all isFunction rhss - is_bot = False -- It's odd to have an unconditionally divergent - -- function in a Rec, and we don't much care what - -- happens to it. False is simple! - - do_rhs env (bndr,rhs) = lvlFloatRhs abs_vars dest_lvl env Recursive - is_bot (get_join bndr) - rhs - - get_join bndr | need_zap = Nothing - | otherwise = isJoinId_maybe bndr - need_zap = dest_lvl `ltLvl` joinCeilingLevel env - - -- Finding the free vars of the binding group is annoying - bind_fvs = ((unionDVarSets [ freeVarsOf rhs | (_, rhs) <- pairs]) - `unionDVarSet` - (fvDVarSet $ unionsFV [ idFVs bndr - | (bndr, (_,_)) <- pairs])) - `delDVarSetList` - bndrs - - ty_fvs = foldr (unionVarSet . tyCoVarsOfType . idType) emptyVarSet bndrs - dest_lvl = destLevel env bind_fvs ty_fvs is_fun is_bot is_join - abs_vars = abstractVars dest_lvl env bind_fvs - -profitableFloat :: LevelEnv -> Level -> Bool -profitableFloat env dest_lvl - = (dest_lvl `ltMajLvl` le_ctxt_lvl env) -- Escapes a value lambda - || isTopLvl dest_lvl -- Going all the way to top level - - ----------------------------------------------------- --- Three help functions for the type-abstraction case - -lvlRhs :: LevelEnv - -> RecFlag - -> Bool -- Is this a bottoming function - -> Maybe JoinArity - -> CoreExprWithFVs - -> LvlM LevelledExpr -lvlRhs env rec_flag is_bot mb_join_arity expr - = lvlFloatRhs [] (le_ctxt_lvl env) env - rec_flag is_bot mb_join_arity expr - -lvlFloatRhs :: [OutVar] -> Level -> LevelEnv -> RecFlag - -> Bool -- Binding is for a bottoming function - -> Maybe JoinArity - -> CoreExprWithFVs - -> LvlM (Expr LevelledBndr) --- Ignores the le_ctxt_lvl in env; treats dest_lvl as the baseline -lvlFloatRhs abs_vars dest_lvl env rec is_bot mb_join_arity rhs - = do { body' <- if not is_bot -- See Note [Floating from a RHS] - && any isId bndrs - then lvlMFE body_env True body - else lvlExpr body_env body - ; return (mkLams bndrs' body') } - where - (bndrs, body) | Just join_arity <- mb_join_arity - = collectNAnnBndrs join_arity rhs - | otherwise - = collectAnnBndrs rhs - (env1, bndrs1) = substBndrsSL NonRecursive env bndrs - all_bndrs = abs_vars ++ bndrs1 - (body_env, bndrs') | Just _ <- mb_join_arity - = lvlJoinBndrs env1 dest_lvl rec all_bndrs - | otherwise - = case lvlLamBndrs env1 dest_lvl all_bndrs of - (env2, bndrs') -> (placeJoinCeiling env2, bndrs') - -- The important thing here is that we call lvlLamBndrs on - -- all these binders at once (abs_vars and bndrs), so they - -- all get the same major level. Otherwise we create stupid - -- let-bindings inside, joyfully thinking they can float; but - -- in the end they don't because we never float bindings in - -- between lambdas - -{- Note [Floating from a RHS] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When floating the RHS of a let-binding, we don't always want to apply -lvlMFE to the body of a lambda, as we usually do, because the entire -binding body is already going to the right place (dest_lvl). - -A particular example is the top level. Consider - concat = /\ a -> foldr ..a.. (++) [] -We don't want to float the body of the lambda to get - lvl = /\ a -> foldr ..a.. (++) [] - concat = /\ a -> lvl a -That would be stupid. - -Previously this was avoided in a much nastier way, by testing strict_ctxt -in float_me in lvlMFE. But that wasn't even right because it would fail -to float out the error sub-expression in - f = \x. case x of - True -> error ("blah" ++ show x) - False -> ... - -But we must be careful: - -* If we had - f = \x -> factorial 20 - we /would/ want to float that (factorial 20) out! Functions are treated - differently: see the use of isFunction in the calls to destLevel. If - there are only type lambdas, then destLevel will say "go to top, and - abstract over the free tyvars" and we don't want that here. - -* But if we had - f = \x -> error (...x....) - we would NOT want to float the bottoming expression out to give - lvl = \x -> error (...x...) - f = \x -> lvl x - -Conclusion: use lvlMFE if there are - * any value lambdas in the original function, and - * this is not a bottoming function (the is_bot argument) -Use lvlExpr otherwise. A little subtle, and I got it wrong at least twice -(e.g. #13369). --} - -{- -************************************************************************ -* * -\subsection{Deciding floatability} -* * -************************************************************************ --} - -substAndLvlBndrs :: RecFlag -> LevelEnv -> Level -> [InVar] -> (LevelEnv, [LevelledBndr]) -substAndLvlBndrs is_rec env lvl bndrs - = lvlBndrs subst_env lvl subst_bndrs - where - (subst_env, subst_bndrs) = substBndrsSL is_rec env bndrs - -substBndrsSL :: RecFlag -> LevelEnv -> [InVar] -> (LevelEnv, [OutVar]) --- So named only to avoid the name clash with GHC.Core.Subst.substBndrs -substBndrsSL is_rec env@(LE { le_subst = subst, le_env = id_env }) bndrs - = ( env { le_subst = subst' - , le_env = foldl' add_id id_env (bndrs `zip` bndrs') } - , bndrs') - where - (subst', bndrs') = case is_rec of - NonRecursive -> substBndrs subst bndrs - Recursive -> substRecBndrs subst bndrs - -lvlLamBndrs :: LevelEnv -> Level -> [OutVar] -> (LevelEnv, [LevelledBndr]) --- Compute the levels for the binders of a lambda group -lvlLamBndrs env lvl bndrs - = lvlBndrs env new_lvl bndrs - where - new_lvl | any is_major bndrs = incMajorLvl lvl - | otherwise = incMinorLvl lvl - - is_major bndr = isId bndr && not (isProbablyOneShotLambda bndr) - -- The "probably" part says "don't float things out of a - -- probable one-shot lambda" - -- See Note [Computing one-shot info] in Demand.hs - -lvlJoinBndrs :: LevelEnv -> Level -> RecFlag -> [OutVar] - -> (LevelEnv, [LevelledBndr]) -lvlJoinBndrs env lvl rec bndrs - = lvlBndrs env new_lvl bndrs - where - new_lvl | isRec rec = incMajorLvl lvl - | otherwise = incMinorLvl lvl - -- Non-recursive join points are one-shot; recursive ones are not - -lvlBndrs :: LevelEnv -> Level -> [CoreBndr] -> (LevelEnv, [LevelledBndr]) --- The binders returned are exactly the same as the ones passed, --- apart from applying the substitution, but they are now paired --- with a (StayPut level) --- --- The returned envt has le_ctxt_lvl updated to the new_lvl --- --- All the new binders get the same level, because --- any floating binding is either going to float past --- all or none. We never separate binders. -lvlBndrs env@(LE { le_lvl_env = lvl_env }) new_lvl bndrs - = ( env { le_ctxt_lvl = new_lvl - , le_join_ceil = new_lvl - , le_lvl_env = addLvls new_lvl lvl_env bndrs } - , map (stayPut new_lvl) bndrs) - -stayPut :: Level -> OutVar -> LevelledBndr -stayPut new_lvl bndr = TB bndr (StayPut new_lvl) - - -- Destination level is the max Id level of the expression - -- (We'll abstract the type variables, if any.) -destLevel :: LevelEnv - -> DVarSet -- Free vars of the term - -> TyCoVarSet -- Free in the /type/ of the term - -- (a subset of the previous argument) - -> Bool -- True <=> is function - -> Bool -- True <=> is bottom - -> Bool -- True <=> is a join point - -> Level --- INVARIANT: if is_join=True then result >= join_ceiling -destLevel env fvs fvs_ty is_function is_bot is_join - | isTopLvl max_fv_id_level -- Float even joins if they get to top level - -- See Note [Floating join point bindings] - = tOP_LEVEL - - | is_join -- Never float a join point past the join ceiling - -- See Note [Join points] in FloatOut - = if max_fv_id_level `ltLvl` join_ceiling - then join_ceiling - else max_fv_id_level - - | is_bot -- Send bottoming bindings to the top - = as_far_as_poss -- regardless; see Note [Bottoming floats] - -- Esp Bottoming floats (1) - - | Just n_args <- floatLams env - , n_args > 0 -- n=0 case handled uniformly by the 'otherwise' case - , is_function - , countFreeIds fvs <= n_args - = as_far_as_poss -- Send functions to top level; see - -- the comments with isFunction - - | otherwise = max_fv_id_level - where - join_ceiling = joinCeilingLevel env - max_fv_id_level = maxFvLevel isId env fvs -- Max over Ids only; the - -- tyvars will be abstracted - - as_far_as_poss = maxFvLevel' isId env fvs_ty - -- See Note [Floating and kind casts] - -{- Note [Floating and kind casts] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this - case x of - K (co :: * ~# k) -> let v :: Int |> co - v = e - in blah - -Then, even if we are abstracting over Ids, or if e is bottom, we can't -float v outside the 'co' binding. Reason: if we did we'd get - v' :: forall k. (Int ~# Age) => Int |> co -and now 'co' isn't in scope in that type. The underlying reason is -that 'co' is a value-level thing and we can't abstract over that in a -type (else we'd get a dependent type). So if v's /type/ mentions 'co' -we can't float it out beyond the binding site of 'co'. - -That's why we have this as_far_as_poss stuff. Usually as_far_as_poss -is just tOP_LEVEL; but occasionally a coercion variable (which is an -Id) mentioned in type prevents this. - -Example #14270 comment:15. --} - - -isFunction :: CoreExprWithFVs -> Bool --- The idea here is that we want to float *functions* to --- the top level. This saves no work, but --- (a) it can make the host function body a lot smaller, --- and hence inlinable. --- (b) it can also save allocation when the function is recursive: --- h = \x -> letrec f = \y -> ...f...y...x... --- in f x --- becomes --- f = \x y -> ...(f x)...y...x... --- h = \x -> f x x --- No allocation for f now. --- We may only want to do this if there are sufficiently few free --- variables. We certainly only want to do it for values, and not for --- constructors. So the simple thing is just to look for lambdas -isFunction (_, AnnLam b e) | isId b = True - | otherwise = isFunction e --- isFunction (_, AnnTick _ e) = isFunction e -- dubious -isFunction _ = False - -countFreeIds :: DVarSet -> Int -countFreeIds = nonDetFoldUDFM add 0 . getUniqDSet - -- It's OK to use nonDetFoldUDFM here because we're just counting things. - where - add :: Var -> Int -> Int - add v n | isId v = n+1 - | otherwise = n - -{- -************************************************************************ -* * -\subsection{Free-To-Level Monad} -* * -************************************************************************ --} - -data LevelEnv - = LE { le_switches :: FloatOutSwitches - , le_ctxt_lvl :: Level -- The current level - , le_lvl_env :: VarEnv Level -- Domain is *post-cloned* TyVars and Ids - , le_join_ceil:: Level -- Highest level to which joins float - -- Invariant: always >= le_ctxt_lvl - - -- See Note [le_subst and le_env] - , le_subst :: Subst -- Domain is pre-cloned TyVars and Ids - -- The Id -> CoreExpr in the Subst is ignored - -- (since we want to substitute a LevelledExpr for - -- an Id via le_env) but we do use the Co/TyVar substs - , le_env :: IdEnv ([OutVar], LevelledExpr) -- Domain is pre-cloned Ids - } - -{- Note [le_subst and le_env] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We clone let- and case-bound variables so that they are still distinct -when floated out; hence the le_subst/le_env. (see point 3 of the -module overview comment). We also use these envs when making a -variable polymorphic because we want to float it out past a big -lambda. - -The le_subst and le_env always implement the same mapping, - in_x :-> out_x a b -where out_x is an OutVar, and a,b are its arguments (when -we perform abstraction at the same time as floating). - - le_subst maps to CoreExpr - le_env maps to LevelledExpr - -Since the range is always a variable or application, there is never -any difference between the two, but sadly the types differ. The -le_subst is used when substituting in a variable's IdInfo; the le_env -when we find a Var. - -In addition the le_env records a [OutVar] of variables free in the -OutExpr/LevelledExpr, just so we don't have to call freeVars -repeatedly. This list is always non-empty, and the first element is -out_x - -The domain of the both envs is *pre-cloned* Ids, though - -The domain of the le_lvl_env is the *post-cloned* Ids --} - -initialEnv :: FloatOutSwitches -> LevelEnv -initialEnv float_lams - = LE { le_switches = float_lams - , le_ctxt_lvl = tOP_LEVEL - , le_join_ceil = panic "initialEnv" - , le_lvl_env = emptyVarEnv - , le_subst = emptySubst - , le_env = emptyVarEnv } - -addLvl :: Level -> VarEnv Level -> OutVar -> VarEnv Level -addLvl dest_lvl env v' = extendVarEnv env v' dest_lvl - -addLvls :: Level -> VarEnv Level -> [OutVar] -> VarEnv Level -addLvls dest_lvl env vs = foldl' (addLvl dest_lvl) env vs - -floatLams :: LevelEnv -> Maybe Int -floatLams le = floatOutLambdas (le_switches le) - -floatConsts :: LevelEnv -> Bool -floatConsts le = floatOutConstants (le_switches le) - -floatOverSat :: LevelEnv -> Bool -floatOverSat le = floatOutOverSatApps (le_switches le) - -floatTopLvlOnly :: LevelEnv -> Bool -floatTopLvlOnly le = floatToTopLevelOnly (le_switches le) - -incMinorLvlFrom :: LevelEnv -> Level -incMinorLvlFrom env = incMinorLvl (le_ctxt_lvl env) - --- extendCaseBndrEnv adds the mapping case-bndr->scrut-var if it can --- See Note [Binder-swap during float-out] -extendCaseBndrEnv :: LevelEnv - -> Id -- Pre-cloned case binder - -> Expr LevelledBndr -- Post-cloned scrutinee - -> LevelEnv -extendCaseBndrEnv le@(LE { le_subst = subst, le_env = id_env }) - case_bndr (Var scrut_var) - = le { le_subst = extendSubstWithVar subst case_bndr scrut_var - , le_env = add_id id_env (case_bndr, scrut_var) } -extendCaseBndrEnv env _ _ = env - --- See Note [Join ceiling] -placeJoinCeiling :: LevelEnv -> LevelEnv -placeJoinCeiling le@(LE { le_ctxt_lvl = lvl }) - = le { le_ctxt_lvl = lvl', le_join_ceil = lvl' } - where - lvl' = asJoinCeilLvl (incMinorLvl lvl) - -maxFvLevel :: (Var -> Bool) -> LevelEnv -> DVarSet -> Level -maxFvLevel max_me env var_set - = foldDVarSet (maxIn max_me env) tOP_LEVEL var_set - -maxFvLevel' :: (Var -> Bool) -> LevelEnv -> TyCoVarSet -> Level --- Same but for TyCoVarSet -maxFvLevel' max_me env var_set - = nonDetFoldUniqSet (maxIn max_me env) tOP_LEVEL var_set - -maxIn :: (Var -> Bool) -> LevelEnv -> InVar -> Level -> Level -maxIn max_me (LE { le_lvl_env = lvl_env, le_env = id_env }) in_var lvl - = case lookupVarEnv id_env in_var of - Just (abs_vars, _) -> foldr max_out lvl abs_vars - Nothing -> max_out in_var lvl - where - max_out out_var lvl - | max_me out_var = case lookupVarEnv lvl_env out_var of - Just lvl' -> maxLvl lvl' lvl - Nothing -> lvl - | otherwise = lvl -- Ignore some vars depending on max_me - -lookupVar :: LevelEnv -> Id -> LevelledExpr -lookupVar le v = case lookupVarEnv (le_env le) v of - Just (_, expr) -> expr - _ -> Var v - --- Level to which join points are allowed to float (boundary of current tail --- context). See Note [Join ceiling] -joinCeilingLevel :: LevelEnv -> Level -joinCeilingLevel = le_join_ceil - -abstractVars :: Level -> LevelEnv -> DVarSet -> [OutVar] - -- Find the variables in fvs, free vars of the target expression, - -- whose level is greater than the destination level - -- These are the ones we are going to abstract out - -- - -- Note that to get reproducible builds, the variables need to be - -- abstracted in deterministic order, not dependent on the values of - -- Uniques. This is achieved by using DVarSets, deterministic free - -- variable computation and deterministic sort. - -- See Note [Unique Determinism] in Unique for explanation of why - -- Uniques are not deterministic. -abstractVars dest_lvl (LE { le_subst = subst, le_lvl_env = lvl_env }) in_fvs - = -- NB: sortQuantVars might not put duplicates next to each other - map zap $ sortQuantVars $ - filter abstract_me $ - dVarSetElems $ - closeOverKindsDSet $ - substDVarSet subst in_fvs - -- NB: it's important to call abstract_me only on the OutIds the - -- come from substDVarSet (not on fv, which is an InId) - where - abstract_me v = case lookupVarEnv lvl_env v of - Just lvl -> dest_lvl `ltLvl` lvl - Nothing -> False - - -- We are going to lambda-abstract, so nuke any IdInfo, - -- and add the tyvars of the Id (if necessary) - zap v | isId v = WARN( isStableUnfolding (idUnfolding v) || - not (isEmptyRuleInfo (idSpecialisation v)), - text "absVarsOf: discarding info on" <+> ppr v ) - setIdInfo v vanillaIdInfo - | otherwise = v - -type LvlM result = UniqSM result - -initLvl :: UniqSupply -> UniqSM a -> a -initLvl = initUs_ - -newPolyBndrs :: Level -> LevelEnv -> [OutVar] -> [InId] - -> LvlM (LevelEnv, [OutId]) --- The envt is extended to bind the new bndrs to dest_lvl, but --- the le_ctxt_lvl is unaffected -newPolyBndrs dest_lvl - env@(LE { le_lvl_env = lvl_env, le_subst = subst, le_env = id_env }) - abs_vars bndrs - = ASSERT( all (not . isCoVar) bndrs ) -- What would we add to the CoSubst in this case. No easy answer. - do { uniqs <- getUniquesM - ; let new_bndrs = zipWith mk_poly_bndr bndrs uniqs - bndr_prs = bndrs `zip` new_bndrs - env' = env { le_lvl_env = addLvls dest_lvl lvl_env new_bndrs - , le_subst = foldl' add_subst subst bndr_prs - , le_env = foldl' add_id id_env bndr_prs } - ; return (env', new_bndrs) } - where - add_subst env (v, v') = extendIdSubst env v (mkVarApps (Var v') abs_vars) - add_id env (v, v') = extendVarEnv env v ((v':abs_vars), mkVarApps (Var v') abs_vars) - - mk_poly_bndr bndr uniq = transferPolyIdInfo bndr abs_vars $ -- Note [transferPolyIdInfo] in Id.hs - transfer_join_info bndr $ - mkSysLocal (mkFastString str) uniq poly_ty - where - str = "poly_" ++ occNameString (getOccName bndr) - poly_ty = mkLamTypes abs_vars (GHC.Core.Subst.substTy subst (idType bndr)) - - -- If we are floating a join point to top level, it stops being - -- a join point. Otherwise it continues to be a join point, - -- but we may need to adjust its arity - dest_is_top = isTopLvl dest_lvl - transfer_join_info bndr new_bndr - | Just join_arity <- isJoinId_maybe bndr - , not dest_is_top - = new_bndr `asJoinId` join_arity + length abs_vars - | otherwise - = new_bndr - -newLvlVar :: LevelledExpr -- The RHS of the new binding - -> Maybe JoinArity -- Its join arity, if it is a join point - -> Bool -- True <=> the RHS looks like (makeStatic ...) - -> LvlM Id -newLvlVar lvld_rhs join_arity_maybe is_mk_static - = do { uniq <- getUniqueM - ; return (add_join_info (mk_id uniq rhs_ty)) - } - where - add_join_info var = var `asJoinId_maybe` join_arity_maybe - de_tagged_rhs = deTagExpr lvld_rhs - rhs_ty = exprType de_tagged_rhs - - mk_id uniq rhs_ty - -- See Note [Grand plan for static forms] in StaticPtrTable. - | is_mk_static - = mkExportedVanillaId (mkSystemVarName uniq (mkFastString "static_ptr")) - rhs_ty - | otherwise - = mkSysLocal (mkFastString "lvl") uniq rhs_ty - --- | Clone the binders bound by a single-alternative case. -cloneCaseBndrs :: LevelEnv -> Level -> [Var] -> LvlM (LevelEnv, [Var]) -cloneCaseBndrs env@(LE { le_subst = subst, le_lvl_env = lvl_env, le_env = id_env }) - new_lvl vs - = do { us <- getUniqueSupplyM - ; let (subst', vs') = cloneBndrs subst us vs - -- N.B. We are not moving the body of the case, merely its case - -- binders. Consequently we should *not* set le_ctxt_lvl and - -- le_join_ceil. See Note [Setting levels when floating - -- single-alternative cases]. - env' = env { le_lvl_env = addLvls new_lvl lvl_env vs' - , le_subst = subst' - , le_env = foldl' add_id id_env (vs `zip` vs') } - - ; return (env', vs') } - -cloneLetVars :: RecFlag -> LevelEnv -> Level -> [InVar] - -> LvlM (LevelEnv, [OutVar]) --- See Note [Need for cloning during float-out] --- Works for Ids bound by let(rec) --- The dest_lvl is attributed to the binders in the new env, --- but cloneVars doesn't affect the le_ctxt_lvl of the incoming env -cloneLetVars is_rec - env@(LE { le_subst = subst, le_lvl_env = lvl_env, le_env = id_env }) - dest_lvl vs - = do { us <- getUniqueSupplyM - ; let vs1 = map zap vs - -- See Note [Zapping the demand info] - (subst', vs2) = case is_rec of - NonRecursive -> cloneBndrs subst us vs1 - Recursive -> cloneRecIdBndrs subst us vs1 - prs = vs `zip` vs2 - env' = env { le_lvl_env = addLvls dest_lvl lvl_env vs2 - , le_subst = subst' - , le_env = foldl' add_id id_env prs } - - ; return (env', vs2) } - where - zap :: Var -> Var - zap v | isId v = zap_join (zapIdDemandInfo v) - | otherwise = v - - zap_join | isTopLvl dest_lvl = zapJoinId - | otherwise = id - -add_id :: IdEnv ([Var], LevelledExpr) -> (Var, Var) -> IdEnv ([Var], LevelledExpr) -add_id id_env (v, v1) - | isTyVar v = delVarEnv id_env v - | otherwise = extendVarEnv id_env v ([v1], ASSERT(not (isCoVar v1)) Var v1) - -{- -Note [Zapping the demand info] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -VERY IMPORTANT: we must zap the demand info if the thing is going to -float out, because it may be less demanded than at its original -binding site. Eg - f :: Int -> Int - f x = let v = 3*4 in v+x -Here v is strict; but if we float v to top level, it isn't any more. - -Similarly, if we're floating a join point, it won't be one anymore, so we zap -join point information as well. --} diff --git a/compiler/simplCore/SimplCore.hs b/compiler/simplCore/SimplCore.hs deleted file mode 100644 index faeb3c5811..0000000000 --- a/compiler/simplCore/SimplCore.hs +++ /dev/null @@ -1,1037 +0,0 @@ -{- -(c) The GRASP/AQUA Project, Glasgow University, 1992-1998 - -\section[SimplCore]{Driver for simplifying @Core@ programs} --} - -{-# LANGUAGE CPP #-} - -module SimplCore ( core2core, simplifyExpr ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Driver.Session -import GHC.Core -import GHC.Driver.Types -import CSE ( cseProgram ) -import GHC.Core.Rules ( mkRuleBase, unionRuleBase, - extendRuleBaseList, ruleCheckProgram, addRuleInfo, - getRules ) -import GHC.Core.Ppr ( pprCoreBindings, pprCoreExpr ) -import OccurAnal ( occurAnalysePgm, occurAnalyseExpr ) -import IdInfo -import GHC.Core.Stats ( coreBindsSize, coreBindsStats, exprSize ) -import GHC.Core.Utils ( mkTicks, stripTicksTop ) -import GHC.Core.Lint ( endPass, lintPassResult, dumpPassResult, - lintAnnots ) -import Simplify ( simplTopBinds, simplExpr, simplRules ) -import SimplUtils ( simplEnvForGHCi, activeRule, activeUnfolding ) -import SimplEnv -import SimplMonad -import CoreMonad -import qualified ErrUtils as Err -import FloatIn ( floatInwards ) -import FloatOut ( floatOutwards ) -import GHC.Core.FamInstEnv -import Id -import ErrUtils ( withTiming, withTimingD, DumpFormat (..) ) -import BasicTypes ( CompilerPhase(..), isDefaultInlinePragma, defaultInlinePragma ) -import VarSet -import VarEnv -import LiberateCase ( liberateCase ) -import SAT ( doStaticArgs ) -import Specialise ( specProgram) -import SpecConstr ( specConstrProgram) -import DmdAnal ( dmdAnalProgram ) -import CprAnal ( cprAnalProgram ) -import CallArity ( callArityAnalProgram ) -import Exitify ( exitifyProgram ) -import WorkWrap ( wwTopBinds ) -import SrcLoc -import Util -import Module -import GHC.Driver.Plugins ( withPlugins, installCoreToDos ) -import GHC.Runtime.Loader -- ( initializePlugins ) - -import UniqSupply ( UniqSupply, mkSplitUniqSupply, splitUniqSupply ) -import UniqFM -import Outputable -import Control.Monad -import qualified GHC.LanguageExtensions as LangExt -{- -************************************************************************ -* * -\subsection{The driver for the simplifier} -* * -************************************************************************ --} - -core2core :: HscEnv -> ModGuts -> IO ModGuts -core2core hsc_env guts@(ModGuts { mg_module = mod - , mg_loc = loc - , mg_deps = deps - , mg_rdr_env = rdr_env }) - = do { -- make sure all plugins are loaded - - ; let builtin_passes = getCoreToDo dflags - orph_mods = mkModuleSet (mod : dep_orphs deps) - uniq_mask = 's' - ; - ; (guts2, stats) <- runCoreM hsc_env hpt_rule_base uniq_mask mod - orph_mods print_unqual loc $ - do { hsc_env' <- getHscEnv - ; dflags' <- liftIO $ initializePlugins hsc_env' - (hsc_dflags hsc_env') - ; all_passes <- withPlugins dflags' - installCoreToDos - builtin_passes - ; runCorePasses all_passes guts } - - ; Err.dumpIfSet_dyn dflags Opt_D_dump_simpl_stats - "Grand total simplifier statistics" - FormatText - (pprSimplCount stats) - - ; return guts2 } - where - dflags = hsc_dflags hsc_env - home_pkg_rules = hptRules hsc_env (dep_mods deps) - hpt_rule_base = mkRuleBase home_pkg_rules - print_unqual = mkPrintUnqualified dflags rdr_env - -- mod: get the module out of the current HscEnv so we can retrieve it from the monad. - -- This is very convienent for the users of the monad (e.g. plugins do not have to - -- consume the ModGuts to find the module) but somewhat ugly because mg_module may - -- _theoretically_ be changed during the Core pipeline (it's part of ModGuts), which - -- would mean our cached value would go out of date. - -{- -************************************************************************ -* * - Generating the main optimisation pipeline -* * -************************************************************************ --} - -getCoreToDo :: DynFlags -> [CoreToDo] -getCoreToDo dflags - = flatten_todos core_todo - where - opt_level = optLevel dflags - phases = simplPhases dflags - max_iter = maxSimplIterations dflags - rule_check = ruleCheck dflags - call_arity = gopt Opt_CallArity dflags - exitification = gopt Opt_Exitification dflags - strictness = gopt Opt_Strictness dflags - full_laziness = gopt Opt_FullLaziness dflags - do_specialise = gopt Opt_Specialise dflags - do_float_in = gopt Opt_FloatIn dflags - cse = gopt Opt_CSE dflags - spec_constr = gopt Opt_SpecConstr dflags - liberate_case = gopt Opt_LiberateCase dflags - late_dmd_anal = gopt Opt_LateDmdAnal dflags - late_specialise = gopt Opt_LateSpecialise dflags - static_args = gopt Opt_StaticArgumentTransformation dflags - rules_on = gopt Opt_EnableRewriteRules dflags - eta_expand_on = gopt Opt_DoLambdaEtaExpansion dflags - ww_on = gopt Opt_WorkerWrapper dflags - static_ptrs = xopt LangExt.StaticPointers dflags - - maybe_rule_check phase = runMaybe rule_check (CoreDoRuleCheck phase) - - maybe_strictness_before phase - = runWhen (phase `elem` strictnessBefore dflags) CoreDoDemand - - base_mode = SimplMode { sm_phase = panic "base_mode" - , sm_names = [] - , sm_dflags = dflags - , sm_rules = rules_on - , sm_eta_expand = eta_expand_on - , sm_inline = True - , sm_case_case = True } - - simpl_phase phase names iter - = CoreDoPasses - $ [ maybe_strictness_before phase - , CoreDoSimplify iter - (base_mode { sm_phase = Phase phase - , sm_names = names }) - - , maybe_rule_check (Phase phase) ] - - simpl_phases = CoreDoPasses [ simpl_phase phase ["main"] max_iter - | phase <- [phases, phases-1 .. 1] ] - - - -- initial simplify: mk specialiser happy: minimum effort please - simpl_gently = CoreDoSimplify max_iter - (base_mode { sm_phase = InitialPhase - , sm_names = ["Gentle"] - , sm_rules = rules_on -- Note [RULEs enabled in InitialPhase] - , sm_inline = True - -- See Note [Inline in InitialPhase] - , sm_case_case = False }) - -- Don't do case-of-case transformations. - -- This makes full laziness work better - - dmd_cpr_ww = if ww_on then [CoreDoDemand,CoreDoCpr,CoreDoWorkerWrapper] - else [CoreDoDemand,CoreDoCpr] - - - demand_analyser = (CoreDoPasses ( - dmd_cpr_ww ++ - [simpl_phase 0 ["post-worker-wrapper"] max_iter] - )) - - -- Static forms are moved to the top level with the FloatOut pass. - -- See Note [Grand plan for static forms] in StaticPtrTable. - static_ptrs_float_outwards = - runWhen static_ptrs $ CoreDoPasses - [ simpl_gently -- Float Out can't handle type lets (sometimes created - -- by simpleOptPgm via mkParallelBindings) - , CoreDoFloatOutwards FloatOutSwitches - { floatOutLambdas = Just 0 - , floatOutConstants = True - , floatOutOverSatApps = False - , floatToTopLevelOnly = True - } - ] - - core_todo = - if opt_level == 0 then - [ static_ptrs_float_outwards, - CoreDoSimplify max_iter - (base_mode { sm_phase = Phase 0 - , sm_names = ["Non-opt simplification"] }) - ] - - else {- opt_level >= 1 -} [ - - -- We want to do the static argument transform before full laziness as it - -- may expose extra opportunities to float things outwards. However, to fix - -- up the output of the transformation we need at do at least one simplify - -- after this before anything else - runWhen static_args (CoreDoPasses [ simpl_gently, CoreDoStaticArgs ]), - - -- initial simplify: mk specialiser happy: minimum effort please - simpl_gently, - - -- Specialisation is best done before full laziness - -- so that overloaded functions have all their dictionary lambdas manifest - runWhen do_specialise CoreDoSpecialising, - - if full_laziness then - CoreDoFloatOutwards FloatOutSwitches { - floatOutLambdas = Just 0, - floatOutConstants = True, - floatOutOverSatApps = False, - floatToTopLevelOnly = False } - -- Was: gentleFloatOutSwitches - -- - -- I have no idea why, but not floating constants to - -- top level is very bad in some cases. - -- - -- Notably: p_ident in spectral/rewrite - -- Changing from "gentle" to "constantsOnly" - -- improved rewrite's allocation by 19%, and - -- made 0.0% difference to any other nofib - -- benchmark - -- - -- Not doing floatOutOverSatApps yet, we'll do - -- that later on when we've had a chance to get more - -- accurate arity information. In fact it makes no - -- difference at all to performance if we do it here, - -- but maybe we save some unnecessary to-and-fro in - -- the simplifier. - else - -- Even with full laziness turned off, we still need to float static - -- forms to the top level. See Note [Grand plan for static forms] in - -- StaticPtrTable. - static_ptrs_float_outwards, - - simpl_phases, - - -- Phase 0: allow all Ids to be inlined now - -- This gets foldr inlined before strictness analysis - - -- At least 3 iterations because otherwise we land up with - -- huge dead expressions because of an infelicity in the - -- simplifier. - -- let k = BIG in foldr k z xs - -- ==> let k = BIG in letrec go = \xs -> ...(k x).... in go xs - -- ==> let k = BIG in letrec go = \xs -> ...(BIG x).... in go xs - -- Don't stop now! - simpl_phase 0 ["main"] (max max_iter 3), - - runWhen do_float_in CoreDoFloatInwards, - -- Run float-inwards immediately before the strictness analyser - -- Doing so pushes bindings nearer their use site and hence makes - -- them more likely to be strict. These bindings might only show - -- up after the inlining from simplification. Example in fulsom, - -- Csg.calc, where an arg of timesDouble thereby becomes strict. - - runWhen call_arity $ CoreDoPasses - [ CoreDoCallArity - , simpl_phase 0 ["post-call-arity"] max_iter - ], - - runWhen strictness demand_analyser, - - runWhen exitification CoreDoExitify, - -- See note [Placement of the exitification pass] - - runWhen full_laziness $ - CoreDoFloatOutwards FloatOutSwitches { - floatOutLambdas = floatLamArgs dflags, - floatOutConstants = True, - floatOutOverSatApps = True, - floatToTopLevelOnly = False }, - -- nofib/spectral/hartel/wang doubles in speed if you - -- do full laziness late in the day. It only happens - -- after fusion and other stuff, so the early pass doesn't - -- catch it. For the record, the redex is - -- f_el22 (f_el21 r_midblock) - - - runWhen cse CoreCSE, - -- We want CSE to follow the final full-laziness pass, because it may - -- succeed in commoning up things floated out by full laziness. - -- CSE used to rely on the no-shadowing invariant, but it doesn't any more - - runWhen do_float_in CoreDoFloatInwards, - - maybe_rule_check (Phase 0), - - -- Case-liberation for -O2. This should be after - -- strictness analysis and the simplification which follows it. - runWhen liberate_case (CoreDoPasses [ - CoreLiberateCase, - simpl_phase 0 ["post-liberate-case"] max_iter - ]), -- Run the simplifier after LiberateCase to vastly - -- reduce the possibility of shadowing - -- Reason: see Note [Shadowing] in SpecConstr.hs - - runWhen spec_constr CoreDoSpecConstr, - - maybe_rule_check (Phase 0), - - runWhen late_specialise - (CoreDoPasses [ CoreDoSpecialising - , simpl_phase 0 ["post-late-spec"] max_iter]), - - -- LiberateCase can yield new CSE opportunities because it peels - -- off one layer of a recursive function (concretely, I saw this - -- in wheel-sieve1), and I'm guessing that SpecConstr can too - -- And CSE is a very cheap pass. So it seems worth doing here. - runWhen ((liberate_case || spec_constr) && cse) CoreCSE, - - -- Final clean-up simplification: - simpl_phase 0 ["final"] max_iter, - - runWhen late_dmd_anal $ CoreDoPasses ( - dmd_cpr_ww ++ - [simpl_phase 0 ["post-late-ww"] max_iter] - ), - - -- Final run of the demand_analyser, ensures that one-shot thunks are - -- really really one-shot thunks. Only needed if the demand analyser - -- has run at all. See Note [Final Demand Analyser run] in DmdAnal - -- It is EXTREMELY IMPORTANT to run this pass, otherwise execution - -- can become /exponentially/ more expensive. See #11731, #12996. - runWhen (strictness || late_dmd_anal) CoreDoDemand, - - maybe_rule_check (Phase 0) - ] - - -- Remove 'CoreDoNothing' and flatten 'CoreDoPasses' for clarity. - flatten_todos [] = [] - flatten_todos (CoreDoNothing : rest) = flatten_todos rest - flatten_todos (CoreDoPasses passes : rest) = - flatten_todos passes ++ flatten_todos rest - flatten_todos (todo : rest) = todo : flatten_todos rest - -{- Note [Inline in InitialPhase] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In GHC 8 and earlier we did not inline anything in the InitialPhase. But that is -confusing for users because when they say INLINE they expect the function to inline -right away. - -So now we do inlining immediately, even in the InitialPhase, assuming that the -Id's Activation allows it. - -This is a surprisingly big deal. Compiler performance improved a lot -when I made this change: - - perf/compiler/T5837.run T5837 [stat too good] (normal) - perf/compiler/parsing001.run parsing001 [stat too good] (normal) - perf/compiler/T12234.run T12234 [stat too good] (optasm) - perf/compiler/T9020.run T9020 [stat too good] (optasm) - perf/compiler/T3064.run T3064 [stat too good] (normal) - perf/compiler/T9961.run T9961 [stat too good] (normal) - perf/compiler/T13056.run T13056 [stat too good] (optasm) - perf/compiler/T9872d.run T9872d [stat too good] (normal) - perf/compiler/T783.run T783 [stat too good] (normal) - perf/compiler/T12227.run T12227 [stat too good] (normal) - perf/should_run/lazy-bs-alloc.run lazy-bs-alloc [stat too good] (normal) - perf/compiler/T1969.run T1969 [stat too good] (normal) - perf/compiler/T9872a.run T9872a [stat too good] (normal) - perf/compiler/T9872c.run T9872c [stat too good] (normal) - perf/compiler/T9872b.run T9872b [stat too good] (normal) - perf/compiler/T9872d.run T9872d [stat too good] (normal) - -Note [RULEs enabled in InitialPhase] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -RULES are enabled when doing "gentle" simplification in InitialPhase, -or with -O0. Two reasons: - - * We really want the class-op cancellation to happen: - op (df d1 d2) --> $cop3 d1 d2 - because this breaks the mutual recursion between 'op' and 'df' - - * I wanted the RULE - lift String ===> ... - to work in Template Haskell when simplifying - splices, so we get simpler code for literal strings - -But watch out: list fusion can prevent floating. So use phase control -to switch off those rules until after floating. - -************************************************************************ -* * - The CoreToDo interpreter -* * -************************************************************************ --} - -runCorePasses :: [CoreToDo] -> ModGuts -> CoreM ModGuts -runCorePasses passes guts - = foldM do_pass guts passes - where - do_pass guts CoreDoNothing = return guts - do_pass guts (CoreDoPasses ps) = runCorePasses ps guts - do_pass guts pass = do - withTimingD (ppr pass <+> brackets (ppr mod)) - (const ()) $ do - { guts' <- lintAnnots (ppr pass) (doCorePass pass) guts - ; endPass pass (mg_binds guts') (mg_rules guts') - ; return guts' } - - mod = mg_module guts - -doCorePass :: CoreToDo -> ModGuts -> CoreM ModGuts -doCorePass pass@(CoreDoSimplify {}) = {-# SCC "Simplify" #-} - simplifyPgm pass - -doCorePass CoreCSE = {-# SCC "CommonSubExpr" #-} - doPass cseProgram - -doCorePass CoreLiberateCase = {-# SCC "LiberateCase" #-} - doPassD liberateCase - -doCorePass CoreDoFloatInwards = {-# SCC "FloatInwards" #-} - floatInwards - -doCorePass (CoreDoFloatOutwards f) = {-# SCC "FloatOutwards" #-} - doPassDUM (floatOutwards f) - -doCorePass CoreDoStaticArgs = {-# SCC "StaticArgs" #-} - doPassU doStaticArgs - -doCorePass CoreDoCallArity = {-# SCC "CallArity" #-} - doPassD callArityAnalProgram - -doCorePass CoreDoExitify = {-# SCC "Exitify" #-} - doPass exitifyProgram - -doCorePass CoreDoDemand = {-# SCC "DmdAnal" #-} - doPassDFM dmdAnalProgram - -doCorePass CoreDoCpr = {-# SCC "CprAnal" #-} - doPassDFM cprAnalProgram - -doCorePass CoreDoWorkerWrapper = {-# SCC "WorkWrap" #-} - doPassDFU wwTopBinds - -doCorePass CoreDoSpecialising = {-# SCC "Specialise" #-} - specProgram - -doCorePass CoreDoSpecConstr = {-# SCC "SpecConstr" #-} - specConstrProgram - -doCorePass CoreDoPrintCore = observe printCore -doCorePass (CoreDoRuleCheck phase pat) = ruleCheckPass phase pat -doCorePass CoreDoNothing = return -doCorePass (CoreDoPasses passes) = runCorePasses passes - -doCorePass (CoreDoPluginPass _ pass) = {-# SCC "Plugin" #-} pass - -doCorePass pass@CoreDesugar = pprPanic "doCorePass" (ppr pass) -doCorePass pass@CoreDesugarOpt = pprPanic "doCorePass" (ppr pass) -doCorePass pass@CoreTidy = pprPanic "doCorePass" (ppr pass) -doCorePass pass@CorePrep = pprPanic "doCorePass" (ppr pass) -doCorePass pass@CoreOccurAnal = pprPanic "doCorePass" (ppr pass) - -{- -************************************************************************ -* * -\subsection{Core pass combinators} -* * -************************************************************************ --} - -printCore :: DynFlags -> CoreProgram -> IO () -printCore dflags binds - = Err.dumpIfSet dflags True "Print Core" (pprCoreBindings binds) - -ruleCheckPass :: CompilerPhase -> String -> ModGuts -> CoreM ModGuts -ruleCheckPass current_phase pat guts = - withTimingD (text "RuleCheck"<+>brackets (ppr $ mg_module guts)) - (const ()) $ do - { rb <- getRuleBase - ; dflags <- getDynFlags - ; vis_orphs <- getVisibleOrphanMods - ; let rule_fn fn = getRules (RuleEnv rb vis_orphs) fn - ++ (mg_rules guts) - ; liftIO $ putLogMsg dflags NoReason Err.SevDump noSrcSpan - (defaultDumpStyle dflags) - (ruleCheckProgram current_phase pat - rule_fn (mg_binds guts)) - ; return guts } - -doPassDUM :: (DynFlags -> UniqSupply -> CoreProgram -> IO CoreProgram) -> ModGuts -> CoreM ModGuts -doPassDUM do_pass = doPassM $ \binds -> do - dflags <- getDynFlags - us <- getUniqueSupplyM - liftIO $ do_pass dflags us binds - -doPassDM :: (DynFlags -> CoreProgram -> IO CoreProgram) -> ModGuts -> CoreM ModGuts -doPassDM do_pass = doPassDUM (\dflags -> const (do_pass dflags)) - -doPassD :: (DynFlags -> CoreProgram -> CoreProgram) -> ModGuts -> CoreM ModGuts -doPassD do_pass = doPassDM (\dflags -> return . do_pass dflags) - -doPassDU :: (DynFlags -> UniqSupply -> CoreProgram -> CoreProgram) -> ModGuts -> CoreM ModGuts -doPassDU do_pass = doPassDUM (\dflags us -> return . do_pass dflags us) - -doPassU :: (UniqSupply -> CoreProgram -> CoreProgram) -> ModGuts -> CoreM ModGuts -doPassU do_pass = doPassDU (const do_pass) - -doPassDFM :: (DynFlags -> FamInstEnvs -> CoreProgram -> IO CoreProgram) -> ModGuts -> CoreM ModGuts -doPassDFM do_pass guts = do - dflags <- getDynFlags - p_fam_env <- getPackageFamInstEnv - let fam_envs = (p_fam_env, mg_fam_inst_env guts) - doPassM (liftIO . do_pass dflags fam_envs) guts - -doPassDFU :: (DynFlags -> FamInstEnvs -> UniqSupply -> CoreProgram -> CoreProgram) -> ModGuts -> CoreM ModGuts -doPassDFU do_pass guts = do - dflags <- getDynFlags - us <- getUniqueSupplyM - p_fam_env <- getPackageFamInstEnv - let fam_envs = (p_fam_env, mg_fam_inst_env guts) - doPass (do_pass dflags fam_envs us) guts - --- Most passes return no stats and don't change rules: these combinators --- let us lift them to the full blown ModGuts+CoreM world -doPassM :: Monad m => (CoreProgram -> m CoreProgram) -> ModGuts -> m ModGuts -doPassM bind_f guts = do - binds' <- bind_f (mg_binds guts) - return (guts { mg_binds = binds' }) - -doPass :: (CoreProgram -> CoreProgram) -> ModGuts -> CoreM ModGuts -doPass bind_f guts = return $ guts { mg_binds = bind_f (mg_binds guts) } - --- Observer passes just peek; don't modify the bindings at all -observe :: (DynFlags -> CoreProgram -> IO a) -> ModGuts -> CoreM ModGuts -observe do_pass = doPassM $ \binds -> do - dflags <- getDynFlags - _ <- liftIO $ do_pass dflags binds - return binds - -{- -************************************************************************ -* * - Gentle simplification -* * -************************************************************************ --} - -simplifyExpr :: HscEnv -- includes spec of what core-to-core passes to do - -> CoreExpr - -> IO CoreExpr --- simplifyExpr is called by the driver to simplify an --- expression typed in at the interactive prompt -simplifyExpr hsc_env expr - = withTiming dflags (text "Simplify [expr]") (const ()) $ - do { eps <- hscEPS hsc_env ; - ; let rule_env = mkRuleEnv (eps_rule_base eps) [] - fi_env = ( eps_fam_inst_env eps - , extendFamInstEnvList emptyFamInstEnv $ - snd $ ic_instances $ hsc_IC hsc_env ) - simpl_env = simplEnvForGHCi dflags - - ; us <- mkSplitUniqSupply 's' - ; let sz = exprSize expr - - ; (expr', counts) <- initSmpl dflags rule_env fi_env us sz $ - simplExprGently simpl_env expr - - ; Err.dumpIfSet dflags (dopt Opt_D_dump_simpl_stats dflags) - "Simplifier statistics" (pprSimplCount counts) - - ; Err.dumpIfSet_dyn dflags Opt_D_dump_simpl "Simplified expression" - FormatCore - (pprCoreExpr expr') - - ; return expr' - } - where - dflags = hsc_dflags hsc_env - -simplExprGently :: SimplEnv -> CoreExpr -> SimplM CoreExpr --- Simplifies an expression --- does occurrence analysis, then simplification --- and repeats (twice currently) because one pass --- alone leaves tons of crud. --- Used (a) for user expressions typed in at the interactive prompt --- (b) the LHS and RHS of a RULE --- (c) Template Haskell splices --- --- The name 'Gently' suggests that the SimplMode is InitialPhase, --- and in fact that is so.... but the 'Gently' in simplExprGently doesn't --- enforce that; it just simplifies the expression twice - --- It's important that simplExprGently does eta reduction; see --- Note [Simplifying the left-hand side of a RULE] above. The --- simplifier does indeed do eta reduction (it's in Simplify.completeLam) --- but only if -O is on. - -simplExprGently env expr = do - expr1 <- simplExpr env (occurAnalyseExpr expr) - simplExpr env (occurAnalyseExpr expr1) - -{- -************************************************************************ -* * -\subsection{The driver for the simplifier} -* * -************************************************************************ --} - -simplifyPgm :: CoreToDo -> ModGuts -> CoreM ModGuts -simplifyPgm pass guts - = do { hsc_env <- getHscEnv - ; us <- getUniqueSupplyM - ; rb <- getRuleBase - ; liftIOWithCount $ - simplifyPgmIO pass hsc_env us rb guts } - -simplifyPgmIO :: CoreToDo - -> HscEnv - -> UniqSupply - -> RuleBase - -> ModGuts - -> IO (SimplCount, ModGuts) -- New bindings - -simplifyPgmIO pass@(CoreDoSimplify max_iterations mode) - hsc_env us hpt_rule_base - guts@(ModGuts { mg_module = this_mod - , mg_rdr_env = rdr_env - , mg_deps = deps - , mg_binds = binds, mg_rules = rules - , mg_fam_inst_env = fam_inst_env }) - = do { (termination_msg, it_count, counts_out, guts') - <- do_iteration us 1 [] binds rules - - ; Err.dumpIfSet dflags (dopt Opt_D_verbose_core2core dflags && - dopt Opt_D_dump_simpl_stats dflags) - "Simplifier statistics for following pass" - (vcat [text termination_msg <+> text "after" <+> ppr it_count - <+> text "iterations", - blankLine, - pprSimplCount counts_out]) - - ; return (counts_out, guts') - } - where - dflags = hsc_dflags hsc_env - print_unqual = mkPrintUnqualified dflags rdr_env - simpl_env = mkSimplEnv mode - active_rule = activeRule mode - active_unf = activeUnfolding mode - - do_iteration :: UniqSupply - -> Int -- Counts iterations - -> [SimplCount] -- Counts from earlier iterations, reversed - -> CoreProgram -- Bindings in - -> [CoreRule] -- and orphan rules - -> IO (String, Int, SimplCount, ModGuts) - - do_iteration us iteration_no counts_so_far binds rules - -- iteration_no is the number of the iteration we are - -- about to begin, with '1' for the first - | iteration_no > max_iterations -- Stop if we've run out of iterations - = WARN( debugIsOn && (max_iterations > 2) - , hang (text "Simplifier bailing out after" <+> int max_iterations - <+> text "iterations" - <+> (brackets $ hsep $ punctuate comma $ - map (int . simplCountN) (reverse counts_so_far))) - 2 (text "Size =" <+> ppr (coreBindsStats binds))) - - -- Subtract 1 from iteration_no to get the - -- number of iterations we actually completed - return ( "Simplifier baled out", iteration_no - 1 - , totalise counts_so_far - , guts { mg_binds = binds, mg_rules = rules } ) - - -- Try and force thunks off the binds; significantly reduces - -- space usage, especially with -O. JRS, 000620. - | let sz = coreBindsSize binds - , () <- sz `seq` () -- Force it - = do { - -- Occurrence analysis - let { tagged_binds = {-# SCC "OccAnal" #-} - occurAnalysePgm this_mod active_unf active_rule rules - binds - } ; - Err.dumpIfSet_dyn dflags Opt_D_dump_occur_anal "Occurrence analysis" - FormatCore - (pprCoreBindings tagged_binds); - - -- Get any new rules, and extend the rule base - -- See Note [Overall plumbing for rules] in GHC.Core.Rules - -- We need to do this regularly, because simplification can - -- poke on IdInfo thunks, which in turn brings in new rules - -- behind the scenes. Otherwise there's a danger we'll simply - -- miss the rules for Ids hidden inside imported inlinings - eps <- hscEPS hsc_env ; - let { rule_base1 = unionRuleBase hpt_rule_base (eps_rule_base eps) - ; rule_base2 = extendRuleBaseList rule_base1 rules - ; fam_envs = (eps_fam_inst_env eps, fam_inst_env) - ; vis_orphs = this_mod : dep_orphs deps } ; - - -- Simplify the program - ((binds1, rules1), counts1) <- - initSmpl dflags (mkRuleEnv rule_base2 vis_orphs) fam_envs us1 sz $ - do { (floats, env1) <- {-# SCC "SimplTopBinds" #-} - simplTopBinds simpl_env tagged_binds - - -- Apply the substitution to rules defined in this module - -- for imported Ids. Eg RULE map my_f = blah - -- If we have a substitution my_f :-> other_f, we'd better - -- apply it to the rule to, or it'll never match - ; rules1 <- simplRules env1 Nothing rules Nothing - - ; return (getTopFloatBinds floats, rules1) } ; - - -- Stop if nothing happened; don't dump output - -- See Note [Which transformations are innocuous] in CoreMonad - if isZeroSimplCount counts1 then - return ( "Simplifier reached fixed point", iteration_no - , totalise (counts1 : counts_so_far) -- Include "free" ticks - , guts { mg_binds = binds1, mg_rules = rules1 } ) - else do { - -- Short out indirections - -- We do this *after* at least one run of the simplifier - -- because indirection-shorting uses the export flag on *occurrences* - -- and that isn't guaranteed to be ok until after the first run propagates - -- stuff from the binding site to its occurrences - -- - -- ToDo: alas, this means that indirection-shorting does not happen at all - -- if the simplifier does nothing (not common, I know, but unsavoury) - let { binds2 = {-# SCC "ZapInd" #-} shortOutIndirections binds1 } ; - - -- Dump the result of this iteration - dump_end_iteration dflags print_unqual iteration_no counts1 binds2 rules1 ; - lintPassResult hsc_env pass binds2 ; - - -- Loop - do_iteration us2 (iteration_no + 1) (counts1:counts_so_far) binds2 rules1 - } } -#if __GLASGOW_HASKELL__ <= 810 - | otherwise = panic "do_iteration" -#endif - where - (us1, us2) = splitUniqSupply us - - -- Remember the counts_so_far are reversed - totalise :: [SimplCount] -> SimplCount - totalise = foldr (\c acc -> acc `plusSimplCount` c) - (zeroSimplCount dflags) - -simplifyPgmIO _ _ _ _ _ = panic "simplifyPgmIO" - -------------------- -dump_end_iteration :: DynFlags -> PrintUnqualified -> Int - -> SimplCount -> CoreProgram -> [CoreRule] -> IO () -dump_end_iteration dflags print_unqual iteration_no counts binds rules - = dumpPassResult dflags print_unqual mb_flag hdr pp_counts binds rules - where - mb_flag | dopt Opt_D_dump_simpl_iterations dflags = Just Opt_D_dump_simpl_iterations - | otherwise = Nothing - -- Show details if Opt_D_dump_simpl_iterations is on - - hdr = text "Simplifier iteration=" <> int iteration_no - pp_counts = vcat [ text "---- Simplifier counts for" <+> hdr - , pprSimplCount counts - , text "---- End of simplifier counts for" <+> hdr ] - -{- -************************************************************************ -* * - Shorting out indirections -* * -************************************************************************ - -If we have this: - - x_local = <expression> - ...bindings... - x_exported = x_local - -where x_exported is exported, and x_local is not, then we replace it with this: - - x_exported = <expression> - x_local = x_exported - ...bindings... - -Without this we never get rid of the x_exported = x_local thing. This -save a gratuitous jump (from \tr{x_exported} to \tr{x_local}), and -makes strictness information propagate better. This used to happen in -the final phase, but it's tidier to do it here. - -Note [Messing up the exported Id's RULES] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We must be careful about discarding (obviously) or even merging the -RULES on the exported Id. The example that went bad on me at one stage -was this one: - - iterate :: (a -> a) -> a -> [a] - [Exported] - iterate = iterateList - - iterateFB c f x = x `c` iterateFB c f (f x) - iterateList f x = x : iterateList f (f x) - [Not exported] - - {-# RULES - "iterate" forall f x. iterate f x = build (\c _n -> iterateFB c f x) - "iterateFB" iterateFB (:) = iterateList - #-} - -This got shorted out to: - - iterateList :: (a -> a) -> a -> [a] - iterateList = iterate - - iterateFB c f x = x `c` iterateFB c f (f x) - iterate f x = x : iterate f (f x) - - {-# RULES - "iterate" forall f x. iterate f x = build (\c _n -> iterateFB c f x) - "iterateFB" iterateFB (:) = iterate - #-} - -And now we get an infinite loop in the rule system - iterate f x -> build (\cn -> iterateFB c f x) - -> iterateFB (:) f x - -> iterate f x - -Old "solution": - use rule switching-off pragmas to get rid - of iterateList in the first place - -But in principle the user *might* want rules that only apply to the Id -he says. And inline pragmas are similar - {-# NOINLINE f #-} - f = local - local = <stuff> -Then we do not want to get rid of the NOINLINE. - -Hence hasShortableIdinfo. - - -Note [Rules and indirection-zapping] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Problem: what if x_exported has a RULE that mentions something in ...bindings...? -Then the things mentioned can be out of scope! Solution - a) Make sure that in this pass the usage-info from x_exported is - available for ...bindings... - b) If there are any such RULES, rec-ify the entire top-level. - It'll get sorted out next time round - -Other remarks -~~~~~~~~~~~~~ -If more than one exported thing is equal to a local thing (i.e., the -local thing really is shared), then we do one only: -\begin{verbatim} - x_local = .... - x_exported1 = x_local - x_exported2 = x_local -==> - x_exported1 = .... - - x_exported2 = x_exported1 -\end{verbatim} - -We rely on prior eta reduction to simplify things like -\begin{verbatim} - x_exported = /\ tyvars -> x_local tyvars -==> - x_exported = x_local -\end{verbatim} -Hence,there's a possibility of leaving unchanged something like this: -\begin{verbatim} - x_local = .... - x_exported1 = x_local Int -\end{verbatim} -By the time we've thrown away the types in STG land this -could be eliminated. But I don't think it's very common -and it's dangerous to do this fiddling in STG land -because we might eliminate a binding that's mentioned in the -unfolding for something. - -Note [Indirection zapping and ticks] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Unfortunately this is another place where we need a special case for -ticks. The following happens quite regularly: - - x_local = <expression> - x_exported = tick<x> x_local - -Which we want to become: - - x_exported = tick<x> <expression> - -As it makes no sense to keep the tick and the expression on separate -bindings. Note however that that this might increase the ticks scoping -over the execution of x_local, so we can only do this for floatable -ticks. More often than not, other references will be unfoldings of -x_exported, and therefore carry the tick anyway. --} - -type IndEnv = IdEnv (Id, [Tickish Var]) -- Maps local_id -> exported_id, ticks - -shortOutIndirections :: CoreProgram -> CoreProgram -shortOutIndirections binds - | isEmptyVarEnv ind_env = binds - | no_need_to_flatten = binds' -- See Note [Rules and indirect-zapping] - | otherwise = [Rec (flattenBinds binds')] -- for this no_need_to_flatten stuff - where - ind_env = makeIndEnv binds - -- These exported Ids are the subjects of the indirection-elimination - exp_ids = map fst $ nonDetEltsUFM ind_env - -- It's OK to use nonDetEltsUFM here because we forget the ordering - -- by immediately converting to a set or check if all the elements - -- satisfy a predicate. - exp_id_set = mkVarSet exp_ids - no_need_to_flatten = all (null . ruleInfoRules . idSpecialisation) exp_ids - binds' = concatMap zap binds - - zap (NonRec bndr rhs) = [NonRec b r | (b,r) <- zapPair (bndr,rhs)] - zap (Rec pairs) = [Rec (concatMap zapPair pairs)] - - zapPair (bndr, rhs) - | bndr `elemVarSet` exp_id_set - = [] -- Kill the exported-id binding - - | Just (exp_id, ticks) <- lookupVarEnv ind_env bndr - , (exp_id', lcl_id') <- transferIdInfo exp_id bndr - = -- Turn a local-id binding into two bindings - -- exp_id = rhs; lcl_id = exp_id - [ (exp_id', mkTicks ticks rhs), - (lcl_id', Var exp_id') ] - - | otherwise - = [(bndr,rhs)] - -makeIndEnv :: [CoreBind] -> IndEnv -makeIndEnv binds - = foldl' add_bind emptyVarEnv binds - where - add_bind :: IndEnv -> CoreBind -> IndEnv - add_bind env (NonRec exported_id rhs) = add_pair env (exported_id, rhs) - add_bind env (Rec pairs) = foldl' add_pair env pairs - - add_pair :: IndEnv -> (Id,CoreExpr) -> IndEnv - add_pair env (exported_id, exported) - | (ticks, Var local_id) <- stripTicksTop tickishFloatable exported - , shortMeOut env exported_id local_id - = extendVarEnv env local_id (exported_id, ticks) - add_pair env _ = env - ------------------ -shortMeOut :: IndEnv -> Id -> Id -> Bool -shortMeOut ind_env exported_id local_id --- The if-then-else stuff is just so I can get a pprTrace to see --- how often I don't get shorting out because of IdInfo stuff - = if isExportedId exported_id && -- Only if this is exported - - isLocalId local_id && -- Only if this one is defined in this - -- module, so that we *can* change its - -- binding to be the exported thing! - - not (isExportedId local_id) && -- Only if this one is not itself exported, - -- since the transformation will nuke it - - not (local_id `elemVarEnv` ind_env) -- Only if not already substituted for - then - if hasShortableIdInfo exported_id - then True -- See Note [Messing up the exported Id's IdInfo] - else WARN( True, text "Not shorting out:" <+> ppr exported_id ) - False - else - False - ------------------ -hasShortableIdInfo :: Id -> Bool --- True if there is no user-attached IdInfo on exported_id, --- so we can safely discard it --- See Note [Messing up the exported Id's IdInfo] -hasShortableIdInfo id - = isEmptyRuleInfo (ruleInfo info) - && isDefaultInlinePragma (inlinePragInfo info) - && not (isStableUnfolding (unfoldingInfo info)) - where - info = idInfo id - ------------------ -{- Note [Transferring IdInfo] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have - lcl_id = e; exp_id = lcl_id - -and lcl_id has useful IdInfo, we don't want to discard it by going - gbl_id = e; lcl_id = gbl_id - -Instead, transfer IdInfo from lcl_id to exp_id, specifically -* (Stable) unfolding -* Strictness -* Rules -* Inline pragma - -Overwriting, rather than merging, seems to work ok. - -We also zap the InlinePragma on the lcl_id. It might originally -have had a NOINLINE, which we have now transferred; and we really -want the lcl_id to inline now that its RHS is trivial! --} - -transferIdInfo :: Id -> Id -> (Id, Id) --- See Note [Transferring IdInfo] -transferIdInfo exported_id local_id - = ( modifyIdInfo transfer exported_id - , local_id `setInlinePragma` defaultInlinePragma ) - where - local_info = idInfo local_id - transfer exp_info = exp_info `setStrictnessInfo` strictnessInfo local_info - `setCprInfo` cprInfo local_info - `setUnfoldingInfo` unfoldingInfo local_info - `setInlinePragInfo` inlinePragInfo local_info - `setRuleInfo` addRuleInfo (ruleInfo exp_info) new_info - new_info = setRuleInfoHead (idName exported_id) - (ruleInfo local_info) - -- Remember to set the function-name field of the - -- rules as we transfer them from one function to another diff --git a/compiler/simplCore/SimplEnv.hs b/compiler/simplCore/SimplEnv.hs deleted file mode 100644 index e19b9a19c8..0000000000 --- a/compiler/simplCore/SimplEnv.hs +++ /dev/null @@ -1,938 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section[SimplMonad]{The simplifier Monad} --} - -{-# LANGUAGE CPP #-} - -module SimplEnv ( - -- * The simplifier mode - setMode, getMode, updMode, seDynFlags, - - -- * Environments - SimplEnv(..), pprSimplEnv, -- Temp not abstract - mkSimplEnv, extendIdSubst, - SimplEnv.extendTvSubst, SimplEnv.extendCvSubst, - zapSubstEnv, setSubstEnv, - getInScope, setInScopeFromE, setInScopeFromF, - setInScopeSet, modifyInScope, addNewInScopeIds, - getSimplRules, - - -- * Substitution results - SimplSR(..), mkContEx, substId, lookupRecBndr, refineFromInScope, - - -- * Simplifying 'Id' binders - simplNonRecBndr, simplNonRecJoinBndr, simplRecBndrs, simplRecJoinBndrs, - simplBinder, simplBinders, - substTy, substTyVar, getTCvSubst, - substCo, substCoVar, - - -- * Floats - SimplFloats(..), emptyFloats, mkRecFloats, - mkFloatBind, addLetFloats, addJoinFloats, addFloats, - extendFloats, wrapFloats, - doFloatFromRhs, getTopFloatBinds, - - -- * LetFloats - LetFloats, letFloatBinds, emptyLetFloats, unitLetFloat, - addLetFlts, mapLetFloats, - - -- * JoinFloats - JoinFloat, JoinFloats, emptyJoinFloats, - wrapJoinFloats, wrapJoinFloatsX, unitJoinFloat, addJoinFlts - ) where - -#include "HsVersions.h" - -import GhcPrelude - -import SimplMonad -import CoreMonad ( SimplMode(..) ) -import GHC.Core -import GHC.Core.Utils -import Var -import VarEnv -import VarSet -import OrdList -import Id -import GHC.Core.Make ( mkWildValBinder ) -import GHC.Driver.Session ( DynFlags ) -import TysWiredIn -import qualified GHC.Core.Type as Type -import GHC.Core.Type hiding ( substTy, substTyVar, substTyVarBndr ) -import qualified GHC.Core.Coercion as Coercion -import GHC.Core.Coercion hiding ( substCo, substCoVar, substCoVarBndr ) -import BasicTypes -import MonadUtils -import Outputable -import Util -import UniqFM ( pprUniqFM ) - -import Data.List (mapAccumL) - -{- -************************************************************************ -* * -\subsubsection{The @SimplEnv@ type} -* * -************************************************************************ --} - -data SimplEnv - = SimplEnv { - ----------- Static part of the environment ----------- - -- Static in the sense of lexically scoped, - -- wrt the original expression - - seMode :: SimplMode - - -- The current substitution - , seTvSubst :: TvSubstEnv -- InTyVar |--> OutType - , seCvSubst :: CvSubstEnv -- InCoVar |--> OutCoercion - , seIdSubst :: SimplIdSubst -- InId |--> OutExpr - - ----------- Dynamic part of the environment ----------- - -- Dynamic in the sense of describing the setup where - -- the expression finally ends up - - -- The current set of in-scope variables - -- They are all OutVars, and all bound in this module - , seInScope :: InScopeSet -- OutVars only - } - -data SimplFloats - = SimplFloats - { -- Ordinary let bindings - sfLetFloats :: LetFloats - -- See Note [LetFloats] - - -- Join points - , sfJoinFloats :: JoinFloats - -- Handled separately; they don't go very far - -- We consider these to be /inside/ sfLetFloats - -- because join points can refer to ordinary bindings, - -- but not vice versa - - -- Includes all variables bound by sfLetFloats and - -- sfJoinFloats, plus at least whatever is in scope where - -- these bindings land up. - , sfInScope :: InScopeSet -- All OutVars - } - -instance Outputable SimplFloats where - ppr (SimplFloats { sfLetFloats = lf, sfJoinFloats = jf, sfInScope = is }) - = text "SimplFloats" - <+> braces (vcat [ text "lets: " <+> ppr lf - , text "joins:" <+> ppr jf - , text "in_scope:" <+> ppr is ]) - -emptyFloats :: SimplEnv -> SimplFloats -emptyFloats env - = SimplFloats { sfLetFloats = emptyLetFloats - , sfJoinFloats = emptyJoinFloats - , sfInScope = seInScope env } - -pprSimplEnv :: SimplEnv -> SDoc --- Used for debugging; selective -pprSimplEnv env - = vcat [text "TvSubst:" <+> ppr (seTvSubst env), - text "CvSubst:" <+> ppr (seCvSubst env), - text "IdSubst:" <+> id_subst_doc, - text "InScope:" <+> in_scope_vars_doc - ] - where - id_subst_doc = pprUniqFM ppr (seIdSubst env) - in_scope_vars_doc = pprVarSet (getInScopeVars (seInScope env)) - (vcat . map ppr_one) - ppr_one v | isId v = ppr v <+> ppr (idUnfolding v) - | otherwise = ppr v - -type SimplIdSubst = IdEnv SimplSR -- IdId |--> OutExpr - -- See Note [Extending the Subst] in GHC.Core.Subst - --- | A substitution result. -data SimplSR - = DoneEx OutExpr (Maybe JoinArity) - -- If x :-> DoneEx e ja is in the SimplIdSubst - -- then replace occurrences of x by e - -- and ja = Just a <=> x is a join-point of arity a - -- See Note [Join arity in SimplIdSubst] - - - | DoneId OutId - -- If x :-> DoneId v is in the SimplIdSubst - -- then replace occurrences of x by v - -- and v is a join-point of arity a - -- <=> x is a join-point of arity a - - | ContEx TvSubstEnv -- A suspended substitution - CvSubstEnv - SimplIdSubst - InExpr - -- If x :-> ContEx tv cv id e is in the SimplISubst - -- then replace occurrences of x by (subst (tv,cv,id) e) - -instance Outputable SimplSR where - ppr (DoneId v) = text "DoneId" <+> ppr v - ppr (DoneEx e mj) = text "DoneEx" <> pp_mj <+> ppr e - where - pp_mj = case mj of - Nothing -> empty - Just n -> parens (int n) - - ppr (ContEx _tv _cv _id e) = vcat [text "ContEx" <+> ppr e {-, - ppr (filter_env tv), ppr (filter_env id) -}] - -- where - -- fvs = exprFreeVars e - -- filter_env env = filterVarEnv_Directly keep env - -- keep uniq _ = uniq `elemUFM_Directly` fvs - -{- -Note [SimplEnv invariants] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -seInScope: - The in-scope part of Subst includes *all* in-scope TyVars and Ids - The elements of the set may have better IdInfo than the - occurrences of in-scope Ids, and (more important) they will - have a correctly-substituted type. So we use a lookup in this - set to replace occurrences - - The Ids in the InScopeSet are replete with their Rules, - and as we gather info about the unfolding of an Id, we replace - it in the in-scope set. - - The in-scope set is actually a mapping OutVar -> OutVar, and - in case expressions we sometimes bind - -seIdSubst: - The substitution is *apply-once* only, because InIds and OutIds - can overlap. - For example, we generally omit mappings - a77 -> a77 - from the substitution, when we decide not to clone a77, but it's quite - legitimate to put the mapping in the substitution anyway. - - Furthermore, consider - let x = case k of I# x77 -> ... in - let y = case k of I# x77 -> ... in ... - and suppose the body is strict in both x and y. Then the simplifier - will pull the first (case k) to the top; so the second (case k) will - cancel out, mapping x77 to, well, x77! But one is an in-Id and the - other is an out-Id. - - Of course, the substitution *must* applied! Things in its domain - simply aren't necessarily bound in the result. - -* substId adds a binding (DoneId new_id) to the substitution if - the Id's unique has changed - - Note, though that the substitution isn't necessarily extended - if the type of the Id changes. Why not? Because of the next point: - -* We *always, always* finish by looking up in the in-scope set - any variable that doesn't get a DoneEx or DoneVar hit in the substitution. - Reason: so that we never finish up with a "old" Id in the result. - An old Id might point to an old unfolding and so on... which gives a space - leak. - - [The DoneEx and DoneVar hits map to "new" stuff.] - -* It follows that substExpr must not do a no-op if the substitution is empty. - substType is free to do so, however. - -* When we come to a let-binding (say) we generate new IdInfo, including an - unfolding, attach it to the binder, and add this newly adorned binder to - the in-scope set. So all subsequent occurrences of the binder will get - mapped to the full-adorned binder, which is also the one put in the - binding site. - -* The in-scope "set" usually maps x->x; we use it simply for its domain. - But sometimes we have two in-scope Ids that are synomyms, and should - map to the same target: x->x, y->x. Notably: - case y of x { ... } - That's why the "set" is actually a VarEnv Var - -Note [Join arity in SimplIdSubst] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We have to remember which incoming variables are join points: the occurrences -may not be marked correctly yet, and we're in change of propagating the change if -OccurAnal makes something a join point). - -Normally the in-scope set is where we keep the latest information, but -the in-scope set tracks only OutVars; if a binding is unconditionally -inlined (via DoneEx), it never makes it into the in-scope set, and we -need to know at the occurrence site that the variable is a join point -so that we know to drop the context. Thus we remember which join -points we're substituting. -} - -mkSimplEnv :: SimplMode -> SimplEnv -mkSimplEnv mode - = SimplEnv { seMode = mode - , seInScope = init_in_scope - , seTvSubst = emptyVarEnv - , seCvSubst = emptyVarEnv - , seIdSubst = emptyVarEnv } - -- The top level "enclosing CC" is "SUBSUMED". - -init_in_scope :: InScopeSet -init_in_scope = mkInScopeSet (unitVarSet (mkWildValBinder unitTy)) - -- See Note [WildCard binders] - -{- -Note [WildCard binders] -~~~~~~~~~~~~~~~~~~~~~~~ -The program to be simplified may have wild binders - case e of wild { p -> ... } -We want to *rename* them away, so that there are no -occurrences of 'wild-id' (with wildCardKey). The easy -way to do that is to start of with a representative -Id in the in-scope set - -There can be *occurrences* of wild-id. For example, -GHC.Core.Make.mkCoreApp transforms - e (a /# b) --> case (a /# b) of wild { DEFAULT -> e wild } -This is ok provided 'wild' isn't free in 'e', and that's the delicate -thing. Generally, you want to run the simplifier to get rid of the -wild-ids before doing much else. - -It's a very dark corner of GHC. Maybe it should be cleaned up. --} - -getMode :: SimplEnv -> SimplMode -getMode env = seMode env - -seDynFlags :: SimplEnv -> DynFlags -seDynFlags env = sm_dflags (seMode env) - -setMode :: SimplMode -> SimplEnv -> SimplEnv -setMode mode env = env { seMode = mode } - -updMode :: (SimplMode -> SimplMode) -> SimplEnv -> SimplEnv -updMode upd env = env { seMode = upd (seMode env) } - ---------------------- -extendIdSubst :: SimplEnv -> Id -> SimplSR -> SimplEnv -extendIdSubst env@(SimplEnv {seIdSubst = subst}) var res - = ASSERT2( isId var && not (isCoVar var), ppr var ) - env { seIdSubst = extendVarEnv subst var res } - -extendTvSubst :: SimplEnv -> TyVar -> Type -> SimplEnv -extendTvSubst env@(SimplEnv {seTvSubst = tsubst}) var res - = ASSERT2( isTyVar var, ppr var $$ ppr res ) - env {seTvSubst = extendVarEnv tsubst var res} - -extendCvSubst :: SimplEnv -> CoVar -> Coercion -> SimplEnv -extendCvSubst env@(SimplEnv {seCvSubst = csubst}) var co - = ASSERT( isCoVar var ) - env {seCvSubst = extendVarEnv csubst var co} - ---------------------- -getInScope :: SimplEnv -> InScopeSet -getInScope env = seInScope env - -setInScopeSet :: SimplEnv -> InScopeSet -> SimplEnv -setInScopeSet env in_scope = env {seInScope = in_scope} - -setInScopeFromE :: SimplEnv -> SimplEnv -> SimplEnv --- See Note [Setting the right in-scope set] -setInScopeFromE rhs_env here_env = rhs_env { seInScope = seInScope here_env } - -setInScopeFromF :: SimplEnv -> SimplFloats -> SimplEnv -setInScopeFromF env floats = env { seInScope = sfInScope floats } - -addNewInScopeIds :: SimplEnv -> [CoreBndr] -> SimplEnv - -- The new Ids are guaranteed to be freshly allocated -addNewInScopeIds env@(SimplEnv { seInScope = in_scope, seIdSubst = id_subst }) vs - = env { seInScope = in_scope `extendInScopeSetList` vs, - seIdSubst = id_subst `delVarEnvList` vs } - -- Why delete? Consider - -- let x = a*b in (x, \x -> x+3) - -- We add [x |-> a*b] to the substitution, but we must - -- _delete_ it from the substitution when going inside - -- the (\x -> ...)! - -modifyInScope :: SimplEnv -> CoreBndr -> SimplEnv --- The variable should already be in scope, but --- replace the existing version with this new one --- which has more information -modifyInScope env@(SimplEnv {seInScope = in_scope}) v - = env {seInScope = extendInScopeSet in_scope v} - -{- Note [Setting the right in-scope set] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - \x. (let x = e in b) arg[x] -where the let shadows the lambda. Really this means something like - \x1. (let x2 = e in b) arg[x1] - -- When we capture the 'arg' in an ApplyToVal continuation, we capture - the environment, which says what 'x' is bound to, namely x1 - -- Then that continuation gets pushed under the let - -- Finally we simplify 'arg'. We want - - the static, lexical environment binding x :-> x1 - - the in-scopeset from "here", under the 'let' which includes - both x1 and x2 - -It's important to have the right in-scope set, else we may rename a -variable to one that is already in scope. So we must pick up the -in-scope set from "here", but otherwise use the environment we -captured along with 'arg'. This transfer of in-scope set is done by -setInScopeFromE. --} - ---------------------- -zapSubstEnv :: SimplEnv -> SimplEnv -zapSubstEnv env = env {seTvSubst = emptyVarEnv, seCvSubst = emptyVarEnv, seIdSubst = emptyVarEnv} - -setSubstEnv :: SimplEnv -> TvSubstEnv -> CvSubstEnv -> SimplIdSubst -> SimplEnv -setSubstEnv env tvs cvs ids = env { seTvSubst = tvs, seCvSubst = cvs, seIdSubst = ids } - -mkContEx :: SimplEnv -> InExpr -> SimplSR -mkContEx (SimplEnv { seTvSubst = tvs, seCvSubst = cvs, seIdSubst = ids }) e = ContEx tvs cvs ids e - -{- -************************************************************************ -* * -\subsection{LetFloats} -* * -************************************************************************ - -Note [LetFloats] -~~~~~~~~~~~~~~~~ -The LetFloats is a bunch of bindings, classified by a FloatFlag. - -* All of them satisfy the let/app invariant - -Examples - - NonRec x (y:ys) FltLifted - Rec [(x,rhs)] FltLifted - - NonRec x* (p:q) FltOKSpec -- RHS is WHNF. Question: why not FltLifted? - NonRec x# (y +# 3) FltOkSpec -- Unboxed, but ok-for-spec'n - - NonRec x* (f y) FltCareful -- Strict binding; might fail or diverge - -Can't happen: - NonRec x# (a /# b) -- Might fail; does not satisfy let/app - NonRec x# (f y) -- Might diverge; does not satisfy let/app --} - -data LetFloats = LetFloats (OrdList OutBind) FloatFlag - -- See Note [LetFloats] - -type JoinFloat = OutBind -type JoinFloats = OrdList JoinFloat - -data FloatFlag - = FltLifted -- All bindings are lifted and lazy *or* - -- consist of a single primitive string literal - -- Hence ok to float to top level, or recursive - - | FltOkSpec -- All bindings are FltLifted *or* - -- strict (perhaps because unlifted, - -- perhaps because of a strict binder), - -- *and* ok-for-speculation - -- Hence ok to float out of the RHS - -- of a lazy non-recursive let binding - -- (but not to top level, or into a rec group) - - | FltCareful -- At least one binding is strict (or unlifted) - -- and not guaranteed cheap - -- Do not float these bindings out of a lazy let - -instance Outputable LetFloats where - ppr (LetFloats binds ff) = ppr ff $$ ppr (fromOL binds) - -instance Outputable FloatFlag where - ppr FltLifted = text "FltLifted" - ppr FltOkSpec = text "FltOkSpec" - ppr FltCareful = text "FltCareful" - -andFF :: FloatFlag -> FloatFlag -> FloatFlag -andFF FltCareful _ = FltCareful -andFF FltOkSpec FltCareful = FltCareful -andFF FltOkSpec _ = FltOkSpec -andFF FltLifted flt = flt - -doFloatFromRhs :: TopLevelFlag -> RecFlag -> Bool -> SimplFloats -> OutExpr -> Bool --- If you change this function look also at FloatIn.noFloatFromRhs -doFloatFromRhs lvl rec str (SimplFloats { sfLetFloats = LetFloats fs ff }) rhs - = not (isNilOL fs) && want_to_float && can_float - where - want_to_float = isTopLevel lvl || exprIsCheap rhs || exprIsExpandable rhs - -- See Note [Float when cheap or expandable] - can_float = case ff of - FltLifted -> True - FltOkSpec -> isNotTopLevel lvl && isNonRec rec - FltCareful -> isNotTopLevel lvl && isNonRec rec && str - -{- -Note [Float when cheap or expandable] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We want to float a let from a let if the residual RHS is - a) cheap, such as (\x. blah) - b) expandable, such as (f b) if f is CONLIKE -But there are - - cheap things that are not expandable (eg \x. expensive) - - expandable things that are not cheap (eg (f b) where b is CONLIKE) -so we must take the 'or' of the two. --} - -emptyLetFloats :: LetFloats -emptyLetFloats = LetFloats nilOL FltLifted - -emptyJoinFloats :: JoinFloats -emptyJoinFloats = nilOL - -unitLetFloat :: OutBind -> LetFloats --- This key function constructs a singleton float with the right form -unitLetFloat bind = ASSERT(all (not . isJoinId) (bindersOf bind)) - LetFloats (unitOL bind) (flag bind) - where - flag (Rec {}) = FltLifted - flag (NonRec bndr rhs) - | not (isStrictId bndr) = FltLifted - | exprIsTickedString rhs = FltLifted - -- String literals can be floated freely. - -- See Note [Core top-level string literals] in GHC.Core. - | exprOkForSpeculation rhs = FltOkSpec -- Unlifted, and lifted but ok-for-spec (eg HNF) - | otherwise = ASSERT2( not (isUnliftedType (idType bndr)), ppr bndr ) - FltCareful - -- Unlifted binders can only be let-bound if exprOkForSpeculation holds - -unitJoinFloat :: OutBind -> JoinFloats -unitJoinFloat bind = ASSERT(all isJoinId (bindersOf bind)) - unitOL bind - -mkFloatBind :: SimplEnv -> OutBind -> (SimplFloats, SimplEnv) --- Make a singleton SimplFloats, and --- extend the incoming SimplEnv's in-scope set with its binders --- These binders may already be in the in-scope set, --- but may have by now been augmented with more IdInfo -mkFloatBind env bind - = (floats, env { seInScope = in_scope' }) - where - floats - | isJoinBind bind - = SimplFloats { sfLetFloats = emptyLetFloats - , sfJoinFloats = unitJoinFloat bind - , sfInScope = in_scope' } - | otherwise - = SimplFloats { sfLetFloats = unitLetFloat bind - , sfJoinFloats = emptyJoinFloats - , sfInScope = in_scope' } - - in_scope' = seInScope env `extendInScopeSetBind` bind - -extendFloats :: SimplFloats -> OutBind -> SimplFloats --- Add this binding to the floats, and extend the in-scope env too -extendFloats (SimplFloats { sfLetFloats = floats - , sfJoinFloats = jfloats - , sfInScope = in_scope }) - bind - | isJoinBind bind - = SimplFloats { sfInScope = in_scope' - , sfLetFloats = floats - , sfJoinFloats = jfloats' } - | otherwise - = SimplFloats { sfInScope = in_scope' - , sfLetFloats = floats' - , sfJoinFloats = jfloats } - where - in_scope' = in_scope `extendInScopeSetBind` bind - floats' = floats `addLetFlts` unitLetFloat bind - jfloats' = jfloats `addJoinFlts` unitJoinFloat bind - -addLetFloats :: SimplFloats -> LetFloats -> SimplFloats --- Add the let-floats for env2 to env1; --- *plus* the in-scope set for env2, which is bigger --- than that for env1 -addLetFloats floats let_floats@(LetFloats binds _) - = floats { sfLetFloats = sfLetFloats floats `addLetFlts` let_floats - , sfInScope = foldlOL extendInScopeSetBind - (sfInScope floats) binds } - -addJoinFloats :: SimplFloats -> JoinFloats -> SimplFloats -addJoinFloats floats join_floats - = floats { sfJoinFloats = sfJoinFloats floats `addJoinFlts` join_floats - , sfInScope = foldlOL extendInScopeSetBind - (sfInScope floats) join_floats } - -extendInScopeSetBind :: InScopeSet -> CoreBind -> InScopeSet -extendInScopeSetBind in_scope bind - = extendInScopeSetList in_scope (bindersOf bind) - -addFloats :: SimplFloats -> SimplFloats -> SimplFloats --- Add both let-floats and join-floats for env2 to env1; --- *plus* the in-scope set for env2, which is bigger --- than that for env1 -addFloats (SimplFloats { sfLetFloats = lf1, sfJoinFloats = jf1 }) - (SimplFloats { sfLetFloats = lf2, sfJoinFloats = jf2, sfInScope = in_scope }) - = SimplFloats { sfLetFloats = lf1 `addLetFlts` lf2 - , sfJoinFloats = jf1 `addJoinFlts` jf2 - , sfInScope = in_scope } - -addLetFlts :: LetFloats -> LetFloats -> LetFloats -addLetFlts (LetFloats bs1 l1) (LetFloats bs2 l2) - = LetFloats (bs1 `appOL` bs2) (l1 `andFF` l2) - -letFloatBinds :: LetFloats -> [CoreBind] -letFloatBinds (LetFloats bs _) = fromOL bs - -addJoinFlts :: JoinFloats -> JoinFloats -> JoinFloats -addJoinFlts = appOL - -mkRecFloats :: SimplFloats -> SimplFloats --- Flattens the floats from env2 into a single Rec group, --- They must either all be lifted LetFloats or all JoinFloats -mkRecFloats floats@(SimplFloats { sfLetFloats = LetFloats bs ff - , sfJoinFloats = jbs - , sfInScope = in_scope }) - = ASSERT2( case ff of { FltLifted -> True; _ -> False }, ppr (fromOL bs) ) - ASSERT2( isNilOL bs || isNilOL jbs, ppr floats ) - SimplFloats { sfLetFloats = floats' - , sfJoinFloats = jfloats' - , sfInScope = in_scope } - where - floats' | isNilOL bs = emptyLetFloats - | otherwise = unitLetFloat (Rec (flattenBinds (fromOL bs))) - jfloats' | isNilOL jbs = emptyJoinFloats - | otherwise = unitJoinFloat (Rec (flattenBinds (fromOL jbs))) - -wrapFloats :: SimplFloats -> OutExpr -> OutExpr --- Wrap the floats around the expression; they should all --- satisfy the let/app invariant, so mkLets should do the job just fine -wrapFloats (SimplFloats { sfLetFloats = LetFloats bs _ - , sfJoinFloats = jbs }) body - = foldrOL Let (wrapJoinFloats jbs body) bs - -- Note: Always safe to put the joins on the inside - -- since the values can't refer to them - -wrapJoinFloatsX :: SimplFloats -> OutExpr -> (SimplFloats, OutExpr) --- Wrap the sfJoinFloats of the env around the expression, --- and take them out of the SimplEnv -wrapJoinFloatsX floats body - = ( floats { sfJoinFloats = emptyJoinFloats } - , wrapJoinFloats (sfJoinFloats floats) body ) - -wrapJoinFloats :: JoinFloats -> OutExpr -> OutExpr --- Wrap the sfJoinFloats of the env around the expression, --- and take them out of the SimplEnv -wrapJoinFloats join_floats body - = foldrOL Let body join_floats - -getTopFloatBinds :: SimplFloats -> [CoreBind] -getTopFloatBinds (SimplFloats { sfLetFloats = lbs - , sfJoinFloats = jbs}) - = ASSERT( isNilOL jbs ) -- Can't be any top-level join bindings - letFloatBinds lbs - -mapLetFloats :: LetFloats -> ((Id,CoreExpr) -> (Id,CoreExpr)) -> LetFloats -mapLetFloats (LetFloats fs ff) fun - = LetFloats (mapOL app fs) ff - where - app (NonRec b e) = case fun (b,e) of (b',e') -> NonRec b' e' - app (Rec bs) = Rec (map fun bs) - -{- -************************************************************************ -* * - Substitution of Vars -* * -************************************************************************ - -Note [Global Ids in the substitution] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We look up even a global (eg imported) Id in the substitution. Consider - case X.g_34 of b { (a,b) -> ... case X.g_34 of { (p,q) -> ...} ... } -The binder-swap in the occurrence analyser will add a binding -for a LocalId version of g (with the same unique though): - case X.g_34 of b { (a,b) -> let g_34 = b in - ... case X.g_34 of { (p,q) -> ...} ... } -So we want to look up the inner X.g_34 in the substitution, where we'll -find that it has been substituted by b. (Or conceivably cloned.) --} - -substId :: SimplEnv -> InId -> SimplSR --- Returns DoneEx only on a non-Var expression -substId (SimplEnv { seInScope = in_scope, seIdSubst = ids }) v - = case lookupVarEnv ids v of -- Note [Global Ids in the substitution] - Nothing -> DoneId (refineFromInScope in_scope v) - Just (DoneId v) -> DoneId (refineFromInScope in_scope v) - Just res -> res -- DoneEx non-var, or ContEx - - -- Get the most up-to-date thing from the in-scope set - -- Even though it isn't in the substitution, it may be in - -- the in-scope set with better IdInfo. - -- - -- See also Note [In-scope set as a substitution] in Simplify. - -refineFromInScope :: InScopeSet -> Var -> Var -refineFromInScope in_scope v - | isLocalId v = case lookupInScope in_scope v of - Just v' -> v' - Nothing -> WARN( True, ppr v ) v -- This is an error! - | otherwise = v - -lookupRecBndr :: SimplEnv -> InId -> OutId --- Look up an Id which has been put into the envt by simplRecBndrs, --- but where we have not yet done its RHS -lookupRecBndr (SimplEnv { seInScope = in_scope, seIdSubst = ids }) v - = case lookupVarEnv ids v of - Just (DoneId v) -> v - Just _ -> pprPanic "lookupRecBndr" (ppr v) - Nothing -> refineFromInScope in_scope v - -{- -************************************************************************ -* * -\section{Substituting an Id binder} -* * -************************************************************************ - - -These functions are in the monad only so that they can be made strict via seq. - -Note [Return type for join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - - (join j :: Char -> Int -> Int) 77 - ( j x = \y. y + ord x ) - (in case v of ) - ( A -> j 'x' ) - ( B -> j 'y' ) - ( C -> <blah> ) - -The simplifier pushes the "apply to 77" continuation inwards to give - - join j :: Char -> Int - j x = (\y. y + ord x) 77 - in case v of - A -> j 'x' - B -> j 'y' - C -> <blah> 77 - -Notice that the "apply to 77" continuation went into the RHS of the -join point. And that meant that the return type of the join point -changed!! - -That's why we pass res_ty into simplNonRecJoinBndr, and substIdBndr -takes a (Just res_ty) argument so that it knows to do the type-changing -thing. --} - -simplBinders :: SimplEnv -> [InBndr] -> SimplM (SimplEnv, [OutBndr]) -simplBinders env bndrs = mapAccumLM simplBinder env bndrs - -------------- -simplBinder :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr) --- Used for lambda and case-bound variables --- Clone Id if necessary, substitute type --- Return with IdInfo already substituted, but (fragile) occurrence info zapped --- The substitution is extended only if the variable is cloned, because --- we *don't* need to use it to track occurrence info. -simplBinder env bndr - | isTyVar bndr = do { let (env', tv) = substTyVarBndr env bndr - ; seqTyVar tv `seq` return (env', tv) } - | otherwise = do { let (env', id) = substIdBndr Nothing env bndr - ; seqId id `seq` return (env', id) } - ---------------- -simplNonRecBndr :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr) --- A non-recursive let binder -simplNonRecBndr env id - = do { let (env1, id1) = substIdBndr Nothing env id - ; seqId id1 `seq` return (env1, id1) } - ---------------- -simplNonRecJoinBndr :: SimplEnv -> OutType -> InBndr - -> SimplM (SimplEnv, OutBndr) --- A non-recursive let binder for a join point; --- context being pushed inward may change the type --- See Note [Return type for join points] -simplNonRecJoinBndr env res_ty id - = do { let (env1, id1) = substIdBndr (Just res_ty) env id - ; seqId id1 `seq` return (env1, id1) } - ---------------- -simplRecBndrs :: SimplEnv -> [InBndr] -> SimplM SimplEnv --- Recursive let binders -simplRecBndrs env@(SimplEnv {}) ids - = ASSERT(all (not . isJoinId) ids) - do { let (env1, ids1) = mapAccumL (substIdBndr Nothing) env ids - ; seqIds ids1 `seq` return env1 } - ---------------- -simplRecJoinBndrs :: SimplEnv -> OutType -> [InBndr] -> SimplM SimplEnv --- Recursive let binders for join points; --- context being pushed inward may change types --- See Note [Return type for join points] -simplRecJoinBndrs env@(SimplEnv {}) res_ty ids - = ASSERT(all isJoinId ids) - do { let (env1, ids1) = mapAccumL (substIdBndr (Just res_ty)) env ids - ; seqIds ids1 `seq` return env1 } - ---------------- -substIdBndr :: Maybe OutType -> SimplEnv -> InBndr -> (SimplEnv, OutBndr) --- Might be a coercion variable -substIdBndr new_res_ty env bndr - | isCoVar bndr = substCoVarBndr env bndr - | otherwise = substNonCoVarIdBndr new_res_ty env bndr - ---------------- -substNonCoVarIdBndr - :: Maybe OutType -- New result type, if a join binder - -- See Note [Return type for join points] - -> SimplEnv - -> InBndr -- Env and binder to transform - -> (SimplEnv, OutBndr) --- Clone Id if necessary, substitute its type --- Return an Id with its --- * Type substituted --- * UnfoldingInfo, Rules, WorkerInfo zapped --- * Fragile OccInfo (only) zapped: Note [Robust OccInfo] --- * Robust info, retained especially arity and demand info, --- so that they are available to occurrences that occur in an --- earlier binding of a letrec --- --- For the robust info, see Note [Arity robustness] --- --- Augment the substitution if the unique changed --- Extend the in-scope set with the new Id --- --- Similar to GHC.Core.Subst.substIdBndr, except that --- the type of id_subst differs --- all fragile info is zapped -substNonCoVarIdBndr new_res_ty - env@(SimplEnv { seInScope = in_scope - , seIdSubst = id_subst }) - old_id - = ASSERT2( not (isCoVar old_id), ppr old_id ) - (env { seInScope = in_scope `extendInScopeSet` new_id, - seIdSubst = new_subst }, new_id) - where - id1 = uniqAway in_scope old_id - id2 = substIdType env id1 - - id3 | Just res_ty <- new_res_ty - = id2 `setIdType` setJoinResTy (idJoinArity id2) res_ty (idType id2) - -- See Note [Return type for join points] - | otherwise - = id2 - - new_id = zapFragileIdInfo id3 -- Zaps rules, worker-info, unfolding - -- and fragile OccInfo - - -- Extend the substitution if the unique has changed, - -- or there's some useful occurrence information - -- See the notes with substTyVarBndr for the delSubstEnv - new_subst | new_id /= old_id - = extendVarEnv id_subst old_id (DoneId new_id) - | otherwise - = delVarEnv id_subst old_id - ------------------------------------- -seqTyVar :: TyVar -> () -seqTyVar b = b `seq` () - -seqId :: Id -> () -seqId id = seqType (idType id) `seq` - idInfo id `seq` - () - -seqIds :: [Id] -> () -seqIds [] = () -seqIds (id:ids) = seqId id `seq` seqIds ids - -{- -Note [Arity robustness] -~~~~~~~~~~~~~~~~~~~~~~~ -We *do* transfer the arity from from the in_id of a let binding to the -out_id. This is important, so that the arity of an Id is visible in -its own RHS. For example: - f = \x. ....g (\y. f y).... -We can eta-reduce the arg to g, because f is a value. But that -needs to be visible. - -This interacts with the 'state hack' too: - f :: Bool -> IO Int - f = \x. case x of - True -> f y - False -> \s -> ... -Can we eta-expand f? Only if we see that f has arity 1, and then we -take advantage of the 'state hack' on the result of -(f y) :: State# -> (State#, Int) to expand the arity one more. - -There is a disadvantage though. Making the arity visible in the RHS -allows us to eta-reduce - f = \x -> f x -to - f = f -which technically is not sound. This is very much a corner case, so -I'm not worried about it. Another idea is to ensure that f's arity -never decreases; its arity started as 1, and we should never eta-reduce -below that. - - -Note [Robust OccInfo] -~~~~~~~~~~~~~~~~~~~~~ -It's important that we *do* retain the loop-breaker OccInfo, because -that's what stops the Id getting inlined infinitely, in the body of -the letrec. --} - - -{- -************************************************************************ -* * - Impedance matching to type substitution -* * -************************************************************************ --} - -getTCvSubst :: SimplEnv -> TCvSubst -getTCvSubst (SimplEnv { seInScope = in_scope, seTvSubst = tv_env - , seCvSubst = cv_env }) - = mkTCvSubst in_scope (tv_env, cv_env) - -substTy :: SimplEnv -> Type -> Type -substTy env ty = Type.substTy (getTCvSubst env) ty - -substTyVar :: SimplEnv -> TyVar -> Type -substTyVar env tv = Type.substTyVar (getTCvSubst env) tv - -substTyVarBndr :: SimplEnv -> TyVar -> (SimplEnv, TyVar) -substTyVarBndr env tv - = case Type.substTyVarBndr (getTCvSubst env) tv of - (TCvSubst in_scope' tv_env' cv_env', tv') - -> (env { seInScope = in_scope', seTvSubst = tv_env', seCvSubst = cv_env' }, tv') - -substCoVar :: SimplEnv -> CoVar -> Coercion -substCoVar env tv = Coercion.substCoVar (getTCvSubst env) tv - -substCoVarBndr :: SimplEnv -> CoVar -> (SimplEnv, CoVar) -substCoVarBndr env cv - = case Coercion.substCoVarBndr (getTCvSubst env) cv of - (TCvSubst in_scope' tv_env' cv_env', cv') - -> (env { seInScope = in_scope', seTvSubst = tv_env', seCvSubst = cv_env' }, cv') - -substCo :: SimplEnv -> Coercion -> Coercion -substCo env co = Coercion.substCo (getTCvSubst env) co - ------------------- -substIdType :: SimplEnv -> Id -> Id -substIdType (SimplEnv { seInScope = in_scope, seTvSubst = tv_env, seCvSubst = cv_env }) id - | (isEmptyVarEnv tv_env && isEmptyVarEnv cv_env) - || noFreeVarsOfType old_ty - = id - | otherwise = Id.setIdType id (Type.substTy (TCvSubst in_scope tv_env cv_env) old_ty) - -- The tyCoVarsOfType is cheaper than it looks - -- because we cache the free tyvars of the type - -- in a Note in the id's type itself - where - old_ty = idType id diff --git a/compiler/simplCore/SimplMonad.hs b/compiler/simplCore/SimplMonad.hs deleted file mode 100644 index f26fd18e92..0000000000 --- a/compiler/simplCore/SimplMonad.hs +++ /dev/null @@ -1,252 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section[SimplMonad]{The simplifier Monad} --} - -{-# LANGUAGE DeriveFunctor #-} -module SimplMonad ( - -- The monad - SimplM, - initSmpl, traceSmpl, - getSimplRules, getFamEnvs, - - -- Unique supply - MonadUnique(..), newId, newJoinId, - - -- Counting - SimplCount, tick, freeTick, checkedTick, - getSimplCount, zeroSimplCount, pprSimplCount, - plusSimplCount, isZeroSimplCount - ) where - -import GhcPrelude - -import Var ( Var, isId, mkLocalVar ) -import Name ( mkSystemVarName ) -import Id ( Id, mkSysLocalOrCoVar ) -import IdInfo ( IdDetails(..), vanillaIdInfo, setArityInfo ) -import GHC.Core.Type ( Type, mkLamTypes ) -import GHC.Core.FamInstEnv ( FamInstEnv ) -import GHC.Core ( RuleEnv(..) ) -import UniqSupply -import GHC.Driver.Session -import CoreMonad -import Outputable -import FastString -import MonadUtils -import ErrUtils as Err -import Util ( count ) -import Panic (throwGhcExceptionIO, GhcException (..)) -import BasicTypes ( IntWithInf, treatZeroAsInf, mkIntWithInf ) -import Control.Monad ( ap ) - -{- -************************************************************************ -* * -\subsection{Monad plumbing} -* * -************************************************************************ - -For the simplifier monad, we want to {\em thread} a unique supply and a counter. -(Command-line switches move around through the explicitly-passed SimplEnv.) --} - -newtype SimplM result - = SM { unSM :: SimplTopEnv -- Envt that does not change much - -> UniqSupply -- We thread the unique supply because - -- constantly splitting it is rather expensive - -> SimplCount - -> IO (result, UniqSupply, SimplCount)} - -- we only need IO here for dump output - deriving (Functor) - -data SimplTopEnv - = STE { st_flags :: DynFlags - , st_max_ticks :: IntWithInf -- Max #ticks in this simplifier run - , st_rules :: RuleEnv - , st_fams :: (FamInstEnv, FamInstEnv) } - -initSmpl :: DynFlags -> RuleEnv -> (FamInstEnv, FamInstEnv) - -> UniqSupply -- No init count; set to 0 - -> Int -- Size of the bindings, used to limit - -- the number of ticks we allow - -> SimplM a - -> IO (a, SimplCount) - -initSmpl dflags rules fam_envs us size m - = do (result, _, count) <- unSM m env us (zeroSimplCount dflags) - return (result, count) - where - env = STE { st_flags = dflags, st_rules = rules - , st_max_ticks = computeMaxTicks dflags size - , st_fams = fam_envs } - -computeMaxTicks :: DynFlags -> Int -> IntWithInf --- Compute the max simplifier ticks as --- (base-size + pgm-size) * magic-multiplier * tick-factor/100 --- where --- magic-multiplier is a constant that gives reasonable results --- base-size is a constant to deal with size-zero programs -computeMaxTicks dflags size - = treatZeroAsInf $ - fromInteger ((toInteger (size + base_size) - * toInteger (tick_factor * magic_multiplier)) - `div` 100) - where - tick_factor = simplTickFactor dflags - base_size = 100 - magic_multiplier = 40 - -- MAGIC NUMBER, multiplies the simplTickFactor - -- We can afford to be generous; this is really - -- just checking for loops, and shouldn't usually fire - -- A figure of 20 was too small: see #5539. - -{-# INLINE thenSmpl #-} -{-# INLINE thenSmpl_ #-} -{-# INLINE returnSmpl #-} - - -instance Applicative SimplM where - pure = returnSmpl - (<*>) = ap - (*>) = thenSmpl_ - -instance Monad SimplM where - (>>) = (*>) - (>>=) = thenSmpl - -returnSmpl :: a -> SimplM a -returnSmpl e = SM (\_st_env us sc -> return (e, us, sc)) - -thenSmpl :: SimplM a -> (a -> SimplM b) -> SimplM b -thenSmpl_ :: SimplM a -> SimplM b -> SimplM b - -thenSmpl m k - = SM $ \st_env us0 sc0 -> do - (m_result, us1, sc1) <- unSM m st_env us0 sc0 - unSM (k m_result) st_env us1 sc1 - -thenSmpl_ m k - = SM $ \st_env us0 sc0 -> do - (_, us1, sc1) <- unSM m st_env us0 sc0 - unSM k st_env us1 sc1 - --- TODO: this specializing is not allowed --- {-# SPECIALIZE mapM :: (a -> SimplM b) -> [a] -> SimplM [b] #-} --- {-# SPECIALIZE mapAndUnzipM :: (a -> SimplM (b, c)) -> [a] -> SimplM ([b],[c]) #-} --- {-# SPECIALIZE mapAccumLM :: (acc -> b -> SimplM (acc,c)) -> acc -> [b] -> SimplM (acc, [c]) #-} - -traceSmpl :: String -> SDoc -> SimplM () -traceSmpl herald doc - = do { dflags <- getDynFlags - ; liftIO $ Err.dumpIfSet_dyn dflags Opt_D_dump_simpl_trace "Simpl Trace" - FormatText - (hang (text herald) 2 doc) } - -{- -************************************************************************ -* * -\subsection{The unique supply} -* * -************************************************************************ --} - -instance MonadUnique SimplM where - getUniqueSupplyM - = SM (\_st_env us sc -> case splitUniqSupply us of - (us1, us2) -> return (us1, us2, sc)) - - getUniqueM - = SM (\_st_env us sc -> case takeUniqFromSupply us of - (u, us') -> return (u, us', sc)) - - getUniquesM - = SM (\_st_env us sc -> case splitUniqSupply us of - (us1, us2) -> return (uniqsFromSupply us1, us2, sc)) - -instance HasDynFlags SimplM where - getDynFlags = SM (\st_env us sc -> return (st_flags st_env, us, sc)) - -instance MonadIO SimplM where - liftIO m = SM $ \_ us sc -> do - x <- m - return (x, us, sc) - -getSimplRules :: SimplM RuleEnv -getSimplRules = SM (\st_env us sc -> return (st_rules st_env, us, sc)) - -getFamEnvs :: SimplM (FamInstEnv, FamInstEnv) -getFamEnvs = SM (\st_env us sc -> return (st_fams st_env, us, sc)) - -newId :: FastString -> Type -> SimplM Id -newId fs ty = do uniq <- getUniqueM - return (mkSysLocalOrCoVar fs uniq ty) - -newJoinId :: [Var] -> Type -> SimplM Id -newJoinId bndrs body_ty - = do { uniq <- getUniqueM - ; let name = mkSystemVarName uniq (fsLit "$j") - join_id_ty = mkLamTypes bndrs body_ty -- Note [Funky mkLamTypes] - arity = count isId bndrs - -- arity: See Note [Invariants on join points] invariant 2b, in GHC.Core - join_arity = length bndrs - details = JoinId join_arity - id_info = vanillaIdInfo `setArityInfo` arity --- `setOccInfo` strongLoopBreaker - - ; return (mkLocalVar details name join_id_ty id_info) } - -{- -************************************************************************ -* * -\subsection{Counting up what we've done} -* * -************************************************************************ --} - -getSimplCount :: SimplM SimplCount -getSimplCount = SM (\_st_env us sc -> return (sc, us, sc)) - -tick :: Tick -> SimplM () -tick t = SM (\st_env us sc -> let sc' = doSimplTick (st_flags st_env) t sc - in sc' `seq` return ((), us, sc')) - -checkedTick :: Tick -> SimplM () --- Try to take a tick, but fail if too many -checkedTick t - = SM (\st_env us sc -> - if st_max_ticks st_env <= mkIntWithInf (simplCountN sc) - then throwGhcExceptionIO $ - PprProgramError "Simplifier ticks exhausted" (msg sc) - else let sc' = doSimplTick (st_flags st_env) t sc - in sc' `seq` return ((), us, sc')) - where - msg sc = vcat - [ text "When trying" <+> ppr t - , text "To increase the limit, use -fsimpl-tick-factor=N (default 100)." - , space - , text "If you need to increase the limit substantially, please file a" - , text "bug report and indicate the factor you needed." - , space - , text "If GHC was unable to complete compilation even" - <+> text "with a very large factor" - , text "(a thousand or more), please consult the" - <+> doubleQuotes (text "Known bugs or infelicities") - , text "section in the Users Guide before filing a report. There are a" - , text "few situations unlikely to occur in practical programs for which" - , text "simplifier non-termination has been judged acceptable." - , space - , pp_details sc - , pprSimplCount sc ] - pp_details sc - | hasDetailedCounts sc = empty - | otherwise = text "To see detailed counts use -ddump-simpl-stats" - - -freeTick :: Tick -> SimplM () --- Record a tick, but don't add to the total tick count, which is --- used to decide when nothing further has happened -freeTick t - = SM (\_st_env us sc -> let sc' = doFreeSimplTick t sc - in sc' `seq` return ((), us, sc')) diff --git a/compiler/simplCore/SimplUtils.hs b/compiler/simplCore/SimplUtils.hs deleted file mode 100644 index faf1131d36..0000000000 --- a/compiler/simplCore/SimplUtils.hs +++ /dev/null @@ -1,2324 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section[SimplUtils]{The simplifier utilities} --} - -{-# LANGUAGE CPP #-} - -module SimplUtils ( - -- Rebuilding - mkLam, mkCase, prepareAlts, tryEtaExpandRhs, - - -- Inlining, - preInlineUnconditionally, postInlineUnconditionally, - activeUnfolding, activeRule, - getUnfoldingInRuleMatch, - simplEnvForGHCi, updModeForStableUnfoldings, updModeForRules, - - -- The continuation type - SimplCont(..), DupFlag(..), StaticEnv, - isSimplified, contIsStop, - contIsDupable, contResultType, contHoleType, - contIsTrivial, contArgs, - countArgs, - mkBoringStop, mkRhsStop, mkLazyArgStop, contIsRhsOrArg, - interestingCallContext, - - -- ArgInfo - ArgInfo(..), ArgSpec(..), mkArgInfo, - addValArgTo, addCastTo, addTyArgTo, - argInfoExpr, argInfoAppArgs, pushSimplifiedArgs, - - abstractFloats, - - -- Utilities - isExitJoinId - ) where - -#include "HsVersions.h" - -import GhcPrelude - -import SimplEnv -import CoreMonad ( SimplMode(..), Tick(..) ) -import GHC.Driver.Session -import GHC.Core -import qualified GHC.Core.Subst -import GHC.Core.Ppr -import GHC.Core.TyCo.Ppr ( pprParendType ) -import GHC.Core.FVs -import GHC.Core.Utils -import GHC.Core.Arity -import GHC.Core.Unfold -import Name -import Id -import IdInfo -import Var -import Demand -import SimplMonad -import GHC.Core.Type hiding( substTy ) -import GHC.Core.Coercion hiding( substCo ) -import GHC.Core.DataCon ( dataConWorkId, isNullaryRepDataCon ) -import VarSet -import BasicTypes -import Util -import OrdList ( isNilOL ) -import MonadUtils -import Outputable -import PrelRules -import FastString ( fsLit ) - -import Control.Monad ( when ) -import Data.List ( sortBy ) - -{- -************************************************************************ -* * - The SimplCont and DupFlag types -* * -************************************************************************ - -A SimplCont allows the simplifier to traverse the expression in a -zipper-like fashion. The SimplCont represents the rest of the expression, -"above" the point of interest. - -You can also think of a SimplCont as an "evaluation context", using -that term in the way it is used for operational semantics. This is the -way I usually think of it, For example you'll often see a syntax for -evaluation context looking like - C ::= [] | C e | case C of alts | C `cast` co -That's the kind of thing we are doing here, and I use that syntax in -the comments. - - -Key points: - * A SimplCont describes a *strict* context (just like - evaluation contexts do). E.g. Just [] is not a SimplCont - - * A SimplCont describes a context that *does not* bind - any variables. E.g. \x. [] is not a SimplCont --} - -data SimplCont - = Stop -- Stop[e] = e - OutType -- Type of the <hole> - CallCtxt -- Tells if there is something interesting about - -- the context, and hence the inliner - -- should be a bit keener (see interestingCallContext) - -- Specifically: - -- This is an argument of a function that has RULES - -- Inlining the call might allow the rule to fire - -- Never ValAppCxt (use ApplyToVal instead) - -- or CaseCtxt (use Select instead) - - | CastIt -- (CastIt co K)[e] = K[ e `cast` co ] - OutCoercion -- The coercion simplified - -- Invariant: never an identity coercion - SimplCont - - | ApplyToVal -- (ApplyToVal arg K)[e] = K[ e arg ] - { sc_dup :: DupFlag -- See Note [DupFlag invariants] - , sc_arg :: InExpr -- The argument, - , sc_env :: StaticEnv -- see Note [StaticEnv invariant] - , sc_cont :: SimplCont } - - | ApplyToTy -- (ApplyToTy ty K)[e] = K[ e ty ] - { sc_arg_ty :: OutType -- Argument type - , sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah) - -- See Note [The hole type in ApplyToTy] - , sc_cont :: SimplCont } - - | Select -- (Select alts K)[e] = K[ case e of alts ] - { sc_dup :: DupFlag -- See Note [DupFlag invariants] - , sc_bndr :: InId -- case binder - , sc_alts :: [InAlt] -- Alternatives - , sc_env :: StaticEnv -- See Note [StaticEnv invariant] - , sc_cont :: SimplCont } - - -- The two strict forms have no DupFlag, because we never duplicate them - | StrictBind -- (StrictBind x xs b K)[e] = let x = e in K[\xs.b] - -- or, equivalently, = K[ (\x xs.b) e ] - { sc_dup :: DupFlag -- See Note [DupFlag invariants] - , sc_bndr :: InId - , sc_bndrs :: [InBndr] - , sc_body :: InExpr - , sc_env :: StaticEnv -- See Note [StaticEnv invariant] - , sc_cont :: SimplCont } - - | StrictArg -- (StrictArg (f e1 ..en) K)[e] = K[ f e1 .. en e ] - { sc_dup :: DupFlag -- Always Simplified or OkToDup - , sc_fun :: ArgInfo -- Specifies f, e1..en, Whether f has rules, etc - -- plus strictness flags for *further* args - , sc_cci :: CallCtxt -- Whether *this* argument position is interesting - , sc_cont :: SimplCont } - - | TickIt -- (TickIt t K)[e] = K[ tick t e ] - (Tickish Id) -- Tick tickish <hole> - SimplCont - -type StaticEnv = SimplEnv -- Just the static part is relevant - -data DupFlag = NoDup -- Unsimplified, might be big - | Simplified -- Simplified - | OkToDup -- Simplified and small - -isSimplified :: DupFlag -> Bool -isSimplified NoDup = False -isSimplified _ = True -- Invariant: the subst-env is empty - -perhapsSubstTy :: DupFlag -> StaticEnv -> Type -> Type -perhapsSubstTy dup env ty - | isSimplified dup = ty - | otherwise = substTy env ty - -{- Note [StaticEnv invariant] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We pair up an InExpr or InAlts with a StaticEnv, which establishes the -lexical scope for that InExpr. When we simplify that InExpr/InAlts, we -use - - Its captured StaticEnv - - Overriding its InScopeSet with the larger one at the - simplification point. - -Why override the InScopeSet? Example: - (let y = ey in f) ex -By the time we simplify ex, 'y' will be in scope. - -However the InScopeSet in the StaticEnv is not irrelevant: it should -include all the free vars of applying the substitution to the InExpr. -Reason: contHoleType uses perhapsSubstTy to apply the substitution to -the expression, and that (rightly) gives ASSERT failures if the InScopeSet -isn't big enough. - -Note [DupFlag invariants] -~~~~~~~~~~~~~~~~~~~~~~~~~ -In both (ApplyToVal dup _ env k) - and (Select dup _ _ env k) -the following invariants hold - - (a) if dup = OkToDup, then continuation k is also ok-to-dup - (b) if dup = OkToDup or Simplified, the subst-env is empty - (and and hence no need to re-simplify) --} - -instance Outputable DupFlag where - ppr OkToDup = text "ok" - ppr NoDup = text "nodup" - ppr Simplified = text "simpl" - -instance Outputable SimplCont where - ppr (Stop ty interesting) = text "Stop" <> brackets (ppr interesting) <+> ppr ty - ppr (CastIt co cont ) = (text "CastIt" <+> pprOptCo co) $$ ppr cont - ppr (TickIt t cont) = (text "TickIt" <+> ppr t) $$ ppr cont - ppr (ApplyToTy { sc_arg_ty = ty, sc_cont = cont }) - = (text "ApplyToTy" <+> pprParendType ty) $$ ppr cont - ppr (ApplyToVal { sc_arg = arg, sc_dup = dup, sc_cont = cont }) - = (text "ApplyToVal" <+> ppr dup <+> pprParendExpr arg) - $$ ppr cont - ppr (StrictBind { sc_bndr = b, sc_cont = cont }) - = (text "StrictBind" <+> ppr b) $$ ppr cont - ppr (StrictArg { sc_fun = ai, sc_cont = cont }) - = (text "StrictArg" <+> ppr (ai_fun ai)) $$ ppr cont - ppr (Select { sc_dup = dup, sc_bndr = bndr, sc_alts = alts, sc_env = se, sc_cont = cont }) - = (text "Select" <+> ppr dup <+> ppr bndr) $$ - whenPprDebug (nest 2 $ vcat [ppr (seTvSubst se), ppr alts]) $$ ppr cont - - -{- Note [The hole type in ApplyToTy] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The sc_hole_ty field of ApplyToTy records the type of the "hole" in the -continuation. It is absolutely necessary to compute contHoleType, but it is -not used for anything else (and hence may not be evaluated). - -Why is it necessary for contHoleType? Consider the continuation - ApplyToType Int (Stop Int) -corresponding to - (<hole> @Int) :: Int -What is the type of <hole>? It could be (forall a. Int) or (forall a. a), -and there is no way to know which, so we must record it. - -In a chain of applications (f @t1 @t2 @t3) we'll lazily compute exprType -for (f @t1) and (f @t1 @t2), which is potentially non-linear; but it probably -doesn't matter because we'll never compute them all. - -************************************************************************ -* * - ArgInfo and ArgSpec -* * -************************************************************************ --} - -data ArgInfo - = ArgInfo { - ai_fun :: OutId, -- The function - ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order) - - ai_type :: OutType, -- Type of (f a1 ... an) - - ai_rules :: FunRules, -- Rules for this function - - ai_encl :: Bool, -- Flag saying whether this function - -- or an enclosing one has rules (recursively) - -- True => be keener to inline in all args - - ai_strs :: [Bool], -- Strictness of remaining arguments - -- Usually infinite, but if it is finite it guarantees - -- that the function diverges after being given - -- that number of args - ai_discs :: [Int] -- Discounts for remaining arguments; non-zero => be keener to inline - -- Always infinite - } - -data ArgSpec - = ValArg OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal - | TyArg { as_arg_ty :: OutType -- Apply to this type; c.f. ApplyToTy - , as_hole_ty :: OutType } -- Type of the function (presumably forall a. blah) - | CastBy OutCoercion -- Cast by this; c.f. CastIt - -instance Outputable ArgSpec where - ppr (ValArg e) = text "ValArg" <+> ppr e - ppr (TyArg { as_arg_ty = ty }) = text "TyArg" <+> ppr ty - ppr (CastBy c) = text "CastBy" <+> ppr c - -addValArgTo :: ArgInfo -> OutExpr -> ArgInfo -addValArgTo ai arg = ai { ai_args = ValArg arg : ai_args ai - , ai_type = applyTypeToArg (ai_type ai) arg - , ai_rules = decRules (ai_rules ai) } - -addTyArgTo :: ArgInfo -> OutType -> ArgInfo -addTyArgTo ai arg_ty = ai { ai_args = arg_spec : ai_args ai - , ai_type = piResultTy poly_fun_ty arg_ty - , ai_rules = decRules (ai_rules ai) } - where - poly_fun_ty = ai_type ai - arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = poly_fun_ty } - -addCastTo :: ArgInfo -> OutCoercion -> ArgInfo -addCastTo ai co = ai { ai_args = CastBy co : ai_args ai - , ai_type = coercionRKind co } - -argInfoAppArgs :: [ArgSpec] -> [OutExpr] -argInfoAppArgs [] = [] -argInfoAppArgs (CastBy {} : _) = [] -- Stop at a cast -argInfoAppArgs (ValArg e : as) = e : argInfoAppArgs as -argInfoAppArgs (TyArg { as_arg_ty = ty } : as) = Type ty : argInfoAppArgs as - -pushSimplifiedArgs :: SimplEnv -> [ArgSpec] -> SimplCont -> SimplCont -pushSimplifiedArgs _env [] k = k -pushSimplifiedArgs env (arg : args) k - = case arg of - TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty } - -> ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = rest } - ValArg e -> ApplyToVal { sc_arg = e, sc_env = env, sc_dup = Simplified, sc_cont = rest } - CastBy c -> CastIt c rest - where - rest = pushSimplifiedArgs env args k - -- The env has an empty SubstEnv - -argInfoExpr :: OutId -> [ArgSpec] -> OutExpr --- NB: the [ArgSpec] is reversed so that the first arg --- in the list is the last one in the application -argInfoExpr fun rev_args - = go rev_args - where - go [] = Var fun - go (ValArg a : as) = go as `App` a - go (TyArg { as_arg_ty = ty } : as) = go as `App` Type ty - go (CastBy co : as) = mkCast (go as) co - - -type FunRules = Maybe (Int, [CoreRule]) -- Remaining rules for this function - -- Nothing => No rules - -- Just (n, rules) => some rules, requiring at least n more type/value args - -decRules :: FunRules -> FunRules -decRules (Just (n, rules)) = Just (n-1, rules) -decRules Nothing = Nothing - -mkFunRules :: [CoreRule] -> FunRules -mkFunRules [] = Nothing -mkFunRules rs = Just (n_required, rs) - where - n_required = maximum (map ruleArity rs) - -{- -************************************************************************ -* * - Functions on SimplCont -* * -************************************************************************ --} - -mkBoringStop :: OutType -> SimplCont -mkBoringStop ty = Stop ty BoringCtxt - -mkRhsStop :: OutType -> SimplCont -- See Note [RHS of lets] in GHC.Core.Unfold -mkRhsStop ty = Stop ty RhsCtxt - -mkLazyArgStop :: OutType -> CallCtxt -> SimplCont -mkLazyArgStop ty cci = Stop ty cci - -------------------- -contIsRhsOrArg :: SimplCont -> Bool -contIsRhsOrArg (Stop {}) = True -contIsRhsOrArg (StrictBind {}) = True -contIsRhsOrArg (StrictArg {}) = True -contIsRhsOrArg _ = False - -contIsRhs :: SimplCont -> Bool -contIsRhs (Stop _ RhsCtxt) = True -contIsRhs _ = False - -------------------- -contIsStop :: SimplCont -> Bool -contIsStop (Stop {}) = True -contIsStop _ = False - -contIsDupable :: SimplCont -> Bool -contIsDupable (Stop {}) = True -contIsDupable (ApplyToTy { sc_cont = k }) = contIsDupable k -contIsDupable (ApplyToVal { sc_dup = OkToDup }) = True -- See Note [DupFlag invariants] -contIsDupable (Select { sc_dup = OkToDup }) = True -- ...ditto... -contIsDupable (StrictArg { sc_dup = OkToDup }) = True -- ...ditto... -contIsDupable (CastIt _ k) = contIsDupable k -contIsDupable _ = False - -------------------- -contIsTrivial :: SimplCont -> Bool -contIsTrivial (Stop {}) = True -contIsTrivial (ApplyToTy { sc_cont = k }) = contIsTrivial k -contIsTrivial (ApplyToVal { sc_arg = Coercion _, sc_cont = k }) = contIsTrivial k -contIsTrivial (CastIt _ k) = contIsTrivial k -contIsTrivial _ = False - -------------------- -contResultType :: SimplCont -> OutType -contResultType (Stop ty _) = ty -contResultType (CastIt _ k) = contResultType k -contResultType (StrictBind { sc_cont = k }) = contResultType k -contResultType (StrictArg { sc_cont = k }) = contResultType k -contResultType (Select { sc_cont = k }) = contResultType k -contResultType (ApplyToTy { sc_cont = k }) = contResultType k -contResultType (ApplyToVal { sc_cont = k }) = contResultType k -contResultType (TickIt _ k) = contResultType k - -contHoleType :: SimplCont -> OutType -contHoleType (Stop ty _) = ty -contHoleType (TickIt _ k) = contHoleType k -contHoleType (CastIt co _) = coercionLKind co -contHoleType (StrictBind { sc_bndr = b, sc_dup = dup, sc_env = se }) - = perhapsSubstTy dup se (idType b) -contHoleType (StrictArg { sc_fun = ai }) = funArgTy (ai_type ai) -contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy] -contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k }) - = mkVisFunTy (perhapsSubstTy dup se (exprType e)) - (contHoleType k) -contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se }) - = perhapsSubstTy d se (idType b) - -------------------- -countArgs :: SimplCont -> Int --- Count all arguments, including types, coercions, and other values -countArgs (ApplyToTy { sc_cont = cont }) = 1 + countArgs cont -countArgs (ApplyToVal { sc_cont = cont }) = 1 + countArgs cont -countArgs _ = 0 - -contArgs :: SimplCont -> (Bool, [ArgSummary], SimplCont) --- Summarises value args, discards type args and coercions --- The returned continuation of the call is only used to --- answer questions like "are you interesting?" -contArgs cont - | lone cont = (True, [], cont) - | otherwise = go [] cont - where - lone (ApplyToTy {}) = False -- See Note [Lone variables] in GHC.Core.Unfold - lone (ApplyToVal {}) = False - lone (CastIt {}) = False - lone _ = True - - go args (ApplyToVal { sc_arg = arg, sc_env = se, sc_cont = k }) - = go (is_interesting arg se : args) k - go args (ApplyToTy { sc_cont = k }) = go args k - go args (CastIt _ k) = go args k - go args k = (False, reverse args, k) - - is_interesting arg se = interestingArg se arg - -- Do *not* use short-cutting substitution here - -- because we want to get as much IdInfo as possible - - -------------------- -mkArgInfo :: SimplEnv - -> Id - -> [CoreRule] -- Rules for function - -> Int -- Number of value args - -> SimplCont -- Context of the call - -> ArgInfo - -mkArgInfo env fun rules n_val_args call_cont - | n_val_args < idArity fun -- Note [Unsaturated functions] - = ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty - , ai_rules = fun_rules - , ai_encl = False - , ai_strs = vanilla_stricts - , ai_discs = vanilla_discounts } - | otherwise - = ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty - , ai_rules = fun_rules - , ai_encl = interestingArgContext rules call_cont - , ai_strs = arg_stricts - , ai_discs = arg_discounts } - where - fun_ty = idType fun - - fun_rules = mkFunRules rules - - vanilla_discounts, arg_discounts :: [Int] - vanilla_discounts = repeat 0 - arg_discounts = case idUnfolding fun of - CoreUnfolding {uf_guidance = UnfIfGoodArgs {ug_args = discounts}} - -> discounts ++ vanilla_discounts - _ -> vanilla_discounts - - vanilla_stricts, arg_stricts :: [Bool] - vanilla_stricts = repeat False - - arg_stricts - | not (sm_inline (seMode env)) - = vanilla_stricts -- See Note [Do not expose strictness if sm_inline=False] - | otherwise - = add_type_str fun_ty $ - case splitStrictSig (idStrictness fun) of - (demands, result_info) - | not (demands `lengthExceeds` n_val_args) - -> -- Enough args, use the strictness given. - -- For bottoming functions we used to pretend that the arg - -- is lazy, so that we don't treat the arg as an - -- interesting context. This avoids substituting - -- top-level bindings for (say) strings into - -- calls to error. But now we are more careful about - -- inlining lone variables, so its ok (see SimplUtils.analyseCont) - if isBotDiv result_info then - map isStrictDmd demands -- Finite => result is bottom - else - map isStrictDmd demands ++ vanilla_stricts - | otherwise - -> WARN( True, text "More demands than arity" <+> ppr fun <+> ppr (idArity fun) - <+> ppr n_val_args <+> ppr demands ) - vanilla_stricts -- Not enough args, or no strictness - - add_type_str :: Type -> [Bool] -> [Bool] - -- If the function arg types are strict, record that in the 'strictness bits' - -- No need to instantiate because unboxed types (which dominate the strict - -- types) can't instantiate type variables. - -- add_type_str is done repeatedly (for each call); - -- might be better once-for-all in the function - -- But beware primops/datacons with no strictness - - add_type_str _ [] = [] - add_type_str fun_ty all_strs@(str:strs) - | Just (arg_ty, fun_ty') <- splitFunTy_maybe fun_ty -- Add strict-type info - = (str || Just False == isLiftedType_maybe arg_ty) - : add_type_str fun_ty' strs - -- If the type is levity-polymorphic, we can't know whether it's - -- strict. isLiftedType_maybe will return Just False only when - -- we're sure the type is unlifted. - - | Just (_, fun_ty') <- splitForAllTy_maybe fun_ty - = add_type_str fun_ty' all_strs -- Look through foralls - - | otherwise - = all_strs - -{- Note [Unsaturated functions] - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider (test eyeball/inline4) - x = a:as - y = f x -where f has arity 2. Then we do not want to inline 'x', because -it'll just be floated out again. Even if f has lots of discounts -on its first argument -- it must be saturated for these to kick in - -Note [Do not expose strictness if sm_inline=False] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -#15163 showed a case in which we had - - {-# INLINE [1] zip #-} - zip = undefined - - {-# RULES "foo" forall as bs. stream (zip as bs) = ..blah... #-} - -If we expose zip's bottoming nature when simplifying the LHS of the -RULE we get - {-# RULES "foo" forall as bs. - stream (case zip of {}) = ..blah... #-} -discarding the arguments to zip. Usually this is fine, but on the -LHS of a rule it's not, because 'as' and 'bs' are now not bound on -the LHS. - -This is a pretty pathological example, so I'm not losing sleep over -it, but the simplest solution was to check sm_inline; if it is False, -which it is on the LHS of a rule (see updModeForRules), then don't -make use of the strictness info for the function. --} - - -{- -************************************************************************ -* * - Interesting arguments -* * -************************************************************************ - -Note [Interesting call context] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We want to avoid inlining an expression where there can't possibly be -any gain, such as in an argument position. Hence, if the continuation -is interesting (eg. a case scrutinee, application etc.) then we -inline, otherwise we don't. - -Previously some_benefit used to return True only if the variable was -applied to some value arguments. This didn't work: - - let x = _coerce_ (T Int) Int (I# 3) in - case _coerce_ Int (T Int) x of - I# y -> .... - -we want to inline x, but can't see that it's a constructor in a case -scrutinee position, and some_benefit is False. - -Another example: - -dMonadST = _/\_ t -> :Monad (g1 _@_ t, g2 _@_ t, g3 _@_ t) - -.... case dMonadST _@_ x0 of (a,b,c) -> .... - -we'd really like to inline dMonadST here, but we *don't* want to -inline if the case expression is just - - case x of y { DEFAULT -> ... } - -since we can just eliminate this case instead (x is in WHNF). Similar -applies when x is bound to a lambda expression. Hence -contIsInteresting looks for case expressions with just a single -default case. - -Note [No case of case is boring] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we see - case f x of <alts> - -we'd usually treat the context as interesting, to encourage 'f' to -inline. But if case-of-case is off, it's really not so interesting -after all, because we are unlikely to be able to push the case -expression into the branches of any case in f's unfolding. So, to -reduce unnecessary code expansion, we just make the context look boring. -This made a small compile-time perf improvement in perf/compiler/T6048, -and it looks plausible to me. --} - -interestingCallContext :: SimplEnv -> SimplCont -> CallCtxt --- See Note [Interesting call context] -interestingCallContext env cont - = interesting cont - where - interesting (Select {}) - | sm_case_case (getMode env) = CaseCtxt - | otherwise = BoringCtxt - -- See Note [No case of case is boring] - - interesting (ApplyToVal {}) = ValAppCtxt - -- Can happen if we have (f Int |> co) y - -- If f has an INLINE prag we need to give it some - -- motivation to inline. See Note [Cast then apply] - -- in GHC.Core.Unfold - - interesting (StrictArg { sc_cci = cci }) = cci - interesting (StrictBind {}) = BoringCtxt - interesting (Stop _ cci) = cci - interesting (TickIt _ k) = interesting k - interesting (ApplyToTy { sc_cont = k }) = interesting k - interesting (CastIt _ k) = interesting k - -- If this call is the arg of a strict function, the context - -- is a bit interesting. If we inline here, we may get useful - -- evaluation information to avoid repeated evals: e.g. - -- x + (y * z) - -- Here the contIsInteresting makes the '*' keener to inline, - -- which in turn exposes a constructor which makes the '+' inline. - -- Assuming that +,* aren't small enough to inline regardless. - -- - -- It's also very important to inline in a strict context for things - -- like - -- foldr k z (f x) - -- Here, the context of (f x) is strict, and if f's unfolding is - -- a build it's *great* to inline it here. So we must ensure that - -- the context for (f x) is not totally uninteresting. - -interestingArgContext :: [CoreRule] -> SimplCont -> Bool --- If the argument has form (f x y), where x,y are boring, --- and f is marked INLINE, then we don't want to inline f. --- But if the context of the argument is --- g (f x y) --- where g has rules, then we *do* want to inline f, in case it --- exposes a rule that might fire. Similarly, if the context is --- h (g (f x x)) --- where h has rules, then we do want to inline f; hence the --- call_cont argument to interestingArgContext --- --- The ai-rules flag makes this happen; if it's --- set, the inliner gets just enough keener to inline f --- regardless of how boring f's arguments are, if it's marked INLINE --- --- The alternative would be to *always* inline an INLINE function, --- regardless of how boring its context is; but that seems overkill --- For example, it'd mean that wrapper functions were always inlined --- --- The call_cont passed to interestingArgContext is the context of --- the call itself, e.g. g <hole> in the example above -interestingArgContext rules call_cont - = notNull rules || enclosing_fn_has_rules - where - enclosing_fn_has_rules = go call_cont - - go (Select {}) = False - go (ApplyToVal {}) = False -- Shouldn't really happen - go (ApplyToTy {}) = False -- Ditto - go (StrictArg { sc_cci = cci }) = interesting cci - go (StrictBind {}) = False -- ?? - go (CastIt _ c) = go c - go (Stop _ cci) = interesting cci - go (TickIt _ c) = go c - - interesting RuleArgCtxt = True - interesting _ = False - - -{- Note [Interesting arguments] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -An argument is interesting if it deserves a discount for unfoldings -with a discount in that argument position. The idea is to avoid -unfolding a function that is applied only to variables that have no -unfolding (i.e. they are probably lambda bound): f x y z There is -little point in inlining f here. - -Generally, *values* (like (C a b) and (\x.e)) deserve discounts. But -we must look through lets, eg (let x = e in C a b), because the let will -float, exposing the value, if we inline. That makes it different to -exprIsHNF. - -Before 2009 we said it was interesting if the argument had *any* structure -at all; i.e. (hasSomeUnfolding v). But does too much inlining; see #3016. - -But we don't regard (f x y) as interesting, unless f is unsaturated. -If it's saturated and f hasn't inlined, then it's probably not going -to now! - -Note [Conlike is interesting] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - f d = ...((*) d x y)... - ... f (df d')... -where df is con-like. Then we'd really like to inline 'f' so that the -rule for (*) (df d) can fire. To do this - a) we give a discount for being an argument of a class-op (eg (*) d) - b) we say that a con-like argument (eg (df d)) is interesting --} - -interestingArg :: SimplEnv -> CoreExpr -> ArgSummary --- See Note [Interesting arguments] -interestingArg env e = go env 0 e - where - -- n is # value args to which the expression is applied - go env n (Var v) - = case substId env v of - DoneId v' -> go_var n v' - DoneEx e _ -> go (zapSubstEnv env) n e - ContEx tvs cvs ids e -> go (setSubstEnv env tvs cvs ids) n e - - go _ _ (Lit {}) = ValueArg - go _ _ (Type _) = TrivArg - go _ _ (Coercion _) = TrivArg - go env n (App fn (Type _)) = go env n fn - go env n (App fn _) = go env (n+1) fn - go env n (Tick _ a) = go env n a - go env n (Cast e _) = go env n e - go env n (Lam v e) - | isTyVar v = go env n e - | n>0 = NonTrivArg -- (\x.b) e is NonTriv - | otherwise = ValueArg - go _ _ (Case {}) = NonTrivArg - go env n (Let b e) = case go env' n e of - ValueArg -> ValueArg - _ -> NonTrivArg - where - env' = env `addNewInScopeIds` bindersOf b - - go_var n v - | isConLikeId v = ValueArg -- Experimenting with 'conlike' rather that - -- data constructors here - | idArity v > n = ValueArg -- Catches (eg) primops with arity but no unfolding - | n > 0 = NonTrivArg -- Saturated or unknown call - | conlike_unfolding = ValueArg -- n==0; look for an interesting unfolding - -- See Note [Conlike is interesting] - | otherwise = TrivArg -- n==0, no useful unfolding - where - conlike_unfolding = isConLikeUnfolding (idUnfolding v) - -{- -************************************************************************ -* * - SimplMode -* * -************************************************************************ - -The SimplMode controls several switches; see its definition in -CoreMonad - sm_rules :: Bool -- Whether RULES are enabled - sm_inline :: Bool -- Whether inlining is enabled - sm_case_case :: Bool -- Whether case-of-case is enabled - sm_eta_expand :: Bool -- Whether eta-expansion is enabled --} - -simplEnvForGHCi :: DynFlags -> SimplEnv -simplEnvForGHCi dflags - = mkSimplEnv $ SimplMode { sm_names = ["GHCi"] - , sm_phase = InitialPhase - , sm_dflags = dflags - , sm_rules = rules_on - , sm_inline = False - , sm_eta_expand = eta_expand_on - , sm_case_case = True } - where - rules_on = gopt Opt_EnableRewriteRules dflags - eta_expand_on = gopt Opt_DoLambdaEtaExpansion dflags - -- Do not do any inlining, in case we expose some unboxed - -- tuple stuff that confuses the bytecode interpreter - -updModeForStableUnfoldings :: Activation -> SimplMode -> SimplMode --- See Note [Simplifying inside stable unfoldings] -updModeForStableUnfoldings inline_rule_act current_mode - = current_mode { sm_phase = phaseFromActivation inline_rule_act - , sm_inline = True - , sm_eta_expand = False } - -- sm_eta_expand: see Note [No eta expansion in stable unfoldings] - -- For sm_rules, just inherit; sm_rules might be "off" - -- because of -fno-enable-rewrite-rules - where - phaseFromActivation (ActiveAfter _ n) = Phase n - phaseFromActivation _ = InitialPhase - -updModeForRules :: SimplMode -> SimplMode --- See Note [Simplifying rules] -updModeForRules current_mode - = current_mode { sm_phase = InitialPhase - , sm_inline = False -- See Note [Do not expose strictness if sm_inline=False] - , sm_rules = False - , sm_eta_expand = False } - -{- Note [Simplifying rules] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When simplifying a rule LHS, refrain from /any/ inlining or applying -of other RULES. - -Doing anything to the LHS is plain confusing, because it means that what the -rule matches is not what the user wrote. c.f. #10595, and #10528. -Moreover, inlining (or applying rules) on rule LHSs risks introducing -Ticks into the LHS, which makes matching trickier. #10665, #10745. - -Doing this to either side confounds tools like HERMIT, which seek to reason -about and apply the RULES as originally written. See #10829. - -Note [No eta expansion in stable unfoldings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have a stable unfolding - - f :: Ord a => a -> IO () - -- Unfolding template - -- = /\a \(d:Ord a) (x:a). bla - -we do not want to eta-expand to - - f :: Ord a => a -> IO () - -- Unfolding template - -- = (/\a \(d:Ord a) (x:a) (eta:State#). bla eta) |> co - -because not specialisation of the overloading doesn't work properly -(see Note [Specialisation shape] in Specialise), #9509. - -So we disable eta-expansion in stable unfoldings. - -Note [Inlining in gentle mode] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Something is inlined if - (i) the sm_inline flag is on, AND - (ii) the thing has an INLINE pragma, AND - (iii) the thing is inlinable in the earliest phase. - -Example of why (iii) is important: - {-# INLINE [~1] g #-} - g = ... - - {-# INLINE f #-} - f x = g (g x) - -If we were to inline g into f's inlining, then an importing module would -never be able to do - f e --> g (g e) ---> RULE fires -because the stable unfolding for f has had g inlined into it. - -On the other hand, it is bad not to do ANY inlining into an -stable unfolding, because then recursive knots in instance declarations -don't get unravelled. - -However, *sometimes* SimplGently must do no call-site inlining at all -(hence sm_inline = False). Before full laziness we must be careful -not to inline wrappers, because doing so inhibits floating - e.g. ...(case f x of ...)... - ==> ...(case (case x of I# x# -> fw x#) of ...)... - ==> ...(case x of I# x# -> case fw x# of ...)... -and now the redex (f x) isn't floatable any more. - -The no-inlining thing is also important for Template Haskell. You might be -compiling in one-shot mode with -O2; but when TH compiles a splice before -running it, we don't want to use -O2. Indeed, we don't want to inline -anything, because the byte-code interpreter might get confused about -unboxed tuples and suchlike. - -Note [Simplifying inside stable unfoldings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We must take care with simplification inside stable unfoldings (which come from -INLINE pragmas). - -First, consider the following example - let f = \pq -> BIG - in - let g = \y -> f y y - {-# INLINE g #-} - in ...g...g...g...g...g... -Now, if that's the ONLY occurrence of f, it might be inlined inside g, -and thence copied multiple times when g is inlined. HENCE we treat -any occurrence in a stable unfolding as a multiple occurrence, not a single -one; see OccurAnal.addRuleUsage. - -Second, we do want *do* to some modest rules/inlining stuff in stable -unfoldings, partly to eliminate senseless crap, and partly to break -the recursive knots generated by instance declarations. - -However, suppose we have - {-# INLINE <act> f #-} - f = <rhs> -meaning "inline f in phases p where activation <act>(p) holds". -Then what inlinings/rules can we apply to the copy of <rhs> captured in -f's stable unfolding? Our model is that literally <rhs> is substituted for -f when it is inlined. So our conservative plan (implemented by -updModeForStableUnfoldings) is this: - - ------------------------------------------------------------- - When simplifying the RHS of a stable unfolding, set the phase - to the phase in which the stable unfolding first becomes active - ------------------------------------------------------------- - -That ensures that - - a) Rules/inlinings that *cease* being active before p will - not apply to the stable unfolding, consistent with it being - inlined in its *original* form in phase p. - - b) Rules/inlinings that only become active *after* p will - not apply to the stable unfolding, again to be consistent with - inlining the *original* rhs in phase p. - -For example, - {-# INLINE f #-} - f x = ...g... - - {-# NOINLINE [1] g #-} - g y = ... - - {-# RULE h g = ... #-} -Here we must not inline g into f's RHS, even when we get to phase 0, -because when f is later inlined into some other module we want the -rule for h to fire. - -Similarly, consider - {-# INLINE f #-} - f x = ...g... - - g y = ... -and suppose that there are auto-generated specialisations and a strictness -wrapper for g. The specialisations get activation AlwaysActive, and the -strictness wrapper get activation (ActiveAfter 0). So the strictness -wrepper fails the test and won't be inlined into f's stable unfolding. That -means f can inline, expose the specialised call to g, so the specialisation -rules can fire. - -A note about wrappers -~~~~~~~~~~~~~~~~~~~~~ -It's also important not to inline a worker back into a wrapper. -A wrapper looks like - wraper = inline_me (\x -> ...worker... ) -Normally, the inline_me prevents the worker getting inlined into -the wrapper (initially, the worker's only call site!). But, -if the wrapper is sure to be called, the strictness analyser will -mark it 'demanded', so when the RHS is simplified, it'll get an ArgOf -continuation. --} - -activeUnfolding :: SimplMode -> Id -> Bool -activeUnfolding mode id - | isCompulsoryUnfolding (realIdUnfolding id) - = True -- Even sm_inline can't override compulsory unfoldings - | otherwise - = isActive (sm_phase mode) (idInlineActivation id) - && sm_inline mode - -- `or` isStableUnfolding (realIdUnfolding id) - -- Inline things when - -- (a) they are active - -- (b) sm_inline says so, except that for stable unfoldings - -- (ie pragmas) we inline anyway - -getUnfoldingInRuleMatch :: SimplEnv -> InScopeEnv --- When matching in RULE, we want to "look through" an unfolding --- (to see a constructor) if *rules* are on, even if *inlinings* --- are not. A notable example is DFuns, which really we want to --- match in rules like (op dfun) in gentle mode. Another example --- is 'otherwise' which we want exprIsConApp_maybe to be able to --- see very early on -getUnfoldingInRuleMatch env - = (in_scope, id_unf) - where - in_scope = seInScope env - mode = getMode env - id_unf id | unf_is_active id = idUnfolding id - | otherwise = NoUnfolding - unf_is_active id - | not (sm_rules mode) = -- active_unfolding_minimal id - isStableUnfolding (realIdUnfolding id) - -- Do we even need to test this? I think this InScopeEnv - -- is only consulted if activeRule returns True, which - -- never happens if sm_rules is False - | otherwise = isActive (sm_phase mode) (idInlineActivation id) - ----------------------- -activeRule :: SimplMode -> Activation -> Bool --- Nothing => No rules at all -activeRule mode - | not (sm_rules mode) = \_ -> False -- Rewriting is off - | otherwise = isActive (sm_phase mode) - -{- -************************************************************************ -* * - preInlineUnconditionally -* * -************************************************************************ - -preInlineUnconditionally -~~~~~~~~~~~~~~~~~~~~~~~~ -@preInlineUnconditionally@ examines a bndr to see if it is used just -once in a completely safe way, so that it is safe to discard the -binding inline its RHS at the (unique) usage site, REGARDLESS of how -big the RHS might be. If this is the case we don't simplify the RHS -first, but just inline it un-simplified. - -This is much better than first simplifying a perhaps-huge RHS and then -inlining and re-simplifying it. Indeed, it can be at least quadratically -better. Consider - - x1 = e1 - x2 = e2[x1] - x3 = e3[x2] - ...etc... - xN = eN[xN-1] - -We may end up simplifying e1 N times, e2 N-1 times, e3 N-3 times etc. -This can happen with cascades of functions too: - - f1 = \x1.e1 - f2 = \xs.e2[f1] - f3 = \xs.e3[f3] - ...etc... - -THE MAIN INVARIANT is this: - - ---- preInlineUnconditionally invariant ----- - IF preInlineUnconditionally chooses to inline x = <rhs> - THEN doing the inlining should not change the occurrence - info for the free vars of <rhs> - ---------------------------------------------- - -For example, it's tempting to look at trivial binding like - x = y -and inline it unconditionally. But suppose x is used many times, -but this is the unique occurrence of y. Then inlining x would change -y's occurrence info, which breaks the invariant. It matters: y -might have a BIG rhs, which will now be dup'd at every occurrence of x. - - -Even RHSs labelled InlineMe aren't caught here, because there might be -no benefit from inlining at the call site. - -[Sept 01] Don't unconditionally inline a top-level thing, because that -can simply make a static thing into something built dynamically. E.g. - x = (a,b) - main = \s -> h x - -[Remember that we treat \s as a one-shot lambda.] No point in -inlining x unless there is something interesting about the call site. - -But watch out: if you aren't careful, some useful foldr/build fusion -can be lost (most notably in spectral/hartel/parstof) because the -foldr didn't see the build. Doing the dynamic allocation isn't a big -deal, in fact, but losing the fusion can be. But the right thing here -seems to be to do a callSiteInline based on the fact that there is -something interesting about the call site (it's strict). Hmm. That -seems a bit fragile. - -Conclusion: inline top level things gaily until Phase 0 (the last -phase), at which point don't. - -Note [pre/postInlineUnconditionally in gentle mode] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Even in gentle mode we want to do preInlineUnconditionally. The -reason is that too little clean-up happens if you don't inline -use-once things. Also a bit of inlining is *good* for full laziness; -it can expose constant sub-expressions. Example in -spectral/mandel/Mandel.hs, where the mandelset function gets a useful -let-float if you inline windowToViewport - -However, as usual for Gentle mode, do not inline things that are -inactive in the initial stages. See Note [Gentle mode]. - -Note [Stable unfoldings and preInlineUnconditionally] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Surprisingly, do not pre-inline-unconditionally Ids with INLINE pragmas! -Example - - {-# INLINE f #-} - f :: Eq a => a -> a - f x = ... - - fInt :: Int -> Int - fInt = f Int dEqInt - - ...fInt...fInt...fInt... - -Here f occurs just once, in the RHS of fInt. But if we inline it there -it might make fInt look big, and we'll lose the opportunity to inline f -at each of fInt's call sites. The INLINE pragma will only inline when -the application is saturated for exactly this reason; and we don't -want PreInlineUnconditionally to second-guess it. A live example is -#3736. - c.f. Note [Stable unfoldings and postInlineUnconditionally] - -NB: if the pragma is INLINEABLE, then we don't want to behave in -this special way -- an INLINEABLE pragma just says to GHC "inline this -if you like". But if there is a unique occurrence, we want to inline -the stable unfolding, not the RHS. - -Note [Top-level bottoming Ids] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Don't inline top-level Ids that are bottoming, even if they are used just -once, because FloatOut has gone to some trouble to extract them out. -Inlining them won't make the program run faster! - -Note [Do not inline CoVars unconditionally] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Coercion variables appear inside coercions, and the RHS of a let-binding -is a term (not a coercion) so we can't necessarily inline the latter in -the former. --} - -preInlineUnconditionally - :: SimplEnv -> TopLevelFlag -> InId - -> InExpr -> StaticEnv -- These two go together - -> Maybe SimplEnv -- Returned env has extended substitution --- Precondition: rhs satisfies the let/app invariant --- See Note [Core let/app invariant] in GHC.Core --- Reason: we don't want to inline single uses, or discard dead bindings, --- for unlifted, side-effect-ful bindings -preInlineUnconditionally env top_lvl bndr rhs rhs_env - | not pre_inline_unconditionally = Nothing - | not active = Nothing - | isTopLevel top_lvl && isBottomingId bndr = Nothing -- Note [Top-level bottoming Ids] - | isCoVar bndr = Nothing -- Note [Do not inline CoVars unconditionally] - | isExitJoinId bndr = Nothing -- Note [Do not inline exit join points] - -- in module Exitify - | not (one_occ (idOccInfo bndr)) = Nothing - | not (isStableUnfolding unf) = Just (extend_subst_with rhs) - - -- Note [Stable unfoldings and preInlineUnconditionally] - | isInlinablePragma inline_prag - , Just inl <- maybeUnfoldingTemplate unf = Just (extend_subst_with inl) - | otherwise = Nothing - where - unf = idUnfolding bndr - extend_subst_with inl_rhs = extendIdSubst env bndr (mkContEx rhs_env inl_rhs) - - one_occ IAmDead = True -- Happens in ((\x.1) v) - one_occ OneOcc{ occ_one_br = InOneBranch - , occ_in_lam = NotInsideLam } = isNotTopLevel top_lvl || early_phase - one_occ OneOcc{ occ_one_br = InOneBranch - , occ_in_lam = IsInsideLam - , occ_int_cxt = IsInteresting } = canInlineInLam rhs - one_occ _ = False - - pre_inline_unconditionally = gopt Opt_SimplPreInlining (seDynFlags env) - mode = getMode env - active = isActive (sm_phase mode) (inlinePragmaActivation inline_prag) - -- See Note [pre/postInlineUnconditionally in gentle mode] - inline_prag = idInlinePragma bndr - --- Be very careful before inlining inside a lambda, because (a) we must not --- invalidate occurrence information, and (b) we want to avoid pushing a --- single allocation (here) into multiple allocations (inside lambda). --- Inlining a *function* with a single *saturated* call would be ok, mind you. --- || (if is_cheap && not (canInlineInLam rhs) then pprTrace "preinline" (ppr bndr <+> ppr rhs) ok else ok) --- where --- is_cheap = exprIsCheap rhs --- ok = is_cheap && int_cxt - - -- int_cxt The context isn't totally boring - -- E.g. let f = \ab.BIG in \y. map f xs - -- Don't want to substitute for f, because then we allocate - -- its closure every time the \y is called - -- But: let f = \ab.BIG in \y. map (f y) xs - -- Now we do want to substitute for f, even though it's not - -- saturated, because we're going to allocate a closure for - -- (f y) every time round the loop anyhow. - - -- canInlineInLam => free vars of rhs are (Once in_lam) or Many, - -- so substituting rhs inside a lambda doesn't change the occ info. - -- Sadly, not quite the same as exprIsHNF. - canInlineInLam (Lit _) = True - canInlineInLam (Lam b e) = isRuntimeVar b || canInlineInLam e - canInlineInLam (Tick t e) = not (tickishIsCode t) && canInlineInLam e - canInlineInLam _ = False - -- not ticks. Counting ticks cannot be duplicated, and non-counting - -- ticks around a Lam will disappear anyway. - - early_phase = case sm_phase mode of - Phase 0 -> False - _ -> True --- If we don't have this early_phase test, consider --- x = length [1,2,3] --- The full laziness pass carefully floats all the cons cells to --- top level, and preInlineUnconditionally floats them all back in. --- Result is (a) static allocation replaced by dynamic allocation --- (b) many simplifier iterations because this tickles --- a related problem; only one inlining per pass --- --- On the other hand, I have seen cases where top-level fusion is --- lost if we don't inline top level thing (e.g. string constants) --- Hence the test for phase zero (which is the phase for all the final --- simplifications). Until phase zero we take no special notice of --- top level things, but then we become more leery about inlining --- them. - -{- -************************************************************************ -* * - postInlineUnconditionally -* * -************************************************************************ - -postInlineUnconditionally -~~~~~~~~~~~~~~~~~~~~~~~~~ -@postInlineUnconditionally@ decides whether to unconditionally inline -a thing based on the form of its RHS; in particular if it has a -trivial RHS. If so, we can inline and discard the binding altogether. - -NB: a loop breaker has must_keep_binding = True and non-loop-breakers -only have *forward* references. Hence, it's safe to discard the binding - -NOTE: This isn't our last opportunity to inline. We're at the binding -site right now, and we'll get another opportunity when we get to the -occurrence(s) - -Note that we do this unconditional inlining only for trivial RHSs. -Don't inline even WHNFs inside lambdas; doing so may simply increase -allocation when the function is called. This isn't the last chance; see -NOTE above. - -NB: Even inline pragmas (e.g. IMustBeINLINEd) are ignored here Why? -Because we don't even want to inline them into the RHS of constructor -arguments. See NOTE above - -NB: At one time even NOINLINE was ignored here: if the rhs is trivial -it's best to inline it anyway. We often get a=E; b=a from desugaring, -with both a and b marked NOINLINE. But that seems incompatible with -our new view that inlining is like a RULE, so I'm sticking to the 'active' -story for now. --} - -postInlineUnconditionally - :: SimplEnv -> TopLevelFlag - -> OutId -- The binder (*not* a CoVar), including its unfolding - -> OccInfo -- From the InId - -> OutExpr - -> Bool --- Precondition: rhs satisfies the let/app invariant --- See Note [Core let/app invariant] in GHC.Core --- Reason: we don't want to inline single uses, or discard dead bindings, --- for unlifted, side-effect-ful bindings -postInlineUnconditionally env top_lvl bndr occ_info rhs - | not active = False - | isWeakLoopBreaker occ_info = False -- If it's a loop-breaker of any kind, don't inline - -- because it might be referred to "earlier" - | isStableUnfolding unfolding = False -- Note [Stable unfoldings and postInlineUnconditionally] - | isTopLevel top_lvl = False -- Note [Top level and postInlineUnconditionally] - | exprIsTrivial rhs = True - | otherwise - = case occ_info of - -- The point of examining occ_info here is that for *non-values* - -- that occur outside a lambda, the call-site inliner won't have - -- a chance (because it doesn't know that the thing - -- only occurs once). The pre-inliner won't have gotten - -- it either, if the thing occurs in more than one branch - -- So the main target is things like - -- let x = f y in - -- case v of - -- True -> case x of ... - -- False -> case x of ... - -- This is very important in practice; e.g. wheel-seive1 doubles - -- in allocation if you miss this out - OneOcc { occ_in_lam = in_lam, occ_int_cxt = int_cxt } - -- OneOcc => no code-duplication issue - -> smallEnoughToInline dflags unfolding -- Small enough to dup - -- ToDo: consider discount on smallEnoughToInline if int_cxt is true - -- - -- NB: Do NOT inline arbitrarily big things, even if one_br is True - -- Reason: doing so risks exponential behaviour. We simplify a big - -- expression, inline it, and simplify it again. But if the - -- very same thing happens in the big expression, we get - -- exponential cost! - -- PRINCIPLE: when we've already simplified an expression once, - -- make sure that we only inline it if it's reasonably small. - - && (in_lam == NotInsideLam || - -- Outside a lambda, we want to be reasonably aggressive - -- about inlining into multiple branches of case - -- e.g. let x = <non-value> - -- in case y of { C1 -> ..x..; C2 -> ..x..; C3 -> ... } - -- Inlining can be a big win if C3 is the hot-spot, even if - -- the uses in C1, C2 are not 'interesting' - -- An example that gets worse if you add int_cxt here is 'clausify' - - (isCheapUnfolding unfolding && int_cxt == IsInteresting)) - -- isCheap => acceptable work duplication; in_lam may be true - -- int_cxt to prevent us inlining inside a lambda without some - -- good reason. See the notes on int_cxt in preInlineUnconditionally - - IAmDead -> True -- This happens; for example, the case_bndr during case of - -- known constructor: case (a,b) of x { (p,q) -> ... } - -- Here x isn't mentioned in the RHS, so we don't want to - -- create the (dead) let-binding let x = (a,b) in ... - - _ -> False - --- Here's an example that we don't handle well: --- let f = if b then Left (\x.BIG) else Right (\y.BIG) --- in \y. ....case f of {...} .... --- Here f is used just once, and duplicating the case work is fine (exprIsCheap). --- But --- - We can't preInlineUnconditionally because that would invalidate --- the occ info for b. --- - We can't postInlineUnconditionally because the RHS is big, and --- that risks exponential behaviour --- - We can't call-site inline, because the rhs is big --- Alas! - - where - unfolding = idUnfolding bndr - dflags = seDynFlags env - active = isActive (sm_phase (getMode env)) (idInlineActivation bndr) - -- See Note [pre/postInlineUnconditionally in gentle mode] - -{- -Note [Top level and postInlineUnconditionally] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We don't do postInlineUnconditionally for top-level things (even for -ones that are trivial): - - * Doing so will inline top-level error expressions that have been - carefully floated out by FloatOut. More generally, it might - replace static allocation with dynamic. - - * Even for trivial expressions there's a problem. Consider - {-# RULE "foo" forall (xs::[T]). reverse xs = ruggle xs #-} - blah xs = reverse xs - ruggle = sort - In one simplifier pass we might fire the rule, getting - blah xs = ruggle xs - but in *that* simplifier pass we must not do postInlineUnconditionally - on 'ruggle' because then we'll have an unbound occurrence of 'ruggle' - - If the rhs is trivial it'll be inlined by callSiteInline, and then - the binding will be dead and discarded by the next use of OccurAnal - - * There is less point, because the main goal is to get rid of local - bindings used in multiple case branches. - - * The inliner should inline trivial things at call sites anyway. - - * The Id might be exported. We could check for that separately, - but since we aren't going to postInlineUnconditionally /any/ - top-level bindings, we don't need to test. - -Note [Stable unfoldings and postInlineUnconditionally] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Do not do postInlineUnconditionally if the Id has a stable unfolding, -otherwise we lose the unfolding. Example - - -- f has stable unfolding with rhs (e |> co) - -- where 'e' is big - f = e |> co - -Then there's a danger we'll optimise to - - f' = e - f = f' |> co - -and now postInlineUnconditionally, losing the stable unfolding on f. Now f' -won't inline because 'e' is too big. - - c.f. Note [Stable unfoldings and preInlineUnconditionally] - - -************************************************************************ -* * - Rebuilding a lambda -* * -************************************************************************ --} - -mkLam :: SimplEnv -> [OutBndr] -> OutExpr -> SimplCont -> SimplM OutExpr --- mkLam tries three things --- a) eta reduction, if that gives a trivial expression --- b) eta expansion [only if there are some value lambdas] - -mkLam _env [] body _cont - = return body -mkLam env bndrs body cont - = do { dflags <- getDynFlags - ; mkLam' dflags bndrs body } - where - mkLam' :: DynFlags -> [OutBndr] -> OutExpr -> SimplM OutExpr - mkLam' dflags bndrs (Cast body co) - | not (any bad bndrs) - -- Note [Casts and lambdas] - = do { lam <- mkLam' dflags bndrs body - ; return (mkCast lam (mkPiCos Representational bndrs co)) } - where - co_vars = tyCoVarsOfCo co - bad bndr = isCoVar bndr && bndr `elemVarSet` co_vars - - mkLam' dflags bndrs body@(Lam {}) - = mkLam' dflags (bndrs ++ bndrs1) body1 - where - (bndrs1, body1) = collectBinders body - - mkLam' dflags bndrs (Tick t expr) - | tickishFloatable t - = mkTick t <$> mkLam' dflags bndrs expr - - mkLam' dflags bndrs body - | gopt Opt_DoEtaReduction dflags - , Just etad_lam <- tryEtaReduce bndrs body - = do { tick (EtaReduction (head bndrs)) - ; return etad_lam } - - | not (contIsRhs cont) -- See Note [Eta-expanding lambdas] - , sm_eta_expand (getMode env) - , any isRuntimeVar bndrs - , let body_arity = exprEtaExpandArity dflags body - , body_arity > 0 - = do { tick (EtaExpansion (head bndrs)) - ; let res = mkLams bndrs (etaExpand body_arity body) - ; traceSmpl "eta expand" (vcat [text "before" <+> ppr (mkLams bndrs body) - , text "after" <+> ppr res]) - ; return res } - - | otherwise - = return (mkLams bndrs body) - -{- -Note [Eta expanding lambdas] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In general we *do* want to eta-expand lambdas. Consider - f (\x -> case x of (a,b) -> \s -> blah) -where 's' is a state token, and hence can be eta expanded. This -showed up in the code for GHc.IO.Handle.Text.hPutChar, a rather -important function! - -The eta-expansion will never happen unless we do it now. (Well, it's -possible that CorePrep will do it, but CorePrep only has a half-baked -eta-expander that can't deal with casts. So it's much better to do it -here.) - -However, when the lambda is let-bound, as the RHS of a let, we have a -better eta-expander (in the form of tryEtaExpandRhs), so we don't -bother to try expansion in mkLam in that case; hence the contIsRhs -guard. - -NB: We check the SimplEnv (sm_eta_expand), not DynFlags. - See Note [No eta expansion in stable unfoldings] - -Note [Casts and lambdas] -~~~~~~~~~~~~~~~~~~~~~~~~ -Consider - (\x. (\y. e) `cast` g1) `cast` g2 -There is a danger here that the two lambdas look separated, and the -full laziness pass might float an expression to between the two. - -So this equation in mkLam' floats the g1 out, thus: - (\x. e `cast` g1) --> (\x.e) `cast` (tx -> g1) -where x:tx. - -In general, this floats casts outside lambdas, where (I hope) they -might meet and cancel with some other cast: - \x. e `cast` co ===> (\x. e) `cast` (tx -> co) - /\a. e `cast` co ===> (/\a. e) `cast` (/\a. co) - /\g. e `cast` co ===> (/\g. e) `cast` (/\g. co) - (if not (g `in` co)) - -Notice that it works regardless of 'e'. Originally it worked only -if 'e' was itself a lambda, but in some cases that resulted in -fruitless iteration in the simplifier. A good example was when -compiling Text.ParserCombinators.ReadPrec, where we had a definition -like (\x. Get `cast` g) -where Get is a constructor with nonzero arity. Then mkLam eta-expanded -the Get, and the next iteration eta-reduced it, and then eta-expanded -it again. - -Note also the side condition for the case of coercion binders. -It does not make sense to transform - /\g. e `cast` g ==> (/\g.e) `cast` (/\g.g) -because the latter is not well-kinded. - -************************************************************************ -* * - Eta expansion -* * -************************************************************************ --} - -tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr - -> SimplM (Arity, Bool, OutExpr) --- See Note [Eta-expanding at let bindings] --- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then --- (a) rhs' has manifest arity n --- (b) if is_bot is True then rhs' applied to n args is guaranteed bottom -tryEtaExpandRhs mode bndr rhs - | Just join_arity <- isJoinId_maybe bndr - = do { let (join_bndrs, join_body) = collectNBinders join_arity rhs - ; return (count isId join_bndrs, exprIsBottom join_body, rhs) } - -- Note [Do not eta-expand join points] - -- But do return the correct arity and bottom-ness, because - -- these are used to set the bndr's IdInfo (#15517) - -- Note [Invariants on join points] invariant 2b, in GHC.Core - - | otherwise - = do { (new_arity, is_bot, new_rhs) <- try_expand - - ; WARN( new_arity < old_id_arity, - (text "Arity decrease:" <+> (ppr bndr <+> ppr old_id_arity - <+> ppr old_arity <+> ppr new_arity) $$ ppr new_rhs) ) - -- Note [Arity decrease] in Simplify - return (new_arity, is_bot, new_rhs) } - where - try_expand - | exprIsTrivial rhs - = return (exprArity rhs, False, rhs) - - | sm_eta_expand mode -- Provided eta-expansion is on - , new_arity > old_arity -- And the current manifest arity isn't enough - = do { tick (EtaExpansion bndr) - ; return (new_arity, is_bot, etaExpand new_arity rhs) } - - | otherwise - = return (old_arity, is_bot && new_arity == old_arity, rhs) - - dflags = sm_dflags mode - old_arity = exprArity rhs -- See Note [Do not expand eta-expand PAPs] - old_id_arity = idArity bndr - - (new_arity1, is_bot) = findRhsArity dflags bndr rhs old_arity - new_arity2 = idCallArity bndr - new_arity = max new_arity1 new_arity2 - -{- -Note [Eta-expanding at let bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We now eta expand at let-bindings, which is where the payoff comes. -The most significant thing is that we can do a simple arity analysis -(in GHC.Core.Arity.findRhsArity), which we can't do for free-floating lambdas - -One useful consequence of not eta-expanding lambdas is this example: - genMap :: C a => ... - {-# INLINE genMap #-} - genMap f xs = ... - - myMap :: D a => ... - {-# INLINE myMap #-} - myMap = genMap - -Notice that 'genMap' should only inline if applied to two arguments. -In the stable unfolding for myMap we'll have the unfolding - (\d -> genMap Int (..d..)) -We do not want to eta-expand to - (\d f xs -> genMap Int (..d..) f xs) -because then 'genMap' will inline, and it really shouldn't: at least -as far as the programmer is concerned, it's not applied to two -arguments! - -Note [Do not eta-expand join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Similarly to CPR (see Note [Don't w/w join points for CPR] in WorkWrap), a join point -stands well to gain from its outer binding's eta-expansion, and eta-expanding a -join point is fraught with issues like how to deal with a cast: - - let join $j1 :: IO () - $j1 = ... - $j2 :: Int -> IO () - $j2 n = if n > 0 then $j1 - else ... - - => - - let join $j1 :: IO () - $j1 = (\eta -> ...) - `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ()) - ~ IO () - $j2 :: Int -> IO () - $j2 n = (\eta -> if n > 0 then $j1 - else ...) - `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ()) - ~ IO () - -The cast here can't be pushed inside the lambda (since it's not casting to a -function type), so the lambda has to stay, but it can't because it contains a -reference to a join point. In fact, $j2 can't be eta-expanded at all. Rather -than try and detect this situation (and whatever other situations crop up!), we -don't bother; again, any surrounding eta-expansion will improve these join -points anyway, since an outer cast can *always* be pushed inside. By the time -CorePrep comes around, the code is very likely to look more like this: - - let join $j1 :: State# RealWorld -> (# State# RealWorld, ()) - $j1 = (...) eta - $j2 :: Int -> State# RealWorld -> (# State# RealWorld, ()) - $j2 = if n > 0 then $j1 - else (...) eta - -Note [Do not eta-expand PAPs] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We used to have old_arity = manifestArity rhs, which meant that we -would eta-expand even PAPs. But this gives no particular advantage, -and can lead to a massive blow-up in code size, exhibited by #9020. -Suppose we have a PAP - foo :: IO () - foo = returnIO () -Then we can eta-expand do - foo = (\eta. (returnIO () |> sym g) eta) |> g -where - g :: IO () ~ State# RealWorld -> (# State# RealWorld, () #) - -But there is really no point in doing this, and it generates masses of -coercions and whatnot that eventually disappear again. For T9020, GHC -allocated 6.6G before, and 0.8G afterwards; and residency dropped from -1.8G to 45M. - -But note that this won't eta-expand, say - f = \g -> map g -Does it matter not eta-expanding such functions? I'm not sure. Perhaps -strictness analysis will have less to bite on? - - -************************************************************************ -* * -\subsection{Floating lets out of big lambdas} -* * -************************************************************************ - -Note [Floating and type abstraction] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this: - x = /\a. C e1 e2 -We'd like to float this to - y1 = /\a. e1 - y2 = /\a. e2 - x = /\a. C (y1 a) (y2 a) -for the usual reasons: we want to inline x rather vigorously. - -You may think that this kind of thing is rare. But in some programs it is -common. For example, if you do closure conversion you might get: - - data a :-> b = forall e. (e -> a -> b) :$ e - - f_cc :: forall a. a :-> a - f_cc = /\a. (\e. id a) :$ () - -Now we really want to inline that f_cc thing so that the -construction of the closure goes away. - -So I have elaborated simplLazyBind to understand right-hand sides that look -like - /\ a1..an. body - -and treat them specially. The real work is done in SimplUtils.abstractFloats, -but there is quite a bit of plumbing in simplLazyBind as well. - -The same transformation is good when there are lets in the body: - - /\abc -> let(rec) x = e in b - ==> - let(rec) x' = /\abc -> let x = x' a b c in e - in - /\abc -> let x = x' a b c in b - -This is good because it can turn things like: - - let f = /\a -> letrec g = ... g ... in g -into - letrec g' = /\a -> ... g' a ... - in - let f = /\ a -> g' a - -which is better. In effect, it means that big lambdas don't impede -let-floating. - -This optimisation is CRUCIAL in eliminating the junk introduced by -desugaring mutually recursive definitions. Don't eliminate it lightly! - -[May 1999] If we do this transformation *regardless* then we can -end up with some pretty silly stuff. For example, - - let - st = /\ s -> let { x1=r1 ; x2=r2 } in ... - in .. -becomes - let y1 = /\s -> r1 - y2 = /\s -> r2 - st = /\s -> ...[y1 s/x1, y2 s/x2] - in .. - -Unless the "..." is a WHNF there is really no point in doing this. -Indeed it can make things worse. Suppose x1 is used strictly, -and is of the form - - x1* = case f y of { (a,b) -> e } - -If we abstract this wrt the tyvar we then can't do the case inline -as we would normally do. - -That's why the whole transformation is part of the same process that -floats let-bindings and constructor arguments out of RHSs. In particular, -it is guarded by the doFloatFromRhs call in simplLazyBind. - -Note [Which type variables to abstract over] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Abstract only over the type variables free in the rhs wrt which the -new binding is abstracted. Note that - - * The naive approach of abstracting wrt the - tyvars free in the Id's /type/ fails. Consider: - /\ a b -> let t :: (a,b) = (e1, e2) - x :: a = fst t - in ... - Here, b isn't free in x's type, but we must nevertheless - abstract wrt b as well, because t's type mentions b. - Since t is floated too, we'd end up with the bogus: - poly_t = /\ a b -> (e1, e2) - poly_x = /\ a -> fst (poly_t a *b*) - - * We must do closeOverKinds. Example (#10934): - f = /\k (f:k->*) (a:k). let t = AccFailure @ (f a) in ... - Here we want to float 't', but we must remember to abstract over - 'k' as well, even though it is not explicitly mentioned in the RHS, - otherwise we get - t = /\ (f:k->*) (a:k). AccFailure @ (f a) - which is obviously bogus. --} - -abstractFloats :: DynFlags -> TopLevelFlag -> [OutTyVar] -> SimplFloats - -> OutExpr -> SimplM ([OutBind], OutExpr) -abstractFloats dflags top_lvl main_tvs floats body - = ASSERT( notNull body_floats ) - ASSERT( isNilOL (sfJoinFloats floats) ) - do { (subst, float_binds) <- mapAccumLM abstract empty_subst body_floats - ; return (float_binds, GHC.Core.Subst.substExpr (text "abstract_floats1") subst body) } - where - is_top_lvl = isTopLevel top_lvl - main_tv_set = mkVarSet main_tvs - body_floats = letFloatBinds (sfLetFloats floats) - empty_subst = GHC.Core.Subst.mkEmptySubst (sfInScope floats) - - abstract :: GHC.Core.Subst.Subst -> OutBind -> SimplM (GHC.Core.Subst.Subst, OutBind) - abstract subst (NonRec id rhs) - = do { (poly_id1, poly_app) <- mk_poly1 tvs_here id - ; let (poly_id2, poly_rhs) = mk_poly2 poly_id1 tvs_here rhs' - subst' = GHC.Core.Subst.extendIdSubst subst id poly_app - ; return (subst', NonRec poly_id2 poly_rhs) } - where - rhs' = GHC.Core.Subst.substExpr (text "abstract_floats2") subst rhs - - -- tvs_here: see Note [Which type variables to abstract over] - tvs_here = scopedSort $ - filter (`elemVarSet` main_tv_set) $ - closeOverKindsList $ - exprSomeFreeVarsList isTyVar rhs' - - abstract subst (Rec prs) - = do { (poly_ids, poly_apps) <- mapAndUnzipM (mk_poly1 tvs_here) ids - ; let subst' = GHC.Core.Subst.extendSubstList subst (ids `zip` poly_apps) - poly_pairs = [ mk_poly2 poly_id tvs_here rhs' - | (poly_id, rhs) <- poly_ids `zip` rhss - , let rhs' = GHC.Core.Subst.substExpr (text "abstract_floats") - subst' rhs ] - ; return (subst', Rec poly_pairs) } - where - (ids,rhss) = unzip prs - -- For a recursive group, it's a bit of a pain to work out the minimal - -- set of tyvars over which to abstract: - -- /\ a b c. let x = ...a... in - -- letrec { p = ...x...q... - -- q = .....p...b... } in - -- ... - -- Since 'x' is abstracted over 'a', the {p,q} group must be abstracted - -- over 'a' (because x is replaced by (poly_x a)) as well as 'b'. - -- Since it's a pain, we just use the whole set, which is always safe - -- - -- If you ever want to be more selective, remember this bizarre case too: - -- x::a = x - -- Here, we must abstract 'x' over 'a'. - tvs_here = scopedSort main_tvs - - mk_poly1 :: [TyVar] -> Id -> SimplM (Id, CoreExpr) - mk_poly1 tvs_here var - = do { uniq <- getUniqueM - ; let poly_name = setNameUnique (idName var) uniq -- Keep same name - poly_ty = mkInvForAllTys tvs_here (idType var) -- But new type of course - poly_id = transferPolyIdInfo var tvs_here $ -- Note [transferPolyIdInfo] in Id.hs - mkLocalId poly_name poly_ty - ; return (poly_id, mkTyApps (Var poly_id) (mkTyVarTys tvs_here)) } - -- In the olden days, it was crucial to copy the occInfo of the original var, - -- because we were looking at occurrence-analysed but as yet unsimplified code! - -- In particular, we mustn't lose the loop breakers. BUT NOW we are looking - -- at already simplified code, so it doesn't matter - -- - -- It's even right to retain single-occurrence or dead-var info: - -- Suppose we started with /\a -> let x = E in B - -- where x occurs once in B. Then we transform to: - -- let x' = /\a -> E in /\a -> let x* = x' a in B - -- where x* has an INLINE prag on it. Now, once x* is inlined, - -- the occurrences of x' will be just the occurrences originally - -- pinned on x. - - mk_poly2 :: Id -> [TyVar] -> CoreExpr -> (Id, CoreExpr) - mk_poly2 poly_id tvs_here rhs - = (poly_id `setIdUnfolding` unf, poly_rhs) - where - poly_rhs = mkLams tvs_here rhs - unf = mkUnfolding dflags InlineRhs is_top_lvl False poly_rhs - - -- We want the unfolding. Consider - -- let - -- x = /\a. let y = ... in Just y - -- in body - -- Then we float the y-binding out (via abstractFloats and addPolyBind) - -- but 'x' may well then be inlined in 'body' in which case we'd like the - -- opportunity to inline 'y' too. - -{- -Note [Abstract over coercions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If a coercion variable (g :: a ~ Int) is free in the RHS, then so is the -type variable a. Rather than sort this mess out, we simply bale out and abstract -wrt all the type variables if any of them are coercion variables. - - -Historical note: if you use let-bindings instead of a substitution, beware of this: - - -- Suppose we start with: - -- - -- x = /\ a -> let g = G in E - -- - -- Then we'll float to get - -- - -- x = let poly_g = /\ a -> G - -- in /\ a -> let g = poly_g a in E - -- - -- But now the occurrence analyser will see just one occurrence - -- of poly_g, not inside a lambda, so the simplifier will - -- PreInlineUnconditionally poly_g back into g! Badk to square 1! - -- (I used to think that the "don't inline lone occurrences" stuff - -- would stop this happening, but since it's the *only* occurrence, - -- PreInlineUnconditionally kicks in first!) - -- - -- Solution: put an INLINE note on g's RHS, so that poly_g seems - -- to appear many times. (NB: mkInlineMe eliminates - -- such notes on trivial RHSs, so do it manually.) - -************************************************************************ -* * - prepareAlts -* * -************************************************************************ - -prepareAlts tries these things: - -1. Eliminate alternatives that cannot match, including the - DEFAULT alternative. - -2. If the DEFAULT alternative can match only one possible constructor, - then make that constructor explicit. - e.g. - case e of x { DEFAULT -> rhs } - ===> - case e of x { (a,b) -> rhs } - where the type is a single constructor type. This gives better code - when rhs also scrutinises x or e. - -3. Returns a list of the constructors that cannot holds in the - DEFAULT alternative (if there is one) - -Here "cannot match" includes knowledge from GADTs - -It's a good idea to do this stuff before simplifying the alternatives, to -avoid simplifying alternatives we know can't happen, and to come up with -the list of constructors that are handled, to put into the IdInfo of the -case binder, for use when simplifying the alternatives. - -Eliminating the default alternative in (1) isn't so obvious, but it can -happen: - -data Colour = Red | Green | Blue - -f x = case x of - Red -> .. - Green -> .. - DEFAULT -> h x - -h y = case y of - Blue -> .. - DEFAULT -> [ case y of ... ] - -If we inline h into f, the default case of the inlined h can't happen. -If we don't notice this, we may end up filtering out *all* the cases -of the inner case y, which give us nowhere to go! --} - -prepareAlts :: OutExpr -> OutId -> [InAlt] -> SimplM ([AltCon], [InAlt]) --- The returned alternatives can be empty, none are possible -prepareAlts scrut case_bndr' alts - | Just (tc, tys) <- splitTyConApp_maybe (varType case_bndr') - -- Case binder is needed just for its type. Note that as an - -- OutId, it has maximum information; this is important. - -- Test simpl013 is an example - = do { us <- getUniquesM - ; let (idcs1, alts1) = filterAlts tc tys imposs_cons alts - (yes2, alts2) = refineDefaultAlt us tc tys idcs1 alts1 - (yes3, idcs3, alts3) = combineIdenticalAlts idcs1 alts2 - -- "idcs" stands for "impossible default data constructors" - -- i.e. the constructors that can't match the default case - ; when yes2 $ tick (FillInCaseDefault case_bndr') - ; when yes3 $ tick (AltMerge case_bndr') - ; return (idcs3, alts3) } - - | otherwise -- Not a data type, so nothing interesting happens - = return ([], alts) - where - imposs_cons = case scrut of - Var v -> otherCons (idUnfolding v) - _ -> [] - - -{- -************************************************************************ -* * - mkCase -* * -************************************************************************ - -mkCase tries these things - -* Note [Nerge nested cases] -* Note [Eliminate identity case] -* Note [Scrutinee constant folding] - -Note [Merge Nested Cases] -~~~~~~~~~~~~~~~~~~~~~~~~~ - case e of b { ==> case e of b { - p1 -> rhs1 p1 -> rhs1 - ... ... - pm -> rhsm pm -> rhsm - _ -> case b of b' { pn -> let b'=b in rhsn - pn -> rhsn ... - ... po -> let b'=b in rhso - po -> rhso _ -> let b'=b in rhsd - _ -> rhsd - } - -which merges two cases in one case when -- the default alternative of -the outer case scrutises the same variable as the outer case. This -transformation is called Case Merging. It avoids that the same -variable is scrutinised multiple times. - -Note [Eliminate Identity Case] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - case e of ===> e - True -> True; - False -> False - -and similar friends. - -Note [Scrutinee Constant Folding] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - case x op# k# of _ { ===> case x of _ { - a1# -> e1 (a1# inv_op# k#) -> e1 - a2# -> e2 (a2# inv_op# k#) -> e2 - ... ... - DEFAULT -> ed DEFAULT -> ed - - where (x op# k#) inv_op# k# == x - -And similarly for commuted arguments and for some unary operations. - -The purpose of this transformation is not only to avoid an arithmetic -operation at runtime but to allow other transformations to apply in cascade. - -Example with the "Merge Nested Cases" optimization (from #12877): - - main = case t of t0 - 0## -> ... - DEFAULT -> case t0 `minusWord#` 1## of t1 - 0## -> ... - DEFAULT -> case t1 `minusWord#` 1## of t2 - 0## -> ... - DEFAULT -> case t2 `minusWord#` 1## of _ - 0## -> ... - DEFAULT -> ... - - becomes: - - main = case t of _ - 0## -> ... - 1## -> ... - 2## -> ... - 3## -> ... - DEFAULT -> ... - -There are some wrinkles - -* Do not apply caseRules if there is just a single DEFAULT alternative - case e +# 3# of b { DEFAULT -> rhs } - If we applied the transformation here we would (stupidly) get - case a of b' { DEFAULT -> let b = e +# 3# in rhs } - and now the process may repeat, because that let will really - be a case. - -* The type of the scrutinee might change. E.g. - case tagToEnum (x :: Int#) of (b::Bool) - False -> e1 - True -> e2 - ==> - case x of (b'::Int#) - DEFAULT -> e1 - 1# -> e2 - -* The case binder may be used in the right hand sides, so we need - to make a local binding for it, if it is alive. e.g. - case e +# 10# of b - DEFAULT -> blah...b... - 44# -> blah2...b... - ===> - case e of b' - DEFAULT -> let b = b' +# 10# in blah...b... - 34# -> let b = 44# in blah2...b... - - Note that in the non-DEFAULT cases we know what to bind 'b' to, - whereas in the DEFAULT case we must reconstruct the original value. - But NB: we use b'; we do not duplicate 'e'. - -* In dataToTag we might need to make up some fake binders; - see Note [caseRules for dataToTag] in PrelRules --} - -mkCase, mkCase1, mkCase2, mkCase3 - :: DynFlags - -> OutExpr -> OutId - -> OutType -> [OutAlt] -- Alternatives in standard (increasing) order - -> SimplM OutExpr - --------------------------------------------------- --- 1. Merge Nested Cases --------------------------------------------------- - -mkCase dflags scrut outer_bndr alts_ty ((DEFAULT, _, deflt_rhs) : outer_alts) - | gopt Opt_CaseMerge dflags - , (ticks, Case (Var inner_scrut_var) inner_bndr _ inner_alts) - <- stripTicksTop tickishFloatable deflt_rhs - , inner_scrut_var == outer_bndr - = do { tick (CaseMerge outer_bndr) - - ; let wrap_alt (con, args, rhs) = ASSERT( outer_bndr `notElem` args ) - (con, args, wrap_rhs rhs) - -- Simplifier's no-shadowing invariant should ensure - -- that outer_bndr is not shadowed by the inner patterns - wrap_rhs rhs = Let (NonRec inner_bndr (Var outer_bndr)) rhs - -- The let is OK even for unboxed binders, - - wrapped_alts | isDeadBinder inner_bndr = inner_alts - | otherwise = map wrap_alt inner_alts - - merged_alts = mergeAlts outer_alts wrapped_alts - -- NB: mergeAlts gives priority to the left - -- case x of - -- A -> e1 - -- DEFAULT -> case x of - -- A -> e2 - -- B -> e3 - -- When we merge, we must ensure that e1 takes - -- precedence over e2 as the value for A! - - ; fmap (mkTicks ticks) $ - mkCase1 dflags scrut outer_bndr alts_ty merged_alts - } - -- Warning: don't call mkCase recursively! - -- Firstly, there's no point, because inner alts have already had - -- mkCase applied to them, so they won't have a case in their default - -- Secondly, if you do, you get an infinite loop, because the bindCaseBndr - -- in munge_rhs may put a case into the DEFAULT branch! - -mkCase dflags scrut bndr alts_ty alts = mkCase1 dflags scrut bndr alts_ty alts - --------------------------------------------------- --- 2. Eliminate Identity Case --------------------------------------------------- - -mkCase1 _dflags scrut case_bndr _ alts@((_,_,rhs1) : _) -- Identity case - | all identity_alt alts - = do { tick (CaseIdentity case_bndr) - ; return (mkTicks ticks $ re_cast scrut rhs1) } - where - ticks = concatMap (stripTicksT tickishFloatable . thdOf3) (tail alts) - identity_alt (con, args, rhs) = check_eq rhs con args - - check_eq (Cast rhs co) con args -- See Note [RHS casts] - = not (any (`elemVarSet` tyCoVarsOfCo co) args) && check_eq rhs con args - check_eq (Tick t e) alt args - = tickishFloatable t && check_eq e alt args - - check_eq (Lit lit) (LitAlt lit') _ = lit == lit' - check_eq (Var v) _ _ | v == case_bndr = True - check_eq (Var v) (DataAlt con) args - | null arg_tys, null args = v == dataConWorkId con - -- Optimisation only - check_eq rhs (DataAlt con) args = cheapEqExpr' tickishFloatable rhs $ - mkConApp2 con arg_tys args - check_eq _ _ _ = False - - arg_tys = tyConAppArgs (idType case_bndr) - - -- Note [RHS casts] - -- ~~~~~~~~~~~~~~~~ - -- We've seen this: - -- case e of x { _ -> x `cast` c } - -- And we definitely want to eliminate this case, to give - -- e `cast` c - -- So we throw away the cast from the RHS, and reconstruct - -- it at the other end. All the RHS casts must be the same - -- if (all identity_alt alts) holds. - -- - -- Don't worry about nested casts, because the simplifier combines them - - re_cast scrut (Cast rhs co) = Cast (re_cast scrut rhs) co - re_cast scrut _ = scrut - -mkCase1 dflags scrut bndr alts_ty alts = mkCase2 dflags scrut bndr alts_ty alts - --------------------------------------------------- --- 2. Scrutinee Constant Folding --------------------------------------------------- - -mkCase2 dflags scrut bndr alts_ty alts - | -- See Note [Scrutinee Constant Folding] - case alts of -- Not if there is just a DEFAULT alternative - [(DEFAULT,_,_)] -> False - _ -> True - , gopt Opt_CaseFolding dflags - , Just (scrut', tx_con, mk_orig) <- caseRules dflags scrut - = do { bndr' <- newId (fsLit "lwild") (exprType scrut') - - ; alts' <- mapMaybeM (tx_alt tx_con mk_orig bndr') alts - -- mapMaybeM: discard unreachable alternatives - -- See Note [Unreachable caseRules alternatives] - -- in PrelRules - - ; mkCase3 dflags scrut' bndr' alts_ty $ - add_default (re_sort alts') - } - - | otherwise - = mkCase3 dflags scrut bndr alts_ty alts - where - -- We need to keep the correct association between the scrutinee and its - -- binder if the latter isn't dead. Hence we wrap rhs of alternatives with - -- "let bndr = ... in": - -- - -- case v + 10 of y =====> case v of y - -- 20 -> e1 10 -> let y = 20 in e1 - -- DEFAULT -> e2 DEFAULT -> let y = v + 10 in e2 - -- - -- Other transformations give: =====> case v of y' - -- 10 -> let y = 20 in e1 - -- DEFAULT -> let y = y' + 10 in e2 - -- - -- This wrapping is done in tx_alt; we use mk_orig, returned by caseRules, - -- to construct an expression equivalent to the original one, for use - -- in the DEFAULT case - - tx_alt :: (AltCon -> Maybe AltCon) -> (Id -> CoreExpr) -> Id - -> CoreAlt -> SimplM (Maybe CoreAlt) - tx_alt tx_con mk_orig new_bndr (con, bs, rhs) - = case tx_con con of - Nothing -> return Nothing - Just con' -> do { bs' <- mk_new_bndrs new_bndr con' - ; return (Just (con', bs', rhs')) } - where - rhs' | isDeadBinder bndr = rhs - | otherwise = bindNonRec bndr orig_val rhs - - orig_val = case con of - DEFAULT -> mk_orig new_bndr - LitAlt l -> Lit l - DataAlt dc -> mkConApp2 dc (tyConAppArgs (idType bndr)) bs - - mk_new_bndrs new_bndr (DataAlt dc) - | not (isNullaryRepDataCon dc) - = -- For non-nullary data cons we must invent some fake binders - -- See Note [caseRules for dataToTag] in PrelRules - do { us <- getUniquesM - ; let (ex_tvs, arg_ids) = dataConRepInstPat us dc - (tyConAppArgs (idType new_bndr)) - ; return (ex_tvs ++ arg_ids) } - mk_new_bndrs _ _ = return [] - - re_sort :: [CoreAlt] -> [CoreAlt] - -- Sort the alternatives to re-establish - -- GHC.Core Note [Case expression invariants] - re_sort alts = sortBy cmpAlt alts - - add_default :: [CoreAlt] -> [CoreAlt] - -- See Note [Literal cases] - add_default ((LitAlt {}, bs, rhs) : alts) = (DEFAULT, bs, rhs) : alts - add_default alts = alts - -{- Note [Literal cases] -~~~~~~~~~~~~~~~~~~~~~~~ -If we have - case tagToEnum (a ># b) of - False -> e1 - True -> e2 - -then caseRules for TagToEnum will turn it into - case tagToEnum (a ># b) of - 0# -> e1 - 1# -> e2 - -Since the case is exhaustive (all cases are) we can convert it to - case tagToEnum (a ># b) of - DEFAULT -> e1 - 1# -> e2 - -This may generate sligthtly better code (although it should not, since -all cases are exhaustive) and/or optimise better. I'm not certain that -it's necessary, but currently we do make this change. We do it here, -NOT in the TagToEnum rules (see "Beware" in Note [caseRules for tagToEnum] -in PrelRules) --} - --------------------------------------------------- --- Catch-all --------------------------------------------------- -mkCase3 _dflags scrut bndr alts_ty alts - = return (Case scrut bndr alts_ty alts) - --- See Note [Exitification] and Note [Do not inline exit join points] in Exitify.hs --- This lives here (and not in Id) because occurrence info is only valid on --- InIds, so it's crucial that isExitJoinId is only called on freshly --- occ-analysed code. It's not a generic function you can call anywhere. -isExitJoinId :: Var -> Bool -isExitJoinId id - = isJoinId id - && isOneOcc (idOccInfo id) - && occ_in_lam (idOccInfo id) == IsInsideLam - -{- -Note [Dead binders] -~~~~~~~~~~~~~~~~~~~~ -Note that dead-ness is maintained by the simplifier, so that it is -accurate after simplification as well as before. - - -Note [Cascading case merge] -~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Case merging should cascade in one sweep, because it -happens bottom-up - - case e of a { - DEFAULT -> case a of b - DEFAULT -> case b of c { - DEFAULT -> e - A -> ea - B -> eb - C -> ec -==> - case e of a { - DEFAULT -> case a of b - DEFAULT -> let c = b in e - A -> let c = b in ea - B -> eb - C -> ec -==> - case e of a { - DEFAULT -> let b = a in let c = b in e - A -> let b = a in let c = b in ea - B -> let b = a in eb - C -> ec - - -However here's a tricky case that we still don't catch, and I don't -see how to catch it in one pass: - - case x of c1 { I# a1 -> - case a1 of c2 -> - 0 -> ... - DEFAULT -> case x of c3 { I# a2 -> - case a2 of ... - -After occurrence analysis (and its binder-swap) we get this - - case x of c1 { I# a1 -> - let x = c1 in -- Binder-swap addition - case a1 of c2 -> - 0 -> ... - DEFAULT -> case x of c3 { I# a2 -> - case a2 of ... - -When we simplify the inner case x, we'll see that -x=c1=I# a1. So we'll bind a2 to a1, and get - - case x of c1 { I# a1 -> - case a1 of c2 -> - 0 -> ... - DEFAULT -> case a1 of ... - -This is correct, but we can't do a case merge in this sweep -because c2 /= a1. Reason: the binding c1=I# a1 went inwards -without getting changed to c1=I# c2. - -I don't think this is worth fixing, even if I knew how. It'll -all come out in the next pass anyway. --} diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs deleted file mode 100644 index fc8c861480..0000000000 --- a/compiler/simplCore/Simplify.hs +++ /dev/null @@ -1,3666 +0,0 @@ -{- -(c) The AQUA Project, Glasgow University, 1993-1998 - -\section[Simplify]{The main module of the simplifier} --} - -{-# LANGUAGE CPP #-} - -{-# OPTIONS_GHC -Wno-incomplete-record-updates #-} -module Simplify ( simplTopBinds, simplExpr, simplRules ) where - -#include "HsVersions.h" - -import GhcPrelude - -import GHC.Driver.Session -import SimplMonad -import GHC.Core.Type hiding ( substTy, substTyVar, extendTvSubst, extendCvSubst ) -import SimplEnv -import SimplUtils -import OccurAnal ( occurAnalyseExpr ) -import GHC.Core.FamInstEnv ( FamInstEnv ) -import Literal ( litIsLifted ) --, mkLitInt ) -- temporalily commented out. See #8326 -import Id -import MkId ( seqId ) -import GHC.Core.Make ( FloatBind, mkImpossibleExpr, castBottomExpr ) -import qualified GHC.Core.Make -import IdInfo -import Name ( mkSystemVarName, isExternalName, getOccFS ) -import GHC.Core.Coercion hiding ( substCo, substCoVar ) -import GHC.Core.Coercion.Opt ( optCoercion ) -import GHC.Core.FamInstEnv ( topNormaliseType_maybe ) -import GHC.Core.DataCon - ( DataCon, dataConWorkId, dataConRepStrictness - , dataConRepArgTys, isUnboxedTupleCon - , StrictnessMark (..) ) -import CoreMonad ( Tick(..), SimplMode(..) ) -import GHC.Core -import Demand ( StrictSig(..), dmdTypeDepth, isStrictDmd - , mkClosedStrictSig, topDmd, botDiv ) -import Cpr ( mkCprSig, botCpr ) -import GHC.Core.Ppr ( pprCoreExpr ) -import GHC.Core.Unfold -import GHC.Core.Utils -import GHC.Core.SimpleOpt ( pushCoTyArg, pushCoValArg - , joinPointBinding_maybe, joinPointBindings_maybe ) -import GHC.Core.Rules ( mkRuleInfo, lookupRule, getRules ) -import BasicTypes ( TopLevelFlag(..), isNotTopLevel, isTopLevel, - RecFlag(..), Arity ) -import MonadUtils ( mapAccumLM, liftIO ) -import Var ( isTyCoVar ) -import Maybes ( orElse ) -import Control.Monad -import Outputable -import FastString -import Util -import ErrUtils -import Module ( moduleName, pprModuleName ) -import PrimOp ( PrimOp (SeqOp) ) - - -{- -The guts of the simplifier is in this module, but the driver loop for -the simplifier is in SimplCore.hs. - -Note [The big picture] -~~~~~~~~~~~~~~~~~~~~~~ -The general shape of the simplifier is this: - - simplExpr :: SimplEnv -> InExpr -> SimplCont -> SimplM (SimplFloats, OutExpr) - simplBind :: SimplEnv -> InBind -> SimplM (SimplFloats, SimplEnv) - - * SimplEnv contains - - Simplifier mode (which includes DynFlags for convenience) - - Ambient substitution - - InScopeSet - - * SimplFloats contains - - Let-floats (which includes ok-for-spec case-floats) - - Join floats - - InScopeSet (including all the floats) - - * Expressions - simplExpr :: SimplEnv -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) - The result of simplifying an /expression/ is (floats, expr) - - A bunch of floats (let bindings, join bindings) - - A simplified expression. - The overall result is effectively (let floats in expr) - - * Bindings - simplBind :: SimplEnv -> InBind -> SimplM (SimplFloats, SimplEnv) - The result of simplifying a binding is - - A bunch of floats, the last of which is the simplified binding - There may be auxiliary bindings too; see prepareRhs - - An environment suitable for simplifying the scope of the binding - - The floats may also be empty, if the binding is inlined unconditionally; - in that case the returned SimplEnv will have an augmented substitution. - - The returned floats and env both have an in-scope set, and they are - guaranteed to be the same. - - -Note [Shadowing] -~~~~~~~~~~~~~~~~ -The simplifier used to guarantee that the output had no shadowing, but -it does not do so any more. (Actually, it never did!) The reason is -documented with simplifyArgs. - - -Eta expansion -~~~~~~~~~~~~~~ -For eta expansion, we want to catch things like - - case e of (a,b) -> \x -> case a of (p,q) -> \y -> r - -If the \x was on the RHS of a let, we'd eta expand to bring the two -lambdas together. And in general that's a good thing to do. Perhaps -we should eta expand wherever we find a (value) lambda? Then the eta -expansion at a let RHS can concentrate solely on the PAP case. - -Note [In-scope set as a substitution] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -As per Note [Lookups in in-scope set], an in-scope set can act as -a substitution. Specifically, it acts as a substitution from variable to -variables /with the same unique/. - -Why do we need this? Well, during the course of the simplifier, we may want to -adjust inessential properties of a variable. For instance, when performing a -beta-reduction, we change - - (\x. e) u ==> let x = u in e - -We typically want to add an unfolding to `x` so that it inlines to (the -simplification of) `u`. - -We do that by adding the unfolding to the binder `x`, which is added to the -in-scope set. When simplifying occurrences of `x` (every occurrence!), they are -replaced by their “updated” version from the in-scope set, hence inherit the -unfolding. This happens in `SimplEnv.substId`. - -Another example. Consider - - case x of y { Node a b -> ...y... - ; Leaf v -> ...y... } - -In the Node branch want y's unfolding to be (Node a b); in the Leaf branch we -want y's unfolding to be (Leaf v). We achieve this by adding the appropriate -unfolding to y, and re-adding it to the in-scope set. See the calls to -`addBinderUnfolding` in `Simplify.addAltUnfoldings` and elsewhere. - -It's quite convenient. This way we don't need to manipulate the substitution all -the time: every update to a binder is automatically reflected to its bound -occurrences. - -************************************************************************ -* * -\subsection{Bindings} -* * -************************************************************************ --} - -simplTopBinds :: SimplEnv -> [InBind] -> SimplM (SimplFloats, SimplEnv) --- See Note [The big picture] -simplTopBinds env0 binds0 - = do { -- Put all the top-level binders into scope at the start - -- so that if a transformation rule has unexpectedly brought - -- anything into scope, then we don't get a complaint about that. - -- It's rather as if the top-level binders were imported. - -- See note [Glomming] in OccurAnal. - ; env1 <- {-#SCC "simplTopBinds-simplRecBndrs" #-} simplRecBndrs env0 (bindersOfBinds binds0) - ; (floats, env2) <- {-#SCC "simplTopBinds-simpl_binds" #-} simpl_binds env1 binds0 - ; freeTick SimplifierDone - ; return (floats, env2) } - where - -- We need to track the zapped top-level binders, because - -- they should have their fragile IdInfo zapped (notably occurrence info) - -- That's why we run down binds and bndrs' simultaneously. - -- - simpl_binds :: SimplEnv -> [InBind] -> SimplM (SimplFloats, SimplEnv) - simpl_binds env [] = return (emptyFloats env, env) - simpl_binds env (bind:binds) = do { (float, env1) <- simpl_bind env bind - ; (floats, env2) <- simpl_binds env1 binds - ; return (float `addFloats` floats, env2) } - - simpl_bind env (Rec pairs) - = simplRecBind env TopLevel Nothing pairs - simpl_bind env (NonRec b r) - = do { (env', b') <- addBndrRules env b (lookupRecBndr env b) Nothing - ; simplRecOrTopPair env' TopLevel NonRecursive Nothing b b' r } - -{- -************************************************************************ -* * - Lazy bindings -* * -************************************************************************ - -simplRecBind is used for - * recursive bindings only --} - -simplRecBind :: SimplEnv -> TopLevelFlag -> MaybeJoinCont - -> [(InId, InExpr)] - -> SimplM (SimplFloats, SimplEnv) -simplRecBind env0 top_lvl mb_cont pairs0 - = do { (env_with_info, triples) <- mapAccumLM add_rules env0 pairs0 - ; (rec_floats, env1) <- go env_with_info triples - ; return (mkRecFloats rec_floats, env1) } - where - add_rules :: SimplEnv -> (InBndr,InExpr) -> SimplM (SimplEnv, (InBndr, OutBndr, InExpr)) - -- Add the (substituted) rules to the binder - add_rules env (bndr, rhs) - = do { (env', bndr') <- addBndrRules env bndr (lookupRecBndr env bndr) mb_cont - ; return (env', (bndr, bndr', rhs)) } - - go env [] = return (emptyFloats env, env) - - go env ((old_bndr, new_bndr, rhs) : pairs) - = do { (float, env1) <- simplRecOrTopPair env top_lvl Recursive mb_cont - old_bndr new_bndr rhs - ; (floats, env2) <- go env1 pairs - ; return (float `addFloats` floats, env2) } - -{- -simplOrTopPair is used for - * recursive bindings (whether top level or not) - * top-level non-recursive bindings - -It assumes the binder has already been simplified, but not its IdInfo. --} - -simplRecOrTopPair :: SimplEnv - -> TopLevelFlag -> RecFlag -> MaybeJoinCont - -> InId -> OutBndr -> InExpr -- Binder and rhs - -> SimplM (SimplFloats, SimplEnv) - -simplRecOrTopPair env top_lvl is_rec mb_cont old_bndr new_bndr rhs - | Just env' <- preInlineUnconditionally env top_lvl old_bndr rhs env - = {-#SCC "simplRecOrTopPair-pre-inline-uncond" #-} - trace_bind "pre-inline-uncond" $ - do { tick (PreInlineUnconditionally old_bndr) - ; return ( emptyFloats env, env' ) } - - | Just cont <- mb_cont - = {-#SCC "simplRecOrTopPair-join" #-} - ASSERT( isNotTopLevel top_lvl && isJoinId new_bndr ) - trace_bind "join" $ - simplJoinBind env cont old_bndr new_bndr rhs env - - | otherwise - = {-#SCC "simplRecOrTopPair-normal" #-} - trace_bind "normal" $ - simplLazyBind env top_lvl is_rec old_bndr new_bndr rhs env - - where - dflags = seDynFlags env - - -- trace_bind emits a trace for each top-level binding, which - -- helps to locate the tracing for inlining and rule firing - trace_bind what thing_inside - | not (dopt Opt_D_verbose_core2core dflags) - = thing_inside - | otherwise - = traceAction dflags ("SimplBind " ++ what) - (ppr old_bndr) thing_inside - --------------------------- -simplLazyBind :: SimplEnv - -> TopLevelFlag -> RecFlag - -> InId -> OutId -- Binder, both pre-and post simpl - -- Not a JoinId - -- The OutId has IdInfo, except arity, unfolding - -- Ids only, no TyVars - -> InExpr -> SimplEnv -- The RHS and its environment - -> SimplM (SimplFloats, SimplEnv) --- Precondition: not a JoinId --- Precondition: rhs obeys the let/app invariant --- NOT used for JoinIds -simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se - = ASSERT( isId bndr ) - ASSERT2( not (isJoinId bndr), ppr bndr ) - -- pprTrace "simplLazyBind" ((ppr bndr <+> ppr bndr1) $$ ppr rhs $$ ppr (seIdSubst rhs_se)) $ - do { let rhs_env = rhs_se `setInScopeFromE` env - (tvs, body) = case collectTyAndValBinders rhs of - (tvs, [], body) - | surely_not_lam body -> (tvs, body) - _ -> ([], rhs) - - surely_not_lam (Lam {}) = False - surely_not_lam (Tick t e) - | not (tickishFloatable t) = surely_not_lam e - -- eta-reduction could float - surely_not_lam _ = True - -- Do not do the "abstract tyvar" thing if there's - -- a lambda inside, because it defeats eta-reduction - -- f = /\a. \x. g a x - -- should eta-reduce. - - - ; (body_env, tvs') <- {-#SCC "simplBinders" #-} simplBinders rhs_env tvs - -- See Note [Floating and type abstraction] in SimplUtils - - -- Simplify the RHS - ; let rhs_cont = mkRhsStop (substTy body_env (exprType body)) - ; (body_floats0, body0) <- {-#SCC "simplExprF" #-} simplExprF body_env body rhs_cont - - -- Never float join-floats out of a non-join let-binding - -- So wrap the body in the join-floats right now - -- Hence: body_floats1 consists only of let-floats - ; let (body_floats1, body1) = wrapJoinFloatsX body_floats0 body0 - - -- ANF-ise a constructor or PAP rhs - -- We get at most one float per argument here - ; (let_floats, body2) <- {-#SCC "prepareRhs" #-} prepareRhs (getMode env) top_lvl - (getOccFS bndr1) (idInfo bndr1) body1 - ; let body_floats2 = body_floats1 `addLetFloats` let_floats - - ; (rhs_floats, rhs') - <- if not (doFloatFromRhs top_lvl is_rec False body_floats2 body2) - then -- No floating, revert to body1 - {-#SCC "simplLazyBind-no-floating" #-} - do { rhs' <- mkLam env tvs' (wrapFloats body_floats2 body1) rhs_cont - ; return (emptyFloats env, rhs') } - - else if null tvs then -- Simple floating - {-#SCC "simplLazyBind-simple-floating" #-} - do { tick LetFloatFromLet - ; return (body_floats2, body2) } - - else -- Do type-abstraction first - {-#SCC "simplLazyBind-type-abstraction-first" #-} - do { tick LetFloatFromLet - ; (poly_binds, body3) <- abstractFloats (seDynFlags env) top_lvl - tvs' body_floats2 body2 - ; let floats = foldl' extendFloats (emptyFloats env) poly_binds - ; rhs' <- mkLam env tvs' body3 rhs_cont - ; return (floats, rhs') } - - ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats) - top_lvl Nothing bndr bndr1 rhs' - ; return (rhs_floats `addFloats` bind_float, env2) } - --------------------------- -simplJoinBind :: SimplEnv - -> SimplCont - -> InId -> OutId -- Binder, both pre-and post simpl - -- The OutId has IdInfo, except arity, - -- unfolding - -> InExpr -> SimplEnv -- The right hand side and its env - -> SimplM (SimplFloats, SimplEnv) -simplJoinBind env cont old_bndr new_bndr rhs rhs_se - = do { let rhs_env = rhs_se `setInScopeFromE` env - ; rhs' <- simplJoinRhs rhs_env old_bndr rhs cont - ; completeBind env NotTopLevel (Just cont) old_bndr new_bndr rhs' } - --------------------------- -simplNonRecX :: SimplEnv - -> InId -- Old binder; not a JoinId - -> OutExpr -- Simplified RHS - -> SimplM (SimplFloats, SimplEnv) --- A specialised variant of simplNonRec used when the RHS is already --- simplified, notably in knownCon. It uses case-binding where necessary. --- --- Precondition: rhs satisfies the let/app invariant - -simplNonRecX env bndr new_rhs - | ASSERT2( not (isJoinId bndr), ppr bndr ) - isDeadBinder bndr -- Not uncommon; e.g. case (a,b) of c { (p,q) -> p } - = return (emptyFloats env, env) -- Here c is dead, and we avoid - -- creating the binding c = (a,b) - - | Coercion co <- new_rhs - = return (emptyFloats env, extendCvSubst env bndr co) - - | otherwise - = do { (env', bndr') <- simplBinder env bndr - ; completeNonRecX NotTopLevel env' (isStrictId bndr) bndr bndr' new_rhs } - -- simplNonRecX is only used for NotTopLevel things - --------------------------- -completeNonRecX :: TopLevelFlag -> SimplEnv - -> Bool - -> InId -- Old binder; not a JoinId - -> OutId -- New binder - -> OutExpr -- Simplified RHS - -> SimplM (SimplFloats, SimplEnv) -- The new binding is in the floats --- Precondition: rhs satisfies the let/app invariant --- See Note [Core let/app invariant] in GHC.Core - -completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs - = ASSERT2( not (isJoinId new_bndr), ppr new_bndr ) - do { (prepd_floats, rhs1) <- prepareRhs (getMode env) top_lvl (getOccFS new_bndr) - (idInfo new_bndr) new_rhs - ; let floats = emptyFloats env `addLetFloats` prepd_floats - ; (rhs_floats, rhs2) <- - if doFloatFromRhs NotTopLevel NonRecursive is_strict floats rhs1 - then -- Add the floats to the main env - do { tick LetFloatFromLet - ; return (floats, rhs1) } - else -- Do not float; wrap the floats around the RHS - return (emptyFloats env, wrapFloats floats rhs1) - - ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats) - NotTopLevel Nothing - old_bndr new_bndr rhs2 - ; return (rhs_floats `addFloats` bind_float, env2) } - - -{- ********************************************************************* -* * - prepareRhs, makeTrivial -* * -************************************************************************ - -Note [prepareRhs] -~~~~~~~~~~~~~~~~~ -prepareRhs takes a putative RHS, checks whether it's a PAP or -constructor application and, if so, converts it to ANF, so that the -resulting thing can be inlined more easily. Thus - x = (f a, g b) -becomes - t1 = f a - t2 = g b - x = (t1,t2) - -We also want to deal well cases like this - v = (f e1 `cast` co) e2 -Here we want to make e1,e2 trivial and get - x1 = e1; x2 = e2; v = (f x1 `cast` co) v2 -That's what the 'go' loop in prepareRhs does --} - -prepareRhs :: SimplMode -> TopLevelFlag - -> FastString -- Base for any new variables - -> IdInfo -- IdInfo for the LHS of this binding - -> OutExpr - -> SimplM (LetFloats, OutExpr) --- Transforms a RHS into a better RHS by adding floats --- e.g x = Just e --- becomes a = e --- x = Just a --- See Note [prepareRhs] -prepareRhs mode top_lvl occ info (Cast rhs co) -- Note [Float coercions] - | let ty1 = coercionLKind co -- Do *not* do this if rhs has an unlifted type - , not (isUnliftedType ty1) -- see Note [Float coercions (unlifted)] - = do { (floats, rhs') <- makeTrivialWithInfo mode top_lvl occ sanitised_info rhs - ; return (floats, Cast rhs' co) } - where - sanitised_info = vanillaIdInfo `setStrictnessInfo` strictnessInfo info - `setCprInfo` cprInfo info - `setDemandInfo` demandInfo info - -prepareRhs mode top_lvl occ _ rhs0 - = do { (_is_exp, floats, rhs1) <- go 0 rhs0 - ; return (floats, rhs1) } - where - go :: Int -> OutExpr -> SimplM (Bool, LetFloats, OutExpr) - go n_val_args (Cast rhs co) - = do { (is_exp, floats, rhs') <- go n_val_args rhs - ; return (is_exp, floats, Cast rhs' co) } - go n_val_args (App fun (Type ty)) - = do { (is_exp, floats, rhs') <- go n_val_args fun - ; return (is_exp, floats, App rhs' (Type ty)) } - go n_val_args (App fun arg) - = do { (is_exp, floats1, fun') <- go (n_val_args+1) fun - ; case is_exp of - False -> return (False, emptyLetFloats, App fun arg) - True -> do { (floats2, arg') <- makeTrivial mode top_lvl occ arg - ; return (True, floats1 `addLetFlts` floats2, App fun' arg') } } - go n_val_args (Var fun) - = return (is_exp, emptyLetFloats, Var fun) - where - is_exp = isExpandableApp fun n_val_args -- The fun a constructor or PAP - -- See Note [CONLIKE pragma] in BasicTypes - -- The definition of is_exp should match that in - -- OccurAnal.occAnalApp - - go n_val_args (Tick t rhs) - -- We want to be able to float bindings past this - -- tick. Non-scoping ticks don't care. - | tickishScoped t == NoScope - = do { (is_exp, floats, rhs') <- go n_val_args rhs - ; return (is_exp, floats, Tick t rhs') } - - -- On the other hand, for scoping ticks we need to be able to - -- copy them on the floats, which in turn is only allowed if - -- we can obtain non-counting ticks. - | (not (tickishCounts t) || tickishCanSplit t) - = do { (is_exp, floats, rhs') <- go n_val_args rhs - ; let tickIt (id, expr) = (id, mkTick (mkNoCount t) expr) - floats' = mapLetFloats floats tickIt - ; return (is_exp, floats', Tick t rhs') } - - go _ other - = return (False, emptyLetFloats, other) - -{- -Note [Float coercions] -~~~~~~~~~~~~~~~~~~~~~~ -When we find the binding - x = e `cast` co -we'd like to transform it to - x' = e - x = x `cast` co -- A trivial binding -There's a chance that e will be a constructor application or function, or something -like that, so moving the coercion to the usage site may well cancel the coercions -and lead to further optimisation. Example: - - data family T a :: * - data instance T Int = T Int - - foo :: Int -> Int -> Int - foo m n = ... - where - x = T m - go 0 = 0 - go n = case x of { T m -> go (n-m) } - -- This case should optimise - -Note [Preserve strictness when floating coercions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In the Note [Float coercions] transformation, keep the strictness info. -Eg - f = e `cast` co -- f has strictness SSL -When we transform to - f' = e -- f' also has strictness SSL - f = f' `cast` co -- f still has strictness SSL - -Its not wrong to drop it on the floor, but better to keep it. - -Note [Float coercions (unlifted)] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -BUT don't do [Float coercions] if 'e' has an unlifted type. -This *can* happen: - - foo :: Int = (error (# Int,Int #) "urk") - `cast` CoUnsafe (# Int,Int #) Int - -If do the makeTrivial thing to the error call, we'll get - foo = case error (# Int,Int #) "urk" of v -> v `cast` ... -But 'v' isn't in scope! - -These strange casts can happen as a result of case-of-case - bar = case (case x of { T -> (# 2,3 #); F -> error "urk" }) of - (# p,q #) -> p+q --} - -makeTrivialArg :: SimplMode -> ArgSpec -> SimplM (LetFloats, ArgSpec) -makeTrivialArg mode (ValArg e) - = do { (floats, e') <- makeTrivial mode NotTopLevel (fsLit "arg") e - ; return (floats, ValArg e') } -makeTrivialArg _ arg - = return (emptyLetFloats, arg) -- CastBy, TyArg - -makeTrivial :: SimplMode -> TopLevelFlag - -> FastString -- ^ A "friendly name" to build the new binder from - -> OutExpr -- ^ This expression satisfies the let/app invariant - -> SimplM (LetFloats, OutExpr) --- Binds the expression to a variable, if it's not trivial, returning the variable -makeTrivial mode top_lvl context expr - = makeTrivialWithInfo mode top_lvl context vanillaIdInfo expr - -makeTrivialWithInfo :: SimplMode -> TopLevelFlag - -> FastString -- ^ a "friendly name" to build the new binder from - -> IdInfo - -> OutExpr -- ^ This expression satisfies the let/app invariant - -> SimplM (LetFloats, OutExpr) --- Propagate strictness and demand info to the new binder --- Note [Preserve strictness when floating coercions] --- Returned SimplEnv has same substitution as incoming one -makeTrivialWithInfo mode top_lvl occ_fs info expr - | exprIsTrivial expr -- Already trivial - || not (bindingOk top_lvl expr expr_ty) -- Cannot trivialise - -- See Note [Cannot trivialise] - = return (emptyLetFloats, expr) - - | otherwise - = do { (floats, expr1) <- prepareRhs mode top_lvl occ_fs info expr - ; if exprIsTrivial expr1 -- See Note [Trivial after prepareRhs] - then return (floats, expr1) - else do - { uniq <- getUniqueM - ; let name = mkSystemVarName uniq occ_fs - var = mkLocalIdWithInfo name expr_ty info - - -- Now something very like completeBind, - -- but without the postInlineUnconditionally part - ; (arity, is_bot, expr2) <- tryEtaExpandRhs mode var expr1 - ; unf <- mkLetUnfolding (sm_dflags mode) top_lvl InlineRhs var expr2 - - ; let final_id = addLetBndrInfo var arity is_bot unf - bind = NonRec final_id expr2 - - ; return ( floats `addLetFlts` unitLetFloat bind, Var final_id ) }} - where - expr_ty = exprType expr - -bindingOk :: TopLevelFlag -> CoreExpr -> Type -> Bool --- True iff we can have a binding of this expression at this level --- Precondition: the type is the type of the expression -bindingOk top_lvl expr expr_ty - | isTopLevel top_lvl = exprIsTopLevelBindable expr expr_ty - | otherwise = True - -{- Note [Trivial after prepareRhs] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we call makeTrival on (e |> co), the recursive use of prepareRhs -may leave us with - { a1 = e } and (a1 |> co) -Now the latter is trivial, so we don't want to let-bind it. - -Note [Cannot trivialise] -~~~~~~~~~~~~~~~~~~~~~~~~ -Consider: - f :: Int -> Addr# - - foo :: Bar - foo = Bar (f 3) - -Then we can't ANF-ise foo, even though we'd like to, because -we can't make a top-level binding for the Addr# (f 3). And if -so we don't want to turn it into - foo = let x = f 3 in Bar x -because we'll just end up inlining x back, and that makes the -simplifier loop. Better not to ANF-ise it at all. - -Literal strings are an exception. - - foo = Ptr "blob"# - -We want to turn this into: - - foo1 = "blob"# - foo = Ptr foo1 - -See Note [Core top-level string literals] in GHC.Core. - -************************************************************************ -* * - Completing a lazy binding -* * -************************************************************************ - -completeBind - * deals only with Ids, not TyVars - * takes an already-simplified binder and RHS - * is used for both recursive and non-recursive bindings - * is used for both top-level and non-top-level bindings - -It does the following: - - tries discarding a dead binding - - tries PostInlineUnconditionally - - add unfolding [this is the only place we add an unfolding] - - add arity - -It does *not* attempt to do let-to-case. Why? Because it is used for - - top-level bindings (when let-to-case is impossible) - - many situations where the "rhs" is known to be a WHNF - (so let-to-case is inappropriate). - -Nor does it do the atomic-argument thing --} - -completeBind :: SimplEnv - -> TopLevelFlag -- Flag stuck into unfolding - -> MaybeJoinCont -- Required only for join point - -> InId -- Old binder - -> OutId -> OutExpr -- New binder and RHS - -> SimplM (SimplFloats, SimplEnv) --- completeBind may choose to do its work --- * by extending the substitution (e.g. let x = y in ...) --- * or by adding to the floats in the envt --- --- Binder /can/ be a JoinId --- Precondition: rhs obeys the let/app invariant -completeBind env top_lvl mb_cont old_bndr new_bndr new_rhs - | isCoVar old_bndr - = case new_rhs of - Coercion co -> return (emptyFloats env, extendCvSubst env old_bndr co) - _ -> return (mkFloatBind env (NonRec new_bndr new_rhs)) - - | otherwise - = ASSERT( isId new_bndr ) - do { let old_info = idInfo old_bndr - old_unf = unfoldingInfo old_info - occ_info = occInfo old_info - - -- Do eta-expansion on the RHS of the binding - -- See Note [Eta-expanding at let bindings] in SimplUtils - ; (new_arity, is_bot, final_rhs) <- tryEtaExpandRhs (getMode env) - new_bndr new_rhs - - -- Simplify the unfolding - ; new_unfolding <- simplLetUnfolding env top_lvl mb_cont old_bndr - final_rhs (idType new_bndr) old_unf - - ; let final_bndr = addLetBndrInfo new_bndr new_arity is_bot new_unfolding - -- See Note [In-scope set as a substitution] - - ; if postInlineUnconditionally env top_lvl final_bndr occ_info final_rhs - - then -- Inline and discard the binding - do { tick (PostInlineUnconditionally old_bndr) - ; return ( emptyFloats env - , extendIdSubst env old_bndr $ - DoneEx final_rhs (isJoinId_maybe new_bndr)) } - -- Use the substitution to make quite, quite sure that the - -- substitution will happen, since we are going to discard the binding - - else -- Keep the binding - -- pprTrace "Binding" (ppr final_bndr <+> ppr new_unfolding) $ - return (mkFloatBind env (NonRec final_bndr final_rhs)) } - -addLetBndrInfo :: OutId -> Arity -> Bool -> Unfolding -> OutId -addLetBndrInfo new_bndr new_arity is_bot new_unf - = new_bndr `setIdInfo` info5 - where - info1 = idInfo new_bndr `setArityInfo` new_arity - - -- Unfolding info: Note [Setting the new unfolding] - info2 = info1 `setUnfoldingInfo` new_unf - - -- Demand info: Note [Setting the demand info] - -- We also have to nuke demand info if for some reason - -- eta-expansion *reduces* the arity of the binding to less - -- than that of the strictness sig. This can happen: see Note [Arity decrease]. - info3 | isEvaldUnfolding new_unf - || (case strictnessInfo info2 of - StrictSig dmd_ty -> new_arity < dmdTypeDepth dmd_ty) - = zapDemandInfo info2 `orElse` info2 - | otherwise - = info2 - - -- Bottoming bindings: see Note [Bottoming bindings] - info4 | is_bot = info3 - `setStrictnessInfo` - mkClosedStrictSig (replicate new_arity topDmd) botDiv - `setCprInfo` mkCprSig new_arity botCpr - | otherwise = info3 - - -- Zap call arity info. We have used it by now (via - -- `tryEtaExpandRhs`), and the simplifier can invalidate this - -- information, leading to broken code later (e.g. #13479) - info5 = zapCallArityInfo info4 - - -{- Note [Arity decrease] -~~~~~~~~~~~~~~~~~~~~~~~~ -Generally speaking the arity of a binding should not decrease. But it *can* -legitimately happen because of RULES. Eg - f = g Int -where g has arity 2, will have arity 2. But if there's a rewrite rule - g Int --> h -where h has arity 1, then f's arity will decrease. Here's a real-life example, -which is in the output of Specialise: - - Rec { - $dm {Arity 2} = \d.\x. op d - {-# RULES forall d. $dm Int d = $s$dm #-} - - dInt = MkD .... opInt ... - opInt {Arity 1} = $dm dInt - - $s$dm {Arity 0} = \x. op dInt } - -Here opInt has arity 1; but when we apply the rule its arity drops to 0. -That's why Specialise goes to a little trouble to pin the right arity -on specialised functions too. - -Note [Bottoming bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~ -Suppose we have - let x = error "urk" - in ...(case x of <alts>)... -or - let f = \x. error (x ++ "urk") - in ...(case f "foo" of <alts>)... - -Then we'd like to drop the dead <alts> immediately. So it's good to -propagate the info that x's RHS is bottom to x's IdInfo as rapidly as -possible. - -We use tryEtaExpandRhs on every binding, and it turns ou that the -arity computation it performs (via GHC.Core.Arity.findRhsArity) already -does a simple bottoming-expression analysis. So all we need to do -is propagate that info to the binder's IdInfo. - -This showed up in #12150; see comment:16. - -Note [Setting the demand info] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If the unfolding is a value, the demand info may -go pear-shaped, so we nuke it. Example: - let x = (a,b) in - case x of (p,q) -> h p q x -Here x is certainly demanded. But after we've nuked -the case, we'll get just - let x = (a,b) in h a b x -and now x is not demanded (I'm assuming h is lazy) -This really happens. Similarly - let f = \x -> e in ...f..f... -After inlining f at some of its call sites the original binding may -(for example) be no longer strictly demanded. -The solution here is a bit ad hoc... - - -************************************************************************ -* * -\subsection[Simplify-simplExpr]{The main function: simplExpr} -* * -************************************************************************ - -The reason for this OutExprStuff stuff is that we want to float *after* -simplifying a RHS, not before. If we do so naively we get quadratic -behaviour as things float out. - -To see why it's important to do it after, consider this (real) example: - - let t = f x - in fst t -==> - let t = let a = e1 - b = e2 - in (a,b) - in fst t -==> - let a = e1 - b = e2 - t = (a,b) - in - a -- Can't inline a this round, cos it appears twice -==> - e1 - -Each of the ==> steps is a round of simplification. We'd save a -whole round if we float first. This can cascade. Consider - - let f = g d - in \x -> ...f... -==> - let f = let d1 = ..d.. in \y -> e - in \x -> ...f... -==> - let d1 = ..d.. - in \x -> ...(\y ->e)... - -Only in this second round can the \y be applied, and it -might do the same again. --} - -simplExpr :: SimplEnv -> CoreExpr -> SimplM CoreExpr -simplExpr env (Type ty) - = do { ty' <- simplType env ty -- See Note [Avoiding space leaks in OutType] - ; return (Type ty') } - -simplExpr env expr - = simplExprC env expr (mkBoringStop expr_out_ty) - where - expr_out_ty :: OutType - expr_out_ty = substTy env (exprType expr) - -- NB: Since 'expr' is term-valued, not (Type ty), this call - -- to exprType will succeed. exprType fails on (Type ty). - -simplExprC :: SimplEnv - -> InExpr -- A term-valued expression, never (Type ty) - -> SimplCont - -> SimplM OutExpr - -- Simplify an expression, given a continuation -simplExprC env expr cont - = -- pprTrace "simplExprC" (ppr expr $$ ppr cont {- $$ ppr (seIdSubst env) -} $$ ppr (seLetFloats env) ) $ - do { (floats, expr') <- simplExprF env expr cont - ; -- pprTrace "simplExprC ret" (ppr expr $$ ppr expr') $ - -- pprTrace "simplExprC ret3" (ppr (seInScope env')) $ - -- pprTrace "simplExprC ret4" (ppr (seLetFloats env')) $ - return (wrapFloats floats expr') } - --------------------------------------------------- -simplExprF :: SimplEnv - -> InExpr -- A term-valued expression, never (Type ty) - -> SimplCont - -> SimplM (SimplFloats, OutExpr) - -simplExprF env e cont - = {- pprTrace "simplExprF" (vcat - [ ppr e - , text "cont =" <+> ppr cont - , text "inscope =" <+> ppr (seInScope env) - , text "tvsubst =" <+> ppr (seTvSubst env) - , text "idsubst =" <+> ppr (seIdSubst env) - , text "cvsubst =" <+> ppr (seCvSubst env) - ]) $ -} - simplExprF1 env e cont - -simplExprF1 :: SimplEnv -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) - -simplExprF1 _ (Type ty) _ - = pprPanic "simplExprF: type" (ppr ty) - -- simplExprF does only with term-valued expressions - -- The (Type ty) case is handled separately by simplExpr - -- and by the other callers of simplExprF - -simplExprF1 env (Var v) cont = {-#SCC "simplIdF" #-} simplIdF env v cont -simplExprF1 env (Lit lit) cont = {-#SCC "rebuild" #-} rebuild env (Lit lit) cont -simplExprF1 env (Tick t expr) cont = {-#SCC "simplTick" #-} simplTick env t expr cont -simplExprF1 env (Cast body co) cont = {-#SCC "simplCast" #-} simplCast env body co cont -simplExprF1 env (Coercion co) cont = {-#SCC "simplCoercionF" #-} simplCoercionF env co cont - -simplExprF1 env (App fun arg) cont - = {-#SCC "simplExprF1-App" #-} case arg of - Type ty -> do { -- The argument type will (almost) certainly be used - -- in the output program, so just force it now. - -- See Note [Avoiding space leaks in OutType] - arg' <- simplType env ty - - -- But use substTy, not simplType, to avoid forcing - -- the hole type; it will likely not be needed. - -- See Note [The hole type in ApplyToTy] - ; let hole' = substTy env (exprType fun) - - ; simplExprF env fun $ - ApplyToTy { sc_arg_ty = arg' - , sc_hole_ty = hole' - , sc_cont = cont } } - _ -> simplExprF env fun $ - ApplyToVal { sc_arg = arg, sc_env = env - , sc_dup = NoDup, sc_cont = cont } - -simplExprF1 env expr@(Lam {}) cont - = {-#SCC "simplExprF1-Lam" #-} - simplLam env zapped_bndrs body cont - -- The main issue here is under-saturated lambdas - -- (\x1. \x2. e) arg1 - -- Here x1 might have "occurs-once" occ-info, because occ-info - -- is computed assuming that a group of lambdas is applied - -- all at once. If there are too few args, we must zap the - -- occ-info, UNLESS the remaining binders are one-shot - where - (bndrs, body) = collectBinders expr - zapped_bndrs | need_to_zap = map zap bndrs - | otherwise = bndrs - - need_to_zap = any zappable_bndr (drop n_args bndrs) - n_args = countArgs cont - -- NB: countArgs counts all the args (incl type args) - -- and likewise drop counts all binders (incl type lambdas) - - zappable_bndr b = isId b && not (isOneShotBndr b) - zap b | isTyVar b = b - | otherwise = zapLamIdInfo b - -simplExprF1 env (Case scrut bndr _ alts) cont - = {-#SCC "simplExprF1-Case" #-} - simplExprF env scrut (Select { sc_dup = NoDup, sc_bndr = bndr - , sc_alts = alts - , sc_env = env, sc_cont = cont }) - -simplExprF1 env (Let (Rec pairs) body) cont - | Just pairs' <- joinPointBindings_maybe pairs - = {-#SCC "simplRecJoinPoin" #-} simplRecJoinPoint env pairs' body cont - - | otherwise - = {-#SCC "simplRecE" #-} simplRecE env pairs body cont - -simplExprF1 env (Let (NonRec bndr rhs) body) cont - | Type ty <- rhs -- First deal with type lets (let a = Type ty in e) - = {-#SCC "simplExprF1-NonRecLet-Type" #-} - ASSERT( isTyVar bndr ) - do { ty' <- simplType env ty - ; simplExprF (extendTvSubst env bndr ty') body cont } - - | Just (bndr', rhs') <- joinPointBinding_maybe bndr rhs - = {-#SCC "simplNonRecJoinPoint" #-} simplNonRecJoinPoint env bndr' rhs' body cont - - | otherwise - = {-#SCC "simplNonRecE" #-} simplNonRecE env bndr (rhs, env) ([], body) cont - -{- Note [Avoiding space leaks in OutType] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Since the simplifier is run for multiple iterations, we need to ensure -that any thunks in the output of one simplifier iteration are forced -by the evaluation of the next simplifier iteration. Otherwise we may -retain multiple copies of the Core program and leak a terrible amount -of memory (as in #13426). - -The simplifier is naturally strict in the entire "Expr part" of the -input Core program, because any expression may contain binders, which -we must find in order to extend the SimplEnv accordingly. But types -do not contain binders and so it is tempting to write things like - - simplExpr env (Type ty) = return (Type (substTy env ty)) -- Bad! - -This is Bad because the result includes a thunk (substTy env ty) which -retains a reference to the whole simplifier environment; and the next -simplifier iteration will not force this thunk either, because the -line above is not strict in ty. - -So instead our strategy is for the simplifier to fully evaluate -OutTypes when it emits them into the output Core program, for example - - simplExpr env (Type ty) = do { ty' <- simplType env ty -- Good - ; return (Type ty') } - -where the only difference from above is that simplType calls seqType -on the result of substTy. - -However, SimplCont can also contain OutTypes and it's not necessarily -a good idea to force types on the way in to SimplCont, because they -may end up not being used and forcing them could be a lot of wasted -work. T5631 is a good example of this. - -- For ApplyToTy's sc_arg_ty, we force the type on the way in because - the type will almost certainly appear as a type argument in the - output program. - -- For the hole types in Stop and ApplyToTy, we force the type when we - emit it into the output program, after obtaining it from - contResultType. (The hole type in ApplyToTy is only directly used - to form the result type in a new Stop continuation.) --} - ---------------------------------- --- Simplify a join point, adding the context. --- Context goes *inside* the lambdas. IOW, if the join point has arity n, we do: --- \x1 .. xn -> e => \x1 .. xn -> E[e] --- Note that we need the arity of the join point, since e may be a lambda --- (though this is unlikely). See Note [Join points and case-of-case]. -simplJoinRhs :: SimplEnv -> InId -> InExpr -> SimplCont - -> SimplM OutExpr -simplJoinRhs env bndr expr cont - | Just arity <- isJoinId_maybe bndr - = do { let (join_bndrs, join_body) = collectNBinders arity expr - ; (env', join_bndrs') <- simplLamBndrs env join_bndrs - ; join_body' <- simplExprC env' join_body cont - ; return $ mkLams join_bndrs' join_body' } - - | otherwise - = pprPanic "simplJoinRhs" (ppr bndr) - ---------------------------------- -simplType :: SimplEnv -> InType -> SimplM OutType - -- Kept monadic just so we can do the seqType - -- See Note [Avoiding space leaks in OutType] -simplType env ty - = -- pprTrace "simplType" (ppr ty $$ ppr (seTvSubst env)) $ - seqType new_ty `seq` return new_ty - where - new_ty = substTy env ty - ---------------------------------- -simplCoercionF :: SimplEnv -> InCoercion -> SimplCont - -> SimplM (SimplFloats, OutExpr) -simplCoercionF env co cont - = do { co' <- simplCoercion env co - ; rebuild env (Coercion co') cont } - -simplCoercion :: SimplEnv -> InCoercion -> SimplM OutCoercion -simplCoercion env co - = do { dflags <- getDynFlags - ; let opt_co = optCoercion dflags (getTCvSubst env) co - ; seqCo opt_co `seq` return opt_co } - ------------------------------------ --- | Push a TickIt context outwards past applications and cases, as --- long as this is a non-scoping tick, to let case and application --- optimisations apply. - -simplTick :: SimplEnv -> Tickish Id -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) -simplTick env tickish expr cont - -- A scoped tick turns into a continuation, so that we can spot - -- (scc t (\x . e)) in simplLam and eliminate the scc. If we didn't do - -- it this way, then it would take two passes of the simplifier to - -- reduce ((scc t (\x . e)) e'). - -- NB, don't do this with counting ticks, because if the expr is - -- bottom, then rebuildCall will discard the continuation. - --- XXX: we cannot do this, because the simplifier assumes that --- the context can be pushed into a case with a single branch. e.g. --- scc<f> case expensive of p -> e --- becomes --- case expensive of p -> scc<f> e --- --- So I'm disabling this for now. It just means we will do more --- simplifier iterations that necessary in some cases. - --- | tickishScoped tickish && not (tickishCounts tickish) --- = simplExprF env expr (TickIt tickish cont) - - -- For unscoped or soft-scoped ticks, we are allowed to float in new - -- cost, so we simply push the continuation inside the tick. This - -- has the effect of moving the tick to the outside of a case or - -- application context, allowing the normal case and application - -- optimisations to fire. - | tickish `tickishScopesLike` SoftScope - = do { (floats, expr') <- simplExprF env expr cont - ; return (floats, mkTick tickish expr') - } - - -- Push tick inside if the context looks like this will allow us to - -- do a case-of-case - see Note [case-of-scc-of-case] - | Select {} <- cont, Just expr' <- push_tick_inside - = simplExprF env expr' cont - - -- We don't want to move the tick, but we might still want to allow - -- floats to pass through with appropriate wrapping (or not, see - -- wrap_floats below) - --- | not (tickishCounts tickish) || tickishCanSplit tickish - -- = wrap_floats - - | otherwise - = no_floating_past_tick - - where - - -- Try to push tick inside a case, see Note [case-of-scc-of-case]. - push_tick_inside = - case expr0 of - Case scrut bndr ty alts - -> Just $ Case (tickScrut scrut) bndr ty (map tickAlt alts) - _other -> Nothing - where (ticks, expr0) = stripTicksTop movable (Tick tickish expr) - movable t = not (tickishCounts t) || - t `tickishScopesLike` NoScope || - tickishCanSplit t - tickScrut e = foldr mkTick e ticks - -- Alternatives get annotated with all ticks that scope in some way, - -- but we don't want to count entries. - tickAlt (c,bs,e) = (c,bs, foldr mkTick e ts_scope) - ts_scope = map mkNoCount $ - filter (not . (`tickishScopesLike` NoScope)) ticks - - no_floating_past_tick = - do { let (inc,outc) = splitCont cont - ; (floats, expr1) <- simplExprF env expr inc - ; let expr2 = wrapFloats floats expr1 - tickish' = simplTickish env tickish - ; rebuild env (mkTick tickish' expr2) outc - } - --- Alternative version that wraps outgoing floats with the tick. This --- results in ticks being duplicated, as we don't make any attempt to --- eliminate the tick if we re-inline the binding (because the tick --- semantics allows unrestricted inlining of HNFs), so I'm not doing --- this any more. FloatOut will catch any real opportunities for --- floating. --- --- wrap_floats = --- do { let (inc,outc) = splitCont cont --- ; (env', expr') <- simplExprF (zapFloats env) expr inc --- ; let tickish' = simplTickish env tickish --- ; let wrap_float (b,rhs) = (zapIdStrictness (setIdArity b 0), --- mkTick (mkNoCount tickish') rhs) --- -- when wrapping a float with mkTick, we better zap the Id's --- -- strictness info and arity, because it might be wrong now. --- ; let env'' = addFloats env (mapFloats env' wrap_float) --- ; rebuild env'' expr' (TickIt tickish' outc) --- } - - - simplTickish env tickish - | Breakpoint n ids <- tickish - = Breakpoint n (map (getDoneId . substId env) ids) - | otherwise = tickish - - -- Push type application and coercion inside a tick - splitCont :: SimplCont -> (SimplCont, SimplCont) - splitCont cont@(ApplyToTy { sc_cont = tail }) = (cont { sc_cont = inc }, outc) - where (inc,outc) = splitCont tail - splitCont (CastIt co c) = (CastIt co inc, outc) - where (inc,outc) = splitCont c - splitCont other = (mkBoringStop (contHoleType other), other) - - getDoneId (DoneId id) = id - getDoneId (DoneEx e _) = getIdFromTrivialExpr e -- Note [substTickish] in GHC.Core.Subst - getDoneId other = pprPanic "getDoneId" (ppr other) - --- Note [case-of-scc-of-case] --- It's pretty important to be able to transform case-of-case when --- there's an SCC in the way. For example, the following comes up --- in nofib/real/compress/Encode.hs: --- --- case scctick<code_string.r1> --- case $wcode_string_r13s wild_XC w1_s137 w2_s138 l_aje --- of _ { (# ww1_s13f, ww2_s13g, ww3_s13h #) -> --- (ww1_s13f, ww2_s13g, ww3_s13h) --- } --- of _ { (ww_s12Y, ww1_s12Z, ww2_s130) -> --- tick<code_string.f1> --- (ww_s12Y, --- ww1_s12Z, --- PTTrees.PT --- @ GHC.Types.Char @ GHC.Types.Int wild2_Xj ww2_s130 r_ajf) --- } --- --- We really want this case-of-case to fire, because then the 3-tuple --- will go away (indeed, the CPR optimisation is relying on this --- happening). But the scctick is in the way - we need to push it --- inside to expose the case-of-case. So we perform this --- transformation on the inner case: --- --- scctick c (case e of { p1 -> e1; ...; pn -> en }) --- ==> --- case (scctick c e) of { p1 -> scc c e1; ...; pn -> scc c en } --- --- So we've moved a constant amount of work out of the scc to expose --- the case. We only do this when the continuation is interesting: in --- for now, it has to be another Case (maybe generalise this later). - -{- -************************************************************************ -* * -\subsection{The main rebuilder} -* * -************************************************************************ --} - -rebuild :: SimplEnv -> OutExpr -> SimplCont -> SimplM (SimplFloats, OutExpr) --- At this point the substitution in the SimplEnv should be irrelevant; --- only the in-scope set matters -rebuild env expr cont - = case cont of - Stop {} -> return (emptyFloats env, expr) - TickIt t cont -> rebuild env (mkTick t expr) cont - CastIt co cont -> rebuild env (mkCast expr co) cont - -- NB: mkCast implements the (Coercion co |> g) optimisation - - Select { sc_bndr = bndr, sc_alts = alts, sc_env = se, sc_cont = cont } - -> rebuildCase (se `setInScopeFromE` env) expr bndr alts cont - - StrictArg { sc_fun = fun, sc_cont = cont } - -> rebuildCall env (fun `addValArgTo` expr) cont - StrictBind { sc_bndr = b, sc_bndrs = bs, sc_body = body - , sc_env = se, sc_cont = cont } - -> do { (floats1, env') <- simplNonRecX (se `setInScopeFromE` env) b expr - -- expr satisfies let/app since it started life - -- in a call to simplNonRecE - ; (floats2, expr') <- simplLam env' bs body cont - ; return (floats1 `addFloats` floats2, expr') } - - ApplyToTy { sc_arg_ty = ty, sc_cont = cont} - -> rebuild env (App expr (Type ty)) cont - - ApplyToVal { sc_arg = arg, sc_env = se, sc_dup = dup_flag, sc_cont = cont} - -- See Note [Avoid redundant simplification] - -> do { (_, _, arg') <- simplArg env dup_flag se arg - ; rebuild env (App expr arg') cont } - -{- -************************************************************************ -* * -\subsection{Lambdas} -* * -************************************************************************ --} - -{- Note [Optimising reflexivity] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's important (for compiler performance) to get rid of reflexivity as soon -as it appears. See #11735, #14737, and #15019. - -In particular, we want to behave well on - - * e |> co1 |> co2 - where the two happen to cancel out entirely. That is quite common; - e.g. a newtype wrapping and unwrapping cancel. - - - * (f |> co) @t1 @t2 ... @tn x1 .. xm - Here we wil use pushCoTyArg and pushCoValArg successively, which - build up NthCo stacks. Silly to do that if co is reflexive. - -However, we don't want to call isReflexiveCo too much, because it uses -type equality which is expensive on big types (#14737 comment:7). - -A good compromise (determined experimentally) seems to be to call -isReflexiveCo - * when composing casts, and - * at the end - -In investigating this I saw missed opportunities for on-the-fly -coercion shrinkage. See #15090. --} - - -simplCast :: SimplEnv -> InExpr -> Coercion -> SimplCont - -> SimplM (SimplFloats, OutExpr) -simplCast env body co0 cont0 - = do { co1 <- {-#SCC "simplCast-simplCoercion" #-} simplCoercion env co0 - ; cont1 <- {-#SCC "simplCast-addCoerce" #-} - if isReflCo co1 - then return cont0 -- See Note [Optimising reflexivity] - else addCoerce co1 cont0 - ; {-#SCC "simplCast-simplExprF" #-} simplExprF env body cont1 } - where - -- If the first parameter is MRefl, then simplifying revealed a - -- reflexive coercion. Omit. - addCoerceM :: MOutCoercion -> SimplCont -> SimplM SimplCont - addCoerceM MRefl cont = return cont - addCoerceM (MCo co) cont = addCoerce co cont - - addCoerce :: OutCoercion -> SimplCont -> SimplM SimplCont - addCoerce co1 (CastIt co2 cont) -- See Note [Optimising reflexivity] - | isReflexiveCo co' = return cont - | otherwise = addCoerce co' cont - where - co' = mkTransCo co1 co2 - - addCoerce co cont@(ApplyToTy { sc_arg_ty = arg_ty, sc_cont = tail }) - | Just (arg_ty', m_co') <- pushCoTyArg co arg_ty - -- N.B. As mentioned in Note [The hole type in ApplyToTy] this is - -- only needed by `sc_hole_ty` which is often not forced. - -- Consequently it is worthwhile using a lazy pattern match here to - -- avoid unnecessary coercionKind evaluations. - , let hole_ty = coercionLKind co - = {-#SCC "addCoerce-pushCoTyArg" #-} - do { tail' <- addCoerceM m_co' tail - ; return (cont { sc_arg_ty = arg_ty' - , sc_hole_ty = hole_ty -- NB! As the cast goes past, the - -- type of the hole changes (#16312) - , sc_cont = tail' }) } - - addCoerce co cont@(ApplyToVal { sc_arg = arg, sc_env = arg_se - , sc_dup = dup, sc_cont = tail }) - | Just (co1, m_co2) <- pushCoValArg co - , let new_ty = coercionRKind co1 - , not (isTypeLevPoly new_ty) -- Without this check, we get a lev-poly arg - -- See Note [Levity polymorphism invariants] in GHC.Core - -- test: typecheck/should_run/EtaExpandLevPoly - = {-#SCC "addCoerce-pushCoValArg" #-} - do { tail' <- addCoerceM m_co2 tail - ; if isReflCo co1 - then return (cont { sc_cont = tail' }) - -- Avoid simplifying if possible; - -- See Note [Avoiding exponential behaviour] - else do - { (dup', arg_se', arg') <- simplArg env dup arg_se arg - -- When we build the ApplyTo we can't mix the OutCoercion - -- 'co' with the InExpr 'arg', so we simplify - -- to make it all consistent. It's a bit messy. - -- But it isn't a common case. - -- Example of use: #995 - ; return (ApplyToVal { sc_arg = mkCast arg' co1 - , sc_env = arg_se' - , sc_dup = dup' - , sc_cont = tail' }) } } - - addCoerce co cont - | isReflexiveCo co = return cont -- Having this at the end makes a huge - -- difference in T12227, for some reason - -- See Note [Optimising reflexivity] - | otherwise = return (CastIt co cont) - -simplArg :: SimplEnv -> DupFlag -> StaticEnv -> CoreExpr - -> SimplM (DupFlag, StaticEnv, OutExpr) -simplArg env dup_flag arg_env arg - | isSimplified dup_flag - = return (dup_flag, arg_env, arg) - | otherwise - = do { arg' <- simplExpr (arg_env `setInScopeFromE` env) arg - ; return (Simplified, zapSubstEnv arg_env, arg') } - -{- -************************************************************************ -* * -\subsection{Lambdas} -* * -************************************************************************ --} - -simplLam :: SimplEnv -> [InId] -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) - -simplLam env [] body cont - = simplExprF env body cont - -simplLam env (bndr:bndrs) body (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont }) - = do { tick (BetaReduction bndr) - ; simplLam (extendTvSubst env bndr arg_ty) bndrs body cont } - -simplLam env (bndr:bndrs) body (ApplyToVal { sc_arg = arg, sc_env = arg_se - , sc_cont = cont, sc_dup = dup }) - | isSimplified dup -- Don't re-simplify if we've simplified it once - -- See Note [Avoiding exponential behaviour] - = do { tick (BetaReduction bndr) - ; (floats1, env') <- simplNonRecX env zapped_bndr arg - ; (floats2, expr') <- simplLam env' bndrs body cont - ; return (floats1 `addFloats` floats2, expr') } - - | otherwise - = do { tick (BetaReduction bndr) - ; simplNonRecE env zapped_bndr (arg, arg_se) (bndrs, body) cont } - where - zapped_bndr -- See Note [Zap unfolding when beta-reducing] - | isId bndr = zapStableUnfolding bndr - | otherwise = bndr - - -- Discard a non-counting tick on a lambda. This may change the - -- cost attribution slightly (moving the allocation of the - -- lambda elsewhere), but we don't care: optimisation changes - -- cost attribution all the time. -simplLam env bndrs body (TickIt tickish cont) - | not (tickishCounts tickish) - = simplLam env bndrs body cont - - -- Not enough args, so there are real lambdas left to put in the result -simplLam env bndrs body cont - = do { (env', bndrs') <- simplLamBndrs env bndrs - ; body' <- simplExpr env' body - ; new_lam <- mkLam env bndrs' body' cont - ; rebuild env' new_lam cont } - -------------- -simplLamBndr :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr) --- Used for lambda binders. These sometimes have unfoldings added by --- the worker/wrapper pass that must be preserved, because they can't --- be reconstructed from context. For example: --- f x = case x of (a,b) -> fw a b x --- fw a b x{=(a,b)} = ... --- The "{=(a,b)}" is an unfolding we can't reconstruct otherwise. -simplLamBndr env bndr - | isId bndr && isFragileUnfolding old_unf -- Special case - = do { (env1, bndr1) <- simplBinder env bndr - ; unf' <- simplStableUnfolding env1 NotTopLevel Nothing bndr - old_unf (idType bndr1) - ; let bndr2 = bndr1 `setIdUnfolding` unf' - ; return (modifyInScope env1 bndr2, bndr2) } - - | otherwise - = simplBinder env bndr -- Normal case - where - old_unf = idUnfolding bndr - -simplLamBndrs :: SimplEnv -> [InBndr] -> SimplM (SimplEnv, [OutBndr]) -simplLamBndrs env bndrs = mapAccumLM simplLamBndr env bndrs - ------------------- -simplNonRecE :: SimplEnv - -> InId -- The binder, always an Id - -- Never a join point - -> (InExpr, SimplEnv) -- Rhs of binding (or arg of lambda) - -> ([InBndr], InExpr) -- Body of the let/lambda - -- \xs.e - -> SimplCont - -> SimplM (SimplFloats, OutExpr) - --- simplNonRecE is used for --- * non-top-level non-recursive non-join-point lets in expressions --- * beta reduction --- --- simplNonRec env b (rhs, rhs_se) (bs, body) k --- = let env in --- cont< let b = rhs_se(rhs) in \bs.body > --- --- It deals with strict bindings, via the StrictBind continuation, --- which may abort the whole process --- --- Precondition: rhs satisfies the let/app invariant --- Note [Core let/app invariant] in GHC.Core --- --- The "body" of the binding comes as a pair of ([InId],InExpr) --- representing a lambda; so we recurse back to simplLam --- Why? Because of the binder-occ-info-zapping done before --- the call to simplLam in simplExprF (Lam ...) - -simplNonRecE env bndr (rhs, rhs_se) (bndrs, body) cont - | ASSERT( isId bndr && not (isJoinId bndr) ) True - , Just env' <- preInlineUnconditionally env NotTopLevel bndr rhs rhs_se - = do { tick (PreInlineUnconditionally bndr) - ; -- pprTrace "preInlineUncond" (ppr bndr <+> ppr rhs) $ - simplLam env' bndrs body cont } - - -- Deal with strict bindings - | isStrictId bndr -- Includes coercions - , sm_case_case (getMode env) - = simplExprF (rhs_se `setInScopeFromE` env) rhs - (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs, sc_body = body - , sc_env = env, sc_cont = cont, sc_dup = NoDup }) - - -- Deal with lazy bindings - | otherwise - = ASSERT( not (isTyVar bndr) ) - do { (env1, bndr1) <- simplNonRecBndr env bndr - ; (env2, bndr2) <- addBndrRules env1 bndr bndr1 Nothing - ; (floats1, env3) <- simplLazyBind env2 NotTopLevel NonRecursive bndr bndr2 rhs rhs_se - ; (floats2, expr') <- simplLam env3 bndrs body cont - ; return (floats1 `addFloats` floats2, expr') } - ------------------- -simplRecE :: SimplEnv - -> [(InId, InExpr)] - -> InExpr - -> SimplCont - -> SimplM (SimplFloats, OutExpr) - --- simplRecE is used for --- * non-top-level recursive lets in expressions -simplRecE env pairs body cont - = do { let bndrs = map fst pairs - ; MASSERT(all (not . isJoinId) bndrs) - ; env1 <- simplRecBndrs env bndrs - -- NB: bndrs' don't have unfoldings or rules - -- We add them as we go down - ; (floats1, env2) <- simplRecBind env1 NotTopLevel Nothing pairs - ; (floats2, expr') <- simplExprF env2 body cont - ; return (floats1 `addFloats` floats2, expr') } - -{- Note [Avoiding exponential behaviour] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -One way in which we can get exponential behaviour is if we simplify a -big expression, and the re-simplify it -- and then this happens in a -deeply-nested way. So we must be jolly careful about re-simplifying -an expression. That is why completeNonRecX does not try -preInlineUnconditionally. - -Example: - f BIG, where f has a RULE -Then - * We simplify BIG before trying the rule; but the rule does not fire - * We inline f = \x. x True - * So if we did preInlineUnconditionally we'd re-simplify (BIG True) - -However, if BIG has /not/ already been simplified, we'd /like/ to -simplify BIG True; maybe good things happen. That is why - -* simplLam has - - a case for (isSimplified dup), which goes via simplNonRecX, and - - a case for the un-simplified case, which goes via simplNonRecE - -* We go to some efforts to avoid unnecessarily simplifying ApplyToVal, - in at least two places - - In simplCast/addCoerce, where we check for isReflCo - - In rebuildCall we avoid simplifying arguments before we have to - (see Note [Trying rewrite rules]) - - -Note [Zap unfolding when beta-reducing] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Lambda-bound variables can have stable unfoldings, such as - $j = \x. \b{Unf=Just x}. e -See Note [Case binders and join points] below; the unfolding for lets -us optimise e better. However when we beta-reduce it we want to -revert to using the actual value, otherwise we can end up in the -stupid situation of - let x = blah in - let b{Unf=Just x} = y - in ...b... -Here it'd be far better to drop the unfolding and use the actual RHS. - -************************************************************************ -* * - Join points -* * -********************************************************************* -} - -{- Note [Rules and unfolding for join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Suppose we have - - simplExpr (join j x = rhs ) cont - ( {- RULE j (p:ps) = blah -} ) - ( {- StableUnfolding j = blah -} ) - (in blah ) - -Then we will push 'cont' into the rhs of 'j'. But we should *also* push -'cont' into the RHS of - * Any RULEs for j, e.g. generated by SpecConstr - * Any stable unfolding for j, e.g. the result of an INLINE pragma - -Simplifying rules and stable-unfoldings happens a bit after -simplifying the right-hand side, so we remember whether or not it -is a join point, and what 'cont' is, in a value of type MaybeJoinCont - -#13900 was caused by forgetting to push 'cont' into the RHS -of a SpecConstr-generated RULE for a join point. --} - -type MaybeJoinCont = Maybe SimplCont - -- Nothing => Not a join point - -- Just k => This is a join binding with continuation k - -- See Note [Rules and unfolding for join points] - -simplNonRecJoinPoint :: SimplEnv -> InId -> InExpr - -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) -simplNonRecJoinPoint env bndr rhs body cont - | ASSERT( isJoinId bndr ) True - , Just env' <- preInlineUnconditionally env NotTopLevel bndr rhs env - = do { tick (PreInlineUnconditionally bndr) - ; simplExprF env' body cont } - - | otherwise - = wrapJoinCont env cont $ \ env cont -> - do { -- We push join_cont into the join RHS and the body; - -- and wrap wrap_cont around the whole thing - ; let res_ty = contResultType cont - ; (env1, bndr1) <- simplNonRecJoinBndr env res_ty bndr - ; (env2, bndr2) <- addBndrRules env1 bndr bndr1 (Just cont) - ; (floats1, env3) <- simplJoinBind env2 cont bndr bndr2 rhs env - ; (floats2, body') <- simplExprF env3 body cont - ; return (floats1 `addFloats` floats2, body') } - - ------------------- -simplRecJoinPoint :: SimplEnv -> [(InId, InExpr)] - -> InExpr -> SimplCont - -> SimplM (SimplFloats, OutExpr) -simplRecJoinPoint env pairs body cont - = wrapJoinCont env cont $ \ env cont -> - do { let bndrs = map fst pairs - res_ty = contResultType cont - ; env1 <- simplRecJoinBndrs env res_ty bndrs - -- NB: bndrs' don't have unfoldings or rules - -- We add them as we go down - ; (floats1, env2) <- simplRecBind env1 NotTopLevel (Just cont) pairs - ; (floats2, body') <- simplExprF env2 body cont - ; return (floats1 `addFloats` floats2, body') } - --------------------- -wrapJoinCont :: SimplEnv -> SimplCont - -> (SimplEnv -> SimplCont -> SimplM (SimplFloats, OutExpr)) - -> SimplM (SimplFloats, OutExpr) --- Deal with making the continuation duplicable if necessary, --- and with the no-case-of-case situation. -wrapJoinCont env cont thing_inside - | contIsStop cont -- Common case; no need for fancy footwork - = thing_inside env cont - - | not (sm_case_case (getMode env)) - -- See Note [Join points with -fno-case-of-case] - = do { (floats1, expr1) <- thing_inside env (mkBoringStop (contHoleType cont)) - ; let (floats2, expr2) = wrapJoinFloatsX floats1 expr1 - ; (floats3, expr3) <- rebuild (env `setInScopeFromF` floats2) expr2 cont - ; return (floats2 `addFloats` floats3, expr3) } - - | otherwise - -- Normal case; see Note [Join points and case-of-case] - = do { (floats1, cont') <- mkDupableCont env cont - ; (floats2, result) <- thing_inside (env `setInScopeFromF` floats1) cont' - ; return (floats1 `addFloats` floats2, result) } - - --------------------- -trimJoinCont :: Id -> Maybe JoinArity -> SimplCont -> SimplCont --- Drop outer context from join point invocation (jump) --- See Note [Join points and case-of-case] - -trimJoinCont _ Nothing cont - = cont -- Not a jump -trimJoinCont var (Just arity) cont - = trim arity cont - where - trim 0 cont@(Stop {}) - = cont - trim 0 cont - = mkBoringStop (contResultType cont) - trim n cont@(ApplyToVal { sc_cont = k }) - = cont { sc_cont = trim (n-1) k } - trim n cont@(ApplyToTy { sc_cont = k }) - = cont { sc_cont = trim (n-1) k } -- join arity counts types! - trim _ cont - = pprPanic "completeCall" $ ppr var $$ ppr cont - - -{- Note [Join points and case-of-case] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we perform the case-of-case transform (or otherwise push continuations -inward), we want to treat join points specially. Since they're always -tail-called and we want to maintain this invariant, we can do this (for any -evaluation context E): - - E[join j = e - in case ... of - A -> jump j 1 - B -> jump j 2 - C -> f 3] - - --> - - join j = E[e] - in case ... of - A -> jump j 1 - B -> jump j 2 - C -> E[f 3] - -As is evident from the example, there are two components to this behavior: - - 1. When entering the RHS of a join point, copy the context inside. - 2. When a join point is invoked, discard the outer context. - -We need to be very careful here to remain consistent---neither part is -optional! - -We need do make the continuation E duplicable (since we are duplicating it) -with mkDupableCont. - - -Note [Join points with -fno-case-of-case] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Supose case-of-case is switched off, and we are simplifying - - case (join j x = <j-rhs> in - case y of - A -> j 1 - B -> j 2 - C -> e) of <outer-alts> - -Usually, we'd push the outer continuation (case . of <outer-alts>) into -both the RHS and the body of the join point j. But since we aren't doing -case-of-case we may then end up with this totally bogus result - - join x = case <j-rhs> of <outer-alts> in - case (case y of - A -> j 1 - B -> j 2 - C -> e) of <outer-alts> - -This would be OK in the language of the paper, but not in GHC: j is no longer -a join point. We can only do the "push continuation into the RHS of the -join point j" if we also push the continuation right down to the /jumps/ to -j, so that it can evaporate there. If we are doing case-of-case, we'll get to - - join x = case <j-rhs> of <outer-alts> in - case y of - A -> j 1 - B -> j 2 - C -> case e of <outer-alts> - -which is great. - -Bottom line: if case-of-case is off, we must stop pushing the continuation -inwards altogether at any join point. Instead simplify the (join ... in ...) -with a Stop continuation, and wrap the original continuation around the -outside. Surprisingly tricky! - - -************************************************************************ -* * - Variables -* * -************************************************************************ --} - -simplVar :: SimplEnv -> InVar -> SimplM OutExpr --- Look up an InVar in the environment -simplVar env var - | isTyVar var = return (Type (substTyVar env var)) - | isCoVar var = return (Coercion (substCoVar env var)) - | otherwise - = case substId env var of - ContEx tvs cvs ids e -> simplExpr (setSubstEnv env tvs cvs ids) e - DoneId var1 -> return (Var var1) - DoneEx e _ -> return e - -simplIdF :: SimplEnv -> InId -> SimplCont -> SimplM (SimplFloats, OutExpr) -simplIdF env var cont - = case substId env var of - ContEx tvs cvs ids e -> simplExprF (setSubstEnv env tvs cvs ids) e cont - -- Don't trim; haven't already simplified e, - -- so the cont is not embodied in e - - DoneId var1 -> completeCall env var1 (trimJoinCont var (isJoinId_maybe var1) cont) - - DoneEx e mb_join -> simplExprF (zapSubstEnv env) e (trimJoinCont var mb_join cont) - -- Note [zapSubstEnv] - -- The template is already simplified, so don't re-substitute. - -- This is VITAL. Consider - -- let x = e in - -- let y = \z -> ...x... in - -- \ x -> ...y... - -- We'll clone the inner \x, adding x->x' in the id_subst - -- Then when we inline y, we must *not* replace x by x' in - -- the inlined copy!! - ---------------------------------------------------------- --- Dealing with a call site - -completeCall :: SimplEnv -> OutId -> SimplCont -> SimplM (SimplFloats, OutExpr) -completeCall env var cont - | Just expr <- callSiteInline dflags var active_unf - lone_variable arg_infos interesting_cont - -- Inline the variable's RHS - = do { checkedTick (UnfoldingDone var) - ; dump_inline expr cont - ; simplExprF (zapSubstEnv env) expr cont } - - | otherwise - -- Don't inline; instead rebuild the call - = do { rule_base <- getSimplRules - ; let info = mkArgInfo env var (getRules rule_base var) - n_val_args call_cont - ; rebuildCall env info cont } - - where - dflags = seDynFlags env - (lone_variable, arg_infos, call_cont) = contArgs cont - n_val_args = length arg_infos - interesting_cont = interestingCallContext env call_cont - active_unf = activeUnfolding (getMode env) var - - log_inlining doc - = liftIO $ dumpAction dflags - (mkUserStyle dflags alwaysQualify AllTheWay) - (dumpOptionsFromFlag Opt_D_dump_inlinings) - "" FormatText doc - - dump_inline unfolding cont - | not (dopt Opt_D_dump_inlinings dflags) = return () - | not (dopt Opt_D_verbose_core2core dflags) - = when (isExternalName (idName var)) $ - log_inlining $ - sep [text "Inlining done:", nest 4 (ppr var)] - | otherwise - = liftIO $ log_inlining $ - sep [text "Inlining done: " <> ppr var, - nest 4 (vcat [text "Inlined fn: " <+> nest 2 (ppr unfolding), - text "Cont: " <+> ppr cont])] - -rebuildCall :: SimplEnv - -> ArgInfo - -> SimplCont - -> SimplM (SimplFloats, OutExpr) --- We decided not to inline, so --- - simplify the arguments --- - try rewrite rules --- - and rebuild - ----------- Bottoming applications -------------- -rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_strs = [] }) cont - -- When we run out of strictness args, it means - -- that the call is definitely bottom; see SimplUtils.mkArgInfo - -- Then we want to discard the entire strict continuation. E.g. - -- * case (error "hello") of { ... } - -- * (error "Hello") arg - -- * f (error "Hello") where f is strict - -- etc - -- Then, especially in the first of these cases, we'd like to discard - -- the continuation, leaving just the bottoming expression. But the - -- type might not be right, so we may have to add a coerce. - | not (contIsTrivial cont) -- Only do this if there is a non-trivial - -- continuation to discard, else we do it - -- again and again! - = seqType cont_ty `seq` -- See Note [Avoiding space leaks in OutType] - return (emptyFloats env, castBottomExpr res cont_ty) - where - res = argInfoExpr fun rev_args - cont_ty = contResultType cont - ----------- Try rewrite RULES -------------- --- See Note [Trying rewrite rules] -rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args - , ai_rules = Just (nr_wanted, rules) }) cont - | nr_wanted == 0 || no_more_args - , let info' = info { ai_rules = Nothing } - = -- We've accumulated a simplified call in <fun,rev_args> - -- so try rewrite rules; see Note [RULEs apply to simplified arguments] - -- See also Note [Rules for recursive functions] - do { mb_match <- tryRules env rules fun (reverse rev_args) cont - ; case mb_match of - Just (env', rhs, cont') -> simplExprF env' rhs cont' - Nothing -> rebuildCall env info' cont } - where - no_more_args = case cont of - ApplyToTy {} -> False - ApplyToVal {} -> False - _ -> True - - ----------- Simplify applications and casts -------------- -rebuildCall env info (CastIt co cont) - = rebuildCall env (addCastTo info co) cont - -rebuildCall env info (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont }) - = rebuildCall env (addTyArgTo info arg_ty) cont - -rebuildCall env info@(ArgInfo { ai_encl = encl_rules, ai_type = fun_ty - , ai_strs = str:strs, ai_discs = disc:discs }) - (ApplyToVal { sc_arg = arg, sc_env = arg_se - , sc_dup = dup_flag, sc_cont = cont }) - | isSimplified dup_flag -- See Note [Avoid redundant simplification] - = rebuildCall env (addValArgTo info' arg) cont - - | str -- Strict argument - , sm_case_case (getMode env) - = -- pprTrace "Strict Arg" (ppr arg $$ ppr (seIdSubst env) $$ ppr (seInScope env)) $ - simplExprF (arg_se `setInScopeFromE` env) arg - (StrictArg { sc_fun = info', sc_cci = cci_strict - , sc_dup = Simplified, sc_cont = cont }) - -- Note [Shadowing] - - | otherwise -- Lazy argument - -- DO NOT float anything outside, hence simplExprC - -- There is no benefit (unlike in a let-binding), and we'd - -- have to be very careful about bogus strictness through - -- floating a demanded let. - = do { arg' <- simplExprC (arg_se `setInScopeFromE` env) arg - (mkLazyArgStop arg_ty cci_lazy) - ; rebuildCall env (addValArgTo info' arg') cont } - where - info' = info { ai_strs = strs, ai_discs = discs } - arg_ty = funArgTy fun_ty - - -- Use this for lazy arguments - cci_lazy | encl_rules = RuleArgCtxt - | disc > 0 = DiscArgCtxt -- Be keener here - | otherwise = BoringCtxt -- Nothing interesting - - -- ..and this for strict arguments - cci_strict | encl_rules = RuleArgCtxt - | disc > 0 = DiscArgCtxt - | otherwise = RhsCtxt - -- Why RhsCtxt? if we see f (g x) (h x), and f is strict, we - -- want to be a bit more eager to inline g, because it may - -- expose an eval (on x perhaps) that can be eliminated or - -- shared. I saw this in nofib 'boyer2', RewriteFuns.onewayunify1 - -- It's worth an 18% improvement in allocation for this - -- particular benchmark; 5% on 'mate' and 1.3% on 'multiplier' - ----------- No further useful info, revert to generic rebuild ------------ -rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args }) cont - = rebuild env (argInfoExpr fun rev_args) cont - -{- Note [Trying rewrite rules] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider an application (f e1 e2 e3) where the e1,e2,e3 are not yet -simplified. We want to simplify enough arguments to allow the rules -to apply, but it's more efficient to avoid simplifying e2,e3 if e1 alone -is sufficient. Example: class ops - (+) dNumInt e2 e3 -If we rewrite ((+) dNumInt) to plusInt, we can take advantage of the -latter's strictness when simplifying e2, e3. Moreover, suppose we have - RULE f Int = \x. x True - -Then given (f Int e1) we rewrite to - (\x. x True) e1 -without simplifying e1. Now we can inline x into its unique call site, -and absorb the True into it all in the same pass. If we simplified -e1 first, we couldn't do that; see Note [Avoiding exponential behaviour]. - -So we try to apply rules if either - (a) no_more_args: we've run out of argument that the rules can "see" - (b) nr_wanted: none of the rules wants any more arguments - - -Note [RULES apply to simplified arguments] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's very desirable to try RULES once the arguments have been simplified, because -doing so ensures that rule cascades work in one pass. Consider - {-# RULES g (h x) = k x - f (k x) = x #-} - ...f (g (h x))... -Then we want to rewrite (g (h x)) to (k x) and only then try f's rules. If -we match f's rules against the un-simplified RHS, it won't match. This -makes a particularly big difference when superclass selectors are involved: - op ($p1 ($p2 (df d))) -We want all this to unravel in one sweep. - -Note [Avoid redundant simplification] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Because RULES apply to simplified arguments, there's a danger of repeatedly -simplifying already-simplified arguments. An important example is that of - (>>=) d e1 e2 -Here e1, e2 are simplified before the rule is applied, but don't really -participate in the rule firing. So we mark them as Simplified to avoid -re-simplifying them. - -Note [Shadowing] -~~~~~~~~~~~~~~~~ -This part of the simplifier may break the no-shadowing invariant -Consider - f (...(\a -> e)...) (case y of (a,b) -> e') -where f is strict in its second arg -If we simplify the innermost one first we get (...(\a -> e)...) -Simplifying the second arg makes us float the case out, so we end up with - case y of (a,b) -> f (...(\a -> e)...) e' -So the output does not have the no-shadowing invariant. However, there is -no danger of getting name-capture, because when the first arg was simplified -we used an in-scope set that at least mentioned all the variables free in its -static environment, and that is enough. - -We can't just do innermost first, or we'd end up with a dual problem: - case x of (a,b) -> f e (...(\a -> e')...) - -I spent hours trying to recover the no-shadowing invariant, but I just could -not think of an elegant way to do it. The simplifier is already knee-deep in -continuations. We have to keep the right in-scope set around; AND we have -to get the effect that finding (error "foo") in a strict arg position will -discard the entire application and replace it with (error "foo"). Getting -all this at once is TOO HARD! - - -************************************************************************ -* * - Rewrite rules -* * -************************************************************************ --} - -tryRules :: SimplEnv -> [CoreRule] - -> Id -> [ArgSpec] - -> SimplCont - -> SimplM (Maybe (SimplEnv, CoreExpr, SimplCont)) - -tryRules env rules fn args call_cont - | null rules - = return Nothing - -{- Disabled until we fix #8326 - | fn `hasKey` tagToEnumKey -- See Note [Optimising tagToEnum#] - , [_type_arg, val_arg] <- args - , Select dup bndr ((_,[],rhs1) : rest_alts) se cont <- call_cont - , isDeadBinder bndr - = do { let enum_to_tag :: CoreAlt -> CoreAlt - -- Takes K -> e into tagK# -> e - -- where tagK# is the tag of constructor K - enum_to_tag (DataAlt con, [], rhs) - = ASSERT( isEnumerationTyCon (dataConTyCon con) ) - (LitAlt tag, [], rhs) - where - tag = mkLitInt dflags (toInteger (dataConTag con - fIRST_TAG)) - enum_to_tag alt = pprPanic "tryRules: tagToEnum" (ppr alt) - - new_alts = (DEFAULT, [], rhs1) : map enum_to_tag rest_alts - new_bndr = setIdType bndr intPrimTy - -- The binder is dead, but should have the right type - ; return (Just (val_arg, Select dup new_bndr new_alts se cont)) } --} - - | Just (rule, rule_rhs) <- lookupRule dflags (getUnfoldingInRuleMatch env) - (activeRule (getMode env)) fn - (argInfoAppArgs args) rules - -- Fire a rule for the function - = do { checkedTick (RuleFired (ruleName rule)) - ; let cont' = pushSimplifiedArgs zapped_env - (drop (ruleArity rule) args) - call_cont - -- (ruleArity rule) says how - -- many args the rule consumed - - occ_anald_rhs = occurAnalyseExpr rule_rhs - -- See Note [Occurrence-analyse after rule firing] - ; dump rule rule_rhs - ; return (Just (zapped_env, occ_anald_rhs, cont')) } - -- The occ_anald_rhs and cont' are all Out things - -- hence zapping the environment - - | otherwise -- No rule fires - = do { nodump -- This ensures that an empty file is written - ; return Nothing } - - where - dflags = seDynFlags env - zapped_env = zapSubstEnv env -- See Note [zapSubstEnv] - - printRuleModule rule - = parens (maybe (text "BUILTIN") - (pprModuleName . moduleName) - (ruleModule rule)) - - dump rule rule_rhs - | dopt Opt_D_dump_rule_rewrites dflags - = log_rule dflags Opt_D_dump_rule_rewrites "Rule fired" $ vcat - [ text "Rule:" <+> ftext (ruleName rule) - , text "Module:" <+> printRuleModule rule - , text "Before:" <+> hang (ppr fn) 2 (sep (map ppr args)) - , text "After: " <+> pprCoreExpr rule_rhs - , text "Cont: " <+> ppr call_cont ] - - | dopt Opt_D_dump_rule_firings dflags - = log_rule dflags Opt_D_dump_rule_firings "Rule fired:" $ - ftext (ruleName rule) - <+> printRuleModule rule - - | otherwise - = return () - - nodump - | dopt Opt_D_dump_rule_rewrites dflags - = liftIO $ do - touchDumpFile dflags (dumpOptionsFromFlag Opt_D_dump_rule_rewrites) - - | dopt Opt_D_dump_rule_firings dflags - = liftIO $ do - touchDumpFile dflags (dumpOptionsFromFlag Opt_D_dump_rule_firings) - - | otherwise - = return () - - log_rule dflags flag hdr details - = liftIO $ do - let sty = mkDumpStyle dflags alwaysQualify - dumpAction dflags sty (dumpOptionsFromFlag flag) "" FormatText $ - sep [text hdr, nest 4 details] - -trySeqRules :: SimplEnv - -> OutExpr -> InExpr -- Scrutinee and RHS - -> SimplCont - -> SimplM (Maybe (SimplEnv, CoreExpr, SimplCont)) --- See Note [User-defined RULES for seq] -trySeqRules in_env scrut rhs cont - = do { rule_base <- getSimplRules - ; tryRules in_env (getRules rule_base seqId) seqId out_args rule_cont } - where - no_cast_scrut = drop_casts scrut - scrut_ty = exprType no_cast_scrut - seq_id_ty = idType seqId - res1_ty = piResultTy seq_id_ty rhs_rep - res2_ty = piResultTy res1_ty scrut_ty - rhs_ty = substTy in_env (exprType rhs) - rhs_rep = getRuntimeRep rhs_ty - out_args = [ TyArg { as_arg_ty = rhs_rep - , as_hole_ty = seq_id_ty } - , TyArg { as_arg_ty = scrut_ty - , as_hole_ty = res1_ty } - , TyArg { as_arg_ty = rhs_ty - , as_hole_ty = res2_ty } - , ValArg no_cast_scrut] - rule_cont = ApplyToVal { sc_dup = NoDup, sc_arg = rhs - , sc_env = in_env, sc_cont = cont } - -- Lazily evaluated, so we don't do most of this - - drop_casts (Cast e _) = drop_casts e - drop_casts e = e - -{- Note [User-defined RULES for seq] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Given - case (scrut |> co) of _ -> rhs -look for rules that match the expression - seq @t1 @t2 scrut -where scrut :: t1 - rhs :: t2 - -If you find a match, rewrite it, and apply to 'rhs'. - -Notice that we can simply drop casts on the fly here, which -makes it more likely that a rule will match. - -See Note [User-defined RULES for seq] in MkId. - -Note [Occurrence-analyse after rule firing] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -After firing a rule, we occurrence-analyse the instantiated RHS before -simplifying it. Usually this doesn't make much difference, but it can -be huge. Here's an example (simplCore/should_compile/T7785) - - map f (map f (map f xs) - -= -- Use build/fold form of map, twice - map f (build (\cn. foldr (mapFB c f) n - (build (\cn. foldr (mapFB c f) n xs)))) - -= -- Apply fold/build rule - map f (build (\cn. (\cn. foldr (mapFB c f) n xs) (mapFB c f) n)) - -= -- Beta-reduce - -- Alas we have no occurrence-analysed, so we don't know - -- that c is used exactly once - map f (build (\cn. let c1 = mapFB c f in - foldr (mapFB c1 f) n xs)) - -= -- Use mapFB rule: mapFB (mapFB c f) g = mapFB c (f.g) - -- We can do this because (mapFB c n) is a PAP and hence expandable - map f (build (\cn. let c1 = mapFB c n in - foldr (mapFB c (f.f)) n x)) - -This is not too bad. But now do the same with the outer map, and -we get another use of mapFB, and t can interact with /both/ remaining -mapFB calls in the above expression. This is stupid because actually -that 'c1' binding is dead. The outer map introduces another c2. If -there is a deep stack of maps we get lots of dead bindings, and lots -of redundant work as we repeatedly simplify the result of firing rules. - -The easy thing to do is simply to occurrence analyse the result of -the rule firing. Note that this occ-anals not only the RHS of the -rule, but also the function arguments, which by now are OutExprs. -E.g. - RULE f (g x) = x+1 - -Call f (g BIG) --> (\x. x+1) BIG - -The rule binders are lambda-bound and applied to the OutExpr arguments -(here BIG) which lack all internal occurrence info. - -Is this inefficient? Not really: we are about to walk over the result -of the rule firing to simplify it, so occurrence analysis is at most -a constant factor. - -Possible improvement: occ-anal the rules when putting them in the -database; and in the simplifier just occ-anal the OutExpr arguments. -But that's more complicated and the rule RHS is usually tiny; so I'm -just doing the simple thing. - -Historical note: previously we did occ-anal the rules in Rule.hs, -but failed to occ-anal the OutExpr arguments, which led to the -nasty performance problem described above. - - -Note [Optimising tagToEnum#] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have an enumeration data type: - - data Foo = A | B | C - -Then we want to transform - - case tagToEnum# x of ==> case x of - A -> e1 DEFAULT -> e1 - B -> e2 1# -> e2 - C -> e3 2# -> e3 - -thereby getting rid of the tagToEnum# altogether. If there was a DEFAULT -alternative we retain it (remember it comes first). If not the case must -be exhaustive, and we reflect that in the transformed version by adding -a DEFAULT. Otherwise Lint complains that the new case is not exhaustive. -See #8317. - -Note [Rules for recursive functions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -You might think that we shouldn't apply rules for a loop breaker: -doing so might give rise to an infinite loop, because a RULE is -rather like an extra equation for the function: - RULE: f (g x) y = x+y - Eqn: f a y = a-y - -But it's too drastic to disable rules for loop breakers. -Even the foldr/build rule would be disabled, because foldr -is recursive, and hence a loop breaker: - foldr k z (build g) = g k z -So it's up to the programmer: rules can cause divergence - - -************************************************************************ -* * - Rebuilding a case expression -* * -************************************************************************ - -Note [Case elimination] -~~~~~~~~~~~~~~~~~~~~~~~ -The case-elimination transformation discards redundant case expressions. -Start with a simple situation: - - case x# of ===> let y# = x# in e - y# -> e - -(when x#, y# are of primitive type, of course). We can't (in general) -do this for algebraic cases, because we might turn bottom into -non-bottom! - -The code in SimplUtils.prepareAlts has the effect of generalise this -idea to look for a case where we're scrutinising a variable, and we -know that only the default case can match. For example: - - case x of - 0# -> ... - DEFAULT -> ...(case x of - 0# -> ... - DEFAULT -> ...) ... - -Here the inner case is first trimmed to have only one alternative, the -DEFAULT, after which it's an instance of the previous case. This -really only shows up in eliminating error-checking code. - -Note that SimplUtils.mkCase combines identical RHSs. So - - case e of ===> case e of DEFAULT -> r - True -> r - False -> r - -Now again the case may be eliminated by the CaseElim transformation. -This includes things like (==# a# b#)::Bool so that we simplify - case ==# a# b# of { True -> x; False -> x } -to just - x -This particular example shows up in default methods for -comparison operations (e.g. in (>=) for Int.Int32) - -Note [Case to let transformation] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If a case over a lifted type has a single alternative, and is being -used as a strict 'let' (all isDeadBinder bndrs), we may want to do -this transformation: - - case e of r ===> let r = e in ...r... - _ -> ...r... - -We treat the unlifted and lifted cases separately: - -* Unlifted case: 'e' satisfies exprOkForSpeculation - (ok-for-spec is needed to satisfy the let/app invariant). - This turns case a +# b of r -> ...r... - into let r = a +# b in ...r... - and thence .....(a +# b).... - - However, if we have - case indexArray# a i of r -> ...r... - we might like to do the same, and inline the (indexArray# a i). - But indexArray# is not okForSpeculation, so we don't build a let - in rebuildCase (lest it get floated *out*), so the inlining doesn't - happen either. Annoying. - -* Lifted case: we need to be sure that the expression is already - evaluated (exprIsHNF). If it's not already evaluated - - we risk losing exceptions, divergence or - user-specified thunk-forcing - - even if 'e' is guaranteed to converge, we don't want to - create a thunk (call by need) instead of evaluating it - right away (call by value) - - However, we can turn the case into a /strict/ let if the 'r' is - used strictly in the body. Then we won't lose divergence; and - we won't build a thunk because the let is strict. - See also Note [Case-to-let for strictly-used binders] - - NB: absentError satisfies exprIsHNF: see Note [aBSENT_ERROR_ID] in GHC.Core.Make. - We want to turn - case (absentError "foo") of r -> ...MkT r... - into - let r = absentError "foo" in ...MkT r... - - -Note [Case-to-let for strictly-used binders] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have this: - case <scrut> of r { _ -> ..r.. } - -where 'r' is used strictly in (..r..), we can safely transform to - let r = <scrut> in ...r... - -This is a Good Thing, because 'r' might be dead (if the body just -calls error), or might be used just once (in which case it can be -inlined); or we might be able to float the let-binding up or down. -E.g. #15631 has an example. - -Note that this can change the error behaviour. For example, we might -transform - case x of { _ -> error "bad" } - --> error "bad" -which is might be puzzling if 'x' currently lambda-bound, but later gets -let-bound to (error "good"). - -Nevertheless, the paper "A semantics for imprecise exceptions" allows -this transformation. If you want to fix the evaluation order, use -'pseq'. See #8900 for an example where the loss of this -transformation bit us in practice. - -See also Note [Empty case alternatives] in GHC.Core. - -Historical notes - -There have been various earlier versions of this patch: - -* By Sept 18 the code looked like this: - || scrut_is_demanded_var scrut - - scrut_is_demanded_var :: CoreExpr -> Bool - scrut_is_demanded_var (Cast s _) = scrut_is_demanded_var s - scrut_is_demanded_var (Var _) = isStrictDmd (idDemandInfo case_bndr) - scrut_is_demanded_var _ = False - - This only fired if the scrutinee was a /variable/, which seems - an unnecessary restriction. So in #15631 I relaxed it to allow - arbitrary scrutinees. Less code, less to explain -- but the change - had 0.00% effect on nofib. - -* Previously, in Jan 13 the code looked like this: - || case_bndr_evald_next rhs - - case_bndr_evald_next :: CoreExpr -> Bool - -- See Note [Case binder next] - case_bndr_evald_next (Var v) = v == case_bndr - case_bndr_evald_next (Cast e _) = case_bndr_evald_next e - case_bndr_evald_next (App e _) = case_bndr_evald_next e - case_bndr_evald_next (Case e _ _ _) = case_bndr_evald_next e - case_bndr_evald_next _ = False - - This patch was part of fixing #7542. See also - Note [Eta reduction of an eval'd function] in GHC.Core.Utils.) - - -Further notes about case elimination -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider: test :: Integer -> IO () - test = print - -Turns out that this compiles to: - Print.test - = \ eta :: Integer - eta1 :: Void# -> - case PrelNum.< eta PrelNum.zeroInteger of wild { __DEFAULT -> - case hPutStr stdout - (PrelNum.jtos eta ($w[] @ Char)) - eta1 - of wild1 { (# new_s, a4 #) -> PrelIO.lvl23 new_s }} - -Notice the strange '<' which has no effect at all. This is a funny one. -It started like this: - -f x y = if x < 0 then jtos x - else if y==0 then "" else jtos x - -At a particular call site we have (f v 1). So we inline to get - - if v < 0 then jtos x - else if 1==0 then "" else jtos x - -Now simplify the 1==0 conditional: - - if v<0 then jtos v else jtos v - -Now common-up the two branches of the case: - - case (v<0) of DEFAULT -> jtos v - -Why don't we drop the case? Because it's strict in v. It's technically -wrong to drop even unnecessary evaluations, and in practice they -may be a result of 'seq' so we *definitely* don't want to drop those. -I don't really know how to improve this situation. - - -Note [FloatBinds from constructor wrappers] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -If we have FloatBinds coming from the constructor wrapper -(as in Note [exprIsConApp_maybe on data constructors with wrappers]), -we cannot float past them. We'd need to float the FloatBind -together with the simplify floats, unfortunately the -simplifier doesn't have case-floats. The simplest thing we can -do is to wrap all the floats here. The next iteration of the -simplifier will take care of all these cases and lets. - -Given data T = MkT !Bool, this allows us to simplify -case $WMkT b of { MkT x -> f x } -to -case b of { b' -> f b' }. - -We could try and be more clever (like maybe wfloats only contain -let binders, so we could float them). But the need for the -extra complication is not clear. --} - ---------------------------------------------------------- --- Eliminate the case if possible - -rebuildCase, reallyRebuildCase - :: SimplEnv - -> OutExpr -- Scrutinee - -> InId -- Case binder - -> [InAlt] -- Alternatives (increasing order) - -> SimplCont - -> SimplM (SimplFloats, OutExpr) - --------------------------------------------------- --- 1. Eliminate the case if there's a known constructor --------------------------------------------------- - -rebuildCase env scrut case_bndr alts cont - | Lit lit <- scrut -- No need for same treatment as constructors - -- because literals are inlined more vigorously - , not (litIsLifted lit) - = do { tick (KnownBranch case_bndr) - ; case findAlt (LitAlt lit) alts of - Nothing -> missingAlt env case_bndr alts cont - Just (_, bs, rhs) -> simple_rhs env [] scrut bs rhs } - - | Just (in_scope', wfloats, con, ty_args, other_args) - <- exprIsConApp_maybe (getUnfoldingInRuleMatch env) scrut - -- Works when the scrutinee is a variable with a known unfolding - -- as well as when it's an explicit constructor application - , let env0 = setInScopeSet env in_scope' - = do { tick (KnownBranch case_bndr) - ; case findAlt (DataAlt con) alts of - Nothing -> missingAlt env0 case_bndr alts cont - Just (DEFAULT, bs, rhs) -> let con_app = Var (dataConWorkId con) - `mkTyApps` ty_args - `mkApps` other_args - in simple_rhs env0 wfloats con_app bs rhs - Just (_, bs, rhs) -> knownCon env0 scrut wfloats con ty_args other_args - case_bndr bs rhs cont - } - where - simple_rhs env wfloats scrut' bs rhs = - ASSERT( null bs ) - do { (floats1, env') <- simplNonRecX env case_bndr scrut' - -- scrut is a constructor application, - -- hence satisfies let/app invariant - ; (floats2, expr') <- simplExprF env' rhs cont - ; case wfloats of - [] -> return (floats1 `addFloats` floats2, expr') - _ -> return - -- See Note [FloatBinds from constructor wrappers] - ( emptyFloats env, - GHC.Core.Make.wrapFloats wfloats $ - wrapFloats (floats1 `addFloats` floats2) expr' )} - - --------------------------------------------------- --- 2. Eliminate the case if scrutinee is evaluated --------------------------------------------------- - -rebuildCase env scrut case_bndr alts@[(_, bndrs, rhs)] cont - -- See if we can get rid of the case altogether - -- See Note [Case elimination] - -- mkCase made sure that if all the alternatives are equal, - -- then there is now only one (DEFAULT) rhs - - -- 2a. Dropping the case altogether, if - -- a) it binds nothing (so it's really just a 'seq') - -- b) evaluating the scrutinee has no side effects - | is_plain_seq - , exprOkForSideEffects scrut - -- The entire case is dead, so we can drop it - -- if the scrutinee converges without having imperative - -- side effects or raising a Haskell exception - -- See Note [PrimOp can_fail and has_side_effects] in PrimOp - = simplExprF env rhs cont - - -- 2b. Turn the case into a let, if - -- a) it binds only the case-binder - -- b) unlifted case: the scrutinee is ok-for-speculation - -- lifted case: the scrutinee is in HNF (or will later be demanded) - -- See Note [Case to let transformation] - | all_dead_bndrs - , doCaseToLet scrut case_bndr - = do { tick (CaseElim case_bndr) - ; (floats1, env') <- simplNonRecX env case_bndr scrut - ; (floats2, expr') <- simplExprF env' rhs cont - ; return (floats1 `addFloats` floats2, expr') } - - -- 2c. Try the seq rules if - -- a) it binds only the case binder - -- b) a rule for seq applies - -- See Note [User-defined RULES for seq] in MkId - | is_plain_seq - = do { mb_rule <- trySeqRules env scrut rhs cont - ; case mb_rule of - Just (env', rule_rhs, cont') -> simplExprF env' rule_rhs cont' - Nothing -> reallyRebuildCase env scrut case_bndr alts cont } - where - all_dead_bndrs = all isDeadBinder bndrs -- bndrs are [InId] - is_plain_seq = all_dead_bndrs && isDeadBinder case_bndr -- Evaluation *only* for effect - -rebuildCase env scrut case_bndr alts cont - = reallyRebuildCase env scrut case_bndr alts cont - - -doCaseToLet :: OutExpr -- Scrutinee - -> InId -- Case binder - -> Bool --- The situation is case scrut of b { DEFAULT -> body } --- Can we transform thus? let { b = scrut } in body -doCaseToLet scrut case_bndr - | isTyCoVar case_bndr -- Respect GHC.Core - = isTyCoArg scrut -- Note [Core type and coercion invariant] - - | isUnliftedType (idType case_bndr) - = exprOkForSpeculation scrut - - | otherwise -- Scrut has a lifted type - = exprIsHNF scrut - || isStrictDmd (idDemandInfo case_bndr) - -- See Note [Case-to-let for strictly-used binders] - --------------------------------------------------- --- 3. Catch-all case --------------------------------------------------- - -reallyRebuildCase env scrut case_bndr alts cont - | not (sm_case_case (getMode env)) - = do { case_expr <- simplAlts env scrut case_bndr alts - (mkBoringStop (contHoleType cont)) - ; rebuild env case_expr cont } - - | otherwise - = do { (floats, cont') <- mkDupableCaseCont env alts cont - ; case_expr <- simplAlts (env `setInScopeFromF` floats) - scrut case_bndr alts cont' - ; return (floats, case_expr) } - -{- -simplCaseBinder checks whether the scrutinee is a variable, v. If so, -try to eliminate uses of v in the RHSs in favour of case_bndr; that -way, there's a chance that v will now only be used once, and hence -inlined. - -Historical note: we use to do the "case binder swap" in the Simplifier -so there were additional complications if the scrutinee was a variable. -Now the binder-swap stuff is done in the occurrence analyser; see -OccurAnal Note [Binder swap]. - -Note [knownCon occ info] -~~~~~~~~~~~~~~~~~~~~~~~~ -If the case binder is not dead, then neither are the pattern bound -variables: - case <any> of x { (a,b) -> - case x of { (p,q) -> p } } -Here (a,b) both look dead, but come alive after the inner case is eliminated. -The point is that we bring into the envt a binding - let x = (a,b) -after the outer case, and that makes (a,b) alive. At least we do unless -the case binder is guaranteed dead. - -Note [Case alternative occ info] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we are simply reconstructing a case (the common case), we always -zap the occurrence info on the binders in the alternatives. Even -if the case binder is dead, the scrutinee is usually a variable, and *that* -can bring the case-alternative binders back to life. -See Note [Add unfolding for scrutinee] - -Note [Improving seq] -~~~~~~~~~~~~~~~~~~~ -Consider - type family F :: * -> * - type instance F Int = Int - -We'd like to transform - case e of (x :: F Int) { DEFAULT -> rhs } -===> - case e `cast` co of (x'::Int) - I# x# -> let x = x' `cast` sym co - in rhs - -so that 'rhs' can take advantage of the form of x'. Notice that Note -[Case of cast] (in OccurAnal) may then apply to the result. - -We'd also like to eliminate empty types (#13468). So if - - data Void - type instance F Bool = Void - -then we'd like to transform - case (x :: F Bool) of { _ -> error "urk" } -===> - case (x |> co) of (x' :: Void) of {} - -Nota Bene: we used to have a built-in rule for 'seq' that dropped -casts, so that - case (x |> co) of { _ -> blah } -dropped the cast; in order to improve the chances of trySeqRules -firing. But that works in the /opposite/ direction to Note [Improving -seq] so there's a danger of flip/flopping. Better to make trySeqRules -insensitive to the cast, which is now is. - -The need for [Improving seq] showed up in Roman's experiments. Example: - foo :: F Int -> Int -> Int - foo t n = t `seq` bar n - where - bar 0 = 0 - bar n = bar (n - case t of TI i -> i) -Here we'd like to avoid repeated evaluating t inside the loop, by -taking advantage of the `seq`. - -At one point I did transformation in LiberateCase, but it's more -robust here. (Otherwise, there's a danger that we'll simply drop the -'seq' altogether, before LiberateCase gets to see it.) --} - -simplAlts :: SimplEnv - -> OutExpr -- Scrutinee - -> InId -- Case binder - -> [InAlt] -- Non-empty - -> SimplCont - -> SimplM OutExpr -- Returns the complete simplified case expression - -simplAlts env0 scrut case_bndr alts cont' - = do { traceSmpl "simplAlts" (vcat [ ppr case_bndr - , text "cont':" <+> ppr cont' - , text "in_scope" <+> ppr (seInScope env0) ]) - ; (env1, case_bndr1) <- simplBinder env0 case_bndr - ; let case_bndr2 = case_bndr1 `setIdUnfolding` evaldUnfolding - env2 = modifyInScope env1 case_bndr2 - -- See Note [Case binder evaluated-ness] - - ; fam_envs <- getFamEnvs - ; (alt_env', scrut', case_bndr') <- improveSeq fam_envs env2 scrut - case_bndr case_bndr2 alts - - ; (imposs_deflt_cons, in_alts) <- prepareAlts scrut' case_bndr' alts - -- NB: it's possible that the returned in_alts is empty: this is handled - -- by the caller (rebuildCase) in the missingAlt function - - ; alts' <- mapM (simplAlt alt_env' (Just scrut') imposs_deflt_cons case_bndr' cont') in_alts - ; -- pprTrace "simplAlts" (ppr case_bndr $$ ppr alts_ty $$ ppr alts_ty' $$ ppr alts $$ ppr cont') $ - - ; let alts_ty' = contResultType cont' - -- See Note [Avoiding space leaks in OutType] - ; seqType alts_ty' `seq` - mkCase (seDynFlags env0) scrut' case_bndr' alts_ty' alts' } - - ------------------------------------- -improveSeq :: (FamInstEnv, FamInstEnv) -> SimplEnv - -> OutExpr -> InId -> OutId -> [InAlt] - -> SimplM (SimplEnv, OutExpr, OutId) --- Note [Improving seq] -improveSeq fam_envs env scrut case_bndr case_bndr1 [(DEFAULT,_,_)] - | Just (co, ty2) <- topNormaliseType_maybe fam_envs (idType case_bndr1) - = do { case_bndr2 <- newId (fsLit "nt") ty2 - ; let rhs = DoneEx (Var case_bndr2 `Cast` mkSymCo co) Nothing - env2 = extendIdSubst env case_bndr rhs - ; return (env2, scrut `Cast` co, case_bndr2) } - -improveSeq _ env scrut _ case_bndr1 _ - = return (env, scrut, case_bndr1) - - ------------------------------------- -simplAlt :: SimplEnv - -> Maybe OutExpr -- The scrutinee - -> [AltCon] -- These constructors can't be present when - -- matching the DEFAULT alternative - -> OutId -- The case binder - -> SimplCont - -> InAlt - -> SimplM OutAlt - -simplAlt env _ imposs_deflt_cons case_bndr' cont' (DEFAULT, bndrs, rhs) - = ASSERT( null bndrs ) - do { let env' = addBinderUnfolding env case_bndr' - (mkOtherCon imposs_deflt_cons) - -- Record the constructors that the case-binder *can't* be. - ; rhs' <- simplExprC env' rhs cont' - ; return (DEFAULT, [], rhs') } - -simplAlt env scrut' _ case_bndr' cont' (LitAlt lit, bndrs, rhs) - = ASSERT( null bndrs ) - do { env' <- addAltUnfoldings env scrut' case_bndr' (Lit lit) - ; rhs' <- simplExprC env' rhs cont' - ; return (LitAlt lit, [], rhs') } - -simplAlt env scrut' _ case_bndr' cont' (DataAlt con, vs, rhs) - = do { -- See Note [Adding evaluatedness info to pattern-bound variables] - let vs_with_evals = addEvals scrut' con vs - ; (env', vs') <- simplLamBndrs env vs_with_evals - - -- Bind the case-binder to (con args) - ; let inst_tys' = tyConAppArgs (idType case_bndr') - con_app :: OutExpr - con_app = mkConApp2 con inst_tys' vs' - - ; env'' <- addAltUnfoldings env' scrut' case_bndr' con_app - ; rhs' <- simplExprC env'' rhs cont' - ; return (DataAlt con, vs', rhs') } - -{- Note [Adding evaluatedness info to pattern-bound variables] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -addEvals records the evaluated-ness of the bound variables of -a case pattern. This is *important*. Consider - - data T = T !Int !Int - - case x of { T a b -> T (a+1) b } - -We really must record that b is already evaluated so that we don't -go and re-evaluate it when constructing the result. -See Note [Data-con worker strictness] in MkId.hs - -NB: simplLamBndrs preserves this eval info - -In addition to handling data constructor fields with !s, addEvals -also records the fact that the result of seq# is always in WHNF. -See Note [seq# magic] in PrelRules. Example (#15226): - - case seq# v s of - (# s', v' #) -> E - -we want the compiler to be aware that v' is in WHNF in E. - -Open problem: we don't record that v itself is in WHNF (and we can't -do it here). The right thing is to do some kind of binder-swap; -see #15226 for discussion. --} - -addEvals :: Maybe OutExpr -> DataCon -> [Id] -> [Id] --- See Note [Adding evaluatedness info to pattern-bound variables] -addEvals scrut con vs - -- Deal with seq# applications - | Just scr <- scrut - , isUnboxedTupleCon con - , [s,x] <- vs - -- Use stripNArgs rather than collectArgsTicks to avoid building - -- a list of arguments only to throw it away immediately. - , Just (Var f) <- stripNArgs 4 scr - , Just SeqOp <- isPrimOpId_maybe f - , let x' = zapIdOccInfoAndSetEvald MarkedStrict x - = [s, x'] - - -- Deal with banged datacon fields -addEvals _scrut con vs = go vs the_strs - where - the_strs = dataConRepStrictness con - - go [] [] = [] - go (v:vs') strs | isTyVar v = v : go vs' strs - go (v:vs') (str:strs) = zapIdOccInfoAndSetEvald str v : go vs' strs - go _ _ = pprPanic "Simplify.addEvals" - (ppr con $$ - ppr vs $$ - ppr_with_length (map strdisp the_strs) $$ - ppr_with_length (dataConRepArgTys con) $$ - ppr_with_length (dataConRepStrictness con)) - where - ppr_with_length list - = ppr list <+> parens (text "length =" <+> ppr (length list)) - strdisp MarkedStrict = text "MarkedStrict" - strdisp NotMarkedStrict = text "NotMarkedStrict" - -zapIdOccInfoAndSetEvald :: StrictnessMark -> Id -> Id -zapIdOccInfoAndSetEvald str v = - setCaseBndrEvald str $ -- Add eval'dness info - zapIdOccInfo v -- And kill occ info; - -- see Note [Case alternative occ info] - -addAltUnfoldings :: SimplEnv -> Maybe OutExpr -> OutId -> OutExpr -> SimplM SimplEnv -addAltUnfoldings env scrut case_bndr con_app - = do { let con_app_unf = mk_simple_unf con_app - env1 = addBinderUnfolding env case_bndr con_app_unf - - -- See Note [Add unfolding for scrutinee] - env2 = case scrut of - Just (Var v) -> addBinderUnfolding env1 v con_app_unf - Just (Cast (Var v) co) -> addBinderUnfolding env1 v $ - mk_simple_unf (Cast con_app (mkSymCo co)) - _ -> env1 - - ; traceSmpl "addAltUnf" (vcat [ppr case_bndr <+> ppr scrut, ppr con_app]) - ; return env2 } - where - mk_simple_unf = mkSimpleUnfolding (seDynFlags env) - -addBinderUnfolding :: SimplEnv -> Id -> Unfolding -> SimplEnv -addBinderUnfolding env bndr unf - | debugIsOn, Just tmpl <- maybeUnfoldingTemplate unf - = WARN( not (eqType (idType bndr) (exprType tmpl)), - ppr bndr $$ ppr (idType bndr) $$ ppr tmpl $$ ppr (exprType tmpl) ) - modifyInScope env (bndr `setIdUnfolding` unf) - - | otherwise - = modifyInScope env (bndr `setIdUnfolding` unf) - -zapBndrOccInfo :: Bool -> Id -> Id --- Consider case e of b { (a,b) -> ... } --- Then if we bind b to (a,b) in "...", and b is not dead, --- then we must zap the deadness info on a,b -zapBndrOccInfo keep_occ_info pat_id - | keep_occ_info = pat_id - | otherwise = zapIdOccInfo pat_id - -{- Note [Case binder evaluated-ness] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We pin on a (OtherCon []) unfolding to the case-binder of a Case, -even though it'll be over-ridden in every case alternative with a more -informative unfolding. Why? Because suppose a later, less clever, pass -simply replaces all occurrences of the case binder with the binder itself; -then Lint may complain about the let/app invariant. Example - case e of b { DEFAULT -> let v = reallyUnsafePtrEq# b y in .... - ; K -> blah } - -The let/app invariant requires that y is evaluated in the call to -reallyUnsafePtrEq#, which it is. But we still want that to be true if we -propagate binders to occurrences. - -This showed up in #13027. - -Note [Add unfolding for scrutinee] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In general it's unlikely that a variable scrutinee will appear -in the case alternatives case x of { ...x unlikely to appear... } -because the binder-swap in OccAnal has got rid of all such occurrences -See Note [Binder swap] in OccAnal. - -BUT it is still VERY IMPORTANT to add a suitable unfolding for a -variable scrutinee, in simplAlt. Here's why - case x of y - (a,b) -> case b of c - I# v -> ...(f y)... -There is no occurrence of 'b' in the (...(f y)...). But y gets -the unfolding (a,b), and *that* mentions b. If f has a RULE - RULE f (p, I# q) = ... -we want that rule to match, so we must extend the in-scope env with a -suitable unfolding for 'y'. It's *essential* for rule matching; but -it's also good for case-elimintation -- suppose that 'f' was inlined -and did multi-level case analysis, then we'd solve it in one -simplifier sweep instead of two. - -Exactly the same issue arises in SpecConstr; -see Note [Add scrutinee to ValueEnv too] in SpecConstr - -HOWEVER, given - case x of y { Just a -> r1; Nothing -> r2 } -we do not want to add the unfolding x -> y to 'x', which might seem cool, -since 'y' itself has different unfoldings in r1 and r2. Reason: if we -did that, we'd have to zap y's deadness info and that is a very useful -piece of information. - -So instead we add the unfolding x -> Just a, and x -> Nothing in the -respective RHSs. - - -************************************************************************ -* * -\subsection{Known constructor} -* * -************************************************************************ - -We are a bit careful with occurrence info. Here's an example - - (\x* -> case x of (a*, b) -> f a) (h v, e) - -where the * means "occurs once". This effectively becomes - case (h v, e) of (a*, b) -> f a) -and then - let a* = h v; b = e in f a -and then - f (h v) - -All this should happen in one sweep. --} - -knownCon :: SimplEnv - -> OutExpr -- The scrutinee - -> [FloatBind] -> DataCon -> [OutType] -> [OutExpr] -- The scrutinee (in pieces) - -> InId -> [InBndr] -> InExpr -- The alternative - -> SimplCont - -> SimplM (SimplFloats, OutExpr) - -knownCon env scrut dc_floats dc dc_ty_args dc_args bndr bs rhs cont - = do { (floats1, env1) <- bind_args env bs dc_args - ; (floats2, env2) <- bind_case_bndr env1 - ; (floats3, expr') <- simplExprF env2 rhs cont - ; case dc_floats of - [] -> - return (floats1 `addFloats` floats2 `addFloats` floats3, expr') - _ -> - return ( emptyFloats env - -- See Note [FloatBinds from constructor wrappers] - , GHC.Core.Make.wrapFloats dc_floats $ - wrapFloats (floats1 `addFloats` floats2 `addFloats` floats3) expr') } - where - zap_occ = zapBndrOccInfo (isDeadBinder bndr) -- bndr is an InId - - -- Ugh! - bind_args env' [] _ = return (emptyFloats env', env') - - bind_args env' (b:bs') (Type ty : args) - = ASSERT( isTyVar b ) - bind_args (extendTvSubst env' b ty) bs' args - - bind_args env' (b:bs') (Coercion co : args) - = ASSERT( isCoVar b ) - bind_args (extendCvSubst env' b co) bs' args - - bind_args env' (b:bs') (arg : args) - = ASSERT( isId b ) - do { let b' = zap_occ b - -- Note that the binder might be "dead", because it doesn't - -- occur in the RHS; and simplNonRecX may therefore discard - -- it via postInlineUnconditionally. - -- Nevertheless we must keep it if the case-binder is alive, - -- because it may be used in the con_app. See Note [knownCon occ info] - ; (floats1, env2) <- simplNonRecX env' b' arg -- arg satisfies let/app invariant - ; (floats2, env3) <- bind_args env2 bs' args - ; return (floats1 `addFloats` floats2, env3) } - - bind_args _ _ _ = - pprPanic "bind_args" $ ppr dc $$ ppr bs $$ ppr dc_args $$ - text "scrut:" <+> ppr scrut - - -- It's useful to bind bndr to scrut, rather than to a fresh - -- binding x = Con arg1 .. argn - -- because very often the scrut is a variable, so we avoid - -- creating, and then subsequently eliminating, a let-binding - -- BUT, if scrut is a not a variable, we must be careful - -- about duplicating the arg redexes; in that case, make - -- a new con-app from the args - bind_case_bndr env - | isDeadBinder bndr = return (emptyFloats env, env) - | exprIsTrivial scrut = return (emptyFloats env - , extendIdSubst env bndr (DoneEx scrut Nothing)) - | otherwise = do { dc_args <- mapM (simplVar env) bs - -- dc_ty_args are already OutTypes, - -- but bs are InBndrs - ; let con_app = Var (dataConWorkId dc) - `mkTyApps` dc_ty_args - `mkApps` dc_args - ; simplNonRecX env bndr con_app } - -------------------- -missingAlt :: SimplEnv -> Id -> [InAlt] -> SimplCont - -> SimplM (SimplFloats, OutExpr) - -- This isn't strictly an error, although it is unusual. - -- It's possible that the simplifier might "see" that - -- an inner case has no accessible alternatives before - -- it "sees" that the entire branch of an outer case is - -- inaccessible. So we simply put an error case here instead. -missingAlt env case_bndr _ cont - = WARN( True, text "missingAlt" <+> ppr case_bndr ) - -- See Note [Avoiding space leaks in OutType] - let cont_ty = contResultType cont - in seqType cont_ty `seq` - return (emptyFloats env, mkImpossibleExpr cont_ty) - -{- -************************************************************************ -* * -\subsection{Duplicating continuations} -* * -************************************************************************ - -Consider - let x* = case e of { True -> e1; False -> e2 } - in b -where x* is a strict binding. Then mkDupableCont will be given -the continuation - case [] of { True -> e1; False -> e2 } ; let x* = [] in b ; stop -and will split it into - dupable: case [] of { True -> $j1; False -> $j2 } ; stop - join floats: $j1 = e1, $j2 = e2 - non_dupable: let x* = [] in b; stop - -Putting this back together would give - let x* = let { $j1 = e1; $j2 = e2 } in - case e of { True -> $j1; False -> $j2 } - in b -(Of course we only do this if 'e' wants to duplicate that continuation.) -Note how important it is that the new join points wrap around the -inner expression, and not around the whole thing. - -In contrast, any let-bindings introduced by mkDupableCont can wrap -around the entire thing. - -Note [Bottom alternatives] -~~~~~~~~~~~~~~~~~~~~~~~~~~ -When we have - case (case x of { A -> error .. ; B -> e; C -> error ..) - of alts -then we can just duplicate those alts because the A and C cases -will disappear immediately. This is more direct than creating -join points and inlining them away. See #4930. --} - --------------------- -mkDupableCaseCont :: SimplEnv -> [InAlt] -> SimplCont - -> SimplM (SimplFloats, SimplCont) -mkDupableCaseCont env alts cont - | altsWouldDup alts = mkDupableCont env cont - | otherwise = return (emptyFloats env, cont) - -altsWouldDup :: [InAlt] -> Bool -- True iff strictly > 1 non-bottom alternative -altsWouldDup [] = False -- See Note [Bottom alternatives] -altsWouldDup [_] = False -altsWouldDup (alt:alts) - | is_bot_alt alt = altsWouldDup alts - | otherwise = not (all is_bot_alt alts) - where - is_bot_alt (_,_,rhs) = exprIsBottom rhs - -------------------------- -mkDupableCont :: SimplEnv -> SimplCont - -> SimplM ( SimplFloats -- Incoming SimplEnv augmented with - -- extra let/join-floats and in-scope variables - , SimplCont) -- dup_cont: duplicable continuation - -mkDupableCont env cont - | contIsDupable cont - = return (emptyFloats env, cont) - -mkDupableCont _ (Stop {}) = panic "mkDupableCont" -- Handled by previous eqn - -mkDupableCont env (CastIt ty cont) - = do { (floats, cont') <- mkDupableCont env cont - ; return (floats, CastIt ty cont') } - --- Duplicating ticks for now, not sure if this is good or not -mkDupableCont env (TickIt t cont) - = do { (floats, cont') <- mkDupableCont env cont - ; return (floats, TickIt t cont') } - -mkDupableCont env (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs - , sc_body = body, sc_env = se, sc_cont = cont}) - -- See Note [Duplicating StrictBind] - = do { let sb_env = se `setInScopeFromE` env - ; (sb_env1, bndr') <- simplBinder sb_env bndr - ; (floats1, join_inner) <- simplLam sb_env1 bndrs body cont - -- No need to use mkDupableCont before simplLam; we - -- use cont once here, and then share the result if necessary - - ; let join_body = wrapFloats floats1 join_inner - res_ty = contResultType cont - - ; (floats2, body2) - <- if exprIsDupable (seDynFlags env) join_body - then return (emptyFloats env, join_body) - else do { join_bndr <- newJoinId [bndr'] res_ty - ; let join_call = App (Var join_bndr) (Var bndr') - join_rhs = Lam (setOneShotLambda bndr') join_body - join_bind = NonRec join_bndr join_rhs - floats = emptyFloats env `extendFloats` join_bind - ; return (floats, join_call) } - ; return ( floats2 - , StrictBind { sc_bndr = bndr', sc_bndrs = [] - , sc_body = body2 - , sc_env = zapSubstEnv se `setInScopeFromF` floats2 - -- See Note [StaticEnv invariant] in SimplUtils - , sc_dup = OkToDup - , sc_cont = mkBoringStop res_ty } ) } - -mkDupableCont env (StrictArg { sc_fun = info, sc_cci = cci, sc_cont = cont }) - -- See Note [Duplicating StrictArg] - -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable - = do { (floats1, cont') <- mkDupableCont env cont - ; (floats_s, args') <- mapAndUnzipM (makeTrivialArg (getMode env)) - (ai_args info) - ; return ( foldl' addLetFloats floats1 floats_s - , StrictArg { sc_fun = info { ai_args = args' } - , sc_cci = cci - , sc_cont = cont' - , sc_dup = OkToDup} ) } - -mkDupableCont env (ApplyToTy { sc_cont = cont - , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty }) - = do { (floats, cont') <- mkDupableCont env cont - ; return (floats, ApplyToTy { sc_cont = cont' - , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty }) } - -mkDupableCont env (ApplyToVal { sc_arg = arg, sc_dup = dup - , sc_env = se, sc_cont = cont }) - = -- e.g. [...hole...] (...arg...) - -- ==> - -- let a = ...arg... - -- in [...hole...] a - -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable - do { (floats1, cont') <- mkDupableCont env cont - ; let env' = env `setInScopeFromF` floats1 - ; (_, se', arg') <- simplArg env' dup se arg - ; (let_floats2, arg'') <- makeTrivial (getMode env) NotTopLevel (fsLit "karg") arg' - ; let all_floats = floats1 `addLetFloats` let_floats2 - ; return ( all_floats - , ApplyToVal { sc_arg = arg'' - , sc_env = se' `setInScopeFromF` all_floats - -- Ensure that sc_env includes the free vars of - -- arg'' in its in-scope set, even if makeTrivial - -- has turned arg'' into a fresh variable - -- See Note [StaticEnv invariant] in SimplUtils - , sc_dup = OkToDup, sc_cont = cont' }) } - -mkDupableCont env (Select { sc_bndr = case_bndr, sc_alts = alts - , sc_env = se, sc_cont = cont }) - = -- e.g. (case [...hole...] of { pi -> ei }) - -- ===> - -- let ji = \xij -> ei - -- in case [...hole...] of { pi -> ji xij } - -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable - do { tick (CaseOfCase case_bndr) - ; (floats, alt_cont) <- mkDupableCaseCont env alts cont - -- NB: We call mkDupableCaseCont here to make cont duplicable - -- (if necessary, depending on the number of alts) - -- And this is important: see Note [Fusing case continuations] - - ; let alt_env = se `setInScopeFromF` floats - ; (alt_env', case_bndr') <- simplBinder alt_env case_bndr - ; alts' <- mapM (simplAlt alt_env' Nothing [] case_bndr' alt_cont) alts - -- Safe to say that there are no handled-cons for the DEFAULT case - -- NB: simplBinder does not zap deadness occ-info, so - -- a dead case_bndr' will still advertise its deadness - -- This is really important because in - -- case e of b { (# p,q #) -> ... } - -- b is always dead, and indeed we are not allowed to bind b to (# p,q #), - -- which might happen if e was an explicit unboxed pair and b wasn't marked dead. - -- In the new alts we build, we have the new case binder, so it must retain - -- its deadness. - -- NB: we don't use alt_env further; it has the substEnv for - -- the alternatives, and we don't want that - - ; (join_floats, alts'') <- mapAccumLM (mkDupableAlt (seDynFlags env) case_bndr') - emptyJoinFloats alts' - - ; let all_floats = floats `addJoinFloats` join_floats - -- Note [Duplicated env] - ; return (all_floats - , Select { sc_dup = OkToDup - , sc_bndr = case_bndr' - , sc_alts = alts'' - , sc_env = zapSubstEnv se `setInScopeFromF` all_floats - -- See Note [StaticEnv invariant] in SimplUtils - , sc_cont = mkBoringStop (contResultType cont) } ) } - -mkDupableAlt :: DynFlags -> OutId - -> JoinFloats -> OutAlt - -> SimplM (JoinFloats, OutAlt) -mkDupableAlt dflags case_bndr jfloats (con, bndrs', rhs') - | exprIsDupable dflags rhs' -- Note [Small alternative rhs] - = return (jfloats, (con, bndrs', rhs')) - - | otherwise - = do { let rhs_ty' = exprType rhs' - scrut_ty = idType case_bndr - case_bndr_w_unf - = case con of - DEFAULT -> case_bndr - DataAlt dc -> setIdUnfolding case_bndr unf - where - -- See Note [Case binders and join points] - unf = mkInlineUnfolding rhs - rhs = mkConApp2 dc (tyConAppArgs scrut_ty) bndrs' - - LitAlt {} -> WARN( True, text "mkDupableAlt" - <+> ppr case_bndr <+> ppr con ) - case_bndr - -- The case binder is alive but trivial, so why has - -- it not been substituted away? - - final_bndrs' - | isDeadBinder case_bndr = filter abstract_over bndrs' - | otherwise = bndrs' ++ [case_bndr_w_unf] - - abstract_over bndr - | isTyVar bndr = True -- Abstract over all type variables just in case - | otherwise = not (isDeadBinder bndr) - -- The deadness info on the new Ids is preserved by simplBinders - final_args = varsToCoreExprs final_bndrs' - -- Note [Join point abstraction] - - -- We make the lambdas into one-shot-lambdas. The - -- join point is sure to be applied at most once, and doing so - -- prevents the body of the join point being floated out by - -- the full laziness pass - really_final_bndrs = map one_shot final_bndrs' - one_shot v | isId v = setOneShotLambda v - | otherwise = v - join_rhs = mkLams really_final_bndrs rhs' - - ; join_bndr <- newJoinId final_bndrs' rhs_ty' - - ; let join_call = mkApps (Var join_bndr) final_args - alt' = (con, bndrs', join_call) - - ; return ( jfloats `addJoinFlts` unitJoinFloat (NonRec join_bndr join_rhs) - , alt') } - -- See Note [Duplicated env] - -{- -Note [Fusing case continuations] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It's important to fuse two successive case continuations when the -first has one alternative. That's why we call prepareCaseCont here. -Consider this, which arises from thunk splitting (see Note [Thunk -splitting] in WorkWrap): - - let - x* = case (case v of {pn -> rn}) of - I# a -> I# a - in body - -The simplifier will find - (Var v) with continuation - Select (pn -> rn) ( - Select [I# a -> I# a] ( - StrictBind body Stop - -So we'll call mkDupableCont on - Select [I# a -> I# a] (StrictBind body Stop) -There is just one alternative in the first Select, so we want to -simplify the rhs (I# a) with continuation (StrictBind body Stop) -Supposing that body is big, we end up with - let $j a = <let x = I# a in body> - in case v of { pn -> case rn of - I# a -> $j a } -This is just what we want because the rn produces a box that -the case rn cancels with. - -See #4957 a fuller example. - -Note [Case binders and join points] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Consider this - case (case .. ) of c { - I# c# -> ....c.... - -If we make a join point with c but not c# we get - $j = \c -> ....c.... - -But if later inlining scrutinises the c, thus - - $j = \c -> ... case c of { I# y -> ... } ... - -we won't see that 'c' has already been scrutinised. This actually -happens in the 'tabulate' function in wave4main, and makes a significant -difference to allocation. - -An alternative plan is this: - - $j = \c# -> let c = I# c# in ...c.... - -but that is bad if 'c' is *not* later scrutinised. - -So instead we do both: we pass 'c' and 'c#' , and record in c's inlining -(a stable unfolding) that it's really I# c#, thus - - $j = \c# -> \c[=I# c#] -> ...c.... - -Absence analysis may later discard 'c'. - -NB: take great care when doing strictness analysis; - see Note [Lambda-bound unfoldings] in DmdAnal. - -Also note that we can still end up passing stuff that isn't used. Before -strictness analysis we have - let $j x y c{=(x,y)} = (h c, ...) - in ... -After strictness analysis we see that h is strict, we end up with - let $j x y c{=(x,y)} = ($wh x y, ...) -and c is unused. - -Note [Duplicated env] -~~~~~~~~~~~~~~~~~~~~~ -Some of the alternatives are simplified, but have not been turned into a join point -So they *must* have a zapped subst-env. So we can't use completeNonRecX to -bind the join point, because it might to do PostInlineUnconditionally, and -we'd lose that when zapping the subst-env. We could have a per-alt subst-env, -but zapping it (as we do in mkDupableCont, the Select case) is safe, and -at worst delays the join-point inlining. - -Note [Small alternative rhs] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -It is worth checking for a small RHS because otherwise we -get extra let bindings that may cause an extra iteration of the simplifier to -inline back in place. Quite often the rhs is just a variable or constructor. -The Ord instance of Maybe in PrelMaybe.hs, for example, took several extra -iterations because the version with the let bindings looked big, and so wasn't -inlined, but after the join points had been inlined it looked smaller, and so -was inlined. - -NB: we have to check the size of rhs', not rhs. -Duplicating a small InAlt might invalidate occurrence information -However, if it *is* dupable, we return the *un* simplified alternative, -because otherwise we'd need to pair it up with an empty subst-env.... -but we only have one env shared between all the alts. -(Remember we must zap the subst-env before re-simplifying something). -Rather than do this we simply agree to re-simplify the original (small) thing later. - -Note [Funky mkLamTypes] -~~~~~~~~~~~~~~~~~~~~~~ -Notice the funky mkLamTypes. If the constructor has existentials -it's possible that the join point will be abstracted over -type variables as well as term variables. - Example: Suppose we have - data T = forall t. C [t] - Then faced with - case (case e of ...) of - C t xs::[t] -> rhs - We get the join point - let j :: forall t. [t] -> ... - j = /\t \xs::[t] -> rhs - in - case (case e of ...) of - C t xs::[t] -> j t xs - -Note [Duplicating StrictArg] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We make a StrictArg duplicable simply by making all its -stored-up arguments (in sc_fun) trivial, by let-binding -them. Thus: - f E [..hole..] - ==> let a = E - in f a [..hole..] -Now if the thing in the hole is a case expression (which is when -we'll call mkDupableCont), we'll push the function call into the -branches, which is what we want. Now RULES for f may fire, and -call-pattern specialisation. Here's an example from #3116 - go (n+1) (case l of - 1 -> bs' - _ -> Chunk p fpc (o+1) (l-1) bs') -If we can push the call for 'go' inside the case, we get -call-pattern specialisation for 'go', which is *crucial* for -this program. - -Here is the (&&) example: - && E (case x of { T -> F; F -> T }) - ==> let a = E in - case x of { T -> && a F; F -> && a T } -Much better! - -Notice that - * Arguments to f *after* the strict one are handled by - the ApplyToVal case of mkDupableCont. Eg - f [..hole..] E - - * We can only do the let-binding of E because the function - part of a StrictArg continuation is an explicit syntax - tree. In earlier versions we represented it as a function - (CoreExpr -> CoreEpxr) which we couldn't take apart. - -Historical aide: previously we did this (where E is a -big argument: - f E [..hole..] - ==> let $j = \a -> f E a - in $j [..hole..] - -But this is terrible! Here's an example: - && E (case x of { T -> F; F -> T }) -Now, && is strict so we end up simplifying the case with -an ArgOf continuation. If we let-bind it, we get - let $j = \v -> && E v - in simplExpr (case x of { T -> F; F -> T }) - (ArgOf (\r -> $j r) -And after simplifying more we get - let $j = \v -> && E v - in case x of { T -> $j F; F -> $j T } -Which is a Very Bad Thing - - -Note [Duplicating StrictBind] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We make a StrictBind duplicable in a very similar way to -that for case expressions. After all, - let x* = e in b is similar to case e of x -> b - -So we potentially make a join-point for the body, thus: - let x = [] in b ==> join j x = b - in let x = [] in j x - - -Note [Join point abstraction] Historical note -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -NB: This note is now historical, describing how (in the past) we used -to add a void argument to nullary join points. But now that "join -point" is not a fuzzy concept but a formal syntactic construct (as -distinguished by the JoinId constructor of IdDetails), each of these -concerns is handled separately, with no need for a vestigial extra -argument. - -Join points always have at least one value argument, -for several reasons - -* If we try to lift a primitive-typed something out - for let-binding-purposes, we will *caseify* it (!), - with potentially-disastrous strictness results. So - instead we turn it into a function: \v -> e - where v::Void#. The value passed to this function is void, - which generates (almost) no code. - -* CPR. We used to say "&& isUnliftedType rhs_ty'" here, but now - we make the join point into a function whenever used_bndrs' - is empty. This makes the join-point more CPR friendly. - Consider: let j = if .. then I# 3 else I# 4 - in case .. of { A -> j; B -> j; C -> ... } - - Now CPR doesn't w/w j because it's a thunk, so - that means that the enclosing function can't w/w either, - which is a lose. Here's the example that happened in practice: - kgmod :: Int -> Int -> Int - kgmod x y = if x > 0 && y < 0 || x < 0 && y > 0 - then 78 - else 5 - -* Let-no-escape. We want a join point to turn into a let-no-escape - so that it is implemented as a jump, and one of the conditions - for LNE is that it's not updatable. In CoreToStg, see - Note [What is a non-escaping let] - -* Floating. Since a join point will be entered once, no sharing is - gained by floating out, but something might be lost by doing - so because it might be allocated. - -I have seen a case alternative like this: - True -> \v -> ... -It's a bit silly to add the realWorld dummy arg in this case, making - $j = \s v -> ... - True -> $j s -(the \v alone is enough to make CPR happy) but I think it's rare - -There's a slight infelicity here: we pass the overall -case_bndr to all the join points if it's used in *any* RHS, -because we don't know its usage in each RHS separately - - - -************************************************************************ -* * - Unfoldings -* * -************************************************************************ --} - -simplLetUnfolding :: SimplEnv-> TopLevelFlag - -> MaybeJoinCont - -> InId - -> OutExpr -> OutType - -> Unfolding -> SimplM Unfolding -simplLetUnfolding env top_lvl cont_mb id new_rhs rhs_ty unf - | isStableUnfolding unf - = simplStableUnfolding env top_lvl cont_mb id unf rhs_ty - | isExitJoinId id - = return noUnfolding -- See Note [Do not inline exit join points] in Exitify - | otherwise - = mkLetUnfolding (seDynFlags env) top_lvl InlineRhs id new_rhs - -------------------- -mkLetUnfolding :: DynFlags -> TopLevelFlag -> UnfoldingSource - -> InId -> OutExpr -> SimplM Unfolding -mkLetUnfolding dflags top_lvl src id new_rhs - = is_bottoming `seq` -- See Note [Force bottoming field] - return (mkUnfolding dflags src is_top_lvl is_bottoming new_rhs) - -- We make an unfolding *even for loop-breakers*. - -- Reason: (a) It might be useful to know that they are WHNF - -- (b) In GHC.Iface.Tidy we currently assume that, if we want to - -- expose the unfolding then indeed we *have* an unfolding - -- to expose. (We could instead use the RHS, but currently - -- we don't.) The simple thing is always to have one. - where - is_top_lvl = isTopLevel top_lvl - is_bottoming = isBottomingId id - -------------------- -simplStableUnfolding :: SimplEnv -> TopLevelFlag - -> MaybeJoinCont -- Just k => a join point with continuation k - -> InId - -> Unfolding -> OutType -> SimplM Unfolding --- Note [Setting the new unfolding] -simplStableUnfolding env top_lvl mb_cont id unf rhs_ty - = case unf of - NoUnfolding -> return unf - BootUnfolding -> return unf - OtherCon {} -> return unf - - DFunUnfolding { df_bndrs = bndrs, df_con = con, df_args = args } - -> do { (env', bndrs') <- simplBinders unf_env bndrs - ; args' <- mapM (simplExpr env') args - ; return (mkDFunUnfolding bndrs' con args') } - - CoreUnfolding { uf_tmpl = expr, uf_src = src, uf_guidance = guide } - | isStableSource src - -> do { expr' <- case mb_cont of -- See Note [Rules and unfolding for join points] - Just cont -> simplJoinRhs unf_env id expr cont - Nothing -> simplExprC unf_env expr (mkBoringStop rhs_ty) - ; case guide of - UnfWhen { ug_arity = arity - , ug_unsat_ok = sat_ok - , ug_boring_ok = boring_ok - } - -- Happens for INLINE things - -> let guide' = - UnfWhen { ug_arity = arity - , ug_unsat_ok = sat_ok - , ug_boring_ok = - boring_ok || inlineBoringOk expr' - } - -- Refresh the boring-ok flag, in case expr' - -- has got small. This happens, notably in the inlinings - -- for dfuns for single-method classes; see - -- Note [Single-method classes] in TcInstDcls. - -- A test case is #4138 - -- But retain a previous boring_ok of True; e.g. see - -- the way it is set in calcUnfoldingGuidanceWithArity - in return (mkCoreUnfolding src is_top_lvl expr' guide') - -- See Note [Top-level flag on inline rules] in GHC.Core.Unfold - - _other -- Happens for INLINABLE things - -> mkLetUnfolding dflags top_lvl src id expr' } - -- If the guidance is UnfIfGoodArgs, this is an INLINABLE - -- unfolding, and we need to make sure the guidance is kept up - -- to date with respect to any changes in the unfolding. - - | otherwise -> return noUnfolding -- Discard unstable unfoldings - where - dflags = seDynFlags env - is_top_lvl = isTopLevel top_lvl - act = idInlineActivation id - unf_env = updMode (updModeForStableUnfoldings act) env - -- See Note [Simplifying inside stable unfoldings] in SimplUtils - -{- -Note [Force bottoming field] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We need to force bottoming, or the new unfolding holds -on to the old unfolding (which is part of the id). - -Note [Setting the new unfolding] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -* If there's an INLINE pragma, we simplify the RHS gently. Maybe we - should do nothing at all, but simplifying gently might get rid of - more crap. - -* If not, we make an unfolding from the new RHS. But *only* for - non-loop-breakers. Making loop breakers not have an unfolding at all - means that we can avoid tests in exprIsConApp, for example. This is - important: if exprIsConApp says 'yes' for a recursive thing, then we - can get into an infinite loop - -If there's a stable unfolding on a loop breaker (which happens for -INLINABLE), we hang on to the inlining. It's pretty dodgy, but the -user did say 'INLINE'. May need to revisit this choice. - -************************************************************************ -* * - Rules -* * -************************************************************************ - -Note [Rules in a letrec] -~~~~~~~~~~~~~~~~~~~~~~~~ -After creating fresh binders for the binders of a letrec, we -substitute the RULES and add them back onto the binders; this is done -*before* processing any of the RHSs. This is important. Manuel found -cases where he really, really wanted a RULE for a recursive function -to apply in that function's own right-hand side. - -See Note [Forming Rec groups] in OccurAnal --} - -addBndrRules :: SimplEnv -> InBndr -> OutBndr - -> MaybeJoinCont -- Just k for a join point binder - -- Nothing otherwise - -> SimplM (SimplEnv, OutBndr) --- Rules are added back into the bin -addBndrRules env in_id out_id mb_cont - | null old_rules - = return (env, out_id) - | otherwise - = do { new_rules <- simplRules env (Just out_id) old_rules mb_cont - ; let final_id = out_id `setIdSpecialisation` mkRuleInfo new_rules - ; return (modifyInScope env final_id, final_id) } - where - old_rules = ruleInfoRules (idSpecialisation in_id) - -simplRules :: SimplEnv -> Maybe OutId -> [CoreRule] - -> MaybeJoinCont -> SimplM [CoreRule] -simplRules env mb_new_id rules mb_cont - = mapM simpl_rule rules - where - simpl_rule rule@(BuiltinRule {}) - = return rule - - simpl_rule rule@(Rule { ru_bndrs = bndrs, ru_args = args - , ru_fn = fn_name, ru_rhs = rhs }) - = do { (env', bndrs') <- simplBinders env bndrs - ; let rhs_ty = substTy env' (exprType rhs) - rhs_cont = case mb_cont of -- See Note [Rules and unfolding for join points] - Nothing -> mkBoringStop rhs_ty - Just cont -> ASSERT2( join_ok, bad_join_msg ) - cont - rule_env = updMode updModeForRules env' - fn_name' = case mb_new_id of - Just id -> idName id - Nothing -> fn_name - - -- join_ok is an assertion check that the join-arity of the - -- binder matches that of the rule, so that pushing the - -- continuation into the RHS makes sense - join_ok = case mb_new_id of - Just id | Just join_arity <- isJoinId_maybe id - -> length args == join_arity - _ -> False - bad_join_msg = vcat [ ppr mb_new_id, ppr rule - , ppr (fmap isJoinId_maybe mb_new_id) ] - - ; args' <- mapM (simplExpr rule_env) args - ; rhs' <- simplExprC rule_env rhs rhs_cont - ; return (rule { ru_bndrs = bndrs' - , ru_fn = fn_name' - , ru_args = args' - , ru_rhs = rhs' }) } diff --git a/compiler/simplCore/simplifier.tib b/compiler/simplCore/simplifier.tib deleted file mode 100644 index e0f9dc91f2..0000000000 --- a/compiler/simplCore/simplifier.tib +++ /dev/null @@ -1,771 +0,0 @@ -% Andre: -% -% - I'd like the transformation rules to appear clearly-identified in -% a box of some kind, so they can be distinguished from the examples. -% - - - -\documentstyle[slpj,11pt]{article} - -\renewcommand{\textfraction}{0.2} -\renewcommand{\floatpagefraction}{0.7} - -\begin{document} - -\title{How to simplify matters} - -\author{Simon Peyton Jones and Andre Santos\\ -Department of Computing Science, University of Glasgow, G12 8QQ \\ - @simonpj@@dcs.gla.ac.uk@ -} - -\maketitle - - -\section{Motivation} - -Quite a few compilers use the {\em compilation by transformation} idiom. -The idea is that as much of possible of the compilation process is -expressed as correctness-preserving transformations, each of which -transforms a program into a semantically-equivalent -program that (hopefully) executes more quickly or in less space. -Functional languages are particularly amenable to this approach because -they have a particularly rich family of possible transformations. -Examples of transformation-based compilers -include the Orbit compiler,[.kranz orbit thesis.] -Kelsey's compilers,[.kelsey thesis, hudak kelsey principles 1989.] -the New Jersey SML compiler,[.appel compiling with continuations.] -and the Glasgow Haskell compiler.[.ghc JFIT.] Of course many, perhaps most, -other compilers also use transformation to some degree. - -Compilation by transformation uses automatic transformations; that is, those -which can safely be applied automatically by a compiler. There -is also a whole approach to programming, which we might call {\em programming by transformation}, -in which the programmer manually transforms an inefficient specification into -an efficient program. This development process might be supported by -a programming environment in which does the book keeping, but the key steps -are guided by the programmer. We focus exclusively on automatic transformations -in this paper. - -Automatic program transformations seem to fall into two broad categories: -\begin{itemize} -\item {\bf Glamorous transformations} are global, sophisticated, -intellectually satisfying transformations, sometimes guided by some -interesting kind of analysis. -Examples include: -lambda lifting,[.johnsson lambda lifting.] -full laziness,[.hughes thesis, lester spe.] -closure conversion,[.appel jim 1989.] -deforestation,[.wadler 1990 deforestation, marlow wadler deforestation Glasgow92, chin phd 1990 march, gill launchbury.] -transformations based on strictness analysis,[.peyton launchbury unboxed.] -and so on. It is easy to write papers about these sorts of transformations. - -\item {\bf Humble transformations} are small, simple, local transformations, -which individually look pretty trivial. Here are two simple examples\footnote{ -The notation @E[]@ stands for an arbitrary expression with zero or more holes. -The notation @E[e]@ denotes @E[]@ with the holes filled in by the expression @e@. -We implicitly assume that no name-capture happens --- it's just -a short-hand, not an algorithm. -}: -@ - let x = y in E[x] ===> E[y] - - case (x:xs) of ===> E1[x,xs] - (y:ys) -> E1[y,ys] - [] -> E2 -@ -Transformations of this kind are almost embarrassingly simple. How could -anyone write a paper about them? -\end{itemize} -This paper is about humble transformations, and how to implement them. -Although each individual -transformation is simple enough, there is a scaling issue: -there are a large number of candidate transformations to consider, and -there are a very large number of opportunities to apply them. - -In the Glasgow Haskell compiler, all humble transformations -are performed by the so-called {\em simplifier}. -Our goal in this paper is to give an overview of how the simplifier works, what -transformations it applies, and what issues arose in its design. - -\section{The language} - -Mutter mutter. Important points: -\begin{itemize} -\item Second order lambda calculus. -\item Arguments are variables. -\item Unboxed data types, and unboxed cases. -\end{itemize} -Less important points: -\begin{itemize} -\item Constructors and primitives are saturated. -\item if-then-else desugared to @case@ -\end{itemize} - -Give data type. - -\section{Transformations} - -This section lists all the transformations implemented by the simplifier. -Because it is a complete list, it is a long one. -We content ourselves with a brief statement of each transformation, -augmented with forward references to Section~\ref{sect:composing} -which gives examples of the ways in which the transformations can compose together. - -\subsection{Beta reduction} - -If a lambda abstraction is applied to an argument, we can simply -beta-reduce. This applies equally to ordinary lambda abstractions and -type abstractions: -@ - (\x -> E[x]) arg ===> E[arg] - (/\a -> E[a]) ty ===> E[ty] -@ -There is no danger of duplicating work because the argument is -guaranteed to be a simple variable or literal. - -\subsubsection{Floating applications inward} - -Applications can be floated inside a @let(rec)@ or @case@ expression. -This is a good idea, because they might find a lambda abstraction inside -to beta-reduce with: -@ - (let(rec) Bind in E) arg ===> let(rec) Bind in (E arg) - - (case E of {P1 -> E1;...; Pn -> En}) arg - ===> - case E of {P1 -> E1 arg; ...; Pn -> En arg} -@ - - - -\subsection{Transformations concerning @let(rec)@} - -\subsubsection{Floating @let@ out of @let@} - -It is sometimes useful to float a @let(rec)@ out of a @let(rec)@ right-hand -side: -@ - let x = let(rec) Bind in B1 ===> let(rec) Bind in - in B2 let x = B1 - in B2 - - - letrec x = let(rec) Bind in B1 ===> let(rec) Bind - in B2 x = B1 - in B2 -@ - -\subsubsection{Floating @case@ out of @let@} - - -\subsubsection{@let@ to @case@} - - -\subsection{Transformations concerning @case@} - -\subsubsection{Case of known constructor} - -If a @case@ expression scrutinises a constructor, -the @case@ can be eliminated. This transformation is a real -win: it eliminates a whole @case@ expression. -@ - case (C a1 .. an) of ===> E[a1..an] - ... - C b1 .. bn -> E[b1..bn] - ... -@ -If none of the constructors in the alternatives match, then -the default is taken: -@ - case (C a1 .. an) of ===> let y = C a1 .. an - ...[no alt matches C]... in E - y -> E -@ -There is an important variant of this transformation when -the @case@ expression scrutinises a {\em variable} -which is known to be bound to a constructor. -This situation can -arise for two reasons: -\begin{itemize} -\item An enclosing @let(rec)@ binding binds the variable to a constructor. -For example: -@ - let x = C p q in ... (case x of ...) ... -@ -\item An enclosing @case@ expression scrutinises the same variable. -For example: -@ - case x of - ... - C p q -> ... (case x of ...) ... - ... -@ -This situation is particularly common, as we discuss in Section~\ref{sect:repeated-evals}. -\end{itemize} -In each of these examples, @x@ is known to be bound to @C p q@ -at the inner @case@. The general rules are: -@ - case x of {...; C b1 .. bn -> E[b1..bn]; ...} -===> {x bound to C a1 .. an} - E[a1..an] - - case x of {...[no alts match C]...; y -> E[y]} -===> {x bound to C a1 .. an} - E[x] -@ - -\subsubsection{Dead alternative elimination} -@ - case x of - C a .. z -> E - ...[other alts]... -===> x *not* bound to C - case x of - ...[other alts]... -@ -We might know that @x@ is not bound to a particular constructor -because of an enclosing case: -@ - case x of - C a .. z -> E1 - other -> E2 -@ -Inside @E1@ we know that @x@ is bound to @C@. -However, if the type has more than two constructors, -inside @E2@ all we know is that @x@ is {\em not} bound to @C@. - -This applies to unboxed cases also, in the obvious way. - -\subsubsection{Case elimination} - -If we can prove that @x@ is not bottom, then this rule applies. -@ - case x of ===> E[x] - y -> E[y] -@ -We might know that @x@ is non-bottom because: -\begin{itemize} -\item @x@ has an unboxed type. -\item There's an enclosing case which scrutinises @x@. -\item It is bound to an expression which provably terminates. -\end{itemize} -Since this transformation can only improve termination, even if we apply it -when @x@ is not provably non-bottom, we provide a compiler flag to -enable it all the time. - -\subsubsection{Case of error} - -@ - case (error ty E) of Alts ===> error ty' E - where - ty' is type of whole case expression -@ - -Mutter about types. Mutter about variables bound to error. -Mutter about disguised forms of error. - -\subsubsection{Floating @let(rec)@ out of @case@} - -A @let(rec)@ binding can be floated out of a @case@ scrutinee: -@ - case (let(rec) Bind in E) of Alts ===> let(rec) Bind in - case E of Alts -@ -This increases the likelihood of a case-of-known-constructor transformation, -because @E@ is not hidden from the @case@ by the @let(rec)@. - -\subsubsection{Floating @case@ out of @case@} - -Analogous to floating a @let(rec)@ from a @case@ scrutinee is -floating a @case@ from a @case@ scrutinee. We have to be -careful, though, about code size. If there's only one alternative -in the inner case, things are easy: -@ - case (case E of {P -> R}) of ===> case E of {P -> case R of - Q1 -> S1 Q1 -> S1 - ... ... - Qm -> Sm Qm -> Sm} -@ -If there's more than one alternative there's a danger -that we'll duplicate @S1@...@Sm@, which might be a lot of code. -Our solution is to create a new local definition for each -alternative: -@ - case (case E of {P1 -> R1; ...; Pn -> Rn}) of - Q1 -> S1 - ... - Qm -> Sm -===> - let s1 = \x1 ... z1 -> S1 - ... - sm = \xm ... zm -> Sm - in - case E of - P1 -> case R1 of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm} - ... - Pn -> case Rn of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm} -@ -Here, @x1 ... z1@ are that subset of -variables bound by the pattern @Q1@ which are free in @S1@, and -similarly for the other @si@. - -Is this transformation a win? After all, we have introduced @m@ new -functions! Section~\ref{sect:join-points} discusses this point. - -\subsubsection{Case merging} - -@ - case x of - ...[some alts]... - other -> case x of - ...[more alts]... -===> - case x of - ...[some alts]... - ...[more alts]... -@ -Any alternatives in @[more alts]@ which are already covered by @[some alts]@ -should first be eliminated by the dead-alternative transformation. - - -\subsection{Constructor reuse} - - -\subsection{Inlining} - -The inlining transformation is simple enough: -@ - let x = R in B[x] ===> B[R] -@ -Inlining is more conventionally used to describe the instantiation of a function -body at its call site, with arguments substituted for formal parameters. We treat -this as a two-stage process: inlining followed by beta reduction. Since we are -working with a higher-order language, not all the arguments may be available at every -call site, so separating inlining from beta reduction allows us to concentrate on -one problem at a time. - -The choice of exactly {\em which} bindings to inline has a major impact on efficiency. -Specifically, we need to consider the following factors: -\begin{itemize} -\item -Inlining a function at its call site, followed by some beta reduction, -very often exposes opportunities for further transformations. -We inline many simple arithmetic and boolean operators for this reason. -\item -Inlining can increase code size. -\item -Inlining can duplicate work, for example if a redex is inlined at more than one site. -Duplicating a single expensive redex can ruin a program's efficiency. -\end{itemize} - - -Our inlining strategy depends on the form of @R@: - -Mutter mutter. - - -\subsubsection{Dead code removal} - -If a @let@-bound variable is not used the binding can be dropped: -@ - let x = E in B ===> B - x not free in B -@ -A similar transformation applies for @letrec@-bound variables. -Programmers seldom write dead code, of course, but bindings often become dead when they -are inlined. - - - - -\section{Composing transformations} -\label{sect:composing} - -The really interesting thing about humble transformations is the way in which -they compose together to carry out substantial and useful transformations. -This section gives a collection of motivating examples, all of which have -shown up in real application programs. - -\subsection{Repeated evals} -\label{sect:repeated-evals} - -Example: x+x, as in unboxed paper. - - -\subsection{Lazy pattern matching} - -Lazy pattern matching is pretty inefficient. Consider: -@ - let (x,y) = E in B -@ -which desugars to: -@ - let t = E - x = case t of (x,y) -> x - y = case t of (x,y) -> y - in B -@ -This code allocates three thunks! However, if @B@ is strict in {\em either} -@x@ {\em or} @y@, then the strictness analyser will easily spot that -the binding for @t@ is strict, so we can do a @let@-to-@case@ transformation: -@ - case E of - (x,y) -> let t = (x,y) in - let x = case t of (x,y) -> x - y = case t of (x,y) -> y - in B -@ -whereupon the case-of-known-constructor transformation -eliminates the @case@ expressions in the right-hand side of @x@ and @y@, -and @t@ is then spotted as being dead, so we get -@ - case E of - (x,y) -> B -@ - -\subsection{Join points} -\label{sect:join-points} - -One motivating example is this: -@ - if (not x) then E1 else E2 -@ -After desugaring the conditional, and inlining the definition of -@not@, we get -@ - case (case x of True -> False; False -> True}) of - True -> E1 - False -> E2 -@ -Now, if we apply our case-of-case transformation we get: -@ - let e1 = E1 - e2 = E2 - in - case x of - True -> case False of {True -> e1; False -> e2} - False -> case True of {True -> e1; False -> e2} -@ -Now the case-of-known constructor transformation applies: -@ - let e1 = E1 - e2 = E2 - in - case x of - True -> e2 - False -> e1 -@ -Since there is now only one occurrence of @e1@ and @e2@ we can -inline them, giving just what we hoped for: -@ - case x of {True -> E2; False -> E1} -@ -The point is that the local definitions will often disappear again. - -\subsubsection{How join points occur} - -But what if they don't disappear? Then the definitions @s1@ ... @sm@ -play the role of ``join points''; they represent the places where -execution joins up again, having forked at the @case x@. The -``calls'' to the @si@ should really be just jumps. To see this more clearly -consider the expression -@ - if (x || y) then E1 else E2 -@ -A C compiler will ``short-circuit'' the -evaluation of the condition if @x@ turns out to be true -generate code, something like this: -@ - if (x) goto l1; - if (y) {...code for E2...} - l1: ...code for E1... -@ -In our setting, here's what will happen. First we desugar the -conditional, and inline the definition of @||@: -@ - case (case x of {True -> True; False -> y}) of - True -> E1 - False -> E2 -@ -Now apply the case-of-case transformation: -@ - let e1 = E1 - e2 = E2 - in - case x of - True -> case True of {True -> e1; False -> e2} - False -> case y of {True -> e1; False -> e2} -@ -Unlike the @not@ example, only one of the two inner case -simplifies, and we can therefore only inline @e2@, because -@e1@ is still mentioned twice\footnote{Unless the -inlining strategy decides that @E1@ is small enough to duplicate; -it is used in separate @case@ branches so there's no concern about duplicating -work. Here's another example of the way in which we make one part of the -simplifier (the inlining strategy) help with the work of another (@case@-expression -simplification.} -@ - let e1 = E1 - in - case x of - True -> e1 - False -> case y of {True -> e1; False -> e2} -@ -The code generator produces essentially the same code as -the C code given above. The binding for @e1@ turns into -just a label, which is jumped to from the two occurrences of @e1@. - -\subsubsection{Case of @error@} - -The case-of-error transformation is often exposed by the case-of-case -transformation. Consider -@ - case (hd xs) of - True -> E1 - False -> E2 -@ -After inlining @hd@, we get -@ - case (case xs of [] -> error "hd"; (x:_) -> x) of - True -> E1 - False -> E2 -@ -(I've omitted the type argument of @error@ to save clutter.) -Now doing case-of-case gives -@ - let e1 = E1 - e2 = E2 - in - case xs of - [] -> case (error "hd") of { True -> e1; False -> e2 } - (x:_) -> case x of { True -> e1; False -> e2 } -@ -Now the case-of-error transformation springs to life, after which -we can inline @e1@ and @e2@: -@ - case xs of - [] -> error "hd" - (x:_) -> case x of {True -> E1; False -> E2} -@ - -\subsection{Nested conditionals combined} - -Sometimes programmers write something which should be done -by a single @case@ as a sequence of tests: -@ - if x==0::Int then E0 else - if x==1 then E1 else - E2 -@ -After eliminating some redundant evals and doing the case-of-case -transformation we get -@ - case x of I# x# -> - case x# of - 0# -> E0 - other -> case x# of - 1# -> E1 - other -> E2 -@ -The case-merging transformation puts these together to get -@ - case x of I# x# -> - case x# of - 0# -> E0 - 1# -> E1 - other -> E2 -@ -Sometimes the sequence of tests cannot be eliminated from the source -code because of overloading: -@ - f :: Num a => a -> Bool - f 0 = True - f 3 = True - f n = False -@ -If we specialise @f@ to @Int@ we'll get the previous example again. - -\subsection{Error tests eliminated} - -The elimination of redundant alternatives, and then of redundant cases, -arises when we inline functions which do error checking. A typical -example is this: -@ - if (x `rem` y) == 0 then (x `div` y) else y -@ -Here, both @rem@ and @div@ do an error-check for @y@ being zero. -The second check is eliminated by the transformations. -After transformation the code becomes: -@ - case x of I# x# -> - case y of I# y# -> - case y of - 0# -> error "rem: zero divisor" - _ -> case x# rem# y# of - 0# -> case x# div# y# of - r# -> I# r# - _ -> y -@ - -\subsection{Atomic arguments} - -At this point it is possible to appreciate the usefulness of -the Core-language syntax requirement that arguments are atomic. -For example, suppose that arguments could be arbitrary expressions. -Here is a possible transformation: -@ - f (case x of (p,q) -> p) -===> f strict in its second argument - case x of (p,q) -> f (p,p) -@ -Doing this transformation would be useful, because now the -argument to @f@ is a simple variable rather than a thunk. -However, if arguments are atomic, this transformation becomes -just a special case of floating a @case@ out of a strict @let@: -@ - let a = case x of (p,q) -> p - in f a -===> (f a) strict in a - case x of (p,q) -> let a=p in f a -===> - case x of (p,q) -> f p -@ -There are many examples of this kind. For almost any transformation -involving @let@ there is a corresponding one involving a function -argument. The same effect is achieved with much less complexity -by restricting function arguments to be atomic. - -\section{Design} - -Dependency analysis -Occurrence analysis - -\subsection{Renaming and cloning} - -Every program-transformation system has to worry about name capture. -For example, here is an erroneous transformation: -@ - let y = E - in - (\x -> \y -> x + y) (y+3) -===> WRONG! - let y = E - in - (\y -> (y+3) + y) -@ -The transformation fails because the originally free-occurrence -of @y@ in the argument @y+3@ has been ``captured'' by the @\y@-abstraction. -There are various sophisticated solutions to this difficulty, but -we adopted a very simple one: we uniquely rename every locally-bound identifier -on every pass of the simplifier. -Since we are in any case producing an entirely new program (rather than side-effecting -an existing one) it costs very little extra to rename the identifiers as we go. - -So our example would become -@ - let y = E - in - (\x -> \y -> x + y) (y+3) -===> WRONG! - let y1 = E - in - (\y2 -> (y1+3) + y2) -@ -The simplifier accepts as input a program which has arbitrary bound -variable names, including ``shadowing'' (where a binding hides an -outer binding for the same identifier), but it produces a program in -which every bound identifier has a distinct name. - -Both the ``old'' and ``new'' identifiers have type @Id@, but when writing -type signatures for functions in the simplifier we use the types @InId@, for -identifiers from the input program, and @OutId@ for identifiers from the output program: -@ - type InId = Id - type OutId = Id -@ -This nomenclature extends naturally to expressions: a value of type @InExpr@ is an -expression whose identifiers are from the input-program name-space, and similarly -@OutExpr@. - - -\section{The simplifier} - -The basic algorithm followed by the simplifier is: -\begin{enumerate} -\item Analyse: perform occurrence analysis and dependency analysis. -\item Simplify: apply as many transformations as possible. -\item Iterate: perform the above two steps repeatedly until no further transformations are possible. -(A compiler flag allows the programmer to bound the maximum number of iterations.) -\end{enumerate} -We make a effort to apply as many transformations as possible in Step -2. To see why this is a good idea, just consider a sequence of -transformations in which each transformation enables the next. If -each iteration of Step 2 only performs one transformation, then the -entire program will to be re-analysed by Step 1, and re-traversed by -Step 2, for each transformation of the sequence. Sometimes this is -unavoidable, but it is often possible to perform a sequence of -transformtions in a single pass. - -The key function, which simplifies expressions, has the following type: -@ - simplExpr :: SimplEnv - -> InExpr -> [OutArg] - -> SmplM OutExpr -@ -The monad, @SmplM@ can quickly be disposed of. It has only two purposes: -\begin{itemize} -\item It plumbs around a supply of unique names, so that the simplifier can -easily invent new names. -\item It gathers together counts of how many of each kind of transformation -has been applied, for statistical purposes. These counts are also used -in Step 3 to decide when the simplification process has terminated. -\end{itemize} - -The signature can be understood like this: -\begin{itemize} -\item The environment, of type @SimplEnv@, provides information about -identifiers bound by the enclosing context. -\item The second and third arguments together specify the expression to be simplified. -\item The result is the simplified expression, wrapped up by the monad. -\end{itemize} -The simplifier's invariant is this: -$$ -@simplExpr@~env~expr~[a_1,\ldots,a_n] = expr[env]~a_1~\ldots~a_n -$$ -That is, the expression returned by $@simplExpr@~env~expr~[a_1,\ldots,a_n]$ -is semantically equal (although hopefully more efficient than) -$expr$, with the renamings in $env$ applied to it, applied to the arguments -$a_1,\ldots,a_n$. - -\subsection{Application and beta reduction} - -The arguments are carried ``inwards'' by @simplExpr@, as an accumulating parameter. -This is a convenient way of implementing the transformations which float -arguments inside a @let@ and @case@. This list of pending arguments -requires a new data type, @CoreArg@, along with its ``in'' and ``out'' synonyms, -because an argument might be a type or an atom: -@ -data CoreArg bindee = TypeArg UniType - | ValArg (CoreAtom bindee) - -type InArg = CoreArg InId -type OutArg = CoreArg OutId -@ -The equations for applications simply apply -the environment to the argument (to handle renaming) and put the result -on the argument stack, tagged to say whether it is a type argument or value argument: -@ - simplExpr env (CoApp fun arg) args - = simplExpr env fun (ValArg (simplAtom env arg) : args) - simplExpr env (CoTyApp fun ty) args - = simplExpr env fun (TypeArg (simplTy env ty) : args) -@ - - - - - - -\end{document} |