diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2018-12-12 17:22:07 +0000 |
---|---|---|
committer | Simon Peyton Jones <simonpj@microsoft.com> | 2018-12-12 17:38:25 +0000 |
commit | d77501cd5b9060e38acd50e11e0c5aae89d75b65 (patch) | |
tree | a6cba21b5adb5c04b476c92bf33436bd1a880981 /compiler | |
parent | ded4a1db4d61b1bc8b5fd73e8eb87cf572efda35 (diff) | |
download | haskell-d77501cd5b9060e38acd50e11e0c5aae89d75b65.tar.gz |
Improvements to demand analysis
This patch collects a few improvements triggered by Trac #15696,
and fixing Trac #16029
* Stop making toCleanDmd behave specially for unlifted types.
This special case was the cause of stupid behaviour in Trac
#16029. And to my joy I discovered the let/app invariant
rendered it unnecessary. (Maybe the special case pre-dated
the let/app invariant.)
Result: less special-case handling in the compiler, and
better perf for the compiled code.
* In WwLib.mkWWstr_one, treat seqDmd like U(AAA). It was not
being so treated before, which again led to stupid code.
* Update and improve Notes
There are .stderr test wibbles because we get slightly different
strictness signatures for an argumment of unlifted type:
<L,U> rather than <S,U> for Int#
<S,U> rather than <S(S),U(U)> for Int
Diffstat (limited to 'compiler')
-rw-r--r-- | compiler/basicTypes/Demand.hs | 124 | ||||
-rw-r--r-- | compiler/simplStg/StgLiftLams/Analysis.hs | 2 | ||||
-rw-r--r-- | compiler/stranal/DmdAnal.hs | 28 | ||||
-rw-r--r-- | compiler/stranal/WwLib.hs | 213 |
4 files changed, 201 insertions, 166 deletions
diff --git a/compiler/basicTypes/Demand.hs b/compiler/basicTypes/Demand.hs index 88845426a0..ff71027eb6 100644 --- a/compiler/basicTypes/Demand.hs +++ b/compiler/basicTypes/Demand.hs @@ -74,7 +74,7 @@ import BasicTypes import Binary import Maybes ( orElse ) -import Type ( Type, isUnliftedType ) +import Type ( Type ) import TyCon ( isNewTyCon, isClassTyCon ) import DataCon ( splitDataProductType_maybe ) @@ -393,10 +393,15 @@ data UseDmd -- (in that case, use UHead) | UHead -- ^ May be used but its sub-components are - -- definitely *not* used. Roughly U(AAA) - -- e.g. the usage of @x@ in @x `seq` e@ - -- A polymorphic demand: used for values of all types, - -- including a type variable + -- definitely *not* used. For product types, UHead + -- is equivalent to U(AAA); see mkUProd. + -- + -- UHead is needed only to express the demand + -- of 'seq' and 'case' which are polymorphic; + -- i.e. the scrutinised value is of type 'a' + -- rather than a product type. That's why we + -- can't use UProd [A,A,A] + -- -- Since (UCall _ Abs) is ill-typed, UHead doesn't -- make sense for lambdas @@ -1100,81 +1105,6 @@ different: unused, so we can use absDmd there. * Further arguments *can* be used, of course. Hence topDmd is used. -Note [Worthy functions for Worker-Wrapper split] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For non-bottoming functions a worker-wrapper transformation takes into -account several possibilities to decide if the function is worthy for -splitting: - -1. The result is of product type and the function is strict in some -(or even all) of its arguments. The check that the argument is used is -more of sanity nature, since strictness implies usage. Example: - -f :: (Int, Int) -> Int -f p = (case p of (a,b) -> a) + 1 - -should be splitted to - -f :: (Int, Int) -> Int -f p = case p of (a,b) -> $wf a - -$wf :: Int -> Int -$wf a = a + 1 - -2. Sometimes it also makes sense to perform a WW split if the -strictness analysis cannot say for sure if the function is strict in -components of its argument. Then we reason according to the inferred -usage information: if the function uses its product argument's -components, the WW split can be beneficial. Example: - -g :: Bool -> (Int, Int) -> Int -g c p = case p of (a,b) -> - if c then a else b - -The function g is strict in is argument p and lazy in its -components. However, both components are used in the RHS. The idea is -since some of the components (both in this case) are used in the -right-hand side, the product must presumable be taken apart. - -Therefore, the WW transform splits the function g to - -g :: Bool -> (Int, Int) -> Int -g c p = case p of (a,b) -> $wg c a b - -$wg :: Bool -> Int -> Int -> Int -$wg c a b = if c then a else b - -3. If an argument is absent, it would be silly to pass it to a -function, hence the worker with reduced arity is generated. - - -Note [Worker-wrapper for bottoming functions] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -We used not to split if the result is bottom. -[Justification: there's no efficiency to be gained.] - -But it's sometimes bad not to make a wrapper. Consider - fw = \x# -> let x = I# x# in case e of - p1 -> error_fn x - p2 -> error_fn x - p3 -> the real stuff -The re-boxing code won't go away unless error_fn gets a wrapper too. -[We don't do reboxing now, but in general it's better to pass an -unboxed thing to f, and have it reboxed in the error cases....] - -However we *don't* want to do this when the argument is not actually -taken apart in the function at all. Otherwise we risk decomposing a -massive tuple which is barely used. Example: - - f :: ((Int,Int) -> String) -> (Int,Int) -> a - f g pr = error (g pr) - - main = print (f fst (1, error "no")) - -Here, f does not take 'pr' apart, and it's stupid to do so. -Imagine that it had millions of fields. This actually happened -in GHC itself where the tuple was DynFlags - ************************************************************************ * * @@ -1406,25 +1336,20 @@ type DmdShell -- Describes the "outer shell" -- of a Demand = JointDmd (Str ()) (Use ()) -toCleanDmd :: Demand -> Type -> (DmdShell, CleanDemand) +toCleanDmd :: Demand -> (DmdShell, CleanDemand) -- Splits a Demand into its "shell" and the inner "clean demand" -toCleanDmd (JD { sd = s, ud = u }) expr_ty +toCleanDmd (JD { sd = s, ud = u }) = (JD { sd = ss, ud = us }, JD { sd = s', ud = u' }) -- See Note [Analyzing with lazy demand and lambdas] + -- See Note [Analysing with absent demand] where (ss, s') = case s of - Str x s' -> (Str x (), s') - Lazy | is_unlifted -> (Str VanStr (), HeadStr) - | otherwise -> (Lazy, HeadStr) + Str x s' -> (Str x (), s') + Lazy -> (Lazy, HeadStr) (us, u') = case u of - Use c u' -> (Use c (), u') - Abs | is_unlifted -> (Use One (), Used) - | otherwise -> (Abs, Used) - - is_unlifted = isUnliftedType expr_ty - -- See Note [Analysing with absent demand] - + Use c u' -> (Use c (), u') + Abs -> (Abs, Used) -- This is used in dmdAnalStar when post-processing -- a function's argument demand. So we only care about what @@ -1646,9 +1571,9 @@ There are several wrinkles: But we can post-process the results to ignore all the usage demands coming back. This is done by postProcessDmdType. -* But in the case of an *unlifted type* we must be extra careful, - because unlifted values are evaluated even if they are not used. - Example (see Trac #9254): +* In a previous incarnation of GHC we needed to be extra careful in the + case of an *unlifted type*, because unlifted values are evaluated + even if they are not used. Example (see Trac #9254): f :: (() -> (# Int#, () #)) -> () -- Strictness signature is -- <C(S(LS)), 1*C1(U(A,1*U()))> @@ -1668,10 +1593,11 @@ There are several wrinkles: usage of 'y', else 'g' will say 'y' is absent, and will w/w so that 'y' is bound to an aBSENT_ERROR thunk. - An alternative would be to replace the 'case y of ...' with (say) 0#, - but I have not tried that. It's not a common situation, but it is - not theoretical: unsafePerformIO's implementation is very very like - 'f' above. + However, the argument of toCleanDmd always satisfies the let/app + invariant; so if it is unlifted it is also okForSpeculation, and so + can be evaluated in a short finite time -- and that rules out nasty + cases like the one above. (I'm not quite sure why this was a + problem in an earlier version of GHC, but it isn't now.) ************************************************************************ diff --git a/compiler/simplStg/StgLiftLams/Analysis.hs b/compiler/simplStg/StgLiftLams/Analysis.hs index 5b87f58ce0..7fb60df0b0 100644 --- a/compiler/simplStg/StgLiftLams/Analysis.hs +++ b/compiler/simplStg/StgLiftLams/Analysis.hs @@ -342,7 +342,7 @@ rhsDmdShell bndr where is_thunk = idArity bndr == 0 -- Let's pray idDemandInfo is still OK after unarise... - (ds, cd) = toCleanDmd (idDemandInfo bndr) (idType bndr) + (ds, cd) = toCleanDmd (idDemandInfo bndr) tagSkeletonAlt :: CgStgAlt -> (Skeleton, IdSet, LlStgAlt) tagSkeletonAlt (con, bndrs, rhs) diff --git a/compiler/stranal/DmdAnal.hs b/compiler/stranal/DmdAnal.hs index 0b8133d98f..6e10c987a9 100644 --- a/compiler/stranal/DmdAnal.hs +++ b/compiler/stranal/DmdAnal.hs @@ -26,7 +26,7 @@ import BasicTypes import Data.List import DataCon import Id -import CoreUtils ( exprIsHNF, exprType, exprIsTrivial ) +import CoreUtils ( exprIsHNF, exprType, exprIsTrivial, exprOkForSpeculation ) import TyCon import Type import Coercion ( Coercion, coVarsOfCo ) @@ -140,11 +140,15 @@ dmdTransformThunkDmd e -- See ↦* relation in the Cardinality Analysis paper dmdAnalStar :: AnalEnv -> Demand -- This one takes a *Demand* - -> CoreExpr -> (BothDmdArg, CoreExpr) + -> CoreExpr -- Should obey the let/app invariatn + -> (BothDmdArg, CoreExpr) dmdAnalStar env dmd e - | (defer_and_use, cd) <- toCleanDmd dmd (exprType e) - , (dmd_ty, e') <- dmdAnal env cd e - = (postProcessDmdType defer_and_use dmd_ty, e') + | (dmd_shell, cd) <- toCleanDmd dmd + , (dmd_ty, e') <- dmdAnal env cd e + = ASSERT2( not (isUnliftedType (exprType e)) || exprOkForSpeculation e, ppr e ) + -- The argument 'e' should satisfy the let/app invariant + -- See Note [Analysing with absent demand] in Demand.hs + (postProcessDmdType dmd_shell dmd_ty, e') -- Main Demand Analsysis machinery dmdAnal, dmdAnal' :: AnalEnv @@ -170,19 +174,6 @@ dmdAnal' env dmd (Cast e co) where (dmd_ty, e') = dmdAnal env dmd e -{- ----- I don't get this, so commenting out ------- - to_co = pSnd (coercionKind co) - dmd' - | Just tc <- tyConAppTyCon_maybe to_co - , isRecursiveTyCon tc = cleanEvalDmd - | otherwise = dmd - -- This coerce usually arises from a recursive - -- newtype, and we don't want to look inside them - -- for exactly the same reason that we don't look - -- inside recursive products -- we might not reach - -- a fixpoint. So revert to a vanilla Eval demand --} - dmdAnal' env dmd (Tick t e) = (dmd_ty, Tick t e') where @@ -259,6 +250,7 @@ dmdAnal' env dmd (Case scrut case_bndr ty [(DataAlt dc, bndrs, rhs)]) -- pprTrace "dmdAnal:Case1" (vcat [ text "scrut" <+> ppr scrut -- , text "dmd" <+> ppr dmd -- , text "case_bndr_dmd" <+> ppr (idDemandInfo case_bndr') +-- , text "id_dmds" <+> ppr id_dmds -- , text "scrut_dmd" <+> ppr scrut_dmd -- , text "scrut_ty" <+> ppr scrut_ty -- , text "alt_ty" <+> ppr alt_ty2 diff --git a/compiler/stranal/WwLib.hs b/compiler/stranal/WwLib.hs index ce036c8c26..ef6be898c7 100644 --- a/compiler/stranal/WwLib.hs +++ b/compiler/stranal/WwLib.hs @@ -567,6 +567,7 @@ as-yet-un-filled-in pkgState files. -- brings into scope work_args (via cases) -- * work_fn assumes work_args are in scope, a -- brings into scope wrap_arg (via lets) +-- See Note [How to do the worker/wrapper split] mkWWstr_one :: DynFlags -> FamInstEnvs -> Bool -- True <=> INLINEABLE pragama on this function defn -- See Note [Do not unpack class dictionaries] @@ -576,43 +577,42 @@ mkWWstr_one dflags fam_envs has_inlineable_prag arg | isTyVar arg = return (False, [arg], nop_fn, nop_fn) - -- See Note [Worker-wrapper for bottoming functions] | isAbsDmd dmd , Just work_fn <- mk_absent_let dflags arg -- Absent case. We can't always handle absence for arbitrary -- unlifted types, so we need to choose just the cases we can - --- (that's what mk_absent_let does) + -- (that's what mk_absent_let does) = return (True, [], nop_fn, work_fn) - -- See Note [Worthy functions for Worker-Wrapper split] - | isSeqDmd dmd -- `seq` demand; evaluate in wrapper in the hope - -- of dropping seqs in the worker - = let arg_w_unf = arg `setIdUnfolding` evaldUnfolding - -- Tell the worker arg that it's sure to be evaluated - -- so that internal seqs can be dropped - in return (True, [arg_w_unf], mk_seq_case arg, nop_fn) - -- Pass the arg, anyway, even if it is in theory discarded - -- Consider - -- f x y = x `seq` y - -- x gets a (Eval (Poly Abs)) demand, but if we fail to pass it to the worker - -- we ABSOLUTELY MUST record that x is evaluated in the wrapper. - -- Something like: - -- f x y = x `seq` fw y - -- fw y = let x{Evald} = error "oops" in (x `seq` y) - -- If we don't pin on the "Evald" flag, the seq doesn't disappear, and - -- we end up evaluating the absent thunk. - -- But the Evald flag is pretty weird, and I worry that it might disappear - -- during simplification, so for now I've just nuked this whole case - | isStrictDmd dmd , Just cs <- splitProdDmd_maybe dmd -- See Note [Unpacking arguments with product and polymorphic demands] , not (has_inlineable_prag && isClassPred arg_ty) -- See Note [Do not unpack class dictionaries] - , Just (data_con, inst_tys, inst_con_arg_tys, co) - <- deepSplitProductType_maybe fam_envs arg_ty + , Just stuff@(_, _, inst_con_arg_tys, _) <- deepSplitProductType_maybe fam_envs arg_ty , cs `equalLength` inst_con_arg_tys -- See Note [mkWWstr and unsafeCoerce] + = unbox_one dflags fam_envs arg cs stuff + + | isSeqDmd dmd -- For seqDmd, splitProdDmd_maybe will return Nothing, but + -- it should behave like <S, U(AAAA)>, for some suitable arity + , Just stuff@(_, _, inst_con_arg_tys, _) <- deepSplitProductType_maybe fam_envs arg_ty + , let abs_dmds = map (const absDmd) inst_con_arg_tys + = unbox_one dflags fam_envs arg abs_dmds stuff + + | otherwise -- Other cases + = return (False, [arg], nop_fn, nop_fn) + + where + arg_ty = idType arg + dmd = idDemandInfo arg + +unbox_one :: DynFlags -> FamInstEnvs -> Var + -> [Demand] + -> (DataCon, [Type], [(Type, StrictnessMark)], Coercion) + -> UniqSM (Bool, [Var], CoreExpr -> CoreExpr, CoreExpr -> CoreExpr) +unbox_one dflags fam_envs arg cs + (data_con, inst_tys, inst_con_arg_tys, co) = do { (uniq1:uniqs) <- getUniquesM ; let -- See Note [Add demands for strict constructors] cs' = addDataConStrictness data_con cs @@ -627,13 +627,7 @@ mkWWstr_one dflags fam_envs has_inlineable_prag arg ; (_, worker_args, wrap_fn, work_fn) <- mkWWstr dflags fam_envs False unpk_args ; return (True, worker_args, unbox_fn . wrap_fn, work_fn . rebox_fn) } -- Don't pass the arg, rebox instead - - | otherwise -- Other cases - = return (False, [arg], nop_fn, nop_fn) - where - arg_ty = idType arg - dmd = idDemandInfo arg mk_ww_arg uniq ty sub_dmd = setIdDemandInfo (mk_ww_local uniq ty) sub_dmd ---------------------- @@ -647,11 +641,76 @@ addDataConStrictness con ds zipWith add ds strs where strs = dataConRepStrictness con - add dmd str | isMarkedStrict str - , not (isAbsDmd dmd) = dmd `bothDmd` seqDmd + add dmd str | isMarkedStrict str = strictifyDmd dmd | otherwise = dmd -{- +{- Note [How to do the worker/wrapper split] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The worker-wrapper transformation, mkWWstr_one, takes into account +several possibilities to decide if the function is worthy for +splitting: + +1. If an argument is absent, it would be silly to pass it to + the worker. Hence the isAbsDmd case. This case must come + first because a demand like <S,A> or <B,A> is possible. + E.g. <B,A> comes from a function like + f x = error "urk" + and <S,A> can come from Note [Add demands for strict constructors] + +2. If the argument is evaluated strictly, and we can split the + product demand (splitProdDmd_maybe), then unbox it and w/w its + pieces. For example + + f :: (Int, Int) -> Int + f p = (case p of (a,b) -> a) + 1 + is split to + f :: (Int, Int) -> Int + f p = case p of (a,b) -> $wf a + + $wf :: Int -> Int + $wf a = a + 1 + + and + g :: Bool -> (Int, Int) -> Int + g c p = case p of (a,b) -> + if c then a else b + is split to + g c p = case p of (a,b) -> $gw c a b + $gw c a b = if c then a else b + +2a But do /not/ split if the components are not used; that is, the + usage is just 'Used' rather than 'UProd'. In this case + splitProdDmd_maybe returns Nothing. Otherwise we risk decomposing + a massive tuple which is barely used. Example: + + f :: ((Int,Int) -> String) -> (Int,Int) -> a + f g pr = error (g pr) + + main = print (f fst (1, error "no")) + + Here, f does not take 'pr' apart, and it's stupid to do so. + Imagine that it had millions of fields. This actually happened + in GHC itself where the tuple was DynFlags + +3. A plain 'seqDmd', which is head-strict with usage UHead, can't + be split by splitProdDmd_maybe. But we want it to behave just + like U(AAAA) for suitable number of absent demands. So we have + a special case for it, with arity coming from the data constructor. + +Note [Worker-wrapper for bottoming functions] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We used not to split if the result is bottom. +[Justification: there's no efficiency to be gained.] + +But it's sometimes bad not to make a wrapper. Consider + fw = \x# -> let x = I# x# in case e of + p1 -> error_fn x + p2 -> error_fn x + p3 -> the real stuff +The re-boxing code won't go away unless error_fn gets a wrapper too. +[We don't do reboxing now, but in general it's better to pass an +unboxed thing to f, and have it reboxed in the error cases....] + Note [Add demands for strict constructors] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider this program (due to Roman): @@ -677,22 +736,36 @@ because X is strict, so its argument must be evaluated. And if we foo (X a) n = a `seq` go 0 +because the seq is discarded (very early) since X is strict! + So here's what we do -* We leave the demand-analysis alone. The demand on 'a' in the definition of - 'foo' is <L, U(U)>; the strictness info is Lazy because foo's body may or may - not evaluate 'a'; but the usage info says that 'a' is unpacked and its content - is used. +* We leave the demand-analysis alone. The demand on 'a' in the + definition of 'foo' is <L, U(U)>; the strictness info is Lazy + because foo's body may or may not evaluate 'a'; but the usage info + says that 'a' is unpacked and its content is used. -* During worker/wrapper, if we unpack a strict constructor (as we do for 'foo'), - we use 'strictifyDemand' to bump up the strictness on the strict arguments of - the data constructor. That in turn means that, if the usage info supports - doing so (i.e. splitProdDmd_maybe returns Just), we will unpack that argument +* During worker/wrapper, if we unpack a strict constructor (as we do + for 'foo'), we use 'addDataConStrictness' to bump up the strictness on + the strict arguments of the data constructor. + +* That in turn means that, if the usage info supports doing so + (i.e. splitProdDmd_maybe returns Just), we will unpack that argument -- even though the original demand (e.g. on 'a') was lazy. -The net effect is that the w/w transformation is more aggressive about unpacking -the strict arguments of a data constructor, when that eagerness is supported by -the usage info. +* What does "bump up the strictness" mean? Just add a head-strict + demand to the strictness! Even for a demand like <L,A> we can + safely turn it into <S,A>; remember case (1) of + Note [How to do the worker/wrapper split]. + +The net effect is that the w/w transformation is more aggressive about +unpacking the strict arguments of a data constructor, when that +eagerness is supported by the usage info. + +There is the usual danger of reboxing, which as usual we ignore. But +if X is monomorphic, and has an UNPACK pragma, then this optimisation +is even more important. We don't want the wrapper to rebox an unboxed +argument, and pass an Int to $wfoo! This works in nested situations like @@ -701,20 +774,64 @@ This works in nested situations like newtype instance Bar Int = Bar Int foo :: Bar ((Int, Int), Int) -> Int -> Int + foo f k = case f of BarPair x y -> + case burble of + True -> case x of + BarPair p q -> ... + False -> ... + +The extra eagerness lets us produce a worker of type: + $wfoo :: Int# -> Int# -> Int# -> Int -> Int + $wfoo p# q# y# = ... + +even though the `case x` is only lazily evaluated. + +--------- Historical note ------------ +We used to add data-con strictness demands when demand analysing case +expression. However, it was noticed in #15696 that this misses some cases. For +instance, consider the program (from T10482) + + data family Bar a + data instance Bar (a, b) = BarPair !(Bar a) !(Bar b) + newtype instance Bar Int = Bar Int + + foo :: Bar ((Int, Int), Int) -> Int -> Int foo f k = case f of BarPair x y -> case burble of - True -> case x of - BarPair p q -> ... - False -> ... + True -> case x of + BarPair p q -> ... + False -> ... -The extra eagerness lets us produce a worker of type: +We really should be able to assume that `p` is already evaluated since it came +from a strict field of BarPair. This strictness would allow us to produce a +worker of type: $wfoo :: Int# -> Int# -> Int# -> Int -> Int $wfoo p# q# y# = ... even though the `case x` is only lazily evaluated +Indeed before we fixed #15696 this would happen since we would float the inner +`case x` through the `case burble` to get: + + foo f k = + case f of + BarPair x y -> case x of + BarPair p q -> case burble of + True -> ... + False -> ... + +However, after fixing #15696 this could no longer happen (for the reasons +discussed in ticket:15696#comment:76). This means that the demand placed on `f` +would then be significantly weaker (since the False branch of the case on +`burble` is not strict in `p` or `q`). + +Consequently, we now instead account for data-con strictness in mkWWstr_one, +applying the strictness demands to the final result of DmdAnal. The result is +that we get the strict demand signature we wanted even if we can't float +the case on `x` up through the case on `burble`. + Note [mkWWstr and unsafeCoerce] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |