summaryrefslogtreecommitdiff
path: root/compiler/stranal
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2013-01-17 10:54:07 +0000
committerSimon Peyton Jones <simonpj@microsoft.com>2013-01-17 10:54:07 +0000
commit0831a12ea2fc73c33652eeec1adc79fa19700578 (patch)
tree6382f3cd4cb7070d101e22d7de2876aa8cbbbc39 /compiler/stranal
parentaef38d130b0ff74b0a5f2478be985e96b40c0f97 (diff)
downloadhaskell-0831a12ea2fc73c33652eeec1adc79fa19700578.tar.gz
Major patch to implement the new Demand Analyser
This patch is the result of Ilya Sergey's internship at MSR. It constitutes a thorough overhaul and simplification of the demand analyser. It makes a solid foundation on which we can now build. Main changes are * Instead of having one combined type for Demand, a Demand is now a pair (JointDmd) of - a StrDmd and - an AbsDmd. This allows strictness and absence to be though about quite orthogonally, and greatly reduces brain melt-down. * Similarly in the DmdResult type, it's a pair of - a PureResult (indicating only divergence/non-divergence) - a CPRResult (which deals only with the CPR property * In IdInfo, the strictnessInfo field contains a StrictSig, not a Maybe StrictSig demandInfo field contains a Demand, not a Maybe Demand We don't need Nothing (to indicate no strictness/demand info) any more; topSig/topDmd will do. * Remove "boxity" analysis entirely. This was an attempt to avoid "reboxing", but it added complexity, is extremely ad-hoc, and makes very little difference in practice. * Remove the "unboxing strategy" computation. This was an an attempt to ensure that a worker didn't get zillions of arguments by unboxing big tuples. But in fact removing it DRAMATICALLY reduces allocation in an inner loop of the I/O library (where the threshold argument-count had been set just too low). It's exceptional to have a zillion arguments and I don't think it's worth the complexity, especially since it turned out to have a serious performance hit. * Remove quite a bit of ad-hoc cruft * Move worthSplittingFun, worthSplittingThunk from WorkWrap to Demand. This allows JointDmd to be fully abstract, examined only inside Demand. Everything else really follows from these changes. All of this is really just refactoring, so we don't expect big performance changes, but acutally the numbers look quite good. Here is a full nofib run with some highlights identified: Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- expert -2.6% -15.5% 0.00 0.00 +0.0% fluid -2.4% -7.1% 0.01 0.01 +0.0% gg -2.5% -28.9% 0.02 0.02 -33.3% integrate -2.6% +3.2% +2.6% +2.6% +0.0% mandel2 -2.6% +4.2% 0.01 0.01 +0.0% nucleic2 -2.0% -16.3% 0.11 0.11 +0.0% para -2.6% -20.0% -11.8% -11.7% +0.0% parser -2.5% -17.9% 0.05 0.05 +0.0% prolog -2.6% -13.0% 0.00 0.00 +0.0% puzzle -2.6% +2.2% +0.8% +0.8% +0.0% sorting -2.6% -35.9% 0.00 0.00 +0.0% treejoin -2.6% -52.2% -9.8% -9.9% +0.0% -------------------------------------------------------------------------------- Min -2.7% -52.2% -11.8% -11.7% -33.3% Max -1.8% +4.2% +10.5% +10.5% +7.7% Geometric Mean -2.5% -2.8% -0.4% -0.5% -0.4% Things to note * Binary sizes are smaller. I don't know why, but it's good. * Allocation is sometiemes a *lot* smaller. I believe that all the big numbers (I checked treejoin, gg, sorting) arise from one place, namely a function GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of which have several arugments. Not w/w'ing both arguments (which is what we did before) has a big effect. So the big win in actually somewhat accidental, gained by removing the "unboxing strategy" code. * A couple of benchmarks allocate slightly more. This turns out to be due to reboxing (integrate). But the biggest increase is mandel2, and *that* turned out also to be a somewhat accidental loss of CSE, and pointed the way to doing better CSE: see Trac #7596. * Runtimes are never very reliable, but seem to improve very slightly. All in all, a good piece of work. Thank you Ilya!
Diffstat (limited to 'compiler/stranal')
-rw-r--r--compiler/stranal/DmdAnal.lhs1077
-rw-r--r--compiler/stranal/WorkWrap.lhs58
-rw-r--r--compiler/stranal/WwLib.lhs131
3 files changed, 441 insertions, 825 deletions
diff --git a/compiler/stranal/DmdAnal.lhs b/compiler/stranal/DmdAnal.lhs
index 0ecefa7565..9e38bb7c0d 100644
--- a/compiler/stranal/DmdAnal.lhs
+++ b/compiler/stranal/DmdAnal.lhs
@@ -8,59 +8,36 @@
\begin{code}
{-# OPTIONS -fno-warn-tabs #-}
--- The above warning supression flag is a temporary kludge.
--- While working on this module you are encouraged to remove it and
--- detab the module (please do the detabbing in a separate patch). See
--- http://hackage.haskell.org/trac/ghc/wiki/Commentary/CodingStyle#TabsvsSpaces
--- for details
-module DmdAnal ( dmdAnalPgm, dmdAnalTopRhs,
- both {- needed by WwLib -}
- ) where
+module DmdAnal ( dmdAnalProgram ) where
#include "HsVersions.h"
+import Var ( isTyVar )
import DynFlags
+import WwLib ( deepSplitProductType_maybe )
import Demand -- All of it
import CoreSyn
-import PprCore
-import Coercion ( isCoVarType )
-import CoreUtils ( exprIsHNF, exprIsTrivial )
-import CoreArity ( exprArity )
-import DataCon ( dataConTyCon, dataConRepStrictness, isMarkedStrict )
-import TyCon ( isProductTyCon, isRecursiveTyCon )
-import Id ( Id, idType, idInlineActivation,
- isDataConWorkId, isGlobalId, idArity,
- idStrictness,
- setIdStrictness, idDemandInfo, idUnfolding,
- idDemandInfo_maybe, setIdDemandInfo
- )
-import Var ( Var, isTyVar )
+import Outputable
import VarEnv
-import TysWiredIn ( unboxedPairDataCon )
-import TysPrim ( realWorldStatePrimTy )
-import UniqFM ( addToUFM_Directly, lookupUFM_Directly,
- minusUFM, filterUFM )
-import Type ( isUnLiftedType, eqType, tyConAppTyCon_maybe )
+import BasicTypes
+import FastString
+import Data.List
+import DataCon ( dataConTyCon, dataConRepStrictness, isMarkedStrict )
+import Id
+import CoreUtils ( exprIsHNF, exprIsTrivial )
+import PprCore
+import UniqFM ( filterUFM )
+import TyCon
+import Pair
+import Type ( eqType, tyConAppTyCon_maybe )
import Coercion ( coercionKind )
import Util
-import BasicTypes ( Arity, TopLevelFlag(..), isTopLevel, isNeverActive,
- RecFlag(..), isRec )
-import Maybes ( orElse, expectJust )
-import Outputable
-import Pair
-import Data.List
-import FastString
+import Maybes ( orElse )
+import TysWiredIn ( unboxedPairDataCon )
+import TysPrim ( realWorldStatePrimTy )
\end{code}
-To think about
-
-* set a noinline pragma on bottoming Ids
-
-* Consider f x = x+1 `fatbar` error (show x)
- We'd like to unbox x, even if that means reboxing it in the error case.
-
-
%************************************************************************
%* *
\subsection{Top level stuff}
@@ -68,8 +45,9 @@ To think about
%************************************************************************
\begin{code}
-dmdAnalPgm :: DynFlags -> CoreProgram -> IO CoreProgram
-dmdAnalPgm dflags binds
+
+dmdAnalProgram :: DynFlags -> CoreProgram -> IO CoreProgram
+dmdAnalProgram dflags binds
= do {
let { binds_plus_dmds = do_prog binds } ;
return binds_plus_dmds
@@ -78,6 +56,7 @@ dmdAnalPgm dflags binds
do_prog :: CoreProgram -> CoreProgram
do_prog binds = snd $ mapAccumL (dmdAnalTopBind dflags) emptySigEnv binds
+-- Analyse a (group of) top-level binding(s)
dmdAnalTopBind :: DynFlags
-> SigEnv
-> CoreBind
@@ -97,26 +76,7 @@ dmdAnalTopBind dflags sigs (Rec pairs)
(sigs', _, pairs') = dmdFix dflags TopLevel (virgin sigs) pairs
-- We get two iterations automatically
-- c.f. the NonRec case above
-\end{code}
-\begin{code}
-dmdAnalTopRhs :: DynFlags -> CoreExpr -> (StrictSig, CoreExpr)
--- Analyse the RHS and return
--- a) appropriate strictness info
--- b) the unfolding (decorated with strictness info)
-dmdAnalTopRhs dflags rhs
- = (sig, rhs2)
- where
- call_dmd = vanillaCall (exprArity rhs)
- (_, rhs1) = dmdAnal dflags (virgin emptySigEnv) call_dmd rhs
- (rhs_ty, rhs2) = dmdAnal dflags (nonVirgin emptySigEnv) call_dmd rhs1
- sig = mkTopSigTy dflags rhs rhs_ty
- -- Do two passes; see notes with extendSigsWithLam
- -- Otherwise we get bogus CPR info for constructors like
- -- newtype T a = MkT a
- -- The constructor looks like (\x::T a -> x), modulo the coerce
- -- extendSigsWithLam will optimistically give x a CPR tag the
- -- first time, which is wrong in the end.
\end{code}
%************************************************************************
@@ -125,34 +85,64 @@ dmdAnalTopRhs dflags rhs
%* *
%************************************************************************
+Note [Ensure demand is strict]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+It's important not to analyse e with a lazy demand because
+a) When we encounter case s of (a,b) ->
+ we demand s with U(d1d2)... but if the overall demand is lazy
+ that is wrong, and we'd need to reduce the demand on s,
+ which is inconvenient
+b) More important, consider
+ f (let x = R in x+x), where f is lazy
+ We still want to mark x as demanded, because it will be when we
+ enter the let. If we analyse f's arg with a Lazy demand, we'll
+ just mark x as Lazy
+c) The application rule wouldn't be right either
+ Evaluating (f x) in a L demand does *not* cause
+ evaluation of f in a C(L) demand!
+
+Note [Always analyse in virgin pass]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Tricky point: make sure that we analyse in the 'virgin' pass. Consider
+ rec { f acc x True = f (...rec { g y = ...g... }...)
+ f acc x False = acc }
+In the virgin pass for 'f' we'll give 'f' a very strict (bottom) type.
+That might mean that we analyse the sub-expression containing the
+E = "...rec g..." stuff in a bottom demand. Suppose we *didn't analyse*
+E, but just retuned botType.
+
+Then in the *next* (non-virgin) iteration for 'f', we might analyse E
+in a weaker demand, and that will trigger doing a fixpoint iteration
+for g. But *because it's not the virgin pass* we won't start g's
+iteration at bottom. Disaster. (This happened in $sfibToList' of
+nofib/spectral/fibheaps.)
+
+So in the virgin pass we make sure that we do analyse the expression
+at least once, to initialise its signatures.
+
\begin{code}
-dmdAnal :: DynFlags -> AnalEnv -> Demand -> CoreExpr -> (DmdType, CoreExpr)
+evalDmdAnal :: DynFlags -> AnalEnv -> CoreExpr -> (DmdType, CoreExpr)
+-- See Note [Ensure demand is strict]
+evalDmdAnal dflags env e
+ | (res_ty, e') <- dmdAnal dflags env evalDmd e
+ = (deferType res_ty, e')
+
+simpleDmdAnal :: DynFlags -> AnalEnv -> DmdType -> CoreExpr -> (DmdType, CoreExpr)
+simpleDmdAnal dflags env res_ty e
+ | ae_virgin env -- See Note [Always analyse in virgin pass]
+ , (_discarded_res_ty, e') <- dmdAnal dflags env evalDmd e
+ = (res_ty, e')
+ | otherwise
+ = (res_ty, e)
-dmdAnal _ _ Abs e = (topDmdType, e)
+dmdAnal :: DynFlags -> AnalEnv -> Demand -> CoreExpr -> (DmdType, CoreExpr)
+dmdAnal dflags env dmd e
+ | isBotDmd dmd = simpleDmdAnal dflags env botDmdType e
+ | isAbsDmd dmd = simpleDmdAnal dflags env topDmdType e
+ | not (isStrictDmd dmd) = evalDmdAnal dflags env e
-dmdAnal dflags env dmd e
- | not (isStrictDmd dmd)
- = let
- (res_ty, e') = dmdAnal dflags env evalDmd e
- in
- (deferType res_ty, e')
- -- It's important not to analyse e with a lazy demand because
- -- a) When we encounter case s of (a,b) ->
- -- we demand s with U(d1d2)... but if the overall demand is lazy
- -- that is wrong, and we'd need to reduce the demand on s,
- -- which is inconvenient
- -- b) More important, consider
- -- f (let x = R in x+x), where f is lazy
- -- We still want to mark x as demanded, because it will be when we
- -- enter the let. If we analyse f's arg with a Lazy demand, we'll
- -- just mark x as Lazy
- -- c) The application rule wouldn't be right either
- -- Evaluating (f x) in a L demand does *not* cause
- -- evaluation of f in a C(L) demand!
-
-
-dmdAnal _ _ _ (Lit lit) = (topDmdType, Lit lit)
-dmdAnal _ _ _ (Type ty) = (topDmdType, Type ty) -- Doesn't happen, in fact
+dmdAnal _ _ _ (Lit lit) = (topDmdType, Lit lit)
+dmdAnal _ _ _ (Type ty) = (topDmdType, Type ty) -- Doesn't happen, in fact
dmdAnal _ _ _ (Coercion co) = (topDmdType, Coercion co)
dmdAnal _ env dmd (Var var)
@@ -191,12 +181,20 @@ dmdAnal dflags sigs dmd (App fun (Coercion co))
-- Lots of the other code is there to make this
-- beautiful, compositional, application rule :-)
dmdAnal dflags env dmd (App fun arg) -- Non-type arguments
- = let -- [Type arg handled above]
- (fun_ty, fun') = dmdAnal dflags env (Call dmd) fun
+ = let -- [Type arg handled above]
+ (fun_ty, fun') = dmdAnal dflags env (mkCallDmd dmd) fun
(arg_ty, arg') = dmdAnal dflags env arg_dmd arg
(arg_dmd, res_ty) = splitDmdTy fun_ty
in
- (res_ty `bothType` arg_ty, App fun' arg')
+-- pprTrace "dmdAnal:app" (vcat
+-- [ text "dmd =" <+> ppr dmd
+-- , text "expr =" <+> ppr (App fun arg)
+-- , text "fun dmd_ty =" <+> ppr fun_ty
+-- , text "arg dmd =" <+> ppr arg_dmd
+-- , text "arg dmd_ty =" <+> ppr arg_ty
+-- , text "res dmd_ty =" <+> ppr res_ty
+-- , text "overall res dmd_ty =" <+> ppr (res_ty `bothDmdType` arg_ty) ])
+ (res_ty `bothDmdType` arg_ty, App fun' arg')
dmdAnal dflags env dmd (Lam var body)
| isTyVar var
@@ -205,7 +203,7 @@ dmdAnal dflags env dmd (Lam var body)
in
(body_ty, Lam var body')
- | Call body_dmd <- dmd -- A call demand: good!
+ | Just body_dmd <- peelCallDmd dmd -- A call demand: good!
= let
env' = extendSigsWithLam env var
(body_ty, body') = dmdAnal dflags env' body_dmd body
@@ -221,11 +219,12 @@ dmdAnal dflags env dmd (Lam var body)
(deferType lam_ty, Lam var' body')
dmdAnal dflags env dmd (Case scrut case_bndr ty [alt@(DataAlt dc, _, _)])
+ -- Only one alternative with a product constructor
| let tycon = dataConTyCon dc
, isProductTyCon tycon
, not (isRecursiveTyCon tycon)
= let
- env_alt = extendAnalEnv NotTopLevel env case_bndr case_bndr_sig
+ env_alt = extendAnalEnv NotTopLevel env case_bndr case_bndr_sig
(alt_ty, alt') = dmdAnalAlt dflags env_alt dmd alt
(alt_ty1, case_bndr') = annotateBndr alt_ty case_bndr
(_, bndrs', _) = alt'
@@ -249,7 +248,7 @@ dmdAnal dflags env dmd (Case scrut case_bndr ty [alt@(DataAlt dc, _, _)])
-- x = (a, absent-error)
-- and that'll crash.
-- So at one stage I had:
- -- dead_case_bndr = isAbsentDmd (idDemandInfo case_bndr')
+ -- dead_case_bndr = isAbsDmd (idDemandInfo case_bndr')
-- keepity | dead_case_bndr = Drop
-- | otherwise = Keep
--
@@ -260,25 +259,29 @@ dmdAnal dflags env dmd (Case scrut case_bndr ty [alt@(DataAlt dc, _, _)])
-- The insight is, of course, that a demand on y is a demand on the
-- scrutinee, so we need to `both` it with the scrut demand
- alt_dmd = Eval (Prod [idDemandInfo b | b <- bndrs', isId b])
- scrut_dmd = alt_dmd `both`
+ alt_dmd = mkProdDmd [idDemandInfo b | b <- bndrs', isId b]
+ scrut_dmd = alt_dmd `bothDmd`
idDemandInfo case_bndr'
(scrut_ty, scrut') = dmdAnal dflags env scrut_dmd scrut
- res_ty = alt_ty1 `bothType` scrut_ty
+ res_ty = alt_ty1 `bothDmdType` scrut_ty
in
-- pprTrace "dmdAnal:Case1" (vcat [ text "scrut" <+> ppr scrut
--- , text "scrut_ty" <+> ppr scrut_ty
--- , text "alt_ty" <+> ppr alt_ty1
--- , text "res_ty" <+> ppr res_ty ]) $
+-- , text "dmd" <+> ppr dmd
+-- , text "alt_dmd" <+> ppr alt_dmd
+-- , text "case_bndr_dmd" <+> ppr (idDemandInfo case_bndr')
+-- , text "scrut_dmd" <+> ppr scrut_dmd
+-- , text "scrut_ty" <+> ppr scrut_ty
+-- , text "alt_ty" <+> ppr alt_ty1
+-- , text "res_ty" <+> ppr res_ty ]) $
(res_ty, Case scrut' case_bndr' ty [alt'])
dmdAnal dflags env dmd (Case scrut case_bndr ty alts)
- = let
+ = let -- Case expression with multiple alternatives
(alt_tys, alts') = mapAndUnzip (dmdAnalAlt dflags env dmd) alts
(scrut_ty, scrut') = dmdAnal dflags env evalDmd scrut
- (alt_ty, case_bndr') = annotateBndr (foldr lubType botDmdType alt_tys) case_bndr
- res_ty = alt_ty `bothType` scrut_ty
+ (alt_ty, case_bndr') = annotateBndr (foldr lubDmdType botDmdType alt_tys) case_bndr
+ res_ty = alt_ty `bothDmdType` scrut_ty
in
-- pprTrace "dmdAnal:Case2" (vcat [ text "scrut" <+> ppr scrut
-- , text "scrut_ty" <+> ppr scrut_ty
@@ -331,7 +334,7 @@ dmdAnalAlt dflags env dmd (con,bndrs,rhs)
(rhs_ty, rhs') = dmdAnal dflags env dmd rhs
rhs_ty' = addDataConPatDmds con bndrs rhs_ty
(alt_ty, bndrs') = annotateBndrs rhs_ty' bndrs
- final_alt_ty | io_hack_reqd = alt_ty `lubType` topDmdType
+ final_alt_ty | io_hack_reqd = alt_ty `lubDmdType` topDmdType
| otherwise = alt_ty
-- There's a hack here for I/O operations. Consider
@@ -363,45 +366,13 @@ addDataConPatDmds (LitAlt _) _ dmd_ty = dmd_ty
addDataConPatDmds (DataAlt con) bndrs dmd_ty
= foldr add dmd_ty str_bndrs
where
- add bndr dmd_ty = addVarDmd dmd_ty bndr seqDmd
+ add bndr dmd_ty = addVarDmd dmd_ty bndr absDmd
str_bndrs = [ b | (b,s) <- zipEqual "addDataConPatBndrs"
(filter isId bndrs)
(dataConRepStrictness con)
, isMarkedStrict s ]
-\end{code}
-
-Note [Add demands for strict constructors]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider this program (due to Roman):
-
- data X a = X !a
-
- foo :: X Int -> Int -> Int
- foo (X a) n = go 0
- where
- go i | i < n = a + go (i+1)
- | otherwise = 0
-
-We want the worker for 'foo' too look like this:
-
- $wfoo :: Int# -> Int# -> Int#
-
-with the first argument unboxed, so that it is not eval'd each time
-around the loop (which would otherwise happen, since 'foo' is not
-strict in 'a'. It is sound for the wrapper to pass an unboxed arg
-because X is strict, so its argument must be evaluated. And if we
-*don't* pass an unboxed argument, we can't even repair it by adding a
-`seq` thus:
-
- foo (X a) n = a `seq` go 0
-
-because the seq is discarded (very early) since X is strict!
-
-There is the usual danger of reboxing, which as usual we ignore. But
-if X is monomorphic, and has an UNPACK pragma, then this optimisation
-is even more important. We don't want the wrapper to rebox an unboxed
-argument, and pass an Int to $wfoo!
+\end{code}
%************************************************************************
%* *
@@ -418,67 +389,21 @@ dmdTransform :: AnalEnv -- The strictness environment
-- this function plus demand on its free variables
dmdTransform env var dmd
+ | isDataConWorkId var -- Data constructor
+ = dmdTransformDataConSig
+ (idArity var) (idStrictness var) dmd
------- DATA CONSTRUCTOR
- | isDataConWorkId var -- Data constructor
- = let
- StrictSig dmd_ty = idStrictness var -- It must have a strictness sig
- DmdType _ _ con_res = dmd_ty
- arity = idArity var
- in
- if arity == call_depth then -- Saturated, so unleash the demand
- let
- -- Important! If we Keep the constructor application, then
- -- we need the demands the constructor places (always lazy)
- -- If not, we don't need to. For example:
- -- f p@(x,y) = (p,y) -- S(AL)
- -- g a b = f (a,b)
- -- It's vital that we don't calculate Absent for a!
- dmd_ds = case res_dmd of
- Box (Eval ds) -> mapDmds box ds
- Eval ds -> ds
- _ -> Poly Top
-
- -- ds can be empty, when we are just seq'ing the thing
- -- If so we must make up a suitable bunch of demands
- arg_ds = case dmd_ds of
- Poly d -> replicate arity d
- Prod ds -> ASSERT( ds `lengthIs` arity ) ds
-
- in
- mkDmdType emptyDmdEnv arg_ds con_res
- -- Must remember whether it's a product, hence con_res, not TopRes
- else
- topDmdType
-
------- IMPORTED FUNCTION
- | isGlobalId var, -- Imported function
- let StrictSig dmd_ty = idStrictness var
- = -- pprTrace "strict-sig" (ppr var $$ ppr dmd_ty) $
- if dmdTypeDepth dmd_ty <= call_depth then -- Saturated, so unleash the demand
- dmd_ty
- else
- topDmdType
-
------- LOCAL LET/REC BOUND THING
- | Just (StrictSig dmd_ty, top_lvl) <- lookupSigEnv env var
- = let
- fn_ty | dmdTypeDepth dmd_ty <= call_depth = dmd_ty
- | otherwise = deferType dmd_ty
- -- NB: it's important to use deferType, and not just return topDmdType
- -- Consider let { f x y = p + x } in f 1
- -- The application isn't saturated, but we must nevertheless propagate
- -- a lazy demand for p!
- in
- if isTopLevel top_lvl then fn_ty -- Don't record top level things
+ | isGlobalId var -- Imported function
+ = dmdTransformSig (idStrictness var) dmd
+
+ | Just (sig, top_lvl) <- lookupSigEnv env var -- Local letrec bound thing
+ , let fn_ty = dmdTransformSig sig dmd
+ = if isTopLevel top_lvl
+ then fn_ty -- Don't record top level things
else addVarDmd fn_ty var dmd
------- LOCAL NON-LET/REC BOUND THING
- | otherwise -- Default case
+ | otherwise -- Local non-letrec-bound thing
= unitVarDmd var dmd
-
- where
- (call_depth, res_dmd) = splitCallDmd dmd
\end{code}
%************************************************************************
@@ -488,6 +413,8 @@ dmdTransform env var dmd
%************************************************************************
\begin{code}
+
+-- Recursive bindings
dmdFix :: DynFlags
-> TopLevelFlag
-> AnalEnv -- Does not include bindings for this binding
@@ -545,19 +472,19 @@ dmdFix dflags top_lvl env orig_pairs
= ((sigs', lazy_fv'), pair')
where
(sigs', lazy_fv1, pair') = dmdAnalRhs dflags top_lvl Recursive (updSigEnv env sigs) (id,rhs)
- lazy_fv' = plusVarEnv_C both lazy_fv lazy_fv1
+ lazy_fv' = plusVarEnv_C bothDmd lazy_fv lazy_fv1
same_sig sigs sigs' var = lookup sigs var == lookup sigs' var
lookup sigs var = case lookupVarEnv sigs var of
Just (sig,_) -> sig
Nothing -> pprPanic "dmdFix" (ppr var)
+-- Non-recursive bindings
dmdAnalRhs :: DynFlags -> TopLevelFlag -> RecFlag
-> AnalEnv -> (Id, CoreExpr)
-> (SigEnv, DmdEnv, (Id, CoreExpr))
-- Process the RHS of the binding, add the strictness signature
-- to the Id, and augment the environment with the signature as well.
-
dmdAnalRhs dflags top_lvl rec_flag env (id, rhs)
= (sigs', lazy_fv, (id', rhs'))
where
@@ -565,13 +492,13 @@ dmdAnalRhs dflags top_lvl rec_flag env (id, rhs)
-- The simplifier was run just beforehand
(rhs_dmd_ty, rhs') = dmdAnal dflags env (vanillaCall arity) rhs
(lazy_fv, sig_ty) = WARN( arity /= dmdTypeDepth rhs_dmd_ty && not (exprIsTrivial rhs), ppr id )
- -- The RHS can be eta-reduced to just a variable,
- -- in which case we should not complain.
- mkSigTy dflags top_lvl rec_flag id rhs rhs_dmd_ty
+ -- The RHS can be eta-reduced to just a variable,
+ -- in which case we should not complain.
+ mkSigTy top_lvl rec_flag env id rhs rhs_dmd_ty
id' = id `setIdStrictness` sig_ty
sigs' = extendSigEnv top_lvl (sigEnv env) id sig_ty
-\end{code}
+\end{code}
%************************************************************************
%* *
@@ -580,26 +507,134 @@ dmdAnalRhs dflags top_lvl rec_flag env (id, rhs)
%************************************************************************
\begin{code}
-mkTopSigTy :: DynFlags -> CoreExpr -> DmdType -> StrictSig
- -- Take a DmdType and turn it into a StrictSig
- -- NB: not used for never-inline things; hence False
-mkTopSigTy dflags rhs dmd_ty = snd (mk_sig_ty dflags False False rhs dmd_ty)
-
-mkSigTy :: DynFlags -> TopLevelFlag -> RecFlag -> Id -> CoreExpr -> DmdType -> (DmdEnv, StrictSig)
-mkSigTy dflags top_lvl rec_flag id rhs dmd_ty
- = mk_sig_ty dflags never_inline thunk_cpr_ok rhs dmd_ty
+unitVarDmd :: Var -> Demand -> DmdType
+unitVarDmd var dmd
+ = DmdType (unitVarEnv var dmd) [] topRes
+
+addVarDmd :: DmdType -> Var -> Demand -> DmdType
+addVarDmd (DmdType fv ds res) var dmd
+ = DmdType (extendVarEnv_C bothDmd fv var dmd) ds res
+
+addLazyFVs :: DmdType -> DmdEnv -> DmdType
+addLazyFVs (DmdType fv ds res) lazy_fvs
+ = DmdType both_fv1 ds res
where
- never_inline = isNeverActive (idInlineActivation id)
- maybe_id_dmd = idDemandInfo_maybe id
- -- Is Nothing the first time round
+ both_fv = plusVarEnv_C bothDmd fv lazy_fvs
+ both_fv1 = modifyEnv (isBotRes res) (`bothDmd` botDmd) lazy_fvs fv both_fv
+ -- This modifyEnv is vital. Consider
+ -- let f = \x -> (x,y)
+ -- in error (f 3)
+ -- Here, y is treated as a lazy-fv of f, but we must `bothDmd` that L
+ -- demand with the bottom coming up from 'error'
+ --
+ -- I got a loop in the fixpointer without this, due to an interaction
+ -- with the lazy_fv filtering in mkSigTy. Roughly, it was
+ -- letrec f n x
+ -- = letrec g y = x `fatbar`
+ -- letrec h z = z + ...g...
+ -- in h (f (n-1) x)
+ -- in ...
+ -- In the initial iteration for f, f=Bot
+ -- Suppose h is found to be strict in z, but the occurrence of g in its RHS
+ -- is lazy. Now consider the fixpoint iteration for g, esp the demands it
+ -- places on its free variables. Suppose it places none. Then the
+ -- x `fatbar` ...call to h...
+ -- will give a x->V demand for x. That turns into a L demand for x,
+ -- which floats out of the defn for h. Without the modifyEnv, that
+ -- L demand doesn't get both'd with the Bot coming up from the inner
+ -- call to f. So we just get an L demand for x for g.
+ --
+ -- A better way to say this is that the lazy-fv filtering should give the
+ -- same answer as putting the lazy fv demands in the function's type.
+
+
+removeFV :: DmdEnv -> Var -> DmdResult -> (DmdEnv, Demand)
+removeFV fv id res = -- pprTrace "rfv" (ppr id <+> ppr dmd $$ ppr fv)
+ (fv', dmd)
+ where
+ fv' = fv `delVarEnv` id
+ dmd = lookupVarEnv fv id `orElse` deflt
+ -- See note [Default demand for variables]
+ deflt | isBotRes res = botDmd
+ | otherwise = absDmd
+\end{code}
+
+Note [Default demand for variables]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the variable is not mentioned in the environment of a demand type,
+its demand is taken to be a result demand of the type: either L or the
+bottom. Both are safe from the semantical pont of view, however, for
+the safe result we also have absent demand set to Abs, which makes it
+possible to safely ignore non-mentioned variables (their joint demand
+is <L,A>).
+
+\begin{code}
+annotateBndr :: DmdType -> Var -> (DmdType, Var)
+-- The returned env has the var deleted
+-- The returned var is annotated with demand info
+-- according to the result demand of the provided demand type
+-- No effect on the argument demands
+annotateBndr dmd_ty@(DmdType fv ds res) var
+ | isTyVar var = (dmd_ty, var)
+ | otherwise = (DmdType fv' ds res, setIdDemandInfo var dmd)
+ where
+ (fv', dmd) = removeFV fv var res
+
+annotateBndrs :: DmdType -> [Var] -> (DmdType, [Var])
+annotateBndrs = mapAccumR annotateBndr
+
+annotateLamIdBndr :: DynFlags
+ -> AnalEnv
+ -> DmdType -- Demand type of body
+ -> Id -- Lambda binder
+ -> (DmdType, -- Demand type of lambda
+ Id) -- and binder annotated with demand
+
+annotateLamIdBndr dflags env (DmdType fv ds res) id
+-- For lambdas we add the demand to the argument demands
+-- Only called for Ids
+ = ASSERT( isId id )
+ (final_ty, setIdDemandInfo id dmd)
+ where
+ -- Watch out! See note [Lambda-bound unfoldings]
+ final_ty = case maybeUnfoldingTemplate (idUnfolding id) of
+ Nothing -> main_ty
+ Just unf -> main_ty `bothDmdType` unf_ty
+ where
+ (unf_ty, _) = dmdAnal dflags env dmd unf
+
+ main_ty = DmdType fv' (dmd:ds) res
+
+ (fv', dmd) = removeFV fv id res
+
+mkSigTy :: TopLevelFlag -> RecFlag -> AnalEnv -> Id ->
+ CoreExpr -> DmdType -> (DmdEnv, StrictSig)
+mkSigTy top_lvl rec_flag env id rhs (DmdType fv dmds res)
+ = (lazy_fv, mkStrictSig dmd_ty)
+ -- See Note [NOINLINE and strictness]
+ where
+ dmd_ty = mkDmdType strict_fv dmds res'
+
+ -- See Note [Lazy and strict free variables]
+ lazy_fv = filterUFM (not . isStrictDmd) fv
+ strict_fv = filterUFM isStrictDmd fv
+
+ ignore_cpr_info = not (exprIsHNF rhs || thunk_cpr_ok)
+ res' = if returnsCPR res && ignore_cpr_info
+ then topRes
+ else res
+ -- Is it okay or not to assign CPR
+ -- (not okay in the first pass)
thunk_cpr_ok -- See Note [CPR for thunks]
- | isTopLevel top_lvl = False -- Top level things don't get
- -- their demandInfo set at all
- | isRec rec_flag = False -- Ditto recursive things
- | Just dmd <- maybe_id_dmd = isStrictDmd dmd
- | otherwise = True -- Optimistic, first time round
- -- See notes below
+ | isTopLevel top_lvl = False -- Top level things don't get
+ -- their demandInfo set at all
+ | isRec rec_flag = False -- Ditto recursive things
+ | ae_virgin env = True -- Optimistic, first time round
+ -- See Note [Optimistic CPR in the "virgin" case]
+ | isStrictDmd (idDemandInfo id) = True
+ | otherwise = False
\end{code}
Note [CPR for thunks]
@@ -663,15 +698,15 @@ have a CPR in it or not. Simple solution:
NB: strictly_demanded is never true of a top-level Id, or of a recursive Id.
-Note [Optimistic in the Nothing case]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Demand info now has a 'Nothing' state, just like strictness info.
-The analysis works from 'dangerous' towards a 'safe' state; so we
-start with botSig for 'Nothing' strictness infos, and we start with
-"yes, it's demanded" for 'Nothing' in the demand info. The
-fixpoint iteration will sort it all out.
+Note [Optimistic CPR in the "virgin" case]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Demand and strictness info are initialized by top elements. However,
+this prevents from inferring a CPR property in the first pass of the
+analyser, so we keep an explicit flag ae_virgin in the AnalEnv
+datatype.
-We can't start with 'not-demanded' because then consider
+We can't start with 'not-demanded' (i.e., top) because then consider
f x = let
t = ... I# x
in
@@ -680,9 +715,9 @@ We can't start with 'not-demanded' because then consider
In the first iteration we'd have no demand info for x, so assume
not-demanded; then we'd get TopRes for f's CPR info. Next iteration
we'd see that t was demanded, and so give it the CPR property, but by
-now f has TopRes, so it will stay TopRes. Instead, with the Nothing
-setting the first time round, we say 'yes t is demanded' the first
-time.
+now f has TopRes, so it will stay TopRes. Instead, by checking the
+ae_virgin flag at the first time round, we say 'yes t is demanded' the
+first time.
However, this does mean that for non-recursive bindings we must
iterate twice to be sure of not getting over-optimistic CPR info,
@@ -726,204 +761,40 @@ strictness. For example, if you have a function implemented by an
error stub, but which has RULES, you may want it not to be eliminated
in favour of error!
+Note [Lazy and strict free variables]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{code}
-mk_sig_ty :: DynFlags -> Bool -> Bool -> CoreExpr
- -> DmdType -> (DmdEnv, StrictSig)
-mk_sig_ty dflags _never_inline thunk_cpr_ok rhs (DmdType fv dmds res)
- = (lazy_fv, mkStrictSig dmd_ty)
- -- Re unused never_inline, see Note [NOINLINE and strictness]
- where
- dmd_ty = DmdType strict_fv final_dmds res'
-
- lazy_fv = filterUFM (not . isStrictDmd) fv
- strict_fv = filterUFM isStrictDmd fv
- -- We put the strict FVs in the DmdType of the Id, so
- -- that at its call sites we unleash demands on its strict fvs.
- -- An example is 'roll' in imaginary/wheel-sieve2
- -- Something like this:
- -- roll x = letrec
- -- go y = if ... then roll (x-1) else x+1
- -- in
- -- go ms
- -- We want to see that roll is strict in x, which is because
- -- go is called. So we put the DmdEnv for x in go's DmdType.
- --
- -- Another example:
- -- f :: Int -> Int -> Int
- -- f x y = let t = x+1
- -- h z = if z==0 then t else
- -- if z==1 then x+1 else
- -- x + h (z-1)
- -- in
- -- h y
- -- Calling h does indeed evaluate x, but we can only see
- -- that if we unleash a demand on x at the call site for t.
- --
- -- Incidentally, here's a place where lambda-lifting h would
- -- lose the cigar --- we couldn't see the joint strictness in t/x
- --
- -- ON THE OTHER HAND
- -- We don't want to put *all* the fv's from the RHS into the
- -- DmdType, because that makes fixpointing very slow --- the
- -- DmdType gets full of lazy demands that are slow to converge.
-
- final_dmds = setUnpackStrategy dflags dmds
- -- Set the unpacking strategy
-
- res' = case res of
- RetCPR | ignore_cpr_info -> TopRes
- _ -> res
- ignore_cpr_info = not (exprIsHNF rhs || thunk_cpr_ok)
-\end{code}
+We put the strict FVs in the DmdType of the Id, so
+that at its call sites we unleash demands on its strict fvs.
+An example is 'roll' in imaginary/wheel-sieve2
+Something like this:
+ roll x = letrec
+ go y = if ... then roll (x-1) else x+1
+ in
+ go ms
+We want to see that roll is strict in x, which is because
+go is called. So we put the DmdEnv for x in go's DmdType.
-The unpack strategy determines whether we'll *really* unpack the argument,
-or whether we'll just remember its strictness. If unpacking would give
-rise to a *lot* of worker args, we may decide not to unpack after all.
+Another example:
-\begin{code}
-setUnpackStrategy :: DynFlags -> [Demand] -> [Demand]
-setUnpackStrategy dflags ds
- = snd (go (maxWorkerArgs dflags - nonAbsentArgs ds) ds)
- where
- go :: Int -- Max number of args available for sub-components of [Demand]
- -> [Demand]
- -> (Int, [Demand]) -- Args remaining after subcomponents of [Demand] are unpacked
-
- go n (Eval (Prod cs) : ds)
- | n' >= 0 = Eval (Prod cs') `cons` go n'' ds
- | otherwise = Box (Eval (Prod cs)) `cons` go n ds
- where
- (n'',cs') = go n' cs
- n' = n + 1 - non_abs_args
- -- Add one to the budget 'cos we drop the top-level arg
- non_abs_args = nonAbsentArgs cs
- -- Delete # of non-absent args to which we'll now be committed
-
- go n (d:ds) = d `cons` go n ds
- go n [] = (n,[])
-
- cons d (n,ds) = (n, d:ds)
-
-nonAbsentArgs :: [Demand] -> Int
-nonAbsentArgs [] = 0
-nonAbsentArgs (Abs : ds) = nonAbsentArgs ds
-nonAbsentArgs (_ : ds) = 1 + nonAbsentArgs ds
-\end{code}
+ f :: Int -> Int -> Int
+ f x y = let t = x+1
+ h z = if z==0 then t else
+ if z==1 then x+1 else
+ x + h (z-1)
+ in h y
+Calling h does indeed evaluate x, but we can only see
+that if we unleash a demand on x at the call site for t.
-%************************************************************************
-%* *
-\subsection{Strictness signatures and types}
-%* *
-%************************************************************************
+Incidentally, here's a place where lambda-lifting h would
+lose the cigar --- we couldn't see the joint strictness in t/x
-\begin{code}
-unitVarDmd :: Var -> Demand -> DmdType
-unitVarDmd var dmd = DmdType (unitVarEnv var dmd) [] TopRes
+ ON THE OTHER HAND
+We don't want to put *all* the fv's from the RHS into the
+DmdType, because that makes fixpointing very slow --- the
+DmdType gets full of lazy demands that are slow to converge.
-addVarDmd :: DmdType -> Var -> Demand -> DmdType
-addVarDmd (DmdType fv ds res) var dmd
- = DmdType (extendVarEnv_C both fv var dmd) ds res
-
-addLazyFVs :: DmdType -> DmdEnv -> DmdType
-addLazyFVs (DmdType fv ds res) lazy_fvs
- = DmdType both_fv1 ds res
- where
- both_fv = plusVarEnv_C both fv lazy_fvs
- both_fv1 = modifyEnv (isBotRes res) (`both` Bot) lazy_fvs fv both_fv
- -- This modifyEnv is vital. Consider
- -- let f = \x -> (x,y)
- -- in error (f 3)
- -- Here, y is treated as a lazy-fv of f, but we must `both` that L
- -- demand with the bottom coming up from 'error'
- --
- -- I got a loop in the fixpointer without this, due to an interaction
- -- with the lazy_fv filtering in mkSigTy. Roughly, it was
- -- letrec f n x
- -- = letrec g y = x `fatbar`
- -- letrec h z = z + ...g...
- -- in h (f (n-1) x)
- -- in ...
- -- In the initial iteration for f, f=Bot
- -- Suppose h is found to be strict in z, but the occurrence of g in its RHS
- -- is lazy. Now consider the fixpoint iteration for g, esp the demands it
- -- places on its free variables. Suppose it places none. Then the
- -- x `fatbar` ...call to h...
- -- will give a x->V demand for x. That turns into a L demand for x,
- -- which floats out of the defn for h. Without the modifyEnv, that
- -- L demand doesn't get both'd with the Bot coming up from the inner
- -- call to f. So we just get an L demand for x for g.
- --
- -- A better way to say this is that the lazy-fv filtering should give the
- -- same answer as putting the lazy fv demands in the function's type.
-
-annotateBndr :: DmdType -> Var -> (DmdType, Var)
--- The returned env has the var deleted
--- The returned var is annotated with demand info
--- No effect on the argument demands
-annotateBndr dmd_ty@(DmdType fv ds res) var
- | isTyVar var = (dmd_ty, var)
- | otherwise = (DmdType fv' ds res, setIdDemandInfo var dmd)
- where
- (fv', dmd) = removeFV fv var res
-
-annotateBndrs :: DmdType -> [Var] -> (DmdType, [Var])
-annotateBndrs = mapAccumR annotateBndr
-
-annotateLamIdBndr :: DynFlags
- -> AnalEnv
- -> DmdType -- Demand type of body
- -> Id -- Lambda binder
- -> (DmdType, -- Demand type of lambda
- Id) -- and binder annotated with demand
-
-annotateLamIdBndr dflags env (DmdType fv ds res) id
--- For lambdas we add the demand to the argument demands
--- Only called for Ids
- = ASSERT( isId id )
- (final_ty, setIdDemandInfo id hacked_dmd)
- where
- -- Watch out! See note [Lambda-bound unfoldings]
- final_ty = case maybeUnfoldingTemplate (idUnfolding id) of
- Nothing -> main_ty
- Just unf -> main_ty `bothType` unf_ty
- where
- (unf_ty, _) = dmdAnal dflags env dmd unf
-
- main_ty = DmdType fv' (hacked_dmd:ds) res
-
- (fv', dmd) = removeFV fv id res
- hacked_dmd = argDemand dmd
- -- This call to argDemand is vital, because otherwise we label
- -- a lambda binder with demand 'B'. But in terms of calling
- -- conventions that's Abs, because we don't pass it. But
- -- when we do a w/w split we get
- -- fw x = (\x y:B -> ...) x (error "oops")
- -- And then the simplifier things the 'B' is a strict demand
- -- and evaluates the (error "oops"). Sigh
-
-removeFV :: DmdEnv -> Var -> DmdResult -> (DmdEnv, Demand)
-removeFV fv id res = (fv', zapUnlifted id dmd)
- where
- fv' = fv `delVarEnv` id
- dmd = lookupVarEnv fv id `orElse` deflt
- deflt | isBotRes res = Bot
- | otherwise = Abs
-
-zapUnlifted :: Id -> Demand -> Demand
--- For unlifted-type variables, we are only
--- interested in Bot/Abs/Box Abs
-zapUnlifted id dmd
- = case dmd of
- _ | isCoVarType ty -> lazyDmd -- For coercions, ignore str/abs totally
- Bot -> Bot
- Abs -> Abs
- _ | isUnLiftedType ty -> lazyDmd -- For unlifted types, ignore strictness
- | otherwise -> dmd
- where
- ty = idType id
-\end{code}
Note [Lamba-bound unfoldings]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -995,323 +866,77 @@ nonVirgin sigs = AE { ae_sigs = sigs, ae_virgin = False }
extendSigsWithLam :: AnalEnv -> Id -> AnalEnv
-- Extend the AnalEnv when we meet a lambda binder
--- If the binder is marked demanded with a product demand, then give it a CPR
--- signature, because in the likely event that this is a lambda on a fn defn
--- [we only use this when the lambda is being consumed with a call demand],
--- it'll be w/w'd and so it will be CPR-ish. E.g.
--- f = \x::(Int,Int). if ...strict in x... then
--- x
--- else
--- (a,b)
--- We want f to have the CPR property because x does, by the time f has been w/w'd
---
--- Also note that we only want to do this for something that
--- definitely has product type, else we may get over-optimistic
--- CPR results (e.g. from \x -> x!).
-
extendSigsWithLam env id
- = case idDemandInfo_maybe id of
- Nothing -> extendAnalEnv NotTopLevel env id cprSig
- -- See Note [Optimistic in the Nothing case]
- Just (Eval (Prod _)) -> extendAnalEnv NotTopLevel env id cprSig
- _ -> env
-\end{code}
-
-Note [Initialising strictness]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Our basic plan is to initialise the strictness of each Id in
-a recursive group to "bottom", and find a fixpoint from there.
-However, this group A might be inside an *enclosing* recursive
-group B, in which case we'll do the entire fixpoint shebang on A
-for each iteration of B.
-
-To speed things up, we initialise each iteration of B from the result
-of the last one, which is neatly recorded in each binder. That way we
-make use of earlier iterations of the fixpoint algorithm. (Cunning
-plan.)
-
-But on the *first* iteration we want to *ignore* the current strictness
-of the Id, and start from "bottom". Nowadays the Id can have a current
-strictness, because interface files record strictness for nested bindings.
-To know when we are in the first iteration, we look at the ae_virgin
-field of the AnalEnv.
-
-
-%************************************************************************
-%* *
- Demands
-%* *
-%************************************************************************
+ | ae_virgin env -- See Note [Optimistic CPR in the "virgin" case]
+ = extendAnalEnv NotTopLevel env id cprSig
-\begin{code}
-splitDmdTy :: DmdType -> (Demand, DmdType)
--- Split off one function argument
--- We already have a suitable demand on all
--- free vars, so no need to add more!
-splitDmdTy (DmdType fv (dmd:dmds) res_ty) = (dmd, DmdType fv dmds res_ty)
-splitDmdTy ty@(DmdType _ [] res_ty) = (resTypeArgDmd res_ty, ty)
-
-splitCallDmd :: Demand -> (Int, Demand)
-splitCallDmd (Call d) = case splitCallDmd d of
- (n, r) -> (n+1, r)
-splitCallDmd d = (0, d)
-
-vanillaCall :: Arity -> Demand
-vanillaCall 0 = evalDmd
-vanillaCall n = Call (vanillaCall (n-1))
-
-deferType :: DmdType -> DmdType
-deferType (DmdType fv _ _) = DmdType (deferEnv fv) [] TopRes
- -- Notice that we throw away info about both arguments and results
- -- For example, f = let ... in \x -> x
- -- We don't want to get a stricness type V->T for f.
-
-deferEnv :: DmdEnv -> DmdEnv
-deferEnv fv = mapVarEnv defer fv
-
-
-----------------
-argDemand :: Demand -> Demand
--- The 'Defer' demands are just Lazy at function boundaries
--- Ugly! Ask John how to improve it.
-argDemand Top = lazyDmd
-argDemand (Defer _) = lazyDmd
-argDemand (Eval ds) = Eval (mapDmds argDemand ds)
-argDemand (Box Bot) = evalDmd
-argDemand (Box d) = box (argDemand d)
-argDemand Bot = Abs -- Don't pass args that are consumed (only) by bottom
-argDemand d = d
-\end{code}
+ | isStrictDmd dmd_info -- Might be bottom, first time round
+ , Just {} <- deepSplitProductType_maybe $ idType id
+ = extendAnalEnv NotTopLevel env id cprSig
+ -- See Note [Initial CPR for strict binders]
-\begin{code}
--------------------------
-lubType :: DmdType -> DmdType -> DmdType
--- Consider (if x then y else []) with demand V
--- Then the first branch gives {y->V} and the second
--- *implicitly* has {y->A}. So we must put {y->(V `lub` A)}
--- in the result env.
-lubType (DmdType fv1 ds1 r1) (DmdType fv2 ds2 r2)
- = DmdType lub_fv2 (lub_ds ds1 ds2) (r1 `lubRes` r2)
- where
- lub_fv = plusVarEnv_C lub fv1 fv2
- lub_fv1 = modifyEnv (not (isBotRes r1)) absLub fv2 fv1 lub_fv
- lub_fv2 = modifyEnv (not (isBotRes r2)) absLub fv1 fv2 lub_fv1
- -- lub is the identity for Bot
-
- -- Extend the shorter argument list to match the longer
- lub_ds (d1:ds1) (d2:ds2) = lub d1 d2 : lub_ds ds1 ds2
- lub_ds [] [] = []
- lub_ds ds1 [] = map (`lub` resTypeArgDmd r2) ds1
- lub_ds [] ds2 = map (resTypeArgDmd r1 `lub`) ds2
-
------------------------------------
-bothType :: DmdType -> DmdType -> DmdType
--- (t1 `bothType` t2) takes the argument/result info from t1,
--- using t2 just for its free-var info
--- NB: Don't forget about r2! It might be BotRes, which is
--- a bottom demand on all the in-scope variables.
--- Peter: can this be done more neatly?
-bothType (DmdType fv1 ds1 r1) (DmdType fv2 _ r2)
- = DmdType both_fv2 ds1 (r1 `bothRes` r2)
+ | otherwise = env
where
- both_fv = plusVarEnv_C both fv1 fv2
- both_fv1 = modifyEnv (isBotRes r1) (`both` Bot) fv2 fv1 both_fv
- both_fv2 = modifyEnv (isBotRes r2) (`both` Bot) fv1 fv2 both_fv1
- -- both is the identity for Abs
+ dmd_info = idDemandInfo id
\end{code}
+Note [Initial CPR for strict binders]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-\begin{code}
-lubRes :: DmdResult -> DmdResult -> DmdResult
-lubRes BotRes r = r
-lubRes r BotRes = r
-lubRes RetCPR RetCPR = RetCPR
-lubRes _ _ = TopRes
-
-bothRes :: DmdResult -> DmdResult -> DmdResult
--- If either diverges, the whole thing does
--- Otherwise take CPR info from the first
-bothRes _ BotRes = BotRes
-bothRes r1 _ = r1
-\end{code}
+CPR is initialized for a lambda binder in an optimistic manner, i.e,
+if the binder is used strictly and at least some of its components as
+a product are used, which is checked by the value of the absence
+demand.
-\begin{code}
-modifyEnv :: Bool -- No-op if False
- -> (Demand -> Demand) -- The zapper
- -> DmdEnv -> DmdEnv -- Env1 and Env2
- -> DmdEnv -> DmdEnv -- Transform this env
- -- Zap anything in Env1 but not in Env2
- -- Assume: dom(env) includes dom(Env1) and dom(Env2)
-
-modifyEnv need_to_modify zapper env1 env2 env
- | need_to_modify = foldr zap env (varEnvKeys (env1 `minusUFM` env2))
- | otherwise = env
- where
- zap uniq env = addToUFM_Directly env uniq (zapper current_val)
- where
- current_val = expectJust "modifyEnv" (lookupUFM_Directly env uniq)
-\end{code}
+If the binder is marked demanded with a strict demand, then give it a
+CPR signature, because in the likely event that this is a lambda on a
+fn defn [we only use this when the lambda is being consumed with a
+call demand], it'll be w/w'd and so it will be CPR-ish. E.g.
+ f = \x::(Int,Int). if ...strict in x... then
+ x
+ else
+ (a,b)
+We want f to have the CPR property because x does, by the time f has been w/w'd
-%************************************************************************
-%* *
-\subsection{LUB and BOTH}
-%* *
-%************************************************************************
+Also note that we only want to do this for something that definitely
+has product type, else we may get over-optimistic CPR results
+(e.g. from \x -> x!).
-\begin{code}
-lub :: Demand -> Demand -> Demand
-
-lub Bot d2 = d2
-lub Abs d2 = absLub d2
-lub Top _ = Top
-lub (Defer ds1) d2 = defer (Eval ds1 `lub` d2)
-
-lub (Call d1) (Call d2) = Call (d1 `lub` d2)
-lub d1@(Call _) (Box d2) = d1 `lub` d2 -- Just strip the box
-lub (Call _) d2@(Eval _) = d2 -- Presumably seq or vanilla eval
-lub d1@(Call _) d2 = d2 `lub` d1 -- Bot, Abs, Top
-
--- For the Eval case, we use these approximation rules
--- Box Bot <= Eval (Box Bot ...)
--- Box Top <= Defer (Box Bot ...)
--- Box (Eval ds) <= Eval (map Box ds)
-lub (Eval ds1) (Eval ds2) = Eval (ds1 `lubs` ds2)
-lub (Eval ds1) (Box Bot) = Eval (mapDmds (`lub` Box Bot) ds1)
-lub (Eval ds1) (Box (Eval ds2)) = Eval (ds1 `lubs` mapDmds box ds2)
-lub (Eval ds1) (Box Abs) = deferEval (mapDmds (`lub` Box Bot) ds1)
-lub d1@(Eval _) d2 = d2 `lub` d1 -- Bot,Abs,Top,Call,Defer
-
-lub (Box d1) (Box d2) = box (d1 `lub` d2)
-lub d1@(Box _) d2 = d2 `lub` d1
-
-lubs :: Demands -> Demands -> Demands
-lubs ds1 ds2 = zipWithDmds lub ds1 ds2
-
----------------------
-box :: Demand -> Demand
--- box is the smart constructor for Box
--- It computes <B,bot> & d
--- INVARIANT: (Box d) => d = Bot, Abs, Eval
--- Seems to be no point in allowing (Box (Call d))
-box (Call d) = Call d -- The odd man out. Why?
-box (Box d) = Box d
-box (Defer _) = lazyDmd
-box Top = lazyDmd -- Box Abs and Box Top
-box Abs = lazyDmd -- are the same <B,L>
-box d = Box d -- Bot, Eval
-
----------------
-defer :: Demand -> Demand
-
--- defer is the smart constructor for Defer
--- The idea is that (Defer ds) = <U(ds), L>
---
--- It specifies what happens at a lazy function argument
--- or a lambda; the L* operator
--- Set the strictness part to L, but leave
--- the boxity side unaffected
--- It also ensures that Defer (Eval [LLLL]) = L
-
-defer Bot = Abs
-defer Abs = Abs
-defer Top = Top
-defer (Call _) = lazyDmd -- Approximation here?
-defer (Box _) = lazyDmd
-defer (Defer ds) = Defer ds
-defer (Eval ds) = deferEval ds
-
-deferEval :: Demands -> Demand
--- deferEval ds = defer (Eval ds)
-deferEval ds | allTop ds = Top
- | otherwise = Defer ds
-
----------------------
-absLub :: Demand -> Demand
--- Computes (Abs `lub` d)
--- For the Bot case consider
--- f x y = if ... then x else error x
--- Then for y we get Abs `lub` Bot, and we really
--- want Abs overall
-absLub Bot = Abs
-absLub Abs = Abs
-absLub Top = Top
-absLub (Call _) = Top
-absLub (Box _) = Top
-absLub (Eval ds) = Defer (absLubs ds) -- Or (Defer ds)?
-absLub (Defer ds) = Defer (absLubs ds) -- Or (Defer ds)?
-
-absLubs :: Demands -> Demands
-absLubs = mapDmds absLub
-
----------------
-both :: Demand -> Demand -> Demand
-
-both Abs d2 = d2
-
--- Note [Bottom demands]
-both Bot Bot = Bot
-both Bot Abs = Bot
-both Bot (Eval ds) = Eval (mapDmds (`both` Bot) ds)
-both Bot (Defer ds) = Eval (mapDmds (`both` Bot) ds)
-both Bot _ = errDmd
-
-both Top Bot = errDmd
-both Top Abs = Top
-both Top Top = Top
-both Top (Box d) = Box d
-both Top (Call d) = Call d
-both Top (Eval ds) = Eval (mapDmds (`both` Top) ds)
-both Top (Defer ds) -- = defer (Top `both` Eval ds)
- -- = defer (Eval (mapDmds (`both` Top) ds))
- = deferEval (mapDmds (`both` Top) ds)
-
-
-both (Box d1) (Box d2) = box (d1 `both` d2)
-both (Box d1) d2@(Call _) = box (d1 `both` d2)
-both (Box d1) d2@(Eval _) = box (d1 `both` d2)
-both (Box d1) (Defer _) = Box d1
-both d1@(Box _) d2 = d2 `both` d1
-
-both (Call d1) (Call d2) = Call (d1 `both` d2)
-both (Call d1) (Eval _) = Call d1 -- Could do better for (Poly Bot)?
-both (Call d1) (Defer _) = Call d1 -- Ditto
-both d1@(Call _) d2 = d2 `both` d1
-
-both (Eval ds1) (Eval ds2) = Eval (ds1 `boths` ds2)
-both (Eval ds1) (Defer ds2) = Eval (ds1 `boths` mapDmds defer ds2)
-both d1@(Eval _) d2 = d2 `both` d1
-
-both (Defer ds1) (Defer ds2) = deferEval (ds1 `boths` ds2)
-both d1@(Defer _) d2 = d2 `both` d1
-
-boths :: Demands -> Demands -> Demands
-boths ds1 ds2 = zipWithDmds both ds1 ds2
-\end{code}
-Note [Bottom demands]
-~~~~~~~~~~~~~~~~~~~~~
-Consider
- f x = error x
-From 'error' itself we get demand Bot on x
-From the arg demand on x we get
- x :-> evalDmd = Box (Eval (Poly Abs))
-So we get Bot `both` Box (Eval (Poly Abs))
- = Seq Keep (Poly Bot)
-
-Consider also
- f x = if ... then error (fst x) else fst x
-Then we get (Eval (Box Bot, Bot) `lub` Eval (SA))
- = Eval (SA)
-which is what we want.
-
-Consider also
- f x = error [fst x]
-Then we get
- x :-> Bot `both` Defer [SA]
-and we want the Bot demand to cancel out the Defer
-so that we get Eval [SA]. Otherwise we'd have the odd
-situation that
- f x = error (fst x) -- Strictness U(SA)b
- g x = error ('y':fst x) -- Strictness Tb
+Note [Initialising strictness]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+See section 9.2 (Finding fixpoints) of the paper.
+
+Our basic plan is to initialise the strictness of each Id in a
+recursive group to "bottom", and find a fixpoint from there. However,
+this group B might be inside an *enclosing* recursiveb group A, in
+which case we'll do the entire fixpoint shebang on for each iteration
+of A. This can be illustrated by the following example:
+
+Example:
+
+ f [] = []
+ f (x:xs) = let g [] = f xs
+ g (y:ys) = y+1 : g ys
+ in g (h x)
+
+At each iteration of the fixpoint for f, the analyser has to find a
+fixpoint for the enclosed function g. In the meantime, the demand
+values for g at each iteration for f are *greater* than those we
+encountered in the previous iteration for f. Therefore, we can begin
+the fixpoint for g not with the bottom value but rather with the
+result of the previous analysis. I.e., when beginning the fixpoint
+process for g, we can start from the demand signature computed for g
+previously and attached to the binding occurrence of g.
+
+To speed things up, we initialise each iteration of A (the enclosing
+one) from the result of the last one, which is neatly recorded in each
+binder. That way we make use of earlier iterations of the fixpoint
+algorithm. (Cunning plan.)
+But on the *first* iteration we want to *ignore* the current strictness
+of the Id, and start from "bottom". Nowadays the Id can have a current
+strictness, because interface files record strictness for nested bindings.
+To know when we are in the first iteration, we look at the ae_virgin
+field of the AnalEnv.
diff --git a/compiler/stranal/WorkWrap.lhs b/compiler/stranal/WorkWrap.lhs
index 5be63a9bc7..e697dfe1ff 100644
--- a/compiler/stranal/WorkWrap.lhs
+++ b/compiler/stranal/WorkWrap.lhs
@@ -21,12 +21,11 @@ import Var
import Id
import Type ( Type )
import IdInfo
-import Demand
import UniqSupply
import BasicTypes
import DynFlags
import VarEnv ( isEmptyVarEnv )
-import Maybes ( orElse )
+import Demand
import WwLib
import Util
import Outputable
@@ -258,7 +257,7 @@ tryWW dflags is_rec fn_id rhs
-- Furthermore, don't even expose strictness info
= return [ (fn_id, rhs) ]
- | is_thunk && worthSplittingThunk maybe_fn_dmd res_info
+ | is_thunk && worthSplittingThunk fn_dmd res_info
-- See Note [Thunk splitting]
= ASSERT2( isNonRec is_rec, ppr new_fn_id ) -- The thunk must be non-recursive
checkSize dflags new_fn_id rhs $
@@ -273,12 +272,12 @@ tryWW dflags is_rec fn_id rhs
where
fn_info = idInfo fn_id
- maybe_fn_dmd = demandInfo fn_info
+ fn_dmd = demandInfo fn_info
inline_act = inlinePragmaActivation (inlinePragInfo fn_info)
-- In practice it always will have a strictness
-- signature, even if it's a uninformative one
- strict_sig = strictnessInfo fn_info `orElse` topSig
+ strict_sig = strictnessInfo fn_info
StrictSig (DmdType env wrap_dmds res_info) = strict_sig
-- new_fn_id has the DmdEnv zapped.
@@ -376,8 +375,8 @@ splitFun dflags fn_id fn_info wrap_dmds res_info rhs
-- The arity is set by the simplifier using exprEtaExpandArity
-- So it may be more than the number of top-level-visible lambdas
- work_res_info | isBotRes res_info = BotRes -- Cpr stuff done by wrapper
- | otherwise = TopRes
+ work_res_info | isBotRes res_info = botRes -- Cpr stuff done by wrapper
+ | otherwise = topRes
one_shots = get_one_shots rhs
@@ -451,51 +450,6 @@ splitThunk dflags fn_id rhs = do
%************************************************************************
%* *
-\subsection{Functions over Demands}
-%* *
-%************************************************************************
-
-\begin{code}
-worthSplittingFun :: [Demand] -> DmdResult -> Bool
- -- True <=> the wrapper would not be an identity function
-worthSplittingFun ds res
- = any worth_it ds || returnsCPR res
- -- worthSplitting returns False for an empty list of demands,
- -- and hence do_strict_ww is False if arity is zero and there is no CPR
- -- See Note [Worker-wrapper for bottoming functions]
- where
- worth_it Abs = True -- Absent arg
- worth_it (Eval (Prod _)) = True -- Product arg to evaluate
- worth_it _ = False
-
-worthSplittingThunk :: Maybe Demand -- Demand on the thunk
- -> DmdResult -- CPR info for the thunk
- -> Bool
-worthSplittingThunk maybe_dmd res
- = worth_it maybe_dmd || returnsCPR res
- where
- -- Split if the thing is unpacked
- worth_it (Just (Eval (Prod ds))) = not (all isAbsent ds)
- worth_it _ = False
-\end{code}
-
-Note [Worker-wrapper for bottoming functions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We used not to split if the result is bottom.
-[Justification: there's no efficiency to be gained.]
-
-But it's sometimes bad not to make a wrapper. Consider
- fw = \x# -> let x = I# x# in case e of
- p1 -> error_fn x
- p2 -> error_fn x
- p3 -> the real stuff
-The re-boxing code won't go away unless error_fn gets a wrapper too.
-[We don't do reboxing now, but in general it's better to pass an
-unboxed thing to f, and have it reboxed in the error cases....]
-
-
-%************************************************************************
-%* *
\subsection{The worker wrapper core}
%* *
%************************************************************************
diff --git a/compiler/stranal/WwLib.lhs b/compiler/stranal/WwLib.lhs
index 8aaa13171c..1cbebf8c23 100644
--- a/compiler/stranal/WwLib.lhs
+++ b/compiler/stranal/WwLib.lhs
@@ -11,7 +11,7 @@
-- http://hackage.haskell.org/trac/ghc/wiki/Commentary/CodingStyle#TabsvsSpaces
-- for details
-module WwLib ( mkWwBodies, mkWWstr, mkWorkerArgs ) where
+module WwLib ( mkWwBodies, mkWWstr, mkWorkerArgs, deepSplitProductType_maybe ) where
#include "HsVersions.h"
@@ -23,7 +23,7 @@ import Id ( Id, idType, mkSysLocal, idDemandInfo, setIdDemandInfo,
)
import IdInfo ( vanillaIdInfo )
import DataCon
-import Demand ( Demand(..), DmdResult(..), Demands(..) )
+import Demand
import MkCore ( mkRuntimeErrorApp, aBSENT_ERROR_ID )
import MkId ( realWorldPrimId, voidArgId
, wrapNewTypeBody, unwrapNewTypeBody )
@@ -36,7 +36,7 @@ import Literal ( absentLiteralOf )
import TyCon
import UniqSupply
import Unique
-import Util ( zipWithEqual )
+import Util
import Outputable
import DynFlags
import FastString
@@ -133,13 +133,14 @@ mkWwBodies :: DynFlags
mkWwBodies dflags fun_ty demands res_info one_shots
= do { let arg_info = demands `zip` (one_shots ++ repeat False)
+ all_one_shots = all snd arg_info
; (wrap_args, wrap_fn_args, work_fn_args, res_ty) <- mkWWargs emptyTvSubst fun_ty arg_info
; (work_args, wrap_fn_str, work_fn_str) <- mkWWstr dflags wrap_args
-- Do CPR w/w. See Note [Always do CPR w/w]
; (wrap_fn_cpr, work_fn_cpr, cpr_res_ty) <- mkWWcpr res_ty res_info
- ; let (work_lam_args, work_call_args) = mkWorkerArgs work_args cpr_res_ty
+ ; let (work_lam_args, work_call_args) = mkWorkerArgs work_args all_one_shots cpr_res_ty
; return ([idDemandInfo v | v <- work_call_args, isId v],
wrap_fn_args . wrap_fn_cpr . wrap_fn_str . applyToVars work_call_args . Var,
mkLams work_lam_args. work_fn_str . work_fn_cpr . work_fn_args) }
@@ -184,16 +185,39 @@ We use the state-token type which generates no code.
\begin{code}
mkWorkerArgs :: [Var]
+ -> Bool -- Whether all arguments are one-shot
-> Type -- Type of body
-> ([Var], -- Lambda bound args
[Var]) -- Args at call site
-mkWorkerArgs args res_ty
+mkWorkerArgs args all_one_shot res_ty
| any isId args || not (isUnLiftedType res_ty)
= (args, args)
| otherwise
- = (args ++ [voidArgId], args ++ [realWorldPrimId])
+ = (args ++ [newArg], args ++ [realWorldPrimId])
+ where
+ -- see Note [All One-Shot Arguments of a Worker]
+ newArg = if all_one_shot
+ then setOneShotLambda voidArgId
+ else voidArgId
\end{code}
+Note [All One-Shot Arguments of a Worker]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Sometimes, derived joint-points are just lambda-lifted thunks, whose
+only argument is of the unit type and is never used. This might
+interfere with the absence analysis, basing on which results these
+never-used arguments are eliminated in the worker. The additional
+argument `all_one_shot` of `mkWorkerArgs` is to prevent this.
+
+An example for this phenomenon is a `treejoin` program from the
+`nofib` suite, which features the following joint points:
+
+$j_s1l1 =
+ \ _ ->
+ case GHC.Prim.<=# 56320 y_aOy of _ {
+ GHC.Types.False -> $j_s1kP GHC.Prim.realWorld#;
+ GHC.Types.True -> ... }
%************************************************************************
%* *
@@ -342,6 +366,24 @@ mkWWstr dflags (arg : args) = do
(args2, wrap_fn2, work_fn2) <- mkWWstr dflags args
return (args1 ++ args2, wrap_fn1 . wrap_fn2, work_fn1 . work_fn2)
+\end{code}
+
+Note [Unpacking arguments with product and polymorphic demands]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The argument is unpacked in a case if it has a product type and has a
+strict and used demand put on it. I.e., arguments, with demands such
+as the following ones:
+
+<S,U(U, L)>
+<S(L,S),U>
+
+will be unpacked. Moreover, for arguments whose demand is <S,U> or
+<S,H>, we take an advantage of the polymorphic nature of S and U and
+replicate the enclosed demand correspondingly (see definition of
+replicateDmd).
+
+
+\begin{code}
----------------------
-- mkWWstr_one wrap_arg = (work_args, wrap_fn, work_fn)
-- * wrap_fn assumes wrap_arg is in scope,
@@ -353,39 +395,19 @@ mkWWstr_one dflags arg
| isTyVar arg
= return ([arg], nop_fn, nop_fn)
- | otherwise
- = case idDemandInfo arg of
-
- -- Absent case. We can't always handle absence for arbitrary
- -- unlifted types, so we need to choose just the cases we can
- -- (that's what mk_absent_let does)
- Abs | Just work_fn <- mk_absent_let dflags arg
- -> return ([], nop_fn, work_fn)
-
- -- Unpack case
- Eval (Prod cs)
- | Just (_arg_tycon, _tycon_arg_tys, data_con, inst_con_arg_tys)
- <- deepSplitProductType_maybe (idType arg)
- -> do uniqs <- getUniquesM
- let
- unpk_args = zipWith mk_ww_local uniqs inst_con_arg_tys
- unpk_args_w_ds = zipWithEqual "mkWWstr" set_worker_arg_info unpk_args cs
- unbox_fn = mkUnpackCase (sanitiseCaseBndr arg) (Var arg) unpk_args data_con
- rebox_fn = Let (NonRec arg con_app)
- con_app = mkProductBox unpk_args (idType arg)
- (worker_args, wrap_fn, work_fn) <- mkWWstr dflags unpk_args_w_ds
- return (worker_args, unbox_fn . wrap_fn, work_fn . rebox_fn)
- -- Don't pass the arg, rebox instead
-
- -- `seq` demand; evaluate in wrapper in the hope
- -- of dropping seqs in the worker
- Eval (Poly Abs)
- -> let
- arg_w_unf = arg `setIdUnfolding` evaldUnfolding
- -- Tell the worker arg that it's sure to be evaluated
- -- so that internal seqs can be dropped
- in
- return ([arg_w_unf], mk_seq_case arg, nop_fn)
+ | isAbsDmd dmd
+ , Just work_fn <- mk_absent_let dflags arg
+ -- Absent case. We can't always handle absence for arbitrary
+ -- unlifted types, so we need to choose just the cases we can
+ --- (that's what mk_absent_let does)
+ = return ([], nop_fn, work_fn)
+
+ | isSeqDmd dmd -- `seq` demand; evaluate in wrapper in the hope
+ -- of dropping seqs in the worker
+ = let arg_w_unf = arg `setIdUnfolding` evaldUnfolding
+ -- Tell the worker arg that it's sure to be evaluated
+ -- so that internal seqs can be dropped
+ in return ([arg_w_unf], mk_seq_case arg, nop_fn)
-- Pass the arg, anyway, even if it is in theory discarded
-- Consider
-- f x y = x `seq` y
@@ -398,11 +420,28 @@ mkWWstr_one dflags arg
-- we end up evaluating the absent thunk.
-- But the Evald flag is pretty weird, and I worry that it might disappear
-- during simplification, so for now I've just nuked this whole case
+
+ -- Unpack case,
+ -- see note [Unpacking arguments with product and polymorphic demands]
+ | isStrictDmd dmd
+ , Just (_arg_tycon, _tycon_arg_tys, data_con, inst_con_arg_tys)
+ <- deepSplitProductType_maybe (idType arg)
+ = do { uniqs <- getUniquesM
+ ; let cs = splitProdDmd (length inst_con_arg_tys) dmd
+ unpk_args = zipWith mk_ww_local uniqs inst_con_arg_tys
+ unpk_args_w_ds = zipWithEqual "mkWWstr" set_worker_arg_info unpk_args cs
+ unbox_fn = mkUnpackCase (sanitiseCaseBndr arg) (Var arg) unpk_args data_con
+ rebox_fn = Let (NonRec arg con_app)
+ con_app = mkProductBox unpk_args (idType arg)
+ ; (worker_args, wrap_fn, work_fn) <- mkWWstr dflags unpk_args_w_ds
+ ; return (worker_args, unbox_fn . wrap_fn, work_fn . rebox_fn) }
+ -- Don't pass the arg, rebox instead
- -- Other cases
- _other_demand -> return ([arg], nop_fn, nop_fn)
+ | otherwise -- Other cases
+ = return ([arg], nop_fn, nop_fn)
where
+ dmd = idDemandInfo arg
-- If the wrapper argument is a one-shot lambda, then
-- so should (all) the corresponding worker arguments be
-- This bites when we do w/w on a case join point
@@ -416,8 +455,6 @@ nop_fn :: CoreExpr -> CoreExpr
nop_fn body = body
\end{code}
-
-
\begin{code}
mkUnpackCase :: Id -> CoreExpr -> [Id] -> DataCon -> CoreExpr -> CoreExpr
-- (mkUnpackCase x e args Con body)
@@ -496,7 +533,10 @@ mkWWcpr :: Type -- function body type
CoreExpr -> CoreExpr, -- New worker
Type) -- Type of worker's body
-mkWWcpr body_ty RetCPR
+mkWWcpr body_ty res
+ | not (returnsCPR res) -- No CPR info
+ = return (id, id, body_ty)
+
| not (isClosedAlgType body_ty)
= WARN( True,
text "mkWWcpr: non-algebraic or open body type" <+> ppr body_ty )
@@ -537,9 +577,6 @@ mkWWcpr body_ty RetCPR
n_con_args = length con_arg_tys
con_arg_ty1 = head con_arg_tys
-mkWWcpr body_ty _other -- No CPR info
- = return (id, id, body_ty)
-
-- If the original function looked like
-- f = \ x -> _scc_ "foo" E
--