diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2022-05-03 12:18:30 +0100 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2022-07-18 16:36:54 -0400 |
commit | 415468fef8a3e9181b7eca86de0e05c0cce31729 (patch) | |
tree | bccbf333547b3f3c2c863afcd3a5a97850c68ae5 | |
parent | af6731a40782b418947d376a09fd605111dfea2a (diff) | |
download | haskell-415468fef8a3e9181b7eca86de0e05c0cce31729.tar.gz |
Refactor SpecConstr to use treat bindings uniformly
This patch, provoked by #21457, simplifies SpecConstr by treating
top-level and nested bindings uniformly (see the new scBind).
* Eliminates the mysterious scTopBindEnv
* Refactors scBind to handle top-level and nested definitions
uniformly.
* But, for now at least, continues the status quo of not doing
SpecConstr for top-level non-recursive bindings. (In contrast
we do specialise nested non-recursive bindings, although the
original paper did not; see Note [Local let bindings].)
I tried the effect of specialising top-level non-recursive
bindings (which is now dead easy to switch on, unlike before)
but found some regressions, so I backed off. See !8135.
It's a pure refactoring. I think it'll do a better job in a few
cases, but there is no regression test.
-rw-r--r-- | compiler/GHC/Core/Opt/SpecConstr.hs | 524 |
1 files changed, 260 insertions, 264 deletions
diff --git a/compiler/GHC/Core/Opt/SpecConstr.hs b/compiler/GHC/Core/Opt/SpecConstr.hs index 9ef2d3e3e6..173d546b73 100644 --- a/compiler/GHC/Core/Opt/SpecConstr.hs +++ b/compiler/GHC/Core/Opt/SpecConstr.hs @@ -76,6 +76,7 @@ import GHC.Serialized ( deserializeWithData ) import Control.Monad ( zipWithM ) import Data.List (nubBy, sortBy, partition, dropWhileEnd, mapAccumL ) +import Data.Maybe( mapMaybe ) import Data.Ord( comparing ) {- @@ -373,11 +374,14 @@ The recursive call ends up looking like So we want to spot the constructor application inside the cast. That's why we have the Cast case in argToPat -Note [Local recursive groups] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For a *local* recursive group, we can see all the calls to the -function, so we seed the specialisation loop from the calls in the -body, not from the calls in the RHS. Consider: +Note [Seeding recursive groups] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +For a recursive group that is either + * nested, or + * top-level, but with no exported Ids +we can see all the calls to the function, so we seed the specialisation +loop from the calls in the body, and /not/ from the calls in the RHS. +Consider: bar m n = foo n (n,n) (n,n) (n,n) (n,n) where @@ -400,52 +404,42 @@ a local function. In a case like the above we end up never calling the original un-specialised function. (Although we still leave its code around just in case.) -However, if we find any boring calls in the body, including *unsaturated* -ones, such as +Wrinkles + +* Boring calls. If we find any boring calls in the body, including + *unsaturated* ones, such as letrec foo x y = ....foo... in map foo xs -then we will end up calling the un-specialised function, so then we *should* -use the calls in the un-specialised RHS as seeds. We call these -"boring call patterns", and callsToPats reports if it finds any of these. - -Note [Seeding top-level recursive groups] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -This seeding is done in the binding for seed_calls in specRec. - -1. If all the bindings in a top-level recursive group are local (not - exported), then all the calls are in the rest of the top-level - bindings. This means we can specialise with those call patterns - ONLY, and NOT with the RHSs of the recursive group (exactly like - Note [Local recursive groups]) - -2. But if any of the bindings are exported, the function may be called - with any old arguments, so (for lack of anything better) we specialise - based on - (a) the call patterns in the RHS - (b) the call patterns in the rest of the top-level bindings - NB: before Apr 15 we used (a) only, but Dimitrios had an example - where (b) was crucial, so I added that. - Adding (b) also improved nofib allocation results: - multiplier: 4% better - minimax: 2.8% better - -Actually in case (2), instead of using the calls from the RHS, it -would be better to specialise in the importing module. We'd need to -add an INLINABLE pragma to the function, and then it can be -specialised in the importing scope, just as is done for type classes -in GHC.Core.Opt.Specialise.specImports. This remains to be done (#10346). - -Note [Top-level recursive groups] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -To get the call usage information from "the rest of the top level -bindings" (c.f. Note [Seeding top-level recursive groups]), we work -backwards through the top-level bindings so we see the usage before we -get to the binding of the function. Before we can collect the usage -though, we go through all the bindings and add them to the -environment. This is necessary because usage is only tracked for -functions in the environment. These two passes are called - 'go' and 'goEnv' -in specConstrProgram. (Looks a bit revolting to me.) + then we will end up calling the un-specialised function, so then we + *should* use the calls in the un-specialised RHS as seeds. We call + these "boring call patterns", and callsToNewPats reports if it finds + any of these. Then 'specialise' unleashes the usage info from the + un-specialised RHS. + +* Exported Ids. `specialise` /also/ unleashes `si_mb_unspec` + for exported Ids. That way we are sure to generate usage info from + the /un-specialised/ RHS of an exported function. + +More precisely: + +* Always start from the calls in the body of the let or (for top level) + calls in the rest of the module. See the body_calls in the call to + `specialise` in `specNonRec`, and to `go` in `specRec`. + +* si_mb_unspec holds the usage from the unspecialised RHS. + See `initSpecInfo`. + +* `specialise` will unleash si_mb_unspec, if + - `callsToNewPats` reports "boring calls found", or + - this is a top-level exported Id. + +Historical note. At an earlier point, if a top-level Id was exported, +we used only seeds from the RHS, and /not/from the body. But Dimitrios +had an example where using call patterns from the body (the other defns +in the module) was crucial. And doing so improved nofib allocation results: + multiplier: 4% better + minimax: 2.8% better +In any case, it is easier to do! Note [Do not specialise diverging functions] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -763,35 +757,18 @@ unbox the strict fields, because T is polymorphic!) specConstrProgram :: ModGuts -> CoreM ModGuts specConstrProgram guts - = do - dflags <- getDynFlags - us <- getUniqueSupplyM - (_, annos) <- getFirstAnnotations deserializeWithData guts - this_mod <- getModule - -- pprTraceM "specConstrInput" (ppr $ mg_binds guts) - let binds' = reverse $ fst $ initUs us $ do - -- Note [Top-level recursive groups] - (env, binds) <- goEnv (initScEnv (initScOpts dflags this_mod) annos) - (mg_binds guts) - -- binds is identical to (mg_binds guts), except that the - -- binders on the LHS have been replaced by extendBndr - -- (SPJ this seems like overkill; I don't think the binders - -- will change at all; and we don't substitute in the RHSs anyway!!) - go env nullUsage (reverse binds) - - return (guts { mg_binds = binds' }) - where - -- See Note [Top-level recursive groups] - goEnv env [] = return (env, []) - goEnv env (bind:binds) = do (env', bind') <- scTopBindEnv env bind - (env'', binds') <- goEnv env' binds - return (env'', bind' : binds') - - -- Arg list of bindings is in reverse order - go _ _ [] = return [] - go env usg (bind:binds) = do (usg', bind') <- scTopBind env usg bind - binds' <- go env usg' binds - return (bind' : binds') + = do { env0 <- initScEnv guts + ; us <- getUniqueSupplyM + ; let (_usg, binds') = initUs_ us $ + scTopBinds env0 (mg_binds guts) + + ; return (guts { mg_binds = binds' }) } + +scTopBinds :: ScEnv -> [InBind] -> UniqSM (ScUsage, [OutBind]) +scTopBinds _env [] = return (nullUsage, []) +scTopBinds env (b:bs) = do { (usg, b', bs') <- scBind TopLevel env b $ + (\env -> scTopBinds env bs) + ; return (usg, b' ++ bs') } {- ************************************************************************ @@ -955,14 +932,24 @@ initScOpts dflags this_mod = SpecConstrOpts sc_keen = gopt Opt_SpecConstrKeen dflags } -initScEnv :: SpecConstrOpts -> UniqFM Name SpecConstrAnnotation -> ScEnv -initScEnv opts anns - = SCE { sc_opts = opts, - sc_force = False, - sc_subst = emptySubst, - sc_how_bound = emptyVarEnv, - sc_vals = emptyVarEnv, - sc_annotations = anns } +initScEnv :: ModGuts -> CoreM ScEnv +initScEnv guts + = do { dflags <- getDynFlags + ; (_, anns) <- getFirstAnnotations deserializeWithData guts + ; this_mod <- getModule + ; return (SCE { sc_opts = initScOpts dflags this_mod, + sc_force = False, + sc_subst = init_subst, + sc_how_bound = emptyVarEnv, + sc_vals = emptyVarEnv, + sc_annotations = anns }) } + where + init_subst = mkEmptySubst $ mkInScopeSet $ mkVarSet $ + bindersOfBinds (mg_binds guts) + -- Acccount for top-level bindings that are not in dependency order; + -- see Note [Glomming] in GHC.Core.Opt.OccurAnal + -- Easiest thing is to bring all the top level binders into scope at once, + -- as if at once, as if all the top-level decls were mutually recursive. data HowBound = RecFun -- These are the recursive functions for which -- we seek interesting call patterns @@ -1186,8 +1173,8 @@ data ScUsage scu_occs :: !(IdEnv ArgOcc) -- Information on argument occurrences } -- The domain is OutIds -type CallEnv = IdEnv [Call] -data Call = Call Id [CoreArg] ValueEnv +type CallEnv = IdEnv [Call] -- Domain is OutIds +data Call = Call OutId [CoreArg] ValueEnv -- The arguments of the call, together with the -- env giving the constructor bindings at the call site -- We keep the function mainly for debug output @@ -1209,6 +1196,9 @@ nullUsage = SCU { scu_calls = emptyVarEnv, scu_occs = emptyVarEnv } combineCalls :: CallEnv -> CallEnv -> CallEnv combineCalls = plusVarEnv_C (++) +delCallsFor :: ScUsage -> [Var] -> ScUsage +delCallsFor env bndrs = env { scu_calls = scu_calls env `delVarEnvList` bndrs } + combineUsage :: ScUsage -> ScUsage -> ScUsage combineUsage u1 u2 = SCU { scu_calls = combineCalls (scu_calls u1) (scu_calls u2), scu_occs = plusVarEnv_C combineOcc (scu_occs u1) (scu_occs u2) } @@ -1291,6 +1281,121 @@ The main recursive function gathers up usage information, and creates specialised versions of functions. -} +scBind :: TopLevelFlag -> ScEnv -> InBind + -> (ScEnv -> UniqSM (ScUsage, a)) -- Specialise the scope of the binding + -> UniqSM (ScUsage, [OutBind], a) +scBind top_lvl env (NonRec bndr rhs) do_body + | isTyVar bndr -- Type-lets may be created by doBeta + = do { (final_usage, body') <- do_body (extendScSubst env bndr rhs) + ; return (final_usage, [], body') } + + | not (isTopLevel top_lvl) -- Nested non-recursive value binding + -- See Note [Specialising local let bindings] + = do { let (body_env, bndr') = extendBndr env bndr + -- Not necessary at top level; but here we are nested + + ; rhs_info <- scRecRhs env (bndr',rhs) + + ; let body_env2 = extendHowBound body_env [bndr'] RecFun + rhs' = ri_new_rhs rhs_info + body_env3 = extendValEnv body_env2 bndr' (isValue (sc_vals env) rhs') + + ; (body_usg, body') <- do_body body_env3 + + -- Now make specialised copies of the binding, + -- based on calls in body_usg + ; (spec_usg, specs) <- specNonRec env (scu_calls body_usg) rhs_info + -- NB: For non-recursive bindings we inherit sc_force flag from + -- the parent function (see Note [Forcing specialisation]) + + -- Specialized + original binding + ; let spec_bnds = [NonRec b r | (b,r) <- ruleInfoBinds rhs_info specs] + bind_usage = (body_usg `delCallsFor` [bndr']) + `combineUsage` spec_usg -- Note [spec_usg includes rhs_usg] + + ; return (bind_usage, spec_bnds, body') + } + + | otherwise -- Top-level, non-recursive value binding + -- At top level we do not specialise non-recursive bindings; that + -- is, we do not call specNonRec, passing the calls from the body. + -- The original paper only specialised /recursive/ bindings, but + -- we later started specialising nested non-recursive bindings: + -- see Note [Specialising local let bindings] + -- + -- I tried always specialising non-recursive top-level bindings too, + -- but found some regressions (see !8135). So I backed off. + = do { (rhs_usage, rhs') <- scExpr env rhs + + -- At top level, we've already put all binders into scope; see initScEnv + -- Hence no need to call `extendBndr`. But we still want to + -- extend the `ValueEnv` to record the value of this binder. + ; let body_env = extendValEnv env bndr (isValue (sc_vals env) rhs') + ; (body_usage, body') <- do_body body_env + + ; return (rhs_usage `combineUsage` body_usage, [NonRec bndr rhs'], body') } + +scBind top_lvl env (Rec prs) do_body + | isTopLevel top_lvl + , Just threshold <- sc_size (sc_opts env) + , not force_spec + , not (all (couldBeSmallEnoughToInline (sc_uf_opts (sc_opts env)) threshold) rhss) + = -- Do no specialisation if the RHSs are too big + -- ToDo: I'm honestly not sure of the rationale of this size-testing, nor + -- why it only applies at top level. But that's the way it has been + -- for a while. See #21456. + do { (body_usg, body') <- do_body rhs_env2 + ; (rhs_usgs, rhss') <- mapAndUnzipM (scExpr env) rhss + ; let all_usg = (combineUsages rhs_usgs `combineUsage` body_usg) + `delCallsFor` bndrs' + bind' = Rec (bndrs' `zip` rhss') + ; return (all_usg, [bind'], body') } + + | otherwise + = do { rhs_infos <- mapM (scRecRhs rhs_env2) (bndrs' `zip` rhss) + ; (body_usg, body') <- do_body rhs_env2 + + ; (spec_usg, specs) <- specRec (scForce rhs_env2 force_spec) + (scu_calls body_usg) rhs_infos + -- Do not unconditionally generate specialisations from rhs_usgs + -- Instead use them only if we find an unspecialised call + -- See Note [Seeding recursive groups] + + ; let all_usg = (spec_usg `combineUsage` body_usg) -- Note [spec_usg includes rhs_usg] + `delCallsFor` bndrs' + bind' = Rec (concat (zipWithEqual "scExpr'" ruleInfoBinds rhs_infos specs)) + -- zipWithEqual: length of returned [SpecInfo] + -- should be the same as incoming [RhsInfo] + + ; return (all_usg, [bind'], body') } + where + (bndrs,rhss) = unzip prs + force_spec = any (forceSpecBndr env) bndrs -- Note [Forcing specialisation] + + (rhs_env1,bndrs') | isTopLevel top_lvl = (env, bndrs) + | otherwise = extendRecBndrs env bndrs + -- At top level, we've already put all binders into scope; see initScEnv + + rhs_env2 = extendHowBound rhs_env1 bndrs' RecFun + +{- Note [Specialising local let bindings] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +It is not uncommon to find this + + let $j = \x. <blah> in ...$j True...$j True... + +Here $j is an arbitrary let-bound function, but it often comes up for +join points. We might like to specialise $j for its call patterns. +Notice the difference from a letrec, where we look for call patterns +in the *RHS* of the function. Here we look for call patterns in the +*body* of the let. + +At one point I predicated this on the RHS mentioning the outer +recursive function, but that's not essential and might even be +harmful. I'm not sure. +-} + +------------------------ scExpr, scExpr' :: ScEnv -> CoreExpr -> UniqSM (ScUsage, CoreExpr) -- The unique supply is needed when we invent -- a new name for the specialised function and its args @@ -1315,6 +1420,11 @@ scExpr' env (Lam b e) = do let (env', b') = extendBndr env b (usg, e') <- scExpr env' e return (usg, Lam b' e') +scExpr' env (Let bind body) + = do { (final_usage, binds', body') <- scBind NotTopLevel env bind $ + (\env -> scExpr env body) + ; return (final_usage, mkLets binds' body') } + scExpr' env (Case scrut b ty alts) = do { (scrut_usg, scrut') <- scExpr env scrut ; case isValue (sc_vals env) scrut' of @@ -1354,79 +1464,7 @@ scExpr' env (Case scrut b ty alts) _ -> evalScrutOcc ; return (usg', b_occ `combineOcc` scrut_occ, Alt con bs2 rhs') } -scExpr' env (Let (NonRec bndr rhs) body) - | isTyVar bndr -- Type-lets may be created by doBeta - = scExpr' (extendScSubst env bndr rhs) body - | otherwise - = do { let (body_env, bndr') = extendBndr env bndr - ; rhs_info <- scRecRhs env (bndr',rhs) - - ; let body_env2 = extendHowBound body_env [bndr'] RecFun - -- See Note [Local let bindings] - rhs' = ri_new_rhs rhs_info - body_env3 = extendValEnv body_env2 bndr' (isValue (sc_vals env) rhs') - - ; (body_usg, body') <- scExpr body_env3 body - - -- NB: For non-recursive bindings we inherit sc_force flag from - -- the parent function (see Note [Forcing specialisation]) - ; (spec_usg, specs) <- specNonRec env body_usg rhs_info - - -- Specialized + original binding - ; let spec_bnds = mkLets [NonRec b r | (b,r) <- ruleInfoBinds rhs_info specs] body' - -- ; pprTraceM "spec_bnds" $ (ppr spec_bnds) - - ; return (body_usg { scu_calls = scu_calls body_usg `delVarEnv` bndr' } - `combineUsage` spec_usg, -- Note [spec_usg includes rhs_usg] - spec_bnds - ) - } - - --- A *local* recursive group: see Note [Local recursive groups] -scExpr' env (Let (Rec prs) body) - = do { let (bndrs,rhss) = unzip prs - (rhs_env1,bndrs') = extendRecBndrs env bndrs - rhs_env2 = extendHowBound rhs_env1 bndrs' RecFun - force_spec = any (forceSpecBndr env) bndrs' - -- Note [Forcing specialisation] - - ; rhs_infos <- mapM (scRecRhs rhs_env2) (bndrs' `zip` rhss) - ; (body_usg, body') <- scExpr rhs_env2 body - - -- NB: start specLoop from body_usg - ; (spec_usg, specs) <- specRec NotTopLevel (scForce rhs_env2 force_spec) - body_usg rhs_infos - -- Do not unconditionally generate specialisations from rhs_usgs - -- Instead use them only if we find an unspecialised call - -- See Note [Local recursive groups] - - ; let all_usg = spec_usg `combineUsage` body_usg -- Note [spec_usg includes rhs_usg] - bind' = Rec (concat (zipWithEqual "scExpr'" ruleInfoBinds rhs_infos specs)) - -- zipWithEqual: length of returned [SpecInfo] - -- should be the same as incoming [RhsInfo] - - ; return (all_usg { scu_calls = scu_calls all_usg `delVarEnvList` bndrs' }, - Let bind' body') } - -{- -Note [Local let bindings] -~~~~~~~~~~~~~~~~~~~~~~~~~ -It is not uncommon to find this - - let $j = \x. <blah> in ...$j True...$j True... - -Here $j is an arbitrary let-bound function, but it often comes up for -join points. We might like to specialise $j for its call patterns. -Notice the difference from a letrec, where we look for call patterns -in the *RHS* of the function. Here we look for call patterns in the -*body* of the let. - -At one point I predicated this on the RHS mentioning the outer -recursive function, but that's not essential and might even be -harmful. I'm not sure. --} scApp :: ScEnv -> (InExpr, [InExpr]) -> UniqSM (ScUsage, CoreExpr) @@ -1482,51 +1520,6 @@ mkVarUsage env fn args | otherwise = evalScrutOcc ---------------------- -scTopBindEnv :: ScEnv -> CoreBind -> UniqSM (ScEnv, CoreBind) -scTopBindEnv env (Rec prs) - = do { let (rhs_env1,bndrs') = extendRecBndrs env bndrs - rhs_env2 = extendHowBound rhs_env1 bndrs RecFun - - prs' = zip bndrs' rhss - ; return (rhs_env2, Rec prs') } - where - (bndrs,rhss) = unzip prs - -scTopBindEnv env (NonRec bndr rhs) - = do { let (env1, bndr') = extendBndr env bndr - env2 = extendValEnv env1 bndr' (isValue (sc_vals env) rhs) - ; return (env2, NonRec bndr' rhs) } - ----------------------- -scTopBind :: ScEnv -> ScUsage -> CoreBind -> UniqSM (ScUsage, CoreBind) - -scTopBind env body_usage (Rec prs) - | Just threshold <- sc_size $ sc_opts env - , not force_spec - , not (all (couldBeSmallEnoughToInline (sc_uf_opts $ sc_opts env) threshold) rhss) - -- No specialisation - = -- pprTrace "scTopBind: nospec" (ppr bndrs) $ - do { (rhs_usgs, rhss') <- mapAndUnzipM (scExpr env) rhss - ; return (body_usage `combineUsage` combineUsages rhs_usgs, Rec (bndrs `zip` rhss')) } - - | otherwise -- Do specialisation - = do { rhs_infos <- mapM (scRecRhs env) prs - - ; (spec_usage, specs) <- specRec TopLevel (scForce env force_spec) - body_usage rhs_infos - - ; return (body_usage `combineUsage` spec_usage, - Rec (concat (zipWith ruleInfoBinds rhs_infos specs))) } - where - (bndrs,rhss) = unzip prs - force_spec = any (forceSpecBndr env) bndrs - -- Note [Forcing specialisation] - -scTopBind env usage (NonRec bndr rhs) -- Oddly, we don't seem to specialise top-level non-rec functions - = do { (rhs_usg', rhs') <- scExpr env rhs - ; return (usage `combineUsage` rhs_usg', NonRec bndr rhs') } - ----------------------- scRecRhs :: ScEnv -> (OutId, InExpr) -> UniqSM RhsInfo scRecRhs env (bndr,rhs) = do { let (arg_bndrs,body) = collectBinders rhs @@ -1573,7 +1566,8 @@ data RhsInfo } data SpecInfo -- Info about specialisations for a particular Id - = SI { si_specs :: [OneSpec] -- The specialisations we have generated + = SI { si_specs :: [OneSpec] -- The specialisations we have + -- generated for this function , si_n_specs :: Int -- Length of si_specs; used for numbering them @@ -1584,7 +1578,7 @@ data SpecInfo -- Info about specialisations for a particular Id -- RHS usage (which has not yet been -- unleashed) -- Nothing => we have - -- See Note [Local recursive groups] + -- See Note [Seeding recursive groups] -- See Note [spec_usg includes rhs_usg] -- One specialisation: Rule plus definition @@ -1594,57 +1588,62 @@ data OneSpec = , os_id :: OutId -- Spec id , os_rhs :: OutExpr } -- Spec rhs -noSpecInfo :: SpecInfo -noSpecInfo = SI { si_specs = [], si_n_specs = 0, si_mb_unspec = Nothing } +initSpecInfo :: RhsInfo -> SpecInfo +initSpecInfo (RI { ri_rhs_usg = rhs_usg }) + = SI { si_specs = [], si_n_specs = 0, si_mb_unspec = Just rhs_usg } + -- si_mb_unspec: add in rhs_usg if there are any boring calls, + -- or if the bndr is exported ---------------------- specNonRec :: ScEnv - -> ScUsage -- Body usage + -> CallEnv -- Calls in body -> RhsInfo -- Structure info usage info for un-specialised RHS -> UniqSM (ScUsage, SpecInfo) -- Usage from RHSs (specialised and not) -- plus details of specialisations -specNonRec env body_usg rhs_info - = specialise env (scu_calls body_usg) rhs_info - (noSpecInfo { si_mb_unspec = Just (ri_rhs_usg rhs_info) }) +specNonRec env body_calls rhs_info + = specialise env body_calls rhs_info (initSpecInfo rhs_info) ---------------------- -specRec :: TopLevelFlag -> ScEnv - -> ScUsage -- Body usage +specRec :: ScEnv + -> CallEnv -- Calls in body -> [RhsInfo] -- Structure info and usage info for un-specialised RHSs -> UniqSM (ScUsage, [SpecInfo]) -- Usage from all RHSs (specialised and not) -- plus details of specialisations -specRec top_lvl env body_usg rhs_infos - = go 1 seed_calls nullUsage init_spec_infos +specRec env body_calls rhs_infos + = go 1 body_calls nullUsage (map initSpecInfo rhs_infos) + -- body_calls: see Note [Seeding recursive groups] + -- NB: 'go' always calls 'specialise' once, which in turn unleashes + -- si_mb_unspec if there are any boring calls in body_calls, + -- or if any of the Id(s) are exported where opts = sc_opts env - (seed_calls, init_spec_infos) -- Note [Seeding top-level recursive groups] - | isTopLevel top_lvl - , any (isExportedId . ri_fn) rhs_infos -- Seed from body and RHSs - = (all_calls, [noSpecInfo | _ <- rhs_infos]) - | otherwise -- Seed from body only - = (calls_in_body, [noSpecInfo { si_mb_unspec = Just (ri_rhs_usg ri) } - | ri <- rhs_infos]) - - calls_in_body = scu_calls body_usg - calls_in_rhss = foldr (combineCalls . scu_calls . ri_rhs_usg) emptyVarEnv rhs_infos - all_calls = calls_in_rhss `combineCalls` calls_in_body -- Loop, specialising, until you get no new specialisations - go :: Int -- Which iteration of the "until no new specialisations" - -- loop we are on; first iteration is 1 - -> CallEnv -- Seed calls - -- Two accumulating parameters: - -> ScUsage -- Usage from earlier specialisations - -> [SpecInfo] -- Details of specialisations so far - -> UniqSM (ScUsage, [SpecInfo]) + go, go_again :: Int -- Which iteration of the "until no new specialisations" + -- loop we are on; first iteration is 1 + -> CallEnv -- Seed calls + -- Two accumulating parameters: + -> ScUsage -- Usage from earlier specialisations + -> [SpecInfo] -- Details of specialisations so far + -> UniqSM (ScUsage, [SpecInfo]) go n_iter seed_calls usg_so_far spec_infos + = -- pprTrace "specRec3" (vcat [ text "bndrs" <+> ppr (map ri_fn rhs_infos) + -- , text "iteration" <+> int n_iter + -- , text "spec_infos" <+> ppr (map (map os_pat . si_specs) spec_infos) + -- ]) $ + do { specs_w_usg <- zipWithM (specialise env seed_calls) rhs_infos spec_infos + ; let (extra_usg_s, all_spec_infos) = unzip specs_w_usg + extra_usg = combineUsages extra_usg_s + all_usg = usg_so_far `combineUsage` extra_usg + new_calls = scu_calls extra_usg + ; go_again n_iter new_calls all_usg all_spec_infos } + + -- go_again deals with termination + go_again n_iter seed_calls usg_so_far spec_infos | isEmptyVarEnv seed_calls - = -- pprTrace "specRec1" (vcat [ ppr (map ri_fn rhs_infos) - -- , ppr seed_calls - -- , ppr body_usg ]) $ - return (usg_so_far, spec_infos) + = return (usg_so_far, spec_infos) -- Limit recursive specialisation -- See Note [Limit recursive specialisation] @@ -1653,26 +1652,20 @@ specRec top_lvl env body_usg rhs_infos -- If both of these are false, the sc_count -- threshold will prevent non-termination , any ((> the_limit) . si_n_specs) spec_infos - = -- pprTrace "specRec2" (ppr (map (map os_pat . si_specs) spec_infos)) $ - return (usg_so_far, spec_infos) + = -- Give up on specialisation, but don't forget to include the rhs_usg + -- for the unspecialised function, since it may now be called + -- pprTrace "specRec2" (ppr (map (map os_pat . si_specs) spec_infos)) $ + let rhs_usgs = combineUsages (mapMaybe si_mb_unspec spec_infos) + in return (usg_so_far `combineUsage` rhs_usgs, spec_infos) | otherwise - = -- pprTrace "specRec3" (vcat [ text "bndrs" <+> ppr (map ri_fn rhs_infos) - -- , text "iteration" <+> int n_iter - -- , text "spec_infos" <+> ppr (map (map os_pat . si_specs) spec_infos) - -- ]) $ - do { specs_w_usg <- zipWithM (specialise env seed_calls) rhs_infos spec_infos - ; let (extra_usg_s, new_spec_infos) = unzip specs_w_usg - extra_usg = combineUsages extra_usg_s - all_usg = usg_so_far `combineUsage` extra_usg - ; go (n_iter + 1) (scu_calls extra_usg) all_usg new_spec_infos } + = go (n_iter + 1) seed_calls usg_so_far spec_infos -- See Note [Limit recursive specialisation] the_limit = case sc_count opts of Nothing -> 10 -- Ugh! Just max -> max - ---------------------- specialise :: ScEnv @@ -1695,14 +1688,12 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs spec_info@(SI { si_specs = specs, si_n_specs = spec_count , si_mb_unspec = mb_unspec }) | isDeadEndId fn -- Note [Do not specialise diverging functions] - -- and do not generate specialisation seeds from its RHS + -- /and/ do not generate specialisation seeds from its RHS = -- pprTrace "specialise bot" (ppr fn) $ return (nullUsage, spec_info) | not (isNeverActive (idInlineActivation fn)) -- See Note [Transfer activation] - -- - -- -- Don't specialise OPAQUE things, see Note [OPAQUE pragma]. -- Since OPAQUE things are always never-active (see -- GHC.Parser.PostProcess.mkOpaquePragma) this guard never fires for @@ -1728,14 +1719,16 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs ; let spec_usg = combineUsages spec_usgs + unspec_rhs_needed = boring_call || isExportedId fn + -- If there were any boring calls among the seeds (= all_calls), then those -- calls will call the un-specialised function. So we should use the seeds -- from the _unspecialised_ function's RHS, which are in mb_unspec, by returning -- then in new_usg. - (new_usg, mb_unspec') - = case mb_unspec of - Just rhs_usg | boring_call -> (spec_usg `combineUsage` rhs_usg, Nothing) - _ -> (spec_usg, mb_unspec) + (new_usg, mb_unspec') = case mb_unspec of + Just rhs_usg | unspec_rhs_needed + -> (spec_usg `combineUsage` rhs_usg, Nothing) + _ -> (spec_usg, mb_unspec) -- ; pprTrace "specialise return }" -- (vcat [ ppr fn @@ -1743,8 +1736,8 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs -- , text "new calls:" <+> ppr (scu_calls new_usg)]) $ -- return () - ; return (new_usg, SI { si_specs = new_specs ++ specs - , si_n_specs = spec_count + n_pats + ; return (new_usg, SI { si_specs = new_specs ++ specs + , si_n_specs = spec_count + n_pats , si_mb_unspec = mb_unspec' }) } | otherwise -- No calls, inactive, or not a function @@ -1976,7 +1969,8 @@ calcSpecInfo fn (CP { cp_qvars = qvars, cp_args = pats }) extra_bndrs Note [spec_usg includes rhs_usg] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In calls to 'specialise', the returned ScUsage must include the rhs_usg in -the passed-in SpecInfo, unless there are no calls at all to the function. +the passed-in SpecInfo in si_mb_unspec, unless there are no calls at all to +the function. The caller can, indeed must, assume this. They should not combine in rhs_usg themselves, or they'll get rhs_usg twice -- and that can lead to an exponential @@ -2194,9 +2188,11 @@ callsToNewPats :: ScEnv -> Id -> SpecInfo -> [ArgOcc] -> [Call] -> UniqSM (Bool, [CallPat]) - -- Result has no duplicate patterns, - -- nor ones mentioned in done_pats - -- Bool indicates that there was at least one boring pattern +-- Result has no duplicate patterns, +-- nor ones mentioned in si_specs (hence "new" patterns) +-- Bool indicates that there was at least one boring pattern +-- The "New" in the name means "patterns that are not already covered +-- by an existing specialisation" callsToNewPats env fn spec_info@(SI { si_specs = done_specs }) bndr_occs calls = do { mb_pats <- mapM (callToPats env bndr_occs) calls |