summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2020-03-09 10:20:42 +0000
committerBen Gamari <ben@smart-cactus.org>2021-02-12 15:54:08 -0500
commit9a98ecc1a44d5cc200e57db8752de18b2fc3fc6f (patch)
tree6619931f635291041c064fb618a66df48a7c4f53
parent40983d2331fe34c0af6925db7588d5ac6a19ae36 (diff)
downloadhaskell-wip/T17917.tar.gz
Avoid useless w/w splitwip/T17917
This patch is just a tidy-up for the post-strictness-analysis worker wrapper split. Consider f x = x Strictnesss analysis does not lead to a w/w split, so the obvious thing is to leave it 100% alone. But actually, because the RHS is small, we ended up adding a StableUnfolding for it. There is some reason to do this if we choose /not/ do to w/w on the grounds that the function is small. See Note [Don't w/w inline small non-loop-breaker things] But there is no reason if we would not have done w/w anyway. This patch just moves the conditional to later. Easy. This does move some -ddump-simpl printouts around a bit. I also discovered that the previous code was overwritten an InlineCompulsory with InlineStable, which is utterly wrong. That in turn meant that some default methods (marked InlineCompulsory) were getting their InlineCompulsory squashed. This patch fixes that bug --- but of course that does mean a bit more inlining! Metric Decrease: T9233 T9675 Metric Increase: T12707 T11374 T3064 T4029 T9872b T9872d haddock.Cabal
-rw-r--r--compiler/GHC/Core/Opt/WorkWrap.hs206
-rw-r--r--compiler/GHC/Core/Unfold.hs29
2 files changed, 139 insertions, 96 deletions
diff --git a/compiler/GHC/Core/Opt/WorkWrap.hs b/compiler/GHC/Core/Opt/WorkWrap.hs
index 8631888bbd..a4444d9957 100644
--- a/compiler/GHC/Core/Opt/WorkWrap.hs
+++ b/compiler/GHC/Core/Opt/WorkWrap.hs
@@ -30,6 +30,7 @@ import GHC.Types.SourceText
import GHC.Core.Opt.WorkWrap.Utils
import GHC.Utils.Misc
import GHC.Utils.Outputable
+import GHC.Types.Unique
import GHC.Utils.Panic
import GHC.Core.FamInstEnv
import GHC.Utils.Monad
@@ -208,6 +209,23 @@ unfolding to the *worker*. So we will get something like this:
How do we "transfer the unfolding"? Easy: by using the old one, wrapped
in work_fn! See GHC.Core.Unfold.mkWorkerUnfolding.
+Note [No worker-wrapper for record selectors]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We sometimes generate a lot of record selectors, and generally the
+don't benefit from worker/wrapper. Yes, mkWwBodies would find a w/w split,
+but it is then suppressed by the certainlyWillInline test in splitFun.
+
+The wasted effort in mkWwBodies makes a measurable difference in
+compile time (see MR !2873), so although it's a terribly ad-hoc test,
+we just check here for record selectors, and do a no-op in that case.
+
+I did look for a generalisation, so that it's not just record
+selectors that benefit. But you'd need a cheap test for "this
+function will definitely get a w/w split" and that's hard to predict
+in advance...the logic in mkWwBodies is complex. So I've left the
+super-simple test, with this Note to explain.
+
+
Note [Worker-wrapper for NOINLINE functions]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We used to disable worker/wrapper for NOINLINE things, but it turns out
@@ -321,8 +339,8 @@ Note [Don't w/w inline small non-loop-breaker things]
In general, we refrain from w/w-ing *small* functions, which are not
loop breakers, because they'll inline anyway. But we must take care:
it may look small now, but get to be big later after other inlining
-has happened. So we take the precaution of adding an INLINE pragma to
-any such functions.
+has happened. So we take the precaution of adding a StableUnfolding
+for any such functions.
I made this change when I observed a big function at the end of
compilation with a useful strictness signature but no w-w. (It was
@@ -587,93 +605,114 @@ See https://gitlab.haskell.org/ghc/ghc/merge_requests/312#note_192064.
splitFun :: DynFlags -> FamInstEnvs -> Id -> IdInfo -> [Demand] -> Divergence -> CprResult -> CoreExpr
-> UniqSM [(Id, CoreExpr)]
splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
- = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr cpr) ) do
- -- The arity should match the signature
- stuff <- mkWwBodies dflags fam_envs rhs_fvs fn_id wrap_dmds use_cpr_info
- case stuff of
- Just (work_demands, join_arity, wrap_fn, work_fn) -> do
- work_uniq <- getUniqueM
- let work_rhs = work_fn rhs
- work_act = case fn_inline_spec of -- See Note [Worker activation]
- NoInline -> inl_act fn_inl_prag
- _ -> inl_act wrap_prag
-
- work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
- , inl_inline = fn_inline_spec
- , inl_sat = Nothing
- , inl_act = work_act
- , inl_rule = FunLike }
- -- inl_inline: copy from fn_id; see Note [Worker-wrapper for INLINABLE functions]
- -- inl_act: see Note [Worker activation]
- -- inl_rule: it does not make sense for workers to be constructorlike.
-
- work_join_arity | isJoinId fn_id = Just join_arity
- | otherwise = Nothing
- -- worker is join point iff wrapper is join point
- -- (see Note [Don't w/w join points for CPR])
-
- simpl_opts = initSimpleOpts dflags
-
- work_id = mkWorkerId work_uniq fn_id (exprType work_rhs)
- `setIdOccInfo` occInfo fn_info
- -- Copy over occurrence info from parent
- -- Notably whether it's a loop breaker
- -- Doesn't matter much, since we will simplify next, but
- -- seems right-er to do so
-
- `setInlinePragma` work_prag
-
- `setIdUnfolding` mkWorkerUnfolding simpl_opts work_fn fn_unfolding
- -- See Note [Worker-wrapper for INLINABLE functions]
-
- `setIdStrictness` mkClosedStrictSig work_demands div
- -- Even though we may not be at top level,
- -- it's ok to give it an empty DmdEnv
-
- `setIdCprInfo` mkCprSig work_arity work_cpr_info
-
- `setIdDemandInfo` worker_demand
-
- `setIdArity` work_arity
- -- Set the arity so that the Core Lint check that the
- -- arity is consistent with the demand type goes
- -- through
- `asJoinId_maybe` work_join_arity
-
- work_arity = length work_demands
-
- -- See Note [Demand on the Worker]
- single_call = saturatedByOneShots arity (demandInfo fn_info)
- worker_demand | single_call = mkWorkerDemand work_arity
- | otherwise = topDmd
-
- wrap_rhs = wrap_fn work_id
- wrap_prag = mkStrWrapperInlinePrag fn_inl_prag
- wrap_id = fn_id `setIdUnfolding` mkWwInlineRule simpl_opts wrap_rhs arity
- `setInlinePragma` wrap_prag
- `setIdOccInfo` noOccInfo
- -- Zap any loop-breaker-ness, to avoid bleating from Lint
- -- about a loop breaker with an INLINE rule
-
-
-
- return $ [(work_id, work_rhs), (wrap_id, wrap_rhs)]
- -- Worker first, because wrapper mentions it
-
- Nothing -> return [(fn_id, rhs)]
+ | isRecordSelector fn_id -- See Note [No worker/wrapper for record selectors]
+ = return [ (fn_id, rhs ) ]
+
+ | otherwise
+ = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr cpr) )
+ -- The arity should match the signature
+ do { mb_stuff <- mkWwBodies dflags fam_envs rhs_fvs fn_id wrap_dmds use_cpr_info
+ ; case mb_stuff of
+ Nothing -> return [(fn_id, rhs)]
+
+ Just stuff
+ | Just stable_unf <- certainlyWillInline (unfoldingOpts dflags) fn_info
+ -> return [ (fn_id `setIdUnfolding` stable_unf, rhs) ]
+ -- See Note [Don't w/w INLINE things]
+ -- See Note [Don't w/w inline small non-loop-breaker things]
+
+ | otherwise
+ -> do { work_uniq <- getUniqueM
+ ; return (mkWWBindPair dflags fn_id fn_info arity rhs
+ work_uniq div cpr stuff) } }
where
- rhs_fvs = exprFreeVars rhs
- fn_inl_prag = inlinePragInfo fn_info
- fn_inline_spec = inl_inline fn_inl_prag
- fn_unfolding = unfoldingInfo fn_info
- arity = arityInfo fn_info
- -- The arity is set by the simplifier using exprEtaExpandArity
- -- So it may be more than the number of top-level-visible lambdas
+ rhs_fvs = exprFreeVars rhs
+ arity = arityInfo fn_info
+ -- The arity is set by the simplifier using exprEtaExpandArity
+ -- So it may be more than the number of top-level-visible lambdas
-- use_cpr_info is the CPR we w/w for. Note that we kill it for join points,
-- see Note [Don't w/w join points for CPR].
use_cpr_info | isJoinId fn_id = topCpr
| otherwise = cpr
+
+
+mkWWBindPair :: DynFlags -> Id -> IdInfo -> Arity
+ -> CoreExpr -> Unique -> Divergence -> CprResult
+ -> ([Demand], JoinArity, Id -> CoreExpr, Expr CoreBndr -> CoreExpr)
+ -> [(Id, CoreExpr)]
+mkWWBindPair dflags fn_id fn_info arity rhs work_uniq div cpr
+ (work_demands, join_arity, wrap_fn, work_fn)
+ = [(work_id, work_rhs), (wrap_id, wrap_rhs)]
+ -- Worker first, because wrapper mentions it
+ where
+ simpl_opts = initSimpleOpts dflags
+
+ work_rhs = work_fn rhs
+ work_act = case fn_inline_spec of -- See Note [Worker activation]
+ NoInline -> inl_act fn_inl_prag
+ _ -> inl_act wrap_prag
+
+ work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
+ , inl_inline = fn_inline_spec
+ , inl_sat = Nothing
+ , inl_act = work_act
+ , inl_rule = FunLike }
+ -- inl_inline: copy from fn_id; see Note [Worker-wrapper for INLINABLE functions]
+ -- inl_act: see Note [Worker activation]
+ -- inl_rule: it does not make sense for workers to be constructorlike.
+
+ work_join_arity | isJoinId fn_id = Just join_arity
+ | otherwise = Nothing
+ -- worker is join point iff wrapper is join point
+ -- (see Note [Don't w/w join points for CPR])
+
+ work_id = mkWorkerId work_uniq fn_id (exprType work_rhs)
+ `setIdOccInfo` occInfo fn_info
+ -- Copy over occurrence info from parent
+ -- Notably whether it's a loop breaker
+ -- Doesn't matter much, since we will simplify next, but
+ -- seems right-er to do so
+
+ `setInlinePragma` work_prag
+
+ `setIdUnfolding` mkWorkerUnfolding simpl_opts work_fn fn_unfolding
+ -- See Note [Worker-wrapper for INLINABLE functions]
+
+ `setIdStrictness` mkClosedStrictSig work_demands div
+ -- Even though we may not be at top level,
+ -- it's ok to give it an empty DmdEnv
+
+ `setIdCprInfo` mkCprSig work_arity work_cpr_info
+
+ `setIdDemandInfo` worker_demand
+
+ `setIdArity` work_arity
+ -- Set the arity so that the Core Lint check that the
+ -- arity is consistent with the demand type goes
+ -- through
+ `asJoinId_maybe` work_join_arity
+
+ work_arity = length work_demands
+
+ -- See Note [Demand on the Worker]
+ single_call = saturatedByOneShots arity (demandInfo fn_info)
+ worker_demand | single_call = mkWorkerDemand work_arity
+ | otherwise = topDmd
+
+ wrap_rhs = wrap_fn work_id
+ wrap_prag = mkStrWrapperInlinePrag fn_inl_prag
+
+ wrap_id = fn_id `setIdUnfolding` mkWwInlineRule simpl_opts wrap_rhs arity
+ `setInlinePragma` wrap_prag
+ `setIdOccInfo` noOccInfo
+ -- Zap any loop-breaker-ness, to avoid bleating from Lint
+ -- about a loop breaker with an INLINE rule
+
+ fn_inl_prag = inlinePragInfo fn_info
+ fn_inline_spec = inl_inline fn_inl_prag
+ fn_unfolding = unfoldingInfo fn_info
+
-- Even if we don't w/w join points for CPR, we might still do so for
-- strictness. In which case a join point worker keeps its original CPR
-- property; see Note [Don't w/w join points for CPR]. Otherwise, the worker
@@ -681,7 +720,6 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
work_cpr_info | isJoinId fn_id = cpr
| otherwise = topCpr
-
mkStrWrapperInlinePrag :: InlinePragma -> InlinePragma
mkStrWrapperInlinePrag (InlinePragma { inl_act = act, inl_rule = rule_info })
= InlinePragma { inl_src = SourceText "{-# INLINE"
diff --git a/compiler/GHC/Core/Unfold.hs b/compiler/GHC/Core/Unfold.hs
index 8a61eec3c7..27510cf1ab 100644
--- a/compiler/GHC/Core/Unfold.hs
+++ b/compiler/GHC/Core/Unfold.hs
@@ -956,11 +956,21 @@ certainlyWillInline :: UnfoldingOpts -> IdInfo -> Maybe Unfolding
-- ^ Sees if the unfolding is pretty certain to inline.
-- If so, return a *stable* unfolding for it, that will always inline.
certainlyWillInline opts fn_info
- = case unfoldingInfo fn_info of
- CoreUnfolding { uf_tmpl = e, uf_guidance = g }
- | loop_breaker -> Nothing -- Won't inline, so try w/w
- | noinline -> Nothing -- See Note [Worker-wrapper for NOINLINE functions]
- | otherwise -> do_cunf e g -- Depends on size, so look at that
+ = case fn_unf of
+ CoreUnfolding { uf_tmpl = expr, uf_guidance = guidance, uf_src = src }
+ | loop_breaker -> Nothing -- Won't inline, so try w/w
+ | noinline -> Nothing -- See Note [Worker-wrapper for NOINLINE functions]
+ | otherwise
+ -> case guidance of
+ UnfNever -> Nothing
+ UnfWhen {} -> Just (fn_unf { uf_src = src' })
+ -- INLINE functions have UnfWhen
+ UnfIfGoodArgs { ug_size = size, ug_args = args }
+ -> do_cunf expr size args src'
+ where
+ src' = case src of
+ InlineRhs -> InlineStable
+ _ -> src -- Do not change InlineCompulsory!
DFunUnfolding {} -> Just fn_unf -- Don't w/w DFuns; it never makes sense
-- to do so, and even if it is currently a
@@ -973,17 +983,12 @@ certainlyWillInline opts fn_info
noinline = inlinePragmaSpec (inlinePragInfo fn_info) == NoInline
fn_unf = unfoldingInfo fn_info
- do_cunf :: CoreExpr -> UnfoldingGuidance -> Maybe Unfolding
- do_cunf _ UnfNever = Nothing
- do_cunf _ (UnfWhen {}) = Just (fn_unf { uf_src = InlineStable })
- -- INLINE functions have UnfWhen
-
-- The UnfIfGoodArgs case seems important. If we w/w small functions
-- binary sizes go up by 10%! (This is with SplitObjs.)
-- I'm not totally sure why.
-- INLINABLE functions come via this path
-- See Note [certainlyWillInline: INLINABLE]
- do_cunf expr (UnfIfGoodArgs { ug_size = size, ug_args = args })
+ do_cunf expr size args src'
| arityInfo fn_info > 0 -- See Note [certainlyWillInline: be careful of thunks]
, not (isDeadEndSig (strictnessInfo fn_info))
-- Do not unconditionally inline a bottoming functions even if
@@ -991,7 +996,7 @@ certainlyWillInline opts fn_info
-- so we don't want to re-inline it.
, let unf_arity = length args
, size - (10 * (unf_arity + 1)) <= unfoldingUseThreshold opts
- = Just (fn_unf { uf_src = InlineStable
+ = Just (fn_unf { uf_src = src'
, uf_guidance = UnfWhen { ug_arity = unf_arity
, ug_unsat_ok = unSaturatedOk
, ug_boring_ok = inlineBoringOk expr } })