diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2011-11-11 22:04:20 +0000 |
---|---|---|
committer | Simon Peyton Jones <simonpj@microsoft.com> | 2011-11-11 23:20:40 +0000 |
commit | 479504030370947ff3e8d62adb193dd492cf5725 (patch) | |
tree | 91a3e055b4fc37317cdc8db86dc9ab762eabf82b | |
parent | a522c3b25eea1fe40edae7052335acce75e8a1c3 (diff) | |
download | haskell-479504030370947ff3e8d62adb193dd492cf5725.tar.gz |
Make certainlyWillInline more conservative, so that it is never true of thunks. Otherwise the worker-wrapper phase can make a thunk into an unconditionally inline UnfWhen thing, which is Very Bad Thing. Shown up by Trac #5623.
See Note [certainlyWillInline: be caseful of thunks].
-rw-r--r-- | compiler/coreSyn/CoreUnfold.lhs | 28 |
1 files changed, 21 insertions, 7 deletions
diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs index 4f1dee3da3..8cbf4acea2 100644 --- a/compiler/coreSyn/CoreUnfold.lhs +++ b/compiler/coreSyn/CoreUnfold.lhs @@ -278,9 +278,10 @@ Notice that 'x' counts 0, while (f x) counts 2. That's deliberate: there's a function call to account for. Notice also that constructor applications are very cheap, because exposing them to a caller is so valuable. -[25/5/11] All sizes are now multiplied by 10, except for primops. -This makes primops look cheap, and seems to be almost unversally -beneficial. Done partly as a result of #4978. +[25/5/11] All sizes are now multiplied by 10, except for primops +(which have sizes like 1 or 4. This makes primops look fantastically +cheap, and seems to be almost unversally beneficial. Done partly as a +result of #4978. Note [Do not inline top-level bottoming functions] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -337,7 +338,8 @@ uncondInline :: Arity -> Int -> Bool -- Size of call is arity (+1 for the function) -- See Note [INLINE for small functions] uncondInline arity size - | arity == 0 = size == 0 + | arity == 0 = False -- Never unconditionally inline non-lambda + -- PostInlineUnconditionally will do that | otherwise = size <= 10 * (arity + 1) \end{code} @@ -747,17 +749,28 @@ smallEnoughToInline _ ---------------- certainlyWillInline :: Unfolding -> Bool -- Sees if the unfolding is pretty certain to inline -certainlyWillInline (CoreUnfolding { uf_is_cheap = is_cheap, uf_arity = n_vals, uf_guidance = guidance }) +certainlyWillInline (CoreUnfolding { uf_arity = n_vals, uf_guidance = guidance }) = case guidance of UnfNever -> False UnfWhen {} -> True UnfIfGoodArgs { ug_size = size} - -> is_cheap && size - (10 * (n_vals +1)) <= opt_UF_UseThreshold + -> n_vals > 0 -- See Note [certainlyWillInline: be caseful of thunks] + && size - (10 * (n_vals +1)) <= opt_UF_UseThreshold certainlyWillInline _ = False \end{code} +Note [certainlyWillInline: be caseful of thunks] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Don't claim that thunks will certainly inline, because that risks work +duplication. Even if the work duplication is not great (eg is_cheap +holds), it can make a big difference in an inner loop In Trac #5623 we +found that the WorkWrap phase thought that + y = case x of F# v -> F# (v +# v) +was certainlyWillInline, so the addition got duplicated. + + %************************************************************************ %* * \subsection{callSiteInline} @@ -1084,7 +1097,8 @@ to be cheap, and that's good because exprIsConApp_maybe doesn't think that expression is a constructor application. I used to test is_value rather than is_cheap, which was utterly -wrong, because the above expression responds True to exprIsHNF. +wrong, because the above expression responds True to exprIsHNF, +which is what sets is_value. This kind of thing can occur if you have |