summaryrefslogtreecommitdiff
path: root/compiler
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2020-05-21 12:53:35 +0100
committerMarge Bot <ben+marge-bot@smart-cactus.org>2020-06-10 04:17:06 -0400
commit6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c (patch)
tree3129da29bee5d7f34666133a4247a60418d30212 /compiler
parent9454511b0bdfcd79a1899d7f24bf65a3eb0d06e3 (diff)
downloadhaskell-6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c.tar.gz
Implement cast worker/wrapper properly
The cast worker/wrapper transformation transforms x = e |> co into y = e x = y |> co This is done by the simplifier, but we were being careless about transferring IdInfo from x to y, and about what to do if x is a NOINLNE function. This resulted in a series of bugs: #17673, #18093, #18078. This patch fixes all that: * Main change is in GHC.Core.Opt.Simplify, and the new prepareBinding function, which does this cast worker/wrapper transform. See Note [Cast worker/wrappers]. * There is quite a bit of refactoring around prepareRhs, makeTrivial etc. It's nicer now. * Some wrappers from strictness and cast w/w, notably those for a function with a NOINLINE, should inline very late. There wasn't really a mechanism for that, which was an existing bug really; so I invented a new finalPhase = Phase (-1). It's used for all simplifier runs after the user-visible phase 2,1,0 have run. (No new runs of the simplifier are introduced thereby.) See new Note [Compiler phases] in GHC.Types.Basic; the main changes are in GHC.Core.Opt.Driver * Doing this made me trip over two places where the AnonArgFlag on a FunTy was being lost so we could end up with (Num a -> ty) rather than (Num a => ty) - In coercionLKind/coercionRKind - In contHoleType in the Simplifier I fixed the former by defining mkFunctionType and using it in coercionLKind/RKind. I could have done the same for the latter, but the information is almost to hand. So I fixed the latter by - adding sc_hole_ty to ApplyToVal (like ApplyToTy), - adding as_hole_ty to ValArg (like TyArg) - adding sc_fun_ty to StrictArg Turned out I could then remove ai_type from ArgInfo. This is just moving the deck chairs around, but it worked out nicely. See the new Note [AnonArgFlag] in GHC.Types.Var * When looking at the 'arity decrease' thing (#18093) I discovered that stable unfoldings had a much lower arity than the actual optimised function. That's what led to the arity-decrease message. Simple solution: eta-expand. It's described in Note [Eta-expand stable unfoldings] in GHC.Core.Opt.Simplify * I also discovered that unsafeCoerce wasn't being inlined if the context was boring. So (\x. f (unsafeCoerce x)) would create a thunk -- yikes! I fixed that by making inlineBoringOK a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold. I also found that unsafeCoerceName was unused, so I removed it. I made a test case for #18078, and a very similar one for #17673. The net effect of all this on nofib is very modest, but positive: -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- anna -0.4% -0.1% -3.1% -3.1% 0.0% fannkuch-redux -0.4% -0.3% -0.1% -0.1% 0.0% maillist -0.4% -0.1% -7.8% -1.0% -14.3% primetest -0.4% -15.6% -7.1% -6.6% 0.0% -------------------------------------------------------------------------------- Min -0.9% -15.6% -13.3% -14.2% -14.3% Max -0.3% 0.0% +12.1% +12.4% 0.0% Geometric Mean -0.4% -0.2% -2.3% -2.2% -0.1% All following metric decreases are compile-time allocation decreases between -1% and -3%: Metric Decrease: T5631 T13701 T14697 T15164
Diffstat (limited to 'compiler')
-rw-r--r--compiler/GHC/Builtin/Names.hs7
-rw-r--r--compiler/GHC/Core/Coercion.hs4
-rw-r--r--compiler/GHC/Core/Opt/CSE.hs2
-rw-r--r--compiler/GHC/Core/Opt/CprAnal.hs4
-rw-r--r--compiler/GHC/Core/Opt/Driver.hs53
-rw-r--r--compiler/GHC/Core/Opt/Simplify.hs429
-rw-r--r--compiler/GHC/Core/Opt/Simplify/Utils.hs104
-rw-r--r--compiler/GHC/Core/Opt/SpecConstr.hs4
-rw-r--r--compiler/GHC/Core/Opt/WorkWrap.hs61
-rw-r--r--compiler/GHC/Core/SimpleOpt.hs3
-rw-r--r--compiler/GHC/Core/Type.hs48
-rw-r--r--compiler/GHC/Core/Unfold.hs18
-rw-r--r--compiler/GHC/Core/Utils.hs22
-rw-r--r--compiler/GHC/CoreToStg/Prep.hs9
-rw-r--r--compiler/GHC/Types/Basic.hs202
-rw-r--r--compiler/GHC/Types/Id/Make.hs2
-rw-r--r--compiler/GHC/Types/Var.hs85
-rw-r--r--compiler/GHC/Utils/Binary.hs13
18 files changed, 665 insertions, 405 deletions
diff --git a/compiler/GHC/Builtin/Names.hs b/compiler/GHC/Builtin/Names.hs
index 8d7cecafdf..36aba77356 100644
--- a/compiler/GHC/Builtin/Names.hs
+++ b/compiler/GHC/Builtin/Names.hs
@@ -472,7 +472,6 @@ basicKnownKeyNames
, unsafeEqualityTyConName
, unsafeReflDataConName
, unsafeCoercePrimName
- , unsafeCoerceName
]
genericTyConNames :: [Name]
@@ -1333,12 +1332,11 @@ typeErrorShowTypeDataConName =
-- Unsafe coercion proofs
unsafeEqualityProofName, unsafeEqualityTyConName, unsafeCoercePrimName,
- unsafeCoerceName, unsafeReflDataConName :: Name
+ unsafeReflDataConName :: Name
unsafeEqualityProofName = varQual uNSAFE_COERCE (fsLit "unsafeEqualityProof") unsafeEqualityProofIdKey
unsafeEqualityTyConName = tcQual uNSAFE_COERCE (fsLit "UnsafeEquality") unsafeEqualityTyConKey
unsafeReflDataConName = dcQual uNSAFE_COERCE (fsLit "UnsafeRefl") unsafeReflDataConKey
unsafeCoercePrimName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce#") unsafeCoercePrimIdKey
-unsafeCoerceName = varQual uNSAFE_COERCE (fsLit "unsafeCoerce") unsafeCoerceIdKey
-- Dynamic
toDynName :: Name
@@ -2417,10 +2415,9 @@ naturalSDataConKey = mkPreludeMiscIdUnique 568
wordToNaturalIdKey = mkPreludeMiscIdUnique 569
-- Unsafe coercion proofs
-unsafeEqualityProofIdKey, unsafeCoercePrimIdKey, unsafeCoerceIdKey :: Unique
+unsafeEqualityProofIdKey, unsafeCoercePrimIdKey :: Unique
unsafeEqualityProofIdKey = mkPreludeMiscIdUnique 570
unsafeCoercePrimIdKey = mkPreludeMiscIdUnique 571
-unsafeCoerceIdKey = mkPreludeMiscIdUnique 572
{-
************************************************************************
diff --git a/compiler/GHC/Core/Coercion.hs b/compiler/GHC/Core/Coercion.hs
index b5e7770ed3..e89709929b 100644
--- a/compiler/GHC/Core/Coercion.hs
+++ b/compiler/GHC/Core/Coercion.hs
@@ -2188,7 +2188,7 @@ coercionLKind co
go (TyConAppCo _ tc cos) = mkTyConApp tc (map go cos)
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (ForAllCo tv1 _ co1) = mkTyCoInvForAllTy tv1 (go co1)
- go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
+ go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
go (CoVarCo cv) = coVarLType cv
go (HoleCo h) = coVarLType (coHoleCoVar h)
go (UnivCo _ _ ty1 _) = ty1
@@ -2245,7 +2245,7 @@ coercionRKind co
go (AppCo co1 co2) = mkAppTy (go co1) (go co2)
go (CoVarCo cv) = coVarRType cv
go (HoleCo h) = coVarRType (coHoleCoVar h)
- go (FunCo _ co1 co2) = mkVisFunTy (go co1) (go co2)
+ go (FunCo _ co1 co2) = mkFunctionType (go co1) (go co2)
go (UnivCo _ _ _ ty2) = ty2
go (SymCo co) = coercionLKind co
go (TransCo _ co2) = go co2
diff --git a/compiler/GHC/Core/Opt/CSE.hs b/compiler/GHC/Core/Opt/CSE.hs
index 39e5dd8d0a..d6f37f6eb5 100644
--- a/compiler/GHC/Core/Opt/CSE.hs
+++ b/compiler/GHC/Core/Opt/CSE.hs
@@ -404,7 +404,7 @@ delayInlining top_lvl bndr
-- These rules are probably auto-generated specialisations,
-- since Ids with manual rules usually have manually-inserted
-- delayed inlining anyway
- = bndr `setInlineActivation` activeAfterInitial
+ = bndr `setInlineActivation` activateAfterInitial
| otherwise
= bndr
diff --git a/compiler/GHC/Core/Opt/CprAnal.hs b/compiler/GHC/Core/Opt/CprAnal.hs
index bb8161a0b2..4fcda8c4a8 100644
--- a/compiler/GHC/Core/Opt/CprAnal.hs
+++ b/compiler/GHC/Core/Opt/CprAnal.hs
@@ -331,8 +331,8 @@ cprAnalBind top_lvl env id rhs
cprExpandUnfolding_maybe :: Id -> Maybe CoreExpr
cprExpandUnfolding_maybe id = do
guard (idArity id == 0)
- -- There are only phase 0 Simplifier runs after CPR analysis
- guard (isActiveIn 0 (idInlineActivation id))
+ -- There are only FinalPhase Simplifier runs after CPR analysis
+ guard (activeInFinalPhase (idInlineActivation id))
expandUnfolding_maybe (idUnfolding id)
{- Note [Arity trimming for CPR signatures]
diff --git a/compiler/GHC/Core/Opt/Driver.hs b/compiler/GHC/Core/Opt/Driver.hs
index 082eb9d326..07714aafaa 100644
--- a/compiler/GHC/Core/Opt/Driver.hs
+++ b/compiler/GHC/Core/Opt/Driver.hs
@@ -37,7 +37,7 @@ import GHC.Core.Opt.FloatOut ( floatOutwards )
import GHC.Core.FamInstEnv
import GHC.Types.Id
import GHC.Utils.Error ( withTiming, withTimingD, DumpFormat (..) )
-import GHC.Types.Basic ( CompilerPhase(..), isDefaultInlinePragma, defaultInlinePragma )
+import GHC.Types.Basic
import GHC.Types.Var.Set
import GHC.Types.Var.Env
import GHC.Core.Opt.LiberateCase ( liberateCase )
@@ -141,8 +141,10 @@ getCoreToDo dflags
maybe_rule_check phase = runMaybe rule_check (CoreDoRuleCheck phase)
- maybe_strictness_before phase
- = runWhen (phase `elem` strictnessBefore dflags) CoreDoDemand
+ maybe_strictness_before (Phase phase)
+ | phase `elem` strictnessBefore dflags = CoreDoDemand
+ maybe_strictness_before _
+ = CoreDoNothing
base_mode = SimplMode { sm_phase = panic "base_mode"
, sm_names = []
@@ -152,20 +154,20 @@ getCoreToDo dflags
, sm_inline = True
, sm_case_case = True }
- simpl_phase phase names iter
+ simpl_phase phase name iter
= CoreDoPasses
$ [ maybe_strictness_before phase
, CoreDoSimplify iter
- (base_mode { sm_phase = Phase phase
- , sm_names = names })
+ (base_mode { sm_phase = phase
+ , sm_names = [name] })
- , maybe_rule_check (Phase phase) ]
+ , maybe_rule_check phase ]
- simpl_phases = CoreDoPasses [ simpl_phase phase ["main"] max_iter
- | phase <- [phases, phases-1 .. 1] ]
+ -- Run GHC's internal simplification phase, after all rules have run.
+ -- See Note [Compiler phases] in GHC.Types.Basic
+ simplify name = simpl_phase FinalPhase name max_iter
-
- -- initial simplify: mk specialiser happy: minimum effort please
+ -- initial simplify: mk specialiser happy: minimum effort please
simpl_gently = CoreDoSimplify max_iter
(base_mode { sm_phase = InitialPhase
, sm_names = ["Gentle"]
@@ -182,7 +184,7 @@ getCoreToDo dflags
demand_analyser = (CoreDoPasses (
dmd_cpr_ww ++
- [simpl_phase 0 ["post-worker-wrapper"] max_iter]
+ [simplify "post-worker-wrapper"]
))
-- Static forms are moved to the top level with the FloatOut pass.
@@ -203,7 +205,7 @@ getCoreToDo dflags
if opt_level == 0 then
[ static_ptrs_float_outwards,
CoreDoSimplify max_iter
- (base_mode { sm_phase = Phase 0
+ (base_mode { sm_phase = FinalPhase
, sm_names = ["Non-opt simplification"] })
]
@@ -251,8 +253,10 @@ getCoreToDo dflags
-- GHC.Iface.Tidy.StaticPtrTable.
static_ptrs_float_outwards,
- simpl_phases,
-
+ -- Run the simplier phases 2,1,0 to allow rewrite rules to fire
+ CoreDoPasses [ simpl_phase (Phase phase) "main" max_iter
+ | phase <- [phases, phases-1 .. 1] ],
+ simpl_phase (Phase 0) "main" (max max_iter 3),
-- Phase 0: allow all Ids to be inlined now
-- This gets foldr inlined before strictness analysis
@@ -263,7 +267,6 @@ getCoreToDo dflags
-- ==> let k = BIG in letrec go = \xs -> ...(k x).... in go xs
-- ==> let k = BIG in letrec go = \xs -> ...(BIG x).... in go xs
-- Don't stop now!
- simpl_phase 0 ["main"] (max max_iter 3),
runWhen do_float_in CoreDoFloatInwards,
-- Run float-inwards immediately before the strictness analyser
@@ -274,9 +277,10 @@ getCoreToDo dflags
runWhen call_arity $ CoreDoPasses
[ CoreDoCallArity
- , simpl_phase 0 ["post-call-arity"] max_iter
+ , simplify "post-call-arity"
],
+ -- Strictness analysis
runWhen strictness demand_analyser,
runWhen exitification CoreDoExitify,
@@ -302,24 +306,24 @@ getCoreToDo dflags
runWhen do_float_in CoreDoFloatInwards,
- maybe_rule_check (Phase 0),
+ maybe_rule_check FinalPhase,
-- Case-liberation for -O2. This should be after
-- strictness analysis and the simplification which follows it.
runWhen liberate_case (CoreDoPasses [
CoreLiberateCase,
- simpl_phase 0 ["post-liberate-case"] max_iter
+ simplify "post-liberate-case"
]), -- Run the simplifier after LiberateCase to vastly
-- reduce the possibility of shadowing
-- Reason: see Note [Shadowing] in GHC.Core.Opt.SpecConstr
runWhen spec_constr CoreDoSpecConstr,
- maybe_rule_check (Phase 0),
+ maybe_rule_check FinalPhase,
runWhen late_specialise
(CoreDoPasses [ CoreDoSpecialising
- , simpl_phase 0 ["post-late-spec"] max_iter]),
+ , simplify "post-late-spec"]),
-- LiberateCase can yield new CSE opportunities because it peels
-- off one layer of a recursive function (concretely, I saw this
@@ -328,11 +332,10 @@ getCoreToDo dflags
runWhen ((liberate_case || spec_constr) && cse) CoreCSE,
-- Final clean-up simplification:
- simpl_phase 0 ["final"] max_iter,
+ simplify "final",
runWhen late_dmd_anal $ CoreDoPasses (
- dmd_cpr_ww ++
- [simpl_phase 0 ["post-late-ww"] max_iter]
+ dmd_cpr_ww ++ [simplify "post-late-ww"]
),
-- Final run of the demand_analyser, ensures that one-shot thunks are
@@ -342,7 +345,7 @@ getCoreToDo dflags
-- can become /exponentially/ more expensive. See #11731, #12996.
runWhen (strictness || late_dmd_anal) CoreDoDemand,
- maybe_rule_check (Phase 0)
+ maybe_rule_check FinalPhase
]
-- Remove 'CoreDoNothing' and flatten 'CoreDoPasses' for clarity.
diff --git a/compiler/GHC/Core/Opt/Simplify.hs b/compiler/GHC/Core/Opt/Simplify.hs
index 15e1c0550a..73f941c332 100644
--- a/compiler/GHC/Core/Opt/Simplify.hs
+++ b/compiler/GHC/Core/Opt/Simplify.hs
@@ -46,12 +46,12 @@ import GHC.Core.Ppr ( pprCoreExpr )
import GHC.Types.Unique ( hasKey )
import GHC.Core.Unfold
import GHC.Core.Utils
+import GHC.Core.Opt.Arity ( etaExpand )
import GHC.Core.SimpleOpt ( pushCoTyArg, pushCoValArg
, joinPointBinding_maybe, joinPointBindings_maybe )
import GHC.Core.FVs ( mkRuleInfo )
import GHC.Core.Rules ( lookupRule, getRules )
-import GHC.Types.Basic ( TopLevelFlag(..), isNotTopLevel, isTopLevel,
- RecFlag(..), Arity )
+import GHC.Types.Basic
import GHC.Utils.Monad ( mapAccumLM, liftIO )
import GHC.Types.Var ( isTyCoVar )
import GHC.Data.Maybe ( orElse )
@@ -318,8 +318,8 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
-- ANF-ise a constructor or PAP rhs
-- We get at most one float per argument here
- ; (let_floats, body2) <- {-#SCC "prepareRhs" #-} prepareRhs (getMode env) top_lvl
- (getOccFS bndr1) (idInfo bndr1) body1
+ ; (let_floats, bndr2, body2) <- {-#SCC "prepareBinding" #-}
+ prepareBinding env top_lvl bndr bndr1 body1
; let body_floats2 = body_floats1 `addLetFloats` let_floats
; (rhs_floats, rhs')
@@ -344,7 +344,7 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
; return (floats, rhs') }
; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
- top_lvl Nothing bndr bndr1 rhs'
+ top_lvl Nothing bndr bndr2 rhs'
; return (rhs_floats `addFloats` bind_float, env2) }
--------------------------
@@ -396,16 +396,16 @@ completeNonRecX :: TopLevelFlag -> SimplEnv
completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
= ASSERT2( not (isJoinId new_bndr), ppr new_bndr )
- do { (prepd_floats, rhs1) <- prepareRhs (getMode env) top_lvl (getOccFS new_bndr)
- (idInfo new_bndr) new_rhs
+ do { (prepd_floats, new_bndr, new_rhs)
+ <- prepareBinding env top_lvl old_bndr new_bndr new_rhs
; let floats = emptyFloats env `addLetFloats` prepd_floats
; (rhs_floats, rhs2) <-
- if doFloatFromRhs NotTopLevel NonRecursive is_strict floats rhs1
+ if doFloatFromRhs NotTopLevel NonRecursive is_strict floats new_rhs
then -- Add the floats to the main env
do { tick LetFloatFromLet
- ; return (floats, rhs1) }
+ ; return (floats, new_rhs) }
else -- Do not float; wrap the floats around the RHS
- return (emptyFloats env, wrapFloats floats rhs1)
+ return (emptyFloats env, wrapFloats floats new_rhs)
; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
NotTopLevel Nothing
@@ -415,12 +415,146 @@ completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
{- *********************************************************************
* *
- prepareRhs, makeTrivial
+ prepareBinding, prepareRhs, makeTrivial
* *
************************************************************************
-Note [prepareRhs]
-~~~~~~~~~~~~~~~~~
+Note [Cast worker/wrappers]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When we have a binding
+ x = e |> co
+we want to do something very similar to worker/wrapper:
+ $wx = e
+ x = $wx |> co
+
+So now x can be inlined freely. There's a chance that e will be a
+constructor application or function, or something like that, so moving
+the coercion to the usage site may well cancel the coercions and lead
+to further optimisation. Example:
+
+ data family T a :: *
+ data instance T Int = T Int
+
+ foo :: Int -> Int -> Int
+ foo m n = ...
+ where
+ t = T m
+ go 0 = 0
+ go n = case t of { T m -> go (n-m) }
+ -- This case should optimise
+
+We call this making a cast worker/wrapper, and it's done by prepareBinding.
+
+We need to be careful with inline/noinline pragmas:
+ rec { {-# NOINLINE f #-}
+ f = (...g...) |> co
+ ; g = ...f... }
+This is legitimate -- it tells GHC to use f as the loop breaker
+rather than g. Now we do the cast thing, to get something like
+ rec { $wf = ...g...
+ ; f = $wf |> co
+ ; g = ...f... }
+Where should the NOINLINE pragma go? If we leave it on f we'll get
+ rec { $wf = ...g...
+ ; {-# NOINLINE f #-}
+ f = $wf |> co
+ ; g = ...f... }
+and that is bad bad: the whole point is that we want to inline that
+cast! We want to transfer the pagma to $wf:
+ rec { {-# NOINLINE $wf #-}
+ $wf = ...g...
+ ; f = $wf |> co
+ ; g = ...f... }
+It's exactly like worker/wrapper for strictness analysis:
+ f is the wrapper and must inline like crazy
+ $wf is the worker and must carry f's original pragma
+See Note [Worker-wrapper for NOINLINE functions] in
+GHC.Core.Opt.WorkWrap.
+
+See #17673, #18093, #18078.
+
+Note [Preserve strictness in cast w/w]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In the Note [Cast worker/wrappers] transformation, keep the strictness info.
+Eg
+ f = e `cast` co -- f has strictness SSL
+When we transform to
+ f' = e -- f' also has strictness SSL
+ f = f' `cast` co -- f still has strictness SSL
+
+Its not wrong to drop it on the floor, but better to keep it.
+
+Note [Cast w/w: unlifted]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+BUT don't do cast worker/wrapper if 'e' has an unlifted type.
+This *can* happen:
+
+ foo :: Int = (error (# Int,Int #) "urk")
+ `cast` CoUnsafe (# Int,Int #) Int
+
+If do the makeTrivial thing to the error call, we'll get
+ foo = case error (# Int,Int #) "urk" of v -> v `cast` ...
+But 'v' isn't in scope!
+
+These strange casts can happen as a result of case-of-case
+ bar = case (case x of { T -> (# 2,3 #); F -> error "urk" }) of
+ (# p,q #) -> p+q
+
+NOTE: Nowadays we don't use casts for these error functions;
+instead, we use (case erorr ... of {}). So I'm not sure
+this Note makes much sense any more.
+-}
+
+prepareBinding :: SimplEnv -> TopLevelFlag
+ -> InId -> OutId -> OutExpr
+ -> SimplM (LetFloats, OutId, OutExpr)
+
+prepareBinding env top_lvl old_bndr bndr rhs
+ | Cast rhs1 co <- rhs
+ -- Try for cast worker/wrapper
+ -- See Note [Cast worker/wrappers]
+ , not (isStableUnfolding (realIdUnfolding old_bndr))
+ -- Don't make a cast w/w if the thing is going to be inlined anyway
+ , not (exprIsTrivial rhs1)
+ -- Nor if the RHS is trivial; then again it'll be inlined
+ , let ty1 = coercionLKind co
+ , not (isUnliftedType ty1)
+ -- Not if rhs has an unlifted type; see Note [Cast w/w: unlifted]
+ = do { (floats, new_id) <- makeTrivialBinding (getMode env) top_lvl
+ (getOccFS bndr) worker_info rhs1 ty1
+ ; let bndr' = bndr `setInlinePragma` mkCastWrapperInlinePrag (idInlinePragma bndr)
+ ; return (floats, bndr', Cast (Var new_id) co) }
+
+ | otherwise
+ = do { (floats, rhs') <- prepareRhs (getMode env) top_lvl (getOccFS bndr) rhs
+ ; return (floats, bndr, rhs') }
+ where
+ info = idInfo bndr
+ worker_info = vanillaIdInfo `setStrictnessInfo` strictnessInfo info
+ `setCprInfo` cprInfo info
+ `setDemandInfo` demandInfo info
+ `setInlinePragInfo` inlinePragInfo info
+ `setArityInfo` arityInfo info
+ -- We do /not/ want to transfer OccInfo, Rules, Unfolding
+ -- Note [Preserve strictness in cast w/w]
+
+mkCastWrapperInlinePrag :: InlinePragma -> InlinePragma
+-- See Note [Cast wrappers]
+mkCastWrapperInlinePrag (InlinePragma { inl_act = act, inl_rule = rule_info })
+ = InlinePragma { inl_src = SourceText "{-# INLINE"
+ , inl_inline = NoUserInline -- See Note [Wrapper NoUserInline]
+ , inl_sat = Nothing -- in GHC.Core.Opt.WorkWrap
+ , inl_act = wrap_act -- See Note [Wrapper activation]
+ , inl_rule = rule_info } -- in GHC.Core.Opt.WorkWrap
+ -- RuleMatchInfo is (and must be) unaffected
+ where
+ -- See Note [Wrapper activation] in GHC.Core.Opt.WorkWrap
+ -- But simpler, because we don't need to disable during InitialPhase
+ wrap_act | isNeverActive act = activateDuringFinal
+ | otherwise = act
+
+{- Note [prepareRhs]
+~~~~~~~~~~~~~~~~~~~~
prepareRhs takes a putative RHS, checks whether it's a PAP or
constructor application and, if so, converts it to ANF, so that the
resulting thing can be inlined more easily. Thus
@@ -438,26 +572,16 @@ That's what the 'go' loop in prepareRhs does
-}
prepareRhs :: SimplMode -> TopLevelFlag
- -> FastString -- Base for any new variables
- -> IdInfo -- IdInfo for the LHS of this binding
+ -> FastString -- Base for any new variables
-> OutExpr
-> SimplM (LetFloats, OutExpr)
--- Transforms a RHS into a better RHS by adding floats
+-- Transforms a RHS into a better RHS by ANF'ing args
+-- for expandable RHSs: constructors and PAPs
-- e.g x = Just e
-- becomes a = e
-- x = Just a
-- See Note [prepareRhs]
-prepareRhs mode top_lvl occ info (Cast rhs co) -- Note [Float coercions]
- | let ty1 = coercionLKind co -- Do *not* do this if rhs has an unlifted type
- , not (isUnliftedType ty1) -- see Note [Float coercions (unlifted)]
- = do { (floats, rhs') <- makeTrivialWithInfo mode top_lvl occ sanitised_info rhs
- ; return (floats, Cast rhs' co) }
- where
- sanitised_info = vanillaIdInfo `setStrictnessInfo` strictnessInfo info
- `setCprInfo` cprInfo info
- `setDemandInfo` demandInfo info
-
-prepareRhs mode top_lvl occ _ rhs0
+prepareRhs mode top_lvl occ rhs0
= do { (_is_exp, floats, rhs1) <- go 0 rhs0
; return (floats, rhs1) }
where
@@ -501,61 +625,10 @@ prepareRhs mode top_lvl occ _ rhs0
go _ other
= return (False, emptyLetFloats, other)
-{-
-Note [Float coercions]
-~~~~~~~~~~~~~~~~~~~~~~
-When we find the binding
- x = e `cast` co
-we'd like to transform it to
- x' = e
- x = x `cast` co -- A trivial binding
-There's a chance that e will be a constructor application or function, or something
-like that, so moving the coercion to the usage site may well cancel the coercions
-and lead to further optimisation. Example:
-
- data family T a :: *
- data instance T Int = T Int
-
- foo :: Int -> Int -> Int
- foo m n = ...
- where
- x = T m
- go 0 = 0
- go n = case x of { T m -> go (n-m) }
- -- This case should optimise
-
-Note [Preserve strictness when floating coercions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In the Note [Float coercions] transformation, keep the strictness info.
-Eg
- f = e `cast` co -- f has strictness SSL
-When we transform to
- f' = e -- f' also has strictness SSL
- f = f' `cast` co -- f still has strictness SSL
-
-Its not wrong to drop it on the floor, but better to keep it.
-
-Note [Float coercions (unlifted)]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-BUT don't do [Float coercions] if 'e' has an unlifted type.
-This *can* happen:
-
- foo :: Int = (error (# Int,Int #) "urk")
- `cast` CoUnsafe (# Int,Int #) Int
-
-If do the makeTrivial thing to the error call, we'll get
- foo = case error (# Int,Int #) "urk" of v -> v `cast` ...
-But 'v' isn't in scope!
-
-These strange casts can happen as a result of case-of-case
- bar = case (case x of { T -> (# 2,3 #); F -> error "urk" }) of
- (# p,q #) -> p+q
--}
-
makeTrivialArg :: SimplMode -> ArgSpec -> SimplM (LetFloats, ArgSpec)
-makeTrivialArg mode (ValArg e)
+makeTrivialArg mode arg@(ValArg { as_arg = e })
= do { (floats, e') <- makeTrivial mode NotTopLevel (fsLit "arg") e
- ; return (floats, ValArg e') }
+ ; return (floats, arg { as_arg = e' }) }
makeTrivialArg _ arg
= return (emptyLetFloats, arg) -- CastBy, TyArg
@@ -564,29 +637,32 @@ makeTrivial :: SimplMode -> TopLevelFlag
-> OutExpr -- ^ This expression satisfies the let/app invariant
-> SimplM (LetFloats, OutExpr)
-- Binds the expression to a variable, if it's not trivial, returning the variable
-makeTrivial mode top_lvl context expr
- = makeTrivialWithInfo mode top_lvl context vanillaIdInfo expr
-
-makeTrivialWithInfo :: SimplMode -> TopLevelFlag
- -> FastString -- ^ a "friendly name" to build the new binder from
- -> IdInfo
- -> OutExpr -- ^ This expression satisfies the let/app invariant
- -> SimplM (LetFloats, OutExpr)
--- Propagate strictness and demand info to the new binder
--- Note [Preserve strictness when floating coercions]
--- Returned SimplEnv has same substitution as incoming one
-makeTrivialWithInfo mode top_lvl occ_fs info expr
+makeTrivial mode top_lvl occ_fs expr
| exprIsTrivial expr -- Already trivial
|| not (bindingOk top_lvl expr expr_ty) -- Cannot trivialise
-- See Note [Cannot trivialise]
= return (emptyLetFloats, expr)
+ | Cast expr' co <- expr
+ = do { (floats, triv_expr) <- makeTrivial mode top_lvl occ_fs expr'
+ ; return (floats, Cast triv_expr co) }
+
| otherwise
- = do { (floats, expr1) <- prepareRhs mode top_lvl occ_fs info expr
- ; if exprIsTrivial expr1 -- See Note [Trivial after prepareRhs]
- then return (floats, expr1)
- else do
- { uniq <- getUniqueM
+ = do { (floats, new_id) <- makeTrivialBinding mode top_lvl occ_fs
+ vanillaIdInfo expr expr_ty
+ ; return (floats, Var new_id) }
+ where
+ expr_ty = exprType expr
+
+makeTrivialBinding :: SimplMode -> TopLevelFlag
+ -> FastString -- ^ a "friendly name" to build the new binder from
+ -> IdInfo
+ -> OutExpr -- ^ This expression satisfies the let/app invariant
+ -> OutType -- Type of the expression
+ -> SimplM (LetFloats, OutId)
+makeTrivialBinding mode top_lvl occ_fs info expr expr_ty
+ = do { (floats, expr1) <- prepareRhs mode top_lvl occ_fs expr
+ ; uniq <- getUniqueM
; let name = mkSystemVarName uniq occ_fs
var = mkLocalIdWithInfo name expr_ty info
@@ -598,9 +674,7 @@ makeTrivialWithInfo mode top_lvl occ_fs info expr
; let final_id = addLetBndrInfo var arity is_bot unf
bind = NonRec final_id expr2
- ; return ( floats `addLetFlts` unitLetFloat bind, Var final_id ) }}
- where
- expr_ty = exprType expr
+ ; return ( floats `addLetFlts` unitLetFloat bind, final_id ) }
bindingOk :: TopLevelFlag -> CoreExpr -> Type -> Bool
-- True iff we can have a binding of this expression at this level
@@ -609,15 +683,8 @@ bindingOk top_lvl expr expr_ty
| isTopLevel top_lvl = exprIsTopLevelBindable expr expr_ty
| otherwise = True
-{- Note [Trivial after prepareRhs]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If we call makeTrival on (e |> co), the recursive use of prepareRhs
-may leave us with
- { a1 = e } and (a1 |> co)
-Now the latter is trivial, so we don't want to let-bind it.
-
-Note [Cannot trivialise]
-~~~~~~~~~~~~~~~~~~~~~~~~
+{- Note [Cannot trivialise]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
Consider:
f :: Int -> Addr#
@@ -699,7 +766,7 @@ completeBind env top_lvl mb_cont old_bndr new_bndr new_rhs
-- Simplify the unfolding
; new_unfolding <- simplLetUnfolding env top_lvl mb_cont old_bndr
- final_rhs (idType new_bndr) old_unf
+ final_rhs (idType new_bndr) new_arity old_unf
; let final_bndr = addLetBndrInfo new_bndr new_arity is_bot new_unfolding
-- See Note [In-scope set as a substitution]
@@ -931,6 +998,7 @@ simplExprF1 env (App fun arg) cont
, sc_cont = cont } }
_ -> simplExprF env fun $
ApplyToVal { sc_arg = arg, sc_env = env
+ , sc_hole_ty = substTy env (exprType fun)
, sc_dup = NoDup, sc_cont = cont }
simplExprF1 env expr@(Lam {}) cont
@@ -1235,8 +1303,8 @@ rebuild env expr cont
Select { sc_bndr = bndr, sc_alts = alts, sc_env = se, sc_cont = cont }
-> rebuildCase (se `setInScopeFromE` env) expr bndr alts cont
- StrictArg { sc_fun = fun, sc_cont = cont }
- -> rebuildCall env (fun `addValArgTo` expr) cont
+ StrictArg { sc_fun = fun, sc_cont = cont, sc_fun_ty = fun_ty }
+ -> rebuildCall env (addValArgTo fun expr fun_ty ) cont
StrictBind { sc_bndr = b, sc_bndrs = bs, sc_body = body
, sc_env = se, sc_cont = cont }
-> do { (floats1, env') <- simplNonRecX (se `setInScopeFromE` env) b expr
@@ -1274,7 +1342,7 @@ In particular, we want to behave well on
* (f |> co) @t1 @t2 ... @tn x1 .. xm
- Here we wil use pushCoTyArg and pushCoValArg successively, which
+ Here we will use pushCoTyArg and pushCoValArg successively, which
build up NthCo stacks. Silly to do that if co is reflexive.
However, we don't want to call isReflexiveCo too much, because it uses
@@ -1313,20 +1381,20 @@ simplCast env body co0 cont0
where
co' = mkTransCo co1 co2
- addCoerce co cont@(ApplyToTy { sc_arg_ty = arg_ty, sc_cont = tail })
+ addCoerce co (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = tail })
| Just (arg_ty', m_co') <- pushCoTyArg co arg_ty
- -- N.B. As mentioned in Note [The hole type in ApplyToTy] this is
- -- only needed by `sc_hole_ty` which is often not forced.
- -- Consequently it is worthwhile using a lazy pattern match here to
- -- avoid unnecessary coercionKind evaluations.
- , let hole_ty = coercionLKind co
= {-#SCC "addCoerce-pushCoTyArg" #-}
do { tail' <- addCoerceM m_co' tail
- ; return (cont { sc_arg_ty = arg_ty'
- , sc_hole_ty = hole_ty -- NB! As the cast goes past, the
- -- type of the hole changes (#16312)
- , sc_cont = tail' }) }
-
+ ; return (ApplyToTy { sc_arg_ty = arg_ty'
+ , sc_cont = tail'
+ , sc_hole_ty = coercionLKind co }) }
+ -- NB! As the cast goes past, the
+ -- type of the hole changes (#16312)
+
+ -- (f |> co) e ===> (f (e |> co1)) |> co2
+ -- where co :: (s1->s2) ~ (t1~t2)
+ -- co1 :: t1 ~ s1
+ -- co2 :: s2 ~ t2
addCoerce co cont@(ApplyToVal { sc_arg = arg, sc_env = arg_se
, sc_dup = dup, sc_cont = tail })
| Just (co1, m_co2) <- pushCoValArg co
@@ -1350,7 +1418,8 @@ simplCast env body co0 cont0
; return (ApplyToVal { sc_arg = mkCast arg' co1
, sc_env = arg_se'
, sc_dup = dup'
- , sc_cont = tail' }) } }
+ , sc_cont = tail'
+ , sc_hole_ty = coercionLKind co }) } }
addCoerce co cont
| isReflexiveCo co = return cont -- Having this at the end makes a huge
@@ -1429,7 +1498,7 @@ simplLamBndr env bndr
| isId bndr && hasCoreUnfolding old_unf -- Special case
= do { (env1, bndr1) <- simplBinder env bndr
; unf' <- simplStableUnfolding env1 NotTopLevel Nothing bndr
- old_unf (idType bndr1)
+ (idType bndr1) (idArity bndr1) old_unf
; let bndr2 = bndr1 `setIdUnfolding` unf'
; return (modifyInScope env1 bndr2, bndr2) }
@@ -1877,12 +1946,12 @@ rebuildCall env info@(ArgInfo { ai_fun = fun, ai_args = rev_args
rebuildCall env info (CastIt co cont)
= rebuildCall env (addCastTo info co) cont
-rebuildCall env info (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont })
- = rebuildCall env (addTyArgTo info arg_ty) cont
+rebuildCall env info (ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = cont })
+ = rebuildCall env (addTyArgTo info arg_ty hole_ty) cont
---------- The runRW# rule. Do this after absorbing all arguments ------
-- runRW# :: forall (r :: RuntimeRep) (o :: TYPE r). (State# RealWorld -> o) -> o
--- K[ runRW# rr ty (\s. body) ] --> runRW rr' ty' (\s. K[ body ])
+-- K[ runRW# rr ty body ] --> runRW rr' ty' (\s. K[ body s ])
rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args })
(ApplyToVal { sc_arg = arg, sc_env = arg_se, sc_cont = cont })
| fun `hasKey` runRWKey
@@ -1890,23 +1959,25 @@ rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args })
, [ TyArg {}, TyArg {} ] <- rev_args
= do { s <- newId (fsLit "s") realWorldStatePrimTy
; let env' = (arg_se `setInScopeFromE` env) `addNewInScopeIds` [s]
+ ty' = contResultType cont
cont' = ApplyToVal { sc_dup = Simplified, sc_arg = Var s
- , sc_env = env', sc_cont = cont }
+ , sc_env = env', sc_cont = cont
+ , sc_hole_ty = mkVisFunTy realWorldStatePrimTy ty' }
+ -- cont' applies to s, then K
; body' <- simplExprC env' arg cont'
; let arg' = Lam s body'
- ty' = contResultType cont
rr' = getRuntimeRep ty'
call' = mkApps (Var fun) [mkTyArg rr', mkTyArg ty', arg']
; return (emptyFloats env, call') }
-rebuildCall env info@(ArgInfo { ai_type = fun_ty, ai_encl = encl_rules
+rebuildCall env info@(ArgInfo { ai_encl = encl_rules
, ai_strs = str:strs, ai_discs = disc:discs })
(ApplyToVal { sc_arg = arg, sc_env = arg_se
- , sc_dup = dup_flag, sc_cont = cont })
-
+ , sc_dup = dup_flag, sc_hole_ty = fun_ty
+ , sc_cont = cont })
-- Argument is already simplified
| isSimplified dup_flag -- See Note [Avoid redundant simplification]
- = rebuildCall env (addValArgTo info' arg) cont
+ = rebuildCall env (addValArgTo info' arg fun_ty) cont
-- Strict arguments
| str
@@ -1914,7 +1985,8 @@ rebuildCall env info@(ArgInfo { ai_type = fun_ty, ai_encl = encl_rules
= -- pprTrace "Strict Arg" (ppr arg $$ ppr (seIdSubst env) $$ ppr (seInScope env)) $
simplExprF (arg_se `setInScopeFromE` env) arg
(StrictArg { sc_fun = info', sc_cci = cci_strict
- , sc_dup = Simplified, sc_cont = cont })
+ , sc_dup = Simplified, sc_fun_ty = fun_ty
+ , sc_cont = cont })
-- Note [Shadowing]
-- Lazy arguments
@@ -1925,7 +1997,7 @@ rebuildCall env info@(ArgInfo { ai_type = fun_ty, ai_encl = encl_rules
-- floating a demanded let.
= do { arg' <- simplExprC (arg_se `setInScopeFromE` env) arg
(mkLazyArgStop arg_ty cci_lazy)
- ; rebuildCall env (addValArgTo info' arg') cont }
+ ; rebuildCall env (addValArgTo info' arg' fun_ty) cont }
where
info' = info { ai_strs = strs, ai_discs = discs }
arg_ty = funArgTy fun_ty
@@ -2133,9 +2205,11 @@ trySeqRules in_env scrut rhs cont
where
no_cast_scrut = drop_casts scrut
scrut_ty = exprType no_cast_scrut
- seq_id_ty = idType seqId
- res1_ty = piResultTy seq_id_ty rhs_rep
- res2_ty = piResultTy res1_ty scrut_ty
+ seq_id_ty = idType seqId -- forall r a (b::TYPE r). a -> b -> b
+ res1_ty = piResultTy seq_id_ty rhs_rep -- forall a (b::TYPE rhs_rep). a -> b -> b
+ res2_ty = piResultTy res1_ty scrut_ty -- forall (b::TYPE rhs_rep). scrut_ty -> b -> b
+ res3_ty = piResultTy res2_ty rhs_ty -- scrut_ty -> rhs_ty -> rhs_ty
+ res4_ty = funResultTy res3_ty -- rhs_ty -> rhs_ty
rhs_ty = substTy in_env (exprType rhs)
rhs_rep = getRuntimeRep rhs_ty
out_args = [ TyArg { as_arg_ty = rhs_rep
@@ -2144,9 +2218,11 @@ trySeqRules in_env scrut rhs cont
, as_hole_ty = res1_ty }
, TyArg { as_arg_ty = rhs_ty
, as_hole_ty = res2_ty }
- , ValArg no_cast_scrut]
+ , ValArg { as_arg = no_cast_scrut
+ , as_hole_ty = res3_ty } ]
rule_cont = ApplyToVal { sc_dup = NoDup, sc_arg = rhs
- , sc_env = in_env, sc_cont = cont }
+ , sc_env = in_env, sc_cont = cont
+ , sc_hole_ty = res4_ty }
-- Lazily evaluated, so we don't do most of this
drop_casts (Cast e _) = drop_casts e
@@ -3136,7 +3212,8 @@ mkDupableCont env (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs
, sc_dup = OkToDup
, sc_cont = mkBoringStop res_ty } ) }
-mkDupableCont env (StrictArg { sc_fun = info, sc_cci = cci, sc_cont = cont })
+mkDupableCont env (StrictArg { sc_fun = info, sc_cci = cci
+ , sc_cont = cont, sc_fun_ty = fun_ty })
-- See Note [Duplicating StrictArg]
-- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable
= do { (floats1, cont') <- mkDupableCont env cont
@@ -3144,8 +3221,9 @@ mkDupableCont env (StrictArg { sc_fun = info, sc_cci = cci, sc_cont = cont })
(ai_args info)
; return ( foldl' addLetFloats floats1 floats_s
, StrictArg { sc_fun = info { ai_args = args' }
- , sc_cci = cci
, sc_cont = cont'
+ , sc_cci = cci
+ , sc_fun_ty = fun_ty
, sc_dup = OkToDup} ) }
mkDupableCont env (ApplyToTy { sc_cont = cont
@@ -3155,7 +3233,8 @@ mkDupableCont env (ApplyToTy { sc_cont = cont
, sc_arg_ty = arg_ty, sc_hole_ty = hole_ty }) }
mkDupableCont env (ApplyToVal { sc_arg = arg, sc_dup = dup
- , sc_env = se, sc_cont = cont })
+ , sc_env = se, sc_cont = cont
+ , sc_hole_ty = hole_ty })
= -- e.g. [...hole...] (...arg...)
-- ==>
-- let a = ...arg...
@@ -3173,7 +3252,8 @@ mkDupableCont env (ApplyToVal { sc_arg = arg, sc_dup = dup
-- arg'' in its in-scope set, even if makeTrivial
-- has turned arg'' into a fresh variable
-- See Note [StaticEnv invariant] in GHC.Core.Opt.Simplify.Utils
- , sc_dup = OkToDup, sc_cont = cont' }) }
+ , sc_dup = OkToDup, sc_cont = cont'
+ , sc_hole_ty = hole_ty }) }
mkDupableCont env (Select { sc_bndr = case_bndr, sc_alts = alts
, sc_env = se, sc_cont = cont })
@@ -3517,11 +3597,11 @@ because we don't know its usage in each RHS separately
simplLetUnfolding :: SimplEnv-> TopLevelFlag
-> MaybeJoinCont
-> InId
- -> OutExpr -> OutType
+ -> OutExpr -> OutType -> Arity
-> Unfolding -> SimplM Unfolding
-simplLetUnfolding env top_lvl cont_mb id new_rhs rhs_ty unf
+simplLetUnfolding env top_lvl cont_mb id new_rhs rhs_ty arity unf
| isStableUnfolding unf
- = simplStableUnfolding env top_lvl cont_mb id unf rhs_ty
+ = simplStableUnfolding env top_lvl cont_mb id rhs_ty arity unf
| isExitJoinId id
= return noUnfolding -- See Note [Do not inline exit join points] in GHC.Core.Opt.Exitify
| otherwise
@@ -3547,9 +3627,10 @@ mkLetUnfolding dflags top_lvl src id new_rhs
simplStableUnfolding :: SimplEnv -> TopLevelFlag
-> MaybeJoinCont -- Just k => a join point with continuation k
-> InId
- -> Unfolding -> OutType -> SimplM Unfolding
+ -> OutType -> Arity -> Unfolding
+ ->SimplM Unfolding
-- Note [Setting the new unfolding]
-simplStableUnfolding env top_lvl mb_cont id unf rhs_ty
+simplStableUnfolding env top_lvl mb_cont id rhs_ty id_arity unf
= case unf of
NoUnfolding -> return unf
BootUnfolding -> return unf
@@ -3562,9 +3643,13 @@ simplStableUnfolding env top_lvl mb_cont id unf rhs_ty
CoreUnfolding { uf_tmpl = expr, uf_src = src, uf_guidance = guide }
| isStableSource src
- -> do { expr' <- case mb_cont of -- See Note [Rules and unfolding for join points]
- Just cont -> simplJoinRhs unf_env id expr cont
- Nothing -> simplExprC unf_env expr (mkBoringStop rhs_ty)
+ -> do { expr' <- case mb_cont of
+ Just cont -> -- Binder is a join point
+ -- See Note [Rules and unfolding for join points]
+ simplJoinRhs unf_env id expr cont
+ Nothing -> -- Binder is not a join point
+ do { expr' <- simplExprC unf_env expr (mkBoringStop rhs_ty)
+ ; return (eta_expand expr') }
; case guide of
UnfWhen { ug_arity = arity
, ug_unsat_ok = sat_ok
@@ -3601,7 +3686,41 @@ simplStableUnfolding env top_lvl mb_cont id unf rhs_ty
unf_env = updMode (updModeForStableUnfoldings act) env
-- See Note [Simplifying inside stable unfoldings] in GHC.Core.Opt.Simplify.Utils
-{-
+ -- See Note [Eta-expand stable unfoldings]
+ eta_expand expr
+ | not eta_on = expr
+ | exprIsTrivial expr = expr
+ | otherwise = etaExpand id_arity expr
+ eta_on = sm_eta_expand (getMode env)
+
+{- Note [Eta-expand stable unfoldings]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+For INLINE/INLINABLE things (which get stable unfoldings) there's a danger
+of getting
+ f :: Int -> Int -> Int -> Blah
+ [ Arity = 3 -- Good arity
+ , Unf=Stable (\xy. blah) -- Less good arity, only 2
+ f = \pqr. e
+
+This can happen because f's RHS is optimised more vigorously than
+its stable unfolding. Now suppose we have a call
+ g = f x
+Because f has arity=3, g will have arity=2. But if we inline f (using
+its stable unfolding) g's arity will reduce to 1, because <blah>
+hasn't been optimised yet. This happened in the 'parsec' library,
+for Text.Pasec.Char.string.
+
+Generally, if we know that 'f' has arity N, it seems sensible to
+eta-expand the stable unfolding to arity N too. Simple and consistent.
+
+Wrinkles
+* Don't eta-expand a trivial expr, else each pass will eta-reduce it,
+ and then eta-expand again. See Note [Do not eta-expand trivial expressions]
+ in GHC.Core.Opt.Simplify.Utils.
+* Don't eta-expand join points; see Note [Do not eta-expand join points]
+ in GHC.Core.Opt.Simplify.Utils. We uphold this because the join-point
+ case (mb_cont = Just _) doesn't use eta_expand.
+
Note [Force bottoming field]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We need to force bottoming, or the new unfolding holds
diff --git a/compiler/GHC/Core/Opt/Simplify/Utils.hs b/compiler/GHC/Core/Opt/Simplify/Utils.hs
index f26fdf9840..5878445d44 100644
--- a/compiler/GHC/Core/Opt/Simplify/Utils.hs
+++ b/compiler/GHC/Core/Opt/Simplify/Utils.hs
@@ -118,7 +118,9 @@ data SimplCont
SimplCont
| ApplyToVal -- (ApplyToVal arg K)[e] = K[ e arg ]
- { sc_dup :: DupFlag -- See Note [DupFlag invariants]
+ { sc_dup :: DupFlag -- See Note [DupFlag invariants]
+ , sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
+ -- See Note [The hole type in ApplyToTy/Val]
, sc_arg :: InExpr -- The argument,
, sc_env :: StaticEnv -- see Note [StaticEnv invariant]
, sc_cont :: SimplCont }
@@ -126,7 +128,7 @@ data SimplCont
| ApplyToTy -- (ApplyToTy ty K)[e] = K[ e ty ]
{ sc_arg_ty :: OutType -- Argument type
, sc_hole_ty :: OutType -- Type of the function, presumably (forall a. blah)
- -- See Note [The hole type in ApplyToTy]
+ -- See Note [The hole type in ApplyToTy/Val]
, sc_cont :: SimplCont }
| Select -- (Select alts K)[e] = K[ case e of alts ]
@@ -151,6 +153,9 @@ data SimplCont
, sc_fun :: ArgInfo -- Specifies f, e1..en, Whether f has rules, etc
-- plus strictness flags for *further* args
, sc_cci :: CallCtxt -- Whether *this* argument position is interesting
+ , sc_fun_ty :: OutType -- Type of the function (f e1 .. en),
+ -- presumably (arg_ty -> res_ty)
+ -- where res_ty is expected by sc_cont
, sc_cont :: SimplCont }
| TickIt -- (TickIt t K)[e] = K[ tick t e ]
@@ -254,8 +259,6 @@ data ArgInfo
ai_fun :: OutId, -- The function
ai_args :: [ArgSpec], -- ...applied to these args (which are in *reverse* order)
- ai_type :: OutType, -- Type of (f a1 ... an)
-
ai_rules :: FunRules, -- Rules for this function
ai_encl :: Bool, -- Flag saying whether this function
@@ -271,37 +274,36 @@ data ArgInfo
}
data ArgSpec
- = ValArg OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
+ = ValArg { as_arg :: OutExpr -- Apply to this (coercion or value); c.f. ApplyToVal
+ , as_hole_ty :: OutType } -- Type of the function (presumably t1 -> t2)
| TyArg { as_arg_ty :: OutType -- Apply to this type; c.f. ApplyToTy
, as_hole_ty :: OutType } -- Type of the function (presumably forall a. blah)
| CastBy OutCoercion -- Cast by this; c.f. CastIt
instance Outputable ArgSpec where
- ppr (ValArg e) = text "ValArg" <+> ppr e
+ ppr (ValArg { as_arg = arg }) = text "ValArg" <+> ppr arg
ppr (TyArg { as_arg_ty = ty }) = text "TyArg" <+> ppr ty
ppr (CastBy c) = text "CastBy" <+> ppr c
-addValArgTo :: ArgInfo -> OutExpr -> ArgInfo
-addValArgTo ai arg = ai { ai_args = ValArg arg : ai_args ai
- , ai_type = applyTypeToArg (ai_type ai) arg
- , ai_rules = decRules (ai_rules ai) }
+addValArgTo :: ArgInfo -> OutExpr -> OutType -> ArgInfo
+addValArgTo ai arg hole_ty = ai { ai_args = arg_spec : ai_args ai
+ , ai_rules = decRules (ai_rules ai) }
+ where
+ arg_spec = ValArg { as_arg = arg, as_hole_ty = hole_ty }
-addTyArgTo :: ArgInfo -> OutType -> ArgInfo
-addTyArgTo ai arg_ty = ai { ai_args = arg_spec : ai_args ai
- , ai_type = piResultTy poly_fun_ty arg_ty
- , ai_rules = decRules (ai_rules ai) }
+addTyArgTo :: ArgInfo -> OutType -> OutType -> ArgInfo
+addTyArgTo ai arg_ty hole_ty = ai { ai_args = arg_spec : ai_args ai
+ , ai_rules = decRules (ai_rules ai) }
where
- poly_fun_ty = ai_type ai
- arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = poly_fun_ty }
+ arg_spec = TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
addCastTo :: ArgInfo -> OutCoercion -> ArgInfo
-addCastTo ai co = ai { ai_args = CastBy co : ai_args ai
- , ai_type = coercionRKind co }
+addCastTo ai co = ai { ai_args = CastBy co : ai_args ai }
argInfoAppArgs :: [ArgSpec] -> [OutExpr]
argInfoAppArgs [] = []
argInfoAppArgs (CastBy {} : _) = [] -- Stop at a cast
-argInfoAppArgs (ValArg e : as) = e : argInfoAppArgs as
+argInfoAppArgs (ValArg { as_arg = arg } : as) = arg : argInfoAppArgs as
argInfoAppArgs (TyArg { as_arg_ty = ty } : as) = Type ty : argInfoAppArgs as
pushSimplifiedArgs :: SimplEnv -> [ArgSpec] -> SimplCont -> SimplCont
@@ -310,7 +312,9 @@ pushSimplifiedArgs env (arg : args) k
= case arg of
TyArg { as_arg_ty = arg_ty, as_hole_ty = hole_ty }
-> ApplyToTy { sc_arg_ty = arg_ty, sc_hole_ty = hole_ty, sc_cont = rest }
- ValArg e -> ApplyToVal { sc_arg = e, sc_env = env, sc_dup = Simplified, sc_cont = rest }
+ ValArg { as_arg = arg, as_hole_ty = hole_ty }
+ -> ApplyToVal { sc_arg = arg, sc_env = env, sc_dup = Simplified
+ , sc_hole_ty = hole_ty, sc_cont = rest }
CastBy c -> CastIt c rest
where
rest = pushSimplifiedArgs env args k
@@ -323,7 +327,7 @@ argInfoExpr fun rev_args
= go rev_args
where
go [] = Var fun
- go (ValArg a : as) = go as `App` a
+ go (ValArg { as_arg = arg } : as) = go as `App` arg
go (TyArg { as_arg_ty = ty } : as) = go as `App` Type ty
go (CastBy co : as) = mkCast (go as) co
@@ -409,11 +413,9 @@ contHoleType (TickIt _ k) = contHoleType k
contHoleType (CastIt co _) = coercionLKind co
contHoleType (StrictBind { sc_bndr = b, sc_dup = dup, sc_env = se })
= perhapsSubstTy dup se (idType b)
-contHoleType (StrictArg { sc_fun = ai }) = funArgTy (ai_type ai)
-contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy]
-contHoleType (ApplyToVal { sc_arg = e, sc_env = se, sc_dup = dup, sc_cont = k })
- = mkVisFunTy (perhapsSubstTy dup se (exprType e))
- (contHoleType k)
+contHoleType (StrictArg { sc_fun_ty = ty }) = funArgTy ty
+contHoleType (ApplyToTy { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
+contHoleType (ApplyToVal { sc_hole_ty = ty }) = ty -- See Note [The hole type in ApplyToTy/Val]
contHoleType (Select { sc_dup = d, sc_bndr = b, sc_env = se })
= perhapsSubstTy d se (idType b)
@@ -458,13 +460,13 @@ mkArgInfo :: SimplEnv
mkArgInfo env fun rules n_val_args call_cont
| n_val_args < idArity fun -- Note [Unsaturated functions]
- = ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
+ = ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = False
, ai_strs = vanilla_stricts
, ai_discs = vanilla_discounts }
| otherwise
- = ArgInfo { ai_fun = fun, ai_args = [], ai_type = fun_ty
+ = ArgInfo { ai_fun = fun, ai_args = []
, ai_rules = fun_rules
, ai_encl = interestingArgContext rules call_cont
, ai_strs = arg_stricts
@@ -1091,7 +1093,7 @@ seems to be to do a callSiteInline based on the fact that there is
something interesting about the call site (it's strict). Hmm. That
seems a bit fragile.
-Conclusion: inline top level things gaily until Phase 0 (the last
+Conclusion: inline top level things gaily until FinalPhase (the last
phase), at which point don't.
Note [pre/postInlineUnconditionally in gentle mode]
@@ -1214,23 +1216,21 @@ preInlineUnconditionally env top_lvl bndr rhs rhs_env
-- not ticks. Counting ticks cannot be duplicated, and non-counting
-- ticks around a Lam will disappear anyway.
- early_phase = case sm_phase mode of
- Phase 0 -> False
- _ -> True
--- If we don't have this early_phase test, consider
--- x = length [1,2,3]
--- The full laziness pass carefully floats all the cons cells to
--- top level, and preInlineUnconditionally floats them all back in.
--- Result is (a) static allocation replaced by dynamic allocation
--- (b) many simplifier iterations because this tickles
--- a related problem; only one inlining per pass
---
--- On the other hand, I have seen cases where top-level fusion is
--- lost if we don't inline top level thing (e.g. string constants)
--- Hence the test for phase zero (which is the phase for all the final
--- simplifications). Until phase zero we take no special notice of
--- top level things, but then we become more leery about inlining
--- them.
+ early_phase = sm_phase mode /= FinalPhase
+ -- If we don't have this early_phase test, consider
+ -- x = length [1,2,3]
+ -- The full laziness pass carefully floats all the cons cells to
+ -- top level, and preInlineUnconditionally floats them all back in.
+ -- Result is (a) static allocation replaced by dynamic allocation
+ -- (b) many simplifier iterations because this tickles
+ -- a related problem; only one inlining per pass
+ --
+ -- On the other hand, I have seen cases where top-level fusion is
+ -- lost if we don't inline top level thing (e.g. string constants)
+ -- Hence the test for phase zero (which is the phase for all the final
+ -- simplifications). Until phase zero we take no special notice of
+ -- top level things, but then we become more leery about inlining
+ -- them.
{-
************************************************************************
@@ -1549,7 +1549,7 @@ tryEtaExpandRhs mode bndr rhs
return (new_arity, is_bot, new_rhs) }
where
try_expand
- | exprIsTrivial rhs
+ | exprIsTrivial rhs -- See Note [Do not eta-expand trivial expressions]
= return (exprArity rhs, False, rhs)
| sm_eta_expand mode -- Provided eta-expansion is on
@@ -1593,9 +1593,17 @@ because then 'genMap' will inline, and it really shouldn't: at least
as far as the programmer is concerned, it's not applied to two
arguments!
+Note [Do not eta-expand trivial expressions]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Do not eta-expand a trivial RHS like
+ f = g
+If we eta expand do
+ f = \x. g x
+we'll just eta-reduce again, and so on; so the
+simplifier never terminates.
+
Note [Do not eta-expand join points]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
Similarly to CPR (see Note [Don't w/w join points for CPR] in
GHC.Core.Opt.WorkWrap), a join point stands well to gain from its outer binding's
eta-expansion, and eta-expanding a join point is fraught with issues like how to
diff --git a/compiler/GHC/Core/Opt/SpecConstr.hs b/compiler/GHC/Core/Opt/SpecConstr.hs
index b2b9f11b9e..da8aaa3447 100644
--- a/compiler/GHC/Core/Opt/SpecConstr.hs
+++ b/compiler/GHC/Core/Opt/SpecConstr.hs
@@ -1760,8 +1760,8 @@ Note [Transfer activation]
In which phase should the specialise-constructor rules be active?
Originally I made them always-active, but Manuel found that this
defeated some clever user-written rules. Then I made them active only
-in Phase 0; after all, currently, the specConstr transformation is
-only run after the simplifier has reached Phase 0, but that meant
+in FinalPhase; after all, currently, the specConstr transformation is
+only run after the simplifier has reached FinalPhase, but that meant
that specialisations didn't fire inside wrappers; see test
simplCore/should_compile/spec-inline.
diff --git a/compiler/GHC/Core/Opt/WorkWrap.hs b/compiler/GHC/Core/Opt/WorkWrap.hs
index df71f103a8..acffd58c43 100644
--- a/compiler/GHC/Core/Opt/WorkWrap.hs
+++ b/compiler/GHC/Core/Opt/WorkWrap.hs
@@ -245,8 +245,8 @@ NOINLINE pragma to the worker.
(See #13143 for a real-world example.)
It is crucial that we do this for *all* NOINLINE functions. #10069
-demonstrates what happens when we promise to w/w a (NOINLINE) leaf function, but
-fail to deliver:
+demonstrates what happens when we promise to w/w a (NOINLINE) leaf
+function, but fail to deliver:
data C = C Int# Int#
@@ -421,19 +421,27 @@ When should the wrapper inlining be active?
In module Bar we want to give specialisations a chance to fire
before inlining f's wrapper.
+ Historical note: At one stage I tried making the wrapper inlining
+ always-active, and that had a very bad effect on nofib/imaginary/x2n1;
+ a wrapper was inlined before the specialisation fired.
+
Reminder: Note [Don't w/w INLINE things], so we don't need to worry
about INLINE things here.
Conclusion:
- If the user said NOINLINE[n], respect that
- - If the user said NOINLINE, inline the wrapper as late as
- poss (phase 0). This is a compromise driven by (2) above
+
+ - If the user said NOINLINE, inline the wrapper only after
+ phase 0, the last user-visible phase. That means that all
+ rules will have had a chance to fire.
+
+ What phase is after phase 0? Answer: FinalPhase, that's the reason it
+ exists. NB: Similar to InitialPhase, users can't write INLINE[Final] f;
+ it's syntactically illegal.
+
- Otherwise inline wrapper in phase 2. That allows the
'gentle' simplification pass to apply specialisation rules
-Historical note: At one stage I tried making the wrapper inlining
-always-active, and that had a very bad effect on nofib/imaginary/x2n1;
-a wrapper was inlined before the specialisation fired.
Note [Wrapper NoUserInline]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -575,8 +583,8 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
work_uniq <- getUniqueM
let work_rhs = work_fn rhs
work_act = case fn_inline_spec of -- See Note [Worker activation]
- NoInline -> fn_act
- _ -> wrap_act
+ NoInline -> inl_act fn_inl_prag
+ _ -> inl_act wrap_prag
work_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
, inl_inline = fn_inline_spec
@@ -626,19 +634,7 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topDmd
wrap_rhs = wrap_fn work_id
- wrap_act = case fn_act of -- See Note [Wrapper activation]
- ActiveAfter {} -> fn_act
- NeverActive -> activeDuringFinal
- _ -> activeAfterInitial
- wrap_prag = InlinePragma { inl_src = SourceText "{-# INLINE"
- , inl_inline = NoUserInline
- , inl_sat = Nothing
- , inl_act = wrap_act
- , inl_rule = rule_match_info }
- -- inl_act: see Note [Wrapper activation]
- -- inl_inline: see Note [Wrapper NoUserInline]
- -- inl_rule: RuleMatchInfo is (and must be) unaffected
-
+ wrap_prag = mkStrWrapperInlinePrag fn_inl_prag
wrap_id = fn_id `setIdUnfolding` mkWwInlineRule dflags wrap_rhs arity
`setInlinePragma` wrap_prag
`setIdOccInfo` noOccInfo
@@ -655,8 +651,6 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
rhs_fvs = exprFreeVars rhs
fn_inl_prag = inlinePragInfo fn_info
fn_inline_spec = inl_inline fn_inl_prag
- fn_act = inl_act fn_inl_prag
- rule_match_info = inlinePragmaRuleMatchInfo fn_inl_prag
fn_unfolding = unfoldingInfo fn_info
arity = arityInfo fn_info
-- The arity is set by the simplifier using exprEtaExpandArity
@@ -674,6 +668,25 @@ splitFun dflags fam_envs fn_id fn_info wrap_dmds div cpr rhs
| otherwise = topCpr
+mkStrWrapperInlinePrag :: InlinePragma -> InlinePragma
+mkStrWrapperInlinePrag (InlinePragma { inl_act = act, inl_rule = rule_info })
+ = InlinePragma { inl_src = SourceText "{-# INLINE"
+ , inl_inline = NoUserInline -- See Note [Wrapper NoUserInline]
+ , inl_sat = Nothing
+ , inl_act = wrap_act
+ , inl_rule = rule_info } -- RuleMatchInfo is (and must be) unaffected
+ where
+ wrap_act = case act of -- See Note [Wrapper activation]
+ NeverActive -> activateDuringFinal
+ FinalActive -> act
+ ActiveAfter {} -> act
+ ActiveBefore {} -> activateAfterInitial
+ AlwaysActive -> activateAfterInitial
+ -- For the last two cases, see (4) in Note [Wrapper activation]
+ -- NB: the (ActiveBefore n) isn't quite right. We really want
+ -- it to be active *after* Initial but *before* n. We don't have
+ -- a way to say that, alas.
+
{-
Note [Demand on the worker]
~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/compiler/GHC/Core/SimpleOpt.hs b/compiler/GHC/Core/SimpleOpt.hs
index afb51d1219..216c30d8fc 100644
--- a/compiler/GHC/Core/SimpleOpt.hs
+++ b/compiler/GHC/Core/SimpleOpt.hs
@@ -1367,8 +1367,7 @@ pushCoTyArg co ty
| otherwise
= Nothing
where
- tyL = coercionLKind co
- tyR = coercionRKind co
+ Pair tyL tyR = coercionKind co
-- co :: tyL ~ tyR
-- tyL = forall (a1 :: k1). ty1
-- tyR = forall (a2 :: k2). ty2
diff --git a/compiler/GHC/Core/Type.hs b/compiler/GHC/Core/Type.hs
index d12aafb9d7..57e570cd79 100644
--- a/compiler/GHC/Core/Type.hs
+++ b/compiler/GHC/Core/Type.hs
@@ -51,7 +51,7 @@ module GHC.Core.Type (
splitPiTy_maybe, splitPiTy, splitPiTys,
mkTyConBindersPreferAnon,
mkPiTy, mkPiTys,
- mkLamType, mkLamTypes,
+ mkLamType, mkLamTypes, mkFunctionType,
piResultTy, piResultTys,
applyTysX, dropForAlls,
mkFamilyTyConApp,
@@ -256,7 +256,7 @@ import {-# SOURCE #-} GHC.Core.Coercion
, mkTyConAppCo, mkAppCo, mkCoVarCo, mkAxiomRuleCo
, mkForAllCo, mkFunCo, mkAxiomInstCo, mkUnivCo
, mkSymCo, mkTransCo, mkNthCo, mkLRCo, mkInstCo
- , mkKindCo, mkSubCo, mkFunCo, mkAxiomInstCo
+ , mkKindCo, mkSubCo
, decomposePiCos, coercionKind, coercionLKind
, coercionRKind, coercionType
, isReflexiveCo, seqCo )
@@ -1517,6 +1517,8 @@ mkLamType :: Var -> Type -> Type
mkLamTypes :: [Var] -> Type -> Type
-- ^ 'mkLamType' for multiple type or value arguments
+mkLamTypes vs ty = foldr mkLamType ty vs
+
mkLamType v body_ty
| isTyVar v
= ForAllTy (Bndr v Inferred) body_ty
@@ -1525,43 +1527,19 @@ mkLamType v body_ty
, v `elemVarSet` tyCoVarsOfType body_ty
= ForAllTy (Bndr v Required) body_ty
- | isPredTy arg_ty -- See Note [mkLamType: dictionary arguments]
- = mkInvisFunTy arg_ty body_ty
-
| otherwise
- = mkVisFunTy arg_ty body_ty
- where
- arg_ty = varType v
-
-mkLamTypes vs ty = foldr mkLamType ty vs
+ = mkFunctionType (varType v) body_ty
-{- Note [mkLamType: dictionary arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If we have (\ (d :: Ord a). blah), we want to give it type
- (Ord a => blah_ty)
-with a fat arrow; that is, using mkInvisFunTy, not mkVisFunTy.
-Why? After all, we are in Core, where (=>) and (->) behave the same.
-Yes, but the /specialiser/ does treat dictionary arguments specially.
-Suppose we do w/w on 'foo' in module A, thus (#11272, #6056)
- foo :: Ord a => Int -> blah
- foo a d x = case x of I# x' -> $wfoo @a d x'
+mkFunctionType :: Type -> Type -> Type
+-- This one works out the AnonArgFlag from the argument type
+-- See GHC.Types.Var Note [AnonArgFlag]
+mkFunctionType arg_ty res_ty
+ | isPredTy arg_ty -- See GHC.Types.Var Note [AnonArgFlag]
+ = mkInvisFunTy arg_ty res_ty
- $wfoo :: Ord a => Int# -> blah
-
-Now in module B we see (foo @Int dOrdInt). The specialiser will
-specialise this to $sfoo, where
- $sfoo :: Int -> blah
- $sfoo x = case x of I# x' -> $wfoo @Int dOrdInt x'
-
-Now we /must/ also specialise $wfoo! But it wasn't user-written,
-and has a type built with mkLamTypes.
-
-Conclusion: the easiest thing is to make mkLamType build
- (c => ty)
-when the argument is a predicate type. See GHC.Core.TyCo.Rep
-Note [Types for coercions, predicates, and evidence]
--}
+ | otherwise
+ = mkVisFunTy arg_ty res_ty
-- | Given a list of type-level vars and the free vars of a result kind,
-- makes TyCoBinders, preferring anonymous binders
diff --git a/compiler/GHC/Core/Unfold.hs b/compiler/GHC/Core/Unfold.hs
index 7e080367e8..b614c87248 100644
--- a/compiler/GHC/Core/Unfold.hs
+++ b/compiler/GHC/Core/Unfold.hs
@@ -412,6 +412,8 @@ inlineBoringOk e
, exprIsTrivial a = go (credit-1) f
go credit (Tick _ e) = go credit e -- dubious
go credit (Cast e _) = go credit e
+ go credit (Case scrut _ _ [(_,_,rhs)]) -- See Note [Inline unsafeCoerce]
+ | isUnsafeEqualityProof scrut = go credit rhs
go _ (Var {}) = boringCxtOk
go _ _ = boringCxtNotOk
@@ -459,7 +461,21 @@ calcUnfoldingGuidance dflags is_top_bottoming expr
| otherwise = (+)
-- See Note [Function and non-function discounts]
-{-
+{- Note [Inline unsafeCoerce]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We really want to inline unsafeCoerce, even when applied to boring
+arguments. It doesn't look as if its RHS is smaller than the call
+ unsafeCoerce x = case unsafeEqualityProof @a @b of UnsafeRefl -> x
+but that case is discarded -- see Note [Implementing unsafeCoerce]
+in base:Unsafe.Coerce.
+
+Moreover, if we /don't/ inline it, we may be left with
+ f (unsafeCoerce x)
+which will build a thunk -- bad, bad, bad.
+
+Conclusion: we really want inlineBoringOk to be True of the RHS of
+unsafeCoerce. This is (U4a) in Note [Implementing unsafeCoerce].
+
Note [Computing the size of an expression]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The basic idea of sizeExpr is obvious enough: count nodes. But getting the
diff --git a/compiler/GHC/Core/Utils.hs b/compiler/GHC/Core/Utils.hs
index d6ff419c10..700ab14b1e 100644
--- a/compiler/GHC/Core/Utils.hs
+++ b/compiler/GHC/Core/Utils.hs
@@ -56,6 +56,9 @@ module GHC.Core.Utils (
-- * Join points
isJoinBind,
+ -- * unsafeEqualityProof
+ isUnsafeEqualityProof,
+
-- * Dumping stuff
dumpIdInfoOfProgram
) where
@@ -66,7 +69,7 @@ import GHC.Prelude
import GHC.Platform
import GHC.Core
-import GHC.Builtin.Names ( makeStaticName )
+import GHC.Builtin.Names ( makeStaticName, unsafeEqualityProofName )
import GHC.Core.Ppr
import GHC.Core.FVs( exprFreeVars )
import GHC.Types.Var
@@ -2533,3 +2536,20 @@ dumpIdInfoOfProgram ppr_id_info binds = vcat (map printId ids)
getIds (Rec bs) = map fst bs
printId id | isExportedId id = ppr id <> colon <+> (ppr_id_info (idInfo id))
| otherwise = empty
+
+
+{- *********************************************************************
+* *
+ unsafeEqualityProof
+* *
+********************************************************************* -}
+
+isUnsafeEqualityProof :: CoreExpr -> Bool
+-- See (U3) and (U4) in
+-- Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
+isUnsafeEqualityProof e
+ | Var v `App` Type _ `App` Type _ `App` Type _ <- e
+ = idName v == unsafeEqualityProofName
+ | otherwise
+ = False
+
diff --git a/compiler/GHC/CoreToStg/Prep.hs b/compiler/GHC/CoreToStg/Prep.hs
index 9b4690e016..7420475813 100644
--- a/compiler/GHC/CoreToStg/Prep.hs
+++ b/compiler/GHC/CoreToStg/Prep.hs
@@ -1095,15 +1095,6 @@ cpExprIsTrivial e
| otherwise
= exprIsTrivial e
-isUnsafeEqualityProof :: CoreExpr -> Bool
--- See (U3) and (U4) in
--- Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
-isUnsafeEqualityProof e
- | Var v `App` Type _ `App` Type _ `App` Type _ <- e
- = idName v == unsafeEqualityProofName
- | otherwise
- = False
-
-- This is where we arrange that a non-trivial argument is let-bound
cpeArg :: CorePrepEnv -> Demand
-> CoreArg -> Type -> UniqSM (Floats, CpeArg)
diff --git a/compiler/GHC/Types/Basic.hs b/compiler/GHC/Types/Basic.hs
index a831ab995e..82f929fb47 100644
--- a/compiler/GHC/Types/Basic.hs
+++ b/compiler/GHC/Types/Basic.hs
@@ -80,9 +80,9 @@ module GHC.Types.Basic (
CompilerPhase(..), PhaseNum,
- Activation(..), isActive, isActiveIn, competesWith,
- isNeverActive, isAlwaysActive, isEarlyActive,
- activeAfterInitial, activeDuringFinal,
+ Activation(..), isActive, competesWith,
+ isNeverActive, isAlwaysActive, activeInFinalPhase,
+ activateAfterInitial, activateDuringFinal,
RuleMatchInfo(..), isConLike, isFunLike,
InlineSpec(..), noUserInlineSpec,
@@ -1300,6 +1300,27 @@ pprWithSourceText (SourceText src) _ = text src
************************************************************************
When a rule or inlining is active
+
+Note [Compiler phases]
+~~~~~~~~~~~~~~~~~~~~~~
+The CompilerPhase says which phase the simplifier is running in:
+
+* InitialPhase: before all user-visible phases
+
+* Phase 2,1,0: user-visible phases; the phase number
+ controls rule ordering an inlining.
+
+* FinalPhase: used for all subsequent simplifier
+ runs. By delaying inlining of wrappers to FinalPhase we can
+ ensure that RULE have a good chance to fire. See
+ Note [Wrapper activation] in GHC.Core.Opt.WorkWrap
+
+ NB: FinalPhase is run repeatedly, not just once.
+
+ NB: users don't have access to InitialPhase or FinalPhase.
+ They write {-# INLINE[n] f #-}, meaning (Phase n)
+
+The phase sequencing is done by GHC.Opt.Simplify.Driver
-}
-- | Phase Number
@@ -1308,37 +1329,109 @@ type PhaseNum = Int -- Compilation phase
-- Zero is the last phase
data CompilerPhase
- = Phase PhaseNum
- | InitialPhase -- The first phase -- number = infinity!
+ = InitialPhase -- The first phase -- number = infinity!
+ | Phase PhaseNum -- User-specificable phases
+ | FinalPhase -- The last phase -- number = -infinity!
+ deriving Eq
instance Outputable CompilerPhase where
ppr (Phase n) = int n
ppr InitialPhase = text "InitialPhase"
+ ppr FinalPhase = text "FinalPhase"
-activeAfterInitial :: Activation
+-- See note [Pragma source text]
+data Activation
+ = AlwaysActive
+ | ActiveBefore SourceText PhaseNum -- Active only *strictly before* this phase
+ | ActiveAfter SourceText PhaseNum -- Active in this phase and later
+ | FinalActive -- Active in final phase only
+ | NeverActive
+ deriving( Eq, Data )
+ -- Eq used in comparing rules in GHC.Hs.Decls
+
+activateAfterInitial :: Activation
-- Active in the first phase after the initial phase
--- Currently we have just phases [2,1,0]
-activeAfterInitial = ActiveAfter NoSourceText 2
+-- Currently we have just phases [2,1,0,FinalPhase,FinalPhase,...]
+-- Where FinalPhase means GHC's internal simplification steps
+-- after all rules have run
+activateAfterInitial = ActiveAfter NoSourceText 2
-activeDuringFinal :: Activation
+activateDuringFinal :: Activation
-- Active in the final simplification phase (which is repeated)
-activeDuringFinal = ActiveAfter NoSourceText 0
+activateDuringFinal = FinalActive
--- See note [Pragma source text]
-data Activation = NeverActive
- | AlwaysActive
- | ActiveBefore SourceText PhaseNum
- -- Active only *strictly before* this phase
- | ActiveAfter SourceText PhaseNum
- -- Active in this phase and later
- deriving( Eq, Data )
- -- Eq used in comparing rules in GHC.Hs.Decls
+isActive :: CompilerPhase -> Activation -> Bool
+isActive InitialPhase act = activeInInitialPhase act
+isActive (Phase p) act = activeInPhase p act
+isActive FinalPhase act = activeInFinalPhase act
+
+activeInInitialPhase :: Activation -> Bool
+activeInInitialPhase AlwaysActive = True
+activeInInitialPhase (ActiveBefore {}) = True
+activeInInitialPhase _ = False
+
+activeInPhase :: PhaseNum -> Activation -> Bool
+activeInPhase _ AlwaysActive = True
+activeInPhase _ NeverActive = False
+activeInPhase _ FinalActive = False
+activeInPhase p (ActiveAfter _ n) = p <= n
+activeInPhase p (ActiveBefore _ n) = p > n
+
+activeInFinalPhase :: Activation -> Bool
+activeInFinalPhase AlwaysActive = True
+activeInFinalPhase FinalActive = True
+activeInFinalPhase (ActiveAfter {}) = True
+activeInFinalPhase _ = False
+
+isNeverActive, isAlwaysActive :: Activation -> Bool
+isNeverActive NeverActive = True
+isNeverActive _ = False
+
+isAlwaysActive AlwaysActive = True
+isAlwaysActive _ = False
+
+competesWith :: Activation -> Activation -> Bool
+-- See Note [Activation competition]
+competesWith AlwaysActive _ = True
+
+competesWith NeverActive _ = False
+competesWith _ NeverActive = False
+
+competesWith FinalActive FinalActive = True
+competesWith FinalActive _ = False
+
+competesWith (ActiveBefore {}) AlwaysActive = True
+competesWith (ActiveBefore {}) FinalActive = False
+competesWith (ActiveBefore {}) (ActiveBefore {}) = True
+competesWith (ActiveBefore _ a) (ActiveAfter _ b) = a < b
+
+competesWith (ActiveAfter {}) AlwaysActive = False
+competesWith (ActiveAfter {}) FinalActive = True
+competesWith (ActiveAfter {}) (ActiveBefore {}) = False
+competesWith (ActiveAfter _ a) (ActiveAfter _ b) = a >= b
+
+{- Note [Competing activations]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Sometimes a RULE and an inlining may compete, or two RULES.
+See Note [Rules and inlining/other rules] in GHC.HsToCore.
+
+We say that act1 "competes with" act2 iff
+ act1 is active in the phase when act2 *becomes* active
+NB: remember that phases count *down*: 2, 1, 0!
+
+It's too conservative to ensure that the two are never simultaneously
+active. For example, a rule might be always active, and an inlining
+might switch on in phase 2. We could switch off the rule, but it does
+no harm.
+-}
+
+
+{- *********************************************************************
+* *
+ InlinePragma, InlineSpec, RuleMatchInfo
+* *
+********************************************************************* -}
--- | Rule Match Information
-data RuleMatchInfo = ConLike -- See Note [CONLIKE pragma]
- | FunLike
- deriving( Eq, Data, Show )
- -- Show needed for GHC.Parser.Lexer
data InlinePragma -- Note [InlinePragma]
= InlinePragma
@@ -1358,6 +1451,12 @@ data InlinePragma -- Note [InlinePragma]
, inl_rule :: RuleMatchInfo -- Should the function be treated like a constructor?
} deriving( Eq, Data )
+-- | Rule Match Information
+data RuleMatchInfo = ConLike -- See Note [CONLIKE pragma]
+ | FunLike
+ deriving( Eq, Data, Show )
+ -- Show needed for GHC.Parser.Lexer
+
-- | Inline Specification
data InlineSpec -- What the user's INLINE pragma looked like
= Inline -- User wrote INLINE
@@ -1515,6 +1614,7 @@ instance Outputable Activation where
ppr NeverActive = brackets (text "~")
ppr (ActiveBefore _ n) = brackets (char '~' <> int n)
ppr (ActiveAfter _ n) = brackets (int n)
+ ppr FinalActive = text "[final]"
instance Outputable RuleMatchInfo where
ppr ConLike = text "CONLIKE"
@@ -1553,57 +1653,13 @@ pprInline' emptyInline (InlinePragma { inl_inline = inline, inl_act = activation
pp_info | isFunLike info = empty
| otherwise = ppr info
-isActive :: CompilerPhase -> Activation -> Bool
-isActive InitialPhase AlwaysActive = True
-isActive InitialPhase (ActiveBefore {}) = True
-isActive InitialPhase _ = False
-isActive (Phase p) act = isActiveIn p act
-isActiveIn :: PhaseNum -> Activation -> Bool
-isActiveIn _ NeverActive = False
-isActiveIn _ AlwaysActive = True
-isActiveIn p (ActiveAfter _ n) = p <= n
-isActiveIn p (ActiveBefore _ n) = p > n
-competesWith :: Activation -> Activation -> Bool
--- See Note [Activation competition]
-competesWith NeverActive _ = False
-competesWith _ NeverActive = False
-competesWith AlwaysActive _ = True
-
-competesWith (ActiveBefore {}) AlwaysActive = True
-competesWith (ActiveBefore {}) (ActiveBefore {}) = True
-competesWith (ActiveBefore _ a) (ActiveAfter _ b) = a < b
-
-competesWith (ActiveAfter {}) AlwaysActive = False
-competesWith (ActiveAfter {}) (ActiveBefore {}) = False
-competesWith (ActiveAfter _ a) (ActiveAfter _ b) = a >= b
-
-{- Note [Competing activations]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Sometimes a RULE and an inlining may compete, or two RULES.
-See Note [Rules and inlining/other rules] in GHC.HsToCore.
-
-We say that act1 "competes with" act2 iff
- act1 is active in the phase when act2 *becomes* active
-NB: remember that phases count *down*: 2, 1, 0!
-
-It's too conservative to ensure that the two are never simultaneously
-active. For example, a rule might be always active, and an inlining
-might switch on in phase 2. We could switch off the rule, but it does
-no harm.
--}
-
-isNeverActive, isAlwaysActive, isEarlyActive :: Activation -> Bool
-isNeverActive NeverActive = True
-isNeverActive _ = False
-
-isAlwaysActive AlwaysActive = True
-isAlwaysActive _ = False
-
-isEarlyActive AlwaysActive = True
-isEarlyActive (ActiveBefore {}) = True
-isEarlyActive _ = False
+{- *********************************************************************
+* *
+ Integer literals
+* *
+********************************************************************* -}
-- | Integral Literal
--
diff --git a/compiler/GHC/Types/Id/Make.hs b/compiler/GHC/Types/Id/Make.hs
index 6e3edbf7ba..9fdbb62a6e 100644
--- a/compiler/GHC/Types/Id/Make.hs
+++ b/compiler/GHC/Types/Id/Make.hs
@@ -655,7 +655,7 @@ mkDataConRep dflags fam_envs wrap_name mb_bangs data_con
| otherwise = topDmd
wrap_prag = dataConWrapperInlinePragma
- `setInlinePragmaActivation` activeDuringFinal
+ `setInlinePragmaActivation` activateDuringFinal
-- See Note [Activation for data constructor wrappers]
-- The wrapper will usually be inlined (see wrap_unf), so its
diff --git a/compiler/GHC/Types/Var.hs b/compiler/GHC/Types/Var.hs
index fc578851f6..0e7e806656 100644
--- a/compiler/GHC/Types/Var.hs
+++ b/compiler/GHC/Types/Var.hs
@@ -465,10 +465,10 @@ instance Binary ArgFlag where
_ -> return Inferred
-- | The non-dependent version of 'ArgFlag'.
-
--- Appears here partly so that it's together with its friend ArgFlag,
--- but also because it is used in IfaceType, rather early in the
--- compilation chain
+-- See Note [AnonArgFlag]
+-- Appears here partly so that it's together with its friends ArgFlag
+-- and ForallVisFlag, but also because it is used in IfaceType, rather
+-- early in the compilation chain
-- See Note [AnonArgFlag vs. ForallVisFlag]
data AnonArgFlag
= VisArg -- ^ Used for @(->)@: an ordinary non-dependent arrow.
@@ -511,7 +511,60 @@ argToForallVisFlag Required = ForallVis
argToForallVisFlag Specified = ForallInvis
argToForallVisFlag Inferred = ForallInvis
-{-
+{- Note [AnonArgFlag]
+~~~~~~~~~~~~~~~~~~~~~
+AnonArgFlag is used principally in the FunTy constructor of Type.
+ FunTy VisArg t1 t2 means t1 -> t2
+ FunTy InvisArg t1 t2 means t1 => t2
+
+However, the AnonArgFlag in a FunTy is just redundant, cached
+information. In (FunTy { ft_af = af, ft_arg = t1, ft_res = t2 })
+ * if (isPredTy t1 = True) then af = InvisArg
+ * if (isPredTy t1 = False) then af = VisArg
+where isPredTy is defined in GHC.Core.Type, and sees if t1's
+kind is Constraint. See GHC.Core.TyCo.Rep
+Note [Types for coercions, predicates, and evidence]
+
+GHC.Core.Type.mkFunctionType :: Type -> Type -> Type
+uses isPredTy to decide the AnonArgFlag for the FunTy.
+
+The term (Lam b e), and coercion (FunCo co1 co2) don't carry
+AnonArgFlags; instead they use mkFunctionType when we want to
+get their types; see mkLamType and coercionLKind/RKind resp.
+This is just an engineering choice; we could cache here too
+if we wanted.
+
+Why bother with all this? After all, we are in Core, where (=>) and
+(->) behave the same. We maintain this distinction throughout Core so
+that we can cheaply and conveniently determine
+* How to print a type
+* How to split up a type: tcSplitSigmaTy
+* How to specialise it (over type classes; GHC.Core.Opt.Specialise)
+
+For the specialisation point, consider
+(\ (d :: Ord a). blah). We want to give it type
+ (Ord a => blah_ty)
+with a fat arrow; that is, using mkInvisFunTy, not mkVisFunTy.
+Why? Because the /specialiser/ treats dictionary arguments specially.
+Suppose we do w/w on 'foo', thus (#11272, #6056)
+ foo :: Ord a => Int -> blah
+ foo a d x = case x of I# x' -> $wfoo @a d x'
+
+ $wfoo :: Ord a => Int# -> blah
+
+Now, at a call we see (foo @Int dOrdInt). The specialiser will
+specialise this to $sfoo, where
+ $sfoo :: Int -> blah
+ $sfoo x = case x of I# x' -> $wfoo @Int dOrdInt x'
+
+Now we /must/ also specialise $wfoo! But it wasn't user-written,
+and has a type built with mkLamTypes.
+
+Conclusion: the easiest thing is to make mkLamType build
+ (c => ty)
+when the argument is a predicate type. See GHC.Core.TyCo.Rep
+Note [Types for coercions, predicates, and evidence]
+
Note [AnonArgFlag vs. ForallVisFlag]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The AnonArgFlag and ForallVisFlag data types are quite similar at a first
@@ -522,15 +575,19 @@ glance:
Both data types keep track of visibility of some sort. AnonArgFlag tracks
whether a FunTy has a visible argument (->) or an invisible predicate argument
-(=>). ForallVisFlag tracks whether a `forall` quantifier is visible
-(forall a -> {...}) or invisible (forall a. {...}).
-
-Given their similarities, it's tempting to want to combine these two data types
-into one, but they actually represent distinct concepts. AnonArgFlag reflects a
-property of *Core* types, whereas ForallVisFlag reflects a property of the GHC
-AST. In other words, AnonArgFlag is all about internals, whereas ForallVisFlag
-is all about surface syntax. Therefore, they are kept as separate data types.
--}
+(=>). ForallVisFlag tracks whether a `forall` quantifier in a user-specified
+HsType is
+ visible: forall a -> {...}
+ invisible: forall a. {...}
+In fact the visible form can currently only appear in kinds.
+
+Given their similarities, it's tempting to want to combine these two
+data types into one, but they actually represent distinct
+concepts. AnonArgFlag reflects a property of *Core* types, whereas
+ForallVisFlag reflects a property of the HsSyn source-code AST. In
+other words, AnonArgFlag is all about internals, whereas ForallVisFlag
+is all about surface syntax. Therefore, they are kept as separate data
+types. -}
{- *********************************************************************
* *
diff --git a/compiler/GHC/Utils/Binary.hs b/compiler/GHC/Utils/Binary.hs
index 26a3eb811b..10810ba96a 100644
--- a/compiler/GHC/Utils/Binary.hs
+++ b/compiler/GHC/Utils/Binary.hs
@@ -1256,22 +1256,25 @@ instance Binary TupleSort where
instance Binary Activation where
put_ bh NeverActive = do
putByte bh 0
- put_ bh AlwaysActive = do
+ put_ bh FinalActive = do
putByte bh 1
- put_ bh (ActiveBefore src aa) = do
+ put_ bh AlwaysActive = do
putByte bh 2
+ put_ bh (ActiveBefore src aa) = do
+ putByte bh 3
put_ bh src
put_ bh aa
put_ bh (ActiveAfter src ab) = do
- putByte bh 3
+ putByte bh 4
put_ bh src
put_ bh ab
get bh = do
h <- getByte bh
case h of
0 -> do return NeverActive
- 1 -> do return AlwaysActive
- 2 -> do src <- get bh
+ 1 -> do return FinalActive
+ 2 -> do return AlwaysActive
+ 3 -> do src <- get bh
aa <- get bh
return (ActiveBefore src aa)
_ -> do src <- get bh