summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2020-12-30 09:37:25 +0000
committerSimon Peyton Jones <simonpj@microsoft.com>2021-01-01 15:09:51 +0000
commit9996197176094c56c8891f2504e7f8cce80aedac (patch)
tree07cfec071c609d20d8f8fc76de24655bd91c6e3c
parent2113a1d600e579bb0f54a0526a03626f105c0365 (diff)
downloadhaskell-wip/T19133.tar.gz
Don't use absentError thunks for strict constructor fieldswip/T19133
This patch fixes #19133 by using LitRubbish for strict constructor fields, even if they are of lifted types. Previously LitRubbish worked only for unlifted (but boxed) types. The change is very easy, although I needed a boolean field in LitRubbish to say whether or not it is lifted. (That seemed easier than giving it another type argument. This is preparing for Andreas's work on establishing the invariant that strict constructor fields are always tagged and evaluated (see #16970). Meanwhile, nothing was actually wrong before, so there are no tests.
-rw-r--r--compiler/GHC/ByteCode/Asm.hs3
-rw-r--r--compiler/GHC/Core/Opt/WorkWrap/Utils.hs105
-rw-r--r--compiler/GHC/Core/Utils.hs2
-rw-r--r--compiler/GHC/CoreToByteCode.hs12
-rw-r--r--compiler/GHC/CoreToStg.hs5
-rw-r--r--compiler/GHC/Types/Demand.hs41
-rw-r--r--compiler/GHC/Types/Literal.hs81
7 files changed, 172 insertions, 77 deletions
diff --git a/compiler/GHC/ByteCode/Asm.hs b/compiler/GHC/ByteCode/Asm.hs
index 17db8a2691..3f88187960 100644
--- a/compiler/GHC/ByteCode/Asm.hs
+++ b/compiler/GHC/ByteCode/Asm.hs
@@ -473,10 +473,11 @@ assembleI platform i = case i of
LitNumWord64 -> int64 (fromIntegral i)
LitNumInteger -> panic "GHC.ByteCode.Asm.literal: LitNumInteger"
LitNumNatural -> panic "GHC.ByteCode.Asm.literal: LitNumNatural"
+
-- We can lower 'LitRubbish' to an arbitrary constant, but @NULL@ is most
-- likely to elicit a crash (rather than corrupt memory) in case absence
-- analysis messed up.
- literal LitRubbish = int 0
+ literal (LitRubbish {}) = int 0
litlabel fs = lit [BCONPtrLbl fs]
addr (RemotePtr a) = words [fromIntegral a]
diff --git a/compiler/GHC/Core/Opt/WorkWrap/Utils.hs b/compiler/GHC/Core/Opt/WorkWrap/Utils.hs
index 7fd73b2cfc..0a7ef0f3a5 100644
--- a/compiler/GHC/Core/Opt/WorkWrap/Utils.hs
+++ b/compiler/GHC/Core/Opt/WorkWrap/Utils.hs
@@ -593,7 +593,7 @@ mkWWstr_one dflags fam_envs has_inlineable_prag arg
= return (False, [arg], nop_fn, nop_fn)
| isAbsDmd dmd
- , Just work_fn <- mk_absent_let dflags fam_envs arg
+ , Just work_fn <- mk_absent_let dflags fam_envs arg dmd
-- Absent case. We can't always handle absence for arbitrary
-- unlifted types, so we need to choose just the cases we can
-- (that's what mk_absent_let does)
@@ -1255,21 +1255,72 @@ part of the function (post transformation) anyway.
Note [Absent errors]
~~~~~~~~~~~~~~~~~~~~
-We make a new binding for Ids that are marked absent, thus
- let x = absentError "x :: Int"
-The idea is that this binding will never be used; but if it
-buggily is used we'll get a runtime error message.
-
-Coping with absence for *unlifted* types is important; see, for
-example, #4306 and #15627. In the UnliftedRep case, we can
-use LitRubbish, which we need to apply to the required type.
-For the unlifted types of singleton kind like Float#, Addr#, etc. we
-also find a suitable literal, using Literal.absentLiteralOf. We don't
-have literals for every primitive type, so the function is partial.
-
-Note: I did try the experiment of using an error thunk for unlifted
-things too, relying on the simplifier to drop it as dead code.
-But this is fragile
+Consider
+ data T = MkT [Int] [Int] ![Int]
+ f :: T -> Int# -> blah
+ f ps w = case ps of MkT xs _ _ -> <body mentioning xs>
+Then f gets a strictness sig of <S(L,A,A)><A>. We make worker $wf thus:
+
+$wf :: [Int] -> blah
+$wf xs = case ps of MkT xs _ _ -> <body mentioning xs>
+ where
+ ys = absentError "ys :: [Int]"
+ zs = LitRubbish True
+ ps = MkT xs ys zs
+ w = 0#
+
+We make a let-binding for Absent arguments, such as ys and w, that are not even
+passed to the worker. They should, of course, never be used. We distinguish four
+cases:
+
+1. Ordinary boxed, lifted arguments, like 'ys' We make a new binding for Ids
+ that are marked absent, thus
+ let ys = absentError "ys :: [Int]"
+ The idea is that this binding will never be used; but if it
+ buggily is used we'll get a runtime error message.
+
+2. Boxed, lifted types, with a strict demand, like 'zs'. You may ask: how the
+ demand be both absent and strict? That's exactly what happens for 'zs': it
+ is not used, so its demand is Absent, but then during w/w, in
+ addDataConStrictness, we strictify the demand. So it gets cardinality C_10,
+ the empty interval.
+
+ We don't want to use an error-thunk for 'zs' because MkT's third argument has
+ a bang, and hence should be always evaluated. This turned out to be
+ important when fixing #16970, which establishes the invariant that strict
+ constructor arguments are always evaluated. So we use LitRubbish instead
+ of an error thunk -- see #19133.
+
+ These first two cases are distinguished by isStrictDmd in lifted_rhs.
+
+3. Unboxed types, like 'w', with a type like Float#, Int#. Coping with absence
+ for unboxed types is important; see, for example, #4306 and #15627. We
+ simply find a suitable literal, using Literal.absentLiteralOf. We don't have
+ literals for every primitive type, so the function is partial.
+
+4. Boxed, unlifted types, like (Array# t). We can't use absentError because
+ unlifted bindings ares strict. So we use LitRubbish, which we need to apply
+ to the required type.
+
+Case (2) and (4) crucially use LitRubbish as the placeholder: see Note [Rubbish
+literals] in GHC.Types.Literal. We could do that in case (1) as well, but we
+get slightly better self-checking with an error thunk.
+
+Suppose we use LitRubbish and absence analysis is Wrong, so that the "absent"
+value is used after all. Then in case (2) we could get a seg-fault, because we
+may have replaced, say, a [Either Int Bool] by (), and that will fail if we do
+case analysis on it. Similarly with boxed unlifted types, case (4).
+
+In case (3), if absence analysis is wrong we could conceivably get an exception,
+from a divide-by-zero with the absent value. But it's very unlikely.
+
+Only in case (1) can we guarantee a civilised runtime error. Not much we can do
+about this; we really rely on absence analysis to be correct.
+
+
+Historical note: I did try the experiment of using an error thunk for unlifted
+things too, relying on the simplifier to drop it as dead code. But this is
+fragile
- It fails when profiling is on, which disables various optimisations
@@ -1281,10 +1332,8 @@ But this is fragile
pass that component to the worker for 'f', which reconstructs 'p' to
pass it to 'g'. Alas we can't say
...f (MkT a (absentError Int# "blah"))...
- bacause `MkT` is strict in its Int# argument, so we get an absentError
+ because `MkT` is strict in its Int# argument, so we get an absentError
exception when we shouldn't. Very annoying!
-
-So absentError is only used for lifted types.
-}
-- | Tries to find a suitable dummy RHS to bind the given absent identifier to.
@@ -1292,23 +1341,28 @@ So absentError is only used for lifted types.
-- If @mk_absent_let _ id == Just wrap@, then @wrap e@ will wrap a let binding
-- for @id@ with that RHS around @e@. Otherwise, there could no suitable RHS be
-- found (currently only happens for bindings of 'VecRep' representation).
-mk_absent_let :: DynFlags -> FamInstEnvs -> Id -> Maybe (CoreExpr -> CoreExpr)
-mk_absent_let dflags fam_envs arg
+mk_absent_let :: DynFlags -> FamInstEnvs -> Id -> Demand -> Maybe (CoreExpr -> CoreExpr)
+mk_absent_let dflags fam_envs arg dmd
+
-- The lifted case: Bind 'absentError'
-- See Note [Absent errors]
| not (isUnliftedType arg_ty)
- = Just (Let (NonRec lifted_arg abs_rhs))
+ = Just (Let (NonRec lifted_arg lifted_rhs))
-- The 'UnliftedRep' (because polymorphic) case: Bind @__RUBBISH \@arg_ty@
-- See Note [Absent errors]
+
| [UnliftedRep] <- typePrimRep arg_ty
= Just (Let (NonRec arg unlifted_rhs))
+
-- The monomorphic unlifted cases: Bind to some literal, if possible
-- See Note [Absent errors]
| Just tc <- tyConAppTyCon_maybe nty
, Just lit <- absentLiteralOf tc
= Just (Let (NonRec arg (Lit lit `mkCast` mkSymCo co)))
+
| nty `eqType` unboxedUnitTy
= Just (Let (NonRec arg (Var voidPrimId `mkCast` mkSymCo co)))
+
| otherwise
= WARN( True, text "No absent value for" <+> ppr arg_ty )
Nothing -- Can happen for 'State#' and things of 'VecRep'
@@ -1317,6 +1371,11 @@ mk_absent_let dflags fam_envs arg
-- Note in strictness signature that this is bottoming
-- (for the sake of the "empty case scrutinee not known to
-- diverge for sure lint" warning)
+
+ lifted_rhs | isStrictDmd dmd = mkTyApps (Lit (rubbishLit True)) [arg_ty]
+ | otherwise = mkAbsentErrorApp arg_ty msg
+ unlifted_rhs = mkTyApps (Lit (rubbishLit False)) [arg_ty]
+
arg_ty = idType arg
-- Normalise the type to have best chance of finding an absent literal
@@ -1326,7 +1385,6 @@ mk_absent_let dflags fam_envs arg
(co, nty) = topNormaliseType_maybe fam_envs arg_ty
`orElse` (mkRepReflCo arg_ty, arg_ty)
- abs_rhs = mkAbsentErrorApp arg_ty msg
msg = showSDoc (gopt_set dflags Opt_SuppressUniques)
(vcat
[ text "Arg:" <+> ppr arg
@@ -1342,7 +1400,6 @@ mk_absent_let dflags fam_envs arg
-- will have different lengths and hence different costs for
-- the inliner leading to different inlining.
-- See also Note [Unique Determinism] in GHC.Types.Unique
- unlifted_rhs = mkTyApps (Lit rubbishLit) [arg_ty]
ww_prefix :: FastString
ww_prefix = fsLit "ww"
diff --git a/compiler/GHC/Core/Utils.hs b/compiler/GHC/Core/Utils.hs
index 2e40ddc659..afebee0678 100644
--- a/compiler/GHC/Core/Utils.hs
+++ b/compiler/GHC/Core/Utils.hs
@@ -1611,7 +1611,7 @@ expr_ok primop_ok other_expr
Var f -> app_ok primop_ok f args
-- 'LitRubbish' is the only literal that can occur in the head of an
-- application and will not be matched by the above case (Var /= Lit).
- Lit lit -> ASSERT( lit == rubbishLit ) True
+ Lit lit -> ASSERT( isRubbishLit lit ) True
_ -> False
-----------------------------
diff --git a/compiler/GHC/CoreToByteCode.hs b/compiler/GHC/CoreToByteCode.hs
index e993688db9..31c40a9282 100644
--- a/compiler/GHC/CoreToByteCode.hs
+++ b/compiler/GHC/CoreToByteCode.hs
@@ -1655,13 +1655,13 @@ pushAtom _ _ (AnnLit lit) = do
_ -> PUSH_UBX lit (trunc16W $ bytesToWords platform size_bytes)
case lit of
- LitLabel _ _ _ -> code AddrRep
- LitFloat _ -> code FloatRep
- LitDouble _ -> code DoubleRep
- LitChar _ -> code WordRep
+ LitLabel {} -> code AddrRep
+ LitFloat {} -> code FloatRep
+ LitDouble {} -> code DoubleRep
+ LitChar {} -> code WordRep
LitNullAddr -> code AddrRep
- LitString _ -> code AddrRep
- LitRubbish -> code WordRep
+ LitString {} -> code AddrRep
+ LitRubbish {} -> code WordRep
LitNumber nt _ -> case nt of
LitNumInt -> code IntRep
LitNumWord -> code WordRep
diff --git a/compiler/GHC/CoreToStg.hs b/compiler/GHC/CoreToStg.hs
index ea59a84602..8082023ae7 100644
--- a/compiler/GHC/CoreToStg.hs
+++ b/compiler/GHC/CoreToStg.hs
@@ -374,9 +374,10 @@ coreToStgExpr
coreToStgExpr (Lit (LitNumber LitNumInteger _)) = panic "coreToStgExpr: LitInteger"
coreToStgExpr (Lit (LitNumber LitNumNatural _)) = panic "coreToStgExpr: LitNatural"
coreToStgExpr (Lit l) = return (StgLit l)
-coreToStgExpr (App (Lit LitRubbish) _some_unlifted_type)
+coreToStgExpr (App (Lit lit) _some_boxed_type)
+ | isRubbishLit lit
-- We lower 'LitRubbish' to @()@ here, which is much easier than doing it in
- -- a STG to Cmm pass.
+ -- a STG to Cmm pass. Doesn't matter whether it is lifted or unlifted
= coreToStgExpr (Var unitDataConId)
coreToStgExpr (Var v) = coreToStgApp v [] []
coreToStgExpr (Coercion _)
diff --git a/compiler/GHC/Types/Demand.hs b/compiler/GHC/Types/Demand.hs
index ba5e5266c9..c2e4770da6 100644
--- a/compiler/GHC/Types/Demand.hs
+++ b/compiler/GHC/Types/Demand.hs
@@ -26,7 +26,7 @@ module GHC.Types.Demand (
multCard, multDmd, multSubDmd,
-- ** Predicates on @Card@inalities and @Demand@s
isAbs, isUsedOnce, isStrict,
- isAbsDmd, isUsedOnceDmd, isStrUsedDmd,
+ isAbsDmd, isUsedOnceDmd, isStrUsedDmd, isStrictDmd,
isTopDmd, isSeqDmd, isWeakDmd,
-- ** Special demands
evalDmd,
@@ -106,12 +106,32 @@ import GHC.Utils.Panic
************************************************************************
-}
+{- Note [Evaluation cardinalities]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The demand analyser uses an /evaluation cardinality/ of type Card,
+to specify how many times a term is evaluated. A cardinality C_lu
+represents an /interval/ [l..u], meaning
+ C_lu means evaluated /at least/ 'l' times and
+ /at most/ 'u' times
+
+* The lower bound corresponds to /strictness/
+ Hence 'l' is either 0 (lazy)
+ or 1 (strict)
+
+* The upper bound corresponds to /usage/
+ Hence 'u' is either 0 (not used at all),
+ or 1 (used at most once)
+ or n (no information)
+
+Intervals describe sets, so the underlying lattice is the powerset lattice.
+
+Usually l<=u, but we also have C_10, the interval [1,0], the empty interval,
+denoting the empty set. This is the bottom element of the lattice.
+-}
+
+
-- | Describes an interval of /evaluation cardinalities/.
--- @C_lu@ means "evaluated /at least/ @l@ and /at most/ @u@ times".
--- The lower bound corresponds to /strictness/ (hence @l@ is either @0@ or @1@),
--- the upper bound corresponds to /usage/ (@u@ is one of @0@, @1@, @n@).
---
--- Intervals describe sets, so the underlying lattice is the powerset lattice.
+-- See Note [Evaluation cardinalities]
data Card
= C_00 -- ^ {0} Absent.
| C_01 -- ^ {0,1} Used at most once.
@@ -435,6 +455,10 @@ isTopDmd dmd = dmd == topDmd
isAbsDmd :: Demand -> Bool
isAbsDmd (n :* _) = isAbs n
+-- | Contrast with isStrictUsedDmd. See Note [Strict demands]
+isStrictDmd :: Demand -> Bool
+isStrictDmd (n :* _) = isStrict n
+
-- | Not absent and used strictly. See Note [Strict demands]
isStrUsedDmd :: Demand -> Bool
isStrUsedDmd (n :* _) = isStrict n && not (isAbs n)
@@ -601,8 +625,9 @@ saturatedByOneShots n (_ :* sd) = isUsedOnce (peelManyCalls n sd)
'isStrUsedDmd' returns true only of demands that are
both strict
and used
-In particular, it is False for <B>, which can and does
-arise in, say (#7319)
+
+In particular, it is False for <B> (i.e. strict and not used,
+cardinality C_10), which can and does arise in, say (#7319)
f x = raise# <some exception>
Then 'x' is not used, so f gets strictness <B> -> .
Now the w/w generates
diff --git a/compiler/GHC/Types/Literal.hs b/compiler/GHC/Types/Literal.hs
index a5c855a4fa..d3169b5ace 100644
--- a/compiler/GHC/Types/Literal.hs
+++ b/compiler/GHC/Types/Literal.hs
@@ -55,7 +55,8 @@ module GHC.Types.Literal
, word8Lit, word16Lit, word32Lit
, charToIntLit, intToCharLit
, floatToIntLit, intToFloatLit, doubleToIntLit, intToDoubleLit
- , nullAddrLit, rubbishLit, floatToDoubleLit, doubleToFloatLit
+ , nullAddrLit, floatToDoubleLit, doubleToFloatLit
+ , rubbishLit, isRubbishLit
) where
#include "HsVersions.h"
@@ -132,11 +133,10 @@ data Literal
-- that can be represented as a Literal. Create
-- with 'nullAddrLit'
- | LitRubbish -- ^ A nonsense value, used when an unlifted
- -- binding is absent and has type
- -- @forall (a :: 'TYPE' 'UnliftedRep'). a@.
- -- May be lowered by code-gen to any possible
- -- value. Also see Note [Rubbish literals]
+ | LitRubbish Bool -- ^ A nonsense value; always boxed, but
+ -- True <=> lifted, False <=> unlifted
+ -- Used when a binding is absent.
+ -- See Note [Rubbish literals]
| LitFloat Rational -- ^ @Float#@. Create with 'mkLitFloat'
| LitDouble Rational -- ^ @Double#@. Create with 'mkLitDouble'
@@ -241,7 +241,7 @@ instance Binary Literal where
= do putByte bh 6
put_ bh nt
put_ bh i
- put_ bh (LitRubbish) = putByte bh 7
+ put_ bh (LitRubbish b) = do putByte bh 7; put_ bh b
get bh = do
h <- getByte bh
case h of
@@ -267,7 +267,9 @@ instance Binary Literal where
nt <- get bh
i <- get bh
return (LitNumber nt i)
- _ -> return (LitRubbish)
+ _ -> do
+ b <- get bh
+ return (LitRubbish b)
instance Outputable Literal where
ppr = pprLiteral id
@@ -680,9 +682,13 @@ doubleToFloatLit l = pprPanic "doubleToFloatLit" (ppr l)
nullAddrLit :: Literal
nullAddrLit = LitNullAddr
--- | A nonsense literal of type @forall (a :: 'TYPE' 'UnliftedRep'). a@.
-rubbishLit :: Literal
-rubbishLit = LitRubbish
+-- | A rubbish literal; see Note [Rubbish literals]
+rubbishLit :: Bool -> Literal
+rubbishLit is_lifted = LitRubbish is_lifted
+
+isRubbishLit :: Literal -> Bool
+isRubbishLit (LitRubbish {}) = True
+isRubbishLit _ = False
{-
Predicates
@@ -807,9 +813,11 @@ literalType (LitNumber lt _) = case lt of
LitNumWord16 -> word16PrimTy
LitNumWord32 -> word32PrimTy
LitNumWord64 -> word64PrimTy
-literalType (LitRubbish) = mkForAllTy a Inferred (mkTyVarTy a)
+literalType (LitRubbish is_lifted) = mkForAllTy a Inferred (mkTyVarTy a)
where
- a = alphaTyVarUnliftedRep
+ -- See Note [Rubbish literals]
+ a | is_lifted = alphaTyVar
+ | otherwise = alphaTyVarUnliftedRep
absentLiteralOf :: TyCon -> Maybe Literal
-- Return a literal of the appropriate primitive
@@ -849,7 +857,7 @@ cmpLit (LitLabel a _ _) (LitLabel b _ _) = a `uniqCompareFS` b
cmpLit (LitNumber nt1 a) (LitNumber nt2 b)
| nt1 == nt2 = a `compare` b
| otherwise = nt1 `compare` nt2
-cmpLit (LitRubbish) (LitRubbish) = EQ
+cmpLit (LitRubbish b1) (LitRubbish b2) = b1 `compare` b2
cmpLit lit1 lit2
| litTag lit1 < litTag lit2 = LT
| otherwise = GT
@@ -862,7 +870,7 @@ litTag (LitFloat _) = 4
litTag (LitDouble _) = 5
litTag (LitLabel _ _ _) = 6
litTag (LitNumber {}) = 7
-litTag (LitRubbish) = 8
+litTag (LitRubbish {}) = 8
{-
Printing
@@ -895,7 +903,9 @@ pprLiteral add_par (LitLabel l mb fod) =
where b = case mb of
Nothing -> pprHsString l
Just x -> doubleQuotes (text (unpackFS l ++ '@':show x))
-pprLiteral _ (LitRubbish) = text "__RUBBISH"
+pprLiteral _ (LitRubbish is_lifted)
+ = text "__RUBBISH"
+ <> parens (if is_lifted then text "lifted" else text "unlifted")
pprIntegerVal :: (SDoc -> SDoc) -> Integer -> SDoc
-- See Note [Printing of literals in Core].
@@ -960,37 +970,38 @@ What is <absent value>?
* For primitive types like Int# or Word# we can use any random
value of that type.
* But what about /unlifted/ but /boxed/ types like MutVar# or
- Array#? We need a literal value of that type.
+ Array#? Or /lifted/ but /strict/ values, such as a field of
+ a strict data constructor. For these we use LitRubbish.
+ See Note [Absent errors] in GHC.Core.Opt.WorkWrap.Utils.hs
-That is 'LitRubbish'. Since we need a rubbish literal for
-many boxed, unlifted types, we say that LitRubbish has type
- LitRubbish :: forall (a :: TYPE UnliftedRep). a
+The literal (LitRubbish is_lifted)
+has type
+ LitRubbish :: forall (a :: TYPE LiftedRep). a if is_lifted
+ LitRubbish :: forall (a :: TYPE UnliftedRep). a otherwise
So we might see a w/w split like
- $wf x z = let y :: Array# Int = LitRubbish @(Array# Int)
+ $wf x z = let y :: Array# Int = (LitRubbish False) @(Array# Int)
in e
-Recall that (TYPE UnliftedRep) is the kind of boxed, unlifted
-heap pointers.
-
-Here are the moving parts:
+Here are the moving parts, but see also Note [Absent errors] in
+GHC.Core.Opt.WorkWrap.Utils
* We define LitRubbish as a constructor in GHC.Types.Literal.Literal
* It is given its polymorphic type by Literal.literalType
* GHC.Core.Opt.WorkWrap.Utils.mk_absent_let introduces a LitRubbish for absent
- arguments of boxed, unlifted type.
+ arguments of boxed, unlifted type; or boxed, lifted arguments of strict data
+ constructors.
-* In CoreToSTG we convert (RubishLit @t) to just (). STG is
- untyped, so it doesn't matter that it points to a lifted
- value. The important thing is that it is a heap pointer,
- which the garbage collector can follow if it encounters it.
+* In CoreToSTG we convert (RubishLit @t) to just (). STG is untyped, so this
+ will work OK for both lifted and unlifted (but boxed) values. The important
+ thing is that it is a heap pointer, which the garbage collector can follow if
+ it encounters it.
- We considered maintaining LitRubbish in STG, and lowering
- it in the code generators, but it seems simpler to do it
- once and for all in CoreToSTG.
+ We considered maintaining LitRubbish in STG, and lowering it in the code
+ generators, but it seems simpler to do it once and for all in CoreToSTG.
- In GHC.ByteCode.Asm we just lower it as a 0 literal, because
- it's all boxed and lifted to the host GC anyway.
+ In GHC.ByteCode.Asm we just lower it as a 0 literal, because it's all boxed to
+ the host GC anyway.
-}