diff options
author | sheaf <sam.derbyshire@gmail.com> | 2022-02-26 14:15:33 +0100 |
---|---|---|
committer | sheaf <sam.derbyshire@gmail.com> | 2022-02-26 14:15:33 +0100 |
commit | 32d8fe3a9a7f471e98d173eb83c02e61d9e12fbb (patch) | |
tree | 9b10cf787c475294a2e3261468805a52448f7440 /compiler | |
parent | 033e9f0fcd0c1f9a2814b6922275514951c87dfd (diff) | |
download | haskell-32d8fe3a9a7f471e98d173eb83c02e61d9e12fbb.tar.gz |
Core Lint: ensure primops can be eta-expanded
This patch adds a check to Core Lint, checkCanEtaExpand,
which ensures that primops and other wired-in functions with
no binding such as unsafeCoerce#, oneShot, rightSection...
can always be eta-expanded, by checking that the remaining
argument types have a fixed RuntimeRep.
Two subtleties came up:
- the notion of arity in Core looks through newtypes, so we may
need to unwrap newtypes in this check,
- we want to avoid calling hasNoBinding on something whose unfolding
we are in the process of linting, as this would cause a loop;
to avoid this we add some information to the Core Lint environment
that holds this information.
Fixes #20480
Diffstat (limited to 'compiler')
-rw-r--r-- | compiler/GHC/Core.hs | 49 | ||||
-rw-r--r-- | compiler/GHC/Core/Coercion.hs-boot | 4 | ||||
-rw-r--r-- | compiler/GHC/Core/Lint.hs | 134 | ||||
-rw-r--r-- | compiler/GHC/Core/Type.hs | 45 | ||||
-rw-r--r-- | compiler/GHC/CoreToStg/Prep.hs | 6 | ||||
-rw-r--r-- | compiler/GHC/Tc/Gen/App.hs | 17 | ||||
-rw-r--r-- | compiler/GHC/Tc/Utils/Concrete.hs | 5 | ||||
-rw-r--r-- | compiler/GHC/Types/Id.hs | 15 | ||||
-rw-r--r-- | compiler/GHC/Types/Id/Info.hs | 32 |
9 files changed, 276 insertions, 31 deletions
diff --git a/compiler/GHC/Core.hs b/compiler/GHC/Core.hs index 2216c65591..90eb08e7f6 100644 --- a/compiler/GHC/Core.hs +++ b/compiler/GHC/Core.hs @@ -557,19 +557,56 @@ where the type variable `r :: RuntimeRep` abstracts over the runtime representat of values of type `b`. To ensure that programs containing representation-polymorphism remain compilable, -we enforce two invariants (the representation-polymorphism invariants), -as per "Levity Polymorphism" [PLDI'17]: +we enforce the following representation-polymorphism invariants: + +The paper "Levity Polymorphism" [PLDI'17] states the first two invariants: I1. The type of a bound variable must have a fixed runtime representation (except for join points: See Note [Invariants on join points]) I2. The type of a function argument must have a fixed runtime representation. -For example +On top of these two invariants, GHC's internal eta-expansion mechanism also requires: + + I3. In any partial application `f e_1 .. e_n`, where `f` is `hasNoBinding`, + it must be the case that the application can be eta-expanded to match + the arity of `f`. + See Note [checkCanEtaExpand] in GHC.Core.Lint for more details. + +Example of I1: + \(r::RuntimeRep). \(a::TYPE r). \(x::a). e -is illegal because x's type has kind (TYPE r), which has 'r' free. -We thus wouldn't know how to compile this lambda abstraction. -In practice, we currently require something slightly stronger than a fixed runtime + This contravenes I1 because x's type has kind (TYPE r), which has 'r' free. + We thus wouldn't know how to compile this lambda abstraction. + +Example of I2: + + f (undefined :: (a :: TYPE r)) + + This contravenes I2: we are applying the function `f` to a value + with an unknown runtime representation. + +Examples of I3: + + myUnsafeCoerce# :: forall {r1} (a :: TYPE r1) {r2} (b :: TYPE r2). a -> b + myUnsafeCoerce# = unsafeCoerce# + + This contravenes I3: we are instantiating `unsafeCoerce#` without any + value arguments, and with a remaining argument type, `a`, which does not + have a fixed runtime representation. + But `unsafeCorce#` has no binding (see Note [Wiring in unsafeCoerce#] + in GHC.HsToCore). So before code-generation we must saturate it + by eta-expansion (see GHC.CoreToStg.Prep.maybeSaturate), thus + myUnsafeCoerce# = \x. unsafeCoerce# x + But we can't do that because now the \x binding would violate I1. + + bar :: forall (a :: TYPE) r (b :: TYPE r). a -> b + bar = unsafeCoerce# + + OK: eta expand to `\ (x :: Type) -> unsafeCoerce# x`, + and `x` has a fixed RuntimeRep. + +Note that we currently require something slightly stronger than a fixed runtime representation: we check whether bound variables and function arguments have a /fixed RuntimeRep/ in the sense of Note [Fixed RuntimeRep] in GHC.Tc.Utils.Concrete. See Note [Representation polymorphism checking] in GHC.Tc.Utils.Concrete diff --git a/compiler/GHC/Core/Coercion.hs-boot b/compiler/GHC/Core/Coercion.hs-boot index 0c18e5e68f..cbc40e2a13 100644 --- a/compiler/GHC/Core/Coercion.hs-boot +++ b/compiler/GHC/Core/Coercion.hs-boot @@ -51,3 +51,7 @@ coercionKind :: Coercion -> Pair Type coercionLKind :: Coercion -> Type coercionRKind :: Coercion -> Type coercionType :: Coercion -> Type + +topNormaliseNewType_maybe :: Type -> Maybe (Coercion, Type) + -- used to look through newtypes to the right of + -- function arrows, in 'GHC.Core.Type.getRuntimeArgTys' diff --git a/compiler/GHC/Core/Lint.hs b/compiler/GHC/Core/Lint.hs index 1200a4102a..688b63ff55 100644 --- a/compiler/GHC/Core/Lint.hs +++ b/compiler/GHC/Core/Lint.hs @@ -30,7 +30,7 @@ import GHC.Driver.Ppr import GHC.Driver.Env import GHC.Driver.Config.Diagnostic -import GHC.Tc.Utils.TcType ( isFloatingTy, isTyFamFree ) +import GHC.Tc.Utils.TcType ( isFloatingTy, isTyFamFree ) import GHC.Unit.Module.ModGuts import GHC.Runtime.Context @@ -551,10 +551,10 @@ hence the `TopLevelFlag` on `tcPragExpr` in GHC.IfaceToCore. -} -lintUnfolding :: Bool -- True <=> is a compulsory unfolding +lintUnfolding :: Bool -- ^ True <=> is a compulsory unfolding -> DynFlags -> SrcLoc - -> VarSet -- Treat these as in scope + -> VarSet -- ^ Treat these as in scope -> CoreExpr -> Maybe (Bag SDoc) -- Nothing => OK @@ -847,7 +847,10 @@ lintCoreExpr :: CoreExpr -> LintM (LintedType, UsageEnv) -- See Note [GHC Formalism] lintCoreExpr (Var var) - = lintIdOcc var 0 + = do + var_pair@(var_ty, _) <- lintIdOcc var 0 + checkCanEtaExpand (Var var) [] var_ty + return var_pair lintCoreExpr (Lit lit) = return (literalType lit, zeroUE) @@ -937,8 +940,10 @@ lintCoreExpr e@(App _ _) ; lintCoreArgs app_ty rest } | otherwise - = do { pair <- lintCoreFun fun (length args) - ; lintCoreArgs pair args } + = do { fun_pair <- lintCoreFun fun (length args) + ; app_pair@(app_ty, _) <- lintCoreArgs fun_pair args + ; checkCanEtaExpand fun args app_ty + ; return app_pair} where (fun, args) = collectArgs e @@ -1080,6 +1085,78 @@ checkJoinOcc var n_args | otherwise = return () +-- | This function checks that we are able to perform eta expansion for +-- functions with no binding, in order to satisfy invariant I3 +-- from Note [Representation polymorphism invariants] in GHC.Core. +checkCanEtaExpand :: CoreExpr -- ^ the function (head of the application) we are checking + -> [CoreArg] -- ^ the arguments to the application + -> LintedType -- ^ the instantiated type of the overall application + -> LintM () +checkCanEtaExpand (Var fun_id) args app_ty + | hasNoBinding fun_id + = checkL (null bad_arg_tys) err_msg + where + arity :: Arity + arity = idArity fun_id + + nb_val_args :: Int + nb_val_args = count isValArg args + + -- Check the remaining argument types, past the + -- given arguments and up to the arity of the 'Id'. + -- Returns the types that couldn't be determined to have + -- a fixed RuntimeRep. + check_args :: [Type] -> [Type] + check_args = go (nb_val_args + 1) + where + go :: Int -- ^ index of the argument (starting from 1) + -> [Type] -- ^ arguments + -> [Type] -- ^ value argument types that could not be + -- determined to have a fixed runtime representation + go i _ + | i > arity + = [] + go _ [] + -- The Arity of an Id should never exceed the number of value arguments + -- that can be read off from the Id's type. + -- See Note [Arity and function types] in GHC.Types.Id.Info. + = pprPanic "checkCanEtaExpand: arity larger than number of value arguments apparent in type" + $ vcat + [ text "fun_id =" <+> ppr fun_id + , text "arity =" <+> ppr arity + , text "app_ty =" <+> ppr app_ty + , text "args = " <+> ppr args + , text "nb_val_args =" <+> ppr nb_val_args ] + go i (ty : bndrs) + | typeHasFixedRuntimeRep ty + = go (i+1) bndrs + | otherwise + = ty : go (i+1) bndrs + + bad_arg_tys :: [Type] + bad_arg_tys = check_args . map fst $ getRuntimeArgTys app_ty + -- We use 'getRuntimeArgTys' to find all the argument types, + -- including those hidden under newtypes. For example, + -- if `FunNT a b` is a newtype around `a -> b`, then + -- when checking + -- + -- foo :: forall r (a :: TYPE r) (b :: TYPE r) c. a -> FunNT b c + -- + -- we should check that the instantiations of BOTH `a` AND `b` + -- have a fixed runtime representation. + + err_msg :: SDoc + err_msg + = vcat [ text "Cannot eta expand" <+> quotes (ppr fun_id) + , text "The following type" <> plural bad_arg_tys + <+> doOrDoes bad_arg_tys <+> text "not have a fixed runtime representation:" + , nest 2 $ vcat $ map ppr_ty_ki bad_arg_tys ] + + ppr_ty_ki :: Type -> SDoc + ppr_ty_ki ty = bullet <+> ppr ty <+> dcolon <+> ppr (typeKind ty) +checkCanEtaExpand _ _ _ + = return () + -- Check that the usage of var is consistent with var itself, and pop the var -- from the usage environment (this is important because of shadowing). checkLinearity :: UsageEnv -> Var -> LintM UsageEnv @@ -2673,6 +2750,44 @@ We plan to fix this issue in the very near future. For now, -dcore-lint enables only linting output of the desugarer, and full Linear Lint has to be enabled separately with -dlinear-core-lint. Ticket #19165 concerns enabling Linear Lint with -dcore-lint. + +Note [checkCanEtaExpand] +~~~~~~~~~~~~~~~~~~~~~~~~ +The checkCanEtaExpand function is responsible for enforcing invariant I3 +from Note [Representation polymorphism invariants] in GHC.Core: in any +partial application `f e_1 .. e_n`, if `f` has no binding, we must be able to +eta expand `f` to match the declared arity of `f`. + +Wrinkle 1: eta-expansion and newtypes + + Most of the time, when we have a partial application `f e_1 .. e_n` + in which `f` is `hasNoBinding`, we eta-expand it up to its arity + as follows: + + \ x_{n+1} ... x_arity -> f e_1 .. e_n x_{n+1} ... x_arity + + However, we might need to insert casts if some of the arguments + that `f` takes are under a newtype. + For example, suppose `f` `hasNoBinding`, has arity 1 and type + + f :: forall r (a :: TYPE r). Identity (a -> a) + + then we eta-expand the nullary application `f` to + + ( \ x -> f x ) |> co + + where + + co :: ( forall r (a :: TYPE r). a -> a ) ~# ( forall r (a :: TYPE r). Identity (a -> a) ) + + In this case we would have to perform a representation-polymorphism check on the instantiation + of `a`. + +Wrinkle 2: 'hasNoBinding' and laziness + + It's important that we able to compute 'hasNoBinding' for an 'Id' without ever forcing + the unfolding of the 'Id'. Otherwise, we could end up with a loop, as outlined in + Note [Lazily checking Unfoldings] in GHC.IfaceToCore. -} instance Applicative LintM where @@ -2711,8 +2826,11 @@ data LintLocInfo | InCo Coercion -- Inside a coercion | InAxiom (CoAxiom Branched) -- Inside a CoAxiom -initL :: DynFlags -> LintFlags -> [Var] - -> LintM a -> WarnsAndErrs -- Warnings and errors +initL :: DynFlags + -> LintFlags + -> [Var] -- ^ 'Id's that should be treated as being in scope + -> LintM a -- ^ Action to run + -> WarnsAndErrs initL dflags flags vars m = case unLintM m env (emptyBag, emptyBag) of (Just _, errs) -> errs diff --git a/compiler/GHC/Core/Type.hs b/compiler/GHC/Core/Type.hs index 2406fc6e7a..1b5a21b733 100644 --- a/compiler/GHC/Core/Type.hs +++ b/compiler/GHC/Core/Type.hs @@ -55,6 +55,7 @@ module GHC.Core.Type ( splitForAllTyCoVar_maybe, splitForAllTyCoVar, splitForAllTyVar_maybe, splitForAllCoVar_maybe, splitPiTy_maybe, splitPiTy, splitPiTys, + getRuntimeArgTys, mkTyConBindersPreferAnon, mkPiTy, mkPiTys, piResultTy, piResultTys, @@ -275,7 +276,9 @@ import {-# SOURCE #-} GHC.Core.Coercion , mkKindCo, mkSubCo , decomposePiCos, coercionKind, coercionLKind , coercionRKind, coercionType - , isReflexiveCo, seqCo ) + , isReflexiveCo, seqCo + , topNormaliseNewType_maybe + ) -- others import GHC.Utils.Misc @@ -2135,6 +2138,46 @@ splitPiTys ty = split ty ty [] split orig_ty ty bs | Just ty' <- coreView ty = split orig_ty ty' bs split orig_ty _ bs = (reverse bs, orig_ty) +-- | Extracts a list of run-time arguments from a function type, +-- looking through newtypes to the right of arrows. +-- +-- Examples: +-- +-- @ +-- newtype Identity a = I a +-- +-- getRuntimeArgTys (Int -> Bool -> Double) == [(Int, VisArg), (Bool, VisArg)] +-- getRuntimeArgTys (Identity Int -> Bool -> Double) == [(Identity Int, VisArg), (Bool, VisArg)] +-- getRuntimeArgTys (Int -> Identity (Bool -> Identity Double)) == [(Int, VisArg), (Bool, VisArg)] +-- getRuntimeArgTys (forall a. Show a => Identity a -> a -> Int -> Bool) == [(Show a, InvisArg), (Identity a, VisArg),(a, VisArg),(Int, VisArg)] +-- @ +-- +-- Note that, in the last case, the returned types might mention an out-of-scope +-- type variable. This function is used only when we really care about the /kinds/ +-- of the returned types, so this is OK. +-- +-- **Warning**: this function can return an infinite list. For example: +-- +-- @ +-- newtype N a = MkN (a -> N a) +-- getRuntimeArgTys (N a) == repeat (a, VisArg) +-- @ +getRuntimeArgTys :: Type -> [(Type, AnonArgFlag)] +getRuntimeArgTys = go + where + go :: Type -> [(Type, AnonArgFlag)] + go (ForAllTy _ res) + = go res + go (FunTy { ft_arg = arg, ft_res = res, ft_af = af }) + = (arg, af) : go res + go ty + | Just ty' <- coreView ty + = go ty' + | Just (_,ty') <- topNormaliseNewType_maybe ty + = go ty' + | otherwise + = [] + -- | Like 'splitPiTys' but split off only /named/ binders -- and returns 'TyCoVarBinder's rather than 'TyCoBinder's splitForAllTyCoVarBinders :: Type -> ([TyCoVarBinder], Type) diff --git a/compiler/GHC/CoreToStg/Prep.hs b/compiler/GHC/CoreToStg/Prep.hs index 8a6354147c..2e998bf94e 100644 --- a/compiler/GHC/CoreToStg/Prep.hs +++ b/compiler/GHC/CoreToStg/Prep.hs @@ -1518,9 +1518,9 @@ because that has different strictness. Hence the use of 'allLazy'. Note [Eta expansion of hasNoBinding things in CorePrep] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ maybeSaturate deals with eta expanding to saturate things that can't deal with -unsaturated applications (identified by 'hasNoBinding', currently just -foreign calls, unboxed tuple/sum constructors and strict workers). -See Note [Strict Worker Ids] +unsaturated applications (identified by 'hasNoBinding', currently +foreign calls, unboxed tuple/sum constructors, and representation-polymorphic +primitives such as 'coerce' and 'unsafeCoerce#'). Historical Note: Note that eta expansion in CorePrep used to be very fragile due to the "prediction" of CAFfyness that we used to make during tidying. diff --git a/compiler/GHC/Tc/Gen/App.hs b/compiler/GHC/Tc/Gen/App.hs index 560b3bac7d..659e4f64d5 100644 --- a/compiler/GHC/Tc/Gen/App.hs +++ b/compiler/GHC/Tc/Gen/App.hs @@ -558,8 +558,8 @@ hasFixedRuntimeRep_remainingValArgs applied_args app_res_rho = \case nb_applied_val_args :: Int nb_applied_val_args = countVisAndInvisValArgs applied_args - arg_tys :: [TyCoBinder] - arg_tys = fst $ splitPiTys app_res_rho + arg_tys :: [(Type,AnonArgFlag)] + arg_tys = getRuntimeArgTys app_res_rho -- We do not need to zonk app_res_rho first, because the number of arrows -- in the (possibly instantiated) inferred type of the function will -- be at least the arity of the function. @@ -569,11 +569,11 @@ hasFixedRuntimeRep_remainingValArgs applied_args app_res_rho = \case traceTc "tcApp remainingValArgs check_thing" (debug_msg thing arity) go (nb_applied_vis_val_args + 1) (nb_applied_val_args + 1) arg_tys where - go :: Int -- ^ visible value argument index - -- (only used to report the argument position in error messages) - -> Int -- ^ value argument index + go :: Int -- ^ visible value argument index, starting from 1 + -- only used to report the argument position in error messages + -> Int -- ^ value argument index, starting from 1 -- used to count up to the arity to ensure we don't check too many argument types - -> [TyCoBinder] + -> [(Type, AnonArgFlag)] -- ^ run-time argument types -> TcM () go _ i_val _ | i_val > arity @@ -582,15 +582,13 @@ hasFixedRuntimeRep_remainingValArgs applied_args app_res_rho = \case -- Should never happen: it would mean that the arity is higher -- than the number of arguments apparent from the type = pprPanic "hasFixedRuntimeRep_remainingValArgs" (debug_msg thing arity) - go i_visval !i_val (Anon af (Scaled _ arg_ty) : tys) + go i_visval !i_val ((arg_ty, af) : tys) = case af of InvisArg -> go i_visval (i_val + 1) tys VisArg -> do _concrete_ev <- hasFixedRuntimeRep (mk_frr_orig i_visval) arg_ty go (i_visval + 1) (i_val + 1) tys - go i_visval i_val (_: tys) - = go i_visval i_val tys -- A message containing all the relevant info, in case this functions -- needs to be debugged again at some point. @@ -600,7 +598,6 @@ hasFixedRuntimeRep_remainingValArgs applied_args app_res_rho = \case [ text "thing =" <+> ppr thing , text "arity =" <+> ppr arity , text "applied_args =" <+> ppr applied_args - , text "nb_applied_vis_val_args =" <+> ppr nb_applied_vis_val_args , text "nb_applied_val_args =" <+> ppr nb_applied_val_args , text "arg_tys =" <+> ppr arg_tys ] diff --git a/compiler/GHC/Tc/Utils/Concrete.hs b/compiler/GHC/Tc/Utils/Concrete.hs index bee886a58f..8ae32e5a78 100644 --- a/compiler/GHC/Tc/Utils/Concrete.hs +++ b/compiler/GHC/Tc/Utils/Concrete.hs @@ -11,7 +11,8 @@ module GHC.Tc.Utils.Concrete import GHC.Prelude -import GHC.Core.Coercion +import GHC.Core.Coercion ( multToCo ) +import GHC.Core.Type ( isConcrete, typeKind ) import GHC.Core.TyCo.Rep import GHC.Tc.Utils.Monad @@ -26,7 +27,7 @@ import GHC.Builtin.Types.Prim ( concretePrimTyCon ) import GHC.Types.Basic ( TypeOrKind(KindLevel) ) -import GHC.Core.Type ( isConcrete, typeKind ) + {- Note [Concrete overview] ~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/compiler/GHC/Types/Id.hs b/compiler/GHC/Types/Id.hs index 76dc4d0721..6c328fc693 100644 --- a/compiler/GHC/Types/Id.hs +++ b/compiler/GHC/Types/Id.hs @@ -579,7 +579,20 @@ hasNoBinding id = case Var.idDetails id of PrimOpId _ -> True -- See Note [Eta expanding primops] in GHC.Builtin.PrimOps FCallId _ -> True DataConWorkId dc -> isUnboxedTupleDataCon dc || isUnboxedSumDataCon dc - _ -> isCompulsoryUnfolding (idUnfolding id) + _ -> isCompulsoryUnfolding (realIdUnfolding id) + -- Note: this function must be very careful not to force + -- any of the fields that aren't the 'uf_src' field of + -- the 'Unfolding' of the 'Id'. This is because these fields are computed + -- in terms of the 'uf_tmpl' field, which is not available + -- until we have finished Core Lint for the unfolding, which calls 'hasNoBinding' + -- in 'checkCanEtaExpand'. + -- + -- In particular, calling 'idUnfolding' rather than 'realIdUnfolding' here can + -- force the 'uf_tmpl' field, because 'zapUnfolding' forces the 'uf_is_value' field, + -- and this field is usually computed in terms of the 'uf_tmpl' field, + -- so we will force that as well. + -- + -- See Note [Lazily checking Unfoldings] in GHC.IfaceToCore. isImplicitId :: Id -> Bool -- ^ 'isImplicitId' tells whether an 'Id's info is implied by other diff --git a/compiler/GHC/Types/Id/Info.hs b/compiler/GHC/Types/Id/Info.hs index 73e7169181..670935251a 100644 --- a/compiler/GHC/Types/Id/Info.hs +++ b/compiler/GHC/Types/Id/Info.hs @@ -526,8 +526,36 @@ noCafIdInfo = vanillaIdInfo `setCafInfo` NoCafRefs For locally-defined Ids, the code generator maintains its own notion of their arities; so it should not be asking... (but other things besides the code-generator need arity info!) + +Note [Arity and function types] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The arity of an 'Id' must never exceed the number of arguments that +can be read off from the 'Id's type, possibly after expanding newtypes. + +Examples: + + f1 :: forall a. a -> a + + idArity f1 <= 1: only one value argument, of type 'a' + + f2 :: forall a. Show a => Int -> a + + idArity f2 <= 2: two value arguments, of types 'Show a' and 'Int'. + + + newtype Id a = MkId a + f3 :: forall b. Id (Int -> b) + + idArity f3 <= 1: there is one value argument, of type 'Int', hidden under the newtype. + + newtype RecFun = MkRecFun (Int -> RecFun) + f4 :: RecFun + + no constraint on the arity of f4: we can unwrap as many layers of the newtype as we want, + to get arbitrarily many arguments of type 'Int'. -} + -- | Arity Information -- -- An 'ArityInfo' of @n@ tells us that partial application of this @@ -538,6 +566,10 @@ besides the code-generator need arity info!) -- -- The arity might increase later in the compilation process, if -- an extra lambda floats up to the binding site. +-- +-- /Invariant:/ the 'Arity' of an 'Id' must never exceed the number of +-- value arguments that appear in the type of the 'Id'. +-- See Note [Arity and function types]. type ArityInfo = Arity -- | It is always safe to assume that an 'Id' has an arity of 0 |