author | Rodrigo Mesquita <rodrigo.m.mesquita@gmail.com> | 2023-05-05 15:02:35 +0100 |
---|---|---|
committer | Rodrigo Mesquita <rodrigo.m.mesquita@gmail.com> | 2023-05-16 14:59:09 +0100 |
commit | 9048ced72ac1d8642990dc64ac7b2aebf1c0bea4 | |
tree | 96f7550a1e529cca917755bcb732e6f8d765f48c | |
parent | b938a8efe56ed77e520a1df206f82ec7133f0e60 | |
download | haskell-wip/T23146.tar.gz |
Merge outdated Note [Data con representation] into Note [Data constructor representation]
Introduce new Note [Constructor applications in STG] to better support
the merge, and reference it from the relevant bits in the STG syntax.
-rw-r--r-- | compiler/GHC/Core/DataCon.hs | 64 |
-rw-r--r-- | compiler/GHC/Stg/Syntax.hs | 50 |
2 files changed, 64 insertions, 50 deletions
diff --git a/compiler/GHC/Core/DataCon.hs b/compiler/GHC/Core/DataCon.hs
index 0f8d87eb57..4bf0659b56 100644
--- a/compiler/GHC/Core/DataCon.hs
+++ b/compiler/GHC/Core/DataCon.hs
@@ -141,7 +141,19 @@ becomes
         case e of { T a' b -> let a = I# a' in ... }
 
 To keep ourselves sane, we name the different versions of the data constructor
-differently, as follows.
+differently, as follows in Note [Data Constructor Naming].
+
+The `dcRepType` field of a `DataCon` contains the type of the representation of
+the constructor /worker/, also called the Core representation.
+
+The Core representation may differ from the type of the constructor /wrapper/
+(built by `mkDataConRep`). Besides unpacking (as seen in the example above),
+dictionaries and coercions become explict arguments in the Core representation
+of a constructor.
+
+Note that this representation is still *different* from runtime
+representation. (Which is what STG uses after unarise).
+See Note [Constructor applications in STG] in GHC.Stg.Syntax.
 
 
 Note [Data Constructor Naming]
@@ -209,7 +221,8 @@ Note [Data constructor workers and wrappers]
 * See Note [Data Constructor Naming] for how the worker and wrapper
   are named
 
-* Neither_ the worker _nor_ the wrapper take the dcStupidTheta dicts as arguments
+* The workers don't take the dcStupidTheta dicts as arguments, while the
+  wrappers currently do
 
 * The wrapper (if it exists) takes dcOrigArgTys as its arguments.
   The worker takes dataConRepArgTys as its arguments
@@ -528,7 +541,7 @@ data DataCon
         --      forall a x y. (a~(x,y), x~y, Ord x) =>
         --        x -> y -> T a
         -- (this is *not* of the constructor wrapper Id:
-        --  see Note [Data con representation] below)
+        --  see Note [Data constructor representation])
         -- Notice that the existential type parameters come *second*.
         -- Reason: in a case expression we may find:
         --      case (e :: T t) of
@@ -988,51 +1001,6 @@ we consult HsImplBang:
 The boolean flag is used only for this warning.
 See #11270 for motivation.
 
-Note [Data con representation]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The dcRepType field contains the type of the representation of a constructor
-This may differ from the type of the constructor *Id* (built
-by MkId.mkDataConId) for two reasons:
-        a) the constructor Id may be overloaded, but the dictionary isn't stored
-           e.g.    data Eq a => T a = MkT a a
-
-        b) the constructor may store an unboxed version of a strict field.
-
-So whenever this module talks about the representation of a data constructor
-what it means is the DataCon with all Unpacking having been applied.
-We can think of this as the Core representation.
-
-Here's an example illustrating the Core representation:
-        data Ord a => T a = MkT Int! a Void#
-Here
-        T :: Ord a => Int -> a -> Void# -> T a
-but the rep type is
-        Trep :: Int# -> a -> Void# -> T a
-Actually, the unboxed part isn't implemented yet!
-
-Note that this representation is still *different* from runtime
-representation. (Which is what STG uses after unarise).
-
-This is how T would end up being used in STG post-unarise:
-
-  let x = T 1# y
-  in ...
-      case x of
-        T int a -> ...
-
-The Void# argument is dropped and the boxed int is replaced by an unboxed
-one. In essence we only generate binders for runtime relevant values.
-
-We also flatten out unboxed tuples in this process. See the unarise
-pass for details on how this is done. But as an example consider
-`data S = MkS Bool (# Bool | Char #)` which when matched on would
-result in an alternative with three binders like this
-
-    MkS bool tag tpl_field ->
-
-See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
-for the details of this transformation.
-
 ************************************************************************
 *                                                                      *
diff --git a/compiler/GHC/Stg/Syntax.hs b/compiler/GHC/Stg/Syntax.hs
index f18a25beb8..7ddf7fef3b 100644
--- a/compiler/GHC/Stg/Syntax.hs
+++ b/compiler/GHC/Stg/Syntax.hs
@@ -237,6 +237,52 @@ StgConApp and StgPrimApp --- saturated applications
 
 There are specialised forms of application, for constructors,
 primitives, and literals.
+
+Note [Constructor applications in STG]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+After the unarisation pass:
+* In `StgConApp` and `StgRhsCon` and `StgAlt` we filter out the void arguments,
+  leaving only non-void ones.
+* In `StgApp` and `StgOpApp` we retain void arguments.
+
+We can do this because we know that `StgConApp` and `StgRhsCon` are saturated applications,
+so we lose no information by dropping those void args. In contrast, in `StgApp` we need the
+ void argument to compare the number of args in the call with the arity of the function.
+
+This is an open design choice. We could instead choose to treat all these applications
+consistently (keeping the void args). But for some reason we don't, and this Note simply
+documents that design choice.
+
+As an example, consider:
+
+        data T a = MkT Int! a Void#
+
+The wrapper's representation and the worker's representation (i.e. the
+datacon's Core representation) are respectively:
+
+        $WT :: Int  -> a -> Void# -> T a
+        T   :: Int# -> a -> Void# -> T a
+
+T would end up being used in STG post-unarise as:
+
+  let x = T 1# y
+  in ...
+      case x of
+        T int a -> ...
+
+The Void# argument is dropped. In essence we only generate binders for runtime
+relevant values.
+
+We also flatten out unboxed tuples in this process. See the unarise
+pass for details on how this is done. But as an example consider
+`data S = MkS Bool (# Bool | Char #)` which when matched on would
+result in an alternative with three binders like this
+
+    MkS bool tag tpl_field ->
+
+See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
+for the details of this transformation.
+
 -}
 
   | StgLit      Literal
@@ -245,7 +291,7 @@ literals.
         -- which can't be let-bound
   | StgConApp   DataCon
                 ConstructorNumber
-                [StgArg] -- Saturated. (After Unarisation, [NonVoid StgArg])
+                [StgArg] -- Saturated. See Note [Constructor applications in STG]
                 [Type]   -- See Note [Types in StgConApp] in GHC.Stg.Unarise
 
   | StgOpApp    StgOp    -- Primitive op or foreign call
@@ -422,7 +468,7 @@ important):
                         -- are not allocated.
         ConstructorNumber
         [StgTickish]
-        [StgArg]        -- Args
+        [StgArg]        -- Saturated Args. See Note [Constructor applications in STG]
         Type            -- Type, for rewriting to an StgRhsClosure
 
 -- | Like 'GHC.Hs.Extension.NoExtField', but with an 'Outputable' instance that
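
The convention described in Note [Constructor applications in STG] can be pictured with a small standalone sketch. This is a toy model, not GHC code: `Arg`, `Expr`, `ConApp`, `FunApp` and `dropVoids` are hypothetical names invented for illustration and do not correspond to the real `StgArg`/`StgExpr` definitions or to the unarise pass itself. It only shows the asymmetry the Note documents: saturated constructor applications drop void arguments, while function applications keep them so the argument count can still be compared with the function's arity.

```haskell
module Main (main) where

-- Toy model only: these types are hypothetical stand-ins and are NOT
-- GHC's StgArg / StgExpr definitions.

-- | An argument is either runtime-relevant or void (e.g. of type Void#).
data Arg
  = NonVoid String  -- a value that exists at runtime
  | VoidArg         -- a value with no runtime representation
  deriving (Eq, Show)

-- | Two of the application forms the Note distinguishes.
data Expr
  = ConApp String [Arg]  -- saturated constructor application
  | FunApp String [Arg]  -- function application, possibly partial
  deriving (Eq, Show)

isVoid :: Arg -> Bool
isVoid VoidArg = True
isVoid _       = False

-- | Mimics the post-unarise convention from the Note:
-- constructor applications lose their void arguments (they are known to be
-- saturated, so nothing is lost), whereas function applications keep them.
dropVoids :: Expr -> Expr
dropVoids (ConApp con args) = ConApp con (filter (not . isVoid) args)
dropVoids e@(FunApp _ _)    = e

main :: IO ()
main = do
  -- Models `T 1# y void#`: the void argument disappears.
  print (dropVoids (ConApp "T" [NonVoid "1#", NonVoid "y", VoidArg]))
  -- Models `f x void#`: the void argument is retained.
  print (dropVoids (FunApp "f" [NonVoid "x", VoidArg]))
```

In this sketch the first call yields a `ConApp` with two arguments, matching the `T 1# y` shape shown in the Note, and the second call returns the `FunApp` unchanged.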