author     Rodrigo Mesquita <rodrigo.m.mesquita@gmail.com>  2023-05-05 15:02:35 +0100
committer  Rodrigo Mesquita <rodrigo.m.mesquita@gmail.com>  2023-05-16 14:59:09 +0100
commit     9048ced72ac1d8642990dc64ac7b2aebf1c0bea4
tree       96f7550a1e529cca917755bcb732e6f8d765f48c
parent     b938a8efe56ed77e520a1df206f82ec7133f0e60
download   haskell-wip/T23146.tar.gz
Merge outdated Note [Data con representation] into Note [Data constructor representation] (wip/T23146)
Introduce new Note [Constructor applications in STG] to better support the merge, and reference it from the relevant bits in the STG syntax.
-rw-r--r--  compiler/GHC/Core/DataCon.hs  64
-rw-r--r--  compiler/GHC/Stg/Syntax.hs    50
2 files changed, 64 insertions, 50 deletions
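
The new Note [Constructor applications in STG] says that, after unarisation, void arguments are dropped from saturated constructor applications but kept in function applications. The following is a minimal sketch of that rule on a toy expression type. It is not GHC's code: `Arg`, `Expr`, `isVoid` and `dropVoidConArgs` are hypothetical names invented for illustration, and the real pass works on `StgConApp`, `StgRhsCon`, `StgAlt`, `StgApp` and `StgOpApp`.

```haskell
-- Toy model only; every name here is hypothetical and none of it is GHC's API.

-- | An argument is either a runtime-relevant value or a void (zero-width) one.
data Arg
  = Value String
  | VoidArg
  deriving (Eq, Show)

-- | The two kinds of application that the Note contrasts.
data Expr
  = ConApp String [Arg] -- ^ saturated constructor application (cf. StgConApp)
  | FunApp String [Arg] -- ^ function application (cf. StgApp)
  deriving (Eq, Show)

-- | True for arguments that have no runtime representation.
isVoid :: Arg -> Bool
isVoid VoidArg = True
isVoid (Value _) = False

-- | Drop void arguments from constructor applications, which are known to be
-- saturated. Function applications are returned unchanged, so the number of
-- arguments can still be compared with the function's arity.
dropVoidConArgs :: Expr -> Expr
dropVoidConArgs (ConApp con args) = ConApp con (filter (not . isVoid) args)
dropVoidConArgs e@(FunApp _ _) = e
```

Loaded in GHCi, `dropVoidConArgs (ConApp "T" [Value "1#", Value "y", VoidArg])` leaves only the two non-void arguments, mirroring `T 1# y` in the Note's example, while a `FunApp` with a `VoidArg` comes back as it went in.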
diff --git a/compiler/GHC/Core/DataCon.hs b/compiler/GHC/Core/DataCon.hs
index 0f8d87eb57..4bf0659b56 100644
--- a/compiler/GHC/Core/DataCon.hs
+++ b/compiler/GHC/Core/DataCon.hs
@@ -141,7 +141,19 @@ becomes
case e of { T a' b -> let a = I# a' in ... }
To keep ourselves sane, we name the different versions of the data constructor
-differently, as follows.
+differently, as described in Note [Data Constructor Naming].
+
+The `dcRepType` field of a `DataCon` contains the type of the representation of
+the constructor /worker/, also called the Core representation.
+
+The Core representation may differ from the type of the constructor /wrapper/
+(built by `mkDataConRep`). Besides unpacking (as seen in the example above),
+dictionaries and coercions become explicit arguments in the Core representation
+of a constructor.
+
+Note that this representation is still *different* from runtime
+representation. (Which is what STG uses after unarise).
+See Note [Constructor applications in STG] in GHC.Stg.Syntax.
Note [Data Constructor Naming]
@@ -209,7 +221,8 @@ Note [Data constructor workers and wrappers]
* See Note [Data Constructor Naming] for how the worker and wrapper
are named
-* _Neither_ the worker _nor_ the wrapper take the dcStupidTheta dicts as arguments
+* The workers don't take the dcStupidTheta dicts as arguments, while the
+ wrappers currently do
* The wrapper (if it exists) takes dcOrigArgTys as its arguments.
The worker takes dataConRepArgTys as its arguments
@@ -528,7 +541,7 @@ data DataCon
-- forall a x y. (a~(x,y), x~y, Ord x) =>
-- x -> y -> T a
-- (this is *not* of the constructor wrapper Id:
- -- see Note [Data con representation] below)
+ -- see Note [Data constructor representation])
-- Notice that the existential type parameters come *second*.
-- Reason: in a case expression we may find:
-- case (e :: T t) of
@@ -988,51 +1001,6 @@ we consult HsImplBang:
The boolean flag is used only for this warning.
See #11270 for motivation.
-Note [Data con representation]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The dcRepType field contains the type of the representation of a constructor
-This may differ from the type of the constructor *Id* (built
-by MkId.mkDataConId) for two reasons:
- a) the constructor Id may be overloaded, but the dictionary isn't stored
- e.g. data Eq a => T a = MkT a a
-
- b) the constructor may store an unboxed version of a strict field.
-
-So whenever this module talks about the representation of a data constructor
-what it means is the DataCon with all Unpacking having been applied.
-We can think of this as the Core representation.
-
-Here's an example illustrating the Core representation:
- data Ord a => T a = MkT Int! a Void#
-Here
- T :: Ord a => Int -> a -> Void# -> T a
-but the rep type is
- Trep :: Int# -> a -> Void# -> T a
-Actually, the unboxed part isn't implemented yet!
-
-Note that this representation is still *different* from runtime
-representation. (Which is what STG uses after unarise).
-
-This is how T would end up being used in STG post-unarise:
-
- let x = T 1# y
- in ...
- case x of
- T int a -> ...
-
-The Void# argument is dropped and the boxed int is replaced by an unboxed
-one. In essence we only generate binders for runtime relevant values.
-
-We also flatten out unboxed tuples in this process. See the unarise
-pass for details on how this is done. But as an example consider
-`data S = MkS Bool (# Bool | Char #)` which when matched on would
-result in an alternative with three binders like this
-
- MkS bool tag tpl_field ->
-
-See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
-for the details of this transformation.
-
************************************************************************
* *
diff --git a/compiler/GHC/Stg/Syntax.hs b/compiler/GHC/Stg/Syntax.hs
index f18a25beb8..7ddf7fef3b 100644
--- a/compiler/GHC/Stg/Syntax.hs
+++ b/compiler/GHC/Stg/Syntax.hs
@@ -237,6 +237,52 @@ StgConApp and StgPrimApp --- saturated applications
There are specialised forms of application, for constructors, primitives, and
literals.
+
+Note [Constructor applications in STG]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+After the unarisation pass:
+* In `StgConApp` and `StgRhsCon` and `StgAlt` we filter out the void arguments,
+ leaving only non-void ones.
+* In `StgApp` and `StgOpApp` we retain void arguments.
+
+We can do this because we know that `StgConApp` and `StgRhsCon` are saturated applications,
+so we lose no information by dropping those void args. In contrast, in `StgApp` we need the
+ void argument to compare the number of args in the call with the arity of the function.
+
+This is an open design choice. We could instead choose to treat all these applications
+consistently (keeping the void args). But for some reason we don't, and this Note simply
+documents that design choice.
+
+As an example, consider:
+
+ data T a = MkT Int! a Void#
+
+The wrapper's representation and the worker's representation (i.e. the
+datacon's Core representation) are respectively:
+
+ $WT :: Int -> a -> Void# -> T a
+ T :: Int# -> a -> Void# -> T a
+
+T would end up being used in STG post-unarise as:
+
+ let x = T 1# y
+ in ...
+ case x of
+ T int a -> ...
+
+The Void# argument is dropped. In essence we only generate binders for runtime
+relevant values.
+
+We also flatten out unboxed tuples in this process. See the unarise
+pass for details on how this is done. But as an example consider
+`data S = MkS Bool (# Bool | Char #)` which when matched on would
+result in an alternative with three binders like this
+
+ MkS bool tag tpl_field ->
+
+See Note [Translating unboxed sums to unboxed tuples] and Note [Unarisation]
+for the details of this transformation.
+
-}
| StgLit Literal
@@ -245,7 +291,7 @@ literals.
-- which can't be let-bound
| StgConApp DataCon
ConstructorNumber
- [StgArg] -- Saturated. (After Unarisation, [NonVoid StgArg])
+ [StgArg] -- Saturated. See Note [Constructor applications in STG]
[Type] -- See Note [Types in StgConApp] in GHC.Stg.Unarise
| StgOpApp StgOp -- Primitive op or foreign call
@@ -422,7 +468,7 @@ important):
-- are not allocated.
ConstructorNumber
[StgTickish]
- [StgArg] -- Args
+ [StgArg] -- Saturated Args. See Note [Constructor applications in STG]
Type -- Type, for rewriting to an StgRhsClosure
-- | Like 'GHC.Hs.Extension.NoExtField', but with an 'Outputable' instance that