diff options
author | Ryan Scott <ryan.gl.scott@gmail.com> | 2020-05-21 12:01:06 -0400 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2020-06-09 08:06:24 -0400 |
commit | 72c7fe9a1e147dfeaf043f6d591d724a126cce45 (patch) | |
tree | 89c4eb7ba14900e32a6b48c7e73ececcc0b1a4d4 /docs | |
parent | 9b607671b9158c60470b2bd57804a7684d3ea33f (diff) | |
download | haskell-72c7fe9a1e147dfeaf043f6d591d724a126cce45.tar.gz |
Make GADT constructors adhere to the forall-or-nothing rule properly
Issue #18191 revealed that the types of GADT constructors don't quite
adhere to the `forall`-or-nothing rule. This patch serves to clean up
this sad state of affairs somewhat. The main change is not in the
code itself, but in the documentation, as this patch introduces two
sections to the GHC User's Guide:
* A "Formal syntax for GADTs" section that presents a BNF-style
grammar for what is and isn't allowed in GADT constructor types.
This mostly exists to codify GHC's existing behavior, but it also
imposes a new restriction that addresses #18191: the outermost
`forall` and/or context in a GADT constructor is not allowed to be
surrounded by parentheses. Doing so would make these
`forall`s/contexts nested, and GADTs do not support nested
`forall`s/contexts at present.
* A "`forall`-or-nothing rule" section that describes exactly what
the `forall`-or-nothing rule is all about. Surprisingly, there was
no mention of this anywhere in the User's Guide up until now!
To adhere the new specification in the "Formal syntax for GADTs"
section of the User's Guide, the following code changes were made:
* A new function, `GHC.Hs.Type.splitLHsGADTPrefixTy`, was introduced.
This is very much like `splitLHsSigmaTy`, except that it avoids
splitting apart any parentheses, which can be syntactically
significant for GADT types. See
`Note [No nested foralls or contexts in GADT constructors]` in
`GHC.Hs.Type`.
* `ConDeclGADTPrefixPs`, an extension constructor for `XConDecl`, was
introduced so that `GHC.Parser.PostProcess.mkGadtDecl` can return
it when given a prefix GADT constructor. Unlike `ConDeclGADT`,
`ConDeclGADTPrefixPs` does not split the GADT type into its argument
and result types, as this cannot be done until after the type is
renamed (see `Note [GADT abstract syntax]` in `GHC.Hs.Decls` for why
this is the case).
* `GHC.Renamer.Module.rnConDecl` now has an additional case for
`ConDeclGADTPrefixPs` that (1) splits apart the full `LHsType` into
its `forall`s, context, argument types, and result type, and
(2) checks for nested `forall`s/contexts. Step (2) used to be
performed the typechecker (in `GHC.Tc.TyCl.badDataConTyCon`) rather
than the renamer, but now the relevant code from the typechecker
can simply be deleted.
One nice side effect of this change is that we are able to give a
more accurate error message for GADT constructors that use visible
dependent quantification (e.g., `MkFoo :: forall a -> a -> Foo a`),
which improves the stderr in the `T16326_Fail6` test case.
Fixes #18191. Bumps the Haddock submodule.
Diffstat (limited to 'docs')
-rw-r--r-- | docs/users_guide/8.12.1-notes.rst | 21 | ||||
-rw-r--r-- | docs/users_guide/exts/explicit_forall.rst | 77 | ||||
-rw-r--r-- | docs/users_guide/exts/gadt_syntax.rst | 117 |
3 files changed, 215 insertions, 0 deletions
diff --git a/docs/users_guide/8.12.1-notes.rst b/docs/users_guide/8.12.1-notes.rst index 38b20df1f8..22b9e0b571 100644 --- a/docs/users_guide/8.12.1-notes.rst +++ b/docs/users_guide/8.12.1-notes.rst @@ -103,6 +103,27 @@ Language instantiated through visible type application. More information can be found here: :ref:`Manually-defining-inferred-variables`. +* GADT constructor types now properly adhere to :ref:`forall-or-nothing`. As + a result, GHC will now reject some GADT constructors that previous versions + of GHC would accept, such as the following: :: + + data T where + MkT1 :: (forall a. a -> b -> T) + MkT2 :: (forall a. a -> T) + + ``MkT1`` and ``MkT2`` are rejected because the lack of an outermost + ``forall`` triggers implicit quantification, making the explicit ``forall``s + nested. Furthermore, GADT constructors do not permit the use of nested + ``forall``s, as explained in :ref:`formal-gadt-syntax`. + + In addition to rejecting nested ``forall``s, GHC is now more stringent about + rejecting uses of nested *contexts* in GADT constructors. For example, the + following example, which previous versions of GHC would accept, is now + rejected: + + data U a where + MkU :: (Show a => U a) + Compiler ~~~~~~~~ diff --git a/docs/users_guide/exts/explicit_forall.rst b/docs/users_guide/exts/explicit_forall.rst index 38c6e0f441..6b4fc9cef7 100644 --- a/docs/users_guide/exts/explicit_forall.rst +++ b/docs/users_guide/exts/explicit_forall.rst @@ -45,4 +45,81 @@ Notes: would warn about the unused type variable `a`. +.. _forall-or-nothing: + +The ``forall``-or-nothing rule +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In certain forms of types, type variables obey what is known as the +"``forall``-or-nothing" rule: if a type has an outermost, explicit +``forall``, then all of the type variables in the type must be explicitly +quantified. These two examples illustrate how the rule works: :: + + f :: forall a b. a -> b -> b -- OK, `a` and `b` are explicitly bound + g :: forall a. a -> forall b. b -> b -- OK, `a` and `b` are explicitly bound + h :: forall a. a -> b -> b -- Rejected, `b` is not in scope + +The type signatures for ``f``, ``g``, and ``h`` all begin with an outermost +``forall``, so every type variable in these signatures must be explicitly +bound by a ``forall``. Both ``f`` and ``g`` obey the ``forall``-or-nothing +rule, since they explicitly quantify ``a`` and ``b``. On the other hand, +``h`` does not explicitly quantify ``b``, so GHC will reject its type +signature for being improperly scoped. + +In places where the ``forall``-or-nothing rule takes effect, if a type does +*not* have an outermost ``forall``, then any type variables that are not +explicitly bound by a ``forall`` become implicitly quantified. For example: :: + + i :: a -> b -> b -- `a` and `b` are implicitly quantified + j :: a -> forall b. b -> b -- `a` is implicitly quantified + k :: (forall a. a -> b -> b) -- `b` is implicitly quantified + +GHC will accept ``i``, ``j``, and ``k``'s type signatures. Note that: + +- ``j``'s signature is accepted despite its mixture of implicit and explicit + quantification. As long as a ``forall`` is not an outermost one, it is fine + to use it among implicitly bound type variables. +- ``k``'s signature is accepted because the outermost parentheses imply that + the ``forall`` is not an outermost ``forall``. The ``forall``-or-nothing + rule is one of the few places in GHC where the presence or absence of + parentheses can be semantically significant! + +The ``forall``-or-nothing rule takes effect in the following places: + +- Type signature declarations for functions, values, and class methods +- Expression type annotations +- Instance declarations +- :ref:`class-default-signatures` +- Type signatures in a :ref:`specialize-pragma` or + :ref:`specialize-instance-pragma` +- :ref:`standalone-kind-signatures` +- Type signatures for :ref:`gadt` constructors +- Type signatures for :ref:`pattern-synonyms` +- :ref:`data-instance-declarations`, :ref:`type-instance-declarations`, + :ref:`closed-type-families`, and :ref:`assoc-inst` +- :ref:`rewrite-rules` in which the type variables are explicitly quantified +Notes: + +- :ref:`pattern-type-sigs` are a notable example of a place where + types do *not* obey the ``forall``-or-nothing rule. For example, GHC will + accept the following: :: + + f (g :: forall a. a -> b) x = g x :: b + + Furthermore, :ref:`rewrite-rules` do not obey the ``forall``-or-nothing rule + when their type variables are not explicitly quantified: :: + + {-# RULES "f" forall (g :: forall a. a -> b) x. f g x = g x :: b #-} + +- GADT constructors are extra particular about their ``forall``s. In addition + to adhering to the ``forall``-or-nothing rule, GADT constructors also forbid + nested ``forall``s. For example, GHC would reject the following GADT: :: + + data T where + MkT :: (forall a. a -> b -> T) + + Because of the lack of an outermost ``forall`` in the type of ``MkT``, the + ``b`` would be implicitly quantified. In effect, it would be as if one had + written ``MkT :: forall b. (forall a. a -> b -> T)``, which contains nested + ``forall``s. See :ref:`formal-gadt-syntax`. diff --git a/docs/users_guide/exts/gadt_syntax.rst b/docs/users_guide/exts/gadt_syntax.rst index f89888ff3b..9b20c27ab1 100644 --- a/docs/users_guide/exts/gadt_syntax.rst +++ b/docs/users_guide/exts/gadt_syntax.rst @@ -103,6 +103,123 @@ implements this behaviour, odd though it is. But for GADT-style declarations, GHC's behaviour is much more useful, as well as much more intuitive. +.. _formal-gadt-syntax: + +Formal syntax for GADTs +~~~~~~~~~~~~~~~~~~~~~~~ + +To make more precise what is and what is not permitted inside of a GADT-style +constructor, we provide a BNF-style grammar for GADT below. Note that this +grammar is subject to change in the future. :: + + gadt_con ::= conids '::' opt_forall opt_ctxt gadt_body + + conids ::= conid + | conid ',' conids + + opt_forall ::= <empty> + | 'forall' tv_bndrs '.' + + tv_bndrs ::= <empty> + | tv_bndr tv_bndrs + + tv_bndr ::= tyvar + | '(' tyvar '::' ctype ')' + + opt_ctxt ::= <empty> + | btype '=>' + | '(' ctxt ')' '=>' + + ctxt ::= ctype + | ctype ',' ctxt + + gadt_body ::= prefix_gadt_body + | record_gadt_body + + prefix_gadt_body ::= '(' prefix_gadt_body ')' + | return_type + | opt_unpack btype '->' prefix_gadt_body + + record_gadt_body ::= '{' fieldtypes '}' '->' return_type + + fieldtypes ::= <empty> + | fieldnames '::' opt_unpack ctype + | fieldnames '::' opt_unpack ctype ',' fieldtypes + + fieldnames ::= fieldname + | fieldname ',' fieldnames + + opt_unpack ::= opt_bang + : {-# UNPACK #-} opt_bang + | {-# NOUNPACK #-} opt_bang + + opt_bang ::= <empty> + | '!' + | '~' + +Where: + +- ``btype`` is a type that is not allowed to have an outermost + ``forall``/``=>`` unless it is surrounded by parentheses. For example, + ``forall a. a`` and ``Eq a => a`` are not legal ``btype``s, but + ``(forall a. a)`` and ``(Eq a => a)`` are legal. +- ``ctype`` is a ``btype`` that has no restrictions on an outermost + ``forall``/``=>``, so ``forall a. a`` and ``Eq a => a`` are legal ``ctype``s. +- ``return_type`` is a type that is not allowed to have ``forall``s, ``=>``s, + or ``->``s. + +This is a simplified grammar that does not fully delve into all of the +implementation details of GHC's parser (such as the placement of Haddock +comments), but it is sufficient to attain an understanding of what is +syntactically allowed. Some further various observations about this grammar: + +- GADT constructor types are currently not permitted to have nested ``forall``s + or ``=>``s. (e.g., something like ``MkT :: Int -> forall a. a -> T`` would be + rejected.) As a result, ``gadt_sig`` puts all of its quantification and + constraints up front with ``opt_forall`` and ``opt_context``. Note that + higher-rank ``forall``s and ``=>``s are only permitted if they do not appear + directly to the right of a function arrow in a `prefix_gadt_body`. (e.g., + something like ``MkS :: Int -> (forall a. a) -> S`` is allowed, since + parentheses separate the ``forall`` from the ``->``.) +- Furthermore, GADT constructors do not permit outermost parentheses that + surround the ``opt_forall`` or ``opt_ctxt``, if at least one of them are + used. For example, ``MkU :: (forall a. a -> U)`` would be rejected, since + it would treat the ``forall`` as being nested. + + Note that it is acceptable to use parentheses in a ``prefix_gadt_body``. + For instance, ``MkV1 :: forall a. (a) -> (V1)`` is acceptable, as is + ``MkV2 :: forall a. (a -> V2)``. +- The function arrows in a ``prefix_gadt_body``, as well as the function + arrow in a ``record_gadt_body``, are required to be used infix. For + example, ``MkA :: (->) Int A`` would be rejected. +- GHC uses the function arrows in a ``prefix_gadt_body`` and + ``prefix_gadt_body`` to syntactically demarcate the function and result + types. Note that GHC does not attempt to be clever about looking through + type synonyms here. If you attempt to do this, for instance: :: + + type C = Int -> B + + data B where + MkB :: C + + Then GHC will interpret the return type of ``MkB`` to be ``C``, and since + GHC requires that the return type must be headed by ``B``, this will be + rejected. On the other hand, it is acceptable to use type synonyms within + the argument and result types themselves, so the following is permitted: :: + + type B1 = Int + type B2 = B + + data B where + MkB :: B1 -> B2 +- GHC will accept any combination of ``!``/``~`` and + ``{-# UNPACK #-}``/``{-# NOUNPACK #-}``, although GHC will ignore some + combinations. For example, GHC will produce a warning if you write + ``{-# UNPACK #-} ~Int`` and proceed as if you had written ``Int``. + +GADT syntax odds and ends +~~~~~~~~~~~~~~~~~~~~~~~~~ + The rest of this section gives further details about GADT-style data type declarations. |