summaryrefslogtreecommitdiff
path: root/docs/users_guide/exts/gadt_syntax.rst
blob: 5b0e1631f336f94ce0aa8bd10037582b6ad90501 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
.. _gadt-style:

Declaring data types with explicit constructor signatures
---------------------------------------------------------

.. extension:: GADTSyntax
    :shortdesc: Enable generalised algebraic data type syntax.

    :since: 7.2.1

    Allow the use of GADT syntax in data type definitions (but not GADTs
    themselves; for this see :extension:`GADTs`)

When the ``GADTSyntax`` extension is enabled, GHC allows you to declare
an algebraic data type by giving the type signatures of constructors
explicitly. For example: ::

      data Maybe a where
          Nothing :: Maybe a
          Just    :: a -> Maybe a

The form is called a "GADT-style declaration" because Generalised
Algebraic Data Types, described in :ref:`gadt`, can only be declared
using this form.

Notice that GADT-style syntax generalises existential types
(:ref:`existential-quantification`). For example, these two declarations
are equivalent: ::

      data Foo = forall a. MkFoo a (a -> Bool)
      data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }

Any data type that can be declared in standard Haskell 98 syntax can
also be declared using GADT-style syntax. The choice is largely
stylistic, but GADT-style declarations differ in one important respect:
they treat class constraints on the data constructors differently.
Specifically, if the constructor is given a type-class context, that
context is made available by pattern matching. For example: ::

      data Set a where
        MkSet :: Eq a => [a] -> Set a

      makeSet :: Eq a => [a] -> Set a
      makeSet xs = MkSet (nub xs)

      insert :: a -> Set a -> Set a
      insert a (MkSet as) | a `elem` as = MkSet as
                          | otherwise   = MkSet (a:as)

A use of ``MkSet`` as a constructor (e.g. in the definition of
``makeSet``) gives rise to a ``(Eq a)`` constraint, as you would expect.
The new feature is that pattern-matching on ``MkSet`` (as in the
definition of ``insert``) makes *available* an ``(Eq a)`` context. In
implementation terms, the ``MkSet`` constructor has a hidden field that
stores the ``(Eq a)`` dictionary that is passed to ``MkSet``; so when
pattern-matching that dictionary becomes available for the right-hand
side of the match. In the example, the equality dictionary is used to
satisfy the equality constraint generated by the call to ``elem``, so
that the type of ``insert`` itself has no ``Eq`` constraint.

For example, one possible application is to reify dictionaries: ::

       data NumInst a where
         MkNumInst :: Num a => NumInst a

       intInst :: NumInst Int
       intInst = MkNumInst

       plus :: NumInst a -> a -> a -> a
       plus MkNumInst p q = p + q

Here, a value of type ``NumInst a`` is equivalent to an explicit
``(Num a)`` dictionary.

All this applies to constructors declared using the syntax of
:ref:`existential-with-context`. For example, the ``NumInst`` data type
above could equivalently be declared like this: ::

       data NumInst a
          = Num a => MkNumInst (NumInst a)

Notice that, unlike the situation when declaring an existential, there
is no ``forall``, because the ``Num`` constrains the data type's
universally quantified type variable ``a``. A constructor may have both
universal and existential type variables: for example, the following two
declarations are equivalent: ::

       data T1 a
        = forall b. (Num a, Eq b) => MkT1 a b
       data T2 a where
        MkT2 :: (Num a, Eq b) => a -> b -> T2 a

All this behaviour contrasts with Haskell 98's peculiar treatment of
contexts on a data type declaration (Section 4.2.1 of the Haskell 98
Report). In Haskell 98 the definition ::

      data Eq a => Set' a = MkSet' [a]

gives ``MkSet'`` the same type as ``MkSet`` above. But instead of
*making available* an ``(Eq a)`` constraint, pattern-matching on
``MkSet'`` *requires* an ``(Eq a)`` constraint! GHC faithfully
implements this behaviour, odd though it is. But for GADT-style
declarations, GHC's behaviour is much more useful, as well as much more
intuitive.

.. _formal-gadt-syntax:

Formal syntax for GADTs
~~~~~~~~~~~~~~~~~~~~~~~

To make more precise what is and what is not permitted inside of a GADT-style
constructor, we provide a BNF-style grammar for GADT below. Note that this
grammar is subject to change in the future.

.. code-block:: none

  gadt_con ::= conids '::' opt_forall opt_ctxt gadt_body

  conids ::= conid
          |  conid ',' conids

  opt_forall ::= <empty>
              |  'forall' tv_bndrs '.'

  tv_bndrs ::= <empty>
            |  tv_bndr tv_bndrs

  tv_bndr ::= tyvar
           |  '(' tyvar '::' ctype ')'

  opt_ctxt ::= <empty>
            |  btype '=>'
            |  '(' ctxt ')' '=>'

  ctxt ::= ctype
        |  ctype ',' ctxt

  gadt_body ::= prefix_gadt_body
             |  record_gadt_body

  prefix_gadt_body ::= '(' prefix_gadt_body ')'
                    |  return_type
                    |  opt_unpack btype '->' prefix_gadt_body

  record_gadt_body ::= '{' fieldtypes '}' '->' return_type

  fieldtypes ::= <empty>
              |  fieldnames '::' opt_unpack ctype
              |  fieldnames '::' opt_unpack ctype ',' fieldtypes

  fieldnames ::= fieldname
              |  fieldname ',' fieldnames

  opt_unpack ::= opt_bang
              :  {-# UNPACK #-} opt_bang
              |  {-# NOUNPACK #-} opt_bang

  opt_bang ::= <empty>
            |  '!'
            |  '~'

Where:

- ``btype`` is a type that is not allowed to have an outermost
  ``forall``/``=>`` unless it is surrounded by parentheses. For example,
  ``forall a. a`` and ``Eq a => a`` are not legal ``btype``\ s, but
  ``(forall a. a)`` and ``(Eq a => a)`` are legal.
- ``ctype`` is a ``btype`` that has no restrictions on an outermost
  ``forall``/``=>``, so ``forall a. a`` and ``Eq a => a`` are legal ``ctype``\ s.
- ``return_type`` is a type that is not allowed to have ``forall``\ s, ``=>``\ s,
  or ``->``\ s.

This is a simplified grammar that does not fully delve into all of the
implementation details of GHC's parser (such as the placement of Haddock
comments), but it is sufficient to attain an understanding of what is
syntactically allowed. Some further various observations about this grammar:

- GADT constructor types are currently not permitted to have nested ``forall``\ s
  or ``=>``\ s. (e.g., something like ``MkT :: Int -> forall a. a -> T`` would be
  rejected.) As a result, ``gadt_sig`` puts all of its quantification and
  constraints up front with ``opt_forall`` and ``opt_context``. Note that
  higher-rank ``forall``\ s and ``=>``\ s are only permitted if they do not appear
  directly to the right of a function arrow in a `prefix_gadt_body`. (e.g.,
  something like ``MkS :: Int -> (forall a. a) -> S`` is allowed, since
  parentheses separate the ``forall`` from the ``->``.)
- Furthermore, GADT constructors do not permit outermost parentheses that
  surround the ``opt_forall`` or ``opt_ctxt``, if at least one of them are
  used. For example, ``MkU :: (forall a. a -> U)`` would be rejected, since
  it would treat the ``forall`` as being nested.

  Note that it is acceptable to use parentheses in a ``prefix_gadt_body``.
  For instance, ``MkV1 :: forall a. (a) -> (V1)`` is acceptable, as is
  ``MkV2 :: forall a. (a -> V2)``.
- The function arrows in a ``prefix_gadt_body``, as well as the function
  arrow in a ``record_gadt_body``, are required to be used infix. For
  example, ``MkA :: (->) Int A`` would be rejected.
- GHC uses the function arrows in a ``prefix_gadt_body`` and
  ``prefix_gadt_body`` to syntactically demarcate the function and result
  types. Note that GHC does not attempt to be clever about looking through
  type synonyms here. If you attempt to do this, for instance: ::

    type C = Int -> B

    data B where
      MkB :: C

  Then GHC will interpret the return type of ``MkB`` to be ``C``, and since
  GHC requires that the return type must be headed by ``B``, this will be
  rejected. On the other hand, it is acceptable to use type synonyms within
  the argument and result types themselves, so the following is permitted: ::

    type B1 = Int
    type B2 = B

    data B where
      MkB :: B1 -> B2
- GHC will accept any combination of ``!``/``~`` and
  ``{-# UNPACK #-}``/``{-# NOUNPACK #-}``, although GHC will ignore some
  combinations. For example, GHC will produce a warning if you write
  ``{-# UNPACK #-} ~Int`` and proceed as if you had written ``Int``.

GADT syntax odds and ends
~~~~~~~~~~~~~~~~~~~~~~~~~

The rest of this section gives further details about GADT-style data
type declarations.

-  The result type of each data constructor must begin with the type
   constructor being defined. If the result type of all constructors has
   the form ``T a1 ... an``, where ``a1 ... an`` are distinct type
   variables, then the data type is *ordinary*; otherwise is a
   *generalised* data type (:ref:`gadt`).

-  As with other type signatures, you can give a single signature for
   several data constructors. In this example we give a single signature
   for ``T1`` and ``T2``: ::

         data T a where
           T1,T2 :: a -> T a
           T3 :: T a

-  The type signature of each constructor is independent, and is
   implicitly universally quantified as usual. In particular, the type
   variable(s) in the "``data T a where``" header have no scope, and
   different constructors may have different universally-quantified type
   variables: ::

         data T a where        -- The 'a' has no scope
           T1,T2 :: b -> T b   -- Means forall b. b -> T b
           T3 :: T a           -- Means forall a. T a

-  A constructor signature may mention type class constraints, which can
   differ for different constructors. For example, this is fine: ::

         data T a where
           T1 :: Eq b => b -> b -> T b
           T2 :: (Show c, Ix c) => c -> [c] -> T c

   When pattern matching, these constraints are made available to
   discharge constraints in the body of the match. For example: ::

         f :: T a -> String
         f (T1 x y) | x==y      = "yes"
                    | otherwise = "no"
         f (T2 a b)             = show a

   Note that ``f`` is not overloaded; the ``Eq`` constraint arising from
   the use of ``==`` is discharged by the pattern match on ``T1`` and
   similarly the ``Show`` constraint arising from the use of ``show``.

-  Unlike a Haskell-98-style data type declaration, the type variable(s)
   in the "``data Set a where``" header have no scope. Indeed, one can
   write a kind signature instead: ::

         data Set :: Type -> Type where ...

   or even a mixture of the two: ::

         data Bar a :: (Type -> Type) -> Type where ...

   The type variables (if given) may be explicitly kinded, so we could
   also write the header for ``Foo`` like this: ::

         data Bar a (b :: Type -> Type) where ...

-  You can use strictness annotations, in the obvious places in the
   constructor type: ::

         data Term a where
             Lit    :: !Int -> Term Int
             If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
             Pair   :: Term a -> Term b -> Term (a,b)

-  You can use a ``deriving`` clause on a GADT-style data type
   declaration. For example, these two declarations are equivalent ::

         data Maybe1 a where {
             Nothing1 :: Maybe1 a ;
             Just1    :: a -> Maybe1 a
           } deriving( Eq, Ord )

         data Maybe2 a = Nothing2 | Just2 a
              deriving( Eq, Ord )

-  The type signature may have quantified type variables that do not
   appear in the result type: ::

         data Foo where
            MkFoo :: a -> (a->Bool) -> Foo
            Nil   :: Foo

   Here the type variable ``a`` does not appear in the result type of
   either constructor. Although it is universally quantified in the type
   of the constructor, such a type variable is often called
   "existential". Indeed, the above declaration declares precisely the
   same type as the ``data Foo`` in :ref:`existential-quantification`.

   The type may contain a class context too, of course: ::

         data Showable where
           MkShowable :: Show a => a -> Showable

-  You can use record syntax on a GADT-style data type declaration: ::

         data Person where
             Adult :: { name :: String, children :: [Person] } -> Person
             Child :: Show a => { name :: !String, funny :: a } -> Person

   As usual, for every constructor that has a field ``f``, the type of
   field ``f`` must be the same (modulo alpha conversion). The ``Child``
   constructor above shows that the signature may have a context,
   existentially-quantified variables, and strictness annotations, just
   as in the non-record case. (NB: the "type" that follows the
   double-colon is not really a type, because of the record syntax and
   strictness annotations. A "type" of this form can appear only in a
   constructor signature.)

-  Record updates are allowed with GADT-style declarations, only fields
   that have the following property: the type of the field mentions no
   existential type variables.

-  As in the case of existentials declared using the Haskell-98-like
   record syntax (:ref:`existential-records`), record-selector functions
   are generated only for those fields that have well-typed selectors.
   Here is the example of that section, in GADT-style syntax: ::

       data Counter a where
           NewCounter :: { _this    :: self
                         , _inc     :: self -> self
                         , _display :: self -> IO ()
                         , tag      :: a
                         } -> Counter a

   As before, only one selector function is generated here, that for
   ``tag``. Nevertheless, you can still use all the field names in
   pattern matching and record construction.

-  In a GADT-style data type declaration there is no obvious way to
   specify that a data constructor should be infix, which makes a
   difference if you derive ``Show`` for the type. (Data constructors
   declared infix are displayed infix by the derived ``show``.) So GHC
   implements the following design: a data constructor declared in a
   GADT-style data type declaration is displayed infix by ``Show`` iff
   (a) it is an operator symbol, (b) it has two arguments, (c) it has a
   programmer-supplied fixity declaration. For example

   ::

          infix 6 (:--:)
          data T a where
            (:--:) :: Int -> Bool -> T Int