summaryrefslogtreecommitdiff
path: root/docs/users_guide/exts/generics.rst
blob: d32fa246bb341ae9a53dcde5ec08826a1dcb233e (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
.. _generic-programming:

Generic programming
===================

There are a few ways to do datatype-generic programming using the
:base-ref:`GHC.Generics.` framework. One is making use of the
``Generically`` and ``Generically1`` wrappers from ``GHC.Generics``,
instances can be derived via them using :extension:`DerivingVia`: ::

    {-# LANGUAGE DeriveGeneric      #-}
    {-# LANGUAGE DerivingStrategies #-}
    {-# LANGUAGE DerivingVia        #-}

    import GHC.Generics

    data V4 a = V4 a a a a
      deriving
      stock (Generic, Generic1)

      deriving (Semigroup, Monoid)
      via Generically (V4 a)

      deriving (Functor, Applicative)
      via Generically1 V4

The older approach uses :extension:`DeriveGeneric`,
:extension:`DefaultSignatures`, and :extension:`DeriveAnyClass`. It
derives instances by providing a distinguished generic implementation
as part of the type class declaration. This section gives a very brief
overview of how to do it.

Generic programming support in GHC allows defining classes with methods
that do not need a user specification when instantiating: the method
body is automatically derived by GHC. This is similar to what happens
for standard classes such as ``Read`` and ``Show``, for instance, but
now for user-defined classes.

.. _generic-classes:

.. note::

   GHC used to have an implementation of generic classes as defined in the paper
   "Derivable type classes", Ralf Hinze and Simon Peyton Jones, Haskell
   Workshop, Montreal Sept 2000, pp. 94-105. These have been removed and
   replaced by the more general support for generic programming.

Deriving representations
------------------------

The first thing we need is generic representations. The ``GHC.Generics``
module defines a couple of primitive types that are used to represent
Haskell datatypes: ::

    -- | Unit: used for constructors without arguments
    data U1 p = U1

    -- | Constants, additional parameters and recursion of kind Type
    newtype K1 i c p = K1 { unK1 :: c }

    -- | Meta-information (constructor names, etc.)
    newtype M1 i c f p = M1 { unM1 :: f p }

    -- | Sums: encode choice between constructors
    infixr 5 :+:
    data (:+:) f g p = L1 (f p) | R1 (g p)

    -- | Products: encode multiple arguments to constructors
    infixr 6 :*:
    data (:*:) f g p = f p :*: g p

The ``Generic`` and ``Generic1`` classes mediate between user-defined
datatypes and their internal representation as a sum-of-products: ::

    class Generic a where
      -- Encode the representation of a user datatype
      type Rep a :: Type -> Type
      -- Convert from the datatype to its representation
      from  :: a -> (Rep a) x
      -- Convert from the representation to the datatype
      to    :: (Rep a) x -> a

    class Generic1 (f :: k -> Type) where
      type Rep1 f :: k -> Type

      from1  :: f a -> Rep1 f a
      to1    :: Rep1 f a -> f a

``Generic1`` is used for functions that can only be defined over type
containers, such as ``map``. Note that ``Generic1`` ranges over types of kind
``Type -> Type`` by default, but if the :extension:`PolyKinds` extension is
enabled, then it can range of types of kind ``k -> Type``, for any kind ``k``.

.. extension:: DeriveGeneric
    :shortdesc: Enable deriving for the Generic class.

    :since: 7.2.1

    Allow automatic deriving of instances for the ``Generic`` typeclass.


Instances of these classes can be derived by GHC with the
:extension:`DeriveGeneric` extension, and are necessary to be able to define
generic instances automatically.

For example, a user-defined datatype of trees ::

    data UserTree a = Node a (UserTree a) (UserTree a) | Leaf

in a ``Main`` module in a package named ``foo`` will get the following
representation: ::

    instance Generic (UserTree a) where
      -- Representation type
      type Rep (UserTree a) =
        M1 D ('MetaData "UserTree" "Main" "package-name" 'False) (
              M1 C ('MetaCons "Node" 'PrefixI 'False) (
                    M1 S ('MetaSel 'Nothing
                                   'NoSourceUnpackedness
                                   'NoSourceStrictness
                                   'DecidedLazy)
                         (K1 R a)
                :*: M1 S ('MetaSel 'Nothing
                                   'NoSourceUnpackedness
                                   'NoSourceStrictness
                                   'DecidedLazy)
                         (K1 R (UserTree a))
                :*: M1 S ('MetaSel 'Nothing
                                   'NoSourceUnpackedness
                                   'NoSourceStrictness
                                   'DecidedLazy)
                         (K1 R (UserTree a)))
          :+: M1 C ('MetaCons "Leaf" 'PrefixI 'False) U1)

      -- Conversion functions
      from (Node x l r) = M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))
      from Leaf         = M1 (R1 (M1 U1))
      to (M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))) = Node x l r
      to (M1 (R1 (M1 U1)))                                      = Leaf

This representation is generated automatically if a ``deriving Generic``
clause is attached to the datatype. `Standalone
deriving <#stand-alone-deriving>`__ can also be used.

Writing generic functions
-------------------------

A generic function is defined by creating a class and giving instances
for each of the representation types of ``GHC.Generics``. As an example
we show generic serialization: ::

    data Bin = O | I

    class GSerialize f where
      gput :: f a -> [Bin]

    instance GSerialize U1 where
      gput U1 = []

    instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where
      gput (x :*: y) = gput x ++ gput y

    instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where
      gput (L1 x) = O : gput x
      gput (R1 x) = I : gput x

    instance (GSerialize a) => GSerialize (M1 i c a) where
      gput (M1 x) = gput x

    instance (Serialize a) => GSerialize (K1 i a) where
      gput (K1 x) = put x

A caveat: this encoding strategy may not be reliable across different versions
of GHC. When deriving a ``Generic`` instance is free to choose any nesting of
``:+:`` and ``:*:`` it chooses, so if GHC chooses ``(a :+: b) :+: c``, then the
encoding for ``a`` would be ``[O, O]``, ``b`` would be ``[O, I]``, and ``c``
would be ``[I]``. However, if GHC chooses ``a :+: (b :+: c)``, then the
encoding for ``a`` would be ``[O]``, ``b`` would be ``[I, O]``, and ``c`` would
be ``[I, I]``. (In practice, the current implementation tries to produce a
more-or-less balanced nesting of ``:+:`` and ``:*:`` so that the traversal of
the structure of the datatype from the root to a particular component can be
performed in logarithmic rather than linear time.)

Typically this ``GSerialize`` class will not be exported, as it only makes
sense to have instances for the representation types.

Unlifted representation types
-----------------------------

The data family ``URec`` is provided to enable generic programming over
datatypes with certain unlifted arguments. There are six instances corresponding
to common unlifted types: ::

    data family URec a p

    data instance URec (Ptr ()) p = UAddr   { uAddr#   :: Addr#   }
    data instance URec Char     p = UChar   { uChar#   :: Char#   }
    data instance URec Double   p = UDouble { uDouble# :: Double# }
    data instance URec Int      p = UInt    { uInt#    :: Int#    }
    data instance URec Float    p = UFloat  { uFloat#  :: Float#  }
    data instance URec Word     p = UWord   { uWord#   :: Word#   }

Six type synonyms are provided for convenience: ::

    type UAddr   = URec (Ptr ())
    type UChar   = URec Char
    type UDouble = URec Double
    type UFloat  = URec Float
    type UInt    = URec Int
    type UWord   = URec Word

As an example, this data declaration: ::

    data IntHash = IntHash Int#
      deriving Generic

results in the following ``Generic`` instance: ::

    instance 'Generic' IntHash where
      type 'Rep' IntHash =
        'D1' ('MetaData "IntHash" "Main" "package-name" 'False)
          ('C1' ('MetaCons "IntHash" 'PrefixI 'False)
            ('S1' ('MetaSel 'Nothing
                            'NoSourceUnpackedness
                            'NoSourceStrictness
                            'DecidedLazy)
                  'UInt'))

A user could provide, for example, a ``GSerialize UInt`` instance so that a
``Serialize IntHash`` instance could be easily defined in terms of
``GSerialize``.

Generic defaults
----------------

The only thing left to do now is to define a "front-end" class, which is
exposed to the user: ::

    class Serialize a where
      put :: a -> [Bin]

      default put :: (Generic a, GSerialize (Rep a)) => a -> [Bin]
      put = gput . from

Here we use a `default signature <#class-default-signatures>`__ to
specify that the user does not have to provide an implementation for
``put``, as long as there is a ``Generic`` instance for the type to
instantiate. For the ``UserTree`` type, for instance, the user can just
write: ::

    instance (Serialize a) => Serialize (UserTree a)

The default method for ``put`` is then used, corresponding to the
generic implementation of serialization. If you are using
:extension:`DeriveAnyClass`, the same instance is generated by simply attaching
a ``deriving Serialize`` clause to the ``UserTree`` datatype
declaration. For more examples of generic functions please refer to the
`generic-deriving <http://hackage.haskell.org/package/generic-deriving>`__
package on Hackage.

More information
----------------

For more details please refer to the `Haskell Wiki
page <http://www.haskell.org/haskellwiki/GHC.Generics>`__ or the
original paper [Generics2010]_.

.. [Generics2010] Jose Pedro Magalhaes, Atze Dijkstra, Johan Jeuring, and Andres Loeh.
   `A generic deriving mechanism for Haskell
   <http://dreixel.net/research/pdf/gdmh.pdf>`__. Proceedings of
   the third ACM Haskell symposium on Haskell (Haskell'2010), pp. 37-48,
   ACM, 2010.