summaryrefslogtreecommitdiff
path: root/docs/users_guide/exts/functional_dependencies.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/users_guide/exts/functional_dependencies.rst')
-rw-r--r--docs/users_guide/exts/functional_dependencies.rst305
1 files changed, 305 insertions, 0 deletions
diff --git a/docs/users_guide/exts/functional_dependencies.rst b/docs/users_guide/exts/functional_dependencies.rst
new file mode 100644
index 0000000000..7926c47efa
--- /dev/null
+++ b/docs/users_guide/exts/functional_dependencies.rst
@@ -0,0 +1,305 @@
+.. _functional-dependencies:
+
+Functional dependencies
+-----------------------
+
+.. extension:: FunctionalDependencies
+ :shortdesc: Enable functional dependencies.
+ Implies :extension:`MultiParamTypeClasses`.
+
+ :implies: :extension:`MultiParamTypeClasses`
+ :since: 6.8.1
+
+ Allow use of functional dependencies in class declarations.
+
+Functional dependencies are implemented as described by Mark Jones in
+[Jones2000]_.
+
+Functional dependencies are introduced by a vertical bar in the syntax
+of a class declaration; e.g. ::
+
+ class (Monad m) => MonadState s m | m -> s where ...
+
+ class Foo a b c | a b -> c where ...
+
+More documentation can be found in the `Haskell Wiki
+<https://wiki.haskell.org/Functional_dependencies>`_.
+
+.. [Jones2000]
+ "`Type Classes with Functional
+ Dependencies <https://web.cecs.pdx.edu/~mpj/pubs/fundeps.html>`__",
+ Mark P. Jones, In *Proceedings of the 9th European Symposium on Programming*,
+ ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, .
+
+Rules for functional dependencies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In a class declaration, all of the class type variables must be
+reachable (in the sense mentioned in :ref:`flexible-contexts`) from the
+free variables of each method type. For example: ::
+
+ class Coll s a where
+ empty :: s
+ insert :: s -> a -> s
+
+is not OK, because the type of ``empty`` doesn't mention ``a``.
+Functional dependencies can make the type variable reachable: ::
+
+ class Coll s a | s -> a where
+ empty :: s
+ insert :: s -> a -> s
+
+Alternatively ``Coll`` might be rewritten ::
+
+ class Coll s a where
+ empty :: s a
+ insert :: s a -> a -> s a
+
+which makes the connection between the type of a collection of ``a``'s
+(namely ``(s a)``) and the element type ``a``. Occasionally this really
+doesn't work, in which case you can split the class like this: ::
+
+ class CollE s where
+ empty :: s
+
+ class CollE s => Coll s a where
+ insert :: s -> a -> s
+
+Background on functional dependencies
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The following description of the motivation and use of functional
+dependencies is taken from the Hugs user manual, reproduced here (with
+minor changes) by kind permission of Mark Jones.
+
+Consider the following class, intended as part of a library for
+collection types: ::
+
+ class Collects e ce where
+ empty :: ce
+ insert :: e -> ce -> ce
+ member :: e -> ce -> Bool
+
+The type variable ``e`` used here represents the element type, while ``ce`` is
+the type of the container itself. Within this framework, we might want to define
+instances of this class for lists or characteristic functions (both of which can
+be used to represent collections of any equality type), bit sets (which can be
+used to represent collections of characters), or hash tables (which can be used
+to represent any collection whose elements have a hash function). Omitting
+standard implementation details, this would lead to the following declarations: ::
+
+ instance Eq e => Collects e [e] where ...
+ instance Eq e => Collects e (e -> Bool) where ...
+ instance Collects Char BitSet where ...
+ instance (Hashable e, Collects a ce)
+ => Collects e (Array Int ce) where ...
+
+All this looks quite promising; we have a class and a range of
+interesting implementations. Unfortunately, there are some serious
+problems with the class declaration. First, the empty function has an
+ambiguous type: ::
+
+ empty :: Collects e ce => ce
+
+By "ambiguous" we mean that there is a type variable ``e`` that appears on
+the left of the ``=>`` symbol, but not on the right. The problem with
+this is that, according to the theoretical foundations of Haskell
+overloading, we cannot guarantee a well-defined semantics for any term
+with an ambiguous type.
+
+We can sidestep this specific problem by removing the empty member from
+the class declaration. However, although the remaining members, insert
+and member, do not have ambiguous types, we still run into problems when
+we try to use them. For example, consider the following two functions: ::
+
+ f x y = insert x . insert y
+ g = f True 'a'
+
+for which GHC infers the following types: ::
+
+ f :: (Collects a c, Collects b c) => a -> b -> c -> c
+ g :: (Collects Bool c, Collects Char c) => c -> c
+
+Notice that the type for ``f`` allows the two parameters ``x`` and ``y`` to be
+assigned different types, even though it attempts to insert each of the
+two values, one after the other, into the same collection. If we're
+trying to model collections that contain only one type of value, then
+this is clearly an inaccurate type. Worse still, the definition for g is
+accepted, without causing a type error. As a result, the error in this
+code will not be flagged at the point where it appears. Instead, it will
+show up only when we try to use ``g``, which might even be in a different
+module.
+
+An attempt to use constructor classes
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Faced with the problems described above, some Haskell programmers might
+be tempted to use something like the following version of the class
+declaration: ::
+
+ class Collects e c where
+ empty :: c e
+ insert :: e -> c e -> c e
+ member :: e -> c e -> Bool
+
+The key difference here is that we abstract over the type constructor ``c``
+that is used to form the collection type ``c e``, and not over that
+collection type itself, represented by ``ce`` in the original class
+declaration. This avoids the immediate problems that we mentioned above:
+empty has type ``Collects e c => c e``, which is not ambiguous.
+
+The function ``f`` from the previous section has a more accurate type: ::
+
+ f :: (Collects e c) => e -> e -> c e -> c e
+
+The function ``g`` from the previous section is now rejected with a type
+error as we would hope because the type of ``f`` does not allow the two
+arguments to have different types. This, then, is an example of a
+multiple parameter class that does actually work quite well in practice,
+without ambiguity problems. There is, however, a catch. This version of
+the ``Collects`` class is nowhere near as general as the original class
+seemed to be: only one of the four instances for ``Collects`` given
+above can be used with this version of Collects because only one of them—the
+instance for lists—has a collection type that can be written in the form ``c
+e``, for some type constructor ``c``, and element type ``e``.
+
+Adding functional dependencies
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To get a more useful version of the ``Collects`` class, GHC provides a
+mechanism that allows programmers to specify dependencies between the
+parameters of a multiple parameter class (For readers with an interest
+in theoretical foundations and previous work: The use of dependency
+information can be seen both as a generalisation of the proposal for
+"parametric type classes" that was put forward by Chen, Hudak, and
+Odersky, or as a special case of Mark Jones's later framework for
+"improvement" of qualified types. The underlying ideas are also
+discussed in a more theoretical and abstract setting in a manuscript
+[Jones1999]_, where they are identified as one point in a general design
+space for systems of implicit parameterisation). To start with an
+abstract example, consider a declaration such as: ::
+
+ class C a b where ...
+
+.. [Jones1999]
+ "`Exploring the Design Space for Type-based Implicit Parameterization
+ <https://web.cecs.pdx.edu/~mpj/pubs/fdtr.html>`__", Mark P. Jones, Oregon
+ Graduate Institute of Science & Technology, Technical Report, July 1999.
+
+which tells us simply that ``C`` can be thought of as a binary relation on
+types (or type constructors, depending on the kinds of ``a`` and ``b``). Extra
+clauses can be included in the definition of classes to add information
+about dependencies between parameters, as in the following examples: ::
+
+ class D a b | a -> b where ...
+ class E a b | a -> b, b -> a where ...
+
+The notation ``a -> b`` used here between the ``|`` and ``where`` symbols —
+not to be confused with a function type — indicates that the ``a``
+parameter uniquely determines the ``b`` parameter, and might be read as "``a``
+determines ``b``." Thus ``D`` is not just a relation, but actually a (partial)
+function. Similarly, from the two dependencies that are included in the
+definition of ``E``, we can see that ``E`` represents a (partial) one-to-one
+mapping between types.
+
+More generally, dependencies take the form ``x1 ... xn -> y1 ... ym``,
+where ``x1``, ..., ``xn``, and ``y1``, ..., ``yn`` are type variables with n>0 and m>=0,
+meaning that the ``y`` parameters are uniquely determined by the ``x``
+parameters. Spaces can be used as separators if more than one variable
+appears on any single side of a dependency, as in ``t -> a b``. Note
+that a class may be annotated with multiple dependencies using commas as
+separators, as in the definition of ``E`` above. Some dependencies that we
+can write in this notation are redundant, and will be rejected because
+they don't serve any useful purpose, and may instead indicate an error
+in the program. Examples of dependencies like this include ``a -> a``,
+``a -> a a``, ``a ->``, etc. There can also be some redundancy if
+multiple dependencies are given, as in ``a->b``, ``b->c``, ``a->c``, and
+in which some subset implies the remaining dependencies. Examples like
+this are not treated as errors. Note that dependencies appear only in
+class declarations, and not in any other part of the language. In
+particular, the syntax for instance declarations, class constraints, and
+types is completely unchanged.
+
+By including dependencies in a class declaration, we provide a mechanism
+for the programmer to specify each multiple parameter class more
+precisely. The compiler, on the other hand, is responsible for ensuring
+that the set of instances that are in scope at any given point in the
+program is consistent with any declared dependencies. For example, the
+following pair of instance declarations cannot appear together in the
+same scope because they violate the dependency for ``D``, even though either
+one on its own would be acceptable: ::
+
+ instance D Bool Int where ...
+ instance D Bool Char where ...
+
+Note also that the following declaration is not allowed, even by itself: ::
+
+ instance D [a] b where ...
+
+The problem here is that this instance would allow one particular choice
+of ``[a]`` to be associated with more than one choice for ``b``, which
+contradicts the dependency specified in the definition of ``D``. More
+generally, this means that, in any instance of the form: ::
+
+ instance D t s where ...
+
+for some particular types ``t`` and ``s``, the only variables that can appear in
+``s`` are the ones that appear in ``t``, and hence, if the type ``t`` is known,
+then ``s`` will be uniquely determined.
+
+The benefit of including dependency information is that it allows us to
+define more general multiple parameter classes, without ambiguity
+problems, and with the benefit of more accurate types. To illustrate
+this, we return to the collection class example, and annotate the
+original definition of ``Collects`` with a simple dependency: ::
+
+ class Collects e ce | ce -> e where
+ empty :: ce
+ insert :: e -> ce -> ce
+ member :: e -> ce -> Bool
+
+The dependency ``ce -> e`` here specifies that the type ``e`` of elements is
+uniquely determined by the type of the collection ``ce``. Note that both
+parameters of Collects are of kind ``Type``; there are no constructor classes
+here. Note too that all of the instances of ``Collects`` that we gave earlier
+can be used together with this new definition.
+
+What about the ambiguity problems that we encountered with the original
+definition? The empty function still has type ``Collects e ce => ce``, but
+it is no longer necessary to regard that as an ambiguous type: Although
+the variable ``e`` does not appear on the right of the ``=>`` symbol, the
+dependency for class ``Collects`` tells us that it is uniquely determined by
+``ce``, which does appear on the right of the ``=>`` symbol. Hence the context
+in which empty is used can still give enough information to determine
+types for both ``ce`` and ``e``, without ambiguity. More generally, we need only
+regard a type as ambiguous if it contains a variable on the left of the
+``=>`` that is not uniquely determined (either directly or indirectly) by
+the variables on the right.
+
+Dependencies also help to produce more accurate types for user defined
+functions, and hence to provide earlier detection of errors, and less
+cluttered types for programmers to work with. Recall the previous
+definition for a function ``f``: ::
+
+ f x y = insert x y = insert x . insert y
+
+for which we originally obtained a type: ::
+
+ f :: (Collects a c, Collects b c) => a -> b -> c -> c
+
+Given the dependency information that we have for ``Collects``, however, we
+can deduce that ``a`` and ``b`` must be equal because they both appear as the
+second parameter in a ``Collects`` constraint with the same first parameter
+``c``. Hence we can infer a shorter and more accurate type for ``f``: ::
+
+ f :: (Collects a c) => a -> a -> c -> c
+
+In a similar way, the earlier definition of ``g`` will now be flagged as a
+type error.
+
+Although we have given only a few examples here, it should be clear that
+the addition of dependency information can help to make multiple
+parameter classes more useful in practice, avoiding ambiguity problems,
+and allowing more general sets of instance declarations.
+
+