diff options
Diffstat (limited to 'docs/users_guide/glasgow_exts.xml')
-rw-r--r-- | docs/users_guide/glasgow_exts.xml | 694 |
1 files changed, 451 insertions, 243 deletions
diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index 135d8ecded..df1ff2c181 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -857,152 +857,248 @@ To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag </title> <para> -The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>, -that is, the variables bound in a do-expression are visible only in the textually following -code block. Compare this to a let-expression, where bound variables are visible in the entire binding -group. It turns out that several applications can benefit from recursive bindings in -the do-notation. The <option>-XDoRec</option> flag provides the necessary syntactic support. + The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>, + that is, the variables bound in a do-expression are visible only in the textually following + code block. Compare this to a let-expression, where bound variables are visible in the entire binding + group. +</para> + +<para> + It turns out that such recursive bindings do indeed make sense for a variety of monads, but + not all. In particular, recursion in this sense requires a fixed-point operator for the underlying + monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows: +<programlisting> +class Monad m => MonadFix m where + mfix :: (a -> m a) -> m a +</programlisting> + Haskell's + <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions), + <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative + side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not. </para> + +<para> + For monads that do belong to the <literal>MonadFix</literal> class, GHC provides + an extended version of the do-notation that allows recursive bindings. + The <option>-XRecursiveDo</option> (language pragma: <literal>RecursiveDo</literal>) + provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and + <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike + bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal> + are recursively defined, much like in an ordinary let-expression. Due to the new + keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>. +</para> + <para> -Here is a simple (albeit contrived) example: + Here is a simple (albeit contrived) example: <programlisting> -{-# LANGUAGE DoRec #-} +{-# LANGUAGE RecursiveDo #-} +justOnes = mdo { xs <- Just (1:xs) + ; return (map negate xs) } +</programlisting> +or equivalently +<programlisting> +{-# LANGUAGE RecursiveDo #-} justOnes = do { rec { xs <- Just (1:xs) } ; return (map negate xs) } </programlisting> As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>. </para> -<para> -The background and motivation for recursive do-notation is described in -<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>, -by Levent Erkok, John Launchbury, -Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania. -The theory behind monadic value recursion is explained further in Erkok's thesis -<ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>. -However, note that GHC uses a different syntax than the one described in these documents. + +<para> + GHC's implementation the mdo-notation closely follows the original translation as described in the paper + <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which + in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion + in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper + with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next. </para> <sect3> -<title>Details of recursive do-notation</title> +<title>Recursive binding groups</title> + <para> -The recursive do-notation is enabled with the flag <option>-XDoRec</option> or, equivalently, -the LANGUAGE pragma <option>DoRec</option>. It introduces the single new keyword "<literal>rec</literal>", -which wraps a mutually-recursive group of monadic statements, -producing a single statement. -</para> -<para>Similar to a <literal>let</literal> -statement, the variables bound in the <literal>rec</literal> are -visible throughout the <literal>rec</literal> group, and below it. -For example, compare + The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a + mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement. + Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in + the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. For example, compare <programlisting> -do { a <- getChar do { a <- getChar - ; let { r1 = f a r2 ; rec { r1 <- f a r2 - ; r2 = g r1 } ; r2 <- g r1 } - ; return (r1 ++ r2) } ; return (r1 ++ r2) } + do { a <- getChar do { a <- getChar + ; let { r1 = f a r2 ; rec { r1 <- f a r2 + ; ; r2 = g r1 } ; ; r2 <- g r1 } + ; return (r1 ++ r2) } ; return (r1 ++ r2) } </programlisting> -In both cases, <literal>r1</literal> and <literal>r2</literal> are -available both throughout the <literal>let</literal> or <literal>rec</literal> block, and -in the statements that follow it. The difference is that <literal>let</literal> is non-monadic, -while <literal>rec</literal> is monadic. (In Haskell <literal>let</literal> is -really <literal>letrec</literal>, of course.) + In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout + the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it. + The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic. + (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.) </para> + <para> -The static and dynamic semantics of <literal>rec</literal> can be described as follows: -<itemizedlist> -<listitem><para> -First, -similar to let-bindings, the <literal>rec</literal> is broken into -minimal recursive groups, a process known as <emphasis>segmentation</emphasis>. -For example: -<programlisting> -rec { a <- getChar ===> a <- getChar - ; b <- f a c rec { b <- f a c - ; c <- f b a ; c <- f b a } - ; putChar c } putChar c -</programlisting> -The details of segmentation are described in Section 3.2 of -<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>. -Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper -describes, also has a semantic effect (unless the monad satisfies the right-shrinking law). -</para></listitem> -<listitem><para> -Then each resulting <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal>. -For example, the <literal>rec</literal> group in the preceding example is desugared like this: + The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal> + group, it will compute its set of bound variables, and will introduce an appropriate call + to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the + <literal>MonadFix</literal> class. Here is an example: <programlisting> rec { b <- f a c ===> (b,c) <- mfix (\~(b,c) -> do { b <- f a c ; c <- f b a } ; c <- f b a ; return (b,c) }) </programlisting> -In general, the statement <literal>rec <replaceable>ss</replaceable></literal> -is desugared to the statement + As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns. + In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement <programlisting> <replaceable>vs</replaceable> <- mfix (\~<replaceable>vs</replaceable> -> do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> }) </programlisting> -where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>. -</para><para> -The original <literal>rec</literal> typechecks exactly -when the above desugared version would do so. For example, this means that -the variables <replaceable>vs</replaceable> are all monomorphic in the statements -following the <literal>rec</literal>, because they are bound by a lambda. + where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>. </para> + <para> -The <literal>mfix</literal> function is defined in the <literal>MonadFix</literal> -class, in <literal>Control.Monad.Fix</literal>, thus: -<programlisting> -class Monad m => MonadFix m where - mfix :: (a -> m a) -> m a -</programlisting> + Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call + to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task + for the <literal>mdo</literal> notation, which is described next. </para> -</listitem> -</itemizedlist> -</para> -<para> -Here are some other important points in using the recursive-do notation: -<itemizedlist> -<listitem><para> -It is enabled with the flag <literal>-XDoRec</literal>. -</para></listitem> +</sect3> -<listitem><para> -If recursive bindings are required for a monad, -then that monad must be declared an instance of the <literal>MonadFix</literal> class. -</para></listitem> +<sect3> +<title>The <literal>mdo</literal> notation</title> -<listitem><para> -The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO. -Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class -for Haskell's internal state monad (strict and lazy, respectively). -</para></listitem> +<para> + A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that + the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped + around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described + in detail in Secton 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for + Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids + unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis> + axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis> + have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary + interference, changing the termination behavior of the resulting translation. + (Details can be found in Sections 3.1 and 7.2.2 of + <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.) +</para> -<listitem><para> -Like <literal>let</literal> and <literal>where</literal> bindings, -name shadowing is not allowed within a <literal>rec</literal>; -that is, all the names bound in a single <literal>rec</literal> must -be distinct (Section 3.3 of the paper). -</para></listitem> -<listitem><para> -It supports rebindable syntax (see <xref linkend="rebindable-syntax"/>). -</para></listitem> +<para> + The <literal>mdo</literal> notation removes the burden of placing + explicit <literal>rec</literal> blocks in the code. Unlike an + ordinary <literal>do</literal> expression, in which variables bound by + statements are only in scope for later statements, variables bound in + an <literal>mdo</literal> expression are in scope for all statements + of the expression. The compiler then automatically identifies minimal + mutually recursively dependent segments of statements, treating them as + if the user had wrapped a <literal>rec</literal> qualifier around them. +</para> + +<para> + The definition is syntactic: +</para> +<itemizedlist> + <listitem> + <para> + A generator <replaceable>g</replaceable> + <emphasis>depends</emphasis> on a textually following generator + <replaceable>g'</replaceable>, if + </para> + <itemizedlist> + <listitem> + <para> + <replaceable>g'</replaceable> defines a variable that + is used by <replaceable>g</replaceable>, or + </para> + </listitem> + <listitem> + <para> + <replaceable>g'</replaceable> textually appears between + <replaceable>g</replaceable> and + <replaceable>g''</replaceable>, where <replaceable>g</replaceable> + depends on <replaceable>g''</replaceable>. + </para> + </listitem> + </itemizedlist> + </listitem> + <listitem> + <para> + A <emphasis>segment</emphasis> of a given + <literal>mdo</literal>-expression is a minimal sequence of generators + such that no generator of the sequence depends on an outside + generator. As a special case, although it is not a generator, + the final expression in an <literal>mdo</literal>-expression is + considered to form a segment by itself. + </para> + </listitem> </itemizedlist> +<para> + Segments in this sense are + related to <emphasis>strongly-connected components</emphasis> analysis, + with the exception that bindings in a segment cannot be reordered and + must be contiguous. </para> -</sect3> -<sect3 id="mdo-notation"> <title> Mdo-notation (deprecated) </title> +<para> + Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks: +<programlisting> +mdo { a <- getChar ===> do { a <- getChar + ; b <- f a c ; rec { b <- f a c + ; c <- f b a ; ; c <- f b a } + ; z <- h a b ; z <- h a b + ; d <- g d e ; rec { d <- g d e + ; e <- g a z ; ; e <- g a z } + ; putChar c } ; putChar c } +</programlisting> +Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks. +If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. In this +latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one +would expect. +</para> -<para> GHC used to support the flag <option>-XRecursiveDo</option>, -which enabled the keyword <literal>mdo</literal>, precisely as described in -<ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>, -but this is now deprecated. Instead of <literal>mdo { Q; e }</literal>, write -<literal>do { rec Q; e }</literal>. +<para> + In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing + <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting + <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described + in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared + version would do so. </para> + <para> -Historical note: The old implementation of the mdo-notation (and most -of the existing documents) used the name -<literal>MonadRec</literal> for the class and the corresponding library. -This name is not supported by GHC. +Here are some other important points in using the recursive-do notation: + +<itemizedlist> + <listitem> + <para> + It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal> + pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal> + blocks inside <literal>do</literal> expressions.) + </para> + </listitem> + <listitem> + <para> + <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be + treated as a single statement. However, it is good style to either use <literal>mdo</literal> or + <literal>rec</literal> blocks in a single expression. + </para> + </listitem> + <listitem> + <para> + If recursive bindings are required for a monad, then that monad must be declared an instance of + the <literal>MonadFix</literal> class. + </para> + </listitem> + <listitem> + <para> + The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO. + Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal> + modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal + state monad (strict and lazy, respectively). + </para> + </listitem> + <listitem> + <para> + Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within + an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in + a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.) + </para> + </listitem> +</itemizedlist> </para> </sect3> + </sect2> @@ -1469,7 +1565,7 @@ the comprehension being over an arbitrary monad. functions <literal>(>>=)</literal>, <literal>(>>)</literal>, and <literal>fail</literal>, are in scope (not the Prelude - versions). List comprehensions, mdo (<xref linkend="mdo-notation"/>), and parallel array + versions). List comprehensions, mdo (<xref linkend="recursive-do-notation"/>), and parallel array comprehensions, are unaffected. </para></listitem> <listitem> @@ -2025,16 +2121,6 @@ The following syntax is stolen: <varlistentry> <term> - <literal>'<replaceable>varid</replaceable></literal> - </term> - <listitem><para> - Stolen by: <option>-XTemplateHaskell</option>and - <option>-XPolyKinds</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> <literal>[:<replaceable>varid</replaceable>|</literal> <indexterm><primary>quasi-quotation</primary></indexterm> </term> @@ -2103,8 +2189,9 @@ NilSet :: Set a ConsSet :: Eq a => a -> Set a -> Set a </programlisting> -<para>In GHC this feature is an extension called -<literal>DatatypeContexts</literal>, and on by default.</para> +<para>This is widely considered a misfeature, and is going to be removed from +the language. In GHC, it is controlled by the deprecated extension +<literal>DatatypeContexts</literal>.</para> </sect2> <sect2 id="infix-tycons"> @@ -3252,8 +3339,9 @@ then writing the data type instance by hand. </listitem> <listitem><para> With <option>-XDeriveGeneric</option>, you can derive -instances of the class <literal>Generic</literal>, defined in -<literal>GHC.Generics</literal>. You can use these to define generic functions, +instances of the classes <literal>Generic</literal> and +<literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>. +You can use these to define generic functions, as described in <xref linkend="generic-programming"/>. </para></listitem> @@ -3971,46 +4059,37 @@ must be of the form <literal>C (T a1 ... an)</literal>, where and the <literal>a1 ... an</literal> are distinct type variables. GHC relaxes these rules in two ways. <itemizedlist> -<listitem> -<para> -The <option>-XFlexibleInstances</option> flag allows the head of the instance -declaration to mention arbitrary nested types. -For example, this becomes a legal instance declaration -<programlisting> - instance C (Maybe Int) where ... -</programlisting> -See also the <link linkend="instance-overlap">rules on overlap</link>. -</para></listitem> <listitem><para> With the <option>-XTypeSynonymInstances</option> flag, instance heads may use type synonyms. As always, using a type synonym is just shorthand for writing the RHS of the type synonym definition. For example: - - <programlisting> - type Point = (Int,Int) - instance C Point where ... - instance C [Point] where ... + type Point a = (a,a) + instance C (Point a) where ... </programlisting> - - -is legal. However, if you added - - +is legal. The instance declaration is equivalent to <programlisting> - instance C (Int,Int) where ... + instance C (a,a) where ... </programlisting> - - -as well, then the compiler will complain about the overlapping -(actually, identical) instance declarations. As always, type synonyms +As always, type synonyms must be fully applied. You cannot, for example, write: - <programlisting> - type P a = [[a]] - instance Monad P where ... + instance Monad Point where ... </programlisting> +</para></listitem> +<listitem> +<para> +The <option>-XFlexibleInstances</option> flag allows the head of the instance +declaration to mention arbitrary nested types. +For example, this becomes a legal instance declaration +<programlisting> + instance C (Maybe Int) where ... +</programlisting> +See also the <link linkend="instance-overlap">rules on overlap</link>. +</para> +<para> +The <option>-XFlexibleInstances</option> flag implies <option>-XTypeSynonymInstances</option>. </para></listitem> </itemizedlist> </para> @@ -5172,60 +5251,19 @@ instance Show v => Show (GMap () v) where ... </sect1> -<sect1 id="kind-polymorphism-and-promotion"> -<title>Kind polymorphism and promotion</title> - -<para> -Standard Haskell has a rich type language. Types classify terms and serve to -avoid many common programming mistakes. The kind language, however, is -relatively simple, distinguishing only lifted types (kind <literal>*</literal>), -type constructors (eg. kind <literal>* -> * -> *</literal>), and unlifted -types (<xref linkend="glasgow-unboxed"/>). In particular when using advanced -type system features, such as type families (<xref linkend="type-families"/>) -or GADTs (<xref linkend="gadt"/>), this simple kind system is insufficient, -and fails to prevent simple errors. Consider the example of type-level natural -numbers, and length-indexed vectors: -<programlisting> -data Ze -data Su n - -data Vec :: * -> * -> * where - Nil :: Vec a Ze - Cons :: a -> Vec a n -> Vec a (Su n) -</programlisting> -The kind of <literal>Vec</literal> is <literal>* -> * -> *</literal>. This means -that eg. <literal>Vec Int Char</literal> is a well-kinded type, even though this -is not what we intend when defining length-indexed vectors. -</para> - -<para> -With the flags <option>-XPolyKinds</option> and <option>-XDataKinds</option>, -users get access to a richer kind language. -<option>-XPolyKinds</option> enables kind polymorphism, while -<option>-XDataKinds</option> enables user defined kinds through datatype -promotion. With <option>-XDataKinds</option>, the example above can then be -rewritten to: -<programlisting> -data Nat = Ze | Su Nat - -data Vec :: * -> Nat -> * where - Nil :: Vec a Ze - Cons :: a -> Vec a n -> Vec a (Su n) -</programlisting> -With the improved kind of <literal>Vec</literal>, things like -<literal>Vec Int Char</literal> are now ill-kinded, and GHC will report an -error. -</para> +<sect1 id="kind-polymorphism"> +<title>Kind polymorphism</title> <para> -In this section we show a few examples of how to make use of the new kind -system. This extension is described in more detail in the paper +This section describes <emphasis>kind polymorphism</emphasis>, and extension +enabled by <option>-XPolyKinds</option>. +It is described in more detail in the paper <ulink url="http://dreixel.net/research/pdf/ghp.pdf">Giving Haskell a Promotion</ulink>, which appeared at TLDI 2012. </para> -<sect2 id="kind-polymorphism"> -<title>Kind polymorphism</title> +<sect2> <title>Overview of kind polymorphism</title> + <para> Currently there is a lot of code duplication in the way Typeable is implemented (<xref linkend="deriving-typeable"/>): @@ -5258,34 +5296,148 @@ Note that the datatype <literal>Proxy</literal> has kind <literal>Typeable</literal> class has kind <literal>forall k. k -> Constraint</literal>. </para> +</sect2> + +<sect2> <title>Overview</title> <para> -There are some restrictions in the current implementation: +Generally speaking, with <option>-XPolyKinds</option>, GHC will infer a polymorphic +kind for un-decorated whenever possible. For example: +<programlisting> +data T m a = MkT (m a) +-- GHC infers kind T :: forall k. (k -> *) -> k -> * +</programlisting> +Just as in the world of terms, you can restrict polymorphism using a signature +(<option>-XPolyKinds</option> implies <option>-XKindSignatures</option>): +<programlisting> +data T m (a :: *) = MkT (m a) +-- GHC now infers kind T :: (* -> *) -> * -> * +</programlisting> +There is no "forall" for kind variables. Instead, you can simply mention a kind +variable in a kind signature, thus: +<programlisting> +data T (m :: k -> *) a = MkT (m a) +-- GHC now infers kind T :: forall k. (k -> *) -> k -> * +</programlisting> +</para> +</sect2> + +<sect2> <title>Polymorphic kind recursion and complete kind signatures</title> + +<para> +Just as in type inference, kind inference for recursive types can only use <emphasis>monomorphic</emphasis> recursion. +Consider this (contrived) example: +<programlisting> +data T m a = MkT (m a) (T Maybe (m a)) +-- GHC infers kind T :: (* -> *) -> * -> * +</programlisting> +The recursive use of <literal>T</literal> forced the second argument to have kind <literal>*</literal>. +However, just as in type inference, you can achieve polymorphic recursion by giving a +<emphasis>complete kind signature</emphasis> for <literal>T</literal>. The way to give +a complete kind signature for a data type is to use a GADT-style declaration with an +explicit kind signature thus: +<programlisting> +data T :: (k -> *) -> k -> * where + MkT :: m a -> T Maybe (m a) -> T m a +</programlisting> +The complete user-supplied kind signature specifies the polymorphic kind for <literal>T</literal>, +and this signature is used for all the calls to <literal>T</literal> including the recursive ones. +In particular, the recursive use of <literal>T</literal> is at kind <literal>*</literal>. +</para> + +<para> +What exactly is considered to be a "complete user-supplied kind signature" for a type constructor? +These are the forms: <itemizedlist> - <listitem><para>You cannot (yet) explicitly abstract over kinds, or mention - kind variables. So the following are all rejected: +<listitem><para> +A GADT-style data type declaration, with an explicit "<literal>::</literal>" in the header. +For example: <programlisting> -data D1 (t :: k) +data T1 :: (k -> *) -> k -> * where ... -- Yes T1 :: forall k. (k->*) -> k -> * +data T2 (a :: k -> *) :: k -> * where ... -- Yes T2 :: forall k. (k->*) -> k -> * +data T3 (a :: k -> *) (b :: k) :: * where ... -- Yes T3 :: forall k. (k->*) -> k -> * +data T4 a (b :: k) :: * where ... -- YES T4 :: forall k. * -> k -> * -data D2 :: k -> * +data T5 a b where ... -- NO kind is inferred +data T4 (a :: k -> *) (b :: k) where ... -- NO kind is inferred +</programlisting> +It makes no difference where you put the "<literal>::</literal>" but it must be there. +You cannot give a complete kind signature using a Haskell-98-style data type declaration; +you must use GADT syntax. +</para></listitem> -data D3 (k :: BOX) -</programlisting></para> - </listitem> - <listitem><para>The return kind of a type family is always defaulted to - <literal>*</literal>. So the following is rejected: +<listitem><para> +A type or data family declaration <emphasis>always</emphasis> have a +complete user-specified kind signature; no "<literal>::</literal>" is required: <programlisting> -type family F a -type instance F Int = Maybe -</programlisting></para> - </listitem> +data family D1 a -- D1 :: * -> * +data family D2 (a :: k) -- D2 :: forall k. k -> * +data family D3 (a :: k) :: * -- D3 :: forall k. k -> * +type family S1 a :: k -> * -- S1 :: forall k. * -> k -> * +</programlisting> +</para></listitem> </itemizedlist> +In a complete user-specified kind signature, any un-decorated type variable to the +left of the "<literal>::</literal>" is considered to have kind "<literal>*</literal>". +If you want kind polymorphism, specify a kind variable. </para> </sect2> +</sect1> -<sect2 id="promotion"> +<sect1 id="promotion"> <title>Datatype promotion</title> + +<para> +This section describes <emphasis>data type promotion</emphasis>, an extension +to the kind system that complements kind polymorphism. It is enabled by <option>-XDataKinds</option>, +and described in more detail in the paper +<ulink url="http://dreixel.net/research/pdf/ghp.pdf">Giving Haskell a +Promotion</ulink>, which appeared at TLDI 2012. +</para> + +<sect2> <title>Motivation</title> + +<para> +Standard Haskell has a rich type language. Types classify terms and serve to +avoid many common programming mistakes. The kind language, however, is +relatively simple, distinguishing only lifted types (kind <literal>*</literal>), +type constructors (eg. kind <literal>* -> * -> *</literal>), and unlifted +types (<xref linkend="glasgow-unboxed"/>). In particular when using advanced +type system features, such as type families (<xref linkend="type-families"/>) +or GADTs (<xref linkend="gadt"/>), this simple kind system is insufficient, +and fails to prevent simple errors. Consider the example of type-level natural +numbers, and length-indexed vectors: +<programlisting> +data Ze +data Su n + +data Vec :: * -> * -> * where + Nil :: Vec a Ze + Cons :: a -> Vec a n -> Vec a (Su n) +</programlisting> +The kind of <literal>Vec</literal> is <literal>* -> * -> *</literal>. This means +that eg. <literal>Vec Int Char</literal> is a well-kinded type, even though this +is not what we intend when defining length-indexed vectors. +</para> + +<para> +With <option>-XDataKinds</option>, the example above can then be +rewritten to: +<programlisting> +data Nat = Ze | Su Nat + +data Vec :: * -> Nat -> * where + Nil :: Vec a Ze + Cons :: a -> Vec a n -> Vec a (Su n) +</programlisting> +With the improved kind of <literal>Vec</literal>, things like +<literal>Vec Int Char</literal> are now ill-kinded, and GHC will report an +error. +</para> +</sect2> + +<sect2><title>Overview</title> <para> With <option>-XDataKinds</option>, GHC automatically promotes every suitable datatype to be a kind, and its (value) constructors to be type constructors. @@ -5333,10 +5485,13 @@ The following restrictions apply to promotion: <listitem><para>We do not promote datatypes whose constructors are kind polymorphic, involve constraints, or use existential quantification. </para></listitem> + <listitem><para>We do not promote data family instances (<xref linkend="data-families"/>). + </para></listitem> </itemizedlist> </para> +</sect2> -<sect3 id="promotion-syntax"> +<sect2 id="promotion-syntax"> <title>Distinguishing between types and constructors</title> <para> Since constructors and types share the same namespace, with promotion you can @@ -5356,11 +5511,14 @@ type T1 = P -- 1 type T2 = 'P -- promoted 2 </programlisting> Note that promoted datatypes give rise to named kinds. Since these can never be -ambiguous, we do not allow quotes in kind names. +ambiguous, we do not allow quotes in kind names. </para> -</sect3> +<para>Just as in the case of Template Haskell (<xref linkend="th-syntax"/>), there is +no way to quote a data constructor or type constructor whose second character +is a single quote.</para> +</sect2> -<sect3 id="promoted-lists-and-tuples"> +<sect2 id="promoted-lists-and-tuples"> <title>Promoted lists and tuples types</title> <para> Haskell's list and tuple types are natively promoted to kinds, and enjoy the @@ -5375,23 +5533,55 @@ data Tuple :: (*,*) -> * where </programlisting> Note that this requires <option>-XTypeOperators</option>. </para> -</sect3> - </sect2> -<sect2 id="kind-polymorphism-limitations"> -<title>Shortcomings of the current implementation</title> +<sect2 id="promoted-literals"> +<title>Promoted Literals</title> +<para> +Numeric and string literals are prmoted to the type level, giving convenient +access to a large number of predefined type-level constants. Numeric literals +are of kind <literal>Nat</literal>, while string literals are of kind +<literal>Symbol</literal>. These kinds are defined in the module +<literal>GHC.TypeLits</literal>. +</para> + +<para> +Here is an exampe of using type-level numeric literals to provide a safe +interface to a low-level function: +<programlisting> +import GHC.TypeLits +import Data.Word +import Foreign + +newtype ArrPtr (n :: Nat) a = ArrPtr (Ptr a) + +clearPage :: ArrPtr 4096 Word8 -> IO () +clearPage (ArrPtr p) = ... +</programlisting> +</para> + <para> -For the release on GHC 7.4 we focused on getting the new kind-polymorphic core -to work with all existing programs (which do not make use of kind polymorphism). -Many things already work properly with <option>-XPolyKinds</option>, but we -expect that some things will not work. If you run into trouble, please -<link linkend="bug-reporting">report a bug</link>! +Here is an example of using type-level string literals to simulate +simple record operations: +<programlisting> +data Label (l :: Symbol) = Get + +class Has a l b | a l -> b where + from :: a -> Label l -> b + +data Point = Point Int Int deriving Show + +instance Has Point "x" Int where from (Point x _) _ = x +instance Has Point "y" Int where from (Point _ y) _ = y + +example = from (Point 1 2) (Get :: Label "x") +</programlisting> </para> </sect2> </sect1> + <sect1 id="equality-constraints"> <title>Equality constraints</title> <para> @@ -6871,7 +7061,7 @@ understand Template Haskell; see the <ulink url="http://haskell.org/haskellwiki/ Wiki page</ulink>. </para> - <sect2> + <sect2 id="th-syntax"> <title>Syntax</title> <para> Template Haskell has the following new syntactic @@ -6931,7 +7121,19 @@ Wiki page</ulink>. <itemizedlist> <listitem><para> <literal>'f</literal> has type <literal>Name</literal>, and names the function <literal>f</literal>. Similarly <literal>'C</literal> has type <literal>Name</literal> and names the data constructor <literal>C</literal>. - In general <literal>'</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in an expression context. + In general <literal>'</literal><replaceable>thing</replaceable> + interprets <replaceable>thing</replaceable> in an expression context.</para> + <para>A name whose second character is a single + quote (sadly) cannot be quoted in this way, + because it will be parsed instead as a quoted + character. For example, if the function is called + <literal>f'7</literal> (which is a legal Haskell + identifier), an attempt to quote it as + <literal>'f'7</literal> would be parsed as the + character literal <literal>'f'</literal> followed + by the numeric literal <literal>7</literal>. There + is no current escape mechanism in this (unusual) + situation. </para></listitem> <listitem><para> <literal>''T</literal> has type <literal>Name</literal>, and names the type constructor <literal>T</literal>. That is, <literal>''</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in a type context. @@ -9239,7 +9441,7 @@ The following are good producers: <listitem> <para> - Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>). + Enumerations of <literal>Int</literal>, <literal>Integer</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>). </para> </listitem> <listitem> @@ -9683,8 +9885,9 @@ data (:*:) f g p = f p :*: g p </para> <para> -The <literal>Generic</literal> class mediates between user-defined datatypes -and their internal representation as a sum-of-products: +The <literal>Generic</literal> and <literal>Generic1</literal> classes mediate +between user-defined datatypes and their internal representation as a +sum-of-products: <programlisting> class Generic a where @@ -9694,9 +9897,17 @@ class Generic a where from :: a -> (Rep a) x -- Convert from the representation to the datatype to :: (Rep a) x -> a + +class Generic1 f where + type Rep1 f :: * -> * + + from1 :: f a -> Rep1 f a + to1 :: Rep1 f a -> f a </programlisting> -Instances of this class can be derived by GHC with the +<literal>Generic1</literal> is used for functions that can only be defined over +type containers, such as <literal>map</literal>. +Instances of these classes can be derived by GHC with the <option>-XDeriveGeneric</option> (<xref linkend="deriving-typeable"/>), and are necessary to be able to define generic instances automatically. </para> @@ -9711,7 +9922,7 @@ instance Generic (UserTree a) where type Rep (UserTree a) = M1 D D1UserTree ( M1 C C1_0UserTree ( - M1 S NoSelector (K1 P a) + M1 S NoSelector (K1 R a) :*: M1 S NoSelector (K1 R (UserTree a)) :*: M1 S NoSelector (K1 R (UserTree a))) :+: M1 C C1_1UserTree U1) @@ -9806,17 +10017,20 @@ instance (Serialize a) => Serialize (UserTree a) The default method for <literal>put</literal> is then used, corresponding to the generic implementation of serialization. + +For more examples of generic functions please refer to the +<ulink url="http://hackage.haskell.org/package/generic-deriving">generic-deriving</ulink> +package on Hackage. </para> </sect2> - <sect2> <title>More information</title> <para> -For more detail please refer to the -<ulink url="http://www.haskell.org/haskellwiki/Generics">HaskellWiki page</ulink> -or the original paper: +For more details please refer to the +<ulink url="http://www.haskell.org/haskellwiki/GHC.Generics">HaskellWiki +page</ulink> or the original paper: </para> <itemizedlist> @@ -9831,12 +10045,6 @@ Jose Pedro Magalhaes, Atze Dijkstra, Johan Jeuring, and Andres Loeh. </listitem> </itemizedlist> -<emphasis>Note</emphasis>: the current support for generic programming in GHC -is preliminary. In particular, we only allow deriving instances for the -<literal>Generic</literal> class. Support for deriving -<literal>Generic1</literal> (and thus enabling generic functions of kind -<literal>* -> *</literal> such as <literal>fmap</literal>) will come at a -later stage. </sect2> </sect1> |