Diffstat (limited to 'docs/users_guide/glasgow_exts.xml')
-rw-r--r-- | docs/users_guide/glasgow_exts.xml | 13867 |
1 files changed, 0 insertions, 13867 deletions
diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml deleted file mode 100644 index 7554c4d064..0000000000 --- a/docs/users_guide/glasgow_exts.xml +++ /dev/null @@ -1,13867 +0,0 @@ -<?xml version="1.0" encoding="iso-8859-1"?> -<para> -<indexterm><primary>language, GHC</primary></indexterm> -<indexterm><primary>extensions, GHC</primary></indexterm> -As with all known Haskell systems, GHC implements some extensions to -the language. They can all be enabled or disabled by command line flags -or language pragmas. By default GHC understands the most recent Haskell -version it supports, plus a handful of extensions. -</para> - -<para> -Some of the Glasgow extensions serve to give you access to the -underlying facilities with which we implement Haskell. Thus, you can -get at the Raw Iron, if you are willing to write some non-portable -code at a more primitive level. You need not be “stuck” -on performance because of the implementation costs of Haskell's -“high-level” features—you can always code -“under” them. In an extreme case, you can write all your -time-critical code in C, and then just glue it together with Haskell! -</para> - -<para> -Before you get too carried away working at the lowest level (e.g., -sloshing <literal>MutableByteArray#</literal>s around your -program), you may wish to check if there are libraries that provide a -“Haskellised veneer” over the features you want. The -separate <ulink url="../libraries/index.html">libraries -documentation</ulink> describes all the libraries that come with GHC. 
-</para> - -<!-- LANGUAGE OPTIONS --> - <sect1 id="options-language"> - <title>Language options</title> - - <indexterm><primary>language</primary><secondary>option</secondary> - </indexterm> - <indexterm><primary>options</primary><secondary>language</secondary> - </indexterm> - <indexterm><primary>extensions</primary><secondary>options controlling</secondary> - </indexterm> - - <para>The language option flags control which variations of the language are - permitted.</para> - - <para>Language options can be controlled in two ways: - <itemizedlist> - <listitem><para>Every language option can be switched on by a command-line flag "<option>-X...</option>" - (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>" - (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem> - <listitem><para> - Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma, - thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para> - </listitem> - </itemizedlist></para> - - <para>The flag <option>-fglasgow-exts</option> - <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm> - is equivalent to enabling the following extensions: - &what_glasgow_exts_does; - Enabling these options is the <emphasis>only</emphasis> - effect of <option>-fglasgow-exts</option>. - We are trying to move away from this portmanteau flag, - and towards enabling features individually.</para> - - </sect1> - -<!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS --> -<sect1 id="primitives"> - <title>Unboxed types and primitive operations</title> - -<para>GHC is built on a raft of primitive data types and operations; -"primitive" in the sense that they cannot be defined in Haskell itself. -While you really can use this stuff to write fast code, -we generally find it a lot less painful, and more satisfying in the -long run, to use higher-level language features and libraries. 
With -any luck, the code you write will be optimised to the efficient -unboxed version in any case. And if it isn't, we'd like to know -about it.</para> - -<para>All these primitive data types and operations are exported by the -library <literal>GHC.Prim</literal>, for which there is -<ulink url="&libraryGhcPrimLocation;/GHC-Prim.html">detailed online documentation</ulink>. -(This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.) -</para> - -<para> -If you want to mention any of the primitive data types or operations in your -program, you must first import <literal>GHC.Prim</literal> to bring them -into scope. Many of them have names ending in "#", and to mention such -names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>). -</para> - -<para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link> -and <link linkend="unboxed-tuples">unboxed tuples</link>, which -we briefly summarise here. </para> - -<sect2 id="glasgow-unboxed"> -<title>Unboxed types</title> - -<para> -<indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm> -</para> - -<para>Most types in GHC are <firstterm>boxed</firstterm>, which means -that values of that type are represented by a pointer to a heap -object. The representation of a Haskell <literal>Int</literal>, for -example, is a two-word heap object. An <firstterm>unboxed</firstterm> -type, however, is represented by the value itself, no pointers or heap -allocation are involved. -</para> - -<para> -Unboxed types correspond to the “raw machine” types you -would use in C: <literal>Int#</literal> (long int), -<literal>Double#</literal> (double), <literal>Addr#</literal> -(void *), etc. 
The <emphasis>primitive operations</emphasis> -(PrimOps) on these types are what you might expect; e.g., -<literal>(+#)</literal> is addition on -<literal>Int#</literal>s, and is the machine-addition that we all -know and love—usually one instruction. -</para> - -<para> -Primitive (unboxed) types cannot be defined in Haskell, and are -therefore built into the language and compiler. Primitive types are -always unlifted; that is, a value of a primitive type cannot be -bottom. We use the convention (but it is only a convention) -that primitive types, values, and -operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>). -For some primitive types we have special syntax for literals, also -described in the <link linkend="magic-hash">same section</link>. -</para> - -<para> -Primitive values are often represented by a simple bit-pattern, such -as <literal>Int#</literal>, <literal>Float#</literal>, -<literal>Double#</literal>. But this is not necessarily the case: -a primitive value might be represented by a pointer to a -heap-allocated object. Examples include -<literal>Array#</literal>, the type of primitive arrays. A -primitive array is heap-allocated because it is too big a value to fit -in a register, and would be too expensive to copy around; in a sense, -it is accidental that it is represented by a pointer. If a pointer -represents a primitive value, then it really does point to that value: -no unevaluated thunks, no indirections…nothing can be at the -other end of the pointer than the primitive value. -A numerically-intensive program using unboxed types can -go a <emphasis>lot</emphasis> faster than its “standard” -counterpart—we saw a threefold speedup on one example. -</para> - -<para> -There are some restrictions on the use of primitive types: -<itemizedlist> -<listitem><para>The main restriction -is that you can't pass a primitive value to a polymorphic -function or store one in a polymorphic data type. 
This rules out -things like <literal>[Int#]</literal> (i.e. lists of primitive -integers). The reason for this restriction is that polymorphic -arguments and constructor fields are assumed to be pointers: if an -unboxed integer is stored in one of these, the garbage collector would -attempt to follow it, leading to unpredictable space leaks. Or a -<function>seq</function> operation on the polymorphic component may -attempt to dereference the pointer, with disastrous results. Even -worse, the unboxed value might be larger than a pointer -(<literal>Double#</literal> for instance). -</para> -</listitem> -<listitem><para> You cannot define a newtype whose representation type -(the argument type of the data constructor) is an unboxed type. Thus, -this is illegal: -<programlisting> - newtype A = MkA Int# -</programlisting> -</para></listitem> -<listitem><para> You cannot bind a variable with an unboxed type -in a <emphasis>top-level</emphasis> binding. -</para></listitem> -<listitem><para> You cannot bind a variable with an unboxed type -in a <emphasis>recursive</emphasis> binding. -</para></listitem> -<listitem><para> You may bind unboxed variables in a (non-recursive, -non-top-level) pattern binding, but you must make any such pattern-match -strict. For example, rather than: -<programlisting> - data Foo = Foo Int Int# - - f x = let (Foo a b, w) = ..rhs.. in ..body.. -</programlisting> -you must write: -<programlisting> - data Foo = Foo Int Int# - - f x = let !(Foo a b, w) = ..rhs.. in ..body.. -</programlisting> -since <literal>b</literal> has type <literal>Int#</literal>. -</para> -</listitem> -</itemizedlist> -</para> - -</sect2> - -<sect2 id="unboxed-tuples"> -<title>Unboxed tuples</title> - -<para> -Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>; -they are a syntactic extension enabled by the language flag <option>-XUnboxedTuples</option>. 
An -unboxed tuple looks like this: -</para> - -<para> - -<programlisting> -(# e_1, ..., e_n #) -</programlisting> - -</para> - -<para> -where <literal>e_1..e_n</literal> are expressions of any -type (primitive or non-primitive). The type of an unboxed tuple looks -the same. -</para> - -<para> -Note that when unboxed tuples are enabled, -<literal>(#</literal> is a single lexeme, so for example when using -operators like <literal>#</literal> and <literal>#-</literal> you need -to write <literal>( # )</literal> and <literal>( #- )</literal> rather than -<literal>(#)</literal> and <literal>(#-)</literal>. -</para> - -<para> -Unboxed tuples are used for functions that need to return multiple -values, but they avoid the heap allocation normally associated with -using fully-fledged tuples. When an unboxed tuple is returned, the -components are put directly into registers or on the stack; the -unboxed tuple itself does not have a composite representation. Many -of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed -tuples. -In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed -tuples to avoid unnecessary allocation during sequences of operations. -</para> - -<para> -There are some restrictions on the use of unboxed tuples: -<itemizedlist> - -<listitem> -<para> -Values of unboxed tuple types are subject to the same restrictions as -other unboxed types; i.e. they may not be stored in polymorphic data -structures or passed to polymorphic functions. -</para> -</listitem> - -<listitem> -<para> -The typical use of unboxed tuples is simply to return multiple values, -binding those multiple results with a <literal>case</literal> expression, thus: -<programlisting> - f x y = (# x+1, y-1 #) - g x = case f x x of { (# a, b #) -> a + b } -</programlisting> -You can have an unboxed tuple in a pattern binding, thus -<programlisting> - f x = let (# p,q #) = h x in ..body.. 
-</programlisting> -If the types of <literal>p</literal> and <literal>q</literal> are not unboxed, -the resulting binding is lazy like any other Haskell pattern binding. The -above example desugars like this: -<programlisting> - f x = let t = case h x of { (# p,q #) -> (p,q) } - p = fst t - q = snd t - in ..body.. -</programlisting> -Indeed, the bindings can even be recursive. -</para> -</listitem> -</itemizedlist> - -</para> - -</sect2> -</sect1> - - -<!-- ====================== SYNTACTIC EXTENSIONS ======================= --> - -<sect1 id="syntax-extns"> -<title>Syntactic extensions</title> - - <sect2 id="unicode-syntax"> - <title>Unicode syntax</title> - <para>The language - extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm> - enables Unicode characters to be used to stand for certain ASCII - character sequences. The following alternatives are provided:</para> - - <informaltable> - <tgroup cols="2" align="left" colsep="1" rowsep="1"> - <thead> - <row> - <entry>ASCII</entry> - <entry>Unicode alternative</entry> - <entry>Code point</entry> - <entry>Name</entry> - </row> - </thead> - -<!-- - to find the DocBook entities for these characters, find - the Unicode code point (e.g. 0x2237), and grep for it in - /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on - your system. Some of these Unicode code points don't have - equivalent DocBook entities. 
- --> - - <tbody> - <row> - <entry><literal>::</literal></entry> - <entry>∷</entry> - <entry>0x2237</entry> - <entry>PROPORTION</entry> - </row> - </tbody> - <tbody> - <row> - <entry><literal>=></literal></entry> - <entry>⇒</entry> - <entry>0x21D2</entry> - <entry>RIGHTWARDS DOUBLE ARROW</entry> - </row> - </tbody> - <tbody> - <row> - <entry><literal>forall</literal></entry> - <entry>∀</entry> - <entry>0x2200</entry> - <entry>FOR ALL</entry> - </row> - </tbody> - <tbody> - <row> - <entry><literal>-></literal></entry> - <entry>→</entry> - <entry>0x2192</entry> - <entry>RIGHTWARDS ARROW</entry> - </row> - </tbody> - <tbody> - <row> - <entry><literal><-</literal></entry> - <entry>←</entry> - <entry>0x2190</entry> - <entry>LEFTWARDS ARROW</entry> - </row> - </tbody> - - <tbody> - <row> - <entry>-<</entry> - <entry>⤙</entry> - <entry>0x2919</entry> - <entry>LEFTWARDS ARROW-TAIL</entry> - </row> - </tbody> - - <tbody> - <row> - <entry>>-</entry> - <entry>⤚</entry> - <entry>0x291A</entry> - <entry>RIGHTWARDS ARROW-TAIL</entry> - </row> - </tbody> - - <tbody> - <row> - <entry>-<<</entry> - <entry>⤛</entry> - <entry>0x291B</entry> - <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry> - </row> - </tbody> - - <tbody> - <row> - <entry>>>-</entry> - <entry>⤜</entry> - <entry>0x291C</entry> - <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry> - </row> - </tbody> - - <tbody> - <row> - <entry>*</entry> - <entry>★</entry> - <entry>0x2605</entry> - <entry>BLACK STAR</entry> - </row> - </tbody> - - </tgroup> - </informaltable> - </sect2> - - <sect2 id="magic-hash"> - <title>The magic hash</title> - <para>The language extension <option>-XMagicHash</option> allows "#" as a - postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is - a valid type constructor or data constructor.</para> - - <para>The hash sign does not change semantics at all. We tend to use variable - names ending in "#" for unboxed values or types (e.g. 
<literal>Int#</literal>), - but there is no requirement to do so; they are just plain ordinary variables. - Nor does the <option>-XMagicHash</option> extension bring anything into scope. - For example, to bring <literal>Int#</literal> into scope you must - import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>); - the <option>-XMagicHash</option> extension - then allows you to <emphasis>refer</emphasis> to the <literal>Int#</literal> - that is now in scope. Note that with this option, the meaning of <literal>x#y = 0</literal> - is changed: it defines a function <literal>x#</literal> taking a single argument <literal>y</literal>; - to define the operator <literal>#</literal>, put a space: <literal>x # y = 0</literal>. - -</para> - <para> The <option>-XMagicHash</option> also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>): - <itemizedlist> - <listitem><para> <literal>'x'#</literal> has type <literal>Char#</literal></para> </listitem> - <listitem><para> <literal>"foo"#</literal> has type <literal>Addr#</literal></para> </listitem> - <listitem><para> <literal>3#</literal> has type <literal>Int#</literal>. In general, - any Haskell integer lexeme followed by a <literal>#</literal> is an <literal>Int#</literal> literal, e.g. - <literal>-0x3A#</literal> as well as <literal>32#</literal>.</para></listitem> - <listitem><para> <literal>3##</literal> has type <literal>Word#</literal>. In general, - any non-negative Haskell integer lexeme followed by <literal>##</literal> - is a <literal>Word#</literal>. 
</para> </listitem> - <listitem><para> <literal>3.2#</literal> has type <literal>Float#</literal>.</para> </listitem> - <listitem><para> <literal>3.2##</literal> has type <literal>Double#</literal></para> </listitem> - </itemizedlist> - </para> - </sect2> - - <sect2 id="negative-literals"> - <title>Negative literals</title> - <para> - The literal <literal>-123</literal> is, according to - Haskell 98 and Haskell 2010, desugared as - <literal>negate (fromInteger 123)</literal>. - The language extension <option>-XNegativeLiterals</option> - means that it is instead desugared as - <literal>fromInteger (-123)</literal>. - </para> - - <para> - This can make a difference when the positive and negative ranges of - a numeric data type don't match up. For example, - in 8-bit arithmetic -128 is representable, but +128 is not. - So <literal>negate (fromInteger 128)</literal> will elicit an - unexpected integer-literal-overflow message. - </para> - </sect2> - - <sect2 id="num-decimals"> - <title>Fractional looking integer literals</title> - <para> - Haskell 2010 and Haskell 98 define floating literals with - the syntax <literal>1.2e6</literal>. These literals have the - type <literal>Fractional a => a</literal>. - </para> - - <para> - The language extension <option>-XNumDecimals</option> allows - you to also use the floating literal syntax for instances of - <literal>Integral</literal>, and have values like - <literal>(1.2e6 :: Num a => a)</literal>. - </para> - </sect2> - - <sect2 id="binary-literals"> - <title>Binary integer literals</title> - <para> - Haskell 2010 and Haskell 98 allow integer literals to - be given in decimal, octal (prefixed by - <literal>0o</literal> or <literal>0O</literal>), or - hexadecimal notation (prefixed by <literal>0x</literal> or - <literal>0X</literal>). 
- </para> - - <para> - The language extension <option>-XBinaryLiterals</option> - adds support for expressing integer literals in binary - notation with the prefix <literal>0b</literal> or - <literal>0B</literal>. For instance, the binary integer - literal <literal>0b11001001</literal> will be desugared into - <literal>fromInteger 201</literal> when - <option>-XBinaryLiterals</option> is enabled. - </para> - </sect2> - - <!-- ====================== HIERARCHICAL MODULES ======================= --> - - - <sect2 id="hierarchical-modules"> - <title>Hierarchical Modules</title> - - <para>GHC supports a small extension to the syntax of module - names: a module name is allowed to contain a dot - <literal>‘.’</literal>. This is also known as the - “hierarchical module namespace” extension, because - it extends the normally flat Haskell module namespace into a - more flexible hierarchy of modules.</para> - - <para>This extension has very little impact on the language - itself; module names are <emphasis>always</emphasis> fully - qualified, so you can just think of the fully qualified module - name as <quote>the module name</quote>. In particular, this - means that the full module name must be given after the - <literal>module</literal> keyword at the beginning of the - module; for example, the module <literal>A.B.C</literal> must - begin</para> - -<programlisting>module A.B.C</programlisting> - - - <para>It is a common strategy to use the <literal>as</literal> - keyword to save some typing when using qualified names with - hierarchical modules. 
For example:</para> - -<programlisting> -import qualified Control.Monad.ST.Strict as ST -</programlisting> - - <para>For details on how GHC searches for source and interface - files in the presence of hierarchical modules, see <xref - linkend="search-path"/>.</para> - - <para>GHC comes with a large collection of libraries arranged - hierarchically; see the accompanying <ulink - url="../libraries/index.html">library - documentation</ulink>. More libraries to install are available - from <ulink - url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para> - </sect2> - - <!-- ====================== PATTERN GUARDS ======================= --> - -<sect2 id="pattern-guards"> -<title>Pattern guards</title> - -<para> -<indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm> -The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.) -</para> - -<para> -Suppose we have an abstract data type of finite maps, with a -lookup operation: - -<programlisting> -lookup :: FiniteMap -> Int -> Maybe Int -</programlisting> - -The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise, -where <varname>v</varname> is the value that the key maps to. 
Now consider the following definition: -</para> - -<programlisting> -clunky env var1 var2 | ok1 && ok2 = val1 + val2 -| otherwise = var1 + var2 -where - m1 = lookup env var1 - m2 = lookup env var2 - ok1 = maybeToBool m1 - ok2 = maybeToBool m2 - val1 = expectJust m1 - val2 = expectJust m2 -</programlisting> - -<para> -The auxiliary functions are -</para> - -<programlisting> -maybeToBool :: Maybe a -> Bool -maybeToBool (Just x) = True -maybeToBool Nothing = False - -expectJust :: Maybe a -> a -expectJust (Just x) = x -expectJust Nothing = error "Unexpected Nothing" -</programlisting> - -<para> -What is <function>clunky</function> doing? The guard <literal>ok1 && -ok2</literal> checks that both lookups succeed, using -<function>maybeToBool</function> to convert the <function>Maybe</function> -types to booleans. The (lazily evaluated) <function>expectJust</function> -calls extract the values from the results of the lookups, and binds the -returned values to <varname>val1</varname> and <varname>val2</varname> -respectively. If either lookup fails, then clunky takes the -<literal>otherwise</literal> case and returns the sum of its arguments. -</para> - -<para> -This is certainly legal Haskell, but it is a tremendously verbose and -un-obvious way to achieve the desired effect. Arguably, a more direct way -to write clunky would be to use case expressions: -</para> - -<programlisting> -clunky env var1 var2 = case lookup env var1 of - Nothing -> fail - Just val1 -> case lookup env var2 of - Nothing -> fail - Just val2 -> val1 + val2 -where - fail = var1 + var2 -</programlisting> - -<para> -This is a bit shorter, but hardly better. Of course, we can rewrite any set -of pattern-matching, guarded equations as case expressions; that is -precisely what the compiler does when compiling equations! The reason that -Haskell provides guarded equations is because they allow us to write down -the cases we want to consider, one at a time, independently of each other. 
-This structure is hidden in the case version. Two of the right-hand sides -are really the same (<function>fail</function>), and the whole expression -tends to become more and more indented. -</para> - -<para> -Here is how I would write clunky: -</para> - -<programlisting> -clunky env var1 var2 - | Just val1 <- lookup env var1 - , Just val2 <- lookup env var2 - = val1 + val2 -...other equations for clunky... -</programlisting> - -<para> -The semantics should be clear enough. The qualifiers are matched in order. -For a <literal><-</literal> qualifier, which I call a pattern guard, the -right hand side is evaluated and matched against the pattern on the left. -If the match fails then the whole guard fails and the next equation is -tried. If it succeeds, then the appropriate binding takes place, and the -next qualifier is matched, in the augmented environment. Unlike list -comprehensions, however, the type of the expression to the right of the -<literal><-</literal> is the same as the type of the pattern to its -left. The bindings introduced by pattern guards scope over all the -remaining guard qualifiers, and over the right hand side of the equation. -</para> - -<para> -Just as with list comprehensions, boolean expressions can be freely mixed -among the pattern guards. For example: -</para> - -<programlisting> -f x | [y] <- x - , y > 3 - , Just z <- h y - = ... -</programlisting> - -<para> -Haskell's current guards therefore emerge as a special case, in which the -qualifier list has just one element, a boolean expression. -</para> -</sect2> - - <!-- ===================== View patterns =================== --> - -<sect2 id="view-patterns"> -<title>View patterns -</title> - -<para> -View patterns are enabled by the flag <literal>-XViewPatterns</literal>. -More information and examples of view patterns can be found on the -<ulink url="http://ghc.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki -page</ulink>. 
-</para> - -<para> -View patterns are somewhat like pattern guards that can be nested inside -of other patterns. They are a convenient way of pattern-matching -against values of abstract types. For example, in a programming language -implementation, we might represent the syntax of the types of the -language as follows: - -<programlisting> -type Typ - -data TypView = Unit - | Arrow Typ Typ - -view :: Typ -> TypView - --- additional operations for constructing Typ's ... -</programlisting> - -The representation of Typ is held abstract, permitting implementations -to use a fancy representation (e.g., hash-consing to manage sharing). - -Without view patterns, using this signature is a little inconvenient: -<programlisting> -size :: Typ -> Integer -size t = case view t of - Unit -> 1 - Arrow t1 t2 -> size t1 + size t2 -</programlisting> - -It is necessary to iterate the case, rather than using an equational -function definition. And the situation is even worse when the matching -against <literal>t</literal> is buried deep inside another pattern. -</para> - -<para> -View patterns permit calling the view function inside the pattern and -matching against the result: -<programlisting> -size (view -> Unit) = 1 -size (view -> Arrow t1 t2) = size t1 + size t2 -</programlisting> - -That is, we add a new form of pattern, written -<replaceable>expression</replaceable> <literal>-></literal> -<replaceable>pattern</replaceable> that means "apply the expression to -whatever we're trying to match against, and then match the result of -that application against the pattern". The expression can be any Haskell -expression of function type, and view patterns can be used wherever -patterns are used. 
-</para> - -<para> -The semantics of a pattern <literal>(</literal> -<replaceable>exp</replaceable> <literal>-></literal> -<replaceable>pat</replaceable> <literal>)</literal> are as follows: - -<itemizedlist> - -<listitem> Scoping: - -<para>The variables bound by the view pattern are the variables bound by -<replaceable>pat</replaceable>. -</para> - -<para> -Any variables in <replaceable>exp</replaceable> are bound occurrences, -but variables bound "to the left" in a pattern are in scope. This -feature permits, for example, one argument to a function to be used in -the view of another argument. For example, the function -<literal>clunky</literal> from <xref linkend="pattern-guards" /> can be -written using view patterns as follows: - -<programlisting> -clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2 -...other equations for clunky... -</programlisting> -</para> - -<para> -More precisely, the scoping rules are: -<itemizedlist> -<listitem> -<para> -In a single pattern, variables bound by patterns to the left of a view -pattern expression are in scope. For example: -<programlisting> -example :: Maybe ((String -> Integer,Integer), String) -> Bool -example (Just ((f,_), f -> 4)) = True -</programlisting> - -Additionally, in function definitions, variables bound by matching earlier curried -arguments may be used in view pattern expressions in later arguments: -<programlisting> -example :: (String -> Integer) -> String -> Bool -example f (f -> 4) = True -</programlisting> -That is, the scoping is the same as it would be if the curried arguments -were collected into a tuple. -</para> -</listitem> - -<listitem> -<para> -In mutually recursive bindings, such as <literal>let</literal>, -<literal>where</literal>, or the top level, view patterns in one -declaration may not mention variables bound by other declarations. That -is, each declaration must be self-contained. 
For example, the following -program is not allowed: -<programlisting> -let {(x -> y) = e1 ; - (y -> x) = e2 } in x -</programlisting> - -(For some amplification on this design choice see -<ulink url="http://ghc.haskell.org/trac/ghc/ticket/4061">Trac #4061</ulink>.) - -</para> -</listitem> -</itemizedlist> - -</para> -</listitem> - -<listitem><para> Typing: If <replaceable>exp</replaceable> has type -<replaceable>T1</replaceable> <literal>-></literal> -<replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches -a <replaceable>T2</replaceable>, then the whole view pattern matches a -<replaceable>T1</replaceable>. -</para></listitem> - -<listitem><para> Matching: To the equations in Section 3.17.3 of the -<ulink url="http://www.haskell.org/onlinereport/">Haskell 98 -Report</ulink>, add the following: -<programlisting> -case v of { (e -> p) -> e1 ; _ -> e2 } - = -case (e v) of { p -> e1 ; _ -> e2 } -</programlisting> -That is, to match a variable <replaceable>v</replaceable> against a pattern -<literal>(</literal> <replaceable>exp</replaceable> -<literal>-></literal> <replaceable>pat</replaceable> -<literal>)</literal>, evaluate <literal>(</literal> -<replaceable>exp</replaceable> <replaceable> v</replaceable> -<literal>)</literal> and match the result against -<replaceable>pat</replaceable>. -</para></listitem> - -<listitem><para> Efficiency: When the same view function is applied in -multiple branches of a function definition or a case expression (e.g., -in <literal>size</literal> above), GHC makes an attempt to collect these -applications into a single nested case expression, so that the view -function is only applied once. Pattern compilation in GHC follows the -matrix algorithm described in Chapter 4 of <ulink -url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The -Implementation of Functional Programming Languages</ulink>. 
When the -top rows of the first column of a matrix are all view patterns with the -"same" expression, these patterns are transformed into a single nested -case. This includes, for example, adjacent view patterns that line up -in a tuple, as in -<programlisting> -f ((view -> A, p1), p2) = e1 -f ((view -> B, p3), p4) = e2 -</programlisting> -</para> - -<para> The current notion of when two view pattern expressions are "the -same" is very restricted: it is not even full syntactic equality. -However, it does include variables, literals, applications, and tuples; -e.g., two instances of <literal>view ("hi", "there")</literal> will be -collected. However, the current implementation does not compare up to -alpha-equivalence, so two instances of <literal>(x, view x -> -y)</literal> will not be coalesced. -</para> - -</listitem> - -</itemizedlist> -</para> - -</sect2> - - <!-- ===================== Pattern synonyms =================== --> - -<sect2 id="pattern-synonyms"> -<title>Pattern synonyms -</title> - -<para> -Pattern synonyms are enabled by the flag -<literal>-XPatternSynonyms</literal>, which is required for defining -them, but <emphasis>not</emphasis> for using them. More information -and examples of pattern synonyms can be found on the <ulink -url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki -page</ulink>. -</para> - -<para> -Pattern synonyms enable giving names to parametrized pattern -schemes. They can also be thought of as abstract constructors that -don't have a bearing on data representation. For example, in a -programming language implementation, we might represent types of the -language as follows: -</para> - -<programlisting> -data Type = App String [Type] -</programlisting> - -<para> -Here are some examples of using said representation. 
-Consider a few types of the <literal>Type</literal> universe encoded -like this: -</para> - -<programlisting> - App "->" [t1, t2] -- t1 -> t2 - App "Int" [] -- Int - App "Maybe" [App "Int" []] -- Maybe Int -</programlisting> - -<para> -This representation is very generic in that no types are given special -treatment. However, some functions might need to handle some known -types specially, for example the following two functions collect all -argument types of (nested) arrow types, and recognize the -<literal>Int</literal> type, respectively: -</para> - -<programlisting> - collectArgs :: Type -> [Type] - collectArgs (App "->" [t1, t2]) = t1 : collectArgs t2 - collectArgs _ = [] - - isInt :: Type -> Bool - isInt (App "Int" []) = True - isInt _ = False -</programlisting> - -<para> -Matching on <literal>App</literal> directly is both hard to read and -error prone to write. And the situation is even worse when the -matching is nested: -</para> - -<programlisting> - isIntEndo :: Type -> Bool - isIntEndo (App "->" [App "Int" [], App "Int" []]) = True - isIntEndo _ = False -</programlisting> - -<para> -Pattern synonyms permit abstracting from the representation to expose -matchers that behave in a constructor-like manner with respect to -pattern matching. 
We can create pattern synonyms for the known types -we care about, without committing the representation to them (note -that these don't have to be defined in the same module as the -<literal>Type</literal> type): -</para> - -<programlisting> - pattern Arrow t1 t2 = App "->" [t1, t2] - pattern Int = App "Int" [] - pattern Maybe t = App "Maybe" [t] -</programlisting> - -<para> -Which enables us to rewrite our functions in a much cleaner style: -</para> - -<programlisting> - collectArgs :: Type -> [Type] - collectArgs (Arrow t1 t2) = t1 : collectArgs t2 - collectArgs _ = [] - - isInt :: Type -> Bool - isInt Int = True - isInt _ = False - - isIntEndo :: Type -> Bool - isIntEndo (Arrow Int Int) = True - isIntEndo _ = False -</programlisting> - -<para> - Note that in this example, the pattern synonyms - <literal>Int</literal> and <literal>Arrow</literal> can also be used - as expressions (they are <emphasis>bidirectional</emphasis>). This - is not necessarily the case: <emphasis>unidirectional</emphasis> - pattern synonyms can also be declared with the following syntax: -</para> - -<programlisting> - pattern Head x <- x:xs -</programlisting> - -<para> -In this case, <literal>Head</literal> <replaceable>x</replaceable> -cannot be used in expressions, only patterns, since it wouldn't -specify a value for the <replaceable>xs</replaceable> on the -right-hand side. We can give an explicit inversion of a pattern -synonym using the following syntax: -</para> - -<programlisting> - pattern Head x <- x:xs where - Head x = [x] -</programlisting> - -<para> -The syntax and semantics of pattern synonyms are elaborated in the -following subsections. -See the <ulink -url="http://ghc.haskell.org/trac/ghc/wiki/PatternSynonyms">Wiki -page</ulink> for more details. -</para> - -<sect3> <title>Syntax and scoping of pattern synonyms</title> -<para> -A pattern synonym declaration can be either unidirectional or -bidirectional. 
The syntax for unidirectional pattern synonyms is: -<programlisting> - pattern Name args <- pat -</programlisting> - and the syntax for bidirectional pattern synonyms is: -<programlisting> - pattern Name args = pat -</programlisting> or -<programlisting> - pattern Name args <- pat where - Name args = expr -</programlisting> - Either prefix or infix syntax can be - used. -</para> -<para> - Pattern synonym declarations can only occur in the top level of a - module. In particular, they are not allowed as local - definitions. -</para> -<para> - The variables in the left-hand side of the definition are bound by - the pattern on the right-hand side. For implicitly bidirectional - pattern synonyms, all the variables of the right-hand side must also - occur on the left-hand side; also, wildcard patterns and view - patterns are not allowed. For unidirectional and - explicitly-bidirectional pattern synonyms, there is no restriction - on the right-hand side pattern. -</para> - -<para> - Pattern synonyms cannot be defined recursively. -</para> -</sect3> - -<sect3 id="patsyn-impexp"> <title>Import and export of pattern synonyms</title> - -<para> - The name of the pattern synonym itself is in the same namespace as - proper data constructors. In an export or import specification, - you must prefix pattern - names with the <literal>pattern</literal> keyword, e.g.: -<programlisting> - module Example (pattern Single) where - pattern Single x = [x] -</programlisting> -Without the <literal>pattern</literal> prefix, <literal>Single</literal> would -be interpreted as a type constructor in the export list. -</para> -<para> -You may also use the <literal>pattern</literal> keyword in an import/export -specification to import or export an ordinary data constructor. 
For example: -<programlisting> - import Data.Maybe( pattern Just ) -</programlisting> -would bring into scope the data constructor <literal>Just</literal> from the -<literal>Maybe</literal> type, without also bringing the type constructor -<literal>Maybe</literal> into scope. -</para> -</sect3> - -<sect3> <title>Typing of pattern synonyms</title> - -<para> - Given a pattern synonym definition of the form -<programlisting> - pattern P var1 var2 ... varN <- pat -</programlisting> - it is assigned a <emphasis>pattern type</emphasis> of the form -<programlisting> - pattern P :: CProv => CReq => t1 -> t2 -> ... -> tN -> t -</programlisting> - where <replaceable>CProv</replaceable> and - <replaceable>CReq</replaceable> are type contexts, and - <replaceable>t1</replaceable>, <replaceable>t2</replaceable>, ..., - <replaceable>tN</replaceable> and <replaceable>t</replaceable> are - types. -Notice the unusual form of the type, with two contexts <replaceable>CProv</replaceable> and <replaceable>CReq</replaceable>: -<itemizedlist> -<listitem><para><replaceable>CReq</replaceable> are the constraints <emphasis>required</emphasis> to match the pattern.</para></listitem> -<listitem><para><replaceable>CProv</replaceable> are the constraints <emphasis>made available (provided)</emphasis> -by a successful pattern match.</para></listitem> -</itemizedlist> -For example, consider -<programlisting> -data T a where - MkT :: (Show b) => a -> b -> T a - -f1 :: (Eq a, Num a) => T a -> String -f1 (MkT 42 x) = show x - -pattern ExNumPat :: (Show b) => (Num a, Eq a) => b -> T a -pattern ExNumPat x = MkT 42 x - -f2 :: (Eq a, Num a) => T a -> String -f2 (ExNumPat x) = show x -</programlisting> -Here <literal>f1</literal> does not use pattern synonyms. To match against the -numeric pattern <literal>42</literal> <emphasis>requires</emphasis> the caller to -satisfy the constraints <literal>(Num a, Eq a)</literal>, -so they appear in <literal>f1</literal>'s type. 
The call to <literal>show</literal> generates a <literal>(Show b)</literal> -constraint, where <literal>b</literal> is an existentially quantified type variable bound by the pattern match -on <literal>MkT</literal>. But the same pattern match also <emphasis>provides</emphasis> the constraint -<literal>(Show b)</literal> (see <literal>MkT</literal>'s type), and so all is well. -</para> -<para> -Exactly the same reasoning applies to <literal>ExNumPat</literal>: -matching against <literal>ExNumPat</literal> <emphasis>requires</emphasis> -the constraints <literal>(Num a, Eq a)</literal>, and <emphasis>provides</emphasis> -the constraint <literal>(Show b)</literal>. -</para> -<para> -Note also the following points: -<itemizedlist> -<listitem><para> -In the common case where <replaceable>CReq</replaceable> is empty, - <literal>()</literal>, it can be omitted altogether. -</para> </listitem> - -<listitem><para> -You may specify an explicit <emphasis>pattern signature</emphasis>, as -we did for <literal>ExNumPat</literal> above, to specify the type of a pattern, -just as you can for a function. As usual, the type signature can be less polymorphic -than the inferred type. For example: -<programlisting> - -- Inferred type would be 'a -> [a]' - pattern SinglePair :: (a, a) -> [(a, a)] - pattern SinglePair x = [x] -</programlisting> -</para> </listitem> - -<listitem><para> -The GHCi <literal>:info</literal> command shows pattern types in this format. -</para> </listitem> - -<listitem><para> -For a bidirectional pattern synonym, a use of the pattern synonym as an expression has the type -<programlisting> - (CProv, CReq) => t1 -> t2 -> ... -> tN -> t -</programlisting> - So in the previous example, when used in an expression, <literal>ExNumPat</literal> has type -<programlisting> - ExNumPat :: (Show b, Num a, Eq a) => b -> T a -</programlisting> -Notice that this is a tiny bit more restrictive than the expression <literal>MkT 42 x</literal>, -which would not require <literal>(Eq a)</literal>. 
-</para> </listitem> - -<listitem><para> -Consider these two pattern synonyms: -<programlisting> -data S a where - S1 :: Bool -> S Bool - -pattern P1 b = Just b -- P1 :: Bool -> Maybe Bool -pattern P2 b = S1 b -- P2 :: (b~Bool) => Bool -> S b - -f :: Maybe a -> String -f (P1 x) = "no no no" -- Type-incorrect - -g :: S a -> String -g (P2 b) = "yes yes yes" -- Fine -</programlisting> -Pattern <literal>P1</literal> can only match against a value of type <literal>Maybe Bool</literal>, -so function <literal>f</literal> is rejected because the type signature is <literal>Maybe a</literal>. -(To see this, imagine expanding the pattern synonym.) -</para> -<para> -On the other hand, function <literal>g</literal> works fine, because matching against <literal>P2</literal> -(which wraps the GADT <literal>S</literal>) provides the local equality <literal>(a~Bool)</literal>. -If you were to give an explicit pattern signature <literal>P2 :: Bool -> S Bool</literal>, then <literal>P2</literal> -would become less polymorphic, and would behave exactly like <literal>P1</literal> so that <literal>g</literal> -would then be rejected. -</para> -<para> -In short, if you want GADT-like behaviour for a pattern synonym, -then (unlike concrete data constructors like <literal>S1</literal>) -you must write its type with explicit provided equalities. -For a concrete data constructor like <literal>S1</literal> you can write -its type signature as either <literal>S1 :: Bool -> S Bool</literal> or -<literal>S1 :: (b~Bool) => Bool -> S b</literal>; the two are equivalent. -Not so for pattern synonyms: the two forms are different, in order to -distinguish the two cases above. (See <ulink url="https://ghc.haskell.org/trac/ghc/ticket/9953">Trac #9953</ulink> for -discussion of this choice.) 
-</para></listitem> -</itemizedlist> -</para> -</sect3> - -<sect3><title>Matching of pattern synonyms</title> - -<para> -A pattern synonym occurrence in a pattern is evaluated by first -matching against the pattern synonym itself, and then on the argument -patterns. For example, in the following program, <literal>f</literal> -and <literal>f'</literal> are equivalent: -</para> - -<programlisting> -pattern Pair x y <- [x, y] - -f (Pair True True) = True -f _ = False - -f' [x, y] | True <- x, True <- y = True -f' _ = False -</programlisting> - -<para> - Note that the strictness of <literal>f</literal> differs from that - of <literal>g</literal> defined below: -<programlisting> -g [True, True] = True -g _ = False - -*Main> f (False:undefined) -*** Exception: Prelude.undefined -*Main> g (False:undefined) -False -</programlisting> -</para> -</sect3> - -</sect2> - - <!-- ===================== n+k patterns =================== --> - -<sect2 id="n-k-patterns"> -<title>n+k patterns</title> -<indexterm><primary><option>-XNPlusKPatterns</option></primary></indexterm> - -<para> -<literal>n+k</literal> pattern support is disabled by default. To enable -it, you can use the <option>-XNPlusKPatterns</option> flag. -</para> - -</sect2> - - <!-- ===================== Traditional record syntax =================== --> - -<sect2 id="traditional-record-syntax"> -<title>Traditional record syntax</title> -<indexterm><primary><option>-XNoTraditionalRecordSyntax</option></primary></indexterm> - -<para> -Traditional record syntax, such as <literal>C {f = x}</literal>, is enabled by default. -To disable it, you can use the <option>-XNoTraditionalRecordSyntax</option> flag. 
-</para> - -</sect2> - - <!-- ===================== Recursive do-notation =================== --> - -<sect2 id="recursive-do-notation"> -<title>The recursive do-notation -</title> - -<para> - The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>, - that is, the variables bound in a do-expression are visible only in the textually following - code block. Compare this to a let-expression, where bound variables are visible in the entire binding - group. -</para> - -<para> - It turns out that such recursive bindings do indeed make sense for a variety of monads, but - not all. In particular, recursion in this sense requires a fixed-point operator for the underlying - monad, captured by the <literal>mfix</literal> method of the <literal>MonadFix</literal> class, defined in <literal>Control.Monad.Fix</literal> as follows: -<programlisting> -class Monad m => MonadFix m where - mfix :: (a -> m a) -> m a -</programlisting> - Haskell's - <literal>Maybe</literal>, <literal>[]</literal> (list), <literal>ST</literal> (both strict and lazy versions), - <literal>IO</literal>, and many other monads have <literal>MonadFix</literal> instances. On the negative - side, the continuation monad, with the signature <literal>(a -> r) -> r</literal>, does not. -</para> - -<para> - For monads that do belong to the <literal>MonadFix</literal> class, GHC provides - an extended version of the do-notation that allows recursive bindings. - The <option>-XRecursiveDo</option> (language pragma: <literal>RecursiveDo</literal>) - provides the necessary syntactic support, introducing the keywords <literal>mdo</literal> and - <literal>rec</literal> for higher and lower levels of the notation respectively. Unlike - bindings in a <literal>do</literal> expression, those introduced by <literal>mdo</literal> and <literal>rec</literal> - are recursively defined, much like in an ordinary let-expression. 
Due to the new - keyword <literal>mdo</literal>, we also call this notation the <emphasis>mdo-notation</emphasis>. -</para> - -<para> - Here is a simple (albeit contrived) example: -<programlisting> -{-# LANGUAGE RecursiveDo #-} -justOnes = mdo { xs <- Just (1:xs) - ; return (map negate xs) } -</programlisting> -or equivalently -<programlisting> -{-# LANGUAGE RecursiveDo #-} -justOnes = do { rec { xs <- Just (1:xs) } - ; return (map negate xs) } -</programlisting> -As you can guess, <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>. -</para> - -<para> - GHC's implementation of the mdo-notation closely follows the original translation as described in the paper - <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for Haskell</ulink>, which - in turn is based on the work <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion - in Monadic Computations</ulink>. Furthermore, GHC extends the syntax described in the former paper - with a lower level syntax flagged by the <literal>rec</literal> keyword, as we describe next. -</para> - -<sect3> -<title>Recursive binding groups</title> - -<para> - The flag <option>-XRecursiveDo</option> also introduces a new keyword <literal>rec</literal>, which wraps a - mutually-recursive group of monadic statements inside a <literal>do</literal> expression, producing a single statement. - Similar to a <literal>let</literal> statement inside a <literal>do</literal>, variables bound in - the <literal>rec</literal> are visible throughout the <literal>rec</literal> group, and below it. 
For example, compare -<programlisting> - do { a <- getChar do { a <- getChar - ; let { r1 = f a r2 ; rec { r1 <- f a r2 - ; ; r2 = g r1 } ; ; r2 <- g r1 } - ; return (r1 ++ r2) } ; return (r1 ++ r2) } -</programlisting> - In both cases, <literal>r1</literal> and <literal>r2</literal> are available both throughout - the <literal>let</literal> or <literal>rec</literal> block, and in the statements that follow it. - The difference is that <literal>let</literal> is non-monadic, while <literal>rec</literal> is monadic. - (In Haskell <literal>let</literal> is really <literal>letrec</literal>, of course.) -</para> - -<para> - The semantics of <literal>rec</literal> is fairly straightforward. Whenever GHC finds a <literal>rec</literal> - group, it will compute its set of bound variables, and will introduce an appropriate call - to the underlying monadic value-recursion operator <literal>mfix</literal>, belonging to the - <literal>MonadFix</literal> class. Here is an example: -<programlisting> -rec { b <- f a c ===> (b,c) <- mfix (\ ~(b,c) -> do { b <- f a c - ; c <- f b a } ; c <- f b a - ; return (b,c) }) -</programlisting> - As usual, the meta-variables <literal>b</literal>, <literal>c</literal> etc., can be arbitrary patterns. - In general, the statement <literal>rec <replaceable>ss</replaceable></literal> is desugared to the statement -<programlisting> -<replaceable>vs</replaceable> <- mfix (\ ~<replaceable>vs</replaceable> -> do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> }) -</programlisting> - where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>. -</para> - -<para> - Note in particular that the translation for a <literal>rec</literal> block only involves wrapping a call - to <literal>mfix</literal>: it performs no other analysis on the bindings. The latter is the task - for the <literal>mdo</literal> notation, which is described next. 
-</para> -</sect3> - -<sect3> -<title>The <literal>mdo</literal> notation</title> - -<para> - A <literal>rec</literal>-block tells the compiler where precisely the recursive knot should be tied. It turns out that - the placement of the recursive knots can be rather delicate: in particular, we would like the knots to be wrapped - around as minimal groups as possible. This process is known as <emphasis>segmentation</emphasis>, and is described - in detail in Section 3.2 of <ulink url="https://sites.google.com/site/leventerkok/recdo.pdf">A recursive do for - Haskell</ulink>. Segmentation improves polymorphism and reduces the size of the recursive knot. Most importantly, it avoids - unnecessary interference caused by a fundamental issue with the so-called <emphasis>right-shrinking</emphasis> - axiom for monadic recursion. In brief, most monads of interest (IO, strict state, etc.) do <emphasis>not</emphasis> - have recursion operators that satisfy this axiom, and thus not performing segmentation can cause unnecessary - interference, changing the termination behavior of the resulting translation. - (Details can be found in Sections 3.1 and 7.2.2 of - <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.) -</para> - -<para> - The <literal>mdo</literal> notation removes the burden of placing - explicit <literal>rec</literal> blocks in the code. Unlike an - ordinary <literal>do</literal> expression, in which variables bound by - statements are only in scope for later statements, variables bound in - an <literal>mdo</literal> expression are in scope for all statements - of the expression. The compiler then automatically identifies minimal - mutually recursively dependent segments of statements, treating them as - if the user had wrapped a <literal>rec</literal> qualifier around them. 
-</para> - -<para> - The definition is syntactic: -</para> -<itemizedlist> - <listitem> - <para> - A generator <replaceable>g</replaceable> - <emphasis>depends</emphasis> on a textually following generator - <replaceable>g'</replaceable>, if - </para> - <itemizedlist> - <listitem> - <para> - <replaceable>g'</replaceable> defines a variable that - is used by <replaceable>g</replaceable>, or - </para> - </listitem> - <listitem> - <para> - <replaceable>g'</replaceable> textually appears between - <replaceable>g</replaceable> and - <replaceable>g''</replaceable>, where <replaceable>g</replaceable> - depends on <replaceable>g''</replaceable>. - </para> - </listitem> - </itemizedlist> - </listitem> - <listitem> - <para> - A <emphasis>segment</emphasis> of a given - <literal>mdo</literal>-expression is a minimal sequence of generators - such that no generator of the sequence depends on an outside - generator. As a special case, although it is not a generator, - the final expression in an <literal>mdo</literal>-expression is - considered to form a segment by itself. - </para> - </listitem> -</itemizedlist> -<para> - Segments in this sense are - related to <emphasis>strongly-connected components</emphasis> analysis, - with the exception that bindings in a segment cannot be reordered and - must be contiguous. -</para> - -<para> - Here is an example <literal>mdo</literal>-expression, and its translation to <literal>rec</literal> blocks: -<programlisting> -mdo { a <- getChar ===> do { a <- getChar - ; b <- f a c ; rec { b <- f a c - ; c <- f b a ; ; c <- f b a } - ; z <- h a b ; z <- h a b - ; d <- g d e ; rec { d <- g d e - ; e <- g a z ; ; e <- g a z } - ; putChar c } ; putChar c } -</programlisting> -Note that a given <literal>mdo</literal> expression can cause the creation of multiple <literal>rec</literal> blocks. -If there are no recursive dependencies, <literal>mdo</literal> will introduce no <literal>rec</literal> blocks. 
In this -latter case an <literal>mdo</literal> expression is precisely the same as a <literal>do</literal> expression, as one -would expect. -</para> - -<para> - In summary, given an <literal>mdo</literal> expression, GHC first performs segmentation, introducing - <literal>rec</literal> blocks to wrap over minimal recursive groups. Then, each resulting - <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal> as described - in the previous section. The original <literal>mdo</literal>-expression typechecks exactly when the desugared - version would do so. -</para> - -<para> -Here are some other important points in using the recursive-do notation: - -<itemizedlist> - <listitem> - <para> - It is enabled with the flag <literal>-XRecursiveDo</literal>, or the <literal>LANGUAGE RecursiveDo</literal> - pragma. (The same flag enables both <literal>mdo</literal>-notation, and the use of <literal>rec</literal> - blocks inside <literal>do</literal> expressions.) - </para> - </listitem> - <listitem> - <para> - <literal>rec</literal> blocks can also be used inside <literal>mdo</literal>-expressions, which will be - treated as a single statement. However, it is good style to either use <literal>mdo</literal> or - <literal>rec</literal> blocks in a single expression. - </para> - </listitem> - <listitem> - <para> - If recursive bindings are required for a monad, then that monad must be declared an instance of - the <literal>MonadFix</literal> class. - </para> - </listitem> - <listitem> - <para> - The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO. - Furthermore, the <literal>Control.Monad.ST</literal> and <literal>Control.Monad.ST.Lazy</literal> - modules provide the instances of the <literal>MonadFix</literal> class for Haskell's internal - state monad (strict and lazy, respectively). 
- </para> - </listitem> - <listitem> - <para> - Like <literal>let</literal> and <literal>where</literal> bindings, name shadowing is not allowed within - an <literal>mdo</literal>-expression or a <literal>rec</literal>-block; that is, all the names bound in - a single <literal>rec</literal> must be distinct. (GHC will complain if this is not the case.) - </para> - </listitem> -</itemizedlist> -</para> -</sect3> -</sect2> - -<sect2 id="applicative-do"> - <title>Applicative do-notation</title> - <indexterm><primary>Applicative do-notation</primary> - </indexterm> - <indexterm><primary>do-notation</primary><secondary>Applicative</secondary> - </indexterm> - - <para> - The language option - <option>-XApplicativeDo</option><indexterm><primary><option>-XApplicativeDo</option></primary></indexterm> - enables an alternative translation for the do-notation, which - uses the operators <literal><$></literal>, - <literal><*></literal>, along with - <literal>join</literal>, as far as possible. There are two main - reasons for wanting to do this: - </para> - - <itemizedlist> - <listitem> - <para> - We can use do-notation with types that are an instance of - <literal>Applicative</literal> and - <literal>Functor</literal>, but not - <literal>Monad</literal>. - </para> - </listitem> - <listitem> - <para> - In some monads, using the applicative operators is more - efficient than monadic bind. For example, it may enable - more parallelism. - </para> - </listitem> - </itemizedlist> - - <para> - Applicative do-notation desugaring preserves the original - semantics, provided that the <literal>Applicative</literal> - instance satisfies <literal><*> = ap</literal> and - <literal>pure = return</literal> (these are true of all the - common monadic types). Thus, you can normally turn on - <option>-XApplicativeDo</option> without fear of breaking your - program. There is one pitfall to watch out for; see <xref - linkend="applicative-do-pitfall" />. 
- </para> - - <para> - There are no syntactic changes with - <option>-XApplicativeDo</option>. The only way it shows up at - the source level is that you can have a <literal>do</literal> - expression that doesn't require a <literal>Monad</literal> - constraint. For example, in GHCi: - </para> - -<programlisting> -Prelude> :set -XApplicativeDo -Prelude> :t \m -> do { x <- m; return (not x) } -\m -> do { x <- m; return (not x) } - :: Functor f => f Bool -> f Bool -</programlisting> - - <para> - This example only requires <literal>Functor</literal>, because it - is translated into <literal>(\x -> not x) <$> m</literal>. A - more complex example requires <literal>Applicative</literal>: - -<programlisting> -Prelude> :t \m -> do { x <- m 'a'; y <- m 'b'; return (x || y) } -\m -> do { x <- m 'a'; y <- m 'b'; return (x || y) } - :: Applicative f => (Char -> f Bool) -> f Bool -</programlisting> - </para> - - <para> - Here GHC has translated the expression into - -<programlisting> -(\x y -> x || y) <$> m 'a' <*> m 'b' -</programlisting> - - It is possible to see the actual translation by using - <option>-ddump-ds</option>, but be warned, the output is quite - verbose. - </para> - - <para> - Note that if the expression can't be translated into uses of - <literal><$></literal>, <literal><*></literal> - only, then it will incur a <literal>Monad</literal> constraint as - usual. This happens when there is a dependency on a value - produced by an earlier statement in the do-block: - -<programlisting> -Prelude> :t \m -> do { x <- m True; y <- m x; return (x || y) } -\m -> do { x <- m True; y <- m x; return (x || y) } - :: Monad m => (Bool -> m Bool) -> m Bool -</programlisting> - - Here, <literal>m x</literal> depends on the value of - <literal>x</literal> produced by the first statement, so the - expression cannot be translated using <literal><*></literal>. 
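 To make the first motivation concrete, here is a small sketch of a type that has <literal>Functor</literal> and <literal>Applicative</literal> instances but no <literal>Monad</literal> instance, used with do-notation. The type <literal>Check</literal> and the functions <literal>positive</literal> and <literal>area</literal> are hypothetical names made up for this illustration; they are not part of any library.

```haskell
{-# LANGUAGE ApplicativeDo #-}

-- A hypothetical error-accumulating type. It deliberately has no Monad
-- instance, so the do-block below can only compile via <$> and <*>.
data Check a = Failed [String] | Passed a
  deriving (Eq, Show)

instance Functor Check where
  fmap _ (Failed es) = Failed es
  fmap f (Passed a)  = Passed (f a)

instance Applicative Check where
  pure = Passed
  Failed es <*> Failed es' = Failed (es ++ es')  -- keep both sets of errors
  Failed es <*> Passed _   = Failed es
  Passed _  <*> Failed es  = Failed es
  Passed f  <*> Passed a   = Passed (f a)

-- Accept only strictly positive numbers, reporting the field name otherwise.
positive :: String -> Int -> Check Int
positive name n
  | n > 0     = Passed n
  | otherwise = Failed [name ++ " must be positive"]

-- Neither statement mentions a variable bound by the other, and the block
-- ends in 'return E', so only Applicative is required.
area :: Int -> Int -> Check Int
area w h = do
  x <- positive "width" w
  y <- positive "height" h
  return (x * y)
```

 With these definitions, <literal>area 3 4</literal> should give <literal>Passed 12</literal>, while <literal>area 0 (-1)</literal> reports both problems at once, which a monadic bind could not do because it would stop at the first failure.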
- </para> - - <para>In general, the rule for when a <literal>do</literal> - statement incurs a <literal>Monad</literal> constraint is as - follows. If the do-expression has the following form: - -<programlisting> -do p1 <- E1; ...; pn <- En; return E -</programlisting> - - where none of the variables defined by <literal>p1...pn</literal> - are mentioned in <literal>E1...En</literal>, then the expression - will only require <literal>Applicative</literal>. Otherwise, the - expression will require <literal>Monad</literal>. - </para> - - <sect3 id="applicative-do-pitfall"> - <title>Things to watch out for</title> - - <para> - Your code should just work as before when - <option>-XApplicativeDo</option> is enabled, provided you use - conventional <literal>Applicative</literal> instances. However, if - you define a <literal>Functor</literal> or - <literal>Applicative</literal> instance using do-notation, then - it will likely get turned into an infinite loop by GHC. For - example, if you do this: - -<programlisting> -instance Functor MyType where - fmap f m = do x <- m; return (f x) -</programlisting> - - then applicative desugaring will turn it into - -<programlisting> -instance Functor MyType where - fmap f m = fmap (\x -> f x) m -</programlisting> - - and the program will loop at runtime. Similarly, an - <literal>Applicative</literal> instance like this - -<programlisting> -instance Applicative MyType where - pure = return - x <*> y = do f <- x; a <- y; return (f a) -</programlisting> - will result in an infinite loop when <literal><*></literal> - is called. - </para> - - <para>Just as you wouldn't define a <literal>Monad</literal> - instance using the do-notation, you shouldn't define - <literal>Functor</literal> or <literal>Applicative</literal> - instances using do-notation (when using - <literal>ApplicativeDo</literal>) either. 
The correct way to - define these instances in terms of <literal>Monad</literal> is to - use the <literal>Monad</literal> operations directly, e.g. - -<programlisting> -instance Functor MyType where - fmap f m = m >>= return . f - -instance Applicative MyType where - pure = return - (<*>) = ap -</programlisting> - </para> - </sect3> - -</sect2> - - - <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== --> - - <sect2 id="parallel-list-comprehensions"> - <title>Parallel List Comprehensions</title> - <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary> - </indexterm> - <indexterm><primary>parallel list comprehensions</primary> - </indexterm> - - <para>Parallel list comprehensions are a natural extension to list - comprehensions. List comprehensions can be thought of as a nice - syntax for writing maps and filters. Parallel comprehensions - extend this to include the <literal>zipWith</literal> family.</para> - - <para>A parallel list comprehension has multiple independent - branches of qualifier lists, each separated by a `|' symbol. For - example, the following zips together two lists:</para> - -<programlisting> - [ (x, y) | x <- xs | y <- ys ] -</programlisting> - - <para>The behaviour of parallel list comprehensions follows that of - zip, in that the resulting list will have the same length as the - shortest branch.</para> - - <para>We can define parallel list comprehensions by translation to - regular comprehensions. Here's the basic idea:</para> - - <para>Given a parallel comprehension of the form: </para> - -<programlisting> - [ e | p1 <- e11, p2 <- e12, ... - | q1 <- e21, q2 <- e22, ... - ... - ] -</programlisting> - - <para>This will be translated to: </para> - -<programlisting> - [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...] - [(q1,q2) | q1 <- e21, q2 <- e22, ...] - ... 
- ] -</programlisting> - - <para>where `zipN' is the appropriate zip for the given number of - branches.</para> - - </sect2> - - <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== --> - - <sect2 id="generalised-list-comprehensions"> - <title>Generalised (SQL-Like) List Comprehensions</title> - <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary> - </indexterm> - <indexterm><primary>extended list comprehensions</primary> - </indexterm> - <indexterm><primary>group</primary></indexterm> - <indexterm><primary>sql</primary></indexterm> - - - <para>Generalised list comprehensions are a further enhancement to the - list comprehension syntactic sugar to allow operations such as sorting - and grouping which are familiar from SQL. They are fully described in the - paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp"> - Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>, - except that the syntax we use differs slightly from the paper.</para> -<para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para> -<para>Here is an example: -<programlisting> -employees = [ ("Simon", "MS", 80) -, ("Erik", "MS", 100) -, ("Phil", "Ed", 40) -, ("Gordon", "Ed", 45) -, ("Paul", "Yale", 60)] - -output = [ (the dept, sum salary) -| (name, dept, salary) <- employees -, then group by dept using groupWith -, then sortWith by (sum salary) -, then take 5 ] -</programlisting> -In this example, the list <literal>output</literal> would take on - the value: - -<programlisting> -[("Yale", 60), ("Ed", 85), ("MS", 180)] -</programlisting> -</para> -<para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>. 
-(The functions <literal>sortWith</literal> and <literal>groupWith</literal> are not keywords; they are ordinary -functions that are exported by <literal>GHC.Exts</literal>.)</para> - -<para>There are four new forms of comprehension qualifier, -all introduced by the (existing) keyword <literal>then</literal>: - <itemizedlist> - <listitem> - -<programlisting> -then f -</programlisting> - - This statement requires that <literal>f</literal> have the type <literal> - forall a. [a] -> [a]</literal>. You can see an example of its use in the - motivating example, as this form is used to apply <literal>take 5</literal>. - - </listitem> - - - <listitem> -<para> -<programlisting> -then f by e -</programlisting> - - This form is similar to the previous one, but allows you to create a function - which will be passed as the first argument to f. As a consequence f must have - the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see - from the type, this function lets f "project out" some information - from the elements of the list it is transforming.</para> - - <para>An example is shown in the opening example, where <literal>sortWith</literal> - is supplied with a function that lets it find out the <literal>sum salary</literal> - for any item in the list comprehension it transforms.</para> - - </listitem> - - - <listitem> - -<programlisting> -then group by e using f -</programlisting> - - <para>This is the most general of the grouping-type statements. In this form, - f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>. - As with the <literal>then f by e</literal> case above, the first argument - is a function supplied to f by the compiler which lets it compute e on every - element of the list being transformed. 
However, unlike the non-grouping case, - f additionally partitions the list into a number of sublists: this means that - at every point after this statement, binders occurring before it in the comprehension - refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand - this, let's look at an example:</para> - -<programlisting> --- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first -groupRuns :: Eq b => (a -> b) -> [a] -> [[a]] -groupRuns f = groupBy (\x y -> f x == f y) - -output = [ (the x, y) -| x <- ([1..3] ++ [1..2]) -, y <- [4..6] -, then group by x using groupRuns ] -</programlisting> - - <para>This results in the variable <literal>output</literal> taking on the value below:</para> - -<programlisting> -[(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])] -</programlisting> - - <para>Note that we have used the <literal>the</literal> function to change the type - of x from a list to its original numeric type. The variable y, in contrast, is left - unchanged from the list form introduced by the grouping.</para> - - </listitem> - - <listitem> - -<programlisting> -then group using f -</programlisting> - - <para>With this form of the group statement, f is required to simply have the type - <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the - comprehension so far directly. An example of this form is as follows:</para> - -<programlisting> -output = [ x -| y <- [1..5] -, x <- "hello" -, then group using inits] -</programlisting> - - <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para> - -<programlisting> -["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...] 
-</programlisting> - - </listitem> -</itemizedlist> -</para> - </sect2> - - <!-- ===================== MONAD COMPREHENSIONS ===================== --> - -<sect2 id="monad-comprehensions"> - <title>Monad comprehensions</title> - <indexterm><primary>monad comprehensions</primary></indexterm> - - <para> - Monad comprehensions generalise the list comprehension notation, - including parallel comprehensions - (<xref linkend="parallel-list-comprehensions"/>) and - transform comprehensions (<xref linkend="generalised-list-comprehensions"/>) - to work for any monad. - </para> - - <para>Monad comprehensions support:</para> - - <itemizedlist> - <listitem> - <para> - Bindings: - </para> - -<programlisting> -[ x + y | x <- Just 1, y <- Just 2 ] -</programlisting> - - <para> - Bindings are translated with the <literal>(>>=)</literal> and - <literal>return</literal> functions to the usual do-notation: - </para> - -<programlisting> -do x <- Just 1 - y <- Just 2 - return (x+y) -</programlisting> - - </listitem> - <listitem> - <para> - Guards: - </para> - -<programlisting> -[ x | x <- [1..10], x <= 5 ] -</programlisting> - - <para> - Guards are translated with the <literal>guard</literal> function, - which requires a <literal>MonadPlus</literal> instance: - </para> - -<programlisting> -do x <- [1..10] - guard (x <= 5) - return x -</programlisting> - - </listitem> - <listitem> - <para> - Transform statements (as with <literal>-XTransformListComp</literal>): - </para> - -<programlisting> -[ x+y | x <- [1..10], y <- [1..x], then take 2 ] -</programlisting> - - <para> - This translates to: - </para> - -<programlisting> -do (x,y) <- take 2 (do x <- [1..10] - y <- [1..x] - return (x,y)) - return (x+y) -</programlisting> - - </listitem> - <listitem> - <para> - Group statements (as with <literal>-XTransformListComp</literal>): - </para> - -<programlisting> -[ x | x <- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ] -[ x | x <- [1,1,2,2,3], then group using myGroup ] 
-</programlisting> - - </listitem> - <listitem> - <para> - Parallel statements (as with <literal>-XParallelListComp</literal>): - </para> - -<programlisting> -[ (x+y) | x <- [1..10] - | y <- [11..20] - ] -</programlisting> - - <para> - Parallel statements are translated using the - <literal>mzip</literal> function, which requires a - <literal>MonadZip</literal> instance defined in - <ulink url="&libraryBaseLocation;/Control-Monad-Zip.html"><literal>Control.Monad.Zip</literal></ulink>: - </para> - -<programlisting> -do (x,y) <- mzip (do x <- [1..10] - return x) - (do y <- [11..20] - return y) - return (x+y) -</programlisting> - - </listitem> - </itemizedlist> - - <para> - All these features are enabled by default if the - <literal>MonadComprehensions</literal> extension is enabled. The types - and more detailed examples on how to use comprehensions are explained - in the previous chapters <xref - linkend="generalised-list-comprehensions"/> and <xref - linkend="parallel-list-comprehensions"/>. In general you just have - to replace the type <literal>[a]</literal> with the type - <literal>Monad m => m a</literal> for monad comprehensions. - </para> - - <para> - Note: Even though most of these examples are using the list monad, - monad comprehensions work for any monad. - The <literal>base</literal> package offers all necessary instances for - lists, which make <literal>MonadComprehensions</literal> backward - compatible to built-in, transform and parallel list comprehensions. - </para> -<para> More formally, the desugaring is as follows. 
We write <literal>D[ e | Q]</literal> -to mean the desugaring of the monad comprehension <literal>[ e | Q]</literal>: -<programlisting> -Expressions: e -Declarations: d -Lists of qualifiers: Q,R,S - --- Basic forms -D[ e | ] = return e -D[ e | p <- e, Q ] = e >>= \p -> D[ e | Q ] -D[ e | e, Q ] = guard e >> D[ e | Q ] -D[ e | let d, Q ] = let d in D[ e | Q ] - --- Parallel comprehensions (iterate for multiple parallel branches) -D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] >>= \(Qv,Rv) -> D[ e | S ] - --- Transform comprehensions -D[ e | Q then f, R ] = f D[ Qv | Q ] >>= \Qv -> D[ e | R ] - -D[ e | Q then f by b, R ] = f (\Qv -> b) D[ Qv | Q ] >>= \Qv -> D[ e | R ] - -D[ e | Q then group using f, R ] = f D[ Qv | Q ] >>= \ys -> - case (fmap selQv1 ys, ..., fmap selQvn ys) of - Qv -> D[ e | R ] - -D[ e | Q then group by b using f, R ] = f (\Qv -> b) D[ Qv | Q ] >>= \ys -> - case (fmap selQv1 ys, ..., fmap selQvn ys) of - Qv -> D[ e | R ] - -where Qv is the tuple of variables bound by Q (and used subsequently) - selQvi is a selector mapping Qv to the ith component of Qv - -Operator Standard binding Expected type --------------------------------------------------------------------- -return GHC.Base t1 -> m t2 -(>>=) GHC.Base m1 t1 -> (t2 -> m2 t3) -> m3 t3 -(>>) GHC.Base m1 t1 -> m2 t2 -> m3 t3 -guard Control.Monad t1 -> m t2 -fmap GHC.Base forall a b. (a->b) -> n a -> n b -mzip Control.Monad.Zip forall a b. m a -> m b -> m (a,b) -</programlisting> -The comprehension should typecheck when its desugaring would typecheck, -except that (as discussed in <xref linkend="generalised-list-comprehensions"/>) -in the "then f" and "then group using f" clauses, -when the "by b" qualifier is omitted, argument f should have a polymorphic type. -In particular, "then Data.List.sort" and -"then group using Data.List.group" are insufficiently polymorphic. -</para> -<para> -Monad comprehensions support rebindable syntax (<xref linkend="rebindable-syntax"/>). 
-Without rebindable -syntax, the operators from the "standard binding" module are used; with -rebindable syntax, the operators are looked up in the current lexical scope. -For example, parallel comprehensions will be typechecked and desugared -using whatever "<literal>mzip</literal>" is in scope. -</para> -<para> -The rebindable operators must have the "Expected type" given in the -table above. These types are surprisingly general. For example, you can -use a bind operator with the type -<programlisting> -(>>=) :: T x y a -> (a -> T y z b) -> T x z b -</programlisting> -In the case of transform comprehensions, notice that the groups are -parameterised over some arbitrary type <literal>n</literal> (provided it -has an <literal>fmap</literal>), as well as -the comprehension being over an arbitrary monad. -</para> -</sect2> - - <!-- ===================== REBINDABLE SYNTAX =================== --> - -<sect2 id="rebindable-syntax"> -<title>Rebindable syntax and the implicit Prelude import</title> - - <para><indexterm><primary>-XNoImplicitPrelude - option</primary></indexterm> GHC normally imports - <filename>Prelude.hi</filename> files for you. If you'd - rather it didn't, then give it a - <option>-XNoImplicitPrelude</option> option. The idea is - that you can then import a Prelude of your own. (But don't - call it <literal>Prelude</literal>; the Haskell module - namespace is flat, and you must not conflict with any - Prelude module.)</para> - - <para>Suppose you are importing a Prelude of your own - in order to define your own numeric class - hierarchy. It completely defeats that purpose if the - literal "1" means "<literal>Prelude.fromInteger - 1</literal>", which is what the Haskell Report specifies. 
- So the <option>-XRebindableSyntax</option> - flag causes - the following pieces of built-in syntax to refer to - <emphasis>whatever is in scope</emphasis>, not the Prelude - versions: - <itemizedlist> - <listitem> - <para>An integer literal <literal>368</literal> means - "<literal>fromInteger (368::Integer)</literal>", rather than - "<literal>Prelude.fromInteger (368::Integer)</literal>". -</para> </listitem> - - <listitem><para>Fractional literals are handled in just the same way, - except that the translation is - <literal>fromRational (3.68::Rational)</literal>. -</para> </listitem> - - <listitem><para>The equality test in an overloaded numeric pattern - uses whatever <literal>(==)</literal> is in scope. -</para> </listitem> - - <listitem><para>The subtraction operation, and the - greater-than-or-equal test, in <literal>n+k</literal> patterns - use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope. - </para></listitem> - - <listitem> - <para>Negation (e.g. "<literal>- (f x)</literal>") - means "<literal>negate (f x)</literal>", in both numeric - patterns and expressions. - </para></listitem> - - <listitem> - <para>Conditionals (e.g. "<literal>if</literal> e1 <literal>then</literal> e2 <literal>else</literal> e3") - mean "<literal>ifThenElse</literal> e1 e2 e3". However, <literal>case</literal> expressions are unaffected. - </para></listitem> - - <listitem> - <para>"Do" notation is translated using whatever - functions <literal>(>>=)</literal>, - <literal>(>>)</literal>, and <literal>fail</literal>, - are in scope (not the Prelude - versions). List comprehensions, <literal>mdo</literal> - (<xref linkend="recursive-do-notation"/>), and parallel array - comprehensions, are unaffected. 
</para></listitem> - - <listitem> - <para>Arrow - notation (see <xref linkend="arrow-notation"/>) - uses whatever <literal>arr</literal>, - <literal>(>>>)</literal>, <literal>first</literal>, - <literal>app</literal>, <literal>(|||)</literal> and - <literal>loop</literal> functions are in scope. But unlike the - other constructs, the types of these functions must match the - Prelude types very closely. Details are in flux; if you want - to use this, ask! - </para></listitem> - </itemizedlist> -<option>-XRebindableSyntax</option> implies <option>-XNoImplicitPrelude</option>. -</para> -<para> -In all cases (apart from arrow notation), the static semantics should be that of the desugared form, -even if that is a little unexpected. For example, the -static semantics of the literal <literal>368</literal> -is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for -<literal>fromInteger</literal> to have any of the types: -<programlisting> -fromInteger :: Integer -> Integer -fromInteger :: forall a. Foo a => Integer -> a -fromInteger :: Num a => a -> Integer -fromInteger :: Integer -> Bool -> Bool -</programlisting> -</para> - - <para>Be warned: this is an experimental facility, with - fewer checks than usual. Use <literal>-dcore-lint</literal> - to typecheck the desugared program. If Core Lint is happy - you should be all right.</para> - -</sect2> - -<sect2 id="postfix-operators"> -<title>Postfix operators</title> - -<para> - The <option>-XPostfixOperators</option> flag enables a small -extension to the syntax of left operator sections, which allows you to -define postfix operators. The extension is this: the left section -<programlisting> - (e !) -</programlisting> -is equivalent (from the point of view of both type checking and execution) to the expression -<programlisting> - ((!) e) -</programlisting> -(for any expression <literal>e</literal> and operator <literal>(!)</literal>). 
-The strict Haskell 98 interpretation is that the section is equivalent to -<programlisting> - (\y -> (!) e y) -</programlisting> -That is, the operator must be a function of two arguments. GHC allows it to -take only one argument, and that in turn allows you to write the function -postfix. -</para> -<para>The extension does not extend to the left-hand side of function -definitions; you must define such a function in prefix form.</para> - -</sect2> - -<sect2 id="tuple-sections"> -<title>Tuple sections</title> - -<para> - The <option>-XTupleSections</option> flag enables Python-style partially applied - tuple constructors. For example, the following program -<programlisting> - (, True) -</programlisting> - is considered to be an alternative notation for the more unwieldy alternative -<programlisting> - \x -> (x, True) -</programlisting> -You can omit any combination of arguments to the tuple, as in the following -<programlisting> - (, "I", , , "Love", , 1337) -</programlisting> -which translates to -<programlisting> - \a b c d -> (a, "I", b, c, "Love", d, 1337) -</programlisting> -</para> - -<para> - If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections - will also be available for them, like so -<programlisting> - (# , True #) -</programlisting> -Because there is no unboxed unit tuple, the following expression -<programlisting> - (# #) -</programlisting> -continues to stand for the unboxed singleton tuple data constructor. -</para> - -</sect2> - -<sect2 id="lambda-case"> -<title>Lambda-case</title> -<para> -The <option>-XLambdaCase</option> flag enables expressions of the form -<programlisting> - \case { p1 -> e1; ...; pN -> eN } -</programlisting> -which is equivalent to -<programlisting> - \freshName -> case freshName of { p1 -> e1; ...; pN -> eN } -</programlisting> -Note that <literal>\case</literal> starts a layout, so you can write -<programlisting> - \case - p1 -> e1 - ... 
- pN -> eN -</programlisting> -</para> -</sect2> - -<sect2 id="empty-case"> -<title>Empty case alternatives</title> -<para> -The <option>-XEmptyCase</option> flag enables -case expressions, or lambda-case expressions, that have no alternatives, -thus: -<programlisting> - case e of { } -- No alternatives -or - \case { } -- -XLambdaCase is also required -</programlisting> -This can be useful when you know that the expression being scrutinised -has no non-bottom values. For example: -<programlisting> - data Void - f :: Void -> Int - f x = case x of { } -</programlisting> -With dependently-typed features it is more useful -(see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/2431">Trac</ulink>). -For example, consider these two candidate definitions of <literal>absurd</literal>: -<programlisting> -data a :~: b where - Refl :: a :~: a - -absurd :: True :~: False -> a -absurd x = error "absurd" -- (A) -absurd x = case x of {} -- (B) -</programlisting> -We much prefer (B). Why? Because GHC can figure out that <literal>(True :~: False)</literal> -is an empty type. So (B) has no partiality and GHC should be able to compile with -<option>-fwarn-incomplete-patterns</option>. (Though the pattern match checking is not -yet clever enough to do that.) -On the other hand (A) looks dangerous, and GHC doesn't check to make -sure that, in fact, the function can never get called. -</para> -</sect2> - -<sect2 id="multi-way-if"> -<title>Multi-way if-expressions</title> -<para> -With the <option>-XMultiWayIf</option> flag, GHC accepts conditional expressions -with multiple branches: -<programlisting> - if | guard1 -> expr1 - | ... - | guardN -> exprN -</programlisting> -which is roughly equivalent to -<programlisting> - case () of - _ | guard1 -> expr1 - ... - _ | guardN -> exprN -</programlisting> -</para> - -<para>Multi-way if expressions introduce a new layout context. So the -example above is equivalent to: -<programlisting> - if { | guard1 -> expr1 - ; | ... 
- ; | guardN -> exprN - } -</programlisting> -The following behaves as expected: -<programlisting> - if | guard1 -> if | guard2 -> expr2 - | guard3 -> expr3 - | guard4 -> expr4 -</programlisting> -because layout translates it as -<programlisting> - if { | guard1 -> if { | guard2 -> expr2 - ; | guard3 -> expr3 - } - ; | guard4 -> expr4 - } -</programlisting> -Layout with multi-way if works in the same way as other layout -contexts, except that the semi-colons between guards in a multi-way if -are optional. So it is not necessary to line up all the guards at the -same column; this is consistent with the way guards work in function -definitions and case expressions. -</para> -</sect2> - -<sect2 id="disambiguate-fields"> -<title>Record field disambiguation</title> -<para> -In record construction and record pattern matching -it is entirely unambiguous which field is referred to, even if there are two different -data types in scope with a common field name. For example: -<programlisting> -module M where - data S = MkS { x :: Int, y :: Bool } - -module Foo where - import M - - data T = MkT { x :: Int } - - ok1 (MkS { x = n }) = n+1 -- Unambiguous - ok2 n = MkT { x = n+1 } -- Unambiguous - - bad1 k = k { x = 3 } -- Ambiguous - bad2 k = x k -- Ambiguous -</programlisting> -Even though there are two <literal>x</literal>'s in scope, -it is clear that the <literal>x</literal> in the pattern in the -definition of <literal>ok1</literal> can only mean the field -<literal>x</literal> from type <literal>S</literal>. Similarly for -the function <literal>ok2</literal>. However, in the record update -in <literal>bad1</literal> and the record selection in <literal>bad2</literal> -it is not clear which of the two types is intended. -</para> -<para> -Haskell 98 regards all four as ambiguous, but with the -<option>-XDisambiguateRecordFields</option> flag, GHC will accept -the former two. 
The rules are precisely the same as those for instance -declarations in Haskell 98, where the method names on the left-hand side -of the method bindings in an instance declaration refer unambiguously -to the method of that class (provided they are in scope at all), even -if there are other variables in scope with the same name. -This reduces the clutter of qualified names when you import two -records from different modules that use the same field name. -</para> -<para> -Some details: -<itemizedlist> -<listitem><para> -Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For example: -<programlisting> -module Foo where - import M - x=True - ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning -</programlisting> -</para></listitem> - -<listitem><para> -With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualified</emphasis> -field names even if the corresponding selector is only in scope <emphasis>qualified</emphasis>. -For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal: -<programlisting> -module Foo where - import qualified M -- Note qualified - - ok4 (M.MkS { x = n }) = n+1 -- Unambiguous -</programlisting> -Since the constructor <literal>MkS</literal> is only in scope qualified, you must -name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need -to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal> -is not. (In effect, it is qualified by the constructor.) -</para></listitem> -</itemizedlist> -</para> - -</sect2> - - <!-- ===================== Record puns =================== --> - -<sect2 id="record-puns"> -<title>Record puns -</title> - -<para> -Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>. 
-</para> - -<para> -When using records, it is common to write a pattern that binds a -variable with the same name as a record field, such as: - -<programlisting> -data C = C {a :: Int} -f (C {a = a}) = a -</programlisting> -</para> - -<para> -Record punning permits the variable name to be elided, so one can simply -write - -<programlisting> -f (C {a}) = a -</programlisting> - -to mean the same pattern as above. That is, in a record pattern, the -pattern <literal>a</literal> expands into the pattern <literal>a = -a</literal> for the same name <literal>a</literal>. -</para> - -<para> -Note that: -<itemizedlist> -<listitem><para> -Record punning can also be used in an expression, writing, for example, -<programlisting> -let a = 1 in C {a} -</programlisting> -instead of -<programlisting> -let a = 1 in C {a = a} -</programlisting> -The expansion is purely syntactic, so the expanded right-hand side -expression refers to the nearest enclosing variable that is spelled the -same as the field name. -</para></listitem> - -<listitem><para> -Puns and other patterns can be mixed in the same record: -<programlisting> -data C = C {a :: Int, b :: Int} -f (C {a, b = 4}) = a -</programlisting> -</para></listitem> - -<listitem><para> -Puns can be used wherever record patterns occur (e.g. in -<literal>let</literal> bindings or at the top level). -</para></listitem> - -<listitem><para> -A pun on a qualified field name is expanded by stripping off the module qualifier. -For example: -<programlisting> -f (M.C {M.a}) = a -</programlisting> -means -<programlisting> -f (M.C {M.a = a}) = a -</programlisting> -(This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal> -is only in scope in qualified form.) 
-</para></listitem> -</itemizedlist> -</para> - - -</sect2> - - <!-- ===================== Record wildcards =================== --> - -<sect2 id="record-wildcards"> -<title>Record wildcards -</title> - -<para> -Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>. -This flag implies <literal>-XDisambiguateRecordFields</literal>. -</para> - -<para> -For records with many fields, it can be tiresome to write out each field -individually in a record pattern, as in -<programlisting> -data C = C {a :: Int, b :: Int, c :: Int, d :: Int} -f (C {a = 1, b = b, c = c, d = d}) = b + c + d -</programlisting> -</para> - -<para> -Record wildcard syntax permits a "<literal>..</literal>" in a record -pattern, where each elided field <literal>f</literal> is replaced by the -pattern <literal>f = f</literal>. For example, the above pattern can be -written as -<programlisting> -f (C {a = 1, ..}) = b + c + d -</programlisting> -</para> - -<para> -More details: -<itemizedlist> -<listitem><para> -Record wildcards in patterns can be mixed with other patterns, including puns -(<xref linkend="record-puns"/>); for example, in a pattern <literal>(C {a -= 1, b, ..})</literal>. Additionally, record wildcards can be used -wherever record patterns occur, including in <literal>let</literal> -bindings and at the top-level. For example, the top-level binding -<programlisting> -C {a = 1, ..} = e -</programlisting> -defines <literal>b</literal>, <literal>c</literal>, and -<literal>d</literal>. -</para></listitem> - -<listitem><para> -Record wildcards can also be used in an expression, when constructing a record. For example, -<programlisting> -let {a = 1; b = 2; c = 3; d = 4} in C {..} -</programlisting> -in place of -<programlisting> -let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d} -</programlisting> -The expansion is purely syntactic, so the record wildcard -expression refers to the nearest enclosing variables that are spelled -the same as the omitted field names. 
-</para></listitem> - -<listitem><para> -Record wildcards may <emphasis>not</emphasis> be used in record <emphasis>updates</emphasis>. For example, this -is illegal: -<programlisting> -f r = r { x = 3, .. } -</programlisting> -</para></listitem> - -<listitem><para> -For both pattern and expression wildcards, the "<literal>..</literal>" expands to the missing -<emphasis>in-scope</emphasis> record fields. -Specifically the expansion of "<literal>C {..}</literal>" includes -<literal>f</literal> if and only if: -<itemizedlist> -<listitem><para> -<literal>f</literal> is a record field of constructor <literal>C</literal>. -</para></listitem> -<listitem><para> -The record field <literal>f</literal> is in scope somehow (either qualified or unqualified). -</para></listitem> -<listitem><para> -In the case of expressions (but not patterns), -the variable <literal>f</literal> is in scope unqualified, -apart from the binding of the record selector itself. -</para></listitem> -</itemizedlist> -These rules restrict record wildcards to the situations in which the user -could have written the expanded version. -For example: -<programlisting> -module M where - data R = R { a,b,c :: Int } -module X where - import M( R(R,a,c) ) - f a b = R { .. } -</programlisting> -The <literal>R{..}</literal> expands to <literal>R{M.a=a}</literal>, -omitting <literal>b</literal> since the record field is not in scope, -and omitting <literal>c</literal> since the variable <literal>c</literal> -is not in scope (apart from the binding of the -record selector <literal>c</literal>, of course). -</para></listitem> - -<listitem><para> -Record wildcards cannot be used (a) in a record update construct, and (b) for data -constructors that are not declared with record fields. For example: -<programlisting> -f x = x { v=True, .. } -- Illegal (a) - -data T = MkT Int Bool -g = MkT { .. } -- Illegal (b) -h (MkT { .. 
}) = True -- Illegal (b) -</programlisting> -</para></listitem> -</itemizedlist> -</para> - -</sect2> - - <!-- ===================== Local fixity declarations =================== --> - -<sect2 id="local-fixity-declarations"> -<title>Local Fixity Declarations -</title> - -<para>A careful reading of the Haskell 98 Report reveals that fixity -declarations (<literal>infix</literal>, <literal>infixl</literal>, and -<literal>infixr</literal>) are permitted to appear inside local bindings -such as those introduced by <literal>let</literal> and -<literal>where</literal>. However, the Haskell Report does not specify -the semantics of such bindings very precisely. -</para> - -<para>In GHC, a fixity declaration may accompany a local binding: -<programlisting> -let f = ... - infixr 3 `f` -in - ... -</programlisting> -and the fixity declaration applies wherever the binding is in scope. -For example, in a <literal>let</literal>, it applies in the right-hand -sides of other <literal>let</literal>-bindings and the body of the -<literal>let</literal>. Or, in recursive <literal>do</literal> -expressions (<xref linkend="recursive-do-notation"/>), the local fixity -declarations of a <literal>let</literal> statement scope over other -statements in the group, just as the bound name does. -</para> - -<para> -Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of -that name: it is not possible to revise the fixity of a name bound -elsewhere, as in -<programlisting> -let infixr 9 $ in ... -</programlisting> - -Because local fixity declarations are technically Haskell 98, no flag is -necessary to enable them. 
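As a minimal sketch of both forms, the following program declares fixities next to a
<literal>let</literal> binding and next to <literal>where</literal> bindings. The names
<literal>cons</literal>, <literal>total</literal>, <literal>mixed</literal>,
<literal>|+|</literal> and <literal>|*|</literal> are hypothetical, chosen only for
illustration, and only Prelude functions are assumed:
<programlisting>
module Main where

-- Hypothetical example: all names below are illustrative only.

-- A let-bound function used infix; its fixity is declared in the same group.
total :: Int
total =
  let infixr 5 `cons`
      cons :: Int -> [Int] -> [Int]
      x `cons` xs = x : xs
  in sum (1 `cons` 2 `cons` 3 `cons` [])   -- 1 `cons` (2 `cons` (3 `cons` []))

-- where-bound operators whose relative precedence is set locally.
mixed :: Int
mixed = 2 |+| 3 |*| 4                      -- 2 |+| (3 |*| 4)
  where
    infixl 6 |+|
    infixl 7 |*|
    a |+| b = a + b
    a |*| b = a * b

main :: IO ()
main = do
  print total
  print mixed
</programlisting>
With the default fixity (<literal>infixl 9</literal>) in place of the local declarations, the
first expression would not typecheck, because it would group to the left, and the second would
group as <literal>(2 |+| 3) |*| 4</literal>.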
-</para> -</sect2> - -<sect2 id="package-imports"> -<title>Import and export extensions</title> - -<sect3> - <title>Hiding things the imported module doesn't export</title> - -<para> -Technically in Haskell 2010 this is illegal: -<programlisting> -module A( f ) where - f = True - -module B where - import A hiding( g ) -- A does not export g - g = f -</programlisting> -The <literal>import A hiding( g )</literal> in module <literal>B</literal> -is technically an error (<ulink url="http://www.haskell.org/onlinereport/haskell2010/haskellch5.html#x11-1020005.3.1">Haskell Report, 5.3.1</ulink>) -because <literal>A</literal> does not export <literal>g</literal>. -However GHC allows it, in the interests of supporting backward compatibility; for example, a newer version of -<literal>A</literal> might export <literal>g</literal>, and you want <literal>B</literal> to work -in either case. -</para> -<para> -The warning <literal>-fwarn-dodgy-imports</literal>, which is off by default but included with <literal>-W</literal>, -warns if you hide something that the imported module does not export. -</para> -</sect3> - -<sect3> - <title id="package-qualified-imports">Package-qualified imports</title> - - <para>With the <option>-XPackageImports</option> flag, GHC allows - import declarations to be qualified by the package name that the - module is intended to be imported from. For example:</para> - -<programlisting> -import "network" Network.Socket -</programlisting> - - <para>would import the module <literal>Network.Socket</literal> from - the package <literal>network</literal> (any version). 
This may - be used to disambiguate an import when the same module is - available from multiple packages, or is present in both the - current package being built and an external package.</para> - - <para>The special package name <literal>this</literal> can be used to - refer to the current package being built.</para> - - <para>Note: you probably don't need to use this feature; it was - added mainly so that we can build backwards-compatible versions of - packages when APIs change. It can lead to fragile dependencies in - the common case: modules occasionally move from one package to - another, rendering any package-qualified imports broken. - See also <xref linkend="package-thinning-and-renaming" /> for - an alternative way of disambiguating between module names.</para> -</sect3> - -<sect3 id="safe-imports-ext"> - <title>Safe imports</title> - - <para>With the <option>-XSafe</option>, <option>-XTrustworthy</option> - and <option>-XUnsafe</option> language flags, GHC extends - the import declaration syntax to take an optional <literal>safe</literal> - keyword after the <literal>import</literal> keyword. This feature - is part of the Safe Haskell GHC extension. For example:</para> - -<programlisting> -import safe qualified Network.Socket as NS -</programlisting> - - <para>would import the module <literal>Network.Socket</literal> - with compilation only succeeding if <literal>Network.Socket</literal> can be - safely imported. For a description of when an import is - considered safe, see <xref linkend="safe-haskell"/>.</para> - -</sect3> - -<sect3 id="explicit-namespaces"> -<title>Explicit namespaces in import/export</title> - -<para> In an import or export list, such as -<programlisting> - module M( f, (++) ) where ... - import N( f, (++) ) - ... -</programlisting> -the entities <literal>f</literal> and <literal>(++)</literal> are <emphasis>values</emphasis>. 
-However, with type operators (<xref linkend="type-operators"/>) it becomes possible -to declare <literal>(++)</literal> as a <emphasis>type constructor</emphasis>. In that -case, how would you export or import it? -</para> -<para> -The <option>-XExplicitNamespaces</option> extension allows you to prefix the name of -a type constructor in an import or export list with "<literal>type</literal>" to -disambiguate this case, thus: -<programlisting> - module M( f, type (++) ) where ... - import N( f, type (++) ) - ... - module N( f, type (++) ) where - data a ++ b = L a | R b -</programlisting> -The extension <option>-XExplicitNamespaces</option> -is implied by <option>-XTypeOperators</option> and (for some reason) by <option>-XTypeFamilies</option>. -</para> -<para> -In addition, with <option>-XPatternSynonyms</option> you can prefix the name of -a data constructor in an import or export list with the keyword <literal>pattern</literal>, -to allow the import or export of a data constructor without its parent type constructor -(see <xref linkend="patsyn-impexp"/>). -</para> -</sect3> - -</sect2> - -<sect2 id="syntax-stolen"> -<title>Summary of stolen syntax</title> - - <para>Turning on an option that enables special syntax - <emphasis>might</emphasis> cause working Haskell 98 code to fail - to compile, perhaps because it uses a variable name which has - become a reserved word. This section lists the syntax that is - "stolen" by language extensions. - We use - notation and nonterminal names from the Haskell 98 lexical syntax - (see the Haskell 98 Report). - We only list syntax changes here that might affect - existing working programs (i.e. "stolen" syntax). 
Many of these - extensions will also enable new context-free syntax, but in all - cases programs written to use the new syntax would not be - compilable without the option enabled.</para> - -<para>There are two classes of special - syntax: - - <itemizedlist> - <listitem> - <para>New reserved words and symbols: character sequences - which are no longer available for use as identifiers in the - program.</para> - </listitem> - <listitem> - <para>Other special syntax: sequences of characters that have - a different meaning when this particular option is turned - on.</para> - </listitem> - </itemizedlist> - -The following syntax is stolen: - - <variablelist> - <varlistentry> - <term> - <literal>forall</literal> - <indexterm><primary><literal>forall</literal></primary></indexterm> - </term> - <listitem><para> - Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by - <option>-XScopedTypeVariables</option>, - <option>-XLiberalTypeSynonyms</option>, - <option>-XRankNTypes</option>, - <option>-XExistentialQuantification</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>mdo</literal> - <indexterm><primary><literal>mdo</literal></primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XRecursiveDo</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>foreign</literal> - <indexterm><primary><literal>foreign</literal></primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XForeignFunctionInterface</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>rec</literal>, - <literal>proc</literal>, <literal>-<</literal>, - <literal>>-</literal>, <literal>-<<</literal>, - <literal>>>-</literal>, and <literal>(|</literal>, - <literal>|)</literal> brackets - <indexterm><primary><literal>proc</literal></primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XArrows</option> - </para></listitem> - </varlistentry> - - 
<varlistentry> - <term> - <literal>?<replaceable>varid</replaceable></literal> - <indexterm><primary>implicit parameters</primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XImplicitParams</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>[|</literal>, - <literal>[e|</literal>, <literal>[p|</literal>, - <literal>[d|</literal>, <literal>[t|</literal>, - <literal>$(</literal>, - <literal>$$(</literal>, - <literal>[||</literal>, - <literal>[e||</literal>, - <literal>$<replaceable>varid</replaceable></literal>, - <literal>$$<replaceable>varid</replaceable></literal> - <indexterm><primary>Template Haskell</primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XTemplateHaskell</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>[<replaceable>varid</replaceable>|</literal> - <indexterm><primary>quasi-quotation</primary></indexterm> - </term> - <listitem><para> - Stolen by: <option>-XQuasiQuotes</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <replaceable>varid</replaceable>{<literal>#</literal>}, - <replaceable>char</replaceable><literal>#</literal>, - <replaceable>string</replaceable><literal>#</literal>, - <replaceable>integer</replaceable><literal>#</literal>, - <replaceable>float</replaceable><literal>#</literal>, - <replaceable>float</replaceable><literal>##</literal> - </term> - <listitem><para> - Stolen by: <option>-XMagicHash</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <literal>(#</literal>, <literal>#)</literal> - </term> - <listitem><para> - Stolen by: <option>-XUnboxedTuples</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - <replaceable>varid</replaceable><literal>!</literal><replaceable>varid</replaceable> - </term> - <listitem><para> - Stolen by: <option>-XBangPatterns</option> - </para></listitem> - </varlistentry> - - <varlistentry> - <term> - 
<literal>pattern</literal> - </term> - <listitem><para> - Stolen by: <option>-XPatternSynonyms</option> - </para></listitem> - </varlistentry> - </variablelist> -</para> -</sect2> -</sect1> - - -<!-- TYPE SYSTEM EXTENSIONS --> -<sect1 id="data-type-extensions"> -<title>Extensions to data types and type synonyms</title> - -<sect2 id="nullary-types"> -<title>Data types with no constructors</title> - -<para>With the <option>-XEmptyDataDecls</option> flag (or equivalent LANGUAGE pragma), -GHC lets you declare a data type with no constructors. For example:</para> - -<programlisting> - data S -- S :: * - data T a -- T :: * -> * -</programlisting> - -<para>Syntactically, the declaration lacks the "= constrs" part. The -type can be parameterised over types of any kind, but if the kind is -not <literal>*</literal> then an explicit kind annotation must be used -(see <xref linkend="kinding"/>).</para> - -<para>Such data types have only one value, namely bottom. -Nevertheless, they can be useful when defining "phantom types".</para> -</sect2> - -<sect2 id="datatype-contexts"> -<title>Data type contexts</title> - -<para>Haskell allows datatypes to be given contexts, e.g.</para> - -<programlisting> -data Eq a => Set a = NilSet | ConsSet a (Set a) -</programlisting> - -<para>This gives the constructors the types:</para> - -<programlisting> -NilSet :: Set a -ConsSet :: Eq a => a -> Set a -> Set a -</programlisting> - -<para>This is widely considered a misfeature, and is going to be removed from -the language. In GHC, it is controlled by the deprecated extension -<literal>DatatypeContexts</literal>.</para> -</sect2> - -<sect2 id="infix-tycons"> -<title>Infix type constructors, classes, and type variables</title> - -<para> -GHC allows type constructors, classes, and type variables to be operators, and -to be written infix, very much like expressions. More specifically: -<itemizedlist> -<listitem><para> - A type constructor or class can be any non-reserved operator. 
- Symbols used in types are always like capitalized identifiers; they - are never variables. Note that this is different from the lexical - syntax of data constructors, which are required to begin with a - <literal>:</literal>. - </para></listitem> -<listitem><para> - Data type and type-synonym declarations can be written infix, parenthesised - if you want further arguments. E.g. -<screen> - data a :*: b = Foo a b - type a :+: b = Either a b - class a :=: b where ... - - data (a :**: b) x = Baz a b x - type (a :++: b) y = Either (a,b) y -</screen> - </para></listitem> -<listitem><para> - Types, and class constraints, can be written infix. For example -<screen> - x :: Int :*: Bool - f :: (a :=: b) => a -> b -</screen> - </para></listitem> -<listitem><para> - Back-quotes work - as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or - <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>. - </para></listitem> -<listitem><para> - Fixities may be declared for type constructors, or classes, just as for data constructors. However, - one cannot distinguish between the two in a fixity declaration; a fixity declaration - sets the fixity for a data constructor and the corresponding type constructor. For example: -<screen> - infixl 7 T, :*: -</screen> - sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>, - and similarly for <literal>:*:</literal>. - </para></listitem> -<listitem><para> - The function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.) - </para></listitem> - -</itemizedlist> -</para> -</sect2> - -<sect2 id="type-operators"> -<title>Type operators</title> -<para> -In types, an operator symbol like <literal>(+)</literal> is normally treated as a type -<emphasis>variable</emphasis>, just like <literal>a</literal>. 
Thus in Haskell 98 you can say -<programlisting> -type T (+) = ((+), (+)) --- Just like: type T a = (a,a) - -f :: T Int -> Int -f (x,y)= x -</programlisting> -As you can see, using operators in this way is not very useful, and Haskell 98 does not even -allow you to write them infix. -</para> -<para> -The language extension <option>-XTypeOperators</option> changes this behaviour: -<itemizedlist> -<listitem><para> -Operator symbols become type <emphasis>constructors</emphasis> rather than -type <emphasis>variables</emphasis>. -</para></listitem> -<listitem><para> -Operator symbols in types can be written infix, both in definitions and uses. -For example: -<programlisting> -data a + b = Plus a b -type Foo = Int + Bool -</programlisting> -</para></listitem> -<listitem><para> -There is now some potential ambiguity in import and export lists; for example -if you write <literal>import M( (+) )</literal> do you mean the -<emphasis>function</emphasis> <literal>(+)</literal> or the -<emphasis>type constructor</emphasis> <literal>(+)</literal>? -The default is the former, but with <option>-XExplicitNamespaces</option> (which is implied -by <option>-XTypeOperators</option>) GHC allows you to specify the latter -by preceding it with the keyword <literal>type</literal>, thus: -<programlisting> -import M( type (+) ) -</programlisting> -See <xref linkend="explicit-namespaces"/>. -</para></listitem> -<listitem><para> -The fixity of a type operator may be set using the usual fixity declarations -but, as in <xref linkend="infix-tycons"/>, the function and type constructor share -a single fixity. -</para></listitem> -</itemizedlist> -</para> -</sect2> - -<sect2 id="type-synonyms"> -<title>Liberalised type synonyms</title> - -<para> -Type synonyms are like macros at the type level, but Haskell 98 imposes many rules -on individual synonym declarations. 
-With the <option>-XLiberalTypeSynonyms</option> extension, -GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>. -That means that GHC can be very much more liberal about type synonyms than Haskell 98. - -<itemizedlist> -<listitem> <para>You can write a <literal>forall</literal> (including overloading) -in a type synonym, thus: -<programlisting> - type Discard a = forall b. Show b => a -> b -> (a, String) - - f :: Discard a - f x y = (x, show y) - - g :: Discard Int -> (Int,String) -- A rank-2 type - g f = f 3 True -</programlisting> -</para> -</listitem> - -<listitem><para> -If you also use <option>-XUnboxedTuples</option>, -you can write an unboxed tuple in a type synonym: -<programlisting> - type Pr = (# Int, Int #) - - h :: Int -> Pr - h x = (# x, x #) -</programlisting> -</para></listitem> - -<listitem><para> -You can apply a type synonym to a forall type: -<programlisting> - type Foo a = a -> a -> Bool - - f :: Foo (forall b. b->b) -</programlisting> -After expanding the synonym, <literal>f</literal> has the legal (in GHC) type: -<programlisting> - f :: (forall b. b->b) -> (forall b. b->b) -> Bool -</programlisting> -</para></listitem> - -<listitem><para> -You can apply a type synonym to a partially applied type synonym: -<programlisting> - type Generic i o = forall x. i x -> o x - type Id x = x - - foo :: Generic Id [] -</programlisting> -After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type: -<programlisting> - foo :: forall x. x -> [x] -</programlisting> -</para></listitem> - -</itemizedlist> -</para> - -<para> -GHC currently does kind checking before expanding synonyms (though even that -could be changed.) 
-</para> -<para> -After expanding type synonyms, GHC does validity checking on types, looking for -the following mal-formedness which isn't detected simply by kind checking: -<itemizedlist> -<listitem><para> -Type constructor applied to a type involving for-alls (if <option>-XImpredicativeTypes</option> -is off) -</para></listitem> -<listitem><para> -Partially-applied type synonym. -</para></listitem> -</itemizedlist> -So, for example, this will be rejected: -<programlisting> - type Pr = forall a. a - - h :: [Pr] - h = ... -</programlisting> -because GHC does not allow type constructors applied to for-all types. -</para> -</sect2> - - -<sect2 id="existential-quantification"> -<title>Existentially quantified data constructors -</title> - -<para> -The idea of using existential quantification in data type declarations -was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation -of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of -London, 1991). It was later formalised by Laufer and Odersky -(<emphasis>Polymorphic type inference and abstract data types</emphasis>, -TOPLAS, 16(5), pp1411-1430, 1994). -It's been in Lennart -Augustsson's <command>hbc</command> Haskell compiler for several years, and -proved very useful. Here's the idea. Consider the declaration: -</para> - -<para> - -<programlisting> - data Foo = forall a. MkFoo a (a -> Bool) - | Nil -</programlisting> - -</para> - -<para> -The data type <literal>Foo</literal> has two constructors with types: -</para> - -<para> - -<programlisting> - MkFoo :: forall a. a -> (a -> Bool) -> Foo - Nil :: Foo -</programlisting> - -</para> - -<para> -Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function> -does not appear in the data type itself, which is plain <literal>Foo</literal>. 
-For example, the following expression is fine: -</para> - -<para> - -<programlisting> - [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo] -</programlisting> - -</para> - -<para> -Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function -<function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c' -isUpper</function> packages a character with a compatible function. These -two things are each of type <literal>Foo</literal> and can be put in a list. -</para> - -<para> -What can we do with a value of type <literal>Foo</literal>? In particular, -what happens when we pattern-match on <function>MkFoo</function>? -</para> - -<para> - -<programlisting> - f (MkFoo val fn) = ??? -</programlisting> - -</para> - -<para> -Since all we know about <literal>val</literal> and <function>fn</function> is that they -are compatible, the only (useful) thing we can do with them is to -apply <function>fn</function> to <literal>val</literal> to get a boolean. For example: -</para> - -<para> - -<programlisting> - f :: Foo -> Bool - f (MkFoo val fn) = fn val -</programlisting> - -</para> - -<para> -What this allows us to do is to package heterogeneous values -together with a bunch of functions that manipulate them, and then treat -that collection of packages in a uniform manner. You can express -quite a bit of object-oriented-like programming this way. -</para> - -<sect3 id="existential"> -<title>Why existential? -</title> - -<para> -What has this to do with <emphasis>existential</emphasis> quantification? -Simply that <function>MkFoo</function> has the (nearly) isomorphic type -</para> - -<para> - -<programlisting> - MkFoo :: (exists a . (a, a -> Bool)) -> Foo -</programlisting> - -</para> - -<para> -But Haskell programmers can safely think of the ordinary -<emphasis>universally</emphasis> quantified type given above, thereby avoiding -adding a new existential quantification construct. 
-</para> - -</sect3> - -<sect3 id="existential-with-context"> -<title>Existentials and type classes</title> - -<para> -An easy extension is to allow -arbitrary contexts before the constructor. For example: -</para> - -<para> - -<programlisting> -data Baz = forall a. Eq a => Baz1 a a - | forall b. Show b => Baz2 b (b -> b) -</programlisting> - -</para> - -<para> -The two constructors have the types you'd expect: -</para> - -<para> - -<programlisting> -Baz1 :: forall a. Eq a => a -> a -> Baz -Baz2 :: forall b. Show b => b -> (b -> b) -> Baz -</programlisting> - -</para> - -<para> -But when pattern matching on <function>Baz1</function> the matched values can be compared -for equality, and when pattern matching on <function>Baz2</function> the first matched -value can be converted to a string (as well as applying the function to it). -So this program is legal: -</para> - -<para> - -<programlisting> - f :: Baz -> String - f (Baz1 p q) | p == q = "Yes" - | otherwise = "No" - f (Baz2 v fn) = show (fn v) -</programlisting> - -</para> - -<para> -Operationally, in a dictionary-passing implementation, the -constructors <function>Baz1</function> and <function>Baz2</function> must store the -dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and -extract them on pattern matching. -</para> - -</sect3> - -<sect3 id="existential-records"> -<title>Record Constructors</title> - -<para> -GHC allows existentials to be used with record syntax as well. For example: - -<programlisting> -data Counter a = forall self. NewCounter - { _this :: self - , _inc :: self -> self - , _display :: self -> IO () - , tag :: a - } -</programlisting> -Here <literal>tag</literal> is a public field, with a well-typed selector -function <literal>tag :: Counter a -> a</literal>. 
The <literal>self</literal> -type is hidden from the outside; any attempt to apply <literal>_this</literal>, -<literal>_inc</literal> or <literal>_display</literal> as functions will raise a -compile-time error. In other words, <emphasis>GHC defines a record selector function -only for fields whose type does not mention the existentially-quantified variables</emphasis>. -(This example used an underscore in the fields for which record selectors -will not be defined, but that is only programming style; GHC ignores them.) -</para> - -<para> -To make use of these hidden fields, we need to create some helper functions: - -<programlisting> -inc :: Counter a -> Counter a -inc (NewCounter x i d t) = NewCounter - { _this = i x, _inc = i, _display = d, tag = t } - -display :: Counter a -> IO () -display NewCounter{ _this = x, _display = d } = d x -</programlisting> - -Now we can define counters with different underlying implementations: - -<programlisting> -counterA :: Counter String -counterA = NewCounter - { _this = 0, _inc = (1+), _display = print, tag = "A" } - -counterB :: Counter String -counterB = NewCounter - { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" } - -main = do - display (inc counterA) -- prints "1" - display (inc (inc counterB)) -- prints "##" -</programlisting> - -Record update syntax is supported for existentials (and GADTs): -<programlisting> -setTag :: Counter a -> a -> Counter a -setTag obj t = obj{ tag = t } -</programlisting> -The rule for record update is this: <emphasis> -the types of the updated fields may -mention only the universally-quantified type variables -of the data constructor. For GADTs, the field may mention only types -that appear as a simple type-variable argument in the constructor's result -type</emphasis>. 
For example: -<programlisting> -data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential -upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b -upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is - -- existentially quantified) - -data G a b where { G1 { g1::a, g2::c } :: G a [c] } -upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b -upd4 g x = g { g2=x } -- BAD (g2's type mentions c, which is not a simple - -- type-variable argument in G1's result type) -</programlisting> -</para> - -</sect3> - - -<sect3> -<title>Restrictions</title> - -<para> -There are several restrictions on the ways in which existentially-quantified -constructors can be used. -</para> - -<para> - -<itemizedlist> -<listitem> - -<para> - When pattern matching, each pattern match introduces a new, -distinct, type for each existential type variable. These types cannot -be unified with any other type, nor can they escape from the scope of -the pattern match. For example, these fragments are incorrect: - - -<programlisting> -f1 (MkFoo a f) = a -</programlisting> - - -Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal> -is the result of <function>f1</function>. One way to see why this is wrong is to -ask what type <function>f1</function> has: - - -<programlisting> - f1 :: Foo -> a -- Weird! -</programlisting> - - -What is this "<literal>a</literal>" in the result type? Clearly we don't mean -this: - - -<programlisting> - f1 :: forall a. Foo -> a -- Wrong! -</programlisting> - - -The original program is just plain wrong. Here's another sort of error: - - -<programlisting> - f2 (Baz1 a b) (Baz1 p q) = a==q -</programlisting> - - -It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but -<literal>a==q</literal> is wrong because it equates the two distinct types arising -from the two <function>Baz1</function> constructors. 
- - -</para> -</listitem> -<listitem> - -<para> -You can't pattern-match on an existentially quantified -constructor in a <literal>let</literal> or <literal>where</literal> group of -bindings. So this is illegal: - - -<programlisting> - f3 x = a==b where { Baz1 a b = x } -</programlisting> - -Instead, use a <literal>case</literal> expression: - -<programlisting> - f3 x = case x of Baz1 a b -> a==b -</programlisting> - -In general, you can only pattern-match -on an existentially-quantified constructor in a <literal>case</literal> expression or -in the patterns of a function definition. - -The reason for this restriction is really an implementation one. -Type-checking binding groups is already a nightmare without -existentials complicating the picture. Also an existential pattern -binding at the top level of a module doesn't make sense, because it's -not clear how to prevent the existentially-quantified type "escaping". -So for now, there's a simple-to-state restriction. We'll see how -annoying it is. - -</para> -</listitem> -<listitem> - -<para> -You can't use existential quantification for <literal>newtype</literal> -declarations. So this is illegal: - - -<programlisting> - newtype T = forall a. Ord a => MkT a -</programlisting> - - -Reason: a value of type <literal>T</literal> must be represented as a -pair of a dictionary for <literal>Ord t</literal> and a value of type -<literal>t</literal>. That contradicts the idea that -<literal>newtype</literal> should have no concrete representation. -You can get just the same efficiency and effect by using -<literal>data</literal> instead of <literal>newtype</literal>. If -there is no overloading involved, then there is more of a case for -allowing an existentially-quantified <literal>newtype</literal>, -because the <literal>data</literal> version does carry an -implementation cost, but single-field existentially quantified -constructors aren't much use. 
So the simple restriction (no -existential stuff on <literal>newtype</literal>) stands, unless there -are convincing reasons to change it. - - -</para> -</listitem> -<listitem> - -<para> - You can't use <literal>deriving</literal> to define instances of a -data type with existentially quantified data constructors. - -Reason: in most cases it would not make sense. For example: - -<programlisting> -data T = forall a. MkT [a] deriving( Eq ) -</programlisting> - -To derive <literal>Eq</literal> in the standard way we would need to have equality -between the single component of two <function>MkT</function> constructors: - -<programlisting> -instance Eq T where - (MkT a) == (MkT b) = ??? -</programlisting> - -But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared. -It's just about possible to imagine examples in which the derived instance -would make sense, but it seems altogether simpler simply to prohibit such -declarations. Define your own instances! -</para> -</listitem> - -</itemizedlist> - -</para> - -</sect3> -</sect2> - -<!-- ====================== Generalised algebraic data types ======================= --> - -<sect2 id="gadt-style"> -<title>Declaring data types with explicit constructor signatures</title> - -<para>When the <literal>GADTSyntax</literal> extension is enabled, -GHC allows you to declare an algebraic data type by -giving the type signatures of constructors explicitly. For example: -<programlisting> - data Maybe a where - Nothing :: Maybe a - Just :: a -> Maybe a -</programlisting> -The form is called a "GADT-style declaration" -because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>, -can only be declared using this form.</para> -<para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>). -For example, these two declarations are equivalent: -<programlisting> - data Foo = forall a. 
MkFoo a (a -> Bool) - data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' } -</programlisting> -</para> -<para>Any data type that can be declared in standard Haskell-98 syntax -can also be declared using GADT-style syntax. -The choice is largely stylistic, but GADT-style declarations differ in one important respect: -they treat class constraints on the data constructors differently. -Specifically, if the constructor is given a type-class context, that -context is made available by pattern matching. For example: -<programlisting> - data Set a where - MkSet :: Eq a => [a] -> Set a - - makeSet :: Eq a => [a] -> Set a - makeSet xs = MkSet (nub xs) - - insert :: a -> Set a -> Set a - insert a (MkSet as) | a `elem` as = MkSet as - | otherwise = MkSet (a:as) -</programlisting> -A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>) -gives rise to a <literal>(Eq a)</literal> -constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal> -(as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal> -context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores -the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so -when pattern-matching that dictionary becomes available for the right-hand side of the match. -In the example, the equality dictionary is used to satisfy the equality constraint -generated by the call to <literal>elem</literal>, so that the type of -<literal>insert</literal> itself has no <literal>Eq</literal> constraint. 
-</para> -<para> -For example, one possible application is to reify dictionaries: -<programlisting> - data NumInst a where - MkNumInst :: Num a => NumInst a - - intInst :: NumInst Int - intInst = MkNumInst - - plus :: NumInst a -> a -> a -> a - plus MkNumInst p q = p + q -</programlisting> -Here, a value of type <literal>NumInst a</literal> is equivalent -to an explicit <literal>(Num a)</literal> dictionary. -</para> -<para> -All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>. -For example, the <literal>NumInst</literal> data type above could equivalently be declared -like this: -<programlisting> - data NumInst a - = Num a => MkNumInst -</programlisting> -Notice that, unlike the situation when declaring an existential, there is -no <literal>forall</literal>, because the <literal>Num</literal> constrains the -data type's universally quantified type variable <literal>a</literal>. -A constructor may have both universal and existential type variables: for example, -the following two declarations are equivalent: -<programlisting> - data T1 a - = forall b. (Num a, Eq b) => MkT1 a b - data T2 a where - MkT2 :: (Num a, Eq b) => a -> b -> T2 a -</programlisting> -</para> -<para>All this behaviour contrasts with Haskell 98's peculiar treatment of -contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). -In Haskell 98 the definition -<programlisting> - data Eq a => Set' a = MkSet' [a] -</programlisting> -gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of -<emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching -on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint! -GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, -GHC's behaviour is much more useful, as well as much more intuitive. 
-</para> - -<para> -The rest of this section gives further details about GADT-style data -type declarations. - -<itemizedlist> -<listitem><para> -The result type of each data constructor must begin with the type constructor being defined. -If the result type of all constructors -has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal> -are distinct type variables, then the data type is <emphasis>ordinary</emphasis>; -otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>). -</para></listitem> - -<listitem><para> -As with other type signatures, you can give a single signature for several data constructors. -In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>: -<programlisting> - data T a where - T1,T2 :: a -> T a - T3 :: T a -</programlisting> -</para></listitem> - -<listitem><para> -The type signature of -each constructor is independent, and is implicitly universally quantified as usual. -In particular, the type variable(s) in the "<literal>data T a where</literal>" header -have no scope, and different constructors may have different universally-quantified type variables: -<programlisting> - data T a where -- The 'a' has no scope - T1,T2 :: b -> T b -- Means forall b. b -> T b - T3 :: T a -- Means forall a. T a -</programlisting> -</para></listitem> - -<listitem><para> -A constructor signature may mention type class constraints, which can differ for -different constructors. For example, this is fine: -<programlisting> - data T a where - T1 :: Eq b => b -> b -> T b - T2 :: (Show c, Ix c) => c -> [c] -> T c -</programlisting> -When pattern matching, these constraints are made available to discharge constraints -in the body of the match. 
For example: -<programlisting> - f :: T a -> String - f (T1 x y) | x==y = "yes" - | otherwise = "no" - f (T2 a b) = show a -</programlisting> -Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising -from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal> -and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>. -</para></listitem> - -<listitem><para> -Unlike a Haskell-98-style -data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header -have no scope. Indeed, one can write a kind signature instead: -<programlisting> - data Set :: * -> * where ... -</programlisting> -or even a mixture of the two: -<programlisting> - data Bar a :: (* -> *) -> * where ... -</programlisting> -The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Bar</literal> -like this: -<programlisting> - data Bar a (b :: * -> *) where ... -</programlisting> -</para></listitem> - - -<listitem><para> -You can use strictness annotations, in the obvious places -in the constructor type: -<programlisting> - data Term a where - Lit :: !Int -> Term Int - If :: Term Bool -> !(Term a) -> !(Term a) -> Term a - Pair :: Term a -> Term b -> Term (a,b) -</programlisting> -</para></listitem> - -<listitem><para> -You can use a <literal>deriving</literal> clause on a GADT-style data type -declaration. 
For example, these two declarations are equivalent -<programlisting> - data Maybe1 a where { - Nothing1 :: Maybe1 a ; - Just1 :: a -> Maybe1 a - } deriving( Eq, Ord ) - - data Maybe2 a = Nothing2 | Just2 a - deriving( Eq, Ord ) -</programlisting> -</para></listitem> - -<listitem><para> -The type signature may have quantified type variables that do not appear -in the result type: -<programlisting> - data Foo where - MkFoo :: a -> (a->Bool) -> Foo - Nil :: Foo -</programlisting> -Here the type variable <literal>a</literal> does not appear in the result type -of either constructor. -Although it is universally quantified in the type of the constructor, such -a type variable is often called "existential". -Indeed, the above declaration declares precisely the same type as -the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>. -</para><para> -The type may contain a class context too, of course: -<programlisting> - data Showable where - MkShowable :: Show a => a -> Showable -</programlisting> -</para></listitem> - -<listitem><para> -You can use record syntax on a GADT-style data type declaration: - -<programlisting> - data Person where - Adult :: { name :: String, children :: [Person] } -> Person - Child :: Show a => { name :: !String, funny :: a } -> Person -</programlisting> -As usual, for every constructor that has a field <literal>f</literal>, the type of -field <literal>f</literal> must be the same (modulo alpha conversion). -The <literal>Child</literal> constructor above shows that the signature -may have a context, existentially-quantified variables, and strictness annotations, -just as in the non-record case. (NB: the "type" that follows the double-colon -is not really a type, because of the record syntax and strictness annotations. -A "type" of this form can appear only in a constructor signature.) 
-</para></listitem> - -<listitem><para> -Record updates are allowed with GADT-style declarations, -but only for fields that have the following property: the type of the field -mentions no existential type variables. -</para></listitem> - -<listitem><para> -As in the case of existentials declared using the Haskell-98-like record syntax -(<xref linkend="existential-records"/>), -record-selector functions are generated only for those fields that have well-typed -selectors. -Here is the example of that section, in GADT-style syntax: -<programlisting> -data Counter a where - NewCounter :: { _this :: self - , _inc :: self -> self - , _display :: self -> IO () - , tag :: a - } -> Counter a -</programlisting> -As before, only one selector function is generated here, that for <literal>tag</literal>. -Nevertheless, you can still use all the field names in pattern matching and record construction. -</para></listitem> - -<listitem><para> -In a GADT-style data type declaration there is no obvious way to specify that a data constructor -should be infix, which makes a difference if you derive <literal>Show</literal> for the type. -(Data constructors declared infix are displayed infix by the derived <literal>show</literal>.) -So GHC implements the following design: a data constructor declared in a GADT-style data type -declaration is displayed infix by <literal>Show</literal> iff (a) it is an operator symbol, -(b) it has two arguments, and (c) it has a programmer-supplied fixity declaration. For example: -<programlisting> - infix 6 :--: - data T a where - (:--:) :: Int -> Bool -> T Int -</programlisting> -</para></listitem> -</itemizedlist></para> -</sect2> - -<sect2 id="gadt"> -<title>Generalised Algebraic Data Types (GADTs)</title> - -<para>Generalised Algebraic Data Types generalise ordinary algebraic data types -by allowing constructors to have richer return types. 
Here is an example: -<programlisting> - data Term a where - Lit :: Int -> Term Int - Succ :: Term Int -> Term Int - IsZero :: Term Int -> Term Bool - If :: Term Bool -> Term a -> Term a -> Term a - Pair :: Term a -> Term b -> Term (a,b) -</programlisting> -Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the -case with ordinary data types. This generality allows us to -write a well-typed <literal>eval</literal> function -for these <literal>Terms</literal>: -<programlisting> - eval :: Term a -> a - eval (Lit i) = i - eval (Succ t) = 1 + eval t - eval (IsZero t) = eval t == 0 - eval (If b e1 e2) = if eval b then eval e1 else eval e2 - eval (Pair e1 e2) = (eval e1, eval e2) -</programlisting> -The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>. -For example, in the right hand side of the equation -<programlisting> - eval :: Term a -> a - eval (Lit i) = ... -</programlisting> -the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point! -A precise specification of the type rules is beyond what this user manual aspires to, -but the design closely follows that described in -the paper <ulink -url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple -unification-based type inference for GADTs</ulink>, -(ICFP 2006). -The general principle is this: <emphasis>type refinement is only carried out -based on user-supplied type annotations</emphasis>. -So if no type signature is supplied for <literal>eval</literal>, no type refinement happens, -and lots of obscure error messages will -occur. However, the refinement is quite general. 
For example, if we had: -<programlisting> - eval :: Term a -> a -> a - eval (Lit i) j = i+j -</programlisting> -the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type -of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and -the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal. -</para> -<para> -These and many other examples are given in papers by Hongwei Xi, and -Tim Sheard. There is a longer introduction -<ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>, -and Ralf Hinze's -<ulink url="http://www.cs.ox.ac.uk/ralf.hinze/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers -may use different notation to that implemented in GHC. -</para> -<para> -The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with -<option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XGADTSyntax</option> -and <option>-XMonoLocalBinds</option>. -<itemizedlist> -<listitem><para> -A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>); -the old Haskell-98 syntax for data declarations always declares an ordinary data type. -The result type of each constructor must begin with the type constructor being defined, -but for a GADT the arguments to the type constructor can be arbitrary monotypes. -For example, in the <literal>Term</literal> data -type above, the type of each constructor must end with <literal>Term ty</literal>, but -the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal> -constructor). -</para></listitem> - -<listitem><para> -It is permitted to declare an ordinary algebraic data type using GADT-style syntax. 
-What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors -whose result type is not just <literal>T a b</literal>. -</para></listitem> - -<listitem><para> -You cannot use a <literal>deriving</literal> clause for a GADT; only for -an ordinary data type. -</para></listitem> - -<listitem><para> -As mentioned in <xref linkend="gadt-style"/>, record syntax is supported. -For example: -<programlisting> - data Term a where - Lit :: { val :: Int } -> Term Int - Succ :: { num :: Term Int } -> Term Int - Pred :: { num :: Term Int } -> Term Int - IsZero :: { arg :: Term Int } -> Term Bool - Pair :: { arg1 :: Term a - , arg2 :: Term b - } -> Term (a,b) - If :: { cnd :: Term Bool - , tru :: Term a - , fls :: Term a - } -> Term a -</programlisting> -However, for GADTs there is the following additional constraint: -every constructor that has a field <literal>f</literal> must have -the same result type (modulo alpha conversion). -Hence, in the above example, we cannot merge the <literal>num</literal> -and <literal>arg</literal> fields into a -single name. Although their field types are both <literal>Term Int</literal>, -their selector functions actually have different types: - -<programlisting> - num :: Term Int -> Term Int - arg :: Term Bool -> Term Int -</programlisting> -</para></listitem> - -<listitem><para> -When pattern-matching against data constructors drawn from a GADT, -for example in a <literal>case</literal> expression, the following rules apply: -<itemizedlist> -<listitem><para>The type of the scrutinee must be rigid.</para></listitem> -<listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem> -<listitem><para>The type of any free variable mentioned in any of -the <literal>case</literal> alternatives must be rigid.</para></listitem> -</itemizedlist> -A type is "rigid" if it is completely known to the compiler at its binding site. 
The easiest -way to ensure that a variable has a rigid type is to give it a type signature. -For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt"> -Simple unification-based type inference for GADTs -</ulink>. The criteria implemented by GHC are given in the Appendix. - -</para></listitem> - -</itemizedlist> -</para> - -</sect2> -</sect1> - -<!-- ====================== End of Generalised algebraic data types ======================= --> - -<sect1 id="deriving"> -<title>Extensions to the "deriving" mechanism</title> - -<sect2 id="deriving-inferred"> -<title>Inferred context for deriving clauses</title> - -<para> -The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is -legal. For example: -<programlisting> - data T0 f a = MkT0 a deriving( Eq ) - data T1 f a = MkT1 (f a) deriving( Eq ) - data T2 f a = MkT2 (f (f a)) deriving( Eq ) -</programlisting> -The natural generated <literal>Eq</literal> code would result in these instance declarations: -<programlisting> - instance Eq a => Eq (T0 f a) where ... - instance Eq (f a) => Eq (T1 f a) where ... - instance Eq (f (f a)) => Eq (T2 f a) where ... -</programlisting> -The first of these is obviously fine. The second is still fine, although less obviously. -The third is not Haskell 98, and risks losing termination of instances. -</para> -<para> -GHC takes a conservative position: it accepts the first two, but not the third. The rule is this: -each constraint in the inferred instance context must consist only of type variables, -with no repetitions. -</para> -<para> -This rule is applied regardless of flags. If you want a more exotic context, you can write -it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
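As a hedged illustration of that escape hatch, the sketch below derives <literal>Eq</literal> for <literal>T2</literal> with a hand-written context. The set of extensions shown is an assumption about what a recent GHC needs to accept this context, and the helper <literal>sameT2</literal> is hypothetical, added only to exercise the instance.

```haskell
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}

-- T2 as in the example above: the single field has type f (f a).
data T2 f a = MkT2 (f (f a))

-- The context that GHC declines to infer is written out by hand.
-- (Assumption: FlexibleContexts and UndecidableInstances are needed
-- because the constraint is not a bare type variable and repeats f.)
deriving instance Eq (f (f a)) => Eq (T2 f a)

-- Hypothetical helper that uses the derived instance.
sameT2 :: Eq (f (f a)) => T2 f a -> T2 f a -> Bool
sameT2 x y = x == y
```

With <literal>f</literal> instantiated to the list type, for instance, the context becomes <literal>Eq [[a]]</literal>, which the ordinary list instance satisfies.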
-</para> -</sect2> - -<sect2 id="stand-alone-deriving"> -<title>Stand-alone deriving declarations</title> - -<para> -GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>: -<programlisting> - data Foo a = Bar a | Baz String - - deriving instance Eq a => Eq (Foo a) -</programlisting> -The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword -<literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part. -</para> -<para> -However, standalone deriving differs from a <literal>deriving</literal> clause in a number -of important ways: -<itemizedlist> -<listitem><para>The standalone deriving declaration does not need to be in the -same module as the data type declaration. (But be aware of the dangers of -orphan instances; see <xref linkend="orphan-modules"/>.) -</para></listitem> - -<listitem><para> -You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>), -exactly as you would in an ordinary instance declaration. -(In contrast, in a <literal>deriving</literal> clause -attached to a data type declaration, the context is inferred.) -</para></listitem> - -<listitem><para> -Unlike a <literal>deriving</literal> -declaration attached to a <literal>data</literal> declaration, the instance can be more specific -than the data type (assuming you also use -<literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider -for example -<programlisting> - data Foo a = Bar a | Baz String - - deriving instance Eq a => Eq (Foo [a]) - deriving instance Eq a => Eq (Foo (Maybe a)) -</programlisting> -This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>, -but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>. 
-</para></listitem> - -<listitem><para> -Unlike a <literal>deriving</literal> -declaration attached to a <literal>data</literal> declaration, -GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate -boilerplate code for the specified class, and typechecks it. If there is a type error, it is -your problem. (GHC will show you the offending code if it has a type error.) -</para> -<para> -The merit of this is that you can derive instances for GADTs and other exotic -data types, providing only that the boilerplate code does indeed typecheck. For example: -<programlisting> - data T a where - T1 :: T Int - T2 :: T Bool - - deriving instance Show (T a) -</programlisting> -In this example, you cannot say <literal>... deriving( Show )</literal> on the -data type declaration for <literal>T</literal>, -because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate -the instance declaration using stand-alone deriving. -</para> -<para> -The down-side is that, -if the boilerplate code fails to typecheck, you will get an error message about that -code, which you did not write. Whereas, with a <literal>deriving</literal> clause -the side-conditions are necessarily more conservative, but any error message -may be more comprehensible. -</para> -</listitem> -</itemizedlist></para> - -<para> -In other ways, however, a standalone deriving obeys the same rules as ordinary deriving: -<itemizedlist> -<listitem><para> -A <literal>deriving instance</literal> declaration -must obey the same rules concerning form and termination as ordinary instance declarations, -controlled by the same flags; see <xref linkend="instance-decls"/>. -</para></listitem> - -<listitem> -<para>The stand-alone syntax is generalised for newtypes in exactly the same -way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>). 
-For example: -<programlisting> - newtype Foo a = MkFoo (State Int a) - - deriving instance MonadState Int Foo -</programlisting> -GHC always treats the <emphasis>last</emphasis> parameter of the instance -(<literal>Foo</literal> in this example) as the type whose instance is being derived. -</para></listitem> -</itemizedlist></para> - -</sect2> - -<sect2 id="deriving-extra"> -<title>Deriving instances of extra classes (<literal>Data</literal>, etc)</title> - -<para> -Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type -declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause. -In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard -classes <literal>Eq</literal>, <literal>Ord</literal>, -<literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>. -</para> -<para> -GHC extends this list with several more classes that may be automatically derived: -<itemizedlist> -<listitem><para> With <option>-XDeriveGeneric</option>, you can derive -instances of the classes <literal>Generic</literal> and -<literal>Generic1</literal>, defined in <literal>GHC.Generics</literal>. -You can use these to define generic functions, -as described in <xref linkend="generic-programming"/>. -</para></listitem> - -<listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of -the class <literal>Functor</literal>, -defined in <literal>GHC.Base</literal>. See <xref linkend="deriving-functor"/>. -</para></listitem> - -<listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of -the class <literal>Data</literal>, -defined in <literal>Data.Data</literal>. See <xref linkend="deriving-typeable"/> for -deriving <literal>Typeable</literal>. 
-</para></listitem> - -<listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of -the class <literal>Foldable</literal>, -defined in <literal>Data.Foldable</literal>. See <xref linkend="deriving-foldable"/>. -</para></listitem> - -<listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of -the class <literal>Traversable</literal>, -defined in <literal>Data.Traversable</literal>. Since the <literal>Traversable</literal> -instance dictates the instances of <literal>Functor</literal> and -<literal>Foldable</literal>, you'll probably want to derive them too, so -<option>-XDeriveTraversable</option> implies -<option>-XDeriveFunctor</option> and <option>-XDeriveFoldable</option>. -See <xref linkend="deriving-traversable"/>. -</para></listitem> - -<listitem><para> With <option>-XDeriveLift</option>, you can derive instances -of the class <literal>Lift</literal>, defined in the -<literal>Language.Haskell.TH.Syntax</literal> module of the -<literal>template-haskell</literal> package. -See <xref linkend="deriving-lift"/>. -</para></listitem> -</itemizedlist> -You can also use a standalone deriving declaration instead -(see <xref linkend="stand-alone-deriving"/>). -</para> -<para> -In each case the appropriate class must be in scope before it -can be mentioned in the <literal>deriving</literal> clause. -</para> -</sect2> - -<sect2 id="deriving-functor"> -<title>Deriving <literal>Functor</literal> instances</title> - -<para>With <option>-XDeriveFunctor</option>, one can derive -<literal>Functor</literal> instances for data types of kind -<literal>* -> *</literal>. 
For example, this declaration: - -<programlisting> -data Example a = Ex a Char (Example a) (Example Char) - deriving Functor -</programlisting> - -would generate the following instance: - -<programlisting> -instance Functor Example where - fmap f (Ex a1 a2 a3 a4) = Ex (f a1) a2 (fmap f a3) a4 -</programlisting> -</para> - -<para>The basic algorithm for <option>-XDeriveFunctor</option> walks the -arguments of each constructor of a data type, applying a mapping function -depending on the type of each argument. Suppose we are deriving -<literal>Functor</literal> for a data type whose last type parameter is -<literal>a</literal>. Then we write the derivation of <literal>fmap</literal> -code over the type variable <literal>a</literal> for type -<literal>b</literal> as <literal>$(fmap 'a 'b)</literal>. - -<itemizedlist> -<listitem><para>If the argument's type is <literal>a</literal>, then -map over it. - -<programlisting> -$(fmap 'a 'a) = f -</programlisting> -</para></listitem> - -<listitem><para>If the argument's type does not mention <literal>a</literal>, -then do nothing to it. - -<programlisting> -$(fmap 'a 'b) = \x -> x -- when b does not contain a -</programlisting> -</para></listitem> - -<listitem><para>If the argument has a tuple type, generate map code for each -of its arguments. - -<programlisting> -$(fmap 'a '(b1,b2)) = \x -> case x of (x1,x2) -> ($(fmap 'a 'b1) x1, $(fmap 'a 'b2) x2) -</programlisting> -</para></listitem> - -<listitem><para>If the argument's type is a data type that mentions -<literal>a</literal>, apply <literal>fmap</literal> to it with the generated -map code for the data type's last type parameter. 
- -<programlisting> -$(fmap 'a '(T b1 b2)) = fmap $(fmap 'a 'b2) -- when a only occurs in the last parameter, b2 -</programlisting> -</para></listitem> - -<listitem><para>If the argument has a function type, apply generated -<literal>$(fmap)</literal> code to the result type, and apply generated -<literal>$(cofmap)</literal> code to the argument type. - -<programlisting> -$(fmap 'a '(b -> c)) = \x b -> $(fmap 'a 'c) (x ($(cofmap 'a 'b) b)) -</programlisting> - -<literal>$(cofmap)</literal> is needed because the type parameter -<literal>a</literal> can occur in a contravariant position, which means we -need to derive a function like: - -<programlisting> -cofmap :: (a -> b) -> f b -> f a -</programlisting> - -This is much the same as <literal>$(fmap)</literal>, except that the -<literal>$(cofmap 'a 'a)</literal> case is an error: - -<programlisting> -$(cofmap 'a 'b) = \x -> x -- when b does not contain a -$(cofmap 'a 'a) = error "type variable in contravariant position" -$(cofmap 'a '(b1,b2)) = \x -> case x of (x1,x2) -> ($(cofmap 'a 'b1) x1, $(cofmap 'a 'b2) x2) -$(cofmap 'a '[b]) = map $(cofmap 'a 'b) -$(cofmap 'a '(T b1 b2)) = fmap $(cofmap 'a 'b2) -- when a only occurs in the last parameter, b2 -$(cofmap 'a '(b -> c)) = \x b -> $(cofmap 'a 'c) (x ($(fmap 'a 'b) b)) -</programlisting> - -For more information on contravariance, see -<ulink url="https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/DeriveFunctor#Covariantandcontravariantpositions"> -this wiki page</ulink>. -</para></listitem> -</itemizedlist> -</para> - -<para>A data type can have a derived <literal>Functor</literal> instance if: - -<itemizedlist> -<listitem><para>It has at least one type parameter. -</para></listitem> - -<listitem><para>It does not use the last type parameter contravariantly. -</para></listitem> - -<listitem><para>It does not use the last type parameter in the "wrong place" -in any of the argument data types. 
For example, in: - -<programlisting> -data Right a = Right [a] (Either Int a) -</programlisting> - -the type parameter <literal>a</literal> is only ever used as the last type -argument in <literal>[]</literal> and <literal>Either</literal>, so both -<literal>[a]</literal> and <literal>Either Int a</literal> can be -<literal>fmap</literal>ped. However, in: - -<programlisting> -data Wrong a = Wrong (Either a a) -</programlisting> - -the type variable <literal>a</literal> appears in a position other than the -last, so trying to <literal>fmap</literal> an <literal>Either a a</literal> -value would not typecheck in a <literal>Functor</literal> instance. - -Note that there are two exceptions to this rule: tuple and function types, as -described above. -</para></listitem> - -<listitem><para>Its last type variable cannot be used in a -<option>-XDatatypeContexts</option> constraint. -</para></listitem> - -<listitem><para>Its last type variable cannot be used in an -<option>-XExistentialQuantification</option> or <option>-XGADTs</option> -constraint. -</para></listitem> -</itemizedlist> - -</para> -</sect2> - -<sect2 id="deriving-foldable"> -<title>Deriving <literal>Foldable</literal> instances</title> - -<para>With <option>-XDeriveFoldable</option>, one can derive -<literal>Foldable</literal> instances for data types of kind -<literal>* -> *</literal>. For example, this declaration: - -<programlisting> -data Example a = Ex a Char (Example a) (Example Char) - deriving Foldable -</programlisting> - -would generate the following instance: - -<programlisting> -instance Foldable Example where - foldr f z (Ex a1 a2 a3 a4) = f a1 (foldr f z a3) - foldMap f (Ex a1 a2 a3 a4) = mappend (f a1) - (mappend mempty - (mappend (foldMap f a3) - mempty)) -</programlisting> - -The algorithm for <option>-XDeriveFoldable</option> is very similar to that of -<option>-XDeriveFunctor</option>, except that <literal>Foldable</literal> -instances are not possible for function types. 
The cases are: - -<programlisting> -$(foldr 'a 'b) = \x z -> z -- when b does not contain a -$(foldr 'a 'a) = f -$(foldr 'a '(b1,b2)) = \x z -> case x of (x1,x2) -> $(foldr 'a 'b1) x1 ( $(foldr 'a 'b2) x2 z ) -$(foldr 'a '(T b1 b2)) = \x z -> foldr $(foldr 'a 'b2) z x -- when a only occurs in the last parameter, b2 -</programlisting> - -Another difference between <option>-XDeriveFoldable</option> and -<option>-XDeriveFunctor</option> is that <option>-XDeriveFoldable</option> -instances can be derived for data types with existential constraints. For -example, the following data type: - -<programlisting> -data E a where - E1 :: (a ~ Int) => a -> E a - E2 :: Int -> E Int - E3 :: (a ~ Int) => a -> E Int - E4 :: (a ~ Int) => Int -> E a - -deriving instance Foldable E -</programlisting> - -would have the following <literal>Foldable</literal> instance: - -<programlisting> -instance Foldable E where - foldr f z (E1 e) = f e z - foldr f z (E2 e) = z - foldr f z (E3 e) = z - foldr f z (E4 e) = z - - foldMap f (E1 e) = f e - foldMap f (E2 e) = mempty - foldMap f (E3 e) = mempty - foldMap f (E4 e) = mempty -</programlisting> - -Notice that only the argument in <literal>E1</literal> is folded over. This is -because we only fold over constructor arguments (1) whose types are -syntactically equivalent to the last type parameter and (2) when the last type -parameter is not refined to a specific type. Only <literal>E1</literal> -satisfies both of these criteria. For more information, see -<ulink url="https://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/DeriveFunctor"> -this wiki page</ulink>. -</para> -</sect2> - -<sect2 id="deriving-traversable"> -<title>Deriving <literal>Traversable</literal> instances</title> - -<para>With <option>-XDeriveTraversable</option>, one can derive -<literal>Traversable</literal> instances for data types of kind -<literal>* -> *</literal>. 
For example, this declaration: - -<programlisting> -data Example a = Ex a Char (Example a) (Example Char) - deriving Traversable -</programlisting> - -would generate the following instance: - -<programlisting> -instance Traversable Example where - traverse f (Ex a1 a2 a3 a4) - = fmap Ex (f a1) - <*> pure a2 - <*> traverse f a3 - <*> pure a4 -</programlisting> - -The algorithm for <option>-XDeriveTraversable</option> is very similar to that -of <option>-XDeriveFunctor</option>, except that -<literal>Traversable</literal> instances are not possible for function types. -The cases are: - -<programlisting> -$(traverse 'a 'b) = pure -- when b does not contain a -$(traverse 'a 'a) = f -$(traverse 'a '(b1,b2)) = \x -> case x of (x1,x2) -> fmap (,) ($(traverse 'a 'b1) x1) <*> $(traverse 'a 'b2) x2 -$(traverse 'a '(T b1 b2)) = traverse $(traverse 'a 'b2) -- when a only occurs in the last parameter, b2 -</programlisting> -</para> -</sect2> - -<sect2 id="deriving-typeable"> -<title>Deriving <literal>Typeable</literal> instances</title> - -<para>The class <literal>Typeable</literal> is very special: -<itemizedlist> -<listitem><para> -<literal>Typeable</literal> is kind-polymorphic (see -<xref linkend="kind-polymorphism"/>). -</para></listitem> - -<listitem><para> -GHC has a custom solver for discharging constraints that involve -class <literal>Typeable</literal>, and handwritten instances are forbidden. -This ensures that the programmer cannot subvert the type system by -writing bogus instances. -</para></listitem> - -<listitem><para> -Derived instances of <literal>Typeable</literal> are ignored, -and may be reported as an error in a later version of the compiler. -</para></listitem> - -<listitem><para> -The rules for solving <literal>Typeable</literal> constraints are as follows: -<itemizedlist> -<listitem><para>A concrete type constructor applied to some types. -<programlisting> -instance (Typeable t1, .., Typeable t_n) => - Typeable (T t1 .. 
t_n) -</programlisting> -This rule works for any concrete type constructor, including type -constructors with polymorphic kinds. The only restriction is that -if the type constructor has a polymorphic kind, then it has to be applied -to all of its kind parameters, and these kinds need to be concrete -(i.e., they cannot mention kind variables). -</para></listitem> - -<listitem><para>A type variable applied to some types. -<programlisting> -instance (Typeable f, Typeable t1, .., Typeable t_n) => - Typeable (f t1 .. t_n) -</programlisting> -</para></listitem> - -<listitem><para>A concrete type literal. -<programlisting> -instance Typeable 0 -- Type natural literals -instance Typeable "Hello" -- Type-level symbols -</programlisting> -</para></listitem> -</itemizedlist> -</para></listitem> - - -</itemizedlist> - -</para> - -</sect2> - -<sect2 id="deriving-lift"> -<title>Deriving <literal>Lift</literal> instances</title> - -<para>The class <literal>Lift</literal>, unlike other derivable classes, lives -in <literal>template-haskell</literal> instead of <literal>base</literal>. -Having a data type be an instance of <literal>Lift</literal> permits its values -to be promoted to Template Haskell expressions (of type -<literal>ExpQ</literal>), which can then be spliced into Haskell source code.
-</para> - -<para>Here is an example of how one can derive <literal>Lift</literal>: - -<programlisting> -{-# LANGUAGE DeriveLift #-} -module Bar where - -import Language.Haskell.TH.Syntax - -data Foo a = Foo a | a :^: a deriving Lift - -{- -instance (Lift a) => Lift (Foo a) where - lift (Foo a) - = appE - (conE - (mkNameG_d "package-name" "Bar" "Foo")) - (lift a) - lift (u :^: v) - = infixApp - (lift u) - (conE - (mkNameG_d "package-name" "Bar" ":^:")) - (lift v) --} - ------ -{-# LANGUAGE TemplateHaskell #-} -module Baz where - -import Bar -import Language.Haskell.TH.Syntax - -foo :: Foo String -foo = $(lift $ Foo "foo") - -fooExp :: Lift a => Foo a -> Q Exp -fooExp f = [| f |] -</programlisting> - -<option>-XDeriveLift</option> also works for certain unboxed types -(<literal>Addr#</literal>, <literal>Char#</literal>, -<literal>Double#</literal>, <literal>Float#</literal>, -<literal>Int#</literal>, and <literal>Word#</literal>): - -<programlisting> -{-# LANGUAGE DeriveLift, MagicHash #-} -module Unboxed where - -import GHC.Exts -import Language.Haskell.TH.Syntax - -data IntHash = IntHash Int# deriving Lift - -{- -instance Lift IntHash where - lift (IntHash i) - = appE - (conE - (mkNameG_d "package-name" "Unboxed" "IntHash")) - (litE - (intPrimL (toInteger (I# i)))) --} -</programlisting> - -</para> - -</sect2> - -<sect2 id="newtype-deriving"> -<title>Generalised derived instances for newtypes</title> - -<para> -When you define an abstract type using <literal>newtype</literal>, you may want -the new type to inherit some instances from its representation. In -Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>, -<literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any -other classes you have to write an explicit instance declaration. 
For -example, if you define - -<programlisting> - newtype Dollars = Dollars Int -</programlisting> - -and you want to use arithmetic on <literal>Dollars</literal>, you have to -explicitly define an instance of <literal>Num</literal>: - -<programlisting> - instance Num Dollars where - Dollars a + Dollars b = Dollars (a+b) - ... -</programlisting> -All the instance does is apply and remove the <literal>newtype</literal> -constructor. It is particularly galling that, since the constructor -doesn't appear at run-time, this instance declaration defines a -dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal> -dictionary, only slower! -</para> - - -<sect3 id="generalized-newtype-deriving"> <title> Generalising the deriving clause </title> -<para> -GHC now permits such instances to be derived instead, -using the flag <option>-XGeneralizedNewtypeDeriving</option>, -so one can write -<programlisting> - newtype Dollars = Dollars Int deriving (Eq,Show,Num) -</programlisting> - -and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary -for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler -derives an instance declaration of the form - -<programlisting> - instance Num Int => Num Dollars -</programlisting> - -which just adds or removes the <literal>newtype</literal> constructor according to the type. -</para> -<para> - -We can also derive instances of constructor classes in a similar -way. For example, suppose we have implemented state and failure monad -transformers, such that - -<programlisting> - instance Monad m => Monad (State s m) - instance Monad m => Monad (Failure m) -</programlisting> -In Haskell 98, we can define a parsing monad by -<programlisting> - type Parser tok m a = State [tok] (Failure m) a -</programlisting> - -which is automatically a monad thanks to the instance declarations -above. 
With the extension, we can make the parser type abstract, -without needing to write an instance of class <literal>Monad</literal>, via - -<programlisting> - newtype Parser tok m a = Parser (State [tok] (Failure m) a) - deriving Monad -</programlisting> -In this case the derived instance declaration is of the form -<programlisting> - instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) -</programlisting> - -Notice that, since <literal>Monad</literal> is a constructor class, the -instance is a <emphasis>partial application</emphasis> of the new type, not the -entire left hand side. We can imagine that the type declaration is -"eta-converted" to generate the context of the instance -declaration. -</para> -<para> - -We can even derive instances of multi-parameter classes, provided the -newtype is the last class parameter. In this case, a ``partial -application'' of the class appears in the <literal>deriving</literal> -clause. For example, given the class - -<programlisting> - class StateMonad s m | m -> s where ... - instance Monad m => StateMonad s (State s m) where ... -</programlisting> -then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by -<programlisting> - newtype Parser tok m a = Parser (State [tok] (Failure m) a) - deriving (Monad, StateMonad [tok]) -</programlisting> - -The derived instance is obtained by completing the application of the -class to the new type: - -<programlisting> - instance StateMonad [tok] (State [tok] (Failure m)) => - StateMonad [tok] (Parser tok m) -</programlisting> -</para> -<para> - -As a result of this extension, all derived instances in newtype - declarations are treated uniformly (and implemented just by reusing -the dictionary for the representation type), <emphasis>except</emphasis> -<literal>Show</literal> and <literal>Read</literal>, which really behave differently for -the newtype and its representation. 
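As a minimal sketch of the difference just described (the module layout and the name <literal>total</literal> are illustrative assumptions, not taken from the text above), the following program derives <literal>Num</literal> by reusing the <literal>Int</literal> dictionary, while <literal>Show</literal> is still derived in the standard way and therefore prints the constructor:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- The Dollars newtype from the text above. Num is derived through the
-- newtype mechanism; Eq, Ord and Show use the standard derivation.
newtype Dollars = Dollars Int
  deriving (Eq, Ord, Show, Num)

main :: IO ()
main = do
  -- The literal 4 is converted with the derived fromInteger, and (+)
  -- is the Int addition, wrapped and unwrapped by the newtype.
  let total = Dollars 3 + 4 :: Dollars
  -- Show does not "look through" the newtype: the constructor is printed.
  print total                 -- Dollars 7
  print (total == Dollars 7)  -- True
```

If <literal>Show</literal> were derived by reusing the <literal>Int</literal> dictionary, the first line of output would be <literal>7</literal> rather than <literal>Dollars 7</literal>.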
-</para> -</sect3> - -<sect3> <title> A more precise specification </title> -<para> -An instance is derived only for declarations of these forms (after expansion of any type synonyms): - -<programlisting> - newtype T v1..vn = MkT (t vk+1..vn) deriving (C t1..tj) - newtype instance T s1..sk vk+1..vn = MkT (t vk+1..vn) deriving (C t1..tj) -</programlisting> -where - <itemizedlist> -<listitem><para> -<literal>v1..vn</literal> are type variables, and <literal>t</literal>, -<literal>s1..sk</literal>, <literal>t1..tj</literal> are types. -</para></listitem> -<listitem><para> - The <literal>(C t1..tj)</literal> is a partial application of the class <literal>C</literal>, - where the arity of <literal>C</literal> - is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument. -</para></listitem> -<listitem><para> - <literal>k</literal> is chosen so that <literal>C t1..tj (T v1...vk)</literal> is well-kinded. -(Or, in the case of a <literal>newtype instance</literal>, so that <literal>C t1..tj (T s1..sk)</literal> is -well-kinded.) -</para></listitem> -<listitem><para> - The type <literal>t</literal> is an arbitrary type. -</para></listitem> -<listitem><para> - The type variables <literal>vk+1...vn</literal> do not occur in the types <literal>t</literal>, - <literal>s1..sk</literal>, or <literal>t1..tj</literal>. -</para></listitem> -<listitem><para> - <literal>C</literal> is not <literal>Read</literal>, <literal>Show</literal>, - <literal>Typeable</literal>, or <literal>Data</literal>. These classes - should not "look through" the type or its constructor. You can still - derive these classes for a newtype, but it happens in the usual way, not - via this new mechanism. -</para></listitem> -<listitem><para> - It is safe to coerce each of the methods of <literal>C</literal>. That is, - the missing last argument to <literal>C</literal> is not used - at a nominal role in any of <literal>C</literal>'s methods. 
- (See <xref linkend="roles"/>.)</para></listitem> -</itemizedlist> -Then the derived instance declaration is of the form: -<programlisting> - instance C t1..tj t => C t1..tj (T v1...vk) -</programlisting> -As an example which does <emphasis>not</emphasis> work, consider -<programlisting> - newtype NonMonad m s = NonMonad (State s m s) deriving Monad -</programlisting> -Here we cannot derive the instance -<programlisting> - instance Monad (State s m) => Monad (NonMonad m) -</programlisting> - -because the type variable <literal>s</literal> occurs in <literal>State s m</literal>, -and so cannot be "eta-converted" away. It is a good thing that this -<literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is -not, in fact, a monad --- for the same reason. Try defining -<literal>>>=</literal> with the correct type: you won't be able to. -</para> -<para> - -Notice also that the <emphasis>order</emphasis> of class parameters becomes -important, since we can only derive instances for the last one. If the -<literal>StateMonad</literal> class above were instead defined as - -<programlisting> - class StateMonad m s | m -> s where ... -</programlisting> - -then we would not have been able to derive an instance for the -<literal>Parser</literal> type above. We hypothesise that multi-parameter -classes usually have one "main" parameter for which deriving new -instances is most interesting. -</para> -<para>Lastly, all of this applies only to classes other than -<literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>, -and <literal>Data</literal>, for which the built-in derivation applies (section -4.3.3 of the Haskell Report). -(For the standard classes <literal>Eq</literal>, <literal>Ord</literal>, -<literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether -the standard method is used or the one described here.) 
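To make the specification above concrete, here is a small sketch of a constructor-class derivation that fits the first form with <literal>k = 0</literal>. The names <literal>App</literal> and <literal>runApp</literal> are hypothetical and chosen only for illustration; <literal>Maybe</literal> plays the role of <literal>t</literal>, and the last type variable is "eta-converted" away:

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- Hypothetical newtype over Maybe. Notionally the derived instance is
--   instance Monad Maybe => Monad App
-- (and likewise for Functor and Applicative), with the variable 'a'
-- dropped from both sides.
newtype App a = App (Maybe a)
  deriving (Functor, Applicative, Monad)

-- Unwrap the newtype again.
runApp :: App a -> Maybe a
runApp (App m) = m

main :: IO ()
main =
  -- (>>=) and return are the Maybe versions, reused for App.
  print (runApp (App (Just 1) >>= \x -> return (x + 1 :: Int)))  -- Just 2
```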
-</para> -</sect3> -</sect2> - -<sect2 id="derive-any-class"> -<title>Deriving any other class</title> - -<para> -With <option>-XDeriveAnyClass</option> you can derive any other class. The -compiler will simply generate an empty instance. The instance context will be -generated according to the same rules used when deriving <literal>Eq</literal>. -This is mostly useful in classes whose <link linkend="minimal-pragma">minimal -set</link> is empty, and especially when writing -<link linkend="generic-programming">generic functions</link>. - -If you try to derive a class on a newtype, and -<option>-XGeneralizedNewtypeDeriving</option> is also on, -<option>-XDeriveAnyClass</option> takes precedence. -</para> - -</sect2> - -</sect1> - - -<!-- TYPE SYSTEM EXTENSIONS --> -<sect1 id="type-class-extensions"> -<title>Class and instance declarations</title> - -<sect2 id="multi-param-type-classes"> -<title>Class declarations</title> - -<para> -This section, and the next one, document GHC's type-class extensions. -There's lots of background in the paper <ulink -url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type -classes: exploring the design space</ulink> (Simon Peyton Jones, Mark -Jones, Erik Meijer). -</para> - -<sect3> -<title>Multi-parameter type classes</title> -<para> -Multi-parameter type classes are permitted with the flag <option>-XMultiParamTypeClasses</option>. -For example: - - -<programlisting> - class Collection c a where - union :: c a -> c a -> c a - ...etc. -</programlisting> - -</para> -</sect3> - -<sect3 id="superclass-rules"> -<title>The superclasses of a class declaration</title> - -<para> -In Haskell 98 the context of a class declaration (which introduces superclasses) -must be simple; that is, each predicate must consist of a class applied to -type variables. 
The flag <option>-XFlexibleContexts</option> -(<xref linkend="flexible-contexts"/>) -lifts this restriction, -so that the only restriction on the context in a class declaration is -that the class hierarchy must be acyclic. So these class declarations are OK: - - -<programlisting> - class Functor (m k) => FiniteMap m k where - ... - - class (Monad m, Monad (t m)) => Transform t m where - lift :: m a -> (t m) a -</programlisting> - - -</para> -<para> -As in Haskell 98, the class hierarchy must be acyclic. However, the definition -of "acyclic" involves only the superclass relationships. For example, -this is OK: - - -<programlisting> - class C a where { - op :: D b => a -> b -> b - } - - class C a => D a where { ... } -</programlisting> - - -Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a -class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It -would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.) -</para> -<para> -With the extension that adds a <link linkend="constraint-kind">kind of constraints</link>, -you can write more exotic superclass definitions. The superclass cycle check is even more -liberal in this case. For example, this is OK: - -<programlisting> - class A cls c where - meth :: cls c => c -> c - - class A B c => B c where -</programlisting> - -A superclass context for a class <literal>C</literal> is allowed if, after expanding -type synonyms to their right-hand sides, and uses of classes (other than <literal>C</literal>) -to their superclasses, <literal>C</literal> does not occur syntactically in the context. 
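The following sketch fills in the <literal>FiniteMap</literal> declaration from earlier in this section so that it can be compiled. The method <literal>lookupFM</literal> and the instance for <literal>Data.Map</literal> are assumptions made for illustration; only the class head comes from the text:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleContexts, FlexibleInstances #-}
module Main where

import qualified Data.Map as Map

-- The superclass context applies Functor to the non-variable type (m k),
-- which Haskell 98 rejects and -XFlexibleContexts accepts.
class Functor (m k) => FiniteMap m k where
  -- Hypothetical method, added only so the class has something to do.
  lookupFM :: k -> m k v -> Maybe v

-- The superclass constraint Functor (Map.Map k) is satisfied by the
-- existing Functor instance for Data.Map.
instance Ord k => FiniteMap Map.Map k where
  lookupFM = Map.lookup

main :: IO ()
main = print (lookupFM 'a' (Map.fromList [('a', 1 :: Int)]))  -- Just 1
```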
-</para> -</sect3> - - - - -<sect3 id="class-method-types"> -<title>Class method types</title> - -<para> -Haskell 98 prohibits class method types from mentioning constraints on the -class type variable, thus: -<programlisting> - class Seq s a where - fromList :: [a] -> s a - elem :: Eq a => a -> s a -> Bool -</programlisting> -The type of <literal>elem</literal> is illegal in Haskell 98, because it -contains the constraint <literal>Eq a</literal>, which constrains only the -class type variable (in this case <literal>a</literal>). -</para> -<para> -GHC lifts this restriction with the language extension <option>-XConstrainedClassMethods</option>. -The restriction is a pretty stupid one in the first place, -so <option>-XConstrainedClassMethods</option> is implied by <option>-XMultiParamTypeClasses</option>. -</para> -</sect3> - -<sect3 id="class-default-signatures"> -<title>Default method signatures</title> - -<para> -Haskell 98 allows you to define a default implementation when declaring a class: -<programlisting> - class Enum a where - enum :: [a] - enum = [] -</programlisting> -The type of the <literal>enum</literal> method is <literal>[a]</literal>, and -this is also the type of the default method. You can lift this restriction -and give another type to the default method using the flag -<option>-XDefaultSignatures</option>. 
For instance, if you have written a -generic implementation of enumeration in a class <literal>GEnum</literal> -with method <literal>genum</literal> in terms of <literal>GHC.Generics</literal>, -you can specify a default method that uses that generic implementation: -<programlisting> - class Enum a where - enum :: [a] - default enum :: (Generic a, GEnum (Rep a)) => [a] - enum = map to genum -</programlisting> -We reuse the keyword <literal>default</literal> to signal that a signature -applies to the default method only; when defining instances of the -<literal>Enum</literal> class, the original type <literal>[a]</literal> of -<literal>enum</literal> still applies. When giving an empty instance, however, -the default implementation <literal>map to genum</literal> is filled in, -and type-checked with the type -<literal>(Generic a, GEnum (Rep a)) => [a]</literal>. -</para> - -<para> -We use default signatures to simplify generic programming in GHC -(<xref linkend="generic-programming"/>). -</para> - - -</sect3> - -<sect3 id="nullary-type-classes"> -<title>Nullary type classes</title> -Nullary (no parameter) type classes are enabled with -<option>-XMultiParamTypeClasses</option>; historically, they were enabled with the -(now deprecated) <option>-XNullaryTypeClasses</option>. -Since there are no available parameters, there can be at most one instance -of a nullary class. A nullary type class might be used to document some assumption -in a type signature (such as reliance on the Riemann hypothesis) or add some -globally configurable settings in a program. For example: - -<programlisting> - class RiemannHypothesis where - assumeRH :: a -> a - - -- Deterministic version of the Miller test - -- correctness depends on the generalised Riemann hypothesis - isPrime :: RiemannHypothesis => Integer -> Bool - isPrime n = assumeRH (...) -</programlisting> - -The type signature of <literal>isPrime</literal> informs users that its correctness -depends on an unproven conjecture. 
If the function is used, the user has -to acknowledge the dependence with: - -<programlisting> - instance RiemannHypothesis where - assumeRH = id -</programlisting> - -</sect3> -</sect2> - -<sect2 id="functional-dependencies"> -<title>Functional dependencies -</title> - -<para> Functional dependencies are implemented as described by Mark Jones -in “<ulink url="http://citeseer.ist.psu.edu/jones00type.html">Type Classes with Functional Dependencies</ulink>”, Mark P. Jones, -in Proceedings of the 9th European Symposium on Programming, -ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782. -</para> -<para> -Functional dependencies are introduced by a vertical bar in the syntax of a -class declaration; e.g. -<programlisting> - class (Monad m) => MonadState s m | m -> s where ... - - class Foo a b c | a b -> c where ... -</programlisting> -There should be more documentation, but there isn't (yet). Yell if you need it. -</para> - -<sect3><title>Rules for functional dependencies </title> -<para> -In a class declaration, all of the class type variables must be reachable (in the sense -mentioned in <xref linkend="flexible-contexts"/>) -from the free variables of each method type. -For example: - -<programlisting> - class Coll s a where - empty :: s - insert :: s -> a -> s -</programlisting> - -is not OK, because the type of <literal>empty</literal> doesn't mention -<literal>a</literal>. Functional dependencies can make the type variable -reachable: -<programlisting> - class Coll s a | s -> a where - empty :: s - insert :: s -> a -> s -</programlisting> - -Alternatively <literal>Coll</literal> might be rewritten - -<programlisting> - class Coll s a where - empty :: s a - insert :: s a -> a -> s a -</programlisting> - - -which makes explicit the connection between the type of a collection of -<literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>. 
-Occasionally this really doesn't work, in which case you can split the -class like this: - - -<programlisting> - class CollE s where - empty :: s - - class CollE s => Coll s a where - insert :: s -> a -> s -</programlisting> -</para> -</sect3> - - -<sect3> -<title>Background on functional dependencies</title> - -<para>The following description of the motivation and use of functional dependencies is taken -from the Hugs user manual, reproduced here (with minor changes) by kind -permission of Mark Jones. -</para> -<para> -Consider the following class, intended as part of a -library for collection types: -<programlisting> - class Collects e ce where - empty :: ce - insert :: e -> ce -> ce - member :: e -> ce -> Bool -</programlisting> -The type variable e used here represents the element type, while ce is the type -of the container itself. Within this framework, we might want to define -instances of this class for lists or characteristic functions (both of which -can be used to represent collections of any equality type), bit sets (which can -be used to represent collections of characters), or hash tables (which can be -used to represent any collection whose elements have a hash function). Omitting -standard implementation details, this would lead to the following declarations: -<programlisting> - instance Eq e => Collects e [e] where ... - instance Eq e => Collects e (e -> Bool) where ... - instance Collects Char BitSet where ... - instance (Hashable e, Collects a ce) - => Collects e (Array Int ce) where ... -</programlisting> -All this looks quite promising; we have a class and a range of interesting -implementations. Unfortunately, there are some serious problems with the class -declaration. First, the empty function has an ambiguous type: -<programlisting> - empty :: Collects e ce => ce -</programlisting> -By "ambiguous" we mean that there is a type variable e that appears on the left -of the <literal>=></literal> symbol, but not on the right. 
The problem with -this is that, according to the theoretical foundations of Haskell overloading, -we cannot guarantee a well-defined semantics for any term with an ambiguous -type. -</para> -<para> -We can sidestep this specific problem by removing the empty member from the -class declaration. However, although the remaining members, insert and member, -do not have ambiguous types, we still run into problems when we try to use -them. For example, consider the following two functions: -<programlisting> - f x y = insert x . insert y - g = f True 'a' -</programlisting> -for which GHC infers the following types: -<programlisting> - f :: (Collects a c, Collects b c) => a -> b -> c -> c - g :: (Collects Bool c, Collects Char c) => c -> c -</programlisting> -Notice that the type for f allows the two parameters x and y to be assigned -different types, even though it attempts to insert each of the two values, one -after the other, into the same collection. If we're trying to model collections -that contain only one type of value, then this is clearly an inaccurate -type. Worse still, the definition for g is accepted, without causing a type -error. As a result, the error in this code will not be flagged at the point -where it appears. Instead, it will show up only when we try to use g, which -might even be in a different module. -</para> - -<sect4><title>An attempt to use constructor classes</title> - -<para> -Faced with the problems described above, some Haskell programmers might be -tempted to use something like the following version of the class declaration: -<programlisting> - class Collects e c where - empty :: c e - insert :: e -> c e -> c e - member :: e -> c e -> Bool -</programlisting> -The key difference here is that we abstract over the type constructor c that is -used to form the collection type c e, and not over that collection type itself, -represented by ce in the original class declaration. 
This avoids the immediate -problems that we mentioned above: empty has type <literal>Collects e c => c -e</literal>, which is not ambiguous. -</para> -<para> -The function f from the previous section has a more accurate type: -<programlisting> - f :: (Collects e c) => e -> e -> c e -> c e -</programlisting> -The function g from the previous section is now rejected with a type error as -we would hope because the type of f does not allow the two arguments to have -different types. -This, then, is an example of a multiple parameter class that does actually work -quite well in practice, without ambiguity problems. -There is, however, a catch. This version of the Collects class is nowhere near -as general as the original class seemed to be: only one of the four instances -for <literal>Collects</literal> -given above can be used with this version of Collects because only one of -them---the instance for lists---has a collection type that can be written in -the form c e, for some type constructor c, and element type e. -</para> -</sect4> - -<sect4><title>Adding functional dependencies</title> - -<para> -To get a more useful version of the Collects class, Hugs provides a mechanism -that allows programmers to specify dependencies between the parameters of a -multiple parameter class (For readers with an interest in theoretical -foundations and previous work: The use of dependency information can be seen -both as a generalisation of the proposal for `parametric type classes' that was -put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's -later framework for "improvement" of qualified types. The -underlying ideas are also discussed in a more theoretical and abstract setting -in a manuscript [implparam], where they are identified as one point in a -general design space for systems of implicit parameterisation.). - -To start with an abstract example, consider a declaration such as: -<programlisting> - class C a b where ... 
-</programlisting> -which tells us simply that C can be thought of as a binary relation on types -(or type constructors, depending on the kinds of a and b). Extra clauses can be -included in the definition of classes to add information about dependencies -between parameters, as in the following examples: -<programlisting> - class D a b | a -> b where ... - class E a b | a -> b, b -> a where ... -</programlisting> -The notation <literal>a -> b</literal> used here between the | and where -symbols --- not to be -confused with a function type --- indicates that the a parameter uniquely -determines the b parameter, and might be read as "a determines b." Thus D is -not just a relation, but actually a (partial) function. Similarly, from the two -dependencies that are included in the definition of E, we can see that E -represents a (partial) one-one mapping between types. -</para> -<para> -More generally, dependencies take the form <literal>x1 ... xn -> y1 ... ym</literal>, -where x1, ..., xn, and y1, ..., ym are type variables with n>0 and -m>=0, meaning that the y parameters are uniquely determined by the x -parameters. Spaces can be used as separators if more than one variable appears -on any single side of a dependency, as in <literal>t -> a b</literal>. Note that a class may be -annotated with multiple dependencies using commas as separators, as in the -definition of E above. Some dependencies that we can write in this notation are -redundant, and will be rejected because they don't serve any useful -purpose, and may instead indicate an error in the program. Examples of -dependencies like this include <literal>a -> a </literal>, -<literal>a -> a a </literal>, -<literal>a -> </literal>, etc. There can also be -some redundancy if multiple dependencies are given, as in -<literal>a->b</literal>, - <literal>b->c </literal>, <literal>a->c </literal>, -in which some subset implies the remaining dependencies. Examples like this are -not treated as errors. 
Note that dependencies appear only in class -declarations, and not in any other part of the language. In particular, the -syntax for instance declarations, class constraints, and types is completely -unchanged. -</para> -<para> -By including dependencies in a class declaration, we provide a mechanism for -the programmer to specify each multiple parameter class more precisely. The -compiler, on the other hand, is responsible for ensuring that the set of -instances that are in scope at any given point in the program is consistent -with any declared dependencies. For example, the following pair of instance -declarations cannot appear together in the same scope because they violate the -dependency for D, even though either one on its own would be acceptable: -<programlisting> - instance D Bool Int where ... - instance D Bool Char where ... -</programlisting> -Note also that the following declaration is not allowed, even by itself: -<programlisting> - instance D [a] b where ... -</programlisting> -The problem here is that this instance would allow one particular choice of [a] -to be associated with more than one choice for b, which contradicts the -dependency specified in the definition of D. More generally, this means that, -in any instance of the form: -<programlisting> - instance D t s where ... -</programlisting> -for some particular types t and s, the only variables that can appear in s are -the ones that appear in t, and hence, if the type t is known, then s will be -uniquely determined. -</para> -<para> -The benefit of including dependency information is that it allows us to define -more general multiple parameter classes, without ambiguity problems, and with -the benefit of more accurate types. 
To illustrate this, we return to the -collection class example, and annotate the original definition of <literal>Collects</literal> -with a simple dependency: -<programlisting> - class Collects e ce | ce -> e where - empty :: ce - insert :: e -> ce -> ce - member :: e -> ce -> Bool -</programlisting> -The dependency <literal>ce -> e</literal> here specifies that the type e of elements is uniquely -determined by the type of the collection ce. Note that both parameters of -Collects are of kind *; there are no constructor classes here. Note too that -all of the instances of Collects that we gave earlier can be used -together with this new definition. -</para> -<para> -What about the ambiguity problems that we encountered with the original -definition? The empty function still has type Collects e ce => ce, but it is no -longer necessary to regard that as an ambiguous type: Although the variable e -does not appear on the right of the => symbol, the dependency for class -Collects tells us that it is uniquely determined by ce, which does appear on -the right of the => symbol. Hence the context in which empty is used can still -give enough information to determine types for both ce and e, without -ambiguity. More generally, we need only regard a type as ambiguous if it -contains a variable on the left of the => that is not uniquely determined -(either directly or indirectly) by the variables on the right. -</para> -<para> -Dependencies also help to produce more accurate types for user-defined -functions, and hence to provide earlier detection of errors, and less cluttered -types for programmers to work with. Recall the previous definition for a -function f: -<programlisting> - f x y = insert x . insert y -</programlisting> -for which we originally obtained a type: -<programlisting> - f :: (Collects a c, Collects b c) => a -> b -> c -> c -</programlisting> -Given the dependency information that we have for Collects, however, we can -deduce that a and b must be equal because they both appear as the first -parameter in a Collects constraint with the same second parameter c. Hence we -can infer a shorter and more accurate type for f: -<programlisting> - f :: (Collects a c) => a -> a -> c -> c -</programlisting> -In a similar way, the earlier definition of g will now be flagged as a type error. -</para> -<para> -Although we have given only a few examples here, it should be clear that the -addition of dependency information can help to make multiple parameter classes -more useful in practice, avoiding ambiguity problems, and allowing more general -sets of instance declarations. -</para> -</sect4> -</sect3> -</sect2> - -<sect2 id="instance-decls"> -<title>Instance declarations</title> - -<para>An instance declaration has the form -<screen> - instance ( <replaceable>assertion</replaceable><subscript>1</subscript>, ..., <replaceable>assertion</replaceable><subscript>n</subscript>) => <replaceable>class</replaceable> <replaceable>type</replaceable><subscript>1</subscript> ... <replaceable>type</replaceable><subscript>m</subscript> where ... -</screen> -The part before the "<literal>=></literal>" is the -<emphasis>context</emphasis>, while the part after the -"<literal>=></literal>" is the <emphasis>head</emphasis> of the instance declaration. -</para> - -<sect3 id="instance-resolution"> -<title>Instance resolution</title> - -<para> -When GHC tries to resolve, say, the constraint <literal>C Int Bool</literal>, -it tries to match every instance declaration against the -constraint, -by instantiating the head of the instance declaration. Consider -these declarations: -<programlisting> - instance context1 => C Int a where ... 
-- (A) - instance context2 => C a Bool where ... -- (B) -</programlisting> -GHC's default behaviour is that <emphasis>exactly one instance must match the -constraint it is trying to resolve</emphasis>. -For example, the constraint <literal>C Int Bool</literal> matches instances (A) and (B), -and hence would be rejected; while <literal>C Int Char</literal> matches only (A) -and hence (A) is chosen.</para> - -<para> -Notice that -<itemizedlist> -<listitem><para> -When matching, GHC takes -no account of the context of the instance declaration -(<literal>context1</literal> etc). -</para></listitem> -<listitem><para> -It is fine for there to be a <emphasis>potential</emphasis> of overlap (by -including both declarations (A) and (B), say); an error is only reported if a -particular constraint matches more than one. -</para></listitem> -</itemizedlist> -See also <xref linkend="instance-overlap"/> for flags that loosen the -instance resolution rules. -</para> - -</sect3> - -<sect3 id="flexible-instance-head"> -<title>Relaxed rules for the instance head</title> - -<para> -In Haskell 98 the head of an instance declaration -must be of the form <literal>C (T a1 ... an)</literal>, where -<literal>C</literal> is the class, <literal>T</literal> is a data type constructor, -and the <literal>a1 ... an</literal> are distinct type variables. -In the case of multi-parameter type classes, this rule applies to each parameter of -the instance head. (Arguably it should be OK if just one has this form and the others -are type variables, but that's the rules at the moment.)</para> - -<para>GHC relaxes this rule in two ways. -<itemizedlist> -<listitem><para> -With the <option>-XTypeSynonymInstances</option> flag, instance heads may use type -synonyms. As always, using a type synonym is just shorthand for -writing the RHS of the type synonym definition. For example: -<programlisting> - type Point a = (a,a) - instance C (Point a) where ... -</programlisting> -is legal. 
The instance declaration is equivalent to -<programlisting> - instance C (a,a) where ... -</programlisting> -As always, type synonyms -must be fully applied. You cannot, for example, write: -<programlisting> - instance Monad Point where ... -</programlisting> -</para></listitem> - -<listitem> -<para> -The <option>-XFlexibleInstances</option> flag allows the head of the instance -declaration to mention arbitrary nested types. -For example, this becomes a legal instance declaration -<programlisting> - instance C (Maybe Int) where ... -</programlisting> -See also the <link linkend="instance-overlap">rules on overlap</link>. -</para> -<para> -The <option>-XFlexibleInstances</option> flag implies <option>-XTypeSynonymInstances</option>. -</para></listitem> -</itemizedlist> -</para> -<para> -However, the instance declaration must still conform to the rules for instance -termination: see <xref linkend="instance-termination"/>. -</para> -</sect3> - -<sect3 id="instance-rules"> -<title>Relaxed rules for instance contexts</title> - -<para>In Haskell 98, the class constraints in the context of the instance declaration -must be of the form <literal>C a</literal> where <literal>a</literal> -is a type variable that occurs in the head. -</para> - -<para> -The <option>-XFlexibleContexts</option> flag relaxes this rule, as well -as relaxing the corresponding rule for type signatures (see <xref linkend="flexible-contexts"/>). -Specifically, <option>-XFlexibleContexts</option>, allows (well-kinded) class constraints -of form <literal>(C t1 ... tn)</literal> in the context of an instance declaration. -</para> -<para> -Notice that the flag does not affect equality constraints in an instance context; -they are permitted by <option>-XTypeFamilies</option> or <option>-XGADTs</option>. -</para> -<para> -However, the instance declaration must still conform to the rules for instance -termination: see <xref linkend="instance-termination"/>. 
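As an illustrative sketch of a relaxed instance context (the data type <literal>Sized</literal> and its fields are assumptions made for this example), the instance below uses the constraint <literal>Show (s a)</literal>, which is not of the Haskell 98 form <literal>C a</literal> and therefore needs <option>-XFlexibleContexts</option>. It still satisfies the termination rules, because the constraint is smaller than the head:

```haskell
{-# LANGUAGE FlexibleContexts #-}
module Main where

-- Hypothetical container that records a size next to some structure (s a).
data Sized s a = Sized Int (s a)

-- The context mentions the non-variable type (s a).
instance Show (s a) => Show (Sized s a) where
  show (Sized n xs) = "Sized " ++ show n ++ " " ++ show xs

main :: IO ()
main = putStrLn (show (Sized 2 [True, False]))  -- Sized 2 [True,False]
```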
-</para> - -</sect3> - -<sect3 id="instance-termination"> -<title>Instance termination rules</title> - -<para> -Regardless of <option>-XFlexibleInstances</option> and <option>-XFlexibleContexts</option>, -instance declarations must conform to some rules that ensure that instance resolution -will terminate. The restrictions can be lifted with <option>-XUndecidableInstances</option> -(see <xref linkend="undecidable-instances"/>). -</para> -<para> -The rules are these: -<orderedlist> -<listitem><para> -The Paterson Conditions: for each class constraint <literal>(C t1 ... tn)</literal> in the context: -<orderedlist> -<listitem><para>No type variable has more occurrences in the constraint than in the head.</para></listitem> -<listitem><para>The constraint has fewer constructors and variables (taken together - and counting repetitions) than the head. -</para></listitem> -<listitem><para>The constraint mentions no type functions. -A type function application can in principle expand to a -type of arbitrary size, and so is rejected out of hand. -</para></listitem> -</orderedlist> -</para></listitem> - -<listitem><para>The Coverage Condition. For each functional dependency, -<replaceable>tvs</replaceable><subscript>left</subscript> <literal>-></literal> -<replaceable>tvs</replaceable><subscript>right</subscript>, of the class, -every type variable in -S(<replaceable>tvs</replaceable><subscript>right</subscript>) must appear in -S(<replaceable>tvs</replaceable><subscript>left</subscript>), where S is the -substitution mapping each type variable in the class declaration to the -corresponding type in the instance head. -</para></listitem> -</orderedlist> -These restrictions ensure that instance resolution terminates: each reduction -step makes the problem smaller by at least one -constructor. 
-You can find lots of background material about the reason for these -restrictions in the paper <ulink -url="http://research.microsoft.com/%7Esimonpj/papers/fd%2Dchr/"> -Understanding functional dependencies via Constraint Handling Rules</ulink>. -</para> -<para> -For example, these are OK: -<programlisting> - instance C Int [a] -- Multiple parameters - instance Eq (S [a]) -- Structured type in head - - -- Repeated type variable in head - instance C4 a a => C4 [a] [a] - instance Stateful (ST s) (MutVar s) - - -- Head can consist of type variables only - instance C a - instance (Eq a, Show b) => C2 a b - - -- Non-type variables in context - instance Show (s a) => Show (Sized s a) - instance C2 Int a => C3 Bool [a] - instance C2 Int a => C3 [a] b -</programlisting> -But these are not: -<programlisting> - -- Context assertion no smaller than head - instance C a => C a where ... - -- (C b b) has more occurrences of b than the head - instance C b b => Foo [b] where ... -</programlisting> -</para> - -<para> -The same restrictions apply to instances generated by -<literal>deriving</literal> clauses. Thus the following is accepted: -<programlisting> - data MinHeap h a = H a (h a) - deriving (Show) -</programlisting> -because the derived instance -<programlisting> - instance (Show a, Show (h a)) => Show (MinHeap h a) -</programlisting> -conforms to the above rules. -</para> - -<para> -A useful idiom permitted by the above rules is as follows. -If one allows overlapping instance declarations then it's quite -convenient to have a "default instance" declaration that applies if -something more specific does not: -<programlisting> - instance C a where - op = ... -- Default -</programlisting> -</para> -</sect3> - -<sect3 id="undecidable-instances"> -<title>Undecidable instances</title> - -<para> -Sometimes even the termination rules of <xref linkend="instance-termination"/> are too onerous. 
-So GHC allows you to experiment with more liberal rules: if you use -the experimental flag <option>-XUndecidableInstances</option> -<indexterm><primary>-XUndecidableInstances</primary></indexterm>, -both the Paterson Conditions and the Coverage Condition -(described in <xref linkend="instance-termination"/>) are lifted. -Termination is still ensured by having a -fixed-depth recursion stack. If you exceed the stack depth you get a -sort of backtrace, and the opportunity to increase the stack depth -with <option>-freduction-depth=</option><emphasis>N</emphasis>. -However, if you should exceed the default reduction depth limit, -it is probably best just to disable depth checking, with -<option>-freduction-depth=0</option>. The exact depth your program -requires depends on minutiae of your code, and it may change between -minor GHC releases. The safest bet for released code -- if you're sure -that it should compile in finite time -- is just to disable the check. -</para> - -<para> -For example, sometimes you might want to use the following to get the -effect of a "class synonym": -<programlisting> - class (C1 a, C2 a, C3 a) => C a where { } - - instance (C1 a, C2 a, C3 a) => C a where { } -</programlisting> -This allows you to write shorter signatures: -<programlisting> - f :: C a => ... -</programlisting> -instead of -<programlisting> - f :: (C1 a, C2 a, C3 a) => ... -</programlisting> -The restrictions on functional dependencies (<xref -linkend="functional-dependencies"/>) are particularly troublesome. -It is tempting to introduce type variables in the context that do not appear in -the head, something that is excluded by the normal rules. For example: -<programlisting> - class HasConverter a b | a -> b where - convert :: a -> b - - data Foo a = MkFoo a - - instance (HasConverter a b,Show b) => Show (Foo a) where - show (MkFoo value) = show (convert value) -</programlisting> -This is dangerous territory, however. 
Here, for example, is a program that would make the -typechecker loop: -<programlisting> - class D a - class F a b | a->b - instance F [a] [[a]] - instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head -</programlisting> -Similarly, it can be tempting to lift the coverage condition: -<programlisting> - class Mul a b c | a b -> c where - (.*.) :: a -> b -> c - - instance Mul Int Int Int where (.*.) = (*) - instance Mul Int Float Float where x .*. y = fromIntegral x * y - instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v -</programlisting> -The third instance declaration does not obey the coverage condition; -and indeed the (somewhat strange) definition: -<programlisting> - f = \ b x y -> if b then x .*. [y] else y -</programlisting> -makes instance inference go into a loop, because it requires the constraint -<literal>(Mul a [b] b)</literal>. -</para> - -<para> -The <option>-XUndecidableInstances</option> flag is also used to lift some of the -restrictions imposed on type family instances. See <xref linkend="type-family-decidability"/>. -</para> - -</sect3> - - -<sect3 id="instance-overlap"> -<title>Overlapping instances</title> - -<para> -In general, as discussed in <xref linkend="instance-resolution"/>, -<emphasis>GHC requires that it be unambiguous which instance -declaration -should be used to resolve a type-class constraint</emphasis>. -GHC also provides a way to loosen -the instance resolution, by -allowing more than one instance to match, <emphasis>provided there is a most -specific one</emphasis>. Moreover, it can be loosened further, by allowing more than one instance to match -irrespective of whether there is a most specific one. -This section gives the details. -</para> -<para> -To control the choice of instance, it is possible to specify the overlap behavior for individual -instances with a pragma, written immediately after the -<literal>instance</literal> keyword. 
The pragma may be one of: -<literal>{-# OVERLAPPING #-}</literal>, -<literal>{-# OVERLAPPABLE #-}</literal>, -<literal>{-# OVERLAPS #-}</literal>, -or <literal>{-# INCOHERENT #-}</literal>. -</para> -<para> -The matching behaviour is also influenced by two module-level language extension flags: <option>-XOverlappingInstances</option> -<indexterm><primary>-XOverlappingInstances -</primary></indexterm> -and <option>-XIncoherentInstances</option> -<indexterm><primary>-XIncoherentInstances -</primary></indexterm>. These flags are now deprecated (since GHC 7.10) in favour of -the fine-grained per-instance pragmas. -</para> - -<para> -A more precise specification is as follows. -The willingness to be overlapped or incoherent is a property of -the <emphasis>instance declaration</emphasis> itself, controlled as follows: -<itemizedlist> -<listitem><para>An instance is <emphasis>incoherent</emphasis> if: it has an <literal>INCOHERENT</literal> pragma; or if the instance has no pragma and it appears in a module compiled with <literal>-XIncoherentInstances</literal>. -</para></listitem> -<listitem><para>An instance is <emphasis>overlappable</emphasis> if: it has an <literal>OVERLAPPABLE</literal> or <literal>OVERLAPS</literal> pragma; or if the instance has no pragma and it appears in a module compiled with <literal>-XOverlappingInstances</literal>; or if the instance is incoherent. -</para></listitem> -<listitem><para>An instance is <emphasis>overlapping</emphasis> if: it has an <literal>OVERLAPPING</literal> or <literal>OVERLAPS</literal> pragma; or if the instance has no pragma and it appears in a module compiled with <literal>-XOverlappingInstances</literal>; or if the instance is incoherent. -</para></listitem> -</itemizedlist> -</para> - -<para> -Now suppose that, in some client module, we are searching for an instance of the -<emphasis>target constraint</emphasis> <literal>(C ty1 .. tyn)</literal>. -The search works like this. 
-<itemizedlist> -<listitem><para> -Find all instances I that <emphasis>match</emphasis> the target constraint; -that is, the target constraint is a substitution instance of I. These -instance declarations are the <emphasis>candidates</emphasis>. -</para></listitem> - -<listitem><para> -Eliminate any candidate IX for which both of the following hold: - -<itemizedlist> - <listitem><para>There is another candidate IY that is strictly more specific; - that is, IY is a substitution instance of IX but not vice versa. - </para></listitem> - <listitem><para> - Either IX is <emphasis>overlappable</emphasis>, or IY is - <emphasis>overlapping</emphasis>. (This "either/or" design, rather than a "both/and" design, - allows a client to deliberately override an instance from a library, without requiring a change to the library.) - </para></listitem> - </itemizedlist> -</para> -</listitem> - -<listitem><para> -If exactly one non-incoherent candidate remains, select it. If all -remaining candidates are incoherent, select an arbitrary -one. Otherwise the search fails (i.e. when more than one surviving candidate is not incoherent). -</para></listitem> - -<listitem><para> -If the selected candidate (from the previous step) is incoherent, the search succeeds, returning that candidate. -</para></listitem> - -<listitem><para> -If not, find all instances that <emphasis>unify</emphasis> with the target -constraint, but do not <emphasis>match</emphasis> it. -Such non-candidate instances might match when the target constraint is further -instantiated. If all of them are incoherent, the search succeeds, returning the selected candidate; -if not, the search fails. -</para></listitem> - -</itemizedlist> -Notice that these rules are not influenced by flag settings in the client module, where -the instances are <emphasis>used</emphasis>. -These rules make it possible for a library author to design a library that relies on -overlapping instances without the client having to know. 
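As a minimal sketch of the search in action (the class <literal>Describe</literal> and its instances are hypothetical names, not part of any library), consider two instances where one is strictly more specific than the other:
<programlisting>
{-# LANGUAGE FlexibleInstances #-}
module Main where

-- Hypothetical class, used only to illustrate the search.
class Describe a where
  describe :: a -> String

-- General instance: declares itself willing to be overlapped.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe _ = "some list"

-- Strictly more specific instance: a substitution instance of the one above.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ show s

main :: IO ()
main = do
  putStrLn (describe [True, False])  -- only Describe [a] matches
  putStrLn (describe "hi")           -- both match; Describe [a] is eliminated
</programlisting>
For the target constraint <literal>Describe [Char]</literal> both instances are candidates, and the general one is eliminated in the second step. Under the "either/or" rule one of the two pragmas would be enough; both are written here for clarity.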
-</para> -<para> -Errors are reported <emphasis>lazily</emphasis> (when attempting to solve a constraint), rather than <emphasis>eagerly</emphasis> -(when the instances themselves are defined). Consider, for example -<programlisting> - instance C Int b where .. - instance C a Bool where .. -</programlisting> -These potentially overlap, but GHC will not complain about the instance declarations -themselves, regardless of flag settings. If we later try to solve the constraint -<literal>(C Int Char)</literal> then only the first instance matches, and all is well. -Similarly with <literal>(C Bool Bool)</literal>. But if we try to solve <literal>(C Int Bool)</literal>, -both instances match and an error is reported. -</para> - -<para> -As a more substantial example of the rules in action, consider -<programlisting> - instance {-# OVERLAPPABLE #-} context1 => C Int b where ... -- (A) - instance {-# OVERLAPPABLE #-} context2 => C a Bool where ... -- (B) - instance {-# OVERLAPPABLE #-} context3 => C a [b] where ... -- (C) - instance {-# OVERLAPPING #-} context4 => C Int [Int] where ... -- (D) -</programlisting> -Now suppose that the type inference -engine needs to solve the constraint -<literal>C Int [Int]</literal>. This constraint matches instances (A), (C) and (D), but the last -is more specific, and hence is chosen. -</para> -<para>If (D) did not exist then (A) and (C) would still be matched, but neither is -most specific. In that case, the program would be rejected, unless -<option>-XIncoherentInstances</option> is enabled, in which case it would be accepted and (A) or -(C) would be chosen arbitrarily. -</para> -<para> -An instance declaration is <emphasis>more specific</emphasis> than another iff -the head of the former is a substitution instance of the latter. For example -(D) is "more specific" than (C) because you can get from (C) to (D) by -substituting <literal>a:=Int</literal> and <literal>b:=Int</literal>. -</para> -<para> -GHC is conservative about committing to an overlapping instance. 
For example: -<programlisting> - f :: [b] -> [b] - f x = ... -</programlisting> -Suppose that from the RHS of <literal>f</literal> we get the constraint -<literal>C b [b]</literal>. But -GHC does not commit to instance (C), because in a particular -call of <literal>f</literal>, <literal>b</literal> might be instantiated -to <literal>Int</literal>, in which case instance (D) would be more specific still. -So GHC rejects the program.</para> -<para> -If, however, you add the flag <option>-XIncoherentInstances</option> when -compiling the module that contains (D), GHC will instead pick (C), without -complaining about the problem of subsequent instantiations. -</para> -<para> -Notice that we gave a type signature to <literal>f</literal>, so GHC had to -<emphasis>check</emphasis> that <literal>f</literal> has the specified type. -Suppose instead we do not give a type signature, asking GHC to <emphasis>infer</emphasis> -it instead. In this case, GHC will refrain from -simplifying the constraint <literal>C b [b]</literal> (for the same reason -as before) but, rather than rejecting the program, it will infer the type -<programlisting> - f :: C b [b] => [b] -> [b] -</programlisting> -That postpones the question of which instance to pick to the -call site for <literal>f</literal> -by which time more is known about the type <literal>b</literal>. -You can write this type signature yourself if you use the -<link linkend="flexible-contexts"><option>-XFlexibleContexts</option></link> -flag. -</para> -<para> -Exactly the same situation can arise in instance declarations themselves. Suppose we have -<programlisting> - class Foo a where - f :: a -> a - instance Foo [b] where - f x = ... -</programlisting> -and, as before, the constraint <literal>C Int [b]</literal> arises from <literal>f</literal>'s -right hand side. 
GHC will reject the instance, complaining as before that it does not know how to resolve -the constraint <literal>C Int [b]</literal>, because it matches more than one instance -declaration. The solution is to postpone the choice by adding the constraint to the context -of the instance declaration, thus: -<programlisting> - instance C Int [b] => Foo [b] where - f x = ... -</programlisting> -(You need <link linkend="instance-rules"><option>-XFlexibleContexts</option></link> to do this.) -</para> -<para> -Warning: overlapping instances must be used with care. They -can give rise to incoherence (i.e. different instance choices are made -in different parts of the program) even without <option>-XIncoherentInstances</option>. Consider: -<programlisting> -{-# LANGUAGE OverlappingInstances #-} -module Help where - - class MyShow a where - myshow :: a -> String - - instance MyShow a => MyShow [a] where - myshow xs = concatMap myshow xs - - showHelp :: MyShow a => [a] -> String - showHelp xs = myshow xs - -{-# LANGUAGE FlexibleInstances, OverlappingInstances #-} -module Main where - import Help - - data T = MkT - - instance MyShow T where - myshow x = "Used generic instance" - - instance MyShow [T] where - myshow xs = "Used more specific instance" - - main = do { print (myshow [MkT]); print (showHelp [MkT]) } -</programlisting> -In function <literal>showHelp</literal> GHC sees no overlapping -instances, and so uses the <literal>MyShow [a]</literal> instance -without complaint. In the call to <literal>myshow</literal> in <literal>main</literal>, -GHC resolves the <literal>MyShow [T]</literal> constraint using the overlapping -instance declaration in module <literal>Main</literal>. 
As a result, -the program prints -<programlisting> - "Used more specific instance" - "Used generic instance" -</programlisting> -(An alternative possible behaviour, not currently implemented, -would be to reject module <literal>Help</literal> -on the grounds that a later instance declaration might overlap the local one.) -</para> -</sect3> - -<sect3 id="instance-sigs"> -<title>Instance signatures: type signatures in instance declarations</title> -<para>In Haskell, you can't write a type signature in an instance declaration, but it -is sometimes convenient to do so, and the language extension <option>-XInstanceSigs</option> -allows you to do so. For example: -<programlisting> - data T a = MkT a a - instance Eq a => Eq (T a) where - (==) :: T a -> T a -> Bool -- The signature - (==) (MkT x1 x2) (MkT y1 y2) = x1==y1 && x2==y2 -</programlisting> -</para> -Some details -<itemizedlist> -<listitem><para> -The type signature in the instance declaration must be more polymorphic than (or the same as) -the one in the class declaration, instantiated with the instance type. -For example, this is fine: -<programlisting> - instance Eq a => Eq (T a) where - (==) :: forall b. b -> b -> Bool - (==) x y = True -</programlisting> -Here the signature in the instance declaration is more polymorphic than that -required by the instantiated class method. -</para> -</listitem> - -<listitem><para> -The code for the method in the instance declaration is typechecked against the type signature -supplied in the instance declaration, as you would expect. So if the instance signature -is more polymorphic than required, the code must be too. -</para></listitem> - -<listitem><para> -One stylistic reason for wanting to write a type signature is simple documentation. Another -is that you may want to bring scoped type variables into scope. For example: -<programlisting> -class C a where - foo :: b -> a -> (a, [b]) - -instance C a => C (T a) where - foo :: forall b. 
b -> T a -> (T a, [b]) - foo x (T y) = (T y, xs) - where - xs :: [b] - xs = [x,x,x] -</programlisting> -Provided that you also specify <option>-XScopedTypeVariables</option> -(<xref linkend="scoped-type-variables"/>), -the <literal>forall b</literal> scopes over the definition of <literal>foo</literal>, -and in particular over the type signature for <literal>xs</literal>. -</para></listitem> -</itemizedlist> -</sect3> - -</sect2> - -<sect2 id="overloaded-strings"> -<title>Overloaded string literals -</title> - -<para> -GHC supports <emphasis>overloaded string literals</emphasis>. Normally a -string literal has type <literal>String</literal>, but with overloaded string -literals enabled (with <literal>-XOverloadedStrings</literal>) - a string literal has type <literal>(IsString a) => a</literal>. -</para> -<para> - This means that the usual string syntax can be used, e.g., - for <literal>ByteString</literal>, <literal>Text</literal>, -and other variations of string-like types. String literals behave very much -like integer literals, i.e., they can be used in both expressions and patterns. -If used in a pattern the literal will be replaced by an equality test, in the same -way as an integer literal is. -</para> -<para> -The class <literal>IsString</literal> is defined as: -<programlisting> -class IsString a where - fromString :: String -> a -</programlisting> -The only predefined instance is the obvious one to make strings work as usual: -<programlisting> -instance IsString [Char] where - fromString cs = cs -</programlisting> -The class <literal>IsString</literal> is not in scope by default. If you want to mention -it explicitly (for example, to give an instance declaration for it), you can import it -from module <literal>GHC.Exts</literal>. 
-</para> -<para> -Haskell's defaulting mechanism (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>) -is extended to cover string literals, when <option>-XOverloadedStrings</option> is specified. -Specifically: -<itemizedlist> -<listitem><para> -Each type in a <literal>default</literal> declaration must be an -instance of <literal>Num</literal> <emphasis>or</emphasis> of <literal>IsString</literal>. -</para></listitem> - -<listitem><para> -If no <literal>default</literal> declaration is given, then it is just as if the module -contained the declaration <literal>default( Integer, Double, String)</literal>. -</para></listitem> - -<listitem><para> -The standard defaulting rule -is extended thus: defaulting applies when all the unresolved constraints involve standard classes -<emphasis>or</emphasis> <literal>IsString</literal>; and at least one is a numeric class -<emphasis>or</emphasis> <literal>IsString</literal>. -</para></listitem> -</itemizedlist> -So, for example, the expression <literal>length "foo"</literal> will give rise -to an ambiguous use of <literal>IsString a0</literal> which, because of the above -rules, will default to <literal>String</literal>. -</para> -<para> -A small example: -<programlisting> -module Main where - -import GHC.Exts( IsString(..) ) - -newtype MyString = MyString String deriving (Eq, Show) -instance IsString MyString where - fromString = MyString - -greet :: MyString -> MyString -greet "hello" = "world" -greet other = other - -main = do - print $ greet "hello" - print $ greet "fool" -</programlisting> -</para> -<para> -Note that deriving <literal>Eq</literal> is necessary for the pattern matching -to work since it gets translated into an equality comparison. -</para> -</sect2> - -<sect2 id="overloaded-lists"> -<title>Overloaded lists</title> - -<para> GHC supports <emphasis>overloading of the list notation</emphasis>. -Let us recap the notation for -constructing lists. 
In Haskell, the list notation can be used in the -following seven ways: - -<programlisting> -[] -- Empty list -[x] -- x : [] -[x,y,z] -- x : y : z : [] -[x .. ] -- enumFrom x -[x,y ..] -- enumFromThen x y -[x .. y] -- enumFromTo x y -[x,y .. z] -- enumFromThenTo x y z -</programlisting> - -When the <option>OverloadedLists</option> extension is turned on, the -aforementioned seven notations are desugared as follows: </para> - -<programlisting> -[] -- fromListN 0 [] -[x] -- fromListN 1 (x : []) -[x,y,z] -- fromListN 3 (x : y : z : []) -[x .. ] -- fromList (enumFrom x) -[x,y ..] -- fromList (enumFromThen x y) -[x .. y] -- fromList (enumFromTo x y) -[x,y .. z] -- fromList (enumFromThenTo x y z) -</programlisting> - -<para> This extension allows programmers to use the list notation for -construction of structures like: <literal>Set</literal>, -<literal>Map</literal>, <literal>IntMap</literal>, <literal>Vector</literal>, -<literal>Text</literal> and <literal>Array</literal>. The following code -listing gives a few examples:</para> - -<programlisting> -['0' .. '9'] :: Set Char -[1 .. 10] :: Vector Int -[("default",0), (k1,v1)] :: Map String Int -['a' .. 'z'] :: Text -</programlisting> -<para> -List patterns are also overloaded. When the <option>OverloadedLists</option> -extension is turned on, these definitions are desugared as follows -<programlisting> -f [] = ... -- f (toList -> []) = ... -g [x,y,z] = ... -- g (toList -> [x,y,z]) = ... -</programlisting> -(Here we are using view-pattern syntax for the translation, see <xref linkend="view-patterns"/>.) -</para> - -<sect3> -<title>The <literal>IsList</literal> class</title> - -<para>In the above desugarings, the functions <literal>toList</literal>, -<literal>fromList</literal> and <literal>fromListN</literal> are all -methods of -the <literal>IsList</literal> class, which is itself exported from -the <literal>GHC.Exts</literal> module. 
-The type class is defined as follows:</para> - -<programlisting> -class IsList l where - type Item l - - fromList :: [Item l] -> l - toList :: l -> [Item l] - - fromListN :: Int -> [Item l] -> l - fromListN _ = fromList -</programlisting> - -<para>The <literal>IsList</literal> class and its methods are intended to be -used in conjunction with the <option>OverloadedLists</option> extension. -<itemizedlist> -<listitem> <para> The type function -<literal>Item</literal> returns the type of items of the -structure <literal>l</literal>. -</para></listitem> -<listitem><para> -The function <literal>fromList</literal> -constructs the structure <literal>l</literal> from the given list of -<literal>Item l</literal>. -</para></listitem> -<listitem><para> -The function <literal>fromListN</literal> takes the -input list's length as a hint. Its behaviour should be equivalent to -<literal>fromList</literal>. The hint can be used for more efficient -construction of the structure <literal>l</literal> compared to -<literal>fromList</literal>. If the given hint is not equal to the input -list's length the behaviour of <literal>fromListN</literal> is not -specified. -</para></listitem> -<listitem><para> -The function <literal>toList</literal> should be -the inverse of <literal>fromList</literal>. -</para></listitem> -</itemizedlist> -</para> -<para>It is perfectly fine to declare new instances -of <literal>IsList</literal>, so that list notation becomes -useful for completely new data types. 
-Here are several example instances: -<programlisting> -instance IsList [a] where - type Item [a] = a - fromList = id - toList = id - -instance (Ord a) => IsList (Set a) where - type Item (Set a) = a - fromList = Set.fromList - toList = Set.toList - -instance (Ord k) => IsList (Map k v) where - type Item (Map k v) = (k,v) - fromList = Map.fromList - toList = Map.toList - -instance IsList (IntMap v) where - type Item (IntMap v) = (Int,v) - fromList = IntMap.fromList - toList = IntMap.toList - -instance IsList Text where - type Item Text = Char - fromList = Text.pack - toList = Text.unpack - -instance IsList (Vector a) where - type Item (Vector a) = a - fromList = Vector.fromList - fromListN = Vector.fromListN - toList = Vector.toList -</programlisting> -</para> -</sect3> - -<sect3> -<title>Rebindable syntax</title> - -<para> When desugaring list notation with <option>-XOverloadedLists</option> -GHC uses the <literal>fromList</literal> (etc) methods from module <literal>GHC.Exts</literal>. -You do not need to import <literal>GHC.Exts</literal> for this to happen. -</para> -<para> However if you use <option>-XRebindableSyntax</option>, then -GHC instead uses whatever is in -scope with the names of <literal>toList</literal>, <literal>fromList</literal> and -<literal>fromListN</literal>. That is, these functions are rebindable; -cf. <xref linkend="rebindable-syntax"/>. </para> -</sect3> - -<sect3> -<title>Defaulting</title> - -<para>Currently, the <literal>IsList</literal> class is not accompanied by -defaulting rules. Although feasible, not much thought has gone into how to -specify the meaning of the default declarations like:</para> - -<programlisting> -default ([a]) -</programlisting> -</sect3> - -<sect3> -<title>Speculation about the future</title> - - -<para>The current implementation of the <option>OverloadedLists</option> -extension can be improved by handling the lists that are only populated with -literals in a special way. 
More specifically, the compiler could allocate such -lists statically using a compact representation and allow -<literal>IsList</literal> instances to take advantage of the compact -representation. Equipped with this capability the -<option>OverloadedLists</option> extension will be in a good position to -subsume the <option>OverloadedStrings</option> extension (currently, as a -special case, string literals benefit from statically allocated compact -representation).</para> -</sect3> -</sect2> - -</sect1> - -<sect1 id="type-families"> -<title>Type families</title> - -<para> - <firstterm>Indexed type families</firstterm> form an extension to - facilitate type-level - programming. Type families are a generalisation of <firstterm>associated - data types</firstterm> - (“<ulink url="http://www.cse.unsw.edu.au/~chak/papers/CKPM05.html">Associated - Types with Class</ulink>”, M. Chakravarty, G. Keller, S. Peyton Jones, - and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT - Symposium on Principles of Programming Languages (POPL'05)”, pages - 1-13, ACM Press, 2005) and <firstterm>associated type synonyms</firstterm> - (“<ulink url="http://www.cse.unsw.edu.au/~chak/papers/CKP05.html">Associated - Type Synonyms</ulink>”. M. Chakravarty, G. Keller, and - S. Peyton Jones. - In Proceedings of “The Tenth ACM SIGPLAN International Conference on - Functional Programming”, ACM Press, pages 241-253, 2005). Type families - themselves are described in the paper “<ulink - url="http://www.cse.unsw.edu.au/~chak/papers/SPCS08.html">Type - Checking with Open Type Functions</ulink>”, T. Schrijvers, - S. Peyton Jones, - M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The - 13th ACM SIGPLAN International Conference on Functional - Programming”, ACM Press, pages 51-62, 2008. 
Type families - essentially provide type-indexed data types and named functions on types, - which are useful for generic programming and highly parameterised library - interfaces as well as interfaces with enhanced static information, much like - dependent types. They might also be regarded as an alternative to functional - dependencies, but provide a more functional style of type-level programming - than the relational style of functional dependencies. -</para> -<para> - Indexed type families, or type families for short, are type constructors that - represent sets of types. Set members are denoted by supplying the type family - constructor with type parameters, which are called <firstterm>type - indices</firstterm>. The - difference between vanilla parametrised type constructors and family - constructors is much like that between parametrically polymorphic functions and - (ad-hoc polymorphic) methods of type classes. Parametrically polymorphic functions - behave the same at all type instances, whereas class methods can change their - behaviour in dependence on the class type parameters. Similarly, vanilla type - constructors imply the same data representation for all type instances, but - family constructors can have varying representation types for varying type - indices. -</para> -<para> - Indexed type families come in three flavours: <firstterm>data - families</firstterm>, <firstterm>open type synonym families</firstterm>, and - <firstterm>closed type synonym families</firstterm>. They are the indexed - family variants of algebraic data types and type synonyms, respectively. The - instances of data families can be data types and newtypes. -</para> -<para> - Type families are enabled by the flag <option>-XTypeFamilies</option>. - Additional information on the use of type families in GHC is available on - <ulink url="http://www.haskell.org/haskellwiki/GHC/Indexed_types">the - Haskell wiki page on type families</ulink>. 
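To make the three flavours concrete, here is a minimal sketch; the names <literal>Box</literal>, <literal>Elem</literal> and <literal>Flip</literal> are hypothetical and chosen only for illustration:
<programlisting>
{-# LANGUAGE TypeFamilies #-}
module Main where

-- Data family: each instance chooses its own representation.
data family Box a
data instance Box Bool = BoolBox Bool
newtype instance Box Int = IntBox Int

-- Open type synonym family: instances may be added later, in any module.
type family Elem c
type instance Elem [e] = e

-- Closed type synonym family: all equations are given with the declaration.
type family Flip a where
  Flip Bool = Int
  Flip Int  = Bool

-- Elem [e] reduces to e, so head has the required type.
firstElem :: [e] -> Elem [e]
firstElem = head

-- Flip Bool reduces to Int, so numeric literals are accepted here.
flipBool :: Bool -> Flip Bool
flipBool b = if b then 1 else 0

main :: IO ()
main = do
  print (firstElem "abc")
  print (flipBool True)
</programlisting>
The two functions only show that the synonym families reduce during type checking; the data family instances are declared but not otherwise used in this sketch.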
-</para> - -<sect2 id="data-families"> - <title>Data families</title> - - <para> - Data families appear in two flavours: (1) they can be defined on the - toplevel - or (2) they can appear inside type classes (in which case they are known as - associated types). The former is the more general variant, as it lacks the - requirement for the type-indexes to coincide with the class - parameters. However, the latter can lead to more clearly structured code and - compiler warnings if some type instances were - possibly accidentally - - omitted. In the following, we always discuss the general toplevel form first - and then cover the additional constraints placed on associated types. - </para> - - <sect3 id="data-family-declarations"> - <title>Data family declarations</title> - - <para> - Indexed data families are introduced by a signature, such as -<programlisting> -data family GMap k :: * -> * -</programlisting> - The special keyword <literal>family</literal> distinguishes family declarations from standard - data declarations. The result kind annotation is optional and, as - usual, defaults to <literal>*</literal> if omitted. An example is -<programlisting> -data family Array e -</programlisting> - Named arguments can also be given explicit kind signatures if needed. - Just as with - <ulink url="http://www.haskell.org/ghc/docs/latest/html/users_guide/gadt.html">GADT - declarations</ulink>, named arguments are entirely optional, so that we can - declare <literal>Array</literal> alternatively with -<programlisting> -data family Array :: * -> * -</programlisting> - </para> - </sect3> - - <sect3 id="data-instance-declarations"> - <title>Data instance declarations</title> - - <para> - Instance declarations of data and newtype families are very similar to - standard data and newtype declarations. 
The only two differences are - that the keyword <literal>data</literal> or <literal>newtype</literal> - is followed by <literal>instance</literal> and that some or all of the - type arguments can be non-variable types, but may not contain forall - types or type synonym families. However, data families are generally - allowed in type parameters, and type synonyms are allowed as long as - they are fully applied and expand to a type that is itself admissible - - exactly as this is required for occurrences of type synonyms in class - instance parameters. For example, the <literal>Either</literal> - instance for <literal>GMap</literal> is -<programlisting> -data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) -</programlisting> - In this example, the declaration has only one variant. In general, it - can be any number. - </para> - <para> - When the name of a type argument of a data or newtype instance - declaration doesn't matter, it can be replaced with an underscore - (<literal>_</literal>). This is the same as writing a type variable with - a unique name. -<programlisting> -data family F a b :: * -data instance F Int _ = Int --- Equivalent to -data instance F Int b = Int -</programlisting> - This resembles the wildcards that can be used in <xref - linkend="partial-type-signatures"/>. However, there are some - differences. Only anonymous wildcards are allowed in these instance - declarations, named and extra-constraints wildcards are not. No error - messages reporting the inferred types are generated, nor does the flag - <option>-XPartialTypeSignatures</option> have any effect. - </para> - <para> - Data and newtype instance declarations are only permitted when an - appropriate family declaration is in scope - just as a class instance declaration - requires the class declaration to be visible. Moreover, each instance - declaration has to conform to the kind determined by its family - declaration. 
This implies that the number of parameters of an instance - declaration matches the arity determined by the kind of the family. - </para> - <para> - A data family instance declaration can use the full expressiveness of - ordinary <literal>data</literal> or <literal>newtype</literal> declarations: - <itemizedlist> - <listitem><para> Although, a data family is <emphasis>introduced</emphasis> with - the keyword "<literal>data</literal>", a data family <emphasis>instance</emphasis> can - use either <literal>data</literal> or <literal>newtype</literal>. For example: -<programlisting> -data family T a -data instance T Int = T1 Int | T2 Bool -newtype instance T Char = TC Bool -</programlisting> - </para></listitem> - <listitem><para> A <literal>data instance</literal> can use GADT syntax for the data constructors, - and indeed can define a GADT. For example: -<programlisting> -data family G a b -data instance G [a] b where - G1 :: c -> G [Int] b - G2 :: G [a] Bool -</programlisting> - </para></listitem> - <listitem><para> You can use a <literal>deriving</literal> clause on a - <literal>data instance</literal> or <literal>newtype instance</literal> - declaration. - </para></listitem> - </itemizedlist> - </para> - - <para> - Even if data families are defined as toplevel declarations, functions - that perform different computations for different family instances may still - need to be defined as methods of type classes. In particular, the - following is not possible: -<programlisting> -data family T a -data instance T Int = A -data instance T Char = B -foo :: T a -> Int -foo A = 1 -- WRONG: These two equations together... -foo B = 2 -- ...will produce a type error. 
-</programlisting> -Instead, you would have to write <literal>foo</literal> as a class operation, thus: -<programlisting> -class Foo a where - foo :: T a -> Int -instance Foo Int where - foo A = 1 -instance Foo Char where - foo B = 2 -</programlisting> - (Given the functionality provided by GADTs (Generalised Algebraic Data - Types), it might seem as if the first, toplevel definition of <literal>foo</literal> should be - feasible. However, type families, in contrast to GADTs, are - <emphasis>open</emphasis>; i.e., new instances can always be added, - possibly in other - modules. Supporting pattern matching across different data instances - would require a form of extensible case construct.) - </para> - </sect3> - - <sect3 id="data-family-overlap"> - <title>Overlap of data instances</title> - <para> - The instance declarations of a data family used in a single program - may not overlap at all, independent of whether they are associated or - not. In contrast to type class instances, this is not only a matter - of consistency, but one of type safety. - </para> - </sect3> -</sect2> - -<sect2 id="synonym-families"> - <title>Synonym families</title> - - <para> - Type families appear in three flavours: (1) they can be defined as open - families on the toplevel, (2) they can be defined as closed families on - the toplevel, or (3) they can appear inside type classes (in which case - they are known as associated type synonyms). Toplevel families are more - general, as they lack the requirement for the type-indexes to coincide - with the class parameters. However, associated type synonyms can lead to - more clearly structured code and compiler warnings if some type instances - were (possibly accidentally) omitted. In the following, we always - discuss the general toplevel forms first and then cover the additional - constraints placed on associated types. Note that closed associated type - synonyms do not exist. 
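The three flavours can be seen side by side in a single module. The following is a minimal sketch, assuming only the <literal>TypeFamilies</literal> extension; the names <literal>IsInt</literal>, <literal>Container</literal>, <literal>Item</literal>, <literal>firstItem</literal>, <literal>firstElem</literal> and <literal>intFlag</literal> are invented for illustration and do not appear elsewhere in this guide.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

-- (1) Open toplevel family: instances may be added later, in any module.
type family Elem c
type instance Elem [e] = e

-- (2) Closed toplevel family: all equations are given in one place.
-- IsInt is a hypothetical name used only in this sketch.
type family IsInt a where
  IsInt Int = Bool
  IsInt a   = ()

-- (3) Associated type synonym: the family is declared inside a class,
-- and each class instance supplies the matching type instance.
-- Container, Item and firstItem are hypothetical names.
class Container f where
  type Item f
  firstItem :: f -> Maybe (Item f)

instance Container [a] where
  type Item [a] = a
  firstItem []      = Nothing
  firstItem (x : _) = Just x

-- Elem [e] reduces to e, so an element of the list has the result type.
firstElem :: [e] -> Maybe (Elem [e])
firstElem []      = Nothing
firstElem (x : _) = Just x

-- IsInt Int reduces to Bool by the first equation.
intFlag :: IsInt Int
intFlag = True

main :: IO ()
main = do
  print (firstElem "abc")
  print (firstItem [1 :: Int, 2, 3])
  print intFlag
```

In this sketch the associated form ties <literal>Item</literal> to <literal>Container</literal>, while <literal>Elem</literal> and <literal>IsInt</literal> stand alone; only <literal>IsInt</literal> is closed to further instances.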
- </para> - - <sect3 id="type-family-declarations"> - <title>Type family declarations</title> - - <para> - Open indexed type families are introduced by a signature, such as -<programlisting> -type family Elem c :: * -</programlisting> - The special keyword <literal>family</literal> distinguishes family declarations from standard - type declarations. The result kind annotation is optional and, as - usual, defaults to <literal>*</literal> if omitted. An example is -<programlisting> -type family Elem c -</programlisting> - Parameters can also be given explicit kind signatures if needed. We - call the number of parameters in a type family declaration the family's - arity, and all applications of a type family must be fully saturated - with respect to that arity. This requirement is unlike ordinary type synonyms - and it implies that the kind of a type family is not sufficient to - determine a family's arity, and hence, in general, is also insufficient to - determine whether a type family application is well formed. As an - example, consider the following declaration: -<programlisting> -type family F a b :: * -> * -- F's arity is 2, - -- although its overall kind is * -> * -> * -> * -</programlisting> - Given this declaration, the following are examples of well-formed and - malformed types: -<programlisting> -F Char [Int] -- OK! Kind: * -> * -F Char [Int] Bool -- OK! Kind: * -F IO Bool -- WRONG: kind mismatch in the first argument -F Bool -- WRONG: unsaturated application -</programlisting> - </para> - - <para> - The result kind annotation is optional and defaults to - <literal>*</literal> (like argument kinds) if - omitted. Polykinded type families can be - declared using a parameter in the kind annotation: -<programlisting> -type family F a :: k -</programlisting> -In this case the kind parameter <literal>k</literal> is actually an implicit -parameter of the type family. 
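The arity rule for a family with a higher-kinded result can be tried out directly. The following is a minimal sketch, assuming a GHC in which <literal>*</literal> is accepted as the kind of types; the instance <literal>F Char [Int] = Maybe</literal> and the names <literal>example</literal> and <literal>examples</literal> are invented for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

-- Arity 2: exactly two arguments must be supplied before F can reduce.
-- The result is itself a type constructor of kind * -> *.
type family F a b :: * -> *

-- Hypothetical instance chosen for this sketch.
type instance F Char [Int] = Maybe

-- F Char [Int] is saturated and reduces to Maybe; applying the result
-- to Bool gives Maybe Bool.
example :: F Char [Int] Bool
example = Just True

-- The saturated application can be used wherever a * -> * type is expected.
examples :: [F Char [Int] Int]
examples = [Just 1, Nothing]

main :: IO ()
main = do
  print example
  print examples
```

Writing <literal>F Char</literal> on its own in a signature would be rejected, since the application is not saturated with respect to the arity of <literal>F</literal>.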
- </para> - </sect3> - - <sect3 id="type-instance-declarations"> - <title>Type instance declarations</title> - <para> - Instance declarations of type families are very similar to - standard type synonym declarations. The only two differences are that - the keyword <literal>type</literal> is followed by - <literal>instance</literal> and that some or all of the type arguments - can be non-variable types, but may not contain forall types or type - synonym families. However, data families are generally allowed, and type - synonyms are allowed as long as they are fully applied and expand to a - type that is admissible - these are the exact same requirements as for - data instances. For example, the <literal>[e]</literal> instance for - <literal>Elem</literal> is -<programlisting> -type instance Elem [e] = e -</programlisting> - </para> - - <para> - Type arguments can be replaced with underscores (<literal>_</literal>) - if the names of the arguments don't matter. This is the same as writing - type variables with unique names. The same rules apply as for <xref - linkend="data-instance-declarations"/>. - </para> - - <para> - Type family instance declarations are only legitimate when an - appropriate family declaration is in scope - just like class instances - require the class declaration to be visible. Moreover, each instance - declaration has to conform to the kind determined by its family - declaration, and the number of type parameters in an instance - declaration must match the number of type parameters in the family - declaration. Finally, the right-hand side of a type instance must be a - monotype (i.e., it may not include foralls) and after the expansion of - all saturated vanilla type synonyms, no synonyms, except family synonyms - may remain. 
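The rule about synonyms on the right-hand side can be illustrated with a small module. This is a minimal sketch, assuming only the <literal>TypeFamilies</literal> extension; <literal>Pair</literal>, <literal>Dup</literal>, <literal>dupInt</literal> and <literal>dupList</literal> are hypothetical names introduced here.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

-- A vanilla (non-family) type synonym; hypothetical, for this sketch only.
type Pair a = (a, a)

-- A family with one parameter, so every instance has one type argument.
type family Dup a

-- The right-hand sides use the saturated synonym Pair, which is expanded
-- to a tuple type; no forall appears, so both are monotypes.
type instance Dup Int = Pair Int     -- expands to (Int, Int)
type instance Dup [e] = [Pair e]     -- expands to [(e, e)]

-- Dup Int reduces to (Int, Int).
dupInt :: Int -> Dup Int
dupInt n = (n, n)

-- Dup [e] reduces to [(e, e)].
dupList :: [e] -> Dup [e]
dupList = map (\x -> (x, x))

main :: IO ()
main = do
  print (dupInt 3)
  print (dupList "ab")
```

Both instances supply exactly one type argument, matching the single parameter in the declaration of <literal>Dup</literal>.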
- </para> - </sect3> - - <sect3 id="closed-type-families"> - <title>Closed type families</title> - <para> - A type family can also be declared with a <literal>where</literal> clause, - defining the full set of equations for that family. For example: -<programlisting> -type family F a where - F Int = Double - F Bool = Char - F a = String -</programlisting> - A closed type family's equations are tried in order, from top to bottom, - when simplifying a type family application. In this example, we declare - an instance for <literal>F</literal> such that <literal>F Int</literal> - simplifies to <literal>Double</literal>, <literal>F Bool</literal> - simplifies to <literal>Char</literal>, and for any other type - <literal>a</literal> that is known not to be <literal>Int</literal> or - <literal>Bool</literal>, <literal>F a</literal> simplifies to - <literal>String</literal>. Note that GHC must be sure that - <literal>a</literal> cannot unify with <literal>Int</literal> or - <literal>Bool</literal> in that last case; if a programmer specifies - just <literal>F a</literal> in their code, GHC will not be able to - simplify the type. After all, <literal>a</literal> might later be - instantiated with <literal>Int</literal>. - </para> - - <para> - A closed type family's equations have the same restrictions as the - equations for open type family instances. - </para> - - <para> - A closed type family may be declared with no equations. Such - closed type families are opaque type-level definitions that will - never reduce, are not necessarily injective (unlike empty data - types), and cannot be given any instances. This is different - from omitting the equations of a closed type family in a - <filename>hs-boot</filename> file, which uses the syntax - <literal>where ..</literal>, as in that case there may or may - not be equations given in the <filename>hs</filename> file. 
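The top-to-bottom matching of the family <literal>F</literal> above can be observed at concrete types. The following is a minimal sketch, assuming only the <literal>TypeFamilies</literal> extension; the value names <literal>atInt</literal>, <literal>atBool</literal> and <literal>atList</literal> are invented for illustration.

```haskell
{-# LANGUAGE TypeFamilies #-}
module Main where

-- The closed family from the text: equations are tried in order.
type family F a where
  F Int  = Double
  F Bool = Char
  F a    = String

-- First equation applies.
atInt :: F Int
atInt = 3.5

-- Second equation applies.
atBool :: F Bool
atBool = 'c'

-- [Int] cannot unify with Int or Bool, so GHC may fall through
-- to the last equation and reduce F [Int] to String.
atList :: F [Int]
atList = "fallback"

-- By contrast, a definition such as
--   stuck :: F a
--   stuck = "x"
-- is rejected: with 'a' unknown, F a does not reduce to String,
-- because 'a' might later be instantiated with Int or Bool.

main :: IO ()
main = do
  print atInt
  print atBool
  putStrLn atList
```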
- </para> - </sect3> - - <sect3 id="type-family-examples"> - <title>Type family examples</title> - <para> -Here are some examples of admissible and illegal type - instances: -<programlisting> -type family F a :: * -type instance F [Int] = Int -- OK! -type instance F String = Char -- OK! -type instance F (F a) = a -- WRONG: type parameter mentions a type family -type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter -type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type -type family H a where -- OK! - H Int = Int - H Bool = Bool - H a = String -type instance H Char = Char -- WRONG: cannot have instances of closed family -type family K a where -- OK! - -type family G a b :: * -> * -type instance G Int = (,) -- WRONG: must be two type parameters -type instance G Int Char Float = Double -- WRONG: must be two type parameters -</programlisting> - </para> - </sect3> - <sect3 id="type-family-overlap"> - <title>Compatibility and apartness of type family equations</title> - <para> - There must be some restrictions on the equations of type families, lest - we define an ambiguous rewrite system. So, equations of open type families - are restricted to be <firstterm>compatible</firstterm>. Two type patterns - are compatible if -<orderedlist> -<listitem><para>all corresponding types and implicit kinds in the patterns are <firstterm>apart</firstterm>, or</para></listitem> -<listitem><para>the two patterns unify producing a substitution, and the right-hand sides are equal under that substitution.</para></listitem> -</orderedlist> - Two types are considered <firstterm>apart</firstterm> if, for all possible - substitutions, the types cannot reduce to a common reduct. - </para> - - <para> - The first clause of "compatible" is the more straightforward one. It says - that the patterns of two distinct type family instances cannot overlap. 
- For example, the following is disallowed: -<programlisting> -type instance F Int = Bool -type instance F Int = Char -</programlisting> - The second clause is a little more interesting. It says that two - overlapping type family instances are allowed if the right-hand - sides coincide in the region of overlap. Some examples help here: -<programlisting> -type instance F (a, Int) = [a] -type instance F (Int, b) = [b] -- overlap permitted - -type instance G (a, Int) = [a] -type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int] -</programlisting> - Note that this compatibility condition is independent of whether the type family - is associated or not, and it is not only a matter of consistency, but - one of type safety. </para> - - <para>For a polykinded type family, the kinds are checked for - apartness just like types. For example, the following is accepted: -<programlisting> -type family J a :: k -type instance J Int = Bool -type instance J Int = Maybe -</programlisting> - These instances are compatible because they differ in their implicit kind parameter; the first uses <literal>*</literal> while the second uses <literal>* -> *</literal>.</para> - - - <para> - The definition for "compatible" uses a notion of "apart", whose definition - in turn relies on type family reduction. This condition of "apartness", as - stated, is impossible to check, so we use this conservative approximation: - two types are considered to be apart when the two types cannot be unified, - even by a potentially infinite unifier. Allowing the unifier to be infinite - disallows the following pair of instances: -<programlisting> -type instance H x x = Int -type instance H [x] x = Bool -</programlisting> - The type patterns in this pair equal if <literal>x</literal> is replaced - by an infinite nesting of lists. Rejecting instances such as these is - necessary for type soundness. - </para> - - <para> - Compatibility also affects closed type families. 
When simplifying an - application of a closed type family, GHC will select an equation only - when it is sure that no incompatible previous equation will ever apply. - Here are some examples: -<programlisting> -type family F a where - F Int = Bool - F a = Char - -type family G a where - G Int = Int - G a = a -</programlisting> - In the definition for <literal>F</literal>, the two equations are - incompatible: their patterns are not apart, and yet their - right-hand sides do not coincide. Thus, before GHC selects the - second equation, it must be sure that the first can never apply. So, - the type <literal>F a</literal> does not simplify; only a type such - as <literal>F Double</literal> will simplify to - <literal>Char</literal>. In <literal>G</literal>, on the other hand, - the two equations are compatible. Thus, GHC can ignore the first - equation when looking at the second. So, <literal>G a</literal> will - simplify to <literal>a</literal>.</para> - - <para> However, see <xref linkend="ghci-decls"/> for the overlap rules in GHCi.</para> - </sect3> - - <sect3 id="type-family-decidability"> - <title>Decidability of type synonym instances</title> - <para> - In order to guarantee that type inference in the presence of type - families is decidable, we need to place a number of additional - restrictions on the formation of type instance declarations (cf. - Definition 5 (Relaxed Conditions) of “<ulink - url="http://www.cse.unsw.edu.au/~chak/papers/SPCS08.html">Type - Checking with Open Type Functions</ulink>”). Instance - declarations have the general form -<programlisting> -type instance F t1 .. tn = t -</programlisting> - where we require that for every type family application <literal>(G s1 - .. sm)</literal> in <literal>t</literal>, - <orderedlist> - <listitem> - <para><literal>s1 .. 
sm</literal> do not contain any type family - constructors,</para> - </listitem> - <listitem> - <para>the total number of symbols (data type constructors and type - variables) in <literal>s1 .. sm</literal> is strictly smaller than - in <literal>t1 .. tn</literal>, and</para> - </listitem> - <listitem> - <para>for every type - variable <literal>a</literal>, <literal>a</literal> occurs - in <literal>s1 .. sm</literal> at most as often as in <literal>t1 - .. tn</literal>.</para> - </listitem> - </orderedlist> - These restrictions are easily verified and ensure termination of type - inference. However, they are not sufficient to guarantee completeness - of type inference in the presence of so-called “loopy equalities”, - such as <literal>a ~ [F a]</literal>, where a recursive occurrence of - a type variable is underneath a family application and data - constructor application; see the above-mentioned paper for details. - </para> - <para> - If the option <option>-XUndecidableInstances</option> is passed to the - compiler, the above restrictions are not enforced and it is up to the - programmer to ensure termination of the normalisation of type families - during type inference. - </para> - </sect3> - </sect2> - - -<sect2 id="assoc-decl"> -<title>Associated data and type families</title> -<para> -A data or type synonym family can be declared as part of a type class, thus: -<programlisting> -class GMapKey k where - data GMap k :: * -> * - ... - -class Collects ce where - type Elem ce :: * - ... -</programlisting> -When doing so, we (optionally) may drop the "<literal>family</literal>" keyword. -</para> -<para> - The type parameters must all be type variables, of course, - and some (but not necessarily all) of them can be the class - parameters. Each class parameter may - be used at most once per associated type, but some may be omitted - and they may be in an order other than in the class head. 
Hence, the - following contrived example is admissible: -<programlisting> - class C a b c where - type T c a x :: * -</programlisting> - Here <literal>c</literal> and <literal>a</literal> are class parameters, - but the type is also indexed on a third parameter <literal>x</literal>. - </para> - - <sect3 id="assoc-data-inst"> - <title>Associated instances</title> - <para> - When an associated data or type synonym family instance is declared within a type - class instance, we (optionally) may drop the <literal>instance</literal> keyword in the - family instance: -<programlisting> -instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where - data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) - ... - -instance Eq (Elem [e]) => Collects [e] where - type Elem [e] = e - ... -</programlisting> -Note the following points: -<itemizedlist> -<listitem><para> - The type indexes corresponding to class parameters must have precisely the same shape - as the type given in the instance head. To have the same "shape" means that - the two types are identical modulo renaming of type variables. For example: -<programlisting> -instance Eq (Elem [e]) => Collects [e] where - -- Choose one of the following alternatives: - type Elem [e] = e -- OK - type Elem [x] = x -- OK - type Elem x = x -- BAD: shape of 'x' is different to '[e]' - type Elem [Maybe x] = x -- BAD: shape of '[Maybe x]' is different to '[e]' -</programlisting> -</para></listitem> -<listitem><para> - An instance for an associated family can only appear as part of - an instance declaration of the class in which the family was declared, - just as with the equations of the methods of a class. -</para></listitem> -<listitem><para> - The instance for an associated type can be omitted in class instances. 
In that case, - unless there is a default instance (see <xref linkend="assoc-decl-defs"/>), - the corresponding instance type is not inhabited; - i.e., only diverging expressions, such - as <literal>undefined</literal>, can assume the type. -</para></listitem> -<listitem><para> - Although it is unusual, there (currently) can be <emphasis>multiple</emphasis> - instances for an associated family in a single instance declaration. - For example, this is legitimate: -<programlisting> -instance GMapKey Flob where - data GMap Flob [v] = G1 v - data GMap Flob Int = G2 Int - ... -</programlisting> - Here we give two data instance declarations, one in which the last - parameter is <literal>[v]</literal>, and one for which it is <literal>Int</literal>. - Since you cannot give any <emphasis>subsequent</emphasis> instances for - <literal>(GMap Flob ...)</literal>, this facility is most useful when - the free indexed parameter is of a kind with a finite number of alternatives - (unlike <literal>*</literal>). WARNING: this facility may be withdrawn in the future. -</para></listitem> -</itemizedlist> -</para> - </sect3> - - <sect3 id="assoc-decl-defs"> - <title>Associated type synonym defaults</title> - <para> - It is possible for the class defining the associated type to specify a - default for associated type instances. So for example, this is OK: -<programlisting> -class IsBoolMap v where - type Key v - type instance Key v = Int - - lookupKey :: Key v -> v -> Maybe Bool - -instance IsBoolMap [(Int, Bool)] where - lookupKey = lookup -</programlisting> -In an <literal>instance</literal> declaration for the class, if no explicit -<literal>type instance</literal> declaration is given for the associated type, the default declaration -is used instead, just as with default class methods. -</para> -<para> -Note the following points: -<itemizedlist> -<listitem><para> - The <literal>instance</literal> keyword is optional. 
-</para></listitem> -<listitem><para> - There can be at most one default declaration for an associated type synonym. -</para></listitem> -<listitem><para> - A default declaration is not permitted for an associated - <emphasis>data</emphasis> type. -</para></listitem> -<listitem><para> - The default declaration must mention only type <emphasis>variables</emphasis> on the left hand side, - and the right hand side must mention only type variables bound on the left hand side. - However, unlike the associated type family declaration itself, - the type variables of the default instance are independent of those of the parent class. -</para></listitem> -</itemizedlist> -Here are some examples: -<programlisting> - class C a where - type F1 a :: * - type instance F1 a = [a] -- OK - type instance F1 a = a->a -- BAD; only one default instance is allowed - - type F2 b a -- OK; note the family has more type - -- variables than the class - type instance F2 c d = c->d -- OK; you don't have to use 'a' in the type instance - - type F3 a - type F3 [b] = b -- BAD; only type variables allowed on the LHS - - type F4 a - type F4 b = a -- BAD; 'a' is not in scope in the RHS -</programlisting> -</para> - -</sect3> - - <sect3 id="scoping-class-params"> - <title>Scoping of class parameters</title> - <para> - The visibility of class - parameters in the right-hand side of associated family instances - depends <emphasis>solely</emphasis> on the parameters of the - family. As an example, consider the simple class declaration -<programlisting> -class C a b where - data T a -</programlisting> - Only one of the two class parameters is a parameter to the data - family. Hence, the following instance declaration is invalid: -<programlisting> -instance C [c] d where - data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope -</programlisting> - Here, the right-hand side of the data instance mentions the type - variable <literal>d</literal> that does not occur in its left-hand - side. 
We cannot admit such data instances as they would compromise - type safety. - </para> - </sect3> - - <sect3><title>Instance contexts and associated type and data instances</title> - <para>Associated type and data instance declarations do not inherit any - context specified on the enclosing instance. For type instance declarations, - it is unclear what the context would mean. For data instance declarations, - it is unlikely a user would want the context repeated for every data constructor. - The only place where the context might be useful is in a - <literal>deriving</literal> clause of an associated data instance. However, - even here, the role of the outer instance context is murky. So, for - clarity, we just stick to the rule above: the enclosing instance context - is ignored. If you need to use - a non-trivial context on a derived instance, - use a <link linkend="stand-alone-deriving">standalone - deriving</link> clause (at the top level). - </para> - </sect3> - - </sect2> - - <sect2 id="data-family-import-export"> - <title>Import and export</title> - - <para> -The rules for export lists -(Haskell Report - <ulink url="http://www.haskell.org/onlinereport/modules.html#sect5.2">Section 5.2</ulink>) -need adjustment for type families: -<itemizedlist> -<listitem><para> - The form <literal>T(..)</literal>, where <literal>T</literal> - is a data family, names the family <literal>T</literal> and all the in-scope - constructors (whether in scope qualified or unqualified) that belong to data - instances of <literal>T</literal>. - </para></listitem> -<listitem><para> - The form <literal>T(.., ci, .., fj, ..)</literal>, where <literal>T</literal> is - a data family, names <literal>T</literal> and the specified constructors <literal>ci</literal> - and fields <literal>fj</literal> as usual. The constructors and field names must - belong to some data instance of <literal>T</literal>, but are not required to belong - to the <emphasis>same</emphasis> instance. 
- </para></listitem> -<listitem><para> - The form <literal>C(..)</literal>, where <literal>C</literal> - is a class, names the class <literal>C</literal> and all its methods - <emphasis>and associated types</emphasis>. - </para></listitem> -<listitem><para> - The form <literal>C(.., mi, .., type Tj, ..)</literal>, where <literal>C</literal> is a class, - names the class <literal>C</literal>, and the specified methods <literal>mi</literal> - and associated types <literal>Tj</literal>. The types need a keyword "<literal>type</literal>" - to distinguish them from data constructors. - </para></listitem> -</itemizedlist> -</para> - - <sect3 id="data-family-impexp-examples"> - <title>Examples</title> - <para> - Recall our running <literal>GMapKey</literal> class example: -<programlisting> -class GMapKey k where - data GMap k :: * -> * - insert :: GMap k v -> k -> v -> GMap k v - lookup :: GMap k v -> k -> Maybe v - empty :: GMap k v - -instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where - data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) - ...method declarations... -</programlisting> -Here are some export lists and their meaning: - <itemizedlist> - <listitem> - <para><literal>module GMap( GMapKey )</literal>: Exports - just the class name.</para> - </listitem> - <listitem> - <para><literal>module GMap( GMapKey(..) )</literal>: - Exports the class, the associated type <literal>GMap</literal> - and the member - functions <literal>empty</literal>, <literal>lookup</literal>, - and <literal>insert</literal>. The data constructors of <literal>GMap</literal> - (in this case <literal>GMapEither</literal>) are not exported.</para> - </listitem> - <listitem> - <para><literal>module GMap( GMapKey( type GMap, empty, lookup, insert ) )</literal>: - Same as the previous item. Note the "<literal>type</literal>" keyword.</para> - </listitem> - <listitem> - <para><literal>module GMap( GMapKey(..), GMap(..) 
)</literal>: - Same as previous item, but also exports all the data - constructors for <literal>GMap</literal>, namely <literal>GMapEither</literal>. - </para> - </listitem> - <listitem> - <para><literal>module GMap ( GMapKey( empty, lookup, insert), GMap(..) )</literal>: - Same as previous item.</para> - </listitem> - <listitem> - <para><literal>module GMap ( GMapKey, empty, lookup, insert, GMap(..) )</literal>: - Same as previous item.</para> - </listitem> - </itemizedlist> - </para> - <para> -Two things to watch out for: - <itemizedlist> - <listitem><para> - You cannot write <literal>GMapKey(type GMap(..))</literal>; that is, - sub-component specifications cannot be nested. To - specify <literal>GMap</literal>'s data constructors, you have to list - it separately. - </para></listitem> - <listitem><para> - Consider this example: -<programlisting> - module X where - data family D a - - module Y where - import X - data instance D Int = D1 | D2 -</programlisting> - Module Y exports all the entities defined in Y, namely the data constructors <literal>D1</literal> - and <literal>D2</literal>, <emphasis>but not the data family <literal>D</literal></emphasis>. - That (annoyingly) means that you cannot import Y selectively, - thus "<literal>import Y( D(D1,D2) )</literal>", because Y does not export <literal>D</literal>. - Instead you should list the exports explicitly, thus: -<programlisting> - module Y( D(..) ) where ... -or module Y( module Y, D ) where ... -</programlisting> - </para></listitem> - </itemizedlist> -</para> -</sect3> - - <sect3 id="data-family-impexp-instances"> - <title>Instances</title> - <para> - Family instances are implicitly exported, just like class instances. - However, this applies only to the heads of instances, not to the data - constructors an instance defines. 
- </para> - </sect3> - - </sect2> - - <sect2 id="ty-fams-in-instances"> - <title>Type families and instance declarations</title> - - <para>Type families require us to extend the rules for - the form of instance heads, which are given - in <xref linkend="flexible-instance-head"/>. - Specifically: -<itemizedlist> - <listitem><para>Data type families may appear in an instance head</para></listitem> - <listitem><para>Type synonym families may not appear (at all) in an instance head</para></listitem> -</itemizedlist> -The reason for the latter restriction is that there is no way to check for instance -matching. Consider -<programlisting> - type family F a - type instance F Bool = Int - - class C a - - instance C Int - instance C (F a) -</programlisting> -Now a constraint <literal>(C (F Bool))</literal> would match both instances. -The situation is especially bad because the type instance for <literal>F Bool</literal> -might be in another module, or even in a module that is not yet written. -</para> -<para> -However, type class instances of instances of data families can be defined -much like any other data type. For example, we can say -<programlisting> -data instance T Int = T1 Int | T2 Bool -instance Eq (T Int) where - (T1 i) == (T1 j) = i==j - (T2 i) == (T2 j) = i==j - _ == _ = False -</programlisting> - Note that class instances are always for - particular <emphasis>instances</emphasis> of a data family and never - for an entire family as a whole. This is for essentially the same - reasons that we cannot define a toplevel function that performs - pattern matching on the data constructors - of <emphasis>different</emphasis> instances of a single type family. - It would require a form of extensible case construct. - </para> -<para> -Data instance declarations can also - have <literal>deriving</literal> clauses. 
For example, we can write -<programlisting> -data GMap () v = GMapUnit (Maybe v) - deriving Show -</programlisting> - which implicitly defines an instance of the form -<programlisting> -instance Show v => Show (GMap () v) where ... -</programlisting> - </para> - -</sect2> - - <sect2 id="injective-ty-fams"> - <title>Injective type families</title> - <para>Starting with GHC 7.12 type families can be annotated with injectivity - information. This information is then used by GHC during type checking to - resolve type ambiguities in situations where a type variable appears only - under type family applications. - </para> - - <para>For full details on injective type families refer to Haskell Symposium - 2015 paper <ulink - url="http://ics.p.lodz.pl/~stolarek/_media/pl:research:stolarek_peyton-jones_eisenberg_injectivity_extended.pdf">Injective - type families for Haskell</ulink>.</para> - - <sect3 id="injective-ty-fams-syntax"> - <title>Syntax of injectivity annotation</title> - <para>Injectivity annotation is added after type family head and consists of - two parts: - <itemizedlist> - <listitem><para>type variable that names the result of a type family. - Syntax: <literal>= tyvar</literal> or <literal>= (tyvar :: - kind)</literal>. Type variable must be fresh. - </para> - </listitem> - <listitem><para>injectivity annotation of the form <literal>| A -> - B</literal>, where <literal>A</literal> is the result type variable (see - previous bullet) and <literal>B</literal> is a list of argument type and - kind variables in which type family is injective. It is possible to omit - some variables if type family is not injective in them.</para></listitem> - </itemizedlist> - Examples: - <programlisting> -type family Id a = result | result -> a where -type family F a b c = d | d -> a c b -type family G (a :: k) b c = foo | foo -> k b where - </programlisting> - </para> - <para>For open and closed type families it is OK to name the result but - skip the injectivity annotation. 
This is not the case for associated type - synonyms, where a named result without an injectivity annotation will be - interpreted as an associated type synonym default.</para> - </sect3> - - <sect3 id="injective-ty-fams-typecheck"> - <title>Verifying injectivity annotation against type family equations - </title> - <para>Once the user declares a type family to be injective GHC must verify - that this declaration is correct, i.e. that the type family equations don't violate - the injectivity annotation. The general idea is that if at least one - equation (bullets (1), (2) and (3) below) or a pair of equations (bullets - (4) and (5) below) violates the injectivity annotation then the type family - is not injective in the way the user claims and an error is reported. In the - bullets below <emphasis>RHS</emphasis> refers to the right-hand side of the - type family equation being checked for injectivity. - <emphasis>LHS</emphasis> refers to the arguments of that type family - equation. Below are the rules followed when checking injectivity of a type - family: - <orderedlist> - <listitem><para>If the RHS of a type family equation is a type family - application GHC reports that the type family is not injective.</para> - </listitem> - <listitem>If the RHS of a type family equation is a bare type variable we - require that all LHS variables (including implicit kind variables) are - also bare. In other words, this has to be the sole equation of that type - family and it has to cover all possible patterns. If the patterns are - not covering GHC reports that the type family is not injective. - </listitem> - <listitem>If an LHS type variable that is declared as injective is not - mentioned in an <emphasis>injective position</emphasis> in the RHS GHC - reports that the type family is not injective. 
Injective position means - either an argument to a type constructor or an injective argument to a type - family.</listitem> - <listitem><para><emphasis>Open type families</emphasis> - are typechecked incrementally. This means that when a module is imported - type family instances contained in that module are checked against - instances present in already imported modules.</para> - <para>A pair of open type family equations is checked by attempting to - unify their RHSs. If the RHSs don't unify this pair does not violate - the injectivity annotation. If unification succeeds with a substitution then - the LHSs of the unified equations must be identical under that substitution. If - they are not identical then GHC reports that the type family is not - injective.</para> - </listitem> - <listitem><para>In a <emphasis>closed type family</emphasis> all - equations are ordered and in one place. Equations are also checked - pair-wise but this time an equation has to be paired with all the - preceding equations. Of course a single-equation closed type family is - trivially injective (unless (1), (2) or (3) above holds). - </para> - <para>When checking a pair of closed type family equations GHC tries to - unify their RHSs. If they don't unify this pair of equations does not - violate the injectivity annotation. If the RHSs can be unified under some - substitution (possibly empty) then either the LHSs unify under the same - substitution or the LHS of the latter equation is subsumed by earlier - equations. If neither condition is met GHC reports that the type family is - not injective. 
- </para> - </listitem> - </orderedlist> - </para> - <para>Note that for the purpose of the injectivity check in bullets (4) and (5) - GHC uses a special variant of the unification algorithm that treats type family - applications as possibly unifying with anything.</para> - </sect3> - </sect2> - -</sect1> - - -<sect1 id="kind-polymorphism"> -<title>Kind polymorphism</title> -<para> -This section describes <emphasis>kind polymorphism</emphasis>, an extension -enabled by <option>-XPolyKinds</option>. -It is described in more detail in the paper -<ulink url="http://dreixel.net/research/pdf/ghp.pdf">Giving Haskell a -Promotion</ulink>, which appeared at TLDI 2012. -</para> - -<sect2> <title>Overview of kind polymorphism</title> - -<para> -Currently there is a lot of code duplication in the way Typeable is implemented -(<xref linkend="deriving-typeable"/>): -<programlisting> -class Typeable (t :: *) where - typeOf :: t -> TypeRep - -class Typeable1 (t :: * -> *) where - typeOf1 :: t a -> TypeRep - -class Typeable2 (t :: * -> * -> *) where - typeOf2 :: t a b -> TypeRep -</programlisting> -</para> - -<para> -Kind polymorphism (with <option>-XPolyKinds</option>) -allows us to merge all these classes into one: -<programlisting> -data Proxy t = Proxy - -class Typeable t where - typeOf :: Proxy t -> TypeRep - -instance Typeable Int where typeOf _ = TypeRep -instance Typeable [] where typeOf _ = TypeRep -</programlisting> -Note that the datatype <literal>Proxy</literal> has kind -<literal>forall k. k -> *</literal> (inferred by GHC), and the new -<literal>Typeable</literal> class has kind -<literal>forall k. k -> Constraint</literal>. -</para> - -<para> -Note the following specific points: -<itemizedlist> -<listitem><para> -Generally speaking, with <option>-XPolyKinds</option>, GHC will infer a polymorphic -kind for un-decorated declarations, whenever possible. 
For example, in GHCi -<programlisting> -ghci> :set -XPolyKinds -ghci> data T m a = MkT (m a) -ghci> :k T -T :: (k -> *) -> k -> * -</programlisting> -</para></listitem> - -<listitem><para> -GHC does not usually print explicit <literal>forall</literal>s, including kind <literal>forall</literal>s. -You can make GHC show them explicitly with <option>-fprint-explicit-foralls</option> -(see <xref linkend="options-help"/>): -<programlisting> -ghci> :set -XPolyKinds -ghci> :set -fprint-explicit-foralls -ghci> data T m a = MkT (m a) -ghci> :k T -T :: forall (k :: BOX). (k -> *) -> k -> * -</programlisting> -Here the kind variable <literal>k</literal> itself has a -kind annotation "<literal>BOX</literal>". This is just GHC's way of -saying "<literal>k</literal> is a kind variable". -</para></listitem> - -<listitem><para> -Just as in the world of terms, you can restrict polymorphism using a -kind signature (sometimes called a kind annotation) -<programlisting> -data T m (a :: *) = MkT (m a) --- GHC now infers kind T :: (* -> *) -> * -> * -</programlisting> -NB: <option>-XPolyKinds</option> implies <option>-XKindSignatures</option> (see <xref linkend="kinding"/>). -</para></listitem> - -<listitem><para> -The source language does not support an explicit <literal>forall</literal> for kind variables. Instead, when binding a type variable, -you can simply mention a kind -variable in a kind annotation for that type-variable binding, thus: -<programlisting> -data T (m :: k -> *) a = MkT (m a) --- GHC now infers kind T :: forall k. (k -> *) -> k -> * -</programlisting> -</para></listitem> - -<listitem><para> -The (implicit) kind "forall" is placed -just outside the outermost type-variable binding whose kind annotation mentions -the kind variable. For example -<programlisting> -f1 :: (forall a m. m a -> Int) -> Int - -- f1 :: forall (k::BOX). - -- (forall (a::k) (m::k->*). m a -> Int) - -- -> Int - -f2 :: (forall (a::k) m. 
m a -> Int) -> Int - -- f2 :: (forall (k::BOX) (a::k) (m::k->*). m a -> Int) - -- -> Int -</programlisting> -Here in <literal>f1</literal> there is no kind annotation mentioning the polymorphic -kind variable, so <literal>k</literal> is generalised at the top -level of the signature for <literal>f1</literal>. -But in the case of <literal>f2</literal> we give a kind annotation in the <literal>forall (a::k)</literal> -binding, and GHC therefore puts the kind <literal>forall</literal> right there too. -This design decision makes the default case (<literal>f1</literal>) -as polymorphic as possible; remember that a <emphasis>more</emphasis> polymorphic argument type (as in <literal>f2</literal>) -makes the overall function <emphasis>less</emphasis> polymorphic, because there are fewer acceptable arguments. -</para></listitem> -</itemizedlist> -</para> -<para> -(Note: These rules are a bit indirect and clumsy. Perhaps GHC should allow explicit kind quantification. -But the implicit quantification (e.g. in the declaration for data type T above) is certainly -very convenient, and it is not clear what the syntax for explicit quantification should be.) -</para> -</sect2> - -<sect2> <title>Principles of kind inference</title> - -<para> -Generally speaking, when <option>-XPolyKinds</option> is on, GHC tries to infer the most -general kind for a declaration. For example: -<programlisting> -data T f a = MkT (f a) -- GHC infers: - -- T :: forall k. (k->*) -> k -> * -</programlisting> -In this case the definition has a right-hand side to inform kind inference. -But that is not always the case. Consider -<programlisting> -type family F a -</programlisting> -Type family declarations have no right-hand side, but GHC must still infer a kind -for <literal>F</literal>. Since there are no constraints, it could infer -<literal>F :: forall k1 k2. k1 -> k2</literal>, but that seems <emphasis>too</emphasis> -polymorphic. 
So GHC defaults those entirely-unconstrained kind variables to <literal>*</literal> and -we get <literal>F :: * -> *</literal>. You can still declare <literal>F</literal> to be -kind-polymorphic using kind signatures: -<programlisting> -type family F1 a -- F1 :: * -> * -type family F2 (a :: k) -- F2 :: forall k. k -> * -type family F3 a :: k -- F3 :: forall k. * -> k -type family F4 (a :: k1) :: k2 -- F4 :: forall k1 k2. k1 -> k2 -</programlisting> -</para> -<para> -The general principle is this: -<itemizedlist> -<listitem><para> -<emphasis>When there is a right-hand side, GHC -infers the most polymorphic kind consistent with the right-hand side.</emphasis> -Examples: ordinary data type and GADT declarations, class declarations. -In the case of a class declaration the role of "right-hand side" is played -by the class method signatures. -</para></listitem> -<listitem><para> -<emphasis>When there is no right-hand side, GHC defaults argument and result kinds to <literal>*</literal>, -except when directed otherwise by a kind signature</emphasis>. -Examples: data and type family declarations. -</para></listitem> -</itemizedlist> -This rule has occasionally-surprising consequences -(see <ulink url="https://ghc.haskell.org/trac/ghc/ticket/10132">Trac 10132</ulink>). -<programlisting> -class C a where -- Class declarations are generalised - -- so C :: forall k. k -> Constraint - data D1 a -- No right-hand side for these two family - type F1 a -- declarations, but the class forces (a :: k) - -- so D1, F1 :: forall k. k -> * - -data D2 a -- No right-hand side so D2 :: * -> * -type F2 a -- No right-hand side so F2 :: * -> * -</programlisting> -The kind-polymorphism from the class declaration makes <literal>D1</literal> -kind-polymorphic, but not so <literal>D2</literal>; and similarly <literal>F1</literal>, <literal>F2</literal>. 
-</para> -</sect2> - -<sect2 id="complete-kind-signatures"> <title>Polymorphic kind recursion and complete kind signatures</title> - -<para> -Just as in type inference, kind inference for recursive types can only use <emphasis>monomorphic</emphasis> recursion. -Consider this (contrived) example: -<programlisting> -data T m a = MkT (m a) (T Maybe (m a)) --- GHC infers kind T :: (* -> *) -> * -> * -</programlisting> -The recursive use of <literal>T</literal> forced the second argument to have kind <literal>*</literal>. -However, just as in type inference, you can achieve polymorphic recursion by giving a -<emphasis>complete kind signature</emphasis> for <literal>T</literal>. A complete -kind signature is present when all argument kinds and the result kind are known, without -any need for inference. For example: -<programlisting> -data T (m :: k -> *) :: k -> * where - MkT :: m a -> T Maybe (m a) -> T m a -</programlisting> -The complete user-supplied kind signature specifies the polymorphic kind for <literal>T</literal>, -and this signature is used for all the calls to <literal>T</literal> including the recursive ones. -In particular, the recursive use of <literal>T</literal> is at kind <literal>*</literal>. -</para> - -<para> -What exactly is considered to be a "complete user-supplied kind signature" for a type constructor? -These are the forms: -<itemizedlist> -<listitem><para>For a datatype, every type variable must be annotated with a kind. In a -GADT-style declaration, there may also be a kind signature (with a top-level -<literal>::</literal> in the header), but the presence or absence of this annotation -does not affect whether or not the declaration has a complete signature. -<programlisting> -data T1 :: (k -> *) -> k -> * where ... -- Yes T1 :: forall k. (k->*) -> k -> * -data T2 (a :: k -> *) :: k -> * where ... -- Yes T2 :: forall k. (k->*) -> k -> * -data T3 (a :: k -> *) (b :: k) :: * where ... -- Yes T3 :: forall k. 
(k->*) -> k -> * -data T4 (a :: k -> *) (b :: k) where ... -- Yes T4 :: forall k. (k->*) -> k -> * - -data T5 a (b :: k) :: * where ... -- NO kind is inferred -data T6 a b where ... -- NO kind is inferred -</programlisting></para> -</listitem> - -<listitem><para> -For a class, every type variable must be annotated with a kind. -</para></listitem> - -<listitem><para> -For a type synonym, every type variable and the result type must all be annotated -with kinds. -<programlisting> -type S1 (a :: k) = (a :: k) -- Yes S1 :: forall k. k -> k -type S2 (a :: k) = a -- No kind is inferred -type S3 (a :: k) = Proxy a -- No kind is inferred -</programlisting> -Note that in <literal>S2</literal> and <literal>S3</literal>, the kind of the -right-hand side is rather apparent, but it is still not considered to have a complete -signature -- no inference can be done before detecting the signature.</para></listitem> - -<listitem><para> -An open type or data family declaration <emphasis>always</emphasis> has a -complete user-specified kind signature; un-annotated type variables default to -kind <literal>*</literal>. -<programlisting> -data family D1 a -- D1 :: * -> * -data family D2 (a :: k) -- D2 :: forall k. k -> * -data family D3 (a :: k) :: * -- D3 :: forall k. k -> * -type family S1 a :: k -> * -- S1 :: forall k. * -> k -> * - -class C a where -- C :: k -> Constraint - type AT a b -- AT :: k -> * -> * -</programlisting> -In the last example, the variable <literal>a</literal> has an implicit kind -variable annotation from the class declaration. It keeps its polymorphic kind -in the associated type declaration. The variable <literal>b</literal>, however, -gets defaulted to <literal>*</literal>. -</para></listitem> - -<listitem><para> -A closed type family has a complete signature when all of its type variables -are annotated and a return kind (with a top-level <literal>::</literal>) is supplied. 
-</para></listitem> -</itemizedlist> -</para> - -</sect2> - -<sect2><title>Kind inference in closed type families</title> - -<para>Although all open type families are considered to have a complete -user-specified kind signature, we can relax this condition for closed type -families, where we have equations on which to perform kind inference. GHC will -infer kinds for the arguments and result types of a closed type family.</para> - -<para>GHC supports <emphasis>kind-indexed</emphasis> type families, where the -family matches both on the kind and type. GHC will <emphasis>not</emphasis> infer -this behaviour without a complete user-supplied kind signature, as doing so would -sometimes infer non-principal types.</para> - -<para>For example: -<programlisting> -type family F1 a where - F1 True = False - F1 False = True - F1 x = x --- F1 fails to compile: kind-indexing is not inferred - -type family F2 (a :: k) where - F2 True = False - F2 False = True - F2 x = x --- F2 fails to compile: no complete signature - -type family F3 (a :: k) :: k where - F3 True = False - F3 False = True - F3 x = x --- OK -</programlisting></para> - -</sect2> - -<sect2><title>Kind inference in class instance declarations</title> - -<para>Consider the following example of a poly-kinded class and an instance for it:</para> - -<programlisting> -class C a where - type F a - -instance C b where - type F b = b -> b -</programlisting> - -<para>In the class declaration, nothing constrains the kind of the type -<literal>a</literal>, so it becomes a poly-kinded type variable <literal>(a :: k)</literal>. -Yet, in the instance declaration, the right-hand side of the associated type instance -<literal>b -> b</literal> says that <literal>b</literal> must be of kind <literal>*</literal>. GHC could theoretically propagate this information back into the instance head, and -make that instance declaration apply only to types of kind <literal>*</literal>, as opposed -to types of any kind. 
However, GHC does <emphasis>not</emphasis> do this.</para> - -<para>In short: GHC does <emphasis>not</emphasis> propagate kind information from -the members of a class instance declaration into the instance declaration head.</para> - -<para>This lack of kind inference is simply an engineering problem within GHC, but -getting it to work would make a substantial change to the inference infrastructure, -and it's not clear the payoff is worth it. If you want to restrict <literal>b</literal>'s -kind in the instance above, just use a kind signature in the instance head.</para> - -</sect2> -</sect1> - -<sect1 id="promotion"> -<title>Datatype promotion</title> - -<para> -This section describes <emphasis>data type promotion</emphasis>, an extension -to the kind system that complements kind polymorphism. It is enabled by <option>-XDataKinds</option>, -and described in more detail in the paper -<ulink url="http://dreixel.net/research/pdf/ghp.pdf">Giving Haskell a -Promotion</ulink>, which appeared at TLDI 2012. -</para> - -<sect2> <title>Motivation</title> - -<para> -Standard Haskell has a rich type language. Types classify terms and serve to -avoid many common programming mistakes. The kind language, however, is -relatively simple, distinguishing only lifted types (kind <literal>*</literal>), -type constructors (e.g. kind <literal>* -> * -> *</literal>), and unlifted -types (<xref linkend="glasgow-unboxed"/>). In particular when using advanced -type system features, such as type families (<xref linkend="type-families"/>) -or GADTs (<xref linkend="gadt"/>), this simple kind system is insufficient, -and fails to prevent simple errors. Consider the example of type-level natural -numbers, and length-indexed vectors: -<programlisting> -data Ze -data Su n - -data Vec :: * -> * -> * where - Nil :: Vec a Ze - Cons :: a -> Vec a n -> Vec a (Su n) -</programlisting> -The kind of <literal>Vec</literal> is <literal>* -> * -> *</literal>. This means -that eg. 
<literal>Vec Int Char</literal> is a well-kinded type, even though this -is not what we intend when defining length-indexed vectors. -</para> - -<para> -With <option>-XDataKinds</option>, the example above can then be -rewritten to: -<programlisting> -data Nat = Ze | Su Nat - -data Vec :: * -> Nat -> * where - Nil :: Vec a Ze - Cons :: a -> Vec a n -> Vec a (Su n) -</programlisting> -With the improved kind of <literal>Vec</literal>, things like -<literal>Vec Int Char</literal> are now ill-kinded, and GHC will report an -error. -</para> -</sect2> - -<sect2><title>Overview</title> -<para> -With <option>-XDataKinds</option>, GHC automatically promotes every suitable -datatype to be a kind, and its (value) constructors to be type constructors. -The following types -<programlisting> -data Nat = Ze | Su Nat - -data List a = Nil | Cons a (List a) - -data Pair a b = Pair a b - -data Sum a b = L a | R b -</programlisting> -give rise to the following kinds and type constructors: -<programlisting> -Nat :: BOX -Ze :: Nat -Su :: Nat -> Nat - -List k :: BOX -Nil :: List k -Cons :: k -> List k -> List k - -Pair k1 k2 :: BOX -Pair :: k1 -> k2 -> Pair k1 k2 - -Sum k1 k2 :: BOX -L :: k1 -> Sum k1 k2 -R :: k2 -> Sum k1 k2 -</programlisting> -where <literal>BOX</literal> is the (unique) sort that classifies kinds. -Note that <literal>List</literal>, for instance, does not get sort -<literal>BOX -> BOX</literal>, because we do not further classify kinds; all -kinds have sort <literal>BOX</literal>. -</para> - -<para> -The following restrictions apply to promotion: -<itemizedlist> - <listitem><para>We promote <literal>data</literal> types and <literal>newtypes</literal>, - but not type synonyms, or type/data families (<xref linkend="type-families"/>). - </para></listitem> - <listitem><para>We only promote types whose kinds are of the form - <literal>* -> ... -> * -> *</literal>. 
In particular, we do not promote - higher-kinded datatypes such as <literal>data Fix f = In (f (Fix f))</literal>, - or datatypes whose kinds involve promoted types such as - <literal>Vec :: * -> Nat -> *</literal>.</para></listitem> - <listitem><para>We do not promote data constructors that are kind - polymorphic, involve constraints, mention type or data families, or involve types that - are not promotable. - </para></listitem> -</itemizedlist> -</para> -</sect2> - -<sect2 id="promotion-syntax"> -<title>Distinguishing between types and constructors</title> -<para> -Since constructors and types share the same namespace, with promotion you can -get ambiguous type names: -<programlisting> -data P -- 1 - -data Prom = P -- 2 - -type T = P -- 1 or promoted 2? -</programlisting> -In these cases, if you want to refer to the promoted constructor, you should -prefix its name with a quote: -<programlisting> -type T1 = P -- 1 - -type T2 = 'P -- promoted 2 -</programlisting> -Note that promoted datatypes give rise to named kinds. Since these can never be -ambiguous, we do not allow quotes in kind names. -</para> -<para>Just as in the case of Template Haskell (<xref linkend="th-syntax"/>), there is -no way to quote a data constructor or type constructor whose second character -is a single quote.</para> -</sect2> - -<sect2 id="promoted-lists-and-tuples"> -<title>Promoted list and tuple types</title> -<para> -With <option>-XDataKinds</option>, Haskell's list and tuple types are natively promoted to kinds, and enjoy the -same convenient syntax at the type level, albeit prefixed with a quote: -<programlisting> -data HList :: [*] -> * where - HNil :: HList '[] - HCons :: a -> HList t -> HList (a ': t) - -data Tuple :: (*,*) -> * where - Tuple :: a -> b -> Tuple '(a,b) - -foo0 :: HList '[] -foo0 = HNil - -foo1 :: HList '[Int] -foo1 = HCons (3::Int) HNil - -foo2 :: HList [Int, Bool] -foo2 = ... 
-</programlisting> -(Note: the declaration for <literal>HCons</literal> also requires <option>-XTypeOperators</option> -because of the infix type operator <literal>(':)</literal>.) -For type-level lists of <emphasis>two or more elements</emphasis>, -such as the signature of <literal>foo2</literal> above, the quote may be omitted because the meaning is -unambiguous. But for lists of one or zero elements (as in <literal>foo0</literal> -and <literal>foo1</literal>), the quote is required, because the types <literal>[]</literal> -and <literal>[Int]</literal> have existing meanings in Haskell. -</para> -</sect2> - -<sect2 id="promotion-existentials"> -<title>Promoting existential data constructors</title> -<para> -Note that we do promote existential data constructors that are otherwise suitable. -For example, consider the following: -<programlisting> -data Ex :: * where - MkEx :: forall a. a -> Ex -</programlisting> -Both the type <literal>Ex</literal> and the data constructor <literal>MkEx</literal> -get promoted, with the polymorphic kind <literal>'MkEx :: forall k. k -> Ex</literal>. -Somewhat surprisingly, you can write a type family to extract the member -of a type-level existential: -<programlisting> -type family UnEx (ex :: Ex) :: k -type instance UnEx (MkEx x) = x -</programlisting> -At first blush, <literal>UnEx</literal> seems poorly-kinded. The return kind -<literal>k</literal> is not mentioned in the arguments, and thus it would seem -that an instance would have to return a member of <literal>k</literal> -<emphasis>for any</emphasis> <literal>k</literal>. However, this is not the -case. The type family <literal>UnEx</literal> is a kind-indexed type family. -The return kind <literal>k</literal> is an implicit parameter to <literal>UnEx</literal>. 
-The elaborated definitions are as follows: -<programlisting> -type family UnEx (k :: BOX) (ex :: Ex) :: k -type instance UnEx k (MkEx k x) = x -</programlisting> -Thus, the instance triggers only when the implicit parameter to <literal>UnEx</literal> -matches the implicit parameter to <literal>MkEx</literal>. Because <literal>k</literal> -is actually a parameter to <literal>UnEx</literal>, the kind is not escaping the -existential, and the above code is valid. -</para> - -<para> -See also <ulink url="http://ghc.haskell.org/trac/ghc/ticket/7347">Trac #7347</ulink>. -</para> -</sect2> - -<sect2> -<title>Promoting type operators</title> -<para> -Type operators are <emphasis>not</emphasis> promoted to the kind level. Why not? Because -<literal>*</literal> is a kind, parsed the way identifiers are. Thus, if a programmer -tried to write <literal>Either * Bool</literal>, would it be <literal>Either</literal> -applied to <literal>*</literal> and <literal>Bool</literal>? Or would it be -<literal>*</literal> applied to <literal>Either</literal> and <literal>Bool</literal>? -To avoid this quagmire, we simply forbid promoting type operators to the kind level. -</para> -</sect2> - - -</sect1> - -<sect1 id="type-level-literals"> -<title>Type-Level Literals</title> -<para> -GHC supports numeric and string literals at the type level, giving convenient -access to a large number of predefined type-level constants. -Numeric literals are of kind <literal>Nat</literal>, while string literals -are of kind <literal>Symbol</literal>. -This feature is enabled by the <option>-XDataKinds</option> -language extension. -</para> - -<para> -The kinds of the literals and all other low-level operations for this feature -are defined in module <literal>GHC.TypeLits</literal>. Note that the module -defines some type-level operators that clash with their value-level -counterparts (e.g. <literal>(+)</literal>). 
Import and export declarations -referring to these operators require an explicit namespace -annotation (see <xref linkend="explicit-namespaces"/>). -</para> - -<para> -Here is an example of using type-level numeric literals to provide a safe -interface to a low-level function: -<programlisting> -import GHC.TypeLits -import Data.Word -import Foreign - -newtype ArrPtr (n :: Nat) a = ArrPtr (Ptr a) - -clearPage :: ArrPtr 4096 Word8 -> IO () -clearPage (ArrPtr p) = ... -</programlisting> -</para> - -<para> -Here is an example of using type-level string literals to simulate -simple record operations: -<programlisting> -data Label (l :: Symbol) = Get - -class Has a l b | a l -> b where - from :: a -> Label l -> b - -data Point = Point Int Int deriving Show - -instance Has Point "x" Int where from (Point x _) _ = x -instance Has Point "y" Int where from (Point _ y) _ = y - -example = from (Point 1 2) (Get :: Label "x") -</programlisting> -</para> - -<sect2 id="typelit-runtime"> -<title>Runtime Values for Type-Level Literals</title> -<para> -Sometimes it is useful to access the value-level literal associated with -a type-level literal. This is done with the functions -<literal>natVal</literal> and <literal>symbolVal</literal>. For example: -<programlisting> -GHC.TypeLits> natVal (Proxy :: Proxy 2) -2 -</programlisting> -These functions are overloaded because they need to return a different -result, depending on the type at which they are instantiated. -<programlisting> -natVal :: KnownNat n => proxy n -> Integer - --- instance KnownNat 0 --- instance KnownNat 1 --- instance KnownNat 2 --- ... -</programlisting> -GHC discharges the constraint as soon as it knows what concrete -type-level literal is being used in the program. Note that this works -only for <emphasis>literals</emphasis> and not arbitrary type expressions. 
-For example, a constraint of the form <literal>KnownNat (a + b)</literal> -will <emphasis>not</emphasis> be simplified to -<literal>(KnownNat a, KnownNat b)</literal>; instead, GHC will keep the -constraint as is, until it can simplify <literal>a + b</literal> to -a constant value. -</para> -</sect2> - -<para> -It is also possible to convert a run-time integer or string value to -the corresponding type-level literal. Of course, the resulting type -literal will be unknown at compile-time, so it is hidden in an existential -type. The conversion may be performed using <literal>someNatVal</literal> -for integers and <literal>someSymbolVal</literal> for strings: -<programlisting> -someNatVal :: Integer -> Maybe SomeNat -SomeNat :: KnownNat n => Proxy n -> SomeNat -</programlisting> -The operations on strings are similar. -</para> - -<sect2 id="typelit-tyfuns"> -<title>Computing With Type-Level Naturals</title> -<para> -GHC 7.8 can evaluate arithmetic expressions involving type-level natural -numbers. Such expressions may be constructed using the type-families -<literal>(+), (*), (^)</literal> for addition, multiplication, -and exponentiation. Numbers may be compared using <literal>(<=?)</literal>, -which returns a promoted boolean value, or <literal>(<=)</literal>, which -compares numbers as a constraint. For example: -<programlisting> -GHC.TypeLits> natVal (Proxy :: Proxy (2 + 3)) -5 -</programlisting> -</para> -<para> -At present, GHC is quite limited in its reasoning about arithmetic: -it will only evaluate the arithmetic type functions and compare the results--- -in the same way that it does for any other type function. In particular, -it does not know more general facts about arithmetic, such as the commutativity -and associativity of <literal>(+)</literal>, for example. -</para> - -<para> -However, it is possible to perform a bit of "backwards" evaluation. 
-For example, here is how we could get GHC to compute arbitrary logarithms -at the type level: -<programlisting> -lg :: Proxy base -> Proxy (base ^ pow) -> Proxy pow -lg _ _ = Proxy - -GHC.TypeLits> natVal (lg (Proxy :: Proxy 2) (Proxy :: Proxy 8)) -3 -</programlisting> -</para> -</sect2> - - -</sect1> - - - <sect1 id="equality-constraints"> - <title>Equality constraints</title> - <para> - A type context can include equality constraints of the form <literal>t1 ~ - t2</literal>, which denote that the types <literal>t1</literal> - and <literal>t2</literal> need to be the same. In the presence of type - families, whether two types are equal cannot generally be decided - locally. Hence, the contexts of function signatures may include - equality constraints, as in the following example: -<programlisting> -sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2 -</programlisting> - where we require that the element type of <literal>c1</literal> - and <literal>c2</literal> are the same. In general, the - types <literal>t1</literal> and <literal>t2</literal> of an equality - constraint may be arbitrary monotypes; i.e., they may not contain any - quantifiers, independent of whether higher-rank types are otherwise - enabled. - </para> - <para> - Equality constraints can also appear in class and instance contexts. - The former enable a simple translation of programs using functional - dependencies into programs using family synonyms instead. The general - idea is to rewrite a class declaration of the form -<programlisting> -class C a b | a -> b -</programlisting> - to -<programlisting> -class (F a ~ b) => C a b where - type F a -</programlisting> - That is, we represent every functional dependency (FD) <literal>a1 .. an - -> b</literal> by an FD type family <literal>F a1 .. an</literal> and a - superclass context equality <literal>F a1 .. an ~ b</literal>, - essentially giving a name to the functional dependency. 
In class - instances, we define the type instances of FD families in accordance - with the class head. Method signatures are not affected by that - process. - </para> - - <sect2 id="coercible"> - <title>The <literal>Coercible</literal> constraint</title> - <para> - The constraint <literal>Coercible t1 t2</literal> is similar to <literal>t1 ~ - t2</literal>, but denotes representational equality between - <literal>t1</literal> and <literal>t2</literal> in the sense of Roles - (<xref linkend="roles"/>). It is exported by - <ulink url="&libraryBaseLocation;/Data-Coerce.html"><literal>Data.Coerce</literal></ulink>, - which also contains the documentation. More details and discussion can be found in - the paper - <ulink href="http://www.cis.upenn.edu/~eir/papers/2014/coercible/coercible.pdf">Safe Coercions</ulink>. - </para> - </sect2> - - </sect1> - -<sect1 id="constraint-kind"> -<title>The <literal>Constraint</literal> kind</title> - -<para> - Normally, <emphasis>constraints</emphasis> (which appear in types to the left of the - <literal>=></literal> arrow) have a very restricted syntax. They can only be: - <itemizedlist> - <listitem> - <para>Class constraints, e.g. <literal>Show a</literal></para> - </listitem> - <listitem> - <para><link linkend="implicit-parameters">Implicit parameter</link> constraints, - e.g. <literal>?x::Int</literal> (with the <option>-XImplicitParams</option> flag)</para> - </listitem> - <listitem> - <para><link linkend="equality-constraints">Equality constraints</link>, - e.g. <literal>a ~ Int</literal> (with the <option>-XTypeFamilies</option> or - <option>-XGADTs</option> flag)</para> - </listitem> - </itemizedlist> -</para> - -<para> - With the <option>-XConstraintKinds</option> flag, GHC becomes more liberal in - what it accepts as constraints in your program. To be precise, with this flag any - <emphasis>type</emphasis> of the new kind <literal>Constraint</literal> can be used as a constraint. 
- The following things have kind <literal>Constraint</literal>: - - <itemizedlist> - <listitem> - Anything which is already valid as a constraint without the flag: saturated applications to type classes, - implicit parameter and equality constraints. - </listitem> - <listitem> - Tuples, all of whose component types have kind <literal>Constraint</literal>. So for example the - type <literal>(Show a, Ord a)</literal> is of kind <literal>Constraint</literal>. - </listitem> - <listitem> - Anything whose form is not yet known, but the user has declared to have kind <literal>Constraint</literal> - (for which they need to import it from <literal>GHC.Exts</literal>). So for example - <literal>type Foo (f :: * -> Constraint) = forall b. f b => b -> b</literal> is allowed, as well as - examples involving type families: -<programlisting> -type family Typ a b :: Constraint -type instance Typ Int b = Show b -type instance Typ Bool b = Num b - -func :: Typ a b => a -> b -> b -func = ... -</programlisting> - </listitem> - </itemizedlist> -</para> - -<para> - Note that because constraints are just handled as types of a particular kind, this extension allows type - constraint synonyms: -</para> - -<programlisting> -type Stringy a = (Read a, Show a) -foo :: Stringy a => a -> (String, String -> a) -foo x = (show x, read) -</programlisting> - -<para> - Presently, only standard constraints, tuples and type synonyms for those two sorts of constraint are - permitted in instance contexts and superclasses (without extra flags). 
The reason is that permitting more general - constraints can cause type checking to loop, as it would with these two programs: -</para> - -<programlisting> -type family Clsish u a -type instance Clsish () a = Cls a -class Clsish () a => Cls a where -</programlisting> - -<programlisting> -class OkCls a where - -type family OkClsish u a -type instance OkClsish () a = OkCls a -instance OkClsish () a => OkCls a where -</programlisting> - -<para> - You may write programs that use exotic sorts of constraints in instance contexts and superclasses, but - to do so you must use <option>-XUndecidableInstances</option> to signal that you don't mind if the type checker - fails to terminate. -</para> - -</sect1> - -<sect1 id="other-type-extensions"> -<title>Other type system extensions</title> - -<sect2 id="explicit-foralls"><title>Explicit universal quantification (forall)</title> -<para> -Haskell type signatures are implicitly quantified. When the language option <option>-XExplicitForAll</option> -is used, the keyword <literal>forall</literal> -allows us to say exactly what this means. For example: -</para> -<para> -<programlisting> - g :: b -> b -</programlisting> -means this: -<programlisting> - g :: forall b. (b -> b) -</programlisting> -The two are treated identically. -</para> -<para> -Of course <literal>forall</literal> becomes a keyword; you can't use <literal>forall</literal> as -a type variable any more! -</para> -</sect2> - - -<sect2 id="flexible-contexts"><title>The context of a type signature</title> -<para> -The <option>-XFlexibleContexts</option> flag lifts the Haskell 98 restriction -that the type-class constraints in a type signature must have the -form <emphasis>(class type-variable)</emphasis> or -<emphasis>(class (type-variable type1 type2 ... typen))</emphasis>. -With <option>-XFlexibleContexts</option> -these type signatures are perfectly OK -<programlisting> - g :: Eq [a] => ... - g :: Ord (T a ()) => ... 
-</programlisting> -The flag <option>-XFlexibleContexts</option> also lifts the corresponding -restriction on class declarations (<xref linkend="superclass-rules"/>) and instance declarations -(<xref linkend="instance-rules"/>). -</para> -</sect2> - -<sect2 id="ambiguity"><title>Ambiguous types and the ambiguity check</title> - -<para> -Each user-written type signature is subjected to an -<emphasis>ambiguity check</emphasis>. -The ambiguity check rejects functions that can never be called; for example: -<programlisting> - f :: C a => Int -</programlisting> -The idea is there can be no legal calls to <literal>f</literal> because every call will -give rise to an ambiguous constraint. -Indeed, the <emphasis>only</emphasis> purpose of the -ambiguity check is to report functions that cannot possibly be called. -We could soundly omit the -ambiguity check on type signatures entirely, at the expense of -delaying ambiguity errors to call sites. Indeed, the language extension -<option>-XAllowAmbiguousTypes</option> switches off the ambiguity check. -</para> -<para> -Ambiguity can be subtle. Consider this example which uses functional dependencies: -<programlisting> - class D a b | a -> b where .. - h :: D Int b => Int -</programlisting> -The <literal>Int</literal> may well fix <literal>b</literal> at the call site, so that signature should -not be rejected. Moreover, the dependencies might be hidden. Consider -<programlisting> - class X a b where ... - class D a b | a -> b where ... - instance D a b => X [a] b where... - h :: X a b => a -> a -</programlisting> -Here <literal>h</literal>'s type looks ambiguous in <literal>b</literal>, but here's a legal call: -<programlisting> - ...(h [True])... -</programlisting> -That gives rise to a <literal>(X [Bool] beta)</literal> constraint, and using the -instance means we need <literal>(D Bool beta)</literal> and that -fixes <literal>beta</literal> via <literal>D</literal>'s -fundep! 
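The simpler, non-hidden functional-dependency case above can be tried directly. The following is a minimal sketch rather than anything taken from GHC itself: the instance <literal>D Int Bool</literal> and the body of <literal>h</literal> are hypothetical, chosen only for illustration, and the list of extensions is an assumption about what a current GHC needs in order to accept the module.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleContexts #-}
module Main where

-- The functional dependency a -> b says that b is determined by a.
class D a b | a -> b

-- Hypothetical instance: choosing a = Int fixes b = Bool.
instance D Int Bool

-- b does not occur to the right of =>, yet the signature is accepted,
-- because the fundep determines b once a is known to be Int.
h :: D Int b => Int
h = 42

main :: IO ()
main = print h  -- the call needs (D Int beta); the fundep fixes beta
```

Dropping the <literal>| a -> b</literal> annotation from the class should make the signature of <literal>h</literal> fail the ambiguity check, since nothing would then determine <literal>b</literal>.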
-</para> -<para> -Behind all these special cases there is a simple guiding principle. -Consider -<programlisting> - f :: <replaceable>type</replaceable> - f = ...blah... - - g :: <replaceable>type</replaceable> - g = f -</programlisting> -You would think that the definition of <literal>g</literal> would surely typecheck! -After all <literal>f</literal> has exactly the same type, and <literal>g=f</literal>. -But in fact <literal>f</literal>'s type -is instantiated and the instantiated constraints are solved against -the constraints bound by <literal>g</literal>'s signature. So, in the case of an ambiguous type, solving will fail. -For example, consider the earlier definition <literal>f :: C a => Int</literal>: -<programlisting> - f :: C a => Int - f = ...blah... - - g :: C a => Int - g = f -</programlisting> -In <literal>g</literal>'s definition, -we'll instantiate to <literal>(C alpha)</literal> and try to -deduce <literal>(C alpha)</literal> from <literal>(C a)</literal>, -and fail. -</para> -<para> -So in fact we use this as our <emphasis>definition</emphasis> of ambiguity: a type -<literal><replaceable>ty</replaceable></literal> is -ambiguous if and only if <literal>((undefined :: <replaceable>ty</replaceable>) -:: <replaceable>ty</replaceable>)</literal> would fail to typecheck. We use a -very similar test for <emphasis>inferred</emphasis> types, to ensure that they too are -unambiguous. -</para> -<para><emphasis>Switching off the ambiguity check.</emphasis> -Even if a function has an ambiguous type according to the "guiding principle", -it is possible that the function is callable. For example: -<programlisting> - class D a b where ... - instance D Bool b where ... - - strange :: D a b => a -> a - strange = ...blah... 
- - foo = strange True -</programlisting> -Here <literal>strange</literal>'s type is ambiguous, but the call in <literal>foo</literal> -is OK because it gives rise to a constraint <literal>(D Bool beta)</literal>, which is -soluble by the <literal>(D Bool b)</literal> instance. So the language extension -<option>-XAllowAmbiguousTypes</option> allows you to switch off the ambiguity check. -But even with ambiguity checking switched off, GHC will complain about a function -that can <emphasis>never</emphasis> be called, such as this one: -<programlisting> - f :: (Int ~ Bool) => a -> a -</programlisting> -</para> - -<para> -<emphasis>A historical note.</emphasis> -GHC used to impose some more restrictive and less principled conditions -on type signatures. For the type -<literal>forall tv1..tvn (c1, ...,cn) => type</literal> -GHC used to require (a) that each universally quantified type variable -<literal>tvi</literal> must be "reachable" from <literal>type</literal>, -and (b) that every constraint <literal>ci</literal> mentions at least one of the -universally quantified type variables <literal>tvi</literal>. -These ad-hoc restrictions are completely subsumed by the new ambiguity check. -<emphasis>End of historical note.</emphasis> -</para> - -</sect2> - -<sect2 id="implicit-parameters"> -<title>Implicit parameters</title> - -<para> Implicit parameters are implemented as described in -"Implicit parameters: dynamic scoping with static types", -J Lewis, MB Shields, E Meijer, J Launchbury, -27th ACM Symposium on Principles of Programming Languages (POPL'00), -Boston, Jan 2000. 
-(Most of the following, still rather incomplete, documentation is -due to Jeff Lewis.)</para> - -<para>Implicit parameter support is enabled with the option -<option>-XImplicitParams</option>.</para> - -<para> -A variable is called <emphasis>dynamically bound</emphasis> when it is bound by the calling -context of a function and <emphasis>statically bound</emphasis> when bound by the callee's -context. In Haskell, all variables are statically bound. Dynamic -binding of variables is a notion that goes back to Lisp, but was later -discarded in more modern incarnations, such as Scheme. Dynamic binding -can be very confusing in an untyped language, and unfortunately, typed -languages, in particular Hindley-Milner typed languages like Haskell, -only support static scoping of variables. -</para> -<para> -However, by a simple extension to the type class system of Haskell, we -can support dynamic binding. Basically, we express the use of a -dynamically bound variable as a constraint on the type. These -constraints lead to types of the form <literal>(?x::t') => t</literal>, which says "this -function uses a dynamically-bound variable <literal>?x</literal> -of type <literal>t'</literal>". For -example, the following expresses the type of a sort function, -implicitly parameterised by a comparison function named <literal>cmp</literal>. -<programlisting> - sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] -</programlisting> -The dynamic binding constraints are just a new form of predicate in the type class system. -</para> -<para> -An implicit parameter occurs in an expression using the special form <literal>?x</literal>, -where <literal>x</literal> is -any valid identifier (e.g. <literal>ord ?x</literal> is a valid expression). -Use of this construct also introduces a new -dynamic-binding constraint in the type of the expression. 
-For example, the following definition -shows how we can define an implicitly parameterised sort function in -terms of an explicitly parameterised <literal>sortBy</literal> function: -<programlisting> - sortBy :: (a -> a -> Bool) -> [a] -> [a] - - sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] - sort = sortBy ?cmp -</programlisting> -</para> - -<sect3> -<title>Implicit-parameter type constraints</title> -<para> -Dynamic binding constraints behave just like other type class -constraints in that they are automatically propagated. Thus, when a -function is used, its implicit parameters are inherited by the -function that called it. For example, our <literal>sort</literal> function might be used -to pick out the least value in a list: -<programlisting> - least :: (?cmp :: a -> a -> Bool) => [a] -> a - least xs = head (sort xs) -</programlisting> -Without lifting a finger, the <literal>?cmp</literal> parameter is -propagated to become a parameter of <literal>least</literal> as well. With explicit -parameters, the default is that parameters must always be explicitly -propagated. With implicit parameters, the default is to always -propagate them. -</para> -<para> -An implicit-parameter type constraint differs from other type class constraints in the -following way: All uses of a particular implicit parameter must have -the same type. This means that the type of <literal>(?x, ?x)</literal> -is <literal>(?x::a) => (a,a)</literal>, and not -<literal>(?x::a, ?x::b) => (a, b)</literal>, as would be the case for type -class constraints. -</para> - -<para> You can't have an implicit parameter in the context of a class or instance -declaration. For example, both these declarations are illegal: -<programlisting> - class (?x::Int) => C a where ... - instance (?x::a) => Foo [a] where ... -</programlisting> -Reason: exactly which implicit parameter you pick up depends on exactly where -you invoke a function. 
But the "invocation" of instance declarations is done -behind the scenes by the compiler, so it's hard to figure out exactly where it is done. -The easiest thing is to outlaw the offending types.</para> -<para> -Implicit-parameter constraints do not cause ambiguity. For example, consider: -<programlisting> - f :: (?x :: [a]) => Int -> Int - f n = n + length ?x - - g :: (Read a, Show a) => String -> String - g s = show (read s) -</programlisting> -Here, <literal>g</literal> has an ambiguous type, and is rejected, but <literal>f</literal> -is fine. The binding for <literal>?x</literal> at <literal>f</literal>'s call site is -quite unambiguous, and fixes the type <literal>a</literal>. -</para> -</sect3> - -<sect3> -<title>Implicit-parameter bindings</title> - -<para> -An implicit parameter is <emphasis>bound</emphasis> using the standard -<literal>let</literal> or <literal>where</literal> binding forms. -For example, we define the <literal>min</literal> function by binding -<literal>?cmp</literal>. -<programlisting> - min :: Ord a => [a] -> a - min = let ?cmp = (<=) in least -</programlisting> -</para> -<para> -A group of implicit-parameter bindings may occur anywhere a normal group of Haskell -bindings can occur, except at top level. That is, they can occur in a <literal>let</literal> -(including in a list comprehension, or do-notation, or pattern guards), -or a <literal>where</literal> clause. -Note the following points: -<itemizedlist> -<listitem><para> -An implicit-parameter binding group must be a -collection of simple bindings to implicit-style variables (no -function-style bindings, and no type signatures); these bindings are -neither polymorphic nor recursive. -</para></listitem> -<listitem><para> -You may not mix implicit-parameter bindings with ordinary bindings in a -single <literal>let</literal> -expression; use two nested <literal>let</literal>s instead. -(In the case of <literal>where</literal> you are stuck, since you can't nest <literal>where</literal> clauses.) 
-</para></listitem> - -<listitem><para> -You may put multiple implicit-parameter bindings in a -single binding group; but they are <emphasis>not</emphasis> treated -as a mutually recursive group (as ordinary <literal>let</literal> bindings are). -Instead they are treated as a non-recursive group, simultaneously binding all the implicit -parameters. The bindings are not nested, and may be re-ordered without changing -the meaning of the program. -For example, consider: -<programlisting> - f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y -</programlisting> -The use of <literal>?x</literal> in the binding for <literal>?y</literal> does not "see" -the binding for <literal>?x</literal>, so the type of <literal>f</literal> is -<programlisting> - f :: (?x::Int) => Int -> Int -</programlisting> -</para></listitem> -</itemizedlist> -</para> - -</sect3> - -<sect3><title>Implicit parameters and polymorphic recursion</title> - -<para> -Consider these two definitions: -<programlisting> - len1 :: [a] -> Int - len1 xs = let ?acc = 0 in len_acc1 xs - - len_acc1 [] = ?acc - len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs - - ------------ - - len2 :: [a] -> Int - len2 xs = let ?acc = 0 in len_acc2 xs - - len_acc2 :: (?acc :: Int) => [a] -> Int - len_acc2 [] = ?acc - len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs -</programlisting> -The only difference between the two groups is that in the second group -<literal>len_acc2</literal> is given a type signature. -In the former case, <literal>len_acc1</literal> is monomorphic in its own -right-hand side, so the implicit parameter <literal>?acc</literal> is not -passed to the recursive call. In the latter case, because <literal>len_acc2</literal> -has a type signature, the recursive call is made to the -<emphasis>polymorphic</emphasis> version, which takes <literal>?acc</literal> -as an implicit parameter. 
So we get the following results in GHCi: -<programlisting> - Prog> len1 "hello" - 0 - Prog> len2 "hello" - 5 -</programlisting> -Adding a type signature dramatically changes the result! This is a rather -counter-intuitive phenomenon, worth watching out for. -</para> -</sect3> - -<sect3><title>Implicit parameters and monomorphism</title> - -<para>GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the -Haskell Report) to implicit parameters. For example, consider: -<programlisting> - f :: Int -> Int - f v = let ?x = 0 in - let y = ?x + v in - let ?x = 5 in - y -</programlisting> -Since the binding for <literal>y</literal> falls under the Monomorphism -Restriction it is not generalised, so the type of <literal>y</literal> is -simply <literal>Int</literal>, not <literal>(?x::Int) => Int</literal>. -Hence, <literal>(f 9)</literal> returns result <literal>9</literal>. -If you add a type signature for <literal>y</literal>, then <literal>y</literal> -will get type <literal>(?x::Int) => Int</literal>, so the occurrence of -<literal>y</literal> in the body of the <literal>let</literal> will see the -inner binding of <literal>?x</literal>, so <literal>(f 9)</literal> will return -<literal>14</literal>. -</para> -</sect3> - -<sect3 id="implicit-parameters-special"><title>Special implicit parameters</title> -<para> -GHC treats implicit parameters of type <literal>GHC.Types.CallStack</literal> -specially, by resolving them to the current location in the program. Consider: -<programlisting> - f :: String - f = show (?loc :: CallStack) -</programlisting> -GHC will automatically resolve <literal>?loc</literal> to its source -location. If another implicit parameter with type <literal>CallStack</literal> is -in scope, GHC will append the two locations, creating an explicit call-stack. 
For example: -<programlisting> - f :: (?stk :: CallStack) => String - f = show (?stk :: CallStack) -</programlisting> -will produce the location of <literal>?stk</literal>, followed by -<literal>f</literal>'s call-site. Note that the name of the implicit parameter does not -matter (we used <literal>?loc</literal> above), GHC will solve any implicit parameter -with the right type. The name does, however, matter when pushing new locations onto -existing stacks. Consider: -<programlisting> - f :: (?stk :: CallStack) => String - f = show (?loc :: CallStack) -</programlisting> -When we call <literal>f</literal>, the stack will include the use of <literal>?loc</literal>, -but not the call to <literal>f</literal>; in this case the names must match. -</para> -<para> -<literal>CallStack</literal> is kept abstract, but -GHC provides a function -<programlisting> - getCallStack :: CallStack -> [(String, SrcLoc)] -</programlisting> -to access the individual call-sites in the stack. The <literal>String</literal> -is the name of the function that was called, and the <literal>SrcLoc</literal> -provides the package, module, and file name, as well as the line and column -numbers. The stack will never be empty, as the first call-site -will be the location at which the implicit parameter was used. GHC will also -never infer <literal>?loc :: CallStack</literal> as a type constraint, which -means that functions must explicitly ask to be told about their call-sites. -</para> -<para> -A potential "gotcha" when using implicit <literal>CallStack</literal>s is that -the <literal>:type</literal> command in GHCi will not report the -<literal>?loc :: CallStack</literal> constraint, as the typechecker will -immediately solve it. Use <literal>:info</literal> instead to print the -unsolved type. -</para> -</sect3> -</sect2> - -<sect2 id="kinding"> -<title>Explicitly-kinded quantification</title> - -<para> -Haskell infers the kind of each type variable. 
Sometimes it is nice to be able -to give the kind explicitly as (machine-checked) documentation, -just as it is nice to give a type signature for a function. On some occasions, -it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) -John Hughes had to define the data type: -<screen> - data Set cxt a = Set [a] - | Unused (cxt a -> ()) -</screen> -The only use for the <literal>Unused</literal> constructor was to force the correct -kind for the type variable <literal>cxt</literal>. -</para> -<para> -GHC now instead allows you to specify the kind of a type variable directly, wherever -a type variable is explicitly bound, with the flag <option>-XKindSignatures</option>. -</para> -<para> -This flag enables kind signatures in the following places: -<itemizedlist> -<listitem><para><literal>data</literal> declarations: -<screen> - data Set (cxt :: * -> *) a = Set [a] -</screen></para></listitem> -<listitem><para><literal>type</literal> declarations: -<screen> - type T (f :: * -> *) = f Int -</screen></para></listitem> -<listitem><para><literal>class</literal> declarations: -<screen> - class (Eq a) => C (f :: * -> *) a where ... -</screen></para></listitem> -<listitem><para><literal>forall</literal>s in type signatures: -<screen> - f :: forall (cxt :: * -> *). Set cxt Int -</screen></para></listitem> -</itemizedlist> -</para> - -<para> -The parentheses are required. Some of the spaces are required too, to -separate the lexemes. If you write <literal>(f::*->*)</literal> you -will get a parse error, because "<literal>::*->*</literal>" is a -single lexeme in Haskell. -</para> - -<para> -As part of the same extension, you can put kind annotations in types -as well. Thus: -<screen> - f :: (Int :: *) -> Int - g :: forall a. a -> (a :: *) -</screen> -The syntax is -<screen> - atype ::= '(' ctype '::' kind ')' -</screen> -The parentheses are required. 
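The kind-annotated declarations listed above can be put together into a small module. This is only a sketch under stated assumptions: the helper <literal>size</literal> and the <literal>main</literal> driver are hypothetical additions for illustration, and the <literal>*</literal> syntax for kinds is assumed to be accepted by the compiler in use (newer GHC releases also offer <literal>Data.Kind.Type</literal> as an alternative spelling).

```haskell
{-# LANGUAGE KindSignatures #-}
module Main where

-- cxt is never applied on the right-hand side, so the annotation is
-- what states its kind; no Unused constructor is required.
data Set (cxt :: * -> *) a = Set [a]

-- A type synonym whose parameter is annotated with its kind.
type T (f :: * -> *) = f Int

-- Hypothetical helper: count the elements stored in a Set.
size :: Set cxt a -> Int
size (Set xs) = length xs

main :: IO ()
main = do
  print (size (Set "abc" :: Set Maybe Char))  -- prints 3
  print (Just 3 :: T Maybe)                   -- prints Just 3
```

Here <literal>Set Maybe Char</literal> is well-kinded because <literal>Maybe</literal> has kind <literal>* -> *</literal>; writing <literal>Set Int Char</literal> instead should be rejected with a kind error.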
-</para> -</sect2> - - -<sect2 id="universal-quantification"> -<title>Arbitrary-rank polymorphism -</title> - -<para> -GHC's type system supports <emphasis>arbitrary-rank</emphasis> -explicit universal quantification in -types. -For example, all the following types are legal: -<programlisting> - f1 :: forall a b. a -> b -> a - g1 :: forall a b. (Ord a, Eq b) => a -> b -> a - - f2 :: (forall a. a->a) -> Int -> Int - g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int - - f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool - - f4 :: Int -> (forall a. a -> a) -</programlisting> -Here, <literal>f1</literal> and <literal>g1</literal> are rank-1 types, and -can be written in standard Haskell (e.g. <literal>f1 :: a->b->a</literal>). -The <literal>forall</literal> makes explicit the universal quantification that -is implicitly added by Haskell. -</para> -<para> -The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types; -the <literal>forall</literal> is on the left of a function arrow. As <literal>g2</literal> -shows, the polymorphic type on the left of the function arrow can be overloaded. -</para> -<para> -The function <literal>f3</literal> has a rank-3 type; -it has rank-2 types on the left of a function arrow. -</para> -<para> -The language option <option>-XRankNTypes</option> (which implies <option>-XExplicitForAll</option>, <xref linkend="explicit-foralls"/>) -enables higher-rank types. -That is, you can nest <literal>forall</literal>s -arbitrarily deep in function arrows. -For example, a forall-type (also called a "type scheme"), -including a type-class context, is legal: -<itemizedlist> -<listitem> <para> On the left or right (see <literal>f4</literal>, for example) -of a function arrow </para> </listitem> -<listitem> <para> As the argument of a constructor, or type of a field, in a data type declaration. 
For -example, any of the <literal>f1,f2,f3,g1,g2</literal> above would be valid -field type signatures.</para> </listitem> -<listitem> <para> As the type of an implicit parameter </para> </listitem> -<listitem> <para> In a pattern type signature (see <xref linkend="scoped-type-variables"/>) </para> </listitem> -</itemizedlist> -The <option>-XRankNTypes</option> option is also required for any -type with a <literal>forall</literal> or -context to the right of an arrow (e.g. <literal>f :: Int -> forall a. a->a</literal>, or -<literal>g :: Int -> Ord a => a -> a</literal>). Such types are technically rank 1, but -are clearly not Haskell-98, and an extra flag did not seem worth the bother. -</para> - -<para> -In particular, in <literal>data</literal> and -<literal>newtype</literal> declarations the constructor arguments may -be polymorphic types of any rank; see examples in <xref linkend="univ"/>. -Note that the declared types are -nevertheless always monomorphic. This is important because by default -GHC will not instantiate type variables to a polymorphic type -(<xref linkend="impredicative-polymorphism"/>). -</para> -<para> -The obsolete language options <option>-XPolymorphicComponents</option> -and <option>-XRank2Types</option> are synonyms for -<option>-XRankNTypes</option>. They used to specify finer -distinctions that GHC no longer makes. (They should really elicit a -deprecation warning, but they don't, purely to avoid the need for -library authors to change their old flag specifications.) -</para> - -<sect3 id="univ"> -<title>Examples -</title> - -<para> -These are examples of <literal>data</literal> and <literal>newtype</literal> -declarations whose data constructors have polymorphic argument types: -<programlisting> -data T a = T1 (forall b. b -> b -> b) a - -data MonadT m = MkMonad { return :: forall a. a -> m a, - bind :: forall a b. m a -> (a -> m b) -> m b - } - -newtype Swizzle = MkSwizzle (forall a. 
Ord a => [a] -> [a]) -</programlisting> - -</para> - -<para> -The constructors have rank-2 types: -</para> - -<para> - -<programlisting> -T1 :: forall a. (forall b. b -> b -> b) -> a -> T a -MkMonad :: forall m. (forall a. a -> m a) - -> (forall a b. m a -> (a -> m b) -> m b) - -> MonadT m -MkSwizzle :: (forall a. Ord a => [a] -> [a]) -> Swizzle -</programlisting> - -</para> - -<para> -In earlier versions of GHC, it was possible to omit the <literal>forall</literal> -in the type of the constructor if there was an explicit context. For example: - -<programlisting> -newtype Swizzle' = MkSwizzle' (Ord a => [a] -> [a]) -</programlisting> - -As of GHC 7.10, this is deprecated. The <literal>-fwarn-context-quantification</literal> -flag detects this situation and issues a warning. In GHC 7.12, declarations -such as <literal>MkSwizzle'</literal> will cause an out-of-scope error. -</para> - -<para> -As for type signatures, implicit quantification happens for non-overloaded -types too. So if you write this: - -<programlisting> - f :: (a -> a) -> a -</programlisting> - -it's just as if you had written this: - -<programlisting> - f :: forall a. (a -> a) -> a -</programlisting> - -That is, since the type variable <literal>a</literal> isn't in scope, it's -implicitly universally quantified. -</para> - -<para> -You construct values of types <literal>T, MonadT, Swizzle</literal> by applying -the constructor to suitable values, just as usual. For example, -</para> - -<para> - -<programlisting> - a1 :: T Int - a1 = T1 (\x y -> x) 3 - - a2, a3 :: Swizzle - a2 = MkSwizzle sort - a3 = MkSwizzle reverse - - a4 :: MonadT Maybe - a4 = let r x = Just x - b m k = case m of - Just y -> k y - Nothing -> Nothing - in - MkMonad r b - - mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a] - mkTs f x y = [T1 f x, T1 f y] -</programlisting> - -</para> - -<para> -The type of the argument can, as usual, be more general than the type -required, as <literal>(MkSwizzle reverse)</literal> shows. 
(<function>reverse</function> -does not need the <literal>Ord</literal> constraint.) -</para> - -<para> -When you use pattern matching, the bound variables may now have -polymorphic types. For example: -</para> - -<para> - -<programlisting> - f :: T a -> a -> (a, Char) - f (T1 w k) x = (w k x, w 'c' 'd') - - g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] - g (MkSwizzle s) xs f = s (map f (s xs)) - - h :: MonadT m -> [m a] -> m [a] - h m [] = return m [] - h m (x:xs) = bind m x $ \y -> - bind m (h m xs) $ \ys -> - return m (y:ys) -</programlisting> - -</para> - -<para> -In the function <function>h</function> we use the record selectors <literal>return</literal> -and <literal>bind</literal> to extract the polymorphic bind and return functions -from the <literal>MonadT</literal> data structure, rather than using pattern -matching. -</para> -</sect3> - -<sect3> -<title>Type inference</title> - -<para> -In general, type inference for arbitrary-rank types is undecidable. -GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) -to get a decidable algorithm by requiring some help from the programmer. -We do not yet have a formal specification of "some help" but the rule is this: -</para> -<para> -<emphasis>For a lambda-bound or case-bound variable, x, either the programmer -provides an explicit polymorphic type for x, or GHC's type inference will assume -that x's type has no foralls in it</emphasis>. -</para> -<para> -What does it mean to "provide" an explicit type for x? You can do that by -giving a type signature for x directly, using a pattern type signature -(<xref linkend="scoped-type-variables"/>), thus: -<programlisting> - \ (f :: forall a. a->a) -> (f True, f 'c') -</programlisting> -Alternatively, you can give a type signature to the enclosing -context, which GHC can "push down" to find the type for the variable: -<programlisting> - (\ f -> (f True, f 'c')) :: (forall a. 
a->a) -> (Bool,Char) -</programlisting> -Here the type signature on the expression can be pushed inwards -to give a type signature for f. Similarly, and more commonly, -one can give a type signature for the function itself: -<programlisting> - h :: (forall a. a->a) -> (Bool,Char) - h f = (f True, f 'c') -</programlisting> -You don't need to give a type signature if the lambda bound variable -is a constructor argument. Here is an example we saw earlier: -<programlisting> - f :: T a -> a -> (a, Char) - f (T1 w k) x = (w k x, w 'c' 'd') -</programlisting> -Here we do not need to give a type signature to <literal>w</literal>, because -it is an argument of constructor <literal>T1</literal> and that tells GHC all -it needs to know. -</para> - -</sect3> - - -<sect3 id="implicit-quant"> -<title>Implicit quantification</title> - -<para> -GHC performs implicit quantification as follows. <emphasis>At the top level (only) of -user-written types, if and only if there is no explicit <literal>forall</literal>, -GHC finds all the type variables mentioned in the type that are not already -in scope, and universally quantifies them.</emphasis> For example, the following pairs are -equivalent: -<programlisting> - f :: a -> a - f :: forall a. a -> a - - g (x::a) = let - h :: a -> b -> b - h x y = y - in ... - g (x::a) = let - h :: forall b. a -> b -> b - h x y = y - in ... -</programlisting> -</para> -<para> -Notice that GHC does <emphasis>not</emphasis> find the innermost possible quantification -point. For example: -<programlisting> - f :: (a -> a) -> Int - -- MEANS - f :: forall a. (a -> a) -> Int - -- NOT - f :: (forall a. a -> a) -> Int - - - g :: (Ord a => a -> a) -> Int - -- MEANS the illegal type - g :: forall a. (Ord a => a -> a) -> Int - -- NOT - g :: (forall a. Ord a => a -> a) -> Int -</programlisting> -The latter produces an illegal type, which you might think is silly, -but at least the rule is simple. If you want the latter type, you -can write your for-alls explicitly. 
Indeed, doing so is strongly advised -for rank-2 types. -</para> -</sect3> -</sect2> - - -<sect2 id="impredicative-polymorphism"> -<title>Impredicative polymorphism -</title> -<para>In general, GHC will only instantiate a polymorphic function at -a monomorphic type (one with no foralls). For example, -<programlisting> -runST :: (forall s. ST s a) -> a -id :: forall b. b -> b - -foo = id runST -- Rejected -</programlisting> -The definition of <literal>foo</literal> is rejected because one would have to instantiate -<literal>id</literal>'s type with <literal>b := (forall s. ST s a) -> a</literal>, and -that is not allowed. -Instantiating polymorphic type variables with polymorphic types is called <emphasis>impredicative polymorphism</emphasis>. -</para> - -<para>GHC has extremely flaky support for <emphasis>impredicative polymorphism</emphasis>, -enabled with <option>-XImpredicativeTypes</option>. -If it worked, this would mean -that you <emphasis>could</emphasis> call a polymorphic function at a polymorphic type, and -parameterise data structures over polymorphic types. For example: -<programlisting> - f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char]) - f (Just g) = Just (g [3], g "hello") - f Nothing = Nothing -</programlisting> -Notice here that the <literal>Maybe</literal> type is parameterised by the -<emphasis>polymorphic</emphasis> type <literal>(forall a. [a] -> [a])</literal>. -However <emphasis>the extension should be considered highly experimental, and certainly unsupported</emphasis>. -You are welcome to try it, but please don't rely on it working consistently, or -working the same in subsequent releases. See -<ulink url="https://ghc.haskell.org/trac/ghc/wiki/ImpredicativePolymorphism">this wiki page</ulink> -for more details. -</para> -<para>If you want impredicative polymorphism, the main workaround is to use a newtype wrapper. 
The <literal>id runST</literal> example can be written with this workaround as follows: -<programlisting> -runST :: (forall s. ST s a) -> a -id :: forall b. b -> b - -newtype Wrap a = Wrap { unWrap :: (forall s. ST s a) -> a } - -foo :: (forall s. ST s a) -> a -foo = unWrap (id (Wrap runST)) - -- Here id is called at monomorphic type (Wrap a) -</programlisting> -</para> - -</sect2> - -<sect2 id="scoped-type-variables"> -<title>Lexically scoped type variables -</title> - -<para> -GHC supports <emphasis>lexically scoped type variables</emphasis>, without -which some type signatures are simply impossible to write. For example: -<programlisting> -f :: forall a. [a] -> [a] -f xs = ys ++ ys - where - ys :: [a] - ys = reverse xs -</programlisting> -The type signature for <literal>f</literal> brings the type variable <literal>a</literal> into scope, -because of the explicit <literal>forall</literal> (<xref linkend="decl-type-sigs"/>). -The type variables bound by a <literal>forall</literal> scope over -the entire definition of the accompanying value declaration. -In this example, the type variable <literal>a</literal> scopes over the whole -definition of <literal>f</literal>, including over -the type signature for <varname>ys</varname>. -In Haskell 98 it is not possible to declare -a type for <varname>ys</varname>; a major benefit of scoped type variables is that -it becomes possible to do so. -</para> -<para>Lexically-scoped type variables are enabled by -<option>-XScopedTypeVariables</option>. This flag implies <option>-XRelaxedPolyRec</option>. -</para> - -<sect3> -<title>Overview</title> - -<para>The design is based on the following principles: -<itemizedlist> -<listitem><para>A scoped type variable stands for a type <emphasis>variable</emphasis>, and not for -a <emphasis>type</emphasis>. (This is a change from GHC's earlier -design.)</para></listitem> -<listitem><para>Furthermore, distinct lexical type variables stand for distinct -type variables. 
This means that every programmer-written type signature -(including one that contains free scoped type variables) denotes a -<emphasis>rigid</emphasis> type; that is, the type is fully known to the type -checker, and no inference is involved.</para></listitem> -<listitem><para>Lexical type variables may be alpha-renamed freely, without -changing the program.</para></listitem> -</itemizedlist> -</para> -<para> -A <emphasis>lexically scoped type variable</emphasis> can be bound by: -<itemizedlist> -<listitem><para>A declaration type signature (<xref linkend="decl-type-sigs"/>)</para></listitem> -<listitem><para>An expression type signature (<xref linkend="exp-type-sigs"/>)</para></listitem> -<listitem><para>A pattern type signature (<xref linkend="pattern-type-sigs"/>)</para></listitem> -<listitem><para>Class and instance declarations (<xref linkend="cls-inst-scoped-tyvars"/>)</para></listitem> -</itemizedlist> -</para> -<para> -In Haskell, a programmer-written type signature is implicitly quantified over -its free type variables (<ulink -url="http://www.haskell.org/onlinereport/decls.html#sect4.1.2">Section -4.1.2</ulink> -of the Haskell Report). -Lexically scoped type variables affect this implicit quantification rule -as follows: any type variable that is in scope is <emphasis>not</emphasis> universally -quantified. For example, if type variable <literal>a</literal> is in scope, -then -<programlisting> - (e :: a -> a) means (e :: a -> a) - (e :: b -> b) means (e :: forall b. b->b) - (e :: a -> b) means (e :: forall b. a->b) -</programlisting> -</para> - - -</sect3> - - -<sect3 id="decl-type-sigs"> -<title>Declaration type signatures</title> -<para>A declaration type signature that has <emphasis>explicit</emphasis> -quantification (using <literal>forall</literal>) brings into scope the -explicitly-quantified -type variables, in the definition of the named function. For example: -<programlisting> - f :: forall a. 
[a] -> [a] - f (x:xs) = xs ++ [ x :: a ] -</programlisting> -The "<literal>forall a</literal>" brings "<literal>a</literal>" into scope in -the definition of "<literal>f</literal>". -</para> -<para>This only happens if: -<itemizedlist> -<listitem><para> The quantification in <literal>f</literal>'s type -signature is explicit. For example: -<programlisting> - g :: [a] -> [a] - g (x:xs) = xs ++ [ x :: a ] -</programlisting> -This program will be rejected, because "<literal>a</literal>" does not scope -over the definition of "<literal>g</literal>", so "<literal>x::a</literal>" -means "<literal>x::forall a. a</literal>" by Haskell's usual implicit -quantification rules. -</para></listitem> -<listitem><para> The signature gives a type for a function binding or a bare variable binding, -not a pattern binding. -For example: -<programlisting> - f1 :: forall a. [a] -> [a] - f1 (x:xs) = xs ++ [ x :: a ] -- OK - - f2 :: forall a. [a] -> [a] - f2 = \(x:xs) -> xs ++ [ x :: a ] -- OK - - f3 :: forall a. [a] -> [a] - Just f3 = Just (\(x:xs) -> xs ++ [ x :: a ]) -- Not OK! -</programlisting> -The binding for <literal>f3</literal> is a pattern binding, and so its type signature -does not bring <literal>a</literal> into scope. However <literal>f1</literal> is a -function binding, and <literal>f2</literal> binds a bare variable; in both cases -the type signature brings <literal>a</literal> into scope. -</para></listitem> -</itemizedlist> -</para> -</sect3> - -<sect3 id="exp-type-sigs"> -<title>Expression type signatures</title> - -<para>An expression type signature that has <emphasis>explicit</emphasis> -quantification (using <literal>forall</literal>) brings into scope the -explicitly-quantified -type variables, in the annotated expression. For example: -<programlisting> - f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool ) -</programlisting> -Here, the type signature <literal>forall s. 
ST s Bool</literal> brings the -type variable <literal>s</literal> into scope, in the annotated expression -<literal>(op >>= \(x :: STRef s Int) -> g x)</literal>. -</para> - -</sect3> - -<sect3 id="pattern-type-sigs"> -<title>Pattern type signatures</title> -<para> -A type signature may occur in any pattern; this is a <emphasis>pattern type -signature</emphasis>. -For example: -<programlisting> - -- f and g assume that 'a' is already in scope - f = \(x::Int, y::a) -> x - g (x::a) = x - h ((x,y) :: (Int,Bool)) = (y,x) -</programlisting> -In the case where all the type variables in the pattern type signature are -already in scope (i.e. bound by the enclosing context), matters are simple: the -signature simply constrains the type of the pattern in the obvious way. -</para> -<para> -Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised. -The pattern in a <emphasis>pattern binding</emphasis> may only mention type variables -that are already in scope. For example: -<programlisting> - f :: forall a. [a] -> (Int, [a]) - f xs = (n, zs) - where - (ys::[a], n) = (reverse xs, length xs) -- OK - zs::[a] = xs ++ ys -- OK - - Just (v::b) = ... -- Not OK; b is not in scope -</programlisting> -Here, the pattern signatures for <literal>ys</literal> and <literal>zs</literal> -are fine, but the one for <literal>v</literal> is not because <literal>b</literal> is -not in scope. -</para> -<para> -However, in all patterns <emphasis>other</emphasis> than pattern bindings, a pattern -type signature may mention a type variable that is not in scope; in this case, -<emphasis>the signature brings that type variable into scope</emphasis>. -This is particularly important for existential data constructors. For example: -<programlisting> - data T = forall a. 
MkT [a] - - k :: T -> T - k (MkT [t::a]) = MkT t3 - where - t3::[a] = [t,t,t] -</programlisting> -Here, the pattern type signature <literal>(t::a)</literal> mentions a lexical type -variable that is not already in scope. Indeed, it <emphasis>cannot</emphasis> already be in scope, -because it is bound by the pattern match. GHC's rule is that in this situation -(and only then), a pattern type signature can mention a type variable that is -not already in scope; the effect is to bring it into scope, standing for the -existentially-bound type variable. -</para> -<para> -When a pattern type signature binds a type variable in this way, GHC insists that the -type variable is bound to a <emphasis>rigid</emphasis>, or fully-known, type variable. -This means that any user-written type signature always stands for a completely known type. -</para> -<para> -If all this seems a little odd, we think so too. But we must have -<emphasis>some</emphasis> way to bring such type variables into scope, else we -could not name existentially-bound type variables in subsequent type signatures. -</para> -<para> -This is (now) the <emphasis>only</emphasis> situation in which a pattern type -signature is allowed to mention a lexical variable that is not already in -scope. -For example, both <literal>f</literal> and <literal>g</literal> would be -illegal if <literal>a</literal> was not already in scope. -</para> - - -</sect3> - -<!-- ==================== Commented out part about result type signatures - -<sect3 id="result-type-sigs"> -<title>Result type signatures</title> - -<para> -The result type of a function, lambda, or case expression alternative can be given a signature, thus: - -<programlisting> - {- f assumes that 'a' is already in scope -} - f x y :: [a] = [x,y,x] - - g = \ x :: [Int] -> [3,4] - - h :: forall a. 
[a] -> a - h xs = case xs of - (y:ys) :: a -> y -</programlisting> -The final <literal>:: [a]</literal> after the patterns of <literal>f</literal> gives the type of -the result of the function. Similarly, the body of the lambda in the RHS of -<literal>g</literal> is <literal>[Int]</literal>, and the RHS of the case -alternative in <literal>h</literal> is <literal>a</literal>. -</para> -<para> A result type signature never brings new type variables into scope.</para> -<para> -There are a couple of syntactic wrinkles. First, notice that all three -examples would parse quite differently with parentheses: -<programlisting> - {- f assumes that 'a' is already in scope -} - f x (y :: [a]) = [x,y,x] - - g = \ (x :: [Int]) -> [3,4] - - h :: forall a. [a] -> a - h xs = case xs of - ((y:ys) :: a) -> y -</programlisting> -Now the signature is on the <emphasis>pattern</emphasis>; and -<literal>h</literal> would certainly be ill-typed (since the pattern -<literal>(y:ys)</literal> cannot have the type <literal>a</literal>. - -Second, to avoid ambiguity, the type after the “<literal>::</literal>” in a result -pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single -token or a parenthesised type of some sort). To see why, -consider how one would parse this: -<programlisting> - \ x :: a -> b -> x -</programlisting> -</para> -</sect3> - - --> - -<sect3 id="cls-inst-scoped-tyvars"> -<title>Class and instance declarations</title> -<para> - -The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration -scope over the methods defined in the <literal>where</literal> part. You do not even need -an explicit <literal>forall</literal>. 
For example: -<programlisting> - class C a where - op :: [a] -> a - - op xs = let ys::[a] - ys = reverse xs - in - head ys - - instance C b => C [b] where - op xs = reverse (head (xs :: [[b]])) -</programlisting> -</para> -</sect3> - -</sect2> - -<sect2> -<title>Bindings and generalisation</title> - -<sect3 id="monomorphism"> -<title>Switching off the dreaded Monomorphism Restriction</title> - <indexterm><primary><option>-XNoMonomorphismRestriction</option></primary></indexterm> - -<para>Haskell's monomorphism restriction (see -<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.5">Section -4.5.5</ulink> -of the Haskell Report) -can be completely switched off by -<option>-XNoMonomorphismRestriction</option>. Since GHC 7.8.1, the monomorphism -restriction is switched off by default in GHCi's interactive options (see <xref linkend="ghci-interactive-options"/>). -</para> -</sect3> - - -<sect3 id="typing-binds"> -<title>Generalised typing of mutually recursive bindings</title> - -<para> -The Haskell Report specifies that a group of bindings (at top level, or in a -<literal>let</literal> or <literal>where</literal>) should be sorted into -strongly-connected components, and then type-checked in dependency order -(<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.1">Haskell -Report, Section 4.5.1</ulink>). -As each group is type-checked, any binders of the group that -have -an explicit type signature are put in the type environment with the specified -polymorphic type, -and all others are monomorphic until the group is generalised -(<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.2">Haskell Report, Section 4.5.2</ulink>). -</para> - -<para>Following a suggestion of Mark Jones, in his paper -<ulink url="http://citeseer.ist.psu.edu/424440.html">Typing Haskell in -Haskell</ulink>, -GHC implements a more general scheme. 
If <option>-XRelaxedPolyRec</option> is -specified: -<emphasis>the dependency analysis ignores references to variables that have an explicit -type signature</emphasis>. -As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will -typecheck. For example, consider: -<programlisting> - f :: Eq a => a -> Bool - f x = (x == x) || g True || g "Yes" - - g y = (y <= y) || f True -</programlisting> -This is rejected by Haskell 98, but under Jones's scheme the definition for -<literal>g</literal> is typechecked first, separately from that for -<literal>f</literal>, -because the reference to <literal>f</literal> in <literal>g</literal>'s right -hand side is ignored by the dependency analysis. Then <literal>g</literal>'s -type is generalised, to get -<programlisting> - g :: Ord a => a -> Bool -</programlisting> -Now, the definition for <literal>f</literal> is typechecked, with this type for -<literal>g</literal> in the type environment. -</para> - -<para> -The same refined dependency analysis also allows the type signatures of -mutually-recursive functions to have different contexts, something that is illegal in -Haskell 98 (Section 4.5.2, last sentence). With -<option>-XRelaxedPolyRec</option> -GHC only insists that the type signatures of a <emphasis>refined</emphasis> group have identical -contexts; in practice this means that only variables bound by the same -pattern binding must have the same context. For example, this is fine: -<programlisting> - f :: Eq a => a -> Bool - f x = (x == x) || g True - - g :: Ord a => a -> Bool - g y = (y <= y) || f True -</programlisting> -</para> -</sect3> - -<sect3 id="mono-local-binds"> -<title>Let-generalisation</title> -<para> -An ML-style language usually generalises the type of any let-bound or where-bound variable, -so that it is as polymorphic as possible. 
-With the flag <option>-XMonoLocalBinds</option> GHC implements a slightly more conservative policy, -using the following rules: -<itemizedlist> - <listitem><para> - A variable is <emphasis>closed</emphasis> if and only if - <itemizedlist> - <listitem><para> the variable is let-bound</para></listitem> - <listitem><para> one of the following holds: - <itemizedlist> - <listitem><para>the variable has an explicit type signature that has no free type variables, or</para></listitem> - <listitem><para>its binding group is fully generalised (see next bullet) </para></listitem> - </itemizedlist> - </para></listitem> - </itemizedlist> - </para></listitem> - - <listitem><para> - A binding group is <emphasis>fully generalised</emphasis> if and only if - <itemizedlist> - <listitem><para>each of its free variables is either imported or closed, and</para></listitem> - <listitem><para>the binding is not affected by the monomorphism restriction - (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.5">Haskell Report, Section 4.5.5</ulink>)</para></listitem> - </itemizedlist> - </para></listitem> -</itemizedlist> -For example, consider -<programlisting> -f x = x + 1 -g x = let h y = f y * 2 - k z = z+x - in h x + k x -</programlisting> -Here <literal>f</literal> is generalised because it has no free variables; and its binding group -is unaffected by the monomorphism restriction; and hence <literal>f</literal> is closed. -The same reasoning applies to <literal>g</literal>, except that it has one closed free variable, namely <literal>f</literal>. -Similarly <literal>h</literal> is closed, <emphasis>even though it is not bound at top level</emphasis>, -because its only free variable <literal>f</literal> is closed. -But <literal>k</literal> is not closed, because it mentions <literal>x</literal> which is not closed (because it is not let-bound). 
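The closedness rule can be tried out with a small module. This is only a sketch under stated assumptions: the names scale, twice, tag and pair are hypothetical and do not come from the guide, and the behaviour described in the comments follows from the rules above rather than from any GHC output quoted here.

```haskell
{-# LANGUAGE MonoLocalBinds, ScopedTypeVariables #-}
module Main where

-- 'twice' is closed: it is let-bound and its only free variable is the
-- imported (*). It is therefore generalised even under MonoLocalBinds
-- and can be used at both Int and Double.
scale :: Int -> (Int, Double)
scale x =
  let twice y = y * 2
  in (twice x, twice 1.5)

-- 'pair' mentions the lambda-bound 'x', so it is not closed and is not
-- generalised automatically. The explicit signature, which uses the
-- scoped type variable 'a', restores the polymorphism in 'b'.
tag :: forall a. a -> ((a, Bool), (a, Char))
tag x =
  let pair :: forall b. b -> (a, b)
      pair y = (x, y)
  in (pair True, pair 'c')

main :: IO ()
main = do
  print (scale 3)
  print (tag "k")
```

If the signature on pair is removed, the module should be rejected under MonoLocalBinds, because pair then stays monomorphic and cannot be applied to both a Bool and a Char.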
-</para> -<para> -Notice that a top-level binding that is affected by the monomorphism restriction is not closed, and hence may -in turn prevent generalisation of bindings that mention it. -</para> -<para> -The rationale for this more conservative strategy is given in -<ulink url="http://research.microsoft.com/~simonpj/papers/constraints/index.htm">the papers</ulink> "Let should not be generalised" and "Modular type inference with local assumptions", and -a related <ulink url="http://ghc.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7">blog post</ulink>. -</para><para> -The flag <option>-XMonoLocalBinds</option> is implied by <option>-XTypeFamilies</option> and <option>-XGADTs</option>. You can switch it off again -with <option>-XNoMonoLocalBinds</option> but type inference becomes less predictable if you do so. (Read the papers!) -</para> -</sect3> -</sect2> - -</sect1> -<!-- ==================== End of type system extensions ================= --> - -<sect1 id="typed-holes"> -<title>Typed Holes</title> - -<para> -Typed holes are a feature of GHC that allows special placeholders written with -a leading underscore (e.g., "<literal>_</literal>", "<literal>_foo</literal>", -"<literal>_bar</literal>") to be used as expressions. During compilation these -holes will generate an error message that describes which type is expected at -the hole's location, information about the origin of any free type variables, -and a list of local bindings that might help fill the hole with actual code. -Typed holes are always enabled in GHC. -</para> - -<para> -The goal of typed holes is to help with writing Haskell code rather than to -change the type system. Typed holes can be used to obtain extra information -from the type checker, which might otherwise be hard to get. Normally, using -GHCi, users can inspect the (inferred) type signatures of all top-level -bindings. 
However, this method is less convenient for terms that are not -defined at top level or that sit inside complex expressions. Holes allow the user to -check the type of the term they are about to write. -</para> - -<para> -For example, compiling the following module with GHC: -<programlisting> -f :: a -> a -f x = _ -</programlisting> -will fail with the following error: -<programlisting> -hole.hs:2:7: - Found hole `_' with type: a - Where: `a' is a rigid type variable bound by - the type signature for f :: a -> a at hole.hs:1:6 - Relevant bindings include - f :: a -> a (bound at hole.hs:2:1) - x :: a (bound at hole.hs:2:3) - In the expression: _ - In an equation for `f': f x = _ -</programlisting> -</para> - -<para> -Here are some more details: -<itemizedlist> -<listitem> -<para> -A "<literal>Found hole</literal>" error usually terminates compilation, like -any other type error. After all, you have omitted some code from your program. -Nevertheless, you can run and test a piece of code containing holes by using the -<option>-fdefer-typed-holes</option> flag. This flag defers errors -produced by typed holes until runtime, and converts them into compile-time warnings. -These warnings can in turn -be suppressed entirely by <option>-fno-warn-typed-holes</option>. -</para> -<para> -The result is that a hole will behave -like <literal>undefined</literal>, but with the added benefits that it shows a -warning at compile time, and will show the same message if it gets -evaluated at runtime. This behaviour follows that of the -<literal>-fdefer-type-errors</literal> option, which implies -<literal>-fdefer-typed-holes</literal>. See <xref linkend="defer-type-errors"/>. -</para> -</listitem> - -<listitem><para> -All unbound identifiers are treated as typed holes, <emphasis>whether or not they -start with an underscore</emphasis>. 
The only difference is in the error message: -<programlisting> -cons z = z : True : _x : y -</programlisting> -yields the errors -<programlisting> -Foo.hs:5:15: error: - Found hole: _x :: Bool - Relevant bindings include - p :: Bool (bound at Foo.hs:3:6) - cons :: Bool -> [Bool] (bound at Foo.hs:3:1) - -Foo.hs:5:20: error: - Variable not in scope: y :: [Bool] -</programlisting> -More information is given for explicit holes (i.e. ones that start with an underscore), -than for out-of-scope variables, because the latter are often -unintended typos, so the extra information is distracting. -If you want the detailed information, use a leading underscore to -make explicit your intent to use a hole. -</para></listitem> - -<listitem><para> -Unbound identifiers with the same name are never unified, even within the -same function, but shown individually. -For example: -<programlisting> -cons = _x : _x -</programlisting> -results in the following errors: -<programlisting> -unbound.hs:1:8: - Found hole '_x' with type: a - Where: `a' is a rigid type variable bound by - the inferred type of cons :: [a] at unbound.hs:1:1 - Relevant bindings include cons :: [a] (bound at unbound.hs:1:1) - In the first argument of `(:)', namely `_x' - In the expression: _x : _x - In an equation for `cons': cons = _x : _x - -unbound.hs:1:13: - Found hole '_x' with type: [a] - Arising from: an undeclared identifier `_x' at unbound.hs:1:13-14 - Where: `a' is a rigid type variable bound by - the inferred type of cons :: [a] at unbound.hs:1:1 - Relevant bindings include cons :: [a] (bound at unbound.hs:1:1) - In the second argument of `(:)', namely `_x' - In the expression: _x : _x - In an equation for `cons': cons = _x : _x -</programlisting> -Notice the two different types reported for the two different occurrences of <literal>_x</literal>. -</para></listitem> - -<listitem><para> -No language extension is required to use typed holes. 
The lexeme "<literal>_</literal>" was previously -illegal in Haskell, but now has a more informative error message. The lexeme "<literal>_x</literal>" -is a perfectly legal variable, and its behaviour is unchanged when it is in scope. For example -<programlisting> -f _x = _x + 1 -</programlisting> -does not elicit any errors. Only a variable <emphasis>that is not in scope</emphasis> -(whether or not it starts with an underscore) -is treated as an error (which it always was), albeit now with a more informative error message. -</para></listitem> - -<listitem><para> -Unbound data constructors used in expressions behave exactly as above. -However, unbound data constructors used in <emphasis>patterns</emphasis> cannot -be deferred, and instead bring compilation to a halt. (In implementation terms, they -are reported by the renamer rather than the type checker.) -</para></listitem> - -</itemizedlist> -</para> - -</sect1> -<!-- ==================== Partial Type Signatures ================= --> - -<sect1 id="partial-type-signatures"> -<title>Partial Type Signatures</title> - -<para> -A partial type signature is a type signature containing special placeholders -written with a leading underscore (e.g., "<literal>_</literal>", -"<literal>_foo</literal>", "<literal>_bar</literal>") called -<emphasis>wildcards</emphasis>. Partial type signatures are to type signatures -what <xref linkend="typed-holes"/> are to expressions. During compilation these -wildcards or holes will generate an error message that describes which type -was inferred at the hole's location, and information about the origin of any -free type variables. GHC reports such error messages by default.</para> - -<para> -Unlike <xref linkend="typed-holes"/>, which make the program incomplete and -will generate errors when they are evaluated, this needn't be the case for -holes in type signatures. The type checker is capable (in most cases) of -type-checking a binding with or without a type signature. 
A partial type -signature bridges the gap between the two extremes: the programmer can choose -which parts of a type to annotate and which to leave to the type-checker -to infer. -</para> - -<para> -By default, the type-checker will report an error message for each hole in a -partial type signature, informing the programmer of the inferred type. When -the <option>-XPartialTypeSignatures</option> flag is enabled, the type-checker -will accept the inferred type for each hole, generating warnings instead of -errors. Additionally, these warnings can be silenced with the -<option>-fno-warn-partial-type-signatures</option> flag. -</para> - -<sect2 id="pts-syntax"> -<title>Syntax</title> - -<para> -A (partial) type signature has the following form: <literal>forall a b .. . -(C1, C2, ..) => tau</literal>. It consists of three parts: -</para> - -<itemizedlist> - <listitem>The type variables: <literal>a b ..</literal></listitem> - <listitem>The constraints: <literal>(C1, C2, ..)</literal></listitem> - <listitem>The (mono)type: <literal>tau</literal></listitem> -</itemizedlist> - -<para> -We distinguish three kinds of wildcards. -</para> - -<sect3 id="type-wildcards"> -<title>Type Wildcards</title> -<para> -Wildcards occurring within the monotype (tau) part of the type signature are -<emphasis>type wildcards</emphasis> ("type" is often omitted as this is the -default kind of wildcard). Type wildcards can be instantiated to any monotype -like <literal>Bool</literal> or <literal>Maybe [Bool]</literal>, including -functions and higher-kinded types like <literal>(Int -> Bool)</literal> or -<literal>Maybe</literal>. -</para> -<programlisting> -not' :: Bool -> _ -not' x = not x --- Inferred: Bool -> Bool - -maybools :: _ -maybools = Just [True] --- Inferred: Maybe [Bool] - -just1 :: _ Int -just1 = Just 1 --- Inferred: Maybe Int - -filterInt :: _ -> _ -> [Int] -filterInt = filter -- has type forall a. 
(a -> Bool) -> [a] -> [a] --- Inferred: (Int -> Bool) -> [Int] -> [Int] -</programlisting> - -<para> -For instance, the first wildcard in the type signature <literal>not'</literal> -would produce the following error message: -</para> -<programlisting> -Test.hs:4:17: - Found hole ‘_’ with type: Bool - To use the inferred type, enable PartialTypeSignatures - In the type signature for ‘not'’: Bool -> _ -</programlisting> - -<para> -When a wildcard is not instantiated to a monotype, it will be generalised -over, i.e. replaced by a fresh type variable (of which the name will often -start with <literal>w_</literal>), e.g. -</para> -<programlisting> -foo :: _ -> _ -foo x = x --- Inferred: forall w_. w_ -> w_ - -filter' :: _ -filter' = filter -- has type forall a. (a -> Bool) -> [a] -> [a] --- Inferred: (a -> Bool) -> [a] -> [a] -</programlisting> -</sect3> - -<sect3 id="named-wildcards"> -<title>Named Wildcards</title> -<para> -Type wildcards can also be named by giving the underscore an identifier as -suffix, i.e. <literal>_a</literal>. These are called <emphasis>named -wildcards</emphasis>. All occurrences of the same named wildcard within one -type signature will unify to the same type. For example: -</para> -<programlisting> -f :: _x -> _x -f ('c', y) = ('d', error "Urk") --- Inferred: forall t. (Char, t) -> (Char, t) -</programlisting> - -<para> -The named wildcard forces the argument and result types to be the same. -Lacking a signature, GHC would have inferred <literal>forall a b. (Char, a) -> -(Char, b)</literal>. 
A named wildcard can be mentioned in constraints, -provided it also occurs in the monotype part of the type signature to make -sure that it unifies with something: -</para> - -<programlisting> -somethingShowable :: Show _x => _x -> _ -somethingShowable x = show x --- Inferred type: Show w_x => w_x -> String - -somethingShowable' :: Show _x => _x -> _ -somethingShowable' x = show (not x) --- Inferred type: Bool -> String -</programlisting> - -<para> -Besides an extra-constraints wildcard (see <xref -linkend="extra-constraints-wildcard"/>), only named wildcards can occur in the -constraints, e.g. the <literal>_x</literal> in <literal>Show _x</literal>. -</para> - -<para> -Named wildcards <emphasis>should not be confused with type -variables</emphasis>. Even though syntactically similar, named wildcards can -unify with monotypes as well as be generalised over (and behave as type -variables).</para> - -<para> -In the first example above, <literal>_x</literal> is generalised over (and is -effectively replaced by a fresh type variable <literal>w_x</literal>). In the -second example, <literal>_x</literal> is unified with the -<literal>Bool</literal> type, and as <literal>Bool</literal> implements the -<literal>Show</literal> type class, the constraint <literal>Show -Bool</literal> can be simplified away. -</para> - -<para> -By default, GHC (as the Haskell 2010 standard prescribes) parses identifiers -starting with an underscore in a type as type variables. To treat them as -named wildcards, the <option>-XNamedWildCards</option> flag should be enabled. -The example below demonstrates the effect. -</para> - -<programlisting> -foo :: _a -> _a -foo _ = False -</programlisting> - -<para> -Compiling this program without enabling <option>-XNamedWildCards</option> -produces the following error message complaining about the type variable -<literal>_a</literal> not matching the actual type <literal>Bool</literal>. 
-</para> - -<programlisting> -Test.hs:5:9: - Couldn't match expected type ‘_a’ with actual type ‘Bool’ - ‘_a’ is a rigid type variable bound by - the type signature for foo :: _a -> _a at Test.hs:4:8 - Relevant bindings include foo :: _a -> _a (bound at Test.hs:4:1) - In the expression: False - In an equation for ‘foo’: foo _ = False -</programlisting> - -<para> -Compiling this program with <option>-XNamedWildCards</option> enabled produces -the following error message reporting the inferred type of the named wildcard -<literal>_a</literal>. -</para> - -<programlisting> -Test.hs:4:8: Warning: - Found hole ‘_a’ with type: Bool - In the type signature for ‘foo’: _a -> _a -</programlisting> -</sect3> - -<sect3 id="extra-constraints-wildcard"> -<title>Extra-Constraints Wildcard</title> - -<para> -The third kind of wildcard is the <emphasis>extra-constraints -wildcard</emphasis>. The presence of an extra-constraints wildcard indicates -that an arbitrary number of extra constraints may be inferred during type -checking and will be added to the type signature. In the example below, the -extra-constraints wildcard is used to infer three extra constraints. -</para> - -<programlisting> -arbitCs :: _ => a -> String -arbitCs x = show (succ x) ++ show (x == x) --- Inferred: --- forall a. (Enum a, Eq a, Show a) => a -> String --- Error: -Test.hs:5:12: - Found hole ‘_’ with inferred constraints: (Enum a, Eq a, Show a) - To use the inferred type, enable PartialTypeSignatures - In the type signature for ‘arbitCs’: _ => a -> String -</programlisting> - -<para> -An extra-constraints wildcard shouldn't prevent the programmer from already -listing the constraints he knows or wants to annotate, e.g. -</para> - -<programlisting> --- Also a correct partial type signature: -arbitCs' :: (Enum a, _) => a -> String -arbitCs' x = arbitCs x --- Inferred: --- forall a. 
(Enum a, Show a, Eq a) => a -> String --- Error: -Test.hs:9:22: - Found hole ‘_’ with inferred constraints: (Eq a, Show a) - To use the inferred type, enable PartialTypeSignatures - In the type signature for ‘arbitCs'’: (Enum a, _) => a -> String -</programlisting> - -<para> -An extra-constraints wildcard can also lead to zero extra constraints being -inferred, e.g. -</para> - -<programlisting> -noCs :: _ => String -noCs = "noCs" --- Inferred: String --- Error: -Test.hs:13:9: - Found hole ‘_’ with inferred constraints: () - To use the inferred type, enable PartialTypeSignatures - In the type signature for ‘noCs’: _ => String -</programlisting> - -<para> -As a single extra-constraints wildcard is enough to infer any number of -constraints, only one is allowed in a type signature and it should come last -in the list of constraints. -</para> - -<para> -Extra-constraints wildcards cannot be named. -</para> - -</sect3> -</sect2> - -<sect2 id="pts-where"> -<title>Where can they occur?</title> - -<para> -Partial type signatures are allowed for bindings, pattern and expression signatures. -In all other contexts, e.g. type class or type family declarations, they are disallowed. -In the following example a wildcard is used in each of the three possible contexts. -Extra-constraints wildcards are not supported in pattern or expression signatures. -</para> -<programlisting> -{-# LANGUAGE ScopedTypeVariables #-} -foo :: _ -foo (x :: _) = (x :: _) --- Inferred: forall w_. w_ -> w_ -</programlisting> - -<para> - Anonymous wildcards <emphasis>can</emphasis> occur in type or data instance - declarations. However, these declarations are not partial type signatures - and different rules apply. See <xref linkend="data-instance-declarations"/> - for more details. -</para> - -<para> -Partial type signatures can also be used in <xref linkend="template-haskell"/> splices. -</para> - -<itemizedlist> - <listitem>Declaration splices: partial type signatures are fully supported. 
-<programlisting> -{-# LANGUAGE TemplateHaskell, NamedWildCards #-} -$( [d| foo :: _ => _a -> _a -> _ - foo x y = x == y|] ) -</programlisting> - </listitem> - <listitem>Expression splices: anonymous and named wildcards can be used in expression signatures. - Extra-constraints wildcards are not supported, just like in regular expression signatures. -<programlisting> -{-# LANGUAGE TemplateHaskell, NamedWildCards #-} -$( [e| foo = (Just True :: _m _) |] ) -</programlisting> - </listitem> - <listitem>Typed expression splices: the same wildcards as in (untyped) expression splices are supported. - </listitem> - <listitem>Pattern splices: Template Haskell doesn't support type signatures in pattern splices. - Consequently, partial type signatures are not supported either. - </listitem> - <listitem>Type splices: only anonymous wildcards are supported in type splices. - Named and extra-constraints wildcards are not. -<programlisting> -{-# LANGUAGE TemplateHaskell #-} -foo :: $( [t| _ |] ) -> a -foo x = x -</programlisting> - </listitem> -</itemizedlist> - - -</sect2> -</sect1> -<!-- ==================== Deferring type errors ================= --> - -<sect1 id="defer-type-errors"> -<title>Deferring type errors to runtime</title> - <para> - While developing, sometimes it is desirable to allow compilation to succeed - even if there are type errors in the code. Consider the following case: -<programlisting> -module Main where - -a :: Int -a = 'a' - -main = print "b" -</programlisting> - Even though <literal>a</literal> is ill-typed, it is not used in the end, so if - all that we're interested in is <literal>main</literal> it can be useful to be - able to ignore the problems in <literal>a</literal>. 
- </para> - <para> - For more motivation and details please refer to the <ulink - url="http://ghc.haskell.org/trac/ghc/wiki/DeferErrorsToRuntime">GHC wiki</ulink> - page or the <ulink - url="http://research.microsoft.com/en-us/um/people/simonpj/papers/ext-f/">original - paper</ulink>. - </para> - -<sect2><title>Enabling deferring of type errors</title> - <para> - The flag <literal>-fdefer-type-errors</literal> controls whether type - errors are deferred to runtime. Type errors will still be emitted as - warnings, but will not prevent compilation. You can use - <literal>-fno-warn-deferred-type-errors</literal> to suppress these warnings. - </para> - <para> - This flag implies the <literal>-fdefer-typed-holes</literal> flag, - which enables this behaviour for <link linkend="typed-holes">typed holes - </link>. Should you so wish, it is possible to enable - <literal>-fdefer-type-errors</literal> without enabling - <literal>-fdefer-typed-holes</literal>, by explicitly specifying - <literal>-fno-defer-typed-holes</literal> on the command line after the - <literal>-fdefer-type-errors</literal> flag. - </para> - <para> - At runtime, whenever a term containing a type error would need to be - evaluated, the error is converted into a runtime exception of type - <literal>TypeError</literal>. Note that type errors are deferred as much - as possible during runtime, but invalid coercions are never performed, - even when they would ultimately result in a value of the correct type. - For example, given the following code: -<programlisting> -x :: Int -x = 0 - -y :: Char -y = x - -z :: Int -z = y -</programlisting> - evaluating <literal>z</literal> will result in a runtime <literal>TypeError</literal>. 
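The behaviour described above can be sketched as a complete program. This is a minimal sketch, assuming a GHC whose base library exports <literal>TypeError</literal> from <literal>Control.Exception</literal>; the helper <literal>describeZ</literal> and its message string are hypothetical names introduced only for illustration.

```haskell
{-# OPTIONS_GHC -fdefer-type-errors #-}
module Main where

import Control.Exception (TypeError (..), evaluate, handle)

x :: Int
x = 0

y :: Char
y = x   -- ill-typed; reported as a warning because type errors are deferred

z :: Int
z = y   -- also ill-typed

-- Hypothetical helper: force z and report what happened.
-- The message string is illustrative, not something GHC prints.
describeZ :: IO String
describeZ =
  handle (\(TypeError _) -> return "deferred type error")
         (fmap show (evaluate z))

main :: IO ()
main = do
  print x                  -- x is well-typed, so it evaluates normally
  describeZ >>= putStrLn   -- forcing z raises the deferred TypeError
```

Catching the exception like this is only meant to make the deferral visible; in ordinary development one would simply let the exception propagate.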
- </para> -</sect2> -<sect2><title>Deferred type errors in GHCi</title> - <para> - The flag <literal>-fdefer-type-errors</literal> works in GHCi as well, with - one exception: for "naked" expressions typed at the prompt, type - errors don't get delayed, so for example: -<programlisting> -Prelude> fst (True, 1 == 'a') - -<interactive>:2:12: - No instance for (Num Char) arising from the literal `1' - Possible fix: add an instance declaration for (Num Char) - In the first argument of `(==)', namely `1' - In the expression: 1 == 'a' - In the first argument of `fst', namely `(True, 1 == 'a')' -</programlisting> -Otherwise, in the common case of a simple type error such as -typing <literal>reverse True</literal> at the prompt, you would get a warning and then -an immediately-following type error when the expression is evaluated. - </para> - <para> - This exception doesn't apply to statements, as the following example demonstrates: -<programlisting> -Prelude> let x = (True, 1 == 'a') - -<interactive>:3:16: Warning: - No instance for (Num Char) arising from the literal `1' - Possible fix: add an instance declaration for (Num Char) - In the first argument of `(==)', namely `1' - In the expression: 1 == 'a' - In the expression: (True, 1 == 'a') -Prelude> fst x -True -</programlisting> - </para> -</sect2> -</sect1> - -<!-- ====================== TEMPLATE HASKELL ======================= --> - -<sect1 id="template-haskell"> -<title>Template Haskell</title> - -<para>Template Haskell allows you to do compile-time meta-programming in -Haskell. -The background to -the main technical innovations is discussed in "<ulink -url="http://research.microsoft.com/~simonpj/papers/meta-haskell/"> -Template Meta-programming for Haskell</ulink>" (Proc Haskell Workshop 2002). 
-</para> -<para> -There is a Wiki page about -Template Haskell at <ulink url="http://www.haskell.org/haskellwiki/Template_Haskell"> -http://www.haskell.org/haskellwiki/Template_Haskell</ulink>, and that is the best place to look for -further details. -You may also -consult the <ulink -url="http://www.haskell.org/ghc/docs/latest/html/libraries/index.html">online -Haskell library reference material</ulink> -(look for module <literal>Language.Haskell.TH</literal>). -Many changes to the original design are described in - <ulink url="http://research.microsoft.com/~simonpj/papers/meta-haskell/notes2.ps"> -Notes on Template Haskell version 2</ulink>. -Not all of these changes are in GHC, however. -</para> - -<para> The first example from that paper is set out below (<xref linkend="th-example"/>) -as a worked example to help get you started. -</para> - -<para> -The documentation here describes the realisation of Template Haskell in GHC. It is not detailed enough to -understand Template Haskell; see the <ulink url="http://haskell.org/haskellwiki/Template_Haskell"> -Wiki page</ulink>. -</para> - - <sect2 id="th-syntax"> - <title>Syntax</title> - - <para> Template Haskell has the following new syntactic - constructions. You need to use the flag - <option>-XTemplateHaskell</option> - <indexterm><primary><option>-XTemplateHaskell</option></primary> - </indexterm>to switch these syntactic extensions on.</para> - - <itemizedlist> - <listitem><para> - A splice is written <literal>$x</literal>, where <literal>x</literal> is an - identifier, or <literal>$(...)</literal>, where the "..." is an arbitrary expression. - There must be no space between the "$" and the identifier or parenthesis. This use - of "$" overrides its meaning as an infix operator, just as "M.x" overrides the meaning - of "." as an infix operator. If you want the infix operator, put spaces around it. 
- </para> - <para> A splice can occur in place of - <itemizedlist> - <listitem><para> an expression; the spliced expression must - have type <literal>Q Exp</literal></para></listitem> - <listitem><para> a pattern; the spliced expression must - have type <literal>Q Pat</literal></para></listitem> - <listitem><para> a type; the spliced expression must - have type <literal>Q Type</literal></para></listitem> - <listitem><para> a list of declarations at top level; the spliced expression - must have type <literal>Q [Dec]</literal></para></listitem> - </itemizedlist> - Inside a splice you can only call functions defined in imported modules, - not functions defined elsewhere in the same module. Note that - declaration splices are not allowed anywhere except at top level - (outside any other declarations).</para></listitem> - - <listitem><para> - A quotation is written in Oxford brackets, thus: - <itemizedlist> - <listitem><para> <literal>[| ... |]</literal>, or <literal>[e| ... |]</literal>, - where the "..." is an expression; - the quotation has type <literal>Q Exp</literal>.</para></listitem> - <listitem><para> <literal>[d| ... |]</literal>, where the "..." is a list of top-level declarations; - the quotation has type <literal>Q [Dec]</literal>.</para></listitem> - <listitem><para> <literal>[t| ... |]</literal>, where the "..." is a type; - the quotation has type <literal>Q Type</literal>.</para></listitem> - <listitem><para> <literal>[p| ... |]</literal>, where the "..." is a pattern; - the quotation has type <literal>Q Pat</literal>.</para></listitem> - </itemizedlist> - See <xref linkend="pts-where"/> for using partial type signatures in quotations.</para></listitem> - - <listitem> - <para> - A <emphasis>typed</emphasis> expression splice is written - <literal>$$x</literal>, where <literal>x</literal> is an - identifier, or <literal>$$(...)</literal>, where the "..." is - an arbitrary expression. 
- </para> - <para> - A typed expression splice can occur in place of an - expression; the spliced expression must have type <literal>Q - (TExp a)</literal> - </para> - </listitem> - - <listitem> - <para> - A <emphasis>typed</emphasis> expression quotation is written - as <literal>[|| ... ||]</literal>, or <literal>[e|| - ... ||]</literal>, where the "..." is an expression; if the - "..." expression has type <literal>a</literal>, then the - quotation has type <literal>Q (TExp a)</literal>. - </para> - - <para> - Values of type <literal>TExp a</literal> may be converted to - values of type <literal>Exp</literal> using the function - <literal>unType :: TExp a -> Exp</literal>. - </para> - </listitem> - - <listitem><para> - A quasi-quotation can appear in a pattern, type, expression, or - declaration context and is also written in Oxford brackets: - <itemizedlist> - <listitem><para> <literal>[<replaceable>varid</replaceable>| ... |]</literal>, - where the "..." is an arbitrary string; a full description of the - quasi-quotation facility is given in <xref linkend="th-quasiquotation"/>.</para></listitem> - </itemizedlist></para></listitem> - - <listitem><para> - A name can be quoted with either one or two prefix single quotes: - <itemizedlist> - <listitem><para> <literal>'f</literal> has type <literal>Name</literal>, and names the function <literal>f</literal>. - Similarly <literal>'C</literal> has type <literal>Name</literal> and names the data constructor <literal>C</literal>. - In general <literal>'</literal><replaceable>thing</replaceable> - interprets <replaceable>thing</replaceable> in an expression context.</para> - <para>A name whose second character is a single - quote (sadly) cannot be quoted in this way, - because it will be parsed instead as a quoted - character. 
For example, if the function is called - <literal>f'7</literal> (which is a legal Haskell - identifier), an attempt to quote it as - <literal>'f'7</literal> would be parsed as the - character literal <literal>'f'</literal> followed - by the numeric literal <literal>7</literal>. There - is currently no escape mechanism for this (unusual) - situation. - </para></listitem> - <listitem><para> <literal>''T</literal> has type <literal>Name</literal>, and names the type constructor <literal>T</literal>. - That is, <literal>''</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in a type context. - </para></listitem> - </itemizedlist> - These <literal>Name</literal>s can be used to construct Template Haskell expressions, patterns, declarations etc. They - may also be given as an argument to the <literal>reify</literal> function. - </para> - </listitem> - - <listitem> - <para> - It is possible for a splice to expand to an expression that contains - names which are not in scope at the site of the splice. As an - example, consider the following code: - -<programlisting> -module Bar where - -import Language.Haskell.TH - -add1 :: Int -> Q Exp -add1 x = [| x + 1 |] -</programlisting> - - Now consider a splice using <literal>add1</literal> in a separate - module: - -<programlisting> -module Foo where - -import Bar - -two :: Int -two = $(add1 1) -</programlisting> - - Template Haskell cannot know what the argument to - <literal>add1</literal> will be at the function's definition site, so - a lifting mechanism is used to promote <literal>x</literal> into a - value of type <literal>Q Exp</literal>. This functionality is exposed - to the user as the <literal>Lift</literal> typeclass in the - <literal>Language.Haskell.TH.Syntax</literal> module. 
If a type has a - <literal>Lift</literal> instance, then any of its values can be - lifted to a Template Haskell expression: - -<programlisting> -class Lift t where - lift :: t -> Q Exp -</programlisting> - - In general, if GHC sees an expression within Oxford brackets (e.g., - <literal>[| foo bar |]</literal>), then GHC looks up each name within - the brackets. If a name is global (e.g., suppose - <literal>foo</literal> comes from an import or a top-level - declaration), then the fully qualified name is used directly in the - quotation. If the name is local (e.g., suppose <literal>bar</literal> - is bound locally in the function definition - <literal>mkFoo bar = [| foo bar |]</literal>), then GHC uses - <literal>lift</literal> on it (so GHC treats - <literal>[| foo bar |]</literal> as if it were - <literal>[| foo $(lift bar) |]</literal>). Local names, which are not - in scope at splice locations, are actually evaluated when the - quotation is processed. - - The <literal>template-haskell</literal> library provides - <literal>Lift</literal> instances for many common data types. - Furthermore, it is possible to derive <literal>Lift</literal> - instances automatically by using the <option>-XDeriveLift</option> - language extension. See <xref linkend="deriving-lift" /> for more - information. - </para> - </listitem> - - <listitem><para> You may omit the <literal>$(...)</literal> in a top-level declaration splice. - Simply writing an expression (rather than a declaration) implies a splice. For example, you can write -<programlisting> -module Foo where -import Bar - -f x = x - -$(deriveStuff 'f) -- Uses the $(...) notation - -g y = y+1 - -deriveStuff 'g -- Omits the $(...) - -h z = z-1 -</programlisting> - This abbreviation makes top-level declaration splices quieter and less intimidating. - </para></listitem> - - <listitem> - <para> - Outermost pattern splices may bind variables. 
By "outermost" here, we refer to - a pattern splice that occurs outside of any quotation brackets. For example, - -<programlisting> -mkPat :: Bool -> Q Pat -mkPat True = [p| (x, y) |] -mkPat False = [p| (y, x) |] - --- in another module: -foo :: (Char, String) -> String -foo $(mkPat True) = x : y - -bar :: (String, Char) -> String -bar $(mkPat False) = x : y -</programlisting> - </para> - </listitem> - - - <listitem> - <para> - Nested pattern splices do <emphasis>not</emphasis> bind variables. - By "nested" here, we refer to a pattern splice occurring within a - quotation bracket. Continuing the example from the last bullet: - -<programlisting> -baz :: Bool -> Q Exp -baz b = [| quux $(mkPat b) = x + y |] -</programlisting> - - would fail with <literal>x</literal> and <literal>y</literal> - being out of scope. - </para> - - <para> - The difference in treatment of outermost and nested pattern splices is - because outermost splices are run at compile time. GHC can then use - the result of running the splice when analysing the expressions within - the pattern's scope. Nested splices, on the other hand, are <emphasis>not</emphasis> - run at compile time; they are run when the bracket is spliced in, sometime later. - Since nested pattern splices may refer to local variables, there is no way for GHC - to know, at splice compile time, what variables are bound, so it binds none. - </para> - </listitem> - - <listitem> - <para> - A pattern quasiquoter <emphasis>may</emphasis> - generate binders that scope over the right-hand side of a - definition because these binders are in scope lexically. For - example, given a quasiquoter <literal>haskell</literal> that - parses Haskell, in the following code, the <literal>y</literal> - in the right-hand side of <literal>f</literal> refers to the - <literal>y</literal> bound by the <literal>haskell</literal> - pattern quasiquoter, <emphasis>not</emphasis> the top-level - <literal>y = 7</literal>. 
-<programlisting> -y :: Int -y = 7 - -f :: Int -> Int -> Int -f n = \ [haskell|y|] -> y+n -</programlisting> - </para> - </listitem> - <listitem> - <para> - Top-level declaration splices break up a source file into - <emphasis>declaration groups</emphasis>. A - <emphasis>declaration group</emphasis> is the group of - declarations created by a top-level declaration splice, plus - those following it, down to but not including the next - top-level declaration splice. The first declaration group in a - module includes all top-level definitions down to but not - including the first top-level declaration splice. - </para> - - <para> - Each declaration group is mutually recursive only within - the group. Declaration groups can refer to definitions within - previous groups, but not later ones. - </para> - - <para> - Accordingly, the type environment seen by - <literal>reify</literal> includes all the top-level - declarations up to the end of the immediately preceding - declaration group, but no more. - </para> - - <para> - Unlike normal declaration splices, declaration quasiquoters - do not cause a break. These quasiquoters are expanded before - the rest of the declaration group is processed, and the - declarations they generate are merged into the surrounding - declaration group. Consequently, the type environment seen - by <literal>reify</literal> from a declaration quasiquoter - will not include anything from the quasiquoter's declaration - group. - </para> - - <para> - Concretely, consider the following code -<programlisting> -module M where - import ... - f x = x - $(th1 4) - h y = k y y $(blah1) - [qq|blah|] - k x y = x + y - $(th2 10) - w z = $(blah2) -</programlisting> - - In this example - <orderedlist> - <listitem> - <para> - The body of <literal>h</literal> would be unable to refer - to the function <literal>w</literal>. 
- </para> - - <para> - A <literal>reify</literal> inside the splice <literal>$(th1 - ..)</literal> would see the definition of - <literal>f</literal>. - </para> - </listitem> - <listitem> - <para> - A <literal>reify</literal> inside the splice - <literal>$(blah1)</literal> would see the definition of - <literal>f</literal>, but would not see the definition of - <literal>h</literal>. - </para> - </listitem> - <listitem> - <para> - A <literal>reify</literal> inside the splice - <literal>$(th2..)</literal> would see the definition of - <literal>f</literal>, all the bindings created by - <literal>$(th1..)</literal>, and the definition of - <literal>h</literal>. - </para> - </listitem> - <listitem> - <para> - A <literal>reify</literal> inside the splice - <literal>$(blah2)</literal> would see the same definitions - as the splice <literal>$(th2...)</literal>. - </para> - </listitem> - <listitem> - <para> - The body of <literal>h</literal> <emphasis>is</emphasis> - able to refer to the function <literal>k</literal> - appearing on the other side of the declaration - quasiquoter, as quasiquoters never cause a declaration - group to be broken up. - </para> - <para> - A <literal>reify</literal> inside the - <literal>qq</literal> quasiquoter would be able to see - the definition of <literal>f</literal> from the - preceding declaration group, but not the definitions of - <literal>h</literal> or <literal>k</literal>, or any - definitions from subsequent declaration groups. - </para> - </listitem> - </orderedlist> - </para> - </listitem> - <listitem> - <para> - Expression quotations accept most Haskell language constructs. 
- However, there are some GHC-specific extensions which expression - quotations currently do not support, including - <itemizedlist> - <listitem> - <para> - Recursive <literal>do</literal>-statements (see - <ulink url="https://ghc.haskell.org/trac/ghc/ticket/1262"> - Trac #1262</ulink>) - </para> - </listitem> - <listitem> - <para> - Pattern synonyms (see - <ulink url="https://ghc.haskell.org/trac/ghc/ticket/8761"> - Trac #8761</ulink>) - </para> - </listitem> - <listitem> - <para> - Typed holes (see - <ulink url="https://ghc.haskell.org/trac/ghc/ticket/10267"> - Trac #10267</ulink>) - </para> - </listitem> - </itemizedlist> - </para> - </listitem> - - </itemizedlist> -(Compared to the original paper, there are many differences of detail. -The syntax for a declaration splice uses "<literal>$</literal>" not "<literal>splice</literal>". -The type of the enclosed expression must be <literal>Q [Dec]</literal>, not <literal>[Q Dec]</literal>. -Typed expression splices and quotations are supported.) - -</sect2> - -<sect2> <title> Using Template Haskell </title> -<para> -<itemizedlist> - <listitem><para> - The data types and monadic constructor functions for Template Haskell are in the module - <literal>Language.Haskell.TH</literal>. - </para></listitem> - - <listitem><para> - You can only run a function at compile time if it is imported from another module. That is, - you can't define a function in a module, and call it from within a splice in the same module. - (It would make sense to do so, but it's hard to implement.) - </para></listitem> - - <listitem><para> - You can only run a function at compile time if it is imported - from another module <emphasis>that is not part of a mutually-recursive group of modules - that includes the module currently being compiled</emphasis>. 
Furthermore, all of the modules of - the mutually-recursive group must be reachable by non-SOURCE imports from the module where the - splice is to be run.</para> - <para> - For example, when compiling module A, - you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly). - The reason should be clear: to run B we must compile and run A, but we are currently type-checking A. - </para></listitem> - - <listitem><para> - If you are building GHC from source, you need at least a stage-2 bootstrap compiler to - run Template Haskell splices and quasi-quotes. A stage-1 compiler will only accept regular quotes of Haskell. The reason is that, to run a splice or quasi-quote, - GHC compiles and runs a program and then looks at the result, so it is important that - the program it compiles produces results whose representations are identical to - those of the compiler itself. - </para></listitem> -</itemizedlist> -</para> -<para> Template Haskell works in any mode (<literal>--make</literal>, <literal>--interactive</literal>, - or file-at-a-time). There used to be a restriction to the former two, but that restriction - has been lifted. -</para> -</sect2> - -<sect2 id="th-view-gen-code"> <title> Viewing Template Haskell generated code </title> - <para> - The flag <literal>-ddump-splices</literal> shows the expansion of all top-level declaration splices, both typed and untyped, as they happen. - As with all dump flags, the default is for this output to be sent to stdout. - For a non-trivial program, you may be interested in combining this with the <literal>-ddump-to-file</literal> flag (see <xref linkend="dumping-output"/>). - For each file using Template Haskell, this will write the output to a <literal>.dump-splices</literal> file. 
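As an illustration, here is a minimal sketch of a module that would trigger such a dump. It assumes the dump flag is set through an <literal>OPTIONS_GHC</literal> pragma rather than on the command line; the module and the function <literal>foo</literal> are hypothetical and chosen only to keep the example small.

```haskell
{-# LANGUAGE TemplateHaskell #-}
{-# OPTIONS_GHC -ddump-splices #-}
module Main where

-- A top-level declaration splice. With -ddump-splices enabled, GHC
-- reports the expansion of this splice while compiling the module.
$( [d| foo :: Int -> Int
       foo x = x + 1 |] )

-- Declarations after the splice can use the bindings it generates.
main :: IO ()
main = print (foo 41)
```

Adding <literal>-ddump-to-file</literal> to the command line should redirect the same dump into a file next to the module instead of stdout.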
- </para> - - <para> - The flag <literal>-dth-dec-file</literal> shows the expansions of all top-level TH declaration splices, both typed and untyped, in the file <literal>M.th.hs</literal> where M is the name of the module being compiled. - Note that other types of splices (expressions, types, and patterns) are not shown. - Application developers can check this into their repository so that they can grep for identifiers that were defined in Template Haskell. - This is similar to using <option>-ddump-to-file</option> with <option>-ddump-splices</option> but it always generates a file instead of being coupled to <option>-ddump-to-file</option>. The format is also different: it does not show code from the original file, instead it only shows generated code and has a comment for the splice location of the original file. - </para> - - <para> - Below is a sample output of <literal>-ddump-splices</literal> - </para> - -<programlisting> -TH_pragma.hs:(6,4)-(8,26): Splicing declarations - [d| foo :: Int -> Int - foo x = x + 1 |] -======> - foo :: Int -> Int - foo x = (x + 1) -</programlisting> - - <para> - Below is the output of the same sample using <literal>-dth-dec-file</literal> - </para> - -<programlisting> --- TH_pragma.hs:(6,4)-(8,26): Splicing declarations -foo :: Int -> Int -foo x = (x + 1) -</programlisting> -</sect2> - -<sect2 id="th-example"> <title> A Template Haskell Worked Example </title> -<para>To help you get over the confidence barrier, try out this skeletal worked example. - First cut and paste the two modules below into "Main.hs" and "Printf.hs":</para> - -<programlisting> - -{- Main.hs -} -module Main where - --- Import our template "pr" -import Printf ( pr ) - --- The splice operator $ takes the Haskell source code --- generated at compile time by "pr" and splices it into --- the argument of "putStrLn". -main = putStrLn ( $(pr "Hello") ) - - -{- Printf.hs -} -module Printf where - --- Skeletal printf from the paper. 
--- It needs to be in a separate module to the one where --- you intend to use it. - --- Import some Template Haskell syntax -import Language.Haskell.TH - --- Describe a format string -data Format = D | S | L String - --- Parse a format string. This is left largely to you --- as we are here interested in building our first ever --- Template Haskell program and not in building printf. -parse :: String -> [Format] -parse s = [ L s ] - --- Generate Haskell source code from a parsed representation --- of the format string. This code will be spliced into --- the module which calls "pr", at compile time. -gen :: [Format] -> Q Exp -gen [D] = [| \n -> show n |] -gen [S] = [| \s -> s |] -gen [L s] = stringE s - --- Here we generate the Haskell code for the splice --- from an input format string. -pr :: String -> Q Exp -pr s = gen (parse s) -</programlisting> - -<para>Now run the compiler (here we are at a Cygwin prompt on Windows): -</para> -<programlisting> -$ ghc --make -XTemplateHaskell Main.hs -o main.exe -</programlisting> - -<para>Run "main.exe" and here is your output:</para> - -<programlisting> -$ ./main -Hello -</programlisting> - -</sect2> - -<sect2> -<title>Using Template Haskell with Profiling</title> -<indexterm><primary>profiling</primary><secondary>with Template Haskell</secondary></indexterm> - -<para>Template Haskell relies on GHC's built-in bytecode compiler and -interpreter to run the splice expressions. 
The bytecode interpreter -runs the compiled expression on top of the same runtime on which GHC -itself is running; this means that the compiled code referred to by -the interpreted expression must be compatible with this runtime, and -in particular this means that object code that is compiled for -profiling <emphasis>cannot</emphasis> be loaded and used by a splice -expression, because profiled object code is only compatible with the -profiling version of the runtime.</para> - -<para>This causes difficulties if you have a multi-module program -containing Template Haskell code and you need to compile it for -profiling, because GHC cannot load the profiled object code and use it -when executing the splices. Fortunately GHC provides a workaround. -The basic idea is to compile the program twice:</para> - -<orderedlist> -<listitem> - <para>Compile the program or library first the normal way, without - <option>-prof</option><indexterm><primary><option>-prof</option></primary></indexterm>.</para> -</listitem> -<listitem> - <para>Then compile it again with <option>-prof</option>, and - additionally use <option>-osuf - p_o</option><indexterm><primary><option>-osuf</option></primary></indexterm> - to name the object files differently (you can choose any suffix - that isn't the normal object suffix here). GHC will automatically - load the object files built in the first step when executing splice - expressions. If you omit the <option>-osuf</option> flag when - building with <option>-prof</option> and Template Haskell is used, - GHC will emit an error message. 
</para> -</listitem> -</orderedlist> -</sect2> - -<sect2 id="th-quasiquotation"> <title> Template Haskell Quasi-quotation </title> -<para>Quasi-quotation allows patterns and expressions to be written using -programmer-defined concrete syntax; the motivation behind the extension and -several examples are documented in -"<ulink url="http://www.cs.tufts.edu/comp/150FP/archive/geoff-mainland/quasiquoting.pdf">Why It's -Nice to be Quoted: Quasiquoting for Haskell</ulink>" (Proc Haskell Workshop -2007). The example below shows how to write a quasiquoter for a simple -expression language.</para> -<para> -Here are the salient features -<itemizedlist> -<listitem><para> -A quasi-quote has the form -<literal>[<replaceable>quoter</replaceable>| <replaceable>string</replaceable> |]</literal>. -<itemizedlist> -<listitem><para> -The <replaceable>quoter</replaceable> must be the name of an imported quoter, -either qualified or unqualified; it cannot be an arbitrary expression. -</para></listitem> -<listitem><para> -The <replaceable>quoter</replaceable> cannot be "<literal>e</literal>", -"<literal>t</literal>", "<literal>d</literal>", or "<literal>p</literal>", since -those overlap with Template Haskell quotations. -</para></listitem> -<listitem><para> -There must be no spaces in the token -<literal>[<replaceable>quoter</replaceable>|</literal>. -</para></listitem> -<listitem><para> -The quoted <replaceable>string</replaceable> -can be arbitrary, and may contain newlines. -</para></listitem> -<listitem><para> -The quoted <replaceable>string</replaceable> -finishes at the first occurrence of the two-character sequence <literal>"|]"</literal>. -Absolutely no escaping is performed. If you want to embed that character -sequence in the string, you must invent your own escape convention (such -as, say, using the string <literal>"|~]"</literal> instead), and make your -quoter function interpret <literal>"|~]"</literal> as <literal>"|]"</literal>. 
-One way to implement this is to compose your quoter with a pre-processing pass to -perform your escape conversion. See the -<ulink url="http://ghc.haskell.org/trac/ghc/ticket/5348"> -discussion in Trac</ulink> for details. -</para></listitem> -</itemizedlist> -</para></listitem> - -<listitem><para> -A quasiquote may appear in place of -<itemizedlist> -<listitem><para>An expression</para></listitem> -<listitem><para>A pattern</para></listitem> -<listitem><para>A type</para></listitem> -<listitem><para>A top-level declaration</para></listitem> -</itemizedlist> -(Only the first two are described in the paper.) -</para></listitem> - -<listitem><para> -A quoter is a value of type <literal>Language.Haskell.TH.Quote.QuasiQuoter</literal>, -which is defined thus: -<programlisting> -data QuasiQuoter = QuasiQuoter { quoteExp :: String -> Q Exp, - quotePat :: String -> Q Pat, - quoteType :: String -> Q Type, - quoteDec :: String -> Q [Dec] } -</programlisting> -That is, a quoter is a tuple of four parsers, one for each of the contexts -in which a quasi-quote can occur. -</para></listitem> -<listitem><para> -A quasi-quote is expanded by applying the appropriate parser to the string -enclosed by the Oxford brackets. The context of the quasi-quote (expression, pattern, -type, declaration) determines which of the parsers is called. -</para></listitem> -<listitem><para> -Unlike normal declaration splices of the form <literal>$(...)</literal>, -declaration quasi-quotes do not cause a declaration group break. See -<xref linkend="th-syntax"/> for more information. -</para></listitem> -</itemizedlist> -</para> -<para> -The example below shows quasi-quotation in action. The quoter <literal>expr</literal> -is bound to a value of type <literal>QuasiQuoter</literal> defined in module <literal>Expr</literal>. 
-The example makes use of an antiquoted -variable <literal>n</literal>, indicated by the syntax <literal>'int:n</literal> -(this syntax for anti-quotation was defined by the parser's -author, <emphasis>not</emphasis> by GHC). This binds <literal>n</literal> to the -integer value argument of the constructor <literal>IntExpr</literal> when -pattern matching. Please see the referenced paper for further details regarding -anti-quotation as well as the description of a technique that uses SYB to -leverage a single parser of type <literal>String -> a</literal> to generate both -an expression parser that returns a value of type <literal>Q Exp</literal> and a -pattern parser that returns a value of type <literal>Q Pat</literal>. -</para> - -<para> -Quasiquoters must obey the same stage restrictions as Template Haskell. In -the example, <literal>expr</literal> cannot be defined -in <literal>Main.hs</literal>, where it is used, but must be imported. -</para> - -<programlisting> -{- ------------- file Main.hs --------------- -} -module Main where - -import Expr - -main :: IO () -main = do { print $ eval [expr|1 + 2|] - ; case IntExpr 1 of - { [expr|'int:n|] -> print n - ; _ -> return () - } - } - - -{- ------------- file Expr.hs --------------- -} -{-# LANGUAGE DeriveDataTypeable #-} -module Expr where - -import Data.Data (Data, Typeable) -import Language.Haskell.TH (Q, Exp, Pat) -import Language.Haskell.TH.Quote - -data Expr = IntExpr Integer - | AntiIntExpr String - | BinopExpr BinOp Expr Expr - | AntiExpr String - deriving(Show, Typeable, Data) - -data BinOp = AddOp - | SubOp - | MulOp - | DivOp - deriving(Show, Typeable, Data) - -eval :: Expr -> Integer -eval (IntExpr n) = n -eval (BinopExpr op x y) = (opToFun op) (eval x) (eval y) - where - opToFun AddOp = (+) - opToFun SubOp = (-) - opToFun MulOp = (*) - opToFun DivOp = div - -expr = QuasiQuoter { quoteExp = parseExprExp, quotePat = parseExprPat } - --- Parse an Expr, returning its representation as --- either a Q Exp or a Q Pat.
See the referenced paper --- for how to use SYB to do this by writing a single --- parser of type String -> Expr instead of two --- separate parsers. - -parseExprExp :: String -> Q Exp -parseExprExp ... - -parseExprPat :: String -> Q Pat -parseExprPat ... -</programlisting> - -<para>Now run the compiler: -<programlisting> -$ ghc --make -XQuasiQuotes Main.hs -o main -</programlisting> -</para> - -<para>Run "main" and here is your output: -<programlisting> -$ ./main -3 -1 -</programlisting> -</para> -</sect2> - -</sect1> - -<!-- ===================== Arrow notation =================== --> - -<sect1 id="arrow-notation"> -<title>Arrow notation -</title> - -<para>Arrows are a generalisation of monads introduced by John Hughes. -For more details, see -<itemizedlist> - -<listitem> -<para> -“Generalising Monads to Arrows”, -John Hughes, in <citetitle>Science of Computer Programming</citetitle> 37, -pp67–111, May 2000. -The paper that introduced arrows: a friendly introduction, motivated with -programming examples. -</para> -</listitem> - -<listitem> -<para> -“<ulink url="http://www.soi.city.ac.uk/~ross/papers/notation.html">A New Notation for Arrows</ulink>”, -Ross Paterson, in <citetitle>ICFP</citetitle>, Sep 2001. -Introduced the notation described here. -</para> -</listitem> - -<listitem> -<para> -“<ulink url="http://www.soi.city.ac.uk/~ross/papers/fop.html">Arrows and Computation</ulink>”, -Ross Paterson, in <citetitle>The Fun of Programming</citetitle>, -Palgrave, 2003. -</para> -</listitem> - -<listitem> -<para> -“<ulink url="http://www.cse.chalmers.se/~rjmh/afp-arrows.pdf">Programming with Arrows</ulink>”, -John Hughes, in <citetitle>5th International Summer School on -Advanced Functional Programming</citetitle>, -<citetitle>Lecture Notes in Computer Science</citetitle> vol. 3622, -Springer, 2004. -This paper includes another introduction to the notation, -with practical examples. 
-</para> -</listitem> - -<listitem> -<para> -“<ulink url="http://www.haskell.org/ghc/docs/papers/arrow-rules.pdf">Type and Translation Rules for Arrow Notation in GHC</ulink>”, -Ross Paterson and Simon Peyton Jones, September 16, 2004. -A terse enumeration of the formal rules used -(extracted from comments in the source code). -</para> -</listitem> - -<listitem> -<para> -The arrows web page at -<ulink url="http://www.haskell.org/arrows/"><literal>http://www.haskell.org/arrows/</literal></ulink>. -</para> -</listitem> - -</itemizedlist> -With the <option>-XArrows</option> flag, GHC supports the arrow -notation described in the second of these papers, -translating it using combinators from the -<ulink url="&libraryBaseLocation;/Control-Arrow.html"><literal>Control.Arrow</literal></ulink> -module. -What follows is a brief introduction to the notation; -it won't make much sense unless you've read Hughes's paper. -</para> - -<para>The extension adds a new kind of expression for defining arrows: -<screen> -<replaceable>exp</replaceable><superscript>10</superscript> ::= ... - | proc <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable> -</screen> -where <literal>proc</literal> is a new keyword. -The variables of the pattern are bound in the body of the -<literal>proc</literal>-expression, -which is a new sort of thing called a <firstterm>command</firstterm>. 
-The syntax of commands is as follows: -<screen> -<replaceable>cmd</replaceable> ::= <replaceable>exp</replaceable><superscript>10</superscript> -< <replaceable>exp</replaceable> - | <replaceable>exp</replaceable><superscript>10</superscript> -<< <replaceable>exp</replaceable> - | <replaceable>cmd</replaceable><superscript>0</superscript> -</screen> -with <replaceable>cmd</replaceable><superscript>0</superscript> up to -<replaceable>cmd</replaceable><superscript>9</superscript> defined using -infix operators as for expressions, and -<screen> -<replaceable>cmd</replaceable><superscript>10</superscript> ::= \ <replaceable>apat</replaceable> ... <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable> - | let <replaceable>decls</replaceable> in <replaceable>cmd</replaceable> - | if <replaceable>exp</replaceable> then <replaceable>cmd</replaceable> else <replaceable>cmd</replaceable> - | case <replaceable>exp</replaceable> of { <replaceable>calts</replaceable> } - | do { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> ; <replaceable>cmd</replaceable> } - | <replaceable>fcmd</replaceable> - -<replaceable>fcmd</replaceable> ::= <replaceable>fcmd</replaceable> <replaceable>aexp</replaceable> - | ( <replaceable>cmd</replaceable> ) - | (| <replaceable>aexp</replaceable> <replaceable>cmd</replaceable> ... <replaceable>cmd</replaceable> |) - -<replaceable>cstmt</replaceable> ::= let <replaceable>decls</replaceable> - | <replaceable>pat</replaceable> <- <replaceable>cmd</replaceable> - | rec { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> [;] } - | <replaceable>cmd</replaceable> -</screen> -where <replaceable>calts</replaceable> are like <replaceable>alts</replaceable> -except that the bodies are commands instead of expressions. -</para> - -<para> -Commands produce values, but (like monadic computations) -may yield more than one value, -or none, and may do other things as well. 
-For the most part, familiarity with monadic notation is a good guide to -using commands. -However, the values of expressions, even monadic ones, -are determined by the values of the variables they contain; -this is not necessarily the case for commands. -</para> - -<para> -A simple example of the new notation is the expression -<screen> -proc x -> f -< x+1 -</screen> -We call this a <firstterm>procedure</firstterm> or -<firstterm>arrow abstraction</firstterm>. -As with a lambda expression, the variable <literal>x</literal> -is a new variable bound within the <literal>proc</literal>-expression. -It refers to the input to the arrow. -In the above example, <literal>-<</literal> is not an identifier but a -new reserved symbol used for building commands from an expression of arrow -type and an expression to be fed as input to that arrow. -(The weird look will make more sense later.) -It may be read as an analogue of application for arrows. -The above example is equivalent to the Haskell expression -<screen> -arr (\ x -> x+1) >>> f -</screen> -That would make no sense if the expression to the left of -<literal>-<</literal> involved the bound variable <literal>x</literal>. -More generally, the expression to the left of <literal>-<</literal> -may not involve any <firstterm>local variable</firstterm>, -i.e. a variable bound in the current arrow abstraction. -For such a situation there is a variant <literal>-<<</literal>, as in -<screen> -proc x -> f x -<< x+1 -</screen> -which is equivalent to -<screen> -arr (\ x -> (f x, x+1)) >>> app -</screen> -so in this case the arrow must belong to the <literal>ArrowApply</literal> -class. -Such an arrow is equivalent to a monad, so if you're using this form -you may find a monadic formulation more convenient. -</para> - -<sect2> -<title>do-notation for commands</title> - -<para> -Another form of command is a form of <literal>do</literal>-notation.
-For example, you can write -<screen> -proc x -> do - y <- f -< x+1 - g -< 2*y - let z = x+y - t <- h -< x*z - returnA -< t+z -</screen> -You can read this much like ordinary <literal>do</literal>-notation, -but with commands in place of monadic expressions. -The first line sends the value of <literal>x+1</literal> as an input to -the arrow <literal>f</literal>, and matches its output against -<literal>y</literal>. -In the next line, the output is discarded. -The arrow <function>returnA</function> is defined in the -<ulink url="&libraryBaseLocation;/Control-Arrow.html"><literal>Control.Arrow</literal></ulink> -module as <literal>arr id</literal>. -The above example is treated as an abbreviation for -<screen> -arr (\ x -> (x, x)) >>> - first (arr (\ x -> x+1) >>> f) >>> - arr (\ (y, x) -> (y, (x, y))) >>> - first (arr (\ y -> 2*y) >>> g) >>> - arr snd >>> - arr (\ (x, y) -> let z = x+y in ((x, z), z)) >>> - first (arr (\ (x, z) -> x*z) >>> h) >>> - arr (\ (t, z) -> t+z) >>> - returnA -</screen> -Note that variables not used later in the composition are projected out. -After simplification using rewrite rules (see <xref linkend="rewrite-rules"/>) -defined in the -<ulink url="&libraryBaseLocation;/Control-Arrow.html"><literal>Control.Arrow</literal></ulink> -module, this reduces to -<screen> -arr (\ x -> (x+1, x)) >>> - first f >>> - arr (\ (y, x) -> (2*y, (x, y))) >>> - first g >>> - arr (\ (_, (x, y)) -> let z = x+y in (x*z, z)) >>> - first h >>> - arr (\ (t, z) -> t+z) -</screen> -which is what you might have written by hand. -With arrow notation, GHC keeps track of all those tuples of variables for you. -</para> - -<para> -Note that although the above translation suggests that -<literal>let</literal>-bound variables like <literal>z</literal> must be -monomorphic, the actual translation produces Core, -so polymorphic variables are allowed. 
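-</para>
-
-<para>
-As a minimal, self-contained sketch of the notation in use (the name
-<literal>addA</literal> is illustrative and not part of any library), the
-following module combines the outputs of two arrows and runs the result at
-the ordinary function arrow <literal>(->)</literal>:
-<programlisting>
-{-# LANGUAGE Arrows #-}
-module Main where
-
-import Control.Arrow
-
--- Feed the same input to f and g, then add their outputs.
-addA :: Arrow a => a b Int -> a b Int -> a b Int
-addA f g = proc x -> do
-    y <- f -< x
-    z <- g -< x
-    returnA -< y + z
-
-main :: IO ()
-main = print (addA (+1) (*2) 5)   -- (5+1) + (5*2) = 16
-</programlisting>
-Because <literal>addA</literal> is written against the <literal>Arrow</literal>
-class alone, the same definition should work unchanged at any other arrow type.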
-</para> - -<para> -It's also possible to have mutually recursive bindings, -using the new <literal>rec</literal> keyword, as in the following example: -<programlisting> -counter :: ArrowCircuit a => a Bool Int -counter = proc reset -> do - rec output <- returnA -< if reset then 0 else next - next <- delay 0 -< output+1 - returnA -< output -</programlisting> -The translation of such forms uses the <function>loop</function> combinator, -so the arrow concerned must belong to the <literal>ArrowLoop</literal> class. -</para> - -</sect2> - -<sect2> -<title>Conditional commands</title> - -<para> -In the previous example, we used a conditional expression to construct the -input for an arrow. -Sometimes we want to conditionally execute different commands, as in -<screen> -proc (x,y) -> - if f x y - then g -< x+1 - else h -< y+2 -</screen> -which is translated to -<screen> -arr (\ (x,y) -> if f x y then Left x else Right y) >>> - (arr (\x -> x+1) >>> g) ||| (arr (\y -> y+2) >>> h) -</screen> -Since the translation uses <function>|||</function>, -the arrow concerned must belong to the <literal>ArrowChoice</literal> class. -</para> - -<para> -There are also <literal>case</literal> commands, like -<screen> -case input of - [] -> f -< () - [x] -> g -< x+1 - x1:x2:xs -> do - y <- h -< (x1, x2) - ys <- k -< xs - returnA -< y:ys -</screen> -The syntax is the same as for <literal>case</literal> expressions, -except that the bodies of the alternatives are commands rather than expressions. -The translation is similar to that of <literal>if</literal> commands. -</para> - -</sect2> - -<sect2> -<title>Defining your own control structures</title> - -<para> -As we've seen, arrow notation provides constructs, -modelled on those for expressions, -for sequencing, value recursion and conditionals. -But suitable combinators, -which you can define in ordinary Haskell, -may also be used to build new commands out of existing ones.
-The basic idea is that a command defines an arrow from environments to values. -These environments assign values to the free local variables of the command. -Thus combinators that produce arrows from arrows -may also be used to build commands from commands. -For example, the <literal>ArrowPlus</literal> class includes a combinator -<programlisting> -(<+>) :: ArrowPlus a => a b c -> a b c -> a b c -</programlisting> -so we can use it to build commands: -<programlisting> -expr' = proc x -> do - returnA -< x - <+> do - symbol Plus -< () - y <- term -< () - expr' -< x + y - <+> do - symbol Minus -< () - y <- term -< () - expr' -< x - y -</programlisting> -(The <literal>do</literal> on the first line is needed to prevent the first -<literal><+> ...</literal> from being interpreted as part of the -expression on the previous line.) -This is equivalent to -<programlisting> -expr' = (proc x -> returnA -< x) - <+> (proc x -> do - symbol Plus -< () - y <- term -< () - expr' -< x + y) - <+> (proc x -> do - symbol Minus -< () - y <- term -< () - expr' -< x - y) -</programlisting> -We are actually using <literal><+></literal> here with the more specific type -<programlisting> -(<+>) :: ArrowPlus a => a (e,()) c -> a (e,()) c -> a (e,()) c -</programlisting> -It is essential that this operator be polymorphic in <literal>e</literal> -(representing the environment input to the command -and thence to its subcommands) -and satisfy the corresponding naturality property -<screen> -arr (first k) >>> (f <+> g) = (arr (first k) >>> f) <+> (arr (first k) >>> g) -</screen> -at least for strict <literal>k</literal>. -(This should be automatic if you're not using <function>seq</function>.) -This ensures that environments seen by the subcommands are environments -of the whole command, -and also allows the translation to safely trim these environments. -(The second component of the input pairs can contain unnamed input values, -as described in the next section.)
-The operator must also not use any variable defined within the current -arrow abstraction. -</para> - -<para> -We could define our own operator -<programlisting> -untilA :: ArrowChoice a => a (e,s) () -> a (e,s) Bool -> a (e,s) () -untilA body cond = proc x -> do - b <- cond -< x - if b then returnA -< () - else do - body -< x - untilA body cond -< x -</programlisting> -and use it in the same way. -Of course this infix syntax only makes sense for binary operators; -there is also a more general syntax involving special brackets: -<screen> -proc x -> do - y <- f -< x+1 - (|untilA (increment -< x+y) (within 0.5 -< x)|) -</screen> -</para> - -</sect2> - -<sect2> -<title>Primitive constructs</title> - -<para> -Some operators will need to pass additional inputs to their subcommands. -For example, in an arrow type supporting exceptions, -the operator that attaches an exception handler will wish to pass the -exception that occurred to the handler. -Such an operator might have a type -<screen> -handleA :: ... => a (e,s) c -> a (e,(Ex,s)) c -> a (e,s) c -</screen> -where <literal>Ex</literal> is the type of exceptions handled. -You could then use this with arrow notation by writing a command -<screen> -body `handleA` \ ex -> handler -</screen> -so that if an exception is raised in the command <literal>body</literal>, -the variable <literal>ex</literal> is bound to the value of the exception -and the command <literal>handler</literal>, -which typically refers to <literal>ex</literal>, is entered. -Though the syntax here looks like a functional lambda, -we are talking about commands, and something different is going on. -The input to the arrow represented by a command consists of values for -the free local variables in the command, plus a stack of anonymous values. -In all the prior examples, we made no assumptions about this stack. -In the second argument to <function>handleA</function>, -the value of the exception has been added to the stack input to the handler.
-The command form of lambda merely gives this value a name. -</para> - -<para> -More concretely, -the input to a command consists of a pair of an environment and a stack. -Each value on the stack is paired with the remainder of the stack, -with an empty stack being <literal>()</literal>. -So operators like <function>handleA</function> that pass -extra inputs to their subcommands can be designed for use with the notation -by placing the values on the stack paired with the environment in this way. -More precisely, the type of each argument of the operator (and its result) -should have the form -<screen> -a (e, (t1, ... (tn, ())...)) t -</screen> -where <replaceable>e</replaceable> is a polymorphic variable -(representing the environment) -and <replaceable>ti</replaceable> are the types of the values on the stack, -with <replaceable>t1</replaceable> being the <quote>top</quote>. -The polymorphic variable <replaceable>e</replaceable> must not occur in -<replaceable>a</replaceable>, <replaceable>ti</replaceable> or -<replaceable>t</replaceable>. -However the arrows involved need not be the same. -Here are some more examples of suitable operators: -<screen> -bracketA :: ... => a (e,s) b -> a (e,(b,s)) c -> a (e,(c,s)) d -> a (e,s) d -runReader :: ... => a (e,s) c -> a' (e,(State,s)) c -runState :: ... => a (e,s) c -> a' (e,(State,s)) (c,State) -</screen> -We can supply the extra input required by commands built with the last two -by applying them to ordinary expressions, as in -<screen> -proc x -> do - s <- ... - (|runReader (do { ... })|) s -</screen> -which adds <literal>s</literal> to the stack of inputs to the command -built using <function>runReader</function>. -</para> - -<para> -The command versions of lambda abstraction and application are analogous to -the expression versions. -In particular, the beta and eta rules describe equivalences of commands. 
-These three features (operators, lambda abstraction and application) -are the core of the notation; everything else can be built using them, -though the results would be somewhat clumsy. -For example, we could simulate <literal>do</literal>-notation by defining -<programlisting> -bind :: Arrow a => a (e,s) b -> a (e,(b,s)) c -> a (e,s) c -u `bind` f = returnA &&& u >>> f - -bind_ :: Arrow a => a (e,s) b -> a (e,s) c -> a (e,s) c -u `bind_` f = u `bind` (arr fst >>> f) -</programlisting> -We could simulate <literal>if</literal> by defining -<programlisting> -cond :: ArrowChoice a => a (e,s) b -> a (e,s) b -> a (e,(Bool,s)) b -cond f g = arr (\ (e,(b,s)) -> if b then Left (e,s) else Right (e,s)) >>> f ||| g -</programlisting> -</para> - -</sect2> - -<sect2> -<title>Differences with the paper</title> - -<itemizedlist> - -<listitem> -<para>Instead of a single form of arrow application (arrow tail) with two -translations, the implementation provides two forms -<quote><literal>-<</literal></quote> (first-order) -and <quote><literal>-<<</literal></quote> (higher-order). -</para> -</listitem> - -<listitem> -<para>User-defined operators are flagged with banana brackets instead of -a new <literal>form</literal> keyword. -</para> -</listitem> - -<listitem> -<para>In the paper and the previous implementation, -values on the stack were paired to the right of the environment -in a single argument, -but now the environment and stack are separate arguments. -</para> -</listitem> - -</itemizedlist> - -</sect2> - -<sect2> -<title>Portability</title> - -<para> -Although only GHC implements arrow notation directly, -there is also a preprocessor -(available from the -<ulink url="http://www.haskell.org/arrows/">arrows web page</ulink>) -that translates arrow notation into Haskell 98 -for use with other Haskell systems. -You would still want to check arrow programs with GHC; -tracing type errors in the preprocessor output is not easy. 
-Modules intended for both GHC and the preprocessor must observe some -additional restrictions: -<itemizedlist> - -<listitem> -<para> -The module must import -<ulink url="&libraryBaseLocation;/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>. -</para> -</listitem> - -<listitem> -<para> -The preprocessor cannot cope with other Haskell extensions. -These would have to go in separate modules. -</para> -</listitem> - -<listitem> -<para> -Because the preprocessor targets Haskell (rather than Core), -<literal>let</literal>-bound variables are monomorphic. -</para> -</listitem> - -</itemizedlist> -</para> - -</sect2> - -</sect1> - -<!-- ==================== BANG PATTERNS ================= --> - -<sect1 id="bang-patterns"> -<title>Bang patterns -<indexterm><primary>Bang patterns</primary></indexterm> -</title> -<para>GHC supports an extension of pattern matching called <emphasis>bang -patterns</emphasis>, written <literal>!<replaceable>pat</replaceable></literal>. -Bang patterns are under consideration for Haskell Prime. -The <ulink -url="http://ghc.haskell.org/trac/haskell-prime/wiki/BangPatterns">Haskell -prime feature description</ulink> contains more discussion and examples -than the material below. -</para> -<para> -The key change is the addition of a new rule to the -<ulink url="http://haskell.org/onlinereport/exps.html#sect3.17.2">semantics of pattern matching in the Haskell 98 report</ulink>. -Add new bullet 10, saying: Matching the pattern <literal>!</literal><replaceable>pat</replaceable> -against a value <replaceable>v</replaceable> behaves as follows: -<itemizedlist> -<listitem><para>if <replaceable>v</replaceable> is bottom, the match diverges</para></listitem> -<listitem><para>otherwise, <replaceable>pat</replaceable> is matched against <replaceable>v</replaceable> </para></listitem> -</itemizedlist> -</para> -<para> -Bang patterns are enabled by the flag <option>-XBangPatterns</option>. 
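-</para>
-<para>
-The extension can also be enabled per module with a <literal>LANGUAGE</literal>
-pragma. As a small sketch of a typical use (the function name
-<literal>sumTo</literal> is purely illustrative), the bang on the accumulator
-below asks for the running total to be evaluated at each step rather than
-carried along as a chain of suspended additions:
-<programlisting>
-{-# LANGUAGE BangPatterns #-}
-module Main where
-
--- Sum the integers from 1 to n using a strict accumulator.
-sumTo :: Int -> Int
-sumTo n = go 0 1
-  where
-    go !acc i
-      | i > n     = acc
-      | otherwise = go (acc + i) (i + 1)
-
-main :: IO ()
-main = print (sumTo 100)   -- 5050
-</programlisting>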
-</para> - -<sect2 id="bang-patterns-informal"> -<title>Informal description of bang patterns -</title> -<para> -The main idea is to add a single new production to the syntax of patterns: -<programlisting> - pat ::= !pat -</programlisting> -Matching an expression <literal>e</literal> against a pattern <literal>!p</literal> is done by first -evaluating <literal>e</literal> (to WHNF) and then matching the result against <literal>p</literal>. -Example: -<programlisting> -f1 !x = True -</programlisting> -This definition makes <literal>f1</literal> strict in <literal>x</literal>, -whereas without the bang it would be lazy. -Bang patterns can be nested, of course: -<programlisting> -f2 (!x, y) = [x,y] -</programlisting> -Here, <literal>f2</literal> is strict in <literal>x</literal> but not in -<literal>y</literal>. -A bang only really has an effect if it precedes a variable or wild-card pattern: -<programlisting> -f3 !(x,y) = [x,y] -f4 (x,y) = [x,y] -</programlisting> -Here, <literal>f3</literal> and <literal>f4</literal> are identical; -putting a bang before a pattern that -forces evaluation anyway does nothing. -</para> -<para> -There is one (apparent) exception to this general rule that a bang only -makes a difference when it precedes a variable or wild-card: a bang at the -top level of a <literal>let</literal> or <literal>where</literal> -binding makes the binding strict, regardless of the pattern. -(We say "apparent" exception because the Right Way to think of it is that the bang -at the top of a binding is not part of the <emphasis>pattern</emphasis>; rather it -is part of the syntax of the <emphasis>binding</emphasis>, -creating a "bang-pattern binding".) -For example: -<programlisting> -let ![x,y] = e in b -</programlisting> -is a bang-pattern binding. Operationally, it behaves just like a case expression: -<programlisting> -case e of [x,y] -> b -</programlisting> -Like a case expression, a bang-pattern binding must be non-recursive, and -is monomorphic.
- -However, <emphasis>nested</emphasis> bangs in a pattern binding behave uniformly with all other forms of -pattern matching. For example -<programlisting> -let (!x,[y]) = e in b -</programlisting> -is equivalent to this: -<programlisting> -let { t = case e of (x,[y]) -> x `seq` (x,y) - x = fst t - y = snd t } -in b -</programlisting> -The binding is lazy, but when either <literal>x</literal> or <literal>y</literal> is -evaluated by <literal>b</literal> the entire pattern is matched, including forcing the -evaluation of <literal>x</literal>. -</para> -<para> -Bang patterns work in <literal>case</literal> expressions too, of course: -<programlisting> -g5 x = let y = f x in body -g6 x = case f x of { y -> body } -g7 x = case f x of { !y -> body } -</programlisting> -The functions <literal>g5</literal> and <literal>g6</literal> mean exactly the same thing. -But <literal>g7</literal> evaluates <literal>(f x)</literal>, binds <literal>y</literal> to the -result, and then evaluates <literal>body</literal>. -</para> -</sect2> - - -<sect2 id="bang-patterns-sem"> -<title>Syntax and semantics -</title> -<para> - -We add a single new production to the syntax of patterns: -<programlisting> - pat ::= !pat -</programlisting> -There is one problem with syntactic ambiguity. Consider: -<programlisting> -f !x = 3 -</programlisting> -Is this a definition of the infix function "<literal>(!)</literal>", -or of the "<literal>f</literal>" with a bang pattern? GHC resolves this -ambiguity in favour of the latter. If you want to define -<literal>(!)</literal> with bang-patterns enabled, you have to do so using -prefix notation: -<programlisting> -(!) f x = 3 -</programlisting> -The semantics of Haskell pattern matching is described in <ulink -url="http://www.haskell.org/onlinereport/exps.html#sect3.17.2"> -Section 3.17.2</ulink> of the Haskell Report. 
To this description add -one extra item 10, saying: -<itemizedlist><listitem><para>Matching -the pattern <literal>!pat</literal> against a value <literal>v</literal> behaves as follows: -<itemizedlist><listitem><para>if <literal>v</literal> is bottom, the match diverges</para></listitem> - <listitem><para>otherwise, <literal>pat</literal> is matched against - <literal>v</literal></para></listitem> -</itemizedlist> -</para></listitem></itemizedlist> -Similarly, in Figure 4 of <ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.17.3"> -Section 3.17.3</ulink>, add a new case (t): -<programlisting> -case v of { !pat -> e; _ -> e' } - = v `seq` case v of { pat -> e; _ -> e' } -</programlisting> -</para><para> -That leaves let expressions, whose translation is given in -<ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.12">Section -3.12</ulink> -of the Haskell Report. -In the translation box, first apply -the following transformation: for each pattern <literal>pi</literal> that is of -form <literal>!qi = ei</literal>, transform it to <literal>(xi,!qi) = ((),ei)</literal>, and replace <literal>e0</literal> -by <literal>(xi `seq` e0)</literal>. Then, when none of the left-hand-side patterns -have a bang at the top, apply the rules in the existing box. -</para> -<para>The effect of the let rule is to force complete matching of the pattern -<literal>qi</literal> before evaluation of the body is begun. The bang is -retained in the translated form in case <literal>qi</literal> is a variable, -thus: -<programlisting> - let !y = f x in b -</programlisting> - -</para> -<para> -The let-binding can be recursive. However, it is much more common for -the let-binding to be non-recursive, in which case the following law holds: -<literal>(let !p = rhs in body)</literal> - is equivalent to -<literal>(case rhs of !p -> body)</literal> -</para> -<para> -A pattern with a bang at the outermost level is not allowed at the top level of -a module. 
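-</para>
-<para>
-The divergence rule for <literal>!pat</literal> can be observed directly.
-The following sketch (the names <literal>lazyConst</literal> and
-<literal>strictConst</literal> are illustrative only) applies a lazy and a
-strict function to <literal>undefined</literal> and catches the resulting
-exception with <literal>Control.Exception</literal>:
-<programlisting>
-{-# LANGUAGE BangPatterns #-}
-module Main where
-
-import Control.Exception (SomeException, evaluate, try)
-
--- Ignores its argument without evaluating it.
-lazyConst :: a -> Bool
-lazyConst _ = True
-
--- The bang forces the argument to WHNF before the match succeeds.
-strictConst :: a -> Bool
-strictConst !_ = True
-
-main :: IO ()
-main = do
-    print (lazyConst (undefined :: Int))   -- True: the argument is never forced
-    r <- try (evaluate (strictConst (undefined :: Int)))
-    case (r :: Either SomeException Bool) of
-      Left _  -> putStrLn "strictConst forced its argument"
-      Right b -> print b
-</programlisting>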
-</para> -</sect2> -</sect1> - -<!-- ==================== ASSERTIONS ================= --> - -<sect1 id="assertions"> -<title>Assertions -<indexterm><primary>Assertions</primary></indexterm> -</title> - -<para> -If you want to make use of assertions in your standard Haskell code, you -could define a function like the following: -</para> - -<para> - -<programlisting> -assert :: Bool -> a -> a -assert False x = error "assertion failed!" -assert _ x = x -</programlisting> - -</para> - -<para> -which works, but gives you back a less than useful error message -- -an assertion failed, but which and where? -</para> - -<para> -One way out is to define an extended <function>assert</function> function which also -takes a descriptive string to include in the error message and -perhaps combine this with the use of a pre-processor which inserts -the source location where <function>assert</function> was used. -</para> - -<para> -GHC offers a helping hand here, doing all of this for you. For every -use of <function>assert</function> in the user's source: -</para> - -<para> - -<programlisting> -kelvinToC :: Double -> Double -kelvinToC k = assert (k >= 0.0) (k-273.15) -</programlisting> - -</para> - -<para> -GHC will rewrite this to also include the source location where the -assertion was made, -</para> - -<para> - -<programlisting> -assert pred val ==> assertError "Main.hs|15" pred val -</programlisting> - -</para> - -<para> -The rewrite is only performed by the compiler when it spots -applications of <function>Control.Exception.assert</function>, so you -can still define and use your own versions of -<function>assert</function>, should you so wish. If not, import -<literal>Control.Exception</literal> to make use of -<function>assert</function> in your code. -</para> - -<para> -GHC ignores assertions when optimisation is turned on with the - <option>-O</option><indexterm><primary><option>-O</option></primary></indexterm> flag.
That is, expressions of the form -<literal>assert pred e</literal> will be rewritten to -<literal>e</literal>. You can also disable assertions using the - <option>-fignore-asserts</option> - option<indexterm><primary><option>-fignore-asserts</option></primary> - </indexterm>. The option <option>-fno-ignore-asserts</option> keeps -assertions enabled even when optimisation is turned on. -</para> - -<para> -Assertion failures can be caught; see the documentation for the -<literal>Control.Exception</literal> library for the details. -</para> - -</sect1> - -<!-- =============================== STATIC POINTERS =========================== --> - -<sect1 id="static-pointers"> -<title>Static pointers -<indexterm><primary>Static pointers</primary></indexterm> -</title> - -<para> -The language extension <literal>-XStaticPointers</literal> adds a new -syntactic form <literal>static <replaceable>e</replaceable></literal>, -which stands for a reference to the closed expression -<replaceable>e</replaceable>. This reference is stable and portable, -in the sense that it remains valid across different processes on -possibly different machines. Thus, a process can create a reference -and send it to another process that can resolve it to -<replaceable>e</replaceable>. -</para> -<para> -With this extension turned on, <literal>static</literal> is no longer -a valid identifier. -</para> -<para> -Static pointers were first proposed in the paper <ulink -url="http://research.microsoft.com/en-us/um/people/simonpj/papers/parallel/remote.pdf"> -Towards Haskell in the cloud</ulink>, Jeff Epstein, Andrew P. Black and Simon -Peyton Jones, Proceedings of the 4th ACM Symposium on Haskell, pp. -118-129, ACM, 2011.
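-</para>
-<para>
-As a minimal single-process sketch of the <literal>static</literal> form
-(assuming a GHC whose <literal>base</literal> library provides
-<literal>GHC.StaticPtr</literal>; the names <literal>inc</literal> and
-<literal>incPtr</literal> are illustrative), a reference to a top-level
-function can be created and then dereferenced with
-<literal>deRefStaticPtr</literal>:
-<programlisting>
-{-# LANGUAGE StaticPointers #-}
-module Main where
-
-import GHC.StaticPtr (StaticPtr, deRefStaticPtr)
-
-inc :: Int -> Int
-inc x = x + 1
-
--- A reference to the closed expression inc.
-incPtr :: StaticPtr (Int -> Int)
-incPtr = static inc
-
-main :: IO ()
-main = print (deRefStaticPtr incPtr 41)   -- 42
-</programlisting>
-Sending such a reference to another process is not shown here; this sketch
-only illustrates creating and dereferencing it locally.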
-</para> - -<sect2 id="using-static-pointers"> -<title>Using static pointers</title> - -<para> -Each reference is given a key which can be used to locate it at runtime with -<ulink url="&libraryBaseLocation;/GHC.StaticPtr.html#v%3AunsafeLookupStaticPtr"><literal>unsafeLookupStaticPtr</literal></ulink> -which uses a global and immutable table called the Static Pointer Table. -The compiler includes entries in this table for all static forms found in -the linked modules. The value can be obtained from the reference via -<ulink url="&libraryBaseLocation;/GHC.StaticPtr.html#v%3AdeRefStaticPtr"><literal>deRefStaticPtr</literal></ulink> -</para> - -<para> -The body <literal>e</literal> of a <literal>static -e</literal> expression must be a closed expression. That is, there can -be no free variables occurring in <literal>e</literal>, i.e. lambda- -or let-bound variables bound locally in the context of the expression. -</para> - -<para> -All of the following are permissible: -<programlisting> -inc :: Int -> Int -inc x = x + 1 - -ref1 = static 1 -ref2 = static inc -ref3 = static (inc 1) -ref4 = static ((\x -> x + 1) (1 :: Int)) -ref5 y = static (let x = 1 in x) -</programlisting> -While the following definitions are rejected: -<programlisting> -ref6 = let x = 1 in static x -ref7 y = static (let x = 1 in y) -</programlisting> -</para> -</sect2> - -<sect2 id="typechecking-static-pointers"> -<title>Static semantics of static pointers</title> - -<para> - -Informally, if we have a closed expression -<programlisting> -e :: forall a_1 ... a_n . t -</programlisting> -the static form is of type -<programlisting> -static e :: (Typeable a_1, ... , Typeable a_n) => StaticPtr t -</programlisting> -Furthermore, type <literal>t</literal> is constrained to have a -<literal>Typeable</literal> instance. 
- -The following are therefore illegal: -<programlisting> -static show -- No Typeable instance for (Show a => a -> String) -static Control.Monad.ST.runST -- No Typeable instance for ((forall s. ST s a) -> a) -</programlisting> - -That being said, with the appropriate use of wrapper datatypes, the -above limitations induce no loss of generality: -<programlisting> -{-# LANGUAGE ConstraintKinds #-} -{-# LANGUAGE DeriveDataTypeable #-} -{-# LANGUAGE ExistentialQuantification #-} -{-# LANGUAGE Rank2Types #-} -{-# LANGUAGE StandaloneDeriving #-} -{-# LANGUAGE StaticPointers #-} - -import Control.Monad.ST -import Data.Typeable -import GHC.StaticPtr - -data Dict c = c => Dict - deriving Typeable - -g1 :: Typeable a => StaticPtr (Dict (Show a) -> a -> String) -g1 = static (\Dict -> show) - -data Rank2Wrapper f = R2W (forall s. f s) - deriving Typeable -newtype Flip f a s = Flip { unFlip :: f s a } - deriving Typeable - -g2 :: Typeable a => StaticPtr (Rank2Wrapper (Flip ST a) -> a) -g2 = static (\(R2W f) -> runST (unFlip f)) -</programlisting> -</para> -</sect2> - -</sect1> - - -<!-- =============================== PRAGMAS =========================== --> - - <sect1 id="pragmas"> - <title>Pragmas</title> - - <indexterm><primary>pragma</primary></indexterm> - - <para>GHC supports several pragmas, or instructions to the - compiler placed in the source code. Pragmas don't normally affect - the meaning of the program, but they might affect the efficiency - of the generated code.</para> - - <para>Pragmas all take the form - -<literal>{-# <replaceable>word</replaceable> ... #-}</literal> - - where <replaceable>word</replaceable> indicates the type of - pragma, and is followed optionally by information specific to that - type of pragma. Case is ignored in - <replaceable>word</replaceable>. 
The various values for - <replaceable>word</replaceable> that GHC understands are described - in the following sections; any pragma encountered with an - unrecognised <replaceable>word</replaceable> is - ignored. The layout rule applies in pragmas, so the closing <literal>#-}</literal> - should start in a column to the right of the opening <literal>{-#</literal>. </para> - - <para>Certain pragmas are <emphasis>file-header pragmas</emphasis>: - <itemizedlist> - <listitem><para> - A file-header - pragma must precede the <literal>module</literal> keyword in the file. - </para></listitem> - <listitem><para> - There can be as many file-header pragmas as you please, and they can be - preceded or followed by comments. - </para></listitem> - <listitem><para> - File-header pragmas are read once only, before - pre-processing the file (e.g. with cpp). - </para></listitem> - <listitem><para> - The file-header pragmas are: <literal>{-# LANGUAGE #-}</literal>, - <literal>{-# OPTIONS_GHC #-}</literal>, and - <literal>{-# INCLUDE #-}</literal>. - </para></listitem> - </itemizedlist> - </para> - - <sect2 id="language-pragma"> - <title>LANGUAGE pragma</title> - - <indexterm><primary>LANGUAGE</primary><secondary>pragma</secondary></indexterm> - <indexterm><primary>pragma</primary><secondary>LANGUAGE</secondary></indexterm> - - <para>The <literal>LANGUAGE</literal> pragma allows language extensions to be enabled - in a portable way. - It is the intention that all Haskell compilers support the - <literal>LANGUAGE</literal> pragma with the same syntax, although not - all extensions are supported by all compilers, of - course. 
The <literal>LANGUAGE</literal> pragma should be used instead - of <literal>OPTIONS_GHC</literal>, if possible.</para> - - <para>For example, to enable the FFI and preprocessing with CPP:</para> - -<programlisting>{-# LANGUAGE ForeignFunctionInterface, CPP #-}</programlisting> - - <para><literal>LANGUAGE</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para> - - <para>Every language extension can also be turned into a command-line flag - by prefixing it with "<literal>-X</literal>"; for example <option>-XForeignFunctionInterface</option>. - (Similarly, all "<literal>-X</literal>" flags can be written as <literal>LANGUAGE</literal> pragmas.) - </para> - - <para>A list of all supported language extensions can be obtained by invoking - <literal>ghc --supported-extensions</literal> (see <xref linkend="modes"/>).</para> - - <para>Any extension from the <literal>Extension</literal> type defined in - <ulink - url="&libraryCabalLocation;/Language-Haskell-Extension.html"><literal>Language.Haskell.Extension</literal></ulink> - may be used. GHC will report an error if any of the requested extensions are not supported.</para> - </sect2> - - - <sect2 id="options-pragma"> - <title>OPTIONS_GHC pragma</title> - <indexterm><primary>OPTIONS_GHC</primary> - </indexterm> - <indexterm><primary>pragma</primary><secondary>OPTIONS_GHC</secondary> - </indexterm> - - <para>The <literal>OPTIONS_GHC</literal> pragma is used to specify - additional options that are given to the compiler when compiling - this source file. 
See <xref linkend="source-file-options"/> for - details.</para> - - <para>Previous versions of GHC accepted <literal>OPTIONS</literal> rather - than <literal>OPTIONS_GHC</literal>, but that is now deprecated.</para> - - <para><literal>OPTIONS_GHC</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para> - </sect2> - - <sect2 id="include-pragma"> - <title>INCLUDE pragma</title> - - <para>The <literal>INCLUDE</literal> pragma used to be necessary for - specifying header files to be included when using the FFI and - compiling via C. It is no longer required for GHC, but is - accepted (and ignored) for compatibility with other - compilers.</para> - </sect2> - - <sect2 id="warning-deprecated-pragma"> - <title>WARNING and DEPRECATED pragmas</title> - <indexterm><primary>WARNING</primary></indexterm> - <indexterm><primary>DEPRECATED</primary></indexterm> - - <para>The WARNING pragma allows you to attach an arbitrary warning - to a particular function, class, or type. - A DEPRECATED pragma lets you specify that - a particular function, class, or type is deprecated. - There are two ways of using these pragmas. - - <itemizedlist> - <listitem> - <para>You can attach a message to an entire module thus:</para> -<programlisting> - module Wibble {-# DEPRECATED "Use Wobble instead" #-} where - ... -</programlisting> - <para>Or:</para> -<programlisting> - module Wibble {-# WARNING "This is an unstable interface." #-} where - ... 
-</programlisting> - <para>When you compile any module that imports - <literal>Wibble</literal>, GHC will print the specified - message.</para> - </listitem> - - <listitem> - <para>You can attach a warning to a function, class, type, or data constructor, with the - following top-level declarations:</para> -<programlisting> - {-# DEPRECATED f, C, T "Don't use these" #-} - {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-} -</programlisting> - <para>When you compile any module that imports and uses any - of the specified entities, GHC will print the specified - message.</para> - <para> You can only attach warnings to entities declared at top level in the module - being compiled, and you can only use unqualified names in the list of - entities. A capitalised name, such as <literal>T</literal>, - refers to <emphasis>either</emphasis> the type constructor <literal>T</literal> - <emphasis>or</emphasis> the data constructor <literal>T</literal>, or both if - both are in scope. If both are in scope, there is currently no way to - specify one without the other (cf. fixities - <xref linkend="infix-tycons"/>).</para> - </listitem> - </itemizedlist> - Warnings and deprecations are not reported for - (a) uses within the defining module, - (b) defining a method in a class instance, and - (c) uses in an export list. - The last exemption reduces spurious complaints within a library - in which one module gathers together and re-exports - the exports of several others. - </para> - <para>You can suppress the warnings with the flag - <option>-fno-warn-warnings-deprecations</option>.</para> - </sect2> - - <sect2 id="minimal-pragma"> - <title>MINIMAL pragma</title> - <indexterm><primary>MINIMAL</primary></indexterm> - <para>The MINIMAL pragma is used to specify the minimal complete definition of a class, that is, which methods must be implemented by all instances. If an instance does not satisfy the minimal complete definition, then a warning is generated. 
- This can be useful when a class has methods with circular defaults. For example: - </para> -<programlisting> -class Eq a where - (==) :: a -> a -> Bool - (/=) :: a -> a -> Bool - x == y = not (x /= y) - x /= y = not (x == y) - {-# MINIMAL (==) | (/=) #-} -</programlisting> - <para>Without the MINIMAL pragma no warning would be generated for an instance that implements neither method. - </para> - <para>The syntax for a minimal complete definition is:</para> -<screen> -mindef ::= name - | '(' mindef ')' - | mindef '|' mindef - | mindef ',' mindef -</screen> - <para>A vertical bar denotes disjunction, i.e. at least one of the two sides is required. - A comma denotes conjunction, i.e. both sides are required. - Conjunction binds more tightly than disjunction.</para> - <para> - If no MINIMAL pragma is given in the class declaration, it is just as if - a pragma <literal>{-# MINIMAL op1, op2, ..., opn #-}</literal> was given, where - the <literal>opi</literal> are the methods - (a) that lack a default method in the class declaration, and - (b) whose name does not start with an underscore - (cf. <option>-fwarn-missing-methods</option>, <xref linkend="options-sanity"/>). - </para> - <para>This warning can be turned off with the flag <option>-fno-warn-missing-methods</option>.</para> - </sect2> - - <sect2 id="inline-noinline-pragma"> - <title>INLINE and NOINLINE pragmas</title> - - <para>These pragmas control the inlining of function - definitions.</para> - - <sect3 id="inline-pragma"> - <title>INLINE pragma</title> - <indexterm><primary>INLINE</primary></indexterm> - - <para> - GHC (with <option>-O</option>, as always) tries to inline - (or “unfold”) functions/values that are - “small enough,” thus avoiding the call overhead - and possibly exposing other more-wonderful optimisations. - GHC has a set of heuristics, tuned over a long period of - time using many benchmarks, that decide when it is - beneficial to inline a function at its call site. 
The - heuristics are designed to inline functions when it appears - to be beneficial to do so, but without incurring excessive - code bloat. If a function looks too big, it won't be - inlined, and functions larger than a certain size will not - even have their definition exported in the interface file. - Some of the thresholds that govern these heuristic decisions - can be changed using flags; see <xref linkend="options-f" - />. - </para> - - <para> - Normally GHC will do a reasonable job of deciding by itself - when it is a good idea to inline a function. However, - sometimes you might want to override the default behaviour. - For example, you may have a key function that is important to - inline because it leads to further optimisations, but that GHC - judges to be too big to inline. - </para> - - <para>The sledgehammer you can bring to bear is the - <literal>INLINE</literal><indexterm><primary>INLINE - pragma</primary></indexterm> pragma, used thus:</para> - -<programlisting> -key_function :: Int -> String -> (Bool, Double) -{-# INLINE key_function #-} -</programlisting> - - <para>The major effect of an <literal>INLINE</literal> pragma - is to declare a function's “cost” to be very low. - The normal unfolding machinery will then be very keen to - inline it. However, an <literal>INLINE</literal> pragma for a - function "<literal>f</literal>" has a number of other effects: -<itemizedlist> -<listitem><para> -While GHC is keen to inline the function, it does not do so -blindly. For example, if you write -<programlisting> -map key_function xs -</programlisting> -there really isn't any point in inlining <literal>key_function</literal> to get -<programlisting> -map (\x -> <replaceable>body</replaceable>) xs -</programlisting> -In general, GHC only inlines the function if there is some reason (no matter -how slight) to suppose that it is useful to do so. 
-</para></listitem> - -<listitem><para> -Moreover, GHC will only inline the function if it is <emphasis>fully applied</emphasis>, -where "fully applied" -means applied to as many arguments as appear (syntactically) -on the LHS of the function -definition. For example: -<programlisting> -comp1 :: (b -> c) -> (a -> b) -> a -> c -{-# INLINE comp1 #-} -comp1 f g = \x -> f (g x) - -comp2 :: (b -> c) -> (a -> b) -> a -> c -{-# INLINE comp2 #-} -comp2 f g x = f (g x) -</programlisting> -The two functions <literal>comp1</literal> and <literal>comp2</literal> have the -same semantics, but <literal>comp1</literal> will be inlined when applied -to <emphasis>two</emphasis> arguments, while <literal>comp2</literal> requires -<emphasis>three</emphasis>. This might make a big difference if you say -<programlisting> -map (not `comp1` not) xs -</programlisting> -which will optimise better than the corresponding use of <literal>comp2</literal>. -</para></listitem> - -<listitem><para> -It is useful for GHC to optimise the definition of an -INLINE function <literal>f</literal> just like any non-INLINE function, -in case the non-inlined version of <literal>f</literal> is -ultimately called. But we don't want to inline -the <emphasis>optimised</emphasis> version -of <literal>f</literal>; -a major reason for INLINE pragmas is to expose functions -in <literal>f</literal>'s RHS that have -rewrite rules, and it's no good if those functions have been optimised -away. -</para> -<para> -So <emphasis>GHC guarantees to inline precisely the code that you wrote</emphasis>, no more -and no less. It does this by capturing a copy of the definition of the function to use -for inlining (we call this the "inline-RHS"), which it leaves untouched, -while optimising the ordinary RHS as usual. For externally-visible functions -the inline-RHS (not the optimised RHS) is recorded in the interface file. -</para></listitem> -<listitem><para> -An INLINE function is not worker/wrappered by strictness analysis. 
-It's going to be inlined wholesale instead. -</para></listitem> -</itemizedlist> -</para> -<para>GHC ensures that inlining cannot go on forever: every mutually-recursive -group is cut by one or more <emphasis>loop breakers</emphasis> that are never inlined -(see <ulink url="http://research.microsoft.com/%7Esimonpj/Papers/inlining/index.htm"> -Secrets of the GHC inliner, JFP 12(4) July 2002</ulink>). -GHC tries not to select a function with an INLINE pragma as a loop breaker, but -when there is no choice even an INLINE function can be selected, in which case -the INLINE pragma is ignored. -For example, for a self-recursive function, the loop breaker can only be the function -itself, so an INLINE pragma is always ignored.</para> - - <para>Syntactically, an <literal>INLINE</literal> pragma for a - function can be put anywhere its type signature could be - put.</para> - - <para><literal>INLINE</literal> pragmas are a particularly - good idea for the - <literal>then</literal>/<literal>return</literal> (or - <literal>bind</literal>/<literal>unit</literal>) functions in - a monad. For example, in GHC's own - <literal>UniqueSupply</literal> monad code, we have:</para> - -<programlisting> -{-# INLINE thenUs #-} -{-# INLINE returnUs #-} -</programlisting> - - <para>See also the <literal>NOINLINE</literal> (<xref linkend="noinline-pragma"/>) - and <literal>INLINABLE</literal> (<xref linkend="inlinable-pragma"/>) - pragmas.</para> - - </sect3> - - <sect3 id="inlinable-pragma"> - <title>INLINABLE pragma</title> - -<para>An <literal>{-# INLINABLE f #-}</literal> pragma on a -function <literal>f</literal> has the following behaviour: -<itemizedlist> -<listitem><para> -While <literal>INLINE</literal> says "please inline me", <literal>INLINABLE</literal> -says "feel free to inline me; use your -discretion". In other words the choice is left to GHC, which uses the same -rules as for pragma-free functions. 
Unlike <literal>INLINE</literal>, that decision is made at -the <emphasis>call site</emphasis>, and -will therefore be affected by the inlining threshold, optimisation level etc. -</para></listitem> -<listitem><para> -Like <literal>INLINE</literal>, the <literal>INLINABLE</literal> pragma retains a -copy of the original RHS for -inlining purposes, and persists it in the interface file, regardless of -the size of the RHS. -</para></listitem> - -<listitem><para> -One way to use <literal>INLINABLE</literal> is in conjunction with -the special function <literal>inline</literal> (<xref linkend="special-ids"/>). -The call <literal>inline f</literal> tries very hard to inline <literal>f</literal>. -To make sure that <literal>f</literal> can be inlined, -it is a good idea to mark the definition -of <literal>f</literal> as <literal>INLINABLE</literal>, -so that GHC guarantees to expose an unfolding regardless of how big it is. -Moreover, by annotating <literal>f</literal> as <literal>INLINABLE</literal>, -you ensure that <literal>f</literal>'s original RHS is inlined, rather than -whatever random optimised version of <literal>f</literal> GHC's optimiser -has produced. -</para></listitem> - -<listitem><para> -The <literal>INLINABLE</literal> pragma also works with <literal>SPECIALISE</literal>: -if you mark function <literal>f</literal> as <literal>INLINABLE</literal>, then -you can subsequently <literal>SPECIALISE</literal> in another module -(see <xref linkend="specialize-pragma"/>).</para></listitem> - -<listitem><para> -Unlike <literal>INLINE</literal>, it is OK to use -an <literal>INLINABLE</literal> pragma on a recursive function. 
-The principal reason to do so is to allow later use of <literal>SPECIALISE</literal>. -</para></listitem> -</itemizedlist> -</para> - - </sect3> - - <sect3 id="noinline-pragma"> - <title>NOINLINE pragma</title> - - <indexterm><primary>NOINLINE</primary></indexterm> - <indexterm><primary>NOTINLINE</primary></indexterm> - - <para>The <literal>NOINLINE</literal> pragma does exactly what - you'd expect: it stops the named function from being inlined - by the compiler. You shouldn't ever need to do this, unless - you're very cautious about code size.</para> - - <para><literal>NOTINLINE</literal> is a synonym for - <literal>NOINLINE</literal> (<literal>NOINLINE</literal> is - specified by Haskell 98 as the standard way to disable - inlining, so it should be used if you want your code to be - portable).</para> - </sect3> - - <sect3 id="conlike-pragma"> - <title>CONLIKE modifier</title> - <indexterm><primary>CONLIKE</primary></indexterm> - <para>An INLINE or NOINLINE pragma may have a CONLIKE modifier, - which affects matching in RULES (only). See <xref linkend="conlike"/>. - </para> - </sect3> - - <sect3 id="phase-control"> - <title>Phase control</title> - - <para> Sometimes you want to control exactly when in GHC's - pipeline the INLINE pragma is switched on. Inlining happens - only during runs of the <emphasis>simplifier</emphasis>. Each - run of the simplifier has a different <emphasis>phase - number</emphasis>; the phase number decreases towards zero. - If you use <option>-dverbose-core2core</option> you'll see the - sequence of phase numbers for successive runs of the - simplifier. In an INLINE pragma you can optionally specify a - phase number, thus: - <itemizedlist> - <listitem> - <para>"<literal>INLINE[k] f</literal>" means: do not inline - <literal>f</literal> - until phase <literal>k</literal>, but from phase - <literal>k</literal> onwards be very keen to inline it. 
- </para></listitem> - <listitem> - <para>"<literal>INLINE[~k] f</literal>" means: be very keen to inline - <literal>f</literal> - until phase <literal>k</literal>, but from phase - <literal>k</literal> onwards do not inline it. - </para></listitem> - <listitem> - <para>"<literal>NOINLINE[k] f</literal>" means: do not inline - <literal>f</literal> - until phase <literal>k</literal>, but from phase - <literal>k</literal> onwards be willing to inline it (as if - there was no pragma). - </para></listitem> - <listitem> - <para>"<literal>NOINLINE[~k] f</literal>" means: be willing to inline - <literal>f</literal> - until phase <literal>k</literal>, but from phase - <literal>k</literal> onwards do not inline it. - </para></listitem> - </itemizedlist> -The same information is summarised here: -<programlisting> - -- Before phase 2 Phase 2 and later - {-# INLINE [2] f #-} -- No Yes - {-# INLINE [~2] f #-} -- Yes No - {-# NOINLINE [2] f #-} -- No Maybe - {-# NOINLINE [~2] f #-} -- Maybe No - - {-# INLINE f #-} -- Yes Yes - {-# NOINLINE f #-} -- No No -</programlisting> -By "Maybe" we mean that the usual heuristic inlining rules apply (if the -function body is small, or it is applied to interesting-looking arguments etc). -Another way to understand the semantics is this: -<itemizedlist> -<listitem><para>For both INLINE and NOINLINE, the phase number says -when inlining is allowed at all.</para></listitem> -<listitem><para>The INLINE pragma has the additional effect of making the -function body look small, so that when inlining is allowed it is very likely to -happen. 
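As a rough sketch of how phase numbers are typically combined with rewrite rules
(the module, function and rule names are hypothetical, chosen only for this
illustration), the following keeps <literal>double</literal> from being inlined
before phase 1, so that a rule mentioning <literal>double</literal> can still
match in the earlier phases:
<programlisting>
module PhaseDemo (double, quadruple) where

-- Not inlined before phase 1; from phase 1 onwards GHC is keen to inline it.
{-# INLINE [1] double #-}
double :: Int -> Int
double x = x + x

-- The rule carries no phase annotation of its own, but it can only match
-- while calls to double are still visible, i.e. before double is inlined.
{-# RULES "double/double" forall x. double (double x) = 4 * x #-}

-- A caller in which the rule has a chance to apply.
quadruple :: Int -> Int
quadruple = double . double
</programlisting>
Rewrite rules are only applied when optimisation is on, so this needs
<option>-O</option> to have any effect; the result of <literal>quadruple</literal>
is the same whether or not the rule fires.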
-</para></listitem> -</itemizedlist> -</para> -<para>The same phase-numbering control is available for RULES - (<xref linkend="rewrite-rules"/>).</para> - </sect3> - </sect2> - - - <sect2 id="line-pragma"> - <title>LINE pragma</title> - - <indexterm><primary>LINE</primary><secondary>pragma</secondary></indexterm> - <indexterm><primary>pragma</primary><secondary>LINE</secondary></indexterm> - <para>This pragma is similar to C's <literal>#line</literal> - pragma, and is mainly for use in automatically generated Haskell - code. It lets you specify the line number and filename of the - original code; for example</para> - -<programlisting>{-# LINE 42 "Foo.vhs" #-}</programlisting> - - <para>if you'd generated the current file from something called - <filename>Foo.vhs</filename> and this line corresponds to line - 42 in the original. GHC will adjust its error messages to refer - to the line/file named in the <literal>LINE</literal> - pragma.</para> - - <para><literal>LINE</literal> pragmas generated from Template Haskell set - the file and line position for the duration of the splice and are limited - to the splice. Note that because Template Haskell splices abstract syntax, - the file positions are not automatically advanced.</para> - </sect2> - - <sect2 id="rules"> - <title>RULES pragma</title> - - <para>The RULES pragma lets you specify rewrite rules. It is - described in <xref linkend="rewrite-rules"/>.</para> - </sect2> - - <sect2 id="specialize-pragma"> - <title>SPECIALIZE pragma</title> - - <indexterm><primary>SPECIALIZE pragma</primary></indexterm> - <indexterm><primary>pragma, SPECIALIZE</primary></indexterm> - <indexterm><primary>overloading, death to</primary></indexterm> - - <para>(UK spelling also accepted.) For key overloaded - functions, you can create extra versions (NB: more code space) - specialised to particular types. 
Thus, if you have an - overloaded function:</para> - -<programlisting> - hammeredLookup :: Ord key => [(key, value)] -> key -> value -</programlisting> - - <para>If it is heavily used on lists with - <literal>Widget</literal> keys, you could specialise it as - follows:</para> - -<programlisting> - {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-} -</programlisting> - -<itemizedlist> -<listitem> - <para>A <literal>SPECIALIZE</literal> pragma for a function can - be put anywhere its type signature could be put. Moreover, you - can also <literal>SPECIALIZE</literal> an <emphasis>imported</emphasis> - function provided it was given an <literal>INLINABLE</literal> pragma at - its definition site (<xref linkend="inlinable-pragma"/>).</para> -</listitem> - -<listitem> - <para>A <literal>SPECIALIZE</literal> has the effect of generating - (a) a specialised version of the function and (b) a rewrite rule - (see <xref linkend="rewrite-rules"/>) that rewrites a call to - the un-specialised function into a call to the specialised one. - Moreover, given a <literal>SPECIALIZE</literal> pragma for a - function <literal>f</literal>, GHC will automatically create - specialisations for any type-class-overloaded functions called - by <literal>f</literal>, if they are in the same module as - the <literal>SPECIALIZE</literal> pragma, or if they are - <literal>INLINABLE</literal>; and so on, transitively.</para> -</listitem> - -<listitem> - <para>You can add phase control (<xref linkend="phase-control"/>) - to the RULE generated by a <literal>SPECIALIZE</literal> pragma, - just as you can if you write a RULE directly. For example: -<programlisting> - {-# SPECIALIZE [0] hammeredLookup :: [(Widget, value)] -> Widget -> value #-} -</programlisting> - generates a specialisation rule that only fires in Phase 0 (the final phase). 
- If you do not specify any phase control in the <literal>SPECIALIZE</literal> pragma, - the phase control is inherited from the inline pragma (if any) of the function. - For example: -<programlisting> - foo :: Num a => a -> a - foo = ...blah... - {-# NOINLINE [0] foo #-} - {-# SPECIALIZE foo :: Int -> Int #-} -</programlisting> - The <literal>NOINLINE</literal> pragma tells GHC not to inline <literal>foo</literal> - until Phase 0; and this property is inherited by the specialisation RULE, which will - therefore only fire in Phase 0.</para> - <para>The main reason for using phase control on specialisations is so that you can - write optimisation RULES that fire early in the compilation pipeline, and only - <emphasis>then</emphasis> specialise the calls to the function. If specialisation is - done too early, the optimisation rules might fail to fire. - </para> -</listitem> - -<listitem> - <para>The type in a SPECIALIZE pragma can be any type that is less - polymorphic than the type of the original function. In concrete terms, - if the original function is <literal>f</literal> then the pragma -<programlisting> - {-# SPECIALIZE f :: <type> #-} -</programlisting> - is valid if and only if the definition -<programlisting> - f_spec :: <type> - f_spec = f -</programlisting> - is valid. Here are some examples (where we only give the type signature - for the original function, not its code): -<programlisting> - f :: Eq a => a -> b -> b - {-# SPECIALISE f :: Int -> b -> b #-} - - g :: (Eq a, Ix b) => a -> b -> b - {-# SPECIALISE g :: (Eq a) => a -> Int -> Int #-} - - h :: Eq a => a -> a -> a - {-# SPECIALISE h :: (Eq a) => [a] -> [a] -> [a] #-} -</programlisting> -The last of these examples will generate a -RULE with a somewhat-complex left-hand side (try it yourself), so it might not fire very -well. If you use this kind of specialisation, let us know how well it works. 
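The validity test described above can be tried out directly. In the following
sketch (the module name and the body of <literal>g</literal> are invented for
illustration; only the types matter), the type used in the pragma is also given
to an ordinary definition <literal>gSpec = g</literal>. If that definition
typechecks, the type in the pragma is acceptable:
<programlisting>
module SpecCheck (g, gSpec) where

import Data.Ix (Ix)

-- Overloaded function with a placeholder body; only its type is of interest.
g :: (Eq a, Ix b) => a -> b -> b
g _ y = y
{-# SPECIALISE g :: Eq a => a -> Int -> Int #-}

-- The same type given to a plain definition. Because this is accepted,
-- the SPECIALISE pragma above is accepted too.
gSpec :: Eq a => a -> Int -> Int
gSpec = g
</programlisting>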
-</para> -</listitem> -</itemizedlist> - - <sect3 id="specialize-inline"> - <title>SPECIALIZE INLINE</title> - -<para>A <literal>SPECIALIZE</literal> pragma can optionally be followed by an -<literal>INLINE</literal> or <literal>NOINLINE</literal> pragma, optionally -followed by a phase, as described in <xref linkend="inline-noinline-pragma"/>. -The <literal>INLINE</literal> pragma affects the specialised version of the -function (only), and applies even if the function is recursive. The motivating -example is this: -<programlisting> --- A GADT for arrays with type-indexed representation -data Arr e where - ArrInt :: !Int -> ByteArray# -> Arr Int - ArrPair :: !Int -> Arr e1 -> Arr e2 -> Arr (e1, e2) - -(!:) :: Arr e -> Int -> e -{-# SPECIALISE INLINE (!:) :: Arr Int -> Int -> Int #-} -{-# SPECIALISE INLINE (!:) :: Arr (a, b) -> Int -> (a, b) #-} -(ArrInt _ ba) !: (I# i) = I# (indexIntArray# ba i) -(ArrPair _ a1 a2) !: i = (a1 !: i, a2 !: i) -</programlisting> -Here, <literal>(!:)</literal> is a recursive function that indexes arrays -of type <literal>Arr e</literal>. Consider a call to <literal>(!:)</literal> -at type <literal>(Int,Int)</literal>. The second specialisation will fire, and -the specialised function will be inlined. It has two calls to -<literal>(!:)</literal>, -both at type <literal>Int</literal>. Both these calls fire the first -specialisation, whose body is also inlined. 
The result is a type-based -unrolling of the indexing function.</para> -<para>You can add explicit phase control (<xref linkend="phase-control"/>) -to a <literal>SPECIALISE INLINE</literal> pragma, -just like on an <literal>INLINE</literal> pragma; if you do so, the same phase -is used for the rewrite rule and the INLINE control of the specialised function.</para> - -<para>Warning: you can make GHC diverge by using <literal>SPECIALISE INLINE</literal> -on an ordinary recursive function.</para> -</sect3> - -<sect3><title>SPECIALIZE for imported functions</title> - -<para> -Generally, you can only give a <literal>SPECIALIZE</literal> pragma -for a function defined in the same module. -However if a function <literal>f</literal> is given an <literal>INLINABLE</literal> -pragma at its definition site, then it can subsequently be specialised by -importing modules (see <xref linkend="inlinable-pragma"/>). -For example: -<programlisting> -module Map( lookup, blah blah ) where - lookup :: Ord key => [(key,a)] -> key -> Maybe a - lookup = ... - {-# INLINABLE lookup #-} - -module Client where - import Map( lookup ) - - data T = T1 | T2 deriving( Eq, Ord ) - {-# SPECIALISE lookup :: [(T,a)] -> T -> Maybe a #-} -</programlisting> -Here, <literal>lookup</literal> is declared <literal>INLINABLE</literal>, but -it cannot be specialised for type <literal>T</literal> at its definition site, -because that type does not exist yet. Instead a client module can define <literal>T</literal> -and then specialise <literal>lookup</literal> at that type. -</para> -<para> -Moreover, every module that imports <literal>Client</literal> (or imports a module -that imports <literal>Client</literal>, transitively) will "see", and make use of, -the specialised version of <literal>lookup</literal>. You don't need to put -a <literal>SPECIALIZE</literal> pragma in every module. -</para> -<para> -Moreover you often don't even need the <literal>SPECIALIZE</literal> pragma in the -first place. 
When compiling a module M, -GHC's optimiser (with -O) automatically considers each top-level -overloaded function declared in M, and specialises it -for the different types at which it is called in M. The optimiser -<emphasis>also</emphasis> considers each <emphasis>imported</emphasis> -<literal>INLINABLE</literal> overloaded function, and specialises it -for the different types at which it is called in M. -So in our example, it would be enough for <literal>lookup</literal> to -be called at type <literal>T</literal>: -<programlisting> -module Client where - import Map( lookup ) - - data T = T1 | T2 deriving( Eq, Ord ) - - findT1 :: [(T,a)] -> Maybe a - findT1 m = lookup m T1 -- A call of lookup at type T -</programlisting> -However, sometimes there are no such calls, in which case the -pragma can be useful. -</para> -</sect3> - -<sect3><title>Obsolete SPECIALIZE syntax</title> - - <para>Note: In earlier versions of GHC, it was possible to provide your own - specialised function for a given type: - -<programlisting> -{-# SPECIALIZE hammeredLookup :: [(Int, value)] -> Int -> value = intLookup #-} -</programlisting> - - This feature has been removed, as it is now subsumed by the - <literal>RULES</literal> pragma (see <xref linkend="rule-spec"/>).</para> -</sect3> - - </sect2> - -<sect2 id="specialize-instance-pragma"> -<title>SPECIALIZE instance pragma -</title> - -<para> -<indexterm><primary>SPECIALIZE pragma</primary></indexterm> -<indexterm><primary>overloading, death to</primary></indexterm> -Same idea, except for instance declarations. For example: - -<programlisting> -instance (Eq a) => Eq (Foo a) where { - {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-} - ... usual stuff ... - } -</programlisting> -The pragma must occur inside the <literal>where</literal> part -of the instance declaration. 
-</para> - -</sect2> - - <sect2 id="unpack-pragma"> - <title>UNPACK pragma</title> - - <indexterm><primary>UNPACK</primary></indexterm> - - <para>The <literal>UNPACK</literal> pragma indicates to the compiler - that it should unpack the contents of a constructor field into - the constructor itself, removing a level of indirection. For - example:</para> - -<programlisting> -data T = T {-# UNPACK #-} !Float - {-# UNPACK #-} !Float -</programlisting> - - <para>will create a constructor <literal>T</literal> containing - two unboxed floats. This may not always be an optimisation: if - the <function>T</function> constructor is scrutinised and the - floats passed to a non-strict function for example, they will - have to be reboxed (this is done automatically by the - compiler).</para> - - <para>Unpacking constructor fields should only be used in - conjunction with <option>-O</option><footnote>in fact, UNPACK - has no effect without <option>-O</option>, for technical - reasons - (see <ulink url="http://ghc.haskell.org/trac/ghc/ticket/5252">ticket - 5252</ulink>)</footnote>, in order to expose - unfoldings to the compiler so the reboxing can be removed as - often as possible. For example:</para> - -<programlisting> -f :: T -> Float -f (T f1 f2) = f1 + f2 -</programlisting> - - <para>The compiler will avoid reboxing <function>f1</function> - and <function>f2</function> by inlining <function>+</function> - on floats, but only when <option>-O</option> is on.</para> - - <para>Any single-constructor data is eligible for unpacking; for - example</para> - -<programlisting> -data T = T {-# UNPACK #-} !(Int,Int) -</programlisting> - - <para>will store the two <literal>Int</literal>s directly in the - <function>T</function> constructor, by flattening the pair. 
- Multi-level unpacking is also supported: - -<programlisting> -data T = T {-# UNPACK #-} !S -data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int -</programlisting> - - will store two unboxed <literal>Int#</literal>s - directly in the <function>T</function> constructor. The - unpacker can see through newtypes, too.</para> - - <para>See also the <option>-funbox-strict-fields</option> flag, - which essentially has the effect of adding - <literal>{-# UNPACK #-}</literal> to every strict - constructor field.</para> - </sect2> - - <sect2 id="nounpack-pragma"> - <title>NOUNPACK pragma</title> - - <indexterm><primary>NOUNPACK</primary></indexterm> - - <para>The <literal>NOUNPACK</literal> pragma indicates to the compiler - that it should not unpack the contents of a constructor field. - Example: - </para> -<programlisting> -data T = T {-# NOUNPACK #-} !(Int,Int) -</programlisting> - <para> - Even with the flags - <option>-funbox-strict-fields</option> and <option>-O</option>, - the field of the constructor <function>T</function> is not - unpacked. - </para> - </sect2> - - <sect2 id="source-pragma"> - <title>SOURCE pragma</title> - - <indexterm><primary>SOURCE</primary></indexterm> - <para>The <literal>{-# SOURCE #-}</literal> pragma is used only in <literal>import</literal> declarations, - to break a module loop. It is described in detail in <xref linkend="mutual-recursion"/>. - </para> -</sect2> - -<sect2 id="overlap-pragma"> -<title>OVERLAPPING, OVERLAPPABLE, OVERLAPS, and INCOHERENT pragmas</title> -<para> -The pragmas - <literal>OVERLAPPING</literal>, - <literal>OVERLAPPABLE</literal>, - <literal>OVERLAPS</literal>, - <literal>INCOHERENT</literal> are used to specify the overlap -behavior for individual instances, as described in Section -<xref linkend="instance-overlap"/>. The pragmas are written immediately -after the <literal>instance</literal> keyword, like this: -</para> -<programlisting> -instance {-# OVERLAPPING #-} C t where ... 
-</programlisting> -</sect2> - -</sect1> - -<!-- ======================= REWRITE RULES ======================== --> - -<sect1 id="rewrite-rules"> -<title>Rewrite rules - -<indexterm><primary>RULES pragma</primary></indexterm> -<indexterm><primary>pragma, RULES</primary></indexterm> -<indexterm><primary>rewrite rules</primary></indexterm></title> - -<para> -The programmer can specify rewrite rules as part of the source program -(in a pragma). -Here is an example: - -<programlisting> - {-# RULES - "map/map" forall f g xs. map f (map g xs) = map (f.g) xs - #-} -</programlisting> -</para> -<para> -Use the debug flag <option>-ddump-simpl-stats</option> to see what rules fired. -If you need more information, then <option>-ddump-rule-firings</option> shows you -each individual rule firing and <option>-ddump-rule-rewrites</option> also shows what the code looks like before and after the rewrite. -</para> - -<sect2> -<title>Syntax</title> - -<para> -From a syntactic point of view: - -<itemizedlist> - -<listitem> -<para> - There may be zero or more rules in a <literal>RULES</literal> pragma, separated by semicolons (which - may be generated by the layout rule). -</para> -</listitem> - -<listitem> -<para> -The layout rule applies in a pragma. -Currently no new indentation level -is set, so if you put several rules in a single RULES pragma and wish to use layout to separate them, -you must lay them out starting in the same column as the enclosing definitions. -<programlisting> - {-# RULES - "map/map" forall f g xs. map f (map g xs) = map (f.g) xs - "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys - #-} -</programlisting> -Furthermore, the closing <literal>#-}</literal> -should start in a column to the right of the opening <literal>{-#</literal>. -</para> -</listitem> - -<listitem> -<para> - Each rule has a name, enclosed in double quotes. The name itself has -no significance at all. It is only used when reporting how many times the rule fired. 
-</para> -</listitem> - -<listitem> -<para> -A rule may optionally have a phase-control number (see <xref linkend="phase-control"/>), -immediately after the name of the rule. Thus: -<programlisting> - {-# RULES - "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs - #-} -</programlisting> -The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse -notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, -Phase 2. -</para> -<para> -Rules support the special phase-control notation "[~]", which means the rule is never active. -This feature supports plugins (see <xref linkend="compiler-plugins"/>), by making it possible -to define a RULE that is never run by GHC, but is nevertheless parsed, typechecked etc, so that -it is available to the plugin. -</para> -</listitem> - - - -<listitem> -<para> - Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>), -or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>). The variables bound by -the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables. They are separated -by spaces, just like in a type <literal>forall</literal>. -</para> -</listitem> -<listitem> - -<para> - A pattern variable may optionally have a type signature. -If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature. -For example, here is the <literal>foldr/build</literal> rule: - -<programlisting> -"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) . - foldr k z (build g) = g k z -</programlisting> - -Since <function>g</function> has a polymorphic type, it must have a type signature. - -</para> -</listitem> -<listitem> - -<para> -The left hand side of a rule must consist of a top-level variable applied -to arbitrary expressions. 
For example, this is <emphasis>not</emphasis> OK: - -<programlisting> -"wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1 -"wrong2" forall f. f True = True -</programlisting> - -In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable -in the head. -</para> -</listitem> -<listitem> - -<para> - A rule does not need to be in the same module as (any of) the -variables it mentions, though of course they need to be in scope. -</para> -</listitem> -<listitem> - -<para> - All rules are implicitly exported from the module, and are therefore -in force in any module that imports the module that defined the rule, directly -or indirectly. (That is, if A imports B, which imports C, then C's rules are -in force when compiling A.) The situation is very similar to that for instance -declarations. -</para> -</listitem> - -<listitem> - -<para> -Inside a RULE "<literal>forall</literal>" is treated as a keyword, regardless of -any other flag settings. Furthermore, inside a RULE, the language extension -<option>-XScopedTypeVariables</option> is automatically enabled; see -<xref linkend="scoped-type-variables"/>. -</para> -</listitem> -<listitem> - -<para> -Like other pragmas, RULE pragmas are always checked for scope errors, and -are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, -and must have the same type. However, rules are only <emphasis>enabled</emphasis> -if the <option>-fenable-rewrite-rules</option> flag is -on (see <xref linkend="rule-semantics"/>). -</para> -</listitem> -</itemizedlist> - -</para> - -</sect2> - -<sect2 id="rule-semantics"> -<title>Semantics</title> - -<para> -From a semantic point of view: - -<itemizedlist> -<listitem> -<para> -Rules are enabled (that is, used during optimisation) -by the <option>-fenable-rewrite-rules</option> flag. 
-This flag is implied by <option>-O</option>, and may be switched -off (as usual) by <option>-fno-enable-rewrite-rules</option>. -(NB: enabling <option>-fenable-rewrite-rules</option> without <option>-O</option> -may not do what you expect, though, because without <option>-O</option> GHC -ignores all optimisation information in interface files; -see <option>-fignore-interface-pragmas</option>, <xref linkend="options-f"/>.) -Note that <option>-fenable-rewrite-rules</option> is an <emphasis>optimisation</emphasis> flag, and -has no effect on parsing or typechecking. -</para> -</listitem> - -<listitem> -<para> - Rules are regarded as left-to-right rewrite rules. -When GHC finds an expression that is a substitution instance of the LHS -of a rule, it replaces the expression by the (appropriately-substituted) RHS. -By "a substitution instance" we mean that the LHS can be made equal to the -expression by substituting for the pattern variables. - -</para> -</listitem> -<listitem> - -<para> - GHC makes absolutely no attempt to verify that the LHS and RHS -of a rule have the same meaning. That is undecidable in general, and -infeasible in most interesting cases. The responsibility is entirely the programmer's! - -</para> -</listitem> -<listitem> - -<para> - GHC makes no attempt to make sure that the rules are confluent or -terminating. For example: - -<programlisting> - "loop" forall x y. f x y = f y x -</programlisting> - -This rule will cause the compiler to go into an infinite loop. - -</para> -</listitem> -<listitem> - -<para> - If more than one rule matches a call, GHC will choose one arbitrarily to apply. - -</para> -</listitem> -<listitem> -<para> - GHC currently uses a very simple, syntactic, matching algorithm -for matching a rule LHS with an expression. It seeks a substitution -which makes the LHS and expression syntactically equal modulo alpha -conversion. The pattern (rule), but not the expression, is eta-expanded if -necessary. 
(Eta-expanding the expression can lead to laziness bugs.) -The matcher does not perform beta conversion, however (that would be higher-order matching). -</para> - -<para> -Matching is carried out on GHC's intermediate language, which includes -type abstractions and applications. So a rule only matches if the -types match too. See <xref linkend="rule-spec"/> below. -</para> -</listitem> -<listitem> - -<para> - GHC keeps trying to apply the rules as it optimises the program. -For example, consider: - -<programlisting> - let s = map f - t = map g - in - s (t xs) -</programlisting> - -The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC -will substitute for <varname>s</varname> and <varname>t</varname>, giving an expression which does match. -If <varname>s</varname> or <varname>t</varname> was (a) used more than once, and (b) large or a redex, then it would -not be substituted, and the rule would not fire. - -</para> -</listitem> -</itemizedlist> - -</para> - -</sect2> - -<sect2 id="rules-inline"> -<title>How rules interact with INLINE/NOINLINE pragmas</title> - -<para> -Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected -results. Consider this (artificial) example: -<programlisting> -f x = x -g y = f y -h z = g True - -{-# RULES "f" f True = False #-} -</programlisting> -Since <literal>f</literal>'s right-hand side is small, it is inlined into <literal>g</literal>, -to give -<programlisting> -g y = y -</programlisting> -Now <literal>g</literal> is inlined into <literal>h</literal>, but <literal>f</literal>'s RULE has -no chance to fire. -If instead GHC had first inlined <literal>g</literal> into <literal>h</literal> then there -would have been a better chance that <literal>f</literal>'s RULE might fire. 
-</para> -<para> -The way to get predictable behaviour is to use a NOINLINE -pragma, or an INLINE[<replaceable>phase</replaceable>] pragma, on <literal>f</literal>, to ensure -that it is not inlined until its RULEs have had a chance to fire. -The warning flag <option>-fwarn-inline-rule-shadowing</option> (see <xref linkend="options-sanity"/>) -warns about this situation. -</para> -</sect2> - -<sect2 id="conlike"> -<title>How rules interact with CONLIKE pragmas</title> - -<para> -GHC is very cautious about duplicating work. For example, consider -<programlisting> -f k z xs = let xs = build g - in ...(foldr k z xs)...sum xs... -{-# RULES "foldr/build" forall k z g. foldr k z (build g) = g k z #-} -</programlisting> -Since <literal>xs</literal> is used twice, GHC does not fire the foldr/build rule. Rightly -so, because it might take a lot of work to compute <literal>xs</literal>, which would be -duplicated if the rule fired. -</para> -<para> -Sometimes, however, this approach is over-cautious, and we <emphasis>do</emphasis> want the -rule to fire, even though doing so would duplicate a redex. There is no way that GHC can work out -when this is a good idea, so we provide the CONLIKE pragma to declare it, thus: -<programlisting> -{-# INLINE CONLIKE [1] f #-} -f x = <replaceable>blah</replaceable> -</programlisting> -CONLIKE is a modifier to an INLINE or NOINLINE pragma. It specifies that an application -of f to one argument (in general, the number of arguments to the left of the '=' sign) -should be considered cheap enough to duplicate, if such a duplication would make a rule -fire. (The name "CONLIKE" is short for "constructor-like", because constructors certainly -have such a property.) -The CONLIKE pragma is a modifier to INLINE/NOINLINE because it really only makes sense to match -<literal>f</literal> on the LHS of a rule if you are sure that <literal>f</literal> is -not going to be inlined before the rule has a chance to fire. 
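As a minimal sketch of how this might look in a complete module (the names <literal>Box</literal>, <literal>mkBox</literal> and <literal>unBox</literal> are hypothetical, chosen only for illustration), consider the following. Whether the rule actually fires depends on the optimisation level and the GHC version; the program computes the same result either way.
<programlisting>
module Main where

-- Hypothetical example type; not part of any library.
data Box = Box Int

-- Declare applications of mkBox to one argument cheap enough to duplicate,
-- and delay inlining so that the rule below has a chance to match.
{-# NOINLINE CONLIKE [1] mkBox #-}
mkBox :: Int -> Box
mkBox n = Box (n * 2)

{-# NOINLINE [1] unBox #-}
unBox :: Box -> Int
unBox (Box n) = n

-- The rule agrees with the definitions above: unBox (mkBox n) is n * 2.
{-# RULES "unBox/mkBox" forall n. unBox (mkBox n) = n * 2 #-}

main :: IO ()
main = do
  let b = mkBox 21              -- b is used twice below
  print (unBox b + unBox b)     -- prints 84, with or without the rule
</programlisting>
Because <literal>b</literal> is used twice, GHC would normally decline to duplicate the call to <literal>mkBox</literal>; the CONLIKE modifier is what permits the rule matcher to look through the binding.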
-</para> -</sect2> - -<sect2 id="rules-class-methods"> -<title>How rules interact with class methods</title> - -<para> -Giving a RULE for a class method is a bad idea: -<programlisting> -class C a where - op :: a -> a -> a - -instance C Bool where - op x y = ...rhs for op at Bool... - -{-# RULES "f" op True y = False #-} -</programlisting> -In this -example, <literal>op</literal> is not an ordinary top-level function; -it is a class method. GHC rapidly rewrites any occurrences of -<literal>op</literal>-used-at-type-Bool -to a specialised function, say <literal>opBool</literal>, where -<programlisting> -opBool :: Bool -> Bool -> Bool -opBool x y = ..rhs for op at Bool... -</programlisting> -So the RULE never has a chance to fire, for just the same reasons as in <xref linkend="rules-inline"/>. -</para> -<para> -The solution is to define the instance-specific function yourself, with a pragma to prevent -it being inlined too early, and give a RULE for it: -<programlisting> -instance C Bool where - op = opBool - -opBool :: Bool -> Bool -> Bool -{-# NOINLINE [1] opBool #-} -opBool x y = ..rhs for op at Bool... - -{-# RULES "f" opBool True y = False #-} -</programlisting> -If you want a RULE that truly applies to the overloaded class method, the only way to -do it is like this: -<programlisting> -class C a where - op_c :: a -> a -> a - -op :: C a => a -> a -> a -{-# NOINLINE [1] op #-} -op = op_c - -{-# RULES "reassociate" op (op x y) z = op x (op y z) #-} -</programlisting> -Now the inlining of <literal>op</literal> is delayed until the rule has a chance to fire. -The down-side is that instance declarations must define <literal>op_c</literal>, but -all other uses should go via <literal>op</literal>. -</para> -</sect2> -<sect2> -<title>List fusion</title> - -<para> -The RULES mechanism is used to implement fusion (deforestation) of common list functions. 
-If a "good consumer" consumes an intermediate list constructed by a "good producer", the -intermediate list should be eliminated entirely. -</para> - -<para> -The following are good producers: - -<itemizedlist> -<listitem> - -<para> - List comprehensions -</para> -</listitem> -<listitem> - -<para> - Enumerations of <literal>Int</literal>, <literal>Integer</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>). -</para> -</listitem> -<listitem> - -<para> - Explicit lists (e.g. <literal>[True, False]</literal>) -</para> -</listitem> -<listitem> - -<para> - The cons constructor (e.g. <literal>3:4:[]</literal>) -</para> -</listitem> -<listitem> - -<para> - <function>++</function> -</para> -</listitem> - -<listitem> -<para> - <function>map</function> -</para> -</listitem> - -<listitem> -<para> -<function>take</function>, <function>filter</function> -</para> -</listitem> -<listitem> - -<para> - <function>iterate</function>, <function>repeat</function> -</para> -</listitem> -<listitem> - -<para> - <function>zip</function>, <function>zipWith</function> -</para> -</listitem> - -</itemizedlist> - -</para> - -<para> -The following are good consumers: - -<itemizedlist> -<listitem> - -<para> - List comprehensions -</para> -</listitem> -<listitem> - -<para> - <function>array</function> (on its second argument) -</para> -</listitem> -<listitem> - -<para> - <function>++</function> (on its first argument) -</para> -</listitem> - -<listitem> -<para> - <function>foldr</function> -</para> -</listitem> - -<listitem> -<para> - <function>map</function> -</para> -</listitem> -<listitem> - -<para> -<function>take</function>, <function>filter</function> -</para> -</listitem> -<listitem> - -<para> - <function>concat</function> -</para> -</listitem> -<listitem> - -<para> - <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function> -</para> -</listitem> -<listitem> - -<para> - <function>zip</function>, 
<function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function> -will fuse with one but not the other) -</para> -</listitem> -<listitem> - -<para> - <function>partition</function> -</para> -</listitem> -<listitem> - -<para> - <function>head</function> -</para> -</listitem> -<listitem> - -<para> - <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function> -</para> -</listitem> -<listitem> - -<para> - <function>sequence_</function> -</para> -</listitem> -<listitem> - -<para> - <function>msum</function> -</para> -</listitem> - -</itemizedlist> - -</para> - - <para> -So, for example, the following should generate no intermediate lists: - -<programlisting> -array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]] -</programlisting> - -</para> - -<para> -This list could readily be extended; if there are Prelude functions that you use -a lot which are not included, please tell us. -</para> - -<para> -If you want to write your own good consumers or producers, look at the -Prelude definitions of the above functions to see how to do so. -</para> - -</sect2> - -<sect2 id="rule-spec"> -<title>Specialisation -</title> - -<para> -Rewrite rules can be used to get the same effect as a feature -present in earlier versions of GHC. -For example, suppose that: - -<programlisting> -genericLookup :: Ord a => Table a b -> a -> b -intLookup :: Table Int b -> Int -> b -</programlisting> - -where <function>intLookup</function> is an implementation of -<function>genericLookup</function> that works very fast for -keys of type <literal>Int</literal>. You might wish -to tell GHC to use <function>intLookup</function> instead of -<function>genericLookup</function> whenever the latter was called with -type <literal>Table Int b -> Int -> b</literal>. 
-It used to be possible to write - -<programlisting> -{-# SPECIALIZE genericLookup :: Table Int b -> Int -> b = intLookup #-} -</programlisting> - -This feature is no longer in GHC, but rewrite rules let you do the same thing: - -<programlisting> -{-# RULES "genericLookup/Int" genericLookup = intLookup #-} -</programlisting> - -This slightly odd-looking rule instructs GHC to replace -<function>genericLookup</function> by <function>intLookup</function> -<emphasis>whenever the types match</emphasis>. -What is more, this rule does not need to be in the same -file as <function>genericLookup</function>, unlike the -<literal>SPECIALIZE</literal> pragmas which currently do (so that they -have an original definition available to specialise). -</para> - -<para>It is <emphasis>Your Responsibility</emphasis> to make sure that -<function>intLookup</function> really behaves as a specialised version -of <function>genericLookup</function>!!!</para> - -<para>An example in which using <literal>RULES</literal> for -specialisation will Win Big: - -<programlisting> -toDouble :: Real a => a -> Double -toDouble = fromRational . toRational - -{-# RULES "toDouble/Int" toDouble = i2d #-} -i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly -</programlisting> - -The <function>i2d</function> function is virtually one machine -instruction; the default conversion—via an intermediate -<literal>Rational</literal>—is obscenely expensive by -comparison. -</para> - -</sect2> - -<sect2 id="controlling-rules"> -<title>Controlling what's going on in rewrite rules</title> - -<para> - -<itemizedlist> -<listitem> - -<para> -Use <option>-ddump-rules</option> to see the rules that are defined -<emphasis>in this module</emphasis>. -This includes rules generated by the specialisation pass, but excludes -rules imported from other modules. -</para> -</listitem> - -<listitem> -<para> - Use <option>-ddump-simpl-stats</option> to see what rules are being fired. 
-If you add <option>-dppr-debug</option> you get a more detailed listing. -</para> -</listitem> - -<listitem> -<para> - Use <option>-ddump-rule-firings</option> or <option>-ddump-rule-rewrites</option> -to see in great detail what rules are being fired. -If you add <option>-dppr-debug</option> you get a still more detailed listing. -</para> -</listitem> - -<listitem> -<para> - The definition of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks like this: - -<programlisting> - build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] - {-# INLINE build #-} - build g = g (:) [] -</programlisting> - -Notice the <literal>INLINE</literal>! That prevents <literal>(:)</literal> from being inlined when compiling -<literal>PrelBase</literal>, so that an importing module will “see” the <literal>(:)</literal>, and can -match it on the LHS of a rule. <literal>INLINE</literal> prevents any inlining happening -in the RHS of the <literal>INLINE</literal> thing. I regret the delicacy of this. - -</para> -</listitem> -<listitem> - -<para> - In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to -see how to write rules that will do fusion and yet give an efficient -program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</filename>. -</para> -</listitem> - -</itemizedlist> - -</para> - -</sect2> - -</sect1> - -<sect1 id="special-ids"> -<title>Special built-in functions</title> -<para>GHC has a few built-in functions with special behaviour. -In particular: -<itemizedlist> -<listitem><para> -<ulink url="&libraryBaseLocation;/GHC-Exts.html#v%3Ainline"><literal>inline</literal></ulink> -allows control over inlining on a per-call-site basis. -</para></listitem> -<listitem><para> -<ulink url="&libraryBaseLocation;/GHC-Exts.html#v%3Alazy"><literal>lazy</literal></ulink> -restrains the strictness analyser. 
-</para></listitem> -<listitem><para> -<ulink url="&libraryBaseLocation;/GHC-Exts.html#v%3AoneShot"><literal>oneShot</literal></ulink> -gives a hint to the compiler about how often a function is being called. -</para></listitem> -</itemizedlist> -</para> -</sect1> - - -<sect1 id="generic-classes"> -<title>Generic classes</title> - -<para> -GHC used to have an implementation of generic classes as defined in the paper -"Derivable type classes", Ralf Hinze and Simon Peyton Jones, Haskell Workshop, -Montreal Sept 2000, pp94-105. These have been removed and replaced by the more -general <link linkend="generic-programming">support for generic programming</link>. -</para> - -</sect1> - - -<sect1 id="generic-programming"> -<title>Generic programming</title> - -<para> -Using a combination of <option>-XDeriveGeneric</option> -(<xref linkend="deriving-typeable"/>), -<option>-XDefaultSignatures</option> (<xref linkend="class-default-signatures"/>), -and <option>-XDeriveAnyClass</option> (<xref linkend="derive-any-class"/>), -you can easily do datatype-generic -programming using the <literal>GHC.Generics</literal> framework. This section -gives a very brief overview of how to do it. -</para> - -<para> -Generic programming support in GHC allows defining classes with methods that -do not need a user specification when instantiating: the method body is -automatically derived by GHC. This is similar to what happens for standard -classes such as <literal>Read</literal> and <literal>Show</literal>, for -instance, but now for user-defined classes. -</para> - -<sect2> -<title>Deriving representations</title> - -<para> -The first thing we need is generic representations. 
The -<literal>GHC.Generics</literal> module defines a handful of primitive types -that are used to represent Haskell datatypes: - -<programlisting> --- | Unit: used for constructors without arguments -data U1 p = U1 - --- | Constants, additional parameters and recursion of kind * -newtype K1 i c p = K1 { unK1 :: c } - --- | Meta-information (constructor names, etc.) -newtype M1 i c f p = M1 { unM1 :: f p } - --- | Sums: encode choice between constructors -infixr 5 :+: -data (:+:) f g p = L1 (f p) | R1 (g p) - --- | Products: encode multiple arguments to constructors -infixr 6 :*: -data (:*:) f g p = f p :*: g p -</programlisting> -</para> - -<para> -The <literal>Generic</literal> and <literal>Generic1</literal> classes mediate -between user-defined datatypes and their internal representation as a -sum-of-products: - -<programlisting> -class Generic a where - -- Encode the representation of a user datatype - type Rep a :: * -> * - -- Convert from the datatype to its representation - from :: a -> (Rep a) x - -- Convert from the representation to the datatype - to :: (Rep a) x -> a - -class Generic1 f where - type Rep1 f :: * -> * - - from1 :: f a -> Rep1 f a - to1 :: Rep1 f a -> f a -</programlisting> - -<literal>Generic1</literal> is used for functions that can only be defined over -type containers, such as <literal>map</literal>. -Instances of these classes can be derived by GHC with the -<option>-XDeriveGeneric</option> flag (<xref linkend="deriving-typeable"/>), and are -necessary to be able to define generic instances automatically. 
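As a small illustrative sketch (the <literal>Colour</literal> type is hypothetical, introduced only for this example), the following module derives a <literal>Generic</literal> instance and converts each value to its representation and back:
<programlisting>
{-# LANGUAGE DeriveGeneric #-}
module Main where

import GHC.Generics

-- Hypothetical example type with three nullary constructors.
data Colour = Red | Green | Blue
  deriving (Show, Eq, Generic)

-- Convert to the generic representation and back again.
roundTrip :: Colour -> Colour
roundTrip = to . from

main :: IO ()
main = print (map roundTrip [Red, Green, Blue])   -- prints [Red,Green,Blue]
</programlisting>
The round trip is the identity, which is what one would expect of any well-behaved <literal>Generic</literal> instance.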
-</para> - -<para> -For example, a user-defined datatype of trees <literal>data UserTree a = Node a -(UserTree a) (UserTree a) | Leaf</literal> gets the following representation: - -<programlisting> -instance Generic (UserTree a) where - -- Representation type - type Rep (UserTree a) = - M1 D D1UserTree ( - M1 C C1_0UserTree ( - M1 S NoSelector (K1 R a) - :*: M1 S NoSelector (K1 R (UserTree a)) - :*: M1 S NoSelector (K1 R (UserTree a))) - :+: M1 C C1_1UserTree U1) - - -- Conversion functions - from (Node x l r) = M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r)))) - from Leaf = M1 (R1 (M1 U1)) - to (M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))) = Node x l r - to (M1 (R1 (M1 U1))) = Leaf - --- Meta-information -data D1UserTree -data C1_0UserTree -data C1_1UserTree - -instance Datatype D1UserTree where - datatypeName _ = "UserTree" - moduleName _ = "Main" - packageName _ = "main" - -instance Constructor C1_0UserTree where - conName _ = "Node" - -instance Constructor C1_1UserTree where - conName _ = "Leaf" -</programlisting> - -This representation is generated automatically if a -<literal>deriving Generic</literal> clause is attached to the datatype. -<link linkend="stand-alone-deriving">Standalone deriving</link> can also be -used. -</para> - -</sect2> - -<sect2> -<title>Writing generic functions</title> - -<para> -A generic function is defined by creating a class and giving instances for -each of the representation types of <literal>GHC.Generics</literal>. 
As an -example we show generic serialization: -<programlisting> -data Bin = O | I - -class GSerialize f where - gput :: f a -> [Bin] - -instance GSerialize U1 where - gput U1 = [] - -instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where - gput (x :*: y) = gput x ++ gput y - -instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where - gput (L1 x) = O : gput x - gput (R1 x) = I : gput x - -instance (GSerialize a) => GSerialize (M1 i c a) where - gput (M1 x) = gput x - -instance (Serialize a) => GSerialize (K1 i a) where - gput (K1 x) = put x -</programlisting> - -Typically this class will not be exported, as it only makes sense to have -instances for the representation types. -</para> -</sect2> - -<sect2> -<title>Generic defaults</title> - -<para> -The only thing left to do now is to define a "front-end" class, which is -exposed to the user: -<programlisting> -class Serialize a where - put :: a -> [Bin] - - default put :: (Generic a, GSerialize (Rep a)) => a -> [Bin] - put = gput . from -</programlisting> -Here we use a <link linkend="class-default-signatures">default signature</link> -to specify that the user does not have to provide an implementation for -<literal>put</literal>, as long as there is a <literal>Generic</literal> -instance for the type to instantiate. For the <literal>UserTree</literal> type, -for instance, the user can just write: - -<programlisting> -instance (Serialize a) => Serialize (UserTree a) -</programlisting> - -The default method for <literal>put</literal> is then used, corresponding to the -generic implementation of serialization. - -If you are using <option>-XDeriveAnyClass</option>, the same instance is -generated by simply attaching a <literal>deriving Serialize</literal> clause -to the <literal>UserTree</literal> datatype declaration. - -For more examples of generic functions please refer to the -<ulink url="http://hackage.haskell.org/package/generic-deriving">generic-deriving</ulink> -package on Hackage. 
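Putting the pieces of this section together, a self-contained sketch might look as follows. The <literal>Serialize Bool</literal> instance and the <literal>main</literal> function are assumptions added here so that the module can be run; they are not part of the framework itself.
<programlisting>
{-# LANGUAGE DefaultSignatures, DeriveGeneric, FlexibleContexts, TypeOperators #-}
module Main where

import GHC.Generics

data Bin = O | I deriving Show

-- Generic worker class, with one instance per representation type.
class GSerialize f where
  gput :: f a -> [Bin]

instance GSerialize U1 where
  gput U1 = []

instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where
  gput (x :*: y) = gput x ++ gput y

instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where
  gput (L1 x) = O : gput x
  gput (R1 x) = I : gput x

instance GSerialize a => GSerialize (M1 i c a) where
  gput (M1 x) = gput x

instance Serialize a => GSerialize (K1 i a) where
  gput (K1 x) = put x

-- Front-end class with a generic default for put.
class Serialize a where
  put :: a -> [Bin]
  default put :: (Generic a, GSerialize (Rep a)) => a -> [Bin]
  put = gput . from

-- Hand-written base instance (an assumption made for this example).
instance Serialize Bool where
  put False = [O]
  put True  = [I]

data UserTree a = Node a (UserTree a) (UserTree a) | Leaf
  deriving Generic

-- Empty instance body: the generic default for put is used.
instance Serialize a => Serialize (UserTree a)

main :: IO ()
main = print (put (Node True Leaf Leaf))   -- prints [O,I,I,I]
</programlisting>
The leading <literal>O</literal> selects the <literal>Node</literal> constructor, and each of the remaining bits encodes one field: <literal>True</literal> as <literal>I</literal>, and each <literal>Leaf</literal> as the <literal>I</literal> that selects the second constructor.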
-</para> -</sect2> - -<sect2> -<title>More information</title> - -<para> -For more details please refer to the -<ulink url="http://www.haskell.org/haskellwiki/GHC.Generics">HaskellWiki -page</ulink> or the original paper: -</para> - -<itemizedlist> -<listitem> -<para> -Jose Pedro Magalhaes, Atze Dijkstra, Johan Jeuring, and Andres Loeh. -<ulink url="http://dreixel.net/research/pdf/gdmh.pdf"> - A generic deriving mechanism for Haskell</ulink>. -<citetitle>Proceedings of the third ACM Haskell symposium on Haskell</citetitle> -(Haskell'2010), pp. 37-48, ACM, 2010. -</para> -</listitem> -</itemizedlist> - -</sect2> - -</sect1> - -<sect1 id="roles"> -<title>Roles -<indexterm><primary>roles</primary></indexterm> -</title> - -<para> -Using <option>-XGeneralizedNewtypeDeriving</option> (<xref -linkend="generalized-newtype-deriving" />), a programmer can take existing -instances of classes and "lift" these into instances of that class for a -newtype. However, this is not always safe. For example, consider the following: -</para> - -<programlisting> - newtype Age = MkAge { unAge :: Int } - - type family Inspect x - type instance Inspect Age = Int - type instance Inspect Int = Bool - - class BadIdea a where - bad :: a -> Inspect a - - instance BadIdea Int where - bad = (> 0) - - deriving instance BadIdea Age -- not allowed! -</programlisting> - -<para> -If the derived instance were allowed, what would the type of its method -<literal>bad</literal> be? It would seem to be <literal>Age -> Inspect -Age</literal>, which is equivalent to <literal>Age -> Int</literal>, according -to the type family <literal>Inspect</literal>. Yet, if we simply adapt the -implementation from the instance for <literal>Int</literal>, the implementation -for <literal>bad</literal> produces a <literal>Bool</literal>, and we have trouble. 
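By contrast, lifting is unproblematic when no method type applies a type family to the class parameter. The following is a minimal sketch under that assumption; the class <literal>Describe</literal> is hypothetical and is not part of the example above.

<programlisting>
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- Hypothetical class: its method type mentions the parameter only directly.
class Describe a where
  describe :: a -> String

instance Describe Int where
  describe n = "Int " ++ show n

-- Allowed: Age -> String has the same representation as Int -> String,
-- so the Int instance can be reused for Age.
newtype Age = MkAge { unAge :: Int }
  deriving (Describe)

main :: IO ()
main = putStrLn (describe (MkAge 3))  -- prints Int 3
</programlisting>

Here the derived instance simply reuses the implementation for <literal>Int</literal>, which is sound because the result type <literal>String</literal> does not depend on whether the argument is an <literal>Age</literal> or an <literal>Int</literal>.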
-</para> - -<para> -The way to identify such situations is to have <emphasis>roles</emphasis> assigned -to type variables of datatypes, classes, and type synonyms.</para> - -<para> -Roles as implemented in GHC are from a simplified version of the work -described in <ulink -url="http://www.seas.upenn.edu/~sweirich/papers/popl163af-weirich.pdf">Generative -type abstraction and type-level computation</ulink>, published at POPL 2011.</para> - -<sect2 id="nominal-representational-and-phantom"> -<title>Nominal, Representational, and Phantom</title> - -<para>The goal of the roles system is to track when two types have the same -underlying representation. In the example above, <literal>Age</literal> and -<literal>Int</literal> have the same representation. But, the corresponding -instances of <literal>BadIdea</literal> would <emphasis>not</emphasis> have -the same representation, because the types of the implementations of -<literal>bad</literal> would be different.</para> - -<para>Suppose we have two uses of a type constructor, each applied to the same -parameters except for one difference. (For example, <literal>T Age Bool -c</literal> and <literal>T Int Bool c</literal> for some type -<literal>T</literal>.) The role of a type parameter says what we need to -know about the two differing type arguments in order to know that the two -outer types have the same representation (in the example, what must be true -about <literal>Age</literal> and <literal>Int</literal> in order to show that -<literal>T Age Bool c</literal> has the same representation as <literal> -T Int Bool c</literal>).</para> - -<para>GHC supports three different roles for type parameters: nominal, -representational, and phantom. If a type parameter has a nominal role, then -the two types that differ must not actually differ at all: they must be -identical (after type family reduction). If a type parameter has a -representational role, then the two types must have the same representation. 
-(If <literal>T</literal>'s first parameter's role is representational, then -<literal>T Age Bool c</literal> and <literal>T Int Bool c</literal> would have -the same representation, because <literal>Age</literal> and -<literal>Int</literal> have the same representation.) If a type parameter has -a phantom role, then we need no further information.</para> - -<para>Here are some examples:</para> - -<programlisting> - data Simple a = MkSimple a -- a has role representational - - type family F a - type instance F Int = Bool - type instance F Age = Char - - data Complex a = MkComplex (F a) -- a has role nominal - - data Phant a = MkPhant Bool -- a has role phantom -</programlisting> - -<para>The type <literal>Simple</literal> has its parameter at role -representational, which is generally the most common case. <literal>Simple -Age</literal> would have the same representation as <literal>Simple -Int</literal>. The type <literal>Complex</literal>, on the other hand, has its -parameter at role nominal, because <literal>Complex Age</literal> and -<literal>Complex Int</literal> are <emphasis>not</emphasis> the same. Lastly, -<literal>Phant Age</literal> and <literal>Phant Bool</literal> have the same -representation, even though <literal>Age</literal> and <literal>Bool</literal> -are unrelated.</para> - -</sect2> - -<sect2 id="role-inference"> -<title>Role inference</title> - -<para> -What role should a given type parameter have? GHC performs role -inference to determine the correct role for every parameter. It starts with a -few base facts: <literal>(->)</literal> has two representational parameters; -<literal>(~)</literal> has two nominal parameters; all type families' -parameters are nominal; and all GADT-like parameters are nominal. Then, these -facts are propagated to all places where these types are used. The default -role for datatypes and synonyms is phantom; the default role for classes is -nominal. 
Thus, for datatypes and synonyms, any parameters unused in the -right-hand side (or used only in other types in phantom positions) will be -phantom. Whenever a parameter is used in a representational position (that is, -used as a type argument to a constructor whose corresponding variable is at -role representational), we raise its role from phantom to representational. -Similarly, when a parameter is used in a nominal position, its role is -upgraded to nominal. We never downgrade a role from nominal to phantom or -representational, or from representational to phantom. In this way, we infer -the most-general role for each parameter. -</para> - -<para> -Classes have their roles default to nominal to promote coherence of class -instances. If a <literal>C Int</literal> were stored in a datatype, it would -be quite bad if that were somehow changed into a <literal>C Age</literal> -somewhere, especially if another <literal>C Age</literal> had been declared! -</para> - -<para>There is one particularly tricky case that should be explained:</para> - -<programlisting> - data Tricky a b = MkTricky (a b) -</programlisting> - -<para>What should <literal>Tricky</literal>'s roles be? At first blush, it -would seem that both <literal>a</literal> and <literal>b</literal> should be -at role representational, since both are used in the right-hand side and -neither is involved in a type family. However, this would be wrong, as the -following example shows:</para> - -<programlisting> - data Nom a = MkNom (F a) -- type family F from example above -</programlisting> - -<para>Is <literal>Tricky Nom Age</literal> representationally equal to -<literal>Tricky Nom Int</literal>? No! The former stores a -<literal>Char</literal> and the latter stores a <literal>Bool</literal>. The -solution to this is to require all parameters to type variables to have role -nominal. 
Thus, GHC would infer role representational for <literal>a</literal> -but role nominal for <literal>b</literal>.</para> - -</sect2> - -<sect2 id="role-annotations"> -<title>Role annotations -<indexterm><primary>-XRoleAnnotations</primary></indexterm> -</title> - -<para> -Sometimes the programmer wants to constrain the inference process. For -example, the base library contains the following definition: -</para> - -<programlisting> - data Ptr a = Ptr Addr# -</programlisting> - -<para> -The idea is that <literal>a</literal> should really be a representational -parameter, but role inference assigns it to phantom. This makes some level of -sense: a pointer to an <literal>Int</literal> really is representationally the -same as a pointer to a <literal>Bool</literal>. But, that's not at all how we -want to use <literal>Ptr</literal>s! So, we want to be able to say</para> - -<programlisting> - type role Ptr representational - data Ptr a = Ptr Addr# -</programlisting> - -<para> -The <literal>type role</literal> (enabled with -<option>-XRoleAnnotations</option>) declaration forces the parameter -<literal>a</literal> to be at role representational, not role phantom. GHC -then checks the user-supplied roles to make sure they don't break any -promises. It would be bad, for example, if the user could make -<literal>BadIdea</literal>'s role be representational. -</para> - -<para>As another example, we can consider a type <literal>Set a</literal> that -represents a set of data, ordered according to <literal>a</literal>'s -<literal>Ord</literal> instance. While it would generally be type-safe to -consider <literal>a</literal> to be at role representational, it is possible -that a <literal>newtype</literal> and its base type have -<emphasis>different</emphasis> orderings encoded in their respective -<literal>Ord</literal> instances. This would lead to misbehavior at runtime. -So, the author of the <literal>Set</literal> datatype would like its parameter -to be at role nominal. 
This would be done with a declaration</para> - -<programlisting> - type role Set nominal -</programlisting> - -<para>Role annotations can also be used should a programmer wish to write -a class with a representational (or phantom) role. However, as a class -with non-nominal roles can quickly lead to class instance incoherence, -it is necessary to also specify <option>-XIncoherentInstances</option> -to allow non-nominal roles for classes.</para> - -<para>The other place where role annotations may be necessary is in -<literal>hs-boot</literal> files (<xref linkend="mutual-recursion"/>), where -the right-hand sides of definitions can be omitted. As usual, the -types/classes declared in an <literal>hs-boot</literal> file must match up -with the definitions in the <literal>hs</literal> file, down to the -roles. The default role for datatypes -is representational in <literal>hs-boot</literal> files, -corresponding to the common use case.</para> - -<para> -Role annotations are allowed on data, newtype, and class declarations. A role -annotation declaration starts with <literal>type role</literal> and is -followed by one role listing for each parameter of the type. (This parameter -count includes parameters implicitly specified by a kind signature in a -GADT-style data or newtype declaration.) Each role listing is a role -(<literal>nominal</literal>, <literal>representational</literal>, or -<literal>phantom</literal>) or a <literal>_</literal>. Using a -<literal>_</literal> says that GHC should infer that role. The role annotation -may go anywhere in the same module as the datatype or class definition -(much like a value-level type signature). 
-Here are some examples:</para> - -<programlisting> - type role T1 _ phantom - data T1 a b = MkT1 a -- b is not used; annotation is fine but unnecessary - - type role T2 _ phantom - data T2 a b = MkT2 b -- ERROR: b is used and cannot be phantom - - type role T3 _ nominal - data T3 a b = MkT3 a -- OK: nominal is higher than necessary, but safe - - type role T4 nominal - data T4 a = MkT4 (a Int) -- OK, but nominal is higher than necessary - - type role C representational _ -- OK, with -XIncoherentInstances - class C a b where ... -- OK, b will get a nominal role - - type role X nominal - type X a = ... -- ERROR: role annotations not allowed for type synonyms -</programlisting> - -</sect2> - -</sect1> - -<sect1 id="strict-haskell"> - <title>Strict Haskell</title> - <indexterm><primary>strict haskell</primary></indexterm> - - <para>High-performance Haskell code (e.g. numeric code) can - sometimes be littered with bang patterns, making it harder to - read. The reason is that lazy evaluation isn't the right default in - this particular code but the programmer has no way to say that - except by repeatedly adding bang patterns. 
The - <option>-XStrictData</option> extension, detailed below, allows the programmer - to switch the default behavior on a per-module basis.</para> - - <sect2 id="strict-data"> - <title>Strict-by-default data types</title> - - <para>Informally, the <literal>StrictData</literal> language - extension switches data type declarations to be strict by default, - allowing fields to be made lazy by adding a <literal>~</literal> in - front of the field.</para> - - <para>When the user writes</para> - - <programlisting> - data T a = C a - data T' a = C' ~a - </programlisting> - - <para>we interpret it as if she had written</para> - - <programlisting> - data T a = C !a - data T' a = C' a - </programlisting> - - <para>The extension only affects definitions in this module.</para> - </sect2> - -</sect1> - -<!-- Emacs stuff: - ;;; Local Variables: *** - ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") *** - ;;; ispell-local-dictionary: "british" *** - ;;; End: *** - -->