Diffstat (limited to 'docs/comm/exts/ndp.html')
-rw-r--r-- | docs/comm/exts/ndp.html | 360 |
1 files changed, 360 insertions, 0 deletions
diff --git a/docs/comm/exts/ndp.html b/docs/comm/exts/ndp.html
new file mode 100644
index 0000000000..0c94c3960b
--- /dev/null
+++ b/docs/comm/exts/ndp.html
@@ -0,0 +1,360 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Parallel Arrays</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Parallel Arrays</h1>
+    <p>
+      This section describes an experimental extension for high-performance
+      arrays, which comprises special syntax for array types and array
+      comprehensions, a set of optimising program transformations, and a set
+      of special-purpose libraries. The extension is currently only partially
+      implemented, but its development will be tracked here.
+    <p>
+      Parallel arrays originally got their name from the aim of providing an
+      architecture-independent programming model for a range of parallel
+      computers. However, since experiments showed that the approach is also
+      worthwhile for sequential array code, the emphasis has shifted to their
+      parallel evaluation semantics: as soon as any element in a parallel
+      array is demanded, all the other elements are evaluated, too. This
+      makes parallel arrays stricter than <a
+      href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98
+      arrays</a>, but also opens the door for a loop-based implementation
+      strategy that leads to significantly more efficient code.
+    <p>
+      The programming model as well as the use of the <em>flattening
+      transformation</em>, which is central to the approach, has its origin
+      in the programming language <a
+      href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>.
+
+    <h2>More Sugar: Special Syntax for Array Comprehensions</h2>
+    <p>
+      The option <code>-fparr</code>, which is a dynamic hsc option that can
+      be reversed with <code>-fno-parr</code>, enables special syntax for
+      parallel arrays. The syntax is not essential to using parallel arrays,
+      but makes for significantly more concise programs. The switch works by
+      making the lexical analyser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>)
+      recognise the tokens <code>[:</code> and <code>:]</code>. Given that
+      the additional productions in the parser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>)
+      cannot be triggered without the lexer generating the necessary tokens,
+      there is no need to alter the behaviour of the parser.
+    <p>
+      The following additional syntax is accepted (the non-terminals are
+      those from the <a href="http://haskell.org/onlinereport/">Haskell 98
+      language definition</a>):
+    <p>
+    <blockquote><pre>
+atype -> '[:' type ':]'                              (parallel array type)
+
+aexp  -> '[:' exp1 ',' ... ',' expk ':]'             (explicit array, k >= 0)
+       | '[:' exp1 [',' exp2] '..' exp3 ':]'         (arithmetic array sequence)
+       | '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
+
+quals -> qual1 ',' ... ',' qualn                     (qualifier list, n >= 1)
+
+apat  -> '[:' pat1 ',' ... ',' patk ':]'             (array pattern, k >= 0)
+</pre>
+    </blockquote>
+    <p>
+      Moreover, the extended comprehension syntax that allows for
+      <em>parallel qualifiers</em> (i.e., qualifiers separated by
+      "<code>|</code>") is also supported in list comprehensions. In general,
+      the similarity to the special syntax for lists is obvious. The two main
+      differences are that (a) arithmetic array sequences are always finite
+      and (b) <code>[::]</code> is not treated as a constructor in
+      expressions and patterns, but rather as a special case of the explicit
+      array syntax. The former is a simple consequence of the parallel
+      evaluation semantics of parallel arrays and the latter is due to arrays
+      not being a constructor-based data type.
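+    <p>
+      To give a flavour of the syntax, here is a small, purely illustrative
+      usage sketch (it is not part of the libraries described below). It
+      assumes that <code>-fparr</code> is in effect and that a reduction
+      operation <code>sumP</code>, analogous to the list function
+      <code>sum</code>, is available for parallel arrays.
+    <blockquote><pre>
+-- Illustrative sketch only: a dot product written with an array
+-- comprehension over parallel qualifiers ("|"-separated), and an
+-- arithmetic array sequence, which is always finite.  `sumP' is an
+-- assumed reduction operation analogous to `sum' on lists.
+dotp :: [:Float:] -> [:Float:] -> Float
+dotp xs ys = sumP [: x * y | x <- xs | y <- ys :]
+
+squares :: Int -> [:Int:]
+squares n = [: i * i | i <- [:1 .. n:] :]
+</pre>
+    </blockquote>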
+    <p>
+      As a naming convention, types and functions that are concerned with
+      parallel arrays usually contain the string <code>parr</code> or
+      <code>PArr</code> (often as a prefix), and where corresponding types or
+      functions for handling lists exist, the name is identical, except for
+      containing the substring <code>parr</code> instead of <code>list</code>
+      (possibly in caps).
+    <p>
+      The following implications are worth noting explicitly:
+    <ul>
+      <li>The value and pattern <code>[::]</code> denote an empty explicit
+        parallel array (i.e., something of the form <code>ExplicitPArr ty
+        []</code> in the AST). This is in contrast to lists, which use the
+        nil constructor instead. In the case of parallel arrays, using a
+        constructor would be rather awkward, as it is not a constructor-based
+        type. (This becomes rather clear in the desugarer.)
+      <li>As a consequence, array patterns have the general form <code>[:p1,
+        p2, ..., pn:]</code>, where <code>n</code> >= 0. Thus, two array
+        patterns overlap iff they have the same length -- an important
+        property for the pattern matching compiler.
+    </ul>
+
+    <h2>Prelude Support for Parallel Arrays</h2>
+    <p>
+      The Prelude module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>
+      defines the standard operations on parallel arrays and their types, and
+      provides a basic implementation based on boxed arrays. The interface of
+      <code>PrelPArr</code> follows that of H98's <code>PrelList</code>, but
+      leaves out all functions that require infinite structures and adds
+      frequently needed array operations, such as permutations. Parallel
+      arrays are quite unique in that they use an entirely different
+      representation as soon as the flattening transformation is activated,
+      which is described further below. In particular, <code>PrelPArr</code>
+      defines the type <code>[::]</code> and operations to create, process,
+      and inspect parallel arrays. The type as well as the names of some of
+      the operations are also hardwired in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>
+      (see the definition of <code>parrTyCon</code> in this module) and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
+      This is again very much like the case of lists, where the type is
+      defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a>
+      and similarly wired in; however, for lists the entirely
+      constructor-based definition is exposed to user programs, which is not
+      the case for parallel arrays.
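+    <p>
+      To illustrate the flavour of this interface, the following sketch shows
+      how a boxed reference implementation of a few of the operations used
+      throughout this section could look. It is purely illustrative and not
+      the actual <code>PrelPArr</code> code; in particular, it uses an
+      ordinary data type <code>PArr</code> as a stand-in for the wired-in
+      type <code>[::]</code>.
+    <blockquote><pre>
+-- Illustrative boxed reference implementation (not the real PrelPArr);
+-- `PArr e' stands in for the wired-in type [:e:].
+import Data.Array (Array, listArray, elems, (!))
+
+data PArr e = PArr Int (Array Int e)   -- length plus boxed array
+
+lengthP :: PArr e -> Int
+lengthP (PArr n _) = n
+
+toP :: [e] -> PArr e                   -- build an array from a list
+toP es = let n = length es in PArr n (listArray (0, n - 1) es)
+
+(!:) :: PArr e -> Int -> e             -- indexing, starting at 0
+(PArr _ a) !: i = a ! i
+
+mapP :: (e -> e') -> PArr e -> PArr e'
+mapP f (PArr n a) = PArr n (listArray (0, n - 1) (map f (elems a)))
+</pre>
+    </blockquote>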
+    <h2>Desugaring Parallel Arrays</h2>
+    <p>
+      The parallel array extension requires the desugarer to replace all
+      occurrences of (1) explicit parallel arrays, (2) array patterns, and
+      (3) array comprehensions by a suitable combination of invocations of
+      operations defined in <code>PrelPArr</code>.
+
+    <h4>Explicit Parallel Arrays</h4>
+    <p>
+      These are by far the simplest to remove. We simply replace every
+      occurrence of <code>[:<i>e<sub>1</sub></i>, ...,
+      <i>e<sub>n</sub></i>:]</code> by
+    <blockquote>
+      <code>
+        toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>]
+      </code>
+    </blockquote>
+    <p>
+      i.e., we build a list of the array elements, which is then converted
+      into a parallel array.
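+    <p>
+      For instance (this is just an illustration of the scheme above, with
+      hypothetical top-level bindings), an explicit array and the empty array
+      <code>[::]</code>, which is simply the case <code>k = 0</code>, are
+      desugared as follows:
+    <blockquote><pre>
+-- Illustration only: what the desugarer conceptually produces for two
+-- explicit arrays; `toP' is the list-to-array conversion from PrelPArr.
+threeInts :: [:Int:]
+threeInts = toP [1, 2, 3]      -- for the source expression  [:1, 2, 3:]
+
+noInts :: [:Int:]
+noInts = toP []                -- for the source expression  [::]
+</pre>
+    </blockquote>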
+    <h4>Parallel Array Patterns</h4>
+    <p>
+      Array patterns are much trickier, as element positions may contain
+      further patterns and the <a
+      href="../the-beast/desugar.html#patmat">pattern matching compiler</a>
+      requires us to flatten all those out. But before we turn to the gory
+      details, here is the basic idea: a flat array pattern matches exactly
+      iff its length corresponds to the length of the matched array. Hence,
+      if we have a set of flat array patterns matching an array value
+      <code>a</code>, it suffices to generate a Core <code>case</code>
+      expression that scrutinises <code>lengthP a</code> and has one
+      alternative for every array length occurring in one of the patterns.
+      Moreover, there needs to be a default case catching all other array
+      lengths. In each alternative, array indexing (i.e., the function
+      <code>!:</code>) is used to bind array elements to the corresponding
+      pattern variables. This sounds easy enough and is essentially what the
+      parallel array equation of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code>
+      does.
+    <p>
+      Unfortunately, however, the pattern matching compiler expects that it
+      can turn (almost) any pattern into variable patterns, literals, or
+      constructor applications by way of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>.
+      And to make matters worse, some weird machinery in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>
+      insists on being able to reverse the process (essentially to pretty
+      print patterns in warnings for incomplete or overlapping patterns).
+    <p>
+      The solution to this is an (unlimited) set of <em>fake</em>
+      constructors for parallel arrays, courtesy of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>.
+      In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>,
+      ..., <i>p<sub>n</sub></i>:]</code> is transformed into
+    <blockquote>
+      <code>
+        MkPArr<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i>
+      </code>
+    </blockquote>
+    <p>
+      by <code>Match.tidy1</code>, then run through the rest of the pattern
+      matching compiler, and finally picked up by
+      <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a
+      <code>case</code> expression as outlined above.
+    <p>
+      As an example consider the source expression
+    <blockquote><pre>
+case v of
+  [:x1:]         -> e1
+  [:x2, x3, x4:] -> e2
+  _              -> e3</pre>
+    </blockquote>
+    <p>
+      <code>Match.tidy1</code> converts it into a form that is equivalent to
+    <blockquote><pre>
+case v of {
+  MkPArr1 x1       -> e1;
+  MkPArr3 x2 x3 x4 -> e2;
+  _                -> e3;
+}</pre>
+    </blockquote>
+    <p>
+      which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the
+      following Core code:
+    <blockquote><pre>
+  case lengthP v of
+    I# i# ->
+      case i# of l {
+        DFT -> e3
+        1   -> let x1 = v!:0 in e1
+        3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
+      }</pre>
+    </blockquote>
+
+    <h4>Parallel Array Comprehensions</h4>
+    <p>
+      The most challenging of the three constructs are array comprehensions.
+      In principle, it would be possible to transform them in essentially the
+      same way as list comprehensions, but this would lead to abysmally slow
+      code, as desugaring of list comprehensions generates code that is
+      optimised for sequential, constructor-based structures. In contrast,
+      array comprehensions need to be transformed into code that solely
+      relies on collective operations and avoids the creation of many small
+      intermediate arrays.
+    <p>
+      The transformation is implemented by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>.
+      In the following, we denote this transformation function by the form
+      <code><<<i>e</i>>> pa ea</code>, where <code><i>e</i></code>
+      is the comprehension to be compiled and the arguments <code>pa</code>
+      and <code>ea</code> denote a pattern and the currently processed array
+      expression, respectively. The invariant constraining these two
+      arguments is that all elements in the array produced by <code>ea</code>
+      will <em>successfully</em> match against <code>pa</code>.
+    <p>
+      Given a source-level comprehension <code>[:e | qss:]</code>, we compile
+      it with <code><<[:e | qss:]>> () [:():]</code> using the
+      rules
+    <blockquote><pre>
+<<[:e' | :]>> pa ea = mapP (\pa -> e') ea
+<<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
+<<[:e' | p <- e, qs:]>> pa ea =
+  let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
+  in
+  <<[:e' | qs:]>> (pa, p) (crossP ea ef)
+<<[:e' | let ds, qs:]>> pa ea =
+  <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
+                  (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
+where
+  {x_1, ..., x_n} = DV (ds)            -- Defined Variables
+<<[:e' | qs | qss:]>> pa ea =
+  <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
+                   (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
+where
+  {x_1, ..., x_n} = DV (qs)</pre>
+    </blockquote>
+    <p>
+      We assume the denotation of <code>crossP</code> to be given by
+    <blockquote><pre>
+crossP :: [:a:] -> [:b:] -> [:(a, b):]
+crossP a1 a2 = let
+                 len1 = lengthP a1
+                 len2 = lengthP a2
+                 x1   = concatP $ mapP (replicateP len2) a1
+                 x2   = concatP $ replicateP len1 a2
+               in
+                 zipP x1 x2</pre>
+    </blockquote>
+    <p>
+      For a more efficient implementation of <code>crossP</code>, see
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>.
+    <p>
+      Moreover, the following optimisations are important:
+    <ul>
+      <li>In the <code>p <- e</code> rule, if <code>pa == ()</code>, drop
+        it and simplify the <code>crossP ea e</code> to <code>e</code>.
+      <li>We assume that fusion will optimise sequences of array processing
+        combinators.
+      <li>FIXME: Do we want to have the following function?
+        <blockquote><pre>
+mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre>
+        </blockquote>
+        <p>
+        Even with fusion, <code>(mapP (\p -> e) . filterP (\p ->
+        b))</code> may still result in redundant pattern matching
+        operations. (Let's hold off on this until we have seen what the
+        Simplifier does to the generated code.)
+    </ul>
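+    <p>
+      As a hand-derived illustration of these rules (not taken from the
+      compiler source), consider the comprehension
+      <code>[:x * x | x <- xs, x > 0:]</code>. Unfolding the translation
+      scheme step by step gives:
+    <blockquote><pre>
+<<[:x * x | x <- xs, x > 0:]>> () [:():]
+  =   -- generator rule; as x is a variable pattern, the filterP over xs
+      -- lets every element through, so ef is equivalent to xs
+<<[:x * x | x > 0:]>> ((), x) (crossP [:():] xs)
+  =   -- guard rule
+<<[:x * x | :]>> ((), x) (filterP (\((), x) -> x > 0) (crossP [:():] xs))
+  =   -- rule for the empty qualifier list
+mapP (\((), x) -> x * x) (filterP (\((), x) -> x > 0) (crossP [:():] xs))</pre>
+    </blockquote>
+    <p>
+      With the <code>pa == ()</code> optimisation from the list above, the
+      <code>crossP</code> with the unit array disappears and the result
+      collapses to the expected
+      <code>mapP (\x -> x * x) (filterP (\x -> x > 0) xs)</code>, i.e., a
+      composition of collective operations without any intermediate
+      constructor-based structure.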
+
+    <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2>
+    <p>
+      On the quest towards an entirely unboxed representation of parallel
+      arrays, the flattening transformation is the essential ingredient. GHC
+      uses a <a
+      href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially
+      improved version</a> of the transformation whose original form was
+      described by Blelloch & Sabot. The flattening transformation
+      replaces values of type <code>[:a:]</code> as well as functions
+      operating on these values by alternative, more efficient data
+      structures and functions.
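+    <p>
+      The following sketch conveys the idea behind the alternative
+      representation; it is illustrative only and uses ordinary lists as
+      stand-ins for the unboxed arrays that the real implementation is after.
+      A nested array of type <code>[:[:e:]:]</code> is represented by a
+      single flat data array together with a segment descriptor that records
+      the lengths of the subarrays.
+    <blockquote><pre>
+-- Illustrative sketch of the flattened representation of nested arrays
+-- (not GHC's actual types): flat data plus a segment descriptor.
+data SegdArr e = SegdArr
+  { segLengths :: [Int]   -- length of each subarray
+  , flatData   :: [e]     -- all elements of all subarrays, concatenated
+  }
+
+-- conversion from the nested form (modelled by nested lists) and back
+toSegd :: [[e]] -> SegdArr e
+toSegd xss = SegdArr (map length xss) (concat xss)
+
+fromSegd :: SegdArr e -> [[e]]
+fromSegd (SegdArr []       _ ) = []
+fromSegd (SegdArr (n : ns) es) = take n es : fromSegd (SegdArr ns (drop n es))
+</pre>
+    </blockquote>
+    <p>
+      Collective operations can then work on the flat data array in one go,
+      which is what makes a loop-based, and eventually unboxed,
+      implementation possible.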
+    <p>
+      The flattening machinery is activated by the option
+      <code>-fflatten</code>, which is a static hsc option. It consists of
+      two steps: function vectorisation and array specialisation. Only the
+      first of those is implemented so far. If selected, the transformation
+      is applied to a module in Core form immediately after the <a
+      href="../the-beast/desugar.html">desugarer</a>, before the <a
+      href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do
+      its job. After vectorisation, the Core program can be dumped using the
+      option <code>-ddump-vect</code>. There is a good reason for us to
+      perform flattening immediately after the desugarer. In <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code>
+      the so-called <em>persistent compiler state</em> is maintained, which
+      contains all the information about imported interface files needed to
+      look up the details of imported names (e.g., during renaming and type
+      checking). However, all this information is zapped before the
+      simplifier is invoked (supposedly to reduce the space consumption of
+      GHC). As flattening has to get at all kinds of identifiers from Prelude
+      modules, we need to do it before the relevant information in the
+      persistent compiler state is gone.
+
+    <p>
+      As flattening generally requires all libraries to be compiled for
+      flattening (just like profiling does), there is a <em>compiler
+      way</em> <code>"ndp"</code>, which can be selected using the way option
+      <code>-ndp</code>. This option will automagically select
+      <code>-fparr</code> and <code>-fflatten</code>.
+
+    <h4><code>FlattenMonad</code></h4>
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a>
+      implements the monad <code>Flatten</code> that is used during
+      vectorisation to keep track of various sets of bound variables and a
+      variable substitution map; moreover, it provides a supply of new
+      uniques and allows us to look up names in the persistent compiler state
+      (i.e., imported identifiers).
+    <p>
+      In order to be able to look up imported identifiers in the persistent
+      compiler state, it is important that these identifiers are included in
+      the free variable lists computed by the renamer. More precisely, all
+      names needed by flattening are included in the names produced by
+      <code>RnEnv.getImplicitModuleFVs</code>. To avoid putting
+      flattening-dependent lists of names into the renamer, the module
+      <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>.
+
+      [FIXME: It might be worthwhile to document in the non-Flattening part
+      of the Commentary that the persistent compiler state is zapped after
+      desugaring and how the free variables determined by the renamer imply
+      which names are imported.]
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Feb 12 01:44:21 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>