diff options
Diffstat (limited to 'docs/comm/the-beast/desugar.html')
-rw-r--r-- | docs/comm/the-beast/desugar.html | 156 |
1 files changed, 156 insertions, 0 deletions
diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html new file mode 100644 index 0000000000..a66740259b --- /dev/null +++ b/docs/comm/the-beast/desugar.html @@ -0,0 +1,156 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Sugar Free: From Haskell To Core</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1> + <p> + Up until after type checking, GHC keeps the source program in an + abstract representation of Haskell source without removing any of the + syntactic sugar (such as, list comprehensions) that could easily be + represented by more primitive Haskell. This complicates part of the + front-end considerably as the abstract syntax of Haskell (as exported by + the module <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>) + is much more complex than a simplified representation close to, say, the + <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell + Kernel</a> would be. However, having a representation that is as close + as possible to the surface syntax simplifies the generation of clear + error messages. As GHC (quite in contrast to "conventional" compilers) + prints code fragments as part of error messages, the choice of + representation is especially important. + <p> + Nonetheless, as soon as the input has passed all static checks, it is + transformed into GHC's principal intermediate language that goes by the + name of <em>Core</em> and whose representation is exported by the + module <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>. + All following compiler phases, except code generation operate on Core. + Due to Andrew Tolmach's effort, there is also an <a + href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external + representation for Core.</a> + <p> + The conversion of the compiled module from <code>HsSyn</code> into that + of <code>CoreSyn</code> is performed by a phase called the + <em>desugarer</em>, which is located in + <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>. + It's operation is detailed in the following. + </p> + + <h2>Auxilliary Functions</h2> + <p> + The modules <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a> + defines the desugarer monad (of type <code>DsM</code>) which maintains + the environment needed for desugaring. In particular, it encapsulates a + unique supply for generating new variables, a map to lookup standard + names (such as functions from the prelude), a source location for error + messages, and a pool to collect warning messages generated during + desugaring. Initialisation of the environment happens in the function <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>, + which is also the main entry point into the desugarer. + <p> + The generation of Core code often involves the use of standard functions + for which proper identifiers (i.e., values of type <code>Id</code> that + actually refer to the definition in the right Prelude) need to be + obtained. This is supported by the function + <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>. + + <h2><a name="patmat">Pattern Matching</a></h2> + <p> + Nested pattern matching with guards and everything is translated into + the simple, flat case expressions of Core by the following modules: + <dl> + <dt><a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>: + <dd>This modules contains the main pattern-matching compiler in the form + of a function called <code>match</code>. There is some documentation + as to how <code>match</code> works contained in the module itself. + Generally, the implemented algorithm is similar to the one described + in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The + Implementation of Functional Programming Languages</em>. + <code>Match</code> exports a couple of functions with not really + intuitive names. In particular, it exports <code>match</code>, + <code>matchWrapper</code>, <code>matchExport</code>, and + <code>matchSimply</code>. The function <code>match</code>, which is + the main work horse, is only used by the other matching modules. The + function <code>matchExport</code> - despite it's name - is merely used + internally in <code>Match</code> and handles warning messages (see + below for more details). The actual interface to the outside is + <code>matchWrapper</code>, which converts the output of the type + checker into the form needed by the pattern matching compiler (i.e., a + list of <code>EquationInfo</code>). Similar in function to + <code>matchWrapper</code> is <code>matchSimply</code>, which provides + an interface for the case where a single expression is to be matched + against a single pattern (as, for example, is the case in bindings in + a <code>do</code> expression). + <dt><a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>: + <dd>This module generates code for a set of alternative constructor + patterns that belong to a single type by means of the routine + <code>matchConFamily</code>. More precisely, the routine gets a set + of equations where the left-most pattern of each equation is a + constructor pattern with a head symbol from the same type as that of + all the other equations. A Core case expression is generated that + distinguihes between all these constructors. The routine is clever + enough to generate a sparse case expression and to add a catch-all + default case only when needed (i.e., if the case expression isn't + exhaustive already). There is also an explanation at the start of the + modules. + <dt><a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>: + <dd>Generates code for a set of alternative literal patterns by means of + the routine <code>matchLiterals</code>. The principle is similar to + that of <code>matchConFamily</code>, but all left-most patterns are + literals of the same type. + <dt><a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>: + <dd>This module provides a set of auxilliary definitions as well as the + data types <code>EquationInfo</code> and <code>MatchResult</code> that + form the input and output, respectively, of the pattern matching + compiler. + <dt><a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>: + <dd>This module does not really contribute the compiling pattern + matching, but it inspects sets of equations to find whether there are + any overlapping patterns or non-exhaustive pattern sets. This task is + implemented by the function <code>check</code>, which returns a list of + patterns that are part of a non-exhaustive case distinction as well as a + set of equation labels that can be reached during execution of the code; + thus, the remaining equations are shadowed due to overlapping patterns. + The function <code>check</code> is invoked and its result converted into + suitable warning messages by the function <code>Match.matchExport</code> + (which is a wrapper for <code>Match.match</code>). + </dl> + <p> + The central function <code>match</code>, given a set of equations, + proceeds in a number of steps: + <ol> + <li>It starts by desugaring the left-most pattern of each equation using + the function <code>tidy1</code> (indirectly via + <code>tidyEqnInfo</code>). During this process, non-elementary + pattern (e.g., those using explicit list syntax <code>[x, y, ..., + z]</code>) are converted to a standard constructor pattern and also + irrefutable pattern are removed. + <li>Then, a process called <em>unmixing</em> clusters the equations into + blocks (without re-ordering them), such that the left-most pattern of + all equations in a block are either all variables, all literals, or + all constructors. + <li>Each block is, then, compiled by <code>matchUnmixedEqns</code>, + which forwards the handling of literal pattern blocks to + <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to + <code>MatchCon.matchConFamily</code>, and hands variable pattern + blocks back to <code>match</code>. + </ol> + + <p><hr><small> +<!-- hhmts start --> +Last modified: Mon Feb 11 22:35:25 EST 2002 +<!-- hhmts end --> + </small> + </body> +</html> |