1 files changed, 156 insertions, 0 deletions
diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html
new file mode 100644
index 0000000000..a66740259b
--- /dev/null
+++ b/docs/comm/the-beast/desugar.html
@@ -0,0 +1,156 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
+    <p>
+      Up until after type checking, GHC keeps the source program in an
+      abstract representation of Haskell source without removing any of the
+      syntactic sugar (such as, list comprehensions) that could easily be
+      represented by more primitive Haskell.  This complicates part of the
+      front-end considerably as the abstract syntax of Haskell (as exported by
+      the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>)
+      is much more complex than a simplified representation close to, say, the
+      <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell
+      Kernel</a> would be.  However, having a representation that is as close
+      as possible to the surface syntax simplifies the generation of clear
+      error messages.  As GHC (quite in contrast to "conventional" compilers)
+      prints code fragments as part of error messages, the choice of
+      representation is especially important.
+    <p>
+      Nonetheless, as soon as the input has passed all static checks, it is
+      transformed into GHC's principal intermediate language that goes by the
+      name of <em>Core</em> and whose representation is exported by the
+      module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>.  
+      All following compiler phases, except code generation operate on Core.
+      Due to Andrew Tolmach's effort, there is also an <a
+      href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external
+      representation for Core.</a>
+    <p>
+      The conversion of the compiled module from <code>HsSyn</code> into that
+      of <code>CoreSyn</code> is performed by a phase called the
+      <em>desugarer</em>, which is located in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>.
+      It's operation is detailed in the following.
+    </p>
+
+    <h2>Auxilliary Functions</h2>
+    <p>
+      The modules <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a>
+      defines the desugarer monad (of type <code>DsM</code>) which maintains
+      the environment needed for desugaring.  In particular, it encapsulates a
+      unique supply for generating new variables, a map to lookup standard
+      names (such as functions from the prelude), a source location for error
+      messages, and a pool to collect warning messages generated during
+      desugaring.  Initialisation of the environment happens in the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>, 
+      which is also the main entry point into the desugarer.
+    <p>
+      The generation of Core code often involves the use of standard functions
+      for which proper identifiers (i.e., values of type <code>Id</code> that
+      actually refer to the definition in the right Prelude) need to be
+      obtained.  This is supported by the function
+      <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.
+
+    <h2><a name="patmat">Pattern Matching</a></h2>
+    <p>
+      Nested pattern matching with guards and everything is translated into
+      the simple, flat case expressions of Core by the following modules:
+    <dl>
+      <dt><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>:
+      <dd>This modules contains the main pattern-matching compiler in the form
+	of a function called <code>match</code>.  There is some documentation
+	as to how <code>match</code> works contained in the module itself.
+	Generally, the implemented algorithm is similar to the one described
+	in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
+	Implementation of Functional Programming Languages</em>.
+	<code>Match</code> exports a couple of functions with not really
+	intuitive names.  In particular, it exports <code>match</code>,
+	<code>matchWrapper</code>, <code>matchExport</code>, and
+	<code>matchSimply</code>.  The function <code>match</code>, which is
+	the main work horse, is only used by the other matching modules.  The
+	function <code>matchExport</code> - despite it's name - is merely used
+	internally in <code>Match</code> and handles warning messages (see
+	below for more details).  The actual interface to the outside is
+	<code>matchWrapper</code>, which converts the output of the type
+	checker into the form needed by the pattern matching compiler (i.e., a
+	list of <code>EquationInfo</code>).  Similar in function to
+	<code>matchWrapper</code> is <code>matchSimply</code>, which provides
+	an interface for the case where a single expression is to be matched
+	against a single pattern (as, for example, is the case in bindings in
+	a <code>do</code> expression).
+      <dt><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>:
+      <dd>This module generates code for a set of alternative constructor
+	patterns that belong to a single type by means of the routine
+	<code>matchConFamily</code>.  More precisely, the routine gets a set
+	of equations where the left-most pattern of each equation is a
+	constructor pattern with a head symbol from the same type as that of
+	all the other equations.  A Core case expression is generated that
+	distinguihes between all these constructors.  The routine is clever
+	enough to generate a sparse case expression and to add a catch-all
+	default case only when needed (i.e., if the case expression isn't
+	exhaustive already).  There is also an explanation at the start of the
+	modules.
+      <dt><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>:
+      <dd>Generates code for a set of alternative literal patterns by means of
+	the routine <code>matchLiterals</code>.  The principle is similar to
+	that of <code>matchConFamily</code>, but all left-most patterns are
+	literals of the same type.
+      <dt><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>:
+      <dd>This module provides a set of auxilliary definitions as well as the
+      data types <code>EquationInfo</code> and <code>MatchResult</code> that
+      form the input and output, respectively, of the pattern matching
+      compiler.
+      <dt><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>:
+      <dd>This module does not really contribute the compiling pattern
+      matching, but it inspects sets of equations to find whether there are
+      any overlapping patterns or non-exhaustive pattern sets.  This task is
+      implemented by the function <code>check</code>, which returns a list of
+      patterns that are part of a non-exhaustive case distinction as well as a
+      set of equation labels that can be reached during execution of the code;
+      thus, the remaining equations are shadowed due to overlapping patterns.
+      The function <code>check</code> is invoked and its result converted into
+      suitable warning messages by the function <code>Match.matchExport</code>
+      (which is a wrapper for <code>Match.match</code>).
+    </dl>
+    <p>
+      The central function <code>match</code>, given a set of equations,
+      proceeds in a number of steps:
+      <ol>
+      <li>It starts by desugaring the left-most pattern of each equation using
+	the function <code>tidy1</code> (indirectly via
+	<code>tidyEqnInfo</code>).  During this process, non-elementary
+	pattern (e.g., those using explicit list syntax <code>[x, y, ...,
+	z]</code>) are converted to a standard constructor pattern and also
+	irrefutable pattern are removed.
+      <li>Then, a process called <em>unmixing</em> clusters the equations into
+	blocks (without re-ordering them), such that the left-most pattern of
+	all equations in a block are either all variables, all literals, or
+	all constructors.
+      <li>Each block is, then, compiled by <code>matchUnmixedEqns</code>,
+	which forwards the handling of literal pattern blocks to
+	<code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
+	<code>MatchCon.matchConFamily</code>, and hands variable pattern
+	blocks back to <code>match</code>.
+      </ol>
+
+    <p><hr><small>
+<!-- hhmts start -->
+Last modified: Mon Feb 11 22:35:25 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>