Diffstat (limited to 'docs/comm/the-beast/syntax.html')
-rw-r--r-- docs/comm/the-beast/syntax.html 99
1 files changed, 99 insertions, 0 deletions
diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html
new file mode 100644
index 0000000000..be5bbefa17
--- /dev/null
+++ b/docs/comm/the-beast/syntax.html
@@ -0,0 +1,99 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Just Syntax</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Just Syntax</h1>
+ <p>
+ The lexical and syntactic analysers for Haskell programs are located in
+ <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/"><code>fptools/ghc/compiler/parser/</code></a>.
+ </p>
+
+ <h2>The Lexer</h2>
+ <p>
+ The lexer is a rather tedious piece of Haskell code contained in the
+ module <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex</code></a>.
+ Its complexity partially stems from covering, in addition to Haskell 98,
+ the whole range of GHC language extensions, and from its ability to
+ analyse interface files as well as normal Haskell source. The lexer
+ defines a parser monad <code>P a</code>, where <code>a</code> is the
+ type of the result expected from a successful parse. More precisely, a
+ result of type
+<blockquote><pre>
+data ParseResult a = POk PState a
+ | PFailed Message</pre>
+</blockquote>
+ <p>
+ is produced, where <code>Message</code> comes from <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ErrUtils.lhs"><code>ErrUtils</code></a>
+ (and is currently simply a synonym for <code>SDoc</code>).
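+ <p>
+ To make the shape of such a parser monad concrete, here is a minimal
+ sketch. It assumes that <code>P a</code> wraps a function from
+ <code>PState</code> to <code>ParseResult a</code>; the actual definition
+ in <code>Lex</code> may differ (for example, by threading the input
+ buffer separately). The fields of <code>PState</code>, the use of
+ <code>String</code> for <code>Message</code>, and the names
+ <code>returnP</code>, <code>thenP</code>, <code>failP</code>, and
+ <code>nextLine</code> are illustrative assumptions, not taken from GHC.
+<blockquote><pre>
+-- Hypothetical stand-ins; GHC's real PState and Message are richer.
+type Message = String
+data PState  = PState { line :: Int, glaExts :: Bool }
+
+data ParseResult a = POk PState a
+                   | PFailed Message
+
+-- Assumption: a parser maps the current state to a result.
+newtype P a = P { unP :: PState -> ParseResult a }
+
+-- Succeed without touching the state.
+returnP :: a -> P a
+returnP a = P (\s -> POk s a)
+
+-- Run the first parser; on success, feed its result and the
+-- updated state to the continuation, otherwise propagate the error.
+thenP :: P a -> (a -> P b) -> P b
+thenP (P m) k = P (\s -> case m s of
+                           POk s' a    -> unP (k a) s'
+                           PFailed err -> PFailed err)
+
+-- Abort the parse with an error message.
+failP :: Message -> P a
+failP msg = P (\_ -> PFailed msg)
+
+-- Example action: advance the line counter and return the new line.
+nextLine :: P Int
+nextLine = P (\s -> let s' = s { line = line s + 1 }
+                    in POk s' (line s'))</pre>
+</blockquote>
+ <p>
+ In this sketch, a failure short-circuits the rest of the parse because
+ <code>thenP</code> never calls its continuation after
+ <code>PFailed</code>.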
+ <p>
+ The record type <code>PState</code> contains information such as the
+ current source location, buffer state, contexts for layout processing,
+ and whether Glasgow extensions are accepted (either due to
+ <code>-fglasgow-exts</code> or due to reading an interface file). Most
+ of the fields of <code>PState</code> store unboxed values; in fact, even
+ the flag indicating whether Glasgow extensions are enabled is
+ represented by an unboxed integer instead of by a <code>Bool</code>. My
+ (= chak's) guess is that this is to avoid having to perform a
+ <code>case</code> on a boxed value in the inner loop of the lexer.
+ <p>
+ The same lexer is used by the Haskell source parser, the Haskell
+ interface parser, and the package configuration parser.
+
+ <h2>The Haskell Source Parser</h2>
+ <p>
+ The parser for Haskell source files is defined in the form of a parser
+ specification for the parser generator <a
+ href="http://haskell.org/happy/">Happy</a> in the file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>.
+ The parser exports three entry points for parsing entire modules
+ (<code>parseModule</code>), individual statements
+ (<code>parseStmt</code>), and individual identifiers
+ (<code>parseIdentifier</code>), respectively. The last two are needed
+ for GHCi. All three require a parser state (of type
+ <code>PState</code>) and are invoked from <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a>.
+ <p>
+ Parsing of Haskell is a rather involved process. The most challenging
+ features are probably the treatment of layout and expressions that
+ contain infix operators. The latter may be user-defined and so are not
+ easily captured in a static syntax specification. Infix operators may
+ also appear in patterns on the left hand sides of value definitions, so
+ GHC's parser treats those in the same way as expressions. In other words,
+ as general expressions are a syntactic superset of patterns - ok, they
+ <em>nearly</em> are - the parser simply attempts to parse a general
+ expression in such positions. Afterwards, the generated parse tree is
+ inspected to ensure that the accepted phrase indeed forms a legal
+ pattern. This and similar checks are performed by the routines from <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/ParseUtil.lhs"><code>ParseUtil</code></a>. In
+ some cases, these routines not only check for well-formedness, but also
+ transform the parse tree so that it fits the syntactic context in which
+ it has been parsed; in fact, this happens for patterns, which are
+ transformed from a representation of type
+ <code>RdrNameHsExpr</code> into a representation of type
+ <code>RdrNamePat</code>.
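+ <p>
+ The idea of parsing a pattern as an expression and validating it
+ afterwards can be sketched on a toy syntax tree. The types
+ <code>Expr</code> and <code>Pat</code> and the function
+ <code>checkPattern</code> below are hypothetical miniatures made up for
+ this illustration; they are not the definitions used in
+ <code>ParseUtil</code>, and GHC's real trees are far larger.
+<blockquote><pre>
+-- Hypothetical miniature trees; not GHC's RdrNameHsExpr/RdrNamePat.
+data Expr = EVar String          -- variable
+          | ECon String [Expr]   -- constructor application
+          | ELit Integer         -- literal
+          | ELam String Expr     -- lambda: legal expression, illegal pattern
+
+data Pat  = PVar String
+          | PCon String [Pat]
+          | PLit Integer
+
+-- Check that a parsed expression forms a legal pattern and, if so,
+-- convert it into the pattern representation.
+checkPattern :: Expr -> Either String Pat
+checkPattern (EVar v)    = Right (PVar v)
+checkPattern (ELit n)    = Right (PLit n)
+checkPattern (ECon c es) = fmap (PCon c) (mapM checkPattern es)
+checkPattern (ELam _ _)  = Left "lambda abstraction is not allowed in a pattern"</pre>
+</blockquote>
+ <p>
+ The check and the conversion happen in one traversal, which mirrors the
+ description above: the same pass that rejects ill-formed phrases also
+ produces the tree in the type expected by the surrounding context.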
+
+ <h2>The Haskell Interface Parser</h2>
+ <p>
+ The parser for interface files is also generated by Happy from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/ParseIface.y"><code>ParseIface.y</code></a>.
+ Its main routine <code>parseIface</code> is invoked from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/RnHiFiles.lhs"><code>RnHiFiles</code></a><code>.readIface</code>.
+
+ <h2>The Package Configuration Parser</h2>
+ <p>
+ The parser for package configuration files is by far the smallest of the
+ three and is defined in <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ParsePkgConf.y"><code>ParsePkgConf.y</code></a>.
+ It exports <code>loadPackageConfig</code>, which is used by <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState</code></a><code>.readPackageConf</code>.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Jan 16 00:30:14 EST 2002
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>