author | Simon Marlow <simonmar@microsoft.com> | 2006-04-07 02:05:11 +0000
---|---|---
committer | Simon Marlow <simonmar@microsoft.com> | 2006-04-07 02:05:11 +0000
commit | 0065d5ab628975892cea1ec7303f968c3338cbe1 (patch) |
tree | 8e2afe0ab48ee33cf95009809d67c9649573ef92 /docs/comm |
parent | 28a464a75e14cece5db40f2765a29348273ff2d2 (diff) |
download | haskell-0065d5ab628975892cea1ec7303f968c3338cbe1.tar.gz |
Reorganisation of the source tree
Most of the other users of the fptools build system have migrated to
Cabal, and with the move to darcs we can now flatten the source tree
without losing history, so here goes.
The main change is that the ghc/ subdir is gone, and most of what it
contained is now at the top level. The build system now makes no
pretense at being multi-project: it is just the GHC build system.
No doubt this will break many things, and there will be a period of
instability while we fix the dependencies. A straightforward build
should work, but I haven't yet fixed binary/source distributions.
Changes to the Building Guide will follow, too.
Diffstat (limited to 'docs/comm')
39 files changed, 6995 insertions, 0 deletions
diff --git a/docs/comm/exts/ndp.html b/docs/comm/exts/ndp.html
new file mode 100644
index 0000000000..0c94c3960b
--- /dev/null
+++ b/docs/comm/exts/ndp.html
@@ -0,0 +1,360 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Parallel Arrays</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Parallel Arrays</h1>
+    <p>
+      This section describes an experimental extension for high-performance
+      arrays, which comprises special syntax for array types and array
+      comprehensions, a set of optimising program transformations, and a set
+      of special purpose libraries.  The extension is currently only partially
+      implemented, but the development will be tracked here.
+    <p>
+      Parallel arrays originally got their name from the aim to provide an
+      architecture-independent programming model for a range of parallel
+      computers.  However, since experiments showed that the approach is also
+      worthwhile for sequential array code, the emphasis has shifted to their
+      parallel evaluation semantics: As soon as any element in a parallel
+      array is demanded, all the other elements are evaluated, too.  This
+      makes parallel arrays more strict than <a
+      href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98
+      arrays</a>, but also opens the door for a loop-based implementation
+      strategy that leads to significantly more efficient code.
+    <p>
+      The programming model as well as the use of the <em>flattening
+      transformation</em>, which is central to the approach, has its origin in
+      the programming language <a
+      href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>.
+
+    <h2>More Sugar: Special Syntax for Array Comprehensions</h2>
+    <p>
+      The option <code>-fparr</code>, which is a dynamic hsc option that can
+      be reversed with <code>-fno-parr</code>, enables special syntax for
+      parallel arrays, which is not essential to using parallel arrays, but
+      makes for significantly more concise programs.  The switch works by
+      making the lexical analyser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>)
+      recognise the tokens <code>[:</code> and <code>:]</code>.  Given that
+      the additional productions in the parser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>)
+      cannot be triggered without the lexer generating the necessary tokens,
+      there is no need to alter the behaviour of the parser.
+    <p>
+      The following additional syntax is accepted (the non-terminals are those
+      from the <a href="http://haskell.org/onlinereport/">Haskell 98 language
+      definition</a>):
+    <p>
+    <blockquote><pre>
+atype -> '[:' type ':]'                             (parallel array type)
+
+aexp  -> '[:' exp1 ',' ... ',' expk ':]'            (explicit array, k >= 0)
+       | '[:' exp1 [',' exp2] '..' exp3 ':]'        (arithmetic array sequence)
+       | '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
+
+quals -> qual1 ',' ... ',' qualn                    (qualifier list, n >= 1)
+
+apat  -> '[:' pat1 ',' ... ',' patk ':]'            (array pattern, k >= 0)
+</pre>
+    </blockquote>
+    <p>
+      Moreover, the extended comprehension syntax that allows for <em>parallel
+      qualifiers</em> (i.e., qualifiers separated by "<code>|</code>") is also
+      supported in list comprehensions.  In general, the similarity to the
+      special syntax for lists is obvious.  The two main differences are that
+      (a) arithmetic array sequences are always finite and (b)
+      <code>[::]</code> is not treated as a constructor in expressions and
+      patterns, but rather as a special case of the explicit array syntax.
+      The former is a simple consequence of the parallel evaluation semantics
+      of parallel arrays and the latter is due to arrays not being a
+      constructor-based data type.
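+    <p>
+      As an illustration, here is a dot product written with the special
+      syntax (a hypothetical example; <code>sumP</code> is taken to be the
+      parallel array analogue of <code>sum</code>, following the naming
+      convention described below):
+    <blockquote><pre>
+dotp :: [:Double:] -> [:Double:] -> Double
+dotp xs ys = sumP [:x * y | x <- xs | y <- ys:]
+</pre>
+    </blockquote>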
+    <p>
+      As a naming convention, types and functions that are concerned with
+      parallel arrays usually contain the string <code>parr</code> or
+      <code>PArr</code> (often as a prefix), and where corresponding types or
+      functions for handling lists exist, the name is identical, except for
+      containing the substring <code>parr</code> instead of <code>list</code>
+      (possibly in caps).
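+    <p>
+      For instance (the signatures are shown for illustration only):
+    <blockquote><pre>
+length  ::  [a]   -> Int       -- list version
+lengthP :: [:a:]  -> Int       -- parallel array version
+
+concat  :: [[a]]     -> [a]
+concatP :: [:[:a:]:] -> [:a:]
+</pre>
+    </blockquote>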
+    <p>
+      The following implications are worth noting explicitly:
+    <ul>
+      <li>The value and pattern <code>[::]</code> denote an empty explicit
+	parallel array (i.e., something of the form <code>ExplicitPArr ty
+	[]</code> in the AST).  This is in contrast to lists, which use the
+	nil-constructor instead.  In the case of parallel arrays, using a
+	constructor would be rather awkward, as it is not a constructor-based
+	type.  (This becomes rather clear in the desugarer.)
+      <li>As a consequence, array patterns have the general form <code>[:p1,
+	p2, ..., pn:]</code>, where <code>n</code> >= 0.  Thus, two array
+	patterns overlap iff they have the same length -- an important property
+	for the pattern matching compiler.
+    </ul>
+
+    <h2>Prelude Support for Parallel Arrays</h2>
+    <p>
+      The Prelude module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>
+      defines the standard operations and their types on parallel arrays and
+      provides a basic implementation based on boxed arrays.  The interface of
+      <code>PrelPArr</code> is modelled on H98's <code>PrelList</code>, but
+      leaves out all functions that require infinite structures and adds
+      frequently needed array operations, such as permutations.  Parallel
+      arrays are quite unique in that they use an entirely different
+      representation as soon as the flattening transformation is activated,
+      which is described further below.  In particular, <code>PrelPArr</code>
+      defines the type <code>[::]</code> and operations to create, process,
+      and inspect parallel arrays.  The type as well as the names of some of
+      the operations are also hardwired in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>
+      (see the definition of <code>parrTyCon</code> in this module) and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
+      This is again very much like the case of lists, where the type is
+      defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a>
+      and similarly wired in; however, for lists the entirely
+      constructor-based definition is exposed to user programs, which is not
+      the case for parallel arrays.
+
+    <h2>Desugaring Parallel Arrays</h2>
+    <p>
+      The parallel array extension requires the desugarer to replace all
+      occurrences of (1) explicit parallel arrays, (2) array patterns, and (3)
+      array comprehensions by a suitable combination of invocations of
+      operations defined in <code>PrelPArr</code>.
+
+    <h4>Explicit Parallel Arrays</h4>
+    <p>
+      These are by far the simplest to remove.  We simply replace every
+      occurrence of <code>[:<i>e<sub>1</sub></i>, ...,
+      <i>e<sub>n</sub></i>:]</code> by
+    <blockquote>
+      <code>
+	toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>]
+      </code>
+    </blockquote>
+    <p>
+      i.e., we build a list of the array elements, which is then converted
+      into a parallel array.
+
+    <h4>Parallel Array Patterns</h4>
+    <p>
+      Array patterns are much more tricky, as element positions may contain
+      further patterns and the <a
+      href="../the-beast/desugar.html#patmat">pattern matching compiler</a>
+      requires us to flatten all those out.  But before we turn to the gory
+      details, here is the basic idea first: A flat array pattern matches exactly
+      iff its length corresponds to the length of the matched array.  Hence,
+      if we have a set of flat array patterns matching an array value
+      <code>a</code>, it suffices to generate a Core <code>case</code>
+      expression that scrutinises <code>lengthP a</code> and has one
+      alternative for every length of array occurring in one of the patterns.
+      Moreover, there needs to be a default case catching all other array
+      lengths.  In each alternative, array indexing (i.e., the function
+      <code>!:</code>) is used to bind array elements to the corresponding
+      pattern variables.  This sounds easy enough and is essentially what the
+      parallel array equation of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code>
+      does.
+    <p>
+      Unfortunately, however, the pattern matching compiler expects that it
+      can turn (almost) any pattern into variable patterns, literals, or
+      constructor applications by way of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>.
+      And to make matters worse, some weird machinery in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>
+      insists on being able to reverse the process (essentially to pretty
+      print patterns in warnings for incomplete or overlapping patterns).
+    <p>
+      The solution to this is an (unlimited) set of <em>fake</em> constructors
+      for parallel arrays, courtesy of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>.
+      In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>,
+      ..., <i>p<sub>n</sub></i>:]</code> is transformed into
+    <blockquote>
+      <code>
+	MkPArray<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i>
+      </code>
+    </blockquote>
+    <p>
+      by <code>Match.tidy1</code>, then run through the rest of the pattern
+      matching compiler, and finally picked up by
+      <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a
+      <code>case</code> expression as outlined above.
+    <p>
+      As an example, consider the source expression
+    <blockquote><pre>
+case v of
+  [:x1:]         -> e1
+  [:x2, x3, x4:] -> e2
+  _              -> e3</pre>
+    </blockquote>
+    <p>
+      <code>Match.tidy1</code> converts it into a form that is equivalent to
+    <blockquote><pre>
+case v of {
+  MkPArr1 x1       -> e1;
+  MkPArr3 x2 x3 x4 -> e2;
+  _                -> e3;
+}</pre>
+    </blockquote>
+    <p>
+      which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the
+      following Core code:
+    <blockquote><pre>
+  case lengthP v of
+    I# i# ->
+      case i# of l {
+	DFT -> e3
+	1   -> let x1 = v!:0 in e1
+	3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
+      }</pre>
+    </blockquote>
+
+    <h4>Parallel Array Comprehensions</h4>
+    <p>
+      The most challenging of the three constructs are array comprehensions.
+      In principle, it would be possible to transform them in essentially the
+      same way as list comprehensions, but this would lead to abysmally slow
+      code, as desugaring of list comprehensions generates code that is
+      optimised for sequential, constructor-based structures.  In contrast,
+      array comprehensions need to be transformed into code that solely relies
+      on collective operations and avoids the creation of many small
+      intermediate arrays.
+    <p>
+      The transformation is implemented by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>.
+      In the following, we denote this transformation function by the form
+      <code><<<i>e</i>>> pa ea</code>, where <code><i>e</i></code>
+      is the comprehension to be compiled and the arguments <code>pa</code>
+      and <code>ea</code> denote a pattern and the currently processed array
+      expression, respectively.  The invariant constraining these two
+      arguments is that all elements in the array produced by <code>ea</code>
+      will <em>successfully</em> match against <code>pa</code>.
+    <p>
+      Given a source-level comprehension <code>[:e | qss:]</code>, we compile
+      it with <code><<[:e | qss:]>> () [:():]</code> using the
+      rules
+    <blockquote><pre>
+<<[:e' | :]>> pa ea = mapP (\pa -> e') ea
+<<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
+<<[:e' | p <- e, qs:]>> pa ea =
+  let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
+  in
+  <<[:e' | qs:]>> (pa, p) (crossP ea ef)
+<<[:e' | let ds, qs:]>> pa ea =
+  <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
+                  (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
+where
+  {x_1, ..., x_n} = DV (ds)		-- Defined Variables
+<<[:e' | qs | qss:]>> pa ea =
+  <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
+                   (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
+where
+  {x_1, ..., x_n} = DV (qs)</pre>
+    </blockquote>
+    <p>
+      We assume the denotation of <code>crossP</code> to be given by
+    <blockquote><pre>
+crossP :: [:a:] -> [:b:] -> [:(a, b):]
+crossP a1 a2 = let
+		 len1 = lengthP a1
+		 len2 = lengthP a2
+		 x1   = concatP $ mapP (replicateP len2) a1
+		 x2   = concatP $ replicateP len1 a2
+	       in
+	       zipP x1 x2</pre>
+    </blockquote>
+    <p>
+      For a more efficient implementation of <code>crossP</code>, see
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>.
+    <p>
+      Moreover, the following optimisations are important:
+    <ul>
+      <li>In the <code>p <- e</code> rule, if <code>pa == ()</code>, drop
+	it and simplify the <code>crossP ea e</code> to <code>e</code>.
+      <li>We assume that fusion will optimise sequences of array processing
+	combinators.
+      <li>FIXME: Do we want to have the following function?
+	<blockquote><pre>
+mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre>
+	</blockquote>
+	<p>
+	Even with fusion, <code>(mapP (\p -> e) . filterP (\p ->
+	b))</code> may still result in redundant pattern matching
+	operations.  (Let's wait with this until we have seen what the
+	Simplifier does to the generated code.)
+    </ul>
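+    <p>
+      As a worked example (a sketch of what the above rules produce), a
+      one-generator comprehension with a boolean qualifier reduces, after
+      applying the <code>pa == ()</code> optimisation and discarding the
+      always-true <code>filterP</code> generated for the variable pattern
+      <code>x</code>, to a simple <code>filterP</code>/<code>mapP</code>
+      pipeline:
+    <blockquote><pre>
+[:x * x | x <- xs, x > 0:]
+
+  ==>  mapP (\x -> x * x) (filterP (\x -> x > 0) xs)
+</pre>
+    </blockquote>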
+    <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2>
+    <p>
+      On the quest towards an entirely unboxed representation of parallel
+      arrays, the flattening transformation is the essential ingredient.  GHC
+      uses a <a
+      href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially
+      improved version</a> of the transformation whose original form was
+      described by Blelloch & Sabot.  The flattening transformation
+      replaces values of type <code>[:a:]</code> as well as functions
+      operating on these values by alternative, more efficient data structures
+      and functions.
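+    <p>
+      The flavour of the transformation (shown here as a sketch only; these
+      are not GHC's actual types) is that arrays of structured elements are
+      turned inside out: an array of pairs becomes a pair of arrays, and a
+      nested array becomes a flat data array paired with a segment descriptor
+      recording the lengths of the subarrays:
+    <blockquote><pre>
+-- [:(a, b):]  is represented as  ([:a:], [:b:])
+-- [:[:a:]:]   is represented as  ([:Int:], [:a:])  -- segment lengths plus flat data
+</pre>
+    </blockquote>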
+    <p>
+      The flattening machinery is activated by the option
+      <code>-fflatten</code>, which is a static hsc option.  It consists of
+      two steps: function vectorisation and array specialisation.  Only the
+      first of those is implemented so far.  If selected, the transformation
+      is applied to a module in Core form immediately after the <a
+      href="../the-beast/desugar.html">desugarer</a>, before the <a
+      href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do its
+      job.  After vectorisation, the Core program can be dumped using the
+      option <code>-ddump-vect</code>.  There is a good reason for us to perform
+      flattening immediately after the desugarer.  In <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code>
+      the so-called <em>persistent compiler state</em> is maintained, which
+      contains all the information about imported interface files needed to
+      look up the details of imported names (e.g., during renaming and type
+      checking).  However, all this information is zapped before the
+      simplifier is invoked (supposedly to reduce the space consumption of
+      GHC).  As flattening has to get at all kinds of identifiers from Prelude
+      modules, we need to do it before the relevant information in the
+      persistent compiler state is gone.
+
+    <p>
+      As flattening generally requires all libraries to be compiled for
+      flattening (just like profiling does), there is a <em>compiler way</em>
+      <code>"ndp"</code>, which can be selected using the way option code
+      <code>-ndp</code>.  This option will automagically select
+      <code>-fparr</code> and <code>-fflatten</code>.
+
+    <h4><code>FlattenMonad</code></h4>
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a>
+      implements the monad <code>Flatten</code> that is used during
+      vectorisation to keep track of various sets of bound variables and a
+      variable substitution map; moreover, it provides a supply of new uniques
+      and allows us to look up names in the persistent compiler state (i.e.,
+      imported identifiers).
+    <p>
+      In order to be able to look up imported identifiers in the persistent
+      compiler state, it is important that these identifiers are included in
+      the free variable lists computed by the renamer.  More precisely, all
+      names needed by flattening are included in the names produced by
+      <code>RnEnv.getImplicitModuleFVs</code>.  To avoid putting
+      flattening-dependent lists of names into the renamer, the module
+      <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>.
+
+      [FIXME: It might be worthwhile to document in the non-Flattening part of
+      the Commentary that the persistent compiler state is zapped after
+      desugaring and how the free variables determined by the renamer imply
+      which names are imported.]
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Feb 12 01:44:21 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/exts/th.html b/docs/comm/exts/th.html
new file mode 100644
index 0000000000..dbb168aa0e
--- /dev/null
+++ b/docs/comm/exts/th.html
@@ -0,0 +1,197 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Template Haskell</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Template Haskell</h1>
+    <p>
+      The Template Haskell (TH) extension to GHC adds a meta-programming
+      facility in which all meta-level code is executed at compile time.  The
+      design of this extension is detailed in "Template Meta-programming for
+      Haskell", Tim Sheard and Simon Peyton Jones, <a
+      href="http://portal.acm.org/toc.cfm?id=581690&type=proceeding&coll=portal&dl=ACM&part=series&WantType=proceedings&idx=unknown&title=unknown">ACM
+      SIGPLAN 2002 Haskell Workshop,</a> 2002.  However, some of the details
+      changed after the paper was published.
+    </p>
+
+    <h2>Meta Sugar</h2>
+    <p>
+      The extra syntax of TH (quasi-quote brackets, splices, and reification)
+      is handled in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMeta.hs"><code>DsMeta</code></a>.
+      In particular, the function <code>dsBracket</code> desugars the four
+      types of quasi-quote brackets (<code>[|...|]</code>,
+      <code>[p|...|]</code>, <code>[d|...|]</code>, and <code>[t|...|]</code>)
+      and <code>dsReify</code> desugars the three types of reification
+      operations (<code>reifyType</code>, <code>reifyDecl</code>, and
+      <code>reifyFixity</code>).
+    </p>
+
+    <h3>Desugaring of Quasi-Quote Brackets</h3>
+    <p>
+      A term in quasi-quote brackets needs to be translated into Core code
+      that, when executed, yields a <em>representation</em> of that term in
+      the form of the abstract syntax trees defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/template-haskell/Language/Haskell/TH/Syntax.hs"><code>Language.Haskell.TH.Syntax</code></a>.
+      Within <code>DsMeta</code>, this is achieved by four functions
+      corresponding to the four types of quasi-quote brackets:
+      <code>repE</code> (for <code>[|...|]</code>), <code>repP</code> (for
+      <code>[p|...|]</code>), <code>repTy</code> (for <code>[t|...|]</code>),
+      and <code>repTopDs</code> (for <code>[d|...|]</code>).  All four of
+      these functions receive as an argument the GHC-internal Haskell AST of
+      the syntactic form that they quote (i.e., arguments of type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsExpr.lhs"><code>HsExpr</code></a><code>.HsExpr
+      Name</code>, <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsPat.lhs"><code>HsPat</code></a><code>.HsPat Name</code>,
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsTypes.lhs"><code>HsType</code></a><code>.HsType
+      Name</code>, and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsDecls.lhs"><code>HsDecls</code></a><code>.HsGroup
+      Name</code>, respectively).
+    </p>
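+    <p>
+      Schematically, and borrowing today's constructor names from
+      <code>Language.Haskell.TH.Syntax</code> purely for illustration, the
+      Core generated for a simple bracket builds an AST value when it is run:
+    </p>
+    <blockquote>
+      <pre>
+[| \x -> x |]
+
+-- the generated Core, when executed, yields (roughly)
+LamE [VarP x] (VarE x)    -- where x is a fresh, gensym'ed name</pre>
+    </blockquote>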
+    <p>
+      To increase the static type safety in <code>DsMeta</code>, the functions
+      constructing representations do not just return plain values of type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>
+      <code>.CoreExpr</code>; instead, <code>DsMeta</code> introduces a
+      parametrised type <code>Core</code> whose dummy type parameter indicates
+      the source-level type of the value computed by the corresponding Core
+      expression.  All construction of Core fragments in <code>DsMeta</code>
+      is performed by smart constructors whose type signatures use the dummy
+      type parameter to constrain the contexts in which they are applicable.
+      For example, a function that builds a Core expression that evaluates to
+      a TH type representation, which has type
+      <code>Language.Haskell.TH.Syntax.Type</code>, would return a value of
+      type
+    </p>
+    <blockquote>
+      <pre>
+Core Language.Haskell.TH.Syntax.Type</pre>
+    </blockquote>
+
+    <h3>Desugaring of Reification Operators</h3>
+    <p>
+      The TH paper introduces four reification operators:
+      <code>reifyType</code>, <code>reifyDecl</code>,
+      <code>reifyFixity</code>, and <code>reifyLocn</code>.  Of these,
+      currently (= 9 Nov 2002), only the first two are implemented.
+    </p>
+    <p>
+      The operator <code>reifyType</code> receives the name of a function or
+      data constructor as its argument and yields a representation of this
+      entity's type in the form of a value of type
+      <code>TH.Syntax.Type</code>.  Similarly, <code>reifyDecl</code> receives
+      the name of a type and yields a representation of the type's declaration
+      as a value of type <code>TH.Syntax.Decl</code>.  The name of the reified
+      entity is mapped to the GHC-internal representation of the entity by
+      using the function <code>lookupOcc</code> on the name.
+    </p>
+
+    <h3>Representing Binding Forms</h3>
+    <p>
+      Care needs to be taken when constructing TH representations of Haskell
+      terms that include binding forms, such as lambda abstractions or let
+      bindings.  To avoid name clashes, fresh names need to be generated for
+      all defined identifiers.  This is achieved via the routine
+      <code>DsMeta.mkGenSym</code>, which, given a <code>Name</code>, produces
+      a <code>Name</code> / <code>Id</code> pair (of type
+      <code>GenSymBind</code>) that associates the given <code>Name</code>
+      with a Core identifier that at runtime will be bound to a string that
+      contains the fresh name.  Notice the two-level nature of this
+      arrangement.  It is necessary, as the Core code that constructs the
+      Haskell term representation may be executed multiple times at runtime
+      and it must be ensured that different names are generated in each run.
+    </p>
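+    <p>
+      To see why the fresh names must be made at runtime rather than once at
+      compile time, consider running the same quotation twice.  The sketch
+      below uses the user-level API (with the later <code>newName</code>
+      function standing in for the gensym'ed binding in the generated Core):
+    </p>
+    <blockquote>
+      <pre>
+mkId :: Q Exp                -- corresponds to [| \x -> x |]
+mkId = do x <- newName "x"   -- a different x for every run
+          return (LamE [VarP x] (VarE x))</pre>
+    </blockquote>
+    <p>
+      Each run yields a lambda with a distinct bound name, so two splices of
+      the same bracket cannot accidentally capture each other's variables.
+    </p>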
+    <p>
+      Such fresh bindings need to be entered into the meta environment (of
+      type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a><code>.DsMetaEnv</code>),
+      which is part of the state (of type <code>DsMonad.DsEnv</code>)
+      maintained in the desugarer monad (of type <code>DsMonad.DsM</code>).
+      This is done using the function <code>DsMeta.addBinds</code>, which
+      extends the current environment by a list of <code>GenSymBind</code>s
+      and executes a subcomputation in this extended environment.  Names can
+      be looked up in the meta environment by way of the functions
+      <code>DsMeta.lookupOcc</code> and <code>DsMeta.lookupBinder</code>; more
+      details about the difference between these two functions can be found in
+      the next subsection.
+    </p>
+    <p>
+      NB: <code>DsMeta</code> uses <code>mkGenSym</code> only when
+      representing terms that may be embedded into a context where names can
+      be shadowed.  For example, a lambda abstraction embedded into an
+      expression can potentially shadow names defined in the context it is
+      being embedded into.  In contrast, this can never be the case for
+      top-level declarations, such as data type declarations; hence, the type
+      variables that a parametric data type declaration abstracts over are not
+      being gensym'ed.  As a result, variables in defining positions are
+      handled differently depending on the syntactic construct in which they
+      appear.
+    </p>
+
+    <h3>Binders Versus Occurrences</h3>
+    <p>
+      Name lookups in the meta environment of the desugarer use two functions
+      with slightly different behaviour, namely <code>DsMeta.lookupOcc</code>
+      and <code>lookupBinder</code>.  The module <code>DsMeta</code> contains
+      the following explanation as to the difference between these functions:
+    </p>
+    <blockquote>
+      <pre>
+When we desugar [d| data T = MkT |]
+we want to get
+	Data "T" [] [Con "MkT" []] []
+and *not*
+	Data "Foo:T" [] [Con "Foo:MkT" []] []
+That is, the new data decl should fit into whatever new module it is
+asked to fit in.  We do *not* clone, though; no need for this:
+	Data "T79" ....
+
+But if we see this:
+	data T = MkT
+	foo = reifyDecl T
+
+then we must desugar to
+	foo = Data "Foo:T" [] [Con "Foo:MkT" []] []
+
+So in repTopDs we bring the binders into scope with mkGenSyms and addBinds,
+but in dsReify we do not.  And we use lookupOcc, rather than lookupBinder
+in repTyClD and repC.</pre>
+    </blockquote>
+    <p>
+      This implies that <code>lookupOcc</code>, when it does not find the name
+      in the meta environment, uses the function <code>DsMeta.globalVar</code>
+      to construct the <em>original name</em> of the entity (cf. the TH paper
+      for more details regarding original names).  This name uniquely
+      identifies the entity in the whole program and is in scope
+      <em>independent</em> of whether the user name of the same entity is in
+      scope or not (i.e., it may be defined in a different module without
+      being explicitly imported) and has the form <module>:<name>.
+      <strong>NB:</strong> Incidentally, the current implementation of this
+      mechanism facilitates breaking any abstraction barrier.
+    </p>
+
+    <h3>Known-key Names for Template Haskell</h3>
+    <p>
+      During the construction of representations, the desugarer needs to use a
+      large number of functions defined in the library
+      <code>Language.Haskell.TH.Syntax</code>.
+      The names of these functions
+      need to be made available to the compiler in the way outlined in <a
+      href="../the-beast/prelude.html">Primitives and the Prelude</a>.
+      Unfortunately, any change to <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>
+      triggers a significant amount of recompilation.  Hence, the names needed
+      for TH are defined in <code>DsMeta</code> instead (at the end of the
+      module).  All library functions needed by TH are contained in the name
+      set <code>DsMeta.templateHaskellNames</code>.
+    </p>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Nov 13 18:01:48 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/feedback.html b/docs/comm/feedback.html
new file mode 100644
index 0000000000..1da8b10f29
--- /dev/null
+++ b/docs/comm/feedback.html
@@ -0,0 +1,34 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Feedback</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Feedback</h1>
+    <p>
+      <a href="mailto:chak@cse.unsw.edu.au">I</a> welcome any feedback on the
+      material and in particular would appreciate comments on which parts of
+      the document are incomprehensible or lack explanation -- e.g., due to
+      the use of GHC speak that is explained nowhere (words like infotable).
+      Moreover, I would be interested to know which areas of GHC you
+      would like to see covered here.
+    <p>
+      For the moment it is probably best if feedback is directed to
+    <p>
+    <blockquote>
+      <a
+      href="mailto:chak@cse.unsw.edu.au"><code>chak@cse.unsw.edu.au</code></a>
+    </blockquote>
+    <p>
+      However, if there is sufficient interest, we might consider setting up a
+      mailing list.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 00:11:42 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/genesis.html b/docs/comm/genesis/genesis.html
new file mode 100644
index 0000000000..30b16fec46
--- /dev/null
+++ b/docs/comm/genesis/genesis.html
@@ -0,0 +1,82 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Outline of the Genesis</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Outline of the Genesis</h1>
+    <p>
+      Building GHC happens in two stages: First you have to prepare the tree
+      with <code>make boot</code>; and second, you build the compiler and
+      associated libraries with <code>make all</code>.  The <code>boot</code>
+      stage builds some tools used during the main build process, generates
+      parsers and other pre-computed source, and finally computes dependency
+      information.  There is considerable detail on the build process in GHC's
+      <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building Guide.</a>
+
+    <h4>Debugging the Beast</h4>
+    <p>
+      If you are hacking the compiler or like to play with unstable
+      development versions, chances are that the compiler someday just crashes
+      on you.  Then, it is a good idea to load the <code>core</code> into
+      <code>gdb</code> as usual, but unfortunately it usually does not contain
+      much useful information.
+    <p>
+      The next step, then, is somewhat tedious.  You should build a compiler
+      producing programs with a runtime system that has debugging turned on
+      and use that to build the crashing compiler.
There are many sanity
+      checks in the RTS that may detect inconsistencies before they lead to a
+      crash, and you can include more debugging information, which helps
+      <code>gdb</code>.  For an RTS with debugging turned on, add the following
+      to <code>build.mk</code> (see also the comment in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/mk/config.mk.in"><code>config.mk.in</code></a> that you find when searching for
+      <code>GhcRtsHcOpts</code>):
+<blockquote><pre>
+GhcRtsHcOpts+=-optc-DDEBUG
+GhcRtsCcOpts+=-g
+EXTRA_LD_OPTS=-lbfd -liberty</pre></blockquote>
+    <p>
+      Then go into <code>fptools/ghc/rts</code> and <code>make clean boot &&
+      make all</code>.  With the resulting runtime system, you have to re-link
+      the compiler.  Go into <code>fptools/ghc/compiler</code>, delete the
+      file <code>hsc</code> (up to version 4.08) or
+      <code>ghc-<version></code>, and execute <code>make all</code>.
+    <p>
+      The <code>EXTRA_LD_OPTS</code> are necessary as some of the debugging
+      code uses the BFD library, which in turn requires <code>liberty</code>.
+      I would also recommend (in 4.11 and from 5.0 upwards) adding these linker
+      options to the files <code>package.conf</code> and
+      <code>package.conf.inplace</code> in the directory
+      <code>fptools/ghc/driver/</code> to the <code>extra_ld_opts</code> entry
+      of the package <code>RTS</code>.  Otherwise, you have to supply them
+      whenever you compile and link a program with a compiler that uses the
+      debugging RTS for the programs it produces.
+    <p>
+      To run GHC up to version 4.08 in <code>gdb</code>, first invoke the
+      compiler as usual, but pass it the option <code>-v</code>.  This will
+      show you the exact invocation of the compiler proper <code>hsc</code>.
+      Run <code>hsc</code> with these options in <code>gdb</code>.  The
+      development version 4.11 and stable releases from 5.0 onwards no longer
+      use the Perl driver, so you can run them directly with gdb.
+    <p>
+      <strong>Debugging a compiler during building from HC files.</strong>
+      If you are bootstrapping the compiler on a new platform from HC files and
+      it crashes somewhere during the build (e.g., when compiling the
+      libraries), do as explained above, but you may have to re-configure the
+      build system with <code>--enable-hc-boot</code> before re-making the
+      code in <code>fptools/ghc/driver/</code>.
+      If you do this with a compiler up to version 4.08, run the build process
+      with <code>make EXTRA_HC_OPTS=-v</code> to get the exact arguments with
+      which you have to invoke <code>hsc</code> in <code>gdb</code>.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Sun Apr 24 22:16:30 CEST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/makefiles.html b/docs/comm/genesis/makefiles.html
new file mode 100644
index 0000000000..957a82eb85
--- /dev/null
+++ b/docs/comm/genesis/makefiles.html
@@ -0,0 +1,51 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Mindboggling Makefiles</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Mindboggling Makefiles</h1>
+    <p>
+      The size and structure of GHC's makefiles make it quite easy to scream
+      out loud - in pain - during the process of tracking down problems in the
+      make system or when attempting to alter it.
GHC's <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building
+      Guide</a> has valuable information on <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/sec-makefile-arch.html">the
+      makefile architecture.</a>
+
+    <h4>A maze of twisty little passages, all alike</h4>
+    <p>
+      The <code>fptools/</code> toplevel and the various project directories
+      contain not only a <code>Makefile</code> each; there are also
+      subdirectories named <code>mk/</code> at various levels that contain
+      rules, targets, and so on specific to a project - or, in the case of the
+      toplevel, the default rules for the whole system.  Each <code>mk/</code>
+      directory contains a file <code>boilerplate.mk</code> that ties the
+      various other makefiles together.  Files called <code>target.mk</code>,
+      <code>paths.mk</code>, and <code>suffix.mk</code> contain make targets,
+      definitions of variables containing paths, and suffix rules,
+      respectively.
+    <p>
+      One particularly nasty trick used in this hierarchy of makefiles is the
+      way in which the variable <code>$(TOP)</code> is used.  AFAIK,
+      <code>$(TOP)</code> always points to a directory containing an
+      <code>mk/</code> subdirectory; however, it does not necessarily point to
+      the toplevel <code>fptools/</code> directory.  For example, within the GHC
+      subtree, <code>$(TOP)</code> points to <code>fptools/ghc/</code>.
+      However, some of the makefiles in <code>fptools/ghc/mk/</code> will then
+      <em>temporarily</em> redefine <code>$(TOP)</code> to point one level
+      higher (i.e., to <code>fptools/</code>) while they are including the
+      toplevel boilerplate.  After that <code>$(TOP)</code> is redefined to
+      whatever value it had before including makefiles from higher up in the
+      hierarchy.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/modules.html b/docs/comm/genesis/modules.html
new file mode 100644
index 0000000000..de59cce6d3
--- /dev/null
+++ b/docs/comm/genesis/modules.html
@@ -0,0 +1,164 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Marvellous Module Structure of GHC </title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Marvellous Module Structure of GHC </h1>
+    <p>
+
+GHC is built out of about 245 Haskell modules.  It can be quite tricky
+to figure out what the module dependency graph looks like.  It can be
+important, too, because loops in the module dependency graph need to
+be broken carefully using <tt>.hi-boot</tt> interface files.
+<p>
+This section of the commentary documents the subtlest part of
+the module dependency graph, namely the part near the bottom.
+<ul>
+<li> The list is given in compilation order: that is,
+modules near the top are more primitive, and are compiled earlier.
+<li> Each module is listed together with its most critical
+dependencies in parentheses; that is, the dependencies that prevent it being
+compiled earlier.
+<li> Modules in the same bullet don't depend on each other.
+<li> Loops are documented by a dependency such as "<tt>loop Type.Type</tt>".
+This means that the module imports <tt>Type.Type</tt>, but module <tt>Type</tt>
+has not yet been compiled, so the import comes from <tt>Type.hi-boot</tt>.
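+For example (schematic; such a <em>source import</em> is satisfied from the
+hand-written <tt>.hi-boot</tt> stub because <tt>Type</tt> itself is compiled
+later):
+<blockquote><pre>
+-- in Var.lhs, before module Type has been compiled:
+import {-# SOURCE #-} Type ( Type, Kind )
+</pre></blockquote>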
+</ul> + +Compilation order is as follows: +<ul> +<li> +<strong>First comes a layer of modules that have few interdependencies, +and which implement very basic data types</strong>: +<tt> <ul> +<li> Util +<li> OccName +<li> Pretty +<li> Outputable +<li> StringBuffer +<li> ListSetOps +<li> Maybes +<li> etc +</ul> </tt> + +<p> +<li> <strong> Now comes the main subtle layer, involving types, classes, type constructors +identifiers, expressions, rules, and their operations.</strong> + +<tt> +<ul> +<li> Name <br> PrimRep +<p><li> + PrelNames +<p><li> + Var (Name, loop IdInfo.IdInfo, + loop Type.Type, loop Type.Kind) +<p><li> + VarEnv, VarSet, ThinAir +<p><li> + Class (loop TyCon.TyCon, loop Type.Type) +<p><li> + TyCon (loop Type.Type, loop Type.Kind, loop DataCon.DataCon, loop Generics.GenInfo) +<p><li> + TypeRep (loop DataCon.DataCon, loop Subst.substTyWith) +<p><li> + Type (loop PprType.pprType, loop Subst.substTyWith) +<p><li> + FieldLabel(Type) <br> + TysPrim(Type) <br> +<p><li> + Literal (TysPrim, PprType) <br> + DataCon (loop PprType, loop Subst.substTyWith, FieldLabel.FieldLabel) +<p><li> + TysWiredIn (loop MkId.mkDataConIds) +<p><li> + TcType( lots of TysWiredIn stuff) +<p><li> + PprType( lots of TcType stuff ) +<p><li> + PrimOp (PprType, TysWiredIn) +<p><li> + CoreSyn [does not import Id] +<p><li> + IdInfo (CoreSyn.Unfolding, CoreSyn.CoreRules) +<p><li> + Id (lots from IdInfo) +<p><li> + CoreFVs <br> + PprCore +<p><li> + CoreUtils (PprCore.pprCoreExpr, CoreFVs.exprFreeVars, + CoreSyn.isEvaldUnfolding CoreSyn.maybeUnfoldingTemplate) +<p><li> + CoreLint( CoreUtils ) <br> + OccurAnal (CoreUtils.exprIsTrivial) <br> + CoreTidy (CoreUtils.exprArity ) <br> +<p><li> + CoreUnfold (OccurAnal.occurAnalyseGlobalExpr) +<p><li> + Subst (CoreUnfold.Unfolding, CoreFVs) <br> + Generics (CoreUnfold.mkTopUnfolding) <br> + Rules (CoreUnfold.Unfolding, PprCore.pprTidyIdRules) +<p><li> + MkId (CoreUnfold.mkUnfolding, Subst, Rules.addRule) +<p><li> + PrelInfo (MkId) <br> + HscTypes( Rules.RuleBase ) +</ul></tt> + +<p><li> <strong>That is the end of the infrastructure. 
Now we get the
+    main layer of modules that perform useful work.</strong>
+
+<tt><ul>
+<p><li>
+ CoreTidy (HscTypes.PersistentCompilerState)
+</ul></tt>
+</ul>
+
+HsSyn stuff
+<ul>
+<li> HsPat.hs-boot
+<li> HsExpr.hs-boot (loop HsPat.LPat)
+<li> HsTypes (loop HsExpr.HsSplice)
+<li> HsBinds (HsTypes.LHsType, loop HsPat.LPat, HsExpr.pprFunBind and others)
+     HsLit (HsTypes.SyntaxName)
+<li> HsPat (HsBinds, HsLit)
+     HsDecls (HsBinds)
+<li> HsExpr (HsDecls, HsPat)
+</ul>
+
+
+
+<h2>Library stuff: base package</h2>
+
+<ul>
+<li> GHC.Base
+<li> Data.Tuple (GHC.Base), GHC.Ptr (GHC.Base)
+<li> GHC.Enum (Data.Tuple)
+<li> GHC.Show (GHC.Enum)
+<li> GHC.Num (GHC.Show)
+<li> GHC.ST (GHC.Num), GHC.Real (GHC.Num)
+<li> GHC.Arr (GHC.ST) GHC.STRef (GHC.ST)
+<li> GHC.IOBase (GHC.Arr)
+<li> Data.Bits (GHC.Real)
+<li> Data.HashTable (Data.Bits, Control.Monad)
+<li> Data.Typeable (GHC.IOBase, Data.HashTable)
+<li> GHC.Weak (Data.Typeable, GHC.IOBase)
+</ul>
+
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
+
+
+
+
+
diff --git a/docs/comm/index.html b/docs/comm/index.html
new file mode 100644
index 0000000000..5ccd5f0ca9
--- /dev/null
+++ b/docs/comm/index.html
@@ -0,0 +1,121 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Beast Explained</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.17]</h1>
+    <p>
+      <!-- Contributors: Whoever makes substantial additions or changes to the
+      document, please add your name and keep the order alphabetic.  Moreover,
+      please bump the version number for any substantial modification that you
+      check into CVS.
+      -->
+      <strong>Manuel M. T. Chakravarty</strong><br>
+      <strong>Sigbjorn Finne</strong><br>
+      <strong>Simon Marlow</strong><br>
+      <strong>Simon Peyton Jones</strong><br>
+      <strong>Julian Seward</strong><br>
+      <strong>Reuben Thomas</strong><br>
+      <br>
+    <p>
+      This document started as a collection of notes describing what <a
+      href="mailto:chak@cse.unsw.edu.au">I</a> learnt when poking around in
+      the <a href="http://haskell.org/ghc/">GHC</a> sources.  During the
+      <i>Haskell Implementers Workshop</i> in January 2001, it was decided to
+      put the commentary into
+      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/">GHC's CVS
+      repository</a>
+      to allow the whole developer community to add their wizardly insight to
+      the document.
+    <p>
+      <strong>The document is still far from being complete - help it
+      grow!</strong>
+
+    <h2>Before the Show Begins</h2>
+    <p>
+    <ul>
+      <li><a href="feedback.html">Feedback</a>
+      <li><a href="others.html">Other Sources of Wisdom</a>
+    </ul>
+
+    <h2>Genesis</h2>
+    <p>
+    <ul>
+      <li><a href="genesis/genesis.html">Outline of the Genesis</a>
+      <li><a href="genesis/makefiles.html">Mindboggling Makefiles</a>
+      <li><a href="genesis/modules.html">GHC's Marvellous Module Structure</a>
+    </ul>
+
+    <h2>The Beast Dissected</h2>
+    <p>
+    <ul>
+      <li><a href="the-beast/coding-style.html">Coding style used in
+	the compiler</a>
+      <li><a href="the-beast/driver.html">The Glorious Driver</a>
+      <li><a href="the-beast/prelude.html">Primitives and the Prelude</a>
+      <li><a href="the-beast/syntax.html">Just Syntax</a>
+      <li><a href="the-beast/basicTypes.html">The Basics</a>
+      <li><a href="the-beast/modules.html">Modules, ModuleNames and
+	Packages</a>
+      <li><a href="the-beast/names.html">The truth about names: Names and OccNames</a>
+      <li><a href="the-beast/vars.html">The Real Story about Variables, Ids,
+	TyVars, and the like</a>
+      <li><a href="the-beast/data-types.html">Data types and constructors</a>
+      <li><a href="the-beast/renamer.html">The Glorious Renamer</a>
+      <li><a href="the-beast/types.html">Hybrid Types</a>
+      <li><a href="the-beast/typecheck.html">Checking Types</a>
+      <li><a href="the-beast/desugar.html">Sugar Free: From Haskell To Core</a>
+      <li><a href="the-beast/simplifier.html">The Mighty Simplifier</a>
+      <li><a href="the-beast/mangler.html">The Evil Mangler</a>
+      <li><a href="the-beast/alien.html">Alien Functions</a>
+      <li><a href="the-beast/stg.html">You Got Control: The STG-language</a>
+      <li><a href="the-beast/ncg.html">The Native Code Generator</a>
+      <li><a href="the-beast/ghci.html">GHCi</a>
+      <li><a href="the-beast/fexport.html">Implementation of
+	<code>foreign export</code></a>
+      <li><a href="the-beast/main.html">Compiling and running the Main module</a>
+    </ul>
+
+    <h2>RTS & Libraries</h2>
+    <p>
+    <ul>
+      <li><a href="rts-libs/coding-style.html">Coding Style Guidelines</a>
+      <li><a href="rts-libs/stgc.html">Spineless Tagless C</a>
+      <li><a href="rts-libs/primitives.html">Primitives</a>
+      <li><a href="rts-libs/prelfound.html">Prelude Foundations</a>
+      <li><a href="rts-libs/prelude.html">Cunning Prelude Code</a>
+      <li><a href="rts-libs/foreignptr.html">On why we have <tt>ForeignPtr</tt></a>
+      <li><a href="rts-libs/non-blocking.html">Non-blocking I/O for Win32</a>
+      <li><a href="rts-libs/multi-thread.html">Supporting multi-threaded interoperation</a>
+    </ul>
+
+    <h2>Extensions, or Making a Complicated System More Complicated</h2>
+    <p>
+    <ul>
+      <li><a href="exts/th.html">Template Haskell</a>
+      <li><a href="exts/ndp.html">Parallel Arrays</a>
+    </ul>
+
+    <h2>The Source</h2>
+    <p>
+      The online master copy of the Commentary is at
+    <blockquote>
+      <a href="http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/">http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/</a>
+    </blockquote>
+    <p>
+      This online version is updated
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/docs/comm/">from
+      CVS</a>
+      daily.
+    <p><small>
+<!-- hhmts start -->
+Last modified: Thu May 12 19:03:42 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/others.html b/docs/comm/others.html
new file mode 100644
index 0000000000..52d87e9419
--- /dev/null
+++ b/docs/comm/others.html
@@ -0,0 +1,60 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Other Sources of Wisdom</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Other Sources of Wisdom</h1>
+    <p>
+      Believe it or not, there are other people besides you who are
+      masochistic enough to study the innards of the beast.  Some of them have
+      been kind (or cruel?) enough to share their insights with us.  Here is a
+      probably incomplete list:
+    <p>
+    <ul>
+
+      <li>The <a
+	  href="http://www.cee.hw.ac.uk/~dsg/gph/docs/StgSurvival.ps.gz">STG
+	  Survival Sheet</a> has -- according to its header -- been written by
+	  `a poor wee soul',<sup><a href="#footnote1">1</a></sup> which
+	  probably has been pushed into the torments of madness by the very
+	  act of contemplating the inner workings of the STG runtime system.
+	  This document discusses GHC's runtime system with a focus on
+	  support for parallel processing (aka GUM).
+
+      <li>Instructions on <a
+	  href="http://www-users.cs.york.ac.uk/~olaf/PUBLICATIONS/extendGHC.html">Adding
+	  an Optimisation Pass to the Glasgow Haskell Compiler</a>
+	  have been compiled by <a
+	  href="http://www-users.cs.york.ac.uk/~olaf/">Olaf Chitil</a>.
+	  Unfortunately, this document is already a little aged.
+
+      <li><a href="http://www.cs.pdx.edu/~apt/">Andrew Tolmach</a> has defined
+	  <a href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">an external
+	  representation of
+	  GHC's <em>Core</em> language</a> and also implemented a GHC pass
+	  that emits the intermediate form into <code>.hcr</code> files.  The
+	  option <code>-fext-core</code> triggers GHC to emit Core code after
+	  optimisation; in addition, <code>-fno-code</code> is often used to
+	  stop compilation after Core has been emitted.
+
+      <!-- Add references to other background texts listed on the GHC docu
+	   page
+	-->
+
+    </ul>
+
+    <p><hr><p>
+    <sup><a name="footnote1">1</a></sup>Usually reliable sources have it that
+    the poor soul in question is none other than GUM hardcore hacker <a
+    href="http://www.cee.hw.ac.uk/~hwloidl/">Hans-Wolfgang Loidl</a>.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Nov 13 10:56:57 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/coding-style.html b/docs/comm/rts-libs/coding-style.html
new file mode 100644
index 0000000000..58f5b4f9bb
--- /dev/null
+++ b/docs/comm/rts-libs/coding-style.html
@@ -0,0 +1,516 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Style Guidelines for RTS C code</title>
+  </head>
+
+<body>
+<H1>The GHC Commentary - Style Guidelines for RTS C code</h1>
+
+<h2>Comments</h2>
+
+<p>These coding style guidelines are mainly intended for use in
+<tt>ghc/rts</tt> and <tt>ghc/includes</tt>.
+
+<p>NB These are just suggestions.  They're not set in stone.  Some of
+them are probably misguided.  If you disagree with them, feel free to
+modify this document (and make your commit message reasonably
+informative) or mail someone (e.g.,
<a +href="glasgow-haskell-users@haskell.org">The GHC mailing list</a>) + +<h2>References</h2> + +If you haven't read them already, you might like to check the following. +Where they conflict with our suggestions, they're probably right. + +<ul> + +<li> +The C99 standard. One reasonable reference is <a +href="http://home.tiscalinet.ch/t_wolf/tw/c/c9x_changes.html">here</a>. + +<p><li> +Writing Solid Code, Microsoft Press. (Highly recommended. Possibly +the only Microsoft Press book that's worth reading.) + +<p><li> +Autoconf documentation. +See also <a href="http://peti.gmd.de/autoconf-archive/">The autoconf macro archive</a> and +<a href="http://www.cyclic.com/cyclic-pages/autoconf.html">Cyclic Software's description</a> + +<p><li> <a +href="http://www.cs.umd.edu/users/cml/cstyle/indhill-cstyle.html">Indian +Hill C Style and Coding Standards</a>. + +<p><li> +<a href="http://www.cs.umd.edu/users/cml/cstyle/">A list of C programming style links</a> + +<p><li> +<a href="http://www.lysator.liu.se/c/c-www.html">A very large list of C programming links</a> + +<p><li> +<a href="http://www.geek-girl.com/unix.html">A list of Unix programming links</a> + +</ul> + + +<h2>Portability issues</h2> + +<ul> +<p><li> We try to stick to C99 where possible. We use the following +C99 features relative to C89, some of which were previously GCC +extensions (possibly with different syntax): + +<ul> +<p><li>Variable length arrays as the last field of a struct. GCC has +a similar extension, but the syntax is slightly different: in GCC you +would declare the array as <tt>arr[0]</tt>, whereas in C99 it is +declared as <tt>arr[]</tt>. + +<p><li>Inline annotations on functions (see later) + +<p><li>Labeled elements in initialisers. Again, GCC has a slightly +different syntax from C99 here, and we stick with the GCC syntax until +GCC implements the C99 proposal. + +<p><li>C++-style comments. These are part of the C99 standard, and we +prefer to use them whenever possible. +</ul> + +<p>In addition we use ANSI-C-style function declarations and +prototypes exclusively. Every function should have a prototype; +static function prototypes may be placed near the top of the file in +which they are declared, and external prototypes are usually placed in +a header file with the same basename as the source file (although there +are exceptions to this rule, particularly when several source files +together implement a subsystem which is described by a single external +header file). + +<p><li>We use the following GCC extensions, but surround them with +<tt>#ifdef __GNUC__</tt>: + +<ul> +<p><li>Function attributes (mostly just <code>no_return</code> and +<code>unused</code>) +<p><li>Inline assembly. +</ul> + +<p><li> +char can be signed or unsigned - always say which you mean + +<p><li>Our POSIX policy: try to write code that only uses POSIX (IEEE +Std 1003.1) interfaces and APIs. We used to define +<code>POSIX_SOURCE</code> by default, but found that this caused more +problems than it solved, so now we require any code that is +POSIX-compliant to explicitly say so by having <code>#include +"PosixSource.h"</code> at the top. Try to do this whenever possible. + +<p><li> Some architectures have memory alignment constraints. Others +don't have any constraints but go faster if you align things. 
These +macros (from <tt>ghcconfig.h</tt>) tell you which alignment to use + +<pre> + /* minimum alignment of unsigned int */ + #define ALIGNMENT_UNSIGNED_INT 4 + + /* minimum alignment of long */ + #define ALIGNMENT_LONG 4 + + /* minimum alignment of float */ + #define ALIGNMENT_FLOAT 4 + + /* minimum alignment of double */ + #define ALIGNMENT_DOUBLE 4 +</pre> + +<p><li> Use <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> when +reading/writing ints and ptrs to the stack or heap. Note that, by +definition, <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> are +the same size and have the same alignment constraints even if +<code>sizeof(int) != sizeof(ptr)</code> on that platform. + +<p><li> Use <tt>StgInt8</tt>, <tt>StgInt16</tt>, etc when you need a +certain minimum number of bits in a type. Use <tt>int</tt> and +<tt>nat</tt> when there's no particular constraint. ANSI C only +guarantees that ints are at least 16 bits but within GHC we assume +they are 32 bits. + +<p><li> Use <tt>StgFloat</tt> and <tt>StgDouble</tt> for floating +point values which will go on/have come from the stack or heap. Note +that <tt>StgDouble</tt> may occupy more than one <tt>StgWord</tt>, but +it will always be a whole number multiple. + +<p> +Use <code>PK_FLT(addr)</code>, <code>PK_DBL(addr)</code> to read +<tt>StgFloat</tt> and <tt>StgDouble</tt> values from the stack/heap, +and <code>ASSIGN_FLT(val,addr)</code> / +<code>ASSIGN_DBL(val,addr)</code> to assign StgFloat/StgDouble values +to heap/stack locations. These macros take care of alignment +restrictions. + +<p> +Heap/Stack locations are always <tt>StgWord</tt> aligned; the +alignment requirements of an <tt>StgDouble</tt> may be more than that +of <tt>StgWord</tt>, but we don't pad misaligned <tt>StgDoubles</tt> +because doing so would be too much hassle (see <code>PK_DBL</code> & +co above). + +<p><li> +Avoid conditional code like this: + +<pre> + #ifdef solaris_host_OS + // do something solaris specific + #endif +</pre> + +Instead, add an appropriate test to the configure.ac script and use +the result of that test instead. + +<pre> + #ifdef HAVE_BSD_H + // use a BSD library + #endif +</pre> + +<p>The problem is that things change from one version of an OS to another +- things get added, things get deleted, things get broken, some things +are optional extras. Using "feature tests" instead of "system tests" +makes things a lot less brittle. Things also tend to get documented +better. + +</ul> + +<h2>Debugging/robustness tricks</h2> + + +Anyone who has tried to debug a garbage collector or code generator +will tell you: "If a program is going to crash, it should crash as +soon, as noisily and as often as possible." There's nothing worse +than trying to find a bug which only shows up when running GHC on +itself and doesn't manifest itself until 10 seconds after the actual +cause of the problem. + +<p>We put all our debugging code inside <tt>#ifdef DEBUG</tt>. The +general policy is we don't ship code with debugging checks and +assertions in it, but we do run with those checks in place when +developing and testing. Anything inside <tt>#ifdef DEBUG</tt> should +not slow down the code by more than a factor of 2. + +<p>We also have more expensive "sanity checking" code for hardcore +debugging - this can slow down the code by a large factor, but is only +enabled on demand by a command-line flag. General sanity checking in +the RTS is currently enabled with the <tt>-DS</tt> RTS flag. 
+
+<p>There are a number of RTS flags which control debugging output and
+sanity checking in various parts of the system when <tt>DEBUG</tt> is
+defined.  For example, to get the scheduler to be verbose about what
+it is doing, you would say <tt>+RTS -Ds -RTS</tt>.  See
+<tt>includes/RtsFlags.h</tt> and <tt>rts/RtsFlags.c</tt> for the full
+set of debugging flags.  To check one of these flags in the code,
+write:
+
+<pre>
+  IF_DEBUG(gc, fprintf(stderr, "..."));
+</pre>
+
+This checks the <tt>gc</tt> flag before generating the output (and the
+code is removed altogether if <tt>DEBUG</tt> is not defined).
+
+<p>All debugging output should go to <tt>stderr</tt>.
+
+<p>
+Particular guidelines for writing robust code:
+
+<ul>
+<p><li>
+Use assertions.  Use lots of assertions.  If you write a comment
+that says "takes a +ve number", add an assertion.  If you're casting
+an int to a nat, add an assertion.  If you're casting an int to a char,
+add an assertion.  We use the <tt>ASSERT</tt> macro for writing
+assertions; it goes away when <tt>DEBUG</tt> is not defined.
+
+<p><li>
+Write special debugging code to check the integrity of your data structures.
+(Most of the runtime checking code is in <tt>rts/Sanity.c</tt>)
+Add extra assertions which call this code at the start and end of any
+code that operates on your data structures.
+
+<p><li>
+When you find a hard-to-spot bug, try to think of some assertions,
+sanity checks or whatever that would have made the bug easier to find.
+
+<p><li>
+When defining an enumeration, it's a good idea not to use 0 for normal
+values.  Instead, make 0 raise an internal error.  The idea here is to
+make it easier to detect pointer-related errors on the assumption that
+random pointers are more likely to point to a 0 than to anything else.
+
+<pre>
+typedef enum
+    { i_INTERNAL_ERROR  /* Instruction 0 raises an internal error */
+    , i_PANIC           /* irrefutable pattern match failed! */
+    , i_ERROR           /* user level error */
+
+    ...
+</pre>
+
+<p><li> Use <tt>#warning</tt> or <tt>#error</tt> whenever you write a
+piece of incomplete/broken code.
+
+<p><li> When testing, try to make infrequent things happen often.
+     For example, make a context switch/gc/etc happen every time a
+     context switch/gc/etc can happen.  The system will run like a
+     pig but it'll catch a lot of bugs.
+
+</ul>
+
+<h2>Syntactic details</h2>
+
+<ul>
+<p><li><b>Important:</b> Put "redundant" braces or parens in your code.
+Omitting braces and parens leads to very hard to spot bugs -
+especially if you use macros (and you might have noticed that GHC does
+this a lot!)
+
+<p>
+In particular:
+<ul>
+<p><li>
+Put braces round the body of for loops, while loops, if statements, etc.
+even if they "aren't needed" because it's really hard to find the resulting
+bug if you mess up.  Indent them any way you like but put them in there!
+</ul>
+
+<p><li>
+When defining a macro, always put parens round args - just in case.
+For example, write:
+<pre>
+  #define add(x,y) ((x)+(y))
+</pre>
+instead of
+<pre>
+  #define add(x,y) x+y
+</pre>
+
+<p><li> Don't declare and initialize variables at the same time.
+Separating the declaration and initialization takes more lines, but
+makes the code clearer.
+
+<p><li>
+Use inline functions instead of macros if possible - they're a lot
+less tricky to get right and don't suffer from the usual problems
+of side effects, evaluation order, multiple evaluation, etc.
+
+<ul>
+<p><li>Inline functions get the naming issue right.  E.g.
they
     can have local variables which (in an expression context)
     macros can't.

<p><li> Inline functions have call-by-value semantics whereas macros
     are call-by-name.  You can be bitten by duplicated computation
     if you aren't careful.

<p><li> You can use inline functions from inside gdb if you compile with
     -O0 or -fkeep-inline-functions.  If you use macros, you'd better
     know what they expand to.
</ul>

However, note that macros can serve as both l-values and r-values and
can be "polymorphic", as these examples show:
<pre>
  // you can use this as an l-value or an r-value
  #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo)

  // polymorphic case
  // but note that min(min(1,2),3) does 3 comparisons instead of 2!!
  #define min(x,y) (((x)<=(y)) ? (x) : (y))
</pre>

<p><li>
Inline functions should be "static inline" because:
<ul>
<p><li>
gcc will delete static inlines if they are not used or they're always inlined.

<p><li>
  if they're externed, we could get conflicts between 2 copies of the
  same function if, for some reason, gcc is unable to delete them.
  If they're static, we still get multiple copies but at least they don't conflict.
</ul>

OTOH, the gcc manual says this,
so maybe we should use extern inline?

<pre>
   When a function is both inline and `static', if all calls to the
function are integrated into the caller, and the function's address is
never used, then the function's own assembler code is never referenced.
In this case, GNU CC does not actually output assembler code for the
function, unless you specify the option `-fkeep-inline-functions'.
Some calls cannot be integrated for various reasons (in particular,
calls that precede the function's definition cannot be integrated, and
neither can recursive calls within the definition).  If there is a
nonintegrated call, then the function is compiled to assembler code as
usual.  The function must also be compiled as usual if the program
refers to its address, because that can't be inlined.

   When an inline function is not `static', then the compiler must
assume that there may be calls from other source files; since a global
symbol can be defined only once in any program, the function must not
be defined in the other source files, so the calls therein cannot be
integrated.  Therefore, a non-`static' inline function is always
compiled on its own in the usual fashion.

   If you specify both `inline' and `extern' in the function
definition, then the definition is used only for inlining.  In no case
is the function compiled on its own, not even if you refer to its
address explicitly.  Such an address becomes an external reference, as
if you had only declared the function, and had not defined it.

   This combination of `inline' and `extern' has almost the effect of a
macro.  The way to use it is to put a function definition in a header
file with these keywords, and put another copy of the definition
(lacking `inline' and `extern') in a library file.  The definition in
the header file will cause most calls to the function to be inlined.
If any uses of the function remain, they will refer to the single copy
in the library.
</pre>

<p><li>
Don't define macros that expand to a list of statements.
You could just use braces as in:

<pre>
  #define ASSIGN_CC_ID(ccID)              \
        {                                 \
        ccID = CC_ID;                     \
        CC_ID++;                          \
        }
</pre>

(but it's usually better to use an inline function instead - see above).
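
<p>To see why even the brace-block version is fragile (a sketch;
<tt>profiling</tt> and <tt>id</tt> are illustrative names), consider
what happens when the macro is used where a single statement is
expected:

<pre>
  /* the brace-block macro expands to { ... } ; here - the stray
     semicolon ends the 'if', leaving the 'else' dangling: a
     compile error */
  if (profiling)
      ASSIGN_CC_ID(id);
  else
      id = 0;

  /* if a macro really is needed, the usual C idiom is a
     do-while(0) wrapper, which swallows the semicolon cleanly: */
  #define ASSIGN_CC_ID(ccID) do { (ccID) = CC_ID; CC_ID++; } while (0)
</pre>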

<p><li>
Don't even write macros that expand to 0 statements - they can mess you
up as well.  Use the <tt>doNothing</tt> macro instead.
<pre>
  #define doNothing() do { } while (0)
</pre>

<p><li>
This code
<pre>
int* p, q;
</pre>
looks like it declares two pointers but, in fact, only p is a pointer.
It's safer to write this:
<pre>
int* p;
int* q;
</pre>
You could also write this:
<pre>
int *p, *q;
</pre>
but it is preferable to split the declarations.

<p><li>
Try to use ANSI C's enum feature when defining lists of constants of
the same type.  Among other benefits, you'll notice that gdb uses the
name instead of its (usually inscrutable) number when printing values
with enum types and gdb will let you use the name in expressions you
type.

<p>
Examples:
<pre>
    typedef enum { /* N.B. Used as indexes into arrays */
     NO_HEAP_PROFILING,
     HEAP_BY_CC,
     HEAP_BY_MOD,
     HEAP_BY_GRP,
     HEAP_BY_DESCR,
     HEAP_BY_TYPE,
     HEAP_BY_TIME
    } ProfilingFlags;
</pre>
instead of
<pre>
    # define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */
    # define HEAP_BY_CC        1
    # define HEAP_BY_MOD       2
    # define HEAP_BY_GRP       3
    # define HEAP_BY_DESCR     4
    # define HEAP_BY_TYPE      5
    # define HEAP_BY_TIME      6
</pre>
and
<pre>
    typedef enum {
     CCchar    = 'C',
     MODchar   = 'M',
     GRPchar   = 'G',
     DESCRchar = 'D',
     TYPEchar  = 'Y',
     TIMEchar  = 'T'
    } ProfilingTag;
</pre>
instead of
<pre>
    # define CCchar    'C'
    # define MODchar   'M'
    # define GRPchar   'G'
    # define DESCRchar 'D'
    # define TYPEchar  'Y'
    # define TIMEchar  'T'
</pre>

<p><li> Please keep to 80 columns: the line has to be drawn somewhere,
and by keeping it to 80 columns we can ensure that code looks OK on
everyone's screen.  Long lines are hard to read, and a sign that the
code needs to be restructured anyway.

<p><li> When commenting out large chunks of code, use <code>#if 0
... #endif</code> rather than <code>/* ... */</code> because C doesn't
have nested comments.

<p><li>When declaring a typedef for a struct, give the struct a name
as well, so that other headers can forward-reference the struct name
and it becomes possible to have opaque pointers to the struct.  Our
convention is to name the struct the same as the typedef, but add a
leading underscore.  For example:

<pre>
  typedef struct _Foo {
    ...
  } Foo;
</pre>

<p><li>Do not use <tt>!</tt> instead of explicit comparison against
<tt>NULL</tt> or <tt>'\0'</tt>; the latter is much clearer.

<p><li> We don't care too much about your indentation style but, if
you're modifying a function, please try to use the same style as the
rest of the function (or file).  If you're writing new code, a
tab width of 4 is preferred.

</ul>

<h2>CVS issues</h2>

<ul>
<p><li>
Don't be tempted to reindent or reorganise large chunks of code - it
generates large diffs in which it's hard to see whether anything else
was changed.
<p>
If you must reindent or reorganise, don't include any functional
changes in that commit, and give advance warning that you're about to do
it in case anyone else is changing that file.
</ul>


</body>
</html>
diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html
new file mode 100644
index 0000000000..febe9fe422
--- /dev/null
+++ b/docs/comm/rts-libs/foreignptr.html
@@ -0,0 +1,68 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title>
  </head>

  <body BGCOLOR="FFFFFF">

    <h1>On why we have <tt>ForeignPtr</tt></h1>

    <p>Unfortunately it isn't possible to add a finalizer to a normal
    <tt>Ptr a</tt>.  We already have a generic finalization mechanism:
    see the Weak module in package lang.  But the only reliable way to
    use finalizers is to attach one to an atomic heap object - that
    way the compiler's optimiser can't interfere with the lifetime of
    the object.

    <p>The <tt>Ptr</tt> type is really just a boxed address - it's
    defined like this:

    <pre>
data Ptr a = Ptr Addr#
</pre>

    <p>where <tt>Addr#</tt> is an unboxed native address (just a 32-
    or 64-bit word).  Putting a finalizer on a <tt>Ptr</tt> is
    dangerous, because the compiler's optimiser might remove the box
    altogether.

    <p><tt>ForeignPtr</tt> is defined like this:

    <pre>
data ForeignPtr a = ForeignPtr ForeignObj#
</pre>

    <p>where <tt>ForeignObj#</tt> is a <em>boxed</em> address: it corresponds
    to a real heap object.  The heap object is primitive from the
    point of view of the compiler - it can't be optimised away.  So it
    works to attach a finalizer to the <tt>ForeignObj#</tt> (but not
    to the <tt>ForeignPtr</tt>!).

    <p>There are several primitive objects to which we can attach
    finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>,
    etc.  We have special functions for some of these: e.g.
    <tt>MVar.addMVarFinalizer</tt>.

    <p>So a nicer interface might be something like

<pre>
class Finalizable a where
  addFinalizer :: a -> IO () -> IO ()

instance Finalizable (ForeignPtr a) where ...
instance Finalizable (MVar a) where ...
</pre>

    <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and
    rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>.  The reason for that
    is just efficiency, I think.

    <p><small>
<!-- hhmts start -->
Last modified: Wed Sep 26 09:49:37 BST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html
new file mode 100644
index 0000000000..67a544be85
--- /dev/null
+++ b/docs/comm/rts-libs/multi-thread.html
@@ -0,0 +1,445 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
<head>
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<title>The GHC Commentary - Supporting multi-threaded interoperation</title>
</head>
<body>
<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1>
<em>
<p>
Authors: sof@galois.com, simonmar@microsoft.com<br>
Date: April 2002
</p>
</em>
<p>
This document presents the implementation of an extension to
Concurrent Haskell that provides two enhancements:
</p>
<ul>
<li>A Concurrent Haskell thread may call an external (e.g., C)
function in a manner that's transparent to the execution/evaluation of
other Haskell threads.  Section <a href="#callout">"Calling out"</a> covers this.
</li>
<li>
OS threads may safely call Haskell functions concurrently.  Section
<a href="#callin">"Calling in"</a> covers this.
</li>
</ul>

<!---- *************************************** ----->
<h2 id="callout">The problem: foreign calls that block</h2>
<p>
When a Concurrent Haskell (CH) thread calls a 'foreign import'ed
function, the runtime system (RTS) has to handle this in a manner
transparent to other CH threads.  That is, they shouldn't be blocked
from making progress while the CH thread executes the external
call.  Presently, all threads will block.
</p>
<p>
Clearly, we have to rely on OS-level threads in order to support this
kind of concurrency.  The implementation described here defines the
(abstract) OS threads interface that the RTS assumes.  The implementation
currently provides two instances of this interface, one for POSIX
threads (pthreads) and one for Win32 threads.
</p>

<!---- *************************************** ----->
<h3>Multi-threading the RTS</h3>

<p>
A simple and efficient way to implement non-blocking foreign calls is like this:
<ul>
<li> Invariant: only one OS thread is allowed to
execute code inside the GHC runtime system.  [There are alternative
designs, but I won't go into details on their pros and cons here.]
We'll call the OS thread that is currently running Haskell threads
the <em>Current Haskell Worker Thread</em>.
<p>
The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its
time-slice expires or it blocks on an MVar, then grabs another, and executes
that, and so on.
</p>
<li>
<p>
When the Current Haskell Worker comes to execute a potentially blocking 'foreign
import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that
another OS worker thread is available to become the Current Haskell Worker.
Consequently, even if the external call blocks, the new Current Haskell Worker
continues execution of the other Concurrent Haskell threads.
When the external call eventually completes, the Concurrent Haskell
thread that made the call is passed the result and made runnable
again.
</p>
<p>
<li>
A pool of OS threads is constantly trying to become the Current Haskell Worker.
Only one succeeds at any moment.  If the pool becomes empty, the RTS creates more workers.
<p><li>
The OS worker threads are regarded as interchangeable.  A given Haskell thread
may, during its lifetime, be executed entirely by one OS worker thread, or by more than one.
There's just no way to tell.

<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and
the Current Haskell Worker Thread executes it.  See Section <a href="#callin">"Calling in"</a>.
</ul>
<p>
The rest of this section describes the mechanics of implementing all
this.  There are two parts to it: one describes how a native (OS) thread
leaves the RTS to service the external call, the other how the same
thread handles returning the result of the external call back to the
Haskell thread.
</p>

<!---- *************************************** ----->
<h3>Making the external call</h3>

<p>
Presently, GHC handles 'safe' C calls by effectively emitting the
following code sequence:
</p>

<pre>
    ...save thread state...
    t = suspendThread();
    r = foo(arg1,...,argn);
    resumeThread(t);
    ...restore thread state...
    return r;
</pre>

<p>
After having squirreled away the state of a Haskell thread,
<tt>Schedule.c:suspendThread()</tt> is called, which puts the current
thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>]
containing threads that are currently blocked waiting for external calls
to complete (this is done for the purposes of finding roots when
garbage collecting).
</p>

<p>
In addition to putting the Haskell thread on
<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also
does the following:
</p>
<ul>
<li>Instructs the <em>Task Manager</em> to make sure that there's
another native thread waiting in the wings to take over the execution
of Haskell threads.  This might entail creating a new
<em>worker thread</em> or re-using one that's currently waiting for
more work to do.  The <a href="#taskman">Task Manager</a> section
presents the functionality provided by this subsystem.
</li>

<li>Releases its capability to execute within the RTS.  By doing
so, another worker thread will become unblocked and start executing
code within the RTS.  See the <a href="#capability">Capability</a>
section for details.
</li>

<li><tt>suspendThread()</tt> returns a token which is used to
identify the Haskell thread that was added to
<tt>suspended_ccalling_threads</tt>.  This is done so that once the
external call has completed, we know which Haskell thread to pull off
the <tt>suspended_ccalling_threads</tt> list.
</li>
</ul>

<p>
Upon return from <tt>suspendThread()</tt>, the OS thread is free of
its RTS executing responsibility, and can now invoke the external
call.  Meanwhile, the other worker thread that has now gained access
to the RTS will continue executing Concurrent Haskell code.  Concurrent
'stuff' is happening!
</p>

<!---- *************************************** ----->
<h3>Returning the external result</h3>

<p>
When the native thread eventually returns from the external call,
the result needs to be communicated back to the Haskell thread that
issued the external call.  The following steps take care of this:
</p>

<ul>
<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>,
passing along the token referring to the Haskell thread that made the
call we're returning from.
</li>

<li>
The OS thread then tries to grab hold of a <em>returning worker
capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>.
Until granted, the thread blocks waiting for RTS permissions.  Clearly we
don't want the thread to be blocked longer than it has to, so whenever
a thread that is executing within the RTS enters the Scheduler (which
is quite often, e.g., when a Haskell thread context switch is made),
it checks to see whether it can give up its RTS capability to a
returning worker, which is done by calling
<tt>Capability.c:yieldToReturningWorker()</tt>.
</li>

<li>
If a returning worker is waiting (the code in <tt>Capability.c</tt>
keeps a counter of the number of returning workers that are currently
blocked waiting), it is woken up and given the RTS execution
privileges/capabilities of the worker thread that gave up its own.
</li>

<li>
The thread that gave up its capability then tries to re-acquire
the capability to execute RTS code; this is done by calling
<tt>Capability.c:waitForWorkCapability()</tt>.
</li>

<li>
The returning worker that was woken up will continue execution in
<tt>resumeThread()</tt>, removing its associated Haskell thread
from the <tt>suspended_ccalling_threads</tt> list and starting to evaluate
that thread, passing it the result of the external call.
</li>
</ul>

<!---- *************************************** ----->
<h3 id="rts-exec">RTS execution</h3>

<p>
If a worker thread inside the RTS runs out of runnable Haskell
threads, it goes to sleep waiting for the external calls to complete.
It does this by calling <tt>waitForWorkCapability</tt>.
</p>

<p>
The availability of new runnable Haskell threads is signalled:
</p>

<ul>
<li>When an external call is set up in <tt>suspendThread()</tt>.</li>
<li>When a new Haskell thread is created (e.g., whenever
<tt>Concurrent.forkIO</tt> is called from within Haskell); this is
signalled in <tt>Schedule.c:scheduleThread_()</tt>.
</li>
<li>Whenever a Haskell thread is removed from a 'blocking queue'
attached to an MVar (only?).
</li>
</ul>

<!---- *************************************** ----->
<h2 id="callin">Calling in</h2>

Providing robust support for having multiple OS threads calling into
Haskell is not as involved as its dual.

<ul>
<li>The OS thread issues the call to a Haskell function by going via
the <em>Rts API</em> (as specified in <tt>RtsAPI.h</tt>).
<li>Making the function application requires the construction of a
closure on the heap.  This is done in a thread-safe manner by having
the OS thread lock a designated block of memory (the 'Rts API' block,
which is part of the GC's root set) for the short period of time it
takes to construct the application.
<li>The OS thread then creates a new Haskell thread to execute the
function application, which (eventually) boils down to calling
<tt>Schedule.c:createThread()</tt>.
<li>
Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>,
which asks the Task Manager to possibly create a new worker (OS)
thread to execute the Haskell thread.
<li>
After the OS thread has done this, it blocks waiting for the
Haskell thread to complete the evaluation of the Haskell function.
<p>
The reason why a separate worker thread is made to evaluate the Haskell
function, and not the OS thread that made the call-in via the
Rts API, is that we want that OS thread to return as soon as possible.
We wouldn't be able to guarantee that if the OS thread entered the
RTS to (initially) just execute its function application, as the
Scheduler may side-track it and also ask it to evaluate other Haskell threads.
</li>
</ul>

<p>
<strong>Note:</strong> As of 20020413, the implementation of the RTS API
only serializes access to the allocator between multiple OS threads wanting
to call into Haskell (via the RTS API).  It does not coordinate this access
to the allocator with that of the OS worker thread that's currently executing
within the RTS.  This weakness/bug is scheduled to be tackled as part of an
overhaul/reworking of the RTS API itself.


<!---- *************************************** ----->
<h2>Subsystems introduced/modified</h2>

<p>
These thread extensions affect the Scheduler portions of the runtime
system.  To make it more manageable to work with, the changes
introduced a couple of new RTS 'sub-systems'.  This section presents
the functionality and API of these sub-systems.
</p>

<!---- *************************************** ----->
<h3 id="capability">Capabilities</h3>

<p>
A Capability represents the token required to execute STG code,
and all the state an OS thread/task needs to run Haskell code:
its STG registers, a pointer to its TSO, a nursery, etc.  During
STG execution, a pointer to the capability is kept in a
register (BaseReg).
</p>
<p>
Only in an SMP build will there be multiple capabilities; for
the threaded RTS and other non-threaded builds, there is only
one global capability, namely <tt>MainCapability</tt>.

<p>
The Capability API is as follows:
<pre>
/* Capability.h */
extern void initCapabilities(void);

extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
extern void releaseCapability(Capability* cap);

extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);

extern void grabCapability(Capability** cap);
</pre>

<ul>
<li><tt>initCapabilities()</tt> initialises the subsystem.

<li><tt>grabReturnCapability()</tt> is called by worker threads
returning from an external call.  It blocks them waiting to gain
permission to do so.

<li><tt>waitForWorkCapability()</tt> is called by worker threads
already inside the RTS, but without any work to do.  It blocks them
waiting for new work to become available.

<li><tt>releaseCapability()</tt> hands back a capability.  If a
'returning worker' is waiting, it is signalled that a capability
has become available.  If not, <tt>releaseCapability()</tt> tries
to signal worker threads that are blocked waiting inside
<tt>waitForWorkCapability()</tt> that new work might now be
available.

<li><tt>yieldToReturningWorker()</tt> is called by the worker thread
that's currently inside the Scheduler.  It checks whether there are other
worker threads waiting to return from making an external call.  If so,
they're given preference and a capability is transferred between worker
threads.  One of the waiting 'returning worker' threads is signalled and made
runnable, with the other, yielding, worker blocking to re-acquire
a capability.
</ul>

<p>
The condition variables used to implement the synchronisation between
worker consumers and providers are local to the Capability
implementation.  See the source for details and comments.
</p>

<!---- *************************************** ----->
<h3 id="taskman">The Task Manager</h3>

<p>
The Task Manager API is responsible for managing the creation of
OS worker RTS threads.  When a Haskell thread wants to make an
external call, the Task Manager is asked to possibly create a
new worker thread to take over the RTS-executing capability of
the worker thread that's exiting the RTS to execute the external call.

<p>
The Capability subsystem keeps track of idle worker threads, so
making an informed decision about whether or not to create a new OS
worker thread is easy work for the task manager.  The Task Manager
provides the following API:
</p>

<pre>
/* Task.h */
extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
extern void stopTaskManager ( void );

extern void startTask ( void (*taskStart)(void) );
</pre>

<ul>
<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> start
up and shut down the subsystem.  When starting up, you have the option
to limit the overall number of worker threads that can be
created.
An unbounded (modulo OS thread constraints) number of threads
is created if you pass '0'.
<li><tt>startTask()</tt> is called when a worker thread calls
<tt>suspendThread()</tt> to service an external call, asking another
worker thread to take over its RTS-executing capability.  It is also
called when an external OS thread invokes a Haskell function via the
<em>Rts API</em>.
</ul>

<!---- *************************************** ----->
<h3>Native threads API</h3>

To hide OS details, the following API is used by the task manager and
the scheduler to interact with an OS's threads API:

<pre>
/* OSThreads.h */
typedef <em>..OS specific..</em> Mutex;
extern void initMutex    ( Mutex* pMut );
extern void grabMutex    ( Mutex* pMut );
extern void releaseMutex ( Mutex* pMut );

typedef <em>..OS specific..</em> Condition;
extern void    initCondition      ( Condition* pCond );
extern void    closeCondition     ( Condition* pCond );
extern rtsBool broadcastCondition ( Condition* pCond );
extern rtsBool signalCondition    ( Condition* pCond );
extern rtsBool waitCondition      ( Condition* pCond,
                                    Mutex* pMut );

extern OSThreadId osThreadId      ( void );
extern void shutdownThread        ( void );
extern void yieldThread           ( void );
extern int  createOSThread        ( OSThreadId* tid,
                                    void (*startProc)(void) );
</pre>

<!---- *************************************** ----->
<h2>User-level interface</h2>

To signal that you want an external call to be serviced by a separate
OS thread, you have to add the attribute <tt>threadsafe</tt> to
a foreign import declaration, e.g.,

<pre>
foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
</pre>

<p>
The distinction between 'safe' and thread-safe C calls is made
so that we may call external functions that aren't re-entrant but may
cause a GC to occur.
<p>
The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>.
</p>

<!---- *************************************** ----->
<h2>Building the GHC RTS</h2>

The multi-threaded extension isn't currently enabled by default.  To
have it built, you need to run the <tt>fptools</tt> configure script
with the extra option <tt>--enable-threaded-rts</tt> turned on, and
then proceed to build the compiler as per normal.

<hr>
<small>
<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end -->
</small>
</body> </html>

diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html
new file mode 100644
index 0000000000..627bde8d88
--- /dev/null
+++ b/docs/comm/rts-libs/non-blocking.html
@@ -0,0 +1,133 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Non-blocking I/O on Win32</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Non-blocking I/O on Win32</h1>
    <p>

This note discusses the implementation of non-blocking I/O on
Win32 platforms.  It is not implemented yet (Apr 2002), but it seems worth
capturing the ideas.  Thanks to Sigbjorn for writing them.

<h2>Background</h2>

GHC has provided non-blocking I/O support for Concurrent Haskell
threads on platforms that provide 'UNIX-style' non-blocking I/O for
quite a while.  That is, platforms that let you alter a property of a
file descriptor so that, instead of a thread blocking on an I/O
operation that cannot be immediately satisfied, the operation returns
a special error code (EWOULDBLOCK).
When that happens, the CH
thread that made the blocking I/O request is put into a blocked-on-IO
state (see Foreign.C.Error.throwErrnoIfRetryMayBlock).  The RTS will
in a timely fashion check to see whether I/O is again possible
(via a call to select()), and if it is, unblock the thread & have it
re-try the I/O operation.  The result is that other Concurrent Haskell
threads won't be affected, but can continue operating while a thread
is blocked on I/O.
<p>
Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for
the simple reason that they don't provide the OS facilities described
above.

<h2>Win32 non-blocking I/O, attempt 1</h2>

Win32 does provide something select()-like, namely the
WaitForMultipleObjects() API.  It takes an array of kernel object
handles plus a timeout interval, and waits for either one (or all) of
them to become 'signalled'.  A handle representing an open file (for
reading) becomes signalled once there is input available.
<p>
So, it is possible to observe that I/O is possible using this
function, but not whether there's "enough" to satisfy the I/O request.
So, if we were to mimic select() usage with WaitForMultipleObjects(),
we'd correctly avoid blocking initially, but a thread may very well
block waiting for its I/O request to be satisfied once the file
handle has become signalled.  [There is a fix for this -- only read
and write one byte at a time -- but I'm not advocating that.]


<h2>Win32 non-blocking I/O, attempt 2</h2>

Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is,
asynchronous read and write requests can be made via the ReadFile() /
WriteFile() APIs, specifying position and length of the operation.
If the I/O requests cannot be handled right away, the APIs won't
block, but return immediately (and report ERROR_IO_PENDING as their
status code).
<p>
The completion of the request can be reported in a number of ways:
<ul>
  <li> synchronously, by blocking inside Read/WriteFile().  (This is the
       non-overlapped case, really.)
<p>

  <li> as part of the overlapped I/O request, pass a HANDLE to an event
       object.  The I/O system will signal this event once the request
       has completed, which a waiting thread will then be able to see.
<p>

  <li> by supplying a pointer to a completion routine, which will be
       called as an Asynchronous Procedure Call (APC) whenever a thread
       calls one of a select bunch of 'alertable' APIs.
<p>

  <li> by associating the file handle with an I/O completion port.  Once
       the request completes, the thread servicing the I/O completion
       port will be notified.
</ul>
The use of an I/O completion port looks the most interesting to GHC,
as it provides a central point where all I/O requests are reported.
<p>
Note: asynchronous I/O is only fully supported by OSes based on
the NT codebase; i.e., Win9x doesn't permit async I/O on files and
pipes.  However, Win9x does support async socket operations, and,
I'm currently guessing here, console I/O.  In my view, it would
be acceptable to provide non-blocking I/O support for NT-based
OSes only.
<p>
Here's the design I currently have in mind:
<ul>
<li> Upon startup, an RTS helper thread, whose only purpose is to service
     an I/O completion port, is created.
<p>
<li> All files are opened in 'overlapped' mode, and associated
     with an I/O completion port.
<p>
<li> Overlapped I/O requests are used to implement read() and write().
<p>
<li> If the request cannot be satisfied without blocking, the Haskell
     thread is put on the blocked-on-I/O thread list & a re-schedule
     is made.
<p>
<li> When the completion of a request is signalled via the I/O completion
     port, the RTS helper thread will move the associated Haskell thread
     from the blocked list onto the runnable list.  (Clearly, care
     is required here to have another OS thread mutate internal Scheduler
     data structures.)

<p>
<li> In the event all Concurrent Haskell threads are blocked waiting on
     I/O, the main RTS thread blocks waiting on an event synchronisation
     object, which the helper thread will signal whenever it makes
     a Haskell thread runnable.

</ul>

I might do the communication between the RTS helper thread and the
main RTS thread differently though: rather than have the RTS helper
thread manipulate thread queues itself, thus requiring careful
locking, just have it change a bit on the relevant TSO, which the main
RTS thread can check at regular intervals (in some analog of
awaitEvent(), for example).

    <p><small>
<!-- hhmts start -->
Last modified: Wed Aug 8 19:30:18 EST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html
new file mode 100644
index 0000000000..25407eed43
--- /dev/null
+++ b/docs/comm/rts-libs/prelfound.html
@@ -0,0 +1,57 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Prelude Foundations</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Prelude Foundations</h1>
    <p>
    The standard Haskell Prelude as well as GHC's Prelude extensions are
    constructed from GHC's <a href="primitives.html">primitives</a> in a
    couple of layers.

    <h4><code>PrelBase.lhs</code></h4>
    <p>
    Some of the most elementary Prelude definitions are collected in <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
    In particular, it defines the boxed versions of Haskell primitive types
    - for example, <code>Int</code> is defined as
    <blockquote><pre>
data Int = I# Int#</pre>
    </blockquote>
    <p>
    This says that a boxed integer <code>Int</code> is formed by applying the
    data constructor <code>I#</code> to an <em>unboxed</em> integer of type
    <code>Int#</code>.  Unboxed types are hardcoded in the compiler and
    exported together with the <a href="primitives.html">primitive
    operations</a> understood by GHC.
    <p>
    <code>PrelBase.lhs</code> similarly defines basic types, such as
    boolean values
    <blockquote><pre>
data Bool = False | True deriving (Eq, Ord)</pre>
    </blockquote>
    <p>
    the unit type
    <blockquote><pre>
data () = ()</pre>
    </blockquote>
    <p>
    and lists
    <blockquote><pre>
data [] a = [] | a : [a]</pre>
    </blockquote>
    <p>
    It also contains instance declarations for these types.  In addition,
    <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
    machinery</a> for efficient list handling.
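    <p>
    As a small illustration of how such definitions are used (a sketch:
    <code>plusInt</code> is the kind of definition found in
    <code>PrelBase.lhs</code>, written out here from memory rather than
    quoted from the source), addition on boxed <code>Int</code>s takes
    the boxes apart, adds the unboxed contents with the primitive
    <code>+#</code>, and boxes the result again:
    <blockquote><pre>
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)</pre>
    </blockquote>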

    <p><small>
<!-- hhmts start -->
Last modified: Wed Aug 8 19:30:18 EST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html
new file mode 100644
index 0000000000..4ad6c20338
--- /dev/null
+++ b/docs/comm/rts-libs/prelude.html
@@ -0,0 +1,121 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Cunning Prelude Code</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Cunning Prelude Code</h1>
    <p>
    GHC uses many optimisations and GHC-specific techniques (unboxed
    values, RULES pragmas, and so on) to make the heavily used Prelude code
    as fast as possible.

    <hr>
    <h4>Par, seq, and lazy</h4>

    In GHC.Conc you will find
<blockquote><pre>
  pseq a b = a `seq` lazy b
</pre></blockquote>
    What's this "lazy" thing?  Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting.
    We really mean "evaluate a, then b".  But if the strictness analyser sees that pseq is strict
    in b, then b might be evaluated <em>before</em> a, which is all wrong.
<p>
Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>.  This function is just the identity function,
except that it's put into the built-in environment in MkId.lhs.  That is, the MkId.lhs definition overrides the
inlining and strictness information that comes in from GHC.Base.hi.  And that makes <tt>lazy</tt> look
lazy, and have no inlining.  So the strictness analyser gets no traction.
<p>
In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs),
so we get all the efficiency back.
<p>
This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an
Int#.  Now there is no seq# operator at all.


    <hr>
    <h4>fold/build</h4>
    <p>
    There is a lot of magic in <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
    among other things, the <a
    href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
    pragmas</a> implementing the <a
    href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
    optimisation.  The code for <code>map</code> is
    a good example of how it all works.  In the prelude code for version
    5.03 it reads as follows:
    <blockquote><pre>
map :: (a -> b) -> [a] -> [b]
map _ []     = []
map f (x:xs) = f x : map f xs

-- Note eta expanded
mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
{-# INLINE [0] mapFB #-}
mapFB c f x ys = c (f x) ys

{-# RULES
"map"     [~1] forall f xs.   map f xs              = build (\c n -> foldr (mapFB c f) n xs)
"mapList" [1]  forall f.      foldr (mapFB (:) f) []    = map f
"mapFB"        forall c f g.  mapFB (mapFB c f) g       = mapFB c (f.g)
  #-}</pre>
    </blockquote>
    <p>
    Up to (but not including) phase 1, we use the <code>"map"</code> rule to
    rewrite all saturated applications of <code>map</code> with its
    build/fold form, hoping for fusion to happen.  In phase 1 and 0, we
    switch off that rule, inline build, and switch on the
    <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back
    into plain map.
    <p>
    It's important that these two rules aren't both active at once
    (along with build's unfolding), or else we'd get an infinite loop
    in the rules.  Hence the activation control using explicit phase numbers.
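    <p>
    To make the effect of the <code>"mapFB"</code> rule (discussed next)
    concrete, here is a hand-worked sketch, with the rewrite applied
    manually; <code>mapMap</code> and <code>mapMapFused</code> are names
    invented for this example:
    <blockquote><pre>
mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
mapFB c f x ys = c (f x) ys

-- map f (map g xs), opened up by the "map" rule and fused via
-- foldr/build, becomes the left-hand form - the intermediate list
-- is already gone.  The "mapFB" rule then tidies the nested mapFB
-- into a single one applying the composed function:
mapMap, mapMapFused :: (b -> c) -> (a -> b) -> [a] -> [c]
mapMap      f g xs = foldr (mapFB (mapFB (:) f) g) [] xs
mapMapFused f g xs = foldr (mapFB (:) (f . g)) [] xs</pre>
    </blockquote>
    <p>
    Both are a single traversal of <code>xs</code> computing
    <code>map (f . g) xs</code>; the <code>"mapFB"</code> rule's
    contribution is to collapse the two function applications into one.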
    <p>
    The "mapFB" rule optimises compositions of map.
    <p>
    The mechanism as described above is new in 5.03 since January 2002,
    where the <code>[~</code><i>N</i><code>]</code> syntax for phase number
    annotations at rules was introduced.  Before that the whole arrangement
    was more complicated, as the corresponding prelude code for version
    4.08.1 shows:
    <blockquote><pre>
map :: (a -> b) -> [a] -> [b]
map = mapList

-- Note eta expanded
mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
mapFB c f x ys = c (f x) ys

mapList :: (a -> b) -> [a] -> [b]
mapList _ []     = []
mapList f (x:xs) = f x : mapList f xs

{-# RULES
"map"     forall f xs.  map f xs               = build (\c n -> foldr (mapFB c f) n xs)
"mapFB"   forall c f g. mapFB (mapFB c f) g    = mapFB c (f.g)
"mapList" forall f.     foldr (mapFB (:) f) [] = mapList f
 #-}</pre>
    </blockquote>
    <p>
    This code is structured as it is, because the "map" rule first
    <em>breaks</em> the map <em>open,</em> which exposes it to the various
    foldr/build rules, and if no foldr/build rule matches, the "mapList"
    rule <em>closes</em> it again in a later phase of optimisation - after
    build was inlined.  As a consequence, the whole thing depends a bit on
    the timing of the various optimisations (the map might be closed again
    before any of the foldr/build rules fires).  To make the timing
    deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
    #-}</code> pragma, which delays <code>build</code>'s inlining, and thus,
    the closing of the map.  [NB: Phase numbering was forward at that time.]

    <p><small>
<!-- hhmts start -->
Last modified: Mon Feb 11 20:00:49 EST 2002
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
new file mode 100644
index 0000000000..28abc79426
--- /dev/null
+++ b/docs/comm/rts-libs/primitives.html
@@ -0,0 +1,70 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Primitives</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Primitives</h1>
    <p>
    Most user-level Haskell types and functions provided by GHC (in
    particular those from the Prelude and GHC's Prelude extensions) are
    internally constructed from even more elementary types and functions.
    Most notably, GHC understands a notion of <em>unboxed types,</em> which
    are the Haskell representation of primitive bit-level integer, float,
    etc. types (as opposed to their boxed, heap allocated counterparts) -
    cf. <a
    href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
    Values as First Class Citizens."</a>

    <h4>The Ultimate Source of Primitives</h4>
    <p>
    The hardwired types of GHC are brought into scope by the module
    <code>PrelGHC</code>.  This module only exists in the form of a
    handwritten interface file <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
    which lists the type and function names, as well as instance
    declarations.  The actual types of these names, as well as their
    implementation, are hardwired into GHC.  Note that the names in this file
    are z-encoded, and in particular, identifiers ending in <code>zh</code>
    denote user-level identifiers ending in a hash mark (<code>#</code>),
    which is used to flag unboxed values or functions operating on unboxed
    values.
For example, we have <code>Char#</code>, <code>ord#</code>, and
    so on.

    <h4>The New Primitive Definition Scheme</h4>
    <p>
    As of (about) the development version 4.11, the types and various
    properties of primitive operations are defined in the file <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
    (Personally, I don't think that the <code>.txt</code> suffix is really
    appropriate, as the file is used for automatic code generation; the
    recent addition of <code>.pp</code> means that the file is now mangled
    by cpp.)
    <p>
    The utility <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
    generates a series of Haskell files from <code>primops.txt</code>, which
    encode the types and various properties of the primitive operations as
    compiler-internal data structures.  These Haskell files are not complete
    modules, but program fragments, which are included into compiler modules
    during the GHC build process.  The generated include files can be found
    in the directory <code>fptools/ghc/compiler/</code> and carry names
    matching the pattern <code>primop-*.hs-incl</code>.  They are generated
    during the execution of the <code>boot</code> target in the
    <code>fptools/ghc/</code> directory.  This scheme significantly
    simplifies the maintenance of primitive operations.
    <p>
    As of development version 5.02, the <code>primops.txt</code> file also allows the
    recording of documentation about the intended semantics of the primitives.  This can
    be extracted into a LaTeX document (or rather, into LaTeX document fragments)
    via an appropriate switch to <code>genprimopcode</code>.  In particular, see <code>primops.txt</code>
    for full details of how GHC is configured to cope with different machine word sizes.
    <p><small>
<!-- hhmts start -->
Last modified: Mon Nov 26 18:03:16 EST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
new file mode 100644
index 0000000000..196ec9150d
--- /dev/null
+++ b/docs/comm/rts-libs/stgc.html
@@ -0,0 +1,45 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Spineless Tagless C</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Spineless Tagless C</h1>
    <p>
    The C code generated by GHC avoids higher-level features of C so as
    to control as precisely as possible what code is generated.
    Moreover, it uses special features of gcc (such as first-class labels)
    to produce more efficient code.
    <p>
    STG C makes ample use of C's macro language to define idioms, which also
    reduces the size of the generated C code (thus reducing I/O times).
    These macros are defined in the C headers located in GHC's <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
    directory.

    <h4><code>TailCalls.h</code></h4>
    <p>
    <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
    defines how tail calls are implemented - and, in particular, optimised -
    in GHC-generated code.
The default case, for an architecture for which
    GHC is not optimised, is to use the mini interpreter described in the <a
    href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
    <p>
    For supported architectures, various tricks are used to generate
    assembler implementing proper tail calls.  On i386, gcc's first-class
    labels are used to directly jump to a function pointer.  Furthermore,
    markers of the form <code>--- BEGIN ---</code> and <code>--- END
    ---</code> are added to the assembly right after the function prologue
    and before the epilogue.  These markers are used by <a
    href="../the-beast/mangler.html">the Evil Mangler.</a>

    <p><small>
<!-- hhmts start -->
Last modified: Wed Aug 8 19:28:29 EST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html
new file mode 100644
index 0000000000..499aeec767
--- /dev/null
+++ b/docs/comm/rts-libs/threaded-rts.html
@@ -0,0 +1,126 @@
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title>
  </head>

  <body>
    <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1>

    <p>This section of the commentary explains the structure of the runtime system
    when used in threaded or SMP mode.</p>

    <p>The <em>threaded</em> version of the runtime supports
    bound threads and non-blocking foreign calls, and an overview of its
    design can be found in the paper <a
    href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending
    the Haskell Foreign Function Interface with Concurrency</a>.  To
    compile the runtime with threaded support, add the line

<pre>GhcRTSWays += thr</pre>

    to <tt>mk/build.mk</tt>.  When building C code in the runtime for the threaded way,
    the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the
    build system when building for way <tt>thr</tt>, see
    <tt>mk/config.mk</tt>).  To build a Haskell program
    with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
    can be used in conjunction with <tt>-prof</tt>, and possibly
    <tt>-debug</tt> and others, depending on which versions of the RTS have
    been built).</p>

    <p>The <em>SMP</em> version of the runtime supports the same facilities as the
    threaded version, and in addition supports execution of Haskell code by
    multiple simultaneous OS threads.  For SMP support, both the runtime and
    the libraries must be built a special way: add the lines

    <pre>
GhcRTSWays += thr
GhcLibWays += s</pre>

    to <tt>mk/build.mk</tt>.  To build Haskell code for
    SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in
    conjunction with <tt>-debug</tt>, but no other way-flags at this time).
    When building C code in the runtime for SMP
    support, the symbol <tt>SMP</tt> is defined (this is arranged by the
    compiler when the <tt>-smp</tt> flag is given, see
    <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>

    <p>When building the runtime in either the threaded or SMP ways, the symbol
    <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>

    <h2>Overall design</h2>

    <p>The system is based around the notion of a <tt>Capability</tt>.  A
    <tt>Capability</tt> is an object that represents both the permission to
    execute some Haskell code, and the state required to do so.
In order
    to execute some Haskell code, a thread must therefore hold a
    <tt>Capability</tt>.  The available pool of capabilities is managed by
    the <tt>Capability</tt> API, described below.</p>

    <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
    system, indicating that only a single thread can be executing Haskell
    code at any one time.  In the SMP runtime, there can be an arbitrary
    number of capabilities, selectable at runtime with the <tt>+RTS -N<em>n</em></tt>
    flag; in practice the number is best chosen to be the same as the number of
    processors on the host machine.</p>

    <p>There are a number of OS threads running code in the runtime.  We call
    these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>.
    Tasks are managed by the <tt>Task</tt> subsystem, which is mainly
    concerned with keeping track of statistics such as how much time each
    task spends executing Haskell code, and also keeping track of how many
    tasks are around when we want to shut down the runtime.</p>

    <p>Some tasks are created by the runtime itself, and some may be here
    as a result of a call to Haskell from foreign code (we
    call this an in-call).  The
    runtime can support any number of concurrent foreign in-calls, but the
    number of these calls that will actually run Haskell code in parallel is
    determined by the number of available capabilities.  Each in-call creates
    a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited
    above).</p>

    <p>In the future we may want to bind a <tt>Capability</tt> to a particular
    processor, so that we can support a notion of affinity - avoiding
    accidental migration of work from one CPU to another, so that we can make
    best use of a CPU's local cache.  For now, the design ignores this
    issue.</p>

    <h2>The <tt>OSThreads</tt> interface</h2>

    <p>This interface is merely an abstraction layer over the OS-specific APIs
    for managing threads.  It has two main implementations: Win32 and
    POSIX.</p>

    <p>This is the entirety of the interface:</p>

<pre>
/* Various abstract types */
typedef Mutex;
typedef Condition;
typedef OSThreadId;

extern OSThreadId osThreadId      ( void );
extern void shutdownThread        ( void );
extern void yieldThread           ( void );
extern int  createOSThread        ( OSThreadId* tid,
                                    void (*startProc)(void) );

extern void    initCondition      ( Condition* pCond );
extern void    closeCondition     ( Condition* pCond );
extern rtsBool broadcastCondition ( Condition* pCond );
extern rtsBool signalCondition    ( Condition* pCond );
extern rtsBool waitCondition      ( Condition* pCond,
                                    Mutex* pMut );

extern void initMutex             ( Mutex* pMut );
</pre>

    <h2>The Task interface</h2>

    <h2>The Capability interface</h2>

    <h2>Multiprocessor Haskell Execution</h2>

  </body>
</html>
diff --git a/docs/comm/the-beast/alien.html b/docs/comm/the-beast/alien.html
new file mode 100644
index 0000000000..3d4776ebc9
--- /dev/null
+++ b/docs/comm/the-beast/alien.html
@@ -0,0 +1,56 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Alien Functions</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Alien Functions</h1>
    <p>
    GHC implements experimental (though by now quite well tested)
    support for access to foreign functions and, more generally, the interaction
    between Haskell code and code written in other languages.
Code
    generation in this context can get quite tricky.  This section attempts
    to cast some light on this aspect of the compiler.

    <h4>FFI Stub Files</h4>
    <p>
    For each Haskell module that contains a <code>foreign export
    dynamic</code> declaration, GHC generates a <code>_stub.c</code> file
    that needs to be linked with any program that imports the Haskell
    module.  When asked about it, <a
    href="mailto:simonmar@microsoft.com">Simon Marlow</a> justified the
    existence of these files as follows:
    <blockquote>
      The stub files contain the helper function which invokes the Haskell
      code when called from C.
      <p>
      Each time the foreign export dynamic is invoked to create a new
      callback function, a small piece of code has to be dynamically
      generated (by code in <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Adjustor.c"><code>Adjustor.c</code></a>).  It is the address of this dynamically generated bit of
      code that is returned as the <code>Addr</code> (or <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/Ptr.lhs"><code>Ptr</code></a>).
      When called from C, the dynamically generated code must somehow invoke
      the Haskell function which was originally passed to the
      f.e.d. function -- it does this by invoking the helper function,
      passing it a <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/StablePtr.lhs"><code>StablePtr</code></a>
      to the Haskell function.  It's split this way for two reasons: the
      same helper function can be used each time the f.e.d. function is
      called, and to keep the amount of dynamically generated code to a
      minimum.
    </blockquote>
    <p>
    The stub code is generated by <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsForeign.lhs"><code>DsForeign</code></a><code>.fexportEntry</code>.


    <p><small>
<!-- hhmts start -->
Last modified: Fri Aug 10 11:47:41 EST 2001
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/basicTypes.html b/docs/comm/the-beast/basicTypes.html
new file mode 100644
index 0000000000..ca56d6b6a8
--- /dev/null
+++ b/docs/comm/the-beast/basicTypes.html
@@ -0,0 +1,132 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The Basics</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - The Basics</h1>
    <p>
    The directory <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/"><code>fptools/ghc/compiler/basicTypes/</code></a>
    contains modules that define some of the essential type definitions for
    the compiler - such as identifiers, variables, modules, and unique
    names.  Some of those are discussed in the following.
See elsewhere for more
    detailed information on:
    <ul>
      <li> <a href="vars.html"><code>Var</code>s, <code>Id</code>s, and <code>TyVar</code>s</a>
      <li> <a href="renamer.html"><code>OccName</code>s, <code>RdrName</code>s, and <code>Names</code>s</a>
    </ul>

    <h2>Elementary Types</h2>

    <h4><code>Id</code>s</h4>
    <p>
    An <code>Id</code> (defined in <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id.lhs</code></a>)
    essentially records information about value and data constructor
    identifiers -- to be precise, in the case of data constructors, two
    <code>Id</code>s are used to represent the worker and wrapper functions
    for the data constructor, respectively.  The information maintained in
    the <code>Id</code> abstraction includes, among other items, strictness,
    occurrence, specialisation, and unfolding information.
    <p>
    Due to the way <code>Id</code>s are used for data constructors,
    all <code>Id</code>s are represented as variables, which contain a
    <code>varInfo</code> field of abstract type <code><a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/IdInfo.lhs">IdInfo</a>.IdInfo</code>.
    This is where the information about <code>Id</code>s is really stored.
    The following is a (currently, partial) list of the various items in an
    <code>IdInfo</code>:
    <p>
    <dl>
      <dt><a name="occInfo">Occurrence information</a>
      <dd>The <code>OccInfo</code> data type is defined in the module <a
        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/BasicTypes.lhs"><code>BasicTypes.lhs</code></a>.
        Apart from the trivial <code>NoOccInfo</code>, it distinguishes
        between variables that do not occur at all (<code>IAmDead</code>),
        occur just once (<code>OneOcc</code>), or are <a
        href="simplifier.html#loopBreaker">loop breakers</a>
        (<code>IAmALoopBreaker</code>).
    </dl>

    <h2>Sets, Finite Maps, and Environments</h2>
    <p>
    Sets of variables, or more generally names, which are needed throughout
    the compiler, are provided by the modules <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarSet.lhs"><code>VarSet.lhs</code></a>
    and <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameSet.lhs"><code>NameSet.lhs</code></a>,
    respectively.  Moreover, maps from variables (or names) to
    other data are frequently needed.  For example, a substitution is represented by a
    finite map from variable names to expressions.  Jobs like this are
    handled by means of variable and name environments implemented by the
    modules <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarEnv.lhs"><code>VarEnv.lhs</code></a>
    and <a
    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameEnv.lhs"><code>NameEnv.lhs</code></a>.

    <h4>The Module <code>VarSet</code></h4>
    <p>
    The module <code>VarSet</code> provides the types <code>VarSet</code>,
    <code>IdSet</code>, and <code>TyVarSet</code>, which are synonyms in the
    current implementation, as <code>Var</code>, <code>Id</code>, and
    <code>TyVar</code> are synonyms.  The module provides all the operations
    that one would expect, including the creation of sets from individual
    variables and lists of variables, union and intersection operations,
    element checks, deletion, filter, fold, and map functions.
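    <p>
    To give the flavour of the interface (a sketch: the operation names
    below follow the module's naming convention, but consult
    <code>VarSet.lhs</code> for the authoritative signatures, and
    <code>combinedScope</code>/<code>isInScope'</code> are made-up
    client functions), typical client code combines and queries such
    sets like this:
    <blockquote><pre>
import Var    ( Var )
import VarSet

-- union the variables bound by two nested binders
combinedScope :: VarSet -> VarSet -> VarSet
combinedScope bound1 bound2 = bound1 `unionVarSet` bound2

-- test whether a variable is among them
isInScope' :: Var -> VarSet -> Bool
isInScope' v scope = v `elemVarSet` scope</pre>
    </blockquote>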
    <p>
      The implementation is based on <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqSet.lhs"><code>UniqSet</code></a>s,
      which in turn are simply <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>s
      (i.e., finite maps with uniques as keys) that map each unique to the
      variable that it represents.

    <h4>The Module <code>NameSet</code></h4>
    <p>
      The module <code>NameSet</code> provides the same functionality as
      <code>VarSet</code>, only for <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Name.lhs"><code>Name</code></a>s.
      As for the difference between <code>Name</code>s and <code>Var</code>s,
      a <code>Var</code> is built from a <code>Name</code> plus additional
      information (most importantly, type information).

    <h4>The Module <code>VarEnv</code></h4>
    <p>
      The module <code>VarEnv</code> provides the types <code>VarEnv</code>,
      <code>IdEnv</code>, and <code>TyVarEnv</code>, which are again
      synonyms.  The provided base functionality is similar to
      <code>VarSet</code>, with the main difference that a type <code>VarEnv
      T</code> associates a value of type <code>T</code> with each variable in
      the environment, thus effectively implementing a finite map from
      variables to values of type <code>T</code>.
    <p>
      The implementation of <code>VarEnv</code> is also based on <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>,
      which has the slightly surprising implication that it is
      <em>not</em> possible to retrieve the domain of a variable environment.
      In other words, there is no function corresponding to
      <code>VarSet.varSetElems :: VarSet -> [Var]</code> in
      <code>VarEnv</code>.  This is because the <code>UniqFM</code> used to
      implement <code>VarEnv</code> stores only the unique corresponding to a
      variable in the environment, but not the entire variable (and there is
      no mapping from uniques to variables).
    <p>
      In addition to plain variable environments, the module also contains
      special substitution environments - the type <code>SubstEnv</code> -
      that associate variables with the special-purpose type
      <code>SubstResult</code>.

    <h4>The Module <code>NameEnv</code></h4>
    <p>
      The type <code>NameEnv.NameEnv</code> is like <code>VarEnv</code>, only
      for <code>Name</code>s.

    <p><hr><small>
<!-- hhmts start -->
Last modified: Tue Jan  8 18:29:52 EST 2002
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/coding-style.html b/docs/comm/the-beast/coding-style.html
new file mode 100644
index 0000000000..41347c6902
--- /dev/null
+++ b/docs/comm/the-beast/coding-style.html
@@ -0,0 +1,230 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Coding Style Guidelines</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Coding Style Guidelines</h1>

    <p>This is a rough description of some of the coding practices and
    style that we use for Haskell code inside <tt>ghc/compiler</tt>.

    <p>The general rule is to stick to the same coding style as is
    already used in the file you're editing.  If you must make
    stylistic changes, commit them separately from functional changes,
    so that someone looking back through the change logs can easily
    distinguish them.
    <h2>To literate or not to literate?</h2>

    <p>In GHC we use a mixture of literate (<tt>.lhs</tt>) and
    non-literate (<tt>.hs</tt>) source.  I (Simon M.) prefer to use
    non-literate style, because I think the
    <tt>\begin{code}..\end{code}</tt> brackets clutter up the source
    too much, and I like to use Haddock-style comments (we haven't tried
    processing the whole of GHC with Haddock yet, though).

    <h2>To CPP or not to CPP?</h2>

    <p>We pass all the compiler sources through CPP.  The
    <tt>-cpp</tt> flag is always added by the build system.

    <p>The following CPP symbols are used throughout the compiler:

    <dl>
      <dt><tt>DEBUG</tt></dt>

      <dd>Used to enable extra checks and debugging output in the
      compiler.  The <tt>ASSERT</tt> macro (see <tt>HsVersions.h</tt>)
      provides assertions which disappear when <tt>DEBUG</tt> is not
      defined.

      <p>All debugging output should be placed inside <tt>#ifdef
      DEBUG</tt>; we generally use this to provide warnings about
      strange cases and things that might warrant investigation.  When
      <tt>DEBUG</tt> is off, the compiler should normally be silent
      unless something goes wrong (except when the verbosity level
      is greater than zero).

      <p>A good rule of thumb is that <tt>DEBUG</tt> shouldn't add
      more than about 10-20% to the compilation time.  This is the case
      at the moment.  If it gets too expensive, we won't use it.  For
      more expensive runtime checks, consider adding a flag - see for
      example <tt>-dcore-lint</tt>.
      </dd>

      <dt><tt>GHCI</tt></dt>

      <dd>Enables GHCi support, including the byte code generator and
      interactive user interface.  This isn't the default, because the
      compiler needs to be bootstrapped with itself in order for GHCi
      to work properly.  The reason is that the byte-code compiler and
      linker are quite closely tied to the runtime system, so it is
      essential that GHCi is linked with the most up-to-date RTS.
      Another reason is that the representation of certain datatypes
      must be consistent between GHCi and its libraries, and if these
      were inconsistent then disaster could follow.
      </dd>

    </dl>

    <h2>Platform tests</h2>

    <p>There are three platforms of interest to GHC:

    <ul>
      <li>The <b>Build</b> platform.  This is the platform on which we
      are building GHC.</li>
      <li>The <b>Host</b> platform.  This is the platform on which we
      are going to run this GHC binary, and associated tools.</li>
      <li>The <b>Target</b> platform.  This is the platform for which
      this GHC binary will generate code.</li>
    </ul>

    <p>At the moment, there is very limited support for having
    different values for build, host, and target.  In particular:</p>

    <ul>
      <li>The build platform is currently always the same as the host
      platform.  The build process needs to use some of the tools in
      the source tree, for example <tt>ghc-pkg</tt> and
      <tt>hsc2hs</tt>.</li>

      <li>If the target platform differs from the host platform, then
      this is generally for the purpose of building <tt>.hc</tt> files
      from Haskell source for porting GHC to the target platform.
      Full cross-compilation isn't supported (yet).</li>
    </ul>

    <p>In the compiler's source code, you may make use of the
    following CPP symbols:</p>

    <ul>
      <li><em>xxx</em><tt>_TARGET_ARCH</tt></li>
      <li><em>xxx</em><tt>_TARGET_VENDOR</tt></li>
      <li><em>xxx</em><tt>_TARGET_OS</tt></li>
      <li><em>xxx</em><tt>_HOST_ARCH</tt></li>
      <li><em>xxx</em><tt>_HOST_VENDOR</tt></li>
      <li><em>xxx</em><tt>_HOST_OS</tt></li>
    </ul>

    <p>where <em>xxx</em> is the appropriate value:
    eg. <tt>i386_TARGET_ARCH</tt>.

    <h2>Compiler versions</h2>

    <p>GHC must be compilable by every major version of GHC from 5.02
    onwards, and itself.  It isn't necessary for it to be compilable
    by every intermediate development version (that includes last
    week's CVS sources).

    <p>To maintain compatibility, use <tt>HsVersions.h</tt> (see
    below) where possible, and try to avoid using <tt>#ifdef</tt> in
    the source itself.

    <h2>The source file</h2>

    <p>We now describe a typical source file, annotating stylistic
    choices as we go.

<pre>
{-# OPTIONS ... #-}
</pre>

    <p>An <tt>OPTIONS</tt> pragma is optional, but if present it
    should go right at the top of the file.  Things you might want to
    put in <tt>OPTIONS</tt> include:

    <ul>
      <li><tt>-#include</tt> options to bring into scope prototypes
      for FFI declarations</li>
      <li><tt>-fvia-C</tt> if you know that
      this module won't compile with the native code generator.</li>
    </ul>

    <p>Don't bother putting <tt>-cpp</tt> or <tt>-fglasgow-exts</tt>
    in the <tt>OPTIONS</tt> pragma; these are already added to the
    command line by the build system.


<pre>
module Foo (
   T(..),
   foo,	-- :: T -> T
 ) where
</pre>

    <p>We usually (99% of the time) include an export list.  The only
    exceptions are perhaps where the export list would list absolutely
    everything in the module, and even then sometimes we do it anyway.

    <p>It's helpful to give type signatures inside comments in the
    export list, but hard to keep them consistent, so we don't always
    do that.

<pre>
#include "HsVersions.h"
</pre>

    <p><tt>HsVersions.h</tt> is a CPP header file containing a number
    of macros that help smooth out the differences between compiler
    versions.  It defines, for example, macros for library module
    names which have moved between versions.  Take a look.

<pre>
-- friends
import SimplMonad

-- GHC
import CoreSyn
import Id          ( idName, idType )
import BasicTypes

-- libraries
import DATA_IOREF  ( newIORef, readIORef )

-- std
import List        ( partition )
import Maybe       ( fromJust )
</pre>

    <p>List imports in the following order:

    <ul>
      <li>Local to this subsystem (or directory) first</li>

      <li>Compiler imports, generally ordered from specific to generic
      (ie. modules from <tt>utils/</tt> and <tt>basicTypes/</tt>
      usually come last)</li>

      <li>Library imports</li>

      <li>Standard Haskell 98 imports last</li>
    </ul>

    <p>Import library modules from the <tt>base</tt> and
    <tt>haskell98</tt> packages only.  Use <tt>#defines</tt> in
    <tt>HsVersions.h</tt> when the module names differ between
    versions of GHC (eg. <tt>DATA_IOREF</tt> in the example above).
    For code inside <tt>#ifdef GHCI</tt>, you don't need to worry about
    GHC versioning (because we are bootstrapped).

    <p>We usually use import specs to give an explicit list of the
    entities imported from a module.  The main reason for doing this is
    so that you can search the file for an entity and find which module
    it comes from.
    However, huge import lists can be a pain to
    maintain, so we often omit the import specs when they start to get
    long (actually I start omitting them when they don't fit on one
    line --Simon M.).  Tip: use GHC's <tt>-fwarn-unused-imports</tt>
    flag so that you get notified when an import isn't being used any
    more.

    <p>If the module can be compiled multiple ways (eg. GHCI
    vs. non-GHCI), make sure the imports are properly <tt>#ifdefed</tt>
    too, so as to avoid spurious unused import warnings.

    <p><em>ToDo: finish this</em>
  </body>
</html>
diff --git a/docs/comm/the-beast/data-types.html b/docs/comm/the-beast/data-types.html
new file mode 100644
index 0000000000..fef4852d4d
--- /dev/null
+++ b/docs/comm/the-beast/data-types.html
@@ -0,0 +1,242 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Data types and data constructors</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Data types and data constructors</h1>
    <p>

This chapter was thoroughly revised in February 2003.

<h2>Data types</h2>

Consider the following data type declaration:

<pre>
  data T a = MkT !(a,a) !Int | Nil

  f x = case x of
          MkT p q -> MkT p (q+1)
          Nil     -> Nil
</pre>
The user's source program mentions only the constructors <tt>MkT</tt>
and <tt>Nil</tt>.  However, these constructors actually <em>do</em> something
in addition to building a data value.  For a start, <tt>MkT</tt> evaluates
its arguments.  Secondly, with the flag <tt>-funbox-strict-fields</tt> GHC
will flatten (or unbox) the strict fields.  So we may imagine that there's the
<em>source</em> constructor <tt>MkT</tt> and the <em>representation</em> constructor
<tt>MkT</tt>, and things start to get pretty confusing.
<p>
GHC now generates three unique <tt>Name</tt>s for each data constructor:
<pre>
                                   ---- OccName ------
                                   String   Name space   Used for
  ---------------------------------------------------------------------------
  The "source data con"            MkT      DataName     The DataCon itself
  The "worker data con"            MkT      VarName      Its worker Id
    (aka "representation data con")
  The "wrapper data con"           $WMkT    VarName      Its wrapper Id (optional)
</pre>
Recall that each occurrence name (OccName) is a pair of a string and a
name space (see <a href="names.html">The truth about names</a>), and
two OccNames are considered the same only if both components match.
That is what distinguishes the name of the DataCon from
the name of its worker Id.  To keep things unambiguous, in what
follows we'll write "MkT{d}" for the source data con, and "MkT{v}" for
the worker Id.  (Indeed, when you dump stuff with "-ddumpXXX", if you
also add "-dppr-debug" you'll get stuff like "Foo {- d rMv -}".  The
"d" part is the name space; the "rMv" is the unique key.)
<p>
Each of these three names gets a distinct unique key in GHC's name cache.

<h2>The life cycle of a data type</h2>

Suppose the Haskell source looks like this:
<pre>
  data T a = MkT !(a,a) !Int | Nil

  f x = case x of
          Nil     -> Nil
          MkT p q -> MkT p (q+1)
</pre>
When the parser reads it in, it decides which name space each lexeme comes
from, thus:
<pre>
  data T a = MkT{d} !(a,a) !Int | Nil{d}

  f x = case x of
          Nil{d}     -> Nil{d}
          MkT{d} p q -> MkT{d} p (q+1)
</pre>
Notice that in the Haskell source <em>all data constructors are named via the "source data con" MkT{d}</em>,
whether in pattern matching or in expressions.
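<p>
The string/name-space pairing can be pictured roughly as follows (a sketch
only, not from the original text; the real definitions in
<tt>basicTypes/OccName.lhs</tt> carry more structure):
<pre>
  data NameSpace = VarName | DataName | TvName   -- sketch; GHC has more

  data OccName = OccName NameSpace String
    deriving Eq

  -- MkT{d} and MkT{v} share the string "MkT" but differ in name space:
  --   OccName DataName "MkT"  /=  OccName VarName "MkT"
</pre>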
<p>
In the translated source produced by the type checker (-ddump-tc),
the program looks like this:
<pre>
  f x = case x of
          Nil{d}     -> Nil{v}
          MkT{d} p q -> $WMkT p (q+1)
</pre>
Notice that the type checker replaces the occurrence of MkT by the <em>wrapper</em>, but
the occurrence of Nil by the <em>worker</em>.  Reason: Nil doesn't have a wrapper because there is
nothing to do in the wrapper (this is by far the most common case).
<p>
Though they are not printed out by "-ddump-tc", behind the scenes, there are
also the following: the data type declaration and the wrapper function for MkT.
<pre>
  data T a = MkT{d} a a Int# | Nil{d}

  $WMkT :: (a,a) -> Int -> T a
  $WMkT p i = case p of
                (a,b) -> case i of
                           I# i# -> MkT{v} a b i#
</pre>
Here, the <em>wrapper</em> <tt>$WMkT</tt> evaluates and takes apart the argument <tt>p</tt>,
evaluates and unboxes the argument <tt>i</tt>, and builds a three-field data value
with the <em>worker</em> constructor <tt>MkT{v}</tt>.  (There are more notes below
about the unboxing of strict fields.)  The wrapper <tt>$WMkT</tt> is called an <em>implicit binding</em>,
because it's introduced implicitly by the data type declaration (record selectors
are also implicit bindings, for example).  Implicit bindings are injected into the code
just before emitting code or External Core.
<p>
After desugaring into Core (-ddump-ds), the definition of <tt>f</tt> looks like this:
<pre>
  f x = case x of
          Nil{d} -> Nil{v}
          MkT{d} a b r -> let { p = (a,b); q = I# r } in
                          $WMkT p (q+1)
</pre>
Notice the way that pattern matching has been desugared to take account of the fact
that the "real" data constructor MkT has three fields.
<p>
By the time the simplifier has had a go at it, <tt>f</tt> will be transformed to:
<pre>
  f x = case x of
          Nil{d} -> Nil{v}
          MkT{d} a b r -> MkT{v} a b (r +# 1#)
</pre>
Which is highly cool.


<h2> The constructor wrapper functions </h2>

The wrapper functions are automatically generated by GHC, and are
really emitted into the result code (albeit only after CorePrep; see
<tt>CorePrep.mkImplicitBinds</tt>).
The wrapper functions are inlined very
vigorously, so you will not see many occurrences of the wrapper
functions in an optimised program, but you may see some.  For example,
if your Haskell source has
<pre>
  map MkT xs
</pre>
then <tt>$WMkT</tt> will not be inlined (because it is not applied to anything).
That is why we generate real top-level bindings for the wrapper functions,
and generate code for them.


<h2> The constructor worker functions </h2>

Saturated applications of the constructor worker function MkT{v} are
treated specially by the code generator; they really do allocation.
However, we do want a single, shared, top-level definition for
top-level nullary constructors (like True and False).  Furthermore,
what if the code generator encounters a non-saturated application of a
worker?  E.g. <tt>(map Just xs)</tt>.  We could declare that to be an
error (CorePrep should saturate them).  But instead we currently
generate a top-level definition for each constructor worker, whether
nullary or not.  It takes the form:
<pre>
  MkT{v} = \ p q r -> MkT{v} p q r
</pre>
This is a real hack.  The occurrence on the RHS is saturated, so the code generator (both the
one that generates abstract C and the byte-code generator) treats it as a special case and
allocates a MkT; it does not make a recursive call!  So now there's a top-level curried
version of the worker which is available to anyone who wants it.
<p>
This strange definition is not emitted into External Core.  Indeed, you might argue that
we should instead pass the list of <tt>TyCon</tt>s to the code generator and have it
generate magic bindings directly.  As it stands, it's a real hack: see the code in
CorePrep.mkImplicitBinds.


<h2> External Core </h2>

When emitting External Core, we should see this for our running example:

<pre>
  data T a = MkT a a Int# | Nil

  $WMkT :: (a,a) -> Int -> T a
  $WMkT p i = case p of
                (a,b) -> case i of
                           I# i# -> MkT a b i#

  f x = case x of
          Nil -> Nil
          MkT a b r -> MkT a b (r +# 1#)
</pre>
Notice that it makes perfect sense as a program all by itself.  Constructors
look like constructors (albeit not identical to the original Haskell ones).
<p>
When reading in External Core, the parser is careful to read it back in just
as it was before it was spat out, namely:
<pre>
  data T a = MkT{d} a a Int# | Nil{d}

  $WMkT :: (a,a) -> Int -> T a
  $WMkT p i = case p of
                (a,b) -> case i of
                           I# i# -> MkT{v} a b i#

  f x = case x of
          Nil{d} -> Nil{v}
          MkT{d} a b r -> MkT{v} a b (r +# 1#)
</pre>


<h2> Unboxing strict fields </h2>

If GHC unboxes strict fields (as in the first argument of <tt>MkT</tt> above),
it also transforms
source-language case expressions.  Suppose you write this in your Haskell source:
<pre>
  case e of
    MkT p t -> ..p..t..
</pre>
GHC will desugar this to the following Core code:
<pre>
  case e of
    MkT a b t -> let p = (a,b) in ..p..t..
</pre>
The local let-binding reboxes the pair because it may be mentioned in
the case alternative.  This may well be a bad idea, which is why
<tt>-funbox-strict-fields</tt> is an experimental feature.
<p>
It's essential that when importing a type <tt>T</tt> defined in some
external module <tt>M</tt>, GHC knows what representation was used for
that type, and that in turn depends on whether module <tt>M</tt> was
compiled with <tt>-funbox-strict-fields</tt>.  When writing an
interface file, GHC therefore records with each data type whether its
strict fields (if any) should be unboxed.

<h2> Labels and info tables </h2>

<em>Quick rough notes: SLPJ March 2003</em>.
<p>
Every data constructor <tt>C</tt> has two info tables:
<ul>
<li> The static info table (label <tt>C_static_info</tt>), used for statically-allocated constructors.

<li> The dynamic info table (label <tt>C_con_info</tt>), used for dynamically-allocated constructors.
</ul>
Statically-allocated constructors are not moved by the garbage collector, and therefore have a different closure
type from dynamically-allocated constructors; hence they need
a distinct info table.
Both info tables share the same entry code, but since the entry code is physically juxtaposed with the
info table, it must be duplicated (<tt>C_static_entry</tt> and <tt>C_con_entry</tt> respectively).
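<p>
As a small illustration of when each info table comes into play (a sketch,
not from the original notes; real labels are z-encoded and module-qualified):
<pre>
  answer :: Maybe Int
  answer = Just 42     -- top-level, closed constructor application:
                       -- laid out statically, described by the
                       -- static info table (the C_static_info variant)

  wrap :: Int -> Maybe Int
  wrap n = Just n      -- allocates a fresh Just cell in the heap on
                       -- each call, described by the dynamic info
                       -- table (the C_con_info variant)
</pre>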

  </body>
</html>
diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html
new file mode 100644
index 0000000000..a66740259b
--- /dev/null
+++ b/docs/comm/the-beast/desugar.html
@@ -0,0 +1,156 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
    <p>
      Up until after type checking, GHC keeps the source program in an
      abstract representation of Haskell source without removing any of the
      syntactic sugar (such as list comprehensions) that could easily be
      represented by more primitive Haskell.  This complicates part of the
      front-end considerably as the abstract syntax of Haskell (as exported by
      the module <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>)
      is much more complex than a simplified representation close to, say, the
      <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell
      Kernel</a> would be.  However, having a representation that is as close
      as possible to the surface syntax simplifies the generation of clear
      error messages.  As GHC (quite in contrast to "conventional" compilers)
      prints code fragments as part of error messages, the choice of
      representation is especially important.
    <p>
      Nonetheless, as soon as the input has passed all static checks, it is
      transformed into GHC's principal intermediate language that goes by the
      name of <em>Core</em> and whose representation is exported by the
      module <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>.
      All following compiler phases, except code generation, operate on Core.
      Due to Andrew Tolmach's effort, there is also an <a
      href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external
      representation for Core.</a>
    <p>
      The conversion of the compiled module from <code>HsSyn</code> into that
      of <code>CoreSyn</code> is performed by a phase called the
      <em>desugarer</em>, which is located in
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>.
      Its operation is detailed in the following.
    </p>

    <h2>Auxiliary Functions</h2>
    <p>
      The module <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a>
      defines the desugarer monad (of type <code>DsM</code>) which maintains
      the environment needed for desugaring.  In particular, it encapsulates a
      unique supply for generating new variables, a map to look up standard
      names (such as functions from the prelude), a source location for error
      messages, and a pool to collect warning messages generated during
      desugaring.  Initialisation of the environment happens in the function <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>,
      which is also the main entry point into the desugarer.
    <p>
      The generation of Core code often involves the use of standard functions
      for which proper identifiers (i.e., values of type <code>Id</code> that
      actually refer to the definition in the right Prelude) need to be
      obtained.
      This is supported by the function
      <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.

    <h2><a name="patmat">Pattern Matching</a></h2>
    <p>
      Nested pattern matching with guards and everything is translated into
      the simple, flat case expressions of Core by the following modules:
    <dl>
      <dt><a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>:
      <dd>This module contains the main pattern-matching compiler in the form
	of a function called <code>match</code>.  There is some documentation
	as to how <code>match</code> works contained in the module itself.
	Generally, the implemented algorithm is similar to the one described
	in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
	Implementation of Functional Programming Languages</em>.
	<code>Match</code> exports a couple of functions whose names are not
	particularly intuitive.  In particular, it exports <code>match</code>,
	<code>matchWrapper</code>, <code>matchExport</code>, and
	<code>matchSimply</code>.  The function <code>match</code>, which is
	the main work horse, is only used by the other matching modules.  The
	function <code>matchExport</code> - despite its name - is merely used
	internally in <code>Match</code> and handles warning messages (see
	below for more details).  The actual interface to the outside is
	<code>matchWrapper</code>, which converts the output of the type
	checker into the form needed by the pattern matching compiler (i.e., a
	list of <code>EquationInfo</code>).  Similar in function to
	<code>matchWrapper</code> is <code>matchSimply</code>, which provides
	an interface for the case where a single expression is to be matched
	against a single pattern (as, for example, is the case in bindings in
	a <code>do</code> expression).
      <dt><a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>:
      <dd>This module generates code for a set of alternative constructor
	patterns that belong to a single type by means of the routine
	<code>matchConFamily</code>.  More precisely, the routine gets a set
	of equations where the left-most pattern of each equation is a
	constructor pattern with a head symbol from the same type as that of
	all the other equations.  A Core case expression is generated that
	distinguishes between all these constructors.  The routine is clever
	enough to generate a sparse case expression and to add a catch-all
	default case only when needed (i.e., if the case expression isn't
	exhaustive already).  There is also an explanation at the start of the
	module.
      <dt><a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>:
      <dd>Generates code for a set of alternative literal patterns by means of
	the routine <code>matchLiterals</code>.  The principle is similar to
	that of <code>matchConFamily</code>, but all left-most patterns are
	literals of the same type.
      <dt><a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>:
      <dd>This module provides a set of auxiliary definitions as well as the
	data types <code>EquationInfo</code> and <code>MatchResult</code> that
	form the input and output, respectively, of the pattern matching
	compiler.
      <dt><a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>:
      <dd>This module does not really contribute to compiling pattern
	matching; instead, it inspects sets of equations to find whether there are
	any overlapping patterns or non-exhaustive pattern sets.  This task is
	implemented by the function <code>check</code>, which returns a list of
	patterns that are part of a non-exhaustive case distinction as well as a
	set of equation labels that can be reached during execution of the code;
	thus, the remaining equations are shadowed due to overlapping patterns.
	The function <code>check</code> is invoked and its result converted into
	suitable warning messages by the function <code>Match.matchExport</code>
	(which is a wrapper for <code>Match.match</code>).
    </dl>
    <p>
      The central function <code>match</code>, given a set of equations,
      proceeds in a number of steps:
      <ol>
	<li>It starts by desugaring the left-most pattern of each equation using
	  the function <code>tidy1</code> (indirectly via
	  <code>tidyEqnInfo</code>).  During this process, non-elementary
	  patterns (e.g., those using explicit list syntax <code>[x, y, ...,
	  z]</code>) are converted to standard constructor patterns and
	  irrefutable patterns are removed.
	<li>Then, a process called <em>unmixing</em> clusters the equations into
	  blocks (without re-ordering them), such that the left-most patterns of
	  all equations in a block are either all variables, all literals, or
	  all constructors.
	<li>Each block is then compiled by <code>matchUnmixedEqns</code>,
	  which forwards the handling of literal pattern blocks to
	  <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
	  <code>MatchCon.matchConFamily</code>, and hands variable pattern
	  blocks back to <code>match</code>.
      </ol>

    <p><hr><small>
<!-- hhmts start -->
Last modified: Mon Feb 11 22:35:25 EST 2002
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/driver.html b/docs/comm/the-beast/driver.html
new file mode 100644
index 0000000000..fbf65e33e7
--- /dev/null
+++ b/docs/comm/the-beast/driver.html
@@ -0,0 +1,179 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The Glorious Driver</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - The Glorious Driver</h1>
    <p>
      The Glorious Driver (GD) is the part of GHC that orchestrates the
      interaction of all the other pieces that make up GHC.  It supersedes the
      <em>Evil Driver (ED),</em> which was a Perl script that served the same
      purpose and was in use until version 4.08.1 of GHC.  Simon Marlow
      eventually slew the ED and instated the GD.  The GD is usually called
      the <em>Compilation Manager</em> these days.
    </p>
    <p>
      The GD has been substantially extended for GHCi, i.e., the interactive
      variant of GHC that, since version 5.00, integrates the compiler with a
      (meta-circular) interpreter.  Most of the driver is located in the
      directory
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/"><code>fptools/ghc/compiler/main/</code></a>.
    </p>

    <h2>Command Line Options</h2>
    <p>
      GHC's many flavours of command line options make the code interpreting
      them rather involved.  The following provides a brief overview of the
      processing of these options.
      Since the addition of the interactive
      front-end to GHC, there are two kinds of options: <em>static
      options</em> and <em>dynamic options.</em>  The former can only be set
      when the system is invoked, whereas the latter can be altered in the
      course of an interactive session.  A brief explanation of the difference
      between these options and related matters is at the start of the module
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>.
      The same module defines the enumeration <code>DynFlag</code>, which
      contains all dynamic flags.  Moreover, there is the labelled record
      <code>DynFlags</code> that collects all the flag-related information
      that is passed by the compilation manager to the compiler proper,
      <code>hsc</code>, whenever a compilation is triggered.  If you'd like to
      find out whether an option is static, use the predicate
      <code>isStaticHscFlag</code> in the same module.
    <p>
      The second module that contains a lot of code related to the management
      of flags is <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverFlags.hs"><code>DriverFlags.hs</code></a>.
      In particular, the module contains two association lists that map the
      textual representation of the various flags to a data structure that
      tells the driver how to parse the flag (e.g., whether it has any
      arguments) and provides its internal representation.  All static flags
      are contained in <code>static_flags</code>.  A whole range of
      <code>-f</code> flags can be negated by adding a <code>-f-no-</code>
      prefix.  These flags are contained in the association list
      <code>fFlags</code>.
    <p>
      The driver uses a nasty hack based on <code>IORef</code>s that permits
      the rest of the compiler to access static flags as CAFs; i.e., there is
      a family of toplevel variable definitions in
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>,
      below the literate section heading <i>Static options</i>, each of which
      contains the value of one static option.  This is essentially realised
      via global variables (in the sense of C-style, updatable, global
      variables) defined via an evil pre-processor macro named
      <code>GLOBAL_VAR</code>, which is defined in a particularly ugly corner
      of GHC, namely the C header file
      <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/HsVersions.h"><code>HsVersions.h</code></a>.

    <h2>What Happens When</h2>
    <p>
      Inside the Haskell compiler proper (<code>hsc</code>), a whole series of
      stages (``passes'') are executed in order to transform your Haskell program
      into C or native code.  This process is orchestrated by
      <code>main/HscMain.hscMain</code> and its relative
      <code>hscReComp</code>.  The latter directly invokes, in order,
      the parser, the renamer, the typechecker, the desugarer, the
      simplifier (Core2Core), the CoreTidy pass, the CorePrep pass,
      conversion to STG (CoreToStg), the interface generator
      (MkFinalIface), the code generator, and code output.  The
      simplifier is the most complex of these, and is made up of many
      sub-passes.  These are controlled by <code>buildCoreToDo</code>,
      as described below.

    <h2>Scheduling Optimisation Phases</h2>
    <p>
      GHC has a large variety of optimisations at its disposal, many of which
      have subtle interdependencies.
      The overall plan for program
      optimisation is fixed in <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState.hs</code></a>.
      First of all, there is the variable <code>hsc_minusNoO_flags</code> that
      determines the <code>-f</code> options that you get without
      <code>-O</code> (aka optimisation level 0) as well as
      <code>hsc_minusO_flags</code> and <code>hsc_minusO2_flags</code> for
      <code>-O</code> and <code>-O2</code>.
    <p>
      However, most of the strategic decisions about optimisations on the
      intermediate language Core are encoded in the value produced by
      <code>buildCoreToDo</code>, which is a list with elements of type
      <code>CoreToDo</code>.  Each element of this list specifies one step in
      the sequence of core optimisations executed by the <a
      href="simplifier.html">Mighty Simplifier</a>.  The type
      <code>CoreToDo</code> is defined in <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts.lhs</code></a>.
      The actual execution of the optimisation plan produced by
      <code>buildCoreToDo</code> is performed by <a
      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore</code></a><code>.doCorePasses</code>.
      Core optimisation plans consist of a number of simplification phases
      (currently three for optimisation levels of 1 or higher) with
      decreasing phase numbers (the lowest, namely 0, corresponding to the
      last phase).  Before and after these phases, optimisations such as
      specialisation, let floating, worker/wrapper, and so on are executed.
      The sequence of phases is such that the synergistic effect of the phases
      is maximised -- however, this is a fairly fragile arrangement.
    <p>
      There is a similar construction for optimisations at the STG level stored in
      the variable <code>buildStgToDo :: [StgToDo]</code>.  However, this is a
      lot less complex than the arrangement for Core optimisations.

    <h2>Linking the <code>RTS</code> and <code>libHSstd</code></h2>
    <p>
      Since the RTS and HSstd refer to each other, there is a Cunning
      Hack to avoid putting them each on the command-line twice or
      thrice (aside: try asking for `plaice and chips thrice' in a
      fish and chip shop; bet you only get two lots).  The hack involves
      adding
      the symbols that the RTS needs from libHSstd, such as
      <code>PrelWeak_runFinalizzerBatch_closure</code> and
      <code>__stginit_Prelude</code>, to the link line with the
      <code>-u</code> flag.  The standard library appears before the
      RTS on the link line, and these options cause the corresponding
      symbols to be picked up even though the linker might not have seen
      them being used, as the RTS appears later on the link line.  As a result,
      when the RTS is also scanned, these symbols are already resolved.  This
      avoids the linker having to read the standard library and RTS
      multiple times.
    </p>
    <p>
      This does, however, lead to a complication.  Normal Haskell
      programs do not have a <code>main()</code> function, so this is
      supplied by the RTS (in the file
      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Main.c"><code>Main.c</code></a>).
      It calls <code>startupHaskell</code>, which
      itself calls <code>__stginit_PrelMain</code>, which is therefore,
      since it occurs in the standard library, one of the symbols
      passed to the linker using the <code>-u</code> option.
      This is fine
      for standalone Haskell programs, but as soon as the Haskell code is only
      used as part of a program implemented in a foreign language, the
      <code>main()</code> function of that foreign language should be used
      instead of that of the Haskell runtime.  In this case, the previously
      described arrangement unfortunately fails as
      <code>__stginit_PrelMain</code> had better not be linked in,
      because it tries to call <code>__stginit_Main</code>, which won't
      exist.  In other words, the RTS's <code>main()</code> refers to
      <code>__stginit_PrelMain</code> which in turn refers to
      <code>__stginit_Main</code>.  Although the RTS's <code>main()</code>
      might not be linked in if the program provides its own, the driver
      will normally force <code>__stginit_PrelMain</code> to be linked in anyway,
      using <code>-u</code>, because it's a back-reference from the
      RTS to HSstd.  This case is coped with by the <code>-no-hs-main</code>
      flag, which suppresses passing the corresponding <code>-u</code> option
      to the linker -- although in some versions of the compiler (e.g., 5.00.2)
      it didn't work.  In addition, the driver generally places the C program
      providing the <code>main()</code> that we want to use before the RTS
      on the link line.  Therefore, the RTS's main is never used and
      without the <code>-u</code> the label <code>__stginit_PrelMain</code>
      will not be linked in.
    </p>

    <p><small>
<!-- hhmts start -->
Last modified: Tue Feb 19 11:09:00 UTC 2002
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html
new file mode 100644
index 0000000000..956043bafb
--- /dev/null
+++ b/docs/comm/the-beast/fexport.html
@@ -0,0 +1,231 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - foreign export</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - foreign export</h1>

    The implementation scheme for foreign export, as of 27 Feb 02, is
    as follows.  There are four cases, of which the first two are easy.
    <p>
    <b>(1) static export of an IO-typed function from some module <code>MMM</code></b>
    <p>
    <code>foreign export foo :: Int -> Int -> IO Int</code>
    <p>
    For this we generate no Haskell code.  However, a C stub is
    generated, and it looks like this:
    <p>
<pre>
extern StgClosure* MMM_foo_closure;

HsInt foo (HsInt a1, HsInt a2)
{
   SchedulerStatus rc;
   HaskellObj ret;
   rc = rts_evalIO(
           rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)),
                     rts_mkInt(a2)
           ),
           &ret
        );
   rts_checkSchedStatus("foo",rc);
   return(rts_getInt(ret));
}
</pre>
    <p>
    This does the obvious thing: builds in the heap the expression
    <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it,
    and uses <code>rts_getInt</code> to fish out the result.

    <p>
    <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b>
    <p>
    <code>foreign export foo :: Int -> Int -> Int</code>
    <p>
    This is identical to case (1), with the sole difference that the
    stub calls <code>rts_eval</code> rather than
    <code>rts_evalIO</code>.
    <p>

    <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b>
    <p>
    <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code>
    <p>
    Dynamic exports are a whole lot more complicated than their static
    counterparts.
    <p>
    First of all, we get some Haskell code, which, when given a
    function <code>callMe :: (Int -> Int -> IO Int)</code> to be made
    C-callable, IO-returns a <code>FunPtr a</code>, which is the
    address of the resulting C-callable code.  This address can now be
    handed out to the C-world, and callers to it will get routed
    through to <code>callMe</code>.
    <p>
    The generated Haskell function looks like this:
    <p>
<pre>
mkCallback f
   = do sp <- mkStablePtr f
        r <- ccall "createAdjustorThunk" sp (&"run_mkCallback")
        return r
</pre>
    <p>
    <code>createAdjustorThunk</code> is a gruesome,
    architecture-specific function in the RTS.  It takes a stable
    pointer to the Haskell function to be run, and the address of the
    associated C wrapper, and returns a piece of machine code,
    which, when called from the outside (C) world, eventually calls
    through to <code>f</code>.
    <p>
    This machine code fragment is called the "Adjustor Thunk" (don't
    ask me why).  What it does is simply to call onwards to the C
    helper
    function <code>run_mkCallback</code>, passing all the args given
    to it but also conveying <code>sp</code>, which is a stable
    pointer
    to the Haskell function to run.  So:
    <p>
<pre>
createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn )
{
   create malloc'd piece of machine code "mc", behaving thusly:

   mc ( args_to_mc )
   {
      jump to addr_of_helper_C_fn, passing sp as an additional
      argument
   }

   return the address of "mc";
}
</pre>
    <p>
    This is a horrible hack, because there is no portable way, even at
    the machine code level, to write a function which adds one argument
    and then transfers onwards to another C function.  On x86s args are
    pushed R to L onto the stack, so we can just push <code>sp</code>,
    fiddle around with return addresses, and jump onwards to the
    helper C function.  However, on architectures which use register
    windows and/or pass args extensively in registers (Sparc, Alpha,
    MIPS, IA64), this scheme borders on the unviable.  GHC has a
    limited <code>createAdjustorThunk</code> implementation for Sparc
    and Alpha, which handles only the cases where all args, including
    the extra one, fit in registers.
    <p>
    Anyway: the other lump of code generated as a result of a
    f-x-dynamic declaration is the C helper stub.  This is basically
    the same as in the static case, except that it only ever gets
    called from the adjustor thunk, and therefore must accept,
    as an extra argument, a stable pointer to the Haskell function
    to run, naturally enough, as this is not known until run-time.
    It then dereferences the stable pointer and does the call in
    the same way as the f-x-static case:
<pre>
HsInt Main_d1kv ( StgStablePtr the_stableptr,
                  void* original_return_addr,
                  HsInt a1, HsInt a2 )
{
   SchedulerStatus rc;
   HaskellObj ret;
   rc = rts_evalIO(
           rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr),
                               rts_mkInt(a1)
                     ),
                     rts_mkInt(a2)
           ),
           &ret
        );
   rts_checkSchedStatus("Main_d1kv",rc);
   return(rts_getInt(ret));
}
</pre>
    <p>
    Note how this function has a purely made-up name
    <code>Main_d1kv</code>, since unlike the f-x-static case, this
    function is never called from user code, only from the adjustor
    thunk.
    <p>
    Note also how the function takes a bogus parameter
    <code>original_return_addr</code>, which is part of this extra-arg
    hack.  The usual scheme is to leave the original caller's return
    address in place and merely push the stable pointer above that,
    hence the spare parameter.
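    <p>
    To recap the user's view of case (3), here is a sketch (not from the
    original notes): <code>passToC</code> is a hypothetical foreign import
    that hands the code address to C, and <code>freeHaskellFunPtr</code>
    (from the FFI libraries) is what ultimately releases the adjustor
    thunk, a point taken up next.
<pre>
demo :: IO ()
demo = do
   fp <- mkCallback (\x y -> return (x + y))  -- builds an adjustor thunk
   passToC fp              -- hand the code address to C (hypothetical)
   freeHaskellFunPtr fp    -- reclaim the adjustor thunk's memory
</pre>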
    <p>
    Finally, there is some extra trickery, detailed in
    <code>ghc/rts/Adjustor.c</code>, to get round the following
    problem: the adjustor thunk lives in mallocville.  It is
    quite possible that the Haskell code will actually
    call <code>free()</code> on the adjustor thunk used to get to it
    -- because otherwise there is no way to reclaim the space used
    by the adjustor thunk.  That's all very well, but it means that
    the C helper cannot return to the adjustor thunk in the obvious
    way, since we've already given it back using <code>free()</code>.
    So we leave, on the C stack, the address of whoever called the
    adjustor thunk, and before calling the helper, mess with the stack
    such that when the helper returns, it returns directly to the
    adjustor thunk's caller.
    <p>
    That's how the <code>stdcall</code> convention works.  If the
    adjustor thunk has been called using the <code>ccall</code>
    convention, we return indirectly, via a statically-allocated
    yet-another-magic-piece-of-code, which takes care of removing the
    extra argument that the adjustor thunk pushed onto the stack.
    This is needed because in <code>ccall</code>-world, it is the
    caller who removes args after the call, and the original caller of
    the adjustor thunk has no way to know about the extra arg pushed
    by the adjustor thunk.
    <p>
    You didn't really want to know all this stuff, did you?
    <p>

    <b>(4) dynamic export of a non-IO-typed function from some module <code>MMM</code></b>
    <p>
    <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code>
    <p>
    (4) relates to (3) as (2) relates to (1); that is, it's identical,
    except the C stub uses <code>rts_eval</code> instead of
    <code>rts_evalIO</code>.
    <p>

    <h2>Some perspective on f-x-dynamic</h2>

    The only really horrible problem with f-x-dynamic is how the
    adjustor thunk should pass to the C helper the stable pointer to
    use.  Ideally we would like this to be conveyed via some invisible
    side channel, since then the adjustor thunk could simply jump
    directly to the C helper, with no non-portable stack fiddling.
    <p>
    Unfortunately there is no obvious candidate for the invisible
    side-channel.  We've chosen to pass it on the stack, with the
    bad consequences detailed above.  Another possibility would be to
    park it in a global variable, but this is non-reentrant and
    non-(OS-)thread-safe.  A third idea is to put it into a callee-saves
    register, but that has problems too: the C helper may not use that
    register and therefore we will have trashed any value placed there
    by the caller; and there is no C-level portable way to read from
    the register inside the C helper.
    <p>
    In short, we can't think of a really satisfactory solution.  I'd
    vote for introducing some kind of OS-thread-local-state and passing
    it in there, but that introduces complications of its own.
    <p>
    <b>OS-thread-safety</b> is of concern in the C stubs, whilst
    building up the expressions to run.  These need to have exclusive
    access to the heap whilst allocating in it.  Also, there needs to
    be some guarantee that no GC will happen in between the
    <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code>
    starts running.  At the moment there are no guarantees for
    either property.  This needs to be sorted out before the
    implementation can be regarded as fully safe to use.
<p><small>

<!-- hhmts start -->
Last modified: Weds 27 Feb 02
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/ghci.html b/docs/comm/the-beast/ghci.html
new file mode 100644
index 0000000000..b893acdeb4
--- /dev/null
+++ b/docs/comm/the-beast/ghci.html
@@ -0,0 +1,407 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - GHCi</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - GHCi</h1>

    This isn't a coherent description of how GHCi works, sorry.  What
    it is (currently) is a dumping ground for various bits of info
    pertaining to GHCi, which ought to be recorded somewhere.

    <h2>Debugging the interpreter</h2>

    The usual symptom is that some expression / program crashes when
    running on the interpreter (commonly), or gets weird results
    (rarely).  Unfortunately, finding out what the problem really is
    has proven to be extremely difficult.  In retrospect it may be
    argued to be a design flaw that GHC's implementation of the STG
    execution mechanism provides only the weakest of support for
    automated internal consistency checks.  This makes it hard to
    debug.
    <p>
    Execution failures in the interactive system can be due to
    problems with the bytecode interpreter, problems with the bytecode
    generator, or problems elsewhere.  From the bugs seen so far,
    the bytecode generator is often the culprit, with the interpreter
    usually being correct.
    <p>
    Here are some tips for tracking down interactive nonsense:
    <ul>
      <li>Find the smallest source fragment which causes the problem.
      <p>
      <li>Using an RTS compiled with <code>-DDEBUG</code> (nb, that
          means the RTS from the previous stage!), run with <code>+RTS
          -D2</code> to get a listing in great detail from the
          interpreter.  Note that the listing is so voluminous that
          this is impractical unless you have been diligent in
          the previous step.
      <p>
      <li>At least in principle, using the trace and a bit of GDB
          poking around at the time of death, you can figure out what
          the problem is.  In practice you quickly get depressed at
          the hopelessness of ever making sense of the mass of
          details.  Well, I do, anyway.
      <p>
      <li><code>+RTS -D2</code> tries hard to print useful
          descriptions of what's on the stack, and often succeeds.
          However, it has no way to map addresses to names in
          code/data loaded by our runtime linker.  So the C function
          <code>ghci_enquire</code> is provided.  Given an address, it
          searches the loaded symbol tables for symbols close to that
          address.  You can run it from inside GDB:
      <pre>
      (gdb) p ghci_enquire ( 0x50a406f0 )
      0x50a406f0 + -48  ==  `PrelBase_Czh_con_info'
      0x50a406f0 + -12  ==  `PrelBase_Izh_static_info'
      0x50a406f0 + -48  ==  `PrelBase_Czh_con_entry'
      0x50a406f0 + -24  ==  `PrelBase_Izh_con_info'
      0x50a406f0 +  16  ==  `PrelBase_ZC_con_entry'
      0x50a406f0 +   0  ==  `PrelBase_ZMZN_static_entry'
      0x50a406f0 + -36  ==  `PrelBase_Czh_static_entry'
      0x50a406f0 + -24  ==  `PrelBase_Izh_con_entry'
      0x50a406f0 +  64  ==  `PrelBase_EQ_static_info'
      0x50a406f0 +   0  ==  `PrelBase_ZMZN_static_info'
      0x50a406f0 +  48  ==  `PrelBase_LT_static_entry'
      $1 = void
      </pre>
          In this case the enquired-about address is
          <code>PrelBase_ZMZN_static_entry</code>.  If no symbols are
          close to the given addr, nothing is printed.  Not a great
          mechanism, but better than nothing.
      <p>
      <li>We have had various problems in the past due to the bytecode
          generator (<code>compiler/ghci/ByteCodeGen.lhs</code>) being
          confused about the true set of free variables of an
          expression.  The compilation scheme for <code>let</code>s
          applies the BCO for the RHS of the let to its free
          variables, so if the free-var annotation is wrong or
          misleading, you end up with code which has wrong stack
          offsets, which is usually fatal.
      <p>
      <li>The baseline behaviour of the interpreter is to interpret
          BCOs, and hand all other closures back to the scheduler for
          evaluation.  However, this causes a huge number of expensive
          context switches, so the interpreter knows how to enter the
          most common non-BCO closure types by itself.
          <p>
          These optimisations complicate the interpreter.
          If you think you have an interpreter problem, re-enable the
          define <code>REFERENCE_INTERPRETER</code> in
          <code>ghc/rts/Interpreter.c</code>.  All optimisations are
          thereby disabled, giving the baseline
          I-only-know-how-to-enter-BCOs behaviour.
      <p>
      <li>Following the traces is often problematic because execution
          hops back and forth between the interpreter, which is
          traced, and compiled code, which you can't see.
          Particularly annoying is when the stack looks OK in the
          interpreter, then compiled code runs for a while, and later
          we arrive back in the interpreter, with the stack corrupted,
          and usually in a completely different place from where we
          left off.
          <p>
          If this is biting you baaaad, it may be worth copying
          sources for the compiled functions causing the problem, into
          your interpreted module, in the hope that you stay in the
          interpreter more of the time.  Of course this doesn't work
          very well if you've defined
          <code>REFERENCE_INTERPRETER</code> in
          <code>ghc/rts/Interpreter.c</code>.
      <p>
      <li>There are various commented-out pieces of code in
          <code>Interpreter.c</code> which can be used to get the
          stack sanity-checked after every entry, and even after
          every bytecode instruction executed.  Note that some
          bytecodes (<code>PUSH_UBX</code>) leave the stack in
          an unwalkable state, so the <code>do_print_stack</code>
          local variable is used to suppress the stack walk after
          them.
    </ul>


    <h2>Useful stuff to know about the interpreter</h2>

    The code generation scheme is straightforward (naive, in fact).
    <code>-ddump-bcos</code> prints each BCO along with the Core it
    was generated from, which is very handy.
    <ul>
      <li>Simple lets are compiled in-line.  For the general case, let
          v = E in ..., E is compiled into a new BCO which takes as
          args its free variables, and v is bound to AP(the new BCO,
          free vars of E).
      <p>
      <li><code>case</code>s, as usual, become: push the return
          continuation, enter the scrutinee.  There is some magic to
          make all combinations of compiled/interpreted calls and
          returns work, described below.  In the interpreted case, all
          case alts are compiled into a single big return BCO, which
          commences with instructions implementing a switch tree.
    </ul>
    <p>
    <b>ARGCHECK magic</b>
    <p>
    You may find ARGCHECK instructions at the start of BCOs which
    don't appear to need them; case continuations in particular.
    These play an important role: they force objects which should
    evaluate to BCOs to actually be BCOs.
    <p>
    Typically, there may be an application node somewhere in the heap.
    This is a thunk which, when leant on, turns into a BCO for a return
    continuation.
    The thunk may get entered with an update frame on
    top of the stack.  This is legitimate since from one viewpoint
    this is an AP which simply reduces to a data object, so does not
    have functional type.  However, once the AP turns itself into a
    BCO (so to speak) we cannot simply enter the BCO, because that
    expects to see args on top of the stack, not an update frame.
    Therefore any BCO which expects something on the stack above an
    update frame, even a non-function BCO, starts with an ARGCHECK.  In
    this case it fails, the update is done, the update frame is
    removed, and the BCO re-entered.  Subsequent entries of the BCO of
    course go unhindered.
    <p>
    The optimised interpreter (<code>#undef REFERENCE_INTERPRETER</code>)
    handles this case specially, so that a trip through the scheduler is
    avoided.  When reading traces from <code>+RTS -D2 -RTS</code>, you
    may see BCOs which appear to execute their initial ARGCHECK insn
    twice.  The first time it fails; the interpreter does the update
    immediately and re-enters with no further comment.
    <p>
    This is all a bit ugly, and, as SimonM correctly points out, it
    would have been cleaner to make BCOs unpointed (unthunkable)
    objects, so that a pointer to something <code>:: BCO#</code>
    really points directly at a BCO.
    <p>
    <b>Stack management</b>
    <p>
    There isn't any attempt to stub the stack, minimise its growth, or
    generally remove unused pointers ahead of time.  This is really
    due to laziness on my part, although it does have the minor
    advantage that doing something cleverer would almost certainly
    increase the number of bytecodes that would have to be executed.
    Of course we SLIDE out redundant stuff, to get the stack back to
    the sequel depth, before returning a HNF, but that's all.  As
    usual this is probably a cause of major space leaks.
    <p>
    <b>Building constructors</b>
    <p>
    Constructors are built on the stack and then dumped into the heap
    with a single PACK instruction, which simply copies the top N
    words of the stack verbatim into the heap, adds an info table, and zaps N
    words from the stack.  The constructor args are pushed onto the
    stack one at a time.  One upshot of this is that unboxed values
    get pushed untaggedly onto the stack (via PUSH_UBX), because that's how they
    will be in the heap.  That in turn means that the stack is not
    always walkable at arbitrary points in BCO execution, although
    naturally it is whenever GC might occur.
    <p>
    Function closures created by the interpreter use the AP-node
    (tagged) format, so although their fields are similarly
    constructed on the stack, there is never a stack walkability
    problem.
    <p>
    <b>Unpacking constructors</b>
    <p>
    At the start of a case continuation, the returned constructor is
    unpacked onto the stack, which means that unboxed fields have to
    be tagged.  Rather than burdening all such continuations with a
    complex, general mechanism, I split it into two.  The
    allegedly-common all-pointers case uses a single UNPACK insn
    to fish out all fields with no further ado.  The slow case uses a
    sequence of more complex UPK_TAG insns, one for each field (I
    think).  This seemed like a good compromise to me.
    <p>
    <b>Perspective</b>
    <p>
    I designed the bytecode mechanism with the experience of both STG
    hugs and Classic Hugs in mind.  The latter has a small
    set of bytecodes, a small interpreter loop, and runs amazingly
    fast considering the cruddy code it has to interpret.
+ The former
+ had a large interpretative loop with many different opcodes,
+ including multiple minor variants of the same thing, which
+ made it difficult to optimise and maintain, yet it performed more
+ or less comparably with Classic Hugs.
+ <p>
+ My design aims were therefore to minimise the interpreter's
+ complexity whilst maximising performance.  This means reducing the
+ number of opcodes implemented, whilst also reducing the number of
+ insns despatched.  In particular there are only two opcodes, PUSH_UBX
+ and UPK_TAG, which deal with tags.  STG Hugs had dozens of opcodes
+ for dealing with tagged data.  In cases where the common
+ all-pointers case is significantly simpler (UNPACK) I deal with it
+ specially.  Finally, the number of insns executed is reduced a
+ little by merging multiple pushes, giving PUSH_LL and PUSH_LLL.
+ These opcode pairings were determined by using the opcode-pair
+ frequency profiling stuff which is ifdef-d out in
+ <code>Interpreter.c</code>.  These significantly improve
+ performance without having much effect on the ugliness or
+ complexity of the interpreter.
+ <p>
+ Overall, the interpreter design is something which turned out
+ well, and I was pleased with it.  Unfortunately I cannot say the
+ same of the bytecode generator.
+
+ <h2><code>case</code> returns between interpreted and compiled code</h2>
+
+ Variants of the following scheme have been drifting around in GHC
+ RTS documentation for several years.  Since what follows is
+ actually what is implemented, I guess it supersedes all other
+ documentation.  Beware; the following may make your brain melt.
+ In all the pictures below, the stack grows downwards.
+ <p>
+ <b>Returning to interpreted code</b>.
+ <p>
+ Interpreted returns employ a set of polymorphic return infotables.
+ Each element in the set corresponds to one of the possible return
+ registers (R1, D1, F1) that compiled code will place the returned
+ value in.  In fact this is a bit misleading, since R1 can be used
+ to return either a pointer or an int, and we need to distinguish
+ these cases.  So, supposing the set of return registers is {R1p,
+ R1n, D1, F1}, there would be four corresponding infotables,
+ <code>stg_ctoi_ret_R1p_info</code>, etc.  In the pictures below we
+ call them <code>stg_ctoi_ret_REP_info</code>.
+ <p>
+ These return itbls are polymorphic, meaning that all 8 vectored
+ return codes and the direct return code are identical.
+ <p>
+ Before the scrutinee is entered, the stack is arranged like this:
+ <pre>
+      |        |
+      +--------+
+      |  BCO   | -------> the return continuation BCO
+      +--------+
+      | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows:
+      +--------+
+                          BCO* bco = Sp[1];
+                          push R1/F1/D1 depending on REP
+                          push bco
+                          yield to sched
+ </pre>
+ On entry, the interpreted continuation BCO expects the stack to look
+ like this:
+ <pre>
+      |        |
+      +--------+
+      |  BCO   | -------> the return continuation BCO
+      +--------+
+      | itbl * | -------> stg_ctoi_ret_REP_info, as above
+      +--------+
+      : VALUE  :  (the returned value, shown with : since it may occupy
+      +--------+   multiple stack words)
+ </pre>
+ A machine code return will park the returned value in R1/F1/D1,
+ and enter the itbl on the top of the stack.  Since it's our magic
+ itbl, this pushes the returned value onto the stack, which is
+ where the interpreter expects to find it.  It then pushes the BCO
+ (again) and yields.  The scheduler removes the BCO from the top,
+ and enters it, so that the continuation is interpreted with the
+ stack as shown above.
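+ <p>
+ As a concrete (if grossly simplified) illustration, here is a toy
+ model in Haskell of this compiled-to-interpreted return convention.
+ It is not GHC or RTS code: the <code>Slot</code> and <code>BCO</code>
+ types and all the names are invented for the sketch, and registers
+ and itbl pointers are collapsed into ordinary values.
+ <pre>
+-- Toy model only; all names here are made up.
+data BCO  = BCO String            -- a return continuation, by name
+data Slot = SlotBCO  BCO          -- the continuation BCO
+          | SlotItbl String       -- an info-table pointer
+          | SlotVal  Int          -- a returned value (stands in for R1)
+
+type Stack = [Slot]               -- head of the list = top of stack
+
+-- What the magic stg_ctoi_ret code does when compiled code enters it
+-- with the returned value parked in R1: push the value, push the BCO
+-- (fetched from Sp[1]), then yield to the scheduler.
+ctoiRet :: Int -> Stack -> Stack
+ctoiRet r1 stack@(SlotItbl _ : SlotBCO bco : _)
+   = SlotBCO bco : SlotVal r1 : stack
+ctoiRet _ _ = error "stack not set up for a return to interpreted code"
+
+-- The scheduler then pops the BCO and enters it, leaving the stack
+-- exactly as the second picture shows: value on top, itbl, then BCO.
+schedule :: Stack -> (BCO, Stack)
+schedule (SlotBCO bco : rest) = (bco, rest)
+schedule _ = error "expected a BCO on top of the stack"
+ </pre>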
+ <p>
+ An interpreted return will create the value to return at the top
+ of the stack.  It then examines the return itbl, which must be
+ immediately underneath the return value, to see if it is one of
+ the magic <code>stg_ctoi_ret_REP_info</code> set.  Since this is so,
+ it knows it is returning to an interpreted continuation.  It
+ therefore simply enters the BCO which it assumes is immediately
+ underneath the itbl on the stack.
+
+ <p>
+ <b>Returning to compiled code</b>.
+ <p>
+ Before the scrutinee is entered, the stack is arranged like this:
+ <pre>
+                            ptr to vec code 8 ------> return vector code 8
+      |        |            ....
+      +--------+            ptr to vec code 1 ------> return vector code 1
+      | itbl * | --   Itbl end
+      +--------+   \        ....
+                    \  Itbl start
+                     ----> direct return code
+ </pre>
+ The scrutinee value is then entered.
+ The case continuation(s) expect the stack to look the same, with
+ the returned HNF in a suitable return register, R1, D1, F1 etc.
+ <p>
+ A machine code return knows whether it is doing a vectored or
+ direct return, and, if the former, which vector element it is.
+ So, for a direct return we jump to <code>Sp[0]</code>, and for a
+ vectored return, jump to <code>((CodePtr*)(Sp[0]))[ - ITBL_LENGTH
+ - vector number ]</code>.  This is (of course) the scheme that
+ compiled code has been using all along.
+ <p>
+ An interpreted return will, as described just above, have examined
+ the itbl immediately beneath the return value it has just pushed,
+ and found it not to be one of the <code>stg_ctoi_ret_REP_info</code>
+ set, so it knows this must be a return to machine code.  It needs to
+ pop the return value, currently on the stack, into R1/F1/D1, and
+ jump through the info table.  Unfortunately the first part cannot
+ be accomplished directly since we are not in Haskellised-C world.
+ <p>
+ We therefore employ a second family of magic infotables, indexed,
+ like the first, on the return representation, and therefore with
+ names of the form <code>stg_itoc_ret_REP_info</code>.  (Note:
+ <code>itoc</code>; the previous bunch were <code>ctoi</code>).
+ This is pushed onto the stack (note, tagged values have their tag
+ zapped), giving:
+ <pre>
+      |        |
+      +--------+
+      | itbl * | -------> arbitrary machine code return itbl
+      +--------+
+      : VALUE  :  (the returned value, possibly multiple words)
+      +--------+
+      | itbl * | -------> stg_itoc_ret_REP_info, with code:
+      +--------+
+                          pop myself (stg_itoc_ret_REP_info) off the stack
+                          pop return value into R1/D1/F1
+                          do standard machine code return to itbl at t.o.s.
+ </pre>
+ We then return to the scheduler, asking it to enter the itbl at
+ t.o.s.  When entered, <code>stg_itoc_ret_REP_info</code> removes
+ itself from the stack, pops the return value into the relevant
+ return register, and returns to the itbl to which we were trying
+ to return in the first place.
+ <p>
+ Amazingly enough, this stuff all actually works!  Well, mostly ...
+ <p>
+ <b>Unboxed tuples: a Right Royal Spanner In The Works</b>
+ <p>
+ The above scheme depends crucially on having magic infotables
+ <code>stg_{itoc,ctoi}_ret_REP_info</code> for each return
+ representation <code>REP</code>.  It unfortunately fails miserably
+ in the face of unboxed tuple returns, because the set of required
+ tables would be infinite; this despite the fact that for any given
+ unboxed tuple return type, the scheme could be made to work fine.
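+ <p>
+ To see why the table set blows up, here is a schematic sketch; the
+ <code>Rep</code> type is invented for illustration and is not GHC
+ code.  One <code>ctoi</code>/<code>itoc</code> itbl pair is needed
+ per return representation, and unboxed-tuple shapes never run out:
+ <pre>
+-- Made-up illustration, not GHC code.
+data Rep = R1p | R1n | F1 | D1     -- finitely many single-value reps
+         | UbxTuple [Rep]          -- but tuple shapes are unbounded
+
+-- One magic itbl pair per element of this list, which never ends:
+allReps :: [Rep]
+allReps = [R1p, R1n, F1, D1]
+          ++ [ UbxTuple shape | n <- [2..], shape <- shapes n ]
+  where shapes n = sequence (replicate n [R1p, R1n, F1, D1])
+ </pre>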
+ <p> + This is a serious problem, because it prevents interpreted + code from doing <code>IO</code>-typed returns, since <code>IO + t</code> is implemented as <code>(# t, RealWorld# #)</code> or + thereabouts. This restriction in turn rules out FFI stuff in the + interpreter. Not good. + <p> + Although we have no way to make general unboxed tuples work, we + can at least make <code>IO</code>-types work using the following + ultra-kludgey observation: <code>RealWorld#</code> doesn't really + exist and so has zero size, in compiled code. In turn this means + that a type of the form <code>(# t, RealWorld# #)</code> has the + same representation as plain <code>t</code> does. So the bytecode + generator, whilst rejecting code with general unboxed tuple + returns, recognises and accepts this special case. Which means + that <code>IO</code>-typed stuff works in the interpreter. Just. + <p> + If anyone asks, I will claim I was out of radio contact, on a + 6-month walking holiday to the south pole, at the time this was + ... er ... dreamt up. + + +<p><small> + +<!-- hhmts start --> +Last modified: Thursday February 7 15:33:49 GMT 2002 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/the-beast/main.html b/docs/comm/the-beast/main.html new file mode 100644 index 0000000000..332ffaa501 --- /dev/null +++ b/docs/comm/the-beast/main.html @@ -0,0 +1,35 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Compiling and running the Main module</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>Compiling and running the Main module</h1> + +GHC allows you to determine which module contains the "main" function, and +what that function is called, via the <code>-fmain-is</code> flag. The trouble is +that the runtime system is fixed, so what symbol should it link to? +<p> +The current solution is this. Suppose the main function is <code>Foo.run</code>. +<ul> +<li> +Then, when compiling module <code>Foo</code>, GHC adds an extra definition: +<pre> + :Main.main = runIO Foo.run +</pre> +Now the RTS can invoke <code>:Main.main</code> to start the program. (This extra +definition is inserted in TcRnDriver.checkMain.) +<p><li> +Before starting the program, though, the RTS also initialises the module tree +by calling <code>init_:Main</code>, so when compiling the main module (Foo in this case), +as well as generating <code>init_Foo</code> as usual, GHC also generates +<pre> + init_zcMain() { init_Foo; } +</pre> +This extra initialisation code is generated in CodeGen.mkModuleInit. +</ul> + + </body> +</html> diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html new file mode 100644 index 0000000000..1ad80f0d5c --- /dev/null +++ b/docs/comm/the-beast/mangler.html @@ -0,0 +1,79 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Evil Mangler</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - The Evil Mangler</h1> + <p> + The Evil Mangler (EM) is a Perl script invoked by the <a + href="driver.html">Glorious Driver</a> after the C compiler (gcc) has + translated the GHC-produced C code into assembly. Consequently, it is + only of interest if <code>-fvia-C</code> is in effect (either explicitly + or implicitly). 
+
+    <h4>Its purpose</h4>
+    <p>
+    The EM reads the assembly produced by gcc and rearranges code blocks
+    and nukes instructions that it considers <em>non-essential.</em>  It
+    derives its evilness from its utterly ad hoc, machine, compiler, and
+    whatnot dependent design and implementation.  More precisely, the EM
+    performs the following tasks:
+    <ul>
+    <li>The code executed when a closure is entered is moved adjacent to
+    that closure's infotable.  Moreover, the order of the info table
+    entries is reversed.  Also, SRT pointers are removed from closures that
+    don't need them (non-FUN, RET and THUNK ones).
+    <li>Function prologue and epilogue code is removed.  (GHC generated code
+    manages its own stack and uses the system stack only for return
+    addresses and during calls to C code.)
+    <li>Certain code patterns are replaced by simpler code (eg, loads of
+    fast entry points followed by indirect jumps are replaced by direct
+    jumps to the fast entry point).
+    </ul>
+
+    <h4>Implementation</h4>
+    <p>
+    The EM is located in the Perl script <a
+    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>.
+    The script reads the <code>.s</code> file and chops it up into
+    <em>chunks</em> (that's what they are actually called in the script) that
+    roughly correspond to basic blocks.  Each chunk is annotated with an
+    educated guess about what kind of code it contains (e.g., infotable,
+    fast entry point, slow entry point, etc.).  The annotations also contain
+    the symbol introducing the chunk of assembly and whether that chunk has
+    already been processed or not.
+    <p>
+    The parsing of the input into chunks as well as recognising assembly
+    instructions that are to be removed or altered is based on a large
+    number of Perl regular expressions sprinkled over the whole code.  These
+    expressions are rather fragile as they heavily rely on the structure of
+    the generated code - in fact, they even rely on the right amount of
+    white space and thus on the formatting of the assembly.
+    <p>
+    Afterwards, the chunks are reordered, some of them purged, and some
+    stripped of some useless instructions.  Moreover, some instructions are
+    manipulated (eg, loads of fast entry points followed by indirect jumps
+    are replaced by direct jumps to the fast entry point).
+    <p>
+    The EM knows which part of the code belongs to function prologues and
+    epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the
+    form <code>--- BEGIN ---</code> and <code>--- END ---</code> to the
+    assembly just before and after the code proper of a function.
+    It adds these tags using gcc's <code>__asm__</code> feature.
+    <p>
+    <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic
+    block re-ordering and dead code elimination.  This seems to make the
+    whole <code>--- END ---</code> tag business redundant -- in fact, if
+    proper code is generated, no <code>--- END ---</code> tags survive the
+    gcc optimiser.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Sun Feb 17 17:55:47 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/modules.html b/docs/comm/the-beast/modules.html
new file mode 100644
index 0000000000..a6655a68a7
--- /dev/null
+++ b/docs/comm/the-beast/modules.html
@@ -0,0 +1,80 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Modules, ModuleNames and Packages</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Modules, ModuleNames and Packages</h1>
+
+    <p>This section describes the datatypes <code>ModuleName</code>,
+    <code>Module</code>, and <code>PackageName</code>, all available
+    from the module <a
+    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Module.lhs"><code>Module</code></a>.<p>
+
+    <h2>Packages</h2>
+
+    <p>A package is a collection of (zero or more) Haskell modules,
+    together with some information about external libraries, extra C
+    compiler options, and other things that this collection of modules
+    requires.  When using DLLs on Windows (or shared libraries on a
+    Unix system; currently unsupported), a package can consist of only
+    a single shared library of Haskell code; the reason for this is
+    described below.
+
+    <p>Packages are further described in the User's Guide <a
+    href="http://www.haskell.org/ghc/docs/latest/packages.html">here</a>.
+
+    <h2>The ModuleName type</h2>
+
+    <p>At the bottom of the hierarchy is a <code>ModuleName</code>,
+    which, as its name suggests, is simply the name of a module.  It
+    is represented as a Z-encoded FastString, and is an instance of
+    <code>Uniquable</code>, so we can build <code>FiniteMap</code>s
+    with <code>ModuleName</code>s as the keys.
+
+    <p>A <code>ModuleName</code> can be built from a
+    <code>String</code>, using the <code>mkModuleName</code> function.
+
+    <h2>The Module type</h2>
+
+    <p>For a given module, the compiler also needs to know whether the
+    module is in the <em>home package</em>, or in another package.
+    This distinction is important for two reasons:
+
+    <ul>
+    <li><p>When generating code to call a function in another package,
+    the compiler might have to generate a cross-DLL call, which is
+    different from an intra-DLL call (hence the restriction that the
+    code in a package can only reside in a single DLL).
+
+    <li><p>We avoid putting version information in an interface file
+    for entities defined in another package, on the grounds that other
+    packages are generally "stable".  This also helps keep the size of
+    interface files down.
+    </ul>
+
+    <p>The <code>Module</code> type contains a <code>ModuleName</code>
+    and a <code>PackageInfo</code> field.  The
+    <code>PackageInfo</code> indicates whether the given
+    <code>Module</code> comes from the current package or from another
+    package.
+
+    <p>To get the actual package in which a given module resides, you
+    have to read the interface file for that module, which contains
+    the package name (actually the value of the
+    <code>-package-name</code> flag when that module was built).  This
+    information is currently unused inside the compiler, but we might
+    make use of it in the future, especially with the advent of
+    hierarchical modules, to allow the compiler to automatically
+    figure out which packages a program should be linked with, and
+    thus avoid the need to specify <code>-package</code> options on
+    the command line.
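+
+    <p>As a small, concrete illustration of the API described above --
+    this is a sketch, not real GHC code: it assumes only
+    <code>mkModuleName</code> and the ordinary <code>FiniteMap</code>
+    operations, and the cache and its contents are invented:
+    <pre>
+import Module    ( ModuleName, mkModuleName )
+import FiniteMap ( FiniteMap, listToFM, lookupFM )
+
+-- A made-up cache mapping module names to interface files.
+type IfaceCache = FiniteMap ModuleName FilePath
+
+ifaces :: IfaceCache
+ifaces = listToFM [ (mkModuleName "GHC.Base", "GHC/Base.hi") ]
+
+lookupIface :: ModuleName -> Maybe FilePath
+lookupIface m = lookupFM ifaces m
+    </pre>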
+
+    <p><code>Module</code>s are also instances of
+    <code>Uniquable</code>, and indeed the unique of a
+    <code>Module</code> is the same as the unique of the underlying
+    <code>ModuleName</code>.
+  </body>
+</html>
diff --git a/docs/comm/the-beast/names.html b/docs/comm/the-beast/names.html
new file mode 100644
index 0000000000..061fae3ebf
--- /dev/null
+++ b/docs/comm/the-beast/names.html
@@ -0,0 +1,169 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The truth about names: OccNames, and Names</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The truth about names: OccNames, and Names</h1>
+    <p>
+    Every entity (type constructor, class, identifier, type variable) has a
+    <code>Name</code>.  The <code>Name</code> type is pervasive in GHC, and
+    is defined in <code>basicTypes/Name.lhs</code>.  Here is what a Name
+    looks like, though it is private to the Name module.
+    </p>
+    <blockquote>
+    <pre>
+data Name = Name {
+              n_sort :: NameSort,     -- What sort of name it is
+              n_occ  :: !OccName,     -- Its occurrence name
+              n_uniq :: Unique,       -- Its identity
+              n_loc  :: !SrcLoc       -- Definition site
+            }</pre>
+    </blockquote>
+    <ul>
+    <li> The <code>n_sort</code> field says what sort of name this is: see
+    <a href="#sort">NameSort below</a>.
+    <li> The <code>n_occ</code> field gives the "occurrence name" of the
+    Name; see
+    <a href="#occname">OccName below</a>.
+    <li> The <code>n_uniq</code> field allows fast tests for equality of
+    Names.
+    <li> The <code>n_loc</code> field gives some indication of where the
+    name was bound.
+    </ul>
+
+    <h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2>
+    <p>
+    There are four flavours of <code>Name</code>:
+    </p>
+    <blockquote>
+    <pre>
+data NameSort
+  = External Module (Maybe Name)
+        -- (Just parent) => this Name is a subordinate name of 'parent'
+        --      e.g. data constructor of a data type, method of a class
+        -- Nothing => not a subordinate
+
+  | WiredIn Module (Maybe Name) TyThing BuiltInSyntax
+        -- A variant of External, for wired-in things
+
+  | Internal            -- A user-defined Id or TyVar
+                        -- defined in the module being compiled
+
+  | System              -- A system-defined Id or TyVar.  Typically the
+                        -- OccName is very uninformative (like 's')</pre>
+    </blockquote>
+    <ul>
+    <li>Here are the sorts of Name an entity can have:
+    <ul>
+    <li> Class, TyCon: External.
+    <li> Id: External, Internal, or System.
+    <li> TyVar: Internal, or System.
+    </ul>
+    </li>
+    <li>An <code>External</code> name has a globally-unique
+    (module name, occurrence name) pair, namely the
+    <em>original name</em> of the entity,
+    describing where the thing was originally defined.  So for example,
+    if we have
+    <blockquote>
+    <pre>
+module M where
+  f = e1
+  g = e2
+
+module A where
+  import qualified M as Q
+  import M
+  a = Q.f + g</pre>
+    </blockquote>
+    <p>
+    then the RdrNames for "a", "Q.f" and "g" get replaced (by the
+    Renamer) by the Names "A.a", "M.f", and "M.g" respectively.
+    </p>
+    </li>
+    <li>An <code>InternalName</code>
+    has only an occurrence name.  Distinct InternalNames may have the same
+    occurrence name; use the Unique to distinguish them.
+    </li>
+    <li>An <code>ExternalName</code> has a unique that never changes.  It
+    is never cloned.  This is important, because the simplifier invents
+    new names pretty freely, but we don't want to lose the connection
+    with the type environment (constructed earlier).
+    An <code>InternalName</code> can be cloned freely.
+    </li>
+    <li><strong>Before CoreTidy</strong>: the Ids that were defined at top
+    level in the original source program get <code>ExternalNames</code>,
+    whereas extra top-level bindings generated (say) by the type checker
+    get <code>InternalNames</code>.  This distinction is occasionally
+    useful for filtering diagnostic output; e.g. for
+    <code>-ddump-types</code>.
+    </li>
+    <li><strong>After CoreTidy</strong>: An Id with an
+    <code>ExternalName</code> will generate symbols that
+    appear as external symbols in the object file.  An Id with an
+    <code>InternalName</code> cannot be referenced from outside the
+    module, and so generates a local symbol in the object file.  The
+    CoreTidy pass makes the decision about which names should be External
+    and which Internal.
+    </li>
+    <li>A <code>System</code> name is for the most part the same as an
+    <code>Internal</code>.  Indeed, the differences are purely cosmetic:
+    <ul>
+    <li>Internal names usually come from some name the
+    user wrote, whereas a System name has an OccName like "a", or "t".
+    Usually there are masses of System names with the same OccName but
+    different uniques, whereas typically there are only a handful of
+    distinct Internal names with the same OccName.
+    </li>
+    <li>Another difference is that, when unifying, the type checker tries
+    to unify away type variables with System names, leaving ones with
+    Internal names (to improve error messages).
+    </li>
+    </ul>
+    </li>
+    </ul>
+
+    <h2><a name="occname">Occurrence names: <code>OccName</code></a></h2>
+    <p>
+    An <code>OccName</code> is more-or-less just a string, like "foo" or
+    "Tree", giving the (unqualified) name of an entity.
+    </p>
+    <p>
+    Well, not quite just a string, because in Haskell a name like "C" could
+    mean a type constructor or data constructor, depending on context.  So
+    GHC defines a type <tt>OccName</tt> (defined in
+    <tt>basicTypes/OccName.lhs</tt>) that is a pair of a <tt>FastString</tt>
+    and a <tt>NameSpace</tt> indicating which name space the name is drawn
+    from:
+    <blockquote>
+    <pre>
+data OccName = OccName NameSpace EncodedFS</pre>
+    </blockquote>
+    <p>
+    The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating
+    that the string is Z-encoded.  (Details in <tt>OccName.lhs</tt>.)
+    Z-encoding encodes funny characters like '$' and '+' into alphabetic
+    characters, like "zd" and "zp", so that they can be used in object-file
+    symbol tables without confusing linkers and suchlike.
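+    </p>
+    <p>
+    As a tiny, hypothetical illustration of the flavour of Z-encoding --
+    this is not the real encoder in <tt>OccName.lhs</tt>, and the table
+    here is deliberately truncated:
+    <blockquote>
+    <pre>
+-- Hypothetical sketch only; the real table covers many more characters.
+zEncodeChar :: Char -> String
+zEncodeChar 'z' = "zz"    -- 'z' itself must be escaped
+zEncodeChar '$' = "zd"
+zEncodeChar '+' = "zp"
+zEncodeChar c   = [c]
+
+zEncode :: String -> String
+zEncode = concatMap zEncodeChar</pre>
+    </blockquote>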
+ </p> + <p> + The name spaces are: + </p> + <ul> + <li> <tt>VarName</tt>: ordinary variables</li> + <li> <tt>TvName</tt>: type variables</li> + <li> <tt>DataName</tt>: data constructors</li> + <li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they + share a name space) </li> + </ul> + + <small> +<!-- hhmts start --> +Last modified: Wed May 4 14:57:55 EST 2005 +<!-- hhmts end --> + </small> + </body> +</html> + diff --git a/docs/comm/the-beast/ncg.html b/docs/comm/the-beast/ncg.html new file mode 100644 index 0000000000..5810a35212 --- /dev/null +++ b/docs/comm/the-beast/ncg.html @@ -0,0 +1,749 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Native Code Generator</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - The Native Code Generator</h1> + <p> + On some platforms (currently x86 and PowerPC, with bitrotted + support for Sparc and Alpha), GHC can generate assembly code + directly, without having to go via C. This can sometimes almost + halve compilation time, and avoids the fragility and + horribleness of the <a href="mangler.html">mangler</a>. The NCG + is enabled by default for + non-optimising compilation on supported platforms. For most programs + it generates code which runs only 1-3% slower + (depending on platform and type of code) than that + created by gcc on x86s, so it is well worth using even with + optimised compilation. FP-intensive x86 programs see a bigger + slowdown, and all Sparc code runs about 5% slower due to + us not filling branch delay slots. + <p> + The NCG has always been something of a second-class citizen + inside GHC, an unloved child, rather. This means that its + integration into the compiler as a whole is rather clumsy, which + brings some problems described below. That apart, the NCG + proper is fairly cleanly designed, as target-independent as it + reasonably can be, and so should not be difficult to retarget. + <p> + <b>NOTE!</b> The native code generator was largely rewritten as part + of the C-- backend changes, around May 2004. Unfortunately the + rest of this document still refers to the old version, and was written + with relation to the CVS head as of end-Jan 2002. Some of it is relevant, + some of it isn't. + + <h2>Overview</h2> + The top-level code generator fn is + <p> + <code>absCtoNat :: AbstractC -> UniqSM (SDoc, Pretty.Doc)</code> + <p> + The returned <code>SDoc</code> is for debugging, so is empty unless + you specify <code>-ddump-stix</code>. The <code>Pretty.Doc</code> + bit is the final assembly code. Translation involves three main + phases, the first and third of which are target-independent. + <ul> + <li><b>Translation into the <code>Stix</code> representation.</b> Stix + is a simple tree-like RTL-style language, in which you can + mention: + <p> + <ul> + <li>An infinite number of temporary, virtual registers. + <li>The STG "magic" registers (<code>MagicId</code>), such as + the heap and stack pointers. + <li>Literals and low-level machine ops (<code>MachOp</code>). + <li>Simple address computations. + <li>Reads and writes of: memory, virtual regs, and various STG + regs. + <li>Labels and <code>if ... goto ...</code> style control-flow. + </ul> + <p> + Stix has two main associated types: + <p> + <ul> + <li><code>StixStmt</code> -- trees executed for their side + effects: assignments, control transfers, and auxiliary junk + such as segment changes and literal data. 
+        <li><code>StixExpr</code> -- trees which denote a value.
+        </ul>
+        <p>
+        Translation into Stix is almost completely
+        target-independent.  Needed dependencies are knowledge of
+        word size and endianness, used when generating code to
+        deal with half-word fields in info tables.  This could be
+        abstracted out easily enough.  Also, the Stix translation
+        needs to know which <code>MagicId</code>s map to registers
+        on the given target, and which are stored in offsets from
+        <code>BaseReg</code>.
+        <p>
+        After initial Stix generation, the trees are cleaned up with
+        constant-folding and a little copy-propagation ("Stix
+        inlining", as the code misleadingly calls it).  We take
+        the opportunity to translate <code>MagicId</code>s which are
+        stored in memory on the given target into suitable memory
+        references.  Those which are stored in registers are left
+        alone.  There is also a half-hearted attempt to lift literal
+        strings to the top level in cases where nested strings have
+        been observed to give incorrect code in the past.
+        <p>
+        Primitive machine-level operations will already be phrased in
+        terms of <code>MachOp</code>s in the presented Abstract C, and
+        these are passed through unchanged.  We comment only that the
+        <code>MachOp</code>s have been chosen so as to be easy to
+        implement on all targets, and their meaning is intended to be
+        unambiguous, and the same on all targets, regardless of word
+        size or endianness.
+        <p>
+        <b>A note on <code>MagicId</code>s.</b>
+        Those which are assigned to
+        registers on the current target are left unmodified.  Those
+        which are not are stored in memory as offsets from
+        <code>BaseReg</code> (which is assumed to permanently have the
+        value <code>(&MainCapability.r)</code>), so the constant folder
+        calculates the offsets and inserts suitable loads/stores.  One
+        complication is that not all archs have <code>BaseReg</code>
+        itself in a register, so for those (sparc), we instead
+        generate the address as an offset from the static symbol
+        <code>MainCapability</code>, since the register table lives in
+        there.
+        <p>
+        Finally, <code>BaseReg</code> does occasionally itself get
+        mentioned in Stix expression trees, and in this case what is
+        denoted is precisely <code>(&MainCapability.r)</code>, not, as
+        in all other cases, the value of memory at some offset from
+        the start of the register table.  Since what it denotes is an
+        r-value and not an l-value, assigning <code>BaseReg</code> is
+        meaningless, so the machinery checks to ensure this never
+        happens.  All these details are taken into account by the
+        constant folder.
+        <p>
+        <li><b>Instruction selection.</b>  This is the only majorly
+        target-specific phase.  It turns Stix statements and
+        expressions into sequences of <code>Instr</code>, a data
+        type which is different for each architecture.
+        <code>Instr</code>, unsurprisingly, has various supporting
+        types, such as <code>Reg</code>, <code>Operand</code>,
+        <code>Imm</code>, etc.  The generated instructions may refer
+        to specific machine registers, or to arbitrary virtual
+        registers, either those created within the instruction
+        selector, or those mentioned in the Stix passed to it.
+        <p>
+        The instruction selectors live in <code>MachCode.lhs</code>.
+ The core functions, for each target, are: + <p> + <code> + getAmode :: StixExpr -> NatM Amode + <br>getRegister :: StixExpr -> NatM Register + <br>assignMem_IntCode :: PrimRep -> StixExpr -> StixExpr -> NatM InstrBlock + <br>assignReg_IntCode :: PrimRep -> StixReg -> StixExpr -> NatM InstrBlock + </code> + <p> + The insn selectors use the "maximal munch" algorithm. The + bizarrely-misnamed <code>getRegister</code> translates + expressions. A simplified version of its type is: + <p> + <code>getRegister :: StixExpr -> NatM (OrdList Instr, Reg)</code> + <p> + That is: it (monadically) turns a <code>StixExpr</code> into a + sequence of instructions, and a register, with the meaning + that after executing the (possibly empty) sequence of + instructions, the (possibly virtual) register will + hold the resulting value. The real situation is complicated + by the presence of fixed registers, and is detailed below. + <p> + Maximal munch is a greedy algorithm and is known not to give + globally optimal code sequences, but it is good enough, and + fast and simple. Early incarnations of the NCG used something + more sophisticated, but that is long gone now. + <p> + Similarly, <code>getAmode</code> translates a value, intended + to denote an address, into a sequence of insns leading up to + a (processor-specific) addressing mode. This stuff could be + done using the general <code>getRegister</code> selector, but + would necessarily generate poorer code, because the calculated + address would be forced into a register, which might be + unnecessary if it could partially or wholly be calculated + using an addressing mode. + <p> + Finally, <code>assignMem_IntCode</code> and + <code>assignReg_IntCode</code> create instruction sequences to + calculate a value and store it in the given register, or at + the given address. Because these guys translate a statement, + not a value, they just return a sequence of insns and no + associated register. Floating-point and 64-bit integer + assignments have analogous selectors. + <p> + Apart from the complexities of fixed vs floating registers, + discussed below, the instruction selector is as simple + as it can be. It looks long and scary but detailed + examination reveals it to be fairly straightforward. + <p> + <li><b>Register allocation.</b> The register allocator, + <code>AsmRegAlloc.lhs</code> takes sequences of + <code>Instr</code>s which mention a mixture of real and + virtual registers, and returns a modified sequence referring + only to real ones. It is gloriously and entirely + target-independent. Well, not exactly true. Instead it + regards <code>Instr</code> (instructions) and <code>Reg</code> + (virtual and real registers) as abstract types, to which it has + the following interface: + <p> + <code> + insnFuture :: Instr -> InsnFuture + <br>regUsage :: Instr -> RegUsage + <br>patchRegs :: Instr -> (Reg -> Reg) -> Instr + </code> + <p> + <code>insnFuture</code> is used to (re)construct the graph of + all possible control transfers between the insns to be + allocated. <code>regUsage</code> returns the sets of registers + read and written by an instruction. And + <code>patchRegs</code> is used to apply the allocator's final + decision on virtual-to-real reg mapping to an instruction. + <p> + Clearly these 3 fns have to be written anew for each + architecture. They are defined in + <code>RegAllocInfo.lhs</code>. Think twice, no, thrice, + before modifying them: making false claims about insn + behaviour will lead to hard-to-find register allocation + errors. 
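+        <p>
+        To make the shape of that interface concrete, here is a toy
+        rendition in Haskell for an invented two-instruction
+        architecture.  None of this is GHC's real code -- the versions
+        in <code>RegAllocInfo.lhs</code> are far more involved -- but
+        the division of labour is the same.
+        <pre>
+-- Toy illustration; the Instr and Reg types are made up.
+data Reg   = RealReg Int | VirtualReg Int  deriving (Eq, Show)
+
+data Instr = MOV Reg Reg                   -- MOV src dst
+           | ADD Reg Reg Reg               -- ADD src1 src2 dst
+           deriving Show
+
+data RegUsage = RegUsage { regsRead    :: [Reg]
+                         , regsWritten :: [Reg] }
+
+regUsage :: Instr -> RegUsage
+regUsage (MOV s d)     = RegUsage [s]      [d]
+regUsage (ADD s1 s2 d) = RegUsage [s1, s2] [d]
+
+patchRegs :: Instr -> (Reg -> Reg) -> Instr
+patchRegs (MOV s d)     env = MOV (env s) (env d)
+patchRegs (ADD s1 s2 d) env = ADD (env s1) (env s2) (env d)
+
+-- Once the allocator has chosen a vreg-to-rreg mapping, it rewrites
+-- every instruction with patchRegs:
+applyMapping :: (Reg -> Reg) -> [Instr] -> [Instr]
+applyMapping env = map (`patchRegs` env)
+        </pre>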
+        <p>
+        <code>AsmRegAlloc.lhs</code> contains detailed comments about
+        how the allocator works.  Here is a summary.  The head honcho
+        <p>
+        <code>allocUsingTheseRegs :: [Instr] -> [Reg] -> (Bool, [Instr])</code>
+        <p>
+        takes a list of instructions and a list of real registers
+        available for allocation, and maps as many of the virtual regs
+        in the input into real ones as it can.  The returned
+        <code>Bool</code> indicates whether or not it was
+        successful.  If so, that's the end of it.  If not, the caller
+        of <code>allocUsingTheseRegs</code> will attempt spilling.
+        More of that later.  What <code>allocUsingTheseRegs</code>
+        does is:
+        <p>
+        <ul>
+        <li>Implicitly number each instruction by its position in the
+        input list.
+        <p>
+        <li>Using <code>insnFuture</code>, create the set of all flow
+        edges -- possible control transfers -- within this set of
+        insns.
+        <p>
+        <li>Using <code>regUsage</code> and iterating around the flow
+        graph from the previous step, calculate, for each virtual
+        register, the set of flow edges on which it is live.
+        <p>
+        <li>Make a real-register commitment map, which gives the set
+        of edges for which each real register is committed (in
+        use).  These sets are initially empty.  For each virtual
+        register, attempt to find a real register whose current
+        commitment does not intersect that of the virtual
+        register -- ie, is uncommitted on all edges on which the
+        virtual reg is live.  If successful, this means the vreg
+        can be assigned to the realreg, so add the vreg's set to
+        the realreg's commitment.
+        <p>
+        <li>If all the vregs were assigned to a realreg, use
+        <code>patchRegs</code> to apply the mapping to the insns
+        themselves.
+        </ul>
+        <p>
+        <b>Spilling</b>
+        <p>
+        If <code>allocUsingTheseRegs</code> fails, a baroque
+        mechanism comes into play.  We now know that much simpler
+        schemes are available to do the same thing and give better
+        results.
+        Anyways:
+        <p>
+        The logic above <code>allocUsingTheseRegs</code>, in
+        <code>doGeneralAlloc</code> and <code>runRegAllocate</code>,
+        observes that allocation has failed with some set R of real
+        registers.  So it applies <code>runRegAllocate</code> a second
+        time to the code, but removes (typically) two registers from R
+        before doing so.  This naturally fails too, but returns a
+        partially-allocated sequence.  <code>doGeneralAlloc</code>
+        then inserts spill code into the sequence, and finally re-runs
+        <code>allocUsingTheseRegs</code>, but supplying the original,
+        unadulterated R.  This is guaranteed to succeed since the two
+        registers previously removed from R are sufficient to allocate
+        all the spill/restore instructions added.
+        <p>
+        Because x86 is very short of registers, and in the worst case
+        needs three removed from R, a softly-softly approach is used.
+        <code>doGeneralAlloc</code> first tries with zero regs removed
+        from R, then if that fails one, then two, etc.  This means
+        <code>allocUsingTheseRegs</code> may get run several times
+        before a successful arrangement is arrived at.
+        <code>findReservedRegs</code> cooks up the sets of spill
+        registers to try with.
+        <p>
+        The resulting machinery is complicated and the generated spill
+        code is appalling.  The saving grace is that spills are very
+        rare so it doesn't matter much.  I did not invent this -- I
+        inherited it.
+        <p>
+        <b>Dealing with common cases fast</b>
+        <p>
+        The entire reg-alloc mechanism described so far is general and
+        correct, but expensive overkill for many simple code blocks.
+        So to begin with we use
+        <code>doSimpleAlloc</code>, which attempts to do something
+        simple.  It exploits the observation that if the total number
+        of virtual registers does not exceed the number of real ones
+        available, we can simply dole out a new realreg each time we
+        see mention of a new vreg, with no regard for control flow.
+        <code>doSimpleAlloc</code> therefore attempts this in a
+        single pass over the code.  It gives up if it runs out of real
+        regs or sees any condition which renders the above observation
+        invalid (fixed reg uses, for example).
+        <p>
+        This clever hack handles the majority of code blocks quickly.
+        It was copied from the previous reg-allocator (the
+        Mattson/Partain/Marlow/Gill one).
+      </ul>
+
+<p>
+<h2>Complications, observations, and possible improvements</h2>
+
+<h3>Real vs virtual registers in the instruction selectors</h3>
+
+The instruction selectors for expression trees, namely
+<code>getRegister</code>, are complicated by the fact that some
+expressions can only be computed into a specific register, whereas
+the majority can be computed into any register.  We take x86 as an
+example, but the problem applies to all archs.
+<p>
+Terminology: <em>rreg</em> means real register, a real machine
+register.  <em>vreg</em> means one of an infinite set of virtual
+registers.  The type <code>Reg</code> is the sum of <em>rreg</em> and
+<em>vreg</em>.  The instruction selector generates sequences with
+unconstrained use of vregs, leaving the register allocator to map them
+all into rregs.
+<p>
+Now, where was I?  Oh yes.  We return to the type of
+<code>getRegister</code>, which, despite its name, selects instructions
+to compute the value of an expression tree.
+<pre>
+   getRegister :: StixExpr -> NatM Register
+
+   data Register
+     = Fixed   PrimRep Reg InstrBlock
+     | Any     PrimRep (Reg -> InstrBlock)
+
+   type InstrBlock -- sequence of instructions
+</pre>
+At first this looks eminently reasonable (apart from the stupid
+name).  <code>getRegister</code>, and nobody else, knows whether or
+not a given expression has to be computed into a fixed rreg or can be
+computed into any rreg or vreg.  In the first case, it returns
+<code>Fixed</code> and indicates which rreg the result is in.  In the
+second case it defers committing to any specific target register by
+returning a function from <code>Reg</code> to <code>InstrBlock</code>,
+and the caller can specify the target reg as it sees fit.
+<p>
+Unfortunately, that forces <code>getRegister</code>'s callers (usually
+itself) to use a clumsy and confusing idiom in the common case where
+they do not care what register the result winds up in.  The reason is
+that although a value might be computed into a fixed rreg, we are
+forbidden (on pain of segmentation fault :) from subsequently
+modifying the fixed reg.  This and other rules are recorded in "Rules
+of the game" inside <code>MachCode.lhs</code>.
+<p>
+Why can't fixed registers be modified post-hoc?  Consider a simple
+expression like <code>Hp+1</code>.  Since the heap pointer
+<code>Hp</code> is definitely in a fixed register, call it R,
+<code>getRegister</code> on subterm <code>Hp</code> will simply return
+<code>Fixed</code> with an empty sequence and R.  But we can't just
+emit an increment instruction for R, because that trashes
+<code>Hp</code>; instead we first have to copy it into a fresh vreg
+and increment that.
+<p>
+With all that in mind, consider now writing a <code>getRegister</code>
+clause for terms of the form <code>(1 + E)</code>.
+Contrived, yes,
+but it illustrates the matter.  First we do
+<code>getRegister</code> on E.  Now we are forced to examine what
+comes back.
+<pre>
+   getRegister (OnePlus e)
+      = getRegister e           `thenNat`   \ e_result ->
+        case e_result of
+           Fixed rep e_reg e_code
+              -> returnNat (Any rep (\dst -> e_code ++ [MOV e_reg dst, INC dst]))
+           Any rep e_any
+              -> returnNat (Any rep (\dst -> e_any dst ++ [INC dst]))
+</pre>
+This seems unreasonably cumbersome, yet the instruction selector is
+full of such idioms.  A good example of the complexities induced by
+this scheme is shown by <code>trivialCode</code> for x86 in
+<code>MachCode.lhs</code>.  This deals with general integer dyadic
+operations on x86 and has numerous cases.  It was difficult to get
+right.
+<p>
+An alternative suggestion is to simplify the type of
+<code>getRegister</code> to this:
+<pre>
+   getRegister :: StixExpr -> NatM (InstrBlock, VReg)
+   type VReg = .... a vreg ...
+</pre>
+and then we could safely write
+<pre>
+   getRegister (OnePlus e)
+      = getRegister e    `thenNat`   \ (e_code, e_vreg) ->
+        returnNat (e_code ++ [INC e_vreg], e_vreg)
+</pre>
+which is about as straightforward as you could hope for.
+Unfortunately, it requires <code>getRegister</code> to insert moves of
+values which naturally compute into an rreg, into a vreg.  Consider:
+<pre>
+   1 + ccall some-C-fn
+</pre>
+On x86 the ccall result is returned in rreg <code>%eax</code>.  The
+resulting sequence, prior to register allocation, would be:
+<pre>
+   # push args
+   call some-C-fn
+   # move %esp to nuke args
+   movl %eax, %vreg
+   incl %vreg
+</pre>
+If, as is likely, <code>%eax</code> is not held live beyond this point
+for any other purpose, the move into a fresh register is pointless;
+we'd have been better off leaving the value in <code>%eax</code> as
+long as possible.
+<p>
+The simplified <code>getRegister</code> story is attractive.  It would
+clean up the instruction selectors significantly and make it simpler
+to write new ones.  The only drawback is that it generates redundant
+register moves.  I suggest that eliminating these should be the job
+of the register allocator.  Indeed:
+<ul>
+<li>There has been some work on this already ("Iterated register
+    coalescing"?), so this isn't a new idea.
+<p>
+<li>You could argue that the existing scheme inappropriately blurs the
+    boundary between the instruction selector and the register
+    allocator.  The instruction selector should .. well .. just
+    select instructions, without having to futz around worrying about
+    what kind of registers subtrees get generated into.  Register
+    allocation should be <em>entirely</em> the domain of the register
+    allocator, with the proviso that it should endeavour to allocate
+    registers so as to minimise the number of non-redundant reg-reg
+    moves in the final output.
+</ul>
+
+
+<h3>Selecting insns for 64-bit values/loads/stores on 32-bit platforms</h3>
+
+Note that this stuff doesn't apply on 64-bit archs, since the
+<code>getRegister</code> mechanism applies there.
+
+The relevant functions are:
+<pre>
+   assignMem_I64Code :: StixExpr -> StixExpr -> NatM InstrBlock
+   assignReg_I64Code :: StixReg  -> StixExpr -> NatM InstrBlock
+   iselExpr64        :: StixExpr -> NatM ChildCode64
+
+   data ChildCode64     -- a.k.a "Register64"
+      = ChildCode64
+           InstrBlock   -- code
+           VRegUnique   -- unique for the lower 32-bit temporary
+</pre>
+<code>iselExpr64</code> is the 64-bit, plausibly-named analogue of
+<code>getRegister</code>, and <code>ChildCode64</code> is the analogue
+of <code>Register</code>.
+The aim here was to generate working 64-bit
+code as simply as possible.  To this end, I used the
+simplified <code>getRegister</code> scheme described above, in which
+<code>iselExpr64</code> generates its results into two vregs which
+can always safely be modified afterwards.
+<p>
+Virtual registers are, unsurprisingly, distinguished by their
+<code>Unique</code>s.  There is a small difficulty in how to
+know what the vreg for the upper 32 bits of a value is, given the vreg
+for the lower 32 bits.  The simple solution adopted is to say that
+any low-32 vreg may also have a hi-32 counterpart which shares the
+same unique, but is otherwise regarded as a separate entity.
+<code>getHiVRegFromLo</code> gets one from the other.
+<pre>
+   data VRegUnique
+      = VRegUniqueLo Unique          -- lower part of a split quantity
+      | VRegUniqueHi Unique          -- upper part thereof
+</pre>
+Apart from that, 64-bit code generation is really simple.  The sparc
+and x86 versions are almost copy-n-pastes of each other, with minor
+adjustments for endianness.  The generated code isn't wonderful but
+is certainly acceptable, and it works.
+
+
+
+<h3>Shortcomings and inefficiencies in the register allocator</h3>
+
+<h4>Redundant reconstruction of the control flow graph</h4>
+
+The allocator goes to considerable computational expense to construct
+all the flow edges in the group of instructions it's allocating for,
+by using the <code>insnFuture</code> function in the
+<code>Instr</code> pseudo-abstract type.
+<p>
+This is really silly, because all that information is present at the
+abstract C stage, but is thrown away in the translation to Stix.
+So a good thing to do is to modify that translation to
+produce a directed graph of Stix straight-line code blocks,
+and to preserve that structure through the insn selector, so the
+allocator can see it.
+<p>
+This would eliminate the fragile, hacky, arch-specific
+<code>insnFuture</code> mechanism, and probably make the whole
+compiler run measurably faster.  Register allocation is a fair chunk
+of the time of non-optimising compilation (10% or more), and
+reconstructing the flow graph is an expensive part of reg-alloc.
+It would probably accelerate the vreg liveness computation too.
+
+<h4>Really ridiculous method for doing spilling</h4>
+
+This is a more ambitious suggestion, but ... reg-alloc should be
+reimplemented, using the scheme described in "Quality and speed in
+linear-scan register allocation" (Traub, Holloway and Smith).  For
+straight-line code blocks, this gives an elegant one-pass algorithm
+for assigning registers and creating the minimal necessary spill
+code, without the need for reserving spill registers ahead of time.
+<p>
+I tried it in Rigr, replacing the previous spiller which used the
+current GHC scheme described above, and it cut the number of spill
+loads and stores by a factor of eight.  Not to mention being simpler,
+easier to understand and very fast.
+<p>
+The Traub paper also describes how to extend their method to multiple
+basic blocks, which will be needed for GHC.  It comes down to
+reconciling multiple vreg-to-rreg mappings at points where control
+flow merges.
+
+<h4>Redundant-move support for revised instruction selector suggestion</h4>
+
+As mentioned above, simplifying the instruction selector will require
+the register allocator to try and allocate source and destination
+vregs to the same rreg in reg-reg moves, so as to make as many as
+possible go away.  Without that, the revised insn selector would
+generate worse code than at present.
+I know this stuff has been done,
+but I know nothing about it.  The linear-scan reg-alloc paper mentioned
+above does indeed mention a bit about it in the context of single
+basic blocks, but I don't know if that's sufficient.
+
+
+
+<h3>x86 arcana that you should know about</h3>
+
+The main difficulty with x86 is that many instructions have fixed
+register constraints, which can occasionally make reg-alloc fail
+completely.  And the FPU doesn't have the flat register model which
+the reg-alloc abstraction (implicitly) assumes.
+<p>
+Our strategy is: do a good job for the common small subset, that is
+integer loads, stores, address calculations, basic ALU ops (+, -,
+and, or, xor), and jumps.  That covers the vast majority of
+executed insns.  And indeed we do do a good job, with a loss of
+less than 2% compared with gcc.
+<p>
+Initially we tried to handle integer instructions with awkward
+register constraints (mul, div, shifts by non-constant amounts) via
+various jigglings of the spiller et al.  This never worked robustly,
+and putting platform-specific tweaks in the generic infrastructure is
+a big No-No.  (Not quite true; shifts by a non-constant amount are
+still done by a giant kludge, and should be moved into this new
+framework.)
+<p>
+Fortunately, all such insns are rare.  So the current scheme is to
+pretend that they don't have any such constraints.  This fiction is
+carried all the way through the register allocator.  When the insn
+finally comes to be printed, we emit a sequence which copies the
+operands through memory (<code>%esp</code>-relative), satisfying the
+constraints of the real instruction.  This localises the gruesomeness
+to just one place.  Here, for example, is the code generated for
+integer division of <code>%esi</code> by <code>%ecx</code>:
+<pre>
+   # BEGIN IQUOT %ecx, %esi
+   pushl $0
+   pushl %eax
+   pushl %edx
+   pushl %ecx
+   movl %esi, %eax
+   cltd
+   idivl 0(%esp)
+   movl %eax, 12(%esp)
+   popl %edx
+   popl %edx
+   popl %eax
+   popl %esi
+   # END IQUOT %ecx, %esi
+</pre>
+This is not quite as appalling as it seems, if you consider that the
+division itself typically takes 16+ cycles, whereas the rest of the
+insns probably go through in about 1 cycle each.
+<p>
+This trick is taken to extremes for FP operations.
+<p>
+All notions of the x86 FP stack and its insns have been removed.
+Instead, we pretend, to the instruction selector and register
+allocator, that x86 has six floating point registers,
+<code>%fake0</code> .. <code>%fake5</code>, which can be used in the
+usual flat manner.  We further claim that x86 has floating point
+instructions very similar to SPARC and Alpha, that is, a simple
+3-operand register-register arrangement.  Code generation and register
+allocation proceed on this basis.
+<p>
+When we come to print out the final assembly, our convenient fiction
+is converted to dismal reality.  Each fake instruction is
+independently converted to a series of real x86 instructions.
+<code>%fake0</code> .. <code>%fake5</code> are mapped to
+<code>%st(0)</code> .. <code>%st(5)</code>.  To do reg-reg arithmetic
+operations, the two operands are pushed onto the top of the FP stack,
+the operation done, and the result copied back into the relevant
+register.  When one of the operands is also the destination, we emit a
+slightly less scummy translation.  There are only six
+<code>%fake</code> registers because 2 are needed for the translation,
+and x86 has 8 in total.
+<p>
+The translation is inefficient but is simple and it works.
+A cleverer
+translation would handle a sequence of insns, simulating the FP stack
+contents, would not impose a fixed mapping from <code>%fake</code> to
+<code>%st</code> regs, and hopefully could avoid most of the redundant
+reg-reg moves of the current translation.
+<p>
+There are, however, two unforeseen bad side effects:
+<ul>
+<li>This doesn't work properly, because it doesn't observe the normal
+    conventions for x86 FP code generation.  It turns out that each of
+    the 8 elements in the x86 FP register stack has a tag bit which
+    indicates whether that register is notionally in use or
+    not.  If you do an FPU operation which happens to read a
+    tagged-as-empty register, you get an x87 FPU (stack invalid)
+    exception, which is normally handled by the FPU without passing it
+    to the OS: the program keeps going, but the resulting FP values
+    are garbage.  The OS can ask for the FPU to pass it FP
+    stack-invalid exceptions, but it usually doesn't.
+    <p>
+    Anyways: inside NCG-created x86 FP code this all works fine.
+    However, the NCG's fiction of a flat register set does not operate
+    the x87 register stack in the required stack-like way.  When
+    control returns to a gcc-generated world, the stack tag bits soon
+    cause stack exceptions, and thus garbage results.
+    <p>
+    The only fix I could think of -- and it is horrible -- is to clear
+    all the tag bits just before the next STG-level entry, in chunks
+    of code which use FP insns.  <code>i386_insert_ffrees</code>
+    inserts the relevant <code>ffree</code> insns into such code
+    blocks.  It depends critically on <code>is_G_instr</code> to
+    detect such blocks.
+<p>
+<li>It's very difficult to read the generated assembly and
+    reason about it when debugging, because there's so much clutter.
+    We print the fake insns as comments in the output, and that helps
+    a bit.
+</ul>
+
+
+
+<h3>Generating code for ccalls</h3>
+
+For reasons I don't really understand, the instruction selectors for
+generating calls to C (<code>genCCall</code>) have proven surprisingly
+difficult to get right, and soaked up a lot of debugging time.  As a
+result, I have once again opted for schemes which are simple and not
+too difficult to argue as correct, even if they don't generate
+excellent code.
+<p>
+The sparc ccall generator in particular forces all arguments into
+temporary virtual registers before moving them to the final
+out-registers (<code>%o0</code> .. <code>%o5</code>).  This creates
+some unnecessary reg-reg moves.  The reason is explained in a
+comment in the code.
+
+
+<h3>Duplicate implementation for many STG macros</h3>
+
+This has been discussed at length already.  It has caused a couple of
+nasty bugs due to subtle untracked divergence in the macro
+translations.  The macro-expander really should be pushed up into the
+Abstract C phase, so the problem can't happen.
+<p>
+Doing so would have the added benefit that the NCG could be used to
+compile more "ways" -- well, at least the 'p' profiling way.
+
+
+<h3>How to debug the NCG without losing your sanity/hair/cool</h3>
+
+Last, but definitely not least ...
+<p>
+The usual syndrome is that some program, when compiled via C, works,
+but not when compiled via the NCG.  Usually the problem is fairly
+simple to fix, once you find the specific code block which has been
+mistranslated.  But the latter can be nearly impossible, since most
+modules generate at least hundreds and often thousands of them.
+<p>
+My solution: cheat.
+
+<p>
+Because the via-C and native routes diverge only late in the day,
+it is not difficult to construct a 1-1 correspondence between basic
+blocks on the two routes.  So, if the program works via C but not on
+the NCG, do the following:
+<ul>
+<li>Recompile <code>AsmCodeGen.lhs</code> in the afflicted compiler
+    with <code>-DDEBUG_NCG</code>, so that it inserts
+    <code>___ncg_debug_marker</code>s
+    into the assembly it emits.
+<p>
+<li>Using a binary search on modules, find the module which is causing
+    the problem.
+<p>
+<li>Compile that module to assembly code, with identical flags, twice,
+    once via C and once via NCG.
+    Call the outputs <code>ModuleName.s-gcc</code> and
+    <code>ModuleName.s-nat</code>.  Check that the latter does indeed have
+    <code>___ncg_debug_marker</code>s in it; otherwise the next steps fail.
+<p>
+<li>Build (with a working compiler) the program
+    <code>fptools/ghc/utils/debugNCG/diff_gcc_nat</code>.
+<p>
+<li>Run: <code>diff_gcc_nat ModuleName.s</code>.  This will
+    construct the 1-1 correspondence, and emit on stdout
+    a cppable assembly output.  Place this in a file -- I always
+    call it <code>synth.S</code>.  Note, the capital S is important;
+    otherwise it won't get cpp'd.  You can feed this file directly to
+    ghc and it will automatically get cpp'd; you don't have to do so
+    yourself.
+<p>
+<li>By messing with the <code>#define</code>s at the top of
+    <code>synth.S</code>, do a binary search to find the incorrect
+    block.  Keep a careful record of where you are in the search; it
+    is easy to get confused.  Remember also that multiple blocks may
+    be wrong, which also confuses matters.  Finally, I usually start
+    off by re-checking that I can build the executable with all the
+    <code>#define</code>s set to 0 and then all to 1.  This ensures
+    you won't get halfway through the search and then get stuck due to
+    some snafu with gcc-specific literals.  Usually I set
+    <code>UNMATCHED_GCC</code> to 1 all the time, and this bit should
+    contain only literal data.
+    <code>UNMATCHED_NAT</code> should be empty.
+</ul>
+<p>
+<code>diff_gcc_nat</code> was known to work correctly last time I used
+it, in December 01, for both x86 and sparc.  If it doesn't work, due
+to changes in assembly syntax, or whatever, make it work.  The
+investment is well worth it.  Searching for the incorrect block(s) any
+other way is a total time waster.
+
+
+
+</ul>
+
+
+
+
+
+  <p><small>
+<!-- hhmts start -->
+Last modified: Fri Feb 1 16:14:11 GMT 2002
+<!-- hhmts end -->
+  </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/optimistic.html b/docs/comm/the-beast/optimistic.html
new file mode 100644
index 0000000000..4d158022e8
--- /dev/null
+++ b/docs/comm/the-beast/optimistic.html
@@ -0,0 +1,65 @@
+<h2> Architectural stuff </h2>
+
+New fields in the TSO:
+<ul>
+<li> New global speculation-depth register; always counts the number of speculation frames
+on the stack; incremented when
+starting speculation, decremented when finishing.
+<li> Profiling stuff
+</ul>
+
+
+<h2> Speculation frames </h2>
+
+The info table for a speculation frame points to the static spec-depth configuration
+for that speculation point.  (Points to, because the config is mutable, and the info
+table has to be adjacent to the (immutable) code.)
+
+
+
+<h2> Abortion</h2>
+
+Abortion is modelled by a special asynchronous exception ThreadAbort.
+
+<ul>
+<li> In the scheduler, if a thread returns with ThreadBlocked, and non-zero SpecDepth, send it
+an asynchronous exception.
+
+<li> In the implementation of the <tt>catch#</tt> primop, raise an asynchronous exception if
+SpecDepth is nonzero.
+
+<li> Timeout, administered by scheduler.  Current story: abort if a speculation frame lasts from
+one minor GC to the next.  We detect this by seeing if there's a profiling frame on the stack --- a
+profiling frame is added at a minor GC in place of a speculation frame (see Online Profiling).
+</ul>
+
+
+When tearing frames off the stack, we start a new chunk at every speculation frame, as well as every
+update frame.  We proceed down to the deepest speculation frame.
+<p>
+The <tt>AP_STACK</tt> closure built for a speculation frame must be careful <em>not</em> to enter the
+next <tt>AP_STACK</tt> closure up, because that would re-enter a possible loop.
+<p>
+Delivering an asynch exception to a thread that is speculating works as follows.  Invariant: there can
+be no catch frames inside speculation (we abort in <tt>catch#</tt> when speculating).  So the asynch
+exception just tears off frames in the standard way until it gets to a catch frame, just as it would
+usually do.
+<p>
+Abortion can punish one or more of the speculation frames by decrementing their static config variables.
+
+<h3>Synchronous exceptions</h3>
+
+Synchronous exceptions are treated much as before.  The stack is discarded up to an update frame; the
+thunk to be updated is overwritten with "raise x", and the process continues, until a catch frame is
+reached.
+<p>
+When we find a spec frame, we allocate a "raise x" object, and resume execution with the return address
+in the spec frame.  In that way the spec frame is like a catch frame; it stops the unwinding process.
+<p>
+It's essential that every hard failure is caught, else speculation is unsafe.  In particular, divide by zero
+is hard to catch using OS support, so we test explicitly in library code (a sketch of such a test appears
+at the end of this page).  You can shoot yourself in the foot
+by writing <tt>x `div#` 0</tt>, side-stepping the test.
+
+
+<h3> Online profiling </h3>
+
+Sampling can be more frequent than minor GC (by jiggling the end-of-block code) but cannot
+be less frequent, because GC doesn't expect to see profiling frames.
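+<p>
+To make the library-level zero test concrete, here is a minimal sketch
+(hand-written, <em>not</em> the actual library code; the
+<tt>MagicHash</tt> pragma and the names imported from <tt>GHC.Exts</tt>
+are assumptions based on present-day GHC):
+<blockquote><pre>
+{-# LANGUAGE MagicHash #-}
+import GHC.Exts (Int(I#), quotInt#)
+
+-- The test happens in ordinary library code, *before* the unchecked
+-- primop runs, so the failure is a catchable exception rather than a
+-- hard OS-level fault.
+safeQuot :: Int -> Int -> Int
+safeQuot _       (I# 0#) = error "divide by zero"
+safeQuot (I# x) (I# y)   = I# (quotInt# x y)
+</pre></blockquote>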
\ No newline at end of file
diff --git a/docs/comm/the-beast/prelude.html b/docs/comm/the-beast/prelude.html
new file mode 100644
index 0000000000..64b607def5
--- /dev/null
+++ b/docs/comm/the-beast/prelude.html
@@ -0,0 +1,207 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Primitives and the Prelude</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Primitives and the Prelude</h1>
+    <p>
+      One of the trickiest aspects of GHC is the delicate interplay
+      between what knowledge is baked into the compiler, and what
+      knowledge it gets by reading the interface files of library
+      modules.  In general, the less that is baked in, the better.
+<p>
+      Most of what the compiler has to have wired in about primitives and
+      prelude definitions is in
+      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/"><code>fptools/ghc/compiler/prelude/</code></a>.
+    </p>
+
+GHC recognises these main classes of baked-in-ness:
+<dl>
+<dt><strong>Primitive types.</strong>
+<dd>Primitive types cannot be defined in Haskell, and are utterly baked into the compiler.
+They are notionally defined in the fictional module <tt>GHC.Prim</tt>.  The <tt>TyCon</tt>s for these types are all defined
+in module <tt>TysPrim</tt>; for example,
+<pre>
+  intPrimTyCon :: TyCon
+  intPrimTyCon = ....
+</pre>
+Examples:
+<tt>Int#, Float#, Addr#, State#</tt>.
+<p>
+<dt><strong>Wired-in types.</strong>
+<dd>Wired-in types can be defined in Haskell, and indeed are (many are defined in <tt>GHC.Base</tt>).
+However, it's very convenient for GHC to be able to use the type constructor for (say) <tt>Int</tt>
+without looking it up in any environment.  So module <tt>TysWiredIn</tt> contains many definitions
+like this one:
+<pre>
+  intTyCon :: TyCon
+  intTyCon = ....
+
+  intDataCon :: DataCon
+  intDataCon = ....
+</pre>
+However, since a <tt>TyCon</tt> value contains the entire type definition inside it, it follows
+that the complete definition of <tt>Int</tt> is thereby baked into the compiler.
+<p>
+Nevertheless, the library module <tt>GHC.Base</tt> still contains a definition for <tt>Int</tt>
+just so that its info table etc. get generated somewhere.  Chaos will result if the wired-in definition
+in <tt>TysWiredIn</tt> differs from that in <tt>GHC.Base</tt>.
+<p>
+The rule is that only very simple types should be wired in (for example, <tt>Ratio</tt> is not,
+and <tt>IO</tt> is certainly not).  No class is wired in: classes are just too complicated.
+<p>
+Examples: <tt>Int</tt>, <tt>Float</tt>, <tt>List</tt>, tuples.
+
+<p>
+<dt><strong>Known-key things.</strong>
+<dd>GHC knows of the existence of many, many other types, classes and values.  <em>But all it knows is
+their <tt>Name</tt>.</em>  Remember, a <tt>Name</tt> includes a unique key that identifies the
+thing, plus its defining module and occurrence name
+(see <a href="names.html">The truth about Names</a>).  Knowing a <tt>Name</tt>, therefore, GHC can
+run off to the interface file for the module and find out everything else it might need.
+<p>
+Most of these known-key names are defined in module <tt>PrelNames</tt>; a further swathe concerning
+Template Haskell is defined in <tt>DsMeta</tt>.  The allocation of unique keys is done manually;
+chaotic things happen if you make a mistake here, which is why they are all together.
+</dl>
+
+All the <tt>Name</tt>s from all the above categories are used to initialise the global name cache,
+which maps (module, occurrence-name) pairs to the globally-unique <tt>Name</tt> for that
+thing.  (See <tt>HscMain.initOrigNames</tt>.)
+
+<p>
+The next sections elaborate these three classes a bit.
+
+
+    <h2>Primitives (module <tt>TysPrim</tt>)</h2>
+    <p>
+      Some types and functions have to be hardwired into the compiler as they
+      are atomic; all other code is essentially built around this primitive
+      functionality.  This includes basic arithmetic types, such as integers,
+      and their elementary operations as well as pointer types.  Primitive
+      types and functions often receive special treatment in the code
+      generator, which means that these entities have to be explicitly
+      represented in the compiler.  Moreover, many of these types receive some
+      explicit treatment in the runtime system, and so, there is some further
+      information about <a href="../rts-libs/primitives.html">primitives in
+      the RTS section</a> of this document.
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysPrim.lhs"><code>TysPrim</code></a>
+      exports a list of all primitive type constructors as <code>primTyCons ::
+      [TyCon]</code>.  All of these type constructors (of type
+      <code>TyCon</code>) are also exported as <code>intPrimTyCon</code>,
+      <code>stablePtrPrimTyCon</code>, and so on.  In addition, for each
+      nullary type constructor the corresponding type (of type
+      <code>Type</code>) is also exported; for example, we have
+      <code>intPrimTy :: Type</code>.  For all other type constructors, a
+      function is exported that constructs the type obtained by applying the
+      type constructor to an argument type (of type <code>Type</code>); for
+      example, we have <code>mkStablePtrPrimTy :: Type -> Type</code>.
+    <p>
+      As it is inconvenient to identify the types that receive special
+      treatment by the code generator by looking at their names, the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrimRep.lhs"><code>PrimRep</code></a>
+      exports a data type <code>PrimRep</code>, which lists all
+      machine-manipulable implementation types.  The module also exports a set
+      of query functions on <code>PrimRep</code> that define properties, such
+      as a type's byte size or whether a primitive type is a pointer type.
+      Moreover, the function <code>TysPrim.primRepTyCon :: PrimRep ->
+      TyCon</code> converts <code>PrimRep</code> values into the corresponding
+      type constructor.
+
+    <h2>Wired in types (module <tt>TysWiredIn</tt>)</h2>
+    <p>
+      In addition to the entities that are primitive because the compiler has
+      to treat them specially in the backend, there is a set of types,
+      functions, etc. that the Haskell language definition flags as essential
+      to the language by placing them into the special module
+      <code>Prelude</code> that is implicitly imported into each Haskell
+      module.  For some of these entities it suffices to define them (by
+      standard Haskell definitions) in a <code>Prelude</code> module and to
+      ensure that this module is treated specially by always being imported.
+    <p>
+      However, there is a set of entities (such as, for example, the list type
+      and the corresponding data constructors) that have an in-between status:
+      They are not truly primitive (lists, for example, can easily be defined
+      by a <code>data</code> declaration), but the compiler has to have extra
+      knowledge about them, as they are associated with some particular
+      features of the language (in the case of lists, there is special syntax,
+      such as list comprehensions, associated with the type).  Another example
+      of a special kind of entity are the type classes that can be used
+      in a <code>deriving</code> clause.  All types that are not primitive,
+      but about which the compiler nonetheless has to have some extra
+      knowledge, are defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>.
+    <p>
+      All wired in type constructors are contained in <code>wiredInTyCons ::
+      [TyCon]</code>.  In addition to that list, <code>TysWiredIn</code>
+      exports variables bound to representations of all listed type
+      constructors and their data constructors.  So, for example, we have
+      <code>listTyCon</code> together with <code>nilDataCon</code> and
+      <code>consDataCon</code>.  There are also convenience functions, such
+      as <code>mkListTy</code> and <code>mkTupleTy</code>, which construct
+      compound types.
+    <p>
+
+    <h2>Known-key names (module <tt>PrelNames</tt>)</h2>
+
+    All names of types, functions, etc. known to the compiler are defined in
+    <a
+    href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
+    This includes the names of types and functions exported from
+    <code>TysWiredIn</code>, but also others.  In particular, this module
+    also fixes the names of all prelude modules; i.e., of the modules whose
+    name starts with <code>Prel</code>, which GHC's library uses to bring
+    some structure into the quite large number of <code>Prelude</code>
+    definitions.
+    <p>
+      <code>PrelNames.knownKeyNames :: [Name]</code> contains all names known
+      to the compiler, but the elements of the list are also exported
+      individually as variables, such as <code>floatTyConName</code> (having
+      the lexeme <code>Float</code>) and <code>floatDataConName</code> (having
+      the lexeme <code>F#</code>).  For each of these names,
+      <code>PrelNames</code> defines a unique key with a definition, such as
+      <p>
+<blockquote><pre>
+floatPrimTyConKey = mkPreludeTyConUnique 11</pre>
+</blockquote>
+    <p>
+      that is, all unique keys for known prelude names are hardcoded into
+      <code>PrelNames</code> (and uniqueness has to be manually ensured in
+      that module).  To simplify matching the types of important groups of
+      type constructors, <code>PrelNames</code> also exports lists, such as
+      <code>numericTyKeys</code> (keys of all numeric types), that contain the
+      unique keys of all names in that group.  In addition, derivable type
+      classes and their structure are defined by
+      <code>derivableClassKeys</code> and related definitions.
+    <p>
+      In addition to names that have unique keys, <code>PrelNames</code> also
+      defines a set of names without uniqueness information.  These names end
+      in the suffix <code>_RDR</code> and are of type <code>RdrName</code> (an
+      example is <code>times_RDR</code>, which represents the lexeme
+      <code>*</code>).
+      The names are used in locations where they pass
+      through the renamer anyway (e.g., special constructors encountered by
+      the parser, such as [], and code generated from deriving clauses), which
+      will take care of adding uniqueness information.
+    <p>
+
+<h2>Gathering it all together (module <tt>PrelInfo</tt>)</h2>
+    The module
+    <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelInfo.lhs"><code>PrelInfo</code></a>
+    in some sense ties all the above together and provides a reasonably
+    restricted interface to these definitions for the rest of the compiler.
+    However, from what I have seen, this doesn't quite work out and the
+    modules mentioned earlier are directly imported in many places.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Dec 11 17:54:07 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/renamer.html b/docs/comm/the-beast/renamer.html
new file mode 100644
index 0000000000..828b569bb9
--- /dev/null
+++ b/docs/comm/the-beast/renamer.html
@@ -0,0 +1,249 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Glorious Renamer</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Glorious Renamer</h1>
+    <p>
+      The <em>renamer</em> sits between the parser and the typechecker.
+      However, its operation is quite tightly interwoven with the
+      typechecker.  This is partially due to support for Template Haskell,
+      where spliced code has to be renamed and type checked.  In particular,
+      top-level splices lead to multiple rounds of renaming and type
+      checking.
+    </p>
+    <p>
+      The main externally used functions of the renamer are provided by the
+      module <code>rename/RnSource.lhs</code>.  In particular, we have
+    </p>
+    <blockquote>
+      <pre>
+rnSrcDecls  :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name)
+rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name]
+rnSplice    :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)</pre>
+    </blockquote>
+    <p>
+      All of these execute in the renamer monad <code>RnM</code>.  The first
+      function, <code>rnSrcDecls</code>, renames a binding group; the second,
+      <code>rnTyClDecls</code>, renames a list of (toplevel) type and class
+      declarations; and the third, <code>rnSplice</code>, renames a Template
+      Haskell splice.  As the types indicate, the main task of the renamer is
+      to convert all the <tt>RdrNames</tt> to <a
+      href="names.html"><tt>Names</tt></a>, which includes a number of
+      well-formedness checks (no duplicate declarations, all names are in
+      scope, and so on).  In addition, the renamer performs other, not
+      strictly name-related, well-formedness checks, which include checking
+      that the appropriate flags have been supplied whenever language
+      extensions are used in the source.
+    </p>
+
+    <h2>RdrNames</h2>
+    <p>
+      A <tt>RdrName.RdrName</tt> is pretty much just a string (for an
+      unqualified name like "<tt>f</tt>") or a pair of strings (for a
+      qualified name like "<tt>M.f</tt>"):
+    </p>
+    <blockquote>
+      <pre>
+data RdrName
+  = Unqual OccName
+        -- Used for ordinary, unqualified occurrences
+
+  | Qual Module OccName
+        -- A qualified name written by the user in
+        -- *source* code.  The module isn't necessarily
+        -- the module where the thing is defined;
+        -- just the one from which it is imported
+
+  | Orig Module OccName
+        -- An original name; the module is the *defining* module.
+        -- This is used when GHC generates code that will be fed
+        -- into the renamer (e.g. from deriving clauses), but where
+        -- we want to say "Use Prelude.map dammit".
+
+  | Exact Name
+        -- We know exactly the Name. This is used
+        --  (a) when the parser parses built-in syntax like "[]"
+        --      and "(,)", but wants a RdrName from it
+        --  (b) when converting names to the RdrNames in IfaceTypes
+        --      Here an Exact RdrName always contains an External Name
+        --      (Internal Names are converted to simple Unquals)
+        --  (c) by Template Haskell, when TH has generated a unique name</pre>
+    </blockquote>
+    <p>
+      The OccName type is described in <a href="names.html#occname">The
+      truth about names</a>.
+    </p>
+
+    <h2>The Renamer Monad</h2>
+    <p>
+      Due to the tight integration of the renamer with the typechecker, both
+      use the same monad in recent versions of GHC.  So, we have
+    </p>
+    <blockquote>
+      <pre>
+type RnM  a = TcRn a            -- Historical
+type TcM  a = TcRn a            -- Historical</pre>
+    </blockquote>
+    <p>
+      with the combined monad defined as
+    </p>
+    <blockquote>
+      <pre>
+type TcRn a       = TcRnIf TcGblEnv TcLclEnv a
+type TcRnIf a b c = IOEnv (Env a b) c
+
+data Env gbl lcl        -- Changes as we move into an expression
+  = Env {
+        env_top :: HscEnv,      -- Top-level stuff that never changes
+                                --   Includes all info about imported things
+
+        env_us  :: TcRef UniqSupply,    -- Unique supply for local variables
+
+        env_gbl :: gbl,         -- Info about things defined at the top level
+                                --   of the module being compiled
+
+        env_lcl :: lcl          -- Nested stuff; changes as we go into
+                                -- an expression
+    }</pre>
+    </blockquote>
+    <p>
+      The details of the global environment type <code>TcGblEnv</code> and
+      local environment type <code>TcLclEnv</code> are also defined in the
+      module <code>typecheck/TcRnTypes.lhs</code>.  The monad
+      <code>IOEnv</code> is defined in <code>utils/IOEnv.hs</code> and extends
+      the vanilla <code>IO</code> monad with an additional state parameter
+      <code>env</code> that is treated as in a reader monad.  (Side-effecting
+      operations, such as updating the unique supply, are done with
+      <code>TcRef</code>s, which are simply a synonym for <code>IORef</code>s.)
+    </p>
+
+    <h2>Name Space Management</h2>
+    <p>
+      As anticipated by the variants <code>Orig</code> and <code>Exact</code>
+      of <code>RdrName</code>, some names should not change during renaming,
+      whereas others need to be turned into unique names.  In this context,
+      the two functions <code>RnEnv.newTopSrcBinder</code> and
+      <code>RnEnv.newLocalsRn</code> are important:
+    </p>
+    <blockquote>
+      <pre>
+newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name
+newLocalsRn     :: [Located RdrName] -> RnM [Name]</pre>
+    </blockquote>
+    <p>
+      The two functions introduce new toplevel and new local names,
+      respectively, where the first two arguments to
+      <code>newTopSrcBinder</code> determine the currently compiled module and
+      the parent construct of the newly defined name.  Both functions create
+      new names only for <code>RdrName</code>s that are neither exact nor
+      original.
+    </p>
+
+    <h3>Introduction of Toplevel Names: Global RdrName Environment</h3>
+    <p>
+      A global <code>RdrName</code> environment
+      <code>RdrName.GlobalRdrEnv</code> is a map from <code>OccName</code>s to
+      lists of qualified names.
+      More precisely, the latter are
+      <code>Name</code>s with an associated <code>Provenance</code>:
+    </p>
+    <blockquote>
+      <pre>
+data Provenance
+  = LocalDef            -- Defined locally
+        Module
+
+  | Imported            -- Imported
+        [ImportSpec]    -- INVARIANT: non-empty
+        Bool            -- True iff the thing was named *explicitly*
+                        -- in *any* of the import specs rather than being
+                        -- imported as part of a group;
+        -- e.g.
+        --      import B
+        --      import C( T(..) )
+        -- Here, everything imported by B, and the constructors of T
+        -- are not named explicitly; only T is named explicitly.
+        -- This info is used when warning of unused names.</pre>
+    </blockquote>
+    <p>
+      The part of the global <code>RdrName</code> environment for a module
+      that contains the local definitions is created by the function
+      <code>RnNames.importsFromLocalDecls</code>, which also computes a data
+      structure recording all imported declarations in the form of a value of
+      type <code>TcRnTypes.ImportAvails</code>.
+    </p>
+    <p>
+      The function <code>importsFromLocalDecls</code>, in turn, makes use of
+      <code>RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM
+      [AvailInfo]</code> to extract all declared names from a binding group,
+      where <code>HscTypes.AvailInfo</code> is essentially a collection of
+      <code>Name</code>s; i.e., <code>getLocalDeclBinders</code>, on the fly,
+      generates <code>Name</code>s from the <code>RdrName</code>s of all
+      top-level binders of the module represented by the <code>HsGroup
+      RdrName</code> argument.
+    </p>
+    <p>
+      It is important to note that all this happens before the renamer
+      actually descends into the toplevel bindings of a module.  In other
+      words, before <code>TcRnDriver.rnTopSrcDecls</code> performs the
+      renaming of a module by way of <code>RnSource.rnSrcDecls</code>, it uses
+      <code>importsFromLocalDecls</code> to set up the global
+      <code>RdrName</code> environment, which contains <code>Name</code>s for
+      all imported <em>and</em> all locally defined toplevel binders.  Hence,
+      when the helpers of <code>rnSrcDecls</code> come across the
+      <em>defining</em> occurrences of a toplevel <code>RdrName</code>, they
+      don't rename it by generating a new name, but they simply look up its
+      name in the global <code>RdrName</code> environment.
+    </p>
+
+    <h2>Rebindable syntax</h2>
+    <p>
+      In Haskell, when one writes "3" one gets "fromInteger 3", where
+      "fromInteger" comes from the Prelude (regardless of whether the
+      Prelude is in scope).  If you want to completely redefine numbers,
+      that becomes inconvenient.  So GHC lets you say
+      "-fno-implicit-prelude"; in that case, the "fromInteger" comes from
+      whatever is in scope.  (This is documented in the User Guide.)
+    </p>
+    <p>
+      This feature is implemented as follows (I always forget).
+      <ul>
+      <li>Names that are implicitly bound by the Prelude are marked by the
+        type <code>HsExpr.SyntaxExpr</code>.  Moreover, the association list
+        <code>HsExpr.SyntaxTable</code> is set up by the renamer to map
+        rebindable names to the value they are bound to.
+      </li>
+      <li>Currently, five constructs related to numerals
+        (<code>HsExpr.NegApp</code>, <code>HsPat.NPat</code>,
+        <code>HsPat.NPlusKPat</code>, <code>HsLit.HsIntegral</code>, and
+        <code>HsLit.HsFractional</code>) and
+        two constructs related to <code>do</code> expressions
+        (<code>HsExpr.BindStmt</code> and
+        <code>HsExpr.ExprStmt</code>) have rebindable syntax.
+      </li>
+      <li> When the parser builds these constructs, it puts in the
+        built-in Prelude Name (e.g. PrelNum.fromInteger).
+      </li>
+      <li> When the renamer encounters these constructs, it calls
+        <tt>RnEnv.lookupSyntaxName</tt>.
+        This checks whether <tt>-fno-implicit-prelude</tt> is in force; if
+        not, it just returns the same Name; otherwise it takes the occurrence
+        name of the Name, turns it into an unqualified RdrName, and looks it
+        up in the environment.  The returned name is plugged back into the
+        construct.
+      </li>
+      <li> The typechecker uses the Name to generate the appropriate typing
+        constraints.
+      </li>
+      </ul>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed May  4 17:16:15 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
+
diff --git a/docs/comm/the-beast/simplifier.html b/docs/comm/the-beast/simplifier.html
new file mode 100644
index 0000000000..40cf7cf892
--- /dev/null
+++ b/docs/comm/the-beast/simplifier.html
@@ -0,0 +1,86 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Mighty Simplifier</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Mighty Simplifier</h1>
+    <p>
+      Most of the optimising program transformations applied by GHC are
+      performed on an intermediate language called <em>Core,</em> which
+      essentially is a compiler-friendly formulation of rank-2 polymorphic
+      lambda terms defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs/"><code>CoreSyn.lhs</code></a>.
+      The transformation engine optimising Core programs is called the
+      <em>Simplifier</em> and is composed of a number of modules located in the
+      directory <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/"><code>fptools/ghc/compiler/simplCore/</code></a>.
+      The main engine of the simplifier is contained in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
+      and its driver is the routine <code>core2core</code> in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore.lhs</code></a>.
+    <p>
+      The program that the simplifier has produced after applying its various
+      optimisations can be obtained by passing the option
+      <code>-ddump-simpl</code> to GHC.  Moreover, the various intermediate
+      stages of the optimisation process are printed when passing
+      <code>-dverbose-core2core</code>.
+
+    <h4><a name="loopBreaker">Recursive Definitions</a></h4>
+    <p>
+      The simplification process has to take special care when handling
+      recursive binding groups; otherwise, the compiler might loop.
+      Therefore, the routine <code>reOrderRec</code> in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/OccurAnal.lhs"><code>OccurAnal.lhs</code></a>
+      computes a set of <em>loop breakers</em> - a set of definitions that
+      together cut any possible loop in the binding group.  It marks the
+      identifiers bound by these definitions as loop breakers by enriching
+      their <a href="basicTypes.html#occInfo">occurrence information.</a>  Loop
+      breakers will <em>never</em> be inlined by the simplifier, thus
+      guaranteeing termination of the simplification procedure.  (This is not
+      entirely accurate -- see <a href="#rules">rewrite rules</a> below.)
+
+      The process of finding loop breakers works as follows: First, the
+      strongly connected components (SCCs) of the graph representing all
+      function dependencies are computed.  Then, each SCC is inspected in turn.
+      If it contains only a single binding (a self-recursive function), this is
+      the loop breaker.  In case of multiple recursive bindings, the function
+      attempts to select bindings where the decision not to inline them
+      causes the least harm - in the sense of inhibiting optimisations in the
+      code.  This is achieved by considering each binding in turn and awarding
+      a <em>score</em> between 0 and 4, where a lower score means that the
+      function is less useful for inlining - and thus, a better loop breaker.
+      The evaluation of bindings is performed by the function
+      <code>score</code> locally defined in <code>OccurAnal</code>.
+
+      Note that, because core programs represent function definitions as
+      <em>one</em> binding choosing between the possibly many equations in the
+      source program with a <code>case</code> construct, none of a loop
+      breaker's possibly many alternatives can be inlined (not even the
+      non-recursive alternatives).
+
+    <h4><a name="rules">Rewrite Rules</a></h4>
+    <p>
+      The application of rewrite rules is controlled in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
+      by the function <code>completeCall</code>.  This function first checks
+      whether it should inline the function applied at the currently inspected
+      call site, then simplifies the arguments, and finally, checks whether
+      any rewrite rule can be applied (and also whether there is a matching
+      specialised version of the applied function).  The actual check for rule
+      application is performed by the function <code><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/specialise/Rules.lhs">Rules</a>.lookupRule</code>.
+    <p>
+      It should be noted that the application of rewrite rules is not subject
+      to the loop breaker check - i.e., rules of loop breakers will be applied
+      regardless of whether this may cause the simplifier to diverge.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 19:25:33 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/stg.html b/docs/comm/the-beast/stg.html
new file mode 100644
index 0000000000..4581da7d1f
--- /dev/null
+++ b/docs/comm/the-beast/stg.html
@@ -0,0 +1,164 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - You Got Control: The STG-language</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - You Got Control: The STG-language</h1>
+    <p>
+      GHC contains two completely independent backends: the byte code
+      generator and the machine code generator.  The decision over which of
+      the two is invoked is made in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscCodeGen</code>.
+      The machine code generator itself proceeds in a number of phases: First,
+      the <a href="desugar.html">Core</a> intermediate language is translated
+      into <em>STG-language</em>; second, the STG-language is transformed into a
+      GHC-internal variant of <a href="http://www.cminusminus.org/">C--</a>;
+      and third, this is either emitted as concrete C--, converted to GNU C,
+      or translated to native code (by the <a href="ncg.html">native code
+      generator</a> which targets IA32, Sparc, and PowerPC [as of March '05]).
+    </p>
+    <p>
+      In the following, we will have a look at the first step of machine code
+      generation, namely the translation steps involving the STG-language.
+      Details about the underlying abstract machine, the <em>Spineless Tagless
+      G-machine</em>, are in <a
+      href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">Implementing
+      lazy functional languages on stock hardware: the Spineless Tagless
+      G-machine</a>, SL Peyton Jones, Journal of Functional Programming 2(2),
+      Apr 1992, pp127-202.  (Some details have changed since the publication of
+      this article, but it still gives a good introduction to the main
+      concepts.)
+    </p>
+
+    <h2>The STG Language</h2>
+    <p>
+      The AST of the STG-language and the generation of STG code from Core are
+      both located in the <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/"><code>stgSyn/</code></a>
+      directory, in the modules <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
+      and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a>,
+      respectively.
+    </p>
+    <p>
+      Conceptually, the STG-language is a lambda calculus (including data
+      constructors and case expressions) whose syntax is restricted to make
+      all control flow explicit.  As such, it can be regarded as a variant of
+      <em>administrative normal form (ANF).</em>  (C.f., <a
+      href="http://doi.acm.org/10.1145/173262.155113">The essence of compiling
+      with continuations.</a> Cormac Flanagan, Amr Sabry, Bruce F. Duba, and
+      Matthias Felleisen.  <em>ACM SIGPLAN Conference on Programming Language
+      Design and Implementation,</em> ACM Press, 1993.)  Each syntactic form
+      has a precise operational interpretation, in addition to the
+      denotational interpretation inherited from the lambda calculus.  The
+      concrete representation of the STG language inside GHC also includes
+      auxiliary attributes, such as <em>static reference tables (SRTs),</em>
+      which determine the top-level bindings referenced by each let binding
+      and case expression.
+    </p>
+    <p>
+      As usual in ANF, arguments to functions etc. are restricted to atoms
+      (i.e., constants or variables), which implies that all sub-expressions
+      are explicitly named and evaluation order is explicit.  Specific to the
+      STG language is that all let bindings correspond to closure allocation
+      (thunks, function closures, and data constructors) and that case
+      expressions encode both computation and case selection.  There are two
+      flavours of case expressions, scrutinising boxed and unboxed values,
+      respectively.  The former perform function calls, including demanding the
+      evaluation of thunks, whereas the latter execute primitive operations
+      (such as arithmetic on fixed size integers and floating-point numbers).
+    </p>
+    <p>
+      The representation of the STG language defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
+      abstracts over both binders and occurrences of variables.  The type names
+      involved in this generic definition all carry the prefix
+      <code>Gen</code> (such as in <code>GenStgBinding</code>).
+      Instances of
+      these generic definitions, where both binders and occurrences are of type
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id</code></a><code>.Id</code>,
+      are defined as type synonyms and use type names that drop the
+      <code>Gen</code> prefix (i.e., becoming plain <code>StgBinding</code>).
+      Complete programs in STG form are represented by values of type
+      <code>[StgBinding]</code>.
+    </p>
+
+    <h2>From Core to STG</h2>
+    <p>
+      Although the actual translation from Core AST into STG AST is performed
+      by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>
+      (or <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreExprToStg</code>
+      for individual expressions), the translation crucially depends on <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepPgm</code>
+      (resp. <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepExpr</code>),
+      which prepares Core code for code generation (for both byte code and
+      machine code generation).  <code>CorePrep</code> saturates primitive and
+      constructor applications, turns the code into A-normal form, renames all
+      identifiers into globally unique names, and generates bindings for
+      constructor workers, constructor wrappers, and record selectors, plus
+      some further cleanup.
+    </p>
+    <p>
+      In other words, after Core code is prepared for code generation, it is
+      structurally already in the form required by the STG language.  The main
+      work performed by the actual transformation from Core to STG, as
+      performed by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>,
+      is to compute the live and free variables as well as live CAFs (constant
+      applicative forms) at each let binding and case alternative.  In
+      subsequent phases, the live CAF information is used to compute SRTs.
+      The live variable information is used to determine which stack slots
+      need to be zapped (to avoid space leaks) and the free variable
+      information is needed to construct closures.  Moreover, hints for
+      optimised code generation are computed, such as whether a closure needs
+      to be updated after it has been evaluated.
+    </p>
+
+    <h2>STG Passes</h2>
+    <p>
+      These days little actual work is performed on programs in STG form; in
+      particular, the code is not further optimised.  All serious optimisation
+      (except low-level optimisations which are performed during native code
+      generation) has already been done on Core.  The main task of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.stg2stg</code>
+      is to compute SRTs from the live CAF information determined during STG
+      generation.  Other than that, <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/profiling/SCCfinal.lhs"><code>SCCfinal</code></a><code>.stgMassageForProfiling</code>
+      is executed when compiling for profiling, and information may be dumped
+      for debugging purposes.
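+    </p>
+    <p>
+      To make the A-normalisation performed by <code>CorePrep</code> (see
+      <em>From Core to STG</em> above) concrete: a Core definition with a
+      non-atomic function argument, such as (a hand-written sketch, not
+      actual compiler output)
+    </p>
+    <blockquote>
+      <pre>
+f x = g (h x)</pre>
+    </blockquote>
+    <p>
+      has that argument let-bound, so that by the time the code reaches STG
+      form every argument is an atom and the allocation of a thunk for
+      <code>h x</code> is explicit:
+    </p>
+    <blockquote>
+      <pre>
+f x = let sat = h x in g sat</pre>
+    </blockquote>
+    <p>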
+    </p>
+
+    <h2>Towards C--</h2>
+    <p>
+      GHC's internal form of C-- is defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/cmm/Cmm.hs"><code>Cmm</code></a>.
+      The definition is generic in that it abstracts over the type of static
+      data and of the contents of basic blocks (i.e., over the concrete
+      representation of constant data and instructions).  These generic
+      definitions have names carrying the prefix <code>Gen</code> (such as
+      <code>GenCmm</code>).  The same module also instantiates the generic
+      form to a concrete form where data is represented by
+      <code>CmmStatic</code> and instructions are represented by
+      <code>CmmStmt</code> (giving us, e.g., <code>Cmm</code> from
+      <code>GenCmm</code>).  The concrete form more or less follows the
+      external <a href="http://www.cminusminus.org/">C--</a> language.
+    </p>
+    <p>
+      Programs in STG form are translated to <code>Cmm</code> by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/codeGen/CodeGen.lhs"><code>CodeGen</code></a><code>.codeGen</code>.
+    </p>
+
+    <p><hr><small>
+<!-- hhmts start -->
+Last modified: Sat Mar  5 22:55:25 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html
new file mode 100644
index 0000000000..be5bbefa17
--- /dev/null
+++ b/docs/comm/the-beast/syntax.html
@@ -0,0 +1,99 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Just Syntax</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Just Syntax</h1>
+    <p>
+      The lexical and syntactic analysers for Haskell programs are located in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/"><code>fptools/ghc/compiler/parser/</code></a>.
+    </p>
+
+    <h2>The Lexer</h2>
+    <p>
+      The lexer is a rather tedious piece of Haskell code contained in the
+      module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex</code></a>.
+      Its complexity partially stems from covering not only Haskell 98 but
+      also the whole range of GHC language extensions, and from its ability
+      to analyse interface files in addition to normal Haskell source.  The
+      lexer defines a parser monad <code>P a</code>, where <code>a</code> is
+      the type of the result expected from a successful parse.  More
+      precisely, a result of type
+<blockquote><pre>
+data ParseResult a = POk PState a
+                   | PFailed Message</pre>
+</blockquote>
+    <p>
+      is produced, where <code>Message</code> is from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ErrUtils.lhs"><code>ErrUtils</code></a>
+      (and currently is simply a synonym for <code>SDoc</code>).
+    <p>
+      The record type <code>PState</code> contains information such as the
+      current source location, buffer state, contexts for layout processing,
+      and whether Glasgow extensions are accepted (either due to
+      <code>-fglasgow-exts</code> or due to reading an interface file).  Most
+      of the fields of <code>PState</code> store unboxed values; in fact, even
+      the flag indicating whether Glasgow extensions are enabled is
+      represented by an unboxed integer instead of by a <code>Bool</code>.  My
+      (= chak's) guess is that this is to avoid having to perform a
+      <code>case</code> on a boxed value in the inner loop of the lexer.
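+    <p>
+      The following is a minimal sketch of the general shape of such a
+      parser monad (hand-written; the real definitions in
+      <code>Lex.lhs</code> differ in detail, in particular by using unboxed
+      fields, and <code>Message</code> is simplified to a
+      <code>String</code> here):
+<blockquote><pre>
+type Message = String           -- stand-in; really a synonym for SDoc
+data PState  = PState           -- stand-in for the real lexer state
+
+data ParseResult a = POk PState a
+                   | PFailed Message
+
+newtype P a = P { unP :: PState -> ParseResult a }
+
+-- The state is threaded from one parsing step to the next; a failure
+-- short-circuits the remainder of the parse.
+thenP :: P a -> (a -> P b) -> P b
+thenP (P m) k = P (\s -> case m s of
+                           POk s' a    -> unP (k a) s'
+                           PFailed msg -> PFailed msg)
+
+returnP :: a -> P a
+returnP a = P (\s -> POk s a)
+
+failP :: Message -> P a
+failP msg = P (\_ -> PFailed msg)</pre>
+</blockquote>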
+    <p>
+      The same lexer is used by the Haskell source parser, the Haskell
+      interface parser, and the package configuration parser.
+
+    <h2>The Haskell Source Parser</h2>
+    <p>
+      The parser for Haskell source files is defined in the form of a parser
+      specification for the parser generator <a
+      href="http://haskell.org/happy/">Happy</a> in the file <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>.
+      The parser exports three entry points, for parsing entire modules
+      (<code>parseModule</code>), individual statements
+      (<code>parseStmt</code>), and individual identifiers
+      (<code>parseIdentifier</code>), respectively.  The last two are needed
+      for GHCi.  All three require a parser state (of type
+      <code>PState</code>) and are invoked from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a>.
+    <p>
+      Parsing of Haskell is a rather involved process.  The most challenging
+      features are probably the treatment of layout and expressions that
+      contain infix operators.  The latter may be user-defined and so are not
+      easily captured in a static syntax specification.  Infix operators may
+      also appear in the right-hand sides of value definitions, and so, GHC's
+      parser treats those in the same way as expressions.  In other words, as
+      general expressions are a syntactic superset of patterns - ok, they
+      <em>nearly</em> are - the parser simply attempts to parse a general
+      expression in such positions.  Afterwards, the generated parse tree is
+      inspected to ensure that the accepted phrase indeed forms a legal
+      pattern.  This and similar checks are performed by the routines from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/ParseUtil.lhs"><code>ParseUtil</code></a>.  In
+      some cases, these routines do, in addition to checking for
+      well-formedness, also transform the parse tree, such that it fits into
+      the syntactic context in which it has been parsed; in fact, this happens
+      for patterns, which are transformed from a representation of type
+      <code>RdrNameHsExpr</code> into a representation of type
+      <code>RdrNamePat</code>.
+
+    <h2>The Haskell Interface Parser</h2>
+    <p>
+      The parser for interface files is also generated by Happy from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/ParseIface.y"><code>ParseIface.y</code></a>.
+      Its main routine <code>parseIface</code> is invoked from <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/RnHiFiles.lhs"><code>RnHiFiles</code></a><code>.readIface</code>.
+
+    <h2>The Package Configuration Parser</h2>
+    <p>
+      The parser for configuration files is by far the smallest of the three
+      and is defined in <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ParsePkgConf.y"><code>ParsePkgConf.y</code></a>.
+      It exports <code>loadPackageConfig</code>, which is used by <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState</code></a><code>.readPackageConf</code>.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Jan 16 00:30:14 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/typecheck.html b/docs/comm/the-beast/typecheck.html
new file mode 100644
index 0000000000..8d22784b8a
--- /dev/null
+++ b/docs/comm/the-beast/typecheck.html
@@ -0,0 +1,316 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Checking Types</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Checking Types</h1>
+    <p>
+      Probably the most important phase in the frontend is the type checker,
+      which is located at <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code></a>.
+      GHC type checks programs in their original Haskell form before the
+      desugarer converts them into Core code.  This complicates the type
+      checker as it has to handle the much more verbose Haskell AST, but it
+      improves error messages, as those messages are based on the same
+      structure that the user sees.
+    </p>
+    <p>
+      GHC defines the abstract syntax of Haskell programs in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>
+      using a structure that abstracts over the concrete representation of
+      bound occurrences of identifiers and patterns.  The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code></a>
+      defines a number of helper functions required by the type checker.  Note
+      that the type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>.<code>TcId</code>
+      used to represent identifiers in some signatures during type checking
+      is, in fact, nothing but a synonym for a <a href="vars.html">plain
+      <code>Id</code>.</a>
+    </p>
+    <p>
+      It is also noteworthy that the representation of types changes during
+      type checking from <code>HsType</code> to <code>TypeRep.Type</code>.
+      The latter is a <a href="types.html">hybrid type representation</a> that
+      is used to type Core, but still contains sufficient information to
+      recover source types.  In particular, the type checker maintains and
+      compares types in their <code>Type</code> form.
+    </p>

+    <h2>The Overall Flow of Things</h2>
+
+    <h4>Entry Points Into the Type Checker</h4>
+    <p>
+      The interface of the type checker (and <a
+      href="renamer.html">renamer</a>) to the rest of the compiler is provided
+      by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnDriver.lhs"><code>TcRnDriver</code></a>.
+      Entire modules are processed by calling <code>tcRnModule</code> and GHCi
+      uses <code>tcRnStmt</code>, <code>tcRnExpr</code>, and
+      <code>tcRnType</code> to typecheck statements and expressions, and to
+      kind check types, respectively.  Moreover, <code>tcRnExtCore</code> is
+      provided to typecheck external Core code.  In addition,
+      <code>tcTopSrcDecls</code> is used by Template Haskell - more
+      specifically by <code>TcSplice.tc_bracket</code>
+      - to type check the contents of declaration brackets.
+    </p>
+
+    <h4>Renaming and Type Checking a Module</h4>
+    <p>
+      The function <code>tcRnModule</code> controls the complete static
+      analysis of a Haskell module.
+      It sets up the combined renamer and type
+      checker monad, resolves all import statements, initiates the actual
+      renaming and type checking process, and finally, wraps up by processing
+      the export list.
+    </p>
+    <p>
+      The actual type checking and renaming process is initiated via
+      <code>TcRnDriver.tcRnSrcDecls</code>, which uses a helper called
+      <code>tc_rn_src_decls</code> to implement the iterative renaming and
+      type checking process required by <a href="../exts/th.html">Template
+      Haskell</a>.  However, before it invokes <code>tc_rn_src_decls</code>,
+      it takes care of hi-boot files; afterwards, it simplifies type
+      constraints and performs zonking (see below regarding the latter).
+    </p>
+    <p>
+      The function <code>tc_rn_src_decls</code> partitions static analysis of
+      a whole module into multiple rounds, where the initial round is followed
+      by an additional one for each toplevel splice.  It collects all
+      declarations up to the next splice into an <code>HsDecl.HsGroup</code>
+      to rename and type check that <em>declaration group</em> by calling
+      <code>TcRnDriver.tcRnGroup</code>.  Afterwards, it executes the
+      splice (if there are any left) and proceeds to the next group, which
+      includes the declarations produced by the splice.
+    </p>
+    <p>
+      The function <code>tcRnGroup</code>, finally, invokes the
+      actual renaming and type checking via
+      <code>TcRnDriver.rnTopSrcDecls</code> and
+      <code>TcRnDriver.tcTopSrcDecls</code>, respectively.  The renamer, apart
+      from renaming, computes the global type checking environment, of type
+      <code>TcRnTypes.TcGblEnv</code>, which is stored in the type checking
+      monad before type checking commences.
+    </p>
+
+    <h2>Type Checking a Declaration Group</h2>
+    <p>
+      The type checking of a declaration group, performed by
+      <code>tcTopSrcDecls</code>, starts with the processing of the type and
+      class declarations of the current module, using the function
+      <code>TcTyClsDecls.tcTyAndClassDecls</code>.  This is followed by a
+      first round over instance declarations using
+      <code>TcInstDcls.tcInstDecls1</code>, which in particular generates all
+      additional bindings due to the deriving process.  Then come foreign
+      import declarations (<code>TcForeign.tcForeignImports</code>) and
+      default declarations (<code>TcDefaults.tcDefaults</code>).
+    </p>
+    <p>
+      Now, finally, toplevel value declarations (including derived ones) are
+      type checked using <code>TcBinds.tcTopBinds</code>.  Afterwards,
+      <code>TcInstDcls.tcInstDecls2</code> traverses instances for the second
+      time.  Type checking concludes with processing foreign exports
+      (<code>TcForeign.tcForeignExports</code>) and rewrite rules
+      (<code>TcRules.tcRules</code>).  Finally, the global environment is
+      extended with the new bindings.
+    </p>
+
+    <h2>Type checking Type and Class Declarations</h2>
+    <p>
+      Type and class declarations are type checked in a couple of phases that
+      contain recursive dependencies - aka <em>knots.</em>  The first knot
+      encompasses almost the whole type checking of these declarations and
+      forms the main piece of <code>TcTyClsDecls.tcTyAndClassDecls</code>.
+    </p>
+    <p>
+      Inside this big knot, the first main operation is kind checking, which
+      again involves a knot.  It is implemented by <code>kcTyClDecls</code>,
+      which performs kind checking of potentially recursively-dependent type
+      and class declarations using kind variables for initially unknown kinds.
+      While processing the individual declarations, some of these variables
+      are instantiated depending on the context; the rest get by default
+      kind <code>*</code> (during <em>zonking</em> of the kind signatures).
+      Type synonyms are treated specially in this process, because they can
+      have an unboxed type, but they cannot be recursive.  Hence, their kinds
+      are inferred in dependency order.  Moreover, in contrast to class
+      declarations and other type declarations, synonyms are not entered into
+      the global environment as a global <code>TyThing</code>.
+      (<code>TypeRep.TyThing</code> is a sum type that combines the various
+      flavours of typish entities, such that they can be stuck into type
+      environments and similar.)
+    </p>
+
+    <h2>More Details</h2>
+
+    <h4>Types Variables and Zonking</h4>
+    <p>
+      During type checking, type variables are represented by mutable variables
+      - cf. the <a href="vars.html#TyVar">variable story.</a>  Consequently,
+      unification can instantiate type variables by updating those mutable
+      variables.  This process of instantiation is (for reasons that elude me)
+      called <a
+      href="http://www.dictionary.com/cgi-bin/dict.pl?term=zonk&db=*">zonking</a>
+      in GHC's sources.  The zonking routines for the various forms of Haskell
+      constructs are responsible for most of the code in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code>,</a>
+      whereas the routines that actually operate on mutable types are defined
+      in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcMType.lhs"><code>TcMType</code></a>;
+      this includes the zonking of type variables and type terms, routines to
+      create mutable structures and update them, as well as routines that check
+      constraints, such as that type variables in function signatures have not
+      been instantiated during type checking.  The actual type unification
+      routine is <code>uTys</code> in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcUnify.lhs"><code>TcUnify</code></a>.
+    </p>
+    <p>
+      All type variables that may be instantiated (those in signatures
+      may not), but haven't been instantiated during type checking, are zonked
+      to <code>()</code>, so that after type checking all mutable variables
+      have been eliminated.
+    </p>
+
+    <h4>Type Representation</h4>
+    <p>
+      The representation of types is fixed in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRep.lhs"><code>TypeRep</code></a>
+      and exported as the data type <code>Type</code>.  As explained in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcType.lhs"><code>TcType</code></a>,
+      GHC supports rank-N types, but, in the type checker, maintains the
+      restriction that type variables cannot be instantiated to quantified
+      types (i.e., the type system is predicative).  The type checker floats
+      universal quantifiers outside and maintains types in prenex form.
+      (However, quantifiers cannot, of course, float out of negative
+      positions.)  Overall, we have
+    </p>
+    <blockquote>
+      <pre>
+sigma -> forall tyvars. phi
+phi   -> theta => rho
+rho   -> sigma -> rho
+       | tau
+tau   -> tyvar
+       | tycon tau_1 .. tau_n
+       | tau_1 tau_2
+       | tau_1 -> tau_2</pre>
+    </blockquote>
+    <p>
+      where <code>sigma</code> is in prenex form; i.e., there is never a
+      forall to the right of an arrow in a <code>phi</code> type.
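+      For instance (an illustration, not an example from the source), the
+      type <code>forall a. Num a => a -> a</code> decomposes as
+    </p>
+    <blockquote>
+      <pre>
+sigma = forall a. Num a => a -> a
+phi   = Num a => a -> a            -- theta => rho
+theta = Num a
+rho   = a -> a                     -- this rho is already a tau</pre>
+    </blockquote>
+    <p>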
+      Moreover, a
+      type of the form <code>tau</code> never contains a quantifier (and this
+      includes the arguments of type constructors).
+    </p>
+    <p>
+      Of particular interest are the variants <code>SourceTy</code> and
+      <code>NoteTy</code> of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TypeRep.lhs"><code>TypeRep</code></a>.<code>Type</code>.
+      The constructor <code>SourceTy :: SourceType -> Type</code> represents a
+      type constraint; that is, a predicate over types represented by a
+      dictionary.  The type checker treats a <code>SourceTy</code> as opaque,
+      but during the translation to Core it will be expanded into its concrete
+      representation (i.e., a dictionary type) by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>sourceTypeRep</code>.
+      Note that newtypes are not covered by <code>SourceType</code>s anymore,
+      even though some comments in GHC still suggest this.  Instead, all newtype
+      applications are initially represented as a <code>NewTcApp</code>, until
+      they are eliminated by calls to <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>newTypeRep</code>.
+    </p>
+    <p>
+      The <code>NoteTy</code> constructor is used to add non-essential
+      information to a type term.  Such information has the type
+      <code>TypeRep.TyNote</code> and is either the set of free type variables
+      of the annotated expression or the unexpanded version of a type synonym.
+      Free variable sets are cached as notes to save the overhead of
+      repeatedly computing the same set for a given term.  Unexpanded type
+      synonyms are useful for generating comprehensible error messages, but
+      have no influence on the process of type checking.
+    </p>
+
+    <h4>Type Checking Environment</h4>
+    <p>
+      During type checking, GHC maintains a <em>type environment</em> whose
+      type definitions are fixed in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a> with the operations defined in
+<a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv</code></a>.
+      Among other things, the environment contains all imported and local
+      instances as well as a list of <em>global</em> entities (imported and
+      local types and classes together with imported identifiers) and
+      <em>local</em> entities (locally defined identifiers).  This environment
+      is threaded through the type checking monad, whose support functions,
+      including initialisation, can be found in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnMonad.lhs"><code>TcRnMonad</code>.</a>
+
+    <h4>Expressions</h4>
+    <p>
+      Expressions are type checked by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr</code>.</a>
+    <p>
+      Usage occurrences of identifiers are processed by the function
+      <code>tcId</code>, whose main purpose is to <a href="#inst">instantiate
+      overloaded identifiers.</a>  It essentially calls
+      <code>TcInst.instOverloadedFun</code> once for each universally
+      quantified set of type constraints.  It should be noted that overloaded
+      identifiers are replaced by new names that are first defined in the LIE
+      (Local Instance Environment?) and later promoted into top-level
+      bindings.
+
+    <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
+    <p>
+      GHC implements overloading using so-called <em>dictionaries.</em>  A
+      dictionary is a tuple of functions -- one function for each method in
+      the class of which the dictionary implements an instance.  During type
+      checking, GHC replaces each type constraint of a function with one
+      additional argument.  At runtime, the extended function gets passed a
+      matching class dictionary by way of these additional arguments.
+      Whenever the function needs to call a method of such a class, it simply
+      extracts it from the dictionary.  (A small sketch of this translation
+      appears at the end of this section.)
+    <p>
+      This sounds simple enough; however, the actual implementation is a bit
+      more tricky as it wants to keep track of all the instances at which
+      overloaded functions are used in a module.  This information is useful
+      to optimise the code.  The implementation is in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
+    <p>
+      The function <code>instOverloadedFun</code> is invoked for each
+      overloaded usage occurrence of an identifier, where overloaded means that
+      the type of the identifier contains a non-trivial type constraint.  It
+      proceeds in two steps: (1) Allocation of a method instance
+      (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
+      dependencies.  The former implies allocating a new unique identifier,
+      which replaces the original (overloaded) identifier at the currently
+      type-checked usage occurrence.
+    <p>
+      The new identifier (after being threaded through the LIE) eventually
+      will be bound by a top-level binding whose rhs contains a partial
+      application of the original overloaded identifier.  This partial
+      application applies the overloaded function to the dictionaries needed
+      for the current instance.  In GHC lingo, this is called a
+      <em>method.</em>  Before
+      becoming a top-level binding, the method is first represented as a value
+      of type <code>Inst.Inst</code>, which makes it easy to fold multiple
+      instances of the same identifier at the same types into one global
+      definition.  (And probably other things, too, which I haven't
+      investigated yet.)
+
+    <p>
+      <strong>Note:</strong> As of 13 January 2001 (wrt. the code in the
+      CVS HEAD), the above mechanism interferes badly with RULES pragmas
+      defined over overloaded functions.  During instantiation, a new name is
+      created for an overloaded function partially applied to the dictionaries
+      needed in a usage position of that function.  As the rewrite rule,
+      however, mentions the original overloaded name, it won't fire anymore
+      -- unless later phases remove the intermediate definition again.  The
+      latest CVS version of GHC has an option
+      <code>-fno-method-sharing</code>, which avoids sharing instantiation
+      stubs.  This is usually/often/sometimes sufficient to make the rules
+      fire again.
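+    <p>
+      The promised sketch of the dictionary translation follows.  It is
+      hand-written Haskell rather than actual Core output, the dictionary is
+      shown as an ordinary record, and the names are made up for the example:
+<blockquote><pre>
+-- Source:
+--   square :: Num a => a -> a
+--   square x = x * x
+
+data NumDict a = NumDict { plus  :: a -> a -> a
+                         , times :: a -> a -> a }
+
+-- After the translation, the constraint has become a value argument,
+-- and the method is extracted from the dictionary:
+square :: NumDict a -> a -> a
+square d x = times d x x
+
+-- A "method" in the above sense: square partially applied to the
+-- dictionary of a particular instance.
+squareInt :: Int -> Int
+squareInt = square (NumDict (+) (*))</pre>
+</blockquote>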
    <p><small>
<!-- hhmts start -->
Last modified: Thu May 12 22:52:46 EST 2005
<!-- hhmts end -->
    </small>
  </body>
</html>
diff --git a/docs/comm/the-beast/types.html b/docs/comm/the-beast/types.html
new file mode 100644
index 0000000000..383b71f054
--- /dev/null
+++ b/docs/comm/the-beast/types.html
@@ -0,0 +1,179 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - Hybrid Types</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - Hybrid Types</h1>
    <p>
      GHC essentially supports two type systems: (1) the <em>source type
      system</em> (which is a heavily extended version of the type system of
      Haskell 98) and (2) the <em>Core type system,</em> which is the type
      system used by the intermediate language (see also <a
      href="desugar.html">Sugar Free: From Haskell To Core</a>).
    </p>
    <p>
      During parsing and renaming, type information is represented in a form
      that is very close to Haskell's concrete syntax; it is defined by
      <code>HsTypes.HsType</code>.  In addition, type, class, and instance
      declarations are maintained in their source form as defined in the
      module <code>HsDecls</code>.  The situation changes during type checking,
      where types are translated into a second representation, which is
      defined in the module <code>types/TypeRep.lhs</code>, as type
      <code>Type</code>.  This second representation is peculiar in that it is
      a hybrid between the source representation of types and the Core
      representation of types.  Using functions such as
      <code>Type.coreView</code> and <code>Type.deepCoreView</code>, a value
      of type <code>Type</code> exhibits its Core representation.  On the
      other hand, pretty printing a <code>Type</code> with
      <code>TypeRep.pprType</code> yields the type's source representation.
    </p>
    <p>
      In fact, the <a href="typecheck.html">type checker</a> maintains type
      environments based on <code>Type</code>, but needs to perform type
      checking on source-level types.  As a result, we have functions
      <code>Type.tcEqType</code> and <code>Type.tcCmpType</code>, which
      compare types based on their source representation, as well as the
      function <code>coreEqType</code>, which compares them based on their
      Core representation.  The latter is needed during type checking of Core
      (as performed by the functions in the module
      <code>coreSyn/CoreLint.lhs</code>).
    </p>

    <h2>Type Synonyms</h2>
    <p>
      Type synonyms in Haskell are essentially a form of macro definitions on
      the type level.  For example, when the type checker compares two type
      terms, synonyms are always compared in their expanded form.  However, to
      produce good error messages, we prefer to avoid expanding type synonyms
      during pretty printing.  Hence, <code>Type</code> has a variant
      <code>NoteTy TyNote Type</code>, where
    </p>
    <blockquote>
      <pre>
data TyNote
  = FTVNote TyVarSet    -- The free type variables of the noted expression

  | SynNote Type        -- Used for type synonyms
                        -- The Type is always a TyConApp, and is the un-expanded form.
                        -- The type to which the note is attached is the expanded form.</pre>
    </blockquote>
    <p>
      In other words, a <code>NoteTy</code> represents the expanded form of a
      type synonym together with a note stating its source form.
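    </p>
    <p>
      As a concrete, schematic example (writing type constructors and types
      directly into the constructor applications, rather than their real
      internal representations), given the synonym
    </p>
    <blockquote>
      <pre>
type Pair a = (a, a)</pre>
    </blockquote>
    <p>
      an occurrence of <code>Pair Int</code> is represented along the lines of
    </p>
    <blockquote>
      <pre>
NoteTy (SynNote (TyConApp Pair [Int]))   -- the un-expanded form, Pair Int
       (TyConApp (,) [Int, Int])         -- the expanded form, (Int, Int)</pre>
    </blockquote>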
    <h3>Creating Representation Types of Synonyms</h3>
    <p>
      During translation from <code>HsType</code> to <code>Type</code> the
      function <code>Type.mkSynTy</code> is used to construct representations
      of applications of type synonyms.  It creates a <code>NoteTy</code> node
      if the synonym is applied to a sufficient number of arguments;
      otherwise, it builds a simple <code>TyConApp</code> and leaves it to
      <code>TcMType.checkValidType</code> to pick up invalid unsaturated
      synonym applications.  While creating a <code>NoteTy</code>,
      <code>mkSynTy</code> also expands the synonym by substituting the type
      arguments for the parameters of the synonym definition, using
      <code>Type.substTyWith</code>.
    </p>
    <p>
      The function <code>mkSynTy</code> is used indirectly via
      <code>mkGenTyConApp</code>, <code>mkAppTy</code>, and
      <code>mkAppTys</code>, which construct type representations involving
      type applications.  The function <code>mkSynTy</code> is also used
      directly during type checking interface files; this is for tedious
      reasons to do with forall hoisting -- see the comment at
      <code>TcIface.mkIfTcApp</code>.
    </p>

    <h2>Newtypes</h2>
    <p>
      Data types declared by <code>newtype</code> declarations constitute new
      type constructors---i.e., they are not just type macros, but introduce
      new type names.  However, provided that a newtype is not recursive, we
      still want to implement it by its representation type.  GHC realises this
      by providing two flavours of type equality: (1) <code>tcEqType</code> is
      source-level type equality, which compares newtypes and
      <code>PredType</code>s by name, and (2) <code>coreEqType</code> compares
      them structurally (by using <code>deepCoreView</code> to expand the
      representation before comparing).  The function
      <code>deepCoreView</code> (via <code>coreView</code>) invokes
      <code>expandNewTcApp</code> for every type constructor application
      (<code>TyConApp</code>) to determine whether we are looking at a newtype
      application that needs to be expanded to its representation type.
    </p>

    <h2>Predicates</h2>
    <p>
      The dictionary translation of type classes translates each predicate in
      the type context of a type signature into an additional argument, which
      carries a dictionary with the functions overloaded by the corresponding
      class.  The <code>Type</code> data type has a special variant
      <code>PredTy PredType</code> for predicates, where
    </p>
    <blockquote>
      <pre>
data PredType
  = ClassP Class [Type]          -- Class predicate
  | IParam (IPName Name) Type    -- Implicit parameter</pre>
    </blockquote>
    <p>
      These types need to be handled as source types during type checking, but
      turn into their representations when inspected through
      <code>coreView</code>.  The representation is determined by
      <code>Type.predTypeRep</code>.
    </p>

    <h2>Representation of Type Constructors</h2>
    <p>
      Type constructor applications are represented in <code>Type</code> by
      the variant <code>TyConApp :: TyCon -> [Type] -> Type</code>.  The first
      argument to <code>TyConApp</code>, namely <code>TyCon.TyCon</code>,
      distinguishes between function type constructors (variant
      <code>FunTyCon</code>) and algebraic type constructors (variant
      <code>AlgTyCon</code>), which arise from data and newtype declarations.
      The variant <code>AlgTyCon</code> contains all the information available
      from the data/newtype declaration as well as derived information, such
      as the <code>Unique</code> and argument variance information.
      This
      includes a field <code>algTcRhs :: AlgTyConRhs</code>, where
      <code>AlgTyConRhs</code> distinguishes three kinds of algebraic data
      type declarations: (1) declarations that have been exported abstractly,
      (2) <code>data</code> declarations, and (3) <code>newtype</code>
      declarations.  The last two both include their original right hand side;
      in addition, the third variant also caches the "ultimate" representation
      type, which is the right hand side after expanding all type synonyms and
      non-recursive newtypes.
    </p>
    <p>
      Both data and newtype declarations refer to their data constructors,
      represented as <code>DataCon.DataCon</code>, which include all details
      of their signature (as derived from the original declaration) as well as
      information for code generation, such as their tag value.
    </p>

    <h2>Representation of Classes and Instances</h2>
    <p>
      Class declarations turn into values of type <code>Class.Class</code>.
      They represent methods as the <code>Id</code>s of the dictionary
      selector functions.  Similar selector functions are available for
      superclass dictionaries.
    </p>
    <p>
      Instance declarations turn into values of type
      <code>InstEnv.Instance</code>, which in interface files are represented
      as <code>IfaceSyn.IfaceInst</code>.  Moreover, the type
      <code>InstEnv.InstEnv</code>, which is a synonym for <code>UniqFM
      ClsInstEnv</code>, provides a mapping of classes to their
      instances---<code>ClsInstEnv</code> is essentially a list of instance
      declarations.
    </p>

    <p><small>
<!-- hhmts start -->
Last modified: Sun Jun 19 13:07:22 EST 2005
<!-- hhmts end -->
    </small></p>
  </body>
</html>
diff --git a/docs/comm/the-beast/vars.html b/docs/comm/the-beast/vars.html
new file mode 100644
index 0000000000..9bbd310c60
--- /dev/null
+++ b/docs/comm/the-beast/vars.html
@@ -0,0 +1,235 @@
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<html>
  <head>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
    <title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title>
  </head>

  <body BGCOLOR="FFFFFF">
    <h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1>
    <p>

<h2>Variables</h2>

The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>,
represents variables, both term variables and type variables:
<pre>
  data Var
    = Var {
        varName    :: Name,
        realUnique :: FastInt,
        varType    :: Type,
        varDetails :: VarDetails,
        varInfo    :: IdInfo
      }
</pre>
<ul>
<li> The <code>varName</code> field contains the identity of the variable:
its unique number, and its print-name.  See "<a href="names.html">The truth about names</a>".

<p><li> The <code>realUnique</code> field caches the unique number in the
<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster.

<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
type variable.  (Types and kinds are both represented by a <code>Type</code>.)

<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
and makes some further distinctions (see below).

<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful
information: strictness, unfolding, etc.  However, this information is all optional;
you can always throw away the <code>IdInfo</code>.
In contrast, you can't safely throw away
the <code>VarDetails</code> of a <code>Var</code>.
</ul>
<p>
It's often fantastically convenient to have term variables and type variables
share a single data type.  For example,
<pre>
  exprFreeVars :: CoreExpr -> VarSet
</pre>
If there were two types, we'd need to return two sets.  Similarly, big lambdas and
little lambdas use the same constructor in Core, which is extremely convenient.
<p>
We define a couple of type synonyms:
<pre>
  type Id    = Var  -- Term variables
  type TyVar = Var  -- Type variables
</pre>
just to help us document the occasions when we are expecting only term variables,
or only type variables.


<h2> The <code>VarDetails</code> field </h2>

The <code>VarDetails</code> field tells what kind of variable this is:
<pre>
data VarDetails
  = LocalId                         -- Used for locally-defined Ids (see NOTE below)
      LocalIdDetails

  | GlobalId                        -- Used for imported Ids, dict selectors etc
      GlobalIdDetails

  | TyVar
  | MutTyVar (IORef (Maybe Type))   -- Used during unification
      TyVarDetails
</pre>

<a name="TyVar">
<h2>Type variables (<code>TyVar</code>)</h2>
</a>
<p>
The <code>TyVar</code> case is self-explanatory.  The <code>MutTyVar</code>
case is used only during type checking.  Then a type variable can be unified,
using an imperative update, with a type, and that is what the
<code>IORef</code> is for.  The <code>TcType.TyVarDetails</code> field records
the sort of type variable we are dealing with.  It is defined as
<pre>
data TyVarDetails = SigTv | ClsTv | InstTv | VanillaTv
</pre>
<code>SigTv</code> marks type variables that were introduced when
instantiating a type signature prior to matching it against the inferred type
of a definition.  The variants <code>ClsTv</code> and <code>InstTv</code> mark
scoped type variables introduced by class and instance heads, respectively.
These first three sorts of type variables are skolem variables (tested by the
predicate <code>isSkolemTyVar</code>); i.e., they must <em>not</em> be
instantiated.  All other type variables are marked as <code>VanillaTv</code>.
<p>
For a long time I tried to keep mutable Vars statically type-distinct
from immutable Vars, but I've finally given up.  It's just too painful.
After type checking there are no MutTyVars left, but there's no static check
of that fact.

<h2>Term variables (<code>Id</code>)</h2>

A term variable (of type <code>Id</code>) is represented either by a
<code>LocalId</code> or a <code>GlobalId</code>:
<p>
A <code>GlobalId</code> is
<ul>
<li> Always bound at top-level.
<li> Always has an <code>ExternalName</code>, and hence has
     a <code>Unique</code> that is globally unique across the whole
     GHC invocation (a single invocation may compile multiple modules).
<li> Has <code>IdInfo</code> that is absolutely fixed, forever.
</ul>

<p>
A <code>LocalId</code> is:
<ul>
<li> Always bound in the module being compiled:
<ul>
<li> <em>either</em> bound within an expression (lambda, case, local let(rec))
<li> <em>or</em> defined at top level in the module being compiled.
</ul>
<li> Has IdInfo that changes as the simplifier bashes repeatedly on it.
</ul>
<p>
The key thing about <code>LocalId</code>s is that the free-variable finder
typically treats them as candidate free variables.  That is, it ignores
<code>GlobalId</code>s such as imported constants, data constructors, etc.
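<p>
As a minimal sketch of why this matters (hand-written; the helper
<code>occFreeVars</code> is invented, and the sketch assumes only that
<code>isLocalVar</code> and <code>extendVarSet</code> behave as their names
suggest), a free-variable finder treats each variable occurrence roughly
like this:
<pre>
  -- sketch only, not GHC's actual code
  occFreeVars :: Var -> VarSet -> VarSet
  occFreeVars v fvs
    | isLocalVar v = extendVarSet fvs v  -- LocalIds are candidate free variables
    | otherwise    = fvs                 -- GlobalIds are never candidates
</pre>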
<p>
An important invariant is this: <em>All the bindings in the module
being compiled (whether top level or not) are <code>LocalId</code>s
until the CoreTidy phase.</em>  In the CoreTidy phase, all
externally-visible top-level bindings are made into GlobalIds.  This
is the point when a <code>LocalId</code> becomes "frozen" and becomes
a fixed, immutable <code>GlobalId</code>.
<p>
(A binding is <em>"externally-visible"</em> if it is exported, or
mentioned in the unfolding of an externally-visible Id.  An
externally-visible Id may not have an unfolding, either because it is
too big, or because it is the loop-breaker of a recursive group.)

<h3>Global Ids and implicit Ids</h3>

<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>.
This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other
structured types like <code>DataCon</code>.  Unfortunately it is <em>used</em> in
<code>Var.lhs</code>, so there's a <code>hi-boot</code> knot to get it there.
Anyway, here's the declaration:
<pre>
data GlobalIdDetails
  = NotGlobalId                 -- Used as a convenient extra return value
                                -- from globalIdDetails

  | VanillaGlobal               -- Imported from elsewhere

  | PrimOpId PrimOp             -- The Id for a primitive operator
  | FCallId ForeignCall         -- The Id for a foreign call

  -- These next ones are all "implicit Ids"
  | RecordSelId FieldLabel      -- The Id for a record selector
  | DataConId DataCon           -- The Id for a data constructor *worker*
  | DataConWrapId DataCon       -- The Id for a data constructor *wrapper*
                                -- [the only reasons we need to know are so that
                                --  a) we can suppress printing a definition in the interface file
                                --  b) when typechecking a pattern we can get from the
                                --     Id back to the data con]
</pre>
The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for
a record selector, say, to its field name; or the <code>Id</code> for a primitive
operator to the <code>PrimOp</code> itself.
<p>
Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids.  An implicit
Id is derived by implication from some other declaration.  So a record selector is
derived from its data type declaration, for example.  An implicit Id is always
a <code>GlobalId</code>.  For most of the compilation, the implicit Ids are just
that: implicit.  If you do -ddump-simpl you won't see their definitions.  (That's
why it's true to say that until CoreTidy all Ids in this compilation unit are
LocalIds.)  But at CorePrep, a binding is added for each implicit Id defined in
this module, so that the code generator will generate code for the (curried) function.
<p>
Implicit Ids carry their unfolding inside them, of course, so they may well have
been inlined much earlier; but we generate the curried top-level defn just in
case it's ever needed.

<h3>LocalIds</h3>

The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>:
<pre>
data LocalIdDetails
  = NotExported -- Not exported
  | Exported    -- Exported
  | SpecPragma  -- Not exported, but not to be discarded either
                -- It's unclean that this is so deeply built in
</pre>
From this we can tell whether the <code>LocalId</code> is exported, and that
tells us whether we can drop an unused binding as dead code.
<p>
The <code>SpecPragma</code> thing is a HACK.  Suppose you write a SPECIALIZE pragma:
<pre>
  foo :: Num a => a -> a
  {-# SPECIALIZE foo :: Int -> Int #-}
  foo = ...
</pre>
The type checker generates a dummy call to <code>foo</code> at the right types:
<pre>
  $dummy = foo Int dNumInt
</pre>
The Id <code>$dummy</code> is marked <code>SpecPragma</code>.  Its role is to hang
onto that call to <code>foo</code> so that the specialiser can see it, but there
are no calls to <code>$dummy</code>.
The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that the
dummy call survives until the specialiser sees it.  The specialiser processes the
right hand side of a <code>SpecPragma</code> Id to find calls to overloaded
functions, <em>and then discards the <code>SpecPragma</code> Id</em>.
So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the
specialiser.


<h3> ExternalNames and InternalNames </h3>

Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
not the same as whether the Id has an <code>ExternalName</code> or an <code>InternalName</code>
(see "<a href="names.html#sort">The truth about Names</a>"):
<ul>
<li> Every <code>GlobalId</code> has an <code>ExternalName</code>.
<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
</ul>

<p><small>
<!-- hhmts start -->
Last modified: Fri Sep 12 15:17:18 BST 2003
<!-- hhmts end -->
</small>
</body>
</html>