Diffstat (limited to 'docs/comm')
39 files changed, 6995 insertions, 0 deletions
diff --git a/docs/comm/exts/ndp.html b/docs/comm/exts/ndp.html
new file mode 100644
index 0000000000..0c94c3960b
--- /dev/null
+++ b/docs/comm/exts/ndp.html
@@ -0,0 +1,360 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Parallel Arrays</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Parallel Arrays</h1>
+    <p>
+      This section describes an experimental extension for high-performance
+      arrays, which comprises special syntax for array types and array
+      comprehensions, a set of optimising program transformations, and a set
+      of special-purpose libraries.  The extension is currently only partially
+      implemented, but its development will be tracked here.
+    <p>
+      Parallel arrays originally got their name from the aim to provide an
+      architecture-independent programming model for a range of parallel
+      computers.  However, since experiments showed that the approach is also
+      worthwhile for sequential array code, the emphasis has shifted to their
+      parallel evaluation semantics: as soon as any element in a parallel
+      array is demanded, all the other elements are evaluated, too.  This
+      makes parallel arrays more strict than <a
+      href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98
+      arrays</a>, but also opens the door for a loop-based implementation
+      strategy that leads to significantly more efficient code.
+    <p>
+      The programming model as well as the use of the <em>flattening
+      transformation</em>, which is central to the approach, has its origin in
+      the programming language <a
+      href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>.
+
+    <h2>More Sugar: Special Syntax for Array Comprehensions</h2>
+    <p>
+      The option <code>-fparr</code>, which is a dynamic hsc option that can
+      be reversed with <code>-fno-parr</code>, enables special syntax for
+      parallel arrays; the syntax is not essential to using parallel arrays,
+      but makes for significantly more concise programs.  The switch works by
+      making the lexical analyser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>)
+      recognise the tokens <code>[:</code> and <code>:]</code>.  Given that
+      the additional productions in the parser (located in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>)
+      cannot be triggered without the lexer generating the necessary tokens,
+      there is no need to alter the behaviour of the parser.
+    <p>
+      The following additional syntax is accepted (the non-terminals are those
+      from the <a href="http://haskell.org/onlinereport/">Haskell 98 language
+      definition</a>):
+    <p>
+    <blockquote><pre>
+atype -> '[:' type ':]'                               (parallel array type)
+
+aexp  -> '[:' exp1 ',' ... ',' expk ':]'              (explicit array, k >= 0)
+       | '[:' exp1 [',' exp2] '..' exp3 ':]'          (arithmetic array sequence)
+       | '[:' exp '|' quals1 '|' ... '|' qualsn ':]'  (array comprehension, n >= 1)
+
+quals -> qual1 ',' ... ',' qualn                      (qualifier list, n >= 1)
+
+apat  -> '[:' pat1 ',' ... ',' patk ':]'              (array pattern, k >= 0)
+</pre>
+    </blockquote>
+    <p>
+      Moreover, the extended comprehension syntax that allows for <em>parallel
+      qualifiers</em> (i.e., qualifiers separated by "<code>|</code>") is also
+      supported in list comprehensions.  In general, the similarity to the
+      special syntax for lists is obvious.  The two main differences are that
+      (a) arithmetic array sequences are always finite and (b)
+      <code>[::]</code> is not treated as a constructor in expressions and
+      patterns, but rather as a special case of the explicit array syntax.
+      The former is a simple consequence of the parallel evaluation semantics
+      of parallel arrays and the latter is due to arrays not being a
+      constructor-based data type.
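+    <p>
+      As an illustration, here is a small (hypothetical) function using the
+      special syntax; the function itself is our own example, and
+      <code>sumP</code> is assumed to exist by analogy with the corresponding
+      list function:
+    <blockquote><pre>
+-- requires -fparr; sumP is assumed to be the parallel array analogue of sum
+dotp :: [:Float:] -> [:Float:] -> Float
+dotp xs ys = sumP [:x * y | x <- xs | y <- ys:]
+</pre>
+    </blockquote>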
+    <p>
+      As a naming convention, types and functions that are concerned with
+      parallel arrays usually contain the string <code>parr</code> or
+      <code>PArr</code> (often as a prefix), and where corresponding types or
+      functions for handling lists exist, the name is identical, except for
+      containing the substring <code>parr</code> instead of <code>list</code>
+      (possibly in caps).
+    <p>
+      The following implications are worth noting explicitly:
+    <ul>
+      <li>The value and pattern <code>[::]</code> is an empty explicit
+        parallel array (i.e., something of the form <code>ExplicitPArr ty
+        []</code> in the AST).  This is in contrast to lists, which use the
+        nil constructor instead.  In the case of parallel arrays, using a
+        constructor would be rather awkward, as they are not a
+        constructor-based type.  (This becomes rather clear in the desugarer.)
+      <li>As a consequence, array patterns have the general form <code>[:p1,
+        p2, ..., pn:]</code>, where <code>n</code> >= 0.  Thus, two array
+        patterns overlap iff they have the same length -- an important property
+        for the pattern matching compiler.
+    </ul>
+
+    <h2>Prelude Support for Parallel Arrays</h2>
+    <p>
+      The Prelude module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>
+      defines the standard operations and their types on parallel arrays and
+      provides a basic implementation based on boxed arrays.  The interface of
+      <code>PrelPArr</code> is modelled on H98's <code>PrelList</code>, but
+      leaves out all functions that require infinite structures and adds
+      frequently needed array operations, such as permutations.  Parallel
+      arrays are quite unique in that they use an entirely different
+      representation as soon as the flattening transformation is activated,
+      which is described further below.  In particular, <code>PrelPArr</code>
+      defines the type <code>[::]</code> and operations to create, process,
+      and inspect parallel arrays.  The type as well as the names of some of
+      the operations are also hardwired in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>
+      (see the definition of <code>parrTyCon</code> in this module) and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
+      This is again very much like the case of lists, where the type is
+      defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a>
+      and similarly wired in; however, for lists the entirely
+      constructor-based definition is exposed to user programs, which is not
+      the case for parallel arrays.
+
+    <h2>Desugaring Parallel Arrays</h2>
+    <p>
+      The parallel array extension requires the desugarer to replace all
+      occurrences of (1) explicit parallel arrays, (2) array patterns, and (3)
+      array comprehensions by a suitable combination of invocations of
+      operations defined in <code>PrelPArr</code>.
+
+    <h4>Explicit Parallel Arrays</h4>
+    <p>
+      These are by far the simplest to remove.  We simply replace every
+      occurrence of <code>[:<i>e<sub>1</sub></i>, ...,
+      <i>e<sub>n</sub></i>:]</code> by
+    <blockquote>
+      <code>
+        toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>]
+      </code>
+    </blockquote>
+    <p>
+      i.e., we build a list of the array elements, which is then converted
+      into a parallel array.
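+    <p>
+      For instance, instantiating the rule above:
+    <blockquote><pre>
+[:1, 2, 3:]   ~~>   toP [1, 2, 3]
+</pre>
+    </blockquote>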
+    <h4>Parallel Array Patterns</h4>
+    <p>
+      Array patterns are much more tricky as element positions may contain
+      further patterns and the <a
+      href="../the-beast/desugar.html#patmat">pattern matching compiler</a>
+      requires us to flatten all those out.  But before we turn to the gory
+      details, here is the basic idea first: a flat array pattern matches
+      exactly iff its length corresponds to the length of the matched array.
+      Hence, if we have a set of flat array patterns matching an array value
+      <code>a</code>, it suffices to generate a Core <code>case</code>
+      expression that scrutinises <code>lengthP a</code> and has one
+      alternative for every length of array occurring in one of the patterns.
+      Moreover, there needs to be a default case catching all other array
+      lengths.  In each alternative, array indexing (i.e., the function
+      <code>!:</code>) is used to bind array elements to the corresponding
+      pattern variables.  This sounds easy enough and is essentially what the
+      parallel array equation of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code>
+      does.
+    <p>
+      Unfortunately, however, the pattern matching compiler expects that it
+      can turn (almost) any pattern into variable patterns, literals, or
+      constructor applications by way of the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>.
+      And to make matters worse, some weird machinery in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>
+      insists on being able to reverse the process (essentially to
+      pretty-print patterns in warnings for incomplete or overlapping
+      patterns).
+    <p>
+      The solution to this is an (unlimited) set of <em>fake</em> constructors
+      for parallel arrays, courtesy of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>.
+      In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>,
+      ..., <i>p<sub>n</sub></i>:]</code> is transformed into
+    <blockquote>
+      <code>
+        MkPArr<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i>
+      </code>
+    </blockquote>
+    <p>
+      by <code>Match.tidy1</code>, then run through the rest of the pattern
+      matching compiler, and finally picked up by
+      <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a
+      <code>case</code> expression as outlined above.
+    <p>
+      As an example consider the source expression
+    <blockquote><pre>
+case v of
+  [:x1:]         -> e1
+  [:x2, x3, x4:] -> e2
+  _              -> e3</pre>
+    </blockquote>
+    <p>
+      <code>Match.tidy1</code> converts it into a form that is equivalent to
+    <blockquote><pre>
+case v of {
+  MkPArr1 x1       -> e1;
+  MkPArr3 x2 x3 x4 -> e2;
+  _                -> e3;
+}</pre>
+    </blockquote>
+    <p>
+      which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the
+      following Core code:
+    <blockquote><pre>
+  case lengthP v of
+    I# i# ->
+      case i# of l {
+        DFT -> e3
+        1   -> let x1 = v!:0 in e1
+        3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
+      }</pre>
+    </blockquote>
+
+    <h4>Parallel Array Comprehensions</h4>
+    <p>
+      The most challenging construct of the three is the array comprehension.
+      In principle, it would be possible to transform comprehensions in
+      essentially the same way as list comprehensions, but this would lead to
+      abysmally slow code as desugaring of list comprehensions generates code
+      that is optimised for sequential, constructor-based structures.  In
+      contrast, array comprehensions need to be transformed into code that
+      solely relies on collective operations and avoids the creation of many
+      small intermediate arrays.
+    <p>
+      The transformation is implemented by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>.
+      In the following, we denote this transformation function by the form
+      <code><<<i>e</i>>> pa ea</code>, where <code><i>e</i></code>
+      is the comprehension to be compiled and the arguments <code>pa</code>
+      and <code>ea</code> denote a pattern and the currently processed array
+      expression, respectively.  The invariant constraining these two
+      arguments is that all elements in the array produced by <code>ea</code>
+      will <em>successfully</em> match against <code>pa</code>.
+    <p>
+      Given a source-level comprehension <code>[:e | qss:]</code>, we compile
+      it with <code><<[:e | qss:]>> () [:():]</code> using the
+      rules
+    <blockquote><pre>
+<<[:e' | :]>> pa ea = mapP (\pa -> e') ea
+<<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
+<<[:e' | p <- e, qs:]>> pa ea =
+  let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
+  in
+  <<[:e' | qs:]>> (pa, p) (crossP ea ef)
+<<[:e' | let ds, qs:]>> pa ea =
+  <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
+                  (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
+where
+  {x_1, ..., x_n} = DV (ds)  -- Defined Variables
+<<[:e' | qs | qss:]>> pa ea =
+  <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
+                   (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
+where
+  {x_1, ..., x_n} = DV (qs)</pre>
+    </blockquote>
+    <p>
+      We assume the denotation of <code>crossP</code> to be given by
+    <blockquote><pre>
+crossP :: [:a:] -> [:b:] -> [:(a, b):]
+crossP a1 a2 = let
+                 len1 = lengthP a1
+                 len2 = lengthP a2
+                 x1   = concatP $ mapP (replicateP len2) a1
+                 x2   = concatP $ replicateP len1 a2
+               in
+                 zipP x1 x2</pre>
+    </blockquote>
+    <p>
+      For a more efficient implementation of <code>crossP</code>, see
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>.
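+    <p>
+      To illustrate the rules, here is a hand derivation for a simple
+      comprehension of our own choosing (a sketch only; it mechanically
+      applies the rules as stated above):
+    <blockquote><pre>
+<<[:x + 1 | x <- xs:]>> () [:():]
+  = let ef = filterP (\x -> case x of {x -> True; _ -> False}) xs
+    in <<[:x + 1 | :]>> ((), x) (crossP [:():] ef)
+  = let ef = ... in mapP (\((), x) -> x + 1) (crossP [:():] ef)</pre>
+    </blockquote>
+    <p>
+      As the filter for the variable pattern <code>x</code> always succeeds
+      and <code>pa == ()</code>, the optimisations listed below reduce this to
+      plain <code>mapP (\x -> x + 1) xs</code>.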
+    <p>
+      Moreover, the following optimisations are important:
+    <ul>
+      <li>In the <code>p <- e</code> rule, if <code>pa == ()</code>, drop
+        it and simplify the <code>crossP ea e</code> to <code>e</code>.
+      <li>We assume that fusion will optimise sequences of array processing
+        combinators.
+      <li>FIXME: Do we want to have the following function?
+        <blockquote><pre>
+mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre>
+        </blockquote>
+        <p>
+        Even with fusion <code>(mapP (\p -> e) . filterP (\p ->
+        b))</code> may still result in redundant pattern matching
+        operations.  (Let's wait until we have seen what the
+        Simplifier does to the generated code.)
+    </ul>
+
+    <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2>
+    <p>
+      On the quest towards an entirely unboxed representation of parallel
+      arrays, the flattening transformation is the essential ingredient.  GHC
+      uses a <a
+      href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially
+      improved version</a> of the transformation whose original form was
+      described by Blelloch & Sabot.  The flattening transformation
+      replaces values of type <code>[:a:]</code> as well as functions
+      operating on these values by alternative, more efficient data structures
+      and functions.
+    <p>
+      The flattening machinery is activated by the option
+      <code>-fflatten</code>, which is a static hsc option.  It consists of
+      two steps: function vectorisation and array specialisation.  Only the
+      first of those is implemented so far.  If selected, the transformation
+      is applied to a module in Core form immediately after the <a
+      href="../the-beast/desugar.html">desugarer</a>, before the <a
+      href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do its
+      job.  After vectorisation, the Core program can be dumped using the
+      option <code>-ddump-vect</code>.  There is a good reason for us to
+      perform flattening immediately after the desugarer.  In <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code>
+      the so-called <em>persistent compiler state</em> is maintained, which
+      contains all the information about imported interface files needed to
+      look up the details of imported names (e.g., during renaming and type
+      checking).  However, all this information is zapped before the
+      simplifier is invoked (supposedly to reduce the space consumption of
+      GHC).  As flattening has to get at all kinds of identifiers from Prelude
+      modules, we need to do it before the relevant information in the
+      persistent compiler state is gone.
+
+    <p>
+      As flattening generally requires all libraries to be compiled for
+      flattening (just like profiling does), there is a <em>compiler way</em>
+      <code>"ndp"</code>, which can be selected using the way option code
+      <code>-ndp</code>.  This option will automagically select
+      <code>-fparr</code> and <code>-fflatten</code>.
+
+    <h4><code>FlattenMonad</code></h4>
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a>
+      implements the monad <code>Flatten</code> that is used during
+      vectorisation to keep track of various sets of bound variables and a
+      variable substitution map; moreover, it provides a supply of new uniques
+      and allows us to look up names in the persistent compiler state (i.e.,
+      imported identifiers).
+    <p>
+      In order to be able to look up imported identifiers in the persistent
+      compiler state, it is important that these identifiers are included in
+      the free variable lists computed by the renamer.  More precisely, all
+      names needed by flattening are included in the names produced by
+      <code>RnEnv.getImplicitModuleFVs</code>.
+      To avoid putting
+      flattening-dependent lists of names into the renamer, the module
+      <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>.
+
+      [FIXME: It might be worthwhile to document in the non-Flattening part of
+      the Commentary that the persistent compiler state is zapped after
+      desugaring and how the free variables determined by the renamer imply
+      which names are imported.]
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Feb 12 01:44:21 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/exts/th.html b/docs/comm/exts/th.html
new file mode 100644
index 0000000000..dbb168aa0e
--- /dev/null
+++ b/docs/comm/exts/th.html
@@ -0,0 +1,197 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Template Haskell</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Template Haskell</h1>
+    <p>
+      The Template Haskell (TH) extension to GHC adds a meta-programming
+      facility in which all meta-level code is executed at compile time.  The
+      design of this extension is detailed in "Template Meta-programming for
+      Haskell", Tim Sheard and Simon Peyton Jones, <a
+      href="http://portal.acm.org/toc.cfm?id=581690&type=proceeding&coll=portal&dl=ACM&part=series&WantType=proceedings&idx=unknown&title=unknown">ACM
+      SIGPLAN 2002 Haskell Workshop,</a> 2002.  However, some of the details
+      changed after the paper was published.
+    </p>
+
+    <h2>Meta Sugar</h2>
+    <p>
+      The extra syntax of TH (quasi-quote brackets, splices, and reification)
+      is handled in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMeta.hs"><code>DsMeta</code></a>.
+      In particular, the function <code>dsBracket</code> desugars the four
+      types of quasi-quote brackets (<code>[|...|]</code>,
+      <code>[p|...|]</code>, <code>[d|...|]</code>, and <code>[t|...|]</code>)
+      and <code>dsReify</code> desugars the three types of reification
+      operations (<code>reifyType</code>, <code>reifyDecl</code>, and
+      <code>reifyFixity</code>).
+    </p>
+
+    <h3>Desugaring of Quasi-Quote Brackets</h3>
+    <p>
+      A term in quasi-quote brackets needs to be translated into Core code
+      that, when executed, yields a <em>representation</em> of that term in
+      the form of the abstract syntax trees defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/libraries/template-haskell/Language/Haskell/TH/Syntax.hs"><code>Language.Haskell.TH.Syntax</code></a>.
+      Within <code>DsMeta</code>, this is achieved by four functions
+      corresponding to the four types of quasi-quote brackets:
+      <code>repE</code> (for <code>[|...|]</code>), <code>repP</code> (for
+      <code>[p|...|]</code>), <code>repTy</code> (for <code>[t|...|]</code>),
+      and <code>repTopDs</code> (for <code>[d|...|]</code>).
+      All four of
+      these functions receive as an argument the GHC-internal Haskell AST of
+      the syntactic form that they quote (i.e., arguments of type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsExpr.lhs"><code>HsExpr</code></a><code>.HsExpr
+      Name</code>, <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsPat.lhs"><code>HsPat</code></a><code>.HsPat Name</code>,
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsTypes.lhs"><code>HsType</code></a><code>.HsType
+      Name</code>, and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsDecls.lhs"><code>HsDecls</code></a><code>.HsGroup
+      Name</code>, respectively).
+    </p>
+    <p>
+      To increase the static type safety in <code>DsMeta</code>, the functions
+      constructing representations do not just return plain values of type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a><code>.CoreExpr</code>;
+      instead, <code>DsMeta</code> introduces a
+      parametrised type <code>Core</code> whose dummy type parameter indicates
+      the source-level type of the value computed by the corresponding Core
+      expression.  All construction of Core fragments in <code>DsMeta</code>
+      is performed by smart constructors whose type signatures use the dummy
+      type parameter to constrain the contexts in which they are applicable.
+      For example, a function that builds a Core expression that evaluates to
+      a TH type representation, which has type
+      <code>Language.Haskell.TH.Syntax.Type</code>, would return a value of
+      type
+    </p>
+    <blockquote>
+      <pre>
+Core Language.Haskell.TH.Syntax.Type</pre>
+    </blockquote>
+
+    <h3>Desugaring of Reification Operators</h3>
+    <p>
+      The TH paper introduces four reification operators:
+      <code>reifyType</code>, <code>reifyDecl</code>,
+      <code>reifyFixity</code>, and <code>reifyLocn</code>.  Of these,
+      currently (= 9 Nov 2002), only the first two are implemented.
+    </p>
+    <p>
+      The operator <code>reifyType</code> receives the name of a function or
+      data constructor as its argument and yields a representation of this
+      entity's type in the form of a value of type
+      <code>TH.Syntax.Type</code>.  Similarly, <code>reifyDecl</code> receives
+      the name of a type and yields a representation of the type's declaration
+      as a value of type <code>TH.Syntax.Decl</code>.  The name of the reified
+      entity is mapped to the GHC-internal representation of the entity by
+      using the function <code>lookupOcc</code> on the name.
+    </p>
+
+    <h3>Representing Binding Forms</h3>
+    <p>
+      Care needs to be taken when constructing TH representations of Haskell
+      terms that include binding forms, such as lambda abstractions or let
+      bindings.  To avoid name clashes, fresh names need to be generated for
+      all defined identifiers.  This is achieved via the routine
+      <code>DsMeta.mkGenSym</code>, which, given a <code>Name</code>, produces
+      a <code>Name</code> / <code>Id</code> pair (of type
+      <code>GenSymBind</code>) that associates the given <code>Name</code>
+      with a Core identifier that at runtime will be bound to a string that
+      contains the fresh name.  Notice the two-level nature of this
+      arrangement.  It is necessary, as the Core code that constructs the
+      Haskell term representation may be executed multiple times at runtime
+      and it must be ensured that different names are generated in each run.
+    </p>
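+    <p>
+      As a hedged sketch of what this means (using the combinator names of
+      the TH paper, which need not match the library exactly), the Core
+      generated for the bracket <code>[| \x -> x |]</code> behaves at
+      runtime roughly like
+    </p>
+    <blockquote><pre>
+do { x' <- gensym "x"          -- a fresh name is produced on *every* run
+   ; lam (pvar x') (var x') }  -- build the representation of the lambda</pre>
+    </blockquote>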
+    <p>
+      Such fresh bindings need to be entered into the meta environment (of
+      type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a><code>.DsMetaEnv</code>),
+      which is part of the state (of type <code>DsMonad.DsEnv</code>)
+      maintained in the desugarer monad (of type <code>DsMonad.DsM</code>).
+      This is done using the function <code>DsMeta.addBinds</code>, which
+      extends the current environment by a list of <code>GenSymBind</code>s
+      and executes a subcomputation in this extended environment.  Names can
+      be looked up in the meta environment by way of the functions
+      <code>DsMeta.lookupOcc</code> and <code>DsMeta.lookupBinder</code>; more
+      details about the difference between these two functions can be found in
+      the next subsection.
+    </p>
+    <p>
+      NB: <code>DsMeta</code> uses <code>mkGenSym</code> only when
+      representing terms that may be embedded into a context where names can
+      be shadowed.  For example, a lambda abstraction embedded into an
+      expression can potentially shadow names defined in the context it is
+      being embedded into.  In contrast, this can never be the case for
+      top-level declarations, such as data type declarations; hence, the type
+      variables that a parametric data type declaration abstracts over are not
+      gensym'ed.  As a result, variables in defining positions are
+      handled differently depending on the syntactic construct in which they
+      appear.
+    </p>
+
+    <h3>Binders Versus Occurrences</h3>
+    <p>
+      Name lookups in the meta environment of the desugarer use two functions
+      with slightly different behaviour, namely <code>DsMeta.lookupOcc</code>
+      and <code>lookupBinder</code>.  The module <code>DsMeta</code> contains
+      the following explanation as to the difference between these functions:
+    </p>
+    <blockquote>
+      <pre>
+When we desugar [d| data T = MkT |]
+we want to get
+        Data "T" [] [Con "MkT" []] []
+and *not*
+        Data "Foo:T" [] [Con "Foo:MkT" []] []
+That is, the new data decl should fit into whatever new module it is
+asked to fit in.  We do *not* clone, though; no need for this:
+        Data "T79" ....
+
+But if we see this:
+        data T = MkT
+        foo = reifyDecl T
+
+then we must desugar to
+        foo = Data "Foo:T" [] [Con "Foo:MkT" []] []
+
+So in repTopDs we bring the binders into scope with mkGenSyms and addBinds,
+but in dsReify we do not.  And we use lookupOcc, rather than lookupBinder
+in repTyClD and repC.</pre>
+    </blockquote>
+    <p>
+      This implies that <code>lookupOcc</code>, when it does not find the name
+      in the meta environment, uses the function <code>DsMeta.globalVar</code>
+      to construct the <em>original name</em> of the entity (cf. the TH paper
+      for more details regarding original names).  This name uniquely
+      identifies the entity in the whole program and is in scope
+      <em>independently</em> of whether the user name of the same entity is in
+      scope or not (i.e., it may be defined in a different module without
+      being explicitly imported) and has the form <module>:<name>.
+      <strong>NB:</strong> Incidentally, the current implementation of this
+      mechanism facilitates breaking any abstraction barrier.
+    </p>
+
+    <h3>Known-key Names for Template Haskell</h3>
+    <p>
+      During the construction of representations, the desugarer needs to use a
+      large number of functions defined in the library
+      <code>Language.Haskell.TH.Syntax</code>.
+      The names of these functions
+      need to be made available to the compiler in the way outlined in <a
+      href="../the-beast/prelude.html">Primitives and the Prelude</a>.
+      Unfortunately, any change to <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>
+      triggers a significant amount of recompilation.  Hence, the names needed
+      for TH are defined in <code>DsMeta</code> instead (at the end of the
+      module).  All library functions needed by TH are contained in the name
+      set <code>DsMeta.templateHaskellNames</code>.
+    </p>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Nov 13 18:01:48 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/feedback.html b/docs/comm/feedback.html
new file mode 100644
index 0000000000..1da8b10f29
--- /dev/null
+++ b/docs/comm/feedback.html
@@ -0,0 +1,34 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Feedback</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Feedback</h1>
+    <p>
+      <a href="mailto:chak@cse.unsw.edu.au">I</a> welcome any feedback on the
+      material and in particular would appreciate comments on which parts of
+      the document are incomprehensible or lack explanation -- e.g., due to
+      the use of GHC speak that is explained nowhere (words such as
+      infotable).  Moreover, I would be interested to know which areas of GHC
+      you would like to see covered here.
+    <p>
+      For the moment, it is probably best if feedback is directed to
+    <p>
+    <blockquote>
+      <a
+      href="mailto:chak@cse.unsw.edu.au"><code>chak@cse.unsw.edu.au</code></a>
+    </blockquote>
+    <p>
+      However, if there is sufficient interest, we might consider setting up a
+      mailing list.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 00:11:42 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/genesis.html b/docs/comm/genesis/genesis.html
new file mode 100644
index 0000000000..30b16fec46
--- /dev/null
+++ b/docs/comm/genesis/genesis.html
@@ -0,0 +1,82 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Outline of the Genesis</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Outline of the Genesis</h1>
+    <p>
+      Building GHC happens in two stages: first you have to prepare the tree
+      with <code>make boot</code>; and second, you build the compiler and
+      associated libraries with <code>make all</code>.  The <code>boot</code>
+      stage builds some tools used during the main build process, generates
+      parsers and other pre-computed source, and finally computes dependency
+      information.  There is considerable detail on the build process in GHC's
+      <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building Guide.</a>
+
+    <h4>Debugging the Beast</h4>
+    <p>
+      If you are hacking the compiler or like to play with unstable
+      development versions, chances are that the compiler someday just crashes
+      on you.  Then, it is a good idea to load the <code>core</code> into
+      <code>gdb</code> as usual, but unfortunately there is usually not too
+      much useful information.
+    <p>
+      The next step, then, is somewhat tedious.  You should build a compiler
+      producing programs with a runtime system that has debugging turned on
+      and use that to build the crashing compiler.
+      There are many sanity
+      checks in the RTS, which may detect an inconsistency before it leads to
+      a crash, and you may include more debugging information, which helps
+      <code>gdb</code>.  For an RTS with debugging turned on, add the
+      following to <code>build.mk</code> (see also the comment in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/mk/config.mk.in"><code>config.mk.in</code></a> that you find when searching for
+      <code>GhcRtsHcOpts</code>):
+<blockquote><pre>
+GhcRtsHcOpts+=-optc-DDEBUG
+GhcRtsCcOpts+=-g
+EXTRA_LD_OPTS=-lbfd -liberty</pre></blockquote>
+    <p>
+      Then go into <code>fptools/ghc/rts</code> and <code>make clean boot &&
+      make all</code>.  With the resulting runtime system, you have to re-link
+      the compiler.  Go into <code>fptools/ghc/compiler</code>, delete the
+      file <code>hsc</code> (up to version 4.08) or
+      <code>ghc-<version></code>, and execute <code>make all</code>.
+    <p>
+      The <code>EXTRA_LD_OPTS</code> are necessary as some of the debugging
+      code uses the BFD library, which in turn requires <code>liberty</code>.
+      I would also recommend (in 4.11 and from 5.0 upwards) adding these linker
+      options to the files <code>package.conf</code> and
+      <code>package.conf.inplace</code> in the directory
+      <code>fptools/ghc/driver/</code> to the <code>extra_ld_opts</code> entry
+      of the package <code>RTS</code>.  Otherwise, you have to supply them
+      whenever you compile and link a program with a compiler that uses the
+      debugging RTS for the programs it produces.
+    <p>
+      To run GHC up to version 4.08 in <code>gdb</code>, first invoke the
+      compiler as usual, but pass it the option <code>-v</code>.  This will
+      show you the exact invocation of the compiler proper <code>hsc</code>.
+      Run <code>hsc</code> with these options in <code>gdb</code>.  The
+      development version 4.11 and stable releases from 5.0 on no longer
+      use the Perl driver, so you can run them directly with gdb.
+    <p>
+      <strong>Debugging a compiler during building from HC files.</strong>
+      If you are bootstrapping the compiler on a new platform from HC files
+      and it crashes somewhere during the build (e.g., when compiling the
+      libraries), do as explained above, but you may have to re-configure the
+      build system with <code>--enable-hc-boot</code> before re-making the
+      code in <code>fptools/ghc/driver/</code>.
+      If you do this with a compiler up to version 4.08, run the build process
+      with <code>make EXTRA_HC_OPTS=-v</code> to get the exact arguments with
+      which you have to invoke <code>hsc</code> in <code>gdb</code>.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Sun Apr 24 22:16:30 CEST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/makefiles.html b/docs/comm/genesis/makefiles.html
new file mode 100644
index 0000000000..957a82eb85
--- /dev/null
+++ b/docs/comm/genesis/makefiles.html
@@ -0,0 +1,51 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Mindboggling Makefiles</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Mindboggling Makefiles</h1>
+    <p>
+      The size and structure of GHC's makefiles make it quite easy to scream
+      out loud - in pain - during the process of tracking down problems in the
+      make system or when attempting to alter it.
+      GHC's <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building
+      Guide</a> has valuable information on <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/building/sec-makefile-arch.html">the
+      makefile architecture</a>.
+
+    <h4>A maze of twisty little passages, all alike</h4>
+    <p>
+      The <code>fptools/</code> toplevel and the various project directories
+      contain not only a <code>Makefile</code> each, but there are also
+      subdirectories named <code>mk/</code> at various levels that contain
+      rules, targets, and so on specific to a project - or, in the case of the
+      toplevel, the default rules for the whole system.  Each <code>mk/</code>
+      directory contains a file <code>boilerplate.mk</code> that ties the
+      various other makefiles together.  Files called <code>target.mk</code>,
+      <code>paths.mk</code>, and <code>suffix.mk</code> contain make targets,
+      definitions of variables containing paths, and suffix rules,
+      respectively.
+    <p>
+      One particularly nasty trick used in this hierarchy of makefiles is the
+      way in which the variable <code>$(TOP)</code> is used.  AFAIK,
+      <code>$(TOP)</code> always points to a directory containing an
+      <code>mk/</code> subdirectory; however, it does not necessarily point to
+      the toplevel <code>fptools/</code> directory.  For example, within the
+      GHC subtree, <code>$(TOP)</code> points to <code>fptools/ghc/</code>.
+      However, some of the makefiles in <code>fptools/ghc/mk/</code> will then
+      <em>temporarily</em> redefine <code>$(TOP)</code> to point one level
+      higher (i.e., to <code>fptools/</code>) while they are including the
+      toplevel boilerplate.  After that, <code>$(TOP)</code> is redefined to
+      whatever value it had before including makefiles from higher up in the
+      hierarchy.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/genesis/modules.html b/docs/comm/genesis/modules.html
new file mode 100644
index 0000000000..de59cce6d3
--- /dev/null
+++ b/docs/comm/genesis/modules.html
@@ -0,0 +1,164 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Marvellous Module Structure of GHC </title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Marvellous Module Structure of GHC </h1>
+    <p>
+
+GHC is built out of about 245 Haskell modules.  It can be quite tricky
+to figure out what the module dependency graph looks like.  It can be
+important, too, because loops in the module dependency graph need to
+be broken carefully using <tt>.hi-boot</tt> interface files.
+<p>
+This section of the commentary documents the subtlest part of
+the module dependency graph, namely the part near the bottom.
+<ul>
+<li> The list is given in compilation order: that is,
+modules near the top are more primitive, and are compiled earlier.
+<li> Each module is listed together with its most critical
+dependencies in parentheses; that is, the dependencies that prevent it
+from being compiled earlier.
+<li> Modules in the same bullet don't depend on each other.
+<li> Loops are documented by a dependency such as "<tt>loop Type.Type</tt>".
+This means that the module imports <tt>Type.Type</tt>, but module <tt>Type</tt>
+has not yet been compiled, so the import comes from <tt>Type.hi-boot</tt>
+(see the sketch just below this list).
+</ul>
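+
+As a hedged sketch of the idea (shown in the style of the
+<tt>hs-boot</tt> files that appear further down; the real
+<tt>Type.hi-boot</tt> is written in interface syntax and differs in
+detail), a boot file declares just enough of a module to let its
+importers compile before the module itself is built:
+<blockquote><pre>
+-- Type.hs-boot (illustrative only)
+module Type where
+
+data Type   -- abstract: enough for, e.g., Var to mention Type.Type</pre>
+</blockquote>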
+
+Compilation order is as follows:
+<ul>
+<li>
+<strong>First comes a layer of modules that have few interdependencies,
+and which implement very basic data types</strong>:
+<tt> <ul>
+<li> Util
+<li> OccName
+<li> Pretty
+<li> Outputable
+<li> StringBuffer
+<li> ListSetOps
+<li> Maybes
+<li> etc
+</ul> </tt>
+
+<p>
+<li> <strong> Now comes the main subtle layer, involving types, classes, type constructors,
+identifiers, expressions, rules, and their operations.</strong>
+
+<tt>
+<ul>
+<li> Name <br> PrimRep
+<p><li>
+     PrelNames
+<p><li>
+     Var (Name, loop IdInfo.IdInfo,
+          loop Type.Type, loop Type.Kind)
+<p><li>
+     VarEnv, VarSet, ThinAir
+<p><li>
+     Class (loop TyCon.TyCon, loop Type.Type)
+<p><li>
+     TyCon (loop Type.Type, loop Type.Kind, loop DataCon.DataCon, loop Generics.GenInfo)
+<p><li>
+     TypeRep (loop DataCon.DataCon, loop Subst.substTyWith)
+<p><li>
+     Type (loop PprType.pprType, loop Subst.substTyWith)
+<p><li>
+     FieldLabel(Type) <br>
+     TysPrim(Type) <br>
+<p><li>
+     Literal (TysPrim, PprType) <br>
+     DataCon (loop PprType, loop Subst.substTyWith, FieldLabel.FieldLabel)
+<p><li>
+     TysWiredIn (loop MkId.mkDataConIds)
+<p><li>
+     TcType( lots of TysWiredIn stuff)
+<p><li>
+     PprType( lots of TcType stuff )
+<p><li>
+     PrimOp (PprType, TysWiredIn)
+<p><li>
+     CoreSyn [does not import Id]
+<p><li>
+     IdInfo (CoreSyn.Unfolding, CoreSyn.CoreRules)
+<p><li>
+     Id (lots from IdInfo)
+<p><li>
+     CoreFVs <br>
+     PprCore
+<p><li>
+     CoreUtils (PprCore.pprCoreExpr, CoreFVs.exprFreeVars,
+                CoreSyn.isEvaldUnfolding CoreSyn.maybeUnfoldingTemplate)
+<p><li>
+     CoreLint( CoreUtils ) <br>
+     OccurAnal (CoreUtils.exprIsTrivial) <br>
+     CoreTidy (CoreUtils.exprArity ) <br>
+<p><li>
+     CoreUnfold (OccurAnal.occurAnalyseGlobalExpr)
+<p><li>
+     Subst (CoreUnfold.Unfolding, CoreFVs) <br>
+     Generics (CoreUnfold.mkTopUnfolding) <br>
+     Rules (CoreUnfold.Unfolding, PprCore.pprTidyIdRules)
+<p><li>
+     MkId (CoreUnfold.mkUnfolding, Subst, Rules.addRule)
+<p><li>
+     PrelInfo (MkId) <br>
+     HscTypes( Rules.RuleBase )
+</ul></tt>
+
+<p><li> <strong>That is the end of the infrastructure.
+Now we get the
+main layer of modules that perform useful work.</strong>
+
+<tt><ul>
+<p><li>
+     CoreTidy (HscTypes.PersistentCompilerState)
+</ul></tt>
+</ul>
+
+HsSyn stuff
+<ul>
+<li> HsPat.hs-boot
+<li> HsExpr.hs-boot (loop HsPat.LPat)
+<li> HsTypes (loop HsExpr.HsSplice)
+<li> HsBinds (HsTypes.LHsType, loop HsPat.LPat, HsExpr.pprFunBind and others)
+     HsLit (HsTypes.SyntaxName)
+<li> HsPat (HsBinds, HsLit)
+     HsDecls (HsBinds)
+<li> HsExpr (HsDecls, HsPat)
+</ul>
+
+
+
+<h2>Library stuff: base package</h2>
+
+<ul>
+<li> GHC.Base
+<li> Data.Tuple (GHC.Base), GHC.Ptr (GHC.Base)
+<li> GHC.Enum (Data.Tuple)
+<li> GHC.Show (GHC.Enum)
+<li> GHC.Num (GHC.Show)
+<li> GHC.ST (GHC.Num), GHC.Real (GHC.Num)
+<li> GHC.Arr (GHC.ST)  GHC.STRef (GHC.ST)
+<li> GHC.IOBase (GHC.Arr)
+<li> Data.Bits (GHC.Real)
+<li> Data.HashTable (Data.Bits, Control.Monad)
+<li> Data.Typeable (GHC.IOBase, Data.HashTable)
+<li> GHC.Weak (Data.Typeable, GHC.IOBase)
+</ul>
+
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 22 16:46:33 GMT Daylight Time 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/index.html b/docs/comm/index.html
new file mode 100644
index 0000000000..5ccd5f0ca9
--- /dev/null
+++ b/docs/comm/index.html
@@ -0,0 +1,121 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Beast Explained</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.17]</h1>
+    <p>
+    <!-- Contributors: Whoever makes substantial additions or changes to the
+         document, please add your name and keep the order alphabetic.  Moreover,
+         please bump the version number for any substantial modification that you
+         check into CVS.
+      -->
+    <strong>Manuel M. T. Chakravarty</strong><br>
+    <strong>Sigbjorn Finne</strong><br>
+    <strong>Simon Marlow</strong><br>
+    <strong>Simon Peyton Jones</strong><br>
+    <strong>Julian Seward</strong><br>
+    <strong>Reuben Thomas</strong><br>
+    <br>
+    <p>
+      This document started as a collection of notes describing what <a
+      href="mailto:chak@cse.unsw.edu.au">I</a> learnt when poking around in
+      the <a href="http://haskell.org/ghc/">GHC</a> sources.  During the
+      <i>Haskell Implementers Workshop</i> in January 2001, it was decided to
+      put the commentary into
+      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/">GHC's CVS
+      repository</a>
+      to allow the whole developer community to add their wizardly insight to
+      the document.
+    <p>
+      <strong>The document is still far from being complete - help it
+      grow!</strong>
+
+    <h2>Before the Show Begins</h2>
+    <p>
+    <ul>
+      <li><a href="feedback.html">Feedback</a>
+      <li><a href="others.html">Other Sources of Wisdom</a>
+    </ul>
+
+    <h2>Genesis</h2>
+    <p>
+    <ul>
+      <li><a href="genesis/genesis.html">Outline of the Genesis</a>
+      <li><a href="genesis/makefiles.html">Mindboggling Makefiles</a>
+      <li><a href="genesis/modules.html">GHC's Marvellous Module Structure</a>
+    </ul>
+
+    <h2>The Beast Dissected</h2>
+    <p>
+    <ul>
+      <li><a href="the-beast/coding-style.html">Coding style used in
+          the compiler</a>
+      <li><a href="the-beast/driver.html">The Glorious Driver</a>
+      <li><a href="the-beast/prelude.html">Primitives and the Prelude</a>
+      <li><a href="the-beast/syntax.html">Just Syntax</a>
+      <li><a href="the-beast/basicTypes.html">The Basics</a>
+      <li><a href="the-beast/modules.html">Modules, ModuleNames and
+          Packages</a>
+      <li><a href="the-beast/names.html">The truth about names: Names and OccNames</a>
+      <li><a href="the-beast/vars.html">The Real Story about Variables, Ids,
+          TyVars, and the like</a>
+      <li><a href="the-beast/data-types.html">Data types and constructors</a>
+      <li><a href="the-beast/renamer.html">The Glorious Renamer</a>
+      <li><a href="the-beast/types.html">Hybrid Types</a>
+      <li><a href="the-beast/typecheck.html">Checking Types</a>
+      <li><a href="the-beast/desugar.html">Sugar Free: From Haskell To Core</a>
+      <li><a href="the-beast/simplifier.html">The Mighty Simplifier</a>
+      <li><a href="the-beast/mangler.html">The Evil Mangler</a>
+      <li><a href="the-beast/alien.html">Alien Functions</a>
+      <li><a href="the-beast/stg.html">You Got Control: The STG-language</a>
+      <li><a href="the-beast/ncg.html">The Native Code Generator</a>
+      <li><a href="the-beast/ghci.html">GHCi</a>
+      <li><a href="the-beast/fexport.html">Implementation of
+          <code>foreign export</code></a>
+      <li><a href="the-beast/main.html">Compiling and running the Main module</a>
+    </ul>
+
+    <h2>RTS & Libraries</h2>
+    <p>
+    <ul>
+      <li><a href="rts-libs/coding-style.html">Coding Style Guidelines</a>
+      <li><a href="rts-libs/stgc.html">Spineless Tagless C</a>
+      <li><a href="rts-libs/primitives.html">Primitives</a>
+      <li><a href="rts-libs/prelfound.html">Prelude Foundations</a>
+      <li><a href="rts-libs/prelude.html">Cunning Prelude Code</a>
+      <li><a href="rts-libs/foreignptr.html">On why we have <tt>ForeignPtr</tt></a>
+      <li><a href="rts-libs/non-blocking.html">Non-blocking I/O for Win32</a>
+      <li><a href="rts-libs/multi-thread.html">Supporting multi-threaded interoperation</a>
+    </ul>
+
+    <h2>Extensions, or Making a Complicated System More Complicated</h2>
+    <p>
+    <ul>
+      <li><a href="exts/th.html">Template Haskell</a>
+      <li><a href="exts/ndp.html">Parallel Arrays</a>
+    </ul>
+
+    <h2>The Source</h2>
+    <p>
+      The online master copy of the Commentary is at
+    <blockquote>
+      <a href="http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/">http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/</a>
+    </blockquote>
+    <p>
+      This online version is updated
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/docs/comm/">from
+      CVS</a>
+      daily.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Thu May 12 19:03:42 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/others.html b/docs/comm/others.html
new file mode 100644
index 0000000000..52d87e9419
--- /dev/null
+++ b/docs/comm/others.html
@@ -0,0 +1,60 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Other Sources of Wisdom</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Other Sources of Wisdom</h1>
+    <p>
+      Believe it or not, there are other people besides you who are
+      masochistic enough to study the innards of the beast.  Some of them have
+      been kind (or cruel?) enough to share their insights with us.  Here is a
+      probably incomplete list:
+    <p>
+    <ul>
+
+      <li>The <a
+        href="http://www.cee.hw.ac.uk/~dsg/gph/docs/StgSurvival.ps.gz">STG
+        Survival Sheet</a> has -- according to its header -- been written by
+        `a poor wee soul',<sup><a href="#footnote1">1</a></sup> which
+        probably has been pushed into the torments of madness by the very
+        act of contemplating the inner workings of the STG runtime system.
+        This document discusses GHC's runtime system with a focus on
+        support for parallel processing (aka GUM).
+
+      <li>Instructions on <a
+        href="http://www-users.cs.york.ac.uk/~olaf/PUBLICATIONS/extendGHC.html">Adding
+        an Optimisation Pass to the Glasgow Haskell Compiler</a>
+        have been compiled by <a
+        href="http://www-users.cs.york.ac.uk/~olaf/">Olaf Chitil</a>.
+        Unfortunately, this document is already a little aged.
+
+      <li><a href="http://www.cs.pdx.edu/~apt/">Andrew Tolmach</a> has defined
+        <a href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">an external
+        representation of
+        GHC's <em>Core</em> language</a> and also implemented a GHC pass
+        that emits the intermediate form into <code>.hcr</code> files.  The
+        option <code>-fext-core</code> triggers GHC to emit Core code after
+        optimisation; in addition, <code>-fno-code</code> is often used to
+        stop compilation after Core has been emitted.
+
+      <!-- Add references to other background texts listed on the GHC docu
+           page
+        -->
+
+    </ul>
+
+    <p><hr><p>
+    <sup><a name="footnote1">1</a></sup>Usually reliable sources have it that
+    the poor soul in question is none other than GUM hardcore hacker <a
+    href="http://www.cee.hw.ac.uk/~hwloidl/">Hans-Wolfgang Loidl</a>.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Nov 13 10:56:57 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/coding-style.html b/docs/comm/rts-libs/coding-style.html
new file mode 100644
index 0000000000..58f5b4f9bb
--- /dev/null
+++ b/docs/comm/rts-libs/coding-style.html
@@ -0,0 +1,516 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Style Guidelines for RTS C code</title>
+  </head>
+
+<body>
+<H1>The GHC Commentary - Style Guidelines for RTS C code</h1>
+
+<h2>Comments</h2>
+
+<p>These coding style guidelines are mainly intended for use in
+<tt>ghc/rts</tt> and <tt>ghc/includes</tt>.
+
+<p>NB These are just suggestions.  They're not set in stone.  Some of
+them are probably misguided.  If you disagree with them, feel free to
+modify this document (and make your commit message reasonably
+informative) or mail someone (e.g.
+<a
+href="mailto:glasgow-haskell-users@haskell.org">The GHC mailing list</a>).
+
+<h2>References</h2>
+
+If you haven't read them already, you might like to check the following.
+Where they conflict with our suggestions, they're probably right.
+
+<ul>
+
+<li>
+The C99 standard.  One reasonable reference is <a
+href="http://home.tiscalinet.ch/t_wolf/tw/c/c9x_changes.html">here</a>.
+
+<p><li>
+Writing Solid Code, Microsoft Press.  (Highly recommended.  Possibly
+the only Microsoft Press book that's worth reading.)
+
+<p><li>
+Autoconf documentation.
+See also <a href="http://peti.gmd.de/autoconf-archive/">The autoconf macro archive</a> and
+<a href="http://www.cyclic.com/cyclic-pages/autoconf.html">Cyclic Software's description</a>
+
+<p><li> <a
+href="http://www.cs.umd.edu/users/cml/cstyle/indhill-cstyle.html">Indian
+Hill C Style and Coding Standards</a>.
+
+<p><li>
+<a href="http://www.cs.umd.edu/users/cml/cstyle/">A list of C programming style links</a>
+
+<p><li>
+<a href="http://www.lysator.liu.se/c/c-www.html">A very large list of C programming links</a>
+
+<p><li>
+<a href="http://www.geek-girl.com/unix.html">A list of Unix programming links</a>
+
+</ul>
+
+
+<h2>Portability issues</h2>
+
+<ul>
+<p><li> We try to stick to C99 where possible.  We use the following
+C99 features relative to C89, some of which were previously GCC
+extensions (possibly with different syntax):
+
+<ul>
+<p><li>Variable length arrays as the last field of a struct.  GCC has
+a similar extension, but the syntax is slightly different: in GCC you
+would declare the array as <tt>arr[0]</tt>, whereas in C99 it is
+declared as <tt>arr[]</tt>.
+
+<p><li>Inline annotations on functions (see later)
+
+<p><li>Labeled elements in initialisers.  Again, GCC has a slightly
+different syntax from C99 here, and we stick with the GCC syntax until
+GCC implements the C99 proposal.
+
+<p><li>C++-style comments.  These are part of the C99 standard, and we
+prefer to use them whenever possible.
+</ul>
+
+<p>In addition we use ANSI-C-style function declarations and
+prototypes exclusively.  Every function should have a prototype;
+static function prototypes may be placed near the top of the file in
+which they are declared, and external prototypes are usually placed in
+a header file with the same basename as the source file (although there
+are exceptions to this rule, particularly when several source files
+together implement a subsystem which is described by a single external
+header file).
+
+<p><li>We use the following GCC extensions, but surround them with
+<tt>#ifdef __GNUC__</tt>:
+
+<ul>
+<p><li>Function attributes (mostly just <code>no_return</code> and
+<code>unused</code>)
+<p><li>Inline assembly.
+</ul>
+
+<p><li>
+char can be signed or unsigned - always say which you mean
+
+<p><li>Our POSIX policy: try to write code that only uses POSIX (IEEE
+Std 1003.1) interfaces and APIs.  We used to define
+<code>POSIX_SOURCE</code> by default, but found that this caused more
+problems than it solved, so now we require any code that is
+POSIX-compliant to explicitly say so by having <code>#include
+"PosixSource.h"</code> at the top.  Try to do this whenever possible.
+
+<p><li> Some architectures have memory alignment constraints.  Others
+don't have any constraints but go faster if you align things.
+These
+macros (from <tt>ghcconfig.h</tt>) tell you which alignment to use
+
+<pre>
+  /* minimum alignment of unsigned int */
+  #define ALIGNMENT_UNSIGNED_INT 4
+
+  /* minimum alignment of long */
+  #define ALIGNMENT_LONG 4
+
+  /* minimum alignment of float */
+  #define ALIGNMENT_FLOAT 4
+
+  /* minimum alignment of double */
+  #define ALIGNMENT_DOUBLE 4
+</pre>
+
+<p><li> Use <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> when
+reading/writing ints and ptrs to the stack or heap.  Note that, by
+definition, <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> are
+the same size and have the same alignment constraints even if
+<code>sizeof(int) != sizeof(ptr)</code> on that platform.
+
+<p><li> Use <tt>StgInt8</tt>, <tt>StgInt16</tt>, etc when you need a
+certain minimum number of bits in a type.  Use <tt>int</tt> and
+<tt>nat</tt> when there's no particular constraint.  ANSI C only
+guarantees that ints are at least 16 bits but within GHC we assume
+they are 32 bits.
+
+<p><li> Use <tt>StgFloat</tt> and <tt>StgDouble</tt> for floating
+point values which will go on/have come from the stack or heap.  Note
+that <tt>StgDouble</tt> may occupy more than one <tt>StgWord</tt>, but
+it will always be a whole number multiple.
+
+<p>
+Use <code>PK_FLT(addr)</code>, <code>PK_DBL(addr)</code> to read
+<tt>StgFloat</tt> and <tt>StgDouble</tt> values from the stack/heap,
+and <code>ASSIGN_FLT(val,addr)</code> /
+<code>ASSIGN_DBL(val,addr)</code> to assign StgFloat/StgDouble values
+to heap/stack locations.  These macros take care of alignment
+restrictions.
+
+<p>
+Heap/Stack locations are always <tt>StgWord</tt> aligned; the
+alignment requirements of an <tt>StgDouble</tt> may be more than that
+of <tt>StgWord</tt>, but we don't pad misaligned <tt>StgDoubles</tt>
+because doing so would be too much hassle (see <code>PK_DBL</code> &
+co above).
+
+<p><li>
+Avoid conditional code like this:
+
+<pre>
+  #ifdef solaris_host_OS
+  // do something solaris specific
+  #endif
+</pre>
+
+Instead, add an appropriate test to the configure.ac script and use
+the result of that test instead.
+
+<pre>
+  #ifdef HAVE_BSD_H
+  // use a BSD library
+  #endif
+</pre>
+
+<p>The problem is that things change from one version of an OS to another
+- things get added, things get deleted, things get broken, some things
+are optional extras.  Using "feature tests" instead of "system tests"
+makes things a lot less brittle.  Things also tend to get documented
+better.
+
+</ul>
+
+<h2>Debugging/robustness tricks</h2>
+
+
+Anyone who has tried to debug a garbage collector or code generator
+will tell you: "If a program is going to crash, it should crash as
+soon, as noisily and as often as possible."  There's nothing worse
+than trying to find a bug which only shows up when running GHC on
+itself and doesn't manifest itself until 10 seconds after the actual
+cause of the problem.
+
+<p>We put all our debugging code inside <tt>#ifdef DEBUG</tt>.  The
+general policy is we don't ship code with debugging checks and
+assertions in it, but we do run with those checks in place when
+developing and testing.  Anything inside <tt>#ifdef DEBUG</tt> should
+not slow down the code by more than a factor of 2.
+
+<p>We also have more expensive "sanity checking" code for hardcore
+debugging - this can slow down the code by a large factor, but is only
+enabled on demand by a command-line flag.  General sanity checking in
+the RTS is currently enabled with the <tt>-DS</tt> RTS flag.
+
+<p>There are a number of RTS flags which control debugging output and
+sanity checking in various parts of the system when <tt>DEBUG</tt> is
+defined.  For example, to get the scheduler to be verbose about what
+it is doing, you would say <tt>+RTS -Ds -RTS</tt>.  See
+<tt>includes/RtsFlags.h</tt> and <tt>rts/RtsFlags.c</tt> for the full
+set of debugging flags.  To check one of these flags in the code,
+write:
+
+<pre>
+  IF_DEBUG(gc, fprintf(stderr, "..."));
+</pre>
+
+This checks the <tt>gc</tt> flag before generating the output (and the
+code is removed altogether if <tt>DEBUG</tt> is not defined).
+
+<p>All debugging output should go to <tt>stderr</tt>.
+
+<p>
+Particular guidelines for writing robust code:
+
+<ul>
+<p><li>
+Use assertions.  Use lots of assertions.  If you write a comment
+that says "takes a +ve number", add an assertion.  If you're casting
+an int to a nat, add an assertion.  If you're casting an int to a char,
+add an assertion.  We use the <tt>ASSERT</tt> macro for writing
+assertions; it goes away when <tt>DEBUG</tt> is not defined.
+
+<p><li>
+Write special debugging code to check the integrity of your data structures.
+(Most of the runtime checking code is in <tt>rts/Sanity.c</tt>.)
+Add extra assertions which call this code at the start and end of any
+code that operates on your data structures.
+
+<p><li>
+When you find a hard-to-spot bug, try to think of some assertions,
+sanity checks or whatever that would have made the bug easier to find.
+
+<p><li>
+When defining an enumeration, it's a good idea not to use 0 for normal
+values.  Instead, make 0 raise an internal error.  The idea here is to
+make it easier to detect pointer-related errors on the assumption that
+random pointers are more likely to point to a 0 than to anything else.
+
+<pre>
+typedef enum
+    { i_INTERNAL_ERROR  /* Instruction 0 raises an internal error */
+    , i_PANIC           /* irrefutable pattern match failed! */
+    , i_ERROR           /* user level error */
+
+    ...
+</pre>
+
+<p><li> Use <tt>#warning</tt> or <tt>#error</tt> whenever you write a
+piece of incomplete/broken code.
+
+<p><li> When testing, try to make infrequent things happen often.
+     For example, make a context switch/gc/etc happen every time a
+     context switch/gc/etc can happen.  The system will run like a
+     pig but it'll catch a lot of bugs.
+
+</ul>
+
+<h2>Syntactic details</h2>
+
+<ul>
+<p><li><b>Important:</b> Put "redundant" braces or parens in your code.
+Omitting braces and parens leads to very hard to spot bugs -
+especially if you use macros (and you might have noticed that GHC does
+this a lot!)
+
+<p>
+In particular:
+<ul>
+<p><li>
+Put braces round the body of for loops, while loops, if statements, etc.
+even if they "aren't needed" because it's really hard to find the resulting
+bug if you mess up.  Indent them any way you like but put them in there!
+</ul>
+
+<p><li>
+When defining a macro, always put parens round args - just in case.
+For example, write:
+<pre>
+  #define add(x,y) ((x)+(y))
+</pre>
+instead of
+<pre>
+  #define add(x,y) x+y
+</pre>
+
+<p><li> Don't declare and initialize variables at the same time.
+Separating the declaration and initialization takes more lines, but
+makes the code clearer.
+
+<p><li>
+Use inline functions instead of macros if possible - they're a lot
+less tricky to get right and don't suffer from the usual problems
+of side effects, evaluation order, multiple evaluation, etc.
+
+<ul>
+<p><li>Inline functions get the naming issue right.  E.g.
they
+       can have local variables which (in an expression context)
+       macros can't.
+
+<p><li> Inline functions have call-by-value semantics whereas macros
+       are call-by-name.  You can be bitten by duplicated computation
+       if you aren't careful.
+
+<p><li> You can use inline functions from inside gdb if you compile with
+       -O0 or -fkeep-inline-functions.  If you use macros, you'd better
+       know what they expand to.
+</ul>
+
+However, note that macros can serve as both l-values and r-values and
+can be "polymorphic", as these examples show:
+<pre>
+  // you can use this as an l-value or an r-value
+  #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo)
+
+  // polymorphic case
+  // but note that min(min(1,2),3) does 3 comparisons instead of 2!!
+  #define min(x,y) (((x)<=(y)) ? (x) : (y))
+</pre>
+
+<p><li>
+Inline functions should be "static inline" because:
+<ul>
+<p><li>
+gcc will delete static inlines if they are not used or are always inlined.
+
+<p><li>
+  if they're externed, we could get conflicts between 2 copies of the
+  same function if, for some reason, gcc is unable to delete them.
+  If they're static, we still get multiple copies, but at least they don't conflict.
+</ul>
+
+OTOH, the gcc manual says the following, so maybe we should use
+extern inline?
+
+<pre>
+   When a function is both inline and `static', if all calls to the
+function are integrated into the caller, and the function's address is
+never used, then the function's own assembler code is never referenced.
+In this case, GNU CC does not actually output assembler code for the
+function, unless you specify the option `-fkeep-inline-functions'.
+Some calls cannot be integrated for various reasons (in particular,
+calls that precede the function's definition cannot be integrated, and
+neither can recursive calls within the definition).  If there is a
+nonintegrated call, then the function is compiled to assembler code as
+usual.  The function must also be compiled as usual if the program
+refers to its address, because that can't be inlined.
+
+   When an inline function is not `static', then the compiler must
+assume that there may be calls from other source files; since a global
+symbol can be defined only once in any program, the function must not
+be defined in the other source files, so the calls therein cannot be
+integrated.  Therefore, a non-`static' inline function is always
+compiled on its own in the usual fashion.
+
+   If you specify both `inline' and `extern' in the function
+definition, then the definition is used only for inlining.  In no case
+is the function compiled on its own, not even if you refer to its
+address explicitly.  Such an address becomes an external reference, as
+if you had only declared the function, and had not defined it.
+
+   This combination of `inline' and `extern' has almost the effect of a
+macro.  The way to use it is to put a function definition in a header
+file with these keywords, and put another copy of the definition
+(lacking `inline' and `extern') in a library file.  The definition in
+the header file will cause most calls to the function to be inlined.
+If any uses of the function remain, they will refer to the single copy
+in the library.
+</pre>
+
+<p><li>
+Don't define macros that expand to a list of statements.
+You could just use braces as in:
+
+<pre>
+  #define ASSIGN_CC_ID(ccID)              \
+        {                                 \
+        ccID = CC_ID;                     \
+        CC_ID++;                          \
+        }
+</pre>
+
+(but it's usually better to use an inline function instead - see above).
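+
+For example, here is a minimal sketch of the inline-function version
+(the global <tt>CC_ID</tt> counter is assumed from the macro above):
+
+<pre>
+  static inline void assignCCID( int *ccID )
+  {
+      /* unlike the macro, the argument is evaluated exactly once */
+      *ccID = CC_ID;
+      CC_ID++;
+  }
+</pre>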
+
+<p><li>
+Don't even write macros that expand to 0 statements - they can mess you
+up as well.  Use the doNothing macro instead.
+<pre>
+  #define doNothing() do { } while (0)
+</pre>
+
+<p><li>
+This code
+<pre>
+int* p, q;
+</pre>
+looks like it declares two pointers but, in fact, only p is a pointer.
+It's safer to write this:
+<pre>
+int* p;
+int* q;
+</pre>
+You could also write this:
+<pre>
+int *p, *q;
+</pre>
+but it is preferable to split the declarations.
+
+<p><li>
+Try to use ANSI C's enum feature when defining lists of constants of
+the same type.  Among other benefits, you'll notice that gdb uses the
+name instead of its (usually inscrutable) number when printing values
+with enum types and gdb will let you use the name in expressions you
+type.
+
+<p>
+Examples:
+<pre>
+    typedef enum { /* N.B. Used as indexes into arrays */
+     NO_HEAP_PROFILING,
+     HEAP_BY_CC,
+     HEAP_BY_MOD,
+     HEAP_BY_GRP,
+     HEAP_BY_DESCR,
+     HEAP_BY_TYPE,
+     HEAP_BY_TIME
+    } ProfilingFlags;
+</pre>
+instead of
+<pre>
+    # define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */
+    # define HEAP_BY_CC        1
+    # define HEAP_BY_MOD       2
+    # define HEAP_BY_GRP       3
+    # define HEAP_BY_DESCR     4
+    # define HEAP_BY_TYPE      5
+    # define HEAP_BY_TIME      6
+</pre>
+and
+<pre>
+    typedef enum {
+     CCchar    = 'C',
+     MODchar   = 'M',
+     GRPchar   = 'G',
+     DESCRchar = 'D',
+     TYPEchar  = 'Y',
+     TIMEchar  = 'T'
+    } ProfilingTag;
+</pre>
+instead of
+<pre>
+    # define CCchar    'C'
+    # define MODchar   'M'
+    # define GRPchar   'G'
+    # define DESCRchar 'D'
+    # define TYPEchar  'Y'
+    # define TIMEchar  'T'
+</pre>
+
+<p><li> Please keep to 80 columns: the line has to be drawn somewhere,
+and by keeping it to 80 columns we can ensure that code looks OK on
+everyone's screen.  Long lines are hard to read, and a sign that the
+code needs to be restructured anyway.
+
+<p><li> When commenting out large chunks of code, use <code>#if 0
+... #endif</code> rather than <code>/* ... */</code> because C doesn't
+have nested comments.
+
+<p><li>When declaring a typedef for a struct, give the struct a name
+as well, so that other headers can forward-reference the struct name
+and it becomes possible to have opaque pointers to the struct.  Our
+convention is to name the struct the same as the typedef, but add a
+leading underscore.  For example:
+
+<pre>
+  typedef struct _Foo {
+    ...
+  } Foo;
+</pre>
+
+<p><li>Do not use <tt>!</tt> instead of explicit comparison against
+<tt>NULL</tt> or <tt>'\0'</tt>; the latter is much clearer.
+
+<p><li> We don't care too much about your indentation style but, if
+you're modifying a function, please try to use the same style as the
+rest of the function (or file).  If you're writing new code, a
+tab width of 4 is preferred.
+
+</ul>
+
+<h2>CVS issues</h2>
+
+<ul>
+<p><li>
+Don't be tempted to reindent or reorganise large chunks of code - it
+generates large diffs in which it's hard to see whether anything else
+was changed.
+<p>
+If you must reindent or reorganise, don't include any functional
+changes in that commit, and give advance warning that you're about to do
+it in case anyone else is changing that file.
+</ul>
+
+
+</body>
+</html>
diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html
new file mode 100644
index 0000000000..febe9fe422
--- /dev/null
+++ b/docs/comm/rts-libs/foreignptr.html
@@ -0,0 +1,68 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+
+    <h1>On why we have <tt>ForeignPtr</tt></h1>
+
+    <p>Unfortunately it isn't possible to add a finalizer to a normal
+    <tt>Ptr a</tt>.  We already have a generic finalization mechanism:
+    see the Weak module in package lang.  But the only reliable way to
+    use finalizers is to attach one to an atomic heap object - that
+    way the compiler's optimiser can't interfere with the lifetime of
+    the object.
+
+    <p>The <tt>Ptr</tt> type is really just a boxed address - it's
+    defined like
+
+    <pre>
+data Ptr a = Ptr Addr#
+</pre>
+
+    <p>where <tt>Addr#</tt> is an unboxed native address (just a 32-
+    or 64-bit word).  Putting a finalizer on a <tt>Ptr</tt> is
+    dangerous, because the compiler's optimiser might remove the box
+    altogether.
+
+    <p><tt>ForeignPtr</tt> is defined like this
+
+    <pre>
+data ForeignPtr a = ForeignPtr ForeignObj#
+</pre>
+
+    <p>where <tt>ForeignObj#</tt> is a <em>boxed</em> address: it corresponds
+    to a real heap object.  The heap object is primitive from the
+    point of view of the compiler - it can't be optimised away.  So it
+    works to attach a finalizer to the <tt>ForeignObj#</tt> (but not
+    to the <tt>ForeignPtr</tt>!).
+
+    <p>There are several primitive objects to which we can attach
+    finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>,
+    etc.  We have special functions for some of these: e.g.
+    <tt>MVar.addMVarFinalizer</tt>.
+
+    <p>So a nicer interface might be something like
+
+<pre>
+class Finalizable a where
+  addFinalizer :: a -> IO () -> IO ()
+
+instance Finalizable (ForeignPtr a) where ...
+instance Finalizable (MVar a) where ...
+</pre>
+
+    <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and
+    rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>.  The reason for that
+    is just efficiency, I think.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Sep 26 09:49:37 BST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html
new file mode 100644
index 0000000000..67a544be85
--- /dev/null
+++ b/docs/comm/rts-libs/multi-thread.html
@@ -0,0 +1,445 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+<title>The GHC Commentary - Supporting multi-threaded interoperation</title>
+</head>
+<body>
+<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1>
+<em>
+<p>
+Authors: sof@galois.com, simonmar@microsoft.com<br>
+Date: April 2002
+</p>
+</em>
+<p>
+This document presents the implementation of an extension to
+Concurrent Haskell that provides two enhancements:
+</p>
+<ul>
+<li>A Concurrent Haskell thread may call an external (e.g., C)
+function in a manner that's transparent to the execution/evaluation of
+other Haskell threads. Section <a href="#callout">"Calling out"</a> covers this.
+</li>
+<li>
+OS threads may safely call Haskell functions concurrently. Section
+<a href="#callin">"Calling in"</a> covers this.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callout">The problem: foreign calls that block</h2>
+<p>
+When a Concurrent Haskell (CH) thread calls a 'foreign import'ed
+function, the runtime system (RTS) has to handle this in a manner
+transparent to other CH threads. That is, they shouldn't be blocked
+from making progress while the CH thread executes the external
+call. Presently, without the extension described here, all threads
+will block.
+</p>
+<p>
+Clearly, we have to rely on OS-level threads in order to support this
+kind of concurrency. The implementation described here defines the
+(abstract) OS threads interface that the RTS assumes. The implementation
+currently provides two instances of this interface, one for POSIX
+threads (pthreads) and one for Win32 threads.
+</p>
+
+<!---- *************************************** ----->
+<h3>Multi-threading the RTS</h3>
+
+<p>
+A simple and efficient way to implement non-blocking foreign calls is like this:
+<ul>
+<li> Invariant: only one OS thread is allowed to
+execute code inside of the GHC runtime system. [There are alternate
+designs, but I won't go into details on their pros and cons here.]
+We'll call the OS thread that is currently running Haskell threads
+the <em>Current Haskell Worker Thread</em>.
+<p>
+The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its
+time-slice expires or it blocks on an MVar, then grabs another, and executes
+that, and so on.
+</p>
+<li>
+<p>
+When the Current Haskell Worker comes to execute a potentially blocking 'foreign
+import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that
+another OS worker thread is available to become the Current Haskell Worker.
+Consequently, even if the external call blocks, the new Current Haskell Worker
+continues execution of the other Concurrent Haskell threads.
+When the external call eventually completes, the Concurrent Haskell
+thread that made the call is passed the result and made runnable
+again.
+</p>
+<p>
+<li>
+A pool of OS threads is constantly trying to become the Current Haskell Worker.
+Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers.
+<p><li>
+The OS worker threads are regarded as interchangeable. A given Haskell thread
+may, during its lifetime, be executed entirely by one OS worker thread, or by more than one.
+There's just no way to tell.
+
+<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
+The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and
+the Current Haskell Worker Thread executes it. See Section <a href="#callin">Calling in</a>.
+</ul>
+<p>
+The rest of this section describes the mechanics of implementing all
+this. There are two parts to it: one describes how a native (OS) thread
+leaves the RTS to service the external call, the other how the same
+thread handles returning the result of the external call back to the
+Haskell thread.
+</p>
+
+<!---- *************************************** ----->
+<h3>Making the external call</h3>
+
+<p>
+Presently, GHC handles 'safe' C calls by effectively emitting the
+following code sequence:
+</p>
+
+<pre>
+    ...save thread state...
+    t = suspendThread();
+    r = foo(arg1,...,argn);
+    resumeThread(t);
+    ...restore thread state...
+    return r;
+</pre>
+
+<p>
+After having squirreled away the state of a Haskell thread,
+<tt>Schedule.c:suspendThread()</tt> is called, which puts the current
+thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>]
+containing threads that are currently blocked waiting for external calls
+to complete (this is done for the purposes of finding roots when
+garbage collecting).
+</p>
+
+<p>
+In addition to putting the Haskell thread on
+<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also
+does the following:
+</p>
+<ul>
+<li>Instructs the <em>Task Manager</em> to make sure that there's
+another native thread waiting in the wings to take over the execution
+of Haskell threads. This might entail creating a new
+<em>worker thread</em> or re-using one that's currently waiting for
+more work to do. The <a href="#taskman">Task Manager</a> section
+presents the functionality provided by this subsystem.
+</li>
+
+<li>Releases its capability to execute within the RTS. By doing
+so, another worker thread will become unblocked and start executing
+code within the RTS. See the <a href="#capability">Capability</a>
+section for details.
+</li>
+
+<li><tt>suspendThread()</tt> returns a token which is used to
+identify the Haskell thread that was added to
+<tt>suspended_ccalling_threads</tt>. This is done so that once the
+external call has completed, we know what Haskell thread to pull off
+the <tt>suspended_ccalling_threads</tt> list.
+</li>
+</ul>
+
+<p>
+Upon return from <tt>suspendThread()</tt>, the OS thread is free of
+its RTS executing responsibility, and can now invoke the external
+call. Meanwhile, the other worker thread that has now gained access
+to the RTS will continue executing Concurrent Haskell code. Concurrent
+'stuff' is happening!
+</p>
+
+<!---- *************************************** ----->
+<h3>Returning the external result</h3>
+
+<p>
+When the native thread eventually returns from the external call,
+the result needs to be communicated back to the Haskell thread that
+issued the external call. The following steps take care of this:
+</p>
+
+<ul>
+<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>,
+passing along the token referring to the Haskell thread that made the
+call we're returning from.
+</li>
+
+<li>
+The OS thread then tries to grab hold of a <em>returning worker
+capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>.
+Until granted, the thread blocks waiting for RTS permissions. Clearly we
+don't want the thread to be blocked longer than it has to, so whenever
+a thread that is executing within the RTS enters the Scheduler (which
+is quite often, e.g., when a Haskell thread context switch is made),
+it checks to see whether it can give up its RTS capability to a
+returning worker, which is done by calling
+<tt>Capability.c:yieldToReturningWorker()</tt>.
+</li>
+
+<li>
+If a returning worker is waiting (the code in <tt>Capability.c</tt>
+keeps a counter of the number of returning workers that are currently
+blocked waiting), it is woken up and given the RTS execution
+privileges/capabilities of the worker thread that gave them up.
+</li>
+
+<li>
+The thread that gave up its capability then tries to re-acquire
+the capability to execute RTS code; this is done by calling
+<tt>Capability.c:waitForWorkCapability()</tt>.
+</li>
+
+<li>
+The returning worker that was woken up will continue execution in
+<tt>resumeThread()</tt>, removing its associated Haskell thread
+from the <tt>suspended_ccalling_threads</tt> list and starting to
+evaluate that thread, passing it the result of the external call.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h3 id="rts-exec">RTS execution</h3>
+
+<p>
+If a worker thread inside the RTS runs out of runnable Haskell
+threads, it goes to sleep waiting for the external calls to complete.
+It does this by calling <tt>waitForWorkCapability</tt>.
+</p>
+
+<p>
+The availability of new runnable Haskell threads is signalled when:
+</p>
+
+<ul>
+<li>An external call is set up in <tt>suspendThread()</tt>.</li>
+<li>A new Haskell thread is created (e.g., whenever
+<tt>Concurrent.forkIO</tt> is called from within Haskell); this is
+signalled in <tt>Schedule.c:scheduleThread_()</tt>.
+</li>
+<li>A Haskell thread is removed from a 'blocking queue'
+attached to an MVar (only?).
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callin">Calling in</h2>
+
+Providing robust support for having multiple OS threads calling into
+Haskell is not as involved as its dual.
+
+<ul>
+<li>The OS thread issues the call to a Haskell function by going via
+the <em>Rts API</em> (as specified in <tt>RtsAPI.h</tt>).
+<li>Making the function application requires the construction of a
+closure on the heap. This is done in a thread-safe manner by having
+the OS thread lock a designated block of memory (the 'Rts API' block,
+which is part of the GC's root set) for the short period of time it
+takes to construct the application.
+<li>The OS thread then creates a new Haskell thread to execute the
+function application, which (eventually) boils down to calling
+<tt>Schedule.c:createThread()</tt>.
+<li>
+Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>,
+which asks the Task Manager to possibly create a new worker (OS)
+thread to execute the Haskell thread.
+<li>
+After the OS thread has done this, it blocks waiting for the
+Haskell thread to complete the evaluation of the Haskell function.
+<p>
+The reason why a separate worker thread, and not the OS thread that
+made the call-in via the Rts API, is made to evaluate the Haskell
+function is that we want that OS thread to return as soon as possible.
+We wouldn't be able to guarantee that if the OS thread entered the
+RTS to (initially) just execute its function application, as the
+Scheduler may side-track it and also ask it to evaluate other Haskell threads.
+</li>
+</ul>
+
+<p>
+<strong>Note:</strong> As of 20020413, the implementation of the RTS API
+only serializes access to the allocator between multiple OS threads wanting
+to call into Haskell (via the RTS API.) It does not coordinate this access
+to the allocator with that of the OS worker thread that's currently executing
+within the RTS. This weakness/bug is scheduled to be tackled as part of an
+overhaul/reworking of the RTS API itself.
+
+
+<!---- *************************************** ----->
+<h2>Subsystems introduced/modified</h2>
+
+<p>
+These thread extensions affect the Scheduler portions of the runtime
+system. To make it more manageable to work with, the changes
+introduced a couple of new RTS 'sub-systems'. This section presents
+the functionality and API of these sub-systems.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="capability">Capabilities</h3>
+
+<p>
+A Capability represents the token required to execute STG code,
+and all the state an OS thread/task needs to run Haskell code:
+its STG registers, a pointer to its TSO, a nursery etc. During
+STG execution, a pointer to the capability is kept in a
+register (BaseReg).
+</p>
+<p>
+Only in an SMP build will there be multiple capabilities; for
+the threaded RTS and other non-threaded builds, there is only
+one global capability, namely <tt>MainCapability</tt>.
+
+<p>
+The Capability API is as follows:
+<pre>
+/* Capability.h */
+extern void initCapabilities(void);
+
+extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
+extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
+extern void releaseCapability(Capability* cap);
+
+extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);
+
+extern void grabCapability(Capability** cap);
+</pre>
+
+<ul>
+<li><tt>initCapabilities()</tt> initialises the subsystem.
+
+<li><tt>grabReturnCapability()</tt> is called by worker threads
+returning from an external call. It blocks them until they gain
+permission to continue.
+
+<li><tt>waitForWorkCapability()</tt> is called by worker threads
+already inside the RTS, but without any work to do. It blocks them
+waiting for new work to become available.
+
+<li><tt>releaseCapability()</tt> hands back a capability. If a
+'returning worker' is waiting, it is signalled that a capability
+has become available. If not, <tt>releaseCapability()</tt> tries
+to signal worker threads that are blocked waiting inside
+<tt>waitForWorkCapability()</tt> that new work might now be
+available.
+
+<li><tt>yieldToReturningWorker()</tt> is called by the worker thread
+that's currently inside the Scheduler. It checks whether there are other
+worker threads waiting to return from making an external call. If so,
+they're given preference and a capability is transferred between worker
+threads. One of the waiting 'returning worker' threads is signalled and made
+runnable, with the yielding worker blocking to re-acquire
+a capability.
+</ul>
+
+<p>
+The condition variables used to implement the synchronisation between
+worker consumers and providers are local to the Capability
+implementation. See the source for details and comments.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="taskman">The Task Manager</h3>
+
+<p>
+The Task Manager API is responsible for managing the creation of
+OS worker RTS threads. When a Haskell thread wants to make an
+external call, the Task Manager is asked to possibly create a
+new worker thread to take over the RTS-executing capability of
+the worker thread that's exiting the RTS to execute the external call.
+
+<p>
+The Capability subsystem keeps track of idle worker threads, so
+making an informed decision about whether or not to create a new OS
+worker thread is easy work for the task manager. The Task Manager
+provides the following API:
+</p>
+
+<pre>
+/* Task.h */
+extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
+extern void stopTaskManager ( void );
+
+extern void startTask ( void (*taskStart)(void) );
+</pre>
+
+<ul>
+<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> start
+up and shut down the subsystem. When starting up, you have the option
+to limit the overall number of worker threads that can be
+created.
An unbounded (modulo OS thread constraints) number of threads
+is created if you pass '0'.
+<li><tt>startTask()</tt> is called when a worker thread calls
+<tt>suspendThread()</tt> to service an external call, asking another
+worker thread to take over its RTS-executing capability. It is also
+called when an external OS thread invokes a Haskell function via the
+<em>Rts API</em>.
+</ul>
+
+<!---- *************************************** ----->
+<h3>Native threads API</h3>
+
+To hide OS details, the following API is used by the task manager and
+the scheduler to interact with the OS's threads API:
+
+<pre>
+/* OSThreads.h */
+typedef <em>..OS specific..</em> Mutex;
+extern void initMutex    ( Mutex* pMut );
+extern void grabMutex    ( Mutex* pMut );
+extern void releaseMutex ( Mutex* pMut );
+
+typedef <em>..OS specific..</em> Condition;
+extern void    initCondition      ( Condition* pCond );
+extern void    closeCondition     ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition    ( Condition* pCond );
+extern rtsBool waitCondition      ( Condition* pCond,
+                                    Mutex* pMut );
+
+extern OSThreadId osThreadId      ( void );
+extern void shutdownThread        ( void );
+extern void yieldThread           ( void );
+extern int  createOSThread        ( OSThreadId* tid,
+                                    void (*startProc)(void) );
+</pre>
+
+
+
+<!---- *************************************** ----->
+<h2>User-level interface</h2>
+
+To signal that you want an external call to be serviced by a separate
+OS thread, you have to add the attribute <tt>threadsafe</tt> to
+a foreign import declaration, e.g.,
+
+<pre>
+foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
+</pre>
+
+<p>
+The distinction between 'safe' and thread-safe C calls is made
+so that we may call external functions that aren't re-entrant but may
+cause a GC to occur.
+<p>
+The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>.
+</p>
+
+<!---- *************************************** ----->
+<h2>Building the GHC RTS</h2>
+
+The multi-threaded extension isn't currently enabled by default. To
+have it built, you need to run the <tt>fptools</tt> configure script
+with the extra option <tt>--enable-threaded-rts</tt> turned on, and
+then proceed to build the compiler as normal.
+
+<hr>
+<small>
+<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end -->
+</small>
+</body> </html>
+
diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html
new file mode 100644
index 0000000000..627bde8d88
--- /dev/null
+++ b/docs/comm/rts-libs/non-blocking.html
@@ -0,0 +1,133 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Non-blocking I/O on Win32</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Non-blocking I/O on Win32</h1>
+    <p>
+
+This note discusses the implementation of non-blocking I/O on
+Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth
+capturing the ideas. Thanks to Sigbjorn for writing them.
+
+<h2> Background</h2>
+
+GHC has for quite a while provided non-blocking I/O support for
+Concurrent Haskell threads on platforms that provide 'UNIX-style'
+non-blocking I/O. That is, platforms that let you alter a property of
+a file descriptor so that, instead of a thread blocking on an I/O
+operation that cannot be immediately satisfied, the operation returns
+a special error code (EWOULDBLOCK.)
When that happens, the CH
+thread that made the blocking I/O request is put into a blocked-on-IO
+state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will
+in a timely fashion check to see whether I/O is again possible
+(via a call to select()), and if it is, unblock the thread & have it
+re-try the I/O operation. The result is that other Concurrent Haskell
+threads won't be affected, but can continue operating while a thread
+is blocked on I/O.
+<p>
+Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for
+the simple reason that it doesn't provide the OS facilities described
+above.
+
+<h2>Win32 non-blocking I/O, attempt 1</h2>
+
+Win32 does provide something select()-like, namely the
+WaitForMultipleObjects() API. It takes an array of kernel object
+handles plus a timeout interval, and waits for either one (or all) of
+them to become 'signalled'. A handle representing an open file (for
+reading) becomes signalled once there is input available.
+<p>
+So, using this function it is possible to observe that I/O is
+possible, but not whether there's "enough" to satisfy the I/O request.
+So, if we were to mimic select() usage with WaitForMultipleObjects(),
+we'd correctly avoid blocking initially, but a thread may very well
+block waiting for its I/O request to be satisfied once the file
+handle has become signalled. [There is a fix for this -- only read
+and write one byte at a time -- but I'm not advocating that.]
+
+
+<h2>Win32 non-blocking I/O, attempt 2</h2>
+
+Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is,
+asynchronous read and write requests can be made via the ReadFile() /
+WriteFile() APIs, specifying position and length of the operation.
+If the I/O requests cannot be handled right away, the APIs won't
+block, but return immediately (and report ERROR_IO_PENDING as their
+status code.)
+<p>
+The completion of the request can be reported in a number of ways:
+<ul>
+  <li> synchronously, by blocking inside Read/WriteFile(). (this is the
+       non-overlapped case, really.)
+<p>
+
+  <li> as part of the overlapped I/O request, pass a HANDLE to an event
+       object. The I/O system will signal this event once the request
+       completed, which a waiting thread will then be able to see.
+<p>
+
+  <li> by supplying a pointer to a completion routine, which will be
+       called as an Asynchronous Procedure Call (APC) whenever a thread
+       calls one of a select bunch of 'alertable' APIs.
+<p>
+
+  <li> by associating the file handle with an I/O completion port. Once
+       the request completes, the thread servicing the I/O completion
+       port will be notified.
+</ul>
+The use of I/O completion ports looks the most interesting to GHC,
+as it provides a central point where all I/O requests are reported.
+<p>
+Note: asynchronous I/O is only fully supported by OSes based on
+the NT codebase; i.e., Win9x doesn't permit async I/O on files and
+pipes. However, Win9x does support async socket operations, and,
+I'm currently guessing here, console I/O. In my view, it would
+be acceptable to provide non-blocking I/O support for NT-based
+OSes only.
+<p>
+Here's the design I currently have in mind:
+<ul>
+<li> Upon startup, an RTS helper thread, whose only purpose is to service
+     an I/O completion port, is created.
+<p>
+<li> All files are opened in 'overlapped' mode, and associated
+     with an I/O completion port.
+<p>
+<li> Overlapped I/O requests are used to implement read() and write().
+
+<p>
+<li> If the request cannot be satisfied without blocking, the Haskell
+     thread is put on the blocked-on-I/O thread list & a re-schedule
+     is made.
+<p>
+<li> When the completion of a request is signalled via the I/O completion
+     port, the RTS helper thread will move the associated Haskell thread
+     from the blocked list onto the runnable list. (Clearly, care
+     is required here to have another OS thread mutate internal Scheduler
+     data structures.)
+
+<p>
+<li> In the event all Concurrent Haskell threads are blocked waiting on
+     I/O, the main RTS thread blocks waiting on an event synchronisation
+     object, which the helper thread will signal whenever it makes
+     a Haskell thread runnable.
+
+</ul>
+
+I might do the communication between the RTS helper thread and the
+main RTS thread differently though: rather than have the RTS helper
+thread manipulate thread queues itself, thus requiring careful
+locking, just have it change a bit on the relevant TSO, which the main
+RTS thread can check at regular intervals (in some analog of
+awaitEvent(), for example).
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 19:30:18 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html
new file mode 100644
index 0000000000..25407eed43
--- /dev/null
+++ b/docs/comm/rts-libs/prelfound.html
@@ -0,0 +1,57 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Prelude Foundations</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Prelude Foundations</h1>
+    <p>
+      The standard Haskell Prelude as well as GHC's Prelude extensions are
+      constructed from GHC's <a href="primitives.html">primitives</a> in a
+      couple of layers.
+
+    <h4><code>PrelBase.lhs</code></h4>
+    <p>
+      The most elementary Prelude definitions are collected in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
+      In particular, it defines the boxed versions of Haskell primitive types
+      - for example, <code>Int</code> is defined as
+      <blockquote><pre>
+data Int = I# Int#</pre>
+      </blockquote>
+    <p>
+      This says that a boxed integer <code>Int</code> is formed by applying the
+      data constructor <code>I#</code> to an <em>unboxed</em> integer of type
+      <code>Int#</code>.  Unboxed types are hardcoded in the compiler and
+      exported together with the <a href="primitives.html">primitive
+      operations</a> understood by GHC.
+    <p>
+      <code>PrelBase.lhs</code> similarly defines basic types, such as
+      boolean values
+      <blockquote><pre>
+data  Bool  =  False | True  deriving (Eq, Ord)</pre>
+      </blockquote>
+    <p>
+      the unit type
+      <blockquote><pre>
+data  ()  =  ()</pre>
+      </blockquote>
+    <p>
+      and lists
+      <blockquote><pre>
+data [] a = [] | a : [a]</pre>
+      </blockquote>
+    <p>
+      It also contains instance declarations for these types.  In addition,
+      <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
+      machinery</a> for efficient list handling.
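+    <p>
+      To illustrate how such boxed definitions are used, here is a sketch
+      (in the style of <code>PrelBase.lhs</code>, though not copied from it)
+      of how addition on <code>Int</code> unwraps the boxes, applies the
+      primitive <code>+#</code>, and reboxes the result:
+      <blockquote><pre>
+plusInt :: Int -> Int -> Int
+plusInt (I# x) (I# y) = I# (x +# y)  -- unbox, add unboxed ints, rebox</pre>
+      </blockquote>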
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 19:30:18 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html
new file mode 100644
index 0000000000..4ad6c20338
--- /dev/null
+++ b/docs/comm/rts-libs/prelude.html
@@ -0,0 +1,121 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Cunning Prelude Code</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Cunning Prelude Code</h1>
+    <p>
+      GHC uses many optimisations and GHC-specific techniques (unboxed
+      values, RULES pragmas, and so on) to make the heavily used Prelude code
+      as fast as possible.
+
+    <hr>
+    <h4>Par, seq, and lazy</h4>
+
+    In GHC.Conc you will find
+<blockquote><pre>
+  pseq a b = a `seq` lazy b
+</pre></blockquote>
+    What's this "lazy" thing?  Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting.
+    We really mean "evaluate a, then b".  But if the strictness analyser sees that pseq is strict
+    in b, then b might be evaluated <em>before</em> a, which is all wrong.
+<p>
+Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>.  This function is just the identity function,
+except that it's put into the built-in environment in MkId.lhs.  That is, the MkId.lhs definition overrides the
+inlining and strictness information that comes in from GHC.Base.hi.  And that makes <tt>lazy</tt> look
+lazy, and have no inlining.  So the strictness analyser gets no traction.
+<p>
+In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs),
+so we get all the efficiency back.
+<p>
+This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an
+Int#.  Now there is no seq# operator at all.
+
+
+    <hr>
+    <h4>fold/build</h4>
+    <p>
+      There is a lot of magic in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
+      among other things, the <a
+      href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
+      pragmas</a> implementing the <a
+      href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
+      optimisation.  The code for <code>map</code> is
+      a good example of how it all works.  In the prelude code for version
+      5.03 it reads as follows:
+      <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map _ []     = []
+map f (x:xs) = f x : map f xs
+
+-- Note eta expanded
+mapFB ::  (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+{-# INLINE [0] mapFB #-}
+mapFB c f x ys = c (f x) ys
+
+{-# RULES
+"map"     [~1] forall f xs.   map f xs               = build (\c n -> foldr (mapFB c f) n xs)
+"mapList" [1]  forall f.      foldr (mapFB (:) f) [] = map f
+"mapFB"   forall c f g.       mapFB (mapFB c f) g    = mapFB c (f.g)
+  #-}</pre>
+      </blockquote>
+    <p>
+      Up to (but not including) phase 1, we use the <code>"map"</code> rule to
+      rewrite all saturated applications of <code>map</code> with its
+      build/fold form, hoping for fusion to happen.  In phase 1 and 0, we
+      switch off that rule, inline build, and switch on the
+      <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back
+      into plain map.
+    <p>
+      It's important that these two rules aren't both active at once
+      (along with build's unfolding), else we'd get an infinite loop
+      in the rules.  Hence the activation control using explicit phase numbers.
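+    <p>
+      As an illustrative sketch (not itself in the library source) of what
+      these rules achieve, consider fusing two maps.  Assuming the standard
+      fold/build fusion rule <code>foldr k z (build g) = g k z</code>, the
+      rewrites proceed roughly as follows:
+      <blockquote><pre>
+map f (map g xs)
+  -- "map" rule, applied to both maps
+  = build (\c n -> foldr (mapFB c f) n
+                         (build (\c' n' -> foldr (mapFB c' g) n' xs)))
+  -- fold/build fusion rule
+  = build (\c n -> foldr (mapFB (mapFB c f) g) n xs)
+  -- "mapFB" rule
+  = build (\c n -> foldr (mapFB c (f.g)) n xs)
+  -- if nothing else fuses: inline build, then the "mapList" rule
+  = map (f.g) xs</pre>
+      </blockquote>
+      The two list traversals have been fused into one.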
+    <p>
+      The "mapFB" rule optimises compositions of map.
+    <p>
+      The mechanism as described above is new in 5.03, since January 2002,
+      when the <code>[~</code><i>N</i><code>]</code> syntax for phase number
+      annotations on rules was introduced.  Before that the whole arrangement
+      was more complicated, as the corresponding prelude code for version
+      4.08.1 shows:
+      <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map = mapList
+
+-- Note eta expanded
+mapFB ::  (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+mapList :: (a -> b) -> [a] -> [b]
+mapList _ []     = []
+mapList f (x:xs) = f x : mapList f xs
+
+{-# RULES
+"map"     forall f xs.  map f xs               = build (\c n -> foldr (mapFB c f) n xs)
+"mapFB"   forall c f g. mapFB (mapFB c f) g    = mapFB c (f.g)
+"mapList" forall f.     foldr (mapFB (:) f) [] = mapList f
+ #-}</pre>
+      </blockquote>
+    <p>
+      This code is structured as it is because the "map" rule first
+      <em>breaks</em> the map <em>open,</em> which exposes it to the various
+      foldr/build rules, and if no foldr/build rule matches, the "mapList"
+      rule <em>closes</em> it again in a later phase of optimisation - after
+      build was inlined.  As a consequence, the whole thing depends a bit on
+      the timing of the various optimisations (the map might be closed again
+      before any of the foldr/build rules fires).  To make the timing
+      deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
+      #-}</code> pragma, which delays <code>build</code>'s inlining, and thus
+      the closing of the map.  [NB: Phase numbering was forward at that time.]
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Mon Feb 11 20:00:49 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
new file mode 100644
index 0000000000..28abc79426
--- /dev/null
+++ b/docs/comm/rts-libs/primitives.html
@@ -0,0 +1,70 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Primitives</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Primitives</h1>
+    <p>
+      Most user-level Haskell types and functions provided by GHC (in
+      particular those from the Prelude and GHC's Prelude extensions) are
+      internally constructed from even more elementary types and functions.
+      Most notably, GHC understands a notion of <em>unboxed types,</em> which
+      are the Haskell representation of primitive bit-level integer, float,
+      etc. types (as opposed to their boxed, heap allocated counterparts) -
+      cf. <a
+      href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
+      Values as First Class Citizens."</a>
+
+    <h4>The Ultimate Source of Primitives</h4>
+    <p>
+      The hardwired types of GHC are brought into scope by the module
+      <code>PrelGHC</code>.  This module only exists in the form of a
+      handwritten interface file <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
+      which lists the type and function names, as well as instance
+      declarations.  The actual types of these names, as well as their
+      implementation, are hardwired into GHC.  Note that the names in this file
+      are z-encoded, and in particular, identifiers ending in <code>zh</code>
+      denote user-level identifiers ending in a hash mark (<code>#</code>),
+      which is used to flag unboxed values or functions operating on unboxed
+      values.
For example, we have <code>Char#</code>, <code>ord#</code>, and
+      so on.
+
+    <h4>The New Primitive Definition Scheme</h4>
+    <p>
+      As of (about) the development version 4.11, the types and various
+      properties of primitive operations are defined in the file <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
+      (Personally, I don't think that the <code>.txt</code> suffix is really
+      appropriate, as the file is used for automatic code generation; the
+      recent addition of <code>.pp</code> means that the file is now mangled
+      by cpp.)
+    <p>
+      The utility <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
+      generates a series of Haskell files from <code>primops.txt</code>, which
+      encode the types and various properties of the primitive operations as
+      compiler internal data structures.  These Haskell files are not complete
+      modules, but program fragments, which are included into compiler modules
+      during the GHC build process.  The generated include files can be found
+      in the directory <code>fptools/ghc/compiler/</code> and carry names
+      matching the pattern <code>primop-*.hs-incl</code>.  They are generated
+      during the execution of the <code>boot</code> target in the
+      <code>fptools/ghc/</code> directory.  This scheme significantly
+      simplifies the maintenance of primitive operations.
+    <p>
+      As of development version 5.02, the <code>primops.txt</code> file also allows the
+      recording of documentation about the intended semantics of the primitives.  This can
+      be extracted into a LaTeX document (or rather, into LaTeX document fragments)
+      via an appropriate switch to <code>genprimopcode</code>.  In particular, see <code>primops.txt</code>
+      for full details of how GHC is configured to cope with different machine word sizes.
+    <p><small>
+<!-- hhmts start -->
+Last modified: Mon Nov 26 18:03:16 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
new file mode 100644
index 0000000000..196ec9150d
--- /dev/null
+++ b/docs/comm/rts-libs/stgc.html
@@ -0,0 +1,45 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Spineless Tagless C</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Spineless Tagless C</h1>
+    <p>
+      The C code generated by GHC avoids higher-level features of C so as
+      to control as precisely as possible what code is generated.
+      Moreover, it uses special features of gcc (such as first class labels)
+      to produce more efficient code.
+    <p>
+      STG C makes ample use of C's macro language to define idioms, which also
+      reduces the size of the generated C code (thus reducing I/O times).
+      These macros are defined in the C headers located in GHC's <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
+      directory.
+
+    <h4><code>TailCalls.h</code></h4>
+    <p>
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
+      defines how tail calls are implemented - and, in particular,
+      optimised - in GHC generated code.
The default case, for an architecture for which
+      GHC is not optimised, is to use the mini interpreter described in the <a
+      href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
+    <p>
+      For supported architectures, various tricks are used to generate
+      assembler implementing proper tail calls.  On i386, gcc's first class
+      labels are used to directly jump to a function pointer.  Furthermore,
+      markers of the form <code>--- BEGIN ---</code> and <code>--- END
+      ---</code> are added to the assembly right after the function prologue
+      and before the epilogue.  These markers are used by <a
+      href="../the-beast/mangler.html">the Evil Mangler.</a>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 19:28:29 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html
new file mode 100644
index 0000000000..499aeec767
--- /dev/null
+++ b/docs/comm/rts-libs/threaded-rts.html
@@ -0,0 +1,126 @@
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title>
+  </head>
+
+  <body>
+    <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1>
+
+    <p>This section of the commentary explains the structure of the runtime system
+      when used in threaded or SMP mode.</p>
+
+    <p>The <em>threaded</em> version of the runtime supports
+      bound threads and non-blocking foreign calls, and an overview of its
+      design can be found in the paper <a
+        href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending
+        the Haskell Foreign Function Interface with Concurrency</a>.  To
+      compile the runtime with threaded support, add the line
+
+<pre>GhcRTSWays += thr</pre>
+
+      to <tt>mk/build.mk</tt>.  When building C code in the runtime for the threaded way,
+      the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the
+      build system when building for way <tt>thr</tt>, see
+      <tt>mk/config.mk</tt>).  To build a Haskell program
+      with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
+      can be used in conjunction with <tt>-prof</tt>, and possibly
+      <tt>-debug</tt> and others, depending on which versions of the RTS have
+      been built).</p>
+
+    <p>The <em>SMP</em> version of the runtime supports the same facilities as the
+      threaded version, and in addition supports execution of Haskell code by
+      multiple simultaneous OS threads.  For SMP support, both the runtime and
+      the libraries must be built in a special way: add the lines
+
+      <pre>
+GhcRTSWays += thr
+GhcLibWays += s</pre>
+
+      to <tt>mk/build.mk</tt>.  To build Haskell code for
+      SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in
+      conjunction with <tt>-debug</tt>, but no other way-flags at this time).
+      When building C code in the runtime for SMP
+      support, the symbol <tt>SMP</tt> is defined (this is arranged by the
+      compiler when the <tt>-smp</tt> flag is given, see
+      <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>
+
+    <p>When building the runtime in either the threaded or SMP ways, the symbol
+      <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>
+
+    <h2>Overall design</h2>
+
+    <p>The system is based around the notion of a <tt>Capability</tt>.  A
+      <tt>Capability</tt> is an object that represents both the permission to
+      execute some Haskell code, and the state required to do so.
In order
+      to execute some Haskell code, a thread must therefore hold a
+      <tt>Capability</tt>.  The available pool of capabilities is managed by
+      the <tt>Capability</tt> API, described below.</p>
+
+    <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
+      system, indicating that only a single thread can be executing Haskell
+      code at any one time.  In the SMP runtime, there can be an arbitrary
+      number of capabilities, selectable at runtime with the <tt>+RTS -N<em>n</em></tt>
+      flag; in practice the number is best chosen to be the same as the number of
+      processors on the host machine.</p>
+
+    <p>There are a number of OS threads running code in the runtime.  We call
+      these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>.
+      Tasks are managed by the <tt>Task</tt> subsystem, which is mainly
+      concerned with keeping track of statistics such as how much time each
+      task spends executing Haskell code, and also keeping track of how many
+      tasks are around when we want to shut down the runtime.</p>
+
+    <p>Some tasks are created by the runtime itself, and some may be here
+      as a result of a call to Haskell from foreign code (we
+      call this an in-call).  The
+      runtime can support any number of concurrent foreign in-calls, but the
+      number of these calls that will actually run Haskell code in parallel is
+      determined by the number of available capabilities.  Each in-call creates
+      a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited
+      above).</p>
+
+    <p>In the future we may want to bind a <tt>Capability</tt> to a particular
+      processor, so that we can support a notion of affinity - avoiding
+      accidental migration of work from one CPU to another, so that we can make
+      best use of a CPU's local cache.  For now, the design ignores this
+      issue.</p>
+
+    <h2>The <tt>OSThreads</tt> interface</h2>
+
+    <p>This interface is merely an abstraction layer over the OS-specific APIs
+      for managing threads.  It has two main implementations: Win32 and
+      POSIX.</p>
+
+    <p>This is the entirety of the interface:</p>
+
+<pre>
+/* Various abstract types */
+typedef Mutex;
+typedef Condition;
+typedef OSThreadId;
+
+extern OSThreadId osThreadId      ( void );
+extern void shutdownThread        ( void );
+extern void yieldThread           ( void );
+extern int  createOSThread        ( OSThreadId* tid,
+                                    void (*startProc)(void) );
+
+extern void initCondition         ( Condition* pCond );
+extern void closeCondition        ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition    ( Condition* pCond );
+extern rtsBool waitCondition      ( Condition* pCond,
+                                    Mutex* pMut );
+
+extern void initMutex             ( Mutex* pMut );
+      </pre>
+
+    <h2>The Task interface</h2>
+
+    <h2>The Capability interface</h2>
+
+    <h2>Multiprocessor Haskell Execution</h2>
+
+  </body>
+</html>
diff --git a/docs/comm/the-beast/alien.html b/docs/comm/the-beast/alien.html
new file mode 100644
index 0000000000..3d4776ebc9
--- /dev/null
+++ b/docs/comm/the-beast/alien.html
@@ -0,0 +1,56 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Alien Functions</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Alien Functions</h1>
+    <p>
+      GHC implements experimental (by now it is actually quite well tested)
+      support for access to foreign functions and, more generally, the
+      interaction between Haskell code and code written in other languages.
Code
+      generation in this context can get quite tricky.  This section attempts
+      to cast some light on this aspect of the compiler.
+
+    <h4>FFI Stub Files</h4>
+    <p>
+      For each Haskell module that contains a <code>foreign export
+      dynamic</code> declaration, GHC generates a <code>_stub.c</code> file
+      that needs to be linked with any program that imports the Haskell
+      module.  When asked about it, <a
+      href="mailto:simonmar@microsoft.com">Simon Marlow</a> justified the
+      existence of these files as follows:
+      <blockquote>
+        The stub files contain the helper function which invokes the Haskell
+        code when called from C.
+        <p>
+        Each time the foreign export dynamic is invoked to create a new
+        callback function, a small piece of code has to be dynamically
+        generated (by code in <a
+        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Adjustor.c"><code>Adjustor.c</code></a>).  It is the address of this dynamically generated bit of
+        code that is returned as the <code>Addr</code> (or <a
+        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/Ptr.lhs"><code>Ptr</code></a>).
+        When called from C, the dynamically generated code must somehow invoke
+        the Haskell function which was originally passed to the
+        f.e.d. function -- it does this by invoking the helper function,
+        passing it a <a
+        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/hslibs/lang/StablePtr.lhs"><code>StablePtr</code></a>
+        to the Haskell function.  It's split this way for two reasons: so that
+        the same helper function can be used each time the f.e.d. function is
+        called, and to keep the amount of dynamically generated code to a
+        minimum.
+      </blockquote>
+    <p>
+      The stub code is generated by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsForeign.lhs"><code>DsForeign</code></a><code>.fexportEntry</code>.
+
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Fri Aug 10 11:47:41 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/basicTypes.html b/docs/comm/the-beast/basicTypes.html
new file mode 100644
index 0000000000..ca56d6b6a8
--- /dev/null
+++ b/docs/comm/the-beast/basicTypes.html
@@ -0,0 +1,132 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Basics</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Basics</h1>
+    <p>
+      The directory <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/"><code>fptools/ghc/compiler/basicTypes/</code></a>
+      contains modules that define some of the essential type definitions for
+      the compiler - such as identifiers, variables, modules, and unique
+      names.  Some of those are discussed in the following.
See elsewhere for more
+      detailed information on:
+      <ul>
+        <li> <a href="vars.html"><code>Var</code>s, <code>Id</code>s, and <code>TyVar</code>s</a>
+        <li> <a href="renamer.html"><code>OccName</code>s, <code>RdrName</code>s, and <code>Names</code>s</a>
+      </ul>
+
+    <h2>Elementary Types</h2>
+
+    <h4><code>Id</code>s</h4>
+    <p>
+      An <code>Id</code> (defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id.lhs</code></a>)
+      essentially records information about value and data constructor
+      identifiers -- to be precise, in the case of data constructors, two
+      <code>Id</code>s are used to represent the worker and wrapper functions
+      for the data constructor, respectively.  The information maintained in
+      the <code>Id</code> abstraction includes, among other items, strictness,
+      occurrence, specialisation, and unfolding information.
+    <p>
+      Due to the way <code>Id</code>s are used for data constructors,
+      all <code>Id</code>s are represented as variables, which contain a
+      <code>varInfo</code> field of abstract type <code><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/IdInfo.lhs">IdInfo</a>.IdInfo</code>.
+      This is where the information about <code>Id</code>s is really stored.
+      The following is a (currently partial) list of the various items in an
+      <code>IdInfo</code>:
+    <p>
+      <dl>
+        <dt><a name="occInfo">Occurrence information</a>
+        <dd>The <code>OccInfo</code> data type is defined in the module <a
+        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/BasicTypes.lhs"><code>BasicTypes.lhs</code></a>.
+        Apart from the trivial <code>NoOccInfo</code>, it distinguishes
+        between variables that do not occur at all (<code>IAmDead</code>),
+        occur just once (<code>OneOcc</code>), or are <a
+        href="simplifier.html#loopBreaker">loop breakers</a>
+        (<code>IAmALoopBreaker</code>).
+      </dl>
+
+    <h2>Sets, Finite Maps, and Environments</h2>
+    <p>
+      Sets of variables, or more generally names, which are needed throughout
+      the compiler, are provided by the modules <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarSet.lhs"><code>VarSet.lhs</code></a>
+      and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameSet.lhs"><code>NameSet.lhs</code></a>,
+      respectively.  Moreover, maps from variables (or names) to
+      other data are frequently needed.  For example, a substitution is
+      represented by a finite map from variable names to expressions.  Jobs
+      like this are handled by means of variable and name environments
+      implemented by the modules <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/VarEnv.lhs"><code>VarEnv.lhs</code></a>
+      and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/NameEnv.lhs"><code>NameEnv.lhs</code></a>.
+
+    <h4>The Module <code>VarSet</code></h4>
+    <p>
+      The module <code>VarSet</code> provides the types <code>VarSet</code>,
+      <code>IdSet</code>, and <code>TyVarSet</code>, which are synonyms in the
+      current implementation, as <code>Var</code>, <code>Id</code>, and
+      <code>TyVar</code> are synonyms.  The module provides all the operations
+      that one would expect, including the creation of sets from individual
+      variables and lists of variables, union and intersection operations,
+      element checks, deletion, filter, fold, and map functions.
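+    <p>
+      As a small sketch of the flavour of this API (the names below follow
+      the usual <code>VarSet</code> interface, but check the module itself
+      for the exact signatures):
+      <blockquote><pre>
+boundVars :: [Var] -> VarSet
+boundVars vs = mkVarSet vs                     -- build a set from a list
+
+isFreeIn :: Var -> VarSet -> Bool
+isFreeIn v bound = not (v `elemVarSet` bound)  -- membership test
+
+mergeScopes :: VarSet -> VarSet -> VarSet
+mergeScopes s1 s2 = s1 `unionVarSet` s2        -- set union</pre>
+      </blockquote>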
+    <p>
+      The implementation is based on <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqSet.lhs"><code>UniqSet</code></a>s,
+      which in turn are simply <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>s
+      (i.e., finite maps with uniques as keys) that map each unique to the
+      variable that it represents.
+
+    <h4>The Module <code>NameSet</code></h4>
+    <p>
+      The module <code>NameSet</code> provides the same functionality as
+      <code>VarSet</code> only for <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Name.lhs"><code>Name</code></a>s.
+      As for the difference between <code>Name</code>s and <code>Var</code>s,
+      a <code>Var</code> is built from a <code>Name</code> plus additional
+      information (most importantly, type information).
+
+    <h4>The Module <code>VarEnv</code></h4>
+    <p>
+      The module <code>VarEnv</code> provides the types <code>VarEnv</code>,
+      <code>IdEnv</code>, and <code>TyVarEnv</code>, which are again
+      synonyms.  The provided base functionality is similar to
+      <code>VarSet</code> with the main difference that a type <code>VarEnv
+      T</code> associates a value of type <code>T</code> with each variable in
+      the environment, thus effectively implementing a finite map from
+      variables to values of type <code>T</code>.
+    <p>
+      The implementation of <code>VarEnv</code> is also based on <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/utils/UniqFM.lhs"><code>UniqFM</code></a>,
+      which entails the slightly surprising implication that it is
+      <em>not</em> possible to retrieve the domain of a variable environment.
+      In other words, there is no function corresponding to
+      <code>VarSet.varSetElems :: VarSet -> [Var]</code> in
+      <code>VarEnv</code>.  This is because the <code>UniqFM</code> used to
+      implement <code>VarEnv</code> stores only the unique corresponding to a
+      variable in the environment, but not the entire variable (and there is
+      no mapping from uniques to variables).
+    <p>
+      In addition to plain variable environments, the module also contains
+      special substitution environments - the type <code>SubstEnv</code> -
+      that associate variables with the special-purpose type
+      <code>SubstResult</code>.
+
+    <h4>The Module <code>NameEnv</code></h4>
+    <p>
+      The type <code>NameEnv.NameEnv</code> is like <code>VarEnv</code> only
+      for <code>Name</code>s.
+
+    <p><hr><small>
+<!-- hhmts start -->
+Last modified: Tue Jan  8 18:29:52 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/coding-style.html b/docs/comm/the-beast/coding-style.html
new file mode 100644
index 0000000000..41347c6902
--- /dev/null
+++ b/docs/comm/the-beast/coding-style.html
@@ -0,0 +1,230 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Coding Style Guidelines</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Coding Style Guidelines</h1>
+
+    <p>This is a rough description of some of the coding practices and
+    style that we use for Haskell code inside <tt>ghc/compiler</tt>.
+
+    <p>The general rule is to stick to the same coding style as is
+    already used in the file you're editing.  If you must make
+    stylistic changes, commit them separately from functional changes,
+    so that someone looking back through the change logs can easily
+    distinguish them.
+
+    <h2>To literate or not to literate?</h2>
+
+    <p>In GHC we use a mixture of literate (<tt>.lhs</tt>) and
+    non-literate (<tt>.hs</tt>) source.  I (Simon M.) prefer to use
+    non-literate style, because I think the
+    <tt>\begin{code}..\end{code}</tt> brackets clutter up the source too
+    much, and I like to use Haddock-style comments (we haven't tried
+    processing the whole of GHC with Haddock yet, though).
+
+    <h2>To CPP or not to CPP?</h2>
+
+    <p>We pass all the compiler sources through CPP.  The
+    <tt>-cpp</tt> flag is always added by the build system.
+
+    <p>The following CPP symbols are used throughout the compiler:
+
+    <dl>
+      <dt><tt>DEBUG</tt></dt>
+
+      <dd>Used to enable extra checks and debugging output in the
+      compiler.  The <tt>ASSERT</tt> macro (see <tt>HsVersions.h</tt>)
+      provides assertions which disappear when <tt>DEBUG</tt> is not
+      defined.
+
+      <p>All debugging output should be placed inside <tt>#ifdef
+      DEBUG</tt>; we generally use this to provide warnings about
+      strange cases and things that might warrant investigation.  When
+      <tt>DEBUG</tt> is off, the compiler should normally be silent
+      unless something goes wrong (except when the verbosity level
+      is greater than zero).
+
+      <p>A good rule of thumb is that <tt>DEBUG</tt> shouldn't add
+      more than about 10-20% to the compilation time.  This is the case
+      at the moment.  If it gets too expensive, we won't use it.  For
+      more expensive runtime checks, consider adding a flag - see for
+      example <tt>-dcore-lint</tt>.
+      </dd>
+
+      <dt><tt>GHCI</tt></dt>
+
+      <dd>Enables GHCi support, including the byte code generator and
+      interactive user interface.  This isn't the default, because the
+      compiler needs to be bootstrapped with itself in order for GHCi
+      to work properly.  The reason is that the byte-code compiler and
+      linker are quite closely tied to the runtime system, so it is
+      essential that GHCi is linked with the most up-to-date RTS.
+      Another reason is that the representation of certain datatypes
+      must be consistent between GHCi and its libraries, and if these
+      were inconsistent then disaster could follow.
+      </dd>
+
+    </dl>
+
+    <h2>Platform tests</h2>
+
+    <p>There are three platforms of interest to GHC:
+
+    <ul>
+      <li>The <b>Build</b> platform.  This is the platform on which we
+      are building GHC.</li>
+      <li>The <b>Host</b> platform.  This is the platform on which we
+      are going to run this GHC binary, and associated tools.</li>
+      <li>The <b>Target</b> platform.  This is the platform for which
+      this GHC binary will generate code.</li>
+    </ul>
+
+    <p>At the moment, there is very limited support for having
+    different values for build, host, and target.  In particular:</p>
+
+    <ul>
+      <li>The build platform is currently always the same as the host
+      platform.  The build process needs to use some of the tools in
+      the source tree, for example <tt>ghc-pkg</tt> and
+      <tt>hsc2hs</tt>.</li>
+
+      <li>If the target platform differs from the host platform, then
+      this is generally for the purpose of building <tt>.hc</tt> files
+      from Haskell source for porting GHC to the target platform.
+      Full cross-compilation isn't supported (yet).</li>
+    </ul>
+
+    <p>In the compiler's source code, you may make use of the
+    following CPP symbols:</p>
+
+    <ul>
+      <li><em>xxx</em><tt>_TARGET_ARCH</tt></li>
+      <li><em>xxx</em><tt>_TARGET_VENDOR</tt></li>
+      <li><em>xxx</em><tt>_TARGET_OS</tt></li>
+      <li><em>xxx</em><tt>_HOST_ARCH</tt></li>
+      <li><em>xxx</em><tt>_HOST_VENDOR</tt></li>
+      <li><em>xxx</em><tt>_HOST_OS</tt></li>
+    </ul>
+
+    <p>where <em>xxx</em> is the appropriate value:
+    eg. <tt>i386_TARGET_ARCH</tt>.
+
+    <h2>Compiler versions</h2>
+
+    <p>GHC must be compilable by every major version of GHC from 5.02
+    onwards, and itself.  It isn't necessary for it to be compilable
+    by every intermediate development version (that includes last
+    week's CVS sources).
+
+    <p>To maintain compatibility, use <tt>HsVersions.h</tt> (see
+    below) where possible, and try to avoid using <tt>#ifdef</tt> in
+    the source itself.
+
+    <h2>The source file</h2>
+
+    <p>We now describe a typical source file, annotating stylistic
+    choices as we go.
+
+<pre>
+{-# OPTIONS ... #-}
+</pre>
+
+    <p>An <tt>OPTIONS</tt> pragma is optional, but if present it
+    should go right at the top of the file.  Things you might want to
+    put in <tt>OPTIONS</tt> include:
+
+    <ul>
+      <li><tt>-#include</tt> options to bring into scope prototypes
+      for FFI declarations</li>
+      <li><tt>-fvia-C</tt> if you know that
+      this module won't compile with the native code generator.
+    </ul>
+
+    <p>Don't bother putting <tt>-cpp</tt> or <tt>-fglasgow-exts</tt>
+    in the <tt>OPTIONS</tt> pragma; these are already added to the
+    command line by the build system.
+
+
+<pre>
+module Foo (
+   T(..),
+   foo,	-- :: T -> T
+ ) where
+</pre>
+
+    <p>We usually (99% of the time) include an export list.  The only
+    exceptions are perhaps where the export list would list absolutely
+    everything in the module, and even then sometimes we do it anyway.
+
+    <p>It's helpful to give type signatures inside comments in the
+    export list, but hard to keep them consistent, so we don't always
+    do that.
+
+<pre>
+#include "HsVersions.h"
+</pre>
+
+    <p><tt>HsVersions.h</tt> is a CPP header file containing a number
+    of macros that help smooth out the differences between compiler
+    versions.  It defines, for example, macros for library module
+    names which have moved between versions.  Take a look.
+
+<pre>
+-- friends
+import SimplMonad
+
+-- GHC
+import CoreSyn
+import Id           ( idName, idType )
+import BasicTypes
+
+-- libraries
+import DATA_IOREF   ( newIORef, readIORef )
+
+-- std
+import List         ( partition )
+import Maybe        ( fromJust )
+</pre>
+
+    <p>List imports in the following order:
+
+    <ul>
+      <li>Local to this subsystem (or directory) first</li>
+
+      <li>Compiler imports, generally ordered from specific to generic
+      (ie. modules from <tt>utils/</tt> and <tt>basicTypes/</tt>
+      usually come last)</li>
+
+      <li>Library imports</li>
+
+      <li>Standard Haskell 98 imports last</li>
+    </ul>
+
+    <p>Import library modules from the <tt>base</tt> and
+    <tt>haskell98</tt> packages only.  Use <tt>#defines</tt> in
+    <tt>HsVersions.h</tt> when the module names differ between
+    versions of GHC (eg. <tt>DATA_IOREF</tt> in the example above).
+    For code inside <tt>#ifdef GHCI</tt>, you don't need to worry about
+    GHC versioning (because we are bootstrapped).
+
+    <p>We usually use import specs to give an explicit list of the
+    entities imported from a module.  The main reason for doing this is
+    so that you can search the file for an entity and find which module
+    it comes from.
+However, huge import lists can be a pain to
+    maintain, so we often omit the import specs when they start to get
+    long (actually I start omitting them when they don't fit on one
+    line --Simon M.).  Tip: use GHC's <tt>-fwarn-unused-imports</tt>
+    flag so that you get notified when an import isn't being used any
+    more.
+
+    <p>If the module can be compiled multiple ways (eg. GHCI
+    vs. non-GHCI), make sure the imports are properly <tt>#ifdefed</tt>
+    too, so as to avoid spurious unused import warnings.
+
+    <p><em>ToDo: finish this</em>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/data-types.html b/docs/comm/the-beast/data-types.html
new file mode 100644
index 0000000000..fef4852d4d
--- /dev/null
+++ b/docs/comm/the-beast/data-types.html
@@ -0,0 +1,242 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Data types and data constructors</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Data types and data constructors</h1>
+    <p>
+
+This chapter was thoroughly revised in Feb 2003.
+
+<h2>Data types</h2>
+
+Consider the following data type declaration:
+
+<pre>
+  data T a = MkT !(a,a) !Int | Nil
+
+  f x = case x of
+          MkT p q -> MkT p (q+1)
+          Nil     -> Nil
+</pre>
+The user's source program mentions only the constructors <tt>MkT</tt>
+and <tt>Nil</tt>.  However, these constructors actually <em>do</em> something
+in addition to building a data value.  For a start, <tt>MkT</tt> evaluates
+its arguments.  Secondly, with the flag <tt>-funbox-strict-fields</tt> GHC
+will flatten (or unbox) the strict fields.  So we may imagine that there's the
+<em>source</em> constructor <tt>MkT</tt> and the <em>representation</em> constructor
+<tt>MkT</tt>, and things start to get pretty confusing.
+<p>
+GHC now generates three unique <tt>Name</tt>s for each data constructor:
+<pre>
+                                ---- OccName ------
+                                String   Name space     Used for
+  ---------------------------------------------------------------------------
+  The "source data con"         MkT      DataName       The DataCon itself
+  The "worker data con"         MkT      VarName        Its worker Id
+                        aka "representation data con"
+  The "wrapper data con"        $WMkT    VarName        Its wrapper Id (optional)
+</pre>
+Recall that each occurrence name (OccName) is a pair of a string and a
+name space (see <a href="names.html">The truth about names</a>), and
+two OccNames are considered the same only if both components match.
+That is what distinguishes the name of the DataCon from
+the name of its worker Id.  To keep things unambiguous, in what
+follows we'll write "MkT{d}" for the source data con, and "MkT{v}" for
+the worker Id.  (Indeed, when you dump stuff with "-ddumpXXX", if you
+also add "-dppr-debug" you'll get stuff like "Foo {- d rMv -}".  The
+"d" part is the name space; the "rMv" is the unique key.)
+<p>
+Each of these three names gets a distinct unique key in GHC's name cache.
+
+<h2>The life cycle of a data type</h2>
+
+Suppose the Haskell source looks like this:
+<pre>
+  data T a = MkT !(a,a) !Int | Nil
+
+  f x = case x of
+          Nil     -> Nil
+          MkT p q -> MkT p (q+1)
+</pre>
+When the parser reads it in, it decides which name space each lexeme comes
+from, thus:
+<pre>
+  data T a = MkT{d} !(a,a) !Int | Nil{d}
+
+  f x = case x of
+          Nil{d}     -> Nil{d}
+          MkT{d} p q -> MkT{d} p (q+1)
+</pre>
+Notice that in the Haskell source <em>all data constructors are named via the "source data con" MkT{d}</em>,
+whether in pattern matching or in expressions.
+<p>
+In the translated source produced by the type checker (-ddump-tc), the program looks like this:
+<pre>
+  f x = case x of
+          Nil{d} -> Nil{v}
+          MkT{d} p q -> $WMkT p (q+1)
+
+</pre>
+Notice that the type checker replaces the occurrence of MkT by the <em>wrapper</em>, but
+the occurrence of Nil by the <em>worker</em>.  Reason: Nil doesn't have a wrapper because there is
+nothing to do in the wrapper (this is the vastly common case).
+<p>
+Though they are not printed out by "-ddump-tc", behind the scenes, there are
+also the following: the data type declaration and the wrapper function for MkT.
+<pre>
+  data T a = MkT{d} a a Int# | Nil{d}
+
+  $WMkT :: (a,a) -> Int -> T a
+  $WMkT p i = case p of
+                (a,b) -> case i of
+                           I# i' -> MkT{v} a b i'
+</pre>
+Here, the <em>wrapper</em> <tt>$WMkT</tt> evaluates and takes apart the argument <tt>p</tt>,
+evaluates and unboxes the argument <tt>i</tt>, and builds a three-field data value
+with the <em>worker</em> constructor <tt>MkT{v}</tt>.  (There are more notes below
+about the unboxing of strict fields.)  The wrapper <tt>$WMkT</tt> is called an <em>implicit binding</em>,
+because it's introduced implicitly by the data type declaration (record selectors
+are also implicit bindings, for example).  Implicit bindings are injected into the code
+just before emitting code or External Core.
+<p>
+After desugaring into Core (-ddump-ds), the definition of <tt>f</tt> looks like this:
+<pre>
+  f x = case x of
+          Nil{d} -> Nil{v}
+          MkT{d} a b r -> let { p = (a,b); q = I# r } in
+                          $WMkT p (q+1)
+</pre>
+Notice the way that pattern matching has been desugared to take account of the fact
+that the "real" data constructor MkT has three fields.
+<p>
+By the time the simplifier has had a go at it, <tt>f</tt> will be transformed to:
+<pre>
+  f x = case x of
+          Nil{d} -> Nil{v}
+          MkT{d} a b r -> MkT{v} a b (r +# 1#)
+</pre>
+Which is highly cool.
+
+
+<h2> The constructor wrapper functions </h2>
+
+The wrapper functions are automatically generated by GHC, and are
+really emitted into the result code (albeit only after CorePrep; see
+<tt>CorePrep.mkImplicitBinds</tt>).
+The wrapper functions are inlined very
+vigorously, so you will not see many occurrences of the wrapper
+functions in an optimised program, but you may see some.  For example,
+if your Haskell source has
+<pre>
+  map MkT xs
+</pre>
+then <tt>$WMkT</tt> will not be inlined (because it is not applied to anything).
+That is why we generate real top-level bindings for the wrapper functions,
+and generate code for them.
+
+
+<h2> The constructor worker functions </h2>
+
+Saturated applications of the constructor worker function MkT{v} are
+treated specially by the code generator; they really do allocation.
+However, we do want a single, shared, top-level definition for
+top-level nullary constructors (like True and False).  Furthermore,
+what if the code generator encounters a non-saturated application of a
+worker?  E.g. <tt>(map Just xs)</tt>.  We could declare that to be an
+error (CorePrep should saturate them).  But instead we currently
+generate a top-level definition for each constructor worker, whether
+nullary or not.  It takes the form:
+<pre>
+  MkT{v} = \ p q r -> MkT{v} p q r
+</pre>
+This is a real hack.  The occurrence on the RHS is saturated, so the code generator (both the
+one that generates abstract C and the byte-code generator) treats it as a special case and
+allocates a MkT; it does not make a recursive call!  So now there's a top-level curried
+version of the worker which is available to anyone who wants it.
+<p>
+This strange definition is not emitted into External Core.  Indeed, you might argue that
+we should instead pass the list of <tt>TyCon</tt>s to the code generator and have it
+generate magic bindings directly.  As it stands, it's a real hack: see the code in
+CorePrep.mkImplicitBinds.
+
+
+<h2> External Core </h2>
+
+When emitting External Core, we should see this for our running example:
+
+<pre>
+  data T a = MkT a a Int# | Nil
+
+  $WMkT :: (a,a) -> Int -> T a
+  $WMkT p i = case p of
+                (a,b) -> case i of
+                           I# i' -> MkT a b i'
+
+  f x = case x of
+          Nil -> Nil
+          MkT a b r -> MkT a b (r +# 1#)
+</pre>
+Notice that it makes perfect sense as a program all by itself.  Constructors
+look like constructors (albeit not identical to the original Haskell ones).
+<p>
+When reading in External Core, the parser is careful to read it back in just
+as it was before it was spat out, namely:
+<pre>
+  data T a = MkT{d} a a Int# | Nil{d}
+
+  $WMkT :: (a,a) -> Int -> T a
+  $WMkT p i = case p of
+                (a,b) -> case i of
+                           I# i' -> MkT{v} a b i'
+
+  f x = case x of
+          Nil{d} -> Nil{v}
+          MkT{d} a b r -> MkT{v} a b (r +# 1#)
+</pre>
+
+
+<h2> Unboxing strict fields </h2>
+
+If GHC unboxes strict fields (as in the first argument of <tt>MkT</tt> above),
+it also transforms
+source-language case expressions.  Suppose you write this in your Haskell source:
+<pre>
+  case e of
+    MkT p t -> ..p..t..
+</pre>
+GHC will desugar this to the following Core code:
+<pre>
+  case e of
+    MkT a b i -> let { p = (a,b); t = I# i } in ..p..t..
+</pre>
+The local let-bindings rebox the unboxed fields because they may be
+mentioned in the case alternative.  This may well be a bad idea, which is why
+<tt>-funbox-strict-fields</tt> is an experimental feature.
+<p>
+It's essential that when importing a type <tt>T</tt> defined in some
+external module <tt>M</tt>, GHC knows what representation was used for
+that type, and that in turn depends on whether module <tt>M</tt> was
+compiled with <tt>-funbox-strict-fields</tt>.  When writing an
+interface file, GHC therefore records with each data type whether its
+strict fields (if any) should be unboxed.
+
+<h2> Labels and info tables </h2>
+
+<em>Quick rough notes: SLPJ March 2003</em>.
+<p>
+Every data constructor <tt>C</tt> has two info tables:
+<ul>
+<li> The static info table (label <tt>C_static_info</tt>), used for statically-allocated constructors.
+
+<li> The dynamic info table (label <tt>C_con_info</tt>), used for dynamically-allocated constructors.
+</ul>
+Statically-allocated constructors are not moved by the garbage collector, and therefore have a different closure
+type from dynamically-allocated constructors; hence they need
+a distinct info table.
+Both info tables share the same entry code, but since the entry code is physically juxtaposed with the
+info table, it must be duplicated (<tt>C_static_entry</tt> and <tt>C_con_entry</tt> respectively).
+
+  </body>
+</html>
+
diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html
new file mode 100644
index 0000000000..a66740259b
--- /dev/null
+++ b/docs/comm/the-beast/desugar.html
@@ -0,0 +1,156 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Sugar Free: From Haskell To Core</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Sugar Free: From Haskell To Core</h1>
+    <p>
+      Up until after type checking, GHC keeps the source program in an
+      abstract representation of Haskell source without removing any of the
+      syntactic sugar (such as list comprehensions) that could easily be
+      represented by more primitive Haskell.  This complicates part of the
+      front-end considerably as the abstract syntax of Haskell (as exported by
+      the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>)
+      is much more complex than a simplified representation close to, say, the
+      <a href="http://haskell.org/onlinereport/intro.html#sect1.2">Haskell
+      Kernel</a> would be.  However, having a representation that is as close
+      as possible to the surface syntax simplifies the generation of clear
+      error messages.  As GHC (quite in contrast to "conventional" compilers)
+      prints code fragments as part of error messages, the choice of
+      representation is especially important.
+    <p>
+      Nonetheless, as soon as the input has passed all static checks, it is
+      transformed into GHC's principal intermediate language that goes by the
+      name of <em>Core</em> and whose representation is exported by the
+      module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs"><code>CoreSyn</code></a>.
+      All following compiler phases, except code generation, operate on Core.
+      Due to Andrew Tolmach's effort, there is also an <a
+      href="http://www.haskell.org/ghc/docs/papers/core.ps.gz">external
+      representation for Core.</a>
+    <p>
+      The conversion of the compiled module from <code>HsSyn</code> into that
+      of <code>CoreSyn</code> is performed by a phase called the
+      <em>desugarer</em>, which is located in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/"><code>fptools/ghc/compiler/deSugar/</code></a>.
+      Its operation is detailed in the following.
+    </p>
+
+    <h2>Auxiliary Functions</h2>
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsMonad.lhs"><code>DsMonad</code></a>
+      defines the desugarer monad (of type <code>DsM</code>) which maintains
+      the environment needed for desugaring.  In particular, it encapsulates a
+      unique supply for generating new variables, a map to look up standard
+      names (such as functions from the prelude), a source location for error
+      messages, and a pool to collect warning messages generated during
+      desugaring.  Initialisation of the environment happens in the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Desugar.lhs"><code>Desugar</code></a><code>.desugar</code>,
+      which is also the main entry point into the desugarer.
+    <p>
+      The generation of Core code often involves the use of standard functions
+      for which proper identifiers (i.e., values of type <code>Id</code> that
+      actually refer to the definition in the right Prelude) need to be
+      obtained.  This is supported by the function
+      <code>DsMonad.dsLookupGlobalValue :: Name -> DsM Id</code>.
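+    <p>
+      To give a feel for the idiom, the following sketch shows how a
+      hypothetical desugaring function might use it (written with
+      do-notation for clarity; <code>andName</code> merely stands in for
+      whatever known-key <code>Name</code> is wanted, and
+      <code>mkApps</code> and <code>Var</code> are from
+      <code>CoreSyn</code>):
+    <blockquote><pre>
+-- hypothetical: build a Core application of the Prelude's (&&)
+dsConjunction :: CoreExpr -> CoreExpr -> DsM CoreExpr
+dsConjunction e1 e2
+  = do andId <- dsLookupGlobalValue andName   -- obtain (&&)'s Id
+       return (mkApps (Var andId) [e1, e2])   -- (&&) e1 e2
+</pre>
+    </blockquote>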
+    <h2><a name="patmat">Pattern Matching</a></h2>
+    <p>
+      Nested pattern matching with guards and everything is translated into
+      the simple, flat case expressions of Core by the following modules:
+    <dl>
+      <dt><a
+	href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a>:
+      <dd>This module contains the main pattern-matching compiler in the form
+	of a function called <code>match</code>.  There is some documentation
+	as to how <code>match</code> works contained in the module itself.
+	Generally, the implemented algorithm is similar to the one described
+	in Phil Wadler's Chapter ? of Simon Peyton Jones' <em>The
+	Implementation of Functional Programming Languages</em>.
+	<code>Match</code> exports a couple of functions with not particularly
+	intuitive names.  In particular, it exports <code>match</code>,
+	<code>matchWrapper</code>, <code>matchExport</code>, and
+	<code>matchSimply</code>.  The function <code>match</code>, which is
+	the main work horse, is only used by the other matching modules.  The
+	function <code>matchExport</code> - despite its name - is merely used
+	internally in <code>Match</code> and handles warning messages (see
+	below for more details).  The actual interface to the outside is
+	<code>matchWrapper</code>, which converts the output of the type
+	checker into the form needed by the pattern matching compiler (i.e., a
+	list of <code>EquationInfo</code>).  Similar in function to
+	<code>matchWrapper</code> is <code>matchSimply</code>, which provides
+	an interface for the case where a single expression is to be matched
+	against a single pattern (as, for example, is the case in bindings in
+	a <code>do</code> expression).
+      <dt><a
+	href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchCon.lhs"><code>MatchCon</code></a>:
+      <dd>This module generates code for a set of alternative constructor
+	patterns that belong to a single type by means of the routine
+	<code>matchConFamily</code>.  More precisely, the routine gets a set
+	of equations where the left-most pattern of each equation is a
+	constructor pattern with a head symbol from the same type as that of
+	all the other equations.  A Core case expression is generated that
+	distinguishes between all these constructors.  The routine is clever
+	enough to generate a sparse case expression and to add a catch-all
+	default case only when needed (i.e., if the case expression isn't
+	exhaustive already).  There is also an explanation at the start of the
+	module.
+      <dt><a
+	href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/MatchLit.lhs"><code>MatchLit</code></a>:
+      <dd>Generates code for a set of alternative literal patterns by means of
+	the routine <code>matchLiterals</code>.  The principle is similar to
+	that of <code>matchConFamily</code>, but all left-most patterns are
+	literals of the same type.
+      <dt><a
+	href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a>:
+      <dd>This module provides a set of auxiliary definitions as well as the
+	data types <code>EquationInfo</code> and <code>MatchResult</code> that
+	form the input and output, respectively, of the pattern matching
+	compiler.
+      <dt><a
+	href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>:
+      <dd>This module does not really contribute to compiling pattern
+	matching, but it inspects sets of equations to find whether there are
+	any overlapping patterns or non-exhaustive pattern sets.  This task is
+	implemented by the function <code>check</code>, which returns a list of
+	patterns that are part of a non-exhaustive case distinction as well as a
+	set of equation labels that can be reached during execution of the code;
+	thus, the remaining equations are shadowed due to overlapping patterns.
+	The function <code>check</code> is invoked and its result converted into
+	suitable warning messages by the function <code>Match.matchExport</code>
+	(which is a wrapper for <code>Match.match</code>).
+    </dl>
+    <p>
+      The central function <code>match</code>, given a set of equations,
+      proceeds in a number of steps:
+      <ol>
+	<li>It starts by desugaring the left-most pattern of each equation using
+	  the function <code>tidy1</code> (indirectly via
+	  <code>tidyEqnInfo</code>).  During this process, non-elementary
+	  patterns (e.g., those using explicit list syntax <code>[x, y, ...,
+	  z]</code>) are converted to standard constructor patterns, and
+	  irrefutable patterns are removed.
+	<li>Then, a process called <em>unmixing</em> clusters the equations into
+	  blocks (without re-ordering them), such that the left-most patterns of
+	  all equations in a block are either all variables, all literals, or
+	  all constructors.
+	<li>Each block is then compiled by <code>matchUnmixedEqns</code>,
+	  which forwards the handling of literal pattern blocks to
+	  <code>MatchLit.matchLiterals</code>, of constructor pattern blocks to
+	  <code>MatchCon.matchConFamily</code>, and hands variable pattern
+	  blocks back to <code>match</code>.
+      </ol>
+
+    <p><hr><small>
+<!-- hhmts start -->
+Last modified: Mon Feb 11 22:35:25 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/driver.html b/docs/comm/the-beast/driver.html
new file mode 100644
index 0000000000..fbf65e33e7
--- /dev/null
+++ b/docs/comm/the-beast/driver.html
@@ -0,0 +1,179 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Glorious Driver</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Glorious Driver</h1>
+    <p>
+      The Glorious Driver (GD) is the part of GHC that orchestrates the
+      interaction of all the other pieces that make up GHC.  It supersedes the
+      <em>Evil Driver (ED),</em> which was a Perl script that served the same
+      purpose and was in use until version 4.08.1 of GHC.  Simon Marlow
+      eventually slayed the ED and instated the GD.  The GD is usually called
+      the <em>Compilation Manager</em> these days.
+    </p>
+    <p>
+      Since version 5.00, the GD has been substantially extended for GHCi,
+      i.e., the interactive variant of GHC that integrates the compiler with
+      a (meta-circular) interpreter.  Most of the driver is located in the
+      directory
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/"><code>fptools/ghc/compiler/main/</code></a>.
+    </p>
+
+    <h2>Command Line Options</h2>
+    <p>
+      GHC's many flavours of command line options make the code interpreting
+      them rather involved.  The following provides a brief overview of the
+      processing of these options.
+Since the addition of the interactive
+      front-end to GHC, there are two kinds of options: <em>static
+      options</em> and <em>dynamic options.</em>  The former can only be set
+      when the system is invoked, whereas the latter can be altered in the
+      course of an interactive session.  A brief explanation of the difference
+      between these options and related matters is at the start of the module
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>.
+      The same module defines the enumeration <code>DynFlag</code>, which
+      contains all dynamic flags.  Moreover, there is the labelled record
+      <code>DynFlags</code> that collects all the flag-related information
+      that is passed by the compilation manager to the compiler proper,
+      <code>hsc</code>, whenever a compilation is triggered.  If you'd like to
+      find out whether an option is static, use the predicate
+      <code>isStaticHscFlag</code> in the same module.
+    <p>
+      The second module that contains a lot of code related to the management
+      of flags is <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverFlags.hs"><code>DriverFlags.hs</code></a>.
+      In particular, the module contains two association lists that map the
+      textual representation of the various flags to a data structure that
+      tells the driver how to parse the flag (e.g., whether it has any
+      arguments) and provides its internal representation.  All static flags
+      are contained in <code>static_flags</code>.  A whole range of
+      <code>-f</code> flags can be negated by adding a <code>-fno-</code>
+      prefix.  These flags are contained in the association list
+      <code>fFlags</code>.
+    <p>
+      The driver uses a nasty hack based on <code>IORef</code>s that permits
+      the rest of the compiler to access static flags as CAFs; i.e., there is
+      a family of toplevel variable definitions in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts</code></a>,
+      below the literate section heading <i>Static options</i>, each of which
+      contains the value of one static option.  This is essentially realised
+      via global variables (in the sense of C-style, updatable, global
+      variables) defined via an evil pre-processor macro named
+      <code>GLOBAL_VAR</code>, which is defined in a particularly ugly corner
+      of GHC, namely the C header file
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/HsVersions.h"><code>HsVersions.h</code></a>.
+
+    <h2>What Happens When</h2>
+    <p>
+      Inside the Haskell compiler proper (<code>hsc</code>), a whole series of
+      stages (``passes'') is executed in order to transform your Haskell program
+      into C or native code.  This process is orchestrated by
+      <code>main/HscMain.hscMain</code> and its relative
+      <code>hscReComp</code>.  The latter directly invokes, in order,
+      the parser, the renamer, the typechecker, the desugarer, the
+      simplifier (Core2Core), the CoreTidy pass, the CorePrep pass,
+      conversion to STG (CoreToStg), the interface generator
+      (MkFinalIface), the code generator, and code output.  The
+      simplifier is the most complex of these, and is made up of many
+      sub-passes.  These are controlled by <code>buildCoreToDo</code>,
+      as described below.
+
+    <h2>Scheduling Optimisations Phases</h2>
+    <p>
+      GHC has a large variety of optimisations at its disposal, many of which
+      have subtle interdependencies.
+The overall plan for program
+      optimisation is fixed in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState.hs</code></a>.
+      First of all, there is the variable <code>hsc_minusNoO_flags</code> that
+      determines the <code>-f</code> options that you get without
+      <code>-O</code> (aka optimisation level 0) as well as
+      <code>hsc_minusO_flags</code> and <code>hsc_minusO2_flags</code> for
+      <code>-O</code> and <code>-O2</code>.
+    <p>
+      However, most of the strategic decisions about optimisations on the
+      intermediate language Core are encoded in the value produced by
+      <code>buildCoreToDo</code>, which is a list with elements of type
+      <code>CoreToDo</code>.  Each element of this list specifies one step in
+      the sequence of core optimisations executed by the <a
+      href="simplifier.html">Mighty Simplifier</a>.  The type
+      <code>CoreToDo</code> is defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/CmdLineOpts.lhs"><code>CmdLineOpts.lhs</code></a>.
+      The actual execution of the optimisation plan produced by
+      <code>buildCoreToDo</code> is performed by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore</code></a><code>.doCorePasses</code>.
+      Core optimisation plans consist of a number of simplification phases
+      (currently, three for optimisation levels of 1 or higher) with
+      decreasing phase numbers (the lowest, namely 0, corresponding to the
+      last phase).  Before and after these phases, optimisations such as
+      specialisation, let floating, worker/wrapper, and so on are executed.
+      The sequence of phases is such that the synergistic effect of the phases
+      is maximised -- however, this is a fairly fragile arrangement.
+    <p>
+      There is a similar construction for optimisations on the STG level
+      stored in the variable <code>buildStgToDo :: [StgToDo]</code>.  However,
+      this is a lot less complex than the arrangement for Core optimisations.
+
+    <h2>Linking the <code>RTS</code> and <code>libHSstd</code></h2>
+    <p>
+      Since the RTS and HSstd refer to each other, there is a Cunning
+      Hack to avoid putting them each on the command-line twice or
+      thrice (aside: try asking for `plaice and chips thrice' in a
+      fish and chip shop; bet you only get two lots).  The hack involves
+      adding
+      the symbols that the RTS needs from libHSstd, such as
+      <code>PrelWeak_runFinalizzerBatch_closure</code> and
+      <code>__stginit_Prelude</code>, to the link line with the
+      <code>-u</code> flag.  The standard library appears before the
+      RTS on the link line, and these options cause the corresponding
+      symbols to be picked up even though the linker might not yet have
+      seen them being used, as the RTS appears later on the link line.
+      As a result, when the RTS is also scanned, these symbols are already
+      resolved.  This avoids the linker having to read the standard library
+      and RTS multiple times.
+    </p>
+    <p>
+      This does, however, lead to a complication.  Normal Haskell
+      programs do not have a <code>main()</code> function, so this is
+      supplied by the RTS (in the file
+      <a href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/rts/Main.c"><code>Main.c</code></a>).
+      It calls <code>startupHaskell</code>, which
+      itself calls <code>__stginit_PrelMain</code>, which is therefore,
+      since it occurs in the standard library, one of the symbols
+      passed to the linker using the <code>-u</code> option.
+This is fine
+      for standalone Haskell programs, but as soon as the Haskell code is only
+      used as part of a program implemented in a foreign language, the
+      <code>main()</code> function of that foreign language should be used
+      instead of that of the Haskell runtime.  In this case, the previously
+      described arrangement unfortunately fails as
+      <code>__stginit_PrelMain</code> had better not be linked in,
+      because it tries to call <code>__stginit_Main</code>, which won't
+      exist.  In other words, the RTS's <code>main()</code> refers to
+      <code>__stginit_PrelMain</code> which in turn refers to
+      <code>__stginit_Main</code>.  Although the RTS's <code>main()</code>
+      might not be linked in if the program provides its own, the driver
+      will normally force <code>__stginit_PrelMain</code> to be linked in anyway,
+      using <code>-u</code>, because it's a back-reference from the
+      RTS to HSstd.  This case is coped with by the <code>-no-hs-main</code>
+      flag, which suppresses passing the corresponding <code>-u</code> option
+      to the linker -- although in some versions of the compiler (e.g., 5.00.2)
+      it didn't work.  In addition, the driver generally places the C program
+      providing the <code>main()</code> that we want to use before the RTS
+      on the link line.  Therefore, the RTS's main is never used and,
+      without the <code>-u</code> option, the label
+      <code>__stginit_PrelMain</code> will not be linked in.
+    </p>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Feb 19 11:09:00 UTC 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html
new file mode 100644
index 0000000000..956043bafb
--- /dev/null
+++ b/docs/comm/the-beast/fexport.html
@@ -0,0 +1,231 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - foreign export</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - foreign export</h1>
+
+    The implementation scheme for foreign export, as of 27 Feb 02, is
+    as follows.  There are four cases, of which the first two are easy.
+    <p>
+    <b>(1) static export of an IO-typed function from some module <code>MMM</code></b>
+    <p>
+    <code>foreign export foo :: Int -> Int -> IO Int</code>
+    <p>
+    For this we generate no Haskell code.  However, a C stub is
+    generated, and it looks like this:
+    <p>
+    <pre>
+extern StgClosure* MMM_foo_closure;
+
+HsInt foo (HsInt a1, HsInt a2)
+{
+   SchedulerStatus rc;
+   HaskellObj ret;
+   rc = rts_evalIO(
+           rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)),
+                     rts_mkInt(a2)
+           ),
+           &ret
+        );
+   rts_checkSchedStatus("foo",rc);
+   return(rts_getInt(ret));
+}
+</pre>
+    <p>
+    This does the obvious thing: builds in the heap the expression
+    <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it,
+    and uses <code>rts_getInt</code> to fish out the result.
+
+    <p>
+    <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b>
+    <p>
+    <code>foreign export foo :: Int -> Int -> Int</code>
+    <p>
+    This is identical to case (1), with the sole difference that the
+    stub calls <code>rts_eval</code> rather than
+    <code>rts_evalIO</code>.
+    <p>
+
+    <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b>
+    <p>
+    <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code>
+    <p>
+    Dynamic exports are a whole lot more complicated than their static
+    counterparts.
+    <p>
+    First of all, we get some Haskell code, which, when given a
+    function <code>callMe :: (Int -> Int -> IO Int)</code> to be made
+    C-callable, IO-returns a <code>FunPtr a</code>, which is the
+    address of the resulting C-callable code.  This address can now be
+    handed out to the C-world, and callers to it will get routed
+    through to <code>callMe</code>.
+    <p>
+    The generated Haskell function looks like this:
+    <p>
+<pre>
+mkCallback f
+   = do sp <- mkStablePtr f
+        r  <- ccall "createAdjustorThunk" sp (&"run_mkCallback")
+        return r
+</pre>
+    <p>
+    <code>createAdjustorThunk</code> is a gruesome,
+    architecture-specific function in the RTS.  It takes a stable
+    pointer to the Haskell function to be run, and the address of the
+    associated C wrapper, and returns a piece of machine code,
+    which, when called from the outside (C) world, eventually calls
+    through to <code>f</code>.
+    <p>
+    This machine code fragment is called the "Adjustor Thunk" (don't
+    ask me why).  What it does is simply to call onwards to the C
+    helper
+    function <code>run_mkCallback</code>, passing all the args given
+    to it but also conveying <code>sp</code>, which is a stable
+    pointer
+    to the Haskell function to run.  So:
+    <p>
+<pre>
+createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn )
+{
+   create malloc'd piece of machine code "mc", behaving thusly:
+
+   mc ( args_to_mc )
+   {
+      jump to addr_of_helper_C_fn, passing sp as an additional
+      argument
+   }
+}
+</pre>
+    <p>
+    This is a horrible hack, because there is no portable way, even at
+    the machine code level, to write a function which adds one argument
+    and then transfers control onwards to another C function.  On x86s
+    args are pushed R to L onto the stack, so we can just push
+    <code>sp</code>, fiddle around with return addresses, and jump
+    onwards to the helper C function.  However, on architectures which
+    use register windows and/or pass args extensively in registers
+    (Sparc, Alpha, MIPS, IA64), this scheme borders on the unviable.
+    GHC has a limited <code>createAdjustorThunk</code> implementation
+    for Sparc and Alpha, which handles only the cases where all args,
+    including the extra one, fit in registers.
+    <p>
+    Anyway: the other lump of code generated as a result of an
+    f-x-dynamic declaration is the C helper stub.  This is basically
+    the same as in the static case, except that it only ever gets
+    called from the adjustor thunk, and therefore must accept, as an
+    extra argument, a stable pointer to the Haskell function
+    to run -- naturally enough, as this is not known until run-time.
+    It then dereferences the stable pointer and does the call in
+    the same way as the f-x-static case:
+<pre>
+HsInt Main_d1kv ( StgStablePtr the_stableptr,
+                  void* original_return_addr,
+                  HsInt a1, HsInt a2 )
+{
+   SchedulerStatus rc;
+   HaskellObj ret;
+   rc = rts_evalIO(
+           rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr),
+                               rts_mkInt(a1)
+                     ),
+                     rts_mkInt(a2)
+           ),
+           &ret
+        );
+   rts_checkSchedStatus("Main_d1kv",rc);
+   return(rts_getInt(ret));
+}
+</pre>
+    <p>
+    Note how this function has a purely made-up name
+    <code>Main_d1kv</code>, since unlike the f-x-static case, this
+    function is never called from user code, only from the adjustor
+    thunk.
+    <p>
+    Note also how the function takes a bogus parameter
+    <code>original_return_addr</code>, which is part of this extra-arg
+    hack.  The usual scheme is to leave the original caller's return
+    address in place and merely push the stable pointer above that,
+    hence the spare parameter.
+ <p> + Finally, there is some extra trickery, detailed in + <code>ghc/rts/Adjustor.c</code>, to get round the following + problem: the adjustor thunk lives in mallocville. It is + quite possible that the Haskell code will actually + call <code>free()</code> on the adjustor thunk used to get to it + -- because otherwise there is no way to reclaim the space used + by the adjustor thunk. That's all very well, but it means that + the C helper cannot return to the adjustor thunk in the obvious + way, since we've already given it back using <code>free()</code>. + So we leave, on the C stack, the address of whoever called the + adjustor thunk, and before calling the helper, mess with the stack + such that when the helper returns, it returns directly to the + adjustor thunk's caller. + <p> + That's how the <code>stdcall</code> convention works. If the + adjustor thunk has been called using the <code>ccall</code> + convention, we return indirectly, via a statically-allocated + yet-another-magic-piece-of-code, which takes care of removing the + extra argument that the adjustor thunk pushed onto the stack. + This is needed because in <code>ccall</code>-world, it is the + caller who removes args after the call, and the original caller of + the adjustor thunk has no way to know about the extra arg pushed + by the adjustor thunk. + <p> + You didn't really want to know all this stuff, did you? + <p> + + + + <b>(4) dynamic export of an non-IO-typed function from some module <code>MMM</code></b> + <p> + <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code> + <p> + (4) relates to (3) as (2) relates to (1), that is, it's identical, + except the C stub uses <code>rts_eval</code> instead of + <code>rts_evalIO</code>. + <p> + + + <h2>Some perspective on f-x-dynamic</h2> + + The only really horrible problem with f-x-dynamic is how the + adjustor thunk should pass to the C helper the stable pointer to + use. Ideally we would like this to be conveyed via some invisible + side channel, since then the adjustor thunk could simply jump + directly to the C helper, with no non-portable stack fiddling. + <p> + Unfortunately there is no obvious candidate for the invisible + side-channel. We've chosen to pass it on the stack, with the + bad consequences detailed above. Another possibility would be to + park it in a global variable, but this is non-reentrant and + non-(OS-)thread-safe. A third idea is to put it into a callee-saves + register, but that has problems too: the C helper may not use that + register and therefore we will have trashed any value placed there + by the caller; and there is no C-level portable way to read from + the register inside the C helper. + <p> + In short, we can't think of a really satisfactory solution. I'd + vote for introducing some kind of OS-thread-local-state and passing + it in there, but that introduces complications of its own. + <p> + <b>OS-thread-safety</b> is of concern in the C stubs, whilst + building up the expressions to run. These need to have exclusive + access to the heap whilst allocating in it. Also, there needs to + be some guarantee that no GC will happen in between the + <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code> + starts running. At the moment there are no guarantees for + either property. This needs to be sorted out before the + implementation can be regarded as fully safe to use. 
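+    <p>
+    For completeness, here is a sketch of how the pieces of case (3) fit
+    together from the user's point of view (hypothetical code:
+    <code>registerCallback</code> stands for some C function imported by
+    the program, and <code>freeHaskellFunPtr</code> is the standard FFI
+    operation for releasing the adjustor thunk, as discussed above):
+<pre>
+foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)
+foreign import registerCallback :: FunPtr a -> IO ()
+
+adder :: Int -> Int -> IO Int
+adder x y = return (x + y)
+
+main :: IO ()
+main = do fp <- mkCallback adder    -- build the adjustor thunk
+          registerCallback fp       -- hand its address to the C world
+          -- ... C code may now call back into adder ...
+          freeHaskellFunPtr fp      -- reclaim the adjustor thunk
+</pre>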
+
+<p><small>
+<!-- hhmts start -->
+Last modified: Weds 27 Feb 02
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/ghci.html b/docs/comm/the-beast/ghci.html
new file mode 100644
index 0000000000..b893acdeb4
--- /dev/null
+++ b/docs/comm/the-beast/ghci.html
@@ -0,0 +1,407 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - GHCi</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - GHCi</h1>
+
+    This isn't a coherent description of how GHCi works, sorry.  What
+    it is (currently) is a dumping ground for various bits of info
+    pertaining to GHCi, which ought to be recorded somewhere.
+
+    <h2>Debugging the interpreter</h2>
+
+    The usual symptom is that some expression / program crashes when
+    running on the interpreter (commonly), or gets weird results
+    (rarely).  Unfortunately, finding out what the problem really is
+    has proven to be extremely difficult.  In retrospect, it is arguably
+    a design flaw that GHC's implementation of the STG
+    execution mechanism provides only the weakest of support for
+    automated internal consistency checks.  This makes it hard to
+    debug.
+    <p>
+    Execution failures in the interactive system can be due to
+    problems with the bytecode interpreter, problems with the bytecode
+    generator, or problems elsewhere.  From the bugs seen so far,
+    the bytecode generator is often the culprit, with the interpreter
+    usually being correct.
+    <p>
+    Here are some tips for tracking down interactive nonsense:
+    <ul>
+      <li>Find the smallest source fragment which causes the problem.
+      <p>
+      <li>Using an RTS compiled with <code>-DDEBUG</code> (nb, that
+          means the RTS from the previous stage!), run with <code>+RTS
+          -D2</code> to get a listing in great detail from the
+          interpreter.  Note that the listing is so voluminous that
+          this is impractical unless you have been diligent in
+          the previous step.
+      <p>
+      <li>At least in principle, using the trace and a bit of GDB
+          poking around at the time of death, you can figure out what
+          the problem is.  In practice you quickly get depressed at
+          the hopelessness of ever making sense of the mass of
+          details.  Well, I do, anyway.
+      <p>
+      <li><code>+RTS -D2</code> tries hard to print useful
+          descriptions of what's on the stack, and often succeeds.
+          However, it has no way to map addresses to names in
+          code/data loaded by our runtime linker.  So the C function
+          <code>ghci_enquire</code> is provided.  Given an address, it
+          searches the loaded symbol tables for symbols close to that
+          address.  You can run it from inside GDB:
+      <pre>
+      (gdb) p ghci_enquire ( 0x50a406f0 )
+      0x50a406f0 + -48  ==  `PrelBase_Czh_con_info'
+      0x50a406f0 + -12  ==  `PrelBase_Izh_static_info'
+      0x50a406f0 + -48  ==  `PrelBase_Czh_con_entry'
+      0x50a406f0 + -24  ==  `PrelBase_Izh_con_info'
+      0x50a406f0 + 16   ==  `PrelBase_ZC_con_entry'
+      0x50a406f0 + 0    ==  `PrelBase_ZMZN_static_entry'
+      0x50a406f0 + -36  ==  `PrelBase_Czh_static_entry'
+      0x50a406f0 + -24  ==  `PrelBase_Izh_con_entry'
+      0x50a406f0 + 64   ==  `PrelBase_EQ_static_info'
+      0x50a406f0 + 0    ==  `PrelBase_ZMZN_static_info'
+      0x50a406f0 + 48   ==  `PrelBase_LT_static_entry'
+      $1 = void
+      </pre>
+          In this case the enquired-about address is
+          <code>PrelBase_ZMZN_static_entry</code>.  If no symbols are
+          close to the given addr, nothing is printed.  Not a great
+          mechanism, but better than nothing.
+      <p>
+      <li>We have had various problems in the past due to the bytecode
+          generator (<code>compiler/ghci/ByteCodeGen.lhs</code>) being
+          confused about the true set of free variables of an
+          expression.  The compilation scheme for <code>let</code>s
+          applies the BCO for the RHS of the let to its free
+          variables, so if the free-var annotation is wrong or
+          misleading, you end up with code which has wrong stack
+          offsets, which is usually fatal.
+      <p>
+      <li>The baseline behaviour of the interpreter is to interpret
+          BCOs, and hand all other closures back to the scheduler for
+          evaluation.  However, this causes a huge number of expensive
+          context switches, so the interpreter knows how to enter the
+          most common non-BCO closure types by itself.
+      <p>
+          These optimisations complicate the interpreter.
+          If you think you have an interpreter problem, re-enable the
+          define <code>REFERENCE_INTERPRETER</code> in
+          <code>ghc/rts/Interpreter.c</code>.  All optimisations are
+          thereby disabled, giving the baseline
+          I-only-know-how-to-enter-BCOs behaviour.
+      <p>
+      <li>Following the traces is often problematic because execution
+          hops back and forth between the interpreter, which is
+          traced, and compiled code, which you can't see.
+          Particularly annoying is when the stack looks OK in the
+          interpreter, then compiled code runs for a while, and later
+          we arrive back in the interpreter, with the stack corrupted,
+          and usually in a completely different place from where we
+          left off.
+      <p>
+          If this is biting you baaaad, it may be worth copying
+          sources for the compiled functions causing the problem into
+          your interpreted module, in the hope that you stay in the
+          interpreter more of the time.  Of course this doesn't work
+          very well if you've defined
+          <code>REFERENCE_INTERPRETER</code> in
+          <code>ghc/rts/Interpreter.c</code>.
+      <p>
+      <li>There are various commented-out pieces of code in
+          <code>Interpreter.c</code> which can be used to get the
+          stack sanity-checked after every entry, and even after
+          every bytecode instruction executed.  Note that some
+          bytecodes (<code>PUSH_UBX</code>) leave the stack in
+          an unwalkable state, so the <code>do_print_stack</code>
+          local variable is used to suppress the stack walk after
+          them.
+    </ul>
+
+
+    <h2>Useful stuff to know about the interpreter</h2>
+
+    The code generation scheme is straightforward (naive, in fact).
+    <code>-ddump-bcos</code> prints each BCO along with the Core it
+    was generated from, which is very handy.
+    <ul>
+      <li>Simple lets are compiled in-line.  For the general case, let
+          v = E in ..., E is compiled into a new BCO which takes as
+          args its free variables, and v is bound to AP(the new BCO,
+          free vars of E).
+      <p>
+      <li><code>case</code>s, as usual, become: push the return
+          continuation, enter the scrutinee.  There is some magic to
+          make all combinations of compiled/interpreted calls and
+          returns work, described below.  In the interpreted case, all
+          case alts are compiled into a single big return BCO, which
+          commences with instructions implementing a switch tree.
+    </ul>
+    <p>
+    <b>ARGCHECK magic</b>
+    <p>
+    You may find ARGCHECK instructions at the start of BCOs which
+    don't appear to need them; case continuations in particular.
+    These play an important role: they force objects which should
+    evaluate to BCOs to actually be BCOs.
+    <p>
+    Typically, there may be an application node somewhere in the heap.
+    This is a thunk which when leant on turns into a BCO for a return
+    continuation.
+The thunk may get entered with an update frame on
+    top of the stack.  This is legitimate since from one viewpoint
+    this is an AP which simply reduces to a data object, so does not
+    have functional type.  However, once the AP turns itself into a
+    BCO (so to speak) we cannot simply enter the BCO, because that
+    expects to see args on top of the stack, not an update frame.
+    Therefore any BCO which expects something on the stack above an
+    update frame, even a non-function BCO, starts with an ARGCHECK.  In
+    this case it fails, the update is done, the update frame is
+    removed, and the BCO re-entered.  Subsequent entries of the BCO of
+    course go unhindered.
+    <p>
+    The optimised interpreter (<code>#undef REFERENCE_INTERPRETER</code>)
+    handles this case specially, so that a trip through the scheduler is
+    avoided.  When reading traces from <code>+RTS -D2 -RTS</code>, you
+    may see BCOs which appear to execute their initial ARGCHECK insn
+    twice.  The first time it fails; the interpreter does the update
+    immediately and re-enters with no further comment.
+    <p>
+    This is all a bit ugly, and, as SimonM correctly points out, it
+    would have been cleaner to make BCOs unpointed (unthunkable)
+    objects, so that a pointer to something <code>:: BCO#</code>
+    really points directly at a BCO.
+    <p>
+    <b>Stack management</b>
+    <p>
+    There isn't any attempt to stub the stack, minimise its growth, or
+    generally remove unused pointers ahead of time.  This is really
+    due to laziness on my part, although it does have the minor
+    advantage that doing something cleverer would almost certainly
+    increase the number of bytecodes that would have to be executed.
+    Of course we SLIDE out redundant stuff, to get the stack back to
+    the sequel depth, before returning a HNF, but that's all.  As
+    usual this is probably a cause of major space leaks.
+    <p>
+    <b>Building constructors</b>
+    <p>
+    Constructors are built on the stack and then dumped into the heap
+    with a single PACK instruction, which simply copies the top N
+    words of the stack verbatim into the heap, adds an info table, and zaps N
+    words from the stack.  The constructor args are pushed onto the
+    stack one at a time.  One upshot of this is that unboxed values
+    get pushed untaggedly onto the stack (via PUSH_UBX), because that's how they
+    will be in the heap.  That in turn means that the stack is not
+    always walkable at arbitrary points in BCO execution, although
+    naturally it is whenever GC might occur.
+    <p>
+    Function closures created by the interpreter use the AP-node
+    (tagged) format, so although their fields are similarly
+    constructed on the stack, there is never a stack walkability
+    problem.
+    <p>
+    <b>Unpacking constructors</b>
+    <p>
+    At the start of a case continuation, the returned constructor is
+    unpacked onto the stack, which means that unboxed fields have to
+    be tagged.  Rather than burdening all such continuations with a
+    complex, general mechanism, I split it into two.  The
+    allegedly-common all-pointers case uses a single UNPACK insn
+    to fish out all fields with no further ado.  The slow case uses a
+    sequence of more complex UPK_TAG insns, one for each field (I
+    think).  This seemed like a good compromise to me.
+    <p>
+    <b>Perspective</b>
+    <p>
+    I designed the bytecode mechanism with the experience of both STG
+    Hugs and Classic Hugs in mind.  The latter has a small
+    set of bytecodes, a small interpreter loop, and runs amazingly
+    fast considering the cruddy code it has to interpret.
+    The former had a large interpretative loop with many different
+    opcodes, including multiple minor variants of the same thing,
+    which made it difficult to optimise and maintain, yet it performed
+    more or less comparably with Classic Hugs.
+    <p>
+    My design aims were therefore to minimise the interpreter's
+    complexity whilst maximising performance.  This means reducing the
+    number of opcodes implemented, whilst reducing the number of insns
+    despatched.  In particular there are only two opcodes, PUSH_UBX
+    and UPK_TAG, which deal with tags.  STG Hugs had dozens of opcodes
+    for dealing with tagged data.  In cases where the common
+    all-pointers case is significantly simpler (UNPACK) I deal with it
+    specially.  Finally, the number of insns executed is reduced a
+    little by merging multiple pushes, giving PUSH_LL and PUSH_LLL.
+    These opcode pairings were determined by using the opcode-pair
+    frequency profiling stuff which is ifdef-d out in
+    <code>Interpreter.c</code>.  These significantly improve
+    performance without having much effect on the ugliness or
+    complexity of the interpreter.
+    <p>
+    Overall, the interpreter design is something which turned out
+    well, and I was pleased with it.  Unfortunately I cannot say the
+    same of the bytecode generator.
+
+    <h2><code>case</code> returns between interpreted and compiled code</h2>
+
+    Variants of the following scheme have been drifting around in GHC
+    RTS documentation for several years.  Since what follows is
+    actually what is implemented, I guess it supersedes all other
+    documentation.  Beware; the following may make your brain melt.
+    In all the pictures below, the stack grows downwards.
+    <p>
+    <b>Returning to interpreted code</b>.
+    <p>
+    Interpreted returns employ a set of polymorphic return infotables.
+    Each element in the set corresponds to one of the possible return
+    registers (R1, D1, F1) that compiled code will place the returned
+    value in.  In fact this is a bit misleading, since R1 can be used
+    to return either a pointer or an int, and we need to distinguish
+    these cases.  So, supposing the set of return registers is {R1p,
+    R1n, D1, F1}, there would be four corresponding infotables,
+    <code>stg_ctoi_ret_R1p_info</code>, etc.  In the pictures below we
+    call them <code>stg_ctoi_ret_REP_info</code>.
+    <p>
+    These return itbls are polymorphic, meaning that all 8 vectored
+    return codes and the direct return code are identical.
+    <p>
+    Before the scrutinee is entered, the stack is arranged like this:
+    <pre>
+         |        |
+         +--------+
+         |  BCO   | -------> the return continuation BCO
+         +--------+
+         | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows:
+         +--------+
+                                BCO* bco = Sp[1];
+                                push R1/F1/D1 depending on REP
+                                push bco
+                                yield to sched
+    </pre>
+    On entry, the interpreted continuation BCO expects the stack to look
+    like this:
+    <pre>
+         |        |
+         +--------+
+         |  BCO   | -------> the return continuation BCO
+         +--------+
+         | itbl * | -------> stg_ctoi_ret_REP_info, with all 9 codes as follows:
+         +--------+
+         : VALUE  :  (the returned value, shown with : since it may occupy
+         +--------+   multiple stack words)
+    </pre>
+    A machine code return will park the returned value in R1/F1/D1,
+    and enter the itbl on the top of the stack.  Since it's our magic
+    itbl, this pushes the returned value onto the stack, which is
+    where the interpreter expects to find it.  It then pushes the BCO
+    (again) and yields.  The scheduler removes the BCO from the top,
+    and enters it, so that the continuation is interpreted with the
+    stack as shown above.
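+    <p>
+    The naming convention for these magic itbls is regular enough to
+    write down directly.  A tiny Haskell rendering, purely illustrative
+    (the real tables are RTS entry code, not Haskell values):
+    <pre>
+   -- The four return conventions described above.
+   data RetRep = R1p    -- pointer returned in R1
+               | R1n    -- non-pointer returned in R1
+               | D1     -- double returned in D1
+               | F1     -- float returned in F1
+     deriving Show
+
+   ctoiRetItbl :: RetRep -> String
+   ctoiRetItbl rep = "stg_ctoi_ret_" ++ show rep ++ "_info"
+   -- e.g. ctoiRetItbl R1p == "stg_ctoi_ret_R1p_info"
+    </pre>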
+    <p>
+    An interpreted return will create the value to return at the top
+    of the stack.  It then examines the return itbl, which must be
+    immediately underneath the return value, to see if it is one of
+    the magic <code>stg_ctoi_ret_REP_info</code> set.  Since this is so,
+    it knows it is returning to an interpreted continuation.  It
+    therefore simply enters the BCO which it assumes is immediately
+    underneath the itbl on the stack.
+
+    <p>
+    <b>Returning to compiled code</b>.
+    <p>
+    Before the scrutinee is entered, the stack is arranged like this:
+    <pre>
+                             ptr to vec code 8 ------> return vector code 8
+         |        |          ....
+         +--------+          ptr to vec code 1 ------> return vector code 1
+         | itbl * | --       Itbl end
+         +--------+   \      ....
+                       \     Itbl start
+                        ----> direct return code
+    </pre>
+    The scrutinee value is then entered.
+    The case continuation(s) expect the stack to look the same, with
+    the returned HNF in a suitable return register, R1, D1, F1 etc.
+    <p>
+    A machine code return knows whether it is doing a vectored or
+    direct return, and, if the former, which vector element it is.
+    So, for a direct return we jump to <code>Sp[0]</code>, and for a
+    vectored return, jump to <code>((CodePtr*)(Sp[0]))[ - ITBL_LENGTH
+    - vector number ]</code>.  This is (of course) the scheme that
+    compiled code has been using all along.
+    <p>
+    An interpreted return will, as described just above, have examined
+    the itbl immediately beneath the return value it has just pushed,
+    and found it not to be one of the <code>stg_ctoi_ret_REP_info</code>
+    set, so it knows this must be a return to machine code.  It needs to
+    pop the return value, currently on the stack, into R1/F1/D1, and
+    jump through the info table.  Unfortunately the first part cannot
+    be accomplished directly since we are not in Haskellised-C world.
+    <p>
+    We therefore employ a second family of magic infotables, indexed,
+    like the first, on the return representation, and therefore with
+    names of the form <code>stg_itoc_ret_REP_info</code>.  (Note:
+    <code>itoc</code>; the previous bunch were <code>ctoi</code>).
+    This is pushed onto the stack (note, tagged values have their tag
+    zapped), giving:
+    <pre>
+         |        |
+         +--------+
+         | itbl * | -------> arbitrary machine code return itbl
+         +--------+
+         : VALUE  :  (the returned value, possibly multiple words)
+         +--------+
+         | itbl * | -------> stg_itoc_ret_REP_info, with code:
+         +--------+
+                                pop myself (stg_itoc_ret_REP_info) off the stack
+                                pop return value into R1/D1/F1
+                                do standard machine code return to itbl at t.o.s.
+    </pre>
+    We then return to the scheduler, asking it to enter the itbl at
+    t.o.s.  When entered, <code>stg_itoc_ret_REP_info</code> removes
+    itself from the stack, pops the return value into the relevant
+    return register, and returns to the itbl to which we were trying
+    to return in the first place.
+    <p>
+    Amazingly enough, this stuff all actually works!  Well, mostly ...
+    <p>
+    <b>Unboxed tuples: a Right Royal Spanner In The Works</b>
+    <p>
+    The above scheme depends crucially on having magic infotables
+    <code>stg_{itoc,ctoi}_ret_REP_info</code> for each return
+    representation <code>REP</code>.  It unfortunately fails miserably
+    in the face of unboxed tuple returns, because the set of required
+    tables would be infinite; this despite the fact that for any given
+    unboxed tuple return type, the scheme could be made to work fine.
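+    <p>
+    To see why the set of tables is infinite, note that a single-value
+    return needs one table per representation, whereas an unboxed tuple
+    return would need one per <em>sequence</em> of representations.  A
+    little illustrative Haskell, invented purely to make the counting
+    argument concrete:
+    <pre>
+   -- One table per base representation: four tables suffice.
+   data Rep = Ptr | NonPtr | Flt | Dbl
+
+   -- An unboxed-tuple return is described by a *list* of Reps, e.g.
+   --    (# Int#, Double# #)   corresponds to   [NonPtr, Dbl]
+   -- and there are infinitely many such lists, so infinitely many
+   -- table pairs would be needed.
+   type TupleShape = [Rep]
+    </pre>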
+ <p> + This is a serious problem, because it prevents interpreted + code from doing <code>IO</code>-typed returns, since <code>IO + t</code> is implemented as <code>(# t, RealWorld# #)</code> or + thereabouts. This restriction in turn rules out FFI stuff in the + interpreter. Not good. + <p> + Although we have no way to make general unboxed tuples work, we + can at least make <code>IO</code>-types work using the following + ultra-kludgey observation: <code>RealWorld#</code> doesn't really + exist and so has zero size, in compiled code. In turn this means + that a type of the form <code>(# t, RealWorld# #)</code> has the + same representation as plain <code>t</code> does. So the bytecode + generator, whilst rejecting code with general unboxed tuple + returns, recognises and accepts this special case. Which means + that <code>IO</code>-typed stuff works in the interpreter. Just. + <p> + If anyone asks, I will claim I was out of radio contact, on a + 6-month walking holiday to the south pole, at the time this was + ... er ... dreamt up. + + +<p><small> + +<!-- hhmts start --> +Last modified: Thursday February 7 15:33:49 GMT 2002 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/the-beast/main.html b/docs/comm/the-beast/main.html new file mode 100644 index 0000000000..332ffaa501 --- /dev/null +++ b/docs/comm/the-beast/main.html @@ -0,0 +1,35 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Compiling and running the Main module</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>Compiling and running the Main module</h1> + +GHC allows you to determine which module contains the "main" function, and +what that function is called, via the <code>-fmain-is</code> flag. The trouble is +that the runtime system is fixed, so what symbol should it link to? +<p> +The current solution is this. Suppose the main function is <code>Foo.run</code>. +<ul> +<li> +Then, when compiling module <code>Foo</code>, GHC adds an extra definition: +<pre> + :Main.main = runIO Foo.run +</pre> +Now the RTS can invoke <code>:Main.main</code> to start the program. (This extra +definition is inserted in TcRnDriver.checkMain.) +<p><li> +Before starting the program, though, the RTS also initialises the module tree +by calling <code>init_:Main</code>, so when compiling the main module (Foo in this case), +as well as generating <code>init_Foo</code> as usual, GHC also generates +<pre> + init_zcMain() { init_Foo; } +</pre> +This extra initialisation code is generated in CodeGen.mkModuleInit. +</ul> + + </body> +</html> diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html new file mode 100644 index 0000000000..1ad80f0d5c --- /dev/null +++ b/docs/comm/the-beast/mangler.html @@ -0,0 +1,79 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Evil Mangler</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - The Evil Mangler</h1> + <p> + The Evil Mangler (EM) is a Perl script invoked by the <a + href="driver.html">Glorious Driver</a> after the C compiler (gcc) has + translated the GHC-produced C code into assembly. Consequently, it is + only of interest if <code>-fvia-C</code> is in effect (either explicitly + or implicitly). 
+
+    <h4>Its purpose</h4>
+    <p>
+      The EM reads the assembly produced by gcc and re-arranges code blocks
+      as well as nuking instructions that it considers <em>non-essential</em>.
+      It derives its evilness from its utterly ad hoc, machine-, compiler-,
+      and whatnot-dependent design and implementation.  More precisely, the
+      EM performs the following tasks:
+    <ul>
+      <li>The code executed when a closure is entered is moved adjacent to
+        that closure's infotable.  Moreover, the order of the info table
+        entries is reversed.  Also, SRT pointers are removed from closures
+        that don't need them (non-FUN, RET and THUNK ones).
+      <li>Function prologue and epilogue code is removed.  (GHC generated
+        code manages its own stack and uses the system stack only for return
+        addresses and during calls to C code.)
+      <li>Certain code patterns are replaced by simpler code (eg, loads of
+        fast entry points followed by indirect jumps are replaced by direct
+        jumps to the fast entry point).
+    </ul>
+
+    <h4>Implementation</h4>
+    <p>
+      The EM is located in the Perl script <a
+        href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>.
+      The script reads the <code>.s</code> file and chops it up into
+      <em>chunks</em> (that's what they are actually called in the script)
+      that roughly correspond to basic blocks.  Each chunk is annotated with
+      an educated guess about what kind of code it contains (e.g., infotable,
+      fast entry point, slow entry point, etc.).  The annotations also contain
+      the symbol introducing the chunk of assembly and whether that chunk has
+      already been processed or not.
+    <p>
+      The parsing of the input into chunks as well as recognising assembly
+      instructions that are to be removed or altered is based on a large
+      number of Perl regular expressions sprinkled over the whole code.  These
+      expressions are rather fragile as they heavily rely on the structure of
+      the generated code -- in fact, they even rely on the right amount of
+      white space and thus on the formatting of the assembly.
+    <p>
+      Afterwards, the chunks are reordered, some of them purged, and some
+      stripped of some useless instructions.  Moreover, some instructions are
+      manipulated (eg, loads of fast entry points followed by indirect jumps
+      are replaced by direct jumps to the fast entry point).
+    <p>
+      The EM knows which part of the code belongs to function prologues and
+      epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the
+      form <code>--- BEGIN ---</code> and <code>--- END ---</code> to the
+      assembly just before and after the code proper of a function.
+      It adds these tags using gcc's <code>__asm__</code> feature.
+    <p>
+      <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic
+      block re-ordering and dead code elimination.  This seems to make the
+      whole <code>--- END ---</code> tag business redundant -- in fact, if
+      proper code is generated, no <code>--- END ---</code> tags survive the
+      gcc optimiser.
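+    <p>
+      Although the mangler itself is Perl, the per-chunk bookkeeping it
+      maintains is easy to picture.  Here is a sketch in Haskell of the
+      information the script tracks for each chunk (names invented; the
+      script itself just uses Perl data structures):
+    <pre>
+   -- What the mangler knows about each chunk of assembly (a sketch).
+   data ChunkKind
+     = InfoTable      -- an info table
+     | FastEntry      -- a fast entry point
+     | SlowEntry      -- a slow entry point
+     | OtherCode      -- anything else it half-recognises
+     deriving (Eq, Show)
+
+   data Chunk = Chunk
+     { chunkSymbol    :: String       -- the symbol introducing the chunk
+     , chunkKind      :: ChunkKind    -- the educated guess at its contents
+     , chunkProcessed :: Bool         -- has it been dealt with yet?
+     , chunkText      :: [String]     -- the raw assembly lines
+     }
+     deriving Show
+    </pre>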
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Sun Feb 17 17:55:47 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/modules.html b/docs/comm/the-beast/modules.html
new file mode 100644
index 0000000000..a6655a68a7
--- /dev/null
+++ b/docs/comm/the-beast/modules.html
@@ -0,0 +1,80 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Modules, ModuleNames and Packages</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>Modules, ModuleNames and Packages</h1>
+
+    <p>This section describes the datatypes <code>ModuleName</code>,
+    <code>Module</code>, and <code>PackageName</code>, all available
+    from the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Module.lhs"><code>Module</code></a>.<p>
+
+    <h2>Packages</h2>
+
+    <p>A package is a collection of (zero or more) Haskell modules,
+    together with some information about external libraries, extra C
+    compiler options, and other things that this collection of modules
+    requires.  When using DLLs on Windows (or shared libraries on a
+    Unix system; currently unsupported), a package can consist of only
+    a single shared library of Haskell code; the reason for this is
+    described below.
+
+    <p>Packages are further described in the User's Guide <a
+      href="http://www.haskell.org/ghc/docs/latest/packages.html">here</a>.
+
+    <h2>The ModuleName type</h2>
+
+    <p>At the bottom of the hierarchy is a <code>ModuleName</code>,
+    which, as its name suggests, is simply the name of a module.  It
+    is represented as a Z-encoded FastString, and is an instance of
+    <code>Uniquable</code> so we can build <code>FiniteMap</code>s
+    with <code>ModuleName</code>s as the keys.
+
+    <p>A <code>ModuleName</code> can be built from a
+    <code>String</code>, using the <code>mkModuleName</code> function.
+
+    <h2>The Module type</h2>
+
+    <p>For a given module, the compiler also needs to know whether the
+    module is in the <em>home package</em>, or in another package.
+    This distinction is important for two reasons:
+
+    <ul>
+      <li><p>When generating code to call a function in another package,
+      the compiler might have to generate a cross-DLL call, which is
+      different from an intra-DLL call (hence the restriction that the
+      code in a package can only reside in a single DLL).
+
+      <li><p>We avoid putting version information in an interface file
+      for entities defined in another package, on the grounds that other
+      packages are generally "stable".  This also helps keep the size of
+      interface files down.
+    </ul>
+
+    <p>The <code>Module</code> type contains a <code>ModuleName</code>
+    and a <code>PackageInfo</code> field.  The
+    <code>PackageInfo</code> indicates whether the given
+    <code>Module</code> comes from the current package or from another
+    package.
+
+    <p>To get the actual package in which a given module resides, you
+    have to read the interface file for that module, which contains
+    the package name (actually the value of the
+    <code>-package-name</code> flag when that module was built).  This
+    information is currently unused inside the compiler, but we might
+    make use of it in the future, especially with the advent of
+    hierarchical modules, to allow the compiler to automatically
+    figure out which packages a program should be linked with, and
+    thus avoid the need to specify <code>-package</code> options on
+    the command line.
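+    <p>Put together, the types described here have roughly the following
+    shape.  This is a schematic only: the real definitions, built on
+    <code>FastString</code>s and uniques, live in
+    <code>basicTypes/Module.lhs</code>, and the constructor names below
+    are invented.
+    <pre>
+   type EncodedFS = String      -- stands in for a Z-encoded FastString
+
+   newtype ModuleName = ModuleName EncodedFS
+     deriving (Eq, Ord, Show)
+
+   -- Home package or some other package: all the compiler records.
+   data PackageInfo = ThisPackage | AnotherPackage
+     deriving (Eq, Show)
+
+   data Module = Module ModuleName PackageInfo
+     deriving (Eq, Show)
+
+   mkModuleName :: String -> ModuleName
+   mkModuleName = ModuleName    -- the real version Z-encodes the string
+    </pre>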
+
+    <p><code>Module</code>s are also instances of
+    <code>Uniquable</code>, and indeed the unique of a
+    <code>Module</code> is the same as the unique of the underlying
+    <code>ModuleName</code>.
+  </body>
+</html>
diff --git a/docs/comm/the-beast/names.html b/docs/comm/the-beast/names.html
new file mode 100644
index 0000000000..061fae3ebf
--- /dev/null
+++ b/docs/comm/the-beast/names.html
@@ -0,0 +1,169 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The truth about names: OccNames, and Names</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The truth about names: OccNames, and Names</h1>
+    <p>
+      Every entity (type constructor, class, identifier, type variable) has a
+      <code>Name</code>.  The <code>Name</code> type is pervasive in GHC, and
+      is defined in <code>basicTypes/Name.lhs</code>.  Here is what a Name
+      looks like, though it is private to the Name module.
+    </p>
+    <blockquote>
+      <pre>
+data Name = Name {
+      n_sort :: NameSort,     -- What sort of name it is
+      n_occ  :: !OccName,     -- Its occurrence name
+      n_uniq :: Unique,       -- Its identity
+      n_loc  :: !SrcLoc       -- Definition site
+    }</pre>
+    </blockquote>
+    <ul>
+      <li> The <code>n_sort</code> field says what sort of name this is: see
+      <a href="#sort">NameSort below</a>.
+      <li> The <code>n_occ</code> field gives the "occurrence name" of the
+      Name; see
+      <a href="#occname">OccName below</a>.
+      <li> The <code>n_uniq</code> field allows fast tests for equality of
+      Names.
+      <li> The <code>n_loc</code> field gives some indication of where the
+      name was bound.
+    </ul>
+
+    <h2><a name="sort">The <code>NameSort</code> of a <code>Name</code></a></h2>
+    <p>
+      There are four flavours of <code>Name</code>:
+    </p>
+    <blockquote>
+      <pre>
+data NameSort
+  = External Module (Maybe Name)
+      -- (Just parent) => this Name is a subordinate name of 'parent'
+      -- e.g. data constructor of a data type, method of a class
+      -- Nothing => not a subordinate
+
+  | WiredIn Module (Maybe Name) TyThing BuiltInSyntax
+      -- A variant of External, for wired-in things
+
+  | Internal            -- A user-defined Id or TyVar
+                        -- defined in the module being compiled
+
+  | System              -- A system-defined Id or TyVar.  Typically the
+                        -- OccName is very uninformative (like 's')</pre>
+    </blockquote>
+    <ul>
+      <li>Here are the sorts of Name an entity can have:
+      <ul>
+        <li> Class, TyCon: External.
+        <li> Id: External, Internal, or System.
+        <li> TyVar: Internal, or System.
+      </ul>
+      </li>
+      <li>An <code>External</code> name has a globally-unique
+      (module name, occurrence name) pair, namely the
+      <em>original name</em> of the entity,
+      describing where the thing was originally defined.  So for example,
+      if we have
+      <blockquote>
+        <pre>
+module M where
+  f = e1
+  g = e2
+
+module A where
+  import qualified M as Q
+  import M
+  a = Q.f + g</pre>
+      </blockquote>
+      <p>
+      then the RdrNames for "a", "Q.f" and "g" get replaced (by the
+      Renamer) by the Names "A.a", "M.f", and "M.g" respectively.
+      </p>
+      </li>
+      <li>An <code>InternalName</code>
+      has only an occurrence name.  Distinct InternalNames may have the same
+      occurrence name; use the Unique to distinguish them.
+      </li>
+      <li>An <code>ExternalName</code> has a unique that never changes.  It
+      is never cloned.  This is important, because the simplifier invents
+      new names pretty freely, but we don't want to lose the connection
+      with the type environment (constructed earlier).
+      An <code>InternalName</code> can be cloned freely.
+      </li>
+      <li><strong>Before CoreTidy</strong>: the Ids that were defined at top
+      level in the original source program get <code>ExternalNames</code>,
+      whereas extra top-level bindings generated (say) by the type checker
+      get <code>InternalNames</code>.  This distinction is occasionally
+      useful for filtering diagnostic output; e.g. for -ddump-types.
+      </li>
+      <li><strong>After CoreTidy</strong>: An Id with an
+      <code>ExternalName</code> will generate symbols that
+      appear as external symbols in the object file.  An Id with an
+      <code>InternalName</code> cannot be referenced from outside the
+      module, and so generates a local symbol in the object file.  The
+      CoreTidy pass makes the decision about which names should be External
+      and which Internal.
+      </li>
+      <li>A <code>System</code> name is for the most part the same as an
+      <code>Internal</code>.  Indeed, the differences are purely cosmetic:
+      <ul>
+        <li>Internal names usually come from some name the
+        user wrote, whereas a System name has an OccName like "a", or "t".
+        Usually there are masses of System names with the same OccName but
+        different uniques, whereas typically there are only a handful of
+        distinct Internal names with the same OccName.
+        </li>
+        <li>Another difference is that, when unifying, the type checker tries
+        to unify away type variables with System names, leaving ones with
+        Internal names (to improve error messages).
+        </li>
+      </ul>
+      </li>
+    </ul>
+
+    <h2><a name="occname">Occurrence names: <code>OccName</code></a></h2>
+    <p>
+      An <code>OccName</code> is more-or-less just a string, like "foo" or
+      "Tree", giving the (unqualified) name of an entity.
+    </p>
+    <p>
+      Well, not quite just a string, because in Haskell a name like "C" could
+      mean a type constructor or data constructor, depending on context.  So
+      GHC defines a type <tt>OccName</tt> (defined in
+      <tt>basicTypes/OccName.lhs</tt>) that is a pair of a <tt>FastString</tt>
+      and a <tt>NameSpace</tt> indicating which name space the name is drawn
+      from:
+      <blockquote>
+        <pre>
+data OccName = OccName NameSpace EncodedFS</pre>
+      </blockquote>
+    <p>
+      The <tt>EncodedFS</tt> is a synonym for <tt>FastString</tt> indicating
+      that the string is Z-encoded.  (Details in <tt>OccName.lhs</tt>.)
+      Z-encoding encodes funny characters like '%' and '$' into alphabetic
+      characters, like "zp" and "zd", so that they can be used in object-file
+      symbol tables without confusing linkers and suchlike.
+ </p> + <p> + The name spaces are: + </p> + <ul> + <li> <tt>VarName</tt>: ordinary variables</li> + <li> <tt>TvName</tt>: type variables</li> + <li> <tt>DataName</tt>: data constructors</li> + <li> <tt>TcClsName</tt>: type constructors and classes (in Haskell they + share a name space) </li> + </ul> + + <small> +<!-- hhmts start --> +Last modified: Wed May 4 14:57:55 EST 2005 +<!-- hhmts end --> + </small> + </body> +</html> + diff --git a/docs/comm/the-beast/ncg.html b/docs/comm/the-beast/ncg.html new file mode 100644 index 0000000000..5810a35212 --- /dev/null +++ b/docs/comm/the-beast/ncg.html @@ -0,0 +1,749 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Native Code Generator</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - The Native Code Generator</h1> + <p> + On some platforms (currently x86 and PowerPC, with bitrotted + support for Sparc and Alpha), GHC can generate assembly code + directly, without having to go via C. This can sometimes almost + halve compilation time, and avoids the fragility and + horribleness of the <a href="mangler.html">mangler</a>. The NCG + is enabled by default for + non-optimising compilation on supported platforms. For most programs + it generates code which runs only 1-3% slower + (depending on platform and type of code) than that + created by gcc on x86s, so it is well worth using even with + optimised compilation. FP-intensive x86 programs see a bigger + slowdown, and all Sparc code runs about 5% slower due to + us not filling branch delay slots. + <p> + The NCG has always been something of a second-class citizen + inside GHC, an unloved child, rather. This means that its + integration into the compiler as a whole is rather clumsy, which + brings some problems described below. That apart, the NCG + proper is fairly cleanly designed, as target-independent as it + reasonably can be, and so should not be difficult to retarget. + <p> + <b>NOTE!</b> The native code generator was largely rewritten as part + of the C-- backend changes, around May 2004. Unfortunately the + rest of this document still refers to the old version, and was written + with relation to the CVS head as of end-Jan 2002. Some of it is relevant, + some of it isn't. + + <h2>Overview</h2> + The top-level code generator fn is + <p> + <code>absCtoNat :: AbstractC -> UniqSM (SDoc, Pretty.Doc)</code> + <p> + The returned <code>SDoc</code> is for debugging, so is empty unless + you specify <code>-ddump-stix</code>. The <code>Pretty.Doc</code> + bit is the final assembly code. Translation involves three main + phases, the first and third of which are target-independent. + <ul> + <li><b>Translation into the <code>Stix</code> representation.</b> Stix + is a simple tree-like RTL-style language, in which you can + mention: + <p> + <ul> + <li>An infinite number of temporary, virtual registers. + <li>The STG "magic" registers (<code>MagicId</code>), such as + the heap and stack pointers. + <li>Literals and low-level machine ops (<code>MachOp</code>). + <li>Simple address computations. + <li>Reads and writes of: memory, virtual regs, and various STG + regs. + <li>Labels and <code>if ... goto ...</code> style control-flow. + </ul> + <p> + Stix has two main associated types: + <p> + <ul> + <li><code>StixStmt</code> -- trees executed for their side + effects: assignments, control transfers, and auxiliary junk + such as segment changes and literal data. 
+          <li><code>StixExpr</code> -- trees which denote a value.
+          </ul>
+          <p>
+          Translation into Stix is almost completely
+          target-independent.  Needed dependencies are knowledge of
+          word size and endianness, used when generating code to
+          deal with half-word fields in info tables.  This could be
+          abstracted out easily enough.  Also, the Stix translation
+          needs to know which <code>MagicId</code>s map to registers
+          on the given target, and which are stored in offsets from
+          <code>BaseReg</code>.
+          <p>
+          After initial Stix generation, the trees are cleaned up with
+          constant-folding and a little copy-propagation ("Stix
+          inlining", as the code misleadingly calls it).  We take
+          the opportunity to translate <code>MagicId</code>s which are
+          stored in memory on the given target, into suitable memory
+          references.  Those which are stored in registers are left
+          alone.  There is also a half-hearted attempt to lift literal
+          strings to the top level in cases where nested strings have
+          been observed to give incorrect code in the past.
+          <p>
+          Primitive machine-level operations will already be phrased in
+          terms of <code>MachOp</code>s in the presented Abstract C, and
+          these are passed through unchanged.  We comment only that the
+          <code>MachOp</code>s have been chosen so as to be easy to
+          implement on all targets, and their meaning is intended to be
+          unambiguous, and the same on all targets, regardless of word
+          size or endianness.
+          <p>
+          <b>A note on <code>MagicId</code>s.</b>
+          Those which are assigned to
+          registers on the current target are left unmodified.  Those
+          which are not are stored in memory as offsets from
+          <code>BaseReg</code> (which is assumed to permanently have the
+          value <code>(&MainCapability.r)</code>), so the constant folder
+          calculates the offsets and inserts suitable loads/stores.  One
+          complication is that not all archs have <code>BaseReg</code>
+          itself in a register, so for those (sparc), we instead
+          generate the address as an offset from the static symbol
+          <code>MainCapability</code>, since the register table lives in
+          there.
+          <p>
+          Finally, <code>BaseReg</code> does occasionally itself get
+          mentioned in Stix expression trees, and in this case what is
+          denoted is precisely <code>(&MainCapability.r)</code>, not, as
+          in all other cases, the value of memory at some offset from
+          the start of the register table.  Since what it denotes is an
+          r-value and not an l-value, assigning <code>BaseReg</code> is
+          meaningless, so the machinery checks to ensure this never
+          happens.  All these details are taken into account by the
+          constant folder.
+          <p>
+          <li><b>Instruction selection.</b>  This is the only majorly
+          target-specific phase.  It turns Stix statements and
+          expressions into sequences of <code>Instr</code>, a data
+          type which is different for each architecture.
+          <code>Instr</code>, unsurprisingly, has various supporting
+          types, such as <code>Reg</code>, <code>Operand</code>,
+          <code>Imm</code>, etc.  The generated instructions may refer
+          to specific machine registers, or to arbitrary virtual
+          registers, either those created within the instruction
+          selector, or those mentioned in the Stix passed to it.
+          <p>
+          The instruction selectors live in <code>MachCode.lhs</code>.
+ The core functions, for each target, are: + <p> + <code> + getAmode :: StixExpr -> NatM Amode + <br>getRegister :: StixExpr -> NatM Register + <br>assignMem_IntCode :: PrimRep -> StixExpr -> StixExpr -> NatM InstrBlock + <br>assignReg_IntCode :: PrimRep -> StixReg -> StixExpr -> NatM InstrBlock + </code> + <p> + The insn selectors use the "maximal munch" algorithm. The + bizarrely-misnamed <code>getRegister</code> translates + expressions. A simplified version of its type is: + <p> + <code>getRegister :: StixExpr -> NatM (OrdList Instr, Reg)</code> + <p> + That is: it (monadically) turns a <code>StixExpr</code> into a + sequence of instructions, and a register, with the meaning + that after executing the (possibly empty) sequence of + instructions, the (possibly virtual) register will + hold the resulting value. The real situation is complicated + by the presence of fixed registers, and is detailed below. + <p> + Maximal munch is a greedy algorithm and is known not to give + globally optimal code sequences, but it is good enough, and + fast and simple. Early incarnations of the NCG used something + more sophisticated, but that is long gone now. + <p> + Similarly, <code>getAmode</code> translates a value, intended + to denote an address, into a sequence of insns leading up to + a (processor-specific) addressing mode. This stuff could be + done using the general <code>getRegister</code> selector, but + would necessarily generate poorer code, because the calculated + address would be forced into a register, which might be + unnecessary if it could partially or wholly be calculated + using an addressing mode. + <p> + Finally, <code>assignMem_IntCode</code> and + <code>assignReg_IntCode</code> create instruction sequences to + calculate a value and store it in the given register, or at + the given address. Because these guys translate a statement, + not a value, they just return a sequence of insns and no + associated register. Floating-point and 64-bit integer + assignments have analogous selectors. + <p> + Apart from the complexities of fixed vs floating registers, + discussed below, the instruction selector is as simple + as it can be. It looks long and scary but detailed + examination reveals it to be fairly straightforward. + <p> + <li><b>Register allocation.</b> The register allocator, + <code>AsmRegAlloc.lhs</code> takes sequences of + <code>Instr</code>s which mention a mixture of real and + virtual registers, and returns a modified sequence referring + only to real ones. It is gloriously and entirely + target-independent. Well, not exactly true. Instead it + regards <code>Instr</code> (instructions) and <code>Reg</code> + (virtual and real registers) as abstract types, to which it has + the following interface: + <p> + <code> + insnFuture :: Instr -> InsnFuture + <br>regUsage :: Instr -> RegUsage + <br>patchRegs :: Instr -> (Reg -> Reg) -> Instr + </code> + <p> + <code>insnFuture</code> is used to (re)construct the graph of + all possible control transfers between the insns to be + allocated. <code>regUsage</code> returns the sets of registers + read and written by an instruction. And + <code>patchRegs</code> is used to apply the allocator's final + decision on virtual-to-real reg mapping to an instruction. + <p> + Clearly these 3 fns have to be written anew for each + architecture. They are defined in + <code>RegAllocInfo.lhs</code>. Think twice, no, thrice, + before modifying them: making false claims about insn + behaviour will lead to hard-to-find register allocation + errors. 
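+          <p>
+          To make the interface concrete, here is a toy instantiation for a
+          two-instruction architecture.  It is entirely invented; the real
+          versions in <code>RegAllocInfo.lhs</code> are much bigger but have
+          exactly this flavour (<code>insnFuture</code> is omitted, since the
+          toy instruction set has no jumps):
+          <pre>
+   data Reg = Virtual Int | Real Int
+     deriving (Eq, Show)
+
+   data Instr
+     = MOV Reg Reg        -- MOV src dst: dst := src
+     | ADD Reg Reg        -- ADD src dst: dst := dst + src
+     deriving Show
+
+   data RegUsage = RegUsage { regsRead :: [Reg], regsWritten :: [Reg] }
+
+   regUsage :: Instr -> RegUsage
+   regUsage (MOV s d) = RegUsage [s]    [d]
+   regUsage (ADD s d) = RegUsage [s, d] [d]
+
+   patchRegs :: Instr -> (Reg -> Reg) -> Instr
+   patchRegs (MOV s d) f = MOV (f s) (f d)
+   patchRegs (ADD s d) f = ADD (f s) (f d)
+          </pre>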
+          <p>
+          <code>AsmRegAlloc.lhs</code> contains detailed comments about
+          how the allocator works.  Here is a summary.  The head honcho
+          <p>
+          <code>allocUsingTheseRegs :: [Instr] -> [Reg] -> (Bool, [Instr])</code>
+          <p>
+          takes a list of instructions and a list of real registers
+          available for allocation, and maps as many of the virtual regs
+          in the input into real ones as it can.  The returned
+          <code>Bool</code> indicates whether or not it was
+          successful.  If so, that's the end of it.  If not, the caller
+          of <code>allocUsingTheseRegs</code> will attempt spilling.
+          More of that later.  What <code>allocUsingTheseRegs</code>
+          does is:
+          <p>
+          <ul>
+            <li>Implicitly number each instruction by its position in the
+            input list.
+            <p>
+            <li>Using <code>insnFuture</code>, create the set of all flow
+            edges -- possible control transfers -- within this set of
+            insns.
+            <p>
+            <li>Using <code>regUsage</code> and iterating around the flow
+            graph from the previous step, calculate, for each virtual
+            register, the set of flow edges on which it is live.
+            <p>
+            <li>Make a real-register commitment map, which gives the set
+            of edges for which each real register is committed (in
+            use).  These sets are initially empty.  For each virtual
+            register, attempt to find a real register whose current
+            commitment does not intersect that of the virtual
+            register -- ie, is uncommitted on all edges on which the
+            virtual reg is live.  If successful, this means the vreg
+            can be assigned to the realreg, so add the vreg's set to
+            the realreg's commitment.
+            <p>
+            <li>If all the vregs were assigned to a realreg, use
+            <code>patchRegs</code> to apply the mapping to the insns
+            themselves.
+          </ul>
+          <p>
+          <b>Spilling</b>
+          <p>
+          If <code>allocUsingTheseRegs</code> fails, a baroque
+          mechanism comes into play.  We now know that much simpler
+          schemes are available to do the same thing and give better
+          results.
+          Anyways:
+          <p>
+          The logic above <code>allocUsingTheseRegs</code>, in
+          <code>doGeneralAlloc</code> and <code>runRegAllocate</code>,
+          observes that allocation has failed with some set R of real
+          registers.  It then applies <code>runRegAllocate</code> a second
+          time to the code, but removes (typically) two registers from R
+          before doing so.  This naturally fails too, but returns a
+          partially-allocated sequence.  <code>doGeneralAlloc</code>
+          then inserts spill code into the sequence, and finally re-runs
+          <code>allocUsingTheseRegs</code>, but supplying the original,
+          unadulterated R.  This is guaranteed to succeed since the two
+          registers previously removed from R are sufficient to allocate
+          all the spill/restore instructions added.
+          <p>
+          Because x86 is very short of registers, and in the worst case
+          needs three removed from R, a softly-softly approach is used.
+          <code>doGeneralAlloc</code> first tries with zero regs removed
+          from R, then if that fails one, then two, etc.  This means
+          <code>allocUsingTheseRegs</code> may get run several times
+          before a successful arrangement is arrived at.
+          <code>findReservedRegs</code> cooks up the sets of spill
+          registers to try with.
+          <p>
+          The resulting machinery is complicated and the generated spill
+          code is appalling.  The saving grace is that spills are very
+          rare so it doesn't matter much.  I did not invent this -- I
+          inherited it.
+          <p>
+          <b>Dealing with common cases fast</b>
+          <p>
+          The entire reg-alloc mechanism described so far is general and
+          correct, but expensive overkill for many simple code blocks.
+          So to begin with we use
+          <code>doSimpleAlloc</code>, which attempts to do something
+          simple.  It exploits the observation that if the total number
+          of virtual registers does not exceed the number of real ones
+          available, we can simply dole out a new realreg each time we
+          see mention of a new vreg, with no regard for control flow.
+          <code>doSimpleAlloc</code> therefore attempts this in a
+          single pass over the code.  It gives up if it runs out of real
+          regs or sees any condition which renders the above observation
+          invalid (fixed reg uses, for example); a toy sketch of the idea
+          follows this list.
+          <p>
+          This clever hack handles the majority of code blocks quickly.
+          It was copied from the previous reg-allocator (the
+          Mattson/Partain/Marlow/Gill one).
+        </ul>
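+ <p>
+ The sketch below renders that observation in terms of the toy
+ <code>Instr</code>/<code>Reg</code> types introduced earlier.  It is
+ invented code, not the real <code>doSimpleAlloc</code>: one pass, one
+ fresh rreg per previously unseen vreg, and give up on any surprise (a
+ faithful version would also bail out on fixed-register uses).
+ <pre>
+   simpleAlloc :: [Reg] -> [Instr] -> Maybe [Instr]
+   simpleAlloc rregs = go [] rregs []
+     where
+       go _env _free acc []     = Just (reverse acc)
+       go env  free  acc (i:is) =
+         case claim env free (allRegs i) of
+           Nothing            -> Nothing     -- ran out of real registers
+           Just (env', free') ->
+             go env' free' (patchRegs i (apply env') : acc) is
+
+       allRegs i = let RegUsage rs ws = regUsage i in rs ++ ws
+
+       -- Hand every as-yet-unseen vreg the next free rreg.
+       claim env free []                   = Just (env, free)
+       claim env free (Real _        : rs) = claim env free rs
+       claim env free (v@(Virtual _) : rs)
+         | v `elem` map fst env = claim env free rs
+         | otherwise            = case free of
+                                    (f:fs) -> claim ((v, f) : env) fs rs
+                                    []     -> Nothing
+
+       apply env r = maybe r id (lookup r env)
+ </pre>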
+
+<p>
+<h2>Complications, observations, and possible improvements</h2>
+
+<h3>Real vs virtual registers in the instruction selectors</h3>
+
+The instruction selectors for expression trees, namely
+<code>getRegister</code>, are complicated by the fact that some
+expressions can only be computed into a specific register, whereas
+the majority can be computed into any register.  We take x86 as an
+example, but the problem applies to all archs.
+<p>
+Terminology: <em>rreg</em> means real register, a real machine
+register.  <em>vreg</em> means one of an infinite set of virtual
+registers.  The type <code>Reg</code> is the sum of <em>rreg</em> and
+<em>vreg</em>.  The instruction selector generates sequences with
+unconstrained use of vregs, leaving the register allocator to map them
+all into rregs.
+<p>
+Now, where was I?  Oh yes.  We return to the type of
+<code>getRegister</code>, which, despite its name, selects instructions
+to compute the value of an expression tree.
+<pre>
+   getRegister :: StixExpr -> NatM Register
+
+   data Register
+     = Fixed   PrimRep Reg InstrBlock
+     | Any     PrimRep (Reg -> InstrBlock)
+
+   type InstrBlock -- sequence of instructions
+</pre>
+At first this looks eminently reasonable (apart from the stupid
+name).  <code>getRegister</code>, and nobody else, knows whether or
+not a given expression has to be computed into a fixed rreg or can be
+computed into any rreg or vreg.  In the first case, it returns
+<code>Fixed</code> and indicates which rreg the result is in.  In the
+second case it defers committing to any specific target register by
+returning a function from <code>Reg</code> to <code>InstrBlock</code>,
+and the caller can specify the target reg as it sees fit.
+<p>
+Unfortunately, that forces <code>getRegister</code>'s callers (usually
+itself) to use a clumsy and confusing idiom in the common case where
+they do not care what register the result winds up in.  The reason is
+that although a value might be computed into a fixed rreg, we are
+forbidden (on pain of segmentation fault :) from subsequently
+modifying the fixed reg.  This and other rules are recorded in "Rules
+of the game" inside <code>MachCode.lhs</code>.
+<p>
+Why can't fixed registers be modified post-hoc?  Consider a simple
+expression like <code>Hp+1</code>.  Since the heap pointer
+<code>Hp</code> is definitely in a fixed register, call it R,
+<code>getRegister</code> on subterm <code>Hp</code> will simply return
+<code>Fixed</code> with an empty sequence and R.  But we can't just
+emit an increment instruction for R, because that trashes
+<code>Hp</code>; instead we first have to copy it into a fresh vreg
+and increment that.
+<p>
+With all that in mind, consider now writing a <code>getRegister</code>
+clause for terms of the form <code>(1 + E)</code>.
+Contrived, yes, but it illustrates the matter.  First we do
+<code>getRegister</code> on E.  Now we are forced to examine what
+comes back.
+<pre>
+   getRegister (OnePlus e)
+      = getRegister e           `thenNat` \ e_result ->
+        case e_result of
+           Fixed rep e_fixed e_code
+              -> returnNat (Any rep (\dst -> e_code ++ [MOV e_fixed dst, INC dst]))
+           Any rep e_any
+              -> returnNat (Any rep (\dst -> e_any dst ++ [INC dst]))
+</pre>
+This seems unreasonably cumbersome, yet the instruction selector is
+full of such idioms.  A good example of the complexities induced by
+this scheme is shown by <code>trivialCode</code> for x86 in
+<code>MachCode.lhs</code>.  This deals with general integer dyadic
+operations on x86 and has numerous cases.  It was difficult to get
+right.
+<p>
+An alternative suggestion is to simplify the type of
+<code>getRegister</code> to this:
+<pre>
+   getRegister :: StixExpr -> NatM (InstrBlock, VReg)
+   type VReg = .... a vreg ...
+</pre>
+and then we could safely write
+<pre>
+   getRegister (OnePlus e)
+      = getRegister e           `thenNat` \ (e_code, e_vreg) ->
+        returnNat (e_code ++ [INC e_vreg], e_vreg)
+</pre>
+which is about as straightforward as you could hope for.
+Unfortunately, it requires <code>getRegister</code> to insert moves of
+values which naturally compute into an rreg, into a vreg.  Consider:
+<pre>
+   1 + ccall some-C-fn
+</pre>
+On x86 the ccall result is returned in rreg <code>%eax</code>.  The
+resulting sequence, prior to register allocation, would be:
+<pre>
+   # push args
+   call some-C-fn
+   # move %esp to nuke args
+   movl %eax, %vreg
+   incl %vreg
+</pre>
+If, as is likely, <code>%eax</code> is not held live beyond this point
+for any other purpose, the move into a fresh register is pointless;
+we'd have been better off leaving the value in <code>%eax</code> as
+long as possible.
+<p>
+The simplified <code>getRegister</code> story is attractive.  It would
+clean up the instruction selectors significantly and make it simpler
+to write new ones.  The only drawback is that it generates redundant
+register moves.  I suggest that eliminating these should be the job
+of the register allocator.  Indeed:
+<ul>
+<li>There has been some work on this already ("Iterated register
+    coalescing"?), so this isn't a new idea.
+<p>
+<li>You could argue that the existing scheme inappropriately blurs the
+    boundary between the instruction selector and the register
+    allocator.  The instruction selector should .. well .. just
+    select instructions, without having to futz around worrying about
+    what kind of registers subtrees get generated into.  Register
+    allocation should be <em>entirely</em> the domain of the register
+    allocator, with the proviso that it should endeavour to allocate
+    registers so as to minimise the number of non-redundant reg-reg
+    moves in the final output.
+</ul>
+
+
+<h3>Selecting insns for 64-bit values/loads/stores on 32-bit platforms</h3>
+
+Note that this stuff doesn't apply on 64-bit archs, since the
+<code>getRegister</code> mechanism applies there.
+
+The relevant functions are:
+<pre>
+   assignMem_I64Code :: StixExpr -> StixExpr -> NatM InstrBlock
+   assignReg_I64Code :: StixReg  -> StixExpr -> NatM InstrBlock
+   iselExpr64        :: StixExpr -> NatM ChildCode64
+
+   data ChildCode64        -- a.k.a "Register64"
+      = ChildCode64
+           InstrBlock      -- code
+           VRegUnique      -- unique for the lower 32-bit temporary
+</pre>
+<code>iselExpr64</code> is the 64-bit, plausibly-named analogue of
+<code>getRegister</code>, and <code>ChildCode64</code> is the analogue
+of <code>Register</code>.  The aim here was to generate working 64-bit
+code as simply as possible.  To this end, I used the
+simplified <code>getRegister</code> scheme described above, in which
+<code>iselExpr64</code> generates its results into two vregs which
+can always safely be modified afterwards.
+<p>
+Virtual registers are, unsurprisingly, distinguished by their
+<code>Unique</code>s.  There is a small difficulty in how to
+know what the vreg for the upper 32 bits of a value is, given the vreg
+for the lower 32 bits.  The simple solution adopted is to say that
+any low-32 vreg may also have a hi-32 counterpart which shares the
+same unique, but is otherwise regarded as a separate entity.
+<code>getHiVRegFromLo</code> gets one from the other.
+<pre>
+   data VRegUnique
+      = VRegUniqueLo Unique          -- lower part of a split quantity
+      | VRegUniqueHi Unique          -- upper part thereof
+</pre>
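+Given that representation, <code>getHiVRegFromLo</code> more or less
+writes itself; something like this (a sketch of the idea, not the
+exact source):
+<pre>
+   getHiVRegFromLo :: VRegUnique -> VRegUnique
+   getHiVRegFromLo (VRegUniqueLo u) = VRegUniqueHi u   -- same unique, upper half
+   getHiVRegFromLo (VRegUniqueHi _)
+      = error "getHiVRegFromLo: already the upper half"
+</pre>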
+Apart from that, 64-bit code generation is really simple.  The sparc
+and x86 versions are almost copy-n-pastes of each other, with minor
+adjustments for endianness.  The generated code isn't wonderful but
+is certainly acceptable, and it works.
+
+
+
+<h3>Shortcomings and inefficiencies in the register allocator</h3>
+
+<h4>Redundant reconstruction of the control flow graph</h4>
+
+The allocator goes to considerable computational expense to construct
+all the flow edges in the group of instructions it's allocating for,
+by using the <code>insnFuture</code> function in the
+<code>Instr</code> pseudo-abstract type.
+<p>
+This is really silly, because all that information is present at the
+abstract C stage, but is thrown away in the translation to Stix.
+So a good thing to do is to modify that translation to
+produce a directed graph of Stix straight-line code blocks,
+and to preserve that structure through the insn selector, so the
+allocator can see it.
+<p>
+This would eliminate the fragile, hacky, arch-specific
+<code>insnFuture</code> mechanism, and probably make the whole
+compiler run measurably faster.  Register allocation is a fair chunk
+of the time of non-optimising compilation (10% or more), and
+reconstructing the flow graph is an expensive part of reg-alloc.
+It would probably accelerate the vreg liveness computation too.
+
+<h4>Really ridiculous method for doing spilling</h4>
+
+This is a more ambitious suggestion, but ... reg-alloc should be
+reimplemented, using the scheme described in "Quality and Speed in
+Linear-scan Register Allocation" (Traub, Holloway and Smith).  For
+straight-line code blocks, this gives an elegant one-pass algorithm
+for assigning registers and creating the minimal necessary spill code,
+without the need for reserving spill registers ahead of time.
+<p>
+I tried it in Rigr, replacing the previous spiller which used the
+current GHC scheme described above, and it cut the number of spill
+loads and stores by a factor of eight.  Not to mention being simpler,
+easier to understand and very fast.
+<p>
+The Traub paper also describes how to extend their method to multiple
+basic blocks, which will be needed for GHC.  It comes down to
+reconciling multiple vreg-to-rreg mappings at points where control
+flow merges.
+
+<h4>Redundant-move support for revised instruction selector suggestion</h4>
+
+As mentioned above, simplifying the instruction selector will require
+the register allocator to try and allocate source and destination
+vregs to the same rreg in reg-reg moves, so as to make as many as
+possible go away.  Without that, the revised insn selector would
+generate worse code than at present.
+I know this stuff has been done
+but know nothing about it.  The Linear-scan reg-alloc paper mentioned
+above does indeed mention a bit about it in the context of single
+basic blocks, but I don't know if that's sufficient.
+
+
+
+<h3>x86 arcana that you should know about</h3>
+
+The main difficulty with x86 is that many instructions have fixed
+register constraints, which can occasionally make reg-alloc fail
+completely.  And the FPU doesn't have the flat register model which
+the reg-alloc abstraction (implicitly) assumes.
+<p>
+Our strategy is: do a good job for the common small subset, that is
+integer loads, stores, address calculations, basic ALU ops (+, -,
+and, or, xor), and jumps.  That covers the vast majority of
+executed insns.  And indeed we do do a good job, with a loss of
+less than 2% compared with gcc.
+<p>
+Initially we tried to handle integer instructions with awkward
+register constraints (mul, div, shifts by non-constant amounts) via
+various jigglings of the spiller et al.  This never worked robustly,
+and putting platform-specific tweaks in the generic infrastructure is
+a big No-No.  (Not quite true; shifts by a non-constant amount are
+still done by a giant kludge, and should be moved into this new
+framework.)
+<p>
+Fortunately, all such insns are rare.  So the current scheme is to
+pretend that they don't have any such constraints.  This fiction is
+carried all the way through the register allocator.  When the insn
+finally comes to be printed, we emit a sequence which copies the
+operands through memory (<code>%esp</code>-relative), satisfying the
+constraints of the real instruction.  This localises the gruesomeness
+to just one place.  Here, for example, is the code generated for
+integer division of <code>%esi</code> by <code>%ecx</code>:
+<pre>
+   # BEGIN IQUOT %ecx, %esi
+   pushl $0
+   pushl %eax
+   pushl %edx
+   pushl %ecx
+   movl %esi, %eax
+   cltd
+   idivl 0(%esp)
+   movl %eax, 12(%esp)
+   popl %edx
+   popl %edx
+   popl %eax
+   popl %esi
+   # END IQUOT %ecx, %esi
+</pre>
+This is not quite as appalling as it seems, if you consider that the
+division itself typically takes 16+ cycles, whereas the rest of the
+insns probably go through in about 1 cycle each.
+<p>
+This trick is taken to extremes for FP operations.
+<p>
+All notions of the x86 FP stack and its insns have been removed.
+Instead, we pretend, to the instruction selector and register
+allocator, that x86 has six floating point registers,
+<code>%fake0</code> .. <code>%fake5</code>, which can be used in the
+usual flat manner.  We further claim that x86 has floating point
+instructions very similar to SPARC and Alpha, that is, a simple
+3-operand register-register arrangement.  Code generation and register
+allocation proceed on this basis.
+<p>
+When we come to print out the final assembly, our convenient fiction
+is converted to dismal reality.  Each fake instruction is
+independently converted to a series of real x86 instructions.
+<code>%fake0</code> .. <code>%fake5</code> are mapped to
+<code>%st(0)</code> .. <code>%st(5)</code>.  To do reg-reg arithmetic
+operations, the two operands are pushed onto the top of the FP stack,
+the operation done, and the result copied back into the relevant
+register.  When one of the operands is also the destination, we emit a
+slightly less scummy translation.  There are only six
+<code>%fake</code> registers because 2 are needed for the translation,
+and x86 has 8 in total.
+<p>
+The translation is inefficient but is simple and it works.
+A cleverer
+translation would handle a sequence of insns, simulating the FP stack
+contents, would not impose a fixed mapping from <code>%fake</code> to
+<code>%st</code> regs, and hopefully could avoid most of the redundant
+reg-reg moves of the current translation.
+<p>
+There are, however, two unforeseen bad side effects:
+<ul>
+<li>This doesn't work properly, because it doesn't observe the normal
+    conventions for x86 FP code generation.  It turns out that each of
+    the 8 elements in the x86 FP register stack has a tag bit which
+    indicates whether or not that register is notionally in use or
+    not.  If you do an FPU operation which happens to read a
+    tagged-as-empty register, you get an x87 FPU (stack invalid)
+    exception, which is normally handled by the FPU without passing it
+    to the OS: the program keeps going, but the resulting FP values
+    are garbage.  The OS can ask for the FPU to pass it FP
+    stack-invalid exceptions, but it usually doesn't.
+    <p>
+    Anyways: inside NCG-created x86 FP code this all works fine.
+    However, the NCG's fiction of a flat register set does not operate
+    the x87 register stack in the required stack-like way.  When
+    control returns to a gcc-generated world, the stack tag bits soon
+    cause stack exceptions, and thus garbage results.
+    <p>
+    The only fix I could think of -- and it is horrible -- is to clear
+    all the tag bits just before the next STG-level entry, in chunks
+    of code which use FP insns.  <code>i386_insert_ffrees</code>
+    inserts the relevant <code>ffree</code> insns into such code
+    blocks.  It depends critically on <code>is_G_instr</code> to
+    detect such blocks.
+<p>
+<li>It's very difficult to read the generated assembly and
+    reason about it when debugging, because there's so much clutter.
+    We print the fake insns as comments in the output, and that helps
+    a bit.
+</ul>
+
+
+
+<h3>Generating code for ccalls</h3>
+
+For reasons I don't really understand, the instruction selectors for
+generating calls to C (<code>genCCall</code>) have proven surprisingly
+difficult to get right, and soaked up a lot of debugging time.  As a
+result, I have once again opted for schemes which are simple and not
+too difficult to argue as correct, even if they don't generate
+excellent code.
+<p>
+The sparc ccall generator in particular forces all arguments into
+temporary virtual registers before moving them to the final
+out-registers (<code>%o0</code> .. <code>%o5</code>).  This creates
+some unnecessary reg-reg moves.  The reason is explained in a
+comment in the code.
+
+
+<h3>Duplicate implementation for many STG macros</h3>
+
+This has been discussed at length already.  It has caused a couple of
+nasty bugs due to subtle untracked divergence in the macro
+translations.  The macro-expander really should be pushed up into the
+Abstract C phase, so the problem can't happen.
+<p>
+Doing so would have the added benefit that the NCG could be used to
+compile more "ways" -- well, at least the 'p' profiling way.
+
+
+<h3>How to debug the NCG without losing your sanity/hair/cool</h3>
+
+Last, but definitely not least ...
+<p>
+The usual syndrome is that some program, when compiled via C, works,
+but not when compiled via the NCG.  Usually the problem is fairly
+simple to fix, once you find the specific code block which has been
+mistranslated.  But the latter can be nearly impossible, since most
+modules generate at least hundreds and often thousands of them.
+<p>
+My solution: cheat.
+<p>
+Because the via-C and native routes diverge only late in the day,
+it is not difficult to construct a 1-1 correspondence between basic
+blocks on the two routes.  So, if the program works via C but not on
+the NCG, do the following:
+<ul>
+<li>Recompile <code>AsmCodeGen.lhs</code> in the afflicted compiler
+    with <code>-DDEBUG_NCG</code>, so that it inserts
+    <code>___ncg_debug_marker</code>s
+    into the assembly it emits.
+<p>
+<li>Using a binary search on modules, find the module which is causing
+    the problem.
+<p>
+<li>Compile that module to assembly code, with identical flags, twice,
+    once via C and once via NCG.
+    Call the outputs <code>ModuleName.s-gcc</code> and
+    <code>ModuleName.s-nat</code>.  Check that the latter does indeed have
+    <code>___ncg_debug_marker</code>s in it; otherwise the next steps fail.
+<p>
+<li>Build (with a working compiler) the program
+    <code>fptools/ghc/utils/debugNCG/diff_gcc_nat</code>.
+<p>
+<li>Run: <code>diff_gcc_nat ModuleName.s</code>.  This will
+    construct the 1-1 correspondence, and emits on stdout
+    a cppable assembly output.  Place this in a file -- I always
+    call it <code>synth.S</code>.  Note, the capital S is important;
+    otherwise it won't get cpp'd.  You can feed this file directly to
+    ghc and it will automatically get cpp'd; you don't have to do so
+    yourself.
+<p>
+<li>By messing with the <code>#define</code>s at the top of
+    <code>synth.S</code>, do a binary search to find the incorrect
+    block.  Keep a careful record of where you are in the search; it
+    is easy to get confused.  Remember also that multiple blocks may
+    be wrong, which also confuses matters.  Finally, I usually start
+    off by re-checking that I can build the executable with all the
+    <code>#define</code>s set to 0 and then all to 1.  This ensures
+    you won't get halfway through the search and then get stuck due to
+    some snafu with gcc-specific literals.  Usually I set
+    <code>UNMATCHED_GCC</code> to 1 all the time, and this bit should
+    contain only literal data.
+    <code>UNMATCHED_NAT</code> should be empty.
+</ul>
+<p>
+<code>diff_gcc_nat</code> was known to work correctly last time I used
+it, in December 01, for both x86 and sparc.  If it doesn't work, due
+to changes in assembly syntax, or whatever, make it work.  The
+investment is well worth it.  Searching for the incorrect block(s) any
+other way is a total time waster.
+
+
+
+</ul>
+
+
+
+
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Fri Feb 1 16:14:11 GMT 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/optimistic.html b/docs/comm/the-beast/optimistic.html
new file mode 100644
index 0000000000..4d158022e8
--- /dev/null
+++ b/docs/comm/the-beast/optimistic.html
@@ -0,0 +1,65 @@
+<h2> Architectural stuff </h2>
+
+New fields in the TSO:
+<ul>
+<li> New global speculation-depth register; always counts the number of speculation frames
+on the stack; incremented when
+starting speculation, decremented when finishing.
+<li> Profiling stuff
+</ul>
+
+
+<h2> Speculation frames </h2>
+
+The info table for a speculation frame points to the static spec-depth configuration
+for that speculation point.  (Points to, because the config is mutable, and the info
+table has to be adjacent to the (immutable) code.)
+
+
+
+<h2> Abortion</h2>
+
+Abortion is modelled by a special asynchronous exception ThreadAbort.
+
+<ul>
+<li> In the scheduler, if a thread returns with ThreadBlocked, and non-zero SpecDepth, send it
+an asynchronous exception.
+
+<li> In the implementation of the <tt>catch#</tt> primop, raise an asynchronous exception if
+SpecDepth is nonzero.
+
+<li> Timeout, administered by scheduler.  Current story: abort if a speculation frame lasts from
+one minor GC to the next.  We detect this by seeing if there's a profiling frame on the stack --- a
+profiling frame is added at a minor GC in place of a speculation frame (see Online Profiling).
+</ul>
+
+
+When tearing frames off the stack, we start a new chunk at every speculation frame, as well as every
+update frame.  We proceed down to the deepest speculation frame.
+<p>
+The <tt>AP_STACK</tt> closure built for a speculation frame must be careful <em>not</em> to enter the
+next <tt>AP_STACK</tt> closure up, because that would re-enter a possible loop.
+<p>
+Delivering an asynch exception to a thread that is speculating works as follows.  Invariant: there can
+be no catch frames inside speculation (we abort in <tt>catch#</tt> when speculating).  So the asynch
+exception just tears off frames in the standard way until it gets to a catch frame, just as it would
+usually do.
+<p>
+Abortion can punish one or more of the speculation frames by decrementing their static config variables.
+
+<h3>Synchronous exceptions</h3>
+
+Synchronous exceptions are treated similarly to before.  The stack is discarded up to an update frame; the
+thunk to be updated is overwritten with "raise x", and the process continues until it reaches a catch frame.
+<p>
+When we find a spec frame, we allocate a "raise x" object, and resume execution with the return address
+in the spec frame.  In that way the spec frame is like a catch frame; it stops the unwinding process.
+<p>
+It's essential that every hard failure is caught, else speculation is unsafe.  In particular, divide by zero
+is hard to catch using OS support, so we test explicitly in library code (a sketch appears at the end of
+this page).  You can shoot yourself in the foot
+by writing <tt>x `div#` 0</tt>, side-stepping the test.
+
+
+<h3> Online profiling </h3>
+
+Sampling can be more frequent than minor GC (by jiggling the end-of-block code) but cannot
+be less frequent, because GC doesn't expect to see profiling frames.
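+
+<p>
+A sketch of the explicit zero test mentioned under synchronous exceptions above.  This is
+illustrative only: the function name is made up, and <tt>primDiv</tt> stands in for the raw,
+untested division (bound to <tt>quot</tt> here so that the sketch runs):
+<pre>
+import Control.Exception (ArithException(DivideByZero), throw)
+
+safeDivInt :: Int -> Int -> Int
+safeDivInt _ 0 = throw DivideByZero   -- tested in software; the hardware
+                                      -- trap is hard to catch via the OS
+safeDivInt x y = x `primDiv` y        -- primDiv: the raw division
+  where primDiv = quot
+</pre>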
\ No newline at end of file
diff --git a/docs/comm/the-beast/prelude.html b/docs/comm/the-beast/prelude.html
new file mode 100644
index 0000000000..64b607def5
--- /dev/null
+++ b/docs/comm/the-beast/prelude.html
@@ -0,0 +1,207 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Primitives and the Prelude</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Primitives and the Prelude</h1>
+    <p>
+      One of the trickiest aspects of GHC is the delicate interplay
+      between what knowledge is baked into the compiler, and what
+      knowledge it gets by reading the interface files of library
+      modules.  In general, the less that is baked in, the better.
+<p>
+      Most of what the compiler has to have wired in about primitives and
+      prelude definitions is in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/"><code>fptools/ghc/compiler/prelude/</code></a>.
+    </p>
+
+GHC recognises these main classes of baked-in-ness:
+<dl>
+<dt><strong>Primitive types.</strong>
+<dd>Primitive types cannot be defined in Haskell, and are utterly baked into the compiler.
+They are notionally defined in the fictional module <tt>GHC.Prim</tt>.  The <tt>TyCon</tt>s for these types are all defined
+in module <tt>TysPrim</tt>; for example,
+<pre>
+  intPrimTyCon :: TyCon
+  intPrimTyCon = ....
+</pre>
+Examples:
+<tt>Int#, Float#, Addr#, State#</tt>.
+<p>
+<dt><strong>Wired-in types.</strong>
+<dd>Wired-in types can be defined in Haskell, and indeed are (many are defined in <tt>GHC.Base</tt>).
+However, it's very convenient for GHC to be able to use the type constructor for (say) <tt>Int</tt>
+without looking it up in any environment.  So module <tt>TysWiredIn</tt> contains many definitions
+like this one:
+<pre>
+  intTyCon :: TyCon
+  intTyCon = ....
+
+  intDataCon :: DataCon
+  intDataCon = ....
+</pre>
+However, since a <tt>TyCon</tt> value contains the entire type definition inside it, it follows
+that the complete definition of <tt>Int</tt> is thereby baked into the compiler.
+<p>
+Nevertheless, the library module <tt>GHC.Base</tt> still contains a definition for <tt>Int</tt>
+just so that its info table etc get generated somewhere.  Chaos will result if the wired-in definition
+in <tt>TysWiredIn</tt> differs from that in <tt>GHC.Base</tt>.
+<p>
+The rule is that only very simple types should be wired in (for example, <tt>Ratio</tt> is not,
+and <tt>IO</tt> is certainly not).  No class is wired in: classes are just too complicated.
+<p>
+Examples: <tt>Int</tt>, <tt>Float</tt>, <tt>List</tt>, tuples.
+
+<p>
+<dt><strong>Known-key things.</strong>
+<dd>GHC knows of the existence of many, many other types, classes and values.  <em>But all it knows is
+their <tt>Name</tt>.</em>  Remember, a <tt>Name</tt> includes a unique key that identifies the
+thing, plus its defining module and occurrence name
+(see <a href="names.html">The truth about Names</a>).  Knowing a <tt>Name</tt>, therefore, GHC can
+run off to the interface file for the module and find out everything else it might need.
+<p>
+Most of these known-key names are defined in module <tt>PrelNames</tt>; a further swathe concerning
+Template Haskell is defined in <tt>DsMeta</tt>.  The allocation of unique keys is done manually;
+chaotic things happen if you make a mistake here, which is why they are all together.
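+<p>
+In miniature, a known-key thing amounts to no more than the following.  This is a
+simplified model for illustration only; the stand-in types are assumptions of the
+sketch, not the real definitions from <tt>basicTypes/Name.lhs</tt>:
+<pre>
+  -- Stand-ins for the real types:
+  type Unique     = Int      -- the manually allocated key
+  type ModuleName = String
+  type OccName    = String
+
+  -- A known-key thing is fully described by these three components;
+  -- everything else is found by reading the defining module's
+  -- interface file on demand.
+  data Name = Name
+    { nameUnique :: Unique       -- e.g. from mkPreludeTyConUnique
+    , nameModule :: ModuleName   -- defining module
+    , nameOcc    :: OccName      -- occurrence name, e.g. "Float"
+    }
+</pre>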
+</dl>
+
+All the <tt>Name</tt>s from all the above categories are used to initialise the global name cache,
+which maps (module,occurrence-name) pairs to the globally-unique <tt>Name</tt> for that
+thing.  (See <tt>HscMain.initOrigNames</tt>.)
+
+<p>
+The next sections elaborate these three classes a bit.
+
+
+    <h2>Primitives (module <tt>TysPrim</tt>)</h2>
+    <p>
+      Some types and functions have to be hardwired into the compiler as they
+      are atomic; all other code is essentially built around this primitive
+      functionality.  This includes basic arithmetic types, such as integers,
+      and their elementary operations as well as pointer types.  Primitive
+      types and functions often receive special treatment in the code
+      generator, which means that these entities have to be explicitly
+      represented in the compiler.  Moreover, many of these types receive some
+      explicit treatment in the runtime system, and so, there is some further
+      information about <a href="../rts-libs/primitives.html">primitives in
+      the RTS section</a> of this document.
+    <p>
+      The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysPrim.lhs"><code>TysPrim</code></a>
+      exports a list of all primitive type constructors as <code>primTyCons ::
+      [TyCon]</code>.  All of these type constructors (of type
+      <code>TyCon</code>) are also exported as <code>intPrimTyCon</code>,
+      <code>stablePtrPrimTyCon</code>, and so on.  In addition, for each
+      nullary type constructor the corresponding type (of type
+      <code>Type</code>) is also exported; for example, we have
+      <code>intPrimTy :: Type</code>.  For all other type constructors, a
+      function is exported that constructs the type obtained by applying the
+      type constructors to an argument type (of type <code>Type</code>); for
+      example, we have <code>mkStablePtrPrimTy :: Type -> Type</code>.
+    <p>
+      As it is inconvenient to identify types that receive special treatment
+      by the code generator by looking at their names, the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrimRep.lhs"><code>PrimRep</code></a>
+      exports a data type <code>PrimRep</code>, which lists all
+      machine-manipulable implementation types.  The module also exports a set
+      of query functions on <code>PrimRep</code> that define properties, such
+      as a type's byte size or whether a primitive type is a pointer type.
+      Moreover, the function <code>TysPrim.primRepTyCon :: PrimRep ->
+      TyCon</code> converts <code>PrimRep</code> values into the corresponding
+      type constructor.
+
+    <h2>Wired in types (module <tt>TysWiredIn</tt>)</h2>
+    <p>
+      In addition to entities that are primitive (and which the compiler
+      therefore has to treat specially in the backend), there is a set of
+      types, functions, etc. that the Haskell language definition flags as
+      essential to the language by placing them into the special module
+      <code>Prelude</code> that is implicitly imported into each Haskell
+      module.  For some of these entities it suffices to define them (by
+      standard Haskell definitions) in a <code>Prelude</code> module and to
+      ensure that this module is treated specially by always being imported.
+    <p>
+      However, there is a set of entities (such as, for example, the list type
+      and the corresponding data constructors) that have an in-between status:
+      They are not truly primitive (lists, for example, can easily be defined
+      by a <code>data</code> declaration), but the compiler has to have extra
+      knowledge about them, as they are associated with some particular
+      features of the language (in the case of lists, there is special syntax,
+      such as list comprehensions, associated with the type).  Another
+      example of a special kind of entity is the type classes that can be used
+      in a <code>deriving</code> clause.  All types that are not primitive,
+      but about which the compiler nonetheless has to have some extra
+      knowledge, are defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>.
+    <p>
+      All wired in type constructors are contained in <code>wiredInTyCons ::
+      [TyCon]</code>.  In addition to that list, <code>TysWiredIn</code>
+      exports variables bound to representations of all listed type
+      constructors and their data constructors.  So, for example, we have
+      <code>listTyCon</code> together with <code>nilDataCon</code> and
+      <code>consDataCon</code>.  There are also convenience functions, such
+      as <code>mkListTy</code> and <code>mkTupleTy</code>, which construct
+      compound types.
+    <p>
+
+    <h2>Known-key names (module <tt>PrelNames</tt>)</h2>
+
+      All names of types, functions, etc. known to the compiler are defined in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
+      This includes the names of types and functions exported from
+      <code>TysWiredIn</code>, but also others.  In particular, this module
+      also fixes the names of all prelude modules; i.e., of the modules whose
+      name starts with <code>Prel</code>, which GHC's library uses to bring
+      some structure into the quite large number of <code>Prelude</code>
+      definitions.
+    <p>
+      <code>PrelNames.knownKeyNames :: [Name]</code> contains all names known
+      to the compiler, but the elements of the list are also exported
+      individually as variables, such as <code>floatTyConName</code> (having
+      the lexeme <code>Float</code>) and <code>floatDataConName</code> (having
+      the lexeme <code>F#</code>).  For each of these names,
+      <code>PrelNames</code> defines a unique key with a definition, such as
+    <p>
+<blockquote><pre>
+floatPrimTyConKey = mkPreludeTyConUnique 11</pre>
+</blockquote>
+    <p>
+      that is, all unique keys for known prelude names are hardcoded into
+      <code>PrelNames</code> (and uniqueness has to be manually ensured in
+      that module).  To simplify matching the types of important groups of
+      type constructors, <code>PrelNames</code> also exports lists, such as
+      <code>numericTyKeys</code> (keys of all numeric types), that contain the
+      unique keys of all names in that group.  In addition, derivable type
+      classes and their structure are defined by
+      <code>derivableClassKeys</code> and related definitions.
+    <p>
+      In addition to names that have unique keys, <code>PrelNames</code> also
+      defines a set of names without uniqueness information.  These names end
+      in the suffix <code>_RDR</code> and are of type <code>RdrName</code> (an
+      example is <code>times_RDR</code>, which represents the lexeme
+      <code>*</code>).
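+      In miniature, such a definition is no more than the wrapping of an
+      occurrence name - a sketch only, using the <code>Unqual</code>
+      constructor of <code>RdrName</code> (shown in the section on the
+      renamer) and assuming <code>mkVarOcc</code> as the
+      <code>OccName</code> construction function:
+<blockquote><pre>
+times_RDR :: RdrName
+times_RDR = Unqual (mkVarOcc "*")   -- just an occurrence name; no unique</pre>
+</blockquote>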
The names are used in locations where they pass
+      through the renamer anyway (e.g., special constructors encountered by
+      the parser, such as [], and code generated from deriving clauses), which
+      will take care of adding uniqueness information.
+    <p>
+
+<h2>Gathering it all together (module <tt>PrelInfo</tt>)</h2>
+      The module
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelInfo.lhs"><code>PrelInfo</code></a>
+      in some sense ties all the above together and provides a reasonably
+      restricted interface to these definitions for the rest of the compiler.
+      However, from what I have seen, this doesn't quite work out and the
+      earlier-mentioned modules are directly imported in many places.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Tue Dec 11 17:54:07 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/renamer.html b/docs/comm/the-beast/renamer.html
new file mode 100644
index 0000000000..828b569bb9
--- /dev/null
+++ b/docs/comm/the-beast/renamer.html
@@ -0,0 +1,249 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Glorious Renamer</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Glorious Renamer</h1>
+    <p>
+      The <em>renamer</em> sits between the parser and the typechecker.
+      However, its operation is quite tightly interwoven with the
+      typechecker.  This is partially due to support for Template Haskell,
+      where spliced code has to be renamed and type checked.  In particular,
+      top-level splices lead to multiple rounds of renaming and type
+      checking.
+    </p>
+    <p>
+      The main externally used functions of the renamer are provided by the
+      module <code>rename/RnSource.lhs</code>.  In particular, we have
+    </p>
+    <blockquote>
+      <pre>
+rnSrcDecls  :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name)
+rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name]
+rnSplice    :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)</pre>
+    </blockquote>
+    <p>
+      All of these execute in the renamer monad <code>RnM</code>.  The first
+      function, <code>rnSrcDecls</code>, renames a binding group; the second,
+      <code>rnTyClDecls</code>, renames a list of (toplevel) type and class
+      declarations; and the third, <code>rnSplice</code>, renames a Template
+      Haskell splice.  As the types indicate, the main task of the renamer is
+      to convert all the <tt>RdrNames</tt> to <a
+      href="names.html"><tt>Names</tt></a>, which includes a number of
+      well-formedness checks (no duplicate declarations, all names are in
+      scope, and so on).  In addition, the renamer performs other, not
+      strictly name-related, well-formedness checks, which include checking
+      that the appropriate flags have been supplied whenever language
+      extensions are used in the source.
+    </p>
+
+    <h2>RdrNames</h2>
+    <p>
+      A <tt>RdrName.RdrName</tt> is pretty much just a string (for an
+      unqualified name like "<tt>f</tt>") or a pair of strings (for a
+      qualified name like "<tt>M.f</tt>"):
+    </p>
+    <blockquote>
+      <pre>
+data RdrName
+  = Unqual OccName
+	-- Used for ordinary, unqualified occurrences
+
+  | Qual Module OccName
+	-- A qualified name written by the user in
+	--  *source* code.  The module isn't necessarily
+	-- the module where the thing is defined;
+	-- just the one from which it is imported
+
+  | Orig Module OccName
+	-- An original name; the module is the *defining* module.
+	-- This is used when GHC generates code that will be fed
+	-- into the renamer (e.g. from deriving clauses), but where
+	-- we want to say "Use Prelude.map dammit".
+
+  | Exact Name
+	-- We know exactly the Name. This is used
+	--  (a) when the parser parses built-in syntax like "[]"
+	--	and "(,)", but wants a RdrName from it
+	--  (b) when converting names to the RdrNames in IfaceTypes
+	--	 Here an Exact RdrName always contains an External Name
+	--	 (Internal Names are converted to simple Unquals)
+	--  (c) by Template Haskell, when TH has generated a unique name</pre>
+    </blockquote>
+    <p>
+      The OccName type is described in <a href="names.html#occname">The
+      truth about names</a>.
+    </p>
+
+    <h2>The Renamer Monad</h2>
+    <p>
+      Due to the tight integration of the renamer with the typechecker, both
+      use the same monad in recent versions of GHC.  So, we have
+    </p>
+    <blockquote>
+      <pre>
+type RnM  a = TcRn a		-- Historical
+type TcM  a = TcRn a		-- Historical</pre>
+    </blockquote>
+    <p>
+      with the combined monad defined as
+    </p>
+    <blockquote>
+      <pre>
+type TcRn a       = TcRnIf TcGblEnv TcLclEnv a
+type TcRnIf a b c = IOEnv (Env a b) c
+
+data Env gbl lcl	-- Changes as we move into an expression
+  = Env {
+	env_top	 :: HscEnv,	-- Top-level stuff that never changes
+				--   Includes all info about imported things
+
+	env_us   :: TcRef UniqSupply,	-- Unique supply for local variables
+
+	env_gbl  :: gbl,	-- Info about things defined at the top level
+				--   of the module being compiled
+
+	env_lcl  :: lcl 	-- Nested stuff; changes as we go into
+				-- an expression
+    }</pre>
+    </blockquote>
+    <p>
+      The details of the global environment type <code>TcGblEnv</code> and
+      local environment type <code>TcLclEnv</code> are also defined in the
+      module <code>typecheck/TcRnTypes.lhs</code>.  The monad
+      <code>IOEnv</code> is defined in <code>utils/IOEnv.hs</code> and extends
+      the vanilla <code>IO</code> monad with an additional state parameter
+      <code>env</code> that is treated as in a reader monad.  (Side effecting
+      operations, such as updating the unique supply, are done with
+      <code>TcRef</code>s, which are simply a synonym for <code>IORef</code>s.)
+    </p>
+
+    <h2>Name Space Management</h2>
+    <p>
+      As anticipated by the variants <code>Orig</code> and <code>Exact</code>
+      of <code>RdrName</code>, some names should not change during renaming,
+      whereas others need to be turned into unique names.  In this context,
+      the two functions <code>RnEnv.newTopSrcBinder</code> and
+      <code>RnEnv.newLocalsRn</code> are important:
+    </p>
+    <blockquote>
+      <pre>
+newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name
+newLocalsRn     :: [Located RdrName] -> RnM [Name]</pre>
+    </blockquote>
+    <p>
+      The two functions introduce new toplevel and new local names,
+      respectively, where the first two arguments to
+      <code>newTopSrcBinder</code> determine the currently compiled module and
+      the parent construct of the newly defined name.  Both functions create
+      new names only for <code>RdrName</code>s that are neither exact nor
+      original.
+    </p>
+
+    <h3>Introduction of Toplevel Names: Global RdrName Environment</h3>
+    <p>
+      A global <code>RdrName</code> environment
+      <code>RdrName.GlobalRdrEnv</code> is a map from <code>OccName</code>s to
+      lists of qualified names.
More precisely, the latter are
+      <code>Name</code>s with an associated <code>Provenance</code>:
+    </p>
+    <blockquote>
+      <pre>
+data Provenance
+  = LocalDef		-- Defined locally
+	Module
+
+  | Imported 		-- Imported
+	[ImportSpec]	-- INVARIANT: non-empty
+	Bool		-- True iff the thing was named *explicitly*
+			-- in *any* of the import specs rather than being
+			-- imported as part of a group;
+	-- e.g.
+	--	import B
+	--	import C( T(..) )
+	-- Here, everything imported by B, and the constructors of T
+	-- are not named explicitly; only T is named explicitly.
+	-- This info is used when warning of unused names.</pre>
+    </blockquote>
+    <p>
+      The part of the global <code>RdrName</code> environment for a module
+      that contains the local definitions is created by the function
+      <code>RnNames.importsFromLocalDecls</code>, which also computes a data
+      structure recording all imported declarations in the form of a value of
+      type <code>TcRnTypes.ImportAvails</code>.
+    </p>
+    <p>
+      The function <code>importsFromLocalDecls</code>, in turn, makes use of
+      <code>RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM
+      [AvailInfo]</code> to extract all declared names from a binding group,
+      where <code>HscTypes.AvailInfo</code> is essentially a collection of
+      <code>Name</code>s; i.e., <code>getLocalDeclBinders</code>, on the fly,
+      generates <code>Name</code>s from the <code>RdrName</code>s of all
+      top-level binders of the module represented by the <code>HsGroup
+      RdrName</code> argument.
+    </p>
+    <p>
+      It is important to note that all this happens before the renamer
+      actually descends into the toplevel bindings of a module.  In other
+      words, before <code>TcRnDriver.rnTopSrcDecls</code> performs the
+      renaming of a module by way of <code>RnSource.rnSrcDecls</code>, it uses
+      <code>importsFromLocalDecls</code> to set up the global
+      <code>RdrName</code> environment, which contains <code>Name</code>s for
+      all imported <em>and</em> all locally defined toplevel binders.  Hence,
+      when the helpers of <code>rnSrcDecls</code> come across the
+      <em>defining</em> occurrences of a toplevel <code>RdrName</code>, they
+      don't rename it by generating a new name, but they simply look up its
+      name in the global <code>RdrName</code> environment.
+    </p>
+
+    <h2>Rebindable syntax</h2>
+    <p>
+      In Haskell when one writes "3" one gets "fromInteger 3", where
+      "fromInteger" comes from the Prelude (regardless of whether the
+      Prelude is in scope).  If you want to completely redefine numbers,
+      that becomes inconvenient.  So GHC lets you say
+      "-fno-implicit-prelude"; in that case, the "fromInteger" comes from
+      whatever is in scope.  (This is documented in the User Guide.)
+    </p>
+    <p>
+      This feature is implemented as follows (I always forget).
+      <ul>
+      <li>Names that are implicitly bound by the Prelude are marked by the
+	type <code>HsExpr.SyntaxExpr</code>.  Moreover, the association list
+	<code>HsExpr.SyntaxTable</code> is set up by the renamer to map
+	rebindable names to the value they are bound to.
+      </li>
+      <li>Currently, five constructs related to numerals
+	(<code>HsExpr.NegApp</code>, <code>HsPat.NPat</code>,
+	<code>HsPat.NPlusKPat</code>, <code>HsLit.HsIntegral</code>, and
+	<code>HsLit.HsFractional</code>) and
+	two constructs related to <code>do</code> expressions
+	(<code>HsExpr.BindStmt</code> and
+	<code>HsExpr.ExprStmt</code>) have rebindable syntax.
+      </li>
+      <li> When the parser builds these constructs, it puts in the
+	built-in Prelude Name (e.g. PrelNum.fromInteger).
+      </li>
+      <li> When the renamer encounters these constructs, it calls
+	<tt>RnEnv.lookupSyntaxName</tt>.
+	This checks for <tt>-fno-implicit-prelude</tt>; if the flag is not
+	given, it just returns the same Name; otherwise it takes the
+	occurrence name of the Name, turns it into an unqualified RdrName,
+	and looks it up in the environment.  The returned name is plugged
+	back into the construct.
+      </li>
+      <li> The typechecker uses the Name to generate the appropriate typing
+	constraints.
+      </li>
+      </ul>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed May  4 17:16:15 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
+
diff --git a/docs/comm/the-beast/simplifier.html b/docs/comm/the-beast/simplifier.html
new file mode 100644
index 0000000000..40cf7cf892
--- /dev/null
+++ b/docs/comm/the-beast/simplifier.html
@@ -0,0 +1,86 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Mighty Simplifier</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Mighty Simplifier</h1>
+    <p>
+      Most of the optimising program transformations applied by GHC are
+      performed on an intermediate language called <em>Core,</em> which
+      essentially is a compiler-friendly formulation of rank-2 polymorphic
+      lambda terms defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs/"><code>CoreSyn.lhs</code>.</a>
+      The transformation engine optimising Core programs is called the
+      <em>Simplifier</em> and is composed of a couple of modules located in the
+      directory <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/"><code>fptools/ghc/compiler/simplCore/</code>.</a>
+      The main engine of the simplifier is contained in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
+      and its driver is the routine <code>core2core</code> in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore.lhs</code>.</a>
+    <p>
+      The program that the simplifier has produced after applying its various
+      optimisations can be obtained by passing the option
+      <code>-ddump-simpl</code> to GHC.  Moreover, the various intermediate
+      stages of the optimisation process are printed when passing
+      <code>-dverbose-core2core</code>.
+
+    <h4><a name="loopBreaker">Recursive Definitions</a></h4>
+    <p>
+      The simplification process has to take special care when handling
+      recursive binding groups; otherwise, the compiler might loop.
+      Therefore, the routine <code>reOrderRec</code> in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/OccurAnal.lhs"><code>OccurAnal.lhs</code></a>
+      computes a set of <em>loop breakers</em> - a set of definitions that
+      together cut any possible loop in the binding group.  It marks the
+      identifiers bound by these definitions as loop breakers by enriching
+      their <a href="basicTypes.html#occInfo">occurrence information.</a>  Loop
+      breakers will <em>never</em> be inlined by the simplifier, thus
+      guaranteeing termination of the simplification procedure.  (This is not
+      entirely accurate -- see <a href="#rules">rewrite rules</a> below.)
+
+      The process of finding loop breakers works as follows: first, the
+      strongly connected components (SCCs) of the graph representing all
+      function dependencies are computed.  Then, each SCC is inspected in turn.
+      If it contains only a single binding (a self-recursive function), this is
+      the loop breaker.  In case of multiple recursive bindings, the function
+      attempts to select bindings where the decision not to inline them
+      causes the least harm - in the sense of inhibiting optimisations in the
+      code.  This is achieved by considering each binding in turn and awarding
+      a <em>score</em> between 0 and 4, where a lower score means that the
+      function is less useful for inlining - and thus a better loop breaker.
+      The evaluation of bindings is performed by the function
+      <code>score</code> locally defined in <code>OccurAnal</code>.
+
+      Note that, because Core programs represent a function definition as
+      <em>one</em> binding that chooses between the possibly many equations of
+      the source program with a <code>case</code> construct, none of the
+      alternatives of a loop breaker can be inlined (not even the
+      non-recursive alternatives).
+
+    <h4><a name="rules">Rewrite Rules</a></h4>
+    <p>
+      The application of rewrite rules is controlled in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
+      by the function <code>completeCall</code>.  This function first checks
+      whether it should inline the function applied at the currently inspected
+      call site, then simplifies the arguments, and finally, checks whether
+      any rewrite rule can be applied (and also whether there is a matching
+      specialised version of the applied function).  The actual check for rule
+      application is performed by the function <code><a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/specialise/Rules.lhs">Rules</a>.lookupRule</code>.
+    <p>
+      It should be noted that the application of rewrite rules is not subject
+      to the loop breaker check - i.e., rules of loop breakers will be applied
+      regardless of whether this may cause the simplifier to diverge.
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug  8 19:25:33 EST 2001
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/stg.html b/docs/comm/the-beast/stg.html
new file mode 100644
index 0000000000..4581da7d1f
--- /dev/null
+++ b/docs/comm/the-beast/stg.html
@@ -0,0 +1,164 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - You Got Control: The STG-language</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - You Got Control: The STG-language</h1>
+    <p>
+      GHC contains two completely independent backends: the byte code
+      generator and the machine code generator.  The decision about which of
+      the two is invoked is made in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscCodeGen</code>.
+      The machine code generator itself proceeds in a number of phases: first,
+      the <a href="desugar.html">Core</a> intermediate language is translated
+      into <em>STG-language</em>; second, STG-language is transformed into a
+      GHC-internal variant of <a href="http://www.cminusminus.org/">C--</a>;
+      and third, this is either emitted as concrete C--, converted to GNU C,
+      or translated to native code (by the <a href="ncg.html">native code
+      generator</a> which targets IA32, Sparc, and PowerPC [as of March '05]).
+    </p>
+    <p>
+      In the following, we will have a look at the first step of machine code
+      generation, namely the translation steps involving the STG-language.
+      Details about the underlying abstract machine, the <em>Spineless Tagless
+      G-machine</em>, are in <a
+      href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">Implementing
+      lazy functional languages on stock hardware: the Spineless Tagless
+      G-machine</a>, SL Peyton Jones, Journal of Functional Programming 2(2),
+      Apr 1992, pp127-202.  (Some details have changed since the publication of
+      this article, but it still gives a good introduction to the main
+      concepts.)
+    </p>
+
+    <h2>The STG Language</h2>
+    <p>
+      The AST of the STG-language and the generation of STG code from Core are
+      both located in the <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/"><code>stgSyn/</code></a>
+      directory; in the modules <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
+      and <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a>,
+      respectively.
+    </p>
+    <p>
+      Conceptually, the STG-language is a lambda calculus (including data
+      constructors and case expressions) whose syntax is restricted to make
+      all control flow explicit.  As such, it can be regarded as a variant of
+      <em>administrative normal form (ANF).</em>  (C.f., <a
+      href="http://doi.acm.org/10.1145/173262.155113">The essence of compiling
+      with continuations.</a> Cormac Flanagan, Amr Sabry, Bruce F. Duba, and
+      Matthias Felleisen.  <em>ACM SIGPLAN Conference on Programming Language
+      Design and Implementation,</em> ACM Press, 1993.)  Each syntactic form
+      has a precise operational interpretation, in addition to the
+      denotational interpretation inherited from the lambda calculus.  The
+      concrete representation of the STG language inside GHC also includes
+      auxiliary attributes, such as <em>static reference tables (SRTs),</em>
+      which determine the top-level bindings referenced by each let binding
+      and case expression.
+    </p>
+    <p>
+      As usual in ANF, arguments to functions etc. are restricted to atoms
+      (i.e., constants or variables), which implies that all sub-expressions
+      are explicitly named and evaluation order is explicit.  Specific to the
+      STG language is that all let bindings correspond to closure allocation
+      (thunks, function closures, and data constructors) and that case
+      expressions encode both computation and case selection.  There are two
+      flavours of case expressions, scrutinising boxed and unboxed values,
+      respectively.  The former perform function calls, including demanding the
+      evaluation of thunks, whereas the latter execute primitive operations
+      (such as arithmetic on fixed size integers and floating-point numbers).
+    </p>
+    <p>
+      The representation of the STG language defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/StgSyn.lhs"><code>StgSyn</code></a>
+      abstracts over both binders and occurrences of variables.  The type names
+      involved in this generic definition all carry the prefix
+      <code>Gen</code> (such as in <code>GenStgBinding</code>).
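+      For example, the binding type has roughly the following shape (a
+      simplified sketch only; the real definition in <code>StgSyn</code>
+      has more constructors, and the right-hand-side type is merely stubbed
+      out here):
+    <blockquote>
+      <pre>
+data GenStgBinding bndr occ
+  = StgNonRec bndr (GenStgRhs bndr occ)
+  | StgRec    [(bndr, GenStgRhs bndr occ)]
+
+-- Stand-in: the real GenStgRhs describes closures and
+-- constructor applications, and is omitted in this sketch.
+data GenStgRhs bndr occ = GenStgRhsElided</pre>
+    </blockquote>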
Instances of
+      these generic definitions, where both binders and occurrences are of type
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id</code></a><code>.Id</code>,
+      are defined as type synonyms and use type names that drop the
+      <code>Gen</code> prefix (i.e., becoming plain <code>StgBinding</code>).
+      Complete programs in STG form are represented by values of type
+      <code>[StgBinding]</code>.
+    </p>
+
+    <h2>From Core to STG</h2>
+    <p>
+      Although the actual translation from Core AST into STG AST is performed
+      by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>
+      (or <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreExprToStg</code>
+      for individual expressions), the translation crucially depends on <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepPgm</code>
+      (resp. <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CorePrep.lhs"><code>CorePrep</code></a><code>.corePrepExpr</code>),
+      which prepares Core code for code generation (for both byte code and
+      machine code generation).  <code>CorePrep</code> saturates primitive and
+      constructor applications, turns the code into A-normal form (illustrated
+      in the example below), renames all
+      identifiers into globally unique names, generates bindings for
+      constructor workers, constructor wrappers, and record selectors, and
+      performs some further cleanup.
+    </p>
+    <p>
+      In other words, after Core code is prepared for code generation it is
+      structurally already in the form required by the STG language.  The main
+      work performed by the actual transformation from Core to STG, as
+      performed by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.coreToStg</code>,
+      is to compute the live and free variables as well as live CAFs (constant
+      applicative forms) at each let binding and case alternative.  In
+      subsequent phases, the live CAF information is used to compute SRTs.
+      The live variable information is used to determine which stack slots
+      need to be zapped (to avoid space leaks) and the free variable
+      information is needed to construct closures.  Moreover, hints for
+      optimised code generation are computed, such as whether a closure needs
+      to be updated after it has been evaluated.
+    </p>
+
+    <h2>STG Passes</h2>
+    <p>
+      These days little actual work is performed on programs in STG form; in
+      particular, the code is not further optimised.  All serious optimisation
+      (except low-level optimisations which are performed during native code
+      generation) has already been done on Core.  The main task of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/stgSyn/CoreToStg.lhs"><code>CoreToStg</code></a><code>.stg2stg</code>
+      is to compute SRTs from the live CAF information determined during STG
+      generation.  Other than that, <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/profiling/SCCfinal.lhs"><code>SCCfinal</code></a><code>.stgMassageForProfiling</code>
+      is executed when compiling for profiling and information may be dumped
+      for debugging purposes.
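+    <p>
+      As an illustration of the A-normal form produced by
+      <code>CorePrep</code> (written as schematic source Haskell rather
+      than Core, purely for readability):
+    <blockquote>
+      <pre>
+-- before: nested applications appear as arguments
+f (g x) (h y)
+
+-- after: every argument is an atom; each intermediate value is
+-- named by a let binding (which in STG terms allocates a closure)
+let a1 = g x
+    a2 = h y
+in f a1 a2</pre>
+    </blockquote>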
+    </p>
+
+    <h2>Towards C--</h2>
+    <p>
+      GHC's internal form of C-- is defined in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/cmm/Cmm.hs"><code>Cmm</code></a>.
+      The definition is generic in that it abstracts over the type of static
+      data and of the contents of basic blocks (i.e., over the concrete
+      representation of constant data and instructions).  These generic
+      definitions have names carrying the prefix <code>Gen</code> (such as
+      <code>GenCmm</code>).  The same module also instantiates the generic
+      form to a concrete form where data is represented by
+      <code>CmmStatic</code> and instructions are represented by
+      <code>CmmStmt</code> (giving us, e.g., <code>Cmm</code> from
+      <code>GenCmm</code>).  The concrete form more or less follows the
+      external <a href="http://www.cminusminus.org/">C--</a> language.
+    </p>
+    <p>
+      Programs in STG form are translated to <code>Cmm</code> by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/codeGen/CodeGen.lhs"><code>CodeGen</code></a><code>.codeGen</code>.
+    </p>
+
+    <p><hr><small>
+<!-- hhmts start -->
+Last modified: Sat Mar  5 22:55:25 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/syntax.html b/docs/comm/the-beast/syntax.html
new file mode 100644
index 0000000000..be5bbefa17
--- /dev/null
+++ b/docs/comm/the-beast/syntax.html
@@ -0,0 +1,99 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Just Syntax</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Just Syntax</h1>
+    <p>
+      The lexical and syntactic analysers for Haskell programs are located in
+      <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/"><code>fptools/ghc/compiler/parser/</code></a>.
+    </p>
+
+    <h2>The Lexer</h2>
+    <p>
+      The lexer is a rather tedious piece of Haskell code contained in the
+      module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex</code></a>.
+      Its complexity partially stems from covering not only Haskell 98 but
+      also the whole range of GHC language extensions, and from its ability to
+      analyse interface files in addition to normal Haskell source.  The lexer
+      defines a parser monad <code>P a</code>, where <code>a</code> is the
+      type of the result expected from a successful parse.  More precisely, a
+      result of type
+<blockquote><pre>
+data ParseResult a = POk PState a
+                   | PFailed Message</pre>
+</blockquote>
+    <p>
+      is produced, where <code>Message</code> comes from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ErrUtils.lhs"><code>ErrUtils</code></a>
+      (and currently is simply a synonym for <code>SDoc</code>).
+    <p>
+      The record type <code>PState</code> contains information such as the
+      current source location, buffer state, contexts for layout processing,
+      and whether Glasgow extensions are accepted (either due to
+      <code>-fglasgow-exts</code> or due to reading an interface file).  Most
+      of the fields of <code>PState</code> store unboxed values; in fact, even
+      the flag indicating whether Glasgow extensions are enabled is
+      represented by an unboxed integer instead of by a <code>Bool</code>.  My
+      (= chak's) guess is that this is to avoid having to perform a
+      <code>case</code> on a boxed value in the inner loop of the lexer.
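+    <p>
+      The general shape of such a parser monad is sketched below.  This is a
+      self-contained model for illustration only - the actual definition in
+      <code>Lex.lhs</code> differs (for one thing, it threads the input
+      buffer and uses unboxed fields):
+    <blockquote>
+      <pre>
+import Control.Monad (ap, liftM)
+
+data PState  = PState    -- stand-in: location, buffer, layout contexts, flags
+type Message = String    -- stand-in for ErrUtils.Message (an SDoc)
+
+data ParseResult a = POk PState a
+                   | PFailed Message
+
+newtype P a = P { unP :: PState -> ParseResult a }
+
+instance Functor P where
+  fmap = liftM
+
+instance Applicative P where
+  pure a = P (\s -> POk s a)
+  (<*>)  = ap
+
+instance Monad P where
+  P m >>= k = P (\s -> case m s of
+                         POk s' a    -> unP (k a) s'
+                         PFailed err -> PFailed err)
+
+-- Failure abandons the threaded state:
+failP :: Message -> P a
+failP msg = P (\_ -> PFailed msg)</pre>
+    </blockquote>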
+    <p>
+      The same lexer is used by the Haskell source parser, the Haskell
+      interface parser, and the package configuration parser.
+
+    <h2>The Haskell Source Parser</h2>
+    <p>
+      The parser for Haskell source files is defined in the form of a parser
+      specification for the parser generator <a
+      href="http://haskell.org/happy/">Happy</a> in the file <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>.
+      The parser exports three entry points: for parsing entire modules
+      (<code>parseModule</code>), individual statements
+      (<code>parseStmt</code>), and individual identifiers
+      (<code>parseIdentifier</code>).  The last two are needed
+      for GHCi.  All three require a parser state (of type
+      <code>PState</code>) and are invoked from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a>.
+    <p>
+      Parsing of Haskell is a rather involved process.  The most challenging
+      features are probably the treatment of layout and expressions that
+      contain infix operators.  The latter may be user-defined and so are not
+      easily captured in a static syntax specification.  Infix operators may
+      also appear in the left hand sides of value definitions, and so, GHC's
+      parser treats those in the same way as expressions.  In other words, as
+      general expressions are a syntactic superset of patterns - ok, they
+      <em>nearly</em> are - the parser simply attempts to parse a general
+      expression in such positions.  Afterwards, the generated parse tree is
+      inspected to ensure that the accepted phrase indeed forms a legal
+      pattern.  This and similar checks are performed by the routines from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/ParseUtil.lhs"><code>ParseUtil</code></a>.  In
+      some cases, these routines not only check for
+      well-formedness but also transform the parse tree so that it fits into
+      the syntactic context in which it has been parsed; in fact, this happens
+      for patterns, which are transformed from a representation of type
+      <code>RdrNameHsExpr</code> into a representation of type
+      <code>RdrNamePat</code> (see the example at the end of this page).
+
+    <h2>The Haskell Interface Parser</h2>
+    <p>
+      The parser for interface files is also generated by Happy from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/ParseIface.y"><code>ParseIface.y</code></a>.
+      Its main routine <code>parseIface</code> is invoked from <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/rename/RnHiFiles.lhs"><code>RnHiFiles</code></a><code>.readIface</code>.
+
+    <h2>The Package Configuration Parser</h2>
+    <p>
+      The parser for configuration files is by far the smallest of the three
+      and is defined in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/ParsePkgConf.y"><code>ParsePkgConf.y</code></a>.
+      It exports <code>loadPackageConfig</code>, which is used by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/DriverState.hs"><code>DriverState</code></a><code>.readPackageConf</code>.
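+    <p>
+      As an example of the expression-first treatment of left hand sides
+      mentioned above: in the following (ordinary Haskell) definition of an
+      infix operator, each left hand side is initially parsed as a general
+      expression, such as <code>(x:xs) +++ ys</code>, and only afterwards
+      checked and converted into a function binding whose arguments are
+      patterns:
+<blockquote><pre>
+(+++) :: [a] -> [a] -> [a]
+[]     +++ ys = ys
+(x:xs) +++ ys = x : (xs +++ ys)</pre>
+</blockquote>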
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Wed Jan 16 00:30:14 EST 2002
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/typecheck.html b/docs/comm/the-beast/typecheck.html
new file mode 100644
index 0000000000..8d22784b8a
--- /dev/null
+++ b/docs/comm/the-beast/typecheck.html
@@ -0,0 +1,316 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Checking Types</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Checking Types</h1>
+    <p>
+      Probably the most important phase in the frontend is the type checker,
+      which is located at <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code>.</a>
+      GHC type checks programs in their original Haskell form before the
+      desugarer converts them into Core code.  This complicates the type
+      checker as it has to handle the much more verbose Haskell AST, but it
+      improves error messages, as those messages are based on the same
+      structure that the user sees.
+    </p>
+    <p>
+      GHC defines the abstract syntax of Haskell programs in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>
+      using a structure that abstracts over the concrete representation of
+      bound occurrences of identifiers and patterns.  The module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code></a>
+      defines a number of helper functions required by the type checker.  Note
+      that the type <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>.<code>TcId</code>
+      used to represent identifiers in some signatures during type checking
+      is, in fact, nothing but a synonym for a <a href="vars.html">plain
+      <code>Id</code>.</a>
+    </p>
+    <p>
+      It is also noteworthy that the representation of types changes during
+      type checking from <code>HsType</code> to <code>TypeRep.Type</code>.
+      The latter is a <a href="types.html">hybrid type representation</a> that
+      is used to type Core, but still contains sufficient information to
+      recover source types.  In particular, the type checker maintains and
+      compares types in their <code>Type</code> form.
+    </p>
+
+    <h2>The Overall Flow of Things</h2>
+
+    <h4>Entry Points Into the Type Checker</h4>
+    <p>
+      The interface of the type checker (and <a
+      href="renamer.html">renamer</a>) to the rest of the compiler is provided
+      by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnDriver.lhs"><code>TcRnDriver</code></a>.
+      Entire modules are processed by calling <code>tcRnModule</code> and GHCi
+      uses <code>tcRnStmt</code>, <code>tcRnExpr</code>, and
+      <code>tcRnType</code> to typecheck statements and expressions, and to
+      kind check types, respectively.  Moreover, <code>tcRnExtCore</code> is
+      provided to typecheck external Core code.  Finally,
+      <code>tcTopSrcDecls</code> is used by Template Haskell - more
+      specifically by <code>TcSplice.tc_bracket</code>
+      - to type check the contents of declaration brackets.
+    </p>
+
+    <h4>Renaming and Type Checking a Module</h4>
+    <p>
+      The function <code>tcRnModule</code> controls the complete static
+      analysis of a Haskell module.
It sets up the combined renamer and type
+      checker monad, resolves all import statements, initiates the actual
+      renaming and type checking process, and finally, wraps up by processing
+      the export list.
+    </p>
+    <p>
+      The actual type checking and renaming process is initiated via
+      <code>TcRnDriver.tcRnSrcDecls</code>, which uses a helper called
+      <code>tc_rn_src_decls</code> to implement the iterative renaming and
+      type checking process required by <a href="../exts/th.html">Template
+      Haskell</a>.  However, before it invokes <code>tc_rn_src_decls</code>,
+      it takes care of hi-boot files; afterwards, it simplifies type
+      constraints and performs zonking (see below regarding the latter).
+    </p>
+    <p>
+      The function <code>tc_rn_src_decls</code> partitions static analysis of
+      a whole module into multiple rounds, where the initial round is followed
+      by an additional one for each toplevel splice.  It collects all
+      declarations up to the next splice into an <code>HsDecl.HsGroup</code>
+      to rename and type check that <em>declaration group</em> by calling
+      <code>TcRnDriver.tcRnGroup</code>.  Afterwards, it executes the
+      splice (if there are any left) and proceeds to the next group, which
+      includes the declarations produced by the splice.
+    </p>
+    <p>
+      The function <code>tcRnGroup</code>, finally, invokes the
+      actual renaming and type checking via
+      <code>TcRnDriver.rnTopSrcDecls</code> and
+      <code>TcRnDriver.tcTopSrcDecls</code>, respectively.  The renamer, apart
+      from renaming, computes the global type checking environment, of type
+      <code>TcRnTypes.TcGblEnv</code>, which is stored in the type checking
+      monad before type checking commences.
+    </p>
+
+    <h2>Type Checking a Declaration Group</h2>
+    <p>
+      The type checking of a declaration group, performed by
+      <code>tcTopSrcDecls</code>, starts by processing the type and class
+      declarations of the current module, using the function
+      <code>TcTyClsDecls.tcTyAndClassDecls</code>.  This is followed by a
+      first round over instance declarations using
+      <code>TcInstDcls.tcInstDecls1</code>, which in particular generates all
+      additional bindings due to the deriving process.  Then come foreign
+      import declarations (<code>TcForeign.tcForeignImports</code>) and
+      default declarations (<code>TcDefaults.tcDefaults</code>).
+    </p>
+    <p>
+      Now, finally, toplevel value declarations (including derived ones) are
+      type checked using <code>TcBinds.tcTopBinds</code>.  Afterwards,
+      <code>TcInstDcls.tcInstDecls2</code> traverses instances for the second
+      time.  Type checking concludes with processing foreign exports
+      (<code>TcForeign.tcForeignExports</code>) and rewrite rules
+      (<code>TcRules.tcRules</code>).  Finally, the global environment is
+      extended with the new bindings.
+    </p>
+
+    <h2>Type checking Type and Class Declarations</h2>
+    <p>
+      Type and class declarations are type checked in a couple of phases that
+      contain recursive dependencies - aka <em>knots.</em>  The first knot
+      encompasses almost the whole type checking of these declarations and
+      forms the main piece of <code>TcTyClsDecls.tcTyAndClassDecls</code>.
+    </p>
+    <p>
+      Inside this big knot, the first main operation is kind checking, which
+      again involves a knot.  It is implemented by <code>kcTyClDecls</code>,
+      which performs kind checking of potentially recursively-dependent type
+      and class declarations using kind variables for initially unknown kinds.
+      While the individual declarations are processed, some of these variables
+      are instantiated depending on the context; the rest get
+      kind <code>*</code> by default (during <em>zonking</em> of the kind
+      signatures).
+      Type synonyms are treated specially in this process, because they can
+      have an unboxed type, but they cannot be recursive.  Hence, their kinds
+      are inferred in dependency order.  Moreover, in contrast to class
+      declarations and other type declarations, synonyms are not entered into
+      the global environment as a global <code>TyThing</code>.
+      (<code>TypeRep.TyThing</code> is a sum type that combines the various
+      flavours of typish entities, such that they can be stuck into type
+      environments and similar.)
+    </p>
+
+    <h2>More Details</h2>
+
+    <h4>Type Variables and Zonking</h4>
+    <p>
+      During type checking type variables are represented by mutable variables
+      - cf. the <a href="vars.html#TyVar">variable story.</a>  Consequently,
+      unification can instantiate type variables by updating those mutable
+      variables.  This process of instantiation is (for reasons that elude me)
+      called <a
+      href="http://www.dictionary.com/cgi-bin/dict.pl?term=zonk&db=*">zonking</a>
+      in GHC's sources.  The zonking routines for the various forms of Haskell
+      constructs are responsible for most of the code in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code>,</a>
+      whereas the routines that actually operate on mutable types are defined
+      in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcMType.lhs"><code>TcMType</code></a>;
+      this includes the zonking of type variables and type terms, routines to
+      create mutable structures and update them, as well as routines that check
+      constraints, such as that type variables in function signatures have not
+      been instantiated during type checking.  The actual type unification
+      routine is <code>uTys</code> in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcUnify.lhs"><code>TcUnify</code></a>.
+    </p>
+    <p>
+      All type variables that may be instantiated (those in signatures
+      may not), but haven't been instantiated during type checking, are zonked
+      to <code>()</code>, so that after type checking all mutable variables
+      have been eliminated.
+    </p>
+
+    <h4>Type Representation</h4>
+    <p>
+      The representation of types is fixed in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/TypeRep.lhs"><code>TypeRep</code></a>
+      and exported as the data type <code>Type</code>.  As explained in <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcType.lhs"><code>TcType</code></a>,
+      GHC supports rank-N types, but, in the type checker, maintains the
+      restriction that type variables cannot be instantiated to quantified
+      types (i.e., the type system is predicative).  The type checker floats
+      universal quantifiers outside and maintains types in prenex form.
+      (However, quantifiers can, of course, not float out of negative
+      positions.)  Overall, we have
+    </p>
+    <blockquote>
+      <pre>
+sigma -> forall tyvars. phi
+phi   -> theta => rho
+rho   -> sigma -> rho
+       | tau
+tau   -> tyvar
+       | tycon tau_1 .. tau_n
+       | tau_1 tau_2
+       | tau_1 -> tau_2</pre>
+    </blockquote>
+    <p>
+      where <code>sigma</code> is in prenex form; i.e., there is never a
+      forall to the right of an arrow in a <code>phi</code> type.
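+      For instance (an illustration, not taken from the sources): the source
+      type <code>Int -> forall a. Eq a => a -> a</code> is maintained in
+      prenex form as <code>forall a. Eq a => Int -> a -> a</code>, whereas a
+      quantifier in a negative position, as in
+      <code>(forall a. a -> a) -> Int</code>, stays where it is.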
Moreover, a
+      type of the form <code>tau</code> never contains a quantifier (and this
+      includes arguments to type constructors).
+    </p>
+    <p>
+      Of particular interest are the variants <code>SourceTy</code> and
+      <code>NoteTy</code> of <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/TypeRep.lhs"><code>TypeRep</code></a>.<code>Type</code>.
+      The constructor <code>SourceTy :: SourceType -> Type</code> represents a
+      type constraint; that is, a predicate over types represented by a
+      dictionary.  The type checker treats a <code>SourceTy</code> as opaque,
+      but during the translation to Core it will be expanded into its concrete
+      representation (i.e., a dictionary type) by the function <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>sourceTypeRep</code>.
+      Note that newtypes are not covered by <code>SourceType</code>s anymore,
+      even though some comments in GHC still suggest this.  Instead, all newtype
+      applications are initially represented as a <code>NewTcApp</code>, until
+      they are eliminated by calls to <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>newTypeRep</code>.
+    </p>
+    <p>
+      The <code>NoteTy</code> constructor is used to add non-essential
+      information to a type term.  Such information has the type
+      <code>TypeRep.TyNote</code> and is either the set of free type variables
+      of the annotated expression or the unexpanded version of a type synonym.
+      Free variable sets are cached as notes to save the overhead of
+      repeatedly computing the same set for a given term.  Unexpanded type
+      synonyms are useful for generating comprehensible error messages, but
+      have no influence on the process of type checking.
+    </p>
+
+    <h4>Type Checking Environment</h4>
+    <p>
+      During type checking, GHC maintains a <em>type environment</em> whose
+      type definitions are fixed in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a> with the operations defined in
+<a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv</code></a>.
+      Among other things, the environment contains all imported and local
+      instances as well as a list of <em>global</em> entities (imported and
+      local types and classes together with imported identifiers) and
+      <em>local</em> entities (locally defined identifiers).  This environment
+      is threaded through the type checking monad, whose support functions,
+      including initialisation, can be found in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnMonad.lhs"><code>TcRnMonad</code>.</a>
+
+    <h4>Expressions</h4>
+    <p>
+      Expressions are type checked by <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr</code>.</a>
+    <p>
+      Usage occurrences of identifiers are processed by the function
+      <code>tcId</code> whose main purpose is to <a href="#inst">instantiate
+      overloaded identifiers.</a>  It essentially calls
+      <code>TcInst.instOverloadedFun</code> once for each universally
+      quantified set of type constraints.  It should be noted that overloaded
+      identifiers are replaced by new names that are first defined in the LIE
+      (Local Instance Environment?) and later promoted into top-level
+      bindings.
+
+    <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
+    <p>
+      GHC implements overloading using so-called <em>dictionaries.</em>  A
+      dictionary is a tuple of functions -- one function for each method in
+      the class of which the dictionary implements an instance.  During type
+      checking, GHC replaces each type constraint of a function with one
+      additional argument.  At runtime, the extended function gets passed a
+      matching class dictionary by way of these additional arguments.
+      Whenever the function needs to call a method of such a class, it simply
+      extracts it from the dictionary.
+    <p>
+      This sounds simple enough; however, the actual implementation is a bit
+      more tricky as it wants to keep track of all the instances at which
+      overloaded functions are used in a module.  This information is useful
+      to optimise the code.  The implementation is in the module <a
+      href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
+    <p>
+      The function <code>instOverloadedFun</code> is invoked for each
+      overloaded usage occurrence of an identifier, where overloaded means that
+      the type of the identifier contains a non-trivial type constraint.  It
+      proceeds in two steps: (1) Allocation of a method instance
+      (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
+      dependencies.  The former implies allocating a new unique identifier,
+      which replaces the original (overloaded) identifier at the currently
+      type-checked usage occurrence.
+    <p>
+      The new identifier (after being threaded through the LIE) eventually
+      will be bound by a top-level binding whose rhs contains a partial
+      application of the original overloaded identifier.  This partial
+      application applies
+      the overloaded function to the dictionaries needed for the current
+      instance.  In GHC lingo, this is called a <em>method.</em>  Before
+      becoming a top-level binding, the method is first represented as a value
+      of type <code>Inst.Inst</code>, which makes it easy to fold multiple
+      instances of the same identifier at the same types into one global
+      definition.  (And probably other things, too, which I haven't
+      investigated yet.)
+
+    <p>
+      <strong>Note:</strong> As of 13 January 2001 (wrt. the code in the
+      CVS HEAD), the above mechanism interferes badly with RULES pragmas
+      defined over overloaded functions.  During instantiation, a new name is
+      created for an overloaded function partially applied to the dictionaries
+      needed in a usage position of that function.  As the rewrite rule,
+      however, mentions the original overloaded name, it won't fire anymore
+      -- unless later phases remove the intermediate definition again.  The
+      latest CVS version of GHC has an option
+      <code>-fno-method-sharing</code>, which avoids sharing instantiation
+      stubs.  This is usually/often/sometimes sufficient to make the rules
+      fire again.
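+    <p>
+      To make the dictionary mechanism concrete, here is the translation
+      written out by hand for a tiny example.  This is an illustrative
+      sketch only - <code>NumDict</code> and its field names are invented
+      for the example and are not GHC's actual representation:
+    <p>
+<blockquote><pre>
+-- The dictionary for (a fragment of) class Num: a tuple of methods.
+data NumDict a = NumDict
+  { plus  :: a -> a -> a
+  , times :: a -> a -> a
+  }
+
+-- Source:       square :: Num a => a -> a
+--               square x = x * x
+-- Translation:  the constraint becomes an explicit dictionary
+--               argument, and the method is extracted by selection.
+square :: NumDict a -> a -> a
+square d x = times d x x
+
+-- An instance declaration turns into a dictionary value:
+numDictInt :: NumDict Int
+numDictInt = NumDict { plus = (+), times = (*) }</pre>
+</blockquote>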
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Thu May 12 22:52:46 EST 2005
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/types.html b/docs/comm/the-beast/types.html
new file mode 100644
index 0000000000..383b71f054
--- /dev/null
+++ b/docs/comm/the-beast/types.html
@@ -0,0 +1,179 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - Hybrid Types</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - Hybrid Types</h1>
+    <p>
+      GHC essentially supports two type systems: (1) the <em>source type
+      system</em> (which is a heavily extended version of the type system of
+      Haskell 98) and (2) the <em>Core type system,</em> which is the type system
+      used by the intermediate language (see also <a
+      href="desugar.html">Sugar Free: From Haskell To Core</a>).
+    </p>
+    <p>
+      During parsing and renaming, type information is represented in a form
+      that is very close to Haskell's concrete syntax; it is defined by
+      <code>HsTypes.HsType</code>.  In addition, type, class, and instance
+      declarations are maintained in their source form as defined in the
+      module <code>HsDecl</code>.  The situation changes during type checking,
+      where types are translated into a second representation, which is
+      defined in the module <code>types/TypeRep.lhs</code>, as type
+      <code>Type</code>.  This second representation is peculiar in that it is
+      a hybrid between the source representation of types and the Core
+      representation of types.  Using functions such as
+      <code>Type.coreView</code> and <code>Type.deepCoreView</code>, a value
+      of type <code>Type</code> exhibits its Core representation.  On the
+      other hand, pretty printing a <code>Type</code> with
+      <code>TypeRep.pprType</code> yields the type's source representation.
+    </p>
+    <p>
+      In fact, the <a href="typecheck.html">type checker</a> maintains type
+      environments based on <code>Type</code>, but needs to perform type
+      checking on source-level types.  As a result, we have the functions
+      <code>Type.tcEqType</code> and <code>Type.tcCmpType</code>, which
+      compare types based on their source representation, as well as the
+      function <code>coreEqType</code>, which compares them based on their
+      Core representation.  The latter is needed during type checking of Core
+      (as performed by the functions in the module
+      <code>coreSyn/CoreLint.lhs</code>).
+    </p>
+
+    <h2>Type Synonyms</h2>
+    <p>
+      Type synonyms in Haskell are essentially a form of macro definition at
+      the type level.  For example, when the type checker compares two type
+      terms, synonyms are always compared in their expanded form.  However, to
+      produce good error messages, we would like to avoid expanding type
+      synonyms during pretty printing.  Hence, <code>Type</code> has a variant
+      <code>NoteTy TyNote Type</code>, where
+    </p>
+    <blockquote>
+      <pre>
+data TyNote
+  = FTVNote TyVarSet	-- The free type variables of the noted expression
+
+  | SynNote Type	-- Used for type synonyms
+			-- The Type is always a TyConApp, and is the un-expanded form.
+			-- The type to which the note is attached is the expanded form.</pre>
+    </blockquote>
+    <p>
+      In other words, a <code>NoteTy</code> represents the expanded form of a
+      type synonym together with a note stating its source form.
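+    </p>
+    <p>
+      For example (schematically -- real <code>Type</code> terms also carry
+      kinds and uniques, which are elided here), given the synonym
+    </p>
+    <blockquote>
+      <pre>
+type Pair = (Int, Bool)</pre>
+    </blockquote>
+    <p>
+      an occurrence of <code>Pair</code> is represented roughly as
+    </p>
+    <blockquote>
+      <pre>
+NoteTy (SynNote (TyConApp pairTyCon []))          -- the un-expanded source form
+       (TyConApp (tupleTyCon 2) [intTy, boolTy])  -- the expanded form</pre>
+    </blockquote>
+    <p>
+      where <code>pairTyCon</code> stands for the <code>TyCon</code> of the
+      synonym; the names here are illustrative rather than GHC's exact API.
+    </p>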
+
+    <h3>Creating Representation Types of Synonyms</h3>
+    <p>
+      During translation from <code>HsType</code> to <code>Type</code> the
+      function <code>Type.mkSynTy</code> is used to construct representations
+      of applications of type synonyms.  It creates a <code>NoteTy</code> node
+      if the synonym is applied to a sufficient number of arguments;
+      otherwise, it builds a simple <code>TyConApp</code> and leaves it to
+      <code>TcMType.checkValidType</code> to pick up invalid unsaturated
+      synonym applications.  While creating a <code>NoteTy</code>,
+      <code>mkSynTy</code> also expands the synonym by substituting the type
+      arguments for the parameters of the synonym definition, using
+      <code>Type.substTyWith</code>.
+    </p>
+    <p>
+      The function <code>mkSynTy</code> is used indirectly via
+      <code>mkGenTyConApp</code>, <code>mkAppTy</code>, and
+      <code>mkAppTys</code>, which construct type representations involving
+      type applications.  The function <code>mkSynTy</code> is also used
+      directly during type checking of interface files; this is for tedious
+      reasons to do with forall hoisting - see the comment at
+      <code>TcIface.mkIfTcApp</code>.
+    </p>
+
+    <h2>Newtypes</h2>
+    <p>
+      Data types declared by <code>newtype</code> declarations constitute new
+      type constructors---i.e., they are not just type macros, but introduce
+      new type names.  However, provided that a newtype is not recursive, we
+      still want to implement it by its representation type.  GHC realises this
+      by providing two flavours of type equality: (1) <code>tcEqType</code> is
+      source-level type equality, which compares newtypes and
+      <code>PredType</code>s by name, and (2) <code>coreEqType</code> compares
+      them structurally (by using <code>deepCoreView</code> to expand the
+      representation before comparing).  The function
+      <code>deepCoreView</code> (via <code>coreView</code>) invokes
+      <code>expandNewTcApp</code> for every type constructor application
+      (<code>TyConApp</code>) to determine whether we are looking at a newtype
+      application that needs to be expanded to its representation type.
+    </p>
+
+    <h2>Predicates</h2>
+    <p>
+      The dictionary translation of type classes translates each predicate in
+      a type context of a type signature into an additional argument, which
+      carries a dictionary containing the functions overloaded by the
+      corresponding class.  The <code>Type</code> data type has a special
+      variant <code>PredTy PredType</code> for predicates, where
+    </p>
+    <blockquote>
+      <pre>
+data PredType
+  = ClassP Class [Type]		-- Class predicate
+  | IParam (IPName Name) Type	-- Implicit parameter</pre>
+    </blockquote>
+    <p>
+      These types need to be handled as source types during type checking, but
+      turn into their representations when inspected through
+      <code>coreView</code>.  The representation is determined by
+      <code>Type.predTypeRep</code>.
+    </p>
+
+    <h2>Representation of Type Constructors</h2>
+    <p>
+      Type constructor applications are represented in <code>Type</code> by
+      the variant <code>TyConApp :: TyCon -> [Type] -> Type</code>.  The first
+      argument to <code>TyConApp</code>, namely <code>TyCon.TyCon</code>,
+      distinguishes between function type constructors (variant
+      <code>FunTyCon</code>) and algebraic type constructors (variant
+      <code>AlgTyCon</code>), which arise from data and newtype declarations.
+      The variant <code>AlgTyCon</code> contains all the information available
+      from the data/newtype declaration as well as derived information, such
+      as the <code>Unique</code> and argument variance information.
This
+      includes a field <code>algTcRhs :: AlgTyConRhs</code>, where
+      <code>AlgTyConRhs</code> distinguishes three kinds of algebraic data
+      type declarations: (1) declarations that have been exported abstractly,
+      (2) <code>data</code> declarations, and (3) <code>newtype</code>
+      declarations.  The last two both include their original right hand side;
+      in addition, the third variant also caches the "ultimate" representation
+      type, which is the right hand side after expanding all type synonyms and
+      non-recursive newtypes.
+    </p>
+    <p>
+      Both data and newtype declarations refer to their data constructors,
+      represented as <code>DataCon.DataCon</code>, which include all details
+      of their signature (as derived from the original declaration) as well as
+      information for code generation, such as their tag value.
+    </p>
+
+    <h2>Representation of Classes and Instances</h2>
+    <p>
+      Class declarations turn into values of type <code>Class.Class</code>.
+      They represent methods as the <code>Id</code>s of the dictionary
+      selector functions.  Similar selector functions are available for
+      superclass dictionaries.
+    </p>
+    <p>
+      Instance declarations turn into values of type
+      <code>InstEnv.Instance</code>, which in interface files are represented
+      as <code>IfaceSyn.IfaceInst</code>.  Moreover, the type
+      <code>InstEnv.InstEnv</code>, which is a synonym for <code>UniqFM
+      ClsInstEnv</code>, provides a mapping of classes to their
+      instances---<code>ClsInstEnv</code> is essentially a list of instance
+      declarations.
+    </p>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Sun Jun 19 13:07:22 EST 2005
+<!-- hhmts end -->
+    </small></p>
+  </body>
+</html>
diff --git a/docs/comm/the-beast/vars.html b/docs/comm/the-beast/vars.html
new file mode 100644
index 0000000000..9bbd310c60
--- /dev/null
+++ b/docs/comm/the-beast/vars.html
@@ -0,0 +1,235 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+  <head>
+    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+    <title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title>
+  </head>
+
+  <body BGCOLOR="FFFFFF">
+    <h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1>
+    <p>
+
+
+<h2>Variables</h2>
+
+The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>,
+represents variables, both term variables and type variables:
+<pre>
+  data Var
+    = Var {
+	varName    :: Name,
+	realUnique :: FastInt,
+	varType    :: Type,
+	varDetails :: VarDetails,
+	varInfo    :: IdInfo
+      }
+</pre>
+<ul>
+<li> The <code>varName</code> field contains the identity of the variable:
+its unique number and its print-name.  See "<a href="names.html">The truth about names</a>".
+
+<p><li> The <code>realUnique</code> field caches the unique number in the
+<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster.
+
+<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
+type variable.  (Types and kinds are both represented by a <code>Type</code>.)
+
+<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
+and makes some further distinctions (see below).
+
+<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful
+information: strictness, unfolding, etc.  However, this information is all optional;
+you can always throw away the <code>IdInfo</code>.
In contrast, you can't safely throw away
+the <code>VarDetails</code> of a <code>Var</code>.
+</ul>
+<p>
+It's often fantastically convenient to have term variables and type variables
+share a single data type.  For example,
+<pre>
+  exprFreeVars :: CoreExpr -> VarSet
+</pre>
+If there were two types, we'd need to return two sets.  Similarly, big lambdas and
+little lambdas use the same constructor in Core, which is extremely convenient.
+<p>
+We define a couple of type synonyms:
+<pre>
+  type Id    = Var  -- Term variables
+  type TyVar = Var  -- Type variables
+</pre>
+just to help us document the occasions when we are expecting only term variables,
+or only type variables.
+
+
+<h2> The <code>VarDetails</code> field </h2>
+
+The <code>VarDetails</code> field tells what kind of variable this is:
+<pre>
+data VarDetails
+  = LocalId		-- Used for locally-defined Ids (see NOTE below)
+	LocalIdDetails
+
+  | GlobalId		-- Used for imported Ids, dict selectors etc
+	GlobalIdDetails
+
+  | TyVar
+  | MutTyVar (IORef (Maybe Type))	-- Used during unification;
+	TyVarDetails
+</pre>
+
+<a name="TyVar">
+<h2>Type variables (<code>TyVar</code>)</h2>
+</a>
+<p>
+The <code>TyVar</code> case is self-explanatory.  The <code>MutTyVar</code>
+case is used only during type checking.  Then a type variable can be unified,
+using an imperative update, with a type, and that is what the
+<code>IORef</code> is for.  The <code>TcType.TyVarDetails</code> field records
+the sort of type variable we are dealing with.  It is defined as
+<pre>
+data TyVarDetails = SigTv | ClsTv | InstTv | VanillaTv
+</pre>
+<code>SigTv</code> marks type variables that were introduced when
+instantiating a type signature prior to matching it against the inferred type
+of a definition.  The variants <code>ClsTv</code> and <code>InstTv</code> mark
+scoped type variables introduced by class and instance heads, respectively.
+These first three sorts of type variables are skolem variables (tested by the
+predicate <code>isSkolemTyVar</code>); i.e., they must <em>not</em> be
+instantiated.  All other type variables are marked as <code>VanillaTv</code>.
+<p>
+For a long time I tried to keep mutable Vars statically type-distinct
+from immutable Vars, but I've finally given up.  It's just too painful.
+After type checking there are no MutTyVars left, but there's no static check
+of that fact.
+
+<h2>Term variables (<code>Id</code>)</h2>
+
+A term variable (of type <code>Id</code>) is represented either by a
+<code>LocalId</code> or a <code>GlobalId</code>:
+<p>
+A <code>GlobalId</code> is
+<ul>
+<li> Always bound at top-level.
+<li> Always has a <code>GlobalName</code>, and hence has
+     a <code>Unique</code> that is globally unique across the whole
+     GHC invocation (a single invocation may compile multiple modules).
+<li> Has <code>IdInfo</code> that is absolutely fixed, forever.
+</ul>
+
+<p>
+A <code>LocalId</code> is:
+<ul>
+<li> Always bound in the module being compiled:
+<ul>
+<li> <em>either</em> bound within an expression (lambda, case, local let(rec))
+<li> <em>or</em> defined at top level in the module being compiled.
+</ul>
+<li> Has IdInfo that changes as the simplifier bashes repeatedly on it.
+</ul>
+<p>
+The key thing about <code>LocalId</code>s is that the free-variable finder
+typically treats them as candidate free variables.  That is, it ignores
+<code>GlobalId</code>s such as imported constants, data constructors, etc.
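+<p>
+A schematic fragment shows the idea; the real free-variable finder lives in
+<code>coreSyn/CoreFVs.lhs</code> and handles many more cases, so this sketch
+is illustrative only:
+<pre>
+  exprFreeVars :: CoreExpr -> VarSet
+  exprFreeVars (Var v)
+    | isLocalVar v = unitVarSet v    -- LocalIds (and TyVars) are candidates
+    | otherwise    = emptyVarSet     -- GlobalIds are ignored
+  exprFreeVars (App fun arg)
+    = exprFreeVars fun `unionVarSet` exprFreeVars arg
+  exprFreeVars (Lam bndr body)
+    = exprFreeVars body `delVarSet` bndr   -- binders are not free
+  -- ... remaining cases (Let, Case, etc.) follow the same pattern
+</pre>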
+
+<p>
+An important invariant is this: <em>All the bindings in the module
+being compiled (whether top level or not) are <code>LocalId</code>s
+until the CoreTidy phase.</em>  In the CoreTidy phase, all
+externally-visible top-level bindings are made into GlobalIds.  This
+is the point when a <code>LocalId</code> becomes "frozen" and becomes
+a fixed, immutable <code>GlobalId</code>.
+<p>
+(A binding is <em>"externally-visible"</em> if it is exported, or
+mentioned in the unfolding of an externally-visible Id.  An
+externally-visible Id may not have an unfolding, either because it is
+too big, or because it is the loop-breaker of a recursive group.)
+
+<h3>Global Ids and implicit Ids</h3>
+
+<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>.
+This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other
+structured types like <code>DataCon</code>.  Unfortunately it is <em>used</em> in <code>Var.lhs</code>
+so there's a <code>hi-boot</code> knot to get it there.  Anyway, here's the declaration:
+<pre>
+data GlobalIdDetails
+  = NotGlobalId			-- Used as a convenient extra return value
+				-- from globalIdDetails
+
+  | VanillaGlobal		-- Imported from elsewhere
+
+  | PrimOpId PrimOp		-- The Id for a primitive operator
+  | FCallId ForeignCall		-- The Id for a foreign call
+
+  -- These next ones are all "implicit Ids"
+  | RecordSelId FieldLabel	-- The Id for a record selector
+  | DataConId DataCon		-- The Id for a data constructor *worker*
+  | DataConWrapId DataCon	-- The Id for a data constructor *wrapper*
+				-- [we need to know this only so that
+				--  a) we can suppress printing a definition in the interface file
+				--  b) when typechecking a pattern we can get from the
+				--     Id back to the data con]
+</pre>
+The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for
+a record selector, say, to its field name; or from the <code>Id</code> for a primitive
+operator to the <code>PrimOp</code> itself.
+<p>
+Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids.  An implicit
+Id is derived by implication from some other declaration.  So a record selector is
+derived from its data type declaration, for example.  An implicit Id is always
+a <code>GlobalId</code>.  For most of the compilation, the implicit Ids are just
+that: implicit.  If you do <code>-ddump-simpl</code> you won't see their definition.  (That's
+why it's true to say that until CoreTidy all Ids in this compilation unit are
+LocalIds.)  But at CorePrep, a binding is added for each implicit Id defined in
+this module, so that the code generator will generate code for the (curried) function.
+<p>
+Implicit Ids carry their unfolding inside them, of course, so they may well have
+been inlined much earlier; but we generate the curried top-level defn just in
+case it's ever needed.
+
+<h3>LocalIds</h3>
+
+The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>:
+<pre>
+data LocalIdDetails
+  = NotExported	-- Not exported
+  | Exported	-- Exported
+  | SpecPragma	-- Not exported, but not to be discarded either
+		-- It's unclean that this is so deeply built in
+</pre>
+From this we can tell whether the <code>LocalId</code> is exported, and that
+tells us whether we can drop an unused binding as dead code.
+<p>
+The <code>SpecPragma</code> thing is a HACK.  Suppose you write a SPECIALIZE pragma:
+<pre>
+	foo :: Num a => a -> a
+	{-# SPECIALIZE foo :: Int -> Int #-}
+	foo = ...
+</pre>
+The type checker generates a dummy call to <code>foo</code> at the right types:
+<pre>
+	$dummy = foo Int dNumInt
+</pre>
+The Id <code>$dummy</code> is marked <code>SpecPragma</code>.  Its role is to hang
+onto that call to <code>foo</code> so that the specialiser can see it, but there
+are no calls to <code>$dummy</code>.
+The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that they
+reach the specialiser.  The specialiser processes the right hand side of a <code>SpecPragma</code> Id
+to find calls to overloaded functions, <em>and then discards the <code>SpecPragma</code> Id</em>.
+So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the specialiser.
+
+
+<h3> ExternalNames and InternalNames </h3>
+
+Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
+not the same as whether the Id has an <code>ExternalName</code> or an <code>InternalName</code>
+(see "<a href="names.html#sort">The truth about Names</a>"):
+<ul>
+<li> Every <code>GlobalId</code> has an <code>ExternalName</code>.
+<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
+</ul>
+
+    <p><small>
+<!-- hhmts start -->
+Last modified: Fri Sep 12 15:17:18 BST 2003
+<!-- hhmts end -->
+    </small>
+  </body>
+</html>