Diffstat (limited to 'docs/comm/rts-libs')
-rw-r--r--  docs/comm/rts-libs/coding-style.html   516
-rw-r--r--  docs/comm/rts-libs/foreignptr.html      68
-rw-r--r--  docs/comm/rts-libs/multi-thread.html   445
-rw-r--r--  docs/comm/rts-libs/non-blocking.html   133
-rw-r--r--  docs/comm/rts-libs/prelfound.html       57
-rw-r--r--  docs/comm/rts-libs/prelude.html        121
-rw-r--r--  docs/comm/rts-libs/primitives.html      70
-rw-r--r--  docs/comm/rts-libs/stgc.html            45
-rw-r--r--  docs/comm/rts-libs/threaded-rts.html   126
9 files changed, 1581 insertions, 0 deletions
diff --git a/docs/comm/rts-libs/coding-style.html b/docs/comm/rts-libs/coding-style.html
new file mode 100644
index 0000000000..58f5b4f9bb
--- /dev/null
+++ b/docs/comm/rts-libs/coding-style.html
@@ -0,0 +1,516 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Style Guidelines for RTS C code</title>
+ </head>
+
+<body>
+<H1>The GHC Commentary - Style Guidelines for RTS C code</h1>
+
+<h2>Comments</h2>
+
+<p>These coding style guidelines are mainly intended for use in
+<tt>ghc/rts</tt> and <tt>ghc/includes</tt>.
+
+<p>NB These are just suggestions. They're not set in stone. Some of
+them are probably misguided. If you disagree with them, feel free to
+modify this document (and make your commit message reasonably
+informative) or mail someone (e.g. <a
+href="mailto:glasgow-haskell-users@haskell.org">The GHC mailing list</a>).
+
+<h2>References</h2>
+
+If you haven't read them already, you might like to check the following.
+Where they conflict with our suggestions, they're probably right.
+
+<ul>
+
+<li>
+The C99 standard. One reasonable reference is <a
+href="http://home.tiscalinet.ch/t_wolf/tw/c/c9x_changes.html">here</a>.
+
+<p><li>
+Writing Solid Code, Microsoft Press. (Highly recommended. Possibly
+the only Microsoft Press book that's worth reading.)
+
+<p><li>
+Autoconf documentation.
+See also <a href="http://peti.gmd.de/autoconf-archive/">The autoconf macro archive</a> and
+<a href="http://www.cyclic.com/cyclic-pages/autoconf.html">Cyclic Software's description</a>
+
+<p><li> <a
+href="http://www.cs.umd.edu/users/cml/cstyle/indhill-cstyle.html">Indian
+Hill C Style and Coding Standards</a>.
+
+<p><li>
+<a href="http://www.cs.umd.edu/users/cml/cstyle/">A list of C programming style links</a>
+
+<p><li>
+<a href="http://www.lysator.liu.se/c/c-www.html">A very large list of C programming links</a>
+
+<p><li>
+<a href="http://www.geek-girl.com/unix.html">A list of Unix programming links</a>
+
+</ul>
+
+
+<h2>Portability issues</h2>
+
+<ul>
+<p><li> We try to stick to C99 where possible. We use the following
+C99 features relative to C89, some of which were previously GCC
+extensions (possibly with different syntax):
+
+<ul>
+<p><li>Variable length arrays as the last field of a struct. GCC has
+a similar extension, but the syntax is slightly different: in GCC you
+would declare the array as <tt>arr[0]</tt>, whereas in C99 it is
+declared as <tt>arr[]</tt>.
+
+<p><li>Inline annotations on functions (see later)
+
+<p><li>Labeled elements in initialisers. Again, GCC has a slightly
+different syntax from C99 here, and we stick with the GCC syntax until
+GCC implements the C99 proposal.
+
+<p><li>C++-style comments. These are part of the C99 standard, and we
+prefer to use them whenever possible.
+</ul>
+
+<p>In addition we use ANSI-C-style function declarations and
+prototypes exclusively. Every function should have a prototype;
+static function prototypes may be placed near the top of the file in
+which they are declared, and external prototypes are usually placed in
+a header file with the same basename as the source file (although there
+are exceptions to this rule, particularly when several source files
+together implement a subsystem which is described by a single external
+header file).
+
+<p><li>We use the following GCC extensions, but surround them with
+<tt>#ifdef __GNUC__</tt>:
+
+<ul>
+<p><li>Function attributes (mostly just <code>noreturn</code> and
+<code>unused</code>)
+<p><li>Inline assembly.
+</ul>
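+
+<p>For example, a sketch of the guarded style (the macro name
+<tt>RTS_NORETURN</tt> and the function are invented here, not existing
+RTS code):
+
+<pre>
+  #ifdef __GNUC__
+  #define RTS_NORETURN __attribute__((noreturn))
+  #else
+  #define RTS_NORETURN /* nothing */
+  #endif
+
+  /* a hypothetical fatal-error routine that never returns */
+  static void fatalError(const char *msg) RTS_NORETURN;
+</pre>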
+
+<p><li>
+<tt>char</tt> can be signed or unsigned - always say which you mean.
+
+<p><li>Our POSIX policy: try to write code that only uses POSIX (IEEE
+Std 1003.1) interfaces and APIs. We used to define
+<code>POSIX_SOURCE</code> by default, but found that this caused more
+problems than it solved, so now we require any code that is
+POSIX-compliant to explicitly say so by having <code>#include
+"PosixSource.h"</code> at the top. Try to do this whenever possible.
+
+<p><li> Some architectures have memory alignment constraints. Others
+don't have any constraints but go faster if you align things. These
+macros (from <tt>ghcconfig.h</tt>) tell you which alignment to use
+
+<pre>
+ /* minimum alignment of unsigned int */
+ #define ALIGNMENT_UNSIGNED_INT 4
+
+ /* minimum alignment of long */
+ #define ALIGNMENT_LONG 4
+
+ /* minimum alignment of float */
+ #define ALIGNMENT_FLOAT 4
+
+ /* minimum alignment of double */
+ #define ALIGNMENT_DOUBLE 4
+</pre>
+
+<p><li> Use <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> when
+reading/writing ints and ptrs to the stack or heap. Note that, by
+definition, <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> are
+the same size and have the same alignment constraints even if
+<code>sizeof(int) != sizeof(ptr)</code> on that platform.
+
+<p><li> Use <tt>StgInt8</tt>, <tt>StgInt16</tt>, etc when you need a
+certain minimum number of bits in a type. Use <tt>int</tt> and
+<tt>nat</tt> when there's no particular constraint. ANSI C only
+guarantees that ints are at least 16 bits but within GHC we assume
+they are 32 bits.
+
+<p><li> Use <tt>StgFloat</tt> and <tt>StgDouble</tt> for floating
+point values which will go on/have come from the stack or heap. Note
+that <tt>StgDouble</tt> may occupy more than one <tt>StgWord</tt>, but
+it will always be a whole number multiple.
+
+<p>
+Use <code>PK_FLT(addr)</code>, <code>PK_DBL(addr)</code> to read
+<tt>StgFloat</tt> and <tt>StgDouble</tt> values from the stack/heap,
+and <code>ASSIGN_FLT(val,addr)</code> /
+<code>ASSIGN_DBL(val,addr)</code> to assign StgFloat/StgDouble values
+to heap/stack locations. These macros take care of alignment
+restrictions.
+
+<p>
+Heap/Stack locations are always <tt>StgWord</tt> aligned; the
+alignment requirements of an <tt>StgDouble</tt> may be more than that
+of <tt>StgWord</tt>, but we don't pad misaligned <tt>StgDoubles</tt>
+because doing so would be too much hassle (see <code>PK_DBL</code> &
+co above).
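+
+<p>A minimal sketch (not real RTS code) of using these macros; the stack
+slot offset is invented, and the argument order follows the description
+above:
+
+<pre>
+  /* read an StgDouble from the stack and write it back, letting the
+     macros deal with any alignment issues (StgPtr, StgDouble, PK_DBL
+     and ASSIGN_DBL all come from the RTS headers) */
+  static void doubleExample( StgPtr sp )
+  {
+      StgDouble d;
+
+      d = PK_DBL(sp + 1);               /* may span more than one StgWord */
+      ASSIGN_DBL(d * 2.0, sp + 1);      /* argument order: (val, addr)    */
+  }
+</pre>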
+
+<p><li>
+Avoid conditional code like this:
+
+<pre>
+ #ifdef solaris_host_OS
+ // do something solaris specific
+ #endif
+</pre>
+
+Instead, add an appropriate test to the configure.ac script and use
+the result of that test:
+
+<pre>
+ #ifdef HAVE_BSD_H
+ // use a BSD library
+ #endif
+</pre>
+
+<p>The problem is that things change from one version of an OS to another
+- things get added, things get deleted, things get broken, some things
+are optional extras. Using "feature tests" instead of "system tests"
+makes things a lot less brittle. Things also tend to get documented
+better.
+
+</ul>
+
+<h2>Debugging/robustness tricks</h2>
+
+
+Anyone who has tried to debug a garbage collector or code generator
+will tell you: "If a program is going to crash, it should crash as
+soon, as noisily and as often as possible." There's nothing worse
+than trying to find a bug which only shows up when running GHC on
+itself and doesn't manifest itself until 10 seconds after the actual
+cause of the problem.
+
+<p>We put all our debugging code inside <tt>#ifdef DEBUG</tt>. The
+general policy is we don't ship code with debugging checks and
+assertions in it, but we do run with those checks in place when
+developing and testing. Anything inside <tt>#ifdef DEBUG</tt> should
+not slow down the code by more than a factor of 2.
+
+<p>We also have more expensive "sanity checking" code for hardcore
+debugging - this can slow down the code by a large factor, but is only
+enabled on demand by a command-line flag. General sanity checking in
+the RTS is currently enabled with the <tt>-DS</tt> RTS flag.
+
+<p>There are a number of RTS flags which control debugging output and
+sanity checking in various parts of the system when <tt>DEBUG</tt> is
+defined. For example, to get the scheduler to be verbose about what
+it is doing, you would say <tt>+RTS -Ds -RTS</tt>. See
+<tt>includes/RtsFlags.h</tt> and <tt>rts/RtsFlags.c</tt> for the full
+set of debugging flags. To check one of these flags in the code,
+write something like:
+
+<pre>
+ IF_DEBUG(gc, fprintf(stderr, "..."));
+</pre>
+
+This checks the <tt>gc</tt> flag before generating the output (and the
+code is removed altogether if <tt>DEBUG</tt> is not defined).
+
+<p>All debugging output should go to <tt>stderr</tt>.
+
+<p>
+Particular guidelines for writing robust code:
+
+<ul>
+<p><li>
+Use assertions. Use lots of assertions. If you write a comment
+that says "takes a +ve number" add an assertion. If you're casting
+an int to a nat, add an assertion. If you're casting an int to a char,
+add an assertion. We use the <tt>ASSERT</tt> macro for writing
+assertions; it goes away when <tt>DEBUG</tt> is not defined.
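+
+<p>For example (a sketch only; the function name is invented, while
+<tt>ASSERT</tt> and <tt>nat</tt> come from the RTS headers):
+
+<pre>
+  /* takes a +ve number */
+  static nat decrement( nat n )
+  {
+      ASSERT(n > 0);      /* compiles away unless DEBUG is defined */
+      return n - 1;
+  }
+</pre>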
+
+<p><li>
+Write special debugging code to check the integrity of your data structures.
+(Most of the runtime checking code is in <tt>rts/Sanity.c</tt>)
+Add extra assertions which call this code at the start and end of any
+code that operates on your data structures.
+
+<p><li>
+When you find a hard-to-spot bug, try to think of some assertions,
+sanity checks or whatever that would have made the bug easier to find.
+
+<p><li>
+When defining an enumeration, it's a good idea not to use 0 for normal
+values. Instead, make 0 raise an internal error. The idea here is to
+make it easier to detect pointer-related errors on the assumption that
+random pointers are more likely to point to a 0 than to anything else.
+
+<pre>
+typedef enum
+ { i_INTERNAL_ERROR /* Instruction 0 raises an internal error */
+ , i_PANIC /* irrefutable pattern match failed! */
+ , i_ERROR /* user level error */
+
+ ...
+</pre>
+
+<p><li> Use <tt>#warning</tt> or <tt>#error</tt> whenever you write a
+piece of incomplete/broken code.
+
+<p><li> When testing, try to make infrequent things happen often.
+ For example, make a context switch/gc/etc happen every time a
+ context switch/gc/etc can happen. The system will run like a
+ pig but it'll catch a lot of bugs.
+
+</ul>
+
+<h2>Syntactic details</h2>
+
+<ul>
+<p><li><b>Important:</b> Put "redundant" braces or parens in your code.
+Omitting braces and parens leads to very hard to spot bugs -
+especially if you use macros (and you might have noticed that GHC does
+this a lot!)
+
+<p>
+In particular:
+<ul>
+<p><li>
+Put braces round the body of for loops, while loops, if statements, etc.
+even if they "aren't needed" because it's really hard to find the resulting
+bug if you mess up. Indent them any way you like but put them in there!
+</ul>
+
+<p><li>
+When defining a macro, always put parens round args - just in case.
+For example, write:
+<pre>
+ #define add(x,y) ((x)+(y))
+</pre>
+instead of
+<pre>
+ #define add(x,y) x+y
+</pre>
+
+<p><li> Don't declare and initialize variables at the same time.
+Separating the declaration and initialization takes more lines, but
+makes the code clearer.
+
+<p><li>
+Use inline functions instead of macros if possible - they're a lot
+less tricky to get right and don't suffer from the usual problems
+of side effects, evaluation order, multiple evaluation, etc.
+
+<ul>
+<p><li>Inline functions get the naming issue right. E.g. they
+ can have local variables which (in an expression context)
+ macros can't.
+
+<p><li> Inline functions have call-by-value semantics whereas macros
+ are call-by-name. You can be bitten by duplicated computation
+ if you aren't careful.
+
+<p><li> You can use inline functions from inside gdb if you compile with
+ -O0 or -fkeep-inline-functions. If you use macros, you'd better
+ know what they expand to.
+</ul>
+
+However, note that macros can serve as both l-values and r-values and
+can be "polymorphic" as these examples show:
+<pre>
+  // you can use this as an l-value or an r-value
+ #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo)
+
+ // polymorphic case
+  // but note that min(min(1,2),3) does 3 comparisons instead of 2!!
+ #define min(x,y) (((x)<=(y)) ? (x) : (y))
+</pre>
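+
+<p>For comparison, a <tt>static inline</tt> version of <tt>min</tt> for a
+fixed type (a sketch): it evaluates its arguments exactly once, though it
+gives up the polymorphism.
+
+<pre>
+  static inline int minInt( int x, int y )
+  {
+      return (x <= y) ? x : y;
+  }
+</pre>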
+
+<p><li>
+Inline functions should be "static inline" because:
+<ul>
+<p><li>
+gcc will delete static inlines if they are not used or are always inlined.
+
+<p><li>
+ if they're externed, we could get conflicts between 2 copies of the
+ same function if, for some reason, gcc is unable to delete them.
+ If they're static, we still get multiple copies but at least they don't conflict.
+</ul>
+
+OTOH, the gcc manual says the following, so maybe we should use
+<tt>extern inline</tt>?
+
+<pre>
+ When a function is both inline and `static', if all calls to the
+function are integrated into the caller, and the function's address is
+never used, then the function's own assembler code is never referenced.
+In this case, GNU CC does not actually output assembler code for the
+function, unless you specify the option `-fkeep-inline-functions'.
+Some calls cannot be integrated for various reasons (in particular,
+calls that precede the function's definition cannot be integrated, and
+neither can recursive calls within the definition). If there is a
+nonintegrated call, then the function is compiled to assembler code as
+usual. The function must also be compiled as usual if the program
+refers to its address, because that can't be inlined.
+
+ When an inline function is not `static', then the compiler must
+assume that there may be calls from other source files; since a global
+symbol can be defined only once in any program, the function must not
+be defined in the other source files, so the calls therein cannot be
+integrated. Therefore, a non-`static' inline function is always
+compiled on its own in the usual fashion.
+
+ If you specify both `inline' and `extern' in the function
+definition, then the definition is used only for inlining. In no case
+is the function compiled on its own, not even if you refer to its
+address explicitly. Such an address becomes an external reference, as
+if you had only declared the function, and had not defined it.
+
+ This combination of `inline' and `extern' has almost the effect of a
+macro. The way to use it is to put a function definition in a header
+file with these keywords, and put another copy of the definition
+(lacking `inline' and `extern') in a library file. The definition in
+the header file will cause most calls to the function to be inlined.
+If any uses of the function remain, they will refer to the single copy
+in the library.
+</pre>
+
+<p><li>
+Don't define macros that expand to a list of statements.
+You could just use braces as in:
+
+<pre>
+ #define ASSIGN_CC_ID(ccID) \
+ { \
+ ccID = CC_ID; \
+ CC_ID++; \
+ }
+</pre>
+
+(but it's usually better to use an inline function instead - see above).
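+
+<p>A common refinement (not required by these guidelines) is to wrap the
+braces in <tt>do { } while (0)</tt>, so that the macro behaves like a
+single statement even directly before an <tt>else</tt>:
+
+<pre>
+  #define ASSIGN_CC_ID(ccID)      \
+  do {                            \
+      ccID = CC_ID;               \
+      CC_ID++;                    \
+  } while (0)
+</pre>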
+
+<p><li>
+Don't even write macros that expand to 0 statements - they can mess you
+up as well. Use the doNothing macro instead.
+<pre>
+ #define doNothing() do { } while (0)
+</pre>
+
+<p><li>
+This code
+<pre>
+int* p, q;
+</pre>
+looks like it declares two pointers but, in fact, only p is a pointer.
+It's safer to write this:
+<pre>
+int* p;
+int* q;
+</pre>
+You could also write this:
+<pre>
+int *p, *q;
+</pre>
+but it is preferable to split the declarations.
+
+<p><li>
+Try to use ANSI C's enum feature when defining lists of constants of
+the same type. Among other benefits, you'll notice that gdb uses the
+name instead of its (usually inscrutable) number when printing values
+with enum types and gdb will let you use the name in expressions you
+type.
+
+<p>
+Examples:
+<pre>
+ typedef enum { /* N.B. Used as indexes into arrays */
+ NO_HEAP_PROFILING,
+ HEAP_BY_CC,
+ HEAP_BY_MOD,
+ HEAP_BY_GRP,
+ HEAP_BY_DESCR,
+ HEAP_BY_TYPE,
+ HEAP_BY_TIME
+ } ProfilingFlags;
+</pre>
+instead of
+<pre>
+ # define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */
+ # define HEAP_BY_CC 1
+ # define HEAP_BY_MOD 2
+ # define HEAP_BY_GRP 3
+ # define HEAP_BY_DESCR 4
+ # define HEAP_BY_TYPE 5
+ # define HEAP_BY_TIME 6
+</pre>
+and
+<pre>
+ typedef enum {
+ CCchar = 'C',
+ MODchar = 'M',
+ GRPchar = 'G',
+ DESCRchar = 'D',
+ TYPEchar = 'Y',
+ TIMEchar = 'T'
+ } ProfilingTag;
+</pre>
+instead of
+<pre>
+ # define CCchar 'C'
+ # define MODchar 'M'
+ # define GRPchar 'G'
+ # define DESCRchar 'D'
+ # define TYPEchar 'Y'
+ # define TIMEchar 'T'
+</pre>
+
+<p><li> Please keep to 80 columns: the line has to be drawn somewhere,
+and by keeping it to 80 columns we can ensure that code looks OK on
+everyone's screen. Long lines are hard to read, and a sign that the
+code needs to be restructured anyway.
+
+<p><li> When commenting out large chunks of code, use <code>#if 0
+... #endif</code> rather than <code>/* ... */</code> because C doesn't
+have nested comments.
+
+<p><li>When declaring a typedef for a struct, give the struct a name
+as well, so that other headers can forward-reference the struct name
+and it becomes possible to have opaque pointers to the struct. Our
+convention is to name the struct the same as the typedef, but add a
+leading underscore. For example:
+
+<pre>
+ typedef struct _Foo {
+ ...
+ } Foo;
+</pre>
+
+<p><li>Do not use <tt>!</tt> instead of explicit comparison against
+<tt>NULL</tt> or <tt>'\0'</tt>; the latter is much clearer.
+
+<p><li> We don't care too much about your indentation style but, if
+you're modifying a function, please try to use the same style as the
+rest of the function (or file). If you're writing new code, a
+tab width of 4 is preferred.
+
+</ul>
+
+<h2>CVS issues</h2>
+
+<ul>
+<p><li>
+Don't be tempted to reindent or reorganise large chunks of code - it
+generates large diffs in which it's hard to see whether anything else
+was changed.
+<p>
+If you must reindent or reorganise, don't include any functional
+changes in that commit, and give advance warning that you're about to
+do it in case anyone else is changing that file.
+</ul>
+
+
+</body>
+</html>
diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html
new file mode 100644
index 0000000000..febe9fe422
--- /dev/null
+++ b/docs/comm/rts-libs/foreignptr.html
@@ -0,0 +1,68 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+
+ <h1>On why we have <tt>ForeignPtr</tt></h1>
+
+ <p>Unfortunately it isn't possible to add a finalizer to a normal
+ <tt>Ptr a</tt>. We already have a generic finalization mechanism:
+ see the Weak module in package lang. But the only reliable way to
+ use finalizers is to attach one to an atomic heap object - that
+ way the compiler's optimiser can't interfere with the lifetime of
+ the object.
+
+ <p>The <tt>Ptr</tt> type is really just a boxed address - it's
+ defined like
+
+ <pre>
+data Ptr a = Ptr Addr#
+</pre>
+
+ <p>where <tt>Addr#</tt> is an unboxed native address (just a 32-
+ or 64- bit word). Putting a finalizer on a <tt>Ptr</tt> is
+ dangerous, because the compiler's optimiser might remove the box
+ altogether.
+
+ <p><tt>ForeignPtr</tt> is defined like this
+
+ <pre>
+data ForeignPtr a = ForeignPtr ForeignObj#
+</pre>
+
+ <p>where <tt>ForeignObj#</tt> is a <em>boxed</em> address; it corresponds
+ to a real heap object. The heap object is primitive from the
+ point of view of the compiler - it can't be optimised away. So it
+ works to attach a finalizer to the <tt>ForeignObj#</tt> (but not
+ to the <tt>ForeignPtr</tt>!).
+
+ <p>There are several primitive objects to which we can attach
+ finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>,
+ etc. We have special functions for some of these: eg.
+ <tt>MVar.addMVarFinalizer</tt>.
+
+ <p>So a nicer interface might be something like
+
+<pre>
+class Finalizable a where
+ addFinalizer :: a -> IO () -> IO ()
+
+instance Finalizable (ForeignPtr a) where ...
+instance Finalizable (MVar a) where ...
+</pre>
+
+ <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and
+ rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>. The reason for that
+ is just efficiency, I think.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Sep 26 09:49:37 BST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html
new file mode 100644
index 0000000000..67a544be85
--- /dev/null
+++ b/docs/comm/rts-libs/multi-thread.html
@@ -0,0 +1,445 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+<title>The GHC Commentary - Supporting multi-threaded interoperation</title>
+</head>
+<body>
+<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1>
+<em>
+<p>
+Authors: sof@galois.com, simonmar@microsoft.com<br>
+Date: April 2002
+</p>
+</em>
+<p>
+This document presents the implementation of an extension to
+Concurrent Haskell that provides two enhancements:
+</p>
+<ul>
+<li>A Concurrent Haskell thread may call an external (e.g., C)
+function in a manner that's transparent to the execution/evaluation of
+other Haskell threads. Section <a href="#callout">"Calling out"</a> covers this.
+</li>
+<li>
+OS threads may safely call Haskell functions concurrently. Section
+<a href="#callin">"Calling in"</a> covers this.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callout">The problem: foreign calls that block</h2>
+<p>
+When a Concurrent Haskell (CH) thread calls a 'foreign import'ed
+function, the runtime system (RTS) has to handle this in a manner
+transparent to other CH threads. That is, they shouldn't be blocked
+from making progress while the CH thread executes the external
+call. Presently, all threads will block.
+</p>
+<p>
+Clearly, we have to rely on OS-level threads in order to support this
+kind of concurrency. The implementation described here defines the
+(abstract) OS threads interface that the RTS assumes. The implementation
+currently provides two instances of this interface, one for POSIX
+threads (pthreads) and one for Win32 threads.
+</p>
+
+<!---- *************************************** ----->
+<h3>Multi-threading the RTS</h3>
+
+<p>
+A simple and efficient way to implement non-blocking foreign calls is like this:
+<ul>
+<li> Invariant: only one OS thread is allowed to
+execute code inside of the GHC runtime system. [There are alternate
+designs, but I won't go into details on their pros and cons here.]
+We'll call the OS thread that is currently running Haskell threads
+the <em>Current Haskell Worker Thread</em>.
+<p>
+The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its
+time-slice expires or it blocks on an MVar, then grabs another, and executes
+that, and so on.
+</p>
+<li>
+<p>
+When the Current Haskell Worker comes to execute a potentially blocking 'foreign
+import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that
+another OS worker thread is available to become the Current Haskell Worker.
+Consequently, even if the external call blocks, the new Current Haskell Worker
+continues execution of the other Concurrent Haskell threads.
+When the external call eventually completes, the Concurrent Haskell
+thread that made the call is passed the result and made runnable
+again.
+</p>
+<p>
+<li>
+A pool of OS threads is constantly trying to become the Current Haskell Worker.
+Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers.
+<p><li>
+The OS worker threads are regarded as interchangeable. A given Haskell thread
+may, during its lifetime, be executed entirely by one OS worker thread, or by more than one.
+There's just no way to tell.
+
+<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
+The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and
+the Current Haskell Worker Thread executes it. See Section <a href="#callin">"Calling in"</a>.
+</ul>
+<p>
+The rest of this section describes the mechanics of implementing all
+this. There are two parts to it: one describes how a native (OS) thread
+leaves the RTS to service the external call, the other how the same
+thread handles returning the result of the external call back to the
+Haskell thread.
+</p>
+
+<!---- *************************************** ----->
+<h3>Making the external call</h3>
+
+<p>
+Presently, GHC handles 'safe' C calls by effectively emitting the
+following code sequence:
+</p>
+
+<pre>
+ ...save thread state...
+ t = suspendThread();
+ r = foo(arg1,...,argn);
+ resumeThread(t);
+ ...restore thread state...
+ return r;
+</pre>
+
+<p>
+After having squirreled away the state of a Haskell thread,
+<tt>Schedule.c:suspendThread()</tt> is called which puts the current
+thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>]
+containing threads that are currently blocked waiting for external calls
+to complete (this is done for the purposes of finding roots when
+garbage collecting).
+</p>
+
+<p>
+In addition to putting the Haskell thread on
+<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also
+does the following:
+</p>
+<ul>
+<li>Instructs the <em>Task Manager</em> to make sure that there's
+another native thread waiting in the wings to take over the execution
+of Haskell threads. This might entail creating a new
+<em>worker thread</em> or re-using one that's currently waiting for
+more work to do. The <a href="#taskman">Task Manager</a> section
+presents the functionality provided by this subsystem.
+</li>
+
+<li>Releases its capability to execute within the RTS. By doing
+so, another worker thread will become unblocked and start executing
+code within the RTS. See the <a href="#capability">Capability</a>
+section for details.
+</li>
+
+<li><tt>suspendThread()</tt> returns a token which is used to
+identify the Haskell thread that was added to
+<tt>suspended_ccalling_threads</tt>. This is done so that once the
+external call has completed, we know what Haskell thread to pull off
+the <tt>suspended_ccalling_threads</tt> list.
+</li>
+</ul>
+
+<p>
+Upon return from <tt>suspendThread()</tt>, the OS thread is free of
+its RTS executing responsibility, and can now invoke the external
+call. Meanwhile, the other worker thread that has now gained access
+to the RTS will continue executing Concurrent Haskell code. Concurrent
+'stuff' is happening!
+</p>
+
+<!---- *************************************** ----->
+<h3>Returning the external result</h3>
+
+<p>
+When the native thread eventually returns from the external call,
+the result needs to be communicated back to the Haskell thread that
+issued the external call. The following steps take care of this:
+</p>
+
+<ul>
+<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>,
+passing along the token referring to the Haskell thread that made the
+call we're returning from.
+</li>
+
+<li>
+The OS thread then tries to grab hold of a <em>returning worker
+capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>.
+Until granted, the thread blocks waiting for RTS permissions. Clearly we
+don't want the thread to be blocked longer than it has to, so whenever
+a thread that is executing within the RTS enters the Scheduler (which
+is quite often, e.g., when a Haskell thread context switch is made),
+it checks to see whether it can give up its RTS capability to a
+returning worker, which is done by calling
+<tt>Capability.c:yieldToReturningWorker()</tt>.
+</li>
+
+<li>
+If a returning worker is waiting (the code in <tt>Capability.c</tt>
+keeps a counter of the number of returning workers that are currently
+blocked waiting), it is woken up and given the RTS execution
+privileges/capabilities of the worker thread that gave them up.
+</li>
+
+<li>
+The thread that gave up its capability then tries to re-acquire
+the capability to execute RTS code; this is done by calling
+<tt>Capability.c:waitForWorkCapability()</tt>.
+</li>
+
+<li>
+The returning worker that was woken up will continue execution in
+<tt>resumeThread()</tt>, removing its associated Haskell thread
+from the <tt>suspended_ccalling_threads</tt> list and start evaluating
+that thread, passing it the result of the external call.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h3 id="rts-exec">RTS execution</h3>
+
+<p>
+If a worker thread inside the RTS runs out of runnable Haskell
+threads, it goes to sleep waiting for the external calls to complete.
+It does this by calling <tt>waitForWorkCapability()</tt>.
+</p>
+
+<p>
+The availability of new runnable Haskell threads is signalled when:
+</p>
+
+<ul>
+<li>An external call is set up in <tt>suspendThread()</tt>.</li>
+<li>A new Haskell thread is created (e.g., whenever
+<tt>Concurrent.forkIO</tt> is called from within Haskell); this is
+signalled in <tt>Schedule.c:scheduleThread_()</tt>.
+</li>
+<li>A Haskell thread is removed from a 'blocking queue'
+attached to an MVar (only?).
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callin">Calling in</h2>
+
+Providing robust support for having multiple OS threads calling into
+Haskell is not as involved as its dual.
+
+<ul>
+<li>The OS thread issues the call to a Haskell function by going via
+the <em>Rts API</em> (as specified in <tt>RtsAPI.h</tt>).
+<li>Making the function application requires the construction of a
+closure on the heap. This is done in a thread-safe manner by having
+the OS thread lock a designated block of memory (the 'Rts API' block,
+which is part of the GC's root set) for the short period of time it
+takes to construct the application.
+<li>The OS thread then creates a new Haskell thread to execute the
+function application, which (eventually) boils down to calling
+<tt>Schedule.c:createThread()</tt>.
+<li>
+Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>,
+which asks the Task Manager to possibly create a new worker (OS)
+thread to execute the Haskell thread.
+<li>
+After the OS thread has done this, it blocks waiting for the
+Haskell thread to complete the evaluation of the Haskell function.
+<p>
+The reason why a separate worker thread is made to evaluate the Haskell
+function, and not the OS thread that made the call-in via the
+Rts API, is that we want that OS thread to return as soon as possible.
+We wouldn't be able to guarantee that if the OS thread entered the
+RTS to (initially) just execute its function application, as the
+Scheduler may side-track it and also ask it to evaluate other Haskell threads.
+</li>
+</ul>
+
+<p>
+<strong>Note:</strong> As of 20020413, the implementation of the RTS API
+only serializes access to the allocator between multiple OS threads wanting
+to call into Haskell (via the RTS API.) It does not coordinate this access
+to the allocator with that of the OS worker thread that's currently executing
+within the RTS. This weakness/bug is scheduled to be tackled as part of an
+overhaul/reworking of the RTS API itself.
+
+
+<!---- *************************************** ----->
+<h2>Subsystems introduced/modified</h2>
+
+<p>
+These threads extensions affect the Scheduler portions of the runtime
+system. To make it more manageable to work with, the changes
+introduced a couple of new RTS 'sub-systems'. This section presents
+the functionality and API of these sub-systems.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="capability">Capabilities</h3>
+
+<p>
+A Capability represents the token required to execute STG code,
+and all the state an OS thread/task needs to run Haskell code:
+its STG registers, a pointer to its TSO, a nursery etc. During
+STG execution, a pointer to the capability is kept in a
+register (BaseReg).
+</p>
+<p>
+Only in an SMP build will there be multiple capabilities; for
+the threaded RTS and other non-threaded builds, there is only
+one global capability, namely <tt>MainCapability</tt>.
+
+<p>
+The Capability API is as follows:
+<pre>
+/* Capability.h */
+extern void initCapabilities(void);
+
+extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
+extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
+extern void releaseCapability(Capability* cap);
+
+extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);
+
+extern void grabCapability(Capability** cap);
+</pre>
+
+<ul>
+<li><tt>initCapabilities()</tt> initialises the subsystem.
+
+<li><tt>grabReturnCapability()</tt> is called by worker threads
+returning from an external call. It blocks them waiting to gain
+permissions to do so.
+
+<li><tt>waitForWorkCapability()</tt> is called by worker threads
+already inside the RTS, but without any work to do. It blocks them
+waiting for new work to become available.
+
+<li><tt>releaseCapability()</tt> hands back a capability. If a
+'returning worker' is waiting, it is signalled that a capability
+has become available. If not, <tt>releaseCapability()</tt> tries
+to signal worker threads that are blocked waiting inside
+<tt>waitForWorkCapability()</tt> that new work might now be
+available.
+
+<li><tt>yieldToReturningWorker()</tt> is called by the worker thread
+that's currently inside the Scheduler. It checks whether there are other
+worker threads waiting to return from making an external call. If so,
+they're given preference and a capability is transferred between worker
+threads. One of the waiting 'returning worker' threads is signalled and made
+runnable, while the yielding worker blocks until it can re-acquire
+a capability.
+</ul>
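+
+<p>
+As an illustration only (this is not the actual RTS code), a worker
+returning from an external call might use the API along these lines;
+<tt>sched_mutex</tt> is assumed to be the scheduler's global mutex:
+</p>
+
+<pre>
+#include "Capability.h"
+
+extern Mutex sched_mutex;     /* assumed: the scheduler's global lock */
+
+static void returnFromExternalCall( void )
+{
+    Capability *cap;
+
+    /* block until a capability is handed to this returning worker */
+    grabReturnCapability(&sched_mutex, &cap);
+
+    /* ... remove our Haskell thread from suspended_ccalling_threads,
+       pass it the call's result, and resume scheduling it ... */
+
+    releaseCapability(cap);   /* hand the token back when done */
+}
+</pre>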
+
+<p>
+The condition variables used to implement the synchronisation between
+worker consumers and providers are local to the Capability
+implementation. See source for details and comments.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="taskman">The Task Manager</h3>
+
+<p>
+The Task Manager API is responsible for managing the creation of
+OS worker RTS threads. When a Haskell thread wants to make an
+external call, the Task Manager is asked to possibly create a
+new worker thread to take over the RTS-executing capability of
+the worker thread that's exiting the RTS to execute the external call.
+
+<p>
+The Capability subsystem keeps track of idle worker threads, so
+making an informed decision about whether or not to create a new OS
+worker thread is easy work for the task manager. The Task Manager
+provides the following API:
+</p>
+
+<pre>
+/* Task.h */
+extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
+extern void stopTaskManager ( void );
+
+extern void startTask ( void (*taskStart)(void) );
+</pre>
+
+<ul>
+<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> start
+up and shut down the subsystem. When starting up, you have the option
+to limit the overall number of worker threads that can be
+created. An unbounded (modulo OS thread constraints) number of threads
+is created if you pass '0'.
+<li><tt>startTask()</tt> is called when a worker thread calls
+<tt>suspendThread()</tt> to service an external call, asking another
+worker thread to take over its RTS-executing capability. It is also
+called when an external OS thread invokes a Haskell function via the
+<em>Rts API</em>.
+</ul>
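+
+<p>
+For illustration (everything except the two API calls is invented),
+RTS start-up might bring the subsystem up like this:
+</p>
+
+<pre>
+#include "Task.h"
+
+/* hypothetical entry point executed by each new worker thread */
+static void workerStart( void )
+{
+    /* ... acquire a capability and run the scheduler ... */
+}
+
+static void initTaskSupport( void )
+{
+    /* 0 = no fixed bound on the number of worker threads */
+    startTaskManager(0, workerStart);
+}
+</pre>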
+
+<!---- *************************************** ----->
+<h3>Native threads API</h3>
+
+To hide OS details, the following API is used by the task manager and
+the scheduler to interact with the OS threads API:
+
+<pre>
+/* OSThreads.h */
+typedef <em>..OS specific..</em> Mutex;
+extern void initMutex ( Mutex* pMut );
+extern void grabMutex ( Mutex* pMut );
+extern void releaseMutex ( Mutex* pMut );
+
+typedef <em>..OS specific..</em> Condition;
+extern void initCondition ( Condition* pCond );
+extern void closeCondition ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition ( Condition* pCond );
+extern rtsBool waitCondition ( Condition* pCond,
+ Mutex* pMut );
+
+extern OSThreadId osThreadId ( void );
+extern void shutdownThread ( void );
+extern void yieldThread ( void );
+extern int createOSThread ( OSThreadId* tid,
+ void (*startProc)(void) );
+</pre>
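+
+<p>
+One plausible POSIX instantiation of the abstract types (a sketch only;
+the real definitions live in the RTS's <tt>OSThreads</tt> code):
+</p>
+
+<pre>
+#include <pthread.h>
+
+typedef pthread_mutex_t Mutex;
+typedef pthread_cond_t  Condition;
+typedef pthread_t       OSThreadId;
+
+void initMutex    ( Mutex* pMut ) { pthread_mutex_init(pMut, NULL); }
+void grabMutex    ( Mutex* pMut ) { pthread_mutex_lock(pMut); }
+void releaseMutex ( Mutex* pMut ) { pthread_mutex_unlock(pMut); }
+</pre>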
+
+
+
+<!---- *************************************** ----->
+<h2>User-level interface</h2>
+
+To signal that you want an external call to be serviced by a separate
+OS thread, you have to add the attribute <tt>threadsafe</tt> to
+a foreign import declaration, e.g.,
+
+<pre>
+foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
+</pre>
+
+<p>
+The distinction between 'safe' and thread-safe C calls is made
+so that we may call external functions that aren't re-entrant but may
+cause a GC to occur.
+<p>
+The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>.
+</p>
+
+<!---- *************************************** ----->
+<h2>Building the GHC RTS</h2>
+
+The multi-threaded extension isn't currently enabled by default. To
+have it built, you need to run the <tt>fptools</tt> configure script
+with the extra option <tt>--enable-threaded-rts</tt> turned on, and
+then proceed to build the compiler as per normal.
+
+<hr>
+<small>
+<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end -->
+</small>
+</body> </html>
+
diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html
new file mode 100644
index 0000000000..627bde8d88
--- /dev/null
+++ b/docs/comm/rts-libs/non-blocking.html
@@ -0,0 +1,133 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Non-blocking I/O on Win32</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Non-blocking I/O on Win32</h1>
+ <p>
+
+This note discusses the implementation of non-blocking I/O on
+Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth
+capturing the ideas. Thanks to Sigbjorn for writing them.
+
+<h2> Background</h2>
+
+GHC has provided non-blocking I/O support for Concurrent Haskell
+threads on platforms that provide 'UNIX-style' non-blocking I/O for
+quite a while. That is, platforms that let you alter a property of a
+file descriptor so that, instead of having a thread block performing an
+I/O operation that cannot be immediately satisfied, the operation
+returns a special error code (EWOULDBLOCK). When that happens, the CH
+thread that made the blocking I/O request is put into a blocked-on-IO
+state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will
+in a timely fashion check to see whether I/O is again possible
+(via a call to select()), and if it is, unblock the thread & have it
+re-try the I/O operation. The result is that other Concurrent Haskell
+threads won't be affected, but can continue operating while a thread
+is blocked on I/O.
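+<p>
+For reference, a sketch of the 'UNIX-style' facility being described
+(ordinary POSIX calls, not GHC code):
+<pre>
+#include <fcntl.h>
+
+/* put a file descriptor into non-blocking mode; reads and writes that
+   cannot proceed then fail immediately with EWOULDBLOCK/EAGAIN */
+int setNonBlocking(int fd)
+{
+    int flags = fcntl(fd, F_GETFL, 0);
+    if (flags == -1) return -1;
+    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
+}
+</pre>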
+<p>
+Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for
+the simple reason that it doesn't provide the OS facilities described
+above.
+
+<h2>Win32 non-blocking I/O, attempt 1</h2>
+
+Win32 does provide something select()-like, namely the
+WaitForMultipleObjects() API. It takes an array of kernel object
+handles plus a timeout interval, and waits for either one (or all) of
+them to become 'signalled'. A handle representing an open file (for
+reading) becomes signalled once there is input available.
+<p>
+So, it is possible to observe that I/O is possible using this
+function, but not whether there's "enough" to satisfy the I/O request.
+So, if we were to mimic select() usage with WaitForMultipleObjects(),
+we'd correctly avoid blocking initially, but a thread may very well
+block waiting for its I/O request to be satisfied once the file
+handle has become signalled. [There is a fix for this -- only read
+and write one byte at a time -- but I'm not advocating that.]
+
+
+<h2>Win32 non-blocking I/O, attempt 2</h2>
+
+Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is,
+asynchronous read and write requests can be made via the ReadFile() /
+WriteFile () APIs, specifying position and length of the operation.
+If the I/O requests cannot be handled right away, the APIs won't
+block, but return immediately (and report ERROR_IO_PENDING as their
+status code.)
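+<p>
+A hedged sketch of issuing such a request (standard Win32 calls; the
+surrounding event/completion-port plumbing is elided):
+<pre>
+#include <windows.h>
+
+/* start an overlapped read; returns TRUE if the request either completed
+   immediately or was queued for later completion */
+BOOL startOverlappedRead(HANDLE h, void *buf, DWORD len, OVERLAPPED *ov)
+{
+    DWORD got;
+    if (ReadFile(h, buf, len, &got, ov)) {
+        return TRUE;                              /* completed right away */
+    }
+    return (GetLastError() == ERROR_IO_PENDING);  /* queued, or failed    */
+}
+</pre>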
+<p>
+The completion of the request can be reported in a number of ways:
+<ul>
+ <li> synchronously, by blocking inside Read/WriteFile(). (this is the
+ non-overlapped case, really.)
+<p>
+
+ <li> as part of the overlapped I/O request, pass a HANDLE to an event
+ object. The I/O system will signal this event once the request
+ completed, which a waiting thread will then be able to see.
+<p>
+
+ <li> by supplying a pointer to a completion routine, which will be
+ called as an Asynchronous Procedure Call (APC) whenever a thread
+ calls one of a select group of 'alertable' APIs.
+<p>
+
+ <li> by associating the file handle with an I/O completion port. Once
+ the request completes, the thread servicing the I/O completion
+ port will be notified.
+</ul>
+The use of an I/O completion port looks the most interesting to GHC,
+as it provides a central point where all I/O requests are reported.
+<p>
+Note: asynchronous I/O is only fully supported by OSes based on
+the NT codebase, i.e., Win9x doesn't permit async I/O on files and
+pipes. However, Win9x does support async socket operations and,
+I'm currently guessing here, console I/O. In my view, it would
+be acceptable to provide non-blocking I/O support for NT-based
+OSes only.
+<p>
+Here's the design I currently have in mind:
+<ul>
+<li> Upon startup, an RTS helper thread, whose only purpose is to service
+ an I/O completion port, is created.
+<p>
+<li> All files are opened in 'overlapped' mode, and associated
+ with an I/O completion port.
+<p>
+<li> Overlapped I/O requests are used to implement read() and write().
+<p>
+<li> If the request cannot be satisfied without blocking, the Haskell
+ thread is put on the blocked-on-I/O thread list & a re-schedule
+ is made.
+<p>
+<li> When the completion of a request is signalled via the I/O completion
+ port, the RTS helper thread will move the associated Haskell thread
+ from the blocked list onto the runnable list. (Clearly, care
+ is required here to have another OS thread mutate internal Scheduler
+ data structures.)
+
+<p>
+<li> In the event all Concurrent Haskell threads are blocked waiting on
+ I/O, the main RTS thread blocks waiting on an event synchronisation
+ object, which the helper thread will signal whenever it makes
+ a Haskell thread runnable.
+
+</ul>
+
+I might do the communication between the RTS helper thread and the
+main RTS thread differently though: rather than have the RTS helper
+thread manipulate thread queues itself, thus requiring careful
+locking, just have it change a bit on the relevant TSO, which the main
+RTS thread can check at regular intervals (in some analog of
+awaitEvent(), for example).
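+<p>
+A hedged sketch of the helper thread servicing the completion port
+(<tt>ioPort</tt> and <tt>markThreadRunnable</tt> are invented names, not
+existing RTS entities):
+<pre>
+#include <windows.h>
+
+extern HANDLE ioPort;   /* created with CreateIoCompletionPort() at startup */
+
+/* invented hand-off: flag the Haskell thread owning this request as ready */
+extern void markThreadRunnable(void *tso, DWORD bytesTransferred);
+
+DWORD WINAPI ioManagerLoop(LPVOID arg)
+{
+    DWORD      nBytes;
+    ULONG_PTR  key;        /* completion key: could hold a TSO pointer */
+    OVERLAPPED *ov;
+
+    for (;;) {
+        if (GetQueuedCompletionStatus(ioPort, &nBytes, &key, &ov, INFINITE)) {
+            markThreadRunnable((void*)key, nBytes);
+        }
+    }
+}
+</pre>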
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:30:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html
new file mode 100644
index 0000000000..25407eed43
--- /dev/null
+++ b/docs/comm/rts-libs/prelfound.html
@@ -0,0 +1,57 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Prelude Foundations</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Prelude Foundations</h1>
+ <p>
+ The standard Haskell Prelude as well as GHC's Prelude extensions are
+ constructed from GHC's <a href="primitives.html">primitives</a> in a
+ couple of layers.
+
+ <h4><code>PrelBase.lhs</code></h4>
+ <p>
+ Some of the most elementary Prelude definitions are collected in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
+ In particular, it defines the boxed versions of Haskell primitive types
+ - for example, <code>Int</code> is defined as
+ <blockquote><pre>
+data Int = I# Int#</pre>
+ </blockquote>
+ <p>
+ This says that a boxed integer <code>Int</code> is formed by applying the
+ data constructor <code>I#</code> to an <em>unboxed</em> integer of type
+ <code>Int#</code>. Unboxed types are hardcoded in the compiler and
+ exported together with the <a href="primitives.html">primitive
+ operations</a> understood by GHC.
+ <p>
+ <code>PrelBase.lhs</code> similarly defines other basic types, such as
+ boolean values
+ <blockquote><pre>
+data Bool = False | True deriving (Eq, Ord)</pre>
+ </blockquote>
+ <p>
+ the unit type
+ <blockquote><pre>
+data () = ()</pre>
+ </blockquote>
+ <p>
+ and lists
+ <blockquote><pre>
+data [] a = [] | a : [a]</pre>
+ </blockquote>
+ <p>
+ It also contains instance declarations for these types. In addition,
+ <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
+ machinery</a> for efficient list handling.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:30:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html
new file mode 100644
index 0000000000..4ad6c20338
--- /dev/null
+++ b/docs/comm/rts-libs/prelude.html
@@ -0,0 +1,121 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Cunning Prelude Code</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Cunning Prelude Code</h1>
+ <p>
+ GHC uses many optimisations and GHC-specific techniques (unboxed
+ values, RULES pragmas, and so on) to make the heavily used Prelude code
+ as fast as possible.
+
+ <hr>
+ <h4>Par, seq, and lazy</h4>
+
+ In GHC.Conc you will find
+<blockquote><pre>
+ pseq a b = a `seq` lazy b
+</pre></blockquote>
+ What's this "lazy" thing? Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting.
+ We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict
+ in b, then b might be evaluated <em>before</em> a, which is all wrong.
+<p>
+Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>. This function is just the identity function,
+except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the
+inlining and strictness information that comes in from GHC.Base.hi. And that makes <tt>lazy</tt> look
+lazy, and have no inlining. So the strictness analyser gets no traction.
+<p>
+In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs),
+so we get all the efficiency back.
+<p>
+This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an
+Int#. Now there is no seq# operator at all.
+
+
+ <hr>
+ <h4>fold/build</h4>
+ <p>
+ There is a lot of magic in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
+ among other things, the <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
+ pragmas</a> implementing the <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
+ optimisation. The code for <code>map</code> is
+ a good example for how it all works. In the prelude code for version
+ 5.03 it reads as follows:
+ <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map _ [] = []
+map f (x:xs) = f x : map f xs
+
+-- Note eta expanded
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+{-# INLINE [0] mapFB #-}
+mapFB c f x ys = c (f x) ys
+
+{-# RULES
+"map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
+"mapList" [1] forall f. foldr (mapFB (:) f) [] = map f
+"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
+ #-}</pre>
+ </blockquote>
+ <p>
+ Up to (but not including) phase 1, we use the <code>"map"</code> rule to
+ rewrite all saturated applications of <code>map</code> with its
+ build/fold form, hoping for fusion to happen. In phase 1 and 0, we
+ switch off that rule, inline build, and switch on the
+ <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back
+ into plain map.
+ <p>
+ It's important that these two rules aren't both active at once
+ (along with build's unfolding) else we'd get an infinite loop
+ in the rules. Hence the activation control using explicit phase numbers.
+ <p>
+ The "mapFB" rule optimises compositions of map.
+ <p>
+ The mechanism as described above is new in 5.03 since January 2002,
+ where the <code>[~</code><i>N</i><code>]</code> syntax for phase number
+ annotations at rules was introduced. Before that the whole arrangement
+ was more complicated, as the corresponding prelude code for version
+ 4.08.1 shows:
+ <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map = mapList
+
+-- Note eta expanded
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+mapList :: (a -> b) -> [a] -> [b]
+mapList _ [] = []
+mapList f (x:xs) = f x : mapList f xs
+
+{-# RULES
+"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
+"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
+"mapList" forall f. foldr (mapFB (:) f) [] = mapList f
+ #-}</pre>
+ </blockquote>
+ <p>
+ This code is structured as it is, because the "map" rule first
+ <em>breaks</em> the map <em>open,</em> which exposes it to the various
+ foldr/build rules, and if no foldr/build rule matches, the "mapList"
+ rule <em>closes</em> it again in a later phase of optimisation - after
+ build was inlined. As a consequence, the whole thing depends a bit on
+ the timing of the various optimisations (the map might be closed again
+ before any of the foldr/build rules fires). To make the timing
+ deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
+ #-}</code> pragma, which delays <code>build</code>'s inlining, and thus,
+ the closing of the map. [NB: Phase numbering was forward at that time.]
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Mon Feb 11 20:00:49 EST 2002
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
new file mode 100644
index 0000000000..28abc79426
--- /dev/null
+++ b/docs/comm/rts-libs/primitives.html
@@ -0,0 +1,70 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Primitives</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Primitives</h1>
+ <p>
+ Most user-level Haskell types and functions provided by GHC (in
+ particular those from the Prelude and GHC's Prelude extensions) are
+ internally constructed from even more elementary types and functions.
+ Most notably, GHC understands a notion of <em>unboxed types,</em> which
+ are the Haskell representation of primitive bit-level integer, float,
+ etc. types (as opposed to their boxed, heap allocated counterparts) -
+ cf. <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
+ Values as First Class Citizens."</a>
+
+ <h4>The Ultimate Source of Primitives</h4>
+ <p>
+ The hardwired types of GHC are brought into scope by the module
+ <code>PrelGHC</code>. This module only exists in the form of a
+ handwritten interface file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
+ which lists the type and function names, as well as instance
+ declarations. The actually types of these names as well as their
+ implementation is hardwired into GHC. Note that the names in this file
+ are z-encoded, and in particular, identifiers ending on <code>zh</code>
+ denote user-level identifiers ending in a hash mark (<code>#</code>),
+ which is used to flag unboxed values or functions operating on unboxed
+ values. For example, we have <code>Char#</code>, <code>ord#</code>, and
+ so on.
+
+ <h4>The New Primitive Definition Scheme</h4>
+ <p>
+ As of (about) the development version 4.11, the types and various
+ properties of primitive operations are defined in the file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
+ (Personally, I don't think that the <code>.txt</code> suffix is really
+ appropriate, as the file is used for automatic code generation; the
+ recent addition of <code>.pp</code> means that the file is now mangled
+ by cpp.)
+ <p>
+ The utility <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
+ generates a series of Haskell files from <code>primops.txt</code>, which
+ encode the types and various properties of the primitive operations as
+ compiler internal data structures. These Haskell files are not complete
+ modules, but program fragments, which are included into compiler modules
+ during the GHC build process. The generated include files can be found
+ in the directory <code>fptools/ghc/compiler/</code> and carry names
+ matching the pattern <code>primop-*.hs-incl</code>. They are generated
+ during the execution of the <code>boot</code> target in the
+ <code>fptools/ghc/</code> directory. This scheme significantly
+ simplifies the maintenance of primitive operations.
+ <p>
+ As of development version 5.02, the <code>primops.txt</code> file also allows the
+ recording of documentation about intended semantics of the primitives. This can
+ be extracted into a latex document (or rather, into latex document fragments)
+ via an appropriate switch to <code>genprimopcode</code>. In particular, see <code>primops.txt</code>
+ for full details of how GHC is configured to cope with different machine word sizes.
+ <p><small>
+<!-- hhmts start -->
+Last modified: Mon Nov 26 18:03:16 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
new file mode 100644
index 0000000000..196ec9150d
--- /dev/null
+++ b/docs/comm/rts-libs/stgc.html
@@ -0,0 +1,45 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Spineless Tagless C</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Spineless Tagless C</h1>
+ <p>
+ The C code generated by GHC doesn't use higher-level features of C, so as
+ to be able to control as precisely as possible what code is generated.
+ Moreover, it uses special features of gcc (such as first-class labels)
+ to produce more efficient code.
+ <p>
+ STG C makes ample use of C's macro language to define idioms, which also
+ reduces the size of the generated C code (thus, reducing I/O times).
+ These macros are defined in the C headers located in GHC's <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
+ directory.
+
+ <h4><code>TailCalls.h</code></h4>
+ <p>
+ <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
+ defines how tail calls are implemented - and in particular - optimised
+ in GHC generated code. The default case, for an architecture for which
+ GHC is not optimised, is to use the mini interpreter described in the <a
+ href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
+ <p>
+ For supported architectures, various tricks are used to generate
+ assembler implementing proper tail calls. On i386, gcc's first class
+ labels are used to directly jump to a function pointer. Furthermore,
+ markers of the form <code>--- BEGIN ---</code> and <code>--- END
+ ---</code> are added to the assembly right after the function prologue
+ and before the epilogue. These markers are used by <a
+ href="../the-beast/mangler.html">the Evil Mangler.</a>
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:28:29 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html
new file mode 100644
index 0000000000..499aeec767
--- /dev/null
+++ b/docs/comm/rts-libs/threaded-rts.html
@@ -0,0 +1,126 @@
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title>
+ </head>
+
+ <body>
+ <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1>
+
+ <p>This section of the commentary explains the structure of the runtime system
+ when used in threaded or SMP mode.</p>
+
+ <p>The <em>threaded</em> version of the runtime supports
+ bound threads and non-blocking foreign calls, and an overview of its
+ design can be found in the paper <a
+ href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending
+ the Haskell Foreign Function Interface with Concurrency</a>. To
+ compile the runtime with threaded support, add the line
+
+<pre>GhcRTSWays += thr</pre>
+
+ to <tt>mk/build.mk</tt>. When building C code in the runtime for the threaded way,
+ the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the
+ build system when building for way <tt>thr</tt>, see
+ <tt>mk/config.mk</tt>). To build a Haskell program
+ with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
+ can be used in conjunction with <tt>-prof</tt>, and possibly
+ <tt>-debug</tt> and others depending on which versions of the RTS have
+ been built).</p>
+
+ <p>The <em>SMP</em> version of the runtime supports the same facilities as the
+ threaded version, and in addition supports execution of Haskell code by
+ multiple simultaneous OS threads. For SMP support, both the runtime and
+ the libraries must be built in a special way: add the lines
+
+ <pre>
+GhcRTSWays += thr
+GhcLibWays += s</pre>
+
+ to <tt>mk/build.mk</tt>. To build Haskell code for
+ SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in
+ conjunction with <tt>-debug</tt>, but no other way-flags at this time).
+ When building C code in the runtime for SMP
+ support, the symbol <tt>SMP</tt> is defined (this is arranged by the
+ compiler when the <tt>-smp</tt> flag is given, see
+ <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>
+
+ <p>When building the runtime in either the threaded or SMP ways, the symbol
+ <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>
+
+ <h2>Overall design</h2>
+
+ <p>The system is based around the notion of a <tt>Capability</tt>. A
+ <tt>Capability</tt> is an object that represents both the permission to
+ execute some Haskell code, and the state required to do so. In order
+ to execute some Haskell code, a thread must therefore hold a
+ <tt>Capability</tt>. The available pool of capabilities is managed by
+ the <tt>Capability</tt> API, described below.</p>
+
+ <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
+ system, indicating that only a single thread can be executing Haskell
+ code at any one time. In the SMP runtime, there can be an arbitrary
+ number of capabilities selectable at runtime with the <tt>+RTS -N<em>n</em></tt>
+ flag; in practice the number is best chosen to be the same as the number of
+ processors on the host machine.</p>
+
+ <p>There are a number of OS threads running code in the runtime. We call
+ these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>.
+ Tasks are managed by the <tt>Task</tt> subsystem, which is mainly
+ concerned with keeping track of statistics such as how much time each
+ task spends executing Haskell code, and also keeping track of how many
+ tasks are around when we want to shut down the runtime.</p>
+
+ <p>Some tasks are created by the runtime itself, and some may be here
+ as a result of a call to Haskell from foreign code (we
+ call this an in-call). The
+ runtime can support any number of concurrent foreign in-calls, but the
+ number of these calls that will actually run Haskell code in parallel is
+ determined by the number of available capabilities. Each in-call creates
+ a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited
+ above).</p>
+
+ <p>In the future we may want to bind a <tt>Capability</tt> to a particular
+ processor, so that we can support a notion of affinity - avoiding
+ accidental migration of work from one CPU to another, so that we can make
+ best use of a CPU's local cache. For now, the design ignores this
+ issue.</p>
+
+ <h2>The <tt>OSThreads</tt> interface</h2>
+
+ <p>This interface is merely an abstraction layer over the OS-specific APIs
+ for managing threads. It has two main implementations: Win32 and
+ POSIX.</p>
+
+ <p>This is the entirety of the interface:</p>
+
+<pre>
+/* Various abstract types */
+typedef Mutex;
+typedef Condition;
+typedef OSThreadId;
+
+extern OSThreadId osThreadId ( void );
+extern void shutdownThread ( void );
+extern void yieldThread ( void );
+extern int createOSThread ( OSThreadId* tid,
+ void (*startProc)(void) );
+
+extern void initCondition ( Condition* pCond );
+extern void closeCondition ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition ( Condition* pCond );
+extern rtsBool waitCondition ( Condition* pCond,
+ Mutex* pMut );
+
+extern void initMutex ( Mutex* pMut );
+ </pre>
+
+ <h2>The Task interface</h2>
+
+ <h2>The Capability interface</h2>
+
+ <h2>Multiprocessor Haskell Execution</h2>
+
+ </body>
+</html>