Diffstat (limited to 'docs/comm/rts-libs')
-rw-r--r--  docs/comm/rts-libs/coding-style.html   516
-rw-r--r--  docs/comm/rts-libs/foreignptr.html      68
-rw-r--r--  docs/comm/rts-libs/multi-thread.html   445
-rw-r--r--  docs/comm/rts-libs/non-blocking.html   133
-rw-r--r--  docs/comm/rts-libs/prelfound.html       57
-rw-r--r--  docs/comm/rts-libs/prelude.html        121
-rw-r--r--  docs/comm/rts-libs/primitives.html      70
-rw-r--r--  docs/comm/rts-libs/stgc.html            45
-rw-r--r--  docs/comm/rts-libs/threaded-rts.html   126
9 files changed, 1581 insertions, 0 deletions
diff --git a/docs/comm/rts-libs/coding-style.html b/docs/comm/rts-libs/coding-style.html
new file mode 100644
index 0000000000..58f5b4f9bb
--- /dev/null
+++ b/docs/comm/rts-libs/coding-style.html
@@ -0,0 +1,516 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Style Guidelines for RTS C code</title>
+ </head>
+
+<body>
+<H1>The GHC Commentary - Style Guidelines for RTS C code</h1>
+
+<h2>Comments</h2>
+
+<p>These coding style guidelines are mainly intended for use in
+<tt>ghc/rts</tt> and <tt>ghc/includes</tt>.
+
+<p>NB These are just suggestions. They're not set in stone. Some of
+them are probably misguided. If you disagree with them, feel free to
+modify this document (and make your commit message reasonably
+informative) or mail someone (e.g. <a
+href="mailto:glasgow-haskell-users@haskell.org">The GHC mailing list</a>).
+
+<h2>References</h2>
+
+If you haven't read them already, you might like to check the following.
+Where they conflict with our suggestions, they're probably right.
+
+<ul>
+
+<li>
+The C99 standard. One reasonable reference is <a
+href="http://home.tiscalinet.ch/t_wolf/tw/c/c9x_changes.html">here</a>.
+
+<p><li>
+Writing Solid Code, Microsoft Press. (Highly recommended. Possibly
+the only Microsoft Press book that's worth reading.)
+
+<p><li>
+Autoconf documentation.
+See also <a href="http://peti.gmd.de/autoconf-archive/">The autoconf macro archive</a> and
+<a href="http://www.cyclic.com/cyclic-pages/autoconf.html">Cyclic Software's description</a>
+
+<p><li> <a
+href="http://www.cs.umd.edu/users/cml/cstyle/indhill-cstyle.html">Indian
+Hill C Style and Coding Standards</a>.
+
+<p><li>
+<a href="http://www.cs.umd.edu/users/cml/cstyle/">A list of C programming style links</a>
+
+<p><li>
+<a href="http://www.lysator.liu.se/c/c-www.html">A very large list of C programming links</a>
+
+<p><li>
+<a href="http://www.geek-girl.com/unix.html">A list of Unix programming links</a>
+
+</ul>
+
+
+<h2>Portability issues</h2>
+
+<ul>
+<p><li> We try to stick to C99 where possible. We use the following
+C99 features relative to C89, some of which were previously GCC
+extensions (possibly with different syntax):
+
+<ul>
+<p><li>Variable length arrays as the last field of a struct. GCC has
+a similar extension, but the syntax is slightly different: in GCC you
+would declare the array as <tt>arr[0]</tt>, whereas in C99 it is
+declared as <tt>arr[]</tt>.
+
+<p><li>Inline annotations on functions (see later)
+
+<p><li>Labeled elements in initialisers. Again, GCC has a slightly
+different syntax from C99 here, and we stick with the GCC syntax until
+GCC implements the C99 proposal.
+
+<p><li>C++-style comments. These are part of the C99 standard, and we
+prefer to use them whenever possible.
+</ul>
+
+<p>In addition we use ANSI-C-style function declarations and
+prototypes exclusively. Every function should have a prototype;
+static function prototypes may be placed near the top of the file in
+which they are declared, and external prototypes are usually placed in
+a header file with the same basename as the source file (although there
+are exceptions to this rule, particularly when several source files
+together implement a subsystem which is described by a single external
+header file).
+
+<p><li>We use the following GCC extensions, but surround them with
+<tt>#ifdef __GNUC__</tt>:
+
+<ul>
+<p><li>Function attributes (mostly just <code>noreturn</code> and
+<code>unused</code>)
+<p><li>Inline assembly.
+</ul>
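+
+<p>For example, a sketch of the guarded style (the macro name
+<tt>RTS_NORETURN</tt> and the function are invented here, not existing
+RTS code):
+
+<pre>
+  #ifdef __GNUC__
+  #define RTS_NORETURN __attribute__((noreturn))
+  #else
+  #define RTS_NORETURN /* nothing */
+  #endif
+
+  /* a hypothetical fatal-error routine that never returns */
+  static void fatalError(const char *msg) RTS_NORETURN;
+</pre>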
+
+<p><li>
+<tt>char</tt> can be signed or unsigned - always say which you mean.
+
+<p><li>Our POSIX policy: try to write code that only uses POSIX (IEEE
+Std 1003.1) interfaces and APIs. We used to define
+<code>POSIX_SOURCE</code> by default, but found that this caused more
+problems than it solved, so now we require any code that is
+POSIX-compliant to explicitly say so by having <code>#include
+"PosixSource.h"</code> at the top. Try to do this whenever possible.
+
+<p><li> Some architectures have memory alignment constraints. Others
+don't have any constraints but go faster if you align things. These
+macros (from <tt>ghcconfig.h</tt>) tell you which alignment to use
+
+<pre>
+ /* minimum alignment of unsigned int */
+ #define ALIGNMENT_UNSIGNED_INT 4
+
+ /* minimum alignment of long */
+ #define ALIGNMENT_LONG 4
+
+ /* minimum alignment of float */
+ #define ALIGNMENT_FLOAT 4
+
+ /* minimum alignment of double */
+ #define ALIGNMENT_DOUBLE 4
+</pre>
+
+<p><li> Use <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> when
+reading/writing ints and ptrs to the stack or heap. Note that, by
+definition, <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> are
+the same size and have the same alignment constraints even if
+<code>sizeof(int) != sizeof(ptr)</code> on that platform.
+
+<p><li> Use <tt>StgInt8</tt>, <tt>StgInt16</tt>, etc when you need a
+certain minimum number of bits in a type. Use <tt>int</tt> and
+<tt>nat</tt> when there's no particular constraint. ANSI C only
+guarantees that ints are at least 16 bits but within GHC we assume
+they are 32 bits.
+
+<p><li> Use <tt>StgFloat</tt> and <tt>StgDouble</tt> for floating
+point values which will go on/have come from the stack or heap. Note
+that <tt>StgDouble</tt> may occupy more than one <tt>StgWord</tt>, but
+it will always be a whole number multiple.
+
+<p>
+Use <code>PK_FLT(addr)</code>, <code>PK_DBL(addr)</code> to read
+<tt>StgFloat</tt> and <tt>StgDouble</tt> values from the stack/heap,
+and <code>ASSIGN_FLT(val,addr)</code> /
+<code>ASSIGN_DBL(val,addr)</code> to assign StgFloat/StgDouble values
+to heap/stack locations. These macros take care of alignment
+restrictions.
+
+<p>
+Heap/Stack locations are always <tt>StgWord</tt> aligned; the
+alignment requirements of an <tt>StgDouble</tt> may be more than that
+of <tt>StgWord</tt>, but we don't pad misaligned <tt>StgDoubles</tt>
+because doing so would be too much hassle (see <code>PK_DBL</code> &
+co above).
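+
+<p>A minimal sketch (not real RTS code) of using these macros; the stack
+slot offset is invented, and the argument order follows the description
+above:
+
+<pre>
+  /* read an StgDouble from the stack and write it back, letting the
+     macros deal with any alignment issues (StgPtr, StgDouble, PK_DBL
+     and ASSIGN_DBL all come from the RTS headers) */
+  static void doubleExample( StgPtr sp )
+  {
+      StgDouble d;
+
+      d = PK_DBL(sp + 1);               /* may span more than one StgWord */
+      ASSIGN_DBL(d * 2.0, sp + 1);      /* argument order: (val, addr)    */
+  }
+</pre>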
+
+<p><li>
+Avoid conditional code like this:
+
+<pre>
+ #ifdef solaris_host_OS
+ // do something solaris specific
+ #endif
+</pre>
+
+Instead, add an appropriate test to the configure.ac script and use
+the result of that test:
+
+<pre>
+ #ifdef HAVE_BSD_H
+ // use a BSD library
+ #endif
+</pre>
+
+<p>The problem is that things change from one version of an OS to another
+- things get added, things get deleted, things get broken, some things
+are optional extras. Using "feature tests" instead of "system tests"
+makes things a lot less brittle. Things also tend to get documented
+better.
+
+</ul>
+
+<h2>Debugging/robustness tricks</h2>
+
+
+Anyone who has tried to debug a garbage collector or code generator
+will tell you: "If a program is going to crash, it should crash as
+soon, as noisily and as often as possible." There's nothing worse
+than trying to find a bug which only shows up when running GHC on
+itself and doesn't manifest itself until 10 seconds after the actual
+cause of the problem.
+
+<p>We put all our debugging code inside <tt>#ifdef DEBUG</tt>. The
+general policy is we don't ship code with debugging checks and
+assertions in it, but we do run with those checks in place when
+developing and testing. Anything inside <tt>#ifdef DEBUG</tt> should
+not slow down the code by more than a factor of 2.
+
+<p>We also have more expensive "sanity checking" code for hardcore
+debugging - this can slow down the code by a large factor, but is only
+enabled on demand by a command-line flag. General sanity checking in
+the RTS is currently enabled with the <tt>-DS</tt> RTS flag.
+
+<p>There are a number of RTS flags which control debugging output and
+sanity checking in various parts of the system when <tt>DEBUG</tt> is
+defined. For example, to get the scheduler to be verbose about what
+it is doing, you would say <tt>+RTS -Ds -RTS</tt>. See
+<tt>includes/RtsFlags.h</tt> and <tt>rts/RtsFlags.c</tt> for the full
+set of debugging flags. To check one of these flags in the code,
+write something like:
+
+<pre>
+ IF_DEBUG(gc, fprintf(stderr, "..."));
+</pre>
+
+This checks the <tt>gc</tt> flag before generating the output (and the
+code is removed altogether if <tt>DEBUG</tt> is not defined).
+
+<p>All debugging output should go to <tt>stderr</tt>.
+
+<p>
+Particular guidelines for writing robust code:
+
+<ul>
+<p><li>
+Use assertions. Use lots of assertions. If you write a comment
+that says "takes a +ve number" add an assertion. If you're casting
+an int to a nat, add an assertion. If you're casting an int to a char,
+add an assertion. We use the <tt>ASSERT</tt> macro for writing
+assertions; it goes away when <tt>DEBUG</tt> is not defined.
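+
+<p>For example (a sketch only; the function name is invented, while
+<tt>ASSERT</tt> and <tt>nat</tt> come from the RTS headers):
+
+<pre>
+  /* takes a +ve number */
+  static nat decrement( nat n )
+  {
+      ASSERT(n > 0);      /* compiles away unless DEBUG is defined */
+      return n - 1;
+  }
+</pre>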
+
+<p><li>
+Write special debugging code to check the integrity of your data structures.
+(Most of the runtime checking code is in <tt>rts/Sanity.c</tt>)
+Add extra assertions which call this code at the start and end of any
+code that operates on your data structures.
+
+<p><li>
+When you find a hard-to-spot bug, try to think of some assertions,
+sanity checks or whatever that would have made the bug easier to find.
+
+<p><li>
+When defining an enumeration, it's a good idea not to use 0 for normal
+values. Instead, make 0 raise an internal error. The idea here is to
+make it easier to detect pointer-related errors on the assumption that
+random pointers are more likely to point to a 0 than to anything else.
+
+<pre>
+typedef enum
+ { i_INTERNAL_ERROR /* Instruction 0 raises an internal error */
+ , i_PANIC /* irrefutable pattern match failed! */
+ , i_ERROR /* user level error */
+
+ ...
+</pre>
+
+<p><li> Use <tt>#warning</tt> or <tt>#error</tt> whenever you write a
+piece of incomplete/broken code.
+
+<p><li> When testing, try to make infrequent things happen often.
+ For example, make a context switch/gc/etc happen every time a
+ context switch/gc/etc can happen. The system will run like a
+ pig but it'll catch a lot of bugs.
+
+</ul>
+
+<h2>Syntactic details</h2>
+
+<ul>
+<p><li><b>Important:</b> Put "redundant" braces or parens in your code.
+Omitting braces and parens leads to very hard to spot bugs -
+especially if you use macros (and you might have noticed that GHC does
+this a lot!)
+
+<p>
+In particular:
+<ul>
+<p><li>
+Put braces round the body of for loops, while loops, if statements, etc.
+even if they "aren't needed" because it's really hard to find the resulting
+bug if you mess up. Indent them any way you like but put them in there!
+</ul>
+
+<p><li>
+When defining a macro, always put parens round args - just in case.
+For example, write:
+<pre>
+ #define add(x,y) ((x)+(y))
+</pre>
+instead of
+<pre>
+ #define add(x,y) x+y
+</pre>
+
+<p><li> Don't declare and initialize variables at the same time.
+Separating the declaration and initialization takes more lines, but
+makes the code clearer.
+
+<p><li>
+Use inline functions instead of macros if possible - they're a lot
+less tricky to get right and don't suffer from the usual problems
+of side effects, evaluation order, multiple evaluation, etc.
+
+<ul>
+<p><li>Inline functions get the naming issue right. E.g. they
+ can have local variables which (in an expression context)
+ macros can't.
+
+<p><li> Inline functions have call-by-value semantics whereas macros
+ are call-by-name. You can be bitten by duplicated computation
+ if you aren't careful.
+
+<p><li> You can use inline functions from inside gdb if you compile with
+ -O0 or -fkeep-inline-functions. If you use macros, you'd better
+ know what they expand to.
+</ul>
+
+However, note that macros can serve as both l-values and r-values and
+can be "polymorphic" as these examples show:
+<pre>
+  // you can use this as an l-value or an r-value
+ #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo)
+
+ // polymorphic case
+  // but note that min(min(1,2),3) does 3 comparisons instead of 2!!
+ #define min(x,y) (((x)<=(y)) ? (x) : (y))
+</pre>
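+
+<p>For comparison, a <tt>static inline</tt> version of <tt>min</tt> for a
+fixed type (a sketch): it evaluates its arguments exactly once, though it
+gives up the polymorphism.
+
+<pre>
+  static inline int minInt( int x, int y )
+  {
+      return (x <= y) ? x : y;
+  }
+</pre>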
+
+<p><li>
+Inline functions should be "static inline" because:
+<ul>
+<p><li>
+gcc will delete static inlines if they are not used or are always inlined.
+
+<p><li>
+ if they're externed, we could get conflicts between 2 copies of the
+ same function if, for some reason, gcc is unable to delete them.
+ If they're static, we still get multiple copies but at least they don't conflict.
+</ul>
+
+OTOH, the gcc manual says the following, so maybe we should use
+<tt>extern inline</tt>?
+
+<pre>
+ When a function is both inline and `static', if all calls to the
+function are integrated into the caller, and the function's address is
+never used, then the function's own assembler code is never referenced.
+In this case, GNU CC does not actually output assembler code for the
+function, unless you specify the option `-fkeep-inline-functions'.
+Some calls cannot be integrated for various reasons (in particular,
+calls that precede the function's definition cannot be integrated, and
+neither can recursive calls within the definition). If there is a
+nonintegrated call, then the function is compiled to assembler code as
+usual. The function must also be compiled as usual if the program
+refers to its address, because that can't be inlined.
+
+ When an inline function is not `static', then the compiler must
+assume that there may be calls from other source files; since a global
+symbol can be defined only once in any program, the function must not
+be defined in the other source files, so the calls therein cannot be
+integrated. Therefore, a non-`static' inline function is always
+compiled on its own in the usual fashion.
+
+ If you specify both `inline' and `extern' in the function
+definition, then the definition is used only for inlining. In no case
+is the function compiled on its own, not even if you refer to its
+address explicitly. Such an address becomes an external reference, as
+if you had only declared the function, and had not defined it.
+
+ This combination of `inline' and `extern' has almost the effect of a
+macro. The way to use it is to put a function definition in a header
+file with these keywords, and put another copy of the definition
+(lacking `inline' and `extern') in a library file. The definition in
+the header file will cause most calls to the function to be inlined.
+If any uses of the function remain, they will refer to the single copy
+in the library.
+</pre>
+
+<p><li>
+Don't define macros that expand to a list of statements.
+You could just use braces as in:
+
+<pre>
+ #define ASSIGN_CC_ID(ccID) \
+ { \
+ ccID = CC_ID; \
+ CC_ID++; \
+ }
+</pre>
+
+(but it's usually better to use an inline function instead - see above).
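+
+<p>A common refinement (not required by these guidelines) is to wrap the
+braces in <tt>do { } while (0)</tt>, so that the macro behaves like a
+single statement even directly before an <tt>else</tt>:
+
+<pre>
+  #define ASSIGN_CC_ID(ccID)      \
+  do {                            \
+      ccID = CC_ID;               \
+      CC_ID++;                    \
+  } while (0)
+</pre>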
+
+<p><li>
+Don't even write macros that expand to 0 statements - they can mess you
+up as well. Use the doNothing macro instead.
+<pre>
+ #define doNothing() do { } while (0)
+</pre>
+
+<p><li>
+This code
+<pre>
+int* p, q;
+</pre>
+looks like it declares two pointers but, in fact, only p is a pointer.
+It's safer to write this:
+<pre>
+int* p;
+int* q;
+</pre>
+You could also write this:
+<pre>
+int *p, *q;
+</pre>
+but it is preferable to split the declarations.
+
+<p><li>
+Try to use ANSI C's enum feature when defining lists of constants of
+the same type. Among other benefits, you'll notice that gdb uses the
+name instead of its (usually inscrutable) number when printing values
+with enum types and gdb will let you use the name in expressions you
+type.
+
+<p>
+Examples:
+<pre>
+ typedef enum { /* N.B. Used as indexes into arrays */
+ NO_HEAP_PROFILING,
+ HEAP_BY_CC,
+ HEAP_BY_MOD,
+ HEAP_BY_GRP,
+ HEAP_BY_DESCR,
+ HEAP_BY_TYPE,
+ HEAP_BY_TIME
+ } ProfilingFlags;
+</pre>
+instead of
+<pre>
+ # define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */
+ # define HEAP_BY_CC 1
+ # define HEAP_BY_MOD 2
+ # define HEAP_BY_GRP 3
+ # define HEAP_BY_DESCR 4
+ # define HEAP_BY_TYPE 5
+ # define HEAP_BY_TIME 6
+</pre>
+and
+<pre>
+ typedef enum {
+ CCchar = 'C',
+ MODchar = 'M',
+ GRPchar = 'G',
+ DESCRchar = 'D',
+ TYPEchar = 'Y',
+ TIMEchar = 'T'
+ } ProfilingTag;
+</pre>
+instead of
+<pre>
+ # define CCchar 'C'
+ # define MODchar 'M'
+ # define GRPchar 'G'
+ # define DESCRchar 'D'
+ # define TYPEchar 'Y'
+ # define TIMEchar 'T'
+</pre>
+
+<p><li> Please keep to 80 columns: the line has to be drawn somewhere,
+and by keeping it to 80 columns we can ensure that code looks OK on
+everyone's screen. Long lines are hard to read, and a sign that the
+code needs to be restructured anyway.
+
+<p><li> When commenting out large chunks of code, use <code>#if 0
+... #endif</code> rather than <code>/* ... */</code> because C doesn't
+have nested comments.
+
+<p><li>When declaring a typedef for a struct, give the struct a name
+as well, so that other headers can forward-reference the struct name
+and it becomes possible to have opaque pointers to the struct. Our
+convention is to name the struct the same as the typedef, but add a
+leading underscore. For example:
+
+<pre>
+ typedef struct _Foo {
+ ...
+ } Foo;
+</pre>
+
+<p><li>Do not use <tt>!</tt> instead of explicit comparison against
+<tt>NULL</tt> or <tt>'\0'</tt>; the latter is much clearer.
+
+<p><li> We don't care too much about your indentation style but, if
+you're modifying a function, please try to use the same style as the
+rest of the function (or file). If you're writing new code, a
+tab width of 4 is preferred.
+
+</ul>
+
+<h2>CVS issues</h2>
+
+<ul>
+<p><li>
+Don't be tempted to reindent or reorganise large chunks of code - it
+generates large diffs in which it's hard to see whether anything else
+was changed.
+<p>
+If you must reindent or reorganise, don't include any functional
+changes in that commit, and give advance warning that you're about to
+do it in case anyone else is changing that file.
+</ul>
+
+
+</body>
+</html>
diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html
new file mode 100644
index 0000000000..febe9fe422
--- /dev/null
+++ b/docs/comm/rts-libs/foreignptr.html
@@ -0,0 +1,68 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+
+ <h1>On why we have <tt>ForeignPtr</tt></h1>
+
+ <p>Unfortunately it isn't possible to add a finalizer to a normal
+ <tt>Ptr a</tt>. We already have a generic finalization mechanism:
+ see the Weak module in package lang. But the only reliable way to
+ use finalizers is to attach one to an atomic heap object - that
+ way the compiler's optimiser can't interfere with the lifetime of
+ the object.
+
+ <p>The <tt>Ptr</tt> type is really just a boxed address - it's
+ defined like
+
+ <pre>
+data Ptr a = Ptr Addr#
+</pre>
+
+ <p>where <tt>Addr#</tt> is an unboxed native address (just a 32-
+ or 64- bit word). Putting a finalizer on a <tt>Ptr</tt> is
+ dangerous, because the compiler's optimiser might remove the box
+ altogether.
+
+ <p><tt>ForeignPtr</tt> is defined like this
+
+ <pre>
+data ForeignPtr a = ForeignPtr ForeignObj#
+</pre>
+
+ <p>where <tt>ForeignObj#</tt> is a <em>boxed</em> address; it corresponds
+ to a real heap object. The heap object is primitive from the
+ point of view of the compiler - it can't be optimised away. So it
+ works to attach a finalizer to the <tt>ForeignObj#</tt> (but not
+ to the <tt>ForeignPtr</tt>!).
+
+ <p>There are several primitive objects to which we can attach
+ finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>,
+ etc. We have special functions for some of these: eg.
+ <tt>MVar.addMVarFinalizer</tt>.
+
+ <p>So a nicer interface might be something like
+
+<pre>
+class Finalizable a where
+ addFinalizer :: a -> IO () -> IO ()
+
+instance Finalizable (ForeignPtr a) where ...
+instance Finalizable (MVar a) where ...
+</pre>
+
+ <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and
+ rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>. The reason for that
+ is just efficiency, I think.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Sep 26 09:49:37 BST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html
new file mode 100644
index 0000000000..67a544be85
--- /dev/null
+++ b/docs/comm/rts-libs/multi-thread.html
@@ -0,0 +1,445 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+<head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+<title>The GHC Commentary - Supporting multi-threaded interoperation</title>
+</head>
+<body>
+<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1>
+<em>
+<p>
+Authors: sof@galois.com, simonmar@microsoft.com<br>
+Date: April 2002
+</p>
+</em>
+<p>
+This document presents the implementation of an extension to
+Concurrent Haskell that provides two enhancements:
+</p>
+<ul>
+<li>A Concurrent Haskell thread may call an external (e.g., C)
+function in a manner that's transparent to the execution/evaluation of
+other Haskell threads. Section <a href="#callout">"Calling out"</a> covers this.
+</li>
+<li>
+OS threads may safely call Haskell functions concurrently. Section
+<a href="#callin">"Calling in"</a> covers this.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callout">The problem: foreign calls that block</h2>
+<p>
+When a Concurrent Haskell (CH) thread calls a 'foreign import'ed
+function, the runtime system (RTS) has to handle this in a manner
+transparent to other CH threads. That is, they shouldn't be blocked
+from making progress while the CH thread executes the external
+call. Presently, all threads will block.
+</p>
+<p>
+Clearly, we have to rely on OS-level threads in order to support this
+kind of concurrency. The implementation described here defines the
+(abstract) OS threads interface that the RTS assumes. The implementation
+currently provides two instances of this interface, one for POSIX
+threads (pthreads) and one for Win32 threads.
+</p>
+
+<!---- *************************************** ----->
+<h3>Multi-threading the RTS</h3>
+
+<p>
+A simple and efficient way to implement non-blocking foreign calls is like this:
+<ul>
+<li> Invariant: only one OS thread is allowed to
+execute code inside of the GHC runtime system. [There are alternate
+designs, but I won't go into details on their pros and cons here.]
+We'll call the OS thread that is currently running Haskell threads
+the <em>Current Haskell Worker Thread</em>.
+<p>
+The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its
+time-slice expires or it blocks on an MVar, then grabs another, and executes
+that, and so on.
+</p>
+<li>
+<p>
+When the Current Haskell Worker comes to execute a potentially blocking 'foreign
+import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that
+another OS worker thread is available to become the Current Haskell Worker.
+Consequently, even if the external call blocks, the new Current Haskell Worker
+continues execution of the other Concurrent Haskell threads.
+When the external call eventually completes, the Concurrent Haskell
+thread that made the call is passed the result and made runnable
+again.
+</p>
+<p>
+<li>
+A pool of OS threads is constantly trying to become the Current Haskell Worker.
+Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers.
+<p><li>
+The OS worker threads are regarded as interchangeable. A given Haskell thread
+may, during its lifetime, be executed entirely by one OS worker thread, or by more than one.
+There's just no way to tell.
+
+<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved.
+The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and
+the Current Haskell Worker Thread executes it. See Section <a href="#callin">"Calling in"</a>.
+</ul>
+<p>
+The rest of this section describes the mechanics of implementing all
+this. There are two parts to it: one describes how a native (OS) thread
+leaves the RTS to service the external call, the other how the same
+thread handles returning the result of the external call back to the
+Haskell thread.
+</p>
+
+<!---- *************************************** ----->
+<h3>Making the external call</h3>
+
+<p>
+Presently, GHC handles 'safe' C calls by effectively emitting the
+following code sequence:
+</p>
+
+<pre>
+ ...save thread state...
+ t = suspendThread();
+ r = foo(arg1,...,argn);
+ resumeThread(t);
+ ...restore thread state...
+ return r;
+</pre>
+
+<p>
+After having squirreled away the state of a Haskell thread,
+<tt>Schedule.c:suspendThread()</tt> is called which puts the current
+thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>]
+containing threads that are currently blocked waiting for external calls
+to complete (this is done for the purposes of finding roots when
+garbage collecting).
+</p>
+
+<p>
+In addition to putting the Haskell thread on
+<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also
+does the following:
+</p>
+<ul>
+<li>Instructs the <em>Task Manager</em> to make sure that there's
+another native thread waiting in the wings to take over the execution
+of Haskell threads. This might entail creating a new
+<em>worker thread</em> or re-using one that's currently waiting for
+more work to do. The <a href="#taskman">Task Manager</a> section
+presents the functionality provided by this subsystem.
+</li>
+
+<li>Releases its capability to execute within the RTS. By doing
+so, another worker thread will become unblocked and start executing
+code within the RTS. See the <a href="#capability">Capability</a>
+section for details.
+</li>
+
+<li><tt>suspendThread()</tt> returns a token which is used to
+identify the Haskell thread that was added to
+<tt>suspended_ccalling_threads</tt>. This is done so that once the
+external call has completed, we know what Haskell thread to pull off
+the <tt>suspended_ccalling_threads</tt> list.
+</li>
+</ul>
+
+<p>
+Upon return from <tt>suspendThread()</tt>, the OS thread is free of
+its RTS executing responsibility, and can now invoke the external
+call. Meanwhile, the other worker thread that has now gained access
+to the RTS will continue executing Concurrent Haskell code. Concurrent
+'stuff' is happening!
+</p>
+
+<!---- *************************************** ----->
+<h3>Returning the external result</h3>
+
+<p>
+When the native thread eventually returns from the external call,
+the result needs to be communicated back to the Haskell thread that
+issued the external call. The following steps take care of this:
+</p>
+
+<ul>
+<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>,
+passing along the token referring to the Haskell thread that made the
+call we're returning from.
+</li>
+
+<li>
+The OS thread then tries to grab hold of a <em>returning worker
+capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>.
+Until granted, the thread blocks waiting for RTS permissions. Clearly we
+don't want the thread to be blocked longer than it has to, so whenever
+a thread that is executing within the RTS enters the Scheduler (which
+is quite often, e.g., when a Haskell thread context switch is made),
+it checks to see whether it can give up its RTS capability to a
+returning worker, which is done by calling
+<tt>Capability.c:yieldToReturningWorker()</tt>.
+</li>
+
+<li>
+If a returning worker is waiting (the code in <tt>Capability.c</tt>
+keeps a counter of the number of returning workers that are currently
+blocked waiting), it is woken up and given the RTS execution
+privileges/capabilities of the worker thread that gave them up.
+</li>
+
+<li>
+The thread that gave up its capability then tries to re-acquire
+the capability to execute RTS code; this is done by calling
+<tt>Capability.c:waitForWorkCapability()</tt>.
+</li>
+
+<li>
+The returning worker that was woken up will continue execution in
+<tt>resumeThread()</tt>, removing its associated Haskell thread
+from the <tt>suspended_ccalling_threads</tt> list and start evaluating
+that thread, passing it the result of the external call.
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h3 id="rts-exec">RTS execution</h3>
+
+<p>
+If a worker thread inside the RTS runs out of runnable Haskell
+threads, it goes to sleep waiting for the external calls to complete.
+It does this by calling <tt>waitForWorkCapability()</tt>.
+</p>
+
+<p>
+The availability of new runnable Haskell threads is signalled when:
+</p>
+
+<ul>
+<li>An external call is set up in <tt>suspendThread()</tt>.</li>
+<li>A new Haskell thread is created (e.g., whenever
+<tt>Concurrent.forkIO</tt> is called from within Haskell); this is
+signalled in <tt>Schedule.c:scheduleThread_()</tt>.
+</li>
+<li>A Haskell thread is removed from a 'blocking queue'
+attached to an MVar (only?).
+</li>
+</ul>
+
+<!---- *************************************** ----->
+<h2 id="callin">Calling in</h2>
+
+Providing robust support for having multiple OS threads calling into
+Haskell is not as involved as its dual.
+
+<ul>
+<li>The OS thread issues the call to a Haskell function by going via
+the <em>Rts API</em> (as specified in <tt>RtsAPI.h</tt>).
+<li>Making the function application requires the construction of a
+closure on the heap. This is done in a thread-safe manner by having
+the OS thread lock a designated block of memory (the 'Rts API' block,
+which is part of the GC's root set) for the short period of time it
+takes to construct the application.
+<li>The OS thread then creates a new Haskell thread to execute the
+function application, which (eventually) boils down to calling
+<tt>Schedule.c:createThread()</tt>.
+<li>
+Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>,
+which asks the Task Manager to possibly create a new worker (OS)
+thread to execute the Haskell thread.
+<li>
+After the OS thread has done this, it blocks waiting for the
+Haskell thread to complete the evaluation of the Haskell function.
+<p>
+The reason why a separate worker thread is made to evaluate the Haskell
+function, and not the OS thread that made the call-in via the
+Rts API, is that we want that OS thread to return as soon as possible.
+We wouldn't be able to guarantee that if the OS thread entered the
+RTS to (initially) just execute its function application, as the
+Scheduler may side-track it and also ask it to evaluate other Haskell threads.
+</li>
+</ul>
+
+<p>
+<strong>Note:</strong> As of 20020413, the implementation of the RTS API
+only serializes access to the allocator between multiple OS threads wanting
+to call into Haskell (via the RTS API.) It does not coordinate this access
+to the allocator with that of the OS worker thread that's currently executing
+within the RTS. This weakness/bug is scheduled to be tackled as part of an
+overhaul/reworking of the RTS API itself.
+
+
+<!---- *************************************** ----->
+<h2>Subsystems introduced/modified</h2>
+
+<p>
+These threads extensions affect the Scheduler portions of the runtime
+system. To make it more manageable to work with, the changes
+introduced a couple of new RTS 'sub-systems'. This section presents
+the functionality and API of these sub-systems.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="capability">Capabilities</h3>
+
+<p>
+A Capability represents the token required to execute STG code,
+and all the state an OS thread/task needs to run Haskell code:
+its STG registers, a pointer to its TSO, a nursery etc. During
+STG execution, a pointer to the capability is kept in a
+register (BaseReg).
+</p>
+<p>
+Only in an SMP build will there be multiple capabilities; for
+the threaded RTS and other non-threaded builds, there is only
+one global capability, namely <tt>MainCapability</tt>.
+
+<p>
+The Capability API is as follows:
+<pre>
+/* Capability.h */
+extern void initCapabilities(void);
+
+extern void grabReturnCapability(Mutex* pMutex, Capability** pCap);
+extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable);
+extern void releaseCapability(Capability* cap);
+
+extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap);
+
+extern void grabCapability(Capability** cap);
+</pre>
+
+<ul>
+<li><tt>initCapabilities()</tt> initialises the subsystem.
+
+<li><tt>grabReturnCapability()</tt> is called by worker threads
+returning from an external call. It blocks them waiting to gain
+permissions to do so.
+
+<li><tt>waitForWorkCapability()</tt> is called by worker threads
+already inside the RTS, but without any work to do. It blocks them
+waiting for new work to become available.
+
+<li><tt>releaseCapability()</tt> hands back a capability. If a
+'returning worker' is waiting, it is signalled that a capability
+has become available. If not, <tt>releaseCapability()</tt> tries
+to signal worker threads that are blocked waiting inside
+<tt>waitForWorkCapability()</tt> that new work might now be
+available.
+
+<li><tt>yieldToReturningWorker()</tt> is called by the worker thread
+that's currently inside the Scheduler. It checks whether there are other
+worker threads waiting to return from making an external call. If so,
+they're given preference and a capability is transferred between worker
+threads. One of the waiting 'returning worker' threads is signalled and made
+runnable, while the yielding worker blocks until it can re-acquire
+a capability.
+</ul>
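+
+<p>
+As an illustration only (this is not the actual RTS code), a worker
+returning from an external call might use the API along these lines;
+<tt>sched_mutex</tt> is assumed to be the scheduler's global mutex:
+</p>
+
+<pre>
+#include "Capability.h"
+
+extern Mutex sched_mutex;     /* assumed: the scheduler's global lock */
+
+static void returnFromExternalCall( void )
+{
+    Capability *cap;
+
+    /* block until a capability is handed to this returning worker */
+    grabReturnCapability(&sched_mutex, &cap);
+
+    /* ... remove our Haskell thread from suspended_ccalling_threads,
+       pass it the call's result, and resume scheduling it ... */
+
+    releaseCapability(cap);   /* hand the token back when done */
+}
+</pre>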
+
+<p>
+The condition variables used to implement the synchronisation between
+worker consumers and providers are local to the Capability
+implementation. See source for details and comments.
+</p>
+
+<!---- *************************************** ----->
+<h3 id="taskman">The Task Manager</h3>
+
+<p>
+The Task Manager API is responsible for managing the creation of
+OS worker RTS threads. When a Haskell thread wants to make an
+external call, the Task Manager is asked to possibly create a
+new worker thread to take over the RTS-executing capability of
+the worker thread that's exiting the RTS to execute the external call.
+
+<p>
+The Capability subsystem keeps track of idle worker threads, so
+making an informed decision about whether or not to create a new OS
+worker thread is easy work for the task manager. The Task Manager
+provides the following API:
+</p>
+
+<pre>
+/* Task.h */
+extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) );
+extern void stopTaskManager ( void );
+
+extern void startTask ( void (*taskStart)(void) );
+</pre>
+
+<ul>
+<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> start
+up and shut down the subsystem. When starting up, you have the option
+to limit the overall number of worker threads that can be
+created. An unbounded (modulo OS thread constraints) number of threads
+is created if you pass '0'.
+<li><tt>startTask()</tt> is called when a worker thread calls
+<tt>suspendThread()</tt> to service an external call, asking another
+worker thread to take over its RTS-executing capability. It is also
+called when an external OS thread invokes a Haskell function via the
+<em>Rts API</em>.
+</ul>
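+
+<p>
+For illustration (everything except the two API calls is invented),
+RTS start-up might bring the subsystem up like this:
+</p>
+
+<pre>
+#include "Task.h"
+
+/* hypothetical entry point executed by each new worker thread */
+static void workerStart( void )
+{
+    /* ... acquire a capability and run the scheduler ... */
+}
+
+static void initTaskSupport( void )
+{
+    /* 0 = no fixed bound on the number of worker threads */
+    startTaskManager(0, workerStart);
+}
+</pre>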
+
+<!---- *************************************** ----->
+<h3>Native threads API</h3>
+
+To hide OS details, the following API is used by the task manager and
+the scheduler to interact with the OS threads API:
+
+<pre>
+/* OSThreads.h */
+typedef <em>..OS specific..</em> Mutex;
+extern void initMutex ( Mutex* pMut );
+extern void grabMutex ( Mutex* pMut );
+extern void releaseMutex ( Mutex* pMut );
+
+typedef <em>..OS specific..</em> Condition;
+extern void initCondition ( Condition* pCond );
+extern void closeCondition ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition ( Condition* pCond );
+extern rtsBool waitCondition ( Condition* pCond,
+ Mutex* pMut );
+
+extern OSThreadId osThreadId ( void );
+extern void shutdownThread ( void );
+extern void yieldThread ( void );
+extern int createOSThread ( OSThreadId* tid,
+ void (*startProc)(void) );
+</pre>
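+
+<p>
+One plausible POSIX instantiation of the abstract types (a sketch only;
+the real definitions live in the RTS's <tt>OSThreads</tt> code):
+</p>
+
+<pre>
+#include <pthread.h>
+
+typedef pthread_mutex_t Mutex;
+typedef pthread_cond_t  Condition;
+typedef pthread_t       OSThreadId;
+
+void initMutex    ( Mutex* pMut ) { pthread_mutex_init(pMut, NULL); }
+void grabMutex    ( Mutex* pMut ) { pthread_mutex_lock(pMut); }
+void releaseMutex ( Mutex* pMut ) { pthread_mutex_unlock(pMut); }
+</pre>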
+
+
+
+<!---- *************************************** ----->
+<h2>User-level interface</h2>
+
+To signal that you want an external call to be serviced by a separate
+OS thread, you have to add the attribute <tt>threadsafe</tt> to
+a foreign import declaration, e.g.,
+
+<pre>
+foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()
+</pre>
+
+<p>
+The distinction between 'safe' and thread-safe C calls is made
+so that we may call external functions that aren't re-entrant but may
+cause a GC to occur.
+<p>
+The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>.
+</p>
+
+<!---- *************************************** ----->
+<h2>Building the GHC RTS</h2>
+
+The multi-threaded extension isn't currently enabled by default. To
+have it built, you need to run the <tt>fptools</tt> configure script
+with the extra option <tt>--enable-threaded-rts</tt> turned on, and
+then proceed to build the compiler as per normal.
+
+<hr>
+<small>
+<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end -->
+</small>
+</body> </html>
+
diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html
new file mode 100644
index 0000000000..627bde8d88
--- /dev/null
+++ b/docs/comm/rts-libs/non-blocking.html
@@ -0,0 +1,133 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Non-blocking I/O on Win32</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Non-blocking I/O on Win32</h1>
+ <p>
+
+This note discusses the implementation of non-blocking I/O on
+Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth
+capturing the ideas. Thanks to Sigbjorn for writing them.
+
+<h2> Background</h2>
+
+GHC has provided non-blocking I/O support for Concurrent Haskell
+threads on platforms that provide 'UNIX-style' non-blocking I/O for
+quite a while. That is, platforms that let you alter a property of a
+file descriptor so that, instead of having a thread block performing an
+I/O operation that cannot be immediately satisfied, the operation
+returns a special error code (EWOULDBLOCK). When that happens, the CH
+thread that made the blocking I/O request is put into a blocked-on-IO
+state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will
+in a timely fashion check to see whether I/O is again possible
+(via a call to select()), and if it is, unblock the thread & have it
+re-try the I/O operation. The result is that other Concurrent Haskell
+threads won't be affected, but can continue operating while a thread
+is blocked on I/O.
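+<p>
+For reference, a sketch of the 'UNIX-style' facility being described
+(ordinary POSIX calls, not GHC code):
+<pre>
+#include <fcntl.h>
+
+/* put a file descriptor into non-blocking mode; reads and writes that
+   cannot proceed then fail immediately with EWOULDBLOCK/EAGAIN */
+int setNonBlocking(int fd)
+{
+    int flags = fcntl(fd, F_GETFL, 0);
+    if (flags == -1) return -1;
+    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
+}
+</pre>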
+<p>
+Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for
+the simple reason that it doesn't provide the OS facilities described
+above.
+
+<h2>Win32 non-blocking I/O, attempt 1</h2>
+
+Win32 does provide something select()-like, namely the
+WaitForMultipleObjects() API. It takes an array of kernel object
+handles plus a timeout interval, and waits for either one (or all) of
+them to become 'signalled'. A handle representing an open file (for
+reading) becomes signalled once there is input available.
+<p>
+So, it is possible to observe that I/O is possible using this
+function, but not whether there's "enough" to satisfy the I/O request.
+So, if we were to mimic select() usage with WaitForMultipleObjects(),
+we'd correctly avoid blocking initially, but a thread may very well
+block waiting for its I/O request to be satisfied once the file
+handle has become signalled. [There is a fix for this -- only read
+and write one byte at a time -- but I'm not advocating that.]
+
+
+<h2>Win32 non-blocking I/O, attempt 2</h2>
+
+Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is,
+asynchronous read and write requests can be made via the ReadFile() /
+WriteFile () APIs, specifying position and length of the operation.
+If the I/O requests cannot be handled right away, the APIs won't
+block, but return immediately (and report ERROR_IO_PENDING as their
+status code.)
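+<p>
+A hedged sketch of issuing such a request (standard Win32 calls; the
+surrounding event/completion-port plumbing is elided):
+<pre>
+#include <windows.h>
+
+/* start an overlapped read; returns TRUE if the request either completed
+   immediately or was queued for later completion */
+BOOL startOverlappedRead(HANDLE h, void *buf, DWORD len, OVERLAPPED *ov)
+{
+    DWORD got;
+    if (ReadFile(h, buf, len, &got, ov)) {
+        return TRUE;                              /* completed right away */
+    }
+    return (GetLastError() == ERROR_IO_PENDING);  /* queued, or failed    */
+}
+</pre>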
+<p>
+The completion of the request can be reported in a number of ways:
+<ul>
+ <li> synchronously, by blocking inside Read/WriteFile(). (this is the
+ non-overlapped case, really.)
+<p>
+
+ <li> as part of the overlapped I/O request, pass a HANDLE to an event
+ object. The I/O system will signal this event once the request
+ completed, which a waiting thread will then be able to see.
+<p>
+
+ <li> by supplying a pointer to a completion routine, which will be
+ called as an Asynchronous Procedure Call (APC) whenever a thread
+ calls one of a select group of 'alertable' APIs.
+<p>
+
+ <li> by associating the file handle with an I/O completion port. Once
+ the request completes, the thread servicing the I/O completion
+ port will be notified.
+</ul>
+The use of an I/O completion port looks the most interesting to GHC,
+as it provides a central point where all I/O requests are reported.
+<p>
+Note: asynchronous I/O is only fully supported by OSes based on
+the NT codebase, i.e., Win9x doesn't permit async I/O on files and
+pipes. However, Win9x does support async socket operations and,
+I'm currently guessing here, console I/O. In my view, it would
+be acceptable to provide non-blocking I/O support for NT-based
+OSes only.
+<p>
+Here's the design I currently have in mind:
+<ul>
+<li> Upon startup, an RTS helper thread, whose only purpose is to service
+ an I/O completion port, is created.
+<p>
+<li> All files are opened in 'overlapped' mode, and associated
+ with an I/O completion port.
+<p>
+<li> Overlapped I/O requests are used to implement read() and write().
+<p>
+<li> If the request cannot be satisfied without blocking, the Haskell
+ thread is put on the blocked-on-I/O thread list & a re-schedule
+ is made.
+<p>
+<li> When the completion of a request is signalled via the I/O completion
+ port, the RTS helper thread will move the associated Haskell thread
+ from the blocked list onto the runnable list. (Clearly, care
+ is required here to have another OS thread mutate internal Scheduler
+ data structures.)
+
+<p>
+<li> In the event all Concurrent Haskell threads are blocked waiting on
+ I/O, the main RTS thread blocks waiting on an event synchronisation
+ object, which the helper thread will signal whenever it makes
+ a Haskell thread runnable.
+
+</ul>
+
+I might do the communication between the RTS helper thread and the
+main RTS thread differently though: rather than have the RTS helper
+thread manipulate thread queues itself, thus requiring careful
+locking, just have it change a bit on the relevant TSO, which the main
+RTS thread can check at regular intervals (in some analog of
+awaitEvent(), for example).
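+<p>
+A hedged sketch of the helper thread servicing the completion port
+(<tt>ioPort</tt> and <tt>markThreadRunnable</tt> are invented names, not
+existing RTS entities):
+<pre>
+#include <windows.h>
+
+extern HANDLE ioPort;   /* created with CreateIoCompletionPort() at startup */
+
+/* invented hand-off: flag the Haskell thread owning this request as ready */
+extern void markThreadRunnable(void *tso, DWORD bytesTransferred);
+
+DWORD WINAPI ioManagerLoop(LPVOID arg)
+{
+    DWORD      nBytes;
+    ULONG_PTR  key;        /* completion key: could hold a TSO pointer */
+    OVERLAPPED *ov;
+
+    for (;;) {
+        if (GetQueuedCompletionStatus(ioPort, &nBytes, &key, &ov, INFINITE)) {
+            markThreadRunnable((void*)key, nBytes);
+        }
+    }
+}
+</pre>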
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:30:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html
new file mode 100644
index 0000000000..25407eed43
--- /dev/null
+++ b/docs/comm/rts-libs/prelfound.html
@@ -0,0 +1,57 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Prelude Foundations</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Prelude Foundations</h1>
+ <p>
+ The standard Haskell Prelude as well as GHC's Prelude extensions are
+ constructed from GHC's <a href="primitives.html">primitives</a> in a
+ couple of layers.
+
+ <h4><code>PrelBase.lhs</code></h4>
+ <p>
+ Some of the most elementary Prelude definitions are collected in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
+ In particular, it defines the boxed versions of Haskell primitive types
+ - for example, <code>Int</code> is defined as
+ <blockquote><pre>
+data Int = I# Int#</pre>
+ </blockquote>
+ <p>
+ This says that a boxed integer <code>Int</code> is formed by applying the
+ data constructor <code>I#</code> to an <em>unboxed</em> integer of type
+ <code>Int#</code>. Unboxed types are hardcoded in the compiler and
+ exported together with the <a href="primitives.html">primitive
+ operations</a> understood by GHC.
+ <p>
+ <code>PrelBase.lhs</code> similarly defines other basic types, such as
+ boolean values
+ <blockquote><pre>
+data Bool = False | True deriving (Eq, Ord)</pre>
+ </blockquote>
+ <p>
+ the unit type
+ <blockquote><pre>
+data () = ()</pre>
+ </blockquote>
+ <p>
+ and lists
+ <blockquote><pre>
+data [] a = [] | a : [a]</pre>
+ </blockquote>
+ <p>
+ It also contains instance declarations for these types. In addition,
+ <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
+ machinery</a> for efficient list handling.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:30:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html
new file mode 100644
index 0000000000..4ad6c20338
--- /dev/null
+++ b/docs/comm/rts-libs/prelude.html
@@ -0,0 +1,121 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Cunning Prelude Code</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Cunning Prelude Code</h1>
+ <p>
+ GHC uses many optimisations and GHC-specific techniques (unboxed
+ values, RULES pragmas, and so on) to make the heavily used Prelude code
+ as fast as possible.
+
+ <hr>
+ <h4>Par, seq, and lazy</h4>
+
+ In GHC.Conc you will find
+<blockquote><pre>
+ pseq a b = a `seq` lazy b
+</pre></blockquote>
+ What's this "lazy" thing? Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting.
+ We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict
+ in b, then b might be evaluated <em>before</em> a, which is all wrong.
+<p>
+Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>. This function is just the identity function,
+except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the
+inlining and strictness information that comes in from GHC.Base.hi. And that makes <tt>lazy</tt> look
+lazy, and have no inlining. So the strictness analyser gets no traction.
+<p>
+In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs),
+so we get all the efficiency back.
+<p>
+This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an
+Int#. Now there is no seq# operator at all.
+
+
+ <hr>
+ <h4>fold/build</h4>
+ <p>
+ There is a lot of magic in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
+ among other things, the <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
+ pragmas</a> implementing the <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
+ optimisation. The code for <code>map</code> is
+ a good example for how it all works. In the prelude code for version
+ 5.03 it reads as follows:
+ <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map _ [] = []
+map f (x:xs) = f x : map f xs
+
+-- Note eta expanded
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+{-# INLINE [0] mapFB #-}
+mapFB c f x ys = c (f x) ys
+
+{-# RULES
+"map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
+"mapList" [1] forall f. foldr (mapFB (:) f) [] = map f
+"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
+ #-}</pre>
+ </blockquote>
+ <p>
+ Up to (but not including) phase 1, we use the <code>"map"</code> rule to
+ rewrite all saturated applications of <code>map</code> with its
+ build/fold form, hoping for fusion to happen. In phase 1 and 0, we
+ switch off that rule, inline build, and switch on the
+ <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back
+ into plain map.
+ <p>
+ It's important that these two rules aren't both active at once
+ (along with build's unfolding) else we'd get an infinite loop
+ in the rules. Hence the activation control using explicit phase numbers.
+ <p>
+ The "mapFB" rule optimises compositions of map.
+ <p>
+ The mechanism as described above is new in 5.03 since January 2002,
+ where the <code>[~</code><i>N</i><code>]</code> syntax for phase number
+ annotations at rules was introduced. Before that the whole arrangement
+ was more complicated, as the corresponding prelude code for version
+ 4.08.1 shows:
+ <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map = mapList
+
+-- Note eta expanded
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+mapList :: (a -> b) -> [a] -> [b]
+mapList _ [] = []
+mapList f (x:xs) = f x : mapList f xs
+
+{-# RULES
+"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
+"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
+"mapList" forall f. foldr (mapFB (:) f) [] = mapList f
+ #-}</pre>
+ </blockquote>
+ <p>
+ This code is structured as it is, because the "map" rule first
+ <em>breaks</em> the map <em>open,</em> which exposes it to the various
+ foldr/build rules, and if no foldr/build rule matches, the "mapList"
+ rule <em>closes</em> it again in a later phase of optimisation - after
+ build was inlined. As a consequence, the whole thing depends a bit on
+ the timing of the various optimisations (the map might be closed again
+ before any of the foldr/build rules fires). To make the timing
+ deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
+ #-}</code> pragma, which delays <code>build</code>'s inlining, and thus,
+ the closing of the map. [NB: Phase numbering was forward at that time.]
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Mon Feb 11 20:00:49 EST 2002
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html
new file mode 100644
index 0000000000..28abc79426
--- /dev/null
+++ b/docs/comm/rts-libs/primitives.html
@@ -0,0 +1,70 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Primitives</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Primitives</h1>
+ <p>
+ Most user-level Haskell types and functions provided by GHC (in
+ particular those from the Prelude and GHC's Prelude extensions) are
+ internally constructed from even more elementary types and functions.
+ Most notably, GHC understands a notion of <em>unboxed types,</em> which
+ are the Haskell representation of primitive bit-level integer, float,
+ etc. types (as opposed to their boxed, heap allocated counterparts) -
+ cf. <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
+ Values as First Class Citizens."</a>
+
+ <h4>The Ultimate Source of Primitives</h4>
+ <p>
+ The hardwired types of GHC are brought into scope by the module
+ <code>PrelGHC</code>. This module only exists in the form of a
+ handwritten interface file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
+ which lists the type and function names, as well as instance
+ declarations. The actually types of these names as well as their
+ implementation is hardwired into GHC. Note that the names in this file
+ are z-encoded, and in particular, identifiers ending on <code>zh</code>
+ denote user-level identifiers ending in a hash mark (<code>#</code>),
+ which is used to flag unboxed values or functions operating on unboxed
+ values. For example, we have <code>Char#</code>, <code>ord#</code>, and
+ so on.
+
+ <h4>The New Primitive Definition Scheme</h4>
+ <p>
+ As of (about) the development version 4.11, the types and various
+ properties of primitive operations are defined in the file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>.
+ (Personally, I don't think that the <code>.txt</code> suffix is really
+ appropriate, as the file is used for automatic code generation; the
+ recent addition of <code>.pp</code> means that the file is now mangled
+ by cpp.)
+ <p>
+ The utility <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
+ generates a series of Haskell files from <code>primops.txt</code>, which
+ encode the types and various properties of the primitive operations as
+ compiler internal data structures. These Haskell files are not complete
+ modules, but program fragments, which are included into compiler modules
+ during the GHC build process. The generated include files can be found
+ in the directory <code>fptools/ghc/compiler/</code> and carry names
+ matching the pattern <code>primop-*.hs-incl</code>. They are generated
+ during the execution of the <code>boot</code> target in the
+ <code>fptools/ghc/</code> directory. This scheme significantly
+ simplifies the maintenance of primitive operations.
+ <p>
+ As of development version 5.02, the <code>primops.txt</code> file also allows the
+ recording of documentation about intended semantics of the primitives. This can
+ be extracted into a latex document (or rather, into latex document fragments)
+ via an appropriate switch to <code>genprimopcode</code>. In particular, see <code>primops.txt</code>
+ for full details of how GHC is configured to cope with different machine word sizes.
+ <p><small>
+<!-- hhmts start -->
+Last modified: Mon Nov 26 18:03:16 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html
new file mode 100644
index 0000000000..196ec9150d
--- /dev/null
+++ b/docs/comm/rts-libs/stgc.html
@@ -0,0 +1,45 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Spineless Tagless C</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Spineless Tagless C</h1>
+ <p>
+ The C code generated by GHC doesn't use higher-level features of C, so as
+ to be able to control as precisely as possible what code is generated.
+ Moreover, it uses special features of gcc (such as first-class labels)
+ to produce more efficient code.
+ <p>
+ STG C makes ample use of C's macro language to define idioms, which also
+ reduces the size of the generated C code (thus, reducing I/O times).
+ These macros are defined in the C headers located in GHC's <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
+ directory.
+
+ <h4><code>TailCalls.h</code></h4>
+ <p>
+ <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
+ defines how tail calls are implemented - and in particular - optimised
+ in GHC generated code. The default case, for an architecture for which
+ GHC is not optimised, is to use the mini interpreter described in the <a
+ href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
+ <p>
+ For supported architectures, various tricks are used to generate
+ assembler implementing proper tail calls. On i386, gcc's first class
+ labels are used to directly jump to a function pointer. Furthermore,
+ markers of the form <code>--- BEGIN ---</code> and <code>--- END
+ ---</code> are added to the assembly right after the function prologue
+ and before the epilogue. These markers are used by <a
+ href="../the-beast/mangler.html">the Evil Mangler.</a>
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:28:29 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html
new file mode 100644
index 0000000000..499aeec767
--- /dev/null
+++ b/docs/comm/rts-libs/threaded-rts.html
@@ -0,0 +1,126 @@
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title>
+ </head>
+
+ <body>
+ <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1>
+
+ <p>This section of the commentary explains the structure of the runtime system
+ when used in threaded or SMP mode.</p>
+
+ <p>The <em>threaded</em> version of the runtime supports
+ bound threads and non-blocking foreign calls, and an overview of its
+ design can be found in the paper <a
+ href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending
+ the Haskell Foreign Function Interface with Concurrency</a>. To
+ compile the runtime with threaded support, add the line
+
+<pre>GhcRTSWays += thr</pre>
+
+ to <tt>mk/build.mk</tt>. When building C code in the runtime for the threaded way,
+ the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the
+ build system when building for way <tt>thr</tt>, see
+ <tt>mk/config.mk</tt>). To build a Haskell program
+ with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this
+ can be used in conjunction with <tt>-prof</tt>, and possibly
+ <tt>-debug</tt> and others depending on which versions of the RTS have
+ been built).</p>
+
+ <p>The <em>SMP</em> version of the runtime supports the same facilities as the
+ threaded version, and in addition supports execution of Haskell code by
+ multiple simultaneous OS threads. For SMP support, both the runtime and
+ the libraries must be built in a special way: add the lines
+
+ <pre>
+GhcRTSWays += thr
+GhcLibWays += s</pre>
+
+ to <tt>mk/build.mk</tt>. To build Haskell code for
+ SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in
+ conjunction with <tt>-debug</tt>, but no other way-flags at this time).
+ When building C code in the runtime for SMP
+ support, the symbol <tt>SMP</tt> is defined (this is arranged by the
+ compiler when the <tt>-smp</tt> flag is given, see
+ <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p>
+
+ <p>When building the runtime in either the threaded or SMP ways, the symbol
+ <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p>
+
+ <h2>Overall design</h2>
+
+ <p>The system is based around the notion of a <tt>Capability</tt>. A
+ <tt>Capability</tt> is an object that represents both the permission to
+ execute some Haskell code, and the state required to do so. In order
+ to execute some Haskell code, a thread must therefore hold a
+ <tt>Capability</tt>. The available pool of capabilities is managed by
+ the <tt>Capability</tt> API, described below.</p>
+
+ <p>In the threaded runtime, there is only a single <tt>Capability</tt> in the
+ system, indicating that only a single thread can be executing Haskell
+ code at any one time. In the SMP runtime, there can be an arbitrary
+ number of capabilities selectable at runtime with the <tt>+RTS -N<em>n</em></tt>
+ flag; in practice the number is best chosen to be the same as the number of
+ processors on the host machine.</p>
+
+ <p>There are a number of OS threads running code in the runtime. We call
+ these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>.
+ Tasks are managed by the <tt>Task</tt> subsystem, which is mainly
+ concerned with keeping track of statistics such as how much time each
+ task spends executing Haskell code, and also keeping track of how many
+ tasks are around when we want to shut down the runtime.</p>
+
+ <p>Some tasks are created by the runtime itself, and some may be here
+ as a result of a call to Haskell from foreign code (we
+ call this an in-call). The
+ runtime can support any number of concurrent foreign in-calls, but the
+ number of these calls that will actually run Haskell code in parallel is
+ determined by the number of available capabilities. Each in-call creates
+ a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited
+ above).</p>
+
+ <p>In the future we may want to bind a <tt>Capability</tt> to a particular
+ processor, so that we can support a notion of affinity - avoiding
+ accidental migration of work from one CPU to another, so that we can make
+ best use of a CPU's local cache. For now, the design ignores this
+ issue.</p>
+
+ <h2>The <tt>OSThreads</tt> interface</h2>
+
+ <p>This interface is merely an abstraction layer over the OS-specific APIs
+ for managing threads. It has two main implementations: Win32 and
+ POSIX.</p>
+
+ <p>This is the entirety of the interface:</p>
+
+<pre>
+/* Various abstract types */
+typedef Mutex;
+typedef Condition;
+typedef OSThreadId;
+
+extern OSThreadId osThreadId ( void );
+extern void shutdownThread ( void );
+extern void yieldThread ( void );
+extern int createOSThread ( OSThreadId* tid,
+ void (*startProc)(void) );
+
+extern void initCondition ( Condition* pCond );
+extern void closeCondition ( Condition* pCond );
+extern rtsBool broadcastCondition ( Condition* pCond );
+extern rtsBool signalCondition ( Condition* pCond );
+extern rtsBool waitCondition ( Condition* pCond,
+ Mutex* pMut );
+
+extern void initMutex ( Mutex* pMut );
+ </pre>
+
+ <h2>The Task interface</h2>
+
+ <h2>The Capability interface</h2>
+
+ <h2>Multiprocessor Haskell Execution</h2>
+
+ </body>
+</html>