Diffstat (limited to 'docs/comm/rts-libs')
-rw-r--r-- | docs/comm/rts-libs/coding-style.html | 516
-rw-r--r-- | docs/comm/rts-libs/foreignptr.html | 68
-rw-r--r-- | docs/comm/rts-libs/multi-thread.html | 445
-rw-r--r-- | docs/comm/rts-libs/non-blocking.html | 133
-rw-r--r-- | docs/comm/rts-libs/prelfound.html | 57
-rw-r--r-- | docs/comm/rts-libs/prelude.html | 121
-rw-r--r-- | docs/comm/rts-libs/primitives.html | 70
-rw-r--r-- | docs/comm/rts-libs/stgc.html | 45
-rw-r--r-- | docs/comm/rts-libs/threaded-rts.html | 126
9 files changed, 1581 insertions, 0 deletions
diff --git a/docs/comm/rts-libs/coding-style.html b/docs/comm/rts-libs/coding-style.html new file mode 100644 index 0000000000..58f5b4f9bb --- /dev/null +++ b/docs/comm/rts-libs/coding-style.html @@ -0,0 +1,516 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Style Guidelines for RTS C code</title> + </head> + +<body> +<H1>The GHC Commentary - Style Guidelines for RTS C code</h1> + +<h2>Comments</h2> + +<p>These coding style guidelines are mainly intended for use in +<tt>ghc/rts</tt> and <tt>ghc/includes</tt>. + +<p>NB These are just suggestions. They're not set in stone. Some of +them are probably misguided. If you disagree with them, feel free to +modify this document (and make your commit message reasonably +informative) or mail someone (eg. <a +href="glasgow-haskell-users@haskell.org">The GHC mailing list</a>) + +<h2>References</h2> + +If you haven't read them already, you might like to check the following. +Where they conflict with our suggestions, they're probably right. + +<ul> + +<li> +The C99 standard. One reasonable reference is <a +href="http://home.tiscalinet.ch/t_wolf/tw/c/c9x_changes.html">here</a>. + +<p><li> +Writing Solid Code, Microsoft Press. (Highly recommended. Possibly +the only Microsoft Press book that's worth reading.) + +<p><li> +Autoconf documentation. +See also <a href="http://peti.gmd.de/autoconf-archive/">The autoconf macro archive</a> and +<a href="http://www.cyclic.com/cyclic-pages/autoconf.html">Cyclic Software's description</a> + +<p><li> <a +href="http://www.cs.umd.edu/users/cml/cstyle/indhill-cstyle.html">Indian +Hill C Style and Coding Standards</a>. + +<p><li> +<a href="http://www.cs.umd.edu/users/cml/cstyle/">A list of C programming style links</a> + +<p><li> +<a href="http://www.lysator.liu.se/c/c-www.html">A very large list of C programming links</a> + +<p><li> +<a href="http://www.geek-girl.com/unix.html">A list of Unix programming links</a> + +</ul> + + +<h2>Portability issues</h2> + +<ul> +<p><li> We try to stick to C99 where possible. We use the following +C99 features relative to C89, some of which were previously GCC +extensions (possibly with different syntax): + +<ul> +<p><li>Variable length arrays as the last field of a struct. GCC has +a similar extension, but the syntax is slightly different: in GCC you +would declare the array as <tt>arr[0]</tt>, whereas in C99 it is +declared as <tt>arr[]</tt>. + +<p><li>Inline annotations on functions (see later) + +<p><li>Labeled elements in initialisers. Again, GCC has a slightly +different syntax from C99 here, and we stick with the GCC syntax until +GCC implements the C99 proposal. + +<p><li>C++-style comments. These are part of the C99 standard, and we +prefer to use them whenever possible. +</ul> + +<p>In addition we use ANSI-C-style function declarations and +prototypes exclusively. Every function should have a prototype; +static function prototypes may be placed near the top of the file in +which they are declared, and external prototypes are usually placed in +a header file with the same basename as the source file (although there +are exceptions to this rule, particularly when several source files +together implement a subsystem which is described by a single external +header file). 
+ +<p><li>We use the following GCC extensions, but surround them with +<tt>#ifdef __GNUC__</tt>: + +<ul> +<p><li>Function attributes (mostly just <code>no_return</code> and +<code>unused</code>) +<p><li>Inline assembly. +</ul> + +<p><li> +char can be signed or unsigned - always say which you mean + +<p><li>Our POSIX policy: try to write code that only uses POSIX (IEEE +Std 1003.1) interfaces and APIs. We used to define +<code>POSIX_SOURCE</code> by default, but found that this caused more +problems than it solved, so now we require any code that is +POSIX-compliant to explicitly say so by having <code>#include +"PosixSource.h"</code> at the top. Try to do this whenever possible. + +<p><li> Some architectures have memory alignment constraints. Others +don't have any constraints but go faster if you align things. These +macros (from <tt>ghcconfig.h</tt>) tell you which alignment to use + +<pre> + /* minimum alignment of unsigned int */ + #define ALIGNMENT_UNSIGNED_INT 4 + + /* minimum alignment of long */ + #define ALIGNMENT_LONG 4 + + /* minimum alignment of float */ + #define ALIGNMENT_FLOAT 4 + + /* minimum alignment of double */ + #define ALIGNMENT_DOUBLE 4 +</pre> + +<p><li> Use <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> when +reading/writing ints and ptrs to the stack or heap. Note that, by +definition, <tt>StgInt</tt>, <tt>StgWord</tt> and <tt>StgPtr</tt> are +the same size and have the same alignment constraints even if +<code>sizeof(int) != sizeof(ptr)</code> on that platform. + +<p><li> Use <tt>StgInt8</tt>, <tt>StgInt16</tt>, etc when you need a +certain minimum number of bits in a type. Use <tt>int</tt> and +<tt>nat</tt> when there's no particular constraint. ANSI C only +guarantees that ints are at least 16 bits but within GHC we assume +they are 32 bits. + +<p><li> Use <tt>StgFloat</tt> and <tt>StgDouble</tt> for floating +point values which will go on/have come from the stack or heap. Note +that <tt>StgDouble</tt> may occupy more than one <tt>StgWord</tt>, but +it will always be a whole number multiple. + +<p> +Use <code>PK_FLT(addr)</code>, <code>PK_DBL(addr)</code> to read +<tt>StgFloat</tt> and <tt>StgDouble</tt> values from the stack/heap, +and <code>ASSIGN_FLT(val,addr)</code> / +<code>ASSIGN_DBL(val,addr)</code> to assign StgFloat/StgDouble values +to heap/stack locations. These macros take care of alignment +restrictions. + +<p> +Heap/Stack locations are always <tt>StgWord</tt> aligned; the +alignment requirements of an <tt>StgDouble</tt> may be more than that +of <tt>StgWord</tt>, but we don't pad misaligned <tt>StgDoubles</tt> +because doing so would be too much hassle (see <code>PK_DBL</code> & +co above). + +<p><li> +Avoid conditional code like this: + +<pre> + #ifdef solaris_host_OS + // do something solaris specific + #endif +</pre> + +Instead, add an appropriate test to the configure.ac script and use +the result of that test instead. + +<pre> + #ifdef HAVE_BSD_H + // use a BSD library + #endif +</pre> + +<p>The problem is that things change from one version of an OS to another +- things get added, things get deleted, things get broken, some things +are optional extras. Using "feature tests" instead of "system tests" +makes things a lot less brittle. Things also tend to get documented +better. + +</ul> + +<h2>Debugging/robustness tricks</h2> + + +Anyone who has tried to debug a garbage collector or code generator +will tell you: "If a program is going to crash, it should crash as +soon, as noisily and as often as possible." 
There's nothing worse +than trying to find a bug which only shows up when running GHC on +itself and doesn't manifest itself until 10 seconds after the actual +cause of the problem. + +<p>We put all our debugging code inside <tt>#ifdef DEBUG</tt>. The +general policy is we don't ship code with debugging checks and +assertions in it, but we do run with those checks in place when +developing and testing. Anything inside <tt>#ifdef DEBUG</tt> should +not slow down the code by more than a factor of 2. + +<p>We also have more expensive "sanity checking" code for hardcore +debugging - this can slow down the code by a large factor, but is only +enabled on demand by a command-line flag. General sanity checking in +the RTS is currently enabled with the <tt>-DS</tt> RTS flag. + +<p>There are a number of RTS flags which control debugging output and +sanity checking in various parts of the system when <tt>DEBUG</tt> is +defined. For example, to get the scheduler to be verbose about what +it is doing, you would say <tt>+RTS -Ds -RTS</tt>. See +<tt>includes/RtsFlags.h</tt> and <tt>rts/RtsFlags.c</tt> for the full +set of debugging flags. To check one of these flags in the code, +write: + +<pre> + IF_DEBUG(gc, fprintf(stderr, "...")); +</pre> + +would check the <tt>gc</tt> flag before generating the output (and the +code is removed altogether if <tt>DEBUG</tt> is not defined). + +<p>All debugging output should go to <tt>stderr</tt>. + +<p> +Particular guidelines for writing robust code: + +<ul> +<p><li> +Use assertions. Use lots of assertions. If you write a comment +that says "takes a +ve number" add an assertion. If you're casting +an int to a nat, add an assertion. If you're casting an int to a char, +add an assertion. We use the <tt>ASSERT</tt> macro for writing +assertions; it goes away when <tt>DEBUG</tt> is not defined. + +<p><li> +Write special debugging code to check the integrity of your data structures. +(Most of the runtime checking code is in <tt>rts/Sanity.c</tt>) +Add extra assertions which call this code at the start and end of any +code that operates on your data structures. + +<p><li> +When you find a hard-to-spot bug, try to think of some assertions, +sanity checks or whatever that would have made the bug easier to find. + +<p><li> +When defining an enumeration, it's a good idea not to use 0 for normal +values. Instead, make 0 raise an internal error. The idea here is to +make it easier to detect pointer-related errors on the assumption that +random pointers are more likely to point to a 0 than to anything else. + +<pre> +typedef enum + { i_INTERNAL_ERROR /* Instruction 0 raises an internal error */ + , i_PANIC /* irrefutable pattern match failed! */ + , i_ERROR /* user level error */ + + ... +</pre> + +<p><li> Use <tt>#warning</tt> or <tt>#error</tt> whenever you write a +piece of incomplete/broken code. + +<p><li> When testing, try to make infrequent things happen often. + For example, make a context switch/gc/etc happen every time a + context switch/gc/etc can happen. The system will run like a + pig but it'll catch a lot of bugs. + +</ul> + +<h2>Syntactic details</h2> + +<ul> +<p><li><b>Important:</b> Put "redundant" braces or parens in your code. +Omitting braces and parens leads to very hard to spot bugs - +especially if you use macros (and you might have noticed that GHC does +this a lot!) + +<p> +In particular: +<ul> +<p><li> +Put braces round the body of for loops, while loops, if statements, etc. 
+even if they "aren't needed" because it's really hard to find the resulting +bug if you mess up. Indent them any way you like but put them in there! +</ul> + +<p><li> +When defining a macro, always put parens round args - just in case. +For example, write: +<pre> + #define add(x,y) ((x)+(y)) +</pre> +instead of +<pre> + #define add(x,y) x+y +</pre> + +<p><li> Don't declare and initialize variables at the same time. +Separating the declaration and initialization takes more lines, but +make the code clearer. + +<p><li> +Use inline functions instead of macros if possible - they're a lot +less tricky to get right and don't suffer from the usual problems +of side effects, evaluation order, multiple evaluation, etc. + +<ul> +<p><li>Inline functions get the naming issue right. E.g. they + can have local variables which (in an expression context) + macros can't. + +<p><li> Inline functions have call-by-value semantics whereas macros + are call-by-name. You can be bitten by duplicated computation + if you aren't careful. + +<p><li> You can use inline functions from inside gdb if you compile with + -O0 or -fkeep-inline-functions. If you use macros, you'd better + know what they expand to. +</ul> + +However, note that macros can serve as both l-values and r-values and +can be "polymorphic" as these examples show: +<pre> + // you can use this as an l-value or an l-value + #define PROF_INFO(cl) (((StgClosure*)(cl))->header.profInfo) + + // polymorphic case + // but note that min(min(1,2),3) does 3 comparisions instead of 2!! + #define min(x,y) (((x)<=(y)) ? (x) : (y)) +</pre> + +<p><li> +Inline functions should be "static inline" because: +<ul> +<p><li> +gcc will delete static inlines if not used or theyre always inlined. + +<p><li> + if they're externed, we could get conflicts between 2 copies of the + same function if, for some reason, gcc is unable to delete them. + If they're static, we still get multiple copies but at least they don't conflict. +</ul> + +OTOH, the gcc manual says this +so maybe we should use extern inline? + +<pre> + When a function is both inline and `static', if all calls to the +function are integrated into the caller, and the function's address is +never used, then the function's own assembler code is never referenced. +In this case, GNU CC does not actually output assembler code for the +function, unless you specify the option `-fkeep-inline-functions'. +Some calls cannot be integrated for various reasons (in particular, +calls that precede the function's definition cannot be integrated, and +neither can recursive calls within the definition). If there is a +nonintegrated call, then the function is compiled to assembler code as +usual. The function must also be compiled as usual if the program +refers to its address, because that can't be inlined. + + When an inline function is not `static', then the compiler must +assume that there may be calls from other source files; since a global +symbol can be defined only once in any program, the function must not +be defined in the other source files, so the calls therein cannot be +integrated. Therefore, a non-`static' inline function is always +compiled on its own in the usual fashion. + + If you specify both `inline' and `extern' in the function +definition, then the definition is used only for inlining. In no case +is the function compiled on its own, not even if you refer to its +address explicitly. Such an address becomes an external reference, as +if you had only declared the function, and had not defined it. 
+ + This combination of `inline' and `extern' has almost the effect of a +macro. The way to use it is to put a function definition in a header +file with these keywords, and put another copy of the definition +(lacking `inline' and `extern') in a library file. The definition in +the header file will cause most calls to the function to be inlined. +If any uses of the function remain, they will refer to the single copy +in the library. +</pre> + +<p><li> +Don't define macros that expand to a list of statements. +You could just use braces as in: + +<pre> + #define ASSIGN_CC_ID(ccID) \ + { \ + ccID = CC_ID; \ + CC_ID++; \ + } +</pre> + +(but it's usually better to use an inline function instead - see above). + +<p><li> +Don't even write macros that expand to 0 statements - they can mess you +up as well. Use the doNothing macro instead. +<pre> + #define doNothing() do { } while (0) +</pre> + +<p><li> +This code +<pre> +int* p, q; +</pre> +looks like it declares two pointers but, in fact, only p is a pointer. +It's safer to write this: +<pre> +int* p; +int* q; +</pre> +You could also write this: +<pre> +int *p, *q; +</pre> +but it is preferrable to split the declarations. + +<p><li> +Try to use ANSI C's enum feature when defining lists of constants of +the same type. Among other benefits, you'll notice that gdb uses the +name instead of its (usually inscrutable) number when printing values +with enum types and gdb will let you use the name in expressions you +type. + +<p> +Examples: +<pre> + typedef enum { /* N.B. Used as indexes into arrays */ + NO_HEAP_PROFILING, + HEAP_BY_CC, + HEAP_BY_MOD, + HEAP_BY_GRP, + HEAP_BY_DESCR, + HEAP_BY_TYPE, + HEAP_BY_TIME + } ProfilingFlags; +</pre> +instead of +<pre> + # define NO_HEAP_PROFILING 0 /* N.B. Used as indexes into arrays */ + # define HEAP_BY_CC 1 + # define HEAP_BY_MOD 2 + # define HEAP_BY_GRP 3 + # define HEAP_BY_DESCR 4 + # define HEAP_BY_TYPE 5 + # define HEAP_BY_TIME 6 +</pre> +and +<pre> + typedef enum { + CCchar = 'C', + MODchar = 'M', + GRPchar = 'G', + DESCRchar = 'D', + TYPEchar = 'Y', + TIMEchar = 'T' + } ProfilingTag; +</pre> +instead of +<pre> + # define CCchar 'C' + # define MODchar 'M' + # define GRPchar 'G' + # define DESCRchar 'D' + # define TYPEchar 'Y' + # define TIMEchar 'T' +</pre> + +<p><li> Please keep to 80 columns: the line has to be drawn somewhere, +and by keeping it to 80 columns we can ensure that code looks OK on +everyone's screen. Long lines are hard to read, and a sign that the +code needs to be restructured anyway. + +<p><li> When commenting out large chunks of code, use <code>#ifdef 0 +... #endif</code> rather than <code>/* ... */</code> because C doesn't +have nested comments. + +<p><li>When declaring a typedef for a struct, give the struct a name +as well, so that other headers can forward-reference the struct name +and it becomes possible to have opaque pointers to the struct. Our +convention is to name the struct the same as the typedef, but add a +leading underscore. For example: + +<pre> + typedef struct _Foo { + ... + } Foo; +</pre> + +<p><li>Do not use <tt>!</tt> instead of explicit comparison against +<tt>NULL</tt> or <tt>'\0'</tt>; the latter is much clearer. + +<p><li> We don't care too much about your indentation style but, if +you're modifying a function, please try to use the same style as the +rest of the function (or file). If you're writing new code, a +tab width of 4 is preferred. 
+ +</ul> + +<h2>CVS issues</h2> + +<ul> +<p><li> +Don't be tempted to reindent or reorganise large chunks of code - it +generates large diffs in which it's hard to see whether anything else +was changed. +<p> +If you must reindent or reorganise, don't include any functional +changes that commit and give advance warning that you're about to do +it in case anyone else is changing that file. +</ul> + + +</body> +</html> diff --git a/docs/comm/rts-libs/foreignptr.html b/docs/comm/rts-libs/foreignptr.html new file mode 100644 index 0000000000..febe9fe422 --- /dev/null +++ b/docs/comm/rts-libs/foreignptr.html @@ -0,0 +1,68 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - why we have <tt>ForeignPtr</tt></title> + </head> + + <body BGCOLOR="FFFFFF"> + + <h1>On why we have <tt>ForeignPtr</tt></h1> + + <p>Unfortunately it isn't possible to add a finalizer to a normal + <tt>Ptr a</tt>. We already have a generic finalization mechanism: + see the Weak module in package lang. But the only reliable way to + use finalizers is to attach one to an atomic heap object - that + way the compiler's optimiser can't interfere with the lifetime of + the object. + + <p>The <tt>Ptr</tt> type is really just a boxed address - it's + defined like + + <pre> +data Ptr a = Ptr Addr# +</pre> + + <p>where <tt>Addr#</tt> is an unboxed native address (just a 32- + or 64- bit word). Putting a finalizer on a <tt>Ptr</tt> is + dangerous, because the compiler's optimiser might remove the box + altogether. + + <p><tt>ForeignPtr</tt> is defined like this + + <pre> +data ForeignPtr a = ForeignPtr ForeignObj# +</pre> + + <p>where <tt>ForeignObj#</tt> is a *boxed* address, it corresponds + to a real heap object. The heap object is primitive from the + point of view of the compiler - it can't be optimised away. So it + works to attach a finalizer to the <tt>ForeignObj#</tt> (but not + to the <tt>ForeignPtr</tt>!). + + <p>There are several primitive objects to which we can attach + finalizers: <tt>MVar#</tt>, <tt>MutVar#</tt>, <tt>ByteArray#</tt>, + etc. We have special functions for some of these: eg. + <tt>MVar.addMVarFinalizer</tt>. + + <p>So a nicer interface might be something like + +<pre> +class Finalizable a where + addFinalizer :: a -> IO () -> IO () + +instance Finalizable (ForeignPtr a) where ... +instance Finalizable (MVar a) where ... +</pre> + + <p>So you might ask why we don't just get rid of <tt>Ptr</tt> and + rename <tt>ForeignPtr</tt> to <tt>Ptr</tt>. The reason for that + is just efficiency, I think. 
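<p>For concreteness, here is a small sketch (not part of the library itself)
of the usual usage pattern: allocate some raw memory and wrap it in a
<tt>ForeignPtr</tt> whose finalizer frees it. The module names used are the
hierarchical-library ones (<tt>Foreign.Concurrent</tt>,
<tt>Foreign.Marshal.Alloc</tt>); older source trees spell them differently,
so treat the exact imports as an assumption.

<pre>
import Foreign.ForeignPtr (ForeignPtr)
import qualified Foreign.Concurrent as FC (newForeignPtr)
import Foreign.Marshal.Alloc (mallocBytes, free)

-- Allocate a raw buffer and arrange for it to be freed automatically
-- once the garbage collector finds the ForeignPtr to be dead.
newBuffer :: Int -> IO (ForeignPtr a)
newBuffer n = do
  p <- mallocBytes n
  FC.newForeignPtr p (free p >> putStrLn "buffer finalized")
</pre>

<p>Because the finalizer hangs off the underlying heap object, the optimiser
cannot discard it, which is exactly what a bare <tt>Ptr</tt> could not
guarantee.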
+ + <p><small> +<!-- hhmts start --> +Last modified: Wed Sep 26 09:49:37 BST 2001 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/multi-thread.html b/docs/comm/rts-libs/multi-thread.html new file mode 100644 index 0000000000..67a544be85 --- /dev/null +++ b/docs/comm/rts-libs/multi-thread.html @@ -0,0 +1,445 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> +<head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> +<title>The GHC Commentary - Supporting multi-threaded interoperation</title> +</head> +<body> +<h1>The GHC Commentary - Supporting multi-threaded interoperation</h1> +<em> +<p> +Authors: sof@galois.com, simonmar@microsoft.com<br> +Date: April 2002 +</p> +</em> +<p> +This document presents the implementation of an extension to +Concurrent Haskell that provides two enhancements: +</p> +<ul> +<li>A Concurrent Haskell thread may call an external (e.g., C) +function in a manner that's transparent to the execution/evaluation of +other Haskell threads. Section <a href="#callout">Calling out"</a> covers this. +</li> +<li> +OS threads may safely call Haskell functions concurrently. Section +<a href="#callin">"Calling in"</a> covers this. +</li> +</ul> + +<!---- *************************************** -----> +<h2 id="callout">The problem: foreign calls that block</h2> +<p> +When a Concurrent Haskell(CH) thread calls a 'foreign import'ed +function, the runtime system(RTS) has to handle this in a manner +transparent to other CH threads. That is, they shouldn't be blocked +from making progress while the CH thread executes the external +call. Presently, all threads will block. +</p> +<p> +Clearly, we have to rely on OS-level threads in order to support this +kind of concurrency. The implementation described here defines the +(abstract) OS threads interface that the RTS assumes. The implementation +currently provides two instances of this interface, one for POSIX +threads (pthreads) and one for the Win32 threads. +</p> + +<!---- *************************************** -----> +<h3>Multi-threading the RTS</h3> + +<p> +A simple and efficient way to implement non-blocking foreign calls is like this: +<ul> +<li> Invariant: only one OS thread is allowed to +execute code inside of the GHC runtime system. [There are alternate +designs, but I won't go into details on their pros and cons here.] +We'll call the OS thread that is currently running Haskell threads +the <em>Current Haskell Worker Thread</em>. +<p> +The Current Haskell Worker Thread repeatedly grabs a Haskell thread, executes it until its +time-slice expires or it blocks on an MVar, then grabs another, and executes +that, and so on. +</p> +<li> +<p> +When the Current Haskell Worker comes to execute a potentially blocking 'foreign +import', it leaves the RTS and ceases being the Current Haskell Worker, but before doing so it makes certain that +another OS worker thread is available to become the Current Haskell Worker. +Consequently, even if the external call blocks, the new Current Haskell Worker +continues execution of the other Concurrent Haskell threads. +When the external call eventually completes, the Concurrent Haskell +thread that made the call is passed the result and made runnable +again. +</p> +<p> +<li> +A pool of OS threads are constantly trying to become the Current Haskell Worker. +Only one succeeds at any moment. If the pool becomes empty, the RTS creates more workers. +<p><li> +The OS worker threads are regarded as interchangeable. 
A given Haskell thread +may, during its lifetime, be executed entirely by one OS worker thread, or by more than one. +There's just no way to tell. + +<p><li>If a foreign program wants to call a Haskell function, there is always a thread switch involved. +The foreign program uses thread-safe mechanisms to create a Haskell thread and make it runnable; and +the current Haskell Worker Thread exectutes it. See Section <a href="#callin">Calling in</a>. +</ul> +<p> +The rest of this section describes the mechanics of implementing all +this. There's two parts to it, one that describes how a native (OS) thread +leaves the RTS to service the external call, the other how the same +thread handles returning the result of the external call back to the +Haskell thread. +</p> + +<!---- *************************************** -----> +<h3>Making the external call</h3> + +<p> +Presently, GHC handles 'safe' C calls by effectively emitting the +following code sequence: +</p> + +<pre> + ...save thread state... + t = suspendThread(); + r = foo(arg1,...,argn); + resumeThread(t); + ...restore thread state... + return r; +</pre> + +<p> +After having squirreled away the state of a Haskell thread, +<tt>Schedule.c:suspendThread()</tt> is called which puts the current +thread on a list [<tt>Schedule.c:suspended_ccalling_threads</tt>] +containing threads that are currently blocked waiting for external calls +to complete (this is done for the purposes of finding roots when +garbage collecting). +</p> + +<p> +In addition to putting the Haskell thread on +<tt>suspended_ccalling_threads</tt>, <tt>suspendThread()</tt> now also +does the following: +</p> +<ul> +<li>Instructs the <em>Task Manager</em> to make sure that there's a +another native thread waiting in the wings to take over the execution +of Haskell threads. This might entail creating a new +<em>worker thread</em> or re-using one that's currently waiting for +more work to do. The <a href="#taskman">Task Manager</a> section +presents the functionality provided by this subsystem. +</li> + +<li>Releases its capability to execute within the RTS. By doing +so, another worker thread will become unblocked and start executing +code within the RTS. See the <a href="#capability">Capability</a> +section for details. +</li> + +<li><tt>suspendThread()</tt> returns a token which is used to +identify the Haskell thread that was added to +<tt>suspended_ccalling_threads</tt>. This is done so that once the +external call has completed, we know what Haskell thread to pull off +the <tt>suspended_ccalling_threads</tt> list. +</li> +</ul> + +<p> +Upon return from <tt>suspendThread()</tt>, the OS thread is free of +its RTS executing responsibility, and can now invoke the external +call. Meanwhile, the other worker thread that have now gained access +to the RTS will continue executing Concurrent Haskell code. Concurrent +'stuff' is happening! +</p> + +<!---- *************************************** -----> +<h3>Returning the external result</h3> + +<p> +When the native thread eventually returns from the external call, +the result needs to be communicated back to the Haskell thread that +issued the external call. The following steps takes care of this: +</p> + +<ul> +<li>The returning OS thread calls <tt>Schedule.c:resumeThread()</tt>, +passing along the token referring to the Haskell thread that made the +call we're returning from. +</li> + +<li> +The OS thread then tries to grab hold of a <em>returning worker +capability</em>, via <tt>Capability.c:grabReturnCapability()</tt>. 
+Until granted, the thread blocks waiting for RTS permissions. Clearly we +don't want the thread to be blocked longer than it has to, so whenever +a thread that is executing within the RTS enters the Scheduler (which +is quite often, e.g., when a Haskell thread context switch is made), +it checks to see whether it can give up its RTS capability to a +returning worker, which is done by calling +<tt>Capability.c:yieldToReturningWorker()</tt>. +</li> + +<li> +If a returning worker is waiting (the code in <tt>Capability.c</tt> +keeps a counter of the number of returning workers that are currently +blocked waiting), it is woken up and the given the RTS execution +priviledges/capabilities of the worker thread that gave up its. +</li> + +<li> +The thread that gave up its capability then tries to re-acquire +the capability to execute RTS code; this is done by calling +<tt>Capability.c:waitForWorkCapability()</tt>. +</li> + +<li> +The returning worker that was woken up will continue execution in +<tt>resumeThread()</tt>, removing its associated Haskell thread +from the <tt>suspended_ccalling_threads</tt> list and start evaluating +that thread, passing it the result of the external call. +</li> +</ul> + +<!---- *************************************** -----> +<h3 id="rts-exec">RTS execution</h3> + +<p> +If a worker thread inside the RTS runs out of runnable Haskell +threads, it goes to sleep waiting for the external calls to complete. +It does this by calling <tt>waitForWorkCapability</tt> +</p> + +<p> +The availability of new runnable Haskell threads is signalled when: +</p> + +<ul> +<li>When an external call is set up in <tt>suspendThread()</tt>.</li> +<li>When a new Haskell thread is created (e.g., whenever +<tt>Concurrent.forkIO</tt> is called from within Haskell); this is +signalled in <tt>Schedule.c:scheduleThread_()</tt>. +</li> +<li>Whenever a Haskell thread is removed from a 'blocking queue' +attached to an MVar (only?). +</li> +</ul> + +<!---- *************************************** -----> +<h2 id="callin">Calling in</h2> + +Providing robust support for having multiple OS threads calling into +Haskell is not as involved as its dual. + +<ul> +<li>The OS thread issues the call to a Haskell function by going via +the <em>Rts API</em> (as specificed in <tt>RtsAPI.h</tt>). +<li>Making the function application requires the construction of a +closure on the heap. This is done in a thread-safe manner by having +the OS thread lock a designated block of memory (the 'Rts API' block, +which is part of the GC's root set) for the short period of time it +takes to construct the application. +<li>The OS thread then creates a new Haskell thread to execute the +function application, which (eventually) boils down to calling +<tt>Schedule.c:createThread()</tt> +<li> +Evaluation is kicked off by calling <tt>Schedule.c:scheduleExtThread()</tt>, +which asks the Task Manager to possibly create a new worker (OS) +thread to execute the Haskell thread. +<li> +After the OS thread has done this, it blocks waiting for the +Haskell thread to complete the evaluation of the Haskell function. +<p> +The reason why a separate worker thread is made to evaluate the Haskell +function and not the OS thread that made the call-in via the +Rts API, is that we want that OS thread to return as soon as possible. +We wouldn't be able to guarantee that if the OS thread entered the +RTS to (initially) just execute its function application, as the +Scheduler may side-track it and also ask it to evaluate other Haskell threads. 
+</li> +</ul> + +<p> +<strong>Note:</strong> As of 20020413, the implementation of the RTS API +only serializes access to the allocator between multiple OS threads wanting +to call into Haskell (via the RTS API.) It does not coordinate this access +to the allocator with that of the OS worker thread that's currently executing +within the RTS. This weakness/bug is scheduled to be tackled as part of an +overhaul/reworking of the RTS API itself. + + +<!---- *************************************** -----> +<h2>Subsystems introduced/modified</h2> + +<p> +These threads extensions affect the Scheduler portions of the runtime +system. To make it more manageable to work with, the changes +introduced a couple of new RTS 'sub-systems'. This section presents +the functionality and API of these sub-systems. +</p> + +<!---- *************************************** -----> +<h3 id="#capability">Capabilities</h3> + +<p> +A Capability represent the token required to execute STG code, +and all the state an OS thread/task needs to run Haskell code: +its STG registers, a pointer to its TSO, a nursery etc. During +STG execution, a pointer to the capabilitity is kept in a +register (BaseReg). +</p> +<p> +Only in an SMP build will there be multiple capabilities, for +the threaded RTS and other non-threaded builds, there is only +one global capability, namely <tt>MainCapability</tt>. + +<p> +The Capability API is as follows: +<pre> +/* Capability.h */ +extern void initCapabilities(void); + +extern void grabReturnCapability(Mutex* pMutex, Capability** pCap); +extern void waitForWorkCapability(Mutex* pMutex, Capability** pCap, rtsBool runnable); +extern void releaseCapability(Capability* cap); + +extern void yieldToReturningWorker(Mutex* pMutex, Capability* cap); + +extern void grabCapability(Capability** cap); +</pre> + +<ul> +<li><tt>initCapabilities()</tt> initialises the subsystem. + +<li><tt>grabReturnCapability()</tt> is called by worker threads +returning from an external call. It blocks them waiting to gain +permissions to do so. + +<li><tt>waitForWorkCapability()</tt> is called by worker threads +already inside the RTS, but without any work to do. It blocks them +waiting for there to new work to become available. + +<li><tt>releaseCapability()</tt> hands back a capability. If a +'returning worker' is waiting, it is signalled that a capability +has become available. If not, <tt>releaseCapability()</tt> tries +to signal worker threads that are blocked waiting inside +<tt>waitForWorkCapability()</tt> that new work might now be +available. + +<li><tt>yieldToReturningWorker()</tt> is called by the worker thread +that's currently inside the Scheduler. It checks whether there are other +worker threads waiting to return from making an external call. If so, +they're given preference and a capability is transferred between worker +threads. One of the waiting 'returning worker' threads is signalled and made +runnable, with the other, yielding, worker blocking to re-acquire +a capability. +</ul> + +<p> +The condition variables used to implement the synchronisation between +worker consumers and providers are local to the Capability +implementation. See source for details and comments. +</p> + +<!---- *************************************** -----> +<h3 id="taskman">The Task Manager</h3> + +<p> +The Task Manager API is responsible for managing the creation of +OS worker RTS threads. 
When a Haskell thread wants to make an +external call, the Task Manager is asked to possibly create a +new worker thread to take over the RTS-executing capability of +the worker thread that's exiting the RTS to execute the external call. + +<p> +The Capability subsystem keeps track of idle worker threads, so +making an informed decision about whether or not to create a new OS +worker thread is easy work for the task manager. The Task manager +provides the following API: +</p> + +<pre> +/* Task.h */ +extern void startTaskManager ( nat maxTasks, void (*taskStart)(void) ); +extern void stopTaskManager ( void ); + +extern void startTask ( void (*taskStart)(void) ); +</pre> + +<ul> +<li><tt>startTaskManager()</tt> and <tt>stopTaskManager()</tt> starts +up and shuts down the subsystem. When starting up, you have the option +to limit the overall number of worker threads that can be +created. An unbounded (modulo OS thread constraints) number of threads +is created if you pass '0'. +<li><tt>startTask()</tt> is called when a worker thread calls +<tt>suspendThread()</tt> to service an external call, asking another +worker thread to take over its RTS-executing capability. It is also +called when an external OS thread invokes a Haskell function via the +<em>Rts API</em>. +</ul> + +<!---- *************************************** -----> +<h3>Native threads API</h3> + +To hide OS details, the following API is used by the task manager and +the scheduler to interact with an OS' threads API: + +<pre> +/* OSThreads.h */ +typedef <em>..OS specific..</em> Mutex; +extern void initMutex ( Mutex* pMut ); +extern void grabMutex ( Mutex* pMut ); +extern void releaseMutex ( Mutex* pMut ); + +typedef <em>..OS specific..</em> Condition; +extern void initCondition ( Condition* pCond ); +extern void closeCondition ( Condition* pCond ); +extern rtsBool broadcastCondition ( Condition* pCond ); +extern rtsBool signalCondition ( Condition* pCond ); +extern rtsBool waitCondition ( Condition* pCond, + Mutex* pMut ); + +extern OSThreadId osThreadId ( void ); +extern void shutdownThread ( void ); +extern void yieldThread ( void ); +extern int createOSThread ( OSThreadId* tid, + void (*startProc)(void) ); +</pre> + + + +<!---- *************************************** -----> +<h2>User-level interface</h2> + +To signal that you want an external call to be serviced by a separate +OS thread, you have to add the attribute <tt>threadsafe</tt> to +a foreign import declaration, i.e., + +<pre> +foreign import "bigComp" threadsafe largeComputation :: Int -> IO () +</pre> + +<p> +The distinction between 'safe' and thread-safe C calls is made +so that we may call external functions that aren't re-entrant but may +cause a GC to occur. +<p> +The <tt>threadsafe</tt> attribute subsumes <tt>safe</tt>. +</p> + +<!---- *************************************** -----> +<h2>Building the GHC RTS</h2> + +The multi-threaded extension isn't currently enabled by default. To +have it built, you need to run the <tt>fptools</tt> configure script +with the extra option <tt>--enable-threaded-rts</tt> turned on, and +then proceed to build the compiler as per normal. 
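<p>
As a quick illustration of what the <tt>threadsafe</tt> attribute buys you,
here is a sketch (not taken from the real sources) built around the
<tt>largeComputation</tt> import shown in the user-level interface section
above; <tt>bigComp</tt> stands for some blocking C function, and the program
assumes an RTS built with <tt>--enable-threaded-rts</tt>. A ticker thread
keeps printing while the external call is blocked inside foreign code:
</p>

<pre>
import Control.Concurrent (forkIO, threadDelay)

-- the import from the section above; 'bigComp' is some blocking C function
foreign import "bigComp" threadsafe largeComputation :: Int -> IO ()

main :: IO ()
main = do
    -- progress ticker: without the threaded RTS this output would stall
    -- for the whole duration of the blocking foreign call
    _ <- forkIO (mapM_ tick [1 .. 50])
    largeComputation 42   -- 'threadsafe': other Haskell threads keep running
  where
    tick i = putStrLn ("tick " ++ show i) >> threadDelay 100000
</pre>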
+ +<hr> +<small> +<!-- hhmts start --> Last modified: Wed Apr 10 14:21:57 Pacific Daylight Time 2002 <!-- hhmts end --> +</small> +</body> </html> + diff --git a/docs/comm/rts-libs/non-blocking.html b/docs/comm/rts-libs/non-blocking.html new file mode 100644 index 0000000000..627bde8d88 --- /dev/null +++ b/docs/comm/rts-libs/non-blocking.html @@ -0,0 +1,133 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Non-blocking I/O on Win32</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Non-blocking I/O on Win32</h1> + <p> + +This note discusses the implementation of non-blocking I/O on +Win32 platforms. It is not implemented yet (Apr 2002), but it seems worth +capturing the ideas. Thanks to Sigbjorn for writing them. + +<h2> Background</h2> + +GHC has provided non-blocking I/O support for Concurrent Haskell +threads on platforms that provide 'UNIX-style' non-blocking I/O for +quite a while. That is, platforms that let you alter the property of a +file descriptor to instead of having a thread block performing an I/O +operation that cannot be immediately satisfied, the operation returns +back a special error code (EWOULDBLOCK.) When that happens, the CH +thread that made the blocking I/O request is put into a blocked-on-IO +state (see Foreign.C.Error.throwErrnoIfRetryMayBlock). The RTS will +in a timely fashion check to see whether I/O is again possible +(via a call to select()), and if it is, unblock the thread & have it +re-try the I/O operation. The result is that other Concurrent Haskell +threads won't be affected, but can continue operating while a thread +is blocked on I/O. +<p> +Non-blocking I/O hasn't been supported by GHC on Win32 platforms, for +the simple reason that it doesn't provide the OS facilities described +above. + +<h2>Win32 non-blocking I/O, attempt 1</h2> + +Win32 does provide something select()-like, namely the +WaitForMultipleObjects() API. It takes an array of kernel object +handles plus a timeout interval, and waits for either one (or all) of +them to become 'signalled'. A handle representing an open file (for +reading) becomes signalled once there is input available. +<p> +So, it is possible to observe that I/O is possible using this +function, but not whether there's "enough" to satisfy the I/O request. +So, if we were to mimic select() usage with WaitForMultipleObjects(), +we'd correctly avoid blocking initially, but a thread may very well +block waiting for their I/O requests to be satisified once the file +handle has become signalled. [There is a fix for this -- only read +and write one byte at a the time -- but I'm not advocating that.] + + +<h2>Win32 non-blocking I/O, attempt 2</h2> + +Asynchronous I/O on Win32 is supported via 'overlapped I/O'; that is, +asynchronous read and write requests can be made via the ReadFile() / +WriteFile () APIs, specifying position and length of the operation. +If the I/O requests cannot be handled right away, the APIs won't +block, but return immediately (and report ERROR_IO_PENDING as their +status code.) +<p> +The completion of the request can be reported in a number of ways: +<ul> + <li> synchronously, by blocking inside Read/WriteFile(). (this is the + non-overlapped case, really.) +<p> + + <li> as part of the overlapped I/O request, pass a HANDLE to an event + object. The I/O system will signal this event once the request + completed, which a waiting thread will then be able to see. 
+<p> + + <li> by supplying a pointer to a completion routine, which will be + called as an Asynchronous Procedure Call (APC) whenever a thread + calls a select bunch of 'alertable' APIs. +<p> + + <li> by associating the file handle with an I/O completion port. Once + the request completes, the thread servicing the I/O completion + port will be notified. +</ul> +The use of I/O completion port looks the most interesting to GHC, +as it provides a central point where all I/O requests are reported. +<p> +Note: asynchronous I/O is only fully supported by OSes based on +the NT codebase, i.e., Win9x don't permit async I/O on files and +pipes. However, Win9x does support async socket operations, and +I'm currently guessing here, console I/O. In my view, it would +be acceptable to provide non-blocking I/O support for NT-based +OSes only. +<p> +Here's the design I currently have in mind: +<ul> +<li> Upon startup, an RTS helper thread whose only purpose is to service + an I/O completion port, is created. +<p> +<li> All files are opened in 'overlapping' mode, and associated + with an I/O completion port. +<p> +<li> Overlapped I/O requests are used to implement read() and write(). +<p> +<li> If the request cannot be satisified without blocking, the Haskell + thread is put on the blocked-on-I/O thread list & a re-schedule + is made. +<p> +<li> When the completion of a request is signalled via the I/O completion + port, the RTS helper thread will move the associated Haskell thread + from the blocked list onto the runnable list. (Clearly, care + is required here to have another OS thread mutate internal Scheduler + data structures.) + +<p> +<li> In the event all Concurrent Haskell threads are blocked waiting on + I/O, the main RTS thread blocks waiting on an event synchronisation + object, which the helper thread will signal whenever it makes + a Haskell thread runnable. + +</ul> + +I might do the communication between the RTS helper thread and the +main RTS thread differently though: rather than have the RTS helper +thread manipluate thread queues itself, thus requiring careful +locking, just have it change a bit on the relevant TSO, which the main +RTS thread can check at regular intervals (in some analog of +awaitEvent(), for example). + + <p><small> +<!-- hhmts start --> +Last modified: Wed Aug 8 19:30:18 EST 2001 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/prelfound.html b/docs/comm/rts-libs/prelfound.html new file mode 100644 index 0000000000..25407eed43 --- /dev/null +++ b/docs/comm/rts-libs/prelfound.html @@ -0,0 +1,57 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Prelude Foundations</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Prelude Foundations</h1> + <p> + The standard Haskell Prelude as well as GHC's Prelude extensions are + constructed from GHC's <a href="primitives.html">primitives</a> in a + couple of layers. + + <h4><code>PrelBase.lhs</code></h4> + <p> + Some most elementary Prelude definitions are collected in <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>. 
+ In particular, it defines the boxed versions of Haskell primitive types + - for example, <code>Int</code> is defined as + <blockquote><pre> +data Int = I# Int#</pre> + </blockquote> + <p> + Saying that a boxed integer <code>Int</code> is formed by applying the + data constructor <code>I#</code> to an <em>unboxed</em> integer of type + <code>Int#</code>. Unboxed types are hardcoded in the compiler and + exported together with the <a href="primitives.html">primitive + operations</a> understood by GHC. + <p> + <code>PrelBase.lhs</code> similarly defines basic types, such as, + boolean values + <blockquote><pre> +data Bool = False | True deriving (Eq, Ord)</pre> + </blockquote> + <p> + the unit type + <blockquote><pre> +data () = ()</pre> + </blockquote> + <p> + and lists + <blockquote><pre> +data [] a = [] | a : [a]</pre> + </blockquote> + <p> + It also contains instance delarations for these types. In addition, + <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky + machinery</a> for efficient list handling. + + <p><small> +<!-- hhmts start --> +Last modified: Wed Aug 8 19:30:18 EST 2001 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/prelude.html b/docs/comm/rts-libs/prelude.html new file mode 100644 index 0000000000..4ad6c20338 --- /dev/null +++ b/docs/comm/rts-libs/prelude.html @@ -0,0 +1,121 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Cunning Prelude Code</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Cunning Prelude Code</h1> + <p> + GHC's uses a many optimsations and GHC specific techniques (unboxed + values, RULES pragmas, and so on) to make the heavily used Prelude code + as fast as possible. + + <hr> + <h4>Par, seq, and lazy</h4> + + In GHC.Conc you will dinf +<blockquote><pre> + pseq a b = a `seq` lazy b +</pre></blockquote> + What's this "lazy" thing. Well, <tt>pseq</tt> is a <tt>seq</tt> for a parallel setting. + We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict + in b, then b might be evaluated <em>before</em> a, which is all wrong. +<p> +Solution: wrap the 'b' in a call to <tt>GHC.Base.lazy</tt>. This function is just the identity function, +except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the +inlining and strictness information that comes in from GHC.Base.hi. And that makes <tt>lazy</tt> look +lazy, and have no inlining. So the strictness analyser gets no traction. +<p> +In the worker/wrapper phase, after strictness analysis, <tt>lazy</tt> is "manually" inlined (see WorkWrap.lhs), +so we get all the efficiency back. +<p> +This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an +Int#. Now there is no seq# operator at all. + + + <hr> + <h4>fold/build</h4> + <p> + There is a lot of magic in <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> - + among other things, the <a + href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES + pragmas</a> implementing the <a + href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a> + optimisation. The code for <code>map</code> is + a good example for how it all works. 
In the prelude code for version + 5.03 it reads as follows: + <blockquote><pre> +map :: (a -> b) -> [a] -> [b] +map _ [] = [] +map f (x:xs) = f x : map f xs + +-- Note eta expanded +mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst +{-# INLINE [0] mapFB #-} +mapFB c f x ys = c (f x) ys + +{-# RULES +"map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) +"mapList" [1] forall f. foldr (mapFB (:) f) [] = map f +"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) + #-}</pre> + </blockquote> + <p> + Up to (but not including) phase 1, we use the <code>"map"</code> rule to + rewrite all saturated applications of <code>map</code> with its + build/fold form, hoping for fusion to happen. In phase 1 and 0, we + switch off that rule, inline build, and switch on the + <code>"mapList"</code> rule, which rewrites the foldr/mapFB thing back + into plain map. + <p> + It's important that these two rules aren't both active at once + (along with build's unfolding) else we'd get an infinite loop + in the rules. Hence the activation control using explicit phase numbers. + <p> + The "mapFB" rule optimises compositions of map. + <p> + The mechanism as described above is new in 5.03 since January 2002, + where the <code>[~</code><i>N</i><code>]</code> syntax for phase number + annotations at rules was introduced. Before that the whole arrangement + was more complicated, as the corresponding prelude code for version + 4.08.1 shows: + <blockquote><pre> +map :: (a -> b) -> [a] -> [b] +map = mapList + +-- Note eta expanded +mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst +mapFB c f x ys = c (f x) ys + +mapList :: (a -> b) -> [a] -> [b] +mapList _ [] = [] +mapList f (x:xs) = f x : mapList f xs + +{-# RULES +"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) +"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) +"mapList" forall f. foldr (mapFB (:) f) [] = mapList f + #-}</pre> + </blockquote> + <p> + This code is structured as it is, because the "map" rule first + <em>breaks</em> the map <em>open,</em> which exposes it to the various + foldr/build rules, and if no foldr/build rule matches, the "mapList" + rule <em>closes</em> it again in a later phase of optimisation - after + build was inlined. As a consequence, the whole thing depends a bit on + the timing of the various optimsations (the map might be closed again + before any of the foldr/build rules fires). To make the timing + deterministic, <code>build</code> gets a <code>{-# INLINE 2 build + #-}</code> pragma, which delays <code>build</code>'s inlining, and thus, + the closing of the map. [NB: Phase numbering was forward at that time.] + + <p><small> +<!-- hhmts start --> +Last modified: Mon Feb 11 20:00:49 EST 2002 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/primitives.html b/docs/comm/rts-libs/primitives.html new file mode 100644 index 0000000000..28abc79426 --- /dev/null +++ b/docs/comm/rts-libs/primitives.html @@ -0,0 +1,70 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Primitives</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Primitives</h1> + <p> + Most user-level Haskell types and functions provided by GHC (in + particular those from the Prelude and GHC's Prelude extensions) are + internally constructed from even more elementary types and functions. 
+ Most notably, GHC understands a notion of <em>unboxed types,</em> which + are the Haskell representation of primitive bit-level integer, float, + etc. types (as opposed to their boxed, heap allocated counterparts) - + cf. <a + href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed + Values as First Class Citizens."</a> + + <h4>The Ultimate Source of Primitives</h4> + <p> + The hardwired types of GHC are brought into scope by the module + <code>PrelGHC</code>. This modules only exists in the form of a + handwritten interface file <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a> + which lists the type and function names, as well as instance + declarations. The actually types of these names as well as their + implementation is hardwired into GHC. Note that the names in this file + are z-encoded, and in particular, identifiers ending on <code>zh</code> + denote user-level identifiers ending in a hash mark (<code>#</code>), + which is used to flag unboxed values or functions operating on unboxed + values. For example, we have <code>Char#</code>, <code>ord#</code>, and + so on. + + <h4>The New Primitive Definition Scheme</h4> + <p> + As of (about) the development version 4.11, the types and various + properties of primitive operations are defined in the file <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt.pp"><code>primops.txt.pp</code></a>. + (Personally, I don't think that the <code>.txt</code> suffix is really + appropriate, as the file is used for automatic code generation; the + recent addition of <code>.pp</code> means that the file is now mangled + by cpp.) + <p> + The utility <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a> + generates a series of Haskell files from <code>primops.txt</code>, which + encode the types and various properties of the primitive operations as + compiler internal data structures. These Haskell files are not complete + modules, but program fragments, which are included into compiler modules + during the GHC build process. The generated include files can be found + in the directory <code>fptools/ghc/compiler/</code> and carry names + matching the pattern <code>primop-*.hs-incl</code>. They are generate + during the execution of the <code>boot</code> target in the + <code>fptools/ghc/</code> directory. This scheme significantly + simplifies the maintenance of primitive operations. + <p> + As of development version 5.02, the <code>primops.txt</code> file also allows the + recording of documentation about intended semantics of the primitives. This can + be extracted into a latex document (or rather, into latex document fragments) + via an appropriate switch to <code>genprimopcode</code>. In particular, see <code>primops.txt</code> + for full details of how GHC is configured to cope with different machine word sizes. 
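 <p>
 To give a feel for what these primitives look like from user-level Haskell,
 here is a tiny sketch that unboxes two <code>Int</code>s and adds them with
 the primitive <code>+#</code>. In current GHCs these names are re-exported
 from <code>GHC.Exts</code> and the hash-mark syntax is enabled with the
 <code>MagicHash</code> pragma; at the time this page was written the
 corresponding module was <code>PrelGHC</code> and the syntax came with
 <code>-fglasgow-exts</code>, so adjust accordingly.
 <blockquote><pre>
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), (+#))

-- Unbox both arguments, add at the primitive level, and rebox the result.
addInt :: Int -> Int -> Int
addInt (I# x) (I# y) = I# (x +# y)
</pre></blockquote>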
+ <p><small> +<!-- hhmts start --> +Last modified: Mon Nov 26 18:03:16 EST 2001 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/stgc.html b/docs/comm/rts-libs/stgc.html new file mode 100644 index 0000000000..196ec9150d --- /dev/null +++ b/docs/comm/rts-libs/stgc.html @@ -0,0 +1,45 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - Spineless Tagless C</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - Spineless Tagless C</h1> + <p> + The C code generated by GHC doesn't use higher-level features of C to be + able to control as precisely as possible what code is generated. + Moreover, it uses special features of gcc (such as, first class labels) + to produce more efficient code. + <p> + STG C makes ample use of C's macro language to define idioms, which also + reduces the size of the generated C code (thus, reducing I/O times). + These macros are defined in the C headers located in GHC's <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a> + directory. + + <h4><code>TailCalls.h</code></h4> + <p> + <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a> + defines how tail calls are implemented - and in particular - optimised + in GHC generated code. The default case, for an architecture for which + GHC is not optimised, is to use the mini interpreter described in the <a + href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a> + <p> + For supported architectures, various tricks are used to generate + assembler implementing proper tail calls. On i386, gcc's first class + labels are used to directly jump to a function pointer. Furthermore, + markers of the form <code>--- BEGIN ---</code> and <code>--- END + ---</code> are added to the assembly right after the function prologue + and before the epilogue. These markers are used by <a + href="../the-beast/mangler.html">the Evil Mangler.</a> + + <p><small> +<!-- hhmts start --> +Last modified: Wed Aug 8 19:28:29 EST 2001 +<!-- hhmts end --> + </small> + </body> +</html> diff --git a/docs/comm/rts-libs/threaded-rts.html b/docs/comm/rts-libs/threaded-rts.html new file mode 100644 index 0000000000..499aeec767 --- /dev/null +++ b/docs/comm/rts-libs/threaded-rts.html @@ -0,0 +1,126 @@ +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</title> + </head> + + <body> + <h1>The GHC Commentary - The Multi-threaded runtime, and multiprocessor execution</h1> + + <p>This section of the commentary explains the structure of the runtime system + when used in threaded or SMP mode.</p> + + <p>The <em>threaded</em> version of the runtime supports + bound threads and non-blocking foreign calls, and an overview of its + design can be found in the paper <a + href="http://www.haskell.org/~simonmar/papers/conc-ffi.pdf">Extending + the Haskell Foreign Function Interface with Concurrency</a>. To + compile the runtime with threaded support, add the line + +<pre>GhcRTSWays += thr</pre> + + to <tt>mk/build.mk</tt>. 
When building C code in the runtime for the threaded way, + the symbol <tt>THREADED_RTS</tt> is defined (this is arranged by the + build system when building for way <tt>thr</tt>, see + <tt>mk/config.mk</tt>). To build a Haskell program + with the threaded runtime, pass the flag <tt>-threaded</tt> to GHC (this + can be used in conjunction with <tt>-prof</tt>, and possibly + <tt>-debug</tt> and others depending on which versions of the RTS have + been built.</p> + + <p>The <em>SMP</em> version runtime supports the same facilities as the + threaded version, and in addition supports execution of Haskell code by + multiple simultaneous OS threads. For SMP support, both the runtime and + the libraries must be built a special way: add the lines + + <pre> +GhcRTSWays += thr +GhcLibWays += s</pre> + + to <tt>mk/build.mk</tt>. To build Haskell code for + SMP execution, use the flag <tt>-smp</tt> to GHC (this can be used in + conjunction with <tt>-debug</tt>, but no other way-flags at this time). + When building C code in the runtime for SMP + support, the symbol <tt>SMP</tt> is defined (this is arranged by the + compiler when the <tt>-smp</tt> flag is given, see + <tt>ghc/compiler/main/StaticFlags.hs</tt>).</p> + + <p>When building the runtime in either the threaded or SMP ways, the symbol + <tt>RTS_SUPPORTS_THREADS</tt> will be defined (see <tt>Rts.h</tt>).</p> + + <h2>Overall design</h2> + + <p>The system is based around the notion of a <tt>Capability</tt>. A + <tt>Capability</tt> is an object that represents both the permission to + execute some Haskell code, and the state required to do so. In order + to execute some Haskell code, a thread must therefore hold a + <tt>Capability</tt>. The available pool of capabilities is managed by + the <tt>Capability</tt> API, described below.</p> + + <p>In the threaded runtime, there is only a single <tt>Capabililty</tt> in the + system, indicating that only a single thread can be executing Haskell + code at any one time. In the SMP runtime, there can be an arbitrary + number of capabilities selectable at runtime with the <tt>+RTS -N<em>n</em></tt> + flag; in practice the number is best chosen to be the same as the number of + processors on the host machine.</p> + + <p>There are a number of OS threads running code in the runtime. We call + these <em>tasks</em> to avoid confusion with Haskell <em>threads</em>. + Tasks are managed by the <tt>Task</tt> subsystem, which is mainly + concerned with keeping track of statistics such as how much time each + task spends executing Haskell code, and also keeping track of how many + tasks are around when we want to shut down the runtime.</p> + + <p>Some tasks are created by the runtime itself, and some may be here + as a result of a call to Haskell from foreign code (we + call this an in-call). The + runtime can support any number of concurrent foreign in-calls, but the + number of these calls that will actually run Haskell code in parallel is + determined by the number of available capabilities. Each in-call creates + a <em>bound thread</em>, as described in the FFI/Concurrency paper (cited + above).</p> + + <p>In the future we may want to bind a <tt>Capability</tt> to a particular + processor, so that we can support a notion of affinity - avoiding + accidental migration of work from one CPU to another, so that we can make + best use of a CPU's local cache. 
For now, the design ignores this + issue.</p> + + <h2>The <tt>OSThreads</tt> interface</h2> + + <p>This interface is merely an abstraction layer over the OS-specific APIs + for managing threads. It has two main implementations: Win32 and + POSIX.</p> + + <p>This is the entirety of the interface:</p> + +<pre> +/* Various abstract types */ +typedef Mutex; +typedef Condition; +typedef OSThreadId; + +extern OSThreadId osThreadId ( void ); +extern void shutdownThread ( void ); +extern void yieldThread ( void ); +extern int createOSThread ( OSThreadId* tid, + void (*startProc)(void) ); + +extern void initCondition ( Condition* pCond ); +extern void closeCondition ( Condition* pCond ); +extern rtsBool broadcastCondition ( Condition* pCond ); +extern rtsBool signalCondition ( Condition* pCond ); +extern rtsBool waitCondition ( Condition* pCond, + Mutex* pMut ); + +extern void initMutex ( Mutex* pMut ); + </pre> + + <h2>The Task interface</h2> + + <h2>The Capability interface</h2> + + <h2>Multiprocessor Haskell Execution</h2> + + </body> +</html> |