summaryrefslogtreecommitdiff
path: root/docs/comm/the-beast/fexport.html
diff options
context:
space:
mode:
Diffstat (limited to 'docs/comm/the-beast/fexport.html')
-rw-r--r--docs/comm/the-beast/fexport.html231
1 files changed, 231 insertions, 0 deletions
diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html
new file mode 100644
index 0000000000..956043bafb
--- /dev/null
+++ b/docs/comm/the-beast/fexport.html
@@ -0,0 +1,231 @@
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - foreign export</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - foreign export</h1>
+
+ The implementation scheme for foreign export, as of 27 Feb 02, is
+ as follows. There are four cases, of which the first two are easy.
+ <p>
+ <b>(1) static export of an IO-typed function from some module <code>MMM</code></b>
+ <p>
+ <code>foreign export foo :: Int -> Int -> IO Int</code>
+ <p>
+ For this we generate no Haskell code. However, a C stub is
+ generated, and it looks like this:
+ <p>
+ <pre>
+extern StgClosure* MMM_foo_closure;
+
+HsInt foo (HsInt a1, HsInt a2)
+{
+ SchedulerStatus rc;
+ HaskellObj ret;
+ rc = rts_evalIO(
+ rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)),
+ rts_mkInt(a2)
+ ),
+ &ret
+ );
+ rts_checkSchedStatus("foo",rc);
+ return(rts_getInt(ret));
+}
+</pre>
+ <p>
+ This does the obvious thing: builds in the heap the expression
+ <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it,
+ and uses <code>rts_getInt</code> to fish out the result.
+
+ <p>
+ <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b>
+ <p>
+ <code>foreign export foo :: Int -> Int -> Int</code>
+ <p>
+ This is identical to case (1), with the sole difference that the
+ stub calls <code>rts_eval</code> rather than
+ <code>rts_evalIO</code>.
+ <p>
+
+ <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b>
+ <p>
+ <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code>
+ <p>
+ Dynamic exports are a whole lot more complicated than their static
+ counterparts.
+ <p>
+ First of all, we get some Haskell code, which, when given a
+ function <code>callMe :: (Int -> Int -> IO Int)</code> to be made
+ C-callable, IO-returns a <code>FunPtr a</code>, which is the
+ address of the resulting C-callable code. This address can now be
+ handed out to the C-world, and callers to it will get routed
+ through to <code>callMe</code>.
+ <p>
+ The generated Haskell function looks like this:
+ <p>
+<pre>
+mkCallback f
+ = do sp <- mkStablePtr f
+ r <- ccall "createAdjustorThunk" sp (&"run_mkCallback")
+ return r
+</pre>
+ <p>
+ <code>createAdjustorThunk</code> is a gruesome,
+ architecture-specific function in the RTS. It takes a stable
+ pointer to the Haskell function to be run, and the address of the
+ associated C wrapper, and returns a piece of machine code,
+ which, when called from the outside (C) world, eventually calls
+ through to <code>f</code>.
+ <p>
+ This machine code fragment is called the "Adjustor Thunk" (don't
+ ask me why). What it does is simply to call onwards to the C
+ helper
+ function <code>run_mkCallback</code>, passing all the args given
+ to it but also conveying <code>sp</code>, which is a stable
+ pointer
+ to the Haskell function to run. So:
+ <p>
+<pre>
+createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn )
+{
+ create malloc'd piece of machine code "mc", behaving thusly:
+
+ mc ( args_to_mc )
+ {
+ jump to addr_of_helper_C_fn, passing sp as an additional
+ argument
+ }
+</pre>
+ <p>
+ This is a horrible hack, because there is no portable way, even at
+ the machine code level, to function which adds one argument and
+ then transfers onwards to another C function. On x86s args are
+ pushed R to L onto the stack, so we can just push <code>sp</code>,
+ fiddle around with return addresses, and jump onwards to the
+ helper C function. However, on architectures which use register
+ windows and/or pass args extensively in registers (Sparc, Alpha,
+ MIPS, IA64), this scheme borders on the unviable. GHC has a
+ limited <code>createAdjustorThunk</code> implementation for Sparc
+ and Alpha, which handles only the cases where all args, including
+ the extra one, fit in registers.
+ <p>
+ Anyway: the other lump of code generated as a result of a
+ f-x-dynamic declaration is the C helper stub. This is basically
+ the same as in the static case, except that it only ever gets
+ called from the adjustor thunk, and therefore must accept
+ as an extra argument, a stable pointer to the Haskell function
+ to run, naturally enough, as this is not known until run-time.
+ It then dereferences the stable pointer and does the call in
+ the same way as the f-x-static case:
+<pre>
+HsInt Main_d1kv ( StgStablePtr the_stableptr,
+ void* original_return_addr,
+ HsInt a1, HsInt a2 )
+{
+ SchedulerStatus rc;
+ HaskellObj ret;
+ rc = rts_evalIO(
+ rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr),
+ rts_mkInt(a1)
+ ),
+ rts_mkInt(a2)
+ ),
+ &ret
+ );
+ rts_checkSchedStatus("Main_d1kv",rc);
+ return(rts_getInt(ret));
+}
+</pre>
+ <p>
+ Note how this function has a purely made-up name
+ <code>Main_d1kv</code>, since unlike the f-x-static case, this
+ function is never called from user code, only from the adjustor
+ thunk.
+ <p>
+ Note also how the function takes a bogus parameter
+ <code>original_return_addr</code>, which is part of this extra-arg
+ hack. The usual scheme is to leave the original caller's return
+ address in place and merely push the stable pointer above that,
+ hence the spare parameter.
+ <p>
+ Finally, there is some extra trickery, detailed in
+ <code>ghc/rts/Adjustor.c</code>, to get round the following
+ problem: the adjustor thunk lives in mallocville. It is
+ quite possible that the Haskell code will actually
+ call <code>free()</code> on the adjustor thunk used to get to it
+ -- because otherwise there is no way to reclaim the space used
+ by the adjustor thunk. That's all very well, but it means that
+ the C helper cannot return to the adjustor thunk in the obvious
+ way, since we've already given it back using <code>free()</code>.
+ So we leave, on the C stack, the address of whoever called the
+ adjustor thunk, and before calling the helper, mess with the stack
+ such that when the helper returns, it returns directly to the
+ adjustor thunk's caller.
+ <p>
+ That's how the <code>stdcall</code> convention works. If the
+ adjustor thunk has been called using the <code>ccall</code>
+ convention, we return indirectly, via a statically-allocated
+ yet-another-magic-piece-of-code, which takes care of removing the
+ extra argument that the adjustor thunk pushed onto the stack.
+ This is needed because in <code>ccall</code>-world, it is the
+ caller who removes args after the call, and the original caller of
+ the adjustor thunk has no way to know about the extra arg pushed
+ by the adjustor thunk.
+ <p>
+ You didn't really want to know all this stuff, did you?
+ <p>
+
+
+
+ <b>(4) dynamic export of an non-IO-typed function from some module <code>MMM</code></b>
+ <p>
+ <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code>
+ <p>
+ (4) relates to (3) as (2) relates to (1), that is, it's identical,
+ except the C stub uses <code>rts_eval</code> instead of
+ <code>rts_evalIO</code>.
+ <p>
+
+
+ <h2>Some perspective on f-x-dynamic</h2>
+
+ The only really horrible problem with f-x-dynamic is how the
+ adjustor thunk should pass to the C helper the stable pointer to
+ use. Ideally we would like this to be conveyed via some invisible
+ side channel, since then the adjustor thunk could simply jump
+ directly to the C helper, with no non-portable stack fiddling.
+ <p>
+ Unfortunately there is no obvious candidate for the invisible
+ side-channel. We've chosen to pass it on the stack, with the
+ bad consequences detailed above. Another possibility would be to
+ park it in a global variable, but this is non-reentrant and
+ non-(OS-)thread-safe. A third idea is to put it into a callee-saves
+ register, but that has problems too: the C helper may not use that
+ register and therefore we will have trashed any value placed there
+ by the caller; and there is no C-level portable way to read from
+ the register inside the C helper.
+ <p>
+ In short, we can't think of a really satisfactory solution. I'd
+ vote for introducing some kind of OS-thread-local-state and passing
+ it in there, but that introduces complications of its own.
+ <p>
+ <b>OS-thread-safety</b> is of concern in the C stubs, whilst
+ building up the expressions to run. These need to have exclusive
+ access to the heap whilst allocating in it. Also, there needs to
+ be some guarantee that no GC will happen in between the
+ <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code>
+ starts running. At the moment there are no guarantees for
+ either property. This needs to be sorted out before the
+ implementation can be regarded as fully safe to use.
+
+<p><small>
+
+<!-- hhmts start -->
+Last modified: Weds 27 Feb 02
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>