diff options
Diffstat (limited to 'docs/comm/the-beast/fexport.html')
-rw-r--r-- | docs/comm/the-beast/fexport.html | 231 |
1 files changed, 231 insertions, 0 deletions
diff --git a/docs/comm/the-beast/fexport.html b/docs/comm/the-beast/fexport.html new file mode 100644 index 0000000000..956043bafb --- /dev/null +++ b/docs/comm/the-beast/fexport.html @@ -0,0 +1,231 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - foreign export</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - foreign export</h1> + + The implementation scheme for foreign export, as of 27 Feb 02, is + as follows. There are four cases, of which the first two are easy. + <p> + <b>(1) static export of an IO-typed function from some module <code>MMM</code></b> + <p> + <code>foreign export foo :: Int -> Int -> IO Int</code> + <p> + For this we generate no Haskell code. However, a C stub is + generated, and it looks like this: + <p> + <pre> +extern StgClosure* MMM_foo_closure; + +HsInt foo (HsInt a1, HsInt a2) +{ + SchedulerStatus rc; + HaskellObj ret; + rc = rts_evalIO( + rts_apply(rts_apply(MMM_foo_closure,rts_mkInt(a1)), + rts_mkInt(a2) + ), + &ret + ); + rts_checkSchedStatus("foo",rc); + return(rts_getInt(ret)); +} +</pre> + <p> + This does the obvious thing: builds in the heap the expression + <code>(foo a1 a2)</code>, calls <code>rts_evalIO</code> to run it, + and uses <code>rts_getInt</code> to fish out the result. + + <p> + <b>(2) static export of a non-IO-typed function from some module <code>MMM</code></b> + <p> + <code>foreign export foo :: Int -> Int -> Int</code> + <p> + This is identical to case (1), with the sole difference that the + stub calls <code>rts_eval</code> rather than + <code>rts_evalIO</code>. + <p> + + <b>(3) dynamic export of an IO-typed function from some module <code>MMM</code></b> + <p> + <code>foreign export mkCallback :: (Int -> Int -> IO Int) -> IO (FunPtr a)</code> + <p> + Dynamic exports are a whole lot more complicated than their static + counterparts. + <p> + First of all, we get some Haskell code, which, when given a + function <code>callMe :: (Int -> Int -> IO Int)</code> to be made + C-callable, IO-returns a <code>FunPtr a</code>, which is the + address of the resulting C-callable code. This address can now be + handed out to the C-world, and callers to it will get routed + through to <code>callMe</code>. + <p> + The generated Haskell function looks like this: + <p> +<pre> +mkCallback f + = do sp <- mkStablePtr f + r <- ccall "createAdjustorThunk" sp (&"run_mkCallback") + return r +</pre> + <p> + <code>createAdjustorThunk</code> is a gruesome, + architecture-specific function in the RTS. It takes a stable + pointer to the Haskell function to be run, and the address of the + associated C wrapper, and returns a piece of machine code, + which, when called from the outside (C) world, eventually calls + through to <code>f</code>. + <p> + This machine code fragment is called the "Adjustor Thunk" (don't + ask me why). What it does is simply to call onwards to the C + helper + function <code>run_mkCallback</code>, passing all the args given + to it but also conveying <code>sp</code>, which is a stable + pointer + to the Haskell function to run. So: + <p> +<pre> +createAdjustorThunk ( StablePtr sp, CCodeAddress addr_of_helper_C_fn ) +{ + create malloc'd piece of machine code "mc", behaving thusly: + + mc ( args_to_mc ) + { + jump to addr_of_helper_C_fn, passing sp as an additional + argument + } +</pre> + <p> + This is a horrible hack, because there is no portable way, even at + the machine code level, to function which adds one argument and + then transfers onwards to another C function. On x86s args are + pushed R to L onto the stack, so we can just push <code>sp</code>, + fiddle around with return addresses, and jump onwards to the + helper C function. However, on architectures which use register + windows and/or pass args extensively in registers (Sparc, Alpha, + MIPS, IA64), this scheme borders on the unviable. GHC has a + limited <code>createAdjustorThunk</code> implementation for Sparc + and Alpha, which handles only the cases where all args, including + the extra one, fit in registers. + <p> + Anyway: the other lump of code generated as a result of a + f-x-dynamic declaration is the C helper stub. This is basically + the same as in the static case, except that it only ever gets + called from the adjustor thunk, and therefore must accept + as an extra argument, a stable pointer to the Haskell function + to run, naturally enough, as this is not known until run-time. + It then dereferences the stable pointer and does the call in + the same way as the f-x-static case: +<pre> +HsInt Main_d1kv ( StgStablePtr the_stableptr, + void* original_return_addr, + HsInt a1, HsInt a2 ) +{ + SchedulerStatus rc; + HaskellObj ret; + rc = rts_evalIO( + rts_apply(rts_apply((StgClosure*)deRefStablePtr(the_stableptr), + rts_mkInt(a1) + ), + rts_mkInt(a2) + ), + &ret + ); + rts_checkSchedStatus("Main_d1kv",rc); + return(rts_getInt(ret)); +} +</pre> + <p> + Note how this function has a purely made-up name + <code>Main_d1kv</code>, since unlike the f-x-static case, this + function is never called from user code, only from the adjustor + thunk. + <p> + Note also how the function takes a bogus parameter + <code>original_return_addr</code>, which is part of this extra-arg + hack. The usual scheme is to leave the original caller's return + address in place and merely push the stable pointer above that, + hence the spare parameter. + <p> + Finally, there is some extra trickery, detailed in + <code>ghc/rts/Adjustor.c</code>, to get round the following + problem: the adjustor thunk lives in mallocville. It is + quite possible that the Haskell code will actually + call <code>free()</code> on the adjustor thunk used to get to it + -- because otherwise there is no way to reclaim the space used + by the adjustor thunk. That's all very well, but it means that + the C helper cannot return to the adjustor thunk in the obvious + way, since we've already given it back using <code>free()</code>. + So we leave, on the C stack, the address of whoever called the + adjustor thunk, and before calling the helper, mess with the stack + such that when the helper returns, it returns directly to the + adjustor thunk's caller. + <p> + That's how the <code>stdcall</code> convention works. If the + adjustor thunk has been called using the <code>ccall</code> + convention, we return indirectly, via a statically-allocated + yet-another-magic-piece-of-code, which takes care of removing the + extra argument that the adjustor thunk pushed onto the stack. + This is needed because in <code>ccall</code>-world, it is the + caller who removes args after the call, and the original caller of + the adjustor thunk has no way to know about the extra arg pushed + by the adjustor thunk. + <p> + You didn't really want to know all this stuff, did you? + <p> + + + + <b>(4) dynamic export of an non-IO-typed function from some module <code>MMM</code></b> + <p> + <code>foreign export mkCallback :: (Int -> Int -> Int) -> IO (FunPtr a)</code> + <p> + (4) relates to (3) as (2) relates to (1), that is, it's identical, + except the C stub uses <code>rts_eval</code> instead of + <code>rts_evalIO</code>. + <p> + + + <h2>Some perspective on f-x-dynamic</h2> + + The only really horrible problem with f-x-dynamic is how the + adjustor thunk should pass to the C helper the stable pointer to + use. Ideally we would like this to be conveyed via some invisible + side channel, since then the adjustor thunk could simply jump + directly to the C helper, with no non-portable stack fiddling. + <p> + Unfortunately there is no obvious candidate for the invisible + side-channel. We've chosen to pass it on the stack, with the + bad consequences detailed above. Another possibility would be to + park it in a global variable, but this is non-reentrant and + non-(OS-)thread-safe. A third idea is to put it into a callee-saves + register, but that has problems too: the C helper may not use that + register and therefore we will have trashed any value placed there + by the caller; and there is no C-level portable way to read from + the register inside the C helper. + <p> + In short, we can't think of a really satisfactory solution. I'd + vote for introducing some kind of OS-thread-local-state and passing + it in there, but that introduces complications of its own. + <p> + <b>OS-thread-safety</b> is of concern in the C stubs, whilst + building up the expressions to run. These need to have exclusive + access to the heap whilst allocating in it. Also, there needs to + be some guarantee that no GC will happen in between the + <code>deRefStablePtr</code> call and when <code>rts_eval[IO]</code> + starts running. At the moment there are no guarantees for + either property. This needs to be sorted out before the + implementation can be regarded as fully safe to use. + +<p><small> + +<!-- hhmts start --> +Last modified: Weds 27 Feb 02 +<!-- hhmts end --> + </small> + </body> +</html> |