diff options
Diffstat (limited to 'docs/comm/the-beast/mangler.html')
-rw-r--r-- | docs/comm/the-beast/mangler.html | 79 |
1 files changed, 79 insertions, 0 deletions
diff --git a/docs/comm/the-beast/mangler.html b/docs/comm/the-beast/mangler.html new file mode 100644 index 0000000000..1ad80f0d5c --- /dev/null +++ b/docs/comm/the-beast/mangler.html @@ -0,0 +1,79 @@ +<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN"> +<html> + <head> + <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"> + <title>The GHC Commentary - The Evil Mangler</title> + </head> + + <body BGCOLOR="FFFFFF"> + <h1>The GHC Commentary - The Evil Mangler</h1> + <p> + The Evil Mangler (EM) is a Perl script invoked by the <a + href="driver.html">Glorious Driver</a> after the C compiler (gcc) has + translated the GHC-produced C code into assembly. Consequently, it is + only of interest if <code>-fvia-C</code> is in effect (either explicitly + or implicitly). + + <h4>Its purpose</h4> + <p> + The EM reads the assembly produced by gcc and re-arranges code blocks as + well as nukes instructions that it considers <em>non-essential.</em> It + derives it evilness from its utterly ad hoc, machine, compiler, and + whatnot dependent design and implementation. More precisely, the EM + performs the following tasks: + <ul> + <li>The code executed when a closure is entered is moved adjacent to + that closure's infotable. Moreover, the order of the info table + entries is reversed. Also, SRT pointers are removed from closures that + don't need them (non-FUN, RET and THUNK ones). + <li>Function prologue and epilogue code is removed. (GHC generated code + manages its own stack and uses the system stack only for return + addresses and during calls to C code.) + <li>Certain code patterns are replaced by simpler code (eg, loads of + fast entry points followed by indirect jumps are replaced by direct + jumps to the fast entry point). + </ul> + + <h4>Implementation</h4> + <p> + The EM is located in the Perl script <a + href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>. + The script reads the <code>.s</code> file and chops it up into + <em>chunks</em> (that's how they are actually called in the script) that + roughly correspond to basic blocks. Each chunk is annotated with an + educated guess about what kind of code it contains (e.g., infotable, + fast entry point, slow entry point, etc.). The annotations also contain + the symbol introducing the chunk of assembly and whether that chunk has + already been processed or not. + <p> + The parsing of the input into chunks as well as recognising assembly + instructions that are to be removed or altered is based on a large + number of Perl regular expressions sprinkled over the whole code. These + expressions are rather fragile as they heavily rely on the structure of + the generated code - in fact, they even rely on the right amount of + white space and thus on the formatting of the assembly. + <p> + Afterwards, the chunks are reordered, some of them purged, and some + stripped of some useless instructions. Moreover, some instructions are + manipulated (eg, loads of fast entry points followed by indirect jumps + are replaced by direct jumps to the fast entry point). + <p> + The EM knows which part of the code belongs to function prologues and + epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the + form <code>--- BEGIN ---</code> and <code>--- END ---</code> the + assembler just before and after the code proper of a function starts. + It adds these tags using gcc's <code>__asm__</code> feature. + <p> + <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic + block re-ordering and dead code elimination. This seems to make the + whole <code>--- END ---</code> tag business redundant -- in fact, if + proper code is generated, no <code>--- END ---</code> tags survive gcc + optimiser. + + <p><small> +<!-- hhmts start --> +Last modified: Sun Feb 17 17:55:47 EST 2002 +<!-- hhmts end --> + </small> + </body> +</html> |