diff options
author | Simon Marlow <marlowsd@gmail.com> | 2012-10-03 09:30:56 +0100 |
---|---|---|
committer | Simon Marlow <marlowsd@gmail.com> | 2012-10-08 09:04:40 +0100 |
commit | a7c0387d20c1c9994d1100b14fbb8fb4e28a259e (patch) | |
tree | b95d0a512f951a4a463f1aa5178b0cd5c4fdb410 /rts/Updates.h | |
parent | aed37acd4d157791381800d5de960a2461bcbef3 (diff) | |
download | haskell-a7c0387d20c1c9994d1100b14fbb8fb4e28a259e.tar.gz |
Produce new-style Cmm from the Cmm parser
The main change here is that the Cmm parser now allows high-level cmm
code with argument-passing and function calls. For example:
foo ( gcptr a, bits32 b )
{
if (b > 0) {
// we can make tail calls passing arguments:
jump stg_ap_0_fast(a);
}
return (x,y);
}
More details on the new cmm syntax are in Note [Syntax of .cmm files]
in CmmParse.y.
The old syntax is still more-or-less supported for those occasional
code fragments that really need to explicitly manipulate the stack.
However there are a couple of differences: it is now obligatory to
give a list of live GlobalRegs on every jump, e.g.
jump %ENTRY_CODE(Sp(0)) [R1];
Again, more details in Note [Syntax of .cmm files].
I have rewritten most of the .cmm files in the RTS into the new
syntax, except for AutoApply.cmm which is generated by the genapply
program: this file could be generated in the new syntax instead and
would probably be better off for it, but I ran out of enthusiasm.
Some other changes in this batch:
- The PrimOp calling convention is gone, primops now use the ordinary
NativeNodeCall convention. This means that primops and "foreign
import prim" code must be written in high-level cmm, but they can
now take more than 10 arguments.
- CmmSink now does constant-folding (should fix #7219)
- .cmm files now go through the cmmPipeline, and as a result we
generate better code in many cases. All the object files generated
for the RTS .cmm files are now smaller. Performance should be
better too, but I haven't measured it yet.
- RET_DYN frames are removed from the RTS, lots of code goes away
- we now have some more canned GC points to cover unboxed-tuples with
2-4 pointers, which will reduce code size a little.
Diffstat (limited to 'rts/Updates.h')
-rw-r--r-- | rts/Updates.h | 17 |
1 files changed, 11 insertions, 6 deletions
diff --git a/rts/Updates.h b/rts/Updates.h index 954f02afe1..0205e6e763 100644 --- a/rts/Updates.h +++ b/rts/Updates.h @@ -24,29 +24,34 @@ * field. So, we call LDV_RECORD_CREATE(). */ -/* We have two versions of this macro (sadly), one for use in C-- code, +/* + * We have two versions of this macro (sadly), one for use in C-- code, * and the other for C. * * The and_then argument is a performance hack so that we can paste in * the continuation code directly. It helps shave a couple of * instructions off the common case in the update code, which is * worthwhile (the update code is often part of the inner loop). - * (except that gcc now appears to common up this code again and - * invert the optimisation. Grrrr --SDM). */ #ifdef CMINUSMINUS -#define updateWithIndirection(p1, p2, and_then) \ +#define UPDATE_FRAME_FIELDS(w_,p_,info_ptr,updatee) \ + w_ info_ptr, \ + PROF_HDR_FIELDS(w_) \ + p_ updatee + + +#define updateWithIndirection(p1, p2, and_then) \ W_ bd; \ \ OVERWRITING_CLOSURE(p1); \ StgInd_indirectee(p1) = p2; \ - prim %write_barrier() []; \ + prim %write_barrier(); \ SET_INFO(p1, stg_BLACKHOLE_info); \ LDV_RECORD_CREATE(p1); \ bd = Bdescr(p1); \ if (bdescr_gen_no(bd) != 0 :: bits16) { \ - recordMutableCap(p1, TO_W_(bdescr_gen_no(bd)), R1); \ + recordMutableCap(p1, TO_W_(bdescr_gen_no(bd))); \ TICK_UPD_OLD_IND(); \ and_then; \ } else { \ |