delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fix ghc-bignum exceptions	Sylvain Henry	2020-06-27	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We must ensure that exceptions are not simplified. Previously we used: case raiseDivZero of _ -> 0## -- dummyValue But it was wrong because the evaluation of `raiseDivZero` was removed and the dummy value was directly returned. See new Note [ghc-bignum exceptions]. I've also removed the exception triggering primops which were fragile. We don't need them to be primops, we can have them exported by ghc-prim. I've also added a test for #18359 which triggered this patch.
*	Fix unboxed-sums GC ptr-slot rubbish value (#17791)	Sylvain Henry	2020-05-09	1	-0/+9
\| \| \| \| \| \| \|	This patch allows boot libraries to use unboxed sums without implicitly depending on `base` package because of `absentSumFieldError`. See updated Note [aBSENT_SUM_FIELD_ERROR_ID] in GHC.Core.Make
*	Module hierarchy: Cmm (cf #13009)	Sylvain Henry	2020-01-25	1	-1/+1
\|
*	rts: Implement concurrent collection in the nonmoving collector	Ben Gamari	2019-10-20	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the non-moving collector to allow concurrent collection. The full design of the collector implemented here is described in detail in a technical note B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell Compiler" (2018) This extension involves the introduction of a capability-local remembered set, known as the /update remembered set/, which tracks objects which may no longer be visible to the collector due to mutation. To maintain this remembered set we introduce a write barrier on mutations which is enabled while a concurrent mark is underway. The update remembered set representation is similar to that of the nonmoving mark queue, being a chunked array of `MarkEntry`s. Each `Capability` maintains a single accumulator chunk, which it flushed when it (a) is filled, or (b) when the nonmoving collector enters its post-mark synchronization phase. While the write barrier touches a significant amount of code it is conceptually straightforward: the mutator must ensure that the referee of any pointer it overwrites is added to the update remembered set. However, there are a few details: * In the case of objects with a dirty flag (e.g. `MVar`s) we can exploit the fact that only the first mutation requires a write barrier. * Weak references, as usual, complicate things. In particular, we must ensure that the referee of a weak object is marked if dereferenced by the mutator. For this we (unfortunately) must introduce a read barrier, as described in Note [Concurrent read barrier on deRefWeak#] (in `NonMovingMark.c`). * Stable names are also a bit tricky as described in Note [Sweeping stable names in the concurrent collector] (`NonMovingSweep.c`). We take quite some pains to ensure that the high thread count often seen in parallel Haskell applications doesn't affect pause times. To this end we allow thread stacks to be marked either by the thread itself (when it is executed or stack-underflows) or the concurrent mark thread (if the thread owning the stack is never scheduled). There is a non-trivial handshake to ensure that this happens without racing which is described in Note [StgStack dirtiness flags and concurrent marking]. Co-Authored-by: Ömer Sinan Ağacan <omer@well-typed.com>
*	rts: Rip out support for STM invariants	Ben Gamari	2018-06-02	1	-5/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This feature has some very serious correctness issues (#14310), introduces a great deal of complexity, and hasn't seen wide usage. Consequently we are removing it, as proposed in Proposal #77 [1]. This is heavily based on a patch from fryguybob. Updates stm submodule. [1] https://github.com/ghc-proposals/ghc-proposals/pull/77 Test Plan: Validate Reviewers: erikd, simonmar, hvr Reviewed By: simonmar Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14310 Differential Revision: https://phabricator.haskell.org/D4760
*	Correctly add unwinding info in manifestSp and makeFixupBlocks	Bartosz Nitka	2018-05-03	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In `manifestSp` the unwind info was before the relevant instruction, not after. I added some notes to establish semantics. Also removes redundant annotation in stg_catch_frame. For `makeFixupBlocks` it looks like we were off by `wORD_SIZE dflags`. I'm not sure why, but it lines up with `manifestSp`. In fact it lines up so well so that I can consolidate the Sp unwind logic in `maybeAddUnwind`. I detected the problems with `makeFixupBlocks` by running T14779b after patching D4559. Test Plan: added a new test Reviewers: bgamari, scpmw, simonmar, erikd Reviewed By: bgamari Subscribers: thomie, carter GHC Trac Issues: #14999 Differential Revision: https://phabricator.haskell.org/D4606
*	Update a comment in Exception.cmm	Ömer Sinan Ağacan	2018-03-13	1	-1/+1
\| \| \| \|	[skip ci]
*	rts: Set unwind information for catch_frame	Ben Gamari	2017-09-21	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Reviewers: austin, erikd, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3937
*	rts: Set unwind information for remaining stack frames	Ben Gamari	2017-09-21	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	Reviewers: austin, erikd, simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3985
*	rts: Fix type of bool literal	Ben Gamari	2016-12-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Test Plan: Build `p` way Reviewers: austin, erikd, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2779
*	rts: fix threadStackUnderflow type in cmm	Sergei Trofimovich	2016-03-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	stg_stack_underflow_frame had an incorrect call of C function 'threadStackUnderflow': ("ptr" ret_off) = foreign "C" threadStackUnderflow( MyCapability(), CurrentTSO); Which means it's prototype is: void * () (W_, void); While real prototype is: W_ () (Capability cap, StgTSO *tso); The fix is simple. Fix type annotations: (ret_off) = foreign "C" threadStackUnderflow( MyCapability() "ptr", CurrentTSO "ptr"); Noticed when debugged T9045 test failure on m68k target which distincts between pointer and non pointer return types (uses different registers) While at it noticed and fixed return types for 'throwTo' and 'findSpark'. Trac #11395 Signed-off-by: Sergei Trofimovich <siarheit@google.com>
*	Enable stack traces with ghci -fexternal-interpreter -prof	Simon Marlow	2016-01-08	1	-7/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: The main goal here is enable stack traces in GHCi. After this change, if you start GHCi like this: ghci -fexternal-interpreter -prof (which requires packages to be built for profiling, but not GHC itself) then the interpreter manages cost-centre stacks during execution and can produce a stack trace on request. Call locations are available for all interpreted code, and any compiled code that was built with the `-fprof-auto` familiy of flags. There are a couple of ways to get a stack trace: * `error`/`undefined` automatically get one attached * `Debug.Trace.traceStack` can be used anywhere, and prints the current stack Because the interpreter is running in a separate process, only the interpreted code is running in profiled mode and the compiler itself isn't slowed down by profiling. The GHCi debugger still doesn't work with -fexternal-interpreter, although this patch gets it a step closer. Most of the functionality of breakpoints is implemented, but the runtime value introspection is still not supported. Along the way I also did some refactoring and added type arguments to the various remote pointer types in `GHCi.RemotePtr`, so there's better type safety and documentation in the bridge code between GHC and ghc-iserv. Test Plan: validate Reviewers: bgamari, ezyang, austin, hvr, goldfire, erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1747 GHC Trac Issues: #11047, #11100
*	CMM: add a mechanism to import C .data labels	Sergei Trofimovich	2015-01-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: This introduces new .cmm syntax for import: 'import' 'CLOSURE' <identifier>; Currently cmm syntax allows importing only function labels: import pthread_mutex_lock; but sometimes ghc needs to import global gariables or haskell closures: import ghczmprim_GHCziTypes_True_closure; import base_ControlziExceptionziBase_nestedAtomically_closure; import ghczmprim_GHCziTypes_False_closure; import sm_mutex; It breaks on ia64 where there is a difference in pointers to data and pointer to functions. Patch fixes threaded runtime on ia64 where dereference of 'sm_mutex' from CMM led to incurrect location. Exact breakage machanics are the same as in e18525fae273f4c1ad8d6cbe1dea4fc074cac721 Merge into the 7.10 branch Signed-off-by: Sergei Trofimovich <siarheit@google.com> Test Plan: passes ./validate, makes ghci work on ghc-7.8.4 Reviewers: simonmar, simonpj, austin Reviewed By: austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D622
*	Add unwind information to Cmm	Peter Wortmann	2014-12-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unwind information allows the debugger to discover more information about a program state, by allowing it to "reconstruct" other states of the program. In practice, this means that we explain to the debugger how to unravel stack frames, which comes down mostly to explaining how to find their Sp and Ip register values. * We declare yet another new constructor for CmmNode - and this time there's actually little choice, as unwind information can and will change mid-block. We don't actually make use of these capabilities, and back-end support would be tricky (generate new labels?), but it feels like the right way to do it. * Even though we only use it for Sp so far, we allow CmmUnwind to specify unwind information for any register. This is pretty cheap and could come in useful in future. * We allow full CmmExpr expressions for specifying unwind values. The advantage here is that we don't have to make up new syntax, and can e.g. use the WDS macro directly. On the other hand, the back-end will now have to simplify the expression until it can sensibly be converted into DWARF byte code - a process which might fail, yielding NCG panics. On the other hand, when you're writing Cmm by hand you really ought to know what you're doing. (From Phabricator D169)
*	[skip ci] rts: Detabify Exception.cmm	Austin Seipp	2014-10-21	1	-45/+45
\| \| \| \|	Signed-off-by: Austin Seipp <austin@well-typed.com>
*	Revert "Rename _closure to _static_closure, apply naming consistently."	Edward Z. Yang	2014-10-20	1	-3/+3
\| \| \| \| \| \| \|	This reverts commit 35672072b4091d6f0031417bc160c568f22d0469. Conflicts: compiler/main/DriverPipeline.hs
*	Rename _closure to _static_closure, apply naming consistently.	Edward Z. Yang	2014-10-01	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: In preparation for indirecting all references to closures, we rename _closure to _static_closure to ensure any old code will get an undefined symbol error. In order to reference a closure foobar_closure (which is now undefined), you should instead use STATIC_CLOSURE(foobar). For convenience, a number of these old identifiers are macro'd. Across C-- and C (Windows and otherwise), there were differing conventions on whether or not foobar_closure or &foobar_closure was the address of the closure. Now, all foobar_closure references are addresses, and no & is necessary. CHARLIKE/INTLIKE were not changed, simply alpha-renamed. Part of remove HEAP_ALLOCED patch set (#8199) Depends on D265 Signed-off-by: Edward Z. Yang <ezyang@mit.edu> Test Plan: validate Reviewers: simonmar, austin Subscribers: simonmar, ezyang, carter, thomie Differential Revision: https://phabricator.haskell.org/D267 GHC Trac Issues: #8199
*	Refer to 'mask' instead of 'block' in Control.Exception	Thomas Miedema	2014-09-25	1	-16/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Summary: More thorough version of a75383cdd46f7bb593639bc6d1628b068b78262a Test Plan: change of comments only [skip ci] Reviewers: austin, simonmar, ekmett Reviewed By: austin, ekmett Subscribers: simonmar, ezyang, carter Differential Revision: https://phabricator.haskell.org/D239
*	Fix copy/paste error (#8937)	Simon Marlow	2014-04-04	1	-1/+1
\|
*	s/excpetions/exceptions/	Edward Z. Yang	2013-10-21	1	-1/+1
\| \| \| \|	Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
*	Remove use of R9, and fix associated bugs	Simon Marlow	2013-10-01	1	-4/+4
\| \| \| \| \| \| \| \| \| \|	We were passing the function address to stg_gc_prim_p in R9, which was wrong because the call was a high-level call and didn't declare R9 as a parameter. Passing R9 as an argument is the right way, but unfortunately that exposed another bug: we were using the same macro in some low-level Cmm, where it is illegal to call functions with arguments (see Note [syntax of cmm files]). So we now have low-level variants of STK_CHK() and STK_CHK_P() for use in low-level Cmm code.
*	ticky enhancements	Nicolas Frisby	2013-03-29	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes * updated the documentation in the wiki and change the user guide to point there * moved the allocation enters for ticky and CCS to after the heap check * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is * standardized all (active?) ticky alloc totals to bytes * in order to avoid double counting StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr * I resurrected the SLOW_CALL counters * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause) * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL * added ALLOC_RTS_ctr and _tot ticky counters * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap__info resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM * added -ticky and -DTICKY_TICKY in ways.mk for debug ways * added a ticky counter for total LNE entries * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE * all off by default * -ticky-allocd: tracks allocation of closure in addition to allocation by that closure * -ticky-dyn-thunk tracks dynamic thunks as if they were functions * -ticky-LNE tracks LNEs as if they were functions * updated the ticky report format, including making the argument categories (more?) accurate again * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
*	Add a write barrier for TVAR closures	Simon Marlow	2012-11-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This improves GC performance when there are a lot of TVars in the heap. For instance, a TChan with a lot of elements causes a massive GC drag without this patch. There's more to do - several other STM closure types don't have write barriers, so GC performance when there are a lot of threads blocked on STM isn't great. But fixing the problem for TVar is a good start.
*	Save R1/R2/.. across foreign calls	Simon Marlow	2012-11-05	1	-6/+15
\|
*	profiling fixes	Simon Marlow	2012-10-19	1	-4/+4
\|
*	Produce new-style Cmm from the Cmm parser	Simon Marlow	2012-10-08	1	-113/+129
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
*	Make profiling work with multiple capabilities (+RTS -N)	Simon Marlow	2011-11-29	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This means that both time and heap profiling work for parallel programs. Main internal changes: - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS. - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE". - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures. This patch does enough to get it working, I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing. It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs).
*	Fix trashing of the masking state in STM (#5238)	Simon Marlow	2011-11-16	1	-18/+21
\|
*	+RTS -xc: print a the closure type of the exception too	Simon Marlow	2011-11-14	1	-1/+3
\|
*	Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints	Simon Marlow	2011-11-02	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	User visible changes ==================== Profilng -------- Flags renamed (the old ones are still accepted for now): OLD NEW --------- ------------ -auto-all -fprof-auto -auto -fprof-exported -caf-all -fprof-cafs New flags: -fprof-auto Annotates all bindings (not just top-level ones) with SCCs -fprof-top Annotates just top-level bindings with SCCs -fprof-exported Annotates just exported bindings with SCCs -fprof-no-count-entries Do not maintain entry counts when profiling (can make profiled code go faster; useful with heap profiling where entry counts are not used) Cost-centre stacks have a new semantics, which should in most cases result in more useful and intuitive profiles. If you find this not to be the case, please let me know. This is the area where I have been experimenting most, and the current solution is probably not the final version, however it does address all the outstanding bugs and seems to be better than GHC 7.2. Stack traces ------------ +RTS -xc now gives more information. If the exception originates from a CAF (as is common, because GHC tends to lift exceptions out to the top-level), then the RTS walks up the stack and reports the stack in the enclosing update frame(s). Result: +RTS -xc is much more useful now - but you still have to compile for profiling to get it. I've played around a little with adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem quite accurately. I plan to add more facilities for stack tracing (e.g. in GHCi) in the future. Coverage (HPC) -------------- * derived instances are now coloured yellow if they weren't used * likewise record field names * entry counts are more accurate (hpc --fun-entry-count) * tab width is now correct (markup was previously off in source with tabs) Internal changes ================ In Core, the Note constructor has been replaced by Tick (Tickish b) (Expr b) which is used to represent all the kinds of source annotation we support: profiling SCCs, HPC ticks, and GHCi breakpoints. Depending on the properties of the Tickish, different transformations apply to Tick. See CoreUtils.mkTick for details. Tickets ======= This commit closes the following tickets, test cases to follow: - Close #2552: not a bug, but the behaviour is now more intuitive (test is T2552) - Close #680 (test is T680) - Close #1531 (test is result001) - Close #949 (test is T949) - Close #2466: test case has bitrotted (doesn't compile against current version of vector-space package)
*	Fix #4988: we were wrongly running exception handlers in the	Simon Marlow	2011-09-01	1	-7/+9
\| \| \| \| \| \| \| \| \| \|	maskUninterruptible state instead of ordinary mask, due to a misinterpretation of the way the TSO_INTERRUPTIBLE flag works. Remarkably this must have been broken for quite some time. Indeed we even had a test that demonstrated the wrong behaviour (conc015a) but presumably I didn't look hard enough at the output to notice that it was wrong.
*	Implement stack chunks and separate TSO/STACK objects	Simon Marlow	2010-12-15	1	-9/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch makes two changes to the way stacks are managed: 1. The stack is now stored in a separate object from the TSO. This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain. This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it. 2. Stacks are now represented as a chain of chunks, rather than a single monolithic object. The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings. A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements. In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before. There are new RTS flags: -ki<size> Sets the initial thread stack size (default 1k) Egs: -ki4k -ki2m -kc<size> Sets the stack chunk size (default 32k) -kb<size> Sets the stack chunk buffer size (default 1k) -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
*	Follow GHC.Bool/GHC.Types merge	Ian Lynagh	2010-10-23	1	-3/+3
\|
*	remove unnecessary stg_noForceIO (#3508)	Simon Marlow	2010-09-24	1	-4/+3
\|
*	New asynchronous exception control API (ghc parts)	Simon Marlow	2010-07-08	1	-45/+111
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As discussed on the libraries/haskell-cafe mailing lists http://www.haskell.org/pipermail/libraries/2010-April/013420.html This is a replacement for block/unblock in the asychronous exceptions API to fix a problem whereby a function could unblock asynchronous exceptions even if called within a blocked context. The new terminology is "mask" rather than "block" (to avoid confusion due to overloaded meanings of the latter). In GHC, we changed the names of some primops: blockAsyncExceptions# -> maskAsyncExceptions# unblockAsyncExceptions# -> unmaskAsyncExceptions# asyncExceptionsBlocked# -> getMaskingState# and added one new primop: maskUninterruptible# See the accompanying patch to libraries/base for the API changes.
*	add a MAYBE_GC() in killThread#, fixes throwto003(threaded2) looping	Simon Marlow	2010-05-05	1	-0/+2
\|
*	Use message-passing to implement throwTo in the RTS	Simon Marlow	2010-03-11	1	-24/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This replaces some complicated locking schemes with message-passing in the implementation of throwTo. The benefits are - previously it was impossible to guarantee that a throwTo from a thread running on one CPU to a thread running on another CPU would be noticed, and we had to rely on the GC to pick up these forgotten exceptions. This no longer happens. - the locking regime is simpler (though the code is about the same size) - threads can be unblocked from a blocked_exceptions queue without having to traverse the whole queue now. It's a rare case, but replaces an O(n) operation with an O(1). - generally we move in the direction of sharing less between Capabilities (aka HECs), which will become important with other changes we have planned. Also in this patch I replaced several STM-specific closure types with a generic MUT_PRIM closure type, which allowed a lot of code in the GC and other places to go away, hence the line-count reduction. The message-passing changes resulted in about a net zero line-count difference.
*	micro-opt: replace stmGetEnclosingTRec() with a field access	Simon Marlow	2009-10-14	1	-2/+2
\| \| \| \| \|	While fixing #3578 I noticed that this function was just a field access to StgTRecHeader, so I inlined it manually.
*	Fix #3429: a tricky race condition	Simon Marlow	2009-08-18	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were two bugs, and had it not been for the first one we would not have noticed the second one, so this is quite fortunate. The first bug is in stg_unblockAsyncExceptionszh_ret, when we found a pending exception to raise, but don't end up raising it, there was a missing adjustment to the stack pointer. The second bug was that this case was actually happening at all: it ought to be incredibly rare, because the pending exception thread would have to be killed between us finding it and attempting to raise the exception. This made me suspicious. It turned out that there was a race condition on the tso->flags field; multiple threads were updating this bitmask field non-atomically (one of the bits is the dirty-bit for the generational GC). The fix is to move the dirty bit into its own field of the TSO, making the TSO one word larger (sadly).
*	Rename primops from foozh_fast to stg_foozh	Simon Marlow	2009-08-03	1	-17/+17
\| \| \| \|	For consistency with other RTS exported symbols
*	rts_stop_on_exception is a C int, not a W_	Simon Marlow	2009-08-03	1	-1/+1
\| \| \| \|	amazing this hasn't caused any problems before now
*	Fix #3279, #3288: fix crash encountered when calling unblock inside ↵	Simon Marlow	2009-06-16	1	-8/+35
\| \| \| \| \| \|	unsafePerformIO See comments for details
*	Make killThread# cmm primop use local stack allocation	Duncan Coutts	2009-06-10	1	-2/+3
\| \| \| \| \| \| \|	It using the mp_tmp_w register/global as a convenient temporary variable. This is naughty because those vars are supposed to be for gmp. Also, we want to remove the gmp temp vars so we must now use a local stack slot instead.
*	Merging in the new codegen branch	dias@eecs.harvard.edu	2008-08-14	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This merge does not turn on the new codegen (which only compiles a select few programs at this point), but it does introduce some changes to the old code generator. The high bits: 1. The Rep Swamp patch is finally here. The highlight is that the representation of types at the machine level has changed. Consequently, this patch contains updates across several back ends. 2. The new Stg -> Cmm path is here, although it appears to have a fair number of bugs lurking. 3. Many improvements along the CmmCPSZ path, including: o stack layout o some code for infotables, half of which is right and half wrong o proc-point splitting
*	Change the calling conventions for unboxed tuples slightly	Simon Marlow	2008-07-28	1	-48/+0
\| \| \| \| \| \| \| \| \| \|	When returning an unboxed tuple with a single non-void component, we now use the same calling convention as for returning a value of the same type as that component. This means that the return convention for IO now doesn't vary depending on the platform, which make some parts of the RTS simpler, and fixes a problem I was having with making the FFI work in unregisterised GHCi (the byte-code compiler makes some assumptions about calling conventions to keep things simple).
*	add new primop: asyncExceptionsBlocked# :: IO Bool	Simon Marlow	2008-07-09	1	-0/+9
\|
*	tso->link is now tso->_link (fix after merge with HEAD)	Simon Marlow	2008-04-17	1	-1/+1
\|
*	Do some stack fiddling in stg_unblockAsyncExceptionszh_ret	Ian Lynagh	2008-05-23	1	-0/+8
\| \| \| \|	This fixes a segfault in #1657
*	Do not #include external header files when compiling via C	Simon Marlow	2008-04-02	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This has several advantages: - -fvia-C is consistent with -fasm with respect to FFI declarations: both bind to the ABI, not the API. - foreign calls can now be inlined freely across module boundaries, since a header file is not required when compiling the call. - bootstrapping via C will be more reliable, because this difference in behavour between the two backends has been removed. There is one disadvantage: - we get no checking by the C compiler that the FFI declaration is correct. So now, the c-includes field in a .cabal file is always ignored by GHC, as are header files specified in an FFI declaration. This was previously the case only for -fasm compilations, now it is also the case for -fvia-C too.
*	Follow library changes	Ian Lynagh	2008-03-23	1	-3/+3
\| \| \| \| \|	Integer, Bool and Unit/Inl/Inr are now in new packages integer and ghc-prim.