path: root/rts/sm/MarkWeak.c
Commit message | Author | Age | Files | Lines
* Use C99's bool | Ben Gamari | 2016-11-29 | 1 | -21/+21
  Test Plan: Validate on lots of platforms
  Reviewers: erikd, simonmar, austin
  Reviewed By: erikd, simonmar
  Subscribers: michalt, thomie
  Differential Revision: https://phabricator.haskell.org/D2699
* Fix comments about scavenging WEAK objects | Takano Akio | 2016-05-12 | 1 | -7/+1
  This is a follow-up of D2189. It fixes some comments, deletes a section
  in the User's Guide about the bug, and updates .mailmap as suggested on
  the WorkingConventions wiki page.

  Test Plan: It compiles.
  Reviewers: austin, simonmar, bgamari
  Reviewed By: bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2202
  GHC Trac Issues: #11108
* Handle promotion failures when scavenging a WEAK (#11108) | Takano Akio | 2016-05-11 | 1 | -2/+35
  Previously, we ignored promotion failures when evacuating fields of a
  WEAK object. When a failure happened, this resulted in a WEAK object
  pointing to another object in a younger generation, causing crashes.

  I used the test case from #11746 to check that the fix is working.
  However, I haven't managed to produce a test case that quickly
  reproduces the issue.

  Test Plan: ./validate
  Reviewers: austin, bgamari, simonmar
  Reviewed By: simonmar
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2189
  GHC Trac Issues: #11108
* rts: Replace `nat` with `uint32_t` | Erik de Castro Lopo | 2016-05-05 | 1 | -5/+5
  The `nat` type was an alias for `unsigned int` with a comment saying
  it was at least 32 bits. We keep the typedef in case client code is
  using it but mark it as deprecated.

  Test Plan: Validated on Linux, OS X and Windows
  Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
  Differential Revision: https://phabricator.haskell.org/D2166
* Small simplification (#11777) | Simon Marlow | 2016-04-12 | 1 | -5/+1
  DEAD_WEAK used to have a different layout, see
  d61c623ed6b2d352474a7497a65015dbf6a72e12
* Fix a bug with mallocForeignPtr and finalizers (#10904) | Simon Marlow | 2015-09-24 | 1 | -0/+5
  Summary: See Note [MallocPtr finalizers]

  Test Plan: validate; new test T10904
  Reviewers: ezyang, bgamari, austin, hvr, rwbarton
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1275
* Don't call DEAD_WEAK finalizer again on shutdown (#7170) | Simon Marlow | 2015-06-01 | 1 | -1/+2
  Summary: There's a race condition like this:

  1. A foreign pointer gets promoted to the last generation
  2. It has its finalizer called manually
  3. We start shutting down the runtime in `hs_exit_` from the main thread
  4. A minor GC starts running (`scheduleDoGC`) on one of the threads
  5. The minor GC notices that we're in `SCHED_INTERRUPTING` state and
     advances to `SCHED_SHUTTING_DOWN`
  6. The main thread tries to do major GC (with `scheduleDoGC`), but it
     exits early because we're in `SCHED_SHUTTING_DOWN` state
  7. We end up with a `DEAD_WEAK` left on the list of weak pointers of
     the last generation, because it relied on major GC removing it from
     that list

  This change:

  - Ignores DEAD_WEAK finalizers when shutting down
  - Makes the major GC on shutdown more likely
  - Fixes a bogus assert

  Test Plan: before this diff
  https://ghc.haskell.org/trac/ghc/ticket/7170#comment:5 reproduced and
  after it doesn't
  Reviewers: ezyang, austin, simonmar
  Reviewed By: simonmar
  Subscribers: bgamari, thomie
  Differential Revision: https://phabricator.haskell.org/D921
  GHC Trac Issues: #7170
* Revert "rts: add Emacs 'Local Variables' to every .c file" | Simon Marlow | 2014-09-29 | 1 | -8/+0
  This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
* rts: detabify/dewhitespace sm/MarkWeak.c | Austin Seipp | 2014-08-20 | 1 | -15/+15
  Signed-off-by: Austin Seipp <austin@well-typed.com>
* rts: add Emacs 'Local Variables' to every .c file | Austin Seipp | 2014-07-28 | 1 | -0/+8
  This will hopefully help ensure some basic consistency going forward
  by overriding buffer variables. In particular, it sets the wrap length,
  the offset to 4, and turns off tabs.

  Signed-off-by: Austin Seipp <austin@well-typed.com>
* Per-capability nursery weak pointer lists, fixes #9075 | Edward Z. Yang | 2014-05-29 | 1 | -0/+35
  Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
* Update comment now that we have per-gen weak pointer lists. | Edward Z. Yang | 2014-05-04 | 1 | -4/+2
  Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
* Globally replace "hackage.haskell.org" with "ghc.haskell.org" | Simon Marlow | 2013-10-01 | 1 | -1/+1
* Fix #7970, #2161, unfix #551 | Simon Marlow | 2013-07-02 | 1 | -48/+52
  Establish the reachability of threads before weak pointers. Hence a
  deadlocked thread can keep a weak pointer alive and prevent it from
  being finalized early.

  However, a reference from the finalizer of a weak pointer will no
  longer prevent a thread from being considered deadlocked (#551). To
  keep the thread alive in that situation you need to use a StablePtr.

  See comments on #7970 and in the code for more details.
* Maintain per-generation lists of weak pointers (#7847) | Takano Akio | 2013-06-15 | 1 | -94/+135
* Allow multiple C finalizers to be attached to a Weak# | Takano Akio | 2013-06-15 | 1 | -3/+2
  The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#. This
  new primop mutates an existing Weak# object and adds a new C finalizer
  to it.

  This change removes an invariant in MarkWeak.c, namely that the
  relative order of Weak# objects in the list needs to be preserved
  across GC. This makes it easier to split the list into per-generation
  structures.

  The patch also removes a race condition between two threads calling
  finalizeWeak# on the same WEAK object at the same time.
* Make a function for get_itbl, rather than using a CPP macro | Ian Lynagh | 2012-08-25 | 1 | -2/+2
  This has several advantages:

  - It can be called from gdb
  - There is more type information for the user, and type checking for
    the compiler
  - Less opportunity for things to go wrong, e.g. due to missing
    parentheses or repeated execution

  The sizes of the non-debug .o files haven't changed (other than
  Inlines.o), so I'm pretty sure the compiled code is identical.
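The "repeated execution" hazard named above can be illustrated with a minimal sketch. The types and names below (InfoTable, Closure, CLOSURE_TYPE_MACRO, closure_type_fn) are hypothetical stand-ins invented for this example and are not GHC's definitions; the point is only that a macro evaluates its argument once per textual occurrence, while a function evaluates it exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for RTS types (not GHC's definitions). */
typedef struct { int type; } InfoTable;
typedef struct { const InfoTable *info; } Closure;

/* Macro form: the argument is pasted in textually, so it is evaluated
 * once per occurrence in the expansion (twice here when info is set). */
#define CLOSURE_TYPE_MACRO(c) ((c)->info != NULL ? (c)->info->type : -1)

/* Function form: the argument is evaluated exactly once and the
 * parameter type is checked by the compiler. */
int closure_type_fn(const Closure *c)
{
    return c->info != NULL ? c->info->type : -1;
}

static const InfoTable demo_info = { 7 };
static Closure demo_closure = { &demo_info };
static int fetch_count = 0;

/* An argument expression with a side effect: counts how often it runs. */
Closure *fetch_closure(void)
{
    fetch_count++;
    return &demo_closure;
}

/* Number of times the argument runs when passed to the macro. */
int fetches_via_macro(void)
{
    fetch_count = 0;
    (void)CLOSURE_TYPE_MACRO(fetch_closure());
    return fetch_count;
}

/* Number of times the argument runs when passed to the function. */
int fetches_via_function(void)
{
    fetch_count = 0;
    (void)closure_type_fn(fetch_closure());
    return fetch_count;
}
```

In this sketch the macro version calls fetch_closure() twice and the function version calls it once, which is the kind of silent difference the commit message is guarding against.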
* Retain ordering of finalizers during GC (#7160) | Simon Marlow | 2012-08-21 | 1 | -5/+14
  This came up since the addition of C finalizers, since Haskell
  finalizers are already stored in an explicit list. C finalizers on the
  other hand get a WEAK object each, so in order to run them in the right
  order we have to make sure that list stays in the correct order.

  I hate adding new invariants, but this is the quickest way to fix the
  bug for now. A better way to fix it would be to have a single WEAK
  object with a list of finalizers attached to it, and a primop for
  adding finalizers to the list.
* Refactoring and tidy up | Simon Marlow | 2011-04-11 | 1 | -0/+1
  This is a port of some of the changes from my private local-GC branch
  (which is still in darcs, I haven't converted it to git yet). There
  are a couple of small functional differences in the GC stats: first,
  per-thread GC timings should now be more accurate, and secondly we now
  report average and maximum pause times. e.g. from minimax +RTS -N8 -s:

                                      Tot time (elapsed)  Avg pause  Max pause
    Gen  0      2755 colls,  2754 par   13.16s    0.93s     0.0003s    0.0150s
    Gen  1       769 colls,   769 par    3.71s    0.26s     0.0003s    0.0059s
* A small GC optimisation | Simon Marlow | 2011-02-02 | 1 | -1/+1
  Store the *number* of the destination generation in the Bdescr struct,
  so that in evacuate() we don't have to deref gen to get it. This is
  another improvement ported over from my GC branch.
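A minimal sketch of the idea, assuming a hypothetical block descriptor with a destination pointer and a cached destination number. The struct layout, field names, and helper functions here are invented for illustration and do not reproduce GHC's Bdescr; they only show how caching the number removes one dependent load from a comparison.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified types; field names are assumptions. */
typedef struct Generation {
    uint32_t no;               /* generation number */
} Generation;

typedef struct BlockDescr {
    Generation *dest;          /* generation that objects in this block move to */
    uint16_t    dest_no;       /* cached copy of dest->no */
} BlockDescr;

/* Keep the cached number in sync whenever the destination is set. */
void set_dest(BlockDescr *bd, Generation *g)
{
    bd->dest = g;
    bd->dest_no = (uint16_t)g->no;
}

/* Before: two dependent loads (bd->dest, then dest->no). */
int dest_is_younger_than_slow(const BlockDescr *bd, uint32_t gen_no)
{
    return bd->dest->no < gen_no;
}

/* After: a single load from the block descriptor itself. */
int dest_is_younger_than_fast(const BlockDescr *bd, uint32_t gen_no)
{
    return bd->dest_no < gen_no;
}

/* Small demo: both forms give the same answer for a block destined for gen 1. */
int demo_forms_agree(void)
{
    Generation g1 = { 1 };
    BlockDescr bd;
    set_dest(&bd, &g1);
    return dest_is_younger_than_slow(&bd, 2) == dest_is_younger_than_fast(&bd, 2)
        && dest_is_younger_than_slow(&bd, 1) == dest_is_younger_than_fast(&bd, 1);
}
```

The trade-off in a design like this is a slightly larger descriptor and the need to update both fields together, in exchange for a cheaper check on a hot path.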
* Implement stack chunks and separate TSO/STACK objects | Simon Marlow | 2010-12-15 | 1 | -6/+0
  This patch makes two changes to the way stacks are managed:

  1. The stack is now stored in a separate object from the TSO.

     This means that it is easier to replace the stack object for a
     thread when the stack overflows or underflows; we don't have to
     leave behind the old TSO as an indirection any more. Consequently,
     we can remove ThreadRelocated and deRefTSO(), which were a pain.

     This is obviously the right thing, but the last time I tried to do
     it it made performance worse. This time I seem to have cracked it.

  2. Stacks are now represented as a chain of chunks, rather than a
     single monolithic object.

     The big advantage here is that individual chunks are marked clean
     or dirty according to whether they contain pointers to the young
     generation, and the GC can avoid traversing clean stack chunks
     during a young-generation collection. This means that programs with
     deep stacks will see a big saving in GC overhead when using the
     default GC settings.

     A secondary advantage is that there is much less copying involved
     as the stack grows. Programs that quickly grow a deep stack will
     see big improvements.

     In some ways the implementation is simpler, as nothing special
     needs to be done to reclaim stack as the stack shrinks (the GC just
     recovers the dead stack chunks). On the other hand, we have to
     manage stack underflow between chunks, so there's a new stack frame
     (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects.
     The total amount of code is probably about the same as before.

  There are new RTS flags:

    -ki<size>  Sets the initial thread stack size (default 1k)
               Egs: -ki4k -ki2m
    -kc<size>  Sets the stack chunk size (default 32k)
    -kb<size>  Sets the stack chunk buffer size (default 1k)

  -ki was previously called just -k, and the old name is still accepted
  for backwards compatibility. These new options are documented.
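The clean/dirty chunk idea in point 2 can be sketched as follows. StackChunk, its fields, and frames_scanned are hypothetical names made up for this illustration (the real STACK object is more involved); the sketch only shows that a young-generation collection can skip chunks not marked dirty, while an old-generation collection visits every chunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stack chunk: a link to the next (older) chunk, a dirty
 * flag, and a frame count standing in for the chunk's contents. */
typedef struct StackChunk {
    struct StackChunk *next;   /* next chunk down the stack, or NULL */
    int dirty;                 /* nonzero if it may point into the young generation */
    int frames;                /* number of frames stored in this chunk */
} StackChunk;

/* Count the frames a collection would have to traverse. A young-only
 * collection skips clean chunks; an old-generation collection does not. */
int frames_scanned(const StackChunk *top, int collecting_old_gen)
{
    int scanned = 0;
    for (const StackChunk *c = top; c != NULL; c = c->next) {
        if (collecting_old_gen || c->dirty) {
            scanned += c->frames;
        }
    }
    return scanned;
}

/* Demo: a three-chunk stack where only the topmost chunk is dirty. */
int demo_frames_scanned(int collecting_old_gen)
{
    StackChunk bottom = { NULL,    0, 30 };
    StackChunk middle = { &bottom, 0, 20 };
    StackChunk top    = { &middle, 1, 10 };
    return frames_scanned(&top, collecting_old_gen);
}
```

With this layout a young-generation pass touches 10 frames instead of 60, which is the saving for deep stacks that the commit message describes.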
* Don't interrupt when task blocks exceptions, don't immediately start exception. | Edward Z. Yang | 2010-09-25 | 1 | -1/+2
* Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4 | Edward Z. Yang | 2010-09-19 | 1 | -1/+3
  This is a patch that adds support for interruptible FFI calls in the
  form of a new foreign import keyword 'interruptible', which can be
  used instead of 'safe' or 'unsafe'. Interruptible FFI calls act like
  safe FFI calls, except that the worker thread they run on may be
  interrupted.

  Internally, it replaces BlockedOnCCall_NoUnblockEx with
  BlockedOnCCall_Interruptible, and changes the behavior of the RTS to
  not modify the TSO_ flags on the event of an FFI call from a thread
  that was interruptible. It also modifies the bytecode format for
  foreign call, adding an extra Word16 to indicate interruptibility.

  The semantics of interruption vary from platform to platform, but the
  intent is that any blocking system calls are aborted with an error
  code. This is most useful for making function calls to system library
  functions that support interrupting. There is no support for pre-Vista
  Windows.

  There is a partner testsuite patch which adds several tests for this
  functionality.
* New implementation of BLACKHOLEs | Simon Marlow | 2010-03-29 | 1 | -58/+0
  This replaces the global blackhole_queue with a clever scheme that
  enables us to queue up blocked threads on the closure that they are
  blocked on, while still avoiding atomic instructions in the common
  case.

  Advantages:

  - gets rid of a locked global data structure and some tricky GC code
    (replacing it with some per-thread data structures and different
    tricky GC code :)

  - wakeups are more prompt: parallel/concurrent performance should
    benefit. I haven't seen anything dramatic in the parallel benchmarks
    so far, but a couple of threading benchmarks do improve a bit.

  - waking up a thread blocked on a blackhole is now O(1) (e.g. if it
    is the target of throwTo).

  - less sharing and better separation of Capabilities: communication
    is done with messages, the data structures are strictly owned by a
    Capability and cannot be modified except by sending messages.

  - this change will ultimately enable us to do more intelligent
    scheduling when threads block on each other. This is what started
    off the whole thing, but it isn't done yet (#3838).

  I'll be documenting all this on the wiki in due course.
* Fix an assertion that was not safe when running in parallel | Simon Marlow | 2010-03-25 | 1 | -3/+12
* Use message-passing to implement throwTo in the RTS | Simon Marlow | 2010-03-11 | 1 | -33/+6
  This replaces some complicated locking schemes with message-passing in
  the implementation of throwTo. The benefits are

  - previously it was impossible to guarantee that a throwTo from a
    thread running on one CPU to a thread running on another CPU would
    be noticed, and we had to rely on the GC to pick up these forgotten
    exceptions. This no longer happens.

  - the locking regime is simpler (though the code is about the same
    size)

  - threads can be unblocked from a blocked_exceptions queue without
    having to traverse the whole queue now. It's a rare case, but
    replaces an O(n) operation with an O(1).

  - generally we move in the direction of sharing less between
    Capabilities (aka HECs), which will become important with other
    changes we have planned.

  Also in this patch I replaced several STM-specific closure types with
  a generic MUT_PRIM closure type, which allowed a lot of code in the GC
  and other places to go away, hence the line-count reduction. The
  message-passing changes resulted in about a net zero line-count
  difference.
* simplify weak pointer processing | Simon Marlow | 2009-12-08 | 1 | -22/+15
* GC refactoring, remove "steps" | Simon Marlow | 2009-12-03 | 1 | -30/+23
  The GC had a two-level structure, G generations each of T steps. Steps
  are for aging within a generation, mostly to avoid premature
  promotion.

  Measurements show that more than 2 steps is almost never worthwhile,
  and 1 step is usually worse than 2. In theory fractional steps are
  possible, so the ideal number of steps is somewhere between 1 and 3.
  GHC's default has always been 2.

  We can implement 2 steps quite straightforwardly by having each block
  point to the generation to which objects in that block should be
  promoted, so blocks in the nursery point to generation 0, and blocks
  in gen 0 point to gen 1, and so on.

  This commit removes the explicit step structures, merging generations
  with steps, thus simplifying a lot of code. Performance is unaffected.
  The tunable number of steps is now gone, although it may be replaced
  in the future by a way to tune the aging in generation 0.
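The per-block destination scheme described above can be sketched in a few lines. Everything here (Generation, Block, the `to` field, the fixed three generations, and the choice that the oldest generation promotes to itself) is an assumption made for illustration rather than the RTS's actual layout.

```c
#include <assert.h>
#include <stddef.h>

enum { N_GENS = 3 };   /* assumed number of generations for the demo */

typedef struct Generation {
    unsigned no;                 /* generation number */
    struct Generation *to;       /* where survivors of this generation go */
} Generation;

/* A block only needs to remember where its live objects are copied to. */
typedef struct Block {
    Generation *dest;
} Block;

static Generation gens[N_GENS];

/* gen i promotes to gen i+1; the oldest generation promotes to itself
 * (an assumption of this sketch). */
void init_generations(void)
{
    for (unsigned i = 0; i < N_GENS; i++) {
        gens[i].no = i;
        gens[i].to = (i + 1 < N_GENS) ? &gens[i + 1] : &gens[i];
    }
}

/* Nursery blocks point at generation 0. */
Block new_nursery_block(void)
{
    Block b = { &gens[0] };
    return b;
}

/* Blocks belonging to generation g point at g's promotion target. */
Block new_block_in(Generation *g)
{
    Block b = { g->to };
    return b;
}

/* Generation an object lives in after surviving `collections` GCs
 * (collections >= 1), starting from a nursery block. */
unsigned gen_after_survivals(unsigned collections)
{
    init_generations();
    Block b = new_nursery_block();
    Generation *g = b.dest;
    for (unsigned i = 0; i < collections; i++) {
        g = b.dest;            /* the object is copied to the block's destination */
        b = new_block_in(g);   /* and now sits in a block owned by that generation */
    }
    return g->no;
}
```

An object allocated in the nursery therefore survives one collection in gen 0 before moving on to gen 1, which gives the aging effect without any explicit step structure.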
* Make allocatePinned use local storage, and other refactorings | Simon Marlow | 2009-12-01 | 1 | -101/+123
  This is a batch of refactoring to remove some of the GC's global
  state, as we move towards CPU-local GC.

  - allocateLocal() now allocates large objects into the local nursery,
    rather than taking a global lock and allocating them in gen 0
    step 0.

  - allocatePinned() was still allocating from global storage and
    taking a lock each time, now it uses local storage.
    (mallocForeignPtrBytes should be faster with -threaded).

  - We had a gen 0 step 0, distinct from the nurseries, which are
    stored in a separate nurseries[] array. This is slightly strange. I
    removed the g0s0 global that pointed to gen 0 step 0, and removed
    all uses of it. I think now we don't use gen 0 step 0 at all, except
    possibly when there is only one generation. Possibly more tidying up
    is needed here.

  - I removed the global allocate() function, and renamed
    allocateLocal() to allocate().

  - the alloc_blocks global is gone. MAYBE_GC() and doYouWantToGC() now
    check the local nursery only.
* RTS tidyup sweep, first phase | Simon Marlow | 2009-08-02 | 1 | -1/+3
  The first phase of this tidyup is focussed on the header files, and in
  particular making sure we are exposing publicly exactly what we need
  to, and no more.

  - Rts.h now includes everything that the RTS exposes publicly, rather
    than a random subset of it.

  - Most of the public header files have moved into subdirectories, and
    many of them have been renamed. But clients should not need to
    include any of the other headers directly, just #include the main
    public headers: Rts.h, HsFFI.h, RtsAPI.h.

  - All the headers needed for via-C compilation have moved into the
    stg subdirectory, which is self-contained. Most of the headers for
    the rest of the RTS APIs have moved into the rts subdirectory.

  - I left MachDeps.h where it is, because it is so widely used in
    Haskell code.

  - I left a deprecated stub for RtsFlags.h in place. The flag
    structures are now exposed by Rts.h.

  - Various internal APIs are no longer exposed by public header files.

  - Various bits of dead code and declarations have been removed

  - More gcc warnings are turned on, and the RTS code is more
    warning-clean.

  - More source files #include "PosixSource.h", and hence only use
    standard POSIX (1003.1c-1995) interfaces.

  There is a lot more tidying up still to do, this is just the first
  pass. I also intend to standardise the names for external RTS APIs
  (e.g. use the rts_ prefix consistently), and declare the internal APIs
  as hidden for shared libraries.
* Fix #2637: conc032(threaded2) failure | Simon Marlow | 2008-10-01 | 1 | -38/+52
  There was a race condition whereby a thread doing throwTo could be
  blocked on a thread that had finished, and the GC would detect this as
  a deadlock rather than raising the pending exception. We can't close
  the race, but we can make the right thing happen when the GC runs
  later.
* small bugfix in traverseBlackHoleQueue() | Simon Marlow | 2008-09-09 | 1 | -1/+5
* remove EVACUATED: store the forwarding pointer in the info pointer | Simon Marlow | 2008-04-17 | 1 | -6/+10
* Don't look at all the threads before each GC. | Simon Marlow | 2008-04-16 | 1 | -8/+26
  We were looking at all the threads for 2 reasons:

  1. to catch transactions that might be looping as a result of seeing
     an inconsistent view of memory.
  2. to catch threads with blocked exceptions that are themselves
     blocked.

  For (1) we now check for this case whenever a thread yields, and for
  (2) we catch these threads in the GC itself and send the exceptions
  after GC (see performPendingThrowTos).
* Don't traverse the entire list of threads on every GC (phase 1) | Simon Marlow | 2008-04-16 | 1 | -56/+72
  Instead of keeping a single list of all threads, keep one per step and
  only look at the threads belonging to steps that we are collecting.
* bugfix for traverseBlackHoleQueue | Simon Marlow | 2008-04-16 | 1 | -3/+2
* Add a write barrier to the TSO link field (#1589) | Simon Marlow | 2008-04-16 | 1 | -4/+6
* update copyrights in rts/sm | Simon Marlow | 2008-04-16 | 1 | -1/+1
* Reorganisation to fix problems related to the gct register variable | Simon Marlow | 2008-04-16 | 1 | -0/+1
  - GCAux.c contains code not compiled with the gct register enabled,
    it is callable from outside the GC
  - marking functions are moved to their relevant subsystems, outside
    the GC
  - mark_root needs to save the gct register, as it is called from
    outside the GC
* GC refactoring: change evac_gen to evac_step | Simon Marlow | 2007-10-31 | 1 | -1/+1
  By establishing an ordering on step pointers, we can simplify the test
  (stp->gen_no < evac_gen) to (stp < evac_step) which is common in
  evacuate().
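One way such an ordering could arise is sketched below, under the assumption (not stated in the commit message) that all steps live in a single array laid out in generation order. The Step type, array, and constants are hypothetical; the sketch checks that comparing addresses then gives the same answer as comparing generation numbers.

```c
#include <assert.h>

/* Assumed layout for this sketch: every step sits in one array, sorted
 * by generation, with a fixed number of steps per generation. */
enum { N_GENS = 3, STEPS_PER_GEN = 2, N_STEPS = N_GENS * STEPS_PER_GEN };

typedef struct Step {
    unsigned gen_no;   /* generation this step belongs to */
} Step;

static Step steps[N_STEPS];

void init_steps(void)
{
    for (unsigned i = 0; i < N_STEPS; i++) {
        steps[i].gen_no = i / STEPS_PER_GEN;
    }
}

/* Original form of the test: load a field, compare numbers. */
int below_by_field(const Step *stp, unsigned evac_gen)
{
    return stp->gen_no < evac_gen;
}

/* Simplified form: compare addresses within the same array. */
int below_by_pointer(const Step *stp, const Step *evac_step)
{
    return stp < evac_step;
}

/* Check that both forms agree for every step and every target
 * generation, where evac_step is the first step of evac_gen. */
int tests_agree(void)
{
    init_steps();
    for (unsigned g = 0; g < N_GENS; g++) {
        const Step *evac_step = &steps[g * STEPS_PER_GEN];
        for (unsigned s = 0; s < N_STEPS; s++) {
            if (below_by_field(&steps[s], g) !=
                below_by_pointer(&steps[s], evac_step)) {
                return 0;
            }
        }
    }
    return 1;
}
```

Relational comparison of pointers is only well defined in C when both point into the same array, which is why the single-array assumption matters for this sketch.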
* GC refactoring: make evacuate() take an StgClosure** | Simon Marlow | 2007-10-31 | 1 | -7/+10
  Change the type of evacuate() from

    StgClosure *evacuate(StgClosure *);

  to

    void evacuate(StgClosure **);

  So evacuate() itself writes the source pointer, rather than the
  caller. This is slightly cleaner, and avoids a few memory writes:
  sometimes evacuate() doesn't move the object, and in these cases the
  source pointer doesn't need to be written. It doesn't have a
  measurable impact on performance, though.
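The difference between the two calling conventions can be sketched with a toy closure type. The Closure struct and its `forward` field are hypothetical simplifications (a non-NULL `forward` stands for "this object has been moved"); neither function reproduces the real evacuate().

```c
#include <assert.h>
#include <stddef.h>

/* Toy closure: `forward` is non-NULL when the object has been moved. */
typedef struct Closure {
    struct Closure *forward;
} Closure;

/* Old style: return the new address; the caller always writes it back. */
Closure *evacuate_returning(Closure *p)
{
    return p->forward != NULL ? p->forward : p;
}

/* New style: update the caller's pointer in place, and only when the
 * object actually moved, so the unmoved case performs no write. */
void evacuate_in_place(Closure **p)
{
    Closure *q = *p;
    if (q->forward != NULL) {
        *p = q->forward;
    }
}

/* Demo: both styles leave the fields pointing at the same objects. */
int styles_agree(void)
{
    Closure copy   = { NULL };
    Closure moved  = { &copy };   /* already forwarded to `copy` */
    Closure stayed = { NULL };    /* not moved */

    Closure *f1 = &moved,  *f2 = &moved;
    Closure *f3 = &stayed, *f4 = &stayed;

    f1 = evacuate_returning(f1);  /* caller writes the result */
    evacuate_in_place(&f2);       /* callee writes the result */
    f3 = evacuate_returning(f3);  /* redundant write of the same value */
    evacuate_in_place(&f4);       /* no write at all */

    return f1 == &copy && f2 == &copy && f3 == &stayed && f4 == &stayed;
}
```

In the returning style the caller stores the result even when nothing changed; the in-place style lets the callee skip that store, which matches the saving the commit message mentions.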
* Refactoring of the GC in preparation for parallel GC | Simon Marlow | 2007-10-31 | 1 | -1/+1
  This patch localises the state of the GC into a gc_thread structure,
  and reorganises the inner loop of the GC to scavenge one block at a
  time from global work lists in each "step". The gc_thread structure
  has a "workspace" for each step, in which it collects evacuated
  objects until it has a full block to push out to the step's global
  list. Details of the algorithm will be on the wiki in due course.

  At the moment, THREADED_RTS does not compile, but the single-threaded
  GC works (and is 10-20% slower than before).
* MERGE: Fix bug exposed by conc052. | Simon Marlow | 2007-04-04 | 1 | -5/+14
  A thread that was blocked on a blackhole but can now be woken up could
  possibly be treated as unreachable by the GC, and sent the
  NonTermination exception. This can give rise to spurious <<loop>>s in
  concurrent programs, so it's a good one to fix.
* copyright updates and add Commentary links | Simon Marlow | 2006-10-26 | 1 | -0/+5
* Split GC.c, and move storage manager into sm/ directory | Simon Marlow | 2006-10-24 | 1 | -0/+325
  In preparation for parallel GC, split up the monolithic GC.c file into
  smaller parts. Also in this patch (and difficult to separate,
  unfortunately):

  - Don't include Stable.h in Rts.h, instead just include it where
    necessary.

  - consistently use STATIC_INLINE in source files, and INLINE_HEADER in
    header files. STATIC_INLINE is now turned off when DEBUG is on, to
    make debugging easier.

  - The GC no longer takes the get_roots function as an argument. We
    weren't making use of this generalisation.