path: root/rts/Schedule.c
Commit history, most recent first. Each entry gives the commit message, author, date, files changed, and lines removed/added.
* Revert "Per-thread allocation counters and limits"Simon Marlow2014-05-041-19/+0
| | | | | | | | Problems were found on 32-bit platforms, I'll commit again when I have a fix. This reverts the following commits: 54b31f744848da872c7c6366dea840748e01b5cf b0534f78a73f972e279eed4447a5687bd6a8308e
* Per-thread allocation counters and limitsSimon Marlow2014-05-021-0/+19
| | | | | | | | | | | | | | | | | | | | | | | This tracks the amount of memory allocation by each thread in a counter stored in the TSO. Optionally, when the counter drops below zero (it counts down), the thread can be sent an asynchronous exception: AllocationLimitExceeded. When this happens, given a small additional limit so that it can handle the exception. See documentation in GHC.Conc for more details. Allocation limits are similar to timeouts, but - timeouts use real time, not CPU time. Allocation limits do not count anything while the thread is blocked or in foreign code. - timeouts don't re-trigger if the thread catches the exception, allocation limits do. - timeouts can catch non-allocating loops, if you use -fno-omit-yields. This doesn't work for allocation limits. I couldn't measure any impact on benchmarks with these changes, even for nofib/smp.
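  The mechanism is essentially a per-thread countdown checked at allocation time. A minimal C sketch of the idea follows; the struct, field, and function names here are hypothetical stand-ins, not GHC's actual TSO fields or RTS entry points.

      #include <stdint.h>
      #include <stdbool.h>

      /* Hypothetical per-thread state; in GHC the counter lives in the TSO. */
      typedef struct {
          int64_t alloc_remaining;   /* counts down towards zero            */
          bool    limit_enabled;     /* send an exception when it goes < 0? */
      } ThreadAlloc;

      /* Hypothetical hook standing in for raising AllocationLimitExceeded. */
      static void raise_allocation_limit_exceeded(ThreadAlloc *t) {
          /* Give the thread a little extra headroom, as described above,
             so it can actually run its exception handler. */
          t->alloc_remaining += 100 * 1024;
      }

      /* Called whenever this thread allocates 'bytes'. */
      static void note_allocation(ThreadAlloc *t, int64_t bytes) {
          t->alloc_remaining -= bytes;
          if (t->limit_enabled && t->alloc_remaining < 0) {
              raise_allocation_limit_exceeded(t);
          }
      }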
* rts: Fix typo in comment (Ben Gamari, 2013-10-25; 1 file, -1/+1)
* s/Heep/Heap/ (Edward Z. Yang, 2013-10-03; 1 file, -1/+1)
  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* use a nat, not StgWord8, for gc_type (Simon Marlow, 2013-10-01; 1 file, -1/+1)
* Revert "Default to infinite stack size (#8189)" (Austin Seipp, 2013-09-08; 1 file, -1/+1)
  This reverts commit d85044f6b201eae0a9e453b89c0433608e0778f0.
* Default to infinite stack size (#8189) (Austin Seipp, 2013-09-08; 1 file, -1/+1)
  When servicing a stack overflow, only throw an exception to the given thread if the user explicitly set a max stack size using +RTS -K. Otherwise just service it normally and grow the stack. In case we actually run out of *heap* (stack chunks are allocated on the heap), we need to bail by calling the stackOverflow() hook and exiting immediately.
  Authored-by: Ben Gamari <bgamari.foss@gmail.com>
  Signed-off-by: Austin Seipp <aseipp@pobox.com>
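  The decision described above is a small branch. A hedged C sketch of that logic follows; all names are illustrative stand-ins for the real RTS pieces, not code from Schedule.c.

      #include <stdbool.h>
      #include <stdlib.h>

      /* Illustrative stand-ins for the real RTS state and hooks. */
      static size_t user_max_stack_bytes = 0;   /* the +RTS -K setting; 0 = unset */

      static void throw_overflow_exception(void) { /* async exception to the thread */ }
      static bool grow_stack(void)                { return true; /* false = heap full */ }
      static void stack_overflow_hook(void)       { /* report the failure            */ }

      /* Decision made when a thread overflows its current stack chunk. */
      static void handle_stack_overflow(void) {
          if (user_max_stack_bytes != 0) {
              /* The user explicitly asked for a limit with +RTS -K: enforce it. */
              throw_overflow_exception();
          } else if (!grow_stack()) {
              /* Stack chunks live on the heap; if the heap itself is exhausted,
                 bail out via the hook and exit. */
              stack_overflow_hook();
              exit(2);
          }
          /* Otherwise the stack simply grew and the thread carries on. */
      }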
* Don't move Capabilities in setNumCapabilities (#8209) (Simon Marlow, 2013-09-04; 1 file, -52/+35)
  We have various problems with reallocating the array of Capabilities, due to threads in waitForReturnCapability that are already holding a pointer to a Capability. Rather than add more locking to make this safer, I decided it would be easier to ensure that we never move the Capabilities at all.
  The capabilities array is now an array of pointers to Capability. There are extra indirections, but it rarely matters - we don't often access Capabilities via the array, normally we already have a pointer to one. I ran the parallel benchmarks and didn't see any difference.
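  The data-structure change is from a resizable array of Capability structs to a stable array of pointers. A small sketch of the difference, using a simplified, hypothetical Capability struct:

      #include <stdlib.h>

      typedef struct Capability_ {      /* simplified stand-in for the real struct */
          unsigned int no;
          /* ... lots more per-capability state ... */
      } Capability;

      /* Before: one contiguous array. realloc() can move every Capability,
         invalidating pointers that other threads already hold. */
      Capability *capabilities_old;

      /* After: an array of pointers. Growing the array reallocates only the
         pointer table; each Capability stays where it was allocated. */
      Capability **capabilities;

      static void grow_capabilities(unsigned int from, unsigned int to) {
          capabilities = realloc(capabilities, to * sizeof(Capability *));
          for (unsigned int i = from; i < to; i++) {
              capabilities[i] = calloc(1, sizeof(Capability));
              capabilities[i]->no = i;
          }
      }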
* Implement atomicReadMVar, fixing #4001. (Edward Z. Yang, 2013-07-09; 1 file, -0/+2)
  We add the invariant to the MVar blocked threads queue that threads blocked on an atomic read are always at the front of the queue. This invariant is easy to maintain, since takers are only ever added to the end of the queue.
  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
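  The invariant amounts to "readers enqueue at the front, takers enqueue at the back". A hedged sketch in C using a singly linked queue with head and tail pointers; this is not the actual MVar queue representation.

      #include <stddef.h>
      #include <stdbool.h>

      typedef struct BlockedThread_ {
          bool is_reader;                  /* blocked on an atomic read?   */
          struct BlockedThread_ *next;
      } BlockedThread;

      typedef struct {
          BlockedThread *head;             /* readers cluster here         */
          BlockedThread *tail;             /* takers are appended here     */
      } MVarQueue;

      /* Takers always go on the end of the queue... */
      static void enqueue_taker(MVarQueue *q, BlockedThread *t) {
          t->is_reader = false;
          t->next = NULL;
          if (q->tail) q->tail->next = t; else q->head = t;
          q->tail = t;
      }

      /* ...so putting readers on the front keeps them ahead of every taker,
         which is the invariant the commit message describes. */
      static void enqueue_reader(MVarQueue *q, BlockedThread *t) {
          t->is_reader = true;
          t->next = q->head;
          q->head = t;
          if (!q->tail) q->tail = t;
      }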
* Fix segfault with STM; fixes #8035. Patch from errge. (Ian Lynagh, 2013-07-07; 1 file, -1/+13)
* Ensure gc_type is StgWord8. (Austin Seipp, 2013-06-21; 1 file, -1/+1)
  Again, the range of gc_type is actually 1-3, which is technically outside the range of rtsBool.
  Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Fix bug in setNumCapabilities (Simon Marlow, 2013-02-20; 1 file, -12/+23)
  We were changing n_capabilities after we had released the Capabilities, which led to a range of interesting crashes. This should fix test failures in setnumcapabilities001.
* Simplify some code; patch from Bill Tutt (Ian Lynagh, 2013-02-17; 1 file, -4/+1)
* Build fix for dyn way on Windows; patch from nus (Ian Lynagh, 2013-02-16; 1 file, -1/+1)
* Changed ioManagerCapabilitiesChanged to take no arguments. (Andreas Voellmy, 2013-02-11; 1 file, -3/+1)
  ioManagerCapabilitiesChanged now queries getNumCapabilities for the current number of enabled capabilities.
* setNumCapabilities calls GHC.Conc.IO.ioManagerCapabilitiesChanged before returning. (Andreas Voellmy, 2013-02-11; 1 file, -0/+8)
  This enables the IO manager to change the number of IO loops it uses (usually one per capability).
* Tidy up tso->stackobj before calling threadStackUnderflow (#7636) (Simon Marlow, 2013-02-07; 1 file, -0/+1)
  Fixes the following crash when using STM:
    internal error: threadStackUnderflow: not enough space for return values
* Better abstraction over run queues. (Edward Z. Yang, 2013-01-16; 1 file, -7/+13)
  This adds some new functions: peekRunQueue, promoteInRunQueue, singletonRunQueue and truncateRunQueue, which help abstract away manual linked list manipulation, making it easier to swap in a new queue implementation.
  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
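  A hedged sketch of what a few of these helpers look like over a singly linked run queue; the types are simplified and the real Capability fields and function bodies differ.

      #include <stddef.h>

      typedef struct StgTSO_ {            /* simplified thread state object      */
          struct StgTSO_ *link;
      } StgTSO;

      typedef struct {                    /* simplified per-capability run queue */
          StgTSO *run_queue_hd;
          StgTSO *run_queue_tl;
      } Capability;

      /* Look at the next runnable thread without dequeueing it. */
      static StgTSO *peekRunQueue(Capability *cap) {
          return cap->run_queue_hd;
      }

      /* Make the queue contain exactly one thread. */
      static void singletonRunQueue(Capability *cap, StgTSO *tso) {
          tso->link = NULL;
          cap->run_queue_hd = tso;
          cap->run_queue_tl = tso;
      }

      /* Empty the queue entirely. */
      static void truncateRunQueue(Capability *cap) {
          cap->run_queue_hd = NULL;
          cap->run_queue_tl = NULL;
      }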
* Add a write barrier for TVAR closures (Simon Marlow, 2012-11-16; 1 file, -1/+1)
  This improves GC performance when there are a lot of TVars in the heap. For instance, a TChan with a lot of elements causes a massive GC drag without this patch.
  There's more to do - several other STM closure types don't have write barriers, so GC performance when there are a lot of threads blocked on STM isn't great. But fixing the problem for TVar is a good start.
* delete old comments (Simon Marlow, 2012-10-25; 1 file, -22/+0)
* remove unused sched_shutting_down (Simon Marlow, 2012-10-25; 1 file, -7/+0)
* fix a warning (Simon Marlow, 2012-10-23; 1 file, -2/+2)
* typo (Simon Marlow, 2012-10-22; 1 file, -1/+1)
* Another overhaul of the recent_activity / idle GC handling (#5991) (Simon Marlow, 2012-09-24; 1 file, -4/+12)
  Improvements:
  - we now turn off the timer signal in the non-threaded RTS after idleGCDelay. This should make the xmonad users on #5991 happy.
  - we now turn off the timer signal after idleGCDelay even if the idle GC is disabled with +RTS -I0.
  - we now do *not* turn off the timer when profiling.
  - more comments to explain the meaning of the various ACTIVITY_* values
* Deprecate lnat, and use StgWord instead (Simon Marlow, 2012-09-07; 1 file, -2/+2)
  lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.
  lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
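  The sizing problem is easy to demonstrate. A small C sketch; the typedefs mirror the description above rather than GHC's actual headers.

      #include <stdint.h>
      #include <stdio.h>

      /* What lnat effectively was: "long unsigned int". On Win64, long is 32 bits. */
      typedef unsigned long lnat_like;

      /* What StgWord is meant to be: a word with as many bits as a pointer. */
      typedef uintptr_t stgword_like;

      int main(void) {
          printf("unsigned long : %zu bytes\n", sizeof(lnat_like));
          printf("word-sized    : %zu bytes\n", sizeof(stgword_like));
          /* On LP64 Linux both print 8; on Win64 (LLP64) the first prints 4,
             which is the breakage the commit message describes. */
          return 0;
      }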
* tidy up (Simon Marlow, 2012-08-21; 1 file, -6/+11)
* Fix a bug in the handling of recent_activity (Simon Marlow, 2012-08-07; 1 file, -13/+22)
  The problem occurred when the idle GC was turned off with +RTS -I0. Then the scheduler would go into the state ACTIVITY_DONE_GC directly without doing a GC, and a subsequent GC would put it back to ACTIVITY_YES but without turning the timer back on. Instead, if the GC finds the state is ACTIVITY_DONE_GC it should leave it there.
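  In effect the fix is a guard on the post-GC state transition. A hedged sketch of the idea; the enum values and function names are illustrative, not the exact Schedule.c code.

      typedef enum {
          ACTIVITY_YES,        /* the system is running normally          */
          ACTIVITY_MAYBE_NO,   /* no activity seen since the last tick    */
          ACTIVITY_INACTIVE,   /* idle long enough to consider an idle GC */
          ACTIVITY_DONE_GC     /* idle GC done (or skipped); timer is off */
      } Activity;

      static Activity recent_activity = ACTIVITY_YES;

      static void start_timer_signal(void) { /* re-arm the ticker (stub) */ }

      /* Called after a garbage collection completes. */
      static void after_gc(void) {
          if (recent_activity == ACTIVITY_DONE_GC) {
              /* The timer is off; leave the state alone rather than flipping
                 back to ACTIVITY_YES with no timer running (the original bug). */
              return;
          }
          recent_activity = ACTIVITY_YES;
          start_timer_signal();
      }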
* Merge remote branch 'mikolaj/dcoutts' (Ian Lynagh, 2012-07-14; 1 file, -1/+12)
* Emit the task-tracking events (Duncan Coutts, 2012-07-10; 1 file, -1/+12)
  Based on initial patches by Mikolaj Konarski <mikolaj@well-typed.com>
  Use the new task tracing functions traceTaskCreate/Migrate/Delete. There are two key places. One is for worker tasks, which have a relatively simple life cycle: worker tasks are created and deleted by the RTS. The other case is bound tasks, which are either created by the RTS or appear as foreign C threads making calls into the RTS. For bound threads we do the tracing in rts_lock/unlock, which actually covers both threads coming in from outside and also bound threads made by the RTS.
* The final GC should be a major one (Simon Marlow, 2012-07-10; 1 file, -1/+1)
  We do a final GC before shutting down the system, to clean up. However, we were doing an ordinary GC rather than forcing a major GC, so especially when the allocation area is large, this final GC could be expensive. This is really just a bug - the final GC should have virtually nothing to do, because there is nothing live.
* Merge branch 'master' of http://darcs.haskell.org//ghc (Ian Lynagh, 2012-06-07; 1 file, -1/+1)
* Test USE_MINIINTERPRETER rather than GhcUnregisterised (Ian Lynagh, 2012-05-27; 1 file, -1/+1)
* scheduleYield: avoid doing a GC again if we just did one (Ian Lynagh, 2012-06-07; 1 file, -8/+19)
  If we are interrupted to do a GC, then we do not immediately do another one. This avoids a starvation situation where one Capability keeps forcing a GC and the other Capabilities make no progress at all.
* Fix the timestamps in GC_START and GC_END events on the GC-initiating cap (Mikolaj, 2012-04-04; 1 file, -2/+0)
  There was a discrepancy between GC times reported in +RTS -s and the timestamps of GC_START and GC_END events on the cap on which the +RTS -s stats for the given GC are based. This is fixed by posting the events with exactly the same timestamp as generated for the stat calculation. The calls posting the events are moved too, so that the events are emitted close to the time instant they claim to be emitted at. The GC_STATS_GHC event was moved too, ensuring it's emitted before the moved GC_END on all caps, which simplifies tools code.
* Add eventlog/trace stuff for capabilities: create/delete/enable/disable (Duncan Coutts, 2012-04-04; 1 file, -1/+4)
  Now that we can adjust the number of capabilities on the fly, we need this reflected in the eventlog. Previously the eventlog had a single startup event that declared a static number of capabilities. Obviously that's no good anymore.
  For compatibility we're keeping EVENT_STARTUP but adding new EVENT_CAP_CREATE/DELETE. The EVENT_CAP_DELETE is actually just the old EVENT_SHUTDOWN but renamed and extended (using the existing mechanism to extend eventlog events in a compatible way). So we now emit both EVENT_STARTUP and EVENT_CAP_CREATE. One day we will drop EVENT_STARTUP.
  Since reducing the number of capabilities at runtime does not really delete them, it just disables them, we also have new events for disable/enable.
  The old EVENT_SHUTDOWN was in the scheduler class of events. The new EVENT_CAP_* events are in the unconditional class, along with the EVENT_CAPSET_* ones. Knowing when capabilities are created and deleted is crucial to making sense of eventlogs; you always want those events. In any case, they're extremely low volume.
* Use win32AllocStack on Win64 too (Ian Lynagh, 2012-03-19; 1 file, -1/+1)
* Fixed for unregisterised Windows builds (Ian Lynagh, 2012-03-18; 1 file, -1/+1)
* Another Win64 fix (Ian Lynagh, 2012-03-16; 1 file, -1/+1)
* typo (Gabor Greif, 2012-02-27; 1 file, -1/+1)
* setNumCapabilities: don't barf() if it isn't supported, just print an error (Simon Marlow, 2012-01-06; 1 file, -3/+9)
* Support for reducing the number of Capabilities with setNumCapabilities (Simon Marlow, 2011-12-15; 1 file, -70/+174)
  This patch allows setNumCapabilities to /reduce/ the number of active capabilities as well as increase it. This is particularly tricky to do, because a Capability is a large data structure and ties into the rest of the system in many ways. Trying to clean it all up would be extremely error prone.
  So instead, the solution is to mark the extra capabilities as "disabled". This has the following consequences:
  - threads on a disabled capability are migrated away by the scheduler loop
  - disabled capabilities do not participate in GC (see scheduleDoGC())
  - no spark threads are created on this capability (see scheduleActivateSpark())
  - we do not attempt to migrate threads *to* a disabled capability (see schedulePushWork())
  So a disabled capability should do no work, and does not participate in GC, although it remains alive in other respects. For example, a blocked thread might wake up on a disabled capability, and it will get quickly migrated to a live capability. A disabled capability can still initiate GC if necessary. Indeed, it turns out to be hard to migrate bound threads, so we wait until the next GC to do this (see comments for details).
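  The "disabled" mechanism boils down to a flag that the scheduler loop consults. A hedged sketch with simplified structs and illustrative helper names; the real checks live in Schedule.c as the commit describes.

      #include <stdbool.h>

      typedef struct Capability_ {
          unsigned int no;
          bool disabled;              /* set when setNumCapabilities shrinks N   */
          int  n_run_queue;           /* simplified: how many threads are queued */
      } Capability;

      /* Illustrative stub: push this capability's threads to an enabled one. */
      static void migrate_threads_away(Capability *cap) { cap->n_run_queue = 0; }

      /* One iteration of the scheduler loop on this capability (heavily simplified). */
      static void schedule_one(Capability *cap) {
          if (cap->disabled) {
              /* A disabled capability should do no work: anything that woke up
                 here gets pushed to a live capability, then we go back to idle. */
              if (cap->n_run_queue > 0) {
                  migrate_threads_away(cap);
              }
              return;
          }
          /* ... normal path: take a thread from the run queue and run it ... */
      }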
* New flag +RTS -qi<n>, avoid waking up idle Capabilities to do parallel GC (Simon Marlow, 2011-12-13; 1 file, -2/+68)
  This is an experimental tweak to the parallel GC that avoids waking up a Capability to do parallel GC if we know that the capability has been idle for a (tunable) number of GC cycles. The idea is that if you're only using a few Capabilities, there's no point waking up the ones that aren't busy. e.g. +RTS -qi3 says "A Capability will participate in parallel GC if it was running at all since the last 3 GC cycles."
  Results are a bit hit and miss, and I don't completely understand why yet. Hence, for now it is turned off by default, and also not documented except in the +RTS -? output.
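  The policy amounts to a per-capability idle counter compared against the -qi threshold. A hedged sketch; the field and variable names are illustrative only.

      #include <stdbool.h>

      /* Illustrative stand-ins for RTS state. */
      typedef struct {
          unsigned int idle_gc_cycles;   /* consecutive GCs during which this
                                            capability had nothing to run    */
      } Capability;

      static unsigned int gc_idle_threshold = 0;   /* the +RTS -qi<n> setting;
                                                      0 means the tweak is off */

      /* Decide whether to wake this capability for the next parallel GC. */
      static bool should_join_parallel_gc(const Capability *cap) {
          if (gc_idle_threshold == 0) {
              return true;                          /* feature disabled */
          }
          return cap->idle_gc_cycles < gc_idle_threshold;
      }

      /* Book-keeping after each GC. */
      static void note_gc(Capability *cap, bool was_running) {
          if (was_running) {
              cap->idle_gc_cycles = 0;
          } else {
              cap->idle_gc_cycles++;
          }
      }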
* Allow the number of capabilities to be increased at runtime (#3729) (Simon Marlow, 2011-12-06; 1 file, -29/+143)
  At present the number of capabilities can only be *increased*, not decreased. The latter presents a few more challenges!
* Make forkProcess work with +RTS -N (Simon Marlow, 2011-12-06; 1 file, -86/+182)
  Consider this experimental for the time being. There are a lot of things that could go wrong, but I've verified that at least it works on the test cases we have.
  I also did some API cleanups while I was here. Previously we had:
    Capability * rts_eval (Capability *cap, HaskellObj p, /*out*/HaskellObj *ret);
  but this API is particularly error-prone: if you forget to discard the Capability * you passed in and use the return value instead, then you're in for subtle bugs with +RTS -N later on. So I changed all these functions to this form:
    void rts_eval (/* inout */ Capability **cap,
                   /* in    */ HaskellObj p,
                   /* out   */ HaskellObj *ret)
  It's much harder to use this version incorrectly, because you have to pass the Capability in by reference.
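  A hedged sketch of the calling pattern with the post-change signature quoted above. The declarations are abbreviated stand-ins rather than the real RtsAPI.h, and the surrounding lock/unlock plumbing is simplified.

      /* Abbreviated stand-ins for the RTS API declarations. */
      typedef struct Capability_ Capability;
      typedef void *HaskellObj;

      extern Capability *rts_lock(void);
      extern void        rts_unlock(Capability *cap);
      extern void        rts_eval(/* inout */ Capability **cap,
                                  /* in    */ HaskellObj p,
                                  /* out   */ HaskellObj *ret);

      static void run_closure(HaskellObj closure) {
          Capability *cap = rts_lock();
          HaskellObj result;

          /* Passing &cap lets the RTS hand back a (possibly different)
             Capability in place; there is no stale return value to misuse. */
          rts_eval(&cap, closure, &result);

          rts_unlock(cap);
          /* ... use 'result' ... */
          (void)result;
      }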
* Fix a scheduling bug in the threaded RTS (Simon Marlow, 2011-12-01; 1 file, -7/+13)
  The parallel GC was using setContextSwitches() to stop all the other threads, which sets the context_switch flag on every Capability. That had the side effect of causing every Capability to also switch threads, and since GCs can be much more frequent than context switches, this increased the context switch frequency. When context switches are expensive (because the switch is between two bound threads or a bound and unbound thread), the difference is quite noticeable.
  The fix is to have a separate flag to indicate that a Capability should stop and return to the scheduler, but not switch threads. I've called this the "interrupt" flag.
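  The distinction is just two flags checked at the same yield point. A hedged sketch with simplified, illustrative field names:

      #include <stdbool.h>

      typedef struct {
          volatile bool context_switch;  /* stop AND pick a different thread   */
          volatile bool interrupt;       /* stop and return to the scheduler,
                                            but keep running the same thread   */
      } Capability;

      typedef enum { KEEP_RUNNING, YIELD_SAME_THREAD, YIELD_SWITCH_THREAD } YieldAction;

      /* Checked by a running thread at safe points. */
      static YieldAction check_yield(Capability *cap) {
          if (cap->context_switch) {
              cap->context_switch = false;
              return YIELD_SWITCH_THREAD;   /* e.g. the timer tick fired          */
          }
          if (cap->interrupt) {
              cap->interrupt = false;
              return YIELD_SAME_THREAD;     /* e.g. the parallel GC needs us to stop */
          }
          return KEEP_RUNNING;
      }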
* Make profiling work with multiple capabilities (+RTS -N) (Simon Marlow, 2011-11-29; 1 file, -2/+2)
  This means that both time and heap profiling work for parallel programs. Main internal changes:
  - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS.
  - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE".
  - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures.
  This patch does enough to get it working; I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing.
  It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs.)
* Time handling overhaul (Simon Marlow, 2011-11-25; 1 file, -1/+1)
  Terminology cleanup: the type "Ticks" has been renamed "Time", which is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds). The terminology "tick" is now used consistently to mean the interval between timer signals.
  The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if we have it). Before, it used CPU time in the non-threaded RTS and realtime in the threaded RTS, but I've discovered that the CPU timer has terrible resolution (at least on Linux) and isn't much use for profiling. So now we always use realtime. This should also fix
  The default tick interval is now 10ms, except when profiling where we drop it to 1ms. This gives more accurate profiles without affecting runtime too much (<1%).
  Lots of cleanups - the resolution of Time is now in one place only (Rts.h) rather than having calculations that depend on the resolution scattered all over the RTS. I hope I found them all.
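  The units question is concrete enough for a tiny sketch. The typedef, constant, and helpers below mirror the description (an StgWord64 counting nanoseconds, with resolution-dependent arithmetic kept in one place) but are illustrative, not copied from Rts.h.

      #include <stdint.h>

      typedef uint64_t Time;                     /* StgWord64 in the real RTS      */

      #define TIME_RESOLUTION 1000000000ULL      /* units per second: nanoseconds  */

      /* Keep all resolution-dependent arithmetic behind helpers like these,
         which is the "one place only" point the commit message makes. */
      static inline Time     SecondsToTime(uint64_t s) { return s * TIME_RESOLUTION; }
      static inline Time     MSToTime(uint64_t ms)     { return ms * (TIME_RESOLUTION / 1000); }
      static inline uint64_t TimeToMS(Time t)          { return t / (TIME_RESOLUTION / 1000); }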
* fix occasional failure of numsparks001 test (Simon Marlow, 2011-08-14; 1 file, -5/+14)
  During shutdown we discard all the sparks from each Capability, but we were forgetting to account for the discarded sparks in the stats, leading to a failure of the assertion that tests the spark invariant. I've moved the discarding of sparks to just before the GC, to avoid race conditions, and counted the discarded sparks as GC'd.
* Move the call to heapCensus() into GarbageCollect(), just before calling resurrectThreads() (fixes #5314) (Simon Marlow, 2011-07-20; 1 file, -5/+4)
  This avoids a lot of problems, because resurrectThreads() may overwrite some closures in the heap, leaving slop behind. The bug showed up in several instances; this fix avoids them all in one go.
* Add spark counter tracing (Duncan Coutts, 2011-07-18; 1 file, -0/+2)
  A new eventlog event containing 7 spark counters/statistics: sparks created, dud, overflowed, converted, GC'd, fizzled and remaining. These are maintained and logged separately for each capability. We log them at startup, on each GC (minor and major) and on shutdown.