summaryrefslogtreecommitdiff
path: root/rts/Threads.c
Commit message (Collapse)AuthorAgeFilesLines
* A bunch of typofixesGabor Greif2017-09-261-1/+1
|
* Prefer #if defined to #ifdefBen Gamari2017-04-281-3/+3
| | | | Our new CPP linter enforces this.
* Enable new warning for fragile/incorrect CPP #if usageErik de Castro Lopo2017-04-281-1/+1
| | | | | | | | | | | | | | | | The C code in the RTS now gets built with `-Wundef` and the Haskell code (stages 1 and 2 only) with `-Wcpp-undef`. We now get warnings whereever `#if` is used on undefined identifiers. Test Plan: Validate on Linux and Windows Reviewers: austin, angerman, simonmar, bgamari, Phyx Reviewed By: bgamari Subscribers: thomie, snowleopard Differential Revision: https://phabricator.haskell.org/D3278
* Revert "Enable new warning for fragile/incorrect CPP #if usage"Ben Gamari2017-04-051-1/+1
| | | | | | | | This is causing too much platform dependent breakage at the moment. We will need a more rigorous testing strategy before this can be merged again. This reverts commit 7e340c2bbf4a56959bd1e95cdd1cfdb2b7e537c2.
* Enable new warning for fragile/incorrect CPP #if usageErik de Castro Lopo2017-04-051-1/+1
| | | | | | | | | | | | | | | | The C code in the RTS now gets built with `-Wundef` and the Haskell code (stages 1 and 2 only) with `-Wcpp-undef`. We now get warnings whereever `#if` is used on undefined identifiers. Test Plan: Validate on Linux and Windows Reviewers: austin, angerman, simonmar, bgamari, Phyx Reviewed By: bgamari Subscribers: thomie, snowleopard Differential Revision: https://phabricator.haskell.org/D3278
* Use C99's boolBen Gamari2016-11-291-13/+13
| | | | | | | | | | | | Test Plan: Validate on lots of platforms Reviewers: erikd, simonmar, austin Reviewed By: erikd, simonmar Subscribers: michalt, thomie Differential Revision: https://phabricator.haskell.org/D2699
* Add hs_try_putmvar()Simon Marlow2016-09-121-0/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This is a fast, non-blocking, asynchronous, interface to tryPutMVar that can be called from C/C++. It's useful for callback-based C/C++ APIs: the idea is that the callback invokes hs_try_putmvar(), and the Haskell code waits for the callback to run by blocking in takeMVar. The callback doesn't block - this is often a requirement of callback-based APIs. The callback wakes up the Haskell thread with minimal overhead and no unnecessary context-switches. There are a couple of benchmarks in testsuite/tests/concurrent/should_run. Some example results comparing hs_try_putmvar() with using a standard foreign export: ./hs_try_putmvar003 1 64 16 100 +RTS -s -N4 0.49s ./hs_try_putmvar003 2 64 16 100 +RTS -s -N4 2.30s hs_try_putmvar() is 4x faster for this workload (see the source for hs_try_putmvar003.hs for details of the workload). An alternative solution is to use the IO Manager for this. We've tried it, but there are problems with that approach: * Need to create a new file descriptor for each callback * The IO Manger thread(s) become a bottleneck * More potential for things to go wrong, e.g. throwing an exception in an IO Manager callback kills the IO Manager thread. Test Plan: validate; new unit tests Reviewers: niteria, erikd, ezyang, bgamari, austin, hvr Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2501
* rts: Replace `nat` with `uint32_t`Erik de Castro Lopo2016-05-051-4/+4
| | | | | | | | | | | | The `nat` type was an alias for `unsigned int` with a comment saying it was at least 32 bits. We keep the typedef in case client code is using it but mark it as deprecated. Test Plan: Validated on Linux, OS X and Windows Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20 Differential Revision: https://phabricator.haskell.org/D2166
* rts: mark 'wakeBlockingQueue' as staticSergei Trofimovich2016-02-071-1/+1
| | | | | | | | Noticed by uselex.rb: wakeBlockingQueue: [R]: exported from: ./rts/dist/build/Threads.o Signed-off-by: Sergei Trofimovich <siarheit@google.com>
* Account for stack allocation in the thread's allocation counterSimon Marlow2015-09-141-0/+8
| | | | | | | | | | | | Summary: (see comment for details) Test Plan: validate Reviewers: bgamari, ezyang, austin Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1243
* Do not copy stack after stack overflow, refix #8435Flaviu Andrei Csernik (archblob)2015-06-121-0/+1
| | | | | | | | | | | | | | | | | | Summary: This was reverted in d70b19bfb5ed79b22c2ac31e22f46782fc47a117 and is a part of the reason for #10445. Test Plan: validate Reviewers: ezyang, simonmar, austin Reviewed By: simonmar, austin Subscribers: bgamari, thomie Differential Revision: https://phabricator.haskell.org/D938 GHC Trac Issues: #8435
* Remove comments and flag for GranSimThomas Miedema2015-03-191-2/+0
| | | | | | | | | The GranSim code was removed in dd56e9ab and 297b05a9 in 2009, and perhaps other commits I couldn't find. Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D737
* fix bus errors on SPARC caused by unalignment access to alloc_limit (fixes ↵Karel Gardas2015-02-231-3/+3
| | | | | | | | | | #10043) Reviewers: austin, simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D657
* Per-thread allocation counters and limitsSimon Marlow2014-11-121-48/+29
| | | | | | | | This reverts commit f0fcc41d755876a1b02d1c7c79f57515059f6417. New changes: now works on 32-bit platforms too. I added some basic support for 64-bit subtraction and comparison operations to the x86 NCG.
* [skip ci] rts: Detabify Threads.cAustin Seipp2014-10-211-58/+58
| | | | Signed-off-by: Austin Seipp <austin@well-typed.com>
* Revert "rts: add Emacs 'Local Variables' to every .c file"Simon Marlow2014-09-291-8/+0
| | | | This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
* panic message fixSimon Marlow2014-08-011-1/+1
|
* rts: add Emacs 'Local Variables' to every .c fileAustin Seipp2014-07-281-0/+8
| | | | | | | | This will hopefully help ensure some basic consistency in the forward by overriding buffer variables. In particular, it sets the wrap length, the offset to 4, and turns off tabs. Signed-off-by: Austin Seipp <austin@well-typed.com>
* Revert "Per-thread allocation counters and limits"Simon Marlow2014-05-041-29/+48
| | | | | | | | Problems were found on 32-bit platforms, I'll commit again when I have a fix. This reverts the following commits: 54b31f744848da872c7c6366dea840748e01b5cf b0534f78a73f972e279eed4447a5687bd6a8308e
* Per-thread allocation counters and limitsSimon Marlow2014-05-021-48/+29
| | | | | | | | | | | | | | | | | | | | | | | This tracks the amount of memory allocation by each thread in a counter stored in the TSO. Optionally, when the counter drops below zero (it counts down), the thread can be sent an asynchronous exception: AllocationLimitExceeded. When this happens, given a small additional limit so that it can handle the exception. See documentation in GHC.Conc for more details. Allocation limits are similar to timeouts, but - timeouts use real time, not CPU time. Allocation limits do not count anything while the thread is blocked or in foreign code. - timeouts don't re-trigger if the thread catches the exception, allocation limits do. - timeouts can catch non-allocating loops, if you use -fno-omit-yields. This doesn't work for allocation limits. I couldn't measure any impact on benchmarks with these changes, even for nofib/smp.
* rts: Allow for infinite stack sizeBen Gamari2013-10-251-1/+2
| | | | This is encoded as RtsFlags.GcFlags.maxStkSize == 0.
* Do not copy stack after stack overflow, fixes #8435Edward Z. Yang2013-10-111-0/+1
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* If exceptions are blocked, add stack overflow to blocked exceptions list. ↵Edward Z. Yang2013-10-111-14/+52
| | | | | | Fixes #8303. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Typofix.Edward Z. Yang2013-10-101-1/+1
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Clarify the TSO_SQUEEZED check.Edward Z. Yang2013-10-101-0/+8
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* s/pathalogical/pathological/Edward Z. Yang2013-10-031-1/+1
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Revert "Default to infinite stack size (#8189)"Austin Seipp2013-09-081-8/+2
| | | | This reverts commit d85044f6b201eae0a9e453b89c0433608e0778f0.
* Default to infinite stack size (#8189)Austin Seipp2013-09-081-2/+8
| | | | | | | | | | | | | When servicing a stack overflows, only throw an exception to the given thread if the user explicitly set a max stack size, using +RTS -K. Otherwise just service it normally and grow the stack. In case we actually run out of *heap* (stack chuncks are allocated on the heap), then we need to bail by calling the stackOverflow() hook and exit immediately. Authored-by: Ben Gamari <bgamari.foss@gmail.com> Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Don't move Capabilities in setNumCapabilities (#8209)Simon Marlow2013-09-041-1/+1
| | | | | | | | | | | | | We have various problems with reallocating the array of Capabilities, due to threads in waitForReturnCapability that are already holding a pointer to a Capability. Rather than add more locking to make this safer, I decided it would be easier to ensure that we never move the Capabilities at all. The capabilities array is now an array of pointers to Capabaility. There are extra indirections, but it rarely matters - we don't often access Capabilities via the array, normally we already have a pointer to one. I ran the parallel benchmarks and didn't see any difference.
* Implement atomicReadMVar, fixing #4001.Edward Z. Yang2013-07-091-0/+4
| | | | | | | | | We add the invariant to the MVar blocked threads queue that threads blocked on an atomic read are always at the front of the queue. This invariant is easy to maintain, since takers are only ever added to the end of the queue. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* More accurate cost attribution for stacks. Fixes #7818.Edward Z. Yang2013-04-221-2/+2
| | | | | | | | Previously, stacks were always attributed to CCCS_SYSTEM. Now, we attribute them to the CCS when the stack was allocated. If a stack grows, new stack chunks inherit the CCS of the old stack. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Lots of nat -> StgWord changesSimon Marlow2012-09-071-4/+4
|
* Deprecate lnat, and use StgWord insteadSimon Marlow2012-09-071-4/+4
| | | | | | | | | | | | lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that. lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
* Fix crash with tiny initial stack size (#5993)Simon Marlow2012-04-121-2/+2
|
* updateThunk: minor tweak to avoid an unnecessary call to checkBlockingQueuesSimon Marlow2012-04-121-4/+8
|
* threadStackOverflow: Tweak to stack chunk sizingSimon Marlow2012-03-281-1/+2
| | | | | | | | | If the old stack is only half full, then the next chunk we allocate will be twice as large, to accommodate large requests for stack. In that case, make sure that the chunk we allocate is at least as large as the usual chunk size - there's no point allocating a 2k chunk (double the default initial chunk size of 1k) if in the normal case we would allocate a 32k chunk.
* Fix a bug in threadStackOverflow() (maybe #5214)Simon Marlow2012-03-281-13/+22
| | | | | | | | | | | | | | If we overflow the current stack chunk and copy its entire contents into the next stack chunk, we could end up with two UNDERFLOW_FRAMEs. We had a special case to catch this in the case when the old stack chunk was the last one (ending in STOP_FRAME), but it went wrong for other chunks. I found this bug while poking around in the core dump attached to options and running the nofib suite: imaginary/wheel_sieve2 crashed with +RTS -kc600 -kb300. I don't know if this is the cause of all the symptoms reported in
* Rename the CCCS field of StgTSO so as not to conflict with the CCCS ↵Simon Marlow2012-01-051-1/+1
| | | | | | pseudo-register Needed by #5357
* Time handling overhaulSimon Marlow2011-11-251-5/+7
| | | | | | | | | | | | | | | | | | | | | Terminology cleanup: the type "Ticks" has been renamed "Time", which is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds). The terminology "tick" is now used consistently to mean the interval between timer signals. The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if we have it). Before it used CPU time in the non-threaded RTS and realtime in the threaded RTS, but I've discovered that the CPU timer has terrible resolution (at least on Linux) and isn't much use for profiling. So now we always use realtime. This should also fix The default tick interval is now 10ms, except when profiling where we drop it to 1ms. This gives more accurate profiles without affecting runtime too much (<1%). Lots of cleanups - the resolution of Time is now in one place only (Rts.h) rather than having calculations that depend on the resolution scattered all over the RTS. I hope I found them all.
* GC refactoring and cleanupSimon Marlow2011-02-021-1/+1
| | | | | | | | | Now we keep any partially-full blocks in the gc_thread[] structs after each GC, rather than moving them to the generation. This should give us slightly better locality (though I wasn't able to measure any difference). Also in this patch: better sanity checking with THREADED.
* fix warningSimon Marlow2010-12-161-1/+1
|
* fix for large stack allocationsSimon Marlow2010-12-151-6/+26
|
* Implement stack chunks and separate TSO/STACK objectsSimon Marlow2010-12-151-46/+242
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch makes two changes to the way stacks are managed: 1. The stack is now stored in a separate object from the TSO. This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain. This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it. 2. Stacks are now represented as a chain of chunks, rather than a single monolithic object. The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings. A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements. In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before. There are new RTS flags: -ki<size> Sets the initial thread stack size (default 1k) Egs: -ki4k -ki2m -kc<size> Sets the stack chunk size (default 32k) -kb<size> Sets the stack chunk buffer size (default 1k) -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
* removeThreadFromQueue: stub out the link field before returning (#4813)Simon Marlow2010-12-021-1/+4
|
* commentSimon Marlow2010-10-131-0/+13
|
* Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4Edward Z. Yang2010-09-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | This is patch that adds support for interruptible FFI calls in the form of a new foreign import keyword 'interruptible', which can be used instead of 'safe' or 'unsafe'. Interruptible FFI calls act like safe FFI calls, except that the worker thread they run on may be interrupted. Internally, it replaces BlockedOnCCall_NoUnblockEx with BlockedOnCCall_Interruptible, and changes the behavior of the RTS to not modify the TSO_ flags on the event of an FFI call from a thread that was interruptible. It also modifies the bytecode format for foreign call, adding an extra Word16 to indicate interruptibility. The semantics of interruption vary from platform to platform, but the intent is that any blocking system calls are aborted with an error code. This is most useful for making function calls to system library functions that support interrupting. There is no support for pre-Vista Windows. There is a partner testsuite patch which adds several tests for this functionality.
* Add a couple of missing tests for EAGER_BLACKHOLESimon Marlow2010-08-231-0/+1
| | | | | | | | This was leading to looping and excessive allocation, when the computation should have just blocked on the black hole. Reported by Christian Höner zu Siederdissen <choener@tbi.univie.ac.at> on glasgow-haskell-users.
* Fix for derefing ThreadRelocated TSOs in MVar operationsSimon Marlow2010-04-071-2/+6
|
* don't forget to deRefTSO() in tryWakeupThread()Simon Marlow2010-04-061-1/+2
|
* fix bug in migrateThread()Simon Marlow2010-04-011-1/+1
|