path: root/rts/RaiseAsync.c
Commit message | Author | Age | Files | Lines
* Fix note references and some typos | Gabor Greif | 2017-07-26 | 1 | -1/+1
* Eagerly blackhole AP_STACKs | Ben Gamari | 2017-07-03 | 1 | -0/+1
  This fixes #13615. See the rather lengthy Note [AP_STACKs must be eagerly blackholed] for details.

  Reviewers: simonmar, austin, erikd, dfeuer
  Subscribers: duog, dfeuer, hsyl20, rwbarton, thomie
  GHC Trac Issues: #13615
  Differential Revision: https://phabricator.haskell.org/D3695
* rts: annotate switch/case with '/* fallthrough */' | Sergei Trofimovich | 2017-05-14 | 1 | -0/+1
  Fixes gcc-7.1.0 warnings of form:

      rts/sm/Scav.c:559:9: error: error: this statement may fall through [-Werror=implicit-fallthrough=]
           scavenge_fun_srt(info);
           ^~~~~~~~~~~~~~~~~~~~~~

  Many of the places are indeed non-obvious and some are already annotated by comments.

  Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
* Prefer #if defined to #ifdef | Ben Gamari | 2017-04-28 | 1 | -4/+4
  Our new CPP linter enforces this.
* Spelling only [ci skip] | Gabor Greif | 2017-02-23 | 1 | -1/+1
* Use C99's bool | Ben Gamari | 2016-11-29 | 1 | -11/+11
  Test Plan: Validate on lots of platforms
  Reviewers: erikd, simonmar, austin
  Reviewed By: erikd, simonmar
  Subscribers: michalt, thomie
  Differential Revision: https://phabricator.haskell.org/D2699
* rts: More const correct-ness fixes | Erik de Castro Lopo | 2016-05-18 | 1 | -1/+1
  In addition to more const-correctness fixes this patch fixes an infelicity of the previous const-correctness patch (995cf0f356) which left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter but returning a non-const pointer. Here we restore the original type signature of `UNTAG_CLOSURE` and add a new function `UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure` pointer and uses that wherever possible.

  Test Plan: Validate on Linux, OS X and Windows
  Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi
  Reviewed By: simonmar, trofi
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2231
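The split between `UNTAG_CLOSURE` and `UNTAG_CONST_CLOSURE` described above can be pictured with a minimal sketch. Only the two function names come from the commit message; the `StgClosure` layout, the `TAG_MASK` value and the `tag_closure` helper are simplified assumptions for illustration, not the RTS definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the RTS closure type (assumption). */
typedef struct StgClosure_ {
    const void *info;
} StgClosure;

/* Assumed for illustration: pointer tags occupy the low 3 bits. */
#define TAG_MASK ((uintptr_t)7)

/* Non-const in, non-const out: the restored original signature. */
static inline StgClosure *UNTAG_CLOSURE(StgClosure *p)
{
    return (StgClosure *)((uintptr_t)p & ~TAG_MASK);
}

/* Const in, const out: constness is never silently dropped. */
static inline const StgClosure *UNTAG_CONST_CLOSURE(const StgClosure *p)
{
    return (const StgClosure *)((uintptr_t)p & ~TAG_MASK);
}

/* Hypothetical helper: attach a tag to an untagged, aligned pointer. */
static inline StgClosure *tag_closure(StgClosure *p, uintptr_t tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0 && tag <= TAG_MASK);
    return (StgClosure *)((uintptr_t)p | tag);
}
```

With this shape, code that only inspects a closure can keep its pointer `const` from end to end, while code that mutates the closure uses the non-const variant.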
* rts: Replace `nat` with `uint32_t` | Erik de Castro Lopo | 2016-05-05 | 1 | -6/+6
  The `nat` type was an alias for `unsigned int` with a comment saying it was at least 32 bits. We keep the typedef in case client code is using it but mark it as deprecated.

  Test Plan: Validated on Linux, OS X and Windows
  Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
  Differential Revision: https://phabricator.haskell.org/D2166
* rts: mark 'blockedThrowTo' as static | Sergei Trofimovich | 2016-02-07 | 1 | -0/+3
  Noticed by uselex.rb:

      blockedThrowTo: [R]: exported from: ./rts/dist/build/RaiseAsync.o

  Signed-off-by: Sergei Trofimovich <siarheit@google.com>
* fix EBADF unqueueing in select backend (Trac #10590) | Sergei Trofimovich | 2015-07-07 | 1 | -7/+1
  Alexander found an interesting case:
  1. We have a queue of two waiters in a blocked_queue
  2. first file descriptor changes state to RUNNABLE, second changes to INVALID
  3. awaitEvent function dequeued RUNNABLE thread to a run queue and attempted to dequeue INVALID descriptor to a run queue.

  Unqueueing INVALID fails thusly:

      #3 0x000000000045cf1c in barf (s=0x4c1cb0 "removeThreadFromDeQueue: not found") at rts/RtsMessages.c:42
      #4 0x000000000046848b in removeThreadFromDeQueue (...) at rts/Threads.c:249
      #5 0x000000000049a120 in removeFromQueues (...) at rts/RaiseAsync.c:719
      #6 0x0000000000499502 in throwToSingleThreaded__ (...) at rts/RaiseAsync.c:67
      #7 0x0000000000499555 in throwToSingleThreaded (..) at rts/RaiseAsync.c:75
      #8 0x000000000047c27d in awaitEvent (wait=rtsFalse) at rts/posix/Select.c:415

  The problem here is a throwToSingleThreaded function that tries to unqueue a TSO from blocked_queue, but awaitEvent function leaves blocked_queue in an inconsistent state while it traverses blocked_queue:

      case RTS_FD_IS_READY:
          IF_DEBUG(scheduler,
              debugBelch("Waking up blocked thread %lu\n",
                         (unsigned long)tso->id));
          tso->why_blocked = NotBlocked;
          tso->_link = END_TSO_QUEUE; // Here we break the queue head
          pushOnRunQueue(&MainCapability,tso);
          break;

  Signed-off-by: Sergei Trofimovich <siarheit@google.com>
  Test Plan: tested on a sample from T10590
  Reviewers: austin, bgamari, simonmar
  Reviewed By: bgamari, simonmar
  Subscribers: qnikst, thomie, bgamari
  Differential Revision: https://phabricator.haskell.org/D1024
  GHC Trac Issues: #10590, #4934
* Per-thread allocation counters and limits | Simon Marlow | 2014-11-12 | 1 | -0/+54
  This reverts commit f0fcc41d755876a1b02d1c7c79f57515059f6417.

  New changes: now works on 32-bit platforms too. I added some basic support for 64-bit subtraction and comparison operations to the x86 NCG.
* [skip ci] rts: Detabify RaiseAsync.c | Austin Seipp | 2014-10-21 | 1 | -229/+227
  Signed-off-by: Austin Seipp <austin@well-typed.com>
* Revert "Rename _closure to _static_closure, apply naming consistently." | Edward Z. Yang | 2014-10-20 | 1 | -2/+2
  This reverts commit 35672072b4091d6f0031417bc160c568f22d0469.

  Conflicts:
      compiler/main/DriverPipeline.hs
* Rename _closure to _static_closure, apply naming consistently. | Edward Z. Yang | 2014-10-01 | 1 | -2/+2
  Summary: In preparation for indirecting all references to closures, we rename _closure to _static_closure to ensure any old code will get an undefined symbol error. In order to reference a closure foobar_closure (which is now undefined), you should instead use STATIC_CLOSURE(foobar). For convenience, a number of these old identifiers are macro'd.

  Across C-- and C (Windows and otherwise), there were differing conventions on whether or not foobar_closure or &foobar_closure was the address of the closure. Now, all foobar_closure references are addresses, and no & is necessary.

  CHARLIKE/INTLIKE were not changed, simply alpha-renamed.

  Part of remove HEAP_ALLOCED patch set (#8199)

  Depends on D265

  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
  Test Plan: validate
  Reviewers: simonmar, austin
  Subscribers: simonmar, ezyang, carter, thomie
  Differential Revision: https://phabricator.haskell.org/D267
  GHC Trac Issues: #8199
* Revert "rts: add Emacs 'Local Variables' to every .c file" | Simon Marlow | 2014-09-29 | 1 | -8/+0
  This reverts commit 39b5c1cbd8950755de400933cecca7b8deb4ffcd.
* rts: add Emacs 'Local Variables' to every .c file | Austin Seipp | 2014-07-28 | 1 | -0/+8
  This will hopefully help ensure some basic consistency going forward by overriding buffer variables. In particular, it sets the wrap length, the offset to 4, and turns off tabs.

  Signed-off-by: Austin Seipp <austin@well-typed.com>
* Revert "Per-thread allocation counters and limits" | Simon Marlow | 2014-05-04 | 1 | -54/+0
  Problems were found on 32-bit platforms, I'll commit again when I have a fix.

  This reverts the following commits:
      54b31f744848da872c7c6366dea840748e01b5cf
      b0534f78a73f972e279eed4447a5687bd6a8308e
* Per-thread allocation counters and limits | Simon Marlow | 2014-05-02 | 1 | -0/+54
  This tracks the amount of memory allocation by each thread in a counter stored in the TSO. Optionally, when the counter drops below zero (it counts down), the thread can be sent an asynchronous exception: AllocationLimitExceeded. When this happens, the thread is given a small additional limit so that it can handle the exception. See documentation in GHC.Conc for more details.

  Allocation limits are similar to timeouts, but

  - timeouts use real time, not CPU time. Allocation limits do not count anything while the thread is blocked or in foreign code.
  - timeouts don't re-trigger if the thread catches the exception, allocation limits do.
  - timeouts can catch non-allocating loops, if you use -fno-omit-yields. This doesn't work for allocation limits.

  I couldn't measure any impact on benchmarks with these changes, even for nofib/smp.
* s/excpetions/exceptions/ | Edward Z. Yang | 2013-10-21 | 1 | -1/+1
  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* If exceptions are blocked, add stack overflow to blocked exceptions list. Fixes #8303. | Edward Z. Yang | 2013-10-11 | 1 | -4/+1
  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Implement atomicReadMVar, fixing #4001. | Edward Z. Yang | 2013-07-09 | 1 | -1/+3
  We add the invariant to the MVar blocked threads queue that threads blocked on an atomic read are always at the front of the queue. This invariant is easy to maintain, since takers are only ever added to the end of the queue.

  Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
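The queue invariant above can be illustrated with a small sketch. It assumes readers are pushed at the head while takers are appended at the tail; the `Waiter` and `WaitQueue` types and all function names are hypothetical and are not taken from the RTS.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical blocked-thread entry. */
typedef struct Waiter_ {
    int             thread_id;
    bool            is_reader;  /* blocked on an atomic read */
    struct Waiter_ *next;
} Waiter;

typedef struct {
    Waiter *head;
    Waiter *tail;
} WaitQueue;

/* Readers go to the front (assumed), so they always precede takers. */
static void push_reader(WaitQueue *q, Waiter *w)
{
    assert(q != NULL && w != NULL);
    w->is_reader = true;
    w->next = q->head;
    q->head = w;
    if (q->tail == NULL) q->tail = w;
}

/* Takers are only ever appended to the end of the queue. */
static void append_taker(WaitQueue *q, Waiter *w)
{
    assert(q != NULL && w != NULL);
    w->is_reader = false;
    w->next = NULL;
    if (q->tail == NULL) q->head = w;
    else                 q->tail->next = w;
    q->tail = w;
}

/* Check the invariant: no reader appears after a taker. */
static bool readers_first(const WaitQueue *q)
{
    bool seen_taker = false;
    for (const Waiter *w = q->head; w != NULL; w = w->next) {
        if (!w->is_reader)   seen_taker = true;
        else if (seen_taker) return false;
    }
    return true;
}
```

Since neither operation can place a reader behind a taker, the invariant holds after any interleaving of the two, which is why a writer can wake all readers by walking only the front of the queue.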
* ticky enhancements | Nicolas Frisby | 2013-03-29 | 1 | -3/+3
  * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
  * updated the documentation in the wiki and changed the user guide to point there
  * moved the allocation enters for ticky and CCS to after the heap check
    * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
  * standardized all (active?) ticky alloc totals to bytes
  * in order to avoid double counting StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
  * I resurrected the SLOW_CALL counters
    * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
    * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
  * added ALLOC_RTS_ctr and _tot ticky counters
    * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
    * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
  * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
  * added a ticky counter for total LNE entries
  * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
    * all off by default
    * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure
    * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
    * -ticky-LNE tracks LNEs as if they were functions
  * updated the ticky report format, including making the argument categories (more?) accurate again
  * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
* Produce new-style Cmm from the Cmm parser | Simon Marlow | 2012-10-08 | 1 | -1/+1
  The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:

      foo ( gcptr a, bits32 b )
      {
        if (b > 0) {
           // we can make tail calls passing arguments:
           jump stg_ap_0_fast(a);
        }

        return (x,y);
      }

  More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.

  The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.

      jump %ENTRY_CODE(Sp(0)) [R1];

  Again, more details in Note [Syntax of .cmm files].

  I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.

  Some other changes in this batch:

  - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
  - CmmSink now does constant-folding (should fix #7219)
  - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
  - RET_DYN frames are removed from the RTS, lots of code goes away
  - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Make a function for get_itbl, rather than using a CPP macro | Ian Lynagh | 2012-08-25 | 1 | -1/+1
  This has several advantages:
  * It can be called from gdb
  * There is more type information for the user, and type checking for the compiler
  * Less opportunity for things to go wrong, e.g. due to missing parentheses or repeated execution

  The sizes of the non-debug .o files haven't changed (other than Inlines.o), so I'm pretty sure the compiled code is identical.
* throwTo: unlock the MSG_THROWTO object before returning (#6103) | Simon Marlow | 2012-06-07 | 1 | -2/+8
* raiseAsync: cope with ATOMICALLY_FRAMES inside UPDATE_FRAMES (#5866) | Simon Marlow | 2012-02-27 | 1 | -11/+56
* Fix crash with +RTS -xc (occasional cgrun057(profthreaded) failure) | Simon Marlow | 2012-01-06 | 1 | -1/+1
  Don't try to print a stack trace from raiseAsync() when there's no exception - we might just be deleting the thread, or suspending duplicate work.
* Rename the CCCS field of StgTSO so as not to conflict with the CCCS pseudo-register | Simon Marlow | 2012-01-05 | 1 | -1/+1
  Needed by #5357
* +RTS -xc: print the closure type of the exception too | Simon Marlow | 2011-11-14 | 1 | -1/+1
* Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints | Simon Marlow | 2011-11-02 | 1 | -1/+1
  User visible changes
  ====================

  Profiling
  ---------

  Flags renamed (the old ones are still accepted for now):

      OLD         NEW
      ---------   ------------
      -auto-all   -fprof-auto
      -auto       -fprof-exported
      -caf-all    -fprof-cafs

  New flags:

      -fprof-auto              Annotates all bindings (not just top-level ones) with SCCs
      -fprof-top               Annotates just top-level bindings with SCCs
      -fprof-exported          Annotates just exported bindings with SCCs
      -fprof-no-count-entries  Do not maintain entry counts when profiling (can make profiled code go faster; useful with heap profiling where entry counts are not used)

  Cost-centre stacks have a new semantics, which should in most cases result in more useful and intuitive profiles. If you find this not to be the case, please let me know. This is the area where I have been experimenting most, and the current solution is probably not the final version, however it does address all the outstanding bugs and seems to be better than GHC 7.2.

  Stack traces
  ------------

  +RTS -xc now gives more information. If the exception originates from a CAF (as is common, because GHC tends to lift exceptions out to the top-level), then the RTS walks up the stack and reports the stack in the enclosing update frame(s).

  Result: +RTS -xc is much more useful now - but you still have to compile for profiling to get it. I've played around a little with adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem quite accurately.

  I plan to add more facilities for stack tracing (e.g. in GHCi) in the future.

  Coverage (HPC)
  --------------

  * derived instances are now coloured yellow if they weren't used
  * likewise record field names
  * entry counts are more accurate (hpc --fun-entry-count)
  * tab width is now correct (markup was previously off in source with tabs)

  Internal changes
  ================

  In Core, the Note constructor has been replaced by

      Tick (Tickish b) (Expr b)

  which is used to represent all the kinds of source annotation we support: profiling SCCs, HPC ticks, and GHCi breakpoints.

  Depending on the properties of the Tickish, different transformations apply to Tick. See CoreUtils.mkTick for details.

  Tickets
  =======

  This commit closes the following tickets, test cases to follow:

  - Close #2552: not a bug, but the behaviour is now more intuitive (test is T2552)
  - Close #680 (test is T680)
  - Close #1531 (test is result001)
  - Close #949 (test is T949)
  - Close #2466: test case has bitrotted (doesn't compile against current version of vector-space package)
* GC refactoring and cleanup | Simon Marlow | 2011-02-02 | 1 | -3/+3
  Now we keep any partially-full blocks in the gc_thread[] structs after each GC, rather than moving them to the generation. This should give us slightly better locality (though I wasn't able to measure any difference).

  Also in this patch: better sanity checking with THREADED.
* Implement stack chunks and separate TSO/STACK objects | Simon Marlow | 2010-12-15 | 1 | -55/+90
  This patch makes two changes to the way stacks are managed:

  1. The stack is now stored in a separate object from the TSO.

     This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain.

     This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it.

  2. Stacks are now represented as a chain of chunks, rather than a single monolithic object.

     The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings.

     A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements.

     In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before.

  There are new RTS flags:

      -ki<size>  Sets the initial thread stack size (default 1k)
                 Egs: -ki4k -ki2m
      -kc<size>  Sets the stack chunk size (default 32k)
      -kb<size>  Sets the stack chunk buffer size (default 1k)

  -ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
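The clean/dirty optimisation for stack chunks can be sketched in a few lines. This is a toy model, not the RTS collector: `StackChunk`, `mark_dirty` and `scan_dirty_chunks` are hypothetical names, and the sketch assumes that scanning a chunk promotes everything it references, so the chunk becomes clean afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stack chunk: a stack is a chain of these. */
typedef struct StackChunk_ {
    bool                dirty;  /* may point into the young generation */
    struct StackChunk_ *next;   /* next (older) chunk in the chain */
} StackChunk;

/* Called when the mutator writes a young pointer into a chunk. */
static void mark_dirty(StackChunk *c)
{
    assert(c != NULL);
    c->dirty = true;
}

/* Young-generation collection: visit only dirty chunks and return how
 * many were scanned. Clean chunks are skipped without being traversed. */
static size_t scan_dirty_chunks(StackChunk *top)
{
    size_t scanned = 0;
    for (StackChunk *c = top; c != NULL; c = c->next) {
        if (!c->dirty) continue;
        /* A real collector would walk the chunk's frames here. */
        c->dirty = false;  /* simplifying assumption: all targets promoted */
        scanned++;
    }
    return scanned;
}
```

For a deep stack where only the topmost chunk is being mutated, the work per young-generation collection is proportional to the number of dirty chunks rather than to the full stack depth.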
* throwTo: report the why_blocked value in the barf() | Simon Marlow | 2010-12-03 | 1 | -1/+1
* handle ThreadMigrating in throwTo() (#4811) | Simon Marlow | 2010-12-03 | 1 | -0/+12
  If a throwTo targets a thread that has just been created with forkOnIO, then it is possible the exception strikes while the thread is still in the process of migrating. throwTo() didn't handle this case, but it's fairly straightforward.
* minor refactoring | Simon Marlow | 2010-09-26 | 1 | -21/+19
* Fix for interruptible FFI handling | Simon Marlow | 2010-09-25 | 1 | -8/+0
  Set tso->why_blocked before calling maybePerformBlockedException(), so that throwToSingleThreaded() doesn't try to unblock the current thread (it is already unblocked).
* Don't interrupt when task blocks exceptions, don't immediately start exception. | Edward Z. Yang | 2010-09-25 | 1 | -3/+13
* Interruptible FFI calls with pthread_kill and CancelSynchronousIO. v4 | Edward Z. Yang | 2010-09-19 | 1 | -2/+23
  This is a patch that adds support for interruptible FFI calls in the form of a new foreign import keyword 'interruptible', which can be used instead of 'safe' or 'unsafe'. Interruptible FFI calls act like safe FFI calls, except that the worker thread they run on may be interrupted.

  Internally, it replaces BlockedOnCCall_NoUnblockEx with BlockedOnCCall_Interruptible, and changes the behavior of the RTS to not modify the TSO_ flags on the event of an FFI call from a thread that was interruptible. It also modifies the bytecode format for foreign call, adding an extra Word16 to indicate interruptibility.

  The semantics of interruption vary from platform to platform, but the intent is that any blocking system calls are aborted with an error code. This is most useful for making function calls to system library functions that support interrupting. There is no support for pre-Vista Windows.

  There is a partner testsuite patch which adds several tests for this functionality.
* New asynchronous exception control API (ghc parts) | Simon Marlow | 2010-07-08 | 1 | -3/+6
  As discussed on the libraries/haskell-cafe mailing lists
  http://www.haskell.org/pipermail/libraries/2010-April/013420.html

  This is a replacement for block/unblock in the asynchronous exceptions API to fix a problem whereby a function could unblock asynchronous exceptions even if called within a blocked context.

  The new terminology is "mask" rather than "block" (to avoid confusion due to overloaded meanings of the latter).

  In GHC, we changed the names of some primops:

      blockAsyncExceptions#   -> maskAsyncExceptions#
      unblockAsyncExceptions# -> unmaskAsyncExceptions#
      asyncExceptionsBlocked# -> getMaskingState#

  and added one new primop:

      maskUninterruptible#

  See the accompanying patch to libraries/base for the API changes.
* Don't raise a throwTo when the target is masking and BlockedOnBlackHole | Simon Marlow | 2010-05-05 | 1 | -8/+14
* Fix for derefing ThreadRelocated TSOs in MVar operations | Simon Marlow | 2010-04-07 | 1 | -2/+2
* Change the representation of the MVar blocked queue | Simon Marlow | 2010-04-01 | 1 | -78/+63
  The list of threads blocked on an MVar is now represented as a list of separately allocated objects rather than being linked through the TSOs themselves. This lets us remove a TSO from the list in O(1) time rather than O(n) time, by marking the list object. Removing this linear component fixes some pathological performance cases where many threads were blocked on an MVar and became unreachable simultaneously (nofib/smp/threads007), or when sending an asynchronous exception to a TSO in a long list of threads blocked on an MVar.

  MVar performance has actually improved by a few percent as a result of this change, slightly to my surprise.

  This is the final cleanup in the sequence, which let me remove the old way of waking up threads (unblockOne(), MSG_WAKEUP) in favour of the new way (tryWakeupThread and MSG_TRY_WAKEUP, which is idempotent). It is now the case that only the Capability that owns a TSO may modify its state (well, almost), and this simplifies various things. More of the RTS is based on message-passing between Capabilities now.
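The O(1) removal "by marking the list object" can be sketched as lazy deletion. The types and function names below (`QueueEntry`, `BlockedQueue`, `remove_entry`, `dequeue_live`) are hypothetical and only illustrate the idea; the sketch assumes each blocked thread keeps a pointer to its own queue entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical separately allocated queue entry, one per blocked thread. */
typedef struct QueueEntry_ {
    int                 tso_id;
    bool                removed;  /* tombstone set by remove_entry() */
    struct QueueEntry_ *next;
} QueueEntry;

typedef struct {
    QueueEntry *head;
    QueueEntry *tail;
} BlockedQueue;

static void enqueue(BlockedQueue *q, QueueEntry *e)
{
    assert(q != NULL && e != NULL);
    e->removed = false;
    e->next = NULL;
    if (q->tail == NULL) q->head = e;
    else                 q->tail->next = e;
    q->tail = e;
}

/* O(1): no list traversal, the entry is only marked as dead. */
static void remove_entry(QueueEntry *e)
{
    assert(e != NULL);
    e->removed = true;
}

/* Pop the next live entry, discarding marked ones along the way.
 * Returns NULL when no live entry remains. */
static QueueEntry *dequeue_live(BlockedQueue *q)
{
    while (q->head != NULL) {
        QueueEntry *e = q->head;
        q->head = e->next;
        if (q->head == NULL) q->tail = NULL;
        e->next = NULL;
        if (!e->removed) return e;
    }
    return NULL;
}
```

The cost of unlinking is deferred to the next wakeup, which has to walk past the dead entries anyway, so removing many threads at once no longer costs a traversal per thread.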
* change throwTo to use tryWakeupThread rather than unblockOne | Simon Marlow | 2010-03-29 | 1 | -31/+23
* New implementation of BLACKHOLEs | Simon Marlow | 2010-03-29 | 1 | -106/+69
  This replaces the global blackhole_queue with a clever scheme that enables us to queue up blocked threads on the closure that they are blocked on, while still avoiding atomic instructions in the common case.

  Advantages:

  - gets rid of a locked global data structure and some tricky GC code (replacing it with some per-thread data structures and different tricky GC code :)
  - wakeups are more prompt: parallel/concurrent performance should benefit. I haven't seen anything dramatic in the parallel benchmarks so far, but a couple of threading benchmarks do improve a bit.
  - waking up a thread blocked on a blackhole is now O(1) (e.g. if it is the target of throwTo).
  - less sharing and better separation of Capabilities: communication is done with messages, the data structures are strictly owned by a Capability and cannot be modified except by sending messages.
  - this change will ultimately enable us to do more intelligent scheduling when threads block on each other. This is what started off the whole thing, but it isn't done yet (#3838).

  I'll be documenting all this on the wiki in due course.
* Fix a couple of bugs in the throwTo handling, exposed by conc016(threaded2) | Simon Marlow | 2010-03-11 | 1 | -8/+11
* Use message-passing to implement throwTo in the RTS | Simon Marlow | 2010-03-11 | 1 | -239/+273
  This replaces some complicated locking schemes with message-passing in the implementation of throwTo. The benefits are

  - previously it was impossible to guarantee that a throwTo from a thread running on one CPU to a thread running on another CPU would be noticed, and we had to rely on the GC to pick up these forgotten exceptions. This no longer happens.
  - the locking regime is simpler (though the code is about the same size)
  - threads can be unblocked from a blocked_exceptions queue without having to traverse the whole queue now. It's a rare case, but replaces an O(n) operation with an O(1).
  - generally we move in the direction of sharing less between Capabilities (aka HECs), which will become important with other changes we have planned.

  Also in this patch I replaced several STM-specific closure types with a generic MUT_PRIM closure type, which allowed a lot of code in the GC and other places to go away, hence the line-count reduction. The message-passing changes resulted in about a net zero line-count difference.
* Use local mut lists in UPD_IND(), also clean up Updates.h | Simon Marlow | 2009-12-31 | 1 | -1/+1
* Allow throwTo() to be called without a source thread | Simon Marlow | 2009-12-18 | 1 | -11/+18
  Returns false if the exception could not be thrown because the target thread was running. Not used yet, but might come in handy later.
* add a couple of assertions | Simon Marlow | 2009-11-23 | 1 | -0/+4
* Refactoring only | Simon Marlow | 2009-12-02 | 1 | -1/+1