path: root/rts/sm
Commit message [Author, Age, Files, Lines]
* use StgWord not StgWord8 for wakeup [Simon Marlow, 2013-10-01, 1 file, -1/+1]
|   volatile StgWord8 is not guaranteed to be atomic.
* Fix build when PROF_SPIN is unset [Patrick Palka, 2013-09-23, 1 file, -0/+2]
|   whitehole_spin is only defined when PROF_SPIN is set.
* Revert "Default to infinite stack size (#8189)" [Austin Seipp, 2013-09-08, 1 file, -39/+16]
|   This reverts commit d85044f6b201eae0a9e453b89c0433608e0778f0.
* Default to infinite stack size (#8189) [Austin Seipp, 2013-09-08, 1 file, -16/+39]
|   When servicing a stack overflow, only throw an exception to the given thread if the user explicitly set a max stack size using +RTS -K. Otherwise just service it normally and grow the stack.
|
|   In case we actually run out of *heap* (stack chunks are allocated on the heap), we need to bail out by calling the stackOverflow() hook and exiting immediately.
|
|   Authored-by: Ben Gamari <bgamari.foss@gmail.com>
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Don't move Capabilities in setNumCapabilities (#8209) [Simon Marlow, 2013-09-04, 4 files, -41/+35]
|   We have various problems with reallocating the array of Capabilities, due to threads in waitForReturnCapability that are already holding a pointer to a Capability. Rather than add more locking to make this safer, I decided it would be easier to ensure that we never move the Capabilities at all.
|
|   The capabilities array is now an array of pointers to Capability. There are extra indirections, but it rarely matters - we don't often access Capabilities via the array, normally we already have a pointer to one. I ran the parallel benchmarks and didn't see any difference.
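Illustrative note (not part of the commit): the idea of keeping Capabilities at fixed addresses can be sketched in C as below. This is a minimal sketch under stated assumptions, not the RTS code: the `Capability` struct is reduced to one field, and `set_num_capabilities` is a hypothetical helper that only grows the array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the RTS Capability structure. */
typedef struct { unsigned int no; } Capability;

/* An array of pointers: the array itself may be reallocated and moved,
 * but each Capability it points to stays at the same address. */
static Capability **capabilities = NULL;
static unsigned int n_capabilities = 0;

/* Grow the array to new_n entries. Pointers to existing Capabilities
 * that other threads already hold remain valid afterwards. */
static void set_num_capabilities(unsigned int new_n)
{
    if (new_n <= n_capabilities) return;   /* this sketch only grows */

    Capability **grown = realloc(capabilities, new_n * sizeof *grown);
    assert(grown != NULL);

    for (unsigned int i = n_capabilities; i < new_n; i++) {
        grown[i] = malloc(sizeof **grown);
        assert(grown[i] != NULL);
        grown[i]->no = i;
    }
    capabilities = grown;
    n_capabilities = new_n;
}
```

A caller that saved `capabilities[1]` before a later `set_num_capabilities(64)` can keep using the saved pointer, at the cost of one extra indirection when going through the array.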
* Merge branch 'master' into atomics [Ryan Newton, 2013-08-31, 1 file, -0/+5]
|\
| * Really unload object code when it is safe to do so (#8039) [Simon Marlow, 2013-08-22, 1 file, -0/+5]
| |   The next major GC after an unloadObj() will do a traversal of the heap to determine whether the object code can be removed from memory or not. We'll keep doing these until it is safe to remove the object code.
| |
| |   In my experiments with GHCi, the objects get unloaded immediately, which is a good sign: we're not accidentally holding on to any references anywhere in the GHC data structures.
| |
| |   Changes relative to the patch earlier posted on the ticket:
| |
| |   - fix two memory leaks discovered with Valgrind, after testing with tests/rts/linker_unload.c
* | Eliminate atomic_inc_by and instead modify atomic_inc. [Ryan Newton, 2013-08-21, 1 file, -1/+1]
|/
* Implement atomicReadMVar, fixing #4001. [Edward Z. Yang, 2013-07-09, 3 files, -0/+3]
|   We add the invariant to the MVar blocked threads queue that threads blocked on an atomic read are always at the front of the queue. This invariant is easy to maintain, since takers are only ever added to the end of the queue.
|
|   Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
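Illustrative note (not part of the commit): the queue invariant can be sketched in C as below. The types and the `enqueue` helper are hypothetical, and putting readers at the head of the list is an assumption about how the invariant could be maintained, not a copy of the RTS code.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BLOCKED_READ, BLOCKED_TAKE } BlockKind;

/* Hypothetical stand-in for a thread blocked on an MVar. */
typedef struct Waiter {
    BlockKind kind;
    int thread_id;
    struct Waiter *next;
} Waiter;

typedef struct { Waiter *head, *tail; } WaitQueue;

/* Readers are pushed on the front and takers appended at the back,
 * so every blocked reader precedes every blocked taker. */
static void enqueue(WaitQueue *q, Waiter *w)
{
    if (w->kind == BLOCKED_READ) {
        w->next = q->head;
        q->head = w;
        if (q->tail == NULL) q->tail = w;
    } else {
        w->next = NULL;
        if (q->tail != NULL) q->tail->next = w; else q->head = w;
        q->tail = w;
    }
}

/* Returns 1 if no reader appears after a taker, 0 otherwise. */
static int readers_first(const WaitQueue *q)
{
    int seen_taker = 0;
    for (const Waiter *w = q->head; w != NULL; w = w->next) {
        if (w->kind == BLOCKED_TAKE) seen_taker = 1;
        else if (seen_taker) return 0;
    }
    return 1;
}
```

Because takers only ever go to the tail, `readers_first` holds after any sequence of `enqueue` calls, whatever the arrival order.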
* Fix #7970, #2161, unfix #551 [Simon Marlow, 2013-07-02, 1 file, -48/+52]
|   Establish the reachability of threads before weak pointers. Hence a deadlocked thread can keep a weak pointer alive and prevent it from being finalized early. However, a reference from the finalizer of a weak pointer will no longer prevent a thread from being considered deadlocked (#551). To keep the thread alive in that situation you need to use a StablePtr.
|
|   See comments on #7970 and in the code for more details.
* Ensure gc_thread->wakeup is of type StgWord8. [Austin Seipp, 2013-06-21, 1 file, -1/+1]
|   rtsBool is defined to only have two inhabitants, which are true (1) and false (0). But the wakeup flag is set to 4 possible values, outside the range of rtsBool. This leads Clang to warn about tautological comparisons.
|
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Fix typo in header guard. [Austin Seipp, 2013-06-19, 1 file, -1/+1]
|   Spotted by Clang.
|
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Whitespace and braces only [Ian Lynagh, 2013-06-15, 1 file, -4/+5]
|
* Maintain per-generation lists of weak pointers (#7847) [Takano Akio, 2013-06-15, 4 files, -100/+144]
|
* Allow multiple C finalizers to be attached to a Weak# [Takano Akio, 2013-06-15, 2 files, -4/+3]
|   The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#. This new primop mutates an existing Weak# object and adds a new C finalizer to it.
|
|   This change removes an invariant in MarkWeak.c, namely that the relative order of Weak# objects in the list needs to be preserved across GC. This makes it easier to split the list into per-generation structures.
|
|   The patch also removes a race condition between two threads calling finalizeWeak# on the same WEAK object at the same time.
* use libffi for iOS adjustors; fixes #7718 [Ian Lynagh, 2013-06-08, 1 file, -4/+54]
|   Based on a patch from Stephen Blackheath.
* Fix crash with large objects (#7919) [Simon Marlow, 2013-05-24, 1 file, -14/+44]
|   See comments for details.
* Fix a problem caused by very large objects (#7919) [Simon Marlow, 2013-05-21, 1 file, -5/+12]
|   As far as I can tell the bug should be harmless, apart from the failing assertion. Since the ticket reported crashes, there might be problems elsewhere that aren't triggered by this test case.
* Kill dead code. [Austin Seipp, 2013-05-12, 1 file, -101/+86]
|   Signed-off-by: Austin Seipp <aseipp@pobox.com>
* ticky enhancements [Nicolas Frisby, 2013-03-29, 1 file, -2/+2]
|   * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
|   * updated the documentation in the wiki and changed the user guide to point there
|   * moved the allocation enters for ticky and CCS to after the heap check
|     * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
|   * standardized all (active?) ticky alloc totals to bytes
|   * in order to avoid double counting, StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
|   * I resurrected the SLOW_CALL counters
|     * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
|     * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
|   * added ALLOC_RTS_ctr and _tot ticky counters
|     * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
|   * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
|   * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
|   * added a ticky counter for total LNE entries
|   * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
|     * all off by default
|     * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure
|     * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
|     * -ticky-LNE tracks LNEs as if they were functions
|   * updated the ticky report format, including making the argument categories (more?) accurate again
|     * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
* Fix segfault in retainer profiling when using multiple cores (#5909) [Simon Marlow, 2013-02-19, 1 file, -2/+15]
|   Thanks to @akio on the ticket for the diagnosis and the patch. I modified the comments a bit.
* isAlive needs to look through BLACKHOLE indirections [Simon Marlow, 2013-02-14, 1 file, -0/+8]
|   This has been breaking StableNames and possibly weak pointers in some cases.
* Separate StablePtr and StableName tables (#7674) [Simon Marlow, 2013-02-14, 2 files, -7/+7]
|   To improve performance of StablePtr.
* Simplify the allocation stats accounting [Simon Marlow, 2013-02-14, 4 files, -50/+20]
|   We were doing it in two different ways and asserting that the results were the same. In most cases they were, but I found one case where they weren't: the GC itself allocates some memory for running finalizers, and this memory was accounted for one way but not the other.
|
|   It was simpler to remove the old way of counting allocation than to try to fix it up, so I did that.
* Fix documentation bug: TSOs are *not* unconditionally kept on the mutable list. [Edward Z. Yang, 2013-01-27, 1 file, -1/+1]
|   The bug where TSOs were unconditionally kept on the mutable list was #1589, which was fixed in 04cddd339c000df6d02c90ce59dbffa58d2fe166. Curiously enough, the commit that changed this comment, 0417404f5d1230c9d291ea9f73e2831121c8ec99, occurred *after* this change was made; I can only assume Simon Marlow accidentally forgot that he had fixed this bug. :-)
|
|   Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Hopefully fix breakage on OS X w/ LLVM [Simon Marlow, 2013-01-17, 3 files, -1/+12]
|   Reordering of includes in GC.c broke on OS X because gctKey is declared in Task.h and is needed in the storage manager. This is really the wrong place for it anyway, so I've moved the gctKey pieces to where they should be.
* Rearrange includes to avoid a clash on ARM/Linux [Simon Marlow, 2013-01-17, 1 file, -12/+13]
|
* Add a write barrier for TVAR closures [Simon Marlow, 2012-11-16, 8 files, -15/+114]
|   This improves GC performance when there are a lot of TVars in the heap. For instance, a TChan with a lot of elements causes a massive GC drag without this patch.
|
|   There's more to do - several other STM closure types don't have write barriers, so GC performance when there are a lot of threads blocked on STM isn't great. But fixing the problem for TVar is a good start.
* fix bug in previous commit, 65e46f144f3d8b18de7264b0b099086153c68d6c [Simon Marlow, 2012-11-16, 1 file, -1/+1]
|
* a fix for checkTSO(): the TSO could be a WHITEHOLE [Simon Marlow, 2012-11-12, 1 file, -3/+10]
|
* Don't clearNurseries() in parallel with -debug [Simon Marlow, 2012-11-01, 1 file, -3/+5]
|   It makes sanity-checking fail.
* Produce new-style Cmm from the Cmm parser [Simon Marlow, 2012-10-08, 4 files, -85/+1]
|   The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:
|
|     foo ( gcptr a, bits32 b )
|     {
|       if (b > 0) {
|          // we can make tail calls passing arguments:
|          jump stg_ap_0_fast(a);
|       }
|
|       return (x,y);
|     }
|
|   More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.
|
|   The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.
|
|     jump %ENTRY_CODE(Sp(0)) [R1];
|
|   Again, more details in Note [Syntax of .cmm files].
|
|   I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.
|
|   Some other changes in this batch:
|
|   - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
|   - CmmSink now does constant-folding (should fix #7219)
|   - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
|   - RET_DYN frames are removed from the RTS, lots of code goes away
|   - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Fix the profiling build [Ian Lynagh, 2012-09-21, 1 file, -2/+2]
|
* Convert more RTS macros to functions [Ian Lynagh, 2012-09-21, 2 files, -6/+6]
|   No size changes in the non-debug object files
* Include pinned memory in the stats for allocated memory [Simon Marlow, 2012-09-21, 2 files, -1/+2]
|   This broke with the changes to the pinned object handling in 67f4ab7e6b7705a9d617c6109a8c5434ede13cae.
* Cache the result of countOccupied(gen->large_objects) as gen->n_large_words (#7257) [Simon Marlow, 2012-09-21, 2 files, -2/+6]
|   The program in #7257 was spending 90% of its time counting the live data in gen->large_objects. We already avoid doing this for small objects, but in this example the old generation was full of large objects (actually pinned ByteStrings).
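Illustrative note (not part of the commit): the caching idea can be sketched in C as below. This is a minimal sketch under stated assumptions: `Block`, `Generation`, `count_occupied` and `add_large_object` are simplified, hypothetical stand-ins, and only the field name `n_large_words` is taken from the commit title.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified block descriptor for one large object. */
typedef struct Block {
    size_t live_words;
    struct Block *link;
} Block;

/* Hypothetical, simplified generation with a cached word count. */
typedef struct {
    Block *large_objects;
    size_t n_large_words;   /* cached sum of live_words over the list */
} Generation;

/* Slow path: walk the whole list. Cost grows with the list length. */
static size_t count_occupied(const Block *bd)
{
    size_t words = 0;
    for (; bd != NULL; bd = bd->link) words += bd->live_words;
    return words;
}

/* Fast path: update the cached total whenever the list changes,
 * so readers never need to traverse the list. */
static void add_large_object(Generation *gen, Block *bd)
{
    bd->link = gen->large_objects;
    gen->large_objects = bd;
    gen->n_large_words += bd->live_words;
}
```

Any code path that removes or resizes an entry has to adjust the cached total as well; a debug build could compare it against `count_occupied` as a sanity check.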
* Allow allocNursery() to allocate single blocks (#7257) [Simon Marlow, 2012-09-21, 2 files, -11/+13]
|   Forcing large allocations here can create serious fragmentation in some cases, and since the large allocations are only a small optimisation we should allow the nursery to hoover up small blocks before allocating large chunks.
* Small parallel GC improvement [Simon Marlow, 2012-09-18, 1 file, -2/+12]
|   Overlap the main thread's clearNursery() with the other threads.
* More OS X build fixes [Ian Lynagh, 2012-09-14, 1 file, -8/+8]
|
* Lots of nat -> StgWord changes [Simon Marlow, 2012-09-07, 7 files, -49/+49]
|
* Deprecate lnat, and use StgWord instead [Simon Marlow, 2012-09-07, 12 files, -83/+83]
|   lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits. Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.
|
|   lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
* Some further tweaks to reduce fragmentation when allocating the nursery [Simon Marlow, 2012-09-07, 3 files, -19/+37]
|
* some nats should be lnats [Simon Marlow, 2012-09-07, 1 file, -1/+1]
|
* When using -H with -M<size>, don't exceed the maximum heap size [Simon Marlow, 2012-09-07, 1 file, -1/+5]
|
* memInventory(): tweak pretty-printing [Simon Marlow, 2012-09-07, 1 file, -8/+8]
|
* More CPP macros -> inline functions [Ian Lynagh, 2012-08-25, 2 files, -6/+6]
|   All the wibbles seem to have cancelled out, and (non-debug) object sizes are back to where they started.
|
|   I'm not 100% sure that the types are optimal, but at least now the functions have types and we can fix them if necessary.
* Make a function for get_itbl, rather than using a CPP macro [Ian Lynagh, 2012-08-25, 2 files, -6/+6]
|   This has several advantages:
|   * It can be called from gdb
|   * There is more type information for the user, and type checking for the compiler
|   * Less opportunity for things to go wrong, e.g. due to missing parentheses or repeated execution
|
|   The sizes of the non-debug .o files haven't changed (other than Inlines.o), so I'm pretty sure the compiled code is identical.
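Illustrative note (not part of the commit): the "repeated execution" hazard of a macro, and how an inline function avoids it, can be sketched in C as below. All names here (`GET_TYPE_MACRO`, `get_type`, `next_closure`, the two structs) are hypothetical and deliberately simplified; this is not the real `get_itbl` definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified info table and closure. */
typedef struct { int type; } InfoTable;
typedef struct { const InfoTable *info; } Closure;

/* Macro version: the argument text is pasted in twice, so an argument
 * with side effects runs twice, and the argument is not type checked
 * until after expansion. */
#define GET_TYPE_MACRO(c) ((c)->info != NULL ? (c)->info->type : -1)

/* Function version: the argument is evaluated exactly once, has a
 * declared type, and the function can be called from a debugger. */
static inline int get_type(const Closure *c)
{
    return c->info != NULL ? c->info->type : -1;
}

/* Helper with a visible side effect, used to count evaluations. */
static int calls = 0;
static const Closure *next_closure(const Closure *c)
{
    calls++;
    return c;
}
```

With a reasonable optimizer the inline function typically compiles to the same code as the macro, which is consistent with the unchanged object sizes reported in the commit message.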
* tidy up [Simon Marlow, 2012-08-21, 1 file, -5/+4]
|
* Reduce fragmentation when using +RTS -H (with or without a size) [Simon Marlow, 2012-08-21, 3 files, -2/+45]
|
* improve debug output [Simon Marlow, 2012-08-21, 1 file, -1/+1]
|