| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
volatile StgWord8 is not guaranteed to be atomic.
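To illustrate the point, and not as the code from this commit: `volatile` only stops the compiler from caching or eliding accesses; it does not make a read-modify-write atomic. The sketch below uses C11 `<stdatomic.h>`. The flag `demo_flag` and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-byte flag. C11 atomics give the guarantee that a
 * plain `volatile unsigned char` does not. */
static atomic_uchar demo_flag = 0;

/* Atomically move the flag from 0 to 1.
 * Returns true only for the caller that actually made the change. */
static bool demo_try_set(void)
{
    unsigned char expected = 0;
    return atomic_compare_exchange_strong(&demo_flag, &expected, 1);
}

/* Atomically clear the flag; only the current holder should call this. */
static void demo_clear(void)
{
    assert(atomic_load(&demo_flag) == 1);
    atomic_store(&demo_flag, 0);
}
```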
|
|
|
|
| |
whitehole_spin is only defined when PROF_SPIN is set.
|
|
|
|
| |
This reverts commit d85044f6b201eae0a9e453b89c0433608e0778f0.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When servicing a stack overflow, only throw an exception to the given
thread if the user explicitly set a max stack size, using +RTS -K.
Otherwise just service it normally and grow the stack.
In case we actually run out of *heap* (stack chunks are allocated on
the heap), then we need to bail by calling the stackOverflow() hook and
exit immediately.
Authored-by: Ben Gamari <bgamari.foss@gmail.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
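The policy described in this message can be sketched as a small decision function. This is a minimal illustration, not the RTS code: every name is hypothetical. Comparing the current size against the limit, and checking the limit before heap exhaustion, are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcomes of servicing a stack overflow. */
typedef enum {
    GROW_STACK,         /* allocate another stack chunk and carry on */
    THROW_TO_THREAD,    /* raise a stack-overflow exception in the thread */
    CALL_HOOK_AND_EXIT  /* out of heap: call the overflow hook and exit */
} OverflowAction;

static OverflowAction on_stack_overflow(bool user_set_max, /* +RTS -K given? */
                                        size_t stack_size,
                                        size_t max_stack_size,
                                        bool heap_exhausted)
{
    /* A user-supplied limit of zero would make no sense here. */
    assert(!user_set_max || max_stack_size > 0);

    /* Only an explicit limit leads to an exception (ordering is assumed). */
    if (user_set_max && stack_size >= max_stack_size)
        return THROW_TO_THREAD;

    /* Stack chunks live on the heap, so heap exhaustion is fatal. */
    if (heap_exhausted)
        return CALL_HOOK_AND_EXIT;

    /* Otherwise service the overflow normally. */
    return GROW_STACK;
}
```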
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have various problems with reallocating the array of Capabilities,
due to threads in waitForReturnCapability that are already holding a
pointer to a Capability.
Rather than add more locking to make this safer, I decided it would be
easier to ensure that we never move the Capabilities at all. The
capabilities array is now an array of pointers to Capability. There
are extra indirections, but it rarely matters - we don't often access
Capabilities via the array, normally we already have a pointer to
one. I ran the parallel benchmarks and didn't see any difference.
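A minimal sketch of the idea, assuming a stand-in `DemoCap` type and hypothetical helper names rather than the real RTS definitions: growing the array reallocates only the table of pointers, so a pointer obtained earlier stays valid.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { unsigned no; } DemoCap;   /* stand-in for Capability */

static DemoCap **demo_caps = NULL;  /* array of pointers, not of structs */
static unsigned  demo_n_caps = 0;

/* Grow to `n` capabilities. Only the pointer table may move; each
 * DemoCap keeps its original address. Returns 0 on success, -1 on
 * allocation failure. */
static int demo_grow_caps(unsigned n)
{
    assert(n > 0 && n >= demo_n_caps);

    DemoCap **arr = realloc(demo_caps, n * sizeof *arr);
    if (arr == NULL)
        return -1;
    demo_caps = arr;

    for (unsigned i = demo_n_caps; i < n; i++) {
        DemoCap *cap = malloc(sizeof *cap);
        if (cap == NULL) {
            demo_n_caps = i;   /* keep the entries created so far */
            return -1;
        }
        cap->no = i;
        demo_caps[i] = cap;
    }
    demo_n_caps = n;
    return 0;
}
```

The cost is the extra indirection the message mentions: `demo_caps[i]->no` instead of `demo_caps[i].no`.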
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The next major GC after an unloadObj() will do a traversal of the heap
to determine whether the object code can be removed from memory or
not. We'll keep doing these until it is safe to remove the object
code.
In my experiments with GHCi, the objects get unloaded immediately,
which is a good sign: we're not accidentally holding on to any
references anywhere in the GHC data structures.
Changes relative to the patch earlier posted on the ticket:
- fix two memory leaks discovered with Valgrind, after
testing with tests/rts/linker_unload.c
|
|/ |
|
|
|
|
|
|
|
|
|
| |
We add the invariant to the MVar blocked threads queue that
threads blocked on an atomic read are always at the front of
the queue. This invariant is easy to maintain, since takers
are only ever added to the end of the queue.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
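The invariant can be sketched with a plain linked queue. This is an illustration only: the types and function names are hypothetical, and inserting readers at the head is an assumption about how the invariant is kept, not a quote from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct Waiter {
    bool is_reader;          /* blocked on an atomic read (vs. a take) */
    struct Waiter *next;
} Waiter;

typedef struct { Waiter *head, *tail; } WaitQueue;

/* Invariant check: no reader appears after a taker. */
static bool readers_first(const WaitQueue *q)
{
    bool seen_taker = false;
    for (const Waiter *w = q->head; w != NULL; w = w->next) {
        if (!w->is_reader)
            seen_taker = true;
        else if (seen_taker)
            return false;
    }
    return true;
}

/* Takers are only ever added to the end of the queue. */
static void enqueue_taker(WaitQueue *q, Waiter *w)
{
    w->is_reader = false;
    w->next = NULL;
    if (q->tail != NULL) q->tail->next = w; else q->head = w;
    q->tail = w;
    assert(readers_first(q));
}

/* Readers go at the front, so they precede every taker. */
static void enqueue_reader(WaitQueue *q, Waiter *w)
{
    w->is_reader = true;
    w->next = q->head;
    q->head = w;
    if (q->tail == NULL) q->tail = w;
    assert(readers_first(q));
}
```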
|
|
|
|
|
|
|
|
|
|
|
| |
Establish the reachability of threads before weak pointers. Hence a
deadlocked thread can keep a weak pointer alive and prevent it from
being finalized early. However, a reference from the finalizer of a
weak pointer will no longer prevent a thread from being considered
deadlocked (#551). To keep the thread alive in that situation you
need to use a StablePtr.
See comments on #7970 and in the code for more details.
|
|
|
|
|
|
|
|
|
|
| |
rtsBool is defined to only have two inhabitants, which are true (1) and
false (0).
But the wakeup flag is set to 4 possible values, outside the range of
rtsBool. This leads Clang to warn about tautological comparisons.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
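A hedged sketch of the kind of fix this implies: give the multi-state flag its own enumeration instead of storing it in a two-valued boolean. All names below are hypothetical; the actual states of the wakeup flag are not listed in the message.

```c
#include <assert.h>

/* A two-valued boolean, in the style the message describes. */
typedef enum { DEMO_FALSE = 0, DEMO_TRUE = 1 } DemoBool;

/* Hypothetical four-state wakeup flag with its own type, so a comparison
 * against any of its states is meaningful to the compiler. */
typedef enum {
    WAKEUP_INACTIVE,
    WAKEUP_PENDING,
    WAKEUP_DONE,
    WAKEUP_SHUTDOWN
} DemoWakeup;

/* Collapse the four states to a boolean only at the point of use. */
static DemoBool demo_needs_wakeup(DemoWakeup w)
{
    assert(w <= WAKEUP_SHUTDOWN);
    return (w == WAKEUP_PENDING) ? DEMO_TRUE : DEMO_FALSE;
}
```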
|
|
|
|
|
|
| |
Spotted by Clang.
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit replaces mkWeakForeignEnv# with addCFinalizerToWeak#.
This new primop mutates an existing Weak# object and adds a new
C finalizer to it.
This change removes an invariant in MarkWeak.c, namely that the relative
order of Weak# objects in the list needs to be preserved across GC. This
makes it easier to split the list into per-generation structures.
The patch also removes a race condition between two threads calling
finalizeWeak# on the same WEAK object at the same time.
|
|
|
|
| |
Based on a patch from Stephen Blackheath.
|
|
|
|
| |
See comments for details.
|
|
|
|
|
|
| |
As far as I can tell the bug should be harmless, apart from the
failing assertion. Since the ticket reported crashes, there might be
problems elsewhere that aren't triggered by this test case.
|
|
|
|
| |
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
| |
* the new StgCmmArgRep module breaks a dependency cycle; I also
untabified it, but made no real changes
* updated the documentation in the wiki and changed the user guide to
point there
* moved the allocation enters for ticky and CCS to after the heap check
* I left LDV where it was, which was before the heap check at least
once, since I have no idea what it is
* standardized all (active?) ticky alloc totals to bytes
* in order to avoid double counting StgCmmLayout.adjustHpBackwards
no longer bumps ALLOC_HEAP_ctr
* I resurrected the SLOW_CALL counters
* the new module StgCmmArgRep breaks cyclic dependency between
Layout and Ticky (which the SLOW_CALL counters cause)
* renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
* added ALLOC_RTS_ctr and _tot ticky counters
* eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
* resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and
ALLOC_PRIM
* added -ticky and -DTICKY_TICKY in ways.mk for debug ways
* added a ticky counter for total LNE entries
* new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
* all off by default
* -ticky-allocd: tracks allocation *of* closure in addition to
allocation *by* that closure
* -ticky-dyn-thunk tracks dynamic thunks as if they were functions
* -ticky-LNE tracks LNEs as if they were functions
* updated the ticky report format, including making the argument
categories (more?) accurate again
* the printed name for things in the report includes the unique of
their ticky parent as well as if they are not top-level
|
|
|
|
|
| |
Thanks to @akio on the ticket for the diagnosis and the patch. I
modified the comments a bit.
|
|
|
|
|
| |
This has been breaking StableNames and possibly weak pointers in some
cases.
|
|
|
|
| |
To improve performance of StablePtr.
|
|
|
|
|
|
|
|
|
|
|
| |
We were doing it in two different ways and asserting that the results
were the same. In most cases they were, but I found one case where
they weren't: the GC itself allocates some memory for running
finalizers, and this memory was accounted for one way but not the
other.
It was simpler to remove the old way of counting allocation than to
try to fix it up, so I did that.
|
|
|
|
|
|
|
|
|
|
|
| |
The bug where TSOs were unconditionally kept on the mutable list was #1589
which was fixed in 04cddd339c000df6d02c90ce59dbffa58d2fe166.
Curiously enough, the commit that changed this comment
0417404f5d1230c9d291ea9f73e2831121c8ec99 occurred *after* this
change was made; I can only assume Simon Marlow accidentally forgot
that he had fixed this bug. :-)
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
|
|
|
|
|
|
|
| |
Reordering of includes in GC.c broke on OS X because gctKey is
declared in Task.h and is needed in the storage manager. This is
really the wrong place for it anyway, so I've moved the gctKey pieces
to where they should be.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This improves GC performance when there are a lot of TVars in the
heap. For instance, a TChan with a lot of elements causes a massive
GC drag without this patch.
There's more to do - several other STM closure types don't have write
barriers, so GC performance when there are a lot of threads blocked on
STM isn't great. But fixing the problem for TVar is a good start.
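For readers unfamiliar with the technique, a generational write barrier can be sketched as below. This is a simplified illustration with hypothetical names and a fixed-size list, not the RTS implementation. An old-generation object is recorded on a mutable list the first time it is written, so a minor GC needs to scan only the recorded objects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MUT_LIST_MAX 16

typedef struct DemoTVar {
    bool dirty;       /* already recorded on the mutable list? */
    bool in_old_gen;  /* lives in an old generation */
    void *value;
} DemoTVar;

static DemoTVar *demo_mut_list[DEMO_MUT_LIST_MAX];
static size_t demo_mut_len = 0;

/* Write barrier: record a clean old-generation TVar on its first write,
 * then perform the write itself. */
static void demo_write_tvar(DemoTVar *tv, void *new_value)
{
    if (tv->in_old_gen && !tv->dirty) {
        assert(demo_mut_len < DEMO_MUT_LIST_MAX);
        tv->dirty = true;
        demo_mut_list[demo_mut_len++] = tv;
    }
    tv->value = new_value;
}
```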
|
| |
|
| |
|
|
|
|
| |
It makes sanity-checking fail.
|
| |
The main change here is that the Cmm parser now allows high-level cmm
code with argument-passing and function calls. For example:
foo ( gcptr a, bits32 b )
{
if (b > 0) {
// we can make tail calls passing arguments:
jump stg_ap_0_fast(a);
}
return (x,y);
}
More details on the new cmm syntax are in Note [Syntax of .cmm files]
in CmmParse.y.
The old syntax is still more-or-less supported for those occasional
code fragments that really need to explicitly manipulate the stack.
However there are a couple of differences: it is now obligatory to
give a list of live GlobalRegs on every jump, e.g.
jump %ENTRY_CODE(Sp(0)) [R1];
Again, more details in Note [Syntax of .cmm files].
I have rewritten most of the .cmm files in the RTS into the new
syntax, except for AutoApply.cmm which is generated by the genapply
program: this file could be generated in the new syntax instead and
would probably be better off for it, but I ran out of enthusiasm.
Some other changes in this batch:
- The PrimOp calling convention is gone, primops now use the ordinary
NativeNodeCall convention. This means that primops and "foreign
import prim" code must be written in high-level cmm, but they can
now take more than 10 arguments.
- CmmSink now does constant-folding (should fix #7219)
- .cmm files now go through the cmmPipeline, and as a result we
generate better code in many cases. All the object files generated
for the RTS .cmm files are now smaller. Performance should be
better too, but I haven't measured it yet.
- RET_DYN frames are removed from the RTS, lots of code goes away
- we now have some more canned GC points to cover unboxed-tuples with
2-4 pointers, which will reduce code size a little.
|
| |
|
|
|
|
| |
No size changes in the non-debug object files
|
|
|
|
|
| |
This broke with the changes to the pinned object handling in
67f4ab7e6b7705a9d617c6109a8c5434ede13cae.
|
|
|
|
|
|
|
|
|
| |
(#7257)
The program in #7257 was spending 90% of its time counting the live
data in gen->large_objects. We already avoid doing this for small
objects, but in this example the old generation was full of large
objects (actually pinned ByteStrings).
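One way to avoid the repeated traversal is to keep a running total that is updated whenever an object is linked. The sketch below assumes that approach and uses hypothetical types; the message does not say exactly how the patch does it.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoBlock {
    size_t live_words;
    struct DemoBlock *next;
} DemoBlock;

typedef struct {
    DemoBlock *large_objects;
    size_t n_large_words;   /* running total, kept in step with the list */
} DemoGen;

/* O(n): walk the whole list each time the figure is needed. */
static size_t demo_count_large_slow(const DemoGen *g)
{
    size_t total = 0;
    for (const DemoBlock *b = g->large_objects; b != NULL; b = b->next)
        total += b->live_words;
    return total;
}

/* O(1): update the running total when a large object is linked in. */
static void demo_link_large(DemoGen *g, DemoBlock *b)
{
    b->next = g->large_objects;
    g->large_objects = b;
    g->n_large_words += b->live_words;
}

/* Debug-only cross-check that the counter matches a full traversal. */
static void demo_check_large(const DemoGen *g)
{
    assert(g->n_large_words == demo_count_large_slow(g));
}
```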
|
|
|
|
|
|
|
| |
Forcing large allocations here can create serious fragmentation in
some cases, and since the large allocations are only a small
optimisation we should allow the nursery to hoover up small blocks
before allocating large chunks.
|
|
|
|
| |
Overlap the main thread's clearNursery() with the other threads.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
lnat was originally "long unsigned int" but we were using it when we
wanted a 64-bit type on a 64-bit machine. This broke on Windows x64,
where long == int == 32 bits. Using types of unspecified size is bad,
but what we really wanted was a type with N bits on an N-bit machine.
StgWord is exactly that.
lnat was mentioned in some APIs that clients might be using
(e.g. StackOverflowHook()), so we leave it defined but with a comment
to say that it's deprecated.
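The size mismatch is easy to observe. In the sketch below `DemoWord` stands in for StgWord, and defining it as `uintptr_t` is an assumption made for illustration. On 64-bit Windows the first function reports 64 and the second 32.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Stand-in for StgWord: N bits on an N-bit machine. */
typedef uintptr_t DemoWord;

/* Number of bits in a DemoWord; expected to match the pointer width. */
static unsigned demo_word_bits(void)
{
    assert(sizeof(DemoWord) == sizeof(void *));
    return (unsigned)(sizeof(DemoWord) * CHAR_BIT);
}

/* Number of bits in unsigned long, which the platform ABI decides
 * independently of the pointer width. */
static unsigned demo_ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}
```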
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
All the wibbles seem to have cancelled out, and (non-debug) object sizes
are back to where they started.
I'm not 100% sure that the types are optimal, but at least now the
functions have types and we can fix them if necessary.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has several advantages:
* It can be called from gdb
* There is more type information for the user, and type checking
for the compiler
* Less opportunity for things to go wrong, e.g. due to missing
parentheses or repeated execution
The sizes of the non-debug .o files haven't changed (other than
Inlines.o), so I'm pretty sure the compiled code is identical.
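The "repeated execution" hazard can be shown with a small sketch. The struct, macro and function names are hypothetical and not taken from the RTS headers. The macro expands its argument twice, while the inline function evaluates it once and has a checked type.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { size_t ptrs, nptrs; } DemoInfo;  /* hypothetical layout info */

/* Macro form: `i` is expanded, and therefore evaluated, twice. */
#define DEMO_PAYLOAD_WORDS(i) ((i)->ptrs + (i)->nptrs)

/* Inline-function form: typed argument, evaluated exactly once. */
static inline size_t demo_payload_words(const DemoInfo *i)
{
    assert(i != NULL);
    return i->ptrs + i->nptrs;
}

static int demo_lookups = 0;

/* Side-effecting accessor, used to count how often an argument runs. */
static const DemoInfo *demo_lookup(const DemoInfo *i)
{
    demo_lookups++;
    return i;
}
```

Passing `demo_lookup(&info)` to the function bumps the counter once; passing it to the macro bumps it twice.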
|
| |
|
| |
|
| |
|