| Commit message | Author | Age | Files | Lines |
| |
Recently I've used a different build system (Xcode) to build the
rts, and in doing so I looked through rts/ghc.mk to figure out
how to build the rts.
In general it's quite straightforward: just compile all the
C files with the proper flags.
However, there is one rather awkward copy step that copies some
files for special handling for the rts way.
I'm wondering if the proposed solution in this diff is better
or worse than the current situation?
The idea is to keep the files, but use #includes to produce
identical files with just an additional define. It does, however,
produce empty objects for non-threaded ways.
Reviewers: ezyang, bgamari, austin, erikd, simonmar, rwbarton
Reviewed By: bgamari, simonmar, rwbarton
Subscribers: rwbarton, thomie, snowleopard
Differential Revision: https://phabricator.haskell.org/D3237
| |
Fixes trac issue #13288.
Reviewers: austin, bgamari, erikd, simonmar
Reviewed By: simonmar
Subscribers: mutjida, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3143
| |
See comments.
| |
Summary:
This commit makes various improvements and addresses some issues with
Compact Regions (aka Compact Normal Forms).
This was the most important thing I wanted to fix. Compaction
previously prevented GC from running until it was complete, which
would be a problem in a multicore setting. Now, we compact using a
hand-written Cmm routine that can be interrupted at any point. When a
GC is triggered during a sharing-enabled compaction, the GC has to
traverse and update the hash table, so this hash table is now stored
in the StgCompactNFData object.
Previously, compaction consisted of a deepseq using the NFData class,
followed by a traversal in C code to copy the data. This is now done
in a single pass with hand-written Cmm (see rts/Compact.cmm). We no
longer use the NFData instances; instead, the Cmm routine evaluates
components directly as it compacts.
The new compaction is about 50% faster than the old one with no
sharing, and a little faster on average with sharing (the cost of the
hash table dominates when we're doing sharing).
Static objects that don't (transitively) refer to any CAFs don't need
to be copied into the compact region. In particular this means we
often avoid copying Char values and small Int values, because these
are static closures in the runtime.
Each Compact# object can support a single compactAdd# operation at any
given time, so the Data.Compact library now enforces mutual exclusion
using an MVar stored in the Compact object.
We now get exceptions rather than killing everything with a barf()
when we encounter an object that cannot be compacted (a function, or a
mutable object). We now also detect pinned objects, which can't be
compacted either.
The Data.Compact API has been refactored and cleaned up. A new
compactSize operation returns the size (in bytes) of the compact
object.
Most of the documentation is in the Haddock docs for the compact
library, which I've expanded and improved here.
Various comments in the code have been improved, especially the main
Note [Compact Normal Forms] in rts/sm/CNF.c.
I've added a few tests, and expanded a few of the tests that were
there. We now also run the tests with GHCi, and in a new test way
that enables sanity checking (+RTS -DS).
There's a benchmark in libraries/compact/tests/compact_bench.hs for
measuring compaction speed and comparing sharing vs. no sharing.
The field totalDataW in StgCompactNFData was unnecessary.
Test Plan:
* new unit tests
* validate
* tested manually that we can compact Data.Aeson data
Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd
Subscribers: thomie, simonpj
Differential Revision: https://phabricator.haskell.org/D2751
GHC Trac Issues: #12455
| |
Summary:
Visible API changes:
* The C struct `GCDetails` gives the stats about a single GC. This is
passed to the `gcDone()` callback if one is set via the
RtsConfig. (previously we just passed a collection of values, so this
is more extensible, at the expense of breaking the existing API)
* `RTSStats` gives cumulative stats since the start of the program,
and includes the `GCDetails` for the most recent GC. This struct
can be obtained via `getRTSStats()` (the old `getGCStats()` has been
removed, and `getGCStatsEnabled()` has been renamed to
`getRTSStatsEnabled()`)
Improvements:
* The per-GC stats and cumulative stats are now cleanly separated.
* Inside the RTS we have a top-level `RTSStats` struct to keep all our
stats in; previously this was just a collection of strangely-named
variables. This struct is mostly just copied in `getRTSStats()`, so
the implementation of that function is a lot shorter.
* Types are more consistent. We use a uint64_t byte count for all
memory values, and Time for all time values.
* Names are more consistent. We use a suffix `_bytes` for all byte
counts and `_ns` for all time values.
* We now collect information about the amount of memory in large
objects and compact objects in `GCDetails`. (the latter was the reason
I started doing this patch but it seems to have ballooned a bit!)
* I fixed a bug in the calculation of the elapsed MUT time, and added
an ASSERT to stop the calculations going wrong in the future.
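The split between per-GC and cumulative stats can be pictured with a small sketch. The struct, field and function names here (`ExampleGCDetails`, `ExampleRTSStats`, `example_gc_done`) are hypothetical stand-ins and not the actual RTS definitions; the sketch only illustrates the `_bytes`/`_ns` naming convention and one way a per-GC record might be folded into the running totals.

```c
#include <stdint.h>

/* Hypothetical per-GC record: describes a single collection only. */
typedef struct {
    uint32_t gen;                  /* generation that was collected */
    uint64_t allocated_bytes;      /* allocated since the previous GC */
    uint64_t live_bytes;           /* live data after this GC */
    uint64_t large_objects_bytes;  /* memory held in large objects */
    uint64_t compact_bytes;        /* memory held in compact objects */
    int64_t  elapsed_ns;           /* time values as signed ns (assumed) */
} ExampleGCDetails;

/* Hypothetical cumulative record: totals since the program started. */
typedef struct {
    uint32_t gcs;                  /* number of GCs so far */
    uint64_t allocated_bytes;      /* total allocation */
    uint64_t max_live_bytes;       /* high-water mark of live data */
    int64_t  gc_elapsed_ns;        /* total time spent in GC */
    ExampleGCDetails gc;           /* details of the most recent GC */
} ExampleRTSStats;

/* Fold one GC's details into the cumulative stats. */
static void example_gc_done(ExampleRTSStats *stats, const ExampleGCDetails *d)
{
    stats->gcs += 1;
    stats->allocated_bytes += d->allocated_bytes;
    stats->gc_elapsed_ns += d->elapsed_ns;
    if (d->live_bytes > stats->max_live_bytes) {
        stats->max_live_bytes = d->live_bytes;
    }
    stats->gc = *d;   /* keep a copy of the latest per-GC record */
}
```

Passing one struct to the callback means new fields can be added later without changing the callback's signature, which is the extensibility argument made above.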
For now I kept the Haskell API in `GHC.Stats` the same, by
impedance-matching with the new API. We could either break that API
and make it match the C API more closely, or we could add a new API
and deprecate the old one. Opinions welcome.
This stuff is very easy to get wrong, and it's hard to test. Reviews
welcome!
Test Plan:
manual testing
validate
Reviewers: bgamari, niteria, austin, ezyang, hvr, erikd, rwbarton, Phyx
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2756
| |
Summary:
Fix issues preventing x86 GHC from building on Windows and
fix a segfault in the testsuite.
Test Plan: ./validate
Reviewers: austin, erikd, simonmar, bgamari
Reviewed By: bgamari
Subscribers: #ghc_windows_task_force, thomie
Differential Revision: https://phabricator.haskell.org/D2789
| |
Test Plan: Validate on lots of platforms
Reviewers: erikd, simonmar, austin
Reviewed By: erikd, simonmar
Subscribers: michalt, thomie
Differential Revision: https://phabricator.haskell.org/D2699
| |
Test Plan: Validate
Reviewers: simonmar, austin, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2764
| |
The previous code passed an end pointer, but the interface takes a size
instead.
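The class of bug described here can be sketched as follows. The function names are hypothetical and unrelated to the actual interface that was fixed; the point is only that a caller holding a `[start, end)` pair has to convert it to a byte count before calling a size-based interface.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: takes a start pointer and a size in bytes. */
static void example_fill(unsigned char *start, size_t size, unsigned char v)
{
    memset(start, v, size);
}

/* A caller that tracks the half-open range [start, end). */
static void example_fill_range(unsigned char *start, unsigned char *end,
                               unsigned char v)
{
    /* Wrong: example_fill(start, (size_t)end, v) would treat the end
     * address as a byte count and run far past the buffer. */
    example_fill(start, (size_t)(end - start), v);
}
```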
Fixes #12838.
Reviewers: austin, erikd, simonmar, bgamari
Reviewed By: simonmar, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2711
GHC Trac Issues: #12838
| |
Summary:
We currently have two info tables for a constructor:
* XXX_con_info: the info table for a heap-resident instance of the
constructor. It has type CONSTR, or one of the specialised types like
CONSTR_1_0
* XXX_static_info: the info table for a static instance of this
constructor, which has type CONSTR_STATIC or CONSTR_STATIC_NOCAF.
I'm getting rid of the latter, and using the `con_info` info table for
both static and dynamic constructors. For rationale and more details
see Note [static constructors] in SMRep.hs.
I also removed these macros: `isSTATIC()`, `ip_STATIC()`,
`closure_STATIC()`, since they relied on the CONSTR/CONSTR_STATIC
distinction, and anyway HEAP_ALLOCED() does the same job.
Test Plan: validate
Reviewers: bgamari, simonpj, austin, gcampax, hvr, niteria, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2690
GHC Trac Issues: #12455
| |
Summary:
The problem boils down to global variables: in particular gc_threads[],
which was being modified by a subsequent GC before the previous GC had
finished with it. The fix is to not use global variables.
This was causing setnumcapabilities001 to fail (again!). It's an old
bug though.
Test Plan:
Ran setnumcapabilities001 in a loop for a couple of hours. Before this
patch it had been failing after a few minutes. Not a very scientific
test, but it's the best I have.
Reviewers: bgamari, austin, fryguybob, niteria, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2654
| |
Nursery chunks help reduce the cost of GC when capabilities are unevenly
loaded, by ensuring that we use more of the available nursery.
The rationale for enabling this at -A16m is that any negative effects
due to loss of cache locality are less likely to be an issue at -A16m
and above. It's a conservative guess. If we had a lot of benchmark
data we could probably do better.
Results for nofib/parallel at -N4 -A32m with and without -n4m:
```
------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
------------------------------------------------------------------------
blackscholes 0.0% -9.5% -9.0% -15.0% -2.2%
coins 0.0% -4.7% -3.6% -0.6% -13.6%
mandel 0.0% -0.3% +7.7% +13.1% +0.1%
matmult 0.0% +1.5% +10.0% +7.7% +0.1%
nbody 0.0% -4.1% -2.9% 0.085 0.0%
parfib 0.0% -1.4% +1.0% +1.5% +0.2%
partree 0.0% -0.3% +0.8% +2.9% -0.8%
prsa 0.0% -0.5% -2.1% -7.6% 0.0%
queens 0.0% -3.2% -1.4% +2.2% +1.3%
ray 0.0% -5.6% -14.5% -7.6% +0.8%
sumeuler 0.0% -0.4% +2.4% +1.1% 0.0%
------------------------------------------------------------------------
Min 0.0% -9.5% -14.5% -15.0% -13.6%
Max 0.0% +1.5% +10.0% +13.1% +1.3%
Geometric Mean +0.0% -2.6% -1.3% -0.5% -1.4%
```
Not conclusive, but slightly better. This matters a lot more when you
have more cores.
Test Plan: validate, nofib/parallel
Reviewers: niteria, ezyang, nh2, trofi, austin, erikd, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2581
GHC Trac Issues: #9221
| |
Summary:
We stumbled upon a case where an external library (OpenCL) does not work
if a specific address (0x200000000) is taken.
It so happens that `osReserveHeapMemory` starts trying to mmap at 0x200000000:
```
void *hint = (void*)((W_)8 * (1 << 30) + attempt * BLOCK_SIZE);
at = osTryReserveHeapMemory(*len, hint);
```
This makes it impossible to use Haskell programs compiled with GHC 8
with C functions that use OpenCL.
See this example https://github.com/chpatrick/oclwtf for a repro.
This patch allows the user to work around this kind of behavior outside
our control by letting the user override the starting address through an
RTS command line flag.
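As a worked example of the arithmetic in the snippet above: 8 * 2^30 is 2^33, which is exactly 0x200000000, so the first attempt lands on the address OpenCL needs. The sketch below recomputes the hint with the base passed in as a parameter, which is roughly what an override amounts to. The names are hypothetical and the 4096-byte block size is an assumption made for the illustration.

```c
#include <stdint.h>

#define EXAMPLE_BLOCK_SIZE ((uint64_t)4096)   /* assumed block size */

/* The built-in starting point from the snippet above: 8 * 2^30 bytes. */
static uint64_t example_default_base(void)
{
    return (uint64_t)8 * ((uint64_t)1 << 30);
}

/* Address to try on the given attempt, starting from `base`.
 * A user-supplied base simply replaces example_default_base(). */
static uint64_t example_heap_hint(uint64_t base, uint64_t attempt)
{
    return base + attempt * EXAMPLE_BLOCK_SIZE;
}
```

With the default base, attempt 0 yields 0x200000000; with any other base the reserved region starts elsewhere and the conflicting address is left free.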
Reviewers: bgamari, Phyx, simonmar, erikd, austin
Reviewed By: Phyx, simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D2513
| |
malloc'd memory is not guaranteed to be zeroed. On Linux, however,
it is often zeroed, leading to latent bugs. In fact, with this
patch I fix two uninitialized memory bugs stemming from this.
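A minimal sketch of the pattern, using a hypothetical `ExampleTable` type rather than any of the structures touched by this patch: either initialise every field after `malloc`, or ask for zeroed memory with `calloc`.

```c
#include <stdlib.h>

typedef struct {
    int   n_items;
    void *head;
} ExampleTable;

/* malloc() leaves the contents indeterminate, so set every field. */
static ExampleTable *example_table_new(void)
{
    ExampleTable *t = malloc(sizeof *t);
    if (t == NULL) {
        return NULL;
    }
    t->n_items = 0;
    t->head = NULL;
    return t;
}

/* Alternative: calloc() returns memory with all bytes set to zero. */
static ExampleTable *example_table_new_zeroed(void)
{
    return calloc(1, sizeof(ExampleTable));
}
```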
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
Test Plan: validate
Reviewers: simonmar, austin, Phyx, bgamari, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2455
| |
Summary:
This patch moves the GNU C version test (4.5 or newer), needed because
of the use of __builtin_unreachable, out of the CNF.c code and into the
new RTS_UNREACHABLE macro in Rts.h
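A macro of this kind might look roughly like the sketch below. `EXAMPLE_UNREACHABLE` and the surrounding function are hypothetical and are not copied from Rts.h; the sketch just shows a compiler version test choosing between the builtin and a plain `abort()`.

```c
#include <stdlib.h>

/* Use the builtin only where the compiler is known to provide it
 * (GNU C 4.5 or newer); otherwise fall back to abort(). */
#if defined(__GNUC__) && \
    (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 5))
#define EXAMPLE_UNREACHABLE __builtin_unreachable()
#else
#define EXAMPLE_UNREACHABLE abort()
#endif

typedef enum { EX_SMALL, EX_LARGE } ExampleKind;

/* Every valid ExampleKind returns from inside the switch, so the
 * code after it should never run. */
static int example_words(ExampleKind k)
{
    switch (k) {
    case EX_SMALL: return 1;
    case EX_LARGE: return 64;
    }
    EXAMPLE_UNREACHABLE;
}
```

Keeping the version test in one macro means call sites stay free of compiler-specific conditionals.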
Reviewers: bgamari, austin, simonmar, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2457
| |
Summary:
This patch fixes a compilation failure on OpenBSD. OpenBSD's
GNU C compiler is version 4.2.1, and the problematic __builtin_unreachable
was only added in the GNU C 4.5 release. Let's use a plain abort() call
on OpenBSD instead of __builtin_unreachable
Reviewers: bgamari, austin, erikd, simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2453
| |
Summary:
Knowing the length of the run queue in O(1) time is useful: for example
we don't have to traverse the run queue to know how many threads we have
to migrate in schedulePushWork().
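The idea can be sketched with a plain linked-list queue that carries its own length. The types and function names are hypothetical and much simpler than the real scheduler structures; the point is that every operation that links or unlinks a thread also adjusts the counter, so reading the length never requires a traversal.

```c
#include <stddef.h>

typedef struct ExampleThread_ {
    struct ExampleThread_ *next;
} ExampleThread;

typedef struct {
    ExampleThread *head;
    ExampleThread *tail;
    unsigned int   n;    /* invariant: number of threads reachable from head */
} ExampleRunQueue;

/* Append a thread at the back and bump the counter. */
static void example_append(ExampleRunQueue *q, ExampleThread *t)
{
    t->next = NULL;
    if (q->tail != NULL) {
        q->tail->next = t;
    } else {
        q->head = t;
    }
    q->tail = t;
    q->n++;
}

/* Remove the thread at the front, or return NULL if the queue is empty. */
static ExampleThread *example_pop(ExampleRunQueue *q)
{
    ExampleThread *t = q->head;
    if (t == NULL) {
        return NULL;
    }
    q->head = t->next;
    if (q->head == NULL) {
        q->tail = NULL;
    }
    t->next = NULL;
    q->n--;
    return t;
}

/* O(1): no traversal needed. */
static unsigned int example_length(const ExampleRunQueue *q)
{
    return q->n;
}
```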
Test Plan: validate
Reviewers: ezyang, erikd, bgamari, austin
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2437
| |
@simonmar told me that it makes more sense this way.
Test Plan: it still builds
Reviewers: bgamari, austin, simonmar, erikd
Reviewed By: simonmar, erikd
Subscribers: thomie, simonmar
Differential Revision: https://phabricator.haskell.org/D2428
| |
Summary:
The recent Compact Regions commit (cf989ffe49) builds fine on Linux
but doesn't build on OS X or Windows.
* rts/sm/CNF.c: Drop un-needed #includes.
* Fix parenthesis usage with CPP ASSERT macro.
* Fix format string in debugBelch messages.
* Use stg_max() instead of a hand-rolled inline max() function.
Test Plan: Build on Linux, OS X and Windows
Reviewers: gcampax, simonmar, austin, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2421
| |
This brings in initial support for compact regions, as described in the
ICFP 2015 paper "Efficient Communication and Collection with Compact
Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni
Campagna.
Some things may change before the 8.2 release, but I (Simon M.) wanted
to get the main patch committed so that we can iterate.
What documentation there is is in the Data.Compact module in the new
compact package. We'll need to extend and polish the documentation
before the release.
Test Plan:
validate
(new test cases included)
Reviewers: ezyang, simonmar, hvr, bgamari, austin
Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D1264
GHC Trac Issues: #11493
| |
- Move the numaMap and nNumaNodes out of RtsFlags to Capability.c
- Add a test to tests/rts
| |
* Remove unused/old flags from the structs
* Update old comments
* Add missing flags to GHC.RTS
* Simplify GHC.RTS, remove C code and use hsc2hs instead
* Make ParFlags unconditional, and add support to GHC.RTS
| |
Summary:
The aim here is to reduce the number of remote memory accesses on
systems with a NUMA memory architecture, typically multi-socket servers.
Linux provides a NUMA API for doing two things:
* Allocating memory local to a particular node
* Binding a thread to a particular node
When given the +RTS --numa flag, the runtime will
* Determine the number of NUMA nodes (N) by querying the OS
* Assign capabilities to nodes, so cap C is on node C%N
* Bind worker threads on a capability to the correct node
* Keep separate free lists in the block layer for each node
* Allocate the nursery for a capability from node-local memory
* Allocate blocks in the GC from node-local memory
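The capability-to-node assignment in the list above (cap C is on node C%N) is simple enough to write down directly. The function names below are hypothetical and the sketch leaves out everything OS-specific, such as actually binding threads or allocating node-local memory.

```c
/* Node for capability `cap` when there are `n_nodes` NUMA nodes.
 * Assumes n_nodes > 0. */
static unsigned int example_cap_to_node(unsigned int cap, unsigned int n_nodes)
{
    return cap % n_nodes;
}

/* Fill out[0..n_caps-1] with the node assigned to each capability. */
static void example_assign_nodes(unsigned int n_caps, unsigned int n_nodes,
                                 unsigned int *out)
{
    unsigned int c;
    for (c = 0; c < n_caps; c++) {
        out[c] = example_cap_to_node(c, n_nodes);
    }
}
```

On the 2-socket machine from the example, this would spread 24 capabilities evenly, with even-numbered capabilities on node 0 and odd-numbered ones on node 1.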
For example, using nofib/parallel/queens on a 24-core 2-socket machine:
```
$ ./Main 15 +RTS -N24 -s -A64m
Total time 173.960s ( 7.467s elapsed)
$ ./Main 15 +RTS -N24 -s -A64m --numa
Total time 150.836s ( 6.423s elapsed)
```
The biggest win here is expected to be allocating from node-local
memory, so that means programs using a large -A value (as here).
According to perf, on this program the number of remote memory accesses
was reduced by more than 50% by using `--numa`.
Test Plan:
* validate
* There's a new flag --debug-numa=<n> that pretends to do NUMA without
actually making the OS calls, which is useful for testing the code
on non-NUMA systems.
* TODO: I need to add some unit tests
Reviewers: erikd, austin, rwbarton, ezyang, bgamari, hvr, niteria
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2199
| |
This makes the code a little more modular and allows the removal of some
CPP hackery. Providing dummy implementations of the `m32_*`
functions (which simply call `errorBelch`) means that the call sites
for these functions are syntax checked even when `RTS_LINKER_USE_MMAP`
is `0`.
Also changes some size parameter types from `unsigned int` to `size_t`.
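The dummy-implementation technique can be sketched as follows. `EXAMPLE_USE_MMAP`, `example_m32_alloc` and `example_load_section` are hypothetical names, `malloc` stands in for a real allocator, and `fprintf` stands in for `errorBelch`; the sketch is not taken from the linker code.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

#ifndef EXAMPLE_USE_MMAP
#define EXAMPLE_USE_MMAP 0    /* hypothetical feature switch */
#endif

#if EXAMPLE_USE_MMAP
/* Stand-in for the real allocator. */
static void *example_m32_alloc(size_t size)
{
    return malloc(size);
}
#else
/* Dummy version: same signature, but only reports an error. */
static void *example_m32_alloc(size_t size)
{
    fprintf(stderr, "example_m32_alloc: not supported (%zu bytes)\n", size);
    return NULL;
}
#endif

/* The call site uses an ordinary `if` rather than #if, so it is
 * compiled and type checked in both configurations. */
static void *example_load_section(size_t size)
{
    if (EXAMPLE_USE_MMAP) {
        return example_m32_alloc(size);
    }
    return malloc(size);
}
```

Because the function exists in both configurations, a mistake at the call site (wrong argument type, misspelt name) is caught on every platform, not only on those where the feature is enabled.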
Test Plan: Validate on Linux, OS X and Windows
Reviewers: Phyx, hsyl20, bgamari, simonmar, austin
Reviewed By: simonmar, austin
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2237
| |
Reviewers: austin, erikd, simonmar, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2241
| |
The first argument of 'osFreeMBlocks' ought to have the same type as the
return value from 'osGetMBlocks'. Make it so.
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: erikd, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D2235
| |
In addition to more const-correctness fixes this patch fixes an
infelicity of the previous const-correctness patch (995cf0f356) which
left `UNTAG_CLOSURE` taking a `const StgClosure` pointer parameter
but returning a non-const pointer. Here we restore the original type
signature of `UNTAG_CLOSURE` and add a new function
`UNTAG_CONST_CLOSURE` which takes and returns a const `StgClosure`
pointer and uses that wherever possible.
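The shape of the two functions can be sketched like this. The type, the function names and the 3-bit tag mask are assumptions made for the illustration and are not the real definitions.

```c
#include <stdint.h>

typedef struct {
    uintptr_t header;
} ExampleClosure;

#define EXAMPLE_TAG_MASK ((uintptr_t)7)   /* assumed: low 3 bits hold the tag */

/* Non-const in, non-const out: the original signature. */
static inline ExampleClosure *example_untag(ExampleClosure *p)
{
    return (ExampleClosure *)((uintptr_t)p & ~EXAMPLE_TAG_MASK);
}

/* Const in, const out: untagging never drops the qualifier. */
static inline const ExampleClosure *example_untag_const(const ExampleClosure *p)
{
    return (const ExampleClosure *)((uintptr_t)p & ~EXAMPLE_TAG_MASK);
}
```

Having two functions keeps constness honest in both directions: code holding a const pointer cannot obtain a writable one just by untagging it.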
Test Plan: Validate on Linux, OS X and Windows
Reviewers: Phyx, hsyl20, bgamari, austin, simonmar, trofi
Reviewed By: simonmar, trofi
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2231
| |
This is a follow-up of D2189. It fixes some comments, deletes a section
in the User's Guide about the bug, and updates .mailmap as suggested on
the WorkingConventions wiki page.
Test Plan: It compiles.
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2202
GHC Trac Issues: #11108
| |
If a function takes a pointer parameter and doesn't update what
the pointer points to, we can add `const` to the parameter
declaration to document that no updates occur.
Test Plan: Validate on Linux, OS X and Windows
Reviewers: austin, Phyx, bgamari, simonmar, hsyl20
Reviewed By: bgamari, simonmar, hsyl20
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2200
| |
Previously, we ignored promotion failures when evacuating fields of
a WEAK object. When a failure happens, this resulted in a WEAK object
pointing to another object in a younger generation, causing crashes.
I used the test case from #11746 to check that the fix is working.
However I haven't managed to produce a test case that quickly reproduces
the issue.
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: simonmar
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2189
GHC Trac Issues: #11108
| |
We can't define Stg{Int,Word} in terms of {,u}intptr_t because STG
depends on them being the exact same size as void*, and {,u}intptr_t
does not make that guarantee. Furthermore, we also need to define
StgHalf{Int,Word}, so the preprocessor #if needs to stay. But we can at
least keep it in a single place instead of repeating it in various
files.
Also define STG_{INT,WORD}{8,16,32,64}_{MIN,MAX} and use it in HsFFI.h,
further reducing the need for CPP in other files.
Reviewers: austin, bgamari, simonmar, hvr, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2182
| |
The `nat` type was an alias for `unsigned int` with a comment saying
it was at least 32 bits. We keep the typedef in case client code is
using it but mark it as deprecated.
Test Plan: Validated on Linux, OS X and Windows
Reviewers: simonmar, austin, thomie, hvr, bgamari, hsyl20
Differential Revision: https://phabricator.haskell.org/D2166
| |
+RTS -AL<size> controls the total size of large objects that can be
allocated before a GC is triggered. Previously this was always just the
value of -A, and the limit mainly existed to prevent runaway allocation
in pathological programs that allocate a lot of large objects. However,
since the limit is shared between all cores, on a large multicore the
default becomes more restrictive, and can end up triggering GC well
before it would normally have been.
Arguably a better default would be A*N, but this is probably excessive.
Adding a flag lets you choose, and I've left the default as it was.
See docs for usage.
| |
This allows the GC to use fewer threads than the number of capabilities.
At each GC, we choose some of the capabilities to be "idle", which means
that the thread running on that capability (if any) will sleep for the
duration of the GC, and the other threads will do its work. We choose
capabilities that are already idle (if any) to be the idle capabilities.
The idea is that this helps in the following situation:
* We want to use a large -N value so as to make use of hyperthreaded
cores
* We use a large heap size, so GC is infrequent
* But we don't want to use all -N threads in the GC, because that
thrashes the memory too much.
See docs for usage.
| |
- Rename to the (more correct) NUM_FREE_LISTS
- NUM_FREE_LISTS should be derived from the block and mblock sizes, not
defined manually. It was actually too large by one, which caused a
little bit of (benign) extra work in the form of a redundant loop
iteration in some cases.
- Add some ASSERTs for input preconditions to log_2() and log_2_ceil()
- Fix some comments
- Fix usage in allocLargeChunk, to account for the fact that
log_2_ceil() can return NUM_FREE_LISTS.
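The relationship between the constants and the two logarithm helpers can be sketched as below. The shift values (4 KiB blocks, 1 MiB megablocks) and all `example_` names are assumptions for the illustration, and the loop-based logarithm is just the simplest correct form, not the implementation in the block allocator.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_BLOCK_SHIFT    12   /* assumed: 4 KiB blocks */
#define EXAMPLE_MBLOCK_SHIFT   20   /* assumed: 1 MiB megablocks */
/* Derived from the two sizes rather than defined by hand. */
#define EXAMPLE_NUM_FREE_LISTS (EXAMPLE_MBLOCK_SHIFT - EXAMPLE_BLOCK_SHIFT)

/* floor(log2(n)), with the input precondition checked. */
static uint32_t example_log_2(uint32_t n)
{
    assert(n > 0 && n <= (1u << EXAMPLE_NUM_FREE_LISTS));
    uint32_t r = 0;
    while (n > 1) {
        n >>= 1;
        r++;
    }
    return r;
}

/* ceil(log2(n)); the result can equal EXAMPLE_NUM_FREE_LISTS. */
static uint32_t example_log_2_ceil(uint32_t n)
{
    assert(n > 0 && n <= (1u << EXAMPLE_NUM_FREE_LISTS));
    uint32_t r = example_log_2(n);
    return (n & (n - 1)) ? r + 1 : r;   /* round up unless a power of two */
}

/* First free list worth searching for a request of n blocks, or -1
 * when the result falls outside the array of free lists. */
static int example_free_list_index(uint32_t n)
{
    uint32_t ln = example_log_2_ceil(n);
    return ln < EXAMPLE_NUM_FREE_LISTS ? (int)ln : -1;
}
```

With these assumed sizes there are 8 free lists, indexed 0 to 7. A request for 200 blocks gives a ceiling of 8, which is one past the last valid index, so callers have to handle that value explicitly.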
| |
This reverts commit 546f24e4f8a7c086b1e5afcdda624176610cbcf8.
And adds a fix for Windows: we need to use __builtin_clzll() rather than
__builtin_clzl(), because StgWord is unsigned long long on Windows.
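The width issue can be sketched with a 64-bit logarithm built on the builtin. This assumes a GCC- or Clang-compatible compiler, and `example_log_2_u64` is a hypothetical name rather than the function in the tree.

```c
#include <stdint.h>

/* floor(log2(n)) for n > 0.
 * __builtin_clzll takes unsigned long long, which is at least 64 bits
 * everywhere. __builtin_clzl takes unsigned long, which is only 32 bits
 * on 64-bit Windows, so it would silently drop the upper half of the
 * argument there. */
static uint32_t example_log_2_u64(uint64_t n)
{
    const int bits = (int)(sizeof(unsigned long long) * 8);
    return (uint32_t)(bits - 1 - __builtin_clzll((unsigned long long)n));
}
```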
| |
Some old stuff related to the PAR way.
Reviewed by: austin, simonmar
Differential Revision: https://phabricator.haskell.org/D2137
| |
This reverts commit 24864ba5587c1a0447beabae90529e8bb4fa117a.
|
| |
| |
A microoptimisation in the block allocator.
| |
Avoids contention for the block allocator lock in the GC; this can be
seen in the gc_alloc_block_sync counter emitted by +RTS -s.
I experimented with this a while ago, and there was already
commented-out code for it in GCUtils.c, but I've now improved it so that
it doesn't result in significantly worse memory usage.
* The old method of putting spare blocks on ws->part_list was wasteful,
the spare blocks are now shared between all generations and retained
between GCs.
* repeated allocGroup() results in fragmentation, so I switched to using
allocLargeChunk() instead which is fragmentation-friendly; we already
use it for the same reason in nursery allocation.
| |
After a parallel GC, it is possible to have a long list of blocks in
ws->part_list, if we did a lot of work stealing but didn't fill up the
blocks we stole. These blocks persist until the next load-balanced GC,
which might be a long time, and during every GC we were traversing this
list to find its size. The fix is to maintain the size all the time, so
we don't have to compute it.
| |
DEAD_WEAK used to have a different layout, see
d61c623ed6b2d352474a7497a65015dbf6a72e12
|
| |
| |
This reverts commit 6c2c853b11fe25c106469da7b105e2be596c17de which was
supposed to be merged as individual commits.
| |
This Diff contains small, self-contained changes as I work towards
fixing #10613. It is mostly created to let harbormaster do its job, but
feedback is welcome as well.
Please do not merge this via arc; I’d like to push the individual
patches as laid out here. I might push mostly trivial ones even without
review, as long as the build passes.
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2014
| |
Commit 5d52d9b64c21dcf77849866584744722f8121389 removed the
global 'blackhole_queue' in favour of a new mechanism:
when a TSO hits a blackhole, the TSO blocks waiting for
'MessageBlackHole' delivery.
This patch removes the unused global and updates stale comments.
Noticed by Yuras Shumovich.
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Test Plan: build test
Reviewers: simonmar, austin, Yuras, bgamari
Reviewed By: Yuras, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1953
| |
Noticed by uselex.rb:
copied: [R]: exported from:
./rts/dist/build/sm/GC.o
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
| |
Noticed by uselex.rb:
scavenge_mutable_list: [R]: exported from:
./rts/dist/build/sm/Scav.o
scavenge_mutable_list1: [R]: exported from:
./rts/dist/build/sm/Scav.thr_o
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
| |
Use of these helper functions was removed by
commit 18896fa2b06844407fd1e0d3f85cd3db97a96ff4
Author: Simon Marlow <marlowsd@gmail.com>
Date: Wed Feb 2 15:49:55 2011 +0000
Noticed by uselex.rb:
calcLiveBlocks: [R]: exported from:
./rts/dist/build/sm/Storage.o
calcLiveWords: [R]: exported from:
./rts/dist/build/sm/Storage.o
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
|