author    Ömer Sinan Ağacan <omer@well-typed.com>    2019-02-05 00:18:44 -0500
committer Ben Gamari <ben@smart-cactus.org>          2019-10-20 21:15:37 -0400
commit    68e0647f432f9d79ae13a23f614ef293bfd297a9
tree      89211783fbd4d8d9502e2777b2719da493fef851 /rts/sm/Scav.c
parent    b3ef2d1a861e9b892d64f22f6a233ea331db86d1
rts: Non-concurrent mark and sweep
This implements the core heap structure and a serial mark/sweep
collector which can be used to manage the oldest-generation heap.
This is the first step towards a concurrent mark-and-sweep collector
aimed at low-latency applications.
The full design of the collector implemented here is described in detail
in a technical note:
B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell
Compiler" (2018)
The basic heap structure used in this design is heavily inspired by
K. Ueno & A. Ohori. "A fully concurrent garbage collector for
functional programs on multicore processors." /ACM SIGPLAN Notices/
Vol. 51. No. 9 (presented at ICFP 2016)
This design is intended to allow both marking and sweeping to proceed
concurrently with the execution of a multi-core mutator. Unlike the Ueno design,
which requires no global synchronization pauses, the collector
introduced here requires a stop-the-world pause at the beginning and end
of the mark phase.
To avoid heap fragmentation, the allocator consists of a number of
fixed-size /sub-allocators/. Each of these sub-allocators allocates into
its own set of /segments/, themselves allocated from the block
allocator. Each segment is broken into a set of fixed-size allocation
blocks (which back allocations), a bitmap (used to track the liveness
of blocks), and some additional metadata (also used to track liveness).
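
To make the layout concrete, here is a rough editorial sketch of a segment
and sub-allocator in C. The field names, sizes, and the SEGMENT_BLOCKS
constant are illustrative assumptions; they do not match the actual
definitions in rts/sm/NonMoving.h:

    /* Hypothetical sketch of the segment layout described above. */
    #include <stdint.h>

    #define SEGMENT_BLOCKS 1024            /* blocks per segment (assumed) */

    struct Segment {
        struct Segment *link;              /* segments of one sub-allocator form a list */
        uint16_t next_free;                /* allocator metadata: next unused block      */
        uint8_t  block_size_log2;          /* fixed block size backing allocations       */
        uint8_t  bitmap[SEGMENT_BLOCKS];   /* one liveness/mark entry per block          */
        /* the fixed-size allocation blocks follow this header */
    };

    /* One sub-allocator per block size; each allocates only into its own segments. */
    struct SubAllocator {
        uint8_t block_size_log2;
        struct Segment *current;           /* segment currently being filled        */
        struct Segment *filled;            /* full segments, to be swept after mark */
    };
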
This heap structure enables collection via mark-and-sweep, which can be
performed concurrently via a snapshot-at-the-beginning scheme (although
concurrent collection is not implemented in this patch).
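
As a rough illustration of the sweep step this enables (the helper below and
its bitmap representation are assumptions of this sketch, not the RTS code):

    #include <stdint.h>

    /* Sweep one segment: a block whose bitmap entry was not set during
     * marking is dead and its slot can be reused; marked blocks have their
     * bit cleared, ready for the next cycle.  Purely illustrative. */
    static void sweep_segment(uint8_t *bitmap, unsigned blocks_per_segment)
    {
        for (unsigned i = 0; i < blocks_per_segment; i++) {
            if (bitmap[i] == 0) {
                /* unmarked: the block is garbage; hand it back to the allocator */
            } else {
                bitmap[i] = 0;   /* reset the mark for the next collection */
            }
        }
    }
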
The mark queue is a fairly straightforward chunked-array structure.
The representation is a bit more verbose than a typical mark queue to
accommodate a combination of two features (a rough sketch follows the list):
* a mark FIFO, which improves the locality of marking, reducing one of
the major overheads seen in mark/sweep allocators (see [1] for
details)
* the selector optimization and indirection shortcutting, which
requires that we track where we found each reference to an object
in case we need to update the reference at a later point (e.g. when
we find that it is an indirection). See Note [Origin references in
the nonmoving collector] (in `NonMovingMark.h`) for details.
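
A minimal sketch of a queue entry carrying an origin reference, and of the
chunked queue itself; the names and chunk size are assumptions made here,
and the real structure in NonMovingMark.h is richer:

    struct StgClosure_;                    /* opaque stand-in for the RTS closure type */

    typedef struct {
        struct StgClosure_  *closure;      /* the object still to be marked            */
        struct StgClosure_ **origin;       /* slot the reference was found in, so it
                                              can be updated later (e.g. when the
                                              closure turns out to be an indirection) */
    } MarkQueueEnt;

    #define MARK_CHUNK_ENTRIES 256         /* entries per chunk (assumed) */

    /* The queue is a linked list of fixed-size chunks of entries. */
    typedef struct MarkQueueChunk_ {
        struct MarkQueueChunk_ *prev;
        unsigned head;                     /* index of the next free entry */
        MarkQueueEnt entries[MARK_CHUNK_ENTRIES];
    } MarkQueueChunk;
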
Beyond this, the mark/sweep is fairly run-of-the-mill.
[1] R. Garner, S.M. Blackburn, D. Frampton. "Effective Prefetch for
Mark-Sweep Garbage Collection." ISMM 2007.
Co-Authored-By: Ben Gamari <ben@well-typed.com>
Diffstat (limited to 'rts/sm/Scav.c')
 rts/sm/Scav.c | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)
diff --git a/rts/sm/Scav.c b/rts/sm/Scav.c
index b50dff38fb..cac9ca1e2d 100644
--- a/rts/sm/Scav.c
+++ b/rts/sm/Scav.c
@@ -62,6 +62,8 @@
 #include "Hash.h"
 
 #include "sm/MarkWeak.h"
+#include "sm/NonMoving.h" // for nonmoving_set_closure_mark_bit
+#include "sm/NonMovingScav.h"
 
 static void scavenge_large_bitmap (StgPtr p,
                                    StgLargeBitmap *large_bitmap,
@@ -1654,7 +1656,10 @@ scavenge_mutable_list(bdescr *bd, generation *gen)
                 ;
             }
 
-            if (scavenge_one(p)) {
+            if (RtsFlags.GcFlags.useNonmoving && major_gc && gen == oldest_gen) {
+                // We can't use scavenge_one here as we need to scavenge SRTs
+                nonmovingScavengeOne((StgClosure *)p);
+            } else if (scavenge_one(p)) {
                 // didn't manage to promote everything, so put the
                 // object back on the list.
                 recordMutableGen_GC((StgClosure *)p,gen_no);
@@ -1666,7 +1671,14 @@ scavenge_mutable_list(bdescr *bd, generation *gen)
 void
 scavenge_capability_mut_lists (Capability *cap)
 {
-    uint32_t g;
+    // In a major GC only nonmoving heap's mut list is root
+    if (RtsFlags.GcFlags.useNonmoving && major_gc) {
+        uint32_t g = oldest_gen->no;
+        scavenge_mutable_list(cap->saved_mut_lists[g], oldest_gen);
+        freeChain_sync(cap->saved_mut_lists[g]);
+        cap->saved_mut_lists[g] = NULL;
+        return;
+    }
 
     /* Mutable lists from each generation > N
      * we want to *scavenge* these roots, not evacuate them: they're not
@@ -1674,7 +1686,7 @@ scavenge_capability_mut_lists (Capability *cap)
      * Also do them in reverse generation order, for the usual reason:
      * namely to reduce the likelihood of spurious old->new pointers.
      */
-    for (g = RtsFlags.GcFlags.generations-1; g > N; g--) {
+    for (uint32_t g = RtsFlags.GcFlags.generations-1; g > N; g--) {
         scavenge_mutable_list(cap->saved_mut_lists[g], &generations[g]);
         freeChain_sync(cap->saved_mut_lists[g]);
         cap->saved_mut_lists[g] = NULL;
@@ -2044,6 +2056,16 @@ loop:
     for (g = RtsFlags.GcFlags.generations-1; g >= 0; g--) {
         ws = &gct->gens[g];
 
+        if (ws->todo_seg != END_NONMOVING_TODO_LIST) {
+            struct NonmovingSegment *seg = ws->todo_seg;
+            ASSERT(seg->todo_link);
+            ws->todo_seg = seg->todo_link;
+            seg->todo_link = NULL;
+            scavengeNonmovingSegment(seg);
+            did_something = true;
+            break;
+        }
+
         gct->scan_bd = NULL;
 
         // If we have a scan block with some work to do,