summaryrefslogtreecommitdiff
path: root/rts
diff options
context:
space:
mode:
authorMatthew Pickering <matthewtpickering@gmail.com>2022-03-22 18:02:44 +0000
committerMarge Bot <ben+marge-bot@smart-cactus.org>2022-03-29 13:08:42 -0400
commit840a6811b074a883b6e01ff10ac65387acd85089 (patch)
tree6cb941332a93ba4467fa457d11d731efedf07819 /rts
parentad09a5f7671c5594524c517945b2bdd7c5e2c8fa (diff)
downloadhaskell-840a6811b074a883b6e01ff10ac65387acd85089.tar.gz
RTS: Zero gc_cpu_start and gc_cpu_end after accounting
When passed a combination of `-N` and `-qn` options the cpu time for garbage collection was being vastly overcounted because the counters were not being zeroed appropiately. When -qn1 is passed, only 1 of the N avaiable GC threads is chosen to perform work, the rest are idle. At the end of the GC period, stat_endGC traverses all the GC threads and adds up the elapsed time from each of them. For threads which didn't participate in this GC, the value of the cpu time should be zero, but before this patch, the counters were not zeroed and hence we would count the same elapsed time on many subsequent iterations (until the thread participated in a GC again). The most direct way to zero these fields is to do so immediately after the value is added into the global counter, after which point they are never used again. We also tried another approach where we would zero the counter in yieldCapability but there are some (undiagnosed) siations where a capbility would not pass through yieldCapability before the GC ended and the same double counting problem would occur. Fixes #21082
Diffstat (limited to 'rts')
-rw-r--r--rts/Stats.c20
1 files changed, 11 insertions, 9 deletions
diff --git a/rts/Stats.c b/rts/Stats.c
index fb87a6d8e4..16875ce5d2 100644
--- a/rts/Stats.c
+++ b/rts/Stats.c
@@ -515,15 +515,17 @@ stat_endGC (Capability *cap, gc_thread *initiating_gct, W_ live, W_ copied, W_ s
// see Note [n_gc_threads]
// par_n_threads is set to n_gc_threads at the single callsite of this
// function
- if (par_n_threads == 1) {
- ASSERT(initiating_gct->gc_end_cpu >= initiating_gct->gc_start_cpu);
- stats.gc.cpu_ns += initiating_gct->gc_end_cpu - initiating_gct->gc_start_cpu;
- } else {
- for (unsigned int i=0; i < par_n_threads; i++) {
- gc_thread *gct = gc_threads[i];
- ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
- stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
- }
+ for (unsigned int i=0; i < par_n_threads; i++) {
+ gc_thread *gct = gc_threads[i];
+ ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
+ stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
+ // Reset the start/end times to zero after the difference has been
+ // added to the global total (see #21082)
+ // Resetting the values here is the most direct way to avoid double counting
+ // because the gc_end_cpu and gc_start_cpu is not reset to zero as
+ // stat_startGC is only called on active (not idle threads).
+ gct->gc_end_cpu = 0;
+ gct->gc_start_cpu = 0;
}
}
// -------------------------------------------------