authorMatthew Pickering <matthewtpickering@gmail.com>2022-03-22 18:02:44 +0000
committerMatthew Pickering <matthewtpickering@gmail.com>2022-03-22 18:11:18 +0000
commitc0e7c929be2632040b6c380c905e064c3aecbe7e (patch)
tree307bbab5365e0ddae98f11cdc9324c4b31221553
parent52ffd38c610f418ee1d1a549cfdfdaa11794ea40 (diff)
downloadhaskell-wip/gc-stats-times.tar.gz
RTS: Zero gc_cpu_start and gc_cpu_end after accountingwip/gc-stats-times
When passed a combination of `-N` and `-qn` options, the CPU time for garbage collection was being vastly overcounted because the counters were not being zeroed appropriately.

When `-qn1` is passed, only 1 of the N available GC threads is chosen to perform work; the rest are idle. At the end of the GC period, stat_endGC traverses all the GC threads and adds up the elapsed time from each of them. For threads which didn't participate in this GC, the value of the CPU time should be zero, but before this patch the counters were not zeroed, and hence we would count the same elapsed time on many subsequent iterations (until the thread participated in a GC again).

The most direct way to zero these fields is to do so immediately after the value is added into the global counter, after which point they are never used again. We also tried another approach where we would zero the counter in yieldCapability, but there are some (undiagnosed) situations where a capability would not pass through yieldCapability before the GC ended, and the same double-counting problem would occur.

Fixes #21082
-rw-r--r--rts/Stats.c20
1 file changed, 11 insertions, 9 deletions
diff --git a/rts/Stats.c b/rts/Stats.c
index fb87a6d8e4..16875ce5d2 100644
--- a/rts/Stats.c
+++ b/rts/Stats.c
@@ -515,15 +515,17 @@ stat_endGC (Capability *cap, gc_thread *initiating_gct, W_ live, W_ copied, W_ s
// see Note [n_gc_threads]
// par_n_threads is set to n_gc_threads at the single callsite of this
// function
- if (par_n_threads == 1) {
- ASSERT(initiating_gct->gc_end_cpu >= initiating_gct->gc_start_cpu);
- stats.gc.cpu_ns += initiating_gct->gc_end_cpu - initiating_gct->gc_start_cpu;
- } else {
- for (unsigned int i=0; i < par_n_threads; i++) {
- gc_thread *gct = gc_threads[i];
- ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
- stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
- }
+ for (unsigned int i=0; i < par_n_threads; i++) {
+ gc_thread *gct = gc_threads[i];
+ ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
+ stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
+ // Reset the start/end times to zero after the difference has been
+ // added to the global total (see #21082).
+ // Resetting the values here is the most direct way to avoid double
+ // counting, because gc_end_cpu and gc_start_cpu are not otherwise
+ // reset: stat_startGC is only called on active (not idle) threads.
+ gct->gc_end_cpu = 0;
+ gct->gc_start_cpu = 0;
}
}
// -------------------------------------------------