RTS: Zero gc_cpu_start and gc_cpu_end after accounting

When passed a combination of `-N` and `-qn` options the cpu time for garbage collection was being vastly overcounted because the counters were not being zeroed appropiately. When -qn1 is passed, only 1 of the N avaiable GC threads is chosen to perform work, the rest are idle. At the end of the GC period, stat_endGC traverses all the GC threads and adds up the elapsed time from each of them. For threads which didn't participate in this GC, the value of the cpu time should be zero, but before this patch, the counters were not zeroed and hence we would count the same elapsed time on many subsequent iterations (until the thread participated in a GC again). The most direct way to zero these fields is to do so immediately after the value is added into the global counter, after which point they are never used again. We also tried another approach where we would zero the counter in yieldCapability but there are some (undiagnosed) siations where a capbility would not pass through yieldCapability before the GC ended and the same double counting problem would occur. Fixes #21082
author: Matthew Pickering <matthewtpickering@gmail.com> 2022-03-22 18:02:44 +0000
committer: Marge Bot <ben+marge-bot@smart-cactus.org> 2022-03-29 13:08:42 -0400
commit: 840a6811b074a883b6e01ff10ac65387acd85089 (patch)
tree: 6cb941332a93ba4467fa457d11d731efedf07819
parent: ad09a5f7671c5594524c517945b2bdd7c5e2c8fa (diff)
download: haskell-840a6811b074a883b6e01ff10ac65387acd85089.tar.gz
1 files changed, 11 insertions, 9 deletions
diff --git a/rts/Stats.c b/rts/Stats.c
index fb87a6d8e4..16875ce5d2 100644
--- a/rts/Stats.c
+++ b/rts/Stats.c
@@ -515,15 +515,17 @@ stat_endGC (Capability *cap, gc_thread *initiating_gct, W_ live, W_ copied, W_ s
         // see Note [n_gc_threads]
         // par_n_threads is set to n_gc_threads at the single callsite of this
         // function
-        if (par_n_threads == 1) {
-            ASSERT(initiating_gct->gc_end_cpu >= initiating_gct->gc_start_cpu);
-            stats.gc.cpu_ns += initiating_gct->gc_end_cpu - initiating_gct->gc_start_cpu;
-        } else {
-            for (unsigned int i=0; i < par_n_threads; i++) {
-                gc_thread *gct = gc_threads[i];
-                ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
-                stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
-            }
+        for (unsigned int i=0; i < par_n_threads; i++) {
+            gc_thread *gct = gc_threads[i];
+            ASSERT(gct->gc_end_cpu >= gct->gc_start_cpu);
+            stats.gc.cpu_ns += gct->gc_end_cpu - gct->gc_start_cpu;
+            // Reset the start/end times to zero after the difference has been
+            // added to the global total (see #21082)
+            // Resetting the values here is the most direct way to avoid double counting
+            // because the gc_end_cpu and gc_start_cpu is not reset to zero as
+            // stat_startGC is only called on active (not idle threads).
+            gct->gc_end_cpu = 0;
+            gct->gc_start_cpu = 0;
         }
     }
     // -------------------------------------------------
author	Matthew Pickering <matthewtpickering@gmail.com>	2022-03-22 18:02:44 +0000
committer	Marge Bot <ben+marge-bot@smart-cactus.org>	2022-03-29 13:08:42 -0400
commit	840a6811b074a883b6e01ff10ac65387acd85089 (patch)
tree	6cb941332a93ba4467fa457d11d731efedf07819
parent	ad09a5f7671c5594524c517945b2bdd7c5e2c8fa (diff)
download	haskell-840a6811b074a883b6e01ff10ac65387acd85089.tar.gz