diff options
author | Simon Marlow <marlowsd@gmail.com> | 2015-02-11 14:19:21 +0000 |
---|---|---|
committer | Simon Marlow <marlowsd@gmail.com> | 2017-11-16 13:49:05 +0000 |
commit | 2f4638735ad1526d6502a4706bffafffb93e24da (patch) | |
tree | 8711e34b498ea52d0a8092fb763892c07c602d92 /docs/users_guide/runtime_control.rst | |
parent | 07ac921f48baea84b40835b0b7c476806f7f63f6 (diff) | |
download | haskell-2f4638735ad1526d6502a4706bffafffb93e24da.tar.gz |
Detect overly long GC sync
Summary:
GC sync is the time between a GC being intiated and all the mutator
threads finally stopping so that the GC can start. Problems that cause
the GC sync to be delayed are hard to find and can cause dramatic
slowdowns for heavily parallel programs.
The new flag --long-gc-sync=<time> helps by emitting a warning and
calling a user-overridable hook when the GC sync time exceeds the
specified threshold. A debugger can be used to set a breakpoint when
this happens and inspect the stacks of threads to find the culprit.
Test Plan:
```
$ ./inplace/bin/ghc-stage2 +RTS --long-gc-sync=0.0000001 -S
Alloc Copied Live GC GC TOT TOT Page Flts
bytes bytes bytes user elap user elap
1135856 51144 153736 0.000 0.000 0.002 0.002 0 0 (Gen: 0)
1034760 94704 188752 0.000 0.000 0.002 0.002 0 0 (Gen: 0)
1038888 134832 228888 0.009 0.009 0.011 0.011 0 0 (Gen: 1)
1025288 90128 235184 0.000 0.000 0.012 0.012 0 0 (Gen: 0)
1049088 130080 333984 0.000 0.000 0.013 0.013 0 0 (Gen: 0)
Warning: waited 0us for GC sync
1034424 73360 331976 0.000 0.000 0.013 0.013 0 0 (Gen: 0)
```
Also tested on a real production problem.
Reviewers: niteria, bgamari, erikd
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4193
Diffstat (limited to 'docs/users_guide/runtime_control.rst')
-rw-r--r-- | docs/users_guide/runtime_control.rst | 38 |
1 files changed, 38 insertions, 0 deletions
diff --git a/docs/users_guide/runtime_control.rst b/docs/users_guide/runtime_control.rst index f944ae95fe..f8e0ce026c 100644 --- a/docs/users_guide/runtime_control.rst +++ b/docs/users_guide/runtime_control.rst @@ -728,6 +728,44 @@ performance. that indicates the NUMA nodes on which to run the program. For example, ``--numa=3`` would run the program on NUMA nodes 0 and 1. +.. rts-flag:: --long-gc-sync + --long-gc-sync=<seconds> + + .. index:: + single: GC sync time, measuring + + When a GC starts, all the running mutator threads have to stop and + synchronise. The period between when the GC is initiated and all + the mutator threads are stopped is called the GC synchronisation + phase. If this phase is taking a long time (longer than 1ms is + considered long), then it can have a severe impact on overall + throughput. + + A long GC sync can be caused by a mutator thread that is inside an + ``unsafe`` FFI call, or running in a loop that doesn't allocate + memory and so doesn't yield. To fix the former, make the call + ``safe``, and to fix the latter, either avoid calling the code in + question or compile it with :ghc-flag:`-fomit-yields`. + + By default, the flag will cause a warning to be emitted to stderr + when the sync time exceeds the specified time. This behaviour can + be overriden, however: the ``longGCSync()`` hook is called when + the sync time is exceeded during the sync period, and the + ``longGCSyncEnd()`` hook at the end. Both of these hooks can be + overriden in the ``RtsConfig`` when the runtime is started with + ``hs_init_ghc()``. The default implementations of these hooks + (``LongGcSync()`` and ``LongGCSyncEnd()`` respectively) print + warnings to stderr. + + One way to use this flag is to set a breakpoint on + ``LongGCSync()`` in the debugger, and find the thread that is + delaying the sync. You probably want to use :ghc-flag:`-g` to + provide more info to the debugger. + + The GC sync time, along with other GC stats, are available by + calling the ``getRTSStats()`` function from C, or + ``GHC.Stats.getRTSStats`` from Haskell. + .. _rts-options-statistics: RTS options to produce runtime statistics |