author | Andreas Klebinger <klebinger.andreas@gmx.at> | 2022-04-07 17:21:47 +0200 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2022-08-06 06:13:17 -0400 |
commit | fab0ee93abda33bf5c7eb5ca0372e12bd140a252 (patch) | |
tree | dfb79e20a525328a52bd5ea9168583b836f9ab54 /docs | |
parent | 1f6c56ae9aa4ab4977ba376ac901d5256bf0aba0 (diff) | |
Change `-fprof-late` to insert cost centres after unfolding creation.
The former behaviour of adding cost centres after optimization but
before unfoldings are created is now available via the flag
`-fprof-late-inline` instead.
I also reduced the overhead of -fprof-late* by pushing the cost centres
into lambdas. This means the cost centres will only account for
execution of functions and not their partial application.
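A rough illustration of what "pushing the cost centre into the lambda"
means, using hand-written SCC pragmas as a stand-in for what the LateCC
pass does internally. The names `addOuter` and `addInner` are
hypothetical and not part of the patch; this is only a sketch of the
idea, not the pass's actual output.

```haskell
module Main (main) where

-- Cost centre wrapped around the whole binding: it is entered whenever
-- the binding is evaluated, which includes partial application.
addOuter :: Int -> Int -> Int
addOuter = {-# SCC "addOuter" #-} (\x y -> x + y)

-- Cost centre pushed inside the lambdas: it is only entered once both
-- arguments are supplied and the body actually runs.
addInner :: Int -> Int -> Int
addInner = \x y -> {-# SCC "addInner" #-} (x + y)

main :: IO ()
main = do
  let partial = addInner 1   -- partial application, body not run yet
  print (addOuter 1 2)
  print (partial 2)
```

Without `-prof` the pragmas are ignored, so the module also builds and
runs as an ordinary program.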
Further, I made LATE_CC cost centres their own CC flavour so they now
won't clash with user-defined ones if a user uses the same string for
a custom SCC.
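A sketch of the kind of program where such a clash could arise. It
assumes the automatically added cost centre takes its label from the
binding name, so the user-written label "process" below would coincide
with it; the function `process` is a hypothetical example, not code from
the patch.

```haskell
module Main (main) where

-- The user-written SCC label matches the name of the top-level binding.
-- When built with -prof -fprof-late, the compiler also adds a late cost
-- centre for `process`; with separate flavours the two stay distinct.
process :: Int -> Int
process n = {-# SCC "process" #-} (n * 2)

main :: IO ()
main = print (process 21)
```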
LateCC: Don't put cost centres inside constructor workers.
With -fprof-late they are rarely useful as the worker is usually
inlined. Even if the worker is not inlined or we use -fprof-late-inline
they are generally not helpful but bloat compile and run time
significantly. So we just don't add SCCs inside constructor workers.
-------------------------
Metric Decrease:
T13701
-------------------------
Diffstat (limited to 'docs')
-rw-r--r-- | docs/users_guide/debugging.rst | 6 |
-rw-r--r-- | docs/users_guide/profiling.rst | 38 |
2 files changed, 39 insertions, 5 deletions
diff --git a/docs/users_guide/debugging.rst b/docs/users_guide/debugging.rst
index 0c09c4c3ec..87a689c268 100644
--- a/docs/users_guide/debugging.rst
+++ b/docs/users_guide/debugging.rst
@@ -446,6 +446,12 @@ subexpression elimination pass.
 
     Dump output of Core preparation pass
 
+.. ghc-flag:: -ddump-late-cc
+    :shortdesc: Dump core with late cost centres added
+    :type: dynamic
+
+    Dump output of LateCC pass after cost centres have been added.
+
 .. ghc-flag:: -ddump-view-pattern-commoning
     :shortdesc: Dump commoned view patterns
     :type: dynamic
diff --git a/docs/users_guide/profiling.rst b/docs/users_guide/profiling.rst
index 418c9b0bb0..1c2f458f10 100644
--- a/docs/users_guide/profiling.rst
+++ b/docs/users_guide/profiling.rst
@@ -439,19 +439,47 @@ compiled program.
     details.
 
 .. ghc-flag:: -fprof-late
-    :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the optimizer has run.
+    :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the core pipeline has run.
     :type: dynamic
     :reverse: -fno-prof-late
     :category:
 
     :since: 9.4.1
 
-    Adds an automatic ``SCC`` annotation to all top level bindings late in the core pipeline after
-    the optimizer has run. This means these cost centres will not interfere with core-level optimizations
+    Adds an automatic ``SCC`` annotation to all top level bindings late in the compilation pipeline after
+    the optimizer has run and unfoldings have been created. This means these cost centres will not interfere with core-level optimizations
     and the resulting profile will be closer to the performance profile of an optimized non-profiled executable.
-    While the results of this are generally very informative some of the compiler internal names
-    will leak into the profile.
+    While the results of this are generally informative, some of the compiler internal names
+    will leak into the profile. Further if a function is inlined into a use site it's costs will be counted against the
+    caller's cost center.
+
+    For example if we have this code:
+
+    .. code-block:: haskell
+
+        {-# INLINE mysum #-}
+        mysum = sum
+        main = print $ mysum [1..9999999]
+
+    Then ``mysum`` will not show up in the profile since it will be inlined into main and therefore
+    it's associated costs will be attributed to mains implicit cost centre.
+
+.. ghc-flag:: -fprof-late-inline
+    :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the optimizer has run and retain them when inlining.
+    :type: dynamic
+    :reverse: -fno-prof-late-inline
+    :category:
+
+    :since: 9.4.1
+
+    Adds an automatic ``SCC`` annotation to all top level bindings late in the core pipeline after
+    the optimizer has run. This is the same as :ghc-flag:`-fprof-late` except that cost centers are included in some unfoldings.
+
+    The result of which is that cost centers *can* inhibit core optimizations to some degree at use sites
+    after inlining. Further there can be significant overhead from cost centres added to small functions if they are inlined often.
+
+    You can try this mode if :ghc-flag:`-fprof-late` results in a profile that's too hard to interpret.
 
 .. ghc-flag:: -fprof-cafs
     :shortdesc: Auto-add ``SCC``\\ s to all CAFs
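To try the documented `mysum` behaviour locally, a self-contained
variant of that example might look like the sketch below. The second
function, `mysumKept`, and its NOINLINE pragma are additions for
contrast and are not part of the patch; the expectation that it keeps
its own entry in the profile is an assumption based on the flag
description, not something verified here.

```haskell
module Main (main) where

-- Expected to be inlined into main, so under -fprof-late its costs
-- should be attributed to main's cost centre.
{-# INLINE mysumInlined #-}
mysumInlined :: [Int] -> Int
mysumInlined = sum

-- Hypothetical contrast case: never inlined, so it should keep a
-- late cost centre of its own.
{-# NOINLINE mysumKept #-}
mysumKept :: [Int] -> Int
mysumKept = sum

main :: IO ()
main = do
  print (mysumInlined [1 .. 100])
  print (mysumKept [1 .. 100])
```

Building with `ghc -O -prof -fprof-late Main.hs` and running
`./Main +RTS -p` should write a `Main.prof` report to compare against;
swapping in `-fprof-late-inline` gives the other mode.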