author: Andreas Klebinger <klebinger.andreas@gmx.at> 2022-04-07 17:21:47 +0200
committer: Marge Bot <ben+marge-bot@smart-cactus.org> 2022-08-06 06:13:17 -0400
commit: fab0ee93abda33bf5c7eb5ca0372e12bd140a252
tree: dfb79e20a525328a52bd5ea9168583b836f9ab54 /docs
parent: 1f6c56ae9aa4ab4977ba376ac901d5256bf0aba0
Change `-fprof-late` to insert cost centres after unfolding creation.
The former behaviour of adding cost centres after optimization but before unfoldings are created is now available via the flag `-fprof-late-inline` instead.

I also reduced the overhead of -fprof-late* by pushing the cost centres into lambdas. This means the cost centres will only account for execution of functions and not their partial application.

Further, I made LATE_CC cost centres their own CC flavour so they now won't clash with user-defined ones if a user uses the same string for a custom scc.

LateCC: Don't put cost centres inside constructor workers.

With -fprof-late they are rarely useful as the worker is usually inlined. Even if the worker is not inlined or we use -fprof-late-inline they are generally not helpful but bloat compile and run time significantly. So we just don't add sccs inside constructor workers.

-------------------------
Metric Decrease:
    T13701
-------------------------
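The difference between a cost centre around a lambda and one pushed inside it can be sketched with hand-written `SCC` pragmas. This is only an illustration, not the code the LateCC pass generates: the names `scaleOuter`, `scaleInner` and `total` are hypothetical, and manual pragmas only approximate where the compiler places its late cost centres.

```haskell
module Main (main) where

-- Cost centre wrapped around the whole lambda: the annotation sits outside
-- the function, so evaluating or partially applying the binding may already
-- be charged to it.
scaleOuter :: Int -> Int -> Int
scaleOuter = {-# SCC "scaleOuter" #-} (\k x -> k * x)

-- Cost centre pushed inside the lambda: it only covers the body, i.e. the
-- work done once all arguments have been supplied.
scaleInner :: Int -> Int -> Int
scaleInner = \k x -> {-# SCC "scaleInner" #-} (k * x)

-- Partially apply both variants, then saturate them via map.
total :: Int
total = sum (map (scaleOuter 2) [1, 2, 3]) + sum (map (scaleInner 3) [1, 2, 3])

main :: IO ()
main = print total
```

Without `-prof` the pragmas are ignored and the program simply prints the sum; the two placements only matter for how costs are attributed in a profiled build.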
Diffstat (limited to 'docs')
-rw-r--r-- docs/users_guide/debugging.rst | 6
-rw-r--r-- docs/users_guide/profiling.rst | 38
2 files changed, 39 insertions, 5 deletions
diff --git a/docs/users_guide/debugging.rst b/docs/users_guide/debugging.rst
index 0c09c4c3ec..87a689c268 100644
--- a/docs/users_guide/debugging.rst
+++ b/docs/users_guide/debugging.rst
@@ -446,6 +446,12 @@ subexpression elimination pass.
Dump output of Core preparation pass
+.. ghc-flag:: -ddump-late-cc
+ :shortdesc: Dump core with late cost centres added
+ :type: dynamic
+
+ Dump output of LateCC pass after cost centres have been added.
+
.. ghc-flag:: -ddump-view-pattern-commoning
:shortdesc: Dump commoned view patterns
:type: dynamic
diff --git a/docs/users_guide/profiling.rst b/docs/users_guide/profiling.rst
index 418c9b0bb0..1c2f458f10 100644
--- a/docs/users_guide/profiling.rst
+++ b/docs/users_guide/profiling.rst
@@ -439,19 +439,47 @@ compiled program.
details.
.. ghc-flag:: -fprof-late
- :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the optimizer has run.
+ :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the core pipeline has run.
:type: dynamic
:reverse: -fno-prof-late
:category:
:since: 9.4.1
- Adds an automatic ``SCC`` annotation to all top level bindings late in the core pipeline after
- the optimizer has run. This means these cost centres will not interfere with core-level optimizations
+ Adds an automatic ``SCC`` annotation to all top level bindings late in the compilation pipeline after
+ the optimizer has run and unfoldings have been created. This means these cost centres will not interfere with core-level optimizations
and the resulting profile will be closer to the performance profile of an optimized non-profiled
executable.
- While the results of this are generally very informative some of the compiler internal names
- will leak into the profile.
+ While the results of this are generally informative, some of the compiler internal names
+ will leak into the profile. Further, if a function is inlined into a use site, its costs will be counted against the
+ caller's cost centre.
+
+ For example if we have this code:
+
+ .. code-block:: haskell
+
+ {-# INLINE mysum #-}
+ mysum = sum
+ main = print $ mysum [1..9999999]
+
+ Then ``mysum`` will not show up in the profile since it will be inlined into ``main`` and therefore
+ its associated costs will be attributed to ``main``'s implicit cost centre.
+
+.. ghc-flag:: -fprof-late-inline
+ :shortdesc: Auto-add ``SCC``\\ s to all top level bindings *after* the optimizer has run and retain them when inlining.
+ :type: dynamic
+ :reverse: -fno-prof-late-inline
+ :category:
+
+ :since: 9.4.1
+
+ Adds an automatic ``SCC`` annotation to all top level bindings late in the core pipeline after
+ the optimizer has run. This is the same as :ghc-flag:`-fprof-late` except that cost centres are included in some unfoldings.
+
+ As a result, cost centres *can* inhibit core optimizations to some degree at use sites
+ after inlining. Further, there can be significant overhead from cost centres added to small functions if they are inlined often.
+
+ You can try this mode if :ghc-flag:`-fprof-late` results in a profile that's too hard to interpret.
.. ghc-flag:: -fprof-cafs
:shortdesc: Auto-add ``SCC``\\ s to all CAFs