author: Simon Marlow <marlowsd@gmail.com> 2011-11-29 13:05:48 +0000
committer: Simon Marlow <marlowsd@gmail.com> 2011-11-29 14:22:27 +0000
commit: 1ed0dfa1fe8d50ece73ee9872aa045998ef6f0f5
tree: f520cc435ea93ad6e5ad1b7cfd2cfd046c79d316 /docs/users_guide/profiling.xml
parent: f44f725e69999a9ca0fabdcf4f24e3d47e80685b
doc update: -prof now works with +RTS -N (with caveats)
Diffstat (limited to 'docs/users_guide/profiling.xml')
-rw-r--r-- docs/users_guide/profiling.xml 44
1 files changed, 38 insertions, 6 deletions
diff --git a/docs/users_guide/profiling.xml b/docs/users_guide/profiling.xml
index a5a1d4911c..ee3b387e31 100644
--- a/docs/users_guide/profiling.xml
+++ b/docs/users_guide/profiling.xml
@@ -9,12 +9,6 @@
can answer questions like "why is my program so slow?", or "why is
my program using so much memory?".</para>
- <para>Note that multi-processor execution (e.g. <literal>+RTS
- -N2</literal>) is not currently supported with GHC's time and space
- profiling. However, there is a separate tool specifically for
- profiling concurrent and parallel programs: <ulink
- url="http://www.haskell.org/haskellwiki/ThreadScope">ThreadScope</ulink>.</para>
-
<para>Profiling a program is a three-step process:</para>
<orderedlist>
@@ -1359,6 +1353,44 @@ to re-read its input file:
</sect2>
</sect1>
+ <sect1 id="prof-threaded">
+ <title>Profiling Parallel and Concurrent Programs</title>
+
+ <para>Combining <option>-threaded</option>
+ and <option>-prof</option> is perfectly fine, and indeed it is
+ possible to profile a program running on multiple processors
+ with the <option>+RTS -N</option> option.<footnote>This feature
+ was added in GHC 7.4.1.</footnote>
+ </para>
+
+ <para>
+ Some caveats apply, however. In the current implementation, a
+ profiled program is likely to scale much less well than the
+ unprofiled program, because the profiling implementation uses
+ some shared data structures which require locking in the runtime
+ system. Furthermore, the memory allocation statistics collected
+ by the profiled program are stored in shared memory
+ but <emphasis>not</emphasis> locked (for speed), which means
+ that these figures might be inaccurate for parallel programs.
+ </para>
+
+ <para>
+ We strongly recommend that you
+ use <option>-fno-prof-count-entries</option> when compiling a
+ program to be profiled on multiple cores, because the entry
+ counts are also stored in shared memory, and continuously
+ updating them on multiple cores is extremely slow.
+ </para>
+
+ <para>
+ We also recommend
+ using <ulink url="http://www.haskell.org/haskellwiki/ThreadScope">ThreadScope</ulink>
+ for profiling parallel programs; it offers a GUI for visualising
+ parallel execution, and is complementary to the time and space
+ profiling features provided with GHC.
+ </para>
+ </sect1>
+
<sect1 id="hpc">
<title>Observing Code Coverage</title>
<indexterm><primary>code coverage</primary></indexterm>