author | Simon Marlow <marlowsd@gmail.com> | 2011-11-29 13:05:48 +0000 |
---|---|---|
committer | Simon Marlow <marlowsd@gmail.com> | 2011-11-29 14:22:27 +0000 |
commit | 1ed0dfa1fe8d50ece73ee9872aa045998ef6f0f5 (patch) | |
tree | f520cc435ea93ad6e5ad1b7cfd2cfd046c79d316 /docs/users_guide/profiling.xml | |
parent | f44f725e69999a9ca0fabdcf4f24e3d47e80685b (diff) | |
doc update: -prof now works with +RTS -N (with caveats)
Diffstat (limited to 'docs/users_guide/profiling.xml')
-rw-r--r-- | docs/users_guide/profiling.xml | 44 |
1 files changed, 38 insertions, 6 deletions
diff --git a/docs/users_guide/profiling.xml b/docs/users_guide/profiling.xml
index a5a1d4911c..ee3b387e31 100644
--- a/docs/users_guide/profiling.xml
+++ b/docs/users_guide/profiling.xml
@@ -9,12 +9,6 @@
   can answer questions like "why is my program so slow?", or "why is
   my program using so much memory?".</para>
 
-  <para>Note that multi-processor execution (e.g. <literal>+RTS
-  -N2</literal>) is not currently supported with GHC's time and space
-  profiling. However, there is a separate tool specifically for
-  profiling concurrent and parallel programs: <ulink
-  url="http://www.haskell.org/haskellwiki/ThreadScope">ThreadScope</ulink>.</para>
-
   <para>Profiling a program is a three-step process:</para>
 
   <orderedlist>
@@ -1359,6 +1353,44 @@ to re-read its input file:
     </sect2>
   </sect1>
 
+  <sect1 id="prof-threaded">
+    <title>Profiling Parallel and Concurrent Programs</title>
+
+    <para>Combining <option>-threaded</option>
+      and <option>-prof</option> is perfectly fine, and indeed it is
+      possible to profile a program running on multiple processors
+      with the <option>+RTS -N</option> option.<footnote>This feature
+      was added in GHC 7.4.1.</footnote>
+    </para>
+
+    <para>
+      Some caveats apply, however. In the current implementation, a
+      profiled program is likely to scale much less well than the
+      unprofiled program, because the profiling implementation uses
+      some shared data structures which require locking in the runtime
+      system. Furthermore, the memory allocation statistics collected
+      by the profiled program are stored in shared memory
+      but <emphasis>not</emphasis> locked (for speed), which means
+      that these figures might be inaccurate for parallel programs.
+    </para>
+
+    <para>
+      We strongly recommend that you
+      use <option>-fno-prof-count-entries</option> when compiling a
+      program to be profiled on multiple cores, because the entry
+      counts are also stored in shared memory, and continuously
+      updating them on multiple cores is extremely slow.
+    </para>
+
+    <para>
+      We also recommend
+      using <ulink url="http://www.haskell.org/haskellwiki/ThreadScope">ThreadScope</ulink>
+      for profiling parallel programs; it offers a GUI for visualising
+      parallel execution, and is complementary to the time and space
+      profiling features provided with GHC.
+    </para>
+  </sect1>
+
   <sect1 id="hpc">
     <title>Observing Code Coverage</title>
     <indexterm><primary>code coverage</primary></indexterm>
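For anyone wanting to try the behaviour this patch documents, the following is a minimal sketch of a small concurrent program together with one plausible way to build and run it. It is not part of the commit. The file name `Main.hs`, the function names, and the workload are hypothetical. The flags `-threaded`, `-prof`, `-fno-prof-count-entries` and `+RTS -N` are the ones named in the new section; `-rtsopts`, `-fprof-auto` and `-p` are assumptions added here so that the run-time options are accepted and a time profile is actually written.

```haskell
-- Main.hs: a hypothetical test program, not taken from the commit above.
--
-- Assumed build and run commands:
--
--   ghc -O2 -threaded -prof -fprof-auto -fno-prof-count-entries -rtsopts Main.hs
--   ./Main +RTS -N2 -p -RTS
--
-- With -p the runtime should write a time profile to Main.prof.
module Main (main) where

import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception (evaluate)
import Data.List (foldl')

-- | Sum of squares over an inclusive range; stands in for real work.
sumSquares :: Int -> Int -> Int
sumSquares lo hi = foldl' (\acc x -> acc + x * x) 0 [lo .. hi]

-- | Split [1..n] into k contiguous ranges; the last one absorbs the remainder.
chunks :: Int -> Int -> [(Int, Int)]
chunks k n =
  [ (i * size + 1, if i == k - 1 then n else (i + 1) * size)
  | i <- [0 .. k - 1]
  ]
  where
    size = n `div` k

-- | Evaluate each range in its own thread and add up the partial results.
parallelSumSquares :: Int -> Int -> IO Int
parallelSumSquares k n = do
  vars <- mapM spawn (chunks k n)
  sum <$> mapM takeMVar vars
  where
    spawn (lo, hi) = do
      var <- newEmptyMVar
      -- evaluate forces the sum inside the worker thread, so the work is
      -- not deferred to whichever thread later reads the MVar.
      _ <- forkIO (evaluate (sumSquares lo hi) >>= putMVar var)
      pure var

main :: IO ()
main = do
  caps <- getNumCapabilities -- reflects the value passed to +RTS -N
  total <- parallelSumSquares (max 1 caps) 1000000
  print total
```

Building the same program once with and once without `-fno-prof-count-entries`, then comparing wall-clock times at `-N2` or higher, is one way to observe the slowdown the new text warns about. Given the caveat about unlocked allocation statistics, the allocation figures from such a run are best treated as approximate.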