docs/users_guide/parallel.xml


<?xml version="1.0" encoding="iso-8859-1"?>
<sect1 id="lang-parallel">
  <title>Parallel Haskell</title>
  <indexterm><primary>parallelism</primary>
  </indexterm>

  <para>There are two implementations of Parallel Haskell.  SMP parallelism
    <indexterm><primary>SMP</primary></indexterm>
    is built into GHC (see <xref linkend="sec-using-smp" />) and
    supports running Parallel Haskell programs on a single multiprocessor
    machine.
    Glasgow Parallel Haskell<indexterm><primary>Glasgow Parallel Haskell</primary></indexterm>
    (GPH) supports running Parallel Haskell
    programs on both clusters of machines and single multiprocessors.  GPH is
    developed and distributed
    separately from GHC (see <ulink url="http://www.cee.hw.ac.uk/~dsg/gph/">The
      GPH Page</ulink>).</para>
  
  <para>Ordinary single-threaded Haskell programs will not benefit from
    enabling SMP parallelism alone.  You must expose parallelism to the
    compiler in one of the following two ways.</para>
  
  <sect2>
    <title>Running Concurrent Haskell programs in parallel</title>

    <para>The first possibility is to use concurrent threads to structure your
      program, and make sure
      that you spread computation amongst the threads.  The runtime will
      schedule the running Haskell threads among the available OS
      threads, running as many in parallel as you specified with the
      <option>-N</option> RTS option.</para>
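
    <para>As a rough sketch of this approach, the following code splits a
      summation across one Haskell thread per range and collects the partial
      results through <literal>MVar</literal>s.  The names
      <function>sumRange</function> and <function>parallelSum</function> are
      hypothetical and chosen only for illustration; a real program would
      substitute its own workload.</para>

<programlisting>
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical workload: strictly sum the integers from lo to hi.
sumRange :: Int -&#62; Int -&#62; Int
sumRange lo hi = go 0 lo
  where
    go acc i
      | i &#62; hi    = acc
      | otherwise = let acc' = acc + i in acc' `seq` go acc' (i + 1)

-- Fork one thread per range, then wait for every partial result.
parallelSum :: [(Int, Int)] -&#62; IO Int
parallelSum ranges = do
    vars &#60;- mapM spawn ranges
    results &#60;- mapM takeMVar vars
    return (sum results)
  where
    spawn (lo, hi) = do
      var &#60;- newEmptyMVar
      -- ($!) forces the sum inside the worker thread; without it the
      -- MVar would hold an unevaluated thunk for the parent to compute.
      _ &#60;- forkIO (putMVar var $! sumRange lo hi)
      return var</programlisting>

    <para>Forcing each result inside the worker is what spreads the
      computation amongst the threads.  To see any speedup, a program like
      this would presumably be linked with <option>-threaded</option> and run
      with, say, <literal>+RTS -N2</literal>.</para>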
  </sect2>

  <sect2>
    <title>Annotating pure code for parallelism</title>

    <para>The simplest mechanism for extracting parallelism from pure code is
      to use the <literal>par</literal> combinator, which is closely related to (and often used
      with) <literal>seq</literal>.  Both of these are available from <ulink
	url="../libraries/base/Control-Parallel.html"><literal>Control.Parallel</literal></ulink>:</para>

<programlisting>
infixr 0 `par`
infixr 0 `seq`

par :: a -&#62; b -&#62; b
seq :: a -&#62; b -&#62; b</programlisting>

    <para>The expression <literal>(x `par` y)</literal>
      <emphasis>sparks</emphasis> the evaluation of <literal>x</literal>
      (to weak head normal form) and returns <literal>y</literal>.  Sparks are
      queued for execution in FIFO order, but are not executed immediately.  If
      the runtime detects that there is an idle CPU, then it may convert a
      spark into a real thread, and run the new thread on the idle CPU.  In
      this way the available parallelism is spread amongst the real
      CPUs.</para>
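
    <para>Because a spark is evaluated only to weak head normal form, it
      matters which expression is sparked.  The sketch below contrasts two
      hypothetical functions, <function>shallow</function> and
      <function>deep</function>, that return the same result but offer
      different amounts of work to a spark.</para>

<programlisting>
import Control.Parallel (par)

-- Sparks the list itself.  Its weak head normal form is just the first
-- cons cell, so the spark does almost no work; the parent still has to
-- compute the whole sum.
shallow :: [Int] -&#62; Int -&#62; Int
shallow xs y = xs `par` (y `seq` (sum xs + y))

-- Sparks the sum.  The weak head normal form of an Int is the fully
-- evaluated number, so the spark carries the whole summation.
deep :: [Int] -&#62; Int -&#62; Int
deep xs y = s `par` (y `seq` (s + y))
  where s = sum xs</programlisting>

    <para>Both functions are semantically identical, since
      <literal>par</literal> only hints at evaluation; only
      <function>deep</function> gives an idle CPU something substantial to
      do.</para>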

    <para>For example, consider the following parallel version of our old
      nemesis, <function>nfib</function>:</para>

<programlisting>
import Control.Parallel

nfib :: Int -&#62; Int
nfib n | n &#60;= 1 = 1
       | otherwise = par n1 (seq n2 (n1 + n2 + 1))
                     where n1 = nfib (n-1)
                           n2 = nfib (n-2)</programlisting>

    <para>For values of <varname>n</varname> greater than 1, we use
      <function>par</function> to spark the evaluation of <literal>nfib (n-1)</literal>,
      and then we use <function>seq</function> to force the
      parent thread to evaluate <literal>nfib (n-2)</literal> before going on
      to add together these two subexpressions.  In this divide-and-conquer
      approach, we only create a spark for one branch of the computation
      (leaving the parent to evaluate the other branch).  Also, we must use
      <function>seq</function> to ensure that the parent will evaluate
      <varname>n2</varname> <emphasis>before</emphasis> <varname>n1</varname>
      in the expression <literal>(n1 + n2 + 1)</literal>.  It is not sufficient
      to reorder the expression as <literal>(n2 + n1 + 1)</literal>, because
      the compiler may not generate code to evaluate the addends from left to
      right.</para>

    <para>When using <literal>par</literal>, the general rule of thumb is that
      the sparked computation should be required at a later time, but not too
      soon.  Also, the sparked computation should not be too small, otherwise
      the cost of forking it in parallel will be too large relative to the
      amount of parallelism gained.  Getting these factors right is tricky in
      practice.</para>
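
    <para>One common way to keep sparked computations from becoming too small
      is to stop sparking below a cutoff and fall back to sequential code.
      The following is a minimal sketch of that idea applied to
      <function>nfib</function>.  The names <function>nfibSeq</function> and
      <function>nfibPar</function> and the <varname>threshold</varname>
      parameter are assumptions made for this example, and a suitable
      threshold would have to be found by measurement.</para>

<programlisting>
import Control.Parallel (par)

-- Plain sequential version, used below the cutoff.
nfibSeq :: Int -&#62; Int
nfibSeq n | n &#60;= 1    = 1
          | otherwise = nfibSeq (n-1) + nfibSeq (n-2) + 1

-- Sparks only while n is above the (hypothetical) threshold.
-- The cutoff is clamped to at least 1 so the base case is always
-- handled by nfibSeq.
nfibPar :: Int -&#62; Int -&#62; Int
nfibPar threshold n
  | n &#60;= max 1 threshold = nfibSeq n
  | otherwise            = n1 `par` (n2 `seq` (n1 + n2 + 1))
  where n1 = nfibPar threshold (n-1)
        n2 = nfibPar threshold (n-2)</programlisting>

    <para>With a larger threshold fewer, coarser sparks are created; with a
      smaller one there are more sparks but each does less work.</para>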

    <para>More sophisticated combinators for expressing parallelism are
      available from the <ulink
	url="../libraries/base/Control-Parallel-Strategies.html"><literal>Control.Parallel.Strategies</literal></ulink> module.
      This module builds functionality around <literal>par</literal>,
      expressing more elaborate patterns of parallel computation, such as
      parallel <literal>map</literal>.</para>
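
    <para>As an illustration of a parallel <literal>map</literal>, the sketch
      below applies <function>nfib</function> to every element of a list with
      <function>parMap</function>.  It assumes a version of the module that
      exports <function>parMap</function> and the <function>rwhnf</function>
      strategy; the set of strategy names has varied between library
      releases (some call this strategy <function>rseq</function>), so check
      the module documentation for your installation.  The name
      <function>nfibs</function> is hypothetical.</para>

<programlisting>
import Control.Parallel.Strategies (parMap, rwhnf)

nfib :: Int -&#62; Int
nfib n | n &#60;= 1    = 1
       | otherwise = nfib (n-1) + nfib (n-2) + 1

-- Evaluate each list element in parallel.  rwhnf reduces each result
-- to weak head normal form, which for an Int is the complete value.
nfibs :: [Int] -&#62; [Int]
nfibs = parMap rwhnf nfib</programlisting>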
  </sect2>

</sect1>

<!-- Emacs stuff:
     ;;; Local Variables: ***
     ;;; mode: xml ***
     ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
     ;;; End: ***
 -->