docs/programmer_reference/transapp_read.html


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <title>Degrees of isolation</title>
    <link rel="stylesheet" href="gettingStarted.css" type="text/css" />
    <meta name="generator" content="DocBook XSL Stylesheets V1.73.2" />
    <link rel="start" href="index.html" title="Berkeley DB Programmer's Reference Guide" />
    <link rel="up" href="transapp.html" title="Chapter 11.  Berkeley DB Transactional Data Store Applications" />
    <link rel="prev" href="transapp_inc.html" title="Isolation" />
    <link rel="next" href="transapp_cursor.html" title="Transactional cursors" />
  </head>
  <body>
    <div xmlns="" class="navheader">
      <div class="libver">
        <p>Library Version 12.1.6.1</p>
      </div>
      <table width="100%" summary="Navigation header">
        <tr>
          <th colspan="3" align="center">Degrees of isolation</th>
        </tr>
        <tr>
          <td width="20%" align="left"><a accesskey="p" href="transapp_inc.html">Prev</a> </td>
          <th width="60%" align="center">Chapter 11.  Berkeley DB Transactional Data Store Applications </th>
          <td width="20%" align="right"> <a accesskey="n" href="transapp_cursor.html">Next</a></td>
        </tr>
      </table>
      <hr />
    </div>
    <div class="sect1" lang="en" xml:lang="en">
      <div class="titlepage">
        <div>
          <div>
            <h2 class="title" style="clear: both"><a id="transapp_read"></a>Degrees of isolation</h2>
          </div>
        </div>
      </div>
      <div class="toc">
        <dl>
          <dt>
            <span class="sect2">
              <a href="transapp_read.html#snapshot_isolation">Snapshot Isolation</a>
            </span>
          </dt>
        </dl>
      </div>
      <p>
        Transactions can be isolated from each other to different
        degrees. <span class="emphasis"><em>Serializable</em></span> provides the most
        isolation, and means that, for the life of the transaction,
        every time a thread of control reads a data item, it will be
        unchanged from its previous value (assuming, of course, the
        thread of control does not itself modify the item). By
        default, Berkeley DB enforces serializability whenever
        database reads are wrapped in transactions. This is also known
        as <span class="emphasis"><em>degree 3 isolation</em></span>.
    </p>
      <p>
        Most applications do not need to enclose all reads in
        transactions, and when possible, transactionally protected
        reads at serializable isolation should be avoided as they can
        cause performance problems. For example, a serializable cursor
        sequentially reading each key/data pair in a database, will
        acquire a read lock on most of the pages in the database and
        so will gradually block all write operations on the databases
        until the transaction commits or aborts. Note, however, that
        if there are update transactions present in the application,
        the read operations must still use locking, and must be
        prepared to repeat any operation (possibly closing and
        reopening a cursor) that fails with a return value of <a class="link" href="program_errorret.html#program_errorret.DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>.
        Applications that need repeatable reads are ones that require 
        the ability to repeatedly access a data item knowing that it will
        not have changed (for example, an operation modifying a data item based
        on its existing value).
    </p>
      <p>
        <span class="emphasis"><em>Snapshot isolation</em></span> also guarantees
        repeatable reads, but avoids read locks by using multiversion
        concurrency control (MVCC). This makes update operations more
        expensive, because they have to allocate space for new
        versions of pages in cache and make copies, but avoiding read
        locks can significantly increase throughput for many
        applications. Snapshot isolation is discussed in detail
        below.
    </p>
      <p>
        A transaction may only require <span class="emphasis"><em>cursor
        stability</em></span>, that is only be guaranteed that
        cursors see committed data that does not change so long as it
        is addressed by the cursor, but may change before the reading
        transaction completes. This is also called <span class="emphasis"><em>degree 2
        isolation</em></span>. Berkeley DB provides this level of
        isolation when a transaction is started with the
        <a href="../api_reference/C/dbcget.html#dbcget_DB_READ_COMMITTED" class="olink">DB_READ_COMMITTED</a> flag. This flag may also be specified when
        opening a cursor within a fully isolated transaction.
    </p>
      <p>
        Berkeley DB optionally supports reading uncommitted data;
        that is, read operations may request data which has been
        modified but not yet committed by another transaction. This is
        also called <span class="emphasis"><em>degree 1 isolation</em></span>. This is
        done by first specifying the <a href="../api_reference/C/dbopen.html#dbopen_DB_READ_UNCOMMITTED" class="olink">DB_READ_UNCOMMITTED</a> flag when
        opening the underlying database, and then specifying the
        <a href="../api_reference/C/dbopen.html#dbopen_DB_READ_UNCOMMITTED" class="olink">DB_READ_UNCOMMITTED</a> flag when beginning a transaction,
        opening a cursor, or performing a read operation. The
        advantage of using <a href="../api_reference/C/dbopen.html#dbopen_DB_READ_UNCOMMITTED" class="olink">DB_READ_UNCOMMITTED</a> is that read
        operations will not block when another transaction holds a
        write lock on the requested data; the disadvantage is that
        read operations may return data that will disappear should the
        transaction holding the write lock abort.
    </p>
      <div class="sect2" lang="en" xml:lang="en">
        <div class="titlepage">
          <div>
            <div>
              <h3 class="title"><a id="snapshot_isolation"></a>Snapshot Isolation</h3>
            </div>
          </div>
        </div>
        <p>
            To make use of snapshot isolation, databases must first
            be configured for multiversion access by calling <a href="../api_reference/C/dbopen.html" class="olink">DB-&gt;open()</a>
            with the <a href="../api_reference/C/dbopen.html#dbopen_DB_MULTIVERSION" class="olink">DB_MULTIVERSION</a> flag. Then transactions or
            cursors must be configured with the <a href="../api_reference/C/txnbegin.html#txnbegin_DB_TXN_SNAPSHOT" class="olink">DB_TXN_SNAPSHOT</a>
            flag.
        </p>
        <p>
            When configuring an environment for snapshot isolation,
            it is important to realize that having multiple versions
            of pages in cache means that the working set will take up
            more of the cache. As a result, snapshot isolation is best
            suited for use with larger cache sizes.
        </p>
        <p>
            If the cache becomes full of page copies before the old
            copies can be discarded, additional I/O will occur as
            pages are written to temporary "freezer" files. This can
            substantially reduce throughput, and should be avoided if
            possible by configuring a large cache and keeping snapshot
            isolation transactions short. The amount of cache required
            to avoid freezing buffers can be estimated by taking a
            checkpoint followed by a call to <a href="../api_reference/C/logarchive.html" class="olink">DB_ENV-&gt;log_archive()</a>. The amount
            of cache required is approximately double the size of logs
            that remains.
        </p>
        <p>
            The environment should also be configured for sufficient
            transactions using <a href="../api_reference/C/envset_tx_max.html" class="olink">DB_ENV-&gt;set_tx_max()</a>. The maximum number of
            transactions needs to include all transactions executed
            concurrently by the application plus all cursors
            configured for snapshot isolation. Further, the
            transactions are retained until the last page they created
            is evicted from cache, so in the extreme case, an
            additional transaction may be needed for each page in the
            cache. Note that cache sizes under 500MB are increased by
            25%, so the calculation of number of pages needs to take
            this into account.
        </p>
        <p>
            So when <span class="emphasis"><em>should</em></span> applications use
            snapshot isolation?
        </p>
        <div class="itemizedlist">
          <ul type="disc">
            <li>
                There is a large cache relative to size of
                updates performed by concurrent transactions;
                and
            </li>
            <li>
                Read/write contention is limiting the throughput
                of the application; or
            </li>
            <li>
                The application is all or mostly read-only, and
                contention for the lock manager mutex is limiting
                throughput.
            </li>
          </ul>
        </div>
        <p>
            The simplest way to take advantage of snapshot isolation
            is for queries: keep update transactions using full
            read/write locking and set <a href="../api_reference/C/txnbegin.html#txnbegin_DB_TXN_SNAPSHOT" class="olink">DB_TXN_SNAPSHOT</a> on read-only
            transactions or cursors. This should minimize blocking of
            snapshot isolation transactions and will avoid introducing
            new <a class="link" href="program_errorret.html#program_errorret.DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a> errors.
        </p>
        <p>
            If the application has update transactions which read
            many items and only update a small set (for example,
            scanning until a desired record is found, then modifying
            it), throughput may be improved by running some updates at
            snapshot isolation as well.
        </p>
      </div>
    </div>
    <div class="navfooter">
      <hr />
      <table width="100%" summary="Navigation footer">
        <tr>
          <td width="40%" align="left"><a accesskey="p" href="transapp_inc.html">Prev</a> </td>
          <td width="20%" align="center">
            <a accesskey="u" href="transapp.html">Up</a>
          </td>
          <td width="40%" align="right"> <a accesskey="n" href="transapp_cursor.html">Next</a></td>
        </tr>
        <tr>
          <td width="40%" align="left" valign="top">Isolation </td>
          <td width="20%" align="center">
            <a accesskey="h" href="index.html">Home</a>
          </td>
          <td width="40%" align="right" valign="top"> Transactional cursors</td>
        </tr>
      </table>
    </div>
  </body>
</html>