Diffstat (limited to 'docs/programmer_reference/transapp_throughput.html')
| -rw-r--r-- | docs/programmer_reference/transapp_throughput.html | 268 |
1 file changed, 171 insertions, 97 deletions
diff --git a/docs/programmer_reference/transapp_throughput.html b/docs/programmer_reference/transapp_throughput.html
index 19d4cf6e..c5bc7083 100644
--- a/docs/programmer_reference/transapp_throughput.html
+++ b/docs/programmer_reference/transapp_throughput.html
@@ -14,17 +14,16 @@
 <body>
 <div xmlns="" class="navheader">
 <div class="libver">
- <p>Library Version 11.2.5.3</p>
+ <p>Library Version 12.1.6.1</p>
 </div>
 <table width="100%" summary="Navigation header">
 <tr>
- <th colspan="3" align="center">Transaction throughput</th>
+ <th colspan="3" align="center">Transaction
+ throughput</th>
 </tr>
 <tr>
 <td width="20%" align="left"><a accesskey="p" href="transapp_tune.html">Prev</a> </td>
- <th width="60%" align="center">Chapter 11.
- Berkeley DB Transactional Data Store Applications
- </th>
+ <th width="60%" align="center">Chapter 11. Berkeley DB Transactional Data Store Applications </th>
 <td width="20%" align="right"> <a accesskey="n" href="transapp_faq.html">Next</a></td>
 </tr>
 </table>
@@ -34,121 +33,196 @@
 <div class="titlepage">
 <div>
 <div>
- <h2 class="title" style="clear: both"><a id="transapp_throughput"></a>Transaction throughput</h2>
+ <h2 class="title" style="clear: both"><a id="transapp_throughput"></a>Transaction
+ throughput</h2>
 </div>
 </div>
 </div>
- <p>Generally, the speed of a database system is measured by the
-<span class="emphasis"><em>transaction throughput</em></span>, expressed as a number of
-transactions per second. The two gating factors for Berkeley DB performance
-in a transactional system are usually the underlying database files and
-the log file. Both are factors because they require disk I/O, which is
-slow relative to other system resources such as CPU.</p>
- <p>In the worst-case scenario:</p>
+ <p>
+ Generally, the speed of a database system is measured by the
+ <span class="emphasis"><em>transaction throughput</em></span>, expressed as
+ a number of transactions per second. The two gating factors
+ for Berkeley DB performance in a transactional system are
+ usually the underlying database files and the log file. Both
+ are factors because they require disk I/O, which is slow
+ relative to other system resources such as CPU.
+ </p>
+ <p>
+ In the worst-case scenario:
+ </p>
 <div class="itemizedlist">
 <ul type="disc">
- <li>Database access is truly random and the database is too large for any
-significant percentage of it to fit into the cache, resulting in a
-single I/O per requested key/data pair.</li>
- <li>Both the database and the log are on a single disk.</li>
+ <li>
+ Database access is truly random and the database is
+ too large for any significant percentage of it to fit into
+ the cache, resulting in a single I/O per requested
+ key/data pair.
+ </li>
+ <li>
+ Both the database and the log are on a single
+ disk.
+ </li>
 </ul>
 </div>
- <p>This means that for each transaction, Berkeley DB is potentially performing
-several filesystem operations:</p>
+ <p>
+ This means that for each transaction, Berkeley DB is
+ potentially performing several filesystem operations:
+ </p>
 <div class="itemizedlist">
 <ul type="disc">
- <li>Disk seek to database file</li>
- <li>Database file read</li>
- <li>Disk seek to log file</li>
- <li>Log file write</li>
- <li>Flush log file information to disk</li>
- <li>Disk seek to update log file metadata (for example, inode information)</li>
- <li>Log metadata write</li>
- <li>Flush log file metadata to disk</li>
+ <li>
+ Disk seek to database file
+ </li>
+ <li>
+ Database file read
+ </li>
+ <li>
+ Disk seek to log file</li>
+ <li>
+ Log file write
+ </li>
+ <li>
+ Flush log file information to disk
+ </li>
+ <li>
+ Disk seek to update log file metadata (for example,
+ inode information)
+ </li>
+ <li>
+ Log metadata write
+ </li>
+ <li>
+ Flush log file metadata to disk
+ </li>
 </ul>
 </div>
- <p>There are a number of ways to increase transactional throughput, all of
-which attempt to decrease the number of filesystem operations per
-transaction. First, the Berkeley DB software includes support for
-<span class="emphasis"><em>group commit</em></span>. Group commit simply means that when the
-information about one transaction is flushed to disk, the information
-for any other waiting transactions will be flushed to disk at the same
-time, potentially amortizing a single log write over a large number of
-transactions. There are additional tuning parameters which may be
-useful to application writers:</p>
+ <p>
+ There are a number of ways to increase transactional
+ throughput, all of which attempt to decrease the number of
+ filesystem operations per transaction. First, the Berkeley DB
+ software includes support for <span class="emphasis"><em>group
+ commit</em></span>. Group commit simply means that when the
+ information about one transaction is flushed to disk, the
+ information for any other waiting transactions will be flushed
+ to disk at the same time, potentially amortizing a single log
+ write over a large number of transactions. There are
+ additional tuning parameters which may be useful to
+ application writers:
+ </p>
 <div class="itemizedlist">
 <ul type="disc">
- <li>Tune the size of the database cache. If the Berkeley DB key/data pairs used
-during the transaction are found in the database cache, the seek and read
-from the database are no longer necessary, resulting in two fewer
-filesystem operations per transaction. To determine whether your cache
-size is too small, see <a class="xref" href="general_am_conf.html#am_conf_cachesize" title="Selecting a cache size">Selecting a cache size</a>.</li>
- <li>Put the database and the log files on different disks. This allows reads
-and writes to the log files and the database files to be performed
-concurrently.</li>
- <li>Set the filesystem configuration so that file access and modification times
-are not updated. Note that although the file access and modification times
-are not used by Berkeley DB, this may affect other programs -- so be careful.</li>
- <li>Upgrade your hardware. When considering the hardware on which to run your
-application, however, it is important to consider the entire system. The
-controller and bus can have as much to do with the disk performance as
-the disk itself. It is also important to remember that throughput is
-rarely the limiting factor, and that disk seek times are normally the true
-performance issue for Berkeley DB.</li>
- <li>
- Turn on the <a href="../api_reference/C/envset_flags.html#envset_flags_DB_TXN_NOSYNC" class="olink">DB_TXN_NOSYNC</a> or <a href="../api_reference/C/envset_flags.html#set_flags_DB_TXN_WRITE_NOSYNC" class="olink">DB_TXN_WRITE_NOSYNC</a> flags. This
- changes the Berkeley DB behavior so that the log files are not written
- and/or flushed when transactions are committed. Although this change
- will greatly increase your transaction throughput, it means that
- transactions will exhibit the ACI (atomicity, consistency, and
- isolation) properties, but not D (durability). Database integrity will
- be maintained, but it is possible that some number of the most recently
- committed transactions may be undone during recovery instead of being
- redone.
-</li>
+ <li>
+ Tune the size of the database cache. If the Berkeley
+ DB key/data pairs used during the transaction are found in
+ the database cache, the seek and read from the database
+ are no longer necessary, resulting in two fewer filesystem
+ operations per transaction. To determine whether your
+ cache size is too small, see <a class="xref" href="general_am_conf.html#am_conf_cachesize" title="Selecting a cache size">Selecting a cache size</a>.
+ </li>
+ <li>
+ Put the database and the log files on different
+ disks. This allows reads and writes to the log files and
+ the database files to be performed
+ concurrently.
+ </li>
+ <li>
+ Set the filesystem configuration so that file access
+ and modification times are not updated. Note that although
+ the file access and modification times are not used by
+ Berkeley DB, this may affect other programs -- so be
+ careful.
+ </li>
+ <li>
+ Upgrade your hardware. When considering the hardware
+ on which to run your application, however, it is important
+ to consider the entire system. The controller and bus can
+ have as much to do with the disk performance as the disk
+ itself. It is also important to remember that throughput
+ is rarely the limiting factor, and that disk seek times
+ are normally the true performance issue for Berkeley
+ DB.
+ </li>
+ <li>
+ Turn on the <a href="../api_reference/C/envset_flags.html#envset_flags_DB_TXN_NOSYNC" class="olink">DB_TXN_NOSYNC</a> or
+ <a href="../api_reference/C/envset_flags.html#set_flags_DB_TXN_WRITE_NOSYNC" class="olink">DB_TXN_WRITE_NOSYNC</a> flags. This changes the Berkeley DB
+ behavior so that the log files are not written and/or
+ flushed when transactions are committed. Although this
+ change will greatly increase your transaction throughput,
+ it means that transactions will exhibit the ACI
+ (atomicity, consistency, and isolation) properties, but
+ not D (durability). Database integrity will be maintained,
+ but it is possible that some number of the most recently
+ committed transactions may be undone during recovery
+ instead of being redone.
+ </li>
 </ul>
 </div>
- <p>If you are bottlenecked on logging, the following test will help you
-confirm that the number of transactions per second that your application
-does is reasonable for the hardware on which you are running. Your test
-program should repeatedly perform the following operations:</p>
+ <p>
+ If you are bottlenecked on logging, the following test will
+ help you confirm that the number of transactions per second
+ that your application does is reasonable for the hardware on
+ which you are running. Your test program should repeatedly
+ perform the following operations:
+ </p>
 <div class="itemizedlist">
 <ul type="disc">
- <li>Seek to the beginning of a file</li>
- <li>Write to the file</li>
- <li>Flush the file write to disk</li>
+ <li>
+ Seek to the beginning of a file
+ </li>
+ <li>
+ Write to the file
+ </li>
+ <li>
+ Flush the file write to disk
+ </li>
 </ul>
 </div>
- <p>The number of times that you can perform these three operations per
-second is a rough measure of the minimum number of transactions per
-second of which the hardware is capable. This test simulates the
-operations applied to the log file. (As a simplifying assumption in this
-experiment, we assume that the database files are either on a separate
-disk; or that they fit, with some few exceptions, into the database
-cache.) We do not have to directly simulate updating the log file
-directory information because it will normally be updated and flushed
-to disk as a result of flushing the log file write to disk.</p>
- <p>Running this test program, in which we write 256 bytes for 1000 operations
-on reasonably standard commodity hardware (Pentium II CPU, SCSI disk),
-returned the following results:</p>
+ <p>
+ The number of times that you can perform these three
+ operations per second is a rough measure of the minimum number
+ of transactions per second of which the hardware is capable.
+ This test simulates the operations applied to the log file.
+ (As a simplifying assumption in this experiment, we assume
+ that the database files are either on a separate disk; or that
+ they fit, with some few exceptions, into the database cache.)
+ We do not have to directly simulate updating the log file
+ directory information because it will normally be updated and
+ flushed to disk as a result of flushing the log file write to
+ disk.
+ </p>
+ <p>
+ Running this test program, in which we write 256 bytes for
+ 1000 operations on reasonably standard commodity hardware
+ (Pentium II CPU, SCSI disk), returned the following
+ results:
+ </p>
 <pre class="programlisting">% testfile -b256 -o1000
 running: 1000 ops
 Elapsed time: 16.641934 seconds
 1000 ops: 60.09 ops per second</pre>
- <p>Note that the number of bytes being written to the log as part of each
-transaction can dramatically affect the transaction throughput. The
-test run used 256, which is a reasonable size log write. Your log
-writes may be different. To determine your average log write size, use
-the <a href="../api_reference/C/db_stat.html" class="olink">db_stat</a> utility to display your log statistics.</p>
- <p>As a quick sanity check, the average seek time is 9.4 msec for this
-particular disk, and the average latency is 4.17 msec. That results in
-a minimum requirement for a data transfer to the disk of 13.57 msec, or
-a maximum of 74 transfers per second. This is close enough to the
-previous 60 operations per second (which was not done on a quiescent
-disk) that the number is believable.</p>
- <p>An implementation of the previous <a class="ulink" href="writetest.cs" target="_top">example test
-program</a> for IEEE/ANSI Std 1003.1 (POSIX) standard systems is included in the Berkeley DB
-distribution.</p>
+ <p>
+ Note that the number of bytes being written to the log as
+ part of each transaction can dramatically affect the
+ transaction throughput. The test run used 256, which is a
+ reasonable size log write. Your log writes may be different.
+ To determine your average log write size, use the <a href="../api_reference/C/db_stat.html" class="olink">db_stat</a> utility to
+ display your log statistics.
+ </p>
+ <p>
+ As a quick sanity check, the average seek time is 9.4 msec
+ for this particular disk, and the average latency is 4.17
+ msec. That results in a minimum requirement for a data
+ transfer to the disk of 13.57 msec, or a maximum of 74
+ transfers per second. This is close enough to the previous 60
+ operations per second (which was not done on a quiescent disk)
+ that the number is believable.
+ </p>
+ <p>
+ An implementation of the previous <a class="ulink" href="writetest.cs" target="_top">
+ example test program</a> for IEEE/ANSI Std 1003.1
+ (POSIX) standard systems is included in the Berkeley DB
+ distribution.
+ </p>
 </div>
 <div class="navfooter">
 <hr />
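
A note on the tuning list in the page above: the cache-size and DB_TXN_NOSYNC/DB_TXN_WRITE_NOSYNC items correspond to the DB_ENV->set_cachesize and DB_ENV->set_flags calls in the Berkeley DB C API. The following is a minimal sketch of how an application might apply both, not code from this page; the environment home /db/env and the 64MB cache size are illustrative values only.

/*
 * Minimal sketch: applying two of the documented tunables from C.
 * The environment home "/db/env" and the 64MB cache figure are
 * illustrative, not recommendations from the page above.
 */
#include <stdio.h>
#include <stdlib.h>
#include <db.h>

int
main(void)
{
	DB_ENV *dbenv;
	int ret;

	if ((ret = db_env_create(&dbenv, 0)) != 0) {
		fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
		return (EXIT_FAILURE);
	}

	/* Tune the database cache: 0GB + 64MB, in one contiguous region. */
	if ((ret = dbenv->set_cachesize(dbenv, 0, 64 * 1024 * 1024, 1)) != 0)
		goto err;

	/*
	 * Trade durability for throughput: with DB_TXN_NOSYNC, commit
	 * neither writes nor flushes the log.  DB_TXN_WRITE_NOSYNC
	 * could be used here instead to write but not flush.
	 */
	if ((ret = dbenv->set_flags(dbenv, DB_TXN_NOSYNC, 1)) != 0)
		goto err;

	if ((ret = dbenv->open(dbenv, "/db/env",
	    DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
	    DB_INIT_MPOOL | DB_INIT_TXN, 0)) != 0)
		goto err;

	/* ... transactional workload would run here ... */

	return (dbenv->close(dbenv, 0) == 0 ? EXIT_SUCCESS : EXIT_FAILURE);

err:	fprintf(stderr, "configuration failed: %s\n", db_strerror(ret));
	(void)dbenv->close(dbenv, 0);
	return (EXIT_FAILURE);
}

The two flags differ in how much durability is given up: with DB_TXN_WRITE_NOSYNC the log is written (but not flushed) at commit, so only an operating-system or hardware failure can lose committed transactions, while with DB_TXN_NOSYNC an application failure can lose them as well.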
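The seek/write/flush benchmark the page describes (and implements in the linked writetest program) can be sketched in a few lines of POSIX C. This version uses the same parameters as the documented run (256-byte writes, 1000 operations) but is an illustration, not the distribution's writetest source:

/*
 * Rough sketch of the logging benchmark described above: repeatedly
 * seek to the start of a file, write a fixed-size buffer, and flush
 * it to disk.  Sizes mirror the documented run (-b256 -o1000).
 */
#include <sys/time.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define	BYTES	256
#define	OPS	1000

int
main(void)
{
	struct timeval start, end;
	char buf[BYTES];
	double elapsed;
	int fd, i;

	memset(buf, 0, sizeof(buf));
	if ((fd = open("testfile", O_WRONLY | O_CREAT, 0644)) == -1) {
		perror("open");
		return (EXIT_FAILURE);
	}

	printf("running: %d ops\n", OPS);
	(void)gettimeofday(&start, NULL);
	for (i = 0; i < OPS; ++i) {
		if (lseek(fd, 0, SEEK_SET) == -1 ||	/* seek to start */
		    write(fd, buf, sizeof(buf)) !=
		    (ssize_t)sizeof(buf) ||		/* write the record */
		    fsync(fd) == -1) {			/* flush to disk */
			perror("testfile");
			return (EXIT_FAILURE);
		}
	}
	(void)gettimeofday(&end, NULL);

	elapsed = (end.tv_sec - start.tv_sec) +
	    (end.tv_usec - start.tv_usec) / 1e6;
	printf("Elapsed time: %f seconds\n", elapsed);
	printf("%d ops: %.2f ops per second\n", OPS, OPS / elapsed);
	return (EXIT_SUCCESS);
}

The page's sanity check is simple arithmetic: 9.4 msec average seek plus 4.17 msec average rotational latency gives 13.57 msec per transfer, and 1000 / 13.57 is roughly 74 transfers per second, an upper bound consistent with the measured 60 ops per second.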
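On the db_stat pointer near the end of the diff: the utility's -l option prints an environment's log statistics, including the total amount written to the log and the number of log write operations; dividing one by the other approximates the average log write size (the exact statistic names vary by release, so treat that division as an assumption). For example, for an environment in /db/env (an illustrative path):

% db_stat -h /db/env -l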