Diffstat (limited to 'docs/programmer_reference/transapp_reclimit.html')
-rw-r--r--  docs/programmer_reference/transapp_reclimit.html | 322
1 file changed, 165 insertions, 157 deletions
diff --git a/docs/programmer_reference/transapp_reclimit.html b/docs/programmer_reference/transapp_reclimit.html
index 47994027..c93d399b 100644
--- a/docs/programmer_reference/transapp_reclimit.html
+++ b/docs/programmer_reference/transapp_reclimit.html
@@ -14,7 +14,7 @@
<body>
<div xmlns="" class="navheader">
<div class="libver">
- <p>Library Version 11.2.5.3</p>
+ <p>Library Version 12.1.6.1</p>
</div>
<table width="100%" summary="Navigation header">
<tr>
@@ -22,9 +22,7 @@
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href="transapp_filesys.html">Prev</a> </td>
- <th width="60%" align="center">Chapter 11. 
- Berkeley DB Transactional Data Store Applications
- </th>
+ <th width="60%" align="center">Chapter 11.  Berkeley DB Transactional Data Store Applications </th>
<td width="20%" align="right"> <a accesskey="n" href="transapp_tune.html">Next</a></td>
</tr>
</table>
@@ -38,29 +36,30 @@
</div>
</div>
</div>
- <p>
- Berkeley DB recovery is based on write-ahead logging. This means
- that when a change is made to a database page, a description of the
- change is written into a log file. This description in the log
- file is guaranteed to be written to stable storage before the
- database pages that were changed are written to stable storage.
- This is the fundamental feature of the logging system that makes
- durability and rollback work.
+ <p>
+ Berkeley DB recovery is based on write-ahead logging. This
+ means that when a change is made to a database page, a
+ description of the change is written into a log file. This
+ description in the log file is guaranteed to be written to
+ stable storage before the database pages that were changed are
+ written to stable storage. This is the fundamental feature of
+ the logging system that makes durability and rollback work.
</p>
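+ <p>
+ One way to picture this ordering is the following minimal C
+ sketch (the update_record name and the handles passed in are
+ hypothetical): when the commit call returns, the log records
+ describing the change have reached stable storage, even if
+ the modified database pages have not.
+ </p>
+ <pre class="programlisting">#include &lt;string.h&gt;
+ #include &lt;db.h&gt;
+ 
+ int
+ update_record(DB_ENV *dbenv, DB *dbp)
+ {
+     DB_TXN *txn;
+     DBT key, data;
+     int ret;
+ 
+     memset(&amp;key, 0, sizeof(key));
+     memset(&amp;data, 0, sizeof(data));
+     key.data = "color";  key.size = sizeof("color");
+     data.data = "blue";  data.size = sizeof("blue");
+ 
+     if ((ret = dbenv-&gt;txn_begin(dbenv, NULL, &amp;txn, 0)) != 0)
+         return (ret);
+     if ((ret = dbp-&gt;put(dbp, txn, &amp;key, &amp;data, 0)) != 0) {
+         (void)txn-&gt;abort(txn);
+         return (ret);
+     }
+     /* Commit forces the log, not the database pages. */
+     return (txn-&gt;commit(txn, 0));
+ }</pre>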
- <p>
- If the application or system crashes, the log is reviewed during
- recovery. Any database changes described in the log that were part
- of committed transactions and that were never written to the actual
- database itself are written to the database as part of recovery.
- Any database changes described in the log that were never committed
- and that were written to the actual database itself are backed-out
- of the database as part of recovery. This design allows the
- database to be written lazily, and only blocks from the log file
- have to be forced to disk as part of transaction commit.
+ <p>
+ If the application or system crashes, the log is reviewed
+ during recovery. Any database changes described in the log
+ that were part of committed transactions and that were never
+ written to the actual database itself are written to the
+ database as part of recovery. Any database changes described
+ in the log that were never committed and that were written to
+ the actual database itself are backed out of the database as
+ part of recovery. This design allows the database to be
+ written lazily, and only blocks from the log file have to be
+ forced to disk as part of transaction commit.
</p>
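+ <p>
+ Normal recovery is usually run as part of opening the
+ environment. A minimal sketch, assuming a single process
+ performs the open and that home names the environment
+ directory:
+ </p>
+ <pre class="programlisting">#include &lt;db.h&gt;
+ 
+ int
+ open_env(DB_ENV **dbenvp, const char *home)
+ {
+     DB_ENV *dbenv;
+     int ret;
+ 
+     if ((ret = db_env_create(&amp;dbenv, 0)) != 0)
+         return (ret);
+     /*
+      * DB_RECOVER reviews the log before the open completes:
+      * committed changes missing from the databases are redone,
+      * uncommitted changes already written are backed out.
+      */
+     if ((ret = dbenv-&gt;open(dbenv, home,
+         DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
+         DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER, 0)) != 0)
+         (void)dbenv-&gt;close(dbenv, 0);
+     else
+         *dbenvp = dbenv;
+     return (ret);
+ }</pre>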
- <p>
- There are two interfaces that are a concern when considering
- Berkeley DB recoverability:
+ <p>
+ There are two interfaces that are of concern when
+ considering Berkeley DB recoverability:
</p>
<div class="orderedlist">
<ol type="1">
@@ -69,153 +68,162 @@
system/filesystem.
</li>
<li>
- The interface between the operating system/filesystem and the
- underlying stable storage hardware.
+ The interface between the operating
+ system/filesystem and the underlying stable storage
+ hardware.
</li>
</ol>
</div>
- <p>
- Berkeley DB uses the operating system interfaces and its underlying
- filesystem when writing its files. This means that Berkeley DB can
- fail if the underlying filesystem fails in some unrecoverable way.
- Otherwise, the interface requirements here are simple: The system
- call that Berkeley DB uses to flush data to disk (normally fsync or
- fdatasync), must guarantee that all the information necessary for a
- file's recoverability has been written to stable storage before it
- returns to Berkeley DB, and that no possible application or system
- crash can cause that file to be unrecoverable.
+ <p>
+ Berkeley DB uses the operating system interfaces and its
+ underlying filesystem when writing its files. This means that
+ Berkeley DB can fail if the underlying filesystem fails in
+ some unrecoverable way. Otherwise, the interface requirements
+ here are simple: The system call that Berkeley DB uses to
+ flush data to disk (normally fsync or fdatasync) must
+ guarantee that all the information necessary for a file's
+ recoverability has been written to stable storage before it
+ returns to Berkeley DB, and that no possible application or
+ system crash can cause that file to be unrecoverable.
</p>
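+ <p>
+ The pattern in question is the ordinary POSIX one; this
+ sketch (the write_stable name is hypothetical) shows the
+ guarantee Berkeley DB depends on:
+ </p>
+ <pre class="programlisting">#include &lt;unistd.h&gt;
+ 
+ /*
+  * Write a buffer and force it to stable storage.  Once
+  * fsync() returns 0, the data must survive any subsequent
+  * application or system crash.
+  */
+ int
+ write_stable(int fd, const void *buf, size_t len)
+ {
+     if (write(fd, buf, len) != (ssize_t)len)
+         return (-1);
+     return (fsync(fd));
+ }</pre>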
- <p>
- In addition, Berkeley DB implicitly uses the interface between the
- operating system and the underlying hardware. The interface
- requirements here are not as simple.
+ <p>
+ In addition, Berkeley DB implicitly uses the interface
+ between the operating system and the underlying hardware. The
+ interface requirements here are not as simple.
</p>
- <p>
- First, it is necessary to consider the underlying page size of the
- Berkeley DB databases. The Berkeley DB library performs all
- database writes using the page size specified by the application,
- and Berkeley DB assumes pages are written atomically. This means
- that if the operating system performs filesystem I/O in blocks of
- different sizes than the database page size, it may increase the
- possibility for database corruption. For example, assume that
- Berkeley DB is writing 32KB pages for a database, and the operating
- system does filesystem I/O in 16KB blocks. If the operating system
- writes the first 16KB of the database page successfully, but
- crashes before being able to write the second 16KB of the database,
- the database has been corrupted and this corruption may or may not
- be detected during recovery. For this reason, it may be important
- to select database page sizes that will be written as single block
- transfers by the underlying operating system. If you do not select
- a page size that the underlying operating system will write as a
- single block, you may want to configure the database to use
- checksums (see the <a href="../api_reference/C/dbset_flags.html" class="olink">DB-&gt;set_flags()</a> flag for more information). By
- configuring checksums, you guarantee this kind of corruption will
- be detected at the expense of the CPU required to generate the
- checksums. When such an error is detected, the only course of
- recovery is to perform catastrophic recovery to restore the
- database.
+ <p>
+ First, it is necessary to consider the underlying page size
+ of the Berkeley DB databases. The Berkeley DB library performs
+ all database writes using the page size specified by the
+ application, and Berkeley DB assumes pages are written
+ atomically. This means that if the operating system performs
+ filesystem I/O in blocks of a different size than the database
+ page size, it may increase the possibility of database
+ corruption. For example, assume that Berkeley DB is writing
+ 32KB pages for a database, and the operating system does
+ filesystem I/O in 16KB blocks. If the operating system writes
+ the first 16KB of the database page successfully, but crashes
+ before being able to write the second 16KB of the database,
+ the database has been corrupted and this corruption may or may
+ not be detected during recovery. For this reason, it may be
+ important to select database page sizes that will be written
+ as single block transfers by the underlying operating system.
+ If you do not select a page size that the underlying operating
+ system will write as a single block, you may want to configure
+ the database to use checksums (see the <a href="../api_reference/C/dbset_flags.html" class="olink">DB-&gt;set_flags()</a> flag for
+ more information). By configuring checksums, you guarantee
+ this kind of corruption will be detected at the expense of the
+ CPU required to generate the checksums. When such an error is
+ detected, the only course of recovery is to perform
+ catastrophic recovery to restore the database.
</p>
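+ <p>
+ For example, a sketch that assumes the underlying filesystem
+ writes 16KB blocks atomically: it sets a matching database
+ page size and enables checksums before the database is
+ opened (both calls must precede the open):
+ </p>
+ <pre class="programlisting">#include &lt;db.h&gt;
+ 
+ int
+ open_db(DB_ENV *dbenv, DB **dbpp, const char *file)
+ {
+     DB *dbp;
+     int ret;
+ 
+     if ((ret = db_create(&amp;dbp, dbenv, 0)) != 0)
+         return (ret);
+     /* Match the assumed 16KB filesystem block size. */
+     if ((ret = dbp-&gt;set_pagesize(dbp, 16 * 1024)) != 0)
+         goto err;
+     /* Checksum pages so partial writes are detected. */
+     if ((ret = dbp-&gt;set_flags(dbp, DB_CHKSUM)) != 0)
+         goto err;
+     if ((ret = dbp-&gt;open(dbp, NULL,
+         file, NULL, DB_BTREE, DB_CREATE | DB_AUTO_COMMIT, 0)) != 0)
+         goto err;
+     *dbpp = dbp;
+     return (0);
+ 
+ err:    (void)dbp-&gt;close(dbp, 0);
+         return (ret);
+ }</pre>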
- <p>
- Second, if you are copying database files (either as part of doing
- a hot backup or creation of a hot failover area), there is an
- additional question related to the page size of the Berkeley DB
- databases. You must copy databases atomically, in units of the
- database page size. In other words, the reads made by the copy
- program must not be interleaved with writes by other threads of
- control, and the copy program must read the databases in multiples
- of the underlying database page size. On Unix systems, this is not
- a problem, as these operating systems already make this guarantee
- and system utilities normally read in power-of-2 sized chunks,
- which are larger than the largest possible Berkeley DB database
- page size. Other operating systems, particularly those based on
- Linux and Windows, do not provide this guarantee and hot backups may
- not be performed on these systems by reading data from the file
- system. The <a href="../api_reference/C/db_hotbackup.html" class="olink">db_hotbackup</a> utility should be used on these
- systems.
+ <p>
+ Second, if you are copying database files (either as part
+ of doing a hot backup or creation of a hot failover area),
+ there is an additional question related to the page size of
+ the Berkeley DB databases. You must copy databases atomically,
+ in units of the database page size. In other words, the reads
+ made by the copy program must not be interleaved with writes
+ by other threads of control, and the copy program must read
+ the databases in multiples of the underlying database page
+ size. On Unix systems, this is not a problem, as these
+ operating systems already make this guarantee and system
+ utilities normally read in power-of-2 sized chunks, which are
+ larger than the largest possible Berkeley DB database page
+ size. Other operating systems, particularly those based on
+ Linux and Windows, do not provide this guarantee and hot
+ backups may not be performed on these systems by reading data
+ from the file system. The <a href="../api_reference/C/db_hotbackup.html" class="olink">db_hotbackup</a> utility should be used on
+ these systems.
</p>
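+ <p>
+ In releases where the environment handle offers a hot backup
+ method, the copy can also be made programmatically; a sketch
+ under that assumption, with a hypothetical target directory:
+ </p>
+ <pre class="programlisting">#include &lt;db.h&gt;
+ 
+ /*
+  * Hot-backup an open environment into the target directory.
+  * The library performs the page-aligned reads itself, so the
+  * copy is safe against concurrent writers.
+  */
+ int
+ backup_env(DB_ENV *dbenv)
+ {
+     return (dbenv-&gt;backup(dbenv, "/backup/bdb-env", DB_CREATE));
+ }</pre>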
<p>
- An additional problem we have seen in this area was in some
- releases of Solaris where the cp utility was implemented using the
- mmap system call rather than the read system call. Because the
- Solaris' mmap system call did not make the same guarantee of read
- atomicity as the read system call, using the cp utility could
- create corrupted copies of the databases. Another problem we have
- seen is implementations of the tar utility doing 10KB block reads
- by default, and even when an output block size was specified to
- that utility, not reading from the underlying databases in
- multiples of the block size. Using the dd utility instead of the
- cp or tar utilities (and specifying an appropriate block size),
- fixes these problems. If you plan to use a system utility to copy
- database files, you may want to use a system call trace utility
- (for example, ktrace or truss) to check for an I/O size smaller
- than or not a multiple of the database page size and system calls
- other than read.
+ An additional problem we have seen in this area was in some
+ releases of Solaris where the cp utility was implemented using
+ the mmap system call rather than the read system call. Because
+ Solaris' mmap system call did not make the same guarantee
+ of read atomicity as the read system call, using the cp
+ utility could create corrupted copies of the databases.
+ Another problem we have seen is implementations of the tar
+ utility doing 10KB block reads by default, and even when an
+ output block size was specified to that utility, not reading
+ from the underlying databases in multiples of the block size.
+ Using the dd utility instead of the cp or tar utilities (and
+ specifying an appropriate block size) fixes these problems.
+ If you plan to use a system utility to copy database files,
+ you may want to use a system call trace utility (for example,
+ ktrace or truss) to check for an I/O size smaller than or not
+ a multiple of the database page size and system calls other
+ than read.
</p>
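+ <p>
+ The property the dd approach provides can be captured
+ directly in C as well; in this sketch (names hypothetical,
+ and a 32KB page size assumed), every read is an exact
+ multiple of the database page size:
+ </p>
+ <pre class="programlisting">#include &lt;fcntl.h&gt;
+ #include &lt;unistd.h&gt;
+ 
+ #define COPY_BLOCK  (32 * 1024)   /* Assumed database page size. */
+ 
+ int
+ copy_db_file(const char *src, const char *dst)
+ {
+     char buf[COPY_BLOCK];
+     ssize_t n;
+     int in, out, ret;
+ 
+     ret = -1;
+     out = -1;
+     if ((in = open(src, O_RDONLY)) == -1)
+         return (ret);
+     if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600)) == -1)
+         goto err;
+     /* Each transfer is a whole page, never a partial one. */
+     while ((n = read(in, buf, sizeof(buf))) &gt; 0)
+         if (write(out, buf, (size_t)n) != n)
+             goto err;
+     if (n == 0 &amp;&amp; fsync(out) == 0)
+         ret = 0;
+ err:    (void)close(in);
+     if (out != -1)
+         (void)close(out);
+     return (ret);
+ }</pre>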
- <p>
- Third, it is necessary to consider the behavior of the system's
- underlying stable storage hardware. For example, consider a SCSI
- controller that has been configured to cache data and return to the
- operating system that the data has been written to stable storage,
- when, in fact, it has only been written into the controller RAM
- cache. If power is lost before the controller is able to flush its
- cache to disk, and the controller cache is not stable (that is, the
- writes will not be flushed to disk when power returns), the writes
- will be lost. If the writes include database blocks, there is no
- loss because recovery will correctly update the database. If the
- writes include log file blocks, it is possible that transactions
- that were already committed may not appear in the recovered
- database, although the recovered database will be coherent after a
- crash.
+ <p>
+ Third, it is necessary to consider the behavior of the
+ system's underlying stable storage hardware. For example,
+ consider a SCSI controller that has been configured to cache
+ data and report to the operating system that the data has been
+ written to stable storage, when, in fact, it has only been
+ written into the controller RAM cache. If power is lost before
+ the controller is able to flush its cache to disk, and the
+ controller cache is not stable (that is, the writes will not
+ be flushed to disk when power returns), the writes will be
+ lost. If the writes include database blocks, there is no loss
+ because recovery will correctly update the database. If the
+ writes include log file blocks, it is possible that
+ transactions that were already committed may not appear in the
+ recovered database, although the recovered database will be
+ coherent after a crash.
</p>
- <p>
- If the underlying hardware can fail in any way so that only part of
- the block was written, the failure conditions are the same as those
- described previously for an operating system failure that writes
- only part of a logical database block. In such cases, configuring
- the database for checksums will ensure the corruption is
- detected.
+ <p>
+ If the underlying hardware can fail in any way so that only
+ part of the block was written, the failure conditions are the
+ same as those described previously for an operating system
+ failure that writes only part of a logical database block. In
+ such cases, configuring the database for checksums will ensure
+ the corruption is detected.
</p>
- <p>
- For these reasons, it may be important to select hardware that does
- not do partial writes and does not cache data writes (or does not
- return that the data has been written to stable storage until it
- has either been written to stable storage or the actual writing of
- all of the data is guaranteed, barring catastrophic hardware
- failure — that is, your disk drive exploding).
+ <p>
+ For these reasons, it may be important to select hardware
+ that does not do partial writes and does not cache data writes
+ (or does not report that the data has been written to stable
+ storage until it has either been written to stable storage or
+ the actual writing of all of the data is guaranteed, barring
+ catastrophic hardware failure — that is, your disk drive
+ exploding).
</p>
- <p>
- If the disk drive on which you are storing your databases explodes,
- you can perform normal Berkeley DB catastrophic recovery, because
- it requires only a snapshot of your databases plus the log files
- you have archived since those snapshots were taken. In this case,
- you should lose no database changes at all.
+ <p>
+ If the disk drive on which you are storing your databases
+ explodes, you can perform normal Berkeley DB catastrophic
+ recovery, because it requires only a snapshot of your
+ databases plus the log files you have archived since those
+ snapshots were taken. In this case, you should lose no
+ database changes at all.
</p>
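+ <p>
+ Catastrophic recovery is the same open sequence with
+ DB_RECOVER_FATAL in place of DB_RECOVER, run against the
+ restored snapshot and archived log files; a sketch, assuming
+ home names the restored environment directory:
+ </p>
+ <pre class="programlisting">#include &lt;db.h&gt;
+ 
+ int
+ run_catastrophic_recovery(const char *home)
+ {
+     DB_ENV *dbenv;
+     int ret;
+ 
+     if ((ret = db_env_create(&amp;dbenv, 0)) != 0)
+         return (ret);
+     /* DB_RECOVER_FATAL replays all available log files. */
+     ret = dbenv-&gt;open(dbenv, home,
+         DB_CREATE | DB_INIT_LOCK | DB_INIT_LOG |
+         DB_INIT_MPOOL | DB_INIT_TXN | DB_RECOVER_FATAL, 0);
+     (void)dbenv-&gt;close(dbenv, 0);
+     return (ret);
+ }</pre>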
- <p>
- If the disk drive on which you are storing your log files explodes,
- you can also perform catastrophic recovery, but you will lose any
- database changes made as part of transactions committed since your
- last archival of the log files. Alternatively, if your database
- environment and databases are still available after you lose the
- log file disk, you should be able to dump your databases. However,
- you may see an inconsistent snapshot of your data after doing the
- dump, because changes that were part of transactions that were not
- yet committed may appear in the database dump. Depending on the
- value of the data, a reasonable alternative may be to perform both
- the database dump and the catastrophic recovery and then compare
- the databases created by the two methods.
+ <p>
+ If the disk drive on which you are storing your log files
+ explodes, you can also perform catastrophic recovery, but you
+ will lose any database changes made as part of transactions
+ committed since your last archival of the log files.
+ Alternatively, if your database environment and databases are
+ still available after you lose the log file disk, you should
+ be able to dump your databases. However, you may see an
+ inconsistent snapshot of your data after doing the dump,
+ because changes that were part of transactions that were not
+ yet committed may appear in the database dump. Depending on
+ the value of the data, a reasonable alternative may be to
+ perform both the database dump and the catastrophic recovery
+ and then compare the databases created by the two methods.
</p>
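+ <p>
+ The dump itself can be as simple as a cursor walk, in the
+ spirit of the db_dump utility; a sketch (the dump_db name is
+ hypothetical, and string keys and data are assumed):
+ </p>
+ <pre class="programlisting">#include &lt;stdio.h&gt;
+ #include &lt;string.h&gt;
+ #include &lt;db.h&gt;
+ 
+ int
+ dump_db(DB *dbp)
+ {
+     DBC *dbc;
+     DBT key, data;
+     int ret, t_ret;
+ 
+     if ((ret = dbp-&gt;cursor(dbp, NULL, &amp;dbc, 0)) != 0)
+         return (ret);
+     memset(&amp;key, 0, sizeof(key));
+     memset(&amp;data, 0, sizeof(data));
+     while ((ret = dbc-&gt;get(dbc, &amp;key, &amp;data, DB_NEXT)) == 0)
+         printf("%.*s\t%.*s\n",
+             (int)key.size, (char *)key.data,
+             (int)data.size, (char *)data.data);
+     if (ret == DB_NOTFOUND)     /* Normal end of database. */
+         ret = 0;
+     if ((t_ret = dbc-&gt;close(dbc)) != 0 &amp;&amp; ret == 0)
+         ret = t_ret;
+     return (ret);
+ }</pre>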
- <p>
- Regardless, for these reasons, storing your databases and log files
- on different disks should be considered a safety measure as well as
- a performance enhancement.
+ <p>
+ Regardless, for these reasons, storing your databases and
+ log files on different disks should be considered a safety
+ measure as well as a performance enhancement.
</p>
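+ <p>
+ The log directory can be pointed at a different disk before
+ the environment is opened, either with DB_ENV-&gt;set_lg_dir()
+ or with a set_lg_dir line in the environment's DB_CONFIG
+ file; a sketch with a hypothetical mount point:
+ </p>
+ <pre class="programlisting">#include &lt;db.h&gt;
+ 
+ /* Must be called before DB_ENV-&gt;open(). */
+ int
+ configure_log_disk(DB_ENV *dbenv)
+ {
+     return (dbenv-&gt;set_lg_dir(dbenv, "/disk2/bdb-logs"));
+ }</pre>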
- <p>
- Finally, you should be aware that Berkeley DB does not protect
- against all cases of stable storage hardware failure, nor does it
- protect against simple hardware misbehavior (for example, a disk
- controller writing incorrect data to the disk). However,
- configuring the database for checksums will ensure that any such
- corruption is detected.
+ <p>
+ Finally, you should be aware that Berkeley DB does not
+ protect against all cases of stable storage hardware failure,
+ nor does it protect against simple hardware misbehavior (for
+ example, a disk controller writing incorrect data to the
+ disk). However, configuring the database for checksums will
+ ensure that any such corruption is detected.
</p>
</div>
<div class="navfooter">