Import changeset

author: tim@threads.polyesthetic.msg <> 2001-03-04 19:42:05 -0500
committer: tim@threads.polyesthetic.msg <> 2001-03-04 19:42:05 -0500
commit: 89dad52004ecba5a380aeebb0e2a9beaae88eb86 (patch)
tree: 9dd732e08dba156ee3d7635caedc0dc3107ecac6 /bdb/docs/ref/lock
parent: 639a1069d313843288ba6d9cb54b290073a748a7 (diff)
download: mariadb-git-89dad52004ecba5a380aeebb0e2a9beaae88eb86.tar.gz
11 files changed, 767 insertions, 0 deletions
diff --git a/bdb/docs/ref/lock/am_conv.html b/bdb/docs/ref/lock/am_conv.html
new file mode 100644
index 00000000000..7dbe3e73d47
--- /dev/null
+++ b/bdb/docs/ref/lock/am_conv.html
@@ -0,0 +1,129 @@
+<!--$Id: am_conv.so,v 10.16 2000/03/18 21:43:13 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Access method locking conventions</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/twopl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/cam_conv.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Access method locking conventions</h1>
+<p>All the Berkeley DB access methods follow the same conventions for locking
+database objects.  Applications that do their own locking and also do
+locking via the access methods must be careful to adhere to these
+conventions.
+<p>Whenever a Berkeley DB database is opened, the DB handle is
+assigned a unique locker ID.  Unless transactions are specified,
+that ID is used as the locker for all calls that the Berkeley DB methods
+make to the lock subsystem.  In order to lock a file,  pages in
+the file, or records in the file, we must create a unique ID that
+can be used as the object to be locked in calls to the lock manager.
+Under normal operation, that object is a 28-byte value, created by
+the concatenation of a unique file identifier, a page or record number,
+and an object type (page or record).
+<p>In a transaction-protected environment, database create and delete
+operations are recoverable and single-threaded.  This single-threading is
+achieved using a single lock for the entire environment that must be
+acquired before beginning a create or delete operation.  In this case,
+the object on which Berkeley DB will lock is a 32-bit unsigned integer with a
+value of 0.
+<p>If applications are using the lock subsystem directly while they are also
+using locking via the access methods, they must take care not to
+inadvertently lock objects that happen to be equal to the unique file IDs
+used to lock files.  This is most easily accomplished by using a locker
+ID of a different length than the values used by Berkeley DB.
+<p>All of the access methods other than Queue use a simple
+multiple-reader/single writer page locking scheme.  The standard
+read/write locks (<b>DB_LOCK_READ</b> and <b>DB_LOCK_WRITE</b>) and
+conflict matrix, as described in <a href="../../ref/lock/stdmode.html">Standard lock modes</a> are used.  An operation that returns data (e.g.,
+<a href="../../api_c/db_get.html">DB-&gt;get</a>, <a href="../../api_c/dbc_get.html">DBcursor-&gt;c_get</a>) obtains a read lock on all the pages
+accessed while locating the requested record.   When an update operation
+is requested (e.g., <a href="../../api_c/db_put.html">DB-&gt;put</a>, <a href="../../api_c/dbc_del.html">DBcursor-&gt;c_del</a>), the page containing
+the updated (or new) data is write locked.  As read-modify-write cycles
+are quite common and are deadlock prone under normal circumstances, the
+Berkeley DB interfaces allow the application to specify the <a href="../../api_c/dbc_get.html#DB_RMW">DB_RMW</a> flag,
+which causes operations to immediately obtain a writelock, even though
+they are only reading the data.  While this may reduce concurrency
+somewhat, it reduces the probability of deadlock.
+<p>The Queue access method does not hold long term page locks.
+Instead, page locks are held only long enough to locate records or to change
+metadata on a page, and record locks are held for the appropriate duration.
+In the presence of transactions, record locks are held until transaction
+commit.
+For Berkeley DB operations, record locks are held until operation
+completion and for DBC operations, record locks are held until
+subsequent records are returned or the cursor is closed.
+<p>Under non-transaction operation, the access methods do not normally hold
+locks across calls to the Berkeley DB interfaces.  The one exception to this
+rule is when cursors are used.  As cursors maintain a position in a file,
+they must hold locks across calls and will, in fact, hold locks until the
+cursor is closed.  Furthermore, each cursor is assigned its own unique
+locker ID when it is created, so cursor operations can conflict with one
+another.  (Each cursor is assigned its own locker ID because Berkeley DB handles
+may be shared by multiple threads of control.  The Berkeley DB library cannot
+identify which operations are performed by which threads of control, and
+it must ensure that two different threads of control are not
+simultaneously modifying the same data structure.  By assigning each
+cursor its own locker, two threads of control sharing a handle cannot
+inadvertently interfere with each other.
+<p>This has important implications.  If a single thread of control opens two
+cursors or uses a combination of cursor and non-cursor operations, these
+operations are performed on behalf of different lockers.  Conflicts that
+arise between these different lockers may not cause actual deadlocks, but
+can, in fact, permanently block the thread of control.  For example,
+assume that an application creates a cursor and uses it to read record A.
+Now assume a second cursor is opened and the application attempts to write
+record A using the second cursor.  Unfortunately, the first cursor has a
+read lock so the second cursor cannot obtain its write lock.  However,
+that read lock is held by the same thread of control, so if we block
+waiting for the write lock, the read lock can never be released.  This
+might appear to be a deadlock from the application's perspective, but
+Berkeley DB cannot identify it as such because it has no knowledge of which
+lockers belong to which threads of control.  For this reason, application
+designers are encouraged to close cursors as soon as they are done with
+them.
+<p>Complicated operations that require multiple cursors (or combinations of
+cursor and non-cursor operations) can be performed in two ways.  First,
+they may be performed within a transaction, in which case all operations
+lock on behalf of the designated locker ID.  Alternatively, the
+<a href="../../api_c/dbc_dup.html">DBcursor-&gt;c_dup</a> function duplicates a cursor, using the same locker ID as
+the originating cursor.  There is no way to achieve this duplication
+functionality through the DB handle calls, but any DB call can be
+implemented by one or more calls through a cursor.
+<p>When the access methods use transactions, many of these problems disappear.
+The transaction ID is used as the locker ID for all operations performed
+on behalf of the transaction.  This means that the application may open
+multiple cursors on behalf of the same transaction and these cursors will
+all share a common locker ID.  This is safe because transactions cannot
+span threads of control, so the library knows that two cursors in the same
+transaction cannot modify the database concurrently.
+<p>As mentioned earlier, most of the Berkeley DB access methods use page level
+locking.  During Btree traversal, lock-coupling is used to traverse the
+tree.  Note that the tree traversal that occurs during an update operation
+can also use lock-coupling; it is not necessary to retain locks on
+internal Btree pages even if the item finally referenced will be updated.
+Even in the presence of transactions, locks obtained on internal pages of
+the Btree may be safely released as the traversal proceeds.  This greatly
+improves concurrency.  The only time internal locks become crucial is when
+internal pages are split or merged.  When traversing duplicate data items
+for a key, the lock on the key value also acts as a lock on all duplicates
+of that key.  Therefore, two conflicting threads of control cannot access
+the same duplicate set simultaneously.
+<p>The Recno access method uses a Btree as its underlying data
+representation and follows similar locking conventions.  However, as the
+Recno access method must keep track of the number of children for all
+internal pages, it must obtain write locks on all internal pages during
+read and write operations.  In the presence of transactions, these locks
+are not released until transaction commit.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/twopl.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/cam_conv.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/cam_conv.html b/bdb/docs/ref/lock/cam_conv.html
new file mode 100644
index 00000000000..b37914890bc
--- /dev/null
+++ b/bdb/docs/ref/lock/cam_conv.html
@@ -0,0 +1,53 @@
+<!--$Id: cam_conv.so,v 10.10 2000/03/18 21:43:13 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Berkeley DB Concurrent Data Store locking conventions</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/am_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/dead.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Berkeley DB Concurrent Data Store locking conventions</h1>
+<p>The Berkeley DB Concurrent Data Store product has a different set of conventions for locking.  It
+provides multiple reader/single writer semantics, but not per-page locking
+or transaction recoverability.  As such, it does its locking entirely at
+the interface to the access methods.
+<p>The object it locks is the file, identified by its unique file number.
+The locking matrix is not one of the two standard lock modes, instead,
+we use a four-lock set, consisting of:
+<p><dl compact>
+<p><dt>DB_LOCK_NG<dd>not granted (always 0)
+<dt>DB_LOCK_READ<dd>read (shared)
+<dt>DB_LOCK_WRITE<dd>write (exclusive)
+<dt>DB_LOCK_IWRITE<dd>intention-to-write (shared with NG and READ, but conflicts with WRITE and IWRITE)
+</dl>
+<p>The IWRITE lock is used for cursors that will be used for updating (IWRITE
+locks are implicitly obtained for write operations through the Berkeley DB
+handles, e.g., <a href="../../api_c/db_put.html">DB-&gt;put</a>, <a href="../../api_c/db_del.html">DB-&gt;del</a>).  While the cursor is
+reading, the IWRITE lock is held, but as soon as the cursor is about to
+modify the database, the IWRITE is upgraded to a WRITE lock.  This upgrade
+blocks until all readers have exited the database.  Because only one
+IWRITE lock is allowed at any one time, no two cursors can ever try to
+upgrade to a WRITE lock at the same time, and therefore deadlocks are
+prevented, which is essential as Berkeley DB Concurrent Data Store does not include deadlock
+detection and recovery.
+<p>Applications that need to lock compatibly with Berkeley DB Concurrent Data Store must obey the
+following rules:
+<p><ol>
+<p><li>Use only lock modes DB_LOCK_NG, DB_LOCK_READ, DB_LOCK_WRITE,
+DB_LOCK_IWRITE.
+<p><li>Never attempt to acquire a WRITE lock on an object that is
+already locked with a READ lock.
+</ol>
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/am_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/dead.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/config.html b/bdb/docs/ref/lock/config.html
new file mode 100644
index 00000000000..cc0b5248149
--- /dev/null
+++ b/bdb/docs/ref/lock/config.html
@@ -0,0 +1,46 @@
+<!--$Id: config.so,v 10.15 2000/12/08 20:43:16 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Configuring locking</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/dead.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/max.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Configuring locking</h1>
+<p>The <a href="../../api_c/env_set_lk_detect.html">DBENV-&gt;set_lk_detect</a> function specifies that the deadlock detector
+should be run whenever a lock blocks.  This option provides for rapid
+detection of deadlocks at the expense of potentially frequent
+invocations of the deadlock detector.  On a fast processor with a highly
+contentious application, where response time is critical, this is a good
+choice.  An argument to the <a href="../../api_c/env_set_lk_detect.html">DBENV-&gt;set_lk_detect</a> function indicates which
+transaction to abort when a deadlock is detected.  It can take on any
+one of the following values:
+<p><dl compact>
+<p><dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_YOUNGEST">DB_LOCK_YOUNGEST</a><dd>Abort the most recently started transaction.
+<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_OLDEST">DB_LOCK_OLDEST</a><dd>Abort the longest lived transaction.
+<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_RANDOM">DB_LOCK_RANDOM</a><dd>Abort whatever transaction the deadlock detector happens to find first.
+<dt><a href="../../api_c/env_set_lk_detect.html#DB_LOCK_DEFAULT">DB_LOCK_DEFAULT</a><dd>Use the default policy (currently DB_RANDOM).
+</dl>
+<p>In general, <a href="../../api_c/env_set_lk_detect.html#DB_LOCK_DEFAULT">DB_LOCK_DEFAULT</a> is probably the correct choice.  If
+an application has long-running transactions, then
+<a href="../../api_c/env_set_lk_detect.html#DB_LOCK_YOUNGEST">DB_LOCK_YOUNGEST</a> will guarantee that transactions eventually
+complete, but it may do so at the expense of a large number of aborts.
+<p>The alternative to using the <a href="../../api_c/env_set_lk_detect.html">DBENV-&gt;set_lk_detect</a> interface is
+to run the deadlock detector manually, using the Berkeley DB
+<a href="../../api_c/lock_detect.html">lock_detect</a> interface.
+<p>The <a href="../../api_c/env_set_lk_conflicts.html">DBENV-&gt;set_lk_conflicts</a> function allows you to specify your own locking
+conflicts matrix.  This is an advanced configuration option, and rarely
+necessary.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/dead.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/max.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/dead.html b/bdb/docs/ref/lock/dead.html
new file mode 100644
index 00000000000..bb77e982285
--- /dev/null
+++ b/bdb/docs/ref/lock/dead.html
@@ -0,0 +1,93 @@
+<!--$Id: dead.so,v 10.13 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Deadlocks and deadlock avoidance</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/cam_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/config.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Deadlocks and deadlock avoidance</h1>
+<p>Practically any application that uses locking may deadlock.
+In nearly all cases, in order to recover from a deadlock, transactions
+must be used, so that an operation that deadlocks mid-way through can
+be undone, leaving the database in a consistent state.
+As the access methods may perform updates on multiple pages during a
+single API call, transactions are necessary even when the application
+makes only single update calls into the database.
+The only exception to this rule is when all the threads accessing
+the database are doing so read-only or when the Concurrent Data Store
+product is used; this product guarantees deadlock-free operation at the
+expense of reduced concurrency.
+Since deadlocks cannot  be prevented, Berkeley DB provides the ability to detect
+deadlocks and recover from them gracefully.
+<p>Deadlocks occur when two or more threads of control are blocked waiting
+on each other's forward progress.  Consider two transactions, each of
+which wants to modify items A and B.  Assume that transaction 1 modifies
+first A and then B, but transaction 2 modifies B then A.  Now, assume
+that transaction 1 obtains its writelock on A, but before it obtains its
+writelock on B, it is descheduled and transaction 2 runs.  Transaction 2
+successfully acquires its writelock on B, but then blocks when it tries
+to obtain its writelock on A, because transaction 1 already holds a
+writelock on it.  This is a deadlock. Transaction 1 cannot make forward
+progress until Transaction 2 releases its lock on B, but Transaction 2
+cannot make forward progress until Transaction 1 releases its lock on A.
+<p>The <a href="../../api_c/lock_detect.html">lock_detect</a> function runs an instance of the Berkeley DB deadlock
+detector.  The <a href="../../utility/db_deadlock.html">db_deadlock</a> utility performs deadlock detection by
+calling <a href="../../api_c/lock_detect.html">lock_detect</a> at regular intervals.  When a deadlock exists
+in the system, all of the threads of control involved in the deadlock are,
+by definition, waiting on a lock.  The deadlock detector examines the
+state of the lock manager and identifies a deadlock, and selects one of
+the participants to abort.  (See <a href="../../ref/lock/config.html">Configuring locking</a> for a discussion of how a participant is selected).
+The lock on which the selected participant is waiting is identified such
+that the <a href="../../api_c/lock_get.html">lock_get</a> (or <a href="../../api_c/lock_vec.html">lock_vec</a>) call in which that lock
+was requested will receive an error return of <a href="../../ref/program/errorret.html#DB_LOCK_DEADLOCK">DB_LOCK_DEADLOCK</a>.
+In the access methods, this error return is propagated back through the
+Berkeley DB interface as DB_LOCK_DEADLOCK.
+<p>When an application receives an DB_LOCK_DEADLOCK, the correct action is
+to abort the current transaction, and optionally retry it.  Transaction
+support is necessary for recovery from deadlocks.  When a deadlock occurs,
+the database may be left in an inconsistent or corrupted state, and any
+database changes already accomplished must be undone before the
+application can proceed further.
+<p>The deadlock detector identifies deadlocks by looking for a cycle in what
+is commonly referred to as its "waits-for" graph.  More precisely, the
+deadlock detector reads through the lock table, and finds each object
+currently locked.  Each object has a list of transactions or operations
+(hereafter called lockers) that currently hold locks on the object and
+possibly a list of waiting lockers, waiting on the lockers holding it.
+Each object creates one or more partial orderings of lockers.  That is,
+for a particular object, every waiting locker comes after every holding
+locker, because that holding locker must release its lock before the
+waiting locker can make forward progress.  Conceptually, after each object
+has been examined, the partial orderings are topologically sorted (see
+tsort).  If this topological sort reveals any cycles, then the lockers
+forming the cycle are involved in a deadlock.  One of the lockers is
+selected for abortion.
+<p>It is possible that aborting a single transaction involved in a deadlock
+is not enough to allow other transactions to make forward progress.
+In this case, the deadlock detector will be called repeatedly.
+Unfortunately, at the time a transaction is selected for abortion,
+there is not enough information available to determine if aborting
+that single transaction will allow forward progress or not.  Since
+most applications have few deadlocks, Berkeley DB takes the conservative
+approach, aborting as few transactions as may be necessary to resolve
+the existing deadlocks.  In particular, for each unique cycle found
+in the waits-for graph described in the previous paragraph, only one
+transaction is selected for abortion.  However, if there are multiple
+cycles, then one transaction from each cycle is selected for abortion.
+Only after the aborting transactions have received the deadlock return
+and aborted their transactions, can it be determined if it is necessary
+to abort other transactions in order to allow forward progress.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/cam_conv.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/config.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/intro.html b/bdb/docs/ref/lock/intro.html
new file mode 100644
index 00000000000..b5c85af05b0
--- /dev/null
+++ b/bdb/docs/ref/lock/intro.html
@@ -0,0 +1,89 @@
+<!--$Id: intro.so,v 10.16 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Berkeley DB and locking</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/program/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/page.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Berkeley DB and locking</h1>
+<p>The lock subsystem provides interprocess and intraprocess concurrency
+control mechanisms.  While the locking system is used extensively by the
+Berkeley DB access methods and transaction system, it may also be used as a
+stand-alone subsystem to provide concurrency control to any set of
+designated resources.
+<p>The lock subsystem is created, initialized, and opened by calls to
+<a href="../../api_c/env_open.html">DBENV-&gt;open</a> with the <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> or <a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a>
+flags specified.
+<p>The <a href="../../api_c/lock_detect.html">lock_detect</a> function provides the programmatic interface to
+the Berkeley DB deadlock detector.  Whenever two threads of control issue lock
+requests that are not carefully ordered or that require upgrading locks
+(obtaining write locks on objects that are already read-locked), the
+possibility for deadlock arises.  A deadlock occurs when two or more
+threads of control are blocked, waiting for actions that another one of
+these blocked threads must take.  For example, assume that threads one
+and two have each obtained read locks on object A.  Now suppose that both
+threads wish to obtain write locks on object A.  Neither thread can be
+granted its writelock (because of the other thread's readlock).  Both
+threads block and will never unblock because the event for which they are
+waiting can never happen.
+<p>The deadlock detector examines all the locks held in the environment and
+identifies situations where no thread can make forward progress.  It then
+selects one of the participants in the deadlock (according to the argument
+that was specified to <a href="../../api_c/env_set_lk_detect.html">DBENV-&gt;set_lk_detect</a>) and forces it to return
+the value DB_LOCK_DEADLOCK, which indicates that a deadlock occurred.
+The thread receiving such an error should abort its current transaction,
+or simply release all its locks if it is not running in a transaction,
+and retry the operation.
+<p>The <a href="../../api_c/lock_vec.html">lock_vec</a> interface is used to acquire and release locks.
+<p>Two additional interfaces, <a href="../../api_c/lock_get.html">lock_get</a> and <a href="../../api_c/lock_put.html">lock_put</a>, are
+provided.  These interfaces are simpler front-ends to the <a href="../../api_c/lock_vec.html">lock_vec</a>
+functionality, where <a href="../../api_c/lock_get.html">lock_get</a> acquires a lock, and
+<a href="../../api_c/lock_put.html">lock_put</a> releases a lock that was acquired using <a href="../../api_c/lock_get.html">lock_get</a>
+or <a href="../../api_c/lock_vec.html">lock_vec</a>.
+<p>It is up to the application to specify lockers and objects appropriately.
+When used with the Berkeley DB access methods, these lockers and objects are
+handled completely internally, but an application using the lock manager
+directly must either use the same conventions as the access methods or
+define its own convention to which it adheres.  If the application is
+using the access methods with locking at the same time that it is calling
+the lock manager directly, the application must follow a convention that
+is compatible with the access methods' use of the locking subsystem.  See
+<a href="../../ref/lock/am_conv.html">Access method locking conventions</a>
+for more information.
+<p>The <a href="../../api_c/lock_id.html">lock_id</a> function returns a unique ID which may safely be used
+as the locker parameter to the <a href="../../api_c/lock_vec.html">lock_vec</a> interface.  The access
+methods use <a href="../../api_c/lock_id.html">lock_id</a> to generate unique lockers for the cursors
+associated with a database.
+<p>The <a href="../../api_c/lock_vec.html">lock_vec</a> function performs any number of lock operations
+atomically.  It also provides the ability to release all locks held by a
+particular locker and release all the locks on a particular object.
+Performing multiple lock operations atomically is useful in performing
+Btree traversals where you want to acquire a lock on a child page and once
+acquired, immediately release the lock on its parent (this is
+traditionally referred to as "lock-coupling").  Using <a href="../../api_c/lock_vec.html">lock_vec</a>
+instead of separate calls to <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_get.html">lock_get</a> reduces
+the synchronization overhead between multiple threads or processes.
+<p>The three interfaces, <a href="../../api_c/lock_get.html">lock_get</a>, <a href="../../api_c/lock_put.html">lock_put</a> and <a href="../../api_c/lock_vec.html">lock_vec</a>,
+are fully compatible, and may be used interchangeably.
+<p>All locks explicitly requested by an application should be released via
+calls to <a href="../../api_c/lock_put.html">lock_put</a> or <a href="../../api_c/lock_vec.html">lock_vec</a>.
+<p>The <a href="../../api_c/lock_stat.html">lock_stat</a> function returns information about the status of
+the lock subsystem.  It is the programmatic interface used by the
+<a href="../../utility/db_stat.html">db_stat</a> utility.
+<p>The locking subsystem is closed by the call to <a href="../../api_c/env_close.html">DBENV-&gt;close</a>.
+<p>Finally, the entire locking subsystem may be discarded using the
+<a href="../../api_c/env_remove.html">DBENV-&gt;remove</a> interface.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/program/runtime.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/page.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/max.html b/bdb/docs/ref/lock/max.html
new file mode 100644
index 00000000000..23622909035
--- /dev/null
+++ b/bdb/docs/ref/lock/max.html
@@ -0,0 +1,88 @@
+<!--$Id: max.so,v 10.2 2000/12/21 19:11:28 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Configuring locking: sizing the system</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/nondb.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Configuring locking: sizing the system</h1>
+<p>The lock system is sized using the following three functions:
+<p><blockquote><pre><a href="../../api_c/env_set_lk_max_locks.html">DBENV-&gt;set_lk_max_locks</a>
+<a href="../../api_c/env_set_lk_max_lockers.html">DBENV-&gt;set_lk_max_lockers</a>
+<a href="../../api_c/env_set_lk_max_objects.html">DBENV-&gt;set_lk_max_objects</a></pre></blockquote>
+<p>The <a href="../../api_c/env_set_lk_max_locks.html">DBENV-&gt;set_lk_max_locks</a>, <a href="../../api_c/env_set_lk_max_lockers.html">DBENV-&gt;set_lk_max_lockers</a>
+and <a href="../../api_c/env_set_lk_max_objects.html">DBENV-&gt;set_lk_max_objects</a> functions specify, respectively, the
+maximum number of locks, lockers and locked objects supported by the
+lock subsystem.  The maximum number of locks is the number of locks that
+can be simultaneously requested in the system.  The maximum number of
+lockers is the number of lockers that can simultaneously request locks
+in the system.  The maximum number of lock objects is the number of
+objects that can simultaneously be locked in the system.  Selecting
+appropriate values requires an understanding of your application and
+its databases.  If the values are too small, then requests for locks in
+an application will fail.  If the values are too large, then the locking
+subsystem will consume more resources than is necessary.  It is better
+to err in the direction of allocating too many locks, lockers and
+objects as increasing the number of locks does not require large amounts
+of additional resources.
+<p>The recommended algorithm for selecting the maximum number of locks,
+lockers and lock objects, is to run the application under stressful
+conditions and then review the lock system's statistics to determine
+the maximum number of locks, lockers and lock objects that were used.
+Then, double these values for safety.  However, in some large
+applications, finer granularity of control is necessary in order to
+minimize the size of the lock subsystem.
+<p>The maximum number of lockers can be estimated as follows:
+<ul type=disc>
+<li>If the
+database environment is configured to use transactions, then the maximum
+number of lockers needed is the number of simultaneously active
+transactions and child transactions (where a child transaction is active
+until its parent commits or aborts, not until it commits or aborts).
+<li>If the database environment is not configured to use transactions, then
+the maximum number of lockers needed is the number of simultaneous
+non-cursor operations plus an additional locker for every simultaneously
+open cursor.
+</ul>
+<p>The maximum number of lock objects needed can be estimated as follows:
+<ul type=disc>
+<li>For Btree and Recno access methods, you will need, at a minimum, one
+lock object per level of the database tree.  (Unless keys are quite
+large with respect to the page size, neither Recno nor Btree database
+trees should ever be deeper than five levels.)  Then, you will need one
+lock object for each leaf page of the database tree that will be
+simultaneously accessed.
+<li>For the Queue access method you will need one lock object per record
+that is simultaneously accessed.  To this, add one lock object per page
+that will be simultaneously accessed.  (Since the Queue access method
+uses fixed-length records, and the database page size is known, it is
+possible to calculate the number of pages and therefore, lock objects,
+required.)  Deleted records skipped by a <a href="../../api_c/dbc_get.html#DB_NEXT">DB_NEXT</a> or
+<a href="../../api_c/dbc_get.html#DB_PREV">DB_PREV</a> operation do not require a separate lock object.
+Further, if your application is using transactions, then no database
+operation will ever use more than three lock objects at any time.
+<li>For the Hash access method you only need a single lock object.
+</ul>
+<p>For all access methods, you should then add an additional lock object
+per database, for the database's metadata page.
+<p>The maximum number of locks required by an application cannot be easily
+estimated.  It is possible to calculate a maximum number of locks by
+multiplying the maximum number of lockers, times the maximum number of
+lock objects, times two (two for the two possible lock modes for each
+object, read and write).  However, this is a pessimal value, and real
+applications are unlikely to actually need that many locks.  Review of
+the lock subsystem statistics is the best way to determine this value.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/config.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/nondb.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/nondb.html b/bdb/docs/ref/lock/nondb.html
new file mode 100644
index 00000000000..4fb37d6d7b0
--- /dev/null
+++ b/bdb/docs/ref/lock/nondb.html
@@ -0,0 +1,50 @@
+<!--$Id: nondb.so,v 10.10 2000/12/08 20:43:16 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Locking and non-Berkeley DB applications</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/max.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/intro.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Locking and non-Berkeley DB applications</h1>
+<p>The locking subsystem is useful outside the context of Berkeley DB.  It can be
+used to manage concurrent access to any collection of either ephemeral or
+persistent objects.  That is, the lock region can persist across
+invocations of an application, so it can be used to provide long-term
+locking (e.g., conference room scheduling).
+<p>In order to use the locking subsystem in such a general way, the
+applications must adhere to a convention for naming objects and lockers.
+Consider the conference room scheduling problem described above.  Assume
+there are three conference rooms and that we wish to schedule them in
+half-hour intervals.
+<p>The scheduling application must then select a way to identify each
+conference room/time slot combination.  In this case, we could describe
+the objects being locker as bytestrings consisting of the conference room
+name, the date on which it is needed, and the beginning of the appropriate
+half-hour slot.
+<p>Lockers are 32-bit numbers, so we might choose to use the User ID of the
+individual running the scheduling program.  To schedule half-hour slots,
+all the application need do is issue a <a href="../../api_c/lock_get.html">lock_get</a> call for the
+appropriate locker/object pair.  To schedule a longer slot, the
+application would issue a <a href="../../api_c/lock_vec.html">lock_vec</a> call with one <a href="../../api_c/lock_get.html">lock_get</a>
+operation per half-hour up to the total length.  If the <a href="../../api_c/lock_vec.html">lock_vec</a>
+call fails, the application would have to release the parts of the time
+slot that were obtained.
+<p>To cancel a reservation, the application would make the appropriate
+<a href="../../api_c/lock_put.html">lock_put</a> calls.  To reschedule a reservation, the <a href="../../api_c/lock_get.html">lock_get</a>
+and <a href="../../api_c/lock_put.html">lock_put</a> calls could all be made inside of a single
+<a href="../../api_c/lock_vec.html">lock_vec</a> call.  The output of <a href="../../api_c/lock_stat.html">lock_stat</a> could be
+post-processed into a human-readable schedule of conference room use.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/max.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/log/intro.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/notxn.html b/bdb/docs/ref/lock/notxn.html
new file mode 100644
index 00000000000..16b00cf66bf
--- /dev/null
+++ b/bdb/docs/ref/lock/notxn.html
@@ -0,0 +1,46 @@
+<!--$Id: notxn.so,v 10.10 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Locking without transactions</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/stdmode.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/twopl.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Locking without transactions</h1>
+<p>If an application runs with locking specified, but not transactions (e.g.,
+<a href="../../api_c/env_open.html">DBENV-&gt;open</a> is called with <a href="../../api_c/env_open.html#DB_INIT_LOCK">DB_INIT_LOCK</a> or
+<a href="../../api_c/env_open.html#DB_INIT_CDB">DB_INIT_CDB</a> specified, but not <a href="../../api_c/env_open.html#DB_INIT_TXN">DB_INIT_TXN</a>), locks are
+normally acquired during each Berkeley DB operation and released before the
+operation returns to the caller.  The only exception is in the case of
+cursor operations.  As cursors identify a particular position in a file,
+a cursor must retain a read-lock across cursor calls to make sure that
+that position is uniquely identifiable during the next cursor call,
+because an operation using <a href="../../api_c/dbc_get.html#DB_CURRENT">DB_CURRENT</a> must reference the same
+record as the previous cursor call.  Such cursor locks cannot be released
+until either the cursor is reset using the <a href="../../api_c/db_get.html#DB_GET_BOTH">DB_GET_BOTH</a>,
+<a href="../../api_c/dbc_get.html#DB_SET">DB_SET</a>, <a href="../../api_c/dbc_get.html#DB_SET_RANGE">DB_SET_RANGE</a>, <a href="../../api_c/dbc_put.html#DB_KEYFIRST">DB_KEYFIRST</a>, or
+<a href="../../api_c/dbc_put.html#DB_KEYLAST">DB_KEYLAST</a> functionality, in which case a new cursor lock is
+established, or the cursor is closed.  As a result, application designers
+are encouraged to close cursors as soon as possible.
+<p>It is important to realize that concurrent applications that use locking
+must ensure that two concurrent threads do not interfere with each other.
+However, as Btree and Hash access method page splits can occur at any
+time, there is virtually no way to guarantee that an application which
+writes the database cannot deadlock.  Applications running without the
+protection of transactions may deadlock, and when they do so, can leave
+the database in an inconsistent state.  Applications that need concurrent
+access, but not transactions, are more safely implemented using the Berkeley DB Concurrent Data Store
+Product.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/stdmode.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/twopl.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/page.html b/bdb/docs/ref/lock/page.html
new file mode 100644
index 00000000000..a7e43b3af66
--- /dev/null
+++ b/bdb/docs/ref/lock/page.html
@@ -0,0 +1,62 @@
+<!--$Id: page.so,v 10.12 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Page locks</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/stdmode.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Page locks</h1>
+<p>Under normal operation, the access methods use page locking.  The pagesize
+of a database is set when the database is created and may be specified by
+calling the <a href="../../api_c/db_set_pagesize.html">DB-&gt;set_pagesize</a> function.  If not specified, the Berkeley DB
+package tries to select a pagesize that will provide the best I/O
+performance by setting the page size equal to the block size of the
+underlying file system.
+<p>In the Btree access method, Berkeley DB uses a technique called lock coupling
+to improve concurrency.  The traversal of a Btree requires reading a page,
+searching that page to determine which page to search next and then
+repeating this process on the next page.  Once a page has been searched,
+it will never be accessed again for this operation, unless a page split
+is required.  To improve concurrency in the tree, once the next page to
+read/search has been determined, that page is locked, and then atomically
+(i.e., without relinquishing control of the lock manager) the original
+page lock is released.
+<p>As the Recno access method is built upon Btree, it too uses lock coupling
+for read operations.  However, as the Recno access method must maintain
+a count of records on its internal pages, it cannot lock couple during
+write operations.  Instead, it retains write locks on all internal pages
+during every update operation.  For this reason, it is not possible to
+have high concurrency in the Recno access method in the presence of write
+operations.
+<p>The Queue access method only uses short term page locks.  That is, a page
+lock is released prior to requesting another page lock.  Record locks are
+used for transaction isolation.  The provides a high degree of concurrency
+for write operations.  A metadata page is used to keep track of the head
+and tail of the queue.  This page is never locked during other locking or
+I/O operations.
+<p>The Hash access method does not have such traversal issues, but because
+it implements dynamic hashing, it must always refer to its metadata while
+computing a hash function.  This metadata is stored on a special page in
+the hash database.  This page must therefore be read locked on every
+operation.  Fortunately, it need only be write locked when new pages are
+allocated to the file, which happens in three cases: 1) a hash bucket
+becomes full and needs to split, 2) a key or data item is too large to
+fit on a normal page, and 3) the number of duplicate items for a fixed
+key becomes sufficiently large that they are moved to an auxiliary page.
+In this case, the access method must obtain a write lock on the metadata
+page, thus requiring that all readers be blocked from entering the tree
+until the update completes.
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/intro.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/stdmode.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/stdmode.html b/bdb/docs/ref/lock/stdmode.html
new file mode 100644
index 00000000000..ca1cd6b0bdd
--- /dev/null
+++ b/bdb/docs/ref/lock/stdmode.html
@@ -0,0 +1,61 @@
+<!--$Id: stdmode.so,v 10.20 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Standard lock modes</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/page.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/notxn.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Standard lock modes</h1>
+<p>The Berkeley DB locking protocol is described by a conflict matrix.  A conflict
+matrix is an n x n array where n is the number of different lock modes
+supported, and the (i, j)th entry of the array indicates whether a lock of
+mode i conflicts with a lock of mode j.
+<p>The Berkeley DB include files declare two commonly used conflict arrays:
+<p><dl compact>
+<p><dt>const u_int8_t db_rw_conflicts[ ];<dd>This is a conflict matrix for a simple scheme using shared and exclusive
+lock modes.
+<p><dt>const u_int8_t db_riw_conflicts[ ];<dd>This is a conflict matrix that involves various intent lock modes (e.g.,
+intent shared) that are used for multigranularity locking.
+</dl>
+<p>The number of modes associated with each matrix are DB_LOCK_RW_N and
+DB_LOCK_RIW_N, respectively.
+<p>In addition, the Berkeley DB include file defines the type <b>db_lockmode_t</b>,
+which is the type of the lock modes used with the standard tables above:
+<p><dl compact>
+<p><dt>DB_LOCK_NG<dd>not granted (always 0)
+<p><dt>DB_LOCK_READ<dd>read (shared)
+<p><dt>DB_LOCK_WRITE<dd>write (exclusive)
+</dl>
+<p>As an example, consider the basic multiple-reader/single writer conflict
+matrix described by <b>db_rw_conflicts</b>.  In the following
+example (and in the appropriate file), a 1 represents a conflict (i.e.,
+do not grant the lock if the indicated lock is held) and a 0 indicates
+that it is OK to grant the lock.
+<p>The rows indicate the lock that is held and the columns indicate the lock
+that is requested.
+<p><blockquote><pre>		Notheld	Read    Write
+Notheld		0	0	0
+Read*		0	0	1
+Write**		0	1	1
+</pre></blockquote>
+<p><dl compact>
+<p><dt>*<dd>In this case, suppose that there is a read lock held on an object.  A new
+request for a read lock would be granted, but a request for a write lock
+would not.
+<p><dt>**<dd>In this case, suppose that there is a write lock held on an object.  A
+new request for either a read or write lock would be denied.
+</dl>
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/page.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/notxn.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
diff --git a/bdb/docs/ref/lock/twopl.html b/bdb/docs/ref/lock/twopl.html
new file mode 100644
index 00000000000..6cf112c0979
--- /dev/null
+++ b/bdb/docs/ref/lock/twopl.html
@@ -0,0 +1,50 @@
+<!--$Id: twopl.so,v 10.7 2000/03/18 21:43:14 bostic Exp $-->
+<!--Copyright 1997, 1998, 1999, 2000 by Sleepycat Software, Inc.-->
+<!--All rights reserved.-->
+<html>
+<head>
+<title>Berkeley DB Reference Guide: Locking with transactions: two-phase locking</title>
+<meta name="description" content="Berkeley DB: An embedded database programmatic toolkit.">
+<meta name="keywords" content="embedded,database,programmatic,toolkit,b+tree,btree,hash,hashing,transaction,transactions,locking,logging,access method,access methods,java,C,C++">
+</head>
+<body bgcolor=white>
+        <a name="2"><!--meow--></a>    
+<table><tr valign=top>
+<td><h3><dl><dt>Berkeley DB Reference Guide:<dd>Locking Subsystem</dl></h3></td>
+<td width="1%"><a href="../../ref/lock/notxn.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/am_conv.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p>
+<h1 align=center>Locking with transactions: two-phase locking</h1>
+<p>Berkeley DB uses a locking protocol called two-phase locking.  This is the
+traditional protocol used in conjunction with lock-based transaction
+systems.
+<p>In a two-phase locking (2PL) system, transactions are broken up into two
+distinct phases.  During the first phase, the transaction only acquires
+locks.  During the second phase, the transaction only releases locks.
+More formally, once a transaction releases a lock, it may not acquire any
+additional locks.  Practically, this translates into a system where locks
+are acquired as they are needed throughout a transaction and retained
+until the transaction ends, either by committing or aborting.  In Berkeley DB,
+locks are released during <a href="../../api_c/txn_abort.html">txn_abort</a> or <a href="../../api_c/txn_commit.html">txn_commit</a>.  The
+only exception to this protocol occurs when we use lock-coupling to
+traverse a data structure.  If the locks are held only for traversal
+purposes, then the locks may be released before transaction commit or
+abort.
+<p>For applications, the implications of 2PL are that long-running
+transactions will hold locks for a long time.  When designing
+applications, lock contention should be considered.  In order to reduce
+the probability of deadlock and achieve the best level of concurrency
+possible, the following guidelines are helpful.
+<p><ol>
+<p><li>When accessing multiple databases, design all transactions so
+that they access the files in the same order.
+<p><li>If possible, access your most hotly contested resources last
+(so that their locks are held for the shortest time possible).
+<p><li>If possible, use nested transactions to protect the parts of
+your transaction most likely to deadlock.
+</ol>
+<table><tr><td><br></td><td width="1%"><a href="../../ref/lock/notxn.html"><img src="../../images/prev.gif" alt="Prev"></a><a href="../../ref/toc.html"><img src="../../images/ref.gif" alt="Ref"></a><a href="../../ref/lock/am_conv.html"><img src="../../images/next.gif" alt="Next"></a>
+</td></tr></table>
+<p><font size=1><a href="http://www.sleepycat.com">Copyright Sleepycat Software</a></font>
+</body>
+</html>
author	tim@threads.polyesthetic.msg <>	2001-03-04 19:42:05 -0500
committer	tim@threads.polyesthetic.msg <>	2001-03-04 19:42:05 -0500
commit	89dad52004ecba5a380aeebb0e2a9beaae88eb86 (patch)
tree	9dd732e08dba156ee3d7635caedc0dc3107ecac6 /bdb/docs/ref/lock
parent	639a1069d313843288ba6d9cb54b290073a748a7 (diff)
download	mariadb-git-89dad52004ecba5a380aeebb0e2a9beaae88eb86.tar.gz