From 780b92ada9afcf1d58085a83a0b9e6bc982203d1 Mon Sep 17 00:00:00 2001
From: Lorry Tar Creator
- When building Transactional Data Store applications, there are design
- issues to consider whenever a thread of control with open Berkeley DB
- handles fails for any reason (where a thread of control may be either a
- true thread or a process).
-
- The first case is handling system failure: if the system fails, the
- database environment and the databases may be left in a corrupted
- state. In this case, recovery must be performed on the database
- environment before any further action is taken, in order to:
-
+ When building Transactional Data Store applications, there
+ are design issues to consider whenever a thread of control
+ with open Berkeley DB handles fails for any reason (where a
+ thread of control may be either a true thread or a process).
+
+ The first case is handling system failure: if the system
+ fails, the database environment and the databases may be left
+ in a corrupted state. In this case, recovery must be performed
+ on the database environment before any further action is
+ taken, in order to:
+
- For details on performing recovery, see the
- Recovery procedures.
-
- The second case is handling the failure of a thread of control. There
- are resources maintained in database environments that may be left
- locked or corrupted if a thread of control exits unexpectedly. These
- resources include data structure mutexes, logical database locks and
- unresolved transactions (that is, transactions which were never aborted
- or committed). While Transactional Data Store applications can treat
- the failure of a thread of control in the same way as they do a system
- failure, they have an alternative choice, the DB_ENV->failchk() method.
-
-
- The DB_ENV->failchk() will return
- DB_RUNRECOVERY
- if the database
- environment is unusable as a result of the thread of control failure.
- (If a data structure mutex or a database write lock is left held by
- thread of control failure, the application should not continue to use
- the database environment, as subsequent use of the environment is
- likely to result in threads of control convoying behind the held
- locks.) The DB_ENV->failchk() call will release any database read locks
- that have been left held by the exit of a thread of control, and abort
- any unresolved transactions. In this case, the application can
- continue to use the database environment.
-
+ The second case is handling the failure of a thread of
+ control. There are resources maintained in database
+ environments that may be left locked or corrupted if a thread
+ of control exits unexpectedly. These resources include data
+ structure mutexes, logical database locks and unresolved
+ transactions (that is, transactions which were never aborted
+ or committed). While Transactional Data Store applications can
+ treat the failure of a thread of control in the same way as
+ they do a system failure, they have an alternative choice, the
+ DB_ENV->failchk() method.
+
+
+ The DB_ENV->failchk() method will return
+ DB_RUNRECOVERY
+ if the database environment is unusable as a result of the thread
+ of control failure. (If a data structure mutex or a database write
+ lock is left held by thread of control failure, the application
+ should not continue to use the database environment, as subsequent
+ use of the environment is likely to result in threads of control
+ convoying behind the held locks.) The DB_ENV->failchk() call will
+ release any database read locks that have been left held by the
+ exit of a thread of control, and abort any unresolved transactions.
+ In this case, the application can continue to use the database
+ environment.
+
- A Transactional Data Store application recovering from a thread of
- control failure should call DB_ENV->failchk(), and, if it returns success,
- the application can continue. If DB_ENV->failchk() returns
- DB_RUNRECOVERY,
- the application should proceed as described for
- the case of system failure.
-
+ Note that you can optionally cause DB_ENV->failchk() to broadcast a database
+ environment failure to other threads of control by using the
+ --enable-failchk_broadcast flag when you compile
+ your Berkeley DB library. If this option is turned on, then all
+ threads of control using the database environment will return
+ DB_RUNRECOVERY
+ when they attempt to obtain a mutex lock. In this situation, a
+ DB_EVENT_FAILCHK_PANIC or
+ DB_EVENT_MUTEX_DIED event will also be raised.
+ (You use DB_ENV->set_event_notify() to examine events).
+
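The broadcast behaviour described in the added note can be illustrated with a small sketch of an event callback that records that recovery is needed. This is not code from the patch. To keep it compilable without the Berkeley DB headers, the event codes (`SKETCH_EVENT_*`), the `void *` environment argument, and the helper names are hypothetical stand-ins. A real callback would be registered with DB_ENV->set_event_notify() and would compare against DB_EVENT_FAILCHK_PANIC and DB_EVENT_MUTEX_DIED from db.h.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-ins for the event codes defined in db.h. */
enum {
    SKETCH_EVENT_FAILCHK_PANIC = 1,
    SKETCH_EVENT_MUTEX_DIED    = 2,
    SKETCH_EVENT_OTHER         = 3
};

/* Set once any thread is told the environment has failed. */
static atomic_int env_needs_recovery = 0;

/*
 * Shaped like an event-notification callback: (environment, event, info).
 * It only records the failure; the threads themselves decide when to
 * stop and proceed as for a system failure.
 */
void on_env_event(void *env, uint32_t event, void *event_info)
{
    (void)env;
    (void)event_info;

    if (event == SKETCH_EVENT_FAILCHK_PANIC ||
        event == SKETCH_EVENT_MUTEX_DIED)
        atomic_store(&env_needs_recovery, 1);
}

/* Polled by worker threads before they start new work. */
int env_recovery_requested(void)
{
    return atomic_load(&env_needs_recovery);
}
```

An atomic flag is used because the callback may run in any thread of control that touches the environment, while other threads poll it.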
+
+ A Transactional Data Store application recovering from a
+ thread of control failure should call DB_ENV->failchk(), and, if it
+ returns success, the application can continue. If DB_ENV->failchk()
+ returns DB_RUNRECOVERY,
+ the application should proceed as described for the case of system
+ failure. In addition, threads notified of failure by DB_ENV->failchk()
+ should also proceed as described for the case of system failure.
+
- It greatly simplifies matters that recovery may be performed regardless
- of whether recovery needs to be performed; that is, it is not an error
- to recover a database environment for which recovery is not strictly
- necessary. For this reason, applications should not try to determine
- if the database environment was active when the application or system
- failed. Instead, applications should run recovery any time the
- DB_ENV->failchk() method returns
- DB_RUNRECOVERY,
- or, if the application is
- not calling the DB_ENV->failchk() method, any time any thread of control
- accessing the database environment fails, as well as any time the
- system reboots.
-
+ It greatly simplifies matters that recovery may be
+ performed regardless of whether recovery needs to be
+ performed; that is, it is not an error to recover a database
+ environment for which recovery is not strictly necessary. For
+ this reason, applications should not try to determine if the
+ database environment was active when the application or system
+ failed. Instead, applications should run recovery any time the
+ DB_ENV->failchk() method returns
+ DB_RUNRECOVERY, or, if the application is not
+ calling the DB_ENV->failchk() method, any time any thread of
+ control accessing the database environment fails, as well as
+ any time the system reboots.
+
--
cgit v1.2.1
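The decision rule in the documentation text above (continue if DB_ENV->failchk() succeeds, run recovery if it returns DB_RUNRECOVERY, and run recovery unconditionally if failchk is not used) can be sketched as a small C function. This is an illustration only, not code from the patch. To keep it compilable without the Berkeley DB headers, the failchk and recovery steps are passed in as function pointers, and `SKETCH_RUNRECOVERY`, `env_action`, `handle_thread_failure` and the `stub_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for DB_RUNRECOVERY; real code uses the db.h constant. */
enum { SKETCH_RUNRECOVERY = -1 };

/* In a real application this would wrap dbenv->failchk(dbenv, 0). */
typedef int (*failchk_fn)(void *env);
/* In a real application this would discard all handles and re-open
 * the environment with recovery enabled. */
typedef int (*recover_fn)(void *env);

typedef enum { ENV_CONTINUE, ENV_RECOVERED, ENV_ERROR } env_action;

/*
 * Decide what to do after a thread of control has failed.
 * Pass failchk == NULL to model an application that does not use
 * failchk and therefore always runs recovery.
 */
env_action handle_thread_failure(void *env, failchk_fn failchk,
                                 recover_fn recover)
{
    int ret;

    assert(recover != NULL);

    if (failchk == NULL)
        return recover(env) == 0 ? ENV_RECOVERED : ENV_ERROR;

    ret = failchk(env);
    if (ret == 0)
        return ENV_CONTINUE;      /* environment is still usable */
    if (ret == SKETCH_RUNRECOVERY)
        return recover(env) == 0 ? ENV_RECOVERED : ENV_ERROR;
    return ENV_ERROR;             /* any other failure is left to the caller */
}

/* Demo stubs standing in for the real calls. */
int stub_failchk_ok(void *env)          { (void)env; return 0; }
int stub_failchk_runrecovery(void *env) { (void)env; return SKETCH_RUNRECOVERY; }
int stub_recover_ok(void *env)          { (void)env; return 0; }
```

Because recovering an environment that did not need it is not an error, the function never tries to work out whether the environment was active at the time of the failure. It simply runs recovery whenever the rule calls for it.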