Diffstat (limited to 'docs/programmer_reference/cam_app.html')
| -rw-r--r-- | docs/programmer_reference/cam_app.html | 487 |
1 files changed, 293 insertions, 194 deletions
diff --git a/docs/programmer_reference/cam_app.html b/docs/programmer_reference/cam_app.html index 07889503..b2287e72 100644 --- a/docs/programmer_reference/cam_app.html +++ b/docs/programmer_reference/cam_app.html @@ -14,7 +14,7 @@ <body> <div xmlns="" class="navheader"> <div class="libver"> - <p>Library Version 11.2.5.3</p> + <p>Library Version 12.1.6.1</p> </div> <table width="100%" summary="Navigation header"> <tr> @@ -22,9 +22,8 @@ </tr> <tr> <td width="20%" align="left"><a accesskey="p" href="cam_fail.html">Prev</a> </td> - <th width="60%" align="center">Chapter 10. - Berkeley DB Concurrent Data Store Applications - </th> + <th width="60%" align="center">Chapter 10. Berkeley DB Concurrent Data Store + Applications </th> <td width="20%" align="right"> <a accesskey="n" href="transapp.html">Next</a></td> </tr> </table> @@ -38,197 +37,299 @@ </div> </div> </div> - <p>When building Data Store and Concurrent Data Store applications, the -architecture decisions involve application startup (cleaning up any -existing databases, the removal of any existing database environment -and creation of a new environment), and handling system or application -failure. "Cleaning up" databases involves removal and re-creation -of the database, restoration from an archival copy and/or verification -and optional salvage, as described in <a class="xref" href="cam_fail.html" title="Handling failure in Data Store and Concurrent Data Store applications">Handling failure in Data Store and Concurrent Data Store applications</a>.</p> - <p>Data Store or Concurrent Data Store applications without database -environments are single process, by definition. These applications -should start up, re-create, restore, or verify and optionally salvage -their databases and run until eventual exit or application or system -failure. After system or application failure, that process can simply -repeat this procedure. This document will not discuss the case of these -applications further.</p> - <p>Otherwise, the first question of Data Store and Concurrent Data Store -architecture is the cleaning up existing databases and the removal of -existing database environments, and the subsequent creation of a new -environment. For obvious reasons, the application must serialize the -re-creation, restoration, or verification and optional salvage of its -databases. Further, environment removal and creation must be -single-threaded, that is, one thread of control (where a thread of -control is either a true thread or a process) must remove and re-create -the environment before any other thread of control can use the new -environment. It may simplify matters that Berkeley DB serializes creation of -the environment, so multiple threads of control attempting to create a -environment will serialize behind a single creating thread.</p> - <p>Removing a database environment will first mark the environment as -"failed", causing any threads of control still running in the -environment to fail and return to the application. This feature allows -applications to remove environments without concern for threads of -control that might still be running in the removed environment.</p> - <p>One consideration in removing a database environment which may be in use -by another thread, is the type of mutex being used by the Berkeley DB library. 
-In the case of database environment failure when using test-and-set -mutexes, threads of control waiting on a mutex when the environment is -marked "failed" will quickly notice the failure and will return an error -from the Berkeley DB API. In the case of environment failure when using -blocking mutexes, where the underlying system mutex implementation does -not unblock mutex waiters after the thread of control holding the mutex -dies, threads waiting on a mutex when an environment is recovered might -hang forever. Applications blocked on events (for example, an -application blocked on a network socket or a GUI event) may also fail -to notice environment recovery within a reasonable amount of time. -Systems with such mutex implementations are rare, but do exist; -applications on such systems should use an application architecture -where the thread recovering the database environment can explicitly -terminate any process using the failed environment, or configure Berkeley DB -for test-and-set mutexes, or incorporate some form of long-running timer -or watchdog process to wake or kill blocked processes should they block -for too long.</p> - <p>Regardless, it makes little sense for multiple threads of control to -simultaneously attempt to remove and re-create a environment, since the -last one to run will remove all environments created by the threads of -control that ran before it. However, for some few applications, it may -make sense for applications to have a single thread of control that -checks the existing databases and removes the environment, after which -the application launches a number of processes, any of which are able -to create the environment.</p> - <p>With respect to cleaning up existing databases, the database environment -must be removed before the databases are cleaned up. Removing the -environment causes any Berkeley DB library calls made by threads of control -running in the failed environment to return failure to the application. -Removing the database environment first ensures the threads of control -in the old environment do not race with the threads of control cleaning -up the databases, possibly overwriting them after the cleanup has -finished. Where the application architecture and system permit, many -applications kill all threads of control running in the failed database -environment before removing the failed database environment, on general -principles as well as to minimize overall system resource usage. It -does not matter if the new environment is created before or after the -databases are cleaned up.</p> - <p>After having dealt with database and database environment recovery after -failure, the next issue to manage is application failure. As described -in <a class="xref" href="cam_fail.html" title="Handling failure in Data Store and Concurrent Data Store applications">Handling failure in Data Store and Concurrent Data Store applications</a>, when a thread of control in a Data Store or -Concurrent Data Store application fails, it may exit holding data -structure mutexes or logical database locks. These mutexes and locks -must be released to avoid the remaining threads of control hanging -behind the failed thread of control's mutexes or locks.</p> - <p>There are three common ways to architect Berkeley DB Data Store and Concurrent -Data Store applications. 
The one chosen is usually based on whether or -not the application is comprised of a single process or group of -processes descended from a single process (for example, a server started -when the system first boots), or if the application is comprised of -unrelated processes (for example, processes started by web connections -or users logging into the system).</p> + <p> + When building Data Store and Concurrent Data Store + applications, the architecture decisions involve application + startup (cleaning up any existing databases, the removal of + any existing database environment and creation of a new + environment), and handling system or application failure. + "Cleaning up" databases involves removal and re-creation of + the database, restoration from an archival copy and/or + verification and optional salvage, as described in <a class="xref" href="cam_fail.html" title="Handling failure in Data Store and Concurrent Data Store applications">Handling failure in Data Store and Concurrent Data Store applications</a>. + </p> + <p> + Data Store or Concurrent Data Store applications without + database environments are single process, by definition. These + applications should start up, re-create, restore, or verify + and optionally salvage their databases and run until eventual + exit or application or system failure. After system or + application failure, that process can simply repeat this + procedure. This document will not discuss the case of these + applications further. + </p> + <p> + Otherwise, the first question of Data Store and Concurrent + Data Store architecture is the cleaning up existing databases + and the removal of existing database environments, and the + subsequent creation of a new environment. For obvious reasons, + the application must serialize the re-creation, restoration, + or verification and optional salvage of its databases. + Further, environment removal and creation must be + single-threaded, that is, one thread of control (where a + thread of control is either a true thread or a process) must + remove and re-create the environment before any other thread + of control can use the new environment. It may simplify + matters that Berkeley DB serializes creation of the + environment, so multiple threads of control attempting to + create a environment will serialize behind a single creating + thread. + </p> + <p> + Removing a database environment will first mark the + environment as "failed", causing any threads of control still + running in the environment to fail and return to the + application. This feature allows applications to remove + environments without concern for threads of control that might + still be running in the removed environment. + </p> + <p> + One consideration in removing a database environment which + may be in use by another thread, is the type of mutex being + used by the Berkeley DB library. In the case of database + environment failure when using test-and-set mutexes, threads + of control waiting on a mutex when the environment is marked + "failed" will quickly notice the failure and will return an + error from the Berkeley DB API. In the case of environment + failure when using blocking mutexes, where the underlying + system mutex implementation does not unblock mutex waiters + after the thread of control holding the mutex dies, threads + waiting on a mutex when an environment is recovered might hang + forever. 
Applications blocked on events (for example, an + application blocked on a network socket or a GUI event) may + also fail to notice environment recovery within a reasonable + amount of time. Systems with such mutex implementations are + rare, but do exist; applications on such systems should use an + application architecture where the thread recovering the + database environment can explicitly terminate any process + using the failed environment, or configure Berkeley DB for + test-and-set mutexes, or incorporate some form of long-running + timer or watchdog process to wake or kill blocked processes + should they block for too long. + </p> + <p> + Regardless, it makes little sense for multiple threads of + control to simultaneously attempt to remove and re-create a + environment, since the last one to run will remove all + environments created by the threads of control that ran before + it. However, for some few applications, it may make sense for + applications to have a single thread of control that checks + the existing databases and removes the environment, after + which the application launches a number of processes, any of + which are able to create the environment. + </p> + <p> + With respect to cleaning up existing databases, the database + environment must be removed before the databases are cleaned + up. Removing the environment causes any Berkeley DB library + calls made by threads of control running in the failed + environment to return failure to the application. Removing the + database environment first ensures the threads of control in + the old environment do not race with the threads of control + cleaning up the databases, possibly overwriting them after the + cleanup has finished. Where the application architecture and + system permit, many applications kill all threads of control + running in the failed database environment before removing the + failed database environment, on general principles as well as + to minimize overall system resource usage. It does not matter + if the new environment is created before or after the + databases are cleaned up. + </p> + <p> + After having dealt with database and database environment + recovery after failure, the next issue to manage is + application failure. As described in <a class="xref" href="cam_fail.html" title="Handling failure in Data Store and Concurrent Data Store applications">Handling failure in Data Store and Concurrent Data Store applications</a>, when a thread of control in a + Data Store or Concurrent Data Store application fails, it may + exit holding data structure mutexes or logical database locks. + These mutexes and locks must be released to avoid the + remaining threads of control hanging behind the failed thread + of control's mutexes or locks. + </p> + <p> + There are three common ways to architect Berkeley DB Data + Store and Concurrent Data Store applications. The one chosen + is usually based on whether or not the application is + comprised of a single process or group of processes descended + from a single process (for example, a server started when the + system first boots), or if the application is comprised of + unrelated processes (for example, processes started by web + connections or users logging into the system). + </p> <div class="orderedlist"> <ol type="1"> - <li>The first way to architect Data Store and Concurrent Data Store -applications is as a single process (the process may or may not be -multithreaded.) 
-<p>When this process starts, it removes any existing database environment -and creates a new environment. It then cleans up the databases and -opens those databases in the environment. The application can -subsequently create new threads of control as it chooses. Those threads -of control can either share already open Berkeley DB <a href="../api_reference/C/env.html" class="olink">DB_ENV</a> and -<a href="../api_reference/C/db.html" class="olink">DB</a> handles, or create their own. In this architecture, -databases are rarely opened or closed when more than a single thread of -control is running; that is, they are opened when only a single thread -is running, and closed after all threads but one have exited. The last -thread of control to exit closes the databases and the database -environment.</p><p>This architecture is simplest to implement because thread serialization -is easy and failure detection does not require monitoring multiple -processes.</p><p>If the application's thread model allows the process to continue after -thread failure, the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method can be used to determine if -the database environment is usable after the failure. If the -application does not call <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, or -<a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, the application -must behave as if there has been a system failure, removing the -environment and creating a new environment, and cleaning up any -databases it wants to continue to use. Once these actions have been -taken, other threads of control can continue (as long as all existing -Berkeley DB handles are first discarded), or restarted.</p></li> - <li>The second way to architect Data Store and Concurrent Data Store -applications is as a group of related processes (the processes may or -may not be multithreaded). -<p>This architecture requires the order in which threads of control are -created be controlled to serialize database environment removal and -creation, and database cleanup.</p><p>In addition, this architecture requires that threads of control be -monitored. If any thread of control exits with open Berkeley DB handles, the -application may call the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method to determine if the -database environment is usable after the exit. If the application does -not call <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, or <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns -<a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, the application must behave as if there has been -a system failure, removing the environment and creating a new -environment, and cleaning up any databases it wants to continue to use. -Once these actions have been taken, other threads of control can -continue (as long as all existing Berkeley DB handles are first discarded), -or restarted.</p><p>The easiest way to structure groups of related processes is to first -create a single "watcher" process (often a script) that starts when the -system first boots, removes and creates the database environment, cleans -up the databases and then creates the processes or threads that will -actually perform work. 
The initial thread has no further -responsibilities other than to wait on the threads of control it has -started, to ensure none of them unexpectedly exit. If a thread of -control exits, the watcher process optionally calls the -<a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method. If the application does not call -<a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, or if <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns -<a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, the environment can no longer be used, the -watcher kills all of the threads of control using the failed -environment, cleans up, and starts new threads of control to perform -work.</p></li> - <li>The third way to architect Data Store and Concurrent Data Store -applications is as a group of unrelated processes (the processes may or -may not be multithreaded). This is the most difficult architecture to -implement because of the level of difficulty in some systems of finding -and monitoring unrelated processes. -<p>One solution is to log a thread of control ID when a new Berkeley DB handle -is opened. For example, an initial "watcher" process could open/create -the database environment, clean up the databases and then create a -sentinel file. Any "worker" process wanting to use the environment -would check for the sentinel file. If the sentinel file does not exist, -the worker would fail or wait for the sentinel file to be created. Once -the sentinel file exists, the worker would register its process ID with -the watcher (via shared memory, IPC or some other registry mechanism), -and then the worker would open its <a href="../api_reference/C/env.html" class="olink">DB_ENV</a> handles and proceed. -When the worker finishes using the environment, it would unregister its -process ID with the watcher. The watcher periodically checks to ensure -that no worker has failed while using the environment. If a worker -fails while using the environment, the watcher removes the sentinel -file, kills all of the workers currently using the environment, cleans -up the environment and databases, and finally creates a new sentinel -file.</p><p>The weakness of this approach is that, on some systems, it is difficult -to determine if an unrelated process is still running. For example, -POSIX systems generally disallow sending signals to unrelated processes. -The trick to monitoring unrelated processes is to find a system resource -held by the process that will be modified if the process dies. On POSIX -systems, flock- or fcntl-style locking will work, as will LockFile on -Windows systems. Other systems may have to use other process-related -information such as file reference counts or modification times. In the -worst case, threads of control can be required to periodically -re-register with the watcher process: if the watcher has not heard from -a thread of control in a specified period of time, the watcher will take -action, cleaning up the environment.</p><p>If it is not practical to monitor the processes sharing a database -environment, it may be possible to monitor the environment to detect if -a thread of control has failed holding open Berkeley DB handles. This would -be done by having a "watcher" process periodically call the -<a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method. 
If <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns -<a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, the watcher would then take action, cleaning up -the environment.</p><p>The weakness of this approach is that all threads of control using the -environment must specify an "ID" function and an "is-alive" function -using the <a href="../api_reference/C/envset_thread_id.html" class="olink">DB_ENV->set_thread_id()</a> method. (In other words, the Berkeley DB -library must be able to assign a unique ID to each thread of control, -and additionally determine if the thread of control is still running. -It can be difficult to portably provide that information in applications -using a variety of different programming languages and running on a -variety of different platforms.)</p></li> + <li> + The first way to architect Data Store and Concurrent + Data Store applications is as a single process (the + process may or may not be multithreaded.) + <p> + When this + process starts, it removes any existing database + environment and creates a new environment. It then + cleans up the databases and opens those databases in + the environment. The application can subsequently + create new threads of control as it chooses. Those + threads of control can either share already open + Berkeley DB <a href="../api_reference/C/env.html" class="olink">DB_ENV</a> and <a href="../api_reference/C/db.html" class="olink">DB</a> handles, or create their + own. In this architecture, databases are rarely opened + or closed when more than a single thread of control is + running; that is, they are opened when only a single + thread is running, and closed after all threads but + one have exited. The last thread of control to exit + closes the databases and the database + environment. + </p><p> + This architecture is simplest to implement because + thread serialization is easy and failure detection + does not require monitoring multiple processes. + </p><p> + If the application's thread model allows the process + to continue after thread failure, the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> + method can be used to determine if the database + environment is usable after the failure. If the + application does not call <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, or + <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, + the application must behave as if there has been a system + failure, removing the environment and creating a new + environment, and cleaning up any databases it wants to + continue to use. Once these actions have been taken, other + threads of control can continue (as long as all existing + Berkeley DB handles are first discarded), or restarted. + </p><p> + Note that by default <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> will only notify the + calling thread that the database environment is unusable. + However, you can optionally cause <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> to broadcast + this to other threads of control by using the + <code class="literal">--enable-failchk_broadcast</code> flag when you + compile your Berkeley DB library. 
If this option is turned + on, then all threads of control using the database + environment will return + <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a> + when they attempt to obtain a mutex lock. In this + situation, a <code class="literal">DB_EVENT_FAILCHK_PANIC</code> or + <code class="literal">DB_EVENT_MUTEX_DIED</code> event will also be + raised. (You use <a href="../api_reference/C/envevent_notify.html" class="olink">DB_ENV->set_event_notify()</a> to examine events). + </p></li> + <li> + The second way to architect Data Store and + Concurrent Data Store applications is as a group of + related processes (the processes may or may not be + multithreaded). + <p> + This architecture requires the order + in which threads of control are created be controlled + to serialize database environment removal and + creation, and database cleanup. + </p><p> + In addition, this architecture requires that threads + of control be monitored. If any thread of control + exits with open Berkeley DB handles, the application + may call the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method to determine if the + database environment is usable after the exit. If the + application does not call <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, or + <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, + the application must + behave as if there has been a system failure, removing + the environment and creating a new environment, and + cleaning up any databases it wants to continue to use. + Once these actions have been taken, other threads of + control can continue (as long as all existing Berkeley + DB handles are first discarded), or restarted. + </p><p> + The easiest way to structure groups of related + processes is to first create a single "watcher" + process (often a script) that starts when the system + first boots, removes and creates the database + environment, cleans up the databases and then creates + the processes or threads that will actually perform + work. The initial thread has no further + responsibilities other than to wait on the threads of + control it has started, to ensure none of them + unexpectedly exit. If a thread of control exits, the + watcher process optionally calls the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> + method. If the application does not call <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a>, + or if <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, the environment can no + longer be used, the watcher kills all of the threads + of control using the failed environment, cleans up, + and starts new threads of control to perform + work. + </p></li> + <li> + The third way to architect Data Store and Concurrent + Data Store applications is as a group of unrelated + processes (the processes may or may not be multithreaded). + This is the most difficult architecture to implement + because of the level of difficulty in some systems of + finding and monitoring unrelated processes. + <p> + One + solution is to log a thread of control ID when a new + Berkeley DB handle is opened. 
For example, an initial + "watcher" process could open/create the database + environment, clean up the databases and then create a + sentinel file. Any "worker" process wanting to use the + environment would check for the sentinel file. If the + sentinel file does not exist, the worker would fail or + wait for the sentinel file to be created. Once the + sentinel file exists, the worker would register its + process ID with the watcher (via shared memory, IPC or + some other registry mechanism), and then the worker + would open its <a href="../api_reference/C/env.html" class="olink">DB_ENV</a> handles and proceed. When the + worker finishes using the environment, it would + unregister its process ID with the watcher. The + watcher periodically checks to ensure that no worker + has failed while using the environment. If a worker + fails while using the environment, the watcher removes + the sentinel file, kills all of the workers currently + using the environment, cleans up the environment and + databases, and finally creates a new sentinel + file. + </p><p> + The weakness of this approach is that, on some + systems, it is difficult to determine if an unrelated + process is still running. For example, POSIX systems + generally disallow sending signals to unrelated + processes. The trick to monitoring unrelated processes + is to find a system resource held by the process that + will be modified if the process dies. On POSIX + systems, flock- or fcntl-style locking will work, as + will LockFile on Windows systems. Other systems may + have to use other process-related information such as + file reference counts or modification times. In the + worst case, threads of control can be required to + periodically re-register with the watcher process: if + the watcher has not heard from a thread of control in + a specified period of time, the watcher will take + action, cleaning up the environment. + </p><p> + If it is not practical to monitor the processes + sharing a database environment, it may be possible to + monitor the environment to detect if a thread of + control has failed holding open Berkeley DB handles. + This would be done by having a "watcher" process + periodically call the <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> method. If + <a href="../api_reference/C/envfailchk.html" class="olink">DB_ENV->failchk()</a> returns <a class="link" href="program_errorret.html#program_errorret.DB_RUNRECOVERY">DB_RUNRECOVERY</a>, + the watcher would then + take action, cleaning up the environment. + </p><p> + The weakness of this approach is that all threads of + control using the environment must specify an "ID" + function and an "is-alive" function using the + <a href="../api_reference/C/envset_thread_id.html" class="olink">DB_ENV->set_thread_id()</a> method. (In other words, the + Berkeley DB library must be able to assign a unique ID + to each thread of control, and additionally determine + if the thread of control is still running. It can be + difficult to portably provide that information in + applications using a variety of different programming + languages and running on a variety of different + platforms.) 
+ </p></li> </ol> </div> - <p>Obviously, when implementing a process to monitor other threads of -control, it is important the watcher process' code be as simple and -well-tested as possible, because the application may hang if it fails.</p> + <p> + Obviously, when implementing a process to monitor other + threads of control, it is important the watcher process' code + be as simple and well-tested as possible, because the + application may hang if it fails. + </p> </div> <div class="navfooter"> <hr /> @@ -245,9 +346,7 @@ well-tested as possible, because the application may hang if it fails.</p> <td width="20%" align="center"> <a accesskey="h" href="index.html">Home</a> </td> - <td width="40%" align="right" valign="top"> Chapter 11. - Berkeley DB Transactional Data Store Applications - </td> + <td width="40%" align="right" valign="top"> Chapter 11. Berkeley DB Transactional Data Store Applications </td> </tr> </table> </div> |
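As a concrete illustration of the watcher architecture described in the text above, here is a minimal C sketch; it is not code from the Berkeley DB distribution. The environment home ENV_HOME and the clean_up_databases(), start_workers(), and kill_all_workers() helpers are hypothetical placeholders for application-specific logic, and the worker processes are assumed to be single-threaded and to share a Concurrent Data Store environment:

    /*
     * A minimal sketch (not from the Berkeley DB distribution) of a watcher
     * process for a Concurrent Data Store environment.  ENV_HOME and the
     * clean_up_databases(), start_workers() and kill_all_workers() helpers
     * are hypothetical placeholders for application-specific logic; worker
     * processes are assumed to be single-threaded.
     */
    #include <errno.h>
    #include <signal.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>

    #include <db.h>

    #define ENV_HOME "/path/to/env"            /* hypothetical environment home */

    static void clean_up_databases(void) { /* verify, salvage or re-create */ }
    static void start_workers(void)      { /* fork the worker processes */ }
    static void kill_all_workers(void)   { /* terminate every worker */ }

    /* "ID" callback: identify the calling thread of control. */
    static void
    my_thread_id(DB_ENV *dbenv, pid_t *pidp, db_threadid_t *tidp)
    {
        (void)dbenv;
        if (pidp != NULL)
            *pidp = getpid();
        if (tidp != NULL)
            memset(tidp, 0, sizeof(*tidp));    /* single-threaded workers */
    }

    /* "is-alive" callback: does the given process still exist? */
    static int
    my_is_alive(DB_ENV *dbenv, pid_t pid, db_threadid_t tid, u_int32_t flags)
    {
        (void)dbenv; (void)tid; (void)flags;
        return (kill(pid, 0) == 0 || errno != ESRCH);
    }

    int
    main(void)
    {
        DB_ENV *dbenv;
        int ret;

        for (;;) {
            /* Remove any existing environment; the handle is consumed. */
            if (db_env_create(&dbenv, 0) != 0)
                return (EXIT_FAILURE);
            (void)dbenv->remove(dbenv, ENV_HOME, DB_FORCE);

            /* Clean up the databases only after the environment is gone. */
            clean_up_databases();

            /* Create a fresh environment with thread tracking enabled. */
            if (db_env_create(&dbenv, 0) != 0)
                return (EXIT_FAILURE);
            (void)dbenv->set_thread_id(dbenv, my_thread_id);
            (void)dbenv->set_isalive(dbenv, my_is_alive);
            (void)dbenv->set_thread_count(dbenv, 128);
            if ((ret = dbenv->open(dbenv, ENV_HOME, DB_CREATE |
                DB_INIT_CDB | DB_INIT_MPOOL | DB_THREAD, 0)) != 0) {
                dbenv->err(dbenv, ret, "DB_ENV->open");
                return (EXIT_FAILURE);
            }

            start_workers();

            /* Poll for threads of control that died holding resources. */
            while (dbenv->failchk(dbenv, 0) == 0)
                sleep(1);

            /* DB_RUNRECOVERY (or any error): tear down and start over. */
            kill_all_workers();
            (void)dbenv->close(dbenv, 0);
        }
    }

Note that DB_ENV->set_thread_id(), DB_ENV->set_isalive(), and DB_ENV->set_thread_count() must all be configured before DB_ENV->open() for DB_ENV->failchk() to work, and that the environment is removed before the databases are cleaned up, as the documentation above requires.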
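The failchk broadcast behavior documented in the added paragraph above (the --enable-failchk_broadcast configuration option and the DB_EVENT_FAILCHK_PANIC and DB_EVENT_MUTEX_DIED events) can be observed from a worker process through DB_ENV->set_event_notify(). The following sketch is likewise hypothetical: the environment home and do_some_work() are placeholders, and the broadcast-related events are wrapped in #ifdef so the example also compiles against headers that predate them:

    /*
     * A hypothetical worker-side sketch: watch for panic-class events raised
     * when DB_ENV->failchk() broadcasts an environment failure.  The
     * environment home and do_some_work() are placeholders; the broadcast
     * events are guarded with #ifdef for headers that predate them.
     */
    #include <signal.h>
    #include <stdlib.h>
    #include <unistd.h>

    #include <db.h>

    static volatile sig_atomic_t env_failed;

    /* Record any panic-class event so the main loop can shut down cleanly. */
    static void
    on_event(DB_ENV *dbenv, u_int32_t event, void *info)
    {
        (void)dbenv; (void)info;
        switch (event) {
        case DB_EVENT_PANIC:
    #ifdef DB_EVENT_FAILCHK_PANIC
        case DB_EVENT_FAILCHK_PANIC:       /* raised by failchk broadcast */
    #endif
    #ifdef DB_EVENT_MUTEX_DIED
        case DB_EVENT_MUTEX_DIED:          /* a mutex holder has died */
    #endif
            env_failed = 1;
            break;
        default:
            break;
        }
    }

    /* Hypothetical unit of work against the Concurrent Data Store databases. */
    static int
    do_some_work(DB_ENV *dbenv)
    {
        (void)dbenv;
        sleep(1);
        return (0);
    }

    int
    main(void)
    {
        DB_ENV *dbenv;

        if (db_env_create(&dbenv, 0) != 0)
            return (EXIT_FAILURE);
        (void)dbenv->set_event_notify(dbenv, on_event);
        if (dbenv->open(dbenv, "/path/to/env",     /* hypothetical home */
            DB_INIT_CDB | DB_INIT_MPOOL | DB_THREAD, 0) != 0)
            return (EXIT_FAILURE);

        /*
         * Work until the environment fails; in broadcast builds any call
         * that must acquire a mutex returns DB_RUNRECOVERY as well.
         */
        while (!env_failed && do_some_work(dbenv) != DB_RUNRECOVERY)
            continue;

        /* Discard the handle and exit; the watcher re-creates the environment. */
        (void)dbenv->close(dbenv, 0);
        return (EXIT_FAILURE);
    }

In a default (non-broadcast) build only the thread that calls DB_ENV->failchk() is notified of the failure, so the event hook above is mainly useful with the broadcast option; the watcher in the previous sketch remains responsible for terminating and restarting workers.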