From 780b92ada9afcf1d58085a83a0b9e6bc982203d1 Mon Sep 17 00:00:00 2001
From: Lorry Tar Creator

The simplest way to build a replicated Berkeley DB application is to first
-build (and debug!) the transactional version of the same application.
-Then, add a thin replication layer: application initialization must be
-changed and the application's communication infrastructure must be
-added. The application initialization changes are relatively simple.
-Replication Manager provides a communication infrastructure, but
-in order to use the replication Base APIs you must provide your own. For implementation reasons, all replicated databases must reside in
-the data directories set from DB_ENV->set_data_dir() (or in the
-default environment home directory, if not using
-DB_ENV->set_data_dir()), rather than in a subdirectory below the
-specified directory. Care must be taken in applications using
-relative pathnames and changing working directories after opening the
-environment. In such applications the replication initialization code
-may not be able to locate the databases, and applications that change
-their working directories may need to use absolute pathnames. During application initialization, the application performs
-three additional tasks: first, it must specify the DB_INIT_REP
-flag when opening its database environment and additionally, a
-Replication Manager application must also specify the DB_THREAD flag;
-second, it must provide Berkeley DB information about its communications
-infrastructure; and third, it must start the Berkeley DB replication system.
-Generally, a replicated application will do normal Berkeley DB recovery and
-configuration, exactly like any other transactional application.
- Replication Manager applications configure the built-in communications
- infrastructure by calling obtaining a DB_SITE handle, and then using
- it to configure the local site. It can optionally obtain one or more
- DB_SITE handles to configure remote sites. Once the environment has
- been opened, the application starts the replication system by calling
- the DB_ENV->repmgr_start() method.
- A Base API application calls the
-DB_ENV->rep_set_transport() method to configure the entry point to its own
-communications infrastructure, and then calls the
-DB_ENV->rep_start() method to join or create the replication group. When starting the replication system, an application has two choices:
-it may choose the group master site explicitly, or alternatively it
-may configure all group members as clients and then call for an
-election, letting the clients select the master from among
-themselves. Either is correct, and the choice is entirely up to the
-application. Replication Manager applications make this choice simply by setting
-the flags parameter to the DB_ENV->repmgr_start() method. For a Base API application, the result of
-calling DB_ENV->rep_start() is usually the discovery of a master, or the
-declaration of the local environment as the master. If a master has
-not been discovered after a reasonable amount of time, the application
-should call DB_ENV->rep_elect() to call for an election. Consider a Base API application with multiple processes or multiple
-environment handles
-that modify databases in the replicated environment. All modifications
-must be done on the master environment. The first process to join or
-create the master environment must call both the
-DB_ENV->rep_set_transport() and the DB_ENV->rep_start() method. Subsequent
-replication processes must at least call the DB_ENV->rep_set_transport() method.
-Those processes may call the DB_ENV->rep_start() method (as long as they use the
-same master or client argument). If multiple processes are modifying
-the master environment there must be a unified communication
-infrastructure such that messages arriving at clients have a single
-master ID. Additionally the application must be structured so that all
-incoming messages are able to be processed by a single DB_ENV
-handle. Note that not all processes running in replicated environments need to
-call DB_ENV->repmgr_start(), DB_ENV->rep_set_transport() or DB_ENV->rep_start(). Read-only
-processes running in a master environment do not need to be configured
-for replication in any way. Processes running in a client environment
-are read-only by definition, and so do not need to be configured for
-replication either (although, in the case of clients that may become
-masters, it is usually simplest to configure for replication on process
-startup rather than trying to reconfigure when the client becomes a
-master). Obviously, at least one thread of control on each client must
-be configured for replication as messages must be passed between the
-master and the client. Any site in a replication group may have its own private
-transactional databases in the environment as well. A site may
-create a local database by specifying the DB_TXN_NOT_DURABLE
-flag to the DB->set_flags() method. The application
-must never create a private database with the same name
-as a database replicated across the entire environment
-as data corruption can result. For implementation reasons, Base API applications must process
-all incoming replication messages
-using the same DB_ENV handle. It is not required that
-a single thread of control process all messages, only that all threads
-of control processing messages use the same handle. No additional calls are required to shut down a database environment
-participating in a replication group. The application should shut down
-the environment in the usual manner, by calling the DB_ENV->close() method.
-For Replication Manager applications, this also terminates all network
-connections and background processing threads.
+ The application initialization changes are relatively
+ simple. Replication Manager provides a communication
+ infrastructure, but in order to use the replication Base APIs
+ you must provide your own.
+
+ For implementation reasons, all replicated databases must
+ reside in the data directories set from DB_ENV->add_data_dir() (or
+ in the default environment home directory, if not using
+ DB_ENV->add_data_dir()), rather than in a subdirectory below the
+ specified directory. Care must be taken in applications using
+ relative pathnames and changing working directories after
+ opening the environment. In such applications the replication
+ initialization code may not be able to locate the databases,
+ and applications that change their working directories may
+ need to use absolute pathnames.
+
+ During application initialization, the application performs
+ three additional tasks: first, it must specify the
+ DB_INIT_REP flag when opening its database environment and
+ additionally, a Replication Manager application must also
+ specify the DB_THREAD flag; second, it must provide Berkeley
+ DB information about its communications infrastructure; and
+ third, it must start the Berkeley DB replication system.
+ Generally, a replicated application will do normal Berkeley DB
+ recovery and configuration, exactly like any other
+ transactional application.
+
+ Replication Manager applications configure the built-in
+ communications infrastructure by calling obtaining a DB_SITE
+ handle, and then using it to configure the local site. It can
+ optionally obtain one or more DB_SITE handles to configure
+ remote sites. Once the environment has been opened, the
+ application starts the replication system by calling the
+ DB_ENV->repmgr_start() method.
+
+ A Base API application calls the DB_ENV->rep_set_transport() method to
+ configure the entry point to its own communications
+ infrastructure, and then calls the DB_ENV->rep_start() method to join
+ or create the replication group.
+
+ When starting the replication system, an application has two
+ choices: it may choose the group master site explicitly, or
+ alternatively it may configure all group members as clients
+ and then call for an election, letting the clients select the
+ master from among themselves. Either is correct, and the
+ choice is entirely up to the application.
+
+ Replication Manager applications make this choice simply by
+ setting the flags parameter to the DB_ENV->repmgr_start()
+ method.
+
+ For a Base API application, the result of calling DB_ENV->rep_start()
+ is usually the discovery of a master, or the declaration of
+ the local environment as the master. If a master has not been
+ discovered after a reasonable amount of time, the application
+ should call DB_ENV->rep_elect() to call for an election.
+
+ Consider a Base API application with multiple processes or
+ multiple environment handles that modify databases in the
+ replicated environment. All modifications must be done on the
+ master environment. The first process to join or create the
+ master environment must call both the DB_ENV->rep_set_transport() and the
+ DB_ENV->rep_start() method. Subsequent replication processes must at
+ least call the DB_ENV->rep_set_transport() method. Those processes may call
+ the DB_ENV->rep_start() method (as long as they use the same master or
+ client argument). If multiple processes are modifying the
+ master environment there must be a unified communication
+ infrastructure such that messages arriving at clients have a
+ single master ID. Additionally the application must be
+ structured so that all incoming messages are able to be
+ processed by a single DB_ENV handle.
+
+ Note that not all processes running in replicated
+ environments need to call DB_ENV->repmgr_start(), DB_ENV->rep_set_transport() or
+ DB_ENV->rep_start(). Read-only processes running in a master
+ environment do not need to be configured for replication in
+ any way. Processes running in a client environment are
+ read-only by definition, and so do not need to be configured
+ for replication either (although, in the case of clients that
+ may become masters, it is usually simplest to configure for
+ replication on process startup rather than trying to
+ reconfigure when the client becomes a master). Obviously, at
+ least one thread of control on each client must be configured
+ for replication as messages must be passed between the master
+ and the client.
+
+ Any site in a replication group may have its own private
+ transactional databases in the environment as well. A site may
+ create a local database by specifying the DB_TXN_NOT_DURABLE
+ flag to the DB->set_flags() method. The application must never
+ create a private database with the same name as a database
+ replicated across the entire environment as data corruption
+ can result.
+
+ For implementation reasons, Base API applications must
+ process all incoming replication messages using the same DB_ENV
+ handle. It is not required that a single thread of control
+ process all messages, only that all threads of control
+ processing messages use the same handle.
+
+ No additional calls are required to shut down a database
+ environment participating in a replication group. The
+ application should shut down the environment in the usual
+ manner, by calling the DB_ENV->close() method. For Replication
+ Manager applications, this also terminates all network
+ connections and background processing threads.
+