| author | Lorry Tar Creator <lorry-tar-importer@baserock.org> | 2015-02-17 17:25:57 +0000 |
|---|---|---|
| committer | <> | 2015-03-17 16:26:24 +0000 |
| commit | 780b92ada9afcf1d58085a83a0b9e6bc982203d1 (patch) | |
| tree | 598f8b9fa431b228d29897e798de4ac0c1d3d970 /docs/programmer_reference/rep_partition.html | |
| parent | 7a2660ba9cc2dc03a69ddfcfd95369395cc87444 (diff) | |
Diffstat (limited to 'docs/programmer_reference/rep_partition.html')
| -rw-r--r-- | docs/programmer_reference/rep_partition.html | 220 |
1 file changed, 129 insertions, 91 deletions
diff --git a/docs/programmer_reference/rep_partition.html b/docs/programmer_reference/rep_partition.html
index e5740736..9b662f4b 100644
--- a/docs/programmer_reference/rep_partition.html
+++ b/docs/programmer_reference/rep_partition.html
@@ -14,7 +14,7 @@
 <body>
 <div xmlns="" class="navheader">
 <div class="libver">
- <p>Library Version 11.2.5.3</p>
+ <p>Library Version 12.1.6.1</p>
 </div>
 <table width="100%" summary="Navigation header">
 <tr>
@@ -22,9 +22,7 @@
 </tr>
 <tr>
 <td width="20%" align="left"><a accesskey="p" href="rep_twosite.html">Prev</a> </td>
- <th width="60%" align="center">Chapter 12.
- Berkeley DB Replication
- </th>
+ <th width="60%" align="center">Chapter 12. Berkeley DB Replication </th>
 <td width="20%" align="right"> <a accesskey="n" href="rep_faq.html">Next</a></td>
 </tr>
 </table>
@@ -38,93 +36,133 @@
 </div>
 </div>
 </div>
- <p>The Berkeley DB replication implementation can be affected by network
-partitioning problems.</p>
- <p>For example, consider a replication group with N members. The network
-partitions with the master on one side and more than N/2 of the sites
-on the other side. The sites on the side with the master will continue
-forward, and the master will continue to accept write queries for the
-databases. Unfortunately, the sites on the other side of the partition,
-realizing they no longer have a master, will hold an election. The
-election will succeed as there are more than N/2 of the total sites
-participating, and there will then be two masters for the replication
-group. Since both masters are potentially accepting write queries, the
-databases could diverge in incompatible ways.</p>
- <p>If multiple masters are ever found to exist in a replication group, a
-master detecting the problem will return <a href="../api_reference/C/repmessage.html#repmsg_DB_REP_DUPMASTER" class="olink">DB_REP_DUPMASTER</a>. If
-the application sees this return, it should reconfigure itself as a
-client (by calling <a href="../api_reference/C/repstart.html" class="olink">DB_ENV->rep_start()</a>), and then call for an election
-(by calling <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a>). The site that wins the election may be
-one of the two previous masters, or it may be another site entirely.
-Regardless, the winning system will bring all of the other systems into
-conformance.</p>
- <p>As another example, consider a replication group with a master
-environment and two clients A and B, where client A may upgrade to
-master status and client B cannot. Then, assume client A is partitioned
-from the other two database environments, and it becomes out-of-date
-with respect to the master. Then, assume the master crashes and does
-not come back on-line. Subsequently, the network partition is restored,
-and clients A and B hold an election. As client B cannot win the
-election, client A will win by default, and in order to get back into
-sync with client B, possibly committed transactions on client B will be
-unrolled until the two sites can once again move forward together.</p>
- <p>In both of these examples, there is a phase where a newly elected master
-brings the members of a replication group into conformance with itself
-so that it can start sending new information to them. This can result
-in the loss of information as previously committed transactions are
-unrolled.</p>
- <p>In architectures where network partitions are an issue, applications
-may want to implement a heart-beat protocol to minimize the consequences
-of a bad network partition. As long as a master is able to contact at
-least half of the sites in the replication group, it is impossible for
-there to be two masters. If the master can no longer contact a
-sufficient number of systems, it should reconfigure itself as a client,
-and hold an election. Replication Manager does not currently
-implement such a feature, so this technique is only available to Base API
-applications.</p>
- <p>There is another tool applications can use to minimize the damage in
-the case of a network partition. By specifying an <span class="bold"><strong>nsites</strong></span>
-argument to <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> that is larger than the actual number of
-database environments in the replication group, Base API applications can keep
-systems from declaring themselves the master unless they can talk to
-a large percentage of the sites in the system. For example, if there
-are 20 database environments in the replication group, and an argument
-of 30 is specified to the <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> method, then a system will have
-to be able to talk to at least 16 of the sites to declare itself the
-master.</p>
- <p>Replication Manager uses the value of <span class="bold"><strong>nsites</strong></span> (configured by
-the <a href="../api_reference/C/repnsites.html" class="olink">DB_ENV->rep_set_nsites()</a> method) for elections as well as in calculating how
-many acknowledgements to wait for when sending a
-<a href="../api_reference/C/reptransport.html#transport_DB_REP_PERMANENT" class="olink">DB_REP_PERMANENT</a> message. So this technique may be useful here
-as well, unless the application uses the <a href="../api_reference/C/repmgrset_ack_policy.html#ackspolicy_DB_REPMGR_ACKS_ALL" class="olink">DB_REPMGR_ACKS_ALL</a> or
-<a href="../api_reference/C/repmgrset_ack_policy.html#ackspolicy_DB_REPMGR_ACKS_ALL_PEERS" class="olink">DB_REPMGR_ACKS_ALL_PEERS</a> acknowledgement policies.</p>
- <p>Specifying a <span class="bold"><strong>nsites</strong></span> argument to <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> that is
-smaller than the actual number of database environments in the
-replication group has its uses as well. For example, consider a
-replication group with 2 environments. If they are partitioned from
-each other, neither of the sites could ever get enough votes to become
-the master. A reasonable alternative would be to specify a
-<span class="bold"><strong>nsites</strong></span> argument of 2 to one of the systems
-and a <span class="bold"><strong>nsites</strong></span>
-argument of 1 to the other. That way, one of the systems could win
-elections even when partitioned, while the other one could not. This
-would allow one of the systems to continue accepting write
-queries after the partition.</p>
- <p>In a 2-site group, Replication Manager by default reacts to the loss of
-communication with the master by observing a strict majority rule that
-prevents the survivor from taking over. Thus it avoids multiple masters and
-the need to unroll some transactions if both sites are running but cannot
-communicate. But it does leave the group in a read-only state until both
-sites are available. If application availability while one site is down is a
-priority and it is acceptable to risk unrolling some transactions, there
-is a configuration option to turn off the strict majority rule and allow
-the surviving client to declare itself to be master. See the <a href="../api_reference/C/repconfig.html" class="olink">DB_ENV->rep_set_config()</a>
-method <a href="../api_reference/C/repconfig.html#config_DB_REPMGR_CONF_2SITE_STRICT" class="olink">DB_REPMGR_CONF_2SITE_STRICT</a> flag for more information.</p>
- <p>These scenarios stress the importance of good network infrastructure in
-Berkeley DB replicated environments. When replicating database environments
-over sufficiently lossy networking, the best solution may well be to
-pick a single master, and only hold elections when human intervention
-has determined the selected master is unable to recover at all.</p>
+ <p>
+ The Berkeley DB replication implementation can be affected
+ by network partitioning problems.
+ </p>
+ <p>
+ For example, consider a replication group with N members.
+ The network partitions with the master on one side and more
+ than N/2 of the sites on the other side. The sites on the side
+ with the master will continue forward, and the master will
+ continue to accept write queries for the databases.
+ Unfortunately, the sites on the other side of the partition,
+ realizing they no longer have a master, will hold an election.
+ The election will succeed as there are more than N/2 of the
+ total sites participating, and there will then be two masters
+ for the replication group. Since both masters are potentially
+ accepting write queries, the databases could diverge in
+ incompatible ways.
+ </p>
+ <p>
+ If multiple masters are ever found to exist in a replication
+ group, a master detecting the problem will return
+ <a href="../api_reference/C/repmessage.html#repmsg_DB_REP_DUPMASTER" class="olink">DB_REP_DUPMASTER</a>. Replication Manager applications
+ automatically handle duplicate master situations. If a Base
+ API application sees this return, it should reconfigure itself
+ as a client (by calling <a href="../api_reference/C/repstart.html" class="olink">DB_ENV->rep_start()</a>), and then call for an
+ election (by calling <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a>). The site that wins the
+ election may be one of the two previous masters, or it may be
+ another site entirely. Regardless, the winning system will
+ bring all of the other systems into conformance.
+ </p>
+ <p>
+ As another example, consider a replication group with a
+ master environment and two clients A and B, where client A may
+ upgrade to master status and client B cannot. Then, assume
+ client A is partitioned from the other two database
+ environments, and it becomes out-of-date with respect to the
+ master. Then, assume the master crashes and does not come back
+ on-line. Subsequently, the network partition is restored, and
+ clients A and B hold an election. As client B cannot win the
+ election, client A will win by default, and in order to get
+ back into sync with client B, possibly committed transactions
+ on client B will be unrolled until the two sites can once
+ again move forward together.
+ </p>
+ <p>
+ In both of these examples, there is a phase where a newly
+ elected master brings the members of a replication group into
+ conformance with itself so that it can start sending new
+ information to them. This can result in the loss of
+ information as previously committed transactions are
+ unrolled.
+ </p>
+ <p>
+ In architectures where network partitions are an issue,
+ applications may want to implement a heartbeat protocol to
+ minimize the consequences of a bad network partition. As long
+ as a master is able to contact at least half of the sites in
+ the replication group, it is impossible for there to be two
+ masters. If the master can no longer contact a sufficient
+ number of systems, it should reconfigure itself as a client,
+ and hold an election. Replication Manager does not currently
+ implement such a feature, so this technique is only available
+ to Base API applications.
+ </p>
+ <p>
+ There is another tool applications can use to minimize the
+ damage in the case of a network partition. By specifying an
+ <span class="bold"><strong>nsites</strong></span> argument to
+ <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> that is larger than the actual number of database
+ environments in the replication group, Base API applications
+ can keep systems from declaring themselves the master unless
+ they can talk to a large percentage of the sites in the
+ system. For example, if there are 20 database environments in
+ the replication group, and an argument of 30 is specified to
+ the <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> method, then a system will have to be able to
+ talk to at least 16 of the sites to declare itself the master.
+ Replication Manager automatically maintains the number of
+ sites in the replication group, so this technique is only
+ available to Base API applications.
+ </p>
+ <p>
+ Specifying a <span class="bold"><strong>nsites</strong></span>
+ argument to <a href="../api_reference/C/repelect.html" class="olink">DB_ENV->rep_elect()</a> that is smaller than the actual number
+ of database environments in the replication group has its uses
+ as well. For example, consider a replication group with 2
+ environments. If they are partitioned from each other, neither
+ of the sites could ever get enough votes to become the master.
+ A reasonable alternative would be to specify a <span class="bold"><strong>nsites</strong></span> argument of 2 to one of the
+ systems and a <span class="bold"><strong>nsites</strong></span> argument
+ of 1 to the other. That way, one of the systems could win
+ elections even when partitioned, while the other one could
+ not. This would allow one of the systems to continue accepting
+ write queries after the partition.
+ </p>
+ <p>
+ In a two-site group, Replication Manager by default reacts to
+ the loss of communication with the master by observing a
+ strict majority rule that prevents the survivor from taking
+ over. Thus it avoids multiple masters and the need to unroll
+ some transactions if both sites are running but cannot
+ communicate. But it does leave the group in a read-only state
+ until both sites are available. If application availability
+ while one site is down is a priority and it is acceptable to
+ risk unrolling some transactions, there is a configuration
+ option to turn off the strict majority rule and allow the
+ surviving client to declare itself to be master. See the
+ <a href="../api_reference/C/repconfig.html" class="olink">DB_ENV->rep_set_config()</a> method <a href="../api_reference/C/repconfig.html#config_DB_REPMGR_CONF_2SITE_STRICT" class="olink">DB_REPMGR_CONF_2SITE_STRICT</a> flag for more
+ information.
+ </p>
+ <p>
+ Preferred master mode is another alternative for two-site
+ Replication Manager replication groups. It allows the survivor
+ to take over after the loss of communication with the master.
+ When communications are restored, it always preserves the
+ transactions from the preferred master site. See
+ <a class="xref" href="rep_twosite.html#twosite_prefmas" title="Preferred master mode">Preferred master mode</a>
+ for more information.
+ </p>
+ <p>
+ These scenarios stress the importance of good network
+ infrastructure in Berkeley DB replicated environments. When
+ replicating database environments over sufficiently lossy
+ networking, the best solution may well be to pick a single
+ master, and only hold elections when human intervention has
+ determined the selected master is unable to recover at
+ all.
+ </p>
 </div>
 <div class="navfooter">
 <hr />
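The rewritten text above tells a Base API application that sees DB_REP_DUPMASTER to downgrade itself with DB_ENV->rep_start() and then call DB_ENV->rep_elect(). A minimal C sketch of that pattern follows; it assumes an application that already drives DB_ENV->rep_process_message() from its own communications loop, and the helper name handle_rep_message() plus the GROUP_NSITES/GROUP_NVOTES placeholders are illustrative, not part of the Berkeley DB API.

```c
#include <db.h>

/* Illustrative placeholders: a real application tracks its own
 * replication group size and required vote count. */
#define GROUP_NSITES 5
#define GROUP_NVOTES 3

/* Called for each replication message received from another site. */
void
handle_rep_message(DB_ENV *dbenv, DBT *control, DBT *rec, int eid)
{
	DB_LSN ret_lsn;
	int ret;

	ret = dbenv->rep_process_message(dbenv, control, rec, eid, &ret_lsn);
	switch (ret) {
	case 0:
		break;
	case DB_REP_DUPMASTER:
		/*
		 * Another master exists: reconfigure as a client, then
		 * call for an election.  The winner is reported through
		 * the DB_EVENT_REP_ELECTED event callback.
		 */
		(void)dbenv->rep_start(dbenv, NULL, DB_REP_CLIENT);
		(void)dbenv->rep_elect(dbenv, GROUP_NSITES, GROUP_NVOTES, 0);
		break;
	default:
		/* DB_REP_HOLDELECTION, DB_REP_NEWSITE, etc. not shown. */
		break;
	}
}
```

As the new text notes, Replication Manager applications handle duplicate masters internally and need none of this.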
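The nsites discussion in the new text (passing 30 for a 20-site group so a winner needs 16 votes, or giving one site of a two-site pair an nsites of 1 so it can win alone) comes down to the first two arguments of DB_ENV->rep_elect(). A sketch under the assumption that dbenv is an already opened, replication-configured Base API environment handle; the wrapper function names are made up for illustration.

```c
#include <db.h>

/*
 * Sketch only: dbenv is assumed to be an open DB_ENV handle in a
 * Base API replication application.  Passing 0 for nvotes asks for
 * a simple majority of the nsites value supplied here.
 */
void
election_with_inflated_nsites(DB_ENV *dbenv)
{
	/*
	 * 20 environments actually exist, but nsites is given as 30,
	 * so a site must collect at least 16 votes (a majority of 30)
	 * before it may declare itself master.
	 */
	(void)dbenv->rep_elect(dbenv, 30, 0, 0);
}

void
election_for_two_site_survivor(DB_ENV *dbenv)
{
	/*
	 * Two-site group: the site allowed to win while partitioned
	 * passes an nsites of 1, so its own vote suffices; the other
	 * site would pass 2 and can never win alone.
	 */
	(void)dbenv->rep_elect(dbenv, 1, 0, 0);
}
```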
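Finally, the two-site Replication Manager behaviour described above hangs on a single DB_ENV->rep_set_config() flag. A sketch, assuming dbenv is an open environment handle configured before Replication Manager is started; whether relaxing the rule is acceptable depends on the application's tolerance for unrolled transactions.

```c
#include <db.h>

/*
 * Sketch: let the surviving site of a two-site Replication Manager
 * group elect itself master after losing contact with its peer, at
 * the cost of possibly unrolling transactions when the peer returns.
 */
int
allow_two_site_takeover(DB_ENV *dbenv)
{
	/* Turn off the default strict-majority rule for 2-site groups. */
	return (dbenv->rep_set_config(dbenv, DB_REPMGR_CONF_2SITE_STRICT, 0));
}
```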