From 012f33fd105fb0838898bb66a25823aaf07a9704 Mon Sep 17 00:00:00 2001 From: Alan Conway Date: Tue, 27 Mar 2012 14:49:47 +0000 Subject: QPID-3603: Update new HA docs with information on rgmanager, more detail about client connections. git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1305855 13f79535-47bb-0310-9956-ffa450edef68 --- qpid/doc/book/src/Active-Passive-Cluster.xml | 497 +++++++++++++++++++-------- 1 file changed, 353 insertions(+), 144 deletions(-) diff --git a/qpid/doc/book/src/Active-Passive-Cluster.xml b/qpid/doc/book/src/Active-Passive-Cluster.xml index 3eaadad51e..266fd3551d 100644 --- a/qpid/doc/book/src/Active-Passive-Cluster.xml +++ b/qpid/doc/book/src/Active-Passive-Cluster.xml @@ -27,66 +27,62 @@ under the License.
Overview
- This release provides a preview of a new module for High Availability (HA). The new module is not yet complete or ready for production use, it being made available so that users can experiment with the new approach and provide feedback early in the development process. Feedback should go to user@qpid.apache.org.
+ This release provides a preview of a new module for High Availability (HA). The new module is not yet complete or ready for production use. It is being made available so that users can experiment with the new approach and provide feedback early in the development process. Feedback should go to dev@qpid.apache.org.
- The old cluster module takes an active-active approach, i.e. all the brokers in a cluster are able to handle client requests simultaneously. The new HA module takes an active-passive, hot-standby approach.
+ The old cluster module takes an active-active approach, i.e. all the brokers in a cluster are able to handle client requests simultaneously. The new HA module takes an active-passive, hot-standby approach.
- In an active-passive cluster, only one broker, known as the primary, is active and serving clients at a time. The other brokers are standing by as backups. Changes on the primary are immediately replicated to all the backups so they are always up-to-date or "hot". If the primary fails, one of the backups is promoted to be the new primary. Clients fail-over to the new primary automatically. If there are multiple backups, the backups also fail-over to become backups of the new primary.
+ In an active-passive cluster only one broker, known as the primary, is active and serving clients at a time. The other brokers are standing by as backups. Changes on the primary are immediately replicated to all the backups so they are always up-to-date or "hot". If the primary fails, one of the backups is promoted to take over as the new primary. Clients fail-over to the new primary automatically. If there are multiple backups, the backups also fail-over to become backups of the new primary. Backup brokers reject connection attempts, to enforce the requirement that only the primary be active.
- The new approach depends on an external cluster resource manager to detect failure of the primary and choose the new primary. The first supported resource manager will be rgmanager, but it will be possible to add integration with other resource managers in the future. The preview version is not integrated with any resource manager, you can use the qpid-ha tool to simulate the actions of a resource manager or do your own integration.
+ This approach depends on an external cluster resource manager to detect failures and choose the primary. Rgmanager is supported initially, but others may be supported in the future.
Why the new approach?
- The new active-passive approach has several advantages compared to the existing active-active cluster module.
- It does not depend directly on openais or corosync. It does not use multicast which simplifies deployment.
- It is more portable: in environments that don't support corosync, it can be integrated with a resource manager available in that environment.
- Replication to a disaster recovery site can be handled as simply another node in the cluster, it does not require a separate replication mechanism.
- It can take advantage of features provided by the resource manager, for example virtual IP addresses.
- Improved performance and scalability due to better use of multiple CPU s
+ The new active-passive approach has several advantages compared to the existing active-active cluster module.
+ It does not depend directly on openais or corosync. It does not use multicast, which simplifies deployment.
+ It is more portable: in environments that don't support corosync, it can be integrated with a resource manager available in that environment.
+ Replication to a disaster recovery site can be handled as simply another node in the cluster; it does not require a separate replication mechanism.
+ It can take advantage of features provided by the resource manager, for example virtual IP addresses.
+ Improved performance and scalability due to better use of multiple CPUs.
- Limitations @@ -96,9 +92,9 @@ under the License. - Transactional changes to queue state are not replicated atomically. If the - primary crashes during a transaction, it is possible that the backup could - contain only part of the changes introduced by a transaction. + Transactional changes to queue state are not replicated atomically. If the primary crashes + during a transaction, it is possible that the backup could contain only part of the + changes introduced by a transaction. During a fail-over one backup is promoted to primary and any other backups switch to @@ -106,14 +102,14 @@ under the License. switched could be lost if the new primary itself fails before all the backups have switched. - - When used with a persistent store: if the entire cluster fails, there are no tools - to help identify the most recent store. - Acknowledgments are confirmed to clients before the message has been dequeued from replicas or indeed from the local store if that is asynchronous. + + When used with a persistent store: if the entire cluster fails, there are no tools to help + identify the most recent store. + A persistent broker must have its store erased before joining an existing cluster. In the production version a persistent broker will be able to load its store and @@ -149,18 +145,32 @@ under the License.
- +
+ Virtual IP Addresses + + Some resource managers (including rgmanager) support virtual IP + addresses. A virtual IP address is an IP address that can be relocated to any of + the nodes in a cluster. The resource manager associates this address with the primary node in + the cluster, and relocates it to the new primary when there is a failure. This simplifies + configuration as you can publish a single IP address rather than a list. + + + A virtual IP address can be used by clients to connect to the primary, and also by backup + brokers when they connect to the primary. The following sections will explain how to configure + virtual IP addresses for clients or brokers. + +
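To illustrate the difference this makes on the client side, here is a sketch (not part of the original text; 20.0.20.200 is a hypothetical virtual IP address, and the C++ connection API is described in the client sections below):

    // Without a virtual IP the client must list every broker to find the primary.
    qpid::messaging::Connection direct("node1,node2,node3", "{reconnect:true}");
    // With a virtual IP the resource manager keeps this one address pointing at the primary.
    qpid::messaging::Connection viaVip("20.0.20.200", "{reconnect:true}");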
Configuring the Brokers
- The broker must load the ha module, it is loaded by default when you start a broker. The following broker options are available for the HA module.
+ The broker must load the ha module; it is loaded by default. The following broker options are available for the HA module.
Options for High Availability Messaging Cluster
- --ha-cluster yes|no
+ --ha-cluster yes|no
Set to "yes" to have the broker join a cluster.
- --ha-brokers URL
+ --ha-brokers URL
URL used by brokers to connect to each other. The URL lists the addresses of
- --ha-public-brokers URL
+ --ha-public-brokers URL
URL used by clients to connect to the brokers in the same format as
- --ha-brokers above. Use this option if you want client
+ --ha-brokers above. Use this option if you want client
traffic on a different network from broker replication traffic. If this option is not set, clients will use the same URL as brokers.
- --ha-username USER
- --ha-password PASS
- --ha-mechanism MECH
+ --ha-username USER
+ --ha-password PASS
+ --ha-mechanism MECH
Brokers use USER, PASS, MECH when connecting to each other.
- To configure a cluster you must set at least ha-cluster and ha-brokers + To configure a cluster you must set at least ha-cluster and ha-brokers.
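As a concrete illustration, here is a hedged sketch of starting one member of a three-node cluster from the command line, using only the options listed in the table above (the host names and credentials are placeholders, and the broker URL is written in the comma-separated address format described for clients later in this chapter):

    qpidd --ha-cluster=yes --ha-brokers="node1,node2,node3" \
          --ha-username=qpid-ha --ha-password=secret --ha-mechanism=PLAIN

The same options can normally also be placed in the broker's configuration file instead of the command line.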
-
Creating replicated queues and exchanges
To create a replicated queue or exchange, pass the argument
- qpid.replicate when creating the queue or exchange. It should
+ qpid.replicate when creating the queue or exchange. It should
have one of the following three values:
- Bindings are automatically replicated if the queue and exchange being bound both have replication argument of all or confguration, they are not replicated otherwise.
+ Bindings are automatically replicated if the queue and exchange being bound both have a replication argument of all or configuration; they are not replicated otherwise.
+ You can create replicated queues and exchanges with the qpid-config management tool like this:
+ qpid-config add queue myqueue --replicate all
+ To create replicated queues and exchanges via the client API, add a node entry to the address like this:
+ "myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}"
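The following sketch shows that address string used from the Python client (an illustration only; it assumes the Python API shown later in this chapter and a broker reachable as node1):

    from qpid.messaging import Connection

    # Opening a sender on this address declares "myqueue" with
    # qpid.replicate=all if it does not already exist.
    connection = Connection.establish("node1", reconnect=True)
    session = connection.session()
    sender = session.sender(
        "myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}")
    connection.close()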
- You can create replicated queues and exchanges with the qpid-config - management tool like this: - - qpid-config add queue myqueue --replicate all - +
+ Client Connection and Fail-over
+ Clients can only connect to the primary broker. Backup brokers automatically reject any connection attempt by a client.
+ Clients are configured with the URL for the cluster. There are two possibilities:
+ The URL contains multiple addresses, one for each broker in the cluster.
+ The URL contains a single virtual IP address that is assigned to the primary broker by the resource manager. (Only if the resource manager supports virtual IP addresses.)
+ In the first case, clients will repeatedly re-try each address in the URL until they successfully connect to the primary. In the second case, the resource manager will assign the virtual IP address to the primary broker, so clients only need to re-try on a single address.
+ When the primary broker fails, all clients are disconnected. They go back to re-trying until they connect to the new primary. Any messages that have been sent by the client, but not yet acknowledged as delivered, are resent. Similarly, messages that have been sent by the broker, but not acknowledged, are re-queued.
+ Suppose your cluster has 3 nodes: node1, node2 and node3, all using the default AMQP port. To connect a client you need to specify the address(es) and set the reconnect property to true. Here's how to connect each type of client:
+ C++ clients
+ With the C++ client, you specify multiple cluster addresses in a single URL. The full grammar for the URL is:
+ url = ["amqp:"][ user ["/" password] "@" ] addr ("," addr)*
+ addr = tcp_addr / rdma_addr / ssl_addr / ...
+ tcp_addr = ["tcp:"] host [":" port]
+ rdma_addr = "rdma:" host [":" port]
+ ssl_addr = "ssl:" host [":" port]
+ You also need to specify the connection option reconnect to be true. For example:
+ qpid::messaging::Connection c("node1,node2,node3","{reconnect:true}");
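Putting the pieces together, here is a slightly fuller sketch (not from the original text) that sends one message to a replicated queue, relying on reconnect for fail-over; it assumes a queue named myqueue already exists, created as described earlier:

    #include <qpid/messaging/Connection.h>
    #include <qpid/messaging/Session.h>
    #include <qpid/messaging/Sender.h>
    #include <qpid/messaging/Message.h>
    #include <exception>

    using namespace qpid::messaging;

    int main() {
        // The client retries each address until it finds the primary.
        Connection connection("node1,node2,node3", "{reconnect:true}");
        try {
            connection.open();
            Session session = connection.createSession();
            Sender sender = session.createSender("myqueue");
            sender.send(Message("hello"));
            session.sync();            // wait until the broker has accepted the message
            connection.close();
        } catch (const std::exception&) {
            connection.close();
            return 1;
        }
        return 0;
    }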
+
+ Python clients
+ With the Python client, you specify reconnect=True and a list of host:port addresses as reconnect_urls when calling Connection.establish or Connection.open:
+ connection = qpid.messaging.Connection.establish("node1", reconnect=True, reconnect_urls=["node1", "node2", "node3"])
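A slightly fuller sketch of the same idea (illustrative only; it assumes the replicated queue myqueue from the earlier section):

    from qpid.messaging import Connection, Message

    connection = Connection.establish("node1", reconnect=True,
                                      reconnect_urls=["node1", "node2", "node3"])
    try:
        session = connection.session()
        sender = session.sender("myqueue")
        # If a fail-over interrupts delivery, the unacknowledged message is resent.
        sender.send(Message("hello"))
    finally:
        connection.close()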
+
+ Java JMS Clients + + In Java JMS clients, client fail-over is handled automatically if it is enabled in the + connection. You can configure a connection to use fail-over using the + failover property: + - To create replicated queues and exchangs via the client API, add a node entry to the address like this: - - "myqueue;{create:always,node:{x-declare:{arguments:{'qpid.replicate':all}}}}" - -
+ connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672'&failover='failover_exchange'
+ This property can take three values:
+ Fail-over Modes
+ failover_exchange: If the connection fails, fail over to any other broker in the cluster.
+ roundrobin: If the connection fails, fail over to one of the brokers specified in the brokerlist.
+ singlebroker: Fail-over is not supported; the connection is to a single broker only.
+ In a Connection URL, heartbeat is set using the idle_timeout property, which is an integer corresponding to the heartbeat period in seconds. For instance, the following line from a JNDI properties file sets the heartbeat timeout to 3 seconds:
+ connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://localhost:5672',idle_timeout=3
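When no virtual IP address is available, the brokerlist would normally name every broker in the cluster rather than a single host. The following line is a sketch only (host names are placeholders; it assumes the usual Qpid connection URL convention of separating brokerlist entries with semicolons):

    connectionfactory.qpidConnectionfactory = amqp://guest:guest@clientid/test?brokerlist='tcp://node1:5672;tcp://node2:5672;tcp://node3:5672'&failover='roundrobin'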
+
- Client Fail-over
+ The Cluster Resource Manager
- Clients can only connect to the single primary broker. All other brokers in the cluster are backups, and they automatically reject any attempt by a client to connect.
+ Broker fail-over is managed by a cluster resource manager. An integration with rgmanager is provided, but it is possible to integrate with other resource managers.
- Clients are configured with the addreses of all of the brokers in the cluster.
- If the resource manager supports virtual IP addresses then the clients can be configured with a single virtual IP address.
- When the client tries to connect initially, it will try all of its addresses until it successfully connects to the primary. If the primary fails, clients will try to try to re-connect to all the known brokers until they find the new primary.
+ The resource manager is responsible for starting an appropriately-configured broker on each node in the cluster. The resource manager then promotes one of the brokers to be the primary. The other brokers connect to the primary as backups, using the URL provided in the ha-brokers configuration option.
- Suppose your cluster has 3 nodes: node1, node2 and node3 all using the default AMQP port.
+ Once connected, the backup brokers synchronize their state with the primary. When a backup is synchronized, or "hot", it is ready to take over if the primary fails. Backup brokers continually receive updates from the primary in order to stay synchronized.
- With the C++ client, you specify all the cluster addresses in a single URL, for example:
- qpid::messaging::Connection c("node1:node2:node3");
+ If the primary fails, backup brokers go into fail-over mode. The resource manager must detect the failure and promote one of the backups to be the new primary. The other backups connect to the new primary and synchronize their state so they can be backups for it.
- With the python client, you specify reconnect=True and a list of host:port addresses as reconnect_urls when calling establish or open
- connection = qpid.messaging.Connection.establish("node1", reconnect=True, "reconnect_urls=["node1", "node2", "node3"])
+ The resource manager is also responsible for protecting the cluster from split-brain conditions resulting from a network partition. A network partition divides a cluster into two sub-groups which cannot see each other. Usually a quorum voting algorithm is used that disables nodes in the inquorate sub-group.
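For experimenting without a full resource manager, the qpid-ha management tool can stand in for it by hand. This is a sketch under the assumption that qpid-ha offers promote and status sub-commands with a -b option to select a broker; the preview release's exact command set may differ:

    qpid-ha promote -b node1:5672    # make the broker on node1 the active primary
    qpid-ha status -b node2:5672     # ask the broker on node2 whether it is primary or backup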
-
- Broker fail-over
+ Configuring rgmanager as resource manager
- Broker fail-over is managed by a cluster resource manager. The initial preview version of HA is not integrated with a resource manager, the production version will be integrated with rgmanager and it may be integrated with other resource managers in the future.
+ This section assumes that you are already familiar with setting up and configuring clustered services using cman and rgmanager. It will show you how to configure an active-passive, hot-standby qpidd HA cluster.
+ Here is an example cluster.conf file for a cluster of 3 nodes named mrg32, mrg34 and mrg35. We will go through the configuration step-by-step.
+ (example cluster.conf listing)
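The original listing is not reproduced here, so the following is a heavily hedged sketch of the general shape such a cluster.conf can take with cman and rgmanager. Only the node names mrg32, mrg34 and mrg35 come from the text above; the cluster name, virtual IP address, promotion script, service layout and every other value are hypothetical and would differ in the real example.

    <?xml version="1.0"?>
    <!-- A sketch only: node names are from the text, everything else is hypothetical. -->
    <cluster name="qpid-hot-standby" config_version="1">
      <clusternodes>
        <clusternode name="mrg32" nodeid="1"/>
        <clusternode name="mrg34" nodeid="2"/>
        <clusternode name="mrg35" nodeid="3"/>
      </clusternodes>
      <!-- Quorum voting: an inquorate sub-group is disabled after a network partition. -->
      <cman expected_votes="3"/>
      <rm>
        <failoverdomains>
          <failoverdomain name="qpid-domain" ordered="0" restricted="0"/>
        </failoverdomains>
        <resources>
          <!-- Hypothetical virtual IP address that always points at the primary. -->
          <ip address="20.0.20.200" monitor_link="1"/>
          <!-- Hypothetical script that promotes the local broker to primary. -->
          <script file="/etc/init.d/qpidd-primary" name="qpidd-primary"/>
        </resources>
        <!-- One relocatable "primary" service: rgmanager starts it on one node and, on
             failure, moves the virtual IP and the promotion script together to another
             node. Each node is assumed to run its own qpidd broker independently. -->
        <service name="qpidd-primary-service" domain="qpid-domain" autostart="1" recovery="relocate">
          <script ref="qpidd-primary"/>
          <ip ref="20.0.20.200"/>
        </service>
      </rm>
    </cluster>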