path: root/qpid/cpp/src/tests/ha_test.py
Commit message | Author | Age | Files | Lines
* QPID-7207: remove cpp and python subdirs from svn trunk, they have migrated to their own git repositories | Robert Gemmell | 2016-07-05 | 1 | -401/+0
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1751566 13f79535-47bb-0310-9956-ffa450edef68
* QPID-7207: Create independent cpp and python subtrees, with content from tools and extras | Justin Ross | 2016-04-21 | 1 | -7/+3
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1740289 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5855 - Simplified HA transaction logic. | Alan Conway | 2015-09-03 | 1 | -0/+2
  Removed complex and incorrect HA+TX logic, reverted to the following limitation:

  You can use transactions in an HA cluster, but there are limitations on the transactional guarantees. Transactions function normally with the *primary* broker but replication to the backups is not covered by the atomic guarantee.

  The following situations are all safe:
  - Client rolls back a transaction.
  - Client successfully commits a transaction.
  - Primary fails during a transaction *before* the client sends a commit.
  - Transaction contains only one message.

  The problem case is when all of the following occur:
  - transaction contains multiple actions (enqueues or dequeues)
  - primary fails between client sending commit and receiving commit-complete.

  In this case it is possible that only part of the transaction was replicated to the backups, so on fail-over partial transaction results may be visible.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1701109 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4710: [AMQP 1.0] Support for transactions in qpid::messaging C++ client. | Alan Conway | 2015-02-27 | 1 | -1/+1
  Implements the "transactional retire and settle immediately" option for transactions as specified in AMQP 1.0 in the qpid::messaging C++ client.

  NOTE: Transactions over AMQP 1.0 require proton 0.9 or greater. With older versions, attempting a transaction over AMQP 1.0 will raise a link-detached exception "Node not found: tx-transaction"

  1. Added descriptor list to Variant with support in Encoder and PnData.

  Required to support transactions, need to be able to create described lists. Variant changes are source and binary compatible. A Variant now has a Variant::List of descriptors which can be numeric or string. Nested descriptors are implemented by putting multiple descriptors in the list.

  Other minor changes:
  - Variant refactor: don't delete impl on every assignment.
  - Add Variant constructors that take a string encoding. (new constructors, not defaulted arguments, so the change is binary and source compatible.)
  - Growable buffer support for Encoder.
  - Printing described Variant prints descriptors in form @descriptor value

  2. Added transaction support to AMQP 1.0 client code

  Added messaging/amqp/Transaction.h,cpp: transaction logic
  - communicate with coordinator, send declare/discharge messages.
  - add tx state info to transfers and acknowledgements.
  - Sync session after discharge.
  - A transactional session automatically acks any message retrieved by fetch/get to bring them into the transaction. This is consistent with the 0-10 client.

  Minor fixes to existing client code:
  - Fix use of pn_drain API in C++ client to work with C++ and Java brokers.
  - Make amqp::Exception derive from qpid::Exception

  3. Fixes to existing broker code:
  - Incoming.cpp fix: start async completion before processing message.
  - Delay accept of discharge message till commit is complete.
  - newSession - handle failover during session creation.

  4. Added tests

  interop_tests.py: transaction tests that can run against an external broker, see comments.
  ha_tests.py: Enable transaction tests over AMQP 1.0.

  Minor test fixes:
  - brokertest.py don't set default logging if QPID_LOG env vars set.
  - brokertest.py Pass kwargs to broker() create function.
  - qpid-receive: capacity should never be larger than message count.
  - Accept user:pass as well as user/pass in Url.
  - brokertest.py: Always do a ready() check on all brokers.

  If proton < 0.9 is used, transaction tests will be skipped or will downgrade to the amqp0-10 protocol with a printed warning.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1662743 13f79535-47bb-0310-9956-ffa450edef68
* QPID-6414: Skip HA tests if qpid-ha or qpid-config tools are not available. | Alan Conway | 2015-02-25 | 1 | -7/+16
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1662275 13f79535-47bb-0310-9956-ffa450edef68
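
A minimal sketch of how a test framework might skip tests when external tools are missing, assuming Python 3 and the standard `shutil.which` lookup. The helper names `missing_tools` and `require_tools` are hypothetical and are not taken from ha_test.py.

```python
import shutil
import unittest


def missing_tools(names=("qpid-ha", "qpid-config")):
    """Return the tool names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def require_tools(names=("qpid-ha", "qpid-config")):
    """Raise unittest.SkipTest if any required tool is missing."""
    missing = missing_tools(names)
    if missing:
        raise unittest.SkipTest("missing tools: " + ", ".join(missing))
```

Calling `require_tools()` at the start of a test's `setUp` would mark the test as skipped rather than failed when the tools are absent.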
* QPID-6413: Sporadic failure of HA tests caused by maxNegotiateTimeout | Alan Conway | 2015-02-25 | 1 | -11/+11
  Increased maxNegotiateTimeout to the default (10 seconds). A smaller value speeds up detection of non-running brokers on remote hosts, but this is not necessary for the tests.

  Increased some other test timeouts and added some improved error reporting.

  The occasional long (> 1 second) connection delays are caused by Cyrus SASL authentication. Not clear why this takes so long, but that is a separate issue. Here's a client log excerpt showing the delay.

  2015-02-25 08:29:37.461299895 [Network] trace RECV [[127.0.0.1:34247-127.0.0.1:45983]]: Frame[BEbe; channel=0; {ConnectionStartBody: server-properties={qpid.federation_tag:V2:36:str16(77800bff-a176-46c1-917a-32f136dee650)}; mechanisms=str16{V2:9:str16(ANONYMOUS), V2:5:str16(PLAIN)}; locales=str16{V2:5:str16(en_US)}; }]
  2015-02-25 08:29:37.463116303 [Security] debug CyrusSasl::start(ANONYMOUS PLAIN)
  (Note delay > 1 sec here)
  2015-02-25 08:29:38.839793753 [Security] debug min_ssf: 0, max_ssf: 256
  2015-02-25 08:29:38.839851781 [Security] debug CyrusSasl::start(ANONYMOUS PLAIN): selected ANONYMOUS response: 'anonymous@wallace'
  2015-02-25 08:29:38.839963162 [Client] warning Connection [127.0.0.1:34247-127.0.0.1:45983] closed

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1662247 13f79535-47bb-0310-9956-ffa450edef68
* NO-JIRA: HA Fix ha_tests.py failures with SWIG 0.10 client. | Alan Conway | 2014-09-05 | 1 | -24/+17
  - Fix unnecessary re-sends in amqp0_10::SenderImpl::replay.
  - Throw NotFound and UnauthorizedAccess correctly from amqp0_10::SessionImpl and ConnectionImpl
  - Fix ha_test wait_address and valid_address re-using a session after it is closed by NotFound.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1622592 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5975: HA extra/missing messages when running qpid-txtest2 in a loop with failover. | Alan Conway | 2014-08-28 | 1 | -1/+3
  This is partly not-a-bug, there is a client error handling issue that has been corrected.

  qpid-txtest2 initializes a queue with messages at the start and drains the queues at the end. These operations are *not transactional*. Therefore duplicates are expected if there is a failover during initialization or draining. When duplicates were observed, there was indeed a failover at one of these times.

  Making these operations transactional is not enough to pass, now we see the test fail with "no messages to fetch". This is explained as follows:

  If there is a failover during a transaction, TransactionAborted is raised. The client assumes the transaction was rolled back and re-plays it. However, if the failover occurs at a critical point *after* the client has sent commit but *before* it has received a response, then the client *does not know* whether the transaction was committed or rolled-back on the new primary. Re-playing in this case can duplicate the transaction.

  Each transaction moves messages from one queue to another so as long as transactions are atomic the total number of messages will not change. However, if transactions are duplicated, a transactional session may try to move more messages than exist on the queue, hence "no messages to fetch". For example if thread 1 moves N messages from q1 to q2, and thread 2 tries to move N+M messages back, then thread 2 will fail.

  This problem has been corrected as follows: C++ and python clients now raise the following exceptions:
  - TransactionAborted: The transaction has definitely been rolled back due to a connection failure before commit or a broker error (e.g. a store error) during commit. It can safely be replayed.
  - TransactionUnknown: The transaction outcome is unknown because the connection failed at the critical time. There's no simple automatic way to know what happened without examining the state of the broker queues.

  Unfortunately with this fix qpid-txtest2 is no longer a useful test for TX failover because it regularly raises TransactionUnknown and there's not much we can do with that.

  A better test of TX atomicity with failover is to run a pair of qpid-send/qpid-receive with fail-over and verify that the number of enqueues/dequeues and message depth are a multiple of the transaction size. See the JIRA for such a test. (Note these tests also sometimes raise TransactionUnknown but it doesn't matter since all we are checking is that messages go on and off the queues in multiples of the TX size.)

  Note: the original bug also reported seeing missing messages from qpid-txtest2. I don't have a good explanation for that but since the qpid-send/receive test shows that transactions are atomic I am going to let that go for now.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1621211 13f79535-47bb-0310-9956-ffa450edef68
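
The distinction between the two exceptions can be illustrated with a short sketch. This is only an illustration in Python 3 under stated assumptions: the two exception classes below are local stand-ins with the same names as the client exceptions described above, and `run_transaction` and `is_tx_aligned` are hypothetical helpers, not part of any Qpid API.

```python
class TransactionAborted(Exception):
    """Stand-in: the transaction was definitely rolled back."""


class TransactionUnknown(Exception):
    """Stand-in: the outcome of the commit is not known."""


def run_transaction(work, max_attempts=3):
    """Call work() until it commits, replaying only on TransactionAborted.

    TransactionUnknown is deliberately not caught: replaying in that case
    could apply the transaction twice, so the caller has to decide.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return work()
        except TransactionAborted:
            if attempt == max_attempts:
                raise


def is_tx_aligned(count, tx_size):
    """True if a message count is a whole multiple of the transaction size."""
    return count % tx_size == 0
```

`is_tx_aligned` expresses the atomicity check suggested for the qpid-send/qpid-receive test: enqueue counts, dequeue counts and queue depth should each be a multiple of the transaction size.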
* QPID-6035: HA clearly distinguish qpid-ha commands intended for cluster manager. | Alan Conway | 2014-08-22 | 1 | -1/+1
  This commit adds a --cluster-manager flag to qpid-ha tool.

  Without this flag:
  - the 'promote' command is not listed in the tool help.
  - using the promote command raises an error saying that it is only for cluster manager use and mentioning the --cluster-manager flag.

  With the flag: promote functions as before.

  The qpid-ha help text for promote is also more clear now that it is for cluster manager only.

  Originally the idea was to split qpid-ha into two tools but I have kept one tool with the flag and warning messages because it:
  - avoids packaging changes that might trip things up.
  - helps people who are already using qpid-ha promote: their scripts will break but the error message explains how to fix it.

  I think the special role of promote is sufficiently clear now even if it is part of the same tool.

  This commit also updates the following to take account of the new flag:
  - rgmanager qpidd-primary script.
  - qpidd tests.
  - qpid book HA chapter.

  NOTE: THIS WILL BREAK TEST HARNESSES that do promotion outside of rgmanager. You'll need to add the --cluster-manager flag in the relevant places.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1619877 13f79535-47bb-0310-9956-ffa450edef68
* NO-JIRA: HA fix hanging ha_tests.test_failover_send_receive on RHEL5 | Alan Conway | 2014-05-01 | 1 | -6/+6
  The test was hanging because of a python construct not available in 2.4. It was causing an exception in a strange place because this bit of code was imported at runtime, and that was hanging the test.

  Fixed and did some cleanup to avoid such mysterious hangs in future:
  - Fixed qpidtoollibs/config.py to work with python 2.4.
  - Import qpid-ha script at import time rather than runtime.
  - Fix Popen.teardown logic to avoid hanging if a process can't be killed.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1591794 13f79535-47bb-0310-9956-ffa450edef68
* NO-JIRA: HA minor cleanup of qpid-ha tool | Alan Conway | 2014-04-24 | 1 | -8/+0
  - Remove some dead code.
  - Removed "set" command - not ready for production. All settings in qpidd.conf.
  - Removed related tests in ha_tests
  - Improved help on promote command.
  - Made option group for common broker connection options.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589834 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5719: HA becomes unresponsive once any of the brokers are SIGSTOPed | Alan Conway | 2014-04-24 | 1 | -1/+3
  - Added timeout to qpid-ha.
  - qpidd init script pings broker to verify it is not hung.
  - updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

  The new results for the cases mentioned in the bug:
  a] stopped ALL brokers: rgmanager restarts the entire cluster but data is lost. Equivalent to killing all the brokers at once. This does not affect quorum because only qpidd services are affected, not other services managed by cman.
  b] stopped the primary: rgmanager restarts the primary after a timeout and promotes one of the backups.
  c] stopped a backup: rgmanager restarts the backups after a timeout. Clients that are actively sending messages may see a delay while backup is restarted.

  Note you need to set link-heartbeat-interval in qpidd.conf. The default is very high (120 seconds), it should be set lower to see recovery from sigstop in a reasonable time. See the updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589807 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5560: HA tests do not use AMQP 1.0 | Alan Conway | 2014-04-07 | 1 | -21/+20
  The HA tests were using only AMQP 0-10. Modified the tests to use AMQP 1.0 if available (still use 0-10 if 1.0 is not available). Fixed bugs uncovered both in the tests and in the AMQP 1.0 implementation.

  Summary of changes:
  - brokertest.py: configurable support for swig vs. native and amqp0-10 vs. 1.0
    - default to swig+amqp1.0 if swig is available, native+amqp0-10 otherwise
  - qpidtoollibs/broker.py: enable use of swig client with BrokerAgent
  - Swig python client:
    - support for passing client_properties/properties.
    - expose AddressHelper pn_data read/write as PnData helper class
    - set sender/receiver capacity on creation
    - limited disposition support - rejected messages.
    - support for additional timeout parameters
    - expose messaging::Logger, allow log configuration to be set from python.
  - ha_tests.py:
    - bind, delete policies not supported by AMQP 1.0, switched to using BrokerAgent QMF.
    - pass protocol:amqp1.0 connection-option to c++ test clients (qpid-send, qpid-receive)
    - TX tests force use of 0-10 protocol (but still with Swig client if enabled.)
  - Broker fixes:
    - Queue::Settings::isTemporary was set in the 0-10 SessionAdapter, moved to Broker::createQueue.
    - broker::amqp::Session was always setting an exclusive owner in createQueue

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1585588 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5666: HA fails with resource-limit-exceeded: Exceeded replicated queue limit | Alan Conway | 2014-04-07 | 1 | -2/+2
  This is a regression introduced in r1561206:
  CommitDate: Fri Jan 24 21:54:59 2014 +0000
  QPID-5513: HA backup fails if number of replicated queues exceeds number of channels.

  Fixed by the current commit. PrimaryQueueLimits was not taking account of queues already on the broker prior to promotion.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1585507 13f79535-47bb-0310-9956-ffa450edef68
* Author: Alan Conway <aconway@redhat.com> | Alan Conway | 2014-02-06 | 1 | -1/+1
  --- log message follows this
  NO-JIRA: Remove use of python built-in 'next', not available before python 2.6.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1565382 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5513: HA backup fails if number of replicated queues exceeds number of channels. | Alan Conway | 2014-01-24 | 1 | -3/+4
  The problem:
  - create cluster of 2 brokers.
  - create more than 32768 queues (exceeds number of channels on a connection)
  - backup exits with critical error but
  - client creating queues receives no error, primary continues with unreplicated queue.

  The solution: Primary raises an error to the client if it attempts to create queues in excess of the channel limit. The queue is not created on primary or backup, primary and backup continue as normal.

  In addition: raised the channel limit from 32k to 64k. There was no reason for the smaller limit. See discussion: http://qpid.2158936.n2.nabble.com/CHANNEL-MAX-and-CHANNEL-HIGH-BIT-question-tp7603121p7603138.html

  New unit test to reproduce the issue, must create > 64k queues.

  Other minor improvements:
  - brokertest framework doesn't override --log options in the arguments.
  - increased default heartbeat in test framework for tests that have busy brokers.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1561206 13f79535-47bb-0310-9956-ffa450edef68
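
The solution described in this entry, refusing queue creation once a limit is reached so that the client sees an error, can be sketched as follows. This is a hedged Python 3 illustration, not the broker's C++ code: `QueueLimit` and `ResourceLimitExceeded` are hypothetical names. The `existing` argument reflects the later QPID-5666 entry, which notes that queues already on the broker before promotion must be counted.

```python
class ResourceLimitExceeded(Exception):
    """Stand-in for the error the primary reports to the client."""


class QueueLimit:
    """Refuse to create replicated queues beyond a fixed maximum."""

    def __init__(self, max_queues, existing=0):
        self.max_queues = max_queues
        self.count = existing  # queues already present before promotion

    def add_queue(self, name):
        """Count a new queue, or raise if the limit is already reached."""
        if self.count >= self.max_queues:
            raise ResourceLimitExceeded(
                "cannot create %s: replicated queue limit %d reached"
                % (name, self.max_queues))
        self.count += 1

    def remove_queue(self):
        """Release one slot when a queue is deleted."""
        self.count = max(0, self.count - 1)
```

Because the check happens before the counter is incremented, a rejected request leaves the state unchanged, which matches the description that primary and backup continue as normal.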
* QPID-5318: HA tests fail sporadically with "AttributeError: 'NoneType' object has no attribute 'name'" | Alan Conway | 2013-11-08 | 1 | -2/+3
  This was due to a race condition where a session was deleted while the QmfAgent was looking it up.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1540171 13f79535-47bb-0310-9956-ffa450edef68
* QPID-5139: Add unit test for deadlock caused by blocking HA commit. | Alan Conway | 2013-10-29 | 1 | -0/+1
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1536751 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4944: HA re-enable test_failover_send_receive | Alan Conway | 2013-09-12 | 1 | -8/+14
  Appears to have been fixed at this point on trunk, not clear which checkins are responsible. Test ran for 48 hours with no failures.

  Other minor changes:
  - Enable test_failover_send_receive
  - Increase heartbeat interval.
  - Reduce capacity of senders in failover test to be more aggressive.
  - Use HaBrokerTest as test base

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1522711 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA support for TX transactions - fix TX error messages. | Alan Conway | 2013-09-09 | 1 | -2/+2
  - Ignore un-replicated queues when replicating transactions.
  - Clean up cancel logic in QueueReplicator, causing "no such subscription" errors.
  - Remove unnecessary exchange delete warnings
  - ha_test.py: Shorter timeout for starting cluster brokers.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1521192 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA support for TX transactions - fix auth bugs. | Alan Conway | 2013-09-09 | 1 | -1/+1
  - Set auth info on status check connections
  - Clean up status check logging
  - Use realm@username for authentication name (was using just username)

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1521190 13f79535-47bb-0310-9956-ffa450edef68
* NO-JIRA: HA minor fixes to test framework & comments. | Alan Conway | 2013-09-04 | 1 | -2/+4
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1520108 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA clean up transaction artifacts at end of TX. | Alan Conway | 2013-08-30 | 1 | -1/+16
  - Backups delete transactions on failover.
  - TxReplicator cancel subscriptions when transaction is finished.
  - TxReplicator rollback if destroyed prematurely.
  - Handle special case of no backups for a tx.
  - ha_tests.py: new and modified tests to cover the new functionality.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1518982 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA Handle brokers joining and leaving during a transaction. | Alan Conway | 2013-08-05 | 1 | -3/+6
  During a transaction:
  - A broker leaving aborts the transaction.
  - A broker joining does not participate in the transaction
  - but does receive the results of the TX via normal replication.

  Clean up tx-queues when the transaction completes.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1510678 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA TX transactions, blocking wait for prepare | Alan Conway | 2013-08-01 | 1 | -5/+11
  Backups send prepare messages to primary, primary delays completion of prepare till all are prepared (or there is a failure).

  This is NOT the production solution - blocking could cause a deadlock. We need to introduce asynchronous completion of prepare without blocking. This interim solution allows testing on other aspects of TX support.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1509424 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4327: HA TX transactions: basic replication. | Alan Conway | 2013-08-01 | 1 | -1/+7
  On primary a PrimaryTxObserver observes a transaction's TxBuffer and generates transaction events on a tx-replication-queue. On the backup a TxReplicator receives the events and constructs a TxBuffer equivalent to the one in the primary.

  Unfinished:
  - Primary does not wait for backups to prepare() before committing.
  - All connected backups are assumed to be in the transaction, there are race conditions around brokers joining/leaving where this assumption is invalid.
  - Need more tests.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1509423 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4944: HA Sporadic failure in ha_tests: test_failover_send_receive and test_expected_backup_timeout | Alan Conway | 2013-06-21 | 1 | -1/+1
  Very sporadic failures so difficult to verify the fix.
  - Simplified Membership, centralized status change, make it atomic.
  - Fix test bug in test_expected_backup_timeout: not waiting on final status check, race.
  - Remove out-of-date status info from log prefixes: Guard, ReplicatingSubscription

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1495466 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4938: No longer build acl or ssl support as plugins | Andrew Stitcher | 2013-06-19 | 1 | -1/+1
  (also remove final references to dead watchdog plugin)

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1494697 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4348: HA Use independent sequence numbers for identifying messages | Alan Conway | 2013-06-17 | 1 | -2/+2
  Previously HA code used queue sequence numbers to identify messages. This assumes that message sequence is identical on primary and backup. Implementing new features (for example transactions) requires that we tolerate ordering differences between primary and backups.

  This patch introduces a new, queue-scoped HA sequence number managed by the HA plugin. The HA ID is set *before* the message is enqueued and assigned a queue sequence number. This means it is possible to identify messages before they are enqueued, e.g. messages in an open transaction.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1493771 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4866: HA support for failover exchange | Alan Conway | 2013-05-22 | 1 | -8/+6
  Add support for the "amq.failover" exchange with new HA, to support migration of clients that used this facility with the old cluster.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1485511 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4745: HA safe port allocation for brokers in HA tests. | Alan Conway | 2013-05-15 | 1 | -25/+79
  Many HA tests use --port=0 to start a broker on an available port, but then need to shutdown and restart the broker on the same port. This is not safe, on a busy system it is possible for another process to take the port between the time the broker is shut down and the time it is restarted.

  The solution is to do bind(0) and listen in the python test framework (class HaPort) and let the broker use the socket using qpidd --socket-fd. When the broker is shut down the port remains bound by the python process. When the broker is re-started it again is given access to the socket via --socket-fd.

  Other changes
  - move ha_store_tests into ha_tests.
  - add heartbeats to avoid stalling.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1482881 13f79535-47bb-0310-9956-ffa450edef68
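
The port-holding idea can be sketched in a few lines of Python 3. This is not the actual HaPort class: `HeldPort` and its methods are hypothetical, and the exact form in which the descriptor is passed after `--socket-fd` is an assumption based on the option name given in the commit message.

```python
import socket


class HeldPort:
    """Bind and listen on a free port and keep it reserved in this process."""

    def __init__(self, host="127.0.0.1"):
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.bind((host, 0))  # port 0: let the OS choose a free port
        self.socket.listen(5)
        self.port = self.socket.getsockname()[1]
        self.fileno = self.socket.fileno()
        # Allow a child process to inherit the descriptor (Python 3.4+).
        self.socket.set_inheritable(True)

    def broker_args(self):
        """Command line for a broker that should accept on the held socket.

        The argument layout is an assumption, not verified against qpidd.
        """
        return ["qpidd", "--socket-fd", str(self.fileno)]

    def close(self):
        """Release the port."""
        self.socket.close()
```

Since the Python process keeps the listening socket open, the port stays reserved across broker restarts. When launching the broker with `subprocess.Popen` on POSIX, the descriptor would also need to be listed in `pass_fds` so that it survives into the child.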
* QPID-4631: C++ Broker federated links are protected by ACL policy. | Charles E. Rolke | 2013-04-29 | 1 | -0/+10
  This issue evolved a bit between the original discussion and the final commit. See https://reviews.apache.org/r/10658/ for the details.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1477112 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4649: use qpid uuid library instead of python | Kenneth Anthony Giusti | 2013-03-15 | 1 | -2/+1
  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1456950 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4555: HA Test bugs causing sporadic failures in ha_tests.ReplicationTests.test_auto_delete_timeout | Alan Conway | 2013-01-31 | 1 | -4/+16
  The tests were not waiting for the cluster to be ready before starting. Updated HaCluster to wait by default before returning.

  Increase timeout in calls to wait_no_queue, the default timeout of 1 sec was the same as the auto-delete timeout.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1441157 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4506: Qpid HA's '--ha-public-url' option duplicates the '--known-hosts-url' option but cannot be disabled | Alan Conway | 2012-12-14 | 1 | -4/+8
  Reverts the previous commit and simplifies the semantics of setting --ha-public-url and --ha-brokers-url. There is no longer any over-riding or implicit updating of values. That means you must set --ha-public-url as well as --ha-brokers-url, it will not be defaulted. Likewise if you *don't* set ha-public-url, it will remain empty, which is the use case in this bug.

  The defaulting was adding complexity without adding much value.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1421934 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4428: HA add UUID tag to avoid using an out of date queue/exchange. | Alan Conway | 2012-11-14 | 1 | -3/+7
  Imagine a cluster with primary A and backups B and C. A queue Q is created on A and replicated to B, C. Now A dies and B takes over as primary. Before C can connect to B, a client destroys Q and creates a new queue with the same name. When C connects it sees Q and incorrectly assumes it is the same Q that it has already replicated. Now C has an inconsistent replica of Q.

  The fix is to tag queues/exchanges with a UUID so a backup can tell if a queue is not the same as the one it has already replicated, even if the names are the same. This all also applies to exchanges.

  - Minor improvements to printing UUIDs in a FieldTable.
  - Fix comparison of void Variants, added operator !=

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1409241 13f79535-47bb-0310-9956-ffa450edef68
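
The UUID tagging idea can be shown with a small Python 3 sketch. `Queue` and `is_same_queue` are hypothetical names used only for illustration. The real broker code is C++ and is not reproduced here.

```python
import uuid


class Queue:
    """A named queue tagged with a fresh UUID at creation time."""

    def __init__(self, name):
        self.name = name
        self.tag = uuid.uuid4()


def is_same_queue(replica, primary_queue):
    """A replica is current only if both the name and the UUID tag match."""
    return (replica.name == primary_queue.name
            and replica.tag == primary_queue.tag)
```

With this check a backup holding a replica of the old Q would see that the re-created Q on the new primary has a different tag, and could discard its stale replica instead of treating the two as the same queue.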
* NO-JIRA: Fix spelling of --link-maintenace-interval option and add descriptive text. | Alan Conway | 2012-10-11 | 1 | -1/+1
  Also added description for --link-heartbeat-interval

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1397253 13f79535-47bb-0310-9956-ffa450edef68
* Bug 860701 - QPID-4350: HA handle auto-delete queues | Alan Conway | 2012-10-11 | 1 | -4/+17
  Subscribed auto-delete queues are deleted by the backup. Timed auto-delete queues are deleted after the timeout.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1397243 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4360: Fix test bug: Non-ready HA broker can be incorrectly promoted to primary. | Alan Conway | 2012-10-09 | 1 | -1/+1
  Test test_delete_missing_response was failing with "cluster active, cannot promote".
  - Fixed test bug: "fake" primary triggered "cannot promote".
  - Backup: always create QueueReplicator if not already existing.
  - Terminology change: "initial" queues -> "catch-up" queues.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1396244 13f79535-47bb-0310-9956-ffa450edef68
* QPID-4285: HA backups continuously disconnect / re-sync after attempting to replicate a deleted queue. | Alan Conway | 2012-10-02 | 1 | -0/+2
  (Based on patch by Jason Dillama)

  This does not directly tackle the origin of the problem but extends Jason's patch since it addresses something we had to fix anyway: "leaking" queues and exchanges. It does 2 things.

  1. Enabled hideDeletedError on all subscription objects used by HA. This suppresses the troublesome exception with a harmless no-op.

  2. Delete queues/exchanges missing from responses (based on Jason's patch). Fixes the "leak" of queues and exchanges possible when an object replicated to a backup is deleted from the new primary before the backup connects.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1393089 13f79535-47bb-0310-9956-ffa450edef68
* NO-JIRA: Fix logging in ha_tests.py | Alan Conway | 2012-09-27 | 1 | -1/+12
  In order to suppress unwanted warnings from certain tests, the ha_test framework was actually turning off all python logging. This patch selectively turns off warnings in specific code regions and then restores the configured logging level.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1391232 13f79535-47bb-0310-9956-ffa450edef68
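
One common way to silence warnings for a specific code region and then restore the configured level is a context manager around the standard `logging` module. The sketch below assumes Python 3. The name `log_level` is hypothetical and this is not the code from ha_test.py.

```python
import logging
from contextlib import contextmanager


@contextmanager
def log_level(level, logger=None):
    """Temporarily set a logger's level, restoring the old one on exit."""
    if logger is None:
        logger = logging.getLogger()  # root logger by default
    saved = logger.level
    logger.setLevel(level)
    try:
        yield logger
    finally:
        # Restore even if the enclosed code raises.
        logger.setLevel(saved)
```

Wrapping only the noisy region, for example `with log_level(logging.ERROR): ...`, leaves logging for the rest of the test run at its configured level.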
* QPID-4325: HA Starting from persistent store | Alan Conway | 2012-09-25 | 1 | -0/+258
  When re-starting a persistent HA cluster, the broker that becomes primary should keep its store data while all the backup brokers should discard their store data and catch up from the primary. Backups cannot simply use their own stores because sequence numbers of stored messages will not match on all brokers. The backup erases individual queues and exchanges as the catch-up process gets to them.

  git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1390123 13f79535-47bb-0310-9956-ffa450edef68