<feed xmlns='http://www.w3.org/2005/Atom'>
<title>delta/python-packages/qpid-python.git/qpid/tools/src/py/qpid-ha, branch trunk</title>
<subtitle>git.apache.org: qpid.git
</subtitle>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/'/>
<entry>
<title>QPID-7207: Create independent cpp and python subtrees, with content from tools and extras</title>
<updated>2016-04-21T12:31:34+00:00</updated>
<author>
<name>Justin Ross</name>
<email>jross@apache.org</email>
</author>
<published>2016-04-21T12:31:34+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=71149592670f7592886751a9a866459bef0f12cc'/>
<id>71149592670f7592886751a9a866459bef0f12cc</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1740289 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1740289 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>QPID-6767: add --sasl-service-name option to all QPID tools.</title>
<updated>2015-10-02T19:47:17+00:00</updated>
<author>
<name>Ken Giusti</name>
<email>kgiusti@apache.org</email>
</author>
<published>2015-10-02T19:47:17+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=2d9f712ce551026f0d5f302aaa0e0b85aea8747c'/>
<id>2d9f712ce551026f0d5f302aaa0e0b85aea8747c</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1706480 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1706480 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>NO-JIRA: qpid-ha don't return error status if called with -h or --help flag.</title>
<updated>2014-12-01T21:10:51+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-12-01T21:10:51+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=7b330757e5bcff063ebf9db23b9206a84bf78c95'/>
<id>7b330757e5bcff063ebf9db23b9206a84bf78c95</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1642758 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1642758 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>QPID-6035: HA fix bug in qpid-ha, --help does not show help.</title>
<updated>2014-09-02T16:22:27+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-09-02T16:22:27+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=4e794e4b1157995e544b9a501b39a13d002e0b10'/>
<id>4e794e4b1157995e544b9a501b39a13d002e0b10</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1622057 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1622057 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>QPID-6035: HA clearly distinguish qpid-ha commands intended for cluster manager.</title>
<updated>2014-08-22T18:06:20+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-08-22T18:06:20+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=d42d6b5305b6617bd5bdc7417500e115e2346a88'/>
<id>d42d6b5305b6617bd5bdc7417500e115e2346a88</id>
<content type='text'>
This commit adds a --cluster-manager flag to qpid-ha tool.

Without this flag
- the 'promote' command is not listed in the tool help.
- using the promote command raises an error saying that it is only for cluster manager use
  and mentioning the --cluster-manager flag.

With the flag: promote functions as before.

The qpid-ha help text for promote now also states more clearly that it is for
cluster manager use only.

Originally the idea was to split qpid-ha into two tools but I have kept one tool
with the flag and warning messages because it:

- avoids packaging changes that might trip things up.

- helps people who are already using qpid-ha promote: their scripts will
  break but the error message explains how to fix it.

I think the special role of promote is sufficiently clear now even if it is
part of the same tool.

This commit also updates the following to take account of the new flag:
- rgmanager qpidd-primary script.
- qpidd tests.
- qpid book HA chapter.

NOTE: THIS WILL BREAK TEST HARNESSES that do promotion outside of rgmanager.
You'll need to add the --cluster-manager flag in the relevant places.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1619877 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
This commit adds a --cluster-manager flag to qpid-ha tool.

Without this flag
- the 'promote' command is not listed in the tool help.
- using the promote command raises an error saying that it is only for cluster manager use
  and mentioning the --cluster-manager flag.

With the flag: promote functions as before.

The qpid-ha help text for promote now also states more clearly that it is for
cluster manager use only.

Originally the idea was to split qpid-ha into two tools but I have kept one tool
with the flag and warning messages because it:

- avoids packaging changes that might trip things up.

- helps people who are already using qpid-ha promote: their scripts will
  break but the error message explains how to fix it.

I think the special role of promote is sufficiently clear now even if it is
part of the same tool.

This commit also updates the following to take account of the new flag:
- rgmanager qpidd-primary script.
- qpidd tests.
- qpid book HA chapter.

NOTE: THIS WILL BREAK TEST HARNESSES that do promotion outside of rgmanager.
You'll need to add the --cluster-manager flag in the relevant places.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1619877 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>QPID-5942: qpid HA cluster may end-up in joining state after HA primary is killed</title>
<updated>2014-07-31T13:55:11+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-07-31T13:55:11+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=c9276b03da088b3f4d3f4b527f2e02703e2729eb'/>
<id>c9276b03da088b3f4d3f4b527f2e02703e2729eb</id>
<content type='text'>
There are two issues here, both related to the fact that rgmanager sees qpidd
and qpidd-primary as two separate services.

1. The service start/stop scripts can be called concurrently. This can lead to
   running a qpidd process whose pid is not in the pidfile. rgmanager cannot
   detect or kill this qpidd and cannot start another qpidd because of the lock
   on the qpidd data directory.

2. rgmanager sees a primary failure as two failures: qpidd and qpidd-primary,
   and will then try to stop and start both services. The order of these actions
   is not defined and can lead to rgmanager killing a service it has just
   started.

This patch makes two major changes to the init scripts:

1. Uses flock to lock the sensitive stop/start part of the scripts to ensure
   they are not executed concurrently.

2. On "stop" the scripts check if a running qpidd is primary or not. "qpidd stop"
   is a no-op if the running broker is primary; "qpidd-primary stop" is a no-op
   if it is not. This ensures that a broker will be stopped by the same stream
   of service actions that started it.

Minor changes in this patch:
- better logging of broker start-up and shut-down sequence.
- qpid-ha heartbeat uses half of the timeout option.
- add missing timeouts in qpid-ha.


Notes:

This changes the behavior of 'clusvcadm -d &lt;qpidd-service&gt;' on the primary node.
Previously this would have stopped the qpidd service on that node, killed the
qpidd process and relocated the primary service. Now this will stop the qpidd
service (as far as rgmanager is concerned) but will not kill qpidd or relocate
the primary service. When the primary is relocated the qpidd service will not be
able to re-start on that node until it is re-enabled with 'clusvcadm -e'.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1614895 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are two issues here, both related to the fact that rgmanager sees qpidd
and qpidd-primary as two separate services.

1. The service start/stop scripts can be called concurrently. This can lead to
   running a qpidd process whose pid is not in the pidfile. rgmanager cannot
   detect or kill this qpidd and cannot start another qpidd because of the lock
   on the qpidd data directory.

2. rgmanager sees a primary failure as two failures: qpidd and qpidd-primary,
   and will then try to stop and start both services. The order of these actions
   is not defined and can lead to rgmanager killing a service it has just
   started.

This patch makes two major changes to the init scripts:

1. Uses flock to lock the sensitive stop/start part of the scripts to ensure
   they are not executed concurrently.

2. On "stop" the scripts check if a running qpidd is primary or not. "qpidd stop"
   is a no-op if the running broker is primary; "qpidd-primary stop" is a no-op
   if it is not. This ensures that a broker will be stopped by the same stream
   of service actions that started it.

Minor changes in this patch:
- better logging of broker start-up and shut-down sequence.
- qpid-ha heartbeat uses half of the timeout option.
- add missing timeouts in qpid-ha.


Notes:

This changes the behavior of 'clusvcadm -d &lt;qpidd-service&gt;' on the primary node.
Previously this would have stopped the qpidd service on that node, killed the
qpidd process and relocated the primary service. Now this will stop the qpidd
service (as far as rgmanager is concerned) but will not kill qpidd or relocate
the primary service. When the primary is relocated the qpidd service will not be
able to re-start on that node until it is re-enabled with 'clusvcadm -e'.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1614895 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>NO-JIRA: Added qpid-ha query --all flag.</title>
<updated>2014-07-18T18:17:17+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-07-18T18:17:17+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=0e259111b19f2972933b9fb80070b1c4872c450e'/>
<id>0e259111b19f2972933b9fb80070b1c4872c450e</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1611747 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1611747 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>NO-JIRA: HA qpid-ha usability: automatically use qpidd.conf if no --broker option.</title>
<updated>2014-04-25T19:38:41+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-04-25T19:38:41+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=32dae54e851da9b5e5608ce1b7d46ac2b7ad2d96'/>
<id>32dae54e851da9b5e5608ce1b7d46ac2b7ad2d96</id>
<content type='text'>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1590118 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1590118 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>NO-JIRA: HA minor cleanup of qpid-ha tool</title>
<updated>2014-04-24T18:59:03+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-04-24T18:59:03+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=e8f4b182c6a6b4df51a4853d21319e624b1203b7'/>
<id>e8f4b182c6a6b4df51a4853d21319e624b1203b7</id>
<content type='text'>
- Remove some dead code.
- Removed "set" command - not ready for production. All settings in qpidd.conf.
  - Removed related tests in ha_tests
- Improved help on promote command.
- Made option group for common broker connection options.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589834 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Remove some dead code.
- Removed "set" command - not ready for production. All settings in qpidd.conf.
  - Removed related tests in ha_tests
- Improved help on promote command.
- Made option group for common broker connection options.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589834 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
<entry>
<title>QPID-5719: HA becomes unresponsive once any of the brokers are SIGSTOPed</title>
<updated>2014-04-24T17:54:05+00:00</updated>
<author>
<name>Alan Conway</name>
<email>aconway@apache.org</email>
</author>
<published>2014-04-24T17:54:05+00:00</published>
<link rel='alternate' type='text/html' href='http://trove.baserock.org/cgit/delta/python-packages/qpid-python.git/commit/?id=1d3b4560f8a7f212976b536376a976b3b41f489b'/>
<id>1d3b4560f8a7f212976b536376a976b3b41f489b</id>
<content type='text'>
- Added timeout to qpid-ha.
- qpidd init script pings broker to verify it is not hung.
- updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

The new results for the cases mentioned in the bug:

a] stopped ALL brokers: rgmanager restarts the entire cluster but data is lost.
   Equivalent to killing all the brokers at once. This does not affect quorum because
   only qpidd services are affected, not other services managed by cman.

b] stopped the primary: rgmanager restarts the primary after a timeout and promotes one of the backups.

c] stopped a backup: rgmanager restarts the backups after a timeout.
   Clients that are actively sending messages may see a delay while backup is restarted.

Note that you need to set link-heartbeat-interval in qpidd.conf. The default is very
high (120 seconds); it should be set lower to see recovery from SIGSTOP in a
reasonable time.
See the updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589807 13f79535-47bb-0310-9956-ffa450edef68
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
- Added timeout to qpid-ha.
- qpidd init script pings broker to verify it is not hung.
- updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

The new results for the cases mentioned in the bug:

a] stopped ALL brokers: rgmanager restarts the entire cluster but data is lost.
   Equivalent to killing all the brokers at once. This does not affect quorum because
   only qpidd services are affected, not other services managed by cman.

b] stopped the primary: rgmanager restarts the primary after a timeout and promotes one of the backups.

c] stopped a backup: rgmanager restarts the backups after a timeout.
   Clients that are actively sending messages may see a delay while backup is restarted.

Note that you need to set link-heartbeat-interval in qpidd.conf. The default is very
high (120 seconds); it should be set lower to see recovery from SIGSTOP in a
reasonable time.
See the updated documentation in qpid/doc/book/src/cpp-broker/Active-Passive-Cluster.xml.

git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1589807 13f79535-47bb-0310-9956-ffa450edef68
</pre>
</div>
</content>
</entry>
</feed>
