author Alan Conway <aconway@apache.org> 2011-10-19 20:36:43 +0000
committer Alan Conway <aconway@apache.org> 2011-10-19 20:36:43 +0000
commit 39823dac45f3a79d19d37204dd122921b5339673
tree 56288a974a021b1885820e80ec3d4701d131426e
parent 1b7f61bbf0f1ab5dad2bc380b3890911a1e43f52
QPID-2920: Updates to new-cluster-plan.
Filled out plan to-do list. More implementation detail.

git-svn-id: https://svn.apache.org/repos/asf/qpid/branches/qpid-2920-active@1186466 13f79535-47bb-0310-9956-ffa450edef68
-rw-r--r-- qpid/cpp/design_docs/new-cluster-design.txt | 15
-rw-r--r-- qpid/cpp/design_docs/new-cluster-plan.txt | 20
2 files changed, 24 insertions, 11 deletions
diff --git a/qpid/cpp/design_docs/new-cluster-design.txt b/qpid/cpp/design_docs/new-cluster-design.txt
index aaf8cd6488..936530a39a 100644
--- a/qpid/cpp/design_docs/new-cluster-design.txt
+++ b/qpid/cpp/design_docs/new-cluster-design.txt
@@ -296,23 +296,20 @@ flow control is sufficient for qpid.
** Live upgrades
Live upgrades refers to the ability to upgrade a cluster while it is
-running, with no downtime. Brokers are shut down and re-start brokers
-in the cluster is shut down, and then re-started with a new version of
-the broker code.
+running, with no downtime. Each broker in the cluster is shut down,
+and then re-started with a new version of the broker code.
To achieve this
- Cluster protocol XML file has a new element <version number=N> attached
to each method. This is the version at which the method was added.
-- New versions can add methods, existing methods cannot be changed.
+- New versions can only add methods; existing methods cannot be changed.
- The cluster handshake for new members includes the protocol version
at each member.
- The cluster's version is the lowest version among its members.
-- A newer broker can join and older cluster but not vice versa.
-- A newer broker in an old cluster must speak the old protocol for the
- benefit of older brokers. It can also use newer controls which
- will be ignored by old broekrs.
+- A newer broker can join an older cluster. When it does, it must restrict
+ itself to speaking the older protocol version.
- When the cluster version increases (because the lowest version member has left)
- the remaining members may switch to using only the new version.
+ the remaining members may move up to the new version.
* Design debates
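
The version rules in the revised "Live upgrades" text can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names `Version`, `clusterVersion` and `canSend` are hypothetical and do not come from the qpid source. It shows the cluster version as the minimum over member versions, and a method being usable only once the cluster version has reached the version at which the method was added.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration; not taken from the qpid source.
typedef std::uint32_t Version;

// The cluster's version is the lowest version among its members.
inline Version clusterVersion(const std::vector<Version>& memberVersions) {
    assert(!memberVersions.empty());  // A cluster has at least one member.
    return *std::min_element(memberVersions.begin(), memberVersions.end());
}

// A method tagged <version number=N> may be sent only when the cluster
// version has reached N, so a newer broker in an older cluster is
// restricted to the older protocol.
inline bool canSend(Version methodAddedIn, Version cluster) {
    return methodAddedIn <= cluster;
}
```

With members at versions 2, 3 and 3, the cluster version is 2 and a method added in version 3 may not be sent. Once the version-2 member leaves, the cluster version rises to 3 and the same method becomes usable.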
diff --git a/qpid/cpp/design_docs/new-cluster-plan.txt b/qpid/cpp/design_docs/new-cluster-plan.txt
index b22f7a1f31..626e443be7 100644
--- a/qpid/cpp/design_docs/new-cluster-plan.txt
+++ b/qpid/cpp/design_docs/new-cluster-plan.txt
@@ -90,7 +90,9 @@ Independent message IDs that can be generated and sent with the message simplify
this and potentially allow performance benefits by relaxing total ordering.
However they imply additional map lookups that might hurt performance.
-- [ ] Prototype independent message IDs, check performance.
+- [X] Prototype independent message IDs, check performance.
+Throughput worse by 30% in contended case, 10% in uncontended.
+Sticking with queue sequence numbers.
* Outstanding Tasks
** TODO [#A] Defer and async completion of wiring commands.
@@ -152,6 +154,10 @@ Status includes
- persistent store state (clean, dirty)
- cluster protocol version.
+** TODO [#B] Replace boost::hash with our own hash function.
+The hash function is effectively part of the interface so
+we need to be sure it doesn't change underneath us.
+
** TODO [#B] Persistent cluster support.
Initial status protocol to support persistent start-up (see existing code)
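
For the boost::hash replacement task above, one possible shape for a hash that stays fixed across builds and library upgrades is sketched below. The plan does not say which function will be used; FNV-1a (32-bit) is substituted here purely as an example of a fully specified algorithm, and the name `stableHash` is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// FNV-1a, 32-bit. Chosen only as an example of a hash whose output is
// fixed by its definition; the plan does not name a specific function.
inline std::uint32_t stableHash(const std::string& key) {
    std::uint32_t h = 2166136261u;  // FNV-1a 32-bit offset basis
    for (std::size_t i = 0; i < key.size(); ++i) {
        h ^= static_cast<unsigned char>(key[i]);  // mix in one byte
        h *= 16777619u;                           // FNV 32-bit prime
    }
    return h;
}
```

Because the constants and byte order of operations are written out explicitly, every broker computes the same value for the same key regardless of compiler or Boost version, which is the property the task asks for.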
@@ -192,6 +198,16 @@ When this is fixed in the standalone broker, it should be fixed for cluster.
** TODO [#B] Network partitions and quorum.
Re-use existing implementation.
+** TODO [#B] Review error handling, put in a consistent model.
+- [ ] Review all asserts, for possible throw.
+- [ ] Decide on fatal vs. non-fatal errors.
+
+** TODO [#B] Implement inconsistent error handling policy.
+What to do if a message is enqueued successfully on the local broker,
+but fails on one or more backups - e.g. due to store limits?
+- we have more flexibility, we don't *have* to crash
+- but we've lost some of our redundancy guarantee, how should we inform client?
+
** TODO [#C] Allow non-replicated exchanges, queues.
Set qpid.replicate=false in declare arguments, set flag on Exchange, Queue objects.
@@ -226,7 +242,7 @@ The old cluster has workarounds in the broker code that can be removed.
- [ ] drop connections, sessions, management from cluster update.
- [ ] drop security workarounds: cluster code now operates after message decoding.
- [ ] drop connection tracking in cluster code.
-- [ ] simper inconsistent-error handling code, no need to stall.
+- [ ] simpler inconsistent-error handling code, no need to stall.
** TODO [#C] Support for live upgrades.