author | Alan Conway <aconway@apache.org> | 2011-10-19 20:36:43 +0000 |
---|---|---|
committer | Alan Conway <aconway@apache.org> | 2011-10-19 20:36:43 +0000 |
commit | 39823dac45f3a79d19d37204dd122921b5339673 (patch) | |
tree | 56288a974a021b1885820e80ec3d4701d131426e | |
parent | 1b7f61bbf0f1ab5dad2bc380b3890911a1e43f52 (diff) | |
download | qpid-python-39823dac45f3a79d19d37204dd122921b5339673.tar.gz |
QPID-2920: Updates to new-cluster-plan.
Filled out plan to-do list. More implementation detail.
git-svn-id: https://svn.apache.org/repos/asf/qpid/branches/qpid-2920-active@1186466 13f79535-47bb-0310-9956-ffa450edef68
-rw-r--r-- | qpid/cpp/design_docs/new-cluster-design.txt | 15 |
-rw-r--r-- | qpid/cpp/design_docs/new-cluster-plan.txt | 20 |
2 files changed, 24 insertions, 11 deletions
diff --git a/qpid/cpp/design_docs/new-cluster-design.txt b/qpid/cpp/design_docs/new-cluster-design.txt
index aaf8cd6488..936530a39a 100644
--- a/qpid/cpp/design_docs/new-cluster-design.txt
+++ b/qpid/cpp/design_docs/new-cluster-design.txt
@@ -296,23 +296,20 @@ flow control is sufficient for qpid.
 ** Live upgrades
 Live upgrades refers to the ability to upgrade a cluster while it is
-running, with no downtime. Brokers are shut down and re-start brokers
-in the cluster is shut down, and then re-started with a new version of
-the broker code.
+running, with no downtime. Each brokers in the cluster is shut down,
+and then re-started with a new version of the broker code.
 To achieve this
 - Cluster protocl XML file has a new element <version number=N> attached
   to each method. This is the version at which the method was added.
-- New versions can add methods, existing methods cannot be changed.
+- New versions can only add methods, existing methods cannot be changed.
 - The cluster handshake for new members includes the protocol version
   at each member.
 - The cluster's version is the lowest version among its members.
-- A newer broker can join and older cluster but not vice versa.
-- A newer broker in an old cluster must speak the old protocol for the
-  benefit of older brokers. It can also use newer controls which
-  will be ignored by old broekrs.
+- A newer broker can join and older cluster. When it does, it must restrict
+  itself to speaking the older version protocol.
 - When the cluster version increases (because the lowest version member has left)
-  the remaining members may switch to using only the new version.
+  the remaining members may move up to the new version.
 * Design debates
diff --git a/qpid/cpp/design_docs/new-cluster-plan.txt b/qpid/cpp/design_docs/new-cluster-plan.txt
index b22f7a1f31..626e443be7 100644
--- a/qpid/cpp/design_docs/new-cluster-plan.txt
+++ b/qpid/cpp/design_docs/new-cluster-plan.txt
@@ -90,7 +90,9 @@ Independent message IDs that can be generated and sent with the message
 simplify this and potentially allow performance benefits by relaxing
 total ordering. However they imply additional map lookups that might
 hurt performance.
-- [ ] Prototype independent message IDs, check performance.
+- [X] Prototype independent message IDs, check performance.
+Throughput worse by 30% in contented case, 10% in uncontended.
+Sticking with queue sequence numbers.
 * Outstanding Tasks
 ** TODO [#A] Defer and async completion of wiring commands.
@@ -152,6 +154,10 @@ Status includes
 - persistent store state (clean, dirty)
 - cluster protocol version.
+** TODO [#B] Replace boost::hash with our own hash function.
+The hash function is effectively part of the interface so
+we need to be sure it doesn't change underneath us.
+
 ** TODO [#B] Persistent cluster support.
 Initial status protoocl to support persistent start-up (see existing code)
@@ -192,6 +198,16 @@ When this is fixed in the standalone broker, it should be fixed for cluster.
 ** TODO [#B] Network partitions and quorum.
 Re-use existing implementation.
+** TODO [#B] Review error handling, put in a consitent model.
+- [ ] Review all asserts, for possible throw.
+- [ ] Decide on fatal vs. non-fatal errors.
+
+** TODO [#B] Implement inconsistent error handling policy.
+What to do if a message is enqueued sucessfully on the local broker,
+but fails on one or more backups - e.g. due to store limits?
+- we have more flexibility, we don't *have* to crash
+- but we've loste some of our redundancy guarantee, how should we inform client?
+
 ** TODO [#C] Allow non-replicated exchanges, queues.
 Set qpid.replicate=false in declare arguments, set flag on Exchange, Queue objects.
@@ -226,7 +242,7 @@ The old cluster has workarounds in the broker code that can be removed.
 - [ ] drop connections, sessions, management from cluster update.
 - [ ] drop security workarounds: cluster code now operates after message decoding.
 - [ ] drop connection tracking in cluster code.
-- [ ] simper inconsistent-error handling code, no need to stall.
+- [ ] simpler inconsistent-error handling code, no need to stall.
 ** TODO [#C] Support for live upgrades.
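
The version rules listed under "Live upgrades" in new-cluster-design.txt might be sketched in C++ roughly as follows. This is a minimal illustration, not code from the qpid tree: ProtocolVersion, clusterVersion, canJoin and canSend are hypothetical names. Rejecting an older broker that tries to join a newer cluster is an assumption, taken from the "but not vice versa" wording that this commit removes.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical names throughout; none of these appear in the qpid source.
typedef unsigned int ProtocolVersion;

// "The cluster's version is the lowest version among its members."
// Precondition: the cluster has at least one member.
inline ProtocolVersion clusterVersion(const std::vector<ProtocolVersion>& members) {
    assert(!members.empty());
    return *std::min_element(members.begin(), members.end());
}

// A newer (or equal) broker can join an older cluster.
// Assumption: an older broker may not join a newer cluster.
inline bool canJoin(ProtocolVersion joiner, ProtocolVersion cluster) {
    return joiner >= cluster;
}

// A method tagged <version number=N> may be sent only while the
// cluster version is at least N, so a newer broker in an older
// cluster restricts itself to the older protocol.
inline bool canSend(ProtocolVersion methodAddedIn, ProtocolVersion cluster) {
    return methodAddedIn <= cluster;
}
```

With members at versions 2, 3 and 2 the cluster version is 2, so a version 3 broker can join but must not send methods added in version 3. Once the version 2 members leave, clusterVersion rises to 3 and those methods become usable.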