author    | Alan Conway <aconway@apache.org> | 2014-12-19 03:18:57 +0000
committer | Alan Conway <aconway@apache.org> | 2014-12-19 03:18:57 +0000
commit    | 40e74eaa3f8a345e7bc888e36de79717b7c761d0 (patch)
tree      | 4d9a08cb40caf897b9d73c55deac60374d97eb0c /qpid/cpp/examples/messaging
parent    | aa51ac52f3bd77d92acf585699bc7429666ad785 (diff)
download  | qpid-python-40e74eaa3f8a345e7bc888e36de79717b7c761d0.tar.gz
QPID-6278: HA broker abort in TXN soak test
The crash appears to be a race condition in async completion, exposed by the HA
TX code as follows:
1. Message received and placed on the tx-replication queue; completion is delayed until the backups ack.
The completion count goes up for each backup, then down as each backup acks.
2. Prepare received; message placed on the primary's local persistent queue.
The completion count goes up by one, then down by one for the local store completion (a null store in this case).
The race is something like this:
- The last backup ack arrives (on a backup IO thread) and drops the completion count to 0.
- The prepare arrives (on a client thread); the null store bumps the count to 1, and it immediately drops back to 0.
- Both threads try to invoke the completion callback; one deletes it while the other is still invoking it.
The old completion logic assumed that only one thread can see the atomic counter
go to 0. It does not handle the count going to 0 in one thread while concurrently
being increased and decreased back to 0 in another. This case is introduced by
HA transactions because the same message is put onto a tx-replication queue and
then put again onto another persistent local queue, so there are two cycles of
completion.
The new logic fixes this: only one call to the completion callback is possible in all cases.
Also fixed a missing lock in ha/Primary.cpp.
git-svn-id: https://svn.apache.org/repos/asf/qpid/trunk@1646618 13f79535-47bb-0310-9956-ffa450edef68