Commit message [branch]  Author  Age  Files  Lines
* Just in case something goes wrong in our weird unsupervised process, log it. [bug26368]  Simon MacMullen  2014-10-14  1  -2/+10
|
* Be a bit less verbose here: it's actually a very common case and perfectly normal; no need to dump state.  Simon MacMullen  2014-10-14  1  -2/+2
|
* Check whether the cluster is fully connected before trying to autoheal, and ignore autoheal requests if it isn't.  Simon MacMullen  2014-10-14  2  -26/+50
|
* Switch to having the winner inform the losers that they need to stop, rather than having the leader do it. This fixes the race where the leader tells them to stop before the partition has healed from the winner's POV. So it should be simpler and more correct.  Simon MacMullen  2014-10-14  1  -25/+8
|
* In fact, that case can't happen since bug 26043, so let's simplify.  Simon MacMullen  2014-10-14  1  -7/+0
|
* Merge in default (move from stable to default; we need to change the protocol)  Simon MacMullen  2014-10-14  0  -0/+0
|\
* \ Merge bug26404  Simon MacMullen  2014-10-13  0  -0/+0
|\ \
* | | Also avoid partition-related hangs here. [bug26404]  Simon MacMullen  2014-10-10  1  -2/+8
| | |
* | | Use a new function introduced on default  Simon MacMullen  2014-10-10  1  -2/+1
| | |
* | | Merge in default  Simon MacMullen  2014-10-10  1  -6/+18
|\ \ \
| |/ /
|/| |
| * | Defend against partitions at the wrong time causing badness.  Simon MacMullen  2014-10-08  1  -6/+18
| | |
* | | Merge bug26408 (again)  Simon MacMullen  2014-10-10  2  -26/+36
|\ \ \
| * \ \ Merge bug 26408  Simon MacMullen  2014-10-10  4  -10/+15
| |\ \ \
| * \ \ \ Merge bug26410  Simon MacMullen  2014-10-10  1  -6/+19
| |\ \ \ \
| | * | | | Use flow control when talking to the message store, on a fast machine this test could overwhelm the message store such that gen_server2:drain/1 never completed. [bug26410]  Simon MacMullen  2014-10-10  1  -6/+19
| |/ / / /
| * | | | Merge bug26409  Simon MacMullen  2014-10-09  0  -0/+0
| |\ \ \ \
| * | | | | Rethink partial partition detection: switching to disconnecting the node that first saw a DOWN (rather than the one that DOWN was about) means we don't need the delay. Also don't attempt to ask nodes we're partitioned from to check; let's not multiply entities needlessly. [bug26409]  Simon MacMullen  2014-10-09  1  -19/+15
| | | | | |
| * | | | | Check Mnesia's idea of which nodes are running; avoid infinite loop.  Simon MacMullen  2014-10-09  1  -1/+2
| |/ / / /
* | | | | Separate out different is_process_alive implementations depending on whether we want to consider the process alive if it is running but we can't talk to it via Mnesia. Thus unbreak exclusive queues with the direct client from non-Rabbit nodes. [bug26408]  Simon MacMullen  2014-10-10  5  -8/+23
| |/ / /
|/| | |
| | | |
* | | | Oops  Simon MacMullen  2014-10-09  1  -1/+1
| | | |
* | | | Slightly more accurate comment.  Simon MacMullen  2014-10-09  1  -4/+5
| | | |
* | | | Make rabbit_misc:is_process_alive() return false for nodes we are partitioned from; prevent prequeue:init/1 from entering an infinite loop on partition.  Simon MacMullen  2014-10-09  3  -6/+10
|/ / /
* | | Merge bug26407  Simon MacMullen  2014-10-09  1  -4/+13
|\ \ \
| * | | Partial partition delay. [bug26407]  Simon MacMullen  2014-10-09  1  -4/+13
|/ / /
* | | Mini essay  Simon MacMullen  2014-10-09  1  -0/+25
| | |
* | | Merge bug26406  Simon MacMullen  2014-10-09  0  -0/+0
|\ \ \
* | | | Update docs. [bug26406]  Simon MacMullen  2014-10-09  1  -1/+1
| | | |
* | | | Return environment for all apps.  Simon MacMullen  2014-10-08  1  -2/+7
|/ / /
* | | Unbreak quickcheck.  Simon MacMullen  2014-10-08  1  -0/+1
|/ /
* | Merge bug 25850  Simon MacMullen  2014-10-08  3  -2/+26
|\ \
| * | Rename function, and GC for large messages on the way out, too. [bug25850]  Simon MacMullen  2014-10-06  3  -5/+7
| | |
| * | Prevent the channel from holding a lot of binary garbage when accepting huge messages.  Simon MacMullen  2014-10-06  2  -2/+24
|/ /
* | Merge bug26401  Simon MacMullen  2014-10-06  0  -0/+0
|\ \
* | | Classify ETS memory by owner (and thus include all msg store memory under "msg store index"). [bug26401]  Simon MacMullen  2014-10-06  1  -5/+6
| | |
* | | Split out connection memory into reader / writer / channel / other.  Simon MacMullen  2014-10-06  1  -34/+57
|/ /
* | Merge bug26213 (again)  Simon MacMullen  2014-10-03  1  -6/+13
|\ \
| * | Just because we received a running_partitioned_network, doesn't mean all nodes are now contactable. Defer attempting autoheal until we can talk to everyone again, to avoid getting stuck in a loop with partial partition promotion. [bug26213]  Simon MacMullen  2014-10-03  1  -6/+13
| | |
| * | Merge default  Simon MacMullen  2014-10-03  0  -0/+0
| |\ \
|/ / /
| * | Merge default  Simon MacMullen  2014-10-03  0  -0/+0
| |\ \
* | \ \ Merge bug26368 (again)  Simon MacMullen  2014-10-03  2  -17/+28
|\ \ \ \
| |/ / /
|/| | /
| | |/
| |/|
| * | Distinguish between "already stopped" (fine, carry on) or "already down" (abort since we've lost contact).  Simon MacMullen  2014-10-03  2  -17/+28
| | |
* | | Merge bug 26213  Simon MacMullen  2014-10-03  2  -22/+92
|\ \ \
| | |/
| |/|
| * | Allow requires / enables to be multiple steps, reduce ugliness.  Simon MacMullen  2014-10-03  1  -4/+7
| | |
| * | In the event of a partial partition in pause_minority mode, pause until everything comes back - otherwise we stand a chance of just reconnecting and still being in a partial partition.  Simon MacMullen  2014-09-30  1  -14/+25
| | |
| * | Partial partition detection and handling (where by "handling" we mean "promotion to full partition"). This necessitates that we hold GUIDs for each node (so that we can detect if a node has restarted behind our back).  Simon MacMullen  2014-09-29  2  -16/+72
| | |
* | | Merge bug26398  Simon MacMullen  2014-10-02  0  -0/+0
|\ \ \
* | | | Oops [bug26398]  Simon MacMullen  2014-10-02  1  -1/+1
| | | |
* | | | Present memory and aggregated binaries in the same way.  Simon MacMullen  2014-10-01  1  -9/+18
|/ / /
* | | Merge bug26397  Simon MacMullen  2014-10-01  1  -41/+79
|\ \ \
| * | | Simplify. [bug26397]  Simon MacMullen  2014-10-01  1  -22/+10
| | | |