summaryrefslogtreecommitdiff
path: root/ACE/protocols/ace/TMCast/README
diff options
context:
space:
mode:
Diffstat (limited to 'ACE/protocols/ace/TMCast/README')
-rw-r--r--ACE/protocols/ace/TMCast/README240
1 files changed, 240 insertions, 0 deletions
diff --git a/ACE/protocols/ace/TMCast/README b/ACE/protocols/ace/TMCast/README
new file mode 100644
index 00000000000..7104be46e30
--- /dev/null
+++ b/ACE/protocols/ace/TMCast/README
@@ -0,0 +1,240 @@
+
+
+Introduction
+------------
+
+TMCast (stands for Transaction MultiCast) is an implementation of a
+transactional multicast protocol. In essence, the idea is to represent
+each message delivery to members of a multicast group as a transaction
+- an atomic, consistent and isolated action. A multicast transaction
+can be viewed as an atomic transition of the group members to a new
+state. If we define [Mo] as a set of operational (non-faulty) members
+of the group, [Mf] as a set of faulty members of the group, [Ma] as a
+set of members that view transition [Tn] as aborted and [Mc] as a set
+of members that view transition [Tn] as committed, then this atomic
+transition [Tn] should satisfy one of the following equations:
+
+Mo(Tn-1) = Ma(T) U Mf(T)
+Mo(Tn-1) = Mc(T) U Mf(T)
+
+Or, in other words, after transaction T has been committed (aborted),
+all operational (before transaction T) members are either in the
+committed (aborted) or failed state.
+
+Thus, for each member of the group, outcome of the transaction can be
+commit, abort or a member failure. It is important for a member to
+exhibit a failfast (error latency is less than transaction cycle)
+behavior. Or, in other words, if a member transitioned into a wrong
+state, it is guaranteed to fail instead of delivering a wrong result.
+
+In order to achieve such an error detection in a decentralized
+environment, certain limitations were imposed. One of the most
+user-visible limitation is the fact that the lifetime of the group
+with only one member is very short. This is because there is not way
+for a member to distinguish "no members yet" case from "my link to the
+group is down". In such a situation, the member assumes the latter
+case. There is also a military saying that puts it quite nicely: two
+is one, one is nothing.
+
+
+
+State of Implementation
+-----------------------
+
+The current implementation is in a prototypical stage. The following
+parts are not implemented or still under development:
+
+* Handling of network partitioning (TODO)
+
+* Redundant network support (TODO)
+
+* Member failure detection (partial implementation)
+
+
+Examples
+--------
+
+There is a simple example available in examples/TMCast/Member with
+the corresponding README.
+
+
+Architecture
+------------
+
+Primary goals of the protocol are to (1) mask transient failures of the
+underlying multicast protocol (or, more precisely, allow to recover
+from transient failures) and (2) exhibit failfast behavior in cases of
+permanent failures.
+
+The distinction between transient and permanent failures is based on
+timeouts thus what can be a transient failure in one configuration of
+the protocol could be a permanent failure in the other.
+
+[Maybe talk more about a transient/permanent threshold and its effect
+on performance/resource utilization/etc.]
+
+[Maybe add a terminology section.]
+
+Each member of a multicast group has its unique (group-wise) id:
+
+struct MemberId
+{
+ char id[MEMBER_ID_LENGTH];
+};
+
+Each payload delivery is part of a transaction. Each transaction is
+identified by TransactionId:
+
+typedef unsigned short TransactionId;
+
+
+Each transaction has a status code which identifies its state, as viewed by
+a group member:
+
+
+typedef unsigned char TransactionStatus;
+
+TransactionStatus const TS_BEGIN = 1;
+TransactionStatus const TS_COMMIT = 2;
+TransactionStatus const TS_ABORT = 3;
+TransactionStatus const TS_COMMITTED = 4;
+TransactionStatus const TS_ABORTED = 5;
+
+Thus each transaction is described by its id and status:
+
+struct Transaction
+{
+ TransactionId id;
+ TransactionStatus status;
+};
+
+The outcome of some predefined number of recent transactions is stored
+in TransactionList:
+
+typedef Transaction TransactionList[TL_LENGTH];
+
+
+Each message sent to a multicast group has the following header:
+
+struct MessageHeader
+{
+ unsigned long length;
+ unsigned long check_sum;
+ MemberId member_id;
+ Transaction current;
+ TransactionList transaction_list;
+};
+
+[Maybe describe each field here.]
+
+A new member joins the group with transaction id 0 and status
+TS_COMMITTED.
+
+Each member sends a periodic 'pulse' messages with some predefined interval
+advertising its current view of the group. This includes the state of the
+current transaction and the history of the recent transactions.
+
+
+If a member of the group needs a payload delivery it starts a new
+transaction by sending a message with current transaction set to
+
+{++current_id, TS_BEGIN}
+
+and payload appended after the header.
+
+
+Each member joins a transaction in one of the following ways:
+
+* A member that began the transaction joins it 'to commit' (TS_COMMIT)
+
+* A member that received TS_BEGIN joins current transaction 'to commit'
+ (TS_COMMIT).
+
+* A member that received TS_COMMIT or TS_ABORT but did not receive TS_BEGIN
+ joins current transaction 'to abort' (TS_ABORT).
+
+
+After a member has joined the transaction it starts participating in the
+transaction's voting phase. On this phase members of the group decide the
+fate of the transaction. Each member sends a predefined number of messages
+where it announces its vote. In between those messages the member is receiving
+and processing votes from other members and can be influenced by their
+'opinion'.
+
+In their decision-making members follow the principle of the majority. As
+the voting progresses (and comes close to an end) members become more and
+more reluctant to deviate from the decision of the majority.
+
+[Maybe add an equation that measures member's willingness to change
+its mind.]
+
+At the end of the voting phase each member declares the current transaction
+either committed (TS_COMMITTED) or aborted (TS_ABORTED). If this decision does
+not agree with the majority the member declares itself failed.
+
+In addition, each member builds a 'majority view' of the transaction history
+(based on transaction_list). If it deviates from the member's own history the
+member declares itself failed.
+
+Here are some example scenarios of how the protocol behaves in different
+situations. Let's say we have three members of the group S, R1, R2. S
+initiates a transaction. R1 and R2 join it.
+
+Scenario 1. (two-step voting)
+
+1. S initiates a transaction (TS_BEGIN)
+2a. R1 receives TS_BEGIN, joins for commit
+2b. R2 receives TS_BEGIN, joins for commit
+3a. S announces TS_COMMIT (first vote)
+3b. R1 announces TS_COMMIT (first vote)
+3c. R2 announces TS_COMMIT (first vote)
+4a. S announces TS_COMMIT (second vote)
+4b. R1 announces TS_COMMIT (second vote)
+4c. R2 announces TS_COMMIT (second vote)
+5a. S announces TS_COMMITTED (end of vote)
+5b. R1 announces TS_COMMITTED (end of vote)
+5c. R2 announces TS_COMMITTED (end of vote)
+
+
+Scenario 2. (two-step voting)
+
+1. S initiates a transaction (TS_BEGIN)
+2a. R1 receives TS_BEGIN, joins for commit
+2b. R2 didn't receive TS_BEGIN
+3a. S announces TS_COMMIT (first vote)
+3b. R1 announces TS_COMMIT (first vote)
+3c. R2 received R1's TS_COMMIT announces TS_ABORT (first vote)
+4a. S received R2's TS_ABORT announces TS_ABORT (second vote)
+4b. R1 received R2's TS_ABORT announces TS_ABORT (second vote)
+4c. R2 announces TS_ABORT (second vote)
+5a. S announces TS_ABORTED (end of vote)
+5b. R1 announces TS_ABORTED (end of vote)
+5c. R2 announces TS_ABORTED (end of vote)
+
+
+Scenario 3. (three-step voting)
+
+1. S initiates a transaction (TS_BEGIN)
+2a. R1 receives TS_BEGIN, joins for commit
+2b. R2 didn't receive TS_BEGIN
+3a. S announces TS_COMMIT (first vote)
+3b. R1 announces TS_COMMIT (first vote)
+3c. R2 still didn't receive anything
+4a. S announces TS_COMMIT (second vote)
+4b. R1 announces TS_COMMIT (second vote)
+4c. R2 received R1's TS_COMMIT, announces TS_ABORT (first vote)
+
+5a. S received R2's TS_ABORT but it is the end of the voting phase and
+ majority (S and R1) vote for commit, announces TS_COMMIT (third vote)
+5b. R1 received R2's TS_ABORT but it is the end of the voting phase and
+ majority (S and R1) vote for commit, announces TS_COMMIT (third vote)
+5c. R2 announces TS_ABORT (second vote)
+
+6a. S announces TS_COMMITTED (end of vote)
+6b. R1 announces TS_COMMITTED (end of vote)
+6c. R2 discovers that the the majority has declared current transaction
+ committed and thus declares itself failed.
+
+
+--
+Boris Kolpackov <boris@dre.vanderbilt.edu>