author     Samuel Merritt <sam@swiftstack.com>    2012-11-21 14:57:21 -0800
committer  Samuel Merritt <sam@swiftstack.com>    2012-11-21 14:59:26 -0800
commit     89a871d42f1226c2dd292ea739dfda01d6f4b3f2 (patch)
tree       a2464cd559d2f6e92d6267b82c003e7b6ce71084 /doc/source/overview_container_sync.rst
parent     2fc9716ec9384b0079d9c077e0f081a13ad76624 (diff)
download   swift-89a871d42f1226c2dd292ea739dfda01d6f4b3f2.tar.gz
Improve container-sync docs.
Two improvements: first, document that the container-sync process
connects to the remote cluster's proxy server, so outbound connectivity
is required. Second, rewrite the behind-the-scenes container-sync
example and add some ASCII-art diagrams.

Fixes bug 1068430.

Bonus fix of docstring in wsgi.py to squelch a sphinx warning.

Change-Id: I85bd56c2bd14431e13f7c57a43852777f14014fb
Diffstat (limited to 'doc/source/overview_container_sync.rst')
-rw-r--r--  doc/source/overview_container_sync.rst | 121
1 file changed, 85 insertions, 36 deletions
diff --git a/doc/source/overview_container_sync.rst b/doc/source/overview_container_sync.rst
index af0168791..b62136d25 100644
--- a/doc/source/overview_container_sync.rst
+++ b/doc/source/overview_container_sync.rst
@@ -174,6 +174,13 @@ to the other container.
.. note::
+ The swift-container-sync process runs on each container server in
+ the cluster and talks to the proxy servers in the remote cluster.
+ Therefore, the container servers must be permitted to initiate
+ outbound connections to the remote proxy servers.
+
+.. note::
+
Container sync will sync object POSTs only if the proxy server is set to
use "object_post_as_copy = true" which is the default. So-called fast
object posts, "object_post_as_copy = false" do not update the container
@@ -184,39 +191,81 @@ The actual syncing is slightly more complicated to make use of the three
do the exact same work but also without missing work if one node happens to
be down.
-Two sync points are kept per container database. All rows between the two
-sync points trigger updates. Any rows newer than both sync points cause
-updates depending on the node's position for the container (primary nodes
-do one third, etc. depending on the replica count of course). After a sync
-run, the first sync point is set to the newest ROWID known and the second
-sync point is set to newest ROWID for which all updates have been sent.
-
-An example may help. Assume replica count is 3 and perfectly matching
-ROWIDs starting at 1.
-
- First sync run, database has 6 rows:
-
- * SyncPoint1 starts as -1.
- * SyncPoint2 starts as -1.
- * No rows between points, so no "all updates" rows.
- * Six rows newer than SyncPoint1, so a third of the rows are sent
- by node 1, another third by node 2, remaining third by node 3.
- * SyncPoint1 is set as 6 (the newest ROWID known).
- * SyncPoint2 is left as -1 since no "all updates" rows were synced.
-
- Next sync run, database has 12 rows:
-
- * SyncPoint1 starts as 6.
- * SyncPoint2 starts as -1.
- * The rows between -1 and 6 all trigger updates (most of which
- should short-circuit on the remote end as having already been
- done).
- * Six more rows newer than SyncPoint1, so a third of the rows are
- sent by node 1, another third by node 2, remaining third by node
- 3.
- * SyncPoint1 is set as 12 (the newest ROWID known).
- * SyncPoint2 is set as 6 (the newest "all updates" ROWID).
-
-In this way, under normal circumstances each node sends its share of
-updates each run and just sends a batch of older updates to ensure nothing
-was missed.
+Two sync points are kept in each container database. When syncing a
+container, the container-sync process figures out which replica of the
+container it has. In a standard 3-replica scenario, the process will
+have either replica number 0, 1, or 2. This is used to figure out
+which rows belong to this sync process and which ones don't.
+
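+In code terms, each container database carries the pair of sync points,
+and each sync process knows its position among the container's
+replicas. The following is a minimal sketch of that state (illustrative
+names only, not Swift's actual API; replica_ordinal() is a hypothetical
+helper)::
+
+    # Per-container state:
+    #   sync_point1 -- newest row ID this replica has scanned (-1 initially)
+    #   sync_point2 -- newest row ID up to which *every* row has been
+    #                  handled by this replica (-1 initially)
+
+    def replica_ordinal(my_node_id, replica_node_ids):
+        """Position of this node among the container's replicas:
+        0, 1, or 2 in a standard three-replica ring."""
+        return sorted(replica_node_ids).index(my_node_id)
+
+    # e.g. replica_ordinal('node-b', ['node-c', 'node-a', 'node-b']) -> 1
+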
+An example may help. Assume a replica count of 3 and that database row
+IDs are 1..6. Also, assume that container-sync is running on this
+container for the first time, hence SP1 = SP2 = -1. ::
+
+    SP1
+    SP2
+     |
+     v
+    -1 0 1 2 3 4 5 6
+
+First, the container-sync process looks for rows with IDs between SP1
+and SP2. Since this is the first run, SP1 = SP2 = -1, and there aren't
+any such rows. ::
+
+    SP1
+    SP2
+     |
+     v
+    -1 0 1 2 3 4 5 6
+
+Second, the container-sync process looks for rows with IDs greater than
+SP1, and syncs those rows which it owns. Ownership is based on the
+hash of the object name, so it's not always guaranteed to be exactly
+one out of every three rows, but it usually gets close. For the sake
+of example, let's say that this process ends up owning rows 2 and 5.
+
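+Ownership can be pictured as hashing the object name and mapping the
+result onto the replicas. The following is only an illustrative sketch
+(Swift's real check differs in its details; owns_row() is a hypothetical
+helper, not part of Swift's API)::
+
+    import hashlib
+
+    def owns_row(object_name, ordinal, replica_count=3):
+        # Hash the object name and map it onto one of the replicas; the
+        # replica whose ordinal matches is the row's primary owner.
+        digest = hashlib.md5(object_name.encode('utf-8')).hexdigest()
+        return int(digest, 16) % replica_count == ordinal
+
+    # Exactly one of the three replicas owns any given name, e.g.:
+    # [owns_row('photo-001.jpg', i) for i in range(3)] -> [False, True, False]
+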
+Once it's finished syncing those rows, it updates SP1 to be the
+biggest row ID that it's seen, which is 6 in this example. ::
+
+    SP2           SP1
+     |             |
+     v             v
+    -1 0 1 2 3 4 5 6
+
+While all that was going on, clients uploaded new objects into the
+container, creating new rows in the database. ::
+
+    SP2           SP1
+     |             |
+     v             v
+    -1 0 1 2 3 4 5 6 7 8 9 10 11 12
+
+On the next run, the container-sync process starts off looking at rows
+with IDs between SP1 and SP2. This time, there are a bunch of them. The
+sync process takes the ones it *does not* own and syncs them. Again,
+this is based on the hashes, so this will be everything it didn't sync
+before. In this example, that's rows 1, 3, 4, and 6.
+
+Under normal circumstances, the container-sync processes for the other
+replicas will have already taken care of synchronizing those rows, so
+this is a set of quick checks. However, if one of those other sync
+processes failed for some reason, then this is a vital fallback to
+make sure all the objects in the container get synchronized. Without
+this seemingly redundant work, any container-sync failure results in
+unsynchronized objects.
+
+Once it's done with the fallback rows, SP2 is advanced to SP1. ::
+
+                  SP2
+                  SP1
+                   |
+                   v
+    -1 0 1 2 3 4 5 6 7 8 9 10 11 12
+
+Then, rows with row ID greater than SP1 are synchronized (provided
+this container-sync process is responsible for them), and SP1 is moved
+up to the greatest row ID seen. ::
+
+                  SP2           SP1
+                   |             |
+                   v             v
+    -1 0 1 2 3 4 5 6 7 8 9 10 11 12
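+
+Putting the two passes together, one run over a single container can be
+sketched roughly as follows. This is illustrative Python only, not
+Swift's implementation: broker.get_rows(), broker.save_sync_points(),
+send_to_remote(), and owns_row() are hypothetical helpers, and all
+error handling is omitted. ::
+
+    def sync_container_once(broker, ordinal, replica_count=3):
+        # Both sync points start out as -1 for a brand-new container.
+        sp1, sp2 = broker.get_sync_points()
+
+        # Pass 1: rows in (SP2, SP1] -- send the rows this replica does
+        # NOT own, as a safety net in case another replica's sync failed.
+        for row in broker.get_rows(since=sp2, until=sp1):
+            if not owns_row(row['name'], ordinal, replica_count):
+                send_to_remote(row)
+        sp2 = sp1                           # SP2 catches up to SP1
+
+        # Pass 2: rows newer than SP1 -- send only the rows this replica
+        # owns; the other replicas handle the rest.
+        for row in broker.get_rows(since=sp1):
+            if owns_row(row['name'], ordinal, replica_count):
+                send_to_remote(row)
+            sp1 = max(sp1, row['ROWID'])    # advance to the newest row seen
+
+        broker.save_sync_points(sp1, sp2)
+
+Running this twice against the example database reproduces the pictures
+above: the first run ends with SP1 = 6 and SP2 = -1, and the second run
+ends with SP1 = 12 and SP2 = 6.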