Force slaves to resync after unsuccessful PSYNC.

With chained replication, where C is a slave of B, which is in turn a slave of A, if B reconnects the replication link with A but discovers it is no longer possible to PSYNC, the slaves of B must be disconnected and must not be allowed to PSYNC, since B's new dataset may be completely different after the synchronization with the master.

Note that there are various semantic differences in the way this is handled now compared to the past. In the past the semantics were:

1. When a slave lost the connection with its master, it disconnected its chained slaves ASAP. This is not needed, since after a successful PSYNC with the master the slaves can continue and don't need to resync in turn.

2. However, after a failed PSYNC the replication backlog was not reset, so a slave was able to PSYNC successfully even if the instance did a full sync with its master and now contained an entirely different data set.

Now, instead, chained slaves are not disconnected when the slave loses the connection with its master, but only when it is forced to do a full SYNC with its master. This means that if a slave with chained slaves does a successful PSYNC, all its slaves can continue without trouble.

See issue #2694 for more details.
author: antirez <antirez@gmail.com> 2015-07-28 16:14:52 +0200
committer: antirez <antirez@gmail.com> 2015-07-28 16:35:02 +0200
commit: c1e94b6b9c6432ade2ec427dc8602189c19758e7 (patch)
tree: 668377525ea7b22ac5601731bf1c86950b4cde6b
parent: 278ea9d16b24add67379e569c236c69fecf55bdb (diff)
download: redis-c1e94b6b9c6432ade2ec427dc8602189c19758e7.tar.gz
1 files changed, 10 insertions, 6 deletions
diff --git a/src/replication.c b/src/replication.c
index 15ac76d5e..202342793 100644
--- a/src/replication.c
+++ b/src/replication.c
@@ -1330,6 +1330,13 @@ void syncWithMaster(aeEventLoop *el, int fd, void *privdata, int mask) {
         return;
     }
 
+    /* PSYNC failed or is not supported: we want our slaves to resync with us
+     * as well, if we have any (chained replication case). The master may
+     * transfer us an entirely different data set and we have no way to
+     * incrementally feed our slaves after that. */
+    disconnectSlaves(); /* Force our slaves to resync with us as well. */
+    freeReplicationBacklog(); /* Don't allow our chained slaves to PSYNC. */
+
     /* Fall back to SYNC if needed. Otherwise psync_result == PSYNC_FULLRESYNC
      * and the server.repl_master_runid and repl_master_initial_offset are
      * already populated. */
@@ -1483,12 +1490,9 @@ void replicationHandleMasterDisconnection(void) {
     server.master = NULL;
     server.repl_state = REPL_STATE_CONNECT;
     server.repl_down_since = server.unixtime;
-    /* We lost connection with our master, force our slaves to resync
-     * with us as well to load the new data set.
-     *
-     * If server.masterhost is NULL the user called SLAVEOF NO ONE so
-     * slave resync is not needed. */
-    if (server.masterhost != NULL) disconnectSlaves();
+    /* We lost connection with our master, don't disconnect slaves yet,
+     * maybe we'll be able to PSYNC with our master later. We'll disconnect
+     * the slaves only if we'll have to do a full resync with our master. */
 }
 
 void slaveofCommand(client *c) {
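
The behavioral change in this patch can be illustrated with a small standalone model of the intermediate instance (B in the A -> B -> C chain). This is only a sketch and not code from Redis: the `toy_` types and functions are hypothetical names invented here, and the two fields stand in for the effects of `disconnectSlaves()` and `freeReplicationBacklog()`. It assumes the reconnect handshake yields one of two outcomes: a partial resync that continues the old history, or a full resync.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the reconnect handshake with the master. */
typedef enum { TOY_PSYNC_CONTINUE, TOY_PSYNC_FULLRESYNC } toy_psync_result;

/* Hypothetical state of the intermediate instance (B). */
typedef struct {
    int  attached_slaves; /* chained slaves currently connected to us */
    bool has_backlog;     /* true while our replication backlog exists */
    bool master_link_up;  /* true while the link to our master is up */
} toy_replica;

/* New semantics: losing the master link alone does not touch our slaves,
 * because a later successful PSYNC lets them continue unaffected. */
void toy_handle_master_disconnection(toy_replica *r) {
    assert(r != NULL);
    r->master_link_up = false;
}

/* Called once we know whether the master accepted a partial resync. */
void toy_handle_psync_result(toy_replica *r, toy_psync_result res) {
    assert(r != NULL);
    r->master_link_up = true;
    if (res == TOY_PSYNC_CONTINUE) return; /* same history: nothing to do */

    /* Full resync: our data set may change entirely, so drop the chained
     * slaves and discard the backlog so they cannot PSYNC against it. */
    r->attached_slaves = 0;
    r->has_backlog = false;
}
```

In this model, a disconnection followed by `TOY_PSYNC_CONTINUE` leaves `attached_slaves` and `has_backlog` untouched, while a disconnection followed by `TOY_PSYNC_FULLRESYNC` clears both, matching the semantics described in the commit message.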