author | antirez <antirez@gmail.com> | 2015-08-04 16:56:00 +0200 |
---|---|---|
committer | antirez <antirez@gmail.com> | 2015-08-04 17:06:10 +0200 |
commit | 292fec058a32323d5aa52dddfa86be280e29fe65 (patch) | |
tree | f728ed2b4687d008f4e03ae908359922286eb5b5 /src/server.h | |
parent | d1ff328170a161fc002e47954e5dd0e0989d2ce9 (diff) | |
download | redis-292fec058a32323d5aa52dddfa86be280e29fe65.tar.gz |
PSYNC initial offset fix.
This commit attempts to fix a bug involving PSYNC and diskless
replication (currently experimental), found by Yuval Inbar from Redis Labs
and later found to have even more far-reaching effects (the bug also
exists when diskstore is off).
The gist of the bug is that a Redis master replies with +FULLRESYNC to
a PSYNC attempt that fails and requires a full resynchronization.
However, the baseline offset sent along with FULLRESYNC was always the
current master replication offset. This is not OK, because there are
many reasons that may delay the RDB file creation. And... guess what,
the master offset we communicate must be the one at the time the RDB
was created. So for example:
1) When the BGSAVE for replication is delayed because one is already in
progress but is not good for replication.
2) When the BGSAVE is not needed because we attach to one currently ongoing.
3) When the BGSAVE is delayed because of diskless replication.
In all the above cases the PSYNC reply is wrong and the slave may
reconnect later claiming to need a wrong offset: this may cause
data corruption later.
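The mismatch described above can be illustrated with a small toy model. This is only a sketch, not Redis code: `toy_master` and every `toy_*` function are hypothetical names invented for the illustration, and the model tracks nothing but the two offsets involved.

```c
#include <assert.h>

/* Hypothetical toy model, not Redis code: only the two offsets needed
 * to show the mismatch. */
typedef struct {
    long long master_repl_offset; /* bytes of replication stream produced so far */
    long long rdb_offset;         /* offset when the RDB snapshot started, -1 if none */
} toy_master;

/* The master appends `bytes` of write traffic to its replication stream. */
void toy_feed(toy_master *m, long long bytes) {
    assert(bytes >= 0);
    m->master_repl_offset += bytes;
}

/* The BGSAVE used for replication starts: the snapshot reflects the
 * dataset exactly at the current offset. */
void toy_start_bgsave(toy_master *m) {
    m->rdb_offset = m->master_repl_offset;
}

/* Behavior described as buggy: reply with whatever the master offset is
 * at the moment the reply is produced. */
long long toy_fullresync_offset_buggy(const toy_master *m) {
    return m->master_repl_offset;
}

/* Intended behavior: reply with the offset the snapshot corresponds to. */
long long toy_fullresync_offset_fixed(const toy_master *m) {
    assert(m->rdb_offset >= 0); /* a snapshot must have been started */
    return m->rdb_offset;
}
```

As a usage example for case 2 (attaching to an ongoing BGSAVE): start from `master_repl_offset = 100`, call `toy_start_bgsave`, then `toy_feed` with 50 bytes before the PSYNC is handled. The buggy variant returns 150 while the snapshot actually corresponds to offset 100, so a slave trusting the reply would account for 50 bytes it never applied.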
Diffstat (limited to 'src/server.h')
-rw-r--r-- | src/server.h | 5 |
1 files changed, 5 insertions, 0 deletions
diff --git a/src/server.h b/src/server.h
index c3bea54ea..bbd0014e1 100644
--- a/src/server.h
+++ b/src/server.h
@@ -564,6 +564,9 @@ typedef struct client {
     long long reploff;      /* replication offset if this is our master */
     long long repl_ack_off; /* replication ack offset, if this is a slave */
     long long repl_ack_time;/* replication ack time, if this is a slave */
+    long long psync_initial_offset; /* FULLRESYNC reply offset other slaves
+                                       copying this slave output buffer
+                                       should use. */
     char replrunid[CONFIG_RUN_ID_SIZE+1]; /* master run id if this is a master */
     int slave_listening_port; /* As configured with: SLAVECONF listening-port */
     multiState mstate;      /* MULTI/EXEC state */
@@ -1198,6 +1201,8 @@ int replicationCountAcksByOffset(long long offset);
 void replicationSendNewlineToMaster(void);
 long long replicationGetSlaveOffset(void);
 char *replicationGetSlaveName(client *c);
+long long getPsyncInitialOffset(void);
+int replicationSendFullresyncReply(client *slave, long long offset);
 
 /* Generic persistence functions */
 void startLoading(FILE *fp);
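
This page shows only the header changes, so the following is a hedged sketch of how the new `psync_initial_offset` field might be used, based solely on its comment. `toy_client`, `toy_send_fullresync_reply`, and `toy_attach_to_ongoing_bgsave` are hypothetical stand-ins, not the functions declared above, and the `+FULLRESYNC <runid> <offset>` line layout is an assumption for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the client struct: only the new field plus a
 * buffer playing the role of the reply written to the slave. */
typedef struct {
    long long psync_initial_offset; /* offset announced in the FULLRESYNC reply */
    char reply[128];                /* formatted reply line */
} toy_client;

/* Remember the offset on the slave and format the reply line.
 * Returns 0 on success, -1 if the line does not fit in the buffer. */
int toy_send_fullresync_reply(toy_client *slave, const char *runid,
                              long long offset) {
    assert(slave != NULL && runid != NULL);
    slave->psync_initial_offset = offset;
    int n = snprintf(slave->reply, sizeof(slave->reply),
                     "+FULLRESYNC %s %lld\r\n", runid, offset);
    return (n < 0 || (size_t)n >= sizeof(slave->reply)) ? -1 : 0;
}

/* A slave attaching to a BGSAVE that already feeds `source` reuses the
 * offset recorded for `source` rather than the master's current offset. */
int toy_attach_to_ongoing_bgsave(toy_client *newslave,
                                 const toy_client *source,
                                 const char *runid) {
    assert(source != NULL);
    return toy_send_fullresync_reply(newslave, runid,
                                     source->psync_initial_offset);
}
```

Storing the offset per slave means a later slave that copies the first slave's output buffer is told the same baseline, which is what the field comment in the diff describes.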