path: root/src/replication.c
Commit message | Author | Age | Files | Lines
* Fix a bug to delay bgsave while AOF rewrite in progress for replication | Qu Chen | 2016-08-02 | 1 | -1/+1
* Ability of slave to announce arbitrary ip/port to master. | antirez | 2016-07-27 | 1 | -6/+56
  This feature is useful, especially in deployments using Sentinel in order to set up Redis HA, where the slave is executed with NAT or port forwarding, so that the auto-detected port/ip addresses, as listed in the "INFO replication" output of the master, or as provided by the "ROLE" command, don't match the real addresses at which the slave is reachable for connections.
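The selection rule described above can be sketched as follows. This is only an illustration, not the code from the commit: the `slave_addr` struct, its field names, and both helper functions are hypothetical. The idea is that an announced value, when present, wins over the auto-detected one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of what a master knows about one slave. */
typedef struct {
    const char *peer_ip;      /* address auto-detected from the socket */
    int listening_port;       /* port auto-detected for the slave */
    const char *announced_ip; /* NULL or "" when nothing was announced */
    int announced_port;       /* 0 when nothing was announced */
} slave_addr;

/* Address to report (e.g. in INFO-like output): announced if set. */
const char *slave_reported_ip(const slave_addr *s) {
    assert(s != NULL);
    if (s->announced_ip != NULL && s->announced_ip[0] != '\0')
        return s->announced_ip;
    return s->peer_ip;
}

/* Port to report: announced if set, otherwise the detected one. */
int slave_reported_port(const slave_addr *s) {
    assert(s != NULL);
    return s->announced_port != 0 ? s->announced_port : s->listening_port;
}
```

With a slave behind NAT, the announced pair is what other nodes would be given to connect to; without an announcement the behaviour is unchanged.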
* Replication: when possible start RDB saving ASAP. | antirez | 2016-07-22 | 1 | -2/+8
  In a previous commit the replication code was changed in order to centralize the BGSAVE for replication trigger in replicationCron(), however after further testing, the 1 second delay imposed by this change is not acceptable. So now the BGSAVE is only delayed if the AOF rewriting process is active. However past commits made sure that replicationCron() is always able to trigger the BGSAVE when needed, making the code generally more robust.
  The new code is more similar to the initial @oranagra patch where the BGSAVE was delayed only if an AOF rewrite was in progress.
  Trivia: delaying the BGSAVE uncovered a minor Sentinel issue that is now fixed.
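A minimal sketch of the decision described above, under stated assumptions: the enum, the struct, and `decide_bgsave()` are hypothetical names, and treating an already running RDB child as a reason to defer is an assumption added here, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    BGSAVE_START_NOW,     /* start the RDB save right away */
    BGSAVE_DEFER_TO_CRON  /* let the periodic cron retry later */
} bgsave_decision;

/* Hypothetical snapshot of the relevant server state. */
typedef struct {
    int aof_rewrite_in_progress; /* non-zero while an AOF rewrite runs */
    int rdb_child_in_progress;   /* assumption: another save is running */
} save_state;

/* Delay only when a background child makes starting impossible. */
bgsave_decision decide_bgsave(const save_state *s) {
    assert(s != NULL);
    if (s->aof_rewrite_in_progress || s->rdb_child_in_progress)
        return BGSAVE_DEFER_TO_CRON;
    return BGSAVE_START_NOW;
}
```

The deferred branch relies on the cron path always being able to trigger the save later, which is the robustness property the message mentions.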
* Replication: start BGSAVE for replication always in replicationCron(). | antirez | 2016-07-21 | 1 | -12/+15
  This makes the replication code conceptually simpler by removing the synchronous BGSAVE trigger in syncCommand(). This also means that socket and disk BGSAVE targets are handled by the same code.
* Centralize slave replication handshake aborting. | antirez | 2015-12-03 | 1 | -22/+23
  Now we have a single function to call in any state of the slave handshake, instead of using different functions for different states, which is error prone. Change performed in the context of issue #2479 but does not fix it, since it should be functionally identical to the past. Just an attempt to make replication.c simpler to follow.
* PR 2813 fix ported to unstable. | antirez | 2015-10-15 | 1 | -20/+23
* Lazyfree: cond vars to enable/disable it based on DEL context. | antirez | 2015-10-02 | 1 | -1/+4
* Lazyfree: ability to free whole DBs in background. | antirez | 2015-10-01 | 1 | -1/+1
* Refactoring: unlinkClient() added to lower freeClient() complexity. | antirez | 2015-09-30 | 1 | -22/+2
* Refactoring: new function to test if client has pending output. | antirez | 2015-09-30 | 1 | -2/+2
* Avoid installing the client write handler when possible. | antirez | 2015-09-30 | 1 | -0/+6
* Log client details on SLAVEOF command having an effect. | antirez | 2015-08-21 | 1 | -3/+8
* startBgsaveForReplication(): handle waiting slaves state change. [psync-fixes] | antirez | 2015-08-20 | 1 | -47/+60
  Before this commit, after triggering a BGSAVE it was up to the caller of startBgsaveForReplication() to handle slaves in WAIT_BGSAVE_START in order to update them accordingly. However when the replication target is the socket, this is not possible since the process of updating the slaves and sending the FULLRESYNC reply must be coupled with the process of starting an RDB save (the reason is, we need to send the FULLRESYNC reply and spawn a child that will start to send RDB data to the slaves ASAP).

  This commit moves the responsibility of handling slaves in WAIT_BGSAVE_START to startBgsaveForReplication() so that for both diskless and disk-based replication we have the same chain of responsibility. In order to accommodate such a change, syncCommand() also needs to put the client in the slave list ASAP (just after the initial checks) and not at the end, so that startBgsaveForReplication() can find the new slave already in the list.

  Another related change is what happens if the BGSAVE fails because of fork() or other errors: we now remove the slave from the list of slaves and send an error, scheduling the slave connection to be terminated.

  As a side effect of this change the following errors found by Oran Agra are fixed (thanks!):

  1. rdbSaveToSlavesSockets() on failed fork will get the slaves cleaned up, otherwise they remain in a wrong state forever since we set them up for full resync before actually trying to fork.
  2. updateSlavesWaitingBgsave() with replication target set as "socket" was broken since the function changed the slaves state from WAIT_BGSAVE_START to WAIT_BGSAVE_END via replicationSetupSlaveForFullResync(), so later rdbSaveToSlavesSockets() will not find any slave in the right state (WAIT_BGSAVE_START) to feed.
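The chain of responsibility described above can be sketched as a single function that owns the state change of waiting slaves, for both the success and the failure path. This is a simplified model, not the real function: the enum values, the `fork_ok` parameter standing in for the outcome of starting the save, and the `CLOSE_ASAP` state are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified slave states; names are illustrative only. */
typedef enum {
    WAIT_BGSAVE_START, /* waiting for a BGSAVE to begin */
    WAIT_BGSAVE_END,   /* attached to the running BGSAVE */
    ONLINE,            /* receiving the normal replication stream */
    CLOSE_ASAP         /* error sent, connection scheduled to close */
} slave_state;

/* Model of a BGSAVE start: every slave waiting for the start is
 * updated here, not by the caller. fork_ok says whether the save
 * could actually be started. Returns the number of slaves handled. */
int start_bgsave_for_replication(slave_state *slaves, size_t n, int fork_ok) {
    int handled = 0;
    assert(slaves != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (slaves[i] != WAIT_BGSAVE_START) continue;
        slaves[i] = fork_ok ? WAIT_BGSAVE_END : CLOSE_ASAP;
        handled++;
    }
    return handled;
}
```

Because the failure branch lives in the same place as the success branch, no waiting slave can be left behind in a state that claims a full resync is under way when the fork never happened.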
* slaveTryPartialResynchronization and syncWithMaster: better synergy. | antirez | 2015-08-07 | 1 | -14/+16
  It is simpler if removing the read event handler from the FD is up to slaveTryPartialResynchronization; after all, it is only called in the context of syncWithMaster.
  This commit also makes sure that on error all the event handlers are removed from the socket before closing it.
* syncWithMaster(): non blocking state machine. [statemachine] | antirez | 2015-08-06 | 1 | -80/+193
* startBgsaveForReplication(): log what you really do. | antirez | 2015-08-06 | 1 | -2/+3
* Replication: add REPLCONF CAPA EOF support. [slaves_capa] | antirez | 2015-08-06 | 1 | -11/+45
  Add the concept of slave capabilities to Redis: the slave now presents itself to the Redis master with a set of capabilities in the form:

    REPLCONF capa SOMECAPA capa OTHERCAPA ...

  This has the effect of setting slave->slave_capa with the corresponding SLAVE_CAPA macros that the master can test later to understand if the slave will understand certain formats and protocols of the replication process. This makes it much simpler to introduce new replication capabilities in the future in a way that doesn't break old slaves or masters.

  This patch was designed and implemented together with Oran Agra (@oranagra).
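To make the capability mechanism concrete, here is a minimal sketch of how option/value pairs such as `capa eof` could be folded into a bit mask. The `CAPA_*` macros, their bit values, and `parse_capa()` are hypothetical and stand in for the SLAVE_CAPA macros mentioned above; ignoring unknown capability names is the behaviour assumed here so that newer slaves do not break older masters.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h> /* strcasecmp (POSIX) */

#define CAPA_NONE 0
#define CAPA_EOF  (1 << 0) /* hypothetical bit: understands the EOF format */

/* argv holds the arguments that follow "REPLCONF", as option/value
 * pairs. Returns the bit mask of the capabilities that were understood. */
int parse_capa(int argc, const char **argv) {
    int capa = CAPA_NONE;
    assert(argc >= 0 && argc % 2 == 0);
    for (int i = 0; i + 1 < argc; i += 2) {
        if (strcasecmp(argv[i], "capa") != 0) continue; /* other option */
        if (strcasecmp(argv[i + 1], "eof") == 0) capa |= CAPA_EOF;
        /* Capabilities this side does not know are silently skipped. */
    }
    return capa;
}
```

A master would later test the mask, for example `if (capa & CAPA_EOF)`, before choosing a transfer format the slave must be able to parse.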
* Fix replication slave pings period. | antirez | 2015-08-05 | 1 | -20/+26
  For PINGs we use the period configured by the user, but for the newlines of slaves waiting for an RDB to be created (including slaves waiting for the FULLRESYNC reply) we need to ping with a frequency of 1 second, since the timeout is fixed and needs to be refreshed.
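The two cadences described above can be sketched like this, assuming a cron that fires once per second. The function names, the tick counter, and the simplified state enum are hypothetical; the point is only that PINGs follow the user-configured period while keepalive newlines go out on every tick.

```c
#include <assert.h>

/* Simplified slave states; names are illustrative only. */
typedef enum {
    SLAVE_WAIT_BGSAVE_START,
    SLAVE_WAIT_BGSAVE_END,
    SLAVE_SEND_BULK,
    SLAVE_ONLINE
} slave_state;

/* cron_ticks counts invocations of a once-per-second cron.
 * PINGs use the user-configured period, in seconds. */
int should_send_ping(long cron_ticks, int ping_period) {
    assert(ping_period > 0);
    return cron_ticks % ping_period == 0;
}

/* Slaves still waiting for the RDB get a newline on every tick,
 * regardless of the configured PING period. */
int should_send_newline(slave_state st) {
    return st == SLAVE_WAIT_BGSAVE_START || st == SLAVE_WAIT_BGSAVE_END;
}
```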
* Make sure we re-emit SELECT after each new slave full sync setup. | antirez | 2015-08-05 | 1 | -15/+24
  In previous commits we moved the FULLRESYNC to the moment we start the BGSAVE, so that the offset we provide is the right one. However this also means that we need to re-emit the SELECT statement every time a new slave starts to accumulate the changes.
  To obtain this effect in a cleaner way, the function that sends the FULLRESYNC reply was given the more important role of also doing this and changing the slave state. So it was renamed to replicationSetupSlaveForFullResync() to better reflect what it does now.
* Don't send SELECT to slaves in WAIT_BGSAVE_START state. | antirez | 2015-08-05 | 1 | -0/+1
* syncCommand() comments improved. | antirez | 2015-08-05 | 1 | -1/+8
* PSYNC initial offset fix. | antirez | 2015-08-04 | 1 | -13/+46
  This commit attempts to fix a bug involving PSYNC and diskless replication (currently experimental) found by Yuval Inbar from Redis Labs and that was later found to have even more far reaching effects (the bug also exists when diskstore is off).

  The gist of the bug is that a Redis master replies with +FULLRESYNC to a PSYNC attempt that fails and requires a full resynchronization. However, the baseline offset sent along with FULLRESYNC was always the current master replication offset. This is not ok, because there are many reasons that may delay the RDB file creation. And... guess what, the master offset we communicate must be the one of the time the RDB was created. So for example:

  1) When the BGSAVE for replication is delayed since there is one already but it is not good for replication.
  2) When the BGSAVE is not needed as we attach to one currently ongoing.
  3) When because of diskless replication the BGSAVE is delayed.

  In all the above cases the PSYNC reply is wrong and the slave may reconnect later claiming to need a wrong offset: this may cause data corruption later.
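The offset problem can be illustrated with a small model: the offset attached to a slave must be sampled when the RDB save starts, not when the slave first asks for a sync. All types and function names below are hypothetical and only model the timing, not the real data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal model of the two sides. */
typedef struct { long long repl_offset; } master;
typedef struct {
    int waiting;                     /* waiting for a BGSAVE to start */
    long long psync_initial_offset;  /* offset to announce, -1 if unset */
} slave;

/* The slave asks for a sync: nothing is announced yet. */
void slave_requests_full_sync(slave *s) {
    assert(s != NULL);
    s->waiting = 1;
    s->psync_initial_offset = -1;
}

/* Writes keep arriving while the BGSAVE is delayed. */
void master_propagates(master *m, long long nbytes) {
    assert(m != NULL && nbytes >= 0);
    m->repl_offset += nbytes;
}

/* Only now is the offset sampled: it matches the dataset snapshot. */
void bgsave_started(const master *m, slave *s) {
    assert(m != NULL && s != NULL && s->waiting);
    s->psync_initial_offset = m->repl_offset;
    s->waiting = 0;
}
```

If the offset had been captured in `slave_requests_full_sync()`, the writes propagated during the delay would be counted neither in the snapshot offset nor replayed correctly on a later partial resync.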
* Force slaves to resync after unsuccessful PSYNC. | antirez | 2015-07-28 | 1 | -6/+10
  Using chained replication where C is slave of B which is in turn slave of A, if B reconnects the replication link with A but discovers it is no longer possible to PSYNC, slaves of B must be disconnected and PSYNC not allowed, since the new B dataset may be completely different after the synchronization with the master.

  Note that there are various semantic differences in the way this is handled now compared to the past. In the past the semantics were:

  1. When a slave lost connection with its master, it disconnected the chained slaves ASAP. This is not needed since after a successful PSYNC with the master, the slaves can continue and don't need to resync in turn.
  2. However after a failed PSYNC the replication backlog was not reset, so a slave was able to PSYNC successfully even if the instance did a full sync with its master, containing now an entirely different data set.

  Now instead chained slaves are not disconnected when the slave loses the connection with its master, but only when it is forced to full SYNC with its master. This means that if the slave having chained slaves does a successful PSYNC all its slaves can continue without troubles.

  See issue #2694 for more details.
* replicationHandleMasterDisconnection() belongs to replication.c. | antirez | 2015-07-28 | 1 | -0/+14
* RDMF: More consistent define names. | antirez | 2015-07-27 | 1 | -196/+196
* RDMF: REDIS_OK REDIS_ERR -> C_OK C_ERR. | antirez | 2015-07-26 | 1 | -26/+26
* RDMF: redisAssert -> serverAssert. | antirez | 2015-07-26 | 1 | -10/+10
* RDMF: OBJ_ macros for object related stuff. | antirez | 2015-07-26 | 1 | -4/+4
* RDMF: use client instead of redisClient, like Disque. | antirez | 2015-07-26 | 1 | -29/+29
* RDMF: redisLog -> serverLog. | antirez | 2015-07-26 | 1 | -81/+81
* RDMF (Redis/Disque merge friendliness) refactoring WIP 1. | antirez | 2015-07-26 | 1 | -1/+1
* Use best effort address binding to connect to the master | antirez | 2015-06-11 | 1 | -1/+1
  We usually want to reach the master using the address of the interface Redis is bound to (via the "bind" config option). That's useful since the master will get (and publish) the slave address by getting the peer name of the incoming socket connection from the slave.
  However, when this is not possible, for example because the slave is bound to the loopback interface but replicates from a master accessed via an external interface, we want to still connect with the master even from a different interface: in this case it is not really important that the master will provide any other address, while it is vital to be able to replicate correctly.
  Related to issues #2609 and #2612.
* Net: improve prepareClientToWrite() error handling and comments. | antirez | 2015-04-01 | 1 | -3/+4
  When we fail to set up the write handler it does not make sense to keep the client around, since it is missing writes: whether it is a client or a slave, the connection should be terminated ASAP.
  Moreover what the function does exactly with its return value, and in which case the write handler is installed on the socket, was not clear, so the function's comments are improved to make its goals more obvious.
  Also related to #2485.
* fixes to diskless replication. | Oran Agra | 2015-03-31 | 1 | -0/+1
  The master was closing the connection if the RDB transfer took a long time, and also sent PINGs to the slave before it got the initial ACK, in which case the slave wouldn't be able to find the EOF marker.
* Replication: disconnect blocked clients when switching to slave role. | antirez | 2015-03-24 | 1 | -0/+1
  Bug as old as Redis and blocking operations. It's hard to trigger since it only happens on instance role switch, but the results are quite bad since an inconsistency between master and slave is created.

  How to trigger the bug is a good description of the bug itself.

  1. Client does "BLPOP mylist 0" in master.
  2. Master is turned into slave, that replicates from New-Master.
  3. Client does "LPUSH mylist foo" in New-Master.
  4. New-Master propagates write to slave.
  5. Slave receives the LPUSH, the blocked client gets served.

  Now Master "mylist" key has "foo", Slave "mylist" key is empty.

  Highlights:

  * At step "2" above, the client remains attached, basically escaping any check performed during command dispatch: read only slave, in that case.
  * At step "5" the slave (that was the master) serves the blocked client, consuming a list element which is not consumed on the master side.

  This scenario is technically likely to happen during failovers, however since Redis Sentinel already disconnects clients using the CLIENT command when changing the role of the instance, the bug is avoided in Sentinel deployments.

  Closes #2473.
* Replication: put server.master client creation into separated function. | antirez | 2015-02-04 | 1 | -11/+18
* AnetFormatIP(): renamed, commented, now sticks to IP:port format. | antirez | 2014-12-11 | 1 | -1/+1
  A few code style changes + consistent format: not nice for humans but better for parsers.
* Cleanup all IP formatting code | Matt Stancliff | 2014-12-11 | 1 | -1/+1
  Instead of manually checking for strchr(n,':') everywhere, we can use our new centralized IP formatting functions.
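A centralized formatter of the kind described above might look like the sketch below. `format_addr()` is a hypothetical name, not the function from the commit; it keeps the `strchr(ip, ':')` test in one place and brackets addresses that contain a colon (IPv6) so the port separator stays unambiguous for parsers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "ip:port" into buf, or "[ip]:port" when ip contains ':'
 * (an IPv6 literal). Returns what snprintf returns: the length the
 * full string needs, excluding the terminating NUL. */
int format_addr(char *buf, size_t len, const char *ip, int port) {
    assert(buf != NULL && ip != NULL);
    if (strchr(ip, ':') != NULL)
        return snprintf(buf, len, "[%s]:%d", ip, port);
    return snprintf(buf, len, "%s:%d", ip, port);
}
```

Callers then never repeat the colon check themselves, which is the duplication the cleanup commit removes.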
* Network bandwidth tracking + refactoring. | antirez | 2014-12-03 | 1 | -0/+3
  Track bandwidth used by clients and replication (but diskless replication is not tracked since the actual transfer happens in the child process).
  This includes a refactoring that makes tracking new instantaneous metrics simpler.
* Diskless SYNC: fix RDB EOF detection. | antirez | 2014-11-11 | 1 | -4/+19
  RDB EOF detection was relying on the final part of the RDB transfer being a magic 40-byte EOF marker. However as the slave is put online immediately, and because of socket timeouts, the replication stream is actually contiguous with the RDB file.

  This means that to detect the EOF correctly we should either:

  1) Scan all the stream searching for the mark. Sucks CPU-wise.
  2) Start to send the replication stream only after an acknowledgment.
  3) Implement a proper chunked encoding.

  For now solution "2" was picked, so the master does not start to send the stream of commands ASAP in the case of diskless replication. We wait for the first REPLCONF ACK command from the slave, which certifies that the slave correctly loaded the RDB file and is ready to get more data.
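The failure mode is easy to reproduce with a tail-based detector. The sketch below is an assumption-laden illustration, not the code from the commit: `eof_detector`, `eof_init()` and `eof_feed()` are hypothetical names. It remembers only the last 40 bytes received and reports EOF when they equal the marker, which works only while nothing follows the marker on the wire.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EOF_MARK_LEN 40

typedef struct {
    char mark[EOF_MARK_LEN];      /* marker announced before the payload */
    char lastbytes[EOF_MARK_LEN]; /* last 40 bytes received so far */
    size_t total;                 /* total bytes received */
} eof_detector;

void eof_init(eof_detector *d, const char *mark) {
    assert(d != NULL && mark != NULL);
    memcpy(d->mark, mark, EOF_MARK_LEN);
    memset(d->lastbytes, 0, EOF_MARK_LEN);
    d->total = 0;
}

/* Feed one chunk. Returns 1 when the stream currently ENDS with the
 * marker, 0 otherwise. Bytes after the marker defeat the check. */
int eof_feed(eof_detector *d, const char *buf, size_t n) {
    assert(d != NULL && (buf != NULL || n == 0));
    if (n >= EOF_MARK_LEN) {
        /* The chunk alone determines the new tail. */
        memcpy(d->lastbytes, buf + n - EOF_MARK_LEN, EOF_MARK_LEN);
    } else if (n > 0) {
        /* Slide the old tail left by n and append the new bytes. */
        memmove(d->lastbytes, d->lastbytes + n, EOF_MARK_LEN - n);
        memcpy(d->lastbytes + EOF_MARK_LEN - n, buf, n);
    }
    d->total += n;
    return d->total >= EOF_MARK_LEN &&
           memcmp(d->lastbytes, d->mark, EOF_MARK_LEN) == 0;
}
```

If a chunk carries the marker followed by even a few bytes of the command stream, the tail no longer matches and the end of the payload is missed. Holding the command stream back until the slave acknowledges the load avoids that without scanning every byte.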
* Disconnect timedout slave: regression introduced with diskless repl. | antirez | 2014-11-11 | 1 | -2/+3
* Networking: add more outbound IP binding fixes | Matt Stancliff | 2014-10-29 | 1 | -1/+2
  Same as the original bind fixes (we just missed these the first time around).
  This helps Redis not automatically send connections from the first IP on an interface if we are bound to a specific IP address (e.g. with multiple IP aliases on one interface, you want to send from _your_ IP, not from the first IP on the interface).
* Diskless replication: missing listRewind() added. [memsync] | antirez | 2014-10-29 | 1 | -1/+5
  This caused BGSAVE to be triggered a second time without any need when we switch from socket to disk target via the command

    CONFIG SET repl-diskless-sync no

  and there is already a slave waiting for the BGSAVE to start.
  Also comments clarified about what is happening.
* Log slave ip:port in more log messages. | antirez | 2014-10-27 | 1 | -6/+11
* Added a function to get slave name for logs. | antirez | 2014-10-27 | 1 | -9/+26
* Diskless replication: log BGSAVE delay only when it is non-zero. | antirez | 2014-10-27 | 1 | -1/+2
* Diskless sync delay is now configurable. | antirez | 2014-10-27 | 1 | -3/+3
* Remove duplicated log message about starting BGSAVE. | antirez | 2014-10-24 | 1 | -1/+0
* Diskless replication: less debugging printfs around. | antirez | 2014-10-17 | 1 | -1/+0
* rio fdset target: handle short writes. | antirez | 2014-10-17 | 1 | -0/+1
  Even when the socket is set in blocking mode, we can still get short writes when writing to a socket.
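A common way to cope with short writes is to loop until the whole buffer has been handed to the kernel. The helper below is a generic POSIX sketch, not the rio code touched by the commit; `write_all()` is a hypothetical name, and treating a zero-byte result as an error is a choice made here to avoid spinning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Write exactly len bytes to fd, retrying after short writes and
 * EINTR. Returns 0 on success, -1 on error (errno set by write). */
int write_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted: just retry */
            return -1;
        }
        if (n == 0) return -1; /* no progress: give up, do not spin */
        p += n;                /* skip what was already written */
        len -= (size_t)n;
    }
    return 0;
}
```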