From 99a425d0f3b7b00896cb855d5de4ae93be1fe3f0 Mon Sep 17 00:00:00 2001
From: Tian
Date: Tue, 21 Jun 2022 00:17:23 +0800
Subject: Fsync directory while persisting AOF manifest, RDB file, and config file (#10737)

The current process to persist files is to `write` the data, `fsync`, and
`rename` the file. The underlying problem is that the rename may be lost if
a sudden crash, such as a power outage, occurs before the directory has been
persisted.

The article [Ensuring data reaches disk](https://lwn.net/Articles/457667/)
describes a safe way to update a file:

1. create a new temp file (on the same file system!)
2. write data to the temp file
3. fsync() the temp file
4. rename the temp file to the appropriate name
5. fsync() the containing directory

This commit handles CONFIG REWRITE, the AOF manifest, and the RDB file (both
the one written for persistence and the one the replica gets from the
master). It doesn't (yet) handle ACL SAVE and Cluster configs, since these
don't yet follow this pattern.
---
 src/replication.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

(limited to 'src/replication.c')

diff --git a/src/replication.c b/src/replication.c
index 8a40a8197..b929b0460 100644
--- a/src/replication.c
+++ b/src/replication.c
@@ -2151,6 +2151,16 @@ void readSyncBulkPayload(connection *conn) {
         /* Close old rdb asynchronously. */
         if (old_rdb_fd != -1) bioCreateCloseJob(old_rdb_fd);
 
+        /* Sync the directory to ensure rename is persisted */
+        if (fsyncFileDir(server.rdb_filename) == -1) {
+            serverLog(LL_WARNING,
+                "Failed trying to sync DB directory %s in "
+                "MASTER <-> REPLICA synchronization: %s",
+                server.rdb_filename, strerror(errno));
+            cancelReplicationHandshake(1);
+            return;
+        }
+
         if (rdbLoad(server.rdb_filename,&rsi,RDBFLAGS_REPLICATION) != C_OK) {
             serverLog(LL_WARNING,
                 "Failed trying to load the MASTER synchronization "
-- 
cgit v1.2.1
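
The five-step update sequence from the commit message can be sketched as a standalone POSIX C helper. This is only an illustration under stated assumptions, not the code in this commit: `replace_file_durably` and `sync_parent_dir` are hypothetical names, the temp-file naming scheme is made up, and the body of Redis's own `fsyncFileDir` is not shown in this diff. The sketch treats any directory `fsync` failure as an error, which a real implementation may want to relax on platforms that cannot sync directories.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fsync the directory that contains `path`, so that a
 * preceding rename() into that directory survives a crash. */
static int sync_parent_dir(const char *path) {
    char copy[PATH_MAX];
    if (strlen(path) >= sizeof(copy)) { errno = ENAMETOOLONG; return -1; }
    strcpy(copy, path);            /* dirname() may modify its argument */
    char *dir = dirname(copy);     /* "." when path has no slash */

    int fd = open(dir, O_RDONLY);
    if (fd == -1) return -1;
    int rc = fsync(fd);
    int saved = errno;
    close(fd);
    errno = saved;                 /* report fsync's errno, not close's */
    return rc;
}

/* Hypothetical helper: replace `path` with `len` bytes from `buf`.
 * Returns 0 on success, -1 with errno set on failure. */
int replace_file_durably(const char *path, const void *buf, size_t len) {
    /* Step 1: temp file next to the target, hence on the same file system. */
    char tmp[PATH_MAX];
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp-%ld", path, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(tmp)) { errno = ENAMETOOLONG; return -1; }

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) return -1;

    /* Step 2: write all the data, retrying on short writes and EINTR. */
    const char *p = buf;
    while (len > 0) {
        ssize_t w = write(fd, p, len);
        if (w == -1) {
            if (errno == EINTR) continue;
            goto fail;
        }
        p += w;
        len -= (size_t)w;
    }

    /* Step 3: flush the file contents to stable storage. */
    if (fsync(fd) == -1) goto fail;
    if (close(fd) == -1) { fd = -1; goto fail; }
    fd = -1;

    /* Step 4: atomically move the temp file over the target name. */
    if (rename(tmp, path) == -1) goto fail;

    /* Step 5: persist the directory entry created by the rename. */
    return sync_parent_dir(path);

fail: {
        int saved = errno;
        if (fd != -1) close(fd);
        unlink(tmp);               /* best-effort cleanup of the temp file */
        errno = saved;
        return -1;
    }
}
```

A caller would invoke something like `replace_file_durably("dump.rdb", data, data_len)` and treat a `-1` return as "the new contents may not be durable", which mirrors how the patched code cancels the replication handshake when the directory sync fails.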