author | Binbin <binloveplay1314@qq.com> | 2023-03-29 20:17:05 +0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2023-03-29 15:17:05 +0300 |
commit | cb1717865804fdb0561e728d2f3a0a1138099d9d (patch) | |
tree | ecd347bc55b44b5891a3ae4fd23fb8a2a41dffe5 /tests | |
parent | 557ca05d059951f618aab5fcba727fa19ecad729 (diff) | |
download | redis-cb1717865804fdb0561e728d2f3a0a1138099d9d.tar.gz |
Fix fork done handler wrongly update fsync metrics and enhance AOF_FSYNC_ALWAYS (#11973)
This PR fixes several unrelated bugs that were discovered by the same set of tests
(the WAITAOF tests in #11713) and that could make the `WAITAOF` test hang.
The change in `backgroundRewriteDoneHandler` is about MP-AOF.
That leftover / old code assumes that we started a new AOF file just now
(when we have a new base into which we're gonna incrementally write), but
the fact is that with MP-AOF, the fork done handler doesn't really affect the
incremental file being maintained by the parent process, there's no reason to
re-issue `SELECT`, and no reason to update any of the fsync variables in that flow.
This should have been deleted with MP-AOF (introduced in #9788, 7.0).
The damage is that the update to `aof_fsync_offset` causes us to miss an fsync
in `flushAppendOnlyFile`. This happens if write commands stop arriving under `AOF_FSYNC_EVERYSEC`
while an AOFRW is in progress. This caused a new `WAITAOF` test to sometimes hang forever.
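The missed fsync can be illustrated with a toy model. This is a sketch under assumptions, not Redis source: the struct and function names are hypothetical, and the "old" handler is reduced to the one effect described above, namely marking everything written so far as already fsynced.

```c
#include <stdbool.h>

/* Toy model (hypothetical names), keeping only the two counters that matter. */
typedef struct {
    long long written;      /* bytes written to the AOF by the parent */
    long long fsync_offset; /* bytes known to have been fsynced */
} AofModel;

/* The catch-up check: an fsync is still owed while the two counters differ. */
static bool fsync_pending(const AofModel *m) {
    return m->fsync_offset != m->written;
}

/* Old behaviour, simplified: the fork done handler acts as if a new AOF file
 * had just been started and resets the fsync bookkeeping. */
static void rewrite_done_old(AofModel *m) {
    m->fsync_offset = m->written;
}

/* Fixed behaviour, simplified: the handler leaves the parent's fsync
 * bookkeeping untouched. */
static void rewrite_done_fixed(AofModel *m) {
    (void)m;
}
```

With 100 bytes written and 40 fsynced, `rewrite_done_old` makes `fsync_pending` return false although 60 bytes were never fsynced. If no further writes arrive, nothing triggers the fsync again, which is the hang described above.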
Also because of MP-AOF, we needed to change `aof_fsync_offset` to `aof_last_incr_fsync_offset`
and compare it against `aof_last_incr_size` in `flushAppendOnlyFile`. In the past we compared
`aof_fsync_offset` with `aof_current_size`, but with MP-AOF the total AOF size can become
smaller after an AOFRW while the (already existing) incr file still has data that needs to be fsynced.
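A small worked example shows why an offset tracked against the total size stops being meaningful once the base file shrinks. This is a sketch under assumptions: the struct, the helper names, and the byte counts are all made up for illustration and do not come from the Redis source.

```c
/* Toy model (hypothetical names): one BASE file plus one INCR file. */
typedef struct {
    long long base_size;   /* size of the BASE file */
    long long incr_size;   /* bytes written to the current INCR file */
    long long incr_synced; /* bytes of the INCR file already fsynced */
} AofState;

static long long total_size(const AofState *s) {
    return s->base_size + s->incr_size;
}

/* Bytes still owed an fsync, measured within the INCR file only. */
static long long unsynced_incr(const AofState *s) {
    return s->incr_size - s->incr_synced;
}

/* The same question asked against the total size, using an offset that was
 * recorded in "total size" coordinates. */
static long long unsynced_total(const AofState *s, long long total_offset) {
    return total_size(s) - total_offset;
}
```

Take a base of 900 bytes and an incr file with 60 bytes written and 20 fsynced: the total-based offset is 920 and both helpers report 40 unsynced bytes. If a rewrite then compacts the base to 400 bytes, `unsynced_total` yields 460 - 920 = -460, while `unsynced_incr` still correctly reports 40.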
The change in `flushAppendOnlyFile` concerning `AOF_FSYNC_ALWAYS` follows up on #6053
(the details are in #5985): we now also check `AOF_FSYNC_ALWAYS`, to handle the case where
appendfsync is changed from everysec to always while there is data that has been written but not yet fsynced.
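The decision taken when there is nothing new to write can be sketched as a pure function. This is a simplified model under assumptions, not the actual `flushAppendOnlyFile` code: the enum, the function name, and the parameters are hypothetical stand-ins for the server state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the appendfsync policies. */
typedef enum { FSYNC_NO, FSYNC_EVERYSEC, FSYNC_ALWAYS } FsyncPolicy;

/* Decide whether an fsync is still needed although the write buffer is empty. */
static bool need_fsync_with_empty_buffer(FsyncPolicy policy,
                                         long long incr_size,
                                         long long incr_fsync_offset,
                                         long long now_sec,
                                         long long last_fsync_sec,
                                         bool sync_in_progress)
{
    /* Everything written to the INCR file is already on disk. */
    if (incr_fsync_offset == incr_size) return false;

    /* Covers the switch from everysec to always with unsynced data left over. */
    if (policy == FSYNC_ALWAYS) return true;

    /* Everysec: at most one fsync per second, and never while a background
     * fsync is still running. */
    if (policy == FSYNC_EVERYSEC)
        return now_sec > last_fsync_sec && !sync_in_progress;

    return false;
}
```

Without the `FSYNC_ALWAYS` branch, data written under everysec and left unsynced at the moment of the policy switch would stay unsynced until the next write command arrived.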
Diffstat (limited to 'tests')
-rw-r--r-- | tests/unit/wait.tcl | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/tests/unit/wait.tcl b/tests/unit/wait.tcl
index 08a7a71f6..af13a3374 100644
--- a/tests/unit/wait.tcl
+++ b/tests/unit/wait.tcl
@@ -175,7 +175,49 @@ tags {"wait aof network external:skip"} {
             $replica config set appendfsync everysec
 
             test {WAITAOF replica copy everysec} {
+                $replica config set appendfsync everysec
+                waitForBgrewriteaof $replica ;# Make sure there is no AOFRW
+
+                $master incr foo
+                assert_equal [$master waitaof 0 1 0] {1 1}
+            }
+
+            test {WAITAOF replica copy everysec with AOFRW} {
+                $replica config set appendfsync everysec
+
+                # When we trigger an AOFRW, a fsync is triggered when closing the old INCR file,
+                # so with the everysec, we will skip that second of fsync, and in the next second
+                # after that, we will eventually do the fsync.
+                $replica bgrewriteaof
+                waitForBgrewriteaof $replica
+
+                $master incr foo
+                assert_equal [$master waitaof 0 1 0] {1 1}
+            }
+
+            test {WAITAOF replica copy everysec with slow AOFRW} {
+                $replica config set appendfsync everysec
+                $replica config set rdb-key-save-delay 1000000 ;# 1 sec
+
+                $replica bgrewriteaof
+
+                $master incr foo
+                assert_equal [$master waitaof 0 1 0] {1 1}
+
+                $replica config set rdb-key-save-delay 0
+                waitForBgrewriteaof $replica
+            }
+
+            test {WAITAOF replica copy everysec->always with AOFRW} {
+                $replica config set appendfsync everysec
+
+                # Try to fit all of them in the same round second, although there's no way to guarantee
+                # that, it can be done on fast machine. In any case, the test shouldn't fail either.
+                $replica bgrewriteaof
                 $master incr foo
+                waitForBgrewriteaof $replica
+                $replica config set appendfsync always
+
                 assert_equal [$master waitaof 0 1 0] {1 1}
             }
 