Fix races in swapdb async_loading test (#11613)

There is a race in the test:
```
*** [err]: Diskless load swapdb (async_loading): new database is exposed after swapping in tests/integration/replication.tcl
Expected 'myvalue' to be equal to '' (context: type eval line 3 cmd {assert_equal [$replica GET mykey] ""} proc ::test)
```

When doing `$replica GET mykey`, the replica is still using the old database.

The reason may be that when doing `master client kill type replica`, the replica has not yet realized it got disconnected from the master. So the master_link_status check does not catch the disconnection (the link can still look up if we check too early), and the replica has not finished the swapdb and the loading.

In that case, I think the solution is to check the sync_full stat on the master and wait for it to get incremented from the previous value. That is, the way to know that we're done with the full sync is not to check that our state is up (it could be up if we check too early), but rather to check that the sync_full counter got incremented.

During review, we found another race. In the Aborted testType, `$master config set rdb-key-save-delay 10000` is done after we have already initiated the disconnection, so there's a chance that the replica will attempt to reconnect before that call. If the master fork()s before the call, the config will not take effect. Move it to above the disconnection.

Co-authored-by: Oran Agra <oran@redislabs.com>
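The difference between the two checks described above can be illustrated with a small standalone simulation. This is only a sketch in plain Tcl, not Redis test-suite code: the `state` dict, the `step` and `wait_until` procs, and the tick numbers at which events happen are all hypothetical stand-ins chosen for illustration.

```tcl
# Hypothetical stand-in for the replication state; not Redis code.
# link_status stays "up" until the replica notices it was disconnected.
set state [dict create link_status up sync_full 1 tick 0]

# Advance the simulated world by one tick (the tick numbers are arbitrary):
#   tick 2: the replica notices the kill, the link goes down
#   tick 4: the master starts a new full sync, sync_full is incremented
#   tick 6: the link is up again
proc step {stateVar} {
    upvar 1 $stateVar st
    dict incr st tick
    switch -- [dict get $st tick] {
        2 { dict set st link_status down }
        4 { dict incr st sync_full }
        6 { dict set st link_status up }
    }
}

# Poll until the expression in cond (evaluated in the caller's scope) is
# true, advancing the simulation between polls. Returns the tick at which
# the condition was first seen to hold.
proc wait_until {stateVar maxtries cond} {
    upvar 1 $stateVar st
    for {set i 0} {$i < $maxtries} {incr i} {
        if {[uplevel 1 [list expr $cond]]} {
            return [dict get $st tick]
        }
        step st
    }
    error "condition never became true"
}

# Status check: passes immediately, because the replica has not yet
# noticed the disconnection and the link still looks up.
set naive_tick [wait_until state 10 {[dict get $state link_status] eq "up"}]

# Counter check: snapshot the counter first, then wait for it to grow.
set before [dict get $state sync_full]
set counter_tick [wait_until state 10 {[dict get $state sync_full] > $before}]

puts "status check returned at tick $naive_tick"
puts "counter check returned at tick $counter_tick"
```

In this toy model the status check returns before anything has happened, while the counter check cannot return until the simulated master has actually begun a new full sync. The snapshot has to be taken before the disconnection is triggered, which matches where the patch reads `sync_full`.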
author: Binbin <binloveplay1314@qq.com> 2022-12-13 13:59:43 +0800
committer: GitHub <noreply@github.com> 2022-12-13 07:59:43 +0200
commit: 5f69ce0d8e43607da0239cdd64e015100ec00046 (patch)
tree: 059b18b4855d871d4d0774d0544b10fc94b15b58 /tests/integration
parent: cd12cc2f545cea5e50be1748de4d2316841e19b8 (diff)
download: redis-5f69ce0d8e43607da0239cdd64e015100ec00046.tar.gz
1 files changed, 17 insertions, 4 deletions
diff --git a/tests/integration/replication.tcl b/tests/integration/replication.tcl
index bb06e18ac..435e8fcde 100644
--- a/tests/integration/replication.tcl
+++ b/tests/integration/replication.tcl
@@ -569,6 +569,15 @@ foreach testType {Successful Aborted} {
                 redis.register_function('test', function() return 'hello2' end)
             }
 
+            # Remember the sync_full stat before the client kill.
+            set sync_full [s 0 sync_full]
+
+            if {$testType == "Aborted"} {
+                # Set master with a slow rdb generation, so that we can easily intercept loading
+                # 10ms per key, with 2000 keys is 20 seconds
+                $master config set rdb-key-save-delay 10000
+            }
+
             # Force the replica to try another full sync (this time it will have matching master replid)
             $master multi
             $master client kill type replica
@@ -579,12 +588,16 @@ foreach testType {Successful Aborted} {
             }
             $master exec
 
+            # Wait for sync_full to get incremented from the previous value.
+            # After the client kill, make sure we do a reconnect, and do a FULL SYNC.
+            wait_for_condition 100 100 {
+                [s 0 sync_full] > $sync_full
+            } else {
+                fail "Master <-> Replica didn't start the full sync"
+            }
+
             switch $testType {
                 "Aborted" {
-                    # Set master with a slow rdb generation, so that we can easily intercept loading
-                    # 10ms per key, with 2000 keys is 20 seconds
-                    $master config set rdb-key-save-delay 10000
-
                     test {Diskless load swapdb (async_loading): replica enter async_loading} {
                         # Wait for the replica to start reading the rdb
                         wait_for_condition 100 100 {