summaryrefslogtreecommitdiff
path: root/tests/integration
diff options
context:
space:
mode:
authorMeir Shpilraien (Spielrein) <meir@redis.com>2022-08-24 12:51:36 +0300
committerGitHub <noreply@github.com>2022-08-24 12:51:36 +0300
commitc1bd61a4a50225982d4a9bc7c66fec913b0c9af2 (patch)
tree7dbf99f4377f69ccd1ded7b82f8a889305a76674 /tests/integration
parent41d9eb0291417c36d694894c936d8f4f29ec5504 (diff)
downloadredis-c1bd61a4a50225982d4a9bc7c66fec913b0c9af2.tar.gz
Reverts most of the changes of #10969 (#11178)
The PR reverts the changes made on #10969. The reason for revert was trigger because of occasional test failure that started after the PR was merged. The issue is that if there is a lazy expire during the command invocation, the `del` command is added to the replication stream after the command placeholder. So the logical order on the primary is: * Delete the key (lazy expiration) * Command invocation But the replication stream gets it the other way around: * Command invocation (because the command is written into the placeholder) * Delete the key (lazy expiration) So if the command write to the key that was just lazy expired we will get inconsistency between primary and replica. One solution we considered is to add another lazy expire replication stream and write all the lazy expire there. Then when replicating, we will replicate the lazy expire replication stream first. This will solve this specific test failure but we realize that the issues does not ends here and the more we dig the more problems we find.One of the example we thought about (that can actually crashes Redis) is as follow: * User perform SINTERSTORE * When Redis tries to fetch the second input key it triggers lazy expire * The lazy expire trigger a module logic that deletes the first input key * Now Redis hold the robj of the first input key that was actually freed We believe we took the wrong approach and we will come up with another PR that solve the problem differently, for now we revert the changes so we will not have the tests failure. Notice that not the entire code was revert, some parts of the PR are changes that we would like to keep. The changes that **was** reverted are: * Saving a placeholder for replication at the beginning of the command (`call` function) * Order of the replication stream on active expire and eviction (we will decide how to handle it correctly on follow up PR) * `Spop` changes are no longer needed (because we reverted the placeholder code) Changes that **was not** reverted: * On expire/eviction, wrap the `del` and the notification effect in a multi exec. * `PropagateNow` function can still accept a special dbid, -1, indicating not to replicate select. * Keep optimisation for reusing the `alsoPropagate` array instead of allocating it each time. Tests: * All tests was kept and only few tests was modify to work correctly with the changes * Test was added to verify that the revert fixes the issues.
Diffstat (limited to 'tests/integration')
-rw-r--r--tests/integration/replication.tcl31
1 files changed, 31 insertions, 0 deletions
diff --git a/tests/integration/replication.tcl b/tests/integration/replication.tcl
index 457c3150e..11849a5b6 100644
--- a/tests/integration/replication.tcl
+++ b/tests/integration/replication.tcl
@@ -1355,3 +1355,34 @@ start_server {tags {"repl" "external:skip"}} {
assert_equal "PONG" [r ping]
}
}
+
+start_server {tags {"repl external:skip"}} {
+ set master [srv 0 client]
+ set master_host [srv 0 host]
+ set master_port [srv 0 port]
+ $master debug SET-ACTIVE-EXPIRE 0
+ start_server {} {
+ set slave [srv 0 client]
+ $slave debug SET-ACTIVE-EXPIRE 0
+ $slave slaveof $master_host $master_port
+
+ test "Test replication with lazy expire" {
+ # wait for replication to be in sync
+ wait_for_condition 50 100 {
+ [lindex [$slave role] 0] eq {slave} &&
+ [string match {*master_link_status:up*} [$slave info replication]]
+ } else {
+ fail "Can't turn the instance into a replica"
+ }
+
+ $master sadd s foo
+ $master pexpire s 1
+ after 10
+ $master sadd s foo
+ assert_equal 1 [$master wait 1 1]
+
+ assert_equal "set" [$master type s]
+ assert_equal "set" [$slave type s]
+ }
+ }
+}