author Dumitru Ceara <dceara@redhat.com> 2021-12-13 20:46:03 +0100
committer Ilya Maximets <i.maximets@ovn.org> 2021-12-13 21:52:59 +0100
commit bf07cc9cdb2f37fede8c0363937f1eb9f4cfd730
tree 2a7f9a94ffe8e664cf336750c876e259cd1b2189 /ovsdb
parent 20a4f546f7db197a9652a79c4e8edac972a16084
raft: Only allow followers to snapshot.
Commit 3c2d6274bcee ("raft: Transfer leadership before creating
snapshots.") made it such that raft leaders transfer leadership before
snapshotting.

However, there's still the case when the next leader to be is in the
process of snapshotting. To avoid delays in that case too, we now
explicitly allow snapshots only on followers. Cluster members will have
to wait until the current election is settled before snapshotting.

Given the following logs taken from an OVN_Southbound 3-server cluster
during a scale test:

S1 (old leader):
  19:07:51.226Z|raft|INFO|Transferring leadership to write a snapshot.
  19:08:03.830Z|ovsdb|INFO|OVN_Southbound: Database compaction took 12601ms
  19:08:03.940Z|raft|INFO|server 8b8d is leader for term 43

S2 (follower):
  19:08:00.870Z|raft|INFO|server 8b8d is leader for term 43

S3 (new leader):
  19:07:51.242Z|raft|INFO|received leadership transfer from f5c9 in term 42
  19:07:51.244Z|raft|INFO|term 43: starting election
  19:08:00.805Z|ovsdb|INFO|OVN_Southbound: Database compaction took 9559ms
  19:08:00.869Z|raft|INFO|term 43: elected leader by 2+ of 3 servers

We see that the leader to be (S3) receives the leadership transfer,
initiates the election and immediately after starts a snapshot that
takes ~9.5 seconds. During this time, S2 votes for S3 electing it as
cluster leader but S3 doesn't effectively become leader until it
finishes snapshotting, essentially keeping the cluster without a leader
for up to ~9.5 seconds.

With the current change, S3 will delay compaction and snapshotting until
the election is finished.

The only exception is the case of single-node clusters for which we
allow the node to snapshot regardless of role.

Acked-by: Han Zhou <hzhou@ovn.org>
Signed-off-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Diffstat (limited to 'ovsdb')
-rw-r--r-- ovsdb/raft.c 2
1 files changed, 1 insertions, 1 deletions
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index ce40c5bc0..1a3447a8d 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -4226,7 +4226,7 @@ raft_may_snapshot(const struct raft *raft)
             && !raft->leaving
             && !raft->left
             && !raft->failed
-            && raft->role != RAFT_LEADER
+            && (raft->role == RAFT_FOLLOWER || hmap_count(&raft->servers) == 1)
             && raft->last_applied >= raft->log_start);
 }
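
To illustrate the effect of the one-line change, here is a minimal standalone sketch that compares the old and new role checks. It is not the code from ovsdb/raft.c: struct sketch_raft, enum sketch_role, the n_servers field (standing in for hmap_count(&raft->servers)) and all sketch_* function names are hypothetical, and only the conditions visible in the hunk above are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real types in ovsdb/raft.c. */
enum sketch_role { SKETCH_FOLLOWER, SKETCH_CANDIDATE, SKETCH_LEADER };

struct sketch_raft {
    bool leaving, left, failed;
    enum sketch_role role;
    size_t n_servers;        /* Assumption: models hmap_count(&raft->servers). */
    uint64_t last_applied;
    uint64_t log_start;
};

/* Conditions from the hunk that are unchanged by the commit. */
static bool
sketch_common_ok(const struct sketch_raft *raft)
{
    return (!raft->leaving
            && !raft->left
            && !raft->failed
            && raft->last_applied >= raft->log_start);
}

/* Old check: anything except a leader may snapshot, including a
 * candidate that is in the middle of an election. */
static bool
sketch_may_snapshot_old(const struct sketch_raft *raft)
{
    return sketch_common_ok(raft) && raft->role != SKETCH_LEADER;
}

/* New check: only followers may snapshot, except in a single-server
 * cluster, where the role does not matter. */
static bool
sketch_may_snapshot_new(const struct sketch_raft *raft)
{
    return sketch_common_ok(raft)
           && (raft->role == SKETCH_FOLLOWER || raft->n_servers == 1);
}

/* Walks through the cases discussed in the commit message. */
static void
sketch_self_check(void)
{
    struct sketch_raft r = {
        .role = SKETCH_CANDIDATE, .n_servers = 3,
        .last_applied = 10, .log_start = 5,
    };

    /* S3 in the logs: a candidate could snapshot before, not anymore. */
    assert(sketch_may_snapshot_old(&r));
    assert(!sketch_may_snapshot_new(&r));

    /* Followers may snapshot under both checks. */
    r.role = SKETCH_FOLLOWER;
    assert(sketch_may_snapshot_old(&r));
    assert(sketch_may_snapshot_new(&r));

    /* A leader of a multi-server cluster may not snapshot under either. */
    r.role = SKETCH_LEADER;
    assert(!sketch_may_snapshot_old(&r));
    assert(!sketch_may_snapshot_new(&r));

    /* Single-server cluster: the new check allows it regardless of role. */
    r.n_servers = 1;
    assert(sketch_may_snapshot_new(&r));
}
```

Calling sketch_self_check() from a test program exercises the four cases. The candidate case is the one that corresponds to S3 in the logs above: under the old check it was free to start compaction during the election, under the new check it has to wait until it is a follower again or the election is settled.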