author: Ilya Maximets <i.maximets@ovn.org> 2020-10-21 03:32:49 +0200
committer: Ilya Maximets <i.maximets@ovn.org> 2020-11-10 01:23:33 +0100
commit: eca34ebd7c418c0351eb92ae615d07edc31a9404 (patch)
tree: 4f46d9ed1b7db198a952958a927c6637e9bedb59 /ovsdb/raft.c
parent: c4bc03d872db5fe6f804fc9ddbbec29e28335cb5 (diff)
raft: Set threshold on backlog for raft connections.

RAFT messages can be fairly big. If something abnormal happens to one of the servers in a cluster, it may not be able to process all the incoming messages in a timely manner. This results in jsonrpc backlog growth on the sender's side. This can happen, for example, if a follower gets many new clients at once that it needs to serve, or if it decides to take a snapshot during a period with a high number of database changes. If the backlog grows large enough, it becomes harder and harder for the follower to process incoming raft messages; it sends outdated replies and starts receiving snapshots and the whole raft log from the leader. Sometimes the backlog grows too high (60GB in this example):

jsonrpc|INFO|excessive sending backlog, jsonrpc: ssl:<ip>, num of msgs: 15370, backlog: 61731060773.

In this case the OS might actually decide to kill the sender to free some memory. In any case, it could take a lot of time for such a server to catch up with the rest of the cluster if it has so much data to receive and process.

Introduce backlog thresholds for jsonrpc connections. If the sending backlog exceeds particular values (500 messages or 4GB in size), the connection will be dropped and re-created. This drops all of the current backlog and lets the connection start over, increasing the chances of cluster recovery.

Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829
Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
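
The threshold rule described in the commit message can be illustrated with a minimal, self-contained sketch. This is not the jsonrpc implementation: `struct sketch_conn`, `sketch_backlog_exceeded()` and `sketch_enqueue()` are hypothetical names invented here, and both the strict "greater than" comparison and the "0 means no limit" convention are assumptions. Only the two limit values are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limit values mirror the RAFT_MAX_BACKLOG_* defines added by this patch. */
#define SKETCH_MAX_BACKLOG_N_MSGS 500
#define SKETCH_MAX_BACKLOG_BYTES  UINT32_MAX

/* Hypothetical stand-in for the send-side state of one connection. */
struct sketch_conn {
    size_t n_msgs;              /* Messages queued but not yet sent. */
    uint64_t backlog_bytes;     /* Bytes queued but not yet sent. */
    unsigned int n_reconnects;  /* Times the connection was dropped. */
};

/* Returns true if either limit is exceeded.  A limit of 0 is treated as
 * "no limit", which is an assumption of this sketch. */
static bool
sketch_backlog_exceeded(const struct sketch_conn *conn,
                        size_t max_n_msgs, uint64_t max_bytes)
{
    return (max_n_msgs && conn->n_msgs > max_n_msgs)
           || (max_bytes && conn->backlog_bytes > max_bytes);
}

/* Queues one message of 'msg_bytes' bytes.  If the backlog then exceeds a
 * threshold, discards the whole backlog and counts a reconnect, modelling
 * "drop and re-create the connection".  Returns false in that case. */
static bool
sketch_enqueue(struct sketch_conn *conn, uint64_t msg_bytes)
{
    conn->n_msgs++;
    conn->backlog_bytes += msg_bytes;

    if (sketch_backlog_exceeded(conn, SKETCH_MAX_BACKLOG_N_MSGS,
                                SKETCH_MAX_BACKLOG_BYTES)) {
        conn->n_msgs = 0;
        conn->backlog_bytes = 0;
        conn->n_reconnects++;
        return false;
    }
    return true;
}
```

Under these assumptions, a peer that stops reading loses its queue once the 501st message is enqueued, and a single backlog as large as the 61731060773 bytes quoted in the log line would trip the byte limit immediately.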
Diffstat (limited to 'ovsdb/raft.c')
-rw-r--r-- ovsdb/raft.c 5
1 files changed, 5 insertions, 0 deletions
diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index f94a3eed8..67c714ff4 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -925,6 +925,9 @@ raft_reset_ping_timer(struct raft *raft)
     raft->ping_timeout = time_msec() + raft->election_timer / 3;
 }
 
+#define RAFT_MAX_BACKLOG_N_MSGS    500
+#define RAFT_MAX_BACKLOG_BYTES     UINT32_MAX
+
 static void
 raft_add_conn(struct raft *raft, struct jsonrpc_session *js,
               const struct uuid *sid, bool incoming)
@@ -940,6 +943,8 @@ raft_add_conn(struct raft *raft, struct jsonrpc_session *js,
     conn->incoming = incoming;
     conn->js_seqno = jsonrpc_session_get_seqno(conn->js);
     jsonrpc_session_set_probe_interval(js, 0);
+    jsonrpc_session_set_backlog_threshold(js, RAFT_MAX_BACKLOG_N_MSGS,
+                                          RAFT_MAX_BACKLOG_BYTES);
 }
 
 /* Starts the local server in an existing Raft cluster, using the local copy of