path: root/sql/slave.cc
author Brandon Nesterenko <brandon.nesterenko@mariadb.com> 2021-10-20 20:13:45 -0600
committer Brandon Nesterenko <brandon.nesterenko@mariadb.com> 2022-04-22 12:59:54 -0600
commit a83c7ab1ea62954ab81cd599315e76a2f115ff92 (patch)
tree f4d90ab228784d1be8ebbbddce661ea6ec05b8ef /sql/slave.cc
parent 807945f2eb5fa22e6f233cc17b85a2e141efe2c8 (diff)
MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state
Problem:
========
If a primary is shut down during an active semi-sync connection
while the primary is awaiting an ACK, the primary hard kills the
active communication thread and does not ensure the transaction
was received by a replica. This can lead to an inconsistent
replication state.

Solution:
========
During shutdown, the primary should wait for an ACK or timeout
before hard killing a thread which is awaiting a communication.
We extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and
ignore any threads waiting for a semi-sync ACK in phase 1. Then,
before stopping the ack receiver thread, the shutdown is delayed
until all waiting semi-sync connections receive an ACK or time
out. The connections are then killed in phase 2.

Notes:
 1) There remains an unresolved corner case that affects this
    patch. MDEV-28141: Slave crashes with Packets out of order
    when connecting to a shutting down master. Specifically, if a
    slave is connecting to a master which is actively shutting
    down, the slave can crash with a "Packets out of order"
    assertion error. To get around this issue in the MTR tests,
    the primary will wait a small amount of time before phase 1
    killing threads to let the replicas safely stop (if
    applicable).
 2) This patch also fixes MDEV-28114: Semi-sync Master ACK
    Receiver Thread Can Error on COM_QUIT

Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
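The shutdown ordering described in the Solution can be sketched as a small single-threaded model. This is only an illustration under stated assumptions, not the code from this patch: `Conn`, `phase1_kill`, `await_acks`, and `phase2_kill` are hypothetical names, and time is simulated as integer ticks rather than real timeouts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a primary-side connection; not a MariaDB type.
struct Conn {
  bool awaiting_ack;  // a semi-sync ACK is still outstanding
  int ack_tick;       // simulated tick at which the ACK arrives (-1: never)
  bool killed;
};

// Phase 1: kill only the connections that are not waiting for an ACK.
inline void phase1_kill(std::vector<Conn> &conns) {
  for (Conn &c : conns)
    if (!c.awaiting_ack) c.killed = true;
}

// Delay shutdown until every waiter has its ACK or `timeout_ticks` elapse.
// Returns the number of connections that timed out without an ACK.
inline std::size_t await_acks(std::vector<Conn> &conns, int timeout_ticks) {
  for (int tick = 0; tick <= timeout_ticks; ++tick) {
    std::size_t waiting = 0;
    for (Conn &c : conns) {
      if (c.awaiting_ack && c.ack_tick >= 0 && c.ack_tick <= tick)
        c.awaiting_ack = false;  // ACK received at or before this tick
      if (c.awaiting_ack) ++waiting;
    }
    if (waiting == 0) return 0;  // nobody is waiting any more
  }
  std::size_t timed_out = 0;
  for (Conn &c : conns) {
    if (c.awaiting_ack) {
      c.awaiting_ack = false;  // give up on this waiter
      ++timed_out;
    }
  }
  return timed_out;
}

// Phase 2: kill every connection that is still alive.
inline void phase2_kill(std::vector<Conn> &conns) {
  for (Conn &c : conns) c.killed = true;
}
```

Calling `phase1_kill`, then `await_acks`, then `phase2_kill` mirrors the order in the commit message: waiters survive phase 1, the shutdown pauses for their ACK or timeout, and only then is everything torn down.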
Diffstat (limited to 'sql/slave.cc')
-rw-r--r-- sql/slave.cc 1
1 files changed, 1 insertions, 0 deletions
diff --git a/sql/slave.cc b/sql/slave.cc
index edea312c5ea..029fd0f5aaf 100644
--- a/sql/slave.cc
+++ b/sql/slave.cc
@@ -4859,6 +4859,7 @@ Stopping slave I/O thread due to out-of-memory error from master");
not cause the slave IO thread to stop, and the error messages are
already reported.
*/
+ DBUG_EXECUTE_IF("simulate_delay_semisync_slave_reply", my_sleep(800000););
(void)repl_semisync_slave.slave_reply(mi);
}
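The added line uses `DBUG_EXECUTE_IF`, which in debug builds runs the given code only when the named keyword has been enabled, here delaying the replica's semi-sync reply so a test can shut the primary down while an ACK is pending. The sketch below imitates that pattern in isolation. It is not the real dbug library: `enabled_keywords`, `SKETCH_EXECUTE_IF`, and `reply_with_optional_delay` are hypothetical, and the delay is recorded instead of slept (treating 800000 as microseconds is an assumption about `my_sleep`).

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the debug keyword registry; not the dbug library.
inline std::set<std::string> &enabled_keywords() {
  static std::set<std::string> keywords;
  return keywords;
}

// Run `code` only when `keyword` is enabled, loosely mimicking DBUG_EXECUTE_IF.
#define SKETCH_EXECUTE_IF(keyword, code)            \
  do {                                              \
    if (enabled_keywords().count(keyword)) { code } \
  } while (0)

// Hypothetical reply path: returns the delay (assumed microseconds) that
// would be injected before the reply is sent, or 0 when the keyword is off.
inline unsigned long reply_with_optional_delay() {
  unsigned long delay_us = 0;
  SKETCH_EXECUTE_IF("simulate_delay_semisync_slave_reply", delay_us = 800000;);
  return delay_us;
}
```

With the keyword absent the reply path is unchanged. Inserting the keyword into `enabled_keywords()` switches the delay on, much as a test would enable the real keyword on a debug build.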