author | Brandon Nesterenko <brandon.nesterenko@mariadb.com> | 2021-10-20 20:13:45 -0600 |
---|---|---|
committer | Brandon Nesterenko <brandon.nesterenko@mariadb.com> | 2022-04-22 12:59:54 -0600 |
commit | a83c7ab1ea62954ab81cd599315e76a2f115ff92 (patch) | |
tree | f4d90ab228784d1be8ebbbddce661ea6ec05b8ef /sql/slave.cc | |
parent | 807945f2eb5fa22e6f233cc17b85a2e141efe2c8 (diff) | |
download | mariadb-git-a83c7ab1ea62954ab81cd599315e76a2f115ff92.tar.gz |
MDEV-11853: semisync thread can be killed after sync binlog but before ACK in the sync state
Problem:
========
If a primary is shut down while a semi-sync connection is active and
the primary is awaiting an ACK, the primary hard kills the active
communication thread without ensuring that the transaction was
received by a replica. This can lead to an inconsistent replication
state.
Solution:
========
During shutdown, the primary should wait for an ACK or timeout
before hard killing a thread which is awaiting a communication. We
extend the `SHUTDOWN WAIT FOR SLAVES` logic to identify and ignore
any threads waiting for a semi-sync ACK in phase 1. Then, before
stopping the ack receiver thread, the shutdown is delayed until all
waiting semi-sync connections receive an ACK or time out. The
connections are then killed in phase 2.
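The two-phase ordering described above can be modelled with a small, self-contained C++ sketch. This is an illustration only, not the server code: `Conn`, `ConnState`, `kill_phase1`, `resolve_wait`, `all_waits_resolved` and `kill_phase2` are hypothetical names, and an ACK and a timeout are treated identically as "the wait ended".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical connection states; the real server tracks far more than this.
enum class ConnState { Idle, AwaitingAck };

struct Conn {
  int id;
  ConnState state;
  bool killed;
};

// Phase 1: kill every connection except those awaiting a semi-sync ACK.
// Returns the number of connections that were spared.
std::size_t kill_phase1(std::vector<Conn>& conns) {
  std::size_t spared = 0;
  for (auto& c : conns) {
    if (c.state == ConnState::AwaitingAck)
      ++spared;
    else
      c.killed = true;
  }
  return spared;
}

// Between the phases: a spared connection either receives its ACK or times
// out. Both outcomes end the wait, so this model does not distinguish them.
void resolve_wait(Conn& c) { c.state = ConnState::Idle; }

// Shutdown may proceed only when no surviving connection still awaits an ACK.
bool all_waits_resolved(const std::vector<Conn>& conns) {
  for (const auto& c : conns)
    if (!c.killed && c.state == ConnState::AwaitingAck)
      return false;
  return true;
}

// Phase 2: kill whatever is left.
void kill_phase2(std::vector<Conn>& conns) {
  for (auto& c : conns)
    c.killed = true;
}
```

In this model a caller would run `kill_phase1`, wait until `all_waits_resolved` returns true, and only then run `kill_phase2`, which mirrors the ordering the commit message describes.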
Notes:
1) There remains an unresolved corner case that affects this
patch. MDEV-28141: Slave crashes with Packets out of order when
connecting to a shutting down master. Specifically, if a slave is
connecting to a master which is actively shutting down, the slave
can crash with a "Packets out of order" assertion error. To get
around this issue in the MTR tests, the primary will wait a small
amount of time before phase 1 killing threads to let the replicas
safely stop (if applicable).
2) This patch also fixes MDEV-28114: Semi-sync Master ACK Receiver
Thread Can Error on COM_QUIT
Reviewed By
============
Andrei Elkin <andrei.elkin@mariadb.com>
Diffstat (limited to 'sql/slave.cc')
-rw-r--r-- | sql/slave.cc | 1 |
1 files changed, 1 insertions, 0 deletions
diff --git a/sql/slave.cc b/sql/slave.cc
index edea312c5ea..029fd0f5aaf 100644
--- a/sql/slave.cc
+++ b/sql/slave.cc
@@ -4859,6 +4859,7 @@ Stopping slave I/O thread due to out-of-memory error from master");
           not cause the slave IO thread to stop, and the error messages are
           already reported.
         */
+        DBUG_EXECUTE_IF("simulate_delay_semisync_slave_reply", my_sleep(800000););
         (void)repl_semisync_slave.slave_reply(mi);
       }
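The added line is a debug injection point: the sleep runs only when the keyword `simulate_delay_semisync_slave_reply` has been enabled in a debug build, which lets a test delay the replica's reply on demand. The sketch below shows the general pattern under simplifying assumptions; `SKETCH_EXECUTE_IF`, `g_debug_keywords`, `fake_sleep` and `send_reply` are hypothetical stand-ins, not the real DBUG facility, and the sleep is replaced by a counter so the example stays deterministic.

```cpp
#include <set>
#include <string>

// Hypothetical registry of enabled debug keywords (a stand-in for the real
// DBUG keyword list, which works differently).
static std::set<std::string> g_debug_keywords;

// Simplified keyword-gated injection macro: runs `code` only if the keyword
// is currently enabled.
#define SKETCH_EXECUTE_IF(keyword, code) \
  do { if (g_debug_keywords.count(keyword)) { code } } while (0)

// Records the requested delay instead of actually sleeping.
static long g_total_delay = 0;
static void fake_sleep(long amount) { g_total_delay += amount; }

// Stand-in for the reply path: the injected delay runs just before the reply.
// Returns the accumulated simulated delay so callers can observe the effect.
long send_reply() {
  SKETCH_EXECUTE_IF("simulate_delay_semisync_slave_reply", fake_sleep(800000););
  return g_total_delay;
}
```

With the keyword disabled, `send_reply()` adds no delay; after inserting the keyword into `g_debug_keywords`, each call adds the simulated delay before replying.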