diff options
author | unknown <knielsen@knielsen-hq.org> | 2014-06-03 10:31:11 +0200 |
---|---|---|
committer | Kristian Nielsen <knielsen@knielsen-hq.org> | 2014-06-03 10:31:11 +0200 |
commit | 629b822913348cec56ec7a80a236f0ba2e613585 (patch) | |
tree | 8c713a0ad975deb8b6764af03b1cca8e8cd195db /sql/mysqld.h | |
parent | 787c470cef54574e744eb5dfd9153d837fe67e45 (diff) | |
download | mariadb-git-629b822913348cec56ec7a80a236f0ba2e613585.tar.gz |
MDEV-5262, MDEV-5914, MDEV-5941, MDEV-6020: Deadlocks during parallel
replication causing replication to fail.
In parallel replication, we run transactions from the master in parallel, but
force them to commit in the same order they did on the master. If we force T1
to commit before T2, but T2 holds eg. a row lock that is needed by T1, we get
a deadlock when T2 waits until T1 has committed.
Usually, we do not run T1 and T2 in parallel if there is a chance that they
can have conflicting locks like this, but there are certain edge cases where
it can occasionally happen (eg. MDEV-5914, MDEV-5941, MDEV-6020). The bug was
that this would cause replication to hang, eventually getting a lock timeout
and causing the slave to stop with error.
With this patch, InnoDB will report back to the upper layer whenever a
transactions T1 is about to do a lock wait on T2. If T1 and T2 are parallel
replication transactions, and T2 needs to commit later than T1, we can thus
detect the deadlock; we then kill T2, setting a flag that causes it to catch
the kill and convert it to a deadlock error; this error will then cause T2 to
roll back and release its locks (so that T1 can commit), and later T2 will be
re-tried and eventually also committed.
The kill happens asynchroneously in a slave background thread; this is
necessary, as the reporting from InnoDB about lock waits happen deep inside
the locking code, at a point where it is not possible to directly call
THD::awake() due to mutexes held.
Deadlock is assumed to be (very) rarely occuring, so this patch tries to
minimise the performance impact on the normal case where no deadlocks occur,
rather than optimise the handling of the occasional deadlock.
Also fix transaction retry due to deadlock when it happens after a transaction
already signalled to later transactions that it started to commit. In this
case we need to undo this signalling (and later redo it when we commit again
during retry), so following transactions will not start too early.
Also add a missing thd->send_kill_message() that got triggered during testing
(this corrects an incorrect fix for MySQL Bug#58933).
Diffstat (limited to 'sql/mysqld.h')
-rw-r--r-- | sql/mysqld.h | 10 |
1 files changed, 7 insertions, 3 deletions
diff --git a/sql/mysqld.h b/sql/mysqld.h index e7eea3dfa1a..5b28fefb082 100644 --- a/sql/mysqld.h +++ b/sql/mysqld.h @@ -309,8 +309,8 @@ extern PSI_cond_key key_COND_wait_gtid, key_COND_gtid_ignore_duplicates; extern PSI_thread_key key_thread_bootstrap, key_thread_delayed_insert, key_thread_handle_manager, key_thread_kill_server, key_thread_main, - key_thread_one_connection, key_thread_signal_hand, key_thread_slave_init, - key_rpl_parallel_thread; + key_thread_one_connection, key_thread_signal_hand, + key_thread_slave_background, key_rpl_parallel_thread; extern PSI_file_key key_file_binlog, key_file_binlog_index, key_file_casetest, key_file_dbopt, key_file_des_key_file, key_file_ERRMSG, key_select_to_file, @@ -451,6 +451,8 @@ extern PSI_stage_info stage_waiting_for_room_in_worker_thread; extern PSI_stage_info stage_master_gtid_wait_primary; extern PSI_stage_info stage_master_gtid_wait; extern PSI_stage_info stage_gtid_wait_other_connection; +extern PSI_stage_info stage_slave_background_process_request; +extern PSI_stage_info stage_slave_background_wait_request; #ifdef HAVE_PSI_STATEMENT_INTERFACE /** @@ -518,7 +520,8 @@ extern mysql_mutex_t LOCK_delayed_status, LOCK_delayed_create, LOCK_crypt, LOCK_timezone, LOCK_slave_list, LOCK_active_mi, LOCK_manager, LOCK_global_system_variables, LOCK_user_conn, - LOCK_prepared_stmt_count, LOCK_error_messages, LOCK_connection_count; + LOCK_prepared_stmt_count, LOCK_error_messages, LOCK_connection_count, + LOCK_slave_background; extern MYSQL_PLUGIN_IMPORT mysql_mutex_t LOCK_thread_count; #ifdef HAVE_OPENSSL extern mysql_mutex_t LOCK_des_key_file; @@ -529,6 +532,7 @@ extern mysql_rwlock_t LOCK_grant, LOCK_sys_init_connect, LOCK_sys_init_slave; extern mysql_rwlock_t LOCK_system_variables_hash; extern mysql_cond_t COND_thread_count; extern mysql_cond_t COND_manager; +extern mysql_cond_t COND_slave_background; extern int32 thread_running; extern int32 thread_count; extern my_atomic_rwlock_t thread_running_lock, thread_count_lock; |