author | Dmitry Lenev <dlenev@mysql.com> | 2010-07-27 17:34:58 +0400 |
---|---|---|
committer | Dmitry Lenev <dlenev@mysql.com> | 2010-07-27 17:34:58 +0400 |
commit | 00496b7acd1f2ac8b099ba7e6a4c7bbf09178384 (patch) | |
tree | bef62913efddc244d466f7ff730bcc4205357491 /mysql-test/t | |
parent | 36290c092392c460132f3d7256aeeb8f94debe8f (diff) | |
Fix for bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH
TABLES <list> WITH READ LOCK are incompatible".
The problem was that FLUSH TABLES <list> WITH READ LOCK,
when issued while another connection held the global
read lock acquired by FLUSH TABLES WITH READ LOCK, was blocked
and had to wait until the global read lock was released.
This issue stemmed from the fact that the FLUSH TABLES <list>
WITH READ LOCK implementation acquired X metadata locks
on the tables to be flushed. Since these locks required acquiring
the global IX lock, this statement was incompatible with the global
read lock.
This patch addresses the problem by using SNW metadata
locks for tables to be flushed by FLUSH TABLES <list> WITH
READ LOCK. It is OK to acquire them without the global IX lock
as long as we won't try to upgrade those locks. Since SNW
locks allow concurrent statements to use the same table, FLUSH
TABLE <list> WITH READ LOCK now has to wait, after acquiring
metadata locks, until old versions of the tables to be flushed
go away. Since such waiting can lead to deadlock, the
MDL deadlock detector was extended to take waits for flush
into account and resolve such deadlocks.
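To illustrate how a wait for flush can close a cycle that would be invisible if only metadata lock waits were considered, here is a minimal sketch of a wait-for graph with two kinds of edges. It is not the server's detector: the type and function names (WaitKind, Edge, Context, find_deadlock_victim) are hypothetical, and choosing the lowest-weight context on the cycle as the victim is an assumption made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kind of wait an edge represents (hypothetical names for illustration).
enum class WaitKind { MetadataLock, TableFlush };

struct Edge {
    std::size_t blocker;  // context that must act before the waiter can proceed
    WaitKind kind;        // recorded for illustration; traversal treats both kinds alike
};

struct Context {
    int weight;               // assumption: lower weight = preferred victim
    std::vector<Edge> waits;  // outgoing edges: what this context waits for
};

// Depth-first search; on success `cycle` holds the contexts forming the cycle.
static bool dfs(const std::vector<Context>& g, std::size_t node,
                std::vector<int>& state, std::vector<std::size_t>& path,
                std::vector<std::size_t>& cycle) {
    state[node] = 1;  // node is on the current search path
    path.push_back(node);
    for (const Edge& e : g[node].waits) {
        if (state[e.blocker] == 1) {  // back edge: a cycle was found
            std::size_t i = 0;
            while (path[i] != e.blocker) ++i;
            cycle.assign(path.begin() + static_cast<std::ptrdiff_t>(i), path.end());
            return true;
        }
        if (state[e.blocker] == 0 && dfs(g, e.blocker, state, path, cycle))
            return true;
    }
    path.pop_back();
    state[node] = 2;  // fully explored, no cycle through this node
    return false;
}

// Returns the index of the context to back off, or -1 if no deadlock is found.
int find_deadlock_victim(const std::vector<Context>& g, std::size_t start) {
    assert(start < g.size());
    std::vector<int> state(g.size(), 0);
    std::vector<std::size_t> path, cycle;
    if (!dfs(g, start, state, path, cycle)) return -1;
    std::size_t victim = cycle.front();
    for (std::size_t c : cycle)
        if (g[c].weight < g[victim].weight) victim = c;
    return static_cast<int>(victim);
}
```

In this model a SELECT that keeps t1 open and waits for a metadata lock held by FLUSH TABLES <list> WITH READ LOCK, which in turn waits for t1 to be flushed, forms a two-node cycle only because the TableFlush edge is part of the graph.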
As a bonus, the code in open_tables() which was responsible for
waiting for old versions of tables to go away was refactored.
Now when we encounter an old version of a table in open_table()
we don't back off and wait for all old versions to go away,
but instead wait for this particular table to be flushed.
This approach, supported by deadlock detection, should reduce
the number of scenarios in which FLUSH TABLES aborts concurrent
multi-statement transactions.
Note that an active FLUSH TABLES <list> WITH READ LOCK still
blocks a concurrent FLUSH TABLES WITH READ LOCK statement,
as the former keeps tables open and thus prevents the
latter statement from doing the flush.
mysql-test/include/handler.inc:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/r/flush.result:
Added test which checks that "flush tables <list> with
read lock" is compatible with active "flush tables with
read lock" but not vice-versa. This test also covers
bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
<list> WITH READ LOCK are incompatible".
mysql-test/r/mdl_sync.result:
Added scenarios in which wait for table to be flushed
causes deadlocks to the coverage of MDL deadlock detector.
mysql-test/suite/perfschema/r/dml_setup_instruments.result:
Adjusted test results after removal of COND_refresh
condition variable.
mysql-test/suite/perfschema/r/server_init.result:
Adjusted test and its results after removal of COND_refresh
condition variable.
mysql-test/suite/perfschema/t/server_init.test:
Adjusted test and its results after removal of COND_refresh
condition variable.
mysql-test/t/flush.test:
Added test which checks that "flush tables <list> with
read lock" is compatible with active "flush tables with
read lock" but not vice-versa. This test also covers
bug #52044 "FLUSH TABLES WITH READ LOCK and FLUSH TABLES
<list> WITH READ LOCK are incompatible".
mysql-test/t/kill.test:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/t/lock_multi.test:
Adjusted test case after changing status which is set
when FLUSH TABLES waits for tables to be flushed from
"Flushing tables" to "Waiting for table".
mysql-test/t/mdl_sync.test:
Added scenarios in which wait for table to be flushed
causes deadlocks to the coverage of MDL deadlock detector.
sql/ha_ndbcluster.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/ha_ndbcluster_binlog.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/lock.cc:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/mdl.cc:
Now MDL deadlock detector takes into account information
about waits for table flushes when searching for deadlock.
To implement this change:
- Declarations of enum_deadlock_weight and
Deadlock_detection_visitor were moved to the mdl.h header
to make them available to the code in table.cc which
implements deadlock detector traversal through edges
of the waiters graph representing waits for flush.
- Since MDL_context may now wait not only for a metadata
lock but also for a table to be flushed, an abstract
Wait_for_edge class was introduced. Its descendants
MDL_ticket and Flush_ticket encapsulate the specifics
of inspecting the waiters graph when following an
edge representing a wait of a particular type.
We no longer require the global IX metadata lock when acquiring
SNW or SNRW locks. The global IX lock is needed only when metadata
locks of these types are upgraded to X locks. This allows
using SNW locks in the FLUSH TABLES <list> WITH READ LOCK
implementation while keeping it compatible with the global
read lock.
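The effect of this rule can be shown with a small model. This is a sketch under simplifying assumptions, not the code in mdl.cc: the enum and both functions are hypothetical names, only the three lock types discussed here are modelled, and the global read lock is reduced to a boolean that conflicts with the global IX lock.

```cpp
#include <cassert>

// Hypothetical, simplified model of the lock types named in this comment.
enum class LockType { SNW, SNRW, X };

// Does acquiring `type` also require the global IX lock?
// Assumption: before the patch SNW and SNRW required it as well as X;
// after the patch only X does.
bool needs_global_ix(LockType type, bool patched) {
    switch (type) {
    case LockType::X:
        return true;
    case LockType::SNW:
    case LockType::SNRW:
        return !patched;
    }
    assert(false && "unknown lock type");
    return false;
}

// A request blocks on an active global read lock only if it needs global IX.
bool blocked_by_global_read_lock(LockType type, bool patched,
                                 bool global_read_lock_held) {
    return global_read_lock_held && needs_global_ix(type, patched);
}
```

Under this model a statement that takes X locks always blocks while the global read lock is held, whereas one that takes SNW locks blocks only with the pre-patch rule.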
sql/mdl.h:
Now MDL deadlock detector takes into account information
about waits for table flushes when searching for deadlock.
To implement this change:
- Declarations of enum_deadlock_weight and
Deadlock_detection_visitor were moved to the mdl.h header
to make them available to the code in table.cc which
implements deadlock detector traversal through edges
of the waiters graph representing waits for flush.
- Since MDL_context may now wait not only for a metadata
lock but also for a table to be flushed, an abstract
Wait_for_edge class was introduced. Its descendants
MDL_ticket and Flush_ticket encapsulate the specifics
of inspecting the waiters graph when following an
edge representing a wait of a particular type.
- Deadlock_detection_visitor now has an m_table_shares_visited
member which supports recursive locking of
LOCK_open. This is required when the deadlock detector
inspects a waiters graph which contains several edges
representing waits for flushes, or needs to pass through
such an edge more than once.
sql/mysqld.cc:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/mysqld.h:
Removed COND_refresh condition variable. See comment
for sql_base.cc for details.
sql/sql_base.cc:
Changed the approach to how threads wait for a table
to be flushed. Now a thread that wants to wait for an old
table to go away subscribes for notification by adding a
Flush_ticket to the table's share and waits using the
MDL_context::m_wait object. Once the table gets flushed
(i.e. all tables are closed and the table share is ready
to be destroyed) all such waiters are notified
individually.
Thanks to this change the MDL deadlock detector can take
such waits into account.
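The subscribe-and-notify scheme described above can be sketched with standard C++ primitives. This is an illustration only, not the code in sql_base.cc or table.cc: the Share class, the layout of FlushTicket and the mark_flushed() name are hypothetical, and std::condition_variable stands in for the MDL_context::m_wait object.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <list>
#include <mutex>

// One waiter's subscription (hypothetical layout).
struct FlushTicket {
    std::mutex m;
    std::condition_variable cv;
    bool notified = false;
};

class Share {
public:
    // Subscribe to the share and wait until it is flushed or the timeout
    // expires. Returns true if the flush was observed.
    bool wait_until_flushed(std::chrono::milliseconds timeout) {
        assert(timeout.count() >= 0);
        FlushTicket ticket;
        {
            std::lock_guard<std::mutex> guard(lock_);
            if (flushed_) return true;    // nothing to wait for
            waiters_.push_back(&ticket);  // subscribe
        }
        bool notified;
        {
            std::unique_lock<std::mutex> lk(ticket.m);
            notified = ticket.cv.wait_for(lk, timeout,
                                          [&ticket] { return ticket.notified; });
        }
        // Unsubscribe under the share lock so that mark_flushed() never
        // touches a ticket that has already been destroyed.
        std::lock_guard<std::mutex> guard(lock_);
        waiters_.remove(&ticket);
        return notified || flushed_;
    }

    // Called once the old version is gone: wake every subscriber individually.
    void mark_flushed() {
        std::lock_guard<std::mutex> guard(lock_);
        flushed_ = true;
        for (FlushTicket* t : waiters_) {
            std::lock_guard<std::mutex> tl(t->m);
            t->notified = true;
            t->cv.notify_one();
        }
    }

private:
    std::mutex lock_;
    bool flushed_ = false;
    std::list<FlushTicket*> waiters_;
};
```

Because every waiter is registered on the share it waits for, a detector can enumerate who is waiting for which share, which a single broadcast condition variable shared by all waiters would not reveal.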
To implement this/as result of this change:
- tdc_wait_for_old_versions() was replaced with
tdc_wait_for_old_version() which waits for individual
old share to go away and which is called by open_table()
after finding out that share is outdated. We don't
need to perform back-off before such waiting thanks
to the fact that deadlock detector now sees such waits.
- As result Open_table_ctx::m_mdl_requests became
unnecessary and was removed. We no longer allocate
copies of MDL_request objects on MEM_ROOT when
MYSQL_OPEN_FORCE_SHARED/SHARED_HIGH_PRIO flags are
in effect.
- close_cached_tables() and tdc_wait_for_old_version()
share the code which implements waiting for a share to be
flushed: both use the TABLE_SHARE::wait_until_flushed()
method. Thanks to this close_cached_tables() supports
timeouts and has an extra parameter for this.
- Open_table_context::OT_MDL_CONFLICT enum element was
renamed to OT_CONFLICT as it is now also used in cases
when back-off is required to resolve deadlock caused
by waiting for flush and not metadata lock.
- In cases when we discover that current connection tries
to open tables from different generation we now simply
back-off and restart process of opening tables. To
support this Open_table_context::OT_REOPEN_TABLES enum
element was added.
- COND_refresh condition variable became unnecessary and
was removed.
- mysql_notify_thread_having_shared_lock() no longer wakes
up connections waiting for flush as all such connections
can be woken up by the deadlock detector if necessary.
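The "back off and restart" behaviour for tables from different generations, listed above, can be sketched as a retry loop. This is a toy model, not the logic in sql_base.cc: TableCache, the version numbers, the after_each_open hook and both function signatures are hypothetical, and Action::ReopenTables merely mirrors the role the list assigns to OT_REOPEN_TABLES.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

enum class Action { None, ReopenTables };

// Table name -> current version (generation) of its share.
using TableCache = std::map<std::string, unsigned>;

// Open one table; request a restart if its version differs from the
// version of the tables this connection has already opened.
Action open_table(const TableCache& cache, const std::string& name,
                  std::vector<unsigned>& opened_versions) {
    const unsigned version = cache.at(name);
    if (!opened_versions.empty() && opened_versions.front() != version)
        return Action::ReopenTables;
    opened_versions.push_back(version);
    return Action::None;
}

// Open all tables, restarting from scratch whenever a generation mismatch
// is found. Returns the number of restarts. `after_each_open` lets a caller
// simulate a concurrent flush; a hook that flushes on every call would make
// this loop spin forever.
int open_tables(const TableCache& cache, const std::vector<std::string>& names,
                const std::function<void()>& after_each_open) {
    assert(!names.empty());
    int restarts = 0;
    for (;;) {
        std::vector<unsigned> opened_versions;  // "close" everything on restart
        bool restart = false;
        for (const std::string& name : names) {
            if (open_table(cache, name, opened_versions) == Action::ReopenTables) {
                restart = true;
                break;
            }
            after_each_open();
        }
        if (!restart) return restarts;
        ++restarts;
    }
}
```

A flush that happens after the first table is opened bumps the generation, so the second open reports a mismatch and the whole set is reopened once.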
sql/sql_base.h:
- close_cached_tables() now has one more parameter -
timeout for waiting for table to be flushed.
- Open_table_context::OT_MDL_CONFLICT enum element was
renamed to OT_CONFLICT as it is now also used in cases
when back-off is required to resolve deadlock caused
by waiting for flush and not metadata lock.
Added new OT_REOPEN_TABLES enum element to be used in
cases when we need to restart open tables process even
in the middle of transaction.
- Open_table_ctx::m_mdl_requests became unnecessary and
was removed.
sql/sql_class.h:
Added assert ensuring that we won't use LOCK_open mutex
with THD::enter_cond(). Otherwise deadlocks can arise in
MDL deadlock detector.
sql/sql_parse.cc:
Changed FLUSH TABLES <list> WITH READ LOCK to take SNW
metadata locks instead of X locks on tables to be flushed.
Since we no longer require global IX lock to be taken
when SNW locks are taken this makes this statement
compatible with FLUSH TABLES WITH READ LOCK statement.
Since SNW locks allow other connections to have table
opened FLUSH TABLES <list> WITH READ LOCK now has to
wait during open_tables() for old version to go away.
Such waits can lead to deadlocks which will be detected
by MDL deadlock detector which now takes waits for table
to be flushed into account.
Also adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/sql_yacc.yy:
FLUSH TABLES <list> WITH READ LOCK now needs only SNW
metadata locks on tables.
sql/sys_vars.cc:
Adjusted code after adding one more parameter for
close_cached_tables() call - timeout for waiting for
table to be flushed.
sql/table.cc:
Implemented the new approach to how threads wait for a
table to be flushed. Now a thread that wants to wait for an
old table to go away subscribes for notification by
adding a Flush_ticket to the table's share and waits using the
MDL_context::m_wait object. Once the table gets flushed
(i.e. all tables are closed and the table share is ready
to be destroyed) all such waiters are notified
individually. This change makes such waits
visible to the MDL deadlock detector.
To do it:
- Added list of waiters/Flush_tickets to TABLE_SHARE
class.
- Changed free_table_share() to postpone freeing of
share memory until last waiter goes away and to
wake up subscribed waiters.
- Added TABLE_SHARE::wait_until_flushed() method which
implements subscription to the list of waiters for
table to be flushed and waiting for this event.
Implemented an interface which exposes waits for
flushes to the MDL deadlock detector:
- Introduced the Flush_ticket class, a descendant of the
Wait_for_edge class.
- Added the TABLE_SHARE::find_deadlock() method, which allows
the deadlock detector to find out which contexts are still
using an old version of the table in question (i.e. to find
out which contexts the owner of a Flush_ticket is
waiting for).
sql/table.h:
In order to support new strategy of waiting for table flush
(see comment for table.cc for details) added list of
waiters/Flush_tickets to TABLE_SHARE class.
Implemented an interface which exposes waits for
flushes to the MDL deadlock detector:
- Introduced the Flush_ticket class, a descendant of the
Wait_for_edge class.
- Added the TABLE_SHARE::find_deadlock() method, which allows
the deadlock detector to find out which contexts are still
using an old version of the table in question (i.e. to find
out which contexts the owner of a Flush_ticket is
waiting for).
Diffstat (limited to 'mysql-test/t')
-rw-r--r-- | mysql-test/t/flush.test | 14 |
-rw-r--r-- | mysql-test/t/kill.test | 2 |
-rw-r--r-- | mysql-test/t/lock_multi.test | 2 |
-rw-r--r-- | mysql-test/t/mdl_sync.test | 250 |
4 files changed, 266 insertions, 2 deletions
diff --git a/mysql-test/t/flush.test b/mysql-test/t/flush.test
index 0d406338394..0157f2dc764 100644
--- a/mysql-test/t/flush.test
+++ b/mysql-test/t/flush.test
@@ -318,6 +318,20 @@ insert into t2 (a) values (3);
 --echo # --> connection default;
 connection default;
 unlock tables;
+--echo #
+--echo # Check that "flush tables <list> with read lock" is
+--echo # compatible with active "flush tables with read lock".
+--echo # Vice versa is not true as tables read-locked by
+--echo # "flush tables <list> with read lock" can't be flushed.
+flush tables with read lock;
+--echo # --> connection con1;
+connection con1;
+flush table t1 with read lock;
+select * from t1;
+unlock tables;
+--echo # --> connection default;
+connection default;
+unlock tables;
 --echo # --> connection con1
 connection con1;
 disconnect con1;
diff --git a/mysql-test/t/kill.test b/mysql-test/t/kill.test
index b91feb3a1d5..7169ca5f7c3 100644
--- a/mysql-test/t/kill.test
+++ b/mysql-test/t/kill.test
@@ -536,7 +536,7 @@ connection ddl;
 connection dml;
 let $wait_condition=
   select count(*) = 1 from information_schema.processlist
-  where state = "Flushing tables" and
+  where state = "Waiting for table" and
   info = "flush tables";
 --source include/wait_condition.inc
 --send select * from t1
diff --git a/mysql-test/t/lock_multi.test b/mysql-test/t/lock_multi.test
index 6983947d1c4..2a31392e8f8 100644
--- a/mysql-test/t/lock_multi.test
+++ b/mysql-test/t/lock_multi.test
@@ -982,7 +982,7 @@ connection con3;
 connection con2;
 let $wait_condition=
   SELECT COUNT(*) = 1 FROM information_schema.processlist
-  WHERE state = "Flushing tables" AND info = "FLUSH TABLES";
+  WHERE state = "Waiting for table" AND info = "FLUSH TABLES";
 --source include/wait_condition.inc
 --error ER_LOCK_WAIT_TIMEOUT
 SELECT * FROM t1;
diff --git a/mysql-test/t/mdl_sync.test b/mysql-test/t/mdl_sync.test
index 6b721ace07f..13e6aef10be 100644
--- a/mysql-test/t/mdl_sync.test
+++ b/mysql-test/t/mdl_sync.test
@@ -2829,6 +2829,187 @@ connection default;
 drop table t1;
 
 
+--echo #
+--echo # Now, test for situation in which deadlock involves waiting not
+--echo # only in MDL subsystem but also for TDC. Such deadlocks should be
+--echo # successfully detected. If possible they should be resolved without
+--echo # resorting to ER_LOCK_DEADLOCK error.
+--echo #
+create table t1(i int);
+create table t2(j int);
+
+--echo #
+--echo # First, let us check how we handle simple scenario involving
+--echo # waits in MDL and TDC.
+--echo #
+set debug_sync= 'RESET';
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Start statement which will acquire SR metadata lock on t1, open it
+--echo # and then will stop, before trying to acquire SW lock and opening t2.
+set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
+--echo # Sending:
+--send select * from t1 where i in (select j from t2 for update)
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+--echo # Wait till the above SELECT stops.
+set debug_sync='now WAIT_FOR parked';
+--echo # The below FLUSH TABLES WITH READ LOCK should acquire
+--echo # SNW locks on t1 and t2 and wait till SELECT closes t1.
+--echo # Sending:
+--send flush tables t1, t2 with read lock
+
+--echo # Switching to connection 'deadlock_con3'.
+connection deadlock_con3;
+--echo # Wait until FLUSH TABLES WITH READ LOCK starts waiting
+--echo # for SELECT to close t1.
+let $wait_condition=
+  select count(*) = 1 from information_schema.processlist
+  where state = "Waiting for table" and info = "flush tables t1, t2 with read lock";
+--source include/wait_condition.inc
+
+--echo # Resume SELECT, so it tries to acquire SW lock on t1 and blocks,
+--echo # creating a deadlock. This deadlock should be detected and resolved
+--echo # by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
+--echo # be able to finish.
+set debug_sync='now SIGNAL go';
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+--echo # Reap FLUSH TABLES WITH READ LOCK.
+--reap
+unlock tables;
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Reap SELECT.
+--reap
+
+--echo #
+--echo # The same scenario with a slightly different order of events
+--echo # which emphasizes that setting correct deadlock detector weights
+--echo # for flush waits is important.
+--echo #
+set debug_sync= 'RESET';
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+set debug_sync='flush_tables_with_read_lock_after_acquire_locks SIGNAL parked WAIT_FOR go';
+
+--echo # The below FLUSH TABLES WITH READ LOCK should acquire
+--echo # SNW locks on t1 and t2 and wait on debug sync point.
+--echo # Sending:
+--send flush tables t1, t2 with read lock
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Wait till FLUSH TABLE WITH READ LOCK stops.
+set debug_sync='now WAIT_FOR parked';
+
+--echo # Start statement which will acquire SR metadata lock on t1, open
+--echo # it and then will block while trying to acquire SW lock on t2.
+--echo # Sending:
+--send select * from t1 where i in (select j from t2 for update)
+
+--echo # Switching to connection 'deadlock_con3'.
+connection deadlock_con3;
+--echo # Wait till the above SELECT blocks.
+let $wait_condition=
+  select count(*) = 1 from information_schema.processlist
+  where state = "Waiting for table" and
+  info = "select * from t1 where i in (select j from t2 for update)";
+--source include/wait_condition.inc
+
+--echo # Resume FLUSH TABLES, so it tries to flush t1 creating a deadlock.
+--echo # This deadlock should be detected and resolved by backing-off SELECT.
+--echo # As result FLUSH TABLES WITH READ LOCK should be able to finish.
+set debug_sync='now SIGNAL go';
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+--echo # Reap FLUSH TABLES WITH READ LOCK.
+--reap
+unlock tables;
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Reap SELECT.
+--reap
+
+--echo #
+--echo # Now more complex scenario involving two connections
+--echo # waiting for MDL and one for TDC.
+--echo #
+set debug_sync= 'RESET';
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Start statement which will acquire SR metadata lock on t2, open it
+--echo # and then will stop, before trying to acquire SR lock and opening t1.
+set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
+--echo # Sending:
+--send select * from t2, t1
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+--echo # Wait till the above SELECT stops.
+set debug_sync='now WAIT_FOR parked';
+--echo # The below FLUSH TABLES WITH READ LOCK should acquire
+--echo # SNW locks on t2 and wait till SELECT closes t2.
+--echo # Sending:
+--send flush tables t2 with read lock
+
+--echo # Switching to connection 'deadlock_con3'.
+connection deadlock_con3;
+--echo # Wait until FLUSH TABLES WITH READ LOCK starts waiting
+--echo # for SELECT to close t2.
+let $wait_condition=
+  select count(*) = 1 from information_schema.processlist
+  where state = "Waiting for table" and info = "flush tables t2 with read lock";
+--source include/wait_condition.inc
+
+--echo # The below DROP TABLES should acquire X lock on t1 and start
+--echo # waiting for X lock on t2.
+--echo # Sending:
+--send drop tables t1, t2
+
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Wait until DROP TABLES starts waiting for X lock on t2.
+let $wait_condition=
+  select count(*) = 1 from information_schema.processlist
+  where state = "Waiting for table" and info = "drop tables t1, t2";
+--source include/wait_condition.inc
+
+--echo # Resume SELECT, so it tries to acquire SR lock on t1 and blocks,
+--echo # creating a deadlock. This deadlock should be detected and resolved
+--echo # by backing-off SELECT. As result FLUSH TABLES WITH READ LOCK should
+--echo # be able to finish.
+set debug_sync='now SIGNAL go';
+
+--echo # Switching to connection 'deadlock_con2'.
+connection deadlock_con2;
+--echo # Reap FLUSH TABLES WITH READ LOCK.
+--reap
+--echo # Unblock DROP TABLES.
+unlock tables;
+
+--echo # Switching to connection 'deadlock_con3'.
+connection deadlock_con3;
+--echo # Reap DROP TABLES.
+--reap
+
+--echo # Switching to connection 'deadlock_con1'.
+connection deadlock_con1;
+--echo # Reap SELECT. It should emit error about missing table.
+--error ER_NO_SUCH_TABLE
+--reap
+
+--echo # Switching to connection 'default'.
+connection default;
+
 set debug_sync= 'RESET';
 
 disconnect deadlock_con1;
@@ -2837,6 +3018,75 @@ disconnect deadlock_con3;
 
 
 --echo #
+--echo # Test for scenario in which FLUSH TABLES <list> WITH READ LOCK
+--echo # has been erroneously releasing metadata locks.
+--echo #
+connect(con1,localhost,root,,);
+connect(con2,localhost,root,,);
+connection default;
+--disable_warnings
+drop tables if exists t1, t2;
+--enable_warnings
+set debug_sync= 'RESET';
+create table t1(i int);
+create table t2(j int);
+
+--echo # Switching to connection 'con2'.
+connection con2;
+set debug_sync='open_tables_after_open_and_process_table SIGNAL parked WAIT_FOR go';
+
+--echo # The below FLUSH TABLES <list> WITH READ LOCK should acquire
+--echo # SNW locks on t1 and t2, open table t1 and wait on debug sync
+--echo # point.
+--echo # Sending:
+--send flush tables t1, t2 with read lock
+
+--echo # Switching to connection 'con1'.
+connection con1;
+--echo # Wait till FLUSH TABLES <list> WITH READ LOCK stops.
+set debug_sync='now WAIT_FOR parked';
+
+--echo # Start statement which will flush all tables and thus invalidate
+--echo # table t1 open by FLUSH TABLES <list> WITH READ LOCK.
+--echo # Sending:
+--send flush tables
+
+--echo # Switching to connection 'default'.
+connection default;
+--echo # Wait till the above FLUSH TABLES blocks.
+let $wait_condition=
+  select count(*) = 1 from information_schema.processlist
+  where state = "Waiting for table" and
+  info = "flush tables";
+--source include/wait_condition.inc
+
+--echo # Resume FLUSH TABLES <list> WITH READ LOCK, so it tries to open t2
+--echo # discovers that its t1 is obsolete and tries to reopen all tables.
+--echo # Such reopen should not cause releasing of SNW metadata locks
+--echo # which will result in assertion failures.
+set debug_sync='now SIGNAL go';
+
+--echo # Switching to connection 'con2'.
+connection con2;
+--echo # Reap FLUSH TABLES <list> WITH READ LOCK.
+--reap
+unlock tables;
+
+--echo # Switching to connection 'con1'.
+connection con1;
+--echo # Reap FLUSH TABLES.
+--reap
+
+--echo # Clean-up.
+--echo # Switching to connection 'default'.
+connection default;
+drop tables t1, t2;
+set debug_sync= 'RESET';
+disconnect con1;
+disconnect con2;
+
+
+--echo #
 --echo # Test for bug #46748 "Assertion in MDL_context::wait_for_locks()
 --echo # on INSERT + CREATE TRIGGER".
 --echo #