From fa5f60681fe4ed9d5bb160e551381cb7b1be5850 Mon Sep 17 00:00:00 2001
From: Rinat Ibragimov
Date: Thu, 18 Jun 2020 01:11:39 +0300
Subject: MDEV-20946: Hard FTWRL deadlock under user level locks

It was possible for a user to create an interlocked state that could
persist for a significant period of time. There is a tight loop in the
FTWRL code path that repeatedly tries to acquire a read lock. As the
weight of the FTWRL lock is the smallest of all, it is always selected by
the deadlock detector, but can never be killed.

Imagine the following sequence:

  connection_0                 connection_1

  GET_LOCK("l1", 0);
                               LOCK TABLES t WRITE;
  FLUSH TABLES WITH READ LOCK;
                               GET_LOCK("l1", 1000);

The GET_LOCK statement in connection_1 triggers the deadlock detector,
which tries to select the lock in FTWRL, since its weight is 0. However,
since the loop in Global_read_lock::lock_global_read_lock() always tries
to win, it tries to acquire the lock again, which invokes the deadlock
detector again, and the cycle continues until GET_LOCK in connection_1
times out.

This patch resolves the live-lock by introducing a dynamic bonus to the
deadlock weight associated with every lock. Each lock gets a bonus weight
each time it is selected by the deadlock detector. In a live-lock
situation, the locks that cannot be killed gain additional weight on each
iteration. Eventually their weight becomes so high that the deadlock
detector shifts its attention to other locks, until it finds one that can
be killed.
---
 sql/mdl.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sql/mdl.cc b/sql/mdl.cc
index 1798d3039fa..5e54178db70 100644
--- a/sql/mdl.cc
+++ b/sql/mdl.cc
@@ -2782,6 +2782,7 @@ void MDL_context::find_deadlock()
       context was waiting is concurrently satisfied.
     */
     (void) victim->m_wait.set_status(MDL_wait::VICTIM);
+    victim->inc_deadlock_overweight();
     victim->unlock_deadlock_victim();
 
     if (victim == this)
-- 
cgit v1.2.1
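
Illustration only, not part of the patch: a minimal standalone sketch of how
a per-selection bonus weight can break the live-lock described in the commit
message. The `Waiter` type, the `killable` flag, the helper functions and the
weight value 25 are all hypothetical and are not taken from sql/mdl.cc. The
sketch only assumes that the detector picks the lightest waiter in the cycle
and that a selected waiter gains one unit of bonus weight.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one waiter in a wait-for cycle.
// This is NOT the MDL_context / MDL_ticket type from sql/mdl.cc.
struct Waiter {
    unsigned base_weight;  // static deadlock weight of the lock type
    unsigned overweight;   // dynamic bonus, grows on every selection
    bool killable;         // false models a waiter that retries in a loop

    unsigned weight() const { return base_weight + overweight; }
};

// Pick the lightest waiter in the cycle as the victim and bump its bonus,
// mirroring the idea behind inc_deadlock_overweight().
std::size_t select_victim(std::vector<Waiter>& cycle) {
    assert(!cycle.empty());
    std::size_t victim = 0;
    for (std::size_t i = 1; i < cycle.size(); ++i) {
        if (cycle[i].weight() < cycle[victim].weight())
            victim = i;
    }
    ++cycle[victim].overweight;
    return victim;
}

// Repeat detection until a killable victim is chosen.
// Returns the number of rounds needed, or 0 if max_rounds was exhausted.
unsigned rounds_until_resolved(std::vector<Waiter>& cycle,
                               unsigned max_rounds) {
    for (unsigned round = 1; round <= max_rounds; ++round) {
        if (cycle[select_victim(cycle)].killable)
            return round;
    }
    return 0;
}
```

With a non-killable waiter of base weight 0 (the FTWRL-like case) and a
killable waiter of base weight 25, the first waiter is selected repeatedly
until its accumulated bonus exceeds 25, after which the detector picks the
killable waiter and the cycle is broken. Without the bonus, the first waiter
would be selected on every round.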