author     Dmitry Lenev <Dmitry.Lenev@oracle.com>    2010-09-29 16:09:07 +0400
committer  Dmitry Lenev <Dmitry.Lenev@oracle.com>    2010-09-29 16:09:07 +0400
commit     0afd0a18feb4501cafb4800115fc25f13171acf6 (patch)
tree       6b53476f77b4149bc740fbc04af751faa7cda38f /include/my_pthread.h
parent     b72e7f05ffe5e8360a9b91c83db045c1e4618d35 (diff)
download   mariadb-git-0afd0a18feb4501cafb4800115fc25f13171acf6.tar.gz
A better fix for bug #56405 "Deadlock in the MDL deadlock detector"
that doesn't introduce bug #56715 "Concurrent transactions + FLUSH
result in sporadical unwarranted deadlock errors".

The deadlock could occur when a workload consisting of a mix of DML,
DDL and FLUSH TABLES statements affecting the same set of tables was
executed in a heavily concurrent environment.

The deadlock occurred when several connections tried to perform
deadlock detection in the metadata locking subsystem. The first
connection started traversing the wait-for graph, encountered a
sub-graph representing a wait for flush, acquired LOCK_open and dived
into sub-graph inspection. It then encountered a sub-graph
corresponding to a wait for a metadata lock and blocked while trying
to acquire a rd-lock on MDL_lock::m_rwlock, since some other thread
held a wr-lock on it. When this wr-lock was released it could happen
(if there was another pending wr-lock against this rwlock) that the
rd-lock from the first connection was left unsatisfied, while at the
same time a new rd-lock request from a second connection sneaked in
and was satisfied (for this to be possible the second rd-request has
to arrive exactly after the wr-lock is released but before the
pending wr-lock manages to grab the rwlock, which is possible both
with Linux rwlocks and with our own rwlock implementation). If this
second connection then continued traversing the wait-for graph and
encountered a sub-graph representing a wait for flush, it tried to
acquire LOCK_open and thus the deadlock was created.

The previous patch tried to work around this problem by not allowing
the deadlock detector to lock the LOCK_open mutex if some other
thread doing deadlock detection already owned it and the current
search depth was greater than 0; instead a deadlock was reported. As
a result it introduced bug #56715.

This patch solves the problem in a different way. It introduces a new
rw_pr_lock_t implementation to be used by the MDL subsystem instead
of the one based on Linux rwlocks or on our own rwlock
implementation. The new implementation never allows a situation in
which the rwlock is rd-locked and there is a blocked pending rd-lock,
so the situation which caused this bug becomes impossible. Because it
is optimized for the wr-lock/unlock scenario, which is the most
common one in the MDL subsystem, it doesn't introduce noticeable
performance regressions in sysbench tests. Moreover, it significantly
improves the POINT_SELECT test when many connections are used.

No test case is provided as this bug is very hard to repeat in an MTR
environment but is repeatable with the help of RQG tests. This patch
also doesn't include a test for bug #56715 "Concurrent transactions +
FLUSH result in sporadical unwarranted deadlock errors" as it takes
too much time to run as part of normal test-suite runs.

config.h.cmake:
  We no longer need to check for the presence of
  pthread_rwlockattr_setkind_np as we no longer use the Linux-specific
  implementation of rw_pr_lock_t which relied on this function.
configure.cmake:
  We no longer need to check for the presence of
  pthread_rwlockattr_setkind_np as we no longer use the Linux-specific
  implementation of rw_pr_lock_t which relied on this function.
configure.in:
  We no longer need to check for the presence of
  pthread_rwlockattr_setkind_np as we no longer use the Linux-specific
  implementation of rw_pr_lock_t which relied on this function.
include/my_pthread.h:
  Introduced a new implementation of rw_pr_lock_t. Since it never
  allows a situation in which the rwlock is rd-locked and there is a
  blocked pending rd-lock, it is not affected by bug #56405 "Deadlock
  in the MDL deadlock detector". The implementation is also optimized
  for the wr-lock/unlock scenario, which is the most common one in the
  MDL subsystem, so it doesn't introduce noticeable performance
  regressions in sysbench tests (compared to the old Linux-specific
  implementation) and significantly improves the POINT_SELECT test
  when many connections are used. As part of this change the try-lock
  part of the API for this type of lock was removed; it is not used in
  our code and would be hard to implement correctly within the
  constraints of the new implementation. Finally, support for
  preferring readers was removed from the my_rw_lock_t implementation,
  as the only user of this feature was the old rw_pr_lock_t
  implementation.
include/mysql/psi/mysql_thread.h:
  Removed the try-lock part of the prlock API. It is not used in our
  code and would be hard to implement correctly within the constraints
  of the new prlock implementation.
mysys/thr_rwlock.c:
  Introduced the new implementation of rw_pr_lock_t described in the
  note for include/my_pthread.h. Also removed support for preferring
  readers from the my_rw_lock_t implementation, as the only user of
  this feature was the old rw_pr_lock_t implementation.
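To make the commit message's argument concrete, the sketch below shows, in plain pthread C (not the committed mysys/thr_rwlock.c code), how the operations can be built from the fields this patch adds to rw_pr_lock_t. A reader only holds the internal mutex for an instant, while a writer keeps it for the whole duration of the wr-lock and waits on a condition variable until the reader count drops to zero. As a result a rd-lock request can only block behind an already active writer, never behind a merely pending one, which is the property the MDL deadlock detector relies on. Struct and function names here are illustrative only.

/*
  Sketch only: field names mirror the rw_pr_lock_t structure introduced
  in the diff below, but error handling, SAFE_MUTEX bookkeeping and other
  details of the real mysys/thr_rwlock.c code are omitted.
*/
#include <pthread.h>

struct pr_lock_sketch
{
  pthread_mutex_t lock;               /* also held for the whole wr-lock */
  pthread_cond_t  no_active_readers;  /* wakes writers waiting for readers */
  unsigned int    active_readers;
  unsigned int    writers_waiting_readers;
  int             active_writer;
};

static void pr_init(struct pr_lock_sketch *rw)
{
  pthread_mutex_init(&rw->lock, NULL);
  pthread_cond_init(&rw->no_active_readers, NULL);
  rw->active_readers= 0;
  rw->writers_waiting_readers= 0;
  rw->active_writer= 0;
}

static void pr_rdlock(struct pr_lock_sketch *rw)
{
  /*
    Being able to get the mutex means no writer is active, so the
    rd-lock is granted immediately; readers never queue up behind a
    pending (not yet active) writer.
  */
  pthread_mutex_lock(&rw->lock);
  rw->active_readers++;
  pthread_mutex_unlock(&rw->lock);
}

static void pr_wrlock(struct pr_lock_sketch *rw)
{
  pthread_mutex_lock(&rw->lock);      /* uncontended case: just this call */
  if (rw->active_readers != 0)
  {
    /* Wait until the readers which got in before us have left. */
    rw->writers_waiting_readers++;
    while (rw->active_readers != 0)
      pthread_cond_wait(&rw->no_active_readers, &rw->lock);
    rw->writers_waiting_readers--;
  }
  rw->active_writer= 1;
  /* The mutex stays locked until pr_unlock(). */
}

static void pr_unlock(struct pr_lock_sketch *rw)
{
  if (rw->active_writer)
  {
    /*
      wr-unlock: wake a writer parked on the condition variable, if any.
      Signal before unlocking the mutex, since the header comment in the
      diff requires that unlock does not touch the object after its state
      becomes "unlocked" (the owner may destroy it right away).
    */
    rw->active_writer= 0;
    if (rw->writers_waiting_readers != 0)
      pthread_cond_signal(&rw->no_active_readers);
    pthread_mutex_unlock(&rw->lock);
  }
  else
  {
    /* rd-unlock: the last reader wakes up one waiting writer. */
    pthread_mutex_lock(&rw->lock);
    if (--rw->active_readers == 0 && rw->writers_waiting_readers != 0)
      pthread_cond_signal(&rw->no_active_readers);
    pthread_mutex_unlock(&rw->lock);
  }
}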
Diffstat (limited to 'include/my_pthread.h')
-rw-r--r--  include/my_pthread.h  106
1 file changed, 69 insertions(+), 37 deletions(-)
diff --git a/include/my_pthread.h b/include/my_pthread.h
index 5cf181596ad..27ab5ba23fe 100644
--- a/include/my_pthread.h
+++ b/include/my_pthread.h
@@ -594,7 +594,7 @@ int my_pthread_fastmutex_lock(my_pthread_fastmutex_t *mp);
/* Use our own version of read/write locks */
#define NEED_MY_RW_LOCK 1
#define rw_lock_t my_rw_lock_t
-#define my_rwlock_init(A,B) my_rw_init((A), 0)
+#define my_rwlock_init(A,B) my_rw_init((A))
#define rw_rdlock(A) my_rw_rdlock((A))
#define rw_wrlock(A) my_rw_wrlock((A))
#define rw_tryrdlock(A) my_rw_tryrdlock((A))
@@ -606,49 +606,82 @@ int my_pthread_fastmutex_lock(my_pthread_fastmutex_t *mp);
#endif /* USE_MUTEX_INSTEAD_OF_RW_LOCKS */
-/*
- Portable read-write locks which prefer readers.
-
- Required by some algorithms in order to provide correctness.
+/**
+ Portable implementation of special type of read-write locks.
+
+ These locks have two properties which are unusual for rwlocks:
+ 1) They "prefer readers" in the sense that they do not allow
+ situations in which rwlock is rd-locked and there is a
+ pending rd-lock which is blocked (e.g. due to pending
+ request for wr-lock).
+ This is a stronger guarantee than one which is provided for
+ PTHREAD_RWLOCK_PREFER_READER_NP rwlocks in Linux.
+ MDL subsystem deadlock detector relies on this property for
+ its correctness.
+ 2) They are optimized for uncontended wr-lock/unlock case.
+ This is the scenario in which they are most often used
+ within the MDL subsystem. Optimizing for it gives significant
+ performance improvements in some tests involving many
+ connections.
+
+ Another important requirement imposed on this type of rwlock
+ by the MDL subsystem is that it should be OK to destroy rwlock
+ object which is in unlocked state even though some threads might
+ have not yet fully left unlock operation for it (of course there
+ is an external guarantee that no thread will try to lock rwlock
+ which is destroyed).
+ Putting it another way the unlock operation should not access
+ rwlock data after changing its state to unlocked.
+
+ TODO/FIXME: We should consider alleviating this requirement as
+ it blocks us from doing certain performance optimizations.
*/
-#if defined(HAVE_PTHREAD_RWLOCK_RDLOCK) && defined(HAVE_PTHREAD_RWLOCKATTR_SETKIND_NP)
-/*
- On systems which have a way to specify that readers should
- be preferred through attribute mechanism (e.g. Linux) we use
- system implementation of read/write locks.
-*/
-#define rw_pr_lock_t pthread_rwlock_t
+typedef struct st_rw_pr_lock_t {
+ /**
+ Lock which protects the structure.
+ Also held for the duration of wr-lock.
+ */
+ pthread_mutex_t lock;
+ /**
+ Condition variable which is used to wake-up
+ writers waiting for readers to go away.
+ */
+ pthread_cond_t no_active_readers;
+ /** Number of active readers. */
+ uint active_readers;
+ /** Number of writers waiting for readers to go away. */
+ uint writers_waiting_readers;
+ /** Indicates whether there is an active writer. */
+ my_bool active_writer;
+#ifdef SAFE_MUTEX
+ /** Thread holding wr-lock (for debug purposes only). */
+ pthread_t writer_thread;
+#endif
+} rw_pr_lock_t;
+
extern int rw_pr_init(rw_pr_lock_t *);
-#define rw_pr_rdlock(A) pthread_rwlock_rdlock(A)
-#define rw_pr_wrlock(A) pthread_rwlock_wrlock(A)
-#define rw_pr_tryrdlock(A) pthread_rwlock_tryrdlock(A)
-#define rw_pr_trywrlock(A) pthread_rwlock_trywrlock(A)
-#define rw_pr_unlock(A) pthread_rwlock_unlock(A)
-#define rw_pr_destroy(A) pthread_rwlock_destroy(A)
+extern int rw_pr_rdlock(rw_pr_lock_t *);
+extern int rw_pr_wrlock(rw_pr_lock_t *);
+extern int rw_pr_unlock(rw_pr_lock_t *);
+extern int rw_pr_destroy(rw_pr_lock_t *);
+#ifdef SAFE_MUTEX
+#define rw_pr_lock_assert_write_owner(A) \
+ DBUG_ASSERT((A)->active_writer && pthread_equal(pthread_self(), \
+ (A)->writer_thread))
+#define rw_pr_lock_assert_not_write_owner(A) \
+ DBUG_ASSERT(! (A)->active_writer || ! pthread_equal(pthread_self(), \
+ (A)->writer_thread))
+#else
#define rw_pr_lock_assert_write_owner(A)
#define rw_pr_lock_assert_not_write_owner(A)
-#else
-/* Otherwise we have to use our own implementation of read/write locks. */
-#define NEED_MY_RW_LOCK 1
-struct st_my_rw_lock_t;
-#define rw_pr_lock_t my_rw_lock_t
-extern int rw_pr_init(struct st_my_rw_lock_t *);
-#define rw_pr_rdlock(A) my_rw_rdlock((A))
-#define rw_pr_wrlock(A) my_rw_wrlock((A))
-#define rw_pr_tryrdlock(A) my_rw_tryrdlock((A))
-#define rw_pr_trywrlock(A) my_rw_trywrlock((A))
-#define rw_pr_unlock(A) my_rw_unlock((A))
-#define rw_pr_destroy(A) my_rw_destroy((A))
-#define rw_pr_lock_assert_write_owner(A) my_rw_lock_assert_write_owner((A))
-#define rw_pr_lock_assert_not_write_owner(A) my_rw_lock_assert_not_write_owner((A))
-#endif /* defined(HAVE_PTHREAD_RWLOCK_RDLOCK) && defined(HAVE_PTHREAD_RWLOCKATTR_SETKIND_NP) */
+#endif /* SAFE_MUTEX */
#ifdef NEED_MY_RW_LOCK
/*
- On systems which don't support native read/write locks, or don't support
- read/write locks which prefer readers we have to use own implementation.
+ On systems which don't support native read/write locks we have
+ to use own implementation.
*/
typedef struct st_my_rw_lock_t {
pthread_mutex_t lock; /* lock for structure */
@@ -656,13 +689,12 @@ typedef struct st_my_rw_lock_t {
pthread_cond_t writers; /* waiting writers */
int state; /* -1:writer,0:free,>0:readers */
int waiters; /* number of waiting writers */
- my_bool prefer_readers;
#ifdef SAFE_MUTEX
pthread_t write_thread;
#endif
} my_rw_lock_t;
-extern int my_rw_init(my_rw_lock_t *, my_bool *);
+extern int my_rw_init(my_rw_lock_t *);
extern int my_rw_destroy(my_rw_lock_t *);
extern int my_rw_rdlock(my_rw_lock_t *);
extern int my_rw_wrlock(my_rw_lock_t *);
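As a usage reference, a caller of the prlock API declared in this header would look roughly like the hypothetical example below. The names map_lock, map_version and the surrounding functions are invented for illustration; real callers sit in the MDL subsystem and normally go through the mysql_prlock_* wrappers from include/mysql/psi/mysql_thread.h rather than calling rw_pr_* directly.

/*
  Hypothetical example: shows only the calling pattern of the rw_pr_*
  API declared in include/my_pthread.h.
*/
#include <my_global.h>
#include <my_pthread.h>

static rw_pr_lock_t map_lock;
static ulong map_version= 0;

void map_init(void)
{
  rw_pr_init(&map_lock);
}

ulong map_get_version(void)
{
  ulong version;
  rw_pr_rdlock(&map_lock);                  /* shared access */
  version= map_version;
  rw_pr_unlock(&map_lock);
  return version;
}

void map_bump_version(void)
{
  rw_pr_wrlock(&map_lock);                  /* exclusive access */
  rw_pr_lock_assert_write_owner(&map_lock); /* expands to nothing unless SAFE_MUTEX */
  map_version++;
  rw_pr_unlock(&map_lock);
}

void map_destroy(void)
{
  /* Safe: the lock is in the unlocked state at this point. */
  rw_pr_destroy(&map_lock);
}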