author | Keith Bostic <keith@wiredtiger.com> | 2015-08-23 10:11:36 -0400 |
---|---|---|
committer | Keith Bostic <keith@wiredtiger.com> | 2015-08-23 10:11:36 -0400 |
commit | 3a41ccbadfa59eb1dffcb2f8173299a1c974d815 (patch) | |
tree | ba405cefd6d98f25f9282b4173e934f48b2eac18 /src/include/btmem.h | |
parent | dc2adba3e2d9c51dfb23bac3442a986acb444c25 (diff) | |
download | mongo-3a41ccbadfa59eb1dffcb2f8173299a1c974d815.tar.gz |
There are three locks in play with the lookaside file, and they're leading
to deadlock. Reconciliation takes a page lock (IFF compaction is running),
then acquires the shared lookaside cursor (IFF eviction is being performed
by an application thread), then acquires a second page lock in order to
insert a record into the lookaside file. The simple deadlock is when the
two page locks are the same. More complicated deadlocks are possible: for
example, thread X acquires page lock 5, acquires the shared lookaside
cursor, and sleeps; thread Y acquires page lock 6, then waits on the
lookaside cursor; thread X wakes and attempts to acquire page lock 6 as
its second page lock.
There's a WT_PAGE_SCANNING lock that reconciliation always acquires in order
to block threads trimming update lists while reconciliation is running.
Rename that lock to WT_PAGE_RECONCILIATION and give it the more general
meaning that reconciliation is working on a page.
Change compaction to use the new WT_PAGE_RECONCILIATION lock instead of
the page lock. This means compaction can collide with threads trimming
update lists, but compaction is both a relatively rare operation and
only holds the lock for a short time.
Page locks revert to their original (and only remaining) use:
serialization around page inserts.
Reconciliation does less work when compaction is configured (acquiring
one less lock), and the combination of compaction and reconciliation
no longer blocks page inserts.
This also simplifies compaction. Previously, compaction set a flag to
instruct reconciliation to start taking page locks, and then waited for
on-going reconciliation work to drain to ensure it didn't race. The
wait-to-drain is no longer necessary because compaction now uses a lock
that reconciliation always acquires.
Add a new F_CAS_ATOMIC_WAIT macro, the same as F_CAS_ATOMIC, but it loops
until successful.
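
For illustration only, a minimal sketch of the try-once versus loop-until-successful
behavior described above, written with C11 atomics. This is not the WiredTiger
macro: struct page_sketch, flag_cas_try, flag_cas_wait and flag_clear are
hypothetical names, the yield between attempts (POSIX sched_yield) is an
assumption, and only the 0x10 bit value is taken from this change.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit value as WT_PAGE_RECONCILIATION; the name is shortened for the sketch. */
#define PAGE_RECONCILIATION 0x10

/* Hypothetical stand-in for the page structure: only the atomic flags byte. */
struct page_sketch {
    _Atomic uint8_t flags_atomic;
};

/*
 * Try once to set the mask bits. Fails if any of them is already set,
 * that is, if another thread currently holds the flag.
 */
static bool
flag_cas_try(struct page_sketch *page, uint8_t mask)
{
    uint8_t orig = atomic_load(&page->flags_atomic);

    do {
        if (orig & mask)
            return (false);
        /* On failure, orig is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(
        &page->flags_atomic, &orig, (uint8_t)(orig | mask)));
    return (true);
}

/* The waiting variant: the same attempt, repeated until it succeeds. */
static void
flag_cas_wait(struct page_sketch *page, uint8_t mask)
{
    while (!flag_cas_try(page, mask))
        sched_yield(); /* Assumption: yield rather than spin hot. */
}

/* Release the flag by clearing the mask bits, leaving other bits alone. */
static void
flag_clear(struct page_sketch *page, uint8_t mask)
{
    atomic_fetch_and(&page->flags_atomic, (uint8_t)~mask);
}
```

In this sketch a caller that can skip the page would use flag_cas_try and move
on when it fails, while a caller that must get the flag would use flag_cas_wait
and later release it with flag_clear.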
Diffstat (limited to 'src/include/btmem.h')
-rw-r--r-- | src/include/btmem.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/src/include/btmem.h b/src/include/btmem.h
index f613c082c77..e313ff412da 100644
--- a/src/include/btmem.h
+++ b/src/include/btmem.h
@@ -579,7 +579,7 @@ struct __wt_page {
 #define WT_PAGE_DISK_ALLOC 0x02 /* Disk image in allocated memory */
 #define WT_PAGE_DISK_MAPPED 0x04 /* Disk image in mapped memory */
 #define WT_PAGE_EVICT_LRU 0x08 /* Page is on the LRU queue */
-#define WT_PAGE_SCANNING 0x10 /* Obsolete updates are being scanned */
+#define WT_PAGE_RECONCILIATION 0x10 /* Page reconciliation lock */
 #define WT_PAGE_SPLIT_INSERT 0x20 /* A leaf page was split for append */
 #define WT_PAGE_SPLIT_LOCKED 0x40 /* An internal page is growing */
 uint8_t flags_atomic; /* Atomic flags, use F_*_ATOMIC */