| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
(cherry picked from: 8f371403f0ccfae0188d7e4c2e6d629ade697b13)
|
| |
(cherry picked from commit: 84e6ac0e67019bba22af87b99b40bb0bc0e21157)
When pages split in WiredTiger, internal pages cannot be evicted
immediately because there is a chance that a reader is still looking at
an index pointing to the page. We check for this when considering pages
for eviction, and assert that we never evict an internal page in an
active generation.
However, if a page splits and then we try to get exclusive access to
the tree (e.g., to verify it), we could fail to evict the tree from
cache even though we have guaranteed exclusive access to it.
Relax the check on internal pages to allow eviction from trees that are
locked exclusive.
|
|
|
| |
Also don't check for a full cache while holding the table lock (we're likely reading the metadata in that case, just being extra careful).
|
|\ |
|
| |
| |
| |
| |
| |
| | |
We could race an in-progress switch that set a new, empty active slot
but has not yet released the previously active slot and get an
incorrect LSN.
|
| | |
|
|\ \
| |/ |
|
| |
| |
| |
| | |
Unify spinlock structures.
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| | |
A MongoDB user on Windows noticed the "LSM: application work units currently queued" statistic was changing in a configuration that involved no LSM code. This was caused by a bug in the code that tracks time spent in spinlocks, which was incrementing the wrong statistic.
In particular, spinlocks contain fields describing which statistics should be used to track time spent in that spinlock. A value of -1 indicates that the spinlock should not be tracked, but a value of zero is the first statistic in the array for a connection, which happens to be the "LSM: application work units currently queued" statistic. The Windows implementation of spinlocks was not setting these fields to -1, leading to the bug.
This bug was introduced by WT-2955 and also meant that every WiredTiger spinlock on Windows was being timed, which may have negatively impacted Windows performance.
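The failure mode described above can be sketched as follows. This is only an illustration under stated assumptions: the structure, field, and function names are hypothetical and are not WiredTiger's actual definitions. It shows how a zero-initialized "statistic slot" field silently selects the first statistic, while -1 disables tracking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define STAT_UNTRACKED (-1) /* Sentinel: do not time this spinlock. */
#define STAT_COUNT 4

/* Hypothetical spinlock: only the statistics bookkeeping is modeled. */
struct sketch_spinlock {
    int16_t stat_usecs_off; /* Index into stats[], or STAT_UNTRACKED. */
};

/* Hypothetical per-connection statistics; slot 0 is the first statistic. */
static int64_t stats[STAT_COUNT];

/* Buggy initialization: zeroing the structure selects statistic slot 0. */
static void
spin_init_buggy(struct sketch_spinlock *t)
{
    memset(t, 0, sizeof(*t));
}

/* Fixed initialization: explicitly mark the lock as untracked. */
static void
spin_init_fixed(struct sketch_spinlock *t)
{
    memset(t, 0, sizeof(*t));
    t->stat_usecs_off = STAT_UNTRACKED;
}

/* Charge time spent in the lock to the configured statistic, if any. */
static void
spin_account(const struct sketch_spinlock *t, int64_t usecs)
{
    assert(t->stat_usecs_off < STAT_COUNT);
    if (t->stat_usecs_off != STAT_UNTRACKED)
        stats[t->stat_usecs_off] += usecs;
}
```

With the buggy initializer, every call to spin_account() is charged to slot 0, whichever statistic happens to live there.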
|
|\ \
| |/ |
|
| | |
|
| |
| |
| |
| | |
Otherwise the number of workers wouldn't adjust when the workload changed.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* Set minimum split pct to 50.
* The leaf-page value dictionary stores cell offsets in the disk image, which implies a dictionary reset any time we hit a boundary or grow the disk image buffer. Recent changes broke that: we weren't resetting the dictionary when the disk image buffer was resized.
Instead of clearing the dictionary on buffer resize, switch to using cell offsets in the dictionary instead of cell pointers. It's unlikely to be a big win for many workloads, but it might help some, and it's cleaner than resetting the dictionary more often.
Add a verify of disk images we don't write: the I/O routines verify any image we write, but we need to verify any image we create.
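A minimal sketch of why offsets survive a buffer resize where pointers do not. The buffer type and helper names here are hypothetical, chosen for illustration only; they are not the reconciliation code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable disk-image buffer. */
struct image {
    unsigned char *mem;
    size_t len; /* Bytes in use. */
    size_t cap; /* Bytes allocated. */
};

/*
 * Append bytes, growing the buffer if needed. Returns the offset of the
 * appended cell, or (size_t)-1 if the allocation fails.
 */
static size_t
image_append(struct image *img, const void *data, size_t size)
{
    unsigned char *p;
    size_t newcap, off;

    if (img->len + size > img->cap) {
        newcap = (img->len + size) * 2;
        if ((p = realloc(img->mem, newcap)) == NULL)
            return ((size_t)-1);
        img->mem = p; /* Pointers saved into the old buffer are now stale. */
        img->cap = newcap;
    }
    off = img->len;
    memcpy(img->mem + off, data, size);
    img->len += size;
    return (off);
}

/* Resolve a previously recorded offset against the current buffer. */
static const unsigned char *
image_cell(const struct image *img, size_t off)
{
    assert(off < img->len);
    return (img->mem + off);
}
```

A dictionary that records the value returned by image_append() can keep its entries across any number of resizes, because each lookup goes through image_cell() and the buffer's current address.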
|
| | |
Set WT_CONN_CLOSING earlier in the connection close process (before
calling the async close functions). This requires removing the assert
in btree handle open that close hasn't yet been called. Add a barrier
after setting the connection close flag to ensure the write is flushed.
LSM workers checked both the WT_CONN_SERVER_RUN and WT_LSM_WORKER_RUN
flags because the LSM destroy path (__lsm_manager_worker_shutdown)
didn't clear the WT_LSM_WORKER_RUN flag. Add that clear, and change
__lsm_worker to only check WT_LSM_WORKER_RUN.
Previously, the LSM manager checked the WT_CONN_SERVER_RUN flag in the
LSM destroy path and connection shutdown waited on the LSM manager to
stop and clear WT_CONN_SERVER_LSM. Flip that process: the LSM shutdown
path now clears WT_CONN_SERVER_LSM, and the LSM manager stops when it
sees WT_CONN_SERVER_LSM is cleared. The LSM manager sets a new flag,
WT_LSM_MANAGER_SHUTDOWN, when it's stopped, and the shutdown process
waits on that new flag.
Add memory barriers to the thread create and join functions. WiredTiger
typically sets (clears) state and expects threads to see the state and
start (stop). It is simpler and safer if we imply a barrier in the thread
API.
* Rename WT_CONN_LOG_SERVER_RUN to WT_CONN_SERVER_LOG to match the other
server flags.
* Once the async and LSM servers have exited, assert no more files are
opened.
* Instead of using a barrier to ensure the worker run state isn't cached,
declare the structure field volatile. Use a stand-alone structure field
instead of a set of flags, it's a simpler "volatile" story.
* In one of two places, when shutting down worker threads, we signalled the
condition variable to wake the worker thread. For consistency, remove the
signal (we're only sleeping for a 100th of a second, so the wake isn't
buying us anything).
* Restore the assertion in __open_session() that we're not in the
"closing" path. Returning an error is more dangerous: it might
cause a thread to panic, and then we have a panic racing with the
close.
* A wt_thread_t (POSIX pthread_t) is an opaque type, and can't be assigned
to 0 or tested against an integral value portably. Add a
bool WT_LSM_WORKER_ARGS.tid_set field instead of assigning or testing the
wt_thread_t.
We already have an __wt_lsm_start function, add a __wt_lsm_stop function
and move the setting/clearing of the WT_LSM_WORKER_ARGS.{running,tid_set}
fields into those functions so we ensure the ordering is correct.
|
| | |
|
| |
| |
| | |
This avoids metadata operations failing in in-memory configurations.
|
| |
| |
| |
| | |
This reverts commit 1c41c7735b3529521b7bd34180f80584caee7f59.
|
| |
| |
| | |
* Changes `split_pct` to have a minimum of 50%.
|
| |
| |
| | |
Non-zero int values for these functions should not raise exceptions.
|
| |
| |
| |
| |
| | |
Panic if there's an error in reading/writing from/to the turtle file,
there's no point in continuing. This change avoids user confusion when
the turtle file is corrupted or zero'd out by the filesystem.
|
| | |
* WT-3240 Coverity reports
Coverity report 1373075: allocated memory is leaked if __wt_snprintf
fails.
* Coverity report 1373074: allocated memory is leaked if __wt_snprintf
fails.
* Coverity report 1373073: allocated memory is leaked if __wt_snprintf
fails.
* Coverity report 1373072: allocated memory is leaked if __wt_snprintf
fails.
* Coverity report 1373071: allocated memory is leaked if __wt_snprintf
fails.
* Coverity report 1369053: CID 1369053 (#1 of 1): Unused value
(UNUSED_VALUE) assigned_pointer: Assigning value from "," to
append_comma here, but that stored value is overwritten before
it can be used.
|
| | |
|
| |
| |
| |
| |
| | |
Revert "Change LSM WT_CURSOR.{compare,insert,update,remove} to accept an internal key instead of copying the key into WiredTiger-owned memory (in other words, replace WT_CURSOR_NEEDKEY calls with WT_CURSOR_CHECKKEY)."
This reverts commit af2c787.
|
| |
| |
| | |
Fix a typo.
|
| |
| |
| |
| |
| |
| | |
Add a style check for use of the snprintf/vsnprintf calls rather than
the WiredTiger library replacements.
Fix a wtperf snprintf call I missed.
|
| | |
* WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
Make a pass through the source base to check sprintf, snprintf, vsprintf
and vsnprintf calls for errors.
* A WiredTiger key is a uint64_t.
Use sizeof(), don't hard-wire buffer sizes into the code.
* More (u_int) vs. (uint64_t) fixes.
* Use CONFIG_APPEND instead of FORMAT_APPEND, it makes more sense.
* revert part of 4475ae9, there's an explicit allocation of the size of
the buffer.
* MSVC complaints:
test\format\config.c(765): warning C4018: '<': signed/unsigned mismatch
test\format\config.c(765): warning C4018: '>': signed/unsigned mismatch
* Change Windows testing shim to correctly use __wt_snprintf
* Change Windows test shim to use the __wt_XXX functions
* MSDN's _vscprintf API returns the number of characters excluding the
terminating nul byte, return that value.
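The kind of checking this pass introduces can be sketched as below. This is not the implementation of __wt_snprintf; the wrapper name and its error codes are assumptions made for illustration. It treats both a formatting error and a truncated result as failures.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * checked_snprintf --
 *     Hypothetical wrapper: format into buf, returning 0 on success, EINVAL
 *     if vsnprintf reports an error, and ERANGE if the output was truncated.
 */
static int
checked_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int len;

    assert(buf != NULL && size > 0);

    va_start(ap, fmt);
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (len < 0)
        return (EINVAL);
    /* vsnprintf returns the length excluding the terminating nul byte. */
    if ((size_t)len >= size)
        return (ERANGE);
    return (0);
}
```

Callers pass sizeof(buf) rather than a hard-wired size, so the truncation check stays correct if the buffer declaration changes.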
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* WT-98 Update the current cursor value without a search
When running in-memory and insert/update fails, we should expect
WT_ROLLBACK even when not running inside a transaction.
* Order the operations alphabetically (they were ordered the way they were
because of the order in which we used to choose operations, but that's no
longer the case).
|
| | |
|
|\ \
| |/ |
|
| |
| |
| |
| |
| | |
* Table cursors with overwrite configured wrongly treat not-found as an error, return success instead.
* The LSM code clears WT_CURSTD_KEY_SET on unsuccessful searches, which breaks table cursors with indices doing searches on the set of cursors in order to delete old index keys, because there's no key set when it's time to do the update.
|
| |
| | |
* Update WiredTiger build for clang 4.0.
ex_all.c:852:7: error: possible misuse of comma operator here [-Werror,-Wcomma]
p1++, p2++;
^
ex_all.c:852:3: note: cast expression to void to silence warning
p1++, p2++;
^~~~
(void)( )
1 error generated.
* wtperf.c:2670:4: error: code will never be executed [-Werror,-Wunreachable-code]
pos += (size_t)snprintf(
^~~
wtperf.c:2669:23: note: silence by adding parentheses to mark code as explicitly dead
if (opts->verbose > 1 && strlen(debug_tconfig) != 0)
^
/* DISABLES CODE */ ( )
wtperf.c:2630:4: error: code will never be executed [-Werror,-Wunreachable-code]
pos += (size_t)snprintf(
^~~
wtperf.c:2629:23: note: silence by adding parentheses to mark code as explicitly dead
if (opts->verbose > 1 && strlen(debug_cconfig) != 0)
^
/* DISABLES CODE */ ( )
2 errors generated.
|
| |
| |
| |
| |
| |
| | |
Renaming a file on Windows (including the turtle file) is a two-step process: remove the original, then move the replacement into place (a DeleteFileW followed by a MoveFileW). If we crash in the middle (and in SERVER-28194 there appears to be a weirder failure mode, where the DeleteFileW succeeded but the file was still there), we can be left without a turtle file, which loses all of the data in the database.
* Add the MOVEFILE_WRITE_THROUGH flag to the MoveFileEx call. If we somehow end up in a copy-then-delete path, that flag adds a disk flush after the copy phase, so the window of vulnerability is as short as possible.
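A hedged sketch of a single-call replacement. The helper name is hypothetical, and the use of MOVEFILE_REPLACE_EXISTING is an assumption (the text above only names MOVEFILE_WRITE_THROUGH). On non-Windows builds the sketch falls back to rename(), which replaces an existing target.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/*
 * replace_file --
 *     Hypothetical helper: move "from" over "to" in one call, so there is
 *     no separate delete step. Returns 0 on success, -1 on failure.
 */
static int
replace_file(const char *from, const char *to)
{
    assert(from != NULL && to != NULL);
#ifdef _WIN32
    /*
     * Assumption: replace the target in the same call. Write-through asks
     * for a flush before returning if the move is done as copy-then-delete.
     */
    return (MoveFileExA(from, to,
        MOVEFILE_REPLACE_EXISTING | MOVEFILE_WRITE_THROUGH) ? 0 : -1);
#else
    /* POSIX rename replaces an existing target. */
    return (rename(from, to) == 0 ? 0 : -1);
#endif
}
```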
|
| |
| |
| |
| | |
in_memory (#3341)
|
| | |
|
| | |
Once an LSM primary is known to be on disk, we expect readers to use the
checkpoint. The original page image for the primary will then be
discarded by an LSM worker thread.
We previously allowed the LSM primary to be evicted in between so that
eviction workers can deal with cache pressure ahead of the LSM worker
threads discarding the chunk. However, that leads to cases where
application threads end up evicting a 100MB page, and also means that
discarding the chunk needs to worry about split generations (the cause
of the assertion failure here).
The solution suggested here is simple: never enable eviction in LSM
primaries, which also means we never need to fix up cache accounting.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
(#3338)
The Python test suite uses "XXX: " as its error prefix, and the WiredTiger
error routines append a comma and space after the error prefix in error
messages. This means the error messages come out "XXX: , YYY". Remove the
colon and space from the declared error_prefix so the error messages come
out "XXX, YYY".
|
| |
| |
| |
| |
| |
| | |
Move lsm_primary check near evict_disabled check.
The assertion was caused by `WT_BTREE_NO_RECONCILE`, which allows in-memory splits even when eviction is disabled. Rename that flag `WT_BTREE_ALLOW_SPLITS` for clarity.
|
| | |
|
| |
| |
| |
| |
| | |
* Build a static library with -fPIC objects, suitable for pulling into a dynamic library. Distribute our SWIG results, rather than running SWIG on the target machine.
* Added builtin support for snappy and zlib. Made it easy to manage the list of builtins.
|
| | |
|
| | |
|
| | |
* WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
The call's return is cast to int because CentOS 7.3.1611 complains
about syscall returning a long and the loss of integer precision in
the assignment to ret. The cast should be a no-op everywhere.
* On CentOS 7.3.1611, system header files aren't compatible with
-Wdisabled-macro-expansion. I don't see a big reason for having
that warning, so I'm turning it off generally.
Add -Wuninitialized to WiredTiger's gcc builds.
|
| | |
* WT-3204 eviction changes cost LSM performance
Modify LSM's primary chunk switching to match the new btree eviction
semantics on object creation. We now create objects with eviction turned
off, LSM should no longer have to turn eviction off when configuring the
primary chunk.
LSM previously set WT_BTREE.bulk_load_ok to false to ensure an insert
into the tree wouldn't turn eviction on. That problem remains, but
there's a race in the implementation if multiple threads are inserting
at the same time (where a thread modifies WT_BTREE.bulk_load_ok and goes
to sleep before configuring eviction, and another thread does an insert
and turns off eviction), and there's a further race between threads
doing F_ISSET/F_SET tests. Change the WT_BTREE_LSM_PRIMARY flag into a
WT_BTREE.lsm_primary variable so there's no F_ISSET/F_SET race. Remove
the test/set of bulk_load_ok; instead, test the lsm_primary value in the
btree code before turning eviction off.
When checkpointing an LSM chunk, move the code that turns off the
chunk's primary flag inside the single-threaded part of the
function to ensure we don't race with other threads doing checkpoints.
That makes the code to fix up the accounting single-threaded and safe.
Simplify the LSM checkpoint code to call __wt_checkpoint directly, and
use the same handle for turning off the chunk's primary flag as we use
for the checkpoint.
* Force a primary switch in LSM after an exclusive-handle operation
has come through.
Otherwise it's possible to attempt to use a file as the primary chunk
without disabling eviction.
* spelling
* WT_BTREE.bulk_load_ok isn't a boolean, don't use true/false comparisons.
* Only check for an empty tree the first time an LSM chunk is opened.
The goal here is to make sure that LSM primary chunks start empty.
Otherwise, we can't load into a skiplist in memory as required by LSM.
If an operation such as verify closes a btree in order to check the
on-disk state, the next time it is opened we have to check whether it is
empty.
It is safe to do this check without locking: what matters is that we
always do the `lsm_primary` check before any update operation that would
turn off `btree->bulk_load_ok`.
* Rename WT_BTREE.bulk_load_ok to be WT_BTREE.original, it's used by LSM.
* Fix a comment.
|
| | |
We use split generations to detect when readers may be looking at
structures that are replaced by a split. For correctness, we should
only increment the global split generation *after* a split becomes
public. Only then can we safely check that no thread is still reading
with the old generation.
Previously, a split could increment the global split generation, then a
thread could start reading with the new split generation but see the old
index structure.
This issue was introduced by WT-3088, where we wanted a way to ensure
that newly-allocated pages don't split until it is safe. That is solved
here by having the split code pin a split generation in the ordinary way
(without allocating a new one) for the duration that splits of new
pages need to be prevented.
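The ordering rule described above can be sketched with a single-threaded model. All names are hypothetical, and the atomics and memory barriers that real concurrent code would need are omitted: the split publishes the new index first and only then advances the generation, so any reader that pins the new generation is guaranteed to see the new index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define READERS 2

static uint64_t split_gen = 1;   /* Global split generation. */
static uint64_t pinned[READERS]; /* Per-reader pinned generation, 0 if idle. */
static const int *current_index; /* The published index structure. */

/* Reader: pin the current generation, then read the index. */
static const int *
reader_enter(int id)
{
    assert(id >= 0 && id < READERS);
    pinned[id] = split_gen;
    return (current_index);
}

/* Reader: release the pinned generation. */
static void
reader_leave(int id)
{
    assert(id >= 0 && id < READERS);
    pinned[id] = 0;
}

/*
 * Split: make the new index public, then advance the generation. Returns
 * the generation at which the old index was retired.
 */
static uint64_t
split_publish(const int *new_index)
{
    current_index = new_index; /* Step 1: the split becomes public. */
    return (++split_gen);      /* Step 2: only then advance the generation. */
}

/* The old index can be freed once no reader is pinned before retire_gen. */
static bool
can_free_old(uint64_t retire_gen)
{
    for (int i = 0; i < READERS; i++)
        if (pinned[i] != 0 && pinned[i] < retire_gen)
            return (false);
    return (true);
}
```

If the two steps in split_publish() were swapped, a reader could pin the new generation, still read the old index, and pass the can_free_old() check while holding a reference to it.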
|
| |
| |
| |
| |
| |
| | |
* Recheck for existence after acquiring write lock when creating a new dhandle.
* Add a wtperf workload that reproduced the original failure.
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Also switch to holding the schema lock when completing a bulk load.
This avoids a race with checkpoints starting, so avoids a failure mode
that was added to checkpoint earlier in this ticket. Assert that we
don't hit that case instead.
|
| |
| |
| |
| |
| |
| |
| | |
error (#3326)
Have test/fops handle EBUSY returns from forced checkpoints and EINVAL from bulk cursors.
|
| | |
Previously, we gathered handles, then started a transaction, then
figured out which handles were clean and released them. However,
* checkpoints were keeping every handle in both its handle list and in
the meta_tracking list because the *_apply_all functions were saving all
handles when meta_tracking was active; and
* we had acquired exclusive locks on checkpoints to be dropped before
determining that we could skip a checkpoint in a clean tree. These
locks blocked drops (among other things) until the checkpoint completed.
The solution here is to first start the transaction, then check for
clean handles as checkpoint visits them. However, this has to cope with
races where a handle changes state in between the transaction starting
and getting the handle (e.g., table creates, bulk loads completing).
|