Commit message (Author, Age, Files, Lines)
* WT-3271 Prevent integer overflow in eviction tuning. (#3379) [mongodb-3.5.7] [mongodb-3.5.6] [mongodb-3.4.4] (Michael Cahill, 2017-04-11, 1 file, -17/+19)
|   (cherry picked from: 8f371403f0ccfae0188d7e4c2e6d629ade697b13)
* WT-3265 Allow eviction of recently split pages when tree is locked. (#3372) (Michael Cahill, 2017-04-08, 1 file, -1/+6)
|   (cherry picked from commit: 84e6ac0e67019bba22af87b99b40bb0bc0e21157)
|
|   When pages split in WiredTiger, internal pages cannot be evicted immediately because there is a chance that a reader is still looking at an index pointing to the page. We check for this when considering pages for eviction, and assert that we never evict an internal page in an active generation.
|
|   However, if a page splits and then we try to get exclusive access to the tree (e.g., to verify it), we could fail to evict the tree from cache even though we have guaranteed exclusive access to it. Relax the check on internal pages to allow eviction from trees that are locked exclusive.
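The relaxed check described in WT-3265 can be pictured with a small model. This is only a sketch under stated assumptions, not WiredTiger's code: `struct page_model`, `can_evict_internal` and the integer generations are hypothetical stand-ins for the real page and split-generation machinery.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of an internal page. */
struct page_model {
    uint64_t split_gen; /* generation in which the page last split */
};

/*
 * Return true if the internal page may be evicted. Normally the page's
 * split generation must be older than the oldest generation any reader
 * could still be using. With exclusive access to the tree there are no
 * such readers, so the generation check is skipped.
 */
static bool
can_evict_internal(const struct page_model *page,
    uint64_t oldest_active_gen, bool tree_locked_exclusive)
{
    assert(page != NULL);
    if (tree_locked_exclusive)
        return (true);
    return (page->split_gen < oldest_active_gen);
}
```

In this model a page that split in generation 10 cannot be evicted while generation 10 is still active, unless the caller holds the tree exclusively.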
* WT-3262 Don't check if the cache is full when accessing metadata. (#3376) (Michael Cahill, 2017-04-08, 1 file, -6/+11)
|   Also don't check for a full cache while holding the table lock (we're likely reading the metadata in that case, just being extra careful).
* Merge commit 'adbe2ec' into mongodb-3.6 (Alex Gorrod, 2017-04-06, 4 files, -16/+22)
|\
| * WT-3249 Look at slot_state during force while holding lock. (#3365) (sueloverso, 2017-04-04, 3 files, -15/+21)
| |   We could race with an in-progress switch that has set a new, empty active slot but has not yet released the previously active slot, and so get an incorrect LSN.
| * WT-3254 Fix typo in reconfig string (#3366) (sueloverso, 2017-04-04, 1 file, -1/+1)
| |
* | Merge branch 'develop' into mongodb-3.6 (Alex Gorrod, 2017-04-04, 2 files, -35/+19)
|\ \
| |/
| * WT-3250 Have one function initializing the WT portion of the spinlock. (#3364) (sueloverso, 2017-04-03, 2 files, -35/+19)
| |   Unify spinlock structures.
* | Merge branch 'develop' into mongodb-3.6 (Alex Gorrod, 2017-04-04, 1 file, -0/+3)
|\ \
| |/
| * WT-3250 Fix spinlock statistics tracking on Windows. (#3363) (Michael Cahill, 2017-04-03, 1 file, -0/+3)
| |   A MongoDB user on Windows noticed the "LSM: application work units currently queued" statistic was changing in a configuration that involved no LSM code. This was caused by a bug in the code that tracks time spent in spinlocks, which incremented the wrong statistic.
| |
| |   In particular, spinlocks contain fields describing which statistics should be used to track time spent in that spinlock. A value of -1 indicates that the spinlock should not be tracked, but a value of zero is the first statistic in the array for a connection, which happens to be the "LSM: application work units currently queued" statistic. The Windows implementation of spinlocks was not setting these fields to -1, leading to the bug.
| |
| |   This bug was introduced by WT-2955 and also meant that every WiredTiger spinlock on Windows was being timed, which may have negatively impacted Windows performance.
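The failure mode described in WT-3250 (a zero-initialized index silently selecting statistic 0) can be reproduced in a few lines. This is a sketch under stated assumptions: `struct spinlock_model`, its field names and the four-slot statistics array are hypothetical, chosen only to show why the fields must be set to -1 explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_NOT_TRACKED (-1) /* sentinel: do not time this lock */
#define STAT_COUNT 4          /* size of the model statistics array */

/* Hypothetical model: each lock names the statistics slots it updates. */
struct spinlock_model {
    int16_t stat_count_off; /* slot counting acquisitions, or -1 */
    int16_t stat_usecs_off; /* slot accumulating wait time, or -1 */
};

/* Initialize a lock as untracked; zero-filling would select slot 0. */
static void
spinlock_model_init(struct spinlock_model *lock)
{
    assert(lock != NULL);
    lock->stat_count_off = STAT_NOT_TRACKED;
    lock->stat_usecs_off = STAT_NOT_TRACKED;
}

/* Record one acquisition, skipping locks that are not tracked. */
static void
spinlock_model_record(const struct spinlock_model *lock,
    int64_t stats[STAT_COUNT], int64_t usecs)
{
    assert(lock != NULL && stats != NULL);
    if (lock->stat_count_off < 0 || lock->stat_usecs_off < 0)
        return;
    stats[lock->stat_count_off] += 1;
    stats[lock->stat_usecs_off] += usecs;
}
```

A lock whose fields are left at zero updates slot 0 on every acquisition, which in the commit's description is the unrelated LSM statistic; a lock passed through `spinlock_model_init` leaves the array untouched.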
* | Merge branch 'develop' into mongodb-3.6 (Alex Gorrod, 2017-04-01, 131 files, -1909/+2920)
|\ \
| |/
| * WT-3243 Reorder log slot release so joins don't wait on IO (#3360) (sueloverso, 2017-03-31, 7 files, -192/+221)
| |
| * WT-3190 perform a complete re-tune of eviction workers every 30 seconds. (#3324) (Alexandra (Sasha) Fedorova, 2017-03-30, 5 files, -201/+250)
| |   Otherwise the number of workers wouldn't adjust when the workload changed.
| * WT-2439 Enhance reconciliation page layout (#3358) (Keith Bostic, 2017-03-30, 7 files, -479/+591)
| |   * Set minimum split pct to 50.
| |   * The leaf-page value dictionary stores cell offsets in the disk image, which implies a dictionary reset any time we hit a boundary or grow the disk image buffer. Recent changes broke that: we weren't resetting the dictionary when the disk image buffer was resized. Instead of clearing the dictionary on buffer resize, switch to using cell offsets in the dictionary instead of cell pointers. It's unlikely to be a big win for many workloads, but it might help some, and it's cleaner than resetting the dictionary more often.
| |     Add a verify of disk images we don't write: the I/O routines verify any image we write, but we need to verify any image we create.
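Why offsets survive a buffer resize while pointers do not can be shown with a growable buffer. This is an illustrative sketch, not the reconciliation code: `struct image_model`, `image_append` and `image_cell` are hypothetical names, and the "dictionary entry" is simply the `size_t` offset returned by the append.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable disk-image buffer. */
struct image_model {
    unsigned char *mem;
    size_t size; /* bytes allocated */
    size_t used; /* bytes in use */
};

/*
 * Append a cell, growing the buffer if needed. Returns the cell's offset,
 * or (size_t)-1 if allocation fails. realloc may move the buffer, which
 * would invalidate any saved pointer into it, but not a saved offset.
 */
static size_t
image_append(struct image_model *img, const void *data, size_t len)
{
    size_t off;

    assert(img != NULL && data != NULL);
    if (img->used + len > img->size) {
        size_t newsize = (img->used + len) * 2;
        unsigned char *p = realloc(img->mem, newsize);
        if (p == NULL)
            return ((size_t)-1);
        img->mem = p;
        img->size = newsize;
    }
    off = img->used;
    memcpy(img->mem + off, data, len);
    img->used += len;
    return (off);
}

/* Resolve a saved offset against the buffer's current location. */
static const unsigned char *
image_cell(const struct image_model *img, size_t off)
{
    assert(img != NULL && off < img->used);
    return (img->mem + off);
}
```

The design point is that the offset is resolved at lookup time, so a dictionary of offsets needs no reset when the buffer grows.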
| * WT-3155 Remove WT_CONN_SERVER_RUN flag (#3344) (Keith Bostic, 2017-03-29, 12 files, -58/+114)
| |   Set WT_CONN_CLOSING earlier in the connection close process (before calling the async close functions). This requires removing the assert in btree handle open that close hasn't yet been called. Add a barrier after setting the connection close flag to ensure the write is flushed.
| |
| |   LSM workers checked both the WT_CONN_SERVER_RUN and WT_LSM_WORKER_RUN flags because the LSM destroy path (__lsm_manager_worker_shutdown) didn't clear the WT_LSM_WORKER_RUN flag. Add that clear, and change __lsm_worker to only check WT_LSM_WORKER_RUN.
| |
| |   Previously, the LSM manager checked the WT_CONN_SERVER_RUN flag in the LSM destroy path, and connection shutdown waited on the LSM manager to stop and clear WT_CONN_SERVER_LSM. Flip that process: the LSM shutdown path now clears WT_CONN_SERVER_LSM, and the LSM manager stops when it sees WT_CONN_SERVER_LSM is cleared. The LSM manager sets a new flag, WT_LSM_MANAGER_SHUTDOWN, when it's stopped, and the shutdown process waits on that new flag.
| |
| |   Add memory barriers to the thread create and join functions. WiredTiger typically sets (clears) state and expects threads to see the state and start (stop). It's simpler and safer if we imply a barrier in the thread API.
| |
| |   * Rename WT_CONN_LOG_SERVER_RUN to WT_CONN_SERVER_LOG to match the other server flags.
| |   * Once the async and LSM servers have exited, assert no more files are opened.
| |   * Instead of using a barrier to ensure the worker run state isn't cached, declare the structure field volatile. Use a stand-alone structure field instead of a set of flags, it's a simpler "volatile" story.
| |   * In one of two places, when shutting down worker threads, we signalled the condition variable to wake the worker thread. For consistency, remove the signal (we're only sleeping for a 100th of a second, the wake isn't buying us anything).
| |   * Restore the assertion in __open_session() that we're not in the "closing" path. Returning an error is more dangerous: it might cause a thread to panic, and then we have a panic racing with the close.
| |   * A wt_thread_t (POSIX pthread_t) is an opaque type, and can't be assigned to 0 or tested against an integral value portably. Add a bool WT_LSM_WORKER_ARGS.tid_set field instead of assigning or testing the wt_thread_t. We already have a __wt_lsm_start function; add a __wt_lsm_stop function and move the setting/clearing of the WT_LSM_WORKER_ARGS.{running,tid_set} fields into those functions so we ensure the ordering is correct.
| * WT-3208 Don't count page rewrites as eviction making progress. (#3356) (Michael Cahill, 2017-03-29, 4 files, -9/+39)
| |
| * WT-3244 Turn off in-memory cache-full checks on the metadata file (#3359) (Keith Bostic, 2017-03-29, 1 file, -0/+8)
| |   This avoids metadata operations failing in in-memory configurations.
| * Revert "WT-2439 Improve page layout: keep pages more than half full (#3277)" (Michael Cahill, 2017-03-29, 7 files, -532/+463)
| |   This reverts commit 1c41c7735b3529521b7bd34180f80584caee7f59.
| * WT-2439 Improve page layout: keep pages more than half full (#3277) (Sulabh Mahajan, 2017-03-29, 7 files, -463/+532)
| |   * Changes `split_pct` to have a minimum of 50%.
| * WT-3238 Java: Fix Cursor.compare and Cursor.equals to return int values. (#3355) (Don Anderson, 2017-03-29, 4 files, -1/+186)
| |   Non-zero int values for these functions should not raise exceptions.
| * SERVER-28168 Cannot start or repair mongodb after unexpected shutdown. (#3353) (Keith Bostic, 2017-03-27, 1 file, -14/+20)
| |   Panic if there's an error reading from or writing to the turtle file; there's no point in continuing. This change avoids user confusion when the turtle file is corrupted or zeroed out by the filesystem.
| * WT-3240 Coverity reports (#3354) (Keith Bostic, 2017-03-27, 6 files, -21/+25)
| |   * WT-3240 Coverity reports
| |     Coverity report 1373075: allocated memory is leaked if __wt_snprintf fails.
| |   * Coverity report 1373074: allocated memory is leaked if __wt_snprintf fails.
| |   * Coverity report 1373073: allocated memory is leaked if __wt_snprintf fails.
| |   * Coverity report 1373072: allocated memory is leaked if __wt_snprintf fails.
| |   * Coverity report 1373071: allocated memory is leaked if __wt_snprintf fails.
| |   * Coverity report 1369053: CID 1369053 (#1 of 1): Unused value (UNUSED_VALUE) assigned_pointer: Assigning value from "," to append_comma here, but that stored value is overwritten before it can be used.
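The leak pattern these reports describe (allocate, format, return early on a formatting error) has a simple general shape. The sketch below uses the standard `snprintf` rather than WiredTiger's `__wt_snprintf`, and `make_name` is a hypothetical helper invented for illustration; it is not the code the reports refer to.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Build "<prefix>.<id>" in newly allocated memory of the given size.
 * On success, return 0 and store the string in *namep (caller frees).
 * On failure, return -1 after freeing the buffer, so nothing leaks.
 */
static int
make_name(const char *prefix, unsigned id, size_t size, char **namep)
{
    char *buf;
    int len;

    assert(prefix != NULL && namep != NULL);
    *namep = NULL;
    if (size == 0 || (buf = malloc(size)) == NULL)
        return (-1);

    len = snprintf(buf, size, "%s.%u", prefix, id);
    if (len < 0 || (size_t)len >= size) { /* output error or truncation */
        free(buf); /* the release that is easy to forget on this path */
        return (-1);
    }
    *namep = buf;
    return (0);
}
```

Setting `*namep` only on the success path keeps ownership unambiguous: the caller frees exactly when the function returns 0.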
| * WT-3207 Use config to determine checkpoint force value. (#3350) (sueloverso, 2017-03-27, 1 file, -1/+5)
| |
| * WT-98 Update the current cursor value without a search (Keith Bostic, 2017-03-24, 1 file, -5/+5)
| |   Revert "Change LSM WT_CURSOR.{compare,insert,update,remove} to accept an internal key instead of copying the key into WiredTiger-owned memory (in other words, replace WT_CURSOR_NEEDKEY calls with WT_CURSOR_CHECKKEY)."
| |
| |   This reverts commit af2c787.
| * WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return (#3348) (Keith Bostic, 2017-03-24, 1 file, -1/+1)
| |   Fix a typo.
| * WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return (#3347) (Keith Bostic, 2017-03-24, 2 files, -2/+10)
| |   Add a style check for use of the snprintf/vsnprintf calls rather than the WiredTiger library replacements. Fix a wtperf snprintf call I missed.
| * WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return (#3340) (Keith Bostic, 2017-03-24, 84 files, -671/+893)
| |   * WT-3136 bug fix: WiredTiger doesn't check sprintf calls for error return
| |     Make a pass through the source base to check sprintf, snprintf, vsprintf and vsnprintf calls for errors.
| |   * A WiredTiger key is a uint64_t. Use sizeof(), don't hard-wire buffer sizes into the code.
| |   * More (u_int) vs. (uint64_t) fixes.
| |   * Use CONFIG_APPEND instead of FORMAT_APPEND, it makes more sense.
| |   * Revert part of 4475ae9, there's an explicit allocation of the size of the buffer.
| |   * MSVC complaints:
| |     test\format\config.c(765): warning C4018: '<': signed/unsigned mismatch
| |     test\format\config.c(765): warning C4018: '>': signed/unsigned mismatch
| |   * Change Windows testing shim to correctly use __wt_snprintf
| |   * Change Windows test shim to use the __wt_XXX functions
| |   * MSDN's _vscprintf API returns the number of characters excluding the terminating nul byte, return that value.
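A checked formatting wrapper of the kind this change introduces might look roughly like the following. This is a sketch, not the `__wt_snprintf` implementation: `checked_snprintf` is a hypothetical name, and the choice of `EIO` and `ERANGE` as error values is an assumption made for the example. It relies on the standard behavior that `vsnprintf` returns the length the output would have had, excluding the terminating nul byte.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical checked wrapper around vsnprintf. Returns 0 on success,
 * EIO if formatting fails, and ERANGE if the output (plus its nul byte)
 * did not fit in the buffer.
 */
static int
checked_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int len;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (len < 0)
        return (EIO);
    /* len excludes the nul byte, so len == size means truncation. */
    return ((size_t)len >= size ? ERANGE : 0);
}
```

Returning an error code instead of a length forces every call site to decide what to do about truncation, which is the property a source-wide style check can then enforce.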
| * WT-98 Update the current cursor value without a search (#3346) (Keith Bostic, 2017-03-24, 1 file, -43/+43)
| |   * WT-98 Update the current cursor value without a search
| |     When running in-memory and insert/update fails, we should expect WT_ROLLBACK even when not running inside a transaction.
| |   * Order the operations alphabetically (they were ordered the way they were because of the order in which we used to choose operations, but that's no longer the case).
| * WT-98 Update the current cursor value without a search (#3330) (Keith Bostic, 2017-03-24, 26 files, -365/+663)
| |
* | Merge branch 'develop' into mongodb-3.6 (Michael Cahill, 2017-03-24, 136 files, -1570/+2577)
|\ \
| |/
| * WT-3228 Remove with overwrite shouldn't return WT_NOTFOUND (#3339) (Keith Bostic, 2017-03-24, 3 files, -26/+59)
| |   * Table cursors with overwrite configured wrongly treat not-found as an error, return success instead.
| |   * The LSM code clears WT_CURSTD_KEY_SET on unsuccessful searches, which breaks table cursors with indices doing searches on the set of cursors in order to delete old index keys, because there's no key set when it's time to do the update.
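The first bullet amounts to a small mapping on the remove path's return value. The sketch below is an assumption-laden model: `remove_result` is a hypothetical helper, and `MODEL_NOTFOUND` is a stand-in constant for WT_NOTFOUND whose numeric value is illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NOTFOUND (-31803) /* stand-in for WT_NOTFOUND */

/*
 * Map the result of an underlying remove to what the caller should see.
 * With overwrite configured, removing a missing key is not an error.
 */
static int
remove_result(int ret, bool overwrite)
{
    if (ret == MODEL_NOTFOUND && overwrite)
        return (0);
    return (ret);
}

/* Usage: not-found is hidden only when overwrite is configured. */
static void
remove_result_examples(void)
{
    assert(remove_result(MODEL_NOTFOUND, true) == 0);
    assert(remove_result(MODEL_NOTFOUND, false) == MODEL_NOTFOUND);
    assert(remove_result(0, true) == 0);
}
```

Other error values pass through unchanged in this model, so only the not-found case is affected by the overwrite setting.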
| * WT-3234 Update WiredTiger build for clang 4.0. (#3345) (Keith Bostic, 2017-03-24, 2 files, -18/+4)
| |   * Update WiredTiger build for clang 4.0.
| |
| |     ex_all.c:852:7: error: possible misuse of comma operator here [-Werror,-Wcomma]
| |         p1++, p2++;
| |             ^
| |     ex_all.c:852:3: note: cast expression to void to silence warning
| |         p1++, p2++;
| |         ^~~~
| |         (void)( )
| |     1 error generated.
| |
| |   * wtperf.c:2670:4: error: code will never be executed [-Werror,-Wunreachable-code]
| |         pos += (size_t)snprintf(
| |         ^~~
| |     wtperf.c:2669:23: note: silence by adding parentheses to mark code as explicitly dead
| |         if (opts->verbose > 1 && strlen(debug_tconfig) != 0)
| |                             ^
| |                             /* DISABLES CODE */ ( )
| |     wtperf.c:2630:4: error: code will never be executed [-Werror,-Wunreachable-code]
| |         pos += (size_t)snprintf(
| |         ^~~
| |     wtperf.c:2629:23: note: silence by adding parentheses to mark code as explicitly dead
| |         if (opts->verbose > 1 && strlen(debug_cconfig) != 0)
| |                             ^
| |                             /* DISABLES CODE */ ( )
| |     2 errors generated.
| * SERVER-28194 Missing WiredTiger.turtle file loses data (#3337) (Keith Bostic, 2017-03-23, 2 files, -15/+13)
| |   There's a two-step process on Windows to rename files (including the turtle file): remove the original and then move the replacement into place -- a DeleteFileW followed by a MoveFileW. If we crash in the middle (and in SERVER-28194, it looks like there's a weirder failure mode, where the DeleteFileW succeeded, but the file was still there), we can be left without a turtle file, which will lose all of the data in the database.
| |
| |   * Add the MOVEFILE_WRITE_THROUGH flag to the MoveFileEx call. If we somehow end up in a copy-then-delete path, that flag adds a disk flush after the copy phase, so the window of vulnerability is as short as possible.
| * WT-3202 Add in_memory config opt, do not reopen connection if db is in_memory (#3341) (Sulabh Mahajan, 2017-03-23, 3 files, -1/+16)
| |
| * WT-2990 Restore use of dhandle lock in LSM. (#3342) (sueloverso, 2017-03-20, 1 file, -3/+6)
| |
| * WT-3196 Prevent eviction in LSM primaries after they are flushed. (#3336) (Michael Cahill, 2017-03-20, 1 file, -66/+3)
| |   Once an LSM primary is known to be on disk, we expect readers to use the checkpoint. The original page image for the primary will then be discarded by an LSM worker thread.
| |
| |   We previously allowed the LSM primary to be evicted in between so that eviction workers can deal with cache pressure ahead of the LSM worker threads discarding the chunk. However, that leads to cases where application threads end up evicting a 100MB page, and also means that discarding the chunk needs to worry about split generations (the cause of the assertion failure here).
| |
| |   The solution suggested here is simple: never enable eviction in LSM primaries, which also means we never need to fix up cache accounting.
| * WT-3227 Python test suite inserts unnecessary whitespace in error output. (#3338) (Keith Bostic, 2017-03-17, 6 files, -6/+6)
| |   The Python test suite uses "XXX: " as its error prefix, and the WiredTiger error routines append a comma and space after the error prefix in error messages. This means the error messages come out "XXX: , YYY". Remove the colon and space from the declared error_prefix so the error messages come out "XXX, YYY".
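The doubled punctuation is easy to reproduce with any formatter that joins a prefix and a message with ", ". The function below is a hypothetical model of such an error routine, written only to show the two outputs quoted in the commit message; it is not WiredTiger's error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of an error routine that joins the configured prefix and the
 * message with ", ". Returns 0 on success, -1 on error or truncation.
 */
static int
format_error(char *buf, size_t size, const char *prefix, const char *msg)
{
    int len;

    assert(buf != NULL && prefix != NULL && msg != NULL);
    len = snprintf(buf, size, "%s, %s", prefix, msg);
    return (len < 0 || (size_t)len >= size ? -1 : 0);
}
```

With the prefix "XXX: " this model produces "XXX: , YYY"; with the prefix "XXX" it produces "XXX, YYY", so the separator belongs to the formatter and should not also appear in the configured prefix.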
| * WT-3224 Prevent splits in LSM primaries (#3335) (Keith Bostic, 2017-03-17, 5 files, -12/+12)
| |   Move lsm_primary check near evict_disabled check. The assertion was caused by `WT_BTREE_NO_RECONCILE`, which allows in-memory splits even when eviction is disabled. Rename that flag `WT_BTREE_ALLOW_SPLITS` for clarity.
| * WT-3216 changes suggested by clang-tidy (#3328) (Keith Bostic, 2017-03-17, 51 files, -313/+304)
| |
| * WT-2978 Python: make a pip-compatible installer. (#3320) (Don Anderson, 2017-03-17, 2 files, -0/+456)
| |   * Build a static library with -fPIC objects, suitable for pulling into a dynamic library. Distribute our SWIG results, rather than running SWIG on the target machine.
| |   * Added builtin support for snappy and zlib. Made it easy to manage the list of builtins.
| * WT-3212 Table cursors should not free memory owned by the table. (#3327) (Don Anderson, 2017-03-17, 1 file, -3/+7)
| |
| * WT-3218 Reduce to 2k tables so Jenkins doesn't hit open file ulimit. (#3334) (sueloverso, 2017-03-16, 1 file, -1/+1)
| |
| * WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611 (#3333) (Keith Bostic, 2017-03-16, 3 files, -1/+14)
| |   * WT-3225 WiredTiger won't build with clang on CentOS 7.3.1611
| |     The call's return is cast to int because CentOS 7.3.1611 complains about syscall returning a long and the loss of integer precision in the assignment to ret. The cast should be a no-op everywhere.
| |   * On CentOS 7.3.1611, system header files aren't compatible with -Wdisabled-macro-expansion. I don't see a big reason for having that warning, so I'm turning it off generally.
| |     Add -Wuninitialized to WiredTiger's gcc builds.
| * WT-3204 eviction changes cost LSM performance (#3325) (Keith Bostic, 2017-03-16, 8 files, -112/+130)
| |   * WT-3204 eviction changes cost LSM performance
| |     Modify LSM's primary chunk switching to match the new btree eviction semantics on object creation. We now create objects with eviction turned off, so LSM should no longer have to turn eviction off when configuring the primary chunk.
| |
| |     LSM previously set WT_BTREE.bulk_load_ok to false to ensure an insert into the tree wouldn't turn eviction on. That problem remains, but there's a race in the implementation if multiple threads are inserting at the same time (where a thread modifies WT_BTREE.bulk_load_ok and goes to sleep before configuring eviction, and another thread does an insert and turns off eviction), and there's a further race between threads doing F_ISSET/F_SET tests.
| |
| |     Change the WT_BTREE_LSM_PRIMARY flag into a WT_BTREE.lsm_primary variable so there's no F_ISSET/F_SET race. Remove the test/set of bulk_load_ok; instead, test the lsm_primary value in the btree code before turning eviction off.
| |
| |     When checkpointing an LSM chunk, move the code that turns off the chunk's primary flag inside the single-threaded part of the function to ensure we don't race with other threads doing checkpoints. That makes the code to fix up the accounting single-threaded and safe.
| |
| |     Simplify the LSM checkpoint code to call __wt_checkpoint directly, and use the same handle for turning off the chunk's primary flag as we use for the checkpoint.
| |   * Force a primary switch in LSM after an exclusive-handle operation has come through. Otherwise it's possible to attempt to use a file as the primary chunk without disabling eviction.
| |   * spelling
| |   * WT_BTREE.bulk_load_ok isn't a boolean, don't use true/false comparisons.
| |   * Only check for an empty tree the first time an LSM chunk is opened.
| |     The goal here is to make sure that LSM primary chunks start empty. Otherwise, we can't load into a skiplist in memory as required by LSM. If an operation such as verify closes a btree in order to check the on-disk state, the next time it is opened we have to check whether it is empty. It is safe to do this check without locking: what matters is that we always do the `lsm_primary` check before any update operation that would turn off `btree->bulk_load_ok`.
| |   * Rename WT_BTREE.bulk_load_ok to be WT_BTREE.original, it's used by LSM.
| |   * Fix a comment.
| * WT-3206 Fix a race allocating split generations. (#3332) (Michael Cahill, 2017-03-16, 1 file, -23/+48)
| |   We use split generations to detect when readers may be looking at structures that are replaced by a split. For correctness, we should only increment the global split generation *after* a split becomes public. Only then can we safely check that no thread is still reading with the old generation.
| |
| |   Previously, a split could increment the global split generation, then a thread could start reading with the new split generation but see the old index structure.
| |
| |   This issue was introduced by WT-3088, where we wanted a way to ensure that newly-allocated pages don't split until it is safe. That is solved here by having the split code pin a split generation in the ordinary way (without allocating a new one) for the duration that splits of new pages need to be prevented.
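The ordering rule in WT-3206 (publish the split first, advance the generation second) can be sketched with C11 atomics. Everything here is a hypothetical model, not WiredTiger's split code: `struct tree_model` and the three functions are invented names, the index is just a pointer, and the test harness is single-threaded, so it demonstrates the ordering and the obsolescence rule rather than a real race.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a tree with a replaceable index structure. */
struct tree_model {
    _Atomic(const int *) index;     /* current index structure */
    atomic_uint_fast64_t split_gen; /* global split generation */
};

/* Reader: pin the current generation, then load the index. */
static const int *
reader_enter(struct tree_model *t, uint64_t *pinned_gen)
{
    assert(t != NULL && pinned_gen != NULL);
    *pinned_gen = atomic_load(&t->split_gen);
    return (atomic_load(&t->index));
}

/*
 * Splitter: make the new index public first, and only then advance the
 * generation. Returns the generation that retires the old index. Any
 * reader that pins this generation or later did so after the new index
 * was published, so it cannot see the old one.
 */
static uint64_t
split_publish(struct tree_model *t, const int *new_index)
{
    assert(t != NULL && new_index != NULL);
    atomic_store(&t->index, new_index);
    return (atomic_fetch_add(&t->split_gen, 1) + 1);
}

/* The old index may be freed once no reader pins an older generation. */
static bool
old_index_obsolete(uint64_t retire_gen, uint64_t oldest_pinned_gen)
{
    return (oldest_pinned_gen >= retire_gen);
}
```

If the two statements in `split_publish` were swapped, a reader could pin the new generation and still load the old index, which is the race the commit message describes. A real implementation also needs each reader's pinned generation to be visible to the splitter; that bookkeeping is omitted here.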
| * WT-3218 Avoid adding duplicate handles to connection dhandle list (#3331) (Alex Gorrod, 2017-03-16, 2 files, -0/+27)
| |   * Recheck for existence after acquiring write lock when creating a new dhandle.
| |   * Add a wtperf workload that reproduced the original failure.
| * WT-3211 WT_CURSOR.remove cannot always retain its position. (#3321) (Keith Bostic, 2017-03-14, 26 files, -494/+960)
| |
| * WT-3207 Fix a leak if a checkpoint fails. (#3329) (Michael Cahill, 2017-03-13, 8 files, -63/+50)
| |   Also switch to holding the schema lock when completing a bulk load. This avoids a race with checkpoints starting, so avoids a failure mode that was added to checkpoint earlier in this ticket. Assert that we don't hit that case instead.
| * WT-3207 Report a message for conflicting forced checkpoints, rather than an error (#3326) (Alex Gorrod, 2017-03-10, 4 files, -12/+34)
| |   Have test/fops handle EBUSY returns from forced checkpoints and EINVAL from bulk cursors.
| * WT-3207 Don't hold clean handles during checkpoints. (#3319) (Michael Cahill, 2017-03-10, 7 files, -208/+247)
| |   Previously, we gathered handles, then started a transaction, then figured out which handles were clean and released them. However,
| |   * checkpoints were keeping every handle in both its handle list and in the meta_tracking list because the *_apply_all functions were saving all handles when meta_tracking was active; and
| |   * we had acquired exclusive locks on checkpoints to be dropped before determining that we could skip a checkpoint in a clean tree. These locks blocked drops (among other things) until the checkpoint completed.
| |
| |   The solution here is to first start the transaction, then check for clean handles as checkpoint visits them. However, this has to cope with races where a handle changes state in between the transaction starting and getting the handle (e.g., table creates, bulk loads completing).