Commit message | Author | Age | Files | Lines
* WT-3467 Minor lint/cleanup (#3541) mongodb-3.5.11 | Keith Bostic | 2017-07-31 | 11 | -75/+42
* WT-3338 Implement optimized cursor modify. (#3437) | Keith Bostic | 2017-07-28 | 38 | -712/+2109
  Adds a new WT_UPDATE type, WT_UPDATE_MODIFIED, containing a delta update to an existing value. Allows a fixed number of these deltas to accumulate between full updates, and applies deltas during reads so that the final, modified values are returned.
  Fixes bugs from WT-3402 with variable-length column stores with run-length encoding.
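The delta idea described above can be illustrated with a minimal sketch. It assumes a delta means "replace `size` bytes at `offset` with `data`"; `struct delta` and `delta_apply` are hypothetical names for illustration, not WiredTiger's structures or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical delta: replace `size` bytes at `offset` with `data`. */
struct delta {
    const char *data;   /* Replacement bytes */
    size_t data_size;   /* Number of replacement bytes */
    size_t offset;      /* Where the change starts in the value */
    size_t size;        /* How many existing bytes are replaced */
};

/*
 * Apply one delta to the value in buf (current length *lenp, capacity cap).
 * Returns 0 on success, -1 if the delta cannot be applied.
 */
static int
delta_apply(char *buf, size_t *lenp, size_t cap, const struct delta *d)
{
    size_t len, size, tail;

    assert(buf != NULL && lenp != NULL && d != NULL);

    len = *lenp;
    if (d->offset > len)
        return (-1);

    /* Never replace more bytes than exist past the offset. */
    size = d->size;
    if (size > len - d->offset)
        size = len - d->offset;
    if (len - size + d->data_size > cap)
        return (-1);

    /* Shift the unchanged tail, then copy the replacement bytes in. */
    tail = len - (d->offset + size);
    memmove(buf + d->offset + d->data_size, buf + d->offset + size, tail);
    memcpy(buf + d->offset, d->data, d->data_size);
    *lenp = len - size + d->data_size;
    return (0);
}
```

A reader holding the last full value plus a short chain of such deltas would call `delta_apply` once per delta, oldest first, to rebuild the current value.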
* WT-3466 Track the first commit_timestamp set in a transaction. (#3540) | Michael Cahill | 2017-07-28 | 2 | -4/+30
  In particular, if a commit_timestamp is set via WT_SESSION::timestamp_transaction and again later via WT_SESSION::commit_transaction, there was a window where session->txn.commit_timestamp was zero, which caused WT_CONNECTION::query_timestamp to return incorrect results.
  Similarly, the shared list of transactions with commit timestamps should be sorted by the first commit timestamp: we don't (and shouldn't) re-sort that list if a transaction's commit_timestamp is updated.
* WT-3389 Restructure split code to hold a split generation for the entire operation. (#3479) | Keith Bostic | 2017-07-28 | 1 | -168/+167
  * Change insert, multi-page and reverse splits to acquire and hold a split generation for the entire operation. This is intended to ensure internal pages split by other threads cannot be evicted from underneath the current split, avoiding problems like the one in WT-3373, where an internal page was evicted during an attempt to split into it, leading to an access violation. These functions will already acquire and hold a split generation at some point, so there's no performance cost with holding it for a longer period (other than potentially failing to evict internal pages as soon as would otherwise be possible).
  * Now that we're holding a split generation across the insert operations, simplify some underlying code. We continue to need to set the split generation on newly created pages, but move that set from `__split_ref_prepare()` into the calling code where the other fields of the newly created pages are initialized.
  * We no longer need to call `WT_ENTER_PAGE_INDEX/WT_LEAVE_PAGE_INDEX` in `__split_ref_prepare()`, `__split_root()` or `__split_internal()`.
  * `__split_internal_lock()` gets a hazard pointer on the child's parent page to avoid parent eviction. The problem this solved was that, while the existence of the child page prevents parent page eviction initially, once the parent's page index is updated, the child may no longer be a child of that parent, the parent might have no children, and so could potentially be evicted. Now we hold a page lock on the parent page that prevents eviction if the parent page is dirty, because eviction would require the page be reconciled/cleaned. That should be safe: if the child page is moved so the parent could be evicted, then the parent page must also have been marked dirty.
* WT-3461 Use CLOCK_MONOTONIC for pthread_cond_timedwait if possible. (#3537) | Michael Cahill | 2017-07-28 | 11 | -72/+179
  * WT-3461 Use CLOCK_MONOTONIC for pthread_cond_timedwait if possible. Regardless, don't adjust the realtime clock before calculating when a timed sleep should end. Otherwise, we can sleep for longer than expected by however much the clock changed.
  * __wt_epoch() is now identical between POSIX and Windows, pull the OS-independent time functions out into a new file.
* WT-3422 Add upgrade/downgrade doc text (#3538) | sueloverso | 2017-07-28 | 1 | -3/+9
* WT-3463 Add test phase to take backup without a checkpoint. (#3539) | sueloverso | 2017-07-28 | 1 | -29/+63
* WT-3410 Add activity diagram in the documentation for schema rename (#3536) | Sulabh Mahajan | 2017-07-27 | 2 | -2/+83
* WT-2309 Add option to cause delays in internal page split code to aid testing (#3531) | Sulabh Mahajan | 2017-07-26 | 6 | -13/+47
* WT-3387 Add use_timestamp option for checkpoint (#3503) | sueloverso | 2017-07-26 | 10 | -42/+183
  * Add stable_timestamp config and basic parsing.
  * Change to use stable timestamp: Remove calls to timestamp_dump. Add get=stable to timestamp query and have checkpoint code use it. Add test component to update the stable timestamp and recheck the backup.
  * Add check for read_timestamp being after stable_timestamp and test usage.
* WT-3412 Add backoff logic in bt_delete and bt_walk (#3534) | Vamsi Krishna | 2017-07-26 | 8 | -33/+71
  When waiting for a ref to change state.
* WT-3440 Add a log record when starting a checkpoint. (#3525) | sueloverso | 2017-07-26 | 13 | -87/+480
  Otherwise recovery could do more work than necessary and create a new checkpoint.
* WT-3447 Fix python test so it will loop checking for WiredTigerTables (#3532) | David Hows | 2017-07-25 | 1 | -9/+14
  To give time for the statistics log thread to write into the file, not just create it.
* WT-3446 Temporarily disable test/checkpoint timestamp testing. (#3533) | Vamsi Krishna | 2017-07-24 | 2 | -6/+21
* Revert "WT-3446 Temporarily disable checkpoint timestamp testing." | Vamsi Krishna | 2017-07-24 | 2 | -20/+6
  This reverts commit e8afecc7b27a2e93f3e075a7d1881b1a0cf04f84.
* WT-3446 Temporarily disable checkpoint timestamp testing. | Vamsi Krishna | 2017-07-24 | 2 | -6/+20
  Also, checkpoint testing will return an error if -s (timestamp testing) is passed.
* WT-3380 Make 8-byte timestamps a special case (#3509) | Sulabh Mahajan | 2017-07-24 | 27 | -171/+301
  * Change wt_timestamp_t to union of uint64 and uint8 array
  * Add contents from Keith's change
  * Whitespace and s_string ok addition
  * Get rid of WT_GET_TIMESTAMP_PTR
  * Fix the change after the merge
  * Remove superfluous struct around union, simplify macros
  * Remove packed attribute from WT_UPDATE. (#3523)
  * Remove packed attribute from WT_UPDATE. Add 3B of data declaration at the end of the WT_UPDATE structure. That way we don't have to pack the structure to avoid wasting data bytes, and we don't have to use a macro to identify the start of the data.
  * Locate the timestamp in the WT_UPDATE structure depending on its alignment, to avoid padding.
  * I lost a change, the size of the WT_UPDATE structure has to reflect the size of the timestamp.
  * Change the __wt_timestamp_t union into a structure so the compiler doesn't insert padding in the middle of the WT_UPDATE structure (the existence of the uint64_t in the union causes the whole thing to be aligned, even if we never access it). Incorporate Michael's change to replace sizeof(WT_UPDATE) with WT_UPDATESIZE, the compiler is padding the structure at the end and we need to ignore that.
  * If there's no union, we can't reach into it and get the field, take the address like we do everywhere else.
  * Simplify size calculations for WT_UPDATE. In particular, go back to having WT_UPDATE_SIZE match the size of a WT_UPDATE excluding the payload data. That means the declared size of the data array in WT_UPDATE is no longer special.
  * Switch from a 3-byte array to a C99 flexible array member. There is no longer anything special about the 3 byte array: since the timestamp can be any size, there is no guarantee it makes a WT_UPDATE any nicely rounded size. Compilers enforce some rules around how flexible array members can be used: we should consider switching our other uses of structs with trailing arrays (in a separate ticket).
  * Remove some more unnecessary casts now that upd->data is typed.
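The flexible-array-member approach mentioned above can be sketched as follows. `struct upd`, `UPD_SIZE` and `upd_alloc` are hypothetical stand-ins chosen for illustration; the real WT_UPDATE has different fields. The point is that the header size is taken with `offsetof`, so any trailing padding the compiler adds to `sizeof` is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical update record with a trailing, variable-length payload. */
struct upd {
    uint64_t txnid;     /* Transaction that created the update */
    struct upd *next;   /* Next (older) update in the list */
    uint32_t size;      /* Payload length in bytes */
    uint8_t data[];     /* C99 flexible array member */
};

/* Header size excluding the payload and any trailing padding. */
#define UPD_SIZE offsetof(struct upd, data)

/* Allocate an update holding a copy of value; returns NULL on failure. */
static struct upd *
upd_alloc(const void *value, uint32_t size)
{
    struct upd *u;

    assert(value != NULL || size == 0);

    if ((u = calloc(1, UPD_SIZE + size)) == NULL)
        return (NULL);
    u->size = size;
    if (size != 0)
        memcpy(u->data, value, size);
    return (u);
}
```

Since `data` is a typed member rather than an address computed by a macro, callers can use `u->data` directly without casts.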
* WT-3316 Add a developer documentation section starting with schema create (#3521) | Sulabh Mahajan | 2017-07-24 | 9 | -1/+212
  * Add plantuml support, add new developer documentation section
  * Take plantuml image generation out of s_docs, check in the generated images
  * Bypass Doxygen support for plantuml, we do not want an additional dependency
  * Addressed Alex's comments
  * Fix whitespace
  * Improve uml
  * Update uml images
  * Add uri table:
  * Modify s_plantuml to work on OSX as well.
  * rename s_plantuml to s_docs_platuml
* WT-3387 Fix checkpoint support for read_timestamp. (#3528) | Michael Cahill | 2017-07-21 | 2 | -10/+13
  * WT-3387 Fix checkpoint support for read_timestamp. Recent changes cleared the WT_TXN_HAS_TS_READ flag in the special handling of the checkpoint transaction in __checkpoint_prepare. This change splits the functional flag (HAS_TS_READ) from the flag indicating whether the information has been published by hooking the transaction into a list (PUBLIC_TS_READ).
  * Minor KNF
* WT-3442 Coverity 1378213: false positive on diagnostic assignment. (#3529) | Keith Bostic | 2017-07-21 | 1 | -2/+3
  * WT-3442 Coverity 1378213: false positive on diagnostic assignment.
    CID 1378213 (#1 of 1): Side effect in assertion (ASSERT_SIDE_EFFECT) assignment_where_comparison_intended: Assignment upd = *updp has a side effect. This code will work differently in a non-debug build.
  * rec_write.c:1312:80: error: suggest parentheses around '&&' within '||' [-Werror=parentheses]
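For readers unfamiliar with the ASSERT_SIDE_EFFECT class of report, here is a generic sketch of the pattern the checker looks for and one way to restructure it. It is not the code from rec_write.c; `struct upd` and `upd_first_value` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical update record. */
struct upd {
    int value;
};

/*
 * The form a checker flags looks like:
 *     assert((upd = *updp) != NULL);
 * If assertions are compiled out (NDEBUG), the assignment disappears with
 * them and upd is used uninitialized. Doing the assignment as a separate
 * statement keeps debug and non-debug builds identical.
 */
static int
upd_first_value(struct upd **updp)
{
    struct upd *upd;

    upd = *updp;            /* Always executed */
    assert(upd != NULL);    /* Pure check, safe to compile out */
    return (upd->value);
}
```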
* WT-3047 Add mode aimed at uncovering race conditions in split code (#3518) | Vamsi Krishna | 2017-07-21 | 7 | -81/+84
  Rename the new diagnostic_timing_stress flag to timing_stress_for_test, and don't gate the code on HAVE_DIAGNOSTIC: sometimes we need release builds to reproduce race conditions.
* WT-3308 Add statistics tracking around yield loops (#3496) | Vamsi Krishna | 2017-07-21 | 11 | -42/+127
  If many threads spin in a yield loop the symptoms are difficult to diagnose: we end up consuming all kernel CPU, and throughput generally stalls. Add statistics to places where that is theoretically possible (though now vanishingly unlikely).
* WT-3433 Add support for alter and readonly. Add test case. (#3526) | sueloverso | 2017-07-21 | 2 | -4/+29
* WT-3432 Fix braces error. (#3524) | Don Anderson | 2017-07-21 | 1 | -2/+5
* WT-3418 Fix a block manager race in tree close/open (#3512) | Keith Bostic | 2017-07-20 | 7 | -67/+100
  Don't set the handle's WT_DHANDLE_DEAD flag before closing the underlying block manager handle: the block manager asserts there are never two references to the same block store, and it's possible to open another data handle once we mark this handle dead.
* WT-3406 Reconciliation should ignore concurrent updates. (#3516) | Michael Cahill | 2017-07-20 | 3 | -35/+44
  * Reconciliation should ignore concurrent updates: cache the oldest running transaction ID when reconciliation starts so that concurrent transactions cannot be treated as committed.
  * Mark transaction IDs volatile so they can be safely read. Transaction IDs are set when updates are created (before they become visible) and change in exactly one situation: when an update is marked with WT_TXN_ABORTED. Readers of transaction IDs expect to be able to copy a transaction ID into a local variable and see a stable value. In case a compiler might choose to re-read the transaction ID from memory rather than respecting the semantics of code written using a local variable, mark the shared transaction IDs volatile to prevent unexpected repeated / reordered reads.
* WT-3439 lint cleanup (#3520) | Keith Bostic | 2017-07-20 | 4 | -12/+13
  * Prefer __wt_timestamp_set_zero to WT_CLEAR.
  * If we get an unexpected timestamp query, provide a helpful error message.
  * Values stored to btree_inuse and cache_inuse never read.
  * Make WiredTiger build without timestamps configured.
  * We automatically append a period to documentation descriptions, so we can't end a description with a question mark.
* WT-3438 Don't tune eviction thread count when the count is fixed (#3519) | David Hows | 2017-07-19 | 1 | -0/+7
* WT-3381 Improve concurrency in the transaction subsystem (#3515) | Alex Gorrod | 2017-07-19 | 5 | -74/+184
  Removes timestamps from WT_TXN_STATE as a step towards merging WT_TXN_STATE with WT_TXN.
  This change has transactions add themselves to global, sorted lists (one ordered by commit timestamp, the other ordered by read timestamp). Each list has its own rwlock, and scans have been replaced by peeking at the first (oldest) transaction in the list. The list's lock protects the relevant fields in WT_TXN (i.e., no thread will read txn->commit_timestamp unless the transaction is in the commit_timestamp list).
  This should reduce contention for txn_global->rwlock, which can be further decomposed in future (e.g., eliminating the scan for __wt_txn_update_oldest and replacing __wt_txn_get_snapshot with a loop that makes a copy of running transaction IDs in order).
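The sorted-list idea can be sketched in a few lines: insertion keeps the list ordered by timestamp, so finding the oldest entry is a peek at the head rather than a scan. This is a single-threaded illustration with hypothetical names (`struct txn`, `struct ts_list`); the per-list rwlock described in the commit message is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical transaction entry, linked in commit-timestamp order. */
struct txn {
    uint64_t commit_ts;
    struct txn *next;
};

/* Hypothetical global list; a real one would carry its own rwlock. */
struct ts_list {
    struct txn *head;
};

/* Insert t so the list stays sorted by ascending commit timestamp. */
static void
ts_list_insert(struct ts_list *l, struct txn *t)
{
    struct txn **pp;

    assert(l != NULL && t != NULL);

    for (pp = &l->head;
        *pp != NULL && (*pp)->commit_ts <= t->commit_ts; pp = &(*pp)->next)
        ;
    t->next = *pp;
    *pp = t;
}

/* Peek at the oldest timestamp; returns -1 if the list is empty. */
static int
ts_list_oldest(const struct ts_list *l, uint64_t *tsp)
{
    if (l->head == NULL)
        return (-1);
    *tsp = l->head->commit_ts;
    return (0);
}
```

The cost moves from every reader (a full scan) to the writer (an ordered insert), which is the trade the commit message describes.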
* WT-3426 Add update only wtperf workload (#3517) | Alex Gorrod | 2017-07-19 | 1 | -0/+10
* Revert "WT-3381 Improve timestamp concurrency. (#3511)" (#3514) | Alex Gorrod | 2017-07-18 | 5 | -171/+72
  This reverts commit fb9e565013a321749f385bfd3c1dba9b462da08e.
* WT-3381 Improve timestamp concurrency. (#3511) | Michael Cahill | 2017-07-18 | 5 | -72/+171
  Removes timestamps from WT_TXN_STATE as a step towards merging WT_TXN_STATE with WT_TXN.
  This change has transactions add themselves to global, sorted lists (one ordered by commit timestamp, the other ordered by read timestamp). Each list has its own rwlock, and scans have been replaced by peeking at the first (oldest) transaction in the list. The list's lock protects the relevant fields in WT_TXN (i.e., no thread will read txn->commit_timestamp unless the transaction is in the commit_timestamp list).
  This should reduce contention for txn_global->rwlock, which can be further decomposed in future (e.g., eliminating the scan for __wt_txn_update_oldest and replacing __wt_txn_get_snapshot with a loop that makes a copy of running transaction IDs in order).
* WT-3140 Revert change removing camel casing in JSON statistics (#3513) | David Hows | 2017-07-18 | 3 | -8/+8
* WT-3138 Enhance range of eviction server statistics available. (#3269) | Alex Gorrod | 2017-07-14 | 11 | -327/+707
* WT-3425 In workgen, added 'reopen' configuration option for Operations. (#3508) | Don Anderson | 2017-07-14 | 5 | -22/+111
* WT-3140 Add output of handle name to per-handle stats (#3474) | David Hows | 2017-07-14 | 4 | -15/+219
  Restructure the way WiredTiger prints JSON stats, so valid JSON documents are produced.
* WT-3424 additional gcc 7.1 compile warnings. (#3507) | Keith Bostic | 2017-07-12 | 4 | -5/+5
  Printing out an "int" takes 20 characters.
* WT-3329 Visit trees using a tiny fraction of cache. (#3442) | Michael Cahill | 2017-07-11 | 3 | -31/+8
  For workloads where no tree takes up a large enough fraction of cache, we were using a randomized approach to deciding when eviction should visit trees. That led to slow performance for workloads with uniform updates over thousands of trees.
* WT-3421 Fix unreachable code error exposed when diagnostic build is off. (#3505) | Don Anderson | 2017-07-10 | 1 | -1/+6
* WT-3413 Add more aggressive compile warning flags to Jenkins Windows job (#3502) | Keith Bostic | 2017-07-07 | 1 | -2/+4
  MongoDB/Evergreen uses: /Gv /wd4090 /wd4996 /we4047 /we4024 /TC /we4100 /w4133
  We already specify level 3 warnings, which include C4047 and C4024 (both level 1), and C4133 (level 3), but not C4100 (level 4). Add /we4100 to our Windows build, and turn it off when building SWIG-generated code.
* WT-3310 Fix test/format to tolerate EBUSY with LSM. (#3501) | sueloverso | 2017-07-06 | 2 | -4/+13
  * WT-3310 Fix test/format to tolerate EBUSY with LSM.
  * Retry alter on EBUSY too. Make retry loop similar to others.
* WT-3415 Add locking around setting txn_state timestamp (#3500) | sueloverso | 2017-07-06 | 2 | -19/+29
  * WT-3415 Add locking around setting txn_state timestamp
  * Rename 's' to 'txn_state'
* WT-3394 Fix compilation warnings for GCC-7 (#3499) | Alex Gorrod | 2017-07-06 | 38 | -151/+312
  * WT-3394 Build WiredTiger with gcc7
    Simplify wtperf loops to not check both the thread pointer and the count, it makes gcc7's testing for unsafe loops sad. It shouldn't be necessary, the count being non-zero indicates the thread pointers are non-NULL.
  * Turn off "-Wunsafe-loop-optimizations" for gcc7, it fires all over the place where the loops are just fine.
  * Rewrite a loop to turn off gcc7's -Wunsafe-loop-optimizations warning.
  * Gcc7's -Wunsafe-loop-optimizations warning was correct: byte and stopbyte are unsigned, which means the loop could be infinite. Rewrite the loops to check for equality after the last possible byte value. This is safe because nbits must be non-zero and byte must initially be less than stopbyte. Add explicit gcc fallthrough attributes, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros.
  * Add explicit gcc fallthrough attributes, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros.
  * Add a new gcc attribute macro, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros, we have to add gcc attributes to turn those warnings off.
  * Don't hardcode the size of the buffer in the integer-packing test. Use a testutil assertion so a test failure results in a non-zero exit status.
  * whitespace.
  * opts->table_count is a uint32_t, switch from PRId32 to PRIu32.
  * workp->throttle is a uint64_t, switch from PRId64 to PRIu64.
  * Enums are integers, but the signedness is implementation-dependent.
  * min_version/maj_version are int64_t's, switch PRIu64 to PRId64.
  * adjustment is a uint64_t, switch from PRId64 to PRIu64.
  * Fields slot->slot_state and slot->slot_unbuffered are int64_t's. When printing them as a hex value, cast to (uint64_t), otherwise, use PRId64, not PRIu64.
  * ckpt->order is an int64_t, switch from PRIu64 to PRId64.
  * WT_MIN of a uint32_t value has to be cast before being used as an int.
  * Don't bother declaring uint64_t arguments as 0ULL, we have a prototype in the file and gcc 7.0 complains about the use of "long long".
  * WT_EXTRA_INTERNAL_SESSIONS is type int, switch from PRI732 to %d.
  * optype is a uint32_t, switch from %d to PRIu32.
  * When printing wt_off_t's, use uintmax_t, not intmax_t.
  * id is a u_int, switch from %d to %u.
  * g.run_cnt is a uint32_t, switch from %d to PRIu32.
  * singput/soutput are int64_t's, switch from PRIu64 to PRId64. When printing hex values, cast to unsigned char.
  * id is unsigned, switch from %d to %u.
  * gcc wasn't inlining __wt_verbose(), which is bad because that's a function call to no purpose in most production environments. (The problem is gcc won't inline any function with variadic argument handling.) Break __wt_verbose() into two parts, the flag test as a macro and a call to a real function that handles the variadic argument part of the problem. One change: this requires calls to __wt_verbose() include a format argument plus at least one other argument, that is, you can't do:
      __wt_verbose(session, WT_VERB_LOOKASIDE, "my message");
    you have to instead do:
      __wt_verbose(session, WT_VERB_LOOKASIDE, "%s", "my message");
  * Update spell checker and auto-generated files.
  * Add support for gcc7, including a number of additional compiler warning flags. Remove -Wmissing-parameter-type (implied by -Wextra), and -Wmisleading-indentation (implied by -Wall).
  * Make configurations without HAVE_VERBOSE build cleanly again.
  * I can't figure out a way to make __attribute__((fallthrough)) work for gcc 7.X as well as earlier gcc compiler releases without a whole bunch of C preprocessor magic. Explode the macros so we don't need it anymore.
  * Add additional warning flags to gcc6 where supported.
  * I broke bitstring.i when I changed the ffs/ffc loops.
  * gcc7 thinks buffer can be used uninitialized, but I can't find any place where that's true. Explicitly initialize the buffer to clear the warning.
  * Use WT_INTPACK64_MAXSIZE instead of hard-coding the maximum buffer size.
  * Add a comment to explain why __wt_verbose() has to take a format string and at least one additional argument. Add comments to explain why we expanded some #define's in switch statements.
  * Fix Windows compile failure: src\os_win\os_map.c(78) : error C4100: 'length' : unreferenced formal parameter
  * Revert "WT-3394 Build WiredTiger with gcc7"
    This reverts commit aebd86717952e409e28468bab03b8663bea612d3.
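The unsigned-loop rewrite mentioned in this commit (checking for equality after processing the last byte instead of using `byte <= stopbyte` as the loop condition) can be sketched as a find-first-set over a bit array. This is an illustrative reimplementation with a hypothetical `bit_ffs`, not the code in bitstring.i; bits are numbered from the least significant bit of byte 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Find the first set bit in the first nbits bits of an array.
 * Returns 0 and stores the bit position in *retp, or -1 if none is set.
 */
static int
bit_ffs(const uint8_t *bits, uint64_t nbits, uint64_t *retp)
{
    uint64_t byte, stopbyte, value;
    uint8_t b;

    assert(bits != NULL && retp != NULL);

    if (nbits == 0)
        return (-1);
    stopbyte = (nbits - 1) >> 3;

    /*
     * A condition of the form "byte <= stopbyte" on unsigned values cannot
     * be proven to terminate for every stopbyte. Testing for equality
     * after the body visits the same bytes and always exits.
     */
    for (byte = 0;; ++byte) {
        if (bits[byte] != 0) {
            value = byte << 3;
            for (b = bits[byte]; (b & 0x01) == 0; b >>= 1)
                ++value;
            /* Ignore bits set beyond the logical end of the array. */
            if (value >= nbits)
                return (-1);
            *retp = value;
            return (0);
        }
        if (byte == stopbyte)
            break;
    }
    return (-1);
}
```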
* Revert "WT-3394 Build WiredTiger with gcc7 (#3492)" (#3498) | Alex Gorrod | 2017-07-06 | 38 | -318/+159
  This reverts commit d5f82a43f1e0c8aafd38f0098bc9349a6de335e6.
* WT-3394 Build WiredTiger with gcc7 (#3492) | Keith Bostic | 2017-07-06 | 38 | -159/+318
  Simplify wtperf loops to not check both the thread pointer and the count, it makes gcc7's testing for unsafe loops sad. It shouldn't be necessary, the count being non-zero indicates the thread pointers are non-NULL.
  * Turn off "-Wunsafe-loop-optimizations" for gcc7, it fires all over the place where the loops are just fine.
  * Rewrite a loop to turn off gcc7's -Wunsafe-loop-optimizations warning.
  * Gcc7's -Wunsafe-loop-optimizations warning was correct: byte and stopbyte are unsigned, which means the loop could be infinite. Rewrite the loops to check for equality after the last possible byte value. This is safe because nbits must be non-zero and byte must initially be less than stopbyte. Add explicit gcc fallthrough attributes, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros.
  * Add explicit gcc fallthrough attributes, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros.
  * Add a new gcc attribute macro, gcc7's -Wimplicit-fallthrough complaints aren't turned off by /* FALLTHROUGH */ comments that appear in macros, we have to add gcc attributes to turn those warnings off.
  * Don't hardcode the size of the buffer in the integer-packing test. Use a testutil assertion so a test failure results in a non-zero exit status.
  * whitespace.
  * opts->table_count is a uint32_t, switch from PRId32 to PRIu32.
  * workp->throttle is a uint64_t, switch from PRId64 to PRIu64.
  * Enums are integers, but the signedness is implementation-dependent.
  * min_version/maj_version are int64_t's, switch PRIu64 to PRId64.
  * adjustment is a uint64_t, switch from PRId64 to PRIu64.
  * Fields slot->slot_state and slot->slot_unbuffered are int64_t's. When printing them as a hex value, cast to (uint64_t), otherwise, use PRId64, not PRIu64.
  * ckpt->order is an int64_t, switch from PRIu64 to PRId64.
  * WT_MIN of a uint32_t value has to be cast before being used as an int.
  * Don't bother declaring uint64_t arguments as 0ULL, we have a prototype in the file and gcc 7.0 complains about the use of "long long".
  * WT_EXTRA_INTERNAL_SESSIONS is type int, switch from PRI732 to %d.
  * optype is a uint32_t, switch from %d to PRIu32.
  * When printing wt_off_t's, use uintmax_t, not intmax_t.
  * id is a u_int, switch from %d to %u.
  * g.run_cnt is a uint32_t, switch from %d to PRIu32.
  * singput/soutput are int64_t's, switch from PRIu64 to PRId64. When printing hex values, cast to unsigned char.
  * id is unsigned, switch from %d to %u.
  * gcc wasn't inlining __wt_verbose(), which is bad because that's a function call to no purpose in most production environments. (The problem is gcc won't inline any function with variadic argument handling.) Break __wt_verbose() into two parts, the flag test as a macro and a call to a real function that handles the variadic argument part of the problem. One change: this requires calls to __wt_verbose() include a format argument plus at least one other argument, that is, you can't do:
      __wt_verbose(session, WT_VERB_LOOKASIDE, "my message");
    you have to instead do:
      __wt_verbose(session, WT_VERB_LOOKASIDE, "%s", "my message");
  * Update spell checker and auto-generated files.
  * Add support for gcc7, including a number of additional compiler warning flags. Remove -Wmissing-parameter-type (implied by -Wextra), and -Wmisleading-indentation (implied by -Wall).
  * Make configurations without HAVE_VERBOSE build cleanly again.
  * I can't figure out a way to make __attribute__((fallthrough)) work for gcc 7.X as well as earlier gcc compiler releases without a whole bunch of C preprocessor magic. Explode the macros so we don't need it anymore.
  * Add additional warning flags to gcc6 where supported.
  * I broke bitstring.i when I changed the ffs/ffc loops.
  * gcc7 thinks buffer can be used uninitialized, but I can't find any place where that's true. Explicitly initialize the buffer to clear the warning.
  * Use WT_INTPACK64_MAXSIZE instead of hard-coding the maximum buffer size.
  * Add a comment to explain why __wt_verbose() has to take a format string and at least one additional argument. Add comments to explain why we expanded some #define's in switch statements.
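The macro-plus-worker split described for __wt_verbose() can be sketched as below. The names (`struct session`, `VERB_LOOKASIDE`, `verbose`, `verbose_worker`, `verbose_calls`) are hypothetical, and the explanation of the "at least one other argument" rule is an assumption based on standard C99 behavior: a macro declared as `(fmt, ...)` needs at least one argument to fill `__VA_ARGS__`.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdio.h>

#define VERB_LOOKASIDE 0x01u    /* Hypothetical verbose category flag */

/* Hypothetical session holding the enabled verbose categories. */
struct session {
    uint32_t verbose_flags;
};

static unsigned int verbose_calls;  /* Counts worker calls, for the demo */

/* Out-of-line worker: the only part that handles variadic arguments. */
static void
verbose_worker(struct session *s, const char *fmt, ...)
{
    va_list ap;

    assert(s != NULL && fmt != NULL);

    ++verbose_calls;
    va_start(ap, fmt);
    (void)vfprintf(stderr, fmt, ap);
    va_end(ap);
    (void)fputc('\n', stderr);
}

/*
 * The flag test is a macro, so a disabled category costs one branch and
 * no function call. With a named fmt parameter, C99 requires at least
 * one argument for __VA_ARGS__, hence "%s", "my message".
 */
#define verbose(s, flag, fmt, ...) do {                 \
    if (((s)->verbose_flags & (flag)) != 0)             \
        verbose_worker(s, fmt, __VA_ARGS__);            \
} while (0)
```

A call such as `verbose(&session, VERB_LOOKASIDE, "%s", "my message");` reaches the worker only when the flag is set.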
* WT-3310 Add support to WT_SESSION::alter to change table log setting (#3439) | sueloverso | 2017-07-06 | 16 | -302/+488
* WT-3403 Restore panic if writing a log record fails. (#3497) | sueloverso | 2017-07-05 | 1 | -1/+6
* WT-3402 Move cached overflow records to the update list. (#3493) | Keith Bostic | 2017-07-04 | 14 | -745/+404
  Cached overflow records are large value items that are being written after being updated in a committed transaction (which implies freeing and the potential reuse of their backing disk blocks), but which also must be kept available because an older reader in the system may want to read the previous value.
  Historically, we maintained a separate cache of these overflow values in the WT_PAGE_MODIFY.WT_OVFL_TXNC structure, with a set of functions to maintain that list. The reason was because we wanted to be able to remove overflow items from the cache based on updated transactional information, that is, once any older readers exited the system, we could discard the overflow values from the cache.
  Since that code was written, we've added support for removing items from key/value WT_UPDATE lists when they're no longer needed by earlier readers in the system, making the WT_UPDATE lists sufficient to replace the separate overflow values cache. Rather than enter overflow values into the separate cache, append the value to the WT_UPDATE list, with an impossibly low transaction ID to ensure global visibility.
  * Remove a diagnostic printf.
  * sort statistics lists.
  * Delete the data-source and connection statistic cache_overflow_value. It doesn't tell us anything useful other than "there were older readers in the system", and there are better sources of that information.
  * The cached remove information is a WT_CELL, no reason to hide that.
  * Increment the page footprint based on __wt_update_alloc's return of the size, don't roll our own.
  * There's special-case code in reconciliation to handle an overflow value being removed, the overflow value being cached because of an older reader in the system, and then the page going through an update/restore reconciliation. The idea is that if we cache an overflow value for an older reader in the system, then do a page update/restore eviction, re-instantiating the page in-memory means we'll no longer find the removed overflow value in the cache because the index for the cache is the on-page disk address of the overflow value, which doesn't appear in a re-instantiated page. The fix was to take a copy of the original overflow value from the cache and append it to the key's WT_UPDATE list. With the move of the overflow cache from a separate list to the key's WT_UPDATE list, that special-case code is no longer needed. Now, if an overflow value is removed and cached, the cache is the key's WT_UPDATE list, and subsequent reconciliations will always see a globally visible entry.
  * Document the removal of the cache_overflow_value statistic.
  * Add WT_UPDATE_DATA_VALUE macro, it flags a WT_UPDATE structure that includes a standalone chunk of data, in other words, "standard" or "deleted", but not "reserved", or in the future, "modified".
  * Use "no transaction ID" instead of "impossibly low transaction ID".
* WT-3409 WiredTiger generations can silently self-deadlock. (#3494) | Keith Bostic | 2017-07-04 | 1 | -0/+6
  Panic if we self-deadlock in the generation code.
* WT-3401 Lint and minor cleanup (#3491) | Keith Bostic | 2017-07-03 | 12 | -57/+94
  * Remove unnecessary line breaks.
  * Fix a couple of copyrights.
  * Whitespace/Indentation.
  * Don't treat booleans as type "int".
  * Dump out the WT_UPDATE structure timestamp information when debugging.
  * Simplify a couple of expressions in the btree row-store search function. Normally I wouldn't bother, but, well, it's the row-store search function.
  * Lint hates having a variable declared in an inline include file that's never used in the file, it complains about it for every single occurrence. Make zero_timestamp a local variable by creating __wt_timestamp_set_zero(). Add __wt_timestamp_set_inf() as well, that way we don't hard-code the 0xff initialization in rec_write.c.
  * Split constant string to avoid spell complaints.
  * Add missing comments to new timestamp functions.
  * Make the debug code build w/o timestamps.
  * Minor cleanups for util_downgrade: Fix a comment (there isn't a table name argument), remove an "else" clause after a return, remove the verbose output of a newline, as far as I can tell, downgrade doesn't report progress.
  * bt_debug.c:1005:4: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
  * lint: quit futzing around with pointers, do the copy.
  * Update the .gitignore list.