| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
* Default to an initial 250 hazard slots and grow from there.
* Make hazard_max undocumented, add an internal limit of 1000 eviction walks.
* If we grow the hazard pointer array, schedule the original to be freed when the database is closed.
* Update test_bug011 back to stress eviction with the hard-coded limit of 1000 active trees. Only run during "long" tests.
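A minimal sketch of the grow-and-defer-free idea described above, not the WiredTiger implementation. The initial size of 250 comes from the commit message; the doubling policy, the struct layout, and every `hazard_*` name are assumptions made for illustration. The old array is kept on a retired list rather than freed, because a concurrent reader might still be scanning it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HAZARD_INITIAL 250 /* initial slot count, from the commit message */

/* Hypothetical list of superseded arrays, freed only at close. */
struct retired {
    void *mem;
    struct retired *next;
};

/* Hypothetical hazard pointer array. */
struct hazard_array {
    void **slots;
    size_t size;  /* slots allocated */
    size_t inuse; /* slots filled */
    struct retired *retired;
};

int
hazard_init(struct hazard_array *h)
{
    if ((h->slots = calloc(HAZARD_INITIAL, sizeof(void *))) == NULL)
        return (-1);
    h->size = HAZARD_INITIAL;
    h->inuse = 0;
    h->retired = NULL;
    return (0);
}

int
hazard_add(struct hazard_array *h, void *p)
{
    struct retired *r;
    void **bigger;

    assert(h->slots != NULL);
    if (h->inuse == h->size) {
        if ((r = malloc(sizeof(*r))) == NULL)
            return (-1);
        /* Doubling is an assumption; the commit only says "grow". */
        if ((bigger = calloc(h->size * 2, sizeof(void *))) == NULL) {
            free(r);
            return (-1);
        }
        memcpy(bigger, h->slots, h->size * sizeof(void *));

        /* Don't free the original now: schedule it for close. */
        r->mem = h->slots;
        r->next = h->retired;
        h->retired = r;

        h->slots = bigger;
        h->size *= 2;
    }
    h->slots[h->inuse++] = p;
    return (0);
}

/* Called once at close: release the live array and every retired one. */
void
hazard_close(struct hazard_array *h)
{
    struct retired *r, *next;

    for (r = h->retired; r != NULL; r = next) {
        next = r->next;
        free(r->mem);
        free(r);
    }
    free(h->slots);
    h->slots = NULL;
    h->retired = NULL;
    h->size = h->inuse = 0;
}
```

Adding a 251st entry grows the array to 500 slots and leaves the original 250-slot array on the retired list until `hazard_close` runs.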
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WT-2984 Keep sufficient history in the metadata for queries.
Since we treat the checkpoint transaction specially, we also have to
track the amount of history required for the metadata specially.
Previously, there was a window where a query started while a checkpoint
was running could fail to see the results of the checkpoint when it
queried the metadata.
* Remove the unused "session" argument to WT_IS_METADATA.
|
| |
|
|
|
| |
While in the area, fix sending "config={values}" to extensions: just the values should be passed in.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WT-3018 lint
clang version 3.4.1
random-abort.c:37:6: error: no previous extern declaration for
non-static variable 'inmem' [-Werror,-Wmissing-variable-declarations]
* Clean up WT_UNUSED() macro uses, where we do use the variable.
* Back out part of 6028ca3, the #ifdef'd code has variable declarations
and ISO C90 forbids mixed declarations and code.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
* Add a write barrier in front of __wt_cond_signal() to ensure the caller's flags meant to cause a thread to exit are seen by the thread.
* Make the LSM start/stop worker thread loops look the same.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fix a bug found by inspection in LZ4 code: we're going to offset the destination buffer by sizeof(LZ4_PREFIX), so we need to offset the destination buffer length by the same amount.
* Prettiness pass through the snappy driver so it and the zstd driver look the same, minor cleanup in zlib.
* Add the compression_level configuration option to the zstd extension initialization function so it's possible to set the compression level from an application.
* Fix a bug in zlib raw compression: the destination buffer size (dst_len) can be smaller than the target page size (page_max); use the smaller of the two values to set the target compression size.
* The zlib raw compression function could return without calling deflateEnd on its z_stream structures, potentially leaking memory and scratch buffers.
* If the default reserved bytes for zlib raw compression aren't sufficient, we fail on compression of what might be very large blocks. We don't know how many bytes need to be reserved to guarantee the final deflate() will succeed, and the default value was experimentally derived; for all we know there are workloads where the default will fail a lot. Add a fallback before failing hard: try with 2x the default reserved space.
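The LZ4 fix in the first bullet can be illustrated with a small sketch. This is not the extension's code: `EX_PREFIX`, `ex_toy_compress` (a stand-in that only copies bytes) and `ex_compress_with_prefix` are hypothetical names. The point is only that when the destination pointer is advanced past a prefix, the destination length passed to the compressor must shrink by the same amount.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical prefix stored in front of the compressed payload. */
typedef struct {
    uint32_t compressed_len;
    uint32_t uncompressed_len;
} EX_PREFIX;

/* Stand-in for a real compressor: copies the input if it fits. */
int
ex_toy_compress(const uint8_t *src, size_t src_len,
    uint8_t *dst, size_t dst_len, size_t *result_lenp)
{
    if (src_len > dst_len)
        return (-1);
    memcpy(dst, src, src_len);
    *result_lenp = src_len;
    return (0);
}

int
ex_compress_with_prefix(const uint8_t *src, size_t src_len,
    uint8_t *dst, size_t dst_len, size_t *result_lenp)
{
    EX_PREFIX prefix;
    size_t payload_len;

    assert(src != NULL && dst != NULL && result_lenp != NULL);
    if (dst_len < sizeof(prefix))
        return (-1);

    /*
     * The payload starts after the prefix, so offset the buffer AND
     * reduce the available length by the same amount. Passing the full
     * dst_len here would let the compressor overrun the buffer.
     */
    if (ex_toy_compress(src, src_len,
        dst + sizeof(prefix), dst_len - sizeof(prefix), &payload_len) != 0)
        return (-1);

    prefix.compressed_len = (uint32_t)payload_len;
    prefix.uncompressed_len = (uint32_t)src_len;
    memcpy(dst, &prefix, sizeof(prefix));
    *result_lenp = payload_len + sizeof(prefix);
    return (0);
}
```

With a 16-byte input and an 8-byte prefix, a 20-byte destination is rejected, while a 24-byte destination succeeds and is filled exactly.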
|
|
|
| |
Exposed via a new 'cache_walk' statistics configuration option.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
high level locks (#3086)
* WT-2955 Add statistics tracking the amount of time threads spend waiting for high level locks
Sort the statistics categories so it's easier to find stuff, no real change.
* Add counters and usec wait times to long-term locks (currently the
checkpoint, handle-list, metadata, schema and table locks).
* mutex.i:295:26: error: conversion to int64_t {aka long int} from long
unsigned int may change the sign of the result [-Werror=sign-conversion]
[t->slot_usecs] += WT_TIMEDIFF_US(leave, enter);
* Rename the lock statistics so they group together.
Split lock wait times between internal and application threads.
* Separate the connection's dummy session initialization out into its own
function, that way it's clear what we're doing.
* The session's connection statistics are fixed when the session ID is
allocated, so we can cache a pointer to them and avoid u_int divisions
(which are currently about the slowest thing you can do on a modern
architecture).
* A slightly different change: instead of caching a reference to the
connection statistics, cache the offset into the array of statistics
pointers, that way we can avoid the integer division when updating
almost all statistics.
* Review comments:
Add comments describing the use of statistics array offsets in lock
tracking.
Rename WT_STATS_FIELD_TO_SLOT to WT_STATS_FIELD_TO_OFFSET.
Whitespace cleanup.
* __wt_cache_create() doesn't need to call __wt_cache_destroy() explicitly,
if the connection open fails at any point, __wt_cache_destroy() will be
called as part of that cleanup.
* Append the suffix "_off" to the spinlock structure statistics field
names, clarify they're offsets into the statistics structures.
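A rough sketch of the two caching ideas in this message, under stated assumptions: statistics are kept as an array of per-slot structures whose fields are all `int64_t`, the slot is derived from the session ID, and a lock stores the offset of "its" counter rather than a pointer. The slot count of 23 and all `ex_*`/`EX_*` names are hypothetical; only the idea of a field-to-offset macro comes from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_STAT_SLOTS 23 /* hypothetical number of statistics slots */

/* Hypothetical connection statistics: every field is an int64_t. */
struct ex_conn_stats {
    int64_t lock_checkpoint_count;
    int64_t lock_checkpoint_wait_usecs;
    int64_t lock_schema_count;
};

/* Convert a field name into an index into the structure's int64_t array. */
#define EX_STATS_FIELD_TO_OFFSET(fld) \
    ((int)(offsetof(struct ex_conn_stats, fld) / sizeof(int64_t)))

struct ex_session {
    uint32_t id;
    uint32_t stat_slot; /* cached: computed once when the ID is set */
};

void
ex_session_init(struct ex_session *s, uint32_t id)
{
    s->id = id;
    /* The only division: done once, not on every statistics update. */
    s->stat_slot = id % EX_STAT_SLOTS;
}

/* Update a statistic identified by offset, using the cached slot. */
void
ex_stat_incr(struct ex_conn_stats *stats,
    const struct ex_session *s, int off, int64_t v)
{
    assert(s->stat_slot < EX_STAT_SLOTS);
    ((int64_t *)&stats[s->stat_slot])[off] += v;
}
```

A lock structure would store `EX_STATS_FIELD_TO_OFFSET(lock_schema_count)` in an `_off` field at initialization and pass it to `ex_stat_incr` whenever the lock is acquired.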
|
|
|
| |
If a system call to retrieve a timestamp fails, it now results in a panic. We couldn't find any case where that's a real possibility.
|
|
|
| |
This saves locking and scanning the list of active handles for read-only workloads, which can be time consuming when there are many tables.
|
|
|
|
| |
Also replace key-string generation implementation with a more efficient implementation.
|
|
|
| |
Reverts commit ac1f7401dcb8be345973f7787d9121c5c321bf7b.
|
|
|
| |
Don't unlock the spin lock unless we've locked it.
|
| |
|
|
|
|
|
| |
* Don't relax the clean / dirty requirements when we get aggressive: if application threads are blocked by the dirty limit, evicting a clean page doesn't unblock them so there's no point trying.
* Sanity check dirty eviction limits: once the cache is full, we should evict both clean and dirty pages.
|
|
|
|
|
|
|
| |
Having _FAST_ macros gives an impression that when we use them,
we are collecting fast statistics only, which is not true.
Except when statistics=none is set, we collect all the stats.
This change removes _FAST_ macros and modifies the basic macros
to only collect stats when statistics=none is not set.
|
|
|
|
| |
Also avoid eviction of dirty pages entirely in the eviction server if
worker threads are configured.
|
|
|
|
|
|
|
|
|
|
| |
Along with several other eviction tuning changes, including:
* Do scrubbing in units of 10MB or 1% of cache (whichever is larger) to avoid stalling application threads in large caches.
* Don't include eviction walks in the "pages requested" stats.
* Only skip eviction on pages where a previous eviction attempt failed.
* Don't have application threads attempt to write pages if there are
eviction workers: with enough threads, they swamp the disk and see high
latencies.
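The "10MB or 1% of cache (whichever is larger)" rule in the first bullet works out as follows. This is a sketch only: the function name is hypothetical, and treating "MB" as 2^20 bytes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MEGABYTE ((uint64_t)1 << 20) /* assumes MB means 2^20 bytes */

/* Size of one scrubbing unit for a cache of the given size, in bytes. */
uint64_t
ex_scrub_unit(uint64_t cache_size)
{
    uint64_t floor_bytes, one_percent;

    assert(cache_size > 0);
    floor_bytes = 10 * EX_MEGABYTE;
    one_percent = cache_size / 100;
    return (one_percent > floor_bytes ? one_percent : floor_bytes);
}
```

For a 100MB cache, 1% is about 1MB, so the 10MB floor applies; for a 10GB cache, 1% is roughly 102MB and becomes the unit.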
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* WT-2892 hot backup can race with block truncate
Hold a readlock across the block truncate, otherwise a hot backup
could potentially race with the truncation.
* Don't truncate log files if there's a hot backup in progress.
* Moving a pre-allocated log file into place is a rename, and so it shouldn't
be affected by a hot backup.
The only real change is removing the test of WT_CONNECTION_IMPL.hot_backup,
plus some minor restructuring for clarity.
* Revert "Moving a pre-allocated log file into place is a rename, and so it shouldn't"
This reverts commit 3b8b2a7a81b0542b2b87f29339c18c6e7a660ed2.
* The log-file rename for a pre-allocated file can race with a hot-backup,
we have to hold a read lock on the hot backup lock across the process.
Minor shuffling & whitespace for clarity, move some variables closer to
where they're used.
* Tweak default settings, increase backups from 5% to 20%, logging
from 30% to 50%.
* __wt_cursor_close() doesn't need a local "ret" variable.
* Minor tweak to catch more of the standard API_CALL macros.
|
|
|
| |
Some functions return an error code even though they don't need to. That adds complexity to our code. Switch to returning void.
|
| |
|
|
|
|
|
|
| |
__wt_checkpoint_signal doesn't return anything, switch the function
type from int to void.
WT_CONNECTION_IMPL.ckpt_signalled is a boolean, retype it.
|
|
|
|
|
|
|
|
|
|
| |
A set of changes to the eviction algorithm including:
* Fix a bug in how many items can be added to the urgent queue
* Have the eviction server sleep less so it recovers from disruptions faster.
* Only have application threads evict dirty pages if they are blocked on the dirty trigger.
* Swap eviction queues when one becomes empty.
* Have the eviction server populate the "other" queue whenever it notices that it isn't full.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix bugs related to reconfiguring eviction settings.
The code to support reconfiguring eviction worker threads had
several bugs. Implemented an abstracted utility thread group API
and switch eviction workers over to using the new abstraction.
* Remove unused function from new thread group API.
* Commit auto-generated files.
* Be more careful cleaning up on error in thread group code.
* Fix uninitialized variable.
* Ensure thread group structures are cleared after destruction.
This is necessary since the structure is re-used for eviction workers
when recovery is run.
* Remove util and worker as notions from thread group as per review feedback.
Implement other review feedback as well.
* Fix a bug where application threads could attempt to help with eviction
before the server is set up.
Happens if using a shared cache so the cache size starts out at 0.
* Remove _util_ prefix from thread group functions.
* Add session name to thread group. Fix some comments and whitespace.
* Restore error return path.
|
|
|
|
|
|
|
|
|
|
| |
* The pthread mutex implementation of the spinlock lock/unlock functions didn't check the underlying pthread_mutex functions for failure. Panic if pthreads fails.
* Change condition mutex functions to not return errors.
* Change __wt_verbose() to not return errors.
* Make a final panic check before writing anything.
|
|
|
|
|
|
|
|
|
|
|
| |
There are known crashes in __wt_split_stash_discard_all() during
WT_CONNECTION.close.
We've been unable to reproduce them or spot the problem, but it is
also a possibility we're racing inside WiredTiger because there
are operations running in WiredTiger during a MongoDB shutdown.
MongoDB usually configures leak_memory on connection close, extend
the connection's leak-memory semantics to session memory, hoping to
at least make the symptom go away.
|
| |
|
| |
|
|
|
| |
* Rework the block manager to ignore whether or not truncate works at a low-level, rather than handling errors we don't care about in the callers.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead, get multi-threaded writes by dialling down the eviction dirty trigger.
Change eviction_dirty_* defaults (to 20/5).
Make sure all files are available for eviction before starting fsync.
Take more care excluding clean files from checkpoints.
Only use scrubbing mode when there is minimal cache pressure.
Avoid penalizing read-only operations with dirty eviction targets.
Enable scrubbing when eviction is keeping space available in cache.
Add stats for checkpoint scrubbing phase.
Improve cache scrubbing with big caches and slow I/O.
Use the rate of bytes written from cache to decide how long to wait for
the dirty bytes to come down.
|
|
|
|
|
|
|
|
| |
* WT-2793 Remove very long running config. Rename the one we run.
* Fix overflow stats. Make 130K overflow test use btree.
* Add line to upgrading doc stating stat field removed.
|
|
|
| |
We used to provide the ability to specify a custom checkpoint name that the checkpoint server would use. That feature increased code complexity and led to inconsistent checkpoint behavior.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
API to make durable the default (#2867)
Change the default remove/rename calls to flush the enclosing directory.
Simplify the pluggable file system API by replacing the directory-sync method
with a "durable" boolean argument to the remove, rename and open-file methods.
* Add "durable" arguments to relevant functions so that each remove or rename
call specifies its durability requirements.
* Switch the WT_FILE_SYSTEM::fs_open_file type enum from WT_OPEN_FILE_TYPE,
with WT_OPEN_XXX names, to the WT_FS_OPEN_FILE_TYPE, with WT_FS_OPEN_XXX
names.
Switch the WT_FILE_SYSTEM::fs_open_file flags from WT_OPEN_XXX names to
WT_FS_OPEN_XXX names.
* Replace the "bool durable" argument to WT_FILE_SYSTEM.fs_remove and
WT_FILE_SYSTEM.fs_rename with a "uint32_t flags" argument, and the
WT_FS_DURABLE flag.
* Remove a stray bracket.
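To illustrate a per-call durability flag like the one described above, here is a POSIX-only sketch. It is not the WT_FILE_SYSTEM interface: `EX_FS_DURABLE` and `ex_fs_remove` are hypothetical, and the caller passes the enclosing directory explicitly to keep the example short. A durable remove flushes the directory after the unlink so the removal itself survives a crash.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

#define EX_FS_DURABLE 0x1u /* hypothetical flag: flush the directory too */

/*
 * Remove a file; if EX_FS_DURABLE is set, also fsync the enclosing
 * directory. Returns 0 on success, -1 on failure.
 */
int
ex_fs_remove(const char *dir, const char *path, uint32_t flags)
{
    int fd, ret;

    assert(dir != NULL && path != NULL);
    if (remove(path) != 0)
        return (-1);
    if ((flags & EX_FS_DURABLE) == 0)
        return (0);

    /* Flush the directory entry change. */
    if ((fd = open(dir, O_RDONLY)) == -1)
        return (-1);
    ret = fsync(fd);
    if (close(fd) != 0)
        ret = -1;
    return (ret == 0 ? 0 : -1);
}
```

Using a `uint32_t flags` argument instead of a `bool` leaves room for further per-call options without changing the function signature again.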
|
|
|
|
|
|
| |
* Upgrade to the current FreeBSD sys/queue.h file, and remove the write barriers from the TAILQ_XXX macros.
* Add a write barrier before adding data handles to the global list. Otherwise, the sweep server could potentially see incomplete structures.
|
|
|
|
|
|
|
|
|
|
| |
* WT-2711 Remove POSIX expanded strftime values and use older C89 values
* Fix issues with s_string
* Add a comment so nobody rewrites the strftime format and reintroduces the bug.
* Fix strings sort order.
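As an illustration of the distinction: conversions such as `%F` and `%T` were added after C89, while `%Y`, `%m`, `%d`, `%H`, `%M` and `%S` are part of C89. The commit does not say which conversions were replaced, so the format below is only an assumed example, and `ex_format_timestamp` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/*
 * Format a timestamp using only C89 strftime conversions.
 *
 * Don't shorten this to "%F %T": those conversions are not in C89 and
 * are not supported by every C library.
 */
size_t
ex_format_timestamp(char *buf, size_t len, const struct tm *tm)
{
    assert(buf != NULL && tm != NULL);
    return (strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm));
}
```

The comment in front of the function plays the same role as the one the commit adds: it warns the next reader not to "simplify" the format string.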
|
|
|
|
|
|
|
|
|
|
| |
KNF, remove space after "sizeof" keyword.
Info 790: Suspicious truncation, integral to float
strlen returns a size_t, cast before comparing against a wt_off_t.
size_t is 8B, more size_t cleanups.
|
|
|
|
| |
If there's no server running, discard any configuration information so
we don't leak memory during reconfiguration.
|
|
|
|
|
|
|
|
|
| |
No longer support setting the statistics_log path in WT_CONNECTION::reconfigure.
No longer support setting a custom name for statistics files, only allow a destination directory.
Be more explicit about which logging configuration options are allowed in WT_CONNECTION::reconfigure.
The aim of these changes is to avoid situations where applications that embed WiredTiger allow their users to overwrite unexpected files on a file system.
This potentially requires an upgrade step for applications that were specifying a non-standard file name component for statistics log file names, it's not backward compatible.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add a statistic for the amount of cache used for page images vs other objects
* Reset the "maximum page size" statistic each checkpoint
* Add a separate queue for urgent eviction, replaces existing WOULD_BLOCK eviction mode
Until now, the eviction server has had to walk the cache to find pages that
would otherwise block application threads. This change allows threads to put
those pages directly on a queue that is checked before ordinary LRU eviction.
* Don't queue pages for urgent eviction when eviction is disabled in a tree.
Decouple evict_queue_lock and the individual queue's evict_lock in
__evict_get_ref. Don't block the eviction server if there are many
application threads waiting on a queue.
* Take care with the urgent queue.
Entries in the urgent queue cannot be skipped or they will not be considered again.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Randomize visits to trees that use a tiny fraction of the cache.
Other general eviction optimizations.
Now that we are queuing more entries (potentially), make sure enough of
them become candidates. Previously, a skewed distribution of read
generations could mean that only 10% of queue entries were considered.
Improve the efficiency of sorting the queue by calculating the score
once when pages are added to the queue.
Add a workload to exercise differential eviction from trees.
In order to do that, add range partitioning to wtperf.
Don't override icount in wtperf workloads with random_range set.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Build a Windows-to-POSIX/ANSI error translation layer.
Replace the read-only error mapping to WT_NOTFOUND and WT_PERM_DENIED
with EACCES and ENOENT.
Windows no longer needs its own version of __wt_strerror(), move the POSIX implementation from os_posix/os_errno.c to os_common/os_errno.c. Rename os_win/os_errno.c to os_win/os_winerr.c to avoid a collision.
Windows now has DWORD types in prototypes, split the Windows/POSIX extern.h files. (This actually cleans up some noise, previously we had to sort the OS prototypes to remove duplicates, which wasn't trivial.)
Add the WT_EXTENSION_API.map_windows_error method to map Windows system codes to POSIX/ANSI system codes.
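A translation layer of this kind can be sketched as a lookup table. The code below is portable and does not include `windows.h`: the numeric values 2, 3 and 5 are the documented Windows codes for ERROR_FILE_NOT_FOUND, ERROR_PATH_NOT_FOUND and ERROR_ACCESS_DENIED, but the table contents, the `EINVAL` fallback and the `ex_*` names are assumptions, not the mapping WiredTiger ships.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Windows system error codes, spelled out to avoid needing windows.h. */
#define EX_ERROR_FILE_NOT_FOUND 2u
#define EX_ERROR_PATH_NOT_FOUND 3u
#define EX_ERROR_ACCESS_DENIED 5u

static const struct {
    uint32_t windows_error;
    int posix_error;
} ex_error_map[] = {
    {EX_ERROR_FILE_NOT_FOUND, ENOENT},
    {EX_ERROR_PATH_NOT_FOUND, ENOENT},
    {EX_ERROR_ACCESS_DENIED, EACCES},
};

/* Map a Windows system error code to a POSIX/ANSI errno value. */
int
ex_map_windows_error(uint32_t windows_error)
{
    size_t i;

    for (i = 0; i < sizeof(ex_error_map) / sizeof(ex_error_map[0]); ++i)
        if (ex_error_map[i].windows_error == windows_error)
            return (ex_error_map[i].posix_error);

    /* Assumed fallback for unmapped codes. */
    return (EINVAL);
}
```

Callers can then test for `ENOENT` or `EACCES` on every platform instead of carrying Windows-specific codes upward.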
|
|
|
| |
Add more options for callers when updating the oldest ID to control how much they care about the ID being updated.
|
|
|
|
|
| |
gcc 4.7, clang 3.4 and clang 3.8 all complain about different things;
cast the arguments to our new wrapper functions, hopefully that makes
all of the complaints go away.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Cast arguments for ctype functions to avoid sign extension errors.
Create __wt_* versions of all ctype functions, and use them whenever
wt_internal.h is available. Add check to prevent direct use of ctype functions
from core source.
* Change wrappers to use u_char arguments, and return bool.
Remove unused wrappers.
* Use u_char in preference to unsigned char.
* Examples do not have u_char defined on Windows.
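The wrapper approach described above might look like the following sketch. The `ex_*` names are hypothetical (the commit uses a `__wt_` prefix), and plain `unsigned char` is used in place of `u_char` so the example compiles where `u_char` is not defined. Taking an unsigned argument means a negative `char` value can never reach the ctype function, which would be undefined behavior.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Wrappers take an unsigned char and return bool. */
bool
ex_isdigit(unsigned char c)
{
    return (isdigit(c) != 0);
}

bool
ex_isspace(unsigned char c)
{
    return (isspace(c) != 0);
}

/* Count leading digits in a string that may hold bytes >= 0x80. */
size_t
ex_count_leading_digits(const char *s)
{
    size_t n;

    assert(s != NULL);
    for (n = 0; ex_isdigit((unsigned char)s[n]); ++n)
        ;
    return (n);
}
```

Callers cast at the call site, `ex_isdigit((unsigned char)s[i])`, so sign extension of a `char` is ruled out before the value is widened to `int`.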
|
|
|
|
| |
Clear the WT_CONN_CACHE_POOL flag when the shared cache is destroyed so
reconfigure won't try to destroy it more than once.
|