WiredTiger release 1.3.1, 2012-09-25
------------------------------------

This is a bugfix release, primarily related to LSM trees. The changes are as follows:

[#309] Implement auto-commit of transactions at the API. As well as ensuring the atomicity of complex operations, this change simplified code that simulated auto-commit internally and fixed a number of bugs.
[#321] Bulk-cursors no longer block checkpoints. We can't write files that are being bulk-loaded, so change checkpoint to create checkpoints in the metadata that, if accessed, look like empty files. Tighten down the requirements for bulk-load: the only thing that can be bulk-loaded now is a newly created tree, not any empty file.
[#329] Add dictionary support to variable-length column store objects. Support large row-store reconciliation dictionaries: add a skiplist as the indexing mechanism.
[#333] Fix a leak of the in-memory transaction log structure and the LSM data source handle.
[#334] Fix a memory leak where a page's replacement address wasn't being freed.

* Check that LSM trees are not configured as column stores.
* Fix a race when starting the LSM worker thread. It was possible for the thread to exit immediately if it started fast enough.
* Two fixes for LSM: one to ensure that cursors read from a checkpoint if one is available, the other to reduce the number of empty chunks that can be created initially.
* Fix a bug that disabled bloom filters.
* The configure script checks for Python support in SWIG.
* If a drop operation fails to acquire all of the handle locks it needs, make sure it releases the primary handle lock.
* Fix a number of other minor bugs and memory leaks.

WiredTiger release 1.3.0, 2012-09-17
------------------------------------

This release contains a number of major new features, including:

* support for LSM trees with Bloom filters;
* support for hot backups; and
* support for fast truncation of files.

In addition, there are some critical bug fixes.
We recommend that all users upgrade.

Here is the full list of changes:

[#143] Implement random record lookups.
[#168] Add support for LSM trees.
[#168] Add support for Bloom filters in LSM trees.
[#198] Handle page-generation wraparound.
[#236] Implement hot backups.
[#244] Index cursors for column-store objects may not be created using the record number as the index key.
[#247] Add a fast-path for WT_SESSION::truncate that avoids reading most data to be deleted.
[#259] Performance hack for cursor open: don't parse the configuration strings for a default value if the application didn't specify a configuration string.
[#262] Disable dump on child cursors: only the top-level cursor is wrapped in a dump cursor.
[#266] Deal with new / dropped indices in __wt_schema_open_index.
[#269] Checkpoint handles must not be open when they are overwritten.
[#271] Add support for a reserved checkpoint name "WiredTigerCheckpoint" that opens the object's last checkpoint.
[#271] Add the ability to access unnamed checkpoints.
[#274] Change cursor.equals to return a standard error value and store the cursor equality result in a separate argument.
[#275] If an exclusive handle is required for an operation and it is not available, fail immediately: don't block.
[#276] Fix methods that return integer parameters from Python. This includes cursor.equals and cursor.search_near.
[#277] Acquire the schema lock when creating the metadata file. We're single-threaded, so it isn't protecting against anything, but the handle management code expects to have the schema lock.
[#279] Some optimizations for __wt_config_gets_defno. Specifically, if we're dealing with a simple stack of config strings, just parse the application string rather than the full list of defaults.
[#279] Split the description string into a set of structures, to reduce the number of string comparisons and manipulations required.
[#282] Remove the cursor.reconfigure method, and replace it with documentation showing how to "reconfigure" cursors using the session.open_cursor method to duplicate them with different configuration strings.
[#284] Fix for a hazard reference race, where page eviction races with the creation of the hazard reference: we have to check the pointer itself as well as the state of the pointer.
[#285] We can clear the tree's modified flag on checkpoint, as long as the checkpoint writes all modifications. Clear the tree's modified flag before we start the checkpoint, but reset it as necessary if reconciliation is unable to write all of the changes in a page.
[#287] Fix __wt_config_check to handle overlapping config values correctly.
[#289] Add support for read-committed isolation and make it the default. Add a session-level "isolation" setting.
[#294] If txn_commit fails, document that the transaction was rolled back.
[#295] Expand the documentation on using cursors without explicit transactions.
[#300] Include all changes whenever closing a file, don't check for visibility. If updates are skipped while evicting a page, give up.
[#305] Have "wt dump" fail more gracefully if the object doesn't exist.
[#310] When freeing a tracked address in reconciliation, clear it to avoid freeing the same address again on error.
[#314] Replace cursor.equals with cursor.compare.
[#319] Clear the bulk_load_ok flag when closing handles.

* Add an "ancient transaction" statistic so we can find out if they're actually occurring in the field.
* Add a "was object ever modified" flag to the btree handle, and use it to avoid writing read-only objects during internal checkpoints, issue
* Add per-connection statistics counters for transaction checkpoint, begin, commit and rollback. Add per-btree statistics counters for update conflicts.
* Another fixed-length column-store implicit record fix: if the earliest row in the object is row 10, and it's on an append list, we still must return rows 1-9: they've been implicitly created.
* Bulk cursors: disallow cursor.{equals,next,prev,reset,search,search_near,update,remove}; only close and insert are supported.
* Change session.truncate to support any cursor position for range truncation, not just keys that are known to exist.
* Checkpoint has to flush the metadata file, but only after it's flushed all of the other files.
* Discard obsolete WT_UPDATE structures during updates.
* Document that duplicated cursors are positioned at the same point as the cursor that was duplicated.
* Fix a (very unlikely) deadlock at startup, if an application issues a checkpoint before the eviction server has managed to open its session.
* Fix a core dump if we verify a file that's corrupted such that we are unable to load any checkpoints at all, and the per-checkpoint bit map is never set.
* If a page selected for eviction cannot be freed because it has some recent updates, try instead to free memory by trimming old updates.
* If a thread fails to evict a page, try to bump its snapshot. This avoids the common case of read-committed threads getting stuck because one thread falls behind (e.g., because we can't evict during a checkpoint).
* If an exclusive table create fails, return EEXIST.
* If we try to remove a file that doesn't exist, don't complain, return success.
* If we're repeatedly taking a checkpoint with the same name, skip the work for read-only objects.
* Instead of flagging the empty tree's leaf page empty as part of creating an empty tree in memory, set the page as modified (to force reconciliation); if the leaf page is still empty at that time, then we'll figure it out during that reconciliation. This fixes a memory leak where the leaf page of an empty tree wasn't being freed.
* It's not unreasonable to open a cursor on a non-existent table, don't complain, just return not-found.
* Move dist/RELEASE to the top level of the tree.
* Optimization: don't repeatedly look up btree handles for schema operations.
* Return keys from all operations: don't keep pointing to the application's key.
* Update btree usage of the 64-bit bitstring implementation, so it's cleaner.
* Update the bitstring implementation to use 64-bit length strings.
* Updates performed without an active transaction should become visible with the current transaction ID.
* Upgrade to doxygen 1.8.x.
* Use a real snapshot transaction for checkpoints. Otherwise, the snapshot can be updated in between checkpointing multiple files (when updating the metadata).

WiredTiger release 1.2.2, 2012-06-20
------------------------------------

This is a bugfix release. The changes are as follows:

* Defer making free pages available until the end of a checkpoint, in case there is a failure after processing some files.
* When checking the value of the "isolation" key, don't assume it is NUL terminated. This bug could cause transactions to run with incorrect isolation.
* Fix two bugs with snapshot isolation:
  1. reset the isolation level when the transaction completes;
  2. when checking visibility, check item's ID against the maximum snapshot ID (not the transaction's ID).

WiredTiger release 1.2.1, 2012-06-15
------------------------------------

This is a bugfix release. The changes are as follows:

* Avoid a deadlock between eviction and checkpoint on the connection spinlock.
* Allocate "desc" buffers in heap memory so that they are correctly aligned (fixes direct_io support on Linux).
* Initialize the snapshot-avail list after cleaning it out, otherwise we'll try to print a NULL pointer in VERBOSE mode.

WiredTiger release 1.2.0, 2012-06-04
------------------------------------

This release contains many bugfixes and improvements.
The major changes are:

[#138] Add support for transactions with coarse-grained durability. Transactions provide atomicity guarantees and rollback, and uncommitted changes are never written to disk. There is no on-disk log, so committed changes only become durable when the next checkpoint completes. Checkpoints are implemented by creating transactionally-consistent snapshots within data files.
[#156] Fully support operations that make schema changes with multiple sessions open concurrently.
[#159] Disable internal page key suffix compression if a custom collator is configured. This avoids issues with collators that require complete keys.
[#167] Add support for durable snapshots within files. While a snapshot is active, the pages used by the snapshot will not be overwritten. If a file is accessed after a crash or application exit without calling WT_CONNECTION::close, any changes made after the last snapshot will be silently ignored.
[#214, #216] Fixes for forcing eviction with small caches.

WiredTiger release 1.1.5, 2012-04-26
------------------------------------

Don't update a WT_REF after it has been unlocked.

Add an operation to set a flag atomically, use it to avoid racing on page flags.

Fix a race between sync and reading that could cause a segfault.

WiredTiger release 1.1.4, 2012-04-16
------------------------------------

Check the versions of autoconf, automake and libtool to avoid failures when trying to build from the github tree with versions that are too old. [#191]

Create the schema table as part of creating the environment so that application threads don't race trying to create it later. [#193]

Split-merge pages have to be reconciled to mark their parents dirty. [#194]

The dump utility should only output configuration that can be passed to WT_SESSION::create.

Eviction fixes for out-of-cache update workloads:
* Fix an unlikely bug where the EVICT_LRU flag was cleared when a page in the LRU queue was overwritten with itself during a walk.
  This led to an assertion failure when the page was later evicted.
* Clear all unused eviction queue entries while holding the lru_lock.
* Split WT_PAGE->flags so that there is no possibility of racing:
  (1) Move WT_PAGE_REC_* flags into WT_PAGE_MODIFY;
  (2) Use atomic operations to set and clear the remaining (2) page flags.

Move the test/format threads setting into the CONFIG file.

WiredTiger release 1.1.3, 2012-04-04
------------------------------------

Fix the "exclusive" config for WT_SESSION::create. [#181]
1. Make it work for files within a single session.
2. Make it work for files across sessions.
3. Make other data sources consistent with files.

Fix an eviction bug introduced in 1.1.2: when evicting a page with children, remove the children from the LRU eviction queue. Reduce the impact of clearing a page from the LRU queue by marking pages on the queue with a flag (WT_PAGE_EVICT_LRU).

During an eviction walk, pin pages up to the root so there is no need to spin when attempting to lock a parent page.

Use the EVICT_LRU page flag to avoid putting a page on the LRU queue multiple times.

Layer dump cursors on top of any cursor type.

Add a section on replacing the default system memory allocator to the tuning page.

Typo in usage method for "wt write".

Don't report range errors for config values that aren't well-formed integers.

WiredTiger release 1.1.2, 2012-03-20
------------------------------------

Add public-domain copyright notices to the extension code.

test/format can now run multi-threaded, fixed two bugs it found:
(1) When iterating backwards through a skiplist, we could race with an insert.
(2) If eviction fails for a page, we have to assume that eviction has unlocked the reference.

Scan row-store leaf pages twice when reading to reduce the overhead of the index array.

Eviction race fixes:
(1) Call __rec_review with WT_REFs: don't look at the page until we've checked the state.
(2) Clear the eviction point if we hit it when discarding a child page, not just the parent.

Eviction tuning changes, particularly for read-only, out-of-cache workloads. Only notify the eviction server if an application thread doesn't find any pages to evict, and then only once. Only spin on the LRU lock if there might be pages in the LRU queue to evict. Keep the current eviction point in memory and make the eviction walk run concurrent with LRU eviction.

Every test now has err/out captured, and it is checked to ensure it is empty at the end of every test.

WiredTiger release 1.1.1, 2012-03-12
------------------------------------

Default to a verbose build: that can be switched off by running "configure --enable-silent-rules".

Account for all memory allocated when reading a page into cache. Total memory usage is now much closer to the cache size when using many small keys and values.

Have application threads trigger a retry of forced page eviction rather than blocking eviction. This allows rec_evict.c to simply set the WT_REF state to WT_REF_MEM after all failures, and fixes a bug where pages on the forced eviction queue would end up with state WT_REF_MEM, meaning they could be chosen for eviction multiple times.

Grow existing scratch buffers in preference to allocating new ones.

Fix a race between threads reading in and then modifying a page.

Get rid of the pinned flag: it is no longer used.

Fix a race where btree files weren't completely closed before they could be re-opened. This behavior can be triggered by using a new session on every operation (see the new -S flag to the test/thread program). [#178]

When connections are closed, create a session and discard the btree handles. This fixes a long-standing bug in closing a connection: if for any reason there are btree handles still open, we need a real session handle to close them.

Really close btree handles: otherwise we can't safely remove or rename them. Fixes test failures in test_base02 (among others).
Wait for application threads in LRU eviction to drain before walking a file.

Fix a buffer size calculation when updating the root address of a file.

Documentation fix: 10% of 1MB is 100KB.

WiredTiger release 1.1.0, 2012-02-28
------------------------------------

Add checks to the session.truncate method to ensure the start/stop cursors reference the same object and have been initialized.

Implement cursor duplication via WT_SESSION::open_cursor. [#161]

Switch to quiet builds by default. Fix with automake version < 1.11, use foreign mode so that fewer top-level files are required.

If a session or connection method is about to return WT_NOTFOUND (some underlying object was not found), map it to ENOENT: only cursor methods return WT_NOTFOUND. [#163]

Save and restore session->btree in schema ops to simplify calling code. [#164]

Note the wiredtiger_open config string "multiprocess" is not yet supported.

Move "root:F" and "version:F" entries for files into the value for "file:F", so there is only a single record per file. [NOTE: SCHEMA CHANGE]

When parsing config strings, continue to the end of the string in case of repeated keys. [#124]

Don't require shared libraries unless Python is configured.

Add support for direct I/O, with the config "direct_io=(data,log)". Build with _GNU_SOURCE on Linux to enable O_DIRECT.

Don't keep the last page of column stores pinned: it prevented eviction of large trees created from scratch.

Allow application threads to evict pages from any tree: maintain a count of threads doing LRU in each tree and wait for activity to drain when closing.
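
The 1.1.0 note about parsing config strings to the end in case of repeated keys can be illustrated with a small sketch. This is not WiredTiger's parser: config_get_last is a hypothetical helper that handles only flat, comma-separated "key=value" strings (no nesting or quoting), and it assumes that the last occurrence of a repeated key is the one that takes effect, which the note above does not state explicitly.

```python
def config_get_last(config, key):
    """Return the value of the last `key=value` item in a flat,
    comma-separated config string, or None if the key never appears.

    Hypothetical helper for illustration only; it does not handle the
    nested or quoted values a real configuration parser must support.
    """
    value = None
    for item in config.split(","):
        name, sep, val = item.partition("=")
        if sep and name.strip() == key:
            # Do not stop at the first match: keep scanning so that a
            # later duplicate of the same key overrides this one.
            value = val.strip()
    return value


if __name__ == "__main__":
    print(config_get_last("cache_size=10MB,create,cache_size=100MB", "cache_size"))
```

A parser that returned on the first match would report 10MB for this string; scanning to the end is what lets the later setting win.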