author Keith Bostic <keith@wiredtiger.com> 2014-12-10 16:20:01 -0500
committer Keith Bostic <keith@wiredtiger.com> 2014-12-10 16:20:01 -0500
commit 2e6098e8c349da64371fbe7d5eb5f3b0f08eaebf (patch)
tree da65215b565cc848d24b81e72dde53e90ba2f89b /src/docs/tune-page-sizes.dox
parent 632e10c983024a69156f14bfb136a8006445c9d1 (diff)
Another pass over the wording.
Diffstat (limited to 'src/docs/tune-page-sizes.dox')
-rw-r--r-- src/docs/tune-page-sizes.dox | 78
1 files changed, 41 insertions, 37 deletions
diff --git a/src/docs/tune-page-sizes.dox b/src/docs/tune-page-sizes.dox
index 2c94f5c30b7..2490dbbb5fc 100644
--- a/src/docs/tune-page-sizes.dox
+++ b/src/docs/tune-page-sizes.dox
@@ -32,14 +32,14 @@ disk is necessary, it is usually desirable to read a large amount of
data, assuming some locality of reference in the application's access
pattern).
-The default page size configuration (2KB for \c internal_page_max and
-32KB for \c leaf_page_max), are appropriate for applications with
-relatively small keys and values.
-
-Applications doing table scans in out-of-memory workloads might increase
-both internal and leaf page sizes to transfer more data per I/O.
-
-Applications focused on read/write amplification might decrease the page
+The default page size configurations (2KB for \c internal_page_max, 32KB
+for \c leaf_page_max), are appropriate for applications with relatively
+small keys and values.
+
+- Applications doing full-table scans through out-of-memory workloads
+might increase both internal and leaf page sizes to transfer more data
+per I/O.
+- Applications focused on read/write amplification might decrease the page
size to better match the underlying storage block size.
When block compression has been configured, configured page sizes will
@@ -47,43 +47,47 @@ not match the actual size of the page on disk. Block compression in
WiredTiger happens within the I/O subsystem, and so a page might split
even if subsequent compression would result in a resulting page size
small enough to leave as a single page. In other words, page sizes are
-based on in-memory sizes, not on-disk sizes.
-
-The configured page size also determines the default size of overflow
-items, that is, keys and values too large to easily store on a page.
-Overflow items are stored separately in the file from the page where the
-item logically appears, and so referencing overflow items is more
-expensive than referencing on-page items, requiring off-page access and
-additional I/O in many cases For this reason, it is important to avoid
-creating large numbers of overflow items that are commonly referenced.
-
-This is especially important for keys, as keys on internal pages are
-referenced during random searches, not just during data retrieval.
-Generally, applications should make every attempt to avoid creating
-overflow keys.
-
-Applications with large keys and values, and concerned with latency,
+based on in-memory sizes, not on-disk sizes. Applications needing to
+write specific sized blocks may want to consider implementing a
+WT_COMPRESSOR::compress_raw function.
+
+The page sizes also determine the default size of overflow items, that
+is, keys and values too large to easily store on a page. Overflow items
+are stored separately in the file from the page where the item logically
+appears, and so reading or writing an overflow item is more expensive
+than an on-page item, normally requiring additional I/O. Additionally,
+overflow values are not cached in memory. This means overflow items
+won't affect the caching behavior of the application, but it also means
+that each time an overflow value is read, it is re-read from disk.
+
+For both of these reasons, applications should avoid creating large
+numbers of commonly referenced overflow items. This is especially
+important for keys, as keys on internal pages are referenced during
+random searches, not just during data retrieval. Generally,
+applications should make every attempt to avoid creating overflow keys.
+
+- Applications with large keys and values, and concerned with latency,
might increase the page size to avoid creating overflow items, in order
-to avoid the additional work of retrieving them.
+to avoid the additional cost of retrieving them.
-Applications with large keys and values, doing random searches, might
+- Applications with large keys and values, doing random searches, might
decrease the page size to avoid wasting cache space on overflow items
that aren't likely to be needed.
-Applications with large keys and values, and doing table scans, might
+- Applications with large keys and values, doing table scans, might
increase the page size to avoid creating overflow items, as the overflow
-items will have to be read into cache in all cases anyway.
+items must be read into memory in all cases, anyway.
The \c internal_key_max, \c leaf_key_max and \c leaf_value_max
configuration values allow applications to change the size at which a
key or value will be treated as an overflow item.
The value of \c internal_key_max is relative to the maximum internal
-page size. Because the number of keys on an internal page usually
-determines the depth of the tree, the \c internal_key_max value can only
-be adjusted within a certain range, and the configured value will be
-automatically adjusted by WiredTiger to ensure at a reasonable number
-of keys fit on an internal page.
+page size. Because the number of keys on an internal page determines
+the depth of the tree, the \c internal_key_max value can only be
+adjusted within a certain range, and the configured value will be
+automatically adjusted by WiredTiger, if necessary to ensure a
+reasonable number of keys fit on an internal page.
The values of \c leaf_key_max and \c leaf_value_max are not relative to
the maximum leaf page size. If either is larger than the maximum page
@@ -91,9 +95,9 @@ size, the page size will be ignored when the larger keys and values are
being written, and a larger page will be created as necessary.
Most applications should not need to tune the maximum key and value
-sizes. Applications requiring a small page size but also having latency
-concerns such that the additional work to retrieve an overflow item is
-an issue, may find them useful.
+sizes. Applications requiring a small page size, but also having
+latency concerns such that the additional work to retrieve an overflow
+item is an issue, may find them useful.
@section tune_page_sizes_split_percentage Split percentage
@@ -131,4 +135,4 @@ Most applications should not need to tune the allocation size; it is
primarily intended for applications coping with the specific
requirements some file systems make to support features like direct I/O.
- */
+*/
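
The page-size advice in the revised text comes down to choosing values for a few configuration keys (\c internal_page_max, \c leaf_page_max, \c internal_key_max, \c leaf_key_max, \c leaf_value_max). As a minimal sketch, assuming these keys are supplied in a comma-separated configuration string when an object is created, the snippet below assembles such strings for two of the cases the text describes. The helper `page_size_config` is hypothetical and not part of WiredTiger, and the sizes are illustrative, not recommendations.

```python
# Hypothetical helper, NOT part of the WiredTiger API: it only assembles a
# "key=value,key=value" configuration string from the page-size keys named
# in the documentation above.

PAGE_SIZE_KEYS = (
    "internal_page_max",
    "leaf_page_max",
    "internal_key_max",
    "leaf_key_max",
    "leaf_value_max",
)


def page_size_config(**sizes):
    """Return a configuration string containing only known page-size keys."""
    unknown = sorted(set(sizes) - set(PAGE_SIZE_KEYS))
    if unknown:
        raise ValueError("unknown page-size keys: " + ", ".join(unknown))
    # Emit keys in a fixed order so the result is deterministic.
    return ",".join(
        f"{key}={sizes[key]}" for key in PAGE_SIZE_KEYS if key in sizes
    )


# Full-table scans in an out-of-memory workload: larger internal and leaf
# pages move more data per I/O (sizes here are illustrative assumptions).
scan_config = page_size_config(internal_page_max="16KB", leaf_page_max="1MB")

# Read/write amplification concerns: a smaller leaf page, chosen here on the
# assumption that the underlying storage block size is 4KB.
block_config = page_size_config(leaf_page_max="4KB")

print(scan_config)
print(block_config)
```

A string built this way would be passed as the configuration argument when the table or file is created. Because the helper does not validate ranges, the values still have to fall within whatever limits WiredTiger enforces for each key.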