author | David Hows <howsdav@gmail.com> | 2016-10-26 17:18:02 +1100 |
---|---|---|
committer | Michael Cahill <michael.cahill@mongodb.com> | 2016-10-26 17:18:02 +1100 |
commit | 2ddf863d837014f0f11a1fb0c196f95a34e5090d (patch) | |
tree | 21727c5e0d790fa9e6fcc600c0fae4ea1e8623dd | |
parent | ef87391458eb9c4440f2f9dc1aa89b41363b6b9b (diff) | |
download | mongo-2ddf863d837014f0f11a1fb0c196f95a34e5090d.tar.gz |
WT-2971 Add details on raw-compression into WT documentation (#3093)
-rw-r--r-- | src/docs/file-formats.dox | 6 |
-rw-r--r-- | src/docs/spell.ok | 1 |
-rwxr-xr-x | src/docs/tools/doxfilter.py | 4 |
-rw-r--r-- | src/docs/tune-compression.dox | 39 |
4 files changed, 46 insertions, 4 deletions
diff --git a/src/docs/file-formats.dox b/src/docs/file-formats.dox
index 8346024953a..d8990aca7a6 100644
--- a/src/docs/file-formats.dox
+++ b/src/docs/file-formats.dox
@@ -110,7 +110,7 @@ considered. (See @subpage_single huffman for details.)
 compressing blocks of the backing object's file. The cost is additional
 CPU and memory use when reading and writing pages to disk. Note the
 additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)

 Block compression is disabled by default.

@@ -146,7 +146,7 @@ Huffman encoding can be high, and should be considered.
 compressing blocks of the backing object's file. The cost is additional
 CPU and memory use when reading and writing pages to disk. Note the
 additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)

 Block compression is disabled by default.

@@ -157,7 +157,7 @@ compression: block compression.
 compressing blocks of the backing object's file. The cost is additional
 CPU and memory use when reading and writing pages to disk. Note the
 additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)

 Block compression is disabled by default.

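Note on the `@x_ref` tag introduced in the hunks above: it appears to be a WiredTiger-specific tag rather than a standard Doxygen command, and the doxfilter.py change in this commit rewrites it to a plain `@ref`. Unlike `subpage_rep`, the replacement does not append `lang_suffix`. A minimal Python sketch of that rewrite, reusing the two patterns from the commit; `rewrite_x_ref` is a hypothetical helper name used only for illustration and is not a function in doxfilter.py:

```python
import re

# Pattern and replacement as they appear in the doxfilter.py hunk.
x_ref_pat = re.compile(r'@x_ref\s+(\w*)')
x_ref_rep = r'@ref \1'


def rewrite_x_ref(line):
    """Hypothetical helper: apply the @x_ref -> @ref substitution to one line."""
    return re.sub(x_ref_pat, x_ref_rep, line)


if __name__ == "__main__":
    before = "considered. (See @x_ref compression_formats for details.)"
    # The target name is kept as-is; only the tag itself changes.
    print(rewrite_x_ref(before))
```

Lines that already use `@ref` do not match the pattern and pass through unchanged.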
diff --git a/src/docs/spell.ok b/src/docs/spell.ok
index 4b1337f84b8..2413cbc93fb 100644
--- a/src/docs/spell.ok
+++ b/src/docs/spell.ok
@@ -346,6 +346,7 @@ nolock
 nolocking
 nommap
 nop
+noraw
 nosql
 nosync
 notgranted
diff --git a/src/docs/tools/doxfilter.py b/src/docs/tools/doxfilter.py
index b2d5f857df1..f1c3308c689 100755
--- a/src/docs/tools/doxfilter.py
+++ b/src/docs/tools/doxfilter.py
@@ -98,6 +98,9 @@ def process_lang(lang, lines):
     subpage_pat = re.compile(r'@subpage\s+(\w*)')
     subpage_rep = r'@subpage \1' + lang_suffix
     exref_pat = re.compile(r'@ex_ref{ex_([^.]*)[.]c}')
+    # Add some ability to have non-language references
+    x_ref_pat = re.compile(r'@x_ref\s+(\w*)')
+    x_ref_rep = r'@ref \1'
     if lang == 'c':
         exref_rep = r'@ex_ref{ex_\1' + lang_ext + '}'
     else:
@@ -118,6 +121,7 @@ def process_lang(lang, lines):
         line = re.sub(snip_pat, snip_rep, line)
         line = re.sub(mpage_pat, mpage_rep, line)
         line = re.sub(subpage_pat, subpage_rep, line)
+        line = re.sub(x_ref_pat, x_ref_rep, line)
         if '@m_if' in line:
             m = re.search(mif_pat, line)
             if not m:
diff --git a/src/docs/tune-compression.dox b/src/docs/tune-compression.dox
index bb675337a0d..8db2151aa76 100644
--- a/src/docs/tune-compression.dox
+++ b/src/docs/tune-compression.dox
@@ -2,7 +2,7 @@

 WiredTiger includes a number of optional compression techniques. Configuring
 compression generally decreases on-disk and in-memory resource requirements
-and the amount of I/O, and increases CPU cost when rows are read and written.
+and the amount of I/O, and increases CPU cost when data are read and written.

 Configuring compression may change application throughput.
 For example, in applications using solid-state drives (where I/O is less expensive),
@@ -19,7 +19,44 @@ An example of turning on row-store or column-store dictionary compression:

 @snippet ex_all.c Configure dictionary compression on

+@section compression_formats Block Compression Formats
+WiredTiger provides two methods of compressing your data when using block
+compression: the raw and noraw methods. These methods change how WiredTiger
+works to fit data into the blocks that are stored on disk.
+
+@subsection noraw_compression Noraw Compression
+Noraw compression is the traditional compression model where a fixed
+amount of data is given to the compression system, then turned into a
+compressed block of data. The amount of data chosen to compress is the
+data needed to fill the uncompressed block. Thus when compressed, the block will
+be smaller than the normal data size and the sizes written to disk will often
+vary depending on how compressible the data being stored is. Algorithms
+using noraw compression include zlib-noraw, lz4-noraw and snappy.
+
+@subsection raw_compression Raw Compression
+WiredTiger's raw compression takes advantage of compressors that provide a
+streaming compression API. Using the streaming API WiredTiger will try to fit
+as much data as possible into one block. This means that blocks created
+with raw compression should be of similar size. Using a streaming compression
+method should also make for less overhead in compression, as the setup and
+initial work for compressing is done fewer times compared to the amount of
+data stored. Algorithms using raw compression include zlib, lz4.
+
+@subsection to_raw_or_noraw Choosing between Raw and Noraw Compression
+When looking at which compression method to use the biggest consideration is
+that raw compression will normally provide higher compression levels while
+using more CPU for compression.
+
+An additional consideration is that raw compression may provide a performance
+advantage in workloads where data is accessed sequentially. That is because
+more data is generally packed into each block on disk. Conversely, noraw
+compression may perform better for workloads with random access patterns
+because each block will tend to be smaller and require less work to read and
+decompress.
+
 See @ref file_formats_compression for more information on available
 compression techniques.

+See @ref compression for information on how to configure and enable compression.
+
 */
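The two block-compression models described in the new tune-compression.dox text can be sketched in Python with the standard `zlib` module. This is an illustration only and not WiredTiger's implementation: the function names, the page size and the target block size are all assumptions made for the example. The noraw sketch compresses a fixed amount of input per block, so output sizes vary. The raw sketch uses the streaming compressor to pack records greedily until the compressed block would exceed the target size.

```python
import zlib
from typing import Iterable, List


def noraw_blocks(data: bytes, page_size: int) -> List[bytes]:
    """Noraw-style sketch: compress fixed-size slices of the input.

    Every block holds at most page_size uncompressed bytes, so the
    compressed block sizes depend on how compressible each slice is.
    """
    return [zlib.compress(data[i:i + page_size])
            for i in range(0, len(data), page_size)]


def raw_blocks(records: Iterable[bytes], block_size: int) -> List[bytes]:
    """Raw-style sketch: fill each compressed block as full as possible.

    A copy of the streaming compressor is used to test whether the next
    record still fits within block_size once the stream is finished.
    """
    blocks = []
    comp = zlib.compressobj()
    out = b""
    count = 0
    for rec in records:
        trial = comp.copy()
        trial_len = len(out) + len(trial.compress(rec)) + len(trial.flush())
        if trial_len > block_size and count > 0:
            # The record does not fit: close the current block, start anew.
            blocks.append(out + comp.flush())
            comp = zlib.compressobj()
            out = b""
            count = 0
        out += comp.compress(rec)
        count += 1
    if count:
        blocks.append(out + comp.flush())
    return blocks


if __name__ == "__main__":
    recs = [b"key%06d:value-%d;" % (i, i * 7919) for i in range(2000)]
    noraw = noraw_blocks(b"".join(recs), 4096)
    raw = raw_blocks(recs, 4096)
    print("noraw block sizes:", [len(b) for b in noraw])
    print("raw block sizes:  ", [len(b) for b in raw])
```

In this sketch a single record larger than the target is still written as its own oversized block; how WiredTiger handles that case is not covered by the commit text.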