author David Hows <howsdav@gmail.com> 2016-10-26 17:18:02 +1100
committer Michael Cahill <michael.cahill@mongodb.com> 2016-10-26 17:18:02 +1100
commit 2ddf863d837014f0f11a1fb0c196f95a34e5090d
tree 21727c5e0d790fa9e6fcc600c0fae4ea1e8623dd
parent ef87391458eb9c4440f2f9dc1aa89b41363b6b9b
WT-2971 Add details on raw-compression into WT documentation (#3093)
-rw-r--r-- src/docs/file-formats.dox 6
-rw-r--r-- src/docs/spell.ok 1
-rwxr-xr-x src/docs/tools/doxfilter.py 4
-rw-r--r-- src/docs/tune-compression.dox 39
4 files changed, 46 insertions, 4 deletions
diff --git a/src/docs/file-formats.dox b/src/docs/file-formats.dox
index 8346024953a..d8990aca7a6 100644
--- a/src/docs/file-formats.dox
+++ b/src/docs/file-formats.dox
@@ -110,7 +110,7 @@ considered. (See @subpage_single huffman for details.)
compressing blocks of the backing object's file. The cost is additional
CPU and memory use when reading and writing pages to disk. Note the
additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)
Block compression is disabled by default.
@@ -146,7 +146,7 @@ Huffman encoding can be high, and should be considered.
compressing blocks of the backing object's file. The cost is additional
CPU and memory use when reading and writing pages to disk. Note the
additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)
Block compression is disabled by default.
@@ -157,7 +157,7 @@ compression: block compression.
compressing blocks of the backing object's file. The cost is additional
CPU and memory use when reading and writing pages to disk. Note the
additional CPU cost of block compression can be high, and should be
-considered. (See @ref compression for details.)
+considered. (See @x_ref compression_formats for details.)
Block compression is disabled by default.
diff --git a/src/docs/spell.ok b/src/docs/spell.ok
index 4b1337f84b8..2413cbc93fb 100644
--- a/src/docs/spell.ok
+++ b/src/docs/spell.ok
@@ -346,6 +346,7 @@ nolock
nolocking
nommap
nop
+noraw
nosql
nosync
notgranted
diff --git a/src/docs/tools/doxfilter.py b/src/docs/tools/doxfilter.py
index b2d5f857df1..f1c3308c689 100755
--- a/src/docs/tools/doxfilter.py
+++ b/src/docs/tools/doxfilter.py
@@ -98,6 +98,9 @@ def process_lang(lang, lines):
subpage_pat = re.compile(r'@subpage\s+(\w*)')
subpage_rep = r'@subpage \1' + lang_suffix
exref_pat = re.compile(r'@ex_ref{ex_([^.]*)[.]c}')
+ # Add some ability to have non-language references
+ x_ref_pat = re.compile(r'@x_ref\s+(\w*)')
+ x_ref_rep = r'@ref \1'
if lang == 'c':
exref_rep = r'@ex_ref{ex_\1' + lang_ext + '}'
else:
@@ -118,6 +121,7 @@ def process_lang(lang, lines):
line = re.sub(snip_pat, snip_rep, line)
line = re.sub(mpage_pat, mpage_rep, line)
line = re.sub(subpage_pat, subpage_rep, line)
+ line = re.sub(x_ref_pat, x_ref_rep, line)
if '@m_if' in line:
m = re.search(mif_pat, line)
if not m:
diff --git a/src/docs/tune-compression.dox b/src/docs/tune-compression.dox
index bb675337a0d..8db2151aa76 100644
--- a/src/docs/tune-compression.dox
+++ b/src/docs/tune-compression.dox
@@ -2,7 +2,7 @@
WiredTiger includes a number of optional compression techniques. Configuring
compression generally decreases on-disk and in-memory resource requirements
-and the amount of I/O, and increases CPU cost when rows are read and written.
+and the amount of I/O, and increases CPU cost when data are read and written.
Configuring compression may change application throughput. For example,
in applications using solid-state drives (where I/O is less expensive),
@@ -19,7 +19,44 @@ An example of turning on row-store or column-store dictionary compression:
@snippet ex_all.c Configure dictionary compression on
+@section compression_formats Block Compression Formats
+WiredTiger provides two methods of compressing your data when using block
+compression: the raw and noraw methods. These methods change how WiredTiger
+works to fit data into the blocks that are stored on disk.
+
+@subsection noraw_compression Noraw Compression
+Noraw compression is the traditional compression model where a fixed
+amount of data is given to the compression system, then turned into a
+compressed block of data. The amount of data chosen to compress is the
+data needed to fill the uncompressed block. Thus when compressed, the block will
+be smaller than the normal data size and the sizes written to disk will often
+vary depending on how compressible the data being stored is. Algorithms
+using noraw compression include zlib-noraw, lz4-noraw and snappy.
+
+@subsection raw_compression Raw Compression
+WiredTiger's raw compression takes advantage of compressors that provide a
+streaming compression API. Using the streaming API WiredTiger will try to fit
+as much data as possible into one block. This means that blocks created
+with raw compression should be of similar size. Using a streaming compression
+method should also make for less overhead in compression, as the setup and
+initial work for compressing is done fewer times compared to the amount of
+data stored. Algorithms using raw compression include zlib, lz4.
+
+@subsection to_raw_or_noraw Choosing between Raw and Noraw Compression
+When looking at which compression method to use the biggest consideration is
+that raw compression will normally provide higher compression levels while
+using more CPU for compression.
+
+An additional consideration is that raw compression may provide a performance
+advantage in workloads where data is accessed sequentially. That is because
+more data is generally packed into each block on disk. Conversely, noraw
+compression may perform better for workloads with random access patterns
+because each block will tend to be smaller and require less work to read and
+decompress.
+
See @ref file_formats_compression for more information on available
compression techniques.
+See @ref compression for information on how to configure and enable compression.
+
*/
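
The `@x_ref` handling added to `doxfilter.py` in this patch can be tried in isolation. The following is a minimal standalone sketch, not the filter itself: the two pattern definitions are copied from the hunk above, while the helper name `rewrite_x_ref` and the sample input are hypothetical and exist only for illustration.

```python
import re

# Pattern and replacement copied from the doxfilter.py hunk in this commit.
x_ref_pat = re.compile(r'@x_ref\s+(\w*)')
x_ref_rep = r'@ref \1'


def rewrite_x_ref(line):
    """Rewrite '@x_ref NAME' to a plain '@ref NAME'.

    Hypothetical helper: in the real filter this substitution is one
    step inside process_lang(), applied to every input line.
    """
    return re.sub(x_ref_pat, x_ref_rep, line)


if __name__ == '__main__':
    # Sample text taken from the file-formats.dox change in this commit.
    sample = 'considered. (See @x_ref compression_formats for details.)'
    print(rewrite_x_ref(sample))
```

Unlike the `@subpage` replacement in the same function, which appends `lang_suffix` to the target, the `@x_ref` replacement emits the target name unchanged, which matches the patch comment about non-language references.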