summaryrefslogtreecommitdiff
path: root/src/docs/tune-system-buffer-cache.dox
blob: 1919c66997a3e4dfaf845a711a3c48f63595f974 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
/*! @page tune_system_buffer_cache System buffer cache

@section tuning_system_buffer_cache_direct_io Direct I/O

WiredTiger optionally supports direct I/O.  Configuring direct I/O may
be useful for applications wanting to:
- minimize the operating system cache effects of I/O to and from
WiredTiger's buffer cache,
- avoid double-buffering of blocks in WiredTiger's cache and the
operating system buffer cache, and
- avoid stalling underlying solid-state drives by writing a large number
of dirty blocks.

Direct I/O is configured using the "direct_io" configuration string to
the ::wiredtiger_open function.  An example of configuring direct I/O
for WiredTiger's data files:

@snippet ex_all.c Configure direct_io for data files

On Windows, the "direct_io" configuration string controls whether the operating
system cache is used to buffer reads, and writes to disk, i.e.,
FILE_FLAG_NO_BUFFERING. When "direct_io" is off, Windows will use free RAM to
cache access to files. This may had adverse effects because Windows may page out
the WiredTiger buffer cache instead of its file cache. An additional
configuration string "write_through" controls whether the disk is allowed to
cache the writes. Enabling this flag increases write latency as the drive must
ensure all writes are persisted to disk, but it ensures write durability. To get
the equivalent of \c O_DIRECT on Windows, "direct_io", and "write_through" must
be both set.

Direct I/O implies a writing thread waits for the write to complete
(which is a slower operation than writing into the system buffer cache),
and configuring direct I/O is likely to decrease overall application
performance.

Many Linux systems do not support mixing \c O_DIRECT and memory
mapping or normal I/O to the same file, and attempting to do so can
result in data loss or corruption.   For this reason:

- WiredTiger silently ignores the setting of the \c mmap configuration
to the wiredtiger_open function in those cases, and will never memory
map a file which is configured for direct I/O.

- If \c O_DIRECT is configured for data files on Linux systems, any
system utilities used to copy data files for the purposes of backup
should also specify \c O_DIRECT when configuring their file access.  A
standard Linux system utility that supports \c O_DIRECT is the \c dd
utility, when using the \c iflag=direct command-line option.

Additionally, Windows, and many Linux systems require specific alignment for
buffers used for I/O when direct I/O is configured, and using the wrong
alignment can cause data loss or corruption.  When direct I/O is
configured on Windows, and Linux systems, WiredTiger aligns I/O buffers to 4KB;
if different alignment is required by your system, the \c buffer_alignment
configuration to the wiredtiger_open call should be configured to the
correct value.

Finally, if direct I/O is configured on any system, WiredTiger will
silently change the file unit allocation size and the maximum leaf and
internal page sizes to be at least as large as the \c buffer_alignment
value as well as a multiple of that value.

Direct I/O is based on the non-standard \c O_DIRECT flag to the POSIX
1003.1 open system call and may not be available on all platforms.

@section tuning_system_buffer_cache_os_cache_dirty_max os_cache_dirty_max

As well as direct I/O, WiredTiger supports two additional configuration
options related to the system buffer cache:

The first is \c os_cache_dirty_max, the maximum dirty bytes an object
is allowed to have in the system buffer cache.  Once this many bytes
from an object are written into the system buffer cache, WiredTiger will
attempt to schedule writes for all of the dirty blocks the object has
in the system buffer cache.  This configuration option allows
applications to flush dirty blocks from the object, avoiding stalling
any underlying drives when the object is subsequently flushed to disk
as part of a durability operation.

An example of configuring \c os_cache_dirty_max:

@snippet ex_all.c os_cache_dirty_max configuration

The \c os_cache_dirty_max configuration may not be used in combination
with direct I/O.

The \c os_cache_dirty_max configuration is based on the non-standard
Linux \c sync_file_range system call and will be ignored if set
and that call is not available.

@section tuning_system_buffer_cache_os_cache_max os_cache_max

The second configuration option related to the system buffer cache is
\c os_cache_max, the maximum bytes an object is allowed to have in the
system buffer cache.  Once this many bytes from an object are either
read into or written from the system buffer cache, WiredTiger will
attempt to evict all of the object's blocks from the buffer cache.  This
configuration option allows applications to evict blocks from the system
buffer cache to limit double-buffering and system buffer cache overhead.

An example of configuring \c os_cache_max:

@snippet ex_all.c os_cache_max configuration

The \c os_cache_max configuration may not be used in combination with
direct I/O.

The \c os_cache_max configuration is based on the POSIX 1003.1 standard
\c posix_fadvise system call and may not available on all platforms.

Configuring direct I/O, \c os_cache_dirty_max or \c os_cache_max all
have the side effect of turning off memory-mapping of objects in
WiredTiger.

 */