summaryrefslogtreecommitdiff
path: root/src/third_party/wiredtiger/src/docs/checkpoint.dox
blob: 2da3b1df27299a8213cca2fab20485193f998b58 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
/*! @m_page{{c,java},checkpoint,Checkpoint durability}

WiredTiger supports checkpoint durability by default, and optionally
commit-level durability when logging is enabled.  In most applications,
commit-level durability impacts performance more than checkpoint
durability; checkpoints offer basic operation durability across
application or system failure without impacting performance (although
the creation of each checkpoint is a relatively heavy-weight operation).
See @ref durability for information on commit-level durability.

Checkpoints with transactional logging enabled allow logging to archive
older log files (if configured to do so) while still retaining
commit-level durability and the ability to run recovery.

A checkpoint is automatically created whenever a modified data source
is closed.  Additionally, checkpoints of one or more data sources, or
the entire database, can be explicitly created using the
WT_SESSION::checkpoint method.  Checkpoints can also be scheduled
periodically based on elapsed time or data size with the \c checkpoint
configuration to ::wiredtiger_open.

All transactional updates committed before a checkpoint are made durable
by the checkpoint, therefore the frequency of checkpoints limits the
volume of data that may be lost due to application or system failure.
<b>This guarantee has an exception:</b> If a crash occurs when a backup
cursor is open, then the system will be restored to the most recent
checkpoint prior to the opening of the backup cursor, even if later
database checkpoints were completed.

Data sources that are involved in an exclusive operation when the
checkpoint starts, including bulk load, verify or salvage, will be skipped
by the checkpoint.  Operations requiring exclusive access may fail with
an \c EBUSY error if attempted during a checkpoint.

When data sources are first opened, they are opened in the state of the
most recent checkpoint taken on the file, in other words, updates after the
most recent checkpoint will not appear in the data source.  If no
checkpoint is found when the data source is opened, the data source will
appear empty.

@section checkpoint_server Automatic checkpoints

WiredTiger will automatically checkpoint the entire database when the
\c checkpoint configuration parameter to ::wiredtiger_open is set.  When
this configuration is used, an internal server thread is created.

The period between checkpoints can be defined either in seconds via \c
wait, or, if logging is enabled, as the number of bytes written to the log
since the last checkpoint via \c log_size, or both.  If both periods are
defined then the checkpoint occurs as soon as either threshold has occurred
and both are reset once the checkpoint is complete.  If using \c log_size,
we recommend that the size selected be a multiple of the log file size for
archiving purposes.

@section checkpoint_cursors Checkpoint cursors

Cursors are normally opened in the most recent version of a data source.
However, a checkpoint configuration string may be provided
to WT_SESSION::open_cursor, opening a read-only, static view of the
data source.  This provides a limited form of time-travel, as the static
view is not changed by subsequent checkpoints and will persist until
the checkpoint cursor is closed. While it is not an error to set a read
timestamp in a transaction including a checkpoint cursor, it also has no
effect on the checkpoint cursor's behavior.

@section checkpoint_naming Checkpoint naming

Additionally, checkpoints that do not include LSM trees may optionally be
given names by the application.  Because named checkpoints persist until
discarded or replaced, they can be used to periodically snapshot data for
later use.

Checkpoints named by the application persist until explicitly discarded or
the application creates a new checkpoint with the same name (which replaces
the previous checkpoint of that name). If the previous checkpoint cannot be
replaced, either because a cursor is reading from the previous checkpoint,
or backups are in progress, the checkpoint will fail.

Internal checkpoints (that is, checkpoints not named by the application)
use the reserved name "WiredTigerCheckpoint".  Applications can open the
most recent of these checkpoints by specifying "WiredTigerCheckpoint"
as the checkpoint name to WT_SESSION::open_cursor.   Creating a new
internal checkpoint drops all previous internal checkpoints, if
possible; if a previous internal checkpoint cannot be dropped for any
reason, the checkpoint will ignore the previous checkpoint and continue.
Subsequent checkpoints will drop those ignored checkpoints when it
becomes possible.

The \c -c option to the \c wt command line utility \c list command will
list a data source's checkpoints, with time stamp, in a human-readable
format.

@section checkpoint_compaction Checkpoints and file compaction

Checkpoints share file blocks, and dropping a checkpoint may or may not
make file blocks available for re-use, depending on whether the dropped
checkpoint contained the last reference to a file block.
Because checkpoints named by the application are not discarded until
explicitly discarded or replaced, they may prevent WT_SESSION::compact
from accomplishing anything due to shared file blocks.

 */