/*! @class doc_bulk_durability

Bulk-loads are not commit-level durable; that is, the creation and
bulk-load of an object will not appear in the database log files.\  For
this reason, applications doing incremental backups after a full backup
should repeat the full backup step after doing a bulk-load to make the
bulk-load durable.\ In addition, incremental backups after a bulk-load
can cause recovery to report errors because there are log records that
apply to data files which don't appear in the backup.

*/

/*! @m_page{{c,java},backup,Backups}

WiredTiger cursors provide access to data from a variety of sources.
One of these sources is the list of files required to perform a backup
of the database.  The list may be the files required by all of the
objects in the database, or a subset of the objects in the database.

WiredTiger backups are "on-line" or "hot" backups, and applications may
continue to read and write the databases while a snapshot is taken.

@section backup_process Backup from an application

1. Open a cursor on the \c "backup:" data source, which begins the
   process of a backup.

2. Copy each file returned by the WT_CURSOR::next method to the backup
   location, for example, a different directory. Do not reuse backup
   locations unless all files have first been removed from them, in
   other words, remove any previous backup information before using a
   backup location.

3. Close the cursor; the cursor must not be closed until all of the
   files have been copied.

The directory into which the files are copied may subsequently be
specified as a directory to the ::wiredtiger_open function and
accessed as a WiredTiger database home.

Copying the database files for a backup does not require any special
alignment or block size (specifically, Linux or Windows filesystems that
do not support read/write isolation can be safely read for backups).
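
As a minimal sketch of such a copy, assuming ordinary buffered I/O is
acceptable (that is, \c O_DIRECT is not configured for the database), a
plain read/write loop is sufficient.  The helper name \c copy_file_plain
is hypothetical and is not part of the WiredTiger API:

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper, not part of the WiredTiger API: copy the file at
// src to dst using ordinary buffered reads and writes.
// Returns 0 on success and -1 on any error.
int
copy_file_plain(const char *src, const char *dst)
{
    char buf[64 * 1024];
    FILE *in, *out;
    size_t n;
    int ret = 0;

    assert(src != NULL && dst != NULL);

    if ((in = fopen(src, "rb")) == NULL)
        return (-1);
    if ((out = fopen(dst, "wb")) == NULL) {
        (void)fclose(in);
        return (-1);
    }

    // Copy in arbitrary-sized chunks; no particular alignment is assumed.
    while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
        if (fwrite(buf, 1, n, out) != n) {
            ret = -1;
            break;
        }
    if (ferror(in))
        ret = -1;

    (void)fclose(in);
    // fclose flushes buffered output, so its failure is a copy failure.
    if (fclose(out) != 0)
        ret = -1;
    return (ret);
}
```

An application would call this once for each file name returned by the
backup cursor, before closing the cursor.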

The cursor must not be closed until all of the files have been copied.
However, there is no requirement that the files be copied in any
particular order or in any relationship to the WT_CURSOR::next calls,
only that all files have been copied before the cursor is closed.  For
example, applications might aggregate the file names from the cursor and
then list the file names as arguments to a file archiver such as the
system tar utility.
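
The archiver approach might be sketched as follows.  This is an
illustration only: the helper name \c build_tar_argv is hypothetical, and
the use of the \c -C option (to make tar resolve the file names relative
to the database home) is an assumption about the tar implementation in
use.  The sketch only builds the argument vector; running the archiver,
for example with \c execvp, is left to the application:

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical helper: build a NULL-terminated argument vector of the form
//   tar -cf <archive> -C <home> <name 0> ... <name n-1>
// The returned array borrows the caller's strings; release it with free().
// Returns NULL if memory cannot be allocated.
char **
build_tar_argv(const char *archive, const char *home,
    const char *const *names, size_t n)
{
    char **argv;
    size_t i, k = 0;

    assert(archive != NULL && home != NULL && (n == 0 || names != NULL));

    // Five fixed arguments, n file names and the terminating NULL.
    if ((argv = malloc((n + 6) * sizeof(*argv))) == NULL)
        return (NULL);

    argv[k++] = (char *)"tar";
    argv[k++] = (char *)"-cf";
    argv[k++] = (char *)archive;
    argv[k++] = (char *)"-C";
    argv[k++] = (char *)home;
    for (i = 0; i < n; ++i)
        argv[k++] = (char *)names[i];
    argv[k] = NULL;
    return (argv);
}
```

Passing the names as separate arguments, rather than assembling a single
shell command string, avoids any need to quote file names.  The archiver
must finish before the backup cursor is closed.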

During the period the backup cursor is open, database checkpoints can
be created, but no checkpoints can be deleted.  This may result in
significant file growth.

The following is a programmatic example of creating a backup:

@snippet ex_all.c backup

In cases where the backup is desired for a checkpoint other than the
most recent, applications can discard all checkpoints subsequent to the
checkpoint they want using the WT_SESSION::checkpoint method.  For
example:

@snippet ex_all.c backup of a checkpoint

@section backup_util Backup from the command line

The @ref util_backup command may also be used to create backups:

@code
rm -rf /path/database.backup &&
    mkdir /path/database.backup &&
    wt -h /path/database.source backup /path/database.backup
@endcode

@section backup_incremental Incremental backup

Once a backup has been done, it can be rolled forward incrementally by
adding log files to the backup copy. Adding log files to the copy
decreases potential data loss from switching to the copy, but increases
the recovery time necessary to switch to the copy.  To reset the
recovery time necessary to switch to the copy, perform a full backup of
the database.  For example, an application might do a full backup of the
database once a week during a quiet period, and then incrementally copy
new log files into the backup directory for the rest of the week.

WiredTiger log files are named "WiredTigerLog.[number]" where "[number]"
is a 10-digit value, for example, "WiredTigerLog.0000000001".  The log
file with the largest number in its name is the most recent log file
written.  To incrementally add log files to a backup, simply copy newly
created log files from the database home to the backup copy.
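
The naming convention above can be handled with a pair of small helpers.
This is a sketch only; the function and macro names are hypothetical and
are not part of the WiredTiger API:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define LOG_PREFIX "WiredTigerLog."
#define LOG_DIGITS 10

// Format the log file name for the given number into buf.
// Returns 0 on success, -1 if the number needs more than ten digits or
// buf is too small.
int
log_file_name(unsigned long long number, char *buf, size_t size)
{
    int len;

    assert(buf != NULL);
    if (number > 9999999999ULL)
        return (-1);
    len = snprintf(buf, size, LOG_PREFIX "%0*llu", LOG_DIGITS, number);
    return (len < 0 || (size_t)len >= size ? -1 : 0);
}

// Parse the number from a log file name.
// Returns 0 and sets *numberp on success, -1 if name is not the prefix
// followed by exactly ten decimal digits.
int
log_file_number(const char *name, unsigned long long *numberp)
{
    const char *p;
    unsigned long long number = 0;
    int i;

    assert(name != NULL && numberp != NULL);
    if (strncmp(name, LOG_PREFIX, strlen(LOG_PREFIX)) != 0)
        return (-1);
    p = name + strlen(LOG_PREFIX);
    for (i = 0; i < LOG_DIGITS; ++i, ++p) {
        if (*p < '0' || *p > '9')
            return (-1);
        number = number * 10 + (unsigned long long)(*p - '0');
    }
    if (*p != '\0')
        return (-1);
    *numberp = number;
    return (0);
}
```

An application scanning the database home could use \c log_file_number
to skip files that are not log files and to find the largest log number
present.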

@copydoc doc_bulk_durability

By default, WiredTiger automatically removes log files no longer
required for recovery.  Applications wanting to use log files for
incremental backup must first disable automatic log file removal using
the \c log=(archive=false) configuration to ::wiredtiger_open.

The following is the procedure for incrementally backing up a database
and removing log files from the original database home:

1. Perform a full backup of the database (as described above).

2. Copy the log files from the database home to the backup directory.
   It is not an error to copy a log file which has been copied before,
   but care should be taken to ensure each log file is completely copied
   as the most recent log file may change in size while being copied.
   A way to ensure safety is to copy the most recent log file copied
   during an incremental backup as part of the next incremental backup.
   For example, if log files 10 through 12 were copied in an incremental
   backup, the next incremental backup would begin by copying log file
   12 again.

3. Perform a checkpoint in the database.

4. Discard log files in the database home that are no longer needed.  The
   log files that are no longer needed include the log files that were
   copied in step 2 except for the most recent log file in the database
   home at that time.  For example, if log files 10 through 12 were
   copied in step 2, then log files 10 and 11 can be discarded once step
   3 completes.

Steps 2, 3 and 4 can be repeated any number of times before step 1 is
repeated.

@section backup_o_direct Backup and O_DIRECT

Many Linux systems do not support mixing \c O_DIRECT and memory mapping
or normal I/O to the same file.  If \c O_DIRECT is configured for data
or log files on Linux systems (using the ::wiredtiger_open \c direct_io
configuration), any program used to copy files during backup should also
specify \c O_DIRECT when configuring its file access.  Likewise, when
\c O_DIRECT is not configured by the database application, programs
copying files should not configure \c O_DIRECT.

*/