Diffstat (limited to 'docs/src/dump-formats.dox')
-rw-r--r-- docs/src/dump-formats.dox 51
1 files changed, 51 insertions, 0 deletions
diff --git a/docs/src/dump-formats.dox b/docs/src/dump-formats.dox
new file mode 100644
index 00000000000..0c021e3a29b
--- /dev/null
+++ b/docs/src/dump-formats.dox
@@ -0,0 +1,51 @@
+/*! @page dump_formats Dump Formats
+
+The @ref utility_dump command produces a text representation of a table
+that can be loaded by @ref utility_load. This page describes the output
+format of the @ref utility_dump command.
+
+Dump files have three parts: a prefix, a header and a body.
+
+The dump prefix includes basic information about the dump, including the
+WiredTiger version that created the dump and the dump format. The dump
+format is given on a line beginning with \c "Format=", followed by one of
+the following strings:
+
+<table>
+@hrow{String, Meaning}
+@row{hex, the dumped data is in a hexadecimal dump format}
+@row{print, the dumped data is in a printable format}
+</table>
+
+The dump header follows a single \c "Header" line in the file and
+consists of paired key and value lines, where the key is the URI passed
+to WT_SESSION::create and the value is the corresponding configuration
+string. The table or file can be recreated by calling
+WT_SESSION::create for each pair of lines in the header.
+
+The dump body follows a single \c "Data" line in the file and consists
+of a text representation of the records in the table. Each record is
+represented by a pair of lines: the first line is the key and the second
+line is the value. These lines are encoded in one of two formats: a
+printable format and a hexadecimal format.
+
+The printable format consists of literal printable characters and
+hexadecimal-encoded non-printable characters. Encoded characters are
+written as three separate characters: a backslash character followed by
+two hexadecimal characters (first the high nibble and then the low
+nibble). For example, a newline character in the ASCII character set
+would be encoded as \c "\0a" and an escape character would be encoded
+as \c "\1b". Backslash characters which do not precede a hexadecimal
+encoding are paired, that is, the characters \c "\\" should be
+interpreted as a single backslash character.
+
+The hexadecimal format consists of encoded characters, where each
+literal character is written as a pair of hexadecimal characters (first
+the high nibble and then the low nibble). For example, \c "0a" would be an
+ASCII newline character and \c "1b" would be an ASCII escape character.
+
+Because the definition of "printable" may depend on the application's
+locale, dump files in the printable output format may be less portable
+than dump files in the hexadecimal output format.
+
+ */
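
The format described in this page can be read back with a small parser. The following is only a sketch under stated assumptions, not WiredTiger code: the function names (decode_printable, decode_hex, parse_dump) are hypothetical, the input is assumed to be a list of byte-string lines, and a backslash that is followed by neither a second backslash nor two hexadecimal digits is left untouched because the page does not say how to treat it.

```python
import re

# One escape in the printable format: either a doubled backslash or a
# backslash followed by two hexadecimal digits (high nibble first).
_ESCAPE = re.compile(rb"\\(\\|[0-9a-fA-F]{2})")


def decode_printable(line: bytes) -> bytes:
    """Decode one key or value line written in the printable format."""
    def _replace(match):
        token = match.group(1)
        if token == b"\\":
            return b"\\"                    # paired backslash -> one backslash
        return bytes([int(token, 16)])      # two hex digits -> one byte
    return _ESCAPE.sub(_replace, line)


def decode_hex(line: bytes) -> bytes:
    """Decode one key or value line written in the hexadecimal format."""
    return bytes.fromhex(line.decode("ascii"))


def parse_dump(lines):
    """Split a dump into (format, header pairs, decoded records).

    Assumes the layout described above: prefix lines (one of which
    starts with "Format="), a "Header" line, paired header lines,
    a "Data" line, then paired key and value lines.
    """
    lines = [ln.rstrip(b"\n") for ln in lines]
    header_at = lines.index(b"Header")
    data_at = lines.index(b"Data", header_at + 1)

    prefix = lines[:header_at]
    fmt = next(ln[len(b"Format="):] for ln in prefix
               if ln.startswith(b"Format="))
    decode = decode_hex if fmt == b"hex" else decode_printable

    # Header: URI line followed by its configuration string (not encoded).
    head = lines[header_at + 1:data_at]
    header = list(zip(head[0::2], head[1::2]))

    # Body: key line followed by value line, both encoded per the format.
    body = lines[data_at + 1:]
    records = [(decode(k), decode(v)) for k, v in zip(body[0::2], body[1::2])]
    return fmt, header, records
```

Working on bytes rather than text avoids depending on the reader's locale, which is the portability concern raised in the last paragraph of the page. Each header pair returned by parse_dump corresponds to one WT_SESSION::create call, as the page describes.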