diff options
Diffstat (limited to 'docs/src/dump-formats.dox')
-rw-r--r-- | docs/src/dump-formats.dox | 51 |
1 files changed, 51 insertions, 0 deletions
diff --git a/docs/src/dump-formats.dox b/docs/src/dump-formats.dox new file mode 100644 index 00000000000..0c021e3a29b --- /dev/null +++ b/docs/src/dump-formats.dox @@ -0,0 +1,51 @@ +/*! @page dump_formats Dump Formats + +The @ref utility_dump command produces a text representation of a table +that can be loaded by @ref utility_load. This page describes the output +format of the @ref utility_dump command. + +Dump files have three parts, a prefix, a header and a body. + +The dump prefix includes basic information about the dump including the +WiredTiger version that created the dump and the dump format. The dump +format consists of a line beginning with \c "Format=", and contains the +following information: + +<table> +@hrow{String, Meaning} +@row{hex, the dumped data is in a hexadecimal dump format} +@row{print, the dumped data is in a printable format} +</table> + +The dump header follows a single \c "Header" line in the file and +consists of paired key and value lines, where the key is the URI passed +to WT_SESSION::create and the value is corresponding configuration +string. The table or file can be recreated by calling +WT_SESSION::create for each pair of lines in the header. + +The dump body follows a single \c "Data" line in the file and consists +of a text representation of the records in the table. Each record is a +represented by a pair of lines: the first line is the key and the second +line is the value. These lines are encoded in one of two formats: a +printable format and a hexadecimal format. + +The printable format consists of literal printable characters, and +hexadecimal encoded non-printable characters. Encoded characters are +written as three separate characters: a backslash character followed by +two hexadecimal characters (first the high nibble and then the low +nibble). For example, a newline character in the ASCII character set +would be encoded as \c "\0a" and an escape character would be encoded +as \c "\1b". Backslash characters which do not precede a hexadecimal +encoding are paired, that is, the characters \c "\\" should be +interpreted as a single backslash character. + +The hexadecimal format consists of encoded characters, where each +literal character is written as a pair of characters (first the +high-nibble and then the low-nibble). For example, "0a" would be an +ASCII newline character and "1b" would be an ASCII escape character. + +Because the definition of "printable" may depend on the application's +locale, dump files in the printable output format may be less portable +than dump files in the hexadecimal output format. + + */ |