summaryrefslogtreecommitdiff
path: root/docs/src/schema.dox
diff options
context:
space:
mode:
Diffstat (limited to 'docs/src/schema.dox')
-rw-r--r--docs/src/schema.dox206
1 files changed, 206 insertions, 0 deletions
diff --git a/docs/src/schema.dox b/docs/src/schema.dox
new file mode 100644
index 00000000000..01b9572c0ff
--- /dev/null
+++ b/docs/src/schema.dox
@@ -0,0 +1,206 @@
+/*! @page schema Schemas
+
+While many tables have simple key/value pairs for records, WiredTiger
+also supports more complex data patterns.
+
+@section schema_intro Tables, rows and columns
+
+A table is a logical representation of data consisting of cells in rows and
+columns. For example, a database might have this table.
+
+<table>
+<tr><th>EmpId</th><th>Lastname</th><th>Firstname</th><th>Salary</th></tr>
+<tr><td>1</td><td>Smith</td><td>Joe</td><td>40000</td></tr>
+<tr><td>2</td><td>Jones</td><td>Mary</td><td>50000</td></tr>
+<tr><td>3</td><td>Johnson</td><td>Cathy</td><td>44000</td></tr>
+</table>
+
+This simple table includes an employee identifier, last name and first
+name, and a salary.
+
+A row-oriented database would store all of the values in a row together,
+then the values in the next row, and so on:
+
+<pre>
+ 1,Smith,Joe,40000;
+ 2,Jones,Mary,50000;
+ 3,Johnson,Cathy,44000;
+</pre>
+
+A column-oriented database stores all of the values of a column together,
+then the values of the next column, and so on:
+
+<pre>
+ 1,2,3;
+ Smith,Jones,Johnson;
+ Joe,Mary,Cathy;
+ 40000,50000,44000;
+</pre>
+
+WiredTiger supports both storage formats, and can mix and match the storage
+of columns within a logical table.
+
+Applications describe the format of their data by supplying a schema to
+WT_SESSION::create. This specifies how the application's data can be split
+into fields and mapped onto rows and columns.
+
+@section schema_types Column types
+
+By default, WiredTiger works as a traditional key/value store, where the
+keys and values are raw byte arrays accessed using a WT_ITEM structure.
+Keys and values may be up to (4GB - 512B) bytes in size, but depending
+on how @ref WT_SESSION::create "maximum item sizes" are configured,
+large key and value items will be stored on overflow pages.
+
+See @subpage keyvalue for more details on raw key / value items.
+
+The schema layer allows key and value types to be chosen from a list,
+or composite keys or values made up of columns with any combination of
+types. The size (4GB - 512B) byte limit on keys and values still
+applies.
+
+WiredTiger's uses format strings similar to those specified in the Python
+struct module to describe the types of columns in a table:
+ http://docs.python.org/library/struct
+
+@section schema_format_types Format types
+
+<table>
+@hrow{Format, C Type, Java type, Python type, Standard size}
+@row{x, pad byte, N/A, N/A, 1}
+@row{b, signed char, byte, integer, 1}
+@row{B, unsigned char, byte, integer, 1}
+@row{h, short, short, integer, 2}
+@row{H, unsigned short, short, integer, 2}
+@row{i, int, int, integer, 4}
+@row{I, unsigned int, int, integer, 4}
+@row{l, long, int, integer, 4}
+@row{L, unsigned long, int, integer, 4}
+@row{q, long long, long, integer, 8}
+@row{Q, unsigned long long, long, integer, 8}
+@row{r, uint64_t, long, integer, 8}
+@row{s, char[], String, string, fixed length}
+@row{S, char[], String, string, variable}
+@row{t, unsigned char, byte, integer, fixed bit length}
+@row{u, WT_ITEM *, byte[], string, variable}
+</table>
+
+The <code>'r'</code> type is used for record number keys in column stores.
+
+The <code>'S'</code> type is encoded as a C language string terminated by a
+NUL character.
+
+The <code>'t'</code> type is used for fixed-length bit field values. If
+it is preceded by a size, that indicates the number of bits to store,
+between 1 and 8. That number of low-order bits will be stored in the
+table. The default is a size of 1 bit: that is, a boolean. The
+application must always use an <code>unsigned char</code> type (or
+equivalently, <code>uint8_t</code>) for calls to WT_CURSOR::set_value,
+and a pointer to the same for calls to WT_CURSOR::get_value. If a bit
+field value is combined with other types in a packing format, it is
+equivalent to <code>'B'</code>, and a full byte is used to store it.
+
+When referenced by a record number (that is, a key format of \c 'r'),
+the <code>'t'</code> type will be stored in a fixed-length column-store,
+and will not have a out-of-band value to indicate the record does not
+exist. In this case, a 0 byte value is used to indicate the record does
+not exist. This means removing a record with WT_CURSOR::remove is
+equivalent to storing a value of 0 in the record with WT_CURSOR::update
+(and storing a value of 0 in the record will cause cursor scans to
+skip the record). Additionally, creating a record past the end of the
+table or file implies the creation of any missing intermediate records,
+with byte values of 0.
+
+The <code>'u'</code> type is for raw byte arrays: if it appears at the end
+of a format string (including in the default <code>"u"</code> format for
+untyped tables), the size is not stored explicitly. When <code>'u'</code>
+appears within a format string, the size is stored as a 32-bit integer in
+the same byte order as the rest of the format string, followed by the data.
+
+There is a default collator that gives lexicographic (byte-wise)
+comparisons, and the default encoding is designed so that lexicographic
+ordering of encoded keys is usually the expected ordering. For example,
+the variable-length encoding of integers is designed so that they have the
+natural integer ordering under the default collator.
+
+See @subpage packing for details of WiredTiger's packing format.
+
+WiredTiger can also be extended with @ref WT_COLLATOR.
+
+@section schema_data_access Columns in key and values
+
+Every table has a key format and a value format as describe in @ref
+schema_types. These types are configured when the table is created by
+passing \c key_format and \c value_format keys to WT_SESSION::create.
+
+Cursors for a table have the same key format as the table itself. The key
+columns of a cursor are set with WT_CURSOR::set_key and accessed with
+WT_CURSOR::get_key. WT_CURSOR::set_key is analogous to \c printf, and takes
+a list of values in the order the key columns are configured in \c
+key_format. The columns values are accessed with WT_CURSOR::get_key, which
+is analogous to \c scanf, and takes a list of pointers to values in the same
+order.
+
+Cursors for a table have the same value format as the table, unless a
+projection is specified to WT_SESSION::open_cursor. WT_CURSOR::set_value is
+used to set value columns, and WT_CURSOR::get_value is used to get value
+columns.
+
+@section schema_columns Describing columns
+
+The columns in a table can be assigned names by passing a \c columns key to
+WT_SESSION::create. The column names are assigned first to the columns in
+the \c key_format, and then to the columns in \c value_format. There must be
+a name for every column, and no column names may be repeated.
+
+@section schema_column_groups Storing groups of columns together
+
+Once column names are assigned, they can be used to configure column groups,
+where groups of columns are stored in separate files.
+
+There are two steps involved in setting up column groups: first, pass a list
+of names for the column groups in the \c colgroups configuration key to
+WT_SESSION::create. Then make a call to WT_SESSION::create for each column
+group, using the URI <code>colgroup:\<table\>:\<colgroup name\></code> and a
+\c columns key in the configuration. Columns can be stored in multiple
+column groups, but all value columns must appear in at least on column group.
+
+Column groups have the same key as the table. This is particularly useful
+for column stores, because record numbers are not stored explicitly on disk,
+so there is no repetition of keys across multiple files. Also note that key
+columns cannot be stored in column group values: they can be retrieved with
+WT_CURSOR::get_key.
+
+@section schema_indices Adding an index
+
+Schema columns can also be used to configure indices on tables. To create an
+index, call WT_SESSION::create using the URI
+<code>index:\<table\>:\<index name\></code> and include a \c columns key in
+the configuration. WiredTiger updates all indices for a table whenever the
+table is modified.
+
+A cursor can be opened on an index by passing the index URI to
+WT_SESSION::open_cursor. Index cursors use the specified index key columns
+for WT_CURSOR::get_key and WT_CURSOR::set_key, and by default return all of
+the table value columns in WT_CURSOR::get_value. Index cursors are
+read-only: they cannot be used to perform updates.
+
+@section schema_examples Code samples
+
+The code below is taken from the complete example program
+@ex_ref{ex_schema.c}.
+
+@snippet ex_schema.c schema decl
+@snippet ex_schema.c schema work
+
+The code below is taken from the complete example program
+@ex_ref{ex_call_center.c}.
+
+@snippet ex_call_center.c call-center decl
+@snippet ex_call_center.c call-center work
+
+@todo new section: schema_advanced Advanced Schemas
+@todo non-relational data such as multiple index keys per row
+@todo application-supplied extractors and collators need to be
+registered before recovery can run.
+ */