summaryrefslogtreecommitdiff
path: root/src/third_party/wiredtiger/src/docs/custom-extractors.dox
diff options
context:
space:
mode:
Diffstat (limited to 'src/third_party/wiredtiger/src/docs/custom-extractors.dox')
-rw-r--r--src/third_party/wiredtiger/src/docs/custom-extractors.dox73
1 files changed, 73 insertions, 0 deletions
diff --git a/src/third_party/wiredtiger/src/docs/custom-extractors.dox b/src/third_party/wiredtiger/src/docs/custom-extractors.dox
new file mode 100644
index 00000000000..31dfed94b75
--- /dev/null
+++ b/src/third_party/wiredtiger/src/docs/custom-extractors.dox
@@ -0,0 +1,73 @@
+/*! @page custom_extractors Custom Extractors
+
+@section custom_extractors_intro Introduction to Custom Extractors
+
+A WiredTiger table can have zero or more associated indices. An index
+uses a different key to locate records than the table, and usually only
+stores a short key for each record, with the (larger) value in the
+table.
+
+WiredTiger tables must be created with column names in order to create
+an index. This is required so that index cursors can support
+projections, and because WiredTiger optimizes some cases of "simple"
+tables without column names.
+
+When the full schema of your records can be described in WiredTiger's
+packing format, you can create an index by specifying which columns from
+the record should appear in the index. However, for more complex
+records, or to associate multiple index keys to each record,
+applications can instead specify a custom extractor by implementing the
+WT_EXTRACTOR interface.
+
+The main method in the interface is WT_EXTRACTOR::extract. This is
+called by WiredTiger each time a record is updated in a table. The
+\c extract method should determine the index key(s) and call
+WT_CURSOR::set_key followed by WT_CURSOR::insert on the supplied
+\c result_cursor for each index key.
+
+If any operation fails, WT_EXTRACTOR::extract must return the failure
+to WiredTiger, or the index could become out of sync with the table.
+
+Note that the extract callback is called for all operations that update
+the table, not just inserts. The callback sets the key and uses the
+WT_CURSOR::insert method to return the index key(s). WiredTiger will
+perform the required operation to keep the index in sync with the table.
+
+Applications must register their WT_EXTRACTOR implementations using
+WT_CONNECTION::add_extractor. This is often done by creating a
+@ref extensions "WiredTiger extension". They are then configured by
+passing \c "extractor=..." to WT_SESSION::create when creating an index.
+
+See @ex_ref{ex_extractor.c} for an example of how to implement custom
+extractors.
+
+@section custom_extractors_notes Implementation notes
+
+A WiredTiger index is a row store where the key columns contain all of
+the secondary and primary key columns, but only the secondary key
+columns are visible to applications. The value is empty, and
+WiredTiger's on-disk format optimizes for this case (empty values take
+up no space on disk).
+
+Custom extractors only need to calculate the public index key columns.
+The \c result_cursor will be configured with a \c key_format
+corresponding to what was supplied to WT_SESSION::create when the index
+was created. WiredTiger will append the (hidden) primary key when
+populating the index.
+
+If column names are specified for an index with a custom extractor, it
+is not permitted to use any column names from the table key. Custom
+index keys can include columns from the table value, but the extracted
+value must be equal to the value from that column of the record or the
+results of using a projection cursor on the index will be undefined.
+
+@section custom_extractors_raw Custom Collators in raw mode
+
+If a custom extractor needs to operate in raw mode on the
+\c result_cursor, it must take into account an implementation detail.
+To avoid rewriting the extracted key, WiredTiger appends a padding byte
+to the raw key using a \c 'x' format. See @ref schema_format_types for
+more information. If the callback operates in raw mode, it must also
+append this padding byte.
+
+*/