From 780b92ada9afcf1d58085a83a0b9e6bc982203d1 Mon Sep 17 00:00:00 2001 From: Lorry Tar Creator Date: Tue, 17 Feb 2015 17:25:57 +0000 Subject: Imported from /home/lorry/working-area/delta_berkeleydb/db-6.1.23.tar.gz. --- docs/programmer_reference/bt_conf.html | 604 +++++++++++++++++++-------------- 1 file changed, 343 insertions(+), 261 deletions(-) (limited to 'docs/programmer_reference/bt_conf.html') diff --git a/docs/programmer_reference/bt_conf.html b/docs/programmer_reference/bt_conf.html index b270901d..98774713 100644 --- a/docs/programmer_reference/bt_conf.html +++ b/docs/programmer_reference/bt_conf.html @@ -14,7 +14,7 @@

- There are a series of configuration tasks which you can perform when - using the Btree access method. They are described in the following sections. -

+ There are a series of configuration tasks which you can + perform when using the Btree access method. They are described + in the following sections. +

@@ -79,25 +79,34 @@
-

The Btree data structure is a sorted, balanced tree structure storing -associated key/data pairs. By default, the sort order is lexicographical, -with shorter keys collating before longer keys. The user can specify the -sort order for the Btree by using the DB->set_bt_compare() method.

-

Sort routines are passed pointers to keys as arguments. The keys are -represented as DBT structures. The routine must return an integer -less than, equal to, or greater than zero if the first argument is -considered to be respectively less than, equal to, or greater than the -second argument. The only fields that the routines may examine in the -DBT structures are data and size fields.

-

An example routine that might be used to sort integer keys in the database -is as follows:

+

+ The Btree data structure is a sorted, balanced tree + structure storing associated key/data pairs. By default, the + sort order is lexicographical, with shorter keys collating + before longer keys. The user can specify the sort order for + the Btree by using the DB->set_bt_compare() method. +

+

+ Sort routines are passed pointers to keys as arguments. The + keys are represented as DBT structures. The routine must + return an integer less than, equal to, or greater than zero if + the first argument is considered to be respectively less than, + equal to, or greater than the second argument. The only fields + that the routines may examine in the DBT structures are + data and size fields. +

+

+ An example routine that might be used to sort integer keys + in the database is as follows: +

-
-int
-compare_int(DB *dbp, const DBT *a, const DBT *b)
+        
int
+compare_int(DB *dbp, const DBT *a, const DBT *b, size_t *locp)
     
 {
     int ai, bi;
+
+    locp = NULL;
     /*
      * Returns:
      *    < 0 if a < b
@@ -109,23 +118,29 @@ compare_int(DB *dbp, const DBT *a, const DBT *b)
     return (ai - bi);
 }
 
-

Note that the data must first be copied into memory that is appropriately -aligned, as Berkeley DB does not guarantee any kind of alignment of the -underlying data, including for comparison routines. When writing -comparison routines, remember that databases created on machines of -different architectures may have different integer byte orders, for which -your code may need to compensate.

-

An example routine that might be used to sort keys based on the first -five bytes of the key (ignoring any subsequent bytes) is as follows:

+

+ Note that the data must first be copied into memory that is + appropriately aligned, as Berkeley DB does not guarantee any + kind of alignment of the underlying data, including for + comparison routines. When writing comparison routines, + remember that databases created on machines of different + architectures may have different integer byte orders, for + which your code may need to compensate. +

+

+ An example routine that might be used to sort keys based on + the first five bytes of the key (ignoring any subsequent + bytes) is as follows: +

-
-int
-compare_dbt(DB *dbp, const DBT *a, const DBT *b)
+        
int
+compare_dbt(DB *dbp, const DBT *a, const DBT *b, size_t *locp)
     
 {
     int len;
     u_char *p1, *p2;
 
+    locp = NULL;
     /*
      * Returns:
      *    < 0 if a < b
@@ -138,52 +153,71 @@ compare_dbt(DB *dbp, const DBT *a, const DBT *b)
     return (0);
 }
 
-

All comparison functions must cause the keys in the database to be -well-ordered. The most important implication of being well-ordered is -that the key relations must be transitive, that is, if key A is less -than key B, and key B is less than key C, then the comparison routine -must also return that key A is less than key C.

-

It is reasonable for a comparison function to not examine an entire key -in some applications, which implies partial keys may be specified to the -Berkeley DB interfaces. When partial keys are specified to Berkeley DB, interfaces -which retrieve data items based on a user-specified key (for example, -DB->get() and DBC->get() with the DB_SET flag), will -modify the user-specified key by returning the actual key stored in the -database.

+

+ All comparison functions must cause the keys in the database + to be well-ordered. The most important implication of being + well-ordered is that the key relations must be transitive, + that is, if key A is less than key B, and key B is less than + key C, then the comparison routine must also return that key A + is less than key C. +

+

+ It is reasonable for a comparison function to not examine an + entire key in some applications, which implies partial keys + may be specified to the Berkeley DB interfaces. When partial + keys are specified to Berkeley DB, interfaces which retrieve + data items based on a user-specified key (for example, DB->get() + and DBC->get() with the DB_SET flag), will modify the + user-specified key by returning the actual key stored in the + database. +

-

Btree prefix comparison

+

Btree prefix + comparison

-

The Berkeley DB Btree implementation maximizes the number of keys that can be -stored on an internal page by storing only as many bytes of each key as -are necessary to distinguish it from adjacent keys. The prefix -comparison routine is what determines this minimum number of bytes (that -is, the length of the unique prefix), that must be stored. A prefix -comparison function for the Btree can be specified by calling -DB->set_bt_prefix().

-

The prefix comparison routine must be compatible with the overall -comparison function of the Btree, since what distinguishes any two keys -depends entirely on the function used to compare them. This means that -if a prefix comparison routine is specified by the application, a -compatible overall comparison routine must also have been specified.

-

Prefix comparison routines are passed pointers to keys as arguments. -The keys are represented as DBT structures. The only fields -the routines may examine in the DBT structures are data -and size fields.

-

The prefix comparison function must return the number of bytes necessary -to distinguish the two keys. If the keys are identical (equal and equal -in length), the length should be returned. If the keys are equal up to -the smaller of the two lengths, then the length of the smaller key plus -1 should be returned.

-

An example prefix comparison routine follows:

+

+ The Berkeley DB Btree implementation maximizes the number of + keys that can be stored on an internal page by storing only as + many bytes of each key as are necessary to distinguish it from + adjacent keys. The prefix comparison routine is what + determines this minimum number of bytes (that is, the length + of the unique prefix), that must be stored. A prefix + comparison function for the Btree can be specified by calling + DB->set_bt_prefix(). +

+

+ The prefix comparison routine must be compatible with the + overall comparison function of the Btree, since what + distinguishes any two keys depends entirely on the function + used to compare them. This means that if a prefix comparison + routine is specified by the application, a compatible overall + comparison routine must also have been specified. +

+

+ Prefix comparison routines are passed pointers to keys as + arguments. The keys are represented as DBT structures. The + only fields the routines may examine in the DBT structures + are data and size fields. +

+

+ The prefix comparison function must return the number of + bytes necessary to distinguish the two keys. If the keys are + identical (equal and equal in length), the length should be + returned. If the keys are equal up to the smaller of the two + lengths, then the length of the smaller key plus 1 should be + returned. +

+

+ An example prefix comparison routine follows: +

-
-size_t
+        
size_t
 compare_prefix(DB *dbp, const DBT *a, const DBT *b)
    
 {
@@ -207,8 +241,11 @@ compare_prefix(DB *dbp, const DBT *a, const DBT *b)
     return (b->size);
 }
 
-

The usefulness of this functionality is data-dependent, but in some data -sets can produce significantly reduced tree sizes and faster search times.

+

+ The usefulness of this functionality is data-dependent, but + in some data sets can produce significantly reduced tree sizes + and faster search times. +

@@ -218,38 +255,58 @@ sets can produce significantly reduced tree sizes and faster search times.

-

The number of keys stored on each page affects the size of a Btree and -how it is maintained. Therefore, it also affects the retrieval and search -performance of the tree. For each Btree, Berkeley DB computes a maximum key -and data size. This size is a function of the page size and the fact that -at least two key/data pairs must fit on any Btree page. Whenever key or -data items exceed the calculated size, they are stored on overflow pages -instead of in the standard Btree leaf pages.

-

Applications may use the DB->set_bt_minkey() method to change the minimum -number of keys that must fit on a Btree page from two to another value. -Altering this value in turn alters the on-page maximum size, and can be -used to force key and data items which would normally be stored in the -Btree leaf pages onto overflow pages.

-

Some data sets can benefit from this tuning. For example, consider an -application using large page sizes, with a data set almost entirely -consisting of small key and data items, but with a few large items. By -setting the minimum number of keys that must fit on a page, the -application can force the outsized items to be stored on overflow pages. -That in turn can potentially keep the tree more compact, that is, with -fewer internal levels to traverse during searches.

-

The following calculation is similar to the one performed by the Btree -implementation. (The minimum_keys value is multiplied by 2 -because each key/data pair requires 2 slots on a Btree page.)

+

+ The number of keys stored on each page affects the size of a + Btree and how it is maintained. Therefore, it also affects the + retrieval and search performance of the tree. For each Btree, + Berkeley DB computes a maximum key and data size. This size is + a function of the page size and the fact that at least two + key/data pairs must fit on any Btree page. Whenever key or + data items exceed the calculated size, they are stored on + overflow pages instead of in the standard Btree leaf + pages. +

+

+ Applications may use the DB->set_bt_minkey() method to change + the minimum number of keys that must fit on a Btree page from + two to another value. Altering this value in turn alters the + on-page maximum size, and can be used to force key and data + items which would normally be stored in the Btree leaf pages + onto overflow pages. +

+

+ Some data sets can benefit from this tuning. For example, + consider an application using large page sizes, with a data + set almost entirely consisting of small key and data items, + but with a few large items. By setting the minimum number of + keys that must fit on a page, the application can force the + outsized items to be stored on overflow pages. That in turn + can potentially keep the tree more compact, that is, with + fewer internal levels to traverse during searches. +

+

+ The following calculation is similar to the one performed by + the Btree implementation. (The minimum_keys + value is multiplied by 2 because + each key/data pair requires 2 slots on a Btree page.) +

maximum_size = page_size / (minimum_keys * 2)
-

Using this calculation, if the page size is 8KB and the default -minimum_keys value of 2 is used, then any key or data items -larger than 2KB will be forced to an overflow page. If an application -were to specify a minimum_key value of 100, then any key or data -items larger than roughly 40 bytes would be forced to overflow pages.

-

It is important to remember that accesses to overflow pages do not perform -as well as accesses to the standard Btree leaf pages, and so setting the -value incorrectly can result in overusing overflow pages and decreasing -the application's overall performance.

+

+ Using this calculation, if the page size is 8KB and the + default minimum_keys value of + 2 is used, then any key or data items larger than 2KB will be + forced to an overflow page. If an application were to specify + a minimum_key value of 100, + then any key or data items larger than roughly 40 bytes would + be forced to overflow pages. +

+

+ It is important to remember that accesses to overflow pages + do not perform as well as accesses to the standard Btree leaf + pages, and so setting the value incorrectly can result in + overusing overflow pages and decreasing the application's + overall performance. +

@@ -259,21 +316,28 @@ the application's overall performance.

-

The Btree access method optionally supports retrieval by logical record -numbers. To configure a Btree to support record numbers, call the -DB->set_flags() method with the DB_RECNUM flag.

-

Configuring a Btree for record numbers should not be done lightly. -While often useful, it may significantly slow down the speed at which -items can be stored into the database, and can severely impact -application throughput. Generally it should be avoided in trees with -a need for high write concurrency.

-

To retrieve by record number, use the DB_SET_RECNO flag to the -DB->get() and DBC->get() methods. The following is an example of -a routine that displays the data item for a Btree database created with -the DB_RECNUM option.

+

+ The Btree access method optionally supports retrieval by + logical record numbers. To configure a Btree to support record + numbers, call the DB->set_flags() method with the DB_RECNUM + flag. +

+

+ Configuring a Btree for record numbers should not be done + lightly. While often useful, it may significantly slow down + the speed at which items can be stored into the database, and + can severely impact application throughput. Generally it + should be avoided in trees with a need for high write + concurrency. +

+

+ To retrieve by record number, use the DB_SET_RECNO flag + to the DB->get() and DBC->get() methods. The following is an + example of a routine that displays the data item for a Btree + database created with the DB_RECNUM option. +

-
-int
+        
int
 rec_display(DB *dbp, db_recno_t recno)
     
 {
@@ -292,12 +356,14 @@ rec_display(DB *dbp, db_recno_t recno)
     return (0);
 }
 
-

To determine a key's record number, use the DB_GET_RECNO flag -to the DBC->get() method. The following is an example of a routine that -displays the record number associated with a specific key.

+

+ To determine a key's record number, use the DB_GET_RECNO + flag to the DBC->get() method. The following is an example of a + routine that displays the record number associated with a + specific key. +

-
-int
+        
int
 recno_display(DB *dbp, char *keyvalue)
     
 {
@@ -355,55 +421,59 @@ err:    /* Close the cursor. */
             
           
         
-        

- The Btree access method supports the automatic compression of key/data - pairs upon their insertion into the database. The key/data pairs are - decompressed before they are returned to the application, making an - application's interaction with a compressed database identical to that - for a non-compressed database. To configure Berkeley DB for - compression, call the DB->set_bt_compress() method and specify custom - compression and decompression functions. If DB->set_bt_compress() is - called with NULL compression and decompression functions, Berkeley DB - will use its default compression functions. -

+

+ The Btree access method supports the automatic compression + of key/data pairs upon their insertion into the database. The + key/data pairs are decompressed before they are returned to + the application, making an application's interaction with a + compressed database identical to that for a non-compressed + database. To configure Berkeley DB for compression, call the + DB->set_bt_compress() method and specify custom compression and + decompression functions. If DB->set_bt_compress() is called with + NULL compression and decompression functions, Berkeley DB will + use its default compression functions. +

Note

-

- Compression only works with the Btree access method, and then only - so long as your database is not configured for unsorted duplicates. -

+

+ Compression only works with the Btree access method, + and then only so long as your database is not configured + for unsorted duplicates. +

Note

-

- The default compression function is not guaranteed to reduce the - size of the on-disk database in every case. It has been tested - and shown to work well with English-language text. Of course, in - order to determine if the default compression algorithm is beneficial - for your application, it is important to test both the final size - and the performance using a representative set of data and access - patterns. -

+

+ The default compression function is not guaranteed to + reduce the size of the on-disk database in every case. It + has been tested and shown to work well with + English-language text. Of course, in order to determine if + the default compression algorithm is beneficial for your + application, it is important to test both the final size + and the performance using a representative set of data and + access patterns. +

- The default compression function performs prefix compression on each - key added to the database. This means that, for a key - n bytes in length, the first - i bytes that match the first - i bytes of the previous key exactly are omitted - and only the final n-i bytes are stored in the - database. If the bytes of key being stored match the bytes of the - previous key exactly, then the same prefix compression algorithm is - applied to the data value being stored. To use Berkeley DB's default - compression behavior, both the default compression and decompression - functions must be used. -

+ The default compression function performs prefix + compression on each key added to the database. This means + that, for a key n bytes in length, the + first i bytes that match the first + i bytes of the previous key exactly + are omitted and only the final n-i bytes + are stored in the database. If the bytes of key being stored + match the bytes of the previous key exactly, then the same + prefix compression algorithm is applied to the data value + being stored. To use Berkeley DB's default compression + behavior, both the default compression and decompression + functions must be used. +

- For example, to configure your database for default compression: -

+ For example, to configure your database for default + compression: +

-
-    DB *dbp = NULL;
+        
DB *dbp = NULL;
     DB_ENV *envp = NULL;
      u_int32_t db_flags; 
     const char *file_name = "mydb.db"; 
@@ -441,60 +511,64 @@ err:    /* Close the cursor. */
           
-

Custom compression

+

Custom + compression

- An application wishing to perform its own compression may supply a - compression and decompression function which will be called instead of - Berkeley DB's default functions. The compression function is - passed five DBT structures: -

+ An application wishing to perform its own compression + may supply a compression and decompression function which + will be called instead of Berkeley DB's default functions. + The compression function is passed five DBT structures: +

  • -

    - The key and data immediately preceeding the key/data pair - that is being stored. -

    +

    + The key and data immediately preceeding the + key/data pair that is being stored. +

  • - The key and data being stored in the tree. -

    + The key and data being stored in the tree. +

  • -

    - The buffer where the compressed data should be written. -

    +

    + The buffer where the compressed data should be + written. +

+

+ The total size of the buffer used to store the + compressed data is identified in the DBT's + ulen field. If the compressed data + cannot fit in the buffer, the compression function should + store the amount of space needed in DBT's + size field and then return + DB_BUFFER_SMALL. Berkeley DB will + subsequently re-call the compression function with the + required amount of space allocated in the compression data + buffer. +

+

+ Multiple compressed key/data pairs will likely be + written to the same buffer and the compression function + should take steps to ensure it does not overwrite data. +

- The total size of the buffer used to store the compressed data is - identified in the DBT's ulen field. If the - compressed data cannot fit in the buffer, the compression function - should store the amount of space needed in DBT's - size field and then return - DB_BUFFER_SMALL. Berkeley DB will subsequently - re-call the compression function with the required amount of space - allocated in the compression data buffer. -

-

- Multiple compressed key/data pairs will likely be written to the - same buffer and the compression function should take steps to - ensure it does not overwrite data. -

-

- For example, the following code fragments illustrate the use of a custom - compression routine. This code is actually a much simplified - example of the default compression provided by Berkeley DB. It does - simple prefix compression on the key part of the data. -

+ For example, the following code fragments illustrate + the use of a custom compression routine. This code is + actually a much simplified example of the default + compression provided by Berkeley DB. It does simple prefix + compression on the key part of the data. +

-
-    int compress(DB *dbp, const DBT *prevKey, const DBT *prevData, 
+          
int compress(DB *dbp, const DBT *prevKey, const DBT *prevData, 
         const DBT *key, const DBT *data, DBT *dest)
 {
     u_int8_t *dest_data_ptr;
@@ -538,49 +612,53 @@ err:    /* Close the cursor. */
     return (0);
 } 

- The corresponding decompression function is likewise passed five DBT structures: -

+ The corresponding decompression function is likewise + passed five DBT structures: +

  • -

    - The key and data DBTs immediately preceding the - decompressed key and data. -

    +

    + The key and data DBTs immediately preceding + the decompressed key and data. +

  • -

    - The compressed data from the database. -

    +

    + The compressed data from the database. +

  • -

    - One to store the decompressed key and another one for the - decompressed data. -

    +

    + One to store the decompressed key and another + one for the decompressed data. +

+

+ Because the compression of record X + relies upon record X-1, the + decompression function can be called repeatedly to + linearally decompress a set of records stored in the + compressed buffer. +

- Because the compression of record X relies upon - record X-1, the decompression function can be - called repeatedly to linearally decompress a set of records stored - in the compressed buffer. -

-

- The total size of the buffer available to store the decompressed data is - identified in the destination DBT's ulen field. If the - decompressed data cannot fit in the buffer, the decompression function - should store the amount of space needed in the destination DBT's - size field and then return - DB_BUFFER_SMALL. Berkeley DB will subsequently - re-call the decompression function with the required amount of space - allocated in the decompression data buffer. -

+ The total size of the buffer available to store the + decompressed data is identified in the destination DBT's + ulen field. If the decompressed + data cannot fit in the buffer, the decompression function + should store the amount of space needed in the destination + DBT's size field and then return + DB_BUFFER_SMALL. Berkeley DB will + subsequently re-call the decompression function with the + required amount of space allocated in the decompression + data buffer. +

- For example, the decompression routine that corresponds to the - example compression routine provided above is: -

+ For example, the decompression routine that corresponds + to the example compression routine provided above is: +

int decompress(DB *dbp, const DBT *prevKey, const DBT *prevData, 
     DBT *compressed, DBT *destKey, DBT *destData)
@@ -642,46 +720,49 @@ err:    /* Close the cursor. */
           
-

Programmer Notes

+

Programmer Notes

-

- As you use compression with your databases, be aware of the - following: -

+

+ As you use compression with your databases, be aware of + the following: +

  • -

    - Compression works by placing key/data pairs from a single - database page into a single block of compressed data. This is true - whether you use DB's default compression, or you write - your own compression. Because all of key/data data is - placed in a single block of memory, you cannot decompress - data unless you have decompressed everything that came - before it in the block. That is, you cannot decompress item - n in the data block, unless you also - decompress items 0 through - n-1. -

    +

    + Compression works by placing key/data pairs + from a single database page into a single block of + compressed data. This is true whether you use + DB's default compression, or you write your + own compression. Because all of key/data data is + placed in a single block of memory, you cannot + decompress data unless you have decompressed + everything that came before it in the block. That + is, you cannot decompress item + n in the data block, + unless you also decompress items + 0 through + n-1. +

  • - If you increase the minimum number of key/data pairs placed - on a Btree leaf page (using DB->set_bt_minkey()), you will - decrease your seek times on a compressed database. However, - this will also decrease the effectiveness of the - compression. -

    + If you increase the minimum number of key/data + pairs placed on a Btree leaf page (using + DB->set_bt_minkey()), you will decrease your seek + times on a compressed database. However, this will + also decrease the effectiveness of the + compression. +

  • -

    - Compressed databases are fastest if bulk load is used to - add data to them. See - Retrieving and updating records in bulk - for information on using bulk load. -

    +

    + Compressed databases are fastest if bulk load + is used to add data to them. See Retrieving and updating records in bulk for information + on using bulk load. +

@@ -703,7 +784,8 @@ err: /* Close the cursor. */ Home -  Hash access method specific configuration +  Hash access method specific + configuration -- cgit v1.2.1