diff options
Diffstat (limited to 'doc')
-rw-r--r-- | doc/index.html | 63 | ||||
-rw-r--r-- | doc/table_format.txt | 41 |
2 files changed, 104 insertions, 0 deletions
diff --git a/doc/index.html b/doc/index.html index 472f7cd..521d2ba 100644 --- a/doc/index.html +++ b/doc/index.html @@ -400,6 +400,69 @@ We might want to prefix <code>filename</code> keys with one letter (say '/') and over just the metadata do not force us to fetch and cache bulky file contents. <p> +<h2>Filters</h2> +<p> +Because of the way <code>leveldb</code> data is organized on disk, +a single <code>Get()</code> call may involve multiple reads from disk. +The optional <code>FilterPolicy</code> mechanism can be used to reduce +the number of disk reads substantially. +<pre> + leveldb::Options options; + options.filter_policy = NewBloomFilter(10); + leveldb::DB* db; + leveldb::DB::Open(options, "/tmp/testdb", &db); + ... use the database ... + delete db; + delete options.filter_policy; +</pre> +The preceding code associates a +<a href="http://en.wikipedia.org/wiki/Bloom_filter">Bloom filter</a> +based filtering policy with the database. Bloom filter based +filtering relies on keeping some number of bits of data in memory per +key (in this case 10 bits per key since that is the argument we passed +to NewBloomFilter). This filter will reduce the number of unnecessary +disk reads needed for <code>Get()</code> calls by a factor of +approximately a 100. Increasing the bits per key will lead to a +larger reduction at the cost of more memory usage. We recommend that +applications whose working set does not fit in memory and that do a +lot of random reads set a filter policy. +<p> +If you are using a custom comparator, you should ensure that the filter +policy you are using is compatible with your comparator. For example, +consider a comparator that ignores trailing spaces when comparing keys. +<code>NewBloomFilter</code> must not be used with such a comparator. +Instead, the application should provide a custom filter policy that +also ignores trailing spaces. For example: +<pre> + class CustomFilterPolicy : public leveldb::FilterPolicy { + private: + FilterPolicy* builtin_policy_; + public: + CustomFilterPolicy() : builtin_policy_(NewBloomFilter(10)) { } + ~CustomFilterPolicy() { delete builtin_policy_; } + + const char* Name() const { return "IgnoreTrailingSpacesFilter"; } + + void CreateFilter(const Slice* keys, int n, std::string* dst) const { + // Use builtin bloom filter code after removing trailing spaces + std::vector<Slice> trimmed(n); + for (int i = 0; i < n; i++) { + trimmed[i] = RemoveTrailingSpaces(keys[i]); + } + return builtin_policy_->CreateFilter(&trimmed[i], n, dst); + } + + bool KeyMayMatch(const Slice& key, const Slice& filter) const { + // Use builtin bloom filter code after removing trailing spaces + return builtin_policy_->KeyMayMatch(RemoveTrailingSpaces(key), filter); + } + }; +</pre> +<p> +Advanced applications may provide a filter policy that does not use +a bloom filter but uses some other mechanism for summarizing a set +of keys. See <code>leveldb/filter_policy.h</code> for detail. +<p> <h1>Checksums</h1> <p> <code>leveldb</code> associates checksums with all data it stores in the file system. diff --git a/doc/table_format.txt b/doc/table_format.txt index ad5aa4b..d0f3065 100644 --- a/doc/table_format.txt +++ b/doc/table_format.txt @@ -47,6 +47,47 @@ the BlockHandle of the metaindex and index blocks as well as a magic number. // (40==2*BlockHandle::kMaxEncodedLength) magic: fixed64; // == 0xdb4775248b80fb57 +"filter" Meta Block +------------------- + +If a "FilterPolicy" was specified when the database was opened, a +filter block is stored in each table. The "metaindex" block contains +an entry that maps from "filter.<N>" to the BlockHandle for the filter +block where "<N>" is the string returned by the filter policy's +"Name()" method. + +The filter block stores a sequence of filters, where filter i contains +the output of FilterPolicy::CreateFilter() on all keys that are stored +in a block whose file offset falls within the range + + [ i*base ... (i+1)*base-1 ] + +Currently, "base" is 2KB. So for example, if blocks X and Y start in +the range [ 0KB .. 2KB-1 ], all of the keys in X and Y will be +converted to a filter by calling FilterPolicy::CreateFilter(), and the +resulting filter will be stored as the first filter in the filter +block. + +The filter block is formatted as follows: + + [filter 0] + [filter 1] + [filter 2] + ... + [filter N-1] + + [offset of filter 0] : 4 bytes + [offset of filter 1] : 4 bytes + [offset of filter 2] : 4 bytes + ... + [offset of filter N-1] : 4 bytes + + [offset of beginning of offset array] : 4 bytes + lg(base) : 1 byte + +The offset array at the end of the filter block allows efficient +mapping from a data block offset to the corresponding filter. + "stats" Meta Block ------------------ |