author: mjwiacek <mjwiacek@google.com> 2016-02-23 07:36:39 -0800
committer: Chris Mumford <cmumford@chromium.org> 2016-03-31 15:53:34 -0700
commit: e84b5bdb5af6a4df8c3dc3f6f644b4ca3b6722cc
tree: be394fdf8a6dd685a6c036cdb421feee52145bbf /db/log_reader.cc
parent: 32113439095d148fa93c7581a15f52ff26a389d3
This CL fixes a bug encountered when reading records from leveldb files that have been split, as in a [] input task split.

Detailed description:

Suppose an input split is generated between two leveldb record blocks and the preceding block ends with null padding. A reader that has already read at least one record within the first block (before encountering the padding) will, upon trying to read the next record, successfully and correctly read the next logical record from the subsequent block, but will return a last record offset pointing to the padding in the first block. When this happened in a [], it resulted in duplicate records being handled at what appeared to be different offsets that were separated by only a few bytes.

This behavior is only observed when at least one record was read from the first block before encountering the padding. If the initial offset for a reader was within the padding, the correct record offset would be reported, namely the offset within the second block.

The tests failed to catch this scenario because each read test only read a single record with an initial offset. This CL adds an explicit test case for this scenario, and modifies the test structure to read all remaining records in the test case after an initial offset is specified. Thus an initial offset that jumps to record #3, with 5 total records in the test file, will result in reading 2 records and validating the offset of each of them in order to pass successfully.

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=115338487
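The offset arithmetic behind the scenario described above can be checked with a small standalone model. This is only a sketch, not the reader's actual code: `ReaderState`, `SimulateReadPhysicalRecord`, and `ComputeOffsets` are hypothetical names, the model assumes the next block is read in full, and it ignores checksums and record types. The constants match leveldb's log format (32768-byte blocks, 7-byte record headers).

```cpp
#include <cassert>
#include <cstdint>

// Constants matching leveldb's log format.
constexpr uint64_t kBlockSize = 32768;
constexpr uint64_t kHeaderSize = 7;  // checksum (4) + length (2) + type (1)

// Hypothetical stand-in for the reader's bookkeeping fields.
struct ReaderState {
  uint64_t end_of_buffer_offset;  // file offset just past the buffered data
  uint64_t buffer_size;           // unread bytes remaining in the buffer
};

// Simplified model of one physical-record read. If fewer than kHeaderSize
// bytes remain, they are treated as trailer padding and the next block is
// loaded (assumed to be a full block). Then one record with the given
// payload size is consumed. Returns the fragment (payload) size.
uint64_t SimulateReadPhysicalRecord(ReaderState& s, uint64_t payload_size) {
  if (s.buffer_size < kHeaderSize) {
    s.end_of_buffer_offset += kBlockSize;
    s.buffer_size = kBlockSize;
  }
  assert(kHeaderSize + payload_size <= s.buffer_size);
  s.buffer_size -= kHeaderSize + payload_size;
  return payload_size;
}

struct Offsets {
  uint64_t before_fix;  // offset computed before the read (old behavior)
  uint64_t after_fix;   // offset computed after the read (new behavior)
};

// Starts with the first block already buffered and `unread_bytes` left in
// it, then reads one record and reports both offset calculations.
Offsets ComputeOffsets(uint64_t unread_bytes, uint64_t payload_size) {
  ReaderState s{kBlockSize, unread_bytes};

  // Old calculation: taken before the read, so it can point into padding.
  const uint64_t before = s.end_of_buffer_offset - s.buffer_size;

  const uint64_t fragment_size = SimulateReadPhysicalRecord(s, payload_size);

  // New calculation: taken after the read, backing out the header and
  // payload that were just consumed.
  const uint64_t after =
      s.end_of_buffer_offset - s.buffer_size - kHeaderSize - fragment_size;

  return {before, after};
}
```

With 3 bytes of padding left in the first block and a 100-byte payload, `ComputeOffsets(3, 100)` gives 32765 for the old calculation (inside the padding) and 32768 for the new one (the start of the second block). When no padding is skipped, for example `ComputeOffsets(500, 100)`, both calculations agree.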
Diffstat (limited to 'db/log_reader.cc')
-rw-r--r-- db/log_reader.cc 8
1 files changed, 7 insertions, 1 deletions
diff --git a/db/log_reader.cc b/db/log_reader.cc
index 6d4a5b2..a6d3045 100644
--- a/db/log_reader.cc
+++ b/db/log_reader.cc
@@ -73,8 +73,14 @@ bool Reader::ReadRecord(Slice* record, std::string* scratch) {
   Slice fragment;
   while (true) {
-    uint64_t physical_record_offset = end_of_buffer_offset_ - buffer_.size();
     const unsigned int record_type = ReadPhysicalRecord(&fragment);
+
+    // ReadPhysicalRecord may have only had an empty trailer remaining in its
+    // internal buffer. Calculate the offset of the next physical record now
+    // that it has returned, properly accounting for its header size.
+    uint64_t physical_record_offset =
+        end_of_buffer_offset_ - buffer_.size() - kHeaderSize - fragment.size();
+
     if (resyncing_) {
       if (record_type == kMiddleType) {
         continue;