| author | Michael Cahill <michael.cahill@wiredtiger.com> | 2014-09-10 14:30:24 +1000 |
|---|---|---|
| committer | Michael Cahill <michael.cahill@wiredtiger.com> | 2014-09-10 14:30:24 +1000 |
| commit | c07c5cb34aae56fb20e30fa6586fa660326a6d5d (patch) | |
| tree | 24c16130ab5e40146ae59d36bef52ef894f42062 /ext | |
| parent | 72d4e708e6a1d4e84db117ae3f098ad970c098ba (diff) | |
| download | mongo-c07c5cb34aae56fb20e30fa6586fa660326a6d5d.tar.gz | |
Limit the maximum compression ratio our raw zlib implementation will allow. Once we have consumed 20x the maximum page size of input, stop. This prevents pathological behavior in synthetic workloads where a page is forced out of cache, compresses into a single page on disk, and the cycle repeats for every update.
Diffstat (limited to 'ext')
-rw-r--r-- | ext/compressors/zlib/zlib_compress.c | 9 |
1 file changed, 8 insertions(+), 1 deletion(-)
```diff
diff --git a/ext/compressors/zlib/zlib_compress.c b/ext/compressors/zlib/zlib_compress.c
index 33bb9bf8810..3532ecf16cd 100644
--- a/ext/compressors/zlib/zlib_compress.c
+++ b/ext/compressors/zlib/zlib_compress.c
@@ -225,8 +225,15 @@ zlib_compress_raw(WT_COMPRESSOR *compressor, WT_SESSION *session,
 	 * Strategy: take the available output size and compress that much
 	 * input.  Continue until there is no input small enough or the
 	 * compression fails to fit.
+	 *
+	 * Don't let the compression ratio become insanely good (which can
+	 * happen with synthetic workloads).  Once we hit a limit, stop so that
+	 * the in-memory size of pages isn't totally different to the on-disk
+	 * size.  Otherwise we can get into trouble where every update to a
+	 * page results in forced eviction based on in-memory size, even though
+	 * the data fits into a single on-disk block.
 	 */
-	while (zs.avail_out > 0) {
+	while (zs.avail_out > 0 && zs.total_in <= zs.total_out * 20) {
 		/* Find the slot we will try to compress up to. */
 		if ((curr_slot = zlib_find_slot(
 		    zs.total_in + zs.avail_out, offsets, slots)) <= last_slot)
```