WiredTiger release 2.4.0, 2014-10-15
------------------------------------
The WiredTiger 2.4.0 release contains significant new features, API changes
and many bug fixes.
New features and API changes:
* Allow cursors to keep their position across transaction boundaries.
That is, WT_SESSION::begin_transaction and WT_SESSION::commit_transaction no
longer reset cursors. [#1181]
* Add groundwork to support building WiredTiger on Windows.
* Add ability to customize a collator for specific data sources or with
application managed metadata. See upgrading documentation for more
information. [#1165]
* Enhance extension mechanism in WiredTiger to support loading extensions from
the application binary - not just a separate library. [#1174]
* Replace WT_SESSION::create lsm=(merge_threads) configuration option with
wiredtiger_open lsm_manager=(worker_thread_max). See upgrading documentation
for more information.
* Enhancements to the WiredTiger Python API build process. [#1188]
* Add ability to dump and load WiredTiger databases in JSON format. [#1154]
* Add ability to automatically checkpoint based on the volume of log records
generated since the last checkpoint. Enabled using wiredtiger_open
configuration option of the form: "checkpoint=(log_size=size)" [#1170]
* Enhance functionality allowing users to write content into the WiredTiger
transaction log. [#1171][#1175]
* Enhance the WiredTiger HyperLevelDB implementation to support log replay.
[#1106][#1155]
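
As a rough illustration of the two wiredtiger_open settings named in the list
above, the Python sketch below assembles a configuration string that combines
log-volume-based checkpoints with the LSM manager option. The helper names
(open_config, open_database), the 64MB threshold, the worker count and the
assumption that logging has to be switched on with log=(enabled) are
illustrative choices, not taken from these notes. The wiredtiger module is
imported lazily so the string helper can be tried without it.

```python
def open_config(log_size, lsm_workers=None):
    """Build a wiredtiger_open configuration string (hypothetical helper)."""
    parts = [
        "create",
        "log=(enabled)",  # assumption: log-based checkpoints need logging on
        "checkpoint=(log_size=%s)" % log_size,
    ]
    if lsm_workers is not None:
        # Connection-wide replacement for the old lsm=(merge_threads) option.
        parts.append("lsm_manager=(worker_thread_max=%d)" % lsm_workers)
    return ",".join(parts)


def open_database(home, config):
    """Open a connection; requires the WiredTiger Python API to be installed."""
    from wiredtiger import wiredtiger_open  # lazy import, see the note above
    return wiredtiger_open(home, config)


print(open_config("64MB", lsm_workers=4))
```

Keeping the string construction separate from the open call makes it easy to
log or review the exact configuration before a connection is created.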
Other significant changes:
* Fix several bugs in the shared cache implementation. [#1180][#1176]
* Fix a bug where WiredTiger could overwrite the public URI field in a cursor.
[#1235]
* Fix several bugs in salvage implementation. [#1222][#1169]
* Several bug fixes and enhancements for WT_CONNECTION::reconfigure.
[#1214][#1172]
* Fix several bugs in raw compression implementation. Relevant for data that
compresses extremely well. [#1191]
* Fix a bug in cursors. When an operation returns WT_NOTFOUND, the cursor is
now left pointing to the original key/value pair. [#1209]
* Several bug fixes and enhancements to WiredTiger LevelDB interface.
* Switch default build from using adaptive pthread mutexes to default pthread
mutexes.
WiredTiger release 2.3.1, 2014-08-14
------------------------------------
The WiredTiger 2.3.1 release contains mainly performance enhancements and bug
fixes.
Changes to the WiredTiger API:
* Fix a bug in WT_CURSOR::set_value that could lead to undefined behavior with
some value formats.
* Make the asynchronous API generally available. [#1139]
* Add log cursors for replay and verification. Make generated log record and
operation types public. [#1106]
* Allow eviction worker threads to be started and stopped dynamically.
Applications that use the `eviction_workers` configuration should see the
upgrading documentation on how to use this feature.
[#1116, #1143, #1158]
Other significant changes:
* Improve performance and reduce latency during checkpoints and LSM merges.
Remove uses of the checkpoint lock other than serializing checkpoints:
compact holds the schema lock, so it doesn't need to hold the checkpoint
lock; the new WT_BTREE handle close lock prevents checkpoints from colliding
with handle close, so LSM doesn't need the checkpoint lock either.
* Some minor cleanups, setting the internal session's name in a few places.
[#1073]
* Grab the live lock when loading a checkpoint in diagnostic mode: that could
race with a read. [#1102]
* Instead of keeping a list of file URIs for checkpoint to flush, open a handle
and stash it. [#1114]
* Add a new OS-layer function __wt_fsync_async to flush a file without waiting
for the results, and call it from the Btree flush-leaves code so pages start
flushing while we're working on the rest of the checkpoint. [#1136, #1152]
* Wait for the handle flush lock when writing the leaf pages instead of
returning EBUSY. [#1136]
* Add a wtperf page to the documentation, describing how to simulate
workloads and view statistics. [#1147]
* Flag new structures not listed in PREDEFINE. [#1148]
* Return EBUSY if no async handles available and fix ex_async to look for it.
[#1153]
* Fix some problems with navigation in the reference guide.
* Bump the number of slots for internal sessions: we have a lot more than 2
now. Add a test for `session_max` settings, make sure we add enough to
account for at least the default internal sessions.
* Remove tcbench: we're no longer maintaining it.
WiredTiger release 2.3.0, 2014-07-29
------------------------------------
The WiredTiger 2.3.0 release contains significant new features, performance
enhancements and bug fixes. Significant changes are described below.
Changes to the WiredTiger API (see upgrading documentation for details):
* Add a LevelDB API implementation for WiredTiger. This includes support for
stock LevelDB as well as Basho, HyperLevelDB and RocksDB versions of the API.
To build the LevelDB API, include --enable-leveldb in the configure command;
to specify compatibility with an alternative LevelDB API, use
--enable-leveldb=[basho,hyper,rocksdb]. [#1028]
* Add ability to build some common extensions into the WiredTiger library.
This means that the libraries for those extensions don't need to be
dynamically loaded at runtime. Currently supported extensions are Snappy
compression and zlib compression. The option can be enabled by passing
--with-builtins=[snappy,zlib] to the configure command line.
* Add a new configuration to wiredtiger_open: statistics_log=(on_close=true),
that causes a set of statistics to be logged on WT_CONNECTION::close. [#1086]
* Add a new configuration to wiredtiger_open: exclusive, that causes the open
to fail if the database already exists.
Other significant changes:
* Performance improvement for high throughput workloads using multiple
eviction threads. Performance of some workloads improves by over 15%. [#1087]
* Significant performance optimizations for queries, giving up to 20%
throughput improvement for in-memory query workloads.
https://github.com/wiredtiger/wiredtiger/wiki/Query-throughput
* Fix an off-by-one bug that could lead to ENOMEM during commit with logging.
[#1104][#1121]
* Allow bulk loads to multiple files to complete in parallel. [#1114][#1126]
WiredTiger release 2.2.1, 2014-06-24
------------------------------------
The WiredTiger 2.2.1 release contains mainly performance enhancements and bug
fixes. Significant changes are described below.
Changes to the WiredTiger API (see upgrading documentation for details):
* Change the order in which configuration setting mechanisms are applied by
wiredtiger_open. [#1010][#1034]
* Split the global transaction_sync configuration into two parts: a sync method
(dsync, fsync or none), and an enabled flag (false by default). [#1074]
* Add ability to sync with per transaction granularity. [#1074]
* Update WiredTiger Java API to throw WiredTigerException consistently. [#1011]
* Add ability to dump and load databases using JSON format. [#740][#1049]
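
The split of transaction_sync into a method and an enabled flag can be
sketched as a small Python helper that produces the wiredtiger_open fragment.
The nested key names (enabled, method) follow the wording of the note above
and should be checked against the upgrading documentation. The helper name
and the validation are hypothetical, added only for illustration.

```python
SYNC_METHODS = ("dsync", "fsync", "none")


def transaction_sync_config(method="fsync", enabled=False):
    """Return a transaction_sync=(...) fragment for wiredtiger_open."""
    if method not in SYNC_METHODS:
        raise ValueError("unknown sync method: %r" % method)
    flag = "true" if enabled else "false"
    return "transaction_sync=(enabled=%s,method=%s)" % (flag, method)


# The enabled flag is false by default, so durable commits are opt-in.
print(transaction_sync_config())
print(transaction_sync_config("dsync", enabled=True))
```

Rejecting unknown methods in the helper surfaces a typo before the string
reaches wiredtiger_open.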
Other significant changes:
* Various performance improvements to the main cursor search routine including
reductions in how often we need to copy data and profiling based optimizations
for tight search loops. [#1050][#1070]
* Fix a bug in recovery with missing files (e.g., after a hot backup that raced
with file creation). [#1042][#1045]
* Several bug fixes and performance enhancements related to LSM trees and
snapshot isolation transactions. [#1057][#1060][#1075]
* Several performance tuning enhancements to LSM trees around locking,
throttling and switching chunks. [#1051]
* Algorithmic improvements to LSM tree compact operation. It is now faster
and more reliable. [#1063]
* Create a separate thread to manage open file handles - which means that:
- Application threads are less likely to be responsible for closing handles
- Multi-threaded workloads don't open/close handles more often than necessary
[#1018]
WiredTiger release 2.2.0, 2014-05-21
------------------------------------
The WiredTiger 2.2.0 release contains new features, performance enhancements
and bug fixes. Significant changes include:
Changes relevant for upgrading applications:
Update the table create API to disable prefix compression by default.
Applications generally see better performance without prefix compression;
choosing space savings over performance is up to the application. [#981]
Change the default leaf_page_max setting from 1MB to 32KB. Choosing a large
default leaf_page_max led to poor performance in out of cache workloads.
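
An application that relied on the earlier defaults could opt back into them
when creating a table. The Python sketch below is one way to do that, under
stated assumptions: the helper names, the string key/value formats and the
table URI passed by the caller are illustrative. The prefix_compression and
leaf_page_max settings are the ones named in the two paragraphs above.

```python
def legacy_table_config(key_format="S", value_format="S"):
    """WT_SESSION::create config that opts back into the pre-2.2.0 defaults."""
    return ",".join([
        "key_format=%s" % key_format,
        "value_format=%s" % value_format,
        "prefix_compression=true",  # disabled by default as of 2.2.0
        "leaf_page_max=1MB",        # default is 32KB as of 2.2.0
    ])


def create_table(session, uri):
    """Create a table on an already open WiredTiger Python API session."""
    session.create(uri, legacy_table_config())


print(legacy_table_config())
```

Whether the old values are a good idea depends on the workload; the notes
above suggest the new defaults perform better in out of cache workloads.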
Remove the `--enable-debug` option to configure. It is more standard to set
the `CFLAGS="-g"` variable instead.
Save the wiredtiger_open configuration when a database is created, so that
settings like cache size, extensions and logging are set consistently by all
subsequent users of the database.
Add an `--enable-verbose` option to configure. In order to access the verbose
message functionality available as part of the wiredtiger_open and
WT_CONNECTION::reconfigure APIs, it is necessary to pass the `--enable-verbose`
option to configure.
Enhance the metadata cursor implementation (i.e., cursors created with a
"metadata:" prefix) so that they can be used to inspect metadata for internal
tables and now support altering the metadata. Add a new "read_only" flag to
cursor configuration that defaults to false for metadata cursors.
Fix several bugs in raw compression, including one that could cause data
corruption and some that triggered poor performance.
[#984][#991][#1007][#1008][#1013]
Improve the performance of recovery - we no longer need to scan all log files
looking for the last checkpoint.
Improve performance of read-only transactions, by deferring the allocation
of transaction IDs. [#978]
Several bug fixes in hot backup related to log files, including:
* Always choose the right metadata version in the backup [#972]
* Don't require that hot backup copies log files in order [#976]
* Always copy log files before data files [#976]
* Fix a bug where recovery returned an error if the last log record was
incomplete [#994]
Speed up checkpoints by doing a better job of skipping pages that can't
contain changes that need to be included. [#954][#963][#1001]
Add ability to store zero length data items into LSM trees. [#540]
Add an asynchronous data access/manipulation API to WiredTiger. [#933]
Add the ability to configure multiple eviction server threads, to help with
keeping space available in the cache. [#918]
Add the ability to reconfigure the checkpoint and statistics log servers.
[#997][#1004]
Improve the performance of retrieving data for in cache workloads. [#970]
Improve the structure of the in-memory tree we are generating, by allowing
internal pages to be split. This significantly improves query performance
in some workloads. [#876]
Work around a bug in posix_fallocate on Linux, where it could corrupt already
written data.
Add the ability to leak memory on connection close via new leak_memory option
to WT_CONNECTION::close API. This allows for faster shutdown if a process is
going to exit when the WiredTiger connection is closing.
Allow salvage to run on any table type.
WiredTiger release 2.1.2, 2014-03-27
------------------------------------
The WiredTiger 2.1.2 release contains performance enhancements and bug fixes.
Significant changes include:
Update the configuration settings for shared_cache to make the distinction
between cache_size and shared_cache less confusing. See upgrading
documentation for more information.
Various performance enhancements to improve the performance of checkpoints.
Fix a bug that could cause a hang with small caches under heavy load. [#894]
WiredTiger release 2.1.1, 2014-03-04
------------------------------------
The WiredTiger 2.1.1 release contains new features, performance enhancements
and bug fixes. Significant changes include:
Fix a bug where a page could be marked clean when it contained uncommitted
changes. This bug could cause undefined behavior in transaction rollback
under load.
Fix a bug with shared caches when rebalancing between connections.
Add a new public API to WiredTiger that provides the ability to parse
WiredTiger compatible configuration strings. See the upgrading documentation
for further information. [#873]
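
To give a feel for what parsing a WiredTiger-style configuration string
involves, here is a toy Python parser. It is not the new public API and does
not reproduce its behavior. It only splits top-level settings while
respecting () and [] nesting, and it treats a bare key as a boolean set to
"true". Quoted strings and all error handling are left out, and the function
name is hypothetical.

```python
def parse_config(config):
    """Split a configuration string into a dict of its top-level settings."""
    items = []
    depth = 0
    start = 0
    for i, ch in enumerate(config):
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "," and depth == 0:
            # Only commas outside any brackets separate settings.
            items.append(config[start:i])
            start = i + 1
    items.append(config[start:])

    result = {}
    for item in items:
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        # A key with no "=value" part is read as a boolean flag.
        result[key.strip()] = value.strip() if sep else "true"
    return result


print(parse_config("create,cache_size=500MB,log=(enabled)"))
```

Nested values are returned as raw strings, brackets included, so a caller
would have to parse them again to reach the inner keys.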
A number of performance enhancements to the LSM implementation, particularly
for long running workloads.
A number of performance enhancements and bug fixes to cache eviction code.
Add an option to use direct I/O when reading from checkpoints. To enable
the functionality, add "direct_io=[checkpoint]" to your wiredtiger_open
configuration string. [#847]
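
The direct_io setting takes a bracketed list rather than a single value. A
minimal Python sketch of building such a list-valued setting follows. The
helper name is hypothetical, and combining the setting with "create" is an
assumption made only to show a complete string.

```python
def list_setting(name, values):
    """Format a list-valued setting such as direct_io=[checkpoint]."""
    return "%s=[%s]" % (name, ",".join(values))


config = ",".join(["create", list_setting("direct_io", ["checkpoint"])])
print(config)
```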
WiredTiger release 2.1.0, 2014-02-04
------------------------------------
The WiredTiger 2.1.0 release contains new features, performance enhancements
and bug fixes. Significant changes include:
The WT_ITEM structure was changed so that the size field is a size_t rather
than a uint32_t. See upgrading documentation for details.
A change to the compress_raw interface around repeating the call with more
records. See upgrading documentation for details.
In LSM trees, the memory_page_max setting is ignored. The effective setting
is double the chunk size. [#861][#859]
Add support for zlib compression. [#855] [#865]
Various enhancements to how WiredTiger generates tree structures in memory to
help maintain consistent performance as table size grows. [#851]
Add support for Levyx Inc Helium as an external data source in WiredTiger
[#849][#850]
Improve insert performance when a table contains many identical overflow
items.
Various performance enhancements to btree searches. [#838][#839][#840]
Add support for newer versions of automake up to 1.14. [#599][#841]
Improve multi-threaded throughput of durable log writes, including changing
the default wiredtiger_open transaction_sync configuration from dsync to
fsync, see the upgrading documentation for further information. [#831][#832]
In the Python and Java APIs, automatically close handles to prevent invalid
accesses by applications. [#649][#800][#830]
Various enhancements to the LSM merge algorithm, including improvements to how
files are selected for merging, and throttling based on whether merges are
keeping up (to limit write amplification). Made the minimum number of chunks
chosen to merge configurable. [#817][#819][#822]
WiredTiger release 2.0.1, 2013-12-12
------------------------------------
The WiredTiger 2.0.1 release contains major new features, numerous performance
enhancements and bug fixes.
Significant changes include:
* WiredTiger now supports fine-grained durability via Write Ahead Logging (WAL).
Logging is enabled with the "log=(enabled)" configuration string to
wiredtiger_open. If the connection is not shut down cleanly and logging is
enabled, WiredTiger will automatically run recovery the next time it is
opened, rolling forward changes in the log until the last commit.
[#605]
* Many enhancements to the LSM implementation to improve the throughput and
reduce maximum operation latency including:
- Algorithmic improvements when multiple merge threads are configured.
- Improvements to bloom filter lookup speed.
- Enhancements to internal cursor management, to reduce search overhead.
- Prioritize switching to a new level 0 chunk in utility threads, to avoid
application thread pauses.
- More advanced logic in choosing when to create bloom filters.
* LSM-specific WT_SESSION::create configuration option enhancements, including:
- Move existing options into their own group, and strip leading lsm_ prefix.
- Add a new merge_max configuration option.
- Update the default chunk_size to be 10MB.
- Increase the default bloom filter bit and hash counts.
- Clean up files left after interrupted merges.
See the upgrading documentation for details.
[#784, #785, #786, #802]
* WT_SESSION::compact can now be used to merge LSM trees into a small number
of chunks on disk.
[#792]
* Enhanced the Java API, so that when WiredTiger automatically closes a
handle, the handle is automatically invalidated for the Java application.
[#485]
* Add a script that can create an interactive web page to view statistics
from a WiredTiger statistics log. Based on D3: http://d3js.org/
* Enhancements to the wtperf performance testing tool to add new features.
WiredTiger release 1.6.6, 2013-11-19
------------------------------------
The WiredTiger 1.6.6 release is a bugfix and performance tuning release.
This release of WiredTiger contains a database format change. Database files
from previous releases will need to be upgraded.
A special note: the WiredTiger code base is now being regularly reviewed
using the Coverity Static Analysis Verification Engine. We'd like to
thank Coverity for their on-going support of Open Source projects like
WiredTiger!
Significant changes include:
* Performance changes include: limiting operations done inside update
serialization primitives, removing unnecessary memory barriers, replacing
spinlocks with atomic instructions, padding structures to avoid false
cache sharing, switching from per-file mutexes to per-page mutexes,
pre-allocating structures to avoid memory allocation while holding
mutexes, and using adaptive mutexes where available.
[#707, #718, #719]
* A number of LSM stability and performance improvements: changes include
better merge algorithms, reduced locking, and higher concurrency.
* A number of table compaction performance improvements, including changes
allowing compaction to no longer read unnecessary file blocks into the
cache, require fewer passes over the file, and support concurrent
checkpoints and eviction. This change required an underlying file
format change; see the upgrading documentation for details.
[#756, #761]
* WiredTiger statistics have been significantly improved:
Statistics logging has been changed to aggregate information from all
open handles. [#709, #717]
For performance reasons, statistics are now disabled by default, see
the upgrading documentation for details. [#715]
Statistics configuration has been changed so the connection and cursor
configuration are consistent, with matching changes to the "wt stat"
command-line utility; see the upgrading documentation for details.
* Update WT_EVENT_HANDLER interface to contain a new "handle close"
interface and to pass a WT_SESSION handle into all callbacks, see the
upgrading documentation for details. [#649]
Add timestamp, process ID and thread ID to messages generated via
WT_EVENT_HANDLER interface. [#753]
* WiredTiger eviction improvements, supporting larger data-to-cache size
ratios. [#754]
* Various fixes for handling overflow records. [#726, #743]
* Overflow records are no longer tracked during bulk-loads, significantly
increasing bulk-load performance for some data sets.
WiredTiger release 1.6.5, 2013-10-09
------------------------------------
This is primarily a bugfix and performance tuning release. The main changes are:
* Change the default statistics_fast configuration from false to true.
* Change WT_CURSOR::insert to not hold a position. [#673]
* Disallow WT_SESSION::compact operations on LSM trees.
* The 'sync' setting to wiredtiger_open has been renamed 'checkpoint_sync'.
* Add a "metadata:" cursor type. [#660]
* Fix race in the cache's dirty byte tracking. [#635, #699]
* Fix a bug scanning through a memory-mapped file with overflow items. [#701]
* Use hardware checksum instructions when available. [#582, #702]
* Several bug fixes related to tracking active transaction IDs and detection of
obsolete updates with high concurrency workloads. [#639, #643, #657, #683]
* Fix several bugs in LSM including races on shutdown and Bloom filter
creation. [#686, #687, #688].
* Fix a bug in LSM where we were not including Bloom filter files in backups.
[#684]
* Optimize the LSM throttle and merge algorithms. [#676]
* Make hot backups work concurrently with files being bulk-loaded. [#570, #653]
* Add full support for snapshot isolation to LSM: only switch LSM chunks if all
changes are globally visible and detect conflicts between transactions across
file switches. [#629]
WiredTiger release 1.6.4, 2013-08-20
------------------------------------
This is primarily a bugfix and performance tuning release. The main changes are:
* Make prefix compression of keys conditional on the amount of space saved.
A database format change was required for this enhancement. See upgrading
documentation for details. [#624]
* The default behavior of the wt utility's load command has been changed to
overwrite existing data.
* Add a WT_SESSION.create prefix_compression_min configuration option with a
default value of 4. [#624] and [#624]
* Fix "make install" of Python API. [#598]
* Require platform support for atomic read/write of 64 bit values. [#553]
* Support transaction semantics for custom data source implementations. Enhance
Memrata data source to support transactions.
* Changes to the wtperf testing tool related to how configuration options are
specified.
* Enhance cursor key/value memory management to be more efficient, consistent,
and have stricter checking of inputs and outputs.
* Increase the likelihood of being able to evict hot pages. [#604]
* Reference on-page keys instead of copying them to allocated memory. This
saves space in the cache and overhead when reading pages into cache.
[#592] and [#600]
* Add a btree search optimization that skips matching prefixes. [#595]
* Turn off Huffman encoding for keys on row-store internal pages. [#592]
* Add concurrent logging infrastructure that will be used to support write
ahead logging in a future release.
WiredTiger release 1.6.3, 2013-07-12
------------------------------------
This is a bugfix and performance tuning release. The main changes are:
* Change the default cursor overwrite configuration so that it is consistent
across all data sources. This change may alter the behavior of existing
applications without triggering any compilation or runtime warnings. See
the upgrade documentation for details. [#512]
* Require platform support for 64 bit atomic operations. [#553]
WiredTiger release 1.6.2, 2013-06-18
------------------------------------
This is a bugfix and performance tuning release. The main changes are:
* Fix a race in the WiredTiger pseudo random number generator that was leading
to poor distribution of numbers.
* Change the default compression configuration to "uncompressed".
* Fix a race between checkpoints and LSM that could result in a crash. [#543]
* Add an option to output version information at runtime. Configure by
including "verbose=[version]" in the wiredtiger_open connection
configuration string. [#564]
* Add a configurable prefix to error messages. [#527]
* Add two new extension APIs, one to return a transaction ID, one to return
if a transaction ID is visible to the current transaction.
* Add standard metadata functions to the extension API and make extension data
sources responsible for their own metadata entries.
* Add a new extension function __wt_ext_config_strget that returns the
configuration value from a single string.
WiredTiger release 1.6.1, 2013-05-31
------------------------------------
This is a bugfix and performance tuning release. The main changes are:
* Fix the compress_raw API so that it uses platform independent types. See the
upgrade guide for further information. [#561]
* Add an explicit enable setting to shared_cache configuration. See the
upgrade guide for further information.
* Fix several bugs in hot backup, including race conditions between backup and
table drop (and other schema level operations). [#556] [#557]
* Allow any data source type for indices as well as column groups. [#545]
* Preload btree internal pages into file system cache when opening a table.
* Change the default allocation size to 4KB so that DIRECT_IO with 4KB blocks
works. [#547]
* Fix some bugs related to tracking the oldest active transaction. [#552]
* Fix a bug in the extension API when using multiple databases.
* Disallow named checkpoints on LSM trees - they aren't supported. [#546]
* Fix support for custom collators with LSM trees. [#544]
* Build fixes for gcc 4.1.2.
See the upgrade documentation for details of API changes that may require
altering existing applications.
WiredTiger release 1.6.0, 2013-05-16
------------------------------------
This release contains new features, bug fixes and performance improvements.
The significant changes are highlighted below:
* Fix a bug where configuring direct I/O could cause checksum errors at
runtime. NOTE: database file format change. [#526]
* Fix a race that allowed checkpoints to be deleted while hot backups are
running. [#515]
* Scale to events per second in graphs generated from statistics log output.
[#518]
* Changes to reduce the latency of LSM operations.
* Add a new terminate callback to extension interfaces that is called when the
WiredTiger connection is closed. [#530]
* Various optimizations and bug fixes to cache management and eviction code.
* Update various statistics.
* Fix a bug where using a combination of read-committed and snapshot
transactions could result in inconsistent values being returned. [#539]
* Fix a bug where using LSM trees with compression enabled could result in an
invalid system call. [#535]
* Enhance statistics logging so that it can dump "lsm:" statistics.
See the upgrade documentation for information about database format changes
in this release.
WiredTiger release 1.5.3, 2013-04-26
------------------------------------
This release contains some major new features along with numerous bug fixes
and performance improvements. The significant changes are highlighted
below:
* Enhance the extension data source API to facilitate implementation of new
data stores in WiredTiger.
* Add support for the STEC / Memrata KVS data source.
* Add a Berkeley DB data source via the WiredTiger extension API.
* Various enhancements to cache eviction management, mostly to avoid stalls in
application threads.
* Fixes to shared cache pool implementation, so resources are more
aggressively reallocated.
* Add new statistics.
* Implement automatic insert throttling in LSM - enabled by default.
* Configuration strings are now case sensitive.
* Enhance LSM merge algorithms to be more efficient as trees grow very large.
See the upgrade documentation for details of API changes that may require
altering existing applications.
WiredTiger release 1.5.2, 2013-03-28
------------------------------------
This is a bugfix release. The main changes are:
[#493] Fix get_key/value in the Java API for complex cursors.
* Fix a leak in eviction detected by valgrind.
* Stop trying to cache the oldest reader: we only use it for eviction and only update it when required.
* Track cursor creation in the statistics (creating a cursor per operation isn't a good idea).
WiredTiger release 1.5.1, 2013-03-25
------------------------------------
This is a bugfix and performance tuning release. The main changes are:
* Fix several bugs in LSM:
- the logic for setting the "no eviction" flag on LSM chunks was reversed,
causing unnecessary eviction once the cache became full;
- calling session.checkpoint while writing to an LSM tree could confuse
the logic around switching to new chunks; and
- fix a possible NULL pointer indirection when switching chunks.
* Make WT_ASSERT a no-op when not in DIAGNOSTIC mode.
* Panic if we find a block on the wrong list, that's not something we can
recover from.
* If a page is reconciled (causing its on-disk blocks to be freed and
potentially recycled), and then a subsequent collapse of a stack of
split-merge pages replaces that page with a page that has not yet been
reconciled, we can potentially free the same blocks twice. The fix is to
clear the page's WT_REF.addr field at the time we free the blocks, so
future reconciliations will ignore the original disk blocks.
* Fix a bug in the dump utility that allowed index URIs.
* Tweak merge to build better trees with random insert workloads.
* Don't use a stale value for the oldest reader transaction ID.
* Track the size of the WT_REF array in internal pages (including
WT_ADDRs). Also add an estimate of per-allocation overhead.
* Fix a bug where URIs containing absolute paths were not being parsed
correctly.
* Add a RMW insert mode to wtbench.
[#427] Improve cleanup after a failed wiredtiger_open call.
[#484] Don't allow true/false values in config strings where integers are
expected.
[#486] Move the cache full check for autocommit transactions out of the
rollback path (since we don't reset cursors there), to after we
close a cursor.
[#488] Fix an assertion failure if we try to do eviction without ever having done an update.
WiredTiger release 1.5.0, 2013-03-14
------------------------------------
This release contains some major new features along with numerous bug fixes
and performance improvements. The significant changes are highlighted
below:
* Add a Java API.
* Create a thread to do automatic checkpoints, configured by passing
"checkpoint=(wait=X)" to wiredtiger_open.
* Add support for periodically logging statistics to a file and a tool to
generate graphs based on those logs. Configured by passing
"statistics_log=(wait=X)" to wiredtiger_open.
* Several changes to minimize the impact of checkpoints on other threads.
* When reading from checkpoints, use mmap by default.
* Enhance eviction so that internal pages take up less space.
* Add maximum filesystem buffer cache settings to wiredtiger_open called
"os_cache_max" and "os_cache_dirty_max". After doing the specified
amount of reads or writes, WiredTiger will call fadvise and/or
sync_file_range to drop pages from the filesystem cache. This is an
alternative to direct I/O with less impact on performance.
* Make run-time statistics optional, defaulted to "off".
* Change how we detect if shared cache is used. It used to rely on a name,
now it will be used if the shared_cache configuration option is included.
* Add the ability to specify a per-connection reserved size for cache
pools. Ensure cache pool reconfiguration is honoured quickly.
* Rework hazard pointer coupling during cursor walks to be more efficient.
* Add a cache_eviction_walk statistic to track the pages we walk and a
cache_eviction_force statistic to track the count of pages queued for
forced eviction.
* Fixes to reduce the number of operations on shared data that were causing
bottlenecks in read only workloads.
* Add streaming pack / unpack to the API.
* Add some basic reconciliation stats to the connection stats.
* In LSM, keep trying to switch if there is an error: it may be transient.
* Minor clean up and enhancement for the reconciliation statistics, add a
set of compression statistics, both to the data-source statistics.
* Compaction cannot run at the same time as a checkpoint: the problem is
that checkpoints review page reconciliation information and checkpoints
update page reconciliation information. Lock out checkpoints while
compaction is running.
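The "checkpoint=(wait=X)" and "statistics_log=(wait=X)" settings in the 1.5.0
notes above are entries in the configuration string passed to wiredtiger_open.
As a rough illustration of how such a string is put together, here is a
minimal Python sketch. The helper name build_config and the wait values are
hypothetical placeholders for this example and are not part of WiredTiger.

```python
def build_config(**groups):
    """Join options into a string such as 'checkpoint=(wait=60)'.

    Each keyword argument is a top-level key. A dict value becomes a
    parenthesized sub-group; anything else is written as key=value.
    This helper is illustrative only, not a WiredTiger function.
    """
    parts = []
    for key, value in groups.items():
        if isinstance(value, dict):
            inner = ",".join(f"{k}={v}" for k, v in value.items())
            parts.append(f"{key}=({inner})")
        else:
            parts.append(f"{key}={value}")
    return ",".join(parts)


# Placeholder values standing in for the "X" in the notes above.
config = build_config(checkpoint={"wait": 60}, statistics_log={"wait": 30})
print(config)  # checkpoint=(wait=60),statistics_log=(wait=30)
```

The resulting string would be supplied as the configuration argument of
wiredtiger_open; the values to use depend on the application.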
WiredTiger release 1.4.2, 2013-01-14
------------------------------------
[#387] Fast-path "S" and "u" formats in cursor.get_key and cursor.get_value.
[#407] Allow non-conflicting updates to complete concurrently.
[#418] Add code to prioritize eviction of pages that are larger than a
certain threshold. This avoids taking a performance hit when a huge page
needs to be reconciled. Add a new memory_max_page configuration option.
[#419] If a page splits, it potentially creates a merge-split internal page
and we potentially walk that page during fast-delete. The WT_REF.addr field
doesn't point to a cell in that case and we'll drop core.
[#424] Add clarification wording for boolean configuration strings.
[#425] Perform checkpoints in the calling thread, don't block eviction: when
evicting in a file that is being checkpointed, only evict clean pages. Also
do compaction in the calling thread instead of interrupting the eviction
thread to do the work.
[#426] Fixes for automake 1.3.x. Allow examples to run in parallel: give
each a unique home directory.
Make the tree build without HAVE_VERBOSE.
Fix some issues with LSM rename and add a Python test.
Track when cursors refer to memory returned by WiredTiger, copy it if
required before dropping hazard pointers that might be protecting it.
Verify shouldn't ever modify the file -- don't bother checking for dirty
pages, just discard everything.
When rolling forward to resolve key prefix compression, don't copy the key,
we only need a reference to it, should speed up tables with lots of key
prefix compression.
Requested changes for the WT_COMPRESSOR::compress_raw method API: pass in the
configured object's page size as a convenience, and if
WT_COMPRESSOR::pre_size is set, use it to determine the size of the
destination buffer, rather than using the object's page size as the maximum
needed.
WiredTiger release 1.4.1, 2012-12-12
------------------------------------
This is a bugfix, cleanup and performance tuning release. The significant
changes are highlighted below:
[215] Add a __wt_panic function that shuts down all of the WiredTiger APIs.
Also add a new error return WT_PANIC which means there has been an error
in the WiredTiger engine, and it should be restarted.
[409] Fix a bug populating column groups with complex schema. Also allow empty
column lists in projection cursors.
[150] Add description of how to do index-only searches to the documentation.
[392] Move examples/c/ex_test_perf.c to bench/wtperf.
[322] Add support for statistics on schema-level objects, i.e., tables,
column groups, indices.
* Enhance statistics, including changing the name of some statistics.
* Fix a bug in the eviction server that could cause it to abort, leaving the
system unusable.
WiredTiger release 1.4.0, 2012-12-03
------------------------------------
This release adds several major new features, a number of performance
improvements and bug fixes. The significant changes are outlined below:
New features and API changes:
[242] Track the percentage of cache that is dirty, trigger eviction to bound
it. This can be used to bound how much data checkpoints write.
[324] Add support for WT_COMPRESSOR::compress_raw, which lets the compression
routine select how many rows are included in each disk block.
[381] Add statistics to track read and write amplification (application data
size versus I/O size)
* Add a trigger configuration option to WT_SESSION::compact API.
* Make WT_SESSION::create's checksum configuration 3-state: on, off, or
uncompressed blocks only.
Bug fixes:
* Fix build issues on Solaris.
* Fix a bug calculating the generation of an LSM merge.
* Fix WiredTiger dump and load for tables.
* Fix a memory leak in checkpoints.
* Improve accuracy of cache memory tracking with overflow items.
WiredTiger release 1.3.8, 2012-11-22
------------------------------------
This release improves the performance of LSM trees, changes how statistics are
reported and adds a shared cache implementation:
New features and API changes:
[232] Add a "size of checkpoint" statistic.
* Add a shared cache pool implementation. Manages a single cache among
multiple databases within a process.
* Merge statistics from file and LSM sources into a "data source" statistic
structure. Rename and regroup some shared statistics. Add a helper to
the Python API to look up in a cursor in a simple expression.
* Add support for sub groups of options in configuration strings.
Performance tuning for LSM trees:
* Don't try to merge with a chunk that is much larger than a small chunk.
* After an LSM merge, fault in some pages before the new tree goes live to
avoid stalling application threads.
* Don't automatically fail inserts if the write generation check fails:
compare keys instead.
* Switch the LSM tree lock to a read/write lock, so cursors can read the
state of the tree in parallel.
Bug fixes:
* Fix a bug where we could write past the end of a buffer after it was grown.
WiredTiger release 1.3.7, 2012-11-09
------------------------------------
This release fixes a bug and improves performance with Bloom filters:
* Drop any old Bloom filter before creating a new one -- we may have been
interrupted in between creating it and updating the metadata. Write the
metadata after creating missing Bloom filters.
* Use a separate thread for creation of Bloom filters for the newest,
unmerged LSM chunks.
* Changes to the ex_test_perf example: change the default configuration to
4KB pages and disable prefix compression. Change the "-i" command line
option to be a simple count of records to insert. Clean up error
handling and add option to populate using multiple threads.
* Clarify the docs for the default buffer_alignment setting.
WiredTiger release 1.3.6, 2012-11-06
------------------------------------
This is a bugfix and performance tuning release. The changes are as follows:
* Rename the WiredTiger installed modules to libwiredtiger_XXX. Don't install
the nop and reverse collator modules.
* Replace test/format's bzip configuration string with compression, which can
take one of four arguments (none, bzip, ext, snappy), change format to run
snappy compression if the library is available.
* Rename the builtin block compressor names from "bzip2_compress" to "bzip2",
and from "snappy_compress" to "snappy".
* Support multiple LSM merge threads with the "lsm_merge_threads" config key.
Use IDs rather than array index to mark the start chunk in a merge, in case
we race with another thread.
* Cache the hash values used for Bloom filter lookups, rather than hashing for
each Bloom filter in an LSM tree.
* Only switch trees in an LSM cursor if the primary chunk is on disk.
* Add a per-btree cache priority, currently only used to make it more likely
for Bloom filter pages to stay in cache.
* Only evict pages with read generations in the bottom quarter of the range we
see. Fix a 32-bit wrapping bug in assigning read generations.
* For update-only LSM cursors, only open a cursor in the primary chunk.
* LSM: Report errors from the checkpoint thread.
* LSM: only save a Bloom URI in the metadata after it is successfully created.
* LSM: Create missing Bloom filters when reading from an LSM tree if
"lsm_bloom_newest" is set.
* LSM: Include all of the chosen chunks in a merge. Only pin the current chunk
in an LSM cursor if it is writeable.
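The 1.3.6 item about caching hash values for Bloom filter lookups can be
pictured as follows: the key is hashed once, and the same pair of hash values
is then used to probe every chunk's filter. This is a minimal sketch under
stated assumptions, not WiredTiger's implementation. The class and function
names, the use of SHA-256, and the (h1 + i * h2) probing scheme are all
choices made for this example.

```python
import hashlib


def key_hashes(key):
    """Hash the key once and return two 64-bit values (illustrative choice)."""
    digest = hashlib.sha256(key).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big")
    return h1, h2


class BloomFilter:
    """A small Bloom filter that accepts precomputed hash values."""

    def __init__(self, nbits, nhashes):
        self.nbits = nbits
        self.nhashes = nhashes
        self.bits = bytearray((nbits + 7) // 8)

    def _positions(self, hashes):
        # Derive all probe positions from the two cached hash values.
        h1, h2 = hashes
        return [(h1 + i * h2) % self.nbits for i in range(self.nhashes)]

    def insert(self, hashes):
        for pos in self._positions(hashes):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain(self, hashes):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(hashes)
        )


# Three hypothetical chunks, each with its own filter.
chunks = [BloomFilter(1024, 4) for _ in range(3)]
chunks[1].insert(key_hashes(b"apple"))

cached = key_hashes(b"apple")  # hashed once for the whole lookup
hits = [f.may_contain(cached) for f in chunks]
print(hits)  # [False, True, False]
```

The saving comes from hashing the key once per lookup instead of once per
filter, which matters more as the number of chunks in a tree grows.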
WiredTiger release 1.3.5, 2012-10-26
------------------------------------
This is a bugfix and performance tuning release. The changes are as follows:
[#370] Document that applications are responsible for figuring out their
upgrade path if they might swap out compression engines.
[#371] When a single session was used to reconcile multiple btrees, one of
which had dictionaries configured and one of which didn't, we failed to
clear the dictionary when starting page reconciliation. Be consistent,
never use anything other than the btree handle's configuration to decide
if we're using a dictionary in a reconciliation run.
[#372] Fix several potential integer overflow bugs.
[#373] Fix a bug where calls that performed an operation on multiple objects
(such as creating a table that implicitly creates a column group)
could leave the metadata incomplete if a process exited without
calling `WT_CONNECTION::close`.
Hold the schema lock while opening tables. Fixes the "cannot be
opened until all column groups are created" error when create calls
race with open_cursor.
[#374] Fix a race that caused crashes when using the Python API with
multi-threaded code.
[#375] Fix a bug in __wt_cond_wait so that it returns after the timeout expires.
* Protect the list of LSM trees with the schema lock to avoid races during
create.
* Update ex_test_perf to output statistics during populate and improve timing
accuracy.
* Skew eviction in favor of leaf pages - which improves read-only performance
for large LSM trees.
* Hold the LSM tree lock while gathering statistics.
* Fix a bug in bulk load of bitmap files.
* Fix a related bug in the bloom code that uses bitmap stores.
* Don't attempt to drop the first chunk of an LSM tree before creating it.
* Instead of entering a fake key cell after the last cell on the page just
in case the page ends with a key cell which has no value, use the end of
the page to detect that case.
* Cache cursor key/value formats in Python, to save a native call from every
get_key/value.
* Don't sync the directory after open if the global "sync" flag is false.
* Fix a race for LSM trees that could happen if two threads race to open a
cursor and drop the LSM tree.
WiredTiger release 1.3.4, 2012-10-19
------------------------------------
This release includes several important new features, including:
* support for online compaction of files;
* support for tables, column groups and indices that use LSM trees for
storage; and
* improved statistics and configuration for LSM trees and Bloom filters.
In addition, there are some significant performance improvements and bug
fixes. The full list of changes is:
[#248] Add support for online compaction.
[#310] Fixed a bug where overflow blocks could be accessed by a
long-running reader after they had been freed in a checkpoint.
[#358] Allocate checkpoint blocks from the live system's list of available
blocks rather than always extending the file.
[#361] Sync the directory after creating a file: this is apparently
required for durability on Linux, according to the Linux fsync man
page.
[#362] Don't check if a page is on the avail or discard lists if we're
salvaging the file, that is okay.
[#363] Remove obsolete code dealing with forced eviction.
[#366] Fake checkpoints may have the delete flag set, ignore them when
rolling checkpoints forward.
[#367] All metadata reads should ignore the application's transactional
context.
[#369] Support LSM as a data source for tables, column groups and indices.
* Add tuning options for LSM bloom filters, including controlling whether
the oldest level in the tree has a Bloom filter, whether newly-created
(level 0) files have Bloom filters, and passing arbitrary file
configuration for Bloom filters.
* Add a merge generation to LSM chunks. Add a statistic that reports the
highest merge generation in a tree.
* Add a new LSM statistic tracking searches that could benefit from bloom
filters.
* Enable LSM statistics in the "wt stat" utility.
* Interrupt LSM merge operations, rather than waiting on close.
* Wait for a while before looking for LSM major merges, in case merges
catch up with inserts.
* Fix LSM index searches. The main issue was LSM search_near was not
always returning the closest key to the search key, which calling code
expects. It now tries hard to find the smallest cursor larger than the
search key, and only if no larger record exists does it return the
largest record smaller than the search key.
* Reset any old cursor position before an LSM search. This limits hazard
references in an LSM search to a single chunk.
* Fix a memory leak in an error path in Bloom filters.
* Tweak the search loops in hazard_{set,clear} in favor of
last-in-first-out ordering.
* If there are many files open, some hotter than others, walk more files
looking for pages to evict.
* Don't stop evicting until we reach the target, have eviction wake up
periodically regardless of whether the application signals it. This
latter requires a "timed condition wait" operation.
* Tweaks to file handle flags for out-of-cache read performance on Linux
(disable readahead and access time updates).
* Replace the WT_SESSION::dumpfile method with configuration strings to
WT_SESSION::verify.
* Fix a bug where we weren't skipping unnecessary default checkpoints
because we weren't handling the generational number included in the
internal checkpoint name.
* Add a "force" configuration flag to WT_SESSION::checkpoint, object
compaction needs it because the work it wants done is done by the block
manager.
* Make compact and checkpoint operate on a table's indices.
* When doing a page truncate, lock down the page before we unpack the
on-page cell -- it's possible the page could be instantiated, modified
and reconciled while we're sleeping, in which case the WT_REF.addr field
would no longer point on-page.
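The search_near behavior described in the 1.3.4 LSM index fix above (prefer
the smallest record larger than the search key, and fall back to the largest
smaller record only when nothing larger exists) can be sketched over a single
sorted list. This is only an illustration of the preference order: a real LSM
search has to combine results from several chunks, and the function name and
return convention here are assumptions made for the example.

```python
import bisect


def search_near(sorted_keys, target):
    """Return (key, exact) using the preference order described above.

    exact is 0 for an exact match, 1 if the returned key is larger than
    target, and -1 if it is smaller. Returns None for an empty list.
    """
    if not sorted_keys:
        return None
    i = bisect.bisect_left(sorted_keys, target)
    if i < len(sorted_keys):
        key = sorted_keys[i]
        return (key, 0) if key == target else (key, 1)
    # Nothing at or above target: fall back to the largest smaller key.
    return (sorted_keys[-1], -1)


keys = [10, 20, 30]
print(search_near(keys, 20))  # (20, 0)
print(search_near(keys, 15))  # (20, 1)
print(search_near(keys, 35))  # (30, -1)
```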
WiredTiger release 1.3.3, 2012-10-11
------------------------------------
This is a bugfix and performance tuning release, primarily related to LSM
trees. The changes are as follows:
[#350] Checkpoint the metadata after successful schema-level operations.
Otherwise, if the process exits without closing the connection or
running a checkpoint, created objects exist but there is no record
in the metadata.
[#351] Don't put checkpoint extent blocks on the available list, blocks on
it are considered for truncation; they have to go on the "checkpoint
available" list.
* Choose LSM merges based on a measure of efficiency (levels collapsed per
record), rather than simply choosing a minor or a major merge. Tweak the
merge heuristic so we don't end up with runs of smaller chunks in the
middle of the tree.
* Add a connection-wide flag to disable LSM merges.
* Don't create Bloom filters for the oldest chunk in the system. Add the
ability to disable Bloom filters entirely.
* Fix fast-path for bit values in WT_CURSOR::set_value.
* Clean up allocation of LSM chunk IDs.
* Update bloom_get so that it doesn't hold a cursor position.
* Respect the page size for fixed-length column stores, remembering there
are 8 bits per byte.
* Support bulk loading a bitmap into a fixed-length column store, update
Bloom filter code to use this.
* Add an example program, ex_test_perf, to demonstrate basic LSM usage.
* Add a new statistics cursor type "statistics:lsm". Update ex_stat.c to
demonstrate usage.
* Add a statistics_fast flag to file statistics cursors. Update LSM
statistics so that they aggregate some cache statistics. Add ability to
open a statistics cursor on a checkpoint.
* Walk a constant number of pages for LRU eviction.
* Move the cache full check to after an update operation completes, when it
is no longer holding hazard references. This improves behavior with
small caches.
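The 1.3.3 note about respecting the page size for fixed-length column stores
comes down to simple arithmetic: entries are measured in bits, pages in bytes.
A minimal worked sketch follows. The function name and the header_bytes
parameter are hypothetical, and the calculation ignores any real on-page
overhead unless one is passed in.

```python
def entries_per_page(page_size_bytes, bits_per_entry, header_bytes=0):
    """Number of fixed-length entries that fit in one page.

    Assumes entries are between 1 and 8 bits wide; header_bytes is a
    hypothetical allowance for per-page overhead.
    """
    if not 1 <= bits_per_entry <= 8:
        raise ValueError("bits_per_entry must be between 1 and 8")
    usable_bits = (page_size_bytes - header_bytes) * 8  # 8 bits per byte
    return usable_bits // bits_per_entry


print(entries_per_page(4096, 1))  # 32768
print(entries_per_page(4096, 3))  # 10922
```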
WiredTiger release 1.3.2, 2012-10-03
------------------------------------
This is a bugfix and performance tuning release, primarily related to LSM
trees. The changes are as follows:
* Implement minor merges for LSM trees, prefer them to major merges.
* Update hazard references, so the active array grows as needed. Change
the default hazard_max to 1000.
* Abort transactions if the cache is so full that they cannot make
progress.
* Fix a bug where verify could crash if an empty checkpoint exists.
* Make the maximum number of chunks for merges configurable, rather than
deriving a value from the number of hazard references available.
* Switch to an atomic add to allocate transaction IDs. This fixes a subtle
race before where two threads could temporarily have the same ID in the
global state table. If one of the threads timed out and the other thread
committed its transaction with that ID, the commit would not become
visible immediately. This could lead to deadlock errors in workloads
that are logically conflict-free.
* Have auto-commit transactions retry deadlocks. This requires that we
keep the user's key and value in the cursor.
* Simplify the code handling updated records in variable-length
column-store reconciliation.
* Never wait for eviction when holding the schema lock. This avoids
deadlocks between opening a column store file and taking a checkpoint.
* Take care with the loop termination when walking files for eviction. We
were making one extra call into __wt_tree_walk, which would leave a leaf
page in the WT_REF_EVICT_WALK state, unable to be evicted. In some
workloads, including LSM loads, we could end up with many files all
consisting of a single leaf page, none of which could be evicted.
* Pause updates when the cache is full.
* In files marked as "out of cache", don't wait for eviction when reading a
page.
* Fix the record count calculation for minor merges. This was leading to
no Bloom filter being created for minor merges after running for some
time, leading to merges taking increasingly long to complete.
* Only sleep in the LSM checkpoint thread if no work is done.
* Add sanity check of cache size to LSM open.
[#338] Create fake checkpoints until an object is modified, so that a
checkpoint between the cursor create and the bulk load doesn't make
it impossible to do a bulk-load on the cursor.
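The 1.3.2 change to allocate transaction IDs with an atomic add is about
making "read the counter and increment it" a single indivisible step, so two
threads can never receive the same ID. Python has no fetch-and-add primitive,
so the sketch below emulates one with a lock. It illustrates the uniqueness
guarantee only and is not how WiredTiger implements it; all names are
hypothetical.

```python
import threading


class TxnIdAllocator:
    """Hands out unique, increasing IDs to concurrent callers."""

    def __init__(self, first_id=1):
        self._next = first_id
        self._lock = threading.Lock()

    def allocate(self):
        # Read and increment as one indivisible step (emulated fetch-and-add).
        with self._lock:
            txn_id = self._next
            self._next += 1
            return txn_id


def allocate_concurrently(nthreads, per_thread):
    """Allocate IDs from several threads and return them all in one list."""
    alloc = TxnIdAllocator()
    results = [None] * nthreads  # one slot per thread, no sharing

    def worker(slot):
        results[slot] = [alloc.allocate() for _ in range(per_thread)]

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [txn_id for ids in results for txn_id in ids]


ids = allocate_concurrently(4, 1000)
print(len(ids) == len(set(ids)))  # True: no ID was handed out twice
```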
WiredTiger release 1.3.1, 2012-09-25
------------------------------------
This is a bugfix release, primarily related to LSM trees. The changes are
as follows:
[#309] Implement auto-commit of transactions at the API. As well as
ensuring the atomicity of complex operations, this change simplified
code that simulated auto-commit internally and fixed a number of
bugs.
[#321] Bulk-cursors no longer block checkpoints. We can't write files that
are being bulk-loaded, so change checkpoint to create checkpoints in
the metadata that, if accessed, look like empty files.
Tighten down the requirements for bulk-load, the only thing that can
be bulk-loaded now is a newly created tree, not any empty file.
[#329] Add dictionary support to variable-length column store objects.
Support large row-store reconciliation dictionaries: add a skiplist
as the indexing mechanism.
[#333] Fix a leak of the in-memory transaction log structure and the LSM
data source handle.
[#334] Fix a memory leak where a page's replacement address wasn't being
freed.
* Check that LSM trees are not configured as column stores.
* Fix a race when starting the LSM worker thread. It was possible for the
thread to exit immediately if it started fast enough.
* Two fixes for LSM: one to ensure that cursors read from a checkpoint if
one is available, the other to reduce the number of empty chunks that can
be created initially.
* Fix a bug that disabled bloom filters.
* The configure script checks for Python support in SWIG.
* If a drop operation fails to acquire all of the handle locks it needs,
make sure it releases the primary handle lock.
* Fix a number of other minor bugs and memory leaks.
WiredTiger release 1.3.0, 2012-09-17
------------------------------------
This release contains a number of major new features, including:
* support for LSM trees with Bloom filters;
* support for hot backups; and
* support for fast truncation of files.
In addition, there are some critical bug fixes. We recommend that all users
upgrade. Here is the full list of changes:
[#143] Implement random record lookups.
[#168] Add support for LSM trees.
[#168] Add support for Bloom filters in LSM trees.
[#198] Handle page-generation wraparound.
[#236] Implement hot backups.
[#244] Index cursors for column-store objects may not be created using the
record number as the index key.
[#247] Add a fast-path for WT_SESSION::truncate that avoids reading most
data to be deleted.
[#259] Performance hack for cursor open: don't parse the configuration
strings for a default value if the application didn't specify a
configuration string.
[#262] Disable dump on child cursors: only the top-level cursor is wrapped
in a dump cursor.
[#266] Deal with new / dropped indices in __wt_schema_open_index.
[#269] Checkpoint handles must not be open when they are overwritten.
[#271] Add support for a reserved checkpoint name "WiredTigerCheckpoint"
that opens the object's last checkpoint.
[#271] Add the ability to access unnamed checkpoints.
[#274] Change cursor.equals to return a standard error value and store the
cursor equality result in a separate argument.
[#275] If exclusive handle is required for an operation and it is not
available, fail immediately: don't block.
[#276] Fix methods that return integer parameters from Python. This
includes cursor.equals and cursor.search_near.
[#277] Acquire the schema lock when creating the metadata file. We're
single-threaded, so it isn't protecting against anything, but the
handle management code expects to have the schema lock.
[#279] Some optimizations for __wt_config_gets_defno. Specifically, if
we're dealing with a simple stack of config strings, just parse the
application string rather than the full list of defaults.
[#279] Split the description string into a set of structures, to reduce the
number of string comparisons and manipulation that's required.
[#282] Remove the cursor.reconfigure method, and replace it with
documentation showing how to "reconfigure" cursors using the
session.open_cursor method to duplicate them with different
configuration strings.
[#284] Fix for a hazard reference race, where page eviction races with the
creation of the hazard reference, we have to check the pointer
itself as well as the state of the pointer.
[#285] We can clear the tree's modified flag on checkpoint, as long as the
checkpoint writes all modifications. Clear the tree's modified
flag before we start the checkpoint, but reset it as necessary if
reconciliation is unable to write all of the changes in a page.
[#287] Fix __wt_config_check to handle overlapping config values correctly.
[#289] Add support for read-committed isolation, make it the default. Add
a session-level "isolation" setting.
[#294] If txn_commit fails, document the transaction was rolled-back.
[#295] Expand the documentation on using cursors without explicit
transactions.
[#300] Include all changes whenever closing a file, don't check for
visibility. If updates are skipped while evicting a page, give up.
[#305] Have "wt dump" fail more gracefully if the object doesn't exist.
[#310] When freeing a tracked address in reconciliation, clear it to avoid
freeing the same address again on error.
[#314] Replace cursor.equals with cursor.compare.
[#319] Clear the bulk_load_ok flag when closing handles.
* Add an "ancient transaction" statistic so we can find out if they're
actually occurring in the field.
* Add a "was object ever modified" flag to the btree handle, and use it to
avoid writing read-only objects during internal checkpoints, issue
* Add per-connection statistics counters for transaction checkpoint, begin,
commit and rollback. Add per-btree statistics counters for update
conflicts.
* Another fixed-length column-store implicit record fix: if the earliest
row in the object is row 10, and it's on an append list, we still must
return rows 1-9, they've been implicitly created.
* Bulk cursors: disallow cursor.{equals,next,prev,reset,search,
search_near,update,remove}; only close and insert are supported.
* Change session.truncate to support any cursor position for range
truncation, not just keys that are known to exist.
* Checkpoint has to flush the metadata file, but only after it's flushed
all of the other files.
* Discard obsolete WT_UPDATE structures during updates.
* Document that duplicated cursors are positioned at the same point as the
cursor that was duplicated.
* Fix a (very unlikely) deadlock at startup, if an application issues a
checkpoint before the eviction server has managed to open its session.
* Fix a core dump if we verify a file that's corrupted such that we are
unable to load any checkpoints at all, and the per-checkpoint bit map is
never set.
* If a page selected for eviction cannot be freed because it has some
recent updates, try instead to free memory by trimming old updates.
* If a thread fails to evict a page, try to bump its snapshot. This avoids
the common case of read-committed threads getting stuck because one
thread falls behind (e.g., because we can't evict during a checkpoint).
* If an exclusive table create fails, return EEXIST.
* If we try to remove a file that doesn't exist, don't complain, return
success.
* If we're repeatedly taking a checkpoint with the same name, skip the work
for read-only objects.
* Instead of flagging the empty tree's leaf page empty as part of creating
an empty tree in memory, set the page as modified (to force
reconciliation); if the leaf page is still empty at that time, then we'll
figure it out during that reconciliation. This fixes a memory leak where
the leaf page of an empty tree wasn't being freed.
* It's not unreasonable to open a cursor on a non-existent table, don't
complain, just return not-found.
* Move dist/RELEASE to the top level of the tree.
* Optimization: don't repeatedly look up btree handles for schema
operations.
* Return keys from all operations: don't keep pointing to the application's
key.
* Update btree usage of 64 bitstring implementation, so it's cleaner.
* Update the bitstring implementation to use 64 bit length strings.
* Updates performed without an active transaction should become visible
with the current transaction ID.
* Upgrade to doxygen 1.8.x
* Use a real snapshot transaction for checkpoints. Otherwise, the snapshot
can be updated in between checkpointing multiple files (when updating the
metadata).
WiredTiger release 1.2.2, 2012-06-20
------------------------------------
This is a bugfix release. The changes are as follows:
* Defer making free pages available until the end of a checkpoint, in case
there is a failure after processing some files.
* When checking the value of the "isolation" key, don't assume it is NUL
terminated. This bug could cause transactions to run with incorrect
isolation.
* Fix two bugs with snapshot isolation:
1. reset the isolation level when the transaction completes;
2. when checking visibility, check item's ID against the maximum snapshot ID
(not the transaction's ID).
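The "isolation" fix above concerns config values that are held as a pointer
plus a length rather than as NUL-terminated strings. A minimal sketch of a
length-aware comparison follows. The names `config_item` and
`config_item_match` are hypothetical, chosen for illustration only; they are
not WiredTiger's internal API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical length-delimited config value: str is NOT NUL-terminated. */
struct config_item {
	const char *str;
	size_t len;
};

/*
 * Return 1 if the item equals the NUL-terminated literal, else 0.
 * The lengths are compared first, so no byte past the end of the item
 * is ever read.
 */
static int
config_item_match(const struct config_item *item, const char *literal)
{
	size_t n = strlen(literal);

	return (item->len == n && memcmp(item->str, literal, n) == 0);
}
```

Comparing with `strcmp` instead would read past the end of the value and into
whatever follows it in the config string, which is the kind of mismatch that
can select the wrong isolation level.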
WiredTiger release 1.2.1, 2012-06-15
------------------------------------
This is a bugfix release. The changes are as follows:
* Avoid a deadlock between eviction and checkpoint on the connection spinlock.
* Allocate "desc" buffers in heap memory so that they are correctly aligned
(fixes direct_io support on Linux).
* Initialize the snapshot-avail list after cleaning it out, otherwise we'll
try to print a NULL pointer in VERBOSE mode.
WiredTiger release 1.2.0, 2012-06-04
------------------------------------
This release contains many bugfixes and improvements. The major changes are:
[#138] Add support for transactions with coarse-grained durability.
Transactions provide atomicity guarantees and rollback, and uncommitted
changes are never written to disk. There is no on-disk log, so
committed changes only become durable when the next checkpoint
completes. Checkpoints are implemented by creating
transactionally-consistent snapshots within data files.
[#156] Fully support operations that make schema changes with multiple
sessions open concurrently.
[#159] Disable internal page key suffix compression if a custom collator is
configured. This avoids issues with collators that require complete
keys.
[#167] Add support for durable snapshots within files. While a snapshot is
active, the pages used by the snapshot will not be overwritten. If a
file is accessed after a crash or application exit without calling
WT_CONNECTION::close, any changes made after the last snapshot will be
silently ignored.
[#214, #216]
Fixes for forcing eviction with small caches.
WiredTiger release 1.1.5, 2012-04-26
------------------------------------
Don't update a WT_REF after it has been unlocked.
Add an operation to set a flag atomically, use it to avoid racing on page flags.
Fix a race between sync and reading that could cause a segfault.
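The atomic flag operation mentioned in this release can be sketched as
follows. This is an illustration only: it uses C11 `<stdatomic.h>` as a
stand-in for whatever primitive the library uses, and `struct page`, the
`PAGE_FLAG_*` values and the `page_flag_*` helpers are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical flag bits sharing a single word. */
#define PAGE_FLAG_A 0x01u
#define PAGE_FLAG_B 0x02u

/* Hypothetical page structure with an atomically updated flags word. */
struct page {
	_Atomic uint32_t flags;
};

/* Set bits with one atomic read-modify-write; other bits are preserved. */
static void
page_flag_set(struct page *p, uint32_t mask)
{
	atomic_fetch_or(&p->flags, mask);
}

/* Clear bits with one atomic read-modify-write. */
static void
page_flag_clear(struct page *p, uint32_t mask)
{
	atomic_fetch_and(&p->flags, ~mask);
}

/* Test whether any of the bits in mask is currently set. */
static int
page_flag_isset(struct page *p, uint32_t mask)
{
	return ((atomic_load(&p->flags) & mask) != 0);
}
```

With a plain `flags |= mask`, two threads updating different bits of the same
word can overwrite each other's store; the atomic read-modify-write avoids
that lost update.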
WiredTiger release 1.1.4, 2012-04-16
------------------------------------
Check the versions of autoconf, automake and libtool to avoid failures when
trying to build from the github tree with versions that are too old.
[#191] Create the schema table as part of creating the environment so that
application threads don't race trying to create it later.
[#193] Split-merge pages have to be reconciled to mark their parents dirty
[#194] The dump utility should only output configuration that can be passed to
WT_SESSION::create.
Eviction fixes for out-of-cache update workloads:
* Fix an unlikely bug where the EVICT_LRU flag was cleared when a page in
the LRU queue was overwritten with itself during a walk. This led to an
assertion failure when the page was later evicted.
* Clear all unused eviction queue entries while holding the lru_lock.
* Split WT_PAGE->flags so that there is no possibility of racing:
(1) Move WT_PAGE_REC_* flags into WT_PAGE_MODIFY;
(2) Use atomic operations to set and clear the two remaining page flags.
Move the test/format threads setting into the CONFIG file.
WiredTiger release 1.1.3, 2012-04-04
------------------------------------
Fix the "exclusive" config for WT_SESSION::create. [#181]
1. Make it work for files within a single session.
2. Make it work for files across sessions.
3. Make other data sources consistent with files.
Fix an eviction bug introduced into 1.1.2: when evicting a page with children,
remove the children from the LRU eviction queue. Reduce the impact of clearing
a page from the LRU queue by marking pages on the queue with a flag
(WT_PAGE_EVICT_LRU).
During an eviction walk, pin pages up to the root so there is no need to spin
when attempting to lock a parent page. Use the EVICT_LRU page flag to avoid
putting a page on the LRU queue multiple times.
Layer dump cursors on top of any cursor type.
Add a section on replacing the default system memory allocator to the tuning
page.
Typo in usage method for "wt write".
Don't report range errors for config values that aren't well-formed integers.
WiredTiger release 1.1.2, 2012-03-20
------------------------------------
Add public-domain copyright notices to the extension code.
test/format can now run multi-threaded, fixed two bugs it found:
(1) When iterating backwards through a skiplist, we could race with an insert.
(2) If eviction fails for a page, we have to assume that eviction has unlocked
the reference.
Scan row-store leaf pages twice when reading to reduce the overhead of the
index array.
Eviction race fixes:
(1) Call __rec_review with WT_REFs: don't look at the page until we've checked
the state.
(2) Clear the eviction point if we hit it when discarding a child page, not
just the parent.
Eviction tuning changes, particularly for read-only, out-of-cache workloads.
Only notify the eviction server if an application thread doesn't find any pages
to evict, and then only once.
Only spin on the LRU lock if there might be pages in the LRU queue to evict.
Keep the current eviction point in memory and make the eviction walk run
concurrent with LRU eviction.
Every test now has err/out captured, and it is checked to ensure it is empty at
the end of every test.
WiredTiger release 1.1.1, 2012-03-12
------------------------------------
Default to a verbose build: that can be switched off by running `configure
--enable-silent-rules`.
Account for all memory allocated when reading a page into cache. Total memory
usage is now much closer to the cache size when using many small keys and
values.
Have application threads trigger a retry of forced page eviction rather than
blocking eviction. This allows rec_evict.c to simply set the WT_REF state to
WT_REF_MEM after all failures, and fixes a bug where pages on the forced
eviction queue would end up with state WT_REF_MEM, meaning they could be chosen
for eviction multiple times.
Grow existing scratch buffers in preference to allocating new ones.
Fix a race between threads reading in and then modifying a page.
Get rid of the pinned flag: it is no longer used.
Fix a race where btree files weren't completely closed before they could be
re-opened. This behavior can be triggered by using a new session on every
operation (see the new -S flag to the test/thread program). [#178]
When connections are closed, create a session and discard the btree handles.
This fixes a long-standing bug in closing a connection: if for any reason there
are btree handles still open, we need a real session handle to close them.
Really close btree handles: otherwise we can't safely remove or rename them.
Fixes test failures in test_base02 (among others).
Wait for application threads in LRU eviction to drain before walking a file.
Fix a buffer size calculation when updating the root address of a file.
Documentation fix: 10% of 1MB is 100KB.
WiredTiger release 1.1.0, 2012-02-28
------------------------------------
Add checks to the session.truncate method to ensure the start/stop
cursors reference the same object and have been initialized.
Implement cursor duplication via WT_SESSION::open_cursor. [#161]
Switch to quiet builds by default.
Fix builds with automake versions earlier than 1.11; use foreign mode so that
fewer top-level files are required.
If a session or connection method is about to return WT_NOTFOUND (some
underlying object was not found), map it to ENOENT; only cursor methods
return WT_NOTFOUND. [#163]
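The mapping described in the entry above amounts to a small translation at the
API boundary. A minimal sketch, assuming a hypothetical helper `api_ret_map`
and an arbitrary stand-in value `EXAMPLE_NOTFOUND` (not the library's actual
definition of WT_NOTFOUND):

```c
#include <errno.h>

/* Arbitrary stand-in for WT_NOTFOUND; the real value is not assumed here. */
#define EXAMPLE_NOTFOUND (-1000)

/*
 * Hypothetical wrapper for session and connection methods: translate the
 * internal not-found code into ENOENT and pass everything else through.
 * Cursor methods would skip this step and return the not-found code as-is.
 */
static int
api_ret_map(int ret)
{
	return (ret == EXAMPLE_NOTFOUND ? ENOENT : ret);
}
```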
Save and restore session->btree in schema ops to simplify calling code.
[#164]
Note the wiredtiger_open config string "multiprocess" is not yet
supported.
Move "root:F" and "version:F" entries for files into the value for
"file:F", so there is only a single record per file.
[NOTE: SCHEMA CHANGE]
When parsing config strings, continue to the end of the string in case
of repeated keys. [#124]
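The repeated-key behavior in the entry above can be sketched for a flat
"key=value,key=value" string. This is a simplified illustration, not the
library's parser: `config_get_last` is a hypothetical name, nested values and
quoting are ignored, and it assumes that a later occurrence of a key overrides
an earlier one.

```c
#include <stddef.h>
#include <string.h>

/*
 * Find the LAST occurrence of key in a flat, comma-separated config string.
 * On success return 0 and set *valp and *lenp to the value (which is not
 * NUL-terminated); return -1 if the key does not appear.
 */
static int
config_get_last(const char *config, const char *key,
    const char **valp, size_t *lenp)
{
	size_t klen = strlen(key);
	const char *p = config;
	int found = 0;

	while (*p != '\0') {
		/* The current entry runs up to the next comma or the end. */
		const char *end = strchr(p, ',');
		if (end == NULL)
			end = p + strlen(p);

		/* Match "key=" at the start of the entry. */
		if ((size_t)(end - p) > klen &&
		    strncmp(p, key, klen) == 0 && p[klen] == '=') {
			*valp = p + klen + 1;
			*lenp = (size_t)(end - *valp);
			found = 1;	/* Keep scanning: a later entry wins. */
		}
		p = (*end == ',') ? end + 1 : end;
	}
	return (found ? 0 : -1);
}
```

Stopping at the first match would return the stale value whenever a key is
given twice, for example when a default string and a user override are
concatenated.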
Don't require shared libraries unless Python is configured.
Add support for direct I/O, with the config "direct_io=(data,log)".
Build with _GNU_SOURCE on Linux to enable O_DIRECT.
Don't keep the last page of column stores pinned: it prevented eviction
of large trees created from scratch.
Allow application threads to evict pages from any tree: maintain a count
of threads doing LRU in each tree and wait for activity to drain when
closing.