#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

Linear Store issues:

Current/pending:
================
Q-JIRA  RHBZ     Description / Comments
------  -------  ----------------------
5359    -        Linearstore: Implement new management schema and wire into store
5360    -        Linearstore: Evaluate and rework logging to produce a consistent log output
5361    -        Linearstore: No tests for linearstore functionality currently exist
                   svn r.1564893 2014-02-05: Added tx-test-soak.sh
                   svn r.1564935 2014-02-05: Added license text to tx-test-soak.sh
                   * No existing tests for linearstore:
                   ** Basic broker-level tests for txn and non-txn recovery
                   ** Store-level tests which check write boundary conditions
                   ** EFP tests, including file recovery, error management
                   ** Unit tests
                   ** Basic performance tests
5362    -        Linearstore: No store tools exist for examining the journals
                   svn r.1556888 2014-01-09: WIP checkin for linearstore version of qpid_qls_analyze. Needs testing and tidy-up.
                   svn r.1560530 2014-01-22: Bugfixes for qpid_qls_analyze
                   svn r.1561848 2014-01-27: Bugfixes and enhancements for qpid_qls_analyze
                   svn r.1564808 2014-02-05: Bugfixes and enhancements for qpid_qls_analyze
                   * Store analysis and status
                   * Recovery/reading of message content
                   * Empty file pool status and management
5464    -        [linearstore] Incompletely created journal files accumulate in EFP
5484    1035843  Slow performance for producers
                   svn r.1558592 2014-01-15: Fixes an issue with using /dev/random as a source of random numbers for Journal serial numbers.
                   svn r.1558913 2014-01-16: Replaces use of /dev/urandom with several calls to rand() to construct a 64-bit random number.
                   * Recommend rebuilding and testing for performance again with these two fixes. Marked POST.
-       1039522  Qpid crashes while recovering from linear store around qpid::linearstore::journal::JournalFile::getFqFileName() including enq_rec::decode() threw JERR_JREC_BADRECTAIL
                   * Possible dup of 1039525
                   * May be fixed by QPID-5483 - waiting for needinfo, recommend rebuilding with QPID-5483 fix and re-testing. Marked POST.
-       1039525  Qpid crashes while recovering from linear store around qpid::linearstore::journal::jexception::format including enq_rec::decode() threw JERR_JREC_BADRECTAIL
                   * Possible dup of 1039522
                   * May be fixed by QPID-5483 - waiting for needinfo, recommend rebuilding with QPID-5483 fix and re-testing. Marked POST.
# -     1049870  [LinearStore] auto-delete property does not survive restart
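
Note on svn r.1558913 (QPID-5484 / QPID-5487): the following is a minimal sketch, not the linearstore code, of how a 64-bit number can be constructed from several calls to rand(). The function name random64FromRand and the choice of 15 bits per call are assumptions made for illustration; 15 is used because the C standard only guarantees RAND_MAX >= 32767.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper (not taken from linearstore): builds a 64-bit value
// from several rand() calls, taking the low 15 bits of each call.
uint64_t random64FromRand() {
    const int bitsPerCall = 15;          // bits guaranteed by RAND_MAX >= 32767
    uint64_t value = 0;
    int bitsCollected = 0;
    while (bitsCollected < 64) {
        // Shift earlier bits up and append 15 fresh bits; surplus high bits
        // simply fall off the top of the unsigned 64-bit value.
        value = (value << bitsPerCall) |
                (static_cast<uint64_t>(std::rand()) & 0x7fffU);
        bitsCollected += bitsPerCall;
    }
    assert(bitsCollected >= 64);         // all 64 bits have been filled
    return value;
}
```

The caller is expected to seed the generator once with srand(). The values are pseudo-random and only as good as the platform's rand().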

Fixed/closed (in commit order):
===============================
Q-JIRA  RHBZ     Description / Comments
------  -------  ----------------------
5357    1052518  Linearstore: Empty file recycling not functional
                   svn r.1545563 2013-11-26: Proposed fix. VERIFIED
5358    1052727  Linearstore: Checksums not implemented in record tail
                   svn r.1547601 2013-12-03: Proposed fix. NEEDINFO on algorithm
5387    1036071  Linearstore: Segmentation fault when deleting queue
                   svn r.1547641 2013-12-03: Proposed fix. VERIFIED
5388    1035802  Linearstore: Segmentation fault when recovering empty queue
                   svn r.1547921 2013-12-04: Proposed fix. VERIFIED
NO-JIRA -        Added missing Apache copyright/license text
                   svn r.1551304 2013-12-16: Proposed fix
5425    1052445  Linearstore: Transaction Prepared List (TPL) fails with jexception 0x0402 AtomicCounter::addLimit() threw JERR_JNLF_FILEOFFSOVFL
                   svn r.1551361 2013-12-16: Proposed fix. VERIFIED
5442    1039949  Linearstore: Dtx recover test fails
                   svn r.1552772 2013-12-20: Proposed fix. VERIFIED
5444    1052775  Linearstore: Recovering from qpid-txtest fails with "Inconsistent TPL 2PC count" error message
                   svn r.1553148 2013-12-23: Proposed fix. NEEDINFO on reproduction and testing
-       1038599  [LinearStore] Abort when deleting used queue after restart
                   CLOSED-NOTABUG 2014-01-06
5460    1051097  [linearstore] Recovery of store which contains prepared but incomplete transactions results in message loss
                   svn r.1556892 2014-01-09: Proposed fix. VERIFIED
5473    1051924  [linearstore] Recovery of journal in which last logical file contains truncated record causes crash
                   svn r.1557620 2014-01-12: Proposed fix. MODIFIED
5483    -        [linearstore] Recovery of journal with partly written record fails with "JERR_JREC_BADRECTAIL: Invalid data record tail" error message
                   svn r.1558589 2014-01-15: Proposed fix
                   * May be linked to RHBZ 1039522 - VERIFIED
                   * May be linked to RHBZ 1039525 - VERIFIED
5487    1054448  [linearstore] Replace use of /dev/urandom with c random generator calls
                   svn r.1558913 2014-01-16: Proposed fix. VERIFIED
5479    1053701  [linearstore] Using recovered store results in "JERR_JNLF_FILEOFFSOVFL: Attempted to increase submitted offset past file size. (JournalFile::submittedDblkCount)" error message
                   * Probability: 2 of 600 (0.3%) using tx-test-soak.sh
                   * Fixed by checkin for QPID-5480, no longer able to reproduce. VERIFIED
5480    1053749  [linearstore] Recovery of store failure with "JERR_MAP_NOTFOUND: Key not found in map." error message
                   svn r.1564877 2014-02-05: Proposed fix
                   * Probability: 6 of 600 (1.0%) using tx-test-soak.sh
                   * If broker is started a second time after failure, it starts correctly and test completes ok.
                   * Problem: File is being recycled to EFP with still-locked enqueues in it (i.e. dequeued transactionally).
                   * Problem: Record alignment check writes filler records to wrong file when decoding bad record moves across a file boundary
5603    1063700  [linearstore] broker restart fails under stress test
                   svn r.1574513 2014-03-05: Proposed fix. POST
                   * jexception 0x0701 RecoveryManager::readNextRemainingRecord() threw JERR_JREC_BADRECTAIL
5607    1064181  [linearstore] Qpidd closes transactional client session&connection with async_dequeue() failed
                   svn r.1575009 2014-03-06: Proposed fix. POST
                   * jexception 0x010b LinearFileController::getCurrentSerial() threw JERR_NULL
-       1064230  [linearstore] Qpidd linearstore recovery sometimes fail to recover messages with recoverMessages() failed
                   * jexception 0x0701 RecoveryManager::readNextRemainingRecord() threw JERR_JREC_BADRECTAIL
                   * possible dup of 1063700
-       1036026  [LinearStore] Qpid linear store unable to create durable queue - framing-error: Queue <q-name>: create() failed: jexception 0x0000
                   * UNABLE TO REPRODUCE - but Frantizek has additional info
                   * Retested after checkin 1575009, problem solved. VERIFIED

Ordered checkin list:
=====================
To port the linearstore changes from trunk to a branch, apply the following svn checkins in the order listed:

no.  svn r    Q-JIRA   RHBZ      Date
---  -------  -------  --------  ----------
 1.  1545563  5357     1052518   2013-11-26
 2.  1547601  5358     1052727   2013-12-03
 3.  1547641  5387     1036071   2013-12-03
 4.  1547921  5388     1035802   2013-12-04
 5.  1551304  NO-JIRA  -         2013-12-16
 6.  1551361  5425     1052445   2013-12-16
 7.  1552772  5442     1039949   2013-12-20
 8.  1553148  5444     1052775   2013-12-23
 9.  1556888  5362     -         2014-01-09
10.  1556892  5460     1051097   2014-01-09
11.  1557620  5473     1051924   2014-01-12
12.  1558589  5483     -         2014-01-15
13.  1558592  5484     1035843   2014-01-15
14.  1558913  5487     1054448   2014-01-16
15.  1560530  5362     -         2014-01-22
16.  1561848  5362     -         2014-01-27
17.  1564808  5362     -         2014-02-05
18.  1564877  5480     1053749   2014-02-05
19.  1564893  5361     -         2014-02-05
20.  1564935  5361     -         2014-02-05
21.  1574513  5603     1063700   2014-03-05
22.  1575009  5607     1064181   2014-03-06

See above sections for details on these checkins.
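
The ordering requirement for the checkin list can be sketched as follows. This is an illustration only: mergeCommands is a hypothetical helper, and the source URL is a placeholder the caller must supply. The sketch checks that the revisions are strictly ascending and produces one "svn merge -c" command line per revision, to be run from a working copy of the target branch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: turns an ascending list of svn revisions into
// "svn merge -c REV SOURCE" command lines, preserving the list order.
std::vector<std::string> mergeCommands(const std::vector<unsigned long>& revisions,
                                       const std::string& sourceUrl) {
    std::vector<std::string> commands;
    for (std::size_t i = 0; i < revisions.size(); ++i) {
        // Out-of-order porting could apply a fix before the change it builds on.
        assert(i == 0 || revisions[i - 1] < revisions[i]);
        commands.push_back("svn merge -c " + std::to_string(revisions[i]) +
                           " " + sourceUrl);
    }
    return commands;
}
```

Each generated command merges a single revision, so conflicts can be resolved and committed one checkin at a time before moving on to the next entry.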

Future work:
============
* One journal file lost when queue deleted. All files except for one are recycled back to the EFP.
* Complete exceptions - several exceptions thrown using jexception have no exception numbers
* Investigate ability of store to detect missing journal files, especially from logical end of a journal
* Investigate ability of store to handle file muddle-ups (i.e. journal files from EFP which are not zeroed or other journals)
* Look at improving the efficiency of recovery - right now the entire store is read once, and then each recovered record's xid and data are read again

Code tidy-up
------------
* Remove old comments
* Use C++ cast templates instead of (xxx)y
* Member names: xxx_
* Rename classes, functions and variables to camel-case
* Add Doxygen docs to classes
* Make fids consistent in name (fid, file_id, pfid) and format (hex vs decimal)
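
For the cast item in the tidy-up list, a minimal sketch of what replacing (xxx)y with named casts might look like. The struct and function names are hypothetical and are not taken from the linearstore sources.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Illustrative record header; the layout is an assumption for this example.
struct RecordHeader {
    uint32_t magic;
    uint32_t size;
};

// Instead of (uint32_t)value: static_cast states that a numeric
// conversion is intended. The assert guards against silent truncation.
inline uint32_t checkedNarrow(uint64_t value) {
    assert(value <= 0xffffffffULL);
    return static_cast<uint32_t>(value);
}

// Instead of (const unsigned char*)&hdr: reinterpret_cast makes the
// raw-byte view of the object explicit and easy to search for.
inline const unsigned char* asBytes(const RecordHeader& hdr) {
    return reinterpret_cast<const unsigned char*>(&hdr);
}

// Reads the magic field back out of a raw byte view using memcpy.
inline uint32_t readMagic(const unsigned char* bytes) {
    uint32_t magic = 0;
    std::memcpy(&magic, bytes, sizeof(magic));
    return magic;
}
```

Unlike a C-style cast, each named cast permits only one kind of conversion, so an accidental change of constness or pointer type fails to compile instead of passing silently.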