REQUIREMENTS
------------
- It should be possible to run two systems with replication using the
same configuration file on both systems.
FEATURES TO IMPLEMENT
---------------------
- Fix so that execute and command use ExtSender.
Neither of them should have its own signals; this should
instead be abstracted to the RepStateRequest layer.
- Delete signals
GSN_REP_INSERT_GCIBUFFER_CONF
GSN_REP_INSERT_GCIBUFFER_REF
- Fix so that all ExtSenders are set at one point in the code only.
- Verify the following signals:
GSN_REP_INSERT_GCIBUFFER_REQ
GSN_REP_CLEAR_SS_GCIBUFFER_REQ
GSN_REP_DROP_TABLE_REQ
- Fix all @todo's in the code
- Remove all #if 1, #if 0 etc.
- Fix the code to use the dbug package (as used in the MySQL source code) correctly.
- System table storing all info about channels
- Think about how channels, subscriptions, etc. map to SUMA subscriptions.
- TableInfoPS must be preserved if SS REP is restarted while PS REP still
has all log records needed to sync. (This could be saved in a system
table instead of in the struct; a sketch follows below.)
KNOWN BUGS AND LIMITATIONS
--------------------------
- REP#1: Non-consistency due to non-logging stop [LIMITATION]
Problem:
- Stopping replication in a state other than "Logging" can
leave the destination database in an inconsistent state.
Suggested solution:
- Implement a cleanData flag (set to false) indicating that
this has happened (see the sketch below).
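A minimal sketch of the suggested cleanData handling; m_cleanData and
stopChannel() are hypothetical names, while Channel::LOG and
m_channel.getState() are taken from the existing RepState code:

  // Hypothetical: clear the cleanData flag whenever replication is
  // stopped in a state other than Logging, so it is visible later that
  // the destination database may be inconsistent.
  void
  RepState::stopChannel()
  {
    if (m_channel.getState() != Channel::LOG) {
      m_cleanData = false;    // destination may be inconsistent
    }
    // ... existing stop handling ...
  }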
- REP#2: PS REP uses epochs from old subscription [BUG]
The following scenario can lead to incorrect replication:
- Start replication X
- Wait until replication is in "Logging" state
- Kill SS REP
- Let PS REP be alive
- Start new replication Y
- Replication Y can use old PS REP epochs from replication X.
Suggested solution:
- Mark PS buffers with channel ids
- Make sure that all epoch requests carry the channel id (see the sketch below).
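A minimal sketch of how epoch/GCI buffer requests could carry the channel id so
that PS REP can discard epochs from an old channel; the struct and function
names here are hypothetical:

  // Hypothetical request layout: every GCI buffer / epoch request
  // carries the id of the channel it belongs to.
  struct GciBufferReq {
    Uint32 channelId;   // channel the epochs belong to
    Uint32 firstGci;    // first requested epoch
    Uint32 lastGci;     // last requested epoch
  };

  // PS REP rejects requests that refer to a channel other than the
  // currently active one (e.g. epochs left over from replication X).
  bool
  acceptGciBufferReq(const GciBufferReq & req, Uint32 activeChannelId)
  {
    return req.channelId == activeChannelId;
  }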
- REP#3: When having two node groups, there is sometimes 626 [FIXED]
Problem:
- Sometimes (when doing updates) error code 626 is returned when
using 2 node groups.
- 626 = Tuple does not exist.
- Current code in RepState.cpp is:
  if(s == Channel::App &&
     m_channel.getState() == Channel::DATASCAN_COMPLETED &&
     i.last() >= m_channel.getDataScanEpochs().last() &&
     i.last() >= m_channel.getMetaScanEpochs().last())
  {
    m_channel.setState(Channel::LOG);
    disableAutoStart();
  }
When the system enters LOG state, the force flag is turned off.
Suggested solution:
- During DATASCAN, force=true (i.e., updates are treated as writes,
and delete errors due to non-existing tuples are ignored).
- The code above must take ALL node groups into account (see the sketch below).
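A hedged sketch of how the LOG transition could take all node groups into
account; the per-node-group accessors getNoOfNodeGroups(),
getDataScanEpochs(ng) and getMetaScanEpochs(ng) are assumptions, not existing
interfaces:

  // Hypothetical: only enter LOG state once the interval i has passed
  // the data and meta scan epochs of EVERY node group, not just one.
  bool scanDone = true;
  for (Uint32 ng = 0; ng < m_channel.getNoOfNodeGroups(); ng++) {
    if (i.last() < m_channel.getDataScanEpochs(ng).last() ||
        i.last() < m_channel.getMetaScanEpochs(ng).last()) {
      scanDone = false;
      break;
    }
  }

  if (s == Channel::App &&
      m_channel.getState() == Channel::DATASCAN_COMPLETED &&
      scanDone)
  {
    m_channel.setState(Channel::LOG);
    disableAutoStart();
  }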
- REP#4: User requests sometimes vanish when DB node is down [LIMITATION]
Problem:
- PS REP node does not always send a REF when there is no connection to GREP
Suggested solution:
- All REP->GREP signal sends should be checked. If a send returns <0,
then a REF signal should be returned (see the sketch below).
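A minimal sketch of the suggested check, assuming the ExtSender-style send
returns a negative value on failure as described above; sendToGrep() and
sendUserRef() are hypothetical helpers:

  // Hypothetical helpers: sendToGrep() stands for any REP->GREP
  // ExtSender call, sendUserRef() answers the originating user request.
  const int res = sendToGrep(signal);
  if (res < 0) {
    sendUserRef(signal);   // report "no connection to GREP" to the user
    return;
  }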
- REP#5: User requests sometimes vanish when PS REP is down [BUG]
Scenario:
- Execute "Start" while the PS REP node is down
Solution:
- When Start is executed, the connect flag should be checked (see the sketch below)
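A minimal sketch of the connect-flag check; m_psRepConnected,
handleStartRequest() and sendStartRef() are hypothetical names:

  // Hypothetical: refuse "Start" immediately when the PS REP node is
  // known to be disconnected, so the request cannot vanish.
  void
  handleStartRequest()
  {
    if (!m_psRepConnected) {   // connect flag towards PS REP
      sendStartRef();          // report "PS REP node is down"
      return;
    }
    // ... forward the Start request as before ...
  }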
- REP#6: No warning if table exists [Lars, BUG!]
Problem:
- There is no warning if a replicated table already exists in the
database.
Suggested solution:
- Print a warning
- Set cleanData = false
- REP#7: Starting 2nd subscription crashes DB node (Grep.cpp:994) [FIXED]
Scenario:
- Start replication
- Wait until replication is in "Logging" state
- Kill SS REP
- Let PS REP be alive
- Start new replication
- Now GREP crashes in Grep.cpp:994.
Suggested fix:
- If a new subscription is requested with the same subscriberData
as an existing one, then SUMA (or GREP) should send a REF signal
indicating that SUMA does not allow a new subscription to be
created. [Currently no senderData is sent from REP.] (See the sketch below.)
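A hedged sketch of the duplicate check; findSubscription(), sendSubCreateRef()
and createSubscription() are hypothetical helpers standing in for the real
SUMA/GREP code paths:

  // Hypothetical: refuse a new subscription that reuses the
  // subscriberData of an existing one, instead of crashing.
  void
  handleCreateSubscriptionReq(Uint32 subscriberData)
  {
    if (findSubscription(subscriberData) != 0) {
      sendSubCreateRef(subscriberData);   // duplicate subscription
      return;
    }
    createSubscription(subscriberData);
  }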
- REP#8: Dangling subscriptions in GREP/SUMA [Johan,LIMITATION]
Problem:
- If both REP nodes die, there is no way to remove
subscriptions from GREP/SUMA
Suggested solution 1:
- Fix so that GREP/SUMA can receive a subscription removal
signal with subid 0. This means that ALL subscriptions are
removed. This meaning should be documented in the
signaldata class.
- Implement a new user command "STOP ALL" that sends
a request to delete all subscriptions (see the sketch after this entry).
Suggested solution 2:
- When GREP detects that ALL PS REP nodes associated with a
subscription have been killed, then that subscription should be
deleted.
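A minimal sketch of suggested solution 1 above; handleRemoveSubscriptionReq(),
removeSubscription() and removeAllSubscriptions() are hypothetical helpers:

  // Hypothetical: subId == 0 is documented to mean "remove ALL
  // subscriptions"; a new "STOP ALL" user command would send this.
  void
  handleRemoveSubscriptionReq(Uint32 subId)
  {
    if (subId == 0) {
      removeAllSubscriptions();   // drop every dangling subscription
      return;
    }
    removeSubscription(subId);
  }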