-- BACKUP SIGNAL DIAGRAM COMPLEMENT TO BACKUP AMENDMENTS 2003-07-11 --
USER MASTER MASTER SLAVE SLAVE
---------------------------------------------------------------------
BACKUP_REQ
---------------->
UTIL_SEQUENCE
--------------->
<---------------
DEFINE_BACKUP
------------------------------> (Local signals)
LIST_TABLES
--------------->
<---------------
FSOPEN
--------------->
GET_TABINFO
<---------------
DI_FCOUNT
--------------->
<---------------
DI_GETPRIM
--------------->
<---------------
<-------------------------------
BACKUP_CONF
<----------------
CREATE_TRIG
--------------> (If master crashes here -> rogue triggers/memory leak)
<--------------
START_BACKUP
------------------------------>
<------------------------------
ALTER_TRIG
-------------->
<--------------
WAIT_GCP
-------------->
<--------------
BACKUP_FRAGMENT
------------------------------>
SCAN_FRAG
--------------->
<---------------
<------------------------------
WAIT_GCP
-------------->
<--------------
DROP_TRIG
-------------->
<--------------
STOP_BACKUP
------------------------------>
<------------------------------
BACKUP_COMPLETE_REP
<----------------
ABORT_BACKUP
------------------------------>
----------------------------------------------------------------------------
USER BACKUP-MASTER
1) BACKUP_REQ -->
2) To all slaves DEFINE_BACKUP_REQ
This signal contains info so that all
slaves can take over as master
Tomas: Except triggerId info...
3) Wait for conf
4) <-- BACKUP_CONF
5) For Each Table
PREP_CREATE_TRIG_REQ
Wait for Conf
6) To all slaves START_BACKUP_REQ
Include trigger ids
Wait for conf
7) For Each Table
CREATE_TRIG_REQ
Wait for conf
8) Wait for GCP
9) For each table
For each fragment
BACKUP_FRAGMENT_REQ -->
<-- BACKUP_FRAGMENT_CONF
10) Wait for GCP
11) To all slaves STOP_BACKUP_REQ
This signal turns off logging
12) Wait for conf
13) <-- BACKUP_COMPLETE_REP
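
The ordering of steps 2-13 above can be sketched as a small function that lists the signals and waits in the order the master issues them. This is only an illustration of the sequence, not the Backup block's code: masterSignalPlan and its fragmentsPerTable argument are hypothetical names, and the per-step "wait for conf" is folded into each entry.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the master's signal order for a backup.
// fragmentsPerTable[i] is the number of fragments of table i (assumed input).
std::vector<std::string> masterSignalPlan(const std::vector<int>& fragmentsPerTable) {
  std::vector<std::string> plan;
  const std::size_t tables = fragmentsPerTable.size();

  plan.push_back("DEFINE_BACKUP_REQ");            // steps 2-3: to all slaves, wait for conf
  plan.push_back("BACKUP_CONF");                  // step 4: reply to the user
  for (std::size_t t = 0; t < tables; ++t)
    plan.push_back("PREP_CREATE_TRIG_REQ");       // step 5: once per table
  plan.push_back("START_BACKUP_REQ");             // step 6: includes trigger ids
  for (std::size_t t = 0; t < tables; ++t)
    plan.push_back("CREATE_TRIG_REQ");            // step 7: once per table
  plan.push_back("WAIT_GCP");                     // step 8
  for (std::size_t t = 0; t < tables; ++t)
    for (int f = 0; f < fragmentsPerTable[t]; ++f)
      plan.push_back("BACKUP_FRAGMENT_REQ");      // step 9: per table, per fragment
  plan.push_back("WAIT_GCP");                     // step 10
  plan.push_back("STOP_BACKUP_REQ");              // steps 11-12: turns off logging
  plan.push_back("BACKUP_COMPLETE_REP");          // step 13: report to the user
  return plan;
}
```

Calling it with one entry per table gives a flat list that can be compared against a signal trace when checking that a test run followed the expected order.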
----
Slave: Master Died
Wait for master take-over, max 30 sec then abort everything
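
The rule above (wait for a take-over, at most 30 seconds, then abort) can be sketched as a single decision function. The names SlaveAction and onMasterDied are hypothetical, and whether the limit is inclusive at exactly 30 seconds is an assumption; the note only gives the 30 second maximum.

```cpp
#include <cstdint>

// Hypothetical outcome of a slave's periodic check after the master died.
enum class SlaveAction { KeepWaiting, ContinueWithNewMaster, AbortBackup };

// 30 sec limit taken from the note above.
const std::uint32_t kMasterTakeoverTimeoutMs = 30 * 1000;

// elapsedMs: time since the master failure was detected.
// newMasterTookOver: true once a take-over has been observed.
SlaveAction onMasterDied(std::uint32_t elapsedMs, bool newMasterTookOver) {
  if (newMasterTookOver) {
    return SlaveAction::ContinueWithNewMaster;
  }
  if (elapsedMs >= kMasterTakeoverTimeoutMs) {  // assumption: limit is inclusive
    return SlaveAction::AbortBackup;            // "abort everything"
  }
  return SlaveAction::KeepWaiting;
}
```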
Slave: Master TakeOver
BACKUP_STATUS_REQ --> To all nodes
<-- BACKUP_STATUS_CONF
BACKUP_STATUS_CONF
BACKUP_DEFINED
BACKUP_STARTED
BACKUP_FRAGMENT
Master: Slave died
-- Define Backup Req --
1) Get backup definition
Which tables (all)
2) Open files
Write table list to CTL - file
3) Get definitions for all tables in backup
4) Get Fragment info
5) Define Backup Conf
-- Define Backup Req --
-- Abort Backup Req --
1) Report to others
2) Stop logging
3) Stop file(s)
4) Stop scan
5) If failure/abort
Remove files
6) If XXX
Report to user
7) Clean up records/stuff
-- Abort Backup --
Reasons for aborting:
1a) client abort
1b) slave failure
1c) node failure
Resources to be cleaned up:
Slave responsibility:
2a) Close and remove files
2b) Free allocated resources
Master responsibility:
2c) Drop triggers
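
The split of cleanup duties above can be written down as a lookup by role. This is a minimal sketch with hypothetical names (Role, cleanupActions); it only restates items 2a-2c and says nothing about how the real block carries them out.

```cpp
#include <string>
#include <vector>

enum class Role { Master, Slave };

// Returns the cleanup items listed above for the given role.
std::vector<std::string> cleanupActions(Role role) {
  std::vector<std::string> actions;
  if (role == Role::Slave) {
    actions.push_back("close and remove files");    // 2a
    actions.push_back("free allocated resources");  // 2b
  } else {
    actions.push_back("drop triggers");             // 2c
  }
  return actions;
}
```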
USER MASTER MASTER SLAVE SLAVE
---------------------------------------------------------------------
BACKUP_ABORT_ORD:
-------------------------(ALL)-->
Set Master State ABORTING Set Slave State ABORTING
Drop Triggers Close and Remove files
CleanupSlaveResources()
BACKUP_ABORT_ORD:OkToClean
-------------------------(ALL)-->
CleanupMasterResources()
BACKUP_ABORT_REP
<---------------
State descriptions:
Master - INITIAL
BACKUP_REQ ->
Master - DEFINING
DEFINE_BACKUP_CONF ->
Master - DEFINED
CREATE_TRIG_CONF ->
Master - STARTED
<--->
Master - SCANNING
WAIT_GCP_CONF ->
Master - STOPPING
(Master - CLEANING)
--------
Master - ABORTING
Slave - INITIAL
DEFINE_BACKUP_REQ ->
Slave - DEFINING
- backupId
- tables
DIGETPRIMCONF ->
Slave - DEFINED
START_BACKUP_REQ ->
Slave - STARTED
Slave - SCANNING
STOP_BACKUP_REQ ->
Slave - STOPPING
FSCLOSECONF ->
Slave - CLEANING
-----
Slave - ABORTING
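
The slave state list above can be sketched as a transition function. Several details are assumptions, not taken from the notes: the enum and function names are hypothetical, BackupFragmentReq and ScanDone are stand-ins for whatever moves a slave between STARTED and SCANNING, STOP_BACKUP_REQ is accepted only in STARTED, and an abort order is accepted in every state.

```cpp
enum class SlaveState { Initial, Defining, Defined, Started,
                        Scanning, Stopping, Cleaning, Aborting };

enum class SlaveEvent { DefineBackupReq, DiGetPrimConf, StartBackupReq,
                        BackupFragmentReq, ScanDone, StopBackupReq,
                        FsCloseConf, AbortOrd };

// Writes the next state to 'to' and returns true if 'ev' is legal in 'from';
// returns false and leaves 'to' untouched otherwise.
bool nextSlaveState(SlaveState from, SlaveEvent ev, SlaveState& to) {
  if (ev == SlaveEvent::AbortOrd) {  // assumption: abort is legal anywhere
    to = SlaveState::Aborting;
    return true;
  }
  switch (from) {
    case SlaveState::Initial:
      if (ev == SlaveEvent::DefineBackupReq) { to = SlaveState::Defining; return true; }
      break;
    case SlaveState::Defining:
      if (ev == SlaveEvent::DiGetPrimConf) { to = SlaveState::Defined; return true; }
      break;
    case SlaveState::Defined:
      if (ev == SlaveEvent::StartBackupReq) { to = SlaveState::Started; return true; }
      break;
    case SlaveState::Started:
      // assumption: a fragment request starts a scan
      if (ev == SlaveEvent::BackupFragmentReq) { to = SlaveState::Scanning; return true; }
      if (ev == SlaveEvent::StopBackupReq) { to = SlaveState::Stopping; return true; }
      break;
    case SlaveState::Scanning:
      // assumption: finishing the scan returns the slave to STARTED
      if (ev == SlaveEvent::ScanDone) { to = SlaveState::Started; return true; }
      break;
    case SlaveState::Stopping:
      if (ev == SlaveEvent::FsCloseConf) { to = SlaveState::Cleaning; return true; }
      break;
    default:
      break;
  }
  return false;
}
```

Keeping the transitions in one function makes it easy to reject a signal that arrives in the wrong state, which is the situation the error-insert test cases try to provoke.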
Testcases:
2. Master failure at first START_BACKUP_CONF
<masterId> error 10004
start backup
- Ok
2. Master failure at first CREATE_TRIG_CONF
<masterId> error 10003
start backup
- Ok
2. Master failure at first ALTER_TRIG_CONF
<masterId> error 10005
start backup
- Ok
2. Master failure at WAIT_GCP_CONF
<masterId> error 10007
start backup
- Ok
2. Master failure at WAIT_GCP_CONF, nextFragment
<masterId> error 10008
start backup
- Ok
2. Master failure at WAIT_GCP_CONF, stopping
<masterId> error 10009
start backup
- Ok
2. Master failure at BACKUP_FRAGMENT_CONF
<masterId> error 10010
start backup
- Ok
2. Master failure at first DROP_TRIG_CONF
<masterId> error 10012
start backup
- Ok
1. Master failure at first STOP_BACKUP_CONF
<masterId> error 10013
start backup
- Ok
3. Multiple node failure:
<masterId> error 10001
<otherId> error 10014
start backup
- Ok (note: mgmtsrvr gets BACKUP_ABORT_REP but expects BACKUP_REF, so it hangs...)
4. Multiple node failure:
<masterId> error 10007
<takeover id> error 10002
start backup
- Ok
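
For reference, the single master-failure test cases above map each error-insert code to the point where the master is made to fail. A minimal lookup sketch follows; masterFailurePoint is a hypothetical helper, and codes whose meaning the list does not state (for example 10001, 10002, 10014) are deliberately left out and return an empty string.

```cpp
#include <string>

// Maps an error-insert code from the test cases above to the signal at which
// the master fails. Returns "" for codes not described in the list.
std::string masterFailurePoint(int errorCode) {
  switch (errorCode) {
    case 10003: return "first CREATE_TRIG_CONF";
    case 10004: return "first START_BACKUP_CONF";
    case 10005: return "first ALTER_TRIG_CONF";
    case 10007: return "WAIT_GCP_CONF";
    case 10008: return "WAIT_GCP_CONF, nextFragment";
    case 10009: return "WAIT_GCP_CONF, stopping";
    case 10010: return "BACKUP_FRAGMENT_CONF";
    case 10012: return "first DROP_TRIG_CONF";
    case 10013: return "first STOP_BACKUP_CONF";
    default:    return "";
  }
}
```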
ndbrequire(!ERROR_INSERTED(10001));
ndbrequire(!ERROR_INSERTED(10002));
ndbrequire(!ERROR_INSERTED(10021));
ndbrequire(!ERROR_INSERTED(10003));
ndbrequire(!ERROR_INSERTED(10004));
ndbrequire(!ERROR_INSERTED(10005));
ndbrequire(!ERROR_INSERTED(10006));
ndbrequire(!ERROR_INSERTED(10007));
ndbrequire(!ERROR_INSERTED(10008));
ndbrequire(!ERROR_INSERTED(10009));
ndbrequire(!ERROR_INSERTED(10010));
ndbrequire(!ERROR_INSERTED(10011));
ndbrequire(!ERROR_INSERTED(10012));
ndbrequire(!ERROR_INSERTED(10013));
ndbrequire(!ERROR_INSERTED(10014));
ndbrequire(!ERROR_INSERTED(10015));
ndbrequire(!ERROR_INSERTED(10016));
ndbrequire(!ERROR_INSERTED(10017));
ndbrequire(!ERROR_INSERTED(10018));
ndbrequire(!ERROR_INSERTED(10019));
ndbrequire(!ERROR_INSERTED(10020));
if (ERROR_INSERTED(10023)) {
if (ERROR_INSERTED(10023)) {
if (ERROR_INSERTED(10024)) {
if (ERROR_INSERTED(10025)) {
if (ERROR_INSERTED(10026)) {
if (ERROR_INSERTED(10028)) {
if (ERROR_INSERTED(10027)) {
(ERROR_INSERTED(10022))) {
if (ERROR_INSERTED(10029)) {
if(trigPtr.p->operation->noOfBytes > 123 && ERROR_INSERTED(10030)) {