| Commit message (Collapse) | Author | Age | Files | Lines |
The failure is caused by a race between applying a row event
(INSERT) and an automatically generated DROP TEMPORARY TABLE
event. Although DROP has the higher GTID, it can become visible in
@@gtid_slave_pos before the row event with the lower GTID has been
committed. Because the test synchronizes the slave with the master
by GTID, the wait ends as soon as the GTID of DROP TEMPORARY TABLE
becomes visible; if the changes from the preceding event have not
been applied yet, the test fails.
According to Kristian (see the comment to MDEV-10631), the real
problem is that DROP TEMPORARY TABLE is logged in row mode at all.
In this particular test DROP does not do anything, so nothing
prevents it from racing with the prior transaction.
The workaround for the test is to add a meaningful event after
DROP TEMPORARY TABLE, so that the slave waits on that event's GTID
instead of the one from DROP.
Additionally (unrelated to this problem) removed FLUSH TABLES,
which, as the comment stated, should have been removed after
MDEV-6403 was fixed.
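
The race and the workaround can be illustrated with a small toy model.
This is only a sketch in Python, not server or test-suite code: the
function names and the scheduling rule are assumptions made for
illustration. It assumes the worst-case interleaving, where a no-op
event (standing in for DROP TEMPORARY TABLE) publishes its position
before earlier row events commit, while row events commit in order.

```python
# Toy model of the GTID visibility race; not MariaDB code.
# All names here are hypothetical and exist only for illustration.

def replay(events):
    """Return (visible_pos, committed_rows) snapshots for a worst-case schedule.

    events: list of (seq_no, kind) with kind in {"row", "noop"}.
    Assumption: a "noop" event has nothing to commit, so its position may be
    published before earlier "row" events commit; "row" events commit in order.
    """
    visible_pos = 0
    committed = []
    snapshots = []
    # Worst case: every no-op event publishes its position first.
    for seq, kind in events:
        if kind == "noop":
            visible_pos = max(visible_pos, seq)
            snapshots.append((visible_pos, tuple(committed)))
    # Row events then commit in sequence order.
    for seq, kind in events:
        if kind == "row":
            committed.append(seq)
            visible_pos = max(visible_pos, seq)
            snapshots.append((visible_pos, tuple(committed)))
    return snapshots


def rows_seen_when_wait_returns(events, wait_for):
    """Rows committed at the first moment the visible position reaches wait_for."""
    for pos, committed in replay(events):
        if pos >= wait_for:
            return list(committed)
    return None


# Before the workaround: the last event is the no-op DROP (seq 2).
before = [(1, "row"), (2, "noop")]
# After the workaround: a meaningful event (seq 3) follows the DROP.
after = [(1, "row"), (2, "noop"), (3, "row")]

if __name__ == "__main__":
    print(rows_seen_when_wait_returns(before, wait_for=2))
    print(rows_seen_when_wait_returns(after, wait_for=3))
```

In this model, waiting on the DROP's position can return before row 1 is
committed, while waiting on the position of the added event cannot.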
|
Use include/sync_with_master_gtid.inc instead of --sync_with_master to avoid a
race in the test case.
In parallel replication, the old-style slave position (which is used by
--sync_with_master) is updated out-of-order between parallel threads. This
makes it possible for the position to be updated past DROP TEMPORARY TABLE t2
just before the commit of INSERT INTO t1 SELECT * FROM t2 becomes visible.
In this case, there is a small window where a SELECT just after
--sync_with_master may not see the changes from the INSERT.
|
|| (thd->state_flags & Open_tables_state::BACKUPS_AVAIL)' fails with parallel replication
The direct cause of the assertion was missing error handling in
record_gtid(). If ha_commit_trans() failed for the statement commit, no code
caught the error and called ha_rollback_trans(); this caused
close_thread_tables() to assert.
Normally, this error case is not hit, but in this case it was triggered due to
another bug: When a transaction T1 fails during parallel replication, the code
would signal following transactions that they could start to run without
properly marking the error condition. This caused subsequent transactions to
incorrectly start replicating, only to get an error later during their own
commit step. This was particularly serious if the subsequent transactions were
DDL or MyISAM updates, which cannot be rolled back and would leave replication
in an inconsistent state.
Fixed by 1) in case of error, only signal following transactions to continue
once the error has been properly marked and those transactions will know not
to start; and 2) implement proper error handling in record_gtid() in the case
that statement commit fails.
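
The ordering in point 1) can be sketched with a toy model. This is Python,
not the server's C++ code, and the class and function names are hypothetical.
It only illustrates the rule "mark the error first, signal followers second",
so that a follower that wakes up always sees the error and does not start.

```python
# Toy model of "mark error, then signal"; not MariaDB code.
# Txn and run_chain are hypothetical names used only for illustration.
import threading


class Txn:
    def __init__(self, name, prev=None, fail=False):
        self.name = name
        self.prev = prev              # transaction this one must wait for
        self.fail = fail              # simulate a failure in this transaction
        self.error = False            # must be set before done is signalled
        self.done = threading.Event()
        self.outcome = None

    def run(self):
        if self.prev is not None:
            self.prev.done.wait()     # wait for the prior transaction's signal
            if self.prev.error:
                # The error was marked before the signal, so it is visible
                # here: do not start, and pass the error on to followers.
                self.error = True
                self.outcome = "skipped"
                self.done.set()
                return
        if self.fail:
            self.error = True         # step 1: mark the error condition
            self.outcome = "failed"
        else:
            self.outcome = "committed"
        self.done.set()               # step 2: only now wake followers


def run_chain(fail_first):
    """Run T1 -> T2 -> T3 on separate threads and return their outcomes."""
    t1 = Txn("T1", fail=fail_first)
    t2 = Txn("T2", prev=t1)
    t3 = Txn("T3", prev=t2)
    # Start followers first to show they really block on the signal.
    threads = [threading.Thread(target=t.run) for t in (t3, t2, t1)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return [t.outcome for t in (t1, t2, t3)]


if __name__ == "__main__":
    print(run_chain(fail_first=True))
    print(run_chain(fail_first=False))
```

If the signal were sent before the error flag was set, a follower could wake
up, see no error, and start applying its changes, which is the situation the
commit message describes for DDL and MyISAM updates.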
|