| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
The test requires adaptation for MariaDB, which is done
in this patch. In addition, this patch includes a fix for
the SST script startup code that adds escaping for special
characters on the command line (in case they are contained
in the arguments to mysqld). The fix does not require
separate tests, as the required tests are already part
of the mtr suite for Galera.
|
| |
|
|
|
|
|
| |
It seems that memory is not freed when updated value is NULL or
when wsrep is not initialized before shutdown.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some DML operations on tables having unique secondary keys cause scanning
in the secondary index, for instance to find potential unique key violations
in the seconday index. This scanning may involve GAP locking in the index.
As this locking happens also when applying replication events in high priority
applier threads, there is a probabality for lock conflicts between two wsrep
high priority threads.
This PR avoids lock conflicts of high priority wsrep threads, which do
secondary index scanning e.g. for duplicate key detection.
The actual fix is the patch in sql_class.cc:thd_need_ordering_with(), where
we allow relaxed GAP locking protocol between wsrep high priority threads.
wsrep high priority threads (replication appliers, replayers and TOI processors)
are ordered by the replication provider, and they will not need serializability
support gained by secondary index GAP locks.
PR contains also a mtr test, which exercises a scenario where two replication
applier threads have a false positive conflict in GAP of unique secondary index.
The conflicting local committing transaction has to replay, and the test verifies
also that the replaying phase will not conflict with the latter repllication applier.
Commit also contains new test scenario for galera.galera_UK_conflict.test,
where replayer starts applying after a slave applier thread, with later seqno,
has advanced to commit phase. The applier and replayer have false positive GAP
lock conflict on secondary unique index, and replayer should ignore this.
This test scenario caused crash with earlier version in this PR, and to fix this,
the secondary index uniquenes checking has been relaxed even further.
Now innodb trx_t structure has new member: bool wsrep_UK_scan, which is set to
true, when high priority thread is performing unique secondary index scanning.
The member trx_t::wsrep_UK_scan is defined inside WITH_WSREP directive, to make
it possible to prepare a MariaDB build where this additional trx_t member is
not present and is not used in the code base. trx->wsrep_UK_scan is set to true
only for the duration of function call for: lock_rec_lock() trx->wsrep_UK_scan
is used only in lock_rec_has_to_wait() function to relax the need to wait if
wsrep_UK_scan is set and conflicting transaction is also high priority.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
|
|
|
|
|
|
|
| |
failed: 1205: Lock wait timeout exceeded
Add wait_conditions to verify correct database state before
next operation.
|
|
|
|
| |
Use debug_sync to force FTWRL to pause in correct state.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
At end_connection make sure we have wsrep before trying to free
connection assigned to it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If log_slave_updates==OFF, wsrep applier threads used to be configured
with option: thd->variables.option_bits&= ~(OPTION_BIN_LOG);
(i.e. like sql_log_bin=ON). And this was regardless of log-bin configuration.
With this, having configuration of: --log-bin && --log-slave-updates=OFF,
local threads used binlogging, but applier threads did not. And further:
local threads went through binlog group commit, while applier threads did
direct commits. This resulted in situation, where applier threads entered
earlier in wsrep XID checkpointing, and could sync their wsrep XID out of order.
Later local thread commit would see that higher seqno was already checkpointed,
and fire an assert because of this.
As a fix, applier threads are now forced to enable binlogging regardless of
log-slave-updates configuration.
This PR comes with new mtr test: galera.MDEV-24327, which causes a scenario
where applier transaction is applied and committed while earlier local transaction
is parked before commit order monitor enter. A buggy mariadb versoin would fail
for assertion because of wsrep XID checkpoint order violation.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
|
| |
|
|
|
|
| |
Add test case
|
|
|
|
| |
Add wait_condition.
|
|
|
|
| |
Add primary key and wait condition.
|
|
|
|
| |
Add wait_conditions.
|
|
|
|
|
|
|
|
|
| |
Changed the test so that it does not rely on specific auto increment
ids. With Galera's default wsrep_auto_increment_control setting it is
not guaranteed that auto increments always start from 1. The test was
occasionally failing due to result content mismatch.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
|
|
|
|
|
| |
Remove unnecessary condition and add necessary include
for non debug Galera library.
|
|
|
|
|
|
|
| |
Deadlock is possible between applier thread and local committing thread with active FLUSH TABLE.
Applier thread should skip table share checks and locks when opening table.
Reviewed-by: Jan Lindström <jan.lindstrom@mariadb.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To fix this, it is necessary to add an option to exclude the
database with the name "lost+found" from processing (the database
name will be checked by the check_if_skip_database_by_path() or
by the check_if_skip_database() function, and as a result
"lost+found" will be skipped).
In addition, it is necessary to slightly modify the verification
logic in the check_if_skip_database() function.
Also added a new test galera_sst_mariabackup_lost_found.test
|
|
|
|
| |
Add necessary wait_conditions to stabilize test.
|
|
|
|
|
|
| |
galera_gcache_recover_full_gcache.test: assert_grep.inc failed
Grep only the fact that we need to fall back to SST.
|
|
|
|
|
|
|
| |
not transition to state READY
Replace sleeps with proper wait_conditions to wait correct
cluster configuration.
|
|
|
|
| |
Remove infinite procedure and use direct INSERTs.
|
|
|
|
|
| |
While doing TOI buffer OR REPLACE option was not added to replicated
string.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit contains a fix and extended test case for a ASAN failure
reported during galera.fk mtr testing.
The reported heap buffer overflow happens in test case where a cascading
foreign key constraint is defined for a column of varchar type, and
galera.fk.test has such vulnerable test scenario.
Troubleshoting revealed that erlier fix for MDEV-19660 has made a fix
for cascading delete handling to append wsrep keys from pcur->old_rec,
in row_ins_foreign_check_on_constraint(). And, the ASAN failuer comes from
later scanning of this old_rec reference.
The fix in this commit, moves the call for wsrep_append_foreign_key() to happen
somewhat earlier, and inside ongoing mtr, and using clust_rec which is set
earlier in the same mtr for both update and delete cascade operations.
for wsrep key populating, it does not matter when the keys are populated,
all keys just have to be appended before wsrep transaction replicates.
Note that I also tried similar fix for earlier wsrep key append, but using
the old implementation with pcur->old_rec (instead of clust_rec), and same
ASAN failure was reported. So it appears that pcur->old_rec is not properly
set, to be used for wsrep key appending.
galera.galera_fk_cascade_delete test has been extended by two new test scenarios:
* FK cascade on varchar column.
This test case reproduces same scenario as galera.fk, and this test scenario
will also trigger ASAN failure with non fixed MariaDB versions.
* multi-master conflict with FK cascading.
this scenario causes a conflict between a replicated FK cascading transaction
and local transaction trying to modify the cascaded child table row.
Local transaction should be aborted and get deadlock error.
This test scenario is passing both with old MariaDB version and with this
commit as well.
|
|
|
|
|
| |
Revert change to MDL and set SST donor thread as a system thread.
Joiner thread was already a system thread.
|
|
|
|
|
| |
During SST we need to let FTWRL to use normal timeout method
even when client is disconnected.
|
|
|
|
| |
hang.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sporadic test hangs happen because of mutex dealock between innodb
background threads and two test connection executions.
The test sets variable innodb_disallow_writes, which blocks all writes
to filesyste. The test logic is to execute an INSERT, which should hang
because of filesytstem writes are blocked, and through another session
verify by SELECT that this hanging happens. The SELECT session will then
release innodb_disallow_writes blocking.
However, filesystem write blocking affects also innodb background threads
and they may hang while keeping some other resources locked.
As an example, in one test hang situation, buffer pool access was blocked.
And, if buffer pool is blocked, the test connections will be blocked as well,
and the SELECT session will not be able to continue to release the
innodb_disallow_writes.
The fix in this commit is refactoring of the test logic.
The test will now set first innodb_disallow_writes blocking, and then record
a hash of data directory's filesystem contents. This works as checksum of the
state of data on the datadirectory.
Then some SQL load is tried on both nodes, these sessions will be blocking
due to frozen file system state. The test will have a short sleep to allow
innodb background threads to loop and possibly encounter innodb_disallow_writes
blocking as well.
After the sleep, the test will record file system checksun for the second time,
and then release the innodb_disallow-writes blocking.
Finally, the two checksums are compared, they should be identical to verify that
nothing was written on datadirectory during the test execution.
The checksum is implemented by md5sum hash over all files found in datadirectory
by find command. all these file hashes are hashed together by one more md5sum.
The test therefore depends on md5sum and find. find may work differently with some
OS distributions, e.g. freebsd may be problematic.
|
|
|
|
| |
https://github.com/codership/mariadb-server into 10.2-MDEV-18838
|
|\ |
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The galera.galera_parallel_autoinc_manytrx mtr test opens and runs test
scenario through 3 connections to node 1 and one connection to node 2.
In the test initialization phase, the test creates two tables 't1' and 'ten'
and then creates a stored procedure 'p1' to operate on these tables.
These 3 create DDL statements are issued through same connection to node 1.
In the next test phase, the mtr script uses send command to launch the call
for the p1 stored procedure through all 3 connections to node 1 and through
one connection to node 2. As the mtr send command is asynchronous,
this test phase is non blocking and fast operation.
Now, if the replication between nodes is slow, it may happen that the
initialization phase DDL statements have not been received or have not been
fully applied in node 2. Therefore there is no guarantee that the test tables
and the stored procedure have been created in node 2. Yet, the test is trying
to call p1 in node 2.
In the failure case error logs, there is error message
"MTR failed: query 'reap' failed: 1305: PROCEDURE test.p1 does not exist"
The reap command through connection to node 2, is the first place where test
execution may observe that test tables and/or stored procedure are not yet
created in node 2.
The fix in this commit adds a wait condition in connection to node 2, to wait
until the stored procedure is created before calling the stored procedure.
The wait is implemented by looking in information_schema.routines for the p1
stored procedure.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
drop database `fts`.`` though there are still open handles
MDEV-22140 galera.galera_drop_database MTR failed: InnoDB: MySQL is trying to drop database `fts`.`` though there are still open handles
Add wait conditions to wait that all operations are done in both
nodes.
|
| |
| |
| |
| |
| |
| | |
galera.galera_var_innodb_disallow_writes: Result length mismatch
Add wait_conditions to force desired execution.
|
| | |
|
|\ \
| |/ |
|
| | |
|
| |
| |
| |
| |
| |
| | |
Backported the support for aborting and replaying stored procedure and fix for trigger
key assigments from 10.4 version.
Backported also two mtr tests: wsrep_sp_bf_abort and MDEV-20225
|
| |
| |
| |
| |
| |
| | |
'reap' succeeded - should have failed with errno 1213
Test cleanup.
|
| |
| |
| |
| |
| | |
Enable tests with additional galera output to find out actual
reason for test failures.
|
| |
| |
| |
| |
| |
| |
| | |
galera.galera_as_slave_gtid_myisam: Result length mismatch
Add wait_condition so that drop table has time to replicate to
Galera cluster.
|
| |
| |
| |
| |
| |
| |
| | |
galera.galera_wsrep_new_cluster: Result content mismatch
Test starts two servers and we do not know order they really start,
thus wsrep_local_index can be 1 or 2.
|
| |
| |
| |
| |
| |
| |
| | |
Result content mismatch
Add wait_condition to make sure that node contains all the rows
expected.
|
|\ \
| |/ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is a backport of the applicable part of
commit 93475aff8de80a0ef53cbee924bcb70de6e86f2c and
commit 2c39f69d34e64a5cf94720e82e78c0ee91bd4649
from 10.4.
Before 10.4 and Galera 4, WSREP_ON is a macro that points to
a global Boolean variable, so it is not that expensive to
evaluate, but we will add an unlikely() hint around it.
WSREP_ON_NEW: Remove. This macro was introduced in
commit c863159c320008676aff978a7cdde5732678f975
when reverting WSREP_ON to its previous definition.
We replace some use of WSREP_ON with WSREP(thd), like it was done
in 93475aff8de80a0ef53cbee924bcb70de6e86f2c. Note: the macro
WSREP() in 10.1 is equivalent to WSREP_NNULL() in 10.4.
Item_func_rand::seed_random(): Avoid invoking current_thd
when WSREP is not enabled.
|