summaryrefslogtreecommitdiff
path: root/mysql-test/suite/rpl/t
Commit message (Collapse)AuthorAgeFilesLines
* MDEV-23089 rpl_parallel2 fails in 10.5Sachin2020-08-031-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | Problem:- rpl_parallel2 was failing non-deterministically Analysis:- When FLUSH TABLES WITH READ LOCK is executed, it will allow all worker threads to complete their ongoing transactions and then it will pause them. At this state FTWRL will proceed to acquire global read lock. FTWRL first blocks threads from starting new commits, then upgrades the lock to block commit of existing transactions. Step1: FLUSH TABLES WITH READ LOCK - Blocks new commits Step2: * STOP SLAVE command enables 'force_abort=1' which unblocks workers, they continue to execute events. * T1: Waits in 'record_gtid' call to update 'gtid_slave_pos' table with its current GTID, but it is blocked becuase of Step1. * T2: Holds COMMIT lock and waits for T1 to commit. Step3: FLUSH TABLES WITH READ LOCK - Waiting to get BLOCK_COMMIT. This results in deadlock. When STOP SLAVE command allows paused workers to proceed, workers should skip the execution of all further events, similar to 'conservative' parallel mode. Solution:- We will assign 1 to skip_event_group when we are aborted in do_ftwrl_wait. rpl_parallel_entry->pause_sub_id is only reset when force_abort is off in rpl_pause_after_ftwrl.
* Merge 10.3 into 10.4Marko Mäkelä2020-07-312-0/+87
|\
| * Merge 10.2 into 10.3Marko Mäkelä2020-07-312-0/+87
| |\
| | * MDEV-19338 InnoDB: Failing assertion: !cursor->index->is_committed()Nikita Malyavin2020-07-311-0/+27
| | | | | | | | | | | | | | | | | | Call mark_columns_per_binlog_row_image before find_row() to set up table->vcol_set early, so the virtual column value will be updated after record read (ha_rnd_pos/ha_index_next/etc) by table->update_virtual_fields() call
| | * MDEV-14203: rpl.rpl_extra_col_master_myisam, ↵Sujatha2020-07-231-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rpl.rpl_slave_load_tmpdir_not_exist failed in buildbot with a warning Problem: ======= rpl.rpl_slave_load_tmpdir_not_exist 'stmt' w3 [ fail ] Found warnings/errors in server log file! Test ended at 2017-09-27 20:34:55 [Warning] Master is configured to log replication events with checksum, but will not send such events to slaves that cannot process them ^ Found warnings in /mnt/buildbot/build/mariadb-10.2.10/mysql-test/var/3/log/mysqld.1.err ok Analysis: ======== When slave tries to connect to master 'get_master_version_and_clock' function is invoked to perform elaborated slave-master handshake. During this process slave server queries master server, to know if it is checksum aware and at the same time master is notified about its CRC-awareness. The master's side instant value of @@global.binlog_checksum is stored in the dump thread's uservar area as well as cached locally to become known in consensus by master and slave. Post hand-shake slave requests master for binlog dump. It sends 'COM_BINLOG_DUMP'. This command is sent to master by 'cli_advanced_command' call. If there is some temporary network failure during this request_dump call, 'end_server' is invoked to close the current connection between master and slave. Upon connection close the dump thread on the master gets terminated and it clears the 'uservar' data it got through master-slave handshake. The 'COM_BINLOG_DUMP' command is sent once again without master-slave handshake. Since the checksum data is not available with new dump thread a warning gets reported. Fix: === Upon network write error donot attempt reconnect, proceed to master-slave handshake. This ensures that master is aware of slave's capability to use checksums.
* | | MDEV-21953 deadlock between BACKUP STAGE BLOCK_COMMIT and parallel repl.Monty2020-07-211-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The issue was: T1, a parallel slave worker thread, is waiting for another worker thread to commit. While waiting, it has the MDL_BACKUP_COMMIT lock. T2, working for mariabackup, is doing BACKUP STAGE BLOCK_COMMIT and blocks all commits. This causes a deadlock as the thread T1 is waiting for can't commit. Fixed by moving locking of MDL_BACKUP_COMMIT from ha_commit_trans() to commit_one_phase_2() Other things: - Added a new argument to ha_comit_one_phase() to signal if the transaction was a write transaction. - Ensured that ha_maria::implicit_commit() is always called under MDL_BACKUP_COMMIT. This code is not needed in 10.5 - Ensure that MDL_Request values 'type' and 'ticket' are always initialized. This makes it easier to check the state of the MDL_Request. - Moved thd->store_globals() earlier in handle_rpl_parallel_thread() as thd->init_for_queries() could use a MDL that could crash if store_globals where not called. - Don't call ha_enable_transactions() in THD::init_for_queries() as this is both slow (uses MDL locks) and not needed.
* | | Merge 10.3 into 10.4Marko Mäkelä2020-07-021-0/+121
|\ \ \ | |/ /
| * | Merge 10.2 into 10.3Marko Mäkelä2020-07-021-0/+121
| |\ \ | | |/
| | * MDEV-20428 after-merge fix: Stabilize the testMarko Mäkelä2020-07-011-3/+3
| | |
| | * Merge 10.1 into 10.2bb-10.2-mergeMarko Mäkelä2020-07-011-0/+121
| | |\
| | | * MDEV-20428: "Start binlog_dump" message doesn't indicate GTID positionSujatha2020-06-161-0/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: ======= The "Start binlog_dump" message hasn't been updated to include the slave's requested GTID position: 20:05:05 139836760311552 [Note] Start binlog_dump to slave_server(2), pos(, 4) For diagnostic purposes, it would be helpful if the GTID position were included. Fix: === Imporve "Start binlog_dump" print message to include "using_gtid" and "GTID position" requested by slave. Ex: [Note] Start binlog_dump to slave_server(2), pos(, 4), using_gtid(1), gtid('1-1-201,2-2-100') [Note] Start binlog_dump to slave_server(3), pos('mariadb-bin.004142', 507988273), using_gtid(0), gtid('')
| | | * MDEV-15152 Optimistic parallel slave doesnt cope well with START SLAVE UNTILAndrei Elkin2020-05-261-0/+466
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The immediate bug was caused by a failure to recognize a correct position to stop the slave applier run in optimistic parallel mode. There were the following set of issues that the analysis unveil. 1 incorrect estimate for the event binlog position passed to is_until_satisfied 2 wait for workers to complete by the driver thread did not account non-group events that could be left unprocessed and thus to mix up the last executed binlog group's file and position: the file remained old and the position related to the new rotated file 3 incorrect 'slave reached file:pos' by the parallel slave report in the error log 4 relay log UNTIL missed out the parallel slave branch in is_until_satisfied. The patch addresses all of them to simplify logics of log change notification in either the master and relay-log until case. P.1 is addressed with passing the event into is_until_satisfied() for proper analisis by the function. P.2 is fixed by changes in handle_queued_pos_update(). P.4 required removing relay-log change notification by workers. Instead the driver thread updates the notion of the current relay-log fully itself with aid of introduced bool Relay_log_info::until_relay_log_names_defer. An extra print out of the requested until file:pos is arranged with --log-warning=3.
* | | | Merge 10.3 into 10.4Marko Mäkelä2020-05-301-0/+465
|\ \ \ \ | |/ / /
| * | | Merge 10.2 into 10.3Marko Mäkelä2020-05-271-0/+465
| |\ \ \ | | |/ /
| | * | MDEV-15152 Optimistic parallel slave doesnt cope well with START SLAVE UNTILAndrei Elkin2020-05-261-0/+465
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The immediate bug was caused by a failure to recognize a correct position to stop the slave applier run in optimistic parallel mode. There were the following set of issues that the analysis unveil. 1 incorrect estimate for the event binlog position passed to is_until_satisfied 2 wait for workers to complete by the driver thread did not account non-group events that could be left unprocessed and thus to mix up the last executed binlog group's file and position: the file remained old and the position related to the new rotated file 3 incorrect 'slave reached file:pos' by the parallel slave report in the error log 4 relay log UNTIL missed out the parallel slave branch in is_until_satisfied. The patch addresses all of them to simplify logics of log change notification in either the master and relay-log until case. P.1 is addressed with passing the event into is_until_satisfied() for proper analisis by the function. P.2 is fixed by changes in handle_queued_pos_update(). P.4 required removing relay-log change notification by workers. Instead the driver thread updates the notion of the current relay-log fully itself with aid of introduced bool Relay_log_info::until_relay_log_names_defer. An extra print out of the requested until file:pos is arranged with --log-warning=3.
* | | | Merge 10.3 into 10.4Marko Mäkelä2020-05-201-2/+1
|\ \ \ \ | |/ / /
| * | | Merge 10.2 into 10.3Marko Mäkelä2020-05-201-2/+1
| |\ \ \ | | |/ /
| | * | Merge 10.1 into 10.2Marko Mäkelä2020-05-201-2/+1
| | |\ \ | | | |/
| | | * MDEV-22472 rpl.rpl_fail_register failed in buildbot with wrong resultAndrei Elkin2020-05-191-2/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a new test from upstream that did not expect the correct value of the command slot of the Dump thread when the latter gets killed. The test is made to expect "Killed" string as the command in show-processlist as it is supposed to when a thread gets killed.
* | | | Merge 10.3 into 10.4Marko Mäkelä2020-05-161-64/+0
|\ \ \ \ | |/ / / | | | | | | | | | | | | We will expose some more std::atomic internals in Atomic_counter, so that dict_index_t::lock will support the default assignment operator.
| * | | Merge 10.2 into 10.3Marko Mäkelä2020-05-151-64/+0
| |\ \ \ | | |/ /
| | * | MDEV-22456 Dropping the adaptive hash index may cause DDL to lock up InnoDBMarko Mäkelä2020-05-151-64/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the InnoDB buffer pool contains many pages for a table or index that is being dropped or rebuilt, and if many of such pages are pointed to by the adaptive hash index, dropping the adaptive hash index may consume a lot of time. The time-consuming operation of dropping the adaptive hash index entries is being executed while the InnoDB data dictionary cache dict_sys is exclusively locked. It is not actually necessary to drop all adaptive hash index entries at the time a table or index is being dropped or rebuilt. We can let the LRU replacement policy of the buffer pool take care of this gradually. For this to work, we must detach the dict_table_t and dict_index_t objects from the main dict_sys cache, and once the last adaptive hash index entry for the detached table is removed (when the garbage page is evicted from the buffer pool) we can free the dict_table_t and dict_index_t object. Related to this, in MDEV-16283, we made ALTER TABLE...DISCARD TABLESPACE skip both the buffer pool eviction and the drop of the adaptive hash index. We shifted the burden to ALTER TABLE...IMPORT TABLESPACE or DROP TABLE. We can remove the eviction from DROP TABLE. We must retain the eviction in the ALTER TABLE...IMPORT TABLESPACE code path, so that in case the discarded table is being re-imported with the same tablespace identifier, the fresh data from the imported tablespace will replace any stale pages in the buffer pool. rpl.rpl_failed_drop_tbl_binlog: Remove the test. DROP TABLE can no longer be interrupted inside InnoDB. fseg_free_page(), fseg_free_step(), fseg_free_step_not_header(), fseg_free_page_low(), fseg_free_extent(): Remove the parameter that specifies whether the adaptive hash index should be dropped. btr_search_lazy_free(): Lazily free an index when the last reference to it is dropped from the adaptive hash index. buf_pool_clear_hash_index(): Declare static, and move to the same compilation unit with the bulk of the adaptive hash index code. dict_index_t::clone(), dict_index_t::clone_if_needed(): Clone an index that is being rebuilt while adaptive hash index entries exist. The original index will be inserted into dict_table_t::freed_indexes and dict_index_t::set_freed() will be called. dict_index_t::set_freed(), dict_index_t::freed(): Note that or check whether the index has been freed. We will use the impossible page number 1 to denote this condition. dict_index_t::n_ahi_pages(): Replaces btr_search_info_get_ref_count(). dict_index_t::detach_columns(): Move the assignment n_fields=0 to ha_innobase_inplace_ctx::clear_added_indexes(). We must have access to the columns when freeing the adaptive hash index. Note: dict_table_t::v_cols[] will remain valid. If virtual columns are dropped or added, the table definition will be reloaded in ha_innobase::commit_inplace_alter_table(). buf_page_mtr_lock(): Drop a stale adaptive hash index if needed. We will also reduce the number of btr_get_search_latch() calls and enclose some more code inside #ifdef BTR_CUR_HASH_ADAPT in order to benefit cmake -DWITH_INNODB_AHI=OFF.
* | | | MDEV-19650: Privilege bug on MariaDB 10.4Oleksandr Byelkin2020-05-071-0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also fixes: MDEV-21487: Implement option for mysql_upgrade that allows root@localhost to be replaced MDEV-21486: Implement option for mysql_install_db that allows root@localhost to be replaced Add user mariadb.sys to be definer of user view (and has right on underlying table global_priv for required operation over global_priv (SELECT,UPDATE,DELETE)) Also changed definer of gis functions in case of creation, but they work with any definer so upgrade script do not try to push this change.
* | | | Merge 10.3 into 10.4Marko Mäkelä2020-05-052-1/+45
|\ \ \ \ | |/ / /
| * | | Merge branch '10.2' into 10.3Oleksandr Byelkin2020-05-042-1/+45
| |\ \ \ | | |/ /
| | * | Merge branch '10.1' into 10.2Oleksandr Byelkin2020-05-022-1/+45
| | |\ \ | | | |/
| | | * Merge branch '5.5' into 10.1Oleksandr Byelkin2020-04-302-1/+44
| | | |\
| | | | * Bug#29915479 RUNNING COM_REGISTER_SLAVE WITHOUT COM_BINLOG_DUMP CAN RESULTS ↵Sergei Golubchik2020-04-301-0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | IN SERVER EXIT in fact, in MariaDB it cannot, but it can show spurious slaves in SHOW SLAVE HOSTS. slave was registered in COM_REGISTER_SLAVE and un-registered after COM_BINLOG_DUMP. If there was no COM_BINLOG_DUMP, it would never unregister.
| | | | * Bug#28388217 - SERVER CAN FAIL WHILE REPLICATING CONDITIONAL COMMENTSSergei Golubchik2020-04-291-1/+11
| | | | | | | | | | | | | | | | | | | | test case
* | | | | Merge 10.3 into 10.4Marko Mäkelä2020-03-305-30/+30
|\ \ \ \ \ | |/ / / /
| * | | | Merge 10.2 into 10.3Marko Mäkelä2020-03-305-30/+30
| |\ \ \ \ | | |/ / /
| | * | | MDEV-21360 restore debud_dbug through a session variable instead of '-d,..'bb-10.2-MDEV-21360Alice Sherepa2020-03-235-30/+30
| | | | |
* | | | | Merge branch '10.3' into 10.4Oleksandr Byelkin2020-01-247-12/+9
|\ \ \ \ \ | |/ / / /
| * | | | Merge branch '10.2' into 10.3Oleksandr Byelkin2020-01-247-12/+9
| |\ \ \ \ | | |/ / /
| | * | | MDEV-21360 save/restore debud_dbug instead of total reset at the end of the testAlice Sherepa2020-01-217-12/+9
| | | | |
* | | | | MDEV-20821 parallel slave server shutdown hangAndrei Elkin2020-01-212-0/+184
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Parallel slave server shutdown found to be hanging in close_connections() triggered by shutdown due to a slave worker thread would not be notified to exit in case the worker was sitting idle. Fixed with destroying the worker pool earlier that is in slave_prepare_for_shutdown() when all their driver threads have already left. A test file is added to simulate the bug condition as well as check multi-sourced and not-idle worker cases.
* | | | | Merge 10.3 into 10.4Marko Mäkelä2020-01-2014-29/+76
|\ \ \ \ \ | |/ / / / | | | | | | | | | | | | | | | The MDEV-17062 fix in commit c4195305b2a8431f39a4c75cc1c66ba43685f7a0 was omitted.
| * | | | Merge branch '10.2' into 10.3Sergei Petrunia2020-01-1714-29/+76
| |\ \ \ \ | | |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | # Conflicts: # mysql-test/suite/galera/r/MW-388.result # mysql-test/suite/galera/t/MW-388.test # mysql-test/suite/innodb/r/truncate_inject.result # mysql-test/suite/innodb/t/truncate_inject.test # mysql-test/suite/rpl/r/rpl_stop_slave.result # mysql-test/suite/rpl/t/rpl_stop_slave.test # sql/sp_head.cc # sql/sp_head.h # sql/sql_lex.cc # sql/sql_yacc.yy # storage/xtradb/buf/buf0dblwr.cc
| | * | | MDEV-21360 global debug_dbug pre-test value restoration issuesAlice Sherepa2020-01-1512-27/+27
| | | | |
| | * | | MDEV-21360 debug_dbug pre-test value restoration issuesAlice Sherepa2020-01-1513-29/+30
| | | | |
| | * | | MDEV-18514: Assertion `!writer.checksum_len || writer.remains == 0' failedSujatha2020-01-091-0/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Analysis: ======== 'max_binlog_cache_size' is configured and a huge transaction is executed. When the transaction specific events size exceeds 'max_binlog_cache_size' the event cannot be written to the binary log cache and cache write error is raised. Upon cache write error the statement is rolled back and the transaction cache should be truncated to a previous statement specific position. The truncate operation should reset the cache to earlier valid positions and flush the new changes. Even though the flush is successful the cache write error is still in marked state. The truncate code interprets the cache write error as cache flush failure and returns abruptly without modifying the write cache parameters. Hence cache is in a invalid state. When a COMMIT statement is executed in this session it tries to flush the contents of transaction cache to binary log. Since cache has partial events the cache write operation will report 'writer.remains' assert. Fix: === Binlog truncate function resets the cache to a specified size. As a first step of truncation, clear the cache write error flag that was raised during earlier execution. With this new errors that surface during cache truncation can be clearly identified.
* | | | | Merge branch '10.3' into 10.4Oleksandr Byelkin2019-12-091-1/+1
|\ \ \ \ \ | |/ / / /
| * | | | Lintian complains on spelling errorFaustin Lammler2019-12-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | The lintian check complains on spelling error: https://salsa.debian.org/mariadb-team/mariadb-10.3/-/jobs/95739
* | | | | Merge 10.3 into 10.4Marko Mäkelä2019-11-121-0/+74
|\ \ \ \ \ | |/ / / /
| * | | | merge 10.2->10.3 with conflict resolutionsAndrei Elkin2019-11-111-0/+74
| |\ \ \ \ | | |/ / /
| | * | | manual merge 10.1->10.2Andrei Elkin2019-11-111-0/+75
| | |\ \ \ | | | |/ /
| | | * | MDEV-19376 Repl_semi_sync_master::commit_trx assertion failure: ... || ↵Andrei Elkin2019-11-101-0/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | !m_active_tranxs->is_tranx_end_pos(trx_wait_binlog_name, trx_wait_binlog_pos) The assert indicates that the current transaction got caught uncleaned from the semisync master's cache when it is signaled to proceed upon its ack receive. The reason of missed cleanup turns out to be a flaw in the gtid connect mode. A submitted by connecting slave value of its last received event's binlog file *name* was adopted into {{Repl_semi_sync_master::m_reply_file_name}} as a part of semisync initialization. Notice that the initialization still refines the position part of the submitted last received event's binlog coordinates. The master side binlog filename:pos refinement is specific to the gtid connect mode for purpose of computing the latest binlog file to resume slave feeding from. Effectively in the gtid connect mode the computed resumption filename:pos may appear smaller in which case a new post-connect time committing transaction may be logged with its filename:pos also less than the submitted coordinates and that triggers the assert. Fixed with making the semisync initialization to use the refined filename:pos. It is guaranteed to be less than any new generated transaction's binlog:pos.
* | | | | Merge 10.3 into 10.4Marko Mäkelä2019-11-011-0/+30
|\ \ \ \ \ | |/ / / /
| * | | | read-only slave using statement replication should replicate tmp tablesMichael Widenius2019-10-211-0/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Relates to MDEV-17863 DROP TEMPORARY TABLE creates a transaction in binary log on read only server Other things: - Fixed that insert into normal_table select from tmp_table is replicated as row events if tmp_table doesn't exists on slave.
* | | | | Merge 10.3 into 10.4Marko Mäkelä2019-10-106-23/+34
|\ \ \ \ \ | |/ / / /