Commit message (Author, Age, Files, Lines)
* MDEV-24449 Crash recovery may fail to apply some log [bb-10.5-corruption] (Marko Mäkelä, 2020-12-18, 3 files, -5/+17)
| We are seeing !buf_pool.any_io_pending() assertion failures in srv_start() ever since MDEV-21452 in 10.6. But, the problem appears to be older.
| recv_sys_t::apply(): At the end of each batch, wait for pending reads to complete, so that any pending changes will have been added to buf_pool.flush_list before we flush the buffer pool.
| io_callback(): Do not invoke read_slots->release() before the callback function has returned, to ensure the correct operation of recv_sys_t::apply().
* MDEV-24445 Using innodb_undo_tablespaces corrupts system tablespace (Marko Mäkelä, 2020-12-18, 9 files, -104/+83)
| In the rewrite of MDEV-8139 (based on MDEV-15528), we introduced a wrong assumption that any persistent tablespace that is not an .ibd file is the system tablespace. This assumption is broken when innodb_undo_tablespaces (files undo001, undo002, ...) are being used. By default, we have innodb_undo_tablespaces=0 (the persistent undo log is being stored in the system tablespace).
| In MDEV-15528 and MDEV-8139 we rewrote the page scrubbing logic so that it will follow the tried-and-true write-ahead logging protocol, first writing FREE_PAGE records and then, during page flushing, zerofilling or hole-punching freed pages. Unfortunately, the implementation included a wrong assumption that anything that is not in an .ibd file must be the system tablespace. This wrong assumption would cause overwrites of valid data pages in the system tablespace.
| mtr_t::m_freed_in_system_tablespace: Remove.
| mtr_t::m_freed_space: The tablespace associated with m_freed_pages.
| buf_page_free(): Take the tablespace and page number as parameters, instead of taking a page identifier.
* MDEV-24442 Assertion space->referenced() failed in fil_crypt_space_needs_rotation (Marko Mäkelä, 2020-12-18, 2 files, -16/+19)
| A race condition between deleting an .ibd file and fil_crypt_thread marking pages dirty was introduced in commit 118e258aaac5da75a2ac4556201aaea3688fac67 (part of MDEV-23855).
| fil_space_t::acquire_if_not_stopped(): Correctly return false if the STOPPING flag is set, indicating that any further activity on the tablespace must be avoided. Also, remove the constant parameter have_mutex=true and move the function declaration to the same compilation unit with the only callers.
| fil_crypt_flush_space(): Remove an unused variable.
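The behaviour described for acquire_if_not_stopped() can be sketched roughly as follows. This is a minimal illustration, not the actual InnoDB code: the type StoppableSpace, the bit chosen for STOPPING, and the helpers set_stopping() and referenced() are assumptions made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a tablespace object; this is not the real fil_space_t.
struct StoppableSpace {
  // Assumed layout: the top bit is the STOPPING flag, the rest is a reference count.
  static constexpr std::uint32_t STOPPING = 1U << 31;
  std::atomic<std::uint32_t> n_pending{0};

  // Take a reference unless STOPPING is set; return false once it is set.
  bool acquire_if_not_stopped() {
    std::uint32_t n = n_pending.load(std::memory_order_relaxed);
    while (!(n & STOPPING)) {
      // On failure, compare_exchange_weak reloads n, and the flag is rechecked.
      if (n_pending.compare_exchange_weak(n, n + 1,
                                          std::memory_order_acquire,
                                          std::memory_order_relaxed))
        return true;
    }
    return false;
  }

  // Drop a reference that was obtained by acquire_if_not_stopped().
  void release() {
    std::uint32_t old = n_pending.fetch_sub(1, std::memory_order_release);
    assert((old & ~STOPPING) != 0);  // the caller must have held a reference
    (void) old;
  }

  // Mark the object as going away; existing references remain counted.
  void set_stopping() { n_pending.fetch_or(STOPPING, std::memory_order_acq_rel); }

  // Number of references currently held, ignoring the flag bit.
  std::uint32_t referenced() const { return n_pending.load() & ~STOPPING; }
};
```

Checking the flag and incrementing the count in a single compare-and-swap means a thread can never obtain a reference after the flag has been observed as set.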
* MDEV-24426 fixup: Assertion failure on shutdown (Marko Mäkelä, 2020-12-18, 1 file, -3/+2)
| fil_crypt_find_space_to_rotate(): Always handle the sentinel value that indicates that we have run out of work, even if at the same time the thread should shut down due to other reasons.
| Thanks to Matthias Leich for reproducing this bug with RQG.
* MDEV-24426 fil_crypt_thread keep spinning even if innodb_encryption_rotate_key_age=0 [bb-10.5-MDEV-24426] (Marko Mäkelä, 2020-12-17, 2 files, -20/+39)
| After MDEV-15528, two modes of operation in the fil_crypt_thread remain, depending on whether innodb_encryption_rotate_key_age=0 (whether key rotation is disabled). If the key rotation is disabled, the fil_crypt_thread would miss the opportunity to sleep, which will result in lots of wasted CPU usage.
| fil_crypt_return_iops(): Add a parameter to specify whether other fil_crypt_thread should be woken up.
| fil_system_t::keyrotate_next(): Return the special value fil_system.temp_space to indicate that no work is to be done.
| fil_space_t::next(): Propagate the special value fil_system.temp_space to the caller.
| fil_crypt_find_space_to_rotate(): If no work is to be done, do not wake up other threads.
* Contain AIX perror (Etienne Guesnet, 2020-12-16, 1 file, -7/+13)
|
* Fix build on GCC 5 (Etienne Guesnet, 2020-12-16, 1 file, -1/+1)
|
* Add LARGE_FILES flag for GCC AIX build (Etienne Guesnet, 2020-12-16, 1 file, -2/+2)
|
* Add -berok for head test on AIX (Etienne Guesnet, 2020-12-16, 1 file, -0/+5)
|
* Parse GSSAPI flags on AIX (Etienne Guesnet, 2020-12-16, 1 file, -1/+5)
|
* Add flags for AIX build (Etienne Guesnet, 2020-12-16, 1 file, -0/+7)
|
* Remove -Werror for AIX (Etienne Guesnet, 2020-12-16, 1 file, -1/+5)
|
* AIX workaround for GCC include bug (Etienne Guesnet, 2020-12-16, 1 file, -0/+6)
|
* AIX workaround for GCC TOC bug (Etienne Guesnet, 2020-12-16, 2 files, -0/+6)
|
* Support of AIX for auth_socket plugin (Etienne Guesnet, 2020-12-16, 2 files, -0/+20)
|
* Add build on AIX (Etienne Guesnet, 2020-12-16, 14 files, -68/+81)
|
* MDEV-24366 Use environment variables as S3 test case variables (zhaorenhai, 2020-12-15, 3 files, -16/+92)
| Move the S3 test case variables to suite.pm to use environment variables.
| Use minio credentials if a TCP connection to localhost:9000 is accepted so the current build works correctly.
| Reviewer: Daniel Black
| closes #1711
* MDEV-23659 Update Galera disabled.def file (Stepan Patryshev, 2020-12-14, 1 file, -1/+0)
|
* MDEV-24313 fixup: GCC 8 -Wconversion (Marko Mäkelä, 2020-12-14, 1 file, -4/+4)
|
* MDEV-24313 fixup: GCC -Wparentheses (Marko Mäkelä, 2020-12-14, 1 file, -1/+2)
|
* MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1 [bb-10.5-MDEV-24313] (Marko Mäkelä, 2020-12-14, 10 files, -83/+58)
| In commit 5e62b6a5e06eb02cbde1e34e95e26f42d87fce02 (MDEV-16264) the logic of os_aio_init() was changed so that it will never fail, but instead automatically disable innodb_use_native_aio (which is enabled by default) if the io_setup() system call would fail due to resource limits being exceeded. This is questionable, especially because falling back to simulated AIO may lead to significantly reduced performance.
| srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads: Change the data type from ulong to uint.
| os_aio_init(): Remove the parameters, and actually return an error code.
| thread_pool::configure_aio(): Do not silently fall back to simulated AIO.
| Reviewed by: Vladislav Vaintroub
* MDEV-24313 (1 of 2): Hang with innodb_write_io_threads=1 (Marko Mäkelä, 2020-12-14, 3 files, -3/+3)
| After commit a5a2ef079cec378340d8b575aef05974b0b3442e (part of MDEV-23855) implemented asynchronous doublewrite, it is possible that the server will hang when the following parameters are in effect:
|   innodb_doublewrite=1 (default)
|   innodb_write_io_threads=1
|   innodb_use_native_aio=0
| Note: In commit 5e62b6a5e06eb02cbde1e34e95e26f42d87fce02 (MDEV-16264) the logic of os_aio_init() was changed so that it will never fail, but instead automatically disable innodb_use_native_aio (which is enabled by default) if the io_setup() system call would fail due to resource limits being exceeded.
| Before commit a5a2ef079cec378340d8b575aef05974b0b3442e, we used a synchronous write for the doublewrite buffer batches, always at most 64 pages at a time. So, upon completing a doublewrite batch, a single thread would submit at most 64 page writes (for the individual pages that were first written to the doublewrite buffer). With that commit, we may submit up to 128 page writes at a time.
| The maximum number of outstanding requests per thread is 256. Because the maximum number of asynchronous write submissions per thread was roughly doubled, it is now possible that buf_dblwr_t::flush_buffered_writes_completed() will hang in io_slots::acquire(), called via os_aio() and fil_space_t::io(), when submitting writes of the individual blocks.
| We will prevent this type of hang by increasing the minimum number of innodb_write_io_threads from 1 to 2, so that this type of hang would only become possible when 512 outstanding write requests are exceeded.
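The arithmetic behind the new minimum can be written out as a small sketch. The figures (256 requests per thread, minimum raised from 1 to 2) come from the commit message above; the constant and function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Figures taken from the commit message; all names here are hypothetical.
constexpr unsigned MAX_REQUESTS_PER_THREAD = 256;  // outstanding requests per I/O thread
constexpr unsigned MIN_WRITE_IO_THREADS = 2;       // new minimum (previously 1)

// Total number of write requests that can be outstanding at once.
constexpr unsigned max_outstanding_writes(unsigned write_io_threads) {
  return write_io_threads * MAX_REQUESTS_PER_THREAD;
}

// Raise a too-small setting to the new minimum.
constexpr unsigned effective_write_io_threads(unsigned requested) {
  return requested < MIN_WRITE_IO_THREADS ? MIN_WRITE_IO_THREADS : requested;
}

static_assert(max_outstanding_writes(1) == 256, "old minimum: one write thread");
static_assert(max_outstanding_writes(effective_write_io_threads(1)) == 512,
              "new minimum: two write threads");
```

With one write thread, the 256 available slots could be exhausted by the doubled batch size; with two threads the limit moves to 512.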
* MDEV-24391 heap-use-after-free in fil_space_t::flush_low() [bb-10.5-MDEV-24391] (Marko Mäkelä, 2020-12-11, 5 files, -15/+21)
| We observed a race condition that involved two threads executing fil_flush_file_spaces() and one thread executing fil_delete_tablespace(). After one of the fil_flush_file_spaces() observed that space.needs_flush_not_stopping() is set and was releasing the fil_system.mutex, the other fil_flush_file_spaces() would complete the execution of fil_space_t::flush_low() on the same tablespace. Then, fil_delete_tablespace() would destroy the object, because the value of fil_space_t::n_pending did not prevent that. Finally, the fil_flush_file_spaces() would resume execution and invoke fil_space_t::flush_low() on the freed object.
| This race condition was introduced in commit 118e258aaac5da75a2ac4556201aaea3688fac67 of MDEV-23855.
| fil_space_t::flush(): Add a template parameter that indicates whether the caller is holding a reference to prevent the tablespace from being freed.
| buf_dblwr_t::flush_buffered_writes_completed(), row_quiesce_table_start(): Acquire a reference for the duration of the fil_space_t::flush_low() operation. It should be impossible for the object to be freed in these code paths, but we want to satisfy the debug assertions.
| fil_space_t::flush_low(): Do not increment or decrement the reference count, but instead assert that the caller is holding a reference.
| fil_space_extend_must_retry(), fil_flush_file_spaces(): Acquire a reference before releasing fil_system.mutex. This is what will fix the race condition.
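The essence of the fix, taking a reference before the mutex is released so that a concurrent delete cannot free the object, might look like this in miniature. The types PinnedSpace and SpaceRegistry and their members are invented for the example and do not correspond to the real fil_system or fil_space_t interfaces.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical miniature of the pattern; names do not match the real InnoDB code.
struct PinnedSpace {
  unsigned n_pending = 0;   // reference count, protected by SpaceRegistry::mutex here
  bool needs_flush = true;
};

struct SpaceRegistry {
  std::mutex mutex;                      // plays the role of fil_system.mutex
  PinnedSpace *space = new PinnedSpace;  // the single tablespace in this sketch

  ~SpaceRegistry() { delete space; }

  // Start a flush: take the reference BEFORE the mutex is released.
  PinnedSpace *begin_flush() {
    std::lock_guard<std::mutex> guard(mutex);
    if (!space || !space->needs_flush) return nullptr;
    ++space->n_pending;
    return space;
  }

  // Finish a flush: the caller must still be holding its reference.
  void end_flush(PinnedSpace *s) {
    std::lock_guard<std::mutex> guard(mutex);
    assert(s->n_pending > 0);
    s->needs_flush = false;
    --s->n_pending;
  }

  // Deletion is refused while any reference is held.
  bool try_delete() {
    std::lock_guard<std::mutex> guard(mutex);
    if (!space || space->n_pending) return false;
    delete space;
    space = nullptr;
    return true;
  }
};
```

Between begin_flush() and end_flush() the mutex is not held, yet try_delete() cannot free the object because the reference count is nonzero.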
* Remove unused DBUG_EXECUTE_IF "ignore_punch_hole" (Marko Mäkelä, 2020-12-09, 2 files, -18/+0)
| Since commit ea21d630be639317be0dc9d2b72a04f3ef3f9c7b we conditionally define a variable that only plays a role on systems that support hole-punching (explicit creation of sparse files). However, that broke debug builds on such systems.
| It turns out that the debug_dbug label "ignore_punch_hole" is not at all used in MariaDB server. It would be covered by the MySQL 5.7 test innodb.table_compress. (Note: MariaDB 10.1 implemented page_compressed tables before something comparable appeared in MySQL 5.7.)
* MDEV-12227 Defer writes to the InnoDB temporary tablespace [bb-10.5-MDEV-24369] (Marko Mäkelä, 2020-12-09, 6 files, -42/+79)
| The flushing of the InnoDB temporary tablespace is unnecessarily tied to the write-ahead redo logging and redo log checkpoints, which must be tied to the page writes of persistent tablespaces.
| Let us simply omit any pages of temporary tables from buf_pool.flush_list. In this way, log checkpoints will never incur any 'collateral damage' of writing out unmodified changes for temporary tables.
| After this change, pages of the temporary tablespace can only be written out by buf_flush_lists(n_pages,0) as part of LRU eviction. Hopefully, most of the time, that code will never be executed, and instead, the temporary pages will be evicted by buf_release_freed_page() without ever being written back to the temporary tablespace file.
| This should improve the efficiency of the checkpoint flushing and the buf_flush_page_cleaner thread.
| Reviewed by: Vladislav Vaintroub
* Fix -Wunused-but-set-variable (Marko Mäkelä, 2020-12-08, 1 file, -3/+12)
|
* MDEV-24369 Page cleaner sleeps despite innodb_max_dirty_pages_pct_lwm being exceeded (Marko Mäkelä, 2020-12-08, 1 file, -32/+18)
| MDEV-24278 improved the page cleaner so that it will no longer wake up once per second on an idle server. However, with innodb_adaptive_flushing (the default) the function page_cleaner_flush_pages_recommendation() could initially return 0 even if there is work to do.
| af_get_pct_for_dirty(): Remove. Based on a comment here, it appears that an initial intention of innodb_max_dirty_pages_pct_lwm=0.0 (the default value) was to disable something. That ceased to hold in MDEV-23855: the value is a pure threshold; the page cleaner will not perform any work unless the threshold is exceeded.
| page_cleaner_flush_pages_recommendation(): Add the parameter dirty_blocks to ensure that buf_pool.flush_list will eventually be emptied.
* MDEV-24351: S3, same-backend replication: Dropping a table on master... (Sergei Petrunia, 2020-12-08, 5 files, -4/+73)
| ..causes error on slave.
| Cause: if the master doesn't have the frm file for the table, DROP TABLE code will call ha_delete_table_force() to drop the table in all available storage engines. The issue was that this code path didn't check for the HTON_TABLE_MAY_NOT_EXIST_ON_SLAVE flag for the storage engine, and so did not add "... IF EXISTS" to the statement that's written to the binary log. This can cause an error on the slave when it tries to drop a table that's already gone.
* Simplify clang workarounds. (Vladislav Vaintroub, 2020-12-07, 1 file, -9/+2)
|
* MDEV-24350 buf_dblwr unnecessarily uses memory-intensive srv_stats counters [bb-10.5-MDEV-24350] (Marko Mäkelä, 2020-12-04, 5 files, -22/+45)
| The counters in srv_stats use std::atomic and multiple cache lines per counter. This is overkill in a case where a critical section already exists in the code. A regular variable will work just fine, with much smaller memory bus impact.
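The idea of keeping statistics in plain variables that are updated inside an existing critical section can be sketched as follows. The class DoublewriteStats and its members are hypothetical; the sketch only assumes, as the commit message states, that a mutex is already held at the point where the counters change.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical sketch; this is not the real buf_dblwr_t.
class DoublewriteStats {
  mutable std::mutex mutex_;       // the critical section that exists anyway
  std::size_t pages_written_ = 0;  // plain counters, protected by mutex_
  std::size_t batches_ = 0;

public:
  // Called when a batch completes; the counters piggyback on the same lock.
  void batch_completed(std::size_t pages) {
    std::lock_guard<std::mutex> guard(mutex_);
    // ... the work that already required the mutex would happen here ...
    pages_written_ += pages;
    ++batches_;
  }

  std::size_t pages_written() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return pages_written_;
  }

  std::size_t batches() const {
    std::lock_guard<std::mutex> guard(mutex_);
    return batches_;
  }
};
```

Because every update happens while the mutex is held, no atomic read-modify-write operations or per-counter cache-line padding are needed.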
* MDEV-24348 InnoDB shutdown hang with innodb_flush_sync=0 (Marko Mäkelä, 2020-12-04, 4 files, -0/+23)
| This hang was caused by MDEV-23855, and we failed to fix it in MDEV-24109 (commit 4cbfdeca840098b9ed0d8147d43288c36743a328).
| When buf_flush_ahead() is invoked soon before server shutdown and the non-default setting innodb_flush_sync=OFF is in effect and the buffer pool contains dirty pages of temporary tables, the page cleaner thread may remain in an infinite loop without completing its work, thus causing the shutdown to hang.
| buf_flush_page_cleaner(): If the buffer pool contains no unmodified persistent pages, ensure that buf_flush_sync_lsn=0 will be assigned, so that shutdown will proceed.
| The test case is not deterministic. On my system, it reproduced the hang with 95% probability when running multiple instances of the test in parallel, and 4% when running single-threaded.
| Thanks to Eugene Kosov for debugging and testing this.
* Fixed usage of not initialized memory in LIKE ... ESCAPE (Monty, 2020-12-03, 2 files, -65/+91)
| This was noticed when running "mtr --valgrind main.precedence".
| The problem was that Item_func_like::escape could be left uninitialized when used with views combined with UNIONS like in:
|   create or replace view v1 as select 2 LIKE 1 ESCAPE 3 IN (SELECT 0 UNION SELECT 1), 2 LIKE 1 ESCAPE (3 IN (SELECT 0 UNION SELECT 1)), (2 LIKE 1 ESCAPE 3) IN (SELECT 0 UNION SELECT 1);
| In fix_escape_item(), the above query causes escape_item->const_during_execution() to be true and escape_item->const_item() to be false, in which case 'escape' is never calculated.
| The fix is to move the main logic of fix_escape_item() out to a separate function and call that function once in Item.
| Other things:
| - Reorganized fields in Item_func_like class to make it more compact
* MDEV-22929 fixup: root_name() clash with clang++ <fstream> (Marko Mäkelä, 2020-12-03, 2 files, -5/+3)
| The clang++ -stdlib=libc++ header file <fstream> depends on <filesystem> that defines a member function path::root_name(), which conflicts with the rather unused #define root_name() that had been introduced in commit 7c58e97bf6f80a251046c5b3e7bce826fe058bd6.
| Because an instrumented -stdlib=libc++ (rather than the default -stdlib=libstdc++) is easier to build for a working -fsanitize=memory (cmake -DWITH_MSAN=ON), let us remove the conflicting #define for now.
* MDEV-24295: Fix the non-clang build (Marko Mäkelä, 2020-12-02, 1 file, -0/+3)
| Sorry, only tested commit 4174fc1a1bd1f1c29f10264108269bf2e18e2f24 on clang. Other compilers do not define __has_feature().
* MDEV-24295: Fix the WITH_MSAN build (Marko Mäkelä, 2020-12-02, 1 file, -1/+6)
| For some reason, commit 5bb5d4ad3a687ac61a9c5f8ffff6dd231f9b581a made clang++-11 unhappy about a constexpr declaration.
* MDEV-20051 fixup: Correct galera.galera_defaults result (Marko Mäkelä, 2020-12-02, 1 file, -1/+2)
| For some reason, the test was never adjusted for commit e6a50e41da5e9e55031b978fba19a2b3ec4928b8.
* Merge 10.4 into 10.5 (Marko Mäkelä, 2020-12-02, 123 files, -450/+2481)
|\
| * MDEV-15532 after-merge fixes from Monty (Marko Mäkelä, 2020-12-02, 15 files, -22/+22)
| | The Galera tests were massively failing with debug assertions.
| * Merge 10.3 into 10.4 (Marko Mäkelä, 2020-12-01, 100 files, -388/+2275)
| |\
| | * MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered (Vlad Lesin, 2020-12-01, 1 file, -2/+2)
| | | Post-push Windows compilation errors fix.
| | * After merge fixes (Monty, 2020-12-01, 1 file, -2/+2)
| | | Change thd->mdl_context.release_transactional_locks() to thd->mdl_release_transactional_locks()
| | * MDEV-24323 Crash on recovery after kill during instant ADD COLUMN (Marko Mäkelä, 2020-12-01, 3 files, -7/+57)
| | | row_undo_ins_parse_undo_rec(): Do not try to read non-existing virtual column information for the metadata record.
| | * Merge 10.2 into 10.3 (Marko Mäkelä, 2020-12-01, 66 files, -230/+1895)
| | |\
| | | * MDEV-22929 MariaBackup option to report and/or continue when corruption is encountered (Vlad Lesin, 2020-12-01, 15 files, -128/+1204)
| | | | The new option --log-innodb-page-corruption is introduced.
| | | | When this option is set, backup is not interrupted if an InnoDB corrupted page is detected. Instead it logs all found corrupted pages in the innodb_corrupted_pages file in the backup directory and finishes with an error.
| | | | For incremental backup, corrupted pages are also copied to the .delta file, because we can't do an LSN check for such pages during backup. innodb_corrupted_pages will also be created in the incremental backup directory.
| | | | During --prepare, the corrupted pages list is read from the file just after the redo log is applied, and each page from the list is checked for whether it is allocated in its tablespace. If it is not allocated, then it is zeroed out, flushed to the tablespace and removed from the list. If all pages are removed from the list, then --prepare finishes successfully and the innodb_corrupted_pages file is removed from the backup directory. Otherwise --prepare finishes with an error message, and innodb_corrupted_pages contains the list of the pages that were detected as corrupted during backup and are allocated in their tablespaces. This means that the backup directory contains corrupted InnoDB pages, and the backup cannot be considered consistent.
| | | | For incremental --prepare, corrupted pages from .delta files are applied to the base backup, innodb_corrupted_pages is read from both the base and incremental directories, and the same action is performed for the corrupted pages list as for full --prepare. The innodb_corrupted_pages file is modified or removed only in the base directory.
| | | | If DDL happens during backup, it is also processed at the end of backup to have correct tablespace names in innodb_corrupted_pages.
| | | * MDEV-15532 Assertion `!log->same_pk' failed in row_log_table_apply_delete (Monty, 2020-11-30, 21 files, -84/+164)
| | | | The reason for the failure is that thd->mdl_context.release_transactional_locks() was called after commit & rollback even in cases where the current transaction is still active.
| | | | For 10.2, 10.3 and 10.4 the fix is simple:
| | | | - Replace all calls to thd->mdl_context.release_transactional_locks() with thd->release_transactional_locks(). The thd function will only call the mdl_context function if there are no active transactional locks.
| | | | In 10.6 we will implement a better fix, in which we will change the return value of some trans_xxx() functions to indicate whether the function closed the transaction or not. This will avoid the need for the indirect call.
| | | | Other things:
| | | | - trans_xa_commit() and trans_xa_rollback() will automatically call release_transactional_locks() if the transaction is closed.
| | | | - We can't do that for the other functions, as the callers of many of these are doing additional work (like close_thread_tables) before calling release_transactional_locks().
| | | | - Added missing abort_result_set() and missing DBUG_RETURN in select_create::send_eof()
| | | | - Fixed wrong indentation in injector::transaction::commit()
| | | * Disable mysqldump-system.test if auth socket plugin is not dynamic (Monty, 2020-11-30, 1 file, -0/+5)
| | | |
| | | * MDEV-24289: show grants missing with grant option (Anel Husakovic, 2020-11-26, 3 files, -1/+30)
| | | | Reviewed by: serg@mariadb.com
| | | * MDEV-24275 InnoDB persistent stats analyze forces full scan forcing lock crash (Eugene Kosov, 2020-11-25, 1 file, -1/+1)
| | | | This is a fixup patch for MDEV-23991 afc9d00c66db946c8240fe1fa6b345a3a8b6fec1
| | | | We really should read result.n_leaf_pages, which was set previously.
| | | | Analysis and fix were provided by Jukka Santala. Thanks!
| | | | Reviewed by: Marko Mäkelä
| | | * Skip main.lock_view for cmake -DPLUGIN_PERFSCHEMA=NO (Marko Mäkelä, 2020-11-25, 1 file, -0/+1)
| | | |
| | | * Cleanup: row_log_free() (Marko Mäkelä, 2020-11-25, 5 files, -5/+8)
| | | | The nonnull attribute is not applicable to parameters that are passed by reference, at least not in the Intel compiler. Let us remove the reference indirection, which was only there so that the pointer could be assigned to NULL, and let the callers perform that task.
| | | | row_log_allocate(): Fix a bug in out-of-memory error handling that would leave a pointer to freed memory.