summaryrefslogtreecommitdiff
path: root/storage/innobase/fts/fts0fts.cc
Commit message (Collapse)AuthorAgeFilesLines
* Merge 10.7 into 10.8Marko Mäkelä2022-06-141-207/+58
|\
| * MDEV-25581 Allow user thread to do InnoDB fts cache syncThirunarayanan Balathandayuthapani2022-06-101-207/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problem: ======== InnoDB FTS requesting the fts sync of the table once the fts cache size reaches 1/10 of innodb_ft_cache_size. But fts_sync() releases cache lock when writing the word. By doing this, InnoDB insert thread increases the innodb fts cache memory and SYNC operation will take more time to complete. Solution: ========= Remove the fts sync operation(FTS_MSG_SYNC_TABLE) from the fts optimize background thread. Instead of that, allow user thread to sync the InnoDB fts cache when the cache size exceeds 512 kb. User thread holds cache lock while doing cache syncing, it make sure that other threads doesn't add the docs into the cache. Removed FTS_MSG_SYNC_TABLE and its related function because we do remove the FTS_MSG_SYNC_TABLE message itself. Removed fts_sync_index_check() and all related function because other threads doesn't add while cache operation going on.
* | Merge 10.7 into 10.8Marko Mäkelä2022-06-061-47/+44
|\ \ | |/
| * MDEV-13542: Crashing on corrupted page is unhelpfulMarko Mäkelä2022-06-061-47/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The approach to handling corruption that was chosen by Oracle in commit 177d8b0c125b841c0650d27d735e3b87509dc286 is not really useful. Not only did it actually fail to prevent InnoDB from crashing, but it is making things worse by blocking attempts to rescue data from or rebuild a partially readable table. We will try to prevent crashes in a different way: by propagating errors up the call stack. We will never mark the clustered index persistently corrupted, so that data recovery may be attempted by reading from the table, or by rebuilding the table. This should also fix MDEV-13680 (crash on btr_page_alloc() failure); it was extensively tested with innodb_file_per_table=0 and a non-autoextend system tablespace. We should now avoid crashes in many cases, such as when a page cannot be read or allocated, or an inconsistency is detected when attempting to update multiple pages. We will not crash on double-free, such as on the recovery of DDL in system tablespace in case something was corrupted. Crashes on corrupted data are still possible. The fault injection mechanism that is introduced in the subsequent commit may help catch more of them. buf_page_import_corrupt_failure: Remove the fault injection, and instead corrupt some pages using Perl code in the tests. btr_cur_pessimistic_insert(): Always reserve extents (except for the change buffer), in order to prevent a subsequent allocation failure. btr_pcur_open_at_rnd_pos(): Merged to the only caller ibuf_merge_pages(). btr_assert_not_corrupted(), btr_corruption_report(): Remove. Similar checks are already part of btr_block_get(). FSEG_MAGIC_N_BYTES: Replaces FSEG_MAGIC_N_VALUE. dict_hdr_get(), trx_rsegf_get_new(), trx_undo_page_get(), trx_undo_page_get_s_latched(): Replaced with error-checking calls. trx_rseg_t::get(mtr_t*): Replaces trx_rsegf_get(). trx_rseg_header_create(): Let the caller update the TRX_SYS page if needed. trx_sys_create_sys_pages(): Merged with trx_sysf_create(). dict_check_tablespaces_and_store_max_id(): Do not access DICT_HDR_MAX_SPACE_ID, because it was already recovered in dict_boot(). Merge dict_check_sys_tables() with this function. dir_pathname(): Replaces os_file_make_new_pathname(). row_undo_ins_remove_sec(): Do not modify the undo page by adding a terminating NUL byte to the record. btr_decryption_failed(): Report decryption failures dict_set_corrupted_by_space(), dict_set_encrypted_by_space(), dict_set_corrupted_index_cache_only(): Remove. dict_set_corrupted(): Remove the constant parameter dict_locked=false. Never flag the clustered index corrupted in SYS_INDEXES, because that would deny further access to the table. It might be possible to repair the table by executing ALTER TABLE or OPTIMIZE TABLE, in case no B-tree leaf page is corrupted. dict_table_skip_corrupt_index(), dict_table_next_uncorrupted_index(), row_purge_skip_uncommitted_virtual_index(): Remove, and refactor the callers to read dict_index_t::type only once. dict_table_is_corrupted(): Remove. dict_index_t::is_btree(): Determine if the index is a valid B-tree. BUF_GET_NO_LATCH, BUF_EVICT_IF_IN_POOL: Remove. UNIV_BTR_DEBUG: Remove. Any inconsistency will no longer trigger assertion failures, but error codes being returned. buf_corrupt_page_release(): Replaced with a direct call to buf_pool.corrupted_evict(). fil_invalid_page_access_msg(): Never crash on an invalid read; let the caller of buf_page_get_gen() decide. btr_pcur_t::restore_position(): Propagate failure status to the caller by returning CORRUPTED. opt_search_plan_for_table(): Simplify the code. row_purge_del_mark(), row_purge_upd_exist_or_extern_func(), row_undo_ins_remove_sec_rec(), row_undo_mod_upd_del_sec(), row_undo_mod_del_mark_sec(): Avoid mem_heap_create()/mem_heap_free() when no secondary indexes exist. row_undo_mod_upd_exist_sec(): Simplify the code. row_upd_clust_step(), dict_load_table_one(): Return DB_TABLE_CORRUPT if the clustered index (and therefore the table) is corrupted, similar to what we do in row_insert_for_mysql(). fut_get_ptr(): Replace with buf_page_get_gen() calls. buf_page_get_gen(): Return nullptr and *err=DB_CORRUPTION if the page is marked as freed. For other modes than BUF_GET_POSSIBLY_FREED or BUF_PEEK_IF_IN_POOL this will trigger a debug assertion failure. For BUF_GET_POSSIBLY_FREED, we will return nullptr for freed pages, so that the callers can be simplified. The purge of transaction history will be a new user of BUF_GET_POSSIBLY_FREED, to avoid crashes on corrupted data. buf_page_get_low(): Never crash on a corrupted page, but simply return nullptr. fseg_page_is_allocated(): Replaces fseg_page_is_free(). fts_drop_common_tables(): Return an error if the transaction was rolled back. fil_space_t::set_corrupted(): Report a tablespace as corrupted if it was not reported already. fil_space_t::io(): Invoke fil_space_t::set_corrupted() to report out-of-bounds page access or other errors. Clean up mtr_t::page_lock() buf_page_get_low(): Validate the page identifier (to check for recently read corrupted pages) after acquiring the page latch. buf_page_t::read_complete(): Flag uninitialized (all-zero) pages with DB_FAIL. Return DB_PAGE_CORRUPTED on page number mismatch. mtr_t::defer_drop_ahi(): Renamed from mtr_defer_drop_ahi(). recv_sys_t::free_corrupted_page(): Only set_corrupt_fs() if any log records exist for the page. We do not mind if read-ahead produces corrupted (or all-zero) pages that were not actually needed during recovery. recv_recover_page(): Return whether the operation succeeded. recv_sys_t::recover_low(): Simplify the logic. Check for recovery error. Thanks to Matthias Leich for testing this extensively and to the authors of https://rr-project.org for making it easy to diagnose and fix any failures that were found during the testing.
* | Merge branch '10.7' into 10.8Sergei Golubchik2022-05-111-3/+2
|\ \ | |/
| * MDEV-28465 Some calls to btr_pcur_close() are unnecessaryMarko Mäkelä2022-05-031-3/+2
| | | | | | | | | | | | | | | | | | | | The function btr_pcur_close() is being invoked on local variables even when no cleanup needs to be done. In particular, for B-tree indexes (not SPATIAL INDEX), unless btr_pcur_store_position() was invoked in the past, there is no need to invoke btr_pcur_close(). On purge and rollback, we will retain btr_pcur_close(&pcur) because otherwise some ./mtr --suite=innodb_gis tests would leak memory.
* | Merge 10.7 into 10.8Nayuta Yanagisawa2022-04-131-16/+25
|\ \ | |/
| * MDEV-27783 InnoDB: Failing assertion: table->get_ref_count() == 0 upon ALTER ↵Thirunarayanan Balathandayuthapani2022-04-071-16/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | TABLE ... MODIFY COLUMN - There is a race condition occurs between purge thread and DDL. So purge thread can increment n_ref_count even after DDL does purge_sys_t::stop_FTS(). - dict_table_open_on_id for purge thread should check purge_sys.must_wait_FTS() before acquring the table. - purge_sys.stop_FTS() does acquire dict_sys.latch for setting the purge system flag and check table ref count on auxilary tables.
* | Merge 10.7 into 10.8Marko Mäkelä2022-03-301-3/+1
|\ \ | |/
| * Merge 10.5 into 10.6Marko Mäkelä2022-03-291-3/+1
| |\
| | * Merge 10.4 into 10.5Marko Mäkelä2022-03-291-3/+1
| | |\
| | | * Merge 10.3 into 10.4Marko Mäkelä2022-03-291-3/+1
| | | |\
| | | | * Merge 10.2 into 10.3Marko Mäkelä2022-03-291-3/+1
| | | | |\
| | | | | * Fix gcc-12 -O2 -Warray-boundsMarko Mäkelä2022-03-171-3/+1
| | | | | |
* | | | | | Merge 10.7 into 10.8Daniel Black2022-03-251-3/+6
|\ \ \ \ \ \ | |/ / / / /
| * | | | | MDEV-28079 Shutdown hangs after altering innodb partition fts tableThirunarayanan Balathandayuthapani2022-03-161-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - InnoDB purge waits at resume_FTS() while shutting down. This happens after altering the FTS innodb partition table. stop_FTS() has been called for each partition, but it calls resume_FTS() only once and it leads to hang during shutdown. This issue was introduced by commit 1bd681c8b3c5213ce1f7976940a7dc38b48a0d39(MDEV-25506).
* | | | | | Merge 10.7 into 10.8Marko Mäkelä2022-03-081-41/+8
|\ \ \ \ \ \ | |/ / / / /
| * | | | | Merge 10.5 into 10.6Vlad Lesin2022-03-071-41/+8
| |\ \ \ \ \ | | |/ / / /
| | * | | | Merge 10.4 into 10.5Marko Mäkelä2022-03-071-41/+8
| | |\ \ \ \ | | | |/ / /
| | | * | | Merge 10.3 into 10.4Marko Mäkelä2022-03-071-41/+8
| | | |\ \ \ | | | | |/ /
| | | | * | MDEV-27582 Fulltext DDL decrements the FTS_DOC_ID valueThirunarayanan Balathandayuthapani2022-03-031-41/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - InnoDB FTS DDL decrements the FTS_DOC_ID when there is a deleted marked record involved. FTS_DOC_ID must never be reused. The purpose of FTS_DOC_ID is to be a unique row identifier that will be changed whenever a fulltext indexed column is updated.
* | | | | | Merge 10.7 into 10.8Marko Mäkelä2022-02-251-3/+16
|\ \ \ \ \ \ | |/ / / / /
| * | | | | Merge 10.5 into 10.6 (MDEV-27913)Marko Mäkelä2022-02-251-3/+16
| |\ \ \ \ \ | | |/ / / /
| | * | | | Merge 10.4 into 10.5 (MDEV-27913)Marko Mäkelä2022-02-251-4/+17
| | |\ \ \ \ | | | |/ / /
| | | * | | Merge 10.3 into 10.4 (MDEV-27913)Marko Mäkelä2022-02-251-4/+17
| | | |\ \ \ | | | | |/ /
| | | | * | MDEV-27913 innodb_ft_cache_size max possible value (80000000) is too small ↵Thirunarayanan Balathandayuthapani2022-02-241-6/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | for practical purposes - Make innodb_ft_cache_size & innodb_ft_total_cache_size are dynamic variable and increase the maximum value of innodb_ft_cache_size to 512MB for 32-bit system and 1 TB for 64-bit system and set innodb_ft_total_cache_size maximum value to 1 TB for 64-bit system. - Print warning if the fts cache exceeds the innodb_ft_cache_size and also unlock the cache if fts cache memory reduces less than innodb_ft_cache_size.
| | | * | | Merge 10.3 into 10.4Vlad Lesin2022-02-211-7/+4
| | | |\ \ \ | | | | |/ /
| | | | * | MDEV-20605 Awaken transaction can miss inserted by other transaction records ↵Vlad Lesin2022-02-211-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | due to wrong persistent cursor restoration Backported from 10.5 20e9e804c131c6522bc7c469e4863e8d1eaa3ee0 and 5948d7602ec7f61937c368dcb134e6ec226a2990. sel_restore_position_for_mysql() moves forward persistent cursor position after btr_pcur_restore_position() call if cursor relative position is BTR_PCUR_ON and the cursor points to the record with NOT the same field values as in a stored record(and some other not important for this case conditions). It was done because btr_pcur_restore_position() sets page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON before opening cursor. So we are searching for the record less or equal to stored one. And if the found record is not equal to stored one, then it is less and we need to move cursor forward. But there can be a situation when the stored record was purged, but the new one with the same key but different value was inserted while row_search_mvcc() was suspended. In this case, when the thread is awaken, it will invoke sel_restore_position_for_mysql(), which, in turns, invoke btr_pcur_restore_position(), which will return false because found record don't match stored record, and sel_restore_position_for_mysql() will move forward cursor position. The above can lead to the case when awaken row_search_mvcc() do not see records inserted by other transactions while it slept. The mtr test case shows the example how it can be. The fix is to return special value from persistent cursor restoring function which would notify its caller that uniq fields of restored record and stored record are the same, and in this case sel_restore_position_for_mysql() don't move cursor forward. Delete-marked records are correctly processed in row_search_mvcc(). Non-unique secondary indexes are "uniquified" by adding the PK, the index->n_uniq should then be index->n_fields. So there is no need in additional checks in the fix. If transaction's readview can't see the changes made in secondary index record, it requests clustered index record in row_search_mvcc() to check its transaction id and get the correspondent record version. After this row_search_mvcc() commits mtr to preserve clustered index latching order, and starts mtr. Between those mtr commit and start secondary index pages are unlatched, and purge has the ability to remove stored in the cursor record, what causes rows duplication in result set for non-locking reads, as cursor position is restored to the previously visited record. To solve this the changes are just switched off for non-locking reads, it's quite simple solution, besides the changes don't make sense for non-locking reads. The more complex and effective from performance perspective solution is to create mtr savepoint before clustered record requesting and rolling back to that savepoint after that. See MDEV-27557. One more solution is to have per-record transaction id for secondary indexes. See MDEV-17598. If any of those is implemented, just remove select_lock_type argument in sel_restore_position_for_mysql().
* | | | | | Merge branch '10.7' into 10.8Vladislav Vaintroub2022-02-171-7/+4
|\ \ \ \ \ \ | |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | # Conflicts: # storage/innobase/btr/btr0pcur.cc # storage/innobase/log/log0log.cc
| * | | | | Merge 10.5 into 10.6Vlad Lesin2022-02-141-7/+4
| |\ \ \ \ \ | | |/ / / /
| | * | | | MDEV-20605 Awaken transaction can miss inserted by other transaction records ↵Vlad Lesin2022-02-141-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | due to wrong persistent cursor restoration sel_restore_position_for_mysql() moves forward persistent cursor position after btr_pcur_restore_position() call if cursor relative position is BTR_PCUR_ON and the cursor points to the record with NOT the same field values as in a stored record(and some other not important for this case conditions). It was done because btr_pcur_restore_position() sets page_cur_mode_t mode to PAGE_CUR_LE for cursor->rel_pos == BTR_PCUR_ON before opening cursor. So we are searching for the record less or equal to stored one. And if the found record is not equal to stored one, then it is less and we need to move cursor forward. But there can be a situation when the stored record was purged, but the new one with the same key but different value was inserted while row_search_mvcc() was suspended. In this case, when the thread is awaken, it will invoke sel_restore_position_for_mysql(), which, in turns, invoke btr_pcur_restore_position(), which will return false because found record don't match stored record, and sel_restore_position_for_mysql() will move forward cursor position. The above can lead to the case when awaken row_search_mvcc() do not see records inserted by other transactions while it slept. The mtr test case shows the example how it can be. The fix is to return special value from persistent cursor restoring function which would notify its caller that uniq fields of restored record and stored record are the same, and in this case sel_restore_position_for_mysql() don't move cursor forward. Delete-marked records are correctly processed in row_search_mvcc(). Non-unique secondary indexes are "uniquified" by adding the PK, the index->n_uniq should then be index->n_fields. So there is no need in additional checks in the fix. If transaction's readview can't see the changes made in secondary index record, it requests clustered index record in row_search_mvcc() to check its transaction id and get the correspondent record version. After this row_search_mvcc() commits mtr to preserve clustered index latching order, and starts mtr. Between those mtr commit and start secondary index pages are unlatched, and purge has the ability to remove stored in the cursor record, what causes rows duplication in result set for non-locking reads, as cursor position is restored to the previously visited record. To solve this the changes are just switched off for non-locking reads, it's quite simple solution, besides the changes don't make sense for non-locking reads. The more complex and effective from performance perspective solution is to create mtr savepoint before clustered record requesting and rolling back to that savepoint after that. See MDEV-27557. One more solution is to have per-record transaction id for secondary indexes. See MDEV-17598. If any of those is implemented, just remove select_lock_type argument in sel_restore_position_for_mysql().
* | | | | | Merge branch '10.7' into 10.8Oleksandr Byelkin2022-02-041-9/+5
|\ \ \ \ \ \ | |/ / / / /
| * | | | | Merge branch '10.5' into 10.6Oleksandr Byelkin2022-02-031-9/+5
| |\ \ \ \ \ | | |/ / / /
| | * | | | Merge branch '10.4' into 10.5Oleksandr Byelkin2022-02-011-10/+6
| | |\ \ \ \ | | | |/ / /
| | | * | | Merge branch '10.3' into 10.4Oleksandr Byelkin2022-01-301-10/+6
| | | |\ \ \ | | | | |/ /
| | | | * | Merge branch '10.2' into 10.3mariadb-10.3.33Oleksandr Byelkin2022-01-291-10/+6
| | | | |\ \ | | | | | |/
| | | | | * Cleanup: Remove an unused parameter of fts_add_doc_by_id()Marko Mäkelä2022-01-261-8/+4
| | | | | |
| | | | | * MDEV-24088 Assertion in InnoDB's FTS code may be triggered by a repeated ↵Sergei Golubchik2022-01-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | words fed to simple_parser plugin increment `position` for every word, because the plugin doesn't (FTS API doesn't use positions that InnoDB FTS relies on)
| | | | | * MDEV-27494 Rename .ic files to .inlVladislav Vaintroub2022-01-171-1/+1
| | | | | |
* | | | | | MDEV-27158: humanize the bytes in innodb info/error messagesDaniel Black2022-01-181-1/+1
|/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Log messages like total size = 17179869184, chunk size = 134217728 get hard to read. If we normalize it down to IEC units is easier. Idea thanks to Axel Schwenke. Review thanks to Eugene Kosov and Marko Mäkelä $ mariadblocal --innodb-buffer-pool-size=30G --innodb-log-file-size=128M Installing MariaDB/MySQL system tables in '/tmp/build-mariadb-server-10.7-datadir' ... 2021-12-09 9:54:04 0 [Note] /home/dan/repos/build-mariadb-server-10.7/sql/mysqld (server 10.7.2-MariaDB) starting as process 250473 ... 2021-12-09 9:54:04 0 [Note] InnoDB: The first data file './ibdata1' did not exist. A new tablespace will be created! 2021-12-09 9:54:04 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2021-12-09 9:54:04 0 [Note] InnoDB: Number of transaction pools: 1 2021-12-09 9:54:04 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2021-12-09 9:54:04 0 [Note] InnoDB: Using liburing 2021-12-09 9:54:04 0 [Note] InnoDB: Initializing buffer pool, total size = 128.000MiB, chunk size = 128.000MiB 2021-12-09 9:54:04 0 [Note] InnoDB: Completed initialization of buffer pool 2021-12-09 9:54:04 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed 2021-12-09 9:54:04 0 [Note] InnoDB: Setting file './ibdata1' size to 12.000MiB. Physically writing the file full; Please wait ... 2021-12-09 9:54:04 0 [Note] InnoDB: File './ibdata1' size is now 12.000MiB. 2021-12-09 9:54:04 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 96.000MiB 2021-12-09 9:54:04 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0 2021-12-09 9:54:04 0 [Note] InnoDB: New log file created, LSN=10317 2021-12-09 9:54:04 0 [Note] InnoDB: Doublewrite buffer not found: creating new 2021-12-09 9:54:04 0 [Note] InnoDB: Doublewrite buffer created 2021-12-09 9:54:04 0 [Note] InnoDB: 128 rollback segments are active. 2021-12-09 9:54:04 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2021-12-09 9:54:04 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ... 2021-12-09 9:54:04 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB. 2021-12-09 9:54:04 0 [Note] InnoDB: 10.7.2 started; log sequence number 0; transaction id 3 OK 2021-12-09 9:54:04 0 [Note] sql/mysqld (server 10.7.2-MariaDB) starting as process 250501 ... 2021-12-09 9:54:04 0 [Note] InnoDB: Compressed tables use zlib 1.2.11 2021-12-09 9:54:04 0 [Note] InnoDB: Number of transaction pools: 1 2021-12-09 9:54:04 0 [Note] InnoDB: Using crc32 + pclmulqdq instructions 2021-12-09 9:54:04 0 [Note] InnoDB: Using liburing 2021-12-09 9:54:04 0 [Note] InnoDB: Initializing buffer pool, total size = 30.000GiB, chunk size = 128.000MiB 2021-12-09 9:54:04 0 [Note] InnoDB: Completed initialization of buffer pool 2021-12-09 9:54:04 0 [Note] InnoDB: Setting O_DIRECT on file ./ibdata1 failed 2021-12-09 9:54:04 0 [Note] InnoDB: Resizing redo log from 96.000MiB to 128.000MiB; LSN=41361 2021-12-09 9:54:04 0 [Note] InnoDB: Starting to delete and rewrite log file. 2021-12-09 9:54:04 0 [Note] InnoDB: Setting log file ./ib_logfile101 size to 128.000MiB 2021-12-09 9:54:04 0 [Note] InnoDB: Renaming log file ./ib_logfile101 to ./ib_logfile0 2021-12-09 9:54:04 0 [Note] InnoDB: New log file created, LSN=41361 2021-12-09 9:54:04 0 [Note] InnoDB: 128 rollback segments are active. 2021-12-09 9:54:04 0 [Note] InnoDB: Creating shared tablespace for temporary tables 2021-12-09 9:54:04 0 [Note] InnoDB: Setting file './ibtmp1' size to 12.000MiB. Physically writing the file full; Please wait ... 2021-12-09 9:54:04 0 [Note] InnoDB: File './ibtmp1' size is now 12.000MiB. 2021-12-09 9:54:04 0 [Note] InnoDB: 10.7.2 started; log sequence number 41349; transaction id 14 2021-12-09 9:54:04 0 [Note] InnoDB: Loading buffer pool(s) from /tmp/build-mariadb-server-10.7-datadir/ib_buffer_pool 2021-12-09 9:54:04 0 [Note] Plugin 'FEEDBACK' is disabled. 2021-12-09 9:54:04 0 [Note] InnoDB: Buffer pool(s) load completed at 211209 9:54:04 2021-12-09 9:54:04 0 [Note] sql/mysqld: ready for connections. Version: '10.7.2-MariaDB' socket: '/tmp/build-mariadb-server-10.7.sock' port: 0 Source distribution 2021-12-09 9:56:57 0 [Note] sql/mysqld (initiated by: unknown): Normal shutdown 2021-12-09 9:56:57 0 [Note] InnoDB: FTS optimize thread exiting. 2021-12-09 9:56:57 0 [Note] InnoDB: Starting shutdown... 2021-12-09 9:56:57 0 [Note] InnoDB: Dumping buffer pool(s) to /tmp/build-mariadb-server-10.7-datadir/ib_buffer_pool 2021-12-09 9:56:57 0 [Note] InnoDB: Buffer pool(s) dump completed at 211209 9:56:57 2021-12-09 9:56:57 0 [Note] InnoDB: Removed temporary tablespace data file: "./ibtmp1" 2021-12-09 9:56:57 0 [Note] InnoDB: Shutdown completed; log sequence number 42602; transaction id 15 2021-12-09 9:56:57 0 [Note] sql/mysqld: Shutdown complete
* | | | | MDEV-27017 Assertion failure 'table->get_ref_count() == 0' on DDL that ↵bb-10.6-MDEV-27017Thirunarayanan Balathandayuthapani2022-01-051-4/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | involves FULLTEXT INDEX purge_sys.stop_FTS() does not wait for purge operation on FTS tables to finish. InnoDB DDL does purge_sys.stop_FTS() and lock all fts tables. It eventually fails due to n_ref_count value. fts_stop_purge(): Stops the purge thread to process new FTS tables, check n_ref_count of all fts index auxiliary, common tables. This should make sure that consecutive fts_lock_tables() is always successful.
* | | | | MDEV-26933 InnoDB fails to detect page number mismatchbb-10.6-MDEV-26933Marko Mäkelä2021-11-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | mtr_t::page_lock(): Validate the page number. ibuf_tree_root_get(): Remove assertions that became redundant. The assertions in btr_validate_level() are kind of redundant as well, but because they are ut_a(), they are also present in release builds, while the ones in mtr_t::page_lock() are only present in debug builds. btr_cur_position(): Do not duplicate an assertion that is part of page_cur_position(). dict_load_tablespace(): Introduce a new option DICT_ERR_IGNORE_TABLESPACE that will suppress loading a tablespace when a table is going to be dropped.
* | | | | Merge 10.5 into 10.6Marko Mäkelä2021-10-211-10/+14
|\ \ \ \ \ | |/ / / /
| * | | | Merge 10.4 into 10.5Marko Mäkelä2021-10-211-10/+14
| |\ \ \ \ | | |/ / /
| | * | | Merge 10.3 into 10.4Marko Mäkelä2021-10-211-10/+14
| | |\ \ \ | | | |/ /
| | | * | Merge 10.2 into 10.3Marko Mäkelä2021-10-211-10/+14
| | | |\ \ | | | | |/
| | | | * MDEV-26865 fts_optimize_thread cannot keep up with workloadMarko Mäkelä2021-10-211-5/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | fts_cache_t::total_size_at_sync: New field, to sample total_size. fts_add_doc_by_id(): Invoke sync if total_size has grown too much since the previous sync request. (Maintain cache->total_size_at_sync.) ib_wqueue_t::length: Caches ib_list_len(*items). ib_wqueue_len(): Removed. We will refer to fts_optimize_wq->length directly. Based on mysql/mysql-server@bc9c46bf2894673d0df17cd0ee872d0d99663121
| | | | * MDEV-19522 InnoDB commit fails when FTS_DOC_ID value is greater than 4294967295Thirunarayanan Balathandayuthapani2021-10-211-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | InnoDB commit fails when consecutive FTS_DOC_ID value is greater than 4294967295. Fix is that InnoDB should remove the delta FTS_DOC_ID value limitations and fts should encode 8 byte value, remove FTS_DOC_ID_MAX_STEP variable. Replaced the fts0vlc.ic file with fts0vlc.h fts_encode_int(): Should be able to encode 10 bytes value fts_get_encoded_len(): Should get the length of the value which has 10 bytes fts_decode_vlc(): Add debug assertion to verify the maximum length allowed is 10. mach_read_uint64_little_endian(): Reads 64 bit stored in little endian format Added a unit test case which check for minimum and maximum value to do the fts encoding
* | | | | MDEV-25919: Lock tables before acquiring dict_sys.latchMarko Mäkelä2021-08-311-8/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In commit 1bd681c8b3c5213ce1f7976940a7dc38b48a0d39 (MDEV-25506 part 3) we introduced a "fake instant timeout" when a transaction would wait for a table or record lock while holding dict_sys.latch. This prevented a deadlock of the server but could cause bogus errors for operations on the InnoDB persistent statistics tables. A better fix is to ensure that whenever a transaction is being executed in the InnoDB internal SQL parser (which will for now require dict_sys.latch to be held), it will already have acquired all locks that could be required for the execution. So, we will acquire the following locks upfront, before acquiring dict_sys.latch: (1) MDL on the affected user table (acquired by the SQL layer) (2) If applicable (not for RENAME TABLE): InnoDB table lock (3) If persistent statistics are going to be modified: (3.a) MDL_SHARED on mysql.innodb_table_stats, mysql.innodb_index_stats (3.b) exclusive table locks on the statistics tables (4) Exclusive table locks on the InnoDB data dictionary tables (not needed in ANALYZE TABLE and the like) Note: Acquiring exclusive locks on the statistics tables may cause more locking conflicts between concurrent DDL operations. Notably, RENAME TABLE will lock the statistics tables even if no persistent statistics are enabled for the table. DROP DATABASE will only acquire locks on statistics tables if persistent statistics are enabled for the tables on which the SQL layer is invoking ha_innobase::delete_table(). For any "garbage collection" in innodb_drop_database(), a timeout while acquiring locks on the statistics tables will result in any statistics not being deleted for any tables that the SQL layer did not know about. If innodb_defragment=ON, information may be written to the statistics tables even for tables for which InnoDB persistent statistics are disabled. But, DROP TABLE will no longer attempt to delete that information if persistent statistics are not enabled for the table. This change should also fix the hangs related to InnoDB persistent statistics and STATS_AUTO_RECALC (MDEV-15020) as well as a bug that running ALTER TABLE on the statistics tables concurrently with running ALTER TABLE on InnoDB tables could cause trouble. lock_rec_enqueue_waiting(), lock_table_enqueue_waiting(): Do not issue a fake instant timeout error when the transaction is holding dict_sys.latch. Instead, assert that the dict_sys.latch is never being held here. lock_sys_tables(): A new function to acquire exclusive locks on all dictionary tables, in case DROP TABLE or similar operation is being executed. Locking non-hard-coded tables is optional to avoid a crash in row_merge_drop_temp_indexes(). The SYS_VIRTUAL table was introduced in MySQL 5.7 and MariaDB Server 10.2. Normally, we require all these dictionary tables to exist before executing any DDL, but the function row_merge_drop_temp_indexes() is an exception. When upgrading from MariaDB Server 10.1 or MySQL 5.6 or earlier, the table SYS_VIRTUAL would not exist at this point. ha_innobase::commit_inplace_alter_table(): Invoke log_write_up_to() while not holding dict_sys.latch. dict_sys_t::remove(), dict_table_close(): No longer try to drop index stubs that were left behind by aborted online ADD INDEX. Such indexes should be dropped from the InnoDB data dictionary by row_merge_drop_indexes() as part of the failed DDL operation. Stubs for aborted indexes may only be left behind in the data dictionary cache. dict_stats_fetch_from_ps(): Use a normal read-only transaction. ha_innobase::delete_table(), ha_innobase::truncate(), fts_lock_table(): While waiting for purge to stop using the table, do not hold dict_sys.latch. ha_innobase::delete_table(): Implement a work-around for the rollback of ALTER TABLE...ADD PARTITION. MDL_EXCLUSIVE would not be held if ALTER TABLE hits lock_wait_timeout while trying to upgrade the MDL due to a conflicting LOCK TABLES, such as in the first ALTER TABLE in the test case of Bug#53676 in parts.partition_special_innodb. Therefore, we must explicitly stop purge, because it would not be stopped by MDL. dict_stats_func(), btr_defragment_chunk(): Allocate a THD so that we can acquire MDL on the InnoDB persistent statistics tables. mysqltest_embedded: Invoke ha_pre_shutdown() before free_used_memory() in order to avoid ASAN heap-use-after-free related to acquire_thd(). trx_t::dict_operation_lock_mode: Changed the type to bool. row_mysql_lock_data_dictionary(), row_mysql_unlock_data_dictionary(): Implemented as macros. rollback_inplace_alter_table(): Apply an infinite timeout to lock waits. innodb_thd_increment_pending_ops(): Wrapper for thd_increment_pending_ops(). Never attempt async operation for InnoDB background threads, such as the trx_t::commit() in dict_stats_process_entry_from_recalc_pool(). lock_sys_t::cancel(trx_t*): Make dictionary transactions immune to KILL. lock_wait(): Make dictionary transactions immune to KILL, and to lock wait timeout when waiting for locks on dictionary tables. parts.partition_special_innodb: Use lock_wait_timeout=0 to instantly get ER_LOCK_WAIT_TIMEOUT. main.mdl: Filter out MDL on InnoDB persistent statistics tables Reviewed by: Thirunarayanan Balathandayuthapani
* | | | | MDEV-25919 preparation: Various cleanupMarko Mäkelä2021-08-311-28/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | que_eval_sql(): Remove the parameter lock_dict. The only caller with lock_dict=true was dict_stats_exec_sql(), which will now explicitly invoke dict_sys.lock() and dict_sys.unlock() by itself. row_import_cleanup(): Do not unnecessarily lock the dictionary. Concurrent access to the table during ALTER TABLE...IMPORT TABLESPACE is prevented by MDL and the fact that there cannot exist any undo log or change buffer records that would refer to the table or tablespace. row_import_for_mysql(): Do not unnecessarily lock the dictionary while accessing fil_system. Thanks to MDL_EXCLUSIVE that was acquired by the SQL layer, only one IMPORT may be in effect for the table name. row_quiesce_set_state(): Do not unnecessarily lock the dictionary. The dict_table_t::quiesce state is documented to be protected by all index latches, which we are acquiring. dict_table_close(): Introduce a simpler variant with fewer parameters. dict_table_close(): Reduce the amount of calls. We can simply invoke dict_table_t::release() on startup or in DDL operations, or when the table is inaccessible. In none of these cases, there is no need to invalidate the InnoDB persistent statistics. pars_info_t::graph_owns_us: Remove (unused). pars_info_free(): Define inline. fts_delete(), trx_t::evict_table(), row_prebuilt_free(), row_rename_table_for_mysql(): Simplify. row_mysql_lock_data_dictionary(): Remove some references; use dict_sys.lock() and dict_sys.unlock() instead. row_mysql_lock_table(): Remove. Use lock_table_for_trx() instead. ha_innobase::check_if_supported_inplace_alter(), row_create_table_for_mysql(): Simply assert dict_sys.sys_tables_exist(). In commit 49e2c8f0a6fefdeac50925f758090d6bd099768d and commit 1bd681c8b3c5213ce1f7976940a7dc38b48a0d39 srv_start() actually guarantees that the system tables will exist, or the server is in read-only mode, or startup will fail. Reviewed by: Thirunarayanan Balathandayuthapani