summaryrefslogtreecommitdiff
path: root/storage/rocksdb
diff options
context:
space:
mode:
authorMarko Mäkelä <marko.makela@mariadb.com>2019-10-11 17:28:15 +0300
committerMarko Mäkelä <marko.makela@mariadb.com>2019-10-11 17:28:15 +0300
commitb42294bc6409794bdbd2051b32fa079d81cea61d (patch)
tree8da23db6ced9e068134a51b95e80d5a8cb055691 /storage/rocksdb
parentb5fae7f743d2b2c71ee23b0daf51c2be294733eb (diff)
downloadmariadb-git-b42294bc6409794bdbd2051b32fa079d81cea61d.tar.gz
MDEV-19514 Defer change buffer merge until pages are requested
We will remove the InnoDB background operation of merging buffered changes to secondary index leaf pages. Changes will only be merged as a result of an operation that accesses a secondary index leaf page, such as a SQL statement that performs a lookup via that index, or is modifying the index. Also ROLLBACK and some background operations, such as purging the history of committed transactions, or computing index cardinality statistics, can cause change buffer merge. Encryption key rotation will not perform change buffer merge. The motivation of this change is to simplify the I/O logic and to allow crash recovery to happen in the background (MDEV-14481). We also hope that this will reduce the number of "mystery" crashes due to corrupted data. Because change buffer merge will typically take place as a result of executing SQL statements, there should be a clearer connection between the crash and the SQL statements that were executed when the server crashed. In many cases, a slight performance improvement was observed. This is joint work with Thirunarayanan Balathandayuthapani and was tested by Axel Schwenke and Matthias Leich. The InnoDB monitor counter innodb_ibuf_merge_usec will be removed. On slow shutdown (innodb_fast_shutdown=0), we will continue to merge all buffered changes (and purge all undo log history). Two InnoDB configuration parameters will be changed as follows: innodb_disable_background_merge: Removed. This parameter existed only in debug builds. All change buffer merges will use synchronous reads. innodb_force_recovery will be changed as follows: * innodb_force_recovery=4 will be the same as innodb_force_recovery=3 (the change buffer merge cannot be disabled; it can only happen as a result of an operation that accesses a secondary index leaf page). The option used to be capable of corrupting secondary index leaf pages. Now that capability is removed, and innodb_force_recovery=4 becomes 'safe'. * innodb_force_recovery=5 (which essentially hard-wires SET GLOBAL TRANSACTION ISOLATION LEVEL READ UNCOMMITTED) becomes safe to use. Bogus data can be returned to SQL, but persistent InnoDB data files will not be corrupted further. * innodb_force_recovery=6 (ignore the redo log files) will be the only option that can potentially cause persistent corruption of InnoDB data files. Code changes: buf_page_t::ibuf_exist: New flag, to indicate whether buffered changes exist for a buffer pool page. Pages with pending changes can be returned by buf_page_get_gen(). Previously, the changes were always merged inside buf_page_get_gen() if needed. ibuf_page_exists(const buf_page_t&): Check if a buffered changes exist for an X-latched or read-fixed page. buf_page_get_gen(): Add the parameter allow_ibuf_merge=false. All callers that know that they may be accessing a secondary index leaf page must pass this parameter as allow_ibuf_merge=true, unless it does not matter for that caller whether all buffered changes have been applied. Assert that whenever allow_ibuf_merge holds, the page actually is a leaf page. Attempt change buffer merge only to secondary B-tree index leaf pages. btr_block_get(): Add parameter 'bool merge'. All callers of btr_block_get() should know whether the page could be a secondary index leaf page. If it is not, we should avoid consulting the change buffer bitmap to even consider a merge. This is the main interface to requesting index pages from the buffer pool. ibuf_merge_or_delete_for_page(), recv_recover_page(): Replace buf_page_get_known_nowait() with much simpler logic, because it is now guaranteed that that the block is x-latched or read-fixed. mlog_init_t::mark_ibuf_exist(): Renamed from mlog_init_t::ibuf_merge(). On crash recovery, we will no longer merge any buffered changes for the pages that we read into the buffer pool during the last batch of applying log records. buf_page_get_gen_known_nowait(), BUF_MAKE_YOUNG, BUF_KEEP_OLD: Remove. btr_search_guess_on_hash(): Merge buf_page_get_gen_known_nowait() to its only remaining caller. buf_page_make_young_if_needed(): Define as an inline function. Add the parameter buf_pool. buf_page_peek_if_young(), buf_page_peek_if_too_old(): Add the parameter buf_pool. fil_space_validate_for_mtr_commit(): Remove a bogus comment about background merge of the change buffer. btr_cur_open_at_rnd_pos_func(), btr_cur_search_to_nth_level_func(), btr_cur_open_at_index_side_func(): Use narrower data types and scopes. ibuf_read_merge_pages(): Replaces buf_read_ibuf_merge_pages(). Merge the change buffer by invoking buf_page_get_gen().
Diffstat (limited to 'storage/rocksdb')
-rw-r--r--storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result1
1 files changed, 0 insertions, 1 deletions
diff --git a/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result b/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
index 6f446a13132..9bf35793159 100644
--- a/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
+++ b/storage/rocksdb/mysql-test/rocksdb/r/innodb_i_s_tables_disabled.result
@@ -230,7 +230,6 @@ innodb_activity_count server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NU
innodb_master_active_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is active
innodb_master_idle_loops server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Number of times master thread performs its tasks when server is idle
innodb_background_drop_table_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process drop table list
-innodb_ibuf_merge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to process change buffer merge
innodb_log_flush_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to flush log records
innodb_mem_validate_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent to do memory validation
innodb_master_purge_usec server 0 NULL NULL NULL 0 NULL NULL NULL NULL NULL NULL NULL 0 counter Time (in microseconds) spent by master thread to purge records