path: root/storage/innobase
Commit message (Author, Age, Files, Lines)
* Cleanup: flst_read_addr(), fil_addr_t (Marko Mäkelä, 2019-11-28, 12 files, -169/+79)
| fil_addr_t: Use exactly sized data types.
| flst_read_addr(): Remove the unused parameter mtr.
| page_offset(): Return uint16_t.
* Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-27, 1 file, -20/+24)
|\
| * MDEV-21148: Assertion index->n_core_fields + n_add >= index->n_fields (Marko Mäkelä, 2019-11-26, 1 file, -20/+24)
| | Revert part of commit 6cedb671e99038f1a10e0d8504f835aaabed9780 because it turns out to be theoretically impossible to parse a ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC metadata record where the variable-length fields in the PRIMARY KEY have been written as nonempty strings.
* | MDEV-21152 Bogus debug assertion btr_pcur_is_after_last_in_tree() in ibuf code (Marko Mäkelä, 2019-11-26, 1 file, -3/+3)
| | As noted in commit abd45cdc38e72ce329365ffe0df4c6f8c319b407, a search with PAGE_CUR_GE may land on the supremum record on a leaf page that is not the rightmost leaf page. This could occur when all keys on the current page are smaller than the search key, and the smallest key on the successor page is larger than the search key.
| | Hence, after a failed PAGE_CUR_GE search, the assertions btr_pcur_is_after_last_in_tree() are bogus and should be replaced with btr_pcur_is_after_last_on_page().
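The situation described in MDEV-21152 can be modelled with a toy example. This is only a sketch under stated assumptions, not InnoDB code: the names SketchPage, SketchCursor, sketch_search_ge(), sketch_after_last_on_page() and sketch_after_last_in_tree() are hypothetical, leaf pages are modelled as non-empty sorted vectors of integer keys, and the leaf is chosen as the last page whose smallest key does not exceed the search key.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy model: a leaf page is a non-empty sorted vector of keys.
using SketchPage = std::vector<int>;

struct SketchCursor
{
  std::size_t page;  // index of the leaf page that was searched
  std::size_t slot;  // position on that page; page size means "supremum"
};

// Pick the last page whose smallest key is <= key (assumed descent rule),
// then find the first record >= key on that page only.
inline SketchCursor sketch_search_ge(const std::vector<SketchPage> &leaves,
                                     int key)
{
  std::size_t page = 0;
  for (std::size_t i = 1; i < leaves.size(); i++)
    if (leaves[i].front() <= key)
      page = i;
  const SketchPage &p = leaves[page];
  const std::size_t slot = static_cast<std::size_t>(
      std::lower_bound(p.begin(), p.end(), key) - p.begin());
  return SketchCursor{page, slot};
}

// True when the cursor is past the last record of its own page.
inline bool sketch_after_last_on_page(const std::vector<SketchPage> &leaves,
                                      const SketchCursor &c)
{
  return c.slot == leaves[c.page].size();
}

// True only when the cursor is past the last record of the rightmost page.
inline bool sketch_after_last_in_tree(const std::vector<SketchPage> &leaves,
                                      const SketchCursor &c)
{
  return sketch_after_last_on_page(leaves, c) && c.page + 1 == leaves.size();
}
```

With leaves {10, 12} and {20, 30} and search key 15, the cursor ends up past the last record of the first page, so only the on-page check holds; the in-tree check would fail even though nothing is wrong.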
* | MDEV-21127 Assertion in key_text::key_text() (Aleksey Midenkov, 2019-11-26, 1 file, -3/+3)
| | An unsigned type was used while the RHS expression could be less than 0.
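The class of bug named in MDEV-21127 can be illustrated generically. The commit text does not show the actual expression, so the functions below (sketch_remaining_unsigned() and sketch_remaining_signed()) are hypothetical and only demonstrate how storing a possibly negative difference in an unsigned type wraps around.

```cpp
#include <cstdint>

// Problematic pattern: the difference is negative when used > capacity,
// but the unsigned result wraps around to a huge positive value.
inline std::uint32_t sketch_remaining_unsigned(std::uint32_t capacity,
                                               std::uint32_t used)
{
  return capacity - used;
}

// Safer pattern: widen to a signed type before subtracting, so that a
// negative result stays negative and can be checked by the caller.
inline std::int64_t sketch_remaining_signed(std::uint32_t capacity,
                                            std::uint32_t used)
{
  return static_cast<std::int64_t>(capacity) - static_cast<std::int64_t>(used);
}
```

A caller of the signed variant can test for a negative result before using it as a length.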
* | MDEV-21133 Optimize access to InnoDB page header fields (Marko Mäkelä, 2019-11-26, 11 files, -91/+133)
| | Introduce memcpy_aligned<N>(), memcmp_aligned<N>(), memset_aligned<N>() and use them for accessing InnoDB page header fields that are known to be aligned.
| | MY_ASSUME_ALIGNED(): Wrapper for the GCC/clang __builtin_assume_aligned(). Nothing similar seems to exist in Microsoft Visual Studio, and the C++20 std::assume_aligned is not available to us yet.
| | Explicitly specified alignment guarantees allow compilers to generate faster code on platforms with strict alignment rules, instead of emitting calls to potentially unaligned memcpy(), memcmp(), or memset().
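The idea behind the aligned helpers in MDEV-21133 might look roughly like the following. This is a minimal sketch, not the code from the commit: SKETCH_ASSUME_ALIGNED, sketch_memcpy_aligned() and sketch_copy_field() are hypothetical names, and the no-op fallback for compilers without __builtin_assume_aligned() is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a MY_ASSUME_ALIGNED-style wrapper.
#if defined(__GNUC__) || defined(__clang__)
# define SKETCH_ASSUME_ALIGNED(ptr, n) __builtin_assume_aligned(ptr, n)
#else
# define SKETCH_ASSUME_ALIGNED(ptr, n) (ptr)  // assumed fallback: no hint
#endif

// memcpy() with a compile-time alignment promise for both pointers.
template <std::size_t Alignment>
inline void *sketch_memcpy_aligned(void *dest, const void *src, std::size_t n)
{
  static_assert(Alignment && !(Alignment & (Alignment - 1)),
                "alignment must be a power of 2");
  // Debug-only check that the caller's promise actually holds.
  assert(reinterpret_cast<std::uintptr_t>(dest) % Alignment == 0);
  assert(reinterpret_cast<std::uintptr_t>(src) % Alignment == 0);
  return std::memcpy(SKETCH_ASSUME_ALIGNED(dest, Alignment),
                     SKETCH_ASSUME_ALIGNED(src, Alignment), n);
}

// Usage example: copy an 8-byte field between two 8-byte aligned buffers.
inline std::uint64_t sketch_copy_field(std::uint64_t value)
{
  alignas(8) unsigned char src[8];
  alignas(8) unsigned char dst[8];
  std::memcpy(src, &value, sizeof value);
  sketch_memcpy_aligned<8>(dst, src, sizeof value);
  std::uint64_t out;
  std::memcpy(&out, dst, sizeof out);
  return out;
}
```

Passing the alignment as a template argument keeps the promise visible at each call site, which is the property the commit message emphasizes.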
* | Merge 10.4 into 10.5 (Aleksey Midenkov, 2019-11-25, 7 files, -23/+73)
|\ \
| |/
| * MDEV-20190 Instant operation fails when add column and collation change on non-indexed column (Marko Mäkelä, 2019-11-25, 5 files, -22/+66)
| | We must relax too strict debug assertions. For latin1_swedish_ci, mtype=DATA_CHAR or mtype=DATA_VARCHAR will be used instead of mtype=DATA_MYSQL or mtype=DATA_VARMYSQL. Likewise, some changes of dtype_get_charset_coll() do not affect the data type encoding, but only any indexes that are defined on the column.
| | Charset::same_encoding(): Check whether two charset-collations have the same character set encoding.
| | dict_col_t::same_encoding(): Check whether two character columns have the same character set encoding.
| | dict_col_t::same_type(): Check whether two columns have a compatible data type encoding.
| | dict_col_t::same_format(), dict_table_t::instant_column(): Do not compare mtype or the charset-collation of prtype directly. Rely on dict_col_t::same_type() instead.
| | dtype_get_charset_coll(): Narrow the return type to uint16_t.
| | This is a refined version of a fix that was developed by Thirunarayanan Balathandayuthapani.
| * Merge 10.3 into 10.4 (Aleksey Midenkov, 2019-11-25, 1 file, -0/+4)
| |\
| | * MDEV-21045 AddressSanitizer: use-after-poison in mem_heap_dup / row_log_table_get_pk_col (Eugene Kosov, 2019-11-21, 1 file, -0/+4)
| | | row_log_table_get_pk_col(): Read the instant field value from instant ALTER TABLE when it is required.
| * | MDEV-21096 async slave crash with gtid_log_pos table access (#1413) (seppo, 2019-11-25, 1 file, -1/+3)
| | | The original crash happened when the async replication IO thread was updating the mysql.gtid_slave_pos table. Operations on this table should remain node local, but it appears that the protection (THD::wsrep_ignore_table flag) to prevent wsrep replication for this table was missing for InnoDB write_row() and update_row().
| | | It was somewhat difficult to reproduce the issue, because mtr seems to create the affected table mysql.gtid_log_pos with the Aria engine type, and Aria engine operations will not be replicated anyhow. It looks, though, that in a release installation, the mysql.gtid_slave_pos table uses the InnoDB engine.
| | | It was possible to trigger a somewhat related problem by running the test galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB. However, this test mode causes an earlier crash when the replication background thread creates an additional table, mysql.gtid_slave_pos_InnoDB; this table creation triggered wsrep TOI replication, which also failed with an assertion. Actually, async replication IO and background threads should not replicate anything to the cluster.
| | | This pull request contains a new test, galera.galera_as_slave_gtid_auto_engine, which basically just runs galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB. The test galera.galera_as_slave_gtid is also modified for better code reuse.
| | | The actual fix for MDEV-21096 is in storage/innobase/handler/ha_innodb.cc, where the THD::wsrep_ignore_table flag is now honored before wsrep key population.
| | | There is an additional fix in sql/service_wsrep.cc, where async replication IO and background threads are marked as non-local. This fences these threads out of wsrep replication altogether. Note that this change actually makes the use of THD::wsrep_ignore_table redundant. We may want to refactor THD::wsrep_ignore_table out in the future, if there is no other use case for it in sight.
* | | Fix shutdown hang in dict_stats, caused by MDEV-16264 (tag: mariadb-10.5.0) (Vladislav Vaintroub, 2019-11-25, 1 file, -1/+10)
| | | dict_stats_shutdown() can hang, waiting for the timer callback to finish. This happens because it locks the same mutex that can also be used inside the timer callback, within the dict_stats_schedule() function.
| | | The fix is to make dict_stats_schedule() use mutex.try_lock() instead of mutex.lock().
| | | In the unlikely case of simultaneous dict_stats_schedule() calls setting different timer delays, the first one would now win, which is fine. What is important is that shutdown won't hang.
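The try_lock() approach described above can be sketched as follows. This is an illustration under assumptions, not the dict_stats code: SketchScheduler and its members are hypothetical names, and the timer itself is left out.

```cpp
#include <chrono>
#include <mutex>

// Hypothetical scheduler state; only the locking pattern matters here.
struct SketchScheduler
{
  std::mutex mutex;
  std::chrono::milliseconds delay{0};
  bool scheduled = false;

  // Returns true if this call set the delay. Returns false without
  // blocking if the mutex is busy (for example, held by a shutdown
  // path that is waiting for a timer callback to finish).
  bool schedule(std::chrono::milliseconds new_delay)
  {
    std::unique_lock<std::mutex> lock(mutex, std::try_to_lock);
    if (!lock.owns_lock())
      return false;  // give up instead of waiting: the other holder wins
    delay = new_delay;
    scheduled = true;
    return true;
  }
};
```

Because a caller that fails to get the mutex simply returns, a thread that holds the mutex while waiting for callbacks to finish can no longer be blocked by one of those callbacks.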
* | | MDEV-21132 Remove buf_page_t::newest_modification (Marko Mäkelä, 2019-11-25, 12 files, -449/+289)
| | | At each mini-transaction commit, the log sequence number of the mini-transaction must be written to each modified page, so that it will be available in the FIL_PAGE_LSN field when the page is being read in crash recovery.
| | | InnoDB was unnecessarily allocating redundant storage for the field, in buf_page_t::newest_modification. Let us access FIL_PAGE_LSN directly.
| | | Furthermore, on ALTER TABLE...IMPORT TABLESPACE, let us write 0 to FIL_PAGE_LSN instead of using log_sys.lsn.
| | | buf_flush_init_for_writing(), buf_flush_update_zip_checksum(), fil_encrypt_buf_for_full_crc32(), fil_encrypt_buf(), fil_space_encrypt(): Remove the parameter lsn.
| | | buf_page_get_newest_modification(): Merge with the only caller.
| | | buf_tmp_reserve_compression_buf(), buf_tmp_page_encrypt(), buf_page_encrypt(): Define static in the same compilation unit with the only caller.
| | | PageConverter::m_current_lsn: Remove. Write 0 to FIL_PAGE_LSN on ALTER TABLE...IMPORT TABLESPACE.
* | | Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-20, 4 files, -25/+72)
|\ \ \
| |/ /
| * | MDEV-21088 Table cannot be loaded after instant ADD/DROP COLUMN (Marko Mäkelä, 2019-11-20, 3 files, -24/+65)
| | | btr_cur_instant_init_low(): Accurately parse the metadata record header for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT. CHAR columns used to be unnecessarily written as nonempty strings of bytes.
| * | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-20, 1 file, -1/+7)
| |\ \
| | |/
| | * MDEV-21088: Follow-up fix for ROW_FORMAT=REDUNDANT (Marko Mäkelä, 2019-11-20, 1 file, -1/+1)
| | | For ROW_FORMAT=REDUNDANT, we must reserve fixed-length dummy values for the CHAR columns in the metadata record. This is because in MariaDB Server 10.4, btr_cur_instant_init_low() will rely on dict_index_t::trx_id_offset being accurate for the metadata record.
| | * MDEV-21088 Table cannot be loaded after instant ADD/DROP COLUMN (Marko Mäkelä, 2019-11-20, 1 file, -0/+7)
| | | In MariaDB Server 10.4, btr_cur_instant_init_low() assumes that all PRIMARY KEY columns that are internally variable-length will be encoded in 0 bytes in the metadata record. Sometimes, CHAR columns can be encoded as variable-length. We should not unnecessarily reserve space for a dummy string value in the metadata record.
* | | MDEV-20480 Obsolete internal parser for FK in InnoDB (Aleksey Midenkov, 2019-11-20, 6 files, -1376/+765)
| | | Currently InnoDB uses an internal parser for adding foreign keys. Remove the internal parser and use the data parsed by the SQL parser (sql_yacc) for adding foreign keys.
| | | - create_table_info_t::create_foreign_keys() is the replacement for dict_create_foreign_constraints_low();
| | | - Pass the constraint name via the Foreign_key object.
| | | Temporary until MDEV-20865:
| | | - Pass alter_info as part of create_info.
* | | Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-19, 6 files, -39/+38)
|\ \ \
| |/ /
| * | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-19, 8 files, -20/+53)
| |\ \
| | |/
| | * MDEV-13564 follow-up: Remove unused code (Marko Mäkelä, 2019-11-19, 1 file, -217/+0)
| | | DropIndex, CreateIndex: Remove.
| | | The file row0trunc.cc only exists in MariaDB Server 10.3 so that the crash recovery of TRUNCATE TABLE operations from older 10.2 and 10.3 servers will work. This dead code was being used for implementing the MySQL 5.7 WL#6501 TRUNCATE TABLE that was replaced with a backup-safe implementation in MDEV-13564.
| | * Merge 10.2 into 10.3 (Marko Mäkelä, 2019-11-19, 8 files, -37/+60)
| | |\
| | | * MDEV-21069 Crash on DROP TABLE if the data file is corrupted (Marko Mäkelä, 2019-11-19, 8 files, -43/+65)
| | | | buf_read_ibuf_merge_pages(): Discard any page numbers that are outside the current bounds of the tablespace, by invoking the function ibuf_delete_recs() that was introduced in MDEV-20934. This could avoid an infinite change buffer merge loop on innodb_fast_shutdown=0, because normally the change buffer merge would only be attempted if a page was successfully loaded into the buffer pool.
| | | | dict_drop_index_tree(): Add the parameter trx_t*. To prevent the DROP TABLE crash, do not invoke btr_free_if_exists() if the entire .ibd file will be dropped. Thus, we will avoid a crash if the BTR_SEG_LEAF or BTR_SEG_TOP of the index is corrupted, and we will also avoid unnecessarily accessing the to-be-dropped tablespace via the buffer pool.
| | | | In MariaDB 10.2, we disable the DROP TABLE fix if innodb_safe_truncate=0, because the backup-unsafe MySQL 5.7 WL#6501 form of TRUNCATE TABLE requires that the individual pages be freed inside the tablespace.
| | * | MDEV-20611: MRR scan over partitioned InnoDB table produces "Out of memory" error (Sergei Petrunia, 2019-11-15, 1 file, -1/+1)
| | | | Fix partitioning and DS-MRR to work together:
| | | | - In ha_partition::index_end(): take into account that ha_innobase (and other engines using DS-MRR) will have inited=RND when initialized for a DS-MRR scan.
| | | | - In ha_partition::multi_range_read_next(): if the MRR scan is using HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's handler will store anything into *range_info.
| | | | - In DsMrr_impl::choose_mrr_impl(): ha_partition will ask the partitions how much memory their MRR implementation needs by passing *buffer_size=0. The DS-MRR code didn't know about this (it actually used uint for the buffer size calculation and would have underflowed). Returning *buffer_size=0 made ha_partition assume that the partitions do not need MRR memory and pass the same buffer to each of them. This is now fixed: if DS-MRR gets *buffer_size=0, it will return the amount of buffer space needed, but not more than about @@mrr_buffer_size.
| | | | - Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its partitions, and the partitions use DS-MRR, the code will call handler->clone with the TABLE (*NOT partition*) name as an argument. DS-MRR has no way of knowing the partition name, so the solution was to have the ::clone() function of the affected storage engines ignore the name argument and get it elsewhere.
| * | | MDEV-13564: Remove an unused return value (Marko Mäkelä, 2019-11-17, 2 files, -20/+6)
| | | | When commit 09af00cbde1d62dfda574dee10e5c0fd240c3f7f removed the crash-upgrade logic of old TRUNCATE TABLE from MariaDB 10.2 and 10.3, it actually made the return value of dict_drop_index_tree() redundant.
* | | | MDEV-16264: Minor cleanup (Marko Mäkelä, 2019-11-15, 2 files, -18/+18)
| | | | | | | | | | | | | | | | Add missing static qualifiers.
* | | | MDEV-16264: Remove IORequest::IGNORE_MISSING (Marko Mäkelä, 2019-11-15, 4 files, -59/+19)
| | | | After MDEV-11556, not even crash recovery should attempt to access non-existing pages. However, buf_load() does not validate its input and must thus be able to ignore missing pages, which is why buf_read_page_background() does that.
* | | | MDEV-16264: Fix some white space (Marko Mäkelä, 2019-11-15, 4 files, -98/+59)
| | | |
* | | | MDEV-21054 Crash on shutdown due to btr_search_latches=NULL (Marko Mäkelä, 2019-11-15, 1 file, -8/+2)
| | | | | | | | | | | | | | | | innodb_shutdown(): Invoke os_aio_free() before btr_search_sys_free().
* | | | Make .clang-format work with clang-8 (Vladislav Vaintroub, 2019-11-15, 19 files, -425/+302)
| | | | | | | | | | | | | | | | Remove keywords that are too new.
* | | | MDEV-16264 Use threadpool for InnoDB background work. (Vladislav Vaintroub, 2019-11-15, 41 files, -5667/+1101)
| | | | Almost all threads are gone:
| | | | - The "ticking" threads that sleep a while and then do some work (srv_monitor_thread, srv_error_monitor_thread, srv_master_thread) were replaced with timers. Some timers are periodic, e.g. the "master" timer.
| | | | - The btr_defragment_thread is also replaced by a timer, which reschedules itself when the current defragment "item" needs throttling.
| | | | - The buf_resize_thread and buf_dump_threads are substituted with tasks. Ditto with page cleaner workers.
| | | | - Purge worker threads are now tasks as well, and the purge cleaner coordinator is a combination of a task and a timer.
| | | | - All AIO is outsourced to tpool; InnoDB just calls thread_pool::submit_io() and provides the callback.
| | | | - The srv_slot_t was removed, and innodb_debug_sync used in purge is currently not working and needs reimplementation.
* | | | Cleanup: More use of mtr_memo_type_t (Marko Mäkelä, 2019-11-15, 5 files, -13/+16)
| | | |
* | | | Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-14, 31 files, -530/+542)
|\ \ \ \
| |/ / /
| * | | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-14, 22 files, -160/+145)
| |\ \ \
| | |/ /
| | * | MDEV-12353 preparation: Replace mtr_x_lock() and friends (Marko Mäkelä, 2019-11-14, 22 files, -163/+148)
| | | | Apart from page latches (buf_block_t::lock), mini-transactions are keeping track of at most one dict_index_t::lock and fil_space_t::latch at a time, and in a rare case, purge_sys.latch. Let us introduce interfaces for acquiring an index latch or a tablespace latch.
| | | | In a later version, we may want to introduce mtr_t members for holding a latched dict_index_t* and fil_space_t*, and replace the remaining use of mtr_t::m_memo with std::set<buf_block_t*> or with a map<buf_block_t*,byte*> pointing to log records.
| * | | MDEV-20949: Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-14, 9 files, -241/+329)
| |\ \ \
| | |/ /
| | * | MDEV-20949: Merge 10.2 into 10.3 (Marko Mäkelä, 2019-11-14, 9 files, -242/+325)
| | |\ \
| | | |/
| | | | In the test innodb.instant_alter,4k we would be flagging an error for too large row size. That error was previously only being reported if the table was being rebuilt. Thus, this merge is fixing a small omission in MDEV-11369 (instant ADD COLUMN).
| | | * MDEV-20949 Stop issuing 'row size' error on DML (Eugene Kosov, 2019-11-13, 10 files, -262/+326)
| | | | Move the row size check to the early CREATE/ALTER TABLE phase. Stop checking on table open.
| | | | dict_index_add_to_cache(): Remove the parameter 'strict'; stop checking the row size.
| | | | dict_index_t::record_size_info_t: This is the result of a row size check operation.
| | | | create_table_info_t::row_size_is_acceptable(): Performs the row size check. Issues an error or a warning. Writes the first overflow field to the InnoDB log.
| | | | create_table_info_t::create_table(): Add the row size check.
| | | | dict_index_t::record_size_info(): This is a refactored version of dict_index_t::rec_potentially_too_big(). The new version doesn't change the global state of the program but returns all interesting info, and it is the callers who decide how to handle a row size overflow.
| | | | dict_index_t::rec_potentially_too_big(): Removed.
| * | | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-14, 1 file, -129/+68)
| |\ \ \
| | |/ /
| | * | Merge 10.2 into 10.3 (Marko Mäkelä, 2019-11-14, 1 file, -129/+68)
| | |\ \
| | | |/
| | | * Clean up mtr_t::commit() further (Marko Mäkelä, 2019-11-13, 1 file, -129/+68)
| | | | memo_block_unfix(), memo_latch_release(): Merge to ReleaseLatches.
| | | | memo_slot_release(), ReleaseAll: Clean up the formatting.
| | | * MDEV-20934: Correct a debug assertion (Marko Mäkelä, 2019-11-13, 1 file, -1/+1)
| | | | A search with PAGE_CUR_GE may land on the supremum record on a leaf page that is not the rightmost leaf page. This could occur when all keys on the current page are smaller than the search key, and the smallest key on the successor page is larger than the search key.
| | | | ibuf_delete_recs(): Correct the debug assertion accordingly.
* | | | Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-13, 7 files, -171/+124)
|\ \ \ \
| |/ / /
| * | | MDEV-17138 follow-up: Optimize index page creation (Marko Mäkelä, 2019-11-13, 2 files, -5/+38)
| | | | btr_create(), btr_root_raise_and_insert(): Write a MLOG_MEMSET record to set FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL, instead of writing two MLOG_4BYTES records. For ROW_FORMAT=COMPRESSED pages, we will not use MLOG_MEMSET because we want the crash-downgrade to earlier 10.4 releases to succeed.
| | | | mlog_parse_nbytes(): Relax the too strict assertion. There is no problem with MLOG_MEMSET records that affect the uncompressed header of ROW_FORMAT=COMPRESSED index pages.
| * | | Use constexpr for constants on data pages (Marko Mäkelä, 2019-11-13, 4 files, -20/+21)
| | | |
| * | | cleanup: replace List_iterator(_fast) in handler0alter.cc (Eugene Kosov, 2019-11-13, 1 file, -147/+88)
| | | | Basically, use List<T>::iterator in more places. This patch required adding two more overloads to the new iterator for convenience.
* | | | Merge 10.4 into 10.5 (Marko Mäkelä, 2019-11-12, 5 files, -483/+243)
|\ \ \ \
| |/ / /
| * | | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-11-12, 5 files, -485/+245)
| |\ \ \
| | |/ /
| | * | Merge 10.2 into 10.3 (Marko Mäkelä, 2019-11-12, 5 files, -499/+248)
| | |\ \
| | | |/