path: root/storage/innobase/include/row0log.h
Commit message (Author, Age, Files, Lines)
* Merge 10.5 into 10.6 (Marko Mäkelä, 2021-03-18, 1 file, -0/+5)
|\
| * Merge 10.4 into 10.5 (Marko Mäkelä, 2021-03-18, 1 file, -1/+6)
| |\
| | * Merge 10.3 into 10.4 (Marko Mäkelä, 2021-03-18, 1 file, -0/+5)
| | |\
| |/ /
| | * MDEV-24730 Insert log operation fails after purge resets n_core_fields (Thirunarayanan Balathandayuthapani, 2021-03-12, 1 file, -0/+5)
| | |   The online log for an insert operation on a redundant row format table fails with an index->is_instant() assertion. Purge can reset n_core_fields while ALTER is waiting to upgrade the MDL for the commit phase of the DDL. In the meantime, any INSERT that tries to log the operation fails because the index is no longer instant.
| | |   row_log_get_n_core_fields(): Get the n_core_fields of the online log for the given index.
| | |   rec_get_converted_size_comp_prefix_low(): Use the n_core_fields of the online log when InnoDB calculates the size of a data tuple during a redundant row format table rebuild.
| | |   rec_convert_dtuple_to_rec_comp(): Use the n_core_fields of the online log when InnoDB converts a data tuple to a record during a redundant row format table rebuild.
| | |   - Add a test case that has more than 129 instant columns.
* | | MDEV-515 Reduce InnoDB undo logging for insert into empty table (Marko Mäkelä, 2021-01-25, 1 file, -3/+3)
|/ /
| |   We implement an idea that was suggested by Michael 'Monty' Widenius in October 2017: When InnoDB is inserting into an empty table or partition, we can write a single undo log record TRX_UNDO_EMPTY, which will cause ROLLBACK to clear the table.
| |   For this to work, the insert into an empty table or partition must be covered by an exclusive table lock that will be held until the transaction has been committed or rolled back, or the INSERT operation has been rolled back (and the table is empty again), in lock_table_x_unlock().
| |   Clustered index records that are covered by the TRX_UNDO_EMPTY record will carry DB_TRX_ID=0 and DB_ROLL_PTR=1<<55, and thus they cannot be distinguished from what MDEV-12288 leaves behind after purging the history of row-logged operations.
| |   Concurrent non-locking reads must be adjusted: If the read view was created before the INSERT into an empty table, then we must continue to imagine that the table is empty, and not try to read any records. If the read view was created after the INSERT was committed, then all records must be visible normally. To implement this, we introduce the field dict_table_t::bulk_trx_id.
| |   This special handling only applies to the very first INSERT statement of a transaction for the empty table or partition. If a subsequent statement in the transaction is modifying the initially empty table again, we must enable row-level undo logging, so that we will be able to roll back to the start of the statement in case of an error (such as duplicate key).
| |   INSERT IGNORE will continue to use row-level logging and locking, because implementing it would require the ability to roll back the latest row. Since the undo log that we write only allows us to roll back the entire statement, we cannot support INSERT IGNORE. We will introduce a handler::extra() parameter HA_EXTRA_IGNORE_INSERT to indicate to storage engines that INSERT IGNORE is being executed.
| |   In many test cases, we add an extra record to the table, so that during the 'interesting' part of the test, row-level locking and logging will be used.
| |   Replicas will continue to use row-level logging and locking until MDEV-24622 has been addressed. Likewise, this optimization will be disabled in Galera cluster until MDEV-24623 enables it.
| |   dict_table_t::bulk_trx_id: The latest active or committed transaction that initiated an insert into an empty table or partition. Protected by exclusive table lock and a clustered index leaf page latch.
| |   ins_node_t::bulk_insert: Whether bulk insert was initiated.
| |   trx_t::mod_tables: Use C++11 style accessors (emplace instead of insert). Unlike earlier, this collection will cover also temporary tables.
| |   trx_mod_table_time_t: Add start_bulk_insert(), end_bulk_insert(), is_bulk_insert(), was_bulk_insert().
| |   trx_undo_report_row_operation(): Before accessing any undo log pages, invoke trx->mod_tables.emplace() in order to determine whether undo logging was disabled, or whether this is the first INSERT and we are supposed to write a TRX_UNDO_EMPTY record.
| |   row_ins_clust_index_entry_low(): If we are inserting into an empty clustered index leaf page, set the ins_node_t::bulk_insert flag for the subsequent trx_undo_report_row_operation() call.
| |   lock_rec_insert_check_and_lock(), lock_prdt_insert_check_and_lock(): Remove the redundant parameter 'flags' that can be checked in the caller.
| |   btr_cur_ins_lock_and_undo(): Simplify the logic. Correctly write DB_TRX_ID,DB_ROLL_PTR after invoking trx_undo_report_row_operation().
| |   trx_mark_sql_stat_end(), ha_innobase::extra(HA_EXTRA_IGNORE_INSERT), ha_innobase::external_lock(): Invoke trx_t::end_bulk_insert() so that the next statement will not be covered by table-level undo logging.
| |   ReadView::changes_visible(trx_id_t) const: New accessor for the case where the trx_id_t is not read from a potentially corrupted index page but directly from the memory. In this case, we can skip a sanity check.
| |   row_sel(), row_sel_try_search_shortcut(), row_search_mvcc(), row_sel_try_search_shortcut_for_mysql(), row_merge_read_clustered_index(): Check dict_table_t::bulk_trx_id.
| |   row_sel_clust_sees(): Replaces lock_clust_rec_cons_read_sees().
| |   lock_sec_rec_cons_read_sees(): Replaced with lower-level code.
| |   btr_root_page_init(): Refactored from btr_create().
| |   dict_index_t::clear(), dict_table_t::clear(): Empty an index or table, for the ROLLBACK of an INSERT operation.
| |   ROW_T_EMPTY, ROW_OP_EMPTY: Note a concurrent ROLLBACK of an INSERT into an empty table.
| |   This is joint work with Thirunarayanan Balathandayuthapani, who created a working prototype. Thanks to Matthias Leich for extensive testing.
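The read-view rule that MDEV-515 describes for dict_table_t::bulk_trx_id might be sketched roughly as follows. This is only an illustration under stated assumptions: SimpleReadView, TableVisibility and table_visibility() are hypothetical names, and visibility is reduced to a single "low limit" comparison plus an own-transaction check, which is far simpler than InnoDB's actual ReadView.

```cpp
#include <cassert>
#include <cstdint>

using trx_id_t = std::uint64_t;

// Hypothetical, heavily simplified read view: every transaction whose ID is
// below low_limit_id is assumed to have committed before the view was created.
struct SimpleReadView {
  trx_id_t low_limit_id;
  trx_id_t creator_trx_id;

  bool changes_visible(trx_id_t id) const {
    return id == creator_trx_id || id < low_limit_id;
  }
};

enum class TableVisibility { EMPTY, NORMAL };

// Assumption of this sketch: bulk_trx_id == 0 means that no insert into an
// empty table has been recorded for the table.
inline TableVisibility table_visibility(const SimpleReadView& view,
                                        trx_id_t bulk_trx_id) {
  if (bulk_trx_id != 0 && !view.changes_visible(bulk_trx_id)) {
    // The view predates the bulk insert: keep imagining an empty table.
    return TableVisibility::EMPTY;
  }
  // Otherwise the usual per-record visibility checks apply.
  return TableVisibility::NORMAL;
}
```

A reader that gets EMPTY would skip the scan entirely; a reader that gets NORMAL would go on to check each record as usual.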
* | Merge 10.3 into 10.4 (Marko Mäkelä, 2020-12-01, 1 file, -2/+2)
|\ \
| |/
| * Merge 10.2 into 10.3 (Marko Mäkelä, 2020-12-01, 1 file, -2/+2)
| |\
| | * Cleanup: row_log_free() (Marko Mäkelä, 2020-11-25, 1 file, -2/+2)
| | |   The nonnull attribute is not applicable to parameters that are passed by reference, at least not in the Intel compiler. Let us remove the reference indirection, which was only there so that the pointer could be assigned to NULL, and let the callers perform that task.
| | |   row_log_allocate(): Fix a bug in out-of-memory error handling that would leave a pointer to freed memory.
* | | Merge 10.3 into 10.4 (Marko Mäkelä, 2020-05-05, 1 file, -4/+4)
|\ \ \
| |/ /
| * | Merge branch '10.2' into 10.3 (Oleksandr Byelkin, 2020-05-04, 1 file, -4/+4)
| |\ \
| | |/
| | * MDEV-21595: innodb offset_t rename to rec_offs (Daniel Black, 2020-04-29, 1 file, -4/+4)
| | |   thanks to: perl -i -pe 's/\boffset_t\b/rec_offs/g' $(git grep -lw offset_t storage/innobase)
* | | Merge 10.3 into 10.4 (Marko Mäkelä, 2019-12-13, 1 file, -4/+4)
|\ \ \
| |/ /
| | |   We disable the MDEV-21189 test galera.galera_partition because it times out.
| * | Merge 10.2 into 10.3 (Marko Mäkelä, 2019-12-13, 1 file, -4/+4)
| |\ \
| | |/
| | * MDEV-20950 Reduce size of record offsets [bb-10.2-MDEV-20950-stack-offsets] (Eugene Kosov, 2019-12-13, 1 file, -4/+4)
| | |   offset_t: A type that represents one record offset. It is unsigned short int.
| | |   a lot of functions: Replace ulint with offset_t.
| | |   btr_pcur_restore_position_func(), page_validate(), row_ins_scan_sec_index_for_duplicate(), row_upd_clust_rec_by_insert_inherit_func(), row_vers_impl_x_locked_low(), trx_undo_prev_version_build(): Allocate record offsets on the stack instead of waiting for rec_get_offsets() to allocate them from mem_heap_t, thus reducing memory allocations.
| | |   RECORD_OFFSET, INDEX_OFFSET: It is now less convenient to store pointers in an offset_t* array, because one pointer now occupies several offset_t. These constants are the start indexes into the array where the pointer values are stored.
| | |   REC_OFFS_HEADER_SIZE: Adjusted for the new reality.
| | |   REC_OFFS_NORMAL_SIZE: Increase the size from 100 to 300, which means fewer heap allocations. sizeof(offset_t[REC_OFFS_NORMAL_SIZE]) is now 600 bytes, which is smaller than the previous 800 bytes.
| | |   REC_OFFS_SEC_INDEX_SIZE: Adjusted for the new reality.
| | |   rem0rec.h, rem0rec.ic, rem0rec.cc: The types of various arguments, return values and local variables were changed to fix numerous integer conversion issues.
| | |   enum field_type_t: The concept of offset types was introduced, replacing the old offset flags. As in earlier versions, the 2 upper bits are used to store the offset type, and this enum represents those types.
| | |   REC_OFFS_SQL_NULL, REC_OFFS_MASK: Removed.
| | |   get_type(), set_type(), get_value(), combine(): Convenience functions for working with offsets and their types.
| | |   rec_offs_base()[0]: Still uses the old scheme with the flags REC_OFFS_COMPACT and REC_OFFS_EXTERNAL.
| | |   rec_offs_base()[i]: These have type offset_t now. The two upper bits contain the type.
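The "two upper bits store the type" scheme from MDEV-20950 can be illustrated with a small sketch. The function names get_type(), set_type(), get_value() and combine() come from the commit message, but their signatures here are guesses, and the enumerator names and values (IN_RECORD, OFF_PAGE, SQL_NULL_VALUE, DEFAULT_VALUE) are hypothetical placeholders rather than the ones in rem0rec.h.

```cpp
#include <cassert>
#include <cstdint>

using offset_t = std::uint16_t;  // "unsigned short int", per the commit message

// Hypothetical type tags kept in the two upper bits of an offset_t.
enum field_type_t : offset_t {
  IN_RECORD = 0u << 14,
  OFF_PAGE = 1u << 14,
  SQL_NULL_VALUE = 2u << 14,
  DEFAULT_VALUE = 3u << 14
};

constexpr offset_t TYPE_MASK = 0xC000;   // upper 2 bits
constexpr offset_t VALUE_MASK = 0x3FFF;  // lower 14 bits

constexpr field_type_t get_type(offset_t n) {
  return static_cast<field_type_t>(n & TYPE_MASK);
}

constexpr offset_t get_value(offset_t n) {
  return static_cast<offset_t>(n & VALUE_MASK);
}

// Pack a 14-bit value and a type tag into one offset_t.
constexpr offset_t combine(offset_t value, field_type_t type) {
  return static_cast<offset_t>(get_value(value) | type);
}

// Replace the type tag while keeping the 14-bit value.
inline void set_type(offset_t& n, field_type_t type) {
  n = combine(get_value(n), type);
}

// The size arithmetic from the commit message: 300 two-byte offsets take
// 600 bytes, versus 100 eight-byte ulint offsets (800 bytes on 64-bit).
static_assert(sizeof(offset_t[300]) == 600, "300 offsets occupy 600 bytes");
```

With 14 bits left for the value, an offset of up to 16383 fits, which is the design trade-off that lets the array element shrink to two bytes.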
* | | Merge branch '10.3' into 10.4 (Oleksandr Byelkin, 2019-05-19, 1 file, -1/+1)
|\ \ \
| |/ /
| * | Merge 10.2 into 10.3 (Marko Mäkelä, 2019-05-14, 1 file, -1/+1)
| |\ \
| | |/
| | * Merge 10.1 into 10.2 (Marko Mäkelä, 2019-05-13, 1 file, -1/+1)
| | |\
| | | * Update FSF address (Vicențiu Ciorbaru, 2019-05-11, 1 file, -1/+1)
| | | |
* | | | MDEV-17441 - InnoDB transition to C++11 atomics (Sergey Vojtovich, 2018-12-27, 1 file, -1/+1)
|/ / /
| | |   onlineddl_rowlog_rows transition to Atomic_counter.
* | | Merge 10.2 into 10.3 (Marko Mäkelä, 2018-11-30, 1 file, -3/+1)
|\ \ \
| |/ /
| | |   Also, related to MDEV-15522, MDEV-17304, MDEV-17835, remove the Galera xtrabackup tests, because xtrabackup never worked with MariaDB Server 10.3 due to InnoDB redo log format changes.
| * | Remove some unnecessary InnoDB #include (Marko Mäkelä, 2018-11-29, 1 file, -3/+1)
| | |
* | | MDEV-16365 Setting a column NOT NULL fails to return error for NULL values when there is no DEFAULT (Thirunarayanan Balathandayuthapani, 2018-06-26, 1 file, -1/+3)
| | |   - Fixed the test failure, assigned number of rows read to new table.
* | | MDEV-16365 Setting a column NOT NULL fails to return error for NULL values when there is no DEFAULT (Thirunarayanan Balathandayuthapani, 2018-06-25, 1 file, -1/+3)
| | |   The copy and inplace algorithms work similarly for NULL to NOT NULL conversion in the following cases:
| | |   (1) strict sql mode - Should give an error.
| | |   (2) non-strict sql mode - Should give warnings alone.
| | |   (3) alter ignore table command - Should give warnings alone.
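The three cases listed for MDEV-16365 amount to a small decision table, which might be sketched as below. NullOutcome and null_to_not_null_outcome() are hypothetical names for illustration only, and the treatment of ALTER IGNORE under strict mode (IGNORE wins) is an assumption that the commit message does not spell out.

```cpp
#include <cassert>

enum class NullOutcome { ERROR, WARNING };

// Outcome when a NULL value is met while a column is being changed to
// NOT NULL without a DEFAULT. Assumption: ALTER IGNORE downgrades the
// error to a warning even in strict SQL mode.
constexpr NullOutcome null_to_not_null_outcome(bool strict_sql_mode,
                                               bool alter_ignore) {
  return (strict_sql_mode && !alter_ignore) ? NullOutcome::ERROR
                                            : NullOutcome::WARNING;
}

static_assert(null_to_not_null_outcome(true, false) == NullOutcome::ERROR,
              "case (1): strict sql mode gives an error");
```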
* | | MDEV-14168 Unconditionally allow ALGORITHM=INPLACE for setting a column NOT NULL (Thirunarayanan Balathandayuthapani, 2018-04-24, 1 file, -3/+4)
| | |   - Allow the NOT NULL constraint to replace a NULL value in the row with the explicit or implicit default value.
| | |   - If the default value is a non-constant value, then inplace alter will not support it.
| | |   - ALTER IGNORE will ignore the error if the concurrent DML contains a NULL value.
* | | Merge bb-10.2-ext into 10.3 (Marko Mäkelä, 2017-11-10, 1 file, -7/+2)
|\ \ \
| |/ /
| * | MDEV-13795/MDEV-14332 Corruption during online table-rebuilding ALTER when VIRTUAL columns exist (Marko Mäkelä, 2017-11-09, 1 file, -7/+2)
| | |   When MySQL 5.7 introduced indexed virtual columns, it introduced several bugs into the online table-rebuilding ALTER, that is, the row_log_table_apply() family of functions.
| | |   The online_log format that was introduced for online table-rebuilding ALTER in MySQL 5.6 should be sufficient. Ideally, any indexed virtual column values would be evaluated based on the log records in the temporary file. There is no need to log virtual column values. (For ADD INDEX, that is row_log_apply(), we always must log the values of the keys, no matter if the columns are virtual.)
| | |   Because omitting the virtual column values removes any chance of row_log_table_apply() working with indexed virtual columns, we will for now refuse LOCK=NONE in table-rebuilding ALTER operations when indexes on virtual columns exist. This restriction would be lifted in MDEV-14341.
| | |   innobase_indexed_virtual_exist(): New predicate, to determine if indexed virtual columns exist in a table definition.
| | |   ha_innobase::check_if_supported_inplace_alter(): Refuse online rebuild if indexed virtual columns exist.
| | |   rec_get_converted_size_temp_v(), rec_convert_dtuple_to_temp_v(): Remove.
| | |   row_log_table_delete(), row_log_table_update(), row_log_table_insert(): Remove parameters for virtual columns.
| | |   trx_undo_read_v_rows(): Remove the col_map parameter.
| | |   row_log_table_apply(): Do not deal with virtual columns.
* | | Merge bb-10.2-ext into 10.3 (Marko Mäkelä, 2017-09-01, 1 file, -5/+5)
|\ \ \
| |/ /
| * | Add ATTRIBUTE_NORETURN and ATTRIBUTE_COLD (Marko Mäkelä, 2017-08-31, 1 file, -5/+5)
| | |   ATTRIBUTE_NORETURN is supported on all platforms (MSVS and GCC-like). It declares that a function will not return; instead, the thread or the whole process will terminate.
| | |   ATTRIBUTE_COLD is supported starting with GCC 4.3. It declares that a function is supposed to be executed rarely. Rarely used error-handling functions and functions that emit messages to the error log should be tagged such.
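Macros of this kind are commonly built on compiler-specific attributes. The sketch below shows one plausible shape; SKETCH_NORETURN, SKETCH_COLD, fatal_error() and checked_divide() are hypothetical names, and the actual definitions of ATTRIBUTE_NORETURN and ATTRIBUTE_COLD in the source tree may differ. The error function throws instead of terminating the process purely so that the sketch can be exercised in a test.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-ins for ATTRIBUTE_NORETURN / ATTRIBUTE_COLD.
#if defined(_MSC_VER)
# define SKETCH_NORETURN __declspec(noreturn)
#elif defined(__GNUC__)
# define SKETCH_NORETURN __attribute__((noreturn))
#else
# define SKETCH_NORETURN
#endif

// The cold attribute is available in GCC 4.3 and later, and in Clang.
#if defined(__clang__) || \
    (defined(__GNUC__) && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 3)))
# define SKETCH_COLD __attribute__((cold))
#else
# define SKETCH_COLD
#endif

// A rarely executed error path that never returns normally to its caller.
SKETCH_NORETURN SKETCH_COLD void fatal_error(const char* msg) {
  throw std::runtime_error(msg);
}

int checked_divide(int a, int b) {
  if (b == 0) {
    fatal_error("division by zero");  // control never comes back here
  }
  return a / b;
}
```

Tagging the error path this way lets the compiler move it out of the hot code and avoids spurious "missing return" warnings in callers.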
* | | MDEV-13654 Various crashes due to DB_TRX_ID mismatch in table-rebuilding ALTER TABLE…LOCK=NONE (Marko Mäkelä, 2017-09-01, 1 file, -0/+1)
|/ /
| |   After MDEV-12288 and MDEV-13536, the DB_TRX_ID of old clustered index records for which no history is available should be reset to 0. This caused crashes in online table-rebuilding ALTER, because the row_log_table_apply() is built on the assumption that the PRIMARY KEY together with DB_TRX_ID,DB_ROLL_PTR identifies the record.
| |   Both when copying the old table and when writing log about changes to the old table, we must map "old" DB_TRX_ID to 0. "old" here is simply "older than the trx_id of the ALTER TABLE transaction", because the MDL_EXCLUSIVE (and exclusive InnoDB table lock) in ha_innobase::prepare_inplace_alter_table() forces any transactions accessing the table to commit or rollback. So, we know that we can safely reset any DB_TRX_ID in the table that is older than the transaction ID of the ALTER TABLE, because the undo log history would be lost in a table-rebuilding ALTER.
| |   Note: After a table-rebuilding online ALTER TABLE, the rebuilt table may end up containing some nonzero DB_TRX_ID columns. The apply logic identifies the rows by the combination of PRIMARY KEY and DB_TRX_ID. These nonzero DB_TRX_ID would necessarily refer to concurrent DML operations that were started during ha_innobase::inplace_alter_table().
| |   row_log_allocate(): Add a parameter for the ALTER TABLE transaction.
| |   row_log_t::min_trx: The ALTER TABLE transaction ID.
| |   trx_id_check(): A debug function to check that DB_TRX_ID makes sense (is either 0 or bigger than the ALTER TABLE transaction ID).
| |   reset_trx_id[]: The reset DB_TRX_ID,DB_ROLL_PTR columns.
| |   row_log_table_delete(), row_log_table_get_pk(): Reset the DB_TRX_ID,DB_ROLL_PTR when they precede the ALTER TABLE transaction.
| |   row_log_table_apply_delete(), row_log_table_apply_update(): Assert trx_id_check().
| |   row_merge_insert_index_tuples(): Remove the unused parameter trx_id.
| |   row_merge_read_clustered_index(): In a table-rebuilding ALTER, reset the DB_TRX_ID,DB_ROLL_PTR when they precede the ALTER TABLE transaction. Assert trx_id_check() on clustered index records that are being buffered.
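The "map old DB_TRX_ID to 0" rule and the trx_id_check() sanity condition from MDEV-13654 can be expressed in a few lines. This is a sketch under assumptions: reset_old_trx_id() and trx_id_check_sketch() are hypothetical helpers that operate on plain integers, whereas the real code works on record fields and also resets DB_ROLL_PTR.

```cpp
#include <cassert>
#include <cstdint>

using trx_id_t = std::uint64_t;

// Map any DB_TRX_ID that precedes the ALTER TABLE transaction to 0,
// because its undo log history is lost when the table is rebuilt.
constexpr trx_id_t reset_old_trx_id(trx_id_t rec_trx_id, trx_id_t alter_trx_id) {
  return rec_trx_id < alter_trx_id ? 0 : rec_trx_id;
}

// Debug-style check: a DB_TRX_ID "makes sense" if it is either 0 or
// bigger than the ALTER TABLE transaction ID.
constexpr bool trx_id_check_sketch(trx_id_t rec_trx_id, trx_id_t alter_trx_id) {
  return rec_trx_id == 0 || rec_trx_id > alter_trx_id;
}

static_assert(reset_old_trx_id(5, 10) == 0, "older IDs are reset to 0");
```

After the reset, every ID written to the log or the rebuilt table should satisfy the check, which is what the added assertions verify.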
* | MDEV-12271 Port MySQL 8.0 Bug#23150562 REMOVE UNIV_MUST_NOT_INLINE AND UNIV_NONINL (Marko Mäkelä, 2017-03-17, 1 file, -2/+0)
| |   Also, remove empty .ic files that were not removed by my MySQL commit.
| |   Problem: InnoDB used to support a compilation mode that allowed to choose whether the function definitions in .ic files are to be inlined or not. This stopped making sense when InnoDB moved to C++ in MySQL 5.6 (and ha_innodb.cc started to #include .ic files), and more so in MySQL 5.7 when inline methods and functions were introduced in .h files.
| |   Solution: Remove all references to UNIV_NONINL and UNIV_MUST_NOT_INLINE from all files, assuming that the symbols are never defined. Remove the files fut0fut.cc and ut0byte.cc which only mattered when UNIV_NONINL was defined.
* | Merge InnoDB 5.7 from mysql-5.7.14. (Jan Lindström, 2016-09-08, 1 file, -3/+14)
| |   Contains also:
| |   MDEV-10549 mysqld: sql/handler.cc:2692: int handler::ha_index_first(uchar*): Assertion `table_share->tmp_table != NO_TMP_TABLE || m_lock_type != 2' failed. (branch bb-10.2-jan)
| |   Unlike MySQL, InnoDB still uses THR_LOCK in MariaDB
| |   MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
| |   enable tests that were fixed in MDEV-10549
| |   MDEV-10548 Some of the debug sync waits do not work with InnoDB 5.7 (branch bb-10.2-jan)
| |   fix main.innodb_mysql_sync - re-enable online alter for partitioned innodb tables
* | Merge InnoDB 5.7 from mysql-5.7.9. (Jan Lindström, 2016-09-02, 1 file, -39/+50)
|/
|   Contains also
|   MDEV-10547: Test multi_update_innodb fails with InnoDB 5.7
|   The failure happened because 5.7 has changed the signature of the bool handler::primary_key_is_clustered() const virtual function ("const" was added). InnoDB was using the old signature which caused the function not to be used.
|   MDEV-10550: Parallel replication lock waits/deadlock handling does not work with InnoDB 5.7
|   Fixed mutexing problem on lock_trx_handle_wait. Note that rpl_parallel and rpl_optimistic_parallel tests still fail.
|   MDEV-10156 : Group commit tests fail on 10.2 InnoDB (branch bb-10.2-jan)
|   Reason: incorrect merge
|   MDEV-10550: Parallel replication can't sync with master in InnoDB 5.7 (branch bb-10.2-jan)
|   Reason: incorrect merge
* Merge branch '10.0' into 10.1 (Sergei Golubchik, 2016-06-28, 1 file, -16/+16)
|\
| * 5.6.31 (Sergei Golubchik, 2016-06-21, 1 file, -16/+16)
| |
* | Merge branch '10.0' into 10.1 (Sergei Golubchik, 2016-02-23, 1 file, -2/+3)
|\ \
| |/
| * 5.6.29 (Sergei Golubchik, 2016-02-16, 1 file, -2/+3)
| |
| * move to storage/innobase (Sergei Golubchik, 2015-05-04, 1 file, -0/+239)
|
* MDEV-6812: Merge Kakao: Add global status variables which tell you the progress of inplace alter table and row log buffer usage (Jan Lindström, 2014-09-30, 1 file, -0/+4)
|   - Percentages are multiplied by 100 and reported as integers: 10000 means 100.00%.
|   - Innodb_onlineddl_rowlog_rows: Shows how many rows are stored in the row log buffer.
|   - Innodb_onlineddl_rowlog_pct_used: Shows the row log buffer usage in percent (multiplied by 100; 10000 means 100.00%).
|   - Innodb_onlineddl_pct_progress: Shows the progress of inplace alter table. It might not be very accurate, because inplace alter depends heavily on disk and buffer pool status, but it is still useful and better than nothing.
|   - Add some logging for inplace alter table: XtraDB/InnoDB will print a message before and after doing some tasks.
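The scaled-percentage convention used by these status variables (10000 means 100.00%) can be illustrated as follows. The helpers pct_x100() and format_pct() are hypothetical and are not taken from the server source; reporting 0 for an empty total is also an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Scale a ratio to hundredths of a percent, so that 10000 means 100.00%.
// Note: done * 10000 could overflow for extremely large counts; a real
// implementation would need to guard against that.
inline std::uint32_t pct_x100(std::uint64_t done, std::uint64_t total) {
  if (total == 0) {
    return 0;  // assumption: report 0 when there is nothing to measure
  }
  assert(done <= total);
  return static_cast<std::uint32_t>(done * 10000 / total);
}

// Render a scaled value for humans, e.g. 2505 becomes "25.05%".
inline std::string format_pct(std::uint32_t scaled) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "%u.%02u%%",
                static_cast<unsigned>(scaled / 100),
                static_cast<unsigned>(scaled % 100));
  return buf;
}
```

Keeping the value as an integer avoids floating-point status variables while still giving two decimal places of precision.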
* innodb 5.6.17 (Sergei Golubchik, 2014-05-07, 1 file, -6/+7)
|
* MDEV-5574 Set AUTO_INCREMENT below max value of column. (Sergei Golubchik, 2014-02-01, 1 file, -17/+14)
|   Update InnoDB to 5.6.14
|   Apply MySQL-5.6 hack for MySQL Bug#16434374
|   Move Aria-only HA_RTREE_INDEX from my_base.h to maria_def.h (breaks an assert in InnoDB)
|   Fix InnoDB memory leak
* Temporary commit of 10.0-merge (Michael Widenius, 2013-03-26, 1 file, -0/+241)