author    Marko Mäkelä <marko.makela@mariadb.com>    2017-10-06 07:00:05 +0300
committer Marko Mäkelä <marko.makela@mariadb.com>    2017-10-06 09:50:10 +0300
commit    a4948dafcd7eee65f16d848bdc6562fc49ef8916 (patch)
tree      f2f404bfab72b6b0f280dbc8cc468682ed7a7bd2
parent    3a418242dffe93ee34db388727f67eb498ae48ee (diff)
download  mariadb-git-a4948dafcd7eee65f16d848bdc6562fc49ef8916.tar.gz
MDEV-11369 Instant ADD COLUMN for InnoDB
For InnoDB tables, adding, dropping and reordering columns has required a rebuild of the table and all its indexes. Since MySQL 5.6 (and MariaDB 10.0) this has been supported online (LOCK=NONE), allowing concurrent modification of the tables.

This work revises the InnoDB ROW_FORMAT=REDUNDANT, ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC record formats so that columns can be appended instantaneously, with only minor changes performed to the table structure. The counter innodb_instant_alter_column in INFORMATION_SCHEMA.GLOBAL_STATUS is incremented whenever a table rebuild operation is converted into an instant ADD COLUMN operation.

ROW_FORMAT=COMPRESSED tables will not support instant ADD COLUMN.

Some usability limitations will be addressed in subsequent work:

MDEV-13134 Introduce ALTER TABLE attributes ALGORITHM=NOCOPY and ALGORITHM=INSTANT
MDEV-14016 Allow instant ADD COLUMN, ADD INDEX, LOCK=NONE

The format of the clustered index (PRIMARY KEY) is changed as follows:

(1) The FIL_PAGE_TYPE of the root page will be FIL_PAGE_TYPE_INSTANT, and a new field PAGE_INSTANT will contain the original number of fields in the clustered index ('core' fields). If instant ADD COLUMN has not been used or the table becomes empty, or the very first instant ADD COLUMN operation is rolled back, the fields PAGE_INSTANT and FIL_PAGE_TYPE will be reset to 0 and FIL_PAGE_INDEX.

(2) A special 'default row' record is inserted into the leftmost leaf, between the page infimum and the first user record. This record is distinguished by the REC_INFO_MIN_REC_FLAG, and it is otherwise in the same format as records that contain values for the instantly added columns. This 'default row' always has the same number of fields as the clustered index according to the table definition. The values of 'core' fields are to be ignored. For other fields, the 'default row' will contain the default values as they were during the ALTER TABLE statement.
(If the column default values are changed later, those values will only be stored in the .frm file. The 'default row' will contain the original evaluated values, which must be the same for every row.) The 'default row' must be completely hidden from higher-level access routines. Assertions have been added to ensure that no 'default row' is ever present in the adaptive hash index or in locked records. The 'default row' is never delete-marked.

(3) In clustered index leaf page records, the number of fields must reside between the number of 'core' fields (dict_index_t::n_core_fields introduced in this work) and dict_index_t::n_fields. If the number of fields is less than dict_index_t::n_fields, the missing fields are replaced with the column value of the 'default row'. Note: The number of fields in the record may shrink if some of the last instantly added columns are updated to the value that is in the 'default row'. The function btr_cur_trim() implements this 'compression' on update and rollback; dtuple::trim() implements it on insert.

(4) In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC records, the new status value REC_STATUS_COLUMNS_ADDED will indicate the presence of a new record header that will encode n_fields-n_core_fields-1 in 1 or 2 bytes. (In ROW_FORMAT=REDUNDANT records, the record header always explicitly encodes the number of fields.)

We introduce the undo log record type TRX_UNDO_INSERT_DEFAULT for covering the insert of the 'default row' record when instant ADD COLUMN is used for the first time. Subsequent instant ADD COLUMN can use TRX_UNDO_UPD_EXIST_REC.

This is joint work with Vin Chen (陈福荣) from Tencent. The design that was discussed in April 2017 would not have allowed import or export of data files, because instead of the 'default row' it would have introduced a data dictionary table. The test rpl.rpl_alter_instant is exactly as contributed in pull request #408. The test innodb.instant_alter is based on a contributed test.
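The field-lookup rule of point (3) can be illustrated with a small standalone model (the struct and function names here are hypothetical, not the actual InnoDB code): a stored record may carry anywhere from n_core_fields to n_fields values, and any missing trailing field is served from the 'default row'.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Illustrative model only: a clustered index leaf record written before
// some instant ADD COLUMN operations stores fewer fields than the current
// table definition. Missing trailing fields come from the 'default row'.
struct InstantIndex {
    std::vector<std::string> default_row; // one value per field (n_fields total)
    size_t n_core_fields;                 // fields present in every record
};

// Read field i of a record that physically stores rec_fields.size() values.
std::string instant_field(const InstantIndex& index,
                          const std::vector<std::string>& rec_fields,
                          size_t i) {
    assert(rec_fields.size() >= index.n_core_fields);
    assert(i < index.default_row.size());
    return i < rec_fields.size() ? rec_fields[i]
                                 : index.default_row[i]; // instantly added
}
```

A record from before the ALTER TABLE answers queries on the new column with the value evaluated at ALTER time, without being rewritten.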
The redo log record format changes for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT are as contributed. (With this change present, crash recovery from MariaDB 10.3.1 will fail in spectacular ways!) Also the semantics of higher-level redo log records that modify the PAGE_INSTANT field is changed. The redo log format version identifier was already changed to LOG_HEADER_FORMAT_CURRENT=103 in MariaDB 10.3.1.

Everything else has been rewritten by me. Thanks to Elena Stepanova, the code has been tested extensively.

When rolling back an instant ADD COLUMN operation, we must empty the PAGE_FREE list after deleting or shortening the 'default row' record, by calling either btr_page_empty() or btr_page_reorganize(). We must know the size of each entry in the PAGE_FREE list. If rollback left a freed copy of the 'default row' in the PAGE_FREE list, we would be unable to determine its size (if it is in ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC) because it would contain more fields than the rolled-back definition of the clustered index.

UNIV_SQL_DEFAULT: A new special constant that designates an instantly added column that is not present in the clustered index record.

len_is_stored(): Check if a length is an actual length. There are two magic length values: UNIV_SQL_DEFAULT, UNIV_SQL_NULL.

dict_col_t::def_val: The 'default row' value of the column. If the column is not added instantly, def_val.len will be UNIV_SQL_DEFAULT.

dict_col_t: Add the accessors is_virtual(), is_nullable(), is_instant(), instant_value().

dict_col_t::remove_instant(): Remove the 'instant ADD' status of a column.

dict_col_t::name(const dict_table_t& table): Replaces dict_table_get_col_name().

dict_index_t::n_core_fields: The original number of fields. For secondary indexes and if instant ADD COLUMN has not been used, this will be equal to dict_index_t::n_fields.

dict_index_t::n_core_null_bytes: Number of bytes needed to represent the null flags; usually equal to UT_BITS_IN_BYTES(n_nullable).
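The n_core_null_bytes bookkeeping follows the usual bit-packing rule that UT_BITS_IN_BYTES expresses: one null-flag bit per nullable field, rounded up to whole bytes. A minimal sketch (the macro name is from the source; this body is a plausible equivalent, not a copy):

```cpp
#include <cassert>
#include <cstddef>

// One null-flag bit per nullable column, rounded up to whole bytes.
// Matches the intent of InnoDB's UT_BITS_IN_BYTES(n); the exact macro
// body in the source may differ.
constexpr size_t bits_in_bytes(size_t n_bits) {
    return (n_bits + 7) / 8;
}
```

So an index with 9 nullable fields needs 2 null bytes, and instant ADD COLUMN of a nullable column can grow this count past the 'core' value recorded for old records.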
dict_index_t::NO_CORE_NULL_BYTES: Magic value signalling that n_core_null_bytes was not initialized yet from the clustered index root page.

dict_index_t: Add the accessors is_instant(), is_clust(), get_n_nullable(), instant_field_value().

dict_index_t::instant_add_field(): Adjust clustered index metadata for instant ADD COLUMN.

dict_index_t::remove_instant(): Remove the 'instant ADD' status of a clustered index when the table becomes empty, or the very first instant ADD COLUMN operation is rolled back.

dict_table_t: Add the accessors is_instant(), is_temporary(), supports_instant().

dict_table_t::instant_add_column(): Adjust metadata for instant ADD COLUMN.

dict_table_t::rollback_instant(): Adjust metadata on the rollback of instant ADD COLUMN.

prepare_inplace_alter_table_dict(): First create the ctx->new_table, and only then decide if the table really needs to be rebuilt. We must split the creation of table or index metadata from the creation of the dictionary table records and the creation of the data. In this way, we can transform a table-rebuilding operation into an instant ADD COLUMN operation. Dictionary objects will only be added to cache when table rebuilding or index creation is needed. The ctx->instant_table will never be added to cache.

dict_table_t::add_to_cache(): Modified and renamed from dict_table_add_to_cache(). Do not modify the table metadata. Let the callers invoke dict_table_add_system_columns() and if needed, set can_be_evicted.

dict_create_sys_tables_tuple(), dict_create_table_step(): Omit the system columns (which will now exist in the dict_table_t object already at this point).

dict_create_table_step(): Expect the callers to invoke dict_table_add_system_columns().

pars_create_table(): Before creating the table creation execution graph, invoke dict_table_add_system_columns().

row_create_table_for_mysql(): Expect all callers to invoke dict_table_add_system_columns().

create_index_dict(): Replaces row_merge_create_index_graph().
innodb_update_n_cols(): Renamed from innobase_update_n_virtual(). Call my_error() if an error occurs.

btr_cur_instant_init(), btr_cur_instant_init_low(), btr_cur_instant_root_init(): Load additional metadata from the clustered index and set dict_index_t::n_core_null_bytes. This is invoked when table metadata is first loaded into the data dictionary.

dict_boot(): Initialize n_core_null_bytes for the four hard-coded dictionary tables.

dict_create_index_step(): Initialize n_core_null_bytes. This is executed as part of CREATE TABLE.

dict_index_build_internal_clust(): Initialize n_core_null_bytes to NO_CORE_NULL_BYTES if table->supports_instant().

row_create_index_for_mysql(): Initialize n_core_null_bytes for CREATE TEMPORARY TABLE.

commit_cache_norebuild(): Call the code to rename or enlarge columns in the cache only if instant ADD COLUMN is not being used. (Instant ADD COLUMN would copy all column metadata from instant_table to old_table, including the names and lengths.)

PAGE_INSTANT: A new 13-bit field for storing dict_index_t::n_core_fields. This is repurposing the 16-bit field PAGE_DIRECTION, of which only the least significant 3 bits were used. The original byte containing PAGE_DIRECTION will be accessible via the new constant PAGE_DIRECTION_B.

page_get_instant(), page_set_instant(): Accessors for PAGE_INSTANT.

page_ptr_get_direction(), page_get_direction(), page_ptr_set_direction(): Accessors for PAGE_DIRECTION.

page_direction_reset(): Reset PAGE_DIRECTION, PAGE_N_DIRECTION.

page_direction_increment(): Increment PAGE_N_DIRECTION and set PAGE_DIRECTION.

rec_get_offsets(): Use the 'leaf' parameter for non-debug purposes, and assume that heap_no is always set. Initialize all dict_index_t::n_fields for ROW_FORMAT=REDUNDANT records, even if the record contains fewer fields.

rec_offs_make_valid(): Add the parameter 'leaf'.

rec_copy_prefix_to_dtuple(): Assert that the tuple is only built on the core fields.
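The PAGE_INSTANT repurposing described above can be modelled as a bit-packing exercise: 13 bits for n_core_fields sharing one 16-bit word with the 3-bit direction code. The split widths match the commit message; the exact bit positions used by the source are an assumption here.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of packing the 13-bit PAGE_INSTANT value next to the 3-bit
// direction code in a single 16-bit field. Bit layout (high 13 bits for
// n_core_fields, low 3 bits for direction) is illustrative; only the
// field widths are taken from the commit message.
constexpr uint16_t pack_instant(uint16_t n_core_fields, uint16_t direction) {
    return static_cast<uint16_t>((n_core_fields << 3) | (direction & 7));
}
constexpr uint16_t get_instant(uint16_t field)   { return field >> 3; }
constexpr uint16_t get_direction(uint16_t field) { return field & 7; }
```

Because legacy pages only ever used the low 3 bits, a zero PAGE_INSTANT value is backward compatible with pages written before this change.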
Instant ADD COLUMN only applies to the clustered index, and we should never build a search key that has more than the PRIMARY KEY and possibly DB_TRX_ID,DB_ROLL_PTR. All these columns are always present.

dict_index_build_data_tuple(): Remove assertions that would be duplicated in rec_copy_prefix_to_dtuple().

rec_init_offsets(): Support ROW_FORMAT=REDUNDANT records whose number of fields is between n_core_fields and n_fields.

cmp_rec_rec_with_match(): Implement the comparison between two MIN_REC_FLAG records.

trx_t::in_rollback: Make the field available in non-debug builds.

trx_start_for_ddl_low(): Remove dangerous error-tolerance. A dictionary transaction must be flagged as such before it has generated any undo log records. This is because trx_undo_assign_undo() will mark the transaction as a dictionary transaction in the undo log header right before the very first undo log record is being written.

btr_index_rec_validate(): Account for instant ADD COLUMN.

row_undo_ins_remove_clust_rec(): On the rollback of an insert into SYS_COLUMNS, revert instant ADD COLUMN in the cache by removing the last column from the table and the clustered index.

row_search_on_row_ref(), row_undo_mod_parse_undo_rec(), row_undo_mod(), trx_undo_update_rec_get_update(): Handle the 'default row' as a special case.

dtuple_t::trim(index): Omit a redundant suffix of an index tuple right before insert or update. After instant ADD COLUMN, if the last fields of a clustered index tuple match the 'default row', there is no need to store them. While trimming the entry, we must hold a page latch, so that the table cannot be emptied and the 'default row' be deleted.

btr_cur_optimistic_update(), btr_cur_pessimistic_update(), row_upd_clust_rec_by_insert(), row_ins_clust_index_entry_low(): Invoke dtuple_t::trim() if needed.

row_ins_clust_index_entry(): Restore dtuple_t::n_fields after calling row_ins_clust_index_entry_low().
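The trimming rule of dtuple_t::trim() can be sketched as follows (names and value types are illustrative, not the actual InnoDB code): drop the trailing run of fields whose values match the 'default row', but never trim below n_core_fields.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch of dtuple_t::trim(): before insert or update, omit a trailing
// run of clustered-index fields whose values equal the 'default row'
// values, but never shrink the entry below n_core_fields.
void trim_entry(std::vector<std::string>& entry,
                const std::vector<std::string>& default_row,
                size_t n_core_fields) {
    size_t n = entry.size();
    while (n > n_core_fields && entry[n - 1] == default_row[n - 1]) {
        --n; // this suffix need not be stored
    }
    entry.resize(n);
}
```

This is why a record's field count can shrink on update: setting the last instantly added columns back to their default values makes the suffix trimmable again.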
rec_get_converted_size(), rec_get_converted_size_comp(): Allow the number of fields to be between n_core_fields and n_fields. Do not support infimum,supremum. They are never supposed to be stored in dtuple_t, because page creation nowadays uses a lower-level method for initializing them.

rec_convert_dtuple_to_rec_comp(): Assign the status bits based on the number of fields.

btr_cur_trim(): In an update, trim the index entry as needed. For the 'default row', handle rollback specially. For user records, omit fields that match the 'default row'.

btr_cur_optimistic_delete_func(), btr_cur_pessimistic_delete(): Skip locking and adaptive hash index for the 'default row'.

row_log_table_apply_convert_mrec(): Replace 'default row' values if needed. In the temporary file that is applied by row_log_table_apply(), we must identify whether the records contain the extra header for instantly added columns. For now, we will allocate an additional byte for this for ROW_T_INSERT and ROW_T_UPDATE records when the source table has been subject to instant ADD COLUMN. The ROW_T_DELETE records are fine, as they will be converted and will only contain 'core' columns (PRIMARY KEY and some system columns) that are converted from dtuple_t.

rec_get_converted_size_temp(), rec_init_offsets_temp(), rec_convert_dtuple_to_temp(): Add the parameter 'status'.

REC_INFO_DEFAULT_ROW = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED: An info_bits constant for distinguishing the 'default row' record.

rec_comp_status_t: An enum of the status bit values.

rec_leaf_format: An enum that replaces the bool parameter of rec_init_offsets_comp_ordinary().
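Point (4) earlier says the REC_STATUS_COLUMNS_ADDED header encodes n_fields-n_core_fields-1 in 1 or 2 bytes, but does not spell out the scheme. One common way to achieve a 1-or-2-byte encoding is to use the high bit of the first byte as a continuation flag; the sketch below is a hypothetical illustration of that idea, not the actual on-disk format.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical 1-or-2-byte encoding of the "added fields minus one"
// value. The commit message only states the size (1 or 2 bytes); the
// continuation-flag scheme here is an assumption for illustration.
std::vector<uint8_t> encode_n_added(uint16_t v) {
    if (v < 0x80) return {static_cast<uint8_t>(v)};           // 1 byte
    return {static_cast<uint8_t>(0x80 | (v >> 8)),            // 2 bytes
            static_cast<uint8_t>(v & 0xff)};
}
uint16_t decode_n_added(const std::vector<uint8_t>& b) {
    if (!(b[0] & 0x80)) return b[0];
    return static_cast<uint16_t>(((b[0] & 0x7f) << 8) | b[1]);
}
```

A table with only a few instantly added columns pays a single extra header byte per record; the second byte is needed only for large field counts.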
-rw-r--r--  mysql-test/suite/encryption/r/innodb-bad-key-change.result  64
-rw-r--r--  mysql-test/suite/encryption/r/innodb-bad-key-change2.result  36
-rw-r--r--  mysql-test/suite/encryption/r/innodb-bad-key-change4.result  11
-rw-r--r--  mysql-test/suite/encryption/r/innodb-encryption-disable.result  5
-rw-r--r--  mysql-test/suite/encryption/r/innodb-force-corrupt.result  5
-rw-r--r--  mysql-test/suite/encryption/r/innodb-missing-key.result  7
-rw-r--r--  mysql-test/suite/encryption/t/innodb-bad-key-change.test  21
-rw-r--r--  mysql-test/suite/encryption/t/innodb-bad-key-change2.test  26
-rw-r--r--  mysql-test/suite/encryption/t/innodb-bad-key-change4.test  1
-rw-r--r--  mysql-test/suite/encryption/t/innodb-encryption-disable.test  5
-rw-r--r--  mysql-test/suite/encryption/t/innodb-force-corrupt.test  5
-rw-r--r--  mysql-test/suite/encryption/t/innodb-missing-key.test  7
-rw-r--r--  mysql-test/suite/innodb/r/alter_rename_existing.result  7
-rw-r--r--  mysql-test/suite/innodb/r/innodb-alter-debug.result  8
-rw-r--r--  mysql-test/suite/innodb/r/innodb-index-debug.result  2
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter,32k.rdiff  83
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter,4k.rdiff  191
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter,64k.rdiff  83
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter,8k.rdiff  191
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter.result  bin 0 -> 117087 bytes
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter_crash.result  103
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter_debug.result  167
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter_inject.result  66
-rw-r--r--  mysql-test/suite/innodb/r/instant_alter_rollback.result  90
-rw-r--r--  mysql-test/suite/innodb/r/truncate_debug.result  6
-rw-r--r--  mysql-test/suite/innodb/t/alter_rename_existing.test  6
-rw-r--r--  mysql-test/suite/innodb/t/innodb-alter-debug.test  8
-rw-r--r--  mysql-test/suite/innodb/t/innodb-index-debug.test  2
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter.opt  1
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter.test  261
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter_crash.test  123
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter_debug.test  181
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter_inject.test  46
-rw-r--r--  mysql-test/suite/innodb/t/instant_alter_rollback.test  70
-rw-r--r--  mysql-test/suite/innodb/t/truncate_debug.test  7
-rw-r--r--  mysql-test/suite/innodb_gis/r/alter_spatial_index.result  2
-rw-r--r--  mysql-test/suite/innodb_gis/t/alter_spatial_index.test  12
-rw-r--r--  mysql-test/suite/rpl/r/rpl_alter_instant.result  66
-rw-r--r--  mysql-test/suite/rpl/t/rpl_alter_instant.test  50
-rw-r--r--  storage/innobase/btr/btr0btr.cc  177
-rw-r--r--  storage/innobase/btr/btr0bulk.cc  15
-rw-r--r--  storage/innobase/btr/btr0cur.cc  557
-rw-r--r--  storage/innobase/btr/btr0sea.cc  137
-rw-r--r--  storage/innobase/buf/buf0buf.cc  6
-rw-r--r--  storage/innobase/buf/buf0dblwr.cc  2
-rw-r--r--  storage/innobase/buf/buf0flu.cc  1
-rw-r--r--  storage/innobase/data/data0data.cc  34
-rw-r--r--  storage/innobase/data/data0type.cc  2
-rw-r--r--  storage/innobase/dict/dict0boot.cc  24
-rw-r--r--  storage/innobase/dict/dict0crea.cc  45
-rw-r--r--  storage/innobase/dict/dict0dict.cc  107
-rw-r--r--  storage/innobase/dict/dict0load.cc  12
-rw-r--r--  storage/innobase/dict/dict0mem.cc  284
-rw-r--r--  storage/innobase/dict/dict0stats.cc  18
-rw-r--r--  storage/innobase/fts/fts0fts.cc  17
-rw-r--r--  storage/innobase/gis/gis0rtree.cc  2
-rw-r--r--  storage/innobase/handler/ha_innodb.cc  16
-rw-r--r--  storage/innobase/handler/handler0alter.cc  990
-rw-r--r--  storage/innobase/handler/i_s.cc  12
-rw-r--r--  storage/innobase/ibuf/ibuf0ibuf.cc  2
-rw-r--r--  storage/innobase/include/btr0btr.h  14
-rw-r--r--  storage/innobase/include/btr0cur.h  18
-rw-r--r--  storage/innobase/include/btr0cur.ic  4
-rw-r--r--  storage/innobase/include/btr0sea.h  20
-rw-r--r--  storage/innobase/include/data0data.h  9
-rw-r--r--  storage/innobase/include/data0data.ic  3
-rw-r--r--  storage/innobase/include/data0type.h  6
-rw-r--r--  storage/innobase/include/dict0dict.h  51
-rw-r--r--  storage/innobase/include/dict0dict.ic  17
-rw-r--r--  storage/innobase/include/dict0mem.h  163
-rw-r--r--  storage/innobase/include/dict0mem.ic  1
-rw-r--r--  storage/innobase/include/fil0fil.h  21
-rw-r--r--  storage/innobase/include/fil0fil.ic  2
-rw-r--r--  storage/innobase/include/gis0rtree.ic  2
-rw-r--r--  storage/innobase/include/page0page.h  101
-rw-r--r--  storage/innobase/include/page0page.ic  93
-rw-r--r--  storage/innobase/include/rem0rec.h  443
-rw-r--r--  storage/innobase/include/rem0rec.ic  402
-rw-r--r--  storage/innobase/include/row0merge.h  2
-rw-r--r--  storage/innobase/include/row0upd.h  32
-rw-r--r--  storage/innobase/include/srv0srv.h  3
-rw-r--r--  storage/innobase/include/trx0rec.h  5
-rw-r--r--  storage/innobase/include/trx0trx.h  2
-rw-r--r--  storage/innobase/lock/lock0lock.cc  60
-rw-r--r--  storage/innobase/mtr/mtr0log.cc  50
-rw-r--r--  storage/innobase/page/page0cur.cc  303
-rw-r--r--  storage/innobase/page/page0page.cc  64
-rw-r--r--  storage/innobase/page/page0zip.cc  10
-rw-r--r--  storage/innobase/pars/pars0pars.cc  9
-rw-r--r--  storage/innobase/rem/rem0cmp.cc  19
-rw-r--r--  storage/innobase/rem/rem0rec.cc  737
-rw-r--r--  storage/innobase/row/row0ftsort.cc  5
-rw-r--r--  storage/innobase/row/row0import.cc  28
-rw-r--r--  storage/innobase/row/row0ins.cc  52
-rw-r--r--  storage/innobase/row/row0log.cc  139
-rw-r--r--  storage/innobase/row/row0merge.cc  82
-rw-r--r--  storage/innobase/row/row0mysql.cc  7
-rw-r--r--  storage/innobase/row/row0purge.cc  19
-rw-r--r--  storage/innobase/row/row0row.cc  100
-rw-r--r--  storage/innobase/row/row0sel.cc  84
-rw-r--r--  storage/innobase/row/row0trunc.cc  18
-rw-r--r--  storage/innobase/row/row0uins.cc  107
-rw-r--r--  storage/innobase/row/row0umod.cc  22
-rw-r--r--  storage/innobase/row/row0undo.cc  5
-rw-r--r--  storage/innobase/row/row0upd.cc  74
-rw-r--r--  storage/innobase/row/row0vers.cc  4
-rw-r--r--  storage/innobase/trx/trx0rec.cc  92
-rw-r--r--  storage/innobase/trx/trx0roll.cc  18
-rw-r--r--  storage/innobase/trx/trx0trx.cc  9
109 files changed, 6133 insertions, 1883 deletions
diff --git a/mysql-test/suite/encryption/r/innodb-bad-key-change.result b/mysql-test/suite/encryption/r/innodb-bad-key-change.result
index 2e87b85489e..71ad4909899 100644
--- a/mysql-test/suite/encryption/r/innodb-bad-key-change.result
+++ b/mysql-test/suite/encryption/r/innodb-bad-key-change.result
@@ -1,5 +1,6 @@
call mtr.add_suppression("Plugin 'file_key_management' init function returned error");
call mtr.add_suppression("Plugin 'file_key_management' registration.*failed");
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[12]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[12]\\.ibd' cannot be decrypted\\.");
call mtr.add_suppression("File '.*mysql-test.std_data.keysbad3\\.txt' not found");
# Start server with keys2.txt
@@ -25,15 +26,17 @@ foobar 2
# Restart server with keysbad3.txt
SELECT * FROM t1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
Warning 192 Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Warning 192 Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t1' doesn't exist in engine
DROP TABLE t1;
+Warnings:
+Warning 192 Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
SHOW WARNINGS;
Level Code Message
+Warning 192 Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
# Start server with keys3.txt
SET GLOBAL innodb_default_encryption_key_id=5;
CREATE TABLE t2 (c VARCHAR(8), id int not null primary key, b int, key(b)) ENGINE=InnoDB ENCRYPTED=YES;
@@ -41,74 +44,63 @@ INSERT INTO t2 VALUES ('foobar',1,2);
# Restart server with keys2.txt
SELECT * FROM t2;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
Warning 192 Table test/t2 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
SELECT * FROM t2 where id = 1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
SELECT * FROM t2 where b = 1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
INSERT INTO t2 VALUES ('tmp',3,3);
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
DELETE FROM t2 where b = 3;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
DELETE FROM t2 where id = 3;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
UPDATE t2 set b = b +1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
OPTIMIZE TABLE t2;
Table Op Msg_type Msg_text
-test.t2 optimize Warning Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t2 optimize Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t2 optimize error Corrupt
+test.t2 optimize Error Table 'test.t2' doesn't exist in engine
+test.t2 optimize status Operation failed
SHOW WARNINGS;
Level Code Message
ALTER TABLE t2 ADD COLUMN d INT;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
ANALYZE TABLE t2;
Table Op Msg_type Msg_text
-test.t2 analyze Warning Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t2 analyze Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t2 analyze error Corrupt
+test.t2 analyze Error Table 'test.t2' doesn't exist in engine
+test.t2 analyze status Operation failed
SHOW WARNINGS;
Level Code Message
TRUNCATE TABLE t2;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t2' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t2 in file ./test/t2.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t2' doesn't exist in engine
DROP TABLE t2;
# Start server with keys2.txt
diff --git a/mysql-test/suite/encryption/r/innodb-bad-key-change2.result b/mysql-test/suite/encryption/r/innodb-bad-key-change2.result
index b1f91c0d095..087f76eda2d 100644
--- a/mysql-test/suite/encryption/r/innodb-bad-key-change2.result
+++ b/mysql-test/suite/encryption/r/innodb-bad-key-change2.result
@@ -1,3 +1,4 @@
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted|does not exist.*is trying to rename)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t1(new)?\\.ibd' cannot be decrypted\\.");
call mtr.add_suppression("Couldn't load plugins from 'file_key_management");
call mtr.add_suppression("InnoDB: Tablespace for table \`test\`.\`t1\` is set as discarded\\.");
@@ -6,39 +7,37 @@ CREATE TABLE t1 (pk INT PRIMARY KEY, f VARCHAR(8)) ENGINE=InnoDB
ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
INSERT INTO t1 VALUES (1,'foo'),(2,'bar');
SELECT * FROM t1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
Warning 192 Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Warning 192 Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t1' doesn't exist in engine
ALTER TABLE t1 ENGINE=InnoDB;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SHOW WARNINGS;
Level Code Message
-Warning 192 Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Error 1296 Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+Error 1932 Table 'test.t1' doesn't exist in engine
OPTIMIZE TABLE t1;
Table Op Msg_type Msg_text
-test.t1 optimize Warning Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t1 optimize Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t1 optimize error Corrupt
+test.t1 optimize Error Table 'test.t1' doesn't exist in engine
+test.t1 optimize status Operation failed
SHOW WARNINGS;
Level Code Message
CHECK TABLE t1;
Table Op Msg_type Msg_text
-test.t1 check Warning Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t1 check Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t1 check error Corrupt
+test.t1 check Error Table 'test.t1' doesn't exist in engine
+test.t1 check status Operation failed
SHOW WARNINGS;
Level Code Message
FLUSH TABLES t1 FOR EXPORT;
backup: t1
UNLOCK TABLES;
ALTER TABLE t1 DISCARD TABLESPACE;
-Warnings:
-Warning 192 Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-Warning 1812 Tablespace is missing for table 'test/t1'
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
+DROP TABLE t1;
+CREATE TABLE t1 (pk INT PRIMARY KEY, f VARCHAR(8)) ENGINE=InnoDB
+ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
+ALTER TABLE t1 DISCARD TABLESPACE;
restore: t1 .ibd and .cfg files
ALTER TABLE t1 IMPORT TABLESPACE;
Warnings:
@@ -51,6 +50,7 @@ t1 CREATE TABLE `t1` (
PRIMARY KEY (`pk`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 `ENCRYPTED`=YES `ENCRYPTION_KEY_ID`=4
RENAME TABLE t1 TO t1new;
-ALTER TABLE t1new RENAME TO t2new;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-DROP TABLE t1new;
+ERROR HY000: Error on rename of './test/t1' to './test/t1new' (errno: 155 "The table does not exist in the storage engine")
+ALTER TABLE t1 RENAME TO t1new;
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
+DROP TABLE t1;
diff --git a/mysql-test/suite/encryption/r/innodb-bad-key-change4.result b/mysql-test/suite/encryption/r/innodb-bad-key-change4.result
index 7f4b1fbc151..514a0aec051 100644
--- a/mysql-test/suite/encryption/r/innodb-bad-key-change4.result
+++ b/mysql-test/suite/encryption/r/innodb-bad-key-change4.result
@@ -1,3 +1,4 @@
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t1\\.ibd' cannot be decrypted\\.");
call mtr.add_suppression("Couldn't load plugins from 'file_key_management");
SET GLOBAL innodb_file_per_table = ON;
@@ -7,16 +8,14 @@ INSERT INTO t1 VALUES (1,'foo'),(2,'bar');
OPTIMIZE TABLE t1;
Table Op Msg_type Msg_text
test.t1 optimize Warning Table test/t1 in tablespace is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t1 optimize Warning Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t1 optimize Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t1 optimize error Corrupt
+test.t1 optimize Error Table 'test.t1' doesn't exist in engine
+test.t1 optimize status Operation failed
SHOW WARNINGS;
Level Code Message
CHECK TABLE t1;
Table Op Msg_type Msg_text
-test.t1 check Warning Table t1 in file ./test/t1.ibd is encrypted but encryption service or used key_id is not available. Can't continue reading table.
-test.t1 check Error Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
-test.t1 check error Corrupt
+test.t1 check Error Table 'test.t1' doesn't exist in engine
+test.t1 check status Operation failed
SHOW WARNINGS;
Level Code Message
DROP TABLE t1;
diff --git a/mysql-test/suite/encryption/r/innodb-encryption-disable.result b/mysql-test/suite/encryption/r/innodb-encryption-disable.result
index 90668a3a395..74570c92ae0 100644
--- a/mysql-test/suite/encryption/r/innodb-encryption-disable.result
+++ b/mysql-test/suite/encryption/r/innodb-encryption-disable.result
@@ -1,3 +1,4 @@
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[15]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[15]\\.ibd' cannot be decrypted\\.");
call mtr.add_suppression("Couldn't load plugins from 'file_key_management");
create table t5 (
@@ -18,8 +19,8 @@ CREATE TABLE `t1` (
insert into t1 values (1,2,'maria','db','encryption');
alter table t1 encrypted='yes' `encryption_key_id`=1;
select * from t1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
select * from t5;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t5' doesn't exist in engine
drop table t1;
drop table t5;
diff --git a/mysql-test/suite/encryption/r/innodb-force-corrupt.result b/mysql-test/suite/encryption/r/innodb-force-corrupt.result
index 67917ca5f82..d27136bf430 100644
--- a/mysql-test/suite/encryption/r/innodb-force-corrupt.result
+++ b/mysql-test/suite/encryption/r/innodb-force-corrupt.result
@@ -1,3 +1,4 @@
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[13]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[123]\\.ibd' cannot be decrypted\\.");
SET GLOBAL innodb_file_per_table = ON;
set global innodb_compression_algorithm = 1;
@@ -14,10 +15,10 @@ COMMIT;
# Backup tables before corrupting
# Corrupt tables
SELECT * FROM t1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SELECT * FROM t2;
ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
SELECT * FROM t3;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t3' doesn't exist in engine
# Restore the original tables
DROP TABLE t1,t2,t3;
diff --git a/mysql-test/suite/encryption/r/innodb-missing-key.result b/mysql-test/suite/encryption/r/innodb-missing-key.result
index 3eb48409f13..2c5401ff681 100644
--- a/mysql-test/suite/encryption/r/innodb-missing-key.result
+++ b/mysql-test/suite/encryption/r/innodb-missing-key.result
@@ -1,3 +1,4 @@
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[123]\\.ibd' cannot be decrypted\\.");
# Start server with keys2.txt
CREATE TABLE t1(a int not null primary key auto_increment, b varchar(128)) engine=innodb ENCRYPTED=YES ENCRYPTION_KEY_ID=19;
@@ -32,11 +33,11 @@ SELECT COUNT(1) FROM t2;
COUNT(1)
2048
SELECT COUNT(1) FROM t2,t1 where t2.a = t1.a;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SELECT COUNT(1) FROM t1 where b = 'ab';
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
SELECT COUNT(1) FROM t1;
-ERROR HY000: Got error 192 'Table encrypted but decryption failed. This could be because correct encryption management plugin is not loaded, used encryption key is not available or encryption method does not match.' from InnoDB
+ERROR 42S02: Table 'test.t1' doesn't exist in engine
# Start server with keys2.txt
SELECT COUNT(1) FROM t1;
diff --git a/mysql-test/suite/encryption/t/innodb-bad-key-change.test b/mysql-test/suite/encryption/t/innodb-bad-key-change.test
index 04c50e6f327..8a431cd93ca 100644
--- a/mysql-test/suite/encryption/t/innodb-bad-key-change.test
+++ b/mysql-test/suite/encryption/t/innodb-bad-key-change.test
@@ -10,6 +10,7 @@
call mtr.add_suppression("Plugin 'file_key_management' init function returned error");
call mtr.add_suppression("Plugin 'file_key_management' registration.*failed");
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[12]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[12]\\.ibd' cannot be decrypted\\.");
call mtr.add_suppression("File '.*mysql-test.std_data.keysbad3\\.txt' not found");
@@ -36,7 +37,7 @@ SELECT * FROM t1;
-- let $restart_parameters=--file-key-management-filename=$MYSQL_TEST_DIR/std_data/keysbad3.txt
-- source include/restart_mysqld.inc
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t1;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
@@ -66,45 +67,45 @@ INSERT INTO t2 VALUES ('foobar',1,2);
-- let $restart_parameters=--file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys2.txt
-- source include/restart_mysqld.inc
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t2;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t2 where id = 1;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t2 where b = 1;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
INSERT INTO t2 VALUES ('tmp',3,3);
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
DELETE FROM t2 where b = 3;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
DELETE FROM t2 where id = 3;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
UPDATE t2 set b = b +1;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
OPTIMIZE TABLE t2;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
ALTER TABLE t2 ADD COLUMN d INT;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
ANALYZE TABLE t2;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
TRUNCATE TABLE t2;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
diff --git a/mysql-test/suite/encryption/t/innodb-bad-key-change2.test b/mysql-test/suite/encryption/t/innodb-bad-key-change2.test
index 3c9e10efc90..8c1a8277a30 100644
--- a/mysql-test/suite/encryption/t/innodb-bad-key-change2.test
+++ b/mysql-test/suite/encryption/t/innodb-bad-key-change2.test
@@ -8,6 +8,7 @@
# MDEV-8768: Server crash at file btr0btr.ic line 122 when checking encrypted table using incorrect keys
# MDEV-8727: Server/InnoDB hangs on shutdown after trying to read an encrypted table with a wrong key
#
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted|does not exist.*is trying to rename)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t1(new)?\\.ibd' cannot be decrypted\\.");
# Suppression for builds where file_key_management plugin is linked statically
call mtr.add_suppression("Couldn't load plugins from 'file_key_management");
@@ -25,11 +26,11 @@ INSERT INTO t1 VALUES (1,'foo'),(2,'bar');
--let $restart_parameters=--plugin-load-add=file_key_management.so --file-key-management --file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys3.txt
--source include/restart_mysqld.inc
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t1;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
ALTER TABLE t1 ENGINE=InnoDB;
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
SHOW WARNINGS;
@@ -56,8 +57,13 @@ UNLOCK TABLES;
--let $restart_parameters=--plugin-load-add=file_key_management.so --file-key-management --file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys3.txt
--source include/restart_mysqld.inc
-# Discard should pass even with incorrect keys
---replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
+--error ER_NO_SUCH_TABLE_IN_ENGINE
+ALTER TABLE t1 DISCARD TABLESPACE;
+# Drop table will succeed.
+DROP TABLE t1;
+
+CREATE TABLE t1 (pk INT PRIMARY KEY, f VARCHAR(8)) ENGINE=InnoDB
+ENCRYPTED=YES ENCRYPTION_KEY_ID=4;
ALTER TABLE t1 DISCARD TABLESPACE;
perl;
@@ -66,7 +72,6 @@ ib_discard_tablespaces("test", "t1");
ib_restore_tablespaces("test", "t1");
EOF
-
--let $restart_parameters=--plugin-load-add=file_key_management.so --file-key-management --file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys2.txt
--source include/restart_mysqld.inc
@@ -76,13 +81,10 @@ SHOW CREATE TABLE t1;
--let $restart_parameters= --innodb-encrypt-tables --plugin-load-add=file_key_management.so --file-key-management --file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys3.txt
--source include/restart_mysqld.inc
-# Rename table should pass even with incorrect keys
+--error ER_ERROR_ON_RENAME
RENAME TABLE t1 TO t1new;
---replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
-
-# Alter table rename is not allowed with incorrect keys
---error ER_GET_ERRMSG
-ALTER TABLE t1new RENAME TO t2new;
+--error ER_NO_SUCH_TABLE_IN_ENGINE
+ALTER TABLE t1 RENAME TO t1new;
# Drop should pass even with incorrect keys
--replace_regex /(tablespace|key_id) [1-9][0-9]*/\1 /
-DROP TABLE t1new;
+DROP TABLE t1;
diff --git a/mysql-test/suite/encryption/t/innodb-bad-key-change4.test b/mysql-test/suite/encryption/t/innodb-bad-key-change4.test
index a2305aa968b..30d417cfe93 100644
--- a/mysql-test/suite/encryption/t/innodb-bad-key-change4.test
+++ b/mysql-test/suite/encryption/t/innodb-bad-key-change4.test
@@ -7,6 +7,7 @@
# MDEV-8768: Server crash at file btr0btr.ic line 122 when checking encrypted table using incorrect keys
#
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t1\\.ibd' cannot be decrypted\\.");
# Suppression for builds where file_key_management plugin is linked statically
call mtr.add_suppression("Couldn't load plugins from 'file_key_management");
diff --git a/mysql-test/suite/encryption/t/innodb-encryption-disable.test b/mysql-test/suite/encryption/t/innodb-encryption-disable.test
index 8c72cf6a3b2..0514ce70fb6 100644
--- a/mysql-test/suite/encryption/t/innodb-encryption-disable.test
+++ b/mysql-test/suite/encryption/t/innodb-encryption-disable.test
@@ -7,6 +7,7 @@
# MDEV-9559: Server without encryption configs crashes if selecting from an implicitly encrypted table
#
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[15]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[15]\\.ibd' cannot be decrypted\\.");
# Suppression for builds where file_key_management plugin is linked statically
@@ -39,9 +40,9 @@ alter table t1 encrypted='yes' `encryption_key_id`=1;
--let $restart_parameters=--innodb-encrypt-tables=OFF
--source include/restart_mysqld.inc
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
select * from t1;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
select * from t5;
--let $restart_parameters=--innodb-encrypt-tables=ON --plugin-load-add=file_key_management.so --file-key-management --file-key-management-filename=$MYSQL_TEST_DIR/std_data/keys2.txt
diff --git a/mysql-test/suite/encryption/t/innodb-force-corrupt.test b/mysql-test/suite/encryption/t/innodb-force-corrupt.test
index 4d3bfc2d1e9..c23959801ca 100644
--- a/mysql-test/suite/encryption/t/innodb-force-corrupt.test
+++ b/mysql-test/suite/encryption/t/innodb-force-corrupt.test
@@ -7,6 +7,7 @@
# Don't test under embedded
-- source include/not_embedded.inc
+call mtr.add_suppression("InnoDB: Table `test`\\.`t[13]` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[123]\\.ibd' cannot be decrypted\\.");
SET GLOBAL innodb_file_per_table = ON;
@@ -65,11 +66,11 @@ EOF
--source include/start_mysqld.inc
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t1;
--error ER_GET_ERRMSG
SELECT * FROM t2;
---error ER_GET_ERRMSG
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT * FROM t3;
--source include/shutdown_mysqld.inc
diff --git a/mysql-test/suite/encryption/t/innodb-missing-key.test b/mysql-test/suite/encryption/t/innodb-missing-key.test
index 8091d23cf1c..2a56581601a 100644
--- a/mysql-test/suite/encryption/t/innodb-missing-key.test
+++ b/mysql-test/suite/encryption/t/innodb-missing-key.test
@@ -7,6 +7,7 @@
# MDEV-11004: Unable to start (Segfault or os error 2) when encryption key missing
#
+call mtr.add_suppression("InnoDB: Table `test`\\.`t1` (has an unreadable root page|is corrupted)");
call mtr.add_suppression("InnoDB: The page \\[page id: space=[1-9][0-9]*, page number=[1-9][0-9]*\\] in file '.*test.t[123]\\.ibd' cannot be decrypted\\.");
--echo # Start server with keys2.txt
@@ -42,11 +43,11 @@ CREATE TABLE t4(a int not null primary key auto_increment, b varchar(128)) engin
SELECT SLEEP(5);
SELECT COUNT(1) FROM t3;
SELECT COUNT(1) FROM t2;
---error 1296
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT COUNT(1) FROM t2,t1 where t2.a = t1.a;
---error 1296
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT COUNT(1) FROM t1 where b = 'ab';
---error 1296
+--error ER_NO_SUCH_TABLE_IN_ENGINE
SELECT COUNT(1) FROM t1;
--echo
diff --git a/mysql-test/suite/innodb/r/alter_rename_existing.result b/mysql-test/suite/innodb/r/alter_rename_existing.result
index 881518595de..8fc54adbd10 100644
--- a/mysql-test/suite/innodb/r/alter_rename_existing.result
+++ b/mysql-test/suite/innodb/r/alter_rename_existing.result
@@ -58,15 +58,15 @@ ALTER TABLE t1 ADD COLUMN d INT, ALGORITHM=COPY;
# while a blocking t1.ibd file exists.
#
SET GLOBAL innodb_file_per_table=ON;
-ALTER TABLE t1 ADD COLUMN e1 INT, ALGORITHM=INPLACE;
+ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
ERROR HY000: Tablespace for table 'test/t1' exists. Please DISCARD the tablespace before IMPORT
-ALTER TABLE t1 ADD COLUMN e2 INT, ALGORITHM=COPY;
+ALTER TABLE t1 FORCE, ALGORITHM=COPY;
ERROR HY000: Error on rename of 'OLD_FILE_NAME' to 'NEW_FILE_NAME' (errno: 184 "Tablespace already exists")
#
# Delete the blocking file called MYSQLD_DATADIR/test/t1.ibd
# Move t1 to file-per-table using ALGORITHM=INPLACE with no blocking t1.ibd.
#
-ALTER TABLE t1 ADD COLUMN e INT, ALGORITHM=INPLACE;
+ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
SHOW CREATE TABLE t1;
Table Create Table
t1 CREATE TABLE `t1` (
@@ -74,7 +74,6 @@ t1 CREATE TABLE `t1` (
`b` char(20) DEFAULT NULL,
`c` int(11) DEFAULT NULL,
`d` int(11) DEFAULT NULL,
- `e` int(11) DEFAULT NULL,
UNIQUE KEY `a` (`a`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=latin1
SELECT name, space=0 FROM information_schema.innodb_sys_tables WHERE name = 'test/t1';
diff --git a/mysql-test/suite/innodb/r/innodb-alter-debug.result b/mysql-test/suite/innodb/r/innodb-alter-debug.result
index d580a641d81..d455e54be3d 100644
--- a/mysql-test/suite/innodb/r/innodb-alter-debug.result
+++ b/mysql-test/suite/innodb/r/innodb-alter-debug.result
@@ -33,7 +33,7 @@ engine = innodb;
insert into t1 select 1, 1;
insert into t1 select 2, 2;
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL s1 WAIT_FOR s2';
-alter table t1 add b int, ALGORITHM=inplace;
+alter table t1 force, add b int, ALGORITHM=inplace;
/* connection con1 */
connect con1,localhost,root,,;
SET DEBUG_SYNC = 'now WAIT_FOR s1';
@@ -42,10 +42,10 @@ ERROR 23000: Duplicate entry '1' for key 'uk'
SET DEBUG_SYNC = 'now SIGNAL s2';
/* connection default */
connection default;
-/* reap */ alter table t1 add b int, ALGORITHM=inplace;
+/* reap */ alter table t1 force, add b int, ALGORITHM=inplace;
ERROR 23000: Duplicate entry '1' for key 'uk'
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL s1 WAIT_FOR s2';
-alter table t1 add b int, ALGORITHM=inplace;;
+alter table t1 force, add b int, ALGORITHM=inplace;;
/* connection con1 */
connection con1;
set DEBUG_SYNC = 'now WAIT_FOR s1';
@@ -55,7 +55,7 @@ SET DEBUG_SYNC = 'now SIGNAL s2';
disconnect con1;
/* connection default */
connection default;
-/* reap */ alter table t1 add b int, ALGORITHM=inplace;
+/* reap */ alter table t1 force, add b int, ALGORITHM=inplace;
ERROR 23000: Duplicate entry '1' for key 'uk'
SET DEBUG_SYNC = 'RESET';
drop table t1;
diff --git a/mysql-test/suite/innodb/r/innodb-index-debug.result b/mysql-test/suite/innodb/r/innodb-index-debug.result
index 172e4ebf454..beab075f3c8 100644
--- a/mysql-test/suite/innodb/r/innodb-index-debug.result
+++ b/mysql-test/suite/innodb/r/innodb-index-debug.result
@@ -83,7 +83,7 @@ create table t1(k1 int auto_increment primary key,
k2 char(200),k3 char(200))engine=innodb;
SET DEBUG_SYNC= 'row_merge_after_scan
SIGNAL opened WAIT_FOR flushed';
-ALTER TABLE t1 ADD COLUMN k4 int;
+ALTER TABLE t1 FORCE, ADD COLUMN k4 int;
connection default;
SET DEBUG_SYNC= 'now WAIT_FOR opened';
SET debug = '+d,row_log_tmpfile_fail';
diff --git a/mysql-test/suite/innodb/r/instant_alter,32k.rdiff b/mysql-test/suite/innodb/r/instant_alter,32k.rdiff
new file mode 100644
index 00000000000..c61439e1c3a
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter,32k.rdiff
@@ -0,0 +1,83 @@
+--- instant_alter.result
++++ instant_alter,32k.result
+@@ -317,7 +317,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -340,7 +340,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++5
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -353,7 +353,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -668,7 +668,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -691,7 +691,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++5
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -704,7 +704,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -1019,7 +1019,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -1042,7 +1042,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++5
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -1055,7 +1055,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ disconnect analyze;
diff --git a/mysql-test/suite/innodb/r/instant_alter,4k.rdiff b/mysql-test/suite/innodb/r/instant_alter,4k.rdiff
new file mode 100644
index 00000000000..098d9fa3b5d
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter,4k.rdiff
@@ -0,0 +1,191 @@
+--- instant_alter.result
++++ instant_alter,4k.result
+@@ -181,7 +181,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++6
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -191,7 +191,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -202,7 +202,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -212,7 +212,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -317,7 +317,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++8
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -340,7 +340,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++28
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -353,7 +353,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++8
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -532,7 +532,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++6
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -542,7 +542,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -553,7 +553,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -563,7 +563,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -668,7 +668,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++7
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -691,7 +691,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++23
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -704,7 +704,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++7
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -883,7 +883,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++6
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -893,7 +893,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -904,7 +904,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -914,7 +914,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++4
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -1019,7 +1019,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++7
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -1042,7 +1042,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++23
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -1055,7 +1055,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++7
+ connection default;
+ DROP TABLE big;
+ disconnect analyze;
diff --git a/mysql-test/suite/innodb/r/instant_alter,64k.rdiff b/mysql-test/suite/innodb/r/instant_alter,64k.rdiff
new file mode 100644
index 00000000000..b5a31848ccd
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter,64k.rdiff
@@ -0,0 +1,83 @@
+--- instant_alter.result
++++ instant_alter,64k.result
+@@ -317,7 +317,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -340,7 +340,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++3
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -353,7 +353,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -668,7 +668,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -691,7 +691,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++3
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -704,7 +704,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -1019,7 +1019,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -1042,7 +1042,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++3
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -1055,7 +1055,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++1
+ connection default;
+ DROP TABLE big;
+ disconnect analyze;
diff --git a/mysql-test/suite/innodb/r/instant_alter,8k.rdiff b/mysql-test/suite/innodb/r/instant_alter,8k.rdiff
new file mode 100644
index 00000000000..a313765df3a
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter,8k.rdiff
@@ -0,0 +1,191 @@
+--- instant_alter.result
++++ instant_alter,8k.result
+@@ -181,7 +181,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++5
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -191,7 +191,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -202,7 +202,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -212,7 +212,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -317,7 +317,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -340,7 +340,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++13
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -353,7 +353,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -532,7 +532,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++5
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -542,7 +542,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -553,7 +553,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -563,7 +563,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -668,7 +668,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -691,7 +691,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++13
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -704,7 +704,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ DROP TABLE big;
+ CREATE TABLE t1
+@@ -883,7 +883,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++5
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -893,7 +893,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ BEGIN;
+ UPDATE t2 SET d1 = repeat(id, 200);
+@@ -904,7 +904,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ROLLBACK;
+ connection analyze;
+@@ -914,7 +914,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/t2';
+ clust_index_size
+-1
++3
+ connection default;
+ ALTER TABLE t2 DROP p;
+ affected rows: 0
+@@ -1019,7 +1019,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ ALTER TABLE big ADD COLUMN
+ (d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+@@ -1042,7 +1042,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-7
++13
+ connection default;
+ ROLLBACK;
+ CHECKSUM TABLE big;
+@@ -1055,7 +1055,7 @@
+ SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+ WHERE name = 'test/big';
+ clust_index_size
+-3
++5
+ connection default;
+ DROP TABLE big;
+ disconnect analyze;
diff --git a/mysql-test/suite/innodb/r/instant_alter.result b/mysql-test/suite/innodb/r/instant_alter.result
new file mode 100644
index 00000000000..c06e58b4c77
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter.result
Binary files differ
diff --git a/mysql-test/suite/innodb/r/instant_alter_crash.result b/mysql-test/suite/innodb/r/instant_alter_crash.result
new file mode 100644
index 00000000000..1c967e538aa
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter_crash.result
@@ -0,0 +1,103 @@
+#
+# MDEV-11369: Instant ADD COLUMN for InnoDB
+#
+CREATE TABLE t1(id INT PRIMARY KEY, c2 INT UNIQUE)
+ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
+CREATE TABLE t2 LIKE t1;
+INSERT INTO t1 VALUES(1,2);
+BEGIN;
+INSERT INTO t2 VALUES(2,1);
+ALTER TABLE t2 ADD COLUMN (c3 TEXT NOT NULL DEFAULT 'De finibus bonorum');
+connect ddl, localhost, root;
+SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
+ALTER TABLE t1 ADD COLUMN (c3 TEXT NOT NULL DEFAULT ' et malorum');
+connection default;
+SET DEBUG_SYNC='now WAIT_FOR ddl';
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+COMMIT;
+# Kill the server
+disconnect ddl;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+SELECT * FROM t1;
+id c2
+1 2
+SELECT * FROM t2;
+id c2 c3
+2 1 De finibus bonorum
+BEGIN;
+DELETE FROM t1;
+ROLLBACK;
+InnoDB 0 transactions not purged
+INSERT INTO t2 VALUES (64,42,'De finibus bonorum'), (347,33101,' et malorum');
+connect ddl, localhost, root;
+SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
+ALTER TABLE t2 ADD COLUMN (c4 TEXT NOT NULL DEFAULT ' et malorum');
+connection default;
+SET DEBUG_SYNC='now WAIT_FOR ddl';
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+DELETE FROM t1;
+# Kill the server
+disconnect ddl;
+SET @saved_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+SELECT * FROM t1;
+id c2
+SELECT * FROM t2;
+id c2 c3
+2 1 De finibus bonorum
+64 42 De finibus bonorum
+347 33101 et malorum
+BEGIN;
+INSERT INTO t1 SET id=1;
+DELETE FROM t2;
+ROLLBACK;
+InnoDB 0 transactions not purged
+FLUSH TABLE t1,t2 FOR EXPORT;
+t1 clustered index root page(type 17855):
+N_RECS=0; LEVEL=0
+header=0x010000030074 (id=0x696e66696d756d00)
+header=0x010008030000 (id=0x73757072656d756d00)
+t2 clustered index root page(type 18):
+N_RECS=4; LEVEL=0
+header=0x010000030088 (id=0x696e66696d756d00)
+header=0x1000100b00b9 (id=0x80000000,
+ DB_TRX_ID=0x000000000000,
+ DB_ROLL_PTR=0x80000000000000,
+ c2=NULL(4 bytes),
+ c3=0x44652066696e6962757320626f6e6f72756d)
+header=0x0000180900d8 (id=0x80000002,
+ DB_TRX_ID=0x000000000000,
+ DB_ROLL_PTR=0x80000000000000,
+ c2=0x80000001)
+header=0x0000200900f8 (id=0x80000040,
+ DB_TRX_ID=0x000000000000,
+ DB_ROLL_PTR=0x80000000000000,
+ c2=0x8000002a)
+header=0x0000280b0074 (id=0x8000015b,
+ DB_TRX_ID=0x000000000000,
+ DB_ROLL_PTR=0x80000000000000,
+ c2=0x8000814d,
+ c3=0x206574206d616c6f72756d)
+header=0x050008030000 (id=0x73757072656d756d00)
+UNLOCK TABLES;
+DELETE FROM t2;
+InnoDB 0 transactions not purged
+SHOW CREATE TABLE t1;
+Table Create Table
+t1 CREATE TABLE `t1` (
+ `id` int(11) NOT NULL,
+ `c2` int(11) DEFAULT NULL,
+ PRIMARY KEY (`id`),
+ UNIQUE KEY `c2` (`c2`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
+SHOW CREATE TABLE t2;
+Table Create Table
+t2 CREATE TABLE `t2` (
+ `id` int(11) NOT NULL,
+ `c2` int(11) DEFAULT NULL,
+ `c3` text NOT NULL DEFAULT 'De finibus bonorum',
+ PRIMARY KEY (`id`),
+ UNIQUE KEY `c2` (`c2`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
+DROP TABLE t1,t2;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
diff --git a/mysql-test/suite/innodb/r/instant_alter_debug.result b/mysql-test/suite/innodb/r/instant_alter_debug.result
new file mode 100644
index 00000000000..d3d75ff05d4
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter_debug.result
@@ -0,0 +1,167 @@
+SET @save_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+CREATE TABLE t1 (
+pk INT AUTO_INCREMENT PRIMARY KEY,
+c1 INT,
+c2 VARCHAR(255),
+c3 VARCHAR(255),
+c4 INT,
+c5 INT,
+c6 INT,
+c7 VARCHAR(255),
+c8 TIMESTAMP NULL
+) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (NULL,1,NULL,'foo',NULL,1,NULL,NULL,'2011-11-11 00:00:00');
+ALTER TABLE t1 ADD COLUMN f INT;
+REPLACE INTO t1 (c7) VALUES ('bar');
+CREATE TABLE t2 (i INT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t2 VALUES (-1),(1);
+ALTER TABLE t2 ADD COLUMN j INT;
+BEGIN;
+DELETE FROM t2;
+ROLLBACK;
+TRUNCATE TABLE t2;
+INSERT INTO t2 VALUES (1,2);
+CREATE TABLE t3 (pk INT AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t3 () VALUES ();
+ALTER TABLE t3 ADD COLUMN f INT;
+UPDATE t3 SET pk = DEFAULT;
+SELECT * FROM t3;
+pk f
+0 NULL
+CREATE TABLE t4 (pk INT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t4 VALUES (0);
+ALTER TABLE t4 ADD COLUMN b INT;
+SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS
+LEFT JOIN t4 ON (NUMERIC_SCALE = pk);
+COUNT(*)
+1733
+SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL enter WAIT_FOR delete';
+ALTER TABLE t4 ADD COLUMN c INT;
+connect dml,localhost,root,,;
+SET DEBUG_SYNC='now WAIT_FOR enter';
+DELETE FROM t4;
+InnoDB 0 transactions not purged
+SET DEBUG_SYNC='now SIGNAL delete';
+connection default;
+CREATE TABLE t5 (i INT, KEY(i)) ENGINE=InnoDB;
+INSERT INTO t5 VALUES (-42);
+ALTER TABLE t5 ADD UNIQUE ui(i);
+ALTER TABLE t5 ADD COLUMN i2 INT, DROP INDEX i;
+CREATE TABLE t6 (i INT NOT NULL) ENGINE=InnoDB;
+INSERT INTO t6 VALUES (0);
+ALTER TABLE t6 ADD COLUMN j INT;
+TRUNCATE TABLE t6;
+INSERT INTO t6 VALUES (1,2);
+CREATE TABLE t7 (i INT) ENGINE=InnoDB;
+INSERT INTO t7 VALUES (1),(2),(3),(4),(5);
+ALTER TABLE t7 ADD t TEXT DEFAULT '';
+CREATE TABLE t8 (i INT) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
+INSERT INTO t8 VALUES (NULL);
+ALTER TABLE t8 ADD c CHAR(3);
+SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL rebuilt WAIT_FOR dml';
+ALTER TABLE t8 FORCE;
+connection dml;
+SET DEBUG_SYNC='now WAIT_FOR rebuilt';
+BEGIN;
+INSERT INTO t8 SET i=1;
+UPDATE t8 SET i=ISNULL(i);
+ROLLBACK;
+SET DEBUG_SYNC='now SIGNAL dml';
+connection default;
+SET DEBUG_SYNC='RESET';
+CREATE TABLE t9 (
+pk INT AUTO_INCREMENT PRIMARY KEY,
+c1 BIGINT UNSIGNED,
+c2 TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
+c3 BIGINT,
+c4 VARCHAR(257) CHARACTER SET utf8,
+c5 TINYINT UNSIGNED,
+c6 TINYINT,
+c7 VARCHAR(257) CHARACTER SET latin1,
+c8 VARCHAR(257) CHARACTER SET binary
+) ENGINE=InnoDB;
+INSERT INTO t9 () VALUES ();
+ALTER TABLE t9 ADD COLUMN IF NOT EXISTS t TIMESTAMP NULL KEY;
+Warnings:
+Note 1061 Multiple primary key defined
+SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL rebuilt WAIT_FOR dml';
+OPTIMIZE TABLE t9;
+connection dml;
+SET DEBUG_SYNC='now WAIT_FOR rebuilt';
+BEGIN;
+INSERT INTO t9 () VALUES (),();
+UPDATE t9 SET t=current_timestamp();
+ROLLBACK;
+SET DEBUG_SYNC='now SIGNAL dml';
+disconnect dml;
+connection default;
+Table Op Msg_type Msg_text
+test.t9 optimize note Table does not support optimize, doing recreate + analyze instead
+test.t9 optimize status OK
+SET DEBUG_SYNC='RESET';
+CREATE TABLE t10 (pk INT DEFAULT 0 KEY) ENGINE=InnoDB;
+INSERT INTO t10 (pk) VALUES (1);
+ALTER TABLE t10 ADD c INT;
+TRUNCATE TABLE t10;
+INSERT INTO t10 VALUES (1,1),(2,2);
+ALTER TABLE t10 FORCE;
+CREATE TABLE t11 (
+c01 enum('a','b'),
+c02 bit,
+c03 blob,
+c04 enum('c','d'),
+c05 blob,
+c06 decimal,
+c07 char(1),
+c08 int,
+c09 char(1),
+c10 set('e','f'),
+c11 char(1),
+c12 float,
+c13 bit,
+c14 char(1),
+c15 int,
+c16 float,
+c17 decimal,
+c18 char(1) CHARACTER SET utf8 not null default '',
+c19 float,
+c20 set('g','h'),
+c21 char(1),
+c22 int,
+c23 int,
+c24 int,
+c25 set('i','j'),
+c26 decimal,
+c27 float,
+c28 char(1),
+c29 int,
+c30 enum('k','l'),
+c31 decimal,
+c32 char(1),
+c33 decimal,
+c34 bit,
+c35 enum('m','n'),
+c36 set('o','p'),
+c37 enum('q','r'),
+c38 blob,
+c39 decimal,
+c40 blob not null default '',
+c41 char(1),
+c42 int,
+c43 float,
+c44 float,
+c45 enum('s','t'),
+c46 decimal,
+c47 set('u','v'),
+c48 enum('w','x'),
+c49 set('y','z'),
+c50 float
+) ENGINE=InnoDB;
+INSERT INTO t11 () VALUES ();
+ALTER TABLE t11 ADD COLUMN f INT;
+INSERT INTO t11 () VALUES ();
+UPDATE t11 SET c22 = 1;
+InnoDB 0 transactions not purged
+DROP TABLE t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+SET GLOBAL innodb_purge_rseg_truncate_frequency = @save_frequency;
diff --git a/mysql-test/suite/innodb/r/instant_alter_inject.result b/mysql-test/suite/innodb/r/instant_alter_inject.result
new file mode 100644
index 00000000000..fe175f5bed0
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter_inject.result
@@ -0,0 +1,66 @@
+CREATE TABLE t1(a INT PRIMARY KEY, b INT, KEY(b)) ENGINE=InnoDB
+ROW_FORMAT=REDUNDANT PARTITION BY KEY() PARTITIONS 3;
+INSERT INTO t1 (a) VALUES (1),(2),(3),(4),(5);
+SET @saved_dbug= @@SESSION.debug_dbug;
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_2';
+ALTER TABLE t1 ADD COLUMN c CHAR(3) DEFAULT 'lie';
+ERROR HY000: Internal error: Injected error!
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t1;
+Table Op Msg_type Msg_text
+test.t1 check status OK
+BEGIN;
+UPDATE t1 SET b=a+1;
+INSERT INTO t1 VALUES (0,1);
+ROLLBACK;
+SELECT * FROM t1;
+a b
+1 NULL
+2 NULL
+3 NULL
+4 NULL
+5 NULL
+ALTER TABLE t1 ADD COLUMN c CHAR(3) DEFAULT 'lie';
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_1';
+ALTER TABLE t1 ADD COLUMN d INT NOT NULL DEFAULT -42;
+ERROR HY000: Internal error: Injected error!
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t1;
+Table Op Msg_type Msg_text
+test.t1 check status OK
+BEGIN;
+DELETE FROM t1;
+INSERT INTO t1 VALUES (1,2,'foo');
+ROLLBACK;
+SHOW CREATE TABLE t1;
+Table Create Table
+t1 CREATE TABLE `t1` (
+ `a` int(11) NOT NULL,
+ `b` int(11) DEFAULT NULL,
+ `c` char(3) DEFAULT 'lie',
+ PRIMARY KEY (`a`),
+ KEY `b` (`b`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=REDUNDANT
+ PARTITION BY KEY ()
+PARTITIONS 3
+DROP TABLE t1;
+CREATE TABLE t2(a INT, KEY(a)) ENGINE=InnoDB;
+INSERT INTO t2 VALUES (1);
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_1';
+ALTER TABLE t2 ADD COLUMN b INT;
+ERROR HY000: Internal error: Injected error!
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t2;
+Table Op Msg_type Msg_text
+test.t2 check status OK
+BEGIN;
+DELETE FROM t2;
+INSERT INTO t2 VALUES (1);
+ROLLBACK;
+SHOW CREATE TABLE t2;
+Table Create Table
+t2 CREATE TABLE `t2` (
+ `a` int(11) DEFAULT NULL,
+ KEY `a` (`a`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1
+DROP TABLE t2;
diff --git a/mysql-test/suite/innodb/r/instant_alter_rollback.result b/mysql-test/suite/innodb/r/instant_alter_rollback.result
new file mode 100644
index 00000000000..786bfa81336
--- /dev/null
+++ b/mysql-test/suite/innodb/r/instant_alter_rollback.result
@@ -0,0 +1,90 @@
+#
+# MDEV-11369: Instant ADD COLUMN for InnoDB
+#
+connect to_be_killed, localhost, root;
+CREATE TABLE empty_REDUNDANT
+(id INT PRIMARY KEY, c2 INT UNIQUE) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
+CREATE TABLE once_REDUNDANT LIKE empty_REDUNDANT;
+CREATE TABLE twice_REDUNDANT LIKE empty_REDUNDANT;
+INSERT INTO once_REDUNDANT SET id=1,c2=1;
+INSERT INTO twice_REDUNDANT SET id=1,c2=1;
+ALTER TABLE empty_REDUNDANT ADD COLUMN (d1 INT DEFAULT 15);
+ALTER TABLE once_REDUNDANT ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_REDUNDANT ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_REDUNDANT ADD COLUMN
+(d2 INT NOT NULL DEFAULT 10,
+d3 VARCHAR(15) NOT NULL DEFAULT 'var och en char');
+CREATE TABLE empty_COMPACT
+(id INT PRIMARY KEY, c2 INT UNIQUE) ENGINE=InnoDB ROW_FORMAT=COMPACT;
+CREATE TABLE once_COMPACT LIKE empty_COMPACT;
+CREATE TABLE twice_COMPACT LIKE empty_COMPACT;
+INSERT INTO once_COMPACT SET id=1,c2=1;
+INSERT INTO twice_COMPACT SET id=1,c2=1;
+ALTER TABLE empty_COMPACT ADD COLUMN (d1 INT DEFAULT 15);
+ALTER TABLE once_COMPACT ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_COMPACT ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_COMPACT ADD COLUMN
+(d2 INT NOT NULL DEFAULT 10,
+d3 VARCHAR(15) NOT NULL DEFAULT 'var och en char');
+CREATE TABLE empty_DYNAMIC
+(id INT PRIMARY KEY, c2 INT UNIQUE) ENGINE=InnoDB ROW_FORMAT=DYNAMIC;
+CREATE TABLE once_DYNAMIC LIKE empty_DYNAMIC;
+CREATE TABLE twice_DYNAMIC LIKE empty_DYNAMIC;
+INSERT INTO once_DYNAMIC SET id=1,c2=1;
+INSERT INTO twice_DYNAMIC SET id=1,c2=1;
+ALTER TABLE empty_DYNAMIC ADD COLUMN (d1 INT DEFAULT 15);
+ALTER TABLE once_DYNAMIC ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_DYNAMIC ADD COLUMN (d1 INT DEFAULT 20);
+ALTER TABLE twice_DYNAMIC ADD COLUMN
+(d2 INT NOT NULL DEFAULT 10,
+d3 VARCHAR(15) NOT NULL DEFAULT 'var och en char');
+BEGIN;
+INSERT INTO empty_REDUNDANT set id=0,c2=42;
+UPDATE once_REDUNDANT set c2=c2+1;
+UPDATE twice_REDUNDANT set c2=c2+1;
+INSERT INTO twice_REDUNDANT SET id=2,c2=0,d3='';
+INSERT INTO empty_COMPACT set id=0,c2=42;
+UPDATE once_COMPACT set c2=c2+1;
+UPDATE twice_COMPACT set c2=c2+1;
+INSERT INTO twice_COMPACT SET id=2,c2=0,d3='';
+INSERT INTO empty_DYNAMIC set id=0,c2=42;
+UPDATE once_DYNAMIC set c2=c2+1;
+UPDATE twice_DYNAMIC set c2=c2+1;
+INSERT INTO twice_DYNAMIC SET id=2,c2=0,d3='';
+connection default;
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+CREATE TABLE foo(a INT PRIMARY KEY) ENGINE=InnoDB;
+# Kill the server
+disconnect to_be_killed;
+SET @saved_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+DROP TABLE foo;
+InnoDB 0 transactions not purged
+SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
+SELECT * FROM empty_REDUNDANT;
+id c2 d1
+SELECT * FROM once_REDUNDANT;
+id c2 d1
+1 1 20
+SELECT * FROM twice_REDUNDANT;
+id c2 d1 d2 d3
+1 1 20 10 var och en char
+DROP TABLE empty_REDUNDANT, once_REDUNDANT, twice_REDUNDANT;
+SELECT * FROM empty_COMPACT;
+id c2 d1
+SELECT * FROM once_COMPACT;
+id c2 d1
+1 1 20
+SELECT * FROM twice_COMPACT;
+id c2 d1 d2 d3
+1 1 20 10 var och en char
+DROP TABLE empty_COMPACT, once_COMPACT, twice_COMPACT;
+SELECT * FROM empty_DYNAMIC;
+id c2 d1
+SELECT * FROM once_DYNAMIC;
+id c2 d1
+1 1 20
+SELECT * FROM twice_DYNAMIC;
+id c2 d1 d2 d3
+1 1 20 10 var och en char
+DROP TABLE empty_DYNAMIC, once_DYNAMIC, twice_DYNAMIC;
diff --git a/mysql-test/suite/innodb/r/truncate_debug.result b/mysql-test/suite/innodb/r/truncate_debug.result
index c04b83dbbe8..27410b9417a 100644
--- a/mysql-test/suite/innodb/r/truncate_debug.result
+++ b/mysql-test/suite/innodb/r/truncate_debug.result
@@ -7,10 +7,10 @@ SET GLOBAL innodb_adaptive_hash_index=ON;
Test_1 :- Check if DDL operations are possible on
table being truncated. Also check if
DDL operations on other tables succeed.
-create table t1 (f1 int,f2 int,key(f2),f3 int) engine=innodb;
+create table t1 (f1 int,f2 int,key(f2),f3 int) engine=innodb row_format=redundant;
create index idx1 on t1(f3);
-create table t2 (f1 int,f2 int,key(f2),f3 int) engine=innodb;
-create table t3 (f1 int,f2 int,key(f2)) engine=innodb;
+create table t2 (f1 int,f2 int,key(f2),f3 int) engine=innodb row_format=redundant;
+create table t3 (f1 int,f2 int,key(f2)) engine=innodb row_format=redundant;
insert into t1 values (10,20,30),(30,40,50),(50,60,70);
insert into t1 select * from t1;
insert into t1 select * from t1;
diff --git a/mysql-test/suite/innodb/t/alter_rename_existing.test b/mysql-test/suite/innodb/t/alter_rename_existing.test
index 0c8bf481969..3173906841c 100644
--- a/mysql-test/suite/innodb/t/alter_rename_existing.test
+++ b/mysql-test/suite/innodb/t/alter_rename_existing.test
@@ -61,10 +61,10 @@ ALTER TABLE t1 ADD COLUMN d INT, ALGORITHM=COPY;
SET GLOBAL innodb_file_per_table=ON;
--replace_regex /$MYSQLD_DATADIR/MYSQLD_DATADIR/
--error ER_TABLESPACE_EXISTS
-ALTER TABLE t1 ADD COLUMN e1 INT, ALGORITHM=INPLACE;
+ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
--replace_regex /Error on rename of '.*' to '.*'/Error on rename of 'OLD_FILE_NAME' to 'NEW_FILE_NAME'/
--error ER_ERROR_ON_RENAME
-ALTER TABLE t1 ADD COLUMN e2 INT, ALGORITHM=COPY;
+ALTER TABLE t1 FORCE, ALGORITHM=COPY;
--echo #
--echo # Delete the blocking file called MYSQLD_DATADIR/test/t1.ibd
@@ -72,7 +72,7 @@ ALTER TABLE t1 ADD COLUMN e2 INT, ALGORITHM=COPY;
--echo # Move t1 to file-per-table using ALGORITHM=INPLACE with no blocking t1.ibd.
--echo #
-ALTER TABLE t1 ADD COLUMN e INT, ALGORITHM=INPLACE;
+ALTER TABLE t1 FORCE, ALGORITHM=INPLACE;
SHOW CREATE TABLE t1;
SELECT name, space=0 FROM information_schema.innodb_sys_tables WHERE name = 'test/t1';
diff --git a/mysql-test/suite/innodb/t/innodb-alter-debug.test b/mysql-test/suite/innodb/t/innodb-alter-debug.test
index f4996916e9f..a779aecb71f 100644
--- a/mysql-test/suite/innodb/t/innodb-alter-debug.test
+++ b/mysql-test/suite/innodb/t/innodb-alter-debug.test
@@ -40,7 +40,7 @@ engine = innodb;
insert into t1 select 1, 1;
insert into t1 select 2, 2;
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL s1 WAIT_FOR s2';
---send alter table t1 add b int, ALGORITHM=inplace
+--send alter table t1 force, add b int, ALGORITHM=inplace
--echo /* connection con1 */
connect (con1,localhost,root,,);
@@ -51,12 +51,12 @@ SET DEBUG_SYNC = 'now SIGNAL s2';
--echo /* connection default */
connection default;
---echo /* reap */ alter table t1 add b int, ALGORITHM=inplace;
+--echo /* reap */ alter table t1 force, add b int, ALGORITHM=inplace;
--error ER_DUP_ENTRY
--reap
SET DEBUG_SYNC = 'row_log_table_apply1_before SIGNAL s1 WAIT_FOR s2';
---send alter table t1 add b int, ALGORITHM=inplace;
+--send alter table t1 force, add b int, ALGORITHM=inplace;
--echo /* connection con1 */
connection con1;
@@ -68,7 +68,7 @@ disconnect con1;
--echo /* connection default */
connection default;
---echo /* reap */ alter table t1 add b int, ALGORITHM=inplace;
+--echo /* reap */ alter table t1 force, add b int, ALGORITHM=inplace;
--error ER_DUP_ENTRY
--reap
SET DEBUG_SYNC = 'RESET';
diff --git a/mysql-test/suite/innodb/t/innodb-index-debug.test b/mysql-test/suite/innodb/t/innodb-index-debug.test
index 6927120fd5b..de598740e6a 100644
--- a/mysql-test/suite/innodb/t/innodb-index-debug.test
+++ b/mysql-test/suite/innodb/t/innodb-index-debug.test
@@ -96,7 +96,7 @@ create table t1(k1 int auto_increment primary key,
k2 char(200),k3 char(200))engine=innodb;
SET DEBUG_SYNC= 'row_merge_after_scan
SIGNAL opened WAIT_FOR flushed';
-send ALTER TABLE t1 ADD COLUMN k4 int;
+send ALTER TABLE t1 FORCE, ADD COLUMN k4 int;
connection default;
SET DEBUG_SYNC= 'now WAIT_FOR opened';
SET debug = '+d,row_log_tmpfile_fail';
diff --git a/mysql-test/suite/innodb/t/instant_alter.opt b/mysql-test/suite/innodb/t/instant_alter.opt
new file mode 100644
index 00000000000..99bf0e5a28b
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter.opt
@@ -0,0 +1 @@
+--innodb-sys-tablestats
diff --git a/mysql-test/suite/innodb/t/instant_alter.test b/mysql-test/suite/innodb/t/instant_alter.test
new file mode 100644
index 00000000000..3e524057fdc
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter.test
@@ -0,0 +1,261 @@
+--source include/innodb_page_size.inc
+
+--echo #
+--echo # MDEV-11369: Instant ADD COLUMN for InnoDB
+--echo #
+
+# FIXME: Test that instant ADD is not allowed on ROW_FORMAT=COMPRESSED
+# (create a table with SPATIAL INDEX, ROW_FORMAT=COMPACT, and
+# show that ALTER TABLE…ADD COLUMN…LOCK=NONE is refused.
+# This does not work yet for any table, because
+# check_if_supported_inplace_alter()
+# does not check if instant ADD is possible.)
+
+connect analyze, localhost, root;
+connection default;
+SET timestamp = 42;
+SET time_zone='+03:00';
+SET @saved_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+
+SET @old_instant=
+(SELECT variable_value FROM information_schema.global_status
+WHERE variable_name = 'innodb_instant_alter_column');
+
+let $format= 3;
+while ($format) {
+let $engine= `SELECT CONCAT('ENGINE=InnoDB ROW_FORMAT=',CASE $format
+WHEN 1 THEN 'DYNAMIC'
+WHEN 2 THEN 'COMPACT'
+ELSE 'REDUNDANT' END)`;
+
+eval CREATE TABLE t1
+(id INT PRIMARY KEY, c2 INT UNIQUE,
+ c3 POINT NOT NULL DEFAULT ST_GeomFromText('POINT(3 4)'),
+ SPATIAL INDEX(c3)) $engine;
+
+INSERT INTO t1 (id, c2) values(1,1);
+SELECT * FROM t1;
+
+--enable_info
+ALTER TABLE t1 ADD COLUMN (
+ d1 INT, d2 INT UNSIGNED DEFAULT 10, d3 VARCHAR(20) NOT NULL DEFAULT 'abcde',
+ d4 TIMESTAMP NOT NULL DEFAULT current_timestamp());
+--disable_info
+
+SELECT * FROM t1;
+INSERT INTO t1 (id) VALUES(2),(3),(4),(5),(6);
+
+--enable_info
+ALTER TABLE t1 CHANGE d1 d1 INT DEFAULT 5, CHANGE d2 d2 INT DEFAULT 15,
+CHANGE d3 d3 VARCHAR(20) NOT NULL DEFAULT 'fghij',
+CHANGE d4 dfour TIMESTAMP NOT NULL DEFAULT now();
+--disable_info
+
+INSERT INTO t1 SET id = 7;
+SELECT * FROM t1;
+
+# add virtual columns
+--enable_info
+ALTER TABLE t1 ADD COLUMN e1 INT AS (id * 3);
+ALTER TABLE t1 ADD COLUMN e2 VARCHAR(30) AS (d3);
+ALTER TABLE t1 ADD COLUMN e3 INT AS (id * 2);
+
+# instant alter
+ALTER TABLE t1 CHANGE d3 d3 VARCHAR(20) NOT NULL DEFAULT 'foobar',
+ADD COLUMN (d5 CHAR(20) DEFAULT 'hijkl', d6 INT DEFAULT -12345, d7 INT);
+--disable_info
+
+INSERT INTO t1 SET id = 8;
+
+# Updating a column by extending an existing record
+UPDATE t1 SET d3 = 'yyyyy' WHERE id = 1;
+
+# Updating an already materialized column
+UPDATE t1 SET d3 = 'xxxxx' WHERE id = 2;
+
+# transaction rollback
+BEGIN;
+UPDATE t1 SET d3 = 'xxxxx' WHERE id = 3;
+SELECT * FROM t1 WHERE id = 3;
+ROLLBACK;
+SELECT * FROM t1 WHERE id = 3;
+
+# NULL to NULL, no change
+BEGIN;
+UPDATE t1 SET d7 = NULL WHERE ID = 5;
+ROLLBACK;
+BEGIN;
+UPDATE t1 SET d7 = NULL, d6 = 10 WHERE id = 5;
+SELECT * FROM t1 WHERE id = 5;
+ROLLBACK;
+SELECT * FROM t1 WHERE id = 5;
+
+# add a STORED generated column; this is not instant
+--enable_info
+ALTER TABLE t1 ADD COLUMN (f1 VARCHAR(20) AS (concat('x', e2)) STORED);
+
+# instant add
+ALTER TABLE t1 ADD COLUMN (d8 VARCHAR(20) DEFAULT 'omnopq');
+--disable_info
+
+SELECT * FROM t1;
+SHOW CREATE TABLE t1;
+
+--enable_info
+ALTER TABLE t1
+CHANGE c2 c2 INT DEFAULT 42,
+CHANGE d1 d1 INT DEFAULT 1,
+CHANGE d2 d2 INT DEFAULT 20,
+CHANGE d3 d3 VARCHAR(20) NOT NULL DEFAULT 'boofar';
+--disable_info
+INSERT INTO t1 SET id=9;
+--enable_info
+ALTER TABLE t1 DROP c3;
+--disable_info
+
+SHOW CREATE TABLE t1;
+SELECT * FROM t1;
+
+eval CREATE TABLE t2
+(id INT primary key, c1 VARCHAR(4000),
+ p GEOMETRY NOT NULL DEFAULT ST_GeomFromText('LINESTRING(0 0,0 1,1 1)'),
+ SPATIAL INDEX(p))
+$engine;
+
+BEGIN;
+INSERT INTO t2 SET id=1, c1=REPEAT('a', 4000);
+INSERT INTO t2 SET id=2, c1=REPEAT('a', 4000), p=ST_GeomFromText('POINT(1 1)');
+COMMIT;
+
+--enable_info
+ALTER TABLE t2 ADD COLUMN d1 VARCHAR(2000) DEFAULT REPEAT('asdf',500);
+--disable_info
+SELECT * FROM t2;
+
+# inplace update, rollback
+BEGIN;
+UPDATE t2 SET c1 = repeat(id, 4000);
+
+connection analyze;
+ANALYZE TABLE t2;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/t2';
+connection default;
+
+ROLLBACK;
+connection analyze;
+ANALYZE TABLE t2;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/t2';
+connection default;
+
+# non-inplace update. Rollback MUST NOT materialize off-page columns.
+BEGIN;
+UPDATE t2 SET d1 = repeat(id, 200);
+connection analyze;
+ANALYZE TABLE t2;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/t2';
+connection default;
+ROLLBACK;
+connection analyze;
+ANALYZE TABLE t2;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/t2';
+connection default;
+
+--enable_info
+ALTER TABLE t2 DROP p;
+--disable_info
+SELECT * FROM t2;
+
+# datetime
+eval CREATE TABLE t3
+(id INT PRIMARY KEY, c2 INT UNSIGNED NOT NULL UNIQUE,
+ c3 POLYGON NOT NULL DEFAULT ST_PolyFromText('POLYGON((1 1,2 2,3 3,1 1))'),
+ SPATIAL INDEX(c3))
+$engine;
+INSERT INTO t3(id,c2) VALUES(1,1),(2,2),(3,3);
+SELECT * FROM t3;
+--enable_info
+ALTER TABLE t3 ADD COLUMN
+(c4 DATETIME DEFAULT current_timestamp(),
+ c5 TIMESTAMP NOT NULL DEFAULT current_timestamp(),
+ c6 POINT);
+SELECT * FROM t3;
+ALTER TABLE t3 ADD COLUMN c7 TIME NOT NULL DEFAULT current_timestamp();
+ALTER TABLE t3 ADD COLUMN c8 DATE NOT NULL DEFAULT current_timestamp();
+--disable_info
+SELECT * FROM t3;
+
+--enable_info
+ALTER TABLE t3 ADD COLUMN t TEXT CHARSET utf8
+DEFAULT 'The quick brown fox jumps over the lazy dog';
+ALTER TABLE t3 ADD COLUMN b BLOB NOT NULL;
+--error ER_NO_DEFAULT_FOR_FIELD
+INSERT INTO t3 SET id=4;
+INSERT INTO t3 SET id=4, c2=0, b=0xf09f98b1;
+
+ALTER TABLE t3 CHANGE t phrase TEXT DEFAULT 0xc3a4c3a448,
+CHANGE b b BLOB NOT NULL DEFAULT 'binary line of business';
+--disable_info
+INSERT INTO t3 SET id=5, c2=9;
+SELECT * FROM t3;
+--enable_info
+ALTER TABLE t3 DROP c3, DROP c7;
+--disable_info
+SELECT * FROM t3;
+
+eval CREATE TABLE big
+(id INT PRIMARY KEY, c1 VARCHAR(4000), c2 VARCHAR(4000), c3 VARCHAR(1000),
+ p POINT NOT NULL DEFAULT ST_GeomFromText('POINT(0 0)'), SPATIAL INDEX(p))
+$engine;
+BEGIN;
+INSERT INTO big
+SET id=1, c1=REPEAT('a', 200), c2=REPEAT('b', 200), c3=REPEAT('c', 159);
+SET @i:=1;
+INSERT INTO big SELECT @i:=@i+1, c1, c2, c3, p FROM big;
+INSERT INTO big SELECT @i:=@i+1, c1, c2, c3, p FROM big;
+INSERT INTO big SELECT @i:=@i+1, c1, c2, c3, p FROM big;
+INSERT INTO big SELECT @i:=@i+1, c1, c2, c3, p FROM big;
+INSERT INTO big SELECT @i:=@i+1, c1, c2, c3, p FROM big;
+COMMIT;
+connection analyze;
+ANALYZE TABLE big;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/big';
+connection default;
+--enable_info
+ALTER TABLE big ADD COLUMN
+(d1 INT DEFAULT 0, d2 VARCHAR(20) DEFAULT 'abcde',
+ d3 TIMESTAMP NOT NULL DEFAULT current_timestamp ON UPDATE current_timestamp);
+--disable_info
+CHECKSUM TABLE big;
+BEGIN;
+INSERT INTO big(id, c1, c2, c3) SELECT @i:=@i+1, c1, c2, c3 FROM big;
+INSERT INTO big(id, c1, c2, c3) SELECT @i:=@i+1, c1, c2, c3 FROM big;
+CHECKSUM TABLE big;
+connection analyze;
+ANALYZE TABLE big;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/big';
+connection default;
+ROLLBACK;
+CHECKSUM TABLE big;
+connection analyze;
+ANALYZE TABLE big;
+SELECT clust_index_size FROM INFORMATION_SCHEMA.INNODB_SYS_TABLESTATS
+WHERE name = 'test/big';
+connection default;
+
+--source include/wait_all_purged.inc
+DROP TABLE t1,t2,t3,big;
+
+dec $format;
+}
+disconnect analyze;
+SELECT variable_value-@old_instant instants
+FROM information_schema.global_status
+WHERE variable_name = 'innodb_instant_alter_column';
+SET GLOBAL innodb_purge_rseg_truncate_frequency= @saved_frequency;
diff --git a/mysql-test/suite/innodb/t/instant_alter_crash.test b/mysql-test/suite/innodb/t/instant_alter_crash.test
new file mode 100644
index 00000000000..36f6f48339f
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter_crash.test
@@ -0,0 +1,123 @@
+--source include/have_innodb.inc
+# The embedded server tests do not support restarting.
+--source include/not_embedded.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+
+let INNODB_PAGE_SIZE=`select @@innodb_page_size`;
+let MYSQLD_DATADIR=`select @@datadir`;
+
+--echo #
+--echo # MDEV-11369: Instant ADD COLUMN for InnoDB
+--echo #
+
+CREATE TABLE t1(id INT PRIMARY KEY, c2 INT UNIQUE)
+ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
+CREATE TABLE t2 LIKE t1;
+INSERT INTO t1 VALUES(1,2);
+BEGIN;
+INSERT INTO t2 VALUES(2,1);
+ALTER TABLE t2 ADD COLUMN (c3 TEXT NOT NULL DEFAULT 'De finibus bonorum');
+
+connect ddl, localhost, root;
+SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
+--send
+ALTER TABLE t1 ADD COLUMN (c3 TEXT NOT NULL DEFAULT ' et malorum');
+
+connection default;
+SET DEBUG_SYNC='now WAIT_FOR ddl';
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+COMMIT;
+
+--source include/kill_mysqld.inc
+disconnect ddl;
+--source include/start_mysqld.inc
+
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+SELECT * FROM t1;
+SELECT * FROM t2;
+BEGIN;
+DELETE FROM t1;
+ROLLBACK;
+--source include/wait_all_purged.inc
+
+INSERT INTO t2 VALUES (64,42,'De finibus bonorum'), (347,33101,' et malorum');
+
+connect ddl, localhost, root;
+SET DEBUG_SYNC='innodb_alter_inplace_before_commit SIGNAL ddl WAIT_FOR ever';
+--send
+ALTER TABLE t2 ADD COLUMN (c4 TEXT NOT NULL DEFAULT ' et malorum');
+
+connection default;
+SET DEBUG_SYNC='now WAIT_FOR ddl';
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+DELETE FROM t1;
+
+--source include/kill_mysqld.inc
+disconnect ddl;
+--source include/start_mysqld.inc
+
+SET @saved_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+
+SELECT * FROM t1;
+SELECT * FROM t2;
+BEGIN;
+INSERT INTO t1 SET id=1;
+DELETE FROM t2;
+ROLLBACK;
+--source include/wait_all_purged.inc
+
+FLUSH TABLE t1,t2 FOR EXPORT;
+
+# At this point, t1 is empty and t2 contains a 'default row'.
+
+# The following is based on innodb.table_flags and innodb.dml_purge:
+--perl
+use strict;
+my $ps= $ENV{INNODB_PAGE_SIZE};
+foreach my $table ('t1','t2') {
+my $file= "$ENV{MYSQLD_DATADIR}/test/$table.ibd";
+open(FILE, "<", $file) || die "Unable to open $file\n";
+my $page;
+sysseek(FILE, 3*$ps, 0) || die "Unable to seek $file";
+die "Unable to read $file" unless sysread(FILE, $page, $ps) == $ps;
+print "$table clustered index root page";
+print "(type ", unpack("n", substr($page,24,2)), "):\n";
+print "N_RECS=", unpack("n", substr($page,38+16,2));
+print "; LEVEL=", unpack("n", substr($page,38+26,2)), "\n";
+my @fields=("id","DB_TRX_ID","DB_ROLL_PTR", "c2","c3","c4");
+for (my $offset= 0x65; $offset;
+ $offset= unpack("n", substr($page,$offset-2,2)))
+{
+ print "header=0x", unpack("H*",substr($page,$offset-6,6)), " (";
+ my $n_fields= unpack("n", substr($page,$offset-4,2)) >> 1 & 0x3ff;
+ my $start= 0;
+ my $name;
+ for (my $i= 0; $i < $n_fields; $i++) {
+ my $end= unpack("C", substr($page, $offset-7-$i, 1));
+ print ",\n " if $i;
+ print "$fields[$i]=";
+ if ($end & 0x80) {
+ print "NULL(", ($end & 0x7f) - $start, " bytes)"
+ } else {
+ print "0x", unpack("H*", substr($page,$offset+$start,$end-$start))
+ }
+ $start= $end & 0x7f;
+ }
+ print ")\n";
+}
+close(FILE) || die "Unable to close $file\n";
+}
+EOF
+
+UNLOCK TABLES;
+
+DELETE FROM t2;
+--source include/wait_all_purged.inc
+
+SHOW CREATE TABLE t1;
+SHOW CREATE TABLE t2;
+DROP TABLE t1,t2;
+
+SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
diff --git a/mysql-test/suite/innodb/t/instant_alter_debug.test b/mysql-test/suite/innodb/t/instant_alter_debug.test
new file mode 100644
index 00000000000..69aab6e2fc1
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter_debug.test
@@ -0,0 +1,180 @@
+--source include/have_innodb.inc
+--source include/have_debug.inc
+--source include/have_debug_sync.inc
+SET @save_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+
+CREATE TABLE t1 (
+ pk INT AUTO_INCREMENT PRIMARY KEY,
+ c1 INT,
+ c2 VARCHAR(255),
+ c3 VARCHAR(255),
+ c4 INT,
+ c5 INT,
+ c6 INT,
+ c7 VARCHAR(255),
+ c8 TIMESTAMP NULL
+) ENGINE=InnoDB;
+INSERT INTO t1 VALUES (NULL,1,NULL,'foo',NULL,1,NULL,NULL,'2011-11-11 00:00:00');
+ALTER TABLE t1 ADD COLUMN f INT;
+REPLACE INTO t1 (c7) VALUES ('bar');
+
+CREATE TABLE t2 (i INT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t2 VALUES (-1),(1);
+ALTER TABLE t2 ADD COLUMN j INT;
+BEGIN;
+DELETE FROM t2;
+ROLLBACK;
+TRUNCATE TABLE t2;
+INSERT INTO t2 VALUES (1,2);
+
+CREATE TABLE t3 (pk INT AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t3 () VALUES ();
+ALTER TABLE t3 ADD COLUMN f INT;
+UPDATE t3 SET pk = DEFAULT;
+SELECT * FROM t3;
+
+CREATE TABLE t4 (pk INT PRIMARY KEY) ENGINE=InnoDB;
+INSERT INTO t4 VALUES (0);
+ALTER TABLE t4 ADD COLUMN b INT;
+SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS
+LEFT JOIN t4 ON (NUMERIC_SCALE = pk);
+SET DEBUG_SYNC='innodb_inplace_alter_table_enter SIGNAL enter WAIT_FOR delete';
+--send
+ALTER TABLE t4 ADD COLUMN c INT;
+connect (dml,localhost,root,,);
+SET DEBUG_SYNC='now WAIT_FOR enter';
+DELETE FROM t4;
+--source include/wait_all_purged.inc
+SET DEBUG_SYNC='now SIGNAL delete';
+connection default;
+reap;
+
+CREATE TABLE t5 (i INT, KEY(i)) ENGINE=InnoDB;
+INSERT INTO t5 VALUES (-42);
+ALTER TABLE t5 ADD UNIQUE ui(i);
+ALTER TABLE t5 ADD COLUMN i2 INT, DROP INDEX i;
+
+CREATE TABLE t6 (i INT NOT NULL) ENGINE=InnoDB;
+INSERT INTO t6 VALUES (0);
+ALTER TABLE t6 ADD COLUMN j INT;
+TRUNCATE TABLE t6;
+INSERT INTO t6 VALUES (1,2);
+
+CREATE TABLE t7 (i INT) ENGINE=InnoDB;
+INSERT INTO t7 VALUES (1),(2),(3),(4),(5);
+ALTER TABLE t7 ADD t TEXT DEFAULT '';
+
+CREATE TABLE t8 (i INT) ENGINE=InnoDB ROW_FORMAT=REDUNDANT;
+INSERT INTO t8 VALUES (NULL);
+ALTER TABLE t8 ADD c CHAR(3);
+SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL rebuilt WAIT_FOR dml';
+--send
+ALTER TABLE t8 FORCE;
+connection dml;
+SET DEBUG_SYNC='now WAIT_FOR rebuilt';
+BEGIN;
+INSERT INTO t8 SET i=1;
+UPDATE t8 SET i=ISNULL(i);
+ROLLBACK;
+SET DEBUG_SYNC='now SIGNAL dml';
+connection default;
+reap;
+SET DEBUG_SYNC='RESET';
+
+CREATE TABLE t9 (
+ pk INT AUTO_INCREMENT PRIMARY KEY,
+ c1 BIGINT UNSIGNED,
+ c2 TIMESTAMP NULL DEFAULT CURRENT_TIMESTAMP,
+ c3 BIGINT,
+ c4 VARCHAR(257) CHARACTER SET utf8,
+ c5 TINYINT UNSIGNED,
+ c6 TINYINT,
+ c7 VARCHAR(257) CHARACTER SET latin1,
+ c8 VARCHAR(257) CHARACTER SET binary
+) ENGINE=InnoDB;
+INSERT INTO t9 () VALUES ();
+ALTER TABLE t9 ADD COLUMN IF NOT EXISTS t TIMESTAMP NULL KEY;
+SET DEBUG_SYNC='row_log_table_apply1_before SIGNAL rebuilt WAIT_FOR dml';
+--send
+OPTIMIZE TABLE t9;
+connection dml;
+SET DEBUG_SYNC='now WAIT_FOR rebuilt';
+BEGIN;
+INSERT INTO t9 () VALUES (),();
+UPDATE t9 SET t=current_timestamp();
+ROLLBACK;
+SET DEBUG_SYNC='now SIGNAL dml';
+disconnect dml;
+connection default;
+reap;
+SET DEBUG_SYNC='RESET';
+
+CREATE TABLE t10 (pk INT DEFAULT 0 KEY) ENGINE=InnoDB;
+INSERT INTO t10 (pk) VALUES (1);
+ALTER TABLE t10 ADD c INT;
+TRUNCATE TABLE t10;
+INSERT INTO t10 VALUES (1,1),(2,2);
+ALTER TABLE t10 FORCE;
+
+CREATE TABLE t11 (
+ c01 enum('a','b'),
+ c02 bit,
+ c03 blob,
+ c04 enum('c','d'),
+ c05 blob,
+ c06 decimal,
+ c07 char(1),
+ c08 int,
+ c09 char(1),
+ c10 set('e','f'),
+ c11 char(1),
+ c12 float,
+ c13 bit,
+ c14 char(1),
+ c15 int,
+ c16 float,
+ c17 decimal,
+ c18 char(1) CHARACTER SET utf8 not null default '',
+ c19 float,
+ c20 set('g','h'),
+ c21 char(1),
+ c22 int,
+ c23 int,
+ c24 int,
+ c25 set('i','j'),
+ c26 decimal,
+ c27 float,
+ c28 char(1),
+ c29 int,
+ c30 enum('k','l'),
+ c31 decimal,
+ c32 char(1),
+ c33 decimal,
+ c34 bit,
+ c35 enum('m','n'),
+ c36 set('o','p'),
+ c37 enum('q','r'),
+ c38 blob,
+ c39 decimal,
+ c40 blob not null default '',
+ c41 char(1),
+ c42 int,
+ c43 float,
+ c44 float,
+ c45 enum('s','t'),
+ c46 decimal,
+ c47 set('u','v'),
+ c48 enum('w','x'),
+ c49 set('y','z'),
+ c50 float
+) ENGINE=InnoDB;
+INSERT INTO t11 () VALUES ();
+ALTER TABLE t11 ADD COLUMN f INT;
+INSERT INTO t11 () VALUES ();
+UPDATE t11 SET c22 = 1;
+
+--source include/wait_all_purged.inc
+DROP TABLE t1,t2,t3,t4,t5,t6,t7,t8,t9,t10,t11;
+
+SET GLOBAL innodb_purge_rseg_truncate_frequency = @save_frequency;
diff --git a/mysql-test/suite/innodb/t/instant_alter_inject.test b/mysql-test/suite/innodb/t/instant_alter_inject.test
new file mode 100644
index 00000000000..2a74998f65e
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter_inject.test
@@ -0,0 +1,46 @@
+--source include/have_innodb.inc
+--source include/have_debug.inc
+--source include/have_partition.inc
+
+CREATE TABLE t1(a INT PRIMARY KEY, b INT, KEY(b)) ENGINE=InnoDB
+ROW_FORMAT=REDUNDANT PARTITION BY KEY() PARTITIONS 3;
+INSERT INTO t1 (a) VALUES (1),(2),(3),(4),(5);
+SET @saved_dbug= @@SESSION.debug_dbug;
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_2';
+--error ER_INTERNAL_ERROR
+ALTER TABLE t1 ADD COLUMN c CHAR(3) DEFAULT 'lie';
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t1;
+BEGIN;
+UPDATE t1 SET b=a+1;
+INSERT INTO t1 VALUES (0,1);
+ROLLBACK;
+SELECT * FROM t1;
+ALTER TABLE t1 ADD COLUMN c CHAR(3) DEFAULT 'lie';
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_1';
+--error ER_INTERNAL_ERROR
+ALTER TABLE t1 ADD COLUMN d INT NOT NULL DEFAULT -42;
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t1;
+BEGIN;
+DELETE FROM t1;
+INSERT INTO t1 VALUES (1,2,'foo');
+ROLLBACK;
+
+SHOW CREATE TABLE t1;
+DROP TABLE t1;
+
+CREATE TABLE t2(a INT, KEY(a)) ENGINE=InnoDB;
+INSERT INTO t2 VALUES (1);
+SET DEBUG_DBUG='+d,ib_commit_inplace_fail_1';
+--error ER_INTERNAL_ERROR
+ALTER TABLE t2 ADD COLUMN b INT;
+SET DEBUG_DBUG= @saved_dbug;
+CHECK TABLE t2;
+BEGIN;
+DELETE FROM t2;
+INSERT INTO t2 VALUES (1);
+ROLLBACK;
+
+SHOW CREATE TABLE t2;
+DROP TABLE t2;
diff --git a/mysql-test/suite/innodb/t/instant_alter_rollback.test b/mysql-test/suite/innodb/t/instant_alter_rollback.test
new file mode 100644
index 00000000000..c27f198b4c6
--- /dev/null
+++ b/mysql-test/suite/innodb/t/instant_alter_rollback.test
@@ -0,0 +1,70 @@
+--source include/have_innodb.inc
+# The embedded server tests do not support restarting.
+--source include/not_embedded.inc
+
+--echo #
+--echo # MDEV-11369: Instant ADD COLUMN for InnoDB
+--echo #
+
+connect to_be_killed, localhost, root;
+
+let $format= 3;
+while ($format) {
+let $fmt= `SELECT CASE $format WHEN 1 THEN 'DYNAMIC' WHEN 2 THEN 'COMPACT'
+ELSE 'REDUNDANT' END`;
+let $engine= ENGINE=InnoDB ROW_FORMAT=$fmt;
+
+eval CREATE TABLE empty_$fmt
+(id INT PRIMARY KEY, c2 INT UNIQUE) $engine;
+eval CREATE TABLE once_$fmt LIKE empty_$fmt;
+eval CREATE TABLE twice_$fmt LIKE empty_$fmt;
+eval INSERT INTO once_$fmt SET id=1,c2=1;
+eval INSERT INTO twice_$fmt SET id=1,c2=1;
+eval ALTER TABLE empty_$fmt ADD COLUMN (d1 INT DEFAULT 15);
+eval ALTER TABLE once_$fmt ADD COLUMN (d1 INT DEFAULT 20);
+eval ALTER TABLE twice_$fmt ADD COLUMN (d1 INT DEFAULT 20);
+eval ALTER TABLE twice_$fmt ADD COLUMN
+(d2 INT NOT NULL DEFAULT 10,
+ d3 VARCHAR(15) NOT NULL DEFAULT 'var och en char');
+dec $format;
+}
+
+BEGIN;
+
+let $format= 3;
+while ($format) {
+let $fmt= `SELECT CASE $format WHEN 1 THEN 'DYNAMIC' WHEN 2 THEN 'COMPACT'
+ELSE 'REDUNDANT' END`;
+
+eval INSERT INTO empty_$fmt set id=0,c2=42;
+eval UPDATE once_$fmt set c2=c2+1;
+eval UPDATE twice_$fmt set c2=c2+1;
+eval INSERT INTO twice_$fmt SET id=2,c2=0,d3='';
+dec $format;
+}
+
+connection default;
+SET GLOBAL innodb_flush_log_at_trx_commit=1;
+CREATE TABLE foo(a INT PRIMARY KEY) ENGINE=InnoDB;
+
+--source include/kill_mysqld.inc
+disconnect to_be_killed;
+--source include/start_mysqld.inc
+
+SET @saved_frequency= @@GLOBAL.innodb_purge_rseg_truncate_frequency;
+SET GLOBAL innodb_purge_rseg_truncate_frequency=1;
+DROP TABLE foo;
+--source include/wait_all_purged.inc
+SET GLOBAL innodb_purge_rseg_truncate_frequency=@saved_frequency;
+
+let $format= 3;
+while ($format) {
+let $fmt= `SELECT CASE $format WHEN 1 THEN 'DYNAMIC' WHEN 2 THEN 'COMPACT'
+ELSE 'REDUNDANT' END`;
+
+eval SELECT * FROM empty_$fmt;
+eval SELECT * FROM once_$fmt;
+eval SELECT * FROM twice_$fmt;
+eval DROP TABLE empty_$fmt, once_$fmt, twice_$fmt;
+dec $format;
+}
diff --git a/mysql-test/suite/innodb/t/truncate_debug.test b/mysql-test/suite/innodb/t/truncate_debug.test
index 5fee9174d98..74d9064faa4 100644
--- a/mysql-test/suite/innodb/t/truncate_debug.test
+++ b/mysql-test/suite/innodb/t/truncate_debug.test
@@ -18,10 +18,11 @@ SET GLOBAL innodb_adaptive_hash_index=ON;
--echo table being truncated. Also check if
--echo DDL operations on other tables succeed.
-create table t1 (f1 int,f2 int,key(f2),f3 int) engine=innodb;
+create table t1 (f1 int,f2 int,key(f2),f3 int) engine=innodb row_format=redundant;
create index idx1 on t1(f3);
-create table t2 (f1 int,f2 int,key(f2),f3 int) engine=innodb;
-create table t3 (f1 int,f2 int,key(f2)) engine=innodb;
+create table t2 (f1 int,f2 int,key(f2),f3 int) engine=innodb row_format=redundant;
+
+create table t3 (f1 int,f2 int,key(f2)) engine=innodb row_format=redundant;
insert into t1 values (10,20,30),(30,40,50),(50,60,70);
insert into t1 select * from t1;
diff --git a/mysql-test/suite/innodb_gis/r/alter_spatial_index.result b/mysql-test/suite/innodb_gis/r/alter_spatial_index.result
index 17f1f7e1b06..625eac13959 100644
--- a/mysql-test/suite/innodb_gis/r/alter_spatial_index.result
+++ b/mysql-test/suite/innodb_gis/r/alter_spatial_index.result
@@ -483,7 +483,7 @@ info: Records: 0 Duplicates: 0 Warnings: 0
ALTER TABLE tab MODIFY COLUMN c2 GEOMETRY NOT NULL;
affected rows: 0
info: Records: 0 Duplicates: 0 Warnings: 0
-ALTER TABLE tab add COLUMN c8 POINT NOT NULL, ALGORITHM = INPLACE, LOCK=NONE;
+ALTER TABLE tab add COLUMN c8 POINT NOT NULL AFTER c5, ALGORITHM = INPLACE, LOCK=NONE;
ERROR 0A000: LOCK=NONE is not supported. Reason: Do not support online operation on table with GIS index. Try LOCK=SHARED
SHOW CREATE TABLE tab;
Table Create Table
diff --git a/mysql-test/suite/innodb_gis/t/alter_spatial_index.test b/mysql-test/suite/innodb_gis/t/alter_spatial_index.test
index 2b834ac69a6..7513d0ddb39 100644
--- a/mysql-test/suite/innodb_gis/t/alter_spatial_index.test
+++ b/mysql-test/suite/innodb_gis/t/alter_spatial_index.test
@@ -482,8 +482,18 @@ ALTER TABLE tab MODIFY COLUMN c2 GEOMETRY NOT NULL;
# --error ER_INVALID_USE_OF_NULL
# ALTER TABLE tab add COLUMN c7 POINT NOT NULL;
+# instant add, supported
+#ALTER TABLE tab add COLUMN c8 POINT NOT NULL, ALGORITHM = INPLACE, LOCK=NONE;
+#SELECT HEX(c8) FROM tab;
+#BEGIN;
+#INSERT INTO tab SELECT 0,c2,c3,c4,c5,ST_GeomFromText('POINT(67 89)')
+#FROM tab LIMIT 1;
+#SELECT HEX(c8) FROM tab;
+#ROLLBACK;
+
+# not instant (the column is not added last), so the rebuild is refused online
--error ER_ALTER_OPERATION_NOT_SUPPORTED_REASON
-ALTER TABLE tab add COLUMN c8 POINT NOT NULL, ALGORITHM = INPLACE, LOCK=NONE;
+ALTER TABLE tab add COLUMN c8 POINT NOT NULL AFTER c5, ALGORITHM = INPLACE, LOCK=NONE;
--disable_info
SHOW CREATE TABLE tab;
diff --git a/mysql-test/suite/rpl/r/rpl_alter_instant.result b/mysql-test/suite/rpl/r/rpl_alter_instant.result
new file mode 100644
index 00000000000..35380fdeddf
--- /dev/null
+++ b/mysql-test/suite/rpl/r/rpl_alter_instant.result
@@ -0,0 +1,66 @@
+include/master-slave.inc
+[connection master]
+use test;
+create table t1 (id int primary key, c1 int default 10, c2 varchar(20) default 'holiday') engine = innodb;
+insert into t1 values(1, 12345, 'abcde'), (2, default, default), (3, 23456, 'xyzab');
+set time_zone='+03:00';
+set timestamp = 1;
+alter table t1 add column d1 timestamp not null default current_timestamp;
+select * from t1;
+id c1 c2 d1
+1 12345 abcde 1970-01-01 03:00:01
+2 10 holiday 1970-01-01 03:00:01
+3 23456 xyzab 1970-01-01 03:00:01
+connection slave;
+connection slave;
+set time_zone='+03:00';
+select * from t1;
+id c1 c2 d1
+1 12345 abcde 1970-01-01 03:00:01
+2 10 holiday 1970-01-01 03:00:01
+3 23456 xyzab 1970-01-01 03:00:01
+connection master;
+alter table t1 add column d2 timestamp not null default current_timestamp, ALGORITHM=copy;
+connection slave;
+connection slave;
+select * from t1;
+id c1 c2 d1 d2
+1 12345 abcde 1970-01-01 03:00:01 1970-01-01 03:00:01
+2 10 holiday 1970-01-01 03:00:01 1970-01-01 03:00:01
+3 23456 xyzab 1970-01-01 03:00:01 1970-01-01 03:00:01
+connection master;
+drop table t1;
+create table t4 (id int primary key, c2 int);
+insert into t4 values(1,1),(2,2),(3,3);
+set timestamp = 1000;
+alter table t4 add column (c3 datetime default current_timestamp(), c4 timestamp not null default current_timestamp());
+select * from t4;
+id c2 c3 c4
+1 1 1970-01-01 03:16:40 1970-01-01 03:16:40
+2 2 1970-01-01 03:16:40 1970-01-01 03:16:40
+3 3 1970-01-01 03:16:40 1970-01-01 03:16:40
+alter table t4 add column c5 time not null default current_timestamp();
+Warnings:
+Note 1265 Data truncated for column 'c5' at row 1
+Note 1265 Data truncated for column 'c5' at row 2
+Note 1265 Data truncated for column 'c5' at row 3
+alter table t4 add column c6 date not null default current_timestamp();
+Warnings:
+Note 1265 Data truncated for column 'c6' at row 1
+Note 1265 Data truncated for column 'c6' at row 2
+Note 1265 Data truncated for column 'c6' at row 3
+select * from t4;
+id c2 c3 c4 c5 c6
+1 1 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+2 2 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+3 3 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+connection slave;
+connection slave;
+select * from t4;
+id c2 c3 c4 c5 c6
+1 1 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+2 2 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+3 3 1970-01-01 03:16:40 1970-01-01 03:16:40 03:16:40 1970-01-01
+connection master;
+drop table t4;
+include/rpl_end.inc
diff --git a/mysql-test/suite/rpl/t/rpl_alter_instant.test b/mysql-test/suite/rpl/t/rpl_alter_instant.test
new file mode 100644
index 00000000000..260f7e92d10
--- /dev/null
+++ b/mysql-test/suite/rpl/t/rpl_alter_instant.test
@@ -0,0 +1,50 @@
+source include/have_innodb.inc;
+source include/master-slave.inc;
+
+use test;
+create table t1 (id int primary key, c1 int default 10, c2 varchar(20) default 'holiday') engine = innodb;
+
+insert into t1 values(1, 12345, 'abcde'), (2, default, default), (3, 23456, 'xyzab');
+
+set time_zone='+03:00';
+set timestamp = 1;
+alter table t1 add column d1 timestamp not null default current_timestamp;
+
+select * from t1;
+sync_slave_with_master;
+
+connection slave;
+set time_zone='+03:00';
+select * from t1;
+
+connection master;
+alter table t1 add column d2 timestamp not null default current_timestamp, ALGORITHM=copy;
+
+sync_slave_with_master;
+
+connection slave;
+select * from t1;
+
+connection master;
+drop table t1;
+
+
+# datetime
+create table t4 (id int primary key, c2 int);
+insert into t4 values(1,1),(2,2),(3,3);
+set timestamp = 1000;
+alter table t4 add column (c3 datetime default current_timestamp(), c4 timestamp not null default current_timestamp());
+select * from t4;
+alter table t4 add column c5 time not null default current_timestamp();
+alter table t4 add column c6 date not null default current_timestamp();
+
+select * from t4;
+sync_slave_with_master;
+
+connection slave;
+select * from t4;
+
+connection master;
+drop table t4;
+
+--source include/rpl_end.inc
diff --git a/storage/innobase/btr/btr0btr.cc b/storage/innobase/btr/btr0btr.cc
index 3e9f26ad125..cf2aebe80f5 100644
--- a/storage/innobase/btr/btr0btr.cc
+++ b/storage/innobase/btr/btr0btr.cc
@@ -1686,6 +1686,17 @@ func_exit:
#ifdef UNIV_ZIP_DEBUG
ut_a(!page_zip || page_zip_validate(page_zip, page, index));
#endif /* UNIV_ZIP_DEBUG */
+
+ if (!recovery && page_is_root(temp_page)
+ && fil_page_get_type(temp_page) == FIL_PAGE_TYPE_INSTANT) {
+ /* Preserve the PAGE_INSTANT information. */
+ ut_ad(!page_zip);
+ ut_ad(index->is_instant());
+ memcpy(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE + temp_page, 2);
+ memcpy(PAGE_HEADER + PAGE_INSTANT + page,
+ PAGE_HEADER + PAGE_INSTANT + temp_page, 2);
+ }
+
buf_block_free(temp_block);
/* Restore logging mode */
@@ -1720,6 +1731,19 @@ func_exit:
MONITOR_INC(MONITOR_INDEX_REORG_SUCCESSFUL);
}
+ if (UNIV_UNLIKELY(fil_page_get_type(page) == FIL_PAGE_TYPE_INSTANT)) {
+ /* Log the PAGE_INSTANT information. */
+ ut_ad(!page_zip);
+ ut_ad(index->is_instant());
+ ut_ad(!recovery);
+ mlog_write_ulint(FIL_PAGE_TYPE + page, FIL_PAGE_TYPE_INSTANT,
+ MLOG_2BYTES, mtr);
+ mlog_write_ulint(PAGE_HEADER + PAGE_INSTANT + page,
+ mach_read_from_2(PAGE_HEADER + PAGE_INSTANT
+ + page),
+ MLOG_2BYTES, mtr);
+ }
+
return(success);
}
@@ -1818,17 +1842,19 @@ btr_parse_page_reorganize(
return(ptr);
}
-/*************************************************************//**
-Empties an index page. @see btr_page_create(). */
-static
+/** Empty an index page (possibly the root page). @see btr_page_create().
+@param[in,out] block page to be emptied
+@param[in,out] page_zip compressed page frame, or NULL
+@param[in] index index of the page
+@param[in] level B-tree level of the page (0=leaf)
+@param[in,out] mtr mini-transaction */
void
btr_page_empty(
-/*===========*/
- buf_block_t* block, /*!< in: page to be emptied */
- page_zip_des_t* page_zip,/*!< out: compressed page, or NULL */
- dict_index_t* index, /*!< in: index of the page */
- ulint level, /*!< in: the B-tree level of the page */
- mtr_t* mtr) /*!< in: mtr */
+ buf_block_t* block,
+ page_zip_des_t* page_zip,
+ dict_index_t* index,
+ ulint level,
+ mtr_t* mtr)
{
page_t* page = buf_block_get_frame(block);
@@ -1903,6 +1929,7 @@ btr_root_raise_and_insert(
root_page_zip = buf_block_get_page_zip(root_block);
ut_ad(!page_is_empty(root));
index = btr_cur_get_index(cursor);
+ ut_ad(index->n_core_null_bytes <= UT_BITS_IN_BYTES(index->n_nullable));
#ifdef UNIV_ZIP_DEBUG
ut_a(!root_page_zip || page_zip_validate(root_page_zip, root, index));
#endif /* UNIV_ZIP_DEBUG */
@@ -1965,19 +1992,16 @@ btr_root_raise_and_insert(
root_page_zip, root, index, mtr);
/* Update the lock table and possible hash index. */
-
- if (!dict_table_is_locking_disabled(index->table)) {
- lock_move_rec_list_end(new_block, root_block,
- page_get_infimum_rec(root));
- }
+ lock_move_rec_list_end(new_block, root_block,
+ page_get_infimum_rec(root));
/* Move any existing predicate locks */
if (dict_index_is_spatial(index)) {
lock_prdt_rec_move(new_block, root_block);
+ } else {
+ btr_search_move_or_delete_hash_entries(
+ new_block, root_block);
}
-
- btr_search_move_or_delete_hash_entries(new_block, root_block,
- index);
}
if (dict_index_is_sec_or_ibuf(index)) {
@@ -2046,6 +2070,17 @@ btr_root_raise_and_insert(
/* Rebuild the root page to get free space */
btr_page_empty(root_block, root_page_zip, index, level + 1, mtr);
+ /* btr_page_empty() is supposed to zero-initialize the field. */
+ ut_ad(!page_get_instant(root_block->frame));
+
+ if (index->is_instant()) {
+ ut_ad(!root_page_zip);
+ byte* page_type = root_block->frame + FIL_PAGE_TYPE;
+ ut_ad(mach_read_from_2(page_type) == FIL_PAGE_INDEX);
+ mlog_write_ulint(page_type, FIL_PAGE_TYPE_INSTANT,
+ MLOG_2BYTES, mtr);
+ page_set_instant(root_block->frame, index->n_core_fields, mtr);
+ }
/* Set the next node and previous node fields, although
they should already have been set. The previous node field
@@ -3081,16 +3116,12 @@ insert_empty:
ULINT_UNDEFINED, mtr);
/* Update the lock table and possible hash index. */
-
- if (!dict_table_is_locking_disabled(
- cursor->index->table)) {
- lock_move_rec_list_start(
- new_block, block, move_limit,
- new_page + PAGE_NEW_INFIMUM);
- }
+ lock_move_rec_list_start(
+ new_block, block, move_limit,
+ new_page + PAGE_NEW_INFIMUM);
btr_search_move_or_delete_hash_entries(
- new_block, block, cursor->index);
+ new_block, block);
/* Delete the records from the source page. */
@@ -3127,16 +3158,12 @@ insert_empty:
cursor->index, mtr);
/* Update the lock table and possible hash index. */
- if (!dict_table_is_locking_disabled(
- cursor->index->table)) {
- lock_move_rec_list_end(
- new_block, block, move_limit);
- }
+ lock_move_rec_list_end(new_block, block, move_limit);
ut_ad(!dict_index_is_spatial(index));
btr_search_move_or_delete_hash_entries(
- new_block, block, cursor->index);
+ new_block, block);
/* Delete the records from the source page. */
@@ -3537,6 +3564,19 @@ btr_lift_page_up(
/* Make the father empty */
btr_page_empty(father_block, father_page_zip, index, page_level, mtr);
+ /* btr_page_empty() is supposed to zero-initialize the field. */
+ ut_ad(!page_get_instant(father_block->frame));
+
+ if (page_level == 0 && index->is_instant()) {
+ ut_ad(!father_page_zip);
+ byte* page_type = father_block->frame + FIL_PAGE_TYPE;
+ ut_ad(mach_read_from_2(page_type) == FIL_PAGE_INDEX);
+ mlog_write_ulint(page_type, FIL_PAGE_TYPE_INSTANT,
+ MLOG_2BYTES, mtr);
+ page_set_instant(father_block->frame,
+ index->n_core_fields, mtr);
+ }
+
page_level++;
/* Copy the records to the father page one by one. */
@@ -3558,18 +3598,16 @@ btr_lift_page_up(
/* Update the lock table and possible hash index. */
- if (!dict_table_is_locking_disabled(index->table)) {
- lock_move_rec_list_end(father_block, block,
- page_get_infimum_rec(page));
- }
+ lock_move_rec_list_end(father_block, block,
+ page_get_infimum_rec(page));
/* Also update the predicate locks */
if (dict_index_is_spatial(index)) {
lock_prdt_rec_move(father_block, block);
+ } else {
+ btr_search_move_or_delete_hash_entries(
+ father_block, block);
}
-
- btr_search_move_or_delete_hash_entries(father_block, block,
- index);
}
if (!dict_table_is_locking_disabled(index->table)) {
@@ -4205,6 +4243,7 @@ btr_discard_only_page_on_level(
/* block is the root page, which must be empty, except
for the node pointer to the (now discarded) block(s). */
+ ut_ad(page_is_root(block->frame));
#ifdef UNIV_BTR_DEBUG
if (!dict_index_is_ibuf(index)) {
@@ -4219,9 +4258,14 @@ btr_discard_only_page_on_level(
btr_page_empty(block, buf_block_get_page_zip(block), index, 0, mtr);
ut_ad(page_is_leaf(buf_block_get_frame(block)));
-
- if (!dict_index_is_clust(index)
- && !dict_table_is_temporary(index->table)) {
+ /* btr_page_empty() is supposed to zero-initialize the field. */
+ ut_ad(!page_get_instant(block->frame));
+
+ if (index->is_clust()) {
+ /* Concurrent access is prevented by the root_block->lock
+ X-latch, so this should be safe. */
+ index->remove_instant();
+ } else if (!index->table->is_temporary()) {
/* We play it safe and reset the free bits for the root */
ibuf_reset_free_bits(block);
@@ -4615,8 +4659,6 @@ btr_index_rec_validate(
and page on error */
{
ulint len;
- ulint n;
- ulint i;
const page_t* page;
mem_heap_t* heap = NULL;
ulint offsets_[REC_OFFS_NORMAL_SIZE];
@@ -4647,31 +4689,34 @@ btr_index_rec_validate(
return(FALSE);
}
- n = dict_index_get_n_fields(index);
-
- if (!page_is_comp(page)
- && (rec_get_n_fields_old(rec) != n
- /* a record for older SYS_INDEXES table
- (missing merge_threshold column) is acceptable. */
- && !(index->id == DICT_INDEXES_ID
- && rec_get_n_fields_old(rec) == n - 1))) {
- btr_index_rec_validate_report(page, rec, index);
+ if (!page_is_comp(page)) {
+ const ulint n_rec_fields = rec_get_n_fields_old(rec);
+ if (n_rec_fields == DICT_FLD__SYS_INDEXES__MERGE_THRESHOLD
+ && index->id == DICT_INDEXES_ID) {
+ /* A record for older SYS_INDEXES table
+ (missing merge_threshold column) is acceptable. */
+ } else if (n_rec_fields < index->n_core_fields
+ || n_rec_fields > index->n_fields) {
+ btr_index_rec_validate_report(page, rec, index);
- ib::error() << "Has " << rec_get_n_fields_old(rec)
- << " fields, should have " << n;
+ ib::error() << "Has " << rec_get_n_fields_old(rec)
+ << " fields, should have "
+ << index->n_core_fields << ".."
+ << index->n_fields;
- if (dump_on_error) {
- fputs("InnoDB: corrupt record ", stderr);
- rec_print_old(stderr, rec);
- putc('\n', stderr);
+ if (dump_on_error) {
+ fputs("InnoDB: corrupt record ", stderr);
+ rec_print_old(stderr, rec);
+ putc('\n', stderr);
+ }
+ return(FALSE);
}
- return(FALSE);
}
offsets = rec_get_offsets(rec, index, offsets, page_is_leaf(page),
ULINT_UNDEFINED, &heap);
- for (i = 0; i < n; i++) {
+ for (unsigned i = 0; i < index->n_fields; i++) {
dict_field_t* field = dict_index_get_nth_field(index, i);
ulint fixed_size = dict_col_get_fixed_size(
dict_field_get_col(field),
@@ -4686,14 +4731,10 @@ btr_index_rec_validate(
length. When fixed_size == 0, prefix_len is the maximum
length of the prefix index column. */
- if ((field->prefix_len == 0
- && len != UNIV_SQL_NULL && fixed_size
- && len != fixed_size)
- || (field->prefix_len > 0
- && len != UNIV_SQL_NULL
- && len
- > field->prefix_len)) {
-
+ if (len_is_stored(len)
+ && (field->prefix_len
+ ? len > field->prefix_len
+ : (fixed_size && len != fixed_size))) {
btr_index_rec_validate_report(page, rec, index);
ib::error error;
diff --git a/storage/innobase/btr/btr0bulk.cc b/storage/innobase/btr/btr0bulk.cc
index 139e3116d06..a2bd25b4a04 100644
--- a/storage/innobase/btr/btr0bulk.cc
+++ b/storage/innobase/btr/btr0bulk.cc
@@ -170,13 +170,14 @@ PageBulk::insert(
ut_ad(m_heap != NULL);
rec_size = rec_offs_size(offsets);
+ ut_d(const bool is_leaf = page_rec_is_leaf(m_cur_rec));
#ifdef UNIV_DEBUG
/* Check whether records are in order. */
if (!page_rec_is_infimum(m_cur_rec)) {
rec_t* old_rec = m_cur_rec;
ulint* old_offsets = rec_get_offsets(
- old_rec, m_index, NULL, page_rec_is_leaf(old_rec),
+ old_rec, m_index, NULL, is_leaf,
ULINT_UNDEFINED, &m_heap);
ut_ad(cmp_rec_rec(rec, old_rec, offsets, old_offsets, m_index)
@@ -188,7 +189,7 @@ PageBulk::insert(
/* 1. Copy the record to page. */
rec_t* insert_rec = rec_copy(m_heap_top, rec, offsets);
- rec_offs_make_valid(insert_rec, m_index, offsets);
+ rec_offs_make_valid(insert_rec, m_index, is_leaf, offsets);
/* 2. Insert the record in the linked list. */
rec_t* next_rec = page_rec_get_next(m_cur_rec);
@@ -291,12 +292,12 @@ PageBulk::finish()
page_dir_set_n_slots(m_page, NULL, 2 + slot_index);
page_header_set_ptr(m_page, NULL, PAGE_HEAP_TOP, m_heap_top);
page_dir_set_n_heap(m_page, NULL, PAGE_HEAP_NO_USER_LOW + m_rec_no);
- page_header_set_field(m_page, NULL, PAGE_N_RECS, m_rec_no);
-
page_header_set_ptr(m_page, NULL, PAGE_LAST_INSERT, m_cur_rec);
- page_header_set_field(m_page, NULL, PAGE_DIRECTION, PAGE_RIGHT);
- page_header_set_field(m_page, NULL, PAGE_N_DIRECTION, 0);
-
+ mach_write_to_2(PAGE_HEADER + PAGE_N_RECS + m_page, m_rec_no);
+ ut_ad(!page_get_instant(m_page));
+ m_page[PAGE_HEADER + PAGE_DIRECTION_B] = PAGE_RIGHT;
+ *reinterpret_cast<uint16_t*>(PAGE_HEADER + PAGE_N_DIRECTION + m_page)
+ = 0;
m_block->skip_flush_check = false;
}
diff --git a/storage/innobase/btr/btr0cur.cc b/storage/innobase/btr/btr0cur.cc
index 8295c0573cf..e96aceb5f5d 100644
--- a/storage/innobase/btr/btr0cur.cc
+++ b/storage/innobase/btr/btr0cur.cc
@@ -393,6 +393,211 @@ btr_cur_latch_leaves(
return(latch_leaves);
}
+/** Load the instant ALTER TABLE metadata from the clustered index
+when loading a table definition.
+@param[in,out] index clustered index definition
+@param[in,out] mtr mini-transaction
+@return error code
+@retval DB_SUCCESS if no error occurred
+@retval DB_CORRUPTION if any corruption was noticed */
+static
+dberr_t
+btr_cur_instant_init_low(dict_index_t* index, mtr_t* mtr)
+{
+ ut_ad(index->is_clust());
+ ut_ad(index->n_core_null_bytes == dict_index_t::NO_CORE_NULL_BYTES);
+ ut_ad(index->table->supports_instant());
+ ut_ad(index->table->is_readable());
+
+ page_t* root = btr_root_get(index, mtr);
+
+ if (!root || btr_cur_instant_root_init(index, root)) {
+ ib::error() << "Table " << index->table->name
+ << " has an unreadable root page";
+ index->table->corrupted = true;
+ return DB_CORRUPTION;
+ }
+
+ ut_ad(index->n_core_null_bytes != dict_index_t::NO_CORE_NULL_BYTES);
+
+ if (!index->is_instant()) {
+ return DB_SUCCESS;
+ }
+
+ btr_cur_t cur;
+ dberr_t err = btr_cur_open_at_index_side(true, index, BTR_SEARCH_LEAF,
+ &cur, 0, mtr);
+ if (err != DB_SUCCESS) {
+ index->table->corrupted = true;
+ return err;
+ }
+
+ ut_ad(page_cur_is_before_first(&cur.page_cur));
+ ut_ad(page_is_leaf(cur.page_cur.block->frame));
+
+ page_cur_move_to_next(&cur.page_cur);
+
+ const rec_t* rec = cur.page_cur.rec;
+
+ if (page_rec_is_supremum(rec) || !rec_is_default_row(rec, index)) {
+ ib::error() << "Table " << index->table->name
+ << " is missing instant ALTER metadata";
+ index->table->corrupted = true;
+ return DB_CORRUPTION;
+ }
+
+ if (dict_table_is_comp(index->table)) {
+ if (rec_get_info_bits(rec, true) != REC_INFO_MIN_REC_FLAG
+ || rec_get_status(rec) != REC_STATUS_COLUMNS_ADDED) {
+incompatible:
+ ib::error() << "Table " << index->table->name
+ << " contains unrecognizable "
+ "instant ALTER metadata";
+ index->table->corrupted = true;
+ return DB_CORRUPTION;
+ }
+ } else if (rec_get_info_bits(rec, false) != REC_INFO_MIN_REC_FLAG) {
+ goto incompatible;
+ }
+
+ /* Read the 'default row'. We can get here on server restart
+ or when the table was evicted from the data dictionary cache
+ and is now being accessed again.
+
+ Here, READ COMMITTED and REPEATABLE READ should be equivalent.
+ Committing the ADD COLUMN operation would acquire
+ MDL_EXCLUSIVE and LOCK_X|LOCK_TABLE, which would prevent any
+ concurrent operations on the table, including table eviction
+ from the cache. */
+
+ mem_heap_t* heap = NULL;
+ ulint* offsets = rec_get_offsets(rec, index, NULL, true,
+ ULINT_UNDEFINED, &heap);
+ if (rec_offs_any_default(offsets)) {
+inconsistent:
+ mem_heap_free(heap);
+ goto incompatible;
+ }
+
+ /* In fact, because we only ever append fields to the 'default
+ value' record, it is also OK to perform READ UNCOMMITTED and
+ then ignore any extra fields, provided that
+ trx_rw_is_active(DB_TRX_ID). */
+ if (rec_offs_n_fields(offsets) > index->n_fields
+ && !trx_rw_is_active(row_get_rec_trx_id(rec, index, offsets),
+ NULL, false)) {
+ goto inconsistent;
+ }
+
+ for (unsigned i = index->n_core_fields; i < index->n_fields; i++) {
+ ulint len;
+ const byte* data = rec_get_nth_field(rec, offsets, i, &len);
+ dict_col_t* col = index->fields[i].col;
+ ut_ad(!col->is_instant());
+ ut_ad(!col->def_val.data);
+ col->def_val.len = len;
+ switch (len) {
+ case UNIV_SQL_NULL:
+ continue;
+ case 0:
+ col->def_val.data = field_ref_zero;
+ continue;
+ }
+ ut_ad(len != UNIV_SQL_DEFAULT);
+ if (!rec_offs_nth_extern(offsets, i)) {
+ col->def_val.data = mem_heap_dup(
+ index->table->heap, data, len);
+ } else if (len < BTR_EXTERN_FIELD_REF_SIZE
+ || !memcmp(data + len - BTR_EXTERN_FIELD_REF_SIZE,
+ field_ref_zero,
+ BTR_EXTERN_FIELD_REF_SIZE)) {
+ col->def_val.len = UNIV_SQL_DEFAULT;
+ goto inconsistent;
+ } else {
+ col->def_val.data = btr_copy_externally_stored_field(
+ &col->def_val.len, data,
+ dict_table_page_size(index->table),
+ len, index->table->heap);
+ }
+ }
+
+ mem_heap_free(heap);
+ return DB_SUCCESS;
+}
+
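The loop above distinguishes three cases per instantly added column when capturing its default from the 'default row' record: SQL NULL (only the length is recorded), a zero-length value (pointed at the shared `field_ref_zero` buffer instead of allocating), and an ordinary stored value (copied into the table heap). A minimal sketch of that branching, with illustrative stand-in names (`UNIV_SQL_NULL_MODEL`, `capture_default()`) rather than InnoDB's real types:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Stand-in for InnoDB's UNIV_SQL_NULL length marker (an assumption,
// chosen only so NULL is distinguishable from any real length).
constexpr std::size_t UNIV_SQL_NULL_MODEL = ~std::size_t(0);

struct def_val_model {
	std::size_t len;
	std::optional<std::string> data;  // nullopt == no buffer allocated
};

// Mirror the three cases in btr_cur_instant_init_low():
// NULL default, empty value, regular stored value.
def_val_model capture_default(const char* field, std::size_t len)
{
	if (len == UNIV_SQL_NULL_MODEL) {
		return {len, std::nullopt};        // NULL: record length only
	}
	if (len == 0) {
		// Empty value: the real code points at the shared
		// field_ref_zero buffer instead of copying.
		return {0, std::string()};
	}
	return {len, std::string(field, len)}; // copy into the table heap
}
```

The externally stored (BLOB) case is omitted here; the real code additionally validates the field reference and fetches the value with `btr_copy_externally_stored_field()`.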
+/** Load the instant ALTER TABLE metadata from the clustered index
+when loading a table definition.
+@param[in,out] table table definition from the data dictionary
+@return error code
+@retval DB_SUCCESS if no error occurred
+@retval DB_CORRUPTION if any corruption was noticed */
+dberr_t
+btr_cur_instant_init(dict_table_t* table)
+{
+ mtr_t mtr;
+ dict_index_t* index = dict_table_get_first_index(table);
+ mtr.start();
+ dberr_t err = index
+ ? btr_cur_instant_init_low(index, &mtr)
+ : DB_CORRUPTION;
+ mtr.commit();
+ return(err);
+}
+
+/** Initialize the n_core_null_bytes on first access to a clustered
+index root page.
+@param[in] index clustered index that is on its first access
+@param[in] page clustered index root page
+@return whether the page is corrupted */
+bool
+btr_cur_instant_root_init(dict_index_t* index, const page_t* page)
+{
+ ut_ad(page_is_root(page));
+ ut_ad(!page_is_comp(page) == !dict_table_is_comp(index->table));
+ ut_ad(index->is_clust());
+ ut_ad(!index->is_instant());
+ ut_ad(index->table->supports_instant());
+ /* This is normally executed as part of btr_cur_instant_init()
+ when dict_load_table_one() is loading a table definition.
+ Other threads should not access or modify the n_core_null_bytes,
+ n_core_fields before dict_load_table_one() returns.
+
+ This can also be executed during IMPORT TABLESPACE, where the
+ table definition is exclusively locked. */
+
+ switch (fil_page_get_type(page)) {
+ default:
+ ut_ad(!"wrong page type");
+ return true;
+ case FIL_PAGE_INDEX:
+ /* The field PAGE_INSTANT is guaranteed 0 on clustered
+ index root pages of ROW_FORMAT=COMPACT or
+ ROW_FORMAT=DYNAMIC when instant ADD COLUMN is not used. */
+ ut_ad(!page_is_comp(page) || !page_get_instant(page));
+ index->n_core_null_bytes = UT_BITS_IN_BYTES(index->n_nullable);
+ return false;
+ case FIL_PAGE_TYPE_INSTANT:
+ break;
+ }
+
+ uint16_t n = page_get_instant(page);
+ if (n < index->n_uniq + DATA_ROLL_PTR || n > index->n_fields) {
+ /* The PRIMARY KEY (or hidden DB_ROW_ID) and
+ DB_TRX_ID,DB_ROLL_PTR columns must always be present
+ as 'core' fields. All fields, including those for
+ instantly added columns, must be present in the data
+ dictionary. */
+ return true;
+ }
+ index->n_core_fields = n;
+ ut_ad(!index->is_dummy);
+ ut_d(index->is_dummy = true);
+ index->n_core_null_bytes = n == index->n_fields
+ ? UT_BITS_IN_BYTES(index->n_nullable)
+ : UT_BITS_IN_BYTES(index->get_n_nullable(n));
+ ut_d(index->is_dummy = false);
+ return false;
+}
+
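The bounds check on the PAGE_INSTANT field can be illustrated in isolation. This is a simplified sketch, not InnoDB's real code: `index_model` stands in for `dict_index_t`, and the constant 2 plays the role of `DATA_ROLL_PTR` (the real check `n_uniq + DATA_ROLL_PTR` relies on that constant equalling 2, counting the DB_TRX_ID and DB_ROLL_PTR columns):

```cpp
#include <cassert>

// DB_TRX_ID and DB_ROLL_PTR must always be 'core' fields.
constexpr unsigned DATA_ROLL_PTR = 2;

// Stand-in for UT_BITS_IN_BYTES(): bytes needed for b null-flag bits.
constexpr unsigned ut_bits_in_bytes(unsigned b) { return (b + 7) / 8; }

struct index_model {
	unsigned n_uniq;      // PRIMARY KEY (or hidden DB_ROW_ID) fields
	unsigned n_fields;    // all fields known to the data dictionary
};

// Mirror of the PAGE_INSTANT validation in btr_cur_instant_root_init():
// the 'core' field count n must cover the key and system columns, and
// cannot exceed the dictionary definition. Returns true on corruption.
bool instant_field_count_is_corrupted(const index_model& index, unsigned n)
{
	return n < index.n_uniq + DATA_ROLL_PTR || n > index.n_fields;
}
```

When the count is valid, the null-bitmap size for the core fields is then derived with `ut_bits_in_bytes()` over the number of nullable core fields, as the function above does via `index->get_n_nullable(n)`.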
/** Optimistically latches the leaf page or pages requested.
@param[in] block guessed buffer block
@param[in] modify_clock modify clock value
@@ -923,6 +1128,7 @@ btr_cur_search_to_nth_level(
will have to check it again. */
&& btr_search_enabled
&& !modify_external
+ && !(tuple->info_bits & REC_INFO_MIN_REC_FLAG)
&& rw_lock_get_writer(btr_get_search_latch(index))
== RW_LOCK_NOT_LOCKED
&& btr_search_guess_on_hash(index, info, tuple, mode,
@@ -1308,14 +1514,14 @@ retry_page_get:
ut_ad(fil_page_index_page_check(page));
ut_ad(index->id == btr_page_get_index_id(page));
- if (UNIV_UNLIKELY(height == ULINT_UNDEFINED)) {
+ if (height == ULINT_UNDEFINED) {
/* We are in the root node */
height = btr_page_get_level(page, mtr);
root_height = height;
cursor->tree_height = root_height + 1;
- if (dict_index_is_spatial(index)) {
+ if (UNIV_UNLIKELY(dict_index_is_spatial(index))) {
ut_ad(cursor->rtr_info);
node_seq_t seq_no = rtr_get_current_ssn_id(index);
@@ -1480,6 +1686,7 @@ retry_page_get:
}
#ifdef BTR_CUR_HASH_ADAPT
} else if (height == 0 && btr_search_enabled
+ && !(tuple->info_bits & REC_INFO_MIN_REC_FLAG)
&& !dict_index_is_spatial(index)) {
/* The adaptive hash index is only used when searching
for leaf pages (height==0), but not in r-trees.
@@ -1965,11 +2172,21 @@ need_opposite_intention:
will properly check btr_search_enabled again in
btr_search_build_page_hash_index() before building a
page hash index, while holding search latch. */
- if (btr_search_enabled
+ if (!btr_search_enabled) {
# ifdef MYSQL_INDEX_DISABLE_AHI
- && !index->disable_ahi
+ } else if (index->disable_ahi) {
# endif
- ) {
+ } else if (tuple->info_bits & REC_INFO_MIN_REC_FLAG) {
+ ut_ad(index->is_instant());
+ /* This may be a search tuple for
+ btr_pcur_restore_position(). */
+ ut_ad(tuple->info_bits == REC_INFO_DEFAULT_ROW
+ || tuple->info_bits == REC_INFO_MIN_REC_FLAG);
+ } else if (rec_is_default_row(btr_cur_get_rec(cursor),
+ index)) {
+ /* Only user records belong in the adaptive
+ hash index. */
+ } else {
btr_search_info_update(index, cursor);
}
#endif /* BTR_CUR_HASH_ADAPT */
@@ -3081,6 +3298,10 @@ fail_err:
# ifdef MYSQL_INDEX_DISABLE_AHI
} else if (index->disable_ahi) {
# endif
+ } else if (entry->info_bits & REC_INFO_MIN_REC_FLAG) {
+ ut_ad(entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(index->is_instant());
+ ut_ad(flags == BTR_NO_LOCKING_FLAG);
} else if (!reorg && cursor->flag == BTR_CUR_HASH) {
btr_search_update_hash_node_on_insert(cursor);
} else {
@@ -3284,7 +3505,14 @@ btr_cur_pessimistic_insert(
# ifdef MYSQL_INDEX_DISABLE_AHI
if (index->disable_ahi); else
# endif
+ if (entry->info_bits & REC_INFO_MIN_REC_FLAG) {
+ ut_ad(entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(index->is_instant());
+ ut_ad((flags & ~BTR_KEEP_IBUF_BITMAP)
+ == BTR_NO_LOCKING_FLAG);
+ } else {
btr_search_update_hash_on_insert(cursor);
+ }
#endif /* BTR_CUR_HASH_ADAPT */
if (inherit && !(flags & BTR_NO_LOCKING_FLAG)) {
@@ -3557,7 +3785,8 @@ btr_cur_update_alloc_zip_func(
goto out_of_space;
}
- rec_offs_make_valid(page_cur_get_rec(cursor), index, offsets);
+ rec_offs_make_valid(page_cur_get_rec(cursor), index,
+ page_is_leaf(page), offsets);
/* After recompressing a page, we must make sure that the free
bits in the insert buffer bitmap will not exceed the free
@@ -3635,6 +3864,7 @@ btr_cur_update_in_place(
| BTR_CREATE_FLAG | BTR_KEEP_SYS_FLAG));
ut_ad(fil_page_index_page_check(btr_cur_get_page(cursor)));
ut_ad(btr_page_get_index_id(btr_cur_get_page(cursor)) == index->id);
+ ut_ad(!(update->info_bits & REC_INFO_MIN_REC_FLAG));
DBUG_LOG("ib_cur",
"update-in-place " << index->name << " (" << index->id
@@ -3695,7 +3925,7 @@ btr_cur_update_in_place(
if (!dict_index_is_clust(index)
|| row_upd_changes_ord_field_binary(index, update, thr,
NULL, NULL)) {
-
+ ut_ad(!(update->info_bits & REC_INFO_MIN_REC_FLAG));
/* Remove possible hash index pointer to this record */
btr_search_update_hash_on_delete(cursor);
}
@@ -3742,6 +3972,60 @@ func_exit:
return(err);
}
+/** Trim an update tuple due to instant ADD COLUMN, if needed.
+For normal records, the trailing instantly added fields that match
+the 'default row' are omitted.
+
+For the special 'default row' record on a table on which instant
+ADD COLUMN has already been executed, both ADD COLUMN and the
+rollback of ADD COLUMN need to be handled specially.
+
+@param[in,out] entry index entry
+@param[in] index index
+@param[in] update update vector
+@param[in] thr execution thread */
+static inline
+void
+btr_cur_trim(
+ dtuple_t* entry,
+ const dict_index_t* index,
+ const upd_t* update,
+ const que_thr_t* thr)
+{
+ if (!index->is_instant()) {
+ } else if (UNIV_UNLIKELY(update->info_bits == REC_INFO_DEFAULT_ROW)) {
+ /* We are either updating a 'default row'
+ (instantly adding columns to a table where instant ADD was
+ already executed) or rolling back such an operation. */
+ ut_ad(!upd_get_nth_field(update, 0)->orig_len);
+ ut_ad(upd_get_nth_field(update, 0)->field_no
+ > index->n_core_fields);
+
+ if (thr->graph->trx->in_rollback) {
+ /* This rollback can occur either as part of
+ ha_innobase::commit_inplace_alter_table() rolling
+ back after a failed innobase_add_instant_try(),
+ or as part of crash recovery. Either way, the
+ table will be in the data dictionary cache, with
+ the instantly added columns going to be removed
+ later in the rollback. */
+ ut_ad(index->table->cached);
+ /* The DB_TRX_ID,DB_ROLL_PTR are always last,
+ and there should be some change to roll back.
+ The first field in the update vector is the
+ first instantly added column logged by
+ innobase_add_instant_try(). */
+ ut_ad(update->n_fields > 2);
+ ulint n_fields = upd_get_nth_field(update, 0)
+ ->field_no;
+ ut_ad(n_fields + 1 >= entry->n_fields);
+ entry->n_fields = n_fields;
+ }
+ } else {
+ entry->trim(*index);
+ }
+}
+
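For the normal-record branch, `entry->trim(*index)` drops trailing instantly added fields whose values equal the stored defaults, since those need not be written to the record. A minimal standalone sketch of that idea, under the assumption that fields are compared against the column defaults captured from the 'default row' (types and names here are illustrative, not InnoDB's):

```cpp
#include <cassert>
#include <string>
#include <vector>

struct col_model {
	std::string def_val;  // default captured from the 'default row'
};

struct trim_index_model {
	std::size_t n_core_fields;       // fields present before instant ADD
	std::vector<col_model> cols;     // one entry per index field
};

// Drop trailing instantly-added fields that match their defaults;
// core fields are never trimmed.
void trim_entry(std::vector<std::string>& entry,
		const trim_index_model& index)
{
	std::size_t n = entry.size();
	while (n > index.n_core_fields
	       && entry[n - 1] == index.cols[n - 1].def_val) {
		n--;  // the default value is implied; no need to store it
	}
	entry.resize(n);
}
```

Note that trimming stops at the first non-default field from the end: a default-valued field sandwiched before a non-default one must still be stored, because fields are addressed positionally.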
/*************************************************************//**
Tries to update a record on a page in an index tree. It is assumed that mtr
holds an x-latch on the page. The operation does not succeed if there is too
@@ -3817,7 +4101,11 @@ btr_cur_optimistic_update(
|| trx_is_recv(thr_get_trx(thr)));
#endif /* UNIV_DEBUG || UNIV_BLOB_LIGHT_DEBUG */
- if (!row_upd_changes_field_size_or_external(index, *offsets, update)) {
+ const bool is_default_row = update->info_bits == REC_INFO_DEFAULT_ROW;
+
+ if (UNIV_LIKELY(!is_default_row)
+ && !row_upd_changes_field_size_or_external(index, *offsets,
+ update)) {
/* The simplest and the most common case: the update does not
change the size of any field and none of the updated fields is
@@ -3870,7 +4158,8 @@ any_extern:
corresponding to new_entry is latched in mtr.
Thus the following call is safe. */
row_upd_index_replace_new_col_vals_index_pos(new_entry, index, update,
- FALSE, *heap);
+ *heap);
+ btr_cur_trim(new_entry, index, update, thr);
old_rec_size = rec_offs_size(*offsets);
new_rec_size = rec_get_converted_size(index, new_entry, 0);
@@ -3974,7 +4263,16 @@ any_extern:
lock_rec_store_on_page_infimum(block, rec);
}
- btr_search_update_hash_on_delete(cursor);
+ if (UNIV_UNLIKELY(is_default_row)) {
+ ut_ad(new_entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(index->is_instant());
+ /* This can be innobase_add_instant_try() performing a
+ subsequent instant ADD COLUMN, or its rollback by
+ row_undo_mod_clust_low(). */
+ ut_ad(flags & BTR_NO_LOCKING_FLAG);
+ } else {
+ btr_search_update_hash_on_delete(cursor);
+ }
page_cur_delete_rec(page_cursor, index, *offsets, mtr);
@@ -3992,8 +4290,14 @@ any_extern:
cursor, new_entry, offsets, heap, 0/*n_ext*/, mtr);
ut_a(rec); /* <- We calculated above the insert would fit */
- /* Restore the old explicit lock state on the record */
- if (!dict_table_is_locking_disabled(index->table)) {
+ if (UNIV_UNLIKELY(is_default_row)) {
+ /* We must empty the PAGE_FREE list, because if this
+ was a rollback, the shortened 'default row' record
+ would have too many fields, and we would be unable to
+ know the size of the freed record. */
+ btr_page_reorganize(page_cursor, index, mtr);
+ } else if (!dict_table_is_locking_disabled(index->table)) {
+ /* Restore the old explicit lock state on the record */
lock_rec_restore_from_page_infimum(block, rec, block);
}
@@ -4190,7 +4494,11 @@ btr_cur_pessimistic_update(
purge would also have removed the clustered index record
itself. Thus the following call is safe. */
row_upd_index_replace_new_col_vals_index_pos(new_entry, index, update,
- FALSE, entry_heap);
+ entry_heap);
+ btr_cur_trim(new_entry, index, update, thr);
+
+ const bool is_default_row = new_entry->info_bits
+ & REC_INFO_MIN_REC_FLAG;
/* We have to set appropriate extern storage bits in the new
record to be inserted: we have to remember which fields were such */
@@ -4287,19 +4595,30 @@ btr_cur_pessimistic_update(
page, 1);
}
- /* Store state of explicit locks on rec on the page infimum record,
- before deleting rec. The page infimum acts as a dummy carrier of the
- locks, taking care also of lock releases, before we can move the locks
- back on the actual record. There is a special case: if we are
- inserting on the root page and the insert causes a call of
- btr_root_raise_and_insert. Therefore we cannot in the lock system
- delete the lock structs set on the root page even if the root
- page carries just node pointers. */
- if (!dict_table_is_locking_disabled(index->table)) {
- lock_rec_store_on_page_infimum(block, rec);
- }
+ if (UNIV_UNLIKELY(is_default_row)) {
+ ut_ad(new_entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(index->is_instant());
+ /* This can be innobase_add_instant_try() performing a
+ subsequent instant ADD COLUMN, or its rollback by
+ row_undo_mod_clust_low(). */
+ ut_ad(flags & BTR_NO_LOCKING_FLAG);
+ } else {
+ btr_search_update_hash_on_delete(cursor);
- btr_search_update_hash_on_delete(cursor);
+ /* Store state of explicit locks on rec on the page
+ infimum record, before deleting rec. The page infimum
+ acts as a dummy carrier of the locks, taking care also
+ of lock releases, before we can move the locks back on
+ the actual record. There is a special case: if we are
+ inserting on the root page and the insert causes a
+ call of btr_root_raise_and_insert. Therefore we cannot
+ in the lock system delete the lock structs set on the
+ root page even if the root page carries just node
+ pointers. */
+ if (!dict_table_is_locking_disabled(index->table)) {
+ lock_rec_store_on_page_infimum(block, rec);
+ }
+ }
#ifdef UNIV_ZIP_DEBUG
ut_a(!page_zip || page_zip_validate(page_zip, page, index));
@@ -4316,7 +4635,14 @@ btr_cur_pessimistic_update(
if (rec) {
page_cursor->rec = rec;
- if (!dict_table_is_locking_disabled(index->table)) {
+ if (UNIV_UNLIKELY(is_default_row)) {
+ /* We must empty the PAGE_FREE list, because if this
+ was a rollback, the shortened 'default row' record
+ would have too many fields, and we would be unable to
+ know the size of the freed record. */
+ btr_page_reorganize(page_cursor, index, mtr);
+ rec = page_cursor->rec;
+ } else if (!dict_table_is_locking_disabled(index->table)) {
lock_rec_restore_from_page_infimum(
btr_cur_get_block(cursor), rec, block);
}
@@ -4333,11 +4659,12 @@ btr_cur_pessimistic_update(
}
bool adjust = big_rec_vec && (flags & BTR_KEEP_POS_FLAG);
+ ut_ad(!adjust || page_is_leaf(page));
if (btr_cur_compress_if_useful(cursor, adjust, mtr)) {
if (adjust) {
- rec_offs_make_valid(
- page_cursor->rec, index, *offsets);
+ rec_offs_make_valid(page_cursor->rec, index,
+ true, *offsets);
}
} else if (!dict_index_is_clust(index)
&& page_is_leaf(page)) {
@@ -4463,7 +4790,14 @@ btr_cur_pessimistic_update(
ut_ad(row_get_rec_trx_id(rec, index, *offsets));
}
- if (!dict_table_is_locking_disabled(index->table)) {
+ if (UNIV_UNLIKELY(is_default_row)) {
+ /* We must empty the PAGE_FREE list, because if this
+ was a rollback, the shortened 'default row' record
+ would have too many fields, and we would be unable to
+ know the size of the freed record. */
+ btr_page_reorganize(page_cursor, index, mtr);
+ rec = page_cursor->rec;
+ } else if (!dict_table_is_locking_disabled(index->table)) {
lock_rec_restore_from_page_infimum(
btr_cur_get_block(cursor), rec, block);
}
@@ -4957,6 +5291,48 @@ btr_cur_optimistic_delete_func(
|| (flags & BTR_CREATE_FLAG));
rec = btr_cur_get_rec(cursor);
+
+ if (UNIV_UNLIKELY(page_is_root(block->frame)
+ && page_get_n_recs(block->frame) == 1
+ + (cursor->index->is_instant()
+ && !rec_is_default_row(rec, cursor->index)))) {
+ /* The whole index (and table) becomes logically empty.
+ Empty the whole page. That is, if we are deleting the
+ only user record, also delete the 'default row' record
+ if one exists (it exists if and only if is_instant()).
+ If we are deleting the 'default row' record and the
+ table becomes empty, clean up the whole page. */
+ dict_index_t* index = cursor->index;
+ ut_ad(!index->is_instant()
+ || rec_is_default_row(
+ page_rec_get_next_const(
+ page_get_infimum_rec(block->frame)),
+ index));
+ if (UNIV_UNLIKELY(rec_get_info_bits(rec, page_rec_is_comp(rec))
+ & REC_INFO_MIN_REC_FLAG)) {
+ /* This should be rolling back instant ADD COLUMN.
+ If this is a recovered transaction, then
+ index->is_instant() will hold until the
+ insert into SYS_COLUMNS is rolled back. */
+ ut_ad(index->table->supports_instant());
+ ut_ad(index->is_clust());
+ } else {
+ lock_update_delete(block, rec);
+ }
+ btr_page_empty(block, buf_block_get_page_zip(block),
+ index, 0, mtr);
+ page_cur_set_after_last(block, btr_cur_get_page_cur(cursor));
+
+ if (index->is_clust()) {
+ /* Concurrent access is prevented by
+ root_block->lock X-latch, so this should be
+ safe. */
+ index->remove_instant();
+ }
+
+ return true;
+ }
+
offsets = rec_get_offsets(rec, cursor->index, offsets, true,
ULINT_UNDEFINED, &heap);
@@ -4969,9 +5345,29 @@ btr_cur_optimistic_delete_func(
page_t* page = buf_block_get_frame(block);
page_zip_des_t* page_zip= buf_block_get_page_zip(block);
- lock_update_delete(block, rec);
+ if (UNIV_UNLIKELY(rec_get_info_bits(rec, page_rec_is_comp(rec))
+ & REC_INFO_MIN_REC_FLAG)) {
+ /* This should be rolling back instant ADD COLUMN.
+ If this is a recovered transaction, then
+ index->is_instant() will hold until the
+ insert into SYS_COLUMNS is rolled back. */
+ ut_ad(cursor->index->table->supports_instant());
+ ut_ad(cursor->index->is_clust());
+ ut_ad(!page_zip);
+ page_cur_delete_rec(btr_cur_get_page_cur(cursor),
+ cursor->index, offsets, mtr);
+ /* We must empty the PAGE_FREE list, because
+ after rollback, this deleted 'default row' record
+ would have too many fields, and we would be
+ unable to know the size of the freed record. */
+ btr_page_reorganize(btr_cur_get_page_cur(cursor),
+ cursor->index, mtr);
+ goto func_exit;
+ } else {
+ lock_update_delete(block, rec);
- btr_search_update_hash_on_delete(cursor);
+ btr_search_update_hash_on_delete(cursor);
+ }
if (page_zip) {
#ifdef UNIV_ZIP_DEBUG
@@ -5011,6 +5407,7 @@ btr_cur_optimistic_delete_func(
btr_cur_prefetch_siblings(block);
}
+func_exit:
if (UNIV_LIKELY_NULL(heap)) {
mem_heap_free(heap);
}
@@ -5112,27 +5509,79 @@ btr_cur_pessimistic_delete(
#endif /* UNIV_ZIP_DEBUG */
}
- if (flags == 0) {
- lock_update_delete(block, rec);
- }
-
- if (UNIV_UNLIKELY(page_get_n_recs(page) < 2)
- && UNIV_UNLIKELY(dict_index_get_page(index)
- != block->page.id.page_no())) {
-
- /* If there is only one record, drop the whole page in
- btr_discard_page, if this is not the root page */
+ if (page_is_leaf(page)) {
+ const bool is_default_row = rec_get_info_bits(
+ rec, page_rec_is_comp(rec)) & REC_INFO_MIN_REC_FLAG;
+ if (UNIV_UNLIKELY(is_default_row)) {
+ /* This should be rolling back instant ADD COLUMN.
+ If this is a recovered transaction, then
+ index->is_instant() will hold until the
+ insert into SYS_COLUMNS is rolled back. */
+ ut_ad(rollback);
+ ut_ad(index->table->supports_instant());
+ ut_ad(index->is_clust());
+ } else if (flags == 0) {
+ lock_update_delete(block, rec);
+ }
+
+ if (!page_is_root(page)) {
+ if (page_get_n_recs(page) < 2) {
+ goto discard_page;
+ }
+ } else if (page_get_n_recs(page) == 1
+ + (index->is_instant()
+ && !rec_is_default_row(rec, index))) {
+ /* The whole index (and table) becomes logically empty.
+ Empty the whole page. That is, if we are deleting the
+ only user record, also delete the 'default row' record
+ if one exists (it exists if and only if is_instant()).
+ If we are deleting the 'default row' record and the
+ table becomes empty, clean up the whole page. */
+ ut_ad(!index->is_instant()
+ || rec_is_default_row(
+ page_rec_get_next_const(
+ page_get_infimum_rec(page)),
+ index));
+ btr_page_empty(block, page_zip, index, 0, mtr);
+ page_cur_set_after_last(block,
+ btr_cur_get_page_cur(cursor));
+ if (index->is_clust()) {
+ /* Concurrent access is prevented by
+ index->lock and root_block->lock
+ X-latch, so this should be safe. */
+ index->remove_instant();
+ }
+ ret = TRUE;
+ goto return_after_reservations;
+ }
- btr_discard_page(cursor, mtr);
+ if (UNIV_LIKELY(!is_default_row)) {
+ btr_search_update_hash_on_delete(cursor);
+ } else {
+ page_cur_delete_rec(btr_cur_get_page_cur(cursor),
+ index, offsets, mtr);
+ /* We must empty the PAGE_FREE list, because
+ after rollback, this deleted 'default row' record
+ would carry too many fields, and we would be
+ unable to know the size of the freed record. */
+ btr_page_reorganize(btr_cur_get_page_cur(cursor),
+ index, mtr);
+ ut_ad(!ret);
+ goto return_after_reservations;
+ }
+ } else if (UNIV_UNLIKELY(page_rec_is_first(rec, page))) {
+ if (page_rec_is_last(rec, page)) {
+discard_page:
+ ut_ad(page_get_n_recs(page) == 1);
+ /* If there is only one record, drop
+ the whole page. */
- ret = TRUE;
+ btr_discard_page(cursor, mtr);
- goto return_after_reservations;
- }
+ ret = TRUE;
+ goto return_after_reservations;
+ }
- if (page_is_leaf(page)) {
- btr_search_update_hash_on_delete(cursor);
- } else if (UNIV_UNLIKELY(page_rec_is_first(rec, page))) {
rec_t* next_rec = page_rec_get_next(rec);
if (btr_page_get_prev(page, mtr) == FIL_NULL) {
@@ -5185,9 +5634,9 @@ btr_cur_pessimistic_delete(
on a page, we have to change the parent node pointer
so that it is equal to the new leftmost node pointer
on the page */
- ulint level = btr_page_get_level(page, mtr);
btr_node_ptr_delete(index, block, mtr);
+ const ulint level = btr_page_get_level(page, mtr);
dtuple_t* node_ptr = dict_index_build_node_ptr(
index, next_rec, block->page.id.page_no(),
@@ -5205,10 +5654,10 @@ btr_cur_pessimistic_delete(
ut_a(!page_zip || page_zip_validate(page_zip, page, index));
#endif /* UNIV_ZIP_DEBUG */
+return_after_reservations:
/* btr_check_node_ptr() needs parent block latched */
ut_ad(!parent_latched || btr_check_node_ptr(index, block, mtr));
-return_after_reservations:
*err = DB_SUCCESS;
mem_heap_free(heap);
@@ -6055,7 +6504,7 @@ btr_estimate_number_of_different_key_vals(
page = btr_cur_get_page(&cursor);
rec = page_rec_get_next(page_get_infimum_rec(page));
- ut_d(const bool is_leaf = page_is_leaf(page));
+ const bool is_leaf = page_is_leaf(page);
if (!page_rec_is_supremum(rec)) {
not_empty_flag = 1;
@@ -6207,7 +6656,7 @@ btr_rec_get_field_ref_offs(
ut_a(rec_offs_nth_extern(offsets, n));
field_ref_offs = rec_get_nth_field_offs(offsets, n, &local_len);
- ut_a(local_len != UNIV_SQL_NULL);
+ ut_a(len_is_stored(local_len));
ut_a(local_len >= BTR_EXTERN_FIELD_REF_SIZE);
return(field_ref_offs + local_len - BTR_EXTERN_FIELD_REF_SIZE);
@@ -6616,8 +7065,8 @@ struct btr_blob_log_check_t {
*m_block = btr_pcur_get_block(m_pcur);
*m_rec = btr_pcur_get_rec(m_pcur);
- ut_d(rec_offs_make_valid(
- *m_rec, index, const_cast<ulint*>(m_offsets)));
+ rec_offs_make_valid(*m_rec, index, true,
+ const_cast<ulint*>(m_offsets));
ut_ad(m_mtr->memo_contains_page_flagged(
*m_rec,
diff --git a/storage/innobase/btr/btr0sea.cc b/storage/innobase/btr/btr0sea.cc
index 750c2506ff5..6613e4c59f8 100644
--- a/storage/innobase/btr/btr0sea.cc
+++ b/storage/innobase/btr/btr0sea.cc
@@ -80,11 +80,78 @@ btr_search_sys_t* btr_search_sys;
/** If the number of records on the page divided by this parameter
would have been successfully accessed using a hash index, the index
is then built on the page, assuming the global limit has been reached */
-#define BTR_SEARCH_PAGE_BUILD_LIMIT 16
+#define BTR_SEARCH_PAGE_BUILD_LIMIT 16U
/** The global limit for consecutive potentially successful hash searches,
before hash index building is started */
-#define BTR_SEARCH_BUILD_LIMIT 100
+#define BTR_SEARCH_BUILD_LIMIT 100U
+
+/** Compute a hash value of a record in a page.
+@param[in] rec index record
+@param[in] offsets return value of rec_get_offsets()
+@param[in] n_fields number of complete fields to fold
+@param[in] n_bytes number of bytes to fold in the last field
+@param[in] tree_id index tree ID
+@return the hash value */
+static inline
+ulint
+rec_fold(
+ const rec_t* rec,
+ const ulint* offsets,
+ ulint n_fields,
+ ulint n_bytes,
+ index_id_t tree_id)
+{
+ ulint i;
+ const byte* data;
+ ulint len;
+ ulint fold;
+ ulint n_fields_rec;
+
+ ut_ad(rec_offs_validate(rec, NULL, offsets));
+ ut_ad(rec_validate(rec, offsets));
+ ut_ad(page_rec_is_leaf(rec));
+ ut_ad(!page_rec_is_default_row(rec));
+ ut_ad(n_fields > 0 || n_bytes > 0);
+
+ n_fields_rec = rec_offs_n_fields(offsets);
+ ut_ad(n_fields <= n_fields_rec);
+ ut_ad(n_fields < n_fields_rec || n_bytes == 0);
+
+ if (n_fields > n_fields_rec) {
+ n_fields = n_fields_rec;
+ }
+
+ if (n_fields == n_fields_rec) {
+ n_bytes = 0;
+ }
+
+ fold = ut_fold_ull(tree_id);
+
+ for (i = 0; i < n_fields; i++) {
+ data = rec_get_nth_field(rec, offsets, i, &len);
+
+ if (len != UNIV_SQL_NULL) {
+ fold = ut_fold_ulint_pair(fold,
+ ut_fold_binary(data, len));
+ }
+ }
+
+ if (n_bytes > 0) {
+ data = rec_get_nth_field(rec, offsets, i, &len);
+
+ if (len != UNIV_SQL_NULL) {
+ if (len > n_bytes) {
+ len = n_bytes;
+ }
+
+ fold = ut_fold_ulint_pair(fold,
+ ut_fold_binary(data, len));
+ }
+ }
+
+ return(fold);
+}
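The folding structure above (seed with the index tree ID, fold each complete field, optionally fold a byte prefix of one more field, and let SQL NULL contribute nothing) can be modeled standalone. In this sketch `fold_pair()` and `fold_binary()` are stand-in mixing functions, not InnoDB's actual `ut_fold_*` implementations; only the control flow mirrors `rec_fold()`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

using ulint = std::uint64_t;

// Illustrative mixing functions (assumptions, not ut_fold_ulint_pair()).
static ulint fold_pair(ulint a, ulint b) { return a * 1000003u ^ b; }

static ulint fold_binary(const unsigned char* data, std::size_t len)
{
	ulint f = 0;
	for (std::size_t i = 0; i < len; i++) f = fold_pair(f, data[i]);
	return f;
}

// A field is either SQL NULL (nullopt) or a byte string.
using field = std::optional<std::string>;

// Fold n_fields complete fields plus n_bytes of the next field,
// skipping NULLs, seeded with the tree ID -- mirroring rec_fold().
ulint rec_fold_model(const std::vector<field>& rec,
		     std::size_t n_fields, std::size_t n_bytes,
		     ulint tree_id)
{
	const std::size_t n_fields_rec = rec.size();
	if (n_fields > n_fields_rec) n_fields = n_fields_rec;
	if (n_fields == n_fields_rec) n_bytes = 0;

	ulint fold = fold_pair(tree_id, 0);  // stand-in for ut_fold_ull()

	std::size_t i;
	for (i = 0; i < n_fields; i++) {
		if (rec[i]) {  // NULL fields contribute nothing
			const std::string& d = *rec[i];
			fold = fold_pair(fold, fold_binary(
				reinterpret_cast<const unsigned char*>(
					d.data()), d.size()));
		}
	}
	if (n_bytes > 0 && rec[i]) {
		const std::string& d = *rec[i];
		const std::size_t len = std::min(d.size(), n_bytes);
		fold = fold_pair(fold, fold_binary(
			reinterpret_cast<const unsigned char*>(d.data()),
			len));
	}
	return fold;
}
```

Two records that agree on the folded prefix hash identically, which is exactly why the adaptive hash index must skip the 'default row' record: it is not a user record and must never alias one in the hash table.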
/** Determine the number of accessed key fields.
@param[in] n_fields number of complete fields
@@ -570,7 +637,7 @@ btr_search_update_block_hash_info(
if ((!block->index)
|| (block->n_hash_helps
- > 2 * page_get_n_recs(block->frame))
+ > 2U * page_get_n_recs(block->frame))
|| (block->n_fields != block->curr_n_fields)
|| (block->n_bytes != block->curr_n_bytes)
|| (block->left_side != block->curr_left_side)) {
@@ -1223,6 +1290,9 @@ retry:
rec = page_get_infimum_rec(page);
rec = page_rec_get_next_low(rec, page_is_comp(page));
+ if (rec_is_default_row(rec, index)) {
+ rec = page_rec_get_next_low(rec, page_is_comp(page));
+ }
prev_fold = 0;
@@ -1370,14 +1440,14 @@ btr_search_build_page_hash_index(
{
hash_table_t* table;
page_t* page;
- rec_t* rec;
- rec_t* next_rec;
+ const rec_t* rec;
+ const rec_t* next_rec;
ulint fold;
ulint next_fold;
ulint n_cached;
ulint n_recs;
ulint* folds;
- rec_t** recs;
+ const rec_t** recs;
ulint i;
mem_heap_t* heap = NULL;
ulint offsets_[REC_OFFS_NORMAL_SIZE];
@@ -1438,14 +1508,19 @@ btr_search_build_page_hash_index(
/* Calculate and cache fold values and corresponding records into
an array for fast insertion to the hash index */
- folds = (ulint*) ut_malloc_nokey(n_recs * sizeof(ulint));
- recs = (rec_t**) ut_malloc_nokey(n_recs * sizeof(rec_t*));
+ folds = static_cast<ulint*>(ut_malloc_nokey(n_recs * sizeof *folds));
+ recs = static_cast<const rec_t**>(
+ ut_malloc_nokey(n_recs * sizeof *recs));
n_cached = 0;
ut_a(index->id == btr_page_get_index_id(page));
- rec = page_rec_get_next(page_get_infimum_rec(page));
+ rec = page_rec_get_next_const(page_get_infimum_rec(page));
+
+ if (rec_is_default_row(rec, index)) {
+ rec = page_rec_get_next_const(rec);
+ }
offsets = rec_get_offsets(
rec, index, offsets, true,
@@ -1464,7 +1539,7 @@ btr_search_build_page_hash_index(
}
for (;;) {
- next_rec = page_rec_get_next(rec);
+ next_rec = page_rec_get_next_const(rec);
if (page_rec_is_supremum(next_rec)) {
@@ -1552,47 +1627,37 @@ exit_func:
}
}
-/** Moves or deletes hash entries for moved records. If new_page is already
-hashed, then the hash index for page, if any, is dropped. If new_page is not
-hashed, and page is hashed, then a new hash index is built to new_page with the
-same parameters as page (this often happens when a page is split).
-@param[in,out] new_block records are copied to this page.
-@param[in,out] block index page from which record are copied, and the
- copied records will be deleted from this page.
-@param[in,out] index record descriptor */
+/** Move or delete hash entries for moved records, usually in a page split.
+If new_block is already hashed, then any hash index for block is dropped.
+If new_block is not hashed, and block is hashed, then a new hash index is
+built to new_block with the same parameters as block.
+@param[in,out] new_block destination page
+@param[in,out] block source page (subject to deletion later) */
void
btr_search_move_or_delete_hash_entries(
buf_block_t* new_block,
- buf_block_t* block,
- dict_index_t* index)
+ buf_block_t* block)
{
-#ifdef MYSQL_INDEX_DISABLE_AHI
- if (index->disable_ahi) return;
-#endif
- if (!btr_search_enabled) {
- return;
- }
-
ut_ad(rw_lock_own(&(block->lock), RW_LOCK_X));
ut_ad(rw_lock_own(&(new_block->lock), RW_LOCK_X));
- btr_search_s_lock(index);
+ if (!btr_search_enabled) {
+ return;
+ }
- ut_a(!new_block->index || new_block->index == index);
- ut_a(!block->index || block->index == index);
- ut_a(!(new_block->index || block->index)
- || !dict_index_is_ibuf(index));
assert_block_ahi_valid(block);
assert_block_ahi_valid(new_block);
if (new_block->index) {
-
- btr_search_s_unlock(index);
-
btr_search_drop_page_hash_index(block);
+ return;
+ }
+ dict_index_t* index = block->index;
+ if (!index) {
return;
}
+ btr_search_s_lock(index);
if (block->index) {
ulint n_fields = block->curr_n_fields;
@@ -1834,7 +1899,7 @@ btr_search_update_hash_on_insert(btr_cur_t* cursor)
n_bytes, index->id);
}
- if (!page_rec_is_infimum(rec)) {
+ if (!page_rec_is_infimum(rec) && !rec_is_default_row(rec, index)) {
offsets = rec_get_offsets(
rec, index, offsets, true,
btr_search_get_n_fields(n_fields, n_bytes), &heap);
diff --git a/storage/innobase/buf/buf0buf.cc b/storage/innobase/buf/buf0buf.cc
index 0a4d4d276e9..84ab2015348 100644
--- a/storage/innobase/buf/buf0buf.cc
+++ b/storage/innobase/buf/buf0buf.cc
@@ -1286,6 +1286,7 @@ buf_page_print(const byte* read_buf, const page_size_t& page_size)
switch (fil_page_get_type(read_buf)) {
index_id_t index_id;
case FIL_PAGE_INDEX:
+ case FIL_PAGE_TYPE_INSTANT:
case FIL_PAGE_RTREE:
index_id = btr_page_get_index_id(read_buf);
ib::info() << "Page may be an index page where"
@@ -5628,13 +5629,14 @@ buf_page_monitor(
switch (fil_page_get_type(frame)) {
ulint level;
-
+ case FIL_PAGE_TYPE_INSTANT:
case FIL_PAGE_INDEX:
case FIL_PAGE_RTREE:
level = btr_page_get_level_low(frame);
/* Check if it is an index page for insert buffer */
- if (btr_page_get_index_id(frame)
+ if (fil_page_get_type(frame) == FIL_PAGE_INDEX
+ && btr_page_get_index_id(frame)
== (index_id_t)(DICT_IBUF_ID_MIN + IBUF_SPACE_ID)) {
if (level == 0) {
counter = MONITOR_RW_COUNTER(
diff --git a/storage/innobase/buf/buf0dblwr.cc b/storage/innobase/buf/buf0dblwr.cc
index 2bc3630d3f5..8594efd0c8d 100644
--- a/storage/innobase/buf/buf0dblwr.cc
+++ b/storage/innobase/buf/buf0dblwr.cc
@@ -853,6 +853,7 @@ buf_dblwr_check_block(
switch (fil_page_get_type(block->frame)) {
case FIL_PAGE_INDEX:
+ case FIL_PAGE_TYPE_INSTANT:
case FIL_PAGE_RTREE:
if (page_is_comp(block->frame)) {
if (page_simple_validate_new(block->frame)) {
@@ -885,7 +886,6 @@ buf_dblwr_check_block(
case FIL_PAGE_TYPE_ALLOCATED:
/* empty pages should never be flushed */
return;
- break;
}
buf_dblwr_assert_on_corrupt_block(block);
diff --git a/storage/innobase/buf/buf0flu.cc b/storage/innobase/buf/buf0flu.cc
index 1f41c566945..4f277e907e5 100644
--- a/storage/innobase/buf/buf0flu.cc
+++ b/storage/innobase/buf/buf0flu.cc
@@ -928,6 +928,7 @@ buf_flush_init_for_writing(
default:
switch (page_type) {
case FIL_PAGE_INDEX:
+ case FIL_PAGE_TYPE_INSTANT:
case FIL_PAGE_RTREE:
case FIL_PAGE_UNDO_LOG:
case FIL_PAGE_INODE:
diff --git a/storage/innobase/data/data0data.cc b/storage/innobase/data/data0data.cc
index 9ed4faa8e70..6601edfec9d 100644
--- a/storage/innobase/data/data0data.cc
+++ b/storage/innobase/data/data0data.cc
@@ -42,6 +42,39 @@ to data_error. */
byte data_error;
#endif /* UNIV_DEBUG */
+/** Trim the tail of an index tuple before insert or update.
+After instant ADD COLUMN, if the last fields of a clustered index tuple
+match the 'default row', there will be no need to store them.
+NOTE: A page latch in the index must be held, so that the index
+may not lose 'instantness' before the trimmed tuple has been
+inserted or updated.
+@param[in] index index possibly with instantly added columns */
+void dtuple_t::trim(const dict_index_t& index)
+{
+ ut_ad(n_fields >= index.n_core_fields);
+ ut_ad(n_fields <= index.n_fields);
+ ut_ad(index.is_instant());
+
+ ulint i = n_fields;
+ for (; i > index.n_core_fields; i--) {
+ const dfield_t* dfield = dtuple_get_nth_field(this, i - 1);
+ const dict_col_t* col = dict_index_get_nth_col(&index, i - 1);
+ ut_ad(col->is_instant());
+ ulint len = dfield_get_len(dfield);
+ if (len != col->def_val.len) {
+ break;
+ }
+
+ if (len != 0 && len != UNIV_SQL_NULL
+ && dfield->data != col->def_val.data
+ && memcmp(dfield->data, col->def_val.data, len)) {
+ break;
+ }
+ }
+
+ n_fields = i;
+}
+
/** Compare two data tuples.
@param[in] tuple1 first data tuple
@param[in] tuple2 second data tuple
@@ -813,6 +846,7 @@ dfield_t::clone(mem_heap_t* heap) const
dfield_t* obj = static_cast<dfield_t*>(
mem_heap_alloc(heap, sizeof(dfield_t) + size));
+ ut_ad(len != UNIV_SQL_DEFAULT);
obj->ext = ext;
obj->len = len;
obj->type = type;
diff --git a/storage/innobase/data/data0type.cc b/storage/innobase/data/data0type.cc
index d4b809c3f59..61537fb5aa5 100644
--- a/storage/innobase/data/data0type.cc
+++ b/storage/innobase/data/data0type.cc
@@ -56,7 +56,7 @@ dtype_get_at_most_n_mbchars(
ulint mbminlen = DATA_MBMINLEN(mbminmaxlen);
ulint mbmaxlen = DATA_MBMAXLEN(mbminmaxlen);
- ut_a(data_len != UNIV_SQL_NULL);
+ ut_a(len_is_stored(data_len));
ut_ad(!mbmaxlen || !(prefix_len % mbmaxlen));
if (mbminlen != mbmaxlen) {
diff --git a/storage/innobase/dict/dict0boot.cc b/storage/innobase/dict/dict0boot.cc
index 50a55172c59..29707e5bdc2 100644
--- a/storage/innobase/dict/dict0boot.cc
+++ b/storage/innobase/dict/dict0boot.cc
@@ -351,7 +351,8 @@ dict_boot(void)
table->id = DICT_TABLES_ID;
- dict_table_add_to_cache(table, FALSE, heap);
+ dict_table_add_system_columns(table, heap);
+ table->add_to_cache();
dict_sys->sys_tables = table;
mem_heap_empty(heap);
@@ -369,6 +370,9 @@ dict_boot(void)
MLOG_4BYTES, &mtr),
FALSE);
ut_a(error == DB_SUCCESS);
+ ut_ad(!table->is_instant());
+ table->indexes.start->n_core_null_bytes = UT_BITS_IN_BYTES(
+ table->indexes.start->n_nullable);
/*-------------------------*/
index = dict_mem_index_create("SYS_TABLES", "ID_IND",
@@ -397,7 +401,8 @@ dict_boot(void)
table->id = DICT_COLUMNS_ID;
- dict_table_add_to_cache(table, FALSE, heap);
+ dict_table_add_system_columns(table, heap);
+ table->add_to_cache();
dict_sys->sys_columns = table;
mem_heap_empty(heap);
@@ -415,6 +420,9 @@ dict_boot(void)
MLOG_4BYTES, &mtr),
FALSE);
ut_a(error == DB_SUCCESS);
+ ut_ad(!table->is_instant());
+ table->indexes.start->n_core_null_bytes = UT_BITS_IN_BYTES(
+ table->indexes.start->n_nullable);
/*-------------------------*/
table = dict_mem_table_create("SYS_INDEXES", DICT_HDR_SPACE,
@@ -431,7 +439,8 @@ dict_boot(void)
table->id = DICT_INDEXES_ID;
- dict_table_add_to_cache(table, FALSE, heap);
+ dict_table_add_system_columns(table, heap);
+ table->add_to_cache();
dict_sys->sys_indexes = table;
mem_heap_empty(heap);
@@ -449,6 +458,9 @@ dict_boot(void)
MLOG_4BYTES, &mtr),
FALSE);
ut_a(error == DB_SUCCESS);
+ ut_ad(!table->is_instant());
+ table->indexes.start->n_core_null_bytes = UT_BITS_IN_BYTES(
+ table->indexes.start->n_nullable);
/*-------------------------*/
table = dict_mem_table_create("SYS_FIELDS", DICT_HDR_SPACE, 3, 0, 0, 0);
@@ -459,7 +471,8 @@ dict_boot(void)
table->id = DICT_FIELDS_ID;
- dict_table_add_to_cache(table, FALSE, heap);
+ dict_table_add_system_columns(table, heap);
+ table->add_to_cache();
dict_sys->sys_fields = table;
mem_heap_free(heap);
@@ -477,6 +490,9 @@ dict_boot(void)
MLOG_4BYTES, &mtr),
FALSE);
ut_a(error == DB_SUCCESS);
+ ut_ad(!table->is_instant());
+ table->indexes.start->n_core_null_bytes = UT_BITS_IN_BYTES(
+ table->indexes.start->n_nullable);
mtr_commit(&mtr);
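The repeated `n_core_null_bytes = UT_BITS_IN_BYTES(n_nullable)` lines above size the record header's null-flag bitmap from the number of nullable columns. Assuming the macro performs the usual round-up bits-to-bytes conversion (an assumption about `UT_BITS_IN_BYTES`, sketched here as `bits_in_bytes`):

```cpp
#include <cassert>
#include <cstddef>

// Assumed semantics of UT_BITS_IN_BYTES: the number of bytes needed
// to store b bits, rounded up to a whole byte.
constexpr size_t bits_in_bytes(size_t b)
{
	return (b + 7) / 8;
}
```

So a clustered index with 9 nullable columns needs a 2-byte null bitmap in each record header, while 8 or fewer fit in one byte.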
diff --git a/storage/innobase/dict/dict0crea.cc b/storage/innobase/dict/dict0crea.cc
index 16d57bb67f5..77a79e32605 100644
--- a/storage/innobase/dict/dict0crea.cc
+++ b/storage/innobase/dict/dict0crea.cc
@@ -67,6 +67,7 @@ dict_create_sys_tables_tuple(
ut_ad(table);
ut_ad(heap);
+ ut_ad(table->n_cols >= DATA_N_SYS_COLS);
sys_tables = dict_sys->sys_tables;
@@ -100,7 +101,8 @@ dict_create_sys_tables_tuple(
/* If there is any virtual column, encode it in N_COLS */
mach_write_to_4(ptr, dict_table_encode_n_col(
- static_cast<ulint>(table->n_def),
+ static_cast<ulint>(table->n_cols
+ - DATA_N_SYS_COLS),
static_cast<ulint>(table->n_v_def))
| ((table->flags & DICT_TF_COMPACT) << 31));
dfield_set_data(dfield, ptr, 4);
@@ -480,21 +482,6 @@ dict_build_tablespace_for_table(
return(DB_SUCCESS);
}
-/***************************************************************//**
-Builds a column definition to insert. */
-static
-void
-dict_build_col_def_step(
-/*====================*/
- tab_node_t* node) /*!< in: table create node */
-{
- dtuple_t* row;
-
- row = dict_create_sys_columns_tuple(node->table, node->col_no,
- node->heap);
- ins_node_set_new_row(node->col_def, row);
-}
-
/** Builds a SYS_VIRTUAL row definition to insert.
@param[in] node table create node */
static
@@ -1356,12 +1343,19 @@ dict_create_table_step(
if (node->state == TABLE_BUILD_COL_DEF) {
- if (node->col_no < (static_cast<ulint>(node->table->n_def)
- + static_cast<ulint>(node->table->n_v_def))) {
+ if (node->col_no + DATA_N_SYS_COLS
+ < (static_cast<ulint>(node->table->n_def)
+ + static_cast<ulint>(node->table->n_v_def))) {
- dict_build_col_def_step(node);
+ ulint i = node->col_no++;
+ if (i + DATA_N_SYS_COLS >= node->table->n_def) {
+ i += DATA_N_SYS_COLS;
+ }
- node->col_no++;
+ ins_node_set_new_row(
+ node->col_def,
+ dict_create_sys_columns_tuple(node->table, i,
+ node->heap));
thr->run_node = node->col_def;
@@ -1419,7 +1413,8 @@ dict_create_table_step(
if (node->state == TABLE_ADD_TO_CACHE) {
DBUG_EXECUTE_IF("ib_ddl_crash_during_create", DBUG_SUICIDE(););
- dict_table_add_to_cache(node->table, TRUE, node->heap);
+ node->table->can_be_evicted = true;
+ node->table->add_to_cache();
err = DB_SUCCESS;
}
@@ -1519,6 +1514,14 @@ dict_create_index_step(
goto function_exit;
}
+ ut_ad(!node->index->is_instant());
+ ut_ad(node->index->n_core_null_bytes
+ == ((dict_index_is_clust(node->index)
+ && node->table->supports_instant())
+ ? dict_index_t::NO_CORE_NULL_BYTES
+ : UT_BITS_IN_BYTES(node->index->n_nullable)));
+ node->index->n_core_null_bytes = UT_BITS_IN_BYTES(
+ node->index->n_nullable);
node->state = INDEX_CREATE_INDEX_TREE;
}
diff --git a/storage/innobase/dict/dict0dict.cc b/storage/innobase/dict/dict0dict.cc
index 7627fdac0d5..4313fa16370 100644
--- a/storage/innobase/dict/dict0dict.cc
+++ b/storage/innobase/dict/dict0dict.cc
@@ -618,26 +618,28 @@ dict_table_has_column(
return(col_max);
}
-/**********************************************************************//**
-Returns a column's name.
-@return column name. NOTE: not guaranteed to stay valid if table is
-modified in any way (columns added, etc.). */
-const char*
-dict_table_get_col_name(
-/*====================*/
- const dict_table_t* table, /*!< in: table */
- ulint col_nr) /*!< in: column number */
+/** Retrieve the column name.
+@param[in]	table	the table that this column belongs to */
+const char* dict_col_t::name(const dict_table_t& table) const
{
- ulint i;
- const char* s;
+ ut_ad(table.magic_n == DICT_TABLE_MAGIC_N);
- ut_ad(table);
- ut_ad(col_nr < table->n_def);
- ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
+ size_t col_nr;
+ const char *s;
+
+ if (is_virtual()) {
+ col_nr = reinterpret_cast<const dict_v_col_t*>(this)
+ - table.v_cols;
+ ut_ad(col_nr < table.n_v_def);
+ s = table.v_col_names;
+ } else {
+ col_nr = this - table.cols;
+ ut_ad(col_nr < table.n_def);
+ s = table.col_names;
+ }
- s = table->col_names;
if (s) {
- for (i = 0; i < col_nr; i++) {
+ for (size_t i = 0; i < col_nr; i++) {
s += strlen(s) + 1;
}
}
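`dict_col_t::name()` relies on the layout of `col_names`/`v_col_names`: all names are stored back to back, each NUL-terminated, so reaching entry n means skipping n strings. A standalone sketch of that walk (the `nth_name` helper and `demo_names` data are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Column names packed back to back, each NUL-terminated,
// mirroring the dict_table_t::col_names layout.
static const char demo_names[] = "id\0name\0created";

// Skip n strings to reach the n-th name in the packed list.
const char* nth_name(const char* s, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		s += std::strlen(s) + 1;
	}
	return s;
}
```

This is why the function first computes `col_nr` from pointer arithmetic on `cols`/`v_cols` and only then walks the packed list.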
@@ -1274,41 +1276,31 @@ dict_table_add_system_columns(
#endif
}
-/**********************************************************************//**
-Adds a table object to the dictionary cache. */
+/** Add the table definition to the data dictionary cache */
void
-dict_table_add_to_cache(
-/*====================*/
- dict_table_t* table, /*!< in: table */
- bool can_be_evicted, /*!< in: whether can be evicted */
- mem_heap_t* heap) /*!< in: temporary heap */
+dict_table_t::add_to_cache()
{
- ulint fold;
- ulint id_fold;
-
ut_ad(dict_lru_validate());
ut_ad(mutex_own(&dict_sys->mutex));
- dict_table_add_system_columns(table, heap);
+ cached = TRUE;
- table->cached = TRUE;
-
- fold = ut_fold_string(table->name.m_name);
- id_fold = ut_fold_ull(table->id);
+ ulint fold = ut_fold_string(name.m_name);
+ ulint id_fold = ut_fold_ull(id);
/* Look for a table with the same name: error if such exists */
{
dict_table_t* table2;
HASH_SEARCH(name_hash, dict_sys->table_hash, fold,
dict_table_t*, table2, ut_ad(table2->cached),
- !strcmp(table2->name.m_name, table->name.m_name));
+ !strcmp(table2->name.m_name, name.m_name));
ut_a(table2 == NULL);
#ifdef UNIV_DEBUG
/* Look for the same table pointer with a different name */
HASH_SEARCH_ALL(name_hash, dict_sys->table_hash,
dict_table_t*, table2, ut_ad(table2->cached),
- table2 == table);
+ table2 == this);
ut_ad(table2 == NULL);
#endif /* UNIV_DEBUG */
}
@@ -1318,32 +1310,30 @@ dict_table_add_to_cache(
dict_table_t* table2;
HASH_SEARCH(id_hash, dict_sys->table_id_hash, id_fold,
dict_table_t*, table2, ut_ad(table2->cached),
- table2->id == table->id);
+ table2->id == id);
ut_a(table2 == NULL);
#ifdef UNIV_DEBUG
/* Look for the same table pointer with a different id */
HASH_SEARCH_ALL(id_hash, dict_sys->table_id_hash,
dict_table_t*, table2, ut_ad(table2->cached),
- table2 == table);
+ table2 == this);
ut_ad(table2 == NULL);
#endif /* UNIV_DEBUG */
}
/* Add table to hash table of tables */
HASH_INSERT(dict_table_t, name_hash, dict_sys->table_hash, fold,
- table);
+ this);
/* Add table to hash table of tables based on table id */
HASH_INSERT(dict_table_t, id_hash, dict_sys->table_id_hash, id_fold,
- table);
-
- table->can_be_evicted = can_be_evicted;
+ this);
- if (table->can_be_evicted) {
- UT_LIST_ADD_FIRST(dict_sys->table_LRU, table);
+ if (can_be_evicted) {
+ UT_LIST_ADD_FIRST(dict_sys->table_LRU, this);
} else {
- UT_LIST_ADD_FIRST(dict_sys->table_non_LRU, table);
+ UT_LIST_ADD_FIRST(dict_sys->table_non_LRU, this);
}
ut_ad(dict_lru_validate());
@@ -2469,12 +2459,14 @@ dict_index_add_to_cache_w_vcol(
/* Build the cache internal representation of the index,
containing also the added system fields */
- if (index->type == DICT_FTS) {
- new_index = dict_index_build_internal_fts(table, index);
- } else if (dict_index_is_clust(index)) {
+ if (dict_index_is_clust(index)) {
new_index = dict_index_build_internal_clust(table, index);
} else {
- new_index = dict_index_build_internal_non_clust(table, index);
+ new_index = (index->type & DICT_FTS)
+ ? dict_index_build_internal_fts(table, index)
+ : dict_index_build_internal_non_clust(table, index);
+ new_index->n_core_null_bytes = UT_BITS_IN_BYTES(
+ new_index->n_nullable);
}
/* Set the n_fields value in new_index to the actual defined
@@ -2570,6 +2562,8 @@ dict_index_add_to_cache_w_vcol(
rw_lock_create(index_tree_rw_lock_key, &new_index->lock,
SYNC_INDEX_TREE);
+ new_index->n_core_fields = new_index->n_fields;
+
dict_mem_index_free(index);
return(DB_SUCCESS);
@@ -2824,11 +2818,8 @@ dict_index_add_col(
if (v_col->v_indexes != NULL) {
/* Register the index with the virtual column index
list */
- struct dict_v_idx_t new_idx
- = {index, index->n_def};
-
- v_col->v_indexes->push_back(new_idx);
-
+ v_col->v_indexes->push_back(
+ dict_v_idx_t(index, index->n_def));
}
col_name = dict_table_get_v_col_name_mysql(
@@ -3156,6 +3147,9 @@ dict_index_build_internal_clust(
ut_ad(UT_LIST_GET_LEN(table->indexes) == 0);
+ new_index->n_core_null_bytes = table->supports_instant()
+ ? dict_index_t::NO_CORE_NULL_BYTES
+ : UT_BITS_IN_BYTES(new_index->n_nullable);
new_index->cached = TRUE;
return(new_index);
@@ -5682,21 +5676,14 @@ dict_index_copy_rec_order_prefix(
@param[in,out] heap memory heap for allocation
@return own: data tuple */
dtuple_t*
-dict_index_build_data_tuple_func(
+dict_index_build_data_tuple(
const rec_t* rec,
const dict_index_t* index,
-#ifdef UNIV_DEBUG
bool leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
mem_heap_t* heap)
{
- dtuple_t* tuple;
-
- ut_ad(dict_table_is_comp(index->table)
- || n_fields <= rec_get_n_fields_old(rec));
-
- tuple = dtuple_create(heap, n_fields);
+ dtuple_t* tuple = dtuple_create(heap, n_fields);
dict_index_copy_types(tuple, index, n_fields);
diff --git a/storage/innobase/dict/dict0load.cc b/storage/innobase/dict/dict0load.cc
index 532d7ace740..0e55e353837 100644
--- a/storage/innobase/dict/dict0load.cc
+++ b/storage/innobase/dict/dict0load.cc
@@ -3005,10 +3005,11 @@ err_exit:
dict_load_virtual(table, heap);
+ dict_table_add_system_columns(table, heap);
+
if (cached) {
- dict_table_add_to_cache(table, TRUE, heap);
- } else {
- dict_table_add_system_columns(table, heap);
+ table->can_be_evicted = true;
+ table->add_to_cache();
}
mem_heap_empty(heap);
@@ -3050,6 +3051,11 @@ err_exit:
}
}
+ if (err == DB_SUCCESS && cached && table->is_readable()
+ && table->supports_instant()) {
+ err = btr_cur_instant_init(table);
+ }
+
/* Initialize table foreign_child value. Its value could be
changed when dict_load_foreigns() is called below */
table->fk_max_recusive_level = 0;
diff --git a/storage/innobase/dict/dict0mem.cc b/storage/innobase/dict/dict0mem.cc
index 3666443c1de..5e894e0649e 100644
--- a/storage/innobase/dict/dict0mem.cc
+++ b/storage/innobase/dict/dict0mem.cc
@@ -360,7 +360,7 @@ dict_mem_table_add_v_col(
i, name, heap);
}
- v_col = dict_table_get_nth_v_col(table, i);
+ v_col = &table->v_cols[i];
dict_mem_fill_column_struct(&v_col->m_col, pos, mtype, prtype, len);
v_col->v_pos = i;
@@ -621,6 +621,8 @@ dict_mem_fill_column_struct(
column->mtype = (unsigned int) mtype;
column->prtype = (unsigned int) prtype;
column->len = (unsigned int) col_len;
+ column->def_val.data = NULL;
+ column->def_val.len = UNIV_SQL_DEFAULT;
dtype_get_mblen(mtype, prtype, &mbminlen, &mbmaxlen);
dict_col_set_mbminmaxlen(column, mbminlen, mbmaxlen);
@@ -1129,3 +1131,283 @@ dict_mem_table_is_system(
return true;
}
}
+
+/** Adjust clustered index metadata for instant ADD COLUMN.
+@param[in]	instant	clustered index definition after instant ADD COLUMN */
+inline void dict_index_t::instant_add_field(const dict_index_t& instant)
+{
+ DBUG_ASSERT(is_clust());
+ DBUG_ASSERT(instant.is_clust());
+ DBUG_ASSERT(!instant.is_instant());
+ DBUG_ASSERT(n_def == n_fields);
+ DBUG_ASSERT(instant.n_def == instant.n_fields);
+
+ DBUG_ASSERT(type == instant.type);
+ DBUG_ASSERT(trx_id_offset == instant.trx_id_offset);
+ DBUG_ASSERT(n_user_defined_cols == instant.n_user_defined_cols);
+ DBUG_ASSERT(n_uniq == instant.n_uniq);
+ DBUG_ASSERT(instant.n_fields > n_fields);
+ DBUG_ASSERT(instant.n_def > n_def);
+ DBUG_ASSERT(instant.n_nullable >= n_nullable);
+ DBUG_ASSERT(instant.n_core_fields >= n_core_fields);
+ DBUG_ASSERT(instant.n_core_null_bytes >= n_core_null_bytes);
+
+ n_fields = instant.n_fields;
+ n_def = instant.n_def;
+ n_nullable = instant.n_nullable;
+ fields = static_cast<dict_field_t*>(
+ mem_heap_dup(heap, instant.fields, n_fields * sizeof *fields));
+
+ ut_d(unsigned n_null = 0);
+
+ for (unsigned i = 0; i < n_fields; i++) {
+ DBUG_ASSERT(fields[i].same(instant.fields[i]));
+ const dict_col_t* icol = instant.fields[i].col;
+ DBUG_ASSERT(!icol->is_virtual());
+ dict_col_t* col = fields[i].col = &table->cols[
+ icol - instant.table->cols];
+ fields[i].name = col->name(*table);
+ ut_d(n_null += col->is_nullable());
+ }
+
+ ut_ad(n_null == n_nullable);
+}
+
+/** Adjust metadata for instant ADD COLUMN.
+@param[in] table table definition after instant ADD COLUMN */
+void dict_table_t::instant_add_column(const dict_table_t& table)
+{
+ DBUG_ASSERT(!table.cached);
+ DBUG_ASSERT(table.n_def == table.n_cols);
+ DBUG_ASSERT(table.n_t_def == table.n_t_cols);
+ DBUG_ASSERT(n_def == n_cols);
+ DBUG_ASSERT(n_t_def == n_t_cols);
+ DBUG_ASSERT(table.n_cols > n_cols);
+ ut_ad(mutex_own(&dict_sys->mutex));
+
+ const char* end = table.col_names;
+ for (unsigned i = table.n_cols; i--; ) end += strlen(end) + 1;
+
+ col_names = static_cast<char*>(mem_heap_dup(heap, table.col_names,
+ end - table.col_names));
+ const dict_col_t* const old_cols = cols;
+ const dict_col_t* const old_cols_end = cols + n_cols;
+ cols = static_cast<dict_col_t*>(mem_heap_dup(heap, table.cols,
+ table.n_cols
+ * sizeof *cols));
+
+ /* Preserve the default values of previously instantly
+ added columns. */
+ for (unsigned i = n_cols - DATA_N_SYS_COLS; i--; ) {
+ cols[i].def_val = old_cols[i].def_val;
+ }
+
+ /* Copy the new default values to this->heap. */
+ for (unsigned i = n_cols; i < table.n_cols; i++) {
+ dict_col_t& c = cols[i - DATA_N_SYS_COLS];
+ DBUG_ASSERT(c.is_instant());
+ if (c.def_val.len == 0) {
+ c.def_val.data = field_ref_zero;
+ } else if (const void*& d = c.def_val.data) {
+ d = mem_heap_dup(heap, d, c.def_val.len);
+ } else {
+ DBUG_ASSERT(c.def_val.len == UNIV_SQL_NULL);
+ }
+ }
+
+ const unsigned old_n_cols = n_cols;
+ const unsigned n_add = table.n_cols - n_cols;
+
+ n_t_def += n_add;
+ n_t_cols += n_add;
+ n_cols = table.n_cols;
+ n_def = n_cols;
+
+ for (unsigned i = n_v_def; i--; ) {
+ const dict_v_col_t& v = v_cols[i];
+ for (ulint n = v.num_base; n--; ) {
+ dict_col_t*& base = v.base_col[n];
+ if (!base->is_virtual()) {
+ ptrdiff_t n = base - old_cols;
+ DBUG_ASSERT(n >= 0);
+ DBUG_ASSERT(n < old_n_cols - DATA_N_SYS_COLS);
+ base = &cols[n];
+ }
+ }
+ }
+
+ dict_index_t* index = dict_table_get_first_index(this);
+
+ index->instant_add_field(*dict_table_get_first_index(&table));
+
+ while ((index = dict_table_get_next_index(index)) != NULL) {
+ for (unsigned i = 0; i < index->n_fields; i++) {
+ dict_field_t& field = index->fields[i];
+ if (field.col < old_cols
+ || field.col >= old_cols_end) {
+ DBUG_ASSERT(field.col->is_virtual());
+ } else {
+ ptrdiff_t n = field.col - old_cols;
+ /* Secondary indexes may contain user
+ columns and DB_ROW_ID (if there is
+ GEN_CLUST_INDEX instead of PRIMARY KEY),
+ but not DB_TRX_ID,DB_ROLL_PTR. */
+ DBUG_ASSERT(n >= 0);
+ DBUG_ASSERT(n <= old_n_cols - DATA_N_SYS_COLS);
+ if (n + DATA_N_SYS_COLS >= old_n_cols) {
+ /* Replace DB_ROW_ID */
+ n += n_add;
+ }
+ field.col = &cols[n];
+ DBUG_ASSERT(!field.col->is_virtual());
+ field.name = field.col->name(*this);
+ }
+ }
+ }
+}
+
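`instant_add_column()` duplicates the column array with `mem_heap_dup()` and then must rebase every `dict_field_t::col` pointer that still points into the old array. The core technique is index-preserving pointer rebasing via `ptrdiff_t`; a minimal sketch (the `rebase` helper and demo arrays are hypothetical):

```cpp
#include <cassert>
#include <cstddef>

// After an array is duplicated into a new allocation, rebase a
// pointer into the old array to the same element index in the new
// one -- the same ptrdiff_t arithmetic used for field.col above.
template <typename T>
T* rebase(T* p, T* old_base, T* new_base)
{
	std::ptrdiff_t n = p - old_base;
	return new_base + n;
}

// Demo arrays standing in for the old and duplicated cols[].
static int old_arr[4] = {1, 2, 3, 4};
static int new_arr[4] = {1, 2, 3, 4};
```

The extra twist in the diff is that indexes referring to `DB_ROW_ID` (the last system columns) must be shifted by `n_add`, because the instantly added columns are spliced in before the system columns.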
+/** Roll back instant_add_column().
+@param[in] old_n_cols original n_cols
+@param[in] old_cols original cols
+@param[in] old_col_names original col_names */
+void
+dict_table_t::rollback_instant(
+ unsigned old_n_cols,
+ dict_col_t* old_cols,
+ const char* old_col_names)
+{
+ ut_ad(mutex_own(&dict_sys->mutex));
+ dict_index_t* index = indexes.start;
+ /* index->is_instant() does not necessarily hold here, because
+ the table may have been emptied */
+ DBUG_ASSERT(old_n_cols >= DATA_N_SYS_COLS);
+ DBUG_ASSERT(n_cols >= old_n_cols);
+ DBUG_ASSERT(n_cols == n_def);
+ DBUG_ASSERT(index->n_def == index->n_fields);
+
+ const unsigned n_remove = n_cols - old_n_cols;
+
+ for (unsigned i = index->n_fields - n_remove; i < index->n_fields;
+ i++) {
+ index->n_nullable -= index->fields[i].col->is_nullable();
+ }
+
+ index->n_fields -= n_remove;
+ index->n_def = index->n_fields;
+ if (index->n_core_fields > index->n_fields) {
+ index->n_core_fields = index->n_fields;
+ index->n_core_null_bytes = UT_BITS_IN_BYTES(index->n_nullable);
+ }
+
+ const dict_col_t* const new_cols = cols;
+ const dict_col_t* const new_cols_end = cols + n_cols;
+
+ cols = old_cols;
+ col_names = old_col_names;
+ n_cols = old_n_cols;
+ n_def = old_n_cols;
+ n_t_def -= n_remove;
+ n_t_cols -= n_remove;
+
+ for (unsigned i = n_v_def; i--; ) {
+ const dict_v_col_t& v = v_cols[i];
+ for (ulint n = v.num_base; n--; ) {
+ dict_col_t*& base = v.base_col[n];
+ if (!base->is_virtual()) {
+ base = &cols[base - new_cols];
+ }
+ }
+ }
+
+ do {
+ for (unsigned i = 0; i < index->n_fields; i++) {
+ dict_field_t& field = index->fields[i];
+ if (field.col < new_cols
+ || field.col >= new_cols_end) {
+ DBUG_ASSERT(field.col->is_virtual());
+ } else {
+ ptrdiff_t n = field.col - new_cols;
+ DBUG_ASSERT(n >= 0);
+ DBUG_ASSERT(n <= n_cols);
+ if (n >= n_cols - DATA_N_SYS_COLS) {
+ n -= n_remove;
+ }
+ field.col = &cols[n];
+ DBUG_ASSERT(!field.col->is_virtual());
+ field.name = field.col->name(*this);
+ }
+ }
+ } while ((index = dict_table_get_next_index(index)) != NULL);
+}
+
+/** Trim the instantly added columns when an insert into SYS_COLUMNS
+is rolled back during ALTER TABLE or recovery.
+@param[in] n number of surviving non-system columns */
+void dict_table_t::rollback_instant(unsigned n)
+{
+ ut_ad(mutex_own(&dict_sys->mutex));
+ dict_index_t* index = indexes.start;
+ DBUG_ASSERT(index->is_instant());
+ DBUG_ASSERT(index->n_def == index->n_fields);
+ DBUG_ASSERT(n_cols == n_def);
+ DBUG_ASSERT(n >= index->n_uniq);
+ DBUG_ASSERT(n_cols > n + DATA_N_SYS_COLS);
+ const unsigned n_remove = n_cols - n - DATA_N_SYS_COLS;
+
+ char* names = const_cast<char*>(dict_table_get_col_name(this, n));
+ const char* sys = names;
+ for (unsigned i = n_remove; i--; ) {
+ sys += strlen(sys) + 1;
+ }
+ static const char system[] = "DB_ROW_ID\0DB_TRX_ID\0DB_ROLL_PTR";
+ DBUG_ASSERT(!memcmp(sys, system, sizeof system));
+ for (unsigned i = index->n_fields - n_remove; i < index->n_fields;
+ i++) {
+ index->n_nullable -= index->fields[i].col->is_nullable();
+ }
+ index->n_fields -= n_remove;
+ index->n_def = index->n_fields;
+ memmove(names, sys, sizeof system);
+ memmove(cols + n, cols + n_cols - DATA_N_SYS_COLS,
+ DATA_N_SYS_COLS * sizeof *cols);
+ n_cols -= n_remove;
+ n_def = n_cols;
+ n_t_cols -= n_remove;
+ n_t_def -= n_remove;
+
+ for (unsigned i = DATA_N_SYS_COLS; i--; ) {
+ cols[n_cols - i].ind--;
+ }
+
+ if (dict_index_is_auto_gen_clust(index)) {
+ DBUG_ASSERT(index->n_uniq == 1);
+ dict_field_t* field = index->fields;
+ field->name = sys;
+ field->col = dict_table_get_sys_col(this, DATA_ROW_ID);
+ field++;
+ field->name = sys + sizeof "DB_ROW_ID";
+ field->col = dict_table_get_sys_col(this, DATA_TRX_ID);
+ field++;
+ field->name = sys + sizeof "DB_ROW_ID\0DB_TRX_ID";
+ field->col = dict_table_get_sys_col(this, DATA_ROLL_PTR);
+
+ /* Replace the DB_ROW_ID column in secondary indexes. */
+ while ((index = dict_table_get_next_index(index)) != NULL) {
+ field = &index->fields[index->n_fields - 1];
+ DBUG_ASSERT(field->col->mtype == DATA_SYS);
+ DBUG_ASSERT(field->col->prtype
+ == DATA_NOT_NULL + DATA_TRX_ID);
+ field->col--;
+ field->name = sys;
+ }
+
+ return;
+ }
+
+ dict_field_t* field = &index->fields[index->n_uniq];
+ field->name = sys + sizeof "DB_ROW_ID";
+ field->col = dict_table_get_sys_col(this, DATA_TRX_ID);
+ field++;
+ field->name = sys + sizeof "DB_ROW_ID\0DB_TRX_ID";
+ field->col = dict_table_get_sys_col(this, DATA_ROLL_PTR);
+}
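`rollback_instant()` leans on a C++ property: `sizeof` applied to a string literal includes the terminating NUL, so offsets into the concatenated `"DB_ROW_ID\0DB_TRX_ID\0DB_ROLL_PTR"` literal land exactly on the start of the next name. A standalone illustration (the helper names are hypothetical):

```cpp
#include <cassert>
#include <cstring>

// The system-column names stored as one literal with embedded NULs,
// as in the function above.
static const char sys_names[] = "DB_ROW_ID\0DB_TRX_ID\0DB_ROLL_PTR";

// sizeof "DB_ROW_ID" is 10 (9 characters plus the NUL), so this
// pointer lands on "DB_TRX_ID"; likewise for "DB_ROLL_PTR".
const char* trx_id_name()
{
	return sys_names + sizeof "DB_ROW_ID";
}

const char* roll_ptr_name()
{
	return sys_names + sizeof "DB_ROW_ID\0DB_TRX_ID";
}
```

This is also why the `DBUG_ASSERT(!memcmp(sys, system, sizeof system))` check works: the packed tail of `col_names` must be byte-identical to the three system names, embedded NULs included.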
diff --git a/storage/innobase/dict/dict0stats.cc b/storage/innobase/dict/dict0stats.cc
index 99fb115ea2c..e3d0effb5ac 100644
--- a/storage/innobase/dict/dict0stats.cc
+++ b/storage/innobase/dict/dict0stats.cc
@@ -1108,11 +1108,19 @@ dict_stats_analyze_index_level(
/* there should not be any pages on the left */
ut_a(btr_page_get_prev(page, mtr) == FIL_NULL);
- /* check whether the first record on the leftmost page is marked
- as such, if we are on a non-leaf level */
- ut_a((level == 0)
- == !(REC_INFO_MIN_REC_FLAG & rec_get_info_bits(
- btr_pcur_get_rec(&pcur), page_is_comp(page))));
+ if (REC_INFO_MIN_REC_FLAG & rec_get_info_bits(
+ btr_pcur_get_rec(&pcur), page_is_comp(page))) {
+ ut_ad(btr_pcur_is_on_user_rec(&pcur));
+ if (level == 0) {
+ /* Skip the 'default row' pseudo-record */
+ ut_ad(index->is_instant());
+ btr_pcur_move_to_next_user_rec(&pcur, mtr);
+ }
+ } else {
+ /* The first record on the leftmost page must be
+ marked as such on each level except the leaf level. */
+ ut_a(level == 0);
+ }
prev_rec = NULL;
prev_rec_is_copied = false;
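The statistics code above recognizes the 'default row' pseudo-record by testing `REC_INFO_MIN_REC_FLAG` in the record's info bits and skips it on the leaf level. A sketch of that flag test (the `0x10` value is an assumption about the constant in `rem0rec.h`, not taken from this diff):

```cpp
#include <cassert>
#include <cstdint>

// Assumed flag values in the record-header info bits; the 'default
// row' pseudo-record carries the minimum-record flag.
static const uint8_t MIN_REC_FLAG = 0x10;	// assumed REC_INFO_MIN_REC_FLAG
static const uint8_t DELETED_FLAG = 0x20;	// assumed REC_INFO_DELETED_FLAG

// True if the record is flagged as the page's minimum record,
// which on a leaf page of an instant-ALTERed index marks the
// 'default row' that statistics sampling must skip.
bool is_min_rec(uint8_t info_bits)
{
	return (info_bits & MIN_REC_FLAG) != 0;
}
```

Before this change, the flag could only appear on non-leaf levels, which is what the old `ut_a((level == 0) == !(...))` assertion encoded.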
diff --git a/storage/innobase/fts/fts0fts.cc b/storage/innobase/fts/fts0fts.cc
index 09012ad4101..0174e51ec07 100644
--- a/storage/innobase/fts/fts0fts.cc
+++ b/storage/innobase/fts/fts0fts.cc
@@ -1747,7 +1747,7 @@ fts_create_in_mem_aux_table(
@param[in] table Table that has FTS Index
@param[in] fts_table_name FTS AUX table name
@param[in] fts_suffix FTS AUX table suffix
-@param[in] heap heap
+@param[in,out] heap temporary memory heap
@return table object if created, else NULL */
static
dict_table_t*
@@ -1784,6 +1784,7 @@ fts_create_one_common_table(
FTS_CONFIG_TABLE_VALUE_COL_LEN);
}
+ dict_table_add_system_columns(new_table, heap);
error = row_create_table_for_mysql(new_table, trx,
FIL_ENCRYPTION_DEFAULT, FIL_DEFAULT_ENCRYPTION_KEY);
@@ -1878,13 +1879,15 @@ fts_create_common_tables(
dict_table_t* common_table = fts_create_one_common_table(
trx, table, full_name[i], fts_table.suffix, heap);
- if (common_table == NULL) {
+ if (common_table == NULL) {
error = DB_ERROR;
goto func_exit;
} else {
common_tables.push_back(common_table);
}
+ mem_heap_empty(heap);
+
DBUG_EXECUTE_IF("ib_fts_aux_table_error",
/* Return error after creating FTS_AUX_CONFIG table. */
if (i == 4) {
@@ -1944,7 +1947,7 @@ func_exit:
@param[in,out] trx transaction
@param[in] index the index instance
@param[in] fts_table fts_table structure
-@param[in,out] heap memory heap
+@param[in,out] heap temporary memory heap
@see row_merge_create_fts_sort_index()
@return DB_SUCCESS or error code */
static
@@ -2001,6 +2004,7 @@ fts_create_one_index_table(
(DATA_MTYPE_MAX << 16) | DATA_UNSIGNED | DATA_NOT_NULL,
FTS_INDEX_ILIST_LEN);
+ dict_table_add_system_columns(new_table, heap);
error = row_create_table_for_mysql(new_table, trx,
FIL_ENCRYPTION_DEFAULT, FIL_DEFAULT_ENCRYPTION_KEY);
@@ -2076,6 +2080,8 @@ fts_create_index_tables_low(
aux_idx_tables.push_back(new_table);
}
+ mem_heap_empty(heap);
+
DBUG_EXECUTE_IF("ib_fts_index_table_error",
/* Return error after creating FTS_INDEX_5
aux table. */
@@ -3303,6 +3309,8 @@ fts_fetch_doc_from_rec(
parser = get_doc->index_cache->index->parser;
clust_rec = btr_pcur_get_rec(pcur);
+ ut_ad(!page_rec_is_comp(clust_rec)
+ || rec_get_status(clust_rec) == REC_STATUS_ORDINARY);
num_field = dict_index_get_n_fields(index);
@@ -3598,6 +3606,8 @@ fts_get_max_doc_id(
return(0);
}
+ ut_ad(!index->is_instant());
+
dfield = dict_index_get_nth_field(index, 0);
#if 0 /* This can fail when renaming a column to FTS_DOC_ID_COL_NAME. */
@@ -3632,6 +3642,7 @@ fts_get_max_doc_id(
goto func_exit;
}
+ ut_ad(!rec_is_default_row(rec, index));
offsets = rec_get_offsets(
rec, index, offsets, true, ULINT_UNDEFINED, &heap);
diff --git a/storage/innobase/gis/gis0rtree.cc b/storage/innobase/gis/gis0rtree.cc
index b8220d73ec0..0061f57c539 100644
--- a/storage/innobase/gis/gis0rtree.cc
+++ b/storage/innobase/gis/gis0rtree.cc
@@ -85,7 +85,7 @@ rtr_page_split_initialize_nodes(
stop = task + n_recs;
rec = page_rec_get_next(page_get_infimum_rec(page));
- ut_d(const bool is_leaf = page_is_leaf(page));
+ const bool is_leaf = page_is_leaf(page);
*offsets = rec_get_offsets(rec, cursor->index, *offsets, is_leaf,
n_uniq, &heap);
diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index 1cb10cfb557..d6d103e55c2 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -1100,6 +1100,9 @@ static SHOW_VAR innodb_status_variables[]= {
{"defragment_count",
(char*) &export_vars.innodb_defragment_count, SHOW_LONG},
+ {"instant_alter_column",
+ (char*) &export_vars.innodb_instant_alter_column, SHOW_LONG},
+
/* Online alter table status variables */
{"onlineddl_rowlog_rows",
(char*) &export_vars.innodb_onlineddl_rowlog_rows, SHOW_LONG},
@@ -11491,6 +11494,8 @@ err_col:
fts_add_doc_id_column(table, heap);
}
+ dict_table_add_system_columns(table, heap);
+
ut_ad(trx_state_eq(m_trx, TRX_STATE_NOT_STARTED));
/* If temp table, then we avoid creation of entries in SYSTEM TABLES.
@@ -11505,19 +11510,10 @@ err_col:
err = dict_build_tablespace_for_table(table, NULL);
if (err == DB_SUCCESS) {
- /* Temp-table are maintained in memory and so
- can_be_evicted is FALSE. */
- mem_heap_t* temp_table_heap;
-
- temp_table_heap = mem_heap_create(256);
-
- dict_table_add_to_cache(
- table, FALSE, temp_table_heap);
+ table->add_to_cache();
DBUG_EXECUTE_IF("ib_ddl_crash_during_create2",
DBUG_SUICIDE(););
-
- mem_heap_free(temp_table_heap);
}
} else {
if (err == DB_SUCCESS) {
diff --git a/storage/innobase/handler/handler0alter.cc b/storage/innobase/handler/handler0alter.cc
index f4edd6131cf..552350ef805 100644
--- a/storage/innobase/handler/handler0alter.cc
+++ b/storage/innobase/handler/handler0alter.cc
@@ -43,10 +43,14 @@ Smart ALTER TABLE
#include "rem0types.h"
#include "row0log.h"
#include "row0merge.h"
+#include "row0ins.h"
+#include "row0row.h"
+#include "row0upd.h"
#include "trx0trx.h"
#include "trx0roll.h"
#include "handler0alter.h"
#include "srv0mon.h"
+#include "srv0srv.h"
#include "fts0priv.h"
#include "fts0plugin.h"
#include "pars0pars.h"
@@ -157,6 +161,8 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
dict_table_t* old_table;
/** table where the indexes are being created or dropped */
dict_table_t* new_table;
+ /** table definition for instant ADD COLUMN */
+ dict_table_t* instant_table;
/** mapping of old column numbers to new ones, or NULL */
const ulint* col_map;
/** new column names, or NULL if nothing was renamed */
@@ -183,6 +189,12 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
const char** drop_vcol_name;
/** ALTER TABLE stage progress recorder */
ut_stage_alter_t* m_stage;
+ /** original number of user columns in the table */
+ const unsigned old_n_cols;
+ /** original columns of the table */
+ dict_col_t* const old_cols;
+ /** original column names of the table */
+ const char* const old_col_names;
ha_innobase_inplace_ctx(row_prebuilt_t*& prebuilt_arg,
dict_index_t** drop_arg,
@@ -210,7 +222,7 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
add_fk (add_fk_arg), num_to_add_fk (num_to_add_fk_arg),
online (online_arg), heap (heap_arg), trx (0),
old_table (prebuilt_arg->table),
- new_table (new_table_arg),
+ new_table (new_table_arg), instant_table (0),
col_map (0), col_names (col_names_arg),
add_autoinc (add_autoinc_arg),
add_cols (0),
@@ -224,8 +236,12 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
num_to_drop_vcol(0),
drop_vcol(0),
drop_vcol_name(0),
- m_stage(NULL)
+ m_stage(NULL),
+ old_n_cols(prebuilt_arg->table->n_cols),
+ old_cols(prebuilt_arg->table->cols),
+ old_col_names(prebuilt_arg->table->col_names)
{
+ ut_ad(old_n_cols >= DATA_N_SYS_COLS);
#ifdef UNIV_DEBUG
for (ulint i = 0; i < num_to_add_index; i++) {
ut_ad(!add_index[i]->to_be_dropped);
@@ -242,6 +258,15 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
~ha_innobase_inplace_ctx()
{
UT_DELETE(m_stage);
+ if (instant_table) {
+ while (dict_index_t* index
+ = UT_LIST_GET_LAST(instant_table->indexes)) {
+ UT_LIST_REMOVE(instant_table->indexes, index);
+ rw_lock_free(&index->lock);
+ dict_mem_index_free(index);
+ }
+ dict_mem_table_free(instant_table);
+ }
mem_heap_free(heap);
}
@@ -249,6 +274,36 @@ struct ha_innobase_inplace_ctx : public inplace_alter_handler_ctx
@return whether the table will be rebuilt */
bool need_rebuild () const { return(old_table != new_table); }
+ /** Convert table-rebuilding ALTER to instant ALTER. */
+ void prepare_instant()
+ {
+ DBUG_ASSERT(need_rebuild());
+ DBUG_ASSERT(!is_instant());
+ DBUG_ASSERT(old_table->n_cols == old_table->n_def);
+ DBUG_ASSERT(new_table->n_cols == new_table->n_def);
+ DBUG_ASSERT(old_table->n_cols == old_n_cols);
+ DBUG_ASSERT(new_table->n_cols > old_table->n_cols);
+ instant_table = new_table;
+
+ new_table = old_table;
+ export_vars.innodb_instant_alter_column++;
+ }
+
+ /** Revert prepare_instant() if the transaction is rolled back. */
+ void rollback_instant()
+ {
+ if (!is_instant()) return;
+ old_table->rollback_instant(old_n_cols,
+ old_cols, old_col_names);
+ }
+
+ /** @return whether this is instant ALTER TABLE */
+ bool is_instant() const
+ {
+ DBUG_ASSERT(!instant_table || !instant_table->can_be_evicted);
+ return instant_table;
+ }
+
private:
// Disable copying
ha_innobase_inplace_ctx(const ha_innobase_inplace_ctx&);
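Reader's note on the state machine above: `prepare_instant()` parks the rebuilt definition in `instant_table` and redirects the ALTER back onto the old table, while the saved `old_n_cols`/`old_cols` allow `rollback_instant()` to undo the in-place column append. The following is a minimal self-contained sketch of that swap-and-restore pattern; `MiniTable`/`MiniCtx` are illustrative stand-ins, not InnoDB types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: a "table" is just its list of user column names.
struct MiniTable {
    std::vector<std::string> cols;
};

struct MiniCtx {
    MiniTable* old_table;                 // table before ALTER
    MiniTable* new_table;                 // rebuilt definition, or == old_table
    MiniTable* instant_table = nullptr;   // set once instant ALTER is chosen
    size_t     old_n_cols;                // saved at construction, cf. old_n_cols

    MiniCtx(MiniTable* o, MiniTable* n)
        : old_table(o), new_table(n), old_n_cols(o->cols.size()) {}

    bool is_instant() const { return instant_table != nullptr; }

    // Convert a table-rebuilding ALTER into an instant one: keep the
    // rebuilt definition aside and continue operating on the old table.
    void prepare_instant() {
        assert(old_table != new_table);   // need_rebuild()
        instant_table = new_table;
        new_table = old_table;
    }

    // Commit path: append the new columns to the live definition.
    void apply_instant() {
        assert(is_instant());
        old_table->cols = instant_table->cols;
    }

    // Rollback path: truncate back to the original column list.
    void rollback_instant() {
        if (!is_instant()) return;
        old_table->cols.resize(old_n_cols);
    }
};
```

The key invariant is that after `prepare_instant()`, `new_table == old_table`, so `need_rebuild()` becomes false and the copy-based code paths are skipped.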
@@ -592,6 +647,11 @@ ha_innobase::check_if_supported_inplace_alter(
update_thd();
+ // FIXME: Construct ha_innobase_inplace_ctx here and determine
+ // if instant ALTER TABLE is possible. If yes, we will be able to
+ // allow ADD COLUMN even if SPATIAL INDEX, FULLTEXT INDEX or
+ // virtual columns exist, also together with adding virtual columns.
+
/* Change on engine specific table options require rebuild of the
table */
if (ha_alter_info->handler_flags
@@ -1822,7 +1882,7 @@ null_field:
continue;
}
- ifield = rec_get_nth_field(rec, offsets, ipos, &ilen);
+ ifield = rec_get_nth_cfield(rec, index, offsets, ipos, &ilen);
/* Assign the NULL flag */
if (ilen == UNIV_SQL_NULL) {
@@ -3071,11 +3131,7 @@ innobase_build_col_map(
}
while (const Create_field* new_field = cf_it++) {
- bool is_v = false;
-
- if (innobase_is_v_fld(new_field)) {
- is_v = true;
- }
+ bool is_v = innobase_is_v_fld(new_field);
ulint num_old_v = 0;
@@ -3896,34 +3952,34 @@ innobase_add_one_virtual(
return(error);
}
-/** Update INNODB SYS_TABLES on number of virtual columns
+/** Update SYS_TABLES.N_COLS in the data dictionary.
@param[in] user_table InnoDB table
-@param[in] n_col number of columns
+@param[in] n_cols the new value of SYS_TABLES.N_COLS
@param[in] trx transaction
-@return DB_SUCCESS if successful, otherwise error code */
+@return whether the operation failed */
static
-dberr_t
-innobase_update_n_virtual(
- const dict_table_t* table,
- ulint n_col,
- trx_t* trx)
+bool
+innodb_update_n_cols(const dict_table_t* table, ulint n_cols, trx_t* trx)
{
- dberr_t err = DB_SUCCESS;
pars_info_t* info = pars_info_create();
- pars_info_add_int4_literal(info, "num_col", n_col);
+ pars_info_add_int4_literal(info, "n", n_cols);
pars_info_add_ull_literal(info, "id", table->id);
- err = que_eval_sql(
- info,
- "PROCEDURE RENUMBER_TABLE_ID_PROC () IS\n"
- "BEGIN\n"
- "UPDATE SYS_TABLES"
- " SET N_COLS = :num_col\n"
- " WHERE ID = :id;\n"
- "END;\n", FALSE, trx);
+ dberr_t err = que_eval_sql(info,
+ "PROCEDURE UPDATE_N_COLS () IS\n"
+ "BEGIN\n"
+ "UPDATE SYS_TABLES SET N_COLS = :n"
+ " WHERE ID = :id;\n"
+ "END;\n", FALSE, trx);
- return(err);
+ if (err != DB_SUCCESS) {
+ my_error(ER_INTERNAL_ERROR, MYF(0),
+ "InnoDB: Updating SYS_TABLES.N_COLS failed");
+ return true;
+ }
+
+ return false;
}
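For context on what `innodb_update_n_cols()` writes: `SYS_TABLES.N_COLS` is a packed value, with the stored-column count in the low bits, the virtual-column count shifted above it (cf. `dict_table_encode_n_col()`), and the high bit flagging `DICT_TF_COMPACT`. A hedged sketch of that encoding, with the exact bit widths taken as an assumption about this InnoDB version:

```cpp
#include <cassert>
#include <cstdint>

// High bit of N_COLS marks ROW_FORMAT != REDUNDANT, matching
// (flags & DICT_TF_COMPACT) << 31 in the patch.
static const uint32_t N_COLS_COMPACT = 1u << 31;

// Pack stored and virtual column counts into one 32-bit value.
inline uint32_t encode_n_cols(uint32_t n_col, uint32_t n_v_col, bool compact) {
    return n_col | (n_v_col << 16) | (compact ? N_COLS_COMPACT : 0);
}

// Reverse the packing: strip the format flag, then split the counts.
inline void decode_n_cols(uint32_t enc, uint32_t* n_col, uint32_t* n_v_col,
                          bool* compact) {
    *compact = (enc & N_COLS_COMPACT) != 0;
    const uint32_t num = enc & ~N_COLS_COMPACT;
    *n_v_col = num >> 16;
    *n_col = num & 0xFFFF;
}
```

This is why both the virtual-column paths and the new instant ADD COLUMN path can funnel through the same `innodb_update_n_cols()` helper: each computes the packed value and stores it in one dictionary update.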
/** Update system table for adding virtual column(s)
@@ -3969,15 +4025,254 @@ innobase_add_virtual_try(
ulint new_n = dict_table_encode_n_col(n_col, n_v_col)
+ ((user_table->flags & DICT_TF_COMPACT) << 31);
- err = innobase_update_n_virtual(user_table, new_n, trx);
+ return innodb_update_n_cols(user_table, new_n, trx);
+}
- if (err != DB_SUCCESS) {
- my_error(ER_INTERNAL_ERROR, MYF(0),
- "InnoDB: ADD COLUMN...VIRTUAL");
- return(true);
+/** Insert into SYS_COLUMNS and insert/update the 'default row'
+for instant ADD COLUMN.
+@param[in,out] ha_alter_info Data used during in-place alter
+@param[in,out] ctx ALTER TABLE context for the current partition
+@param[in] altered_table MySQL table that is being altered
+@param[in] table MySQL table as it is before the ALTER operation
+@param[in,out] trx dictionary transaction
+@retval true failure
+@retval false success */
+static
+bool
+innobase_add_instant_try(
+ Alter_inplace_info* ha_alter_info,
+ ha_innobase_inplace_ctx* ctx,
+ const TABLE* altered_table,
+ const TABLE* table,
+ trx_t* trx)
+{
+ DBUG_ASSERT(!ctx->need_rebuild());
+
+ if (!ctx->is_instant()) return false;
+
+ DBUG_ASSERT(altered_table->s->fields > table->s->fields);
+ DBUG_ASSERT(ctx->old_table->n_cols == ctx->old_n_cols);
+
+ dict_table_t* user_table = ctx->old_table;
+ user_table->instant_add_column(*ctx->instant_table);
+ dict_index_t* index = dict_table_get_first_index(user_table);
+ /* The table may have been emptied and may have lost its
+ 'instant-add-ness' during this instant ADD COLUMN. */
+
+ /* Construct a table row of default values for the stored columns. */
+ dtuple_t* row = dtuple_create(ctx->heap, user_table->n_cols);
+ dict_table_copy_types(row, user_table);
+ Field** af = altered_table->field;
+ Field** const end = altered_table->field + altered_table->s->fields;
+
+ for (uint i = 0; af < end; af++) {
+ if (!(*af)->stored_in_db()) {
+ continue;
+ }
+
+ dict_col_t* col = dict_table_get_nth_col(user_table, i);
+ DBUG_ASSERT(!strcmp((*af)->field_name.str,
+ dict_table_get_col_name(user_table, i)));
+
+ dfield_t* d = dtuple_get_nth_field(row, i);
+
+ if (col->is_instant()) {
+ dfield_set_data(d, col->def_val.data,
+ col->def_val.len);
+ } else if ((*af)->real_maybe_null()) {
+ /* Store NULL for nullable 'core' columns. */
+ dfield_set_null(d);
+ } else {
+ switch ((*af)->type()) {
+ case MYSQL_TYPE_VARCHAR:
+ case MYSQL_TYPE_GEOMETRY:
+ case MYSQL_TYPE_TINY_BLOB:
+ case MYSQL_TYPE_MEDIUM_BLOB:
+ case MYSQL_TYPE_BLOB:
+ case MYSQL_TYPE_LONG_BLOB:
+ /* Store the empty string for 'core'
+ variable-length NOT NULL columns. */
+ dfield_set_data(d, field_ref_zero, 0);
+ break;
+ default:
+ /* For fixed-length NOT NULL 'core' columns,
+ get a dummy default value from SQL. Note that
+ we will preserve the old values of these
+ columns when updating the 'default row'
+ record, to avoid unnecessary updates. */
+ ulint len = (*af)->pack_length();
+ DBUG_ASSERT(d->type.mtype != DATA_INT
+ || len <= 8);
+ row_mysql_store_col_in_innobase_format(
+ d, d->type.mtype == DATA_INT
+ ? static_cast<byte*>(
+ mem_heap_alloc(ctx->heap, len))
+ : NULL, true, (*af)->ptr, len,
+ dict_table_is_comp(user_table));
+ }
+ }
+
+ if (i + DATA_N_SYS_COLS < ctx->old_n_cols) {
+ i++;
+ continue;
+ }
+
+ pars_info_t* info = pars_info_create();
+ pars_info_add_ull_literal(info, "id", user_table->id);
+ pars_info_add_int4_literal(info, "pos", i);
+ pars_info_add_str_literal(info, "name", (*af)->field_name.str);
+ pars_info_add_int4_literal(info, "mtype", d->type.mtype);
+ pars_info_add_int4_literal(info, "prtype", d->type.prtype);
+ pars_info_add_int4_literal(info, "len", d->type.len);
+
+ dberr_t err = que_eval_sql(
+ info,
+ "PROCEDURE ADD_COL () IS\n"
+ "BEGIN\n"
+ "INSERT INTO SYS_COLUMNS VALUES"
+ "(:id,:pos,:name,:mtype,:prtype,:len,0);\n"
+ "END;\n", FALSE, trx);
+ if (err != DB_SUCCESS) {
+ my_error(ER_INTERNAL_ERROR, MYF(0),
+ "InnoDB: Insert into SYS_COLUMNS failed");
+ return(true);
+ }
+
+ i++;
}
- return(false);
+ if (innodb_update_n_cols(user_table, dict_table_encode_n_col(
+ user_table->n_cols - DATA_N_SYS_COLS,
+ user_table->n_v_cols)
+ | (user_table->flags & DICT_TF_COMPACT) << 31,
+ trx)) {
+ return true;
+ }
+
+ unsigned i = user_table->n_cols - DATA_N_SYS_COLS;
+ byte trx_id[DATA_TRX_ID_LEN], roll_ptr[DATA_ROLL_PTR_LEN];
+ dfield_set_data(dtuple_get_nth_field(row, i++), field_ref_zero,
+ DATA_ROW_ID_LEN);
+ dfield_set_data(dtuple_get_nth_field(row, i++), trx_id, sizeof trx_id);
+ dfield_set_data(dtuple_get_nth_field(row, i), roll_ptr, sizeof roll_ptr);
+ DBUG_ASSERT(i + 1 == user_table->n_cols);
+
+ trx_write_trx_id(trx_id, trx->id);
+ /* The DB_ROLL_PTR will be assigned later, when allocating undo log.
+ Silence a Valgrind warning in dtuple_validate() when
+ row_ins_clust_index_entry_low() searches for the insert position. */
+ memset(roll_ptr, 0, sizeof roll_ptr);
+
+ dtuple_t* entry = row_build_index_entry(row, NULL, index, ctx->heap);
+ entry->info_bits = REC_INFO_DEFAULT_ROW;
+
+ mtr_t mtr;
+ mtr.start();
+ mtr.set_named_space(index->space);
+ btr_pcur_t pcur;
+ btr_pcur_open_at_index_side(true, index, BTR_MODIFY_TREE, &pcur, true,
+ 0, &mtr);
+ ut_ad(btr_pcur_is_before_first_on_page(&pcur));
+ btr_pcur_move_to_next_on_page(&pcur);
+
+ buf_block_t* block = btr_pcur_get_block(&pcur);
+ ut_ad(page_is_leaf(block->frame));
+ ut_ad(!buf_block_get_page_zip(block));
+ const rec_t* rec = btr_pcur_get_rec(&pcur);
+ que_thr_t* thr = pars_complete_graph_for_exec(
+ NULL, trx, ctx->heap, NULL);
+
+ if (rec_is_default_row(rec, index)) {
+ ut_ad(page_rec_is_user_rec(rec));
+ if (page_rec_is_last(rec, block->frame)) {
+ goto empty_table;
+ }
+ /* Extend the record with the instantly added columns. */
+ const unsigned n = user_table->n_cols - ctx->old_n_cols;
+ /* Reserve room for DB_TRX_ID, DB_ROLL_PTR and any
+ non-updated off-page columns in case they are moved off
+ page as a result of the update. */
+ upd_t* update = upd_create(index->n_fields, ctx->heap);
+ update->n_fields = n;
+ update->info_bits = REC_INFO_DEFAULT_ROW;
+ /* Add the default values for instantly added columns */
+ for (unsigned i = 0; i < n; i++) {
+ upd_field_t* uf = upd_get_nth_field(update, i);
+ unsigned f = index->n_fields - n + i;
+ uf->field_no = f;
+ uf->new_val = entry->fields[f];
+ }
+ ulint* offsets = NULL;
+ mem_heap_t* offsets_heap = NULL;
+ big_rec_t* big_rec;
+ dberr_t error = btr_cur_pessimistic_update(
+ BTR_NO_LOCKING_FLAG, btr_pcur_get_btr_cur(&pcur),
+ &offsets, &offsets_heap, ctx->heap,
+ &big_rec, update, UPD_NODE_NO_ORD_CHANGE,
+ thr, trx->id, &mtr);
+ if (big_rec) {
+ if (error == DB_SUCCESS) {
+ error = btr_store_big_rec_extern_fields(
+ &pcur, update, offsets, big_rec, &mtr,
+ BTR_STORE_UPDATE);
+ }
+
+ dtuple_big_rec_free(big_rec);
+ }
+ if (offsets_heap) {
+ mem_heap_free(offsets_heap);
+ }
+ btr_pcur_close(&pcur);
+ mtr.commit();
+ return error != DB_SUCCESS;
+ } else if (page_rec_is_supremum(rec)) {
+empty_table:
+ /* The table is empty. */
+ ut_ad(page_is_root(block->frame));
+ btr_page_empty(block, NULL, index, 0, &mtr);
+ index->remove_instant();
+ mtr.commit();
+ return false;
+ }
+
+ /* Convert the table to the instant ADD COLUMN format. */
+ ut_ad(user_table->is_instant());
+ mtr.commit();
+ mtr.start();
+ mtr.set_named_space(index->space);
+ dberr_t err;
+ if (page_t* root = btr_root_get(index, &mtr)) {
+ switch (fil_page_get_type(root)) {
+ case FIL_PAGE_TYPE_INSTANT:
+ DBUG_ASSERT(page_get_instant(root)
+ == index->n_core_fields);
+ break;
+ case FIL_PAGE_INDEX:
+ DBUG_ASSERT(!page_is_comp(root)
+ || !page_get_instant(root));
+ break;
+ default:
+ DBUG_ASSERT(!"wrong page type");
+ mtr.commit();
+ return true;
+ }
+
+ mlog_write_ulint(root + FIL_PAGE_TYPE,
+ FIL_PAGE_TYPE_INSTANT, MLOG_2BYTES,
+ &mtr);
+ page_set_instant(root, index->n_core_fields, &mtr);
+ mtr.commit();
+ mtr.start();
+ mtr.set_named_space(index->space);
+ err = row_ins_clust_index_entry_low(
+ BTR_NO_LOCKING_FLAG, BTR_MODIFY_TREE, index,
+ index->n_uniq, entry, 0, thr, false);
+ } else {
+ err = DB_CORRUPTION;
+ }
+
+ mtr.commit();
+ return err != DB_SUCCESS;
}
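To make the 'default row' construction in `innobase_add_instant_try()` easier to follow: each field of the hidden record gets its value by a three-way rule, and only the instantly added columns carry defaults that readers will actually use. Below is a toy rendering of that choice; the enum, struct and helper are illustrative, not InnoDB types, and a `std::nullopt` models SQL NULL.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Column classification as seen by the default-row builder.
enum class ColKind {
    CORE_NULLABLE,   // pre-existing nullable column
    CORE_VARLEN,     // pre-existing NOT NULL variable-length column
    CORE_FIXED,      // pre-existing NOT NULL fixed-length column
    INSTANT_ADDED    // column appended by instant ADD COLUMN
};

struct Col {
    ColKind kind;
    std::string def;   // default captured at ALTER time (INSTANT_ADDED only)
};

// Value stored in the 'default row' record for one column. Core-column
// values are ignored on read; they only need to be well-formed.
std::optional<std::string> default_row_value(const Col& c) {
    switch (c.kind) {
    case ColKind::INSTANT_ADDED: return c.def;                 // real default
    case ColKind::CORE_NULLABLE: return std::nullopt;          // store NULL
    case ColKind::CORE_VARLEN:   return std::string();         // empty string
    case ColKind::CORE_FIXED:    return std::string("\0", 1);  // dummy bytes
    }
    return std::nullopt;
}
```

This mirrors the branch structure above: `col->is_instant()` takes the stored default, `real_maybe_null()` stores NULL, the BLOB/VARCHAR cases store an empty string, and the fixed-length fallback stores a dummy value that later updates deliberately preserve.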
/** Update INNODB SYS_COLUMNS on new virtual column's position
@@ -4196,14 +4491,7 @@ innobase_drop_virtual_try(
ulint new_n = dict_table_encode_n_col(n_col, n_v_col)
+ ((user_table->flags & DICT_TF_COMPACT) << 31);
- err = innobase_update_n_virtual(user_table, new_n, trx);
-
- if (err != DB_SUCCESS) {
- my_error(ER_INTERNAL_ERROR, MYF(0),
- "InnoDB: DROP COLUMN...VIRTUAL");
- }
-
- return(false);
+ return innodb_update_n_cols(user_table, new_n, trx);
}
/** Adjust the create index column number from "New table" to
@@ -4289,6 +4577,35 @@ innodb_v_adjust_idx_col(
}
}
+/** Create index metadata in the data dictionary.
+@param[in,out] trx dictionary transaction
+@param[in,out] index index being created
+@param[in] add_v virtual columns that are being added, or NULL
+@return DB_SUCCESS or error code */
+MY_ATTRIBUTE((nonnull(1,2), warn_unused_result))
+static
+dberr_t
+create_index_dict(
+ trx_t* trx,
+ dict_index_t* index,
+ const dict_add_v_col_t* add_v)
+{
+ DBUG_ENTER("create_index_dict");
+
+ mem_heap_t* heap = mem_heap_create(512);
+ ind_node_t* node = ind_create_graph_create(index, heap, add_v);
+ que_thr_t* thr = pars_complete_graph_for_exec(node, trx, heap, NULL);
+
+ que_fork_start_command(
+ static_cast<que_fork_t*>(que_node_get_parent(thr)));
+
+ que_run_threads(thr);
+
+ que_graph_free((que_t*) que_node_get_parent(thr));
+
+ DBUG_RETURN(trx->error_state);
+}
+
/** Update internal structures with concurrent writes blocked,
while preparing ALTER TABLE.
@@ -4383,12 +4700,6 @@ prepare_inplace_alter_table_dict(
here */
ut_ad(check_v_col_in_order(old_table, altered_table, ha_alter_info));
- /* Create a background transaction for the operations on
- the data dictionary tables. */
- ctx->trx = innobase_trx_allocate(ctx->prebuilt->trx->mysql_thd);
-
- trx_start_for_ddl(ctx->trx, TRX_DICT_OP_INDEX);
-
/* Create table containing all indexes to be built in this
ALTER TABLE ADD INDEX so that they are in the correct order
in the table. */
@@ -4443,17 +4754,12 @@ prepare_inplace_alter_table_dict(
/* Allocate memory for dictionary index definitions */
ctx->add_index = static_cast<dict_index_t**>(
- mem_heap_alloc(ctx->heap, ctx->num_to_add_index
+ mem_heap_zalloc(ctx->heap, ctx->num_to_add_index
* sizeof *ctx->add_index));
ctx->add_key_numbers = add_key_nums = static_cast<ulint*>(
mem_heap_alloc(ctx->heap, ctx->num_to_add_index
* sizeof *ctx->add_key_numbers));
- /* This transaction should be dictionary operation, so that
- the data dictionary will be locked during crash recovery. */
-
- ut_ad(ctx->trx->dict_operation == TRX_DICT_OP_INDEX);
-
/* Acquire a lock on the table before creating any indexes. */
if (ctx->online) {
@@ -4468,6 +4774,12 @@ prepare_inplace_alter_table_dict(
}
}
+ /* Create a background transaction for the operations on
+ the data dictionary tables. */
+ ctx->trx = innobase_trx_allocate(ctx->prebuilt->trx->mysql_thd);
+
+ trx_start_for_ddl(ctx->trx, TRX_DICT_OP_INDEX);
+
/* Latch the InnoDB data dictionary exclusively so that no deadlocks
or lock waits can happen in it during an index create operation. */
@@ -4487,11 +4799,43 @@ prepare_inplace_alter_table_dict(
ut_d(dict_table_check_for_dup_indexes(
ctx->new_table, CHECK_ABORTED_OK));
+ DBUG_EXECUTE_IF("innodb_OOM_prepare_inplace_alter",
+ error = DB_OUT_OF_MEMORY;
+ goto error_handling;);
+
/* If a new clustered index is defined for the table we need
to rebuild the table with a temporary name. */
if (new_clustered) {
- fil_space_crypt_t* crypt_data;
+ if (innobase_check_foreigns(
+ ha_alter_info, altered_table, old_table,
+ user_table, ctx->drop_fk, ctx->num_to_drop_fk)) {
+new_clustered_failed:
+ DBUG_ASSERT(ctx->trx != ctx->prebuilt->trx);
+ trx_rollback_to_savepoint(ctx->trx, NULL);
+
+ ut_ad(user_table->get_ref_count() == 1);
+
+ online_retry_drop_indexes_with_trx(
+ user_table, ctx->trx);
+
+ if (ctx->need_rebuild()) {
+ ut_ad(!ctx->new_table->cached);
+ dict_mem_table_free(ctx->new_table);
+ ctx->new_table = ctx->old_table;
+ }
+
+ while (ctx->num_to_add_index--) {
+ if (dict_index_t*& i = ctx->add_index[
+ ctx->num_to_add_index]) {
+ dict_mem_index_free(i);
+ i = NULL;
+ }
+ }
+
+ goto err_exit;
+ }
+
const char* new_table_name
= dict_mem_create_temporary_tablename(
ctx->heap,
@@ -4502,23 +4846,6 @@ prepare_inplace_alter_table_dict(
dtuple_t* add_cols;
ulint space_id = 0;
ulint z = 0;
- uint32_t key_id = FIL_DEFAULT_ENCRYPTION_KEY;
- fil_encryption_t mode = FIL_ENCRYPTION_DEFAULT;
-
- fil_space_t* space = fil_space_acquire(ctx->prebuilt->table->space);
- crypt_data = space->crypt_data;
- fil_space_release(space);
-
- if (crypt_data) {
- key_id = crypt_data->key_id;
- mode = crypt_data->encryption;
- }
-
- if (innobase_check_foreigns(
- ha_alter_info, altered_table, old_table,
- user_table, ctx->drop_fk, ctx->num_to_drop_fk)) {
- goto new_clustered_failed;
- }
for (uint i = 0; i < altered_table->s->fields; i++) {
const Field* field = altered_table->field[i];
@@ -4543,15 +4870,6 @@ prepare_inplace_alter_table_dict(
DBUG_ASSERT(!add_fts_doc_id_idx || (flags2 & DICT_TF2_FTS));
- /* Create the table. */
- trx_set_dict_operation(ctx->trx, TRX_DICT_OP_TABLE);
-
- if (dict_table_get_low(new_table_name)) {
- my_error(ER_TABLE_EXISTS_ERROR, MYF(0),
- new_table_name);
- goto new_clustered_failed;
- }
-
/* The initial space id 0 may be overridden later if this
table is going to be a file_per_table tablespace. */
ctx->new_table = dict_mem_table_create(
@@ -4600,8 +4918,6 @@ prepare_inplace_alter_table_dict(
charset_no = (ulint) field->charset()->number;
if (charset_no > MAX_CHAR_COLL_NUM) {
- dict_mem_table_free(
- ctx->new_table);
my_error(ER_WRONG_KEY_COLUMN, MYF(0), "InnoDB",
field->field_name.str);
goto new_clustered_failed;
@@ -4633,6 +4949,7 @@ prepare_inplace_alter_table_dict(
if (dict_col_name_is_reserved(field->field_name.str)) {
dict_mem_table_free(ctx->new_table);
+ ctx->new_table = ctx->old_table;
my_error(ER_WRONG_COLUMN_NAME, MYF(0),
field->field_name.str);
goto new_clustered_failed;
@@ -4683,51 +5000,7 @@ prepare_inplace_alter_table_dict(
ctx->new_table->fts->doc_col = fts_doc_id_col;
}
- error = row_create_table_for_mysql(
- ctx->new_table, ctx->trx, mode, key_id);
-
- switch (error) {
- dict_table_t* temp_table;
- case DB_SUCCESS:
- /* We need to bump up the table ref count and
- before we can use it we need to open the
- table. The new_table must be in the data
- dictionary cache, because we are still holding
- the dict_sys->mutex. */
- ut_ad(mutex_own(&dict_sys->mutex));
- temp_table = dict_table_open_on_name(
- ctx->new_table->name.m_name, TRUE, FALSE,
- DICT_ERR_IGNORE_NONE);
- ut_a(ctx->new_table == temp_table);
- /* n_ref_count must be 1, because purge cannot
- be executing on this very table as we are
- holding dict_operation_lock X-latch. */
- DBUG_ASSERT(ctx->new_table->get_ref_count() == 1);
- break;
- case DB_TABLESPACE_EXISTS:
- my_error(ER_TABLESPACE_EXISTS, MYF(0),
- new_table_name);
- goto new_clustered_failed;
- case DB_DUPLICATE_KEY:
- my_error(HA_ERR_TABLE_EXIST, MYF(0),
- altered_table->s->table_name.str);
- goto new_clustered_failed;
- case DB_UNSUPPORTED:
- my_error(ER_UNSUPPORTED_EXTENSION, MYF(0),
- ctx->new_table->name.m_name);
- goto new_clustered_failed;
- default:
- my_error_innodb(error, table_name, flags);
-new_clustered_failed:
- DBUG_ASSERT(ctx->trx != ctx->prebuilt->trx);
- trx_rollback_to_savepoint(ctx->trx, NULL);
-
- ut_ad(user_table->get_ref_count() == 1);
-
- online_retry_drop_indexes_with_trx(
- user_table, ctx->trx);
- goto err_exit;
- }
+ dict_table_add_system_columns(ctx->new_table, ctx->heap);
if (ha_alter_info->handler_flags
& Alter_inplace_info::ADD_COLUMN) {
@@ -4793,15 +5066,10 @@ new_clustered_failed:
}
}
- /* Assign table_id, so that no table id of
- fts_create_index_tables() will be written to the undo logs. */
- DBUG_ASSERT(ctx->new_table->id != 0);
- ctx->trx->table_id = ctx->new_table->id;
-
- /* Create the indexes in SYS_INDEXES and load into dictionary. */
+ ut_ad(new_clustered == ctx->need_rebuild());
+ /* Create the index metadata. */
for (ulint a = 0; a < ctx->num_to_add_index; a++) {
-
if (index_defs[a].ind_type & DICT_VIRTUAL
&& ctx->num_to_drop_vcol > 0 && !new_clustered) {
innodb_v_adjust_idx_col(ha_alter_info, old_table,
@@ -4810,68 +5078,287 @@ new_clustered_failed:
}
ctx->add_index[a] = row_merge_create_index(
- ctx->trx, ctx->new_table,
+ ctx->new_table,
&index_defs[a], add_v, ctx->col_names);
add_key_nums[a] = index_defs[a].key_number;
- if (!ctx->add_index[a]) {
- error = ctx->trx->error_state;
- DBUG_ASSERT(error != DB_SUCCESS);
- goto error_handling;
- }
-
DBUG_ASSERT(ctx->add_index[a]->is_committed()
== !!new_clustered);
+ }
- if (ctx->add_index[a]->type & DICT_FTS) {
- DBUG_ASSERT(num_fts_index);
- DBUG_ASSERT(!fts_index);
- DBUG_ASSERT(ctx->add_index[a]->type == DICT_FTS);
- fts_index = ctx->add_index[a];
- }
-
- /* If only online ALTER TABLE operations have been
- requested, allocate a modification log. If the table
- will be locked anyway, the modification
- log is unnecessary. When rebuilding the table
- (new_clustered), we will allocate the log for the
- clustered index of the old table, later. */
- if (new_clustered
- || !ctx->online
- || !user_table->is_readable()
- || dict_table_is_discarded(user_table)) {
- /* No need to allocate a modification log. */
- ut_ad(!ctx->add_index[a]->online_log);
- } else if (ctx->add_index[a]->type & DICT_FTS) {
- /* Fulltext indexes are not covered
- by a modification log. */
- } else {
- DBUG_EXECUTE_IF("innodb_OOM_prepare_inplace_alter",
- error = DB_OUT_OF_MEMORY;
- goto error_handling;);
- rw_lock_x_lock(&ctx->add_index[a]->lock);
+ if (ctx->need_rebuild() && ctx->new_table->supports_instant()) {
+ if (~ha_alter_info->handler_flags
+ & Alter_inplace_info::ADD_STORED_BASE_COLUMN) {
+ goto not_instant_add_column;
+ }
+
+ if (ha_alter_info->handler_flags
+ & (INNOBASE_ALTER_REBUILD
+ & ~Alter_inplace_info::ADD_STORED_BASE_COLUMN
+ & ~Alter_inplace_info::CHANGE_CREATE_OPTION)) {
+ goto not_instant_add_column;
+ }
- bool ok = row_log_allocate(ctx->prebuilt->trx,
- ctx->add_index[a],
- NULL, true, NULL, NULL,
- path);
- rw_lock_x_unlock(&ctx->add_index[a]->lock);
+ if ((ha_alter_info->handler_flags
+ & Alter_inplace_info::CHANGE_CREATE_OPTION)
+ && (ha_alter_info->create_info->used_fields
+ & (HA_CREATE_USED_ROW_FORMAT
+ | HA_CREATE_USED_KEY_BLOCK_SIZE))) {
+ goto not_instant_add_column;
+ }
- if (!ok) {
- error = DB_OUT_OF_MEMORY;
- goto error_handling;
+ for (uint i = ctx->old_table->n_cols - DATA_N_SYS_COLS;
+ i--; ) {
+ if (ctx->col_map[i] != i) {
+ goto not_instant_add_column;
}
}
+
+ DBUG_ASSERT(ctx->new_table->n_cols > ctx->old_table->n_cols);
+
+ if (ha_alter_info->handler_flags & INNOBASE_ONLINE_CREATE) {
+ /* At the moment, we disallow ADD [UNIQUE] INDEX
+ together with instant ADD COLUMN.
+
+ The main reason is that the work of instant
+ ADD must be done in commit_inplace_alter_table().
+ For the rollback_instant() to work, we must
+ add the columns to dict_table_t beforehand,
+ and roll back those changes in case the
+ transaction is rolled back.
+
+ If we added the columns to the dictionary cache
+ already in the prepare_inplace_alter_table(),
+ we would have to deal with column number
+ mismatch in ha_innobase::open(), write_row()
+ and other functions. */
+
+ /* FIXME: allow instant ADD COLUMN together
+ with ADD INDEX on pre-existing columns. */
+ goto not_instant_add_column;
+ }
+
+ for (uint a = 0; a < ctx->num_to_add_index; a++) {
+ error = dict_index_add_to_cache_w_vcol(
+ ctx->new_table, ctx->add_index[a], add_v,
+ FIL_NULL, false);
+ ut_a(error == DB_SUCCESS);
+ }
+ DBUG_ASSERT(ha_alter_info->key_count
+ + dict_index_is_auto_gen_clust(
+ dict_table_get_first_index(ctx->new_table))
+ == ctx->num_to_add_index);
+ ctx->num_to_add_index = 0;
+ ctx->add_index = NULL;
+
+ uint i = 0; // index into the stored columns ctx->new_table->cols[]
+ Field **af = altered_table->field;
+
+ List_iterator_fast<Create_field> cf_it(
+ ha_alter_info->alter_info->create_list);
+
+ while (const Create_field* new_field = cf_it++) {
+ DBUG_ASSERT(!new_field->field
+ || std::find(old_table->field,
+ old_table->field
+ + old_table->s->fields,
+ new_field->field) !=
+ old_table->field + old_table->s->fields);
+ DBUG_ASSERT(new_field->field
+ || !strcmp(new_field->field_name.str,
+ (*af)->field_name.str));
+
+ if (!(*af)->stored_in_db()) {
+ af++;
+ continue;
+ }
+
+ dict_col_t* col = dict_table_get_nth_col(
+ ctx->new_table, i);
+ DBUG_ASSERT(!strcmp((*af)->field_name.str,
+ dict_table_get_col_name(ctx->new_table,
+ i)));
+ DBUG_ASSERT(!col->is_instant());
+
+ if (new_field->field) {
+ ut_d(const dict_col_t* old_col
+ = dict_table_get_nth_col(user_table, i));
+ ut_d(const dict_index_t* index
+ = user_table->indexes.start);
+ DBUG_ASSERT(col->mtype == old_col->mtype);
+ DBUG_ASSERT(col->prtype == old_col->prtype);
+ DBUG_ASSERT(col->mbminmaxlen
+ == old_col->mbminmaxlen);
+ DBUG_ASSERT(col->len >= old_col->len);
+ DBUG_ASSERT(old_col->is_instant()
+ == (dict_col_get_clust_pos(
+ old_col, index)
+ >= index->n_core_fields));
+ } else if ((*af)->is_real_null()) {
+ /* DEFAULT NULL */
+ col->def_val.len = UNIV_SQL_NULL;
+ } else {
+ switch ((*af)->type()) {
+ case MYSQL_TYPE_VARCHAR:
+ col->def_val.len = reinterpret_cast
+ <const Field_varstring*>
+ ((*af))->get_length();
+ col->def_val.data = reinterpret_cast
+ <const Field_varstring*>
+ ((*af))->get_data();
+ break;
+ case MYSQL_TYPE_GEOMETRY:
+ case MYSQL_TYPE_TINY_BLOB:
+ case MYSQL_TYPE_MEDIUM_BLOB:
+ case MYSQL_TYPE_BLOB:
+ case MYSQL_TYPE_LONG_BLOB:
+ col->def_val.len = reinterpret_cast
+ <const Field_blob*>
+ ((*af))->get_length();
+ col->def_val.data = reinterpret_cast
+ <const Field_blob*>
+ ((*af))->get_ptr();
+ break;
+ default:
+ dfield_t d;
+ dict_col_copy_type(col, &d.type);
+ ulint len = (*af)->pack_length();
+ DBUG_ASSERT(len <= 8
+ || d.type.mtype
+ != DATA_INT);
+ row_mysql_store_col_in_innobase_format(
+ &d,
+ d.type.mtype == DATA_INT
+ ? static_cast<byte*>(
+ mem_heap_alloc(
+ ctx->heap,
+ len))
+ : NULL,
+ true, (*af)->ptr, len,
+ dict_table_is_comp(
+ user_table));
+ col->def_val.len = d.len;
+ col->def_val.data = d.data;
+ }
+ }
+
+ i++;
+ af++;
+ }
+
+ DBUG_ASSERT(af == altered_table->field
+ + altered_table->s->fields);
+ /* There might exist a hidden FTS_DOC_ID column for
+ FULLTEXT INDEX. If it exists, the columns should have
+ been implicitly added by ADD FULLTEXT INDEX together
+ with instant ADD COLUMN. (If a hidden FTS_DOC_ID pre-existed,
+ then the ctx->col_map[] check should have prevented
+ adding visible user columns after that.) */
+ DBUG_ASSERT(DATA_N_SYS_COLS + i == ctx->new_table->n_cols
+ || (1 + DATA_N_SYS_COLS + i
+ == ctx->new_table->n_cols
+ && !strcmp(dict_table_get_col_name(
+ ctx->new_table, i),
+ FTS_DOC_ID_COL_NAME)));
+
+ ctx->prepare_instant();
}
- ut_ad(new_clustered == ctx->need_rebuild());
+ if (ctx->need_rebuild()) {
+not_instant_add_column:
+ uint32_t key_id = FIL_DEFAULT_ENCRYPTION_KEY;
+ fil_encryption_t mode = FIL_ENCRYPTION_DEFAULT;
- DBUG_EXECUTE_IF("innodb_OOM_prepare_inplace_alter",
- error = DB_OUT_OF_MEMORY;
- goto error_handling;);
+ if (fil_space_t* s = fil_space_acquire(user_table->space)) {
+ if (const fil_space_crypt_t* c = s->crypt_data) {
+ key_id = c->key_id;
+ mode = c->encryption;
+ }
+ fil_space_release(s);
+ }
+
+ if (dict_table_get_low(ctx->new_table->name.m_name)) {
+ my_error(ER_TABLE_EXISTS_ERROR, MYF(0),
+ ctx->new_table->name.m_name);
+ goto new_clustered_failed;
+ }
+
+ /* Create the table. */
+ trx_set_dict_operation(ctx->trx, TRX_DICT_OP_TABLE);
+
+ error = row_create_table_for_mysql(
+ ctx->new_table, ctx->trx, mode, key_id);
+
+ switch (error) {
+ dict_table_t* temp_table;
+ case DB_SUCCESS:
+ /* We need to bump up the table ref count and
+ open the table before we can use it. The
+ new_table must be in the data
+ dictionary cache, because we are still holding
+ the dict_sys->mutex. */
+ ut_ad(mutex_own(&dict_sys->mutex));
+ temp_table = dict_table_open_on_name(
+ ctx->new_table->name.m_name, TRUE, FALSE,
+ DICT_ERR_IGNORE_NONE);
+ ut_a(ctx->new_table == temp_table);
+ /* n_ref_count must be 1, because purge cannot
+ be executing on this very table as we are
+ holding dict_operation_lock X-latch. */
+ DBUG_ASSERT(ctx->new_table->get_ref_count() == 1);
+ DBUG_ASSERT(ctx->new_table->id != 0);
+ DBUG_ASSERT(ctx->new_table->id == ctx->trx->table_id);
+ break;
+ case DB_TABLESPACE_EXISTS:
+ my_error(ER_TABLESPACE_EXISTS, MYF(0),
+ ctx->new_table->name.m_name);
+ goto new_table_failed;
+ case DB_DUPLICATE_KEY:
+ my_error(HA_ERR_TABLE_EXIST, MYF(0),
+ altered_table->s->table_name.str);
+ goto new_table_failed;
+ case DB_UNSUPPORTED:
+ my_error(ER_UNSUPPORTED_EXTENSION, MYF(0),
+ ctx->new_table->name.m_name);
+ goto new_table_failed;
+ default:
+ my_error_innodb(error, table_name, flags);
+new_table_failed:
+ DBUG_ASSERT(ctx->trx != ctx->prebuilt->trx);
+ goto new_clustered_failed;
+ }
+
+ for (ulint a = 0; a < ctx->num_to_add_index; a++) {
+ dict_index_t*& index = ctx->add_index[a];
+ const bool has_new_v_col = index->has_new_v_col;
+ error = create_index_dict(ctx->trx, index, add_v);
+ if (error != DB_SUCCESS) {
+ while (++a < ctx->num_to_add_index) {
+ dict_mem_index_free(ctx->add_index[a]);
+ }
+ goto error_handling;
+ }
+
+ index = dict_table_get_index_on_name(
+ ctx->new_table, index_defs[a].name, true);
+ ut_a(index);
+
+ index->parser = index_defs[a].parser;
+ index->has_new_v_col = has_new_v_col;
+ /* Note the id of the transaction that created this
+ index; we use it to restrict readers from accessing
+ this index, to ensure read consistency. */
+ ut_ad(index->trx_id == ctx->trx->id);
+
+ if (index->type & DICT_FTS) {
+ DBUG_ASSERT(num_fts_index);
+ DBUG_ASSERT(!fts_index);
+ DBUG_ASSERT(index->type == DICT_FTS);
+ fts_index = ctx->add_index[a];
+ }
+ }
- if (new_clustered) {
dict_index_t* clust_index = dict_table_get_first_index(
user_table);
dict_index_t* new_clust_index = dict_table_get_first_index(
@@ -4882,6 +5369,11 @@ new_clustered_failed:
DBUG_EXECUTE_IF("innodb_alter_table_pk_assert_no_sort",
DBUG_ASSERT(ctx->skip_pk_sort););
+ ut_ad(!new_clust_index->is_instant());
+ /* row_merge_build_index() depends on the correct value */
+ ut_ad(new_clust_index->n_core_null_bytes
+ == UT_BITS_IN_BYTES(new_clust_index->n_nullable));
+
DBUG_ASSERT(!ctx->new_table->persistent_autoinc);
if (const Field* ai = altered_table->found_next_number_field) {
const unsigned col_no = innodb_col_no(ai);
@@ -4915,9 +5407,70 @@ new_clustered_failed:
goto error_handling;
}
}
+ } else if (ctx->num_to_add_index) {
+ ut_ad(!ctx->is_instant());
+ ctx->trx->table_id = user_table->id;
+
+ for (ulint a = 0; a < ctx->num_to_add_index; a++) {
+ dict_index_t*& index = ctx->add_index[a];
+ const bool has_new_v_col = index->has_new_v_col;
+ error = create_index_dict(ctx->trx, index, add_v);
+ if (error != DB_SUCCESS) {
+error_handling_drop_uncached:
+ while (++a < ctx->num_to_add_index) {
+ dict_mem_index_free(ctx->add_index[a]);
+ }
+ goto error_handling;
+ }
+
+ index = dict_table_get_index_on_name(
+ ctx->new_table, index_defs[a].name, false);
+ ut_a(index);
+
+ index->parser = index_defs[a].parser;
+ index->has_new_v_col = has_new_v_col;
+ /* Note the id of the transaction that created this
+ index; we use it to restrict readers from accessing
+ this index, to ensure read consistency. */
+ ut_ad(index->trx_id == ctx->trx->id);
+
+ /* If ADD INDEX with LOCK=NONE has been
+ requested, allocate a modification log. */
+ if (index->type & DICT_FTS) {
+ DBUG_ASSERT(num_fts_index);
+ DBUG_ASSERT(!fts_index);
+ DBUG_ASSERT(index->type == DICT_FTS);
+ fts_index = ctx->add_index[a];
+ /* Fulltext indexes are not covered
+ by a modification log. */
+ } else if (!ctx->online
+ || !user_table->is_readable()
+ || dict_table_is_discarded(user_table)) {
+ /* No need to allocate a modification log. */
+ DBUG_ASSERT(!index->online_log);
+ } else {
+ DBUG_EXECUTE_IF(
+ "innodb_OOM_prepare_inplace_alter",
+ error = DB_OUT_OF_MEMORY;
+ goto error_handling_drop_uncached;);
+ rw_lock_x_lock(&ctx->add_index[a]->lock);
+
+ bool ok = row_log_allocate(
+ ctx->prebuilt->trx,
+ index,
+ NULL, true, NULL, NULL,
+ path);
+ rw_lock_x_unlock(&index->lock);
+
+ if (!ok) {
+ error = DB_OUT_OF_MEMORY;
+ goto error_handling_drop_uncached;
+ }
+ }
+ }
}
- if (ctx->online) {
+ if (ctx->online && ctx->num_to_add_index) {
/* Assign a consistent read view for
row_merge_read_clustered_index(). */
trx_assign_read_view(ctx->prebuilt->trx);
@@ -4944,8 +5497,8 @@ op_ok:
ut_ad(rw_lock_own(dict_operation_lock, RW_LOCK_X));
DICT_TF2_FLAG_SET(ctx->new_table, DICT_TF2_FTS);
- if (new_clustered) {
- /* For !new_clustered, this will be set at
+ if (ctx->need_rebuild()) {
+ /* For !ctx->need_rebuild(), this will be set at
commit_cache_norebuild(). */
ctx->new_table->fts_doc_id_index
= dict_table_get_index_on_name(
@@ -5044,6 +5597,11 @@ error_handling:
error_handled:
ctx->prebuilt->trx->error_info = NULL;
+
+ if (!ctx->trx) {
+ goto err_exit;
+ }
+
ctx->trx->error_state = DB_SUCCESS;
if (!dict_locked) {
@@ -5105,9 +5663,11 @@ err_exit:
}
#endif /* UNIV_DEBUG */
- row_mysql_unlock_data_dictionary(ctx->trx);
+ if (ctx->trx) {
+ row_mysql_unlock_data_dictionary(ctx->trx);
- trx_free_for_mysql(ctx->trx);
+ trx_free_for_mysql(ctx->trx);
+ }
trx_commit_for_mysql(ctx->prebuilt->trx);
delete ctx;
@@ -6360,6 +6920,8 @@ ok_exit:
DBUG_ASSERT(ctx->trx);
DBUG_ASSERT(ctx->prebuilt == m_prebuilt);
+ if (ctx->is_instant()) goto ok_exit;
+
dict_index_t* pk = dict_table_get_first_index(m_prebuilt->table);
ut_ad(pk != NULL);
@@ -8055,28 +8617,31 @@ commit_try_norebuild(
DBUG_RETURN(true);
}
+ if (innobase_add_instant_try(ha_alter_info, ctx, altered_table,
+ old_table, trx)) {
+ DBUG_RETURN(true);
+ }
+
DBUG_RETURN(false);
}
/** Commit the changes to the data dictionary cache
after a successful commit_try_norebuild() call.
-@param ctx In-place ALTER TABLE context
+@param ha_alter_info algorithm=inplace context
+@param ctx In-place ALTER TABLE context for the current partition
@param table the TABLE before the ALTER
-@param trx Data dictionary transaction object
-(will be started and committed)
-@return whether all replacements were found for dropped indexes */
-inline MY_ATTRIBUTE((nonnull, warn_unused_result))
-bool
+@param trx Data dictionary transaction
+(will be started and committed, for DROP INDEX) */
+inline MY_ATTRIBUTE((nonnull))
+void
commit_cache_norebuild(
/*===================*/
+ Alter_inplace_info* ha_alter_info,
ha_innobase_inplace_ctx*ctx,
const TABLE* table,
trx_t* trx)
{
DBUG_ENTER("commit_cache_norebuild");
-
- bool found = true;
-
DBUG_ASSERT(!ctx->need_rebuild());
col_set drop_list;
@@ -8132,7 +8697,7 @@ commit_cache_norebuild(
if (!dict_foreign_replace_index(
index->table, ctx->col_names, index)) {
- found = false;
+ ut_a(!ctx->prebuilt->trx->check_foreigns);
}
/* Mark the index dropped
@@ -8164,6 +8729,15 @@ commit_cache_norebuild(
trx_commit_for_mysql(trx);
}
+ if (!ctx->is_instant()) {
+ innobase_rename_or_enlarge_columns_cache(
+ ha_alter_info, table, ctx->new_table);
+ }
+
+#ifdef MYSQL_RENAME_INDEX
+ rename_indexes_in_cache(ctx, ha_alter_info);
+#endif
+
ctx->new_table->fts_doc_id_index
= ctx->new_table->fts
? dict_table_get_index_on_name(
@@ -8171,8 +8745,7 @@ commit_cache_norebuild(
: NULL;
DBUG_ASSERT((ctx->new_table->fts == NULL)
== (ctx->new_table->fts_doc_id_index == NULL));
-
- DBUG_RETURN(found);
+ DBUG_VOID_RETURN;
}
/** Adjust the persistent statistics after non-rebuilding ALTER TABLE.
@@ -8588,9 +9161,16 @@ ha_innobase::commit_inplace_alter_table(
}
/* Commit or roll back the changes to the data dictionary. */
+ DEBUG_SYNC(m_user_thd, "innodb_alter_inplace_before_commit");
if (fail) {
trx_rollback_for_mysql(trx);
+ for (inplace_alter_handler_ctx** pctx = ctx_array;
+ *pctx; pctx++) {
+ ha_innobase_inplace_ctx* ctx
+ = static_cast<ha_innobase_inplace_ctx*>(*pctx);
+ ctx->rollback_instant();
+ }
} else if (!new_clustered) {
trx_commit_for_mysql(trx);
} else {
@@ -8767,19 +9347,9 @@ foreign_fail:
"InnoDB: Could not add foreign"
" key constraints.");
} else {
- if (!commit_cache_norebuild(
- ctx, table, trx)) {
- ut_a(!m_prebuilt->trx->check_foreigns);
- }
-
- innobase_rename_or_enlarge_columns_cache(
- ha_alter_info, table,
- ctx->new_table);
-#ifdef MYSQL_RENAME_INDEX
- rename_indexes_in_cache(ctx, ha_alter_info);
-#endif
+ commit_cache_norebuild(ha_alter_info, ctx,
+ table, trx);
}
-
}
dict_mem_table_free_foreign_vcol_set(ctx->new_table);
diff --git a/storage/innobase/handler/i_s.cc b/storage/innobase/handler/i_s.cc
index 7bc49792e32..632fcebc2f6 100644
--- a/storage/innobase/handler/i_s.cc
+++ b/storage/innobase/handler/i_s.cc
@@ -5041,13 +5041,15 @@ i_s_innodb_set_page_type(
in the i_s_page_type[] array is I_S_PAGE_TYPE_INDEX
(1) for index pages or I_S_PAGE_TYPE_IBUF for
change buffer index pages */
- if (page_info->index_id
- == static_cast<index_id_t>(DICT_IBUF_ID_MIN
- + IBUF_SPACE_ID)) {
- page_info->page_type = I_S_PAGE_TYPE_IBUF;
- } else if (page_type == FIL_PAGE_RTREE) {
+ if (page_type == FIL_PAGE_RTREE) {
page_info->page_type = I_S_PAGE_TYPE_RTREE;
+ } else if (page_info->index_id
+ == static_cast<index_id_t>(DICT_IBUF_ID_MIN
+ + IBUF_SPACE_ID)) {
+ page_info->page_type = I_S_PAGE_TYPE_IBUF;
} else {
+ ut_ad(page_type == FIL_PAGE_INDEX
+ || page_type == FIL_PAGE_TYPE_INSTANT);
page_info->page_type = I_S_PAGE_TYPE_INDEX;
}
diff --git a/storage/innobase/ibuf/ibuf0ibuf.cc b/storage/innobase/ibuf/ibuf0ibuf.cc
index b53ede41427..e52dfe12b6a 100644
--- a/storage/innobase/ibuf/ibuf0ibuf.cc
+++ b/storage/innobase/ibuf/ibuf0ibuf.cc
@@ -1625,6 +1625,8 @@ ibuf_build_entry_from_ibuf_rec_func(
ibuf_dummy_index_add_col(index, dfield_get_type(field), len);
}
+ index->n_core_null_bytes = UT_BITS_IN_BYTES(index->n_nullable);
+
/* Prevent an ut_ad() failure in page_zip_write_rec() by
adding system columns to the dummy table pointed to by the
dummy secondary index. The insert buffer is only used for
diff --git a/storage/innobase/include/btr0btr.h b/storage/innobase/include/btr0btr.h
index 1d7710a1496..cff8bc7cbc9 100644
--- a/storage/innobase/include/btr0btr.h
+++ b/storage/innobase/include/btr0btr.h
@@ -680,6 +680,20 @@ btr_page_free(
buf_block_t* block, /*!< in: block to be freed, x-latched */
mtr_t* mtr) /*!< in: mtr */
MY_ATTRIBUTE((nonnull));
+/** Empty an index page (possibly the root page). @see btr_page_create().
+@param[in,out] block page to be emptied
+@param[in,out] page_zip compressed page frame, or NULL
+@param[in] index index of the page
+@param[in] level B-tree level of the page (0=leaf)
+@param[in,out] mtr mini-transaction */
+void
+btr_page_empty(
+ buf_block_t* block,
+ page_zip_des_t* page_zip,
+ dict_index_t* index,
+ ulint level,
+ mtr_t* mtr)
+ MY_ATTRIBUTE((nonnull(1, 3, 5)));
/**************************************************************//**
Creates a new index page (not the root, and also not
used in page reorganization). @see btr_page_empty(). */
diff --git a/storage/innobase/include/btr0cur.h b/storage/innobase/include/btr0cur.h
index e62a5e90ce2..0445d0ef59c 100644
--- a/storage/innobase/include/btr0cur.h
+++ b/storage/innobase/include/btr0cur.h
@@ -132,6 +132,24 @@ btr_cur_position(
buf_block_t* block, /*!< in: buffer block of rec */
btr_cur_t* cursor);/*!< in: cursor */
+/** Load the instant ALTER TABLE metadata from the clustered index
+when loading a table definition.
+@param[in,out] table table definition from the data dictionary
+@return error code
+@retval DB_SUCCESS if no error occurred */
+dberr_t
+btr_cur_instant_init(dict_table_t* table)
+ ATTRIBUTE_COLD __attribute__((nonnull, warn_unused_result));
+
+/** Initialize the n_core_null_bytes on first access to a clustered
+index root page.
+@param[in] index clustered index whose root page is being accessed for the first time
+@param[in] page clustered index root page
+@return whether the page is corrupted */
+bool
+btr_cur_instant_root_init(dict_index_t* index, const page_t* page)
+ ATTRIBUTE_COLD __attribute__((nonnull, warn_unused_result));
+
/** Optimistically latches the leaf page or pages requested.
@param[in] block guessed buffer block
@param[in] modify_clock modify clock value
diff --git a/storage/innobase/include/btr0cur.ic b/storage/innobase/include/btr0cur.ic
index b1e59651a1d..56868cca336 100644
--- a/storage/innobase/include/btr0cur.ic
+++ b/storage/innobase/include/btr0cur.ic
@@ -28,7 +28,7 @@ Created 10/16/1994 Heikki Tuuri
#ifdef UNIV_DEBUG
# define LIMIT_OPTIMISTIC_INSERT_DEBUG(NREC, CODE)\
if (btr_cur_limit_optimistic_insert_debug > 1\
- && (NREC) >= (ulint)btr_cur_limit_optimistic_insert_debug) {\
+ && (NREC) >= btr_cur_limit_optimistic_insert_debug) {\
CODE;\
}
#else
@@ -134,7 +134,7 @@ btr_cur_compress_recommendation(
page = btr_cur_get_page(cursor);
- LIMIT_OPTIMISTIC_INSERT_DEBUG(page_get_n_recs(page) * 2,
+ LIMIT_OPTIMISTIC_INSERT_DEBUG(page_get_n_recs(page) * 2U,
return(FALSE));
if ((page_get_data_size(page)
diff --git a/storage/innobase/include/btr0sea.h b/storage/innobase/include/btr0sea.h
index fad0dac93c4..bd1a72fc3ac 100644
--- a/storage/innobase/include/btr0sea.h
+++ b/storage/innobase/include/btr0sea.h
@@ -1,6 +1,7 @@
/*****************************************************************************
Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
+Copyright (c) 2017, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -106,19 +107,16 @@ btr_search_guess_on_hash(
ulint has_search_latch,
mtr_t* mtr);
-/** Moves or deletes hash entries for moved records. If new_page is already
-hashed, then the hash index for page, if any, is dropped. If new_page is not
-hashed, and page is hashed, then a new hash index is built to new_page with the
-same parameters as page (this often happens when a page is split).
-@param[in,out] new_block records are copied to this page.
-@param[in,out] block index page from which record are copied, and the
- copied records will be deleted from this page.
-@param[in,out] index record descriptor */
+/** Move or delete hash entries for moved records, usually in a page split.
+If new_block is already hashed, then any hash index for block is dropped.
+If new_block is not hashed, and block is hashed, then a new hash index is
+built to new_block with the same parameters as block.
+@param[in,out] new_block destination page
+@param[in,out] block source page (subject to deletion later) */
void
btr_search_move_or_delete_hash_entries(
buf_block_t* new_block,
- buf_block_t* block,
- dict_index_t* index);
+ buf_block_t* block);
/** Drop any adaptive hash index entries that point to an index page.
@param[in,out] block block containing index page, s- or x-latched, or an
@@ -252,7 +250,7 @@ btr_get_search_table(const dict_index_t* index);
# define btr_search_x_lock(index)
# define btr_search_x_unlock(index)
# define btr_search_info_update(index, cursor)
-# define btr_search_move_or_delete_hash_entries(new_block, block, index)
+# define btr_search_move_or_delete_hash_entries(new_block, block)
# define btr_search_update_hash_on_insert(cursor)
# define btr_search_update_hash_on_delete(cursor)
# define btr_search_sys_resize(hash_size)
diff --git a/storage/innobase/include/data0data.h b/storage/innobase/include/data0data.h
index b6187d46025..a0b3059ad40 100644
--- a/storage/innobase/include/data0data.h
+++ b/storage/innobase/include/data0data.h
@@ -619,6 +619,15 @@ struct dtuple_t {
/** Value of dtuple_t::magic_n */
# define DATA_TUPLE_MAGIC_N 65478679
#endif /* UNIV_DEBUG */
+
+ /** Trim the tail of an index tuple before insert or update.
+ After instant ADD COLUMN, if the last fields of a clustered index tuple
+ match the 'default row', there will be no need to store them.
+ NOTE: A page latch in the index must be held, so that the index
+ may not lose 'instantness' before the trimmed tuple has been
+ inserted or updated.
+ @param[in] index index possibly with instantly added columns */
+ void trim(const dict_index_t& index);
};
/** A slot for a field in a big rec vector */
diff --git a/storage/innobase/include/data0data.ic b/storage/innobase/include/data0data.ic
index 81788885aa5..310902f5166 100644
--- a/storage/innobase/include/data0data.ic
+++ b/storage/innobase/include/data0data.ic
@@ -94,6 +94,7 @@ dfield_get_len(
ut_ad(field);
ut_ad((field->len == UNIV_SQL_NULL)
|| (field->data != &data_error));
+ ut_ad(field->len != UNIV_SQL_DEFAULT);
return(field->len);
}
@@ -108,6 +109,7 @@ dfield_set_len(
ulint len) /*!< in: length or UNIV_SQL_NULL */
{
ut_ad(field);
+ ut_ad(len != UNIV_SQL_DEFAULT);
#ifdef UNIV_VALGRIND_DEBUG
if (len != UNIV_SQL_NULL) UNIV_MEM_ASSERT_RW(field->data, len);
#endif /* UNIV_VALGRIND_DEBUG */
@@ -326,6 +328,7 @@ dfield_data_is_binary_equal(
ulint len, /*!< in: data length or UNIV_SQL_NULL */
const byte* data) /*!< in: data */
{
+ ut_ad(len != UNIV_SQL_DEFAULT);
return(len == dfield_get_len(field)
&& (len == UNIV_SQL_NULL
|| !memcmp(dfield_get_data(field), data, len)));
diff --git a/storage/innobase/include/data0type.h b/storage/innobase/include/data0type.h
index a7c7bc92ee9..bd2a15fe881 100644
--- a/storage/innobase/include/data0type.h
+++ b/storage/innobase/include/data0type.h
@@ -29,6 +29,12 @@ Created 1/16/1996 Heikki Tuuri
#include "univ.i"
+/** Special length indicating a missing instantly added column */
+#define UNIV_SQL_DEFAULT (UNIV_SQL_NULL - 1)
+
+/** @return whether a length is actually stored in a field */
+#define len_is_stored(len) (len != UNIV_SQL_NULL && len != UNIV_SQL_DEFAULT)
+
extern ulint data_mysql_default_charset_coll;
#define DATA_MYSQL_BINARY_CHARSET_COLL 63
diff --git a/storage/innobase/include/dict0dict.h b/storage/innobase/include/dict0dict.h
index 736419f9dd7..03175936f7e 100644
--- a/storage/innobase/include/dict0dict.h
+++ b/storage/innobase/include/dict0dict.h
@@ -387,15 +387,6 @@ dict_table_add_system_columns(
mem_heap_t* heap) /*!< in: temporary heap */
MY_ATTRIBUTE((nonnull));
/**********************************************************************//**
-Adds a table object to the dictionary cache. */
-void
-dict_table_add_to_cache(
-/*====================*/
- dict_table_t* table, /*!< in: table */
- bool can_be_evicted, /*!< in: whether can be evicted*/
- mem_heap_t* heap) /*!< in: temporary heap */
- MY_ATTRIBUTE((nonnull));
-/**********************************************************************//**
Removes a table object from the dictionary cache. */
void
dict_table_remove_from_cache(
@@ -589,16 +580,6 @@ dict_foreign_find_index(
happened */
MY_ATTRIBUTE((nonnull(1,3), warn_unused_result));
-/**********************************************************************//**
-Returns a column's name.
-@return column name. NOTE: not guaranteed to stay valid if table is
-modified in any way (columns added, etc.). */
-const char*
-dict_table_get_col_name(
-/*====================*/
- const dict_table_t* table, /*!< in: table */
- ulint col_nr) /*!< in: column number */
- MY_ATTRIBUTE((nonnull, warn_unused_result));
/** Returns a virtual column's name.
@param[in] table table object
@@ -920,6 +901,18 @@ dict_table_get_sys_col(
/* Get nth virtual columns */
#define dict_table_get_nth_v_col(table, pos) (&(table)->v_cols[pos])
#endif /* UNIV_DEBUG */
+/** Wrapper function.
+@see dict_col_t::name()
+@param[in] table table
+@param[in] col_nr column number in table
+@return column name */
+inline
+const char*
+dict_table_get_col_name(const dict_table_t* table, ulint col_nr)
+{
+ return(dict_table_get_nth_col(table, col_nr)->name(*table));
+}
+
/********************************************************************//**
Gets the given system column number of a table.
@return column number */
@@ -1163,6 +1156,7 @@ dict_index_get_n_fields(
representation of index (in
the dictionary cache) */
MY_ATTRIBUTE((nonnull, warn_unused_result));
+
/********************************************************************//**
Gets the number of fields in the internal representation of an index
that uniquely determine the position of an index entry in the index, if
@@ -1451,22 +1445,13 @@ dict_index_copy_rec_order_prefix(
@param[in,out] heap memory heap for allocation
@return own: data tuple */
dtuple_t*
-dict_index_build_data_tuple_func(
+dict_index_build_data_tuple(
const rec_t* rec,
const dict_index_t* index,
-#ifdef UNIV_DEBUG
bool leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
mem_heap_t* heap)
MY_ATTRIBUTE((nonnull, warn_unused_result));
-#ifdef UNIV_DEBUG
-# define dict_index_build_data_tuple(rec, index, leaf, n_fields, heap) \
- dict_index_build_data_tuple_func(rec, index, leaf, n_fields, heap)
-#else /* UNIV_DEBUG */
-# define dict_index_build_data_tuple(rec, index, leaf, n_fields, heap) \
- dict_index_build_data_tuple_func(rec, index, n_fields, heap)
-#endif /* UNIV_DEBUG */
/*********************************************************************//**
Gets the space id of the root of the index tree.
@@ -1978,13 +1963,7 @@ dict_index_node_ptr_max_size(
/*=========================*/
const dict_index_t* index) /*!< in: index */
MY_ATTRIBUTE((warn_unused_result));
-/** Check if a column is a virtual column
-@param[in] col column
-@return true if it is a virtual column, false otherwise */
-UNIV_INLINE
-bool
-dict_col_is_virtual(
- const dict_col_t* col);
+#define dict_col_is_virtual(col) (col)->is_virtual()
/** encode number of columns and number of virtual columns in one
4 bytes value. We could do this because the number of columns in
diff --git a/storage/innobase/include/dict0dict.ic b/storage/innobase/include/dict0dict.ic
index 134a4d63066..06cd2434942 100644
--- a/storage/innobase/include/dict0dict.ic
+++ b/storage/innobase/include/dict0dict.ic
@@ -89,16 +89,6 @@ dict_col_copy_type(
type->len = col->len;
type->mbminmaxlen = col->mbminmaxlen;
}
-/** Check if a column is a virtual column
-@param[in] col column
-@return true if it is a virtual column, false otherwise */
-UNIV_INLINE
-bool
-dict_col_is_virtual(
- const dict_col_t* col)
-{
- return(col->prtype & DATA_VIRTUAL);
-}
#ifdef UNIV_DEBUG
/*********************************************************************//**
@@ -296,8 +286,7 @@ dict_index_is_clust(
const dict_index_t* index) /*!< in: index */
{
ut_ad(index->magic_n == DICT_INDEX_MAGIC_N);
-
- return(index->type & DICT_CLUSTERED);
+ return(index->is_clust());
}
/** Check if index is auto-generated clustered index.
@@ -547,8 +536,8 @@ dict_table_get_nth_v_col(
ut_ad(table);
ut_ad(pos < table->n_v_def);
ut_ad(table->magic_n == DICT_TABLE_MAGIC_N);
-
- return(static_cast<dict_v_col_t*>(table->v_cols) + pos);
+ ut_ad(!table->v_cols[pos].m_col.is_instant());
+ return &table->v_cols[pos];
}
/********************************************************************//**
diff --git a/storage/innobase/include/dict0mem.h b/storage/innobase/include/dict0mem.h
index 5c285ef215d..152ff3f2fd6 100644
--- a/storage/innobase/include/dict0mem.h
+++ b/storage/innobase/include/dict0mem.h
@@ -612,6 +612,47 @@ struct dict_col_t{
this column. Our current max limit is
3072 (REC_VERSION_56_MAX_INDEX_COL_LEN)
bytes. */
+
+ /** Data for instantly added columns */
+ struct {
+ /** original default value of instantly added column */
+ const void* data;
+ /** len of data, or UNIV_SQL_DEFAULT if unavailable */
+ ulint len;
+ } def_val;
+
+ /** Retrieve the column name.
+ @param[in] table the table that the column belongs to */
+ const char* name(const dict_table_t& table) const;
+
+ /** @return whether this is a virtual column */
+ bool is_virtual() const { return prtype & DATA_VIRTUAL; }
+ /** @return whether NULL is an allowed value for this column */
+ bool is_nullable() const { return !(prtype & DATA_NOT_NULL); }
+ /** @return whether this is an instantly-added column */
+ bool is_instant() const
+ {
+ DBUG_ASSERT(def_val.len != UNIV_SQL_DEFAULT || !def_val.data);
+ return def_val.len != UNIV_SQL_DEFAULT;
+ }
+ /** Get the default value of an instantly-added column.
+ @param[out] len value length (in bytes), or UNIV_SQL_NULL
+ @return default value
+ @retval NULL if the default value is SQL NULL (len=UNIV_SQL_NULL) */
+ const byte* instant_value(ulint* len) const
+ {
+ DBUG_ASSERT(is_instant());
+ *len = def_val.len;
+ return static_cast<const byte*>(def_val.data);
+ }
+
+ /** Remove the 'instant ADD' status of the column */
+ void remove_instant()
+ {
+ DBUG_ASSERT(is_instant());
+ def_val.len = UNIV_SQL_DEFAULT;
+ def_val.data = NULL;
+ }
};
/** Index information put in a list of virtual column structure. Index
@@ -623,6 +664,9 @@ struct dict_v_idx_t {
/** position in this index */
ulint nth_field;
+
+ dict_v_idx_t(dict_index_t* index, ulint nth_field)
+ : index(index), nth_field(nth_field) {}
};
/** Index list to put in dict_v_col_t */
@@ -726,6 +770,15 @@ struct dict_field_t{
unsigned fixed_len:10; /*!< 0 or the fixed length of the
column if smaller than
DICT_ANTELOPE_MAX_INDEX_COL_LEN */
+
+ /** Check whether two index fields are equivalent.
+ @param[in] other the other index field
+ @return whether the index fields are equivalent */
+ bool same(const dict_field_t& other) const
+ {
+ return(prefix_len == other.prefix_len
+ && fixed_len == other.fixed_len);
+ }
};
/**********************************************************************//**
@@ -844,6 +897,15 @@ struct dict_index_t{
unsigned n_def:10;/*!< number of fields defined so far */
unsigned n_fields:10;/*!< number of fields in the index */
unsigned n_nullable:10;/*!< number of nullable fields */
+ unsigned n_core_fields:10;/*!< number of fields in the index
+ (before the first instant ADD COLUMN) */
+ /** number of bytes of null bits in ROW_FORMAT!=REDUNDANT node pointer
+ records; usually equal to UT_BITS_IN_BYTES(n_nullable), but
+ can be less in clustered indexes with instant ADD COLUMN */
+ unsigned n_core_null_bytes:8;
+ /** magic value signalling that n_core_null_bytes was not
+ initialized yet */
+ static const unsigned NO_CORE_NULL_BYTES = 0xff;
unsigned cached:1;/*!< TRUE if the index object is in the
dictionary cache */
unsigned to_be_dropped:1;
@@ -970,6 +1032,63 @@ struct dict_index_t{
and the .ibd file is missing, or a
page cannot be read or decrypted */
inline bool is_readable() const;
+
+ /** @return whether instant ADD COLUMN is in effect */
+ inline bool is_instant() const;
+
+ /** @return whether the index is the clustered index */
+ bool is_clust() const { return type & DICT_CLUSTERED; }
+
+ /** Determine how many fields of a given prefix can be set NULL.
+ @param[in] n_prefix number of fields in the prefix
+ @return number of fields 0..n_prefix-1 that can be set NULL */
+ unsigned get_n_nullable(ulint n_prefix) const
+ {
+ DBUG_ASSERT(is_instant());
+ DBUG_ASSERT(n_prefix > 0);
+ DBUG_ASSERT(n_prefix <= n_fields);
+ unsigned n = n_nullable;
+ for (; n_prefix < n_fields; n_prefix++) {
+ const dict_col_t* col = fields[n_prefix].col;
+ DBUG_ASSERT(is_dummy || col->is_instant());
+ DBUG_ASSERT(!col->is_virtual());
+ n -= col->is_nullable();
+ }
+ DBUG_ASSERT(n < n_def);
+ return n;
+ }
+
+ /** Get the default value of an instantly-added clustered index field.
+ @param[in] n instantly added field position
+ @param[out] len value length (in bytes), or UNIV_SQL_NULL
+ @return default value
+ @retval NULL if the default value is SQL NULL (len=UNIV_SQL_NULL) */
+ const byte* instant_field_value(uint n, ulint* len) const
+ {
+ DBUG_ASSERT(is_instant());
+ DBUG_ASSERT(n >= n_core_fields);
+ DBUG_ASSERT(n < n_fields);
+ return fields[n].col->instant_value(len);
+ }
+
+ /** Adjust clustered index metadata for instant ADD COLUMN.
+ @param[in] instant clustered index definition after instant ADD COLUMN */
+ void instant_add_field(const dict_index_t& instant);
+
+ /** Remove the 'instant ADD' status of a clustered index.
+ Protected by index root page x-latch or table X-lock. */
+ void remove_instant()
+ {
+ DBUG_ASSERT(is_clust());
+ if (!is_instant()) {
+ return;
+ }
+ for (unsigned i = n_core_fields; i < n_fields; i++) {
+ fields[i].col->remove_instant();
+ }
+ n_core_fields = n_fields;
+ n_core_null_bytes = UT_BITS_IN_BYTES(n_nullable);
+ }
};
/** The status of online index creation */
@@ -1331,6 +1450,39 @@ struct dict_table_t {
return(UNIV_LIKELY(!file_unreadable));
}
+ /** @return whether instant ADD COLUMN is in effect */
+ bool is_instant() const
+ {
+ return(UT_LIST_GET_FIRST(indexes)->is_instant());
+ }
+
+ /** @return whether the table supports instant ADD COLUMN */
+ bool supports_instant() const
+ {
+ return(!(flags & DICT_TF_MASK_ZIP_SSIZE));
+ }
+
+ /** Adjust metadata for instant ADD COLUMN.
+ @param[in] table table definition after instant ADD COLUMN */
+ void instant_add_column(const dict_table_t& table);
+
+ /** Roll back instant_add_column().
+ @param[in] old_n_cols original n_cols
+ @param[in] old_cols original cols
+ @param[in] old_col_names original col_names */
+ void rollback_instant(
+ unsigned old_n_cols,
+ dict_col_t* old_cols,
+ const char* old_col_names);
+
+ /** Trim the instantly added columns when an insert into SYS_COLUMNS
+ is rolled back during ALTER TABLE or recovery.
+ @param[in] n number of surviving non-system columns */
+ void rollback_instant(unsigned n);
+
+ /** Add the table definition to the data dictionary cache */
+ void add_to_cache();
+
/** Id of the table. */
table_id_t id;
@@ -1711,6 +1863,17 @@ inline bool dict_index_t::is_readable() const
return(UNIV_LIKELY(!table->file_unreadable));
}
+inline bool dict_index_t::is_instant() const
+{
+ ut_ad(n_core_fields > 0);
+ ut_ad(n_core_fields <= n_fields);
+ ut_ad(n_core_fields == n_fields
+ || (type & ~(DICT_UNIQUE | DICT_CORRUPT)) == DICT_CLUSTERED);
+ ut_ad(n_core_fields == n_fields || table->supports_instant());
+ ut_ad(n_core_fields == n_fields || !table->is_temporary());
+ return(n_core_fields != n_fields);
+}
+
/*******************************************************************//**
Initialise the table lock list. */
void
diff --git a/storage/innobase/include/dict0mem.ic b/storage/innobase/include/dict0mem.ic
index da2ac629850..2e3d0f2172a 100644
--- a/storage/innobase/include/dict0mem.ic
+++ b/storage/innobase/include/dict0mem.ic
@@ -66,6 +66,7 @@ dict_mem_fill_index_struct(
index->merge_threshold = DICT_INDEX_MERGE_THRESHOLD_DEFAULT;
index->table_name = table_name;
index->n_fields = (unsigned int) n_fields;
+ index->n_core_fields = (unsigned int) n_fields;
/* The '1 +' above prevents allocation
of an empty mem block */
index->nulls_equal = false;
diff --git a/storage/innobase/include/fil0fil.h b/storage/innobase/include/fil0fil.h
index d3336c5f5b5..12395a3f060 100644
--- a/storage/innobase/include/fil0fil.h
+++ b/storage/innobase/include/fil0fil.h
@@ -392,7 +392,7 @@ extern fil_addr_t fil_addr_null;
then encrypted */
#define FIL_PAGE_PAGE_COMPRESSED 34354 /*!< page compressed page */
#define FIL_PAGE_INDEX 17855 /*!< B-tree node */
-#define FIL_PAGE_RTREE 17854 /*!< B-tree node */
+#define FIL_PAGE_RTREE 17854 /*!< R-tree node (SPATIAL INDEX) */
#define FIL_PAGE_UNDO_LOG 2 /*!< Undo log page */
#define FIL_PAGE_INODE 3 /*!< Index node */
#define FIL_PAGE_IBUF_FREE_LIST 4 /*!< Insert buffer free list */
@@ -415,15 +415,26 @@ extern fil_addr_t fil_addr_null;
//#define FIL_PAGE_ENCRYPTED 15
//#define FIL_PAGE_COMPRESSED_AND_ENCRYPTED 16
//#define FIL_PAGE_ENCRYPTED_RTREE 17
+/** Clustered index root page after instant ADD COLUMN */
+#define FIL_PAGE_TYPE_INSTANT 18
-/** Used by i_s.cc to index into the text description. */
+/** Used by i_s.cc to index into the text description.
+Note: FIL_PAGE_TYPE_INSTANT maps to the same description as FIL_PAGE_INDEX.
#define FIL_PAGE_TYPE_LAST FIL_PAGE_TYPE_UNKNOWN
/*!< Last page type */
/* @} */
-/** macro to check whether the page type is index (Btree or Rtree) type */
-#define fil_page_type_is_index(page_type) \
- (page_type == FIL_PAGE_INDEX || page_type == FIL_PAGE_RTREE)
+/** @return whether the page type is B-tree or R-tree index */
+inline bool fil_page_type_is_index(ulint page_type)
+{
+ switch (page_type) {
+ case FIL_PAGE_TYPE_INSTANT:
+ case FIL_PAGE_INDEX:
+ case FIL_PAGE_RTREE:
+ return(true);
+ }
+ return(false);
+}
/** Check whether the page is index page (either regular Btree index or Rtree
index */
diff --git a/storage/innobase/include/fil0fil.ic b/storage/innobase/include/fil0fil.ic
index 9505cc0bd69..1dd4c64f73e 100644
--- a/storage/innobase/include/fil0fil.ic
+++ b/storage/innobase/include/fil0fil.ic
@@ -39,6 +39,7 @@ fil_get_page_type_name(
return "PAGE_COMPRESSED_ENRYPTED";
case FIL_PAGE_PAGE_COMPRESSED:
return "PAGE_COMPRESSED";
+ case FIL_PAGE_TYPE_INSTANT:
case FIL_PAGE_INDEX:
return "INDEX";
case FIL_PAGE_RTREE:
@@ -89,6 +90,7 @@ fil_page_type_validate(
if (!((page_type == FIL_PAGE_PAGE_COMPRESSED ||
page_type == FIL_PAGE_PAGE_COMPRESSED_ENCRYPTED ||
page_type == FIL_PAGE_INDEX ||
+ page_type == FIL_PAGE_TYPE_INSTANT ||
page_type == FIL_PAGE_RTREE ||
page_type == FIL_PAGE_UNDO_LOG ||
page_type == FIL_PAGE_INODE ||
diff --git a/storage/innobase/include/gis0rtree.ic b/storage/innobase/include/gis0rtree.ic
index e852ebd8028..4dd05d3b251 100644
--- a/storage/innobase/include/gis0rtree.ic
+++ b/storage/innobase/include/gis0rtree.ic
@@ -38,7 +38,7 @@ rtr_page_cal_mbr(
{
page_t* page;
rec_t* rec;
- byte* field;
+ const byte* field;
ulint len;
ulint* offsets = NULL;
double bmin, bmax;
diff --git a/storage/innobase/include/page0page.h b/storage/innobase/include/page0page.h
index 53a58de229d..c2b9a833bda 100644
--- a/storage/innobase/include/page0page.h
+++ b/storage/innobase/include/page0page.h
@@ -63,9 +63,42 @@ typedef byte page_header_t;
#define PAGE_FREE 6 /* pointer to start of page free record list */
#define PAGE_GARBAGE 8 /* number of bytes in deleted records */
#define PAGE_LAST_INSERT 10 /* pointer to the last inserted record, or
- NULL if this info has been reset by a delete,
+ 0 if this info has been reset by a delete,
for example */
-#define PAGE_DIRECTION 12 /* last insert direction: PAGE_LEFT, ... */
+
+/** This 10-bit field is usually 0. In B-tree index pages of
+ROW_FORMAT=REDUNDANT tables, this byte can contain garbage if the .ibd
+file was created in MySQL 4.1.0 or if the table resides in the system
+tablespace and was created before MySQL 4.1.1 or MySQL 4.0.14.
+In this case, the FIL_PAGE_TYPE would be FIL_PAGE_INDEX.
+
+In ROW_FORMAT=COMPRESSED tables, this field is always 0, because
+instant ADD COLUMN is not supported.
+
+In ROW_FORMAT=COMPACT and ROW_FORMAT=DYNAMIC tables, this field is
+always 0, except in the root page of the clustered index after instant
+ADD COLUMN.
+
+Instant ADD COLUMN will change FIL_PAGE_TYPE to FIL_PAGE_TYPE_INSTANT
+and initialize the PAGE_INSTANT field to the original number of
+fields in the clustered index (dict_index_t::n_core_fields). The most
+significant bits are in the first byte, and the least significant 5
+bits are stored in the most significant 5 bits of PAGE_DIRECTION_B.
+
+FIL_PAGE_TYPE_INSTANT and the PAGE_INSTANT field may be set even if
+the instant ADD COLUMN was not committed. Changes to these page header fields
+are not undo-logged, but changes to the 'default value record' are.
+If the server is killed and restarted, the page header fields could
+remain set even though no 'default value record' is present.
+
+When the table becomes empty, the PAGE_INSTANT field and the
+FIL_PAGE_TYPE can be reset and any 'default value record' be removed. */
+#define PAGE_INSTANT 12
+
+/** Last insert direction: PAGE_LEFT, ..., PAGE_NO_DIRECTION.
+In ROW_FORMAT=REDUNDANT tables created before MySQL 4.1.1 or MySQL 4.0.14,
+this byte can be garbage. */
+#define PAGE_DIRECTION_B 13
#define PAGE_N_DIRECTION 14 /* number of consecutive inserts to the same
direction */
#define PAGE_N_RECS 16 /* number of user records on the page */
@@ -251,6 +284,20 @@ page_rec_is_comp(const byte* rec)
return(page_is_comp(page_align(rec)));
}
+# ifdef UNIV_DEBUG
+/** Determine if the record is the 'default row' pseudo-record
+in the clustered index.
+@param[in] rec leaf page record on an index page
+@return whether the record is the 'default row' pseudo-record */
+inline
+bool
+page_rec_is_default_row(const rec_t* rec)
+{
+ return rec_get_info_bits(rec, page_rec_is_comp(rec))
+ & REC_INFO_MIN_REC_FLAG;
+}
+# endif /* UNIV_DEBUG */
+
/** Determine the offset of the infimum record on the page.
@param[in] page index page
@return offset of the infimum record in record list, relative from page */
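The page_rec_is_default_row() check added above boils down to testing one info bit. A minimal standalone sketch (the flag value 0x10 is REC_INFO_MIN_REC_FLAG from rem0rec.h; the helper name is illustrative, not InnoDB API):

```cpp
#include <cassert>

// Sketch of the REC_INFO_MIN_REC_FLAG test in page_rec_is_default_row():
// the 'default row' pseudo-record is the only leaf-page record that
// carries this info bit. The constant matches rem0rec.h; the helper
// name is made up for illustration.
const unsigned REC_INFO_MIN_REC_FLAG_SKETCH = 0x10;

inline bool is_default_row(unsigned info_bits)
{
    return (info_bits & REC_INFO_MIN_REC_FLAG_SKETCH) != 0;
}
```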
@@ -457,7 +504,7 @@ page_header_set_field(
Returns the offset stored in the given header field.
@return offset from the start of the page, or 0 */
UNIV_INLINE
-ulint
+uint16_t
page_header_get_offs(
/*=================*/
const page_t* page, /*!< in: page */
@@ -551,7 +598,7 @@ Gets the number of user records on page (the infimum and supremum records
are not user records).
@return number of user records */
UNIV_INLINE
-ulint
+uint16_t
page_get_n_recs(
/*============*/
const page_t* page); /*!< in: index page */
@@ -569,7 +616,7 @@ page_rec_get_n_recs_before(
Gets the number of records in the heap.
@return number of user records */
UNIV_INLINE
-ulint
+uint16_t
page_dir_get_n_heap(
/*================*/
const page_t* page); /*!< in: index page */
@@ -590,7 +637,7 @@ page_dir_set_n_heap(
Gets the number of dir slots in directory.
@return number of slots */
UNIV_INLINE
-ulint
+uint16_t
page_dir_get_n_slots(
/*=================*/
const page_t* page); /*!< in: index page */
@@ -865,7 +912,7 @@ Returns the sum of the sizes of the records in the record list
excluding the infimum and supremum records.
@return data in bytes */
UNIV_INLINE
-ulint
+uint16_t
page_get_data_size(
/*===============*/
const page_t* page); /*!< in: index page */
@@ -911,6 +958,45 @@ page_mem_free(
const dict_index_t* index, /*!< in: index of rec */
const ulint* offsets);/*!< in: array returned by
rec_get_offsets() */
+
+/** Read the PAGE_DIRECTION field from a byte.
+@param[in] ptr pointer to PAGE_DIRECTION_B
+@return the value of the PAGE_DIRECTION field */
+inline
+byte
+page_ptr_get_direction(const byte* ptr);
+
+/** Set the PAGE_DIRECTION field.
+@param[in] ptr pointer to PAGE_DIRECTION_B
+@param[in] dir the value of the PAGE_DIRECTION field */
+inline
+void
+page_ptr_set_direction(byte* ptr, byte dir);
+
+/** Read the PAGE_DIRECTION field.
+@param[in] page index page
+@return the value of the PAGE_DIRECTION field */
+inline
+byte
+page_get_direction(const page_t* page)
+{
+ return page_ptr_get_direction(PAGE_HEADER + PAGE_DIRECTION_B + page);
+}
+
+/** Read the PAGE_INSTANT field.
+@param[in] page index page
+@return the value of the PAGE_INSTANT field */
+inline
+uint16_t
+page_get_instant(const page_t* page);
+/** Assign the PAGE_INSTANT field.
+@param[in,out] page clustered index root page
+@param[in] n original number of clustered index fields
+@param[in,out] mtr mini-transaction */
+inline
+void
+page_set_instant(page_t* page, unsigned n, mtr_t* mtr);
+
/**********************************************************//**
Create an uncompressed B-tree index page.
@return pointer to the page */
@@ -1251,5 +1337,4 @@ page_warn_strict_checksum(
#include "page0page.ic"
-
#endif
diff --git a/storage/innobase/include/page0page.ic b/storage/innobase/include/page0page.ic
index 0062db56bfa..ee908896050 100644
--- a/storage/innobase/include/page0page.ic
+++ b/storage/innobase/include/page0page.ic
@@ -186,7 +186,7 @@ page_header_set_field(
Returns the offset stored in the given header field.
@return offset from the start of the page, or 0 */
UNIV_INLINE
-ulint
+uint16_t
page_header_get_offs(
/*=================*/
const page_t* page, /*!< in: page */
@@ -464,7 +464,7 @@ Gets the number of user records on page (infimum and supremum records
are not user records).
@return number of user records */
UNIV_INLINE
-ulint
+uint16_t
page_get_n_recs(
/*============*/
const page_t* page) /*!< in: index page */
@@ -477,7 +477,7 @@ page_get_n_recs(
Gets the number of dir slots in directory.
@return number of slots */
UNIV_INLINE
-ulint
+uint16_t
page_dir_get_n_slots(
/*=================*/
const page_t* page) /*!< in: index page */
@@ -502,7 +502,7 @@ page_dir_set_n_slots(
Gets the number of records in the heap.
@return number of user records */
UNIV_INLINE
-ulint
+uint16_t
page_dir_get_n_heap(
/*================*/
const page_t* page) /*!< in: index page */
@@ -868,21 +868,17 @@ Returns the sum of the sizes of the records in the record list, excluding
the infimum and supremum records.
@return data in bytes */
UNIV_INLINE
-ulint
+uint16_t
page_get_data_size(
/*===============*/
const page_t* page) /*!< in: index page */
{
- ulint ret;
-
- ret = (ulint)(page_header_get_field(page, PAGE_HEAP_TOP)
- - (page_is_comp(page)
- ? PAGE_NEW_SUPREMUM_END
- : PAGE_OLD_SUPREMUM_END)
- - page_header_get_field(page, PAGE_GARBAGE));
-
+ uint16_t ret = page_header_get_field(page, PAGE_HEAP_TOP)
+ - (page_is_comp(page)
+ ? PAGE_NEW_SUPREMUM_END
+ : PAGE_OLD_SUPREMUM_END)
+ - page_header_get_field(page, PAGE_GARBAGE);
ut_ad(ret < UNIV_PAGE_SIZE);
-
return(ret);
}
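The rewritten page_get_data_size() keeps the same arithmetic as before: heap top, minus the end offset of the supremum record, minus the garbage bytes. A freestanding sketch of that formula (the helper name and test values are illustrative, not real page constants):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the page_get_data_size() arithmetic: the user data on a page
// is everything between the end of the supremum record and PAGE_HEAP_TOP,
// minus the bytes parked on the free (PAGE_GARBAGE) list.
inline uint16_t page_data_size_sketch(uint16_t heap_top,
                                      uint16_t supremum_end,
                                      uint16_t garbage)
{
    return static_cast<uint16_t>(heap_top - supremum_end - garbage);
}
```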
@@ -1078,6 +1074,75 @@ page_mem_free(
}
}
+/** Read the PAGE_DIRECTION field from a byte.
+@param[in] ptr pointer to PAGE_DIRECTION_B
+@return the value of the PAGE_DIRECTION field */
+inline
+byte
+page_ptr_get_direction(const byte* ptr)
+{
+ ut_ad(page_offset(ptr) == PAGE_HEADER + PAGE_DIRECTION_B);
+ return *ptr & ((1U << 3) - 1);
+}
+
+/** Set the PAGE_DIRECTION field.
+@param[in] ptr pointer to PAGE_DIRECTION_B
+@param[in] dir the value of the PAGE_DIRECTION field */
+inline
+void
+page_ptr_set_direction(byte* ptr, byte dir)
+{
+ ut_ad(page_offset(ptr) == PAGE_HEADER + PAGE_DIRECTION_B);
+ ut_ad(dir >= PAGE_LEFT);
+ ut_ad(dir <= PAGE_NO_DIRECTION);
+ *ptr = (*ptr & ~((1U << 3) - 1)) | dir;
+}
+
+/** Read the PAGE_INSTANT field.
+@param[in] page index page
+@return the value of the PAGE_INSTANT field */
+inline
+uint16_t
+page_get_instant(const page_t* page)
+{
+ uint16_t i = page_header_get_field(page, PAGE_INSTANT);
+#ifdef UNIV_DEBUG
+ switch (fil_page_get_type(page)) {
+ case FIL_PAGE_TYPE_INSTANT:
+ ut_ad(page_get_direction(page) <= PAGE_NO_DIRECTION);
+ ut_ad(i >> 3);
+ break;
+ case FIL_PAGE_INDEX:
+ ut_ad(i <= PAGE_NO_DIRECTION || !page_is_comp(page));
+ break;
+ case FIL_PAGE_RTREE:
+ ut_ad(i == PAGE_NO_DIRECTION || i == 0);
+ break;
+ default:
+ ut_ad(!"invalid page type");
+ break;
+ }
+#endif /* UNIV_DEBUG */
+ return(i >> 3);
+}
+
+/** Assign the PAGE_INSTANT field.
+@param[in,out] page clustered index root page
+@param[in] n original number of clustered index fields
+@param[in,out] mtr mini-transaction */
+inline
+void
+page_set_instant(page_t* page, unsigned n, mtr_t* mtr)
+{
+ ut_ad(fil_page_get_type(page) == FIL_PAGE_TYPE_INSTANT);
+ ut_ad(n > 0);
+ ut_ad(n < REC_MAX_N_FIELDS);
+ uint16_t i = page_header_get_field(page, PAGE_INSTANT);
+ ut_ad(i <= PAGE_NO_DIRECTION);
+ i |= n << 3;
+ mlog_write_ulint(PAGE_HEADER + PAGE_INSTANT + page, i,
+ MLOG_2BYTES, mtr);
+}
#endif /* !UNIV_INNOCHECKSUM */
#ifdef UNIV_MATERIALIZE
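page_set_instant() and page_get_instant() above share the two header bytes at PAGE_INSTANT with the 3-bit PAGE_DIRECTION value. A self-contained sketch of that packing (helper names are made up; only the shifts follow the patch):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the PAGE_INSTANT/PAGE_DIRECTION packing: one 16-bit header
// word whose low 3 bits hold the insert direction and whose upper 13
// bits hold the original number of clustered index fields. Helper names
// are illustrative; the bit operations mirror page_set_instant(),
// page_get_instant() and page_ptr_get_direction().
inline uint16_t header_set_instant(uint16_t word, unsigned n_core_fields)
{
    // keep the direction bits, OR in the field count shifted past them
    return static_cast<uint16_t>((word & 7U) | (n_core_fields << 3));
}

inline unsigned header_get_instant(uint16_t word)
{
    return word >> 3;               // as in page_get_instant()
}

inline unsigned header_get_direction(uint16_t word)
{
    return word & ((1U << 3) - 1);  // as in page_ptr_get_direction()
}
```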
diff --git a/storage/innobase/include/rem0rec.h b/storage/innobase/include/rem0rec.h
index 6e927da9bd9..58802e23e77 100644
--- a/storage/innobase/include/rem0rec.h
+++ b/storage/innobase/include/rem0rec.h
@@ -33,6 +33,7 @@ Created 5/30/1994 Heikki Tuuri
#include "rem0types.h"
#include "mtr0types.h"
#include "page0types.h"
+#include "dict0dict.h"
#include "trx0types.h"
#endif /*! UNIV_INNOCHECKSUM */
#include <ostream>
@@ -54,11 +55,29 @@ in addition to the data and the offsets */
in addition to the data and the offsets */
#define REC_N_NEW_EXTRA_BYTES 5
-/* Record status values */
-#define REC_STATUS_ORDINARY 0
-#define REC_STATUS_NODE_PTR 1
-#define REC_STATUS_INFIMUM 2
-#define REC_STATUS_SUPREMUM 3
+/** Record status values for ROW_FORMAT=COMPACT,DYNAMIC,COMPRESSED */
+enum rec_comp_status_t {
+ /** User record (PAGE_LEVEL=0, heap>=PAGE_HEAP_NO_USER_LOW) */
+ REC_STATUS_ORDINARY = 0,
+ /** Node pointer record (PAGE_LEVEL>=1, heap>=PAGE_HEAP_NO_USER_LOW) */
+ REC_STATUS_NODE_PTR = 1,
+ /** The page infimum pseudo-record (heap=PAGE_HEAP_NO_INFIMUM) */
+ REC_STATUS_INFIMUM = 2,
+ /** The page supremum pseudo-record (heap=PAGE_HEAP_NO_SUPREMUM) */
+ REC_STATUS_SUPREMUM = 3,
+ /** Clustered index record that has been inserted or updated
+ after instant ADD COLUMN (more than dict_index_t::n_core_fields) */
+ REC_STATUS_COLUMNS_ADDED = 4
+};
+
+/** The dtuple_t::info_bits of the 'default row' record.
+@see rec_is_default_row() */
+static const byte REC_INFO_DEFAULT_ROW
+ = REC_INFO_MIN_REC_FLAG | REC_STATUS_COLUMNS_ADDED;
+
+#define REC_NEW_STATUS 3 /* This is a single-byte bit-field */
+#define REC_NEW_STATUS_MASK 0x7UL
+#define REC_NEW_STATUS_SHIFT 0
/* The following four constants are needed in page0zip.cc in order to
efficiently compress and decompress pages. */
@@ -94,6 +113,22 @@ offsets[] array, first passed to rec_get_offsets() */
#define REC_OFFS_NORMAL_SIZE OFFS_IN_REC_NORMAL_SIZE
#define REC_OFFS_SMALL_SIZE 10
+/** Get the base address of offsets. The extra_size is stored at
+this position, and following positions hold the end offsets of
+the fields. */
+#define rec_offs_base(offsets) (offsets + REC_OFFS_HEADER_SIZE)
+
+/** Compact flag ORed to the extra size returned by rec_get_offsets() */
+const ulint REC_OFFS_COMPACT = ~(ulint(~0) >> 1);
+/** SQL NULL flag in offsets returned by rec_get_offsets() */
+const ulint REC_OFFS_SQL_NULL = REC_OFFS_COMPACT;
+/** External flag in offsets returned by rec_get_offsets() */
+const ulint REC_OFFS_EXTERNAL = REC_OFFS_COMPACT >> 1;
+/** Default value flag in offsets returned by rec_get_offsets() */
+const ulint REC_OFFS_DEFAULT = REC_OFFS_COMPACT >> 2;
+/** Mask for offsets returned by rec_get_offsets() */
+const ulint REC_OFFS_MASK = REC_OFFS_DEFAULT - 1;
+
#ifndef UNIV_INNOCHECKSUM
/******************************************************//**
The following function is used to get the pointer of the next chained record
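With the new REC_OFFS_DEFAULT flag claiming another high bit, the usable offset mask shrinks by one bit compared to the old three-flag layout. A sketch of the bit values using a fixed 32-bit word (in InnoDB, ulint is machine-word sized, so the actual constants depend on the platform):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the REC_OFFS_* layout on a fixed 32-bit word. The top bit
// doubles as the COMPACT flag (for the whole offsets array) and the
// SQL_NULL flag (per field); EXTERNAL and the new DEFAULT flag occupy
// the next two bits, and the remaining bits form the offset mask.
const uint32_t OFFS_COMPACT  = ~(uint32_t(~0u) >> 1); // 0x80000000
const uint32_t OFFS_SQL_NULL = OFFS_COMPACT;          // same bit, per field
const uint32_t OFFS_EXTERNAL = OFFS_COMPACT >> 1;     // 0x40000000
const uint32_t OFFS_DEFAULT  = OFFS_COMPACT >> 2;     // 0x20000000
const uint32_t OFFS_MASK     = OFFS_DEFAULT - 1;      // 0x1FFFFFFF
```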
@@ -252,25 +287,30 @@ rec_set_info_bits_new(
rec_t* rec, /*!< in/out: new-style physical record */
ulint bits) /*!< in: info bits */
MY_ATTRIBUTE((nonnull));
-/******************************************************//**
-The following function retrieves the status bits of a new-style record.
+
+/** Determine the status bits of a non-REDUNDANT record.
+@param[in] rec ROW_FORMAT=COMPACT,DYNAMIC,COMPRESSED record
@return status bits */
-UNIV_INLINE
-ulint
-rec_get_status(
-/*===========*/
- const rec_t* rec) /*!< in: physical record */
- MY_ATTRIBUTE((warn_unused_result));
+inline
+rec_comp_status_t
+rec_get_status(const rec_t* rec)
+{
+ byte bits = rec[-REC_NEW_STATUS] & REC_NEW_STATUS_MASK;
+ ut_ad(bits <= REC_STATUS_COLUMNS_ADDED);
+ return static_cast<rec_comp_status_t>(bits);
+}
-/******************************************************//**
-The following function is used to set the status bits of a new-style record. */
-UNIV_INLINE
+/** Set the status bits of a non-REDUNDANT record.
+@param[in,out] rec ROW_FORMAT=COMPACT,DYNAMIC,COMPRESSED record
+@param[in] bits status bits */
+inline
void
-rec_set_status(
-/*===========*/
- rec_t* rec, /*!< in/out: physical record */
- ulint bits) /*!< in: info bits */
- MY_ATTRIBUTE((nonnull));
+rec_set_status(rec_t* rec, byte bits)
+{
+ ut_ad(bits <= REC_STATUS_COLUMNS_ADDED);
+ rec[-REC_NEW_STATUS] = (rec[-REC_NEW_STATUS] & ~REC_NEW_STATUS_MASK)
+ | bits;
+}
/******************************************************//**
The following function is used to retrieve the info and status
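The new inline rec_get_status()/rec_set_status() replace the old bit-field helpers with direct mask arithmetic on the byte at rec[-REC_NEW_STATUS]. A standalone sketch of that byte manipulation (helper names are illustrative):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the 3-bit status field handled by rec_get_status() and
// rec_set_status(): the status lives in the low 3 bits of one header
// byte; the other 5 bits of that byte must be preserved on writes.
const uint8_t STATUS_MASK          = 0x7; // REC_NEW_STATUS_MASK
const uint8_t STATUS_COLUMNS_ADDED = 4;   // REC_STATUS_COLUMNS_ADDED

inline uint8_t set_status(uint8_t hdr_byte, uint8_t bits)
{
    return static_cast<uint8_t>((hdr_byte & ~STATUS_MASK) | bits);
}

inline uint8_t get_status(uint8_t hdr_byte)
{
    return hdr_byte & STATUS_MASK;
}
```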
@@ -459,9 +499,7 @@ rec_get_offsets_func(
const rec_t* rec,
const dict_index_t* index,
ulint* offsets,
-#ifdef UNIV_DEBUG
bool leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
#ifdef UNIV_DEBUG
const char* file, /*!< in: file name where called */
@@ -471,7 +509,7 @@ rec_get_offsets_func(
#ifdef UNIV_DEBUG
MY_ATTRIBUTE((nonnull(1,2,6,8),warn_unused_result));
#else /* UNIV_DEBUG */
- MY_ATTRIBUTE((nonnull(1,2,5),warn_unused_result));
+ MY_ATTRIBUTE((nonnull(1,2,6),warn_unused_result));
#endif /* UNIV_DEBUG */
#ifdef UNIV_DEBUG
@@ -479,7 +517,7 @@ rec_get_offsets_func(
rec_get_offsets_func(rec,index,offsets,leaf,n,__FILE__,__LINE__,heap)
#else /* UNIV_DEBUG */
# define rec_get_offsets(rec, index, offsets, leaf, n, heap) \
- rec_get_offsets_func(rec, index, offsets, n, heap)
+ rec_get_offsets_func(rec, index, offsets, leaf, n, heap)
#endif /* UNIV_DEBUG */
/******************************************************//**
@@ -499,32 +537,31 @@ rec_get_offsets_reverse(
offsets[0] allocated elements */
MY_ATTRIBUTE((nonnull));
#ifdef UNIV_DEBUG
-/************************************************************//**
-Validates offsets returned by rec_get_offsets().
-@return TRUE if valid */
-UNIV_INLINE
-ibool
+/** Validate offsets returned by rec_get_offsets().
+@param[in] rec record, or NULL
+@param[in] index the index that the record belongs in, or NULL
+@param[in,out] offsets the offsets of the record
+@return true */
+bool
rec_offs_validate(
-/*==============*/
- const rec_t* rec, /*!< in: record or NULL */
- const dict_index_t* index, /*!< in: record descriptor or NULL */
- const ulint* offsets)/*!< in: array returned by
- rec_get_offsets() */
+ const rec_t* rec,
+ const dict_index_t* index,
+ const ulint* offsets)
MY_ATTRIBUTE((nonnull(3), warn_unused_result));
-/************************************************************//**
-Updates debug data in offsets, in order to avoid bogus
-rec_offs_validate() failures. */
-UNIV_INLINE
+/** Update debug data in offsets, in order to tame rec_offs_validate().
+@param[in] rec record
+@param[in] index the index that the record belongs in
+@param[in] leaf whether the record resides in a leaf page
+@param[in,out] offsets offsets from rec_get_offsets() to adjust */
void
rec_offs_make_valid(
-/*================*/
- const rec_t* rec, /*!< in: record */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in: array returned by
- rec_get_offsets() */
+ const rec_t* rec,
+ const dict_index_t* index,
+ bool leaf,
+ ulint* offsets)
MY_ATTRIBUTE((nonnull));
#else
-# define rec_offs_make_valid(rec, index, offsets) ((void) 0)
+# define rec_offs_make_valid(rec, index, leaf, offsets)
#endif /* UNIV_DEBUG */
/************************************************************//**
@@ -568,26 +605,7 @@ rec_get_nth_field_offs(
MY_ATTRIBUTE((nonnull));
#define rec_get_nth_field(rec, offsets, n, len) \
((rec) + rec_get_nth_field_offs(offsets, n, len))
-/******************************************************//**
-Determine if the offsets are for a record in the new
-compact format.
-@return nonzero if compact format */
-UNIV_INLINE
-ulint
-rec_offs_comp(
-/*==========*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
- MY_ATTRIBUTE((warn_unused_result));
-/******************************************************//**
-Determine if the offsets are for a record containing
-externally stored columns.
-@return nonzero if externally stored */
-UNIV_INLINE
-ulint
-rec_offs_any_extern(
-/*================*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
- MY_ATTRIBUTE((warn_unused_result));
+
/******************************************************//**
Determine if the offsets are for a record containing null BLOB pointers.
@return first field containing a null BLOB pointer, or NULL if none found */
@@ -598,15 +616,16 @@ rec_offs_any_null_extern(
const rec_t* rec, /*!< in: record */
const ulint* offsets) /*!< in: rec_get_offsets(rec) */
MY_ATTRIBUTE((warn_unused_result));
+
/******************************************************//**
Returns nonzero if the extern bit is set in nth field of rec.
@return nonzero if externally stored */
UNIV_INLINE
ulint
-rec_offs_nth_extern(
+rec_offs_nth_extern_old(
/*================*/
- const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
- ulint n) /*!< in: nth field */
+ const rec_t* rec, /*!< in: record */
+ ulint n /*!< in: index of the field */)
MY_ATTRIBUTE((warn_unused_result));
/** Mark the nth field as externally stored.
@@ -616,16 +635,179 @@ void
rec_offs_make_nth_extern(
ulint* offsets,
const ulint n);
-/******************************************************//**
-Returns nonzero if the SQL NULL bit is set in nth field of rec.
-@return nonzero if SQL NULL */
-UNIV_INLINE
+
+/** Determine the number of allocated elements for an array of offsets.
+@param[in] offsets offsets after rec_offs_set_n_alloc()
+@return number of elements */
+inline
ulint
-rec_offs_nth_sql_null(
-/*==================*/
- const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
- ulint n) /*!< in: nth field */
- MY_ATTRIBUTE((warn_unused_result));
+rec_offs_get_n_alloc(const ulint* offsets)
+{
+ ulint n_alloc;
+ ut_ad(offsets);
+ n_alloc = offsets[0];
+ ut_ad(n_alloc > REC_OFFS_HEADER_SIZE);
+ UNIV_MEM_ASSERT_W(offsets, n_alloc * sizeof *offsets);
+ return(n_alloc);
+}
+
+/** Determine the number of fields for which offsets have been initialized.
+@param[in] offsets rec_get_offsets()
+@return number of fields */
+inline
+ulint
+rec_offs_n_fields(const ulint* offsets)
+{
+ ulint n_fields;
+ ut_ad(offsets);
+ n_fields = offsets[1];
+ ut_ad(n_fields > 0);
+ ut_ad(n_fields <= REC_MAX_N_FIELDS);
+ ut_ad(n_fields + REC_OFFS_HEADER_SIZE
+ <= rec_offs_get_n_alloc(offsets));
+ return(n_fields);
+}
+
+/** Get a flag of a record field.
+@param[in] offsets rec_get_offsets()
+@param[in] n nth field
+@param[in] flag flag to extract
+@return the flag of the record field */
+inline
+ulint
+rec_offs_nth_flag(const ulint* offsets, ulint n, ulint flag)
+{
+ ut_ad(rec_offs_validate(NULL, NULL, offsets));
+ ut_ad(n < rec_offs_n_fields(offsets));
+ /* The DEFAULT, NULL, EXTERNAL flags are mutually exclusive. */
+ ut_ad(ut_is_2pow(rec_offs_base(offsets)[1 + n]
+ & (REC_OFFS_DEFAULT
+ | REC_OFFS_SQL_NULL
+ | REC_OFFS_EXTERNAL)));
+ return rec_offs_base(offsets)[1 + n] & flag;
+}
+
+/** Determine if a record field is missing
+(should be replaced by dict_index_t::instant_field_value()).
+@param[in] offsets rec_get_offsets()
+@param[in] n nth field
+@return nonzero if default bit is set */
+inline
+ulint
+rec_offs_nth_default(const ulint* offsets, ulint n)
+{
+ return rec_offs_nth_flag(offsets, n, REC_OFFS_DEFAULT);
+}
+
+/** Determine if a record field is SQL NULL
+(the REC_OFFS_SQL_NULL flag is set in the offsets).
+@param[in] offsets rec_get_offsets()
+@param[in] n nth field
+@return nonzero if SQL NULL set */
+inline
+ulint
+rec_offs_nth_sql_null(const ulint* offsets, ulint n)
+{
+ return rec_offs_nth_flag(offsets, n, REC_OFFS_SQL_NULL);
+}
+
+/** Determine if a record field is stored off-page
+(the REC_OFFS_EXTERNAL bit is set in the nth field of rec).
+@param[in] offsets rec_get_offsets()
+@param[in] n nth field
+@return nonzero if externally stored */
+inline
+ulint
+rec_offs_nth_extern(const ulint* offsets, ulint n)
+{
+ return rec_offs_nth_flag(offsets, n, REC_OFFS_EXTERNAL);
+}
+
+/** Get a global flag of a record.
+@param[in] offsets rec_get_offsets()
+@param[in] flag flag to extract
+@return the flag of the record */
+inline
+ulint
+rec_offs_any_flag(const ulint* offsets, ulint flag)
+{
+ ut_ad(rec_offs_validate(NULL, NULL, offsets));
+ return *rec_offs_base(offsets) & flag;
+}
+
+/** Determine if the offsets are for a record containing off-page columns.
+@param[in] offsets rec_get_offsets()
+@return nonzero if any off-page columns exist */
+inline
+ulint
+rec_offs_any_extern(const ulint* offsets)
+{
+ return rec_offs_any_flag(offsets, REC_OFFS_EXTERNAL);
+}
+
+/** Determine if the offsets are for a record that is missing fields.
+@param[in] offsets rec_get_offsets()
+@return nonzero if any fields need to be replaced with
+ dict_index_t::instant_field_value() */
+inline
+ulint
+rec_offs_any_default(const ulint* offsets)
+{
+ return rec_offs_any_flag(offsets, REC_OFFS_DEFAULT);
+}
+
+/** Determine if the offsets are for other than ROW_FORMAT=REDUNDANT.
+@param[in] offsets rec_get_offsets()
+@return nonzero if ROW_FORMAT is COMPACT,DYNAMIC or COMPRESSED
+@retval 0 if ROW_FORMAT=REDUNDANT */
+inline
+ulint
+rec_offs_comp(const ulint* offsets)
+{
+ ut_ad(rec_offs_validate(NULL, NULL, offsets));
+ return(*rec_offs_base(offsets) & REC_OFFS_COMPACT);
+}
+
+/** Determine if the record is the 'default row' pseudo-record
+in the clustered index.
+@param[in] rec leaf page record
+@param[in] index index of the record
+@return whether the record is the 'default row' pseudo-record */
+inline
+bool
+rec_is_default_row(const rec_t* rec, const dict_index_t* index)
+{
+ bool is = rec_get_info_bits(rec, dict_table_is_comp(index->table))
+ & REC_INFO_MIN_REC_FLAG;
+ ut_ad(!is || index->is_instant());
+ ut_ad(!is || !dict_table_is_comp(index->table)
+ || rec_get_status(rec) == REC_STATUS_COLUMNS_ADDED);
+ return is;
+}
+
+/** Get the nth field from an index.
+@param[in] rec index record
+@param[in] index index
+@param[in] offsets rec_get_offsets(rec, index)
+@param[in] n field number
+@param[out] len length of the field in bytes, or UNIV_SQL_NULL
+@return a read-only copy of the index field */
+inline
+const byte*
+rec_get_nth_cfield(
+ const rec_t* rec,
+ const dict_index_t* index,
+ const ulint* offsets,
+ ulint n,
+ ulint* len)
+{
+ ut_ad(rec_offs_validate(rec, index, offsets));
+ if (!rec_offs_nth_default(offsets, n)) {
+ return rec_get_nth_field(rec, offsets, n, len);
+ }
+ return index->instant_field_value(n, len);
+}
+
/******************************************************//**
Gets the physical size of a field.
@return length of field */
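rec_get_nth_cfield() above is the central trick of instant ADD COLUMN on the read path: a field flagged REC_OFFS_DEFAULT is physically absent from the record, so its value comes from the stored defaults instead. A toy model of that fallback (all names are illustrative, not InnoDB API):

```cpp
#include <cassert>

// Toy model of the rec_get_nth_cfield() fallback: a field whose
// REC_OFFS_DEFAULT flag is set has no stored bytes in the record, so
// the lookup returns the table-definition default (in InnoDB,
// dict_index_t::instant_field_value()). All names are made up.
struct FieldSketch {
    bool        is_default; // REC_OFFS_DEFAULT set in the offsets
    const char* stored;     // bytes physically present in the record
};

inline const char* nth_cfield_sketch(const FieldSketch& f,
                                     const char* default_value)
{
    return f.is_default ? default_value : f.stored;
}
```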
@@ -679,16 +861,6 @@ rec_get_data_size_old(
const rec_t* rec) /*!< in: physical record */
MY_ATTRIBUTE((warn_unused_result));
/**********************************************************//**
-The following function returns the number of allocated elements
-for an array of offsets.
-@return number of elements */
-UNIV_INLINE
-ulint
-rec_offs_get_n_alloc(
-/*=================*/
- const ulint* offsets)/*!< in: array for rec_get_offsets() */
- MY_ATTRIBUTE((warn_unused_result));
-/**********************************************************//**
The following function sets the number of allocated elements
for an array of offsets. */
UNIV_INLINE
@@ -702,15 +874,6 @@ rec_offs_set_n_alloc(
#define rec_offs_init(offsets) \
rec_offs_set_n_alloc(offsets, (sizeof offsets) / sizeof *offsets)
/**********************************************************//**
-The following function returns the number of fields in a record.
-@return number of fields */
-UNIV_INLINE
-ulint
-rec_offs_n_fields(
-/*==============*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
- MY_ATTRIBUTE((warn_unused_result));
-/**********************************************************//**
The following function returns the data size of a physical
record, that is the sum of field lengths. SQL null fields
are counted as length 0 fields. The value returned by the function
@@ -785,14 +948,46 @@ rec_copy(
@param[in] fields data fields
@param[in] n_fields number of data fields
@param[out] extra record header size
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
@return total size, in bytes */
ulint
rec_get_converted_size_temp(
const dict_index_t* index,
const dfield_t* fields,
ulint n_fields,
- ulint* extra)
- MY_ATTRIBUTE((warn_unused_result, nonnull(1,2)));
+ ulint* extra,
+ rec_comp_status_t status = REC_STATUS_ORDINARY)
+ MY_ATTRIBUTE((warn_unused_result, nonnull));
+
+/** Determine the offset to each field in temporary file.
+@param[in] rec temporary file record
+@param[in] index index that the record belongs to
+@param[in,out] offsets offsets to the fields; in: rec_offs_n_fields(offsets)
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
+*/
+void
+rec_init_offsets_temp(
+ const rec_t* rec,
+ const dict_index_t* index,
+ ulint* offsets,
+ rec_comp_status_t status = REC_STATUS_ORDINARY)
+ MY_ATTRIBUTE((nonnull));
+
+/** Convert a data tuple prefix to the temporary file format.
+@param[out] rec record in temporary file format
+@param[in] index clustered or secondary index
+@param[in] fields data fields
+@param[in] n_fields number of data fields
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
+*/
+void
+rec_convert_dtuple_to_temp(
+ rec_t* rec,
+ const dict_index_t* index,
+ const dfield_t* fields,
+ ulint n_fields,
+ rec_comp_status_t status = REC_STATUS_ORDINARY)
+ MY_ATTRIBUTE((nonnull));
/** Determine the converted size of virtual column data in a temporary file.
@see rec_convert_dtuple_to_temp_v()
@@ -817,29 +1012,6 @@ rec_convert_dtuple_to_temp_v(
const dtuple_t* v_entry)
MY_ATTRIBUTE((nonnull));
-/******************************************************//**
-Determine the offset to each field in temporary file.
-@see rec_convert_dtuple_to_temp() */
-void
-rec_init_offsets_temp(
-/*==================*/
- const rec_t* rec, /*!< in: temporary file record */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in/out: array of offsets;
- in: n=rec_offs_n_fields(offsets) */
- MY_ATTRIBUTE((nonnull));
-
-/*********************************************************//**
-Builds a temporary file record out of a data tuple.
-@see rec_init_offsets_temp() */
-void
-rec_convert_dtuple_to_temp(
-/*=======================*/
- rec_t* rec, /*!< out: record */
- const dict_index_t* index, /*!< in: record descriptor */
- const dfield_t* fields, /*!< in: array of data fields */
- ulint n_fields); /*!< in: number of fields */
-
/**************************************************************//**
Copies the first n fields of a physical record to a new physical record in
a buffer.
@@ -856,22 +1028,6 @@ rec_copy_prefix_to_buf(
or NULL */
ulint* buf_size) /*!< in/out: buffer size */
MY_ATTRIBUTE((nonnull));
-/** Fold a prefix of a physical record.
-@param[in] rec index record
-@param[in] offsets return value of rec_get_offsets()
-@param[in] n_fields number of complete fields to fold
-@param[in] n_bytes number of bytes to fold in the last field
-@param[in] index_id index tree ID
-@return the folded value */
-UNIV_INLINE
-ulint
-rec_fold(
- const rec_t* rec,
- const ulint* offsets,
- ulint n_fields,
- ulint n_bytes,
- index_id_t tree_id)
- MY_ATTRIBUTE((warn_unused_result));
/*********************************************************//**
Builds a physical record out of a data tuple and
stores it into the given buffer.
@@ -919,7 +1075,7 @@ rec_get_converted_size_comp(
dict_table_is_comp() is
assumed to hold, even if
it does not */
- ulint status, /*!< in: status bits of the record */
+ rec_comp_status_t status, /*!< in: status bits of the record */
const dfield_t* fields, /*!< in: array of data fields */
ulint n_fields,/*!< in: number of data fields */
ulint* extra) /*!< out: extra size */
@@ -944,23 +1100,14 @@ The fields are copied into the memory heap.
@param[in] n_fields number of fields to copy
@param[in,out] heap memory heap */
void
-rec_copy_prefix_to_dtuple_func(
+rec_copy_prefix_to_dtuple(
dtuple_t* tuple,
const rec_t* rec,
const dict_index_t* index,
-#ifdef UNIV_DEBUG
bool is_leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
mem_heap_t* heap)
MY_ATTRIBUTE((nonnull));
-#ifdef UNIV_DEBUG
-# define rec_copy_prefix_to_dtuple(tuple,rec,index,leaf,n_fields,heap) \
- rec_copy_prefix_to_dtuple_func(tuple,rec,index,leaf,n_fields,heap)
-#else /* UNIV_DEBUG */
-# define rec_copy_prefix_to_dtuple(tuple,rec,index,leaf,n_fields,heap) \
- rec_copy_prefix_to_dtuple_func(tuple,rec,index,n_fields,heap)
-#endif /* UNIV_DEBUG */
/***************************************************************//**
Validates the consistency of a physical record.
@return TRUE if ok */
diff --git a/storage/innobase/include/rem0rec.ic b/storage/innobase/include/rem0rec.ic
index e16eab62181..cc66149945c 100644
--- a/storage/innobase/include/rem0rec.ic
+++ b/storage/innobase/include/rem0rec.ic
@@ -1,6 +1,7 @@
/*****************************************************************************
Copyright (c) 1994, 2015, Oracle and/or its affiliates. All Rights Reserved.
+Copyright (c) 2017, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -25,19 +26,9 @@ Created 5/30/1994 Heikki Tuuri
#include "mach0data.h"
#include "ut0byte.h"
-#include "dict0dict.h"
#include "dict0boot.h"
#include "btr0types.h"
-/* Compact flag ORed to the extra size returned by rec_get_offsets() */
-#define REC_OFFS_COMPACT ((ulint) 1 << 31)
-/* SQL NULL flag in offsets returned by rec_get_offsets() */
-#define REC_OFFS_SQL_NULL ((ulint) 1 << 31)
-/* External flag in offsets returned by rec_get_offsets() */
-#define REC_OFFS_EXTERNAL ((ulint) 1 << 30)
-/* Mask for offsets returned by rec_get_offsets() */
-#define REC_OFFS_MASK (REC_OFFS_EXTERNAL - 1)
-
/* Offsets of the bit-fields in an old-style record. NOTE! In the table the
most significant bytes and bits are written below less significant.
@@ -72,10 +63,11 @@ most significant bytes and bits are written below less significant.
relative_offset + offset_of_this_record
mod UNIV_PAGE_SIZE
3 3 bits status:
- 000=conventional record
- 001=node pointer record (inside B-tree)
- 010=infimum record
- 011=supremum record
+ 000=REC_STATUS_ORDINARY
+ 001=REC_STATUS_NODE_PTR
+ 010=REC_STATUS_INFIMUM
+ 011=REC_STATUS_SUPREMUM
+ 100=REC_STATUS_COLUMNS_ADDED
1xx=reserved
5 bits heap number
4 8 bits heap number
@@ -98,10 +90,6 @@ and the shift needed to obtain each bit-field of the record. */
#define REC_OLD_N_FIELDS_MASK 0x7FEUL
#define REC_OLD_N_FIELDS_SHIFT 1
-#define REC_NEW_STATUS 3 /* This is single byte bit-field */
-#define REC_NEW_STATUS_MASK 0x7UL
-#define REC_NEW_STATUS_SHIFT 0
-
#define REC_OLD_HEAP_NO 5
#define REC_HEAP_NO_MASK 0xFFF8UL
#if 0 /* defined in rem0rec.h for use of page0zip.cc */
@@ -456,26 +444,6 @@ rec_set_n_fields_old(
}
/******************************************************//**
-The following function retrieves the status bits of a new-style record.
-@return status bits */
-UNIV_INLINE
-ulint
-rec_get_status(
-/*===========*/
- const rec_t* rec) /*!< in: physical record */
-{
- ulint ret;
-
- ut_ad(rec);
-
- ret = rec_get_bit_field_1(rec, REC_NEW_STATUS,
- REC_NEW_STATUS_MASK, REC_NEW_STATUS_SHIFT);
- ut_ad((ret & ~REC_NEW_STATUS_MASK) == 0);
-
- return(ret);
-}
-
-/******************************************************//**
The following function is used to get the number of fields
in a record.
@return number of data fields */
@@ -494,6 +462,7 @@ rec_get_n_fields(
}
switch (rec_get_status(rec)) {
+ case REC_STATUS_COLUMNS_ADDED:
case REC_STATUS_ORDINARY:
return(dict_index_get_n_fields(index));
case REC_STATUS_NODE_PTR:
@@ -501,10 +470,10 @@ rec_get_n_fields(
case REC_STATUS_INFIMUM:
case REC_STATUS_SUPREMUM:
return(1);
- default:
- ut_error;
- return(ULINT_UNDEFINED);
}
+
+ ut_error;
+ return(ULINT_UNDEFINED);
}
/** Confirms the n_fields of the entry is sane with comparing the other
@@ -520,13 +489,15 @@ rec_n_fields_is_sane(
const rec_t* rec,
const dtuple_t* entry)
{
- return(rec_get_n_fields(rec, index)
- == dtuple_get_n_fields(entry)
+ const ulint n_fields = rec_get_n_fields(rec, index);
+
+ return(n_fields == dtuple_get_n_fields(entry)
+ || (index->is_instant()
+ && n_fields >= index->n_core_fields)
/* a record for older SYS_INDEXES table
(missing merge_threshold column) is acceptable. */
|| (index->table->id == DICT_INDEXES_ID
- && rec_get_n_fields(rec, index)
- == dtuple_get_n_fields(entry) - 1));
+ && n_fields == dtuple_get_n_fields(entry) - 1));
}
/******************************************************//**
@@ -645,19 +616,6 @@ rec_set_info_bits_new(
}
/******************************************************//**
-The following function is used to set the status bits of a new-style record. */
-UNIV_INLINE
-void
-rec_set_status(
-/*===========*/
- rec_t* rec, /*!< in/out: physical record */
- ulint bits) /*!< in: info bits */
-{
- rec_set_bit_field_1(rec, bits, REC_NEW_STATUS,
- REC_NEW_STATUS_MASK, REC_NEW_STATUS_SHIFT);
-}
-
-/******************************************************//**
The following function is used to retrieve the info and status
bits of a record. (Only compact records have status bits.)
@return info bits */
@@ -924,29 +882,6 @@ rec_2_is_field_extern(
return(rec_2_get_field_end_info(rec, n) & REC_2BYTE_EXTERN_MASK);
}
-/* Get the base address of offsets. The extra_size is stored at
-this position, and following positions hold the end offsets of
-the fields. */
-#define rec_offs_base(offsets) (offsets + REC_OFFS_HEADER_SIZE)
-
-/**********************************************************//**
-The following function returns the number of allocated elements
-for an array of offsets.
-@return number of elements */
-UNIV_INLINE
-ulint
-rec_offs_get_n_alloc(
-/*=================*/
- const ulint* offsets)/*!< in: array for rec_get_offsets() */
-{
- ulint n_alloc;
- ut_ad(offsets);
- n_alloc = offsets[0];
- ut_ad(n_alloc > REC_OFFS_HEADER_SIZE);
- UNIV_MEM_ASSERT_W(offsets, n_alloc * sizeof *offsets);
- return(n_alloc);
-}
-
/**********************************************************//**
The following function sets the number of allocated elements
for an array of offsets. */
@@ -964,102 +899,6 @@ rec_offs_set_n_alloc(
offsets[0] = n_alloc;
}
-/**********************************************************//**
-The following function returns the number of fields in a record.
-@return number of fields */
-UNIV_INLINE
-ulint
-rec_offs_n_fields(
-/*==============*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
-{
- ulint n_fields;
- ut_ad(offsets);
- n_fields = offsets[1];
- ut_ad(n_fields > 0);
- ut_ad(n_fields <= REC_MAX_N_FIELDS);
- ut_ad(n_fields + REC_OFFS_HEADER_SIZE
- <= rec_offs_get_n_alloc(offsets));
- return(n_fields);
-}
-
-/************************************************************//**
-Validates offsets returned by rec_get_offsets().
-@return TRUE if valid */
-UNIV_INLINE
-ibool
-rec_offs_validate(
-/*==============*/
- const rec_t* rec, /*!< in: record or NULL */
- const dict_index_t* index, /*!< in: record descriptor or NULL */
- const ulint* offsets)/*!< in: array returned by
- rec_get_offsets() */
-{
- ulint i = rec_offs_n_fields(offsets);
- ulint last = ULINT_MAX;
- ulint comp = *rec_offs_base(offsets) & REC_OFFS_COMPACT;
-
- if (rec) {
- ut_ad((ulint) rec == offsets[2]);
- if (!comp) {
- ut_a(rec_get_n_fields_old(rec) >= i);
- }
- }
- if (index) {
- ulint max_n_fields;
- ut_ad((ulint) index == offsets[3]);
- max_n_fields = ut_max(
- dict_index_get_n_fields(index),
- dict_index_get_n_unique_in_tree(index) + 1);
- if (comp && rec) {
- switch (rec_get_status(rec)) {
- case REC_STATUS_ORDINARY:
- break;
- case REC_STATUS_NODE_PTR:
- max_n_fields = dict_index_get_n_unique_in_tree(
- index) + 1;
- break;
- case REC_STATUS_INFIMUM:
- case REC_STATUS_SUPREMUM:
- max_n_fields = 1;
- break;
- default:
- ut_error;
- }
- }
- /* index->n_def == 0 for dummy indexes if !comp */
- ut_a(!comp || index->n_def);
- ut_a(!index->n_def || i <= max_n_fields);
- }
- while (i--) {
- ulint curr = rec_offs_base(offsets)[1 + i] & REC_OFFS_MASK;
- ut_a(curr <= last);
- last = curr;
- }
- return(TRUE);
-}
-#ifdef UNIV_DEBUG
-/************************************************************//**
-Updates debug data in offsets, in order to avoid bogus
-rec_offs_validate() failures. */
-UNIV_INLINE
-void
-rec_offs_make_valid(
-/*================*/
- const rec_t* rec, /*!< in: record */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in: array returned by
- rec_get_offsets() */
-{
- ut_ad(rec);
- ut_ad(index);
- ut_ad(offsets);
- ut_ad(rec_get_n_fields(rec, index) >= rec_offs_n_fields(offsets));
- offsets[2] = (ulint) rec;
- offsets[3] = (ulint) index;
-}
-#endif /* UNIV_DEBUG */
-
/************************************************************//**
The following function is used to get an offset to the nth
data field in a record.
@@ -1071,7 +910,7 @@ rec_get_nth_field_offs(
const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
ulint n, /*!< in: index of the field */
ulint* len) /*!< out: length of the field; UNIV_SQL_NULL
- if SQL null */
+ if SQL null; UNIV_SQL_DEFAULT if it is the default value */
{
ulint offs;
ulint length;
@@ -1088,6 +927,8 @@ rec_get_nth_field_offs(
if (length & REC_OFFS_SQL_NULL) {
length = UNIV_SQL_NULL;
+ } else if (length & REC_OFFS_DEFAULT) {
+ length = UNIV_SQL_DEFAULT;
} else {
length &= REC_OFFS_MASK;
length -= offs;
@@ -1098,34 +939,6 @@ rec_get_nth_field_offs(
}
/******************************************************//**
-Determine if the offsets are for a record in the new
-compact format.
-@return nonzero if compact format */
-UNIV_INLINE
-ulint
-rec_offs_comp(
-/*==========*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
-{
- ut_ad(rec_offs_validate(NULL, NULL, offsets));
- return(*rec_offs_base(offsets) & REC_OFFS_COMPACT);
-}
-
-/******************************************************//**
-Determine if the offsets are for a record containing
-externally stored columns.
-@return nonzero if externally stored */
-UNIV_INLINE
-ulint
-rec_offs_any_extern(
-/*================*/
- const ulint* offsets)/*!< in: array returned by rec_get_offsets() */
-{
- ut_ad(rec_offs_validate(NULL, NULL, offsets));
- return(*rec_offs_base(offsets) & REC_OFFS_EXTERNAL);
-}
-
-/******************************************************//**
Determine if the offsets are for a record containing null BLOB pointers.
@return first field containing a null BLOB pointer, or NULL if none found */
UNIV_INLINE
@@ -1166,29 +979,14 @@ Returns nonzero if the extern bit is set in nth field of rec.
@return nonzero if externally stored */
UNIV_INLINE
ulint
-rec_offs_nth_extern(
+rec_offs_nth_extern_old(
/*================*/
- const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
- ulint n) /*!< in: nth field */
-{
- ut_ad(rec_offs_validate(NULL, NULL, offsets));
- ut_ad(n < rec_offs_n_fields(offsets));
- return(rec_offs_base(offsets)[1 + n] & REC_OFFS_EXTERNAL);
-}
-
-/******************************************************//**
-Returns nonzero if the SQL NULL bit is set in nth field of rec.
-@return nonzero if SQL NULL */
-UNIV_INLINE
-ulint
-rec_offs_nth_sql_null(
-/*==================*/
- const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
- ulint n) /*!< in: nth field */
+ const rec_t* rec, /*!< in: record */
+ ulint n) /*!< in: index of the field */
{
- ut_ad(rec_offs_validate(NULL, NULL, offsets));
- ut_ad(n < rec_offs_n_fields(offsets));
- return(rec_offs_base(offsets)[1 + n] & REC_OFFS_SQL_NULL);
+ if (rec_get_1byte_offs_flag(rec))
+ return(0);
+ return(rec_2_get_field_end_info(rec, n) & REC_2BYTE_EXTERN_MASK);
}
/******************************************************//**
@@ -1426,6 +1224,7 @@ rec_set_nth_field(
ut_ad(rec);
ut_ad(rec_offs_validate(rec, NULL, offsets));
+ ut_ad(!rec_offs_nth_default(offsets, n));
if (len == UNIV_SQL_NULL) {
if (!rec_offs_nth_sql_null(offsets, n)) {
@@ -1436,7 +1235,7 @@ rec_set_nth_field(
return;
}
- data2 = rec_get_nth_field(rec, offsets, n, &len2);
+ data2 = (byte*)rec_get_nth_field(rec, offsets, n, &len2);
if (len2 == UNIV_SQL_NULL) {
ut_ad(!rec_offs_comp(offsets));
rec_set_nth_field_null_bit(rec, n, FALSE);
@@ -1517,7 +1316,7 @@ rec_offs_extra_size(
{
ulint size;
ut_ad(rec_offs_validate(NULL, NULL, offsets));
- size = *rec_offs_base(offsets) & ~(REC_OFFS_COMPACT | REC_OFFS_EXTERNAL);
+ size = *rec_offs_base(offsets) & REC_OFFS_MASK;
ut_ad(size < UNIV_PAGE_SIZE);
return(size);
}
@@ -1630,27 +1429,34 @@ rec_get_converted_size(
ut_ad(index);
ut_ad(dtuple);
ut_ad(dtuple_check_typed(dtuple));
-
- ut_ad(dict_index_is_ibuf(index)
-
- || dtuple_get_n_fields(dtuple)
- == (((dtuple_get_info_bits(dtuple) & REC_NEW_STATUS_MASK)
- == REC_STATUS_NODE_PTR)
- ? dict_index_get_n_unique_in_tree_nonleaf(index) + 1
- : dict_index_get_n_fields(index))
-
- /* a record for older SYS_INDEXES table
- (missing merge_threshold column) is acceptable. */
- || (index->table->id == DICT_INDEXES_ID
- && dtuple_get_n_fields(dtuple)
- == dict_index_get_n_fields(index) - 1));
+#ifdef UNIV_DEBUG
+ if (dict_index_is_ibuf(index)) {
+ ut_ad(dtuple->n_fields > 1);
+ } else if ((dtuple_get_info_bits(dtuple) & REC_NEW_STATUS_MASK)
+ == REC_STATUS_NODE_PTR) {
+ ut_ad(dtuple->n_fields
+ == dict_index_get_n_unique_in_tree_nonleaf(index) + 1);
+ } else if (index->table->id == DICT_INDEXES_ID) {
+ /* The column SYS_INDEXES.MERGE_THRESHOLD was
+ instantly added in MariaDB 10.2.2 (MySQL 5.7). */
+ ut_ad(index->n_fields == DICT_NUM_FIELDS__SYS_INDEXES);
+ ut_ad(dtuple->n_fields == DICT_NUM_FIELDS__SYS_INDEXES
+ || dtuple->n_fields
+ == DICT_FLD__SYS_INDEXES__MERGE_THRESHOLD);
+ } else {
+ ut_ad(dtuple->n_fields >= index->n_core_fields);
+ ut_ad(dtuple->n_fields <= index->n_fields);
+ }
+#endif
if (dict_table_is_comp(index->table)) {
- return(rec_get_converted_size_comp(index,
- dtuple_get_info_bits(dtuple)
- & REC_NEW_STATUS_MASK,
- dtuple->fields,
- dtuple->n_fields, NULL));
+ return(rec_get_converted_size_comp(
+ index,
+ static_cast<rec_comp_status_t>(
+ dtuple->info_bits
+ & REC_NEW_STATUS_MASK),
+ dtuple->fields,
+ dtuple->n_fields, NULL));
}
data_size = dtuple_get_data_size(dtuple, 0);
@@ -1658,105 +1464,5 @@ rec_get_converted_size(
extra_size = rec_get_converted_extra_size(
data_size, dtuple_get_n_fields(dtuple), n_ext);
-#if 0
- /* This code is inactive since it may be the wrong place to add
- in the size of node pointers used in parent pages AND it is not
- currently needed since ha_innobase::max_supported_key_length()
- ensures that the key size limit for each page size is well below
- the actual limit ((free space on page / 4) - record overhead).
- But those limits will need to be raised when InnoDB can
- support multiple page sizes. At that time, we will need
- to consider the node pointer on these universal btrees. */
-
- if (dict_index_is_ibuf(index)) {
- /* This is for the insert buffer B-tree.
- All fields in the leaf tuple ascend to the
- parent node plus the child page pointer. */
-
- /* ibuf cannot contain externally stored fields */
- ut_ad(n_ext == 0);
-
- /* Add the data pointer and recompute extra_size
- based on one more field. */
- data_size += REC_NODE_PTR_SIZE;
- extra_size = rec_get_converted_extra_size(
- data_size,
- dtuple_get_n_fields(dtuple) + 1,
- 0);
-
- /* Be sure dtuple->n_fields has this node ptr
- accounted for. This function should correspond to
- what rec_convert_dtuple_to_rec() needs in storage.
- In optimistic insert or update-not-in-place, we will
- have to ensure that if the record is converted to a
- node pointer, it will not become too large.*/
- }
-#endif
-
return(data_size + extra_size);
}
-
-/** Fold a prefix of a physical record.
-@param[in] rec index record
-@param[in] offsets return value of rec_get_offsets()
-@param[in] n_fields number of complete fields to fold
-@param[in] n_bytes number of bytes to fold in the last field
-@param[in] index_id index tree ID
-@return the folded value */
-UNIV_INLINE
-ulint
-rec_fold(
- const rec_t* rec,
- const ulint* offsets,
- ulint n_fields,
- ulint n_bytes,
- index_id_t tree_id)
-{
- ulint i;
- const byte* data;
- ulint len;
- ulint fold;
- ulint n_fields_rec;
-
- ut_ad(rec_offs_validate(rec, NULL, offsets));
- ut_ad(rec_validate(rec, offsets));
- ut_ad(n_fields > 0 || n_bytes > 0);
-
- n_fields_rec = rec_offs_n_fields(offsets);
- ut_ad(n_fields <= n_fields_rec);
- ut_ad(n_fields < n_fields_rec || n_bytes == 0);
-
- if (n_fields > n_fields_rec) {
- n_fields = n_fields_rec;
- }
-
- if (n_fields == n_fields_rec) {
- n_bytes = 0;
- }
-
- fold = ut_fold_ull(tree_id);
-
- for (i = 0; i < n_fields; i++) {
- data = rec_get_nth_field(rec, offsets, i, &len);
-
- if (len != UNIV_SQL_NULL) {
- fold = ut_fold_ulint_pair(fold,
- ut_fold_binary(data, len));
- }
- }
-
- if (n_bytes > 0) {
- data = rec_get_nth_field(rec, offsets, i, &len);
-
- if (len != UNIV_SQL_NULL) {
- if (len > n_bytes) {
- len = n_bytes;
- }
-
- fold = ut_fold_ulint_pair(fold,
- ut_fold_binary(data, len));
- }
- }
-
- return(fold);
-}
diff --git a/storage/innobase/include/row0merge.h b/storage/innobase/include/row0merge.h
index bdfdc2f3c08..b7f9dd02cb0 100644
--- a/storage/innobase/include/row0merge.h
+++ b/storage/innobase/include/row0merge.h
@@ -263,7 +263,6 @@ row_merge_rename_index_to_drop(
MY_ATTRIBUTE((nonnull(1), warn_unused_result));
/** Create the index and load in to the dictionary.
-@param[in,out] trx trx (sets error_state)
@param[in,out] table the index is on this table
@param[in] index_def the index definition
@param[in] add_v new virtual columns added along with add
@@ -273,7 +272,6 @@ row_merge_rename_index_to_drop(
@return index, or NULL on error */
dict_index_t*
row_merge_create_index(
- trx_t* trx,
dict_table_t* table,
const index_def_t* index_def,
const dict_add_v_col_t* add_v,
diff --git a/storage/innobase/include/row0upd.h b/storage/innobase/include/row0upd.h
index ec7995dd096..92b5942966b 100644
--- a/storage/innobase/include/row0upd.h
+++ b/storage/innobase/include/row0upd.h
@@ -1,6 +1,7 @@
/*****************************************************************************
Copyright (c) 1996, 2016, Oracle and/or its affiliates. All Rights Reserved.
+Copyright (c) 2017, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -26,7 +27,6 @@ Created 12/27/1996 Heikki Tuuri
#ifndef row0upd_h
#define row0upd_h
-#include "univ.i"
#include "data0data.h"
#include "row0types.h"
#include "btr0types.h"
@@ -244,27 +244,19 @@ row_upd_build_difference_binary(
mem_heap_t* heap,
TABLE* mysql_table)
MY_ATTRIBUTE((nonnull(1,2,3,7), warn_unused_result));
-/***********************************************************//**
-Replaces the new column values stored in the update vector to the index entry
-given. */
+/** Apply an update vector to an index entry.
+@param[in,out] entry index entry to be updated; the clustered index record
+ must be covered by a lock or a page latch to prevent
+ deletion (rollback or purge)
+@param[in] index index of the entry
+@param[in] update update vector built for the entry
+@param[in,out] heap memory heap for copying off-page columns */
void
row_upd_index_replace_new_col_vals_index_pos(
-/*=========================================*/
- dtuple_t* entry, /*!< in/out: index entry where replaced;
- the clustered index record must be
- covered by a lock or a page latch to
- prevent deletion (rollback or purge) */
- dict_index_t* index, /*!< in: index; NOTE that this may also be a
- non-clustered index */
- const upd_t* update, /*!< in: an update vector built for the index so
- that the field number in an upd_field is the
- index position */
- ibool order_only,
- /*!< in: if TRUE, limit the replacement to
- ordering fields of index; note that this
- does not work for non-clustered indexes. */
- mem_heap_t* heap) /*!< in: memory heap for allocating and
- copying the new values */
+ dtuple_t* entry,
+ const dict_index_t* index,
+ const upd_t* update,
+ mem_heap_t* heap)
MY_ATTRIBUTE((nonnull));
/***********************************************************//**
Replaces the new column values stored in the update vector to the index entry
diff --git a/storage/innobase/include/srv0srv.h b/storage/innobase/include/srv0srv.h
index ae00ffe21cd..c471ce5d57d 100644
--- a/storage/innobase/include/srv0srv.h
+++ b/storage/innobase/include/srv0srv.h
@@ -1007,6 +1007,9 @@ struct export_var_t{
ulint innodb_defragment_count; /*!< Number of defragment
operations*/
+ /** Number of instant ALTER TABLE operations that affect columns */
+ ulong innodb_instant_alter_column;
+
ulint innodb_onlineddl_rowlog_rows; /*!< Online alter rows */
ulint innodb_onlineddl_rowlog_pct_used; /*!< Online alter percentage
of used row log buffer */
diff --git a/storage/innobase/include/trx0rec.h b/storage/innobase/include/trx0rec.h
index 3dc35c7fda8..a6889696036 100644
--- a/storage/innobase/include/trx0rec.h
+++ b/storage/innobase/include/trx0rec.h
@@ -309,6 +309,8 @@ trx_undo_read_v_idx(
compilation info multiplied by 16 is ORed to this value in an undo log
record */
+#define TRX_UNDO_INSERT_DEFAULT 10 /* insert a "default value"
+ pseudo-record for instant ALTER */
#define TRX_UNDO_INSERT_REC 11 /* fresh insert into clustered index */
#define TRX_UNDO_UPD_EXIST_REC 12 /* update of a non-delete-marked
record */
@@ -324,6 +326,9 @@ record */
storage fields: used by purge to
free the external storage */
+/** The search tuple corresponding to TRX_UNDO_INSERT_DEFAULT */
+extern const dtuple_t trx_undo_default_rec;
+
#include "trx0rec.ic"
#endif /* trx0rec_h */
diff --git a/storage/innobase/include/trx0trx.h b/storage/innobase/include/trx0trx.h
index 69bea016605..133f23081a0 100644
--- a/storage/innobase/include/trx0trx.h
+++ b/storage/innobase/include/trx0trx.h
@@ -1177,10 +1177,8 @@ struct trx_t {
trx_rsegs_t rsegs; /* rollback segments for undo logging */
undo_no_t roll_limit; /*!< least undo number to undo during
a partial rollback; 0 otherwise */
-#ifdef UNIV_DEBUG
bool in_rollback; /*!< true when the transaction is
executing a partial or full rollback */
-#endif /* UNIV_DEBUG */
ulint pages_undone; /*!< number of undo log pages undone
since the last undo log truncation */
/*------------------------------*/
diff --git a/storage/innobase/lock/lock0lock.cc b/storage/innobase/lock/lock0lock.cc
index 8f96fd577e7..fe9d9683785 100644
--- a/storage/innobase/lock/lock0lock.cc
+++ b/storage/innobase/lock/lock0lock.cc
@@ -356,6 +356,9 @@ lock_report_trx_id_insanity(
const ulint* offsets, /*!< in: rec_get_offsets(rec, index) */
trx_id_t max_trx_id) /*!< in: trx_sys_get_max_trx_id() */
{
+ ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(!rec_is_default_row(rec, index));
+
ib::error()
<< "Transaction id " << trx_id
<< " associated with record" << rec_offsets_print(rec, offsets)
@@ -382,6 +385,7 @@ lock_check_trx_id_sanity(
const ulint* offsets) /*!< in: rec_get_offsets(rec, index) */
{
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(!rec_is_default_row(rec, index));
trx_id_t max_trx_id = trx_sys_get_max_trx_id();
bool is_ok = trx_id < max_trx_id;
@@ -410,6 +414,7 @@ lock_clust_rec_cons_read_sees(
ut_ad(dict_index_is_clust(index));
ut_ad(page_rec_is_user_rec(rec));
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(!rec_is_default_row(rec, index));
/* Temp-tables are not shared across connections and multiple
transactions from different connections cannot simultaneously
@@ -448,6 +453,8 @@ lock_sec_rec_cons_read_sees(
const ReadView* view) /*!< in: consistent read view */
{
ut_ad(page_rec_is_user_rec(rec));
+ ut_ad(!index->is_clust());
+ ut_ad(!rec_is_default_row(rec, index));
/* NOTE that we might call this function while holding the search
system latch. */
@@ -1492,6 +1499,7 @@ lock_sec_rec_some_has_impl(
ut_ad(!dict_index_is_clust(index));
ut_ad(page_rec_is_user_rec(rec));
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(!rec_is_default_row(rec, index));
max_trx_id = page_get_max_trx_id(page);
@@ -1539,6 +1547,8 @@ lock_rec_other_trx_holds_expl(
const buf_block_t* block) /*!< in: buffer block
containing the record */
{
+ ut_ad(!page_rec_is_default_row(rec));
+
trx_t* holds = NULL;
lock_mutex_enter();
@@ -3574,6 +3584,9 @@ lock_move_reorganize_page(
for (;;) {
ulint old_heap_no;
ulint new_heap_no;
+ ut_d(const rec_t* const orec = rec1);
+ ut_ad(page_rec_is_default_row(rec1)
+ == page_rec_is_default_row(rec2));
if (comp) {
old_heap_no = rec_get_heap_no_new(rec2);
@@ -3594,6 +3607,8 @@ lock_move_reorganize_page(
/* Clear the bit in old_lock. */
if (old_heap_no < lock->un_member.rec_lock.n_bits
&& lock_rec_reset_nth_bit(lock, old_heap_no)) {
+ ut_ad(!page_rec_is_default_row(orec));
+
/* NOTE that the old lock bitmap could be too
small for the new heap number! */
@@ -3673,6 +3688,10 @@ lock_move_rec_list_end(
reset the lock bits on the old */
for (;;) {
+ ut_ad(page_rec_is_default_row(rec1)
+ == page_rec_is_default_row(rec2));
+ ut_d(const rec_t* const orec = rec1);
+
ulint rec1_heap_no;
ulint rec2_heap_no;
@@ -3695,8 +3714,11 @@ lock_move_rec_list_end(
rec2_heap_no = rec_get_heap_no_old(rec2);
+ ut_ad(rec_get_data_size_old(rec1)
+ == rec_get_data_size_old(rec2));
+
ut_ad(!memcmp(rec1, rec2,
- rec_get_data_size_old(rec2)));
+ rec_get_data_size_old(rec1)));
rec1 = page_rec_get_next_low(rec1, FALSE);
rec2 = page_rec_get_next_low(rec2, FALSE);
@@ -3704,6 +3726,8 @@ lock_move_rec_list_end(
if (rec1_heap_no < lock->un_member.rec_lock.n_bits
&& lock_rec_reset_nth_bit(lock, rec1_heap_no)) {
+ ut_ad(!page_rec_is_default_row(orec));
+
if (type_mode & LOCK_WAIT) {
lock_reset_lock_and_trx_wait(lock);
}
@@ -3747,6 +3771,7 @@ lock_move_rec_list_start(
ut_ad(block->frame == page_align(rec));
ut_ad(new_block->frame == page_align(old_end));
ut_ad(comp == page_rec_is_comp(old_end));
+ ut_ad(!page_rec_is_default_row(rec));
lock_mutex_enter();
@@ -3772,6 +3797,9 @@ lock_move_rec_list_start(
reset the lock bits on the old */
while (rec1 != rec) {
+ ut_ad(!page_rec_is_default_row(rec1));
+ ut_ad(!page_rec_is_default_row(rec2));
+
ulint rec1_heap_no;
ulint rec2_heap_no;
@@ -3872,6 +3900,8 @@ lock_rtr_move_rec_list(
rec1 = rec_move[moved].old_rec;
rec2 = rec_move[moved].new_rec;
+ ut_ad(!page_rec_is_default_row(rec1));
+ ut_ad(!page_rec_is_default_row(rec2));
if (comp) {
rec1_heap_no = rec_get_heap_no_new(rec1);
@@ -3950,6 +3980,8 @@ lock_update_merge_right(
page which will be
discarded */
{
+ ut_ad(!page_rec_is_default_row(orig_succ));
+
lock_mutex_enter();
/* Inherit the locks from the supremum of the left page to the
@@ -4207,6 +4239,7 @@ lock_update_insert(
ulint donator_heap_no;
ut_ad(block->frame == page_align(rec));
+ ut_ad(!page_rec_is_default_row(rec));
/* Inherit the gap-locking locks for rec, in gap mode, from the next
record */
@@ -4238,6 +4271,7 @@ lock_update_delete(
ulint next_heap_no;
ut_ad(page == page_align(rec));
+ ut_ad(!page_rec_is_default_row(rec));
if (page_is_comp(page)) {
heap_no = rec_get_heap_no_new(rec);
@@ -5073,6 +5107,7 @@ lock_rec_unlock(
ut_ad(block->frame == page_align(rec));
ut_ad(!trx->lock.wait_lock);
ut_ad(trx_state_eq(trx, TRX_STATE_ACTIVE));
+ ut_ad(!page_rec_is_default_row(rec));
heap_no = page_rec_get_heap_no(rec);
@@ -5599,6 +5634,7 @@ lock_rec_print(FILE* file, const lock_t* lock)
rec = page_find_rec_with_heap_no(
buf_block_get_frame(block), i);
+ ut_ad(!page_rec_is_default_row(rec));
offsets = rec_get_offsets(
rec, lock->index, offsets, true,
@@ -6228,6 +6264,7 @@ lock_rec_queue_validate(
ut_a(block->frame == page_align(rec));
ut_ad(rec_offs_validate(rec, index, offsets));
ut_ad(!page_rec_is_comp(rec) == !rec_offs_comp(offsets));
+ ut_ad(page_rec_is_leaf(rec));
ut_ad(lock_mutex_own() == locked_lock_trx_sys);
ut_ad(!index || dict_index_is_clust(index)
|| !dict_index_is_online_ddl(index));
@@ -6330,6 +6367,7 @@ lock_rec_queue_validate(
lock = lock_rec_get_next_const(heap_no, lock)) {
ut_ad(!trx_is_ac_nl_ro(lock->trx));
+ ut_ad(!page_rec_is_default_row(rec));
if (index) {
ut_a(lock->index == index);
@@ -6672,6 +6710,7 @@ lock_rec_insert_check_and_lock(
|| dict_index_is_clust(index)
|| (flags & BTR_CREATE_FLAG));
ut_ad(mtr->is_named_space(index->space));
+ ut_ad(page_rec_is_leaf(rec));
if (flags & BTR_NO_LOCKING_FLAG) {
@@ -6686,6 +6725,7 @@ lock_rec_insert_check_and_lock(
trx_t* trx = thr_get_trx(thr);
const rec_t* next_rec = page_rec_get_next_const(rec);
ulint heap_no = page_rec_get_heap_no(next_rec);
+ ut_ad(!rec_is_default_row(next_rec, index));
lock_mutex_enter();
/* Because this code is invoked for a running transaction by
@@ -6811,6 +6851,8 @@ lock_rec_convert_impl_to_expl_for_trx(
ulint heap_no)/*!< in: rec heap number to lock */
{
ut_ad(trx_is_referenced(trx));
+ ut_ad(page_rec_is_leaf(rec));
+ ut_ad(!rec_is_default_row(rec, index));
DEBUG_SYNC_C("before_lock_rec_convert_impl_to_expl_for_trx");
@@ -6855,6 +6897,8 @@ lock_rec_convert_impl_to_expl(
ut_ad(page_rec_is_user_rec(rec));
ut_ad(rec_offs_validate(rec, index, offsets));
ut_ad(!page_rec_is_comp(rec) == !rec_offs_comp(offsets));
+ ut_ad(page_rec_is_leaf(rec));
+ ut_ad(!rec_is_default_row(rec, index));
if (dict_index_is_clust(index)) {
trx_id_t trx_id;
@@ -6909,6 +6953,7 @@ lock_clust_rec_modify_check_and_lock(
ulint heap_no;
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(page_rec_is_leaf(rec));
ut_ad(dict_index_is_clust(index));
ut_ad(block->frame == page_align(rec));
@@ -6916,6 +6961,7 @@ lock_clust_rec_modify_check_and_lock(
return(DB_SUCCESS);
}
+ ut_ad(!rec_is_default_row(rec, index));
ut_ad(!dict_table_is_temporary(index->table));
heap_no = rec_offs_comp(offsets)
@@ -6974,6 +7020,8 @@ lock_sec_rec_modify_check_and_lock(
ut_ad(!dict_index_is_online_ddl(index) || (flags & BTR_CREATE_FLAG));
ut_ad(block->frame == page_align(rec));
ut_ad(mtr->is_named_space(index->space));
+ ut_ad(page_rec_is_leaf(rec));
+ ut_ad(!rec_is_default_row(rec, index));
if (flags & BTR_NO_LOCKING_FLAG) {
@@ -7066,6 +7114,7 @@ lock_sec_rec_read_check_and_lock(
ut_ad(block->frame == page_align(rec));
ut_ad(page_rec_is_user_rec(rec) || page_rec_is_supremum(rec));
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(page_rec_is_leaf(rec));
ut_ad(mode == LOCK_X || mode == LOCK_S);
if ((flags & BTR_NO_LOCKING_FLAG)
@@ -7075,6 +7124,7 @@ lock_sec_rec_read_check_and_lock(
return(DB_SUCCESS);
}
+ ut_ad(!rec_is_default_row(rec, index));
heap_no = page_rec_get_heap_no(rec);
/* Some transaction may have an implicit x-lock on the record only
@@ -7146,6 +7196,8 @@ lock_clust_rec_read_check_and_lock(
ut_ad(gap_mode == LOCK_ORDINARY || gap_mode == LOCK_GAP
|| gap_mode == LOCK_REC_NOT_GAP);
ut_ad(rec_offs_validate(rec, index, offsets));
+ ut_ad(page_rec_is_leaf(rec));
+ ut_ad(!rec_is_default_row(rec, index));
if ((flags & BTR_NO_LOCKING_FLAG)
|| srv_read_only_mode
@@ -8559,12 +8611,14 @@ lock_update_split_and_merge(
{
const rec_t* left_next_rec;
- ut_a(left_block && right_block);
- ut_a(orig_pred);
+ ut_ad(page_is_leaf(left_block->frame));
+ ut_ad(page_is_leaf(right_block->frame));
+ ut_ad(page_align(orig_pred) == left_block->frame);
lock_mutex_enter();
left_next_rec = page_rec_get_next_const(orig_pred);
+ ut_ad(!page_rec_is_default_row(left_next_rec));
/* Inherit the locks on the supremum of the left page to the
first record which was moved from the right page */
diff --git a/storage/innobase/mtr/mtr0log.cc b/storage/innobase/mtr/mtr0log.cc
index 8cfde15a3ba..c33e792ec04 100644
--- a/storage/innobase/mtr/mtr0log.cc
+++ b/storage/innobase/mtr/mtr0log.cc
@@ -436,8 +436,9 @@ mlog_open_and_write_index(
log_end = log_ptr + 11 + size;
} else {
ulint i;
+ bool is_instant = index->is_instant();
ulint n = dict_index_get_n_fields(index);
- ulint total = 11 + size + (n + 2) * 2;
+ ulint total = 11 + (is_instant ? 2 : 0) + size + (n + 2) * 2;
ulint alloc = total;
if (alloc > mtr_buf_t::MAX_DATA_SIZE) {
@@ -463,7 +464,18 @@ mlog_open_and_write_index(
log_ptr = mlog_write_initial_log_record_fast(
rec, type, log_ptr, mtr);
- mach_write_to_2(log_ptr, n);
+ if (is_instant) {
+ // marked as instant index
+ mach_write_to_2(log_ptr, n | 0x8000);
+
+ log_ptr += 2;
+
+ /* record the n_core_fields */
+ mach_write_to_2(log_ptr, index->n_core_fields);
+ } else {
+ mach_write_to_2(log_ptr, n);
+ }
+
log_ptr += 2;
if (is_leaf) {
@@ -540,6 +552,7 @@ mlog_parse_index(
ulint i, n, n_uniq;
dict_table_t* table;
dict_index_t* ind;
+ ulint n_core_fields = 0;
ut_ad(comp == FALSE || comp == TRUE);
@@ -549,6 +562,23 @@ mlog_parse_index(
}
n = mach_read_from_2(ptr);
ptr += 2;
+ if (n & 0x8000) { /* record after instant ADD COLUMN */
+ n &= 0x7FFF;
+
+ n_core_fields = mach_read_from_2(ptr);
+
+ if (!n_core_fields || n_core_fields > n) {
+ recv_sys->found_corrupt_log = TRUE;
+ return(NULL);
+ }
+
+ ptr += 2;
+
+ if (end_ptr < ptr + 2) {
+ return(NULL);
+ }
+ }
+
n_uniq = mach_read_from_2(ptr);
ptr += 2;
ut_ad(n_uniq <= n);
@@ -600,6 +630,22 @@ mlog_parse_index(
ind->fields[DATA_ROLL_PTR - 1 + n_uniq].col
= &table->cols[n + DATA_ROLL_PTR];
}
+
+ ut_ad(table->n_cols == table->n_def);
+
+ if (n_core_fields) {
+ for (i = n_core_fields; i < n; i++) {
+ ind->fields[i].col->def_val.len
+ = UNIV_SQL_NULL;
+ }
+ ind->n_core_fields = n_core_fields;
+ ind->n_core_null_bytes = UT_BITS_IN_BYTES(
+ ind->get_n_nullable(n_core_fields));
+ } else {
+ ind->n_core_null_bytes = UT_BITS_IN_BYTES(
+ ind->n_nullable);
+ ind->n_core_fields = ind->n_fields;
+ }
}
/* avoid ut_ad(index->cached) in dict_index_get_n_unique_in_tree */
ind->cached = TRUE;
diff --git a/storage/innobase/page/page0cur.cc b/storage/innobase/page/page0cur.cc
index 113c31e44ae..5ede7faf3ba 100644
--- a/storage/innobase/page/page0cur.cc
+++ b/storage/innobase/page/page0cur.cc
@@ -360,19 +360,15 @@ page_cur_search_with_match(
#ifdef BTR_CUR_HASH_ADAPT
if (is_leaf
- && (mode == PAGE_CUR_LE)
+ && page_get_direction(page) == PAGE_RIGHT
+ && page_header_get_offs(page, PAGE_LAST_INSERT)
+ && mode == PAGE_CUR_LE
&& !dict_index_is_spatial(index)
- && (page_header_get_field(page, PAGE_N_DIRECTION) > 3)
- && (page_header_get_ptr(page, PAGE_LAST_INSERT))
- && (page_header_get_field(page, PAGE_DIRECTION) == PAGE_RIGHT)) {
-
- if (page_cur_try_search_shortcut(
- block, index, tuple,
- iup_matched_fields,
- ilow_matched_fields,
- cursor)) {
- return;
- }
+ && page_header_get_field(page, PAGE_N_DIRECTION) > 3
+ && page_cur_try_search_shortcut(
+ block, index, tuple,
+ iup_matched_fields, ilow_matched_fields, cursor)) {
+ return;
}
# ifdef PAGE_CUR_DBG
if (mode == PAGE_CUR_DBG) {
@@ -602,6 +598,7 @@ page_cur_search_with_match_bytes(
rec_offs_init(offsets_);
ut_ad(dtuple_validate(tuple));
+ ut_ad(!(tuple->info_bits & REC_INFO_MIN_REC_FLAG));
#ifdef UNIV_DEBUG
# ifdef PAGE_CUR_DBG
if (mode != PAGE_CUR_DBG)
@@ -621,18 +618,16 @@ page_cur_search_with_match_bytes(
#ifdef BTR_CUR_HASH_ADAPT
if (page_is_leaf(page)
- && (mode == PAGE_CUR_LE)
- && (page_header_get_field(page, PAGE_N_DIRECTION) > 3)
- && (page_header_get_ptr(page, PAGE_LAST_INSERT))
- && (page_header_get_field(page, PAGE_DIRECTION) == PAGE_RIGHT)) {
-
- if (page_cur_try_search_shortcut_bytes(
- block, index, tuple,
- iup_matched_fields, iup_matched_bytes,
- ilow_matched_fields, ilow_matched_bytes,
- cursor)) {
- return;
- }
+ && page_get_direction(page) == PAGE_RIGHT
+ && page_header_get_offs(page, PAGE_LAST_INSERT)
+ && mode == PAGE_CUR_LE
+ && page_header_get_field(page, PAGE_N_DIRECTION) > 3
+ && page_cur_try_search_shortcut_bytes(
+ block, index, tuple,
+ iup_matched_fields, iup_matched_bytes,
+ ilow_matched_fields, ilow_matched_bytes,
+ cursor)) {
+ return;
}
# ifdef PAGE_CUR_DBG
if (mode == PAGE_CUR_DBG) {
@@ -667,7 +662,7 @@ page_cur_search_with_match_bytes(
/* Perform binary search until the lower and upper limit directory
slots come to the distance 1 of each other */
- ut_d(bool is_leaf = page_is_leaf(page));
+ const bool is_leaf = page_is_leaf(page);
while (up - low > 1) {
mid = (low + up) / 2;
@@ -735,6 +730,19 @@ up_slot_match:
low_matched_fields, low_matched_bytes,
up_matched_fields, up_matched_bytes);
+ if (UNIV_UNLIKELY(rec_get_info_bits(
+ mid_rec,
+ dict_table_is_comp(index->table))
+ & REC_INFO_MIN_REC_FLAG)) {
+ ut_ad(mach_read_from_4(FIL_PAGE_PREV
+ + page_align(mid_rec))
+ == FIL_NULL);
+ ut_ad(!page_rec_is_leaf(mid_rec)
+ || rec_is_default_row(mid_rec, index));
+ cmp = 1;
+ goto low_rec_match;
+ }
+
offsets = rec_get_offsets(
mid_rec, index, offsets_, is_leaf,
dtuple_get_n_fields_cmp(tuple), &heap);
@@ -768,23 +776,6 @@ up_rec_match:
|| mode == PAGE_CUR_LE_OR_EXTENDS
#endif /* PAGE_CUR_LE_OR_EXTENDS */
) {
- if (!cmp && !cur_matched_fields) {
-#ifdef UNIV_DEBUG
- mtr_t mtr;
- mtr_start(&mtr);
-
- /* We got a match, but cur_matched_fields is
- 0, it must have REC_INFO_MIN_REC_FLAG */
- ulint rec_info = rec_get_info_bits(mid_rec,
- rec_offs_comp(offsets));
- ut_ad(rec_info & REC_INFO_MIN_REC_FLAG);
- ut_ad(btr_page_get_prev(page, &mtr) == FIL_NULL);
- mtr_commit(&mtr);
-#endif
-
- cur_matched_fields = dtuple_get_n_fields_cmp(tuple);
- }
-
goto low_rec_match;
} else {
@@ -865,7 +856,7 @@ page_cur_insert_rec_write_log(
ut_ad(!page_rec_is_comp(insert_rec)
== !dict_table_is_comp(index->table));
- ut_d(const bool is_leaf = page_rec_is_leaf(cursor_rec));
+ const bool is_leaf = page_rec_is_leaf(cursor_rec);
{
mem_heap_t* heap = NULL;
@@ -1139,7 +1130,7 @@ page_cur_parse_insert_rec(
/* Read from the log the inserted index record end segment which
differs from the cursor record */
- ut_d(bool is_leaf = page_is_leaf(page));
+ const bool is_leaf = page_is_leaf(page);
offsets = rec_get_offsets(cursor_rec, index, offsets, is_leaf,
ULINT_UNDEFINED, &heap);
@@ -1176,15 +1167,13 @@ page_cur_parse_insert_rec(
ut_memcpy(buf + mismatch_index, ptr, end_seg_len);
if (page_is_comp(page)) {
- /* Make rec_get_offsets() and rec_offs_make_valid() happy. */
- ut_d(rec_set_heap_no_new(buf + origin_offset,
- PAGE_HEAP_NO_USER_LOW));
+ rec_set_heap_no_new(buf + origin_offset,
+ PAGE_HEAP_NO_USER_LOW);
rec_set_info_and_status_bits(buf + origin_offset,
info_and_status_bits);
} else {
- /* Make rec_get_offsets() and rec_offs_make_valid() happy. */
- ut_d(rec_set_heap_no_old(buf + origin_offset,
- PAGE_HEAP_NO_USER_LOW));
+ rec_set_heap_no_old(buf + origin_offset,
+ PAGE_HEAP_NO_USER_LOW);
rec_set_info_bits_old(buf + origin_offset,
info_and_status_bits);
}
@@ -1213,6 +1202,50 @@ page_cur_parse_insert_rec(
return(const_cast<byte*>(ptr + end_seg_len));
}
+/** Reset PAGE_DIRECTION and PAGE_N_DIRECTION.
+@param[in,out] ptr the PAGE_DIRECTION_B field
+@param[in,out] page index tree page frame
+@param[in] page_zip compressed page descriptor, or NULL */
+static inline
+void
+page_direction_reset(byte* ptr, page_t* page, page_zip_des_t* page_zip)
+{
+ ut_ad(ptr == PAGE_HEADER + PAGE_DIRECTION_B + page);
+ page_ptr_set_direction(ptr, PAGE_NO_DIRECTION);
+ if (page_zip) {
+ page_zip_write_header(page_zip, ptr, 1, NULL);
+ }
+ ptr = PAGE_HEADER + PAGE_N_DIRECTION + page;
+ *reinterpret_cast<uint16_t*>(ptr) = 0;
+ if (page_zip) {
+ page_zip_write_header(page_zip, ptr, 2, NULL);
+ }
+}
+
+/** Increment PAGE_N_DIRECTION.
+@param[in,out] ptr the PAGE_DIRECTION_B field
+@param[in,out] page index tree page frame
+@param[in] page_zip compressed page descriptor, or NULL
+@param[in] dir PAGE_RIGHT or PAGE_LEFT */
+static inline
+void
+page_direction_increment(
+ byte* ptr,
+ page_t* page,
+ page_zip_des_t* page_zip,
+ uint dir)
+{
+ ut_ad(ptr == PAGE_HEADER + PAGE_DIRECTION_B + page);
+ ut_ad(dir == PAGE_RIGHT || dir == PAGE_LEFT);
+ page_ptr_set_direction(ptr, dir);
+ if (page_zip) {
+ page_zip_write_header(page_zip, ptr, 1, NULL);
+ }
+ page_header_set_field(page, page_zip, PAGE_N_DIRECTION,
+ page_header_get_field(page, PAGE_N_DIRECTION)
+ + 1);
+}
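The bookkeeping that the new page_direction_reset()/page_direction_increment() helpers centralize can be sketched as a small model. This is illustrative only — Dir and PageDirState are not InnoDB symbols — but it mirrors the heuristic used by the insert path: consecutive inserts after the previous insert count as rightward, consecutive inserts before it as leftward, and anything else resets the direction state.

```cpp
#include <cassert>

// Simplified model of PAGE_DIRECTION / PAGE_N_DIRECTION maintenance
// (illustrative names; not the actual InnoDB header fields).
enum class Dir { Left, Right, None };

struct PageDirState {
	Dir		dir = Dir::None;
	unsigned	n_direction = 0;

	void reset() { dir = Dir::None; n_direction = 0; }
	void increment(Dir d) { dir = d; ++n_direction; }

	// have_last: PAGE_LAST_INSERT is set; after_last: the new record
	// went immediately after the previous insert; before_last:
	// immediately before it.
	void on_insert(bool have_last, bool after_last, bool before_last)
	{
		if (!have_last) {
			reset();
		} else if (after_last && dir != Dir::Left) {
			increment(Dir::Right);
		} else if (dir != Dir::Right && before_last) {
			increment(Dir::Left);
		} else {
			reset();
		}
	}
};
```

A run of rightward inserts accumulates n_direction, which is what the PAGE_N_DIRECTION > 3 shortcut checks test for; a single reversal resets the counter.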
+
/***********************************************************//**
Inserts a record next to page cursor on an uncompressed page.
Returns pointer to inserted record if succeed, i.e., enough
@@ -1323,28 +1356,7 @@ use_heap:
/* 3. Create the record */
insert_rec = rec_copy(insert_buf, rec, offsets);
- rec_offs_make_valid(insert_rec, index, offsets);
-
- /* This is because assertion below is debug assertion */
-#ifdef UNIV_DEBUG
- if (UNIV_UNLIKELY(current_rec == insert_rec)) {
- ulint extra_len, data_len;
- extra_len = rec_offs_extra_size(offsets);
- data_len = rec_offs_data_size(offsets);
-
- fprintf(stderr, "InnoDB: Error: current_rec == insert_rec "
- " extra_len " ULINTPF
- " data_len " ULINTPF " insert_buf %p rec %p\n",
- extra_len, data_len, insert_buf, rec);
- fprintf(stderr, "InnoDB; Physical record: \n");
- rec_print(stderr, rec, index);
- fprintf(stderr, "InnoDB: Inserted record: \n");
- rec_print(stderr, insert_rec, index);
- fprintf(stderr, "InnoDB: Current record: \n");
- rec_print(stderr, current_rec, index);
- ut_a(current_rec != insert_rec);
- }
-#endif /* UNIV_DEBUG */
+ rec_offs_make_valid(insert_rec, index, page_is_leaf(page), offsets);
/* 4. Insert the record in the linked list of records */
ut_ad(current_rec != insert_rec);
@@ -1354,9 +1366,24 @@ use_heap:
rec_t* next_rec = page_rec_get_next(current_rec);
#ifdef UNIV_DEBUG
if (page_is_comp(page)) {
- ut_ad(rec_get_status(current_rec)
- <= REC_STATUS_INFIMUM);
- ut_ad(rec_get_status(insert_rec) < REC_STATUS_INFIMUM);
+ switch (rec_get_status(current_rec)) {
+ case REC_STATUS_ORDINARY:
+ case REC_STATUS_NODE_PTR:
+ case REC_STATUS_COLUMNS_ADDED:
+ case REC_STATUS_INFIMUM:
+ break;
+ case REC_STATUS_SUPREMUM:
+ ut_ad(!"wrong status on current_rec");
+ }
+ switch (rec_get_status(insert_rec)) {
+ case REC_STATUS_ORDINARY:
+ case REC_STATUS_NODE_PTR:
+ case REC_STATUS_COLUMNS_ADDED:
+ break;
+ case REC_STATUS_INFIMUM:
+ case REC_STATUS_SUPREMUM:
+ ut_ad(!"wrong status on insert_rec");
+ }
ut_ad(rec_get_status(next_rec) != REC_STATUS_INFIMUM);
}
#endif
@@ -1387,34 +1414,18 @@ use_heap:
== rec_get_node_ptr_flag(insert_rec));
if (!dict_index_is_spatial(index)) {
+ byte* ptr = PAGE_HEADER + PAGE_DIRECTION_B + page;
if (UNIV_UNLIKELY(last_insert == NULL)) {
- page_header_set_field(page, NULL, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(page, NULL, PAGE_N_DIRECTION, 0);
-
- } else if ((last_insert == current_rec)
- && (page_header_get_field(page, PAGE_DIRECTION)
- != PAGE_LEFT)) {
-
- page_header_set_field(page, NULL, PAGE_DIRECTION,
- PAGE_RIGHT);
- page_header_set_field(page, NULL, PAGE_N_DIRECTION,
- page_header_get_field(
- page, PAGE_N_DIRECTION) + 1);
-
- } else if ((page_rec_get_next(insert_rec) == last_insert)
- && (page_header_get_field(page, PAGE_DIRECTION)
- != PAGE_RIGHT)) {
-
- page_header_set_field(page, NULL, PAGE_DIRECTION,
- PAGE_LEFT);
- page_header_set_field(page, NULL, PAGE_N_DIRECTION,
- page_header_get_field(
- page, PAGE_N_DIRECTION) + 1);
+no_direction:
+ page_direction_reset(ptr, page, NULL);
+ } else if (last_insert == current_rec
+ && page_ptr_get_direction(ptr) != PAGE_LEFT) {
+ page_direction_increment(ptr, page, NULL, PAGE_RIGHT);
+ } else if (page_ptr_get_direction(ptr) != PAGE_RIGHT
+ && page_rec_get_next(insert_rec) == last_insert) {
+ page_direction_increment(ptr, page, NULL, PAGE_LEFT);
} else {
- page_header_set_field(page, NULL, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(page, NULL, PAGE_N_DIRECTION, 0);
+ goto no_direction;
}
}
@@ -1497,7 +1508,7 @@ page_cur_insert_rec_zip(
ut_ad(mach_read_from_8(page + PAGE_HEADER + PAGE_INDEX_ID) == index->id
|| (mtr ? mtr->is_inside_ibuf() : dict_index_is_ibuf(index))
|| recv_recovery_is_on());
-
+ ut_ad(!page_get_instant(page));
ut_ad(!page_cur_is_after_last(cursor));
#ifdef UNIV_ZIP_DEBUG
ut_a(page_zip_validate(page_zip, page, index));
@@ -1619,7 +1630,8 @@ page_cur_insert_rec_zip(
/* This should be followed by
MLOG_ZIP_PAGE_COMPRESS_NO_DATA,
which should succeed. */
- rec_offs_make_valid(insert_rec, index, offsets);
+ rec_offs_make_valid(insert_rec, index,
+ page_is_leaf(page), offsets);
} else {
ulint pos = page_rec_get_n_recs_before(insert_rec);
ut_ad(pos > 0);
@@ -1635,7 +1647,8 @@ page_cur_insert_rec_zip(
level, page, index, mtr);
rec_offs_make_valid(
- insert_rec, index, offsets);
+ insert_rec, index,
+ page_is_leaf(page), offsets);
return(insert_rec);
}
@@ -1678,7 +1691,8 @@ page_cur_insert_rec_zip(
insert_rec = page + rec_get_next_offs(
cursor->rec, TRUE);
rec_offs_make_valid(
- insert_rec, index, offsets);
+ insert_rec, index,
+ page_is_leaf(page), offsets);
return(insert_rec);
}
@@ -1820,7 +1834,7 @@ use_heap:
/* 3. Create the record */
insert_rec = rec_copy(insert_buf, rec, offsets);
- rec_offs_make_valid(insert_rec, index, offsets);
+ rec_offs_make_valid(insert_rec, index, page_is_leaf(page), offsets);
/* 4. Insert the record in the linked list of records */
ut_ad(cursor->rec != insert_rec);
@@ -1859,36 +1873,20 @@ use_heap:
== rec_get_node_ptr_flag(insert_rec));
if (!dict_index_is_spatial(index)) {
+ byte* ptr = PAGE_HEADER + PAGE_DIRECTION_B + page;
if (UNIV_UNLIKELY(last_insert == NULL)) {
- page_header_set_field(page, page_zip, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(page, page_zip,
- PAGE_N_DIRECTION, 0);
-
- } else if ((last_insert == cursor->rec)
- && (page_header_get_field(page, PAGE_DIRECTION)
- != PAGE_LEFT)) {
-
- page_header_set_field(page, page_zip, PAGE_DIRECTION,
- PAGE_RIGHT);
- page_header_set_field(page, page_zip, PAGE_N_DIRECTION,
- page_header_get_field(
- page, PAGE_N_DIRECTION) + 1);
-
- } else if ((page_rec_get_next(insert_rec) == last_insert)
- && (page_header_get_field(page, PAGE_DIRECTION)
- != PAGE_RIGHT)) {
-
- page_header_set_field(page, page_zip, PAGE_DIRECTION,
- PAGE_LEFT);
- page_header_set_field(page, page_zip, PAGE_N_DIRECTION,
- page_header_get_field(
- page, PAGE_N_DIRECTION) + 1);
+no_direction:
+ page_direction_reset(ptr, page, page_zip);
+ } else if (last_insert == cursor->rec
+ && page_ptr_get_direction(ptr) != PAGE_LEFT) {
+ page_direction_increment(ptr, page, page_zip,
+ PAGE_RIGHT);
+ } else if (page_ptr_get_direction(ptr) != PAGE_RIGHT
+ && page_rec_get_next(insert_rec) == last_insert) {
+ page_direction_increment(ptr, page, page_zip,
+ PAGE_LEFT);
} else {
- page_header_set_field(page, page_zip, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(page, page_zip,
- PAGE_N_DIRECTION, 0);
+ goto no_direction;
}
}
@@ -1989,6 +1987,14 @@ page_parse_copy_rec_list_to_created_page(
return(rec_end);
}
+ /* This function is never invoked on the clustered index root page,
+ except in the redo log apply of
+ page_copy_rec_list_end_to_created_page(), which was logged by
+ page_copy_rec_list_to_created_page_write_log().
+ For other pages, this field must be zero-initialized. */
+ ut_ad(!page_get_instant(block->frame)
+ || (page_is_root(block->frame) && index->is_dummy));
+
while (ptr < rec_end) {
ptr = page_cur_parse_insert_rec(TRUE, ptr, end_ptr,
block, index, mtr);
@@ -2002,9 +2008,8 @@ page_parse_copy_rec_list_to_created_page(
page_header_set_ptr(page, page_zip, PAGE_LAST_INSERT, NULL);
if (!dict_index_is_spatial(index)) {
- page_header_set_field(page, page_zip, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(page, page_zip, PAGE_N_DIRECTION, 0);
+ page_direction_reset(PAGE_HEADER + PAGE_DIRECTION_B + page,
+ page, page_zip);
}
return(rec_end);
@@ -2044,6 +2049,9 @@ page_copy_rec_list_end_to_created_page(
ut_ad(page_dir_get_n_heap(new_page) == PAGE_HEAP_NO_USER_LOW);
ut_ad(page_align(rec) != new_page);
ut_ad(page_rec_is_comp(rec) == page_is_comp(new_page));
+ /* This function is never invoked on the clustered index root page,
+ except in btr_lift_page_up(). */
+ ut_ad(!page_get_instant(new_page) || page_is_root(new_page));
if (page_rec_is_infimum(rec)) {
@@ -2084,7 +2092,7 @@ page_copy_rec_list_end_to_created_page(
slot_index = 0;
n_recs = 0;
- ut_d(const bool is_leaf = page_is_leaf(new_page));
+ const bool is_leaf = page_is_leaf(new_page);
do {
offsets = rec_get_offsets(rec, index, offsets, is_leaf,
@@ -2129,7 +2137,7 @@ page_copy_rec_list_end_to_created_page(
heap_top += rec_size;
- rec_offs_make_valid(insert_rec, index, offsets);
+ rec_offs_make_valid(insert_rec, index, is_leaf, offsets);
page_cur_insert_rec_write_log(insert_rec, rec_size, prev_rec,
index, mtr);
prev_rec = insert_rec;
@@ -2157,6 +2165,10 @@ page_copy_rec_list_end_to_created_page(
mem_heap_free(heap);
}
+ /* Restore the log mode */
+
+ mtr_set_log_mode(mtr, log_mode);
+
log_data_len = mtr->get_log()->size() - log_data_len;
ut_a(log_data_len < 100 * UNIV_PAGE_SIZE);
@@ -2181,15 +2193,10 @@ page_copy_rec_list_end_to_created_page(
page_dir_set_n_heap(new_page, NULL, PAGE_HEAP_NO_USER_LOW + n_recs);
page_header_set_field(new_page, NULL, PAGE_N_RECS, n_recs);
- page_header_set_ptr(new_page, NULL, PAGE_LAST_INSERT, NULL);
-
- page_header_set_field(new_page, NULL, PAGE_DIRECTION,
- PAGE_NO_DIRECTION);
- page_header_set_field(new_page, NULL, PAGE_N_DIRECTION, 0);
-
- /* Restore the log mode */
-
- mtr_set_log_mode(mtr, log_mode);
+ *reinterpret_cast<uint16_t*>(PAGE_HEADER + PAGE_LAST_INSERT + new_page)
+ = 0;
+ page_direction_reset(PAGE_HEADER + PAGE_DIRECTION_B + new_page,
+ new_page, NULL);
}
/***********************************************************//**
diff --git a/storage/innobase/page/page0page.cc b/storage/innobase/page/page0page.cc
index fb528843da6..624e31685fe 100644
--- a/storage/innobase/page/page0page.cc
+++ b/storage/innobase/page/page0page.cc
@@ -379,7 +379,8 @@ page_create_low(
memset(page + PAGE_HEADER, 0, PAGE_HEADER_PRIV_END);
page[PAGE_HEADER + PAGE_N_DIR_SLOTS + 1] = 2;
- page[PAGE_HEADER + PAGE_DIRECTION + 1] = PAGE_NO_DIRECTION;
+ page[PAGE_HEADER + PAGE_INSTANT] = 0;
+ page[PAGE_HEADER + PAGE_DIRECTION_B] = PAGE_NO_DIRECTION;
if (comp) {
page[PAGE_HEADER + PAGE_N_HEAP] = 0x80;/*page_is_comp()*/
@@ -598,7 +599,7 @@ page_copy_rec_list_end_no_locks(
ut_a(page_is_comp(new_page) == page_rec_is_comp(rec));
ut_a(mach_read_from_2(new_page + UNIV_PAGE_SIZE - 10) == (ulint)
(page_is_comp(new_page) ? PAGE_NEW_INFIMUM : PAGE_OLD_INFIMUM));
- ut_d(const bool is_leaf = page_is_leaf(block->frame));
+ const bool is_leaf = page_is_leaf(block->frame);
cur2 = page_get_infimum_rec(buf_block_get_frame(new_block));
@@ -768,9 +769,10 @@ page_copy_rec_list_end(
/* Update the lock table and possible hash index */
- if (dict_index_is_spatial(index) && rec_move) {
+ if (dict_table_is_locking_disabled(index->table)) {
+ } else if (rec_move && dict_index_is_spatial(index)) {
lock_rtr_move_rec_list(new_block, block, rec_move, num_moved);
- } else if (!dict_table_is_locking_disabled(index->table)) {
+ } else {
lock_move_rec_list_end(new_block, block, rec);
}
@@ -778,7 +780,7 @@ page_copy_rec_list_end(
mem_heap_free(heap);
}
- btr_search_move_or_delete_hash_entries(new_block, block, index);
+ btr_search_move_or_delete_hash_entries(new_block, block);
return(ret);
}
@@ -928,9 +930,10 @@ zip_reorganize:
/* Update the lock table and possible hash index */
- if (dict_index_is_spatial(index)) {
+ if (dict_table_is_locking_disabled(index->table)) {
+ } else if (dict_index_is_spatial(index)) {
lock_rtr_move_rec_list(new_block, block, rec_move, num_moved);
- } else if (!dict_table_is_locking_disabled(index->table)) {
+ } else {
lock_move_rec_list_start(new_block, block, rec, ret);
}
@@ -938,7 +941,7 @@ zip_reorganize:
mem_heap_free(heap);
}
- btr_search_move_or_delete_hash_entries(new_block, block, index);
+ btr_search_move_or_delete_hash_entries(new_block, block);
return(ret);
}
@@ -1106,7 +1109,7 @@ delete_all:
? MLOG_COMP_LIST_END_DELETE
: MLOG_LIST_END_DELETE, mtr);
- ut_d(const bool is_leaf = page_is_leaf(page));
+ const bool is_leaf = page_is_leaf(page);
if (page_zip) {
mtr_log_t log_mode;
@@ -1297,7 +1300,7 @@ page_delete_rec_list_start(
/* Individual deletes are not logged */
mtr_log_t log_mode = mtr_set_log_mode(mtr, MTR_LOG_NONE);
- ut_d(const bool is_leaf = page_rec_is_leaf(rec));
+ const bool is_leaf = page_rec_is_leaf(rec);
while (page_cur_get_rec(&cur1) != rec) {
offsets = rec_get_offsets(page_cur_get_rec(&cur1), index,
@@ -1875,20 +1878,20 @@ page_header_print(
fprintf(stderr,
"--------------------------------\n"
"PAGE HEADER INFO\n"
- "Page address %p, n records %lu (%s)\n"
- "n dir slots %lu, heap top %lu\n"
- "Page n heap %lu, free %lu, garbage %lu\n"
- "Page last insert %lu, direction %lu, n direction %lu\n",
- page, (ulong) page_header_get_field(page, PAGE_N_RECS),
+ "Page address %p, n records %u (%s)\n"
+ "n dir slots %u, heap top %u\n"
+ "Page n heap %u, free %u, garbage %u\n"
+ "Page last insert %u, direction %u, n direction %u\n",
+ page, page_header_get_field(page, PAGE_N_RECS),
page_is_comp(page) ? "compact format" : "original format",
- (ulong) page_header_get_field(page, PAGE_N_DIR_SLOTS),
- (ulong) page_header_get_field(page, PAGE_HEAP_TOP),
- (ulong) page_dir_get_n_heap(page),
- (ulong) page_header_get_field(page, PAGE_FREE),
- (ulong) page_header_get_field(page, PAGE_GARBAGE),
- (ulong) page_header_get_field(page, PAGE_LAST_INSERT),
- (ulong) page_header_get_field(page, PAGE_DIRECTION),
- (ulong) page_header_get_field(page, PAGE_N_DIRECTION));
+ page_header_get_field(page, PAGE_N_DIR_SLOTS),
+ page_header_get_field(page, PAGE_HEAP_TOP),
+ page_dir_get_n_heap(page),
+ page_header_get_field(page, PAGE_FREE),
+ page_header_get_field(page, PAGE_GARBAGE),
+ page_header_get_field(page, PAGE_LAST_INSERT),
+ page_get_direction(page),
+ page_header_get_field(page, PAGE_N_DIRECTION));
}
/***************************************************************//**
@@ -2806,19 +2809,26 @@ page_find_rec_max_not_deleted(
const rec_t* rec = page_get_infimum_rec(page);
const rec_t* prev_rec = NULL; // remove warning
- /* Because the page infimum is never delete-marked,
+ /* Because the page infimum is never delete-marked
+ and never the 'default row' pseudo-record (MIN_REC_FLAG),
prev_rec will always be assigned to it first. */
- ut_ad(!rec_get_deleted_flag(rec, page_rec_is_comp(rec)));
+ ut_ad(!rec_get_info_bits(rec, page_rec_is_comp(rec)));
+ ut_ad(page_is_leaf(page));
+
if (page_is_comp(page)) {
do {
- if (!rec_get_deleted_flag(rec, true)) {
+ if (!(rec[-REC_NEW_INFO_BITS]
+ & (REC_INFO_DELETED_FLAG
+ | REC_INFO_MIN_REC_FLAG))) {
prev_rec = rec;
}
rec = page_rec_get_next_low(rec, true);
} while (rec != page + PAGE_NEW_SUPREMUM);
} else {
do {
- if (!rec_get_deleted_flag(rec, false)) {
+ if (!(rec[-REC_OLD_INFO_BITS]
+ & (REC_INFO_DELETED_FLAG
+ | REC_INFO_MIN_REC_FLAG))) {
prev_rec = rec;
}
rec = page_rec_get_next_low(rec, false);
diff --git a/storage/innobase/page/page0zip.cc b/storage/innobase/page/page0zip.cc
index 10d905e0c8b..9e3119bfd3a 100644
--- a/storage/innobase/page/page0zip.cc
+++ b/storage/innobase/page/page0zip.cc
@@ -1775,6 +1775,10 @@ page_zip_fields_decode(
}
}
+ /* ROW_FORMAT=COMPRESSED does not support instant ADD COLUMN */
+ index->n_core_fields = index->n_fields;
+ index->n_core_null_bytes = UT_BITS_IN_BYTES(index->n_nullable);
+
ut_ad(b == end);
if (is_spatial) {
@@ -2164,13 +2168,11 @@ page_zip_apply_log(
continue;
}
-#if REC_STATUS_NODE_PTR != TRUE
-# error "REC_STATUS_NODE_PTR != TRUE"
-#endif
+ compile_time_assert(REC_STATUS_NODE_PTR == TRUE);
rec_get_offsets_reverse(data, index,
hs & REC_STATUS_NODE_PTR,
offsets);
- rec_offs_make_valid(rec, index, offsets);
+ rec_offs_make_valid(rec, index, is_leaf, offsets);
/* Copy the extra bytes (backwards). */
{
diff --git a/storage/innobase/pars/pars0pars.cc b/storage/innobase/pars/pars0pars.cc
index 56ca037f247..d0696ab5cfc 100644
--- a/storage/innobase/pars/pars0pars.cc
+++ b/storage/innobase/pars/pars0pars.cc
@@ -1916,12 +1916,13 @@ pars_create_table(
table = dict_mem_table_create(
table_sym->name, 0, n_cols, 0, flags, flags2);
+ mem_heap_t* heap = pars_sym_tab_global->heap;
column = column_defs;
while (column) {
dtype = dfield_get_type(que_node_get_val(column));
- dict_mem_table_add_col(table, table->heap,
+ dict_mem_table_add_col(table, heap,
column->name, dtype->mtype,
dtype->prtype, dtype->len);
column->resolved = TRUE;
@@ -1930,8 +1931,10 @@ pars_create_table(
column = static_cast<sym_node_t*>(que_node_get_next(column));
}
- node = tab_create_graph_create(table, pars_sym_tab_global->heap,
- FIL_ENCRYPTION_DEFAULT, FIL_DEFAULT_ENCRYPTION_KEY);
+ dict_table_add_system_columns(table, heap);
+ node = tab_create_graph_create(table, heap,
+ FIL_ENCRYPTION_DEFAULT,
+ FIL_DEFAULT_ENCRYPTION_KEY);
table_sym->resolved = TRUE;
table_sym->token_type = SYM_TABLE;
diff --git a/storage/innobase/rem/rem0cmp.cc b/storage/innobase/rem/rem0cmp.cc
index 0e2bc9b30de..bfb9e95a5f8 100644
--- a/storage/innobase/rem/rem0cmp.cc
+++ b/storage/innobase/rem/rem0cmp.cc
@@ -409,6 +409,9 @@ cmp_data(
const byte* data2,
ulint len2)
{
+ ut_ad(len1 != UNIV_SQL_DEFAULT);
+ ut_ad(len2 != UNIV_SQL_DEFAULT);
+
if (len1 == UNIV_SQL_NULL || len2 == UNIV_SQL_NULL) {
if (len1 == len2) {
return(0);
@@ -708,6 +711,11 @@ cmp_dtuple_rec_with_match_low(
contain externally stored fields, and the first fields
(primary key fields) should already differ. */
ut_ad(!rec_offs_nth_extern(offsets, cur_field));
+ /* We should never compare against instantly added columns.
+ Columns can only be instantly added to clustered index
+ leaf page records, and the first fields (primary key fields)
+ should already differ. */
+ ut_ad(!rec_offs_nth_default(offsets, cur_field));
rec_b_ptr = rec_get_nth_field(rec, offsets, cur_field,
&rec_f_len);
@@ -823,6 +831,8 @@ cmp_dtuple_rec_with_match_bytes(
dtuple_b_ptr = static_cast<const byte*>(
dfield_get_data(dfield));
+
+ ut_ad(!rec_offs_nth_default(offsets, cur_field));
rec_b_ptr = rec_get_nth_field(rec, offsets,
cur_field, &rec_f_len);
ut_ad(!rec_offs_nth_extern(offsets, cur_field));
@@ -1144,10 +1154,9 @@ cmp_rec_rec_with_match(
/* Test if rec is the predefined minimum record */
if (UNIV_UNLIKELY(rec_get_info_bits(rec1, comp)
& REC_INFO_MIN_REC_FLAG)) {
- /* There should only be one such record. */
- ut_ad(!(rec_get_info_bits(rec2, comp)
- & REC_INFO_MIN_REC_FLAG));
- ret = -1;
+ ret = UNIV_UNLIKELY(rec_get_info_bits(rec2, comp)
+ & REC_INFO_MIN_REC_FLAG)
+ ? 0 : -1;
goto order_resolved;
} else if (UNIV_UNLIKELY
(rec_get_info_bits(rec2, comp)
@@ -1197,6 +1206,8 @@ cmp_rec_rec_with_match(
DB_ROLL_PTR, and any externally stored columns. */
ut_ad(!rec_offs_nth_extern(offsets1, cur_field));
ut_ad(!rec_offs_nth_extern(offsets2, cur_field));
+ ut_ad(!rec_offs_nth_default(offsets1, cur_field));
+ ut_ad(!rec_offs_nth_default(offsets2, cur_field));
rec1_b_ptr = rec_get_nth_field(rec1, offsets1,
cur_field, &rec1_f_len);
diff --git a/storage/innobase/rem/rem0rec.cc b/storage/innobase/rem/rem0rec.cc
index c26614d5eae..9e3d3e4be9d 100644
--- a/storage/innobase/rem/rem0rec.cc
+++ b/storage/innobase/rem/rem0rec.cc
@@ -172,7 +172,10 @@ rec_get_n_extern_new(
ulint i;
ut_ad(dict_table_is_comp(index->table));
- ut_ad(rec_get_status(rec) == REC_STATUS_ORDINARY);
+ ut_ad(!index->table->supports_instant() || index->is_dummy);
+ ut_ad(!index->is_instant());
+ ut_ad(rec_get_status(rec) == REC_STATUS_ORDINARY
+ || rec_get_status(rec) == REC_STATUS_COLUMNS_ADDED);
ut_ad(n == ULINT_UNDEFINED || n <= dict_index_get_n_fields(index));
if (n == ULINT_UNDEFINED) {
@@ -234,50 +237,140 @@ rec_get_n_extern_new(
return(n_extern);
}
-/******************************************************//**
-Determine the offset to each field in a leaf-page record
-in ROW_FORMAT=COMPACT. This is a special case of
-rec_init_offsets() and rec_get_offsets_func(). */
-UNIV_INLINE MY_ATTRIBUTE((nonnull))
+/** Get the length of the added field count in a REC_STATUS_COLUMNS_ADDED record.
+@param[in] n_add_field number of added fields, minus one
+@return storage size of the field count, in bytes */
+static inline unsigned rec_get_n_add_field_len(unsigned n_add_field)
+{
+ ut_ad(n_add_field < REC_MAX_N_FIELDS);
+ return n_add_field < 0x80 ? 1 : 2;
+}
+
+/** Get the added field count in a REC_STATUS_COLUMNS_ADDED record.
+@param[in,out] header variable header of a REC_STATUS_COLUMNS_ADDED record
+@return number of added fields */
+static inline unsigned rec_get_n_add_field(const byte*& header)
+{
+ unsigned n_fields_add = *--header;
+ if (n_fields_add < 0x80) {
+ ut_ad(rec_get_n_add_field_len(n_fields_add) == 1);
+ return n_fields_add;
+ }
+
+ n_fields_add &= 0x7f;
+ n_fields_add |= unsigned(*--header) << 7;
+ ut_ad(n_fields_add < REC_MAX_N_FIELDS);
+ ut_ad(rec_get_n_add_field_len(n_fields_add) == 2);
+ return n_fields_add;
+}
+
+/** Set the added field count in a REC_STATUS_COLUMNS_ADDED record.
+@param[in,out] header variable header of a REC_STATUS_COLUMNS_ADDED record
+@param[in] n_add number of added fields, minus 1
+The header pointer will be moved to before the encoded field count. */
+static inline void rec_set_n_add_field(byte*& header, unsigned n_add)
+{
+ ut_ad(n_add < REC_MAX_N_FIELDS);
+
+ if (n_add < 0x80) {
+ *header-- = byte(n_add);
+ } else {
+ *header-- = byte(n_add) | 0x80;
+ *header-- = byte(n_add >> 7);
+ }
+}
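The variable-length encoding implemented by rec_get_n_add_field() and rec_set_n_add_field() can be demonstrated in isolation. This sketch uses forward byte order for clarity (the real code writes the bytes backwards from the record header, decrementing the header pointer), and the helper names are illustrative, not InnoDB symbols: a count below 0x80 takes one byte; otherwise the first byte carries the low 7 bits with the 0x80 flag set, and a second byte carries the remaining bits.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode the added-field count (n_add = number of added fields, minus 1)
// into 1 or 2 bytes. Illustrative forward-order layout.
inline std::vector<uint8_t> encode_n_add(unsigned n_add)
{
	if (n_add < 0x80) {
		return { uint8_t(n_add) };
	}
	// low 7 bits with continuation flag, then the high bits
	return { uint8_t(n_add | 0x80), uint8_t(n_add >> 7) };
}

// Decode the count written by encode_n_add().
inline unsigned decode_n_add(const std::vector<uint8_t>& b)
{
	unsigned n = b[0];
	if (n < 0x80) {
		return n;
	}
	return (n & 0x7f) | (unsigned(b[1]) << 7);
}
```

This matches rec_get_n_add_field_len(): one byte for 0..0x7f, two bytes beyond that, which comfortably covers REC_MAX_N_FIELDS.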
+
+/** Format of a leaf-page ROW_FORMAT!=REDUNDANT record */
+enum rec_leaf_format {
+ /** Temporary file record */
+ REC_LEAF_TEMP,
+ /** Temporary file record, with added columns
+ (REC_STATUS_COLUMNS_ADDED) */
+ REC_LEAF_TEMP_COLUMNS_ADDED,
+ /** Normal (REC_STATUS_ORDINARY) */
+ REC_LEAF_ORDINARY,
+ /** With added columns (REC_STATUS_COLUMNS_ADDED) */
+ REC_LEAF_COLUMNS_ADDED
+};
+
+/** Determine the offset to each field in a leaf-page record
+in ROW_FORMAT=COMPACT,DYNAMIC,COMPRESSED.
+This is a special case of rec_init_offsets() and rec_get_offsets_func().
+@param[in] rec leaf-page record
+@param[in] index the index that the record belongs in
+@param[in,out] offsets offsets, with valid rec_offs_n_fields(offsets)
+@param[in] format record format */
+static inline
void
rec_init_offsets_comp_ordinary(
-/*===========================*/
- const rec_t* rec, /*!< in: physical record in
- ROW_FORMAT=COMPACT */
- bool temp, /*!< in: whether to use the
- format for temporary files in
- index creation */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in/out: array of offsets;
- in: n=rec_offs_n_fields(offsets) */
+ const rec_t* rec,
+ const dict_index_t* index,
+ ulint* offsets,
+ rec_leaf_format format)
{
- ulint i = 0;
ulint offs = 0;
- ulint any_ext = 0;
- ulint n_null = index->n_nullable;
- const byte* nulls = temp
- ? rec - 1
- : rec - (1 + REC_N_NEW_EXTRA_BYTES);
- const byte* lens = nulls - UT_BITS_IN_BYTES(n_null);
+ ulint any = 0;
+ const byte* nulls = rec;
+ const byte* lens = NULL;
+ ulint n_fields = index->n_core_fields;
ulint null_mask = 1;
+ ut_ad(index->n_core_fields > 0);
+ ut_ad(index->n_fields >= index->n_core_fields);
+ ut_ad(index->n_core_null_bytes <= UT_BITS_IN_BYTES(index->n_nullable));
+ ut_ad(format == REC_LEAF_TEMP || format == REC_LEAF_TEMP_COLUMNS_ADDED
+ || dict_table_is_comp(index->table));
+ ut_ad(format != REC_LEAF_TEMP_COLUMNS_ADDED
+ || index->n_fields == rec_offs_n_fields(offsets));
+ ut_d(ulint n_null);
+
+ switch (format) {
+ case REC_LEAF_TEMP:
+ if (dict_table_is_comp(index->table)) {
+ /* No need to do adjust fixed_len=0. We only need to
+ adjust it for ROW_FORMAT=REDUNDANT. */
+ format = REC_LEAF_ORDINARY;
+ }
+ goto ordinary;
+ case REC_LEAF_ORDINARY:
+ nulls -= REC_N_NEW_EXTRA_BYTES;
+ordinary:
+ lens = --nulls - index->n_core_null_bytes;
+
+ ut_d(n_null = std::min(index->n_core_null_bytes * 8U,
+ index->n_nullable));
+ break;
+ case REC_LEAF_COLUMNS_ADDED:
+ /* We would have !index->is_instant() when rolling back
+ an instant ADD COLUMN operation. */
+ nulls -= REC_N_NEW_EXTRA_BYTES;
+ if (rec_offs_n_fields(offsets) <= n_fields) {
+ goto ordinary;
+ }
+ /* fall through */
+ case REC_LEAF_TEMP_COLUMNS_ADDED:
+ ut_ad(index->is_instant());
+ n_fields = index->n_core_fields + 1
+ + rec_get_n_add_field(nulls);
+ ut_ad(n_fields <= index->n_fields);
+ const ulint n_nullable = index->get_n_nullable(n_fields);
+ const ulint n_null_bytes = UT_BITS_IN_BYTES(n_nullable);
+ ut_d(n_null = n_nullable);
+ ut_ad(n_null <= index->n_nullable);
+ ut_ad(n_null_bytes >= index->n_core_null_bytes);
+ lens = --nulls - n_null_bytes;
+ }
+
#ifdef UNIV_DEBUG
- /* We cannot invoke rec_offs_make_valid() here if temp=true.
+ /* We cannot invoke rec_offs_make_valid() if format==REC_LEAF_TEMP.
Similarly, rec_offs_validate() will fail in that case, because
it invokes rec_get_status(). */
offsets[2] = (ulint) rec;
offsets[3] = (ulint) index;
#endif /* UNIV_DEBUG */
- ut_ad(temp || dict_table_is_comp(index->table));
-
- if (temp && dict_table_is_comp(index->table)) {
- /* No need to do adjust fixed_len=0. We only need to
- adjust it for ROW_FORMAT=REDUNDANT. */
- temp = false;
- }
-
- /* read the lengths of fields 0..n */
+ /* read the lengths of fields 0..n_fields */
+ ulint i = 0;
do {
const dict_field_t* field
= dict_index_get_nth_field(index, i);
@@ -285,6 +378,20 @@ rec_init_offsets_comp_ordinary(
= dict_field_get_col(field);
ulint len;
+ /* set default value flag */
+ if (i >= n_fields) {
+ ulint dlen;
+ if (!index->instant_field_value(i, &dlen)) {
+ len = offs | REC_OFFS_SQL_NULL;
+ ut_ad(dlen == UNIV_SQL_NULL);
+ } else {
+ len = offs | REC_OFFS_DEFAULT;
+ any |= REC_OFFS_DEFAULT;
+ }
+
+ goto resolved;
+ }
+
if (!(col->prtype & DATA_NOT_NULL)) {
/* nullable field => read the null flag */
ut_ad(n_null--);
@@ -307,7 +414,8 @@ rec_init_offsets_comp_ordinary(
}
if (!field->fixed_len
- || (temp && !dict_col_get_fixed_size(col, temp))) {
+ || (format == REC_LEAF_TEMP
+ && !dict_col_get_fixed_size(col, true))) {
/* Variable-length field: read the length */
len = *lens--;
/* If the maximum length of the field is up
@@ -317,26 +425,21 @@ rec_init_offsets_comp_ordinary(
stored in one byte for 0..127. The length
will be encoded in two bytes when it is 128 or
more, or when the field is stored externally. */
- if (DATA_BIG_COL(col)) {
- if (len & 0x80) {
- /* 1exxxxxxx xxxxxxxx */
- len <<= 8;
- len |= *lens--;
-
- offs += len & 0x3fff;
- if (UNIV_UNLIKELY(len
- & 0x4000)) {
- ut_ad(dict_index_is_clust
- (index));
- any_ext = REC_OFFS_EXTERNAL;
- len = offs
- | REC_OFFS_EXTERNAL;
- } else {
- len = offs;
- }
-
- goto resolved;
+ if ((len & 0x80) && DATA_BIG_COL(col)) {
+ /* 1exxxxxxx xxxxxxxx */
+ len <<= 8;
+ len |= *lens--;
+
+ offs += len & 0x3fff;
+ if (UNIV_UNLIKELY(len & 0x4000)) {
+ ut_ad(dict_index_is_clust(index));
+ any |= REC_OFFS_EXTERNAL;
+ len = offs | REC_OFFS_EXTERNAL;
+ } else {
+ len = offs;
}
+
+ goto resolved;
}
len = offs += len;
@@ -348,12 +451,115 @@ resolved:
} while (++i < rec_offs_n_fields(offsets));
*rec_offs_base(offsets)
- = (rec - (lens + 1)) | REC_OFFS_COMPACT | any_ext;
+ = (rec - (lens + 1)) | REC_OFFS_COMPACT | any;
}
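The length-byte decoding inside rec_init_offsets_comp_ordinary() follows a fixed rule that can be shown standalone. This sketch is simplified (forward byte order, while InnoDB reads the length bytes backwards; FieldLen and decode_field_len are illustrative names): a length of 0..127 takes one byte; for columns that can exceed 255 bytes, a first byte with the 0x80 bit set introduces the two-byte form '1exxxxxxx xxxxxxxx', where the 0x4000 bit ('e') marks an externally stored field and the low 14 bits give the in-record length.

```cpp
#include <cassert>
#include <cstdint>

struct FieldLen {
	unsigned	len;		// in-record length of the field
	bool		is_external;	// externally stored (BLOB) column
	unsigned	bytes_used;	// length bytes consumed: 1 or 2
};

// Decode one ROW_FORMAT=COMPACT variable-length field length.
// big_col: whether the column can exceed 255 bytes (DATA_BIG_COL).
inline FieldLen decode_field_len(const uint8_t* lens, bool big_col)
{
	unsigned len = lens[0];
	if ((len & 0x80) && big_col) {
		/* 1exxxxxxx xxxxxxxx */
		len = (len << 8) | lens[1];
		return { len & 0x3fff, (len & 0x4000) != 0, 2 };
	}
	return { len, false, 1 };
}
```

Fields past n_fields never reach this decoder: as the hunk above shows, they are resolved from the index metadata instead and tagged with REC_OFFS_DEFAULT (or REC_OFFS_SQL_NULL), which is how instantly added columns avoid occupying any bytes in old records.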
-/******************************************************//**
-The following function determines the offsets to each field in the
-record. The offsets are written to a previously allocated array of
+#ifdef UNIV_DEBUG
+/** Update debug data in offsets, in order to tame rec_offs_validate().
+@param[in] rec record
+@param[in] index the index that the record belongs in
+@param[in] leaf whether the record resides in a leaf page
+@param[in,out] offsets offsets from rec_get_offsets() to adjust */
+void
+rec_offs_make_valid(
+ const rec_t* rec,
+ const dict_index_t* index,
+ bool leaf,
+ ulint* offsets)
+{
+ ut_ad(rec_offs_n_fields(offsets)
+ <= (leaf
+ ? dict_index_get_n_fields(index)
+ : dict_index_get_n_unique_in_tree_nonleaf(index) + 1)
+ || index->is_dummy || dict_index_is_ibuf(index));
+ const bool is_user_rec = (dict_table_is_comp(index->table)
+ ? rec_get_heap_no_new(rec)
+ : rec_get_heap_no_old(rec))
+ >= PAGE_HEAP_NO_USER_LOW;
+ ulint n = rec_get_n_fields(rec, index);
+ /* The infimum and supremum records carry 1 field. */
+ ut_ad(is_user_rec || n == 1);
+ ut_ad(is_user_rec || rec_offs_n_fields(offsets) == 1);
+ ut_ad(!is_user_rec || n >= index->n_core_fields
+ || n >= rec_offs_n_fields(offsets));
+ for (; n < rec_offs_n_fields(offsets); n++) {
+ ut_ad(leaf);
+ ut_ad(rec_offs_base(offsets)[1 + n] & REC_OFFS_DEFAULT);
+ }
+ offsets[2] = ulint(rec);
+ offsets[3] = ulint(index);
+}
+
+/** Validate offsets returned by rec_get_offsets().
+@param[in] rec record, or NULL
+@param[in] index the index that the record belongs in, or NULL
+@param[in,out] offsets the offsets of the record
+@return true */
+bool
+rec_offs_validate(
+ const rec_t* rec,
+ const dict_index_t* index,
+ const ulint* offsets)
+{
+ ulint i = rec_offs_n_fields(offsets);
+ ulint last = ULINT_MAX;
+ ulint comp = *rec_offs_base(offsets) & REC_OFFS_COMPACT;
+
+ if (rec) {
+ ut_ad(ulint(rec) == offsets[2]);
+ if (!comp) {
+ const bool is_user_rec = rec_get_heap_no_old(rec)
+ >= PAGE_HEAP_NO_USER_LOW;
+ ulint n = rec_get_n_fields_old(rec);
+ /* The infimum and supremum records carry 1 field. */
+ ut_ad(is_user_rec || n == 1);
+ ut_ad(is_user_rec || i == 1);
+ ut_ad(!is_user_rec || n >= i || !index
+ || n >= index->n_core_fields);
+ for (; n < i; n++) {
+ ut_ad(rec_offs_base(offsets)[1 + n]
+ & REC_OFFS_DEFAULT);
+ }
+ }
+ }
+ if (index) {
+ ulint max_n_fields;
+ ut_ad(ulint(index) == offsets[3]);
+ max_n_fields = ut_max(
+ dict_index_get_n_fields(index),
+ dict_index_get_n_unique_in_tree(index) + 1);
+ if (comp && rec) {
+ switch (rec_get_status(rec)) {
+ case REC_STATUS_COLUMNS_ADDED:
+ case REC_STATUS_ORDINARY:
+ break;
+ case REC_STATUS_NODE_PTR:
+ max_n_fields = dict_index_get_n_unique_in_tree(
+ index) + 1;
+ break;
+ case REC_STATUS_INFIMUM:
+ case REC_STATUS_SUPREMUM:
+ max_n_fields = 1;
+ break;
+ default:
+ ut_error;
+ }
+ }
+ /* index->n_def == 0 for dummy indexes if !comp */
+ ut_a(!comp || index->n_def);
+ ut_a(!index->n_def || i <= max_n_fields);
+ }
+ while (i--) {
+ ulint curr = rec_offs_base(offsets)[1 + i] & REC_OFFS_MASK;
+ ut_a(curr <= last);
+ last = curr;
+ }
+ return(TRUE);
+}
+#endif /* UNIV_DEBUG */
+
+/** Determine the offsets to each field in the record.
+ The offsets are written to a previously allocated array of
ulint, where rec_offs_n_fields(offsets) has been initialized to the
number of fields in the record. The rest of the array will be
initialized by this function. rec_offs_base(offsets)[0] will be set
@@ -364,20 +570,25 @@ offsets past the end of fields 0..n_fields, or to the beginning of
fields 1..n_fields+1. When the high-order bit of the offset at [i+1]
is set (REC_OFFS_SQL_NULL), the field i is NULL. When the second
high-order bit of the offset at [i+1] is set (REC_OFFS_EXTERNAL), the
-field i is being stored externally. */
+field i is being stored externally.
+@param[in] rec record
+@param[in] index the index that the record belongs in
+@param[in] leaf whether the record resides in a leaf page
+@param[in,out] offsets array of offsets, with valid rec_offs_n_fields() */
static
void
rec_init_offsets(
-/*=============*/
- const rec_t* rec, /*!< in: physical record */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in/out: array of offsets;
- in: n=rec_offs_n_fields(offsets) */
+ const rec_t* rec,
+ const dict_index_t* index,
+ bool leaf,
+ ulint* offsets)
{
ulint i = 0;
ulint offs;
- rec_offs_make_valid(rec, index, offsets);
+ ut_ad(index->n_core_null_bytes <= UT_BITS_IN_BYTES(index->n_nullable));
+ ut_d(offsets[2] = ulint(rec));
+ ut_d(offsets[3] = ulint(index));
if (dict_table_is_comp(index->table)) {
const byte* nulls;
@@ -396,18 +607,35 @@ rec_init_offsets(
rec_offs_base(offsets)[1] = 8;
return;
case REC_STATUS_NODE_PTR:
+ ut_ad(!leaf);
n_node_ptr_field
= dict_index_get_n_unique_in_tree_nonleaf(
index);
break;
+ case REC_STATUS_COLUMNS_ADDED:
+ ut_ad(leaf);
+ rec_init_offsets_comp_ordinary(rec, index, offsets,
+ REC_LEAF_COLUMNS_ADDED);
+ return;
case REC_STATUS_ORDINARY:
- rec_init_offsets_comp_ordinary(
- rec, false, index, offsets);
+ ut_ad(leaf);
+ rec_init_offsets_comp_ordinary(rec, index, offsets,
+ REC_LEAF_ORDINARY);
return;
}
+ /* The n_nullable flags in the clustered index node pointer
+ records in ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC must
+ reflect the number of 'core columns'. These flags are
+ useless garbage, and they are only reserved because of
+ file format compatibility.
+ (Clustered index node pointer records only contain the
+ PRIMARY KEY columns, which are always NOT NULL,
+ so we should have used n_nullable=0.) */
+ ut_ad(index->n_core_fields > 0);
+
nulls = rec - (REC_N_NEW_EXTRA_BYTES + 1);
- lens = nulls - UT_BITS_IN_BYTES(index->n_nullable);
+ lens = nulls - index->n_core_null_bytes;
offs = 0;
null_mask = 1;
@@ -487,9 +715,13 @@ resolved:
} else {
/* Old-style record: determine extra size and end offsets */
offs = REC_N_OLD_EXTRA_BYTES;
+ const ulint n_fields = rec_get_n_fields_old(rec);
+ const ulint n = std::min(n_fields, rec_offs_n_fields(offsets));
+ ulint any;
+
if (rec_get_1byte_offs_flag(rec)) {
- offs += rec_offs_n_fields(offsets);
- *rec_offs_base(offsets) = offs;
+ offs += n_fields;
+ any = offs;
/* Determine offsets to fields */
do {
offs = rec_1_get_field_end_info(rec, i);
@@ -498,10 +730,10 @@ resolved:
offs |= REC_OFFS_SQL_NULL;
}
rec_offs_base(offsets)[1 + i] = offs;
- } while (++i < rec_offs_n_fields(offsets));
+ } while (++i < n);
} else {
- offs += 2 * rec_offs_n_fields(offsets);
- *rec_offs_base(offsets) = offs;
+ offs += 2 * n_fields;
+ any = offs;
/* Determine offsets to fields */
do {
offs = rec_2_get_field_end_info(rec, i);
@@ -512,11 +744,24 @@ resolved:
if (offs & REC_2BYTE_EXTERN_MASK) {
offs &= ~REC_2BYTE_EXTERN_MASK;
offs |= REC_OFFS_EXTERNAL;
- *rec_offs_base(offsets) |= REC_OFFS_EXTERNAL;
+ any |= REC_OFFS_EXTERNAL;
}
rec_offs_base(offsets)[1 + i] = offs;
+ } while (++i < n);
+ }
+
+ if (i < rec_offs_n_fields(offsets)) {
+ offs = (rec_offs_base(offsets)[i] & REC_OFFS_MASK)
+ | REC_OFFS_DEFAULT;
+
+ do {
+ rec_offs_base(offsets)[1 + i] = offs;
} while (++i < rec_offs_n_fields(offsets));
+
+ any |= REC_OFFS_DEFAULT;
}
+
+ *rec_offs_base(offsets) = any;
}
}
@@ -535,9 +780,7 @@ rec_get_offsets_func(
const rec_t* rec,
const dict_index_t* index,
ulint* offsets,
-#ifdef UNIV_DEBUG
bool leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
#ifdef UNIV_DEBUG
const char* file, /*!< in: file name where called */
@@ -555,6 +798,7 @@ rec_get_offsets_func(
if (dict_table_is_comp(index->table)) {
switch (UNIV_EXPECT(rec_get_status(rec),
REC_STATUS_ORDINARY)) {
+ case REC_STATUS_COLUMNS_ADDED:
case REC_STATUS_ORDINARY:
ut_ad(leaf);
n = dict_index_get_n_fields(index);
@@ -587,8 +831,8 @@ rec_get_offsets_func(
page_rec_is_user_rec(rec) and similar predicates
cannot be evaluated. We can still distinguish the
infimum and supremum record based on the heap number. */
- ut_d(const bool is_user_rec = rec_get_heap_no_old(rec)
- >= PAGE_HEAP_NO_USER_LOW);
+ const bool is_user_rec = rec_get_heap_no_old(rec)
+ >= PAGE_HEAP_NO_USER_LOW;
/* The infimum and supremum records carry 1 field. */
ut_ad(is_user_rec || n == 1);
ut_ad(!is_user_rec || leaf || index->is_dummy
@@ -598,9 +842,13 @@ rec_get_offsets_func(
ut_ad(!is_user_rec || !leaf || index->is_dummy
|| dict_index_is_ibuf(index)
|| n == n_fields /* btr_pcur_restore_position() */
- || n == index->n_fields
- || (index->id == DICT_INDEXES_ID
- && (n == DICT_NUM_FIELDS__SYS_INDEXES - 1)));
+ || (n >= index->n_core_fields && n <= index->n_fields));
+
+ if (is_user_rec && leaf && n < index->n_fields) {
+ ut_ad(!index->is_dummy);
+ ut_ad(!dict_index_is_ibuf(index));
+ n = index->n_fields;
+ }
}
if (UNIV_UNLIKELY(n_fields < n)) {
@@ -624,7 +872,7 @@ rec_get_offsets_func(
}
rec_offs_set_n_fields(offsets, n);
- rec_init_offsets(rec, index, offsets);
+ rec_init_offsets(rec, index, leaf, offsets);
return(offsets);
}
@@ -658,6 +906,7 @@ rec_get_offsets_reverse(
ut_ad(index);
ut_ad(offsets);
ut_ad(dict_table_is_comp(index->table));
+ ut_ad(!index->is_instant());
if (UNIV_UNLIKELY(node_ptr)) {
n_node_ptr_field =
@@ -805,7 +1054,8 @@ rec_get_nth_field_offs_old(
/**********************************************************//**
Determines the size of a data tuple prefix in ROW_FORMAT=COMPACT.
@return total size */
-UNIV_INLINE MY_ATTRIBUTE((warn_unused_result, nonnull(1,2)))
+MY_ATTRIBUTE((warn_unused_result, nonnull(1,2)))
+static inline
ulint
rec_get_converted_size_comp_prefix_low(
/*===================================*/
@@ -816,21 +1066,31 @@ rec_get_converted_size_comp_prefix_low(
const dfield_t* fields, /*!< in: array of data fields */
ulint n_fields,/*!< in: number of data fields */
ulint* extra, /*!< out: extra size */
+ rec_comp_status_t status, /*!< in: status flags */
bool temp) /*!< in: whether this is a
temporary file record */
{
- ulint extra_size;
+ ulint extra_size = temp ? 0 : REC_N_NEW_EXTRA_BYTES;
ulint data_size;
ulint i;
ut_ad(n_fields > 0);
ut_ad(n_fields <= dict_index_get_n_fields(index));
- ut_ad(!temp || extra);
-
ut_d(ulint n_null = index->n_nullable);
+ ut_ad(status == REC_STATUS_ORDINARY || status == REC_STATUS_NODE_PTR
+ || status == REC_STATUS_COLUMNS_ADDED);
+
+ if (status == REC_STATUS_COLUMNS_ADDED
+ && (!temp || n_fields > index->n_core_fields)) {
+ ut_ad(index->is_instant());
+ ut_ad(UT_BITS_IN_BYTES(n_null) >= index->n_core_null_bytes);
+ extra_size += UT_BITS_IN_BYTES(index->get_n_nullable(n_fields))
+ + rec_get_n_add_field_len(n_fields - 1
+ - index->n_core_fields);
+ } else {
+ ut_ad(n_fields <= index->n_core_fields);
+ extra_size += index->n_core_null_bytes;
+ }
- extra_size = temp
- ? UT_BITS_IN_BYTES(index->n_nullable)
- : REC_N_NEW_EXTRA_BYTES + UT_BITS_IN_BYTES(index->n_nullable);
data_size = 0;
if (temp && dict_table_is_comp(index->table)) {
@@ -991,7 +1251,8 @@ rec_get_converted_size_comp_prefix(
{
ut_ad(dict_table_is_comp(index->table));
return(rec_get_converted_size_comp_prefix_low(
- index, fields, n_fields, extra, false));
+ index, fields, n_fields, extra,
+ REC_STATUS_ORDINARY, false));
}
/**********************************************************//**
@@ -1004,40 +1265,41 @@ rec_get_converted_size_comp(
dict_table_is_comp() is
assumed to hold, even if
it does not */
- ulint status, /*!< in: status bits of the record */
+ rec_comp_status_t status, /*!< in: status bits of the record */
const dfield_t* fields, /*!< in: array of data fields */
ulint n_fields,/*!< in: number of data fields */
ulint* extra) /*!< out: extra size */
{
- ulint size;
ut_ad(n_fields > 0);
switch (UNIV_EXPECT(status, REC_STATUS_ORDINARY)) {
case REC_STATUS_ORDINARY:
- ut_ad(n_fields == dict_index_get_n_fields(index));
- size = 0;
- break;
+ if (n_fields > index->n_core_fields) {
+ ut_ad(index->is_instant());
+ status = REC_STATUS_COLUMNS_ADDED;
+ }
+ /* fall through */
+ case REC_STATUS_COLUMNS_ADDED:
+ ut_ad(n_fields >= index->n_core_fields);
+ ut_ad(n_fields <= index->n_fields);
+ return rec_get_converted_size_comp_prefix_low(
+ index, fields, n_fields, extra, status, false);
case REC_STATUS_NODE_PTR:
n_fields--;
ut_ad(n_fields == dict_index_get_n_unique_in_tree_nonleaf(
index));
ut_ad(dfield_get_len(&fields[n_fields]) == REC_NODE_PTR_SIZE);
- size = REC_NODE_PTR_SIZE; /* child page number */
- break;
+ return REC_NODE_PTR_SIZE /* child page number */
+ + rec_get_converted_size_comp_prefix_low(
+ index, fields, n_fields, extra, status, false);
case REC_STATUS_INFIMUM:
case REC_STATUS_SUPREMUM:
- /* infimum or supremum record, 8 data bytes */
- if (UNIV_LIKELY_NULL(extra)) {
- *extra = REC_N_NEW_EXTRA_BYTES;
- }
- return(REC_N_NEW_EXTRA_BYTES + 8);
- default:
- ut_error;
- return(ULINT_UNDEFINED);
+ /* not supported */
+ break;
}
- return(size + rec_get_converted_size_comp_prefix_low(
- index, fields, n_fields, extra, false));
+ ut_error;
+ return(ULINT_UNDEFINED);
}
/***********************************************************//**
@@ -1134,8 +1396,7 @@ rec_convert_dtuple_to_rec_old(
/* Set the info bits of the record */
rec_set_info_bits_old(rec, dtuple_get_info_bits(dtuple)
& REC_INFO_BITS_MASK);
- /* Make rec_get_offsets() and rec_offs_make_valid() happy. */
- ut_d(rec_set_heap_no_old(rec, PAGE_HEAP_NO_USER_LOW));
+ rec_set_heap_no_old(rec, PAGE_HEAP_NO_USER_LOW);
/* Store the data and the offsets */
@@ -1207,25 +1468,29 @@ rec_convert_dtuple_to_rec_old(
return(rec);
}
-/*********************************************************//**
-Builds a ROW_FORMAT=COMPACT record out of a data tuple. */
-UNIV_INLINE
+/** Convert a data tuple into a ROW_FORMAT=COMPACT record.
+@param[out] rec converted record
+@param[in] index index
+@param[in] fields data fields to convert
+@param[in] n_fields number of data fields
+@param[in] status rec_get_status(rec)
+@param[in] temp whether to use the format for temporary files
+ in index creation */
+static inline
void
rec_convert_dtuple_to_rec_comp(
-/*===========================*/
- rec_t* rec, /*!< in: origin of record */
- const dict_index_t* index, /*!< in: record descriptor */
- const dfield_t* fields, /*!< in: array of data fields */
- ulint n_fields,/*!< in: number of data fields */
- ulint status, /*!< in: status bits of the record */
- bool temp) /*!< in: whether to use the
- format for temporary files in
- index creation */
+ rec_t* rec,
+ const dict_index_t* index,
+ const dfield_t* fields,
+ ulint n_fields,
+ rec_comp_status_t status,
+ bool temp)
{
const dfield_t* field;
const dtype_t* type;
byte* end;
- byte* nulls;
+ byte* nulls = temp
+ ? rec - 1 : rec - (REC_N_NEW_EXTRA_BYTES + 1);
byte* lens;
ulint len;
ulint i;
@@ -1235,49 +1500,55 @@ rec_convert_dtuple_to_rec_comp(
ut_ad(n_fields > 0);
ut_ad(temp || dict_table_is_comp(index->table));
- ulint n_null = index->n_nullable;
- const ulint n_null_bytes = UT_BITS_IN_BYTES(n_null);
+ ut_ad(index->n_core_null_bytes <= UT_BITS_IN_BYTES(index->n_nullable));
- if (temp) {
- ut_ad(status == REC_STATUS_ORDINARY);
+ ut_d(ulint n_null = index->n_nullable);
+
+ switch (status) {
+ case REC_STATUS_COLUMNS_ADDED:
+ ut_ad(index->is_instant());
+ ut_ad(n_fields > index->n_core_fields);
+ rec_set_n_add_field(nulls, n_fields - 1
+ - index->n_core_fields);
+ /* fall through */
+ case REC_STATUS_ORDINARY:
ut_ad(n_fields <= dict_index_get_n_fields(index));
- n_node_ptr_field = ULINT_UNDEFINED;
- nulls = rec - 1;
- if (dict_table_is_comp(index->table)) {
+ if (!temp) {
+ rec_set_heap_no_new(rec, PAGE_HEAP_NO_USER_LOW);
+ rec_set_status(rec, n_fields == index->n_core_fields
+ ? REC_STATUS_ORDINARY
+ : REC_STATUS_COLUMNS_ADDED);
+ }
+ if (dict_table_is_comp(index->table)) {
/* No need to do adjust fixed_len=0. We only
need to adjust it for ROW_FORMAT=REDUNDANT. */
temp = false;
}
- } else {
- /* Make rec_get_offsets() and rec_offs_make_valid() happy. */
- ut_d(rec_set_heap_no_new(rec, PAGE_HEAP_NO_USER_LOW));
- nulls = rec - (REC_N_NEW_EXTRA_BYTES + 1);
- switch (UNIV_EXPECT(status, REC_STATUS_ORDINARY)) {
- case REC_STATUS_ORDINARY:
- ut_ad(n_fields <= dict_index_get_n_fields(index));
- n_node_ptr_field = ULINT_UNDEFINED;
- break;
- case REC_STATUS_NODE_PTR:
- ut_ad(n_fields
- == dict_index_get_n_unique_in_tree_nonleaf(index)
- + 1);
- n_node_ptr_field = n_fields - 1;
- break;
- case REC_STATUS_INFIMUM:
- case REC_STATUS_SUPREMUM:
- ut_ad(n_fields == 1);
- n_node_ptr_field = ULINT_UNDEFINED;
- break;
- default:
- ut_error;
- return;
- }
+ n_node_ptr_field = ULINT_UNDEFINED;
+ lens = nulls - (index->is_instant()
+ ? UT_BITS_IN_BYTES(index->get_n_nullable(
+ n_fields))
+ : UT_BITS_IN_BYTES(index->n_nullable));
+ break;
+ case REC_STATUS_NODE_PTR:
+ ut_ad(!temp);
+ rec_set_heap_no_new(rec, PAGE_HEAP_NO_USER_LOW);
+ rec_set_status(rec, status);
+ ut_ad(n_fields
+ == dict_index_get_n_unique_in_tree_nonleaf(index) + 1);
+ ut_d(n_null = std::min(index->n_core_null_bytes * 8U,
+ index->n_nullable));
+ n_node_ptr_field = n_fields - 1;
+ lens = nulls - index->n_core_null_bytes;
+ break;
+ case REC_STATUS_INFIMUM:
+ case REC_STATUS_SUPREMUM:
+ ut_error;
+ return;
}
end = rec;
/* clear the SQL-null flags */
- lens = nulls - n_null_bytes;
memset(lens + 1, 0, nulls - lens);
/* Store the data and the offsets */
@@ -1450,21 +1721,26 @@ rec_convert_dtuple_to_rec_new(
const dict_index_t* index, /*!< in: record descriptor */
const dtuple_t* dtuple) /*!< in: data tuple */
{
+ ut_ad(!(dtuple->info_bits
+ & ~(REC_NEW_STATUS_MASK | REC_INFO_DELETED_FLAG
+ | REC_INFO_MIN_REC_FLAG)));
+ rec_comp_status_t status = static_cast<rec_comp_status_t>(
+ dtuple->info_bits & REC_NEW_STATUS_MASK);
+ if (status == REC_STATUS_ORDINARY
+ && dtuple->n_fields > index->n_core_fields) {
+ ut_ad(index->is_instant());
+ status = REC_STATUS_COLUMNS_ADDED;
+ }
+
ulint extra_size;
- ulint status;
- rec_t* rec;
- status = dtuple_get_info_bits(dtuple) & REC_NEW_STATUS_MASK;
rec_get_converted_size_comp(
index, status, dtuple->fields, dtuple->n_fields, &extra_size);
- rec = buf + extra_size;
+ rec_t* rec = buf + extra_size;
rec_convert_dtuple_to_rec_comp(
rec, index, dtuple->fields, dtuple->n_fields, status, false);
-
- /* Set the info bits of the record */
- rec_set_info_and_status_bits(rec, dtuple_get_info_bits(dtuple));
-
+ rec_set_info_bits_new(rec, dtuple->info_bits & ~REC_NEW_STATUS_MASK);
return(rec);
}
@@ -1504,45 +1780,60 @@ rec_convert_dtuple_to_rec(
@param[in] fields data fields
@param[in] n_fields number of data fields
@param[out] extra record header size
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
@return total size, in bytes */
ulint
rec_get_converted_size_temp(
const dict_index_t* index,
const dfield_t* fields,
ulint n_fields,
- ulint* extra)
+ ulint* extra,
+ rec_comp_status_t status)
{
- return(rec_get_converted_size_comp_prefix_low(
- index, fields, n_fields, extra, true));
+ return rec_get_converted_size_comp_prefix_low(
+ index, fields, n_fields, extra, status, true);
}
-/******************************************************//**
-Determine the offset to each field in temporary file.
-@see rec_convert_dtuple_to_temp() */
+/** Determine the offset to each field in a temporary file record.
+@param[in] rec temporary file record
+@param[in] index index that the record belongs to
+@param[in,out] offsets offsets to the fields; in: rec_offs_n_fields(offsets)
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
+*/
void
rec_init_offsets_temp(
-/*==================*/
- const rec_t* rec, /*!< in: temporary file record */
- const dict_index_t* index, /*!< in: record descriptor */
- ulint* offsets)/*!< in/out: array of offsets;
- in: n=rec_offs_n_fields(offsets) */
+ const rec_t* rec,
+ const dict_index_t* index,
+ ulint* offsets,
+ rec_comp_status_t status)
{
- rec_init_offsets_comp_ordinary(rec, true, index, offsets);
+ ut_ad(status == REC_STATUS_ORDINARY
+ || status == REC_STATUS_COLUMNS_ADDED);
+ ut_ad(status == REC_STATUS_ORDINARY || index->is_instant());
+
+ rec_init_offsets_comp_ordinary(rec, index, offsets,
+ status == REC_STATUS_COLUMNS_ADDED
+ ? REC_LEAF_TEMP_COLUMNS_ADDED
+ : REC_LEAF_TEMP);
}
-/*********************************************************//**
-Builds a temporary file record out of a data tuple.
-@see rec_init_offsets_temp() */
+/** Convert a data tuple prefix to the temporary file format.
+@param[out] rec record in temporary file format
+@param[in] index clustered or secondary index
+@param[in] fields data fields
+@param[in] n_fields number of data fields
+@param[in] status REC_STATUS_ORDINARY or REC_STATUS_COLUMNS_ADDED
+*/
void
rec_convert_dtuple_to_temp(
-/*=======================*/
- rec_t* rec, /*!< out: record */
- const dict_index_t* index, /*!< in: record descriptor */
- const dfield_t* fields, /*!< in: array of data fields */
- ulint n_fields) /*!< in: number of fields */
+ rec_t* rec,
+ const dict_index_t* index,
+ const dfield_t* fields,
+ ulint n_fields,
+ rec_comp_status_t status)
{
rec_convert_dtuple_to_rec_comp(rec, index, fields, n_fields,
- REC_STATUS_ORDINARY, true);
+ status, true);
}
/** Copy the first n fields of a (copy of a) physical record to a data tuple.
@@ -1553,13 +1844,11 @@ The fields are copied into the memory heap.
@param[in] n_fields number of fields to copy
@param[in,out] heap memory heap */
void
-rec_copy_prefix_to_dtuple_func(
+rec_copy_prefix_to_dtuple(
dtuple_t* tuple,
const rec_t* rec,
const dict_index_t* index,
-#ifdef UNIV_DEBUG
bool is_leaf,
-#endif /* UNIV_DEBUG */
ulint n_fields,
mem_heap_t* heap)
{
@@ -1574,10 +1863,10 @@ rec_copy_prefix_to_dtuple_func(
n_fields, &heap);
ut_ad(rec_validate(rec, offsets));
+ ut_ad(!rec_offs_any_default(offsets));
ut_ad(dtuple_check_typed(tuple));
- dtuple_set_info_bits(tuple, rec_get_info_bits(
- rec, dict_table_is_comp(index->table)));
+ tuple->info_bits = rec_get_info_bits(rec, rec_offs_comp(offsets));
for (ulint i = 0; i < n_fields; i++) {
dfield_t* field;
@@ -1660,9 +1949,9 @@ rec_copy_prefix_to_buf(
ulint i;
ulint prefix_len;
ulint null_mask;
- ulint status;
bool is_rtr_node_ptr = false;
+ ut_ad(index->n_core_null_bytes <= UT_BITS_IN_BYTES(index->n_nullable));
UNIV_PREFETCH_RW(*buf);
if (!dict_table_is_comp(index->table)) {
@@ -1673,11 +1962,26 @@ rec_copy_prefix_to_buf(
buf, buf_size));
}
- status = rec_get_status(rec);
-
- switch (status) {
+ switch (rec_get_status(rec)) {
+ case REC_STATUS_COLUMNS_ADDED:
+ /* We would have !index->is_instant() when rolling back
+ an instant ADD COLUMN operation. */
+ ut_ad(index->is_instant() || page_rec_is_default_row(rec));
+ if (n_fields >= index->n_core_fields) {
+ ut_ad(index->is_instant());
+ ut_ad(n_fields <= index->n_fields);
+ nulls = &rec[-REC_N_NEW_EXTRA_BYTES];
+ const ulint n_rec = n_fields + 1
+ + rec_get_n_add_field(nulls);
+ const uint n_nullable = index->get_n_nullable(n_rec);
+ lens = --nulls - UT_BITS_IN_BYTES(n_nullable);
+ break;
+ }
+ /* fall through */
case REC_STATUS_ORDINARY:
- ut_ad(n_fields <= dict_index_get_n_fields(index));
+ ut_ad(n_fields <= index->n_core_fields);
+ nulls = rec - (REC_N_NEW_EXTRA_BYTES + 1);
+ lens = nulls - index->n_core_null_bytes;
break;
case REC_STATUS_NODE_PTR:
/* For R-tree, we need to copy the child page number field. */
@@ -1690,17 +1994,16 @@ rec_copy_prefix_to_buf(
ut_ad(n_fields <=
dict_index_get_n_unique_in_tree_nonleaf(index));
}
+ nulls = rec - (REC_N_NEW_EXTRA_BYTES + 1);
+ lens = nulls - index->n_core_null_bytes;
break;
case REC_STATUS_INFIMUM:
case REC_STATUS_SUPREMUM:
/* infimum or supremum record: no sense to copy anything */
- default:
ut_error;
return(NULL);
}
- nulls = rec - (REC_N_NEW_EXTRA_BYTES + 1);
- lens = nulls - UT_BITS_IN_BYTES(index->n_nullable);
UNIV_PREFETCH_R(lens);
prefix_len = 0;
null_mask = 1;
@@ -1839,20 +2142,27 @@ rec_validate(
return(FALSE);
}
- ut_a(rec_offs_comp(offsets) || n_fields <= rec_get_n_fields_old(rec));
+ ut_a(rec_offs_any_flag(offsets, REC_OFFS_COMPACT | REC_OFFS_DEFAULT)
+ || n_fields <= rec_get_n_fields_old(rec));
for (i = 0; i < n_fields; i++) {
rec_get_nth_field_offs(offsets, i, &len);
- if (!((len < UNIV_PAGE_SIZE) || (len == UNIV_SQL_NULL))) {
- ib::error() << "Record field " << i << " len " << len;
- return(FALSE);
- }
-
- if (len != UNIV_SQL_NULL) {
+ switch (len) {
+ default:
+ if (len >= UNIV_PAGE_SIZE) {
+ ib::error() << "Record field " << i
+ << " len " << len;
+ return(FALSE);
+ }
len_sum += len;
- } else if (!rec_offs_comp(offsets)) {
- len_sum += rec_get_nth_field_size(rec, i);
+ break;
+ case UNIV_SQL_DEFAULT:
+ break;
+ case UNIV_SQL_NULL:
+ if (!rec_offs_comp(offsets)) {
+ len_sum += rec_get_nth_field_size(rec, i);
+ }
}
}
@@ -1937,11 +2247,19 @@ rec_print_comp(
const byte* data;
ulint len;
- data = rec_get_nth_field(rec, offsets, i, &len);
+ if (rec_offs_nth_default(offsets, i)) {
+ len = UNIV_SQL_DEFAULT;
+ } else {
+ data = rec_get_nth_field(rec, offsets, i, &len);
+ }
fprintf(file, " " ULINTPF ":", i);
- if (len != UNIV_SQL_NULL) {
+ if (len == UNIV_SQL_NULL) {
+ fputs(" SQL NULL", file);
+ } else if (len == UNIV_SQL_DEFAULT) {
+ fputs(" SQL DEFAULT", file);
+ } else {
if (len <= 30) {
ut_print_buf(file, data, len);
@@ -1959,8 +2277,6 @@ rec_print_comp(
fprintf(file, " (total " ULINTPF " bytes)",
len);
}
- } else {
- fputs(" SQL NULL", file);
}
putc(';', file);
putc('\n', file);
@@ -2054,6 +2370,7 @@ rec_print_mbr_rec(
ut_ad(rec);
ut_ad(offsets);
ut_ad(rec_offs_validate(rec, NULL, offsets));
+ ut_ad(!rec_offs_any_default(offsets));
if (!rec_offs_comp(offsets)) {
rec_print_mbr_old(file, rec);
@@ -2205,6 +2522,11 @@ rec_print(
data = rec_get_nth_field(rec, offsets, i, &len);
+ if (len == UNIV_SQL_DEFAULT) {
+ o << "DEFAULT";
+ continue;
+ }
+
if (len == UNIV_SQL_NULL) {
o << "NULL";
continue;
@@ -2358,6 +2680,7 @@ wsrep_rec_get_foreign_key(
dict_index_get_nth_field(index_ref, i);
const dict_col_t* col_r = dict_field_get_col(field_r);
+ ut_ad(!rec_offs_nth_default(offsets, i));
data = rec_get_nth_field(rec, offsets, i, &len);
if (key_len + ((len != UNIV_SQL_NULL) ? len + 1 : 1) >
*buf_len) {
diff --git a/storage/innobase/row/row0ftsort.cc b/storage/innobase/row/row0ftsort.cc
index dad00c58827..5d4824f0aae 100644
--- a/storage/innobase/row/row0ftsort.cc
+++ b/storage/innobase/row/row0ftsort.cc
@@ -1679,6 +1679,11 @@ row_fts_merge_insert(
dict_table_close(aux_table, FALSE, FALSE);
aux_index = dict_table_get_first_index(aux_table);
+ ut_ad(!aux_index->is_instant());
+ /* row_merge_write_fts_node() depends on the correct value */
+ ut_ad(aux_index->n_core_null_bytes
+ == UT_BITS_IN_BYTES(aux_index->n_nullable));
+
FlushObserver* observer;
observer = psort_info[0].psort_common->trx->flush_observer;
diff --git a/storage/innobase/row/row0import.cc b/storage/innobase/row/row0import.cc
index c8a9370208f..757fbd28a88 100644
--- a/storage/innobase/row/row0import.cc
+++ b/storage/innobase/row/row0import.cc
@@ -1439,6 +1439,13 @@ IndexPurge::open() UNIV_NOTHROW
btr_pcur_open_at_index_side(
true, m_index, BTR_MODIFY_LEAF, &m_pcur, true, 0, &m_mtr);
+ btr_pcur_move_to_next_user_rec(&m_pcur, &m_mtr);
+ if (rec_is_default_row(btr_pcur_get_rec(&m_pcur), m_index)) {
+ ut_ad(btr_pcur_is_on_user_rec(&m_pcur));
+ /* Skip the 'default row' pseudo-record. */
+ } else {
+ btr_pcur_move_to_prev_on_page(&m_pcur);
+ }
}
/**
@@ -1815,6 +1822,13 @@ PageConverter::update_index_page(
if (dict_index_is_clust(m_index->m_srv_index)) {
if (page_is_root(page)) {
/* Preserve the PAGE_ROOT_AUTO_INC. */
+ if (m_index->m_srv_index->table->supports_instant()
+ && btr_cur_instant_root_init(
+ const_cast<dict_index_t*>(
+ m_index->m_srv_index),
+ page)) {
+ return(DB_CORRUPTION);
+ }
} else {
/* Clear PAGE_MAX_TRX_ID so that it can be
used for other purposes in the future. IMPORT
@@ -1907,6 +1921,8 @@ PageConverter::update_page(
return(DB_CORRUPTION);
}
+ /* fall through */
+ case FIL_PAGE_TYPE_INSTANT:
/* This is on every page in the tablespace. */
mach_write_to_4(
get_frame(block)
@@ -2309,7 +2325,14 @@ row_import_set_sys_max_row_id(
rec = btr_pcur_get_rec(&pcur);
/* Check for empty table. */
- if (!page_rec_is_infimum(rec)) {
+ if (page_rec_is_infimum(rec)) {
+ /* The table is empty. */
+ err = DB_SUCCESS;
+ } else if (rec_is_default_row(rec, index)) {
+ /* The clustered index contains the 'default row',
+ that is, the table is empty. */
+ err = DB_SUCCESS;
+ } else {
ulint len;
const byte* field;
mem_heap_t* heap = NULL;
@@ -2336,9 +2359,6 @@ row_import_set_sys_max_row_id(
if (heap != NULL) {
mem_heap_free(heap);
}
- } else {
- /* The table is empty. */
- err = DB_SUCCESS;
}
btr_pcur_close(&pcur);
diff --git a/storage/innobase/row/row0ins.cc b/storage/innobase/row/row0ins.cc
index 9b9d19ae960..88508cd8ce3 100644
--- a/storage/innobase/row/row0ins.cc
+++ b/storage/innobase/row/row0ins.cc
@@ -2250,6 +2250,8 @@ row_ins_duplicate_error_in_clust_online(
dberr_t err = DB_SUCCESS;
const rec_t* rec = btr_cur_get_rec(cursor);
+ ut_ad(!cursor->index->is_instant());
+
if (cursor->low_match >= n_uniq && !page_rec_is_infimum(rec)) {
*offsets = rec_get_offsets(rec, cursor->index, *offsets, true,
ULINT_UNDEFINED, heap);
@@ -2583,6 +2585,7 @@ row_ins_clust_index_entry_low(
ut_ad(flags & BTR_NO_LOCKING_FLAG);
ut_ad(!dict_index_is_online_ddl(index));
ut_ad(!index->table->persistent_autoinc);
+ ut_ad(!index->is_instant());
mtr.set_log_mode(MTR_LOG_NO_REDO);
} else {
mtr.set_named_space(index->space);
@@ -2634,6 +2637,41 @@ row_ins_clust_index_entry_low(
}
#endif /* UNIV_DEBUG */
+ if (UNIV_UNLIKELY(entry->info_bits)) {
+ ut_ad(entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(flags == BTR_NO_LOCKING_FLAG);
+ ut_ad(index->is_instant());
+ ut_ad(!dict_index_is_online_ddl(index));
+ ut_ad(!dup_chk_only);
+
+ const rec_t* rec = btr_cur_get_rec(cursor);
+
+ switch (rec_get_info_bits(rec, page_rec_is_comp(rec))
+ & (REC_INFO_MIN_REC_FLAG | REC_INFO_DELETED_FLAG)) {
+ case REC_INFO_MIN_REC_FLAG:
+ thr_get_trx(thr)->error_info = index;
+ err = DB_DUPLICATE_KEY;
+ goto err_exit;
+ case REC_INFO_MIN_REC_FLAG | REC_INFO_DELETED_FLAG:
+ /* The 'default row' is never delete-marked.
+ If a table loses its 'instantness', it happens
+ by the rollback of this first-time insert, or
+ by a call to btr_page_empty() on the root page
+ when the table becomes empty. */
+ err = DB_CORRUPTION;
+ goto err_exit;
+ default:
+ ut_ad(!row_ins_must_modify_rec(cursor));
+ goto do_insert;
+ }
+ }
+
+ if (index->is_instant()) entry->trim(*index);
+
+ if (rec_is_default_row(btr_cur_get_rec(cursor), index)) {
+ goto do_insert;
+ }
+
if (n_uniq
&& (cursor->up_match >= n_uniq || cursor->low_match >= n_uniq)) {
@@ -2696,6 +2734,7 @@ err_exit:
mtr_commit(&mtr);
mem_heap_free(entry_heap);
} else {
+do_insert:
rec_t* insert_rec;
if (mode != BTR_MODIFY_TREE) {
@@ -3192,16 +3231,19 @@ row_ins_clust_index_entry(
n_uniq = dict_index_is_unique(index) ? index->n_uniq : 0;
- /* Try first optimistic descent to the B-tree */
- log_free_check();
- const ulint flags = dict_table_is_temporary(index->table)
+ const ulint flags = index->table->is_temporary()
? BTR_NO_LOCKING_FLAG
: index->table->no_rollback() ? BTR_NO_ROLLBACK : 0;
+ const ulint orig_n_fields = entry->n_fields;
+
+ /* Try first optimistic descent to the B-tree */
+ log_free_check();
err = row_ins_clust_index_entry_low(
flags, BTR_MODIFY_LEAF, index, n_uniq, entry,
n_ext, thr, dup_chk_only);
+ entry->n_fields = orig_n_fields;
DEBUG_SYNC_C_IF_THD(thr_get_trx(thr)->mysql_thd,
"after_row_ins_clust_index_entry_leaf");
@@ -3218,6 +3260,8 @@ row_ins_clust_index_entry(
flags, BTR_MODIFY_TREE, index, n_uniq, entry,
n_ext, thr, dup_chk_only);
+ entry->n_fields = orig_n_fields;
+
DBUG_RETURN(err);
}
@@ -3311,7 +3355,7 @@ row_ins_index_entry(
DBUG_SET("-d,row_ins_index_entry_timeout");
return(DB_LOCK_WAIT);});
- if (dict_index_is_clust(index)) {
+ if (index->is_clust()) {
return(row_ins_clust_index_entry(index, entry, thr, 0, false));
} else {
return(row_ins_sec_index_entry(index, entry, thr, false));
diff --git a/storage/innobase/row/row0log.cc b/storage/innobase/row/row0log.cc
index a5dff03c16f..edb55534ada 100644
--- a/storage/innobase/row/row0log.cc
+++ b/storage/innobase/row/row0log.cc
@@ -832,16 +832,18 @@ row_log_table_low_redundant(
mem_heap_t* heap = NULL;
dtuple_t* tuple;
ulint num_v = ventry ? dtuple_get_n_v_fields(ventry) : 0;
+ const ulint n_fields = rec_get_n_fields_old(rec);
ut_ad(!page_is_comp(page_align(rec)));
- ut_ad(dict_index_get_n_fields(index) == rec_get_n_fields_old(rec));
+ ut_ad(index->n_fields >= n_fields);
+ ut_ad(index->n_fields == n_fields || index->is_instant());
ut_ad(dict_tf2_is_valid(index->table->flags, index->table->flags2));
ut_ad(!dict_table_is_comp(index->table)); /* redundant row format */
ut_ad(dict_index_is_clust(new_index));
- heap = mem_heap_create(DTUPLE_EST_ALLOC(index->n_fields));
- tuple = dtuple_create_with_vcol(heap, index->n_fields, num_v);
- dict_index_copy_types(tuple, index, index->n_fields);
+ heap = mem_heap_create(DTUPLE_EST_ALLOC(n_fields));
+ tuple = dtuple_create_with_vcol(heap, n_fields, num_v);
+ dict_index_copy_types(tuple, index, n_fields);
if (num_v) {
dict_table_copy_v_types(tuple, index->table);
@@ -850,7 +852,7 @@ row_log_table_low_redundant(
dtuple_set_n_fields_cmp(tuple, dict_index_get_n_unique(index));
if (rec_get_1byte_offs_flag(rec)) {
- for (ulint i = 0; i < index->n_fields; i++) {
+ for (ulint i = 0; i < n_fields; i++) {
dfield_t* dfield;
ulint len;
const void* field;
@@ -861,7 +863,7 @@ row_log_table_low_redundant(
dfield_set_data(dfield, field, len);
}
} else {
- for (ulint i = 0; i < index->n_fields; i++) {
+ for (ulint i = 0; i < n_fields; i++) {
dfield_t* dfield;
ulint len;
const void* field;
@@ -877,8 +879,15 @@ row_log_table_low_redundant(
}
}
+ rec_comp_status_t status = index->is_instant()
+ ? REC_STATUS_COLUMNS_ADDED : REC_STATUS_ORDINARY;
+
size = rec_get_converted_size_temp(
- index, tuple->fields, tuple->n_fields, &extra_size);
+ index, tuple->fields, tuple->n_fields, &extra_size, status);
+ if (index->is_instant()) {
+ size++;
+ extra_size++;
+ }
ulint v_size = num_v
? rec_get_converted_size_temp_v(index, ventry) : 0;
@@ -913,15 +922,19 @@ row_log_table_low_redundant(
if (byte* b = row_log_table_open(index->online_log,
mrec_size, &avail_size)) {
- *b++ = insert ? ROW_T_INSERT : ROW_T_UPDATE;
+ if (insert) {
+ *b++ = ROW_T_INSERT;
+ } else {
+ *b++ = ROW_T_UPDATE;
- if (old_pk_size) {
- *b++ = static_cast<byte>(old_pk_extra_size);
+ if (old_pk_size) {
+ *b++ = static_cast<byte>(old_pk_extra_size);
- rec_convert_dtuple_to_temp(
- b + old_pk_extra_size, new_index,
- old_pk->fields, old_pk->n_fields);
- b += old_pk_size;
+ rec_convert_dtuple_to_temp(
+ b + old_pk_extra_size, new_index,
+ old_pk->fields, old_pk->n_fields);
+ b += old_pk_size;
+ }
}
if (extra_size < 0x80) {
@@ -932,8 +945,17 @@ row_log_table_low_redundant(
*b++ = static_cast<byte>(extra_size);
}
+ if (status == REC_STATUS_COLUMNS_ADDED) {
+ ut_ad(index->is_instant());
+ if (n_fields <= index->n_core_fields) {
+ status = REC_STATUS_ORDINARY;
+ }
+ *b = status;
+ }
+
rec_convert_dtuple_to_temp(
- b + extra_size, index, tuple->fields, tuple->n_fields);
+ b + extra_size, index, tuple->fields, tuple->n_fields,
+ status);
b += size;
ut_ad(!num_v == !v_size);
if (num_v) {
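The hunk above reserves one extra byte in the temporary log record whenever the clustered index has instantly added columns, and writes `extra_size` in a variable-length form. A minimal sketch of that header layout (illustrative code, not the real InnoDB implementation; the `REC_STATUS_COLUMNS_ADDED` value mirrors the InnoDB enum as I read it and should be treated as an assumption):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sketch of the log-record header written above: extra_size takes one
// byte when it fits in 7 bits, otherwise two bytes with the high bit of
// the first byte set; for an instant index one status byte follows,
// before the copied record header bytes.
enum rec_comp_status_t : uint8_t {
    REC_STATUS_ORDINARY      = 0,
    REC_STATUS_COLUMNS_ADDED = 4   // assumed value, for illustration
};

std::vector<uint8_t> write_log_header(uint32_t extra_size, bool is_instant,
                                      rec_comp_status_t status)
{
    std::vector<uint8_t> b;
    if (extra_size < 0x80) {
        b.push_back(uint8_t(extra_size));
    } else {
        // two-byte form; extra_size is assumed to fit in 15 bits
        b.push_back(uint8_t(0x80 | (extra_size >> 8)));
        b.push_back(uint8_t(extra_size));
    }
    if (is_instant) {
        // the byte the apply side later reads as *(mrec - extra_size)
        b.push_back(uint8_t(status));
    }
    return b;
}
```

Note how `size++`/`extra_size++` in the hunk account for exactly this optional status byte.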
@@ -976,7 +998,6 @@ row_log_table_low(
const dtuple_t* old_pk) /*!< in: old PRIMARY KEY value (if !insert
and a PRIMARY KEY is being created) */
{
- ulint omit_size;
ulint old_pk_size;
ulint old_pk_extra_size;
ulint extra_size;
@@ -995,7 +1016,19 @@ row_log_table_low(
ut_ad(rw_lock_own_flagged(
&index->lock,
RW_LOCK_FLAG_S | RW_LOCK_FLAG_X | RW_LOCK_FLAG_SX));
- ut_ad(fil_page_get_type(page_align(rec)) == FIL_PAGE_INDEX);
+#ifdef UNIV_DEBUG
+ switch (fil_page_get_type(page_align(rec))) {
+ case FIL_PAGE_INDEX:
+ break;
+ case FIL_PAGE_TYPE_INSTANT:
+ ut_ad(index->is_instant());
+ ut_ad(page_is_root(page_align(rec)));
+ break;
+ default:
+ ut_ad(!"wrong page type");
+ }
+#endif /* UNIV_DEBUG */
+ ut_ad(!rec_is_default_row(rec, index));
ut_ad(page_rec_is_leaf(rec));
ut_ad(!page_is_comp(page_align(rec)) == !rec_offs_comp(offsets));
/* old_pk=row_log_table_get_pk() [not needed in INSERT] is a prefix
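The debug switch introduced above admits `FIL_PAGE_TYPE_INSTANT` only on the root page of an index with instantly added columns. A self-contained sketch of that rule (the constant values mirror `fil0fil.h` as I recall them and are assumptions, not authoritative):

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the page-type validation above: FIL_PAGE_TYPE_INSTANT is
// only acceptable on the root page of an instant clustered index;
// ordinary index pages must carry FIL_PAGE_INDEX.
const uint16_t FIL_PAGE_INDEX        = 17855;  // assumed value
const uint16_t FIL_PAGE_TYPE_INSTANT = 18;     // assumed value

bool page_type_valid(uint16_t type, bool index_is_instant, bool is_root)
{
    switch (type) {
    case FIL_PAGE_INDEX:
        return true;
    case FIL_PAGE_TYPE_INSTANT:
        return index_is_instant && is_root;
    default:
        return false;   // ut_ad(!"wrong page type") analogue
    }
}
```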
@@ -1020,14 +1053,17 @@ row_log_table_low(
}
ut_ad(page_is_comp(page_align(rec)));
- ut_ad(rec_get_status(rec) == REC_STATUS_ORDINARY);
+ ut_ad(rec_get_status(rec) == REC_STATUS_ORDINARY
+ || rec_get_status(rec) == REC_STATUS_COLUMNS_ADDED);
- omit_size = REC_N_NEW_EXTRA_BYTES;
+ const ulint omit_size = REC_N_NEW_EXTRA_BYTES;
- extra_size = rec_offs_extra_size(offsets) - omit_size;
+ const ulint rec_extra_size = rec_offs_extra_size(offsets) - omit_size;
+ extra_size = rec_extra_size + index->is_instant();
mrec_size = ROW_LOG_HEADER_SIZE
- + (extra_size >= 0x80) + rec_offs_size(offsets) - omit_size;
+ + (extra_size >= 0x80) + rec_offs_size(offsets) - omit_size
+ + index->is_instant();
if (ventry && ventry->n_v_fields > 0) {
mrec_size += rec_get_converted_size_temp_v(new_index, ventry);
@@ -1063,15 +1099,19 @@ row_log_table_low(
if (byte* b = row_log_table_open(index->online_log,
mrec_size, &avail_size)) {
- *b++ = insert ? ROW_T_INSERT : ROW_T_UPDATE;
+ if (insert) {
+ *b++ = ROW_T_INSERT;
+ } else {
+ *b++ = ROW_T_UPDATE;
- if (old_pk_size) {
- *b++ = static_cast<byte>(old_pk_extra_size);
+ if (old_pk_size) {
+ *b++ = static_cast<byte>(old_pk_extra_size);
- rec_convert_dtuple_to_temp(
- b + old_pk_extra_size, new_index,
- old_pk->fields, old_pk->n_fields);
- b += old_pk_size;
+ rec_convert_dtuple_to_temp(
+ b + old_pk_extra_size, new_index,
+ old_pk->fields, old_pk->n_fields);
+ b += old_pk_size;
+ }
}
if (extra_size < 0x80) {
@@ -1082,8 +1122,14 @@ row_log_table_low(
*b++ = static_cast<byte>(extra_size);
}
- memcpy(b, rec - rec_offs_extra_size(offsets), extra_size);
- b += extra_size;
+ if (index->is_instant()) {
+ *b++ = rec_get_status(rec);
+ } else {
+ ut_ad(rec_get_status(rec) == REC_STATUS_ORDINARY);
+ }
+
+ memcpy(b, rec - rec_extra_size - omit_size, rec_extra_size);
+ b += rec_extra_size;
memcpy(b, rec, rec_offs_data_size(offsets));
b += rec_offs_data_size(offsets);
@@ -1618,6 +1664,9 @@ blob_done:
rw_lock_x_unlock(dict_index_get_lock(index));
} else {
data = rec_get_nth_field(mrec, offsets, i, &len);
+ if (len == UNIV_SQL_DEFAULT) {
+ data = index->instant_field_value(i, &len);
+ }
dfield_set_data(dfield, data, len);
}
@@ -2479,12 +2528,18 @@ row_log_table_apply_op(
mrec += extra_size;
+ ut_ad(extra_size || !dup->index->is_instant());
+
if (mrec > mrec_end) {
return(NULL);
}
rec_offs_set_n_fields(offsets, dup->index->n_fields);
- rec_init_offsets_temp(mrec, dup->index, offsets);
+ rec_init_offsets_temp(mrec, dup->index, offsets,
+ dup->index->is_instant()
+ ? static_cast<rec_comp_status_t>(
+ *(mrec - extra_size))
+ : REC_STATUS_ORDINARY);
next_mrec = mrec + rec_offs_data_size(offsets);
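The apply side above recovers the status byte with `*(mrec - extra_size)`: after the one- or two-byte length prefix, the status byte is the first of the `extra_size` bytes. A decode-side sketch under the same assumed layout (illustrative names, not the InnoDB code):

```cpp
#include <cassert>
#include <cstdint>

// Decode counterpart of the log-record header: read extra_size (one or
// two bytes, high bit of the first byte selecting the two-byte form),
// then peek at the record status stored as the first extra byte when
// the index has instantly added columns.
struct parsed_header {
    uint32_t extra_size;
    uint8_t  status;    // REC_STATUS_* value; 0 for a non-instant index
};

parsed_header read_log_header(const uint8_t*& p, bool is_instant)
{
    uint32_t extra = *p++;
    if (extra >= 0x80) {
        extra = ((extra & 0x7f) << 8) | *p++;
    }
    // p now plays the role of mrec - extra_size; the status byte,
    // if present, is the first of the extra bytes
    uint8_t status = is_instant ? p[0] : 0;
    return {extra, status};
}
```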
@@ -2520,6 +2575,9 @@ row_log_table_apply_op(
For fixed-length PRIMARY key columns, it is 0. */
mrec += extra_size;
+ /* The ROW_T_DELETE record was converted by
+ rec_convert_dtuple_to_temp() using new_index. */
+ ut_ad(!new_index->is_instant());
rec_offs_set_n_fields(offsets, new_index->n_uniq + 2);
rec_init_offsets_temp(mrec, new_index, offsets);
next_mrec = mrec + rec_offs_data_size(offsets) + ext_size;
@@ -2590,12 +2648,18 @@ row_log_table_apply_op(
mrec += extra_size;
+ ut_ad(extra_size || !dup->index->is_instant());
+
if (mrec > mrec_end) {
return(NULL);
}
rec_offs_set_n_fields(offsets, dup->index->n_fields);
- rec_init_offsets_temp(mrec, dup->index, offsets);
+ rec_init_offsets_temp(mrec, dup->index, offsets,
+ dup->index->is_instant()
+ ? static_cast<rec_comp_status_t>(
+ *(mrec - extra_size))
+ : REC_STATUS_ORDINARY);
next_mrec = mrec + rec_offs_data_size(offsets);
@@ -2638,6 +2702,9 @@ row_log_table_apply_op(
/* Get offsets for PRIMARY KEY,
DB_TRX_ID, DB_ROLL_PTR. */
+ /* The old_pk prefix was converted by
+ rec_convert_dtuple_to_temp() using new_index. */
+ ut_ad(!new_index->is_instant());
rec_offs_set_n_fields(offsets, new_index->n_uniq + 2);
rec_init_offsets_temp(mrec, new_index, offsets);
@@ -2690,12 +2757,18 @@ row_log_table_apply_op(
mrec += extra_size;
+ ut_ad(extra_size || !dup->index->is_instant());
+
if (mrec > mrec_end) {
return(NULL);
}
rec_offs_set_n_fields(offsets, dup->index->n_fields);
- rec_init_offsets_temp(mrec, dup->index, offsets);
+ rec_init_offsets_temp(mrec, dup->index, offsets,
+ dup->index->is_instant()
+ ? static_cast<rec_comp_status_t>(
+ *(mrec - extra_size))
+ : REC_STATUS_ORDINARY);
next_mrec = mrec + rec_offs_data_size(offsets);
diff --git a/storage/innobase/row/row0merge.cc b/storage/innobase/row/row0merge.cc
index fa466d09d30..a2442222d6d 100644
--- a/storage/innobase/row/row0merge.cc
+++ b/storage/innobase/row/row0merge.cc
@@ -1856,6 +1856,14 @@ row_merge_read_clustered_index(
btr_pcur_open_at_index_side(
true, clust_index, BTR_SEARCH_LEAF, &pcur, true, 0, &mtr);
+ btr_pcur_move_to_next_user_rec(&pcur, &mtr);
+ if (rec_is_default_row(btr_pcur_get_rec(&pcur), clust_index)) {
+ ut_ad(btr_pcur_is_on_user_rec(&pcur));
+ /* Skip the 'default row' pseudo-record. */
+ } else {
+ ut_ad(!clust_index->is_instant());
+ btr_pcur_move_to_prev_on_page(&pcur);
+ }
if (old_table != new_table) {
/* The table is being rebuilt. Identify the columns
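The cursor dance added above steps onto the first user record and, if it is the 'default row' pseudo-record, leaves the cursor past it so the rebuild never copies it; otherwise it steps back so the scan still starts at the first real record. A toy model of that positioning (illustrative structures only):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of the scan start above: the 'default row' pseudo-record,
// when present, sits between the page infimum and the first user
// record on the leftmost leaf, and a full-table scan must skip it.
struct rec_model { bool is_default_row; };

std::size_t scan_start(const std::vector<rec_model>& leftmost_leaf)
{
    std::size_t pos = 0;   // first record after the page infimum
    if (!leftmost_leaf.empty() && leftmost_leaf[0].is_default_row) {
        ++pos;             // skip the pseudo-record
    }
    return pos;            // no default row: index is not instant
}
```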
@@ -4388,52 +4396,7 @@ row_merge_rename_tables_dict(
return(err);
}
-/** Create and execute a query graph for creating an index.
-@param[in,out] trx trx
-@param[in,out] table table
-@param[in,out] index index
-@param[in] add_v new virtual columns added along with add index call
-@return DB_SUCCESS or error code */
-MY_ATTRIBUTE((nonnull(1,2,3), warn_unused_result))
-static
-dberr_t
-row_merge_create_index_graph(
- trx_t* trx,
- dict_table_t* table,
- dict_index_t* index,
- const dict_add_v_col_t* add_v)
-{
- ind_node_t* node; /*!< Index creation node */
- mem_heap_t* heap; /*!< Memory heap */
- que_thr_t* thr; /*!< Query thread */
- dberr_t err;
-
- DBUG_ENTER("row_merge_create_index_graph");
-
- ut_ad(trx);
- ut_ad(table);
- ut_ad(index);
-
- heap = mem_heap_create(512);
-
- index->table = table;
- node = ind_create_graph_create(index, heap, add_v);
- thr = pars_complete_graph_for_exec(node, trx, heap, NULL);
-
- ut_a(thr == que_fork_start_command(
- static_cast<que_fork_t*>(que_node_get_parent(thr))));
-
- que_run_threads(thr);
-
- err = trx->error_state;
-
- que_graph_free((que_t*) que_node_get_parent(thr));
-
- DBUG_RETURN(err);
-}
-
/** Create the index and load in to the dictionary.
-@param[in,out] trx trx (sets error_state)
@param[in,out] table the index is on this table
@param[in] index_def the index definition
@param[in] add_v new virtual columns added along with add
@@ -4443,17 +4406,14 @@ row_merge_create_index_graph(
@return index, or NULL on error */
dict_index_t*
row_merge_create_index(
- trx_t* trx,
dict_table_t* table,
const index_def_t* index_def,
const dict_add_v_col_t* add_v,
const char** col_names)
{
dict_index_t* index;
- dberr_t err;
ulint n_fields = index_def->n_fields;
ulint i;
- bool has_new_v_col = false;
DBUG_ENTER("row_merge_create_index");
@@ -4466,8 +4426,7 @@ row_merge_create_index(
index = dict_mem_index_create(table->name.m_name, index_def->name,
0, index_def->ind_type, n_fields);
- ut_a(index);
-
+ index->table = table;
index->set_committed(index_def->rebuild);
for (i = 0; i < n_fields; i++) {
@@ -4481,7 +4440,7 @@ row_merge_create_index(
ut_ad(ifield->col_no >= table->n_v_def);
name = add_v->v_col_name[
ifield->col_no - table->n_v_def];
- has_new_v_col = true;
+ index->has_new_v_col = true;
} else {
name = dict_table_get_v_col_name(
table, ifield->col_no);
@@ -4506,27 +4465,6 @@ row_merge_create_index(
dict_mem_index_add_field(index, name, ifield->prefix_len);
}
- /* Add the index to SYS_INDEXES, using the index prototype. */
- err = row_merge_create_index_graph(trx, table, index, add_v);
-
- if (err == DB_SUCCESS) {
-
- index = dict_table_get_index_on_name(table, index_def->name,
- index_def->rebuild);
-
- ut_a(index);
-
- index->parser = index_def->parser;
- index->has_new_v_col = has_new_v_col;
-
- /* Note the id of the transaction that created this
- index, we use it to restrict readers from accessing
- this index, to ensure read consistency. */
- ut_ad(index->trx_id == trx->id);
- } else {
- index = NULL;
- }
-
DBUG_RETURN(index);
}
diff --git a/storage/innobase/row/row0mysql.cc b/storage/innobase/row/row0mysql.cc
index eb1c253be1c..f16155ef152 100644
--- a/storage/innobase/row/row0mysql.cc
+++ b/storage/innobase/row/row0mysql.cc
@@ -2622,6 +2622,8 @@ row_create_index_for_mysql(
index = dict_index_get_if_in_cache_low(index_id);
ut_a(index != NULL);
index->table = table;
+ ut_ad(!index->is_instant());
+ index->n_core_null_bytes = UT_BITS_IN_BYTES(index->n_nullable);
err = dict_create_index_tree_in_mem(index, trx);
@@ -3965,9 +3967,8 @@ row_drop_table_for_mysql(
}
sql += "DELETE FROM SYS_VIRTUAL\n"
- "WHERE TABLE_ID = table_id;\n";
-
- sql += "END;\n";
+ "WHERE TABLE_ID = table_id;\n"
+ "END;\n";
err = que_eval_sql(info, sql.c_str(), FALSE, trx);
} else {
diff --git a/storage/innobase/row/row0purge.cc b/storage/innobase/row/row0purge.cc
index b041917d398..f4f6d4cff9f 100644
--- a/storage/innobase/row/row0purge.cc
+++ b/storage/innobase/row/row0purge.cc
@@ -905,6 +905,7 @@ row_purge_parse_undo_rec(
node->rec_type = type;
switch (type) {
+ case TRX_UNDO_INSERT_DEFAULT:
case TRX_UNDO_INSERT_REC:
break;
default:
@@ -942,9 +943,15 @@ try_again:
goto err_exit;
}
- if (type != TRX_UNDO_INSERT_REC
- && node->table->n_v_cols && !node->table->vc_templ
- && dict_table_has_indexed_v_cols(node->table)) {
+ switch (type) {
+ case TRX_UNDO_INSERT_DEFAULT:
+ case TRX_UNDO_INSERT_REC:
+ break;
+ default:
+ if (!node->table->n_v_cols || node->table->vc_templ
+ || !dict_table_has_indexed_v_cols(node->table)) {
+ break;
+ }
/* Need server fully up for virtual column computation */
if (!mysqld_server_started) {
@@ -974,6 +981,11 @@ err_exit:
return(false);
}
+ if (type == TRX_UNDO_INSERT_DEFAULT) {
+ node->ref = &trx_undo_default_rec;
+ return(true);
+ }
+
ptr = trx_undo_rec_get_row_ref(ptr, clust_index, &(node->ref),
node->heap);
@@ -1034,6 +1046,7 @@ row_purge_record_func(
MONITOR_INC(MONITOR_N_DEL_ROW_PURGE);
}
break;
+ case TRX_UNDO_INSERT_DEFAULT:
case TRX_UNDO_INSERT_REC:
node->roll_ptr |= 1ULL << ROLL_PTR_INSERT_FLAG_POS;
/* fall through */
diff --git a/storage/innobase/row/row0row.cc b/storage/innobase/row/row0row.cc
index a61adb7cd15..c9bd4533fd8 100644
--- a/storage/innobase/row/row0row.cc
+++ b/storage/innobase/row/row0row.cc
@@ -436,7 +436,7 @@ row_build_low(
}
/* Avoid a debug assertion in rec_offs_validate(). */
- rec_offs_make_valid(copy, index, const_cast<ulint*>(offsets));
+ rec_offs_make_valid(copy, index, true, const_cast<ulint*>(offsets));
if (!col_table) {
ut_ad(!col_map);
@@ -506,10 +506,14 @@ row_build_low(
}
dfield_t* dfield = dtuple_get_nth_field(row, col_no);
-
- const byte* field = rec_get_nth_field(
+ const void* field = rec_get_nth_field(
copy, offsets, i, &len);
-
+ if (len == UNIV_SQL_DEFAULT) {
+ field = index->instant_field_value(i, &len);
+ if (field && type != ROW_COPY_POINTERS) {
+ field = mem_heap_dup(heap, field, len);
+ }
+ }
dfield_set_data(dfield, field, len);
if (rec_offs_nth_extern(offsets, i)) {
@@ -526,7 +530,7 @@ row_build_low(
}
}
- rec_offs_make_valid(rec, index, const_cast<ulint*>(offsets));
+ rec_offs_make_valid(rec, index, true, const_cast<ulint*>(offsets));
ut_ad(dtuple_check_typed(row));
@@ -644,20 +648,24 @@ row_build_w_add_vcol(
add_cols, add_v, col_map, ext, heap));
}
-/*******************************************************************//**
-Converts an index record to a typed data tuple.
+/** Convert an index record to a data tuple.
+@tparam def whether the index->instant_field_value() needs to be accessed
+@param[in] rec index record
+@param[in] index index
+@param[in] offsets rec_get_offsets(rec, index)
+@param[out] n_ext number of externally stored columns
+@param[in,out] heap memory heap for allocations
@return index entry built; does not set info_bits, and the data fields
in the entry will point directly to rec */
+template<bool def>
+static inline
dtuple_t*
-row_rec_to_index_entry_low(
-/*=======================*/
- const rec_t* rec, /*!< in: record in the index */
- const dict_index_t* index, /*!< in: index */
- const ulint* offsets,/*!< in: rec_get_offsets(rec, index) */
- ulint* n_ext, /*!< out: number of externally
- stored columns */
- mem_heap_t* heap) /*!< in: memory heap from which
- the memory needed is allocated */
+row_rec_to_index_entry_impl(
+ const rec_t* rec,
+ const dict_index_t* index,
+ const ulint* offsets,
+ ulint* n_ext,
+ mem_heap_t* heap)
{
dtuple_t* entry;
dfield_t* dfield;
@@ -669,6 +677,7 @@ row_rec_to_index_entry_low(
ut_ad(rec != NULL);
ut_ad(heap != NULL);
ut_ad(index != NULL);
+ ut_ad(def || !rec_offs_any_default(offsets));
/* Because this function may be invoked by row0merge.cc
on a record whose header is in different format, the check
@@ -693,7 +702,9 @@ row_rec_to_index_entry_low(
for (i = 0; i < rec_len; i++) {
dfield = dtuple_get_nth_field(entry, i);
- field = rec_get_nth_field(rec, offsets, i, &len);
+ field = def
+ ? rec_get_nth_cfield(rec, index, offsets, i, &len)
+ : rec_get_nth_field(rec, offsets, i, &len);
dfield_set_data(dfield, field, len);
@@ -704,10 +715,27 @@ row_rec_to_index_entry_low(
}
ut_ad(dtuple_check_typed(entry));
-
return(entry);
}
+/** Convert an index record to a data tuple.
+@param[in] rec index record
+@param[in] index index
+@param[in] offsets rec_get_offsets(rec, index)
+@param[out] n_ext number of externally stored columns
+@param[in,out] heap memory heap for allocations */
+dtuple_t*
+row_rec_to_index_entry_low(
+ const rec_t* rec,
+ const dict_index_t* index,
+ const ulint* offsets,
+ ulint* n_ext,
+ mem_heap_t* heap)
+{
+ return row_rec_to_index_entry_impl<false>(
+ rec, index, offsets, n_ext, heap);
+}
+
/*******************************************************************//**
Converts an index record to a typed data tuple. NOTE that externally
stored (often big) fields are NOT copied to heap.
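The refactoring above turns `row_rec_to_index_entry_low()` into a `template<bool def>` implementation so the default-value fallback is compiled in only where it can occur, leaving the common path free of a per-field branch. A minimal sketch of that dispatch pattern (hypothetical names; `SQL_DEFAULT` stands in for `UNIV_SQL_DEFAULT`):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of the template<bool def> dispatch: the flag is resolved at
// compile time, so instantiations with def=false pay nothing for the
// instant-default fallback.
const int SQL_DEFAULT = -999;   // stands in for UNIV_SQL_DEFAULT

template<bool def>
int fetch_field(const std::vector<int>& rec, std::size_t i,
                int instant_default)
{
    int v = i < rec.size() ? rec[i] : SQL_DEFAULT;
    if (def && v == SQL_DEFAULT) {
        return instant_default;   // instant_field_value() analogue
    }
    return v;
}
```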
@@ -738,10 +766,12 @@ row_rec_to_index_entry(
copy_rec = rec_copy(buf, rec, offsets);
- rec_offs_make_valid(copy_rec, index, const_cast<ulint*>(offsets));
- entry = row_rec_to_index_entry_low(
+ rec_offs_make_valid(copy_rec, index, true,
+ const_cast<ulint*>(offsets));
+ entry = row_rec_to_index_entry_impl<true>(
copy_rec, index, offsets, n_ext, heap);
- rec_offs_make_valid(rec, index, const_cast<ulint*>(offsets));
+ rec_offs_make_valid(rec, index, true,
+ const_cast<ulint*>(offsets));
dtuple_set_info_bits(entry,
rec_get_info_bits(rec, rec_offs_comp(offsets)));
@@ -804,8 +834,7 @@ row_build_row_ref(
mem_heap_alloc(heap, rec_offs_size(offsets)));
rec = rec_copy(buf, rec, offsets);
- /* Avoid a debug assertion in rec_offs_validate(). */
- rec_offs_make_valid(rec, index, offsets);
+ rec_offs_make_valid(rec, index, true, offsets);
}
table = index->table;
@@ -825,6 +854,7 @@ row_build_row_ref(
ut_a(pos != ULINT_UNDEFINED);
+ ut_ad(!rec_offs_nth_default(offsets, pos));
field = rec_get_nth_field(rec, offsets, pos, &len);
dfield_set_data(dfield, field, len);
@@ -925,6 +955,7 @@ row_build_row_ref_in_tuple(
ut_a(pos != ULINT_UNDEFINED);
+ ut_ad(!rec_offs_nth_default(offsets, pos));
field = rec_get_nth_field(rec, offsets, pos, &len);
dfield_set_data(dfield, field, len);
@@ -980,9 +1011,24 @@ row_search_on_row_ref(
index = dict_table_get_first_index(table);
- ut_a(dtuple_get_n_fields(ref) == dict_index_get_n_unique(index));
-
- btr_pcur_open(index, ref, PAGE_CUR_LE, mode, pcur, mtr);
+ if (UNIV_UNLIKELY(ref->info_bits)) {
+ ut_ad(ref->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(ref->n_fields <= index->n_uniq);
+ btr_pcur_open_at_index_side(true, index, mode, pcur, true, 0,
+ mtr);
+ btr_pcur_move_to_next_user_rec(pcur, mtr);
+ /* We do not necessarily have index->is_instant() here,
+ because we could be executing a rollback of an
+ instant ADD COLUMN operation. The function
+ rec_is_default_row() asserts index->is_instant();
+ we do not want to call it here. */
+ return rec_get_info_bits(btr_pcur_get_rec(pcur),
+ dict_table_is_comp(index->table))
+ & REC_INFO_MIN_REC_FLAG;
+ } else {
+ ut_a(ref->n_fields == index->n_uniq);
+ btr_pcur_open(index, ref, PAGE_CUR_LE, mode, pcur, mtr);
+ }
low_match = btr_pcur_get_low_match(pcur);
@@ -1227,6 +1273,8 @@ row_raw_format(
ulint ret;
ibool format_in_hex;
+ ut_ad(data_len != UNIV_SQL_DEFAULT);
+
if (buf_size == 0) {
return(0);
diff --git a/storage/innobase/row/row0sel.cc b/storage/innobase/row/row0sel.cc
index 08cce51a503..ad583393d23 100644
--- a/storage/innobase/row/row0sel.cc
+++ b/storage/innobase/row/row0sel.cc
@@ -242,7 +242,7 @@ row_sel_sec_rec_is_for_clust_rec(
clust_field = static_cast<byte*>(vfield->data);
} else {
clust_pos = dict_col_get_clust_pos(col, clust_index);
-
+ ut_ad(!rec_offs_nth_default(clust_offs, clust_pos));
clust_field = rec_get_nth_field(
clust_rec, clust_offs, clust_pos, &clust_len);
}
@@ -517,7 +517,6 @@ row_sel_fetch_columns(
rec, offsets,
dict_table_page_size(index->table),
field_no, &len, heap);
- //field_no, &len, heap, NULL);
/* data == NULL means that the
externally stored field was not
@@ -534,9 +533,8 @@ row_sel_fetch_columns(
needs_copy = TRUE;
} else {
- data = rec_get_nth_field(rec, offsets,
- field_no, &len);
-
+ data = rec_get_nth_cfield(rec, index, offsets,
+ field_no, &len);
needs_copy = column->copy_val;
}
@@ -1494,6 +1492,15 @@ row_sel_try_search_shortcut(
return(SEL_RETRY);
}
+ if (rec_is_default_row(rec, index)) {
+ /* Skip the 'default row' pseudo-record. */
+ if (!btr_pcur_move_to_next_user_rec(&plan->pcur, mtr)) {
+ return(SEL_RETRY);
+ }
+
+ rec = btr_pcur_get_rec(&plan->pcur);
+ }
+
ut_ad(plan->mode == PAGE_CUR_GE);
/* As the cursor is now placed on a user record after a search with
@@ -1820,6 +1827,12 @@ skip_lock:
goto next_rec;
}
+ if (rec_is_default_row(rec, index)) {
+ /* Skip the 'default row' pseudo-record. */
+ cost_counter++;
+ goto next_rec;
+ }
+
if (!consistent_read) {
/* Try to place a lock on the index record */
ulint lock_type;
@@ -3023,7 +3036,6 @@ row_sel_store_mysql_field_func(
rec, offsets,
dict_table_page_size(prebuilt->table),
field_no, &len, heap);
- //field_no, &len, heap, NULL);
if (UNIV_UNLIKELY(!data)) {
/* The externally stored field was not written
@@ -3050,9 +3062,19 @@ row_sel_store_mysql_field_func(
mem_heap_free(heap);
}
} else {
- /* Field is stored in the row. */
-
- data = rec_get_nth_field(rec, offsets, field_no, &len);
+ /* The field is stored in the index record, or
+ in the 'default row' for instant ADD COLUMN. */
+
+ if (rec_offs_nth_default(offsets, field_no)) {
+ ut_ad(dict_index_is_clust(index));
+ ut_ad(index->is_instant());
+ const dict_index_t* clust_index
+ = dict_table_get_first_index(prebuilt->table);
+ ut_ad(index == clust_index);
+ data = clust_index->instant_field_value(field_no,&len);
+ } else {
+ data = rec_get_nth_field(rec, offsets, field_no, &len);
+ }
if (len == UNIV_SQL_NULL) {
/* MySQL assumes that the field for an SQL
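The read path above substitutes `instant_field_value()` whenever the offsets mark a field as defaulted, i.e. when the physical record predates the instantly added column. A toy model of that lookup against the hidden 'default row' (illustrative structures, not InnoDB's):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy model: records written before the ALTER carry fewer fields;
// a read of a missing field resolves to the value preserved in the
// 'default row' pseudo-record of the clustered index.
struct clust_index_model {
    std::vector<std::string> default_row;  // one entry per column
    std::size_t n_core_fields;             // fields present pre-ALTER
};

std::string store_field(const clust_index_model& index,
                        const std::vector<std::string>& rec, std::size_t i)
{
    if (i < rec.size()) {
        return rec[i];             // physically stored in the record
    }
    // rec_offs_nth_default() analogue: fall back to the default row
    return index.default_row[i];
}
```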
@@ -3586,7 +3608,12 @@ sel_restore_position_for_mysql(
case BTR_PCUR_ON:
if (!success && moves_up) {
next:
- btr_pcur_move_to_next(pcur, mtr);
+ if (btr_pcur_move_to_next(pcur, mtr)
+ && rec_is_default_row(btr_pcur_get_rec(pcur),
+ pcur->btr_cur.index)) {
+ btr_pcur_move_to_next(pcur, mtr);
+ }
+
return(TRUE);
}
return(!success);
@@ -3597,7 +3624,9 @@ next:
/* positioned to record after pcur->old_rec. */
pcur->pos_state = BTR_PCUR_IS_POSITIONED;
prev:
- if (btr_pcur_is_on_user_rec(pcur) && !moves_up) {
+ if (btr_pcur_is_on_user_rec(pcur) && !moves_up
+ && !rec_is_default_row(btr_pcur_get_rec(pcur),
+ pcur->btr_cur.index)) {
btr_pcur_move_to_prev(pcur, mtr);
}
return(TRUE);
@@ -3877,6 +3906,15 @@ row_sel_try_search_shortcut_for_mysql(
return(SEL_RETRY);
}
+ if (rec_is_default_row(rec, index)) {
+ /* Skip the 'default row' pseudo-record. */
+ if (!btr_pcur_move_to_next_user_rec(pcur, mtr)) {
+ return(SEL_RETRY);
+ }
+
+ rec = btr_pcur_get_rec(pcur);
+ }
+
/* As the cursor is now placed on a user record after a search with
the mode PAGE_CUR_GE, the up_match field in the cursor tells how many
fields in the user record matched to the search tuple */
@@ -4019,6 +4057,9 @@ row_sel_fill_vrow(
rec_offs_init(offsets_);
ut_ad(!(*vrow));
+ ut_ad(heap);
+ ut_ad(!dict_index_is_clust(index));
+ ut_ad(!index->is_instant());
ut_ad(page_rec_is_leaf(rec));
offsets = rec_get_offsets(rec, index, offsets, true,
@@ -4032,18 +4073,18 @@ row_sel_fill_vrow(
for (ulint i = 0; i < dict_index_get_n_fields(index); i++) {
const dict_field_t* field;
- const dict_col_t* col;
+ const dict_col_t* col;
field = dict_index_get_nth_field(index, i);
col = dict_field_get_col(field);
if (dict_col_is_virtual(col)) {
const byte* data;
- ulint len;
+ ulint len;
data = rec_get_nth_field(rec, offsets, i, &len);
- const dict_v_col_t* vcol = reinterpret_cast<
+ const dict_v_col_t* vcol = reinterpret_cast<
const dict_v_col_t*>(col);
dfield_t* dfield = dtuple_get_nth_v_field(
@@ -4702,12 +4743,24 @@ rec_loop:
corruption */
if (comp) {
+ if (rec_get_info_bits(rec, true) & REC_INFO_MIN_REC_FLAG) {
+ /* Skip the 'default row' pseudo-record. */
+ ut_ad(index->is_instant());
+ goto next_rec;
+ }
+
next_offs = rec_get_next_offs(rec, TRUE);
if (UNIV_UNLIKELY(next_offs < PAGE_NEW_SUPREMUM)) {
goto wrong_offs;
}
} else {
+ if (rec_get_info_bits(rec, false) & REC_INFO_MIN_REC_FLAG) {
+ /* Skip the 'default row' pseudo-record. */
+ ut_ad(index->is_instant());
+ goto next_rec;
+ }
+
next_offs = rec_get_next_offs(rec, FALSE);
if (UNIV_UNLIKELY(next_offs < PAGE_OLD_SUPREMUM)) {
@@ -5979,6 +6032,9 @@ row_search_get_max_rec(
btr_pcur_close(&pcur);
+ ut_ad(!rec
+ || !(rec_get_info_bits(rec, dict_table_is_comp(index->table))
+ & (REC_INFO_MIN_REC_FLAG | REC_INFO_DELETED_FLAG)));
return(rec);
}
diff --git a/storage/innobase/row/row0trunc.cc b/storage/innobase/row/row0trunc.cc
index 90e7aa08965..8d371f67e72 100644
--- a/storage/innobase/row/row0trunc.cc
+++ b/storage/innobase/row/row0trunc.cc
@@ -1950,16 +1950,11 @@ row_truncate_table_for_mysql(
return(row_truncate_complete(
table, trx, fsp_flags, logger, DB_ERROR));
}
- }
- DBUG_EXECUTE_IF("ib_trunc_crash_after_redo_log_write_complete",
- log_buffer_flush_to_disk();
- os_thread_sleep(3000000);
- DBUG_SUICIDE(););
-
- /* Step-9: Drop all indexes (free index pages associated with these
- indexes) */
- if (!dict_table_is_temporary(table)) {
+ DBUG_EXECUTE_IF("ib_trunc_crash_after_redo_log_write_complete",
+ log_buffer_flush_to_disk();
+ os_thread_sleep(3000000);
+ DBUG_SUICIDE(););
DropIndex dropIndex(table, no_redo);
@@ -1974,7 +1969,10 @@ row_truncate_table_for_mysql(
return(row_truncate_complete(
table, trx, fsp_flags, logger, err));
}
+
+ dict_table_get_first_index(table)->remove_instant();
} else {
+ ut_ad(!table->is_instant());
/* For temporary tables we don't have entries in SYSTEM TABLES*/
ut_ad(fsp_is_system_temporary(table->space));
for (dict_index_t* index = UT_LIST_GET_FIRST(table->indexes);
@@ -2108,6 +2106,8 @@ row_truncate_table_for_mysql(
trx_commit_for_mysql(trx);
}
+ ut_ad(!table->is_instant());
+
return(row_truncate_complete(table, trx, fsp_flags, logger, err));
}
diff --git a/storage/innobase/row/row0uins.cc b/storage/innobase/row/row0uins.cc
index b50b4e94cfb..6628007909e 100644
--- a/storage/innobase/row/row0uins.cc
+++ b/storage/innobase/row/row0uins.cc
@@ -78,6 +78,7 @@ row_undo_ins_remove_clust_rec(
mtr.start();
if (index->table->is_temporary()) {
+ ut_ad(node->rec_type == TRX_UNDO_INSERT_REC);
mtr.set_log_mode(MTR_LOG_NO_REDO);
} else {
mtr.set_named_space(index->space);
@@ -120,10 +121,11 @@ row_undo_ins_remove_clust_rec(
mem_heap_free(heap);
}
- if (node->table->id == DICT_INDEXES_ID) {
-
+ switch (node->table->id) {
+ case DICT_INDEXES_ID:
ut_ad(!online);
ut_ad(node->trx->dict_operation_lock_mode == RW_X_LATCH);
+ ut_ad(node->rec_type == TRX_UNDO_INSERT_REC);
dict_drop_index_tree(
btr_pcur_get_rec(&node->pcur), &(node->pcur), &mtr);
@@ -135,6 +137,54 @@ row_undo_ins_remove_clust_rec(
success = btr_pcur_restore_position(
BTR_MODIFY_LEAF, &node->pcur, &mtr);
ut_a(success);
+ break;
+ case DICT_COLUMNS_ID:
+ /* This is rolling back an INSERT into SYS_COLUMNS.
+ If it was part of an instant ADD COLUMN operation, we
+ must modify the table definition. At this point, any
+ corresponding operation to the 'default row' will have
+ been rolled back. */
+ ut_ad(!online);
+ ut_ad(node->trx->dict_operation_lock_mode == RW_X_LATCH);
+ ut_ad(node->rec_type == TRX_UNDO_INSERT_REC);
+ const rec_t* rec = btr_pcur_get_rec(&node->pcur);
+ if (rec_get_n_fields_old(rec)
+ != DICT_NUM_FIELDS__SYS_COLUMNS) {
+ break;
+ }
+ ulint len;
+ const byte* data = rec_get_nth_field_old(
+ rec, DICT_FLD__SYS_COLUMNS__TABLE_ID, &len);
+ if (len != 8) {
+ break;
+ }
+ const table_id_t table_id = mach_read_from_8(data);
+ data = rec_get_nth_field_old(rec, DICT_FLD__SYS_COLUMNS__POS,
+ &len);
+ if (len != 4) {
+ break;
+ }
+ const unsigned pos = mach_read_from_4(data);
+ if (pos == 0 || pos >= (1U << 16)) {
+ break;
+ }
+ dict_table_t* table = dict_table_open_on_id(
+ table_id, true, DICT_TABLE_OP_OPEN_ONLY_IF_CACHED);
+ if (!table) {
+ break;
+ }
+
+ dict_index_t* index = dict_table_get_first_index(table);
+
+ if (index && index->is_instant()
+ && DATA_N_SYS_COLS + 1 + pos == table->n_cols) {
+ /* This is the rollback of an instant ADD COLUMN.
+ Remove the column from the dictionary cache,
+ but keep the system columns. */
+ table->rollback_instant(pos);
+ }
+
+ dict_table_close(table, true, false);
}
if (btr_cur_optimistic_delete(btr_cur, 0, &mtr)) {
@@ -177,6 +227,27 @@ retry:
func_exit:
btr_pcur_commit_specify_mtr(&node->pcur, &mtr);
+ if (err == DB_SUCCESS && node->rec_type == TRX_UNDO_INSERT_DEFAULT) {
+ /* When rolling back the very first instant ADD COLUMN
+ operation, reset the root page to the basic state. */
+ ut_ad(!index->table->is_temporary());
+ mtr.start();
+ if (page_t* root = btr_root_get(index, &mtr)) {
+ byte* page_type = root + FIL_PAGE_TYPE;
+ ut_ad(mach_read_from_2(page_type)
+ == FIL_PAGE_TYPE_INSTANT
+ || mach_read_from_2(page_type)
+ == FIL_PAGE_INDEX);
+ mtr.set_named_space(index->space);
+ mlog_write_ulint(page_type, FIL_PAGE_INDEX,
+ MLOG_2BYTES, &mtr);
+ byte* instant = PAGE_INSTANT + PAGE_HEADER + root;
+ mlog_write_ulint(instant,
+ page_ptr_get_direction(instant + 1),
+ MLOG_2BYTES, &mtr);
+ }
+ mtr.commit();
+ }
return(err);
}
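The rollback path above rewrites two 16-bit page-header fields with `mlog_write_ulint(..., MLOG_2BYTES, ...)`: the page type goes back to `FIL_PAGE_INDEX` and `PAGE_INSTANT` is cleared (keeping the page-direction bits stored in the same word). InnoDB page fields are big-endian; a sketch of the accessors in the style of `mach_write_to_2`/`mach_read_from_2` (not the InnoDB implementation):

```cpp
#include <cassert>
#include <cstdint>

// Big-endian 16-bit page-field accessors, as used for FIL_PAGE_TYPE
// and the PAGE_INSTANT word when resetting the root page.
inline void write2(uint8_t* p, uint32_t v)
{
    p[0] = uint8_t(v >> 8);
    p[1] = uint8_t(v);
}

inline uint32_t read2(const uint8_t* p)
{
    return (uint32_t(p[0]) << 8) | p[1];
}
```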
@@ -340,7 +411,7 @@ row_undo_ins_parse_undo_rec(
ptr = trx_undo_rec_get_pars(node->undo_rec, &type, &dummy,
&dummy_extern, &undo_no, &table_id);
- ut_ad(type == TRX_UNDO_INSERT_REC);
+ ut_ad(type == TRX_UNDO_INSERT_REC || type == TRX_UNDO_INSERT_DEFAULT);
node->rec_type = type;
node->update = NULL;
@@ -369,8 +440,14 @@ close_table:
clust_index = dict_table_get_first_index(node->table);
if (clust_index != NULL) {
- ptr = trx_undo_rec_get_row_ref(
- ptr, clust_index, &node->ref, node->heap);
+ if (type == TRX_UNDO_INSERT_REC) {
+ ptr = trx_undo_rec_get_row_ref(
+ ptr, clust_index, &node->ref,
+ node->heap);
+ } else {
+ ut_ad(type == TRX_UNDO_INSERT_DEFAULT);
+ node->ref = &trx_undo_default_rec;
+ }
if (!row_undo_search_clust_to_pcur(node)) {
/* An error probably occurred during
@@ -484,18 +561,28 @@ row_undo_ins(
node->index = dict_table_get_first_index(node->table);
ut_ad(dict_index_is_clust(node->index));
- /* Skip the clustered index (the first index) */
- node->index = dict_table_get_next_index(node->index);
- dict_table_skip_corrupt_index(node->index);
+ switch (node->rec_type) {
+ default:
+ ut_ad(!"wrong undo record type");
+ case TRX_UNDO_INSERT_REC:
+ /* Skip the clustered index (the first index) */
+ node->index = dict_table_get_next_index(node->index);
- err = row_undo_ins_remove_sec_rec(node, thr);
+ dict_table_skip_corrupt_index(node->index);
- if (err == DB_SUCCESS) {
+ err = row_undo_ins_remove_sec_rec(node, thr);
+
+ if (err != DB_SUCCESS) {
+ break;
+ }
+ /* fall through */
+ case TRX_UNDO_INSERT_DEFAULT:
log_free_check();
if (node->table->id == DICT_INDEXES_ID) {
+ ut_ad(node->rec_type == TRX_UNDO_INSERT_REC);
if (!dict_locked) {
mutex_enter(&dict_sys->mutex);
diff --git a/storage/innobase/row/row0umod.cc b/storage/innobase/row/row0umod.cc
index 049b1048724..ecf6b76a593 100644
--- a/storage/innobase/row/row0umod.cc
+++ b/storage/innobase/row/row0umod.cc
@@ -1175,6 +1175,21 @@ close_table:
node->heap, &(node->update));
node->new_trx_id = trx_id;
node->cmpl_info = cmpl_info;
+ ut_ad(!node->ref->info_bits);
+
+ if (node->update->info_bits & REC_INFO_MIN_REC_FLAG) {
+ /* This must be an undo log record for a subsequent
+ instant ADD COLUMN on a table, extending the
+ 'default value' record. */
+ ut_ad(clust_index->is_instant());
+ if (node->update->info_bits != REC_INFO_MIN_REC_FLAG) {
+ ut_ad(!"wrong info_bits in undo log record");
+ goto close_table;
+ }
+ node->update->info_bits = REC_INFO_DEFAULT_ROW;
+ const_cast<dtuple_t*>(node->ref)->info_bits
+ = REC_INFO_DEFAULT_ROW;
+ }
if (!row_undo_search_clust_to_pcur(node)) {
/* As long as this rolling-back transaction exists,
@@ -1236,6 +1251,12 @@ row_undo_mod(
node->index = dict_table_get_first_index(node->table);
ut_ad(dict_index_is_clust(node->index));
+
+ if (node->ref->info_bits) {
+ ut_ad(node->ref->info_bits == REC_INFO_DEFAULT_ROW);
+ goto rollback_clust;
+ }
+
/* Skip the clustered index (the first index) */
node->index = dict_table_get_next_index(node->index);
@@ -1258,6 +1279,7 @@ row_undo_mod(
}
if (err == DB_SUCCESS) {
+rollback_clust:
err = row_undo_mod_clust(node, thr);
bool update_statistics
diff --git a/storage/innobase/row/row0undo.cc b/storage/innobase/row/row0undo.cc
index 8826ebdd0cb..a1bb4cb7dbd 100644
--- a/storage/innobase/row/row0undo.cc
+++ b/storage/innobase/row/row0undo.cc
@@ -1,6 +1,7 @@
/*****************************************************************************
Copyright (c) 1997, 2016, Oracle and/or its affiliates. All Rights Reserved.
+Copyright (c) 2017, MariaDB Corporation.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
@@ -225,10 +226,14 @@ row_undo_search_clust_to_pcur(
}
if (node->rec_type == TRX_UNDO_UPD_EXIST_REC) {
+ ut_ad(node->row->info_bits == REC_INFO_MIN_REC_FLAG
+ || node->row->info_bits == 0);
node->undo_row = dtuple_copy(node->row, node->heap);
row_upd_replace(node->undo_row, &node->undo_ext,
clust_index, node->update, node->heap);
} else {
+ ut_ad((node->row->info_bits == REC_INFO_MIN_REC_FLAG)
+ == (node->rec_type == TRX_UNDO_INSERT_DEFAULT));
node->undo_row = NULL;
node->undo_ext = NULL;
}
diff --git a/storage/innobase/row/row0upd.cc b/storage/innobase/row/row0upd.cc
index 2eceef14025..2499aff2e2e 100644
--- a/storage/innobase/row/row0upd.cc
+++ b/storage/innobase/row/row0upd.cc
@@ -595,6 +595,7 @@ row_upd_changes_field_size_or_external(
new_val = &(upd_field->new_val);
new_len = dfield_get_len(new_val);
+ ut_ad(new_len != UNIV_SQL_DEFAULT);
if (dfield_is_null(new_val) && !rec_offs_comp(offsets)) {
/* A bug fixed on Dec 31st, 2004: we looked at the
@@ -697,6 +698,20 @@ row_upd_rec_in_place(
ut_ad(rec_offs_validate(rec, index, offsets));
if (rec_offs_comp(offsets)) {
+#ifdef UNIV_DEBUG
+ switch (rec_get_status(rec)) {
+ case REC_STATUS_ORDINARY:
+ break;
+ case REC_STATUS_COLUMNS_ADDED:
+ ut_ad(index->is_instant());
+ break;
+ case REC_STATUS_INFIMUM:
+ case REC_STATUS_SUPREMUM:
+ case REC_STATUS_NODE_PTR:
+ ut_ad(!"wrong record status in update");
+ }
+#endif /* UNIV_DEBUG */
+
rec_set_info_bits_new(rec, update->info_bits);
} else {
rec_set_info_bits_old(rec, update->info_bits);
@@ -978,6 +993,7 @@ row_upd_build_sec_rec_difference_binary(
ut_ad(rec_offs_validate(rec, index, offsets));
ut_ad(rec_offs_n_fields(offsets) == dtuple_get_n_fields(entry));
ut_ad(!rec_offs_any_extern(offsets));
+ ut_ad(!rec_offs_any_default(offsets));
update = upd_create(dtuple_get_n_fields(entry), heap);
@@ -1076,8 +1092,7 @@ row_upd_build_difference_binary(
}
for (i = 0; i < n_fld; i++) {
-
- data = rec_get_nth_field(rec, offsets, i, &len);
+ data = rec_get_nth_cfield(rec, index, offsets, i, &len);
dfield = dtuple_get_nth_field(entry, i);
@@ -1308,41 +1323,25 @@ row_upd_index_replace_new_col_val(
}
}
-/***********************************************************//**
-Replaces the new column values stored in the update vector to the index entry
-given. */
+/** Apply an update vector to an index entry.
+@param[in,out] entry index entry to be updated; the clustered index record
+ must be covered by a lock or a page latch to prevent
+ deletion (rollback or purge)
+@param[in] index index of the entry
+@param[in] update update vector built for the entry
+@param[in,out] heap memory heap for copying off-page columns */
void
row_upd_index_replace_new_col_vals_index_pos(
-/*=========================================*/
- dtuple_t* entry, /*!< in/out: index entry where replaced;
- the clustered index record must be
- covered by a lock or a page latch to
- prevent deletion (rollback or purge) */
- dict_index_t* index, /*!< in: index; NOTE that this may also be a
- non-clustered index */
- const upd_t* update, /*!< in: an update vector built for the index so
- that the field number in an upd_field is the
- index position */
- ibool order_only,
- /*!< in: if TRUE, limit the replacement to
- ordering fields of index; note that this
- does not work for non-clustered indexes. */
- mem_heap_t* heap) /*!< in: memory heap for allocating and
- copying the new values */
+ dtuple_t* entry,
+ const dict_index_t* index,
+ const upd_t* update,
+ mem_heap_t* heap)
{
- ulint i;
- ulint n_fields;
const page_size_t& page_size = dict_table_page_size(index->table);
dtuple_set_info_bits(entry, update->info_bits);
- if (order_only) {
- n_fields = dict_index_get_n_unique(index);
- } else {
- n_fields = dict_index_get_n_fields(index);
- }
-
- for (i = 0; i < n_fields; i++) {
+ for (unsigned i = index->n_fields; i--; ) {
const dict_field_t* field;
const dict_col_t* col;
const upd_field_t* uf;
@@ -2045,16 +2044,19 @@ row_upd_copy_columns(
/*=================*/
rec_t* rec, /*!< in: record in a clustered index */
const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
+ const dict_index_t* index, /*!< in: index of rec */
sym_node_t* column) /*!< in: first column in a column list, or
NULL */
{
- byte* data;
+ ut_ad(dict_index_is_clust(index));
+
+ const byte* data;
ulint len;
while (column) {
- data = rec_get_nth_field(rec, offsets,
- column->field_nos[SYM_CLUST_FIELD_NO],
- &len);
+ data = rec_get_nth_cfield(
+ rec, index, offsets,
+ column->field_nos[SYM_CLUST_FIELD_NO], &len);
eval_node_copy_and_alloc_val(column, data, len);
column = UT_LIST_GET_NEXT(col_var_list, column);
@@ -2588,6 +2590,7 @@ row_upd_clust_rec_by_insert_inherit_func(
#ifdef UNIV_DEBUG
if (UNIV_LIKELY(rec != NULL)) {
+ ut_ad(!rec_offs_nth_default(offsets, i));
const byte* rec_data
= rec_get_nth_field(rec, offsets, i, &len);
ut_ad(len == dfield_get_len(dfield));
@@ -2672,6 +2675,7 @@ row_upd_clust_rec_by_insert(
entry = row_build_index_entry_low(node->upd_row, node->upd_ext,
index, heap, ROW_BUILD_FOR_INSERT);
+ if (index->is_instant()) entry->trim(*index);
ut_ad(dtuple_get_info_bits(entry) == 0);
row_upd_index_entry_sys_field(entry, index, DATA_TRX_ID, trx->id);
@@ -3170,7 +3174,7 @@ row_upd_clust_step(
if (UNIV_UNLIKELY(!node->in_mysql_interface)) {
/* Copy the necessary columns from clust_rec and calculate the
new values to set */
- row_upd_copy_columns(rec, offsets,
+ row_upd_copy_columns(rec, offsets, index,
UT_LIST_GET_FIRST(node->columns));
row_upd_eval_new_vals(node->update);
}
diff --git a/storage/innobase/row/row0vers.cc b/storage/innobase/row/row0vers.cc
index a659042bb2f..8c9919511ad 100644
--- a/storage/innobase/row/row0vers.cc
+++ b/storage/innobase/row/row0vers.cc
@@ -1210,7 +1210,7 @@ row_vers_build_for_consistent_read(
in_heap, rec_offs_size(*offsets)));
*old_vers = rec_copy(buf, prev_version, *offsets);
- rec_offs_make_valid(*old_vers, index, *offsets);
+ rec_offs_make_valid(*old_vers, index, true, *offsets);
if (vrow && *vrow) {
*vrow = dtuple_copy(*vrow, in_heap);
@@ -1337,7 +1337,7 @@ committed_version_trx:
in_heap, rec_offs_size(*offsets)));
*old_vers = rec_copy(buf, version, *offsets);
- rec_offs_make_valid(*old_vers, index, *offsets);
+ rec_offs_make_valid(*old_vers, index, true, *offsets);
if (vrow && *vrow) {
*vrow = dtuple_copy(*vrow, in_heap);
dtuple_dup_v_fld(*vrow, in_heap);
diff --git a/storage/innobase/trx/trx0rec.cc b/storage/innobase/trx/trx0rec.cc
index 3e874bbeed3..931d50e4b82 100644
--- a/storage/innobase/trx/trx0rec.cc
+++ b/storage/innobase/trx/trx0rec.cc
@@ -41,6 +41,16 @@ Created 3/26/1996 Heikki Tuuri
#include "fsp0sysspace.h"
#include "row0mysql.h"
+/** The search tuple corresponding to TRX_UNDO_INSERT_DEFAULT */
+const dtuple_t trx_undo_default_rec = {
+ REC_INFO_DEFAULT_ROW, 0, 0,
+ NULL, 0, NULL,
+ UT_LIST_NODE_T(dtuple_t)()
+#ifdef UNIV_DEBUG
+ , DATA_TUPLE_MAGIC_N
+#endif /* UNIV_DEBUG */
+};
+
/*=========== UNDO LOG RECORD CREATION AND DECODING ====================*/
/**********************************************************************//**
@@ -499,6 +509,13 @@ trx_undo_page_report_insert(
/*----------------------------------------*/
/* Store then the fields required to uniquely determine the record
to be inserted in the clustered index */
+ if (UNIV_UNLIKELY(clust_entry->info_bits)) {
+ ut_ad(clust_entry->info_bits == REC_INFO_DEFAULT_ROW);
+ ut_ad(index->is_instant());
+ ut_ad(undo_page[first_free + 2] == TRX_UNDO_INSERT_REC);
+ undo_page[first_free + 2] = TRX_UNDO_INSERT_DEFAULT;
+ goto done;
+ }
for (i = 0; i < dict_index_get_n_unique(index); i++) {
@@ -530,6 +547,7 @@ trx_undo_page_report_insert(
}
}
+done:
return(trx_undo_page_set_next_prev_and_add(undo_page, ptr, mtr));
}
@@ -966,6 +984,8 @@ trx_undo_page_report_modify(
for (i = 0; i < dict_index_get_n_unique(index); i++) {
+ /* The ordering columns must not be instantly added columns. */
+ ut_ad(!rec_offs_nth_default(offsets, i));
field = rec_get_nth_field(rec, offsets, i, &flen);
/* The ordering columns must not be stored externally. */
@@ -1081,8 +1101,8 @@ trx_undo_page_report_modify(
flen, max_v_log_len);
}
} else {
- field = rec_get_nth_field(rec, offsets,
- pos, &flen);
+ field = rec_get_nth_cfield(
+ rec, index, offsets, pos, &flen);
}
if (trx_undo_left(undo_page, ptr) < 15) {
@@ -1222,8 +1242,8 @@ trx_undo_page_report_modify(
ptr += mach_write_compressed(ptr, pos);
/* Save the old value of field */
- field = rec_get_nth_field(rec, offsets, pos,
- &flen);
+ field = rec_get_nth_cfield(
+ rec, index, offsets, pos, &flen);
if (rec_offs_nth_extern(offsets, pos)) {
const dict_col_t* col =
@@ -1504,6 +1524,7 @@ trx_undo_update_rec_get_update(
ulint orig_len;
bool is_virtual;
+ upd_field = upd_get_nth_field(update, i);
field_no = mach_read_next_compressed(&ptr);
is_virtual = (field_no >= REC_MAX_N_FIELDS);
@@ -1515,27 +1536,6 @@ trx_undo_update_rec_get_update(
index->table, ptr, first_v_col, &is_undo_log,
&field_no);
first_v_col = false;
- } else if (field_no >= dict_index_get_n_fields(index)) {
- ib::error() << "Trying to access update undo rec"
- " field " << field_no
- << " in index " << index->name
- << " of table " << index->table->name
- << " but index has only "
- << dict_index_get_n_fields(index)
- << " fields " << BUG_REPORT_MSG
- << ". Run also CHECK TABLE "
- << index->table->name << "."
- " n_fields = " << n_fields << ", i = " << i
- << ", ptr " << ptr;
-
- ut_ad(0);
- *upd = NULL;
- return(NULL);
- }
-
- upd_field = upd_get_nth_field(update, i);
-
- if (is_virtual) {
/* This column could be dropped or no longer indexed */
if (field_no == ULINT_UNDEFINED) {
/* Mark this is no longer needed */
@@ -1549,10 +1549,31 @@ trx_undo_update_rec_get_update(
continue;
}
- upd_field_set_v_field_no(
- upd_field, field_no, index);
- } else {
+ upd_field_set_v_field_no(upd_field, field_no, index);
+ } else if (field_no < index->n_fields) {
upd_field_set_field_no(upd_field, field_no, index);
+ } else if (update->info_bits == REC_INFO_MIN_REC_FLAG
+ && index->is_instant()) {
+ /* This must be a rollback of a subsequent
+ instant ADD COLUMN operation. This will be
+ detected and handled by btr_cur_trim(). */
+ upd_field->field_no = field_no;
+ upd_field->orig_len = 0;
+ } else {
+ ib::error() << "Trying to access update undo rec"
+ " field " << field_no
+ << " in index " << index->name
+ << " of table " << index->table->name
+ << " but index has only "
+ << dict_index_get_n_fields(index)
+ << " fields " << BUG_REPORT_MSG
+ << ". Run also CHECK TABLE "
+ << index->table->name << "."
+ " n_fields = " << n_fields << ", i = " << i;
+
+ ut_ad(0);
+ *upd = NULL;
+ return(NULL);
}
ptr = trx_undo_rec_get_col_val(ptr, &field, &len, &orig_len);
@@ -1887,10 +1908,15 @@ trx_undo_report_row_operation(
} else {
ut_ad(!trx->read_only);
ut_ad(trx->id);
- /* Keep INFORMATION_SCHEMA.TABLES.UPDATE_TIME
- up-to-date for persistent tables. Temporary tables are
- not listed there. */
- trx->mod_tables.insert(index->table);
+ if (UNIV_LIKELY(!clust_entry || clust_entry->info_bits
+ != REC_INFO_DEFAULT_ROW)) {
+ /* Keep INFORMATION_SCHEMA.TABLES.UPDATE_TIME
+ up-to-date for persistent tables outside
+ instant ADD COLUMN. */
+ trx->mod_tables.insert(index->table);
+ } else {
+ ut_ad(index->is_instant());
+ }
pundo = &trx->rsegs.m_redo.undo;
rseg = trx->rsegs.m_redo.rseg;
@@ -2291,7 +2317,7 @@ trx_undo_prev_version_build(
heap, rec_offs_size(offsets)));
*old_vers = rec_copy(buf, rec, offsets);
- rec_offs_make_valid(*old_vers, index, offsets);
+ rec_offs_make_valid(*old_vers, index, true, offsets);
row_upd_rec_in_place(*old_vers, index, offsets, update, NULL);
}
diff --git a/storage/innobase/trx/trx0roll.cc b/storage/innobase/trx/trx0roll.cc
index 946c90f457c..0d2d6beac90 100644
--- a/storage/innobase/trx/trx0roll.cc
+++ b/storage/innobase/trx/trx0roll.cc
@@ -1012,7 +1012,7 @@ trx_roll_pop_top_rec_of_trx(trx_t* trx, roll_ptr_t* roll_ptr, mem_heap_t* heap)
if the transaction object is committed and reused
later, we will default to a full ROLLBACK. */
trx->roll_limit = 0;
- ut_d(trx->in_rollback = false);
+ trx->in_rollback = false;
mutex_exit(&trx->undo_mutex);
return(NULL);
}
@@ -1028,10 +1028,20 @@ trx_roll_pop_top_rec_of_trx(trx_t* trx, roll_ptr_t* roll_ptr, mem_heap_t* heap)
trx_undo_rec_t* undo_rec = trx_roll_pop_top_rec(trx, undo, &mtr);
const undo_no_t undo_no = trx_undo_rec_get_undo_no(undo_rec);
- if (trx_undo_rec_get_type(undo_rec) == TRX_UNDO_INSERT_REC) {
+ switch (trx_undo_rec_get_type(undo_rec)) {
+ case TRX_UNDO_INSERT_DEFAULT:
+ /* This record type was introduced in MDEV-11369
+ instant ADD COLUMN, which was implemented after
+ MDEV-12288 removed the insert_undo log. There is no
+ instant ADD COLUMN for temporary tables. Therefore,
+ this record can only be present in the main undo log. */
+ ut_ad(undo == update);
+ /* fall through */
+ case TRX_UNDO_INSERT_REC:
ut_ad(undo == insert || undo == update || undo == temp);
*roll_ptr |= 1ULL << ROLL_PTR_INSERT_FLAG_POS;
- } else {
+ break;
+ default:
ut_ad(undo == update || undo == temp);
}
@@ -1119,7 +1129,7 @@ trx_rollback_start(
ut_ad(!trx->in_rollback);
trx->roll_limit = roll_limit;
- ut_d(trx->in_rollback = true);
+ trx->in_rollback = true;
ut_a(trx->roll_limit <= trx->undo_no);
diff --git a/storage/innobase/trx/trx0trx.cc b/storage/innobase/trx/trx0trx.cc
index 323748733be..20fed90c712 100644
--- a/storage/innobase/trx/trx0trx.cc
+++ b/storage/innobase/trx/trx0trx.cc
@@ -3014,15 +3014,6 @@ trx_start_for_ddl_low(
return;
case TRX_STATE_ACTIVE:
-
- /* We have this start if not started idiom, therefore we
- can't add stronger checks here. */
- trx->ddl = true;
-
- ut_ad(trx->dict_operation != TRX_DICT_OP_NONE);
- ut_ad(trx->will_lock > 0);
- return;
-
case TRX_STATE_PREPARED:
case TRX_STATE_COMMITTED_IN_MEMORY:
break;