fil_addr_t: Use exactly sized data types.
flst_read_addr(): Remove the unused parameter mtr.
page_offset(): Return uint16_t.
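
(For illustration only: a minimal sketch of what "exactly sized" types could
look like here, assuming a 16KiB default page size; the struct and field
names below are assumptions, not necessarily those used in the patch.)

  #include <cstdint>

  /* Hypothetical file address: page number plus byte offset within the
  page, using fixed-width integer types. */
  struct fil_addr_sketch_t
  {
    uint32_t page;     /* page number within a tablespace */
    uint16_t boffset;  /* byte offset within the page */
  };

  /* The byte offset of a pointer within its page fits in 16 bits for the
  assumed 16KiB page size, hence a uint16_t return type. */
  inline uint16_t page_offset_sketch(const void *ptr)
  {
    return static_cast<uint16_t>(reinterpret_cast<uintptr_t>(ptr) & 0x3fff);
  }
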
Revert part of commit 6cedb671e99038f1a10e0d8504f835aaabed9780
because it turns out to be theoretically impossible to parse a
ROW_FORMAT=COMPACT or ROW_FORMAT=DYNAMIC metadata record where
the variable-length fields in the PRIMARY KEY have been written
as nonempty strings.
As noted in commit abd45cdc38e72ce329365ffe0df4c6f8c319b407
a search with PAGE_CUR_GE may land on the supremum record on
a leaf page that is not the rightmost leaf page. This could occur
when all keys on the current page are smaller than the search key,
and the smallest key on the successor page is larger than the search key.
Hence, after a failed PAGE_CUR_GE search, assertions of
btr_pcur_is_after_last_in_tree() are bogus
and should be replaced with btr_pcur_is_after_last_on_page().
An unsigned type was used while the RHS expression could be less than 0.
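
(For context, a tiny self-contained example of the class of bug being
fixed; this is not the actual code from the patch.)

  #include <cstdio>

  int main()
  {
    unsigned long n_rows = 3, n_deleted = 5;
    /* Bug pattern: the RHS can be negative, but with an unsigned type the
    result silently wraps around to a huge value. */
    unsigned long remaining = n_rows - n_deleted;
    /* A safe variant clamps at zero (or uses a signed type). */
    unsigned long remaining_safe = n_rows > n_deleted ? n_rows - n_deleted : 0;
    std::printf("%lu vs %lu\n", remaining, remaining_safe);
    return 0;
  }
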
Introduce memcpy_aligned<N>(), memcmp_aligned<N>(), memset_aligned<N>()
and use them for accessing InnoDB page header fields that are known
to be aligned.
MY_ASSUME_ALIGNED(): Wrapper for the GCC/clang __builtin_assume_aligned().
Nothing similar seems to exist in Microsoft Visual Studio, and the
C++20 std::assume_aligned is not available to us yet.
Explicitly specified alignment guarantees allow compilers to generate
faster code on platforms with strict alignment rules, instead of
emitting calls to potentially unaligned memcpy(), memcmp(), or memset().
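
(A rough sketch of how such wrappers can be built on top of
__builtin_assume_aligned(); the bodies below are illustrative assumptions,
not the exact definitions from this commit.)

  #include <cstddef>
  #include <cstring>

  #if defined __GNUC__ || defined __clang__
  # define MY_ASSUME_ALIGNED(ptr, align) __builtin_assume_aligned(ptr, align)
  #else
  # define MY_ASSUME_ALIGNED(ptr, align) (ptr) /* no MSVC equivalent */
  #endif

  /* Telling the compiler about the alignment lets it emit wide aligned
  loads and stores instead of calling the generic memcpy(). */
  template<size_t Align>
  inline void *memcpy_aligned(void *dest, const void *src, size_t n)
  {
    static_assert(Align && !(Align & (Align - 1)), "power of two");
    return std::memcpy(MY_ASSUME_ALIGNED(dest, Align),
                       MY_ASSUME_ALIGNED(src, Align), n);
  }

memcmp_aligned<N>() and memset_aligned<N>() would follow the same pattern.
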
non-indexed column
We must relax overly strict debug assertions. For latin1_swedish_ci,
mtype=DATA_CHAR or mtype=DATA_VARCHAR will be used instead of
mtype=DATA_MYSQL or mtype=DATA_VARMYSQL. Likewise, some changes of
dtype_get_charset_coll() do not affect the data type encoding,
but only affect any indexes that are defined on the column.
Charset::same_encoding(): Check whether two charset-collations have
the same character set encoding.
dict_col_t::same_encoding(): Check whether two character columns
have the same character set encoding.
dict_col_t::same_type(): Check whether two columns have a compatible
data type encoding.
dict_col_t::same_format(), dict_table_t::instant_column(): Do not
compare mtype or the charset-collation of prtype directly.
Rely on dict_col_t::same_type() instead.
dtype_get_charset_coll(): Narrow the return type to uint16_t.
This is a refined version of a fix that was developed by
Thirunarayanan Balathandayuthapani.
row_log_table_get_pk_col
row_log_table_get_pk_col(): Read the instant field value from the
instantly altered table when it is required.
The original crash happened when the async replication IO thread was updating the mysql.gtid_slave_pos table. Operations on this table should remain node local, but it appears that the protection (the THD::wsrep_ignore_table flag) to prevent wsrep replication for this table was missing for the InnoDB write_row() and update_row() calls.
It was somewhat difficult to reproduce the issue, because mtr seems to create the affected table mysql.gtid_log_pos with the Aria engine, and Aria engine operations are not replicated anyway. It appears, though, that in a release installation, the mysql.gtid_slave_pos table uses the InnoDB engine.
It was possible to trigger a somewhat related problem by running the test galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB. However, this test mode causes an earlier crash when the replication background thread creates the additional table mysql.gtid_slave_pos_InnoDB; this table creation triggered wsrep TOI replication, which also failed with an assertion. Actually, the async replication IO and background threads should not replicate anything to the cluster.
This pull request contains a new test, galera.galera_as_slave_gtid_auto_engine, which basically just runs galera.galera_as_slave_gtid with the configuration gtid_pos_auto_engines=InnoDB.
Test galera.galera_as_slave_gtid is also modified for better code reuse.
The actual fix for MDEV-21096 is in storage/innobase/handler/ha_innodb.cc, where the THD::wsrep_ignore_table flag is now honored before wsrep key population.
There is an additional fix in sql/service_wsrep.cc, where the async replication IO and background threads are marked as non-local. This fences these threads out of wsrep replication altogether. Note that this change actually makes the use of THD::wsrep_ignore_table redundant. We may want to refactor THD::wsrep_ignore_table out in the future, if there is no other use case for it in sight.
dict_stats_shutdown() can hang, waiting for the timer callback to finish.
This happens because dict_stats_schedule() locks the same mutex that can
also be used inside the timer callback.
The fix is to make dict_stats_schedule() use mutex.try_lock() instead of
mutex.lock().
In the unlikely case of simultaneous dict_stats_schedule() calls setting
different timer delays, the first one now wins, which is fine.
What is important is that shutdown will not hang.
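
(A minimal sketch of the try_lock() approach, using stand-ins for the
actual mutex and scheduling code.)

  #include <mutex>

  static std::mutex recalc_mutex;   /* stand-in for the shared mutex */

  void dict_stats_schedule_sketch(int delay_ms)
  {
    /* If the timer callback (or a concurrent caller) holds the mutex,
    give up instead of blocking: the first scheduler wins, and shutdown
    can never end up waiting on the callback. */
    if (!recalc_mutex.try_lock())
      return;
    /* ... arm the timer to fire after delay_ms ... */
    (void) delay_ms;
    recalc_mutex.unlock();
  }
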
At each mini-transaction commit, the log sequence number of the
mini-transaction must be written to each modified page, so that
it will be available in the FIL_PAGE_LSN field when the page is
being read in crash recovery.
InnoDB was unnecessarily allocating redundant storage for the
field, in buf_page_t::newest_modification. Let us access
FIL_PAGE_LSN directly.
Furthermore, on ALTER TABLE...IMPORT TABLESPACE, let us write
0 to FIL_PAGE_LSN instead of using log_sys.lsn.
buf_flush_init_for_writing(), buf_flush_update_zip_checksum(),
fil_encrypt_buf_for_full_crc32(), fil_encrypt_buf(),
fil_space_encrypt(): Remove the parameter lsn.
buf_page_get_newest_modification(): Merge with the only caller.
buf_tmp_reserve_compression_buf(), buf_tmp_page_encrypt(),
buf_page_encrypt(): Define static in the same compilation unit
with the only caller.
PageConverter::m_current_lsn: Remove. Write 0 to FIL_PAGE_LSN
on ALTER TABLE...IMPORT TABLESPACE.
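
(A hedged illustration of writing the commit LSN straight into the page
frame; the big-endian 8-byte layout and the header offset of 16 follow
common InnoDB conventions, but are assumptions here rather than code from
the patch.)

  #include <cstdint>

  static constexpr unsigned FIL_PAGE_LSN_OFFSET = 16; /* assumed offset */

  /* Store the 64-bit LSN in big-endian order in the page header, so that
  crash recovery can read it back from FIL_PAGE_LSN. */
  inline void write_page_lsn_sketch(unsigned char *frame, uint64_t lsn)
  {
    for (int i = 7; i >= 0; i--)
    {
      frame[FIL_PAGE_LSN_OFFSET + i] = static_cast<unsigned char>(lsn);
      lsn >>= 8;
    }
  }
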
btr_cur_instant_init_low(): Accurately parse the metadata record
header for ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPACT. CHAR columns
used to be unnecessarily written as nonempty strings of bytes.
For ROW_FORMAT=REDUNDANT, we must reserve fixed-length dummy values
for the CHAR columns in the metadata record. This is because in
MariaDB Server 10.4, btr_cur_instant_init_low() will rely on
dict_index_t::trx_id_offset being accurate for the metadata record.
In MariaDB Server 10.4, btr_cur_instant_init_low() assumes that
all PRIMARY KEY columns that are internally variable-length will
be encoded in 0 bytes in the metadata record. Sometimes, CHAR
columns can be encoded as variable-length. We should not
unnecessarily reserve space for a dummy string value in the
metadata record.
Currently InnoDB uses an internal parser for adding foreign keys. Remove
the internal parser and use the data parsed by the SQL parser (sql_yacc)
for adding foreign keys.
- create_table_info_t::create_foreign_keys() is the replacement for
dict_create_foreign_constraints_low();
- Pass the constraint name via the Foreign_key object.
Temporary until MDEV-20865:
- Pass alter_info as part of create_info.
DropIndex, CreateIndex: Remove. The file row0trunc.cc only exists
in MariaDB Server 10.3 so that the crash recovery of TRUNCATE TABLE
operations from older 10.2 and 10.3 servers will work. This dead code
was being used for implementing the MySQL 5.7 WL#6501 TRUNCATE TABLE
that was replaced with a backup-safe implementation in MDEV-13564.
buf_read_ibuf_merge_pages(): Discard any page numbers that are
outside the current bounds of the tablespace, by invoking the
function ibuf_delete_recs() that was introduced in MDEV-20934.
This could avoid an infinite change buffer merge loop on
innodb_fast_shutdown=0, because normally the change buffer merge
would only be attempted if a page was successfully loaded into
the buffer pool.
dict_drop_index_tree(): Add the parameter trx_t*.
To prevent the DROP TABLE crash, do not invoke btr_free_if_exists()
if the entire .ibd file will be dropped. Thus, we will avoid a crash
if the BTR_SEG_LEAF or BTR_SEG_TOP of the index is corrupted,
and we will also avoid unnecessarily accessing the to-be-dropped
tablespace via the buffer pool.
In MariaDB 10.2, we disable the DROP TABLE fix if innodb_safe_truncate=0,
because the backup-unsafe MySQL 5.7 WL#6501 form of TRUNCATE TABLE
requires that the individual pages be freed inside the tablespace.
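
(A simplified, self-contained sketch of the bounds filtering described
above; ibuf_delete_recs() is the function named in the message, everything
else here is an illustrative stand-in.)

  #include <cstdint>
  #include <vector>

  /* Stand-in stub for the real ibuf_delete_recs() from MDEV-20934. */
  static void ibuf_delete_recs_sketch(uint32_t /*space_id*/, uint32_t /*page_no*/) {}

  /* Discard change buffer entries for pages beyond the current size of
  the tablespace instead of retrying the merge forever. */
  void merge_pages_sketch(uint32_t space_id, uint32_t space_size_in_pages,
                          const std::vector<uint32_t> &page_numbers)
  {
    for (uint32_t page_no : page_numbers)
    {
      if (page_no >= space_size_in_pages)
      {
        ibuf_delete_recs_sketch(space_id, page_no);
        continue;
      }
      /* ... schedule the page read; the merge happens when the page is
      loaded into the buffer pool ... */
    }
  }
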
error
Fix partitioning and DS-MRR to work together
- In ha_partition::index_end(): take into account that ha_innobase (and
other engines using DS-MRR) will have inited=RND when initialized for
DS-MRR scan.
- In ha_partition::multi_range_read_next(): if the MRR scan is using
HA_MRR_NO_ASSOCIATION mode, it is not guaranteed that the partition's
handler will store anything into *range_info.
- In DsMrr_impl::choose_mrr_impl(): ha_partition will inquire partitions
  about how much memory their MRR implementation needs by passing
  *buffer_size=0. The DS-MRR code didn't know about this (it actually used
  uint for the buffer size calculation and would have had an underflow).
  Returning *buffer_size=0 made ha_partition assume that partitions do
  not need MRR memory and pass the same buffer to each of them.
  Now, this is fixed. If DS-MRR gets *buffer_size=0, it will return
  the amount of buffer space needed, but not more than about
  @@mrr_buffer_size.
* Fix ha_{innobase,maria,myisam}::clone. If ha_partition uses MRR on its
  partitions, and the partitions use DS-MRR, the code will call
  handler->clone() with the TABLE name (*not* the partition name) as an
  argument. DS-MRR has no way of knowing the partition name, so the
  solution is to have the ::clone() function of the affected storage
  engines ignore the name argument and obtain the name elsewhere.
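
(A generic sketch of the *buffer_size==0 "how much memory would you need"
convention described above; the names and signature are illustrative, not
the actual DsMrr_impl interface.)

  #include <algorithm>
  #include <cstddef>

  /* When the caller (e.g. a partitioned handler probing its partitions)
  passes *buffer_size == 0, report the space this MRR implementation would
  want, capped at roughly the @@mrr_buffer_size limit, instead of treating
  0 as "no memory needed". */
  size_t choose_mrr_buffer_sketch(size_t n_ranges, size_t bytes_per_range,
                                  size_t mrr_buffer_size_limit,
                                  size_t *buffer_size)
  {
    const size_t wanted = n_ranges * bytes_per_range;
    if (*buffer_size == 0)
    {
      *buffer_size = std::min(wanted, mrr_buffer_size_limit);
      return *buffer_size;
    }
    return std::min(wanted, *buffer_size);
  }
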
When commit 09af00cbde1d62dfda574dee10e5c0fd240c3f7f
removed the crash-upgrade logic of old TRUNCATE TABLE
from MariaDB 10.2 and 10.3, it actually made the return
value of dict_drop_index_tree() redundant.
Add missing static qualifiers.
After MDEV-11556, not even crash recovery should attempt to access
non-existing pages. But buf_load() does not validate its input and
must thus be able to ignore missing pages; that is why
buf_read_page_background() tolerates them.
innodb_shutdown(): Invoke os_aio_free() before btr_search_sys_free().
Remove keywords that are too new.
Almost all dedicated threads are gone:
- The "ticking" threads that sleep a while and then do some work
  (srv_monitor_thread, srv_error_monitor_thread, srv_master_thread)
  were replaced with timers. Some timers are periodic,
  e.g. the "master" timer.
- The btr_defragment_thread is also replaced by a timer, which
  reschedules itself when the current defragment "item" needs throttling.
- The buf_resize_thread and buf_dump_thread are replaced with tasks.
  Ditto for the page cleaner workers.
- Purge worker threads are now tasks as well, and the purge
  coordinator is a combination of a task and a timer.
- All AIO is outsourced to tpool; InnoDB just calls thread_pool::submit_io()
  and provides the callback.
- The srv_slot_t was removed, and the innodb_debug_sync used in purge
  is currently not working and needs reimplementation.
Apart from page latches (buf_block_t::lock), mini-transactions
are keeping track of at most one dict_index_t::lock and
fil_space_t::latch at a time, and in a rare case, purge_sys.latch.
Let us introduce interfaces for acquiring an index latch
or a tablespace latch.
In a later version, we may want to introduce mtr_t members
for holding a latched dict_index_t* and fil_space_t*,
and replace the remaining use of mtr_t::m_memo
with std::set<buf_block_t*> or with a map<buf_block_t*,byte*>
pointing to log records.
In the test innodb.instant_alter,4k we would be flagging an error
for a too-large row size. That error was previously only being reported
if the table was being rebuilt. Thus, this merge is fixing a small
omission in MDEV-11369 (instant ADD COLUMN).
Move the row size check to the early CREATE/ALTER TABLE phase. Stop
checking on table open.
dict_index_add_to_cache(): remove the parameter 'strict', stop checking
the row size.
dict_index_t::record_size_info_t: the result of the row size check
operation.
create_table_info_t::row_size_is_acceptable(): performs the row size check.
Issues an error or a warning. Writes the first overflowing field to the
InnoDB log.
create_table_info_t::create_table(): add the row size check.
dict_index_t::record_size_info(): a refactored version of
dict_index_t::rec_potentially_too_big(). The new version does not change
the global state of the program but returns all the relevant information,
and it is the callers who decide how to handle a row size overflow.
dict_index_t::rec_potentially_too_big(): removed.
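
(A hedged sketch of the "compute and return, let callers decide" shape
described above; the struct fields are illustrative assumptions, not copied
from the patch.)

  #include <cstddef>

  struct record_size_info_sketch_t
  {
    size_t max_record_size = 0;      /* worst-case row size for the index */
    size_t page_limit = 0;           /* limit imposed by the page format */
    size_t first_overflow_field = 0; /* first offending field, if any */
    bool   overflows = false;

    bool row_is_too_big() const { return overflows; }
  };

  /* Pure computation: no global state is touched and no error is raised
  here. Callers such as CREATE TABLE or ALTER TABLE decide whether an
  overflow is a hard error or only a warning. */
  record_size_info_sketch_t record_size_info_sketch(size_t fixed_size,
                                                    size_t variable_size,
                                                    size_t page_limit)
  {
    record_size_info_sketch_t info;
    info.max_record_size = fixed_size + variable_size;
    info.page_limit = page_limit;
    info.overflows = info.max_record_size > page_limit;
    return info;
  }
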
memo_block_unfix(), memo_latch_release(): Merge to ReleaseLatches.
memo_slot_release(), ReleaseAll: Clean up the formatting.
A search with PAGE_CUR_GE may land on the supremum record on
a leaf page that is not the rightmost leaf page.
This could occur when all keys on the current page are
smaller than the search key, and the smallest key on the
successor page is larger than the search key.
ibuf_delete_recs(): Correct the debug assertion accordingly.
btr_create(), btr_root_raise_and_insert(): Write a MLOG_MEMSET record
to set FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL, instead of writing
two MLOG_4BYTES records.
For ROW_FORMAT=COMPRESSED pages, we will not use MLOG_MEMSET
because we want the crash-downgrade to earlier 10.4 releases to succeed.
mlog_parse_nbytes(): Relax the too strict assertion. There is no problem
with MLOG_MEMSET records that affect the uncompressed header of
ROW_FORMAT=COMPRESSED index pages.
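
(To illustrate why a single MLOG_MEMSET suffices: FIL_PAGE_PREV and
FIL_PAGE_NEXT are adjacent 4-byte fields, and FIL_NULL is all one-bits, so
both can be covered by one 8-byte fill. The offsets below follow the usual
InnoDB page header layout but are assumptions in this sketch.)

  #include <cstring>

  /* Assumed offsets: FIL_PAGE_PREV at byte 8, FIL_PAGE_NEXT at byte 12;
  FIL_NULL is 0xffffffff. */
  inline void init_prev_next_sketch(unsigned char *frame)
  {
    /* One 8-byte fill of 0xff sets both fields to FIL_NULL, which maps to
    a single MLOG_MEMSET record instead of two MLOG_4BYTES records. */
    std::memset(frame + 8, 0xff, 8);
  }
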
Basically, use more List<T>::iterator. This patch required adding two
more overloads to the new iterator for convenience.