summaryrefslogtreecommitdiff
path: root/storage/innobase/include/page0page.ic
diff options
context:
space:
mode:
authorMarko Mäkelä <marko.makela@mariadb.com>2020-02-13 19:12:17 +0200
committerMarko Mäkelä <marko.makela@mariadb.com>2020-02-13 19:12:17 +0200
commit7ae21b18a6b73bbc3bf1ff448faf60c29ac1d386 (patch)
treecd6d9521f59a8e1570897856bf94ab8c84e5bd83 /storage/innobase/include/page0page.ic
parent98690052017d2091a463d335eba2c6fc8cac7cb9 (diff)
downloadmariadb-git-7ae21b18a6b73bbc3bf1ff448faf60c29ac1d386.tar.gz
MDEV-12353: Change the redo log encoding
log_t::FORMAT_10_5: physical redo log format tag log_phys_t: Buffered records in the physical format. The log record bytes will follow the last data field, making use of alignment padding that would otherwise be wasted. If there are multiple records for the same page, also those may be appended to an existing log_phys_t object if the memory is available. In the physical format, the first byte of a record identifies the record and its length (up to 15 bytes). For longer records, the immediately following bytes will encode the remaining length in a variable-length encoding. Usually, a variable-length-encoded page identifier will follow, followed by optional payload, whose length is included in the initially encoded total record length. When a mini-transaction is updating multiple fields in a page, it can avoid repeating the tablespace identifier and page number by setting the same_page flag (most significant bit) in the first byte of the log record. The byte offset of the record will be relative to where the previous record for that page ended. Until MDEV-14425 introduces a separate file-level log for redo log checkpoints and file operations, we will write the file-level records in the page-level redo log file. The record FILE_CHECKPOINT (which replaces MLOG_CHECKPOINT) will be removed in MDEV-14425, and one sequential scan of the page recovery log will suffice. Compared to MLOG_FILE_CREATE2, FILE_CREATE will not include any flags. If the information is needed, it can be parsed from WRITE records that modify FSP_SPACE_FLAGS. MLOG_ZIP_WRITE_STRING: Remove. The record was only introduced temporarily as part of this work, before being replaced with WRITE (along with MLOG_WRITE_STRING, MLOG_1BYTE, MLOG_nBYTES). mtr_buf_t::empty(): Check if the buffer is empty. mtr_t::m_n_log_recs: Remove. It suffices to check if m_log is empty. mtr_t::m_last, mtr_t::m_last_offset: End of the latest m_log record, for the same_page encoding. page_recv_t::last_offset: Reflects mtr_t::m_last_offset. Valid values for last_offset during recovery should be 0 or above 8. (The first 8 bytes of a page are the checksum and the page number, and neither are ever updated directly by log records.) Internally, the special value 1 indicates that the same_page form will not be allowed for the subsequent record. mtr_t::page_create(): Take the block descriptor as parameter, so that it can be compared to mtr_t::m_last. The INIT_INDEX_PAGE record will always followed by a subtype byte, because same_page records must be longer than 1 byte. trx_undo_page_init(): Combine the writes in WRITE record. trx_undo_header_create(): Write 4 bytes using a special MEMSET record that includes 1 bytes of length and 2 bytes of payload. flst_write_addr(): Define as a static function. Combine the writes. flst_zero_both(): Replaces two flst_zero_addr() calls. flst_init(): Do not inline the function. fsp_free_seg_inode(): Zerofill the whole inode. fsp_apply_init_file_page(): Initialize FIL_PAGE_PREV,FIL_PAGE_NEXT to FIL_NULL when using the physical format. btr_create(): Assert !page_has_siblings() because fsp_apply_init_file_page() must have been invoked. fil_ibd_create(): Do not write FILE_MODIFY after FILE_CREATE. fil_names_dirty_and_write(): Remove the parameter mtr. Write the records using a separate mini-transaction object, because any FILE_ records must be at the start of a mini-transaction log. recv_recover_page(): Add a fil_space_t* parameter. After applying log to the a ROW_FORMAT=COMPRESSED page, invoke buf_zip_decompress() to restore the uncompressed page. buf_page_io_complete(): Remove the temporary hack to discard the uncompressed page of a ROW_FORMAT=COMPRESSED page. page_zip_write_header(): Remove. Use mtr_t::write() or mtr_t::memset() instead, and update the compressed page frame separately. trx_undo_header_add_space_for_xid(): Remove. trx_undo_seg_create(): Perform the changes that were previously made by trx_undo_header_add_space_for_xid(). btr_reset_instant(): New function: Reset the table to MariaDB 10.2 or 10.3 format when rolling back an instant ALTER TABLE operation. page_rec_find_owner_rec(): Merge with the only callers. page_cur_insert_rec_low(): Combine writes by using a local buffer. MEMMOVE data from the preceding record whenever feasible (copying at least 3 bytes). page_cur_insert_rec_zip(): Combine writes to page header fields. PageBulk::insertPage(): Issue MEMMOVE records to copy a matching part from the preceding record. PageBulk::finishPage(): Combine the writes to the page header and to the sparse page directory slots. mtr_t::write(): Only log the least significant (last) bytes of multi-byte fields that actually differ. For updating FSP_SIZE, we must always write all 4 bytes to the redo log, so that the fil_space_set_recv_size() logic in recv_sys_t::parse() will work. mtr_t::memcpy(), mtr_t::zmemcpy(): Take a pointer argument instead of a numeric offset to the page frame. Only log the last bytes of multi-byte fields that actually differ. In fil_space_crypt_t::write_page0(), we must log also any unchanged bytes, so that recovery will recognize the record and invoke fil_crypt_parse(). Future work: MDEV-21724 Optimize page_cur_insert_rec_low() redo logging MDEV-21725 Optimize btr_page_reorganize_low() redo logging MDEV-21727 Optimize redo logging for ROW_FORMAT=COMPRESSED
Diffstat (limited to 'storage/innobase/include/page0page.ic')
-rw-r--r--storage/innobase/include/page0page.ic57
1 files changed, 13 insertions, 44 deletions
diff --git a/storage/innobase/include/page0page.ic b/storage/innobase/include/page0page.ic
index 5cc6b4d9d50..8604f088adf 100644
--- a/storage/innobase/include/page0page.ic
+++ b/storage/innobase/include/page0page.ic
@@ -89,17 +89,14 @@ page_set_ssn_id(
node_seq_t ssn_id, /*!< in: transaction id */
mtr_t* mtr) /*!< in/out: mini-transaction */
{
- ut_ad(!mtr || mtr_memo_contains_flagged(mtr, block,
- MTR_MEMO_PAGE_SX_FIX
- | MTR_MEMO_PAGE_X_FIX));
-
- byte* ssn = block->frame + FIL_RTREE_SPLIT_SEQ_NUM;
- if (UNIV_LIKELY_NULL(page_zip)) {
- mach_write_to_8(ssn, ssn_id);
- page_zip_write_header(block, ssn, 8, mtr);
- } else {
- mtr->write<8,mtr_t::OPT>(*block, ssn, ssn_id);
- }
+ ut_ad(mtr_memo_contains_flagged(mtr, block,
+ MTR_MEMO_PAGE_SX_FIX | MTR_MEMO_PAGE_X_FIX));
+ ut_ad(!page_zip || page_zip == &block->page.zip);
+ constexpr uint16_t field= FIL_RTREE_SPLIT_SEQ_NUM;
+ byte *b= my_assume_aligned<2>(&block->frame[field]);
+ if (mtr->write<8,mtr_t::OPT>(*block, b, ssn_id) &&
+ UNIV_LIKELY_NULL(page_zip))
+ memcpy_aligned<2>(&page_zip->data[field], b, 8);
}
#endif /* !UNIV_INNOCHECKSUM */
@@ -133,15 +130,11 @@ Reset PAGE_LAST_INSERT.
@param[in,out] mtr mini-transaction */
inline void page_header_reset_last_insert(buf_block_t *block, mtr_t *mtr)
{
- byte *b= &block->frame[PAGE_HEADER + PAGE_LAST_INSERT];
-
- if (UNIV_LIKELY_NULL(block->page.zip.data))
- {
- mach_write_to_2(b, 0);
- page_zip_write_header(block, b, 2, mtr);
- }
- else
- mtr->write<2,mtr_t::OPT>(*block, b, 0U);
+ constexpr uint16_t field= PAGE_HEADER + PAGE_LAST_INSERT;
+ byte *b= my_assume_aligned<2>(&block->frame[field]);
+ if (mtr->write<2,mtr_t::OPT>(*block, b, 0U) &&
+ UNIV_LIKELY_NULL(block->page.zip.data))
+ memcpy_aligned<2>(&block->page.zip.data[field], b, 2);
}
/***************************************************************//**
@@ -576,30 +569,6 @@ page_rec_get_prev(
return((rec_t*) page_rec_get_prev_const(rec));
}
-/***************************************************************//**
-Looks for the record which owns the given record.
-@return the owner record */
-UNIV_INLINE
-rec_t*
-page_rec_find_owner_rec(
-/*====================*/
- rec_t* rec) /*!< in: the physical record */
-{
- ut_ad(page_rec_check(rec));
-
- if (page_rec_is_comp(rec)) {
- while (rec_get_n_owned_new(rec) == 0) {
- rec = page_rec_get_next(rec);
- }
- } else {
- while (rec_get_n_owned_old(rec) == 0) {
- rec = page_rec_get_next(rec);
- }
- }
-
- return(rec);
-}
-
/**********************************************************//**
Returns the base extra size of a physical record. This is the
size of the fixed header, independent of the record size.