delta/mariadb-git.git - github.com: MariaDB/server.git

	Commit message	Author	Age	Files	Lines
*	mariabackup little FreeBSD update support.	David Carlier	2021-03-20	1	-1/+10
\|	On this platform, it's better not to rely on the optional proc filesystem being present. Use the native API to retrieve the absolute path of the binary instead.
*	Merge 10.4 into 10.5	Marko Mäkelä	2021-03-11	1	-1/+12
\|\
\| *	arguments overflow fix proposal. the list is assumed to be implicitly null ↵	David CARLIER	2021-03-09	1	-1/+2
\| \|	terminated at usage time.
\| *	mariabackup utility, binary path implementation for Mac.	David CARLIER	2021-03-09	1	-0/+10
\| \|	Implements get_exepath natively, which reliably returns the full path.
* \|	Merge 10.4 into 10.5	Marko Mäkelä	2021-03-08	1	-4/+3
\|\ \ \| \|/
\| *	Merge 10.3 into 10.4	Marko Mäkelä	2021-03-08	1	-4/+3
\| \|\
\| \| *	Merge 10.2 into 10.3	Marko Mäkelä	2021-03-05	1	-4/+3
\| \| \|\
\| \| \| *	MDEV-22929 fixup. Print "completed OK!" if page corruption and ↵	Vladislav Vaintroub	2021-03-05	1	-4/+3
\| \| \| \|	--log-innodb-page-corruption Since we do not stop at corrupted page error, there is no reason to log a backup error.
\| * \| \|	MDEV-20386: Allow RDRAND, RDSEED WITH_MSAN	Marko Mäkelä	2021-01-02	1	-27/+28
\| \| \| \|	Let us use Intel intrinsic functions in WolfSSL whenever possible. This allows such code to be compiled WITH_MSAN.
\| * \| \|	WolfSSL v4.6.0-stable	Marko Mäkelä	2021-01-02	2	-1/+1
\| \| \| \|
* \| \| \|	Added 'const' to arguments in get_one_option and find_typeset()	Monty	2021-02-08	11	-18/+20
\| \| \| \|	One should not change the program arguments! This change also reduces warnings from the icc compiler. Almost all changes are just syntax changes (adding const to 'get_one_option function' declarations).
Other changes:
- Added a few casts of 'argument' from 'const char' to 'char '. This was mainly in calls to 'external' functions we don't have control of.
- Ensure that all resets of 'password command line argument' are similar. (In almost all cases it was just adding a comment and a cast)
- In mysqlbinlog.cc and mysqld.cc there were a few cases that changed the command line argument. These places were changed to instead allocate the option in a MEM_ROOT to avoid changing the argument. Some of this code was changed to ensure that different programs did parsing the same way. Added a test case for the changes in mysqlbinlog.cc
- Changed a few variables that took their value from command line options from 'char ' to 'const char '.
* \| \| \|	MDEV-24537 innodb_max_dirty_pages_pct_lwm=0 lost its special meaning	Marko Mäkelä	2021-01-06	1	-1/+9
\| \| \| \|	In commit 3a9a3be1c64b14c05648e87ebe0f1dd96457de41 (MDEV-23855) some previous logic was replaced with the condition dirty_pct < srv_max_dirty_pages_pct_lwm, which caused the default value of the parameter innodb_max_dirty_pages_pct_lwm=0 to lose its special meaning: 'refer to innodb_max_dirty_pages_pct instead'. This implicit special meaning was visible in the function af_get_pct_for_dirty(), which was removed in commit f0c295e2de8c074c2ca72e19ff06e1d0e3ee6d2b (MDEV-24369). page_cleaner_flush_pages_recommendation(): Restore the special meaning that was removed in MDEV-24369. buf_flush_page_cleaner(): If srv_max_dirty_pages_pct_lwm==0.0, refer to srv_max_buf_pool_modified_pct. This fixes the observed performance regression due to excessive page flushing. buf_pool_t::page_cleaner_wakeup(): Revise the wakeup condition. innodb_init(): Do initialize srv_max_io_capacity in Mariabackup. It was previously constantly 0, which caused mariadb-backup --prepare to hang in buf_flush_sync(), making no progress.
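The special meaning of innodb_max_dirty_pages_pct_lwm=0 described above can be illustrated with a minimal sketch. The helper names below are hypothetical and are not the InnoDB functions; the sketch only assumes that a low-water mark of 0 means "use the maximum dirty percentage instead".

```cpp
#include <cassert>

// Hypothetical helper, not the actual InnoDB code: picks the dirty-page
// percentage at which background flushing should start.
// lwm == 0 means "no low-water mark configured": fall back to max_pct.
inline double effective_dirty_threshold(double lwm, double max_pct)
{
  assert(lwm >= 0.0 && max_pct >= 0.0);
  return lwm == 0.0 ? max_pct : lwm;
}

// True if flushing should start at the given dirty percentage.
inline bool should_flush(double dirty_pct, double lwm, double max_pct)
{
  return dirty_pct >= effective_dirty_threshold(lwm, max_pct);
}
```

With a plain comparison against lwm, a value of 0 would make every nonzero dirty percentage qualify for flushing, which matches the excessive flushing reported in the commit message.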
* \| \| \|	Merge commit '10.4' into 10.5	Oleksandr Byelkin	2021-01-06	1	-0/+4
\|\ \ \ \ \| \|/ / /
\| * \| \|	Merge branch '10.3' into 10.4 [bb-10.4-MDEV-23468]	Oleksandr Byelkin	2020-12-25	1	-0/+4
\| \|\ \ \ \| \| \|/ /
\| \| * \|	Merge branch '10.2' into 10.3	Oleksandr Byelkin	2020-12-23	1	-0/+4
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-22810 mariabackup does not honor open_files_limit from option during ↵	Vlad Lesin	2020-12-16	1	-0/+4
\| \| \| \|	backup prepare open_files_limit option was processed only for --backup, but not for --prepare.
\| * \| \|	Merge 10.3 into 10.4	Marko Mäkelä	2020-12-23	1	-4/+1
\| \|\ \ \ \| \| \|/ /
\| \| * \|	Merge 10.2 into 10.3	Marko Mäkelä	2020-12-23	1	-4/+1
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-24340 Unique final message of InnoDB during shutdown [bb-10.2-MDEV-24340]	Marko Mäkelä	2020-12-04	1	-4/+1
\| \| \| \|	innobase_space_shutdown(): Remove. We want this step to be executed before the message "InnoDB: Shutdown completed; log sequence number " is output by innodb_shutdown(). It used to be executed after that step. innodb_shutdown(): Duplicate the code that used to live in innobase_space_shutdown(). innobase_init_abort(): Merge with innobase_space_shutdown().
* \| \| \|	MDEV-20386: Allow RDRAND, RDSEED WITH_MSAN	Marko Mäkelä	2021-01-01	1	-27/+28
\| \| \| \|	Let us use Intel intrinsic functions in WolfSSL whenever possible. This allows such code to be compiled WITH_MSAN.
* \| \| \|	WolfSSL v4.6.0-stable	Marko Mäkelä	2021-01-01	2	-1/+1
\| \| \| \|
* \| \| \|	Contain AIX perror	Etienne Guesnet	2020-12-16	1	-7/+13
\| \| \| \|
* \| \| \|	Fix build on GCC 5	Etienne Guesnet	2020-12-16	1	-1/+1
\| \| \| \|
* \| \| \|	Add build on AIX	Etienne Guesnet	2020-12-16	1	-1/+6
\| \| \| \|
* \| \| \|	MDEV-24313 (2 of 2): Silently ignored innodb_use_native_aio=1 [bb-10.5-MDEV-24313]	Marko Mäkelä	2020-12-14	1	-5/+7
\| \| \| \|	In commit 5e62b6a5e06eb02cbde1e34e95e26f42d87fce02 (MDEV-16264) the logic of os_aio_init() was changed so that it will never fail, but instead automatically disable innodb_use_native_aio (which is enabled by default) if the io_setup() system call would fail due to resource limits being exceeded. This is questionable, especially because falling back to simulated AIO may lead to significantly reduced performance. srv_n_file_io_threads, srv_n_read_io_threads, srv_n_write_io_threads: Change the data type from ulong to uint. os_aio_init(): Remove the parameters, and actually return an error code. thread_pool::configure_aio(): Do not silently fall back to simulated AIO. Reviewed by: Vladislav Vaintroub
* \| \| \|	MDEV-24391 heap-use-after-free in fil_space_t::flush_low() [bb-10.5-MDEV-24391]	Marko Mäkelä	2020-12-11	1	-1/+1
\| \| \| \|	We observed a race condition that involved two threads executing fil_flush_file_spaces() and one thread executing fil_delete_tablespace(). After one of the fil_flush_file_spaces() observed that space.needs_flush_not_stopping() is set and was releasing the fil_system.mutex, the other fil_flush_file_spaces() would complete the execution of fil_space_t::flush_low() on the same tablespace. Then, fil_delete_tablespace() would destroy the object, because the value of fil_space_t::n_pending did not prevent that. Finally, the fil_flush_file_spaces() would resume execution and invoke fil_space_t::flush_low() on the freed object. This race condition was introduced in commit 118e258aaac5da75a2ac4556201aaea3688fac67 of MDEV-23855. fil_space_t::flush(): Add a template parameter that indicates whether the caller is holding a reference to prevent the tablespace from being freed. buf_dblwr_t::flush_buffered_writes_completed(), row_quiesce_table_start(): Acquire a reference for the duration of the fil_space_t::flush_low() operation. It should be impossible for the object to be freed in these code paths, but we want to satisfy the debug assertions. fil_space_t::flush_low(): Do not increment or decrement the reference count, but instead assert that the caller is holding a reference. fil_space_extend_must_retry(), fil_flush_file_spaces(): Acquire a reference before releasing fil_system.mutex. This is what will fix the race condition.
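The fix described above (acquire a reference before releasing the mutex, and only assert inside the flush routine) can be sketched as follows. Space, system_mutex, can_free() and flush_space() are simplified stand-ins invented for illustration; they are not the real fil_space_t or fil_system code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>

// Simplified stand-in for a tablespace object; not the real fil_space_t.
struct Space
{
  std::atomic<uint32_t> n_pending{0};   // references held by in-flight users
  std::atomic<bool> needs_flush{true};

  // Like the revised flush_low(): does not touch the reference count,
  // it only asserts that the caller has pinned the object.
  void flush_low()
  {
    assert(n_pending.load() > 0);
    needs_flush.store(false);
  }
};

std::mutex system_mutex;  // stand-in for fil_system.mutex

// A deleting thread may free the object only when nobody holds a reference.
bool can_free(const Space &s) { return s.n_pending.load() == 0; }

void flush_space(Space &s)
{
  {
    std::lock_guard<std::mutex> guard(system_mutex);
    if (!s.needs_flush.load())
      return;
    s.n_pending.fetch_add(1);  // pin BEFORE the mutex is released
  }
  s.flush_low();               // can_free(s) is false during this call
  s.n_pending.fetch_sub(1);    // unpin
}
```

The point of the ordering is that there is no window between dropping the mutex and starting the flush in which the reference count is zero.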
* \| \| \|	Merge 10.4 into 10.5	Marko Mäkelä	2020-12-02	10	-126/+496
\|\ \ \ \ \| \|/ / /
\| * \| \|	Merge 10.3 into 10.4	Marko Mäkelä	2020-12-01	10	-127/+491
\| \|\ \ \ \| \| \|/ /
\| \| * \|	MDEV-22929 MariaBackup option to report and/or continue when corruption is ↵	Vlad Lesin	2020-12-01	1	-2/+2
\| \| \| \|	encountered Post-push Windows compilation errors fix.
\| \| * \|	Merge 10.2 into 10.3	Marko Mäkelä	2020-12-01	10	-127/+491
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-22929 MariaBackup option to report and/or continue when corruption is ↵	Vlad Lesin	2020-12-01	10	-127/+489
\| \| \| \|	encountered The new option --log-innodb-page-corruption is introduced. When this option is set, backup is not interrupted if innodb corrupted page is detected. Instead it logs all found corrupted pages in innodb_corrupted_pages file in backup directory and finishes with error. For incremental backup corrupted pages are also copied to .delta file, because we can't do LSN check for such pages during backup, innodb_corrupted_pages will also be created in incremental backup directory. During --prepare, corrupted pages list is read from the file just after redo log is applied, and each page from the list is checked if it is allocated in its tablespace or not. If it is not allocated, then it is zeroed out, flushed to the tablespace and removed from the list. If all pages are removed from the list, then --prepare is finished successfully and innodb_corrupted_pages file is removed from backup directory. Otherwise --prepare is finished with error message and innodb_corrupted_pages contains the list of the pages, which are detected as corrupted during backup, and are allocated in their tablespaces, which means backup directory contains corrupted innodb pages, and backup cannot be considered as consistent. For incremental --prepare corrupted pages from .delta files are applied to the base backup, innodb_corrupted_pages is read from both base and incremental directories, and the same action is performed for corrupted pages list as for full --prepare. innodb_corrupted_pages file is modified or removed only in base directory.
If DDL happens during backup, it is also processed at the end of backup to have correct tablespace names in innodb_corrupted_pages.
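The --prepare handling of the corrupted page list described above can be sketched roughly as below. PageId, process_corrupted_pages() and the is_allocated callback are hypothetical names; a real implementation would also zero out and flush each unallocated page, which is only noted in a comment here.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

struct PageId { uint32_t space; uint32_t page_no; };

// Hypothetical sketch of the --prepare step. Pages that are NOT allocated in
// their tablespace are dropped from the list (the real tool would zero them
// out and flush them first). Returns true when the list ends up empty, that
// is, when the backup can be considered consistent.
bool process_corrupted_pages(
    std::vector<PageId> &pages,
    const std::function<bool(const PageId &)> &is_allocated)
{
  assert(is_allocated != nullptr);
  std::vector<PageId> remaining;
  for (const PageId &p : pages)
    if (is_allocated(p))
      remaining.push_back(p);  // still corrupted and in use: keep reporting it
  pages.swap(remaining);
  return pages.empty();
}
```

If the function returns false, the remaining entries correspond to the pages that would stay listed in innodb_corrupted_pages.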
\| * \| \|	Merge 10.3 into 10.4	Marko Mäkelä	2020-11-03	3	-2/+15
\| \|\ \ \ \| \| \|/ /
* \| \| \|	Merge branch '10.4' into 10.5	Oleksandr Byelkin	2020-11-01	3	-2/+15
\|\ \ \ \
\| * \ \ \	Merge branch '10.3' into 10.4	Oleksandr Byelkin	2020-10-31	3	-2/+15
\| \|\ \ \ \ \| \| \|/ / / \| \|/\| / / \| \| \|/ /
\| \| * \|	Merge branch '10.2' into 10.3	Oleksandr Byelkin	2020-10-30	3	-2/+15
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-24026: InnoDB: Failing assertion: os_total_large_mem_allocated >= size ↵	Vlad Lesin	2020-10-29	3	-2/+15
\| \| \| \|	upon incremental backup mariabackup deallocated uninitialized write_filt_ctxt.u.wf_incremental_ctxt in xtrabackup_copy_datafile() when some table should be skipped due to parsed DDL redo log record.
* \| \| \|	Merge 10.4 into 10.5	Marko Mäkelä	2020-10-30	2	-52/+101
\|\ \ \ \ \| \|/ / /
\| * \| \|	Merge 10.3 into 10.4	Marko Mäkelä	2020-10-29	1	-51/+102
\| \|\ \ \ \| \| \|/ /
\| \| * \|	Merge 10.2 into 10.3	Marko Mäkelä	2020-10-28	1	-51/+102
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-20755 InnoDB: Database page corruption on disk or a failed file read of ↵	Vlad Lesin	2020-10-23	1	-51/+102
\| \| \| \|	tablespace upon prepare of mariabackup incremental backup The problem: When incremental backup is taken, delta files are created for innodb tables which are marked as new tables during innodb ddl tracking. When such tablespace is tried to be opened during prepare in xb_delta_open_matching_space(), it is "created", i.e. xb_space_create_file() is invoked, instead of opening, even if a tablespace with the same name exists in the base backup directory. xb_space_create_file() writes the page 0 header of the tablespace. This header does not contain crypt data, as mariabackup does not have any information about crypt data in delta file metadata for tablespaces. After delta file is applied, recovery process is started. As the sequence of recovery for different pages is not defined, there can be the situation when crypt data redo log event is executed after some other page is read for recovery. When some page is read for recovery, it's decrypted using crypt data stored in tablespace header in page 0, if there is no crypt data, the page is not decrypted and does not pass corruption test. This causes error for incremental backup --prepare for encrypted tablespaces. The error is not stable because crypt data redo log event updates crypt data on page 0, and recovery for different pages can be executed in undefined order.
The fix: When delta file is created, the corresponding write filter copies only the pages whose LSN is greater than some incremental LSN. When new file is created during incremental backup, the LSN of all its pages must be greater than incremental LSN, so there is no need to create delta for such table, we can just copy it completely. The fix is to copy the whole file which was tracked during incremental backup with innodb ddl tracker, and copy it to base directory during --prepare instead of delta applying. There is also DBUG_EXECUTE_IF() in innodb code to avoid writing redo log record for crypt data updating on page 0 to make the test case stable. Note: The issue is not reproducible in 10.5 as optimized DDL's are deprecated in 10.5. But the fix is still useful because it allows decreasing data copy size during backup, as delta file contains some extra info. The test case should be removed for 10.5 as it will always pass.
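The reasoning behind the fix can be shown with a small sketch of an LSN-based write filter. Page, delta_pages() and copy_whole_file() are hypothetical names, assuming only that a delta holds the pages whose LSN exceeds the incremental LSN.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page header: only the LSN of the last modification matters.
struct Page { uint64_t lsn; };

// Indexes of the pages a delta would contain: those modified after
// incremental_lsn.
std::vector<std::size_t> delta_pages(const std::vector<Page> &file,
                                     uint64_t incremental_lsn)
{
  std::vector<std::size_t> out;
  for (std::size_t i = 0; i < file.size(); i++)
    if (file[i].lsn > incremental_lsn)
      out.push_back(i);
  return out;
}

// A file created during the incremental backup has only pages newer than
// incremental_lsn, so its delta would hold every page: copy it whole instead.
bool copy_whole_file(const std::vector<Page> &file, uint64_t incremental_lsn)
{
  return delta_pages(file, incremental_lsn).size() == file.size();
}
```

For a newly created file the filter selects every page, so the delta carries no savings and only adds the extra metadata mentioned in the note above.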
\| * \| \|	MDEV-23539: aws key plugin - fails to build	Daniel Black	2020-10-26	1	-1/+1
\| \| \| \|	Recent gcc/clang versions failed to compile the existing code. Updating to a later upstream SDK version was simple and required only implementing a flush method. This was left blank as there was no strong requirement to keep the error log atomic or durable. Reviewed-by: wlad@mariadb.com The upstream SDK version added a flush method which was simple to complete.
* \| \| \|	MDEV-23855: Use normal mutex for log_sys.mutex, log_sys.flush_order_mutex	Marko Mäkelä	2020-10-26	1	-12/+12
\| \| \| \|	With an unreasonably small innodb_log_file_size, the page cleaner thread would frequently acquire log_sys.flush_order_mutex and spend a significant portion of CPU time spinning on that mutex when determining the checkpoint LSN.
* \| \| \|	MDEV-23855: Shrink fil_space_t	Marko Mäkelä	2020-10-26	2	-8/+7
\| \| \| \|	Merge n_pending_ios, n_pending_ops to std::atomic<uint32_t> n_pending. Change some more fil_space_t members to uint32_t to reduce the memory footprint. fil_space_t::add(), fil_ibd_create(): Attach the already opened handle to the tablespace, and enforce the fil_system.n_open limit. dict_boot(): Initialize fil_system.max_assigned_id. srv_boot(): Call srv_thread_pool_init() before anything else, so that files should be opened in the correct mode on Windows. fil_ibd_create(): Create the file in OS_FILE_AIO mode, just like fil_node_open_file_low() does it. dict_table_t::is_accessible(): Replaces fil_table_accessible(). Reviewed by: Vladislav Vaintroub
* \| \| \|	MDEV-23855: Remove fil_system.LRU and reduce fil_system.mutex contention	Marko Mäkelä	2020-10-26	2	-54/+34
\| \| \| \|	Also fixes MDEV-23929: innodb_flush_neighbors is not being ignored for system tablespace on SSD When the maximum configured number of files is exceeded, InnoDB will close data files. We used to maintain a fil_system.LRU list and a counter fil_node_t::n_pending to achieve this, at the huge cost of multiple fil_system.mutex operations per I/O operation. fil_node_open_file_low(): Implement a FIFO replacement policy: The last opened file will be moved to the end of fil_system.space_list, and files will be closed from the start of the list. However, we will not move tablespaces in fil_system.space_list while i_s_tablespaces_encryption_fill_table() is executing (producing output for INFORMATION_SCHEMA.INNODB_TABLESPACES_ENCRYPTION) because it may cause information of some tablespaces to go missing. We also avoid this in mariabackup --backup because datafiles_iter_next() assumes that the ordering is not changed. IORequest: Fold more parameters to IORequest::type. fil_space_t::io(): Replaces fil_io(). fil_space_t::flush(): Replaces fil_flush(). OS_AIO_IBUF: Remove. We will always issue synchronous reads of the change buffer pages in buf_read_page_low(). We will always ignore some errors for background reads. This should reduce fil_system.mutex contention a little. fil_node_t::complete_write(): Replaces fil_node_t::complete_io(). On both read and write completion, fil_space_t::release_for_io() will have to be called.
fil_space_t::io(): Do not acquire fil_system.mutex in the normal code path. xb_delta_open_matching_space(): Do not try to open the system tablespace which was already opened. This fixes a file sharing violation in mariabackup --prepare --incremental. Reviewed by: Vladislav Vaintroub
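The FIFO replacement policy mentioned above differs from LRU in that re-using an already open file does not change its position. A minimal sketch, with the hypothetical class OpenFiles standing in for the bookkeeping (it is not the fil_system.space_list code):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical FIFO bookkeeping for a bounded number of open files.
class OpenFiles
{
  std::list<int> order_;  // front = opened longest ago
  std::size_t limit_;

public:
  explicit OpenFiles(std::size_t limit) : limit_(limit) { assert(limit > 0); }

  // Opens file `id`. Returns the id of the file that had to be closed to
  // stay within the limit, or -1 if nothing was closed.
  int open(int id)
  {
    if (std::find(order_.begin(), order_.end(), id) != order_.end())
      return -1;  // already open: FIFO does not reorder on access
    int victim = -1;
    if (order_.size() == limit_)
    {
      victim = order_.front();  // close from the start of the list
      order_.pop_front();
    }
    order_.push_back(id);       // the last opened file goes to the end
    return victim;
  }
};
```

Because no reordering happens on access, the common I/O path needs no list maintenance, which is the saving the commit message attributes to dropping the LRU list.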
* \| \| \|	MDEV-23855: Improve InnoDB log checkpoint performance	Marko Mäkelä	2020-10-26	1	-0/+1
\| \| \| \|	After MDEV-15053, MDEV-22871, MDEV-23399 shifted the scalability bottleneck, log checkpoints became a new bottleneck. If innodb_io_capacity is set low or innodb_max_dirty_pct_lwm is set high and the workload fits in the buffer pool, the page cleaner thread will perform very little flushing. When we reach the capacity of the circular redo log file ib_logfile0 and must initiate a checkpoint, some 'furious flushing' will be necessary. (If innodb_flush_sync=OFF, then flushing would continue at the innodb_io_capacity rate, and writers would be throttled.) We have the best chance of advancing the checkpoint LSN immediately after a page flush batch has been completed. Hence, it is best to perform checkpoints after every batch in the page cleaner thread, attempting to run once per second. By initiating high-priority flushing in the page cleaner as early as possible, we aim to make the throughput more stable. The function buf_flush_wait_flushed() used to sleep for 10ms, hoping that the page cleaner thread would do something during that time. The observed end result was that a large number of threads that call log_free_check() would end up sleeping while nothing useful is happening. We will revise the design so that in the default innodb_flush_sync=ON mode, buf_flush_wait_flushed() will wake up the page cleaner thread to perform the necessary flushing, and it will wait for a signal from the page cleaner thread.
If innodb_io_capacity is set to a low value (causing the page cleaner to throttle its work), a write workload would initially perform well, until the capacity of the circular ib_logfile0 is reached and log_free_check() will trigger checkpoints. At that point, the extra waiting in buf_flush_wait_flushed() will start reducing throughput. The page cleaner thread will also initiate log checkpoints after each buf_flush_lists() call, because that is the best point of time for the checkpoint LSN to advance by the maximum amount. Even in 'furious flushing' mode we invoke buf_flush_lists() with innodb_io_capacity_max pages at a time, and at the start of each batch (in the log_flush() callback function that runs in a separate task) we will invoke os_aio_wait_until_no_pending_writes(). This tweak allows the checkpoint to advance in smaller steps and significantly reduces the maximum latency. On an Intel Optane 960 NVMe SSD on Linux, it reduced from 4.6 seconds to 74 milliseconds. On Microsoft Windows with a slower SSD, it reduced from more than 180 seconds to 0.6 seconds. We will make innodb_adaptive_flushing=OFF simply flush innodb_io_capacity per second whenever the dirty proportion of buffer pool pages exceeds innodb_max_dirty_pages_pct_lwm. For innodb_adaptive_flushing=ON we try to make page_cleaner_flush_pages_recommendation() more consistent and predictable: if we are below innodb_adaptive_flushing_lwm, let us flush pages according to the return value of af_get_pct_for_dirty(). innodb_max_dirty_pages_pct_lwm: Revert the change of the default value that was made in MDEV-23399. The value innodb_max_dirty_pages_pct_lwm=0 guarantees that a shutdown of an idle server will be fast. Users might be surprised if normal shutdown suddenly became slower when upgrading within a GA release series. innodb_checkpoint_usec: Remove. The master task will no longer perform periodic log checkpoints. It is the duty of the page cleaner thread. log_sys.max_modified_age: Remove. 
The current span of the buf_pool.flush_list expressed in LSN only matters for adaptive flushing (outside the 'furious flushing' condition). For the correctness of checkpoints, the only thing that matters is the checkpoint age (log_sys.lsn - log_sys.last_checkpoint_lsn). This run-time constant was also reported as log_max_modified_age_sync. log_sys.max_checkpoint_age_async: Remove. This does not serve any purpose, because the checkpoints will now be triggered by the page cleaner thread. We will retain the log_sys.max_checkpoint_age limit for engaging 'furious flushing'. page_cleaner.slot: Remove. It turns out that page_cleaner_slot.flush_list_time was duplicating page_cleaner.slot.flush_time and page_cleaner.slot.flush_list_pass was duplicating page_cleaner.flush_pass. Likewise, there were some redundant monitor counters, because the page cleaner thread no longer performs any buf_pool.LRU flushing, and because there only is one buf_flush_page_cleaner thread. buf_flush_sync_lsn: Protect writes by buf_pool.flush_list_mutex. buf_pool_t::get_oldest_modification(): Add a parameter to specify the return value when no persistent data pages are dirty. Require the caller to hold buf_pool.flush_list_mutex. log_buf_pool_get_oldest_modification(): Take the fall-back LSN as a parameter. All callers will also invoke log_sys.get_lsn(). log_preflush_pool_modified_pages(): Replaced with buf_flush_wait_flushed(). buf_flush_wait_flushed(): Implement two limits. If not enough buffer pool has been flushed, signal the page cleaner (unless innodb_flush_sync=OFF) and wait for the page cleaner to complete. If the page cleaner thread is not running (which can be the case during shutdown), initiate the flush and wait for it directly. buf_flush_ahead(): If innodb_flush_sync=ON (the default), submit a new buf_flush_sync_lsn target for the page cleaner but do not wait for the flushing to finish. 
log_get_capacity(), log_get_max_modified_age_async(): Remove, to make it easier to see that af_get_pct_for_lsn() is not acquiring any mutexes. page_cleaner_flush_pages_recommendation(): Protect all access to buf_pool.flush_list with buf_pool.flush_list_mutex. Previously there were some race conditions in the calculation. buf_flush_sync_for_checkpoint(): New function to process buf_flush_sync_lsn in the page cleaner thread. At the end of each batch, we try to wake up any blocked buf_flush_wait_flushed(). If everything up to buf_flush_sync_lsn has been flushed, we will reset buf_flush_sync_lsn=0. The page cleaner thread will keep 'furious flushing' until the limit is reached. Any threads that are waiting in buf_flush_wait_flushed() will be able to resume as soon as their own limit has been satisfied. buf_flush_page_cleaner: Prioritize buf_flush_sync_lsn and do not sleep as long as it is set. Do not update any page_cleaner statistics for this special mode of operation. In the normal mode (buf_flush_sync_lsn is not set for innodb_flush_sync=ON), try to wake up once per second. No longer check whether srv_inc_activity_count() has been called. After each batch, try to perform a log checkpoint, because the best chances for the checkpoint LSN to advance by the maximum amount are upon completing a flushing batch. log_t: Move buf_free, max_buf_free possibly to the same cache line with log_sys.mutex. log_margin_checkpoint_age(): Simplify the logic, and replace a 0.1-second sleep with a call to buf_flush_wait_flushed() to initiate flushing. Moved to the same compilation unit with the only caller. log_close(): Clean up the calculations. (Should be no functional change.) Return whether flush-ahead is needed. Moved to the same compilation unit with the only caller. mtr_t::finish_write(): Return whether flush-ahead is needed. mtr_t::commit(): Invoke buf_flush_ahead() when needed. 
Let us avoid external calls in mtr_t::commit() and make the logic easier to follow by having related code in a single compilation unit. Also, we will invoke srv_stats.log_write_requests.inc() only once per mini-transaction commit, while not holding mutexes. log_checkpoint_margin(): Only care about log_sys.max_checkpoint_age. Upon reaching log_sys.max_checkpoint_age where we must wait to prevent the log from getting corrupted, let us wait for at most 1MiB of LSN at a time, before rechecking the condition. This should allow writers to proceed even if the redo log capacity has been reached and 'furious flushing' is in progress. We no longer care about log_sys.max_modified_age_sync or log_sys.max_modified_age_async. The log_sys.max_modified_age_sync could be a relic from the time when there was a srv_master_thread that wrote dirty pages to data files. Also, we no longer have any log_sys.max_checkpoint_age_async limit, because log checkpoints will now be triggered by the page cleaner thread upon completing buf_flush_lists(). log_set_capacity(): Simplify the calculations of the limit (no functional change). log_checkpoint_low(): Split from log_checkpoint(). Moved to the same compilation unit with the caller. log_make_checkpoint(): Only wait for everything to be flushed until the current LSN. create_log_file(): After checkpoint, invoke log_write_up_to() to ensure that the FILE_CHECKPOINT record has been written. This avoids ut_ad(!srv_log_file_created) in create_log_file_rename(). srv_start(): Do not call recv_recovery_from_checkpoint_start() if the log has just been created. Set fil_system.space_id_reuse_warned before dict_boot() has been executed, and clear it after recovery has finished. dict_boot(): Initialize fil_system.max_assigned_id. srv_check_activity(): Remove. The activity count is counting transaction commits and therefore mostly interesting for the purge of history. 
BtrBulk::insert(): Do not explicitly wake up the page cleaner, but do invoke srv_inc_activity_count(), because that counter is still being used in buf_load_throttle_if_needed() for some heuristics. (It might be cleaner to execute buf_load() in the page cleaner thread!) Reviewed by: Vladislav Vaintroub
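The checkpoint age arithmetic from this commit message can be sketched as below. The message defines the age as log_sys.lsn - log_sys.last_checkpoint_lsn and says that waiters proceed in steps of at most 1MiB of LSN; the function names and the exact shape of next_wait_target() are assumptions made for illustration, not the InnoDB implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// At most 1 MiB of LSN per wait, as stated in the commit message.
constexpr uint64_t kWaitStep = 1ULL << 20;

// Checkpoint age: distance between the current LSN and the last checkpoint.
inline uint64_t checkpoint_age(uint64_t lsn, uint64_t last_checkpoint_lsn)
{
  assert(lsn >= last_checkpoint_lsn);
  return lsn - last_checkpoint_lsn;
}

// Writers must wait once the age exceeds the configured maximum.
inline bool must_wait(uint64_t lsn, uint64_t last_checkpoint_lsn,
                      uint64_t max_checkpoint_age)
{
  return checkpoint_age(lsn, last_checkpoint_lsn) > max_checkpoint_age;
}

// Assumed shape of the bounded wait: request flushing up to one step past
// the last checkpoint, never beyond the current LSN, then recheck.
inline uint64_t next_wait_target(uint64_t lsn, uint64_t last_checkpoint_lsn)
{
  return std::min(lsn, last_checkpoint_lsn + kWaitStep);
}
```

Waiting in bounded steps lets a writer resume as soon as the condition clears, rather than after all outstanding flushing has completed.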
* \| \| \|	MDEV-23399 fixup: Avoid crash on Mariabackup shutdown	Marko Mäkelä	2020-10-26	1	-8/+4
\| \| \| \|	innodb_preshutdown(): Terminate the encryption threads before the page cleaner thread can be shut down. innodb_shutdown(): Always wait for the encryption threads and page cleaner to shut down. srv_shutdown_all_bg_threads(): Wait for the encryption threads and the page cleaner to shut down. (After an aborted startup, innodb_shutdown() would not be called.) row_get_background_drop_list_len_low(): Remove. os_thread_count: Remove. Alternatively, at the end of srv_shutdown_all_bg_threads() we could try to wait longer for the count to reach 0. On some platforms, an assertion os_thread_count==0 could fail even after a small delay, even though in the core dump all threads would have exited. srv_shutdown_threads(): Renamed from srv_shutdown_all_bg_threads(). Do not wait for the page cleaner to shut down, because the later innodb_shutdown(), which may invoke logs_empty_and_mark_files_at_shutdown(), assumes that it exists.
* \| \| \|	Merge 10.4 to 10.5	Marko Mäkelä	2020-10-22	1	-2/+2
\|\ \ \ \ \| \|/ / /
\| * \| \|	Merge 10.3 into 10.4	Marko Mäkelä	2020-10-22	1	-2/+2
\| \|\ \ \ \| \| \|/ /
\| \| * \|	Merge 10.2 into 10.3	Marko Mäkelä	2020-10-22	1	-2/+2
\| \| \|\ \ \| \| \| \|/
\| \| \| *	MDEV-21951: mariabackup SST fail if data-directory have lost+found directory	Julius Goryavsky	2020-10-20	1	-1/+1
\| \| \| \|	To fix this, it is necessary to add an option to exclude the database with the name "lost+found" from processing (the database name will be checked by the check_if_skip_database_by_path() or by the check_if_skip_database() function, and as a result "lost+found" will be skipped). In addition, it is necessary to slightly modify the verification logic in the check_if_skip_database() function. Also added a new test galera_sst_mariabackup_lost_found.test