author | Marko Mäkelä <marko.makela@mariadb.com> | 2020-12-28 12:06:22 +0200 |
---|---|---|
committer | Marko Mäkelä <marko.makela@mariadb.com> | 2020-12-28 12:06:22 +0200 |
commit | 5b9ee8d8193a8c7a8ebdd35eedcadc3ae78e7fc1 (patch) | |
tree | 136c29d054b5634e03deb48b9dbdf17f919f8b4c | |
parent | 8e3e87d2fc1e63d287f203d441dcb9360775c6b7 (diff) | |
download | mariadb-git-5b9ee8d8193a8c7a8ebdd35eedcadc3ae78e7fc1.tar.gz |
MDEV-24449 Corruption of system tablespace or last recovered page

This corresponds to 10.5 commit 39378e1366f78b38c05e45103b9fb9c829cc5f4f.

With a patched version of the test innodb.ibuf_not_empty (so that
it triggers crash recovery after using the change buffer), and with
code patched to shorten the os_thread_sleep() in
recv_apply_hashed_log_recs() to 1ms and to add a sleep of the same
duration at the end of recv_recover_page() when
recv_sys->n_addrs == 0, we can demonstrate a race condition.

After disabling some debug checks in buf_all_freed_instance(),
buf_pool_invalidate_instance() and buf_validate(), we managed to
trigger an assertion failure in fseg_free_step(), on the XDES_FREE_BIT.
In other words, a trx_undo_seg_free() call during
trx_rollback_resurrected() was attempting a double-free of a page.

The failure reproduced about once in 400 to 500 test runs. With the
fix applied, the test passed 2,000 runs.

recv_apply_hashed_log_recs(): Do not only wait for recv_sys->n_addrs
to reach 0, but also wait for buf_get_n_pending_read_ios() to reach 0,
to guarantee that buf_page_io_complete() will not be executing
ibuf_merge_or_delete_for_page().
-rw-r--r-- | storage/innobase/log/log0recv.cc | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/storage/innobase/log/log0recv.cc b/storage/innobase/log/log0recv.cc
index 4c3886caeaf..95179ec2271 100644
--- a/storage/innobase/log/log0recv.cc
+++ b/storage/innobase/log/log0recv.cc
@@ -2501,7 +2501,7 @@ apply:
 	/* Wait until all the pages have been processed */
-	while (recv_sys->n_addrs != 0) {
+	while (recv_sys->n_addrs || buf_get_n_pending_read_ios()) {
 		const bool abort = recv_sys->found_corrupt_log || recv_sys->found_corrupt_fs;
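
The idea behind the one-line change can be illustrated outside of InnoDB. The following is a minimal, self-contained sketch, not the MariaDB code: `RecoveryState`, `complete_read()`, `wait_for_recovery()` and `run_demo()` are hypothetical names, and the 20ms sleep only stands in for the work that a read completion still performs after the log for a page has been applied (in the commit message, the change buffer merge). It shows why waiting for the page counter alone is not enough, and why also waiting for the pending-read counter closes the window.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-ins for the recovery state; these are NOT MariaDB symbols.
struct RecoveryState {
  std::atomic<unsigned> n_addrs{0};        // pages that still have log to apply
  std::atomic<unsigned> pending_reads{0};  // reads whose completion is still running
  std::atomic<unsigned> merges_done{0};    // follow-up work finished by completions
};

// Simulated read completion: the log is applied first (n_addrs drops), then
// follow-up work runs, and only then is the read counted as finished.
void complete_read(RecoveryState& s) {
  s.n_addrs.fetch_sub(1);
  // Window of the race: n_addrs may already be 0 while work is still running.
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  s.merges_done.fetch_add(1);
  s.pending_reads.fetch_sub(1);
}

// Wait in the style of the fixed loop: both counters must reach zero.
void wait_for_recovery(const RecoveryState& s) {
  while (s.n_addrs.load() || s.pending_reads.load()) {
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
}

// Returns how many follow-up steps were complete when the wait returned.
unsigned run_demo(unsigned pages) {
  RecoveryState s;
  s.n_addrs = pages;
  s.pending_reads = pages;
  std::thread io([&s, pages] {
    for (unsigned i = 0; i < pages; i++) complete_read(s);
  });
  wait_for_recovery(s);
  const unsigned done = s.merges_done.load();
  io.join();
  return done;
}
```

In this sketch, `run_demo(pages)` returns `pages` because the wait cannot finish before the last completion has decremented `pending_reads`. If `wait_for_recovery()` checked only `n_addrs`, it could return while the last `complete_read()` was still inside its sleep, which is the kind of window the commit message describes.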