From 5fbd5dafe7f78735922ce5aa15d65ffab811deb8 Mon Sep 17 00:00:00 2001 From: unknown Date: Tue, 13 Nov 2007 17:12:29 +0100 Subject: * WL#4137 Maria- Framework for testing recovery in mysql-test-run See test maria-recovery.test for a model; all include scripts have an "API" section at start if they do take parameters from outside. * Fixing bug reported by Jani and Monty (when two REDOs about the same page in one group, see ma_blockrec.c). * Fixing small bugs in recovery mysql-test/include/wait_until_connected_again.inc: be sure to enter the loop (the previous query by the caller may not have failed: it could be query; mysqladmin shutdown; call this script). mysql-test/lib/mtr_process.pl: * Through the "expect" file a test can tell mtr that a server crash is expected. What the file contains is irrelevant. Now if its last line starts with "wait", mtr will wait before restarting (it will wait for the last line to not start with "wait"). This is for tests which need to mangle files under the feet of a dead mysqld. * Remove "expect" file before restarting; otherwise there could be a race condition: tests sees server restarted, does something, writes an "expect" file, and then mtr removes that file, then test kills mysqld, and then mtr will never restart it. storage/maria/ma_blockrec.c: - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - fixing bug in applying of REDO_PURGE_BLOCKS in recovery: page_range sometimes has TAIL_BIT set, need to turn it down to know the real page range. - Both bugs are covered in maria-recovery.test storage/maria/ma_checkpoint.c: Capability to, in debug builds only, do some special operations (flush all bitmap and data pages, flush state, flush log) and crash mysqld, to later test recovery. Driven by some --debug=d, symbols. storage/maria/ma_open.c: debugging info storage/maria/ma_pagecache.c: Now that we can _ma_unpin_all_pages() during the REDO phase to set page's LSN, the assertion needs to be relaxed. storage/maria/ma_recovery.c: - open trace file in append mode (useful when a test triggers several recoveries, we see them all). - fixing wrong error detection, it's possible that during recovery we want to open an already open table. - when applying a REDO in recovery, we don't anymore put UNDO's LSN on the page at once; indeed if in this REDO's group there comes another REDO for the same page it would be wrongly skipped. Instead, we keep pages pinned, don't change their LSN. When done with all REDOs of the group we unpin them and stamp them with UNDO's LSN. - we verify that all log records of a group are about the same table, for debugging. mysql-test/r/maria-recovery.result: result mysql-test/t/maria-recovery-master.opt: crash is expected, core file would take room, stack trace would wake pushbuild up. mysql-test/t/maria-recovery.test: Test of recovery from mysql-test (it is already tested as unit tests in ma_test_recovery) (WL#4137) - test that, if recovery is made to start on an empty table it can replay the effects of committed and uncommitted statements (having only the committed ones in the end result). This should be the first test for someone writing code of new REDOs. - test that, if mysqld is crashed and recovery runs we have only committed statements in the end result. Crashes are done in different ways: flush nothing (so, uncommitted statement is often missing from the log => no rollback to do); flush pagecache (implicitely flushes log (WAL)) and flush log, both causes rollbacks; flush log can also flush state (state.records etc) to test recovery of the state (not tested well now as we repair the index anyway). - test of bug found by Jani and Monty in recovery (two REDO about the same page in one group). mysql-test/include/maria_empty_logs.inc: removes logs, to have a clean sheet for testing recovery. mysql-test/include/maria_make_snapshot.inc: copies a table to another directory, or back, or compares both (comparison is not implemented as physical comparison is impossible if an UNDO phase happened). mysql-test/include/maria_make_snapshot_for_comparison.inc: copies tables to another directory so that they can later serve as a comparison reference (they are the good tables, recovery should produce similar ones). mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc: When we want to force recovery to start on old tables, we prepare old tables with this script: we put them in a spare directory. They are later copied back over mysqltest tables while mysqld is dead. We also need to copy back the control file, otherwise mysqld, in recovery, would start from the latest checkpoint: latest checkpoint plus old tables is not a recovery-possible scenario of course. mysql-test/include/maria_verify_recovery.inc: causes mysqld to crash, restores old tables if requested, lets recovery run, compares resulting tables with reference tables by using CHECKSUM TABLE. We don't do any sanity checks on page's LSN in resulting tables, yet. --- storage/maria/ma_checkpoint.c | 63 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) (limited to 'storage/maria/ma_checkpoint.c') diff --git a/storage/maria/ma_checkpoint.c b/storage/maria/ma_checkpoint.c index 058e86d8106..680dc73be35 100644 --- a/storage/maria/ma_checkpoint.c +++ b/storage/maria/ma_checkpoint.c @@ -346,6 +346,43 @@ int ma_checkpoint_init(ulong interval) } +#ifndef DBUG_OFF +/** + Function used to test recovery: flush some table pieces and then caller + crashes. + + @param what_to_flush 0: current bitmap and all data pages + 1: state +*/ +static void flush_all_tables(int what_to_flush) +{ + LIST *pos; /**< to iterate over open tables */ + pthread_mutex_lock(&THR_LOCK_maria); + for (pos= maria_open_list; pos; pos= pos->next) + { + MARIA_HA *info= (MARIA_HA*)pos->data; + if (info->s->now_transactional) + { + switch (what_to_flush) + { + case 0: + _ma_flush_table_files(info, MARIA_FLUSH_DATA | MARIA_FLUSH_INDEX, + FLUSH_KEEP, FLUSH_KEEP); + break; + case 1: + _ma_state_info_write(info->s, 1|4); + DBUG_PRINT("maria_flush_states", + ("is_of_horizon: LSN (%lu,0x%lx)", + LSN_IN_PARTS(info->s->state.is_of_horizon))); + break; + } + } + } + pthread_mutex_unlock(&THR_LOCK_maria); +} +#endif + + /** @brief Destroys the checkpoint module */ @@ -353,6 +390,32 @@ int ma_checkpoint_init(ulong interval) void ma_checkpoint_end(void) { DBUG_ENTER("ma_checkpoint_end"); + DBUG_EXECUTE_IF("maria_flush_whole_page_cache", + { + DBUG_PRINT("maria_flush_whole_page_cache", ("now")); + flush_all_tables(0); + }); + DBUG_EXECUTE_IF("maria_flush_whole_log", + { + DBUG_PRINT("maria_flush_whole_log", ("now")); + translog_flush(translog_get_horizon()); + }); + /* + Note that for WAL reasons, maria_flush_states requires + maria_flush_whole_log. + */ + DBUG_EXECUTE_IF("maria_flush_states", + { + DBUG_PRINT("maria_flush_states", ("now")); + flush_all_tables(1); + }); + DBUG_EXECUTE_IF("maria_crash", + { + DBUG_PRINT("maria_crash", ("now")); + fflush(DBUG_FILE); + abort(); + }); + if (checkpoint_inited) { pthread_mutex_lock(&LOCK_checkpoint); -- cgit v1.2.1