author | Guilhem Bichot <guilhem@mysql.com> | 2008-06-02 22:53:25 +0200 |
---|---|---|
committer | Guilhem Bichot <guilhem@mysql.com> | 2008-06-02 22:53:25 +0200 |
commit | a5bcb63f45f58f7c5f4f2387da521aa7a14b60be (patch) | |
tree | 7eaa8ccde458e0c059e01272c49894176fd01dba /storage/maria | |
parent | 2d64cd05e1b9cd3b76368af7db34335b88bea248 (diff) | |
download | mariadb-git-a5bcb63f45f58f7c5f4f2387da521aa7a14b60be.tar.gz |
WL#4374 "Maria - force start if Recovery fails multiple times"
http://forge.mysql.com/worklog/task.php?id=4374
new option --maria-force-start-after-recovery-failures=N; the number of consecutive recovery failures (failures
of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()])
is stored in the control file; if at a Maria start this number has reached N, logs are removed. This is for automated
systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also
be used on them: this revision makes maria-recover work (it was disabled).
Fixed bug in translog_is_log_files(). translog_init() now prints a message to the error log if it fails.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
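The counting rule described above can be sketched in isolation. This is only an illustration, assuming an in-memory struct in place of the real control file; `sim_state`, `sim_recovery_start()` and `sim_recovery_success()` are hypothetical names, loosely modelled on mark_recovery_start() and mark_recovery_success() from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the counter kept in the control file. */
typedef struct {
    uint8_t recovery_failures;  /* consecutive failed recoveries so far */
    bool    logs_present;       /* becomes false once logs are removed */
} sim_state;

/*
  Called before recovery is attempted. If the stored counter has reached the
  limit, logs are removed; otherwise the attempt is counted as a failure
  until sim_recovery_success() says otherwise.
  Returns true if logs were removed. A limit of 0 disables the feature.
*/
static bool sim_recovery_start(sim_state *st, unsigned limit)
{
    if (limit == 0)
        return false;
    if (st->recovery_failures >= limit) {
        st->logs_present = false;
        return true;
    }
    st->recovery_failures++;
    return false;
}

/* Called after a successful recovery: the counter goes back to 0. */
static void sim_recovery_success(sim_state *st, unsigned limit)
{
    if (limit != 0)
        st->recovery_failures = 0;
}
```

With a limit of 2, two starts that never reach sim_recovery_success() leave the counter at 2, and the third start removes the logs.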
KNOWN_BUGS.txt:
The new option --maria-force-start-after-recovery-failures corresponds to the wish "we should fix that if this happens etc".
LOAD INDEX has no longer been ignored for a few weeks. The listed concurrency bugs were fixed some time ago.
Recovery of fulltext and GIS indexes has worked for a few weeks.
mysql-test/include/maria_make_snapshot.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_comparison.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_verify_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/lib/mtr_report.pl:
new test maria-recover.test generates expected corruption warnings in the error log. maria-recovery.test's corrupted table is renamed to t_corrupted1 instead of t1.
mysql-test/r/maria-preload.result:
result update. maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger
because using the information_schema and the join leads to some internal maria temp table being used, and thus some
blocks of it being read.
mysql-test/r/maria-purge.result:
engine's name in SHOW ENGINE MARIA LOGS changed.
mysql-test/r/maria-recover.result:
result for new test. We see corruption messages at the first SELECT and then none at the second SELECT, as expected.
mysql-test/r/maria-recovery.result:
result update
mysql-test/r/maria.result:
new variables show up
mysql-test/t/disabled.def:
BUG#34911 is not fixed but the test has been made independent of the bug (workaround). A new bug (crash) has popped up recently, so it has to stay
disabled (BUG#35107).
mysql-test/t/maria-preload.test:
Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
compute differences in status variables before and after relevant queries
mysql-test/t/maria-recover-master.opt:
test --maria-recover
mysql-test/t/maria-recover.test:
Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired)
mysql-test/t/maria-recovery-big.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery-bitmap.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery.test:
update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl
does not blindly remove all corruption messages for t1 which is
a common name.
storage/maria/ha_maria.cc:
Enabling maria-recover.
Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init()
calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries
and remove logs if needed.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
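The stray \0 comes from using sizeof on a char array as a string length. A minimal sketch of the difference follows; the helper names are hypothetical and only the "error" literal is taken from the diff.

```c
#include <stddef.h>
#include <string.h>

/* Same literal as the status string used in maria_show_status(). */
static const char status_error[] = "error";

/* sizeof on a char array counts the terminating '\0' as well. */
static size_t length_with_terminator(void)
{
    return sizeof(status_error);
}

/* Subtracting 1 gives the visible length, the same value strlen() returns. */
static size_t length_without_terminator(void)
{
    return sizeof(status_error) - 1;
}
```

Passing the first value as a length makes the terminator part of the printed status, which is what the `sizeof(...) - 1` changes in the diff avoid.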
storage/maria/ma_checkpoint.c:
new prototype
storage/maria/ma_control_file.c:
Storing the number of consecutive recovery failures in one byte of the control file.
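A rough sketch of how one extra byte can be appended to the changeable part while older, shorter files stay readable. The offsets below are assumptions chosen for illustration (the real ones are the CF_* macros in ma_control_file.c), and the `sk_*` names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout; not the real control file offsets. */
enum {
    SK_TRID_OFFSET       = 15,
    SK_TRID_SIZE         = 6,
    SK_RECOV_FAIL_OFFSET = SK_TRID_OFFSET + SK_TRID_SIZE,
    SK_RECOV_FAIL_SIZE   = 1,
    SK_TOTAL_SIZE        = SK_RECOV_FAIL_OFFSET + SK_RECOV_FAIL_SIZE
};

/* Store the counter in its single byte of the changeable part. */
static void sk_store_failures(unsigned char *changeable, uint8_t failures)
{
    changeable[SK_RECOV_FAIL_OFFSET] = failures;
}

/*
  Read the counter back. A file written before the field existed has a
  shorter changeable part, so a missing byte is treated as "no failures".
*/
static uint8_t sk_load_failures(const unsigned char *changeable,
                                size_t size_on_disk)
{
    if (size_on_disk >= (size_t)(SK_RECOV_FAIL_OFFSET + SK_RECOV_FAIL_SIZE))
        return changeable[SK_RECOV_FAIL_OFFSET];
    return 0;
}
```

The size check plays the same role as the `new_cf_changeable_size >= ...` test added to ma_control_file_open().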
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_init.c:
new prototype
storage/maria/ma_locking.c:
Need to update open_count on disk at first write and close for transactional tables, like we already did for
non-transactional tables, otherwise we cannot notice that the table is dubious.
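Why open_count has to reach the disk can be shown with a toy model. This is a sketch under simplifying assumptions: a struct field stands in for the on-disk state, and the `sk_*` names are hypothetical rather than taken from ma_locking.c.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t open_count_on_disk;  /* what a later open would read back */
    uint16_t open_count;          /* in-memory value */
    bool     changed;             /* set once this handle has written */
    bool     temporary;           /* temp tables are never persisted */
} sk_table;

/* First write through a handle: bump the count and persist it. */
static void sk_mark_file_changed(sk_table *t)
{
    if (t->changed)
        return;
    t->changed = true;
    t->open_count++;
    if (!t->temporary)
        t->open_count_on_disk = t->open_count;
}

/* Clean close: undo the bump and persist the result. */
static void sk_close(sk_table *t)
{
    if (!t->changed)
        return;
    t->changed = false;
    t->open_count--;
    if (!t->temporary)
        t->open_count_on_disk = t->open_count;
}

/* A non-zero count on disk means some writer never closed the table. */
static bool sk_looks_dubious(const sk_table *t)
{
    return t->open_count_on_disk != 0;
}
```

If the process dies between sk_mark_file_changed() and sk_close(), the non-zero value stays on disk, so only that table needs a check at the next open.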
storage/maria/ma_loghandler.c:
translog_is_log_files() is made more generic (and renamed to translog_walk_filenames()) so that it can serve either to search for
or to delete logs (the latter is for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE).
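The callback-driven walk can be sketched without touching the file system. The version below assumes the names arrive as an array instead of coming from my_dir(), and that a log name is "maria_log." followed by eight digits, as the index checks in the diff suggest; all `sk_*` names are hypothetical.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef bool (*sk_log_callback)(const char *directory, const char *filename);

/* True if name is "maria_log." followed by exactly eight digits. */
static bool sk_is_log_name(const char *name)
{
    static const char prefix[] = "maria_log.";
    const size_t prefix_len = sizeof(prefix) - 1;
    size_t i;

    if (strncmp(name, prefix, prefix_len) != 0)
        return false;
    if (strlen(name) != prefix_len + 8)
        return false;
    for (i = prefix_len; i < prefix_len + 8; i++)
        if (!isdigit((unsigned char) name[i]))
            return false;
    return true;
}

/*
  Apply cb to every log-like name. A callback returning true interrupts the
  walk and makes it return true; otherwise false is returned at the end.
*/
static bool sk_walk(const char *directory, const char *const *names,
                    size_t count, sk_log_callback cb)
{
    size_t i;
    for (i = 0; i < count; i++)
        if (sk_is_log_name(names[i]) && cb(directory, names[i]))
            return true;
    return false;
}

/* "Search" callback: stop at the first log found. */
static bool sk_found_one(const char *directory, const char *filename)
{
    (void) directory;
    (void) filename;
    return true;
}
```

A deleting callback would return false on success so that the walk visits every log, and true only when a removal fails.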
storage/maria/ma_loghandler.h:
export function because ha_maria::mark_recovery_start() needs it
storage/maria/ma_recovery.c:
changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_recovery.h:
changing name of maria_recover() to distinguish from the maria-recover option.
storage/maria/ma_test_force_start.pl:
Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover).
This is standalone because mysql-test-run does not support testing that multiple mysqld restarts failed as expected.
I'll have to run it on my machine and also on a Windows machine.
storage/maria/unittest/ma_control_file-t.c:
adding recovery_failures to the test
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
fix for compiler warning (unused variable in non-debug build)
Diffstat (limited to 'storage/maria')
-rw-r--r-- | storage/maria/ha_maria.cc | 154 |
-rw-r--r-- | storage/maria/ma_checkpoint.c | 3 |
-rw-r--r-- | storage/maria/ma_control_file.c | 72 |
-rw-r--r-- | storage/maria/ma_control_file.h | 6 |
-rw-r--r-- | storage/maria/ma_init.c | 2 |
-rw-r--r-- | storage/maria/ma_locking.c | 17 |
-rw-r--r-- | storage/maria/ma_loghandler.c | 106 |
-rw-r--r-- | storage/maria/ma_loghandler.h | 4 |
-rw-r--r-- | storage/maria/ma_recovery.c | 17 |
-rw-r--r-- | storage/maria/ma_recovery.h | 2 |
-rwxr-xr-x | storage/maria/ma_test_force_start.pl | 179 |
-rw-r--r-- | storage/maria/unittest/ma_control_file-t.c | 45 |
-rw-r--r-- | storage/maria/unittest/ma_test_loghandler_multigroup-t.c | 8 |
13 files changed, 503 insertions, 112 deletions
diff --git a/storage/maria/ha_maria.cc b/storage/maria/ha_maria.cc index 3a4f2b6df23..a6339b28332 100644 --- a/storage/maria/ha_maria.cc +++ b/storage/maria/ha_maria.cc @@ -49,13 +49,11 @@ ulong pagecache_division_limit, pagecache_age_threshold; ulonglong pagecache_buffer_size; /** - @todo For now there is no way for a user to set a different value of - maria_recover_options, i.e. auto-check-and-repair is always disabled. - We could enable it. As the auto-repair is initiated when opened from the - SQL layer (open_unireg_entry(), check_and_repair()), it does not happen - when Maria's Recovery internally opens the table to apply log records to - it, which is good. It would happen only after Recovery, if the table is - still corrupted. + As the auto-repair is initiated when opened from the SQL layer + (open_unireg_entry(), check_and_repair()), it does not happen when Maria's + Recovery internally opens the table to apply log records to it, which is + good. It would happen only after Recovery, if the table is still + corrupted. */ ulong maria_recover_options= HA_RECOVER_NONE; handlerton *maria_hton; @@ -63,7 +61,14 @@ handlerton *maria_hton; /* bits in maria_recover_options */ const char *maria_recover_names[]= { - "DEFAULT", "BACKUP", "FORCE", "QUICK", NullS + /* + Compared to MyISAM, "default" was renamed to "normal" as it collided with + SET var=default which sets to the var's default i.e. what happens when the + var is not set i.e. HA_RECOVER_NONE. + Another change is that OFF is used to disable, not ""; this is to have OFF + display in SHOW VARIABLES which is better than "". 
+ */ + "OFF", "NORMAL", "BACKUP", "FORCE", "QUICK", NullS }; TYPELIB maria_recover_typelib= { @@ -103,11 +108,13 @@ TYPELIB maria_sync_log_dir_typelib= maria_sync_log_dir_names, NULL }; -/** @brief Interval between background checkpoints in seconds */ +/** Interval between background checkpoints in seconds */ static ulong checkpoint_interval; static void update_checkpoint_interval(MYSQL_THD thd, struct st_mysql_sys_var *var, void *var_ptr, const void *save); +/** After that many consecutive recovery failures, remove logs */ +static ulong force_start_after_recovery_failures; static void update_log_file_size(MYSQL_THD thd, struct st_mysql_sys_var *var, void *var_ptr, const void *save); @@ -124,6 +131,17 @@ static MYSQL_SYSVAR_ULONG(checkpoint_interval, checkpoint_interval, " 'no automatic checkpoints' which makes sense only for testing.", NULL, update_checkpoint_interval, 30, 0, UINT_MAX, 1); +static MYSQL_SYSVAR_ULONG(force_start_after_recovery_failures, + force_start_after_recovery_failures, + /* + Read-only because setting it on the fly has no useful effect, + should be set on command-line. + */ + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY, + "Number of consecutive log recovery failures after which logs will be" + " automatically deleted to cure the problem; 0 (the default) disables" + " the feature.", NULL, NULL, 0, 0, UINT_MAX8, 1); + static MYSQL_SYSVAR_BOOL(page_checksum, maria_page_checksums, 0, "Maintain page checksums (can be overridden per table " "with PAGE_CHECKSUM clause in CREATE TABLE)", 0, 0, 1); @@ -175,6 +193,12 @@ static MYSQL_SYSVAR_ULONG(pagecache_division_limit, pagecache_division_limit, "The minimum percentage of warm blocks in key cache", 0, 0, 100, 1, 100, 1); +static MYSQL_SYSVAR_ENUM(recover, maria_recover_options, PLUGIN_VAR_OPCMDARG, + "Specifies how corrupted tables should be automatically repaired." 
+ " Possible values are \"NORMAL\" (the default), \"BACKUP\", \"FORCE\"," + " \"QUICK\", or \"OFF\" which is like not using the option.", + NULL, NULL, HA_RECOVER_NONE, &maria_recover_typelib); + static MYSQL_THDVAR_ULONG(repair_threads, PLUGIN_VAR_RQCMDARG, "Number of threads to use when repairing maria tables. The value of 1 " "disables parallel repair.", @@ -186,7 +210,7 @@ static MYSQL_THDVAR_ULONG(sort_buffer_size, PLUGIN_VAR_RQCMDARG, 0, 0, 8192*1024, 4, ~0L, 1); static MYSQL_THDVAR_ENUM(stats_method, PLUGIN_VAR_RQCMDARG, - "Specifies how maria index statistics collection code should threat " + "Specifies how maria index statistics collection code should treat " "NULLs. Possible values are \"nulls_unequal\", \"nulls_equal\", " "and \"nulls_ignored\".", 0, 0, 0, &maria_stats_method_typelib); @@ -870,6 +894,12 @@ int ha_maria::open(const char *name, int mode, uint test_if_locked) test_if_locked|= HA_OPEN_MMAP; #endif + if (unlikely(maria_recover_options != HA_RECOVER_NONE)) + { + /* user asked to trigger a repair if table was not properly closed */ + test_if_locked|= HA_OPEN_ABORT_IF_CRASHED; + } + if (!(file= maria_open(name, mode, test_if_locked | HA_OPEN_FROM_SQL_LAYER))) return (my_errno ? 
my_errno : -1); @@ -2728,7 +2758,7 @@ bool maria_show_status(handlerton *hton, stat_print_fn *print, enum ha_stat_type stat) { - char engine_name[]= "maria"; + const LEX_STRING *engine_name= hton_name(hton); switch (stat) { case HA_ENGINE_LOGS: { @@ -2745,8 +2775,8 @@ bool maria_show_status(handlerton *hton, if (first_file == 0) { const char error[]= "error"; - print(thd, engine_name, sizeof(engine_name), - STRING_WITH_LEN(""), error, sizeof(error)); + print(thd, engine_name->str, engine_name->length, + STRING_WITH_LEN(""), error, sizeof(error) - 1); break; } @@ -2762,7 +2792,7 @@ bool maria_show_status(handlerton *hton, if (!(stat= my_stat(file, &stat_buff, MYF(MY_WME)))) { status= error; - status_len= sizeof(error); + status_len= sizeof(error) - 1; length= my_snprintf(object, SHOW_MSG_LEN, "Size unknown ; %s", file); } else @@ -2770,23 +2800,23 @@ bool maria_show_status(handlerton *hton, if (first_needed == 0) { status= unknown; - status_len= sizeof(unknown); + status_len= sizeof(unknown) - 1; } else if (i < first_needed) { status= unneeded; - status_len= sizeof(unneeded); + status_len= sizeof(unneeded) - 1; } else { status= needed; - status_len= sizeof(needed); + status_len= sizeof(needed) - 1; } length= my_snprintf(object, SHOW_MSG_LEN, "Size %12lu ; %s", (ulong) stat->st_size, file); } - print(thd, engine_name, sizeof(engine_name), + print(thd, engine_name->str, engine_name->length, object, length, status, status_len); } break; @@ -2799,9 +2829,90 @@ bool maria_show_status(handlerton *hton, return 0; } + +/** + Callback to delete all logs in directory. This is lower-level than other + functions in ma_loghandler.c which delete logs, as it does not rely on + translog_init() having been called first. 
+ + @param directory directory where file is + @param filename base name of the file to delete +*/ + +static my_bool translog_callback_delete_all(const char *directory, + const char *filename) +{ + char complete_name[FN_REFLEN]; + fn_format(complete_name, filename, directory, "", MYF(MY_UNPACK_FILENAME)); + return my_delete(complete_name, MYF(MY_WME)); +} + + +/** + Helper function for option maria-force-start-after-recovery-failures. + Deletes logs if too many failures. Otherwise, increments the counter of + failures in the control file. + Notice how this has to be called _before_ translog_init() (if log is + corrupted, translog_init() might crash the server, so we need to remove logs + before). + + @param log_dir directory where logs to be deleted are +*/ + +static int mark_recovery_start(const char* log_dir) +{ + int res; + DBUG_ENTER("mark_recovery_start"); + if (unlikely(maria_recover_options == HA_RECOVER_NONE)) + ma_message_no_user(ME_JUST_WARNING, "Please consider using option" + " --maria-recover[=...] to automatically check and" + " repair tables when logs are removed by option" + " --maria-force-start-after-recovery-failures=#"); + if (recovery_failures >= force_start_after_recovery_failures) + { + /* + Remove logs which cause the problem; keep control file which has + critical info like uuid, max_trid (removing control file may make + correct tables look corrupted!). + */ + char msg[100]; + res= translog_walk_filenames(log_dir, &translog_callback_delete_all); + my_snprintf(msg, sizeof(msg), + "%s logs after %u consecutive failures of" + " recovery from logs", + (res ? "failed to remove some" : "removed all"), + recovery_failures); + ma_message_no_user((res ? 0 : ME_JUST_WARNING), msg); + } + else + res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, + max_trid_in_control_file, + recovery_failures + 1); + DBUG_RETURN(res); +} + + +/** + Helper function for option maria-force-start-after-recovery-failures. 
+ Records in the control file that recovery was a success, so that it's not + counted for maria-force-start-after-recovery-failures. +*/ + +static int mark_recovery_success(void) +{ + /* success of recovery, reset recovery_failures: */ + int res; + DBUG_ENTER("mark_recovery_success"); + res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, + max_trid_in_control_file, 0); + DBUG_RETURN(res); +} + + static int ha_maria_init(void *p) { int res; + const char *log_dir= maria_data_root; maria_hton= (handlerton *)p; maria_hton->state= SHOW_OPTION_YES; maria_hton->db_type= DB_TYPE_UNKNOWN; @@ -2816,6 +2927,8 @@ static int ha_maria_init(void *p) bzero(maria_log_pagecache, sizeof(*maria_log_pagecache)); maria_tmpdir= &mysql_tmpdir_list; /* For REDO */ res= maria_init() || ma_control_file_open(TRUE, TRUE) || + ((force_start_after_recovery_failures != 0) && + mark_recovery_start(log_dir)) || !init_pagecache(maria_pagecache, (size_t) pagecache_buffer_size, pagecache_division_limit, pagecache_age_threshold, maria_block_size, 0) || @@ -2825,7 +2938,8 @@ static int ha_maria_init(void *p) translog_init(maria_data_root, log_file_size, MYSQL_VERSION_ID, server_id, maria_log_pagecache, TRANSLOG_DEFAULT_FLAGS, 0) || - maria_recover() || + maria_recovery_from_log() || + ((force_start_after_recovery_failures != 0) && mark_recovery_success()) || ma_checkpoint_init(checkpoint_interval); maria_multi_threaded= TRUE; return res ? 
HA_ERR_INITIALIZATION : 0; @@ -2913,6 +3027,7 @@ my_bool ha_maria::register_query_cache_table(THD *thd, char *table_name, static struct st_mysql_sys_var* system_variables[]= { MYSQL_SYSVAR(block_size), MYSQL_SYSVAR(checkpoint_interval), + MYSQL_SYSVAR(force_start_after_recovery_failures), MYSQL_SYSVAR(page_checksum), MYSQL_SYSVAR(log_dir_path), MYSQL_SYSVAR(log_file_size), @@ -2921,6 +3036,7 @@ static struct st_mysql_sys_var* system_variables[]= { MYSQL_SYSVAR(pagecache_age_threshold), MYSQL_SYSVAR(pagecache_buffer_size), MYSQL_SYSVAR(pagecache_division_limit), + MYSQL_SYSVAR(recover), MYSQL_SYSVAR(repair_threads), MYSQL_SYSVAR(sort_buffer_size), MYSQL_SYSVAR(stats_method), diff --git a/storage/maria/ma_checkpoint.c b/storage/maria/ma_checkpoint.c index 36db37f0d4d..f815a7cf75c 100644 --- a/storage/maria/ma_checkpoint.c +++ b/storage/maria/ma_checkpoint.c @@ -245,7 +245,8 @@ static int really_execute_checkpoint(void) that log was flushed before we write to the control file). */ if (unlikely(ma_control_file_write_and_force(lsn, last_logno, - max_trid_in_control_file))) + max_trid_in_control_file, + recovery_failures))) { translog_unlock(); goto err; diff --git a/storage/maria/ma_control_file.c b/storage/maria/ma_control_file.c index e6018a4b847..84fae2a9f7b 100644 --- a/storage/maria/ma_control_file.c +++ b/storage/maria/ma_control_file.c @@ -39,6 +39,8 @@ Start of changeable part: - Checksum of changeable part - LSN of last checkpoint - Number of last log file + - Max trid in control file (since Maria 1.5 May 2008) + - Number of consecutive recovery failures (since Maria 1.5 May 2008) ..... Here we can add new variables without changing format The idea is that one can add new variables to the control file and still @@ -80,7 +82,9 @@ one should increment the control file version number. 
#define CF_FILENO_SIZE 4 #define CF_MAX_TRID_OFFSET (CF_FILENO_OFFSET + CF_FILENO_SIZE) #define CF_MAX_TRID_SIZE TRANSID_SIZE -#define CF_CHANGEABLE_TOTAL_SIZE (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE) +#define CF_RECOV_FAIL_OFFSET (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE) +#define CF_RECOV_FAIL_SIZE 1 +#define CF_CHANGEABLE_TOTAL_SIZE (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE) /* The following values should not be changed, except when changing version @@ -109,6 +113,12 @@ uint32 last_logno= FILENO_IMPOSSIBLE; TrID max_trid_in_control_file= 0; /** + Number of consecutive log or recovery failures. Reset to 0 after recovery's + success. +*/ +uint8 recovery_failures= 0; + +/** @brief If log's lock should be asserted when writing to control file. Can be re-used by any function which needs to be thread-safe except when @@ -188,7 +198,7 @@ static CONTROL_FILE_ERROR create_control_file(const char *name, /* init the file with these "undefined" values */ DBUG_RETURN(ma_control_file_write_and_force(LSN_IMPOSSIBLE, - FILENO_IMPOSSIBLE, 0)); + FILENO_IMPOSSIBLE, 0, 0)); } @@ -420,6 +430,9 @@ CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing, if (new_cf_changeable_size >= (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE)) max_trid_in_control_file= transid_korr(buffer + new_cf_create_time_size + CF_MAX_TRID_OFFSET); + if (new_cf_changeable_size >= (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE)) + recovery_failures= + (buffer + new_cf_create_time_size + CF_RECOV_FAIL_OFFSET)[0]; ok: DBUG_RETURN(0); @@ -436,19 +449,21 @@ err: /* Write information durably to the control file; stores this information into - the last_checkpoint_lsn, last_logno, max_trid_in_control_file global - variables. + the last_checkpoint_lsn, last_logno, max_trid_in_control_file, + recovery_failures global variables. 
Called when we have created a new log (after syncing this log's creation), - when we have written a checkpoint (after syncing this log record), and at - shutdown (for storing trid in case logs are soon removed by user). + when we have written a checkpoint (after syncing this log record), at + shutdown (for storing trid in case logs are soon removed by user), and + before and after recovery (to store recovery_failures). Variables last_checkpoint_lsn and last_logno must be protected by caller using log's lock, unless this function is called at startup. SYNOPSIS ma_control_file_write_and_force() - checkpoint_lsn LSN of last checkpoint - logno last log file number - trid maximum transaction longid. + last_checkpoint_lsn_arg LSN of last checkpoint + last_logno_arg last log file number + max_trid_arg maximum transaction longid + recovery_failures_arg consecutive recovery failures NOTE We always want to do one single my_pwrite() here to be as atomic as @@ -459,17 +474,26 @@ err: 1 - Error */ -int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, - TrID trid) +int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg, + uint32 last_logno_arg, + TrID max_trid_arg, + uint8 recovery_failures_arg) { uchar buffer[CF_MAX_SIZE]; uint32 sum; + my_bool no_need_sync; DBUG_ENTER("ma_control_file_write_and_force"); - if ((last_checkpoint_lsn == checkpoint_lsn) && - (last_logno == logno) && - (max_trid_in_control_file == trid)) - DBUG_RETURN(0); /* no need to write */ + /* + We don't need to sync if this is just an increase of + recovery_failures: it's even good if that counter is not increased on disk + in case of power or hardware failure (less false positives when removing + logs). 
+ */ + no_need_sync= ((last_checkpoint_lsn == last_checkpoint_lsn_arg) && + (last_logno == last_logno_arg) && + (max_trid_in_control_file == max_trid_arg) && + (recovery_failures_arg > 0)); if (control_file_fd < 0) DBUG_RETURN(1); @@ -479,9 +503,10 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, translog_lock_handler_assert_owner(); #endif - lsn_store(buffer + CF_LSN_OFFSET, checkpoint_lsn); - int4store(buffer + CF_FILENO_OFFSET, logno); - transid_store(buffer + CF_MAX_TRID_OFFSET, trid); + lsn_store(buffer + CF_LSN_OFFSET, last_checkpoint_lsn_arg); + int4store(buffer + CF_FILENO_OFFSET, last_logno_arg); + transid_store(buffer + CF_MAX_TRID_OFFSET, max_trid_arg); + (buffer + CF_RECOV_FAIL_OFFSET)[0]= recovery_failures_arg; if (cf_changeable_size > CF_CHANGEABLE_TOTAL_SIZE) { @@ -514,12 +539,13 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, if (my_pwrite(control_file_fd, buffer, cf_changeable_size, cf_create_time_size, MYF(MY_FNABP | MY_WME)) || - my_sync(control_file_fd, MYF(MY_WME))) + (!no_need_sync && my_sync(control_file_fd, MYF(MY_WME)))) DBUG_RETURN(1); - last_checkpoint_lsn= checkpoint_lsn; - last_logno= logno; - max_trid_in_control_file= trid; + last_checkpoint_lsn= last_checkpoint_lsn_arg; + last_logno= last_logno_arg; + max_trid_in_control_file= max_trid_arg; + recovery_failures= recovery_failures_arg; cf_changeable_size= CF_CHANGEABLE_TOTAL_SIZE; /* no more warning */ DBUG_RETURN(0); @@ -558,7 +584,7 @@ int ma_control_file_end(void) */ last_checkpoint_lsn= LSN_IMPOSSIBLE; last_logno= FILENO_IMPOSSIBLE; - max_trid_in_control_file= 0; + max_trid_in_control_file= recovery_failures= 0; DBUG_RETURN(close_error); } diff --git a/storage/maria/ma_control_file.h b/storage/maria/ma_control_file.h index 52001cd4a4c..4cb5527620d 100644 --- a/storage/maria/ma_control_file.h +++ b/storage/maria/ma_control_file.h @@ -44,6 +44,8 @@ extern uint32 last_logno; extern TrID max_trid_in_control_file; +extern uint8 
recovery_failures; + extern my_bool maria_multi_threaded, maria_in_recovery; typedef enum enum_control_file_error { @@ -63,7 +65,9 @@ typedef enum enum_control_file_error { C_MODE_START CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing, my_bool print_error); -int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, TrID trid); +int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg, + uint32 last_logno_arg, TrID max_trid_arg, + uint8 recovery_failures_arg); int ma_control_file_end(void); my_bool ma_control_file_inited(void); C_MODE_END diff --git a/storage/maria/ma_init.c b/storage/maria/ma_init.c index f81afda2141..c13f01001b3 100644 --- a/storage/maria/ma_init.c +++ b/storage/maria/ma_init.c @@ -86,7 +86,7 @@ void maria_end(void) from the log, as it cannot process REDOs). */ (void)ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, - trid); + trid, recovery_failures); } trnman_destroy(); if (translog_status == TRANSLOG_OK) diff --git a/storage/maria/ma_locking.c b/storage/maria/ma_locking.c index 4ec242fd927..89a8cad26f1 100644 --- a/storage/maria/ma_locking.c +++ b/storage/maria/ma_locking.c @@ -381,7 +381,7 @@ int _ma_test_if_changed(register MARIA_HA *info) tells us if the MARIA file wasn't properly closed. (This is true if my_disable_locking is set). - open_count is not maintained on disk for transactional or temporary tables. + open_count is not maintained on disk for temporary tables. */ int _ma_mark_file_changed(MARIA_HA *info) @@ -400,11 +400,16 @@ int _ma_mark_file_changed(MARIA_HA *info) share->state.open_count++; } /* - temp tables don't need an open_count as they are removed on crash; - transactional tables are fixed by log-based recovery, so don't need an - open_count either (and we thus avoid the disk write below). + Temp tables don't need an open_count as they are removed on crash. 
+ In theory transactional tables are fixed by log-based recovery, so don't + need an open_count either, but if recovery has failed and logs have been + removed (by maria-force-start-after-recovery-failures), we still need to + detect dubious tables. + If we didn't maintain open_count on disk for a table, after a crash + we wouldn't know if it was closed at crash time (thus does not need a + check) or not. So we would have to check all tables: overkill. */ - if (!(share->temporary | share->base.born_transactional)) + if (!share->temporary) { mi_int2store(buff,share->state.open_count); buff[2]=1; /* Mark that it's changed */ @@ -471,7 +476,7 @@ int _ma_decrement_open_count(MARIA_HA *info) { share->state.open_count--; share->changed= 1; /* We have to update state */ - if (!(share->temporary | share->base.born_transactional)) + if (!share->temporary) { mi_int2store(buff,share->state.open_count); write_error= (int) my_pwrite(share->kfile.file, buff, sizeof(buff), diff --git a/storage/maria/ma_loghandler.c b/storage/maria/ma_loghandler.c index b6d12e3e975..4daef976b6e 100644 --- a/storage/maria/ma_loghandler.c +++ b/storage/maria/ma_loghandler.c @@ -17,6 +17,7 @@ #include "trnman.h" #include "ma_blockrec.h" /* for some constants and in-write hooks */ #include "ma_key_recover.h" /* For some in-write hooks */ +#include "ma_checkpoint.h" /* On Windows, neither my_open() nor my_sync() work for directories. @@ -1522,7 +1523,8 @@ static my_bool translog_create_new_file() DBUG_RETURN(1); if (ma_control_file_write_and_force(last_checkpoint_lsn, file_no, - max_trid_in_control_file)) + max_trid_in_control_file, + recovery_failures)) { translog_stop_writing(); DBUG_RETURN(1); @@ -3211,21 +3213,29 @@ static my_bool translog_truncate_log(TRANSLOG_ADDRESS addr) /** - @brief Check log files presence + Applies function 'callback' to all files (in a directory) which + name looks like a log's name (maria_log.[0-9]{7}). 
+ If 'callback' returns TRUE this interrupts the walk and returns + TRUE. Otherwise FALSE is returned after processing all log files. + It cannot just use log_descriptor.directory because that may not yet have + been initialized. - @retval 0 no log files. - @retval 1 there is at least 1 log file in the directory + @param directory directory to scan + @param callback function to apply; is passed directory and base + name of found file */ -my_bool translog_is_log_files() +my_bool translog_walk_filenames(const char *directory, + my_bool (*callback)(const char *, + const char *)) { MY_DIR *dirp; uint i; my_bool rc= FALSE; /* Finds and removes transaction log files */ - if (!(dirp = my_dir(log_descriptor.directory, MYF(MY_DONT_SORT)))) - return 1; + if (!(dirp = my_dir(directory, MYF(MY_DONT_SORT)))) + return FALSE; for (i= 0; i < dirp->number_off_files; i++) { @@ -3239,14 +3249,14 @@ my_bool translog_is_log_files() file[15] >= '0' && file[15] <= '9' && file[16] >= '0' && file[16] <= '9' && file[17] >= '0' && file[17] <= '9' && - file[18] == '\0') + file[18] == '\0' && (*callback)(directory, file)) { rc= TRUE; break; } } my_dirend(dirp); - return FALSE; + return rc; } @@ -3270,6 +3280,19 @@ static void translog_fill_overhead_table() /** + Callback to find first log in directory. 
+*/ + +static my_bool translog_callback_search_first(const char *directory + __attribute__((unused)), + const char *filename + __attribute__((unused))) +{ + return TRUE; +} + + +/** @brief Checks that chunk is LSN one @param type type of the chunk @@ -3353,7 +3376,7 @@ my_bool translog_init_with_table(const char *directory, my_init_dynamic_array(&log_descriptor.unfinished_files, sizeof(struct st_file_counter), 10, 10)) - DBUG_RETURN(1); + goto err; log_descriptor.min_need_file= 0; log_descriptor.min_file_number= 0; log_descriptor.last_lsn_checked= LSN_IMPOSSIBLE; @@ -3367,7 +3390,7 @@ my_bool translog_init_with_table(const char *directory, my_errno= errno; DBUG_PRINT("error", ("Error %d during opening directory '%s'", errno, log_descriptor.directory)); - DBUG_RETURN(1); + goto err; } #endif log_descriptor.in_buffers_only= LSN_IMPOSSIBLE; @@ -3417,7 +3440,7 @@ my_bool translog_init_with_table(const char *directory, for (i= 0; i < TRANSLOG_BUFFERS_NO; i++) { if (translog_buffer_init(log_descriptor.buffers + i)) - DBUG_RETURN(1); + goto err; #ifndef DBUG_OFF log_descriptor.buffers[i].buffer_no= (uint8) i; #endif @@ -3461,7 +3484,8 @@ my_bool translog_init_with_table(const char *directory, log_descriptor.horizon= last_page= MAKE_LSN(last_logno, 0); if (translog_get_last_page_addr(&last_page, &pageok, no_errors)) { - if (!translog_is_log_files()) + if (!translog_walk_filenames(log_descriptor.directory, + &translog_callback_search_first)) { /* Files was deleted, just start from the next log number, so that @@ -3472,7 +3496,7 @@ my_bool translog_init_with_table(const char *directory, logs_found= 0; } else - DBUG_RETURN(1); + goto err; } else if (LSN_OFFSET(last_page) == 0) { @@ -3485,7 +3509,7 @@ my_bool translog_init_with_table(const char *directory, { last_page-= LSN_ONE_FILE; if (translog_get_last_page_addr(&last_page, &pageok, 0)) - DBUG_RETURN(1); + goto err; } } if (logs_found) @@ -3497,7 +3521,7 @@ my_bool translog_init_with_table(const char *directory, if 
(allocate_dynamic(&log_descriptor.open_files, log_descriptor.max_file - log_descriptor.min_file + 1)) - DBUG_RETURN(1); + goto err; for (i = log_descriptor.max_file; i >= log_descriptor.min_file; i--) { /* @@ -3526,10 +3550,10 @@ my_bool translog_init_with_table(const char *directory, if (file) { free(file); - DBUG_RETURN(1); + goto err; } else - DBUG_RETURN(1); + goto err; } translog_file_init(file, i, 1); /* we allocated space so it can't fail */ @@ -3543,7 +3567,7 @@ my_bool translog_init_with_table(const char *directory, { /* There is no logs and there is read-only mode => nothing to read */ DBUG_PRINT("error", ("No logs and read-only mode")); - DBUG_RETURN(1); + goto err; } if (logs_found) @@ -3568,7 +3592,7 @@ my_bool translog_init_with_table(const char *directory, TRANSLOG_ADDRESS current_file_last_page; current_file_last_page= current_page; if (translog_get_last_page_addr(&current_file_last_page, &pageok, 0)) - DBUG_RETURN(1); + goto err; if (!pageok) { DBUG_PRINT("error", ("File %lu have no complete last page", @@ -3585,7 +3609,7 @@ my_bool translog_init_with_table(const char *directory, uchar *page; data.addr= &current_page; if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL) - DBUG_RETURN(1); + goto err; if (data.was_recovered) { DBUG_PRINT("error", ("file no: %lu (%d) " @@ -3614,7 +3638,7 @@ my_bool translog_init_with_table(const char *directory, { /* Panic!!! 
Even page which should be valid is invalid */ /* TODO: issue error */ - DBUG_RETURN(1); + goto err; } DBUG_PRINT("info", ("Last valid page is in file: %lu " "offset: %lu (0x%lx) " @@ -3639,7 +3663,7 @@ my_bool translog_init_with_table(const char *directory, LSN_FILE_NO(log_descriptor.horizon)); if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL || (chunk_offset= translog_get_first_chunk_offset(page)) == 0) - DBUG_RETURN(1); + goto err; /* Puts filled part of old page in the buffer */ log_descriptor.horizon= last_valid_page; @@ -3654,7 +3678,7 @@ my_bool translog_init_with_table(const char *directory, uint16 chunk_length; if ((chunk_length= translog_get_total_chunk_length(page, chunk_offset)) == 0) - DBUG_RETURN(1); + goto err; DBUG_PRINT("info", ("chunk: offset: %u length: %u", (uint) chunk_offset, (uint) chunk_length)); chunk_offset+= chunk_length; @@ -3690,7 +3714,7 @@ my_bool translog_init_with_table(const char *directory, open_files, 0, TRANSLOG_FILE **))-> handler.file)) - DBUG_RETURN(1); + goto err; version_changed= (info.maria_version != TRANSLOG_VERSION_ID); } } @@ -3702,25 +3726,26 @@ my_bool translog_init_with_table(const char *directory, MYF(0)); DBUG_PRINT("info", ("The log is not found => we will create new log")); if (file == NULL) - DBUG_RETURN(1); + goto err; /* Start new log system from scratch */ log_descriptor.horizon= MAKE_LSN(start_file_num, TRANSLOG_PAGE_SIZE); /* header page */ if ((file->handler.file= create_logfile_by_number_no_cache(start_file_num)) == -1) - DBUG_RETURN(1); + goto err; translog_file_init(file, start_file_num, 0); if (insert_dynamic(&log_descriptor.open_files, (uchar*)&file)) - DBUG_RETURN(1); + goto err; log_descriptor.min_file= log_descriptor.max_file= start_file_num; if (translog_write_file_header()) - DBUG_RETURN(1); + goto err; DBUG_ASSERT(log_descriptor.max_file - log_descriptor.min_file + 1 == log_descriptor.open_files.elements); if (ma_control_file_write_and_force(checkpoint_lsn, start_file_num, - 
max_trid_in_control_file)) - DBUG_RETURN(1); + max_trid_in_control_file, + recovery_failures)) + goto err; /* assign buffer 0 */ translog_start_buffer(log_descriptor.buffers, &log_descriptor.bc, 0); translog_new_page_header(&log_descriptor.horizon, &log_descriptor.bc); @@ -3734,7 +3759,7 @@ my_bool translog_init_with_table(const char *directory, log_descriptor.horizon= LSN_REPLACE_OFFSET(log_descriptor.horizon, TRANSLOG_PAGE_SIZE); if (translog_create_new_file()) - DBUG_RETURN(1); + goto err; /* Buffer system left untouched after recovery => we should init it (starting from buffer 0) @@ -3767,7 +3792,7 @@ my_bool translog_init_with_table(const char *directory, id_to_share= (MARIA_SHARE **) my_malloc(SHARE_ID_MAX * sizeof(MARIA_SHARE*), MYF(MY_WME | MY_ZEROFILL)); if (unlikely(!id_to_share)) - DBUG_RETURN(1); + goto err; id_to_share--; /* min id is 1 */ /* Check the last LSN record integrity */ @@ -3783,7 +3808,7 @@ my_bool translog_init_with_table(const char *directory, page_addr= (log_descriptor.horizon - ((log_descriptor.horizon - 1) % TRANSLOG_PAGE_SIZE + 1)); if (translog_scanner_init(page_addr, 1, &scanner, 1)) - DBUG_RETURN(1); + goto err; scanner.page_offset= page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]]; for (;;) { @@ -3797,7 +3822,7 @@ my_bool translog_init_with_table(const char *directory, if (translog_get_next_chunk(&scanner)) { translog_destroy_scanner(&scanner); - DBUG_RETURN(1); + goto err; } if (scanner.page != END_OF_LOG) chunk_1byte= scanner.page[scanner.page_offset]; @@ -3808,7 +3833,7 @@ my_bool translog_init_with_table(const char *directory, if (translog_get_next_chunk(&scanner)) { translog_destroy_scanner(&scanner); - DBUG_RETURN(1); + goto err; } if (scanner.page == END_OF_LOG) break; /* it was the last record */ @@ -3845,7 +3870,7 @@ my_bool translog_init_with_table(const char *directory, } translog_destroy_scanner(&scanner); if (translog_scanner_init(page_addr, 1, &scanner, 1)) - DBUG_RETURN(1); + goto err; scanner.page_offset= 
page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]]; } translog_destroy_scanner(&scanner); @@ -3872,7 +3897,7 @@ my_bool translog_init_with_table(const char *directory, else if (translog_truncate_log(last_lsn)) { translog_free_record_header(&rec); - DBUG_RETURN(1); + goto err; } } else @@ -3898,7 +3923,7 @@ my_bool translog_init_with_table(const char *directory, else if (translog_truncate_log(last_lsn)) { translog_free_record_header(&rec); - DBUG_RETURN(1); + goto err; } } } @@ -3907,6 +3932,9 @@ my_bool translog_init_with_table(const char *directory, } } DBUG_RETURN(0); +err: + ma_message_no_user(0, "log initialization failed"); + DBUG_RETURN(1); } diff --git a/storage/maria/ma_loghandler.h b/storage/maria/ma_loghandler.h index c21d9492cba..3cb18a0eb49 100644 --- a/storage/maria/ma_loghandler.h +++ b/storage/maria/ma_loghandler.h @@ -317,6 +317,10 @@ extern void translog_deassign_id_from_share(struct st_maria_share *share); extern void translog_assign_id_to_share_from_recovery(struct st_maria_share *share, uint16 id); +extern my_bool translog_walk_filenames(const char *directory, + my_bool (*callback)(const char *, + const char *)); + enum enum_translog_status { TRANSLOG_UNINITED, /* no initialization done or error during initialization */ diff --git a/storage/maria/ma_recovery.c b/storage/maria/ma_recovery.c index 2e162b4e07d..ec679609320 100644 --- a/storage/maria/ma_recovery.c +++ b/storage/maria/ma_recovery.c @@ -191,12 +191,12 @@ static void print_preamble() @retval !=0 Error */ -int maria_recover(void) +int maria_recovery_from_log(void) { int res= 1; FILE *trace_file; uint warnings_count; - DBUG_ENTER("maria_recover"); + DBUG_ENTER("maria_recovery_from_log"); DBUG_ASSERT(!maria_in_recovery); maria_in_recovery= TRUE; @@ -462,7 +462,12 @@ end: "Maria recovery failed. 
Please run maria_chk -r on all maria " "tables and delete all maria_log.######## files", MYF(0)); procent_printed= 0; - /* we don't cleanly close tables if we hit some error (may corrupt them) */ + /* + We don't cleanly close tables if we hit some error (may corrupt them by + flushing some wrong blocks made from wrong REDOs). It also leaves their + open_count>0, which ensures that --maria-recover, if used, will try to + repair them. + */ DBUG_RETURN(error); } @@ -1224,6 +1229,12 @@ static int new_table(uint16 sid, const char *name, LSN lsn_of_file_id) " maria_chk -r", share->open_file_name); error= -1; /* not fatal, try with other tables */ goto end; + /* + Note that if a first recovery fails to apply a REDO, it marks the table + corrupted and stops the entire recovery. A second recovery will find the + table is marked corrupted and skip it (and thus possibly handle other + tables). + */ } /* don't log any records for this work */ _ma_tmp_disable_logging_for_table(info, FALSE); diff --git a/storage/maria/ma_recovery.h b/storage/maria/ma_recovery.h index 56d75f16dde..aa8fa7ecae9 100644 --- a/storage/maria/ma_recovery.h +++ b/storage/maria/ma_recovery.h @@ -25,7 +25,7 @@ C_MODE_START enum maria_apply_log_way { MARIA_LOG_APPLY, MARIA_LOG_DISPLAY_HEADER, MARIA_LOG_CHECK }; -int maria_recover(void); +int maria_recovery_from_log(void); int maria_apply_log(LSN lsn, enum maria_apply_log_way apply, FILE *trace_file, my_bool execute_undo_phase, my_bool skip_DDLs, diff --git a/storage/maria/ma_test_force_start.pl b/storage/maria/ma_test_force_start.pl new file mode 100755 index 00000000000..db97e376004 --- /dev/null +++ b/storage/maria/ma_test_force_start.pl @@ -0,0 +1,179 @@ +#!/usr/bin/env perl + + +use strict; +use warnings; + +my $usage= <<EOF; +This program tests that the options +--maria-force-start-after-recovery-failures --maria-recover work as +expected. +It has to be run from directory mysql-test, and works with non-debug +and debug binaries. 
+Pass it option -d or -i (to test corruption of data or index file). +EOF + +# -d currently exhibits BUG#36578 +# "Maria: maria-recover may fail to autorepair a table" + +die($usage) if (@ARGV == 0); + +my $corrupt_index; + +if ($ARGV[0] eq '-d') + { + $corrupt_index= 0; + } +elsif ($ARGV[0] eq '-i') + { + $corrupt_index= 1; + } +else + { + die($usage); + } + +my $force_after= 3; +my $corrupt_file= $corrupt_index ? "MAI" : "MAD"; +my $corrupt_message= + "\\[ERROR\\] mysqld: Table '.\/test\/t1' is marked as crashed and should be repaired"; + +my $sql_name= "./var/tmp/create_table.sql"; +my $error_log_name= "./var/log/master.err"; +my @cmd_output; +my $whatever; # garbage data +my $base_server_cmd= "perl mysql-test-run.pl --mem --mysqld=--maria-force-start-after-recovery-failures=$force_after maria-recover"; +my $server_cmd; +my $client_cmd= "../client/mysql -u root -S var/tmp/master.sock test < $sql_name"; +my $server_pid_name="./var/run/master.pid"; +my $server_pid; +my $i; # count of server restarts +sub kill_server; + +print "starting mysqld\n"; +$server_cmd= $base_server_cmd . " --start-and-exit 2>&1"; +@cmd_output=`$server_cmd`; +die if $?; + +open(FILE, ">", $sql_name) or die; + +# To exhibit BUG#36578 with -d, we don't create an index if -d. This is +# because the presence of an index will cause repair-by-sort to be used, +# where sort_get_next_record() is only called inside +#_ma_create_index_by_sort(), so the latter function fails and in this +# case retry_repair is set, so bug does not happen. Whereas without +# an index, repair-with-key-cache is called, which calls +# sort_get_next_record() whose failure itself does not cause a retry. + +print FILE "create table t1 (a varchar(1000)". + ($corrupt_index ? 
", index(a)" : "") .") engine=maria;\n"; +print FILE <<EOF; +insert into t1 values("ThursdayMorningsMarket"); +# If Recovery executes REDO_INDEX_NEW_PAGE it will overwrite our +# intentional corruption; we make Recovery skip this record by bumping +# create_rename_lsn using OPTIMIZE TABLE. This also makes sure to put +# the pages on disk, so that we can corrupt them. +optimize table t1; +# mark table open, so that --maria-recover repairs it +insert into t1 select concat(a,'b') from t1 limit 1; +EOF +close FILE; + +print "creating table\n"; +`$client_cmd`; +die if $?; + +print "killing mysqld hard\n"; +kill_server(9); + +print "ruining " . + ($corrupt_index ? "first page of keys" : "bitmap page") . + " in table to test maria-recover\n"; +open(FILE, "+<", "./var/master-data/test/t1.$corrupt_file") or die; +$whatever= ("\xAB" x 100); +sysseek (FILE, $corrupt_index ? 8192 : (8192-100-100), 0) or die; +syswrite (FILE, $whatever) or die; +close FILE; + +print "ruining log to make recovery fail; mysqld should fail the $force_after first restarts\n"; +open(FILE, "+<", "./var/tmp/maria_log.00000001") or die; +$whatever= ("\xAB" x 8192); +sysseek (FILE, 99, 0) or die; +syswrite (FILE, $whatever) or die; +close FILE; + +$server_cmd= $base_server_cmd . " --start-dirty 2>&1"; +for($i= 1; $i <= $force_after; $i= $i + 1) + { + print "mysqld restart number $i... "; + unlink($error_log_name) or die; + `$server_cmd`; + # mysqld should return 1 when can't read log + die unless (($? >> 8) == 1); + open(FILE, "<", $error_log_name) or die; + @cmd_output= <FILE>; + close FILE; + die unless grep(/\[ERROR\] mysqld: Maria engine: log initialization failed/, @cmd_output); + die unless grep(/\[ERROR\] Plugin 'MARIA' init function returned error./, @cmd_output); + print "failed - ok\n"; + } + +print "mysqld restart number $i... 
"; +unlink($error_log_name) or die; +@cmd_output=`$server_cmd`; +die if $?; +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die unless grep(/\[Warning\] mysqld: Maria engine: removed all logs after [\d]+ consecutive failures of recovery from logs/, @cmd_output); +die unless grep(/\[ERROR\] mysqld: File '..\/tmp\/maria_log.00000001' not found \(Errcode: 2\)/, @cmd_output); +print "success - ok\n"; + +open(FILE, ">", $sql_name) or die; +print FILE <<EOF; +set global maria_recover=normal; +insert into t1 values('aaa'); +EOF +close FILE; + +# verify corruption has not yet been noticed +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die if grep(/$corrupt_message/, @cmd_output); + +print "inserting in table\n"; +`$client_cmd`; +die if $?; +print "table is usable - ok\n"; + +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die unless grep(/$corrupt_message/, @cmd_output); +die unless grep(/\[Warning\] Recovering table: '.\/test\/t1'/, @cmd_output); +print "was corrupted and automatically repaired - ok\n"; + +# remove our traces +kill_server(15); + +print "TEST ALL OK\n"; + +# kills mysqld with signal given in parameter +sub kill_server + { + my ($sig)= @_; + my $wait_count= 0; + open(FILE, "<", $server_pid_name) or die; + @cmd_output= <FILE>; + close FILE; + $server_pid= $cmd_output[0]; + die unless $server_pid > 0; + kill($sig, $server_pid) or die; + while (kill (0, $server_pid)) + { + print "waiting for mysqld to die\n" if ($wait_count > 30); + $wait_count= $wait_count + 1; + select(undef, undef, undef, 0.1); + } + } diff --git a/storage/maria/unittest/ma_control_file-t.c b/storage/maria/unittest/ma_control_file-t.c index f076615fef7..6702e4deb2f 100644 --- a/storage/maria/unittest/ma_control_file-t.c +++ b/storage/maria/unittest/ma_control_file-t.c @@ -45,6 +45,7 @@ char file_name[FN_REFLEN]; LSN expect_checkpoint_lsn; uint32 expect_logno; TrID expect_max_trid; +uint8 
expect_recovery_failures; static int delete_file(myf my_flags); /* @@ -55,10 +56,11 @@ static int close_file(void); /* wraps ma_control_file_end */ /* wraps ma_control_file_open_or_create */ static int open_file(void); /* wraps ma_control_file_write_and_force */ -static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid); +static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid, + uint8 rec_failures); /* Tests */ -static int test_one_log(void); +static int test_one_log_and_recovery_failures(void); static int test_five_logs_and_max_trid(void); static int test_3_checkpoints_and_2_logs(void); static int test_binary_content(void); @@ -135,7 +137,8 @@ int main(int argc,char *argv[]) RET_ERR_UNLESS(0 == delete_file(0)); /* if fails, can't continue */ diag("Tests of normal conditions"); - ok(0 == test_one_log(), "test of creating one log"); + ok(0 == test_one_log_and_recovery_failures(), + "test of creating one log and recording recovery failures"); ok(0 == test_five_logs_and_max_trid(), "test of creating five logs and many transactions"); ok(0 == test_3_checkpoints_and_2_logs(), @@ -167,7 +170,7 @@ static int delete_file(myf my_flags) my_delete(file_name, my_flags); expect_checkpoint_lsn= LSN_IMPOSSIBLE; expect_logno= FILENO_IMPOSSIBLE; - expect_max_trid= 0; + expect_max_trid= expect_recovery_failures= 0; return 0; } @@ -181,6 +184,7 @@ static int verify_module_values_match_expected(void) RET_ERR_UNLESS(last_logno == expect_logno); RET_ERR_UNLESS(last_checkpoint_lsn == expect_checkpoint_lsn); RET_ERR_UNLESS(max_trid_in_control_file == expect_max_trid); + RET_ERR_UNLESS(recovery_failures == expect_recovery_failures); return 0; } @@ -215,21 +219,28 @@ static int open_file(void) return 0; } -static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid) +static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid, + uint8 rec_failures) { - RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid) + 
RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid, + rec_failures) == 0); /* Check that the module reports expected information */ RET_ERR_UNLESS(verify_module_values_match_expected() == 0); return 0; } -static int test_one_log(void) +static int test_one_log_and_recovery_failures(void) { RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK); expect_logno= 123; RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); + expect_recovery_failures= 158; + RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, + max_trid_in_control_file, + expect_recovery_failures) == 0); RET_ERR_UNLESS(close_file() == 0); return 0; } @@ -245,7 +256,8 @@ static int test_five_logs_and_max_trid(void) { expect_logno*= 3; RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, - expect_max_trid) == 0); + expect_max_trid, + recovery_failures) == 0); } RET_ERR_UNLESS(close_file() == 0); return 0; @@ -260,23 +272,28 @@ static int test_3_checkpoints_and_2_logs(void) RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK); expect_checkpoint_lsn= MAKE_LSN(5, 10000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_logno= 17; RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_checkpoint_lsn= MAKE_LSN(17, 20000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_checkpoint_lsn= MAKE_LSN(17, 45000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_logno= 19; RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 
0); + max_trid_in_control_file, + recovery_failures) == 0); RET_ERR_UNLESS(close_file() == 0); return 0; } diff --git a/storage/maria/unittest/ma_test_loghandler_multigroup-t.c b/storage/maria/unittest/ma_test_loghandler_multigroup-t.c index 421aa5ffb29..55b8421c3db 100644 --- a/storage/maria/unittest/ma_test_loghandler_multigroup-t.c +++ b/storage/maria/unittest/ma_test_loghandler_multigroup-t.c @@ -129,10 +129,10 @@ static my_bool read_and_check_content(TRANSLOG_HEADER_BUFFER *rec, } static const char *load_default_groups[]= {"ma_unit_loghandler", 0}; -#if defined(__WIN__) -static const char *default_dbug_option= "d:t:i:O,\\ma_test_loghandler.trace"; -#else -static const char *default_dbug_option= "d:t:i:o,/tmp/ma_test_loghandler.trace"; +#ifndef DBUG_OFF +static const char *default_dbug_option= + IF_WIN("d:t:i:O,\\ma_test_loghandler.trace", + "d:t:i:o,/tmp/ma_test_loghandler.trace"); #endif static const char *opt_wfile= NULL; static const char *opt_rfile= NULL; |
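
Note on the `goto err` changes in ma_loghandler.c above: every early `DBUG_RETURN(1)` in translog_init_with_table() now jumps to a single `err:` label, so the "log initialization failed" message is printed once regardless of which step failed. The following is only a minimal standalone sketch of that pattern, not code from the patch. All `demo_*` names are hypothetical, and plain `fprintf`/`return` stand in for `ma_message_no_user()` and `DBUG_RETURN`.

```c
#include <stdio.h>

/* Hypothetical names throughout: none of these exist in Maria. */
static int demo_failure_messages = 0;   /* how many times the err path ran */

/* Stand-in for one initialization step; fails when step == fail_at. */
static int demo_step(int step, int fail_at)
{
  return step == fail_at;               /* non-zero means failure */
}

/* Returns 0 on success, 1 on failure; fail_at == 0 means no step fails. */
static int demo_log_init(int fail_at)
{
  int step;
  for (step = 1; step <= 3; step++)
  {
    if (demo_step(step, fail_at))
      goto err;                         /* previously: return 1 at each site */
  }
  return 0;
err:
  /* Single exit point for failures: report once, then return the error. */
  demo_failure_messages++;
  fprintf(stderr, "log initialization failed\n");
  return 1;
}
```

Calling `demo_log_init(0)` takes the success path and prints nothing, while `demo_log_init(2)` fails at the second step, prints the message once and returns 1. Routing every failure through one label is what lets the test script above grep the error log for a single, predictable line.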