author | Guilhem Bichot <guilhem@mysql.com> | 2008-06-02 22:53:25 +0200 |
---|---|---|
committer | Guilhem Bichot <guilhem@mysql.com> | 2008-06-02 22:53:25 +0200 |
commit | a5bcb63f45f58f7c5f4f2387da521aa7a14b60be (patch) | |
tree | 7eaa8ccde458e0c059e01272c49894176fd01dba | |
parent | 2d64cd05e1b9cd3b76368af7db34335b88bea248 (diff) | |
WL#4374 "Maria - force start if Recovery fails multiple times"
http://forge.mysql.com/worklog/task.php?id=4374
New option --maria-force-start-after-recovery-failures=N. The number of consecutive recovery failures (failures
of log reading or recovery processing, anything in [translog_init(),maria_recovery_from_log()])
is stored in the control file; if at a Maria start there are more than N of them, the logs are removed. This is for automated
systems which have to run whatever happens. As tables risk staying corrupted, --maria-recover should also
be used on them: this revision makes maria-recover work (it was disabled).
Fixed bug in translog_is_log_files(). translog_init() now prints a message to the error log if it fails.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
KNOWN_BUGS.txt:
The new option --maria-force-start-after-recovery-failures fulfils the wish "we should fix that if this happens etc", so that entry is removed.
LOAD INDEX has no longer been ignored for a few weeks now. The listed concurrency bugs were fixed some time ago.
Recovery of fulltext and GIS indexes has worked for a few weeks now.
mysql-test/include/maria_make_snapshot.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_comparison.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/include/maria_verify_recovery.inc:
configurable prefix in table's name (so far 't' or 't_corrupted')
mysql-test/lib/mtr_report.pl:
The new test maria-recover.test generates expected corruption warnings in the error log; these are now skipped. maria-recovery.test's corrupted table is renamed from t1 to t_corrupted1.
mysql-test/r/maria-preload.result:
result update. maria_pagecache_read* values are similar to the previous version of this file, though a bit bigger
because using the information_schema and the join leads to some internal maria temp table being used, and thus some
blocks of it being read.
mysql-test/r/maria-purge.result:
engine's name in SHOW ENGINE MARIA LOGS changed.
mysql-test/r/maria-recover.result:
result for new test. We see corruption messages at the first SELECT and none at the second SELECT, as expected.
mysql-test/r/maria-recovery.result:
result update
mysql-test/r/maria.result:
new variables show up
mysql-test/t/disabled.def:
BUG#34911 is not fixed but the test has been made independent of the bug (workaround). A new bug (crash) has popped up recently, so the test has to stay
disabled (BUG#35107).
mysql-test/t/maria-preload.test:
Work around BUG#34911 "FLUSH STATUS doesn't flush what it should":
compute differences in status variables before and after relevant queries
mysql-test/t/maria-recover-master.opt:
test --maria-recover
mysql-test/t/maria-recover.test:
Test of the --maria-recover option (build a corrupted table and see if it is auto-repaired)
mysql-test/t/maria-recovery-big.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery-bitmap.test:
update for new API of include/maria*.inc
mysql-test/t/maria-recovery.test:
update for new API of include/maria*.inc. Corrupted table t1 renamed to t_corrupted1, so that mtr_report.pl
does not blindly remove all corruption messages for t1 which is
a common name.
storage/maria/ha_maria.cc:
Enabling maria-recover.
Adding option and global variable --maria_force_start_after_recovery_failures: ha_maria_init()
calls mark_recovery_start() and mark_recovery_success() to keep track of failed consecutive recoveries
and remove logs if needed.
Removed \0 in the output of SHOW ENGINE MARIA LOGS; removed hard-coded engine name there.
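The failure bookkeeping described for ha_maria.cc can be sketched roughly as follows. This is a minimal, self-contained model and not the code of this revision: the struct control_state, the sketch_* function names and the in-memory stand-in for the control file are hypothetical. Only the general rule (count consecutive failures, remove the logs once there are more than N of them) comes from the commit message. Treating N = 0 as "feature disabled", and counting a recovery as failed until it is marked successful, are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory stand-in for the Maria control file. */
struct control_state {
  uint8_t recovery_failures;  /* consecutive failed recoveries, one byte */
  bool    logs_present;       /* whether the transaction logs still exist */
};

/*
  Called before log reading and recovery start.
  Returns true if the logs were removed in order to force a start.
  Assumption: n == 0 means the option is not in use.
*/
bool sketch_mark_recovery_start(struct control_state *cs, unsigned n)
{
  assert(cs != NULL);
  if (n != 0 && cs->recovery_failures > n) {
    cs->logs_present = false;   /* real code would delete the log files */
    cs->recovery_failures = 0;  /* start counting afresh */
    return true;
  }
  /* Assume failure until proven otherwise; saturate at one byte. */
  if (cs->recovery_failures < UINT8_MAX)
    cs->recovery_failures++;
  return false;
}

/* Called once recovery has completed without error. */
void sketch_mark_recovery_success(struct control_state *cs)
{
  assert(cs != NULL);
  cs->recovery_failures = 0;
}
```

With n = 2 in this model, three starts in a row that never reach sketch_mark_recovery_success() leave the counter at 3, and the fourth start removes the logs. The exact comparison and reset points in the real code may differ.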
storage/maria/ma_checkpoint.c:
new prototype
storage/maria/ma_control_file.c:
Store the number of consecutive recovery failures in one byte of the control file.
storage/maria/ma_control_file.h:
new prototype
storage/maria/ma_init.c:
new prototype
storage/maria/ma_locking.c:
Need to update open_count on disk at first write and close for transactional tables, like we already did for
non-transactional tables, otherwise we cannot notice that the table is dubious.
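Why the on-disk open_count matters can be illustrated with a small model. The struct and function names below are hypothetical and the real ma_locking.c logic is more involved (shared state, locking, several handles); this sketch only shows the idea that a count left non-zero on disk marks the table as dubious at the next open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one table and one handle on it. */
struct sketch_table {
  unsigned open_count_on_disk;  /* value persisted in the index file */
  bool     counted;             /* has this handle bumped the count? */
};

/* First write through the handle: record on disk that the table is in use. */
void sketch_first_write(struct sketch_table *t)
{
  assert(t != NULL);
  if (!t->counted) {
    t->open_count_on_disk++;
    t->counted = true;
  }
}

/* Clean close: undo the increment so the table looks clean again. */
void sketch_close(struct sketch_table *t)
{
  assert(t != NULL);
  if (t->counted) {
    assert(t->open_count_on_disk > 0);
    t->open_count_on_disk--;
    t->counted = false;
  }
}

/* At open time, a non-zero count means an earlier user never closed. */
bool sketch_is_dubious_on_open(const struct sketch_table *t)
{
  assert(t != NULL);
  return t->open_count_on_disk != 0;
}
```

If the server dies between sketch_first_write() and sketch_close(), the count stays at 1 on disk, which corresponds to the "1 client is using or hasn't closed the table properly" warning visible in maria-recover.result.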
storage/maria/ma_loghandler.c:
translog_is_log_files() is made more generic to serve either to search or to delete logs (the latter is
for --maria-force-start-after-recovery-failures). It also had a bug (always returned FALSE).
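One way to make a log-file scan serve both searching and deleting is to walk the file names and hand each match to a callback. The sketch below is an illustration under assumptions, not the function from ma_loghandler.c: all sketch_* names are hypothetical, the directory is replaced by an array of names, and "deleting" only counts files. The name pattern maria_log. followed by eight digits is taken from the file names shown elsewhere in this commit.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Callback invoked per log file; returning true stops the walk. */
typedef bool (*sketch_log_callback)(const char *name, void *arg);

/* True if name has the form "maria_log.NNNNNNNN" (eight decimal digits). */
bool sketch_is_log_file_name(const char *name)
{
  static const char prefix[] = "maria_log.";
  size_t prefix_len = sizeof(prefix) - 1;
  if (strncmp(name, prefix, prefix_len) != 0)
    return false;
  name += prefix_len;
  for (int i = 0; i < 8; i++)
    if (!isdigit((unsigned char) name[i]))
      return false;  /* also stops at an early terminating NUL */
  return name[8] == '\0';
}

/* Walks the names; returns true if a callback asked to stop. */
bool sketch_walk_log_files(const char *const *names, size_t count,
                           sketch_log_callback callback, void *arg)
{
  assert(callback != NULL);
  for (size_t i = 0; i < count; i++)
    if (sketch_is_log_file_name(names[i]) && callback(names[i], arg))
      return true;
  return false;
}

/* Search mode: the first log file found is enough. */
bool sketch_callback_found(const char *name, void *arg)
{
  (void) name;
  (void) arg;
  return true;
}

/* Delete mode: visit every log file; here we only count them. */
bool sketch_callback_delete(const char *name, void *arg)
{
  (void) name;
  ++*(size_t *) arg;  /* a real implementation would remove the file */
  return false;
}
```

Passing sketch_callback_found answers "are there any log files?", while passing sketch_callback_delete with a counter visits all of them; the walking code itself is shared between the two uses.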
storage/maria/ma_loghandler.h:
export function because ha_maria::mark_recovery_start() needs it
storage/maria/ma_recovery.c:
Renamed maria_recover() to maria_recovery_from_log() to distinguish it from the maria-recover option.
storage/maria/ma_recovery.h:
Renamed maria_recover() to maria_recovery_from_log() to distinguish it from the maria-recover option.
storage/maria/ma_test_force_start.pl:
Test of --maria-force-start-after-recovery-failures (and also, to be realistic, of --maria-recover).
This is standalone because mysql-test-run does not support testing that multiple mysqld restarts fail as expected.
I'll have to run it on my machine and also on a Windows machine.
storage/maria/unittest/ma_control_file-t.c:
adding recovery_failures to the test
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
fix for compiler warning (unused variable in non-debug build)
32 files changed, 751 insertions, 259 deletions
diff --git a/KNOWN_BUGS.txt b/KNOWN_BUGS.txt index 6ba40cf4550..3fd1a22d129 100644 --- a/KNOWN_BUGS.txt +++ b/KNOWN_BUGS.txt @@ -24,23 +24,9 @@ or in the worst case add it here for others to know! Known bugs that we are working on and will be fixed shortly =========================================================== -- If the log files are damaged or inconsistent, Maria may fail to start. - We should fix that if this happens and mysqld is restarted (thanks to - mysqld_safe, instance manager or other script) it should disregard the - old logs, start anyway and automaticly repair any tables that was found - to be crashed on open. - Temporary fix is to remove or maria_log.???????? files from the data - directory, restart mysqld and run CHECK TABLE / REPAIR TABLE or - mysqlcheck on your Maria tables - We have some instabilities in log writing that is under investigatation This causes mainly assert to triggers in the code and sometimes the log handler doesn't start up after restart. -- LOAD INDEX commands are for the moment ignored for Maria tables - (The code needs to be rewritten to do all reads through page cache to - avoid half-block reads) -- Some concurrency bugs in Maria's page cache which sometimes show up - under load http://bugs.mysql.com/bug.php?id=34161 and - http://bugs.mysql.com/bug.php?id=34634 . Known bugs that are planned to be fixed before Beta =================================================== @@ -61,19 +47,15 @@ Known bugs that are planned to be fixed before Beta Missing features that is planned to fix before Beta =================================================== -- We will add an maria-recover option to automaticly repair any - crashed tables on open. (This is needed for not transactional tables - and also in edge cases for transactional tables when the table - crashed because of a bug in MySQL or Maria code) - Multiple concurrent inserts & multiple concurrent readers at same time with full MVCC control. 
Note that UPDATE and DELETE will still be blocking (as with MyISAM) - COUNT(*) and TABLE CHECKSUM under MVCC (ie, they are instant and kept up to date even with multiple inserter) -- Recovery of fulltext and GIS indexes. Features planned for future releases ==================================== http://forge.mysql.com/worklog/ +(you can enter "maria" in the "quick search" field there). diff --git a/mysql-test/include/maria_make_snapshot.inc b/mysql-test/include/maria_make_snapshot.inc index b457f3e1a68..679a65552c1 100644 --- a/mysql-test/include/maria_make_snapshot.inc +++ b/mysql-test/include/maria_make_snapshot.inc @@ -10,28 +10,29 @@ # $mms_copy : to copy table from database to spare directory # $mms_reverse : to copy it back # $mms_compare_physically : to compare both byte-for-byte -# 2) set $mms_table_to_use to a number N: table will be mysqltest.tN +# 2) set $mms_tname to a string and set $mms_table_to_use to a number: tables +# will be mysqltest.$mms_tname$mms_table_to_use. # 3) set $mms_purpose to say what this copy is for (influences the naming # of the spare directory). 
if ($mms_copy) { - --echo * copied t$mms_table_to_use for $mms_purpose - copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD; - copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI; - copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.frm $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.frm; + --echo * copied $mms_tname$mms_table_to_use for $mms_purpose + copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD; + copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI; + copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.frm $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.frm; } if ($mms_reverse_copy) { # do not call this without flushing target table first! 
- --echo * copied t$mms_table_to_use back for $mms_purpose + --echo * copied $mms_tname$mms_table_to_use back for $mms_purpose -- error 0,1 - remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD; - copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD; + remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD; + copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD; -- error 0,1 - remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI; - copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI; + remove_file $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI; + copy_file $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI; } if ($mms_compare_physically) @@ -41,8 +42,8 @@ if ($mms_compare_physically) # So, do this only when testing REDO phase. # If UNDO phase, we nevertheless compare checksums # (see maria_verify_recovery.inc). 
- --echo * compared t$mms_table_to_use to old version - diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAD; + --echo * compared $mms_tname$mms_table_to_use to old version + diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAD $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAD; # index file not yet recovered -# diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/t$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/t$mms_table_to_use.MAI; +# diff_files $MYSQLTEST_VARDIR/master-data/mysqltest/$mms_tname$mms_table_to_use.MAI $MYSQLTEST_VARDIR/master-data/mysqltest_for_$mms_purpose/$mms_tname$mms_table_to_use.MAI; } diff --git a/mysql-test/include/maria_make_snapshot_for_comparison.inc b/mysql-test/include/maria_make_snapshot_for_comparison.inc index 71b821b5212..cb756f60527 100644 --- a/mysql-test/include/maria_make_snapshot_for_comparison.inc +++ b/mysql-test/include/maria_make_snapshot_for_comparison.inc @@ -1,10 +1,11 @@ # Maria helper script # Copies clean tables' data and index file to other directory -# Tables are t1...t[$mms_tables] +# Tables are $mms_tname1...$mms_tname[$mms_tables] # They are later used as a reference to see if recovery works. 
# API: -# set $mms_tables to N, the script will cover tables mysqltest.t1,...tN +# set $mms_tname to a string, and $mms_tables to a number N, the script will +# cover tables mysqltest.$mms_tname1,...$mms_tnameN connection admin; @@ -22,7 +23,7 @@ eval create database mysqltest_for_$mms_purpose; while ($mms_table_to_use) { # to serve as a reference, table must be in a clean state - eval flush table t$mms_table_to_use; + eval flush table $mms_tname$mms_table_to_use; -- source include/maria_make_snapshot.inc dec $mms_table_to_use; } diff --git a/mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc b/mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc index 369f5eec927..879aa3ef182 100644 --- a/mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc +++ b/mysql-test/include/maria_make_snapshot_for_feeding_recovery.inc @@ -1,12 +1,14 @@ # Maria helper script # Copies tables' data and index file to other directory, and control file. -# Tables are t1...t[$mms_tables]. +# Tables are $mms_tname1...$mms_tname[$mms_tables]. # Later, mysqld is shutdown, and that snapshot is put back into the # datadir, control file too ("flashing recovery's brain"), and recovery is let # to run on it (see maria_verify_recovery.inc). # API: -# set $mms_tables to N, the script will cover tables mysqltest.t1,...tN +# set $mms_tname to a string, and $mms_tables to a number N, the script will +# cover tables mysqltest.$mms_tname1,...$mms_tnameN + connection admin; diff --git a/mysql-test/include/maria_verify_recovery.inc b/mysql-test/include/maria_verify_recovery.inc index 867479ba127..becdfb5df86 100644 --- a/mysql-test/include/maria_verify_recovery.inc +++ b/mysql-test/include/maria_verify_recovery.inc @@ -2,7 +2,8 @@ # Runs recovery, compare with expected table data. 
# API: -# 1) set $mms_tables to N, the script will cover tables mysqltest.t1,...tN +# 1) set $mms_tname to a string, and $mms_tables to a number N, the script +# will cover tables mysqltest.$mms_tname1,...$mms_tnameN # 2) set $mvr_debug_option to the crash way # 3) set $mvr_crash_statement to the statement which will trigger a crash # 4) set $mvr_restore_old_snapshot to 1 if you want recovery to run on @@ -77,10 +78,10 @@ let $mms_purpose=comparison; let $mms_compare_physically=$mms_compare_physically_save; while ($mms_table_to_use) { - eval check table t$mms_table_to_use extended; + eval check table $mms_tname$mms_table_to_use extended; --echo * testing that checksum after recovery is as expected - let $new_checksum=`CHECKSUM TABLE t$mms_table_to_use`; - let $old_checksum=`CHECKSUM TABLE mysqltest_for_$mms_purpose.t$mms_table_to_use`; + let $new_checksum=`CHECKSUM TABLE $mms_tname$mms_table_to_use`; + let $old_checksum=`CHECKSUM TABLE mysqltest_for_$mms_purpose.$mms_tname$mms_table_to_use`; # the $ text variables above are of the form "db.tablename\tchecksum", # as db differs, we use substring(). --disable_query_log diff --git a/mysql-test/lib/mtr_report.pl b/mysql-test/lib/mtr_report.pl index f4ee4dc7eed..4700ae05dde 100644 --- a/mysql-test/lib/mtr_report.pl +++ b/mysql-test/lib/mtr_report.pl @@ -405,7 +405,12 @@ sub mtr_report_stats ($) { # maria-recovery.test has warning about missing log file /File '.*maria_log.000.*' not found \(Errcode: 2\)/ or # and about marked-corrupted table - /Table '.\/mysqltest\/t1' is crashed, skipping it. Please repair it with maria_chk -r/ + /Table '.\/mysqltest\/t_corrupted1' is crashed, skipping it. 
Please repair it with maria_chk -r/ or + # maria-recover.test corrupts tables on purpose + /Checking table: '.\/mysqltest\/t_corrupted2'/ or + /Recovering table: '.\/mysqltest\/t_corrupted2'/ or + /Table '.\/mysqltest\/t_corrupted2' is marked as crashed and should be repaired/ or + /Incorrect key file for table '.\/mysqltest\/t_corrupted2.MAI'; try to repair it/ ) { next; # Skip these lines diff --git a/mysql-test/r/maria-preload.result b/mysql-test/r/maria-preload.result index b463c1b359b..bff6e25450f 100644 --- a/mysql-test/r/maria-preload.result +++ b/mysql-test/r/maria-preload.result @@ -1,4 +1,7 @@ drop table if exists t1, t2; +create temporary table initial +select variable_name,variable_value from +information_schema.global_status where variable_name like "Maria_pagecache_read%"; create table t1 ( a int not null auto_increment, b char(16) not null, @@ -46,24 +49,24 @@ count(*) 20672 flush tables; flush status; -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211388 -Maria_pagecache_reads 115 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 211644 +MARIA_PAGECACHE_READS 3 select count(*) from t1 where b = 'test1'; count(*) 4181 -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211414 -Maria_pagecache_reads 122 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 211926 +MARIA_PAGECACHE_READS 11 select count(*) from t1 where b = 'test1'; count(*) 4181 -show 
status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211440 -Maria_pagecache_reads 122 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 212208 +MARIA_PAGECACHE_READS 12 flush tables; flush status; select @@preload_buffer_size; @@ -72,23 +75,23 @@ select @@preload_buffer_size; load index into cache t1; Table Op Msg_type Msg_text test.t1 preload_keys status OK -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211511 -Maria_pagecache_reads 193 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 212535 +MARIA_PAGECACHE_READS 84 select count(*) from t1 where b = 'test1'; count(*) 4181 -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211537 -Maria_pagecache_reads 193 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 212817 +MARIA_PAGECACHE_READS 85 flush tables; flush status; -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211537 -Maria_pagecache_reads 193 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and 
g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 213073 +MARIA_PAGECACHE_READS 86 set session preload_buffer_size=256*1024; select @@preload_buffer_size; @@preload_buffer_size @@ -96,23 +99,23 @@ select @@preload_buffer_size; load index into cache t1 ignore leaves; Table Op Msg_type Msg_text test.t1 preload_keys status OK -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211608 -Maria_pagecache_reads 264 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 213400 +MARIA_PAGECACHE_READS 158 select count(*) from t1 where b = 'test1'; count(*) 4181 -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211634 -Maria_pagecache_reads 270 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 213682 +MARIA_PAGECACHE_READS 165 flush tables; flush status; -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211634 -Maria_pagecache_reads 270 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 213938 +MARIA_PAGECACHE_READS 166 set session preload_buffer_size=1*1024; select @@preload_buffer_size; 
@@preload_buffer_size @@ -121,52 +124,53 @@ load index into cache t1, t2 key (primary,b) ignore leaves; Table Op Msg_type Msg_text test.t1 preload_keys status OK test.t2 preload_keys status OK -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211748 -Maria_pagecache_reads 384 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 214308 +MARIA_PAGECACHE_READS 281 select count(*) from t1 where b = 'test1'; count(*) 4181 select count(*) from t2 where b = 'test1'; count(*) 2584 -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211788 -Maria_pagecache_reads 387 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 214604 +MARIA_PAGECACHE_READS 285 flush tables; flush status; -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211788 -Maria_pagecache_reads 387 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 214860 +MARIA_PAGECACHE_READS 286 load index into cache t3, t2 key (primary,b) ; Table Op Msg_type Msg_text test.t3 preload_keys Error Table 'test.t3' doesn't exist test.t3 preload_keys error Corrupt test.t2 preload_keys status OK -show status like "maria_pagecache_read%"; 
-Variable_name Value -Maria_pagecache_read_requests 211831 -Maria_pagecache_reads 430 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 215159 +MARIA_PAGECACHE_READS 330 flush tables; flush status; -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211831 -Maria_pagecache_reads 430 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 215415 +MARIA_PAGECACHE_READS 331 load index into cache t3 key (b), t2 key (c) ; Table Op Msg_type Msg_text test.t3 preload_keys Error Table 'test.t3' doesn't exist test.t3 preload_keys error Corrupt test.t2 preload_keys Error Key 'c' doesn't exist in table 't2' test.t2 preload_keys status Operation failed -show status like "maria_pagecache_read%"; -Variable_name Value -Maria_pagecache_read_requests 211831 -Maria_pagecache_reads 430 +select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +variable_name g.variable_value-i.variable_value +MARIA_PAGECACHE_READ_REQUESTS 215671 +MARIA_PAGECACHE_READS 332 drop table t1, t2; +drop temporary table initial; show status like "key_read%"; Variable_name Value Key_read_requests 0 diff --git a/mysql-test/r/maria-purge.result b/mysql-test/r/maria-purge.result index 2ebfabaf074..8155bc6ef2a 100644 --- a/mysql-test/r/maria-purge.result +++ b/mysql-test/r/maria-purge.result @@ 
-37,13 +37,13 @@ set global maria_log_file_size=16777216; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000002 in use +MARIA master-data/maria_log.00000002 in use insert into t2 select * from t1; insert into t1 select * from t2; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000004 in use +MARIA master-data/maria_log.00000004 in use set global maria_log_file_size=16777216; select @@global.maria_log_file_size; @@global.maria_log_file_size @@ -51,7 +51,7 @@ select @@global.maria_log_file_size; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000004 in use +MARIA master-data/maria_log.00000004 in use set global maria_log_file_size=8388608; select @@global.maria_log_file_size; @@global.maria_log_file_size @@ -61,32 +61,32 @@ insert into t1 select * from t2; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000004 free -maria master-data/maria_log.00000005 free -maria master-data/maria_log.00000006 free -maria master-data/maria_log.00000007 free -maria master-data/maria_log.00000008 in use +MARIA master-data/maria_log.00000004 free +MARIA master-data/maria_log.00000005 free +MARIA master-data/maria_log.00000006 free +MARIA master-data/maria_log.00000007 free +MARIA master-data/maria_log.00000008 in use flush logs; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000008 in use +MARIA master-data/maria_log.00000008 in use set global maria_log_file_size=16777216; set global maria_log_purge_type=external; insert into t1 select * from t2; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000008 free -maria master-data/maria_log.00000009 in use +MARIA master-data/maria_log.00000008 free +MARIA master-data/maria_log.00000009 in use flush 
logs; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000008 free -maria master-data/maria_log.00000009 in use +MARIA master-data/maria_log.00000008 free +MARIA master-data/maria_log.00000009 in use set global maria_log_purge_type=immediate; insert into t1 select * from t2; set global maria_checkpoint_interval=30; SHOW ENGINE maria logs; Type Name Status -maria master-data/maria_log.00000011 in use +MARIA master-data/maria_log.00000011 in use drop table t1, t2; diff --git a/mysql-test/r/maria-recover.result b/mysql-test/r/maria-recover.result new file mode 100644 index 00000000000..9e0908b478a --- /dev/null +++ b/mysql-test/r/maria-recover.result @@ -0,0 +1,35 @@ +select @@global.maria_recover; +@@global.maria_recover +BACKUP +set global maria_recover=off; +select @@global.maria_recover; +@@global.maria_recover +OFF +set global maria_recover=default; +select @@global.maria_recover; +@@global.maria_recover +OFF +set global maria_recover=normal; +select @@global.maria_recover; +@@global.maria_recover +NORMAL +drop database if exists mysqltest; +create database mysqltest; +use mysqltest; +create table t1 (a varchar(1000), index(a)) engine=maria; +insert into t1 values("ThursdayMorningsMarket"); +flush table t1; +insert into t1 select concat(a,'b') from t1 limit 1; +select * from t_corrupted2; +a +ThursdayMorningsMarket +Warnings: +Error 145 Table './mysqltest/t_corrupted2' is marked as crashed and should be repaired +Error 1194 Table 't_corrupted2' is marked as crashed and should be repaired +Error 1034 1 client is using or hasn't closed the table properly +Error 126 Incorrect key file for table './mysqltest/t_corrupted2.MAI'; try to repair it +Error 1034 Wrong base information on indexpage at page: 1 +select * from t_corrupted2; +a +ThursdayMorningsMarket +drop database mysqltest; diff --git a/mysql-test/r/maria-recovery.result b/mysql-test/r/maria-recovery.result index cfdaf4fb068..c137b2090b7 100644 --- a/mysql-test/r/maria-recovery.result 
+++ b/mysql-test/r/maria-recovery.result @@ -362,24 +362,24 @@ Table Non_unique Key_name Seq_in_index Column_name Collation Cardinality Sub_par t1 1 a 1 a A 1 NULL NULL YES BTREE drop table t1; * TEST of recovery when OPTIMIZE has replaced the index file and crash -create table t1 (a varchar(100), key(a)) engine=maria; -insert into t1 select (rand()) from t2; -flush table t1; -* copied t1 for comparison +create table t_corrupted1 (a varchar(100), key(a)) engine=maria; +insert into t_corrupted1 select (rand()) from t2; +flush table t_corrupted1; +* copied t_corrupted1 for comparison SET SESSION debug="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash_sort_index"; * crashing mysqld intentionally -optimize table t1; +optimize table t_corrupted1; ERROR HY000: Lost connection to MySQL server during query * recovery happens -check table t1 extended; +check table t_corrupted1 extended; Table Op Msg_type Msg_text -mysqltest.t1 check warning Table is marked as crashed and last repair failed -mysqltest.t1 check status OK +mysqltest.t_corrupted1 check warning Table is marked as crashed and last repair failed +mysqltest.t_corrupted1 check status OK * testing that checksum after recovery is as expected Checksum-check ok use mysqltest; -drop table t1, t2; +drop table t_corrupted1, t2; drop database mysqltest_for_feeding_recovery; drop database mysqltest_for_comparison; drop database mysqltest; diff --git a/mysql-test/r/maria.result b/mysql-test/r/maria.result index a9f10a7ec4d..96fc4e25bc4 100644 --- a/mysql-test/r/maria.result +++ b/mysql-test/r/maria.result @@ -2121,6 +2121,7 @@ show variables like 'maria%'; Variable_name Value maria_block_size 8192 maria_checkpoint_interval 30 +maria_force_start_after_recovery_failures 0 maria_log_file_size 4294959104 maria_log_purge_type immediate maria_max_sort_file_size 9223372036854775807 @@ -2128,6 +2129,7 @@ maria_page_checksum OFF maria_pagecache_age_threshold 300 maria_pagecache_buffer_size 8388600 
maria_pagecache_division_limit 100 +maria_recover OFF maria_repair_threads 1 maria_sort_buffer_size 8388608 maria_stats_method nulls_unequal diff --git a/mysql-test/t/disabled.def b/mysql-test/t/disabled.def index 05c2670fc33..c944bf8d195 100644 --- a/mysql-test/t/disabled.def +++ b/mysql-test/t/disabled.def @@ -19,4 +19,4 @@ ctype_create : Bug#32965 main.ctype_create fails status : Bug#32966 main.status fails ps_ddl : Bug#12093 2007-12-14 pending WL#4165 / WL#4166 csv_alter_table : Bug#33696 2008-01-21 pcrews no .result file - bug allows NULL columns in CSV tables -maria-preload : Bug#34911 unrepeatable output of SHOW STATUS +maria-preload : Bug#35107 crashes diff --git a/mysql-test/t/maria-preload.test b/mysql-test/t/maria-preload.test index eff42890484..b6b39b92ac3 100644 --- a/mysql-test/t/maria-preload.test +++ b/mysql-test/t/maria-preload.test @@ -8,6 +8,11 @@ drop table if exists t1, t2; --enable_warnings +# Work around BUG#34911 "FLUSH STATUS doesn't flush what it should": +# compute differences in status variables before and after relevant queries +create temporary table initial +select variable_name,variable_value from +information_schema.global_status where variable_name like "Maria_pagecache_read%"; # we don't use block-format because we want page cache stats # about indices and not data pages. 
@@ -59,50 +64,50 @@ select count(*) from t1; select count(*) from t2; flush tables; flush status; -show status like "maria_pagecache_read%"; - +let $show_stat=select g.variable_name,g.variable_value-i.variable_value from information_schema.global_status as g,initial as i where g.variable_name like "Maria_pagecache_read%" and g.variable_name=i.variable_name order by g.variable_name desc; +eval $show_stat; select count(*) from t1 where b = 'test1'; -show status like "maria_pagecache_read%"; +eval $show_stat; select count(*) from t1 where b = 'test1'; -show status like "maria_pagecache_read%"; +eval $show_stat; flush tables; flush status; select @@preload_buffer_size; load index into cache t1; -show status like "maria_pagecache_read%"; +eval $show_stat; select count(*) from t1 where b = 'test1'; -show status like "maria_pagecache_read%"; +eval $show_stat; flush tables; flush status; -show status like "maria_pagecache_read%"; +eval $show_stat; set session preload_buffer_size=256*1024; select @@preload_buffer_size; load index into cache t1 ignore leaves; -show status like "maria_pagecache_read%"; +eval $show_stat; select count(*) from t1 where b = 'test1'; -show status like "maria_pagecache_read%"; +eval $show_stat; flush tables; flush status; -show status like "maria_pagecache_read%"; +eval $show_stat; set session preload_buffer_size=1*1024; select @@preload_buffer_size; load index into cache t1, t2 key (primary,b) ignore leaves; -show status like "maria_pagecache_read%"; +eval $show_stat; select count(*) from t1 where b = 'test1'; select count(*) from t2 where b = 'test1'; -show status like "maria_pagecache_read%"; +eval $show_stat; flush tables; flush status; -show status like "maria_pagecache_read%"; +eval $show_stat; load index into cache t3, t2 key (primary,b) ; -show status like "maria_pagecache_read%"; - +eval $show_stat; flush tables; flush status; -show status like "maria_pagecache_read%"; +eval $show_stat; load index into cache t3 key (b), t2 key (c) ; -show 
status like "maria_pagecache_read%"; +eval $show_stat; drop table t1, t2; +drop temporary table initial; # check that Maria didn't use key cache show status like "key_read%"; diff --git a/mysql-test/t/maria-recover-master.opt b/mysql-test/t/maria-recover-master.opt new file mode 100644 index 00000000000..32af5433e03 --- /dev/null +++ b/mysql-test/t/maria-recover-master.opt @@ -0,0 +1 @@ +--maria-recover=backup --maria-log-dir-path=../tmp diff --git a/mysql-test/t/maria-recover.test b/mysql-test/t/maria-recover.test new file mode 100644 index 00000000000..924d573fe4e --- /dev/null +++ b/mysql-test/t/maria-recover.test @@ -0,0 +1,52 @@ +# Test of the --maria-recover option. + +--source include/have_maria.inc + +select @@global.maria_recover; +set global maria_recover=off; +select @@global.maria_recover; +set global maria_recover=default; +select @@global.maria_recover; +set global maria_recover=normal; +select @@global.maria_recover; + +--disable_warnings +drop database if exists mysqltest; +--enable_warnings +create database mysqltest; + +use mysqltest; + +create table t1 (a varchar(1000), index(a)) engine=maria; +insert into t1 values("ThursdayMorningsMarket"); + +flush table t1; # put index page on disk +insert into t1 select concat(a,'b') from t1 limit 1; +# now t1 has its open_count>0 and so will t2_corrupted. +# It is not named t2 because the corruption messages which will be put +# in the error log need to be detected in mtr_process.pl, and we want +# a specific name to do specific detection (don't want to ignore +# any corruption messages of other tests using "t2" as table). + +copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.frm $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.frm; +copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.MAD $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.MAD; +copy_file $MYSQLTEST_VARDIR/master-data/mysqltest/t1.MAI $MYSQLTEST_VARDIR/master-data/mysqltest/t_corrupted2.MAI; + +# Ruin the index file. 
+# If maria-block-size is smaller than the default, the corruption +# messages will differ. +perl; + use strict; + use warnings; + my $fname= "$ENV{'MYSQLTEST_VARDIR'}/master-data/mysqltest/t_corrupted2.MAI"; + open(FILE, "+<", $fname) or die; + my $whatever= ("\xAB" x 100); + sysseek (FILE, 8192, 0) or die; + syswrite (FILE, $whatever) or die; + close FILE; +EOF + +select * from t_corrupted2; # should show corruption and repair messages +select * from t_corrupted2; # should show just rows + +drop database mysqltest; diff --git a/mysql-test/t/maria-recovery-big.test b/mysql-test/t/maria-recovery-big.test index 591109b7eae..5511d1a0cfb 100644 --- a/mysql-test/t/maria-recovery-big.test +++ b/mysql-test/t/maria-recovery-big.test @@ -15,6 +15,7 @@ set global maria_log_file_size=4294967295; drop database if exists mysqltest; --enable_warnings create database mysqltest; +let $mms_tname=t; # Include scripts can perform SQL. For it to not influence the main test # they use a separate connection. This way if they use a DDL it would diff --git a/mysql-test/t/maria-recovery-bitmap.test b/mysql-test/t/maria-recovery-bitmap.test index ee5f6cbadd3..a44565c9f95 100644 --- a/mysql-test/t/maria-recovery-bitmap.test +++ b/mysql-test/t/maria-recovery-bitmap.test @@ -11,6 +11,7 @@ drop database if exists mysqltest; --enable_warnings create database mysqltest; +let $mms_tname=t; # Include scripts can perform SQL. For it to not influence the main test # they use a separate connection. This way if they use a DDL it would diff --git a/mysql-test/t/maria-recovery-rtree-ft.test b/mysql-test/t/maria-recovery-rtree-ft.test index dd38cc0b0b8..b9bc0e718d2 100644 --- a/mysql-test/t/maria-recovery-rtree-ft.test +++ b/mysql-test/t/maria-recovery-rtree-ft.test @@ -14,6 +14,7 @@ let $MARIA_LOG=.; drop database if exists mysqltest; --enable_warnings create database mysqltest; +let $mms_tname=t; # Include scripts can perform SQL. For it to not influence the main test # they use a separate connection. 
This way if they use a DDL it would diff --git a/mysql-test/t/maria-recovery.test b/mysql-test/t/maria-recovery.test index 81cd9408041..cbd5cf2bb4c 100644 --- a/mysql-test/t/maria-recovery.test +++ b/mysql-test/t/maria-recovery.test @@ -12,6 +12,7 @@ let $MARIA_LOG=../tmp; drop database if exists mysqltest; --enable_warnings create database mysqltest; +let $mms_tname=t; # Include scripts can perform SQL. For it to not influence the main test # they use a separate connection. This way if they use a DDL it would @@ -297,19 +298,25 @@ show keys from t1; # should be enabled drop table t1; --echo * TEST of recovery when OPTIMIZE has replaced the index file and crash -create table t1 (a varchar(100), key(a)) engine=maria; +create table t_corrupted1 (a varchar(100), key(a)) engine=maria; +# we use a special name because this test portion will generate +# corruption warnings, which we tell mtr_report.pl to ignore by +# putting the message in mtr_report.pl, but we don't want to it ignore +# corruption messages of other tests, hence the special name +# 't_corrupted' and not just 't'. +let $mms_tname=t_corrupted; let $mvr_restore_old_snapshot=0; let $mms_compare_physically=0; -let $mvr_crash_statement= optimize table t1; +let $mvr_crash_statement= optimize table t_corrupted1; let $mvr_debug_option="+d,maria_flush_whole_log,maria_flush_whole_page_cache,maria_crash_sort_index"; -insert into t1 select (rand()) from t2; +insert into t_corrupted1 select (rand()) from t2; -- source include/maria_make_snapshot_for_comparison.inc # Recovery will not fix the table, but we expect to see it marked # "crashed on repair". # Because crash is mild, the table is actually not corrupted, so the # "check table extended" done below fixes the table. 
-- source include/maria_verify_recovery.inc -drop table t1, t2; +drop table t_corrupted1, t2; # clean up everything let $mms_purpose=feeding_recovery; diff --git a/storage/maria/ha_maria.cc b/storage/maria/ha_maria.cc index 3a4f2b6df23..a6339b28332 100644 --- a/storage/maria/ha_maria.cc +++ b/storage/maria/ha_maria.cc @@ -49,13 +49,11 @@ ulong pagecache_division_limit, pagecache_age_threshold; ulonglong pagecache_buffer_size; /** - @todo For now there is no way for a user to set a different value of - maria_recover_options, i.e. auto-check-and-repair is always disabled. - We could enable it. As the auto-repair is initiated when opened from the - SQL layer (open_unireg_entry(), check_and_repair()), it does not happen - when Maria's Recovery internally opens the table to apply log records to - it, which is good. It would happen only after Recovery, if the table is - still corrupted. + As the auto-repair is initiated when opened from the SQL layer + (open_unireg_entry(), check_and_repair()), it does not happen when Maria's + Recovery internally opens the table to apply log records to it, which is + good. It would happen only after Recovery, if the table is still + corrupted. */ ulong maria_recover_options= HA_RECOVER_NONE; handlerton *maria_hton; @@ -63,7 +61,14 @@ handlerton *maria_hton; /* bits in maria_recover_options */ const char *maria_recover_names[]= { - "DEFAULT", "BACKUP", "FORCE", "QUICK", NullS + /* + Compared to MyISAM, "default" was renamed to "normal" as it collided with + SET var=default which sets to the var's default i.e. what happens when the + var is not set i.e. HA_RECOVER_NONE. + Another change is that OFF is used to disable, not ""; this is to have OFF + display in SHOW VARIABLES which is better than "". 
+ */ + "OFF", "NORMAL", "BACKUP", "FORCE", "QUICK", NullS }; TYPELIB maria_recover_typelib= { @@ -103,11 +108,13 @@ TYPELIB maria_sync_log_dir_typelib= maria_sync_log_dir_names, NULL }; -/** @brief Interval between background checkpoints in seconds */ +/** Interval between background checkpoints in seconds */ static ulong checkpoint_interval; static void update_checkpoint_interval(MYSQL_THD thd, struct st_mysql_sys_var *var, void *var_ptr, const void *save); +/** After that many consecutive recovery failures, remove logs */ +static ulong force_start_after_recovery_failures; static void update_log_file_size(MYSQL_THD thd, struct st_mysql_sys_var *var, void *var_ptr, const void *save); @@ -124,6 +131,17 @@ static MYSQL_SYSVAR_ULONG(checkpoint_interval, checkpoint_interval, " 'no automatic checkpoints' which makes sense only for testing.", NULL, update_checkpoint_interval, 30, 0, UINT_MAX, 1); +static MYSQL_SYSVAR_ULONG(force_start_after_recovery_failures, + force_start_after_recovery_failures, + /* + Read-only because setting it on the fly has no useful effect, + should be set on command-line. + */ + PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY, + "Number of consecutive log recovery failures after which logs will be" + " automatically deleted to cure the problem; 0 (the default) disables" + " the feature.", NULL, NULL, 0, 0, UINT_MAX8, 1); + static MYSQL_SYSVAR_BOOL(page_checksum, maria_page_checksums, 0, "Maintain page checksums (can be overridden per table " "with PAGE_CHECKSUM clause in CREATE TABLE)", 0, 0, 1); @@ -175,6 +193,12 @@ static MYSQL_SYSVAR_ULONG(pagecache_division_limit, pagecache_division_limit, "The minimum percentage of warm blocks in key cache", 0, 0, 100, 1, 100, 1); +static MYSQL_SYSVAR_ENUM(recover, maria_recover_options, PLUGIN_VAR_OPCMDARG, + "Specifies how corrupted tables should be automatically repaired." 
+ " Possible values are \"NORMAL\" (the default), \"BACKUP\", \"FORCE\"," + " \"QUICK\", or \"OFF\" which is like not using the option.", + NULL, NULL, HA_RECOVER_NONE, &maria_recover_typelib); + static MYSQL_THDVAR_ULONG(repair_threads, PLUGIN_VAR_RQCMDARG, "Number of threads to use when repairing maria tables. The value of 1 " "disables parallel repair.", @@ -186,7 +210,7 @@ static MYSQL_THDVAR_ULONG(sort_buffer_size, PLUGIN_VAR_RQCMDARG, 0, 0, 8192*1024, 4, ~0L, 1); static MYSQL_THDVAR_ENUM(stats_method, PLUGIN_VAR_RQCMDARG, - "Specifies how maria index statistics collection code should threat " + "Specifies how maria index statistics collection code should treat " "NULLs. Possible values are \"nulls_unequal\", \"nulls_equal\", " "and \"nulls_ignored\".", 0, 0, 0, &maria_stats_method_typelib); @@ -870,6 +894,12 @@ int ha_maria::open(const char *name, int mode, uint test_if_locked) test_if_locked|= HA_OPEN_MMAP; #endif + if (unlikely(maria_recover_options != HA_RECOVER_NONE)) + { + /* user asked to trigger a repair if table was not properly closed */ + test_if_locked|= HA_OPEN_ABORT_IF_CRASHED; + } + if (!(file= maria_open(name, mode, test_if_locked | HA_OPEN_FROM_SQL_LAYER))) return (my_errno ? 
my_errno : -1); @@ -2728,7 +2758,7 @@ bool maria_show_status(handlerton *hton, stat_print_fn *print, enum ha_stat_type stat) { - char engine_name[]= "maria"; + const LEX_STRING *engine_name= hton_name(hton); switch (stat) { case HA_ENGINE_LOGS: { @@ -2745,8 +2775,8 @@ bool maria_show_status(handlerton *hton, if (first_file == 0) { const char error[]= "error"; - print(thd, engine_name, sizeof(engine_name), - STRING_WITH_LEN(""), error, sizeof(error)); + print(thd, engine_name->str, engine_name->length, + STRING_WITH_LEN(""), error, sizeof(error) - 1); break; } @@ -2762,7 +2792,7 @@ bool maria_show_status(handlerton *hton, if (!(stat= my_stat(file, &stat_buff, MYF(MY_WME)))) { status= error; - status_len= sizeof(error); + status_len= sizeof(error) - 1; length= my_snprintf(object, SHOW_MSG_LEN, "Size unknown ; %s", file); } else @@ -2770,23 +2800,23 @@ bool maria_show_status(handlerton *hton, if (first_needed == 0) { status= unknown; - status_len= sizeof(unknown); + status_len= sizeof(unknown) - 1; } else if (i < first_needed) { status= unneeded; - status_len= sizeof(unneeded); + status_len= sizeof(unneeded) - 1; } else { status= needed; - status_len= sizeof(needed); + status_len= sizeof(needed) - 1; } length= my_snprintf(object, SHOW_MSG_LEN, "Size %12lu ; %s", (ulong) stat->st_size, file); } - print(thd, engine_name, sizeof(engine_name), + print(thd, engine_name->str, engine_name->length, object, length, status, status_len); } break; @@ -2799,9 +2829,90 @@ bool maria_show_status(handlerton *hton, return 0; } + +/** + Callback to delete all logs in directory. This is lower-level than other + functions in ma_loghandler.c which delete logs, as it does not rely on + translog_init() having been called first. 
+ + @param directory directory where file is + @param filename base name of the file to delete +*/ + +static my_bool translog_callback_delete_all(const char *directory, + const char *filename) +{ + char complete_name[FN_REFLEN]; + fn_format(complete_name, filename, directory, "", MYF(MY_UNPACK_FILENAME)); + return my_delete(complete_name, MYF(MY_WME)); +} + + +/** + Helper function for option maria-force-start-after-recovery-failures. + Deletes logs if too many failures. Otherwise, increments the counter of + failures in the control file. + Notice how this has to be called _before_ translog_init() (if log is + corrupted, translog_init() might crash the server, so we need to remove logs + before). + + @param log_dir directory where logs to be deleted are +*/ + +static int mark_recovery_start(const char* log_dir) +{ + int res; + DBUG_ENTER("mark_recovery_start"); + if (unlikely(maria_recover_options == HA_RECOVER_NONE)) + ma_message_no_user(ME_JUST_WARNING, "Please consider using option" + " --maria-recover[=...] to automatically check and" + " repair tables when logs are removed by option" + " --maria-force-start-after-recovery-failures=#"); + if (recovery_failures >= force_start_after_recovery_failures) + { + /* + Remove logs which cause the problem; keep control file which has + critical info like uuid, max_trid (removing control file may make + correct tables look corrupted!). + */ + char msg[100]; + res= translog_walk_filenames(log_dir, &translog_callback_delete_all); + my_snprintf(msg, sizeof(msg), + "%s logs after %u consecutive failures of" + " recovery from logs", + (res ? "failed to remove some" : "removed all"), + recovery_failures); + ma_message_no_user((res ? 0 : ME_JUST_WARNING), msg); + } + else + res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, + max_trid_in_control_file, + recovery_failures + 1); + DBUG_RETURN(res); +} + + +/** + Helper function for option maria-force-start-after-recovery-failures. 
+ Records in the control file that recovery was a success, so that it's not + counted for maria-force-start-after-recovery-failures. +*/ + +static int mark_recovery_success(void) +{ + /* success of recovery, reset recovery_failures: */ + int res; + DBUG_ENTER("mark_recovery_success"); + res= ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, + max_trid_in_control_file, 0); + DBUG_RETURN(res); +} + + static int ha_maria_init(void *p) { int res; + const char *log_dir= maria_data_root; maria_hton= (handlerton *)p; maria_hton->state= SHOW_OPTION_YES; maria_hton->db_type= DB_TYPE_UNKNOWN; @@ -2816,6 +2927,8 @@ static int ha_maria_init(void *p) bzero(maria_log_pagecache, sizeof(*maria_log_pagecache)); maria_tmpdir= &mysql_tmpdir_list; /* For REDO */ res= maria_init() || ma_control_file_open(TRUE, TRUE) || + ((force_start_after_recovery_failures != 0) && + mark_recovery_start(log_dir)) || !init_pagecache(maria_pagecache, (size_t) pagecache_buffer_size, pagecache_division_limit, pagecache_age_threshold, maria_block_size, 0) || @@ -2825,7 +2938,8 @@ static int ha_maria_init(void *p) translog_init(maria_data_root, log_file_size, MYSQL_VERSION_ID, server_id, maria_log_pagecache, TRANSLOG_DEFAULT_FLAGS, 0) || - maria_recover() || + maria_recovery_from_log() || + ((force_start_after_recovery_failures != 0) && mark_recovery_success()) || ma_checkpoint_init(checkpoint_interval); maria_multi_threaded= TRUE; return res ? 
HA_ERR_INITIALIZATION : 0; @@ -2913,6 +3027,7 @@ my_bool ha_maria::register_query_cache_table(THD *thd, char *table_name, static struct st_mysql_sys_var* system_variables[]= { MYSQL_SYSVAR(block_size), MYSQL_SYSVAR(checkpoint_interval), + MYSQL_SYSVAR(force_start_after_recovery_failures), MYSQL_SYSVAR(page_checksum), MYSQL_SYSVAR(log_dir_path), MYSQL_SYSVAR(log_file_size), @@ -2921,6 +3036,7 @@ static struct st_mysql_sys_var* system_variables[]= { MYSQL_SYSVAR(pagecache_age_threshold), MYSQL_SYSVAR(pagecache_buffer_size), MYSQL_SYSVAR(pagecache_division_limit), + MYSQL_SYSVAR(recover), MYSQL_SYSVAR(repair_threads), MYSQL_SYSVAR(sort_buffer_size), MYSQL_SYSVAR(stats_method), diff --git a/storage/maria/ma_checkpoint.c b/storage/maria/ma_checkpoint.c index 36db37f0d4d..f815a7cf75c 100644 --- a/storage/maria/ma_checkpoint.c +++ b/storage/maria/ma_checkpoint.c @@ -245,7 +245,8 @@ static int really_execute_checkpoint(void) that log was flushed before we write to the control file). */ if (unlikely(ma_control_file_write_and_force(lsn, last_logno, - max_trid_in_control_file))) + max_trid_in_control_file, + recovery_failures))) { translog_unlock(); goto err; diff --git a/storage/maria/ma_control_file.c b/storage/maria/ma_control_file.c index e6018a4b847..84fae2a9f7b 100644 --- a/storage/maria/ma_control_file.c +++ b/storage/maria/ma_control_file.c @@ -39,6 +39,8 @@ Start of changeable part: - Checksum of changeable part - LSN of last checkpoint - Number of last log file + - Max trid in control file (since Maria 1.5 May 2008) + - Number of consecutive recovery failures (since Maria 1.5 May 2008) ..... Here we can add new variables without changing format The idea is that one can add new variables to the control file and still @@ -80,7 +82,9 @@ one should increment the control file version number. 
#define CF_FILENO_SIZE 4 #define CF_MAX_TRID_OFFSET (CF_FILENO_OFFSET + CF_FILENO_SIZE) #define CF_MAX_TRID_SIZE TRANSID_SIZE -#define CF_CHANGEABLE_TOTAL_SIZE (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE) +#define CF_RECOV_FAIL_OFFSET (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE) +#define CF_RECOV_FAIL_SIZE 1 +#define CF_CHANGEABLE_TOTAL_SIZE (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE) /* The following values should not be changed, except when changing version @@ -109,6 +113,12 @@ uint32 last_logno= FILENO_IMPOSSIBLE; TrID max_trid_in_control_file= 0; /** + Number of consecutive log or recovery failures. Reset to 0 after recovery's + success. +*/ +uint8 recovery_failures= 0; + +/** @brief If log's lock should be asserted when writing to control file. Can be re-used by any function which needs to be thread-safe except when @@ -188,7 +198,7 @@ static CONTROL_FILE_ERROR create_control_file(const char *name, /* init the file with these "undefined" values */ DBUG_RETURN(ma_control_file_write_and_force(LSN_IMPOSSIBLE, - FILENO_IMPOSSIBLE, 0)); + FILENO_IMPOSSIBLE, 0, 0)); } @@ -420,6 +430,9 @@ CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing, if (new_cf_changeable_size >= (CF_MAX_TRID_OFFSET + CF_MAX_TRID_SIZE)) max_trid_in_control_file= transid_korr(buffer + new_cf_create_time_size + CF_MAX_TRID_OFFSET); + if (new_cf_changeable_size >= (CF_RECOV_FAIL_OFFSET + CF_RECOV_FAIL_SIZE)) + recovery_failures= + (buffer + new_cf_create_time_size + CF_RECOV_FAIL_OFFSET)[0]; ok: DBUG_RETURN(0); @@ -436,19 +449,21 @@ err: /* Write information durably to the control file; stores this information into - the last_checkpoint_lsn, last_logno, max_trid_in_control_file global - variables. + the last_checkpoint_lsn, last_logno, max_trid_in_control_file, + recovery_failures global variables. 
Called when we have created a new log (after syncing this log's creation), - when we have written a checkpoint (after syncing this log record), and at - shutdown (for storing trid in case logs are soon removed by user). + when we have written a checkpoint (after syncing this log record), at + shutdown (for storing trid in case logs are soon removed by user), and + before and after recovery (to store recovery_failures). Variables last_checkpoint_lsn and last_logno must be protected by caller using log's lock, unless this function is called at startup. SYNOPSIS ma_control_file_write_and_force() - checkpoint_lsn LSN of last checkpoint - logno last log file number - trid maximum transaction longid. + last_checkpoint_lsn_arg LSN of last checkpoint + last_logno_arg last log file number + max_trid_arg maximum transaction longid + recovery_failures_arg consecutive recovery failures NOTE We always want to do one single my_pwrite() here to be as atomic as @@ -459,17 +474,26 @@ err: 1 - Error */ -int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, - TrID trid) +int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg, + uint32 last_logno_arg, + TrID max_trid_arg, + uint8 recovery_failures_arg) { uchar buffer[CF_MAX_SIZE]; uint32 sum; + my_bool no_need_sync; DBUG_ENTER("ma_control_file_write_and_force"); - if ((last_checkpoint_lsn == checkpoint_lsn) && - (last_logno == logno) && - (max_trid_in_control_file == trid)) - DBUG_RETURN(0); /* no need to write */ + /* + We don't need to sync if this is just an increase of + recovery_failures: it's even good if that counter is not increased on disk + in case of power or hardware failure (less false positives when removing + logs). 
+ */ + no_need_sync= ((last_checkpoint_lsn == last_checkpoint_lsn_arg) && + (last_logno == last_logno_arg) && + (max_trid_in_control_file == max_trid_arg) && + (recovery_failures_arg > 0)); if (control_file_fd < 0) DBUG_RETURN(1); @@ -479,9 +503,10 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, translog_lock_handler_assert_owner(); #endif - lsn_store(buffer + CF_LSN_OFFSET, checkpoint_lsn); - int4store(buffer + CF_FILENO_OFFSET, logno); - transid_store(buffer + CF_MAX_TRID_OFFSET, trid); + lsn_store(buffer + CF_LSN_OFFSET, last_checkpoint_lsn_arg); + int4store(buffer + CF_FILENO_OFFSET, last_logno_arg); + transid_store(buffer + CF_MAX_TRID_OFFSET, max_trid_arg); + (buffer + CF_RECOV_FAIL_OFFSET)[0]= recovery_failures_arg; if (cf_changeable_size > CF_CHANGEABLE_TOTAL_SIZE) { @@ -514,12 +539,13 @@ int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, if (my_pwrite(control_file_fd, buffer, cf_changeable_size, cf_create_time_size, MYF(MY_FNABP | MY_WME)) || - my_sync(control_file_fd, MYF(MY_WME))) + (!no_need_sync && my_sync(control_file_fd, MYF(MY_WME)))) DBUG_RETURN(1); - last_checkpoint_lsn= checkpoint_lsn; - last_logno= logno; - max_trid_in_control_file= trid; + last_checkpoint_lsn= last_checkpoint_lsn_arg; + last_logno= last_logno_arg; + max_trid_in_control_file= max_trid_arg; + recovery_failures= recovery_failures_arg; cf_changeable_size= CF_CHANGEABLE_TOTAL_SIZE; /* no more warning */ DBUG_RETURN(0); @@ -558,7 +584,7 @@ int ma_control_file_end(void) */ last_checkpoint_lsn= LSN_IMPOSSIBLE; last_logno= FILENO_IMPOSSIBLE; - max_trid_in_control_file= 0; + max_trid_in_control_file= recovery_failures= 0; DBUG_RETURN(close_error); } diff --git a/storage/maria/ma_control_file.h b/storage/maria/ma_control_file.h index 52001cd4a4c..4cb5527620d 100644 --- a/storage/maria/ma_control_file.h +++ b/storage/maria/ma_control_file.h @@ -44,6 +44,8 @@ extern uint32 last_logno; extern TrID max_trid_in_control_file; +extern uint8 
recovery_failures; + extern my_bool maria_multi_threaded, maria_in_recovery; typedef enum enum_control_file_error { @@ -63,7 +65,9 @@ typedef enum enum_control_file_error { C_MODE_START CONTROL_FILE_ERROR ma_control_file_open(my_bool create_if_missing, my_bool print_error); -int ma_control_file_write_and_force(LSN checkpoint_lsn, uint32 logno, TrID trid); +int ma_control_file_write_and_force(LSN last_checkpoint_lsn_arg, + uint32 last_logno_arg, TrID max_trid_arg, + uint8 recovery_failures_arg); int ma_control_file_end(void); my_bool ma_control_file_inited(void); C_MODE_END diff --git a/storage/maria/ma_init.c b/storage/maria/ma_init.c index f81afda2141..c13f01001b3 100644 --- a/storage/maria/ma_init.c +++ b/storage/maria/ma_init.c @@ -86,7 +86,7 @@ void maria_end(void) from the log, as it cannot process REDOs). */ (void)ma_control_file_write_and_force(last_checkpoint_lsn, last_logno, - trid); + trid, recovery_failures); } trnman_destroy(); if (translog_status == TRANSLOG_OK) diff --git a/storage/maria/ma_locking.c b/storage/maria/ma_locking.c index 4ec242fd927..89a8cad26f1 100644 --- a/storage/maria/ma_locking.c +++ b/storage/maria/ma_locking.c @@ -381,7 +381,7 @@ int _ma_test_if_changed(register MARIA_HA *info) tells us if the MARIA file wasn't properly closed. (This is true if my_disable_locking is set). - open_count is not maintained on disk for transactional or temporary tables. + open_count is not maintained on disk for temporary tables. */ int _ma_mark_file_changed(MARIA_HA *info) @@ -400,11 +400,16 @@ int _ma_mark_file_changed(MARIA_HA *info) share->state.open_count++; } /* - temp tables don't need an open_count as they are removed on crash; - transactional tables are fixed by log-based recovery, so don't need an - open_count either (and we thus avoid the disk write below). + Temp tables don't need an open_count as they are removed on crash. 
+ In theory transactional tables are fixed by log-based recovery, so don't + need an open_count either, but if recovery has failed and logs have been + removed (by maria-force-start-after-recovery-failures), we still need to + detect dubious tables. + If we didn't maintain open_count on disk for a table, after a crash + we wouldn't know if it was closed at crash time (thus does not need a + check) or not. So we would have to check all tables: overkill. */ - if (!(share->temporary | share->base.born_transactional)) + if (!share->temporary) { mi_int2store(buff,share->state.open_count); buff[2]=1; /* Mark that it's changed */ @@ -471,7 +476,7 @@ int _ma_decrement_open_count(MARIA_HA *info) { share->state.open_count--; share->changed= 1; /* We have to update state */ - if (!(share->temporary | share->base.born_transactional)) + if (!share->temporary) { mi_int2store(buff,share->state.open_count); write_error= (int) my_pwrite(share->kfile.file, buff, sizeof(buff), diff --git a/storage/maria/ma_loghandler.c b/storage/maria/ma_loghandler.c index b6d12e3e975..4daef976b6e 100644 --- a/storage/maria/ma_loghandler.c +++ b/storage/maria/ma_loghandler.c @@ -17,6 +17,7 @@ #include "trnman.h" #include "ma_blockrec.h" /* for some constants and in-write hooks */ #include "ma_key_recover.h" /* For some in-write hooks */ +#include "ma_checkpoint.h" /* On Windows, neither my_open() nor my_sync() work for directories. @@ -1522,7 +1523,8 @@ static my_bool translog_create_new_file() DBUG_RETURN(1); if (ma_control_file_write_and_force(last_checkpoint_lsn, file_no, - max_trid_in_control_file)) + max_trid_in_control_file, + recovery_failures)) { translog_stop_writing(); DBUG_RETURN(1); @@ -3211,21 +3213,29 @@ static my_bool translog_truncate_log(TRANSLOG_ADDRESS addr) /** - @brief Check log files presence + Applies function 'callback' to all files (in a directory) which + name looks like a log's name (maria_log.[0-9]{7}). 
+ If 'callback' returns TRUE this interrupts the walk and returns + TRUE. Otherwise FALSE is returned after processing all log files. + It cannot just use log_descriptor.directory because that may not yet have + been initialized. - @retval 0 no log files. - @retval 1 there is at least 1 log file in the directory + @param directory directory to scan + @param callback function to apply; is passed directory and base + name of found file */ -my_bool translog_is_log_files() +my_bool translog_walk_filenames(const char *directory, + my_bool (*callback)(const char *, + const char *)) { MY_DIR *dirp; uint i; my_bool rc= FALSE; /* Finds and removes transaction log files */ - if (!(dirp = my_dir(log_descriptor.directory, MYF(MY_DONT_SORT)))) - return 1; + if (!(dirp = my_dir(directory, MYF(MY_DONT_SORT)))) + return FALSE; for (i= 0; i < dirp->number_off_files; i++) { @@ -3239,14 +3249,14 @@ my_bool translog_is_log_files() file[15] >= '0' && file[15] <= '9' && file[16] >= '0' && file[16] <= '9' && file[17] >= '0' && file[17] <= '9' && - file[18] == '\0') + file[18] == '\0' && (*callback)(directory, file)) { rc= TRUE; break; } } my_dirend(dirp); - return FALSE; + return rc; } @@ -3270,6 +3280,19 @@ static void translog_fill_overhead_table() /** + Callback to find first log in directory. 
+*/ + +static my_bool translog_callback_search_first(const char *directory + __attribute__((unused)), + const char *filename + __attribute__((unused))) +{ + return TRUE; +} + + +/** @brief Checks that chunk is LSN one @param type type of the chunk @@ -3353,7 +3376,7 @@ my_bool translog_init_with_table(const char *directory, my_init_dynamic_array(&log_descriptor.unfinished_files, sizeof(struct st_file_counter), 10, 10)) - DBUG_RETURN(1); + goto err; log_descriptor.min_need_file= 0; log_descriptor.min_file_number= 0; log_descriptor.last_lsn_checked= LSN_IMPOSSIBLE; @@ -3367,7 +3390,7 @@ my_bool translog_init_with_table(const char *directory, my_errno= errno; DBUG_PRINT("error", ("Error %d during opening directory '%s'", errno, log_descriptor.directory)); - DBUG_RETURN(1); + goto err; } #endif log_descriptor.in_buffers_only= LSN_IMPOSSIBLE; @@ -3417,7 +3440,7 @@ my_bool translog_init_with_table(const char *directory, for (i= 0; i < TRANSLOG_BUFFERS_NO; i++) { if (translog_buffer_init(log_descriptor.buffers + i)) - DBUG_RETURN(1); + goto err; #ifndef DBUG_OFF log_descriptor.buffers[i].buffer_no= (uint8) i; #endif @@ -3461,7 +3484,8 @@ my_bool translog_init_with_table(const char *directory, log_descriptor.horizon= last_page= MAKE_LSN(last_logno, 0); if (translog_get_last_page_addr(&last_page, &pageok, no_errors)) { - if (!translog_is_log_files()) + if (!translog_walk_filenames(log_descriptor.directory, + &translog_callback_search_first)) { /* Files was deleted, just start from the next log number, so that @@ -3472,7 +3496,7 @@ my_bool translog_init_with_table(const char *directory, logs_found= 0; } else - DBUG_RETURN(1); + goto err; } else if (LSN_OFFSET(last_page) == 0) { @@ -3485,7 +3509,7 @@ my_bool translog_init_with_table(const char *directory, { last_page-= LSN_ONE_FILE; if (translog_get_last_page_addr(&last_page, &pageok, 0)) - DBUG_RETURN(1); + goto err; } } if (logs_found) @@ -3497,7 +3521,7 @@ my_bool translog_init_with_table(const char *directory, if 
(allocate_dynamic(&log_descriptor.open_files, log_descriptor.max_file - log_descriptor.min_file + 1)) - DBUG_RETURN(1); + goto err; for (i = log_descriptor.max_file; i >= log_descriptor.min_file; i--) { /* @@ -3526,10 +3550,10 @@ my_bool translog_init_with_table(const char *directory, if (file) { free(file); - DBUG_RETURN(1); + goto err; } else - DBUG_RETURN(1); + goto err; } translog_file_init(file, i, 1); /* we allocated space so it can't fail */ @@ -3543,7 +3567,7 @@ my_bool translog_init_with_table(const char *directory, { /* There is no logs and there is read-only mode => nothing to read */ DBUG_PRINT("error", ("No logs and read-only mode")); - DBUG_RETURN(1); + goto err; } if (logs_found) @@ -3568,7 +3592,7 @@ my_bool translog_init_with_table(const char *directory, TRANSLOG_ADDRESS current_file_last_page; current_file_last_page= current_page; if (translog_get_last_page_addr(&current_file_last_page, &pageok, 0)) - DBUG_RETURN(1); + goto err; if (!pageok) { DBUG_PRINT("error", ("File %lu have no complete last page", @@ -3585,7 +3609,7 @@ my_bool translog_init_with_table(const char *directory, uchar *page; data.addr= &current_page; if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL) - DBUG_RETURN(1); + goto err; if (data.was_recovered) { DBUG_PRINT("error", ("file no: %lu (%d) " @@ -3614,7 +3638,7 @@ my_bool translog_init_with_table(const char *directory, { /* Panic!!! 
Even page which should be valid is invalid */ /* TODO: issue error */ - DBUG_RETURN(1); + goto err; } DBUG_PRINT("info", ("Last valid page is in file: %lu " "offset: %lu (0x%lx) " @@ -3639,7 +3663,7 @@ my_bool translog_init_with_table(const char *directory, LSN_FILE_NO(log_descriptor.horizon)); if ((page= translog_get_page(&data, psize_buff.buffer, NULL)) == NULL || (chunk_offset= translog_get_first_chunk_offset(page)) == 0) - DBUG_RETURN(1); + goto err; /* Puts filled part of old page in the buffer */ log_descriptor.horizon= last_valid_page; @@ -3654,7 +3678,7 @@ my_bool translog_init_with_table(const char *directory, uint16 chunk_length; if ((chunk_length= translog_get_total_chunk_length(page, chunk_offset)) == 0) - DBUG_RETURN(1); + goto err; DBUG_PRINT("info", ("chunk: offset: %u length: %u", (uint) chunk_offset, (uint) chunk_length)); chunk_offset+= chunk_length; @@ -3690,7 +3714,7 @@ my_bool translog_init_with_table(const char *directory, open_files, 0, TRANSLOG_FILE **))-> handler.file)) - DBUG_RETURN(1); + goto err; version_changed= (info.maria_version != TRANSLOG_VERSION_ID); } } @@ -3702,25 +3726,26 @@ my_bool translog_init_with_table(const char *directory, MYF(0)); DBUG_PRINT("info", ("The log is not found => we will create new log")); if (file == NULL) - DBUG_RETURN(1); + goto err; /* Start new log system from scratch */ log_descriptor.horizon= MAKE_LSN(start_file_num, TRANSLOG_PAGE_SIZE); /* header page */ if ((file->handler.file= create_logfile_by_number_no_cache(start_file_num)) == -1) - DBUG_RETURN(1); + goto err; translog_file_init(file, start_file_num, 0); if (insert_dynamic(&log_descriptor.open_files, (uchar*)&file)) - DBUG_RETURN(1); + goto err; log_descriptor.min_file= log_descriptor.max_file= start_file_num; if (translog_write_file_header()) - DBUG_RETURN(1); + goto err; DBUG_ASSERT(log_descriptor.max_file - log_descriptor.min_file + 1 == log_descriptor.open_files.elements); if (ma_control_file_write_and_force(checkpoint_lsn, start_file_num, - 
max_trid_in_control_file)) - DBUG_RETURN(1); + max_trid_in_control_file, + recovery_failures)) + goto err; /* assign buffer 0 */ translog_start_buffer(log_descriptor.buffers, &log_descriptor.bc, 0); translog_new_page_header(&log_descriptor.horizon, &log_descriptor.bc); @@ -3734,7 +3759,7 @@ my_bool translog_init_with_table(const char *directory, log_descriptor.horizon= LSN_REPLACE_OFFSET(log_descriptor.horizon, TRANSLOG_PAGE_SIZE); if (translog_create_new_file()) - DBUG_RETURN(1); + goto err; /* Buffer system left untouched after recovery => we should init it (starting from buffer 0) @@ -3767,7 +3792,7 @@ my_bool translog_init_with_table(const char *directory, id_to_share= (MARIA_SHARE **) my_malloc(SHARE_ID_MAX * sizeof(MARIA_SHARE*), MYF(MY_WME | MY_ZEROFILL)); if (unlikely(!id_to_share)) - DBUG_RETURN(1); + goto err; id_to_share--; /* min id is 1 */ /* Check the last LSN record integrity */ @@ -3783,7 +3808,7 @@ my_bool translog_init_with_table(const char *directory, page_addr= (log_descriptor.horizon - ((log_descriptor.horizon - 1) % TRANSLOG_PAGE_SIZE + 1)); if (translog_scanner_init(page_addr, 1, &scanner, 1)) - DBUG_RETURN(1); + goto err; scanner.page_offset= page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]]; for (;;) { @@ -3797,7 +3822,7 @@ my_bool translog_init_with_table(const char *directory, if (translog_get_next_chunk(&scanner)) { translog_destroy_scanner(&scanner); - DBUG_RETURN(1); + goto err; } if (scanner.page != END_OF_LOG) chunk_1byte= scanner.page[scanner.page_offset]; @@ -3808,7 +3833,7 @@ my_bool translog_init_with_table(const char *directory, if (translog_get_next_chunk(&scanner)) { translog_destroy_scanner(&scanner); - DBUG_RETURN(1); + goto err; } if (scanner.page == END_OF_LOG) break; /* it was the last record */ @@ -3845,7 +3870,7 @@ my_bool translog_init_with_table(const char *directory, } translog_destroy_scanner(&scanner); if (translog_scanner_init(page_addr, 1, &scanner, 1)) - DBUG_RETURN(1); + goto err; scanner.page_offset= 
page_overhead[scanner.page[TRANSLOG_PAGE_FLAGS]]; } translog_destroy_scanner(&scanner); @@ -3872,7 +3897,7 @@ my_bool translog_init_with_table(const char *directory, else if (translog_truncate_log(last_lsn)) { translog_free_record_header(&rec); - DBUG_RETURN(1); + goto err; } } else @@ -3898,7 +3923,7 @@ my_bool translog_init_with_table(const char *directory, else if (translog_truncate_log(last_lsn)) { translog_free_record_header(&rec); - DBUG_RETURN(1); + goto err; } } } @@ -3907,6 +3932,9 @@ my_bool translog_init_with_table(const char *directory, } } DBUG_RETURN(0); +err: + ma_message_no_user(0, "log initialization failed"); + DBUG_RETURN(1); } diff --git a/storage/maria/ma_loghandler.h b/storage/maria/ma_loghandler.h index c21d9492cba..3cb18a0eb49 100644 --- a/storage/maria/ma_loghandler.h +++ b/storage/maria/ma_loghandler.h @@ -317,6 +317,10 @@ extern void translog_deassign_id_from_share(struct st_maria_share *share); extern void translog_assign_id_to_share_from_recovery(struct st_maria_share *share, uint16 id); +extern my_bool translog_walk_filenames(const char *directory, + my_bool (*callback)(const char *, + const char *)); + enum enum_translog_status { TRANSLOG_UNINITED, /* no initialization done or error during initialization */ diff --git a/storage/maria/ma_recovery.c b/storage/maria/ma_recovery.c index 2e162b4e07d..ec679609320 100644 --- a/storage/maria/ma_recovery.c +++ b/storage/maria/ma_recovery.c @@ -191,12 +191,12 @@ static void print_preamble() @retval !=0 Error */ -int maria_recover(void) +int maria_recovery_from_log(void) { int res= 1; FILE *trace_file; uint warnings_count; - DBUG_ENTER("maria_recover"); + DBUG_ENTER("maria_recovery_from_log"); DBUG_ASSERT(!maria_in_recovery); maria_in_recovery= TRUE; @@ -462,7 +462,12 @@ end: "Maria recovery failed. 
Please run maria_chk -r on all maria " "tables and delete all maria_log.######## files", MYF(0)); procent_printed= 0; - /* we don't cleanly close tables if we hit some error (may corrupt them) */ + /* + We don't cleanly close tables if we hit some error (may corrupt them by + flushing some wrong blocks made from wrong REDOs). It also leaves their + open_count>0, which ensures that --maria-recover, if used, will try to + repair them. + */ DBUG_RETURN(error); } @@ -1224,6 +1229,12 @@ static int new_table(uint16 sid, const char *name, LSN lsn_of_file_id) " maria_chk -r", share->open_file_name); error= -1; /* not fatal, try with other tables */ goto end; + /* + Note that if a first recovery fails to apply a REDO, it marks the table + corrupted and stops the entire recovery. A second recovery will find the + table is marked corrupted and skip it (and thus possibly handle other + tables). + */ } /* don't log any records for this work */ _ma_tmp_disable_logging_for_table(info, FALSE); diff --git a/storage/maria/ma_recovery.h b/storage/maria/ma_recovery.h index 56d75f16dde..aa8fa7ecae9 100644 --- a/storage/maria/ma_recovery.h +++ b/storage/maria/ma_recovery.h @@ -25,7 +25,7 @@ C_MODE_START enum maria_apply_log_way { MARIA_LOG_APPLY, MARIA_LOG_DISPLAY_HEADER, MARIA_LOG_CHECK }; -int maria_recover(void); +int maria_recovery_from_log(void); int maria_apply_log(LSN lsn, enum maria_apply_log_way apply, FILE *trace_file, my_bool execute_undo_phase, my_bool skip_DDLs, diff --git a/storage/maria/ma_test_force_start.pl b/storage/maria/ma_test_force_start.pl new file mode 100755 index 00000000000..db97e376004 --- /dev/null +++ b/storage/maria/ma_test_force_start.pl @@ -0,0 +1,179 @@ +#!/usr/bin/env perl + + +use strict; +use warnings; + +my $usage= <<EOF; +This program tests that the options +--maria-force-start-after-recovery-failures --maria-recover work as +expected. +It has to be run from directory mysql-test, and works with non-debug +and debug binaries. 
+Pass it option -d or -i (to test corruption of data or index file). +EOF + +# -d currently exhibits BUG#36578 +# "Maria: maria-recover may fail to autorepair a table" + +die($usage) if (@ARGV == 0); + +my $corrupt_index; + +if ($ARGV[0] eq '-d') + { + $corrupt_index= 0; + } +elsif ($ARGV[0] eq '-i') + { + $corrupt_index= 1; + } +else + { + die($usage); + } + +my $force_after= 3; +my $corrupt_file= $corrupt_index ? "MAI" : "MAD"; +my $corrupt_message= + "\\[ERROR\\] mysqld: Table '.\/test\/t1' is marked as crashed and should be repaired"; + +my $sql_name= "./var/tmp/create_table.sql"; +my $error_log_name= "./var/log/master.err"; +my @cmd_output; +my $whatever; # garbage data +my $base_server_cmd= "perl mysql-test-run.pl --mem --mysqld=--maria-force-start-after-recovery-failures=$force_after maria-recover"; +my $server_cmd; +my $client_cmd= "../client/mysql -u root -S var/tmp/master.sock test < $sql_name"; +my $server_pid_name="./var/run/master.pid"; +my $server_pid; +my $i; # count of server restarts +sub kill_server; + +print "starting mysqld\n"; +$server_cmd= $base_server_cmd . " --start-and-exit 2>&1"; +@cmd_output=`$server_cmd`; +die if $?; + +open(FILE, ">", $sql_name) or die; + +# To exhibit BUG#36578 with -d, we don't create an index if -d. This is +# because the presence of an index will cause repair-by-sort to be used, +# where sort_get_next_record() is only called inside +#_ma_create_index_by_sort(), so the latter function fails and in this +# case retry_repair is set, so bug does not happen. Whereas without +# an index, repair-with-key-cache is called, which calls +# sort_get_next_record() whose failure itself does not cause a retry. + +print FILE "create table t1 (a varchar(1000)". + ($corrupt_index ? 
", index(a)" : "") .") engine=maria;\n"; +print FILE <<EOF; +insert into t1 values("ThursdayMorningsMarket"); +# If Recovery executes REDO_INDEX_NEW_PAGE it will overwrite our +# intentional corruption; we make Recovery skip this record by bumping +# create_rename_lsn using OPTIMIZE TABLE. This also makes sure to put +# the pages on disk, so that we can corrupt them. +optimize table t1; +# mark table open, so that --maria-recover repairs it +insert into t1 select concat(a,'b') from t1 limit 1; +EOF +close FILE; + +print "creating table\n"; +`$client_cmd`; +die if $?; + +print "killing mysqld hard\n"; +kill_server(9); + +print "ruining " . + ($corrupt_index ? "first page of keys" : "bitmap page") . + " in table to test maria-recover\n"; +open(FILE, "+<", "./var/master-data/test/t1.$corrupt_file") or die; +$whatever= ("\xAB" x 100); +sysseek (FILE, $corrupt_index ? 8192 : (8192-100-100), 0) or die; +syswrite (FILE, $whatever) or die; +close FILE; + +print "ruining log to make recovery fail; mysqld should fail the $force_after first restarts\n"; +open(FILE, "+<", "./var/tmp/maria_log.00000001") or die; +$whatever= ("\xAB" x 8192); +sysseek (FILE, 99, 0) or die; +syswrite (FILE, $whatever) or die; +close FILE; + +$server_cmd= $base_server_cmd . " --start-dirty 2>&1"; +for($i= 1; $i <= $force_after; $i= $i + 1) + { + print "mysqld restart number $i... "; + unlink($error_log_name) or die; + `$server_cmd`; + # mysqld should return 1 when can't read log + die unless (($? >> 8) == 1); + open(FILE, "<", $error_log_name) or die; + @cmd_output= <FILE>; + close FILE; + die unless grep(/\[ERROR\] mysqld: Maria engine: log initialization failed/, @cmd_output); + die unless grep(/\[ERROR\] Plugin 'MARIA' init function returned error./, @cmd_output); + print "failed - ok\n"; + } + +print "mysqld restart number $i... 
"; +unlink($error_log_name) or die; +@cmd_output=`$server_cmd`; +die if $?; +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die unless grep(/\[Warning\] mysqld: Maria engine: removed all logs after [\d]+ consecutive failures of recovery from logs/, @cmd_output); +die unless grep(/\[ERROR\] mysqld: File '..\/tmp\/maria_log.00000001' not found \(Errcode: 2\)/, @cmd_output); +print "success - ok\n"; + +open(FILE, ">", $sql_name) or die; +print FILE <<EOF; +set global maria_recover=normal; +insert into t1 values('aaa'); +EOF +close FILE; + +# verify corruption has not yet been noticed +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die if grep(/$corrupt_message/, @cmd_output); + +print "inserting in table\n"; +`$client_cmd`; +die if $?; +print "table is usable - ok\n"; + +open(FILE, "<", $error_log_name) or die; +@cmd_output= <FILE>; +close FILE; +die unless grep(/$corrupt_message/, @cmd_output); +die unless grep(/\[Warning\] Recovering table: '.\/test\/t1'/, @cmd_output); +print "was corrupted and automatically repaired - ok\n"; + +# remove our traces +kill_server(15); + +print "TEST ALL OK\n"; + +# kills mysqld with signal given in parameter +sub kill_server + { + my ($sig)= @_; + my $wait_count= 0; + open(FILE, "<", $server_pid_name) or die; + @cmd_output= <FILE>; + close FILE; + $server_pid= $cmd_output[0]; + die unless $server_pid > 0; + kill($sig, $server_pid) or die; + while (kill (0, $server_pid)) + { + print "waiting for mysqld to die\n" if ($wait_count > 30); + $wait_count= $wait_count + 1; + select(undef, undef, undef, 0.1); + } + } diff --git a/storage/maria/unittest/ma_control_file-t.c b/storage/maria/unittest/ma_control_file-t.c index f076615fef7..6702e4deb2f 100644 --- a/storage/maria/unittest/ma_control_file-t.c +++ b/storage/maria/unittest/ma_control_file-t.c @@ -45,6 +45,7 @@ char file_name[FN_REFLEN]; LSN expect_checkpoint_lsn; uint32 expect_logno; TrID expect_max_trid; +uint8 
expect_recovery_failures; static int delete_file(myf my_flags); /* @@ -55,10 +56,11 @@ static int close_file(void); /* wraps ma_control_file_end */ /* wraps ma_control_file_open_or_create */ static int open_file(void); /* wraps ma_control_file_write_and_force */ -static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid); +static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid, + uint8 rec_failures); /* Tests */ -static int test_one_log(void); +static int test_one_log_and_recovery_failures(void); static int test_five_logs_and_max_trid(void); static int test_3_checkpoints_and_2_logs(void); static int test_binary_content(void); @@ -135,7 +137,8 @@ int main(int argc,char *argv[]) RET_ERR_UNLESS(0 == delete_file(0)); /* if fails, can't continue */ diag("Tests of normal conditions"); - ok(0 == test_one_log(), "test of creating one log"); + ok(0 == test_one_log_and_recovery_failures(), + "test of creating one log and recording recovery failures"); ok(0 == test_five_logs_and_max_trid(), "test of creating five logs and many transactions"); ok(0 == test_3_checkpoints_and_2_logs(), @@ -167,7 +170,7 @@ static int delete_file(myf my_flags) my_delete(file_name, my_flags); expect_checkpoint_lsn= LSN_IMPOSSIBLE; expect_logno= FILENO_IMPOSSIBLE; - expect_max_trid= 0; + expect_max_trid= expect_recovery_failures= 0; return 0; } @@ -181,6 +184,7 @@ static int verify_module_values_match_expected(void) RET_ERR_UNLESS(last_logno == expect_logno); RET_ERR_UNLESS(last_checkpoint_lsn == expect_checkpoint_lsn); RET_ERR_UNLESS(max_trid_in_control_file == expect_max_trid); + RET_ERR_UNLESS(recovery_failures == expect_recovery_failures); return 0; } @@ -215,21 +219,28 @@ static int open_file(void) return 0; } -static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid) +static int write_file(LSN checkpoint_lsn, uint32 logno, TrID trid, + uint8 rec_failures) { - RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid) + 
RET_ERR_UNLESS(ma_control_file_write_and_force(checkpoint_lsn, logno, trid, + rec_failures) == 0); /* Check that the module reports expected information */ RET_ERR_UNLESS(verify_module_values_match_expected() == 0); return 0; } -static int test_one_log(void) +static int test_one_log_and_recovery_failures(void) { RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK); expect_logno= 123; RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); + expect_recovery_failures= 158; + RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, + max_trid_in_control_file, + expect_recovery_failures) == 0); RET_ERR_UNLESS(close_file() == 0); return 0; } @@ -245,7 +256,8 @@ static int test_five_logs_and_max_trid(void) { expect_logno*= 3; RET_ERR_UNLESS(write_file(last_checkpoint_lsn, expect_logno, - expect_max_trid) == 0); + expect_max_trid, + recovery_failures) == 0); } RET_ERR_UNLESS(close_file() == 0); return 0; @@ -260,23 +272,28 @@ static int test_3_checkpoints_and_2_logs(void) RET_ERR_UNLESS(open_file() == CONTROL_FILE_OK); expect_checkpoint_lsn= MAKE_LSN(5, 10000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_logno= 17; RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_checkpoint_lsn= MAKE_LSN(17, 20000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_checkpoint_lsn= MAKE_LSN(17, 45000); RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 0); + max_trid_in_control_file, + recovery_failures) == 0); expect_logno= 19; RET_ERR_UNLESS(write_file(expect_checkpoint_lsn, expect_logno, - max_trid_in_control_file) == 
0); + max_trid_in_control_file, + recovery_failures) == 0); RET_ERR_UNLESS(close_file() == 0); return 0; } diff --git a/storage/maria/unittest/ma_test_loghandler_multigroup-t.c b/storage/maria/unittest/ma_test_loghandler_multigroup-t.c index 421aa5ffb29..55b8421c3db 100644 --- a/storage/maria/unittest/ma_test_loghandler_multigroup-t.c +++ b/storage/maria/unittest/ma_test_loghandler_multigroup-t.c @@ -129,10 +129,10 @@ static my_bool read_and_check_content(TRANSLOG_HEADER_BUFFER *rec, } static const char *load_default_groups[]= {"ma_unit_loghandler", 0}; -#if defined(__WIN__) -static const char *default_dbug_option= "d:t:i:O,\\ma_test_loghandler.trace"; -#else -static const char *default_dbug_option= "d:t:i:o,/tmp/ma_test_loghandler.trace"; +#ifndef DBUG_OFF +static const char *default_dbug_option= + IF_WIN("d:t:i:O,\\ma_test_loghandler.trace", + "d:t:i:o,/tmp/ma_test_loghandler.trace"); #endif static const char *opt_wfile= NULL; static const char *opt_rfile= NULL; |
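The filename walk that this patch introduces in translog_walk_filenames() can be illustrated in isolation. The sketch below is not the Maria code: it assumes POSIX opendir()/readdir() in place of mysys my_dir(), and the names walk_callback, looks_like_log_file, walk_log_filenames and accept_first are hypothetical. It only shows the pattern visible in the diff: match "maria_log." followed by eight digits, hand each match to a callback, and stop as soon as the callback returns true.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: receives the directory and the base name. */
typedef bool (*walk_callback)(const char *directory, const char *filename);

/* True if name is "maria_log." followed by exactly 8 decimal digits. */
static bool looks_like_log_file(const char *name)
{
  static const char prefix[]= "maria_log.";
  const size_t prefix_len= sizeof(prefix) - 1;
  size_t i;

  if (strncmp(name, prefix, prefix_len) != 0)
    return false;
  /* The terminating '\0' is not a digit, so we never read past the end. */
  for (i= 0; i < 8; i++)
    if (!isdigit((unsigned char) name[prefix_len + i]))
      return false;
  return name[prefix_len + 8] == '\0';
}

/*
  Applies callback to every matching file in directory. Returns true as
  soon as the callback returns true; returns false if the directory cannot
  be opened or if no callback invocation returned true.
*/
static bool walk_log_filenames(const char *directory, walk_callback callback)
{
  DIR *dirp= opendir(directory);
  struct dirent *entry;
  bool rc= false;

  if (dirp == NULL)
    return false;
  while ((entry= readdir(dirp)) != NULL)
  {
    if (looks_like_log_file(entry->d_name) &&
        callback(directory, entry->d_name))
    {
      rc= true;
      break;
    }
  }
  closedir(dirp);
  return rc;
}

/* Accepts the first match, so the walk answers "is there any log file?". */
static bool accept_first(const char *directory, const char *filename)
{
  (void) directory;
  (void) filename;
  return true;
}
```

Calling walk_log_filenames(dir, accept_first) corresponds to how the patched translog_init_with_table() uses translog_callback_search_first. Under the same assumptions, a callback that deletes the file it is given and returns false would visit every log file instead of stopping at the first one.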