diff options
author | Luis Soares <luis.soares@oracle.com> | 2010-09-24 10:44:53 +0100 |
---|---|---|
committer | Luis Soares <luis.soares@oracle.com> | 2010-09-24 10:44:53 +0100 |
commit | 66a40d0b8ac73ece279d8a91e3da2aaa9f65ef09 (patch) | |
tree | cdfa97b221f1127cb222fedd3d491f0eb92b7554 /sql/slave.cc | |
parent | b288324a13452a5c8430af6eb560ba1ba4b73746 (diff) | |
download | mariadb-git-66a40d0b8ac73ece279d8a91e3da2aaa9f65ef09.tar.gz |
BUG#55263: assert in check_binlog_magic
The procedure for setting up a fake binary log, by changing the
relay log files manually, is run twice when we issue mtr with
--repeat=2. However, when setting it up for the second time,
neither the sql thread is reset nor the server is restarted. This
means that previous stale relay log IO_CACHE metadata - from
first run - is left around. As a consequence, when the test is
run for the second time, the IO_CACHE for the relay log has
position in file different than zero, triggering the assertion.
We fix this by deploying a call to my_b_seek before calling
check_binlog_magic in next_event. This prevents the server
from asserting, in the cases that the SQL thread was reads
from a hot log (relay.NNNNN), then is stopped, then is restarted
from a previous cold log (relay.MMMMM), and then it reaches
the hot log relay.NNNNN again, in which case, the read
coordinates are not set to zero, but to the old values.
Additionally, some comments to the source code were added.
Diffstat (limited to 'sql/slave.cc')
-rw-r--r-- | sql/slave.cc | 64 |
1 files changed, 59 insertions, 5 deletions
diff --git a/sql/slave.cc b/sql/slave.cc index f1e0962e7e8..bd9017e6318 100644 --- a/sql/slave.cc +++ b/sql/slave.cc @@ -4349,12 +4349,66 @@ static Log_event* next_event(Relay_log_info* rli) DBUG_ASSERT(rli->cur_log_fd == -1); /* - Read pointer has to be at the start since we are the only - reader. - We must keep the LOCK_log to read the 4 first bytes, as this is a hot - log (same as when we call read_log_event() above: for a hot log we - take the mutex). + When the SQL thread is [stopped and] (re)started the + following may happen: + + 1. Log was hot at stop time and remains hot at restart + + SQL thread reads again from hot_log (SQL thread was + reading from the active log when it was stopped and the + very same log is still active on SQL thread restart). + + In this case, my_b_seek is performed on cur_log, while + cur_log points to relay_log.get_log_file(); + + 2. Log was hot at stop time but got cold before restart + + The log was hot when SQL thread stopped, but it is not + anymore when the SQL thread restarts. + + In this case, the SQL thread reopens the log, using + cache_buf, ie, cur_log points to &cache_buf, and thence + its coordinates are reset. + + 3. Log was already cold at stop time + + The log was not hot when the SQL thread stopped, and, of + course, it will not be hot when it restarts. + + In this case, the SQL thread opens the cold log again, + using cache_buf, ie, cur_log points to &cache_buf, and + thence its coordinates are reset. + + 4. Log was hot at stop time, DBA changes to previous cold + log and restarts SQL thread + + The log was hot when the SQL thread was stopped, but the + user changed the coordinates of the SQL thread to + restart from a previous cold log. + + In this case, at start time, cur_log points to a cold + log, opened using &cache_buf as cache, and coordinates + are reset. However, as it moves on to the next logs, it + will eventually reach the hot log. If the hot log is the + same at the time the SQL thread was stopped, then + coordinates were not reset - the cur_log will point to + relay_log.get_log_file(), and not a freshly opened + IO_CACHE through cache_buf. For this reason we need to + deploy a my_b_seek before calling check_binlog_magic at + this point of the code (see: BUG#55263 for more + details). + + NOTES: + - We must keep the LOCK_log to read the 4 first bytes, as + this is a hot log (same as when we call read_log_event() + above: for a hot log we take the mutex). + + - Because of scenario #4 above, we need to have a + my_b_seek here. Otherwise, we might hit the assertion + inside check_binlog_magic. */ + + my_b_seek(cur_log, (my_off_t) 0); if (check_binlog_magic(cur_log,&errmsg)) { if (!hot_log) pthread_mutex_unlock(log_lock); |