summaryrefslogtreecommitdiff
path: root/storage/maria/ma_rename.c
diff options
context:
space:
mode:
authorunknown <guilhem@gbichot3.local>2007-06-22 14:49:37 +0200
committerunknown <guilhem@gbichot3.local>2007-06-22 14:49:37 +0200
commit1a96259191b193b353387cbb70d7567009e3b247 (patch)
tree27f19470e270f1546d4eb9ac1eaf51ff23ec8a08 /storage/maria/ma_rename.c
parentfd9bd5802932b08da7484c54445ae14ee4e25385 (diff)
downloadmariadb-git-1a96259191b193b353387cbb70d7567009e3b247.tar.gz
- WL#3239 "log CREATE TABLE in Maria"
- WL#3240 "log DROP TABLE in Maria" - similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and DELETE no_WHERE_clause (== the DELETE which just truncates the files) - create_rename_lsn added to MARIA_SHARE's state - all these operations (except DROP TABLE) also update the table's create_rename_lsn, which is needed for the correctness of Recovery (see function comment of _ma_repair_write_log_record() in ma_check.c) - write a COMMIT record when transaction commits. - don't log REDOs/UNDOs if this is an internal temporary table like inside ALTER TABLE (I expect this to be a big win). There was already no logging for user-created "CREATE TEMPORARY" tables. - don't fsync files/directories if the table is not transactional - in translog_write_record(), autogenerate a 2-byte-id for the table and log the "id->name" pair (LOGREC_FILE_ID); log LOGREC_LONG_TRANSACTION_ID; automatically store the table's 2-byte-id in any log record. - preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint when some dirty pages are unknown; capturing trn->rec_lsn, trn->first_undo_lsn for Checkpoint and log's low-water-mark computing. - assertions, comments. storage/maria/Makefile.am: more files to build storage/maria/ha_maria.cc: - logging a REPAIR log record if REPAIR/OPTIMIZE was successful. - ha_maria::data_file_type does not have to be set in every info() call, just do it once in open(). - if caller said that transactionality can be disabled (like if caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we temporarily disable transactionality of the table in external_lock(); that will ensure that no REDOs/UNDOs are logged for this possibly massive write operation (they are not needed, as if any write fails, the table will be dropped). We re-enable in external_lock(F_UNLCK), which in ALTER TABLE happens before the tmp table replaces the original one (which is good, as thus the final table will have a REDO RENAME and a correct create_rename_lsn). - when we commit we also have to write a log record, so trnman_commit_trn() calls become ma_commit() calls - at end of engine's initialization, we are potentially entering a multi-threaded dangerous world (clients are going to be accepted) and so some assertions of mutex-owning become enforceable, for that we set maria_multi_threaded=TRUE (see ma_control_file.c) storage/maria/ha_maria.h: new member ha_maria::save_transactional (see also ha_maria.cc) storage/maria/ma_blockrec.c: - fixing comments according to discussion with Monty - if a table is transactional but temporarily non-transactional (like in ALTER TABLE), we need to give a sensible LSN to the pages (and, if we give 0, pagecache asserts). - translog_write_record() now takes care of storing the share's 2-byte-id in the log record storage/maria/ma_blockrec.h: fixing comment according to discussion with Monty storage/maria/ma_check.c: When REPAIR/OPTIMIZE modify the data/index file, if this is a transactional table, they must sync it; if they remove files or rename files, they must sync the directory, so that everything is durable. This is just applying to REPAIR/OPTIMIZE the logic already implemented in CREATE/DROP/RENAME a few months ago. Adding a function to write a LOGREC_REPAIR_TABLE at end of REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and to update the table's create_rename_lsn. storage/maria/ma_close.c: fix for a future bug storage/maria/ma_control_file.c: ensuring that if Maria is running in multi-threaded mode, anybody wanting to write to the control file and update last_checkpoint_lsn/last_logno owns the log's lock. storage/maria/ma_control_file.h: see ma_control_file.c storage/maria/ma_create.c: when creating a table: - sync it and its directory only if this is a transactional table and there is a log (no point in syncing in maria_chk) - decouple the two uses of linkname/linkname_ptr (for index file and for data file) into more variables, as we need to know all links until the moment we write the LOGREC_CREATE_TABLE. - set share.data_file_type early so that _ma_initialize_data_file() knows it (Monty's bugfix so that a table always has at least a bitmap page when it is created; so data-file is not 0 bytes anymore). - log a LOGREC_CREATE_TABLE; it contains the bytes which we have just written to the index file's header. Update table's create_rename_lsn. - syncing of kfile had been bugified in a previous merge, correcting - syncing of dfile is now needed as it's not empty anymore - in _ma_initialize_data_file(), use share's block_size and not the global one. This is a gratuitous change, both variables are equal, just that I find it more future-proof to use share-bound variable rather than global one. storage/maria/ma_delete_all.c: log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows(); update create_rename_lsn then. storage/maria/ma_delete_table.c: - logging LOGREC_DROP_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - we need to sync directories only if the table is transactional storage/maria/ma_extra.c: questions storage/maria/ma_init.c: when maria_end() is called, engine is not multithreaded storage/maria/ma_loghandler.c: - translog_inited has to be visible to ma_create() (see how it is used in ma_create()) - checkpoint record will be a single record, not three - no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will log a REDO_CREATE) - adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by truncating the files), REPAIR. - MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk - in translog_write_record(), if MARIA_SHARE does not yet have a 2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically store this short id into log records. - in translog_write_record(), if transaction has not logged its long trid, log LOGREC_LONG_TRANSACTION_ID. - For Checkpoint, we need to know the current end-of-log: adding translog_get_horizon(). - For Control File, adding an assertion that the thread owns the log's lock (control file is protected by this lock) storage/maria/ma_loghandler.h: Changes in log records (see ma_loghandler.c). new prototypes, new functions. storage/maria/ma_loghandler_lsn.h: adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn, where the most significant byte is used for flags. storage/maria/ma_open.c: storing the create_rename_lsn in the index file's header (in the state, precisely) and retrieving it from there. storage/maria/ma_pagecache.c: - my set_if_bigger was wrong, correcting it - if the first_in_switch list is not empty, it means that changed_blocks misses some dirty pages, so Checkpoint cannot run and needs to wait. A variable missing_blocks_in_changed_list is added to tell that (should it be named missing_blocks_in_changed_blocks?) - pagecache_collect_changed_blocks_with_lsn() now also tells the minimum rec_lsn (needed for low-water mark computation). storage/maria/ma_pagecache.h: see ma_pagecache.c storage/maria/ma_panic.c: comment storage/maria/ma_range.c: comment storage/maria/ma_rename.c: - logging LOGREC_RENAME_TABLE; knowing if this is needed, requires knowing if the table is transactional, which requires opening the table. - update create_rename_lsn - we need to sync directories only if the table is transactional storage/maria/ma_static.c: comment storage/maria/ma_test_all.sh: - tip for Valgrind-ing ma_test_all - do "export maria_path=somepath" before calling ma_test_all, if you want to run ma_test_all out of storage/maria (useful to have parallel runs, like one normal and one Valgrind, they must not use the same tables so need to run in different directories) storage/maria/maria_def.h: - state now contains, in memory and on disk, the create_rename_lsn - share now contains a 2-byte-id storage/maria/trnman.c: preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn; minimum first_undo_lsn needed to know log's low-water-mark storage/maria/trnman.h: using most significant byte of first_undo_lsn to hold miscellaneous flags, for now TRANSACTION_LOGGED_LONG_ID. dummy_transaction_object is already declared in ma_static.c. storage/maria/trnman_public.h: dummy_transaction_object was declared in all files including trnman_public.h, while in fact it's a single object. new prototype storage/maria/unittest/ma_test_loghandler-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multigroup-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_multithread-t.c: update for new prototype storage/maria/unittest/ma_test_loghandler_pagecache-t.c: update for new prototype storage/maria/ma_commit.c: function which wraps: - writing a LOGREC_COMMIT record (==commit on disk) - calling trnman_commit_trn() (=commit in memory) storage/maria/ma_commit.h: new header file .tree-is-private: this file is now needed to keep our tree private (don't push it to public trees). When 5.1 is merged into mysql-maria, we can abandon our maria-specific post-commit trigger; .tree_is_private will take care of keeping commit mails private. Don't push this file to public trees.
Diffstat (limited to 'storage/maria/ma_rename.c')
-rw-r--r--storage/maria/ma_rename.c96
1 files changed, 74 insertions, 22 deletions
diff --git a/storage/maria/ma_rename.c b/storage/maria/ma_rename.c
index a80bbcd398f..5224698c614 100644
--- a/storage/maria/ma_rename.c
+++ b/storage/maria/ma_rename.c
@@ -18,6 +18,18 @@
*/
#include "ma_fulltext.h"
+#include "trnman_public.h"
+
+/**
+ @brief renames a table
+
+ @param old_name current name of table
+ @param new_name table should be renamed to this name
+
+ @return Operation status
+ @retval 0 OK
+ @retval !=0 Error
+*/
int maria_rename(const char *old_name, const char *new_name)
{
@@ -26,22 +38,73 @@ int maria_rename(const char *old_name, const char *new_name)
#ifdef USE_RAID
uint raid_type=0,raid_chunks=0;
#endif
+ MARIA_HA *info;
+ MARIA_SHARE *share;
+ myf sync_dir;
DBUG_ENTER("maria_rename");
#ifdef EXTRA_DEBUG
_ma_check_table_is_closed(old_name,"rename old_table");
_ma_check_table_is_closed(new_name,"rename new table2");
#endif
- /* LOCK TODO take X-lock on table here */
+ /** @todo LOCK take X-lock on table */
+ if (!(info= maria_open(old_name, O_RDWR, HA_OPEN_FOR_REPAIR)))
+ DBUG_RETURN(my_errno);
+ share= info->s;
#ifdef USE_RAID
+ raid_type = share->base.raid_type;
+ raid_chunks = share->base.raid_chunks;
+#endif
+
+ sync_dir= (share->base.transactional && !share->temporary) ?
+ MY_SYNC_DIR : 0;
+ if (sync_dir)
{
- MARIA_HA *info;
- if (!(info=maria_open(old_name, O_RDONLY, 0)))
- DBUG_RETURN(my_errno);
- raid_type = info->s->base.raid_type;
- raid_chunks = info->s->base.raid_chunks;
- maria_close(info);
+ uchar log_data[LSN_STORE_SIZE];
+ LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 3];
+ uint old_name_len= strlen(old_name), new_name_len= strlen(new_name);
+ int2store(log_data, old_name_len);
+ int2store(log_data + 2, new_name_len);
+ log_array[TRANSLOG_INTERNAL_PARTS + 0].str= log_data;
+ log_array[TRANSLOG_INTERNAL_PARTS + 0].length= 2 + 2;
+ log_array[TRANSLOG_INTERNAL_PARTS + 1].str= (char *)old_name;
+ log_array[TRANSLOG_INTERNAL_PARTS + 1].length= old_name_len;
+ log_array[TRANSLOG_INTERNAL_PARTS + 2].str= (char *)new_name;
+ log_array[TRANSLOG_INTERNAL_PARTS + 2].length= new_name_len;
+ /*
+ For this record to be of any use for Recovery, we need the upper
+ MySQL layer to be crash-safe, which it is not now (that would require
+ work using the ddl_log of sql/sql_table.cc); when it is, we should
+ reconsider the moment of writing this log record (before or after op,
+ under THR_LOCK_maria or not...), how to use it in Recovery, and force
+ the log. For now this record is just informative.
+ */
+ if (unlikely(translog_write_record(&share->state.create_rename_lsn,
+ LOGREC_REDO_RENAME_TABLE,
+ &dummy_transaction_object, NULL,
+ 2 + 2 + old_name_len + new_name_len,
+ sizeof(log_array)/sizeof(log_array[0]),
+ log_array, NULL)))
+ {
+ maria_close(info);
+ DBUG_RETURN(1);
+ }
+ /*
+ store LSN into file, needed for Recovery to not be confused if a
+ RENAME happened (applying REDOs to the wrong table).
+ */
+ lsn_store(log_data, share->state.create_rename_lsn);
+ if (my_pwrite(share->kfile.file, log_data, sizeof(log_data),
+ sizeof(share->state.header) + 2, MYF(MY_NABP)) ||
+ my_sync(share->kfile.file, MYF(MY_WME)))
+ {
+ maria_close(info);
+ DBUG_RETURN(1);
+ }
}
+
+ maria_close(info);
+#ifdef USE_RAID
#ifdef EXTRA_DEBUG
_ma_check_table_is_closed(old_name,"rename raidcheck");
#endif
@@ -49,29 +112,18 @@ int maria_rename(const char *old_name, const char *new_name)
fn_format(from,old_name,"",MARIA_NAME_IEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT);
fn_format(to,new_name,"",MARIA_NAME_IEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT);
- /*
- RECOVERY TODO log the two renames below. Update
- ZeroDirtyPagesLSN of the table on disk (=> sync the files), this is
- needed so that Recovery does not pick a wrong table.
- Then do the file renames.
- For this log record to be of any use for Recovery, we need the upper MySQL
- layer to be crash-safe in DDLs; when it is we should reconsider the moment
- of writing this log record, how to use it in Recovery, and force the log.
- For now this record is only informative. But ZeroDirtyPagesLSN is
- critically needed!
- */
- if (my_rename_with_symlink(from, to, MYF(MY_WME | MY_SYNC_DIR)))
+ if (my_rename_with_symlink(from, to, MYF(MY_WME | sync_dir)))
DBUG_RETURN(my_errno);
fn_format(from,old_name,"",MARIA_NAME_DEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT);
fn_format(to,new_name,"",MARIA_NAME_DEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT);
#ifdef USE_RAID
if (raid_type)
data_file_rename_error= my_raid_rename(from, to, raid_chunks,
- MYF(MY_WME | MY_SYNC_DIR));
+ MYF(MY_WME | sync_dir));
else
#endif
data_file_rename_error=
- my_rename_with_symlink(from, to, MYF(MY_WME | MY_SYNC_DIR));
+ my_rename_with_symlink(from, to, MYF(MY_WME | sync_dir));
if (data_file_rename_error)
{
/*
@@ -81,7 +133,7 @@ int maria_rename(const char *old_name, const char *new_name)
data_file_rename_error= my_errno;
fn_format(from, old_name, "", MARIA_NAME_IEXT, MYF(MY_UNPACK_FILENAME|MY_APPEND_EXT));
fn_format(to, new_name, "", MARIA_NAME_IEXT, MYF(MY_UNPACK_FILENAME|MY_APPEND_EXT));
- my_rename_with_symlink(to, from, MYF(MY_WME | MY_SYNC_DIR));
+ my_rename_with_symlink(to, from, MYF(MY_WME | sync_dir));
}
DBUG_RETURN(data_file_rename_error);