diff options
author | unknown <guilhem@gbichot3.local> | 2007-06-22 14:49:37 +0200 |
---|---|---|
committer | unknown <guilhem@gbichot3.local> | 2007-06-22 14:49:37 +0200 |
commit | 1a96259191b193b353387cbb70d7567009e3b247 (patch) | |
tree | 27f19470e270f1546d4eb9ac1eaf51ff23ec8a08 /storage/maria/ma_rename.c | |
parent | fd9bd5802932b08da7484c54445ae14ee4e25385 (diff) | |
download | mariadb-git-1a96259191b193b353387cbb70d7567009e3b247.tar.gz |
- WL#3239 "log CREATE TABLE in Maria"
- WL#3240 "log DROP TABLE in Maria"
- similarly, log RENAME TABLE, REPAIR/OPTIMIZE TABLE, and
DELETE no_WHERE_clause (== the DELETE which just truncates the files)
- create_rename_lsn added to MARIA_SHARE's state
- all these operations (except DROP TABLE) also update the table's
create_rename_lsn, which is needed for the correctness of
Recovery (see function comment of _ma_repair_write_log_record()
in ma_check.c)
- write a COMMIT record when transaction commits.
- don't log REDOs/UNDOs if this is an internal temporary table
like inside ALTER TABLE (I expect this to be a big win). There was
already no logging for user-created "CREATE TEMPORARY" tables.
- don't fsync files/directories if the table is not transactional
- in translog_write_record(), autogenerate a 2-byte-id for the table
and log the "id->name" pair (LOGREC_FILE_ID); log
LOGREC_LONG_TRANSACTION_ID; automatically store
the table's 2-byte-id in any log record.
- preparations for Checkpoint: translog_get_horizon(); pausing Checkpoint
when some dirty pages are unknown; capturing trn->rec_lsn,
trn->first_undo_lsn for Checkpoint and log's low-water-mark computing.
- assertions, comments.
storage/maria/Makefile.am:
more files to build
storage/maria/ha_maria.cc:
- logging a REPAIR log record if REPAIR/OPTIMIZE was successful.
- ha_maria::data_file_type does not have to be set in every info()
call, just do it once in open().
- if caller said that transactionality can be disabled (like if
caller is ALTER TABLE) i.e. thd->transaction.on==FALSE, then we
temporarily disable transactionality of the table in external_lock();
that will ensure that no REDOs/UNDOs are logged for this possibly
massive write operation (they are not needed, as if any write fails,
the table will be dropped). We re-enable in external_lock(F_UNLCK),
which in ALTER TABLE happens before the tmp table replaces the original
one (which is good, as thus the final table will have a REDO RENAME
and a correct create_rename_lsn).
- when we commit we also have to write a log record, so
trnman_commit_trn() calls become ma_commit() calls
- at end of engine's initialization, we are potentially entering a
multi-threaded dangerous world (clients are going to be accepted)
and so some assertions of mutex-owning become enforceable, for that
we set maria_multi_threaded=TRUE (see ma_control_file.c)
storage/maria/ha_maria.h:
new member ha_maria::save_transactional (see also ha_maria.cc)
storage/maria/ma_blockrec.c:
- fixing comments according to discussion with Monty
- if a table is transactional but temporarily non-transactional
(like in ALTER TABLE), we need to give a sensible LSN to the pages
(and, if we give 0, pagecache asserts).
- translog_write_record() now takes care of storing the share's
2-byte-id in the log record
storage/maria/ma_blockrec.h:
fixing comment according to discussion with Monty
storage/maria/ma_check.c:
When REPAIR/OPTIMIZE modify the data/index file, if this is a
transactional table, they must sync it; if they remove files or rename
files, they must sync the directory, so that everything is durable.
This is just applying to REPAIR/OPTIMIZE the logic already implemented
in CREATE/DROP/RENAME a few months ago.
Adding a function to write a LOGREC_REPAIR_TABLE at end of
REPAIR/OPTIMIZE (called only by ha_maria, not by maria_chk), and
to update the table's create_rename_lsn.
storage/maria/ma_close.c:
fix for a future bug
storage/maria/ma_control_file.c:
ensuring that if Maria is running in multi-threaded mode, anybody
wanting to write to the control file and update
last_checkpoint_lsn/last_logno owns the log's lock.
storage/maria/ma_control_file.h:
see ma_control_file.c
storage/maria/ma_create.c:
when creating a table:
- sync it and its directory only if this is a transactional table
and there is a log (no point in syncing in maria_chk)
- decouple the two uses of linkname/linkname_ptr (for index file and
for data file) into more variables, as we need to know all links
until the moment we write the LOGREC_CREATE_TABLE.
- set share.data_file_type early so that _ma_initialize_data_file()
knows it (Monty's bugfix so that a table always has at least a bitmap
page when it is created; so data-file is not 0 bytes anymore).
- log a LOGREC_CREATE_TABLE; it contains the bytes which we have
just written to the index file's header. Update table's
create_rename_lsn.
- syncing of kfile had been bugified in a previous merge, correcting
- syncing of dfile is now needed as it's not empty anymore
- in _ma_initialize_data_file(), use share's block_size and not the
global one. This is a gratuitous change, both variables are equal,
just that I find it more future-proof to use share-bound variable
rather than global one.
storage/maria/ma_delete_all.c:
log a LOGREC_DELETE_ALL record when doing ma_delete_all_rows();
update create_rename_lsn then.
storage/maria/ma_delete_table.c:
- logging LOGREC_DROP_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- we need to sync directories only if the table is transactional
storage/maria/ma_extra.c:
questions
storage/maria/ma_init.c:
when maria_end() is called, engine is not multithreaded
storage/maria/ma_loghandler.c:
- translog_inited has to be visible to ma_create() (see how it is used
in ma_create())
- checkpoint record will be a single record, not three
- no REDO for TRUNCATE (TRUNCATE calls ma_create() internally so will
log a REDO_CREATE)
- adding REDO for DELETE no_WHERE_clause (fast DELETE of all rows by
truncating the files), REPAIR.
- MY_WAIT_IF_FULL to wait&retry if a log write hits a full disk
- in translog_write_record(), if MARIA_SHARE does not yet have a
2-byte-id, generate one for it and log LOGREC_FILE_ID; automatically
store this short id into log records.
- in translog_write_record(), if transaction has not logged its
long trid, log LOGREC_LONG_TRANSACTION_ID.
- For Checkpoint, we need to know the current end-of-log: adding
translog_get_horizon().
- For Control File, adding an assertion that the thread owns the
log's lock (control file is protected by this lock)
storage/maria/ma_loghandler.h:
Changes in log records (see ma_loghandler.c).
new prototypes, new functions.
storage/maria/ma_loghandler_lsn.h:
adding a type LSN_WITH_FLAGS especially for TRN::first_undo_lsn,
where the most significant byte is used for flags.
storage/maria/ma_open.c:
storing the create_rename_lsn in the index file's header (in the
state, precisely) and retrieving it from there.
storage/maria/ma_pagecache.c:
- my set_if_bigger was wrong, correcting it
- if the first_in_switch list is not empty, it means that
changed_blocks misses some dirty pages, so Checkpoint cannot run and
needs to wait. A variable missing_blocks_in_changed_list is added to
tell that (should it be named missing_blocks_in_changed_blocks?)
- pagecache_collect_changed_blocks_with_lsn() now also tells the
minimum rec_lsn (needed for low-water mark computation).
storage/maria/ma_pagecache.h:
see ma_pagecache.c
storage/maria/ma_panic.c:
comment
storage/maria/ma_range.c:
comment
storage/maria/ma_rename.c:
- logging LOGREC_RENAME_TABLE; knowing if this is needed, requires
knowing if the table is transactional, which requires opening the
table.
- update create_rename_lsn
- we need to sync directories only if the table is transactional
storage/maria/ma_static.c:
comment
storage/maria/ma_test_all.sh:
- tip for Valgrind-ing ma_test_all
- do "export maria_path=somepath" before calling ma_test_all,
if you want to run ma_test_all out of storage/maria (useful
to have parallel runs, like one normal and one Valgrind, they
must not use the same tables so need to run in different directories)
storage/maria/maria_def.h:
- state now contains, in memory and on disk, the create_rename_lsn
- share now contains a 2-byte-id
storage/maria/trnman.c:
preparations for Checkpoint: capture trn->rec_lsn, trn->first_undo_lsn;
minimum first_undo_lsn needed to know log's low-water-mark
storage/maria/trnman.h:
using most significant byte of first_undo_lsn to hold miscellaneous
flags, for now TRANSACTION_LOGGED_LONG_ID.
dummy_transaction_object is already declared in ma_static.c.
storage/maria/trnman_public.h:
dummy_transaction_object was declared in all files including
trnman_public.h, while in fact it's a single object.
new prototype
storage/maria/unittest/ma_test_loghandler-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multigroup-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_multithread-t.c:
update for new prototype
storage/maria/unittest/ma_test_loghandler_pagecache-t.c:
update for new prototype
storage/maria/ma_commit.c:
function which wraps:
- writing a LOGREC_COMMIT record (==commit on disk)
- calling trnman_commit_trn() (=commit in memory)
storage/maria/ma_commit.h:
new header file
.tree-is-private:
this file is now needed to keep our tree private (don't push it
to public trees). When 5.1 is merged into mysql-maria, we can abandon
our maria-specific post-commit trigger; .tree_is_private will take
care of keeping commit mails private. Don't push this file to public
trees.
Diffstat (limited to 'storage/maria/ma_rename.c')
-rw-r--r-- | storage/maria/ma_rename.c | 96 |
1 files changed, 74 insertions, 22 deletions
diff --git a/storage/maria/ma_rename.c b/storage/maria/ma_rename.c index a80bbcd398f..5224698c614 100644 --- a/storage/maria/ma_rename.c +++ b/storage/maria/ma_rename.c @@ -18,6 +18,18 @@ */ #include "ma_fulltext.h" +#include "trnman_public.h" + +/** + @brief renames a table + + @param old_name current name of table + @param new_name table should be renamed to this name + + @return Operation status + @retval 0 OK + @retval !=0 Error +*/ int maria_rename(const char *old_name, const char *new_name) { @@ -26,22 +38,73 @@ int maria_rename(const char *old_name, const char *new_name) #ifdef USE_RAID uint raid_type=0,raid_chunks=0; #endif + MARIA_HA *info; + MARIA_SHARE *share; + myf sync_dir; DBUG_ENTER("maria_rename"); #ifdef EXTRA_DEBUG _ma_check_table_is_closed(old_name,"rename old_table"); _ma_check_table_is_closed(new_name,"rename new table2"); #endif - /* LOCK TODO take X-lock on table here */ + /** @todo LOCK take X-lock on table */ + if (!(info= maria_open(old_name, O_RDWR, HA_OPEN_FOR_REPAIR))) + DBUG_RETURN(my_errno); + share= info->s; #ifdef USE_RAID + raid_type = share->base.raid_type; + raid_chunks = share->base.raid_chunks; +#endif + + sync_dir= (share->base.transactional && !share->temporary) ? + MY_SYNC_DIR : 0; + if (sync_dir) { - MARIA_HA *info; - if (!(info=maria_open(old_name, O_RDONLY, 0))) - DBUG_RETURN(my_errno); - raid_type = info->s->base.raid_type; - raid_chunks = info->s->base.raid_chunks; - maria_close(info); + uchar log_data[LSN_STORE_SIZE]; + LEX_STRING log_array[TRANSLOG_INTERNAL_PARTS + 3]; + uint old_name_len= strlen(old_name), new_name_len= strlen(new_name); + int2store(log_data, old_name_len); + int2store(log_data + 2, new_name_len); + log_array[TRANSLOG_INTERNAL_PARTS + 0].str= log_data; + log_array[TRANSLOG_INTERNAL_PARTS + 0].length= 2 + 2; + log_array[TRANSLOG_INTERNAL_PARTS + 1].str= (char *)old_name; + log_array[TRANSLOG_INTERNAL_PARTS + 1].length= old_name_len; + log_array[TRANSLOG_INTERNAL_PARTS + 2].str= (char *)new_name; + log_array[TRANSLOG_INTERNAL_PARTS + 2].length= new_name_len; + /* + For this record to be of any use for Recovery, we need the upper + MySQL layer to be crash-safe, which it is not now (that would require + work using the ddl_log of sql/sql_table.cc); when it is, we should + reconsider the moment of writing this log record (before or after op, + under THR_LOCK_maria or not...), how to use it in Recovery, and force + the log. For now this record is just informative. + */ + if (unlikely(translog_write_record(&share->state.create_rename_lsn, + LOGREC_REDO_RENAME_TABLE, + &dummy_transaction_object, NULL, + 2 + 2 + old_name_len + new_name_len, + sizeof(log_array)/sizeof(log_array[0]), + log_array, NULL))) + { + maria_close(info); + DBUG_RETURN(1); + } + /* + store LSN into file, needed for Recovery to not be confused if a + RENAME happened (applying REDOs to the wrong table). + */ + lsn_store(log_data, share->state.create_rename_lsn); + if (my_pwrite(share->kfile.file, log_data, sizeof(log_data), + sizeof(share->state.header) + 2, MYF(MY_NABP)) || + my_sync(share->kfile.file, MYF(MY_WME))) + { + maria_close(info); + DBUG_RETURN(1); + } } + + maria_close(info); +#ifdef USE_RAID #ifdef EXTRA_DEBUG _ma_check_table_is_closed(old_name,"rename raidcheck"); #endif @@ -49,29 +112,18 @@ int maria_rename(const char *old_name, const char *new_name) fn_format(from,old_name,"",MARIA_NAME_IEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT); fn_format(to,new_name,"",MARIA_NAME_IEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT); - /* - RECOVERY TODO log the two renames below. Update - ZeroDirtyPagesLSN of the table on disk (=> sync the files), this is - needed so that Recovery does not pick a wrong table. - Then do the file renames. - For this log record to be of any use for Recovery, we need the upper MySQL - layer to be crash-safe in DDLs; when it is we should reconsider the moment - of writing this log record, how to use it in Recovery, and force the log. - For now this record is only informative. But ZeroDirtyPagesLSN is - critically needed! - */ - if (my_rename_with_symlink(from, to, MYF(MY_WME | MY_SYNC_DIR))) + if (my_rename_with_symlink(from, to, MYF(MY_WME | sync_dir))) DBUG_RETURN(my_errno); fn_format(from,old_name,"",MARIA_NAME_DEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT); fn_format(to,new_name,"",MARIA_NAME_DEXT,MY_UNPACK_FILENAME|MY_APPEND_EXT); #ifdef USE_RAID if (raid_type) data_file_rename_error= my_raid_rename(from, to, raid_chunks, - MYF(MY_WME | MY_SYNC_DIR)); + MYF(MY_WME | sync_dir)); else #endif data_file_rename_error= - my_rename_with_symlink(from, to, MYF(MY_WME | MY_SYNC_DIR)); + my_rename_with_symlink(from, to, MYF(MY_WME | sync_dir)); if (data_file_rename_error) { /* @@ -81,7 +133,7 @@ int maria_rename(const char *old_name, const char *new_name) data_file_rename_error= my_errno; fn_format(from, old_name, "", MARIA_NAME_IEXT, MYF(MY_UNPACK_FILENAME|MY_APPEND_EXT)); fn_format(to, new_name, "", MARIA_NAME_IEXT, MYF(MY_UNPACK_FILENAME|MY_APPEND_EXT)); - my_rename_with_symlink(to, from, MYF(MY_WME | MY_SYNC_DIR)); + my_rename_with_symlink(to, from, MYF(MY_WME | sync_dir)); } DBUG_RETURN(data_file_rename_error); |