author | Marko Mäkelä <marko.makela@mariadb.com> | 2021-09-10 19:15:41 +0300 |
---|---|---|
committer | Marko Mäkelä <marko.makela@mariadb.com> | 2021-09-10 19:15:41 +0300 |
commit | d09426f9e60fd93296464ec9eb5f9d85566437d3 (patch) | |
tree | 36880933a8d7a8a28ff8cb78fc27c8caebbad5cd /mysql-test/suite | |
parent | 1c378f1b959252528ff3e97ac33c4d0fa596d4bf (diff) | |
MDEV-26537 InnoDB corrupts files due to incorrect st_blksize calculation
The st_blksize returned by fstat(2) is not documented to be
a power of 2, like we assumed in
commit 58252fff15acfe7c7b0452a87e202e3f8e454e19 (MDEV-26040).
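To illustrate why a power-of-2 assumption matters, here is a minimal sketch of two ways to round a size down to a block boundary. It is a generic illustration, not the code of the referenced commit; the helper names and the sample value 3000 are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Round down using a bit mask. This is only correct when `block`
// is a power of 2, because only then is (block - 1) a run of low bits.
constexpr uint64_t round_down_mask(uint64_t size, uint64_t block)
{
  return size & ~(block - 1);
}

// Round down using the remainder. Correct for any nonzero `block`.
constexpr uint64_t round_down_div(uint64_t size, uint64_t block)
{
  return size - size % block;
}

// True if exactly one bit is set in n.
constexpr bool is_power_of_2(uint64_t n)
{
  return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical usage: 4096 behaves, 3000 (not a power of 2) does not.
inline void check_rounding()
{
  assert(round_down_mask(10000, 4096) == round_down_div(10000, 4096));
  assert(!is_power_of_2(3000));
  // The masked result is no longer a multiple of the block size.
  assert(round_down_mask(10000, 3000) % 3000 != 0);
}
```

With a block size that is not a power of 2, the masked result is misaligned, which is the kind of miscalculation that an unchecked st_blksize could trigger.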
While on Linux, the st_blksize appears to report the file system
block size (which hopefully is not smaller than the sector size
of the underlying block device), on FreeBSD we observed
st_blksize values that seemed to be similar to st_size.
IBM AIX was affected as well. A simple test case would
lead to a crash when using the minimum innodb_buffer_pool_size=5m
on both FreeBSD and AIX:
seq -f 'create table t%g engine=innodb select * from seq_1_to_200000;' \
1 100|mysql test&
seq -f 'create table u%g engine=innodb select * from seq_1_to_200000;' \
1 100|mysql test&
We will fix this by not trusting st_blksize at all, and assuming that
the smallest allowed write size (for O_DIRECT) is 4096 bytes. We hope
that no storage systems with larger block size exist. Anything larger
than 4096 bytes should be unlikely, given that it is the minimum
virtual memory page size of many contemporary processors.
MariaDB Server on Microsoft Windows was not affected by this.
While the 512-byte sector size of the venerable Seagate ST-225 is still
in widespread use, the minimum innodb_page_size is 4096 bytes, and
innodb_log_file_size can be set in integer multiples of 65536 bytes.
The only occasion where InnoDB uses smaller data file block sizes than
4096 bytes is with ROW_FORMAT=COMPRESSED tables with KEY_BLOCK_SIZE=1
or KEY_BLOCK_SIZE=2 (or innodb_page_size=4096). For such tables,
we will from now on preallocate space in integer multiples of 4096 bytes
and let regular writes extend the file by 1024, 2048, or 3072 bytes.
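The arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not the InnoDB implementation: the names `PREALLOC_UNIT`, `extension_plan` and `plan_extension` are hypothetical, and only the 4096-byte granularity comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Assumed preallocation granularity, as stated in the commit message.
constexpr uint64_t PREALLOC_UNIT = 4096;

struct extension_plan
{
  uint64_t preallocated; // bytes reserved up front, a multiple of 4096
  uint64_t tail;         // bytes left for regular page writes to append
};

// Split the desired file size (n_pages * page_size) into the part that
// can be preallocated in 4096-byte units and the remaining tail.
inline extension_plan plan_extension(uint64_t n_pages, uint64_t page_size)
{
  const uint64_t desired = n_pages * page_size;
  // 4096 is a power of 2, so masking the low bits rounds down safely.
  const uint64_t preallocated = desired & ~(PREALLOC_UNIT - 1);
  return {preallocated, desired - preallocated};
}
```

For example, seven 1024-byte pages (7168 bytes) would be preallocated up to 4096 bytes, leaving a 3072-byte tail that ordinary page writes extend.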
The view INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES.FS_BLOCK_SIZE
should report the raw st_blksize.
For page_compressed tables, the function fil_space_get_block_size()
will map to 512 any st_blksize value that is larger than 4096.
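The mapping for page_compressed tables might look roughly like the sketch below. The function name `page_compressed_block_size` is a hypothetical stand-in, and the sketch covers only the rule stated above (values above 4096 become 512); how other values are handled is an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the mapping described in the commit message:
// any reported block size larger than 4096 bytes is replaced by 512.
// Values up to 4096 are passed through unchanged (an assumption here).
inline std::size_t page_compressed_block_size(std::size_t st_blksize)
{
  return st_blksize > 4096 ? 512 : st_blksize;
}
```

Under this sketch, a reported st_blksize of 131072 would be treated as 512, while 4096 would be kept as is.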
os_file_set_size(): Assume that the file system block size is 4096 bytes,
and only support extending files to integer multiples of 4096 bytes.
fil_space_extend_must_retry(): Round down the preallocation size to
an integer multiple of 4096 bytes.
Diffstat (limited to 'mysql-test/suite')
-rw-r--r-- | mysql-test/suite/innodb/t/check_ibd_filesize.test | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/mysql-test/suite/innodb/t/check_ibd_filesize.test b/mysql-test/suite/innodb/t/check_ibd_filesize.test
index 92f9061a3f6..b6ab95e1930 100644
--- a/mysql-test/suite/innodb/t/check_ibd_filesize.test
+++ b/mysql-test/suite/innodb/t/check_ibd_filesize.test
@@ -46,6 +46,12 @@ perl;
 print "# bytes: ", (-s "$ENV{MYSQLD_DATADIR}/test/t1.ibd"), "\n";
 EOF
 INSERT INTO t1 SELECT seq,REPEAT('a',30000) FROM seq_1_to_20;
+# Ensure that the file will be extended with the last 1024-byte page
+# after the file was pre-extended in 4096-byte increments.
+--disable_query_log
+FLUSH TABLE t1 FOR EXPORT;
+UNLOCK TABLES;
+--enable_query_log
 perl;
 print "# bytes: ", (-s "$ENV{MYSQLD_DATADIR}/test/t1.ibd"), "\n";
 EOF