author | Alexander Barkov <bar@mariadb.com> | 2023-01-20 09:52:00 +0400 |
---|---|---|
committer | Jan Lindström <jan.lindstrom@mariadb.com> | 2023-01-20 11:40:01 +0200 |
commit | eea9f2a1e7089f2b06faaabfedad0690b561f2ce (patch) | |
tree | aad7d4ce1ed6304965510fb3f03e7fa9aaf7e14a | |
parent | ae96e21cf0a4696a3fb7ccb27de970ff2dc0dd6b (diff) | |
download | mariadb-git-eea9f2a1e7089f2b06faaabfedad0690b561f2ce.tar.gz |
MDEV-27653 long uniques don't work with unicode collations
There are no source code changes in this commit!
This is an empty follow-up commit for
284ac6f2b73650f138064c97a96c8e1d8846550b
to document what was done, as the patch itself did not have
change comments.
Problems solved in this patch:
1. The function calc_hash_for_unique() erroneously took the string
length into account, so strings that are equal in terms of the
collation but have different lengths got different hash values.
For example:
- LATIN LETTER A - 1 byte
- LATIN LETTER A WITH ACUTE - 2 bytes
are equal in utf8_general_ci, but as their lengths
are different, calc_hash_for_unique() returned
different hash values.
2. calc_hash_for_unique() also erroneously used val_str()
result to calculate hashes. This may not be correct for
some data types, e.g. TIMESTAMP, as its string
value depends on the session environment (e.g. @@time_zone).
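Problem 1 can be illustrated with a small standalone sketch. Everything
in it is hypothetical and is not MariaDB code: toy_weights() is a stand-in
for a case- and accent-insensitive collation that only knows ASCII letters
and A/a WITH ACUTE, and FNV-1a is used merely as a convenient mixing
function. It only shows why mixing the byte length into the hash breaks
equality under the collation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy collation (assumption): map each UTF-8 character to a one-byte weight.
// U+00C1 (0xC3 0x81) and U+00E1 (0xC3 0xA1) get the same weight as 'A'.
static std::vector<uint8_t> toy_weights(const std::string &s)
{
  std::vector<uint8_t> w;
  for (std::size_t i= 0; i < s.size(); i++)
  {
    unsigned char c= static_cast<unsigned char>(s[i]);
    if (c == 0xC3 && i + 1 < s.size())
    {
      unsigned char d= static_cast<unsigned char>(s[i + 1]);
      if (d == 0x81 || d == 0xA1)
      {
        w.push_back('A');
        i++;                       // consume the second byte
        continue;
      }
    }
    if (c >= 'a' && c <= 'z')
      c= static_cast<unsigned char>(c - 'a' + 'A');
    w.push_back(c);
  }
  return w;
}

// One FNV-1a step, used here only as a mixing function.
static uint64_t mix(uint64_t h, uint8_t byte)
{
  return (h ^ byte) * 1099511628211ULL;
}

// Flawed variant: the byte length of the value is part of the hash.
uint64_t hash_with_length(const std::string &s)
{
  uint64_t h= 14695981039346656037ULL;
  h= mix(h, static_cast<uint8_t>(s.size()));
  for (uint8_t w : toy_weights(s))
    h= mix(h, w);
  return h;
}

// Corrected variant: only the collation weights are hashed.
uint64_t hash_weights_only(const std::string &s)
{
  uint64_t h= 14695981039346656037ULL;
  for (uint8_t w : toy_weights(s))
    h= mix(h, w);
  return h;
}

// Self-check: "A" (1 byte) versus "A WITH ACUTE" (2 bytes in UTF-8).
void check_length_example()
{
  const std::string a= "A";
  const std::string a_acute= "\xC3\x81";
  assert(hash_with_length(a) != hash_with_length(a_acute));
  assert(hash_weights_only(a) == hash_weights_only(a_acute));
}
```

With the flawed variant the two values land in different hash buckets,
so a hash-based unique check never gets to compare them.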
Change summary:
Instead of doing Item::val_str(), we should always call
Field::hash() of the underlying Field. It properly
handles both cases (equal strings with different
lengths, as well as tricky data types like TIMESTAMP).
Detailed change description:
Non-functional changes (make the code cleaner):
- Adding a helper class Hasher, to make it easier to pass the hash
parts nr1 and nr2 through function arguments.
- Splitting virtual Field::hash() into non-virtual
wrapper Field::hash() and virtual Field::hash_not_null().
This helps to get rid of duplicate code handling SQL NULL,
as it was equal in all Field_xxx implementations.
- Adding a new method THD::my_ok_with_recreate_info().
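The Hasher helper and the Field::hash() split described above could look
roughly like the following sketch. All classes here (Hasher_sketch,
Field_sketch, Field_byte_sketch) and the mixing arithmetic are simplified
assumptions made for illustration, not the actual server code. The point
is only the shape: NULL is handled once in the non-virtual wrapper, and
each field type overrides just the not-NULL part:

```cpp
#include <cstdint>

// Hypothetical helper carrying the two running hash parts together,
// so callers pass one object instead of separate nr1/nr2 pointers.
struct Hasher_sketch
{
  uint64_t nr1= 1;
  uint64_t nr2= 4;
  void add_null()
  {
    nr1^= (nr1 << 1) | 1;
  }
  void add_byte(uint8_t b)
  {
    // Toy mixing step (assumption), unsigned wrap-around is intended.
    nr1^= (((nr1 & 63) + nr2) * b) + (nr1 << 8);
    nr2+= 3;
  }
};

class Field_sketch
{
public:
  virtual ~Field_sketch() = default;
  virtual bool is_null() const = 0;

  // Non-virtual wrapper: SQL NULL handling lives in exactly one place.
  void hash(Hasher_sketch *hasher) const
  {
    if (is_null())
      hasher->add_null();
    else
      hash_not_null(hasher);
  }

protected:
  // Type-specific part: called only for non-NULL values.
  virtual void hash_not_null(Hasher_sketch *hasher) const = 0;
};

// Hypothetical one-byte field type.
class Field_byte_sketch : public Field_sketch
{
  bool null_flag;
  uint8_t value;

public:
  Field_byte_sketch(bool null_arg, uint8_t value_arg)
    : null_flag(null_arg), value(value_arg) {}
  bool is_null() const override { return null_flag; }

protected:
  void hash_not_null(Hasher_sketch *hasher) const override
  {
    hasher->add_byte(value);
  }
};
```

In this sketch two NULL fields always leave the hasher in the same state
regardless of the stale value stored in them, because hash_not_null() is
never reached for NULL.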
Actual fix changes (make new tables work properly):
- Adding a virtual method Item::hash_not_null().
This helps to handle hashes on full fields (Item_field)
and hashes on prefix fields (Item_func_left(Item_field))
in a polymorphic way.
Implementing overrides for Item_field and Item_func_left.
- Rewriting Item_func_hash::val_int() to use Item::hash_not_null(),
instead of the combination of val_str() and calc_hash_for_unique().
Backward compatibility changes (make old tables work in the new server):
- Adding a new class Item_func_hash_mariadb_100403.
Moving the old version of Item_func_hash::val_int()
into Item_func_hash_mariadb_100403::val_int().
The old class Item_func_hash_mariadb_100403 is still needed
to open old tables before the upgrade is done.
- Adding TABLE_SHARE::old_long_hash_function() and
handler::check_long_hash_compatibility() to test
if a table is using an old hash function.
- Adding a helper method TABLE_SHARE::make_long_hash_func()
to instantiate either Item_func_hash_mariadb_100403 (for old
tables that have not been upgraded) or Item_func_hash (for new tables).
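The selection between the old and the new hash function can be pictured
with the minimal sketch below. The types are hypothetical stand-ins, and
the uses_old_hash flag is an assumption: how the server actually decides
that a table uses the old hash function is not described here.

```cpp
#include <memory>
#include <string>

// Hypothetical common interface for both hash function variants.
struct Hash_func_sketch
{
  virtual ~Hash_func_sketch() = default;
  virtual std::string name() const = 0;
};

// Stand-in for Item_func_hash_mariadb_100403 (old behaviour).
struct Hash_func_old_sketch : Hash_func_sketch
{
  std::string name() const override { return "old"; }
};

// Stand-in for Item_func_hash (new behaviour).
struct Hash_func_new_sketch : Hash_func_sketch
{
  std::string name() const override { return "new"; }
};

// Stand-in for TABLE_SHARE; uses_old_hash is an assumed flag.
struct Table_share_sketch
{
  bool uses_old_hash;

  bool old_long_hash_function() const { return uses_old_hash; }

  // Old tables keep the old function, so their stored hashes stay valid.
  std::unique_ptr<Hash_func_sketch> make_long_hash_func() const
  {
    if (old_long_hash_function())
      return std::unique_ptr<Hash_func_sketch>(new Hash_func_old_sketch());
    return std::unique_ptr<Hash_func_sketch>(new Hash_func_new_sketch());
  }
};
```

Keeping the choice inside one factory method means the rest of the code
does not need to know which variant a given table uses.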
Upgrade changes (make old tables upgrade in the new server properly):
Upgrading an old table to a new hash can be done using either
of these two statements:
ALTER IGNORE TABLE t1 FORCE;
REPAIR TABLE t1;
!!! These statements find and filter out erroneous duplicates !!!
After these statements the table will have fewer records
if there were erroneous duplicates (such as A and A WITH ACUTE).
Both statements report information about the filtered-out records.
- Adding a new class Recreate_info to return information
about copied and duplicate rows from these functions:
- mysql_alter_table()
- mysql_recreate_table()
- admin_recreate_table()
This helps to print a warning during REPAIR:
MariaDB [test]> repair table mdev27653_100422_text;
+----------------------------+--------+----------+------------------------------------+
| Table                      | Op     | Msg_type | Msg_text                           |
+----------------------------+--------+----------+------------------------------------+
| test.mdev27653_100422_text | repair | Warning  | Number of rows changed from 2 to 1 |
| test.mdev27653_100422_text | repair | status   | OK                                 |
+----------------------------+--------+----------+------------------------------------+
2 rows in set (0.018 sec)
0 files changed, 0 insertions, 0 deletions