author Alexander Barkov <bar@mariadb.com> 2023-03-31 17:20:03 +0400
committer Alexander Barkov <bar@mariadb.com> 2023-04-04 12:30:50 +0400
commit 8020b1bd735c686818f1563e2c2317e263d5bd3a
tree 9280a9d419e60dc409f88138e732c8ff67050c2e /include
parent 0cc1694e9c7481b59d372af7f759bb0bcf552bfa
MDEV-30034 UNIQUE USING HASH accepts duplicate entries for tricky collations

- Adding a new argument "flags" to
  MY_COLLATION_HANDLER::strnncollsp_nchars() and a flag
  MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES.
  The flag defines whether strnncollsp_nchars() should emulate trailing
  spaces which were possibly trimmed earlier (e.g. in InnoDB CHAR
  compression). This is important for NOPAD collations.

  For example, with this input:
   - str1= 'a '   (Latin letter a followed by one space)
   - str2= 'a  '  (Latin letter a followed by two spaces)
   - nchars= 3
  if the flag is given, strnncollsp_nchars() will virtually restore
  one trailing space to str1 up to nchars (3) characters and compare
  the two strings as equal:
   - str1= 'a  '  (one extra trailing space emulated)
   - str2= 'a  '  (as is)

  If the flag is not given, strnncollsp_nchars() does not add trailing
  virtual spaces, so in case of a NOPAD collation, str1 will be compared
  as less than str2 because it is shorter.

- Field_string::cmp_prefix() now passes the new flag.
  Field_varstring::cmp_prefix() and Field_blob::cmp_prefix() do
  not pass the new flag.

- The branch in cmp_whole_field() in storage/innobase/rem/rem0cmp.cc
  (which handles the CHAR data type) now also passes the new flag.

- Fixing UCA collations to respect the new flag.
  Other collations are possibly also affected, however
  I had no success in making an SQL script demonstrating the problem.
  Other collations will be extended to respect this flag in a separate
  patch later.

- Changing the meaning of the last parameter of Field::cmp_prefix()
  from "number of bytes" (internal length)
  to "number of characters" (user visible length).

  The code calling cmp_prefix() from handler.cc was wrong.
  After this change, the call in handler.cc became correct.

  The code calling cmp_prefix() from key_rec_cmp() in key.cc
  was adjusted according to this change.

- Old strnncollsp_nchars() related tests in unittest/strings/strings-t.c
  now pass the new flag.
  A few new tests were also added, without the flag.

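The flag semantics described above can be illustrated with a minimal sketch. This is not the MariaDB implementation: it assumes a single-byte character set where the byte value is the collation weight, and the names nopad_cmp_nchars() and EMULATE_TRIMMED_TRAILING_SPACES are hypothetical stand-ins for the real handler method and flag.

```c
#include <stddef.h>

/* Hypothetical stand-in for
   MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES */
#define EMULATE_TRIMMED_TRAILING_SPACES 1u

/*
  Simplified NOPAD comparison limited to "nchars" characters.
  Assumption: one byte is one character and its value is its weight.
  Returns <0, 0 or >0, like the real comparison functions.
*/
static int nopad_cmp_nchars(const unsigned char *s1, size_t len1,
                            const unsigned char *s2, size_t len2,
                            size_t nchars, unsigned flags)
{
  int emulate= (flags & EMULATE_TRIMMED_TRAILING_SPACES) != 0;
  size_t end1, end2, i;

  /* Longer strings are truncated to nchars with or without the flag */
  if (len1 > nchars)
    len1= nchars;
  if (len2 > nchars)
    len2= nchars;

  /* With the flag, both strings are treated as exactly nchars long */
  end1= emulate ? nchars : len1;
  end2= emulate ? nchars : len2;

  for (i= 0; i < end1 && i < end2; i++)
  {
    /* Positions past the stored bytes are emulated trailing spaces */
    unsigned char c1= i < len1 ? s1[i] : ' ';
    unsigned char c2= i < len2 ? s2[i] : ' ';
    if (c1 != c2)
      return c1 < c2 ? -1 : 1;
  }

  /* NOPAD: with a common prefix, the shorter string sorts first */
  if (end1 == end2)
    return 0;
  return end1 < end2 ? -1 : 1;
}
```

With str1= 'a ' (length 2), str2= 'a  ' (length 3) and nchars= 3, this sketch returns 0 when the flag is passed and a negative value when it is not, matching the example in the commit message.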
Diffstat (limited to 'include')
-rw-r--r-- include/m_ctype.h 25
1 files changed, 24 insertions, 1 deletions
diff --git a/include/m_ctype.h b/include/m_ctype.h
index 484cd0a657e..96eea74d5ba 100644
--- a/include/m_ctype.h
+++ b/include/m_ctype.h
@@ -248,6 +248,28 @@ extern MY_UNI_CTYPE my_uni_ctype[256];
#define MY_STRXFRM_REVERSE_LEVEL6 0x00200000 /* if reverse order for level6 */
#define MY_STRXFRM_REVERSE_SHIFT 16
+/* Flags to strnncollsp_nchars */
+/*
+ MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES -
+ defines if inside strnncollsp_nchars()
+ short strings should be virtually extended to "nchars"
+ characters by emulating trimmed trailing spaces.
+
+ This flag is needed when comparing packed strings of the CHAR
+ data type, when trailing spaces are trimmed on storage (like in InnoDB),
+ however the actual values (after unpacking) will have those trailing
+ spaces.
+
+ If this flag is passed, strnncollsp_nchars() performs both
+ truncating longer strings and extending shorter strings
+ to exactly "nchars".
+
+ If this flag is not passed, strnncollsp_nchars() only truncates longer
+ strings to "nchars", but does not extend shorter strings to "nchars".
+*/
+#define MY_STRNNCOLLSP_NCHARS_EMULATE_TRIMMED_TRAILING_SPACES 1
+
+
/*
Collation IDs for MariaDB that should not conflict with MySQL.
We reserve 256..511, because MySQL will most likely use this range
@@ -383,7 +405,8 @@ struct my_collation_handler_st
int (*strnncollsp_nchars)(CHARSET_INFO *,
const uchar *str1, size_t len1,
const uchar *str2, size_t len2,
- size_t nchars);
+ size_t nchars,
+ uint flags);
size_t (*strnxfrm)(CHARSET_INFO *,
uchar *dst, size_t dstlen, uint nweights,
const uchar *src, size_t srclen, uint flags);