Bug#16691598 - ORDER BY LOWER(COLUMN) PRODUCES OUT-OF-ORDER RESULTS

Problem:
A table is created with the UTF8_BIN collation. When a query has an ORDER BY clause over a function call, the result is returned in the wrong order.
Note: the bug is not there in 5.5.

Analysis:
In 5.5, UTF16_BIN has a minimum and maximum multi-byte length of 2 and 4 respectively. In make_sortkey(), for a 2-byte character we assume that the resulting length will be 2 bytes per character. But my_strnxfrm_unicode_full_bin() stores sorting weights using 3 bytes per character, so the stored result is truncated.

The same thing happens for UTF8MB4, which has a 1-byte minimum and a 4-byte maximum multi-byte length. We assume the resulting data takes 1 byte per character, which again gives a truncated result.

Solution:
Set the MY_CS_STRNXFRM flag so that strnxfrm is used for the sort; with it, the resulting length does not depend on the source length.
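The truncation described in the analysis can be illustrated with a small standalone sketch. This is not the server's make_sortkey() or my_strnxfrm_unicode_full_bin() code; the helper names weights_3byte() and compare_keys(), the fixed 64-byte buffers, and the big-endian weight layout are assumptions made only for illustration. It shows how a key buffer sized at 2 bytes per character can hide a difference that a buffer sized at 3 bytes per character preserves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write each code point as a 3-byte big-endian
   weight, stopping as soon as dst is full. Returns the bytes written. */
static size_t weights_3byte(uint8_t *dst, size_t dstlen,
                            const uint32_t *cps, size_t ncps)
{
  size_t out = 0;
  for (size_t i = 0; i < ncps && out < dstlen; i++)
  {
    const uint8_t w[3] = {
      (uint8_t) (cps[i] >> 16), (uint8_t) (cps[i] >> 8), (uint8_t) cps[i]
    };
    for (size_t j = 0; j < 3 && out < dstlen; j++)
      dst[out++] = w[j];
  }
  return out;
}

/* Hypothetical helper: build sort keys for two strings of n code points,
   reserving bytes_per_char bytes per character, then compare the keys. */
static int compare_keys(const uint32_t *a, const uint32_t *b,
                        size_t n, size_t bytes_per_char)
{
  uint8_t ka[64] = {0}, kb[64] = {0};
  size_t keylen = n * bytes_per_char;

  assert(keylen <= sizeof ka);   /* sketch only handles short strings */
  weights_3byte(ka, keylen, a, n);
  weights_3byte(kb, keylen, b, n);
  return memcmp(ka, kb, keylen);
}
```

With the code points of "aab" and "aaa" as input, reserving 2 bytes per character leaves room for only the first two 3-byte weights, so the keys compare equal and the relative order of the rows is lost. Reserving 3 bytes per character keeps the third weight and the comparison sees the difference.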
author: Neeraj Bisht <neeraj.x.bisht@oracle.com> 2013-11-07 16:46:24 +0530
committer: Neeraj Bisht <neeraj.x.bisht@oracle.com> 2013-11-07 16:46:24 +0530
commit: 88680a99c6acdcd8be84c16e970c7616c912ff59 (patch)
tree: 56530b682c2aa9ef7b56384939aa997cd35d441d /strings
parent: e6949c24f4dcaae6275fad4424896e011f955fd6 (diff)
download: mariadb-git-88680a99c6acdcd8be84c16e970c7616c912ff59.tar.gz
2 files changed, 3 insertions, 2 deletions
diff --git a/strings/ctype-ucs2.c b/strings/ctype-ucs2.c
index cecd4424108..f1d0e775804 100644
--- a/strings/ctype-ucs2.c
+++ b/strings/ctype-ucs2.c
@@ -1664,7 +1664,7 @@ CHARSET_INFO my_charset_utf16_general_ci=
 CHARSET_INFO my_charset_utf16_bin=
 {
   55,0,0,              /* number       */
-  MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_UNICODE|MY_CS_NONASCII,
+  MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_STRNXFRM|MY_CS_UNICODE|MY_CS_NONASCII,
   "utf16",             /* cs name      */
   "utf16_bin",         /* name         */
   "UTF-16 Unicode",    /* comment      */
diff --git a/strings/ctype-utf8.c b/strings/ctype-utf8.c
index 4976a9cf31a..62d5fbe0111 100644
--- a/strings/ctype-utf8.c
+++ b/strings/ctype-utf8.c
@@ -5435,7 +5435,8 @@ CHARSET_INFO my_charset_utf8mb4_general_ci=
 CHARSET_INFO my_charset_utf8mb4_bin=
 {
   46,0,0,             /* number       */
-  MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_UNICODE|MY_CS_UNICODE_SUPPLEMENT, /* state  */
+  MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_STRNXFRM|MY_CS_UNICODE|
+  MY_CS_UNICODE_SUPPLEMENT, /* state  */
   MY_UTF8MB4,         /* cs name      */
   MY_UTF8MB4_BIN,     /* name         */
   "UTF-8 Unicode",    /* comment      */