author | Neeraj Bisht <neeraj.x.bisht@oracle.com> | 2013-11-07 16:46:24 +0530 |
---|---|---|
committer | Neeraj Bisht <neeraj.x.bisht@oracle.com> | 2013-11-07 16:46:24 +0530 |
commit | 88680a99c6acdcd8be84c16e970c7616c912ff59 (patch) | |
tree | 56530b682c2aa9ef7b56384939aa997cd35d441d /strings | |
parent | e6949c24f4dcaae6275fad4424896e011f955fd6 (diff) | |
download | mariadb-git-88680a99c6acdcd8be84c16e970c7616c912ff59.tar.gz |
Bug#16691598 - ORDER BY LOWER(COLUMN) PRODUCES OUT-OF-ORDER RESULTS
Problem:
A table is created with the UTF8_BIN collation.
When a query has an ORDER BY clause over a function call, the
result is returned in the wrong order.
Note: the bug is not present in 5.5.
Analysis:
In 5.5, for UTF16_BIN, the min and max multi-byte lengths are 2 and 4
respectively. In make_sortkey(), for a 2-byte character we
assume that the resultant length will be 2 bytes/character. But when we
use my_strnxfrm_unicode_full_bin(), we store sorting weights using 3 bytes
per character. This truncates the result.
The same thing happens for UTF8MB4, where the min multi-byte length is
1 byte and the max is 4 bytes. We assume the resultant data takes
1 byte/character, which truncates the result.
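The truncation can be illustrated with a small stand-alone sketch. It is not the server code: `toy_strnxfrm_full_bin` and `toy_compare_keys` are hypothetical helpers that only mimic the behaviour described above (3-byte weights written into a key buffer sized from an assumed bytes-per-character value).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a full-binary Unicode transform: writes each
   code point as a 3-byte big-endian weight and stops when dst is full.
   Returns the number of bytes written. */
static size_t toy_strnxfrm_full_bin(unsigned char *dst, size_t dst_len,
                                    const unsigned long *cps, size_t ncps)
{
    size_t out = 0;
    assert(dst != NULL && cps != NULL);
    for (size_t i = 0; i < ncps; i++) {
        unsigned char w[3];
        w[0] = (unsigned char)((cps[i] >> 16) & 0xFF);
        w[1] = (unsigned char)((cps[i] >> 8) & 0xFF);
        w[2] = (unsigned char)(cps[i] & 0xFF);
        for (size_t j = 0; j < 3 && out < dst_len; j++)
            dst[out++] = w[j];
    }
    return out;
}

/* Builds two zero-padded sort keys of nchars * bytes_per_char bytes and
   compares them with memcmp(), the way a binary sort would. */
static int toy_compare_keys(const unsigned long *a, const unsigned long *b,
                            size_t nchars, size_t bytes_per_char)
{
    unsigned char ka[64] = {0}, kb[64] = {0};
    size_t key_len = nchars * bytes_per_char;
    assert(key_len <= sizeof ka);
    toy_strnxfrm_full_bin(ka, key_len, a, nchars);
    toy_strnxfrm_full_bin(kb, key_len, b, nchars);
    return memcmp(ka, kb, key_len);
}
```

With the three-character inputs "abc" and "abd", a key sized at 2 bytes/character holds only the weights of the first two characters, so the two keys compare equal and their relative order is lost. Sizing the key at 3 bytes/character keeps the last weight, and "abc" sorts before "abd".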
Solution:
Use strnxfrm for sorting (that is, set the MY_CS_STRNXFRM flag), so that
the resultant length does not depend on the source length.
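A rough sketch of the idea, under stated assumptions: the struct, the flag value, and `toy_sort_key_length` below are hypothetical and much simpler than the real sort-length computation in the server. The sketch only shows how a state flag can switch key sizing from "source bytes per character" to "weight bytes per character".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; named after the real macro for illustration only. */
#define TOY_CS_STRNXFRM 64u

/* Hypothetical, reduced view of a character set descriptor. */
struct toy_charset {
    unsigned state;       /* bit flags, e.g. TOY_CS_STRNXFRM              */
    size_t mbminlen;      /* minimum bytes per character in the source    */
    size_t weight_bytes;  /* bytes per character written by the transform */
};

/* Returns the sort key length for nchars characters. Without the flag the
   length follows the source encoding; with the flag it follows the size of
   the weights the transform actually writes. */
static size_t toy_sort_key_length(const struct toy_charset *cs, size_t nchars)
{
    assert(cs != NULL);
    if (cs->state & TOY_CS_STRNXFRM)
        return nchars * cs->weight_bytes;
    return nchars * cs->mbminlen;
}
```

For a utf16_bin-like descriptor (2 bytes minimum, 3-byte weights) and 10 characters, this yields 20 bytes without the flag and 30 bytes with it, which is enough room for every weight.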
Diffstat (limited to 'strings')
-rw-r--r-- | strings/ctype-ucs2.c | 2 |
-rw-r--r-- | strings/ctype-utf8.c | 3 |
2 files changed, 3 insertions, 2 deletions
diff --git a/strings/ctype-ucs2.c b/strings/ctype-ucs2.c
index cecd4424108..f1d0e775804 100644
--- a/strings/ctype-ucs2.c
+++ b/strings/ctype-ucs2.c
@@ -1664,7 +1664,7 @@ CHARSET_INFO my_charset_utf16_general_ci=
 CHARSET_INFO my_charset_utf16_bin=
 {
     55,0,0, /* number */
-    MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_UNICODE|MY_CS_NONASCII,
+    MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_STRNXFRM|MY_CS_UNICODE|MY_CS_NONASCII,
     "utf16", /* cs name */
     "utf16_bin", /* name */
     "UTF-16 Unicode", /* comment */
diff --git a/strings/ctype-utf8.c b/strings/ctype-utf8.c
index 4976a9cf31a..62d5fbe0111 100644
--- a/strings/ctype-utf8.c
+++ b/strings/ctype-utf8.c
@@ -5435,7 +5435,8 @@ CHARSET_INFO my_charset_utf8mb4_general_ci=
 CHARSET_INFO my_charset_utf8mb4_bin=
 {
     46,0,0, /* number */
-    MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_UNICODE|MY_CS_UNICODE_SUPPLEMENT, /* state */
+    MY_CS_COMPILED|MY_CS_BINSORT|MY_CS_STRNXFRM|MY_CS_UNICODE|
+    MY_CS_UNICODE_SUPPLEMENT, /* state */
     MY_UTF8MB4, /* cs name */
     MY_UTF8MB4_BIN, /* name */
     "UTF-8 Unicode", /* comment */