MDEV-23535 SIGSEGV, SIGABRT and SIGILL in typeinfo for Item_func_set_collation (on optimized builds)

This piece of the code in Item_func_or_sum::agg_item_set_converter: if (!conv && ((*arg)->collation.repertoire == MY_REPERTOIRE_ASCII)) conv= new (thd->mem_root) Item_func_conv_charset(thd, *arg, coll.collation, 1); was wrong because: 1. It could change Item_cache to Item_func_conv_charset (with the old Item_cache in args[0]). Such Item type change is not always supported, e.g. the code in Item_singlerow_subselect::reset() expects only Item_cache, to be able to call Item_cache::set_null(). So it erroneously reinterpreted Item_func_conv_charset to Item_cache and called a non-existing method set_null(), which crashed the server. 2. The 1 in the last parameter to Item_func_conv_charset() was also a problem. In MariaDB versions where the reported query did not crash, it erroneously returned "empty set" instead of one row, because the 1 made subselects execute too earlier and return NULL. Fix: 1. Removing the above two lines from Item_func_or_sum::agg_item_set_converter() 2. Adding the repertoire test inside the constructor of Item_func_conv_charset, so it now detects itself as "safe" in more cases than before. This is needed to avoid new "Illegal mix of collations" after removing the wrong code in various scenarios when character set conversion from pure ASCII happens, including the reported scenario. So now this sequence: Item_cache -> Item_func_concat is replaced to this compatible sequence (the top Item is still Item_cache): new Item_cache -> Item_func_conv_charset -> Item_func_concat Before the fix it was replaced to this incompatible sequence: Item_func_conv_charset -> old Item_cache -> Item_func_concat
author: Alexander Barkov <bar@mariadb.com> 2020-09-03 11:31:06 +0400
committer: Alexander Barkov <bar@mariadb.com> 2020-09-03 14:20:06 +0400
commit: f0a57acb492c4b034213d37c1fb5943850619f5a (patch)
tree: 4e11460941c358fed505f696ab36653b0f8a5d43 /sql
parent: 94a520ddbe39ae97de1135d98699cf2674e6b77e (diff)
download: mariadb-git-f0a57acb492c4b034213d37c1fb5943850619f5a.tar.gz
2 files changed, 9 insertions, 3 deletions
diff --git a/sql/item.cc b/sql/item.cc
index 2e12b8d8b64..8792b1c3f32 100644
--- a/sql/item.cc
+++ b/sql/item.cc
@@ -2188,8 +2188,6 @@ bool Item_func_or_sum::agg_item_set_converter(const DTCollation &coll,
     Item* conv= (*arg)->safe_charset_converter(thd, coll.collation);
     if (conv == *arg)
       continue;
-    if (!conv && ((*arg)->collation.repertoire == MY_REPERTOIRE_ASCII))
-      conv= new (thd->mem_root) Item_func_conv_charset(thd, *arg, coll.collation, 1);
 
     if (!conv)
     {
diff --git a/sql/item_strfunc.h b/sql/item_strfunc.h
index 9ea9ff97016..6989639ae67 100644
--- a/sql/item_strfunc.h
+++ b/sql/item_strfunc.h
@@ -1053,11 +1053,19 @@ public:
       /*
         Conversion from and to "binary" is safe.
         Conversion to Unicode is safe.
+        Conversion from an expression with the ASCII repertoire
+        to any character set that can store characters U+0000..U+007F
+        is safe:
+        - All supported multibyte character sets can store U+0000..U+007F
+        - All supported 7bit character sets can store U+0000..U+007F
+          except those marked with MY_CS_NONASCII (e.g. swe7).
         Other kind of conversions are potentially lossy.
       */
       safe= (args[0]->collation.collation == &my_charset_bin ||
              cs == &my_charset_bin ||
-             (cs->state & MY_CS_UNICODE));
+             (cs->state & MY_CS_UNICODE) ||
+             (args[0]->collation.repertoire == MY_REPERTOIRE_ASCII &&
+              (cs->mbmaxlen > 1 || !(cs->state & MY_CS_NONASCII))));
     }
   }
   String *val_str(String *);
author	Alexander Barkov <bar@mariadb.com>	2020-09-03 11:31:06 +0400
committer	Alexander Barkov <bar@mariadb.com>	2020-09-03 14:20:06 +0400
commit	f0a57acb492c4b034213d37c1fb5943850619f5a (patch)
tree	4e11460941c358fed505f696ab36653b0f8a5d43 /sql
parent	94a520ddbe39ae97de1135d98699cf2674e6b77e (diff)
download	mariadb-git-f0a57acb492c4b034213d37c1fb5943850619f5a.tar.gz