path: root/sql/item_strfunc.h
Commit message [Author, Age, Files, Lines]
* MDEV-27009 Add UCA-14.0.0 collations [Alexander Barkov, 2022-08-10, 1, -2/+3]
|   - Added one neutral and 22 tailored (language specific) collations
|     based on Unicode Collation Algorithm version 14.0.0.
|
|     Collations were added for Unicode character sets
|     utf8mb3, utf8mb4, ucs2, utf16, utf32.
|
|     Every tailoring was added with four accent and case
|     sensitivity flag combinations, e.g.:
|
|     * utf8mb4_uca1400_swedish_as_cs
|     * utf8mb4_uca1400_swedish_as_ci
|     * utf8mb4_uca1400_swedish_ai_cs
|     * utf8mb4_uca1400_swedish_ai_ci
|
|     and their _nopad_ variants:
|
|     * utf8mb4_uca1400_swedish_nopad_as_cs
|     * utf8mb4_uca1400_swedish_nopad_as_ci
|     * utf8mb4_uca1400_swedish_nopad_ai_cs
|     * utf8mb4_uca1400_swedish_nopad_ai_ci
|
|   - Introducing a concept of contextually typed named collations:
|
|       CREATE DATABASE db1 CHARACTER SET utf8mb4;
|       CREATE TABLE db1.t1 (a CHAR(10) COLLATE uca1400_as_ci);
|
|     The idea is that there is no need to specify the character set
|     prefix in the new collation names. It's enough to type just the
|     suffix "uca1400_as_ci". The character set is taken from the context.
|
|     In the above example script the context character set is utf8mb4.
|     So the CREATE TABLE will make a column with the collation
|     utf8mb4_uca1400_as_ci.
|
|     Short collation names can be used in any part of the SQL syntax
|     where the COLLATE clause is understood.
|
|   - New collations are displayed only one time
|     (without character set combinations) by these statements:
|
|       SELECT * FROM INFORMATION_SCHEMA.COLLATIONS;
|       SHOW COLLATION;
|
|     For example, all these collations:
|
|     - utf8mb3_uca1400_swedish_as_ci
|     - utf8mb4_uca1400_swedish_as_ci
|     - ucs2_uca1400_swedish_as_ci
|     - utf16_uca1400_swedish_as_ci
|     - utf32_uca1400_swedish_as_ci
|
|     have just one entry in INFORMATION_SCHEMA.COLLATIONS and
|     SHOW COLLATION, with COLLATION_NAME equal to "uca1400_swedish_as_ci",
|     which is the suffix without the character set name:
|
|       SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLLATIONS
|       WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci';
|
|       +-----------------------+
|       | COLLATION_NAME        |
|       +-----------------------+
|       | uca1400_swedish_as_ci |
|       +-----------------------+
|
|     Note, the behaviour of old collations did not change.
|     Non-unicode collations (e.g. latin1_swedish_ci) and
|     old UCA-4.0.0 collations (e.g. utf8mb4_unicode_ci)
|     are still displayed with the character set prefix, as before.
|
|   - The structure of the table INFORMATION_SCHEMA.COLLATIONS was changed.
|     The NOT NULL constraint was removed from these columns:
|
|     - CHARACTER_SET_NAME
|     - ID
|     - IS_DEFAULT
|
|     and from the corresponding columns in SHOW COLLATION.
|
|     For example:
|
|       SELECT COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT
|       FROM INFORMATION_SCHEMA.COLLATIONS
|       WHERE COLLATION_NAME LIKE '%uca1400_swedish_as_ci';
|
|       +-----------------------+--------------------+------+------------+
|       | COLLATION_NAME        | CHARACTER_SET_NAME | ID   | IS_DEFAULT |
|       +-----------------------+--------------------+------+------------+
|       | uca1400_swedish_as_ci | NULL               | NULL | NULL       |
|       +-----------------------+--------------------+------+------------+
|
|     The NULL value in these columns now means that the collation
|     is applicable to multiple character sets.
|     The behaviour of old collations did not change.
|     Make sure your client programs can handle NULL values in these columns.
|
|   - The structure of the table
|     INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY was changed.
|
|     Three new NOT NULL columns were added:
|
|     - FULL_COLLATION_NAME
|     - ID
|     - IS_DEFAULT
|
|     New collations have multiple entries in
|     COLLATION_CHARACTER_SET_APPLICABILITY.
|     The column COLLATION_NAME contains the collation name without
|     the character set prefix. The column FULL_COLLATION_NAME contains
|     the collation name with the character set prefix.
|
|     Old collations have the full collation name in both
|     FULL_COLLATION_NAME and COLLATION_NAME.
|
|       SELECT COLLATION_NAME, FULL_COLLATION_NAME, CHARACTER_SET_NAME, ID, IS_DEFAULT
|       FROM INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY
|       WHERE FULL_COLLATION_NAME RLIKE '^(utf8mb4|latin1).*swedish.*ci$';
|
|       +-----------------------------+-------------------------------------+--------------------+------+------------+
|       | COLLATION_NAME              | FULL_COLLATION_NAME                 | CHARACTER_SET_NAME | ID   | IS_DEFAULT |
|       +-----------------------------+-------------------------------------+--------------------+------+------------+
|       | latin1_swedish_ci           | latin1_swedish_ci                   | latin1             |    8 | Yes        |
|       | latin1_swedish_nopad_ci     | latin1_swedish_nopad_ci             | latin1             | 1032 |            |
|       | utf8mb4_swedish_ci          | utf8mb4_swedish_ci                  | utf8mb4            |  232 |            |
|       | uca1400_swedish_ai_ci       | utf8mb4_uca1400_swedish_ai_ci       | utf8mb4            | 2368 |            |
|       | uca1400_swedish_as_ci       | utf8mb4_uca1400_swedish_as_ci       | utf8mb4            | 2370 |            |
|       | uca1400_swedish_nopad_ai_ci | utf8mb4_uca1400_swedish_nopad_ai_ci | utf8mb4            | 2372 |            |
|       | uca1400_swedish_nopad_as_ci | utf8mb4_uca1400_swedish_nopad_as_ci | utf8mb4            | 2374 |            |
|       +-----------------------------+-------------------------------------+--------------------+------+------------+
|
|   - Other INFORMATION_SCHEMA queries:
|
|       SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS;
|       SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.PARAMETERS;
|       SELECT TABLE_COLLATION FROM INFORMATION_SCHEMA.TABLES;
|       SELECT DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA;
|       SELECT COLLATION_NAME FROM INFORMATION_SCHEMA.ROUTINES;
|       SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.EVENTS;
|       SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.EVENTS;
|       SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.ROUTINES;
|       SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.ROUTINES;
|       SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.TRIGGERS;
|       SELECT DATABASE_COLLATION FROM INFORMATION_SCHEMA.TRIGGERS;
|       SELECT COLLATION_CONNECTION FROM INFORMATION_SCHEMA.VIEWS;
|
|     display full collation names, including the character set prefix,
|     for all collations, including new collations.
|     Corresponding SHOW commands also display full collation names
|     in collation related columns:
|
|       SHOW CREATE TABLE t1;
|       SHOW CREATE DATABASE db1;
|       SHOW TABLE STATUS;
|       SHOW CREATE FUNCTION f1;
|       SHOW CREATE PROCEDURE p1;
|       SHOW CREATE EVENT ev1;
|       SHOW CREATE TRIGGER tr1;
|       SHOW CREATE VIEW;
|
|     These INFORMATION_SCHEMA queries and SHOW statements may change in
|     the future, to display short collation names.
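The contextual name resolution described in this commit message can be illustrated with a small C++ sketch. It is only an illustration under stated assumptions: the function name resolve_collation_name and the rule "a short name starts with uca1400_" are hypothetical and are not taken from the MariaDB sources.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: turn a collation name typed in a COLLATE clause into
// a full collation name, given the character set taken from the context.
// Assumption: a short (contextually typed) name is one that starts with
// "uca1400_"; anything else is treated as an already complete name.
std::string resolve_collation_name(const std::string &context_charset,
                                   const std::string &typed_name)
{
  assert(!context_charset.empty());
  static const std::string short_prefix= "uca1400_";
  if (typed_name.compare(0, short_prefix.size(), short_prefix) == 0)
    return context_charset + "_" + typed_name;
  return typed_name;
}
```

With a context character set of utf8mb4, the short name uca1400_as_ci resolves to utf8mb4_uca1400_as_ci, which matches the CREATE TABLE example in the message; an old style name such as latin1_swedish_ci is returned unchanged.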
* MDEV-29154 Excessive warnings upon a call to RANDOM_BYTES [Daniel Black, 2022-07-31, 1, -0/+1]
|   Reduce the 5 warnings of
|
|     select random_bytes(cast('x' as unsigned)+1);
|
|   to two: one from Item_func_random_bytes::fix_length_and_dec and one
|   from Item_func_random_bytes::val_str.
|
|   The warnings are from args[0]->val_int().
* MDEV-29029 RANDOM_BYTES cannot be virtual column [Daniel Black, 2022-07-31, 1, -0/+5]
|
* MDEV-25704 Add RANDOM_BYTES function [Vanislavsky, 2022-07-31, 1, -0/+20]
|   MySQL 5.6 added the RANDOM_BYTES function.
|   https://dev.mysql.com/doc/refman/5.6/en/encryption-functions.html#function_random-bytes
|
|   This is needed for compatibility purposes.
* MDEV-27104 deprecate DES_ENCRYPT/DECRYPT functions [Sergei Golubchik, 2022-07-28, 1, -16/+2]
|
* MDEV-23479: Add a THD* argument to Item_func_or_sum::fix_length_and_dec() [Rucha Deodhar, 2022-03-30, 1, -78/+78]
|   Fix: Added a THD *thd argument to Item_func_or_sum::fix_length_and_dec()
|   and to fix_length_and_dec() in all derived classes of Item_func_or_sum.
* Merge branch '10.7' into 10.8 [Oleksandr Byelkin, 2022-02-04, 1, -15/+14]
|\
| * Merge branch '10.6' into 10.7 [Oleksandr Byelkin, 2022-02-04, 1, -16/+15]
| |\
| | * Merge branch '10.5' into 10.6 [Oleksandr Byelkin, 2022-02-03, 1, -16/+15]
| | |\
| | | * Merge branch '10.4' into 10.5 [Oleksandr Byelkin, 2022-02-01, 1, -14/+14]
| | | |\
| | | | * Merge branch '10.3' into 10.4 [Oleksandr Byelkin, 2022-01-30, 1, -1/+1]
| | | | |\
| | | | | * Merge branch '10.2' into 10.3 (tag: mariadb-10.3.33) [Oleksandr Byelkin, 2022-01-29, 1, -1/+1]
| | | | | |\
| | | | | | * MDEV-27544 database() function should return 64 characters [Daniel Black, 2022-01-20, 1, -1/+1]
| | | | | | |   Database names are 64 utf8 characters per the system tables
| | | | | | |   that refer to them.
| | | | | | |
| | | | | | |   The current database() function is returning 34 characters.
| | | | | | |
| | | | | | |   The effect of limiting this function's result to a maximum
| | | | | | |   length of 34 became apparent when it was used in a UNION ALL,
| | | | | | |   where the results are truncated to 34 characters.
| | | | | | |
| | | | | | |   For (uninvestigated) reasons, SELECT DATABASE() on its own
| | | | | | |   would always return the right number of characters.
| | | | | | |
| | | | | | |   Thanks Alexander Barkov for the review.
| | | | | | |
| | | | | | |   Thanks dave for noticing the bug in the stackexchange post
| | | | | | |   https://dba.stackexchange.com/questions/306183/why-is-my-database-name-truncated
| | | | * | | MDEV-26953 Assertion `!str || str != Ptr || !is_alloced()' failed in String::copy upon SELECT with sjis [Alexander Barkov, 2022-01-27, 1, -13/+13]
| | | | | | |   Item::save_str_in_field() passes &Item::str_value as a parameter
| | | | | | |   to val_str().
| | | | | | |
| | | | | | |   Item_func::make_empty_result() also fills and returns str_value.
| | | | | | |
| | | | | | |   As a result, in the reported scenario in
| | | | | | |   Item_func::val_str_from_val_str_ascii() both "str" and "res"
| | | | | | |   pointed to Item::str_value, which made the DBUG_ASSERT inside
| | | | | | |   String::copy() (preventing copying to itself) crash:
| | | | | | |
| | | | | | |     if ((null_value= str->copy(res->ptr(), res->length(),
| | | | | | |                                &my_charset_latin1, collation.collation,
| | | | | | |                                &errors)))
| | | | | | |
| | | | | | |   Fix:
| | | | | | |   - Adding a String* parameter to make_empty_result()
| | | | | | |   - Passing the val_str() parameter to make_empty_result().
| | | * | | | MDEV-27018 IF and COALESCE lose "json" property (branch: bb-10.5-bar-MDEV-27018) [Alexander Barkov, 2022-01-21, 1, -2/+1]
| | | | | | |   Hybrid functions (IF, COALESCE, etc) did not preserve the JSON
| | | | | | |   property from their arguments. The same problem was repeatable
| | | | | | |   for single row subselects.
| | | | | | |
| | | | | | |   The problem happened because the method Item::is_json_type()
| | | | | | |   was inconsistently implemented across the Item hierarchy.
| | | | | | |   For example, Item_hybrid_func and Item_singlerow_subselect
| | | | | | |   did not override is_json_type().
| | | | | | |
| | | | | | |   Solution:
| | | | | | |
| | | | | | |   - Removing Item::is_json_type()
| | | | | | |
| | | | | | |   - Implementing specific JSON type handlers:
| | | | | | |       Type_handler_string_json
| | | | | | |       Type_handler_varchar_json
| | | | | | |       Type_handler_tiny_blob_json
| | | | | | |       Type_handler_blob_json
| | | | | | |       Type_handler_medium_blob_json
| | | | | | |       Type_handler_long_blob_json
| | | | | | |
| | | | | | |   - Reusing the existing data type infrastructure to pass JSON
| | | | | | |     type handlers across all item types, including classes
| | | | | | |     Item_hybrid_func and Item_singlerow_subselect.
| | | | | | |     Note, these two classes themselves do not need any changes!
| | | | | | |
| | | | | | |   - Extending the data type infrastructure so data types can
| | | | | | |     inherit their properties (e.g. aggregation rules) from their
| | | | | | |     base data types. E.g. VARCHAR/JSON acts as VARCHAR,
| | | | | | |     LONGTEXT/JSON acts as LONGTEXT when mixed to a non-JSON data
| | | | | | |     type. This is done by:
| | | | | | |       - adding virtual method Type_handler::type_handler_base()
| | | | | | |       - adding a helper class Type_handler_pair
| | | | | | |       - refactoring Type_handler_hybrid_field_type methods
| | | | | | |         aggregate_for_result(), aggregate_for_min_max(),
| | | | | | |         aggregate_for_num_op() to use Type_handler_pair.
| | | | | | |
| | | | | | |   This change also fixes:
| | | | | | |
| | | | | | |     MDEV-27361 Hybrid functions with JSON arguments do not send
| | | | | | |     format metadata
| | | | | | |
| | | | | | |   Also, adding mtr tests for JSON replication. It was not covered
| | | | | | |   yet. And the current patch changes the replication code slightly.
* | | | | | | MDEV-27208: Extend CRC32() and implement CRC32C() [Marko Mäkelä, 2022-01-21, 1, -5/+17]
|/ / / / / /
| | | | | |   We used to define a native unary function CRC32() that computes
| | | | | |   the CRC-32 of a string using the ISO 3309 polynomial that is
| | | | | |   being used by zlib and many others.
| | | | | |
| | | | | |   Often, a CRC is computed in pieces. To facilitate this, we
| | | | | |   introduce a 2-ary variant of the function that inputs a previous
| | | | | |   CRC as the first argument:
| | | | | |   CRC32('MariaDB')=CRC32(CRC32('Maria'),'DB').
| | | | | |
| | | | | |   InnoDB and MyRocks use a different polynomial, which was
| | | | | |   implemented in SSE4.2 instructions that were introduced in the
| | | | | |   Intel Nehalem microarchitecture. This is commonly called CRC-32C
| | | | | |   (Castagnoli).
| | | | | |
| | | | | |   We introduce a native function that uses the Castagnoli polynomial:
| | | | | |   CRC32C('MariaDB')=CRC32C(CRC32C('Maria'),'DB').
| | | | | |
| | | | | |   This allows SELECT...INTO DUMPFILE to be used for the creation of
| | | | | |   files with valid checksums, such as a logically empty InnoDB redo
| | | | | |   log file ib_logfile0 corresponding to a particular log sequence
| | | | | |   number.
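The chaining property stated in this commit message can be demonstrated with a small, table-free C++ sketch. This is not the server's implementation (which, per the message, can use hardware instructions); the names crc_update, POLY_CRC32, POLY_CRC32C and check_chaining are hypothetical, and the sketch only assumes the usual reflected CRC-32 conventions (initial value and final inversion of all bits).

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Reflected forms of the two polynomials mentioned in the commit message.
const uint32_t POLY_CRC32=  0xEDB88320U;  // ISO 3309, as used by zlib
const uint32_t POLY_CRC32C= 0x82F63B78U;  // Castagnoli

// Bitwise (slow) CRC update. "crc" is the CRC of the data seen so far;
// pass 0 for the first piece.
uint32_t crc_update(uint32_t crc, const std::string &data, uint32_t poly)
{
  crc= ~crc;                       // undo the final inversion of the last call
  for (unsigned char c : data)
  {
    crc^= c;
    for (int bit= 0; bit < 8; bit++)
      crc= (crc >> 1) ^ (poly & (0U - (crc & 1U)));
  }
  return ~crc;                     // final inversion
}

// The property from the commit message, written as a check:
// CRC('MariaDB') == CRC(CRC('Maria'), 'DB') for both polynomials.
void check_chaining()
{
  assert(crc_update(0, "MariaDB", POLY_CRC32) ==
         crc_update(crc_update(0, "Maria", POLY_CRC32), "DB", POLY_CRC32));
  assert(crc_update(0, "MariaDB", POLY_CRC32C) ==
         crc_update(crc_update(0, "Maria", POLY_CRC32C), "DB", POLY_CRC32C));
}
```

Because the previous result is un-inverted on entry, feeding a string in pieces gives the same value as feeding it at once, which is what makes the 2-ary SQL form useful for checksumming data that is produced incrementally.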
* | | | | | MDEV-4958 Adding datatype UUID [Alexander Barkov, 2021-10-29, 1, -34/+0]
| | | | | |
* | | | | | cleanup: uuid [Sergei Golubchik, 2021-10-29, 1, -2/+2]
| | | | | |
* | | | | | MDEV-4742 - remove leading zero handling, and cleanups. [Vladislav Vaintroub, 2021-10-14, 1, -0/+2]
| | | | | |   Leading zeros added a single byte overhead per numeric string,
| | | | | |   even when they were. Sorting leading zeros offers only little
| | | | | |   value (except determinism in sort).
| | | | | |
| | | | | |   I decided to drop it for now; we can be like ICU, which drops
| | | | | |   leading zeros in numeric sorting, even with IDENTICAL collation
| | | | | |   strength.
| | | | | |
| | | | | |   Also, disabled virtual stored columns (thus also indexes), on
| | | | | |   Serg's request. Hopefully this is temporary, and they will be
| | | | | |   re-enabled soon, when everyone is as happy with the key
| | | | | |   generation algorithm as I am.
* | | | | | MDEV-4742 - address review comments. [Vladislav Vaintroub, 2021-10-14, 1, -9/+1]
| | | | | |   - Remove second optional parameter to natural_sort_key(), and
| | | | | |     all fraction handling.
| | | | | |
| | | | | |   - Rename natsort_num2str() to natsort_encode_length() to show
| | | | | |     the intention that it encodes string *lengths*, and does not
| | | | | |     encode whitespaces and what not.
| | | | | |     Handles lengths for which log10(len) >= 10, even if they do
| | | | | |     not happen for MariaDB Strings (where length is limited by
| | | | | |     32bit, and log10(len) is <= 9)
| | | | | |
| | | | | |   - Do not let natural sort key grow past max_packet_length.
| | | | | |
| | | | | |   - Split Item_func_natural_sort_key::val_str() further and add
| | | | | |     natsort_encode_numeric_string(), which contains a comment on
| | | | | |     how whitespaces are handled.
| | | | | |
| | | | | |   - Simplify, and speed up to_natsort_key() in the common case, by
| | | | | |     removing handling of weird charsets utf16/32, that encode
| | | | | |     numbers in several bytes. In the rare cases where utf16/32 is
| | | | | |     used, we'll convert to utf8 prior to creating keys, and back
| | | | | |     to the original charset afterwards.
* | | | | | MDEV-4742 - provide function to sort string in "natural" order [Vladislav Vaintroub, 2021-10-14, 1, -0/+25]
| | | | | |   The numbers should be compared as numbers, while the rest should
| | | | | |   be compared as string.
| | | | | |
| | | | | |   Introduce natural_sort_key() function that transforms original
| | | | | |   string so that the lexicographic order of such keys is suitable
| | | | | |   for natural sort.
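The idea of a key whose plain lexicographic order gives natural order can be sketched in C++ as follows. This is a simplified illustration and not the encoding used by the server: the function name natural_key_sketch is hypothetical, the length prefix is a single character (so digit runs are assumed to have at most 9 significant digits), and leading zeros are dropped.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Every run of digits is rewritten as
//   <number of significant digits><digits without leading zeros>
// so that a byte-wise comparison puts numbers with fewer digits first and
// compares numbers of equal length digit by digit.
std::string natural_key_sketch(const std::string &s)
{
  std::string key;
  std::size_t i= 0;
  while (i < s.size())
  {
    if (!std::isdigit(static_cast<unsigned char>(s[i])))
    {
      key+= s[i++];                       // non-digits are copied as is
      continue;
    }
    std::size_t end= i;
    while (end < s.size() && std::isdigit(static_cast<unsigned char>(s[end])))
      end++;
    while (i + 1 < end && s[i] == '0')    // drop leading zeros, keep one digit
      i++;
    std::size_t len= end - i;
    assert(len <= 9);                     // single-character length prefix
    key+= static_cast<char>('0' + len);
    key.append(s, i, len);
    i= end;
  }
  return key;
}
```

For example, "a2" becomes "a12" and "a10" becomes "a210", so the keys compare in numeric order even though the original strings do not.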
* | | | | | cannot allocate a new String[] in the ::val_str() method [Sergei Golubchik, 2021-10-12, 1, -2/+3]
| | | | | |   String inherits from Sql_alloc, so it's allocated on the thd's
| | | | | |   memroot, this cannot be done per row.
| | | | | |
| | | | | |   Moved String[] allocation into the Item_func_sformat constructor
| | | | | |   (not fix_fields(), because we want it on the same memroot where
| | | | | |   the item is).
* | | | | | MDEV-25015 Custom formatting of strings in MariaDB queries [Alan Cueva, 2021-10-12, 1, -0/+16]
|/ / / / /
| | | | |   SFORMAT() SQL function that uses fmtlib (https://fmt.dev/) for
| | | | |   python-like (also Rust, C++20, etc) string formatting
| | | | |
| | | | |   Only fmtlib 7.0.0+ is supported, older fmtlib produces different
| | | | |   results in the test.
| | | | |
| | | | |   No native support for temporal and decimal values,
| | | | |   * TIME_RESULT is handled as STRING_RESULT
| | | | |   * DECIMAL_RESULT as REAL_RESULT
* | | | | MDEV-24285 support oracle build-in function: sys_guid [Monty, 2021-05-19, 1, -4/+10]
| | | | |   SYS_GUID() returns the same as UUID(), but without any '-'
| | | | |
| | | | |   author: woqutech
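The relation between the two text forms can be shown with a short C++ sketch. It is an illustration only: the helper name strip_uuid_dashes is hypothetical, and the server may well generate the value without the separators directly rather than removing them afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Derive the SYS_GUID()-style text form from a UUID()-style text value by
// dropping the '-' separators (36 characters in, 32 characters out).
std::string strip_uuid_dashes(std::string uuid)
{
  assert(uuid.size() == 36);
  uuid.erase(std::remove(uuid.begin(), uuid.end(), '-'), uuid.end());
  return uuid;
}
```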
* | | | | Don't reset StringBuffers in loops when not needed [Monty, 2021-05-19, 1, -0/+1]
| | | | |   - Moved out creating StringBuffers in loops and instead create
| | | | |     them outside and just reset the buffer if it was not allocated
| | | | |     (to avoid a possible malloc/free for every entry)
| | | | |
| | | | |   Other things related to set_buffer_if_not_allocated()
| | | | |   - Changed Valuebuffer to not call set_buffer_if_not_allocated()
| | | | |     when it is created.
| | | | |   - Fixed geometry functions to reset string length before calling
| | | | |     String::reserve(). This is because one should not access
| | | | |     length() of an undefined.
| | | | |   - Added Item_func_conv_charset::save_in_field() as the item is
| | | | |     using str_value to store cached values, which conflicts with
| | | | |     Item::save_str_in_field().
| | | | |   - Changed Item_proc_string to not store the string value in
| | | | |     sql_string as this clashes with Item::save_str_in_field().
| | | | |   - Locally store value of full_name_cstring() in
| | | | |     analyse::end_of_records() as Item::save_str_in_field() may
| | | | |     overwrite it.
| | | | |   - Marked some strings as set_thread_specific()
| | | | |   - Added String::free_buffer() to be used internally in String
| | | | |     functions to just free the buffer but not reset other String
| | | | |     values.
| | | | |   - Fixed uses_buffer_owned_by() to check for allocated length
| | | | |     instead of strlength, which could be marked MEM_UNDEFINED().
* | | | | cleanup: Item::can_eval_in_optimize() [Sergei Golubchik, 2021-05-19, 1, -1/+1]
| | | | |   a helper method to check whether an item can be evaluated in the
| | | | |   query optimization phase (in and below JOIN::optimize()).
* | | | | Added override to all releveant methods in Item (and a few other classes) [Monty, 2021-05-19, 1, -302/+313]
| | | | |   Other things:
| | | | |   - Remove inline and virtual for methods that are overrides
| | | | |   - Added a 'final' to some Item classes
* | | | | Reduce usage of strlen() [Monty, 2021-05-19, 1, -94/+440]
| | | | |   Changes:
| | | | |   - To detect automatic strlen() I removed the methods in String
| | | | |     that use 'const char *' without a length:
| | | | |     - String::append(const char*)
| | | | |     - Binary_string(const char *str)
| | | | |     - String(const char *str, CHARSET_INFO *cs)
| | | | |     - append_for_single_quote(const char *)
| | | | |     All usage of append(const char*) is changed to either use
| | | | |     String::append(char), String::append(const char*, size_t length)
| | | | |     or String::append(LEX_CSTRING)
| | | | |   - Added STRING_WITH_LEN() around constant string arguments to
| | | | |     String::append()
| | | | |   - Added overflow argument to escape_string_for_mysql() and
| | | | |     escape_quotes_for_mysql() instead of returning (size_t) -1 on
| | | | |     overflow. This was needed as most usage of the above functions
| | | | |     never tested the result for -1 and would have given wrong
| | | | |     results or crashes in case of overflows.
| | | | |   - Added Item_func_or_sum::func_name_cstring(), which returns
| | | | |     LEX_CSTRING. Changed all Item_func::func_name()'s to
| | | | |     func_name_cstring()'s. The old Item_func_or_sum::func_name() is
| | | | |     now an inline function that returns func_name_cstring().str.
| | | | |   - Changed Item::mode_name() and Item::func_name_ext() to return
| | | | |     LEX_CSTRING.
| | | | |   - Changed for some functions the name argument from const char *
| | | | |     to const LEX_CSTRING &:
| | | | |     - Item::Item_func_fix_attributes()
| | | | |     - Item::check_type_...()
| | | | |     - Type_std_attributes::agg_item_collations()
| | | | |     - Type_std_attributes::agg_item_set_converter()
| | | | |     - Type_std_attributes::agg_arg_charsets...()
| | | | |     - Type_handler_hybrid_field_type::aggregate_for_result()
| | | | |     - Type_handler_geometry::check_type_geom_or_binary()
| | | | |     - Type_handler::Item_func_or_sum_illegal_param()
| | | | |     - Predicant_to_list_comparator::add_value_skip_null()
| | | | |     - Predicant_to_list_comparator::add_value()
| | | | |     - cmp_item_row::prepare_comparators()
| | | | |     - cmp_item_row::aggregate_row_elements_for_comparison()
| | | | |     - Cursor_ref::print_func()
| | | | |   - Removed String_space() as it was only used in one case and
| | | | |     that could be simplified to not use String_space(), thanks to
| | | | |     the fixed my_vsnprintf().
| | | | |   - Added some const LEX_CSTRING's for common strings:
| | | | |     - NULL_clex_str, DATA_clex_str, INDEX_clex_str.
| | | | |   - Changed primary_key_name to a LEX_CSTRING
| | | | |   - Renamed String::set_quick() to
| | | | |     String::set_buffer_if_not_allocated() to clarify what the
| | | | |     function really does.
| | | | |   - Rename of protocol function:
| | | | |       bool store(const char *from, CHARSET_INFO *cs) to
| | | | |       bool store_string_or_null(const char *from, CHARSET_INFO *cs).
| | | | |     This was done both to clarify the difference between this
| | | | |     'store' function and the others, and to make it easier to find
| | | | |     suboptimal usage of store() calls.
| | | | |   - Added Protocol::store(const LEX_CSTRING*, CHARSET_INFO*)
| | | | |   - Changed some 'const char*' arrays to instead be of type
| | | | |     LEX_CSTRING.
| | | | |   - class Item_func_units now uses LEX_CSTRING for name.
| | | | |
| | | | |   Other things:
| | | | |   - Fixed a bug in mysql.cc:construct_prompt() where a wrong escape
| | | | |     character in the prompt would cause some part of the prompt to
| | | | |     be duplicated.
| | | | |   - Fixed a lot of instances where the length of the argument to
| | | | |     append is known or easily obtained but was not used.
| | | | |   - Removed some unneeded 'virtual' definitions for functions that
| | | | |     were inherited from the parent. I added override to these.
| | | | |   - Fixed Ordered_key::print() to preallocate the needed buffer.
| | | | |     Old code could cause memory overruns.
| | | | |   - Simplified some loops when adding char * to a String with
| | | | |     delimiters.
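The pattern behind this change, passing a pointer together with a known length instead of letting the callee call strlen(), can be sketched in C++ as below. The types and the macro are simplified stand-ins written for this illustration (Lex_cstring_sketch, STRING_WITH_LEN_SKETCH, Buffer_sketch and build_example are hypothetical names), not copies of the server's LEX_CSTRING, STRING_WITH_LEN() or String.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Stand-in for a (pointer, length) string reference.
struct Lex_cstring_sketch
{
  const char *str;
  std::size_t length;
};

// Expands a string literal to "literal, length"; sizeof() is evaluated at
// compile time, so no strlen() is needed at run time.
#define STRING_WITH_LEN_SKETCH(X) (X), (sizeof(X) - 1)

// A tiny buffer whose append() always takes an explicit length. There is
// intentionally no append(const char *) overload, so a call site cannot
// trigger a hidden strlen().
class Buffer_sketch
{
  std::string data;
public:
  void append(const char *s, std::size_t length) { data.append(s, length); }
  void append(const Lex_cstring_sketch &s) { append(s.str, s.length); }
  void append(char c) { data.push_back(c); }
  const std::string &value() const { return data; }
};

std::string build_example()
{
  static const Lex_cstring_sketch name=
    { STRING_WITH_LEN_SKETCH("natural_sort_key") };
  // The compile-time length agrees with what strlen() would compute.
  assert(name.length == std::strlen(name.str));

  Buffer_sketch buf;
  buf.append(name);
  buf.append('(');
  buf.append(STRING_WITH_LEN_SKETCH("'a10'"));
  buf.append(')');
  return buf.value();
}
```

Removing the length-less overload is what turns a forgotten length into a compile error, which is the detection technique the first bullet of the commit message describes.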
* | | | | Split item->flags into base_flags and with_flags [Monty, 2021-05-19, 1, -24/+24]
| | | | |   This was done to simplify copying of with_* flags
| | | | |
| | | | |   Other things:
| | | | |   - Changed Flags to C++ enums, which enables gdb to print out bit
| | | | |     values for the flags. This also enables compiler errors if one
| | | | |     tries to manipulate a non existing bit in a variable.
| | | | |   - Added set_maybe_null() as a shortcut as setting the MAYBE_NULL
| | | | |     flags was used in a LOT of places.
| | | | |   - Renamed PARAM flag to SP_VAR to ensure it's not confused with
| | | | |     persistent statement parameters.
* | | | | Change bitfields in Item to an uint16 [Michael Widenius, 2021-05-19, 1, -28/+48]
| | | | |   The reason for the change is that neither clang nor gcc can
| | | | |   generate efficient code when several bit fields are changed at
| | | | |   the same time or when copying one or more bits between identical
| | | | |   bit fields.
| | | | |   Updating bits explicitly with & and | is MUCH more efficient
| | | | |   than what current compilers can do.
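The technique described in these two commits, keeping flags in one 16-bit integer and updating them with & and |, can be sketched in C++ as follows. The flag names and the Item_sketch type are hypothetical and chosen only for the illustration; the real Item flags and their layout differ.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits stored in a single uint16_t.
enum item_flag_sketch : uint16_t
{
  FLAG_MAYBE_NULL= 1 << 0,
  FLAG_FIXED=      1 << 1,
  FLAG_WITH_SUM=   1 << 2,
  FLAG_WITH_FIELD= 1 << 3
};

struct Item_sketch
{
  uint16_t flags= 0;

  void set(uint16_t mask)   { flags= static_cast<uint16_t>(flags | mask); }
  void clear(uint16_t mask) { flags= static_cast<uint16_t>(flags & ~mask); }
  bool has(uint16_t mask) const { return (flags & mask) == mask; }

  // Copy several flags from another item with one mask operation, instead
  // of one read-modify-write per bit field.
  void copy_flags(const Item_sketch &from, uint16_t mask)
  {
    flags= static_cast<uint16_t>((flags & ~mask) | (from.flags & mask));
  }
};

void flags_example()
{
  Item_sketch a, b;
  a.set(FLAG_WITH_SUM | FLAG_WITH_FIELD);
  b.set(FLAG_FIXED | FLAG_MAYBE_NULL);
  b.copy_flags(a, FLAG_WITH_SUM | FLAG_WITH_FIELD);
  assert(b.has(FLAG_FIXED | FLAG_WITH_SUM | FLAG_WITH_FIELD));
  b.clear(FLAG_MAYBE_NULL);
  assert(!b.has(FLAG_MAYBE_NULL));
}
```

Grouping the bits that are usually copied together under one mask is also a plausible reading of why the earlier commit split the flags into base_flags and with_flags, though the message itself only says it simplifies copying.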
* | | | | Renamed 'flags' variables in Item_class [Michael Widenius, 2021-05-19, 1, -3/+3]
| | | | |   This is a preparation for adding a flags variable to Item class
* | | | | Improved storage size for Item, Field and some other classes [Monty, 2021-05-19, 1, -1/+2]
|/ / / /
| | | |   - Changed order of class fields to remove dead alignment space.
| | | |   - Changed bool fields in Item to bit fields.
| | | |   - Used packed enums for some fields in common classes
| | | |   - Removed not used Item::rsize.
| | | |   - Changed some class variables from uint/int to smaller integer
| | | |     types.
| | | |   - Ensured that field_index is uint16 in all classes and functions.
| | | |     Fixed also that we properly compare with NO_CACHED_FIELD_INDEX
| | | |     when checking if variable is not set.
| | | |   - Removed checking of highest bit of unireg_check (has not been
| | | |     used in a long time)
| | | |   - Fixed wrong arguments to make_cond_for_table() for
| | | |     join_tab_idx_arg from false to 0.
| | | |
| | | |   One of the results was reducing the size of class Item by
| | | |   ~24 bytes
* | | | Merge 10.4 into 10.5 [Marko Mäkelä, 2021-04-21, 1, -2/+7]
|\ \ \ \
| |/ / /
| * | | Fix all warnings given by UBSAN [Monty, 2021-04-20, 1, -2/+7]
| | | |   The easiest way to compile and test the server with UBSAN is to
| | | |   run:
| | | |     ./BUILD/compile-pentium64-ubsan
| | | |   and then run mysql-test-run.
| | | |   After this commit, one should be able to run this without any
| | | |   UBSAN warnings. There are still a few compiler warnings that
| | | |   should be fixed at some point, but these do not expose any real
| | | |   bugs.
| | | |
| | | |   The 'special' cases where we disable, suppress or circumvent
| | | |   UBSAN are:
| | | |   - ref10 source (as here we intentionally do some shifts that
| | | |     UBSAN complains about).
| | | |   - x86 version of optimized int#korr() methods. UBSAN does not
| | | |     like unaligned memory access of integers. Fixed by using
| | | |     byte_order_generic.h when compiling with UBSAN
| | | |   - We use smaller thread stack with ASAN and UBSAN, which forced
| | | |     me to disable a few tests that print the thread stack size.
| | | |   - Verifying class types does not work for shared libraries. I
| | | |     added suppression in mysql-test-run.pl for this case.
| | | |   - Added '#ifdef WITH_UBSAN' when using integer arithmetic where
| | | |     it is safe to have overflows (two cases, in item_func.cc).
| | | |
| | | |   Things fixed:
| | | |   - Don't left shift signed values
| | | |     (byte_order_generic.h, mysqltest.c, item_sum.cc and many more)
| | | |   - Don't assign non-existing values to enum variables.
| | | |   - Ensure that bool and enum values are properly initialized in
| | | |     constructors. This was needed as UBSAN checks that these types
| | | |     have correct values when one copies an object.
| | | |     (gcalc_tools.h, ha_partition.cc, item_sum.cc,
| | | |     partition_element.h ...)
| | | |   - Ensure we do not call handler functions on unallocated objects
| | | |     or deleted objects.
| | | |     (events.cc, sql_acl.cc).
| | | |   - Fixed bugs in Item_sp::Item_sp() where we did not call the
| | | |     constructor on the Query_arena object.
| | | |   - Fixed several casts of objects to an incompatible class!
| | | |     (Item.cc, Item_buff.cc, item_timefunc.cc, opt_subselect.cc,
| | | |     sql_acl.cc, sql_select.cc ...)
| | | |   - Ensure we do not do integer arithmetic that causes over or
| | | |     underflows. This includes also ++ and -- of integers.
| | | |     (Item_func.cc, Item_strfunc.cc, item_timefunc.cc,
| | | |     sql_base.cc ...)
| | | |   - Added JSON_VALUE_UNITIALIZED to json_value_types and ensure
| | | |     that value_type is initialized to this instead of to -1, which
| | | |     is not a valid enum value for json_value_types.
| | | |   - Ensure we do not call memcpy() when second argument could be
| | | |     null.
| | | |   - Fixed that Item_func_str::make_empty_result() creates an empty
| | | |     string instead of a null string (safer as it ensures we do not
| | | |     do arithmetic on null strings).
| | | |
| | | |   Other things:
| | | |   - Changed struct st_position to an OBJECT and added an
| | | |     initialization function to it to ensure that we do not copy or
| | | |     use uninitialized members. The change to a class was also
| | | |     motivated by the fact that we used "struct st_position" and
| | | |     POSITION randomly through the code, which was confusing.
| | | |   - Notably big rewrite in sql_acl.cc to avoid using deleted
| | | |     objects.
| | | |   - Changed in sql_partition to use '^' instead of '-'. This is
| | | |     safe as the operator is either 0 or 0x8000000000000000ULL.
| | | |   - Added check for select_nr < INT_MAX in JOIN::build_explain() to
| | | |     avoid bug when get_select() could return NULL.
| | | |   - Reordered elements in POSITION for better alignment.
| | | |   - Changed sql_test.cc::print_plan() to use pointers instead of
| | | |     objects.
| | | |   - Fixed bug in find_set() where one could execute '1 << -1'.
| | | |   - Added variable have_sanitizer, used by mtr. (This variable was
| | | |     before only in 10.5 and up). It can now have one of two values:
| | | |     ASAN or UBSAN.
| | | |   - Moved ~Archive_share() from ha_archive.cc to ha_archive.h and
| | | |     marked it virtual. This was an effort to get UBSAN to work with
| | | |     loaded storage engines. I kept the change as the new place is
| | | |     better.
| | | |   - Added in CONNECT engine COLBLK::SetName(), to get around a
| | | |     wrong cast in tabutil.cpp.
| | | |   - Added HAVE_REPLICATION around usage of rgi_slave, to get
| | | |     embedded server to compile with UBSAN. (Patch from Marko).
| | | |   - Added #ifdef for powerpc64 to avoid a bug in old gcc versions
| | | |     related to integer arithmetic.
| | | |
| | | |   Changes that should not be needed but had to be done to suppress
| | | |   warnings from UBSAN:
| | | |   - Added static_cast<uint16_t> around shift to get rid of a LOT of
| | | |     compiler warnings when using UBSAN.
| | | |   - Had to change some '/' of 2 base integers to shift to get rid
| | | |     of some compile time warnings.
| | | |
| | | |   Reviewed by:
| | | |   - Json changes: Alexey Botchkov
| | | |   - Charset changes in ctype-uca.c: Alexander Barkov
| | | |   - InnoDB changes & Embedded server: Marko Mäkelä
| | | |   - sql_acl.cc changes: Vicențiu Ciorbaru
| | | |   - build_explain() changes: Sergey Petrunia
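Two of the fixes listed in this commit message, avoiding left shifts of signed values and using '^' instead of '-' on the top bit, can be illustrated with a short C++ sketch. The function names shift_left_sketch and flip_sign_bit are hypothetical and the code is not taken from the server; it only shows the general pattern under the assumption of 64-bit two's complement integers.

```cpp
#include <cassert>
#include <cstdint>

// Left shifting a negative signed value is undefined behaviour before
// C++20, so do the shift in the unsigned domain and convert back.
// Assumption: the conversion back to int64_t wraps modulo 2^64, which is
// what common compilers do (and what C++20 guarantees).
int64_t shift_left_sketch(int64_t value, unsigned bits)
{
  assert(bits < 64);
  return static_cast<int64_t>(static_cast<uint64_t>(value) << bits);
}

// Flip the top bit with '^'. For a constant that is exactly the top bit,
// XOR gives the same bit pattern as subtracting it modulo 2^64, but it can
// never be reported as an arithmetic overflow.
uint64_t flip_sign_bit(uint64_t value)
{
  return value ^ 0x8000000000000000ULL;
}
```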
* | | | Merge 10.4 into 10.5 [Marko Mäkelä, 2020-09-23, 1, -1/+9]
|\ \ \ \
| |/ / /
| * | | Merge 10.3 into 10.4 [Marko Mäkelä, 2020-09-22, 1, -1/+9]
| |\ \ \
| | |/ /
| | * | Merge 10.2 into 10.3 [Marko Mäkelä, 2020-09-22, 1, -1/+9]
| | |\ \
| | | |/
| | | * Merge 10.1 into 10.2 [Marko Mäkelä, 2020-09-22, 1, -1/+9]
| | | |\
| | | | * MDEV-23535 SIGSEGV, SIGABRT and SIGILL in typeinfo for Item_func_set_collation (on optimized builds) [Alexander Barkov, 2020-09-03, 1, -1/+9]
| | | | |   This piece of the code in Item_func_or_sum::agg_item_set_converter:
| | | | |
| | | | |     if (!conv && ((*arg)->collation.repertoire == MY_REPERTOIRE_ASCII))
| | | | |       conv= new (thd->mem_root) Item_func_conv_charset(thd, *arg,
| | | | |                                                         coll.collation, 1);
| | | | |
| | | | |   was wrong because:
| | | | |
| | | | |   1. It could change Item_cache to Item_func_conv_charset
| | | | |      (with the old Item_cache in args[0]).
| | | | |      Such Item type change is not always supported, e.g. the code
| | | | |      in Item_singlerow_subselect::reset() expects only Item_cache,
| | | | |      to be able to call Item_cache::set_null(). So it erroneously
| | | | |      reinterpreted Item_func_conv_charset to Item_cache and called
| | | | |      a non-existing method set_null(), which crashed the server.
| | | | |
| | | | |   2. The 1 in the last parameter to Item_func_conv_charset() was
| | | | |      also a problem. In MariaDB versions where the reported query
| | | | |      did not crash, it erroneously returned "empty set" instead of
| | | | |      one row, because the 1 made subselects execute too early and
| | | | |      return NULL.
| | | | |
| | | | |   Fix:
| | | | |
| | | | |   1. Removing the above two lines from
| | | | |      Item_func_or_sum::agg_item_set_converter()
| | | | |
| | | | |   2. Adding the repertoire test inside the constructor of
| | | | |      Item_func_conv_charset, so it now detects itself as "safe" in
| | | | |      more cases than before. This is needed to avoid new "Illegal
| | | | |      mix of collations" errors after removing the wrong code in
| | | | |      various scenarios when character set conversion from pure
| | | | |      ASCII happens, including the reported scenario.
| | | | |
| | | | |   So now this sequence:
| | | | |
| | | | |     Item_cache -> Item_func_concat
| | | | |
| | | | |   is replaced with this compatible sequence (the top Item is still
| | | | |   Item_cache):
| | | | |
| | | | |     new Item_cache -> Item_func_conv_charset -> Item_func_concat
| | | | |
| | | | |   Before the fix it was replaced with this incompatible sequence:
| | | | |
| | | | |     Item_func_conv_charset -> old Item_cache -> Item_func_concat
* | | | | Merge branch '10.4' into 10.5 [Oleksandr Byelkin, 2020-03-11, 1, -1/+1]
|\ \ \ \ \
| |/ / / /
| * | | | MDEV-21841 CONV() function doesn't truncate its output to 21 when uses default charset. [Roman Nozdrin, 2020-02-29, 1, -1/+1]
| | | | |
* | | | | correct dbug function names [Sergei Golubchik, 2019-12-21, 1, -2/+2]
| | | | |
* | | | | MDEV-14024 PCRE2. [Alexey Botchkov, 2019-12-21, 1, -2/+0]
| | | | |   Related changes in the server code.
* | | | | MDEV-20890 Illegal mix of collations with UUID() [Alexander Barkov, 2019-10-24, 1, -2/+1]
| | | | |
* | | | | Merge 10.4 into 10.5 [Marko Mäkelä, 2019-09-06, 1, -0/+6]
|\ \ \ \ \
| |/ / / /
| * | | | Merge branch '10.3' into 10.4 [Sergei Golubchik, 2019-09-06, 1, -0/+6]
| |\ \ \ \
| | |/ / /
| | * | | Merge 10.2 (up to commit ef00ac4c86daf3294c46a45358da636763fb0049) into 10.3 [Alexander Barkov, 2019-09-04, 1, -0/+6]
| | |\ \ \
| | | |/ /
| | | * | MDEV-18156 Assertion `0' failed or `btr_validate_index(index, 0, false)' in row_upd_sec_index_entry or error code 126: Index is corrupted upon DELETE with PAD_CHAR_TO_FULL_LENGTH [Alexander Barkov, 2019-09-03, 1, -0/+6]
| | | | |   This change takes into account a column's GENERATED ALWAYS AS
| | | | |   expression dependency on sql_mode's PAD_CHAR_TO_FULL_LENGTH and
| | | | |   NO_UNSIGNED_SUBTRACTION flags.
| | | | |
| | | | |   Indexed virtual columns as well as persistent generated columns
| | | | |   are now not allowed to have such dependencies to avoid
| | | | |   inconsistent data or index files on sql_mode changes.
| | | | |   So an error is now returned in cases like this:
| | | | |
| | | | |     CREATE OR REPLACE TABLE t1
| | | | |     (
| | | | |       a CHAR(5),
| | | | |       v VARCHAR(5) AS (a) PERSISTENT -- CHAR->VARCHAR or CHAR->TEXT = ERROR
| | | | |     );
| | | | |
| | | | |   Functions RPAD() and RTRIM() can now remove dependency on
| | | | |   PAD_CHAR_TO_FULL_LENGTH. So this can be used instead:
| | | | |
| | | | |     CREATE OR REPLACE TABLE t1
| | | | |     (
| | | | |       a CHAR(5),
| | | | |       v VARCHAR(5) AS (RTRIM(a)) PERSISTENT
| | | | |     );
| | | | |
| | | | |   Note, unlike CHAR->VARCHAR and CHAR->TEXT this still works,
| | | | |   no RPAD(a) is needed:
| | | | |
| | | | |     CREATE OR REPLACE TABLE t1
| | | | |     (
| | | | |       a CHAR(5),
| | | | |       v CHAR(5) AS (a) PERSISTENT -- CHAR->CHAR is OK
| | | | |     );
| | | | |
| | | | |   More sql_mode flags may affect values of generated columns.
| | | | |   They will be addressed separately.
| | | | |
| | | | |   See comments in sql_mode.h for implementation details.
* | | | | MDEV-20052 Add a MEM_ROOT pointer argument to Type_handler::make_xxx_field() [Alexander Barkov, 2019-07-12, 1, -2/+2]
|/ / / /
* | | | Merge branch '10.3' into 10.4 [Oleksandr Byelkin, 2019-05-19, 1, -1/+1]
|\ \ \ \
| |/ / /