diff options
author | istruewing@chilla.local <> | 2007-03-16 10:28:48 +0100 |
---|---|---|
committer | istruewing@chilla.local <> | 2007-03-16 10:28:48 +0100 |
commit | c33486ff1b49a97772a90997f619b9c93db5b862 (patch) | |
tree | 88607a0b9f512b37c8ebf05e7d095b79ddcb3c32 /myisam/mi_range.c | |
parent | 629fed6c4d2dcb8e062de7bee959e38d434d5c67 (diff) | |
download | mariadb-git-c33486ff1b49a97772a90997f619b9c93db5b862.tar.gz |
Bug#26231 - select count(*) on myisam table returns wrong value
when index is used
When the table contained TEXT columns with empty contents
('', zero length, but not NULL) _and_ strings starting with
control characters like tabulator or newline, the empty values
were not found in a "records in range" estimate. Hence count(*)
missed these records.
The reason was a different set of search flags used for key
insert and key range estimation.
I decided to fix the set of flags used in range estimation.
Otherwise millions of databases around the world would require
a repair after an upgrade.
The consequence is that the manual must be fixed, which claims
that TEXT columns are compared with "end space padding". This
is true for CHAR/VARCHAR but wrong for TEXT. See also bug 21335.
Diffstat (limited to 'myisam/mi_range.c')
-rw-r--r-- | myisam/mi_range.c | 36 |
1 files changed, 35 insertions, 1 deletions
diff --git a/myisam/mi_range.c b/myisam/mi_range.c index 4248fa7f04b..e9bec50f971 100644 --- a/myisam/mi_range.c +++ b/myisam/mi_range.c @@ -145,8 +145,42 @@ static ha_rows _mi_record_pos(MI_INFO *info, const byte *key, uint key_len, if (!(nextflag & (SEARCH_FIND | SEARCH_NO_FIND | SEARCH_LAST))) key_len=USE_WHOLE_KEY; + /* + my_handler.c:mi_compare_text() has a flag 'skip_end_space'. + This is set in my_handler.c:ha_key_cmp() in dependence on the + compare flags 'nextflag' and the column type. + + TEXT columns are of type HA_KEYTYPE_VARTEXT. In this case the + condition is skip_end_space= ((nextflag & (SEARCH_FIND | + SEARCH_UPDATE)) == SEARCH_FIND). + + SEARCH_FIND is used for an exact key search. The combination + SEARCH_FIND | SEARCH_UPDATE is used in write/update/delete + operations with a comment like "Not real duplicates", whatever this + means. From the condition above we can see that 'skip_end_space' is + always false for these operations. The result is that trailing space + counts in key comparison and hence, emtpy strings ('', string length + zero, but not NULL) compare less that strings starting with control + characters and these in turn compare less than strings starting with + blanks. + + When estimating the number of records in a key range, we request an + exact search for the minimum key. This translates into a plain + SEARCH_FIND flag. Using this alone would lead to a 'skip_end_space' + compare. Empty strings would be expected above control characters. + Their keys would not be found because they are located below control + characters. + + This is the reason that we add the SEARCH_UPDATE flag here. It makes + the key estimation compare in the same way like key write operations + do. Olny so we will find the keys where they have been inserted. + + Adding the flag unconditionally does not hurt as it is used in the + above mentioned condition only. So it can safely be used together + with other flags. + */ pos=_mi_search_pos(info,keyinfo,key_buff,key_len, - nextflag | SEARCH_SAVE_BUFF, + nextflag | SEARCH_SAVE_BUFF | SEARCH_UPDATE, info->s->state.key_root[inx]); if (pos >= 0.0) { |