diff options
Diffstat (limited to 'Docs')
-rw-r--r-- | Docs/manual.texi | 171 |
1 files changed, 114 insertions, 57 deletions
diff --git a/Docs/manual.texi b/Docs/manual.texi index ffa42d2ba73..e5fa59f4f65 100644 --- a/Docs/manual.texi +++ b/Docs/manual.texi @@ -3691,7 +3691,7 @@ Allow users to change startup options without taking down the server. Fail safe replication. @item More functions for full-text search. -@xref{Fulltext Features in MySQL 4.0}. +@xref{Fulltext TODO}. @item New key cache @item @@ -18044,6 +18044,7 @@ differ somewhat: | ft_min_word_len | 4 | | ft_max_word_len | 254 | | ft_max_word_len_for_sort | 20 | +| ft_boolean_syntax | + -><()~* | | have_bdb | YES | | have_innodb | YES | | have_raid | YES | @@ -18250,6 +18251,9 @@ lot of words into index the slow way - but short words will be inserted very fast. It applies only to index recreation during @code{REPAIR}, @code{CREATE INDEX}, or @code{ALTER TABLE}. +@item @code{ft_boolean_syntax} +List of operators supported by @code{MATCH ... AGAINST(... IN BOOLEAN MODE)}. +@xref{Fulltext Search}. @item @code{have_innodb} @code{YES} if @code{mysqld} supports InnoDB tables. @code{DISABLED} @@ -29156,18 +29160,18 @@ mysql> select STRCMP('text', 'text'); @findex MATCH ... AGAINST() @item MATCH (col1,col2,...) AGAINST (expr) +@itemx MATCH (col1,col2,...) AGAINST (expr IN BOOLEAN MODE) @code{MATCH ... AGAINST()} is used for full-text search and returns relevance - similarity measure between the text in columns @code{(col1,col2,...)} and the query @code{expr}. Relevance is a positive floating-point number. Zero relevance means no similarity. For @code{MATCH ... AGAINST()} to work, a @strong{FULLTEXT} index must be created first. @xref{CREATE TABLE, , @code{CREATE TABLE}}. -@code{MATCH ... AGAINST()} is available in MySQL Version -3.23.23 or later. For details and usage examples -@pxref{Fulltext Search}. +@code{MATCH ... AGAINST()} is available in MySQL version +3.23.23 or later. @code{IN BOOLEAN MODE} extension was added in version +4.0.1. For details and usage examples @pxref{Fulltext Search}. @end table - @node Case Sensitivity Operators, , String comparison functions, String functions @subsubsection Case Sensitivity @@ -33679,9 +33683,9 @@ and searching. Full-text indexes in MySQL are an index of type @code{FULLTEXT}. @code{FULLTEXT} indexes can be created from @code{VARCHAR} and @code{TEXT} columns at @code{CREATE TABLE} time or added later with @code{ALTER TABLE} or @code{CREATE INDEX}. For large datasets, adding -@code{FULLTEXT} index with @code{ALTER TABLE} (or @code{CREATE INDEX}) would -be much faster than inserting rows into the empty table with a @code{FULLTEXT} -index. +@code{FULLTEXT} index with @code{ALTER TABLE} (or @code{CREATE INDEX}) +would be much faster than inserting rows into the empty table that has +a @code{FULLTEXT} index. Full-text search is performed with the @code{MATCH} function. @@ -33714,7 +33718,8 @@ mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('database'); 2 rows in set (0.00 sec) @end example -The function @code{MATCH} matches a natural language query @code{AGAINST} +The function @code{MATCH} matches a natural language (or boolean, +see below) query in case-insensitive fashion @code{AGAINST} a text collection (which is simply the set of columns covered by a @code{FULLTEXT} index). For every row in a table it returns relevance - a similarity measure between the text in that row (in the columns that are @@ -33730,7 +33735,7 @@ The above is a basic example of using @code{MATCH} function. Rows are returned with relevance decreasing. @example -mysql> SELECT id,MATCH (title,body) AGAINST ('Tutorial') FROM articles; +mysql> SELECT id,MATCH title,body AGAINST ('Tutorial') FROM articles; +----+-----------------------------------------+ | id | MATCH (title,body) AGAINST ('Tutorial') | +----+-----------------------------------------+ @@ -33748,7 +33753,7 @@ This example shows how to retrieve the relevances. As neither @code{WHERE} nor @code{ORDER BY} clauses are present, returned rows are not ordered. @example -mysql> SELECT id, body, MATCH (title,body) AGAINST ( +mysql> SELECT id, body, MATCH title,body AGAINST ( -> 'Security implications of running MySQL as root') AS score -> FROM articles WHERE MATCH (title,body) AGAINST -> ('Security implications of running MySQL as root'); @@ -33802,27 +33807,98 @@ We all know this happens far too often when we are trying to find something on the Internet with a search engine. It is with this reasoning that such rows have been assigned a low semantical value in @strong{this particular dataset}. +Since version 4.0.1 MySQL can also perform boolean fulltext searches using +@code{IN BOOLEAN MODE} modifier. + +@example +mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ( + -> '+MySQL -YourSQL' IN BOOLEAN MODE); ++----+------------------------------+-----------------------------------------------+ +| id | title | body | ++----+------------------------------+-----------------------------------------------+ +| 1 | MySQL Tutorial | DBMS stands for DataBase Management ... | +| 2 | How To Use MySQL Efficiently | After you went through a ... | +| 3 | Optimising MySQL | In this tutorial we will show how to ... | +| 4 | 1001 MySQL Trick | 1. Never run mysqld as root. 2. Normalise ... | +| 6 | MySQL Security | When configured properly, MySQL could be ... | ++----+------------------------------+-----------------------------------------------+ +@end example + +This query retrieved all the rows that contain the word @code{MySQL} +(note: 50% threshold is gone), but does @strong{not} contain the word +@code{YourSQL}. Note that it does not auto-magically sort rows in +derceasing relevance order (the last row has the highest relevance, +as it contains @code{MySQL} twice). + +Boolean fulltext search supports the following operators: + +@table @code +@item + +A plus sign prepended to a word indicates that this word @strong{must be} +present in every row returned. +@item - +A minus sign prepended to a word indicates that this word @strong{must not} +be present in the rows returned. +@item +By default - without plus or minus - the word is optional, but the rows that +contain it will be rated higher. This mimicks the behaviour of +@code{MATCH ... AGAINST()} without @code{IN BOOLEAN MODE} modifier. +@item < > +These two operators are used to increase and decrease word's contribution +to the relevance value, assigned to a row. See an example below. +@item ( ) +Parentheses are used - as usual - to group words into subexpressions. +@item ~ +This is negation operator. It makes word's contribution to the row +relevance negative. It's useful for marking noise words. A row that has +such a word will be rated lower than others, but will not be excluded +altogether, as with @code{-} operator. +@item * +This is truncation operator. Unlike others it should be @strong{appended} +to the word, not prepended. +@end table + +And here are some examples: + +@table @code +@item ``apple banana'' +find rows that contain at least one of these words. +@item ``+apple +juice'' +... both words +@item ``+apple macintosh'' +... word ``apple'', but rank it higher if it also contain ``macintosh'' +@item ``+apple -macintosh'' +... word ``apple'' but not ``macintosh'' +@item ``+gates +(>hell <bill)'' +... ``hell'' and ``gates'', or ``bill'' and ``gates'' (in any +order), but rank ``gates to hell'' higher than ``bill gates''. +@item ``apple*'' +... ``apple'', ``apples'', ``applesause'', and ``applet'' +@end table + @menu -* Fulltext restrictions:: Fulltext restrictions +* Fulltext Restrictions:: Fulltext Restrictions * Fulltext Fine-tuning:: Fine-tuning MySQL Full-text Search -* Fulltext Features in MySQL 4.0:: New Features of Full-text Search in MySQL 4.0 * Fulltext TODO:: Full-text Search TODO @end menu -@node Fulltext restrictions, Fulltext Fine-tuning, Fulltext Search, Fulltext Search -@subsection Fulltext restrictions +@node Fulltext Restrictions, Fulltext Fine-tuning, Fulltext Search, Fulltext Search +@subsection Fulltext Restrictions @itemize @bullet @item All parameters to the @code{MATCH} function must be columns from the same table that is part of the same fulltext index. @item +Column list between @code{MATCH} and @code{AGAINST} must match exactly +a column list in the @code{FULLTEXT} index definition. +@item The argument to @code{AGAINST} must be a constant string. @end itemize -@node Fulltext Fine-tuning, Fulltext Features in MySQL 4.0, Fulltext restrictions, Fulltext Search +@node Fulltext Fine-tuning, Fulltext TODO, Fulltext Restrictions, Fulltext Search @subsection Fine-tuning MySQL Full-text Search Unfortunately, full-text search has few user-tunable parameters yet, @@ -33844,13 +33920,13 @@ Change it to the value you prefer, and rebuild your @code{FULLTEXT} indexes. @item -The stopword list is defined in @code{myisam/ft_static.c} +The stopword list is defined in @file{myisam/ft_static.c} Modify it to your taste, recompile MySQL and rebuild your @code{FULLTEXT} indexes. @item The 50% threshold is caused by the particular weighting scheme chosen. To -disable it, change the following line in @code{myisam/ftdefs.h}: +disable it, change the following line in @file{myisam/ftdefs.h}: @example #define GWS_IN_USE GWS_PROB @end example @@ -33860,49 +33936,22 @@ to @end example and recompile MySQL. There is no need to rebuild the indexes in this case. +@strong{Note:} by doing this you @strong{severely} decrease MySQL ability +to provide adequate relevance values by @code{MATCH} function. +It means, that if you really need to search for such a common words, +then you should rather search @code{IN BOOLEAN MODE}, which does not +has 50% threshold. -@end itemize - - -@node Fulltext Features in MySQL 4.0, Fulltext TODO, Fulltext Fine-tuning, Fulltext Search -@subsection New Features of Full-text Search in MySQL 4.0 - -This section includes a list of the fulltext features that are already -implemented in the 4.0 tree. It explains the -@strong{More functions for full-text search} entry of @ref{TODO MySQL 4.0}. - -@itemize @bullet -@item @code{REPAIR TABLE} with @code{FULLTEXT} indexes, -@code{ALTER TABLE} with @code{FULLTEXT} indexes, and -@code{OPTIMIZE TABLE} with @code{FULLTEXT} indexes are now -up to 100 times faster. - -@item @code{MATCH ... AGAINST} is going to supports the following -@strong{boolean operators}: - -@itemize @bullet -@item @code{+}word means the that word @strong{must} be present in every -row returned. -@item @code{-}word means the that word @strong{must not} be present in every -row returned. -@item @code{<} and @code{>} can be used to decrease and increase word -weight in the query. -@item @code{~} can be used to assign a @strong{negative} weight to a noise -word. -@item @code{*} is a truncation operator. -@end itemize - -Boolean search utilises a more simplistic way of calculating the relevance, -that does not have a 50% threshold. - -@item Searches are now up to 2 times faster due to optimised search algorithm. - -@c @item Utility program @code{ft_dump} added for low-level @code{FULLTEXT} -@c index operations (querying/dumping/statistics). +@item +Sometimes search engine maintaner would like to change operators used +for boolean fulltext search. They are defined by a +@code{ft_boolean_syntax} variable. @xref{SHOW VARIABLES}. +Still, this variable is read-only, its value is set in +@file{myisam/ft_static.c}. @end itemize -@node Fulltext TODO, , Fulltext Features in MySQL 4.0, Fulltext Search +@node Fulltext TODO, , Fulltext Fine-tuning, Fulltext Search @subsection Full-text Search TODO @itemize @bullet @@ -45795,6 +45844,10 @@ Our TODO section contains what we plan to have in 4.0. @xref{TODO MySQL 4.0}. @itemize @bullet @item +Fixed bug in @code{DELETE ... WHERE ... MATCH ...} +@item +Added support for @code{MATCH ... AGAINST(... IN BOOLEAN MODE)}. +@item @code{LOCATE()} and @code{INSTR()} are case sensitive if neither argument is a binary string. binary strings. @item @@ -45904,6 +45957,10 @@ Added @code{ALTER TABLE table_name DISABLE KEYS} and @item Allow one to use @code{IN} instead of @code{FROM} in @code{SHOW} commands. @item +Implemented ``repair by sort'' for @code{FULLTEXT} indexes. +@code{REPAIR TABLE}, @code{ALTER TABLE}, and @code{OPTIMIZE TABLE} +for tables with @code{FULLTEXT} indexes are now up to 100 times faster. +@item Allow ANSI SQL syntax @code{X'hexadecimal-number'} @item Cleaned up global lock handling for @code{FLUSH TABLES WITH READ LOCK} |