diff options
author | unknown <monty@hundin.mysql.fi> | 2002-06-03 12:59:31 +0300 |
---|---|---|
committer | unknown <monty@hundin.mysql.fi> | 2002-06-03 12:59:31 +0300 |
commit | f0409fa920c7908f2f9ef03919583a32bf84eaad (patch) | |
tree | be04186411dc657ef6bbcbe01267d30f2675c914 /Docs/internals.texi | |
parent | ebbcb0f391d7df364e0ccc6bca706456e9aadbf7 (diff) | |
parent | 7cb2e2d1dce2c7466388f4a6ade0614564be82fc (diff) | |
download | mariadb-git-f0409fa920c7908f2f9ef03919583a32bf84eaad.tar.gz |
merge with 4.0
BitKeeper/etc/ignore:
auto-union
BitKeeper/etc/logging_ok:
auto-union
BUILD/SETUP.sh:
Auto merged
BUILD/compile-pentium-debug:
Auto merged
BitKeeper/triggers/post-commit:
Auto merged
configure.in:
Auto merged
Docs/manual.texi:
Auto merged
client/mysql.cc:
Auto merged
client/mysqldump.c:
Auto merged
client/mysqltest.c:
Auto merged
extra/mysql_install.c:
Auto merged
extra/resolve_stack_dump.c:
Auto merged
extra/resolveip.c:
Auto merged
include/my_sys.h:
Auto merged
include/mysqld_error.h:
Auto merged
isam/pack_isam.c:
Auto merged
libmysql/Makefile.shared:
Auto merged
libmysql/libmysql.c:
Auto merged
myisam/ft_dump.c:
Auto merged
myisam/ft_test1.c:
Auto merged
myisam/ftdefs.h:
Auto merged
myisam/mi_check.c:
Auto merged
myisam/mi_test1.c:
Auto merged
myisam/mi_write.c:
Auto merged
myisam/myisamchk.c:
Auto merged
myisam/myisampack.c:
Auto merged
mysql-test/mysql-test-run.sh:
Auto merged
mysql-test/r/select_found.result:
Auto merged
mysql-test/t/select_found.test:
Auto merged
mysys/charset.c:
Auto merged
mysys/default.c:
Auto merged
mysys/hash.c:
Auto merged
sql/field.cc:
Auto merged
sql/gen_lex_hash.cc:
Auto merged
sql/ha_innodb.cc:
Auto merged
sql/hostname.cc:
Auto merged
sql/item_cmpfunc.h:
Auto merged
sql/item_strfunc.cc:
Auto merged
sql/item_timefunc.h:
Auto merged
sql/lex.h:
Auto merged
sql/log.cc:
Auto merged
sql/mysql_priv.h:
Auto merged
sql/repl_failsafe.cc:
Auto merged
sql/slave.cc:
Auto merged
sql/sql_acl.cc:
Auto merged
sql/sql_base.cc:
Auto merged
sql/sql_cache.cc:
Auto merged
sql/sql_class.cc:
Auto merged
sql/sql_class.h:
Auto merged
sql/sql_db.cc:
Auto merged
sql/sql_parse.cc:
Auto merged
sql/sql_select.cc:
Auto merged
sql/sql_string.cc:
Auto merged
sql/sql_table.cc:
Auto merged
sql/sql_union.cc:
Auto merged
sql/share/czech/errmsg.txt:
Auto merged
sql/share/danish/errmsg.txt:
Auto merged
sql/share/dutch/errmsg.txt:
Auto merged
sql/share/english/errmsg.txt:
Auto merged
sql/share/estonian/errmsg.txt:
Auto merged
sql/share/german/errmsg.txt:
Auto merged
sql/share/greek/errmsg.txt:
Auto merged
sql/share/hungarian/errmsg.txt:
Auto merged
sql/share/italian/errmsg.txt:
Auto merged
sql/share/japanese/errmsg.txt:
Auto merged
sql/share/korean/errmsg.txt:
Auto merged
sql/share/norwegian-ny/errmsg.txt:
Auto merged
sql/share/norwegian/errmsg.txt:
Auto merged
sql/sql_update.cc:
Auto merged
sql/structs.h:
Auto merged
sql/share/polish/errmsg.txt:
Auto merged
sql/share/portuguese/errmsg.txt:
Auto merged
sql/share/romanian/errmsg.txt:
Auto merged
sql/share/russian/errmsg.txt:
Auto merged
sql/share/slovak/errmsg.txt:
Auto merged
sql/share/spanish/errmsg.txt:
Auto merged
sql/share/swedish/errmsg.txt:
Auto merged
sql/share/ukrainian/errmsg.txt:
Auto merged
strings/Makefile.am:
Auto merged
strings/ctype-ujis.c:
Auto merged
tools/mysqlmanager.c:
Auto merged
Diffstat (limited to 'Docs/internals.texi')
-rw-r--r-- | Docs/internals.texi | 45 |
1 files changed, 44 insertions, 1 deletions
diff --git a/Docs/internals.texi b/Docs/internals.texi index 8f358982ded..871e51c50bd 100644 --- a/Docs/internals.texi +++ b/Docs/internals.texi @@ -57,6 +57,7 @@ This is a manual about @strong{MySQL} internals. * mysys functions:: Functions In The @code{mysys} Library * DBUG:: DBUG Tags To Use * protocol:: MySQL Client/Server Protocol +* Fulltext Search:: Fulltext Search in MySQL @end menu @@ -535,7 +536,7 @@ Print query. @end table -@node protocol, , DBUG, Top +@node protocol, Fulltext Search, DBUG, Top @chapter MySQL Client/Server Protocol @menu @@ -785,6 +786,48 @@ Date 03 0A 00 00 |01 0A |03 00 00 00 @c @printindex fn +@node Fulltext Search, , protocol, Top +@chapter Fulltext Search in MySQL + +Hopefully, sometime there will be complete description of +fulltext search algorithms. +Now it's just unsorted notes. + +@menu +* Weighting in boolean mode:: +@end menu + +@node Weighting in boolean mode, , , Fulltext Search +@section Weighting in boolean mode + +The basic idea is as follows: in expression +@code{A or B or (C and D and E)}, either @code{A} or @code{B} alone +is enough to match the whole expression. While @code{C}, +@code{D}, and @code{E} should @strong{all} match. So it's +reasonable to assign weight 1 to @code{A}, @code{B}, and +@code{(C and D and E)}. And @code{C}, @code{D}, and @code{E} +should get a weight of 1/3. + +Things become more complicated when considering boolean +operators, as used in MySQL FTB. Obvioulsy, @code{+A +B} +should be treated as @code{A and B}, and @code{A B} - +as @code{A or B}. The problem is, that @code{+A B} can @strong{not} +be rewritten in and/or terms (that's the reason why this - extended - +set of operators was chosen). Still, aproximations can be used. +@code{+A B C} can be approximated as @code{A or (A and (B or C))} +or as @code{A or (A and B) or (A and C) or (A and B and C)}. +Applying the above logic (and omitting mathematical +transformations and normalization) one gets that for +@code{+A_1 +A_2 ... +A_N B_1 B_2 ... B_M} the weights +should be: @code{A_i = 1/N}, @code{B_j=1} if @code{N==0}, and, +otherwise, in the first rewritting approach @code{B_j = 1/3}, +and in the second one - @code{B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1))}. + +The second expression gives somewhat steeper increase in total +weight as number of matched B's increases, because it assigns +higher weights to individual B's. Also the first expression in +much simplier. So it is the first one, that is implemented in MySQL. + @summarycontents @contents |