diff options
Diffstat (limited to 'Docs/internals.texi')
-rw-r--r-- | Docs/internals.texi | 1875 |
1 files changed, 1581 insertions, 294 deletions
diff --git a/Docs/internals.texi b/Docs/internals.texi index 2195b42d9a0..a94158f84f8 100644 --- a/Docs/internals.texi +++ b/Docs/internals.texi @@ -1,26 +1,30 @@ \input texinfo @c -*-texinfo-*- -@c Copyright 1998 TcX AB, Detron HB and Monty Program KB +@c Copyright 2002 MySQL AB @c @c %**start of header @setfilename internals.info + @c We want the types in the same index -@c @synindex tp fn cp @synindex cp fn + @iftex -@c Well this is normal in Europe. Maybe this should go into the include.texi? @afourpaper @end iftex + @c Get version and other info @include include.texi + @ifclear tex-debug @c This removes the black squares in the right margin @finalout @end ifclear + @c Set background for HTML @set _body_tags BGCOLOR=#FFFFFF TEXT=#000000 LINK=#101090 VLINK=#7030B0 -@settitle @strong{MySQL} internals Manual for version @value{mysql_version}. -@setchapternewpage off +@settitle @strong{MySQL} Internals Manual for version @value{mysql_version}. +@setchapternewpage odd @paragraphindent 0 + @c %**end of header @ifinfo @@ -35,67 +39,149 @@ END-INFO-DIR-ENTRY @sp 10 @center @titlefont{@strong{MySQL} Internals Manual} @sp 10 -@center Copyright @copyright{} 1998 TcX AB, Detron HB and Monty Program KB +@center Copyright @copyright{} 1998-2002 MySQL AB +@page @end titlepage -@node Top, Introduction, (dir), (dir) +@node Top, caching, (dir), (dir) @ifinfo This is a manual about @strong{MySQL} internals. @end ifinfo @menu +* caching:: How MySQL Handles Caching +* flush tables:: How MySQL Handles @code{FLUSH TABLES} +* filesort:: How MySQL Does Sorting (@code{filesort}) +* coding guidelines:: Coding Guidelines +* mysys functions:: Functions In The @code{mysys} Library +* DBUG:: DBUG Tags To Use +* protocol:: MySQL Client/Server Protocol +* Fulltext Search:: Fulltext Search in MySQL @end menu -@node caching,,, -@chapter How MySQL handles caching + +@node caching, flush tables, Top, Top +@chapter How MySQL Handles Caching @strong{MySQL} has the following caches: (Note that the some of the filename have a wrong spelling of cache. :) -@itemize @bullet +@table @strong -@item Key cache +@item Key Cache A shared cache for all B-tree index blocks in the different NISAM files. Uses hashing and reverse linked lists for quick caching of the last used blocks and quick flushing of changed entries for a specific table. (@file{mysys/mf_keycash.c}) -@item Record cache +@item Record Cache This is used for quick scanning of all records in a table. (@file{mysys/mf_iocash.c} and @file{isam/_cash.c}) -@item Table cache +@item Table Cache This holds the last used tables. (@file{sql/sql_base.cc}) -@item Hostname cache +@item Hostname Cache For quick lookup (with reverse name resolving). Is a must when one has a slow DNS. (@file{sql/hostname.cc}) -@item Privilege cache +@item Privilege Cache To allow quick change between databases the last used privileges are cached for each user/database combination. (@file{sql/sql_acl.cc}) -@item Heap table cache -Many use of GROUP BY or DISTINCT caches all found -rows in a HEAP table (this is a very quick in-memory table with hash index) +@item Heap Table Cache +Many use of @code{GROUP BY} or @code{DISTINCT} caches all found rows in +a @code{HEAP} table. (This is a very quick in-memory table with hash index.) + +@item Join buffer Cache +For every full join in a @code{SELECT} statement (a full join here means +there were no keys that one could use to find the next table in a list), +the found rows are cached in a join cache. One @code{SELECT} query can +use many join caches in the worst case. +@end table -@item Join row cache. -For every full join in a SELECT statement (a full join here means there -were no keys that one could use to find the next table in a list), the -found rows are cached in a join cache. One SELECT query can use many -join caches in the worst case. +@node join_buffer_size, flush tables, caching, Top +@subchapter How MySQL uses the join_buffer cache + +Basic information about @code{join_buffer_size}: + +@itemize @bullet +@item +It's only used in the case when join type is of type @code{ALL} or +@code{index}; In other words: no possible keys can be used. +@item +A join buffer is never allocated for the first not-const table, +even it it would be of type @code{ALL}/@code{index}. +@item +The buffer is allocated when we need to do a each full join between two +tables and freed after the query is done. +@item +Accepted row combinations of tables before the @code{ALL}/@code{index} +able is stored in the cache and is used to compare against each read +row in the @code{ALL} table. +@item +We only store the used fields in the join_buffer cache, not the +whole rows. +@end itemize + +Assume you have the following join: + +@example +Table name Type +t1 range +t2 ref +t3 @code{ALL} +@end example + +The join is then done as follows: + +@example +- While rows in t1 matching range + - Read through all rows in t2 according to reference key + - Store used fields form t1,t2 in cache + - If cache is full + - Read through all rows in t3 + - Compare t3 row against all t1,t2 combination in cache + - If rows satisfying join condition, send it to client + - Empty cache + +- Read through all rows in t3 + - Compare t3 row against all stored t1,t2 combinations in cache + - If rows satisfying join condition, send it to client +@end example + +The above means that table t3 is scanned + +@example +(size-of-stored-row(t1,t2) * accepted-row-cominations(t1,t2))/ +join_buffer_size+1 +@end example +times. + +Some conclusions: + +@itemize @bullet +@item +The larger the join_buff_size, the fewer scans of t3. +If @code{join_buff_size} is already large enough to hold all previous row +combinations then there is no speed to gain by making it bigger. +@item +If there is several tables of @code{ALL}/@code{index} then the we +allocate one @code{join_buffer_size buffer} for each of them and use the +same algorithm described above to handle it. (In other words, we store +the same row combination several times into different buffers) @end itemize -@node flush tables,,, -@chapter How MySQL handles flush tables +@node flush tables, filesort, caching, Top +@chapter How MySQL Handles @code{FLUSH TABLES} @itemize @bullet @item -Flush tables is handled in @code{sql/sql_base.cc::close_cached_tables()}. +Flush tables is handled in @file{sql/sql_base.cc::close_cached_tables()}. @item The idea of flush tables is to force all tables to be closed. This @@ -109,8 +195,8 @@ all tables)! When one does a @code{FLUSH TABLES}, the variable @code{refresh_version} will be incremented. Every time a thread releases a table it checks if the refresh version of the table (updated at open) is the same as -the current refresh_version. If not it will close it and broadcast -a signal on COND_refresh (to wait any thread that is waiting for +the current @code{refresh_version}. If not it will close it and broadcast +a signal on @code{COND_refresh} (to wait any thread that is waiting for all instanses of a table to be closed). @item @@ -119,8 +205,8 @@ The current @code{refresh_version} is also compared to the open refresh version is different the thread will free all locks, reopen the table and try to get the locks again; This is just to quickly get all tables to use the newest version. This is handled by -@code{sql/lock.cc::mysql_lock_tables()} and -@code{sql/sql_base.cc::wait_for_tables()}. +@file{sql/lock.cc::mysql_lock_tables()} and +@file{sql/sql_base.cc::wait_for_tables()}. @item When all tables has been closed @code{FLUSH TABLES} will return an ok @@ -134,8 +220,8 @@ After this it will give other threads a chance to open the same tables. @end itemize -@node Filesort,,, -@chapter How MySQL does sorting (filesort) +@node filesort, coding guidelines, flush tables, Top +@chapter How MySQL Does Sorting (@code{filesort}) @itemize @bullet @@ -146,7 +232,7 @@ Read all rows according to key or by table scanning. Store the sort-key in a buffer (@code{sort_buffer}). @item -When the buffer gets full, run a qsort on it and store the result +When the buffer gets full, run a @code{qsort} on it and store the result in a temporary file. Save a pointer to the sorted block. @item @@ -170,12 +256,13 @@ Now the code in @file{sql/records.cc} will be used to read through them in sorted order by using the row pointers in the result file. To optimize this, we read in a big block of row pointers, sort these and then we read the rows in the sorted order into a row buffer -(@code{record_buffer}) . +(@code{record_buffer}). @end itemize -@node Coding guidelines,,, -@chapter Coding guidelines + +@node coding guidelines, mysys functions, filesort, Top +@chapter Coding Guidelines @itemize @bullet @@ -183,24 +270,28 @@ and then we read the rows in the sorted order into a row buffer We are using @uref{http://www.bitkeeper.com/, BitKeeper} for source management. @item -You should use the @strong{MySQL} 3.23 or 4.0 source for all developments. +You should use the @strong{MySQL} 4.0 source for all developments. @item If you have any questions about the @strong{MySQL} source, you can post these -to @email{developers@@mysql.com} and we will answer them. -Note that we will shortly change the name of this list to -@email{internals@@mysql.com}, to more accurately reflect what should be -posted to this list. +to @email{dev-public@@mysql.com} and we will answer them. Please +remember to not use this internal email list in public! @item -Try to write code in a lot of black boxes that can be reused or at -least have a clean interface. +Try to write code in a lot of black boxes that can be reused or use at +least a clean, easy to change interface. @item Reuse code; There is already a lot of algorithms in MySQL for list handling, queues, dynamic and hashed arrays, sorting, etc. that can be reused. @item +Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/ +@code{my_malloc()} that you can find in the @code{mysys} library instead +of the direct system calls; This will make your code easier to debug and +more portable. + +@item Try to always write optimized code, so that you don't have to go back and rewrite it a couple of months later. It's better to spend 3 times as much time designing and writing an optimal function than @@ -221,25 +312,23 @@ Don't use two commands on the same line. Do not check the same pointer for @code{NULL} more than once. @item -Use long function and variable names in English; This makes your code -easier to read. Use the 'varible_name' style instead of 'VariableName'. +Use long function and variable names in English. This makes your code +easier to read. @item -Think assembly - make it easier for the compiler to optimize your code. +Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_} +rather than dancing SHIFT to seperate words in identifiers). @item -Comment your code when you do something that someone else may think -is not ''trivial''. +Think assembly - make it easier for the compiler to optimize your code. @item -Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/ -@code{my_malloc()} that you can find in the @code{mysys} library instead -of the direct system calls; This will make your code easier to debug and -more portable. +Comment your code when you do something that someone else may think +is not ``trivial''. @item -Use @code{libstring} functions instead of standard libc string functions -whenever possible. +Use @code{libstring} functions (in the @file{strings} directory) +instead of standard @code{libc} string functions whenever possible. @item Avoid using @code{malloc()} (its REAL slow); For memory allocations @@ -254,10 +343,6 @@ easily discuss it thoroughly if some other developer thinks there is better way to do the same thing! @item -Use my_var as opposed to myVar or MyVar (@samp{_} rather than dancing SHIFT -to seperate words in identifiers). - -@item Class names start with a capital letter. @item @@ -270,29 +355,28 @@ Any @code{#define}'s are in all-caps. Matching @samp{@{} are in the same column. @item -Put the @samp{@{} after a 'switch' on the same line +Put the @samp{@{} after a @code{switch} on the same line, as this gives +better overall indentation for the switch statement: @example -switch (arg) { +switch (arg) @{ @end example -Because this gives better overall indentation for the switch statement. - @item -In all other cases, @{ and @} should be on their own line, except -if there is nothing inside @{ @}. +In all other cases, @samp{@{} and @samp{@}} should be on their own line, except +if there is nothing inside @samp{@{} and @samp{@}}. @item -Have a space after 'if' +Have a space after @code{if} @item -Put a space after ',' for function arguments +Put a space after @samp{,} for function arguments @item -Functions return 0 on success, and non-zero on error, so you can do: +Functions return @samp{0} on success, and non-zero on error, so you can do: @example -if(a() || b() || c()) { error("something went wrong"); } +if(a() || b() || c()) @{ error("something went wrong"); @} @end example @item @@ -337,113 +421,110 @@ Suggested mode in emacs: (setq c-default-style "MY") @end example -@node mysys functions,,, -@chapter mysys functions - -Functions i mysys: (For flags se my_sys.h) - - int my_copy _A((const char *from,const char *to,myf MyFlags)); - - Copy file - - int my_delete _A((const char *name,myf MyFlags)); - - Delete file - - int my_getwd _A((string buf,uint size,myf MyFlags)); - int my_setwd _A((const char *dir,myf MyFlags)); - - Get and set working directory - - string my_tempnam _A((const char *pfx,myf MyFlags)); - - Make a uniq temp file name by using dir and adding something after - pfx to make name uniq. Name is made by adding a uniq 6 length-string - and TMP_EXT after pfx. - Returns pointer to malloced area for filename. Should be freed by - free(). - - File my_open _A((const char *FileName,int Flags,myf MyFlags)); - File my_create _A((const char *FileName,int CreateFlags, - int AccsesFlags, myf MyFlags)); - int my_close _A((File Filedes,myf MyFlags)); - uint my_read _A((File Filedes,byte *Buffer,uint Count,myf MyFlags)); - uint my_write _A((File Filedes,const byte *Buffer,uint Count, - myf MyFlags)); - ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags)); - ulong my_tell _A((File fd,myf MyFlags)); - - Use instead of open,open-with-create-flag, close read and write - to get automatic error-messages (flag: MYF_WME) and only have - to test for != 0 if error (flag: MY_NABP). - - int my_rename _A((const char *from,const char *to,myf MyFlags)); - - Rename file - - FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags)); - FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags)); - int my_fclose _A((FILE *fd,myf MyFlags)); - uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags)); - uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, - myf MyFlags)); - ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags)); - ulong my_ftell _A((FILE *stream,myf MyFlags)); - - Same read-interface for streams as for files - - gptr _mymalloc _A((uint uSize,const char *sFile, - uint uLine, myf MyFlag)); - gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile, - uint uLine, myf MyFlag)); - void _myfree _A((gptr pPtr,const char *sFile,uint uLine)); - int _sanity _A((const char *sFile,unsigned int uLine)); - gptr _myget_copy_of_memory _A((const byte *from,uint length, - const char *sFile, uint uLine, - myf MyFlag)); - - malloc(size,myflag) is mapped to this functions if not compiled - with -DSAFEMALLOC - - void TERMINATE _A((void)); - - Writes malloc-info on stdout if compiled with -DSAFEMALLOC. - - int my_chsize _A((File fd,ulong newlength,myf MyFlags)); - - Change size of file - - void my_error _D((int nr,myf MyFlags, ...)); - - Writes message using error number (se mysys/errors.h) on - stdout or curses if MYSYS_PROGRAM_USES_CURSES() is called. - - void my_message _A((const char *str,myf MyFlags)); - - Writes message-string on - stdout or curses if MYSYS_PROGRAM_USES_CURSES() is called. - - void my_init _A((void )); - - Start each program (in main) with this. - void my_end _A((int infoflag)); - - Gives info about program. - - If infoflag & MY_CHECK_ERROR prints if some files are left open - - If infoflag & MY_GIVE_INFO prints timing info and malloc info - about prog. - - int my_redel _A((const char *from, const char *to, int MyFlags)); - - Delete from before rename of to to from. Copyes state from old - file to new file. If MY_COPY_TIME is set sets old time. - - int my_copystat _A((const char *from, const char *to, int MyFlags)); - - Copye state from old file to new file. - If MY_COPY_TIME is set sets copy also time. - - string my_filename _A((File fd)); - - Give filename of open file. - - int dirname _A((string to,const char *name)); - - Copy name of directory from filename. - - int test_if_hard_path _A((const char *dir_name)); - - Test if dirname is a hard path (Starts from root) - - void convert_dirname _A((string name)); - - Convert dirname acording to system. - - In MSDOS changes all caracters to capitals and changes '/' to - '\' - string fn_ext _A((const char *name)); - - Returns pointer to extension in filename - string fn_format _A((string to,const char *name,const char *dsk, - const char *form,int flag)); + +@node mysys functions, DBUG, coding guidelines, Top +@chapter Functions In The @code{mysys} Library + +Functions in @code{mysys}: (For flags see @file{my_sys.h}) + +@table @code +@item int my_copy _A((const char *from, const char *to, myf MyFlags)); +Copy file from @code{from} to @code{to}. + +@item int my_delete _A((const char *name, myf MyFlags)); +Delete file @code{name}. + +@item int my_getwd _A((string buf, uint size, myf MyFlags)); +@item int my_setwd _A((const char *dir, myf MyFlags)); +Get and set working directory. + +@item string my_tempnam _A((const char *pfx, myf MyFlags)); +Make a unique temporary file name by using dir and adding something after +@code{pfx} to make name unique. The file name is made by adding a unique +six character string and @code{TMP_EXT} after @code{pfx}. +Returns pointer to @code{malloc()}'ed area for filename. Should be freed by +@code{free()}. + +@item File my_open _A((const char *FileName,int Flags,myf MyFlags)); +@item File my_create _A((const char *FileName, int CreateFlags, int AccsesFlags, myf MyFlags)); +@item int my_close _A((File Filedes, myf MyFlags)); +@item uint my_read _A((File Filedes, byte *Buffer, uint Count, myf MyFlags)); +@item uint my_write _A((File Filedes, const byte *Buffer, uint Count, myf MyFlags)); +@item ulong my_seek _A((File fd,ulong pos,int whence,myf MyFlags)); +@item ulong my_tell _A((File fd,myf MyFlags)); +Use instead of open, open-with-create-flag, close, read, and write +to get automatic error messages (flag @code{MYF_WME}) and only have +to test for != 0 if error (flag @code{MY_NABP}). + +@item int my_rename _A((const char *from, const char *to, myf MyFlags)); +Rename file from @code{from} to @code{to}. + +@item FILE *my_fopen _A((const char *FileName,int Flags,myf MyFlags)); +@item FILE *my_fdopen _A((File Filedes,int Flags,myf MyFlags)); +@item int my_fclose _A((FILE *fd,myf MyFlags)); +@item uint my_fread _A((FILE *stream,byte *Buffer,uint Count,myf MyFlags)); +@item uint my_fwrite _A((FILE *stream,const byte *Buffer,uint Count, myf MyFlags)); +@item ulong my_fseek _A((FILE *stream,ulong pos,int whence,myf MyFlags)); +@item ulong my_ftell _A((FILE *stream,myf MyFlags)); +Same read-interface for streams as for files. + +@item gptr _mymalloc _A((uint uSize,const char *sFile,uint uLine, myf MyFlag)); +@item gptr _myrealloc _A((string pPtr,uint uSize,const char *sFile,uint uLine, myf MyFlag)); +@item void _myfree _A((gptr pPtr,const char *sFile,uint uLine)); +@item int _sanity _A((const char *sFile,unsigned int uLine)); +@item gptr _myget_copy_of_memory _A((const byte *from,uint length,const char *sFile, uint uLine,myf MyFlag)); +@code{malloc(size,myflag)} is mapped to these functions if not compiled +with @code{-DSAFEMALLOC}. + +@item void TERMINATE _A((void)); +Writes @code{malloc()} info on @code{stdout} if compiled with +@code{-DSAFEMALLOC}. + +@item int my_chsize _A((File fd, ulong newlength, myf MyFlags)); +Change size of file @code{fd} to @code{newlength}. + +@item void my_error _D((int nr, myf MyFlags, ...)); +Writes message using error number (see @file{mysys/errors.h}) on @code{stdout}, +or using curses, if @code{MYSYS_PROGRAM_USES_CURSES()} has been called. + +@item void my_message _A((const char *str, myf MyFlags)); +Writes @code{str} on @code{stdout}, or using curses, if +@code{MYSYS_PROGRAM_USES_CURSES()} has been called. + +@item void my_init _A((void )); +Start each program (in @code{main()}) with this. + +@item void my_end _A((int infoflag)); +Gives info about program. +If @code{infoflag & MY_CHECK_ERROR}, prints if some files are left open. +If @code{infoflag & MY_GIVE_INFO}, prints timing info and malloc info +about program. + +@item int my_redel _A((const char *from, const char *to, int MyFlags)); +Delete @code{from} before rename of @code{to} to @code{from}. Copies state +from old file to new file. If @code{MY_COPY_TIME} is set, sets old time. + +@item int my_copystat _A((const char *from, const char *to, int MyFlags)); +Copy state from old file to new file. If @code{MY_COPY_TIME} is set, +sets old time. + +@item string my_filename _A((File fd)); +Returns filename of open file. + +@item int dirname _A((string to, const char *name)); +Copy name of directory from filename. + +@item int test_if_hard_path _A((const char *dir_name)); +Test if @code{dir_name} is a hard path (starts from root). + +@item void convert_dirname _A((string name)); +Convert dirname according to system. +In MSDOS, changes all characters to capitals and changes @samp{/} to @samp{\}. + +@item string fn_ext _A((const char *name)); +Returns pointer to extension in filename. + +@item string fn_format _A((string to,const char *name,const char *dsk,const char *form,int flag)); format a filename with replace of library and extension and converts between different systems. params to and name may be identicall @@ -457,117 +538,206 @@ Functions i mysys: (For flags se my_sys.h) "open(fn_format(temp_buffe,name,"","",4),...)" to unpack home and convert filename to system-form. - string fn_same _A((string toname,const char *name,int flag)); - - Copys directory and extension from name to toname if neaded. - copy can be forced by same flags that in fn_format. +@item string fn_same _A((string toname, const char *name, int flag)); +Copys directory and extension from @code{name} to @code{toname} if neaded. +Copying can be forced by same flags used in @code{fn_format()}. + +@item int wild_compare _A((const char *str, const char *wildstr)); +Compare if @code{str} matches @code{wildstr}. @code{wildstr} can contain +@samp{*} and @samp{?} as wildcard characters. +Returns 0 if @code{str} and @code{wildstr} match. + +@item void get_date _A((string to, int timeflag)); +Get current date in a form ready for printing. + +@item void soundex _A((string out_pntr, string in_pntr)) +Makes @code{in_pntr} to a 5 char long string. All words that sound +alike have the same string. + +@item int init_key_cache _A((ulong use_mem, ulong leave_this_much_mem)); +Use caching of keys in MISAM, PISAM, and ISAM. +@code{KEY_CACHE_SIZE} is a good size. +Remember to lock databases for optimal caching. - int wild_compare _A((const char *str,const char *wildstr)); - - Compare if str matches wildstr. Wildstr can contain "*" and "?" - as match-characters. - Returns 0 if match. +@item void end_key_cache _A((void)); +End key caching. +@end table - void get_date _A((string to,int timeflag)); - - Get current date in a form ready for printing. - void soundex _A((string out_pntr, string in_pntr)) - - Makes in_pntr to a 5 chars long string. All words that sounds - alike have the same string. - int init_key_cache _A((ulong use_mem,ulong leave_this_much_mem)); - - Use cacheing of keys in MISAM, PISAM, and ISAM. - KEY_CACHE_SIZE is a good size. - - Remember to lock databases for optimal cacheing +@node DBUG, protocol, mysys functions, Top +@chapter DBUG Tags To Use - void end_key_cache _A((void)); - - End key-cacheing. +Here is some of the tags we now use: +(We should probably add a couple of new ones) -@node protocol,,, -@chapter MySQL client/server protocol +@table @code +@item enter +Arguments to the function. -Raw packet without compression -============================== -------------------------------------------------- -| Packet Length | Packet no | Data | -| 3 Bytes | 1 Byte | n Bytes | -------------------------------------------------- +@item exit +Results from the function. -3 Byte packet length - The length is calculated with int3store - See include/global.h for details. - The max packetsize can be 16 MB. -1 Byte packet no +@item info +Something that may be interesting. -If no compression is used the first 4 bytes of each paket -is the header of the paket. -The packet number is incremented for each sent packet. The first -packet starts with 0 +@item warning +When something doesn't go the usual route or may be wrong. -n Byte data +@item error +When something went wrong. + +@item loop +Write in a loop, that is probably only useful when debugging +the loop. These should normally be deleted when one is +satisfied with the code and it has been in real use for a while. +@end table + +Some specific to mysqld, because we want to watch these carefully: + +@table @code +@item trans +Starting/stopping transactions. + +@item quit +@code{info} when mysqld is preparing to die. + +@item query +Print query. +@end table + + +@node protocol, Fulltext Search, DBUG, Top +@chapter MySQL Client/Server Protocol + +@menu +* raw packet without compression:: +* raw packet with compression:: +* basic packets:: +* communication:: +* fieldtype codes:: +* protocol functions:: +* protocol version 2:: +@end menu + +@node raw packet without compression, raw packet with compression, protocol, protocol +@section Raw Packet Without Compression + +@example ++-----------------------------------------------+ +| Packet Length | Packet no | Data | +| 3 Bytes | 1 Byte | n Bytes | ++-----------------------------------------------+ +@end example + +@table @asis +@item 3 Byte packet length +The length is calculated with int3store +See include/global.h for details. +The max packetsize can be 16 MB. + +@item 1 Byte packet no +If no compression is used the first 4 bytes of each packet is the header +of the packet. The packet number is incremented for each sent packet. +The first packet starts with 0. +@item n Byte data + +@end table The packet length can be recalculated with: + +@example length = byte1 + (256 * byte2) + (256 * 256 * byte3) - -Raw packet with compression -=========================== ------------------------------------------------------ -| Packet Length | Packet no | Uncomp. Packet Length | -| 3 Bytes | 1 Byte | 3 Bytes | ------------------------------------------------------ - -3 Byte packet length - The length is calculated with int3store - See include/global.h for details. - The max packetsize can be 16 MB. -1 Byte packet no -3 Byte uncompressed packet length - -If compression is used the first 7 bytes of each paket -is the header of the paket. - -Basic packets -============== -OK-packet - For details see sql/net_pkg.cc - function send_ok - ------------------------------------------------- - | Header | No of Rows | Affected Rows | - | | 1 Byte | 1-8 Byte | - ------------------------------------------------- - | ID (last_insert_id) | Status | Length | - | 1-8 Byte | 2 Byte | 1-8 Byte | - ------------------------------------------------- - | Messagetext | - | n Byte | - ------------------------------------------------- - - Header - 1 byte number of rows ? (always 0 ?) - 1-8 bytes affected rows - 1-8 byte id (last_insert_id) - 2 byte Status (usually 0) - If the OK-packege includes a message: - 1-8 bytes length of message - n bytes messagetext - -Error-packet - ------------------------------------------------- - | Header | Statuscode | Error no | - | | 1 Byte | 2 Byte | - ------------------------------------------------- - | Messagetext | 0x00 | - | n Byte | 1 Byte | - ------------------------------------------------- - - Header - 1 byte status code (0xFF = ERROR) - 2 byte error number (is only sent to new 3.23 clients. - n byte errortext - 1 byte 0x00 - - - -The communication -================= +@end example + + +@node raw packet with compression, basic packets, raw packet without compression, protocol +@section Raw Packet With Compression + +@example ++---------------------------------------------------+ +| Packet Length | Packet no | Uncomp. Packet Length | +| 3 Bytes | 1 Byte | 3 Bytes | ++---------------------------------------------------+ +@end example + +@table @asis +@item 3 Byte packet length +The length is calculated with int3store +See include/global.h for details. +The max packetsize can be 16 MB. + +@item 1 Byte packet no +@item 3 Byte uncompressed packet length +@end table + +If compression is used the first 7 bytes of each packet +is the header of the packet. + + +@node basic packets, communication, raw packet with compression, protocol +@section Basic Packets + +@menu +* ok packet:: +* error packet:: +@end menu + + +@node ok packet, error packet, basic packets, basic packets +@subsection OK Packet + +For details, see @file{sql/net_pkg.cc::send_ok()}. + +@example ++-----------------------------------------------+ +| Header | No of Rows | Affected Rows | +| | 1 Byte | 1-8 Byte | +|-----------------------------------------------| +| ID (last_insert_id) | Status | Length | +| 1-8 Byte | 2 Byte | 1-8 Byte | +|-----------------------------------------------| +| Messagetext | +| n Byte | ++-----------------------------------------------+ +@end example + +@table @asis +@item Header +@item 1 byte number of rows ? (always 0 ?) +@item 1-8 bytes affected rows +@item 1-8 byte id (last_insert_id) +@item 2 byte Status (usually 0) +@item If the OK-packege includes a message: +@item 1-8 bytes length of message +@item n bytes messagetext +@end table + + +@node error packet, , ok packet, basic packets +@subsection Error Packet + +@example ++-----------------------------------------------+ +| Header | Status code | Error no | +| | 1 Byte | 2 Byte | +|-----------------------------------------------| +| Messagetext | 0x00 | +| n Byte | 1 Byte | ++-----------------------------------------------+ +@end example + +@table @asis +@item Header +@item 1 byte status code (0xFF = ERROR) +@item 2 byte error number (is only sent to new 3.23 clients. +@item n byte errortext +@item 1 byte 0x00 +@end table + + +@node communication, fieldtype codes, basic packets, protocol +@section Communication > Packet from server to client < Paket from client tor server @@ -658,27 +828,819 @@ The communication n data -Fieldtype Codes: -================ - - display_length |enum_field_type |flags - ---------------------------------------------------- -Blob 03 FF FF 00 |01 FC |03 90 00 00 -Mediumblob 03 FF FF FF |01 FC |03 90 00 00 -Tinyblob 03 FF 00 00 |01 FC |03 90 00 00 -Text 03 FF FF 00 |01 FC |03 10 00 00 -Mediumtext 03 FF FF FF |01 FC |03 10 00 00 -Tinytext 03 FF 00 00 |01 FC |03 10 00 00 -Integer 03 0B 00 00 |01 03 |03 03 42 00 -Mediumint 03 09 00 00 |01 09 |03 00 00 00 -Smallint 03 06 00 00 |01 02 |03 00 00 00 -Tinyint 03 04 00 00 |01 01 |03 00 00 00 -Varchar 03 XX 00 00 |01 FD |03 00 00 00 -Enum 03 05 00 00 |01 FE |03 00 01 00 -Datetime 03 13 00 00 |01 0C |03 00 00 00 -Timestamp 03 0E 00 00 |01 07 |03 61 04 00 -Time 03 08 00 00 |01 0B |03 00 00 00 -Date 03 0A 00 00 |01 0A |03 00 00 00 +@node fieldtype codes, protocol functions, communication, protocol +@section Fieldtype Codes + +@example + display_length |enum_field_type |flags + ---------------------------------------------------- +Blob 03 FF FF 00 |01 FC |03 90 00 00 +Mediumblob 03 FF FF FF |01 FC |03 90 00 00 +Tinyblob 03 FF 00 00 |01 FC |03 90 00 00 +Text 03 FF FF 00 |01 FC |03 10 00 00 +Mediumtext 03 FF FF FF |01 FC |03 10 00 00 +Tinytext 03 FF 00 00 |01 FC |03 10 00 00 +Integer 03 0B 00 00 |01 03 |03 03 42 00 +Mediumint 03 09 00 00 |01 09 |03 00 00 00 +Smallint 03 06 00 00 |01 02 |03 00 00 00 +Tinyint 03 04 00 00 |01 01 |03 00 00 00 +Varchar 03 XX 00 00 |01 FD |03 00 00 00 +Enum 03 05 00 00 |01 FE |03 00 01 00 +Datetime 03 13 00 00 |01 0C |03 00 00 00 +Timestamp 03 0E 00 00 |01 07 |03 61 04 00 +Time 03 08 00 00 |01 0B |03 00 00 00 +Date 03 0A 00 00 |01 0A |03 00 00 00 +@end example + +@node protocol functions, protocol version 2, fieldtype codes, protocol +@section Functions used to implement the protocol + +This should be merged with the above one and changed to texi format + +Raw packets +----------- + +- The my_net_xxxx() functions handles the packaging of a stream of data + into a raw packet that contains a packet number, length and data. + +- This is implemented for the server in sql/net_serv.cc. + The client file, libmysql/net.c, is symlinked to this file + +The important functions are: + +my_net_write() Store a packet (= # number of bytes) to be sent +net_flush() Send the packets stored in the buffer +net_write_command() Send a command (1 byte) + packet to the server. +my_net_read() Read a packet + + +Include files +------------- + +- include/mysql.h is included by all MySQL clients. It includes the + MYSQL and MYSQL_RES structures. +- include/mysql_com.h is include by mysql.h and mysql_priv.h (the + server) and includes a lot of common functions and structures to + handle the client/server protocol. + + +Packets from server to client: +----------------------------- + +sql/net_pkg.cc: + + - Sending of error packets + - Sending of OK packets (= end of data) + - Storing of values in a packet + + +sql/sql_base.cc: + + - Function send_fields() sends the field description to the client. + +sql/sql_show.cc: + + - Sends results for a lot of SHOW commands, including: + SHOW DATABASES [like 'wildcard'] + SHOW TABLES [like 'wildcard'] + + +Packets from client to server: +------------------------------ + +This is done in libmysql/libmysql.c + +The important ones are: + +- mysql_real_connect() Connects to a mysqld server +- mysql_real_query() Sends a query to the server and + reads the ok packet or columns header. +- mysql_store_result() Read a result set from the server to memory +- mysql_use_result() Read a result set row by row from the server. + +- net_safe_read() Read a packet from the server with + error handling. +- net_field_length() Reads the length of a packet string. +- simple_command() Sends a command/query to the server. + + + +Connecting to mysqld (the MySQL server) +--------------------------------------- + +- On the client side: libmysql/libmysql.c::mysql_real_connect(). +- On the server side: sql/sql_parse.cc::check_connections() + +The packets sent during a connection are as follows + +Server: Send greeting package (includes server capabilites, server + version and a random string of bytes to be used to scramble + the password. +Client: Sends package with client capabilites, user name, scrambled + password, database name + +Server: Sends ok package or error package. + +Client: If init command specified, send it t the server and read + ok/error package. + + +Password functions +------------------ + +The passwords are scrambled to a random number and are stored in hex +format on the server. + +The password handling is done in sql/password.c. The important +function is 'scramble()', which takes the a password in clear text +and uses this to 'encrypt' the random string sent by the server +to a new message. + +The encrypted message is sent to the server which uses the stored +random number password to encrypt the random string sent to the +client. If this is equal to the new message the client sends to the +server then the password is accepted. + +@node protocol version 2, , protocol functions, protocol +@section Another description of the protocol + +This should be merged with the above one and changed to texi format. + +***************************** +* +* PROTOCOL OVERVIEW +* +***************************** + +The MySQL protocol is relatively simple, and is designed for high performance +through minimisation of overhead, and extensibility through versioning and +options flags. It is a request-response protocol, and does not allow +multitasking or multiplexing over a single connection. There are two packet +formats, 'raw' and 'compressed' (which is used when both client and +server support zlib compression, and the client requests that data be +compressed): + +* RAW PACKET, shorter than 16 M * + ++-----------------------------------------------+ +| Packet Length | Packet no | Data | +| 3 Bytes | 1 Byte | n Bytes | ++-----------------------------------------------+ +^ ^ +| 'HEADER' | ++-------------------------------+ + + + * Packet Length: Calculated with int3store. See include/global.h for + details. The basic computation is length = byte1 + + (256 * byte2) + (256 * 256 * byte3). The max packetsize + can be 16 MB. + + * Packet no: The packet number is incremented for each sent packet. + The first packet for each query from the client + starts with 0. + + * Data: Specific to the operation being performed. Most often + used to send string data, such as a SQL query. + +* COMPRESSED PACKET * + ++---------------------------------------------------+-----------------+ +| Packet Length | Packet no | Uncomp. Packet Length | Compressed Data | +| 3 Bytes | 1 Byte | 3 Bytes | n bytes | ++---------------------------------------------------+-----------------+ +^ ^ +| 'HEADER' | ++---------------------------------------------------+ + + * Packet Length: Calculated with int3store. See include/my_global.h for + details. The basic computation is length = byte1 + + (256 * byte2) + (256 * 256 * byte3). The max packetsize + can be 16 MB. + + * Packet no: The packet number is incremented for each sent packet. + The first packet starts with 0. + + * Uncomp. Packet Length: The length of the original, uncompressed packet + If this is zero then the data is not compressed. + + * Compressed Data: The original packet, compressed with zlib compression + + +When using the compressed protocol, the client/server will only compress +send packets where the new packet is smaller than the not compressed one. +In other words, some packets may be compressed while others will not. + +The 'compressed data' is one or more packets in *RAW PACKET* format. + +***************************** +* +* FLOW OF EVENTS +* +***************************** + +To understand how a client communicates with a MySQL server, it is easiest +to start with a high-level flow of events. Each event section will then be +followed by details of the exact contents of each type of packet involved +in the event flow. + +* * +* CONNECTION ESTABLISHMENT * +* * + +Clients connect to the server via a TCP/IP socket (port 3306 by default), a +Unix Domain Socket, or named pipes (on Windows). Once connected, the +following connection establishment sequence is followed: + ++--------+ +--------+ +| Client | | Server | ++--------+ +--------+ + | | + | Handshake initialisation, including MySQL server version, | + | protocol version and options supported, as well as the seed | + | for the password hash | + | | + | <-------------------------------------------------------------- | + | | + | Client options supported, max packet size for client | + | username, password crypted with seed from server, database | + | name. | + | | + | --------------------------------------------------------------> | + | | + | 'OK' packet if authentication succeeds, 'ERROR' packet if | + | authentication fails. | + | | + | <-------------------------------------------------------------- | + | | + + + +* HANDSHAKE INITIALISATION PACKET * + + ++--------------------------------------------------------------------+ +| Header | Prot. Version | Server Version String | 0x00 | +| | 1 Byte | n bytes | 1 byte | +|--------------------------------------------------------------------| +| Thread Number | Crypt Seed | 0x00 | CLIENT_xxx options | +| | | | supported by server | +| 4 Bytes | 8 Bytes | 1 Byte | 2 Bytes | +|--------------------------------------------------------------------| +| Server charset no. | Server status variables | 0x00 padding | +| 1 Byte | 2 Bytes | 13 bytes | ++--------------------------------------------------------------------+ + + * Protocol version (currently '10') + * Server Version String (e.g. '4.0.5-beta-log'). Can be any length as + it's followed by a 0 byte. + * Thread Number - ID of server thread handling this connection + * Crypt seed - seed used to crypt password in auth packet from client + * CLIENT_xxx options - see include/mysql_com.h + * Server charset no. - Index of charset in use by server + * Server status variables - see include/mysql_com.h + * The padding bytes are reserverd for future extensions to the protocol + +* CLIENT AUTH PACKET * + + ++--------------------------------------------------------------------+ +| Header | CLIENT_xxx options supported | max_allowed_packet | +| | by client | for client | +| | 2 Bytes | 3 bytes | +|--------------------------------------------------------------------| +| User Name | 0x00 | Crypted Password | 0x00 | Database Name | +| n Bytes | 1 Byte | 8 Bytes | 1 Byte | n Bytes | +|--------------------------------------------------------------------| +| 0x00 | +| 1 Byte | ++--------------------------------------------------------------------+ + + * CLIENT_xxx options that this client supports: + +#define CLIENT_LONG_PASSWORD 1 /* new more secure passwords */ +#define CLIENT_FOUND_ROWS 2 /* Found instead of affected rows */ +#define CLIENT_LONG_FLAG 4 /* Get all column flags */ +#define CLIENT_CONNECT_WITH_DB 8 /* One can specify db on connect */ +#define CLIENT_NO_SCHEMA 16 /* Don't allow database.table.column */ +#define CLIENT_COMPRESS 32 /* Can use compression protocol */ +#define CLIENT_ODBC 64 /* Odbc client */ +#define CLIENT_LOCAL_FILES 128 /* Can use LOAD DATA LOCAL */ +#define CLIENT_IGNORE_SPACE 256 /* Ignore spaces before '(' */ +#define CLIENT_INTERACTIVE 1024 /* This is an interactive client */ +#define CLIENT_SSL 2048 /* Switch to SSL after handshake */ +#define CLIENT_IGNORE_SIGPIPE 4096 /* IGNORE sigpipes */ +#define CLIENT_TRANSACTIONS 8192 /* Client knows about transactions */ + + * max_allowed_packet for the client (in 'int3store' form) + * User Name - user to authenticate as. Is followed by a null byte. + * Crypted Password - password crypted with seed given in packet from + server, see scramble() in sql/password.c + * Database name (optional) - initial database to use once connected + Is followed by a null byte + +At the end of every client/server exchange there is either an 'OK' packet +or an 'ERROR' packet sent from the server. To determine whether a packet is +an 'OK' packet, or an 'ERROR' packet, check if the first byte (after the +header) is 0xFF. If it has the value of 0xFF, the packet is an 'ERROR' +packet. + + +* OK PACKET * + +For details, see sql/net_pkg.cc::send_ok() + ++-----------------------------------------------+ +| Header | No of Rows | Affected Rows | +| | 1 Byte | 1-9 Byte | +|-----------------------------------------------| +| ID (last_insert_id) | Status | Length | +| 1-9 Byte | 2 Byte | 1-9 Byte | +|-----------------------------------------------| +| Messagetext | +| n Byte | ++-----------------------------------------------+ + + * Number of rows, always 0 + * Affected rows + * ID (last_insert_id) - value for auto_increment column (if any) + * Status (usually 0) + +In general, in the MySQL protocol, fields in a packet that that +represent numeric data, such as lengths, that are labeled as '1-9' +bytes can be decoded by the following logic: + + If the first byte is '251', the + corresponding column value is NULL (only appropriate in + 'ROW DATA' packets). + + If the first byte is '252', the value stored can be read + from the following 2 bytes as a 16-bit integer. + + + If the first byte is '253' the value stored can be read + from the following 4 bytes as a 32-bit long integer + + + If the first byte is '254', the value stored can be read + from the following 8 bytes as a 64-byte long + + Otherwise (values 0-250), the value stored is the value of the + first byte itself. + + +If the OK-packet includes a message: + + * Length of message + * Message Text + + +* ERROR PACKET * + ++-----------------------------------------------+ +| Header | Status code | Error no | +| | 1 Byte | 2 Byte | +|-----------------------------------------------| +| Messagetext | | +| n Byte | | ++-----------------------------------------------+ + + * Status code (0xFF = ERROR) + * Error number (is only sent to 3.23 and newer clients) + * Error message text (ends at end of packet) + +Note that the error message is not null terminated. +The client code can however assume that the packet ends with a null +as my_net_read() will always add an end-null to all read packets to +make things easier for the client. + +Example: + +Packet dump of client connecting to server: + ++------------------------- Protocol Version (10) +| +| +---------------------- Server Version String (0x00 terminated) +| | +| | +0a 34 2e 30 2e 35 2d 62 . 4 . 0 . 5 - b +65 74 61 2d 6c 6f 67 00 e t a - l o g . +15 00 00 00 2b 5a 65 6c . . . . + Z e l + | | + | +------------ First 4 bytes of crypt seed + | + +------------------------ Thread Number + ++------------------------- Last 4 bytes of crypt seed +| +| +-------- CLIENT_XXX Options supported by server +| | +| +-+--+ +--- Server charset index +| | | | +6f 69 41 46 00 2c 28 08 o i A F . , ( . +02 00 00 00 00 00 00 00 . . . . . . . . +| | +| +---------------------- 0x00 padding begins +| ++------------------------- Server status (0x02 = + SERVER_STATUS_AUTOCOMMIT) + +00 00 00 00 00 00 00 00 . . . . . . . . + +* Client Authentication Response (Username 'test', no database + selected) * + + +--------------------- Packet Length (0x13 = 19 bytes) + | + | +--------------- Packet Sequence # + | | + | | +----------- CLIENT_XXX Options supported by client + | | ++---+---+ | +-+-+ +| | | | | +13 00 00 01 03 00 1e 00 . . . . . . . . +00 74 65 73 74 00 48 5e . t e s t . H ^ + | | | + +----+-----+ +------- Scrambled password, 0x00 terminated + | + +----------------- Username, 0x00 terminated + +57 4a 4e 41 4a 4e 00 00 W J N A J N . . +00 . + + +>From this point on, the server waits for 'commands' from the client +which include queries, database shutdown, quit, change user, etc (see +the COM_xxxx values in include/mysql_com.h for the latest +command codes). + +* * +* COMMAND PROCESSING * +* * + ++--------+ +--------+ +| Client | | Server | ++--------+ +--------+ + | | + | A command packet, with a command code, and string data | + | when appropriate (e.g. a query), (see the COM_xxxx values | + | in include/mysql_com.h for the command codes) | + | | + | --------------------------------------------------------------> | + | | + | A 'RESULT' packet if the command completed successfully, | + | an 'ERROR' packet if the command failed. 'RESULT' packets | + | take different forms (see the details following this chart) | + | depending on whether or not the command returns rows. | + | | + | <-------------------------------------------------------------- | + | | + | n 'FIELD PACKET's (if rows are returned) | + | | + | <-------------------------------------------------------------- | + | | + | 'LAST DATA' packet | + | | + | <-------------------------------------------------------------- | + | | + | n 'ROW PACKET's (if rows are returned) | + | | + | <-------------------------------------------------------------- | + | | + | 'LAST DATA' packet | + | | + | <-------------------------------------------------------------- | + | | + + +* Command Packet * + ++------------------------------------------------------+ +| Header | Command type | Query (if applicable) | +| | 1 Byte | n Bytes | ++------------------------------------------------------+ + + * Command type: (e.g.0x03 = query, see the COM_xxxx values in + include/mysql_com.h) + * Query (if applicable) + +Note that my_net_read() null-terminates all packets on the +receiving side of the channel to make it easier for the code +examining the packets. + +The current command codes are: + + 0x00 COM_SLEEP + 0x01 COM_QUIT + 0x02 COM_INIT_DB + 0x03 COM_QUERY + 0x04 COM_FIELD_LIST + 0x05 COM_CREATE_DB + 0x06 COM_DROP_DB + 0x07 COM_REFRESH + 0x08 COM_SHUTDOWN + 0x09 COM_STATISTICS + 0x0a COM_PROCESS_INFO + 0x0b COM_CONNECT + 0x0c COM_PROCESS_KILL + 0x0d COM_DEBUG + 0x0e COM_PING + 0x0f COM_TIME + 0x10 COM_DELAYED_INSERT + 0x11 COM_CHANGE_USER + 0x12 COM_BINLOG_DUMP + 0x13 COM_TABLE_DUMP + 0x14 COM_CONNECT_OUT + 0x15 COM_REGISTER_SLAVE + +* Result Packet * + +Result packet for a command returning _no_ rows: + ++-----------------------------------------------+ +| Header | Field Count | Affected Rows | +| | 1-9 Bytes | 1-9 Bytes | +|-----------------------------------------------| +| ID (last_insert_id) | Server Status | +| 1-9 Bytes | 2 Bytes | ++-----------------------------------------------+ + + * Field Count: Has value of '0' for commands returning _no_ rows + * Affected rows: Count of rows affected by INSERT/UPDATE/DELETE, etc. + * ID: value of auto_increment column in row (if any). 0 if + * Server Status: Usually 0 + +Result packet for a command returning rows: + ++-------------------------------+ +| Header | Field Count | +| | 1-9 Bytes | ++-------------------------------+ + + * Field Count: number of columns/fields in result set, + (packed with net_store_length() in sql/net_pkg.cc) + +This is followed by as many packets as the number of fields ('Field Count') +that contain the metadata for each column/field (see unpack_fields() in +libmysql/libmysql.c): + + +* FIELD PACKET * + ++-----------------------------------------------+ +| Header | Table Name | +| | length-coded-string | +|-----------------------------------------------| +| Field Name | +| length-code-string | +|-----------------------------------------------| +| Display length of field +| length-coded-binary (4 bytes) | +|-----------------------------------------------| +| Field Type (enum_field_types in mysql_com.h) | +| length-coded-binary (2 bytes) | +|-----------------------------------------------| +| Field Flags | Decimal Places| +| length-coded-binary (3 bytes) | 1 Byte | ++--------------+-------------+------------------+ + + * A length coded string is a string where we first have a packet + length (1-9 bytes, packed_with net_store_length()) followed + by a string. + * A length coded binary is a length (1 byte) followed by an integer + value in low-byte-first order. For the moment this type is always + fixed length in this packet. + + * Table Name - the name of the table the column comes from + * Field Name - the name of the column/field + * Display length of field - length of field + * Field Type - Type of field, see enum_field_types in + include/mysql_com.h + + Current field types are: + + 0x00 FIELD_TYPE_DECIMAL + 0x01 FIELD_TYPE_TINY + 0x02 FIELD_TYPE_SHORT + 0x03 FIELD_TYPE_LONG + 0x04 FIELD_TYPE_FLOAT + 0x05 FIELD_TYPE_DOUBLE + 0x06 FIELD_TYPE_NULL + 0x07 FIELD_TYPE_TIMESTAMP + 0x08 FIELD_TYPE_LONGLONG + 0x09 FIELD_TYPE_INT24 + 0x0a FIELD_TYPE_DATE + 0x0b FIELD_TYPE_TIME + 0x0c FIELD_TYPE_DATETIME + 0x0d FIELD_TYPE_YEAR + 0x0e FIELD_TYPE_NEWDATE + 0xf7 FIELD_TYPE_ENUM + 0xf8 FIELD_TYPE_SET + 0xf9 FIELD_TYPE_TINY_BLOB + 0xfa FIELD_TYPE_MEDIUM_BLOB + 0xfb FIELD_TYPE_LONG_BLOB + 0xfc FIELD_TYPE_BLOB + 0xfd FIELD_TYPE_VAR_STRING + 0xfe FIELD_TYPE_STRING + 0xff FIELD_TYPE_GEOMETRY + + * Field Flags - NOT_NULL_FLAG, PRI_KEY_FLAG, xxx_FLAG in + include/mysql_com.h + + +Note that the packet format in 4.1 has slightly changed to allow more values. + + +* ROW PACKET * + ++-----------------------------------------------+ +| Header | Data Length | Column Data | ....for each column +| | 1-9 Bytes | n Bytes | ++-----------------------------------------------+ + + * Data Length: (packed with net_store_length() in sql/net_pkg.cc) + + If 'Data Length' == 0, this is an 'ERROR PACKET'. + + * Column Data: String representation of data. MySQL always sends result set + data as strings. + +* LAST DATA PACKET * + +Packet length is < 9 bytes, and first byte is 0xFE + ++--------+ +| 0xFE | +| 1 Byte | ++--------+ + +Examples: + +*********** +* +* INITDB Command +* +*********** + +A client issuing an 'INITDB' (select the database to use) command, +followed by an 'OK' packet with no rows and no affected rows from +the server: + +* INITDB (select database to use) 'COMMAND' Packet * + + +--------------------- Packet Length (5 bytes) + | + | +--------------- Packet Sequence # + | | + | | +------------ Command # (INITDB = 0x02) + | | ++---+---+ | | +---------- Beginning of query data +| | | | | +05 00 00 00 02 74 65 73 . . . . . t e s +74 t + +* 'OK' Packet with no rows, and no rows affected * + + +--------------------- Packet Length (3 bytes) + | + | +--------------- Packet Sequence # + | | ++---+---+ | +| | | +03 00 00 01 00 00 00 . . . . . . . + + +*********** +* +* SELECT query example +* +*********** + +Client issuing a 'SELECT *' query on the following table: + + CREATE TABLE number_test (minBigInt bigint, + maxBigInt bigint, + testBigInt bigint) + +* 'COMMAND' Packet with QUERY (select ...) * + + +--------------------- Packet Length (26) + | + | +--------------- Packet Sequence # + | | + | | +------------ Command # (QUERY = 0x03) + | | ++---+---+ | | +---------- Beginning of query data +| | | | | +1a 00 00 00 03 53 45 4c . . . . . S E L +45 43 54 20 2a 20 66 72 E C T . * . f r +6f 6d 20 6e 75 6d 62 65 o m . n u m b e +72 5f 74 65 73 74 r _ t e s t + + +and receiving an 'OK' packet with a 'FIELD COUNT' of 3 + + +* 'OK' Packet with 3 fields * + + +--------------------- Packet Length (3 bytes) + | + | +--------------- Packet Sequence # + | | ++---+---+ | +| | | +01 00 00 01 03 . . . . . + +Followed immediately by 3 'FIELD' Packets. Note, the individual packets +are delimitted by =======, so that all fields can be annotated in the first +'FIELD' packet example: + +============================================================= + + +--------------------- Packet Length (0x1f = 31 bytes) + | + | +--------------- Packet Sequence # + | | + | | +------------ Block Length (0x0b = 11 bytes) + | | | ++---+---+ | | +--------- Table Name (11 bytes long) +| | | | | +1f 00 00 02 0b 6e 75 6d . . . . . n u m +62 65 72 5f 74 65 73 74 b e r _ t e s t + + +------------------------ Block Length (9 bytes) + | + | +--------------------- Column Name (9 bytes long) + | | +09 6d 69 6e 42 69 67 49 . m i n B i g I +6e 74 03 14 00 00 01 08 n t . . . . . . + | | | | | + | +---+---+ | +--- Field Type (0x08 = FIELD_TYPE_LONGLONG) + | | | + | | +------ Block Length (1) + | | + | +--------------- Display Length (0x14 = 20 chars) + | + +------------------ Block Length (3) + + +------------------------ Block Length (2) + | + | +-------------------- Field Flags (0 - no flags set) + | | + | +---+ +--------------- Decimal Places (0) + | | | | +02 00 00 00 . . . . + +============================================================= + +'FIELD' packet for the 'number_Test.maxBigInt' column + +1f 00 00 03 0b 6e 75 6d . . . . . n u m +62 65 72 5f 74 65 73 74 b e r _ t e s t +09 6d 61 78 42 69 67 49 . m a x B i g I +6e 74 03 14 00 00 01 08 n t . . . . . . +02 00 00 00 . . . . + +============================================================= + +'FIELD' packet for the 'number_test.testBigInt' column + +20 00 00 04 0b 6e 75 6d . . . . . n u m +62 65 72 5f 74 65 73 74 b e r _ t e s t +0a 74 65 73 74 42 69 67 . t e st B i g +49 6e 74 03 14 00 00 01 I n t . . . . . +08 02 00 00 00 . . . . . +============================================================= + +Followed immediately by one 'LAST DATA' packet: + +fe 00 . . + +Followed immediately by 'n' row packets (in this case, only +one packet is sent from the server, for simplicity's sake): + + + +--------------------- Packet Length (0x52 = 82 bytes) + | + | +--------------- Packet Sequence # + | | + | | +------------ Data Length (0x14 = 20 bytes) + | | | ++---+---+ | | +--------- String Data '-9223372036854775808' +| | | | | (repeat Data Length/Data sequence) + +52 00 00 06 14 2d 39 32 . . . . . - 9 2 +32 33 33 37 32 30 33 36 2 3 3 7 2 0 3 6 +38 35 34 37 37 35 38 30 8 5 4 7 7 5 8 0 +38 13 39 32 32 33 33 37 8 . 9 2 2 3 3 7 +32 30 33 36 38 35 34 37 2 0 3 6 8 5 4 7 +37 35 38 30 37 0a 36 31 7 5 8 0 7 . 6 1 +34 37 34 38 33 36 34 37 4 7 4 8 3 6 4 7 + +Followed immediately by one 'LAST DATA' packet: + +fe 00 . . + @c The Index was empty, and ugly, so I removed it. (jcole, Sep 7, 2000) @@ -688,6 +1650,331 @@ Date 03 0A 00 00 |01 0A |03 00 00 00 @c @printindex fn +@node 4.1 protocol,,, +@subchapter MySQL 4.1 protocol + +@node 4.1 protocol changes,,, +@section Changes to 4.0 protocol in 4.1 + +All basic packet handling is identical to 4.0. When communication +with an old 4.0 or 3.x client we will use the old protocol. + +The new things that we support with 4.1 are: + +@itemize @bullet +@item +Warnings +@item +Prepared statements +@item +Binary protocol (will be faster than the current protocol that +converts everything to strings) +@end itemize + + +What has changed in 4.1 are: + +@itemize @bullet +@item +A lot of new field information (database, real table name etc) +@item +The 'ok' packet has more status fields +@item +The 'end' packet (send last for each result set) now contains some +extra information +@item +New protocol for prepared statements. In this case all parameters and +results will sent as binary (low-byte-first). +@end itemize + + +@node 4.1 field packet,,, +@section 4.1 field description packet + +The field description packet is sent as a response to a query that +contains a result set. It can be distinguished from a ok packet by +the fact that the first byte can't be 0 for a field packet. +@xref {4.1 ok packet}. + +The header packet has the following structure: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 1-9 @tab Number of columns in result set (never 0) +@item 1-9 @tab Extra information sent be some command (SHOW COLUMNS +uses this to send the number of rows in the table) +@end multitable + +This packet is always followed by a field description set. +@xref{4.1 field desc}. + +@node 4.1 field desc,,, +@section 4.1 field description result set + +The field description result set contains the meta info for a result set. + +@multitable @columnfractions .20 .80 +@item Type @tab Comment +@item string @tab Database name +@item string @tab Table name alias (or table name if no alias) +@item string @tab Real table name +@item string @tab Alias for column name (or column name if not used) +@item 3 byte int @tab Length of column definition +@item 1 byte int @tab Enum value for field type +@item 3 byte int @tab 2 byte column flags (NOT_NULL_FLAG etc..) + 1 byte number of decimals. +@item string int @tab Default value, only set when using mysql_list_fields(). +@end multitable + + +@node 4.1 ok packet,,, +@section 4.1 ok packet + +The ok packet is the first that is sent as an response for a query +that didn't return a result set. + +The ok packet has the following structure: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 1 @tab 0 ; Marker for ok packet +@item 1-9 @tab Affected rows +@item 1-9 @tab Last insert id (0 if one wasn't used) +@item 2 @tab Server status; Can be used by client to check if we are inside an transaction +@item 2 @tab Warning count +@item 1-9 @tab Message length (optional) +@item xxx @tab Message (optional) +@end multitable + +Size 1-9 means that the parameter is packed in to 1-9 bytes depending on +the value. (See function sql/net_pkg.cc::net_store_length). + +The message is optional. For example for multi line INSERT it +contains a string for how many rows was inserted / deleted. + + +@node 4.1 end packet,,, +@section 4.1 end packet + +The end packet is sent as the last packet for + +@itemize @bullet +@item +End of field information +@item +End of parameter type information +@item +End of result set +@end itemize + +The end packet has the following structure: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 1 @tab 254 ; Marker for EOF packet +@item 2 @tab Warning count +@item 2 @tab Status flags (For flags like SERVER_STATUS_MORE_RESULTS) +@end multitable + +Note that a normal packet may start with byte 254, which means +'length stored in 9 bytes'. One can different between these cases +by checking the packet length < 9 bytes (in which case it's and end +packet). + + +@node 4.1 error packet +@section 4.1 error packet. + +The error packet is sent when something goes wrong. +The error packet has the following structure: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 1 @tab 255 Error packet marker +@item 2 @tab Error code +@item 1-255 @tab Null terminated error message +@end multitable + +The client/server protocol is designed in such a way that a packet +can only start with 255 if it's an error packet. + + +@node 4.1 prep init,,, +@section 4.1 prepared statement init packet + +This is the return packet when one sends a query with the COM_PREPARE +command. + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 4 @tab Statement handler id +@item 2 @tab Number of columns in result set +@item 2 @tab Number of parameters in query +@end multitable + +After this, there is a packet that contains the following for each +parameter in the query: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 2 @tab Enum value for field type. (MYSQL_TYPE_UNKNOWN if not known) +@item 2 @tab 2 byte column flags (NOT_NULL_FLAG etc) +@item 1 @tab Number of decimals +@item 4 @tab Max column length. +@end itemize + +Note that the above is not yet in 4.1 but will be added this month. + +As MySQL can have a parameter 'anywhere' it will in many cases not be +able to provide the optimal information for all parameters. + +If number of columns, in the header packet, is not 0 then the +prepared statement will contain a result set. In this case the packet +is followed by a field description result set. @xref{4.1 field descr}. + + +@node 4.1 long data,,, +@section 4.1 long data handling + +This is used by mysql_send_long_data() to set any parameter to a string +value. One can call mysql_send_long_data() multiple times for the +same parameter; The server will concatenate the results to a one big +string. + +The server will not require an end packet for the string. +mysql_send_long_data() is responsible updating a flag that all data +has been sent. (Ie; That the last call to mysql_send_long_data() has +the 'last_data' flag set). + +This packet is sent from client -> server: + +@multitable @columnfractions .10 .90 +@item Size @tab Comment +@item 4 @tab Statement handler +@item 2 @tab Parameter number +@item 2 @tab Type of parameter (not used at this point) +@item # @tab data (Rest of packet) +@end itemize + +The server will NOT send an @code{ok} or @code{error} packet in +responce for this. If there is any errors (like to big string), one +will get the error when calling execute. + +@node 4.1 execute,,, +@section 4.1 execute + +On execute we send all parameters to the server in a COM_EXECUTE +packet. + +The packet contains the following information: + +@multitable @columnfractions .30 .70 +@item Size @tab Comment +@item (param_count+9)/8 @tab Null bit map (2 bits reserved for protocol) +@item 1 @tab new_parameter_bound flag. Is set to 1 for first +execute or if one has rebound the parameters. +@item 2*param_count @tab Type of parameters (only given if new_parameter_bound flag is 1) +@item # @tab Parameter data, repeated for each parameter that are +NOT NULL and not used with mysql_send_long_data(). +@end itemize + +The null-bit-map is for all parameters (including parameters sent with +'mysql_send_long_data). If parameter 0 is NULL, then bit 0 in the +null-bit-map should be 1 (ie: first byte should be 1) + +The parameters are stored the following ways: + +@multitable @columnfractions .20 .10 .70 +@item Type @tab Size @tab Comment +@item tinyint @tab 1 @tab One byte integer +@item short @tab 2 @tab +@item int @tab 4 @tab +@item longlong @tab 8 @tab +@item float @tab 4 @tab +@item double @tab 8 @tab +@item string @tab 1-9 + # @tab Packed string length + string +@end multitable + +The result for this will be either an ok packet or a binary result +set. + +@node 4.1 binary result,,, +@section 4.1 binary result set + +A binary result are sent the following way. + +For each result row: + +@itemize +@item +null bit map with first two bits set to 01 (bit 0,1 value 1) +@item +parameter data, repeated for each not null result column. +@end itemize + +The idea with the reserving two bits in the null map is that we can +use standard error (first byte 255) and ok packets (first byte 0) +to end a result sets. + +Except that the null-bit-map is shifted two steps, the server is +sending the data to the client the same way that the server is sending +bound parameters to the client. The server is always sending the data +as type given for 'column type' for respective column. It's up to the +client to convert the parameter to the requested type. + +DATETIME, DATE and TIME are sent to the server in a binary format as follows: + +@multitable @columnfractions .20 .10 .70 +@item Type @tab Size @tab Comment +@item date @tab 1 + 0-11 @tab Length + 2 byte year, 1 byte MMDDHHMMSS, 4 byte billionth of a second +@item datetime @tab 1 + 0-11 @tab Length + 2 byte year, 1 byte MMDDHHMMSS, 4 byte billionth of a second +@item time @tab 1 + 0-14 @tab Length + sign (0 = pos, 1= neg), 4 byte days, 1 byte HHMMDD, 4 byte billionth of a second +@end multitable + +The first byte is a length byte and then comes all parameters that are +not 0. (Always counted from the beginning). + +@node Fulltext Search, , protocol, Top +@chapter Fulltext Search in MySQL + +Hopefully, sometime there will be complete description of +fulltext search algorithms. +Now it's just unsorted notes. + +@menu +* Weighting in boolean mode:: +@end menu + +@node Weighting in boolean mode, , Fulltext Search, Fulltext Search +@section Weighting in boolean mode + +The basic idea is as follows: in expression +@code{A or B or (C and D and E)}, either @code{A} or @code{B} alone +is enough to match the whole expression. While @code{C}, +@code{D}, and @code{E} should @strong{all} match. So it's +reasonable to assign weight 1 to @code{A}, @code{B}, and +@code{(C and D and E)}. And @code{C}, @code{D}, and @code{E} +should get a weight of 1/3. + +Things become more complicated when considering boolean +operators, as used in MySQL FTB. Obvioulsy, @code{+A +B} +should be treated as @code{A and B}, and @code{A B} - +as @code{A or B}. The problem is, that @code{+A B} can @strong{not} +be rewritten in and/or terms (that's the reason why this - extended - +set of operators was chosen). Still, aproximations can be used. +@code{+A B C} can be approximated as @code{A or (A and (B or C))} +or as @code{A or (A and B) or (A and C) or (A and B and C)}. +Applying the above logic (and omitting mathematical +transformations and normalization) one gets that for +@code{+A_1 +A_2 ... +A_N B_1 B_2 ... B_M} the weights +should be: @code{A_i = 1/N}, @code{B_j=1} if @code{N==0}, and, +otherwise, in the first rewritting approach @code{B_j = 1/3}, +and in the second one - @code{B_j = (1+(M-1)*2^M)/(M*(2^(M+1)-1))}. + +The second expression gives somewhat steeper increase in total +weight as number of matched B's increases, because it assigns +higher weights to individual B's. Also the first expression in +much simplier. So it is the first one, that is implemented in MySQL. + @summarycontents @contents |