-rw-r--r-- | Docs/internals.texi | 406 | ||||
-rw-r--r-- | VC++Files/mysql.dsw | 3 |
2 files changed, 222 insertions, 187 deletions
diff --git a/Docs/internals.texi b/Docs/internals.texi index 66d04b006ff..270fe9e2249 100644 --- a/Docs/internals.texi +++ b/Docs/internals.texi @@ -43,18 +43,18 @@ END-INFO-DIR-ENTRY @page @end titlepage -@node Top, caching, (dir), (dir) +@node Top, coding guidelines, (dir), (dir) @ifinfo This is a manual about @strong{MySQL} internals. @end ifinfo @menu +* coding guidelines:: Coding Guidelines * caching:: How MySQL Handles Caching -* join_buffer_size:: +* join_buffer_size:: * flush tables:: How MySQL Handles @code{FLUSH TABLES} -* filesort:: How MySQL Does Sorting (@code{filesort}) -* coding guidelines:: Coding Guidelines +* Algorithms:: * mysys functions:: Functions In The @code{mysys} Library * DBUG:: DBUG Tags To Use * protocol:: MySQL Client/Server Protocol @@ -67,7 +67,167 @@ This is a manual about @strong{MySQL} internals. @end menu -@node caching, join_buffer_size, Top, Top +@node coding guidelines, caching, Top, Top +@chapter Coding Guidelines + +@itemize @bullet + +@item +We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management. + +@item +You should use the @strong{MySQL} 4.0 source for all developments. + +@item +If you have any questions about the @strong{MySQL} source, you can post these +to @email{dev-public@@mysql.com} and we will answer them. Please +remember to not use this internal email list in public! + +@item +Try to write code in a lot of black boxes that can be reused or use at +least a clean, easy to change interface. + +@item +Reuse code; There is already a lot of algorithms in MySQL for list handling, +queues, dynamic and hashed arrays, sorting, etc. that can be reused. + +@item +Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/ +@code{my_malloc()} that you can find in the @code{mysys} library instead +of the direct system calls; This will make your code easier to debug and +more portable. 
+ +@item +Try to always write optimized code, so that you don't have to +go back and rewrite it a couple of months later. It's better to +spend 3 times as much time designing and writing an optimal function than +having to do it all over again later on. + +@item +Avoid CPU wasteful code, even where it does not matter, so that +you will not develop sloppy coding habits. + +@item +If you can write it in fewer lines, do it (as long as the code will not +be slower or much harder to read). + +@item +Don't use two commands on the same line. + +@item +Do not check the same pointer for @code{NULL} more than once. + +@item +Use long function and variable names in English. This makes your code +easier to read. + +@item +Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_} +rather than dancing SHIFT to seperate words in identifiers). + +@item +Think assembly - make it easier for the compiler to optimize your code. + +@item +Comment your code when you do something that someone else may think +is not ``trivial''. + +@item +Use @code{libstring} functions (in the @file{strings} directory) +instead of standard @code{libc} string functions whenever possible. + +@item +Avoid using @code{malloc()} (its REAL slow); For memory allocations +that only need to live for the lifetime of one thread, one should use +@code{sql_alloc()} instead. + +@item +Before making big design decisions, please first post a summary of +what you want to do, why you want to do it, and how you plan to do +it. This way we can easily provide you with feedback and also +easily discuss it thoroughly if some other developer thinks there is better +way to do the same thing! + +@item +Class names start with a capital letter. + +@item +Structure types are @code{typedef}'ed to an all-caps identifier. + +@item +Any @code{#define}'s are in all-caps. + +@item +Matching @samp{@{} are in the same column. 
+ +@item +Put the @samp{@{} after a @code{switch} on the same line, as this gives +better overall indentation for the switch statement: + +@example +switch (arg) @{ +@end example + +@item +In all other cases, @samp{@{} and @samp{@}} should be on their own line, except +if there is nothing inside @samp{@{} and @samp{@}}. + +@item +Have a space after @code{if} + +@item +Put a space after @samp{,} for function arguments + +@item +Functions return @samp{0} on success, and non-zero on error, so you can do: + +@example +if(a() || b() || c()) @{ error("something went wrong"); @} +@end example + +@item +Using @code{goto} is okay if not abused. + +@item +Avoid default variable initalizations, use @code{LINT_INIT()} if the +compiler complains after making sure that there is really no way +the variable can be used uninitialized. + +@item +Do not instantiate a class if you do not have to. + +@item +Use pointers rather than array indexing when operating on strings. + +@end itemize + +Suggested mode in emacs: + +@example +(load "cc-mode") +(setq c-mode-common-hook '(lambda () + (turn-on-font-lock) + (setq comment-column 48))) +(setq c-style-alist + (cons + '("MY" + (c-basic-offset . 2) + (c-comment-only-line-offset . 0) + (c-offsets-alist . ((statement-block-intro . +) + (knr-argdecl-intro . 0) + (substatement-open . 0) + (label . -) + (statement-cont . +) + (arglist-intro . c-lineup-arglist-intro-after-paren) + (arglist-close . c-lineup-arglist) + )) + ) + c-style-alist)) +(c-set-style "MY") +(setq c-default-style "MY") +@end example + +@node caching, join_buffer_size, coding guidelines, Top @chapter How MySQL Handles Caching @strong{MySQL} has the following caches: @@ -181,7 +341,7 @@ same algorithm described above to handle it. 
(In other words, we store the same row combination several times into different buffers) @end itemize -@node flush tables, filesort, join_buffer_size, Top +@node flush tables, Algorithms, join_buffer_size, Top @chapter How MySQL Handles @code{FLUSH TABLES} @itemize @bullet @@ -226,8 +386,19 @@ After this it will give other threads a chance to open the same tables. @end itemize -@node filesort, coding guidelines, flush tables, Top -@chapter How MySQL Does Sorting (@code{filesort}) +@node Algorithms, mysys functions, flush tables, Top +@chapter Different algoritms used in MySQL + +MySQL uses a lot of different algorithms. This chapter tries to describe +some of these: + +@menu +* filesort:: +* bulk-insert:: +@end menu + +@node filesort, bulk-insert, Algorithms, Algorithms +@section How MySQL Does Sorting (@code{filesort}) @itemize @bullet @@ -266,169 +437,20 @@ and then we read the rows in the sorted order into a row buffer @end itemize +@node bulk-insert, , filesort, Algorithms +@section Bulk insert -@node coding guidelines, mysys functions, filesort, Top -@chapter Coding Guidelines - -@itemize @bullet - -@item -We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management. - -@item -You should use the @strong{MySQL} 4.0 source for all developments. - -@item -If you have any questions about the @strong{MySQL} source, you can post these -to @email{dev-public@@mysql.com} and we will answer them. Please -remember to not use this internal email list in public! - -@item -Try to write code in a lot of black boxes that can be reused or use at -least a clean, easy to change interface. - -@item -Reuse code; There is already a lot of algorithms in MySQL for list handling, -queues, dynamic and hashed arrays, sorting, etc. that can be reused. 
- -@item -Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/ -@code{my_malloc()} that you can find in the @code{mysys} library instead -of the direct system calls; This will make your code easier to debug and -more portable. - -@item -Try to always write optimized code, so that you don't have to -go back and rewrite it a couple of months later. It's better to -spend 3 times as much time designing and writing an optimal function than -having to do it all over again later on. - -@item -Avoid CPU wasteful code, even where it does not matter, so that -you will not develop sloppy coding habits. - -@item -If you can write it in fewer lines, do it (as long as the code will not -be slower or much harder to read). - -@item -Don't use two commands on the same line. - -@item -Do not check the same pointer for @code{NULL} more than once. - -@item -Use long function and variable names in English. This makes your code -easier to read. - -@item -Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_} -rather than dancing SHIFT to seperate words in identifiers). - -@item -Think assembly - make it easier for the compiler to optimize your code. - -@item -Comment your code when you do something that someone else may think -is not ``trivial''. - -@item -Use @code{libstring} functions (in the @file{strings} directory) -instead of standard @code{libc} string functions whenever possible. - -@item -Avoid using @code{malloc()} (its REAL slow); For memory allocations -that only need to live for the lifetime of one thread, one should use -@code{sql_alloc()} instead. - -@item -Before making big design decisions, please first post a summary of -what you want to do, why you want to do it, and how you plan to do -it. This way we can easily provide you with feedback and also -easily discuss it thoroughly if some other developer thinks there is better -way to do the same thing! - -@item -Class names start with a capital letter. 
- -@item -Structure types are @code{typedef}'ed to an all-caps identifier. - -@item -Any @code{#define}'s are in all-caps. - -@item -Matching @samp{@{} are in the same column. - -@item -Put the @samp{@{} after a @code{switch} on the same line, as this gives -better overall indentation for the switch statement: - -@example -switch (arg) @{ -@end example - -@item -In all other cases, @samp{@{} and @samp{@}} should be on their own line, except -if there is nothing inside @samp{@{} and @samp{@}}. - -@item -Have a space after @code{if} - -@item -Put a space after @samp{,} for function arguments - -@item -Functions return @samp{0} on success, and non-zero on error, so you can do: - -@example -if(a() || b() || c()) @{ error("something went wrong"); @} -@end example - -@item -Using @code{goto} is okay if not abused. - -@item -Avoid default variable initalizations, use @code{LINT_INIT()} if the -compiler complains after making sure that there is really no way -the variable can be used uninitialized. - -@item -Do not instantiate a class if you do not have to. - -@item -Use pointers rather than array indexing when operating on strings. - -@end itemize - -Suggested mode in emacs: - -@example -(load "cc-mode") -(setq c-mode-common-hook '(lambda () - (turn-on-font-lock) - (setq comment-column 48))) -(setq c-style-alist - (cons - '("MY" - (c-basic-offset . 2) - (c-comment-only-line-offset . 0) - (c-offsets-alist . ((statement-block-intro . +) - (knr-argdecl-intro . 0) - (substatement-open . 0) - (label . -) - (statement-cont . +) - (arglist-intro . c-lineup-arglist-intro-after-paren) - (arglist-close . c-lineup-arglist) - )) - ) - c-style-alist)) -(c-set-style "MY") -(setq c-default-style "MY") -@end example +Logic behind bulk insert optimisation is simple. +Instead of writing each key value to b-tree (that is to keycache, but +bulk insert code doesn't know about keycache) keys are stored in +balanced binary (red-black) tree, in memory. 
When this tree reaches its +memory limit it's writes all keys to disk (to keycache, that is). But +as key stream coming from the binary tree is already sorted inserting +goes much faster, all the necessary pages are already in cache, disk +access is minimized, etc. -@node mysys functions, DBUG, coding guidelines, Top +@node mysys functions, DBUG, Algorithms, Top @chapter Functions In The @code{mysys} Library Functions in @code{mysys}: (For flags see @file{my_sys.h}) @@ -624,6 +646,16 @@ Print query. * fieldtype codes:: * protocol functions:: * protocol version 2:: +* 4.1 protocol changes:: +* 4.1 field packet:: +* 4.1 field desc:: +* 4.1 ok packet:: +* 4.1 end packet:: +* 4.1 error packet:: +* 4.1 prep init:: +* 4.1 long data:: +* 4.1 execute:: +* 4.1 binary result:: @end menu @node raw packet without compression, raw packet with compression, protocol, protocol @@ -690,7 +722,7 @@ is the header of the packet. @end menu -@node ok packet, error packet, basic packets, basic packets, basic packets +@node ok packet, error packet, basic packets, basic packets @subsection OK Packet For details, see @file{sql/net_pkg.cc::send_ok()}. @@ -720,7 +752,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}. @end table -@node error packet, , ok packet, basic packets, basic packets +@node error packet, , ok packet, basic packets @subsection Error Packet @example @@ -835,7 +867,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}. 
n data @end example -@node fieldtype codes, protocol functions, communication +@node fieldtype codes, protocol functions, communication, protocol @section Fieldtype Codes @example @@ -859,7 +891,7 @@ Time 03 08 00 00 |01 0B |03 00 00 00 Date 03 0A 00 00 |01 0A |03 00 00 00 @end example -@node protocol functions, protocol version 2, fieldtype codes +@node protocol functions, protocol version 2, fieldtype codes, protocol @section Functions used to implement the protocol @c This should be merged with the above one and changed to texi format @@ -971,7 +1003,7 @@ client. If this is equal to the new message the client sends to the server then the password is accepted. @end example -@node protocol version 2, 4.1 protocol changes, protocol functions +@node protocol version 2, 4.1 protocol changes, protocol functions, protocol @section Another description of the protocol @c This should be merged with the above one and changed to texi format. @@ -1664,7 +1696,7 @@ fe 00 . . @c @node 4.1 protocol,,, @c @chapter MySQL 4.1 protocol -@node 4.1 protocol changes, 4.1 field packet, protocol version 2 +@node 4.1 protocol changes, 4.1 field packet, protocol version 2, protocol @section Changes to 4.0 protocol in 4.1 All basic packet handling is identical to 4.0. When communication @@ -1699,7 +1731,7 @@ results will sent as binary (low-byte-first). @end itemize -@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes +@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes, protocol @section 4.1 field description packet The field description packet is sent as a response to a query that @@ -1719,7 +1751,7 @@ uses this to send the number of rows in the table) This packet is always followed by a field description set. @xref{4.1 field desc}. -@node 4.1 field desc, 4.1 ok packet, 4.1 field packet +@node 4.1 field desc, 4.1 ok packet, 4.1 field packet, protocol @section 4.1 field description result set The field description result set contains the meta info for a result set. 
@@ -1737,7 +1769,7 @@ The field description result set contains the meta info for a result set. @end multitable -@node 4.1 ok packet, 4.1 end packet, 4.1 field desc +@node 4.1 ok packet, 4.1 end packet, 4.1 field desc, protocol @section 4.1 ok packet The ok packet is the first that is sent as an response for a query @@ -1763,7 +1795,7 @@ The message is optional. For example for multi line INSERT it contains a string for how many rows was inserted / deleted. -@node 4.1 end packet, 4.1 error packet, 4.1 ok packet +@node 4.1 end packet, 4.1 error packet, 4.1 ok packet, protocol @section 4.1 end packet The end packet is sent as the last packet for @@ -1792,7 +1824,7 @@ by checking the packet length < 9 bytes (in which case it's and end packet). -@node 4.1 error packet, 4.1 prep init, 4.1 end packet +@node 4.1 error packet, 4.1 prep init, 4.1 end packet, protocol @section 4.1 error packet. The error packet is sent when something goes wrong. @@ -1809,7 +1841,7 @@ The client/server protocol is designed in such a way that a packet can only start with 255 if it's an error packet. -@node 4.1 prep init, 4.1 long data, 4.1 error packet +@node 4.1 prep init, 4.1 long data, 4.1 error packet, protocol @section 4.1 prepared statement init packet This is the return packet when one sends a query with the COM_PREPARE @@ -1843,7 +1875,7 @@ prepared statement will contain a result set. In this case the packet is followed by a field description result set. @xref{4.1 field desc}. -@node 4.1 long data, 4.1 execute, 4.1 prep init +@node 4.1 long data, 4.1 execute, 4.1 prep init, protocol @section 4.1 long data handling This is used by mysql_send_long_data() to set any parameter to a string @@ -1870,7 +1902,7 @@ The server will NOT send an @code{ok} or @code{error} packet in responce for this. If there is any errors (like to big string), one will get the error when calling execute. 
-@node 4.1 execute, 4.1 binary result, 4.1 long data +@node 4.1 execute, 4.1 binary result, 4.1 long data, protocol @section 4.1 execute On execute we send all parameters to the server in a COM_EXECUTE @@ -1908,7 +1940,7 @@ The parameters are stored the following ways: The result for this will be either an ok packet or a binary result set. -@node 4.1 binary result, , 4.1 execute +@node 4.1 binary result, , 4.1 execute, protocol @section 4.1 binary result set A binary result are sent the following way. @@ -2384,7 +2416,7 @@ work for different record formats are: /myisam/mi_statrec.c, /myisam/mi_dynrec.c, and /myisam/mi_packrec.c. @* -@node InnoDB Record Structure,InnoDB Page Structure,MyISAM Record Structure,Top +@node InnoDB Record Structure, InnoDB Page Structure, MyISAM Record Structure, Top @chapter InnoDB Record Structure This page contains: @@ -2690,7 +2722,7 @@ shorter because the NULLs take no space. The most relevant InnoDB source-code files are rem0rec.c, rem0rec.ic, and rem0rec.h in the rem ("Record Manager") directory. -@node InnoDB Page Structure,Files in MySQL Sources,InnoDB Record Structure,Top +@node InnoDB Page Structure, Files in MySQL Sources, InnoDB Record Structure, Top @chapter InnoDB Page Structure InnoDB stores all records inside a fixed-size unit which is commonly called a @@ -3121,7 +3153,7 @@ header. The most relevant InnoDB source-code files are page0page.c, page0page.ic, and page0page.h in \page directory. -@node Files in MySQL Sources,Files in InnoDB Sources,InnoDB Page Structure,Top +@node Files in MySQL Sources, Files in InnoDB Sources, InnoDB Page Structure, Top @chapter Annotated List Of Files in the MySQL Source Code Distribution This is a description of the files that you get when you download the @@ -4942,7 +4974,7 @@ The MySQL program that uses zlib is \mysys\my_compress.c. The use is for packet compression. The client sends messages to the server which are compressed by zlib. See also: \sql\net_serv.cc. 
-@node Files in InnoDB Sources,,Files in MySQL Sources,Top +@node Files in InnoDB Sources, , Files in MySQL Sources, Top @chapter Annotated List Of Files in the InnoDB Source Code Distribution ERRATUM BY HEIKKI TUURI (START) diff --git a/VC++Files/mysql.dsw b/VC++Files/mysql.dsw index eef82588fa8..9903c91ba1b 100644 --- a/VC++Files/mysql.dsw +++ b/VC++Files/mysql.dsw @@ -605,6 +605,9 @@ Package=<5> Package=<4> {{{ + Begin Project Dependency + Project_Dep_Name strings + End Project Dependency }}} ############################################################################### |
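Note on the new "Bulk insert" section in Docs/internals.texi: the text describes collecting keys in memory and writing them out in sorted order once a memory limit is reached. A minimal sketch of that idea in C follows. It is not the MyISAM implementation. The names BULK_BUFFER, BULK_LIMIT, bulk_init(), bulk_insert(), bulk_flush(), write_key() and index_out are all hypothetical, the keys are plain integers, and a small array sorted with qsort() at flush time stands in for the red-black tree the section mentions.

```c
#include <stdlib.h>

#define BULK_LIMIT 4                    /* hypothetical memory limit, in keys */
#define INDEX_MAX  64                   /* size of the stand-in index */

/* Stand-in for the key cache / b-tree: keys in the order they were written */
int index_out[INDEX_MAX];
size_t index_len= 0;

typedef struct st_bulk_buffer
{
  int keys[BULK_LIMIT];                 /* keys waiting to be written */
  size_t used;                          /* number of buffered keys */
  size_t flush_count;                   /* number of non-empty flushes so far */
} BULK_BUFFER;

static int compare_keys(const void *a, const void *b)
{
  int x= *(const int*) a;
  int y= *(const int*) b;
  return (x > y) - (x < y);
}

/* Hypothetical replacement for the real "write one key to the index" call */
void write_key(int key)
{
  if (index_len < INDEX_MAX)
    index_out[index_len++]= key;
}

void bulk_init(BULK_BUFFER *buf)
{
  buf->used= 0;
  buf->flush_count= 0;
}

/* Write all buffered keys in ascending order, then empty the buffer */
void bulk_flush(BULK_BUFFER *buf)
{
  size_t i;
  if (!buf->used)
    return;
  qsort(buf->keys, buf->used, sizeof(buf->keys[0]), compare_keys);
  for (i= 0; i < buf->used; i++)
    write_key(buf->keys[i]);
  buf->used= 0;
  buf->flush_count++;
}

/* Buffer one key; flush first if the buffer has reached its limit */
void bulk_insert(BULK_BUFFER *buf, int key)
{
  if (buf->used == BULK_LIMIT)
    bulk_flush(buf);
  buf->keys[buf->used++]= key;
}
```

A caller would run bulk_init() once, call bulk_insert() for every key, and finish with bulk_flush() so the last partial batch is written too. Each batch reaches write_key() in ascending order, which is the property the section credits for the faster inserts. Order across batches is not guaranteed.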