author: unknown <monty@mashka.mysql.fi> 2003-04-29 00:14:17 +0300
committer: unknown <monty@mashka.mysql.fi> 2003-04-29 00:14:17 +0300
commit: 3d12a41d67481a04a218c38c1ac90a2abbe3828a
tree: 1432449356385662f41f8737add8bc8bbc733850
parent: 749c0fe97f57aca710bf0a9033f7ed5c0afd80d9
Added missing dependency to VC++ project file
Docs/internals.texi:
  Moved code guidelines first
  Fixed texinfo nodes & menus
VC++Files/mysql.dsw:
  Added missing dependency
Diffstat (limited to 'Docs/internals.texi')
-rw-r--r-- Docs/internals.texi 406
1 files changed, 219 insertions, 187 deletions
diff --git a/Docs/internals.texi b/Docs/internals.texi
index 66d04b006ff..270fe9e2249 100644
--- a/Docs/internals.texi
+++ b/Docs/internals.texi
@@ -43,18 +43,18 @@ END-INFO-DIR-ENTRY
@page
@end titlepage
-@node Top, caching, (dir), (dir)
+@node Top, coding guidelines, (dir), (dir)
@ifinfo
This is a manual about @strong{MySQL} internals.
@end ifinfo
@menu
+* coding guidelines:: Coding Guidelines
* caching:: How MySQL Handles Caching
-* join_buffer_size::
+* join_buffer_size::
* flush tables:: How MySQL Handles @code{FLUSH TABLES}
-* filesort:: How MySQL Does Sorting (@code{filesort})
-* coding guidelines:: Coding Guidelines
+* Algorithms::
* mysys functions:: Functions In The @code{mysys} Library
* DBUG:: DBUG Tags To Use
* protocol:: MySQL Client/Server Protocol
@@ -67,7 +67,167 @@ This is a manual about @strong{MySQL} internals.
@end menu
-@node caching, join_buffer_size, Top, Top
+@node coding guidelines, caching, Top, Top
+@chapter Coding Guidelines
+
+@itemize @bullet
+
+@item
+We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
+
+@item
+You should use the @strong{MySQL} 4.0 source for all developments.
+
+@item
+If you have any questions about the @strong{MySQL} source, you can post these
+to @email{dev-public@@mysql.com} and we will answer them. Please
+remember to not use this internal email list in public!
+
+@item
+Try to write code in a lot of black boxes that can be reused or use at
+least a clean, easy to change interface.
+
+@item
+Reuse code; There is already a lot of algorithms in MySQL for list handling,
+queues, dynamic and hashed arrays, sorting, etc. that can be reused.
+
+@item
+Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
+@code{my_malloc()} that you can find in the @code{mysys} library instead
+of the direct system calls; This will make your code easier to debug and
+more portable.
+
+@item
+Try to always write optimized code, so that you don't have to
+go back and rewrite it a couple of months later. It's better to
+spend 3 times as much time designing and writing an optimal function than
+having to do it all over again later on.
+
+@item
+Avoid CPU wasteful code, even where it does not matter, so that
+you will not develop sloppy coding habits.
+
+@item
+If you can write it in fewer lines, do it (as long as the code will not
+be slower or much harder to read).
+
+@item
+Don't use two commands on the same line.
+
+@item
+Do not check the same pointer for @code{NULL} more than once.
+
+@item
+Use long function and variable names in English. This makes your code
+easier to read.
+
+@item
+Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_}
+rather than dancing SHIFT to seperate words in identifiers).
+
+@item
+Think assembly - make it easier for the compiler to optimize your code.
+
+@item
+Comment your code when you do something that someone else may think
+is not ``trivial''.
+
+@item
+Use @code{libstring} functions (in the @file{strings} directory)
+instead of standard @code{libc} string functions whenever possible.
+
+@item
+Avoid using @code{malloc()} (its REAL slow); For memory allocations
+that only need to live for the lifetime of one thread, one should use
+@code{sql_alloc()} instead.
+
+@item
+Before making big design decisions, please first post a summary of
+what you want to do, why you want to do it, and how you plan to do
+it. This way we can easily provide you with feedback and also
+easily discuss it thoroughly if some other developer thinks there is better
+way to do the same thing!
+
+@item
+Class names start with a capital letter.
+
+@item
+Structure types are @code{typedef}'ed to an all-caps identifier.
+
+@item
+Any @code{#define}'s are in all-caps.
+
+@item
+Matching @samp{@{} are in the same column.
+
+@item
+Put the @samp{@{} after a @code{switch} on the same line, as this gives
+better overall indentation for the switch statement:
+
+@example
+switch (arg) @{
+@end example
+
+@item
+In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
+if there is nothing inside @samp{@{} and @samp{@}}.
+
+@item
+Have a space after @code{if}
+
+@item
+Put a space after @samp{,} for function arguments
+
+@item
+Functions return @samp{0} on success, and non-zero on error, so you can do:
+
+@example
+if(a() || b() || c()) @{ error("something went wrong"); @}
+@end example
+
+@item
+Using @code{goto} is okay if not abused.
+
+@item
+Avoid default variable initalizations, use @code{LINT_INIT()} if the
+compiler complains after making sure that there is really no way
+the variable can be used uninitialized.
+
+@item
+Do not instantiate a class if you do not have to.
+
+@item
+Use pointers rather than array indexing when operating on strings.
+
+@end itemize
+
+Suggested mode in emacs:
+
+@example
+(load "cc-mode")
+(setq c-mode-common-hook '(lambda ()
+ (turn-on-font-lock)
+ (setq comment-column 48)))
+(setq c-style-alist
+ (cons
+ '("MY"
+ (c-basic-offset . 2)
+ (c-comment-only-line-offset . 0)
+ (c-offsets-alist . ((statement-block-intro . +)
+ (knr-argdecl-intro . 0)
+ (substatement-open . 0)
+ (label . -)
+ (statement-cont . +)
+ (arglist-intro . c-lineup-arglist-intro-after-paren)
+ (arglist-close . c-lineup-arglist)
+ ))
+ )
+ c-style-alist))
+(c-set-style "MY")
+(setq c-default-style "MY")
+@end example
+
+@node caching, join_buffer_size, coding guidelines, Top
@chapter How MySQL Handles Caching
@strong{MySQL} has the following caches:
@@ -181,7 +341,7 @@ same algorithm described above to handle it. (In other words, we store
the same row combination several times into different buffers)
@end itemize
-@node flush tables, filesort, join_buffer_size, Top
+@node flush tables, Algorithms, join_buffer_size, Top
@chapter How MySQL Handles @code{FLUSH TABLES}
@itemize @bullet
@@ -226,8 +386,19 @@ After this it will give other threads a chance to open the same tables.
@end itemize
-@node filesort, coding guidelines, flush tables, Top
-@chapter How MySQL Does Sorting (@code{filesort})
+@node Algorithms, mysys functions, flush tables, Top
+@chapter Different algorithms used in MySQL
+
+MySQL uses a lot of different algorithms. This chapter tries to describe
+some of these:
+
+@menu
+* filesort::
+* bulk-insert::
+@end menu
+
+@node filesort, bulk-insert, Algorithms, Algorithms
+@section How MySQL Does Sorting (@code{filesort})
@itemize @bullet
@@ -266,169 +437,20 @@ and then we read the rows in the sorted order into a row buffer
@end itemize
+@node bulk-insert, , filesort, Algorithms
+@section Bulk insert
-@node coding guidelines, mysys functions, filesort, Top
-@chapter Coding Guidelines
-
-@itemize @bullet
-
-@item
-We use @uref{http://www.bitkeeper.com/, BitKeeper} for source management.
-
-@item
-You should use the @strong{MySQL} 4.0 source for all developments.
-
-@item
-If you have any questions about the @strong{MySQL} source, you can post these
-to @email{dev-public@@mysql.com} and we will answer them. Please
-remember to not use this internal email list in public!
-
-@item
-Try to write code in a lot of black boxes that can be reused or use at
-least a clean, easy to change interface.
-
-@item
-Reuse code; There is already a lot of algorithms in MySQL for list handling,
-queues, dynamic and hashed arrays, sorting, etc. that can be reused.
-
-@item
-Use the @code{my_*} functions like @code{my_read()}/@code{my_write()}/
-@code{my_malloc()} that you can find in the @code{mysys} library instead
-of the direct system calls; This will make your code easier to debug and
-more portable.
-
-@item
-Try to always write optimized code, so that you don't have to
-go back and rewrite it a couple of months later. It's better to
-spend 3 times as much time designing and writing an optimal function than
-having to do it all over again later on.
-
-@item
-Avoid CPU wasteful code, even where it does not matter, so that
-you will not develop sloppy coding habits.
-
-@item
-If you can write it in fewer lines, do it (as long as the code will not
-be slower or much harder to read).
-
-@item
-Don't use two commands on the same line.
-
-@item
-Do not check the same pointer for @code{NULL} more than once.
-
-@item
-Use long function and variable names in English. This makes your code
-easier to read.
-
-@item
-Use @code{my_var} as opposed to @code{myVar} or @code{MyVar} (@samp{_}
-rather than dancing SHIFT to seperate words in identifiers).
-
-@item
-Think assembly - make it easier for the compiler to optimize your code.
-
-@item
-Comment your code when you do something that someone else may think
-is not ``trivial''.
-
-@item
-Use @code{libstring} functions (in the @file{strings} directory)
-instead of standard @code{libc} string functions whenever possible.
-
-@item
-Avoid using @code{malloc()} (its REAL slow); For memory allocations
-that only need to live for the lifetime of one thread, one should use
-@code{sql_alloc()} instead.
-
-@item
-Before making big design decisions, please first post a summary of
-what you want to do, why you want to do it, and how you plan to do
-it. This way we can easily provide you with feedback and also
-easily discuss it thoroughly if some other developer thinks there is better
-way to do the same thing!
-
-@item
-Class names start with a capital letter.
-
-@item
-Structure types are @code{typedef}'ed to an all-caps identifier.
-
-@item
-Any @code{#define}'s are in all-caps.
-
-@item
-Matching @samp{@{} are in the same column.
-
-@item
-Put the @samp{@{} after a @code{switch} on the same line, as this gives
-better overall indentation for the switch statement:
-
-@example
-switch (arg) @{
-@end example
-
-@item
-In all other cases, @samp{@{} and @samp{@}} should be on their own line, except
-if there is nothing inside @samp{@{} and @samp{@}}.
-
-@item
-Have a space after @code{if}
-
-@item
-Put a space after @samp{,} for function arguments
-
-@item
-Functions return @samp{0} on success, and non-zero on error, so you can do:
-
-@example
-if(a() || b() || c()) @{ error("something went wrong"); @}
-@end example
-
-@item
-Using @code{goto} is okay if not abused.
-
-@item
-Avoid default variable initalizations, use @code{LINT_INIT()} if the
-compiler complains after making sure that there is really no way
-the variable can be used uninitialized.
-
-@item
-Do not instantiate a class if you do not have to.
-
-@item
-Use pointers rather than array indexing when operating on strings.
-
-@end itemize
-
-Suggested mode in emacs:
-
-@example
-(load "cc-mode")
-(setq c-mode-common-hook '(lambda ()
- (turn-on-font-lock)
- (setq comment-column 48)))
-(setq c-style-alist
- (cons
- '("MY"
- (c-basic-offset . 2)
- (c-comment-only-line-offset . 0)
- (c-offsets-alist . ((statement-block-intro . +)
- (knr-argdecl-intro . 0)
- (substatement-open . 0)
- (label . -)
- (statement-cont . +)
- (arglist-intro . c-lineup-arglist-intro-after-paren)
- (arglist-close . c-lineup-arglist)
- ))
- )
- c-style-alist))
-(c-set-style "MY")
-(setq c-default-style "MY")
-@end example
+The logic behind the bulk insert optimisation is simple.
+Instead of writing each key value to the B-tree (that is, to the keycache,
+although the bulk insert code doesn't know about the keycache), keys are stored in
+a balanced binary (red-black) tree in memory. When this tree reaches its
+memory limit, it writes all keys to disk (to the keycache, that is). Because
+the key stream coming from the binary tree is already sorted, inserting
+goes much faster: all the necessary pages are already in the cache, disk
+access is minimized, etc.
-@node mysys functions, DBUG, coding guidelines, Top
+@node mysys functions, DBUG, Algorithms, Top
@chapter Functions In The @code{mysys} Library
Functions in @code{mysys}: (For flags see @file{my_sys.h})
@@ -624,6 +646,16 @@ Print query.
* fieldtype codes::
* protocol functions::
* protocol version 2::
+* 4.1 protocol changes::
+* 4.1 field packet::
+* 4.1 field desc::
+* 4.1 ok packet::
+* 4.1 end packet::
+* 4.1 error packet::
+* 4.1 prep init::
+* 4.1 long data::
+* 4.1 execute::
+* 4.1 binary result::
@end menu
@node raw packet without compression, raw packet with compression, protocol, protocol
@@ -690,7 +722,7 @@ is the header of the packet.
@end menu
-@node ok packet, error packet, basic packets, basic packets, basic packets
+@node ok packet, error packet, basic packets, basic packets
@subsection OK Packet
For details, see @file{sql/net_pkg.cc::send_ok()}.
@@ -720,7 +752,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
@end table
-@node error packet, , ok packet, basic packets, basic packets
+@node error packet, , ok packet, basic packets
@subsection Error Packet
@example
@@ -835,7 +867,7 @@ For details, see @file{sql/net_pkg.cc::send_ok()}.
n data
@end example
-@node fieldtype codes, protocol functions, communication
+@node fieldtype codes, protocol functions, communication, protocol
@section Fieldtype Codes
@example
@@ -859,7 +891,7 @@ Time 03 08 00 00 |01 0B |03 00 00 00
Date 03 0A 00 00 |01 0A |03 00 00 00
@end example
-@node protocol functions, protocol version 2, fieldtype codes
+@node protocol functions, protocol version 2, fieldtype codes, protocol
@section Functions used to implement the protocol
@c This should be merged with the above one and changed to texi format
@@ -971,7 +1003,7 @@ client. If this is equal to the new message the client sends to the
server then the password is accepted.
@end example
-@node protocol version 2, 4.1 protocol changes, protocol functions
+@node protocol version 2, 4.1 protocol changes, protocol functions, protocol
@section Another description of the protocol
@c This should be merged with the above one and changed to texi format.
@@ -1664,7 +1696,7 @@ fe 00 . .
@c @node 4.1 protocol,,,
@c @chapter MySQL 4.1 protocol
-@node 4.1 protocol changes, 4.1 field packet, protocol version 2
+@node 4.1 protocol changes, 4.1 field packet, protocol version 2, protocol
@section Changes to 4.0 protocol in 4.1
All basic packet handling is identical to 4.0. When communication
@@ -1699,7 +1731,7 @@ results will sent as binary (low-byte-first).
@end itemize
-@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes
+@node 4.1 field packet, 4.1 field desc, 4.1 protocol changes, protocol
@section 4.1 field description packet
The field description packet is sent as a response to a query that
@@ -1719,7 +1751,7 @@ uses this to send the number of rows in the table)
This packet is always followed by a field description set.
@xref{4.1 field desc}.
-@node 4.1 field desc, 4.1 ok packet, 4.1 field packet
+@node 4.1 field desc, 4.1 ok packet, 4.1 field packet, protocol
@section 4.1 field description result set
The field description result set contains the meta info for a result set.
@@ -1737,7 +1769,7 @@ The field description result set contains the meta info for a result set.
@end multitable
-@node 4.1 ok packet, 4.1 end packet, 4.1 field desc
+@node 4.1 ok packet, 4.1 end packet, 4.1 field desc, protocol
@section 4.1 ok packet
The ok packet is the first that is sent as an response for a query
@@ -1763,7 +1795,7 @@ The message is optional. For example for multi line INSERT it
contains a string for how many rows was inserted / deleted.
-@node 4.1 end packet, 4.1 error packet, 4.1 ok packet
+@node 4.1 end packet, 4.1 error packet, 4.1 ok packet, protocol
@section 4.1 end packet
The end packet is sent as the last packet for
@@ -1792,7 +1824,7 @@ by checking the packet length < 9 bytes (in which case it's and end
packet).
-@node 4.1 error packet, 4.1 prep init, 4.1 end packet
+@node 4.1 error packet, 4.1 prep init, 4.1 end packet, protocol
@section 4.1 error packet.
The error packet is sent when something goes wrong.
@@ -1809,7 +1841,7 @@ The client/server protocol is designed in such a way that a packet
can only start with 255 if it's an error packet.
-@node 4.1 prep init, 4.1 long data, 4.1 error packet
+@node 4.1 prep init, 4.1 long data, 4.1 error packet, protocol
@section 4.1 prepared statement init packet
This is the return packet when one sends a query with the COM_PREPARE
@@ -1843,7 +1875,7 @@ prepared statement will contain a result set. In this case the packet
is followed by a field description result set. @xref{4.1 field desc}.
-@node 4.1 long data, 4.1 execute, 4.1 prep init
+@node 4.1 long data, 4.1 execute, 4.1 prep init, protocol
@section 4.1 long data handling
This is used by mysql_send_long_data() to set any parameter to a string
@@ -1870,7 +1902,7 @@ The server will NOT send an @code{ok} or @code{error} packet in
responce for this. If there is any errors (like to big string), one
will get the error when calling execute.
-@node 4.1 execute, 4.1 binary result, 4.1 long data
+@node 4.1 execute, 4.1 binary result, 4.1 long data, protocol
@section 4.1 execute
On execute we send all parameters to the server in a COM_EXECUTE
@@ -1908,7 +1940,7 @@ The parameters are stored the following ways:
The result for this will be either an ok packet or a binary result
set.
-@node 4.1 binary result, , 4.1 execute
+@node 4.1 binary result, , 4.1 execute, protocol
@section 4.1 binary result set
A binary result are sent the following way.
@@ -2384,7 +2416,7 @@ work for different record formats are: /myisam/mi_statrec.c,
/myisam/mi_dynrec.c, and /myisam/mi_packrec.c.
@*
-@node InnoDB Record Structure,InnoDB Page Structure,MyISAM Record Structure,Top
+@node InnoDB Record Structure, InnoDB Page Structure, MyISAM Record Structure, Top
@chapter InnoDB Record Structure
This page contains:
@@ -2690,7 +2722,7 @@ shorter because the NULLs take no space.
The most relevant InnoDB source-code files are rem0rec.c, rem0rec.ic,
and rem0rec.h in the rem ("Record Manager") directory.
-@node InnoDB Page Structure,Files in MySQL Sources,InnoDB Record Structure,Top
+@node InnoDB Page Structure, Files in MySQL Sources, InnoDB Record Structure, Top
@chapter InnoDB Page Structure
InnoDB stores all records inside a fixed-size unit which is commonly called a
@@ -3121,7 +3153,7 @@ header.
The most relevant InnoDB source-code files are page0page.c,
page0page.ic, and page0page.h in \page directory.
-@node Files in MySQL Sources,Files in InnoDB Sources,InnoDB Page Structure,Top
+@node Files in MySQL Sources, Files in InnoDB Sources, InnoDB Page Structure, Top
@chapter Annotated List Of Files in the MySQL Source Code Distribution
This is a description of the files that you get when you download the
@@ -4942,7 +4974,7 @@ The MySQL program that uses zlib is \mysys\my_compress.c. The use is
for packet compression. The client sends messages to the server which
are compressed by zlib. See also: \sql\net_serv.cc.
-@node Files in InnoDB Sources,,Files in MySQL Sources,Top
+@node Files in InnoDB Sources, , Files in MySQL Sources, Top
@chapter Annotated List Of Files in the InnoDB Source Code Distribution
ERRATUM BY HEIKKI TUURI (START)