author:    unknown <jcole@tetra.spaceapes.com>  2001-08-02 14:56:14 -0500
committer: unknown <jcole@tetra.spaceapes.com>  2001-08-02 14:56:14 -0500
commit:    6fe5b189cd2b73a8bb4eef2e7ad1470af778a257 (patch)
tree:      06ec7a44bfd362ee1795db042d56af9c2be22327 /Docs
parent:    aee57a52b8af28b45ce0962cf4976187d950795c (diff)
parent:    89e1029b2711753ee54aa6893b867e46380bd231 (diff)
Merge jcole@work.mysql.com:/home/bk/mysql
into tetra.spaceapes.com:/home/jcole/bk/mysql

Docs/manual.texi: Auto merged
Diffstat (limited to 'Docs')
-rw-r--r--  Docs/manual.texi  |  5857
1 file changed, 3001 insertions(+), 2856 deletions(-)
diff --git a/Docs/manual.texi b/Docs/manual.texi
index 769b898f8e0..ab8e1815c5e 100644
--- a/Docs/manual.texi
+++ b/Docs/manual.texi
@@ -113,11 +113,10 @@ distribution for that version.
* Installing:: Installing @strong{MySQL}
* Tutorial:: @strong{MySQL} Tutorial
* MySQL Database Administration::
+* MySQL Optimization::
* Reference:: @strong{MySQL} language reference
* Table types:: @strong{MySQL} table types
* Fulltext Search:: Fulltext Search
-* Performance:: Getting maximum performance from @strong{MySQL}
-* MySQL Benchmarks:: The @strong{MySQL} benchmark suite
* Maintenance:: Maintaining a @strong{MySQL} installation
* Adding functions:: Adding new functions to @strong{MySQL}
* Adding procedures:: Adding new procedures to @strong{MySQL}
@@ -320,7 +319,8 @@ us.
you are looking for, you should give it a try. @strong{MySQL} also has a
very practical set of features developed in very close cooperation with
our users. You can find a performance comparison of @strong{MySQL}
-to some other database managers on our benchmark page. @xref{Benchmarks}.
+to some other database managers on our benchmark page.
+@xref{MySQL Benchmarks}.
@strong{MySQL} was originally developed to handle very large databases
much faster than existing solutions and has been successfully used in
@@ -4488,7 +4488,7 @@ For a list of all supported limits, functions, and types, see the
@item Performance
For a true comparison of speed, consult the growing @strong{MySQL} benchmark
-suite. @xref{Benchmarks}.
+suite. @xref{MySQL Benchmarks}.
Because there is no thread creation overhead, a small parser, few features, and
simple security, @code{mSQL} should be quicker at:
@@ -4547,7 +4547,7 @@ slower than @strong{MySQL} was seen. This is due to @code{mSQL}'s lack of a
join optimizer to order tables in the optimal order. However, if you put the
tables in exactly the right order in @code{mSQL}2 and the @code{WHERE} is
simple and uses index columns, the join will be relatively fast!
-@xref{Benchmarks}.
+@xref{MySQL Benchmarks}.
@item
@code{ORDER BY} and @code{GROUP BY}.
@item
@@ -7250,11 +7250,11 @@ hundreds of megabytes of data.
@item The @strong{MySQL} benchmark suite
This runs a range of common queries. It is also a test to see whether the
latest batch of optimizations actually made the code faster.
-@xref{Benchmarks}.
+@xref{MySQL Benchmarks}.
@item The @code{crash-me} test
This tries to determine what features the database supports and what its
-capabilities and limitations are. @xref{Benchmarks}.
+capabilities and limitations are. @xref{MySQL Benchmarks}.
@end table
Another test is that we use the newest @strong{MySQL} version in our internal
@@ -14280,7 +14280,7 @@ FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' ESCAPED BY '\\'
@end example
-@node MySQL Database Administration, Reference, Tutorial, Top
+@node MySQL Database Administration, MySQL Optimization, Tutorial, Top
@chapter MySQL Database Administration
@menu
@@ -19851,7 +19851,7 @@ be run at the same time.
@item @code{thread_stack}
The stack size for each thread. Many of the limits detected by the
@code{crash-me} test are dependent on this value. The default is
-large enough for normal operation. @xref{Benchmarks}.
+large enough for normal operation. @xref{MySQL Benchmarks}.
@item @code{timezone}
The timezone for the server.
@@ -23664,8 +23664,2988 @@ isolate it into a separate test case first. Then report the problem to
@email{bugs@@lists.mysql.com} with as much info as possible.
+@node MySQL Optimization, Reference, MySQL Database Administration, Top
+@chapter MySQL Optimization
-@node Reference, Table types, MySQL Database Administration, Top
+@menu
+* Optimize Overview::
+* Query Speed::
+* Locking Issues::
+* Optimizing Database Structure::
+* Optimizing the Server::
+* Disk issues::
+@end menu
+
+
+Optimization is a complicated task because it ultimately requires
+understanding of the whole system. While it may be possible to do some
+local optimizations with little knowledge of your system or application,
+the more optimal you want your system to become, the more you will have
+to know about it.
+
+This chapter will try to explain and give some examples of different
+ways to optimize @strong{MySQL}. Remember, however, that there are
+always some (increasingly harder) additional ways to make the system
+even faster.
+
+
+@node Optimize Overview, Query Speed, MySQL Optimization, MySQL Optimization
+@section Optimization Overview
+
+The most important factor in making a system fast is, of course, the
+basic design. You also need to know what kinds of things your system
+will be doing, and what your bottlenecks are.
+
+The most common bottlenecks are:
+@itemize @bullet
+@item Disk seeks.
+It takes time for the disk to find a piece of data. With modern disks in
+1999, the mean time for this is usually lower than 10ms, so we can in
+theory do about 1000 seeks a second. This time improves slowly with new
+disks and is very hard to optimize for a single table. The way to
+optimize this is to spread the data across more than one disk.
+
+@item Disk reading/writing.
+When the disk is at the correct position we need to read the data. With
+modern disks in 1999, one disk delivers something like 10-20Mb/s. This
+is easier to optimize than seeks because you can read in parallel from
+multiple disks.
+
+@item CPU cycles.
+When we have the data in main memory (or if it was already
+there), we need to process it to get our result. Tables that are
+small compared to the available memory are the most common
+case here; but then, with small tables, speed is usually not the problem.
+
+@item Memory bandwidth.
+When the CPU needs more data than can fit in the CPU cache, main
+memory bandwidth becomes a bottleneck. This is an uncommon bottleneck
+for most systems, but one should be aware of it.
+@end itemize
+
+
+@menu
+* Design Limitations::
+* Portability::
+* Internal use::
+* MySQL Benchmarks::
+* Custom Benchmarks::
+@end menu
+
+@node Design Limitations, Portability, Optimize Overview, Optimize Overview
+@subsection MySQL Design Limitations/Tradeoffs
+
+@cindex design, limitations
+@cindex limitations, design
+
+Because @strong{MySQL} uses extremely fast table locking (multiple readers /
+single writers), the biggest remaining problem is a mix of a steady stream of
+inserts and slow selects on the same table.
+
+We believe that for a huge number of systems the extremely fast
+performance in other cases makes this choice a win. This case can usually
+also be solved by having multiple copies of the table, but that
+takes more effort and hardware.
+
+We are also working on some extensions to solve this problem for some
+common application niches.
+
+
+@node Portability, Internal use, Design Limitations, Optimize Overview
+@subsection Portability
+
+@cindex portability
+@cindex crash-me program
+@cindex programs, crash-me
+
+Because all SQL servers implement different parts of SQL, it takes work to
+write portable SQL applications. For very simple selects/inserts it is
+very easy, but the more features you need, the harder it gets. If you want
+an application that is fast with many databases, it becomes even harder!
+
+To make a complex application portable you need to choose a number of
+SQL servers that it should work with.
+
+You can use the @strong{MySQL} crash-me program/web-page at
+@uref{http://www.mysql.com/information/crash-me.php} to find functions,
+types, and limits you can use with a selection of database
+servers. Crash-me does not yet test everything possible, but it
+is still comprehensive, with about 450 things tested.
+
+For example, you shouldn't have column names longer than 18 characters
+if you want to be able to use Informix or DB2.
+
+Both the @strong{MySQL} benchmarks and the crash-me program are very
+database-independent. By taking a look at how we have handled this, you
+can get a feeling for what you have to do to make your application
+database-independent. The benchmarks themselves can be found in the
+@file{sql-bench} directory in the @strong{MySQL} source
+distribution. They are written in Perl with the DBI database interface
+(which solves the access part of the problem).
+
+See @uref{http://www.mysql.com/information/benchmarks.html} for the results
+from this benchmark.
+
+As you can see in these results, all databases have some weak points. That
+is, they have different design compromises that lead to different
+behavior.
+
+If you strive for database independence, you need to get a good feeling
+for each SQL server's bottlenecks. @strong{MySQL} is VERY fast in
+retrieving and updating things, but will have a problem in mixing slow
+readers/writers on the same table. Oracle, on the other hand, has a big
+problem when you try to access rows that you have recently updated
+(until they are flushed to disk). Transaction databases in general are
+not very good at generating summary tables from log tables, as in this
+case row locking is almost useless.
+
+To get your application @emph{really} database-independent, you need to define
+an easily extensible interface through which you manipulate your data. As
+C++ is available on most systems, it makes sense to use a C++ class
+interface to the databases.
+
+If you use some specific feature for some database (like the
+@code{REPLACE} command in @strong{MySQL}), you should code a method for
+the other SQL servers to implement the same feature (but slower). With
+@strong{MySQL} you can use the @code{/*! */} syntax to add
+@strong{MySQL}-specific keywords to a query. The code inside
+@code{/**/} will be treated as a comment (ignored) by most other SQL
+servers.
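+
+As a minimal sketch (the table and column names here are hypothetical), the
+following query uses the @strong{MySQL}-specific @code{STRAIGHT_JOIN}
+keyword, while other SQL servers treat the hint as a comment and still
+accept the query:
+
+@example
+SELECT /*! STRAIGHT_JOIN */ t1.name, t2.price
+FROM t1,t2 WHERE t1.id=t2.id;
+@end example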
+
+If really high performance is more important than exactness, as in some
+Web applications, a possibility is to create an application layer that
+caches all results to give you even higher performance. By letting
+old results 'expire' after a while, you can keep the cache reasonably
+fresh. This is quite nice in case of extremely high load, in which case
+you can dynamically increase the cache and set the expire timeout higher
+until things get back to normal.
+
+In this case the table creation information should contain information
+about the initial size of the cache and how often the table should normally
+be refreshed.
+
+@node Internal use, MySQL Benchmarks, Portability, Optimize Overview
+@subsection What Have We Used MySQL For?
+
+@cindex uses, of MySQL
+@cindex customers, of MySQL
+
+During @strong{MySQL}'s initial development, the features of @strong{MySQL}
+were made to fit our largest customer, which handles data warehousing for a
+couple of the biggest retailers in Sweden.
+
+From all stores, we get weekly summaries of all bonus card transactions,
+and we are expected to provide useful information for the store owners
+to help them find how their advertisement campaigns are affecting their
+customers.
+
+The amount of data is quite large (about 7 million summary transactions per
+month), and we have data for 4-10 years that we need to present to the users.
+We get weekly requests from the customers, who want 'instant' access
+to new reports from this data.
+
+We solved this by storing all information per month in compressed
+'transaction' tables. We have a set of simple macros (script) that
+generates summary tables grouped by different criteria (product group,
+customer id, store ...) from the transaction tables. The reports are
+Web pages that are dynamically generated by a small Perl script that
+parses a Web page, executes the SQL statements in it, and inserts the
+results. We would have used PHP or mod_perl instead but they were
+not available at that time.
+
+For graphical data we wrote a simple tool in @code{C} that can produce
+GIFs based on the result of a SQL query (with some processing of the
+result). This is also dynamically executed from the Perl script that
+parses the @code{HTML} files.
+
+In most cases a new report can simply be done by copying an existing
+script and modifying the SQL query in it. In some cases, we will need to
+add more fields to an existing summary table or generate a new one, but
+this is also quite simple, as we keep all transaction tables on disk.
+(Currently we have at least 50G of transaction tables and 200G of other
+customer data.)
+
+We also let our customers access the summary tables directly with ODBC
+so that the advanced users can themselves experiment with the data.
+
+We haven't had any problems handling this with a quite modest Sun Ultra
+SPARCstation (2 x 200 MHz). We recently upgraded one of our servers to a
+2-CPU 400 MHz UltraSPARC, and we are now planning to start handling
+transactions on the product level, which would mean a tenfold increase
+in data. We think we can keep up with this by just adding more disks to
+our systems.
+
+We are also experimenting with Intel-Linux to be able to get more CPU
+power more cheaply. Now that we have the binary portable database format (new
+in Version 3.23), we will start to use this for some parts of the application.
+
+Our initial feeling is that Linux will perform much better on
+low-to-medium load and Solaris will perform better when you start to get a
+high load because of extreme disk I/O, but we don't yet have anything
+conclusive about this. After some discussion with a Linux kernel
+developer, this might be a side effect of Linux giving so many resources
+to the batch job that the interactive performance gets very low. This
+makes the machine feel very slow and unresponsive while big batch jobs are
+running. Hopefully this will be handled better in future Linux kernels.
+
+
+@node MySQL Benchmarks, Custom Benchmarks, Internal use, Optimize Overview
+@subsection The MySQL Benchmark Suite
+
+@cindex benchmark suite
+@cindex crash-me program
+
+This should contain a technical description of the @strong{MySQL}
+benchmark suite (and @code{crash-me}), but that description has not been
+written yet. Currently, you can get a good idea of the benchmark by
+looking at the code and results in the @file{sql-bench} directory in any
+@strong{MySQL} source distribution.
+
+This benchmark suite is meant to be a benchmark that will tell any user
+what things a given SQL implementation performs well or poorly at.
+
+Note that this benchmark is single threaded, so it measures the minimum
+time for the operations. We plan to add a lot of multi-threaded tests
+to the benchmark suite in the future.
+
+For example (run on the same NT 4.0 machine):
+
+@multitable @columnfractions .6 .2 .2
+@strong{Reading 2000000 rows by index} @tab @strong{Seconds} @tab @strong{Seconds}
+@item mysql @tab 367 @tab 249
+@item mysql_odbc @tab 464
+@item db2_odbc @tab 1206
+@item informix_odbc @tab 121126
+@item ms-sql_odbc @tab 1634
+@item oracle_odbc @tab 20800
+@item solid_odbc @tab 877
+@item sybase_odbc @tab 17614
+@end multitable
+
+@multitable @columnfractions .6 .2 .2
+@strong{Inserting (350768) rows} @tab @strong{Seconds} @tab @strong{Seconds}
+@item mysql @tab 381 @tab 206
+@item mysql_odbc @tab 619
+@item db2_odbc @tab 3460
+@item informix_odbc @tab 2692
+@item ms-sql_odbc @tab 4012
+@item oracle_odbc @tab 11291
+@item solid_odbc @tab 1801
+@item sybase_odbc @tab 4802
+@end multitable
+
+In the above test @strong{MySQL} was run with an 8M index cache.
+
+We have gathered some more benchmark results at
+@uref{http://www.mysql.com/information/benchmarks.html}.
+
+Note that Oracle is not included because they asked to be removed. All
+Oracle benchmarks have to be passed by Oracle! We believe that makes
+Oracle benchmarks @strong{VERY} biased because the above benchmarks are
+supposed to show what a standard installation can do for a single
+client.
+
+To run the benchmark suite, you have to download a @strong{MySQL} source
+distribution, install the Perl DBI module and the Perl DBD driver for the
+database you want to test, and then do:
+
+@example
+cd sql-bench
+perl run-all-tests --server=#
+@end example
+
+where # is one of the supported servers. You can get a list of all options
+and supported servers by doing @code{run-all-tests --help}.
+
+@cindex crash-me
+@code{crash-me} tries to determine what features a database supports and
+what its capabilities and limitations are by actually running
+queries. For example, it determines:
+
+@itemize @bullet
+@item
+What column types are supported
+@item
+How many indexes are supported
+@item
+What functions are supported
+@item
+How big a query can be
+@item
+How big a @code{VARCHAR} column can be
+@end itemize
+
+You can find the results from crash-me for a lot of different databases at
+@uref{http://www.mysql.com/information/crash-me.php}.
+
+
+@node Custom Benchmarks, , MySQL Benchmarks, Optimize Overview
+@subsection Using Your Own Benchmarks
+
+@cindex benchmarks
+@cindex performance, benchmarks
+
+You should definitely benchmark your application and database to find
+out where the bottlenecks are. By fixing a bottleneck (or by replacing it
+with a 'dummy module') you can then easily identify the next
+bottleneck (and so on). Even if the overall performance for your
+application is sufficient, you should at least make a plan for each
+bottleneck, and decide how to solve it if someday you really need the
+extra performance.
+
+For an example of portable benchmark programs, look at the @strong{MySQL}
+benchmark suite. @xref{MySQL Benchmarks, , @strong{MySQL} Benchmarks}. You
+can take any program from this suite and modify it for your needs. By doing
+this, you can try different solutions to your problem and test which is really
+the fastest solution for you.
+
+It is very common that some problems only occur when the system is very
+heavily loaded. We have had many customers who contact us when they
+have a (tested) system in production and have encountered load problems. In
+every one of these cases so far, the problems have been with basic design
+(table scans are NOT good at high load) or OS/library issues. Most of
+this would be a @strong{LOT} easier to fix if the systems were not
+already in production.
+
+To avoid problems like this, you should put some effort into benchmarking
+your whole application under the worst possible load! You can use
+Super Smack for this, and it is available at:
+@uref{http://www.mysql.com/Downloads/super-smack/super-smack-1.0.tar.gz}.
+As the name suggests, it can bring your system down to its knees if you
+ask it to, so make sure to use it only on your development systems.
+
+
+@node Query Speed, Locking Issues, Optimize Overview, MySQL Optimization
+@section Optimizing @code{SELECT}s and Other Queries
+
+@cindex queries, speed of
+@cindex permission checks, effect on speed
+@cindex speed, of queries
+
+First, one thing that affects all queries: the more complex your permission
+system setup is, the more overhead you get.
+
+If you have not issued any @code{GRANT} statements, @strong{MySQL} will
+optimize the permission checking somewhat. So if you have a very high
+query volume, it may be worth the time to avoid grants. Otherwise, more
+permission checks result in larger overhead.
+
+If your problem is with some explicit @strong{MySQL} function, you can
+always time this in the @strong{MySQL} client:
+
+@example
+mysql> select benchmark(1000000,1+1);
++------------------------+
+| benchmark(1000000,1+1) |
++------------------------+
+|                      0 |
++------------------------+
+1 row in set (0.32 sec)
+@end example
+
+The above shows that @strong{MySQL} can execute 1,000,000 @code{+}
+expressions in 0.32 seconds on a @code{PentiumII 400MHz}.
+
+All @strong{MySQL} functions should be highly optimized, but there may be
+some exceptions, and @code{benchmark(loop_count,expression)} is a
+great tool for finding out whether this is a problem with your query.
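+
+For example, to time a specific function (here @code{MD5()}, purely as an
+illustration):
+
+@example
+mysql> select benchmark(100000,md5('hello'));
+@end example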
+
+@menu
+* EXPLAIN::
+* Estimating performance:: Estimating query performance
+* SELECT speed:: Speed of @code{SELECT} queries
+* Where optimizations:: How MySQL optimizes @code{WHERE} clauses
+* DISTINCT optimization:: How MySQL Optimizes @code{DISTINCT}
+* LEFT JOIN optimization:: How MySQL optimizes @code{LEFT JOIN}
+* LIMIT optimization:: How MySQL optimizes @code{LIMIT}
+* Insert speed:: Speed of @code{INSERT} queries
+* Update speed:: Speed of @code{UPDATE} queries
+* Delete speed:: Speed of @code{DELETE} queries
+* Tips::
+@end menu
+
+
+@node EXPLAIN, Estimating performance, Query Speed, Query Speed
+@subsection @code{EXPLAIN} Syntax (Get Information About a @code{SELECT})
+
+@findex EXPLAIN
+@findex SELECT, optimizing
+
+@example
+ EXPLAIN tbl_name
+or EXPLAIN SELECT select_options
+@end example
+
+@code{EXPLAIN tbl_name} is a synonym for @code{DESCRIBE tbl_name} or
+@code{SHOW COLUMNS FROM tbl_name}.
+
+When you precede a @code{SELECT} statement with the keyword @code{EXPLAIN},
+@strong{MySQL} explains how it would process the @code{SELECT}, providing
+information about how tables are joined and in which order.
+
+With the help of @code{EXPLAIN}, you can see when you must add indexes
+to tables to get a faster @code{SELECT} that uses indexes to find the
+records. You can also see if the optimizer joins the tables in an optimal
+order. To force the optimizer to use a specific join order for a
+@code{SELECT} statement, add a @code{STRAIGHT_JOIN} clause.
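+
+For example (a sketch; the table and column names are illustrative):
+
+@example
+mysql> SELECT STRAIGHT_JOIN t1.a,t2.b FROM t1,t2 WHERE t1.a=t2.a;
+@end example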
+
+For non-simple joins, @code{EXPLAIN} returns a row of information for each
+table used in the @code{SELECT} statement. The tables are listed in the order
+they would be read. @strong{MySQL} resolves all joins using a single-sweep
+multi-join method. This means that @strong{MySQL} reads a row from the first
+table, then finds a matching row in the second table, then in the third table
+and so on. When all tables are processed, it outputs the selected columns and
+backtracks through the table list until a table is found for which there are
+more matching rows. The next row is read from this table and the process
+continues with the next table.
+
+Output from @code{EXPLAIN} includes the following columns:
+
+@table @code
+@item table
+The table to which the row of output refers.
+
+@item type
+The join type. Information about the various types is given below.
+
+@item possible_keys
+The @code{possible_keys} column indicates which indexes @strong{MySQL}
+could use to find the rows in this table. Note that this column is
+totally independent of the order of the tables. That means that some of
+the keys in possible_keys may not be usable in practice with the
+generated table order.
+
+If this column is empty, there are no relevant indexes. In this case,
+you may be able to improve the performance of your query by examining
+the @code{WHERE} clause to see if it refers to some column or columns
+that would be suitable for indexing. If so, create an appropriate index
+and check the query with @code{EXPLAIN} again. @xref{ALTER TABLE}.
+
+To see what indexes a table has, use @code{SHOW INDEX FROM tbl_name}.
+
+@item key
+The @code{key} column indicates the key that @strong{MySQL} actually
+decided to use. The key is @code{NULL} if no index was chosen. If
+@strong{MySQL} chooses the wrong index, you can probably force
+@strong{MySQL} to use another index by using @code{myisamchk --analyze},
+@xref{myisamchk syntax}, or by using @code{USE INDEX/IGNORE INDEX}.
+@xref{JOIN}.
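+
+For example (a hypothetical sketch; index and column names are illustrative):
+
+@example
+mysql> SELECT * FROM t1 USE INDEX (idx1) WHERE col1=1 AND col2=2;
+@end example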
+
+@item key_len
+The @code{key_len} column indicates the length of the key that
+@strong{MySQL} decided to use. The length is @code{NULL} if the
+@code{key} is @code{NULL}. Note that this tells us how many parts of a
+multi-part key @strong{MySQL} will actually use.
+
+@item ref
+The @code{ref} column shows which columns or constants are used with the
+@code{key} to select rows from the table.
+
+@item rows
+The @code{rows} column indicates the number of rows @strong{MySQL}
+believes it must examine to execute the query.
+
+@item Extra
+This column contains additional information about how @strong{MySQL} will
+resolve the query. Here is an explanation of the different text
+strings that can be found in this column:
+
+@table @code
+@item Distinct
+@strong{MySQL} will not continue searching for more rows for the current row
+combination after it has found the first matching row.
+
+@item Not exists
+@strong{MySQL} was able to do a @code{LEFT JOIN} optimization on the
+query and will not examine more rows in this table for the previous row
+combination after it finds one row that matches the @code{LEFT JOIN} criteria.
+
+Here is an example for this:
+
+@example
+SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;
+@end example
+
+Assume that @code{t2.id} is defined with @code{NOT NULL}. In this case
+@strong{MySQL} will scan @code{t1} and look up the rows in @code{t2}
+through @code{t1.id}. If @strong{MySQL} finds a matching row in
+@code{t2}, it knows that @code{t2.id} can never be @code{NULL}, and will
+not scan through the rest of the rows in @code{t2} that have the same
+@code{id}. In other words, for each row in @code{t1}, @strong{MySQL}
+only needs to do a single lookup in @code{t2}, independent of how many
+matching rows there are in @code{t2}.
+
+@item @code{range checked for each record (index map: #)}
+@strong{MySQL} didn't find a really good index to use. Instead, for
+each row combination in the preceding tables, it will check which index to
+use (if any), and use this index to retrieve the rows from the table. This
+isn't very fast, but it is faster than having to do a join without
+an index.
+
+@item Using filesort
+@strong{MySQL} will need to do an extra pass to find out how to retrieve
+the rows in sorted order. The sort is done by going through all rows
+according to the @code{join type} and storing the sort key + pointer to
+the row for all rows that match the @code{WHERE}. Then the keys are
+sorted. Finally the rows are retrieved in sorted order.
+
+@item Using index
+The column information is retrieved from the table using only
+information in the index tree without having to do an additional seek to
+read the actual row. This can be done when all the used columns for
+the table are part of the same index.
+
+@item Using temporary
+To resolve the query @strong{MySQL} will need to create a
+temporary table to hold the result. This typically happens if you do an
+@code{ORDER BY} on a different column set than you did a @code{GROUP
+BY} on.
+
+@item Where used
+A @code{WHERE} clause will be used to restrict which rows will be
+matched against the next table or sent to the client. If this information
+is absent and the table is of type @code{ALL} or @code{index},
+you may have something wrong in your query (if you don't intend to
+fetch/examine all rows from the table).
+@end table
+
+If you want to get your queries as fast as possible, you should look out for
+@code{Using filesort} and @code{Using temporary}.
+@end table
+
+The different join types are listed below, ordered from best to worst type:
+
+@cindex system table
+@cindex tables, system
+@table @code
+@item system
+The table has only one row (= system table). This is a special case of
+the @code{const} join type.
+
+@cindex constant table
+@cindex tables, constant
+@item const
+The table has at most one matching row, which will be read at the start
+of the query. Because there is only one row, values from the columns in
+this row can be regarded as constants by the rest of the
+optimizer. @code{const} tables are very fast as they are read only once!
+
+@item eq_ref
+One row will be read from this table for each combination of rows from
+the previous tables. This is the best possible join type, other than the
+@code{const} types. It is used when all parts of an index are used by
+the join and the index is @code{UNIQUE} or a @code{PRIMARY KEY}.
+
+@item ref
+All rows with matching index values will be read from this table for each
+combination of rows from the previous tables. @code{ref} is used if the join
+uses only a leftmost prefix of the key, or if the key is not @code{UNIQUE}
+or a @code{PRIMARY KEY} (in other words, if the join cannot select a single
+row based on the key value). If the key that is used matches only a few rows,
+this join type is good.
+
+@item range
+Only rows that are in a given range will be retrieved, using an index to
+select the rows. The @code{key} column indicates which index is used.
+The @code{key_len} contains the longest key part that was used.
+The @code{ref} column will be NULL for this type.
+
+@item index
+This is the same as @code{ALL}, except that only the index tree is
+scanned. This is usually faster than @code{ALL}, as the index file is usually
+smaller than the data file.
+
+@item ALL
+A full table scan will be done for each combination of rows from the
+previous tables. This is normally not good if the table is the first
+table not marked @code{const}, and usually @strong{very} bad in all other
+cases. You normally can avoid @code{ALL} by adding more indexes, so that
+the row can be retrieved based on constant values or column values from
+earlier tables.
+@end table
+
+You can get a good indication of how good a join is by multiplying all values
+in the @code{rows} column of the @code{EXPLAIN} output. This should tell you
+roughly how many rows @strong{MySQL} must examine to execute the query. This
+number is also used when you restrict queries with the @code{max_join_size}
+variable.
+@xref{Server parameters}.
+
+The following example shows how a @code{JOIN} can be optimized progressively
+using the information provided by @code{EXPLAIN}.
+
+Suppose you have the @code{SELECT} statement shown below, which you examine
+using @code{EXPLAIN}:
+
+@example
+EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
+ tt.ProjectReference, tt.EstimatedShipDate,
+ tt.ActualShipDate, tt.ClientID,
+ tt.ServiceCodes, tt.RepetitiveID,
+ tt.CurrentProcess, tt.CurrentDPPerson,
+ tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
+ et_1.COUNTRY, do.CUSTNAME
+ FROM tt, et, et AS et_1, do
+ WHERE tt.SubmitTime IS NULL
+ AND tt.ActualPC = et.EMPLOYID
+ AND tt.AssignedPC = et_1.EMPLOYID
+ AND tt.ClientID = do.CUSTNMBR;
+@end example
+
+For this example, assume that:
+
+@itemize @bullet
+@item
+The columns being compared have been declared as follows:
+
+@multitable @columnfractions .1 .2 .7
+@item @strong{Table} @tab @strong{Column} @tab @strong{Column type}
+@item @code{tt} @tab @code{ActualPC} @tab @code{CHAR(10)}
+@item @code{tt} @tab @code{AssignedPC} @tab @code{CHAR(10)}
+@item @code{tt} @tab @code{ClientID} @tab @code{CHAR(10)}
+@item @code{et} @tab @code{EMPLOYID} @tab @code{CHAR(15)}
+@item @code{do} @tab @code{CUSTNMBR} @tab @code{CHAR(15)}
+@end multitable
+
+@item
+The tables have the indexes shown below:
+
+@multitable @columnfractions .1 .9
+@item @strong{Table} @tab @strong{Index}
+@item @code{tt} @tab @code{ActualPC}
+@item @code{tt} @tab @code{AssignedPC}
+@item @code{tt} @tab @code{ClientID}
+@item @code{et} @tab @code{EMPLOYID} (primary key)
+@item @code{do} @tab @code{CUSTNMBR} (primary key)
+@end multitable
+
+@item
+The @code{tt.ActualPC} values aren't evenly distributed.
+@end itemize
+
+Initially, before any optimizations have been performed, the @code{EXPLAIN}
+statement produces the following information:
+
+@example
+table type possible_keys key key_len ref rows Extra
+et ALL PRIMARY NULL NULL NULL 74
+do ALL PRIMARY NULL NULL NULL 2135
+et_1 ALL PRIMARY NULL NULL NULL 74
+tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872
+ range checked for each record (key map: 35)
+@end example
+
+Because @code{type} is @code{ALL} for each table, this output indicates that
+@strong{MySQL} is doing a full join for all tables! This will take quite a
+long time, as the product of the number of rows in each table must be
+examined! For the case at hand, this is @code{74 * 2135 * 74 * 3872 =
+45,268,558,720} rows. If the tables were bigger, you can only imagine how
+long it would take.
+
+One problem here is that @strong{MySQL} can't (yet) use indexes on columns
+efficiently if they are declared differently. In this context,
+@code{VARCHAR} and @code{CHAR} are the same unless they are declared as
+different lengths. Because @code{tt.ActualPC} is declared as @code{CHAR(10)}
+and @code{et.EMPLOYID} is declared as @code{CHAR(15)}, there is a length
+mismatch.
+
+To fix this disparity between column lengths, use @code{ALTER TABLE} to
+lengthen @code{ActualPC} from 10 characters to 15 characters:
+
+@example
+mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);
+@end example
+
+Now @code{tt.ActualPC} and @code{et.EMPLOYID} are both 15 characters long.
+Executing the @code{EXPLAIN} statement again produces this result:
+
+@example
+table type possible_keys key key_len ref rows Extra
+tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
+do ALL PRIMARY NULL NULL NULL 2135
+ range checked for each record (key map: 1)
+et_1 ALL PRIMARY NULL NULL NULL 74
+ range checked for each record (key map: 1)
+et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
+@end example
+
+This is not perfect, but is much better (the product of the @code{rows}
+values is now less by a factor of 74). This version is executed in a couple
+of seconds.
+
+A second alteration can be made to eliminate the column length mismatches
+for the @code{tt.AssignedPC = et_1.EMPLOYID} and @code{tt.ClientID =
+do.CUSTNMBR} comparisons:
+
+@example
+mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
+ MODIFY ClientID VARCHAR(15);
+@end example
+
+Now @code{EXPLAIN} produces the output shown below:
+
+@example
+table type possible_keys key key_len ref rows Extra
+et ALL PRIMARY NULL NULL NULL 74
+tt ref AssignedPC,ClientID,ActualPC ActualPC 15 et.EMPLOYID 52 where used
+et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
+do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
+@end example
+
+This is almost as good as it can get.
+
+The remaining problem is that, by default, @strong{MySQL} assumes that values
+in the @code{tt.ActualPC} column are evenly distributed, and that isn't the
+case for the @code{tt} table. Fortunately, it is easy to tell @strong{MySQL}
+about this:
+
+@example
+shell> myisamchk --analyze PATH_TO_MYSQL_DATABASE/tt
+shell> mysqladmin refresh
+@end example
+
+Now the join is perfect, and @code{EXPLAIN} produces this result:
+
+@example
+table type possible_keys key key_len ref rows Extra
+tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
+et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
+et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
+do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
+@end example
+
+Note that the @code{rows} column in the output from @code{EXPLAIN} is an
+educated guess from the @strong{MySQL} join optimizer. To optimize a
+query, you should check if the numbers are even close to the truth. If not,
+you may get better performance by using @code{STRAIGHT_JOIN} in your
+@code{SELECT} statement and trying to list the tables in a different order in
+the @code{FROM} clause.
+
+
+@node Estimating performance, SELECT speed, EXPLAIN, Query Speed
+@subsection Estimating Query Performance
+
+@cindex estimating, query performance
+@cindex queries, estimating performance
+@cindex performance, estimating
+
+In most cases you can estimate the performance by counting disk seeks.
+For small tables, you can usually find the row in 1 disk seek (as the
+index is probably cached). For bigger tables, you can estimate that
+(using B++ tree indexes) you will need: @code{log(row_count) /
+log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) +
+1} seeks to find a row.
+
+In @strong{MySQL} an index block is usually 1024 bytes and the data
+pointer is usually 4 bytes. A 500,000 row table with an
+index length of 3 (medium integer) gives you:
+@code{log(500,000)/log(1024/3*2/(3+4)) + 1} = 4 seeks.
+
+As the above index would require about 500,000 * 7 * 3/2 = 5.2M
+(assuming that the index buffers are filled to 2/3, which is typical),
+you will probably have much of the index in memory and so will probably
+need only 1-2 calls to read data from the OS to find the row.
+
+For writes, however, you will need 4 seek requests (as above) to find
+where to place the new index and normally 2 seeks to update the index
+and write the row.
+
+Note that the above doesn't mean that your application will slowly
+degenerate by N log N! As long as everything is cached by the OS or the SQL
+server, things will only go marginally slower while the table gets
+bigger. After the data gets too big to be cached, things will start to
+go much slower until your application is bound only by disk seeks
+(which increase by N log N). To avoid this, increase the index cache as
+the data grows. @xref{Server parameters}.
+
+
+@node SELECT speed, Where optimizations, Estimating performance, Query Speed
+@subsection Speed of @code{SELECT} Queries
+
+@findex SELECT speed
+
+@cindex speed, of queries
+
+In general, when you want to make a slow @code{SELECT ... WHERE} faster, the
+first thing to check is whether or not you can add an index. @xref{MySQL
+indexes, , @strong{MySQL} indexes}. All references between different tables
+should usually be done with indexes. You can use the @code{EXPLAIN} command
+to determine which indexes are used for a @code{SELECT}.
+@xref{EXPLAIN, , @code{EXPLAIN}}.
+
+Some general tips:
+
+@itemize @bullet
+@item
+To help @strong{MySQL} optimize queries better, run @code{myisamchk
+--analyze} on a table after it has been loaded with relevant data. This
+updates a value for each index part that indicates the average number of
+rows that have the same value. (For unique indexes, this is always 1,
+of course.) @strong{MySQL} will use this to decide which index to
+choose when you connect two tables with 'a non-constant expression'.
+You can check the result from the @code{analyze} run by doing @code{SHOW
+INDEX FROM table_name} and examining the @code{Cardinality} column.
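+
+For example (the path and table name are illustrative):
+
+@example
+shell> myisamchk --analyze /path/to/db/tbl_name
+mysql> SHOW INDEX FROM tbl_name;
+@end example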
+
+@item
+To sort an index and data according to an index, use @code{myisamchk
+--sort-index --sort-records=1} (if you want to sort on index 1). If you
+have a unique index from which you want to read all records in order
+according to that index, this is a good way to make that faster. Note,
+however, that this sorting isn't written optimally and will take a long
+time for a large table!
+@end itemize
+
+
+@node Where optimizations, DISTINCT optimization, SELECT speed, Query Speed
+@subsection How MySQL Optimizes @code{WHERE} Clauses
+
+@findex WHERE
+
+@cindex optimizations
+
+The @code{WHERE} optimizations are put in the @code{SELECT} part here because
+they are mostly used with @code{SELECT}, but the same optimizations apply for
+@code{WHERE} in @code{DELETE} and @code{UPDATE} statements.
+
+Also note that this section is incomplete. @strong{MySQL} does many
+optimizations, and we have not had time to document them all.
+
+Some of the optimizations performed by @strong{MySQL} are listed below:
+
+@itemize @bullet
+@item
+Removal of unnecessary parentheses:
+@example
+ ((a AND b) AND c OR (((a AND b) AND (c AND d))))
+-> (a AND b AND c) OR (a AND b AND c AND d)
+@end example
+@item
+Constant folding:
+@example
+ (a<b AND b=c) AND a=5
+-> b>5 AND b=c AND a=5
+@end example
+@item
+Constant condition removal (needed because of constant folding):
+@example
+ (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
+-> B=5 OR B=6
+@end example
+@item
+Constant expressions used by indexes are evaluated only once.
+@item
+@code{COUNT(*)} on a single table without a @code{WHERE} is retrieved
+directly from the table information. This is also done for any @code{NOT NULL}
+expression when used with only one table.
+@item
+Early detection of invalid constant expressions. @strong{MySQL} quickly
+detects that some @code{SELECT} statements are impossible and returns no rows.
+@item
+@code{HAVING} is merged with @code{WHERE} if you don't use @code{GROUP BY}
+or group functions (@code{COUNT()}, @code{MIN()}...).
+@item
+For each sub-join, a simpler @code{WHERE} is constructed to get a fast
+@code{WHERE} evaluation for each sub-join and also to skip records as
+soon as possible.
+@cindex constant table
+@cindex tables, constant
+@item
+All constant tables are read first, before any other tables in the query.
+A constant table is:
+@itemize @minus
+@item
+An empty table or a table with 1 row.
+@item
+A table that is used with a @code{WHERE} clause on a @code{UNIQUE}
+index, or a @code{PRIMARY KEY}, where all index parts are used with constant
+expressions and the index parts are defined as @code{NOT NULL}.
+@end itemize
+All the following tables are used as constant tables:
+@example
+mysql> SELECT * FROM t WHERE primary_key=1;
+mysql> SELECT * FROM t1,t2
+ WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
+@end example
+
+@item
+The best join combination to join the tables is found by trying all
+possibilities. If all columns in @code{ORDER BY} and in @code{GROUP
+BY} come from the same table, then this table is preferred first when
+joining.
+@item
+If there is an @code{ORDER BY} clause and a different @code{GROUP BY}
+clause, or if the @code{ORDER BY} or @code{GROUP BY} contains columns
+from tables other than the first table in the join queue, a temporary
+table is created.
+@item
+If you use @code{SQL_SMALL_RESULT}, @strong{MySQL} will use an in-memory
+temporary table.
+@item
+Each table index is queried, and the best index that spans fewer than 30% of
+the rows is used. If no such index can be found, a quick table scan is used.
+@item
+In some cases, @strong{MySQL} can read rows from the index without even
+consulting the data file. If all columns used from the index are numeric,
+then only the index tree is used to resolve the query.
+@item
+Before each record is output, those that do not match the @code{HAVING} clause
+are skipped.
+@end itemize
+
+Some examples of queries that are very fast:
+
+@example
+mysql> SELECT COUNT(*) FROM tbl_name;
+mysql> SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
+mysql> SELECT MAX(key_part2) FROM tbl_name
+ WHERE key_part1=constant;
+mysql> SELECT ... FROM tbl_name
+ ORDER BY key_part1,key_part2,... LIMIT 10;
+mysql> SELECT ... FROM tbl_name
+ ORDER BY key_part1 DESC,key_part2 DESC,... LIMIT 10;
+@end example
+
+The following queries are resolved using only the index tree (assuming
+the indexed columns are numeric):
+
+@example
+mysql> SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
+mysql> SELECT COUNT(*) FROM tbl_name
+ WHERE key_part1=val1 AND key_part2=val2;
+mysql> SELECT key_part2 FROM tbl_name GROUP BY key_part1;
+@end example
+
+The following queries use indexing to retrieve the rows in sorted
+order without a separate sorting pass:
+
+@example
+mysql> SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,... ;
+mysql> SELECT ... FROM tbl_name ORDER BY key_part1 DESC,key_part2 DESC,... ;
+@end example
+
+@node DISTINCT optimization, LEFT JOIN optimization, Where optimizations, Query Speed
+@subsection How MySQL Optimizes @code{DISTINCT}
+
+@findex DISTINCT
+
+@cindex optimizing, DISTINCT
+
+@code{DISTINCT} is converted to a @code{GROUP BY} on all columns.
+@code{DISTINCT} combined with @code{ORDER BY} will in many cases also
+need a temporary table.
+
+When combining @code{LIMIT #} with @code{DISTINCT}, @strong{MySQL} will stop
+as soon as it finds @code{#} unique rows.
+
+If you don't use columns from all used tables, @strong{MySQL} will stop
+scanning the unused tables as soon as it has found the first match.
+
+@example
+SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.a=t2.a;
+@end example
+
+In this case, assuming t1 is used before t2 (check with @code{EXPLAIN}),
+@strong{MySQL} will stop reading from t2 (for that particular row in t1)
+when the first row in t2 is found.
+
+
+@node LEFT JOIN optimization, LIMIT optimization, DISTINCT optimization, Query Speed
+@subsection How MySQL Optimizes @code{LEFT JOIN} and @code{RIGHT JOIN}
+
+@findex LEFT JOIN
+
+@cindex optimizing, LEFT JOIN
+
+@code{A LEFT JOIN B} in @strong{MySQL} is implemented as follows:
+
+@itemize @bullet
+@item
+The table @code{B} is set to be dependent on table @code{A} and all tables
+that @code{A} is dependent on.
+@item
+The table @code{A} is set to be dependent on all tables (except @code{B})
+that are used in the @code{LEFT JOIN} condition.
+@item
+All @code{LEFT JOIN} conditions are moved to the @code{WHERE} clause.
+@item
+All standard join optimizations are done, with the exception that a table is
+always read after all tables it is dependent on. If there is a circular
+dependence then @strong{MySQL} will issue an error.
+@item
+All standard @code{WHERE} optimizations are done.
+@item
+If there is a row in @code{A} that matches the @code{WHERE} clause, but there
+wasn't any row in @code{B} that matched the @code{LEFT JOIN} condition,
+then an extra @code{B} row is generated with all columns set to @code{NULL}.
+@item
+If you use @code{LEFT JOIN} to find rows that don't exist in some
+table and you have the following test: @code{column_name IS NULL} in the
+@code{WHERE} part, where column_name is a column that is declared as
+@code{NOT NULL}, then @strong{MySQL} will stop searching for more rows
+(for a particular key combination) after it has found one row that
+matches the @code{LEFT JOIN} condition.
+@end itemize
+
+@code{RIGHT JOIN} is implemented analogously to @code{LEFT JOIN}.
+
+The table read order forced by @code{LEFT JOIN} and @code{STRAIGHT JOIN}
+will help the join optimizer (which calculates in which order tables
+should be joined) to do its work much more quickly, as there are fewer
+table permutations to check.
+
+Note that the above means that if you do a query of type:
+
+@example
+SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
+@end example
+
+@strong{MySQL} will do a full scan on @code{b} as the @code{LEFT
+JOIN} will force it to be read before @code{d}.
+
+The fix in this case is to change the query to:
+
+@example
+SELECT * FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
+@end example
+
+
+@node LIMIT optimization, Insert speed, LEFT JOIN optimization, Query Speed
+@subsection How MySQL Optimizes @code{LIMIT}
+
+@findex LIMIT
+
+@cindex optimizing, LIMIT
+
+In some cases @strong{MySQL} will handle the query differently when you are
+using @code{LIMIT #} and not using @code{HAVING}:
+
+@itemize @bullet
+@item
+If you are selecting only a few rows with @code{LIMIT}, @strong{MySQL}
+will use indexes in some cases when it normally would prefer to do a
+full table scan.
+@item
+If you use @code{LIMIT #} with @code{ORDER BY}, @strong{MySQL} will end the
+sorting as soon as it has found the first @code{#} lines instead of sorting
+the whole table.
+@item
+When combining @code{LIMIT #} with @code{DISTINCT}, @strong{MySQL} will stop
+as soon as it finds @code{#} unique rows.
+@item
+In some cases a @code{GROUP BY} can be resolved by reading the key in order
+(or doing a sort on the key) and then calculating summaries until the
+key value changes. In this case @code{LIMIT #} will not calculate any
+unnecessary @code{GROUP BY}'s.
+@item
+As soon as @strong{MySQL} has sent the first @code{#} rows to the client, it
+will abort the query.
+@item
+@code{LIMIT 0} will always quickly return an empty set. This is useful
+to check the query and to get the column types of the result columns.
+@item
+@strong{MySQL} uses @code{LIMIT #} to calculate how much space is needed
+for the temporary tables used to resolve the query.
+@end itemize
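+
+For example, as the @code{LIMIT 0} item above notes, the following returns
+immediately with only the column types of the result (the table name is
+illustrative):
+
+@example
+mysql> SELECT * FROM tbl_name LIMIT 0;
+@end example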
+
+@node Insert speed, Update speed, LIMIT optimization, Query Speed
+@subsection Speed of @code{INSERT} Queries
+
+@findex INSERT
+
+@cindex speed, inserting
+@cindex inserting, speed of
+
+The time to insert a record consists approximately of:
+
+@itemize @bullet
+@item
+Connect: (3)
+@item
+Sending query to server: (2)
+@item
+Parsing query: (2)
+@item
+Inserting record: (1 x size of record)
+@item
+Inserting indexes: (1 x number of indexes)
+@item
+Close: (1)
+@end itemize
+
+where the numbers are somewhat proportional to the overall time. This
+does not take into consideration the initial overhead to open tables
+(which is done once for each concurrently running query).
+
+The size of the table slows down the insertion of indexes by N log N
+(B-trees).
+
+Some ways to speed up inserts:
+
+@itemize @bullet
+@item
+If you are inserting many rows from the same client at the same time, use
+@code{INSERT} statements with multiple value lists, as in the sketch below.
+This is much faster (many times in some cases) than using separate
+@code{INSERT} statements.
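+
+A minimal sketch (table, columns, and values are illustrative):
+
+@example
+mysql> INSERT INTO t1 (a,b) VALUES (1,'x'),(2,'y'),(3,'z');
+@end example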
+@item
+If you are inserting a lot of rows from different clients, you can get
+higher speed by using the @code{INSERT DELAYED} statement. @xref{INSERT,
+, @code{INSERT}}.
+@item
+Note that with @code{MyISAM} you can insert rows at the same time
+@code{SELECT}s are running if there are no deleted rows in the tables.
+@item
+When loading a table from a text file, use @code{LOAD DATA INFILE}. This
+is usually 20 times faster than using a lot of @code{INSERT} statements.
+@xref{LOAD DATA, , @code{LOAD DATA}}.
+@item
+It is possible with some extra work to make @code{LOAD DATA INFILE} run even
+faster when the table has many indexes. Use the following procedure:
+
+@enumerate
+@item
+Optionally create the table with @code{CREATE TABLE}. For example, using
+@code{mysql} or Perl-DBI.
+
+@item
+Execute a @code{FLUSH TABLES} statement or the shell command @code{mysqladmin
+flush-tables}.
+
+@item
+Use @code{myisamchk --keys-used=0 -rq /path/to/db/tbl_name}. This will
+remove all usage of all indexes from the table.
+
+@item
+Insert data into the table with @code{LOAD DATA INFILE}. This will not
+update any indexes and will therefore be very fast.
+
+@item
+If you are going to only read the table in the future, run @code{myisampack}
+on it to make it smaller. @xref{Compressed format}.
+
+@item
+Re-create the indexes with @code{myisamchk -r -q
+/path/to/db/tbl_name}. This will create the index tree in memory before
+writing it to disk, which is much faster because it avoids lots of disk
+seeks. The resulting index tree is also perfectly balanced.
+
+@item
+Execute a @code{FLUSH TABLES} statement or the shell command @code{mysqladmin
+flush-tables}.
+@end enumerate
+
+This procedure will be built into @code{LOAD DATA INFILE} in some future
+version of @strong{MySQL}.
+@item
+You can speed up insertions by locking your tables:
+
+@example
+mysql> LOCK TABLES a WRITE;
+mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
+mysql> INSERT INTO a VALUES (8,26),(6,29);
+mysql> UNLOCK TABLES;
+@end example
+
+The main speed difference is that the index buffer is flushed to disk only
+once, after all @code{INSERT} statements have completed. Normally there would
+be as many index buffer flushes as there are different @code{INSERT}
+statements. Locking is not needed if you can insert all rows with a single
+statement.
+
+Locking will also lower the total time of multi-connection tests, but the
+maximum wait time for some threads will go up (because they wait for
+locks). For example:
+
+@example
+thread 1 does 1000 inserts
+threads 2, 3, and 4 do 1 insert each
+thread 5 does 1000 inserts
+@end example
+
+If you don't use locking, threads 2, 3, and 4 will finish before 1 and 5.
+If you use locking, threads 2, 3, and 4 probably will not finish before 1
+or 5, but the total time should be about 40% faster.
+
+As @code{INSERT}, @code{UPDATE}, and @code{DELETE} operations are very
+fast in @strong{MySQL}, you will obtain better overall performance by
+adding locks around everything that does more than about 5 inserts or
+updates in a row. If you do very many inserts in a row, you could do a
+@code{LOCK TABLES} followed by an @code{UNLOCK TABLES} once in a while
+(about every 1000 rows) to allow other threads access to the table. This
+would still result in a nice performance gain.
+
+Of course, @code{LOAD DATA INFILE} is much faster for loading data.
+@end itemize
+
+To get some more speed for both @code{LOAD DATA INFILE} and
+@code{INSERT}, enlarge the key buffer. @xref{Server parameters}.
+
+
+@node Update speed, Delete speed, Insert speed, Query Speed
+@subsection Speed of @code{UPDATE} Queries
+
+Update queries are optimized as a @code{SELECT} query with the additional
+overhead of a write. The speed of the write is dependent on the size of
+the data that is being updated and the number of indexes that are
+updated. Indexes that are not changed will not be updated.
+
+Another way to get fast updates is to delay updates and then do many
+updates in a row later, as in the sketch below. Doing many updates in a
+row is much quicker than doing one at a time if you lock the table.
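+
+A sketch of this (table and column names are illustrative):
+
+@example
+mysql> LOCK TABLES t1 WRITE;
+mysql> UPDATE t1 SET counter=counter+1 WHERE id=1;
+mysql> UPDATE t1 SET counter=counter+1 WHERE id=2;
+mysql> UNLOCK TABLES;
+@end example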
+
+Note that, with dynamic record format, updating a record to
+a longer total length may split the record. So if you do this often,
+it is very important to run @code{OPTIMIZE TABLE} occasionally.
+@xref{OPTIMIZE TABLE, , @code{OPTIMIZE TABLE}}.
+
+
+@node Delete speed, Tips, Update speed, Query Speed
+@subsection Speed of @code{DELETE} Queries
+
+If you want to delete all rows in the table, you should use
+@code{TRUNCATE TABLE table_name}. @xref{TRUNCATE}.
+
+The time to delete a record is exactly proportional to the number of
+indexes. To delete records more quickly, you can increase the size of
+the index cache. @xref{Server parameters}.
+
+
+@node Tips, , Delete speed, Query Speed
+@subsection Other Optimization Tips
+
+@cindex optimization, tips
+@cindex tips, optimization
+
+Unsorted tips for faster systems:
+
+@itemize @bullet
+@item
+Use persistent connections to the database to avoid the connection
+overhead. If you can't use persistent connections and you are doing a
+lot of new connections to the database, you may want to change the value
+of the @code{thread_cache_size} variable. @xref{Server parameters}.
+@item
+Always check that all your queries really use the indexes you have created
+in the tables. In @strong{MySQL} you can do this with the @code{EXPLAIN}
+command. @xref{EXPLAIN, Explain, Explain, manual}.
+@item
+Try to avoid complex @code{SELECT} queries on tables that are updated a
+lot. This is to avoid problems with table locking.
+@item
+The new @code{MyISAM} tables can insert rows into a table without deleted
+rows at the same time another thread is reading from it. If this is
+important for you, you should consider methods where you don't have to
+delete rows, or run @code{OPTIMIZE TABLE} after you have deleted a lot of
+rows.
+@item
+Use @code{ALTER TABLE ... ORDER BY expr1,expr2...} if you mostly
+retrieve rows in expr1,expr2.. order. By using this option after big
+changes to the table, you may be able to get higher performance.
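+
+A sketch (table and column names are illustrative):
+
+@example
+mysql> ALTER TABLE t1 ORDER BY col1,col2;
+@end example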
+@item
+In some cases it may make sense to introduce a column that is 'hashed'
+based on information from other columns. If this column is short and
+reasonably unique it may be much faster than a big index on many
+columns. In @strong{MySQL} it's very easy to use this extra column:
+@code{SELECT * FROM table_name WHERE hash=MD5(CONCAT('constant1','constant2'))
+AND col1='constant1' AND col2='constant2'}
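+
+A fuller sketch of this technique (table and column names are illustrative):
+
+@example
+CREATE TABLE t (
+  col1 CHAR(50),
+  col2 CHAR(50),
+  hash CHAR(32) NOT NULL,
+  INDEX (hash)
+);
+
+SELECT * FROM t
+WHERE hash=MD5(CONCAT('constant1','constant2'))
+  AND col1='constant1' AND col2='constant2';
+@end example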
+@item
+For tables that change a lot you should try to avoid all @code{VARCHAR}
+or @code{BLOB} columns. You will get dynamic row length as soon as you
+are using a single @code{VARCHAR} or @code{BLOB} column. @xref{Table
+types}.
+@item
+It's not normally useful to split a table into different tables just
+because the rows get 'big'. To access a row, the biggest performance
+hit is the disk seek to find the first byte of the row. After finding
+the data, most new disks can read the whole row fast enough for most
+applications. The only cases where it really matters to split up a table
+are if it's a dynamic row size table (see above) that you can change to a
+fixed row size, or if you very often need to scan the table but don't need
+most of the columns. @xref{Table types}.
+@item
+If you very often need to calculate things based on information from a
+lot of rows (like counts of things), it's probably much better to
+introduce a new table and update the counter in real time. An update of
+type @code{UPDATE table set count=count+1 where index_column=constant}
+is very fast!
+
+This is really important when you use databases like @strong{MySQL} that
+only have table locking (multiple readers / single writers). This will
+also give better performance with most databases, as the row locking
+manager in this case will have less to do.
+@item
+If you need to collect statistics from big log tables, use summary tables
+instead of scanning the whole table. Maintaining the summaries should be
+much faster than trying to do statistics 'live'. It's much faster to
+regenerate new summary tables from the logs when things change
+(depending on business decisions) than to have to change the running
+application!
+@item
+If possible, one should classify reports as 'live' or 'statistical',
+where the data needed for statistical reports is generated only from
+summary tables that are built from the actual data.
+@item
+Take advantage of the fact that columns have default values. Insert
+values explicitly only when the value to be inserted differs from the
+default. This reduces the parsing that @strong{MySQL} needs to do and
+improves the insert speed.
+@item
+In some cases it's convenient to pack and store data into a blob. In this
+case you have to add some extra code in your application to pack/unpack
+things in the blob, but this may save a lot of accesses at some stage.
+This is practical when you have data that doesn't conform to a static
+table structure.
+@item
+Normally you should try to keep all data non-redundant (what
+is called 3rd normal form in database theory), but you should not be
+afraid of duplicating things or creating summary tables if you need these
+to gain more speed.
+@item
+Stored procedures or UDFs (user-defined functions) may be a good way to
+get more performance. In this case you should, however, always have some
+other (slower) way to do the same thing if you use a database that
+doesn't support these features.
+@item
+You can always gain something by caching queries/answers in your
+application and trying to do many inserts/updates at the same time. If
+your database supports table locking (like @strong{MySQL} and Oracle),
+this should help to ensure that the index cache is only flushed once
+after all updates.
+@item
+Use @code{INSERT /*! DELAYED */} when you do not need to know when your
+data is written. This speeds things up because many records can be written
+with a single disk write.
+@item
+Use @code{INSERT /*! LOW_PRIORITY */} when you want your selects to be
+more important.
+@item
+Use @code{SELECT /*! HIGH_PRIORITY */} to get selects that jump the
+queue. That is, the select is done even if there is somebody waiting to
+do a write.
+@item
+Use the multi-line @code{INSERT} statement to store many rows with one
+SQL command (many SQL servers support this).
+@item
+Use @code{LOAD DATA INFILE} to load larger amounts of data. This is
+faster than normal inserts and will be even faster when @code{myisamchk}
+is integrated in @code{mysqld}.
+@item
+Use @code{AUTO_INCREMENT} columns to make unique values.
+@item
+Use @code{OPTIMIZE TABLE} once in a while to avoid fragmentation when
+using dynamic table format. @xref{OPTIMIZE TABLE, , @code{OPTIMIZE TABLE}}.
+
+@item
+Use @code{HEAP} tables to get more speed when possible. @xref{Table
+types}.
+@item
+When using a normal Web server setup, images should be stored as
+files. That is, store only a file reference in the database. The main
+reason for this is that a normal Web server is much better at caching
+files than database contents, so it's much easier to get a fast
+system if you are using files.
+@item
+Use in-memory tables for non-critical data that is accessed often (like
+information about the last shown banner for users that don't have
+cookies).
+@item
+Columns with identical information in different tables should be
+declared identical and have identical names. Before Version 3.23 you
+got slow joins otherwise.
+
+Try to keep the names simple (use @code{name} instead of
+@code{customer_name} in the customer table). To make your names portable
+to other SQL servers you should keep them shorter than 18 characters.
+@item
+If you need REALLY high speed, you should take a look at the low-level
+interfaces for data storage that the different SQL servers support! For
+example, by accessing the @strong{MySQL} @code{MyISAM} tables directly, you
+could get a speed increase of 2-5 times compared to using the SQL interface.
+To be able to do this the data must be on the same server as
+the application, and usually it should only be accessed by one process
+(because external file locking is really slow). One could eliminate the
+above problems by introducing low-level @code{MyISAM} commands in the
+@strong{MySQL} server (this could be one easy way to get more
+performance if needed). By carefully designing the database interface,
+it should be quite easy to support this type of optimization.
+@item
+In many cases it's faster to access data from a database (using a live
+connection) than accessing a text file, just because the database is
+likely to be more compact than the text file (if you are using numerical
+data), and this will involve fewer disk accesses. You will also save
+code because you don't have to parse your text files to find line and
+column boundaries.
+@item
+You can also use replication to speed things up. @xref{Replication}.
+@item
+Declaring a table with @code{DELAY_KEY_WRITE=1} will make the updating of
+indexes faster, as these are not flushed to disk until the file is closed.
+The downside is that you should run @code{myisamchk} on these tables before
+you start @code{mysqld} to ensure that they are okay if something killed
+@code{mysqld} in the middle. As the key information can always be generated
+from the data, you should not lose anything by using @code{DELAY_KEY_WRITE}.
+@end itemize
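+
+As a sketch of the 'hashed' column tip above, you could maintain the hash
+yourself and use it as a short index (the table and column names here are
+hypothetical):
+
+@example
+mysql> CREATE TABLE lookup (        # 'lookup' is a hypothetical table
+           col1 CHAR(100) NOT NULL,
+           col2 CHAR(100) NOT NULL,
+           hash CHAR(32) NOT NULL,  # MD5() returns 32 hex characters
+           KEY (hash));
+mysql> INSERT INTO lookup VALUES
+           ('constant','constant',MD5(CONCAT('constant','constant')));
+mysql> SELECT * FROM lookup
+           WHERE hash=MD5(CONCAT('constant','constant'))
+           AND col1='constant' AND col2='constant';
+@end example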
+
+
+@node Locking Issues, Optimizing Database Structure, Query Speed, MySQL Optimization
+@section Locking Issues
+
+
+@menu
+* Internal locking::
+* Table locking::
+@end menu
+
+@node Internal locking, Table locking, Locking Issues, Locking Issues
+@subsection How MySQL Locks Tables
+
+@cindex internal locking
+@cindex locking, tables
+@cindex tables, locking
+
+You can find a discussion about different locking methods in the appendix.
+@xref{Locking methods}.
+
+All locking in @strong{MySQL} is deadlock-free. This is managed by always
+requesting all needed locks at once at the beginning of a query and always
+locking the tables in the same order.
+
+The locking method @strong{MySQL} uses for @code{WRITE} locks works as follows:
+
+@itemize @bullet
+@item
+If there are no locks on the table, put a write lock on it.
+@item
+Otherwise, put the lock request in the write lock queue.
+@end itemize
+
+The locking method @strong{MySQL} uses for @code{READ} locks works as follows:
+
+@itemize @bullet
+@item
+If there are no write locks on the table, put a read lock on it.
+@item
+Otherwise, put the lock request in the read lock queue.
+@end itemize
+
+When a lock is released, the lock is made available to the threads
+in the write lock queue, then to the threads in the read lock queue.
+
+This means that if you have many updates on a table, @code{SELECT}
+statements will wait until there are no more updates.
+
+To work around this for the case where you want to do many @code{INSERT} and
+@code{SELECT} operations on a table, you can insert rows in a temporary
+table and update the real table with the records from the temporary table
+once in a while.
+
+This can be done with the following code:
+@example
+mysql> LOCK TABLES real_table WRITE, insert_table WRITE;
+mysql> INSERT INTO real_table SELECT * FROM insert_table;
+mysql> TRUNCATE TABLE insert_table;
+mysql> UNLOCK TABLES;
+@end example
+
+You can use the @code{LOW_PRIORITY} option with @code{INSERT},
+@code{UPDATE}, or @code{DELETE}, or @code{HIGH_PRIORITY} with
+@code{SELECT}, if you want to prioritize retrieval in some specific
+cases. You can also start @code{mysqld} with @code{--low-priority-updates}
+to get the same behavior.
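+
+For example, the following statements give retrieval the priority
+(@code{t} is a hypothetical table):
+
+@example
+mysql> INSERT LOW_PRIORITY INTO t VALUES (1);  # t is hypothetical
+mysql> SELECT HIGH_PRIORITY * FROM t;
+@end example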
+
+Using @code{SQL_BUFFER_RESULT} can also help make table locks shorter.
+@xref{SELECT}.
+
+You could also change the locking code in @file{mysys/thr_lock.c} to use a
+single queue. In this case, write locks and read locks would have the same
+priority, which might help some applications.
+
+
+@node Table locking, , Internal locking, Locking Issues
+@subsection Table Locking Issues
+
+@cindex problems, table locking
+
+The table locking code in @strong{MySQL} is deadlock free.
+
+@strong{MySQL} uses table locking (instead of row locking or column
+locking) on all table types, except @code{BDB} tables, to achieve a very
+high lock speed. For large tables, table locking is MUCH better than
+row locking for most applications, but there are, of course, some
+pitfalls.
+
+For @code{BDB} and @code{InnoDB} tables, @strong{MySQL} only uses table
+locking if you explicitly lock the table with @code{LOCK TABLES} or
+execute a command that will modify every row in the table, like
+@code{ALTER TABLE}. For these table types we recommend that you not use
+@code{LOCK TABLES} at all.
+
+In @strong{MySQL} Version 3.23.7 and above, you can insert rows into
+@code{MyISAM} tables at the same time other threads are reading from the
+table. Note that currently this only works if there are no holes after
+deleted rows in the table at the time the insert is made. When all holes
+have been filled with new data, concurrent inserts will automatically be
+enabled again.
+
+Table locking enables many threads to read from a table at the same
+time, but if a thread wants to write to a table, it must first get
+exclusive access. During the update, all other threads that want to
+access this particular table will wait until the update is finished.
+
+As updates on tables normally are considered to be more important than
+@code{SELECT}, all statements that update a table have higher priority
+than statements that retrieve information from a table. This should
+ensure that updates are not 'starved' because someone issues a lot of
+heavy queries against a specific table. (You can change this by using
+@code{LOW_PRIORITY} with the statement that does the update or
+@code{HIGH_PRIORITY} with the @code{SELECT} statement.)
+
+Starting from @strong{MySQL} Version 3.23.7, you can use the
+@code{max_write_lock_count} variable to force @strong{MySQL} to
+temporarily give all @code{SELECT} statements that are waiting for a
+table a higher priority after a specific number of inserts on a table.
+
+Table locking is, however, not very good under the following scenario:
+
+@itemize @bullet
+@item
+A client issues a @code{SELECT} that takes a long time to run.
+@item
+Another client then issues an @code{UPDATE} on a used table. This client
+will wait until the @code{SELECT} is finished.
+@item
+Another client issues another @code{SELECT} statement on the same table. As
+@code{UPDATE} has higher priority than @code{SELECT}, this @code{SELECT}
+will wait for the @code{UPDATE} to finish. It will also wait for the first
+@code{SELECT} to finish!
+@item
+A thread is waiting for something like @code{full disk}, in which case all
+threads that want to access the problem table will also be put in a waiting
+state until more disk space is made available.
+@end itemize
+
+Some possible solutions to this problem are:
+
+@itemize @bullet
+@item
+Try to get the @code{SELECT} statements to run faster. You may have to create
+some summary tables to do this.
+
+@item
+Start @code{mysqld} with @code{--low-priority-updates}. This will give
+all statements that update (modify) a table lower priority than a @code{SELECT}
+statement. In this case the last @code{SELECT} statement in the previous
+scenario would execute before the @code{UPDATE} statement.
+
+@item
+You can give a specific @code{INSERT}, @code{UPDATE}, or @code{DELETE}
+statement lower priority with the @code{LOW_PRIORITY} attribute.
+
+@item
+Start @code{mysqld} with a low value for @code{max_write_lock_count} to
+allow @code{READ} locks after a certain number of @code{WRITE} locks.
+
+@item
+You can specify that all updates from a specific thread should be done with
+low priority by using the SQL command: @code{SET SQL_LOW_PRIORITY_UPDATES=1}.
+@xref{SET OPTION, , @code{SET OPTION}}.
+
+@item
+You can specify that a specific @code{SELECT} is very important with the
+@code{HIGH_PRIORITY} attribute. @xref{SELECT, , @code{SELECT}}.
+
+@item
+If you have problems with @code{INSERT} combined with @code{SELECT},
+switch to the new @code{MyISAM} tables, as these support concurrent
+@code{SELECT}s and @code{INSERT}s.
+
+@item
+If you mainly mix @code{INSERT} and @code{SELECT} statements, the
+@code{DELAYED} attribute to @code{INSERT} will probably solve your problems.
+@xref{INSERT, , @code{INSERT}}.
+
+@item
+If you have problems with @code{SELECT} and @code{DELETE}, the @code{LIMIT}
+option to @code{DELETE} may help. @xref{DELETE, , @code{DELETE}}.
+@end itemize
+
+
+@node Optimizing Database Structure, Optimizing the Server, Locking Issues, MySQL Optimization
+@section Optimizing Database Structure
+
+
+@menu
+* Design::
+* Data size::
+* MySQL indexes::
+* Indexes::
+* Multiple-column indexes::
+* Table cache::
+* Creating many tables::
+* Open tables::
+@end menu
+
+@node Design, Data size, Optimizing Database Structure, Optimizing Database Structure
+@subsection Design Choices
+
+@cindex design, choices
+@cindex database design
+@cindex storage of data
+
+@strong{MySQL} keeps row data and index data in separate files. Many (almost
+all) other databases mix row and index data in the same file. We believe that
+the @strong{MySQL} choice is better for a very wide range of modern systems.
+
+Another way to store the row data is to keep the information for each
+column in a separate area (examples are SDBM and Focus). This will cause a
+performance hit for every query that accesses more than one column. Because
+this degenerates so quickly when more than one column is accessed,
+we believe that this model is not good for general purpose databases.
+
+The more common case is that the index and data are stored together
+(as in Oracle/Sybase et al). In this case you will find the row
+information in the leaf page of the index. The good thing with this
+layout is that, in many cases, depending on how well the index is
+cached, it saves a disk read. The bad things with this layout are:
+
+@itemize @bullet
+@item
+Table scanning is much slower because you have to read through the indexes
+to get at the data.
+@item
+You can't use only the index table to retrieve data for a query.
+@item
+You lose a lot of space, as you must duplicate indexes from the nodes
+(as you can't store the row in the nodes).
+@item
+Deletes will degenerate the table over time (as indexes in nodes are
+usually not updated on delete).
+@item
+It's harder to cache ONLY the index data.
+@end itemize
+
+
+@node Data size, MySQL indexes, Design, Optimizing Database Structure
+@subsection Get Your Data as Small as Possible
+
+@cindex data, size
+@cindex reducing, data size
+@cindex storage space, minimizing
+@cindex tables, improving performance
+@cindex performance, improving
+
+One of the most basic optimizations is to get your data (and indexes) to
+take as little space on the disk (and in memory) as possible. This can
+give huge improvements because disk reads are faster and normally less
+main memory will be used. Indexing also takes fewer resources if
+done on smaller columns.
+
+@strong{MySQL} supports a lot of different table types and row formats.
+Choosing the right table format may give you a big performance gain.
+@xref{Table types}.
+
+You can get better performance on a table and minimize storage space
+using the techniques listed below:
+
+@itemize @bullet
+@item
+Use the most efficient (smallest) types possible. @strong{MySQL} has
+many specialized types that save disk space and memory.
+
+@item
+Use the smaller integer types if possible to get smaller tables. For
+example, @code{MEDIUMINT} is often better than @code{INT}.
+
+@item
+Declare columns to be @code{NOT NULL} if possible. It makes everything
+faster and you save one bit per column. Note that if you really need
+@code{NULL} in your application you should definitely use it. Just avoid
+having it on all columns by default.
+
+@item
+If you don't have any variable-length columns (@code{VARCHAR},
+@code{TEXT}, or @code{BLOB} columns), a fixed-size record format is
+used. This is faster but unfortunately may waste some space.
+@xref{MyISAM table formats}.
+
+@item
+The primary index of a table should be as short as possible. This makes
+identification of one row easy and efficient.
+
+@item
+For each table, you have to decide which storage/index method to
+use. @xref{Table types}.
+
+@item
+Only create the indexes that you really need. Indexes are good for
+retrieval but bad when you need to store things fast. If you mostly
+access a table by searching on a combination of columns, make an index
+on them. The first index part should be the most used column. If you are
+ALWAYS using many columns, you should use the column with more duplicates
+first to get better compression of the index.
+
+@item
+If it's very likely that a column has a unique prefix on the first few
+characters, it's better to only index this prefix. @strong{MySQL}
+supports an index on a part of a character column. Shorter indexes are
+faster not only because they take less disk space but also because they
+will give you more hits in the index cache and thus fewer disk
+seeks. @xref{Server parameters}.
+
+@item
+In some circumstances it can be beneficial to split a table that is
+scanned very often into two. This is especially true if it is a dynamic
+format table and it is possible to use a smaller static format table that
+can be used to find the relevant rows when scanning the table.
+@end itemize
+
+
+@node MySQL indexes, Indexes, Data size, Optimizing Database Structure
+@subsection How MySQL Uses Indexes
+
+@cindex indexes, use of
+
+Indexes are used to find rows with a specific value in one column
+quickly. Without an index @strong{MySQL} has to start with the first record
+and then read through the whole table until it finds the relevant
+rows. The bigger the table, the more this costs. If the table has an index
+for the columns in question, @strong{MySQL} can quickly get a position to
+seek to in the middle of the data file without having to look at all the
+data. If a table has 1000 rows, this is at least 100 times faster than
+reading sequentially. Note that if you need to access almost all 1000
+rows it is faster to read sequentially because we then avoid disk seeks.
+
+All @strong{MySQL} indexes (@code{PRIMARY}, @code{UNIQUE}, and
+@code{INDEX}) are stored in B-trees. Strings are automatically prefix-
+and end-space compressed. @xref{CREATE INDEX, , @code{CREATE INDEX}}.
+
+Indexes are used to:
+@itemize @bullet
+@item
+Quickly find the rows that match a @code{WHERE} clause.
+
+@item
+Retrieve rows from other tables when performing joins.
+
+@item
+Find the @code{MAX()} or @code{MIN()} value for a specific indexed
+column. This is optimized by a preprocessor that checks if you are
+using @code{WHERE} key_part_# = constant on all key parts < N. In this case
+@strong{MySQL} will do a single key lookup and replace the @code{MIN()}
+expression with a constant. If all expressions are replaced with
+constants, the query will return at once:
+
+@example
+SELECT MIN(key_part2),MAX(key_part2) FROM table_name WHERE key_part1=10
+@end example
+
+@item
+Sort or group a table if the sorting or grouping is done on a leftmost
+prefix of a usable key (for example, @code{ORDER BY key_part1,key_part2}). The
+key is read in reverse order if all key parts are followed by @code{DESC}.
+
+The index can also be used even if the @code{ORDER BY} doesn't match the index
+exactly, as long as all the unused index parts and all the extra
+@code{ORDER BY} columns are constants in the @code{WHERE} clause. The
+following queries will use the index to resolve the @code{ORDER BY} part:
+
+@example
+SELECT * FROM foo ORDER BY key_part1,key_part2,key_part3;
+SELECT * FROM foo WHERE column=constant ORDER BY column, key_part1;
+SELECT * FROM foo WHERE key_part1=const GROUP BY key_part2;
+@end example
+
+@item
+In some cases a query can be optimized to retrieve values without
+consulting the data file. If all used columns for some table are numeric
+and form a leftmost prefix for some key, the values may be retrieved
+from the index tree for greater speed:
+
+@example
+SELECT key_part3 FROM table_name WHERE key_part1=1
+@end example
+
+@end itemize
+
+Suppose you issue the following @code{SELECT} statement:
+
+@example
+mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
+@end example
+
+If a multiple-column index exists on @code{col1} and @code{col2}, the
+appropriate rows can be fetched directly. If separate single-column
+indexes exist on @code{col1} and @code{col2}, the optimizer tries to
+find the most restrictive index by deciding which index will find fewer
+rows and using that index to fetch the rows.
+
+@cindex indexes, leftmost prefix of
+@cindex leftmost prefix of indexes
+If the table has a multiple-column index, any leftmost prefix of the
+index can be used by the optimizer to find rows. For example, if you
+have a three-column index on @code{(col1,col2,col3)}, you have indexed
+search capabilities on @code{(col1)}, @code{(col1,col2)}, and
+@code{(col1,col2,col3)}.
+
+@strong{MySQL} can't use a partial index if the columns don't form a
+leftmost prefix of the index. Suppose you have the @code{SELECT}
+statements shown below:
+
+@example
+mysql> SELECT * FROM tbl_name WHERE col1=val1;
+mysql> SELECT * FROM tbl_name WHERE col2=val2;
+mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;
+@end example
+
+If an index exists on @code{(col1,col2,col3)}, only the first query
+shown above uses the index. The second and third queries do involve
+indexed columns, but @code{(col2)} and @code{(col2,col3)} are not
+leftmost prefixes of @code{(col1,col2,col3)}.
+
+@findex LIKE, and indexes
+@findex LIKE, and wildcards
+@cindex indexes, and @code{LIKE}
+@cindex wildcards, and @code{LIKE}
+@strong{MySQL} also uses indexes for @code{LIKE} comparisons if the argument
+to @code{LIKE} is a constant string that doesn't start with a wild-card
+character. For example, the following @code{SELECT} statements use indexes:
+
+@example
+mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%";
+mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%";
+@end example
+
+In the first statement, only rows with @code{"Patrick" <= key_col <
+"Patricl"} are considered. In the second statement, only rows with
+@code{"Pat" <= key_col < "Pau"} are considered.
+
+The following @code{SELECT} statements will not use indexes:
+@example
+mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%";
+mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col;
+@end example
+
+In the first statement, the @code{LIKE} value begins with a wild-card
+character. In the second statement, the @code{LIKE} value is not a
+constant.
+
+@findex IS NULL, and indexes
+@cindex indexes, and @code{IS NULL}
+Searching using @code{column_name IS NULL} will use indexes if
+@code{column_name} is indexed.
+
+@strong{MySQL} normally uses the index that finds the least number of rows. An
+index is used for columns that you compare with the following operators:
+@code{=}, @code{>}, @code{>=}, @code{<}, @code{<=}, @code{BETWEEN}, and a
+@code{LIKE} with a non-wild-card prefix like @code{'something%'}.
+
+Any index that doesn't span all @code{AND} levels in the @code{WHERE} clause
+is not used to optimize the query. In other words: To be able to use an
+index, a prefix of the index must be used in every @code{AND} group.
+
+The following @code{WHERE} clauses use indexes:
+@example
+... WHERE index_part1=1 AND index_part2=2 AND other_column=3
+... WHERE index=1 OR A=10 AND index=2 /* index = 1 OR index = 2 */
+... WHERE index_part1='hello' AND index_part3=5
+ /* optimized like "index_part1='hello'" */
+... WHERE index1=1 and index2=2 or index1=3 and index3=3;
+    /* Can use index on index1 but not on index2 or index3 */
+@end example
+
+These @code{WHERE} clauses do @strong{NOT} use indexes:
+@example
+... WHERE index_part2=1 AND index_part3=2 /* index_part1 is not used */
+... WHERE index=1 OR A=10 /* Index is not used in both AND parts */
+... WHERE index_part1=1 OR index_part2=10 /* No index spans all rows */
+@end example
+
+Note that in some cases @strong{MySQL} will not use an index, even if one
+would be available. Some of the cases where this happens are:
+
+@itemize @bullet
+@item
+If the use of the index would require @strong{MySQL} to access more
+than 30 % of the rows in the table. (In this case a table scan is
+probably much faster, as it will require many fewer seeks.)
+Note that if such a query uses @code{LIMIT} to only retrieve
+part of the rows, @strong{MySQL} will use an index anyway, as it can
+much more quickly find the few rows to return in the result.
+@end itemize
+
+
+
+@node Indexes, Multiple-column indexes, MySQL indexes, Optimizing Database Structure
+@subsection Column Indexes
+
+@cindex indexes, columns
+@cindex columns, indexes
+@cindex keys
+
+All @strong{MySQL} column types can be indexed. Use of indexes on the
+relevant columns is the best way to improve the performance of @code{SELECT}
+operations.
+
+The maximum number of keys and the maximum index length are defined per
+table handler. @xref{Table types}. With all table handlers you can have
+at least 16 keys and a total index length of at least 256 bytes.
+
+For @code{CHAR} and @code{VARCHAR} columns, you can index a prefix of a
+column. This is much faster and requires less disk space than indexing the
+whole column. The syntax to use in the @code{CREATE TABLE} statement to
+index a column prefix looks like this:
+
+@example
+KEY index_name (col_name(length))
+@end example
+
+The example below creates an index for the first 10 characters of the
+@code{name} column:
+
+@example
+mysql> CREATE TABLE test (
+ name CHAR(200) NOT NULL,
+ KEY index_name (name(10)));
+@end example
+
+For @code{BLOB} and @code{TEXT} columns, you must index a prefix of the
+column. You cannot index the entire column.
+
+In @strong{MySQL} Version 3.23.23 or later, you can also create special
+@strong{FULLTEXT} indexes. They are used for full-text search. Only the
+@code{MyISAM} table type supports @code{FULLTEXT} indexes. They can be
+created only from @code{VARCHAR} and @code{TEXT} columns.
+Indexing always happens over the entire column and partial indexing is not
+supported. See @ref{Fulltext Search} for details.
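+
+For example, a minimal @code{MyISAM} table with a full-text index might be
+created like this (the table and column names are hypothetical):
+
+@example
+mysql> CREATE TABLE articles (      # 'articles' is a hypothetical table
+           title VARCHAR(200),
+           body TEXT,
+           FULLTEXT (title,body));
+@end example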
+
+@node Multiple-column indexes, Table cache, Indexes, Optimizing Database Structure
+@subsection Multiple-Column Indexes
+
+@cindex multi-column indexes
+@cindex indexes, multi-column
+@cindex keys, multi-column
+
+@strong{MySQL} can create indexes on multiple columns. An index may
+consist of up to 15 columns. (On @code{CHAR} and @code{VARCHAR} columns you
+can also use a prefix of the column as a part of an index).
+
+A multiple-column index can be considered a sorted array containing values
+that are created by concatenating the values of the indexed columns.
+
+@strong{MySQL} uses multiple-column indexes in such a way that queries are
+fast when you specify a known quantity for the first column of the index in a
+@code{WHERE} clause, even if you don't specify values for the other columns.
+
+Suppose a table is created using the following specification:
+
+@example
+mysql> CREATE TABLE test (
+ id INT NOT NULL,
+ last_name CHAR(30) NOT NULL,
+ first_name CHAR(30) NOT NULL,
+ PRIMARY KEY (id),
+ INDEX name (last_name,first_name));
+@end example
+
+Then the index @code{name} is an index over @code{last_name} and
+@code{first_name}. The index will be used for queries that specify
+values in a known range for @code{last_name}, or for both @code{last_name}
+and @code{first_name}.
+Therefore, the @code{name} index will be used in the following queries:
+
+@example
+mysql> SELECT * FROM test WHERE last_name="Widenius";
+
+mysql> SELECT * FROM test WHERE last_name="Widenius"
+ AND first_name="Michael";
+
+mysql> SELECT * FROM test WHERE last_name="Widenius"
+ AND (first_name="Michael" OR first_name="Monty");
+
+mysql> SELECT * FROM test WHERE last_name="Widenius"
+ AND first_name >="M" AND first_name < "N";
+@end example
+
+However, the @code{name} index will NOT be used in the following queries:
+
+@example
+mysql> SELECT * FROM test WHERE first_name="Michael";
+
+mysql> SELECT * FROM test WHERE last_name="Widenius"
+ OR first_name="Michael";
+@end example
+
+For more information on the manner in which @strong{MySQL} uses indexes to
+improve query performance, see @ref{MySQL indexes, , @strong{MySQL}
+indexes}.
+
+
+@node Table cache, Creating many tables, Multiple-column indexes, Optimizing Database Structure
+@subsection How MySQL Opens and Closes Tables
+
+@findex table_cache
+
+@cindex tables, opening
+@cindex tables, closing
+@cindex opening, tables
+@cindex closing, tables
+@cindex table cache
+
+@code{table_cache}, @code{max_connections}, and @code{max_tmp_tables}
+affect the maximum number of files the server keeps open. If you
+increase one or more of these values, you may run up against a limit
+imposed by your operating system on the per-process number of open file
+descriptors. However, you can increase the limit on many systems.
+Consult your OS documentation to find out how to do this, because the
+method for changing the limit varies widely from system to system.
+
+@code{table_cache} is related to @code{max_connections}. For example,
+for 200 concurrent running connections, you should have a table cache of
+at least @code{200 * n}, where @code{n} is the maximum number of tables
+in a join. You also need to reserve some extra file descriptors for
+temporary tables and files.
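+
+For example, if your queries join at most four tables and you allow 200
+concurrent connections, the formula above suggests a table cache of at
+least @code{200 * 4 = 800} entries:
+
+@example
+shell> safe_mysqld -O table_cache=800 &
+@end example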
+
+The cache of open tables can grow to a maximum of @code{table_cache}
+(default 64; this can be changed with the @code{-O table_cache=#}
+option to @code{mysqld}). A table is never closed, except when the
+cache is full and another thread tries to open a table or if you use
+@code{mysqladmin refresh} or @code{mysqladmin flush-tables}.
+
+When the table cache fills up, the server uses the following procedure
+to locate a cache entry to use:
+
+@itemize @bullet
+@item
+Tables that are not currently in use are released, in least-recently-used
+order.
+
+@item
+If the cache is full and no tables can be released, but a new table needs to
+be opened, the cache is temporarily extended as necessary.
+
+@item
+If the cache is in a temporarily-extended state and a table goes from in-use
+to not-in-use state, the table is closed and released from the cache.
+@end itemize
+
+A table is opened for each concurrent access. This means that
+if you have two threads accessing the same table or access the table
+twice in the same query (with @code{AS}) the table needs to be opened twice.
+The first open of any table takes two file descriptors; each additional
+use of the table takes only one file descriptor. The extra descriptor
+for the first open is used for the index file; this descriptor is shared
+among all threads.
+
+You can check if your table cache is too small by checking the
+@code{mysqld} status variable @code{Opened_tables}. If this is quite
+big, even if you haven't done a lot of @code{FLUSH TABLES}, you should
+increase your table cache. @xref{SHOW STATUS}.
+
+
+@node Creating many tables, Open tables, Table cache, Optimizing Database Structure
+@subsection Drawbacks to Creating Large Numbers of Tables in the Same Database
+
+@cindex tables, too many
+
+If you have many files in a directory, open, close, and create operations will
+be slow. If you execute @code{SELECT} statements on many different tables,
+there will be a little overhead when the table cache is full, because for
+every table that has to be opened, another must be closed. You can reduce
+this overhead by making the table cache larger.
+
+
+@node Open tables, , Creating many tables, Optimizing Database Structure
+@subsection Why So Many Open Tables?
+
+@cindex tables, open
+@cindex open tables
+
+When you run @code{mysqladmin status}, you'll see something like this:
+
+@example
+Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
+@end example
+
+This can be somewhat perplexing if you only have 6 tables.
+
+@strong{MySQL} is multithreaded, so it may have many queries on the same
+table simultaneously. To minimize the problem with two threads having
+different states on the same file, the table is opened independently by
+each concurrent thread. This takes some memory and one extra file
+descriptor for the data file. The index file descriptor is shared
+between all threads.
+
+
+@node Optimizing the Server, Disk issues, Optimizing Database Structure, MySQL Optimization
+@section Optimizing the MySQL Server
+
+
+@menu
+* System::
+* Server parameters::
+* Compile and link options::
+* Memory use::
+* DNS::
+* SET OPTION::
+@end menu
+
+@node System, Server parameters, Optimizing the Server, Optimizing the Server
+@subsection System/Compile Time and Startup Parameter Tuning
+
+@cindex compiling, optimizing
+@cindex system optimization
+@cindex startup parameters, tuning
+
+We start with the system-level things, since some of these decisions have
+to be made very early. In other cases a quick look at this part may
+suffice, because it is not that important for the big gains. However, it
+is always nice to have a feeling about how much one could gain by
+changing things at this level.
+
+The default OS to use is really important! To get the most out of
+multiple-CPU machines, one should use Solaris (because its thread
+implementation works really well) or Linux (because the 2.2 kernel has
+really good SMP support). Also, on 32-bit machines Linux has a 2G file
+size limit by default. Hopefully this will be fixed soon when new
+filesystems are released (XFS/Reiserfs). If you have a desperate need
+for files bigger than 2G on Linux-Intel 32-bit, you should get the LFS
+patch for the ext2 file system.
+
+Because we have not run @strong{MySQL} in production on that many platforms, we
+advise you to test your intended platform before choosing it, if possible.
+
+@cindex locking
+Other tips:
+@itemize @bullet
+@item
+If you have enough RAM, you could remove all swap devices. Some
+operating systems will use a swap device in some contexts even if you
+have free memory.
+@item
+Use the @code{--skip-locking} @strong{MySQL} option to avoid external
+locking. Note that this will not impact @strong{MySQL}'s functionality as
+long as you only run one server. Just remember to take down the server (or
+lock relevant parts) before you run @code{myisamchk}. On some systems
+this switch is mandatory, because the external locking does not work in
+any case.
+
+The @code{--skip-locking} option is on by default when compiling with
+MIT-pthreads, because @code{flock()} isn't fully supported by
+MIT-pthreads on all platforms. It's also on by default for Linux,
+as Linux file locking is not yet safe.
+
+The only case when you can't use @code{--skip-locking} is if you run
+multiple @strong{MySQL} @emph{servers} (not clients) on the same data,
+or if you run @code{myisamchk} on a table without first flushing and
+locking the @code{mysqld} server tables.
+
+You can still use @code{LOCK TABLES}/@code{UNLOCK TABLES} even if you
+are using @code{--skip-locking}.
+@end itemize
+
+
+@node Server parameters, Compile and link options, System, Optimizing the Server
+@subsection Tuning Server Parameters
+
+@cindex parameters, server
+@cindex @code{mysqld} server, buffer sizes
+@cindex buffer sizes, @code{mysqld} server
+@cindex startup parameters
+
+You can get the default buffer sizes used by the @code{mysqld} server
+with this command:
+
+@example
+shell> mysqld --help
+@end example
+
+@cindex @code{mysqld} options
+@cindex variables, @code{mysqld}
+This command produces a list of all @code{mysqld} options and configurable
+variables. The output includes the default values and looks something
+like this:
+
+@example
+Possible variables for option --set-variable (-O) are:
+back_log current value: 5
+bdb_cache_size current value: 1048540
+binlog_cache_size current_value: 32768
+connect_timeout current value: 5
+delayed_insert_timeout current value: 300
+delayed_insert_limit current value: 100
+delayed_queue_size current value: 1000
+flush_time current value: 0
+interactive_timeout current value: 28800
+join_buffer_size current value: 131072
+key_buffer_size current value: 1048540
+lower_case_table_names current value: 0
+long_query_time current value: 10
+max_allowed_packet current value: 1048576
+max_binlog_cache_size current_value: 4294967295
+max_connections current value: 100
+max_connect_errors current value: 10
+max_delayed_threads current value: 20
+max_heap_table_size current value: 16777216
+max_join_size current value: 4294967295
+max_sort_length current value: 1024
+max_tmp_tables current value: 32
+max_write_lock_count current value: 4294967295
+myisam_sort_buffer_size current value: 8388608
+net_buffer_length current value: 16384
+net_retry_count current value: 10
+net_read_timeout current value: 30
+net_write_timeout current value: 60
+query_buffer_size current value: 0
+record_buffer current value: 131072
+slow_launch_time current value: 2
+sort_buffer current value: 2097116
+table_cache current value: 64
+thread_concurrency current value: 10
+tmp_table_size current value: 1048576
+thread_stack current value: 131072
+wait_timeout current value: 28800
+@end example
+
+If there is a @code{mysqld} server currently running, you can see what
+values it actually is using for the variables by executing this command:
+
+@example
+shell> mysqladmin variables
+@end example
+
+You can find a full description for all variables in the @code{SHOW VARIABLES}
+section in this manual. @xref{SHOW VARIABLES}.
+
+You can also see some statistics from a running server by issuing the command
+@code{SHOW STATUS}. @xref{SHOW STATUS}.
+
+@strong{MySQL} uses algorithms that are very scalable, so you can usually
+run with very little memory. If you, however, give @strong{MySQL} more
+memory, you will normally also get better performance.
+
+When tuning a @strong{MySQL} server, the two most important variables to use
+are @code{key_buffer_size} and @code{table_cache}. You should first feel
+confident that you have these right before trying to change any of the
+other variables.
+
+If you have much memory (>=256M) and many tables and want maximum performance
+with a moderate number of clients, you should use something like this:
+
+@example
+shell> safe_mysqld -O key_buffer=64M -O table_cache=256 \
+ -O sort_buffer=4M -O record_buffer=1M &
+@end example
+
+If you have only 128M and only a few tables, but you still do a lot of
+sorting, you can use something like:
+
+@example
+shell> safe_mysqld -O key_buffer=16M -O sort_buffer=1M
+@end example
+
+If you have little memory and lots of connections, use something like this:
+
+@example
+shell> safe_mysqld -O key_buffer=512k -O sort_buffer=100k \
+ -O record_buffer=100k &
+@end example
+
+or even:
+
+@example
+shell> safe_mysqld -O key_buffer=512k -O sort_buffer=16k \
+ -O table_cache=32 -O record_buffer=8k -O net_buffer=1K &
+@end example
+
+When you have installed @strong{MySQL}, the @file{support-files} directory will
+contain some different @code{my.cnf} example files: @file{my-huge.cnf},
+@file{my-large.cnf}, @file{my-medium.cnf}, and @file{my-small.cnf}. You can
+use these as a base to optimize your system.
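+
+As a sketch, such a file sets server parameters in its @code{[mysqld]}
+section (the values here are only illustrative, not recommendations):
+
+@example
+# illustrative values only
+[mysqld]
+set-variable = key_buffer=16M
+set-variable = table_cache=64
+set-variable = sort_buffer=1M
+@end example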
+
+If there are very many connections, ``swapping problems'' may occur unless
+@code{mysqld} has been configured to use very little memory for each
+connection. @code{mysqld} performs better if you have enough memory for all
+connections, of course.
+
+Note that if you change an option to @code{mysqld}, it remains in effect only
+for that instance of the server.
+
+To see the effects of a parameter change, do something like this:
+
+@example
+shell> mysqld -O key_buffer=32m --help
+@end example
+
+Make sure that the @code{--help} option is last; otherwise, the effect of any
+options listed after it on the command line will not be reflected in the
+output.
+
+
+@node Compile and link options, Memory use, Server parameters, Optimizing the Server
+@subsection How Compiling and Linking Affects the Speed of MySQL
+
+@cindex linking, speed
+@cindex compiling, speed
+@cindex speed, compiling
+@cindex speed, linking
+
+Most of the following tests are done on Linux with the
+@strong{MySQL} benchmarks, but they should give some indication for
+other operating systems and workloads.
+
+You get the fastest executable when you link with @code{-static}.
+
+On Linux, you will get the fastest code when compiling with @code{pgcc}
+and @code{-O3}. To compile @file{sql_yacc.cc} with these options, you
+need about 200M memory because @code{gcc/pgcc} needs a lot of memory to
+make all functions inline. You should also set @code{CXX=gcc} when
+configuring @strong{MySQL} to avoid inclusion of the @code{libstdc++}
+library (it is not needed). Note that with some versions of @code{pgcc},
+the resulting code will only run on true Pentium processors, even if you
+use the compiler option that you want the resulting code to be working on
+all x586 type processors (like AMD).
+
+By just using a better compiler and/or better compiler options you can
+get a 10-30 % speed increase in your application. This is particularly
+important if you compile the SQL server yourself!
+
+We have tested both the Cygnus CodeFusion and Fujitsu compilers, but
+when we tested them, neither was sufficiently bug free to allow
+@strong{MySQL} to be compiled with optimizations on.
+
+When you compile @strong{MySQL} you should only include support for the
+character sets that you are going to use. (Option @code{--with-charset=xxx}).
+The standard @strong{MySQL} binary distributions are compiled with support
+for all character sets.
+
+Here is a list of some measurements that we have done:
+@itemize @bullet
+@item
+If you use @code{pgcc} and compile everything with @code{-O6}, the
+@code{mysqld} server is 1% faster than with @code{gcc} 2.95.2.
+
+@item
+If you link dynamically (without @code{-static}), the result is 13%
+slower on Linux. Note that you still can use a dynamic linked
+@strong{MySQL} library. It is only the server that is critical for
+performance.
+
+@item
+If you strip your @code{mysqld} binary with @code{strip libexec/mysqld},
+the resulting binary can be up to 4 % faster.
+
+@item
+If you connect using TCP/IP rather than Unix sockets, the result is 7.5%
+slower on the same computer. (If you are connecting to @code{localhost},
+@strong{MySQL} will, by default, use sockets.)
+
+@item
+If you connect using TCP/IP from another computer over a 100M Ethernet,
+things will be 8-11 % slower.
+
+@item
+If you compile with @code{--with-debug=full}, then you will lose 20 %
+for most queries, but some queries may take substantially longer (the
+@strong{MySQL} benchmarks ran 35 % slower).
+If you use @code{--with-debug}, then you will only lose 15 %.
+By starting a @code{mysqld} version compiled with @code{--with-debug=full}
+with @code{--skip-safemalloc}, the end result should be close to when
+configuring with @code{--with-debug}.
+
+@item
+On a Sun SPARCstation 20, SunPro C++ 4.2 is 5 % faster than @code{gcc} 2.95.2.
+
+@item
+Compiling with @code{gcc} 2.95.2 for ultrasparc with the option
+@code{-mcpu=v8 -Wa,-xarch=v8plusa} gives 4 % more performance.
+
+@item
+On Solaris 2.5.1, MIT-pthreads is 8-12% slower than Solaris native
+threads on a single processor. With more load/CPUs the difference should
+get bigger.
+
+@item
+Running with @code{--log-bin} makes @strong{MySQL} 1 % slower.
+
+@item
+Compiling on Linux-x86 using gcc without frame pointers
+(@code{-fomit-frame-pointer} or @code{-fomit-frame-pointer -ffixed-ebp})
+makes @code{mysqld} 1-4% faster.
+@end itemize
+
+The @strong{MySQL}-Linux distribution provided by @strong{MySQL AB} used
+to be compiled with @code{pgcc}, but we had to go back to regular gcc
+because of a bug in @code{pgcc} that would generate code that does
+not run on AMD. We will continue using gcc until that bug is resolved.
+In the meantime, if you have a non-AMD machine, you can get a faster
+binary by compiling with @code{pgcc}. The standard @strong{MySQL}
+Linux binary is linked statically to get it faster and more portable.
+
+
+@node Memory use, DNS, Compile and link options, Optimizing the Server
+@subsection How MySQL Uses Memory
+
+@cindex memory use
+
+The list below indicates some of the ways that the @code{mysqld} server
+uses memory. Where applicable, the name of the server variable relevant
+to the memory use is given:
+
+@itemize @bullet
+@item
+The key buffer (variable @code{key_buffer_size}) is shared by all
+threads; other buffers used by the server are allocated as
+needed. @xref{Server parameters}.
+
+@item
+Each connection uses some thread-specific space: A stack (default 64K,
+variable @code{thread_stack}), a connection buffer (variable
+@code{net_buffer_length}), and a result buffer (variable
+@code{net_buffer_length}). The connection buffer and result buffer are
+dynamically enlarged up to @code{max_allowed_packet} when needed. When
+a query is running, a copy of the current query string is also allocated.
+
+@item
+All threads share the same base memory.
+
+@item
+Only the compressed ISAM / MyISAM tables are memory mapped. This is
+because the 32-bit memory space of 4GB is not large enough for most
+big tables. When systems with a 64-bit address space become more
+common we may add general support for memory mapping.
+
+@item
+Each request doing a sequential scan over a table allocates a read buffer
+(variable @code{record_buffer}).
+
+@item
+All joins are done in one pass, and most joins can be done without even
+using a temporary table. Most temporary tables are memory-based (HEAP)
+tables. Temporary tables with a big record length (calculated as the
+sum of all column lengths) or that contain @code{BLOB} columns are
+stored on disk.
+
+One problem in @strong{MySQL} versions before Version 3.23.2 is that if a HEAP table
+exceeds the size of @code{tmp_table_size}, you get the error @code{The
+table tbl_name is full}. In newer versions this is handled by
+automatically changing the in-memory (HEAP) table to a disk-based
+(MyISAM) table as necessary. To work around this problem, you can
+increase the temporary table size by setting the @code{tmp_table_size}
+option to @code{mysqld}, or by setting the SQL option
+@code{SQL_BIG_TABLES} in the client program. @xref{SET OPTION, ,
+@code{SET OPTION}}. In @strong{MySQL} Version 3.20, the maximum size of the
+temporary table was @code{record_buffer*16}, so if you are using this
+version, you have to increase the value of @code{record_buffer}. You can
+also start @code{mysqld} with the @code{--big-tables} option to always
+store temporary tables on disk. However, this will affect the speed of
+many complicated queries.
+
+@item
+Most requests doing a sort allocate a sort buffer and 0-2 temporary
+files depending on the result set size. @xref{Temporary files}.
+
+@item
+Almost all parsing and calculating is done in a local memory store. No
+memory overhead is needed for small items and the normal slow memory
+allocation and freeing is avoided. Memory is allocated only for
+unexpectedly large strings (this is done with @code{malloc()} and
+@code{free()}).
+
+@item
+Each index file is opened once and the data file is opened once for each
+concurrently running thread. For each concurrent thread, a table structure,
+column structures for each column, and a buffer of size @code{3 * n} is
+allocated (where @code{n} is the maximum row length, not counting @code{BLOB}
+columns). A @code{BLOB} uses 5 to 8 bytes plus the length of the @code{BLOB}
+data. The @code{ISAM}/@code{MyISAM} table handlers will use one extra row
+buffer for internal usage.
+
+@item
+For each table having @code{BLOB} columns, a buffer is enlarged dynamically
+to read in larger @code{BLOB} values. If you scan a table, a buffer as large
+as the largest @code{BLOB} value is allocated.
+
+@item
+Table handlers for all in-use tables are saved in a cache and managed as a
+FIFO. Normally the cache has 64 entries. If a table has been used by two
+running threads at the same time, the cache contains two entries for the
+table. @xref{Table cache}.
+
+@item
+A @code{mysqladmin flush-tables} command closes all tables that are not in
+use and marks all in-use tables to be closed when the currently executing
+thread finishes. This will effectively free most in-use memory.
+@end itemize
+
+@code{ps} and other system status programs may report that @code{mysqld}
+uses a lot of memory. This may be caused by thread-stacks on different
+memory addresses. For example, the Solaris version of @code{ps} counts
+the unused memory between stacks as used memory. You can verify this by
+checking available swap with @code{swap -s}. We have tested
+@code{mysqld} with commercial memory-leakage detectors, so there should
+be no memory leaks.
+
+
+@node DNS, SET OPTION, Memory use, Optimizing the Server
+@subsection How MySQL uses DNS
+
+@cindex DNS
+@cindex hostname caching
+
+When a new thread connects to @code{mysqld}, @code{mysqld} will spawn a
+new thread to handle the request. This thread will first check if the
+hostname is in the hostname cache. If not, the thread will call
+@code{gethostbyaddr_r()} and @code{gethostbyname_r()} to resolve the
+hostname.
+
+If the operating system doesn't support the above thread-safe calls, the
+thread will lock a mutex and call @code{gethostbyaddr()} and
+@code{gethostbyname()} instead. Note that in this case no other thread
+can resolve hostnames that are not in the hostname cache until the
+first thread is ready.
+
+You can disable DNS host lookup by starting @code{mysqld} with
+@code{--skip-name-resolve}. In this case you can, however, only use IP
+numbers in the @strong{MySQL} privilege tables.
+
+If you have a very slow DNS and many hosts, you can get more performance by
+either disabling DNS lookup with @code{--skip-name-resolve} or by
+increasing the @code{HOST_CACHE_SIZE} define (default: 128) and recompiling
+@code{mysqld}.
+
+You can disable the hostname cache with @code{--skip-host-cache}. You
+can clear the hostname cache with @code{FLUSH HOSTS} or @code{mysqladmin
+flush-hosts}.
+
+If you don't want to allow connections over @code{TCP/IP}, you can do this
+by starting @code{mysqld} with @code{--skip-networking}.
+
+
+@node SET OPTION, , DNS, Optimizing the Server
+@subsection @code{SET} Syntax
+
+@findex SET OPTION
+
+@example
+SET [OPTION] SQL_VALUE_OPTION= value, ...
+@end example
+
+@code{SET OPTION} sets various options that affect the operation of the
+server or your client. Any option you set remains in effect until the
+current session ends, or until you set the option to a different value.
+
+@table @code
+@item CHARACTER SET character_set_name | DEFAULT
+This maps all strings from and to the client with the given mapping.
+Currently the only option for @code{character_set_name} is
+@code{cp1251_koi8}, but you can easily add new mappings by editing the
+@file{sql/convert.cc} file in the @strong{MySQL} source distribution. The
+default mapping can be restored by using a @code{character_set_name} value of
+@code{DEFAULT}.
+
+Note that the syntax for setting the @code{CHARACTER SET} option differs
+from the syntax for setting the other options.
+
+@item PASSWORD = PASSWORD('some password')
+@cindex passwords, setting
+Set the password for the current user. Any non-anonymous user can change his
+own password!
+
+@item PASSWORD FOR user = PASSWORD('some password')
+Set the password for a specific user on the current server host. Only a user
+with access to the @code{mysql} database can do this. The user should be
+given in @code{user@@hostname} format, where @code{user} and @code{hostname}
+are exactly as they are listed in the @code{User} and @code{Host} columns of
+the @code{mysql.user} table entry. For example, if you had an entry with
+@code{User} and @code{Host} fields of @code{'bob'} and @code{'%.loc.gov'},
+you would write:
+
+@example
+mysql> SET PASSWORD FOR bob@@"%.loc.gov" = PASSWORD("newpass");
+
+or
+
+mysql> UPDATE mysql.user SET password=PASSWORD("newpass")
+           WHERE user="bob" AND host="%.loc.gov";
+@end example
+
+@item SQL_AUTO_IS_NULL = 0 | 1
+If set to @code{1} (default), then one can find the last inserted row
+for a table with an @code{AUTO_INCREMENT} column with the following
+construct: @code{WHERE auto_increment_column IS NULL}. This is used by
+some ODBC programs like Access.
+
+@item AUTOCOMMIT= 0 | 1
+If set to @code{1}, all changes to a table will be done at once. To start
+a multi-command transaction, you have to use the @code{BEGIN}
+statement. @xref{COMMIT}. If set to @code{0}, you have to use @code{COMMIT} /
+@code{ROLLBACK} to accept or roll back that transaction (see the example
+after this table). @xref{COMMIT}. Note that when you change from not
+@code{AUTOCOMMIT} mode to @code{AUTOCOMMIT} mode, @strong{MySQL} will do
+an automatic @code{COMMIT} on any open transactions.
+
+@item SQL_BIG_TABLES = 0 | 1
+@cindex table is full
+If set to @code{1}, all temporary tables are stored on disk rather than in
+memory. This will be a little slower, but you will not get the error
+@code{The table tbl_name is full} for big @code{SELECT} operations that
+require a large temporary table. The default value for a new connection is
+@code{0} (that is, use in-memory temporary tables).
+
+@item SQL_BIG_SELECTS = 0 | 1
+If set to @code{0}, @strong{MySQL} will abort if a @code{SELECT} is attempted
+that probably will take a very long time. This is useful when an inadvisable
+@code{WHERE} statement has been issued. A big query is defined as a
+@code{SELECT} that probably will have to examine more than
+@code{max_join_size} rows. The default value for a new connection is
+@code{1} (which will allow all @code{SELECT} statements).
+
+@item SQL_BUFFER_RESULT = 0 | 1
+@code{SQL_BUFFER_RESULT} will force the result from @code{SELECT}'s
+to be put into a temporary table. This will help @strong{MySQL} free the
+table locks early and will help in cases where it takes a long time to
+send the result set to the client.
+
+@item SQL_LOW_PRIORITY_UPDATES = 0 | 1
+If set to @code{1}, all @code{INSERT}, @code{UPDATE}, @code{DELETE}, and
+@code{LOCK TABLE WRITE} statements wait until there is no pending
+@code{SELECT} or @code{LOCK TABLE READ} on the affected table.
+
+@item SQL_MAX_JOIN_SIZE = value | DEFAULT
+Don't allow @code{SELECT}s that will probably need to examine more than
+@code{value} row combinations. By setting this value, you can catch
+@code{SELECT}s where keys are not used properly and that would probably
+take a long time. Setting this to a value other than @code{DEFAULT} will reset
+the @code{SQL_BIG_SELECTS} flag. If you set the @code{SQL_BIG_SELECTS}
+flag again, the @code{SQL_MAX_JOIN_SIZE} variable will be ignored.
+You can set a default value for this variable by starting @code{mysqld} with
+@code{-O max_join_size=#}.
+
+@item SQL_SAFE_MODE = 0 | 1
+If set to @code{1}, @strong{MySQL} will abort if an @code{UPDATE} or
+@code{DELETE} is attempted that doesn't use a key or @code{LIMIT} in the
+@code{WHERE} clause. This makes it possible to catch wrong updates
+when creating SQL commands by hand.
+
+@item SQL_SELECT_LIMIT = value | DEFAULT
+The maximum number of records to return from @code{SELECT} statements. If
+a @code{SELECT} has a @code{LIMIT} clause, the @code{LIMIT} takes precedence
+over the value of @code{SQL_SELECT_LIMIT}. The default value for a new
+connection is ``unlimited.'' If you have changed the limit, the default value
+can be restored by using a @code{SQL_SELECT_LIMIT} value of @code{DEFAULT}.
+
+@item SQL_LOG_OFF = 0 | 1
+If set to @code{1}, no logging will be done to the standard log for this
+client, if the client has the @strong{process} privilege. This does not
+affect the update log!
+
+@item SQL_LOG_UPDATE = 0 | 1
+If set to @code{0}, no logging will be done to the update log for the client,
+if the client has the @strong{process} privilege. This does not affect the
+standard log!
+
+@item SQL_QUOTE_SHOW_CREATE = 0 | 1
+If set to @code{1}, @code{SHOW CREATE TABLE} will quote
+table and column names. This is @strong{on} by default,
+so that replication of tables with fancy column names works.
+@ref{SHOW CREATE TABLE, , @code{SHOW CREATE TABLE}}.
+
+@item TIMESTAMP = timestamp_value | DEFAULT
+Set the time for this client. This is used to get the original timestamp if
+you use the update log to restore rows. @code{timestamp_value} should be a
+UNIX Epoch timestamp, not a @strong{MySQL} timestamp.
+
+@item LAST_INSERT_ID = #
+Set the value to be returned from @code{LAST_INSERT_ID()}. This is stored in
+the update log when you use @code{LAST_INSERT_ID()} in a command that updates
+a table.
+
+@item INSERT_ID = #
+Set the value to be used by the following @code{INSERT} or @code{ALTER TABLE}
+command when inserting an @code{AUTO_INCREMENT} value. This is mainly used
+with the update log.
+@end table
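+
+As a minimal sketch of how the last few options work together when
+restoring rows from the update log (the table, column names, and values
+are only placeholders), a restore session might look like this:
+
+@example
+mysql> SET TIMESTAMP=996765374;
+mysql> SET INSERT_ID=5;
+mysql> INSERT INTO t1 (id,msg) VALUES (NULL,'restored row');
+@end example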
+
+
+@menu
+* SET TRANSACTION::
+@end menu
+
+@node SET TRANSACTION, , SET OPTION, SET OPTION
+@subsubsection @code{SET TRANSACTION} Syntax
+
+@findex ISOLATION LEVEL
+
+@example
+SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL
+[READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE]
+@end example
+
+Sets the transaction isolation level globally, for the whole session,
+or for the next transaction only.
+
+The default behavior is to set the isolation level for the next (not
+yet started) transaction.
+
+Using the @code{GLOBAL} keyword affects all newly created threads.
+You need the @code{PROCESS} privilege to do this.
+
+Using the @code{SESSION} keyword affects the following and all future
+transactions in this session.
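+
+For example:
+
+@example
+mysql> SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
+mysql> SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
+mysql> SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;
+@end example
+
+The first statement affects only the next transaction, the second the
+rest of the session, and the third all newly created threads.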
+
+You can set the default isolation level for @code{mysqld} with
+@code{--transaction-isolation=...}. @xref{Command-line options}.
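+
+For example (on the command line the level is written with dashes
+instead of spaces):
+
+@example
+shell> mysqld --transaction-isolation=READ-COMMITTED
+@end example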
+
+
+@node Disk issues, , Optimizing the Server, MySQL Optimization
+@section Disk Issues
+
+@cindex disk issues
+@cindex performance, disk issues
+
+@itemize @bullet
+@item
+As mentioned before, disk seeks are a big performance bottleneck. This
+problem becomes more and more apparent when the data starts to grow so
+large that effective caching becomes impossible. For large databases,
+where you access data more or less randomly, you can be sure that you
+will need at least one disk seek to read and a couple of disk seeks to
+write things. To minimize this problem, use disks with low seek times.
+
+@item
+Increase the number of available disk spindles (and thereby reduce
+the seek overhead) by either symlinking files to different disks or
+striping the disks.
+
+@table @strong
+@item Using symbolic links
+This means that you symlink the index and/or data file(s) from the
+normal data directory to another disk (that may also be striped). This
+makes both the seek and read times better (if the disks are not used for
+other things). @xref{Symbolic links}.
+
+@cindex striping, defined
+@item Striping
+Striping means that you have many disks and put the first block on the
+first disk, the second block on the second disk, and the Nth on the
+(N mod number_of_disks) disk, and so on. This means if your normal data
+size is less than the stripe size (or perfectly aligned) you will get
+much better performance. Note that striping is very dependent on the OS
+and stripe-size. So benchmark your application with different
+stripe-sizes. @xref{Custom Benchmarks}.
+
+Note that the speed difference for striping is @strong{very} dependent
+on the parameters. Depending on how you set the striping parameters and
+the number of disks, you may get differences of orders of magnitude. Also
+note that you have to choose whether to optimize for random or
+sequential access.
+@end table
+
+@item
+For reliability you may want to use RAID 0+1 (striping + mirroring), but
+in this case you will need 2*N drives to hold N drives of data. This is
+probably the best option if you have the money for it! You may, however,
+also have to invest in some volume-management software to handle it
+efficiently.
+
+@item
+A good option is to keep semi-important data (that can be regenerated)
+on a RAID 0 disk, while storing really important data (like host
+information and logs) on a RAID 0+1 or RAID N disk. RAID N can be a
+problem if you have many writes, because of the time needed to update
+the parity bits.
+
+@item
+You may also set the parameters for the file system that the database
+uses. One easy change is to mount the file system with the
+@code{noatime} option. That makes it skip updating the last access
+time in the inode, which avoids some disk seeks.
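+
+For example (the device name and mount point are only placeholders):
+
+@example
+shell> mount -o noatime /dev/hda3 /usr/local/mysql/data
+@end example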
+
+@item
+On Linux, you can get much more performance (up to 100% under load is
+not uncommon) by using @code{hdparm} to configure your disk's interface! The
+following should be quite good hdparm options for @strong{MySQL} (and
+probably many other applications):
+
+@example
+hdparm -m 16 -d 1
+@end example
+
+Note that the performance/reliability when using the above depends on
+your hardware, so we strongly suggest that you test your system
+thoroughly after using @code{hdparm}! Please consult the @code{hdparm}
+man page for more information! If @code{hdparm} is not used wisely,
+filesystem corruption may result. Backup everything before experimenting!
+
+@item
+On many operating systems you can mount the disks with the @code{async}
+flag to set the file system to be updated asynchronously. If your
+computer is reasonably stable, this should give you more performance
+without sacrificing too much reliability. (This flag is on by default
+on Linux.)
+
+@item
+If you don't need to know when a file was last accessed (which is not
+really useful on a database server), you can mount your file systems
+with the noatime flag.
+@end itemize
+
+@menu
+* Symbolic links::
+@end menu
+
+@node Symbolic links, , Disk issues, Disk issues
+@subsection Using Symbolic Links
+
+@cindex symbolic links
+@cindex links, symbolic
+
+You can move tables and databases from the database directory to other
+locations and replace them with symbolic links to the new locations.
+You might want to do this, for example, to move a database to a file
+system with more free space or increase the speed of your system by
+spreading your tables to different disks.
+
+The recommended way to do this is to symlink databases to different
+disks and to symlink tables only as a last resort.
+
+@cindex databases, symbolic links
+@menu
+* Symbolic links to databases::
+* Symbolic links to tables::
+@end menu
+
+
+@node Symbolic links to databases, Symbolic links to tables, Symbolic links, Symbolic links
+@subsubsection Using Symbolic Links for Databases
+
+The way to symlink a database is to first create a directory on some
+disk where you have free space and then create a symlink to it from
+the @strong{MySQL} database directory.
+
+@example
+shell> mkdir /dr1/databases/test
+shell> ln -s /dr1/databases/test mysqld-datadir
+@end example
+
+@strong{MySQL} doesn't support linking one directory to multiple
+databases. Replacing a database directory with a symbolic link will
+work fine as long as you don't make a symbolic link between databases.
+Suppose you have a database @code{db1} under the @strong{MySQL} data
+directory, and then make a symlink @code{db2} that points to @code{db1}:
+
+@example
+shell> cd /path/to/datadir
+shell> ln -s db1 db2
+@end example
+
+Now, for any table @code{tbl_a} in @code{db1}, there also appears to be
+a table @code{tbl_a} in @code{db2}. If one thread updates @code{db1.tbl_a}
+and another thread updates @code{db2.tbl_a}, there will be problems.
+
+If you really need this, you must change the following code in
+@file{mysys/mf_format.c}:
+
+@example
+if (flag & 32 || (!lstat(to,&stat_buff) && S_ISLNK(stat_buff.st_mode)))
+@end example
+
+to
+
+@example
+if (1)
+@end example
+
+On Windows you can use internal symbolic links to directories by compiling
+@strong{MySQL} with @code{-DUSE_SYMDIR}. This allows you to put different
+databases on different disks. @xref{Windows symbolic links}.
+
+
+@node Symbolic links to tables, , Symbolic links to databases, Symbolic links
+@subsubsection Using Symbolic Links for Tables
+
+@cindex databases, symbolic links
+
+Before @strong{MySQL} 4.0 you should not symlink tables unless you are
+very careful with them. The problem is that if you run @code{ALTER
+TABLE}, @code{REPAIR TABLE}, or @code{OPTIMIZE TABLE} on a symlinked
+table, the symlinks will be removed and replaced by the original
+files. This happens because these commands work by creating a
+temporary file in the database directory and, when the command is
+complete, replacing the original file with the temporary file.
+
+You should not symlink tables on systems that don't have a fully
+working @code{realpath()} call. (At least Linux and Solaris support
+@code{realpath()}.)
+
+In @strong{MySQL} 4.0 symlinks are fully supported only for @code{MyISAM}
+tables. For other table types you will probably get strange problems
+when executing any of the above-mentioned commands.
+
+The handling of symbolic links in @strong{MySQL} 4.0 works as follows
+(this is mostly relevant only for @code{MyISAM} tables):
+
+@itemize @bullet
+@item
+In the data directory you will always have the table definition file
+and the data/index files.
+
+@item
+You can symlink the index file and the data file to different
+directories independently of each other.
+
+@item
+The symlinking can be done from the operating system (if @code{mysqld} is
+not running) or with the @code{INDEX/DATA DIRECTORY="path-to-dir"} command
+in @code{CREATE TABLE}. @xref{CREATE TABLE}.
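+
+For example (the directory paths are only placeholders):
+
+@example
+mysql> CREATE TABLE t1 (a INT NOT NULL, KEY (a))
+       DATA DIRECTORY="/disk2/data"
+       INDEX DIRECTORY="/disk3/index";
+@end example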
+
+@item
+@code{myisamchk} will not replace a symlink with the index/data file;
+it works directly on the files the symlinks point to. Any temporary
+files are created in the same directory as the data/index file.
+
+@item
+When you drop a table that is using symlinks, both the symlink and the
+file the symlink points to are dropped. This is a good reason why you
+should NOT run @code{mysqld} as @code{root} or allow anyone to have write
+access to the @strong{MySQL} database directories.
+
+@item
+If you rename a table with @code{ALTER TABLE RENAME} and you don't change
+databases, the symlink in the database directory is renamed to the new
+name and the data/index files are renamed accordingly.
+
+@item
+If you use @code{ALTER TABLE RENAME} to move a table to another database,
+then the table will be moved to the other database directory and the old
+symlinks and the files they pointed to will be deleted.
+
+@item
+If you are not using symlinks you should use the @code{--skip-symlink}
+option to @code{mysqld} to ensure that no one can drop or rename a file
+outside of the @code{mysqld} data directory.
+@end itemize
+
+Things that are not yet supported:
+
+@cindex TODO, symlinks
+@itemize @bullet
+@item
+@code{ALTER TABLE} ignores all @code{INDEX/DATA DIRECTORY="path"} options.
+@item
+@code{CREATE TABLE} doesn't report if the table has symbolic links.
+@item
+@code{mysqldump} doesn't include the symbolic links information in the output.
+@item
+@code{BACKUP TABLE} and @code{RESTORE TABLE} don't use symbolic links.
+@end itemize
+
+
+
+
+
+@node Reference, Table types, MySQL Optimization, Top
@chapter MySQL Language Reference
@menu
@@ -23688,12 +26668,9 @@ isolate it into a separate test case first. Then report the problem to
* LOAD DATA:: @code{LOAD DATA INFILE} syntax
* UPDATE:: @code{UPDATE} syntax
* USE:: @code{USE} syntax
-* EXPLAIN:: @code{EXPLAIN} syntax (Get information about a @code{SELECT})
* DESCRIBE:: @code{DESCRIBE} syntax (Get information about names of columns)
* COMMIT:: @code{BEGIN/COMMIT/ROLLBACK} syntax
* LOCK TABLES:: @code{LOCK TABLES/UNLOCK TABLES} syntax
-* SET OPTION:: @code{SET OPTION} syntax
-* SET TRANSACTION:: @code{SET TRANSACTION} syntax
* CREATE INDEX:: @code{CREATE INDEX} syntax
* DROP INDEX:: @code{DROP INDEX} syntax
* Comments:: Comment syntax
@@ -24490,8 +27467,6 @@ be chosen from the list of values @code{'value1'}, @code{'value2'},
* Date and time types:: Date and time types
* String types:: String types
* Choosing types:: Choosing the right type for a column
-* Indexes:: Column indexes
-* Multiple-column indexes:: Multiple-column indexes
* Other-vendor column types:: Using column types from other database engines
@end menu
@@ -25548,7 +28523,7 @@ the @code{SET} definition in the second column.
@cindex types, columns
@cindex choosing types
-@node Choosing types, Indexes, String types, Column types
+@node Choosing types, Other-vendor column types, String types, Column types
@subsection Choosing the Right Type for a Column
For the most efficient use of storage, try to use the most precise type in
@@ -25565,113 +28540,13 @@ For high precision, you can always convert to a fixed-point type stored
in a @code{BIGINT}. This allows you to do all calculations with integers
and convert results back to floating-point values only when necessary.
-@cindex indexes, columns
-@cindex columns, indexes
-@cindex keys
-@node Indexes, Multiple-column indexes, Choosing types, Column types
-@subsection Column Indexes
-
-All @strong{MySQL} column types can be indexed. Use of indexes on the
-relevant columns is the best way to improve the performance of @code{SELECT}
-operations.
-
-The maximum number of keys and the maximum index length is defined per
-table handler. @xref{Table types}. You can with all table handlers have
-at least 16 keys and a total index length of at least 256 bytes.
-
-For @code{CHAR} and @code{VARCHAR} columns, you can index a prefix of a
-column. This is much faster and requires less disk space than indexing the
-whole column. The syntax to use in the @code{CREATE TABLE} statement to
-index a column prefix looks like this:
-
-@example
-KEY index_name (col_name(length))
-@end example
-
-The example below creates an index for the first 10 characters of the
-@code{name} column:
-
-@example
-mysql> CREATE TABLE test (
- name CHAR(200) NOT NULL,
- KEY index_name (name(10)));
-@end example
-
-For @code{BLOB} and @code{TEXT} columns, you must index a prefix of the
-column. You cannot index the entire column.
-
-In @strong{MySQL} Version 3.23.23 or later, you can also create special
-@strong{FULLTEXT} indexes. They are used for full-text search. Only the
-@code{MyISAM} table type supports @code{FULLTEXT} indexes. They can be
-created only from @code{VARCHAR} and @code{TEXT} columns.
-Indexing always happens over the entire column and partial indexing is not
-supported. See @ref{Fulltext Search} for details.
-
-@cindex multi-column indexes
-@cindex indexes, multi-column
-@cindex keys, multi-column
-@node Multiple-column indexes, Other-vendor column types, Indexes, Column types
-@subsection Multiple-column Indexes
-
-@strong{MySQL} can create indexes on multiple columns. An index may
-consist of up to 15 columns. (On @code{CHAR} and @code{VARCHAR} columns you
-can also use a prefix of the column as a part of an index).
-
-A multiple-column index can be considered a sorted array containing values
-that are created by concatenating the values of the indexed columns.
-
-@strong{MySQL} uses multiple-column indexes in such a way that queries are
-fast when you specify a known quantity for the first column of the index in a
-@code{WHERE} clause, even if you don't specify values for the other columns.
-
-Suppose a table is created using the following specification:
-
-@example
-mysql> CREATE TABLE test (
- id INT NOT NULL,
- last_name CHAR(30) NOT NULL,
- first_name CHAR(30) NOT NULL,
- PRIMARY KEY (id),
- INDEX name (last_name,first_name));
-@end example
-
-Then the index @code{name} is an index over @code{last_name} and
-@code{first_name}. The index will be used for queries that specify
-values in a known range for @code{last_name}, or for both @code{last_name}
-and @code{first_name}.
-Therefore, the @code{name} index will be used in the following queries:
-
-@example
-mysql> SELECT * FROM test WHERE last_name="Widenius";
-
-mysql> SELECT * FROM test WHERE last_name="Widenius"
- AND first_name="Michael";
-
-mysql> SELECT * FROM test WHERE last_name="Widenius"
- AND (first_name="Michael" OR first_name="Monty");
-
-mysql> SELECT * FROM test WHERE last_name="Widenius"
- AND first_name >="M" AND first_name < "N";
-@end example
-
-However, the @code{name} index will NOT be used in the following queries:
-
-@example
-mysql> SELECT * FROM test WHERE first_name="Michael";
-
-mysql> SELECT * FROM test WHERE last_name="Widenius"
- OR first_name="Michael";
-@end example
-For more information on the manner in which @strong{MySQL} uses indexes to
-improve query performance, see @ref{MySQL indexes, , @strong{MySQL}
-indexes}.
+@node Other-vendor column types, , Choosing types, Column types
+@subsection Using Column Types from Other Database Engines
@cindex types, portability
@cindex portability, types
@cindex columns, other types
-@node Other-vendor column types, , Multiple-column indexes, Column types
-@subsection Using Column Types from Other Database Engines
To make it easier to use code written for SQL implementations from other
vendors, @strong{MySQL} maps column types as shown in the table below. These
@@ -30702,7 +33577,7 @@ In @strong{MySQL} Version 3.23, you can use @code{LIMIT #} to ensure that
only a given number of rows are changed.
@findex USE
-@node USE, EXPLAIN, UPDATE, Reference
+@node USE, DESCRIBE, UPDATE, Reference
@section @code{USE} Syntax
@example
@@ -30737,364 +33612,10 @@ mysql> SELECT author_name,editor_name FROM author,db2.editor
The @code{USE} statement is provided for Sybase compatibility.
-@findex EXPLAIN
-@findex SELECT, optimizing
-@node EXPLAIN, DESCRIBE, USE, Reference
-@section @code{EXPLAIN} Syntax (Get Information About a @code{SELECT})
-
-@example
- EXPLAIN tbl_name
-or EXPLAIN SELECT select_options
-@end example
-
-@code{EXPLAIN tbl_name} is a synonym for @code{DESCRIBE tbl_name} or
-@code{SHOW COLUMNS FROM tbl_name}.
-
-When you precede a @code{SELECT} statement with the keyword @code{EXPLAIN},
-@strong{MySQL} explains how it would process the @code{SELECT}, providing
-information about how tables are joined and in which order.
-
-With the help of @code{EXPLAIN}, you can see when you must add indexes
-to tables to get a faster @code{SELECT} that uses indexes to find the
-records. You can also see if the optimizer joins the tables in an optimal
-order. To force the optimizer to use a specific join order for a
-@code{SELECT} statement, add a @code{STRAIGHT_JOIN} clause.
-
-For non-simple joins, @code{EXPLAIN} returns a row of information for each
-table used in the @code{SELECT} statement. The tables are listed in the order
-they would be read. @strong{MySQL} resolves all joins using a single-sweep
-multi-join method. This means that @strong{MySQL} reads a row from the first
-table, then finds a matching row in the second table, then in the third table
-and so on. When all tables are processed, it outputs the selected columns and
-backtracks through the table list until a table is found for which there are
-more matching rows. The next row is read from this table and the process
-continues with the next table.
-
-Output from @code{EXPLAIN} includes the following columns:
-
-@table @code
-@item table
-The table to which the row of output refers.
-
-@item type
-The join type. Information about the various types is given below.
-
-@item possible_keys
-The @code{possible_keys} column indicates which indexes @strong{MySQL}
-could use to find the rows in this table. Note that this column is
-totally independent of the order of the tables. That means that some of
-the keys in possible_keys may not be usable in practice with the
-generated table order.
-
-If this column is empty, there are no relevant indexes. In this case,
-you may be able to improve the performance of your query by examining
-the @code{WHERE} clause to see if it refers to some column or columns
-that would be suitable for indexing. If so, create an appropriate index
-and check the query with @code{EXPLAIN} again. @xref{ALTER TABLE}.
-
-To see what indexes a table has, use @code{SHOW INDEX FROM tbl_name}.
-
-@item key
-The @code{key} column indicates the key that @strong{MySQL} actually
-decided to use. The key is @code{NULL} if no index was chosen. If
-@strong{MySQL} chooses the wrong index, you can probably force
-@strong{MySQL} to use another index by using @code{myisamchk --analyze},
-@xref{myisamchk syntax}, or by using @code{USE INDEX/IGNORE INDEX}.
-@xref{JOIN}.
-
-@item key_len
-The @code{key_len} column indicates the length of the key that
-@strong{MySQL} decided to use. The length is @code{NULL} if the
-@code{key} is @code{NULL}. Note that this tells us how many parts of a
-multi-part key @strong{MySQL} will actually use.
-
-@item ref
-The @code{ref} column shows which columns or constants are used with the
-@code{key} to select rows from the table.
-
-@item rows
-The @code{rows} column indicates the number of rows @strong{MySQL}
-believes it must examine to execute the query.
-
-@item Extra
-This column contains additional information of how @strong{MySQL} will
-resolve the query. Here is an explanation of the different text
-strings that can be found in this column:
-
-@table @code
-@item Distinct
-@strong{MySQL} will not continue searching for more rows for the current row
-combination after it has found the first matching row.
-
-@item Not exists
-@strong{MySQL} was able to do a @code{LEFT JOIN} optimization on the
-query and will not examine more rows in this table for the previous row
-combination after it finds one row that matches the @code{LEFT JOIN} criteria.
-
-Here is an example for this:
-
-@example
-SELECT * FROM t1 LEFT JOIN t2 ON t1.id=t2.id WHERE t2.id IS NULL;
-@end example
-
-Assume that @code{t2.id} is defined with @code{NOT NULL}. In this case
-@strong{MySQL} will scan @code{t1} and look up the rows in @code{t2}
-through @code{t1.id}. If @strong{MySQL} finds a matching row in
-@code{t2}, it knows that @code{t2.id} can never be @code{NULL}, and will
-not scan through the rest of the rows in @code{t2} that has the same
-@code{id}. In other words, for each row in @code{t1}, @strong{MySQL}
-only needs to do a single lookup in @code{t2}, independent of how many
-matching rows there are in @code{t2}.
-
-@item @code{range checked for each record (index map: #)}
-@strong{MySQL} didn't find a real good index to use. It will, instead, for
-each row combination in the preceding tables, do a check on which index to
-use (if any), and use this index to retrieve the rows from the table. This
-isn't very fast but is faster than having to do a join without
-an index.
-
-@item Using filesort
-@strong{MySQL} will need to do an extra pass to find out how to retrieve
-the rows in sorted order. The sort is done by going through all rows
-according to the @code{join type} and storing the sort key + pointer to
-the row for all rows that match the @code{WHERE}. Then the keys are
-sorted. Finally the rows are retrieved in sorted order.
-
-@item Using index
-The column information is retrieved from the table using only
-information in the index tree without having to do an additional seek to
-read the actual row. This can be done when all the used columns for
-the table are part of the same index.
-
-@item Using temporary
-To resolve the query @strong{MySQL} will need to create a
-temporary table to hold the result. This typically happens if you do an
-@code{ORDER BY} on a different column set than you did a @code{GROUP
-BY} on.
-
-@item Where used
-A @code{WHERE} clause will be used to restrict which rows will be
-matched against the next table or sent to the client. If you don't have
-this information and the table is of type @code{ALL} or @code{index},
-you may have something wrong in your query (if you don't intend to
-fetch/examine all rows from the table).
-@end table
-
-If you want to get your queries as fast as possible, you should look out for
-@code{Using filesort} and @code{Using temporary}.
-@end table
-
-The different join types are listed below, ordered from best to worst type:
-
-@cindex system table
-@cindex tables, system
-@table @code
-@item system
-The table has only one row (= system table). This is a special case of
-the @code{const} join type.
-
-@cindex constant table
-@cindex tables, constant
-@item const
-The table has at most one matching row, which will be read at the start
-of the query. Because there is only one row, values from the column in
-this row can be regarded as constants by the rest of the
-optimizer. @code{const} tables are very fast as they are read only once!
-
-@item eq_ref
-One row will be read from this table for each combination of rows from
-the previous tables. This is the best possible join type, other than the
-@code{const} types. It is used when all parts of an index are used by
-the join and the index is @code{UNIQUE} or a @code{PRIMARY KEY}.
-
-@item ref
-All rows with matching index values will be read from this table for each
-combination of rows from the previous tables. @code{ref} is used if the join
-uses only a leftmost prefix of the key, or if the key is not @code{UNIQUE}
-or a @code{PRIMARY KEY} (in other words, if the join cannot select a single
-row based on the key value). If the key that is used matches only a few rows,
-this join type is good.
-
-@item range
-Only rows that are in a given range will be retrieved, using an index to
-select the rows. The @code{key} column indicates which index is used.
-The @code{key_len} contains the longest key part that was used.
-The @code{ref} column will be NULL for this type.
-
-@item index
-This is the same as @code{ALL}, except that only the index tree is
-scanned. This is usually faster than @code{ALL}, as the index file is usually
-smaller than the data file.
-
-@item ALL
-A full table scan will be done for each combination of rows from the
-previous tables. This is normally not good if the table is the first
-table not marked @code{const}, and usually @strong{very} bad in all other
-cases. You normally can avoid @code{ALL} by adding more indexes, so that
-the row can be retrieved based on constant values or column values from
-earlier tables.
-@end table
-
-You can get a good indication of how good a join is by multiplying all values
-in the @code{rows} column of the @code{EXPLAIN} output. This should tell you
-roughly how many rows @strong{MySQL} must examine to execute the query. This
-number is also used when you restrict queries with the @code{max_join_size}
-variable.
-@xref{Server parameters}.
-
-The following example shows how a @code{JOIN} can be optimized progressively
-using the information provided by @code{EXPLAIN}.
-
-Suppose you have the @code{SELECT} statement shown below, that you examine
-using @code{EXPLAIN}:
-
-@example
-EXPLAIN SELECT tt.TicketNumber, tt.TimeIn,
- tt.ProjectReference, tt.EstimatedShipDate,
- tt.ActualShipDate, tt.ClientID,
- tt.ServiceCodes, tt.RepetitiveID,
- tt.CurrentProcess, tt.CurrentDPPerson,
- tt.RecordVolume, tt.DPPrinted, et.COUNTRY,
- et_1.COUNTRY, do.CUSTNAME
- FROM tt, et, et AS et_1, do
- WHERE tt.SubmitTime IS NULL
- AND tt.ActualPC = et.EMPLOYID
- AND tt.AssignedPC = et_1.EMPLOYID
- AND tt.ClientID = do.CUSTNMBR;
-@end example
-
-For this example, assume that:
-
-@itemize @bullet
-@item
-The columns being compared have been declared as follows:
-
-@multitable @columnfractions .1 .2 .7
-@item @strong{Table} @tab @strong{Column} @tab @strong{Column type}
-@item @code{tt} @tab @code{ActualPC} @tab @code{CHAR(10)}
-@item @code{tt} @tab @code{AssignedPC} @tab @code{CHAR(10)}
-@item @code{tt} @tab @code{ClientID} @tab @code{CHAR(10)}
-@item @code{et} @tab @code{EMPLOYID} @tab @code{CHAR(15)}
-@item @code{do} @tab @code{CUSTNMBR} @tab @code{CHAR(15)}
-@end multitable
-
-@item
-The tables have the indexes shown below:
-
-@multitable @columnfractions .1 .9
-@item @strong{Table} @tab @strong{Index}
-@item @code{tt} @tab @code{ActualPC}
-@item @code{tt} @tab @code{AssignedPC}
-@item @code{tt} @tab @code{ClientID}
-@item @code{et} @tab @code{EMPLOYID} (primary key)
-@item @code{do} @tab @code{CUSTNMBR} (primary key)
-@end multitable
-
-@item
-The @code{tt.ActualPC} values aren't evenly distributed.
-@end itemize
-
-Initially, before any optimizations have been performed, the @code{EXPLAIN}
-statement produces the following information:
-
-@example
-table type possible_keys key key_len ref rows Extra
-et ALL PRIMARY NULL NULL NULL 74
-do ALL PRIMARY NULL NULL NULL 2135
-et_1 ALL PRIMARY NULL NULL NULL 74
-tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872
- range checked for each record (key map: 35)
-@end example
-
-Because @code{type} is @code{ALL} for each table, this output indicates that
-@strong{MySQL} is doing a full join for all tables! This will take quite a
-long time, as the product of the number of rows in each table must be
-examined! For the case at hand, this is @code{74 * 2135 * 74 * 3872 =
-45,268,558,720} rows. If the tables were bigger, you can only imagine how
-long it would take.
-
-One problem here is that @strong{MySQL} can't (yet) use indexes on columns
-efficiently if they are declared differently. In this context,
-@code{VARCHAR} and @code{CHAR} are the same unless they are declared as
-different lengths. Because @code{tt.ActualPC} is declared as @code{CHAR(10)}
-and @code{et.EMPLOYID} is declared as @code{CHAR(15)}, there is a length
-mismatch.
-
-To fix this disparity between column lengths, use @code{ALTER TABLE} to
-lengthen @code{ActualPC} from 10 characters to 15 characters:
-
-@example
-mysql> ALTER TABLE tt MODIFY ActualPC VARCHAR(15);
-@end example
-
-Now @code{tt.ActualPC} and @code{et.EMPLOYID} are both @code{VARCHAR(15)}.
-Executing the @code{EXPLAIN} statement again produces this result:
-
-@example
-table type possible_keys key key_len ref rows Extra
-tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
-do ALL PRIMARY NULL NULL NULL 2135
- range checked for each record (key map: 1)
-et_1 ALL PRIMARY NULL NULL NULL 74
- range checked for each record (key map: 1)
-et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
-@end example
-
-This is not perfect, but is much better (the product of the @code{rows}
-values is now less by a factor of 74). This version is executed in a couple
-of seconds.
-
-A second alteration can be made to eliminate the column length mismatches
-for the @code{tt.AssignedPC = et_1.EMPLOYID} and @code{tt.ClientID =
-do.CUSTNMBR} comparisons:
-
-@example
-mysql> ALTER TABLE tt MODIFY AssignedPC VARCHAR(15),
- MODIFY ClientID VARCHAR(15);
-@end example
-
-Now @code{EXPLAIN} produces the output shown below:
-
-@example
-table type possible_keys key key_len ref rows Extra
-et ALL PRIMARY NULL NULL NULL 74
-tt ref AssignedPC,ClientID,ActualPC ActualPC 15 et.EMPLOYID 52 where used
-et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
-do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
-@end example
-
-This is almost as good as it can get.
-
-The remaining problem is that, by default, @strong{MySQL} assumes that values
-in the @code{tt.ActualPC} column are evenly distributed, and that isn't the
-case for the @code{tt} table. Fortunately, it is easy to tell @strong{MySQL}
-about this:
-
-@example
-shell> myisamchk --analyze PATH_TO_MYSQL_DATABASE/tt
-shell> mysqladmin refresh
-@end example
-
-Now the join is perfect, and @code{EXPLAIN} produces this result:
-
-@example
-table type possible_keys key key_len ref rows Extra
-tt ALL AssignedPC,ClientID,ActualPC NULL NULL NULL 3872 where used
-et eq_ref PRIMARY PRIMARY 15 tt.ActualPC 1
-et_1 eq_ref PRIMARY PRIMARY 15 tt.AssignedPC 1
-do eq_ref PRIMARY PRIMARY 15 tt.ClientID 1
-@end example
-
-Note that the @code{rows} column in the output from @code{EXPLAIN} is an
-educated guess from the @strong{MySQL} join optimizer. To optimize a
-query, you should check if the numbers are even close to the truth. If not,
-you may get better performance by using @code{STRAIGHT_JOIN} in your
-@code{SELECT} statement and trying to list the tables in a different order in
-the @code{FROM} clause.
@findex DESC
@findex DESCRIBE
-@node DESCRIBE, COMMIT, EXPLAIN, Reference
+@node DESCRIBE, COMMIT, USE, Reference
@section @code{DESCRIBE} Syntax (Get Information About Columns)
@example
@@ -31179,7 +33700,7 @@ You can change the isolation level for transactions with
@findex LOCK TABLES
@findex UNLOCK TABLES
-@node LOCK TABLES, SET OPTION, COMMIT, Reference
+@node LOCK TABLES, CREATE INDEX, COMMIT, Reference
@section @code{LOCK TABLES/UNLOCK TABLES} Syntax
@example
@@ -31313,181 +33834,15 @@ that can take snapshots in time.
automatically commit any active transactions before attempting to lock the
tables.
-@findex SET OPTION
-@node SET OPTION, SET TRANSACTION, LOCK TABLES, Reference
-@section @code{SET} Syntax
-
-@example
-SET [OPTION] SQL_VALUE_OPTION= value, ...
-@end example
-
-@code{SET OPTION} sets various options that affect the operation of the
-server or your client. Any option you set remains in effect until the
-current session ends, or until you set the option to a different value.
-
-@table @code
-@item CHARACTER SET character_set_name | DEFAULT
-This maps all strings from and to the client with the given mapping.
-Currently the only option for @code{character_set_name} is
-@code{cp1251_koi8}, but you can easily add new mappings by editing the
-@file{sql/convert.cc} file in the @strong{MySQL} source distribution. The
-default mapping can be restored by using a @code{character_set_name} value of
-@code{DEFAULT}.
-
-Note that the syntax for setting the @code{CHARACTER SET} option differs
-from the syntax for setting the other options.
-
-@item PASSWORD = PASSWORD('some password')
-@cindex passwords, setting
-Set the password for the current user. Any non-anonymous user can change his
-own password!
-
-@item PASSWORD FOR user = PASSWORD('some password')
-Set the password for a specific user on the current server host. Only a user
-with access to the @code{mysql} database can do this. The user should be
-given in @code{user@@hostname} format, where @code{user} and @code{hostname}
-are exactly as they are listed in the @code{User} and @code{Host} columns of
-the @code{mysql.user} table entry. For example, if you had an entry with
-@code{User} and @code{Host} fields of @code{'bob'} and @code{'%.loc.gov'},
-you would write:
-
-@example
-mysql> SET PASSWORD FOR bob@@"%.loc.gov" = PASSWORD("newpass");
-
-or
-
-mysql> UPDATE mysql.user SET password=PASSWORD("newpass") where user="bob' and host="%.loc.gov";
-@end example
-
-@item SQL_AUTO_IS_NULL = 0 | 1
-If set to @code{1} (default) then one can find the last inserted row
-for a table with an auto_increment row with the following construct:
-@code{WHERE auto_increment_column IS NULL}. This is used by some
-ODBC programs like Access.
-
-@item AUTOCOMMIT= 0 | 1
-If set to @code{1} all changes to a table will be done at once. To start
-a multi-command transaction, you have to use the @code{BEGIN}
-statement. @xref{COMMIT}. If set to @code{0} you have to use @code{COMMIT} /
-@code{ROLLBACK} to accept/revoke that transaction. @xref{COMMIT}. Note
-that when you change from not @code{AUTOCOMMIT} mode to
-@code{AUTOCOMMIT} mode, @strong{MySQL} will do an automatic
-@code{COMMIT} on any open transactions.
-
-@item SQL_BIG_TABLES = 0 | 1
-@cindex table is full
-If set to @code{1}, all temporary tables are stored on disk rather than in
-memory. This will be a little slower, but you will not get the error
-@code{The table tbl_name is full} for big @code{SELECT} operations that
-require a large temporary table. The default value for a new connection is
-@code{0} (that is, use in-memory temporary tables).
-
-@item SQL_BIG_SELECTS = 0 | 1
-If set to @code{0}, @strong{MySQL} will abort if a @code{SELECT} is attempted
-that probably will take a very long time. This is useful when an inadvisable
-@code{WHERE} statement has been issued. A big query is defined as a
-@code{SELECT} that probably will have to examine more than
-@code{max_join_size} rows. The default value for a new connection is
-@code{1} (which will allow all @code{SELECT} statements).
-
-@item SQL_BUFFER_RESULT = 0 | 1
-@code{SQL_BUFFER_RESULT} will force the result from @code{SELECT}'s
-to be put into a temporary table. This will help @strong{MySQL} free the
-table locks early and will help in cases where it takes a long time to
-send the result set to the client.
-
-@item SQL_LOW_PRIORITY_UPDATES = 0 | 1
-If set to @code{1}, all @code{INSERT}, @code{UPDATE}, @code{DELETE}, and
-and @code{LOCK TABLE WRITE} statements wait until there is no pending
-@code{SELECT} or @code{LOCK TABLE READ} on the affected table.
-
-@item SQL_MAX_JOIN_SIZE = value | DEFAULT
-Don't allow @code{SELECT}s that will probably need to examine more than
-@code{value} row combinations. By setting this value, you can catch
-@code{SELECT}s where keys are not used properly and that would probably
-take a long time. Setting this to a value other than @code{DEFAULT} will reset
-the @code{SQL_BIG_SELECTS} flag. If you set the @code{SQL_BIG_SELECTS}
-flag again, the @code{SQL_MAX_JOIN_SIZE} variable will be ignored.
-You can set a default value for this variable by starting @code{mysqld} with
-@code{-O max_join_size=#}.
-
-@item SQL_SAFE_MODE = 0 | 1
-If set to @code{1}, @strong{MySQL} will abort if an @code{UPDATE} or
-@code{DELETE} is attempted that doesn't use a key or @code{LIMIT} in the
-@code{WHERE} clause. This makes it possible to catch wrong updates
-when creating SQL commands by hand.
-
-@item SQL_SELECT_LIMIT = value | DEFAULT
-The maximum number of records to return from @code{SELECT} statements. If
-a @code{SELECT} has a @code{LIMIT} clause, the @code{LIMIT} takes precedence
-over the value of @code{SQL_SELECT_LIMIT}. The default value for a new
-connection is ``unlimited.'' If you have changed the limit, the default value
-can be restored by using a @code{SQL_SELECT_LIMIT} value of @code{DEFAULT}.
-
-@item SQL_LOG_OFF = 0 | 1
-If set to @code{1}, no logging will be done to the standard log for this
-client, if the client has the @strong{process} privilege. This does not
-affect the update log!
-
-@item SQL_LOG_UPDATE = 0 | 1
-If set to @code{0}, no logging will be done to the update log for the client,
-if the client has the @strong{process} privilege. This does not affect the
-standard log!
-
-@item SQL_QUOTE_SHOW_CREATE = 0 | 1
-If set to @code{1}, @code{SHOW CREATE TABLE} will quote
-table and column names. This is @strong{on} by default,
-for replication of tables with fancy column names to work.
-@ref{SHOW CREATE TABLE, , @code{SHOW CREATE TABLE}}.
-
-@item TIMESTAMP = timestamp_value | DEFAULT
-Set the time for this client. This is used to get the original timestamp if
-you use the update log to restore rows. @code{timestamp_value} should be a
-UNIX Epoch timestamp, not a @strong{MySQL} timestamp.
-
-@item LAST_INSERT_ID = #
-Set the value to be returned from @code{LAST_INSERT_ID()}. This is stored in
-the update log when you use @code{LAST_INSERT_ID()} in a command that updates
-a table.
-
-@item INSERT_ID = #
-Set the value to be used by the following @code{INSERT} or @code{ALTER TABLE}
-command when inserting an @code{AUTO_INCREMENT} value. This is mainly used
-with the update log.
-@end table
-
-@findex ISOLATION LEVEL
-@node SET TRANSACTION, CREATE INDEX, SET OPTION, Reference
-@section @code{SET TRANSACTION} Syntax
-
-@example
-SET [GLOBAL | SESSION] TRANSACTION ISOLATION LEVEL
-[READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE]
-@end example
-
-Sets the transaction isolation level for the global, whole session or
-the next transaction.
-
-The default behavior is to set the isolation level for the next (not started)
-transaction.
-
-If you set the @code{GLOBAL} privilege it will affect all new created threads.
-You will need the @code{PROCESS} privilege to do do this.
-
-Setting the @code{SESSION} privilege will affect the following and all
-future transactions.
-
-You can set the default isolation level for @code{mysqld} with
-@code{--transaction-isolation=...}. @xref{Command-line options}.
+@node CREATE INDEX, DROP INDEX, LOCK TABLES, Reference
+@section @code{CREATE INDEX} Syntax
+@findex CREATE INDEX
@cindex indexes
@cindex indexes, multi-part
@cindex multi-part index
-@findex CREATE INDEX
-@node CREATE INDEX, DROP INDEX, SET TRANSACTION, Reference
-@section @code{CREATE INDEX} Syntax
@example
CREATE [UNIQUE|FULLTEXT] INDEX index_name ON tbl_name (col_name[(length)],... )
@@ -34175,7 +36530,7 @@ not trivial).
@end itemize
-@node Fulltext Search, Performance, Table types, Top
+@node Fulltext Search, Maintenance, Table types, Top
@chapter MySQL Full-text Search
@cindex searching, full-text
@@ -34418,2216 +36773,8 @@ the user wants to treat as words, examples are "C++", "AS/400", "TCP/IP", etc.
parameters to @code{FULLTEXT} in @code{CREATE/ALTER TABLE}).
@end itemize
-@cindex performance, maximizing
-@cindex optimization
-@node Performance, MySQL Benchmarks, Fulltext Search, Top
-@chapter Getting Maximum Performance from MySQL
-
-Optimization is a complicated task because it ultimately requires
-understanding of the whole system. While it may be possible to do some
-local optimizations with small knowledge of your system/application, the
-more optimal you want your system to become the more you will have to
-know about it.
-
-So this chapter will try to explain and give some examples of different
-ways to optimize @strong{MySQL}. But remember that there are always some
-(increasingly harder) additional ways to make the system even faster.
-
-@menu
-* Optimize Basics:: Optimization overview
-* System:: System/Compile time and startup parameter tuning
-* Data size:: Get your data as small as possible
-* MySQL indexes:: How @strong{MySQL} uses indexes
-* Query Speed:: Speed of queries that access or update data
-* Tips:: Other optimization tips
-* Benchmarks:: Using your own benchmarks
-* Design:: Design choices
-* Design Limitations:: MySQL design limitations/tradeoffs
-* Portability:: Portability
-* Internal use:: What have we used MySQL for?
-@end menu
-
-@node Optimize Basics, System, Performance, Performance
-@section Optimization Overview
-
-The most important part for getting a system fast is of course the basic
-design. You also need to know what kinds of things your system will be
-doing, and what your bottlenecks are.
-
-The most common bottlenecks are:
-@itemize @bullet
-@item Disk seeks.
-It takes time for the disk to find a piece of data. With modern disks in
-1999, the mean time for this is usually lower than 10ms, so we can in
-theory do about 1000 seeks a second. This time improves slowly with new
-disks and is very hard to optimize for a single table. The way to
-optimize this is to spread the data on more than one disk.
-@item Disk reading/writing.
-When the disk is at the correct position we need to read the data. With
-modern disks in 1999, one disk delivers something like 10-20Mb/s. This
-is easier to optimize than seeks because you can read in parallel from
-multiple disks.
-@item CPU cycles.
-When we have the data in main memory (or if it already were
-there) we need to process it to get to our result. Having small
-tables compared to the memory is the most common limiting
-factor. But then, with small tables speed is usually not the problem.
-@item Memory bandwidth.
-When the CPU needs more data than can fit in the CPU cache the main
-memory bandwidth becomes a bottleneck. This is an uncommon bottleneck
-for most systems, but one should be aware of it.
-@end itemize
-
-@cindex compiling, optimizing
-@cindex system optimization
-@cindex startup parameters, tuning
-@node System, Data size, Optimize Basics, Performance
-@section System/Compile Time and Startup Parameter Tuning
-
-We start with the system level things since some of these decisions have
-to be made very early. In other cases a fast look at this part may
-suffice because it not that important for the big gains. However, it is always
-nice to have a feeling about how much one could gain by changing things
-at this level.
-
-The default OS to use is really important! To get the most use of
-multiple CPU machines one should use Solaris (because the threads works
-really nice) or Linux (because the 2.2 kernel has really good SMP
-support). Also on 32-bit machines Linux has a 2G file size limit by
-default. Hopefully this will be fixed soon when new filesystems are
-released (XFS/Reiserfs). If you have a desperate need for files bigger
-than 2G on Linux-intel 32 bit, you should get the LFS patch for the ext2
-file system.
-
-Because we have not run @strong{MySQL} in production on that many platforms, we
-advice you to test your intended platform before choosing it, if possible.
-
-@cindex locking
-Other tips:
-@itemize @bullet
-@item
-If you have enough RAM, you could remove all swap devices. Some
-operating systems will use a swap device in some contexts even if you
-have free memory.
-@item
-Use the @code{--skip-locking} @strong{MySQL} option to avoid external
-locking. Note that this will not impact @strong{MySQL}'s functionality as
-long as you only run one server. Just remember to take down the server (or
-lock relevant parts) before you run @code{myisamchk}. On some system
-this switch is mandatory because the external locking does not work in any
-case.
-
-The @code{--skip-locking} option is on by default when compiling with
-MIT-pthreads, because @code{flock()} isn't fully supported by
-MIT-pthreads on all platforms. It's also on default for Linux
-as Linux file locking are not yet safe.
-
-The only case when you can't use @code{--skip-locking} is if you run
-multiple @strong{MySQL} @emph{servers} (not clients) on the same data,
-or run @code{myisamchk} on the table without first flushing and locking
-the @code{mysqld} server tables first.
-
-You can still use @code{LOCK TABLES}/@code{UNLOCK TABLES} even if you
-are using @code{--skip-locking}
-@end itemize
-
-@menu
-* Compile and link options:: How compiling and linking affects the speed of MySQL
-* Disk issues:: Disk issues
-* Symbolic links:: Using Symbolic Links
-* Server parameters:: Tuning server parameters
-* Table cache:: How MySQL opens and closes tables
-* Creating many tables:: Drawbacks of creating large numbers of tables in the same database
-* Open tables:: Why so many open tables?
-* Memory use:: How MySQL uses memory
-* Internal locking:: How MySQL locks tables
-* Table locking:: Table locking issues
-* DNS::
-@end menu
-
-@node Compile and link options, Disk issues, System, System
-@subsection How Compiling and Linking Affects the Speed of MySQL
-
-Most of the following tests are done on Linux with the
-@strong{MySQL} benchmarks, but they should give some indication for
-other operating systems and workloads.
-
-@cindex linking, speed
-@cindex compiling, speed
-@cindex speed, compiling
-@cindex speed, linking
-
-You get the fastest executable when you link with @code{-static}.
-
-On Linux, you will get the fastest code when compiling with @code{pgcc}
-and @code{-O3}. To compile @file{sql_yacc.cc} with these options, you
-need about 200M memory because @code{gcc/pgcc} needs a lot of memory to
-make all functions inline. You should also set @code{CXX=gcc} when
-configuring @strong{MySQL} to avoid inclusion of the @code{libstdc++}
-library (it is not needed). Note that with some versions of @code{pgcc},
-the resulting code will only run on true Pentium processors, even if you
-use the compiler option that you want the resulting code to be working on
-all x586 type processors (like AMD).
-
-By just using a better compiler and/or better compiler options you can
-get a 10-30 % speed increase in your application. This is particularly
-important if you compile the SQL server yourself!
-
-We have tested both the Cygnus CodeFusion and Fujitsu compilers, but
-when we tested them, neither was sufficiently bug free to allow
-@strong{MySQL} to be compiled with optimizations on.
-
-When you compile @strong{MySQL} you should only include support for the
-character sets that you are going to use. (Option @code{--with-charset=xxx}).
-The standard @strong{MySQL} binary distributions are compiled with support
-for all character sets.
-
-Here is a list of some measurements that we have done:
-@itemize @bullet
-@item
-If you use @code{pgcc} and compile everything with @code{-O6}, the
-@code{mysqld} server is 1% faster than with @code{gcc} 2.95.2.
-
-@item
-If you link dynamically (without @code{-static}), the result is 13%
-slower on Linux. Note that you still can use a dynamic linked
-@strong{MySQL} library. It is only the server that is critical for
-performance.
-
-@item
-If you strip your @code{mysqld} binary with @code{strip libexec/mysqld},
-the resulting binary can be up to 4 % faster.
-
-@item
-If you connect using TCP/IP rather than Unix sockets, the result is 7.5%
-slower on the same computer. (If you are connection to @code{localhost},
-@strong{MySQL} will, by default, use sockets).
-
-@item
-If you connect using TCP/IP from another computer over a 100M Ethernet,
-things will be 8-11 % slower.
-
-@item
-If you compile with @code{--with-debug=full}, then you will loose 20 %
-for most queries, but some queries may take substantially longer (The
-@strong{MySQL} benchmarks ran 35 % slower)
-If you use @code{--with-debug}, then you will only loose 15 %.
-By starting a @code{mysqld} version compiled with @code{--with-debug=full}
-with @code{--skip-safemalloc} the end result should be close to when
-configuring with @code{--with-debug}.
-
-@item
-On a Sun SPARCstation 20, SunPro C++ 4.2 is 5 % faster than @code{gcc} 2.95.2.
-
-@item
-Compiling with @code{gcc} 2.95.2 for ultrasparc with the option
-@code{-mcpu=v8 -Wa,-xarch=v8plusa} gives 4 % more performance.
-
-@item
-On Solaris 2.5.1, MIT-pthreads is 8-12% slower than Solaris native
-threads on a single processor. With more load/CPUs the difference should
-get bigger.
-
-@item
-Running with @code{--log-bin} makes @strong{[MySQL} 1 % slower.
-
-@item
-Compiling on Linux-x86 using gcc without frame pointers
-@code{-fomit-frame-pointer} or @code{-fomit-frame-pointer -ffixed-ebp}
-@code{mysqld} 1-4% faster.
-@end itemize
-
-The @strong{MySQL}-Linux distribution provided by @strong{MySQL AB} used
-to be compiled with @code{pgcc}, but we had to go back to regular gcc
-because of a bug in @code{pgcc} that would generate the code that does
-not run on AMD. We will continue using gcc until that bug is resolved.
-In the meantime, if you have a non-AMD machine, you can get a faster
-binary by compiling with @code{pgcc}. The standard @strong{MySQL}
-Linux binary is linked statically to get it faster and more portable.
-
-@cindex disk issues
-@cindex performance, disk issues
-@node Disk issues, Symbolic links, Compile and link options, System
-@subsection Disk Issues
-
-@itemize @bullet
-@item
-As mentioned before, disks seeks are a big performance bottleneck. This
-problems gets more and more apparent when the data starts to grow so
-large that effective caching becomes impossible. For large databases,
-where you access data more or less randomly, you can be sure that you
-will need at least one disk seek to read and a couple of disk seeks to
-write things. To minimize this problem, use disks with low seek times.
-@item
-Increase the number of available disk spindles (and thereby reduce
-the seek overhead) by either symlink files to different disks or striping
-the disks.
-@table @strong
-@item Using symbolic links
-This means that you symlink the index and/or data file(s) from the
-normal data directory to another disk (that may also be striped). This
-makes both the seek and read times better (if the disks are not used for
-other things). @xref{Symbolic links}.
-@cindex striping, defined
-@item Striping
-Striping means that you have many disks and put the first block on the
-first disk, the second block on the second disk, and the Nth on the
-(N mod number_of_disks) disk, and so on. This means if your normal data
-size is less than the stripe size (or perfectly aligned) you will get
-much better performance. Note that striping is very dependent on the OS
-and stripe-size. So benchmark your application with different
-stripe-sizes. @xref{Benchmarks}.
-
-Note that the speed difference for striping is @strong{very} dependent
-on the parameters. Depending on how you set the striping parameters and
-number of disks you may get a difference in orders of magnitude. Note that
-you have to choose to optimize for random or sequential access.
-@end table
-@item
-For reliability you may want to use RAID 0+1 (striping + mirroring), but
-in this case you will need 2*N drives to hold N drives of data. This is
-probably the best option if you have the money for it! You may, however,
-also have to invest in some volume-management software to handle it
-efficiently.
-@item
-A good option is to have semi-important data (that can be regenerated)
-on RAID 0 disk while storing really important data (like host information
-and logs) on a RAID 0+1 or RAID N disk. RAID N can be a problem if you
-have many writes because of the time to update the parity bits.
-@item
-You may also set the parameters for the file system that the database
-uses. One easy change is to mount the file system with the noatime
-option. That makes it skip the updating of the last access time in the
-inode and by this will avoid some disk seeks.
-@item
-On Linux, you can get much more performance (up to 100 % under load is
-not uncommon) by using hdpram to configure your disk's interface! The
-following should be quite good hdparm options for @strong{MySQL} (and
-probably many other applications):
-@example
-hdparm -m 16 -d 1
-
-@end example
-
-Note that the performance/reliability when using the above depends on
-your hardware, so we strongly suggest that you test your system
-thoroughly after using @code{hdparm}! Please consult the @code{hdparm}
-man page for more information! If @code{hdparm} is not used wisely,
-filesystem corruption may result. Backup everything before experimenting!
-@item
-On many operating systems you can mount the disks with the 'async' flag to set the file
-system to be updated asynchronously. If your computer is reasonable stable,
-this should give you more performance without sacrificing too much reliability.
-(This flag is on by default on Linux.)
-@item
-If you don't need to know when a file was last accessed (which is not
-really useful on a database server), you can mount your file systems
-with the noatime flag.
-@end itemize
-
-@cindex symbolic links
-@cindex links, symbolic
-@node Symbolic links, Server parameters, Disk issues, System
-@subsection Using Symbolic Links
-
-You can move tables and databases from the database directory to other
-locations and replace them with symbolic links to the new locations.
-You might want to do this, for example, to move a database to a file
-system with more free space or increase the speed of your system by
-spreading your tables to different disk.
-
-The recommended may to do this, is to just symlink databases to different
-disk and only symlink tables as a last resort.
-.
-
-@cindex databases, symbolic links
-@menu
-* Symbolic links to database::
-* Symbolic links to tables::
-@end menu
-
-@node Symbolic links to database, Symbolic links to tables, Symbolic links, Symbolic links
-@subsubsection Using Symbolic Links for Databases
-
-The way to symlink a database is to first create a directory on some
-disk where you have free space and then create a symlink to it from
-the @strong{MySQL} database directory.
-
-@example
-shell> mkdir /dr1/databases/test
-shell> ln -s /dr1/databases/test mysqld-datadir
-@end example
-
-@strong{MySQL} doesn't support that you link one directory to multiple
-databases. Replacing a database directory with a symbolic link will
-work fine as long as you don't make a symbolic link between databases.
-Suppose you have a database @code{db1} under the @strong{MySQL} data
-directory, and then make a symlink @code{db2} that points to @code{db1}:
-
-@example
-shell> cd /path/to/datadir
-shell> ln -s db1 db2
-@end example
-
-Now, for any table @code{tbl_a} in @code{db1}, there also appears to be
-a table @code{tbl_a} in @code{db2}. If one thread updates @code{db1.tbl_a}
-and another thread updates @code{db2.tbl_a}, there will be problems.
-
-If you really need this, you must change the following code in
-@file{mysys/mf_format.c}:
-
-@example
-if (flag & 32 || (!lstat(to,&stat_buff) && S_ISLNK(stat_buff.st_mode)))
-@end example
-
-to
-
-@example
-if (1)
-@end example
-
-On Windows you can use internal symbolic links to directories by compiling
-@strong{MySQL} with @code{-DUSE_SYMDIR}. This allows you to put different
-databases on different disks. @xref{Windows symbolic links}.
-
-@cindex databases, symbolic links
-@node Symbolic links to tables, , Symbolic links to database, Symbolic links
-@subsubsection Using Symbolic Links for Tables
-
-Before @strong{MySQL} 4.0 you should not symlink tables unless you are
-very careful with them. The problem is that if you run @code{ALTER
-TABLE}, @code{REPAIR TABLE}, or @code{OPTIMIZE TABLE} on a symlinked
-table, the symlinks will be removed and replaced by the original
-files. This happens because these commands work by creating a
-temporary file in the database directory and, when the command is
-complete, replacing the original file with the temporary file.
-
-You should not symlink tables on systems that don't have a fully
-working @code{realpath()} call. (At least Linux and Solaris support
-@code{realpath()}.)
-
-In @strong{MySQL} 4.0 symlinks are fully supported only for @code{MyISAM}
-tables. For other table types you will probably get strange problems
-when executing any of the above-mentioned commands.
-
-The handling of symbolic links in @strong{MySQL} 4.0 works the following
-way (this is mostly relevant only for @code{MyISAM} tables):
-
-@itemize @bullet
-@item
-In the data directory you will always have the table definition file
-and the data/index files.
-@item
-You can symlink the index file and the data file to different directories
-independently of each other.
-@item
-The symlinking can be done from the operating system (if @code{mysqld} is
-not running) or with the @code{INDEX/DATA DIRECTORY="path-to-dir"} option
-in @code{CREATE TABLE} (see the example after this list).
-@xref{CREATE TABLE}.
-@item
-@code{myisamchk} will not replace a symlink with the index/data file but
-will work directly on the files the symlinks point to. Any temporary files
-will be created in the same directory where the data/index file is located.
-@item
-When you drop a table that is using symlinks, both the symlink and the
-file the symlink points to are dropped. This is a good reason why you
-should NOT run @code{mysqld} as root or allow anyone to have write
-access to the @strong{MySQL} database directories.
-@item
-If you rename a table with @code{ALTER TABLE RENAME} and you don't change
-the database, the symlink in the database directory will be renamed to the
-new name and the data/index file will be renamed accordingly.
-@item
-If you use @code{ALTER TABLE RENAME} to move a table to another database,
-then the table will be moved to the other database directory and the old
-symlinks and the files they pointed to will be deleted.
-@item
-If you are not using symlinks, you should use the @code{--skip-symlink}
-option to @code{mysqld} to ensure that no one can drop or rename a file
-outside of the @code{mysqld} data directory.
-@end itemize
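-
-For example, to create a table with the data file and index file placed on
-other disks, you could use something like the following (the paths are only
-illustrations):
-
-@example
-mysql> CREATE TABLE t1 (id INT NOT NULL, s CHAR(32))
-    ->        DATA DIRECTORY="/dr2/data" INDEX DIRECTORY="/dr3/index";
-@end example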
-
-Things that are not yet supported:
-
-@cindex TODO, symlinks
-@itemize @bullet
-@item
-@code{ALTER TABLE} ignores all @code{INDEX/DATA DIRECTORY="path"} options.
-@item
-@code{CREATE TABLE} doesn't report if the table has symbolic links.
-@item
-@code{mysqldump} doesn't include the symbolic links information in the output.
-@item
-@code{BACKUP TABLE} and @code{RESTORE TABLE} don't use symbolic links.
-@end itemize
-
-@cindex parameters, server
-@cindex @code{mysqld} server, buffer sizes
-@cindex buffer sizes, @code{mysqld} server
-@cindex startup parameters
-@node Server parameters, Table cache, Symbolic links, System
-@subsection Tuning Server Parameters
-
-You can get the default buffer sizes used by the @code{mysqld} server
-with this command:
-
-@example
-shell> mysqld --help
-@end example
-
-@cindex @code{mysqld} options
-@cindex variables, @code{mysqld}
-This command produces a list of all @code{mysqld} options and configurable
-variables. The output includes the default values and looks something
-like this:
-
-@example
-Possible variables for option --set-variable (-O) are:
-back_log current value: 5
-bdb_cache_size current value: 1048540
-binlog_cache_size current value: 32768
-connect_timeout current value: 5
-delayed_insert_timeout current value: 300
-delayed_insert_limit current value: 100
-delayed_queue_size current value: 1000
-flush_time current value: 0
-interactive_timeout current value: 28800
-join_buffer_size current value: 131072
-key_buffer_size current value: 1048540
-lower_case_table_names current value: 0
-long_query_time current value: 10
-max_allowed_packet current value: 1048576
-max_binlog_cache_size current value: 4294967295
-max_connections current value: 100
-max_connect_errors current value: 10
-max_delayed_threads current value: 20
-max_heap_table_size current value: 16777216
-max_join_size current value: 4294967295
-max_sort_length current value: 1024
-max_tmp_tables current value: 32
-max_write_lock_count current value: 4294967295
-myisam_sort_buffer_size current value: 8388608
-net_buffer_length current value: 16384
-net_retry_count current value: 10
-net_read_timeout current value: 30
-net_write_timeout current value: 60
-query_buffer_size current value: 0
-record_buffer current value: 131072
-slow_launch_time current value: 2
-sort_buffer current value: 2097116
-table_cache current value: 64
-thread_concurrency current value: 10
-tmp_table_size current value: 1048576
-thread_stack current value: 131072
-wait_timeout current value: 28800
-@end example
-
-If there is a @code{mysqld} server currently running, you can see what
-values it actually is using for the variables by executing this command:
-
-@example
-shell> mysqladmin variables
-@end example
-
-You can find a full description for all variables in the @code{SHOW VARIABLES}
-section in this manual. @xref{SHOW VARIABLES}.
-
-You can also see some statistics from a running server by issuing the command
-@code{SHOW STATUS}. @xref{SHOW STATUS}.
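-
-For example, to look at a single variable and at the current status
-counters (the variable name is only an illustration):
-
-@example
-mysql> SHOW VARIABLES LIKE 'key_buffer_size';
-mysql> SHOW STATUS;
-@end example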
-
-@strong{MySQL} uses algorithms that are very scalable, so you can usually
-run it with very little memory. However, if you give @strong{MySQL} more
-memory, you will normally also get better performance.
-
-When tuning a @strong{MySQL} server, the two most important variables to
-tune are @code{key_buffer_size} and @code{table_cache}. You should first
-feel confident that you have these right before trying to change any of the
-other variables.
-
-If you have much memory (>=256M) and many tables and want maximum performance
-with a moderate number of clients, you should use something like this:
-
-@example
-shell> safe_mysqld -O key_buffer=64M -O table_cache=256 \
- -O sort_buffer=4M -O record_buffer=1M &
-@end example
-
-If you have only 128M and only a few tables, but you still do a lot of
-sorting, you can use something like:
-
-@example
-shell> safe_mysqld -O key_buffer=16M -O sort_buffer=1M
-@end example
-
-If you have little memory and lots of connections, use something like this:
-
-@example
-shell> safe_mysqld -O key_buffer=512k -O sort_buffer=100k \
- -O record_buffer=100k &
-@end example
-
-or even:
-
-@example
-shell> safe_mysqld -O key_buffer=512k -O sort_buffer=16k \
- -O table_cache=32 -O record_buffer=8k -O net_buffer=1K &
-@end example
-
-When you have installed @strong{MySQL}, the @file{support-files} directory
-will contain some different @code{my.cnf} example files, @file{my-huge.cnf},
-@file{my-large.cnf}, @file{my-medium.cnf}, and @file{my-small.cnf}, that you
-can use as a base to optimize your system.
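-
-For example, on a medium-sized system you could copy one of them into place
-as a starting point (the destination path depends on your setup):
-
-@example
-shell> cp support-files/my-medium.cnf /etc/my.cnf
-@end example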
-
-If there are very many connections, ``swapping problems'' may occur unless
-@code{mysqld} has been configured to use very little memory for each
-connection. @code{mysqld} performs better if you have enough memory for all
-connections, of course.
-
-Note that if you change an option to @code{mysqld}, it remains in effect only
-for that instance of the server.
-
-To see the effects of a parameter change, do something like this:
-
-@example
-shell> mysqld -O key_buffer=32m --help
-@end example
-
-Make sure that the @code{--help} option is last; otherwise, the effect of any
-options listed after it on the command line will not be reflected in the
-output.
-
-@cindex tables, opening
-@cindex tables, closing
-@cindex opening, tables
-@cindex closing, tables
-@cindex table cache
-@findex table_cache
-@node Table cache, Creating many tables, Server parameters, System
-@subsection How MySQL Opens and Closes Tables
-
-@code{table_cache}, @code{max_connections}, and @code{max_tmp_tables}
-affect the maximum number of files the server keeps open. If you
-increase one or more of these values, you may run up against a limit
-imposed by your operating system on the per-process number of open file
-descriptors. However, you can increase the limit on many systems.
-Consult your OS documentation to find out how to do this, because the
-method for changing the limit varies widely from system to system.
-
-@code{table_cache} is related to @code{max_connections}. For example,
-for 200 concurrent running connections, you should have a table cache of
-at least @code{200 * n}, where @code{n} is the maximum number of tables
-in a join. You also need to reserve some extra file descriptors for
-temporary tables and files.
-
-The cache of open tables can grow to a maximum of @code{table_cache}
-(default 64; this can be changed with the @code{-O table_cache=#}
-option to @code{mysqld}). A table is never closed, except when the
-cache is full and another thread tries to open a table or if you use
-@code{mysqladmin refresh} or @code{mysqladmin flush-tables}.
-
-When the table cache fills up, the server uses the following procedure
-to locate a cache entry to use:
-
-@itemize @bullet
-@item
-Tables that are not currently in use are released, in least-recently-used
-order.
-
-@item
-If the cache is full and no tables can be released, but a new table needs to
-be opened, the cache is temporarily extended as necessary.
-
-@item
-If the cache is in a temporarily-extended state and a table goes from in-use
-to not-in-use state, the table is closed and released from the cache.
-@end itemize
-
-A table is opened for each concurrent access. This means that
-if you have two threads accessing the same table or access the table
-twice in the same query (with @code{AS}) the table needs to be opened twice.
-The first open of any table takes two file descriptors; each additional
-use of the table takes only one file descriptor. The extra descriptor
-for the first open is used for the index file; this descriptor is shared
-among all threads.
-
-You can check if your table cache is too small by checking the @code{mysqld}
-status variable @code{Opened_tables}. If this value is quite large, even if
-you haven't issued a lot of @code{FLUSH TABLES} statements, you should
-increase your table cache. @xref{SHOW STATUS}.
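-
-A minimal way to check this from the shell (the @code{grep} pattern is only
-an illustration):
-
-@example
-shell> mysqladmin extended-status | grep -i opened_tables
-@end example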
-
-@cindex tables, too many
-@node Creating many tables, Open tables, Table cache, System
-@subsection Drawbacks to Creating Large Numbers of Tables in the Same Database
-
-If you have many files in a directory, open, close, and create operations will
-be slow. If you execute @code{SELECT} statements on many different tables,
-there will be a little overhead when the table cache is full, because for
-every table that has to be opened, another must be closed. You can reduce
-this overhead by making the table cache larger.
-
-@cindex tables, open
-@cindex open tables
-@node Open tables, Memory use, Creating many tables, System
-@subsection Why So Many Open Tables?
-
-When you run @code{mysqladmin status}, you'll see something like this:
-
-@example
-Uptime: 426 Running threads: 1 Questions: 11082 Reloads: 1 Open tables: 12
-@end example
-
-This can be somewhat perplexing if you only have 6 tables.
-
-@strong{MySQL} is multithreaded, so it may have many queries on the same
-table simultaneously. To minimize the problem with two threads having
-different states on the same file, the table is opened independently by
-each concurrent thread. This takes some memory and one extra file
-descriptor for the data file. The index file descriptor is shared
-between all threads.
-
-@cindex memory use
-@node Memory use, Internal locking, Open tables, System
-@subsection How MySQL Uses Memory
-
-The list below indicates some of the ways that the @code{mysqld} server
-uses memory. Where applicable, the name of the server variable relevant
-to the memory use is given:
-
-@itemize @bullet
-@item
-The key buffer (variable @code{key_buffer_size}) is shared by all
-threads. Other buffers used by the server are allocated as
-needed. @xref{Server parameters}.
-
-@item
-Each connection uses some thread-specific space: A stack (default 64K,
-variable @code{thread_stack}), a connection buffer (variable
-@code{net_buffer_length}), and a result buffer (variable
-@code{net_buffer_length}). The connection buffer and result buffer are
-dynamically enlarged up to @code{max_allowed_packet} when needed. When
-a query is running, a copy of the current query string is also allocated.
-
-@item
-All threads share the same base memory.
-
-@item
-Only the compressed ISAM / MyISAM tables are memory mapped. This is
-because the 32-bit memory space of 4GB is not large enough for most
-big tables. When systems with a 64-bit address space become more
-common we may add general support for memory mapping.
-
-@item
-Each request doing a sequential scan over a table allocates a read buffer
-(variable @code{record_buffer}).
-
-@item
-All joins are done in one pass, and most joins can be done without even
-using a temporary table. Most temporary tables are memory-based (HEAP)
-tables. Temporary tables with a big record length (calculated as the
-sum of all column lengths) or that contain @code{BLOB} columns are
-stored on disk.
-
-One problem in @strong{MySQL} versions before Version 3.23.2 is that if a HEAP table
-exceeds the size of @code{tmp_table_size}, you get the error @code{The
-table tbl_name is full}. In newer versions this is handled by
-automatically changing the in-memory (HEAP) table to a disk-based
-(MyISAM) table as necessary. To work around this problem, you can
-increase the temporary table size by setting the @code{tmp_table_size}
-option to @code{mysqld}, or by setting the SQL option
-@code{SQL_BIG_TABLES} in the client program. @xref{SET OPTION, ,
-@code{SET OPTION}}. In @strong{MySQL} Version 3.20, the maximum size of the
-temporary table was @code{record_buffer*16}, so if you are using this
-version, you have to increase the value of @code{record_buffer}. You can
-also start @code{mysqld} with the @code{--big-tables} option to always
-store temporary tables on disk. However, this will affect the speed of
-many complicated queries.
-
-@item
-Most requests doing a sort allocate a sort buffer and 0-2 temporary
-files depending on the result set size. @xref{Temporary files}.
-
-@item
-Almost all parsing and calculating is done in a local memory store. No
-memory overhead is needed for small items and the normal slow memory
-allocation and freeing is avoided. Memory is allocated only for
-unexpectedly large strings (this is done with @code{malloc()} and
-@code{free()}).
-
-@item
-Each index file is opened once and the data file is opened once for each
-concurrently running thread. For each concurrent thread, a table structure,
-column structures for each column, and a buffer of size @code{3 * n} is
-allocated (where @code{n} is the maximum row length, not counting @code{BLOB}
-columns). A @code{BLOB} uses 5 to 8 bytes plus the length of the @code{BLOB}
-data. The @code{ISAM}/@code{MyISAM} table handlers will use one extra row
-buffer for internal usage.
-
-@item
-For each table having @code{BLOB} columns, a buffer is enlarged dynamically
-to read in larger @code{BLOB} values. If you scan a table, a buffer as large
-as the largest @code{BLOB} value is allocated.
-
-@item
-Table handlers for all in-use tables are saved in a cache and managed as a
-FIFO. Normally the cache has 64 entries. If a table has been used by two
-running threads at the same time, the cache contains two entries for the
-table. @xref{Table cache}.
-
-@item
-A @code{mysqladmin flush-tables} command closes all tables that are not in
-use and marks all in-use tables to be closed when the currently executing
-thread finishes. This will effectively free most in-use memory.
-@end itemize
-
-@code{ps} and other system status programs may report that @code{mysqld}
-uses a lot of memory. This may be caused by thread-stacks on different
-memory addresses. For example, the Solaris version of @code{ps} counts
-the unused memory between stacks as used memory. You can verify this by
-checking available swap with @code{swap -s}. We have tested
-@code{mysqld} with commercial memory-leakage detectors, so there should
-be no memory leaks.
-
-@cindex internal locking
-@cindex locking, tables
-@cindex tables, locking
-@node Internal locking, Table locking, Memory use, System
-@subsection How MySQL Locks Tables
-
-You can find a discussion about different locking methods in the appendix.
-@xref{Locking methods}.
-
-All locking in @strong{MySQL} is deadlock-free. This is managed by always
-requesting all needed locks at once at the beginning of a query and always
-locking the tables in the same order.
-
-The locking method @strong{MySQL} uses for @code{WRITE} locks works as follows:
-
-@itemize @bullet
-@item
-If there are no locks on the table, put a write lock on it.
-@item
-Otherwise, put the lock request in the write lock queue.
-@end itemize
-
-The locking method @strong{MySQL} uses for @code{READ} locks works as follows:
-
-@itemize @bullet
-@item
-If there are no write locks on the table, put a read lock on it.
-@item
-Otherwise, put the lock request in the read lock queue.
-@end itemize
-
-When a lock is released, the lock is made available first to the threads
-in the write lock queue, then to the threads in the read lock queue.
-
-This means that if you have many updates on a table, @code{SELECT}
-statements will wait until there are no more updates.
-
-To work around this for the case where you want to do many @code{INSERT} and
-@code{SELECT} operations on a table, you can insert rows in a temporary
-table and update the real table with the records from the temporary table
-once in a while.
-
-This can be done with the following code:
-@example
-mysql> LOCK TABLES real_table WRITE, insert_table WRITE;
-mysql> INSERT INTO real_table SELECT * FROM insert_table;
-mysql> TRUNCATE TABLE insert_table;
-mysql> UNLOCK TABLES;
-@end example
-
-You can use the @code{LOW_PRIORITY} option with @code{INSERT},
-@code{UPDATE}, or @code{DELETE}, or @code{HIGH_PRIORITY} with
-@code{SELECT}, if you want to prioritize retrieval in some specific
-cases. You can also start @code{mysqld} with @code{--low-priority-updates}
-to get the same behavior.
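-
-For example (the table and values are only illustrations):
-
-@example
-mysql> INSERT LOW_PRIORITY INTO log_table VALUES (1,'some data');
-mysql> SELECT HIGH_PRIORITY * FROM log_table WHERE id=1;
-@end example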
-
-Using @code{SQL_BUFFER_RESULT} can also help make table locks shorter.
-@xref{SELECT}.
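-
-@code{SQL_BUFFER_RESULT} forces the result to be put into a temporary
-table, so the read lock on the table can be released earlier. For example
-(the table name is only an illustration):
-
-@example
-mysql> SELECT SQL_BUFFER_RESULT * FROM big_table WHERE id < 1000;
-@end example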
-
-You could also change the locking code in @file{mysys/thr_lock.c} to use a
-single queue. In this case, write locks and read locks would have the same
-priority, which might help some applications.
-
-@cindex problems, table locking
-@node Table locking, DNS, Internal locking, System
-@subsection Table Locking Issues
-
-The table locking code in @strong{MySQL} is deadlock-free.
-
-@strong{MySQL} uses table locking (instead of row locking or column
-locking) on all table types, except @code{BDB} tables, to achieve a very
-high lock speed. For large tables, table locking is MUCH better than
-row locking for most applications, but there are, of course, some
-pitfalls.
-
-For @code{BDB} and @code{InnoDB} tables, @strong{MySQL} only uses table
-locking if you explicitly lock the table with @code{LOCK TABLES} or
-execute a command that will modify every row in the table, like
-@code{ALTER TABLE}. For these table types we recommend that you not use
-@code{LOCK TABLES} at all.
-
-In @strong{MySQL} Version 3.23.7 and above, you can insert rows into
-@code{MyISAM} tables at the same time other threads are reading from the
-table. Note that currently this only works if there are no holes after
-deleted rows in the table at the time the insert is made. When all holes
-have been filled with new data, concurrent inserts will automatically be
-enabled again.
-
-Table locking enables many threads to read from a table at the same
-time, but if a thread wants to write to a table, it must first get
-exclusive access. During the update, all other threads that want to
-access this particular table will wait until the update is finished.
-
-As updates on tables normally are considered to be more important than
-@code{SELECT} statements, all statements that update a table have higher
-priority than statements that retrieve information from a table. This should
-ensure that updates are not 'starved' because someone issues a lot of heavy
-queries against a specific table. (You can change this by using
-@code{LOW_PRIORITY} with the statement that does the update or
-@code{HIGH_PRIORITY} with the @code{SELECT} statement.)
-
-Starting from @strong{MySQL} Version 3.23.7 you can use the
-@code{max_write_lock_count} variable to force @strong{MySQL} to
-temporarily give all @code{SELECT} statements that are waiting for a
-table a higher priority after a specific number of inserts on the table.
-
-Table locking is, however, not very good under the following scenario:
-
-@itemize @bullet
-@item
-A client issues a @code{SELECT} that takes a long time to run.
-@item
-Another client then issues an @code{UPDATE} on the same table. This client
-will wait until the @code{SELECT} is finished.
-@item
-Another client issues another @code{SELECT} statement on the same table. As
-@code{UPDATE} has higher priority than @code{SELECT}, this @code{SELECT}
-will wait for the @code{UPDATE} to finish. It will also wait for the first
-@code{SELECT} to finish!
-@item
-A thread is waiting for something like @code{full disk}, in which case all
-threads that want to access the problem table will also be put in a waiting
-state until more disk space is made available.
-@end itemize
-
-Some possible solutions to this problem are:
-
-@itemize @bullet
-@item
-Try to get the @code{SELECT} statements to run faster. You may have to create
-some summary tables to do this.
-
-@item
-Start @code{mysqld} with @code{--low-priority-updates}. This will give
-all statements that update (modify) a table lower priority than a
-@code{SELECT} statement. In this case the last @code{SELECT} statement in
-the previous scenario would execute before the @code{UPDATE} statement.
-
-@item
-You can give a specific @code{INSERT}, @code{UPDATE}, or @code{DELETE}
-statement lower priority with the @code{LOW_PRIORITY} attribute.
-
-@item
-Start @code{mysqld} with a low value for @code{max_write_lock_count} to
-allow @code{READ} locks after a certain number of @code{WRITE} locks.
-
-@item
-You can specify that all updates from a specific thread should be done with
-low priority by using the SQL command: @code{SET SQL_LOW_PRIORITY_UPDATES=1}.
-@xref{SET OPTION, , @code{SET OPTION}}.
-
-@item
-You can specify that a specific @code{SELECT} is very important with the
-@code{HIGH_PRIORITY} attribute. @xref{SELECT, , @code{SELECT}}.
-
-@item
-If you have problems with @code{INSERT} combined with @code{SELECT},
-switch to the new @code{MyISAM} tables, as these support concurrent
-@code{SELECT}s and @code{INSERT}s.
-
-@item
-If you mainly mix @code{INSERT} and @code{SELECT} statements, the
-@code{DELAYED} attribute to @code{INSERT} will probably solve your problems.
-@xref{INSERT, , @code{INSERT}}.
-
-@item
-If you have problems with @code{SELECT} and @code{DELETE}, the @code{LIMIT}
-option to @code{DELETE} may help. @xref{DELETE, , @code{DELETE}}.
-@end itemize
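-
-For example, to delete old rows in chunks so that no single long write lock
-is held (the table and column names are only illustrations):
-
-@example
-mysql> DELETE FROM log_table WHERE log_date < '2001-01-01' LIMIT 1000;
-@end example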
-
-@cindex DNS
-@cindex hostname caching
-@node DNS, , Table locking, System
-@subsection How MySQL uses DNS
-
-When a new client connects to @code{mysqld}, @code{mysqld} will spawn a
-new thread to handle the request. This thread will first check if the
-hostname is in the hostname cache. If not, the thread will call
-@code{gethostbyaddr_r()} and @code{gethostbyname_r()} to resolve the
-hostname.
-
-If the operating system doesn't support the above thread-safe calls, the
-thread will lock a mutex and call @code{gethostbyaddr()} and
-@code{gethostbyname()} instead. Note that in this case no other thread
-can resolve hostnames that are not in the hostname cache until the
-first thread has finished.
-
-You can disable DNS host lookups by starting @code{mysqld} with
-@code{--skip-name-resolve}. In this case, however, you can only use IP
-numbers in the @strong{MySQL} privilege tables.
-
-If you have a very slow DNS and many hosts, you can get more performance by
-either disabling DNS lookups with @code{--skip-name-resolve} or by
-increasing the @code{HOST_CACHE_SIZE} define (default: 128) and recompiling
-@code{mysqld}.
-
-You can disable the hostname cache with @code{--skip-host-cache}. You
-can clear the hostname cache with @code{FLUSH HOSTS} or @code{mysqladmin
-flush-hosts}.
-
-If you don't want to allow connections over @code{TCP/IP}, you can do this
-by starting @code{mysqld} with @code{--skip-networking}.
-
-@cindex data, size
-@cindex reducing, data size
-@cindex storage space, minimizing
-@cindex tables, improving performance
-@cindex performance, improving
-@node Data size, MySQL indexes, System, Performance
-@section Get Your Data as Small as Possible
-
-One of the most basic optimizations is to get your data (and indexes) to
-take as little space on the disk (and in memory) as possible. This can
-give huge improvements because disk reads are faster and normally less
-main memory will be used. Indexing also takes fewer resources if
-done on smaller columns.
-
-@strong{MySQL} supports a lot of different table types and row formats.
-Choosing the right table format may give you a big performance gain.
-@xref{Table types}.
-
-You can get better performance on a table and minimize storage space
-using the techniques listed below:
-
-@itemize @bullet
-@item
-Use the most efficient (smallest) types possible. @strong{MySQL} has
-many specialized types that save disk space and memory.
-@item
-Use the smaller integer types if possible to get smaller tables. For
-example, @code{MEDIUMINT} is often better than @code{INT}.
-@item
-Declare columns to be @code{NOT NULL} if possible. It makes everything
-faster and you save one bit per column. Note that if you really need
-@code{NULL} in your application you should definitely use it. Just avoid
-having it on all columns by default.
-@item
-If you don't have any variable-length columns (@code{VARCHAR},
-@code{TEXT}, or @code{BLOB} columns), a fixed-size record format is
-used. This is faster but unfortunately may waste some space.
-@xref{MyISAM table formats}.
-@item
-The primary index of a table should be as short as possible. This makes
-identification of one row easy and efficient.
-@item
-For each table, you have to decide which storage/index method to
-use. @xref{Table types}.
-@item
-Only create the indexes that you really need. Indexes are good for
-retrieval but bad when you need to store things fast. If you mostly
-access a table by searching on a combination of columns, make an index
-on them. The first index part should be the most used column. If you are
-ALWAYS using many columns, you should use the column with more duplicates
-first to get better compression of the index.
-@item
-If it's very likely that a column has a unique prefix on the first several
-characters, it's better to only index this prefix. @strong{MySQL}
-supports an index on a part of a character column. Shorter indexes are
-faster not only because they take less disk space but also because they
-will give you more hits in the index cache and thus fewer disk
-seeks (see the example after this list). @xref{Server parameters}.
-@item
-In some circumstances it can be beneficial to split a table that is
-scanned very often into two. This is especially true if it is a
-dynamic-format table and it is possible to use a smaller static format
-table that can be used to find the relevant rows when scanning the table.
-@end itemize
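-
-As a sketch, the following table definition combines several of the above
-points: a small integer type, @code{NOT NULL} columns, and an index on a
-column prefix (the table itself is only an illustration):
-
-@example
-mysql> CREATE TABLE customer (
-    ->   id MEDIUMINT UNSIGNED NOT NULL AUTO_INCREMENT,
-    ->   name CHAR(40) NOT NULL,
-    ->   PRIMARY KEY (id),
-    ->   INDEX (name(10)));
-@end example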
-
-@cindex indexes, uses for
-@node MySQL indexes, Query Speed, Data size, Performance
-@section How MySQL Uses Indexes
-
-Indexes are used to quickly find rows with a specific value in one
-column. Without an index @strong{MySQL} has to start with the first record
-and then read through the whole table until it finds the relevant
-rows. The bigger the table, the more this costs. If the table has an index
-for the columns in question, @strong{MySQL} can quickly get a position to
-seek to in the middle of the data file without having to look at all the
-data. If a table has 1000 rows, this is at least 100 times faster than
-reading sequentially. Note that if you need to access almost all 1000
-rows, it is faster to read sequentially, because that avoids disk seeks.
-
-All @strong{MySQL} indexes (@code{PRIMARY}, @code{UNIQUE}, and
-@code{INDEX}) are stored in B-trees. Strings are automatically prefix-
-and end-space compressed. @xref{CREATE INDEX, , @code{CREATE INDEX}}.
-
-Indexes are used to:
-@itemize @bullet
-@item
-Quickly find the rows that match a @code{WHERE} clause.
-
-@item
-Retrieve rows from other tables when performing joins.
-
-@item
-Find the @code{MAX()} or @code{MIN()} value for a specific indexed
-column. This is optimized by a preprocessor that checks if you are
-using @code{WHERE key_part_# = constant} on all key parts < N. In this case
-@strong{MySQL} will do a single key lookup and replace the @code{MIN()}
-expression with a constant. If all expressions are replaced with
-constants, the query will return at once:
-
-@example
-SELECT MIN(key_part2),MAX(key_part2) FROM table_name WHERE key_part1=10
-@end example
-
-@item
-Sort or group a table if the sorting or grouping is done on a leftmost
-prefix of a usable key (for example, @code{ORDER BY key_part1,key_part2}).
-The key is read in reverse order if all key parts are followed by
-@code{DESC}.
-
-The index can also be used even if the @code{ORDER BY} doesn't match the
-index exactly, as long as all the unused index parts and all the extra
-@code{ORDER BY} columns are constants in the @code{WHERE} clause. The
-following queries will use the index to resolve the @code{ORDER BY} part:
-
-@example
-SELECT * FROM foo ORDER BY key_part1,key_part2,key_part3;
-SELECT * FROM foo WHERE column=constant ORDER BY column, key_part1;
-SELECT * FROM foo WHERE key_part1=const GROUP BY key_part2;
-@end example
-
-@item
-In some cases a query can be optimized to retrieve values without
-consulting the data file. If all used columns for some table are numeric
-and form a leftmost prefix for some key, the values may be retrieved
-from the index tree for greater speed:
-
-@example
-SELECT key_part3 FROM table_name WHERE key_part1=1
-@end example
-
-@end itemize
-
-Suppose you issue the following @code{SELECT} statement:
-
-@example
-mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
-@end example
-
-If a multiple-column index exists on @code{col1} and @code{col2}, the
-appropriate rows can be fetched directly. If separate single-column
-indexes exist on @code{col1} and @code{col2}, the optimizer tries to
-find the most restrictive index by deciding which index will find fewer
-rows and using that index to fetch the rows.
-
-@cindex indexes, leftmost prefix of
-@cindex leftmost prefix of indexes
-If the table has a multiple-column index, any leftmost prefix of the
-index can be used by the optimizer to find rows. For example, if you
-have a three-column index on @code{(col1,col2,col3)}, you have indexed
-search capabilities on @code{(col1)}, @code{(col1,col2)}, and
-@code{(col1,col2,col3)}.
-
-@strong{MySQL} can't use a partial index if the columns don't form a
-leftmost prefix of the index. Suppose you have the @code{SELECT}
-statements shown below:
-
-@example
-mysql> SELECT * FROM tbl_name WHERE col1=val1;
-mysql> SELECT * FROM tbl_name WHERE col2=val2;
-mysql> SELECT * FROM tbl_name WHERE col2=val2 AND col3=val3;
-@end example
-
-If an index exists on @code{(col1,col2,col3)}, only the first query
-shown above uses the index. The second and third queries do involve
-indexed columns, but @code{(col2)} and @code{(col2,col3)} are not
-leftmost prefixes of @code{(col1,col2,col3)}.
-
-@findex LIKE, and indexes
-@findex LIKE, and wildcards
-@cindex indexes, and @code{LIKE}
-@cindex wildcards, and @code{LIKE}
-@strong{MySQL} also uses indexes for @code{LIKE} comparisons if the argument
-to @code{LIKE} is a constant string that doesn't start with a wild-card
-character. For example, the following @code{SELECT} statements use indexes:
-
-@example
-mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Patrick%";
-mysql> SELECT * FROM tbl_name WHERE key_col LIKE "Pat%_ck%";
-@end example
-
-In the first statement, only rows with @code{"Patrick" <= key_col <
-"Patricl"} are considered. In the second statement, only rows with
-@code{"Pat" <= key_col < "Pau"} are considered.
-
-The following @code{SELECT} statements will not use indexes:
-@example
-mysql> SELECT * FROM tbl_name WHERE key_col LIKE "%Patrick%";
-mysql> SELECT * FROM tbl_name WHERE key_col LIKE other_col;
-@end example
-
-In the first statement, the @code{LIKE} value begins with a wild-card
-character. In the second statement, the @code{LIKE} value is not a
-constant.
-
-@findex IS NULL, and indexes
-@cindex indexes, and @code{IS NULL}
-Searching using @code{column_name IS NULL} will use indexes if column_name
-is indexed.
-
-@strong{MySQL} normally uses the index that finds the least number of rows. An
-index is used for columns that you compare with the following operators:
-@code{=}, @code{>}, @code{>=}, @code{<}, @code{<=}, @code{BETWEEN}, and a
-@code{LIKE} with a non-wild-card prefix like @code{'something%'}.
-
-Any index that doesn't span all @code{AND} levels in the @code{WHERE} clause
-is not used to optimize the query. In other words: To be able to use an
-index, a prefix of the index must be used in every @code{AND} group.
-
-The following @code{WHERE} clauses use indexes:
-@example
-... WHERE index_part1=1 AND index_part2=2 AND other_column=3
-... WHERE index=1 OR A=10 AND index=2 /* index = 1 OR index = 2 */
-... WHERE index_part1='hello' AND index_part_3=5
- /* optimized like "index_part1='hello'" */
-... WHERE index1=1 and index2=2 or index1=3 and index3=3;
- /* Can use index on index1 but not on index2 or index3 */
-@end example
-
-These @code{WHERE} clauses do @strong{NOT} use indexes:
-@example
-... WHERE index_part2=1 AND index_part3=2 /* index_part1 is not used */
-... WHERE index=1 OR A=10 /* Index is not used in both OR parts */
-... WHERE index_part1=1 OR index_part2=10 /* No index spans all rows */
-@end example
-
-Note that in some cases @strong{MySQL} will not use an index, even if one
-would be available. Some of the cases where this happens are:
-
-@itemize @bullet
-@item
-If the use of the index would require @strong{MySQL} to access more
-than 30 % of the rows in the table. (In this case a table scan is
-probably much faster, as this will require us to do much fewer seeks).
-Note that if such a query uses @code{LIMIT} to only retrieve
-part of the rows, @strong{MySQL} will use an index anyway, as it can
-much more quickly find the few rows to return in the result.
-@end itemize
-
-@cindex queries, speed of
-@cindex permission checks, effect on speed
-@cindex speed, of queries
-@node Query Speed, Tips, MySQL indexes, Performance
-@section Speed of Queries that Access or Update Data
-
-First, one thing that affects all queries: The more complex permission
-system setup you have, the more overhead you get.
-
-If you have not issued any @code{GRANT} statements, @strong{MySQL} will
-optimize the permission checking somewhat. So if you have a very high
-volume it may be worth the time to avoid grants. Otherwise, more
-permission checking results in larger overhead.
-
-If your problem is with some explicit @strong{MySQL} function, you can
-always time this in the @strong{MySQL} client:
-
-@example
-mysql> select benchmark(1000000,1+1);
-+------------------------+
-| benchmark(1000000,1+1) |
-+------------------------+
-|                      0 |
-+------------------------+
-1 row in set (0.32 sec)
-@end example
-
-The above shows that @strong{MySQL} can execute 1,000,000 @code{+}
-expressions in 0.32 seconds on a @code{PentiumII 400MHz}.
-
-All @strong{MySQL} functions should be highly optimized, but there may be
-some exceptions, and @code{benchmark(loop_count,expression)} is a
-great tool to find out if this is a problem with your query.
-
-@menu
-* Estimating performance:: Estimating query performance
-* SELECT speed:: Speed of @code{SELECT} queries
-* Where optimizations:: How MySQL optimizes @code{WHERE} clauses
-* DISTINCT optimization:: How MySQL Optimizes @code{DISTINCT}
-* LEFT JOIN optimization:: How MySQL optimizes @code{LEFT JOIN}
-* LIMIT optimization:: How MySQL optimizes @code{LIMIT}
-* Insert speed:: Speed of @code{INSERT} queries
-* Update speed:: Speed of @code{UPDATE} queries
-* Delete speed:: Speed of @code{DELETE} queries
-@end menu
-
-@cindex estimating, query performance
-@cindex queries, estimating performance
-@cindex performance, estimating
-@node Estimating performance, SELECT speed, Query Speed, Query Speed
-@subsection Estimating Query Performance
-
-In most cases you can estimate the performance by counting disk seeks.
-For small tables, you can usually find the row in 1 disk seek (as the
-index is probably cached). For bigger tables, you can estimate that
-(using B-tree indexes) you will need: @code{log(row_count) /
-log(index_block_length / 3 * 2 / (index_length + data_pointer_length)) +
-1} seeks to find a row.
-
-In @strong{MySQL} an index block is usually 1024 bytes and the data
-pointer is usually 4 bytes. A 500,000 row table with an
-index length of 3 (medium integer) gives you:
-@code{log(500,000)/log(1024/3*2/(3+4)) + 1} = 4 seeks.
-
-As the above index would require about 500,000 * 7 * 3/2 = 5.2M of storage
-(assuming that the index buffers are filled to 2/3, which is typical),
-you will probably have much of the index in memory and you will probably
-only need 1-2 calls to read data from the OS to find the row.
-
-For writes, however, you will need 4 seek requests (as above) to find
-where to place the new index and normally 2 seeks to update the index
-and write the row.
-
-Note that the above doesn't mean that your application will slowly
-degenerate by N log N! As long as everything is cached by the OS or SQL
-server, things will only go marginally slower while the table gets
-bigger. After the data gets too big to be cached, things will start to
-go much slower until your application is bound only by disk seeks
-(which increase by N log N). To avoid this, increase the index cache as
-the data grows. @xref{Server parameters}.
-
-@cindex speed, of queries
-@findex SELECT speed
-
-@node SELECT speed, Where optimizations, Estimating performance, Query Speed
-@subsection Speed of @code{SELECT} Queries
-
-In general, when you want to make a slow @code{SELECT ... WHERE} faster, the
-first thing to check is whether or not you can add an index. @xref{MySQL
-indexes, , @strong{MySQL} indexes}. All references between different tables
-should usually be done with indexes. You can use the @code{EXPLAIN} command
-to determine which indexes are used for a @code{SELECT}.
-@xref{EXPLAIN, , @code{EXPLAIN}}.
-
-Some general tips:
-
-@itemize @bullet
-@item
-To help @strong{MySQL} optimize queries better, run @code{myisamchk
---analyze} on a table after it has been loaded with relevant data. This
-updates a value for each index part that indicates the average number of
-rows that have the same value. (For unique indexes, this is always 1,
-of course.) @strong{MySQL} will use this to decide which index to
-choose when you connect two tables with 'a non-constant expression'.
-You can check the result from the @code{analyze} run by doing @code{SHOW
-INDEX FROM table_name} and examining the @code{Cardinality} column
-(see the example after this list).
-
-@item
-To sort an index and data according to an index, use @code{myisamchk
---sort-index --sort-records=1} (if you want to sort on index 1). If you
-have a unique index from which you want to read all records in order
-according to that index, this is a good way to make that faster. Note,
-however, that this sorting isn't written optimally and will take a long
-time for a large table!
-@end itemize
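-
-For example (the path and table name are only illustrations):
-
-@example
-shell> myisamchk --analyze /usr/local/mysql/data/test/tbl_name
-mysql> SHOW INDEX FROM tbl_name;
-@end example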
-
-@cindex optimizations
-@findex WHERE
-@node Where optimizations, DISTINCT optimization, SELECT speed, Query Speed
-@subsection How MySQL Optimizes @code{WHERE} Clauses
-
-The @code{WHERE} optimizations are put in the @code{SELECT} part here because
-they are mostly used with @code{SELECT}, but the same optimizations apply for
-@code{WHERE} in @code{DELETE} and @code{UPDATE} statements.
-
-Also note that this section is incomplete. @strong{MySQL} does many
-optimizations, and we have not had time to document them all.
-
-Some of the optimizations performed by @strong{MySQL} are listed below:
-
-@itemize @bullet
-@item
-Removal of unnecessary parentheses:
-@example
- ((a AND b) AND c OR (((a AND b) AND (c AND d))))
--> (a AND b AND c) OR (a AND b AND c AND d)
-@end example
-@item
-Constant folding:
-@example
- (a<b AND b=c) AND a=5
--> b>5 AND b=c AND a=5
-@end example
-@item
-Constant condition removal (needed because of constant folding):
-@example
- (B>=5 AND B=5) OR (B=6 AND 5=5) OR (B=7 AND 5=6)
--> B=5 OR B=6
-@end example
-@item
-Constant expressions used by indexes are evaluated only once.
-@item
-@code{COUNT(*)} on a single table without a @code{WHERE} is retrieved
-directly from the table information. This is also done for any @code{NOT NULL}
-expression when used with only one table.
-@item
-Early detection of invalid constant expressions. @strong{MySQL} quickly
-detects that some @code{SELECT} statements are impossible and returns no rows.
-@item
-@code{HAVING} is merged with @code{WHERE} if you don't use @code{GROUP BY}
-or group functions (@code{COUNT()}, @code{MIN()}...).
-@item
-For each sub-join, a simpler @code{WHERE} is constructed to get a fast
-@code{WHERE} evaluation for each sub-join and also to skip records as
-soon as possible.
-@cindex constant table
-@cindex tables, constant
-@item
-All constant tables are read first, before any other tables in the query.
-A constant table is:
-@itemize @minus
-@item
-An empty table or a table with 1 row.
-@item
-A table that is used with a @code{WHERE} clause on a @code{UNIQUE}
-index, or a @code{PRIMARY KEY}, where all index parts are used with constant
-expressions and the index parts are defined as @code{NOT NULL}.
-@end itemize
-All the following tables are used as constant tables:
-@example
-mysql> SELECT * FROM t WHERE primary_key=1;
-mysql> SELECT * FROM t1,t2
- WHERE t1.primary_key=1 AND t2.primary_key=t1.id;
-@end example
-
-@item
-The best join combination to join the tables is found by trying all
-possibilities. If all columns in @code{ORDER BY} and in @code{GROUP
-BY} come from the same table, then this table is preferred first when
-joining.
-@item
-If there is an @code{ORDER BY} clause and a different @code{GROUP BY}
-clause, or if the @code{ORDER BY} or @code{GROUP BY} contains columns
-from tables other than the first table in the join queue, a temporary
-table is created.
-@item
-If you use @code{SQL_SMALL_RESULT}, @strong{MySQL} will use an in-memory
-temporary table.
-@item
-Each table index is queried, and the best index that spans fewer than 30% of
-the rows is used. If no such index can be found, a quick table scan is used.
-@item
-In some cases, @strong{MySQL} can read rows from the index without even
-consulting the data file. If all columns used from the index are numeric,
-then only the index tree is used to resolve the query.
-@item
-Before each record is output, those that do not match the @code{HAVING} clause
-are skipped.
-@end itemize
-
-Some examples of queries that are very fast:
-
-@example
-mysql> SELECT COUNT(*) FROM tbl_name;
-mysql> SELECT MIN(key_part1),MAX(key_part1) FROM tbl_name;
-mysql> SELECT MAX(key_part2) FROM tbl_name
- WHERE key_part1=constant;
-mysql> SELECT ... FROM tbl_name
- ORDER BY key_part1,key_part2,... LIMIT 10;
-mysql> SELECT ... FROM tbl_name
- ORDER BY key_part1 DESC,key_part2 DESC,... LIMIT 10;
-@end example
-
-The following queries are resolved using only the index tree (assuming
-the indexed columns are numeric):
-
-@example
-mysql> SELECT key_part1,key_part2 FROM tbl_name WHERE key_part1=val;
-mysql> SELECT COUNT(*) FROM tbl_name
- WHERE key_part1=val1 AND key_part2=val2;
-mysql> SELECT key_part2 FROM tbl_name GROUP BY key_part1;
-@end example
-
-The following queries use indexing to retrieve the rows in sorted
-order without a separate sorting pass:
-
-@example
-mysql> SELECT ... FROM tbl_name ORDER BY key_part1,key_part2,... ;
-mysql> SELECT ... FROM tbl_name ORDER BY key_part1 DESC,key_part2 DESC,... ;
-@end example
-
-@findex DISTINCT
-@cindex optimizing, DISTINCT
-@node DISTINCT optimization, LEFT JOIN optimization, Where optimizations, Query Speed
-@subsection How MySQL Optimizes @code{DISTINCT}
-
-@code{DISTINCT} is converted to a @code{GROUP BY} on all columns.
-@code{DISTINCT} combined with @code{ORDER BY} will in many cases also
-need a temporary table.
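-
-For example, the following two queries are handled the same way:
-
-@example
-mysql> SELECT DISTINCT col1,col2 FROM tbl_name;
-mysql> SELECT col1,col2 FROM tbl_name GROUP BY col1,col2;
-@end example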
-
-When combining @code{LIMIT #} with @code{DISTINCT}, @strong{MySQL} will stop
-as soon as it finds @code{#} unique rows.
-
-If you don't use columns from all tables used in a query, @strong{MySQL}
-will stop scanning the unused tables as soon as it has found the first match.
-
-@example
-SELECT DISTINCT t1.a FROM t1,t2 WHERE t1.a=t2.a;
-@end example
-
-In this case, assuming @code{t1} is used before @code{t2} (check with
-@code{EXPLAIN}), @strong{MySQL} will stop reading from @code{t2} (for that
-particular row in @code{t1}) when the first row in @code{t2} is found.
-
-@findex LEFT JOIN
-@cindex optimizing, LEFT JOIN
-@node LEFT JOIN optimization, LIMIT optimization, DISTINCT optimization, Query Speed
-@subsection How MySQL Optimizes @code{LEFT JOIN} and @code{RIGHT JOIN}
-
-@code{A LEFT JOIN B} in @strong{MySQL} is implemented as follows:
-
-@itemize @bullet
-@item
-The table @code{B} is set to be dependent on table @code{A} and all tables
-that @code{A} is dependent on.
-@item
-The table @code{A} is set to be dependent on all tables (except @code{B})
-that are used in the @code{LEFT JOIN} condition.
-@item
-All @code{LEFT JOIN} conditions are moved to the @code{WHERE} clause.
-@item
-All standard join optimizations are done, with the exception that a table is
-always read after all tables it is dependent on. If there is a circular
-dependence then @strong{MySQL} will issue an error.
-@item
-All standard @code{WHERE} optimizations are done.
-@item
-If there is a row in @code{A} that matches the @code{WHERE} clause, but there
-wasn't any row in @code{B} that matched the @code{LEFT JOIN} condition,
-then an extra @code{B} row is generated with all columns set to @code{NULL}.
-@item
-If you use @code{LEFT JOIN} to find rows that don't exist in some
-table and you have the following test: @code{column_name IS NULL} in the
-@code{WHERE} part, where column_name is a column that is declared as
-@code{NOT NULL}, then @strong{MySQL} will stop searching for more rows
-(for a particular key combination) after it has found one row that
-matches the @code{LEFT JOIN} condition.
-@end itemize
-
-@code{RIGHT JOIN} is implemented analogously to @code{LEFT JOIN}.
-
-The table read order forced by @code{LEFT JOIN} and @code{STRAIGHT JOIN}
-will help the join optimizer (which calculates in which order tables
-should be joined) to do its work much more quickly, as there are fewer
-table permutations to check.
-
-Note that the above means that if you do a query of the type:
-
-@example
-SELECT * FROM a,b LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
-@end example
-
-@strong{MySQL} will do a full scan on @code{b} as the @code{LEFT
-JOIN} will force it to be read before @code{d}.
-
-The fix in this case is to change the query to:
-
-@example
-SELECT * FROM b,a LEFT JOIN c ON (c.key=a.key) LEFT JOIN d ON (d.key=a.key) WHERE b.key=d.key
-@end example
-
-@cindex optimizing, LIMIT
-@findex LIMIT
-@node LIMIT optimization, Insert speed, LEFT JOIN optimization, Query Speed
-@subsection How MySQL Optimizes @code{LIMIT}
-
-In some cases @strong{MySQL} will handle the query differently when you are
-using @code{LIMIT #} and not using @code{HAVING}:
-
-@itemize @bullet
-@item
-If you are selecting only a few rows with @code{LIMIT}, @strong{MySQL}
-will use indexes in some cases when it normally would prefer to do a
-full table scan.
-@item
-If you use @code{LIMIT #} with @code{ORDER BY}, @strong{MySQL} will end the
-sorting as soon as it has found the first @code{#} lines instead of sorting
-the whole table (see the example after this list).
-@item
-When combining @code{LIMIT #} with @code{DISTINCT}, @strong{MySQL} will stop
-as soon as it finds @code{#} unique rows.
-@item
-In some cases a @code{GROUP BY} can be resolved by reading the key in order
-(or doing a sort on the key) and then calculating summaries until the
-key value changes. In this case @code{LIMIT #} will not calculate any
-unnecessary @code{GROUP BY} values.
-@item
-As soon as @strong{MySQL} has sent the first @code{#} rows to the client, it
-will abort the query.
-@item
-@code{LIMIT 0} will always quickly return an empty set. This is useful
-to check the query and to get the column types of the result columns.
-@item
-The size of temporary tables is calculated using the @code{LIMIT #} value to
-determine how much space is needed to resolve the query.
-@end itemize
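-
-For example, a query like the following (the table name is only an
-illustration) can stop sorting as soon as the first 10 rows have been found:
-
-@example
-mysql> SELECT * FROM big_table ORDER BY key_part1 LIMIT 10;
-@end example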
-
-@cindex speed, inserting
-@cindex inserting, speed of
-@node Insert speed, Update speed, LIMIT optimization, Query Speed
-@subsection Speed of @code{INSERT} Queries
-
-The time to insert a record consists approximately of:
-
-@itemize @bullet
-@item
-Connect: (3)
-@item
-Sending query to server: (2)
-@item
-Parsing query: (2)
-@item
-Inserting record: (1 x size of record)
-@item
-Inserting indexes: (1 x number of indexes)
-@item
-Close: (1)
-@end itemize
-
-where the numbers are somewhat proportional to the overall time. This
-does not take into consideration the initial overhead to open tables
-(which is done once for each concurrently running query).
-
-The size of the table slows down the insertion of indexes by N log N
-(B-trees).
-
-Some ways to speed up inserts:
-
-@itemize @bullet
-@item
-If you are inserting many rows from the same client at the same time, use
-@code{INSERT} statements with multiple value lists (see the example after
-this list). This is much faster (many times faster in some cases) than
-using separate @code{INSERT} statements.
-@item
-If you are inserting a lot of rows from different clients, you can get
-higher speed by using the @code{INSERT DELAYED} statement. @xref{INSERT,
-, @code{INSERT}}.
-@item
-Note that with @code{MyISAM} you can insert rows at the same time
-@code{SELECT}s are running if there are no deleted rows in the tables.
-@item
-When loading a table from a text file, use @code{LOAD DATA INFILE}. This
-is usually 20 times faster than using a lot of @code{INSERT} statements.
-@xref{LOAD DATA, , @code{LOAD DATA}}.
-@item
-It is possible with some extra work to make @code{LOAD DATA INFILE} run even
-faster when the table has many indexes. Use the following procedure:
-
-@enumerate
-@item
-Optionally create the table with @code{CREATE TABLE}. For example, using
-@code{mysql} or Perl-DBI.
-
-@item
-Execute a @code{FLUSH TABLES} statement or the shell command @code{mysqladmin
-flush-tables}.
-
-@item
-Use @code{myisamchk --keys-used=0 -rq /path/to/db/tbl_name}. This will
-remove all usage of all indexes from the table.
-
-@item
-Insert data into the table with @code{LOAD DATA INFILE}. This will not
-update any indexes and will therefore be very fast.
-
-@item
-If you are going to only read the table in the future, run @code{myisampack}
-on it to make it smaller. @xref{Compressed format}.
-
-@item
-Re-create the indexes with @code{myisamchk -r -q
-/path/to/db/tbl_name}. This will create the index tree in memory before
-writing it to disk, which is much faster because it avoids lots of disk
-seeks. The resulting index tree is also perfectly balanced.
-
-@item
-Execute a @code{FLUSH TABLES} statement or the shell command @code{mysqladmin
-flush-tables}.
-@end enumerate
-
-This procedure will be built into @code{LOAD DATA INFILE} in some future
-version of @strong{MySQL}.
-@item
-You can speed up insertions by locking your tables:
-
-@example
-mysql> LOCK TABLES a WRITE;
-mysql> INSERT INTO a VALUES (1,23),(2,34),(4,33);
-mysql> INSERT INTO a VALUES (8,26),(6,29);
-mysql> UNLOCK TABLES;
-@end example
-
-The main speed difference is that the index buffer is flushed to disk only
-once, after all @code{INSERT} statements have completed. Normally there would
-be as many index buffer flushes as there are different @code{INSERT}
-statements. Locking is not needed if you can insert all rows with a single
-statement.
-
-Locking will also lower the total time of multi-connection tests, but the
-maximum wait time for some threads will go up (because they wait for
-locks). For example:
-
-@example
-thread 1 does 1000 inserts
-threads 2, 3, and 4 do 1 insert
-thread 5 does 1000 inserts
-@end example
-
-If you don't use locking, threads 2, 3, and 4 will finish before threads 1
-and 5. If you use locking, threads 2, 3, and 4 probably will not finish
-before 1 or 5, but the total time should be about 40% faster.
-
-As @code{INSERT}, @code{UPDATE}, and @code{DELETE} operations are very
-fast in @strong{MySQL}, you will obtain better overall performance by
-adding locks around everything that does more than about 5 inserts or
-updates in a row. If you do very many inserts in a row, you could do a
-@code{LOCK TABLES} followed by an @code{UNLOCK TABLES} once in a while
-(about every 1000 rows) to allow other threads access to the table. This
-would still result in a nice performance gain.
-
-Of course, @code{LOAD DATA INFILE} is much faster for loading data.
-@end itemize
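-
-An @code{INSERT} statement with multiple value lists looks like this:
-
-@example
-mysql> INSERT INTO tbl_name (col1,col2) VALUES (1,'a'),(2,'b'),(3,'c');
-@end example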
-
-To get some more speed for both @code{LOAD DATA INFILE} and
-@code{INSERT}, enlarge the key buffer. @xref{Server parameters}.
-
-@node Update speed, Delete speed, Insert speed, Query Speed
-@subsection Speed of @code{UPDATE} Queries
-
-Update queries are optimized as a @code{SELECT} query with the additional
-overhead of a write. The speed of the write is dependent on the size of
-the data that is being updated and the number of indexes that are
-updated. Indexes that are not changed will not be updated.
-
-Another way to get fast updates is to delay updates and then do
-many updates in a row later. Doing many updates in a row is much quicker
-than doing one at a time if you lock the table.
-
-Note that with the dynamic record format, updating a record to
-a longer total length may split the record. So if you do this often,
-it is very important to run @code{OPTIMIZE TABLE} once in a while.
-@xref{OPTIMIZE TABLE, , @code{OPTIMIZE TABLE}}.
-
-@node Delete speed, , Update speed, Query Speed
-@subsection Speed of @code{DELETE} Queries
-
-If you want to delete all rows in the table, you should use
-@code{TRUNCATE TABLE table_name}. @xref{TRUNCATE}.
-
-The time to delete a record is exactly proportional to the number of
-indexes. To delete records more quickly, you can increase the size of
-the index cache. @xref{Server parameters}.
-
-@cindex optimization, tips
-@cindex tips, optimization
-@node Tips, Benchmarks, Query Speed, Performance
-@section Other Optimization Tips
-
-Unsorted tips for faster systems:
-
-@itemize @bullet
-@item
-Use persistent connections to the database to avoid the connection
-overhead. If you can't use persistent connections and you are doing a
-lot of new connections to the database, you may want to change the value
-of the @code{thread_cache_size} variable. @xref{Server parameters}.
-@item
-Always check that all your queries really use the indexes you have created
-in the tables. In @strong{MySQL} you can do this with the @code{EXPLAIN}
-command. @xref{EXPLAIN, Explain, Explain, manual}.
-@item
-Try to avoid complex @code{SELECT} queries on tables that are updated a
-lot. This is to avoid problems with table locking.
-@item
-With the new @code{MyISAM} table format, rows can be inserted into a table
-without deleted rows at the same time another thread is reading from it.
-If this is important for you, you should consider methods where you don't
-have to delete rows, or run @code{OPTIMIZE TABLE} after you have deleted a
-lot of rows.
-@item
-Use @code{ALTER TABLE ... ORDER BY expr1,expr2,...} if you mostly
-retrieve rows in expr1,expr2,... order. By using this option after big
-changes to the table, you may be able to get higher performance.
-@item
-In some cases it may make sense to introduce a column that is 'hashed'
-based on information from other columns. If this column is short and
-reasonably unique it may be much faster than a big index on many
-columns. In @strong{MySQL} it's very easy to use this extra column:
-@code{SELECT * FROM table_name WHERE hash=MD5(CONCAT(col1,col2))
-AND col1='constant' AND col2='constant'}
-@item
-For tables that change a lot you should try to avoid all @code{VARCHAR}
-or @code{BLOB} columns. You will get dynamic row length as soon as you
-are using a single @code{VARCHAR} or @code{BLOB} column. @xref{Table
-types}.
-@item
-It's not normally useful to split a table into different tables just
-because the rows get 'big'. To access a row, the biggest performance
-hit is the disk seek to find the first byte of the row. After finding
-the data, most new disks can read the whole row fast enough for most
-applications. The only cases where it really matters to split up a table
-are if it's a dynamic-row-size table (see above) that you can change to a
-fixed row size, or if you very often need to scan the table and don't need
-most of the columns. @xref{Table types}.
-@item
-If you very often need to calculate things based on information from a
-lot of rows (like counts of things), it's probably much better to
-introduce a new table and update the counter in real time. An update of
-type @code{UPDATE table SET count=count+1 WHERE index_column=constant}
-is very fast!
-
-This is really important when you use databases like @strong{MySQL} that
-only have table locking (multiple readers / single writer). This will
-also give better performance with most databases, as the row locking
-manager in this case will have less to do.
-@item
-If you need to collect statistics from big log tables, use summary tables
-instead of scanning the whole table. Maintaining the summaries should be
-much faster than trying to do statistics 'live'. It's also much faster
-to regenerate summary tables from the logs when things change
-(depending on business decisions) than to have to change the running
-application!
-@item
-If possible, you should classify reports as 'live' or 'statistical',
-where the data needed for statistical reports is generated only from
-summary tables that are derived from the actual data.
-@item
-Take advantage of the fact that columns have default values. Insert
-values explicitly only when the value to be inserted differs from the
-default. This reduces the parsing that @strong{MySQL} needs to do and
-improves the insert speed.
-@item
-In some cases it's convenient to pack and store data into a blob. In this
-case you have to add some extra code in your application to pack/unpack
-things in the blob, but this may save a lot of accesses at some stage.
-This is practical when you have data that doesn't conform to a static
-table structure.
-@item
-Normally you should try to keep all data non-redundant (what
-is called 3rd normal form in database theory), but you should not be
-afraid of duplicating things or creating summary tables if you need
-them to gain more speed.
-@item
-Stored procedures or UDFs (user-defined functions) may be a good way to
-get more performance. However, you should always have some other
-(slower) way to do the same thing if you use a database that doesn't
-support them.
-@item
-You can always gain something by caching queries/answers in your
-application and trying to do many inserts/updates at the same time. If
-your database supports table locks (like @strong{MySQL} and Oracle),
-this should help to ensure that the index cache is flushed only once,
-after all the updates.
-@item
-Use @code{INSERT /*! DELAYED */} when you do not need to know when your
-data is written. This speeds things up because many records can be written
-with a single disk write.
-@item
-Use @code{INSERT /*! LOW_PRIORITY */} when you want your selects to be
-more important.
-@item
-Use @code{SELECT /*! HIGH_PRIORITY */} to get selects that jump the
-queue. That is, the select is done even if there is somebody waiting to
-do a write.
-@item
-Use the multi-line @code{INSERT} statement to store many rows with one
-SQL command (many SQL servers support this). See the second sketch
-after this list.
-@item
-Use @code{LOAD DATA INFILE} to load larger amounts of data. This is
-faster than normal inserts and will become even faster when
-@code{myisamchk} is integrated into @code{mysqld}.
-@item
-Use @code{AUTO_INCREMENT} columns to generate unique values.
-@item
-Use @code{OPTIMIZE TABLE} once in a while to avoid fragmentation when
-using dynamic table format. @xref{OPTIMIZE TABLE, , @code{OPTIMIZE TABLE}}.
-
-@item
-Use @code{HEAP} tables to get more speed when possible. @xref{Table
-types}.
-@item
-When using a normal Web server setup, images should be stored as
-files. That is, store only a file reference in the database. The main
-reason for this is that a normal Web server is much better at caching
-files than database contents, so it's much easier to get a fast
-system if you are using files.
-@item
-Use in-memory tables for non-critical data that is accessed often (like
-information about the last displayed banner for users who don't have
-cookies).
-@item
-Columns with identical information in different tables should be
-declared identical and have identical names. Before Version 3.23 you
-got slow joins otherwise.
-
-Try to keep the names simple (use @code{name} instead of
-@code{customer_name} in the customer table). To make your names portable
-to other SQL servers you should keep them shorter than 18 characters.
-@item
-If you need REALLY high speed, you should take a look at the low-level
-interfaces for data storage that the different SQL servers support! For
-example, by accessing the @strong{MySQL} @code{MyISAM} directly, you could
-get a speed increase of 2-5 times compared to using the SQL interface.
-To be able to do this the data must be on the same server as
-the application, and usually it should only be accessed by one process
-(because external file locking is really slow). One could eliminate the
-above problems by introducing low-level @code{MyISAM} commands in the
-@strong{MySQL} server (this could be one easy way to get more
-performance if needed). By carefully designing the database interface,
-it should be quite easy to support these types of optimization.
-@item
-In many cases it's faster to access data from a database (using a live
-connection) than accessing a text file, just because the database is
-likely to be more compact than the text file (if you are using numerical
-data), and this will involve fewer disk accesses. You will also save
-code because you don't have to parse your text files to find line and
-column boundaries.
-@item
-You can also use replication to speed things up. @xref{Replication}.
-@item
-Declaring a table with @code{DELAY_KEY_WRITE=1} will make the updating
-of indexes faster, as these are not flushed to disk until the file is
-closed. The downside is that you should run @code{myisamchk} on these
-tables before you start @code{mysqld}, to ensure that they are okay if
-something killed @code{mysqld} in the middle of a write. As the key
-information can always be regenerated from the data, you should not lose
-anything by using @code{DELAY_KEY_WRITE}. See the third sketch after
-this list.
-@end itemize
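-
-The first sketch referred to above shows the 'hashed column' tip in
-full. It is only a minimal illustration; the @code{customer} table and
-its columns are hypothetical:
-
-@example
-ALTER TABLE customer
-  ADD hash_col CHAR(32) NOT NULL,
-  ADD INDEX (hash_col);
-UPDATE customer SET hash_col=MD5(CONCAT(col1,col2));
-
-# The short indexed hash finds candidate rows quickly; the exact
-# comparisons guard against hash collisions.
-SELECT * FROM customer
-WHERE hash_col=MD5(CONCAT('constant1','constant2'))
-AND col1='constant1' AND col2='constant2';
-@end example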
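-
-The second sketch shows the batch-loading tips; the @code{log} table
-and the file name are again hypothetical:
-
-@example
-# Many rows in one statement: parsed once, indexes flushed once.
-INSERT INTO log (id,msg) VALUES (1,'a'),(2,'b'),(3,'c');
-
-# Even faster for large amounts of data:
-LOAD DATA INFILE '/tmp/log.txt' INTO TABLE log;
-@end example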
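-
-The third sketch shows a table declared with delayed key writes; the
-table name and its columns are hypothetical:
-
-@example
-CREATE TABLE lookup (
-  id INT NOT NULL PRIMARY KEY,
-  name CHAR(30) NOT NULL,
-  INDEX (name)
-) TYPE=MyISAM DELAY_KEY_WRITE=1;
-@end example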
-
-@cindex benchmarks
-@cindex performance, benchmarks
-@node Benchmarks, Design, Tips, Performance
-@section Using Your Own Benchmarks
-
-You should definitely benchmark your application and database to find
-out where the bottlenecks are. After fixing one bottleneck (or by
-replacing it with a 'dummy module'), you can then easily identify the
-next one (and so on). Even if the overall performance for your
-application is sufficient, you should at least make a plan for each
-bottleneck, and decide how to solve it if someday you really need the
-extra performance.
-
-For an example of portable benchmark programs, look at the @strong{MySQL}
-benchmark suite. @xref{MySQL Benchmarks, , @strong{MySQL} Benchmarks}. You
-can take any program from this suite and modify it for your needs. By doing this,
-you can try different solutions to your problem and test which is really the
-fastest solution for you.
-
-It is very common for problems to occur only when the system is very
-heavily loaded. We have had many customers who contact us when they
-have a (tested) system in production and have encountered load
-problems. In every one of these cases so far, the problem has been basic
-design (table scans are NOT good at high load) or OS/library issues.
-Most of this would be a @strong{LOT} easier to fix if the systems were
-not already in production.
-
-To avoid problems like this, you should put some effort into benchmarking
-your whole application under the worst possible load! You can use Sasha's
-recent hack for this -
-@uref{http://www.mysql.com/Downloads/super-smack/super-smack-1.0.tar.gz,
-super-smack}.
-As the name suggests, it can bring your system to its knees if you ask
-it to, so make sure to use it only on your development systems.
-
-@cindex design, choices
-@cindex database design
-@cindex storage of data
-@node Design, Design Limitations, Benchmarks, Performance
-@section Design Choices
-
-@strong{MySQL} keeps row data and index data in separate files. Many (almost
-all) other databases mix row and index data in the same file. We believe that
-the @strong{MySQL} choice is better for a very wide range of modern systems.
-
-Another way to store the row data is to keep the information for each
-column in a separate area (examples are SDBM and Focus). This will cause a
-performance hit for every query that accesses more than one column. Because
-this degenerates so quickly when more than one column is accessed,
-we believe that this model is not good for general purpose databases.
-
-The more common case is that the index and data are stored together
-(as in Oracle/Sybase et al). In this case you will find the row
-information in the leaf page of the index. The good thing about this
-layout is that, in many cases (depending on how well the index is
-cached), it saves a disk read. The bad things about this layout are:
-
-@itemize @bullet
-@item
-Table scanning is much slower because you have to read through the indexes
-to get at the data.
-@item
-You can't use only the index table to retrieve data for a query.
-@item
-You lose a lot of space, as you must duplicate indexes from the nodes
-(as you can't store the row in the nodes).
-@item
-Deletes will degenerate the table over time (as indexes in nodes are
-usually not updated on delete).
-@item
-It's harder to cache ONLY the index data.
-@end itemize
-
-@cindex design, limitations
-@node Design Limitations, Portability, Design, Performance
-@section MySQL Design Limitations/Tradeoffs
-
-Because @strong{MySQL} uses extremely fast table locking (multiple readers /
-single writers) the biggest remaining problem is a mix of a steady stream of
-inserts and slow selects on the same table.
-
-We believe that for a huge number of systems the extremely fast
-performance in other cases makes this choice a win. This case is usually
-also possible to solve by having multiple copies of the table, but that
-takes more effort and hardware.
-
-We are also working on some extensions to solve this problem for some
-common application niches.
-
-@cindex portability
-@cindex crash-me program
-@cindex programs, crash-me
-@node Portability, Internal use, Design Limitations, Performance
-@section Portability
-
-Because all SQL servers implement different parts of SQL, it takes work
-to write portable SQL applications. For very simple selects/inserts it
-is very easy, but the more features you need, the harder it gets. If you
-want an application that is fast with many databases, it becomes even
-harder!
-
-To make a complex application portable you need to choose a number of
-SQL servers that it should work with.
-
-You can use the @strong{MySQL} crash-me program/web-page
-@uref{http://www.mysql.com/information/crash-me.php} to find functions,
-types, and limits you can use with a selection of database
-servers. Crash-me is far from testing everything possible, but it
-is still comprehensive, with about 450 things tested.
-
-For example, you shouldn't have column names longer than 18 characters
-if you want to be able to use Informix or DB2.
-
-Both the @strong{MySQL} benchmark and crash-me programs are very
-database-independent. By taking a look at how we have handled this, you
-can get a feeling for what you have to do to make your own application
-database-independent. The benchmarks themselves can be found in the
-@file{sql-bench} directory in the @strong{MySQL} source
-distribution. They are written in Perl, with the DBI database interface
-(which solves the access part of the problem).
-
-See @uref{http://www.mysql.com/information/benchmarks.html} for the results
-from this benchmark.
-
-As you can see in these results, all databases have some weak points. That
-is, they have different design compromises that lead to different
-behavior.
-
-If you strive for database independence, you need to get a good feeling
-for each SQL server's bottlenecks. @strong{MySQL} is VERY fast in
-retrieving and updating things, but will have a problem mixing slow
-readers and writers on the same table. Oracle, on the other hand, has a
-big problem when you try to access rows that you have recently updated
-(until they are flushed to disk). Transaction databases in general are
-not very good at generating summary tables from log tables, as in this
-case row locking is almost useless.
-
-To get your application @emph{really} database-independent, you need to
-define an easily extendable interface through which you manipulate your
-data. As C++ is available on most systems, it makes sense to use a C++
-class interface to the databases.
-
-If you use some feature specific to one database (like the
-@code{REPLACE} command in @strong{MySQL}), you should code a method for
-the other SQL servers to implement the same feature (but slower). With
-@strong{MySQL} you can use the @code{/*! */} syntax to add
-@strong{MySQL}-specific keywords to a query. The code inside
-@code{/*! */} will be treated as a comment (ignored) by most other SQL
-servers, as the sketch below shows.
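-
-A minimal sketch of both techniques (the table and column names are
-hypothetical):
-
-@example
-# MySQL executes the DELAYED hint; other servers see only a comment:
-INSERT /*! DELAYED */ INTO log_table VALUES (42,'message');
-
-# REPLACE is MySQL-specific; a portable (but slower) fallback is:
-DELETE FROM config WHERE id=1;
-INSERT INTO config (id,val) VALUES (1,'new value');
-@end example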
-
-If REALLY high performance is more important than exactness, as in some
-Web applications, a possibility is to create an application layer that
-caches all results to give you even higher performance. By letting
-old results 'expire' after a while, you can keep the cache reasonably
-fresh. This works quite well even under extremely high load, in which
-case you can dynamically increase the cache and set the expiry timeout
-higher until things get back to normal.
-
-In this case the table creation information should contain information
-about the initial size of the cache and how often the table should
-normally be refreshed.
-
-@cindex uses, of MySQL
-@cindex customers, of MySQL
-@node Internal use, , Portability, Performance
-@section What Have We Used MySQL For?
-
-During @strong{MySQL}'s initial development, its features were made to
-fit our largest customer, who handles data warehousing for a couple of
-the biggest retailers in Sweden.
-
-From all stores, we get weekly summaries of all bonus card transactions,
-and we are expected to provide useful information for the store owners
-to help them find out how their advertising campaigns are affecting
-their customers.
-
-The data is quite huge (about 7 million summary transactions per month),
-and we have data for 4-10 years that we need to present to the users.
-We get weekly requests from the customers, who want 'instant'
-access to new reports from this data.
-
-We solved this by storing all information per month in compressed
-'transaction' tables. We have a set of simple macros (scripts) that
-generate summary tables grouped by different criteria (product group,
-customer id, store, ...) from the transaction tables. The reports are
-Web pages that are dynamically generated by a small Perl script that
-parses a Web page, executes the SQL statements in it, and inserts the
-results. We would have used PHP or mod_perl instead, but they were
-not available at that time.
-
-For graphical data we wrote a simple tool in @code{C} that can produce
-GIFs based on the result of a SQL query (with some processing of the
-result). This is also dynamically executed from the Perl script that
-parses the @code{HTML} files.
-
-In most cases a new report can be created simply by copying an existing
-script and modifying the SQL query in it. In some cases, we need to
-add more fields to an existing summary table or generate a new one, but
-this is also quite simple, as we keep all transaction tables on disk.
-(Currently we have at least 50G of transaction tables and 200G of other
-customer data.)
-
-We also let our customers access the summary tables directly with ODBC
-so that advanced users can experiment with the data themselves.
-
-We haven't had any problems handling this with quite modest Sun Ultra
-SPARCstation hardware (2 x 200 MHz). We recently upgraded one of our
-servers to a 2-CPU 400 MHz UltraSPARC, and we are now planning to start
-handling transactions at the product level, which would mean a ten-fold
-increase in data. We think we can keep up with this by just adding more
-disks to our systems.
-
-We are also experimenting with Intel-Linux to be able to get more CPU
-power more cheaply. Now that we have the binary portable database format
-(new in Version 3.23), we will start to use this for some parts of the
-application.
-
-Our initial feeling is that Linux will perform much better on
-low-to-medium load and Solaris will perform better when you start to get
-a high load because of extreme disk I/O, but we don't yet have anything
-conclusive about this. After some discussion with a Linux kernel
-developer, this might be a side effect of Linux giving so many resources
-to the batch job that the interactive performance gets very low. This
-makes the machine feel very slow and unresponsive while big batches are
-running. Hopefully this will be handled better in future Linux kernels.
-
-@cindex benchmark suite
-@cindex crash-me program
-@node MySQL Benchmarks, Maintenance, Performance, Top
-@chapter The MySQL Benchmark Suite
-
-This chapter should contain a technical description of the
-@strong{MySQL} benchmark suite (and @code{crash-me}), but that
-description has not been written yet. Currently, you can get a good idea
-of the benchmarks by looking at the code and results in the
-@file{sql-bench} directory of any @strong{MySQL} source distribution.
-
-This benchmark suite is meant to tell any user what operations a given
-SQL implementation performs well or poorly.
-
-Note that this benchmark is single threaded, so it measures the minimum
-time for the operations. We plan to add a lot of multi-threaded tests
-to the benchmark suite in the future.
-
-For example (run on the same NT 4.0 machine):
-
-@multitable @columnfractions .6 .2 .2
-@item @strong{Reading 2000000 rows by index} @tab @strong{Seconds} @tab @strong{Seconds}
-@item mysql @tab 367 @tab 249
-@item mysql_odbc @tab 464
-@item db2_odbc @tab 1206
-@item informix_odbc @tab 121126
-@item ms-sql_odbc @tab 1634
-@item oracle_odbc @tab 20800
-@item solid_odbc @tab 877
-@item sybase_odbc @tab 17614
-@end multitable
-
-@multitable @columnfractions .6 .2 .2
-@item @strong{Inserting (350768) rows} @tab @strong{Seconds} @tab @strong{Seconds}
-@item mysql @tab 381 @tab 206
-@item mysql_odbc @tab 619
-@item db2_odbc @tab 3460
-@item informix_odbc @tab 2692
-@item ms-sql_odbc @tab 4012
-@item oracle_odbc @tab 11291
-@item solid_odbc @tab 1801
-@item sybase_odbc @tab 4802
-@end multitable
-
-In the above test, @strong{MySQL} was run with an 8M index cache.
-
-We have gathered some more benchmark results at
-@uref{http://www.mysql.com/information/benchmarks.html}.
-
-Note that Oracle is not included, because they asked to be removed. All
-Oracle benchmarks have to be approved by Oracle! We believe that makes
-Oracle benchmarks @strong{VERY} biased, because the above benchmarks are
-supposed to show what a standard installation can do for a single
-client.
-
-To run the benchmark suite, you have to download a @strong{MySQL} source
-distribution, install the Perl DBI module and the Perl DBD driver for
-the database you want to test, and then do:
-
-@example
-cd sql-bench
-perl run-all-tests --server=#
-@end example
-
-where @code{#} is one of the supported servers. You can get a list of
-all options and supported servers by running @code{run-all-tests --help}.
-
-@cindex crash-me
-@code{crash-me} tries to determine what features a database supports and
-what its capabilities and limitations are by actually running
-queries. For example, it determines:
-
-@itemize @bullet
-@item
-What column types are supported
-@item
-How many indexes are supported
-@item
-What functions are supported
-@item
-How big a query can be
-@item
-How big a @code{VARCHAR} column can be
-@end itemize
-
-You can find the results from crash-me for a lot of different databases
-at @uref{http://www.mysql.com/information/crash-me.php}.
-
-
-@node Maintenance, Adding functions, MySQL Benchmarks, Top
+@node Maintenance, Adding functions, Fulltext Search, Top
@chapter Maintaining a MySQL Installation
@cindex installation maintenance
@@ -36648,8 +36795,6 @@ This chapter covers what you should know about maintaining a @strong{MySQL}
distribution. You will learn how to care for your tables on a regular
basis, and what to do when disaster strikes.
-
-
@node Using mysqlcheck, , Maintenance, Maintenance
@section Using @code{mysqlcheck} for Table Maintenance and Crash Recovery