MySQL Client/Server Protocol Documentation


Introduction
------------


This paper has the objective of presenting a thorough description
of the client/server protocol that is embodied in MySQL. Particularly,
this paper aims to document and describe:

- manner in which MySQL server detects client connection requests and
  creates connection
- manner in which MySQL client C API call connects to server
- the entire protocol of sending/receiving data by MySQL server and C API
  code
- manner in which queries are sent by client C API calls to server
- manner in which query results are sent by server
- manner in which query results are resolved by server
- sending and receiving of error messages


This paper does not have the goal of describing or documenting other
related MySQL issues, like usage of thread libraries, MySQL standard
library set, MySQL strings library and other MySQL specific libraries,
type definitions and utilities.

Issues that are covered by this paper are contained in the following
source code files:

- libmysql/net.c and sql/net_serv.cc, the two being identical
- client/libmysql.c (not entire file is covered)
- include/mysql_com.h
- include/mysql.h
- sql/mysqld.cc (not entire file is covered)
- sql/net_pkg.cc
- sql/sql_base.cc (not entire file is covered)
- sql/sql_select.cc (not entire file is covered)
- sql/sql_parse.cc (not entire file is covered)

Note: libmysql/net.c was client/net.c prior to MySQL 3.23.11.
sql/net_serv.cc was sql/net_serv.c prior to MySQL 3.23.16.

Besides this introduction, this paper presents basic definitions,
constants, structures and global variables, and all related functions
in server and in C API. A textual description of the functioning of
the entire protocol is given in the last chapter of this paper.


Constants, structures and global variables
------------------------------------------

This chapter will describe all constants, structures and
global variables relevant to client/server protocol.

Constants

They are important as they contain default values, the ones
that are valid if options are not set in any other way. Besides that,
MySQL source code does not contain a single non-defined constant in
its code. This description of constants does not include
configuration and conditional compilation #defines.

NAME_LEN          - field and table name length, current value 64
HOSTNAME_LENGTH   - length of the hostname, current value 64
USERNAME_LENGTH   - username length, current value 16
MYSQL_PORT        - default TCP/IP port number, current value 3306
MYSQL_UNIX_ADDR   - full path of the default Unix socket file, current value
                    "/tmp/mysql.sock"
MYSQL_NAMEDPIPE   - full path of the default NT pipe file, current value
                    "MySQL"
MYSQL_SERVICENAME - name of the MySQL Service on NT, current value "MySQL"
NET_HEADER_SIZE   - size of the network header, when no
                    compression is used, current value 4
COMP_HEADER_SIZE  - additional size of network header when
                    compression is used, current value 3

What follows is a set of constants, defined in source only, which
define capabilities of the client built with that version of C
API. Simply, when some new feature is added in client, that client
feature is defined, so that server can detect what capabilities a
client program has.

CLIENT_LONG_PASSWORD    - client supports new more secure passwords
CLIENT_LONG_FLAG        - client uses longer flags
CLIENT_CONNECT_WITH_DB  - client can specify db on connect
CLIENT_COMPRESS         - client can use compression protocol
CLIENT_ODBC             - ODBC client
CLIENT_LOCAL_FILES      - client can use LOAD DATA INFILE LOCAL
CLIENT_IGNORE_SPACE     - client can ignore spaces before '('
CLIENT_CHANGE_USER      - client supports the mysql_change_user()

What follows are other constants, pertaining to timeouts and sizes:

MYSQL_ERRMSG_SIZE       - maximum size of error message string, current value 200
NET_READ_TIMEOUT        - read timeout, current value 30 seconds
NET_WRITE_TIMEOUT       - write timeout, current value 60 seconds
NET_WAIT_TIMEOUT        - wait for new query timeout, current value 8*60*60
                          seconds, that is, 8 hours
packet_error            - value returned in case of socket errors, current
                          value -1
TEST_BLOCKING           - used in debug mode for setting up blocking testing
RETRY_COUNT             - number of times network read and write will be
                          retried, current value 1

There are also error messages for last_errno, which depict system
errors, and are used on the server only.

ER_NET_PACKET_TOO_LARGE   - packet is larger than max_allowed_packet
ER_OUT_OF_RESOURCES       - practically no more memory
ER_NET_ERROR_ON_WRITE     - error in writing to NT Named Pipe
ER_NET_WRITE_INTERRUPTED  - some signal or interrupt happened
                            during write
ER_NET_READ_ERROR_FROM_PIPE - error in reading from NT Named Pipe
ER_NET_FCNTL_ERROR        - error in trying to set fcntl on socket
                            descriptor
ER_NET_PACKETS_OUT_OF_ORDER - packet numbers on client and
                              server side differ
ER_NET_UNCOMPRESS_ERROR   - error in uncompress of compressed packet


Structs and enums


struct NET

This is MySQL's network handle structure, used in all client/server
read/write functions. On the server, it is initialized and preserved
in each thread. On the client, it is a part of the MYSQL struct,
which is the MySQL handle used in all C API functions. This structure
uniquely identifies a connection, either on the server or client
side.  It consists of the following fields:

    Vio* vio - pointer to the VIO (Virtual I/O) handle of the connection
    HANDLE hPipe - Handle for NT Named Pipe file
    my_socket fd - file descriptor used for both TCP/IP socket and
                   Unix socket file
    int fcntl - contains info on fcntl options used on fd. Mostly
                used for saving info if blocking is used or not
    unsigned char *buff - network buffer used for storing data for
                          reading from/writing to socket
    unsigned char *buff_end - points to the end of buff
    unsigned char *write_pos - present writing position in buff
    unsigned char *read_pos - present reading position in buff. This
                              pointer is used for reading data after
                              calling my_net_read function and functions
                              that are just its wrappers
    char last_error[MYSQL_ERRMSG_SIZE] - holds last error message
    unsigned int last_errno - holds last error code of the network
                              protocol. Its possible values are listed
                              in above constants. It is used only on
                              the server side
    unsigned int max_packet - holds current value of buff size
    unsigned int timeout - stores read timeout value for that connection
    unsigned int pkt_nr - stores the value of the current packet number in
                          a batch of packets. Used primarily for
                          detection of protocol errors resulting in a
                          mismatch
    my_bool error - holds either 1 or 0 depending on the error condition
    my_bool return_errno - if its value != 0 then there is an error in
                           protocol mismatch between client and server
    my_bool compress - if true compression is used in the protocol
    unsigned long remain_in_buf - used only in reading compressed packets.
                                  Explained in my_net_read
    unsigned long length - used only for storing the length of the read
                           packet. Explained in my_net_read
    unsigned long buf_length - used only in reading compressed packets.
                               Explained in my_net_read
    unsigned long where_b - used only in reading compressed packets.
                            Explained in my_net_read
    short int more - used for reporting in mysql_list_processes
    char save_char - used in reading compressed packets for saving chars
                     in order to make zero-delimited strings. Explained
                     in my_net_read

A few typedefs will be defined for easier understanding of the text that
follows.

typedef char **MYSQL_ROW - data containing one row of values

typedef unsigned int MYSQL_FIELD_OFFSET - offset in bytes of the current field

typedef MYSQL_ROWS *MYSQL_ROW_OFFSET - offset in bytes of the current row

struct MYSQL_FIELD - contains all info on the attributes of a
specific column in a result set, plus info on lengths of the column in
a result set. This struct is tagged as st_mysql_field. This structure
consists of the following fields:

    char *name - name of column
    char *table - table of column if column was a field and not
                  an expression or constant
    char *def - default value (set by mysql_list_fields)
    enum enum_field_types type - see above
    unsigned int length - width of column in the current row
    unsigned int max_length - maximum width of that column in entire
                              result set
    unsigned int flags - corresponding to Extra in DESCRIBE
    unsigned int decimals - number of decimals in field


struct MYSQL_ROWS - a node for each row in the single linked
list forming entire result set. This struct is tagged as
st_mysql_rows, and has two fields:

    struct st_mysql_rows *next - pointer to the next one
    MYSQL_ROW data - see above


struct MYSQL_DATA - contains all rows from result set. It is
tagged as st_mysql_data and has following fields:

    my_ulonglong rows - how many rows
    unsigned int fields - how many columns
    MYSQL_ROWS *data - see above. This is the first node of the linked list
    MEM_ROOT alloc - MEM_ROOT is MySQL memory allocation structure, and
                     this field is used to store all fields and rows.
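
The relation between MYSQL_ROW, MYSQL_ROWS and MYSQL_DATA can be
illustrated with a minimal sketch. The demo_ types below are
hypothetical, simplified stand-ins for the structs described above
(the MEM_ROOT field is left out), and demo_count_rows and demo_cell
are illustrative helpers, not functions of the C API.

```c
#include <assert.h>
#include <stddef.h>

typedef char **demo_row;              /* stands in for MYSQL_ROW */

struct demo_rows {                    /* stands in for MYSQL_ROWS */
    struct demo_rows *next;           /* pointer to the next node */
    demo_row data;                    /* values of one row */
};

struct demo_data {                    /* stands in for MYSQL_DATA */
    unsigned long long rows;          /* how many rows */
    unsigned int fields;              /* how many columns */
    struct demo_rows *data;           /* first node of the linked list */
};

/* Count the nodes by walking the single linked list. */
static unsigned long long demo_count_rows(const struct demo_data *d)
{
    unsigned long long n = 0;
    const struct demo_rows *cur;

    assert(d != NULL);
    for (cur = d->data; cur != NULL; cur = cur->next)
        n++;
    return n;
}

/* Return the value at (row_index, col), or NULL when out of range. */
static const char *demo_cell(const struct demo_data *d,
                             unsigned long long row_index,
                             unsigned int col)
{
    const struct demo_rows *cur;

    assert(d != NULL);
    if (col >= d->fields)
        return NULL;
    for (cur = d->data; cur != NULL && row_index > 0; cur = cur->next)
        row_index--;
    return cur != NULL ? cur->data[col] : NULL;
}
```

Because the list is singly linked, reaching row N costs N steps from
the head, which is why a cursor such as data_cursor in MYSQL_RES is
useful for sequential fetching.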


struct st_mysql_options - holds various client options, and
contains following fields:

    unsigned int connect_timeout - time in seconds for connection
    unsigned int client_flag - used to hold client capabilities
    my_bool compress - boolean for compression
    my_bool named_pipe - is Named Pipe used? (on NT)
    unsigned int port - what TCP port is used
    char *host - host to connect to
    char *init_command - command to be executed upon connection
    char *user - account name on MySQL server
    char *password - password for the above
    char *unix_socket - full path for Unix socket file
    char *db - default database
    char *my_cnf_file - optional configuration file
    char *my_cnf_group - optional header for options


struct MYSQL - MySQL client's handle. Required for any
operation issued from client to server. Tagged as st_mysql and having
following fields:

    NET net - see above
    char *host - host on which MySQL server is running
    char *user - MySQL username
    char *passwd - password for above
    char *unix_socket- full path of Unix socket file
    char *server_version - version of the server
    char *host_info - contains info on how has connection been
                      established, TCP port, socket or Named Pipe
    char *info - used to store information on the query results,
                 like number of rows affected etc.
    char *db - current database
    unsigned int port - TCP port in use
    unsigned int client_flag - client capabilities
    unsigned int server_capabilities - server capabilities
    unsigned int protocol_version - version of the protocol
    unsigned int field_count - used for storing number of fields
                               immediately upon execution of a query,
                               but before fetching rows
    unsigned long thread_id - server thread to which this connection
                              is attached
    my_ulonglong affected_rows - used for storing number of rows
                                 immediately upon execution of a query,
                                 but before fetching rows
    my_ulonglong insert_id - fetching LAST_INSERT_ID() through client C API
    my_ulonglong extra_info - used by mysqlshow
    unsigned long packet_length - saving size of the first packet upon
                                  execution of a query
    enum mysql_status status - see above
    MYSQL_FIELD *fields - see above
    MEM_ROOT field_alloc - memory used for storing previous field (fields)
    my_bool free_me - boolean that flags if MYSQL was allocated in mysql_init
    my_bool reconnect - used to automatically reconnect
    struct st_mysql_options options - see above
    char scramble_buff[9] - key for scrambling password before sending it
                           to server


struct MYSQL_RES - tagged as st_mysql_res and used to store
entire result set from a single query. Contains following fields:

    my_ulonglong row_count - number of rows
    unsigned int field_count - number of columns
    unsigned int current_field - cursor for fetching fields
    MYSQL_FIELD *fields - see above
    MYSQL_DATA *data - see above, and used in buffered reads, that is,
                       mysql_store_result only
    MYSQL_ROWS *data_cursor - pointing to the field of above "data"
    MEM_ROOT field_alloc - memory allocation for above "fields"
    MYSQL_ROW row - used for storing row by row in unbuffered reads,
                    that is, in mysql_use_result
    MYSQL_ROW current_row - cursor to the current row for buffered reads
    unsigned long *lengths - column lengths of current row
    MYSQL *handle - see above, used in unbuffered reads, that is, in
                    mysql_use_result
    my_bool eof - used by mysql_fetch_row as a marker for end of data


Global variables


unsigned long max_allowed_packet - maximum allowable value of network
                                   buffer. Default value - 1MB

unsigned long net_buffer_length - default, starting value of network
                                  buffer - 8KB

unsigned long bytes_sent - total number of bytes written since startup
                           of the server

unsigned long bytes_received - total number of bytes read since startup
                               of the server


Synopsis of the basic client/server protocol
--------------------------------------------

Purpose of this chapter is to provide a complete picture of
the basic client/server protocol implemented in MySQL. It was felt
it is necessary after writing descriptions for all of the functions
involved in basic protocol. There are at present 11 functions
involved, with several structures, many constants etc, which are all
described in detail. But as a forest could not be seen from the trees,
so the concept of the protocol could not be deciphered easily from a
thorough documentation of minutiae.

Although the concept of the protocol was not changed with the
introduction of the VIO system, embodied in the violite.cc source
file, its introduction has changed the code substantially. Before
VIO was introduced, functions for reading from/writing to network
connection had to deal with various network standards. So, these functions
depended on whether TCP port or Unix socket file or NT Named Pipe file is
used. This is all changed now: single vio_ functions are called, and
all this diversity is covered by vio_ functions.

In MySQL a specific buffered network input/output transport model
has been implemented. Although each operating system may have its
own buffering for network connections, MySQL has added its own
buffering model. This is the same for each of the three transport protocol
types that are used in MySQL client/server communications, which
are TCP/IP sockets (on all systems), Unix socket files on Unix and
Unix-like operating systems and Named Pipe files on NT. Although
TCP/IP sockets are omnipresent, the latter two types have been added
for local connections. Those two connection types can be used in
local mode only, that is, when both client and server reside on the
same host, and are introduced because they enable better speeds for
local connections. This is especially useful for WWW type of
applications. Startup options of MySQL server allow that either
TCP/IP sockets or local connection (OS dependent) can be disallowed.

In order to implement buffered input/output, MySQL allocates a
buffer. The starting size of this buffer is determined by the value
of the global variable net_buffer_length, which can be changed at
MySQL server startup. This is, as explained, only the startup length
of MySQL network buffer. Because a single item that has to be read
or written can be larger than that value, MySQL will increase buffer
size until that size reaches the value of the global variable
max_allowed_packet, which is also settable at server startup. Maximum
value of this variable is limited by the way MySQL stores/reads
sizes of packets to be sent/read, which means by the way MySQL
formats packets.

Basically each packet consists of two parts, a header and data. In
the case when compression is not used, header consists of 4 bytes
of which 3 contain the length of the packet to be sent and one holds
the packet number. When compression is used there are another 3
bytes which store the size of uncompressed data. Because of the way
MySQL packs length into 3 bytes, plus due to the usage of some
special values in the most significant byte, maximum size of
max_allowed_packet is limited to 16MB at present. So, if compression
is not used, at first 4 bytes are written to the buffer and then
data itself. As MySQL buffers I/O, logical packets are packed together
until packets fill up entire size of the buffer. That size is no less
than net_buffer_length, but no greater than max_allowed_packet. So,
actual writing to the network is done when this buffer is filled
up. As a sequence of buffers frequently makes a logical unit, like a
result set, then at the end of sending data, even if buffer is not
full, data is written (flushed to the connection) with a call of
the net_flush function. So that no single packet can be larger than
this value, checks are made throughout the code to make sure that
no single field or command could exceed that value.
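
The 4-byte header described above can be sketched as follows. This is
only an illustration, not the MySQL source: the demo_ names are
hypothetical, and the byte order of the 3-byte length (least
significant byte first) is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HEADER_SIZE 4           /* same value as NET_HEADER_SIZE */
#define DEMO_MAX_LENGTH  0xFFFFFFUL  /* largest value that fits in 3 bytes */

/* Store the low 24 bits of len, least significant byte first
   (assumed byte order). */
static void demo_store_int3(unsigned char *dst, unsigned long len)
{
    dst[0] = (unsigned char)(len & 0xFF);
    dst[1] = (unsigned char)((len >> 8) & 0xFF);
    dst[2] = (unsigned char)((len >> 16) & 0xFF);
}

/* Read back a 3-byte length written by demo_store_int3. */
static unsigned long demo_read_int3(const unsigned char *src)
{
    return (unsigned long)src[0] |
           ((unsigned long)src[1] << 8) |
           ((unsigned long)src[2] << 16);
}

/* Write header followed by data into out, which must hold
   DEMO_HEADER_SIZE + len bytes. Returns the number of bytes written. */
static size_t demo_build_packet(unsigned char *out, const void *data,
                                unsigned long len, unsigned char pkt_nr)
{
    assert(len <= DEMO_MAX_LENGTH);
    demo_store_int3(out, len);       /* bytes 0..2: length of data */
    out[3] = pkt_nr;                 /* byte 3: packet number */
    memcpy(out + DEMO_HEADER_SIZE, data, len);
    return DEMO_HEADER_SIZE + (size_t)len;
}
```

A 3-byte field cannot describe more than 0xFFFFFF bytes, which is
where the upper bound on a single packet comes from.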

In order to maintain coherency in consecutive packets, each packet
is numbered and their number stored as a part of a header, as
explained above. Packets start with 0, so whenever a logical packet
is written, that number is incremented. On the other side when
packets are read, value that is fetched is compared with the value
stored and if there is no mismatch that value is incremented, too.
Packet number is reset on the client side when unwanted packets
are removed from the connection and on the server side when a new
command has been started.
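
The numbering scheme can be illustrated with a small sketch. The
struct and function names are hypothetical; the wrap-around at 256 is
an assumption that follows from the number being stored in one byte
of the header.

```c
#include <assert.h>

/* Hypothetical per-connection state: only the counter is modelled. */
struct demo_conn {
    unsigned int pkt_nr;   /* next packet number to send or expect */
};

/* Writer side: return the number for the outgoing header, then
   advance the counter. The first packet gets number 0. */
static unsigned char demo_next_send_nr(struct demo_conn *c)
{
    assert(c);
    return (unsigned char)(c->pkt_nr++ & 0xFF);
}

/* Reader side: compare the number fetched from the header with the
   stored one. Returns 0 and advances on a match, 1 on a mismatch. */
static int demo_check_recv_nr(struct demo_conn *c,
                              unsigned char nr_in_header)
{
    assert(c);
    if (nr_in_header != (unsigned char)(c->pkt_nr & 0xFF))
        return 1;          /* packets out of order */
    c->pkt_nr++;
    return 0;
}

/* Reset, as done when a new command starts. */
static void demo_reset_nr(struct demo_conn *c)
{
    assert(c);
    c->pkt_nr = 0;
}
```

A mismatch leaves the counter untouched, so the caller can report the
condition (compare ER_NET_PACKETS_OUT_OF_ORDER) without losing the
expected value.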


So, before writing, the buffer contains a sequence of logical
packets, consisting of header plus data consecutively. If compression
is used, packet numbers are not stored in each header of the logical
packets, but a whole buffer, or a part of it if flushing is done,
containing one or more logical packets is compressed. In that case
a new larger header is formed, and all logical packets contained
in the buffer are compressed together. This way only one packet is
formed which makes several logical packets, which improves both
speed and compression ratio. On the other side, when this large
compressed packet is read, it is first uncompressed, and then logical
packets are sent, one by one, to the calling functions.
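
A sketch of the larger header used with compression is given below.
It only shows how the 7 bytes (NET_HEADER_SIZE + COMP_HEADER_SIZE)
could be laid out; the compression itself is left out. The demo_
names are hypothetical, and both the field order (compressed length,
packet number, uncompressed length) and the byte order are
assumptions of this sketch.

```c
#include <assert.h>

#define DEMO_NET_HEADER_SIZE  4   /* same value as NET_HEADER_SIZE */
#define DEMO_COMP_HEADER_SIZE 3   /* same value as COMP_HEADER_SIZE */
#define DEMO_FULL_HEADER_SIZE \
    (DEMO_NET_HEADER_SIZE + DEMO_COMP_HEADER_SIZE)

/* Decoded view of the 7-byte header (assumed field order). */
struct demo_comp_header {
    unsigned long compressed_len;    /* size of data on the wire */
    unsigned char pkt_nr;            /* packet number */
    unsigned long uncompressed_len;  /* size after uncompressing */
};

/* Store 24 bits, least significant byte first (assumed order). */
static void demo_put3(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
}

static unsigned long demo_get3(const unsigned char *p)
{
    return (unsigned long)p[0] | ((unsigned long)p[1] << 8) |
           ((unsigned long)p[2] << 16);
}

/* Encode h into out, which must hold DEMO_FULL_HEADER_SIZE bytes. */
static void demo_write_comp_header(unsigned char *out,
                                   const struct demo_comp_header *h)
{
    assert(h->compressed_len <= 0xFFFFFFUL);
    assert(h->uncompressed_len <= 0xFFFFFFUL);
    demo_put3(out, h->compressed_len);
    out[3] = h->pkt_nr;
    demo_put3(out + DEMO_NET_HEADER_SIZE, h->uncompressed_len);
}

/* Decode DEMO_FULL_HEADER_SIZE bytes from in. */
static struct demo_comp_header
demo_read_comp_header(const unsigned char *in)
{
    struct demo_comp_header h;

    h.compressed_len = demo_get3(in);
    h.pkt_nr = in[3];
    h.uncompressed_len = demo_get3(in + DEMO_NET_HEADER_SIZE);
    return h;
}
```

The reader needs the uncompressed length before uncompressing, so
that a buffer large enough for all contained logical packets can be
prepared.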


All this functionality is described in detail in the following
chapter. It contains not only functions that form logical packets, or
that read and write to connections, but also functions that are used
for initialization and clearing of connections. There are functions at
higher level dealing with sending fields, rows, establishing
connections, sending commands, but those are not explained in the
following chapter.


Functions utilized in client/server protocol
--------------------------------------------

First of all, functions are described that are involved in preparing,
reading, or writing data over TCP port, Unix socket file, or named
pipe, and functions directly related to those. All of these functions
are used both in server and client. Server and client specific code
segments are documented in each function description.

Each MySQL function checks for errors in memory allocation and
freeing, as well as in every OS call, like the one dealing with
files and sockets, and for errors in indigenous MySQL function
calls. This is expected, but has to be said here so as not to repeat
it in every function description.

Older versions of MySQL have utilized the following macros for
reading from or writing to a socket.

raw_net_read - calls the OS function recv, which reads N bytes
from a socket into a buffer. Number of bytes read is returned.

raw_net_write - calls OS function send to write N bytes from a
buffer to socket. Number of bytes written is returned.

These macros are replaced with VIO (Virtual I/O) functions.


Function name: my_net_init

Parameters: struct NET *, enum_net_type, struct Vio

Return value: 1 for error, 0 for success

Function purpose: To initialize properly all NET fields,
                  allocate memory and set socket options

Function description

First of all, buff field of NET struct is allocated to the size of
net_buffer_length, and on failure function exits with 1. All fields
in NET are set to their default or starting values. As net_buffer_length
and max_allowed_packet are configurable, max_allowed_packet is set
equal to net_buffer_length if the latter one is greater. max_packet
is set for that NET to net_buffer_length, and buff_end points to
buff end. vio field is set to the Vio parameter. If it is a
real connection, which is the case when that parameter is not
null, then fd field is set by calling vio_fd function. read_pos and
write_pos are set to buff, while remaining integers are set to 0. If
function is run on the MySQL server on Unix and server is started in
a test mode that would require testing of blocking, then vio_blocking
function is called. Last, fast throughput mode is set by a call to
vio_fastsend function.
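
The buffer-related part of this initialization can be sketched as
follows. The struct is a hypothetical, reduced stand-in for NET, the
demo_ globals stand in for net_buffer_length and max_allowed_packet
with their documented defaults, and everything concerning vio, fd
and socket options is left out.

```c
#include <assert.h>
#include <stdlib.h>

static unsigned long demo_net_buffer_length = 8192UL;            /* 8KB */
static unsigned long demo_max_allowed_packet = 1024UL * 1024UL;  /* 1MB */

/* Reduced stand-in for struct NET: buffer bookkeeping only. */
struct demo_net {
    unsigned char *buff, *buff_end, *write_pos, *read_pos;
    unsigned long max_packet;
    unsigned int pkt_nr;
    int error;
};

/* Returns 1 for error, 0 for success. */
static int demo_net_init(struct demo_net *net)
{
    assert(net != NULL);
    net->buff = (unsigned char *)malloc(demo_net_buffer_length);
    if (net->buff == NULL)
        return 1;
    /* max_allowed_packet may not be smaller than the start size */
    if (demo_net_buffer_length > demo_max_allowed_packet)
        demo_max_allowed_packet = demo_net_buffer_length;
    net->max_packet = demo_net_buffer_length;
    net->buff_end = net->buff + net->max_packet;
    net->write_pos = net->read_pos = net->buff;
    net->pkt_nr = 0;
    net->error = 0;
    return 0;
}

/* Counterpart of net_end: release the memory allocated to buff. */
static void demo_net_end(struct demo_net *net)
{
    assert(net != NULL);
    free(net->buff);
    net->buff = NULL;
}
```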


Function name: net_end

Parameters: struct NET *

Return value: void

Function purpose: To release memory allocated to buff


Function name: net_realloc (private, static function)

Parameters: struct NET *, ulong (unsigned long)

Return value: 1 for error, 0 for success

Function purpose: To change memory allocated to buff

Function description

New length of buff field of NET struct is passed as second parameter.
It is first checked versus max_allowed_packet and if greater, an
error is returned. New length is aligned to 4096-byte boundary. Then,
buff is reallocated, buff_end, max_packet, and write_pos reset to
the same values as in my_net_init.
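
The size check and the alignment can be sketched as below. The names
are hypothetical, the struct is a reduced stand-in for NET, and
setting the error field on failure is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_IO_SIZE 4096UL   /* alignment boundary from the text */

static unsigned long demo_max_allowed_packet = 1024UL * 1024UL;

/* Reduced stand-in for struct NET. */
struct demo_net {
    unsigned char *buff, *buff_end, *write_pos;
    unsigned long max_packet;
    int error;
};

/* Round length up to the next multiple of DEMO_IO_SIZE. */
static unsigned long demo_align(unsigned long length)
{
    return (length + DEMO_IO_SIZE - 1) & ~(DEMO_IO_SIZE - 1);
}

/* Returns 1 for error, 0 for success. */
static int demo_net_realloc(struct demo_net *net, unsigned long length)
{
    unsigned char *new_buff;

    assert(net != NULL);
    if (length > demo_max_allowed_packet) {
        net->error = 1;               /* assumption: flag the failure */
        return 1;
    }
    length = demo_align(length);
    new_buff = (unsigned char *)realloc(net->buff, length);
    if (new_buff == NULL) {
        net->error = 1;
        return 1;
    }
    /* reset the bookkeeping the same way as at initialization */
    net->buff = net->write_pos = new_buff;
    net->max_packet = length;
    net->buff_end = net->buff + net->max_packet;
    return 0;
}
```

On failure the old buffer is left untouched, so the connection can
still be used to report the error.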


Function name: net_clear (used on client side only)

Parameters: struct NET *

Return value: void

Function purpose: To read unread packets

Function description

This function is used on client side only, and is executed
only if a program is not started in test mode. This function reads
unread packets without processing them. First, non-blocking mode is
set on systems that do not have non-blocking mode defined. This is
performed by checking the mode with vio_is_blocking function and
setting non-blocking mode by vio_blocking function. If this operation
was successful, then packets are read by vio_read function, to which
the vio field of NET is passed together with the buff field of the
same struct and a length of max_packet. If blocking was active before
reading is performed, blocking is set again with vio_blocking
function. After reading has been performed, pkt_nr is reset to 0 and
write_pos is reset to buff. To clarify, non-blocking mode enables the
executing program to dissociate from a connection, so that an error
in the connection would not hang the entire program or its thread.
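
The idea of discarding unread data without hanging can be sketched
with plain POSIX calls. This is not the VIO code: fcntl and read
stand in for vio_blocking and vio_read, demo_drain is a hypothetical
name, and reading in a loop until nothing is left is an assumption
of this sketch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Discard everything currently readable on fd, using buff as scratch
   space. Returns the number of bytes discarded, or -1 if the
   blocking mode could not be queried or changed. */
static long demo_drain(int fd, unsigned char *buff, size_t max_packet)
{
    long total = 0;
    ssize_t n;
    int flags, was_blocking;

    assert(buff != NULL && max_packet > 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    was_blocking = !(flags & O_NONBLOCK);
    if (was_blocking && fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    /* In non-blocking mode read returns -1 instead of waiting when
       no data is pending, which ends the loop. */
    while ((n = read(fd, buff, max_packet)) > 0)
        total += (long)n;

    if (was_blocking)
        fcntl(fd, F_SETFL, flags);    /* restore blocking mode */
    return total;
}
```

After such a drain the caller would reset its packet counter and
write position, as described above.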

Function name: net_flush

Parameters: struct NET *

Return value: 1 for error, 0 for success

Function purpose: To write remaining bytes in buff to socket

Function description

net_real_write (described below) is performed if write_pos
differs from buff, both being fields of the only parameter. write_pos
is reset to buff. This function has to be used, as MySQL uses buffered
writes (as will be explained more in the function net_write_buff).


Function name: my_net_write

Parameters: struct NET *, const char *, ulong

Return value: 1 for error, 0 for success

Function purpose: Write a logical packet in the second parameter
                  of third parameter length

Function description

The purpose of this function is to prepare a logical packet such
that entire content of data, pointed to by second parameter and in
length of third parameter is sent to the other side. In case of
server, it is used for sending result sets, and in case of client
it is used for sending local data. This function foremost prepares
a header for the packet. Normally, the header consists of 4 bytes,
of which the first 3 bytes contain the length of the packet, thereby
limiting a maximum allowable length of a packet to 16MB, while the
fourth byte contains the packet number, which is used when one large
packet has to be divided into sequence of packets. This way each
sub-packet gets its number which should be matched on the other
side. When compression is used another three bytes are added to
packet header, thus packet header is in that case increased to 7
bytes. Additional three bytes are used to save the length of
compressed data. As, in a connection that uses the compression option,
code packs packets together, a header prepared by this function
is later not used in writing to / reading from network, but only
to distinguish logical packets within a buffered read operation.


This function first stores the value of the third parameter into the
first 3 bytes of local char variable of NET_HEADER_SIZE size by usage
of function int3store. Then, at this point, if compression is not
used, pkt_nr is increased, and its value stored in the last byte of
the said local char[] variable. If compression is used, 0 is stored in
both values. Then those four bytes are sent to other side by the usage
of the function net_write_buff(to be explained later on), and if
successful, entire packet in second parameter of the length described
in third parameter is sent by the usage of the same function.
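
The flow just described can be sketched as follows. All demo_ names
are hypothetical: demo_write_buff is a trivial stand-in for
net_write_buff that appends to a fixed array, the byte order of the
length is an assumption, and so is the choice to store the packet
number before increasing it, so that the first packet gets number 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HEADER_SIZE 4    /* same value as NET_HEADER_SIZE */

/* Reduced stand-in for struct NET. */
struct demo_net {
    unsigned char out[256];   /* stands in for the buffered connection */
    size_t used;
    unsigned int pkt_nr;
    int compress;
};

/* Stand-in for net_write_buff: append bytes, 1 when they do not fit. */
static int demo_write_buff(struct demo_net *net, const void *p,
                           size_t len)
{
    if (len > sizeof net->out - net->used)
        return 1;
    memcpy(net->out + net->used, p, len);
    net->used += len;
    return 0;
}

/* Returns 1 for error, 0 for success. */
static int demo_net_write(struct demo_net *net, const char *packet,
                          unsigned long len)
{
    unsigned char header[DEMO_HEADER_SIZE];

    assert(len <= 0xFFFFFFUL);
    header[0] = (unsigned char)(len & 0xFF);
    header[1] = (unsigned char)((len >> 8) & 0xFF);
    header[2] = (unsigned char)((len >> 16) & 0xFF);
    /* with compression the number is carried by the outer header */
    header[3] = net->compress ? 0
                              : (unsigned char)(net->pkt_nr++ & 0xFF);
    if (demo_write_buff(net, header, DEMO_HEADER_SIZE))
        return 1;
    return demo_write_buff(net, packet, (size_t)len);
}
```

Two consecutive calls leave two logical packets, numbered 0 and 1,
one after another in the buffer.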


Function name: net_write_command

Parameters: struct NET *, char, const char *, ulong

Return value: 1 for error, 0 for success

Function purpose: Send a command with a packet as in previous function

Function description

This function is very similar to the previous one. The only
difference is that the first packet is enlarged by one byte, which
contains a command code and precedes the data to be sent. Command
codes do not use the range of values that are used by character
sets, so when the other side receives a packet, the first byte after
the header contains a command code. This function is used by client
for sending all commands and queries, and by server in connection
process and for sending errors.
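
A command packet can be sketched like this. The function names and
the command value DEMO_CMD_QUERY are hypothetical and chosen for
illustration only; the byte order of the length is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_HEADER_SIZE 4    /* same value as NET_HEADER_SIZE */
#define DEMO_CMD_QUERY   3    /* hypothetical command code */

/* Build header + command byte + argument in out (capacity cap).
   Returns the total number of bytes, or 0 if out is too small. */
static size_t demo_build_command(unsigned char *out, size_t cap,
                                 unsigned char command,
                                 const char *arg, unsigned long len,
                                 unsigned char pkt_nr)
{
    unsigned long body = len + 1;   /* command byte + argument */

    assert(body <= 0xFFFFFFUL);
    if (cap < DEMO_HEADER_SIZE + body)
        return 0;
    out[0] = (unsigned char)(body & 0xFF);
    out[1] = (unsigned char)((body >> 8) & 0xFF);
    out[2] = (unsigned char)((body >> 16) & 0xFF);
    out[3] = pkt_nr;
    out[DEMO_HEADER_SIZE] = command;    /* first byte after header */
    if (len > 0)
        memcpy(out + DEMO_HEADER_SIZE + 1, arg, len);
    return DEMO_HEADER_SIZE + (size_t)body;
}

/* Receiving side: the command is the first byte after the header. */
static unsigned char demo_command_of(const unsigned char *packet)
{
    return packet[DEMO_HEADER_SIZE];
}
```

Note that the length stored in the header counts the command byte,
so it is one larger than the length of the argument.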


Function name: net_write_buff (private, static function)

Parameters: struct NET *, const char *, uint

Return value: 1 for error, 0 for success

Function purpose: To write a packet of any size by cutting it
and using next function for writing it

Function description

This function was created after compression feature has been
added to MySQL. This function supposes that packets have already been
properly formatted, regarding packet header etc. The principal reason for
this function to exist is because a packet that is sent by client or
server does not have to be less than max_packet. So this function
first calculates how much data has been left in a buff, by getting a
difference between buff_end and write_pos and storing it to local
variable left_length. Then a loop is run as long as the length to be
sent is greater than length of left bytes (left_length). In a loop
data from second parameter is copied to buff at write_pos, as much as
it can be, that is, by left_length. Then net_real_write function is called
(see below) with NET, buff, and max_packet parameters. This function
is the lowest level function that writes data over established
connection. In the loop, write_pos is reset to buff, the pointer to data
(second parameter) is moved by the amount of data sent (left_length),
length of data to be sent (third parameter) is decreased by the amount
sent (left_length) and left_length is reset to max_packet value, which
ends the loop. This logic was necessary, as there could have been some
data yet unsent (write_pos != buff), while data to be sent could be as
large as necessary, thus requiring many loops. At the end of function,
remaining data in second parameter are copied to buff at write_pos, by
the remaining length of data to be sent (third parameter). So, in the
next call of this function remaining data will be sent, as buff is
used in the call to net_real_write. It is very important to note that if
a packet to be sent is less than the number of bytes that are still
available in buff, then there will be no writing over network, but
only logical packets will be added one after another. This will
accelerate network traffic, plus if compression is used, the
expected compression rate would be higher. That is why server or
client functions that send data call, at the end of the data, the
net_flush function described above.
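
The buffering loop can be traced with a small sketch. The buffer is
deliberately tiny (8 bytes) so that the loop is easy to follow; the
demo_ names are hypothetical, demo_real_write is a stand-in for
net_real_write that appends to an in-memory sink, and demo_flush
mirrors what is said about net_flush.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BUF_SIZE 8   /* tiny max_packet, for illustration only */

struct demo_net {
    unsigned char buff[DEMO_BUF_SIZE];
    unsigned char *write_pos;
    unsigned char sink[64];      /* stands in for the connection */
    size_t sink_used;
    unsigned int real_writes;    /* number of writes to the sink */
};

static void demo_init(struct demo_net *net)
{
    memset(net, 0, sizeof *net);
    net->write_pos = net->buff;
}

/* Stand-in for net_real_write. Returns 1 for error, 0 for success. */
static int demo_real_write(struct demo_net *net,
                           const unsigned char *p, size_t len)
{
    if (len > sizeof net->sink - net->sink_used)
        return 1;
    memcpy(net->sink + net->sink_used, p, len);
    net->sink_used += len;
    net->real_writes++;
    return 0;
}

/* Returns 1 for error, 0 for success. */
static int demo_write_buff(struct demo_net *net, const char *packet,
                           size_t len)
{
    size_t left_length =
        (size_t)(net->buff + DEMO_BUF_SIZE - net->write_pos);

    assert(packet != NULL);
    while (len > left_length) {
        /* fill the rest of buff and write the whole buffer out */
        memcpy(net->write_pos, packet, left_length);
        if (demo_real_write(net, net->buff, DEMO_BUF_SIZE))
            return 1;
        net->write_pos = net->buff;
        packet += left_length;
        len -= left_length;
        left_length = DEMO_BUF_SIZE;
    }
    /* the remainder fits: keep it in buff for a later write */
    memcpy(net->write_pos, packet, len);
    net->write_pos += len;
    return 0;
}

/* Write whatever is still buffered. 1 for error, 0 for success. */
static int demo_flush(struct demo_net *net)
{
    int error = 0;

    if (net->write_pos != net->buff) {
        error = demo_real_write(net, net->buff,
                                (size_t)(net->write_pos - net->buff));
        net->write_pos = net->buff;
    }
    return error;
}
```

Writing 3 bytes and then 13 bytes causes a single write of 8 bytes;
the remaining 8 bytes stay in the buffer until the flush.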


Function name: net_real_write

Parameters: struct NET *, const char *, ulong

Return value: 1 for error, 0 for success

Function purpose: To write data to a socket or pipe, with
compression if used

Function description

First, more field is set to 2, to enable reporting in
mysql_list_processes. Then if compression is enabled on that
connection, a new local buffer (variable b) is initialized to the
length of total header (normal header + compression header) and if no
memory is available, an error is returned. This buffer (b) is used for
holding the final, compressed packet to be written over the
connection. Furthermore in compression initialization, second
parameter at length of third parameter is copied to the local buffer
b, and MySQL's wrapped zlib's compression function is run at total
header offset of the local buffer. Please, do note that this function
does not test effectiveness of compression. If compression is turned
on in some connection, it is used all of the time. Also, it is very
important to be cognizant of the fact that this algorithm makes
possible that a single compressed packet contains several logical
packets. In this way compression rate is increased and network
throughput is increased as well. However, this algorithm has
consequences on the other side, that reads compressed packet, which
is covered in my_net_read function. After compression is done, the full
compression header is properly formed with the packet number,
compressed and uncompressed lengths. At the end of compression code,
third parameter is increased by total header length, as the original
header is not used (see above), and second parameter, pointer to data,
is set to point to local buffer b, in order that the further flow of
function is independent of compression. If the function is executed
on server side, a thread alarm is initialized and, if non-blocking is
active, set to NET_WRITE_TIMEOUT. Two local (char *) pointers are
initialized, pos at beginning of second parameter, and end at end of
data. Then the loop is run until all data is written, which means
as long as pos != end. First vio_write function is called, with
parameters of vio field, pos and size of data (end - pos). Number of
bytes written over connection is saved in local variable (length). If
error is returned local bool variable (interrupted) is set according
to the return value of the vio_should_retry called with vio field as
parameter. This bool variable indicates whether writing was
interrupted in some way or not.
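
The header layout described above can be sketched as follows. This
is a minimal illustration and not the MySQL source: the names store3,
load3 and build_comp_header and the SKETCH_ constants are
hypothetical, and the byte layout (three-byte lengths stored least
significant byte first, with the uncompressed length placed after the
packet number) is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes: 3 length bytes + 1 packet number byte, followed by
   3 bytes for the uncompressed length when compression is enabled. */
#define SKETCH_NET_HEADER_SIZE  4
#define SKETCH_COMP_HEADER_SIZE 3

/* Store the low 24 bits of v at p, least significant byte first. */
static void store3(unsigned char *p, unsigned long v)
{
  p[0] = (unsigned char) (v & 0xFF);
  p[1] = (unsigned char) ((v >> 8) & 0xFF);
  p[2] = (unsigned char) ((v >> 16) & 0xFF);
}

/* Read back a three-byte value stored by store3. */
static unsigned long load3(const unsigned char *p)
{
  return (unsigned long) p[0] |
         ((unsigned long) p[1] << 8) |
         ((unsigned long) p[2] << 16);
}

/* Fill the total header at the start of buffer b.  Returns the total
   header length, which is the offset where compressed data starts. */
static size_t build_comp_header(unsigned char *b,
                                unsigned long compressed_len,
                                unsigned char pkt_nr,
                                unsigned long uncompressed_len)
{
  assert(compressed_len < (1UL << 24));
  assert(uncompressed_len < (1UL << 24));
  store3(b, compressed_len);
  b[3] = pkt_nr;
  store3(b + SKETCH_NET_HEADER_SIZE, uncompressed_len);
  return SKETCH_NET_HEADER_SIZE + SKETCH_COMP_HEADER_SIZE;
}
```

The returned offset is where a caller would place the compressed
payload, so that header and payload go out in a single write.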

Further, an error from vio_write is treated differently on Unix
versus other OS's (Win32 or OS/2). On Unix an alarm is set if one is
not in use, no bytes have been written and there has been no
interruption. Also, in that case, if the connection is not in
blocking mode, a sub-loop is run as long as blocking mode has not
been set with the vio_blocking function. Within the sub-loop another
run of the vio_write call above is made, depending on the return
value of the vio_should_retry function, provided the number of
repeated writes is less than RETRY_COUNT. If that is not the case,
the error field of struct NET is set to 1 and the function exits. At
the exit of the sub-loop the number of reruns already executed is
reset to zero and another run of the vio_write function is
attempted. If the function is run on Win32 or OS/2, and the function
flow was not interrupted and the thread alarm is not in use, the
main loop is again continued, as long as pos != end. If this
function is executed in a thread-safe client program, the vio_errno
function is used to test whether the write failed with EINTR, caused
by context switching; if so, the loop is continued. At the end of
processing of the error from vio_write, the error field of struct
NET is set, and if on the server the last_errno field is set to
ER_NET_WRITE_INTERRUPTED if the local bool variable (interrupted) is
true, or to ER_NET_ERROR_ON_WRITE otherwise. Before the end of the
loop, in order to make evaluation of the loop condition possible,
pos is increased by the number of bytes written in the last
iteration (length). The global variable bytes_sent is also increased
by the same value, for status purposes. At the end of the function
the more field is reset; in the case of compression, the memory of
the compression buffer (b) is released; if the thread alarm is still
in use, it is ended and the blocking state is reset to its original
state. The function returns an error if not all bytes were written.
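
The core of the write loop can be sketched as below. This is a
simplified illustration under stated assumptions, not the actual
function: struct sketch_io and sketch_write are hypothetical
stand-ins for the vio layer, SKETCH_RETRY_COUNT is an arbitrary
value, and alarms, blocking mode and the per-platform branches are
left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_RETRY_COUNT 1   /* hypothetical retry limit */

/* Hypothetical transport that stands in for the vio layer. */
struct sketch_io {
  unsigned char sink[64];  /* bytes "sent" so far */
  size_t sent;             /* number of valid bytes in sink */
  size_t chunk;            /* at most this many bytes per call */
  int fail_next;           /* number of upcoming calls that fail */
};

/* Returns the number of bytes written, or -1 on a failure. */
static long sketch_write(struct sketch_io *io,
                         const unsigned char *p, size_t n)
{
  if (io->fail_next > 0) {
    io->fail_next--;
    return -1;
  }
  if (n > io->chunk)
    n = io->chunk;
  if (n > sizeof(io->sink) - io->sent)
    n = sizeof(io->sink) - io->sent;
  memcpy(io->sink + io->sent, p, n);
  io->sent += n;
  return (long) n;
}

/* Write len bytes of data.  Returns 0 on success, 1 on error. */
static int write_all(struct sketch_io *io, const unsigned char *data,
                     size_t len, unsigned long *bytes_sent)
{
  const unsigned char *pos = data, *end = data + len;
  unsigned int retry_count = 0;

  assert(io != NULL && bytes_sent != NULL);
  while (pos != end) {
    long length = sketch_write(io, pos, (size_t) (end - pos));
    if (length <= 0) {
      if (retry_count++ < SKETCH_RETRY_COUNT)
        continue;            /* try the same write once more */
      return 1;              /* give up: not all bytes written */
    }
    retry_count = 0;         /* a successful write resets retries */
    pos += length;           /* advance so pos != end can be tested */
    *bytes_sent += (unsigned long) length;
  }
  return 0;
}
```

Because pos only advances by the number of bytes actually written, a
short write simply leads to another iteration with the remaining
data.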


Function name: my_real_read (private, static function)

Parameters: struct NET *, ulong *

Return value: number of bytes read, or packet_error on error

Function purpose: low level network connection read function

Function description

This function was made a separate one when compression was
introduced into the MySQL client/server protocol. It contains the
basic, low level network reading functionality, while all dealings
with compressed packets are handled in the next function.
Compression is handled in this function only in as much as the
length of the uncompressed data is extracted. First the blocking
state of the connection is saved in the local bool variable
net_blocking, and the field more is set to 1 for detailed reporting
in mysqld_list_processes. A new thread alarm is initialized, in
order to enable read timeout handling, and if on the server and the
connection can block the program, the alarm is set to the value of
the timeout field. A local pointer is set to the position of the
next logical packet, with its header skipped, which is at the
where_b offset from buff. Next, a section of code that runs twice is
entered. The loop is run exactly two times because the first time
the number of bytes to be fetched (remain) is set to the header
size, which differs depending on whether compression is used on the
connection. After the first fetch has been done, the number of bytes
that will be received in the second iteration is known, as the
fetched header contains the size of the packet, the packet number,
and in the case of compression, the size of the uncompressed packet.
Then, as long as there are bytes to read, an inner loop is entered,
which first reads data from the network connection with the vio_read
function, called with the field vio, the current position and the
remaining number of bytes as parameters. The latter is held by the
local variable (remain), initialized in the first run to the header
size. The number of bytes read is returned in the local length
variable. If an error is returned, a local bool variable
(interrupted) is set according to the return value of
vio_should_retry, called with the vio field as its parameter. This
bool variable indicates whether reading was interrupted in some way
or not.
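
The "read until remain is zero" part can be sketched as below. It is
an illustration only: struct sketch_src and sketch_read are
hypothetical stand-ins for the vio layer, the header sizes of 4 and 7
bytes are assumptions, and error handling is reduced to giving up on
the first failed read.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory source standing in for the vio layer.  It
   hands out at most chunk bytes per call, like a slow connection. */
struct sketch_src {
  const unsigned char *data;
  size_t len;    /* total bytes available */
  size_t off;    /* bytes already consumed */
  size_t chunk;  /* maximum bytes returned per call */
};

/* Returns the number of bytes read, 0 at end of input. */
static long sketch_read(struct sketch_src *s, unsigned char *pos,
                        size_t n)
{
  size_t avail = s->len - s->off;
  if (n > s->chunk)
    n = s->chunk;
  if (n > avail)
    n = avail;
  memcpy(pos, s->data + s->off, n);
  s->off += n;
  return (long) n;
}

/* Size of the header fetched in the first run (assumed values). */
static size_t header_size(int compress)
{
  return compress ? 7 : 4;
}

/* Read exactly remain bytes into pos.  Returns 0 on success, 1 on
   error.  One call corresponds to one run of the two-run loop. */
static int read_exact(struct sketch_src *s, unsigned char *pos,
                      size_t remain, unsigned long *bytes_received)
{
  assert(s != NULL && pos != NULL && bytes_received != NULL);
  while (remain > 0) {
    long length = sketch_read(s, pos, remain);
    if (length <= 0)
      return 1;                 /* no retry logic in this sketch */
    remain -= (size_t) length;  /* usually zero after one read */
    pos += length;
    *bytes_received += (unsigned long) length;
  }
  return 0;
}
```

A caller would invoke read_exact once with header_size(...) and then
again with the body length taken from the header just read.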

Further, an error from vio_read is treated differently on Unix
versus other OS's (Win32 or OS/2). On Unix an alarm is set if one is
not in use, no bytes have been read and there has been no
interruption. Also, in that case, if the connection is not in
blocking mode, a sub-loop is run as long as blocking mode has not
been set with the vio_blocking function. Within the sub-loop another
run of the vio_read call above is made, depending on the return
value of the vio_should_retry function, provided the number of
repeated reads is less than RETRY_COUNT. If that is not the case,
the error field of struct NET is set to 1 and the function exits. At
the exit of the sub-loop the number of reruns already executed is
reset to zero and another run of the vio_read function is attempted.
If the function is run on Win32 or OS/2, and the function flow was
not interrupted and the thread alarm is not in use, the main loop is
again continued as long as there are bytes remaining. If this
function is executed in a thread-safe client program and another run
should be made, which is decided by the output of the
vio_should_retry function, the loop is continued. At the end of
processing of the error from vio_read, the error field of struct NET
is set, and if on the server the last_errno field is set to
ER_NET_READ_INTERRUPTED if the local bool variable (interrupted) is
true, or to ER_NET_ERROR_ON_READ otherwise. In case of such an error
this function exits and returns an error. When there is no error,
the number of remaining bytes (remain) is decreased by the number of
bytes read. The result should be zero, in which case the
while (remain > 0) loop is exited immediately; if it is not, the
loop continues reading. This has been done to accommodate errors at
the traffic level and very slow connections. The current position in
field buff is also moved by the number of bytes read by the vio_read
function, and the global variable bytes_received is increased by the
same value in a
thread safe manner. When the loop that runs until the necessary
bytes (remain) are read is finished, and the external loop is in the
first of its two runs, packet sequencing is tested for consistency
by comparing the number contained in the 4th byte of the header with
the pkt_nr field. The header is located at the where_b offset from
field buff. Usage of where_b is obligatory due to the possible use
of compression. If there is no compression on a connection, then
where_b is always 0. If there is a discrepancy, the first byte of
the header is checked for being equal to 255, because when an error
is sent by the server, or by a client if it is sending data (like in
LOAD DATA INFILE LOCAL...), the first byte in the header is set to
255. If it is not 255, an error on packets being out of order is
printed. In any case, on the server, the last_errno field is set to
ER_NET_PACKETS_OUT_OF_ORDER and the function returns with an error,
that is, the value returned is packet_error. If the check on the
serial number of the packet is successful,
pkt_nr field is incremented in order to enable checking of the
packet order with the next packet, and if compression is used, the
uncompressed length is extracted from its position in the header and
returned in the second parameter of this function. The length of the
packet is saved, for the purpose of a proper return value from this
function. Still in the first iteration of the main loop, a check
must be made whether field buff can accommodate the entire incoming
packet, in its compressed or uncompressed form. This is done because
zlib's compress and uncompress functions use the same memory area
for compression and uncompression. The necessary length of field
buff is equal to the current offset of the data (where_b, which is
zero for non-compression), plus the larger of the compressed and
uncompressed sizes of the packet to be read in the second run. If
this value is larger than the current length of field buff, which is
read from field max_packet, then field buff has to be reallocated.
If reallocation with the net_realloc function fails, the function
returns an error. Before the second run of the loop is started, the
length to be read is set to the length of the expected data and the
current position (pos) is set to the where_b offset from field buff.
At the end of the function, if the alarm is set, which is the case
if it is run on the server, or on a client if the function was
interrupted and another run of vio_read was attempted, the alarm is
ended and the blocking state is restored from the saved local bool
variable net_blocking. The function returns the number of bytes read
or the error (packet_error).
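
The checks made after the header has been fetched can be sketched as
below. This is an illustration under assumptions rather than the real
code: struct sketch_net holds only a hypothetical subset of the NET
fields, SKETCH_PACKET_ERROR stands in for packet_error, the header
layout (three-byte lengths, least significant byte first) is assumed,
and the special case of a first byte equal to 255 is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PACKET_ERROR (~(unsigned long) 0)

/* Hypothetical subset of struct NET used by this sketch. */
struct sketch_net {
  unsigned int pkt_nr;       /* expected packet number */
  int compress;              /* non-zero if compression is enabled */
  unsigned long where_b;     /* offset of current packet in buff */
  unsigned long max_packet;  /* current allocated length of buff */
};

/* Read a three-byte value, least significant byte first. */
static unsigned long load3(const unsigned char *p)
{
  return (unsigned long) p[0] |
         ((unsigned long) p[1] << 8) |
         ((unsigned long) p[2] << 16);
}

/* Inspect a freshly read header.  On success returns the number of
   bytes to read in the second run, stores the uncompressed length
   (0 without compression) in *complen and the required length of
   buff in *needed.  Returns SKETCH_PACKET_ERROR when the packet
   number does not match. */
static unsigned long check_header(struct sketch_net *net,
                                  const unsigned char *header,
                                  unsigned long *complen,
                                  unsigned long *needed)
{
  unsigned long len, larger;

  assert(net != NULL && header != NULL);
  if (header[3] != (unsigned char) net->pkt_nr)
    return SKETCH_PACKET_ERROR;   /* packets out of order */
  net->pkt_nr++;                  /* expect the next packet number */
  *complen = net->compress ? load3(header + 4) : 0;
  len = load3(header);
  /* buff is used for both forms, so size it for the larger one. */
  larger = len > *complen ? len : *complen;
  *needed = net->where_b + larger;
  return len;
}
```

A caller would compare *needed with max_packet and reallocate buff
before starting the second run.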


Function name: my_net_read

Parameters: struct NET *

Return value: length of the logical packet read, or packet_error

Function purpose: Highest level general purpose reading function

Function description

First, if compression is not used, my_real_read is called with
struct NET * as the first parameter and a pointer to the local ulong
complen as the second parameter, whose value is not used here. The
number of bytes read is returned in the local ulong variable len.
The read_pos field is set to the where_b offset from field buff; the
where_b field denotes where in field buff the current packet is. If
the returned number of bytes read (local variable len) does not
signal that an error in packet transmission occurred (that is, it is
not equal to packet_error), the string at read_pos is zero
terminated: the byte at read_pos + len, just past the end of the
data, is set to zero. This is done because mysql_use_result expects
a zero terminated string. The function then returns the value of the
local variable len. This ends the function in the case that
compression is not used; the remaining code is executed only if
compression is enabled on the connection.

In order to explain how a compressed packet is logically cut into
meaningful packets, the full meaning of several NET fields should be
explained. First of all, fields in NET are used instead of local
variables, as all values must be preserved between consecutive calls
of this function. This function is called in order to return logical
packets, but it does not need to call the my_real_read function
every time, because a large packet, once uncompressed, may (but need
not) contain several logical packets. Field remain_in_buf denotes
how many bytes of the entire uncompressed packet are still contained
within buff. Field buf_length saves the length of the entire
uncompressed packet. Field save_char is used to save the character
at the position where the logical packet ends, which has to be
replaced with a zero, '\0', in order to make the logical packet zero
delimited for mysql_use_result. Field length stores the length of
the compressed packet. Field read_pos, as usual, points to the
current reading position. This char * pointer is used by all
functions that call this function in order to fetch their data;
field buff is not used for that purpose. This change was introduced
with compression, when the algorithm accommodated grouping of
several packets together.

Now that the meanings of all relevant NET fields are explained, we
can proceed with the flow of this function for the case when
compression is active. First, if there are remaining portions of a
compressed packet in field buff, the saved character value is put
back at the position where the zero char '\0' was inserted to make
the string zero delimited for mysql_use_result. Then a loop is
started. In the first part of the loop, if there are remaining
bytes, the local uchar *pos variable is set to the current position
in field buff where a new packet starts. This position is at the
(buf_length - remain_in_buf) offset in field buff. As it is possible
that the next logical packet has not been read to its full length in
the remainder of field buff, several things have to be inspected. It
should be noted that data read by my_real_read contains only logical
packets with 4 byte headers, being the headers prepared by
my_net_write or net_write_command. But, when written, a logical
packet could be divided in such a way that only a part of its header
is read in. Therefore, after the pointer to the start of the next
packet has been saved, a check is made whether the number of
remaining bytes in the buffer is at least 4, being 3 bytes for the
length and one byte for the packet number. If it is, the length of
the logical packet is extracted and saved in the length field. Then
a check is made whether the logical packet is fully contained in the
buffer. In that case, the number of bytes remaining in the buffer is
decreased by the full length of the logical packet (4 + length
field), read_pos is moved forward by 4 bytes to skip the header and
point at the beginning of data in the logical packet, the length
field is saved as the value to be returned from the function, and
the loop is exited. If the entire logical packet is not contained
within the buffer, then, if the length of the entire buffer differs
from the remaining length of the logical packet, it (the logical
packet) is moved to the beginning of field buff. If the length of
the entire buffer equals the remaining length of the logical packet,
the where_b and buf_length fields are set to 0. This is done so that
in both cases the buffer is ready to accept the next part of the
packet.
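
The extraction of logical packets from an already uncompressed
buffer can be sketched as below. This is an illustration only:
struct sketch_net is a hypothetical subset of the NET fields,
next_logical_packet is an invented name, the three-byte length is
assumed to be stored least significant byte first, and the step that
moves an incomplete packet to the start of buff is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of struct NET used by this sketch. */
struct sketch_net {
  unsigned char *buff;
  unsigned long buf_length;     /* uncompressed bytes held in buff */
  unsigned long remain_in_buf;  /* bytes not yet handed out */
  unsigned char *read_pos;      /* data of the current packet */
};

/* Try to hand out the next logical packet.  Returns 1 and sets *len
   and read_pos if a complete packet is in the buffer; returns 0 if
   more data has to be read first. */
static int next_logical_packet(struct sketch_net *net,
                               unsigned long *len)
{
  unsigned char *pos;
  unsigned long length;

  assert(net != NULL && len != NULL);
  if (net->remain_in_buf < 4)
    return 0;                   /* not even a full header left */
  pos = net->buff + (net->buf_length - net->remain_in_buf);
  length = (unsigned long) pos[0] |
           ((unsigned long) pos[1] << 8) |
           ((unsigned long) pos[2] << 16);
  if (4 + length > net->remain_in_buf)
    return 0;                   /* packet body is incomplete */
  net->remain_in_buf -= 4 + length;
  net->read_pos = pos + 4;      /* skip the 4 byte header */
  *len = length;
  return 1;
}
```

Calling the function repeatedly walks through all complete logical
packets; a return value of 0 is the point at which the real function
would go on to read and uncompress more data.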

In order to get the next part of a packet, still within the loop,
the my_real_read function is called. The length of the compressed
packet is returned in the local variable len, and the length of the
uncompressed data is returned in the complen variable. In the case
of non-compression the value of complen is zero. If packet_error is
returned from the my_real_read function, this function also returns
packet_error. Otherwise, the my_uncompress function is called to
uncompress the data. It is called with the where_b offset from field
buff, as that is the position where the compressed packet starts,
and with the len and complen values, being the lengths of compressed
and uncompressed data. If there is no compression, 0 is returned for
the uncompressed size from the my_real_read function, and the
my_uncompress wrapper function skips zlib's uncompress in that case.
If an error is returned from my_uncompress, the error field is set
to 1, on the server last_errno is set to ER_NET_UNCOMPRESS_ERROR,
the loop is exited and the function returns packet_error. If not,
the buf_length and remain_in_buf fields are set to the uncompressed
size of the buffer and the loop is continued. When the loop is
exited, the save_char field is used to save the char at the end of
the logical packet, which is at an offset of len from the position
in field buff pointed to by field read_pos, so that a zero char can
be set at the same position, for mysql_use_result. The function
returns the length of the logical packet without its header.
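
The save_char technique can be sketched as below. This is a minimal
illustration, not the MySQL source: struct sketch_net and both
function names are hypothetical, and the term_pos pointer is an
invention of this sketch used to remember where the zero was written.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical subset of struct NET used by this sketch. */
struct sketch_net {
  unsigned char *read_pos;  /* data of the current logical packet */
  unsigned char *term_pos;  /* where '\0' was written, or NULL */
  unsigned char save_char;  /* byte that the '\0' replaced */
};

/* Put back the byte overwritten by the previous call, so that the
   header of the following logical packet is intact again. */
static void restore_saved_char(struct sketch_net *net)
{
  assert(net != NULL);
  if (net->term_pos != NULL) {
    *net->term_pos = net->save_char;
    net->term_pos = NULL;
  }
}

/* Zero terminate the len byte packet at read_pos, remembering the
   byte that is overwritten (often the next packet's first byte). */
static void terminate_packet(struct sketch_net *net, unsigned long len)
{
  assert(net != NULL && net->read_pos != NULL);
  net->term_pos = net->read_pos + len;
  net->save_char = *net->term_pos;
  *net->term_pos = '\0';
}
```

The restore step has to run at the start of the next call, before
the following header is parsed, because the terminating zero may have
replaced the first length byte of that header.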