summaryrefslogtreecommitdiff
path: root/doc/bio.doc
diff options
context:
space:
mode:
Diffstat (limited to 'doc/bio.doc')
-rw-r--r--doc/bio.doc423
1 files changed, 423 insertions, 0 deletions
diff --git a/doc/bio.doc b/doc/bio.doc
new file mode 100644
index 0000000000..545a57cdff
--- /dev/null
+++ b/doc/bio.doc
@@ -0,0 +1,423 @@
+BIO Routines
+
+This documentation is rather sparse, you are probably best
+off looking at the code for specific details.
+
+The BIO library is a IO abstraction that was originally
+inspired by the need to have callbacks to perform IO to FILE
+pointers when using Windows 3.1 DLLs. There are two types
+of BIO; a source/sink type and a filter type.
+The source/sink methods are as follows:
+- BIO_s_mem() memory buffer - a read/write byte array that
+ grows until memory runs out :-).
+- BIO_s_file() FILE pointer - A wrapper around the normal
+ 'FILE *' commands, good for use with stdin/stdout.
+- BIO_s_fd() File descriptor - A wrapper around file
+ descriptors, often used with pipes.
+- BIO_s_socket() Socket - Used around sockets. It is
+ mostly in the Microsoft world that sockets are different
+ from file descriptors and there are all those ugly winsock
+ commands.
+- BIO_s_null() Null - read nothing and write nothing.; a
+ useful endpoint for filter type BIO's specifically things
+ like the message digest BIO.
+
+The filter types are
+- BIO_f_buffer() IO buffering - does output buffering into
+ larger chunks and performs input buffering to allow gets()
+ type functions.
+- BIO_f_md() Message digest - a transparent filter that can
+ be asked to return a message digest for the data that has
+ passed through it.
+- BIO_f_cipher() Encrypt or decrypt all data passing
+ through the filter.
+- BIO_f_base64() Base64 decode on read and encode on write.
+- BIO_f_ssl() A filter that performs SSL encryption on the
+ data sent through it.
+
+Base BIO functions.
+The BIO library has a set of base functions that are
+implemented for each particular type. Filter BIOs will
+normally call the equivalent function on the source/sink BIO
+that they are layered on top of after they have performed
+some modification to the data stream. Multiple filter BIOs
+can be 'push' into a stack of modifers, so to read from a
+file, unbase64 it, then decrypt it, a BIO_f_cipher,
+BIO_f_base64 and a BIO_s_file would probably be used. If a
+sha-1 and md5 message digest needed to be generated, a stack
+two BIO_f_md() BIOs and a BIO_s_null() BIO could be used.
+The base functions are
+- BIO *BIO_new(BIO_METHOD *type); Create a new BIO of type 'type'.
+- int BIO_free(BIO *a); Free a BIO structure. Depending on
+ the configuration, this will free the underlying data
+ object for a source/sink BIO.
+- int BIO_read(BIO *b, char *data, int len); Read upto 'len'
+ bytes into 'data'.
+- int BIO_gets(BIO *bp,char *buf, int size); Depending on
+ the BIO, this can either be a 'get special' or a get one
+ line of data, as per fgets();
+- int BIO_write(BIO *b, char *data, int len); Write 'len'
+ bytes from 'data' to the 'b' BIO.
+- int BIO_puts(BIO *bp,char *buf); Either a 'put special' or
+ a write null terminated string as per fputs().
+- long BIO_ctrl(BIO *bp,int cmd,long larg,char *parg); A
+ control function which is used to manipulate the BIO
+ structure and modify it's state and or report on it. This
+ function is just about never used directly, rather it
+ should be used in conjunction with BIO_METHOD specific
+ macros.
+- BIO *BIO_push(BIO *new_top, BIO *old); new_top is apped to the
+ top of the 'old' BIO list. new_top should be a filter BIO.
+ All writes will go through 'new_top' first and last on read.
+ 'old' is returned.
+- BIO *BIO_pop(BIO *bio); the new topmost BIO is returned, NULL if
+ there are no more.
+
+If a particular low level BIO method is not supported
+(normally BIO_gets()), -2 will be returned if that method is
+called. Otherwise the IO methods (read, write, gets, puts)
+will return the number of bytes read or written, and 0 or -1
+for error (or end of input). For the -1 case,
+BIO_should_retry(bio) can be called to determine if it was a
+genuine error or a temporary problem. -2 will also be
+returned if the BIO has not been initalised yet, in all
+cases, the correct error codes are set (accessible via the
+ERR library).
+
+
+The following functions are convenience functions:
+- int BIO_printf(BIO *bio, char * format, ..); printf but
+ to a BIO handle.
+- long BIO_ctrl_int(BIO *bp,int cmd,long larg,int iarg); a
+ convenience function to allow a different argument types
+ to be passed to BIO_ctrl().
+- int BIO_dump(BIO *b,char *bytes,int len); output 'len'
+ bytes from 'bytes' in a hex dump debug format.
+- long BIO_debug_callback(BIO *bio, int cmd, char *argp, int
+ argi, long argl, long ret) - a default debug BIO callback,
+ this is mentioned below. To use this one normally has to
+ use the BIO_set_callback_arg() function to assign an
+ output BIO for the callback to use.
+- BIO *BIO_find_type(BIO *bio,int type); when there is a 'stack'
+ of BIOs, this function scan the list and returns the first
+ that is of type 'type', as listed in buffer.h under BIO_TYPE_XXX.
+- void BIO_free_all(BIO *bio); Free the bio and all other BIOs
+ in the list. It walks the bio->next_bio list.
+
+
+
+Extra commands are normally implemented as macros calling BIO_ctrl().
+- BIO_number_read(BIO *bio) - the number of bytes processed
+ by BIO_read(bio,.).
+- BIO_number_written(BIO *bio) - the number of bytes written
+ by BIO_write(bio,.).
+- BIO_reset(BIO *bio) - 'reset' the BIO.
+- BIO_eof(BIO *bio) - non zero if we are at the current end
+ of input.
+- BIO_set_close(BIO *bio, int close_flag) - set the close flag.
+- BIO_get_close(BIO *bio) - return the close flag.
+ BIO_pending(BIO *bio) - return the number of bytes waiting
+ to be read (normally buffered internally).
+- BIO_flush(BIO *bio) - output any data waiting to be output.
+- BIO_should_retry(BIO *io) - after a BIO_read/BIO_write
+ operation returns 0 or -1, a call to this function will
+ return non zero if you should retry the call later (this
+ is for non-blocking IO).
+- BIO_should_read(BIO *io) - we should retry when data can
+ be read.
+- BIO_should_write(BIO *io) - we should retry when data can
+ be written.
+- BIO_method_name(BIO *io) - return a string for the method name.
+- BIO_method_type(BIO *io) - return the unique ID of the BIO method.
+- BIO_set_callback(BIO *io, long (*callback)(BIO *io, int
+ cmd, char *argp, int argi, long argl, long ret); - sets
+ the debug callback.
+- BIO_get_callback(BIO *io) - return the assigned function
+ as mentioned above.
+- BIO_set_callback_arg(BIO *io, char *arg) - assign some
+ data against the BIO. This is normally used by the debug
+ callback but could in reality be used for anything. To
+ get an idea of how all this works, have a look at the code
+ in the default debug callback mentioned above. The
+ callback can modify the return values.
+
+Details of the BIO_METHOD structure.
+typedef struct bio_method_st
+ {
+ int type;
+ char *name;
+ int (*bwrite)();
+ int (*bread)();
+ int (*bputs)();
+ int (*bgets)();
+ long (*ctrl)();
+ int (*create)();
+ int (*destroy)();
+ } BIO_METHOD;
+
+The 'type' is the numeric type of the BIO, these are listed in buffer.h;
+'Name' is a textual representation of the BIO 'type'.
+The 7 function pointers point to the respective function
+methods, some of which can be NULL if not implemented.
+The BIO structure
+typedef struct bio_st
+ {
+ BIO_METHOD *method;
+ long (*callback)(BIO * bio, int mode, char *argp, int
+ argi, long argl, long ret);
+ char *cb_arg; /* first argument for the callback */
+ int init;
+ int shutdown;
+ int flags; /* extra storage */
+ int num;
+ char *ptr;
+ struct bio_st *next_bio; /* used by filter BIOs */
+ int references;
+ unsigned long num_read;
+ unsigned long num_write;
+ } BIO;
+
+- 'Method' is the BIO method.
+- 'callback', when configured, is called before and after
+ each BIO method is called for that particular BIO. This
+ is intended primarily for debugging and of informational feedback.
+- 'init' is 0 when the BIO can be used for operation.
+ Often, after a BIO is created, a number of operations may
+ need to be performed before it is available for use. An
+ example is for BIO_s_sock(). A socket needs to be
+ assigned to the BIO before it can be used.
+- 'shutdown', this flag indicates if the underlying
+ comunication primative being used should be closed/freed
+ when the BIO is closed.
+- 'flags' is used to hold extra state. It is primarily used
+ to hold information about why a non-blocking operation
+ failed and to record startup protocol information for the
+ SSL BIO.
+- 'num' and 'ptr' are used to hold instance specific state
+ like file descriptors or local data structures.
+- 'next_bio' is used by filter BIOs to hold the pointer of the
+ next BIO in the chain. written data is sent to this BIO and
+ data read is taken from it.
+- 'references' is used to indicate the number of pointers to
+ this structure. This needs to be '1' before a call to
+ BIO_free() is made if the BIO_free() function is to
+ actually free() the structure, otherwise the reference
+ count is just decreased. The actual BIO subsystem does
+ not really use this functionality but it is useful when
+ used in more advanced applicaion.
+- num_read and num_write are the total number of bytes
+ read/written via the 'read()' and 'write()' methods.
+
+BIO_ctrl operations.
+The following is the list of standard commands passed as the
+second parameter to BIO_ctrl() and should be supported by
+all BIO as best as possible. Some are optional, some are
+manditory, in any case, where is makes sense, a filter BIO
+should pass such requests to underlying BIO's.
+- BIO_CTRL_RESET - Reset the BIO back to an initial state.
+- BIO_CTRL_EOF - return 0 if we are not at the end of input,
+ non 0 if we are.
+- BIO_CTRL_INFO - BIO specific special command, normal
+ information return.
+- BIO_CTRL_SET - set IO specific parameter.
+- BIO_CTRL_GET - get IO specific parameter.
+- BIO_CTRL_GET_CLOSE - Get the close on BIO_free() flag, one
+ of BIO_CLOSE or BIO_NOCLOSE.
+- BIO_CTRL_SET_CLOSE - Set the close on BIO_free() flag.
+- BIO_CTRL_PENDING - Return the number of bytes available
+ for instant reading
+- BIO_CTRL_FLUSH - Output pending data, return number of bytes output.
+- BIO_CTRL_SHOULD_RETRY - After an IO error (-1 returned)
+ should we 'retry' when IO is possible on the underlying IO object.
+- BIO_CTRL_RETRY_TYPE - What kind of IO are we waiting on.
+
+The following command is a special BIO_s_file() specific option.
+- BIO_CTRL_SET_FILENAME - specify a file to open for IO.
+
+The BIO_CTRL_RETRY_TYPE needs a little more explanation.
+When performing non-blocking IO, or say reading on a memory
+BIO, when no data is present (or cannot be written),
+BIO_read() and/or BIO_write() will return -1.
+BIO_should_retry(bio) will return true if this is due to an
+IO condition rather than an actual error. In the case of
+BIO_s_mem(), a read when there is no data will return -1 and
+a should retry when there is more 'read' data.
+The retry type is deduced from 2 macros
+BIO_should_read(bio) and BIO_should_write(bio).
+Now while it may appear obvious that a BIO_read() failure
+should indicate that a retry should be performed when more
+read data is available, this is often not true when using
+things like an SSL BIO. During the SSL protocol startup
+multiple reads and writes are performed, triggered by any
+SSL_read or SSL_write.
+So to write code that will transparently handle either a
+socket or SSL BIO,
+ i=BIO_read(bio,..)
+ if (I == -1)
+ {
+ if (BIO_should_retry(bio))
+ {
+ if (BIO_should_read(bio))
+ {
+ /* call us again when BIO can be read */
+ }
+ if (BIO_should_write(bio))
+ {
+ /* call us again when BIO can be written */
+ }
+ }
+ }
+
+At this point in time only read and write conditions can be
+used but in the future I can see the situation for other
+conditions, specifically with SSL there could be a condition
+of a X509 certificate lookup taking place and so the non-
+blocking BIO_read would require a retry when the certificate
+lookup subsystem has finished it's lookup. This is all
+makes more sense and is easy to use in a event loop type
+setup.
+When using the SSL BIO, either SSL_read() or SSL_write()s
+can be called during the protocol startup and things will
+still work correctly.
+The nice aspect of the use of the BIO_should_retry() macro
+is that all the errno codes that indicate a non-fatal error
+are encapsulated in one place. The Windows specific error
+codes and WSAGetLastError() calls are also hidden from the
+application.
+
+Notes on each BIO method.
+Normally buffer.h is just required but depending on the
+BIO_METHOD, ssl.h or evp.h will also be required.
+
+BIO_METHOD *BIO_s_mem(void);
+- BIO_set_mem_buf(BIO *bio, BUF_MEM *bm, int close_flag) -
+ set the underlying BUF_MEM structure for the BIO to use.
+- BIO_get_mem_ptr(BIO *bio, char **pp) - if pp is not NULL,
+ set it to point to the memory array and return the number
+ of bytes available.
+A read/write BIO. Any data written is appended to the
+memory array and any read is read from the front. This BIO
+can be used for read/write at the same time. BIO_gets() is
+supported in the fgets() sense.
+BIO_CTRL_INFO can be used to retrieve pointers to the memory
+buffer and it's length.
+
+BIO_METHOD *BIO_s_file(void);
+- BIO_set_fp(BIO *bio, FILE *fp, int close_flag) - set 'FILE *' to use.
+- BIO_get_fp(BIO *bio, FILE **fp) - get the 'FILE *' in use.
+- BIO_read_filename(BIO *bio, char *name) - read from file.
+- BIO_write_filename(BIO *bio, char *name) - write to file.
+- BIO_append_filename(BIO *bio, char *name) - append to file.
+This BIO sits over the normal system fread()/fgets() type
+functions. Gets() is supported. This BIO in theory could be
+used for read and write but it is best to think of each BIO
+of this type as either a read or a write BIO, not both.
+
+BIO_METHOD *BIO_s_socket(void);
+BIO_METHOD *BIO_s_fd(void);
+- BIO_sock_should_retry(int i) - the underlying function
+ used to determine if a call should be retried; the
+ argument is the '0' or '-1' returned by the previous BIO
+ operation.
+- BIO_fd_should_retry(int i) - same as the
+- BIO_sock_should_retry() except that it is different internally.
+- BIO_set_fd(BIO *bio, int fd, int close_flag) - set the
+ file descriptor to use
+- BIO_get_fd(BIO *bio, int *fd) - get the file descriptor.
+These two methods are very similar. Gets() is not
+supported, if you want this functionality, put a
+BIO_f_buffer() onto it. This BIO is bi-directional if the
+underlying file descriptor is. This is normally the case
+for sockets but not the case for stdio descriptors.
+
+BIO_METHOD *BIO_s_null(void);
+Read and write as much data as you like, it all disappears
+into this BIO.
+
+BIO_METHOD *BIO_f_buffer(void);
+- BIO_get_buffer_num_lines(BIO *bio) - return the number of
+ complete lines in the buffer.
+- BIO_set_buffer_size(BIO *bio, long size) - set the size of
+ the buffers.
+This type performs input and output buffering. It performs
+both at the same time. The size of the buffer can be set
+via the set buffer size option. Data buffered for output is
+only written when the buffer fills.
+
+BIO_METHOD *BIO_f_ssl(void);
+- BIO_set_ssl(BIO *bio, SSL *ssl, int close_flag) - the SSL
+ structure to use.
+- BIO_get_ssl(BIO *bio, SSL **ssl) - get the SSL structure
+ in use.
+The SSL bio is a little different from normal BIOs because
+the underlying SSL structure is a little different. A SSL
+structure performs IO via a read and write BIO. These can
+be different and are normally set via the
+SSL_set_rbio()/SSL_set_wbio() calls. The SSL_set_fd() calls
+are just wrappers that create socket BIOs and then call
+SSL_set_bio() where the read and write BIOs are the same.
+The BIO_push() operation makes the SSLs IO BIOs the same, so
+make sure the BIO pushed is capable of two directional
+traffic. If it is not, you will have to install the BIOs
+via the more conventional SSL_set_bio() call. BIO_pop() will retrieve
+the 'SSL read' BIO.
+
+BIO_METHOD *BIO_f_md(void);
+- BIO_set_md(BIO *bio, EVP_MD *md) - set the message digest
+ to use.
+- BIO_get_md(BIO *bio, EVP_MD **mdp) - return the digest
+ method in use in mdp, return 0 if not set yet.
+- BIO_reset() reinitializes the digest (EVP_DigestInit())
+ and passes the reset to the underlying BIOs.
+All data read or written via BIO_read() or BIO_write() to
+this BIO will be added to the calculated digest. This
+implies that this BIO is only one directional. If read and
+write operations are performed, two separate BIO_f_md() BIOs
+are reuqired to generate digests on both the input and the
+output. BIO_gets(BIO *bio, char *md, int size) will place the
+generated digest into 'md' and return the number of bytes.
+The EVP_MAX_MD_SIZE should probably be used to size the 'md'
+array. Reading the digest will also reset it.
+
+BIO_METHOD *BIO_f_cipher(void);
+- BIO_reset() reinitializes the cipher.
+- BIO_flush() should be called when the last bytes have been
+ output to flush the final block of block ciphers.
+- BIO_get_cipher_status(BIO *b), when called after the last
+ read from a cipher BIO, returns non-zero if the data
+ decrypted correctly, otherwise, 0.
+- BIO_set_cipher(BIO *b, EVP_CIPHER *c, unsigned char *key,
+ unsigned char *iv, int encrypt) This function is used to
+ setup a cipher BIO. The length of key and iv are
+ specified by the choice of EVP_CIPHER. Encrypt is 1 to
+ encrypt and 0 to decrypt.
+
+BIO_METHOD *BIO_f_base64(void);
+- BIO_flush() should be called when the last bytes have been output.
+This BIO base64 encodes when writing and base64 decodes when
+reading. It will scan the input until a suitable begin line
+is found. After reading data, BIO_reset() will reset the
+BIO to start scanning again. Do not mix reading and writing
+on the same base64 BIO. It is meant as a single stream BIO.
+
+Directions type
+both BIO_s_mem()
+one/both BIO_s_file()
+both BIO_s_fd()
+both BIO_s_socket()
+both BIO_s_null()
+both BIO_f_buffer()
+one BIO_f_md()
+one BIO_f_cipher()
+one BIO_f_base64()
+both BIO_f_ssl()
+
+It is easy to mix one and two directional BIOs, all one has
+to do is to keep two separate BIO pointers for reading and
+writing and be careful about usage of underlying BIOs. The
+SSL bio by it's very nature has to be two directional but
+the BIO_push() command will push the one BIO into the SSL
+BIO for both reading and writing.
+
+The best example program to look at is apps/enc.c and/or perhaps apps/dgst.c.
+