From beb7de117d08845a5d039d4ad4ffab8f5bac61fe Mon Sep 17 00:00:00 2001 From: nigel Date: Sat, 24 Feb 2007 21:38:01 +0000 Subject: Load pcre-1.00 into code/trunk. git-svn-id: svn://vcs.exim.org/pcre/code/trunk@3 2f5784b3-3f2a-0410-8824-cb99058d5e15 --- pcreposix.3 | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 135 insertions(+) create mode 100644 pcreposix.3 (limited to 'pcreposix.3') diff --git a/pcreposix.3 b/pcreposix.3 new file mode 100644 index 0000000..2c907e7 --- /dev/null +++ b/pcreposix.3 @@ -0,0 +1,135 @@ +.TH PCRE 3 +.SH NAME +pcreposix - POSIX API for Perl-compatible regular expressions. +.SH SYNOPSIS +.B #include +.PP +.SM +.br +.B int regcomp(regex_t *\fIpreg\fR, const char *\fIpattern\fR, +.ti +5n +.B int \fIcflags\fR); +.PP +.br +.B int regexec(regex_t *\fIpreg\fR, const char *\fIstring\fR, +.ti +5n +.B size_t \fInmatch\fR, regmatch_t \fIpmatch\fR[], int \fIeflags\fR); +.PP +.br +.B size_t regerror(int \fIerrcode\fR, const regex_t *\fIpreg\fR, +.ti +5n +.B char *\fIerrbuf\fR, size_t \fIerrbuf_size\fR); +.PP +.br +.B void regfree(regex_t *\fIpreg\fR); + + +.SH DESCRIPTION +This set of functions provides a POSIX-style API to the PCRE regular expression +package. See \fBpcre (3)\fR for a description of the native API, which contains +additional functionality. The functions described here are just wrapper +functions that ultimately call the native API. + +As I am pretty ignorant about POSIX, these functions must be considered as +experimental. I have implemented only those option bits that can be reasonably +mapped to PCRE native options. Other POSIX options are not even defined. It may +be that it is useful to define, but ignore, other options. Feedback from more +knowledgeable folk may cause this kind of detail to change. + +When PCRE is called via these functions, it is only the API that is POSIX-like +in style. The syntax and semantics of the regular expressions themselves are +still those of Perl, subject to the setting of various PCRE options, as +described below. + +The header for these functions is supplied as \fBpcreposix.h\fR to avoid any +potential clash with other POSIX libraries. It can, of course, be renamed or +aliased as \fBregex.h\fR, which is the "correct" name. It provides two +structure types, \fIregex_t\fR for compiled internal forms, and +\fIregmatch_t\fR for returning captured substrings. It also defines some +constants whose names start with "REG_"; these are used for setting options and +identifying error codes. + + +.SH COMPILING A PATTERN + +The function \fBregcomp()\fR is called to compile a pattern into an +internal form. The pattern is a C string terminated by a binary zero, and +is passed in the argument \fIpattern\fR. The \fIpreg\fR argument is a pointer +to a regex_t structure which is used as a base for storing information about +the compiled expression. + +The argument \fIcflags\fR is either zero, or contains one or more of the bits +defined by the following macros: + + REG_ICASE + +The PCRE_CASELESS option is set when the expression is passed for compilation +to the native function. + + REG_NEWLINE + +The PCRE_MULTILINE option is set when the expression is passed for compilation +to the native function. + +The yield of \fBregcomp()\fR is zero on success, and non-zero otherwise. The +\fIpreg\fR structure is filled in on success, and one member of the structure +is publicized: \fIre_nsub\fR contains the number of capturing subpatterns in +the regular expression. Various error codes are defined in the header file. + + +.SH MATCHING A PATTERN +The function \fBregexec()\fR is called to match a pre-compiled pattern +\fIpreg\fR against a given \fIstring\fR, which is terminated by a zero byte, +subject to the options in \fIeflags\fR. These can be: + + REG_NOTBOL + +The PCRE_NOTBOL option is set when calling the underlying PCRE matching +function. + + REG_NOTEOL + +The PCRE_NOTEOL option is set when calling the underlying PCRE matching +function. + +The portion of the string that was matched, and also any captured substrings, +are returned via the \fIpmatch\fR argument, which points to an array of +\fInmatch\fR structures of type \fIregmatch_t\fR, containing the members +\fIrm_so\fR and \fIrm_eo\fR. These contain the offset to the first character of +each substring and the offset to the first character after the end of each +substring, respectively. The 0th element of the vector relates to the entire +portion of \fIstring\fR that was matched; subsequent elements relate to the +capturing subpatterns of the regular expression. Unused entries in the array +have both structure members set to -1. + +A successful match yields a zero return; various error codes are defined in the +header file, of which REG_NOMATCH is the "expected" failure code. + + +.SH ERROR MESSAGES +The \fBregerror()\fR function maps a non-zero errorcode from either +\fBregcomp\fR or \fBregexec\fR to a printable message. If \fIpreg\fR is not +NULL, the error should have arisen from the use of that structure. A message +terminated by a binary zero is placed in \fIerrbuf\fR. The length of the +message, including the zero, is limited to \fIerrbuf_size\fR. The yield of the +function is the size of buffer needed to hold the whole message. + + +.SH STORAGE +Compiling a regular expression causes memory to be allocated and associated +with the \fIpreg\fR structure. The function \fBregfree()\fR frees all such +memory, after which \fIpreg\fR may no longer be used as a compiled expression. + + +.SH AUTHOR +Philip Hazel +.br +University Computing Service, +.br +New Museums Site, +.br +Cambridge CB2 3QG, England. +.br +Phone: +44 1223 334714 + +Copyright (c) 1997 University of Cambridge. -- cgit v1.2.1