1 files changed, 0 insertions, 3302 deletions
diff --git a/ext/pcre/pcrelib/doc/pcre.txt b/ext/pcre/pcrelib/doc/pcre.txt
deleted file mode 100644
index 1ec5f2ca61..0000000000
--- a/ext/pcre/pcrelib/doc/pcre.txt
+++ /dev/null
@@ -1,3302 +0,0 @@
-This file contains a concatenation of the PCRE man pages, converted to plain
-text format for ease of searching with a text editor, or for use on systems
-that do not have a man page processor. The small individual files that give
-synopses of each function in the library have not been included. There are
-separate text files for the pcregrep and pcretest commands.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-DESCRIPTION
-
-     The PCRE library is a set of functions that implement  regu-
-     lar  expression  pattern  matching using the same syntax and
-     semantics as Perl, with just a few differences. The  current
-     implementation  of  PCRE  (release 4.x) corresponds approxi-
-     mately with Perl 5.8, including support  for  UTF-8  encoded
-     strings.    However,  this  support  has  to  be  explicitly
-     enabled; it is not the default.
-
-     PCRE is written in C and released as a C library. However, a
-     number  of  people  have  written wrappers and interfaces of
-     various kinds. A C++ class is included  in  these  contribu-
-     tions,  which  can  be found in the Contrib directory at the
-     primary FTP site, which is:
-
-     ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre
-
-     Details of exactly which Perl  regular  expression  features
-     are  and  are  not  supported  by PCRE are given in separate
-     documents. See the pcrepattern and pcrecompat pages.
-
-     Some features of PCRE can be included, excluded, or  changed
-     when  the library is built. The pcre_config() function makes
-     it possible for a client  to  discover  which  features  are
-     available.  Documentation  about  building  PCRE for various
-     operating systems can be found in the  README  file  in  the
-     source distribution.
-
-
-USER DOCUMENTATION
-
-     The user documentation for PCRE has been  split  up  into  a
-     number  of  different sections. In the "man" format, each of
-     these is a separate "man page". In the HTML format, each  is
-     a  separate  page,  linked from the index page. In the plain
-     text format, all the sections are concatenated, for ease  of
-     searching. The sections are as follows:
-
-       pcre              this document
-       pcreapi           details of PCRE's native API
-       pcrebuild         options for building PCRE
-       pcrecallout       details of the callout feature
-       pcrecompat        discussion of Perl compatibility
-       pcregrep          description of the pcregrep command
-       pcrepattern       syntax and semantics of supported
-                           regular expressions
-       pcreperform       discussion of performance issues
-       pcreposix         the POSIX-compatible API
-       pcresample        discussion of the sample program
-       pcretest          the pcretest testing command
-
-     In addition, in the "man" and HTML formats, there is a short
-     page  for  each  library function, listing its arguments and
-     results.
-
-
-LIMITATIONS
-
-     There are some size limitations in PCRE but it is hoped that
-     they will never in practice be relevant.
-
-     The maximum length of a  compiled  pattern  is  65539  (sic)
-     bytes  if PCRE is compiled with the default internal linkage
-     size of 2. If you want to process regular  expressions  that
-     are  truly  enormous,  you can compile PCRE with an internal
-     linkage size of 3 or 4 (see the README file  in  the  source
-     distribution  and  the pcrebuild documentation for details).
-     In these cases the limit is substantially larger.   However,
-     the speed of execution will be slower.
-
-     All values in repeating quantifiers must be less than 65536.
-     The maximum number of capturing subpatterns is 65535.
-
-     There is no limit to the  number  of  non-capturing  subpat-
-     terns,  but  the  maximum  depth  of nesting of all kinds of
-     parenthesized subpattern, including  capturing  subpatterns,
-     assertions, and other types of subpattern, is 200.
-
-     The maximum length of a subject string is the largest  posi-
-     tive number that an integer variable can hold. However, PCRE
-     uses recursion to handle subpatterns and indefinite  repeti-
-     tion.  This  means  that the available stack space may limit
-     the size of a subject string that can be processed  by  cer-
-     tain patterns.
-
-
-UTF-8 SUPPORT
-
-     Starting at release 3.3, PCRE has had some support for char-
-     acter  strings  encoded in the UTF-8 format. For release 4.0
-     this has been greatly extended to cover most common require-
-     ments.
-
-     In order to process UTF-8 strings, you must  build  PCRE  to
-     include  UTF-8  support  in  the code, and, in addition, you
-     must call pcre_compile() with  the  PCRE_UTF8  option  flag.
-     When  you  do this, both the pattern and any subject strings
-     that are matched against it are  treated  as  UTF-8  strings
-     instead of just strings of bytes.
-
-     If you compile PCRE with UTF-8 support, but do not use it at
-     run  time,  the  library will be a bit bigger, but the addi-
-     tional run time overhead is limited to testing the PCRE_UTF8
-     flag in several places, so should not be very large.
-
-     The following comments apply when PCRE is running  in  UTF-8
-     mode:
-
-     1. PCRE assumes that the strings it is given  contain  valid
-     UTF-8  codes. It does not diagnose invalid UTF-8 strings. If
-     you pass invalid UTF-8 strings  to  PCRE,  the  results  are
-     undefined.
-
-     2. In a pattern, the escape sequence \x{...}, where the con-
-     tents  of  the  braces is a string of hexadecimal digits, is
-     interpreted as a UTF-8 character whose code  number  is  the
-     given  hexadecimal  number, for example: \x{1234}. If a non-
-     hexadecimal digit appears between the braces,  the  item  is
-     not  recognized.  This escape sequence can be used either as
-     a literal, or within a character class.
-
-     3. The original hexadecimal escape sequence, \xhh, matches a
-     two-byte UTF-8 character if the value is greater than 127.
-
-     4. Repeat quantifiers apply to  complete  UTF-8  characters,
-     not to individual bytes, for example: \x{100}{3}.
-
-     5. The dot metacharacter matches one UTF-8 character instead
-     of a single byte.
-
-     6. The escape sequence \C can be used to match a single byte
-     in UTF-8 mode, but its use can lead to some strange effects.
-
-     7. The character escapes \b, \B, \d, \D, \s, \S, \w, and  \W
-     correctly test characters of any code value, but the charac-
-     ters that PCRE recognizes as digits, spaces, or word charac-
-     ters  remain  the  same  set as before, all with values less
-     than 256.
-
-     8. Case-insensitive  matching  applies  only  to  characters
-     whose  values  are  less than 256. PCRE does not support the
-     notion of "case" for higher-valued characters.
-
-     9. PCRE does not support the use of Unicode tables and  pro-
-     perties or the Perl escapes \p, \P, and \X.
-
-
-AUTHOR
-
-     Philip Hazel <ph10@cam.ac.uk>
-     University Computing Service,
-     Cambridge CB2 3QG, England.
-     Phone: +44 1223 334714
-
-Last updated: 04 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-PCRE BUILD-TIME OPTIONS
-
-     This document describes the optional features of  PCRE  that
-     can  be  selected when the library is compiled. They are all
-     selected, or deselected, by providing options to the config-
-     ure  script  which  is run before the make command. The com-
-     plete list of options  for  configure  (which  includes  the
-     standard  ones  such  as  the  selection of the installation
-     directory) can be obtained by running
-
-       ./configure --help
-
-     The following sections describe certain options whose  names
-     begin  with  --enable  or  --disable. These settings specify
-     changes to the defaults for the configure  command.  Because
-     of  the  way  that  configure  works, --enable and --disable
-     always come in pairs, so  the  complementary  option  always
-     exists  as  well, but as it specifies the default, it is not
-     described.
-
-
-UTF-8 SUPPORT
-
-     To build PCRE with support for UTF-8 character strings, add
-
-       --enable-utf8
-
-     to the configure command. Of itself, this does not make PCRE
-     treat  strings as UTF-8. As well as compiling PCRE with this
-     option, you also have to set the  PCRE_UTF8  option  when
-     you call the pcre_compile() function.
-
-
-CODE VALUE OF NEWLINE
-
-     By default, PCRE treats character 10 (linefeed) as the  new-
-     line  character.  This  is  the  normal newline character on
-     Unix-like systems. You can compile PCRE to use character  13
-     (carriage return) instead by adding
-
-       --enable-newline-is-cr
-
-     to the configure command. For completeness there is  also  a
-     --enable-newline-is-lf  option,  which  explicitly specifies
-     linefeed as the newline character.
-
-
-BUILDING SHARED AND STATIC LIBRARIES
-
-     The PCRE building process uses libtool to build both  shared
-     and  static  Unix libraries by default. You can suppress one
-     of these by adding one of
-
-       --disable-shared
-       --disable-static
-
-     to the configure command, as required.
-
-
-POSIX MALLOC USAGE
-
-     When PCRE is called through the  POSIX  interface  (see  the
-     pcreposix  documentation),  additional  working  storage  is
-     required for holding the pointers  to  capturing  substrings
-     because  PCRE requires three integers per substring, whereas
-     the POSIX interface provides only  two.  If  the  number  of
-     expected  substrings  is  small,  the  wrapper function uses
-     space on the stack, because this is faster than  using  mal-
-     loc()  for  each call. The default threshold above which the
-     stack is no longer used is 10; it can be changed by adding a
-     setting such as
-
-       --with-posix-malloc-threshold=20
-
-     to the configure command.
-
-
-LIMITING PCRE RESOURCE USAGE
-
-     Internally, PCRE has a  function  called  match()  which  it
-     calls  repeatedly  (possibly  recursively) when performing a
-     matching operation. By limiting the  number  of  times  this
-     function  may  be  called,  a  limit  can  be  placed on the
-     resources used by a single call to  pcre_exec().  The  limit
-     can  be  changed  at  run  time, as described in the pcreapi
-     documentation. The default is 10 million, but  this  can  be
-     changed by adding a setting such as
-
-       --with-match-limit=500000
-
-     to the configure command.
-
-
-HANDLING VERY LARGE PATTERNS
-
-     Within a compiled pattern, offset values are used  to  point
-     from  one  part  to  another  (for  example, from an opening
-     parenthesis to an  alternation  metacharacter).  By  default
-     two-byte  values  are  used  for these offsets, leading to a
-     maximum size for a compiled pattern of around 64K.  This  is
-     sufficient  to  handle  all  but the most gigantic patterns.
-     Nevertheless, some people do want to process  enormous  pat-
-     terns,  so  it is possible to compile PCRE to use three-byte
-     or four-byte offsets by adding a setting such as
-
-       --with-link-size=3
-
-     to the configure command. The value given must be 2,  3,  or
-     4.  Using  longer  offsets  slows down the operation of PCRE
-     because it has to load additional bytes when handling them.
-
-     If you build PCRE with an increased link size, test  2  (and
-     test 5 if you are using UTF-8) will fail. Part of the output
-     of these tests is a representation of the compiled  pattern,
-     and this changes with the link size.
-
-Last updated: 21 January 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-SYNOPSIS OF PCRE API
-
-     #include <pcre.h>
-
-     pcre *pcre_compile(const char *pattern, int options,
-          const char **errptr, int *erroffset,
-          const unsigned char *tableptr);
-
-     pcre_extra *pcre_study(const pcre *code, int options,
-          const char **errptr);
-
-     int pcre_exec(const pcre *code, const pcre_extra *extra,
-          const char *subject, int length, int startoffset,
-          int options, int *ovector, int ovecsize);
-
-     int pcre_copy_named_substring(const pcre *code,
-          const char *subject, int *ovector,
-          int stringcount, const char *stringname,
-          char *buffer, int buffersize);
-
-     int pcre_copy_substring(const char *subject, int *ovector,
-          int stringcount, int stringnumber, char *buffer,
-          int buffersize);
-
-     int pcre_get_named_substring(const pcre *code,
-          const char *subject, int *ovector,
-          int stringcount, const char *stringname,
-          const char **stringptr);
-
-     int pcre_get_stringnumber(const pcre *code,
-          const char *name);
-
-     int pcre_get_substring(const char *subject, int *ovector,
-          int stringcount, int stringnumber,
-          const char **stringptr);
-
-     int pcre_get_substring_list(const char *subject,
-          int *ovector, int stringcount, const char ***listptr);
-
-     void pcre_free_substring(const char *stringptr);
-
-     void pcre_free_substring_list(const char **stringptr);
-
-     const unsigned char *pcre_maketables(void);
-
-     int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
-          int what, void *where);
-
-
-     int pcre_info(const pcre *code, int *optptr,
-          int *firstcharptr);
-
-     int pcre_config(int what, void *where);
-
-     char *pcre_version(void);
-
-     void *(*pcre_malloc)(size_t);
-
-     void (*pcre_free)(void *);
-
-     int (*pcre_callout)(pcre_callout_block *);
-
-
-PCRE API
-
-     PCRE has its own native API,  which  is  described  in  this
-     document.  There  is  also  a  set of wrapper functions that
-     correspond to the POSIX regular expression API.   These  are
-     described in the pcreposix documentation.
-
-     The native API function prototypes are defined in the header
-     file  pcre.h,  and  on  Unix  systems  the library itself is
-     called libpcre.a, so can be accessed by adding -lpcre to the
-     command  for  linking  an  application  which  calls it. The
-     header file defines the macros PCRE_MAJOR and PCRE_MINOR  to
-     contain the major and minor release numbers for the library.
-     Applications can use these to include support for  different
-     releases.
-
-     The functions pcre_compile(), pcre_study(), and  pcre_exec()
-     are  used  for compiling and matching regular expressions. A
-     sample program that demonstrates the simplest way  of  using
-     them  is  given in the file pcredemo.c. The pcresample docu-
-     mentation describes how to run it.
-
-     There are convenience functions for extracting captured sub-
-     strings from a matched subject string. They are:
-
-       pcre_copy_substring()
-       pcre_copy_named_substring()
-       pcre_get_substring()
-       pcre_get_named_substring()
-       pcre_get_substring_list()
-
-     pcre_free_substring()  and  pcre_free_substring_list()   are
-     also  provided,  to  free  the  memory  used  for  extracted
-     strings.
-
-     The function pcre_maketables() is used (optionally) to build
-     a  set of character tables in the current locale for passing
-     to pcre_compile().
-
-     The function pcre_fullinfo() is used to find out information
-     about a compiled pattern; pcre_info() is an obsolete version
-     which returns only some of the available information, but is
-     retained   for   backwards   compatibility.    The  function
-     pcre_version() returns a pointer to a string containing  the
-     version of PCRE and its date of release.
-
-     The global variables  pcre_malloc  and  pcre_free  initially
-     contain the entry points of the standard malloc() and free()
-     functions respectively. PCRE  calls  the  memory  management
-     functions  via  these  variables,  so  a calling program can
-     replace them if it  wishes  to  intercept  the  calls.  This
-     should be done before calling any PCRE functions.
-
-     The global variable pcre_callout initially contains NULL. It
-     can be set by the caller to a "callout" function, which PCRE
-     will then call at specified points during a matching  opera-
-     tion. Details are given in the pcrecallout documentation.
-
-
-MULTITHREADING
-
-     The PCRE functions can be used in  multi-threading  applica-
-     tions, with the proviso that the memory management functions
-     pointed to by pcre_malloc and  pcre_free,  and  the  callout
-     function  pointed  to  by  pcre_callout,  are  shared by all
-     threads.
-
-     The compiled form of a regular  expression  is  not  altered
-     during  matching, so the same compiled pattern can safely be
-     used by several threads at once.
-
-
-CHECKING BUILD-TIME OPTIONS
-
-     int pcre_config(int what, void *where);
-
-     The function pcre_config() makes  it  possible  for  a  PCRE
-     client  to  discover  which optional features have been com-
-     piled into the PCRE library. The pcrebuild documentation has
-     more details about these optional features.
-
-     The first argument for pcre_config() is an integer, specify-
-     ing  which information is required; the second argument is a
-     pointer to a variable into which the information is  placed.
-     The following information is available:
-
-       PCRE_CONFIG_UTF8
-
-     The output is an integer that is set to one if UTF-8 support
-     is available; otherwise it is set to zero.
-
-       PCRE_CONFIG_NEWLINE
-
-     The output is an integer that is set to  the  value  of  the
-     code  that  is  used for the newline character. It is either
-     linefeed (10) or carriage return (13), and  should  normally
-     be the standard character for your operating system.
-
-       PCRE_CONFIG_LINK_SIZE
-
-     The output is an integer that contains the number  of  bytes
-     used  for  internal linkage in compiled regular expressions.
-     The value is 2, 3, or 4. Larger values allow larger  regular
-     expressions  to be compiled, at the expense of slower match-
-     ing. The default value of 2 is sufficient for  all  but  the
-     most  massive patterns, since it allows the compiled pattern
-     to be up to 64K in size.
-
-       PCRE_CONFIG_POSIX_MALLOC_THRESHOLD
-
-     The output is an integer that contains the  threshold  above
-     which  the POSIX interface uses malloc() for output vectors.
-     Further details are given in the pcreposix documentation.
-
-       PCRE_CONFIG_MATCH_LIMIT
-
-     The output is an integer that gives the  default  limit  for
-     the   number  of  internal  matching  function  calls  in  a
-     pcre_exec()  execution.  Further  details  are  given   with
-     pcre_exec() below.
-
-
-COMPILING A PATTERN
-
-     pcre *pcre_compile(const char *pattern, int options,
-          const char **errptr, int *erroffset,
-          const unsigned char *tableptr);
-
-     The function pcre_compile() is called to compile  a  pattern
-     into  an internal form. The pattern is a C string terminated
-     by a binary zero, and is passed in the argument  pattern.  A
-     pointer  to  a  single  block of memory that is obtained via
-     pcre_malloc is returned. This contains the compiled code and
-     related  data.  The  pcre  type  is defined for the returned
-     block; this is a typedef for a structure whose contents  are
-     not  externally  defined. It is up to the caller to free the
-     memory when it is no longer required.
-
-     Although the compiled code of a PCRE regex  is  relocatable,
-     that is, it does not depend on memory location, the complete
-     pcre data block is not fully relocatable,  because  it  con-
-     tains  a  copy of the tableptr argument, which is an address
-     (see below).
-
-     The options argument contains independent bits  that  affect
-     the  compilation.  It  should  be  zero  if  no  options are
-     required. Some of the options, in particular, those that are
-     compatible  with Perl, can also be set and unset from within
-     the pattern (see the detailed description of regular expres-
-     sions  in the pcrepattern documentation). For these options,
-     the contents of the options argument specify  their  initial
-     settings  at  the  start  of  compilation and execution. The
-     PCRE_ANCHORED option can be set at the time of  matching  as
-     well as at compile time.
-
-     If errptr is NULL, pcre_compile() returns NULL  immediately.
-     Otherwise, if compilation of a pattern fails, pcre_compile()
-     returns NULL, and sets the variable pointed to by errptr  to
-     point  to a textual error message. The offset from the start
-     of  the  pattern  to  the  character  where  the  error  was
-     discovered   is   placed  in  the  variable  pointed  to  by
-     erroffset, which must not be NULL. If it  is,  an  immediate
-     error is given.
-
-     If the final  argument,  tableptr,  is  NULL,  PCRE  uses  a
-     default  set  of character tables which are built when it is
-     compiled, using the default C  locale.  Otherwise,  tableptr
-     must  be  the result of a call to pcre_maketables(). See the
-     section on locale support below.
-
-     This code fragment shows a typical straightforward  call  to
-     pcre_compile():
-
-       pcre *re;
-       const char *error;
-       int erroffset;
-       re = pcre_compile(
-         "^A.*Z",          /* the pattern */
-         0,                /* default options */
-         &error,           /* for error message */
-         &erroffset,       /* for error offset */
-         NULL);            /* use default character tables */
-
-     The following option bits are defined:
-
-       PCRE_ANCHORED
-
-     If this bit is set, the pattern is forced to be  "anchored",
-     that is, it is constrained to match only at the first match-
-     ing point in the string which is being searched  (the  "sub-
-     ject string"). This effect can also be achieved by appropri-
-     ate constructs in the pattern itself, which is the only  way
-     to do it in Perl.
-
-       PCRE_CASELESS
-
-     If this bit is set, letters in the pattern match both  upper
-     and  lower  case  letters.  It  is  equivalent  to Perl's /i
-     option, and it can be changed within a  pattern  by  a  (?i)
-     option setting.
-
-       PCRE_DOLLAR_ENDONLY
-
-     If this bit is set, a dollar metacharacter  in  the  pattern
-     matches  only at the end of the subject string. Without this
-     option, a dollar also matches immediately before  the  final
-     character  if it is a newline (but not before any other new-
-     lines).  The  PCRE_DOLLAR_ENDONLY  option  is   ignored   if
-     PCRE_MULTILINE is set. There is no equivalent to this option
-     in Perl, and no way to set it within a pattern.
-
-       PCRE_DOTALL
-
-     If this bit is set,  a  dot  metacharacter  in  the  pattern
-     matches all characters, including newlines. Without it, new-
-     lines are excluded. This option is equivalent to  Perl's  /s
-     option,  and  it  can  be changed within a pattern by a (?s)
-     option setting. A negative class such as [^a] always matches
-     a  newline  character,  independent  of  the setting of this
-     option.
-
-       PCRE_EXTENDED
-
-     If this bit is set, whitespace data characters in  the  pat-
-     tern  are  totally  ignored  except when escaped or inside a
-     character class. Whitespace does not include the VT  charac-
-     ter  (code 11). In addition, characters between an unescaped
-     # outside a character class and the next newline  character,
-     inclusive, are also ignored. This is equivalent to Perl's /x
-     option, and it can be changed within a  pattern  by  a  (?x)
-     option setting.
-
-     This option makes it possible  to  include  comments  inside
-     complicated patterns.  Note, however, that this applies only
-     to data characters. Whitespace characters may  never  appear
-     within special character sequences in a pattern, for example
-     within the sequence (?( which introduces a conditional  sub-
-     pattern.
-
-       PCRE_EXTRA
-
-     This option was invented in  order  to  turn  on  additional
-     functionality of PCRE that is incompatible with Perl, but it
-     is currently of very little use. When set, any backslash  in
-     a  pattern  that is followed by a letter that has no special
-     meaning causes an error, thus reserving  these  combinations
-     for  future  expansion.  By default, as in Perl, a backslash
-     followed by a letter with no special meaning is treated as a
-     literal.  There  are at present no other features controlled
-     by this option. It can also be set by a (?X) option  setting
-     within a pattern.
-
-       PCRE_MULTILINE
-
-     By default, PCRE treats the subject string as consisting  of
-     a  single "line" of characters (even if it actually contains
-     several newlines). The "start  of  line"  metacharacter  (^)
-     matches  only  at the start of the string, while the "end of
-     line" metacharacter ($) matches  only  at  the  end  of  the
-     string,    or   before   a   terminating   newline   (unless
-     PCRE_DOLLAR_ENDONLY is set). This is the same as Perl.
-
-     When PCRE_MULTILINE is set, the "start of  line"  and  "end
-     of  line"  constructs match immediately following or immedi-
-     ately before any newline  in  the  subject  string,  respec-
-     tively,  as  well  as  at  the  very  start and end. This is
-     equivalent to Perl's /m option, and it can be changed within
-     a  pattern  by  a  (?m) option setting. If there are no "\n"
-     characters in a subject string, or no occurrences of ^ or  $
-     in a pattern, setting PCRE_MULTILINE has no effect.
-
-       PCRE_NO_AUTO_CAPTURE
-
-     If this option is set, it disables the use of numbered  cap-
-     turing  parentheses  in the pattern. Any opening parenthesis
-     that is not followed by ? behaves as if it were followed  by
-     ?:  but  named  parentheses  can still be used for capturing
-     (and they acquire numbers in the usual  way).  There  is  no
-     equivalent of this option in Perl.
-
-       PCRE_UNGREEDY
-
-     This option inverts the "greediness" of the  quantifiers  so
-     that  they  are  not greedy by default, but become greedy if
-     followed by "?". It is not compatible with Perl. It can also
-     be set by a (?U) option setting within the pattern.
-
-       PCRE_UTF8
-
-     This option causes PCRE to regard both the pattern  and  the
-     subject  as  strings  of UTF-8 characters instead of single-
-     byte character strings. However, it  is  available  only  if
-     PCRE  has  been  built to include UTF-8 support. If not, the
-     use of this option provokes an error. Details  of  how  this
-     option  changes  the behaviour of PCRE are given in the sec-
-     tion on UTF-8 support in the main pcre page.
-
-
-STUDYING A PATTERN
-
-     pcre_extra *pcre_study(const pcre *code, int options,
-          const char **errptr);
-
-     When a pattern is going to be  used  several  times,  it  is
-     worth  spending  more time analyzing it in order to speed up
-     the time taken for matching. The function pcre_study() takes
-     a  pointer  to  a compiled pattern as its first argument. If
-     studying the pattern produces  additional  information  that
-     will  help speed up matching, pcre_study() returns a pointer
-     to a pcre_extra block, in which the study_data field  points
-     to the results of the study.
-
-     The  returned  value  from  pcre_study()  can   be   passed
-     directly  to pcre_exec(). However, the pcre_extra block also
-     contains other fields that can be set by the  caller  before
-     the  block is passed; these are described below. If studying
-     the pattern does not  produce  any  additional  information,
-     pcre_study() returns NULL. In that circumstance, if the cal-
-     ling program wants to pass  some  of  the  other  fields  to
-     pcre_exec(), it must set up its own pcre_extra block.
-
-     The second argument contains option  bits.  At  present,  no
-     options  are  defined  for  pcre_study(),  and this argument
-     should always be zero.
-
-     The third argument for pcre_study()  is  a  pointer  for  an
-     error  message.  If  studying  succeeds  (even if no data is
-     returned), the variable it points to is set to NULL.  Other-
-     wise it points to a textual error message. You should there-
-     fore  test  the  error  pointer  for  NULL   after   calling
-     pcre_study(), to be sure that it has run successfully.
-
-     This is a typical call to pcre_study():
-
-       pcre_extra *pe;
-       pe = pcre_study(
-         re,             /* result of pcre_compile() */
-         0,              /* no options exist */
-         &error);        /* set to NULL or points to a message */
-
-     At present, studying a  pattern  is  useful  only  for  non-
-     anchored  patterns  that do not have a single fixed starting
-     character. A  bitmap  of  possible  starting  characters  is
-     created.
-
-
-LOCALE SUPPORT
-
-     PCRE handles caseless matching, and determines whether char-
-     acters  are  letters, digits, or whatever, by reference to a
-     set of tables. When running in UTF-8 mode, this applies only
-     to characters with codes less than 256. The library contains
-     a default set of tables that is created  in  the  default  C
-     locale  when  PCRE  is compiled. This is used when the final
-     argument of pcre_compile() is NULL, and  is  sufficient  for
-     many applications.
-
-     An alternative set of tables can, however, be supplied. Such
-     tables  are built by calling the pcre_maketables() function,
-     which has no arguments, in the relevant locale.  The  result
-     can  then be passed to pcre_compile() as often as necessary.
-     For example, to build and use tables  that  are  appropriate
-     for  the French locale (where accented characters with codes
-     greater than 128 are treated as letters), the following code
-     could be used:
-
-       setlocale(LC_CTYPE, "fr");
-       tables = pcre_maketables();
-       re = pcre_compile(..., tables);
-
-     The  tables  are  built  in  memory  that  is  obtained  via
-     pcre_malloc.  The  pointer that is passed to pcre_compile is
-     saved with the compiled pattern, and  the  same  tables  are
-     used via this pointer by pcre_study() and pcre_exec(). Thus,
-     for any single pattern, compilation, studying  and  matching
-     all happen in the same locale, but different patterns can be
-     compiled in different locales. It is the caller's  responsi-
-     bility  to  ensure  that  the  memory  containing the tables
-     remains available for as long as it is needed.
-
-
-INFORMATION ABOUT A PATTERN
-
-     int pcre_fullinfo(const pcre *code, const pcre_extra *extra,
-          int what, void *where);
-
-     The pcre_fullinfo() function  returns  information  about  a
-     compiled pattern. It replaces the obsolete pcre_info() func-
-     tion, which is nevertheless retained for backwards  compati-
-     bility (and is documented below).
-
-     The first argument for pcre_fullinfo() is a pointer  to  the
-     compiled  pattern.  The  second  argument  is  the result of
-     pcre_study(), or NULL if the pattern was  not  studied.  The
-     third  argument  specifies  which  piece  of  information is
-     required, and the fourth argument is a pointer to a variable
-     to  receive  the data. The yield of the function is zero for
-     success, or one of the following negative numbers:
-
-       PCRE_ERROR_NULL       the argument code was NULL
-                             the argument where was NULL
-       PCRE_ERROR_BADMAGIC   the "magic number" was not found
-       PCRE_ERROR_BADOPTION  the value of what was invalid
-
-     Here is a typical call of  pcre_fullinfo(),  to  obtain  the
-     length of the compiled pattern:
-
-       int rc;
-       size_t length;
-       rc = pcre_fullinfo(
-         re,               /* result of pcre_compile() */
-         pe,               /* result of pcre_study(), or NULL */
-         PCRE_INFO_SIZE,   /* what is required */
-         &length);         /* where to put the data */
-
-     The possible values for the third argument  are  defined  in
-     pcre.h, and are as follows:
-
-       PCRE_INFO_BACKREFMAX
-
-     Return the number of the highest back reference in the  pat-
-     tern.  The  fourth argument should point to an int variable.
-     Zero is returned if there are no back references.
-
-       PCRE_INFO_CAPTURECOUNT
-
-     Return the number of capturing subpatterns in  the  pattern.
-     The fourth argument should point to an int variable.
-
-       PCRE_INFO_FIRSTBYTE
-
-     Return information about  the  first  byte  of  any  matched
-     string,  for a non-anchored pattern. (This option used to be
-     called PCRE_INFO_FIRSTCHAR; the old name is still recognized
-     for backwards compatibility.)
-
-     If there is a fixed first byte, e.g. from a pattern such  as
-     (cat|cow|coyote),  it  is returned in the integer pointed to
-     by where. Otherwise, if either
-
-     (a) the pattern was compiled with the PCRE_MULTILINE option,
-     and every branch starts with "^", or
-
-     (b) every  branch  of  the  pattern  starts  with  ".*"  and
-     PCRE_DOTALL is not set (if it were set, the pattern would be
-     anchored),
-
-     -1 is returned, indicating that the pattern matches only  at
-     the  start  of  a subject string or after any newline within
-     the string. Otherwise -2 is returned. For anchored patterns,
-     -2 is returned.
-
-       PCRE_INFO_FIRSTTABLE
-
-     If the pattern was studied, and this resulted  in  the  con-
-     struction of a 256-bit table indicating a fixed set of bytes
-     for the first byte in any matching string, a pointer to  the
-     table  is  returned.  Otherwise NULL is returned. The fourth
-     argument should point to an unsigned char * variable.
-
-       PCRE_INFO_LASTLITERAL
-
-     Return the value of the rightmost  literal  byte  that  must
-     exist  in  any  matched  string, other than at its start, if
-     such a byte has been recorded. The  fourth  argument  should
-     point  to  an  int variable. If there is no such byte, -1 is
-     returned. For anchored patterns,  a  last  literal  byte  is
-     recorded  only  if  it follows something of variable length.
-     For example, for the pattern /^a\d+z\d+/ the returned  value
-     is "z", but for /^a\dz\d/ the returned value is -1.
-
-       PCRE_INFO_NAMECOUNT
-       PCRE_INFO_NAMEENTRYSIZE
-       PCRE_INFO_NAMETABLE
-
-     PCRE supports the use of named as well as numbered capturing
-     parentheses. The names are just an additional way of identi-
-     fying the parentheses,  which  still  acquire  a  number.  A
-     caller  that  wants  to extract data from a named subpattern
-     must convert the name to a number in  order  to  access  the
-     correct  pointers  in  the  output  vector  (described  with
-     pcre_exec() below). In order to do this, it must  first  use
-     these  three  values  to  obtain  the name-to-number mapping
-     table for the pattern.
-
-     The  map  consists  of  a  number  of  fixed-size   entries.
-     PCRE_INFO_NAMECOUNT   gives   the  number  of  entries,  and
-     PCRE_INFO_NAMEENTRYSIZE gives the size of each  entry;  both
-     of  these return an int value. The entry size depends on the
-     length of the longest name.  PCRE_INFO_NAMETABLE  returns  a
-     pointer to the first entry of the table (a pointer to char).
-     The first two bytes of each entry are the number of the cap-
-     turing parenthesis, most significant byte first. The rest of
-     the entry is the corresponding name,  zero  terminated.  The
-     names  are  in alphabetical order. For example, consider the
-     following pattern (assume PCRE_EXTENDED  is  set,  so  white
-     space - including newlines - is ignored):
-
-       (?P<date> (?P<year>(\d\d)?\d\d) -
-       (?P<month>\d\d) - (?P<day>\d\d) )
-
-     There are four named subpatterns,  so  the  table  has  four
-     entries,  and  each  entry in the table is eight bytes long.
-     The table is as follows, with non-printing  bytes  shown  in
-     hex, and undefined bytes shown as ??:
-
-       00 01 d  a  t  e  00 ??
-       00 05 d  a  y  00 ?? ??
-       00 04 m  o  n  t  h  00
-       00 02 y  e  a  r  00 ??
-
-     When writing code to extract data  from  named  subpatterns,
-     remember  that the length of each entry may be different for
-     each compiled pattern.
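-
-     As a minimal sketch of the lookup described above (not part
-     of PCRE itself), the following helper scans a table with
-     this layout. The name name_to_number() and its return value
-     of -1 for "not found" are assumptions made for illustration;
-     the table pointer, entry count and entry size are taken to
-     come from the three PCRE_INFO_NAME* requests.
-
```c
#include <string.h>

/* Look up `name` in a name table laid out as described in the text:
 * `count` entries of `entry_size` bytes each. Bytes 0-1 of an entry hold
 * the subpattern number (most significant byte first); the zero-terminated
 * name follows. Returns the number, or -1 if the name is not present
 * (the -1 convention belongs to this sketch, not to PCRE). */
int name_to_number(const char *table, int count, int entry_size,
                   const char *name)
{
    int i;
    for (i = 0; i < count; i++) {
        const unsigned char *entry =
            (const unsigned char *)table + (size_t)i * (size_t)entry_size;
        if (strcmp((const char *)entry + 2, name) == 0)
            return (entry[0] << 8) | entry[1];
    }
    return -1;
}
```
-
-     Because the entries are in alphabetical order, a binary
-     search could replace the linear scan for patterns with many
-     names. With the example table above, looking up "month"
-     yields 4.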
-
-       PCRE_INFO_OPTIONS
-
-     Return a copy of the options with which the pattern was com-
-     piled.  The fourth argument should point to an unsigned long
-     int variable. These option bits are those specified  in  the
-     call  to  pcre_compile(),  modified  by any top-level option
-     settings within the pattern itself.
-
-     A pattern is automatically anchored by PCRE if  all  of  its
-     top-level alternatives begin with one of the following:
-
-       ^     unless PCRE_MULTILINE is set
-       \A    always
-       \G    always
-       .*    if PCRE_DOTALL is set and there are no back
-               references to the subpattern in which .* appears
-
-     For such patterns, the  PCRE_ANCHORED  bit  is  set  in  the
-     options returned by pcre_fullinfo().
-
-       PCRE_INFO_SIZE
-
-     Return the size of the compiled pattern, that is, the  value
-     that  was  passed as the argument to pcre_malloc() when PCRE
-     was getting memory in which to place the compiled data.  The
-     fourth argument should point to a size_t variable.
-
-       PCRE_INFO_STUDYSIZE
-
-     Return the size  of  the  data  block  pointed  to  by   the
-     study_data  field  in a pcre_extra block. That is, it is the
-     value that was passed to pcre_malloc() when PCRE was getting
-     memory into which to place the data created by pcre_study().
-     The fourth argument should point to a size_t variable.
-
-
-OBSOLETE INFO FUNCTION
-
-     int pcre_info(const pcre *code, int *optptr,
-          int *firstcharptr);
-
-     The pcre_info() function is now obsolete because its  inter-
-     face  is  too  restrictive  to return all the available data
-     about  a  compiled  pattern.   New   programs   should   use
-     pcre_fullinfo()  instead.  The  yield  of pcre_info() is the
-     number of capturing subpatterns, or  one  of  the  following
-     negative numbers:
-
-       PCRE_ERROR_NULL       the argument code was NULL
-       PCRE_ERROR_BADMAGIC   the "magic number" was not found
-
-     If the optptr argument is not NULL, a copy  of  the  options
-     with which the pattern was compiled is placed in the integer
-     it points to (see PCRE_INFO_OPTIONS above).
-
-     If the pattern is not anchored and the firstcharptr argument
-     is  not  NULL, it is used to pass back information about the
-     first    character    of    any    matched    string    (see
-     PCRE_INFO_FIRSTBYTE above).
-
-
-MATCHING A PATTERN
-
-     int pcre_exec(const pcre *code, const pcre_extra *extra,
-          const char *subject, int length, int startoffset,
-          int options, int *ovector, int ovecsize);
-
-     The function pcre_exec() is called to match a subject string
-     against  a pre-compiled pattern, which is passed in the code
-     argument. If the pattern has been studied, the result of the
-     study should be passed in the extra argument.
-
-     Here is an example of a simple call to pcre_exec():
-
-       int rc;
-       int ovector[30];
-       rc = pcre_exec(
-         re,             /* result of pcre_compile() */
-         NULL,           /* we didn't study the pattern */
-         "some string",  /* the subject string */
-         11,             /* the length of the subject string */
-         0,              /* start at offset 0 in the subject */
-         0,              /* default options */
-         ovector,        /* vector for substring information */
-         30);            /* number of elements in the vector */
-
-     If the extra argument is  not  NULL,  it  must  point  to  a
-     pcre_extra  data  block.  The  pcre_study() function returns
-     such a block (when it doesn't return NULL), but you can also
-     create  one for yourself, and pass additional information in
-     it. The fields in the block are as follows:
-
-       unsigned long int flags;
-       void *study_data;
-       unsigned long int match_limit;
-       void *callout_data;
-
-     The flags field is a bitmap  that  specifies  which  of  the
-     other fields are set. The flag bits are:
-
-       PCRE_EXTRA_STUDY_DATA
-       PCRE_EXTRA_MATCH_LIMIT
-       PCRE_EXTRA_CALLOUT_DATA
-
-     Other flag bits should be set to zero. The study_data  field
-     is   set  in  the  pcre_extra  block  that  is  returned  by
-     pcre_study(), together with the appropriate  flag  bit.  You
-     should  not  set this yourself, but you can add to the block
-     by setting the other fields.
-
-     The match_limit field provides a means  of  preventing  PCRE
-     from  using  up a vast amount of resources when running pat-
-     terns that are not going to match, but  which  have  a  very
-     large  number  of  possibilities  in their search trees. The
-     classic example is the  use  of  nested  unlimited  repeats.
-     Internally,  PCRE  uses  a  function called match() which it
-     calls  repeatedly  (sometimes  recursively).  The  limit  is
-     imposed  on the number of times this function is called dur-
-     ing a match, which has the effect of limiting the amount  of
-     recursion and backtracking that can take place. For patterns
-     that are not anchored, the count starts from zero  for  each
-     position in the subject string.
-
-     The default limit for the library can be set  when  PCRE  is
-     built; unless a different value is chosen, it is 10 million,
-     which handles all but the most extreme cases. You can reduce
-     the limit by supplying pcre_exec() with a  pcre_extra  block
-     in  which  match_limit  is  set  to  a  smaller  value,  and
-     PCRE_EXTRA_MATCH_LIMIT  is  set  in  the flags field. If the
-     limit      is      exceeded,       pcre_exec()       returns
-     PCRE_ERROR_MATCHLIMIT.
-
-     The callout_data field is used in conjunction with the "cal-
-     lout"  feature,  which is described in the pcrecallout docu-
-     mentation.
-
-     The PCRE_ANCHORED option can be passed in the options  argu-
-     ment,   whose   unused   bits  must  be  zero.  This  limits
-     pcre_exec() to matching at the first matching position. How-
-     ever,  if  a  pattern  was  compiled  with PCRE_ANCHORED, or
-     turned out to be anchored by virtue of its contents, it can-
-     not be made unanchored at matching time.
-
-     There are also three further options that can be set only at
-     matching time:
-
-       PCRE_NOTBOL
-
-     The first character of the string is not the beginning of  a
-     line,  so  the  circumflex  metacharacter  should  not match
-     before it. Setting this without PCRE_MULTILINE  (at  compile
-     time) causes circumflex never to match.
-
-       PCRE_NOTEOL
-
-     The end of the string is not the end of a line, so the  dol-
-     lar  metacharacter should not match it nor (except in multi-
-     line mode) a newline immediately  before  it.  Setting  this
-     without PCRE_MULTILINE (at compile time) causes dollar never
-     to match.
-
-       PCRE_NOTEMPTY
-
-     An empty string is not considered to be  a  valid  match  if
-     this  option  is  set. If there are alternatives in the pat-
-     tern, they are tried. If  all  the  alternatives  match  the
-     empty  string,  the  entire match fails. For example, if the
-     pattern
-
-       a?b?
-
-     is applied to a string not beginning with  "a"  or  "b",  it
-     matches  the  empty string at the start of the subject. With
-     PCRE_NOTEMPTY set, this match is not valid, so PCRE searches
-     further into the string for occurrences of "a" or "b".
-
-     Perl has no direct equivalent of PCRE_NOTEMPTY, but it  does
-     make  a  special case of a pattern match of the empty string
-     within its split() function, and when using the /g modifier.
-     It  is possible to emulate Perl's behaviour after matching a
-     null string by first trying the  match  again  at  the  same
-     offset  with  PCRE_NOTEMPTY  set,  and then if that fails by
-     advancing the starting offset  (see  below)  and  trying  an
-     ordinary match again.
-
-     The subject string is passed to pcre_exec() as a pointer  in
-     subject,  a  length  in  length,  and  a  starting offset in
-     startoffset. Unlike the pattern  string,  the  subject  may
-     contain binary zero bytes. When the starting offset is zero,
-     the search for a match starts at the beginning of  the  sub-
-     ject, and this is by far the most common case.
-
-     If the pattern was compiled with the PCRE_UTF8  option,  the
-     subject  must  be  a sequence of bytes that is a valid UTF-8
-     string.  If  an  invalid  UTF-8  string  is  passed,  PCRE's
-     behaviour is not defined.
-
-     A non-zero starting offset  is  useful  when  searching  for
-     another  match  in  the  same subject by calling pcre_exec()
-     again after a previous success.  Setting startoffset differs
-     from  just  passing  over  a  shortened  string  and setting
-     PCRE_NOTBOL in the case of a pattern that  begins  with  any
-     kind of lookbehind. For example, consider the pattern
-
-       \Biss\B
-
-     which finds occurrences of "iss" in the middle of words. (\B
-     matches only if the current position in the subject is not a
-     word boundary.) When applied to the string "Mississippi" the
-     first  call  to  pcre_exec()  finds the first occurrence. If
-     pcre_exec() is called again with just the remainder  of  the
-     subject, namely "issippi", it does not match, because \B  is
-     always false at the start of the subject, which is deemed to
-     be  a  word  boundary. However, if pcre_exec() is passed the
-     entire string again, but with startoffset set to 4, it finds
-     the  second  occurrence  of "iss" because it is able to look
-     behind the starting point to discover that it is preceded by
-     a letter.
-
-     If a non-zero starting offset is passed when the pattern  is
-     anchored, one attempt to match at the given offset is tried.
-     This can only succeed if the pattern does  not  require  the
-     match to be at the start of the subject.
-
-     In general, a pattern matches a certain portion of the  sub-
-     ject,  and  in addition, further substrings from the subject
-     may be picked out by parts of  the  pattern.  Following  the
-     usage  in  Jeffrey Friedl's book, this is called "capturing"
-     in what follows, and the phrase  "capturing  subpattern"  is
-     used for a fragment of a pattern that picks out a substring.
-     PCRE supports several other kinds of  parenthesized  subpat-
-     tern that do not cause substrings to be captured.
-
-     Captured substrings are returned to the caller via a  vector
-     of  integer  offsets whose address is passed in ovector. The
-     number of elements in the vector is passed in ovecsize.  The
-     first two-thirds of the vector is used to pass back captured
-     substrings, each substring using a  pair  of  integers.  The
-     remaining  third  of  the  vector  is  used  as workspace by
-     pcre_exec() while matching capturing subpatterns, and is not
-     available for passing back information. The length passed in
-     ovecsize should always be a multiple of three. If it is not,
-     it is rounded down.
-
-     When a match has been successful, information about captured
-     substrings is returned in pairs of integers, starting at the
-     beginning of ovector, and continuing up to two-thirds of its
-     length  at  the  most. The first element of a pair is set to
-     the offset of the first character in a  substring,  and  the
-     second is set to the offset of the first character after the
-     end  of  a  substring.  The   first   pair,  ovector[0]  and
-     ovector[1], identify the portion of the subject string matched
-     by the entire pattern. The next pair is used for  the  first
-     capturing  subpattern,  and  so  on.  The  value returned by
-     pcre_exec() is the number of pairs that have  been  set.  If
-     there  are no capturing subpatterns, the return value from a
-     successful match is 1, indicating that just the  first  pair
-     of offsets has been set.
-
-     Some convenience functions are provided for  extracting  the
-     captured substrings as separate strings. These are described
-     in the following section.
-
-     It is possible for  capturing  subpattern  number  n+1   to
-     match  some  part  of  the subject when subpattern n has not
-     been used at all.  For  example,  if  the  string  "abc"  is
-     matched  against the pattern (a|(z))(bc) subpatterns 1 and 3
-     are matched, but 2 is not. When this  happens,  both  offset
-     values corresponding to the unused subpattern are set to -1.
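-
-     The pairing of offsets, and the -1 marker for an unused sub-
-     pattern, can be sketched with a small helper that works on
-     the vector alone. The function name copy_capture() and its
-     return codes are assumptions made for this illustration;
-     they are not the library's own substring functions.
-
```c
#include <string.h>

/* Copy captured substring number `n` into `buf` (capacity `buflen`) and
 * add a terminating zero. `rc` is the positive value returned by the
 * match call, that is, the number of offset pairs that were set.
 * Returns the substring length, -1 if there is no such substring or it
 * is unset, or -2 if the buffer is too small. These return codes are
 * this sketch's own, not PCRE's. */
int copy_capture(const char *subject, const int *ovector, int rc,
                 int n, char *buf, int buflen)
{
    int start, end, len;

    if (n < 0 || n >= rc)
        return -1;
    start = ovector[2 * n];        /* offset of first character */
    end = ovector[2 * n + 1];      /* offset just past the last character */
    if (start < 0 || end < start)
        return -1;                 /* unset subpattern: both offsets are -1 */
    len = end - start;
    if (len + 1 > buflen)
        return -2;
    memcpy(buf, subject + start, (size_t)len);
    buf[len] = '\0';
    return len;
}
```
-
-     For the "abc" example above, a vector beginning 0,3, 0,1,
-     -1,-1, 1,3 with a return value of 4 gives "a" for subpattern
-     1, "bc" for subpattern 3, and the unset result for subpat-
-     tern 2. These vector contents are written out by hand to
-     match the description, not captured from a run.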
-
-     If a capturing subpattern is matched repeatedly, it  is  the
-     last  portion  of  the  string  that  it  matched  that gets
-     returned.
-
-     If the vector is too small to hold  all  the  captured  sub-
-     strings,  it is used as far as possible (up to two-thirds of
-     its length), and the function returns a value  of  zero.  In
-     particular,  if  the  substring offsets are not of interest,
-     pcre_exec() may be called with ovector passed  as  NULL  and
-     ovecsize  as  zero.  However,  if  the pattern contains back
-     references and the ovector isn't big enough to remember  the
-     related  substrings,  PCRE  has to get additional memory for
-     use during matching. Thus it is usually advisable to  supply
-     an ovector.
-
-     Note that pcre_fullinfo() with PCRE_INFO_CAPTURECOUNT can be
-     used to find out how many capturing subpatterns there are in
-     a compiled pattern; the obsolete pcre_info() also yields this
-     count. The smallest size for ovector that will allow  for  n
-     captured substrings, in  addition  to  the  offsets  of  the
-     substring matched by the whole pattern, is (n+1)*3.
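-
-     The sizing arithmetic can be written down directly. This is
-     a sketch only; the helper names ovector_size() and
-     usable_pairs() are invented for illustration.
-
```c
/* Smallest ovector length (in ints) that can report the whole-match pair
 * plus `ncaptures` captured substrings, including the final third that
 * is used as workspace. */
int ovector_size(int ncaptures)
{
    return (ncaptures + 1) * 3;
}

/* Number of offset pairs a vector of `ovecsize` ints can report: the
 * size is rounded down to a multiple of three, two-thirds of that is
 * available for offsets, and each substring needs two of them. */
int usable_pairs(int ovecsize)
{
    return ovecsize / 3;
}
```
-
-     For the 30-element vector in the earlier pcre_exec() exam-
-     ple, usable_pairs(30) is 10: the whole match plus up to nine
-     captured substrings.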
-
-     If pcre_exec() fails, it returns a negative number. The fol-
-     lowing are defined in the header file:
-
-       PCRE_ERROR_NOMATCH        (-1)
-
-     The subject string did not match the pattern.
-
-       PCRE_ERROR_NULL           (-2)
-
-     Either code or subject was passed as NULL,  or  ovector  was
-     NULL and ovecsize was not zero.
-
-       PCRE_ERROR_BADOPTION      (-3)
-
-     An unrecognized bit was set in the options argument.
-
-       PCRE_ERROR_BADMAGIC       (-4)
-
-     PCRE stores a 4-byte "magic number" at the start of the com-
-     piled  code,  to  catch  the  case  when it is passed a junk
-     pointer. This is the error it gives when  the  magic  number
-     isn't present.
-
-       PCRE_ERROR_UNKNOWN_NODE   (-5)
-
-     While running the pattern match, an unknown item was encoun-
-     tered in the compiled pattern. This error could be caused by
-     a bug in PCRE or by overwriting of the compiled pattern.
-
-       PCRE_ERROR_NOMEMORY       (-6)
-
-     If a pattern contains back references, but the ovector  that
-     is  passed  to pcre_exec() is not big enough to remember the
-     referenced substrings, PCRE gets a block of  memory  at  the
-     start  of  matching to use for this purpose. If the call via
-     pcre_malloc() fails, this error  is  given.  The  memory  is
-     freed at the end of matching.
-
-       PCRE_ERROR_NOSUBSTRING    (-7)
-
-     This   error   is   used   by   the   pcre_copy_substring(),
-     pcre_get_substring(),  and  pcre_get_substring_list()  func-
-     tions (see below). It is never returned by pcre_exec().
-
-       PCRE_ERROR_MATCHLIMIT     (-8)
-
-     The recursion and backtracking limit, as  specified  by  the
-     match_limit  field  in a pcre_extra structure (or defaulted)
-     was reached. See the description above.
-
-       PCRE_ERROR_CALLOUT        (-9)
-
-     This error is never generated by pcre_exec() itself.  It  is
-     provided  for  use by callout functions that want to yield a
-     distinctive error code. See  the  pcrecallout  documentation
-     for details.
-
-
-EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
-
-     int pcre_copy_substring(const char *subject, int *ovector,
-          int stringcount, int stringnumber, char *buffer,
-          int buffersize);
-
-     int pcre_get_substring(const char *subject, int *ovector,
-          int stringcount, int stringnumber,
-          const char **stringptr);
-
-     int pcre_get_substring_list(const char *subject,
-          int *ovector, int stringcount, const char ***listptr);
-
-     Captured substrings can be accessed directly  by  using  the
-     offsets returned by pcre_exec() in ovector. For convenience,
-     the functions  pcre_copy_substring(),  pcre_get_substring(),
-     and  pcre_get_substring_list()  are  provided for extracting
-     captured  substrings  as  new,   separate,   zero-terminated
-     strings.  These functions identify substrings by number. The
-     next section describes functions for extracting  named  sub-
-     strings.   A  substring  that  contains  a  binary  zero  is
-     correctly extracted and has a further zero added on the end,
-     but the result is not, of course, a C string.
-
-     The first three arguments are the  same  for  all  three  of
-     these  functions:   subject  is the subject string which has
-     just been successfully matched, ovector is a pointer to  the
-     vector  of  integer  offsets that was passed to pcre_exec(),
-     and stringcount is the number of substrings that  were  cap-
-     tured by the match, including the substring that matched the
-     entire regular expression. This is  the  value  returned  by
-     pcre_exec() if  it  is  greater  than  zero.  If pcre_exec()
-     returned  zero,  indicating  that  it  ran  out of space  in
-     ovector, the value passed as stringcount should be the  size
-     of the vector divided by three.
-
-     The functions pcre_copy_substring() and pcre_get_substring()
-     extract a single substring, whose number is given as string-
-     number. A value of zero extracts the substring that  matched
-     the entire pattern, while higher values extract the captured
-     substrings. For pcre_copy_substring(), the string is  placed
-     in  buffer,  whose  length is given by buffersize, while for
-     pcre_get_substring() a new block of memory is  obtained  via
-     pcre_malloc,  and its address is returned via stringptr. The
-     yield of the function is  the  length  of  the  string,  not
-     including the terminating zero, or one of
-
-       PCRE_ERROR_NOMEMORY       (-6)
-
-     The buffer was too small for pcre_copy_substring(),  or  the
-     attempt to get memory failed for pcre_get_substring().
-
-       PCRE_ERROR_NOSUBSTRING    (-7)
-
-     There is no substring whose number is stringnumber.
-
-     The pcre_get_substring_list() function extracts  all  avail-
-     able  substrings  and builds a list of pointers to them. All
-     this is done in a single block of memory which  is  obtained
-     via pcre_malloc. The address of the memory block is returned
-     via listptr, which is also the start of the list  of  string
-     pointers.  The  end of the list is marked by a NULL pointer.
-     The yield of the function is zero if all went well, or
-
-       PCRE_ERROR_NOMEMORY       (-6)
-
-     if the attempt to get the memory block failed.
-
-     When any of these functions encounters a substring that  is
-     unset, which can happen when capturing subpattern number n+1
-     matches some part of the subject, but subpattern n  has  not
-     been  used  at all, they return an empty string. This can be
-     distinguished  from  a  genuine  zero-length  substring   by
-     inspecting the appropriate offset in ovector, which is nega-
-     tive for unset substrings.
-
-     The  two  convenience  functions  pcre_free_substring()  and
-     pcre_free_substring_list()  can  be  used to free the memory
-     returned by  a  previous  call  of  pcre_get_substring()  or
-     pcre_get_substring_list(),  respectively.  They  do  nothing
-     more than call the function pointed to by  pcre_free,  which
-     of  course  could  be called directly from a C program. How-
-     ever, PCRE is used in some situations where it is linked via
-     a  special  interface  to another programming language which
-     cannot use pcre_free directly; it is for  these  cases  that
-     the functions are provided.
-
-
-EXTRACTING CAPTURED SUBSTRINGS BY NAME
-
-     int pcre_copy_named_substring(const pcre *code,
-          const char *subject, int *ovector,
-          int stringcount, const char *stringname,
-          char *buffer, int buffersize);
-
-     int pcre_get_stringnumber(const pcre *code,
-          const char *name);
-
-     int pcre_get_named_substring(const pcre *code,
-          const char *subject, int *ovector,
-          int stringcount, const char *stringname,
-          const char **stringptr);
-
-     To extract a substring by name, you first have to  find  the
-     associated   number.   This   can   be   done   by   calling
-     pcre_get_stringnumber(). The first argument is the  compiled
-     pattern,  and  the second is the name. For example, for this
-     pattern
-
-       ab(?P<xxx>\d+)...
-
-     the number of the subpattern called "xxx" is  1.  Given  the
-     number,  you can then extract the substring directly, or use
-     one of the functions described in the previous section.  For
-     convenience,  there are also two functions that do the whole
-     job.
-
-     Most of the  arguments  of  pcre_copy_named_substring()  and
-     pcre_get_named_substring()  are  the  same  as those for the
-     functions that  extract  by  number,  and  so  are  not  re-
-     described here. There are just two differences.
-
-     First, instead of a substring number, a  substring  name  is
-     given.  Second,  there  is  an  extra argument, given at the
-     start, which is a pointer to the compiled pattern.  This  is
-     needed  in order to gain access to the name-to-number trans-
-     lation table.
-
-     These functions  call  pcre_get_stringnumber(),  and  if  it
-     succeeds,    they   then   call   pcre_copy_substring()   or
-     pcre_get_substring(), as appropriate.
-
-Last updated: 03 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-PCRE CALLOUTS
-
-     int (*pcre_callout)(pcre_callout_block *);
-
-     PCRE provides a feature called "callout", which is  a  means
-     of  temporarily passing control to the caller of PCRE in the
-     middle of pattern matching. The caller of PCRE  provides  an
-     external  function  by putting its entry point in the global
-     variable pcre_callout. By default,  this  variable  contains
-     NULL, which disables all calling out.
-
-     Within a regular expression, (?C) indicates  the  points  at
-     which  the external function is to be called. Different cal-
-     lout points can be identified by putting a number less  than
-     256  after  the  letter  C.  The default value is zero.  For
-     example, this pattern has two callout points:
-
-       (?C1)9abc(?C2)def
-
-     During matching, when PCRE  reaches  a  callout  point  (and
-     pcre_callout  is  set), the external function is called. Its
-     only argument is a pointer to  a  pcre_callout  block.  This
-     contains the following variables:
-
-       int          version;
-       int          callout_number;
-       int         *offset_vector;
-       const char  *subject;
-       int          subject_length;
-       int          start_match;
-       int          current_position;
-       int          capture_top;
-       int          capture_last;
-       void        *callout_data;
-
-     The version field  is  an  integer  containing  the  version
-     number of the block format. The current version is zero. The
-     version number may change in future if additional fields are
-     added,  but  the  intention  is  never  to remove any of the
-     existing fields.
-
-     The callout_number field contains the number of the callout,
-     as compiled into the pattern (that is, the number after ?C).
-
-     The offset_vector field  is  a  pointer  to  the  vector  of
-     offsets  that  was  passed by the caller to pcre_exec(). The
-     contents can be inspected in  order  to  extract  substrings
-     that  have  been  matched  so  far,  in  the same way as for
-     extracting substrings after a match has completed.
-
-     The subject and subject_length fields contain copies of  the
-     values that were passed to pcre_exec().
-
-     The start_match field contains the offset within the subject
-     at  which  the current match attempt started. If the pattern
-     is not anchored, the callout function may be called  several
-     times for different starting points.
-
-     The current_position field contains the  offset  within  the
-     subject of the current match pointer.
-
-     The capture_top field contains the  number  of  the  highest
-     captured substring so far.
-
-     The capture_last field  contains  the  number  of  the  most
-     recently captured substring.
-
-     The callout_data field contains a value that  is  passed  to
-     pcre_exec()  by  the  caller  specifically so that it can be
-     passed back in callouts. It is passed  in  the  callout_data
-     field  of the pcre_extra data structure. If no such data was
-     passed, the value of callout_data in a pcre_callout block is
-     NULL.  There is a description of the pcre_extra structure in
-     the pcreapi documentation.
-
-
-
-RETURN VALUES
-
-     The callout function returns an integer.  If  the  value  is
-     zero,  matching  proceeds as normal. If the value is greater
-     than zero, matching fails at the current  point,  but  back-
-     tracking  to test other possibilities goes ahead, just as if
-     a lookahead assertion had failed. If the value is less  than
-     zero,  the  match  is abandoned, and pcre_exec() returns the
-     value.
-
-     Negative values should normally be chosen from  the  set  of
-     PCRE_ERROR_xxx  values.  In  particular,  PCRE_ERROR_NOMATCH
-     forces a standard "no  match"  failure.   The  error  number
-     PCRE_ERROR_CALLOUT is reserved for use by callout functions;
-     it will never be used by PCRE itself.
-
-Last updated: 21 January 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-DIFFERENCES FROM PERL
-
-     This document describes the differences  in  the  ways  that
-     PCRE  and  Perl  handle regular expressions. The differences
-     described here are with respect to Perl 5.8.
-
-     1. PCRE does  not  allow  repeat  quantifiers  on  lookahead
-     assertions. Perl permits them, but they do not mean what you
-     might think. For example, (?!a){3} does not assert that  the
-     next  three characters are not "a". It just asserts that the
-     next character is not "a" three times.
-
-     2. Capturing subpatterns that occur inside  negative  looka-
-     head  assertions  are  counted,  but  their  entries  in the
-     offsets vector are never set. Perl sets its numerical  vari-
-     ables  from  any  such  patterns that are matched before the
-     assertion fails to match something (thereby succeeding), but
-     only  if  the negative lookahead assertion contains just one
-     branch.
-
-     3. Though binary zero characters are supported in  the  sub-
-     ject  string,  they  are  not  allowed  in  a pattern string
-     because it is passed as a normal  C  string,  terminated  by
-     zero. The escape sequence "\0" can be used in the pattern to
-     represent a binary zero.
-
-     4. The following Perl escape sequences  are  not  supported:
-     \l,  \u,  \L,  \U,  \P, \p, and \X. In fact these are imple-
-     mented by Perl's general string-handling and are not part of
-     its pattern matching engine. If any of these are encountered
-     by PCRE, an error is generated.
-
-     5. PCRE does support the \Q...\E  escape  for  quoting  sub-
-     strings. Characters in between are treated as literals. This
-     is slightly different from Perl in that $  and  @  are  also
-     handled  as  literals inside the quotes. In Perl, they cause
-     variable interpolation (but of course  PCRE  does  not  have
-     variables). Note the following examples:
-
-         Pattern            PCRE matches      Perl matches
-
-         \Qabc$xyz\E        abc$xyz           abc followed by the
-                                                contents of $xyz
-         \Qabc\$xyz\E       abc\$xyz          abc\$xyz
-         \Qabc\E\$\Qxyz\E   abc$xyz           abc$xyz
-
-     In PCRE, the \Q...\E sequence is recognized both inside  and
-     outside character classes.
-
-     8. Fairly obviously, PCRE does not support the (?{code}) and
-     (?p{code})  constructions. However, there is some experimen-
-     tal support for recursive patterns using the non-Perl  items
-     (?R),  (?number)  and  (?P>name).  Also,  the PCRE "callout"
-     feature allows an external function to be called during pat-
-     tern matching.
-
-     9. There are some differences that are  concerned  with  the
-     settings  of  captured  strings  when  part  of a pattern is
-     repeated. For example, matching "aba"  against  the  pattern
-     /^(a(b)?)+$/  in Perl leaves $2 unset, but in PCRE it is set
-     to "b".
-
-     10. PCRE  provides  some  extensions  to  the  Perl  regular
-     expression facilities:
-
-     (a) Although lookbehind assertions must match  fixed  length
-     strings,  each  alternative branch of a lookbehind assertion
-     can match a different length of string. Perl  requires  them
-     all to have the same length.
-
-     (b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is  not
-     set,  the  $  meta-character matches only at the very end of
-     the string.
-
-     (c) If PCRE_EXTRA is set, a backslash followed by  a  letter
-     with no special meaning is faulted.
-
-     (d) If PCRE_UNGREEDY is set, the greediness of  the  repeti-
-     tion  quantifiers  is inverted, that is, by default they are
-     not greedy, but if followed by a question mark they are.
-
-     (e) PCRE_ANCHORED can be used to force a pattern to be tried
-     only at the first matching position in the subject string.
-
-     (f) The PCRE_NOTBOL, PCRE_NOTEOL, and PCRE_NOTEMPTY  options
-     for  pcre_exec()  and  the  PCRE_NO_AUTO_CAPTURE  option for
-     pcre_compile() have no Perl equivalents.
-
-     (g) The (?R), (?number), and (?P>name) constructs allow  for
-     recursive  pattern  matching  (Perl  can  do  this using the
-     (?p{code}) construct, which PCRE cannot support).
-
-     (h) PCRE supports  named  capturing  substrings,  using  the
-     Python syntax.
-
-     (i) PCRE supports the  possessive  quantifier  "++"  syntax,
-     taken from Sun's Java package.
-
-     (j) The (R) condition, for  testing  recursion,  is  a  PCRE
-     extension.
-
-     (k) The callout facility is PCRE-specific.
-
-Last updated: 03 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-PCRE REGULAR EXPRESSION DETAILS
-
-     The syntax and semantics of  the  regular  expressions  sup-
-     ported  by PCRE are described below. Regular expressions are
-     also described in the Perl documentation and in a number  of
-     other  books,  some  of which have copious examples. Jeffrey
-     Friedl's  "Mastering  Regular  Expressions",  published   by
-     O'Reilly,  covers them in great detail. The description here
-     is intended as reference documentation.
-
-     The basic operation of PCRE is on strings of bytes. However,
-     there  is  also  support for UTF-8 character strings. To use
-     this support you must build PCRE to include  UTF-8  support,
-     and  then call pcre_compile() with the PCRE_UTF8 option. How
-     this affects the pattern matching is  mentioned  in  several
-     places  below.  There is also a summary of UTF-8 features in
-     the section on UTF-8 support in the main pcre page.
-
-     A regular expression is a pattern that is matched against  a
-     subject string from left to right. Most characters stand for
-     themselves in a pattern, and match the corresponding charac-
-     ters in the subject. As a trivial example, the pattern
-
-       The quick brown fox
-
-     matches a portion of a subject string that is  identical  to
-     itself.  The  power  of  regular  expressions comes from the
-     ability to include alternatives and repetitions in the  pat-
-     tern.  These  are encoded in the pattern by the use of meta-
-     characters, which do not stand for  themselves  but  instead
-     are interpreted in some special way.
-
-     There are two different sets of meta-characters: those  that
-     are  recognized anywhere in the pattern except within square
-     brackets, and those that are recognized in square  brackets.
-     Outside square brackets, the meta-characters are as follows:
-
-       \      general escape character with several uses
-       ^      assert start of string (or line, in multiline mode)
-       $      assert end of string (or line, in multiline mode)
-       .      match any character except newline (by default)
-       [      start character class definition
-       |      start of alternative branch
-       (      start subpattern
-       )      end subpattern
-       ?      extends the meaning of (
-              also 0 or 1 quantifier
-              also quantifier minimizer
-       *      0 or more quantifier
-       +      1 or more quantifier
-              also "possessive quantifier"
-       {      start min/max quantifier
-
-     Part of a pattern that is in square  brackets  is  called  a
-     "character  class".  In  a  character  class  the only meta-
-     characters are:
-
-       \      general escape character
-       ^      negate the class, but only if the first character
-       -      indicates character range
-       [      POSIX character class (only if followed by POSIX
-                syntax)
-       ]      terminates the character class
-
-     The following sections describe  the  use  of  each  of  the
-     meta-characters.
-
-
-BACKSLASH
-
-     The backslash character has several uses. Firstly, if it  is
-     followed  by  a  non-alphameric character, it takes away any
-     special  meaning  that  character  may  have.  This  use  of
-     backslash  as  an  escape  character applies both inside and
-     outside character classes.
-
-     For example, if you want to match a * character,  you  write
-     \*  in the pattern.  This escaping action applies whether or
-     not the following character would otherwise  be  interpreted
-     as  a meta-character, so it is always safe to precede a non-
-     alphameric with backslash to  specify  that  it  stands  for
-     itself. In particular, if you want to match a backslash, you
-     write \\.
-
-     If a pattern is compiled with the PCRE_EXTENDED option, whi-
-     tespace in the pattern (other than in a character class) and
-     characters between a # outside a  character  class  and  the
-     next  newline  character  are ignored. An escaping backslash
-     can be used to include a whitespace or # character  as  part
-     of the pattern.
-
-     If you want to remove the special meaning from a sequence of
-     characters, you can do so by putting them between \Q and \E.
-     This is different from Perl in that $ and @ are  handled  as
-     literals  in  \Q...\E  sequences in PCRE, whereas in Perl, $
-     and @ cause variable interpolation. Note the following exam-
-     ples:
-
-       Pattern            PCRE matches   Perl matches
-
-       \Qabc$xyz\E        abc$xyz        abc followed by the
-                                           contents of $xyz
-       \Qabc\$xyz\E       abc\$xyz       abc\$xyz
-       \Qabc\E\$\Qxyz\E   abc$xyz        abc$xyz
-
-     The \Q...\E sequence is recognized both inside  and  outside
-     character classes.
-
-     A second use of backslash provides a way  of  encoding  non-
-     printing  characters  in patterns in a visible manner. There
-     is no restriction on the appearance of non-printing  charac-
-     ters,  apart from the binary zero that terminates a pattern,
-     but when a pattern is being prepared by text editing, it  is
-     usually  easier to use one of the following escape sequences
-     than the binary character it represents:
-
-       \a        alarm, that is, the BEL character (hex 07)
-       \cx       "control-x", where x is any character
-       \e        escape (hex 1B)
-       \f        formfeed (hex 0C)
-       \n        newline (hex 0A)
-       \r        carriage return (hex 0D)
-       \t        tab (hex 09)
-       \ddd      character with octal code ddd, or backreference
-       \xhh      character with hex code hh
-       \x{hhh..} character with hex code hhh... (UTF-8 mode only)
-
-     The precise effect of \cx is as follows: if  x  is  a  lower
-     case  letter,  it  is converted to upper case. Then bit 6 of
-     the character (hex 40) is inverted.  Thus  \cz  becomes  hex
-     1A, but \c{ becomes hex 3B, while \c; becomes hex 7B.
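-
-     The rule can be sketched in C as follows. This is only an
-     illustration of the description above, not PCRE's own code:
-     the function name control_escape is invented for the
-     example, and ASCII character codes are assumed.
-
```c
#include <ctype.h>

/* Hypothetical helper: returns the byte that \cx stands for. */
int control_escape(int x)
{
    /* A lower case letter is first converted to upper case. */
    if (islower((unsigned char)x))
        x = toupper((unsigned char)x);
    /* Then bit 6 of the character (hex 40) is inverted. */
    return x ^ 0x40;
}
```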
-
-     After \x, from zero  to  two  hexadecimal  digits  are  read
-     (letters  can be in upper or lower case). In UTF-8 mode, any
-     number of hexadecimal digits may appear between \x{  and  },
-     but  the value of the character code must be less than 2**31
-     (that is, the maximum hexadecimal  value  is  7FFFFFFF).  If
-     characters  other than hexadecimal digits appear between \x{
-     and }, or if there is no terminating }, this form of  escape
-     is  not  recognized.  Instead, the initial \x will be inter-
-     preted as a basic  hexadecimal  escape,  with  no  following
-     digits, giving a byte whose value is zero.
-
-     Characters whose value is less than 256 can  be  defined  by
-     either  of  the  two  syntaxes  for \x when PCRE is in UTF-8
-     mode. There is no difference in the way  they  are  handled.
-     For example, \xdc is exactly the same as \x{dc}.
-
-     After \0 up to two further octal digits are  read.  In  both
-     cases,  if  there are fewer than two digits, just those that
-     are present are used. Thus the  sequence  \0\x\07  specifies
-     two binary zeros followed by a BEL character (code value 7).
-     Make sure you supply two digits after the  initial  zero  if
-     the character that follows is itself an octal digit.
-
-     The handling of a backslash followed by a digit other than 0
-     is  complicated.   Outside  a character class, PCRE reads it
-     and any following digits as a decimal number. If the  number
-     is  less  than  10, or if there have been at least that many
-     previous capturing left parentheses in the  expression,  the
-     entire  sequence is taken as a back reference. A description
-     of how this works is given later, following  the  discussion
-     of parenthesized subpatterns.
-
-     Inside a character  class,  or  if  the  decimal  number  is
-     greater  than  9 and there have not been that many capturing
-     subpatterns, PCRE re-reads up to three octal digits  follow-
-     ing  the  backslash,  and  generates  a single byte from the
-     least significant 8 bits of the value. Any subsequent digits
-     stand for themselves.  For example:
-
-       \040   is another way of writing a space
-       \40    is the same, provided there are fewer than 40
-                 previous capturing subpatterns
-       \7     is always a back reference
-       \11    might be a back reference, or another way of
-                 writing a tab
-       \011   is always a tab
-       \0113  is a tab followed by the character "3"
-       \113   might be a back reference, otherwise the
-                 character with octal code 113
-       \377   might be a back reference, otherwise
-                 the byte consisting entirely of 1 bits
-       \81    is either a back reference, or a binary zero
-                 followed by the two characters "8" and "1"
-
-     Note that octal values of 100 or greater must not be  intro-
-     duced  by  a  leading zero, because no more than three octal
-     digits are ever read.
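-
-     The decision described above can be sketched in C. This is
-     an illustration only, not PCRE's implementation. The names
-     digit_escape and classify_digit_escape are invented here;
-     "digits" points just after the backslash, and capture_count
-     is assumed to be the number of capturing left parentheses
-     seen so far.
-
```c
#include <ctype.h>

/* Hypothetical result type for this sketch. */
typedef struct {
    int is_backref;  /* 1 for a back reference, 0 for a byte value */
    int value;       /* group number, or the byte value */
    int consumed;    /* how many digits were used */
} digit_escape;

digit_escape classify_digit_escape(const char *digits,
                                   int capture_count, int in_class)
{
    digit_escape r = {0, 0, 0};
    int i = 0;

    /* Outside a class, a first digit other than 0 is read as decimal. */
    if (!in_class && digits[0] != '0') {
        int n = 0;
        while (isdigit((unsigned char)digits[i])) {
            n = n * 10 + (digits[i] - '0');
            if (n > 65535) n = 65536;   /* clamp to avoid overflow */
            i++;
        }
        if (n < 10 || n <= capture_count) {
            r.is_backref = 1;
            r.value = n;
            r.consumed = i;
            return r;
        }
    }

    /* Otherwise re-read up to three octal digits; keep the low 8 bits. */
    for (i = 0; i < 3 && digits[i] >= '0' && digits[i] <= '7'; i++)
        r.value = r.value * 8 + (digits[i] - '0');
    r.value &= 0xFF;
    r.consumed = i;
    return r;
}
```
-
-     A consumed count of zero, as for \81 with too few capturing
-     subpatterns, corresponds to the "binary zero followed by the
-     characters themselves" case in the table above.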
-
-     All the sequences that define a single byte value or a  sin-
-     gle  UTF-8 character (in UTF-8 mode) can be used both inside
-     and outside character classes. In addition, inside a charac-
-     ter  class,  the sequence \b is interpreted as the backspace
-     character (hex 08). Outside a character class it has a  dif-
-     ferent meaning (see below).
-
-     The third use of backslash is for specifying generic charac-
-     ter types:
-
-       \d     any decimal digit
-       \D     any character that is not a decimal digit
-       \s     any whitespace character
-       \S     any character that is not a whitespace character
-       \w     any "word" character
-       \W     any "non-word" character
-
-     Each pair of escape sequences partitions the complete set of
-     characters  into  two  disjoint  sets.  Any  given character
-     matches one, and only one, of each pair.
-
-     In UTF-8 mode, characters with values greater than 255 never
-     match \d, \s, or \w, and always match \D, \S, and \W.
-
-     For compatibility with Perl, \s does not match the VT
-     character (code 11). This makes it different from the POSIX
-     "space" class. The \s characters are HT  (9),  LF  (10),  FF
-     (12), CR (13), and space (32).
-
-     A "word" character is any letter or digit or the  underscore
-     character,  that  is,  any  character which can be part of a
-     Perl "word". The definition of letters and  digits  is  con-
-     trolled  by PCRE's character tables, and may vary if locale-
-     specific matching is taking place (see "Locale  support"  in
-     the pcreapi page). For example, in the "fr" (French) locale,
-     some character codes greater than 128 are used for  accented
-     letters, and these are matched by \w.
-
-     These character type sequences can appear  both  inside  and
-     outside  character classes. They each match one character of
-     the appropriate type. If the current matching  point  is  at
-     the end of the subject string, all of them fail, since there
-     is no character to match.
-
-     The fourth use of backslash is  for  certain  simple  asser-
-     tions. An assertion specifies a condition that has to be met
-     at a particular point in  a  match,  without  consuming  any
-     characters  from  the subject string. The use of subpatterns
-     for more complicated  assertions  is  described  below.  The
-     backslashed assertions are
-
-       \b     matches at a word boundary
-       \B     matches when not at a word boundary
-       \A     matches at start of subject
-       \Z     matches at end of subject or before newline at end
-       \z     matches at end of subject
-       \G     matches at first matching position in subject
-
-     These assertions may not appear in  character  classes  (but
-     note  that  \b has a different meaning, namely the backspace
-     character, inside a character class).
-
-     A word boundary is a position in the  subject  string  where
-     the current character and the previous character do not both
-     match \w or \W (i.e. one matches \w and  the  other  matches
-     \W),  or the start or end of the string if the first or last
-     character matches \w, respectively.
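-
-     As a rough sketch of this definition (not PCRE's code), the
-     test can be written in C as below. The function names are
-     invented for the example, and "word" characters are assumed
-     to be those of the current C locale rather than PCRE's own
-     character tables.
-
```c
#include <ctype.h>
#include <stddef.h>

/* Assumption: "word" characters are letters, digits and underscore,
   as classified by the current C locale. */
int is_word_char(int c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Returns 1 if offset pos (0..len) in subject is a word boundary. */
int is_word_boundary(const char *subject, size_t len, size_t pos)
{
    /* A missing neighbour at either end counts as a non-word character. */
    int prev = pos > 0   && is_word_char(subject[pos - 1]);
    int cur  = pos < len && is_word_char(subject[pos]);
    return prev != cur;
}
```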
-
-     The \A, \Z, and \z assertions differ  from  the  traditional
-     circumflex  and  dollar  (described below) in that they only
-     ever match at the very start and end of the subject  string,
-     whatever options are set. Thus, they are independent of mul-
-     tiline mode.
-
-     They are not affected  by  the  PCRE_NOTBOL  or  PCRE_NOTEOL
-     options.  If the startoffset argument of pcre_exec() is non-
-     zero, indicating that matching is to start at a point  other
-     than  the  beginning of the subject, \A can never match. The
-     difference between \Z and \z is that  \Z  matches  before  a
-     newline  that is the last character of the string as well as
-     at the end of the string, whereas \z  matches  only  at  the
-     end.
-
-     The \G assertion is true  only  when  the  current  matching
-     position is at the start point of the match, as specified by
-     the startoffset argument of pcre_exec(). It differs from  \A
-     when  the  value  of  startoffset  is  non-zero.  By calling
-     pcre_exec() multiple times with appropriate  arguments,  you
-     can mimic Perl's /g option, and it is in this kind of imple-
-     mentation where \G can be useful.
-
-     Note, however, that PCRE's  interpretation  of  \G,  as  the
-     start of the current match, is subtly different from Perl's,
-     which defines it as the end of the previous match. In  Perl,
-     these  can  be  different when the previously matched string
-     was empty. Because PCRE does just one match at  a  time,  it
-     cannot reproduce this behaviour.
-
-     If all the alternatives of a  pattern  begin  with  \G,  the
-     expression  is  anchored to the starting match position, and
-     the "anchored" flag is set in the compiled  regular  expres-
-     sion.
-
-
-CIRCUMFLEX AND DOLLAR
-
-     Outside a character class, in the default matching mode, the
-     circumflex  character  is an assertion which is true only if
-     the current matching point is at the start  of  the  subject
-     string.  If  the startoffset argument of pcre_exec() is non-
-     zero, circumflex  can  never  match  if  the  PCRE_MULTILINE
-     option is unset. Inside a character class, circumflex has an
-     entirely different meaning (see below).
-
-     Circumflex need not be the first character of the pattern if
-     a  number of alternatives are involved, but it should be the
-     first thing in each alternative in which it appears  if  the
-     pattern is ever to match that branch. If all possible alter-
-     natives start with a circumflex, that is, if the pattern  is
-     constrained to match only at the start of the subject, it is
-     said to be an "anchored" pattern. (There are also other con-
-     structs that can cause a pattern to be anchored.)
-
-     A dollar character is an assertion which is true only if the
-     current  matching point is at the end of the subject string,
-     or immediately before a newline character that is  the  last
-     character in the string (by default). Dollar need not be the
-     last character of the pattern if a  number  of  alternatives
-     are  involved,  but it should be the last item in any branch
-     in which it appears.  Dollar has no  special  meaning  in  a
-     character class.
-
-     The meaning of dollar can be changed so that it matches only
-     at   the   very   end   of   the   string,  by  setting  the
-     PCRE_DOLLAR_ENDONLY option at compile time.  This  does  not
-     affect the \Z assertion.
-
-     The meanings of the circumflex  and  dollar  characters  are
-     changed  if  the  PCRE_MULTILINE option is set. When this is
-     the case,  they  match  immediately  after  and  immediately
-     before an internal newline character, respectively, in addi-
-     tion to matching at the start and end of the subject string.
-     For  example, the pattern /^abc$/ matches the subject string
-     "def\nabc" in multiline  mode,  but  not  otherwise.  Conse-
-     quently,  patterns  that  are  anchored  in single line mode
-     because all branches start with ^ are not anchored in multi-
-     line  mode,  and a match for circumflex is possible when the
-     startoffset  argument  of  pcre_exec()  is   non-zero.   The
-     PCRE_DOLLAR_ENDONLY  option  is ignored if PCRE_MULTILINE is
-     set.
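-
-     The rules for dollar can be summarized by the following C
-     sketch. It is an illustration under stated assumptions, not
-     PCRE's implementation: dollar_matches is an invented name,
-     pos must not exceed len, and the two int arguments are plain
-     flags standing in for PCRE_MULTILINE and
-     PCRE_DOLLAR_ENDONLY.
-
```c
#include <stddef.h>

/* Returns 1 if a dollar assertion would be true at offset pos. */
int dollar_matches(const char *subject, size_t len, size_t pos,
                   int multiline, int dollar_endonly)
{
    if (pos == len)
        return 1;               /* very end of the subject */
    if (subject[pos] != '\n')
        return 0;               /* not immediately before a newline */
    if (multiline)
        return 1;               /* before any newline; ENDONLY is ignored */
    if (dollar_endonly)
        return 0;               /* only the very end counts */
    return pos + 1 == len;      /* newline must be the last character */
}
```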
-
-     Note that the sequences \A, \Z, and \z can be used to  match
-     the  start  and end of the subject in both modes, and if all
-     branches of a pattern start with \A it is  always  anchored,
-     whether PCRE_MULTILINE is set or not.
-
-
-FULL STOP (PERIOD, DOT)
-
-     Outside a character class, a dot in the pattern matches  any
-     one character in the subject, including a non-printing char-
-     acter, but not (by default) newline.  In UTF-8 mode,  a  dot
-     matches  any  UTF-8  character, which might be more than one
-     byte  long,  except  (by  default)  for  newline.   If   the
-     PCRE_DOTALL  option is set, dots match newlines as well. The
-     handling of dot is entirely independent of the  handling  of
-     circumflex and dollar, the only relationship being that they
-     both involve newline characters. Dot has no special  meaning
-     in a character class.
-
-
-
-MATCHING A SINGLE BYTE
-
-     Outside a character class, the escape  sequence  \C  matches
-     any  one  byte, both in and out of UTF-8 mode. Unlike a dot,
-     it always matches a newline. The feature is provided in Perl
-     in  order  to match individual bytes in UTF-8 mode.  Because
-     it breaks up UTF-8 characters into  individual  bytes,  what
-     remains  in  the string may be a malformed UTF-8 string. For
-     this reason it is best avoided.
-
-     PCRE does not allow \C to appear  in  lookbehind  assertions
-     (see below), because in UTF-8 mode it makes it impossible to
-     calculate the length of the lookbehind.
-
-
-SQUARE BRACKETS
-
-     An opening square bracket introduces a character class, ter-
-     minated  by  a  closing  square  bracket.  A  closing square
-     bracket on its own is  not  special.  If  a  closing  square
-     bracket  is  required as a member of the class, it should be
-     the first data character in the class (after an initial cir-
-     cumflex, if present) or escaped with a backslash.
-
-     A character class matches a single character in the subject.
-     In  UTF-8 mode, the character may occupy more than one byte.
-     A matched character must be in the set of characters defined
-     by the class, unless the first character in the class defin-
-     ition is a circumflex, in which case the  subject  character
-     must not be in the set defined by the class. If a circumflex
-     is actually required as a member of the class, ensure it  is
-     not the first character, or escape it with a backslash.
-
-     For example, the character class [aeiou] matches  any  lower
-     case vowel, while [^aeiou] matches any character that is not
-     a lower case vowel. Note that a circumflex is  just  a  con-
-     venient  notation for specifying the characters which are in
-     the class by enumerating those that are not. It  is  not  an
-     assertion:  it  still  consumes a character from the subject
-     string, and fails if the current pointer is at  the  end  of
-     the string.
-
-     In UTF-8 mode, characters with values greater than  255  can
-     be  included  in a class as a literal string of bytes, or by
-     using the \x{ escaping mechanism.
-
-     When caseless matching  is  set,  any  letters  in  a  class
-     represent  both their upper case and lower case versions, so
-     for example, a caseless [aeiou] matches "A" as well as  "a",
-     and  a caseless [^aeiou] does not match "A", whereas a case-
-     ful version would. PCRE does not support the concept of case
-     for characters with values greater than 255.
-
-     The newline character is never treated in any special way in
-     character  classes,  whatever the setting of the PCRE_DOTALL
-     or PCRE_MULTILINE options is. A  class  such  as  [^a]  will
-     always match a newline.
-
-     The minus (hyphen) character can be used to specify a  range
-     of  characters  in  a  character  class.  For example, [d-m]
-     matches any letter between d and m, inclusive.  If  a  minus
-     character  is required in a class, it must be escaped with a
-     backslash or appear in a position where it cannot be  inter-
-     preted as indicating a range, typically as the first or last
-     character in the class.
-
-     It is not possible to have the literal character "]" as  the
-     end  character  of  a  range.  A  pattern such as [W-]46] is
-     interpreted as a class of two characters ("W" and "-")  fol-
-     lowed by a literal string "46]", so it would match "W46]" or
-     "-46]". However, if the "]" is escaped with a  backslash  it
-     is  interpreted  as  the end of range, so [W-\]46] is inter-
-     preted as a single class containing a range followed by  two
-     separate characters. The octal or hexadecimal representation
-     of "]" can also be used to end a range.
-
-     Ranges  operate  in  the  collating  sequence  of  character
-     values.  They  can  also  be  used  for characters specified
-     numerically, for example [\000-\037]. In UTF-8 mode,  ranges
-     can  include  characters  whose values are greater than 255,
-     for example [\x{100}-\x{2ff}].
-
-     If a range that  includes  letters  is  used  when  caseless
-     matching  is set, it matches the letters in either case. For
-     example, [W-c] is  equivalent  to  [][\^_`wxyzabc],  matched
-     caselessly,  and if character tables for the "fr" locale are
-     in use, [\xc8-\xcb] matches accented E  characters  in  both
-     cases.
-
-     The character types \d, \D, \s, \S,  \w,  and  \W  may  also
-     appear  in  a  character  class, and add the characters that
-     they match to the class. For example, [\dABCDEF] matches any
-     hexadecimal  digit.  A  circumflex  can conveniently be used
-     with the upper case character types to specify a  more  res-
-     tricted set of characters than the matching lower case type.
-     For example, the class [^\W_] matches any letter  or  digit,
-     but not underscore.
-
-     All non-alphameric characters other than \,  -,  ^  (at  the
-     start)  and  the  terminating ] are non-special in character
-     classes, but it does no harm if they are escaped.
-
-
-POSIX CHARACTER CLASSES
-
-     Perl supports the  POSIX  notation  for  character  classes,
-     which  uses names enclosed by [: and :] within the enclosing
-     square brackets. PCRE also supports this notation. For exam-
-     ple,
-
-       [01[:alpha:]%]
-
-     matches "0", "1", any alphabetic character, or "%". The sup-
-     ported class names are
-
-       alnum    letters and digits
-       alpha    letters
-       ascii    character codes 0 - 127
-       blank    space or tab only
-       cntrl    control characters
-       digit    decimal digits (same as \d)
-       graph    printing characters, excluding space
-       lower    lower case letters
-       print    printing characters, including space
-       punct    printing characters, excluding letters and digits
-       space    white space (not quite the same as \s)
-       upper    upper case letters
-       word     "word" characters (same as \w)
-       xdigit   hexadecimal digits
-
-     The "space" characters are HT (9),  LF  (10),  VT  (11),  FF
-     (12),  CR  (13),  and  space  (32).  Notice  that  this list
-     includes the VT character (code 11). This makes "space" dif-
-     ferent  to  \s, which does not include VT (for Perl compati-
-     bility).
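-
-     The difference between the two sets can be made concrete
-     with a small C sketch. The function names are invented for
-     the example, and the character codes are the ASCII values
-     listed above; locale-specific tables are not considered.
-
```c
/* Characters matched by \s as listed in this document:
   HT (9), LF (10), FF (12), CR (13) and space (32). VT (11) is excluded. */
int is_perl_space(int c)
{
    return c == 9 || c == 10 || c == 12 || c == 13 || c == 32;
}

/* Characters in the POSIX "space" class: the same list plus VT (11). */
int is_posix_space(int c)
{
    return is_perl_space(c) || c == 11;
}
```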
-
-     The name "word" is a Perl extension, and "blank"  is  a  GNU
-     extension from Perl 5.8. Another Perl extension is negation,
-     which is indicated by a ^ character  after  the  colon.  For
-     example,
-
-       [12[:^digit:]]
-
-     matches "1", "2", or any non-digit.  PCRE  (and  Perl)  also
-     recognize the POSIX syntax [.ch.] and [=ch=] where "ch" is a
-     "collating element", but these are  not  supported,  and  an
-     error is given if they are encountered.
-
-     In UTF-8 mode, characters with values greater  than  255  do
-     not match any of the POSIX character classes.
-
-
-VERTICAL BAR
-
-     Vertical bar characters are  used  to  separate  alternative
-     patterns. For example, the pattern
-
-       gilbert|sullivan
-
-     matches either "gilbert" or "sullivan". Any number of alter-
-     natives  may  appear,  and an empty alternative is permitted
-     (matching the empty string).   The  matching  process  tries
-     each  alternative in turn, from left to right, and the first
-     one that succeeds is used. If the alternatives are within  a
-     subpattern  (defined  below),  "succeeds" means matching the
-     rest of the main pattern as well as the alternative  in  the
-     subpattern.
-
-
-INTERNAL OPTION SETTING
-
-     The   settings   of   the   PCRE_CASELESS,   PCRE_MULTILINE,
-     PCRE_DOTALL,  and  PCRE_EXTENDED options can be changed from
-     within the pattern by a  sequence  of  Perl  option  letters
-     enclosed between "(?" and ")". The option letters are
-
-       i  for PCRE_CASELESS
-       m  for PCRE_MULTILINE
-       s  for PCRE_DOTALL
-       x  for PCRE_EXTENDED
-
-     For example, (?im) sets caseless, multiline matching. It  is
-     also possible to unset these options by preceding the letter
-     with a hyphen, and a combined setting and unsetting such  as
-     (?im-sx),  which sets PCRE_CASELESS and PCRE_MULTILINE while
-     unsetting PCRE_DOTALL and PCRE_EXTENDED, is also  permitted.
-     If  a  letter  appears both before and after the hyphen, the
-     option is unset.
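-
-     The set-then-unset behaviour can be sketched in C as below.
-     This is a hypothetical helper, not part of the PCRE API: the
-     OPT_ constants are illustrative bit values rather than the
-     real PCRE_ option values, and only the four Perl-compatible
-     letters are handled.
-
```c
/* Illustrative flag bits; these are NOT the values of PCRE's
   PCRE_CASELESS etc. constants. */
enum {
    OPT_CASELESS  = 1,  /* i */
    OPT_MULTILINE = 2,  /* m */
    OPT_DOTALL    = 4,  /* s */
    OPT_EXTENDED  = 8   /* x */
};

/* Applies the letters found between "(?" and ")" to the current
   options and returns the result, or -1 for an unknown letter. */
int apply_option_letters(const char *letters, int options)
{
    int set = 0, unset = 0;
    int *target = &set;

    for (; *letters != '\0'; letters++) {
        switch (*letters) {
        case '-': target = &unset;           break;
        case 'i': *target |= OPT_CASELESS;   break;
        case 'm': *target |= OPT_MULTILINE;  break;
        case 's': *target |= OPT_DOTALL;     break;
        case 'x': *target |= OPT_EXTENDED;   break;
        default:  return -1;
        }
    }
    /* Unsetting is applied last, so a letter on both sides ends up unset. */
    return (options | set) & ~unset;
}
```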
-
-     When an option change occurs at  top  level  (that  is,  not
-     inside  subpattern  parentheses),  the change applies to the
-     remainder of the pattern that follows.   If  the  change  is
-     placed  right  at  the  start of a pattern, PCRE extracts it
-     into the global options (and it will therefore  show  up  in
-     data extracted by the pcre_fullinfo() function).
-
-     An option change within a subpattern affects only that  part
-     of the current pattern that follows it, so
-
-       (a(?i)b)c
-
-     matches  abc  and  aBc  and  no  other   strings   (assuming
-     PCRE_CASELESS  is  not used).  By this means, options can be
-     made to have different settings in different  parts  of  the
-     pattern.  Any  changes  made  in one alternative do carry on
-     into subsequent branches within  the  same  subpattern.  For
-     example,
-
-       (a(?i)b|c)
-
-     matches "ab", "aB", "c", and "C", even though when  matching
-     "C" the first branch is abandoned before the option setting.
-     This is because the effects of  option  settings  happen  at
-     compile  time. There would be some very weird behaviour oth-
-     erwise.
-
-     The PCRE-specific options PCRE_UNGREEDY and  PCRE_EXTRA  can
-     be changed in the same way as the Perl-compatible options by
-     using the characters U and X  respectively.  The  (?X)  flag
-     setting  is  special in that it must always occur earlier in
-     the pattern than any of the additional features it turns on,
-     even when it is at top level. It is best put at the start.
-
-
-SUBPATTERNS
-
-     Subpatterns are delimited by parentheses  (round  brackets),
-     which can be nested.  Marking part of a pattern as a subpat-
-     tern does two things:
-
-     1. It localizes a set of alternatives. For example, the pat-
-     tern
-
-       cat(aract|erpillar|)
-
-     matches one of the words "cat",  "cataract",  or  "caterpil-
-     lar".  Without  the  parentheses, it would match "cataract",
-     "erpillar" or the empty string.
-
-     2. It sets up the subpattern as a capturing  subpattern  (as
-     defined  above).   When the whole pattern matches, that por-
-     tion of the subject string that matched  the  subpattern  is
-     passed  back  to  the  caller  via  the  ovector argument of
-     pcre_exec(). Opening parentheses are counted  from  left  to
-     right (starting from 1) to obtain the numbers of the captur-
-     ing subpatterns.
-
-     For example, if the string "the red king" is matched against
-     the pattern
-
-       the ((red|white) (king|queen))
-
-     the captured substrings are "red king", "red",  and  "king",
-     and are numbered 1, 2, and 3, respectively.
-
-     The fact that plain parentheses fulfil two functions is  not
-     always  helpful.  There are often times when a grouping sub-
-     pattern is required without a capturing requirement.  If  an
-     opening  parenthesis  is  followed  by a question mark and a
-     colon, the subpattern does not do any capturing, and is  not
-     counted  when computing the number of any subsequent captur-
-     ing subpatterns. For  example,  if  the  string  "the  white
-     queen" is matched against the pattern
-
-       the ((?:red|white) (king|queen))
-
-     the captured substrings are "white queen" and  "queen",  and
-     are  numbered  1 and 2. The maximum number of capturing sub-
-     patterns is 65535, and the maximum depth of nesting  of  all
-     subpatterns, both capturing and non-capturing, is 200.
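-
-     The numbering rules above can be checked with a short sketch.
-     It assumes Python's re module as a stand-in for PCRE (the two
-     number capturing parentheses the same way for these patterns);
-     the variable names are illustrative.
-
```python
import re

# Three capturing groups, numbered by their opening parentheses.
capturing = re.fullmatch(r"the ((red|white) (king|queen))", "the red king")

# (?:...) groups without capturing, so only two numbered groups remain.
grouped = re.fullmatch(r"the ((?:red|white) (king|queen))", "the white queen")

print(capturing.groups())
print(grouped.groups())
```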
-
-     As a  convenient  shorthand,  if  any  option  settings  are
-     required  at  the  start  of a non-capturing subpattern, the
-     option letters may appear between the "?" and the ":".  Thus
-     the two patterns
-
-       (?i:saturday|sunday)
-       (?:(?i)saturday|sunday)
-
-     match exactly the same set of strings.  Because  alternative
-     branches  are  tried from left to right, and options are not
-     reset until the end of the subpattern is reached, an  option
-     setting  in  one  branch does affect subsequent branches, so
-     the above patterns match "SUNDAY" as well as "Saturday".
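-
-     A minimal sketch of the shorthand form, assuming Python's re
-     module as a stand-in for PCRE. Only the (?i:...) spelling is
-     shown, because recent Python versions reject an option change
-     in the middle of a pattern; the word list is invented.
-
```python
import re

# The option letter sits between "?" and ":" and applies to every branch.
weekend = re.compile(r"(?i:saturday|sunday)")

words = ("Saturday", "SUNDAY", "monday")
results = {word: bool(weekend.fullmatch(word)) for word in words}
print(results)
```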
-
-
-NAMED SUBPATTERNS
-
-     Identifying capturing parentheses by number is  simple,  but
-     it  can be very hard to keep track of the numbers in compli-
-     cated regular expressions. Furthermore, if an expression  is
-     modified,  the  numbers  may change. To help with the diffi-
-     culty, PCRE supports the naming  of  subpatterns,  something
-     that  Perl does not provide. The Python syntax (?P<name>...)
-     is used. Names consist of alphanumeric characters and under-
-     scores, and must be unique within a pattern.
-
-     Named capturing parentheses are still allocated  numbers  as
-     well  as  names.  The  PCRE  API provides function calls for
-     extracting the name-to-number translation table from a  com-
-     piled  pattern. For further details see the pcreapi documen-
-     tation.
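-
-     The following sketch shows the same idea in Python, whose
-     (?P<name>...) syntax PCRE borrows. Python's groupindex
-     attribute is only an analogue of the PCRE name-to-number
-     table; the pattern and group names are made up for the
-     example.
-
```python
import re

date = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# Named groups still receive numbers; this maps each name to its number.
name_to_number = dict(date.groupindex)

m = date.fullmatch("2003-02-14")
print(name_to_number, m.group("month"), m.group(2))
```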
-
-
-REPETITION
-
-     Repetition is specified by quantifiers, which can follow any
-     of the following items:
-
-       a literal data character
-       the . metacharacter
-       the \C escape sequence
-       escapes such as \d that match single characters
-       a character class
-       a back reference (see "Back references" below)
-       a parenthesized subpattern (unless it is an assertion)
-
-     The general repetition quantifier specifies  a  minimum  and
-     maximum  number  of  permitted  matches,  by  giving the two
-     numbers in curly brackets (braces), separated  by  a  comma.
-     The  numbers  must be less than 65536, and the first must be
-     less than or equal to the second. For example:
-
-       z{2,4}
-
-     matches "zz", "zzz", or "zzzz". A closing brace on  its  own
-     is not a special character. If the second number is omitted,
-     but the comma is present, there is no upper  limit;  if  the
-     second number and the comma are both omitted, the quantifier
-     specifies an exact number of required matches. Thus
-
-       [aeiou]{3,}
-
-     matches at least 3 successive vowels,  but  may  match  many
-     more, while
-
-       \d{8}
-
-     matches exactly 8 digits.  An  opening  curly  bracket  that
-     appears  in a position where a quantifier is not allowed, or
-     one that does not match the syntax of a quantifier, is taken
-     as  a literal character. For example, {,6} is not a quantif-
-     ier, but a literal string of four characters.
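-
-     The three quantifier forms can be tried out as follows. This
-     is a sketch using Python's re module as a stand-in for PCRE;
-     the {,6} case is left out because Python treats it
-     differently. The subject strings are invented.
-
```python
import re

# z{2,4}: between two and four repetitions.
z_counts = [n for n in range(1, 7) if re.fullmatch(r"z{2,4}", "z" * n)]

# [aeiou]{3,}: at least three vowels, as many as are available.
vowels = re.search(r"[aeiou]{3,}", "queueing").group()

# \d{8}: exactly eight digits.
eight = bool(re.fullmatch(r"\d{8}", "20030214"))
seven = bool(re.fullmatch(r"\d{8}", "2003021"))

print(z_counts, vowels, eight, seven)
```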
-
-     In UTF-8 mode, quantifiers apply to UTF-8 characters  rather
-     than  to  individual  bytes.  Thus,  for example, \x{100}{2}
-     matches two UTF-8 characters, each of which  is  represented
-     by a two-byte sequence.
-
-     The quantifier {0} is permitted, causing the  expression  to
-     behave  as  if the previous item and the quantifier were not
-     present.
-
-     For convenience (and  historical  compatibility)  the  three
-     most common quantifiers have single-character abbreviations:
-
-       *    is equivalent to {0,}
-       +    is equivalent to {1,}
-       ?    is equivalent to {0,1}
-
-     It is possible to construct infinite loops  by  following  a
-     subpattern  that  can  match no characters with a quantifier
-     that has no upper limit, for example:
-
-       (a?)*
-
-     Earlier versions of Perl and PCRE used to give an  error  at
-     compile  time  for such patterns. However, because there are
-     cases where this  can  be  useful,  such  patterns  are  now
-     accepted,  but  if  any repetition of the subpattern does in
-     fact match no characters, the loop is forcibly broken.
-
-     By default, the quantifiers  are  "greedy",  that  is,  they
-     match  as much as possible (up to the maximum number of per-
-     mitted times), without causing the rest of  the  pattern  to
-     fail. The classic example of where this gives problems is in
-     trying to match comments in C programs. These appear between
-     the  sequences /* and */ and within the sequence, individual
-     * and / characters may appear. An attempt to  match  C  com-
-     ments by applying the pattern
-
-       /\*.*\*/
-
-     to the string
-
-       /* first comment */  not comment  /* second comment */
-
-     fails, because it matches the entire  string  owing  to  the
-     greediness of the .*  item.
-
-     However, if a quantifier is followed by a question mark,  it
-     ceases  to be greedy, and instead matches the minimum number
-     of times possible, so the pattern
-
-       /\*.*?\*/
-
-     does the right thing with the C comments. The meaning of the
-     various  quantifiers is not otherwise changed, just the pre-
-     ferred number of matches.  Do not confuse this use of  ques-
-     tion  mark  with  its  use as a quantifier in its own right.
-     Because it has two uses, it can sometimes appear doubled, as
-     in
-
-       \d??\d
-
-     which matches one digit by preference, but can match two  if
-     that is the only way the rest of the pattern matches.
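-
-     The difference between greedy and minimizing quantifiers can
-     be seen in this sketch, which assumes Python's re module as a
-     stand-in for PCRE. The variable names are illustrative.
-
```python
import re

subject = "/* first comment */  not comment  /* second comment */"

# Greedy .* runs to the last */ in the subject.
greedy = re.search(r"/\*.*\*/", subject).group()

# Minimizing .*? stops at the first */ after each /*.
lazy = re.findall(r"/\*.*?\*/", subject)

# \d??\d prefers one digit, but takes two when the whole subject must match.
prefers_one = re.match(r"\d??\d", "12").group()
forced_two = re.fullmatch(r"\d??\d", "12").group()

print(greedy == subject, lazy, prefers_one, forced_two)
```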
-
-     If the PCRE_UNGREEDY option is set (an option which  is  not
-     available  in  Perl),  the  quantifiers  are  not  greedy by
-     default, but individual ones can be made greedy by following
-     them  with  a  question mark. In other words, it inverts the
-     default behaviour.
-
-     When a parenthesized subpattern is quantified with a minimum
-     repeat  count  that is greater than 1 or with a limited max-
-     imum, more store is required for the  compiled  pattern,  in
-     proportion to the size of the minimum or maximum.
-
-     If a pattern starts with .* or  .{0,}  and  the  PCRE_DOTALL
-     option (equivalent to Perl's /s) is set, thus allowing the .
-     to match  newlines,  the  pattern  is  implicitly  anchored,
-     because whatever follows will be tried against every charac-
-     ter position in the subject string, so there is no point  in
-     retrying  the overall match at any position after the first.
-     PCRE normally treats such a pattern as though it  were  pre-
-     ceded by \A.
-
-     In cases where it is known that the subject string  contains
-     no  newlines,  it  is  worth setting PCRE_DOTALL in order to
-     obtain this optimization, or alternatively using ^ to  indi-
-     cate anchoring explicitly.
-
-     However, there is one situation where the optimization  can-
-     not  be  used. When .*  is inside capturing parentheses that
-     are the subject of a backreference elsewhere in the pattern,
-     a match at the start may fail, and a later one succeed. Con-
-     sider, for example:
-
-       (.*)abc\1
-
-     If the subject is "xyz123abc123"  the  match  point  is  the
-     fourth  character.  For  this  reason, such a pattern is not
-     implicitly anchored.
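-
-     A short sketch of this case, assuming Python's re module as a
-     stand-in for PCRE. Note that Python reports offsets from
-     zero, so the fourth character is at offset 3.
-
```python
import re

m = re.search(r"(.*)abc\1", "xyz123abc123")

# The attempt at offset 0 fails; the match is found further along.
start = m.start()
captured = m.group(1)
print(start, captured, m.group())
```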
-
-     When a capturing subpattern is repeated, the value  captured
-     is the substring that matched the final iteration. For exam-
-     ple, after
-
-       (tweedle[dume]{3}\s*)+
-
-     has matched "tweedledum tweedledee" the value  of  the  cap-
-     tured  substring  is  "tweedledee".  However,  if  there are
-     nested capturing  subpatterns,  the  corresponding  captured
-     values  may  have been set in previous iterations. For exam-
-     ple, after
-
-       /(a|(b))+/
-
-     matches "aba" the value of the second captured substring  is
-     "b".
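-
-     Both examples can be reproduced with the sketch below. It
-     assumes Python's re module, which is believed to retain
-     captured values across iterations in the same way for these
-     two patterns; it is not PCRE itself.
-
```python
import re

# The outer group is repeated; the last iteration's text is what remains.
tweedle = re.fullmatch(r"(tweedle[dume]{3}\s*)+", "tweedledum tweedledee")

# Group 2 was set in the second iteration and is not cleared by the third.
nested = re.fullmatch(r"(a|(b))+", "aba")

print(tweedle.group(1), nested.groups())
```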
-
-
-ATOMIC GROUPING AND POSSESSIVE QUANTIFIERS
-
-     With both maximizing and minimizing repetition,  failure  of
-     what  follows  normally  causes  the repeated item to be re-
-     evaluated to see if a different number of repeats allows the
-     rest  of  the  pattern  to  match. Sometimes it is useful to
-     prevent this, either to change the nature of the  match,  or
-     to cause it to fail earlier than it otherwise might, when the
-     author of the pattern knows there is no  point  in  carrying
-     on.
-
-     Consider, for example, the pattern \d+foo  when  applied  to
-     the subject line
-
-       123456bar
-
-     After matching all 6 digits and then failing to match "foo",
-     the normal action of the matcher is to try again with only 5
-     digits matching the \d+ item, and then with 4,  and  so  on,
-     before  ultimately  failing. "Atomic grouping" (a term taken
-     from Jeffrey Friedl's book) provides the means for  specify-
-     ing  that once a subpattern has matched, it is not to be re-
-     evaluated in this way.
-
-     If we use atomic grouping  for  the  previous  example,  the
-     matcher  would give up immediately on failing to match "foo"
-     the  first  time.  The  notation  is  a  kind   of   special
-     parenthesis, starting with (?> as in this example:
-
-       (?>\d+)foo
-
-     This kind of parenthesis "locks up" the  part of the pattern
-     it  contains once it has matched, and a failure further into
-     the pattern is prevented from backtracking  into  it.  Back-
-     tracking  past  it to previous items, however, works as nor-
-     mal.
-
-     An alternative description is that a subpattern of this type
-     matches  the  string  of  characters that an identical stan-
-     dalone pattern would match, if anchored at the current point
-     in the subject string.
-
-     Atomic grouping subpatterns are not  capturing  subpatterns.
-     Simple  cases such as the above example can be thought of as
-     a maximizing repeat that must swallow everything it can. So,
-     while both \d+ and \d+? are prepared to adjust the number of
-     digits they match in order to make the rest of  the  pattern
-     match, (?>\d+) can only match an entire sequence of digits.
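-
-     The effect of refusing to give characters back can be shown
-     with a sketch in Python. To stay portable to Python versions
-     without (?>...), it swaps in a different technique: a
-     lookahead that captures the digits, followed by a back
-     reference to them. This emulation is an assumption of the
-     example, not how PCRE implements atomic groups.
-
```python
import re

subject = "1234"

# Ordinary repeat: \d+ gives back one digit so the final \d can match.
backtracking = re.search(r"\d+\d", subject)

# Emulated atomic group: the lookahead captures the whole run of digits,
# \1 then consumes exactly that run, and nothing can be given back.
atomic_like = re.search(r"(?=(\d+))\1\d", subject)

print(backtracking.group(), atomic_like)
```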
-
-     Atomic groups in general can of course  contain  arbitrarily
-     complicated  subpatterns,  and  can be nested. However, when
-     the subpattern for an atomic group is just a single repeated
-     item,  as in the example above, a simpler notation, called a
-     "possessive quantifier" can be used.  This  consists  of  an
-     additional  +  character  following a quantifier. Using this
-     notation, the previous example can be rewritten as
-
-       \d++foo
-
-     Possessive quantifiers are always greedy; the setting of the
-     PCRE_UNGREEDY option is ignored. They are a convenient nota-
-     tion for the simpler forms of atomic group.  However,  there
-     is  no  difference in the meaning or processing of a posses-
-     sive quantifier and the equivalent atomic group.
-
-     The possessive quantifier syntax is an extension to the Perl
-     syntax. It originates in Sun's Java package.
-
-     When a pattern contains an unlimited repeat inside a subpat-
-     tern  that  can  itself  be  repeated an unlimited number of
-     times, the use of an atomic group is the only way  to  avoid
-     some  failing  matches  taking  a very long time indeed. The
-     pattern
-
-       (\D+|<\d+>)*[!?]
-
-     matches an unlimited number of substrings that  either  con-
-     sist  of  non-digits,  or digits enclosed in <>, followed by
-     either ! or ?. When it matches, it runs quickly. However, if
-     it is applied to
-
-       aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
-
-     it takes a long  time  before  reporting  failure.  This  is
-     because the string can be divided between the two repeats in
-     a large number of ways, and all have to be tried. (The exam-
-     ple  used  [!?]  rather  than a single character at the end,
-     because both PCRE and Perl have an optimization that  allows
-     for  fast  failure  when  a  single  character is used. They
-     remember the last single character that is  required  for  a
-     match,  and  fail early if it is not present in the string.)
-     If the pattern is changed to
-
-       ((?>\D+)|<\d+>)*[!?]
-
-     sequences of non-digits cannot be broken, and  failure  hap-
-     pens quickly.
-
-
-BACK REFERENCES
-
-     Outside a character class, a backslash followed by  a  digit
-     greater  than  0  (and  possibly  further  digits) is a back
-     reference to a capturing subpattern earlier (that is, to its
-     left)  in  the  pattern,  provided there have been that many
-     previous capturing left parentheses.
-
-     However, if the decimal number following  the  backslash  is
-     less  than  10,  it is always taken as a back reference, and
-     causes an error only if there are not  that  many  capturing
-     left  parentheses in the entire pattern. In other words, the
-     parentheses that are referenced need not be to the  left  of
-     the  reference  for  numbers  less  than 10. See the section
-     entitled "Backslash" above for further details of  the  han-
-     dling of digits following a backslash.
-
-     A back reference matches whatever actually matched the  cap-
-     turing subpattern in the current subject string, rather than
-     anything matching the subpattern itself (see "Subpatterns as
-     subroutines" below for a way of doing that). So the pattern
-
-       (sens|respons)e and \1ibility
-
-     matches "sense and sensibility" and "response and  responsi-
-     bility",  but  not  "sense  and  responsibility". If caseful
-     matching is in force at the time of the back reference,  the
-     case of letters is relevant. For example,
-
-       ((?i)rah)\s+\1
-
-     matches "rah rah" and "RAH RAH", but  not  "RAH  rah",  even
-     though  the  original  capturing subpattern is matched case-
-     lessly.
-
-     Back references to named subpatterns use the  Python  syntax
-     (?P=name). We could rewrite the above example as follows:
-
-       (?P<p1>(?i)rah)\s+(?P=p1)
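-
-     Both kinds of back reference can be exercised with the sketch
-     below, which assumes Python's re module as a stand-in for
-     PCRE. The caseless part is written as (?i:rah), because
-     recent Python versions do not accept (?i) in the middle of a
-     pattern; the variable names are illustrative.
-
```python
import re

# \1 must repeat the text that group 1 actually matched.
austen = re.compile(r"(sens|respons)e and \1ibility")
same = bool(austen.fullmatch("sense and sensibility"))
also = bool(austen.fullmatch("response and responsibility"))
mixed = bool(austen.fullmatch("sense and responsibility"))

# The group is caseless, but the named back reference is caseful.
rah = re.compile(r"(?P<p1>(?i:rah))\s+(?P=p1)")
rah_results = [bool(rah.fullmatch(s)) for s in ("rah rah", "RAH RAH", "RAH rah")]

print(same, also, mixed, rah_results)
```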
-
-     There may be more than one back reference to the  same  sub-
-     pattern.  If  a  subpattern  has not actually been used in a
-     particular match, any back references to it always fail. For
-     example, the pattern
-
-       (a|(bc))\2
-
-     always fails if it starts to match  "a"  rather  than  "bc".
-     Because  there  may  be many capturing parentheses in a pat-
-     tern, all digits following the backslash are taken  as  part
-     of a potential back reference number. If the pattern contin-
-     ues with a digit character, some delimiter must be  used  to
-     terminate the back reference. If the PCRE_EXTENDED option is
-     set, this can be whitespace.  Otherwise an empty comment can
-     be used.
-
-     A back reference that occurs inside the parentheses to which
-     it  refers  fails when the subpattern is first used, so, for
-     example, (a\1) never matches.  However, such references  can
-     be useful inside repeated subpatterns. For example, the pat-
-     tern
-
-       (a|b\1)+
-
-     matches any number of "a"s and also "aba", "ababbaa" etc. At
-     each iteration of the subpattern, the back reference matches
-     the character string corresponding to  the  previous  itera-
-     tion.  In  order  for this to work, the pattern must be such
-     that the first iteration does not need  to  match  the  back
-     reference.  This  can  be  done using alternation, as in the
-     example above, or by a quantifier with a minimum of zero.
-
-
-ASSERTIONS
-
-     An assertion is  a  test  on  the  characters  following  or
-     preceding  the current matching point that does not actually
-     consume any characters. The simple assertions coded  as  \b,
-     \B,  \A, \G, \Z, \z, ^ and $ are described above.  More com-
-     plicated assertions are coded as subpatterns. There are  two
-     kinds:  those that look ahead of the current position in the
-     subject string, and those that look behind it.
-
-     An assertion subpattern is matched in the normal way, except
-     that  it  does not cause the current matching position to be
-     changed. Lookahead assertions start with  (?=  for  positive
-     assertions and (?! for negative assertions. For example,
-
-       \w+(?=;)
-
-     matches a word followed by a semicolon, but does not include
-     the semicolon in the match, and
-
-       foo(?!bar)
-
-     matches any occurrence of "foo"  that  is  not  followed  by
-     "bar". Note that the apparently similar pattern
-
-       (?!foo)bar
-
-     does not find an occurrence of "bar"  that  is  preceded  by
-     something other than "foo"; it finds any occurrence of "bar"
-     whatsoever, because the assertion  (?!foo)  is  always  true
-     when  the  next  three  characters  are  "bar". A lookbehind
-     assertion is needed to achieve this effect.
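-
-     The three lookahead patterns above behave as follows in a
-     small sketch that assumes Python's re module as a stand-in
-     for PCRE. The subject strings are invented, and offsets are
-     counted from zero.
-
```python
import re

# The semicolon is required but is not part of the match.
word = re.search(r"\w+(?=;)", "int x;").group()

# Skips the "foo" in "foobar" and finds the one in "foobaz".
foo_start = re.search(r"foo(?!bar)", "foobar foobaz").start()

# (?!foo) is trivially true where "bar" begins, so "bar" is still found.
bar_start = re.search(r"(?!foo)bar", "foobar").start()

print(word, foo_start, bar_start)
```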
-
-     If you want to force a matching failure at some point  in  a
-     pattern,  the  most  convenient  way  to  do it is with (?!)
-     because an empty string always matches, so an assertion that
-     requires there not to be an empty string must always fail.
-
-     Lookbehind assertions start with (?<=  for  positive  asser-
-     tions and (?<! for negative assertions. For example,
-
-       (?<!foo)bar
-
-     does find an occurrence of "bar" that  is  not  preceded  by
-     "foo". The contents of a lookbehind assertion are restricted
-     such that all the strings  it  matches  must  have  a  fixed
-     length.  However, if there are several alternatives, they do
-     not all have to have the same fixed length. Thus
-
-       (?<=bullock|donkey)
-
-     is permitted, but
-
-       (?<!dogs?|cats?)
-
-     causes an error at compile time. Branches  that  match  dif-
-     ferent length strings are permitted only at the top level of
-     a lookbehind assertion. This is an extension  compared  with
-     Perl  (at  least  for  5.8),  which requires all branches to
-     match the same length of string. An assertion such as
-
-       (?<=ab(c|de))
-
-     is not permitted, because its single  top-level  branch  can
-     match two different lengths, but it is acceptable if rewrit-
-     ten to use two top-level branches:
-
-       (?<=abc|abde)
-
-     The implementation of lookbehind  assertions  is,  for  each
-     alternative,  to  temporarily move the current position back
-     by the fixed width and then  try  to  match.  If  there  are
-     insufficient  characters  before  the  current position, the
-     match is deemed to fail.
-
-     PCRE does not allow the \C escape (which  matches  a  single
-     byte  in  UTF-8  mode)  to  appear in lookbehind assertions,
-     because it makes it impossible to calculate  the  length  of
-     the lookbehind.
-
-     Atomic groups can be used  in  conjunction  with  lookbehind
-     assertions  to  specify efficient matching at the end of the
-     subject string. Consider a simple pattern such as
-
-       abcd$
-
-     when applied to a long string that does not  match.  Because
-     matching  proceeds  from  left  to right, PCRE will look for
-     each "a" in the subject and then see if what follows matches
-     the rest of the pattern. If the pattern is specified as
-
-       ^.*abcd$
-
-     the initial .* matches the entire string at first, but  when
-     this  fails  (because  there  is no following "a"), it back-
-     tracks to match all but the last character, then all but the
-     last  two  characters,  and so on. Once again the search for
-     "a" covers the entire string, from right to left, so we  are
-     no better off. However, if the pattern is written as
-
-       ^(?>.*)(?<=abcd)
-
-     or, equivalently,
-
-       ^.*+(?<=abcd)
-
-     there can be no backtracking for the .* item; it  can  match
-     only  the entire string. The subsequent lookbehind assertion
-     does a single test on the last four characters. If it fails,
-     the match fails immediately. For long strings, this approach
-     makes a significant difference to the processing time.
-
-     Several assertions (of any sort) may  occur  in  succession.
-     For example,
-
-       (?<=\d{3})(?<!999)foo
-
-     matches "foo" preceded by three digits that are  not  "999".
-     Notice  that each of the assertions is applied independently
-     at the same point in the subject string. First  there  is  a
-     check that the previous three characters are all digits, and
-     then there is a check that the same three characters are not
-     "999".   This  pattern  does not match "foo" preceded by six
-     characters, the first three of which are digits and the last
-     three of which are not "999". For example, it doesn't match
-     "123abcfoo". A pattern to do that is
-
-       (?<=\d{3}...)(?<!999)foo
-
-     This time the first assertion looks  at  the  preceding  six
-     characters,  checking  that  the first three are digits, and
-     then the second assertion checks that  the  preceding  three
-     characters are not "999".
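-
-     The two patterns can be compared with the sketch below, which
-     assumes Python's re module as a stand-in for PCRE (both
-     lookbehinds are fixed-width, which Python requires). The
-     helper name and the subject strings are invented.
-
```python
import re

three_digits = re.compile(r"(?<=\d{3})(?<!999)foo")
six_chars = re.compile(r"(?<=\d{3}...)(?<!999)foo")

def hits(pattern, subjects):
    """Return the subjects in which the pattern is found."""
    return [s for s in subjects if pattern.search(s)]

subjects = ["123foo", "999foo", "123abcfoo", "123999foo"]
first = hits(three_digits, subjects)
second = hits(six_chars, subjects)
print(first, second)
```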
-
-     Assertions can be nested in any combination. For example,
-
-       (?<=(?<!foo)bar)baz
-
-     matches an occurrence of "baz" that  is  preceded  by  "bar"
-     which in turn is not preceded by "foo", while
-
-       (?<=\d{3}(?!999)...)foo
-
-     is another pattern which matches  "foo"  preceded  by  three
-     digits and any three characters that are not "999".
-
-     Assertion subpatterns are not capturing subpatterns, and may
-     not  be  repeated,  because  it makes no sense to assert the
-     same thing several times. If any kind of assertion  contains
-     capturing  subpatterns  within it, these are counted for the
-     purposes of numbering the capturing subpatterns in the whole
-     pattern.   However,  substring capturing is carried out only
-     for positive assertions, because it does not make sense  for
-     negative assertions.
-
-
-CONDITIONAL SUBPATTERNS
-
-     It is possible to cause the matching process to obey a  sub-
-     pattern  conditionally  or to choose between two alternative
-     subpatterns, depending on the result  of  an  assertion,  or
-     whether  a previous capturing subpattern matched or not. The
-     two possible forms of conditional subpattern are
-
-       (?(condition)yes-pattern)
-       (?(condition)yes-pattern|no-pattern)
-
-     If the condition is satisfied, the yes-pattern is used; oth-
-     erwise  the  no-pattern  (if  present) is used. If there are
-     more than two alternatives in the subpattern, a compile-time
-     error occurs.
-
-     There are three kinds of condition. If the text between  the
-     parentheses  consists of a sequence of digits, the condition
-     is satisfied if the capturing subpattern of that number  has
-     previously  matched.  The  number must be greater than zero.
-     Consider  the  following  pattern,   which   contains   non-
-     significant white space to make it more readable (assume the
-     PCRE_EXTENDED option) and to divide it into three parts  for
-     ease of discussion:
-
-       ( \( )?    [^()]+    (?(1) \) )
-
-     The first part matches an optional opening parenthesis,  and
-     if  that character is present, sets it as the first captured
-     substring. The second part matches one  or  more  characters
-     that  are  not  parentheses. The third part is a conditional
-     subpattern that tests whether the first set  of  parentheses
-     matched or not. If they did, that is, if the subject started
-     with an opening parenthesis, the condition is true,  and  so
-     the  yes-pattern  is  executed  and a closing parenthesis is
-     required. Otherwise, since no-pattern is  not  present,  the
-     subpattern  matches  nothing.  In  other words, this pattern
-     matches a sequence of non-parentheses,  optionally  enclosed
-     in parentheses.
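-
-     A sketch of this pattern in use, assuming Python's re module
-     as a stand-in for PCRE; its VERBOSE flag plays the part of
-     PCRE_EXTENDED. The subject strings are invented.
-
```python
import re

# Optional "(", a run of non-parentheses, then ")" only if group 1 matched.
optional_parens = re.compile(r"( \( )?  [^()]+  (?(1) \) )", re.VERBOSE)

subjects = ("abc", "(abc)", "(abc", "abc)")
results = {s: bool(optional_parens.fullmatch(s)) for s in subjects}
print(results)
```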
-
-     If the condition is the string (R), it  is  satisfied  if  a
-     recursive  call  to the pattern or subpattern has been made.
-     At "top level", the condition is  false.   This  is  a  PCRE
-     extension. Recursive patterns are described in  the  section
-     entitled "Recursive patterns" below.
-
-     If the condition is not a sequence of digits or (R), it must
-     be  an assertion.  This may be a positive or negative looka-
-     head or lookbehind assertion. Consider this  pattern,  again
-     containing  non-significant  white  space,  and with the two
-     alternatives on the second line:
-
-       (?(?=[^a-z]*[a-z])
-       \d{2}-[a-z]{3}-\d{2}  |  \d{2}-\d{2}-\d{2} )
-
-     The condition is a positive lookahead assertion that matches
-     an optional sequence of non-letters followed by a letter. In
-     other words, it tests for  the  presence  of  at  least  one
-     letter  in the subject. If a letter is found, the subject is
-     matched against  the  first  alternative;  otherwise  it  is
-     matched  against the second. This pattern matches strings in
-     one of the two forms dd-aaa-dd or dd-dd-dd,  where  aaa  are
-     letters and dd are digits.
-
-
-COMMENTS
-
-     The sequence (?# marks the start of a comment which  contin-
-     ues  up  to the next closing parenthesis. Nested parentheses
-     are not permitted. The characters that  make  up  a  comment
-     play no part in the pattern matching at all.
-
-     If the PCRE_EXTENDED option is set, an unescaped # character
-     outside  a character class introduces a comment that contin-
-     ues up to the next newline character in the pattern.
-
-
-RECURSIVE PATTERNS
-
-     Consider the problem of matching a  string  in  parentheses,
-     allowing  for  unlimited nested parentheses. Without the use
-     of recursion, the best that can be done is to use a  pattern
-     that  matches  up  to some fixed depth of nesting. It is not
-     possible to handle an arbitrary nesting depth. Perl has pro-
-     vided  an  experimental facility that allows regular expres-
-     sions to recurse (amongst other things).  It  does  this  by
-     interpolating  Perl  code in the expression at run time, and
-     the code can refer to the expression itself. A Perl  pattern
-     to solve the parentheses problem can be created like this:
-
-       $re = qr{\( (?: (?>[^()]+) | (?p{$re}) )* \)}x;
-
-     The (?p{...}) item interpolates Perl code at run  time,  and
-     in  this  case refers recursively to the pattern in which it
-     appears. Obviously, PCRE cannot support the interpolation of
-     Perl  code.  Instead,  it  supports  some special syntax for
-     recursion of the entire pattern,  and  also  for  individual
-     subpattern recursion.
-
-     The special item that consists of (? followed  by  a  number
-     greater  than  zero and a closing parenthesis is a recursive
-     call of the subpattern of the given number, provided that it
-     occurs inside that subpattern. (If not, it is a "subroutine"
-     call, which is described in the next section.)  The  special
-     item  (?R) is a recursive call of the entire regular expres-
-     sion.
-
-     For example, this PCRE pattern solves the nested parentheses
-     problem  (assume  the  PCRE_EXTENDED  option  is set so that
-     white space is ignored):
-
-       \( ( (?>[^()]+) | (?R) )* \)
-
-     First it matches an opening parenthesis. Then it matches any
-     number  of substrings which can either be a sequence of non-
-     parentheses, or a recursive  match  of  the  pattern  itself
-     (that  is  a  correctly  parenthesized  substring).  Finally
-     there is a closing parenthesis.
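-
-     Python's re module has no (?R) item, so the sketch below is
-     not a regular expression at all. It is a small hand-written
-     recursive function that mirrors the three steps just
-     described, offered only to make the recursion concrete. The
-     function name and return convention are invented, and this is
-     not how PCRE is implemented.
-
```python
def match_parens(subject, pos=0):
    """Return the offset just past a balanced "(...)" starting at pos,
    or -1 if there is no such substring."""
    # Step 1: an opening parenthesis is required.
    if pos >= len(subject) or subject[pos] != "(":
        return -1
    pos += 1
    while pos < len(subject):
        ch = subject[pos]
        if ch == ")":
            # Step 3: the closing parenthesis ends this level.
            return pos + 1
        if ch == "(":
            # Plays the role of (?R): match a nested group recursively.
            pos = match_parens(subject, pos)
            if pos == -1:
                return -1
        else:
            # Plays the role of (?>[^()]+): consume the whole run at once.
            while pos < len(subject) and subject[pos] not in "()":
                pos += 1
    return -1  # ran out of subject before the closing parenthesis


print(match_parens("(ab(cd)ef)"), match_parens("(ab(cd"))
```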
-
-     If this were part of a larger pattern, you would not want to
-     recurse the entire pattern, so instead you could use this:
-
-       ( \( ( (?>[^()]+) | (?1) )* \) )
-
-     We have put the pattern into  parentheses,  and  caused  the
-     recursion  to refer to them instead of the whole pattern. In
-     a larger pattern, keeping track of parenthesis  numbers  can
-     be   tricky.   It  may  be  more  convenient  to  use  named
-     parentheses instead. For this, PCRE uses (?P>name), which is
-     an  extension  to the Python syntax that PCRE uses for named
-     parentheses (Perl does not provide  named  parentheses).  We
-     could rewrite the above example as follows:
-
-       (?P<pn> \( ( (?>[^()]+) | (?P>pn) )* \) )
-
-     This particular example pattern  contains  nested  unlimited
-     repeats,  and  so  the  use  of atomic grouping for matching
-     strings of non-parentheses is important  when  applying  the
-     pattern to strings that do not match. For example, when this
-     pattern is applied to
-
-       (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()
-
-     it yields "no match" quickly. However, if atomic grouping is
-     not used, the match runs for a very long time indeed because
-     there are so many different ways the +  and  *  repeats  can
-     carve  up  the  subject,  and  all  have to be tested before
-     failure can be reported.
-
-     At the end of a match, the values set for any capturing sub-
-     patterns are those from the outermost level of the recursion
-     at which the subpattern value is set.  If you want to obtain
-     intermediate  values,  a  callout  function can be used (see
-     below and the pcrecallout  documentation).  If  the  pattern
-     above is matched against
-
-       (ab(cd)ef)
-
-     the value for the capturing parentheses is  "ef",  which  is
-     the  last  value  taken  on  at the top level. If additional
-     parentheses are added, giving
-
-       \( ( ( (?>[^()]+) | (?R) )* ) \)
-          ^                        ^
-          ^                        ^
-
-     the string they capture is "ab(cd)ef", the contents  of  the
-     top  level  parentheses. If there are more than 15 capturing
-     parentheses in a pattern, PCRE has to obtain extra memory to
-     store  data  during  a  recursion,  which  it  does by using
-     pcre_malloc, freeing it  via  pcre_free  afterwards.  If  no
-     memory   can   be   obtained,   the  match  fails  with  the
-     PCRE_ERROR_NOMEMORY error.
-
-     Do not confuse the (?R) item with the condition  (R),  which
-     tests  for  recursion.  Consider this pattern, which matches
-     text in angle brackets, allowing for arbitrary nesting. Only
-     digits are allowed in nested brackets (that is, when recurs-
-     ing), whereas any characters  are  permitted  at  the  outer
-     level.
-
-       < (?: (?(R) \d++  | [^<>]*+) | (?R)) * >
-
-     In this pattern, (?(R) is the start of a conditional subpat-
-     tern,  with two different alternatives for the recursive and
-     non-recursive cases. The (?R) item is the  actual  recursive
-     call.
-
-
-SUBPATTERNS AS SUBROUTINES
-
-     If the syntax for a recursive subpattern  reference  (either
-     by  number  or  by  name) is used outside the parentheses to
-     which it refers, it operates like a subroutine in a program-
-     ming  language. An earlier example pointed out that the pat-
-     tern
-
-       (sens|respons)e and \1ibility
-
-     matches "sense and sensibility" and "response and  responsi-
-     bility",  but not "sense and responsibility". If instead the
-     pattern
-
-       (sens|respons)e and (?1)ibility
-
-     is used, it does match "sense and responsibility" as well as
-     the other two strings. Such references must, however, follow
-     the subpattern to which they refer.
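-
-     The difference between the two forms can be sketched  in  C:
-     a back reference compares against the text that was actually
-     captured, whereas a subroutine call runs the subpattern again.
-     The code below is hand-written for these two patterns only,
-     and the names run_group1 and match_phrase are invented for
-     the illustration; it is not how PCRE itself is implemented.
-
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The alternatives of group 1: (sens|respons) */
static const char *const group1[] = { "sens", "respons" };

/* Run group 1 as a pattern at s; return the matched length, or 0. */
static size_t run_group1(const char *s)
{
    size_t k;

    for (k = 0; k < sizeof group1 / sizeof group1[0]; k++) {
        size_t len = strlen(group1[k]);
        if (strncmp(s, group1[k], len) == 0)
            return len;
    }
    return 0;
}

/* use_subroutine == 0:  (sens|respons)e and \1ibility
 * use_subroutine != 0:  (sens|respons)e and (?1)ibility
 * Returns 1 if the whole of s matches, otherwise 0. */
int match_phrase(const char *s, int use_subroutine)
{
    const char *captured;
    size_t caplen, n;

    assert(s != NULL);

    captured = s;
    caplen = run_group1(s);              /* group 1 captures here */
    if (caplen == 0)
        return 0;
    s += caplen;

    if (strncmp(s, "e and ", 6) != 0)
        return 0;
    s += 6;

    if (use_subroutine) {
        n = run_group1(s);               /* (?1): run the subpattern again */
        if (n == 0)
            return 0;
    } else {
        if (strncmp(s, captured, caplen) != 0)
            return 0;                    /* \1: must repeat the captured text */
        n = caplen;
    }
    s += n;

    return strcmp(s, "ibility") == 0;
}
```
-
-     Here match_phrase("sense and responsibility", 0) yields 0,
-     while the same subject with a non-zero second argument
-     yields 1.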
-
-
-CALLOUTS
-
-     Perl has a  feature  whereby  using  the  sequence  (?{...})
-     causes  arbitrary  Perl  code  to be obeyed in the middle of
-     matching a  regular  expression.  This  makes  it  possible,
-     amongst  other  things, to extract different substrings that
-     match the same pair of parentheses when there is  a  repeti-
-     tion.
-
-     PCRE provides a similar feature, but  of  course  it  cannot
-     obey  arbitrary  Perl code. The feature is called "callout".
-     The caller of PCRE provides an external function by  putting
-     its  entry  point  in  the global variable pcre_callout.  By
-     default, this variable contains  NULL,  which  disables  all
-     calling out.
-
-     Within a regular expression, (?C) indicates  the  points  at
-     which  the external function is to be called. If you want to
-     identify different callout points, you can put a number less
-     than 256 after the letter C. The default value is zero.  For
-     example, this pattern has two callout points:
-
-       (?C1)\dabc(?C2)def
-
-     During matching, when PCRE  reaches  a  callout  point  (and
-     pcre_callout is set), the external function is called. It is
-     provided with the number of the  callout,  and,  optionally,
-     one  item  of  data  originally  supplied  by  the caller of
-     pcre_exec(). The callout  function  may  cause  matching  to
-     backtrack,  or to fail altogether. A complete description of
-     the interface to the callout function is given in the  pcre-
-     callout documentation.
-
-Last updated: 03 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-PCRE PERFORMANCE
-
-     Certain items that may appear in regular expression patterns
-     are  more efficient than others. It is more efficient to use
-     a character class like [aeiou] than a  set  of  alternatives
-     such  as  (a|e|i|o|u). In general, the simplest construction
-     that provides the required behaviour  is  usually  the  most
-     efficient.  Jeffrey  Friedl's book contains a lot of discus-
-     sion about optimizing regular expressions for efficient per-
-     formance.
-
-     When a pattern begins with .*  not  in  parentheses,  or  in
-     parentheses that are not the subject of a backreference, and
-     the PCRE_DOTALL option is set,  the  pattern  is  implicitly
-     anchored  by PCRE, since it can match only at the start of a
-     subject string. However, if PCRE_DOTALL  is  not  set,  PCRE
-     cannot  make  this optimization, because the . metacharacter
-     does not then match a newline, and  if  the  subject  string
-     contains  newlines, the pattern may match from the character
-     immediately following one of them instead of from  the  very
-     start. For example, the pattern
-
-       .*second
-
-     matches the subject "first\nand second" (where \n stands for
-     a newline character), with the match starting at the seventh
-     character. In order to do this, PCRE has to retry the  match
-     starting after every newline in the subject.
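-
-     The retry points can be sketched in C as follows. This is an
-     illustration only, not PCRE's code: dotstar_match_start is a
-     hypothetical helper, and it assumes that the text  following
-     .* is a plain literal that contains no newline.
-
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Where can  .*<literal>  start matching when PCRE_DOTALL is not set?
 * Only at the start of the subject or just after a newline.
 * Returns the offset of the match start, or -1 if there is no match.
 * (Hypothetical helper; literal is assumed to contain no newline.) */
long dotstar_match_start(const char *subject, const char *literal)
{
    size_t litlen;
    const char *line;

    assert(subject != NULL && literal != NULL);

    litlen = strlen(literal);
    line = subject;

    for (;;) {
        const char *nl = strchr(line, '\n');
        size_t linelen = nl ? (size_t)(nl - line) : strlen(line);
        size_t i;

        /* .* cannot cross a newline, so the literal must lie in this line */
        for (i = 0; i + litlen <= linelen; i++) {
            if (memcmp(line + i, literal, litlen) == 0)
                return (long)(line - subject);
        }

        if (nl == NULL)
            return -1;                   /* no more lines to try */
        line = nl + 1;                   /* retry just after the newline */
    }
}
```
-
-     For the subject above, dotstar_match_start("first\nand second",
-     "second") yields 6, the offset of the seventh character.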
-
-     If you are using such a pattern with subject strings that do
-     not  contain  newlines,  the best performance is obtained by
-     setting PCRE_DOTALL, or starting the  pattern  with  ^.*  to
-     indicate  explicit anchoring. That saves PCRE from having to
-     scan along the subject looking for a newline to restart at.
-
-     Beware of patterns that contain nested  indefinite  repeats.
-     These  can  take a long time to run when applied to a string
-     that does not match. Consider the pattern fragment
-
-       (a+)*
-
-     This can match "aaaa" in 16 different ways, and this  number
-     increases  very  rapidly  as  the string gets longer. (The *
-     repeat can match 0, 1, 2, 3, or 4 times,  and  for  each  of
-     those cases other than 0 or 4, the + repeats can match
-     different numbers of times.) When  the  remainder  of  the
-     pattern is such
-     that  the entire match is going to fail, PCRE has in princi-
-     ple to try every possible variation, and this  can  take  an
-     extremely long time.
-
-     An optimization catches some of the more simple  cases  such
-     as
-
-       (a+)*b
-
-     where a literal character follows. Before embarking  on  the
-     standard matching procedure, PCRE checks that there is a "b"
-     later in the subject string, and if there is not,  it  fails
-     the  match  immediately. However, when there is no following
-     literal this optimization cannot be used. You  can  see  the
-     difference by comparing the behaviour of
-
-       (a+)*\d
-
-     with the pattern above. That earlier pattern gives a failure
-     almost instantly when applied to a whole line of "a" charac-
-     ters, whereas (a+)*\d takes an appreciable time with strings
-     longer than about 20 characters.
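-
-     The growth in the number of ways that (a+)*  can  match  can
-     be checked with a short C function. This is a counting
-     sketch under one stated assumption: a "way" is any choice of
-     how far the match extends from the start of the run of "a"
-     characters, together with how that stretch is divided among
-     iterations of a+. The name count_ways is invented for the
-     example.
-
```c
#include <assert.h>

#define MAX_N 30   /* keeps every total within an unsigned long */

/* Number of distinct ways (a+)* can match at the start of a run of
 * n "a" characters (hypothetical helper, see the assumption above).
 * Returns 0 if n is larger than MAX_N. */
unsigned long count_ways(unsigned n)
{
    unsigned long exact[MAX_N + 1];  /* exact[k]: ways to cover exactly k */
    unsigned long total = 0;
    unsigned k, j;

    if (n > MAX_N)
        return 0;
    assert(n <= MAX_N);

    exact[0] = 1;                    /* zero iterations: the empty match */
    for (k = 1; k <= n; k++) {
        exact[k] = 0;
        /* the last iteration of a+ takes j characters */
        for (j = 1; j <= k; j++)
            exact[k] += exact[k - j];
    }

    for (k = 0; k <= n; k++)         /* the match may stop after any prefix */
        total += exact[k];
    return total;
}
```
-
-     Under this way of counting, the total doubles with each extra
-     character, so count_ways(20) is already 1048576.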
-
-Last updated: 03 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-SYNOPSIS OF POSIX API
-     #include <pcreposix.h>
-
-     int regcomp(regex_t *preg, const char *pattern,
-          int cflags);
-
-     int regexec(regex_t *preg, const char *string,
-          size_t nmatch, regmatch_t pmatch[], int eflags);
-
-     size_t regerror(int errcode, const regex_t *preg,
-          char *errbuf, size_t errbuf_size);
-
-     void regfree(regex_t *preg);
-
-
-DESCRIPTION
-
-     This set of functions provides a POSIX-style API to the PCRE
-     regular  expression  package.  See the pcreapi documentation
-     for a description of the native API,  which  contains  addi-
-     tional functionality.
-
-     The functions described here are just wrapper functions that
-     ultimately  call  the  PCRE native API. Their prototypes are
-     defined in the pcreposix.h header file, and on Unix  systems
-     the library itself is called libpcreposix.a, and is accessed
-     by adding -lpcreposix to the command for linking an applica-
-     tion  which  uses them. Because the POSIX functions call the
-     native ones, it is also necessary to add -lpcre.
-
-     I have implemented only those option bits that can  be  rea-
-     sonably  mapped  to  PCRE  native  options. In addition, the
-     options REG_EXTENDED and  REG_NOSUB  are  defined  with  the
-     value zero. They have no effect, but since programs that are
-     written to the POSIX interface often use them, this makes it
-     easier to slot in PCRE as a replacement library. Other POSIX
-     options are not even defined.
-
-     When PCRE is called via these functions, it is only the  API
-     that is POSIX-like in style. The syntax and semantics of the
-     regular expressions themselves are still those of Perl, sub-
-     ject  to  the  setting of various PCRE options, as described
-     below. "POSIX-like in style" means that the API approximates
-     to  the  POSIX definition; it is not fully POSIX-compatible,
-     and in multi-byte encoding domains it is probably even  less
-     compatible.
-
-     The header for these functions is supplied as pcreposix.h to
-     avoid  any  potential  clash  with other POSIX libraries. It
-     can, of course, be renamed or aliased as regex.h,  which  is
-     the "correct" name. It provides two structure types, regex_t
-     for compiled internal forms, and  regmatch_t  for  returning
-     captured  substrings.  It  also defines some constants whose
-     names start with "REG_"; these are used for setting  options
-     and identifying error codes.
-
-
-COMPILING A PATTERN
-
-     The function regcomp() is called to compile a  pattern  into
-     an  internal form. The pattern is a C string terminated by a
-     binary zero, and is passed in the argument pattern. The preg
-     argument  is  a pointer to a regex_t structure which is used
-     as a base for storing information about the compiled expres-
-     sion.
-
-     The argument cflags is either zero, or contains one or  more
-     of the bits defined by the following macros:
-
-       REG_ICASE
-
-     The PCRE_CASELESS option  is  set  when  the  expression  is
-     passed for compilation to the native function.
-
-       REG_NEWLINE
-
-     The PCRE_MULTILINE option is  set  when  the  expression  is
-     passed  for  compilation  to  the native function. Note that
-     this  does  not  mimic  the  defined  POSIX  behaviour   for
-     REG_NEWLINE (see the following section).
-
-     In the absence of these flags, no options are passed to  the
-     native  function.  This means that the regex is compiled with
-     PCRE default semantics. In particular, the  way  it  handles
-     newline  characters  in  the subject string is the Perl way,
-     not the POSIX way. Note that setting PCRE_MULTILINE has only
-     some  of  the effects specified for REG_NEWLINE. It does not
-     affect the way newlines are matched by . (they aren't) or by
-     a negative class such as [^a] (they are).
-
-     The yield of regcomp() is zero on success, and non-zero oth-
-     erwise.  The preg structure is filled in on success, and one
-     member of the structure  is  public:  re_nsub  contains  the
-     number  of  capturing subpatterns in the regular expression.
-     Various error codes are defined in the header file.
-
-
-MATCHING NEWLINE CHARACTERS
-
-     This area is not simple, because POSIX and  Perl  take  dif-
-     ferent  views  of things.  It is not possible to get PCRE to
-     obey POSIX semantics, but then PCRE was never intended to be
-     a POSIX engine. The following table lists the different pos-
-     sibilities for matching newline characters in PCRE:
-
-                               Default   Change with
-
-       . matches newline          no     PCRE_DOTALL
-       newline matches [^a]       yes    not changeable
-       $ matches \n at end        yes    PCRE_DOLLARENDONLY
-       $ matches \n in middle     no     PCRE_MULTILINE
-       ^ matches \n in middle     no     PCRE_MULTILINE
-
-     This is the equivalent table for POSIX:
-
-                               Default     Change with
-
-       . matches newline          yes      REG_NEWLINE
-       newline matches [^a]       yes      REG_NEWLINE
-       $ matches \n at end        no       REG_NEWLINE
-       $ matches \n in middle     no       REG_NEWLINE
-       ^ matches \n in middle     no       REG_NEWLINE
-
-     PCRE's behaviour is the same as Perl's, except that there is
-     no  equivalent  for PCRE_DOLLARENDONLY in Perl. In both PCRE
-     and Perl, there is no way  to  stop  newline  from  matching
-     [^a].
-
-     The default POSIX newline handling can be obtained  by  set-
-     ting PCRE_DOTALL and PCRE_DOLLARENDONLY, but there is no way
-     to make PCRE behave exactly as for the REG_NEWLINE action.
-
-
-MATCHING A PATTERN
-
-     The function regexec() is called  to  match  a  pre-compiled
-     pattern  preg against a given string, which is terminated by
-     a zero byte, subject to the options in eflags. These can be:
-
-       REG_NOTBOL
-
-     The PCRE_NOTBOL option is set when  calling  the  underlying
-     PCRE matching function.
-
-       REG_NOTEOL
-
-     The PCRE_NOTEOL option is set when  calling  the  underlying
-     PCRE matching function.
-
-     The portion of the string that was  matched,  and  also  any
-     captured  substrings,  are returned via the pmatch argument,
-     which points to  an  array  of  nmatch  structures  of  type
-     regmatch_t,  containing  the  members rm_so and rm_eo. These
-     contain the offset to the first character of each  substring
-     and  the offset to the first character after the end of each
-     substring, respectively.  The  0th  element  of  the  vector
-     relates  to  the  entire portion of string that was matched;
-     subsequent elements relate to the capturing  subpatterns  of
-     the  regular  expression.  Unused  entries in the array have
-     both structure members set to -1.
-
-     A successful match yields a zero return; various error codes
-     are  defined in the header file, of which REG_NOMATCH is the
-     "expected" failure code.
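-
-     The use of pmatch can be sketched as follows. To  keep  the
-     example buildable without PCRE installed it includes the
-     system <regex.h> as a stand-in; with PCRE the header would be
-     pcreposix.h and the pattern would be read with Perl syntax.
-     REG_EXTENDED is passed because the system library needs it
-     for this pattern, and, as noted earlier, it is defined as zero
-     for PCRE. The name first_capture is invented for the example.
-
```c
#include <assert.h>
#include <regex.h>    /* stand-in; with PCRE use <pcreposix.h> instead */
#include <stddef.h>
#include <string.h>

/* Copy the text captured by the first subpattern into out
 * (hypothetical helper). Returns 0 on success, or the non-zero
 * code from regcomp() or regexec() on failure. */
int first_capture(const char *pattern, const char *subject,
                  char *out, size_t outsize)
{
    regex_t re;
    regmatch_t pm[2];     /* pm[0]: whole match, pm[1]: first subpattern */
    int rc;

    assert(pattern != NULL && subject != NULL);
    assert(out != NULL && outsize > 0);

    rc = regcomp(&re, pattern, REG_EXTENDED | REG_ICASE);
    if (rc != 0)
        return rc;

    rc = regexec(&re, subject, 2, pm, 0);
    if (rc == 0) {
        if (pm[1].rm_so == -1) {
            rc = REG_NOMATCH;             /* entry unused: nothing captured */
        } else {
            /* rm_eo is the offset just past the end of the substring */
            size_t len = (size_t)(pm[1].rm_eo - pm[1].rm_so);
            if (len >= outsize)
                len = outsize - 1;        /* truncate to fit the buffer */
            memcpy(out, subject + pm[1].rm_so, len);
            out[len] = '\0';
        }
    }

    regfree(&re);         /* release memory associated with re */
    return rc;
}
```
-
-     For example, with the pattern "the (c.t)" and the subject
-     "The Cat sat", the function yields zero and copies "Cat".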
-
-
-ERROR MESSAGES
-
-     The regerror() function maps a  non-zero  errcode  value  from
-     either  regcomp()  or  regexec()  to a printable message. If
-     preg is not NULL, the error should have arisen from the  use
-     of  that structure. A message terminated by a binary zero is
-     placed in errbuf. The length of the message,  including  the
-     zero,  is  limited to errbuf_size. The yield of the function
-     is the size of buffer needed to hold the whole message.
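-
-     Because the yield is the size needed for the whole  message,
-     a caller can ask for the size first and then allocate. The
-     sketch below does this. It assumes, as POSIX specifies, that
-     a call with errbuf_size set to zero is accepted and only
-     reports the size; it includes the system <regex.h> as a
-     stand-in for pcreposix.h, and the name regex_error_message is
-     invented for the example.
-
```c
#include <assert.h>
#include <regex.h>    /* stand-in; with PCRE use <pcreposix.h> instead */
#include <stdlib.h>

/* Return a malloc'ed copy of the message for errcode, or NULL if no
 * memory is available. The caller frees the result.
 * (Hypothetical helper, not part of the library.) */
char *regex_error_message(int errcode, const regex_t *preg)
{
    size_t needed;
    char *msg;

    assert(errcode != 0);

    /* First call: no buffer, just find out how much space is needed */
    needed = regerror(errcode, preg, NULL, 0);
    if (needed == 0)
        needed = 1;               /* always leave room for the zero byte */

    msg = malloc(needed);
    if (msg == NULL)
        return NULL;
    msg[0] = '\0';

    /* Second call: the buffer is now large enough for the whole message */
    regerror(errcode, preg, msg, needed);
    return msg;
}
```
-
-     A typical use is to call this when regcomp() or regexec()
-     yields a non-zero value other than REG_NOMATCH, print the
-     message, and then free it.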
-
-
-STORAGE
-
-     Compiling a regular expression causes memory to be allocated
-     and  associated  with  the preg structure. The function reg-
-     free() frees all such memory, after which preg may no longer
-     be used as a compiled expression.
-
-
-AUTHOR
-
-     Philip Hazel <ph10@cam.ac.uk>
-     University Computing Service,
-     Cambridge CB2 3QG, England.
-
-Last updated: 03 February 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-
-NAME
-     PCRE - Perl-compatible regular expressions
-
-
-PCRE SAMPLE PROGRAM
-
-     A simple, complete demonstration program, to get you started
-     with  using  PCRE, is supplied in the file pcredemo.c in the
-     PCRE distribution.
-
-     The program compiles the  regular  expression  that  is  its
-     first argument, and matches it against the subject string in
-     its second argument. No PCRE options are  set,  and  default
-     character tables are used. If matching succeeds, the program
-     outputs the portion of the subject  that  matched,  together
-     with the contents of any captured substrings.
-
-     If the -g option is given on the command line,  the  program
-     then  goes on to check for further matches of the same regu-
-     lar expression in the same subject string. The  logic  is  a
-     little  bit tricky because of the possibility of matching an
-     empty string. Comments in the code explain what is going on.
-
-     On a Unix system that has PCRE installed in /usr/local,  you
-     can  compile  the demonstration program using a command like
-     this:
-
-       gcc -o pcredemo pcredemo.c -I/usr/local/include \
-           -L/usr/local/lib -lpcre
-
-     Then you can run simple tests like this:
-
-       ./pcredemo 'cat|dog' 'the cat sat on the mat'
-       ./pcredemo -g 'cat|dog' 'the dog sat on the cat'
-
-     Note that there is a much more comprehensive  test  program,
-     called  pcretest,  which  supports  many more facilities for
-     testing  regular  expressions  and  the  PCRE  library.  The
-     pcredemo program is provided as a simple coding example.
-
-     On some operating systems (e.g.  Solaris)  you  may  get  an
-     error like this when you try to run pcredemo:
-
-       ld.so.1: a.out: fatal: libpcre.so.0: open failed: No such file or directory
-
-     This is caused by the way shared library  support  works  on
-     those systems. You need to add
-
-       -R/usr/local/lib
-
-     to the compile command to get round this problem.
-
-Last updated: 28 January 2003
-Copyright (c) 1997-2003 University of Cambridge.
------------------------------------------------------------------------------
-