Load pcre-4.5 into code/trunk.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@73 2f5784b3-3f2a-0410-8824-cb99058d5e15
author: nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:40:30 +0000
committer: nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:40:30 +0000
commit: 568064fe47fac0c7caffc684e6ea227ad8127b70 (patch)
tree: 8891ffb119dd71e1d00083046876bdb6abbf28f9 /doc/pcretest.txt
parent: 4af6fcff808e079ca1aa09104d6146baa932af47 (diff)
download: pcre-568064fe47fac0c7caffc684e6ea227ad8127b70.tar.gz
1 files changed, 301 insertions, 331 deletions
diff --git a/doc/pcretest.txt b/doc/pcretest.txt
index 4fa9ca4..0e9cd13 100644
--- a/doc/pcretest.txt
+++ b/doc/pcretest.txt
@@ -1,387 +1,357 @@
-NAME
-     pcretest - a program  for  testing  Perl-compatible  regular
-     expressions.
+PCRETEST(1)                                                        PCRETEST(1)
+
 
 
+NAME
+       pcretest - a program for testing Perl-compatible regular expressions.
+
 SYNOPSIS
-     pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source]  [des-
-     tination]
+       pcretest [-d] [-i] [-m] [-o osize] [-p] [-t] [source] [destination]
 
-     pcretest was written as a test program for the PCRE  regular
-     expression  library  itself,  but  it  can  also be used for
-     experimenting  with  regular  expressions.   This   document
-     describes  the  features of the test program; for details of
-     the regular  expressions  themselves,  see  the  pcrepattern
-     documentation.  For details of PCRE and its options, see the
-     pcreapi documentation.
+       pcretest  was written as a test program for the PCRE regular expression
+       library itself, but it can also be used for experimenting with  regular
+       expressions.  This document describes the features of the test program;
+       for details of the regular expressions themselves, see the  pcrepattern
+       documentation.  For  details  of  PCRE and its options, see the pcreapi
+       documentation.
 
 
 OPTIONS
 
 
-     -C        Output the version number of the PCRE library, and
-               all   available  information  about  the  optional
-               features that are included, and then exit.
+       -C        Output the version number of the PCRE library, and all avail-
+                 able   information  about  the  optional  features  that  are
+                 included, and then exit.
 
-     -d        Behave as if each regex had the /D  modifier  (see
-               below); the internal form is output after compila-
-               tion.
+       -d        Behave as if each regex had the /D modifier (see below);  the
+                 internal form is output after compilation.
 
-     -i        Behave as if  each  regex  had  the  /I  modifier;
-               information  about  the  compiled pattern is given
-               after compilation.
+       -i        Behave  as  if  each  regex  had the /I modifier; information
+                 about the compiled pattern is given after compilation.
 
-     -m        Output the size of each compiled pattern after  it
-               has been compiled. This is equivalent to adding /M
-               to each regular expression. For compatibility with
-               earlier  versions of pcretest, -s is a synonym for
-               -m.
+       -m        Output the size of each compiled pattern after  it  has  been
+                 compiled.  This  is  equivalent  to adding /M to each regular
+                 expression.  For  compatibility  with  earlier  versions   of
+                 pcretest, -s is a synonym for -m.
 
-     -o osize  Set the number of elements in  the  output  vector
-               that  is  used  when calling PCRE to be osize. The
-               default value is 45, which is enough for  14  cap-
-               turing  subexpressions.  The  vector  size  can be
-               changed for individual matching calls by including
-               \O in the data line (see below).
+       -o osize  Set  the number of elements in the output vector that is used
+                 when calling PCRE to be osize. The default value is 45, which
+                 is  enough  for  14 capturing subexpressions. The vector size
+                 can be changed for individual matching calls by including  \O
+                 in the data line (see below).
 
-     -p        Behave as if each regex has /P modifier; the POSIX
-               wrapper  API  is  used  to  call PCRE. None of the
-               other options has any effect when -p is set.
+       -p        Behave  as  if  each regex has /P modifier; the POSIX wrapper
+                 API is used to call PCRE. None of the other options  has  any
+                 effect when -p is set.
 
-     -t        Run each compile, study, and match many times with
-               a  timer, and output resulting time per compile or
-               match (in milliseconds). Do not set  -t  with  -m,
-               because  you  will  then get the size output 20000
-               times and the timing will be distorted.
+       -t        Run  each  compile, study, and match many times with a timer,
+                 and output resulting time per compile or match (in  millisec-
+                 onds).  Do  not set -t with -m, because you will then get the
+                 size output 20000 times and the timing will be distorted.
 
 
 DESCRIPTION
 
-     If pcretest is given two filename arguments, it  reads  from
-     the  first and writes to the second. If it is given only one
-     filename argument, it reads from that  file  and  writes  to
-     stdout. Otherwise, it reads from stdin and writes to stdout,
-     and prompts for each line of input, using  "re>"  to  prompt
-     for  regular  expressions,  and  "data>"  to prompt for data
-     lines.
+       If pcretest is given two filename arguments, it reads  from  the  first
+       and writes to the second. If it is given only one filename argument, it
+       reads from that file and writes to stdout.  Otherwise,  it  reads  from
+       stdin  and  writes to stdout, and prompts for each line of input, using
+       "re>" to prompt for regular expressions, and "data>" to prompt for data
+       lines.
 
-     The program handles any number of sets of input on a  single
-     input  file.  Each set starts with a regular expression, and
-     continues with any  number  of  data  lines  to  be  matched
-     against the pattern.
+       The program handles any number of sets of input on a single input file.
+       Each set starts with a regular expression, and continues with any  num-
+       ber of data lines to be matched against the pattern.
 
-     Each line is matched separately and  independently.  If  you
-     want  to  do  multiple-line  matches, you have to use the \n
-     escape sequence in a single line of input to encode the new-
-     line  characters.  The maximum length of data line is 30,000
-     characters.
+       Each  line  is  matched separately and independently. If you want to do
+       multiple-line matches, you have to use the \n escape sequence in a sin-
+       gle  line of input to encode the newline characters. The maximum length
+       of data line is 30,000 characters.
 
-     An empty line signals the end of the data  lines,  at  which
-     point  a new regular expression is read. The regular expres-
-     sions are given enclosed in  any  non-alphameric  delimiters
-     other than backslash, for example
+       An empty line signals the end of the data lines, at which point  a  new
+       regular  expression is read. The regular expressions are given enclosed
+       in any non-alphameric delimiters other than backslash, for example
 
-       /(a|bc)x+yz/
+         /(a|bc)x+yz/
 
-     White space before the initial delimiter is ignored. A regu-
-     lar expression may be continued over several input lines, in
-     which case the newline characters are included within it. It
-     is  possible  to include the delimiter within the pattern by
-     escaping it, for example
+       White space before the initial delimiter is ignored. A regular  expres-
+       sion  may be continued over several input lines, in which case the new-
+       line characters are included within it. It is possible to  include  the
+       delimiter within the pattern by escaping it, for example
 
-       /abc\/def/
+         /abc\/def/
 
-     If you do so, the escape and the delimiter form part of  the
-     pattern,  but  since  delimiters  are always non-alphameric,
-     this does not affect its interpretation.  If the terminating
-     delimiter  is immediately followed by a backslash, for exam-
-     ple,
+       If  you  do  so, the escape and the delimiter form part of the pattern,
+       but since delimiters are always non-alphameric, this  does  not  affect
+       its  interpretation.   If the terminating delimiter is immediately fol-
+       lowed by a backslash, for example,
 
-       /abc/\
+         /abc/\
 
-     then a backslash is added to the end of the pattern. This is
-     done  to  provide  a way of testing the error condition that
-     arises if a pattern finishes with a backslash, because
+       then a backslash is added to the end of the pattern. This  is  done  to
+       provide  a  way of testing the error condition that arises if a pattern
+       finishes with a backslash, because
 
-       /abc\/
+         /abc\/
 
-     is interpreted as the first line of a  pattern  that  starts
-     with  "abc/",  causing  pcretest  to read the next line as a
-     continuation of the regular expression.
+       is interpreted as the first line of a pattern that starts with  "abc/",
+       causing pcretest to read the next line as a continuation of the regular
+       expression.
 
 
 PATTERN MODIFIERS
 
-     The pattern may be followed by i, m, s,  or  x  to  set  the
-     PCRE_CASELESS, PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED
-     options, respectively. For example:
-
-       /caseless/i
-
-     These modifier letters have the same effect as  they  do  in
-     Perl.  There  are  others  that set PCRE options that do not
-     correspond to anything in Perl:  /A, /E, /N, /U, and /X  set
-     PCRE_ANCHORED,   PCRE_DOLLAR_ENDONLY,  PCRE_NO_AUTO_CAPTURE,
-     PCRE_UNGREEDY, and PCRE_EXTRA respectively.
-
-     Searching for  all  possible  matches  within  each  subject
-     string  can  be  requested  by  the /g or /G modifier. After
-     finding  a  match,  PCRE  is  called  again  to  search  the
-     remainder  of  the subject string. The difference between /g
-     and /G is that the former uses the startoffset  argument  to
-     pcre_exec()  to  start  searching  at a new point within the
-     entire string (which is in effect what Perl  does),  whereas
-     the  latter  passes over a shortened substring. This makes a
-     difference to the matching process  if  the  pattern  begins
-     with a lookbehind assertion (including \b or \B).
-
-     If any call to pcre_exec() in a /g or /G sequence matches an
-     empty  string,  the next call is done with the PCRE_NOTEMPTY
-     and PCRE_ANCHORED flags set in order to search for  another,
-     non-empty,  match  at  the same point.  If this second match
-     fails, the start offset is advanced by one, and  the  normal
-     match  is  retried.  This imitates the way Perl handles such
-     cases when using the /g modifier or the split() function.
-
-     There are a number of other modifiers  for  controlling  the
-     way pcretest operates.
-
-     The /+ modifier requests that as well as outputting the sub-
-     string  that  matched the entire pattern, pcretest should in
-     addition output the remainder of the subject string. This is
-     useful  for tests where the subject contains multiple copies
-     of the same substring.
-
-     The /L modifier must be followed directly by the name  of  a
-     locale, for example,
-
-       /pattern/Lfr
-
-     For this reason, it must be the last  modifier  letter.  The
-     given  locale is set, pcre_maketables() is called to build a
-     set of character tables for the locale,  and  this  is  then
-     passed  to pcre_compile() when compiling the regular expres-
-     sion. Without an /L modifier, NULL is passed as  the  tables
-     pointer; that is, /L applies only to the expression on which
-     it appears.
-
-     The /I modifier requests that  pcretest  output  information
-     about the compiled expression (whether it is anchored, has a
-     fixed first character, and so on). It does this  by  calling
-     pcre_fullinfo()  after  compiling an expression, and output-
-     ting the information it gets back. If the  pattern  is  stu-
-     died, the results of that are also output.
-
-     The /D modifier is a  PCRE  debugging  feature,  which  also
-     assumes /I.  It causes the internal form of compiled regular
-     expressions to be output after compilation. If  the  pattern
-     was studied, the information returned is also output.
-
-     The /S modifier causes pcre_study() to be called  after  the
-     expression  has been compiled, and the results used when the
-     expression is matched.
-
-     The /M modifier causes the size of memory block used to hold
-     the compiled pattern to be output.
-
-     The /P modifier causes pcretest to call PCRE via  the  POSIX
-     wrapper  API  rather than its native API. When this is done,
-     all other modifiers except  /i,  /m,  and  /+  are  ignored.
-     REG_ICASE is set if /i is present, and REG_NEWLINE is set if
-     /m    is    present.    The    wrapper    functions    force
-     PCRE_DOLLAR_ENDONLY    always,    and   PCRE_DOTALL   unless
-     REG_NEWLINE is set.
-
-     The /8 modifier  causes  pcretest  to  call  PCRE  with  the
-     PCRE_UTF8  option set. This turns on support for UTF-8 char-
-     acter handling in PCRE, provided that it was  compiled  with
-     this  support  enabled.  This  modifier also causes any non-
-     printing characters in output strings to  be  printed  using
-     the \x{hh...} notation if they are valid UTF-8 sequences.
-
-     If the /? modifier is used with /8, it  causes  pcretest  to
-     call  pcre_compile()  with the PCRE_NO_UTF8_CHECK option, to
-     suppress the checking of the string for UTF-8 validity.
+       The pattern may be followed by i, m, s, or x to set the  PCRE_CASELESS,
+       PCRE_MULTILINE,  PCRE_DOTALL,  or  PCRE_EXTENDED options, respectively.
+       For example:
+
+         /caseless/i
+
+       These modifier letters have the same effect as they do in  Perl.  There
+       are  others that set PCRE options that do not correspond to anything in
+       Perl: /A, /E, /N, /U, and /X  set  PCRE_ANCHORED,  PCRE_DOLLAR_ENDONLY,
+       PCRE_NO_AUTO_CAPTURE, PCRE_UNGREEDY, and PCRE_EXTRA respectively.
+
+       Searching  for  all  possible matches within each subject string can be
+       requested by the /g or /G modifier. After  finding  a  match,  PCRE  is
+       called again to search the remainder of the subject string. The differ-
+       ence between /g and /G is that the former uses the startoffset argument
+       to  pcre_exec()  to  start  searching  at a new point within the entire
+       string (which is in effect what Perl does), whereas the  latter  passes
+       over  a  shortened  substring.  This makes a difference to the matching
+       process if the pattern begins with a lookbehind assertion (including \b
+       or \B).
+
+       If  any  call  to  pcre_exec()  in a /g or /G sequence matches an empty
+       string, the next call is done with the PCRE_NOTEMPTY and  PCRE_ANCHORED
+       flags  set in order to search for another, non-empty, match at the same
+       point.  If this second match fails, the start  offset  is  advanced  by
+       one,  and  the normal match is retried. This imitates the way Perl han-
+       dles such cases when using the /g modifier or the split() function.
+
+       There are a number of other modifiers for controlling the way  pcretest
+       operates.
+
+       The  /+ modifier requests that as well as outputting the substring that
+       matched the entire pattern, pcretest  should  in  addition  output  the
+       remainder  of  the  subject  string. This is useful for tests where the
+       subject contains multiple copies of the same substring.
+
+       The /L modifier must be followed directly by the name of a locale,  for
+       example,
+
+         /pattern/Lfr
+
+       For  this reason, it must be the last modifier letter. The given locale
+       is set, pcre_maketables() is called to build a set of character  tables
+       for  the locale, and this is then passed to pcre_compile() when compil-
+       ing the regular expression. Without an /L modifier, NULL is  passed  as
+       the tables pointer; that is, /L applies only to the expression on which
+       it appears.
+
+       The /I modifier requests that pcretest  output  information  about  the
+       compiled  expression (whether it is anchored, has a fixed first charac-
+       ter, and so on). It does this by calling pcre_fullinfo() after  compil-
+       ing  an expression, and outputting the information it gets back. If the
+       pattern is studied, the results of that are also output.
+
+       The /D modifier is a PCRE debugging feature, which also assumes /I.  It
+       causes  the  internal form of compiled regular expressions to be output
+       after compilation. If the pattern was studied, the information returned
+       is also output.
+
+       The  /S  modifier causes pcre_study() to be called after the expression
+       has been compiled, and the results used when the expression is matched.
+
+       The  /M  modifier causes the size of memory block used to hold the com-
+       piled pattern to be output.
+
+       The /P modifier causes pcretest to call PCRE via the POSIX wrapper  API
+       rather  than  its  native  API.  When this is done, all other modifiers
+       except /i, /m, and /+ are ignored. REG_ICASE is set if /i  is  present,
+       and  REG_NEWLINE  is  set if /m is present. The wrapper functions force
+       PCRE_DOLLAR_ENDONLY always, and PCRE_DOTALL unless REG_NEWLINE is  set.
+
+       The  /8 modifier causes pcretest to call PCRE with the PCRE_UTF8 option
+       set. This turns on support for UTF-8 character handling in  PCRE,  pro-
+       vided  that  it  was  compiled with this support enabled. This modifier
+       also causes any non-printing characters in output strings to be printed
+       using the \x{hh...} notation if they are valid UTF-8 sequences.
+
+       If  the  /?  modifier  is  used  with  /8,  it  causes pcretest to call
+       pcre_compile() with the  PCRE_NO_UTF8_CHECK  option,  to  suppress  the
+       checking of the string for UTF-8 validity.
 
 
 CALLOUTS
 
-     If the pattern contains  any  callout  requests,  pcretest's
-     callout function will be called. By default, it displays the
-     callout number, and the start and current positions  in  the
-     text at the callout time. For example, the output
+       If  the pattern contains any callout requests, pcretest's callout func-
+       tion will be called. By default, it displays the  callout  number,  and
+       the  start  and  current positions in the text at the callout time. For
+       example, the output
 
-       --->pqrabcdef
-         0    ^  ^
+         --->pqrabcdef
+           0    ^  ^
 
-     indicates that callout number 0 occurred for a match attempt
-     starting at the fourth character of the subject string, when
-     the pointer was at the seventh character. The callout  func-
-     tion returns zero (carry on matching) by default.
+       indicates that callout number 0 occurred for a match  attempt  starting
+       at  the fourth character of the subject string, when the pointer was at
+       the seventh character. The callout  function  returns  zero  (carry  on
+       matching) by default.
 
-     Inserting callouts may be helpful  when  using  pcretest  to
-     check  complicated regular expressions. For further informa-
-     tion about callouts, see the pcrecallout documentation.
+       Inserting  callouts may be helpful when using pcretest to check compli-
+       cated regular expressions. For further information about callouts,  see
+       the pcrecallout documentation.
 
-     For testing the PCRE library, additional control of  callout
-     behaviour  is available via escape sequences in the data, as
-     described in the following section.  In  particular,  it  is
-     possible to pass in a number as callout data (the default is
-     zero). If the callout function receives a  non-zero  number,
-     it returns that value instead of zero.
+       For  testing  the PCRE library, additional control of callout behaviour
+       is available via escape sequences in the data, as described in the fol-
+       lowing  section.  In  particular, it is possible to pass in a number as
+       callout data (the default is zero). If the callout function receives  a
+       non-zero number, it returns that value instead of zero.
 
 
 DATA LINES
 
-     Before each data line is passed to pcre_exec(), leading  and
-     trailing whitespace is removed, and it is then scanned for \
-     escapes.  Some  of  these  are  pretty  esoteric   features,
-     intended  for  checking  out  some  of  the more complicated
-     features of PCRE. If you are just testing "ordinary" regular
-     expressions,  you probably don't need any of these. The fol-
-     lowing escapes are recognized:
-
-       \a         alarm (= BEL)
-       \b         backspace
-       \e         escape
-       \f         formfeed
-       \n         newline
-       \r         carriage return
-       \t         tab
-       \v         vertical tab
-       \nnn       octal character (up to 3 octal digits)
-       \xhh       hexadecimal character (up to 2 hex digits)
-       \x{hh...}  hexadecimal character, any number of digits
-                    in UTF-8 mode
-       \A         pass the PCRE_ANCHORED option to pcre_exec()
-       \B         pass the PCRE_NOTBOL option to pcre_exec()
-       \Cdd       call pcre_copy_substring() for substring dd
-                    after a successful match (any decimal number
-                    less than 32)
-       \Cname     call pcre_copy_named_substring() for substring
-
-                    "name" after a successful match (name termin-
-                    ated by next non alphanumeric character)
-       \C+        show the current captured substrings at callout
-                    time
-       \C-        do not supply a callout function
-       \C!n       return 1 instead of 0 when callout number n is
-                    reached
-       \C!n!m     return 1 instead of 0 when callout number n is
-                    reached for the nth time
-       \C*n       pass the number n (may be negative) as callout
-                    data
-       \Gdd       call pcre_get_substring() for substring dd
-                    after a successful match (any decimal number
-                    less than 32)
-       \Gname     call pcre_get_named_substring() for substring
-                    "name" after a successful match (name termin-
-                    ated by next non-alphanumeric character)
-       \L         call pcre_get_substringlist() after a
-                    successful match
-       \M         discover the minimum MATCH_LIMIT setting
-       \N         pass the PCRE_NOTEMPTY option to pcre_exec()
-       \Odd       set the size of the output vector passed to
-                    pcre_exec() to dd (any number of decimal
-                    digits)
-       \Z         pass the PCRE_NOTEOL option to pcre_exec()
-       \?         pass the PCRE_NO_UTF8_CHECK option to
-                    pcre_exec()
-
-     If \M is present, pcretest calls pcre_exec() several  times,
-     with  different  values  in  the  match_limit  field  of the
-     pcre_extra data structure, until it finds the minimum number
-     that is needed for pcre_exec() to complete. This number is a
-     measure of the amount of  recursion  and  backtracking  that
-     takes  place,  and  checking  it out can be instructive. For
-     most simple matches, the number is quite small, but for pat-
-     terns  with very large numbers of matching possibilities, it
-     can become large very quickly with increasing length of sub-
-     ject string.
-
-     When \O is used, it may be higher or lower than the size set
-     by  the  -O  option (or defaulted to 45); \O applies only to
-     the call of pcre_exec() for the line in which it appears.
-
-     A backslash followed by anything else just escapes the  any-
-     thing else. If the very last character is a backslash, it is
-     ignored. This gives a way of passing an empty line as  data,
-     since a real empty line terminates the data input.
-
-     If /P was present on the regex, causing  the  POSIX  wrapper
-     API  to  be  used,  only  B,  and Z have any effect, causing
-     REG_NOTBOL and REG_NOTEOL to be passed to regexec()  respec-
-     tively.
-     The use of \x{hh...} to represent UTF-8  characters  is  not
-     dependent  on  the use of the /8 modifier on the pattern. It
-     is recognized always. There may be any number of hexadecimal
-     digits  inside  the  braces.  The  result is from one to six
-     bytes, encoded according to the UTF-8 rules.
+       Before  each  data  line is passed to pcre_exec(), leading and trailing
+       whitespace is removed, and it is then scanned for \  escapes.  Some  of
+       these  are  pretty esoteric features, intended for checking out some of
+       the more complicated features of PCRE. If you are just  testing  "ordi-
+       nary"  regular  expressions,  you probably don't need any of these. The
+       following escapes are recognized:
+
+         \a         alarm (= BEL)
+         \b         backspace
+         \e         escape
+         \f         formfeed
+         \n         newline
+         \r         carriage return
+         \t         tab
+         \v         vertical tab
+         \nnn       octal character (up to 3 octal digits)
+         \xhh       hexadecimal character (up to 2 hex digits)
+         \x{hh...}  hexadecimal character, any number of digits
+                      in UTF-8 mode
+         \A         pass the PCRE_ANCHORED option to pcre_exec()
+         \B         pass the PCRE_NOTBOL option to pcre_exec()
+         \Cdd       call pcre_copy_substring() for substring dd
+                      after a successful match (any decimal number
+                      less than 32)
+         \Cname     call pcre_copy_named_substring() for substring
+                      "name" after a successful match (name termin-
+                      ated by next non alphanumeric character)
+         \C+        show the current captured substrings at callout
+                      time
+         \C-        do not supply a callout function
+         \C!n       return 1 instead of 0 when callout number n is
+                      reached
+         \C!n!m     return 1 instead of 0 when callout number n is
+                      reached for the nth time
+         \C*n       pass the number n (may be negative) as callout
+                      data
+         \Gdd       call pcre_get_substring() for substring dd
+                      after a successful match (any decimal number
+                      less than 32)
+         \Gname     call pcre_get_named_substring() for substring
+                      "name" after a successful match (name termin-
+                      ated by next non-alphanumeric character)
+         \L         call pcre_get_substringlist() after a
+                      successful match
+         \M         discover the minimum MATCH_LIMIT setting
+         \N         pass the PCRE_NOTEMPTY option to pcre_exec()
+         \Odd       set the size of the output vector passed to
+                      pcre_exec() to dd (any number of decimal
+                      digits)
+         \S         output details of memory get/free calls during matching
+         \Z         pass the PCRE_NOTEOL option to pcre_exec()
+         \?         pass the PCRE_NO_UTF8_CHECK option to
+                      pcre_exec()
+
+       If \M is present, pcretest calls pcre_exec() several times,  with  dif-
+       ferent  values  in  the match_limit field of the pcre_extra data struc-
+       ture, until it finds the minimum number that is needed for  pcre_exec()
+       to  complete.  This  number is a measure of the amount of recursion and
+       backtracking that takes place, and checking it out can be  instructive.
+       For  most  simple  matches, the number is quite small, but for patterns
+       with very large numbers of matching possibilities, it can become  large
+       very quickly with increasing length of subject string.
+
+       When  \O is used, it may be higher or lower than the size set by the -O
+       option (or defaulted to 45); \O applies only to the call of pcre_exec()
+       for the line in which it appears.
+
+       A  backslash  followed by anything else just escapes the anything else.
+       If the very last character is a backslash, it is ignored. This gives  a
+       way  of  passing  an empty line as data, since a real empty line termi-
+       nates the data input.
+
+       If /P was present on the regex, causing the POSIX  wrapper  API  to  be
+       used,  only  0  causing  REG_NOTBOL  and  REG_NOTEOL  to  be  passed to
+       regexec() respectively.
+
+       The use of \x{hh...} to represent UTF-8 characters is not dependent  on
+       the  use  of  the  /8 modifier on the pattern. It is recognized always.
+       There may be any number of hexadecimal digits inside  the  braces.  The
+       result  is from one to six bytes, encoded according to the UTF-8 rules.
 
 
 OUTPUT FROM PCRETEST
 
-     When a match succeeds, pcretest outputs the list of captured
-     substrings  that pcre_exec() returns, starting with number 0
-     for the string that matched the whole pattern.  Here  is  an
-     example of an interactive pcretest run.
-
-       $ pcretest
-       PCRE version 4.00 08-Jan-2003
-
-         re> /^abc(\d+)/
-       data> abc123
-        0: abc123
-        1: 123
-       data> xyz
-       No match
-
-     If the strings contain any non-printing characters, they are
-     output  as  \0x  escapes,  or  as  \x{...} escapes if the /8
-     modifier was present on the pattern. If the pattern has  the
-     /+  modifier, then the output for substring 0 is followed by
-     the the rest of the subject string, identified by "0+"  like
-     this:
-
-         re> /cat/+
-       data> cataract
-        0: cat
-        0+ aract
-
-     If the pattern has the /g or /G  modifier,  the  results  of
-     successive  matching  attempts  are output in sequence, like
-     this:
-
-         re> /\Bi(\w\w)/g
-       data> Mississippi
-        0: iss
-        1: ss
-        0: iss
-        1: ss
-        0: ipp
-        1: pp
-
-     "No match" is output only if the first match attempt fails.
-
-     If any of the sequences \C, \G, or \L are present in a  data
-     line  that is successfully matched, the substrings extracted
-     by the convenience functions are output  with  C,  G,  or  L
-     after the string number instead of a colon. This is in addi-
-     tion to the normal full list. The string  length  (that  is,
-     the  return  from  the  extraction  function)  is  given  in
-     parentheses after each string for \C and \G.
-
-     Note that while patterns can be continued over several lines
-     (a  plain  ">" prompt is used for continuations), data lines
-     may not. However newlines can be included in data  by  means
-     of the \n escape.
+       When a match succeeds, pcretest outputs the list of captured substrings
+       that  pcre_exec()  returns,  starting with number 0 for the string that
+       matched the whole  pattern.  Here  is  an  example  of  an  interactive
+       pcretest run.
+
+         $ pcretest
+         PCRE version 4.00 08-Jan-2003
+
+           re> /^abc(\d+)/
+         data> abc123
+          0: abc123
+          1: 123
+         data> xyz
+         No match
+
+       If  the strings contain any non-printing characters, they are output as
+       \0x escapes, or as \x{...} escapes if the /8 modifier  was  present  on
+       the  pattern.  If  the pattern has the /+ modifier, then the output for
+       substring 0 is followed by the the rest of the subject string,  identi-
+       fied by "0+" like this:
+
+           re> /cat/+
+         data> cataract
+          0: cat
+          0+ aract
+
+       If  the  pattern  has  the /g or /G modifier, the results of successive
+       matching attempts are output in sequence, like this:
+
+           re> /\Bi(\w\w)/g
+         data> Mississippi
+          0: iss
+          1: ss
+          0: iss
+          1: ss
+          0: ipp
+          1: pp
+
+       "No match" is output only if the first match attempt fails.
+
+       If any of the sequences \C, \G, or \L are present in a data  line  that
+       is  successfully  matched,  the substrings extracted by the convenience
+       functions are output with C, G, or L after the string number instead of
+       a colon. This is in addition to the normal full list. The string length
+       (that is, the return from the extraction function) is given  in  paren-
+       theses after each string for \C and \G.
+
+       Note  that  while patterns can be continued over several lines (a plain
+       ">" prompt is used for continuations), data lines may not. However new-
+       lines can be included in data by means of the \n escape.
 
 
 AUTHOR
 
-     Philip Hazel <ph10@cam.ac.uk>
-     University Computing Service,
-     Cambridge CB2 3QG, England.
+       Philip Hazel <ph10@cam.ac.uk>
+       University Computing Service,
+       Cambridge CB2 3QG, England.
 
-Last updated: 20 August 2003
+Last updated: 09 December 2003
 Copyright (c) 1997-2003 University of Cambridge.
author	nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15>	2007-02-24 21:40:30 +0000
committer	nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15>	2007-02-24 21:40:30 +0000
commit	568064fe47fac0c7caffc684e6ea227ad8127b70 (patch)
tree	8891ffb119dd71e1d00083046876bdb6abbf28f9 /doc/pcretest.txt
parent	4af6fcff808e079ca1aa09104d6146baa932af47 (diff)
download	pcre-568064fe47fac0c7caffc684e6ea227ad8127b70.tar.gz