Load pcre-2.03 into code/trunk.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@29 2f5784b3-3f2a-0410-8824-cb99058d5e15
author: nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:38:53 +0000
committer: nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:38:53 +0000
commit: 7703eae0f55edaff9f482fa8d23a6910d5d18577 (patch)
tree: 83aa003e890adb9ef5e1968d02febf0256cf61ac /README
parent: 0c8732c8583c7e31476c0ec1c0ac92cc7e5f8bc0 (diff)
download: pcre-7703eae0f55edaff9f482fa8d23a6910d5d18577.tar.gz
1 files changed, 47 insertions, 13 deletions
diff --git a/README b/README
index e169e46..29fc714 100644
--- a/README
+++ b/README
@@ -21,6 +21,7 @@ README file for PCRE (Perl-compatible regular expressions)
 The distribution should contain the following files:
 
   ChangeLog         log of changes to the code
+  LICENCE           conditions for the use of PCRE
   Makefile          for building PCRE
   README            this file
   RunTest           a shell script for running tests
@@ -28,6 +29,7 @@ The distribution should contain the following files:
   pcre.3            man page for the functions
   pcreposix.3       man page for the POSIX wrapper API
   dftables.c        auxiliary program for building chartables.c
+  get.c             )
   maketables.c      )
   study.c           ) source of
   pcre.c            )   the functions
@@ -69,8 +71,9 @@ additional features of release 5.005, which is why it is kept separate from the
 main test input, which needs only Perl 5.004. In the long run, when 5.005 is
 widespread, these two test files may get amalgamated.
 
-The second set of tests check pcre_info(), pcre_study(), error detection and
-run-time flags that are specific to PCRE, as well as the POSIX wrapper API.
+The second set of tests check pcre_info(), pcre_study(), pcre_copy_substring(),
+pcre_get_substring(), pcre_get_substring_list(), error detection and run-time
+flags that are specific to PCRE, as well as the POSIX wrapper API.
 
 The fourth set of tests checks pcre_maketables(), the facility for building a
 set of character tables for a specific locale and using them instead of the
@@ -157,13 +160,36 @@ The program handles any number of sets of input on a single input file. Each
 set starts with a regular expression, and continues with any number of data
 lines to be matched against the pattern. An empty line signals the end of the
 set. The regular expressions are given enclosed in any non-alphameric
-delimiters, for example
+delimiters other than backslash, for example
 
   /(a|bc)x+yz/
 
-and may be followed by i, m, s, or x to set the PCRE_CASELESS, PCRE_MULTILINE,
-PCRE_DOTALL, or PCRE_EXTENDED options, respectively. These options have the
-same effect as they do in Perl.
+White space before the initial delimiter is ignored. A regular expression may
+be continued over several input lines, in which case the newline characters are
+included within it. See the testinput files for many examples. It is possible
+to include the delimiter within the pattern by escaping it, for example
+
+  /abc\/def/
+
+If you do so, the escape and the delimiter form part of the pattern, but since
+delimiters are always non-alphameric, this does not affect its interpretation.
+If the terminating delimiter is immediately followed by a backslash, for
+example,
+
+  /abc/\
+
+then a backslash is added to the end of the pattern. This provides a way of
+testing the error condition that arises if a pattern finishes with a backslash,
+because
+
+  /abc\/
+
+is interpreted as the first line of a pattern that starts with "abc/", causing
+pcretest to read the next line as a continuation of the regular expression.
+
+The pattern may be followed by i, m, s, or x to set the PCRE_CASELESS,
+PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED options, respectively. These
+options have the same effect as they do in Perl.
 
 There are also some upper case options that do not match Perl options: /A, /E,
 and /X set PCRE_ANCHORED, PCRE_DOLLAR_ENDONLY, and PCRE_EXTRA respectively.
@@ -196,9 +222,6 @@ rather than its native API. When this is done, all other options except /i and
 is present. The wrapper functions force PCRE_DOLLAR_ENDONLY always, and
 PCRE_DOTALL unless REG_NEWLINE is set.
 
-A regular expression can extend over several lines of input; the newlines are
-included in it. See the testinput files for many examples.
-
 Before each data line is passed to pcre_exec(), leading and trailing whitespace
 is removed, and it is then scanned for \ escapes. The following are recognized:
 
@@ -215,6 +238,11 @@ is removed, and it is then scanned for \ escapes. The following are recognized:
 
   \A     pass the PCRE_ANCHORED option to pcre_exec()
   \B     pass the PCRE_NOTBOL option to pcre_exec()
+  \Cdd   call pcre_copy_substring() for substring dd after a successful match
+           (any decimal number less than 32)
+  \Gdd   call pcre_get_substring() for substring dd after a successful match
+           (any decimal number less than 32)
+  \L     call pcre_get_substring_list() after a successful match
   \Odd   set the size of the output vector passed to pcre_exec() to dd
            (any number of decimal digits)
   \Z     pass the PCRE_NOTEOL option to pcre_exec()
@@ -227,7 +255,7 @@ If /P was present on the regex, causing the POSIX wrapper API to be used, only
 \B, and \Z have any effect, causing REG_NOTBOL and REG_NOTEOL to be passed to
 regexec() respectively.
 
-When a match succeeds, pcretest outputs the list of identified substrings that
+When a match succeeds, pcretest outputs the list of captured substrings that
 pcre_exec() returns, starting with number 0 for the string that matched the
 whole pattern. Here is an example of an interactive pcretest run.
 
@@ -242,6 +270,12 @@ whole pattern. Here is an example of an interactive pcretest run.
   data> xyz
   No match
 
+If any of \C, \G, or \L are present in a data line that is successfully
+matched, the substrings extracted by the convenience functions are output with
+C, G, or L after the string number instead of a colon. This is in addition to
+the normal full list. The string length (that is, the return from the
+extraction function) is given in parentheses after each string for \C and \G.
+
 Note that while patterns can be continued over several lines (a plain ">"
 prompt is used for continuations), data lines may not. However newlines can be
 included in data by means of the \n escape.
@@ -260,10 +294,10 @@ compilation.
 If the option -s is given to pcretest, it outputs the size of each compiled
 pattern after it has been compiled.
 
-If the -t option is given, each compile, study, and match is run 10000 times
+If the -t option is given, each compile, study, and match is run 20000 times
 while being timed, and the resulting time per compile or match is output in
 milliseconds. Do not set -t with -s, because you will then get the size output
-10000 times and the timing will be distorted. If you want to change the number
+20000 times and the timing will be distorted. If you want to change the number
 of repetitions used for timing, edit the definition of LOOPREPEAT at the top of
 pcretest.c
 
@@ -291,4 +325,4 @@ contains malformed regular expressions, in order to check that PCRE diagnoses
 them correctly.
 
 Philip Hazel <ph10@cam.ac.uk>
-January 1999
+February 1999
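
The delimiter rules this patch adds to the README (leading white space
ignored, an escaped delimiter kept in the pattern, a backslash right after
the closing delimiter appended to the pattern) can be sketched in C as below.
This is only an illustration of the documented behaviour, not the code in
pcretest.c: the helper name split_pattern(), its return codes, and the
single-buffer interface are assumptions made for the example. Continuation
lines are not read here; the helper just reports that more input is needed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical return codes for this sketch; they are not part of PCRE. */
enum {
    SPLIT_OK = 0,
    SPLIT_BAD_DELIMITER = -1,   /* alphameric, backslash, or empty line */
    SPLIT_NEEDS_MORE = -2,      /* no closing delimiter on this input */
    SPLIT_TOO_LONG = -3         /* pattern does not fit in the buffer */
};

/*
 * Split one delimited regular expression out of `line`.
 * On success the pattern text (without delimiters) is copied into
 * `pattern`, and `*options` points at whatever follows it (i, m, s, x, ...).
 */
int split_pattern(const char *line, char *pattern, size_t size,
                  const char **options)
{
    assert(line != NULL && pattern != NULL && options != NULL);

    /* White space before the initial delimiter is ignored. */
    while (isspace((unsigned char)*line))
        line++;

    /* Any non-alphameric character other than backslash may delimit. */
    char delimiter = *line;
    if (delimiter == '\0' || delimiter == '\\' ||
        isalnum((unsigned char)delimiter))
        return SPLIT_BAD_DELIMITER;

    const char *start = line + 1;
    const char *p = start;
    while (*p != '\0') {
        if (*p == '\\' && p[1] != '\0')
            p++;                    /* escaped character: step over it */
        else if (*p == delimiter)
            break;                  /* unescaped delimiter ends the pattern */
        p++;
    }
    if (*p == '\0')
        return SPLIT_NEEDS_MORE;    /* caller would read a continuation line */

    /* The escape and the escaped delimiter both stay in the pattern. */
    size_t length = (size_t)(p - start);

    /* A backslash right after the closing delimiter is added to the end. */
    int add_backslash = (p[1] == '\\');
    if (length + (size_t)add_backslash + 1 > size)
        return SPLIT_TOO_LONG;

    memcpy(pattern, start, length);
    if (add_backslash)
        pattern[length++] = '\\';
    pattern[length] = '\0';

    *options = p + 1 + add_backslash;
    return SPLIT_OK;
}
```

Given the README's own examples, the input /abc\/def/ yields the pattern
abc\/def, the input /abc/\ yields abc followed by a backslash, and the input
/abc\/ is reported as incomplete because the escaped slash does not close the
pattern.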