Documentation and general text tidies in preparation for test release.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@654 2f5784b3-3f2a-0410-8824-cb99058d5e15
author: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-08-02 11:00:40 +0000
committer: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-08-02 11:00:40 +0000
commit: 9c65843dde6af3b331acdf8518a6020df32f45af (patch)
tree: f4938ee9a3d4ca4b7282f86370a5a39875a3a562 /doc/pcretest.txt
parent: 2c1db477501a36945e05bc50a1d563c96c4e13f4 (diff)
1 files changed, 202 insertions, 133 deletions
diff --git a/doc/pcretest.txt b/doc/pcretest.txt
index 7f67d6f..a7c42fa 100644
--- a/doc/pcretest.txt
+++ b/doc/pcretest.txt
@@ -7,26 +7,30 @@ NAME
 
 SYNOPSIS
 
-       pcretest [options] [source] [destination]
+       pcretest [options] [input file [output file]]
 
        pcretest  was written as a test program for the PCRE regular expression
        library itself, but it can also be used for experimenting with  regular
        expressions.  This document describes the features of the test program;
        for details of the regular expressions themselves, see the  pcrepattern
        documentation. For details of the PCRE library function calls and their
-       options, see the pcreapi documentation.
+       options, see the pcreapi documentation. The input  for  pcretest  is  a
+       sequence  of  regular expression patterns and strings to be matched, as
+       described below. The output shows the result of each match. Options  on
+       the command line and the patterns control PCRE options and exactly what
+       is output.
 
 
-OPTIONS
+COMMAND LINE OPTIONS
 
-       -b        Behave as if each regex has the /B (show bytecode)  modifier;
-                 the internal form is output after compilation.
+       -b        Behave as if each pattern has the /B (show byte  code)  modi-
+                 fier; the internal form is output after compilation.
 
        -C        Output the version number of the PCRE library, and all avail-
                  able  information  about  the  optional  features  that   are
                  included, and then exit.
 
-       -d        Behave  as  if  each  regex  has the /D (debug) modifier; the
+       -d        Behave  as  if  each pattern has the /D (debug) modifier; the
                  internal form and information about the compiled  pattern  is
                  output after compilation; -d is equivalent to -b -i.
 
@@ -37,7 +41,7 @@ OPTIONS
 
        -help     Output a brief summary these options and then exit.
 
-       -i        Behave as if each regex  has  the  /I  modifier;  information
+       -i        Behave as if each pattern has the  /I  modifier;  information
                  about the compiled pattern is given after compilation.
 
        -M        Behave  as if each data line contains the \M escape sequence;
@@ -47,33 +51,52 @@ OPTIONS
 
        -m        Output the size of each compiled pattern after  it  has  been
                  compiled.  This  is  equivalent  to adding /M to each regular
-                 expression.  For  compatibility  with  earlier  versions   of
-                 pcretest, -s is a synonym for -m.
-
-       -o osize  Set  the number of elements in the output vector that is used
-                 when calling pcre_exec() or pcre_dfa_exec() to be osize.  The
-                 default  value is 45, which is enough for 14 capturing subex-
-                 pressions  for  pcre_exec()  or  22  different  matches   for
-                 pcre_dfa_exec().  The vector size can be changed for individ-
-                 ual matching calls by including \O  in  the  data  line  (see
+                 expression.
+
+       -o osize  Set the number of elements in the output vector that is  used
+                 when  calling pcre_exec() or pcre_dfa_exec() to be osize. The
+                 default value is 45, which is enough for 14 capturing  subex-
+                 pressions   for  pcre_exec()  or  22  different  matches  for
+                 pcre_dfa_exec(). The vector size can be changed for  individ-
+                 ual  matching  calls  by  including  \O in the data line (see
                  below).
 
-       -p        Behave  as if each regex has the /P modifier; the POSIX wrap-
-                 per API is used to call PCRE. None of the other  options  has
-                 any effect when -p is set.
+       -p        Behave as if each pattern has  the  /P  modifier;  the  POSIX
+                 wrapper  API  is used to call PCRE. None of the other options
+                 has any effect when -p is set.
 
-       -q        Do  not output the version number of pcretest at the start of
+       -q        Do not output the version number of pcretest at the start  of
                  execution.
 
-       -S size   On Unix-like systems, set the size of the  runtime  stack  to
+       -S size   On  Unix-like  systems, set the size of the run-time stack to
                  size megabytes.
 
-       -t        Run  each  compile, study, and match many times with a timer,
-                 and output resulting time per compile or match (in  millisec-
-                 onds).  Do  not set -m with -t, because you will then get the
-                 size output a zillion times, and  the  timing  will  be  dis-
-                 torted.  You  can  control  the number of iterations that are
-                 used for timing by following -t with a number (as a  separate
+       -s        Behave as if each pattern  has  the  /S  modifier;  in  other
+                 words,  force  each  pattern  to  be studied. If the /I or /D
+                 option is present on a pattern (requesting output  about  the
+                 compiled  pattern),  information about the result of studying
+                 is not included when studying is caused only by -s  and  nei-
+                 ther -i nor -d is present on the command line. This behaviour
+                 means that the output from tests that are run with and  with-
+                 out  -s  should be identical, except when options that output
+                 information about the actual running of a match are set.  The
+                 -M,  -t,  and  -tm  options,  which  give  information  about
+                 resources used, are likely to produce different  output  with
+                 and  without  -s.  Output may also differ if the /C option is
+                 present on an individual pattern. This uses callouts to trace
+                 the  the  matching process, and this may be different between
+                 studied and non-studied patterns.  If  the  pattern  contains
+                 (*MARK)  items  there  may  also be differences, for the same
+                 reason. The -s command line option can be overridden for spe-
+                 cific  patterns  that  should  never  be  studied (see the /S
+                 option below).
+
+       -t        Run each compile, study, and match many times with  a  timer,
+                 and  output resulting time per compile or match (in millisec-
+                 onds). Do not set -m with -t, because you will then  get  the
+                 size  output  a  zillion  times,  and the timing will be dis-
+                 torted. You can control the number  of  iterations  that  are
+                 used  for timing by following -t with a number (as a separate
                  item on the command line). For example, "-t 1000" would iter-
                  ate 1000 times. The default is to iterate 500000 times.
 
@@ -83,78 +106,78 @@ OPTIONS
 
 DESCRIPTION
 
-       If  pcretest  is  given two filename arguments, it reads from the first
+       If pcretest is given two filename arguments, it reads  from  the  first
        and writes to the second. If it is given only one filename argument, it
-       reads  from  that  file  and writes to stdout. Otherwise, it reads from
-       stdin and writes to stdout, and prompts for each line of  input,  using
+       reads from that file and writes to stdout.  Otherwise,  it  reads  from
+       stdin  and  writes to stdout, and prompts for each line of input, using
        "re>" to prompt for regular expressions, and "data>" to prompt for data
        lines.
 
-       When pcretest is built, a configuration  option  can  specify  that  it
-       should  be  linked  with the libreadline library. When this is done, if
+       When  pcretest  is  built,  a  configuration option can specify that it
+       should be linked with the libreadline library. When this  is  done,  if
        the input is from a terminal, it is read using the readline() function.
-       This  provides line-editing and history facilities. The output from the
+       This provides line-editing and history facilities. The output from  the
        -help option states whether or not readline() will be used.
 
        The program handles any number of sets of input on a single input file.
-       Each  set starts with a regular expression, and continues with any num-
+       Each set starts with a regular expression, and continues with any  num-
        ber of data lines to be matched against the pattern.
 
-       Each data line is matched separately and independently. If you want  to
+       Each  data line is matched separately and independently. If you want to
        do multi-line matches, you have to use the \n escape sequence (or \r or
        \r\n, etc., depending on the newline setting) in a single line of input
-       to  encode  the  newline  sequences. There is no limit on the length of
-       data lines; the input buffer is automatically extended  if  it  is  too
+       to encode the newline sequences. There is no limit  on  the  length  of
+       data  lines;  the  input  buffer is automatically extended if it is too
        small.
 
-       An  empty  line signals the end of the data lines, at which point a new
-       regular expression is read. The regular expressions are given  enclosed
+       An empty line signals the end of the data lines, at which point  a  new
+       regular  expression is read. The regular expressions are given enclosed
        in any non-alphanumeric delimiters other than backslash, for example:
 
          /(a|bc)x+yz/
 
-       White  space before the initial delimiter is ignored. A regular expres-
-       sion may be continued over several input lines, in which case the  new-
-       line  characters  are included within it. It is possible to include the
+       White space before the initial delimiter is ignored. A regular  expres-
+       sion  may be continued over several input lines, in which case the new-
+       line characters are included within it. It is possible to  include  the
        delimiter within the pattern by escaping it, for example
 
          /abc\/def/
 
-       If you do so, the escape and the delimiter form part  of  the  pattern,
-       but  since delimiters are always non-alphanumeric, this does not affect
-       its interpretation.  If the terminating delimiter is  immediately  fol-
+       If  you  do  so, the escape and the delimiter form part of the pattern,
+       but since delimiters are always non-alphanumeric, this does not  affect
+       its  interpretation.   If the terminating delimiter is immediately fol-
        lowed by a backslash, for example,
 
          /abc/\
 
-       then  a  backslash  is added to the end of the pattern. This is done to
-       provide a way of testing the error condition that arises if  a  pattern
+       then a backslash is added to the end of the pattern. This  is  done  to
+       provide  a  way of testing the error condition that arises if a pattern
        finishes with a backslash, because
 
          /abc\/
 
-       is  interpreted as the first line of a pattern that starts with "abc/",
+       is interpreted as the first line of a pattern that starts with  "abc/",
        causing pcretest to read the next line as a continuation of the regular
        expression.
 
 
 PATTERN MODIFIERS
 
-       A  pattern may be followed by any number of modifiers, which are mostly
-       single characters. Following Perl usage, these are  referred  to  below
-       as,  for  example,  "the /i modifier", even though the delimiter of the
-       pattern need not always be a slash, and no slash is used  when  writing
-       modifiers.  Whitespace  may  appear between the final pattern delimiter
+       A pattern may be followed by any number of modifiers, which are  mostly
+       single  characters.  Following  Perl usage, these are referred to below
+       as, for example, "the /i modifier", even though the  delimiter  of  the
+       pattern  need  not always be a slash, and no slash is used when writing
+       modifiers. White space may appear between the final  pattern  delimiter
        and the first modifier, and between the modifiers themselves.
 
        The /i, /m, /s, and /x modifiers set the PCRE_CASELESS, PCRE_MULTILINE,
-       PCRE_DOTALL,  or  PCRE_EXTENDED  options,  respectively, when pcre_com-
-       pile() is called. These four modifier letters have the same  effect  as
+       PCRE_DOTALL, or PCRE_EXTENDED  options,  respectively,  when  pcre_com-
+       pile()  is  called. These four modifier letters have the same effect as
        they do in Perl. For example:
 
          /caseless/i
 
-       The  following  table  shows additional modifiers for setting PCRE com-
+       The following table shows additional modifiers for  setting  PCRE  com-
        pile-time options that do not correspond to anything in Perl:
 
          /8              PCRE_UTF8
@@ -178,48 +201,59 @@ PATTERN MODIFIERS
          /<bsr_anycrlf>  PCRE_BSR_ANYCRLF
          /<bsr_unicode>  PCRE_BSR_UNICODE
 
-       The modifiers that are enclosed in angle brackets are  literal  strings
-       as  shown,  including  the  angle  brackets,  but the letters can be in
-       either case. This example sets multiline matching with CRLF as the line
-       ending sequence:
+       The  modifiers  that are enclosed in angle brackets are literal strings
+       as shown, including the angle brackets, but the letters within  can  be
+       in  either case.  This example sets multiline matching with CRLF as the
+       line ending sequence:
 
-         /^abc/m<crlf>
+         /^abc/m<CRLF>
 
        As well as turning on the PCRE_UTF8 option, the /8 modifier also causes
-       any non-printing characters in output strings to be printed  using  the
-       \x{hh...}  notation  if they are valid UTF-8 sequences. Full details of
+       any  non-printing  characters in output strings to be printed using the
+       \x{hh...} notation if they are valid UTF-8 sequences. Full  details  of
        the PCRE options are given in the pcreapi documentation.
 
    Finding all matches in a string
 
-       Searching for all possible matches within each subject  string  can  be
-       requested  by  the  /g  or  /G modifier. After finding a match, PCRE is
+       Searching  for  all  possible matches within each subject string can be
+       requested by the /g or /G modifier. After  finding  a  match,  PCRE  is
        called again to search the remainder of the subject string. The differ-
        ence between /g and /G is that the former uses the startoffset argument
-       to pcre_exec() to start searching at a  new  point  within  the  entire
-       string  (which  is in effect what Perl does), whereas the latter passes
-       over a shortened substring. This makes a  difference  to  the  matching
+       to  pcre_exec()  to  start  searching  at a new point within the entire
+       string (which is in effect what Perl does), whereas the  latter  passes
+       over  a  shortened  substring.  This makes a difference to the matching
        process if the pattern begins with a lookbehind assertion (including \b
        or \B).
 
-       If any call to pcre_exec() in a /g or  /G  sequence  matches  an  empty
-       string,  the  next  call  is  done  with  the PCRE_NOTEMPTY_ATSTART and
-       PCRE_ANCHORED flags set in order  to  search  for  another,  non-empty,
-       match  at  the same point. If this second match fails, the start offset
-       is advanced, and the normal match is retried.  This  imitates  the  way
+       If  any  call  to  pcre_exec()  in a /g or /G sequence matches an empty
+       string, the next  call  is  done  with  the  PCRE_NOTEMPTY_ATSTART  and
+       PCRE_ANCHORED  flags  set  in  order  to search for another, non-empty,
+       match at the same point. If this second match fails, the  start  offset
+       is  advanced,  and  the  normal match is retried. This imitates the way
        Perl handles such cases when using the /g modifier or the split() func-
-       tion. Normally, the start offset is advanced by one character,  but  if
-       the  newline  convention  recognizes CRLF as a newline, and the current
+       tion.  Normally,  the start offset is advanced by one character, but if
+       the newline convention recognizes CRLF as a newline,  and  the  current
        character is CR followed by LF, an advance of two is used.
 
    Other modifiers
 
        There are yet more modifiers for controlling the way pcretest operates.
 
-       The /+ modifier requests that as well as outputting the substring  that
-       matched  the  entire  pattern,  pcretest  should in addition output the
-       remainder of the subject string. This is useful  for  tests  where  the
-       subject contains multiple copies of the same substring.
+       The  /+ modifier requests that as well as outputting the substring that
+       matched the entire pattern, pcretest  should  in  addition  output  the
+       remainder  of  the  subject  string. This is useful for tests where the
+       subject contains multiple copies of the same substring. If the +  modi-
+       fier  appears  twice, the same action is taken for captured substrings.
+       In each case the remainder is output on the following line with a  plus
+       character following the capture number.
+
+       The  /=  modifier  requests  that  the values of all potential captured
+       parentheses be output after a match by pcre_exec().  By  default,  only
+       those up to the highest one actually used in the match are output (cor-
+       responding to the return code from pcre_exec()). Values in the  offsets
+       vector  corresponding  to higher numbers should be set to -1, and these
+       are output as "<unset>". This modifier gives a  way  of  checking  that
+       this is happening.
 
        The  /B modifier is a debugging feature. It requests that pcretest out-
        put a representation of the compiled byte code after compilation.  Nor-
@@ -270,8 +304,14 @@ PATTERN MODIFIERS
        The  /M  modifier causes the size of memory block used to hold the com-
        piled pattern to be output.
 
-       The /S modifier causes pcre_study() to be called after  the  expression
-       has been compiled, and the results used when the expression is matched.
+       If the /S modifier appears once, it causes pcre_study()  to  be  called
+       after  the  expression has been compiled, and the results used when the
+       expression is matched. If /S appears  twice,  it  suppresses  studying,
+       even if it was requested externally by the -s command line option. This
+       makes it possible to specify that certain patterns are always  studied,
+       and others are never studied, independently of -s. This feature is used
+       in the test files in a few cases where the output is different when the
+       pattern is studied.
 
        The  /T  modifier  must be followed by a single digit. It causes a spe-
        cific set of built-in character tables to be passed to  pcre_compile().
@@ -306,7 +346,7 @@ PATTERN MODIFIERS
 DATA LINES
 
        Before each data line is passed to pcre_exec(),  leading  and  trailing
-       whitespace  is  removed,  and it is then scanned for \ escapes. Some of
+       white  space  is removed, and it is then scanned for \ escapes. Some of
        these are pretty esoteric features, intended for checking out  some  of
        the  more  complicated features of PCRE. If you are just testing "ordi-
        nary" regular expressions, you probably don't need any  of  these.  The
@@ -315,7 +355,7 @@ DATA LINES
          \a         alarm (BEL, \x07)
          \b         backspace (\x08)
          \e         escape (\x27)
-         \f         formfeed (\x0c)
+         \f         form feed (\x0c)
          \n         newline (\x0a)
          \qdd       set the PCRE_MATCH_LIMIT limit to dd
                       (any number of digits)
@@ -463,11 +503,14 @@ DEFAULT OUTPUT FROM PCRETEST
        (Note  that  this is the entire substring that was inspected during the
        partial match; it may include characters before the actual match  start
        if  a  lookbehind assertion, \K, \b, or \B was involved.) For any other
-       returns, it outputs the PCRE negative error number. Here is an  example
-       of an interactive pcretest run.
+       return, pcretest outputs the PCRE negative error  number  and  a  short
+       descriptive  phrase.  If  the error is a failed UTF-8 string check, the
+       byte offset of the start of the failing character and the  reason  code
+       are  also  output,  provided  that  the size of the output vector is at
+       least two. Here is an example of an interactive pcretest run.
 
          $ pcretest
-         PCRE version 7.0 30-Nov-2006
+         PCRE version 8.13 2011-04-30
 
            re> /^abc(\d+)/
          data> abc123
@@ -476,12 +519,12 @@ DEFAULT OUTPUT FROM PCRETEST
          data> xyz
          No match
 
-       Note  that unset capturing substrings that are not followed by one that
-       is set are not returned by pcre_exec(), and are not shown by  pcretest.
-       In  the following example, there are two capturing substrings, but when
-       the first data line is matched, the  second,  unset  substring  is  not
-       shown.  An "internal" unset substring is shown as "<unset>", as for the
-       second data line.
+       Unset capturing substrings that are not followed by one that is set are
+       not returned by pcre_exec(), and are not shown by pcretest. In the fol-
+       lowing example, there are two capturing substrings, but when the  first
+       data  line  is  matched,  the  second, unset substring is not shown. An
+       "internal" unset substring is shown as "<unset>",  as  for  the  second
+       data line.
 
            re> /(a)|(b)/
          data> a
@@ -492,11 +535,11 @@ DEFAULT OUTPUT FROM PCRETEST
           1: <unset>
           2: b
 
-       If the strings contain any non-printing characters, they are output  as
-       \0x  escapes,  or  as \x{...} escapes if the /8 modifier was present on
-       the pattern. See below for the definition of  non-printing  characters.
-       If  the pattern has the /+ modifier, the output for substring 0 is fol-
-       lowed by the the rest of the subject string, identified  by  "0+"  like
+       If  the strings contain any non-printing characters, they are output as
+       \0x escapes, or as \x{...} escapes if the /8 modifier  was  present  on
+       the  pattern.  See below for the definition of non-printing characters.
+       If the pattern has the /+ modifier, the output for substring 0 is  fol-
+       lowed  by  the  the rest of the subject string, identified by "0+" like
        this:
 
            re> /cat/+
@@ -504,7 +547,7 @@ DEFAULT OUTPUT FROM PCRETEST
           0: cat
           0+ aract
 
-       If  the  pattern  has  the /g or /G modifier, the results of successive
+       If the pattern has the /g or /G modifier,  the  results  of  successive
        matching attempts are output in sequence, like this:
 
            re> /\Bi(\w\w)/g
@@ -516,26 +559,32 @@ DEFAULT OUTPUT FROM PCRETEST
           0: ipp
           1: pp
 
-       "No match" is output only if the first match attempt fails.
+       "No  match" is output only if the first match attempt fails. Here is an
+       example of a failure message (the offset 4 that is specified by \>4  is
+       past the end of the subject string):
 
-       If any of the sequences \C, \G, or \L are present in a data  line  that
-       is  successfully  matched,  the substrings extracted by the convenience
+           re> /xyz/
+         data> xyz\>4
+         Error -24 (bad offset value)
+
+       If  any  of the sequences \C, \G, or \L are present in a data line that
+       is successfully matched, the substrings extracted  by  the  convenience
        functions are output with C, G, or L after the string number instead of
        a colon. This is in addition to the normal full list. The string length
-       (that is, the return from the extraction function) is given  in  paren-
+       (that  is,  the return from the extraction function) is given in paren-
        theses after each string for \C and \G.
 
        Note that whereas patterns can be continued over several lines (a plain
        ">" prompt is used for continuations), data lines may not. However new-
-       lines  can  be included in data by means of the \n escape (or \r, \r\n,
+       lines can be included in data by means of the \n escape (or  \r,  \r\n,
        etc., depending on the newline sequence setting).
 
 
 OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
 
-       When the alternative matching function, pcre_dfa_exec(),  is  used  (by
-       means  of  the \D escape sequence or the -dfa command line option), the
-       output consists of a list of all the matches that start  at  the  first
+       When  the  alternative  matching function, pcre_dfa_exec(), is used (by
+       means of the \D escape sequence or the -dfa command line  option),  the
+       output  consists  of  a list of all the matches that start at the first
        point in the subject where there is at least one match. For example:
 
            re> /(tang|tangerine|tan)/
@@ -544,11 +593,11 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
           1: tang
           2: tan
 
-       (Using  the  normal  matching function on this data finds only "tang".)
-       The longest matching string is always given first (and numbered  zero).
+       (Using the normal matching function on this data  finds  only  "tang".)
+       The  longest matching string is always given first (and numbered zero).
        After a PCRE_ERROR_PARTIAL return, the output is "Partial match:", fol-
-       lowed by the partially matching  substring.  (Note  that  this  is  the
-       entire  substring  that  was inspected during the partial match; it may
+       lowed  by  the  partially  matching  substring.  (Note that this is the
+       entire substring that was inspected during the partial  match;  it  may
        include characters before the actual match start if a lookbehind asser-
        tion, \K, \b, or \B was involved.)
 
@@ -564,16 +613,16 @@ OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION
           1: tan
           0: tan
 
-       Since the matching function does not  support  substring  capture,  the
-       escape  sequences  that  are concerned with captured substrings are not
+       Since  the  matching  function  does not support substring capture, the
+       escape sequences that are concerned with captured  substrings  are  not
        relevant.
 
 
 RESTARTING AFTER A PARTIAL MATCH
 
        When the alternative matching function has given the PCRE_ERROR_PARTIAL
-       return,  indicating that the subject partially matched the pattern, you
-       can restart the match with additional subject data by means of  the  \R
+       return, indicating that the subject partially matched the pattern,  you
+       can  restart  the match with additional subject data by means of the \R
        escape sequence. For example:
 
            re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
@@ -582,30 +631,30 @@ RESTARTING AFTER A PARTIAL MATCH
          data> n05\R\D
           0: n05
 
-       For  further  information  about  partial matching, see the pcrepartial
+       For further information about partial  matching,  see  the  pcrepartial
        documentation.
 
 
 CALLOUTS
 
-       If the pattern contains any callout requests, pcretest's callout  func-
-       tion  is  called  during  matching. This works with both matching func-
+       If  the pattern contains any callout requests, pcretest's callout func-
+       tion is called during matching. This works  with  both  matching  func-
        tions. By default, the called function displays the callout number, the
-       start  and  current  positions in the text at the callout time, and the
+       start and current positions in the text at the callout  time,  and  the
        next pattern item to be tested. For example, the output
 
          --->pqrabcdef
            0    ^  ^     \d
 
-       indicates that callout number 0 occurred for a match  attempt  starting
-       at  the fourth character of the subject string, when the pointer was at
-       the seventh character of the data, and when the next pattern  item  was
-       \d.  Just  one  circumflex is output if the start and current positions
+       indicates  that  callout number 0 occurred for a match attempt starting
+       at the fourth character of the subject string, when the pointer was  at
+       the  seventh  character of the data, and when the next pattern item was
+       \d. Just one circumflex is output if the start  and  current  positions
        are the same.
 
        Callouts numbered 255 are assumed to be automatic callouts, inserted as
-       a  result  of the /C pattern modifier. In this case, instead of showing
-       the callout number, the offset in the pattern, preceded by a  plus,  is
+       a result of the /C pattern modifier. In this case, instead  of  showing
+       the  callout  number, the offset in the pattern, preceded by a plus, is
        output. For example:
 
            re> /\d?[A-E]\*/C
@@ -617,9 +666,29 @@ CALLOUTS
          +10 ^ ^
           0: E*
 
+       If a pattern contains (*MARK) items, an additional line is output when-
+       ever  a  change  of  latest mark is passed to the callout function. For
+       example:
+
+           re> /a(*MARK:X)bc/C
+         data> abc
+         --->abc
+          +0 ^       a
+          +1 ^^      (*MARK:X)
+         +10 ^^      b
+         Latest Mark: X
+         +11 ^ ^     c
+         +12 ^  ^
+          0: abc
+
+       The mark changes between matching "a" and "b", but stays the  same  for
+       the  rest  of  the match, so nothing more is output. If, as a result of
+       backtracking, the mark reverts to being unset, the  text  "<unset>"  is
+       output.
+
        The  callout  function  in pcretest returns zero (carry on matching) by
        default, but you can use a \C item in a data line (as described  above)
-       to change this.
+       to change this and other parameters of the callout.
 
        Inserting  callouts can be helpful when using pcretest to check compli-
        cated regular expressions. For further information about callouts,  see
@@ -641,8 +710,8 @@ NON-PRINTING CHARACTERS
 SAVING AND RELOADING COMPILED PATTERNS
 
        The facilities described in this section are  not  available  when  the
-       POSIX inteface to PCRE is being used, that is, when the /P pattern mod-
-       ifier is specified.
+       POSIX  interface  to  PCRE  is being used, that is, when the /P pattern
+       modifier is specified.
 
        When the POSIX interface is not in use, you can cause pcretest to write
        a  compiled  pattern to a file, by following the modifiers with > and a
@@ -663,13 +732,13 @@ SAVING AND RELOADING COMPILED PATTERNS
        diately after the compiled pattern. After writing  the  file,  pcretest
        expects to read a new pattern.
 
-       A saved pattern can be reloaded into pcretest by specifing < and a file
-       name instead of a pattern. The name of the file must not  contain  a  <
-       character,  as  otherwise pcretest will interpret the line as a pattern
+       A  saved  pattern  can  be reloaded into pcretest by specifying < and a
+       file name instead of a pattern. The name of the file must not contain a
+       < character, as otherwise pcretest will interpret the line as a pattern
        delimited by < characters.  For example:
 
           re> </some/file
-         Compiled regex loaded from /some/file
+         Compiled pattern loaded from /some/file
          No study data
 
        When the pattern has been loaded, pcretest proceeds to read data  lines
@@ -709,5 +778,5 @@ AUTHOR
 
 REVISION
 
-       Last updated: 21 November 2010
-       Copyright (c) 1997-2010 University of Cambridge.
+       Last updated: 01 August 2011
+       Copyright (c) 1997-2011 University of Cambridge.
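
Reviewer note: the main behavioural change documented in this patch is the interaction between the new -s command line option and the /S pattern modifier (once forces studying, twice suppresses it). The sketch below restates that rule as a small decision function. It is an illustration of the documented behaviour only, not code from pcretest. The function name and its arguments are hypothetical, and treating more than two occurrences of S the same as two is an assumption, since the text only describes the "once" and "twice" cases.

```python
import re


def should_study(modifiers: str, cmdline_s: bool) -> bool:
    """Decide whether a pattern would be studied (hypothetical helper).

    modifiers -- the modifier string that follows the closing delimiter,
                 for example "iS" or "SS<CRLF>"
    cmdline_s -- True if -s was given on the command line
    """
    # Angle-bracket modifiers such as <CRLF> or <BSR_UNICODE> may be written
    # in either case, so remove them before counting the letter S.
    single_letters = re.sub(r"<[^>]*>", "", modifiers)
    s_count = single_letters.count("S")

    if s_count >= 2:
        # /SS suppresses studying, even if -s asked for it.
        # (Assumption: three or more S characters behave like two.)
        return False
    if s_count == 1:
        # A single /S always requests studying.
        return True
    # No /S on the pattern: studying happens only when -s is present.
    return cmdline_s


if __name__ == "__main__":
    for mods, flag in [("", False), ("", True), ("S", False), ("SS", True)]:
        print(repr(mods), flag, should_study(mods, flag))
```

With this rule, a test file can mark individual patterns as always studied (/S) or never studied (/SS), and the remaining patterns follow whatever -s says.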