author ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2018-11-12 16:02:01 +0000
committer ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2018-11-12 16:02:01 +0000
commit 30dcfda7537de8d0b95200c533cf7aad792d1d9d
tree 4d130314d7b71922a45258e9b61151528ee2ae1f /doc/html
parent f1dd223469f0ba82f9070a8c6e70c8acabbb1c60
Upgrade the as yet unreleased substitute callout facility.
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@1039 6239d852-aaf2-0410-a92c-79f79f948069
Diffstat (limited to 'doc/html')
-rw-r--r-- doc/html/pcre2_set_substitute_callout.html | 2
-rw-r--r-- doc/html/pcre2api.html | 57
-rw-r--r-- doc/html/pcre2test.html | 57
3 files changed, 88 insertions, 28 deletions
diff --git a/doc/html/pcre2_set_substitute_callout.html b/doc/html/pcre2_set_substitute_callout.html
index 3e43a5a..7ae3a39 100644
--- a/doc/html/pcre2_set_substitute_callout.html
+++ b/doc/html/pcre2_set_substitute_callout.html
@@ -20,7 +20,7 @@ SYNOPSIS
</P>
<P>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
-<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
+<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *),</b>
<b> void *<i>callout_data</i>);</b>
</P>
<br><b>
diff --git a/doc/html/pcre2api.html b/doc/html/pcre2api.html
index 5faadc4..1845580 100644
--- a/doc/html/pcre2api.html
+++ b/doc/html/pcre2api.html
@@ -183,7 +183,7 @@ document for an overview of all the PCRE2 documentation.
<br>
<br>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
-<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
+<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
@@ -924,7 +924,7 @@ documentation.
<br>
<br>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
-<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
+<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
@@ -3413,9 +3413,9 @@ substitutions. However, PCRE2_SUBSTITUTE_UNKNOWN_UNSET does cause unknown
groups in the extended syntax forms to be treated as unset.
</P>
<P>
-If successful, <b>pcre2_substitute()</b> returns the number of replacements that
-were made. This may be zero if no matches were found, and is never greater than
-1 unless PCRE2_SUBSTITUTE_GLOBAL is set.
+If successful, <b>pcre2_substitute()</b> returns the number of successful
+matches. This may be zero if no matches were found, and is never greater than 1
+unless PCRE2_SUBSTITUTE_GLOBAL is set.
</P>
<P>
In the event of an error, a negative error code is returned. Except for
@@ -3457,16 +3457,16 @@ Substitution callouts
</b><br>
<P>
<b>int pcre2_set_substitute_callout(pcre2_match_context *<i>mcontext</i>,</b>
-<b> void (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
+<b> int (*<i>callout_function</i>)(pcre2_substitute_callout_block *, void *),</b>
<b> void *<i>callout_data</i>);</b>
<br>
<br>
The <b>pcre2_set_substitute_callout()</b> function can be used to specify a
callout function for <b>pcre2_substitute()</b>. This information is passed in
-a match context. The callout function is called after each substitution. It is
-not called for simulated substitutions that happen as a result of the
-PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option. A callout function should not return
-any value.
+a match context. The callout function is called after each substitution has
+been processed, but it can cause the replacement not to happen. The callout
+function is not called for simulated substitutions that happen as a result of
+the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option.
</P>
<P>
The first argument of the callout function is a pointer to a substitute callout
@@ -3474,7 +3474,11 @@ block structure, which contains the following fields, not necessarily in this
order:
<pre>
uint32_t <i>version</i>;
- PCRE2_SIZE <i>input_offsets[2]</i>;
+ uint32_t <i>subscount</i>;
+ PCRE2_SPTR <i>input</i>;
+ PCRE2_SPTR <i>output</i>;
+ PCRE2_SIZE <i>*ovector</i>;
+ uint32_t <i>oveccount</i>;
PCRE2_SIZE <i>output_offsets[2]</i>;
</pre>
The <i>version</i> field contains the version number of the block format. The
@@ -3482,13 +3486,34 @@ current version is 0. The version number will increase in future if more fields
are added, but the intention is never to remove any of the existing fields.
</P>
<P>
-The <i>input_offsets</i> vector contains the code unit offsets in the input
-string of the matched substring, and the <i>output_offsets</i> vector contains
-the offsets of the replacement in the output string.
+The <i>subscount</i> field is the number of the current match. It is 1 for the
+first callout, 2 for the second, and so on. The <i>input</i> and <i>output</i>
+pointers are copies of the values passed to <b>pcre2_substitute()</b>.
+</P>
+<P>
+The <i>ovector</i> field points to the ovector, which contains the result of the
+most recent match. The <i>oveccount</i> field contains the number of pairs that
+are set in the ovector, and is always greater than zero.
+</P>
+<P>
+The <i>output_offsets</i> vector contains the offsets of the replacement in the
+output string. This has already been processed for dollar and (if requested)
+backslash substitutions as described above.
</P>
<P>
The second argument of the callout function is the value passed as
-<i>callout_data</i> when the function was registered.
+<i>callout_data</i> when the function was registered. The value returned by the
+callout function is interpreted as follows:
+</P>
+<P>
+If the value is zero, the replacement is accepted, and, if
+PCRE2_SUBSTITUTE_GLOBAL is set, processing continues with a search for the next
+match. If the value is not zero, the current replacement is not accepted. If
+the value is greater than zero, processing continues when
+PCRE2_SUBSTITUTE_GLOBAL is set. Otherwise (the value is less than zero or
+PCRE2_SUBSTITUTE_GLOBAL is not set), the rest of the input is copied to the
+output and the call to <b>pcre2_substitute()</b> exits, returning the number of
+matches so far.
</P>
<br><a name="SEC37" href="#TOC1">DUPLICATE SUBPATTERN NAMES</a><br>
<P>
@@ -3757,7 +3782,7 @@ Cambridge, England.
</P>
<br><a name="SEC42" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 19 October 2018
+Last updated: 12 November 2018
<br>
Copyright &copy; 1997-2018 University of Cambridge.
<br>
diff --git a/doc/html/pcre2test.html b/doc/html/pcre2test.html
index f5ce072..03bfa8b 100644
--- a/doc/html/pcre2test.html
+++ b/doc/html/pcre2test.html
@@ -1052,7 +1052,9 @@ process.
startchar show starting character when relevant
substitute_callout use substitution callouts
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
+ substitute_skip=&#60;n&#62; skip substitution number n
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
+ substitute_stop=&#60;n&#62; skip substitution number n and greater
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
</pre>
@@ -1220,7 +1222,9 @@ pattern.
startoffset=&#60;n&#62; same as offset=&#60;n&#62;
substitute_callout use substitution callouts
substitute_extended use PCRE2_SUBSTITUTE_EXTENDED
+ substitute_skip=&#60;n&#62; skip substitution number n
substitute_overflow_length use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
+ substitute_stop=&#60;n&#62; skip substitution number n and greater
substitute_unknown_unset use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
substitute_unset_empty use PCRE2_SUBSTITUTE_UNSET_EMPTY
zero_terminate pass the subject as zero-terminated
@@ -1410,16 +1414,6 @@ simple example of a substitution test:
=abc=abc=\=global
2: =xxx=xxx=
</pre>
-If the <b>substitute_callout</b> modifier is set, a substitution callout
-function is set up. When it is called (after each substitution), the offsets in
-the input and output strings are output. For example:
-<pre>
- /abc/g,replace=&#60;$0&#62;,substitute_callout
- abcdefabcpqr
- Old 0 3 New 0 5
- Old 6 9 New 8 13
- 2: &#60;abc&#62;def&#60;abc&#62;pqr
-</pre>
Subject and replacement strings should be kept relatively short (fewer than 256
characters) for substitution tests, as fixed-size buffers are used. To make it
easy to test for buffer overflow, if the replacement string starts with a
@@ -1451,6 +1445,47 @@ matching provokes an error return ("bad option value") from
<b>pcre2_substitute()</b>.
</P>
<br><b>
+Testing substitute callouts
+</b><br>
+<P>
+If the <b>substitute_callout</b> modifier is set, a substitution callout
+function is set up. When it is called (after each substitution), details of
+the input and output strings are output. For example:
+<pre>
+ /abc/g,replace=&#60;$0&#62;,substitute_callout
+ abcdefabcpqr
+ 1(1) Old 0 3 "abc" New 0 5 "&#60;abc&#62;"
+ 2(1) Old 6 9 "abc" New 8 13 "&#60;abc&#62;"
+ 2: &#60;abc&#62;def&#60;abc&#62;pqr
+</pre>
+The first number on each callout line is the count of matches. The
+parenthesized number is the number of pairs that are set in the ovector (that
+is, one more than the number of capturing groups that were set). Next come
+the offsets of the old substring, its contents, and the same for the
+replacement.
+</P>
+<P>
+By default, the substitution callout function returns zero, which accepts the
+replacement and causes matching to continue if /g was used. Two further
+modifiers can be used to test other return values. If <b>substitute_skip</b> is
+set to a value greater than zero the callout function returns +1 for the match
+of that number, and similarly <b>substitute_stop</b> returns -1. These cause the
+replacement to be rejected, and -1 causes no further matching to take place. If
+either of them is set, <b>substitute_callout</b> is assumed. For example:
+<pre>
+ /abc/g,replace=&#60;$0&#62;,substitute_skip=1
+ abcdefabcpqr
+ 1(1) Old 0 3 "abc" New 0 5 "&#60;abc&#62; SKIPPED"
+ 2(1) Old 6 9 "abc" New 6 11 "&#60;abc&#62;"
+ 2: abcdef&#60;abc&#62;pqr
+ abcdefabcpqr\=substitute_stop=1
+ 1(1) Old 0 3 "abc" New 0 5 "&#60;abc&#62; STOPPED"
+ 1: abcdefabcpqr
+</pre>
+If both are set for the same number, stop takes precedence. Only a single skip
+or stop is supported, which is sufficient for testing that the feature works.
+</P>
+<br><b>
Setting the JIT stack size
</b><br>
<P>
@@ -2040,7 +2075,7 @@ Cambridge, England.
</P>
<br><a name="SEC21" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 21 September 2018
+Last updated: 12 November 2018
<br>
Copyright &copy; 1997-2018 University of Cambridge.
<br>