From 9a29fc883ba431ef060e80308d5e4b813b70388d Mon Sep 17 00:00:00 2001
From: ph10
Date: Thu, 17 Dec 2015 18:44:06 +0000
Subject: File tidies, version updates, etc. for 10.21-RC1

git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@471 6239d852-aaf2-0410-a92c-79f79f948069
---
 doc/pcre2test.txt | 144 +++++++++++++++++++++++++++++++++--------------------
 1 file changed, 89 insertions(+), 55 deletions(-)

(limited to 'doc/pcre2test.txt')

diff --git a/doc/pcre2test.txt b/doc/pcre2test.txt
index e880b9a..d9dbc4d 100644
--- a/doc/pcre2test.txt
+++ b/doc/pcre2test.txt
@@ -797,14 +797,18 @@ PATTERN MODIFIERS
        with that pattern. They may not appear in #pattern commands. These mod-
        ifiers do not affect the compilation process.
 
-           aftertext           show text after match
-           allaftertext        show text after captures
-           allcaptures         show all captures
-           allusedtext         show all consulted text
-       /g  global              global matching
-           mark                show mark values
-           replace=            specify a replacement string
-           startchar           show starting character when relevant
+           aftertext                   show text after match
+           allaftertext                show text after captures
+           allcaptures                 show all captures
+           allusedtext                 show all consulted text
+       /g  global                      global matching
+           mark                        show mark values
+           replace=                    specify a replacement string
+           startchar                   show starting character when relevant
+           substitute_extended         use PCRE2_SUBSTITUTE_EXTENDED
+           substitute_overflow_length  use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
+           substitute_unknown_unset    use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
+           substitute_unset_empty      use PCRE2_SUBSTITUTE_UNSET_EMPTY
 
        These modifiers may not appear in a #pattern command. If you want them
        as defaults, set them in a #subject command.
@@ -860,33 +864,38 @@ SUBJECT MODIFIERS
        line (see above), in which case they apply to every subject line that
        is matched against that pattern.
 
-           aftertext           show text after match
-           allaftertext        show text after captures
-           allcaptures         show all captures
-           allusedtext         show all consulted text (non-JIT only)
-           altglobal           alternative global matching
-           callout_capture     show captures at callout time
-           callout_data=       set a value to pass via callouts
-           callout_fail=[:]    control callout failure
-           callout_none        do not supply a callout function
-           copy=               copy captured substring
-           dfa                 use pcre2_dfa_match()
-           find_limits         find match and recursion limits
-           get=                extract captured substring
-           getall              extract all captured substrings
-       /g  global              global matching
-           jitstack=           set size of JIT stack
-           mark                show mark values
-           match_limit=        set a match limit
-           memory              show memory usage
-           null_context        match with a NULL context
-           offset=             set starting offset
-           offset_limit=       set offset limit
-           ovector=            set size of output vector
-           recursion_limit=    set a recursion limit
-           replace=            specify a replacement string
-           startchar           show startchar when relevant
-           zero_terminate      pass the subject as zero-terminated
+           aftertext                   show text after match
+           allaftertext                show text after captures
+           allcaptures                 show all captures
+           allusedtext                 show all consulted text (non-JIT only)
+           altglobal                   alternative global matching
+           callout_capture             show captures at callout time
+           callout_data=               set a value to pass via callouts
+           callout_fail=[:]            control callout failure
+           callout_none                do not supply a callout function
+           copy=                       copy captured substring
+           dfa                         use pcre2_dfa_match()
+           find_limits                 find match and recursion limits
+           get=                        extract captured substring
+           getall                      extract all captured substrings
+       /g  global                      global matching
+           jitstack=                   set size of JIT stack
+           mark                        show mark values
+           match_limit=                set a match limit
+           memory                      show memory usage
+           null_context                match with a NULL context
+           offset=                     set starting offset
+           offset_limit=               set offset limit
+           ovector=                    set size of output vector
+           recursion_limit=            set a recursion limit
+           replace=                    specify a replacement string
+           startchar                   show startchar when relevant
+           startoffset=                same as offset=
+           substitute_extedded         use PCRE2_SUBSTITUTE_EXTENDED
+           substitute_overflow_length  use PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
+           substitute_unknown_unset    use PCRE2_SUBSTITUTE_UNKNOWN_UNSET
+           substitute_unset_empty      use PCRE2_SUBSTITUTE_UNSET_EMPTY
+           zero_terminate              pass the subject as zero-terminated
 
        The effects of these modifiers are described in the following sections.
 
@@ -1011,19 +1020,30 @@ SUBJECT MODIFIERS
    Testing the substitution function
 
        If the replace modifier is set, the pcre2_substitute() function is
-       called instead of one of the matching functions. Unlike subject
-       strings, pcre2test does not process replacement strings for escape
-       sequences. In UTF mode, a replacement string is checked to see if it is
-       a valid UTF-8 string. If so, it is correctly converted to a UTF string
-       of the appropriate code unit width. If it is not a valid UTF-8 string,
-       the individual code units are copied directly. This provides a means of
-       passing an invalid UTF-8 string for testing purposes.
-
-       If the global modifier is set, PCRE2_SUBSTITUTE_GLOBAL is passed to
-       pcre2_substitute(). After a successful substitution, the modified
-       string is output, preceded by the number of replacements. This may be
-       zero if there were no matches. Here is a simple example of a substitu-
-       tion test:
+       called instead of one of the matching functions. Note that replacement
+       strings cannot contain commas, because a comma signifies the end of a
+       modifier. This is not thought to be an issue in a test program.
+
+       Unlike subject strings, pcre2test does not process replacement strings
+       for escape sequences. In UTF mode, a replacement string is checked to
+       see if it is a valid UTF-8 string. If so, it is correctly converted to
+       a UTF string of the appropriate code unit width. If it is not a valid
+       UTF-8 string, the individual code units are copied directly. This pro-
+       vides a means of passing an invalid UTF-8 string for testing purposes.
+
+       The following modifiers set options (in additional to the normal match
+       options) for pcre2_substitute():
+
+         global                      PCRE2_SUBSTITUTE_GLOBAL
+         substitute_extended         PCRE2_SUBSTITUTE_EXTENDED
+         substitute_overflow_length  PCRE2_SUBSTITUTE_OVERFLOW_LENGTH
+         substitute_unknown_unset    PCRE2_SUBSTITUTE_UNKNOWN_UNSET
+         substitute_unset_empty      PCRE2_SUBSTITUTE_UNSET_EMPTY
+
+
+       After a successful substitution, the modified string is output, pre-
+       ceded by the number of replacements. This may be zero if there were no
+       matches. Here is a simple example of a substitution test:
 
          /abc/replace=xxx
            =abc=abc=
@@ -1031,12 +1051,13 @@ SUBJECT MODIFIERS
            =abc=abc=\=global
             2: =xxx=xxx=
 
-       Subject and replacement strings should be kept relatively short for
-       substitution tests, as fixed-size buffers are used. To make it easy to
-       test for buffer overflow, if the replacement string starts with a num-
-       ber in square brackets, that number is passed to pcre2_substitute() as
-       the size of the output buffer, with the replacement string starting at
-       the next character. Here is an example that tests the edge case:
+       Subject and replacement strings should be kept relatively short (fewer
+       than 256 characters) for substitution tests, as fixed-size buffers are
+       used. To make it easy to test for buffer overflow, if the replacement
+       string starts with a number in square brackets, that number is passed
+       to pcre2_substitute() as the size of the output buffer, with the
+       replacement string starting at the next character. Here is an example
+       that tests the edge case:
 
          /abc/
            123abc123\=replace=[10]XYZ
@@ -1044,6 +1065,19 @@ SUBJECT MODIFIERS
            123abc123\=replace=[9]XYZ
          Failed: error -47: no more memory
 
+       The default action of pcre2_substitute() is to return
+       PCRE2_ERROR_NOMEMORY when the output buffer is too small. However, if
+       the PCRE2_SUBSTITUTE_OVERFLOW_LENGTH option is set (by using the sub-
+       stitute_overflow_length modifier), pcre2_substitute() continues to go
+       through the motions of matching and substituting, in order to compute
+       the size of buffer that is required. When this happens, pcre2test shows
+       the required buffer length (which includes space for the trailing zero)
+       as part of the error message. For example:
+
+         /abc/substitute_overflow_length
+           123abc123\=replace=[9]XYZ
+         Failed: error -47: no more memory: 10 code units are needed
+
        A replacement string is ignored with POSIX and DFA matching. Specifying
        partial matching provokes an error return ("bad option value") from
        pcre2_substitute().
@@ -1471,5 +1505,5 @@ AUTHOR
 
 REVISION
 
-       Last updated: 05 November 2015
+       Last updated: 12 December 2015
        Copyright (c) 1997-2015 University of Cambridge.
-- 
cgit v1.2.1