author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-07-02 15:20:59 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-07-02 15:20:59 +0000
commit 5c7a0c52f657f9ee5670cddc9466e239243c9b18
tree b5e2b3ffe768624a719d31485a06fd83f31f4fde
parent 477829e693c6a38cc3443ea90b2dacb19a2eddfc
Fix two study bugs concerned with minimum subject lengths; add features to
pcretest so that all tests can be run with or without study; adjust tests so
that this happens.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@612 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r--  ChangeLog               13
-rw-r--r--  HACKING                  5
-rwxr-xr-x  RunTest                227
-rw-r--r--  doc/pcretest.1          54
-rw-r--r--  pcre_internal.h          6
-rw-r--r--  pcre_study.c            44
-rw-r--r--  pcretest.c              36
-rwxr-xr-x  perltest.pl              4
-rw-r--r--  testdata/testinput11    10
-rw-r--r--  testdata/testinput2     80
-rw-r--r--  testdata/testinput5      2
-rw-r--r--  testdata/testinput7      4
-rw-r--r--  testdata/testoutput11   13
-rw-r--r--  testdata/testoutput2   332
-rw-r--r--  testdata/testoutput5     6
-rw-r--r--  testdata/testoutput7    24
16 files changed, 667 insertions, 193 deletions
diff --git a/ChangeLog b/ChangeLog
index a198adb..97086a2 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -79,8 +79,10 @@ Version 8.13 30-Apr-2011
synonym of -m (show memory usage). I have changed it to mean "force study
for every regex", that is, assume /S for every regex. This is similar to -i
and -d etc. It's slightly incompatible, but I'm hoping nobody is still
- using it. It makes it easier to run collection of tests with study enabled,
- and thereby test pcre_study() more easily.
+ using it. It makes it easier to run collections of tests with and without
+ study enabled, and thereby test pcre_study() more easily. All the standard
+ tests are now run with and without -s (but some patterns can be marked as
+ "never study" - see 20 below).
15. When (*ACCEPT) was used in a subpattern that was called recursively, the
restoration of the capturing data to the outer values was not happening
@@ -101,6 +103,13 @@ Version 8.13 30-Apr-2011
18. If a pattern containing \R was studied, it was assumed that \R always
matched two bytes, thus causing the minimum subject length to be
incorrectly computed because \R can also match just one byte.
+
+19. If a pattern containing (*ACCEPT) was studied, the minimum subject length
+ was incorrectly computed.
+
+20. If /S is present twice on a test pattern in pcretest input, it *disables*
+ studying, thereby overriding the use of -s on the command line. This is
+ necessary for one or two tests to keep the output identical in both cases.
Version 8.12 15-Jan-2011
diff --git a/HACKING b/HACKING
index 709609b..a82a67c 100644
--- a/HACKING
+++ b/HACKING
@@ -2,7 +2,8 @@ Technical Notes about PCRE
--------------------------
These are very rough technical notes that record potentially useful information
-about PCRE internals.
+about PCRE internals. For information about testing PCRE, see the pcretest
+documentation and the comment at the head of the RunTest file.
Historical note 1
@@ -449,4 +450,4 @@ next item.
Philip Hazel
-May 2011
+July 2011
diff --git a/RunTest b/RunTest
index 1fd43ff..d8be47a 100755
--- a/RunTest
+++ b/RunTest
@@ -1,6 +1,14 @@
#! /bin/sh
-# Run PCRE tests.
+# Run the PCRE tests using the pcretest program. All tests are now run both
+# with and without -s, to ensure that everything is tested with and without
+# studying. However, there are some tests that produce different output after
+# studying, typically when we are tracing the actual matching process (for
+# example, using auto-callouts). In these few cases, the tests are duplicated
+# in the files, one with /S to force studying always, and one with /SS to force
+# *not* studying always. The use of -s doesn't then make any difference to
+# their output. There is also one test which compiles invalid UTF-8 with the
+# UTF-8 check turned off for which studying is disabled with /SS.
valgrind=
@@ -137,33 +145,37 @@ echo PCRE C library tests
if [ $do1 = yes ] ; then
echo "Test 1: main functionality (Compatible with Perl >= 5.8)"
- $valgrind ./pcretest -q $testdata/testinput1 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput1 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput1 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput1 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# PCRE tests that are not Perl-compatible - API, errors, internals
if [ $do2 = yes ] ; then
echo "Test 2: API, errors, internals, and non-Perl stuff"
- $valgrind ./pcretest -q $testdata/testinput2 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput2 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else
- echo " "
- echo "** Test 2 requires a lot of stack. If it has crashed with a"
- echo "** segmentation fault, it may be that you do not have enough"
- echo "** stack available by default. Please see the 'pcrestack' man"
- echo "** page for a discussion of PCRE's stack usage."
- echo " "
- exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput2 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput2 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else
+ echo " "
+ echo "** Test 2 requires a lot of stack. If it has crashed with a"
+ echo "** segmentation fault, it may be that you do not have enough"
+ echo "** stack available by default. Please see the 'pcrestack' man"
+ echo "** page for a discussion of PCRE's stack usage."
+ echo " "
+ exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# Locale-specific tests, provided that either the "fr_FR" or the "french"
@@ -191,19 +203,22 @@ if [ $do3 = yes ] ; then
if [ "$locale" != "" ] ; then
echo "Test 3: locale-specific features (using '$locale' locale)"
- $valgrind ./pcretest -q $infile testtry
- if [ $? = 0 ] ; then
- $cf $outfile testtry
- if [ $? != 0 ] ; then
- echo " "
- echo "Locale test did not run entirely successfully."
- echo "This usually means that there is a problem with the locale"
- echo "settings rather than a bug in PCRE."
- else
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $infile testtry
+ if [ $? = 0 ] ; then
+ $cf $outfile testtry
+ if [ $? != 0 ] ; then
+ echo " "
+ echo "Locale test did not run entirely successfully."
+ echo "This usually means that there is a problem with the locale"
+ echo "settings rather than a bug in PCRE."
+ break;
+ else
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ fi
+ else exit 1
fi
- else exit 1
- fi
+ done
else
echo "Cannot test locale-specific features - neither the 'fr_FR' nor the"
echo "'french' locale exists, or the \"locale\" command is not available"
@@ -216,70 +231,82 @@ fi
if [ $do4 = yes ] ; then
echo "Test 4: UTF-8 support (Compatible with Perl >= 5.8)"
- $valgrind ./pcretest -q $testdata/testinput4 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput4 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput4 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput4 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
if [ $do5 = yes ] ; then
echo "Test 5: API, internals, and non-Perl stuff for UTF-8 support"
- $valgrind ./pcretest -q $testdata/testinput5 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput5 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput5 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput5 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
if [ $do6 = yes ] ; then
echo "Test 6: Unicode property support (Compatible with Perl >= 5.10)"
- $valgrind ./pcretest -q $testdata/testinput6 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput6 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput6 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput6 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# Tests for DFA matching support
if [ $do7 = yes ] ; then
echo "Test 7: DFA matching"
- $valgrind ./pcretest -q -dfa $testdata/testinput7 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput7 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt -dfa $testdata/testinput7 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput7 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
if [ $do8 = yes ] ; then
echo "Test 8: DFA matching with UTF-8"
- $valgrind ./pcretest -q -dfa $testdata/testinput8 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput8 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt -dfa $testdata/testinput8 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput8 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
if [ $do9 = yes ] ; then
echo "Test 9: DFA matching with Unicode properties"
- $valgrind ./pcretest -q -dfa $testdata/testinput9 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput9 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt -dfa $testdata/testinput9 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput9 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# Test of internal offsets and code sizes. This test is run only when there
@@ -290,39 +317,45 @@ fi
if [ $do10 = yes ] ; then
echo "Test 10: Internal offsets and code size tests"
- $valgrind ./pcretest -q $testdata/testinput10 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput10 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput10 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput10 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# Test of Perl >= 5.10 features
if [ $do11 = yes ] ; then
echo "Test 11: Features from Perl >= 5.10"
- $valgrind ./pcretest -q $testdata/testinput11 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput11 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput11 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput11 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# Test non-Perl-compatible Unicode property support
if [ $do12 = yes ] ; then
echo "Test 12: API, internals, and non-Perl stuff for Unicode property support"
- $valgrind ./pcretest -q $testdata/testinput12 testtry
- if [ $? = 0 ] ; then
- $cf $testdata/testoutput12 testtry
- if [ $? != 0 ] ; then exit 1; fi
- else exit 1
- fi
- echo "OK"
+ for opt in "" "-s"; do
+ $valgrind ./pcretest -q $opt $testdata/testinput12 testtry
+ if [ $? = 0 ] ; then
+ $cf $testdata/testoutput12 testtry
+ if [ $? != 0 ] ; then exit 1; fi
+ else exit 1
+ fi
+ if [ "$opt" = "-s" ] ; then echo "OK with study" ; else echo "OK"; fi
+ done
fi
# End
diff --git a/doc/pcretest.1 b/doc/pcretest.1
index 924750c..ffea3fd 100644
--- a/doc/pcretest.1
+++ b/doc/pcretest.1
@@ -4,7 +4,7 @@ pcretest - a program for testing Perl-compatible regular expressions.
.SH SYNOPSIS
.rs
.sp
-.B pcretest "[options] [source] [destination]"
+.B pcretest "[options] [input file [output file]]"
.sp
\fBpcretest\fP was written as a test program for the PCRE regular expression
library itself, but it can also be used for experimenting with regular
@@ -18,14 +18,17 @@ options, see the
.\" HREF
\fBpcreapi\fP
.\"
-documentation.
+documentation. The input for \fBpcretest\fP is a sequence of regular expression
+patterns and strings to be matched, as described below. The output shows the
+result of each match. Options on the command line and the patterns control PCRE
+options and exactly what is output.
.
.
-.SH OPTIONS
+.SH COMMAND LINE OPTIONS
.rs
.TP 10
\fB-b\fP
-Behave as if each regex has the \fB/B\fP (show byte code) modifier; the
+Behave as if each pattern has the \fB/B\fP (show byte code) modifier; the
internal form is output after compilation.
.TP 10
\fB-C\fP
@@ -33,7 +36,7 @@ Output the version number of the PCRE library, and all available information
about the optional features that are included, and then exit.
.TP 10
\fB-d\fP
-Behave as if each regex has the \fB/D\fP (debug) modifier; the internal
+Behave as if each pattern has the \fB/D\fP (debug) modifier; the internal
form and information about the compiled pattern is output after compilation;
\fB-d\fP is equivalent to \fB-b -i\fP.
.TP 10
@@ -46,7 +49,7 @@ standard \fBpcre_exec()\fP function (more detail is given below).
Output a brief summary of these options and then exit.
.TP 10
\fB-i\fP
-Behave as if each regex has the \fB/I\fP modifier; information about the
+Behave as if each pattern has the \fB/I\fP modifier; information about the
compiled pattern is given after compilation.
.TP 10
\fB-M\fP
@@ -67,7 +70,7 @@ changed for individual matching calls by including \eO in the data line (see
below).
.TP 10
\fB-p\fP
-Behave as if each regex has the \fB/P\fP modifier; the POSIX wrapper API is
+Behave as if each pattern has the \fB/P\fP modifier; the POSIX wrapper API is
used to call PCRE. None of the other options has any effect when \fB-p\fP is
set.
.TP 10
@@ -79,8 +82,21 @@ On Unix-like systems, set the size of the run-time stack to \fIsize\fP
megabytes.
.TP 10
\fB-s\fP
-Behave as if each regex has the \fB/S\fP modifier; in other words, force each
-regex to be studied.
+Behave as if each pattern has the \fB/S\fP modifier; in other words, force each
+pattern to be studied. If the \fB/I\fP or \fB/D\fP option is present on a
+pattern (requesting output about the compiled pattern), information about the
+result of studying is not included when studying is caused only by \fB-s\fP and
+neither \fB-i\fP nor \fB-d\fP is present on the command line. This behaviour
+means that the output from tests that are run with and without \fB-s\fP should
+be identical, except when options that output information about the actual
+running of a match are set. The \fB-M\fP, \fB-t\fP, and \fB-tm\fP options,
+which give information about resources used, are likely to produce different
+output with and without \fB-s\fP. Output may also differ if the \fB/C\fP option
+is present on an individual pattern. This uses callouts to trace the
+matching process, and this may be different between studied and non-studied
+patterns. If the pattern contains (*MARK) items there may also be differences,
+for the same reason. The \fB-s\fP command line option can be overridden for
+specific patterns that should never be studied (see the /S option below).
.TP 10
\fB-t\fP
Run each compile, study, and match many times with a timer, and output
@@ -193,10 +209,10 @@ options that do not correspond to anything in Perl:
\fB/<bsr_unicode>\fP PCRE_BSR_UNICODE
.sp
The modifiers that are enclosed in angle brackets are literal strings as shown,
-including the angle brackets, but the letters can be in either case. This
-example sets multiline matching with CRLF as the line ending sequence:
+including the angle brackets, but the letters within can be in either case.
+This example sets multiline matching with CRLF as the line ending sequence:
.sp
- /^abc/m<crlf>
+ /^abc/m<CRLF>
.sp
As well as turning on the PCRE_UTF8 option, the \fB/8\fP modifier also causes
any non-printing characters in output strings to be printed using the
@@ -290,9 +306,13 @@ which it appears.
The \fB/M\fP modifier causes the size of memory block used to hold the compiled
pattern to be output.
.P
-The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the
-expression has been compiled, and the results used when the expression is
-matched.
+If the \fB/S\fP modifier appears once, it causes \fBpcre_study()\fP to be
+called after the expression has been compiled, and the results used when the
+expression is matched. If \fB/S\fP appears twice, it suppresses studying, even
+if it was requested externally by the \fB-s\fP command line option. This makes
+it possible to specify that certain patterns are always studied, and others are
+never studied, independently of \fB-s\fP. This feature is used in the test
+files in a few cases where the output is different when the pattern is studied.
.P
The \fB/T\fP modifier must be followed by a single digit. It causes a specific
set of built-in character tables to be passed to \fBpcre_compile()\fP. It is
@@ -746,7 +766,7 @@ characters.
For example:
.sp
re> </some/file
- Compiled regex loaded from /some/file
+ Compiled pattern loaded from /some/file
No study data
.sp
When the pattern has been loaded, \fBpcretest\fP proceeds to read data lines in
@@ -792,6 +812,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 06 June 2011
+Last updated: 02 July 2011
Copyright (c) 1997-2011 University of Cambridge.
.fi
diff --git a/pcre_internal.h b/pcre_internal.h
index ae3e6a4..586df5d 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -595,10 +595,10 @@ compatibility. */
#define PCRE_JCHANGED 0x0010 /* j option used in regex */
#define PCRE_HASCRORLF 0x0020 /* explicit \r or \n in pattern */
-/* Options for the "extra" block produced by pcre_study(). */
+/* Flags for the "extra" block produced by pcre_study(). */
-#define PCRE_STUDY_MAPPED 0x01 /* a map of starting chars exists */
-#define PCRE_STUDY_MINLEN 0x02 /* a minimum length field exists */
+#define PCRE_STUDY_MAPPED 0x0001 /* a map of starting chars exists */
+#define PCRE_STUDY_MINLEN 0x0002 /* a minimum length field exists */
/* Masks for identifying the public options that are permitted at compile
time, run time, or study time, respectively. */
diff --git a/pcre_study.c b/pcre_study.c
index ac0dc46..5869f86 100644
--- a/pcre_study.c
+++ b/pcre_study.c
@@ -66,9 +66,10 @@ string of that length that matches. In UTF8 mode, the result is in characters
rather than bytes.
Arguments:
- code pointer to start of group (the bracket)
- startcode pointer to start of the whole pattern
- options the compiling options
+ code pointer to start of group (the bracket)
+ startcode pointer to start of the whole pattern
+ options the compiling options
+ had_accept_ptr pointer to flag for (*ACCEPT) encountered
Returns: the minimum length
-1 if \C was encountered
@@ -77,7 +78,8 @@ Returns: the minimum length
*/
static int
-find_minlength(const uschar *code, const uschar *startcode, int options)
+find_minlength(const uschar *code, const uschar *startcode, int options,
+ BOOL *had_accept_ptr)
{
int length = -1;
BOOL utf8 = (options & PCRE_UTF8) != 0;
@@ -125,17 +127,23 @@ for (;;)
case OP_BRAPOS:
case OP_SBRAPOS:
case OP_ONCE:
- d = find_minlength(cc, startcode, options);
+ d = find_minlength(cc, startcode, options, had_accept_ptr);
if (d < 0) return d;
branchlength += d;
+ if (*had_accept_ptr) return branchlength;
do cc += GET(cc, 1); while (*cc == OP_ALT);
cc += 1 + LINK_SIZE;
break;
/* Reached end of a branch; if it's a ket it is the end of a nested
- call. If it's ALT it is an alternation in a nested call. If it is
- END it's the end of the outer call. All can be handled by the same code. */
-
+ call. If it's ALT it is an alternation in a nested call. If it is END it's
+ the end of the outer call. All can be handled by the same code. If it is
+ ACCEPT, it is essentially the same as END, but we set a flag so that
+ counting stops. */
+
+ case OP_ACCEPT:
+ *had_accept_ptr = TRUE;
+ /* Fall through */
case OP_ALT:
case OP_KET:
case OP_KETRMAX:
@@ -144,7 +152,7 @@ for (;;)
case OP_END:
if (length < 0 || (!had_recurse && branchlength < length))
length = branchlength;
- if (*cc != OP_ALT) return length;
+ if (op != OP_ALT) return length;
cc += 1 + LINK_SIZE;
branchlength = 0;
had_recurse = FALSE;
@@ -367,7 +375,11 @@ for (;;)
d = 0;
had_recurse = TRUE;
}
- else d = find_minlength(cs, startcode, options);
+ else
+ {
+ d = find_minlength(cs, startcode, options, had_accept_ptr);
+ *had_accept_ptr = FALSE;
+ }
}
else d = 0;
cc += 3;
@@ -411,7 +423,10 @@ for (;;)
if (cc > cs && cc < ce)
had_recurse = TRUE;
else
- branchlength += find_minlength(cs, startcode, options);
+ {
+ branchlength += find_minlength(cs, startcode, options, had_accept_ptr);
+ *had_accept_ptr = FALSE;
+ }
cc += 1 + LINK_SIZE;
break;
@@ -479,10 +494,9 @@ for (;;)
case OP_THEN_ARG:
cc += _pcre_OP_lengths[op] + cc[1+LINK_SIZE];
break;
-
+
/* The remaining opcodes are just skipped over. */
- case OP_ACCEPT:
case OP_CLOSE:
case OP_COMMIT:
case OP_FAIL:
@@ -688,6 +702,7 @@ do
while (try_next) /* Loop for items in this branch */
{
int rc;
+
switch(*tcode)
{
/* If we reach something we don't understand, it means a new opcode has
@@ -1200,6 +1215,7 @@ pcre_study(const pcre *external_re, int options, const char **errorptr)
{
int min;
BOOL bits_set = FALSE;
+BOOL had_accept = FALSE;
uschar start_bits[32];
pcre_extra *extra;
pcre_study_data *study;
@@ -1257,7 +1273,7 @@ if ((re->options & PCRE_ANCHORED) == 0 &&
/* Find the minimum length of subject string. */
-switch(min = find_minlength(code, code, re->options))
+switch(min = find_minlength(code, code, re->options, &had_accept))
{
case -2: *errorptr = "internal error: missing capturing bracket"; break;
case -3: *errorptr = "internal error: opcode not recognized"; break;
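
Reviewer note on the pcre_study.c hunks above: the reason (*ACCEPT) matters for the minimum subject length can be shown with a toy model. The sketch below is not the method used in the patch, which walks compiled opcodes and stops counting through the had_accept_ptr flag. It is a different, simplified formulation over a made-up pattern syntax in which every ordinary character is a one-character literal, '(' ')' and '|' group and alternate, and '!' stands in for (*ACCEPT). All names here (MinLen, parse_seq, parse_alts, min_subject_length) are hypothetical and do not exist in PCRE.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define INF (INT_MAX / 2)   /* "no such path" marker; INF + INF cannot overflow */

/* Shortest subject lengths for a piece of a toy pattern. */
typedef struct {
  int normal;   /* shortest path that runs through to the end of the piece */
  int accept;   /* shortest path that stops early at an ACCEPT ('!') */
} MinLen;

static int min2(int a, int b) { return a < b ? a : b; }
static int add(int a, int b) { return min2(INF, a + b); }  /* saturating add */

static MinLen parse_alts(const char **pp);

/* One branch: items up to '|', ')' or the end of the string. */
static MinLen parse_seq(const char **pp)
{
  MinLen r = { 0, INF };
  while (**pp != '\0' && **pp != '|' && **pp != ')') {
    char c = *(*pp)++;
    if (c == '!') {                     /* toy stand-in for (*ACCEPT) */
      r.accept = min2(r.accept, r.normal);
      r.normal = INF;                   /* nothing after it is reachable */
    } else if (c == '(') {
      MinLen g = parse_alts(pp);
      if (**pp == ')') (*pp)++;
      r.accept = min2(r.accept, add(r.normal, g.accept));
      r.normal = add(r.normal, g.normal);
    } else {
      r.normal = add(r.normal, 1);      /* any other character is a literal */
    }
  }
  return r;
}

/* A group body: one or more branches separated by '|'. */
static MinLen parse_alts(const char **pp)
{
  MinLen best = parse_seq(pp);
  while (**pp == '|') {
    (*pp)++;
    MinLen b = parse_seq(pp);
    best.normal = min2(best.normal, b.normal);
    best.accept = min2(best.accept, b.accept);
  }
  return best;
}

/* Minimum subject length for a whole toy pattern. */
int min_subject_length(const char *pattern)
{
  assert(pattern != NULL);
  const char *p = pattern;
  MinLen m = parse_alts(&p);
  return min2(m.normal, m.accept);
}
```

In this toy syntax the test pattern /(A (A|B(*ACCEPT)|C) D)(E)/x from testinput11 becomes "(a(a|b!|c)d)(e)", for which min_subject_length returns 2, the length of the new subject line "AB". Without the '!' the same pattern gives 4, which is the over-estimate that would wrongly rule out "AB" once the pattern is studied.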
diff --git a/pcretest.c b/pcretest.c
index 8a4edb6..6bf6173 100644
--- a/pcretest.c
+++ b/pcretest.c
@@ -1436,6 +1436,7 @@ while (!done)
size_t size, regex_gotten_store;
int do_mark = 0;
int do_study = 0;
+ int no_force_study = 0;
int do_debug = debug;
int do_G = 0;
int do_g = 0;
@@ -1502,7 +1503,7 @@ while (!done)
}
}
- fprintf(outfile, "Compiled regex%s loaded from %s\n",
+ fprintf(outfile, "Compiled pattern%s loaded from %s\n",
do_flip? " (byte-inverted)" : "", p);
/* Need to know if UTF-8 for printing data strings */
@@ -1510,7 +1511,7 @@ while (!done)
new_info(re, NULL, PCRE_INFO_OPTIONS, &get_options);
use_utf8 = (get_options & PCRE_UTF8) != 0;
- /* Now see if there is any following study data */
+ /* Now see if there is any following study data. */
if (true_study_size != 0)
{
@@ -1624,7 +1625,14 @@ while (!done)
case 'P': do_posix = 1; break;
#endif
- case 'S': do_study = 1; break;
+ case 'S':
+ if (do_study == 0) do_study = 1; else
+ {
+ do_study = 0;
+ no_force_study = 1;
+ }
+ break;
+
case 'U': options |= PCRE_UNGREEDY; break;
case 'W': options |= PCRE_UCP; break;
case 'X': options |= PCRE_EXTRA; break;
@@ -1808,10 +1816,12 @@ while (!done)
true_size = ((real_pcre *)re)->size;
regex_gotten_store = gotten_store;
- /* If -s or /S was present, study the regexp to generate additional info to
- help with the matching. */
+ /* If -s or /S was present, study the regex to generate additional info to
+ help with the matching, unless the pattern has the SS option, which
+ suppresses the effect of /S (used for a few test patterns where studying is
+ never sensible). */
- if (do_study || force_study)
+ if (do_study || (force_study && !no_force_study))
{
if (timeit > 0)
{
@@ -2049,9 +2059,12 @@ while (!done)
/* Don't output study size; at present it is in any case a fixed
value, but it varies, depending on the computer architecture, and
so messes up the test suite. (And with the /F option, it might be
- flipped.) */
+ flipped.) If study was forced by an external -s, don't show this
+ information unless -i or -d was also present. This means that, except
+ when auto-callouts are involved, the output from runs with and without
+ -s should be identical. */
- if (do_study || force_study)
+ if (do_study || (force_study && showinfo && !no_force_study))
{
if (extra == NULL)
fprintf(outfile, "Study returned NULL\n");
@@ -2129,7 +2142,11 @@ while (!done)
}
else
{
- fprintf(outfile, "Compiled regex written to %s\n", to_file);
+ fprintf(outfile, "Compiled pattern written to %s\n", to_file);
+
+ /* If there is study data, write it, but verify the writing only
+ if the studying was requested by /S, not just by -s. */
+
if (extra != NULL)
{
if (fwrite(extra->study_data, 1, true_study_size, f) <
@@ -2139,7 +2156,6 @@ while (!done)
strerror(errno));
}
else fprintf(outfile, "Study data written to %s\n", to_file);
-
}
}
fclose(f);
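
Reviewer note on the pcretest.c hunks above: the interaction between the /S modifier and the -s command line option can be summarized by a small stand-alone function. This is only a sketch under stated assumptions. The function should_study and its parameters are hypothetical and are not part of pcretest; the local variables do_study and no_force_study, and the final condition, mirror the names and the test used in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: decide whether a pattern is studied.
   modifiers   - the modifier letters that follow the pattern, e.g. "ISS"
   force_study - true when -s was given on the command line */
bool should_study(const char *modifiers, bool force_study)
{
  bool do_study = false;        /* set by a single /S */
  bool no_force_study = false;  /* set by a second /S; cancels -s */

  assert(modifiers != NULL);
  for (const char *p = modifiers; *p != '\0'; p++) {
    if (*p != 'S') continue;    /* other modifiers are ignored in this sketch */
    if (!do_study) {
      do_study = true;
    } else {
      do_study = false;
      no_force_study = true;
    }
  }

  /* Same shape as the condition in the diff. */
  return do_study || (force_study && !no_force_study);
}
```

With this model, a pattern marked /S is studied whether or not -s is given, a pattern marked /SS is never studied, and an unmarked pattern is studied only under -s. That is what lets RunTest run every test file twice while keeping the expected output identical for the few patterns whose traces depend on studying.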
diff --git a/perltest.pl b/perltest.pl
index 424de2d..9eaa8ac 100755
--- a/perltest.pl
+++ b/perltest.pl
@@ -103,6 +103,10 @@ for (;;)
$pattern =~ s/W(?=[a-zA-Z]*$)//;
+ # Remove /S or /SS from a pattern (asks pcretest to study or not to study)
+
+ $pattern =~ s/S(?=[a-zA-Z]*$)//g;
+
# Check that the pattern is valid
eval "\$_ =~ ${pattern}";
diff --git a/testdata/testinput11 b/testdata/testinput11
index 9631eb8..cf02fac 100644
--- a/testdata/testinput11
+++ b/testdata/testinput11
@@ -246,6 +246,7 @@
aaabccc
/(A (A|B(*ACCEPT)|C) D)(E)/x
+ AB
ABX
AADE
ACDE
@@ -403,7 +404,10 @@
AC
CB
-/(*MARK:A)(*SKIP:B)(C|X)/K
+/--- Force no study, otherwise mark is not seen. The studied version is in
+ test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
C
D
@@ -435,9 +439,9 @@ with the handling of backtracking verbs. ---/
/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
AAAC
-/--- Don't loop! ---/
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/
-/(*:A)A+(*SKIP:A)(B|Z)/K
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
AAAC
/--- This should succeed, as a non-existent skip name disables the skip ---/
diff --git a/testdata/testinput2 b/testdata/testinput2
index f0a32ac..d97050f 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -1061,7 +1061,12 @@
/abc(?C)de(?C1)f/I
123abcdef
-/(?C1)\dabc(?C2)def/I
+/(?C1)\dabc(?C2)def/IS
+ 1234abcdef
+ *** Failers
+ abcdef
+
+/(?C1)\dabc(?C2)def/ISS
1234abcdef
*** Failers
abcdef
@@ -1310,7 +1315,12 @@
abcde
abcdfe
-/a*b/ICDZ
+/a*b/ICDZS
+ ab
+ aaaab
+ aaaacb
+
+/a*b/ICDZSS
ab
aaaab
aaaacb
@@ -1320,9 +1330,16 @@
aaaab
aaaacb
-/(abc|def)x/ICDZ
+/(abc|def)x/ICDZS
abcx
defx
+ ** Failers
+ abcdefzx
+
+/(abc|def)x/ICDZSS
+ abcx
+ defx
+ ** Failers
abcdefzx
/(ab|cd){3,4}/IC
@@ -1330,7 +1347,10 @@
abcdabcd
abcdcdcdcdcd
-/([ab]{,4}c|xy)/ICDZ
+/([ab]{,4}c|xy)/ICDZS
+ Note: that { does NOT introduce a quantifier
+
+/([ab]{,4}c|xy)/ICDZSS
Note: that { does NOT introduce a quantifier
/([ab]{1,4}c|xy){4,5}?123/ICDZ
@@ -1404,13 +1424,25 @@
1X
123456\P
-/abc/I>testsavedregex
+/abc/IS>testsavedregex
+<testsavedregex
+ abc
+ ** Failers
+ bca
+
+/abc/ISS>testsavedregex
+<testsavedregex
+ abc
+ ** Failers
+ bca
+
+/abc/IFS>testsavedregex
<testsavedregex
abc
** Failers
bca
-/abc/IF>testsavedregex
+/abc/IFSS>testsavedregex
<testsavedregex
abc
** Failers
@@ -1422,12 +1454,24 @@
** Failers
def
+/(a|b)/ISS>testsavedregex
+<testsavedregex
+ abc
+ ** Failers
+ def
+
/(a|b)/ISF>testsavedregex
<testsavedregex
abc
** Failers
def
+/(a|b)/ISSF>testsavedregex
+<testsavedregex
+ abc
+ ** Failers
+ def
+
~<(\w+)/?>(.)*</(\1)>~smgI
<!DOCTYPE seite SYSTEM "http://www.lco.lineas.de/xmlCms.dtd">\n<seite>\n<dokumenteninformation>\n<seitentitel>Partner der LCO</seitentitel>\n<sprache>de</sprache>\n<seitenbeschreibung>Partner der LINEAS Consulting\nGmbH</seitenbeschreibung>\n<schluesselworte>LINEAS Consulting GmbH Hamburg\nPartnerfirmen</schluesselworte>\n<revisit>30 days</revisit>\n<robots>index,follow</robots>\n<menueinformation>\n<aktiv>ja</aktiv>\n<menueposition>3</menueposition>\n<menuetext>Partner</menuetext>\n</menueinformation>\n<lastedited>\n<autor>LCO</autor>\n<firma>LINEAS Consulting</firma>\n<datum>15.10.2003</datum>\n</lastedited>\n</dokumenteninformation>\n<inhalt>\n\n<absatzueberschrift>Die Partnerfirmen der LINEAS Consulting\nGmbH</absatzueberschrift>\n\n<absatz><link ziel="http://www.ca.com/" zielfenster="_blank">\n<bild name="logo_ca.gif" rahmen="no"/></link> <link\nziel="http://www.ey.com/" zielfenster="_blank"><bild\nname="logo_euy.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><link ziel="http://www.cisco.de/" zielfenster="_blank">\n<bild name="logo_cisco.gif" rahmen="ja"/></link></absatz>\n\n<absatz><link ziel="http://www.atelion.de/"\nzielfenster="_blank"><bild\nname="logo_atelion.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><link ziel="http://www.line-information.de/"\nzielfenster="_blank">\n<bild name="logo_line_information.gif" rahmen="no"/></link>\n</absatz>\n\n<absatz><bild name="logo_aw.gif" rahmen="no"/></absatz>\n\n<absatz><link ziel="http://www.incognis.de/"\nzielfenster="_blank"><bild\nname="logo_incognis.gif" rahmen="no"/></link></absatz>\n\n<absatz><link ziel="http://www.addcraft.com/"\nzielfenster="_blank"><bild\nname="logo_addcraft.gif" rahmen="no"/></link></absatz>\n\n<absatz><link ziel="http://www.comendo.com/"\nzielfenster="_blank"><bild\nname="logo_comendo.gif" rahmen="no"/></link></absatz>\n\n</inhalt>\n</seite>
@@ -3312,11 +3356,19 @@ name were given. ---/
/A(*PRUNE:A)B/K
ACAB
-/(*MARK:A)(*PRUNE:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
C
D
-/(*MARK:A)(*THEN:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
+ C
+ D
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+ C
+ D
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
C
D
@@ -3681,4 +3733,16 @@ with \Y. ---/
/-- --/
+/-- These studied versions are here because they are not Perl-compatible; the
+ studying means the mark is not seen. --/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KS
+ C
+ D
+
+/(*:A)A+(*SKIP:A)(B|Z)/KS
+ AAAC
+
+/-- --/
+
/-- End of testinput2 --/
diff --git a/testdata/testinput5 b/testdata/testinput5
index 6aeaa4d..62ae695 100644
--- a/testdata/testinput5
+++ b/testdata/testinput5
@@ -198,7 +198,7 @@ correctly, but that messes up comparisons). --/
/ÃÃÃxxx/8
-/ÃÃÃxxx/8?DZ
+/ÃÃÃxxx/8?DZSS
/abc/8
Ã]
diff --git a/testdata/testinput7 b/testdata/testinput7
index 758267e..04a1829 100644
--- a/testdata/testinput7
+++ b/testdata/testinput7
@@ -3973,13 +3973,13 @@
ac
bbbbc
-/abc/>testsavedregex
+/abc/SS>testsavedregex
<testsavedregex
abc
*** Failers
bca
-/abc/F>testsavedregex
+/abc/FSS>testsavedregex
<testsavedregex
abc
*** Failers
diff --git a/testdata/testoutput11 b/testdata/testoutput11
index 425dfcf..7942e15 100644
--- a/testdata/testoutput11
+++ b/testdata/testoutput11
@@ -501,6 +501,10 @@ No match
No match
/(A (A|B(*ACCEPT)|C) D)(E)/x
+ AB
+ 0: AB
+ 1: AB
+ 2: B
ABX
0: AB
1: AB
@@ -821,7 +825,10 @@ No match, mark = A
CB
No match, mark = B
-/(*MARK:A)(*SKIP:B)(C|X)/K
+/--- Force no study, otherwise mark is not seen. The studied version is in
+ test 2 because it isn't Perl-compatible. ---/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KSS
C
0: C
1: C
@@ -864,9 +871,9 @@ with the handling of backtracking verbs. ---/
AAAC
0: AC
-/--- Don't loop! ---/
+/--- Don't loop! Force no study, otherwise mark is not seen. ---/
-/(*:A)A+(*SKIP:A)(B|Z)/K
+/(*:A)A+(*SKIP:A)(B|Z)/KSS
AAAC
No match, mark = A
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index 002a11f..fd81bbe 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -3580,7 +3580,27 @@ Need char = 'f'
1 ^ ^ f
0: abcdef
-/(?C1)\dabc(?C2)def/I
+/(?C1)\dabc(?C2)def/IS
+Capturing subpattern count = 0
+No options
+No first char
+Need char = 'f'
+Subject length lower bound = 7
+Starting byte set: 0 1 2 3 4 5 6 7 8 9
+ 1234abcdef
+--->1234abcdef
+ 1 ^ \d
+ 1 ^ \d
+ 1 ^ \d
+ 1 ^ \d
+ 2 ^ ^ d
+ 0: 4abcdef
+ *** Failers
+No match
+ abcdef
+No match
+
+/(?C1)\dabc(?C2)def/ISS
Capturing subpattern count = 0
No options
No first char
@@ -4778,7 +4798,51 @@ Need char = 'e'
+4 ^ ^ e
No match
-/a*b/ICDZ
+/a*b/ICDZS
+------------------------------------------------------------------
+ Bra
+ Callout 255 0 2
+ a*+
+ Callout 255 2 1
+ b
+ Callout 255 3 0
+ Ket
+ End
+------------------------------------------------------------------
+Capturing subpattern count = 0
+Options:
+No first char
+Need char = 'b'
+Subject length lower bound = 1
+Starting byte set: a b
+ ab
+--->ab
+ +0 ^ a*
+ +2 ^^ b
+ +3 ^ ^
+ 0: ab
+ aaaab
+--->aaaab
+ +0 ^ a*
+ +2 ^ ^ b
+ +3 ^ ^
+ 0: aaaab
+ aaaacb
+--->aaaacb
+ +0 ^ a*
+ +2 ^ ^ b
+ +0 ^ a*
+ +2 ^ ^ b
+ +0 ^ a*
+ +2 ^ ^ b
+ +0 ^ a*
+ +2 ^^ b
+ +0 ^ a*
+ +2 ^ b
+ +3 ^^
+ 0: b
+
+/a*b/ICDZSS
------------------------------------------------------------------
Bra
Callout 255 0 2
@@ -4861,7 +4925,83 @@ Need char = 'b'
+2 ^^ b
No match
-/(abc|def)x/ICDZ
+/(abc|def)x/ICDZS
+------------------------------------------------------------------
+ Bra
+ Callout 255 0 9
+ CBra 1
+ Callout 255 1 1
+ a
+ Callout 255 2 1
+ b
+ Callout 255 3 1
+ c
+ Callout 255 4 0
+ Alt
+ Callout 255 5 1
+ d
+ Callout 255 6 1
+ e
+ Callout 255 7 1
+ f
+ Callout 255 8 0
+ Ket
+ Callout 255 9 1
+ x
+ Callout 255 10 0
+ Ket
+ End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options:
+No first char
+Need char = 'x'
+Subject length lower bound = 4
+Starting byte set: a d
+ abcx
+--->abcx
+ +0 ^ (abc|def)
+ +1 ^ a
+ +2 ^^ b
+ +3 ^ ^ c
+ +4 ^ ^ |
+ +9 ^ ^ x
++10 ^ ^
+ 0: abcx
+ 1: abc
+ defx
+--->defx
+ +0 ^ (abc|def)
+ +1 ^ a
+ +5 ^ d
+ +6 ^^ e
+ +7 ^ ^ f
+ +8 ^ ^ )
+ +9 ^ ^ x
++10 ^ ^
+ 0: defx
+ 1: def
+ ** Failers
+No match
+ abcdefzx
+--->abcdefzx
+ +0 ^ (abc|def)
+ +1 ^ a
+ +2 ^^ b
+ +3 ^ ^ c
+ +4 ^ ^ |
+ +9 ^ ^ x
+ +5 ^ d
+ +0 ^ (abc|def)
+ +1 ^ a
+ +5 ^ d
+ +6 ^^ e
+ +7 ^ ^ f
+ +8 ^ ^ )
+ +9 ^ ^ x
+No match
+
+/(abc|def)x/ICDZSS
------------------------------------------------------------------
Bra
Callout 255 0 9
@@ -4915,6 +5055,8 @@ Need char = 'x'
+10 ^ ^
0: defx
1: def
+ ** Failers
+No match
abcdefzx
--->abcdefzx
+0 ^ (abc|def)
@@ -5015,7 +5157,58 @@ No need char
0: abcdcdcd
1: cd
-/([ab]{,4}c|xy)/ICDZ
+/([ab]{,4}c|xy)/ICDZS
+------------------------------------------------------------------
+ Bra
+ Callout 255 0 14
+ CBra 1
+ Callout 255 1 4
+ [ab]
+ Callout 255 5 1
+ {
+ Callout 255 6 1
+ ,
+ Callout 255 7 1
+ 4
+ Callout 255 8 1
+ }
+ Callout 255 9 1
+ c
+ Callout 255 10 0
+ Alt
+ Callout 255 11 1
+ x
+ Callout 255 12 1
+ y
+ Callout 255 13 0
+ Ket
+ Callout 255 14 0
+ Ket
+ End
+------------------------------------------------------------------
+Capturing subpattern count = 1
+Options:
+No first char
+No need char
+Subject length lower bound = 2
+Starting byte set: a b x
+ Note: that { does NOT introduce a quantifier
+--->Note: that { does NOT introduce a quantifier
+ +0 ^ ([ab]{,4}c|xy)
+ +1 ^ [ab]
+ +5 ^^ {
++11 ^ x
+ +0 ^ ([ab]{,4}c|xy)
+ +1 ^ [ab]
+ +5 ^^ {
++11 ^ x
+ +0 ^ ([ab]{,4}c|xy)
+ +1 ^ [ab]
+ +5 ^^ {
++11 ^ x
+No match
+
+/([ab]{,4}c|xy)/ICDZSS
------------------------------------------------------------------
Bra
Callout 255 0 14
@@ -5467,14 +5660,33 @@ No match
123456\P
No match
-/abc/I>testsavedregex
+/abc/IS>testsavedregex
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Subject length lower bound = 3
+No set of starting bytes
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+Study data loaded from testsavedregex
+ abc
+ 0: abc
+ ** Failers
+No match
+ bca
+No match
+
+/abc/ISS>testsavedregex
Capturing subpattern count = 0
No options
First char = 'a'
Need char = 'c'
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
<testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
No study data
abc
0: abc
@@ -5483,14 +5695,33 @@ No match
bca
No match
-/abc/IF>testsavedregex
+/abc/IFS>testsavedregex
Capturing subpattern count = 0
No options
First char = 'a'
Need char = 'c'
-Compiled regex written to testsavedregex
+Subject length lower bound = 3
+No set of starting bytes
+Compiled pattern written to testsavedregex
+Study data written to testsavedregex
<testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+Study data loaded from testsavedregex
+ abc
+ 0: abc
+ ** Failers
+No match
+ bca
+No match
+
+/abc/IFSS>testsavedregex
+Capturing subpattern count = 0
+No options
+First char = 'a'
+Need char = 'c'
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
No study data
abc
0: abc
@@ -5506,10 +5737,10 @@ No first char
No need char
Subject length lower bound = 1
Starting byte set: a b
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
Study data loaded from testsavedregex
abc
0: a
@@ -5520,6 +5751,24 @@ Study data loaded from testsavedregex
def
No match
+/(a|b)/ISS>testsavedregex
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern loaded from testsavedregex
+No study data
+ abc
+ 0: a
+ 1: a
+ ** Failers
+ 0: a
+ 1: a
+ def
+No match
+
/(a|b)/ISF>testsavedregex
Capturing subpattern count = 1
No options
@@ -5527,10 +5776,10 @@ No first char
No need char
Subject length lower bound = 1
Starting byte set: a b
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
Study data loaded from testsavedregex
abc
0: a
@@ -5541,6 +5790,24 @@ Study data loaded from testsavedregex
def
No match
+/(a|b)/ISSF>testsavedregex
+Capturing subpattern count = 1
+No options
+No first char
+No need char
+Compiled pattern written to testsavedregex
+<testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
+No study data
+ abc
+ 0: a
+ 1: a
+ ** Failers
+ 0: a
+ 1: a
+ def
+No match
+
~<(\w+)/?>(.)*</(\1)>~smgI
Capturing subpattern count = 3
Max back reference = 1
@@ -10805,7 +11072,15 @@ name were given. ---/
ACAB
0: AB
-/(*MARK:A)(*PRUNE:B)(C|X)/K
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KSS
C
0: C
1: C
@@ -10813,7 +11088,15 @@ MK: A
D
No match, mark = B
-/(*MARK:A)(*THEN:B)(C|X)/K
+/(*MARK:A)(*THEN:B)(C|X)/KS
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match
+
+/(*MARK:A)(*THEN:B)(C|X)/KSS
C
0: C
1: C
@@ -11577,4 +11860,21 @@ No match
/-- --/
+/-- These studied versions are here because they are not Perl-compatible; the
+ studying means the mark is not seen. --/
+
+/(*MARK:A)(*SKIP:B)(C|X)/KS
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match
+
+/(*:A)A+(*SKIP:A)(B|Z)/KS
+ AAAC
+No match
+
+/-- --/
+
/-- End of testinput2 --/
diff --git a/testdata/testoutput5 b/testdata/testoutput5
index 129dbc7..9b18300 100644
--- a/testdata/testoutput5
+++ b/testdata/testoutput5
@@ -802,7 +802,7 @@ Failed: invalid UTF-8 string at offset 0
/ÃÃÃxxx/8
Failed: invalid UTF-8 string at offset 0
-/ÃÃÃxxx/8?DZ
+/ÃÃÃxxx/8?DZSS
------------------------------------------------------------------
Bra
\X{c0}\X{c0}\X{c0}xxx
@@ -2184,7 +2184,7 @@ Capturing subpattern count = 0
No options
No first char
No need char
-Subject length lower bound = 2
+Subject length lower bound = 1
Starting byte set: \x0a \x0b \x0c \x0d \x85
/\R/SI8
@@ -2192,7 +2192,7 @@ Capturing subpattern count = 0
Options: utf8
No first char
No need char
-Subject length lower bound = 2
+Subject length lower bound = 1
Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2
/\h*A/SI8
diff --git a/testdata/testoutput7 b/testdata/testoutput7
index ce63a28..45b447a 100644
--- a/testdata/testoutput7
+++ b/testdata/testoutput7
@@ -1011,10 +1011,10 @@ Partial match: efabbbbbbbbbbbbbbbb
0: bbbbbbbbbbbbcdX
/(a|b)/SF>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
Study data loaded from testsavedregex
abc
0: a
@@ -6439,10 +6439,10 @@ Error -17 (backreference condition or recursion test not supported for DFA match
bbbbc
0: c
-/abc/>testsavedregex
-Compiled regex written to testsavedregex
+/abc/SS>testsavedregex
+Compiled pattern written to testsavedregex
<testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
No study data
abc
0: abc
@@ -6451,10 +6451,10 @@ No match
bca
No match
-/abc/F>testsavedregex
-Compiled regex written to testsavedregex
+/abc/FSS>testsavedregex
+Compiled pattern written to testsavedregex
<testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
No study data
abc
0: abc
@@ -6464,10 +6464,10 @@ No match
No match
/(a|b)/S>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
-Compiled regex loaded from testsavedregex
+Compiled pattern loaded from testsavedregex
Study data loaded from testsavedregex
abc
0: a
@@ -6477,10 +6477,10 @@ Study data loaded from testsavedregex
No match
/(a|b)/SF>testsavedregex
-Compiled regex written to testsavedregex
+Compiled pattern written to testsavedregex
Study data written to testsavedregex
<testsavedregex
-Compiled regex (byte-inverted) loaded from testsavedregex
+Compiled pattern (byte-inverted) loaded from testsavedregex
Study data loaded from testsavedregex
abc
0: a