-rw-r--r--  ChangeLog               7
-rw-r--r--  configure.ac            6
-rw-r--r--  doc/pcreapi.3          30
-rw-r--r--  doc/pcrepattern.3     187
-rw-r--r--  doc/pcretest.1         70
-rw-r--r--  pcre.h.in               2
-rw-r--r--  pcre_compile.c        119
-rw-r--r--  pcre_dfa_exec.c        12
-rw-r--r--  pcre_exec.c           614
-rw-r--r--  pcre_internal.h        35
-rw-r--r--  pcre_printint.src       8
-rw-r--r--  pcre_study.c            9
-rw-r--r--  pcreposix.c             5
-rw-r--r--  pcretest.c             29
-rwxr-xr-x  perltest.pl            15
-rw-r--r--  testdata/testinput11   84
-rw-r--r--  testdata/testinput2   211
-rw-r--r--  testdata/testoutput11 127
-rw-r--r--  testdata/testoutput2  322
19 files changed, 1501 insertions, 391 deletions
diff --git a/ChangeLog b/ChangeLog
index 9e833ee..6bdef03 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,6 +1,13 @@
ChangeLog for PCRE
------------------
+Version 8.03 26-Mar-2010
+------------------------
+
+1. Added support for (*MARK:ARG) and for ARG additions to PRUNE, SKIP, and
+ THEN.
+
+
Version 8.02 19-Mar-2010
------------------------
diff --git a/configure.ac b/configure.ac
index e7233c0..5759f9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -9,9 +9,9 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre_major, [8])
-m4_define(pcre_minor, [02])
-m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2010-03-19])
+m4_define(pcre_minor, [03])
+m4_define(pcre_prerelease, [-RC1])
+m4_define(pcre_date, [2010-03-22])
# Libtool shared library interface versions (current:revision:age)
m4_define(libpcre_version, [0:1:0])
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 6341cdc..9daf135 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -747,12 +747,14 @@ out of use. To avoid confusion, they have not been re-used.
57 \eg is not followed by a braced, angle-bracketed, or quoted
name/number or by a plain number
58 a numbered reference must not be zero
- 59 (*VERB) with an argument is not supported
+ 59 an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT)
60 (*VERB) not recognized
61 number is too big
62 subpattern name expected
63 digit expected after (?+
64 ] is an invalid data character in JavaScript compatibility mode
+ 65 different names for subpatterns of the same number are not allowed
+ 66 (*MARK) must have an argument
.sp
The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
be used if the limits were changed when PCRE was built.
@@ -1210,6 +1212,7 @@ fields (not necessarily in this order):
unsigned long int \fImatch_limit_recursion\fP;
void *\fIcallout_data\fP;
const unsigned char *\fItables\fP;
+ unsigned char **\fImark\fP;
.sp
The \fIflags\fP field is a bitmap that specifies which of the other fields
are set. The flag bits are:
@@ -1219,6 +1222,7 @@ are set. The flag bits are:
PCRE_EXTRA_MATCH_LIMIT_RECURSION
PCRE_EXTRA_CALLOUT_DATA
PCRE_EXTRA_TABLES
+ PCRE_EXTRA_MARK
.sp
Other flag bits should be set to zero. The \fIstudy_data\fP field is set in the
\fBpcre_extra\fP block that is returned by \fBpcre_study()\fP, together with
@@ -1281,6 +1285,26 @@ called. See the
\fBpcreprecompile\fP
.\"
documentation for a discussion of saving compiled patterns for later use.
+.P
+If PCRE_EXTRA_MARK is set in the \fIflags\fP field, the \fImark\fP field must
+be set to point to a \fBchar *\fP variable. If the pattern contains any
+backtracking control verbs such as (*MARK:NAME), and the execution ends up with
+a name to pass back, a pointer to the name string (zero terminated) is placed
+in the variable pointed to by the \fImark\fP field. The names are within the
+compiled pattern; if you wish to retain such a name you must copy it before
+freeing the memory of a compiled pattern. If there is no name to pass back, the
+variable pointed to by the \fImark\fP field is set to NULL. For details of the
+backtracking control verbs, see the section entitled
+.\" HTML <a href="pcrepattern#backtrackcontrol">
+.\" </a>
+"Backtracking control"
+.\"
+in the
+.\" HREF
+\fBpcrepattern\fP
+.\"
+documentation.
+.
.
.\" HTML <a name="execoptions"></a>
.SS "Option bits for \fBpcre_exec()\fP"
@@ -2075,6 +2099,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 03 October 2009
-Copyright (c) 1997-2009 University of Cambridge.
+Last updated: 26 March 2010
+Copyright (c) 1997-2010 University of Cambridge.
.fi
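
Reviewer note on the pcreapi.3 hunk above: the new text says that mark names live inside the compiled pattern and must be copied before that memory is freed. A minimal sketch of such a copy follows. The helper name `copy_mark_name` is hypothetical and is not part of the PCRE API; it only assumes that the pointer stored through the `mark` field is either NULL or a zero-terminated string, as the documentation states.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a PCRE function): duplicate a mark name so that
   it remains valid after the compiled pattern has been freed. Returns NULL
   when there is no name to copy or when allocation fails; otherwise the
   caller owns the result and must free() it. */
char *copy_mark_name(const unsigned char *mark)
{
  size_t len;
  char *copy;

  if (mark == NULL) return NULL;          /* no name was passed back */
  len = strlen((const char *)mark);
  copy = malloc(len + 1);
  if (copy == NULL) return NULL;          /* allocation failure */
  memcpy(copy, mark, len + 1);            /* include the terminating zero */
  return copy;
}
```

A caller would typically invoke this right after the match call returns, before releasing the compiled pattern.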
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 27afc4f..a2d02ca 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -2318,6 +2318,7 @@ description of the interface to the callout function is given in the
documentation.
.
.
+.\" HTML <a name="backtrackcontrol"></a>
.SH "BACKTRACKING CONTROL"
.rs
.sp
@@ -2339,15 +2340,27 @@ it does not extend to the surrounding pattern. Note that such subpatterns are
processed as anchored at the point where they are tested.
.P
The new verbs make use of what was previously invalid syntax: an opening
-parenthesis followed by an asterisk. In Perl, they are generally of the form
-(*VERB:ARG) but PCRE does not support the use of arguments, so its general
-form is just (*VERB). Any number of these verbs may occur in a pattern. There
-are two kinds:
+parenthesis followed by an asterisk. They are generally of the form
+(*VERB) or (*VERB:NAME). Some may take either form, with differing behaviour,
+depending on whether or not an argument is present. A name is a sequence of
+letters, digits, and underscores. If the name is empty, that is, if the closing
+parenthesis immediately follows the colon, the effect is as if the colon were
+not there. Any number of these verbs may occur in a pattern.
+.P
+PCRE contains some optimizations that are used to speed up matching by running
+some checks at the start of each match attempt. For example, it may know the
+minimum length of matching subject, or that a particular character must be
+present. When one of these optimizations suppresses the running of a match, any
+included backtracking verbs will not, of course, be processed. You can suppress
+the start-of-match optimizations by setting the PCRE_NO_START_OPTIMIZE option
+when calling \fBpcre_exec()\fP.
+.
.
.SS "Verbs that act immediately"
.rs
.sp
-The following verbs act as soon as they are encountered:
+The following verbs act as soon as they are encountered. They may not be
+followed by a name.
.sp
(*ACCEPT)
.sp
@@ -2374,43 +2387,141 @@ callout feature, as for example in this pattern:
A match with the string "aaaa" always fails, but the callout is taken before
each backtrack happens (in this example, 10 times).
.
+.
+.SS "Recording which path was taken"
+.rs
+.sp
+There is one verb whose main purpose is to track how a match was arrived at,
+though it also has a secondary use in conjunction with advancing the match
+starting point (see (*SKIP) below).
+.sp
+ (*MARK:NAME) or (*:NAME)
+.sp
+A name is always required with this verb. There may be as many instances of
+(*MARK) as you like in a pattern, and their names do not have to be unique.
+.P
+When a match succeeds, the name of the last-encountered (*MARK) is passed back
+to the caller via the \fIpcre_extra\fP data structure, as described in the
+.\" HTML <a href="pcreapi.html#extradata">
+.\" </a>
+section on \fIpcre_extra\fP
+.\"
+in the
+.\" HREF
+\fBpcreapi\fP
+.\"
+documentation. No data is returned for a partial match. Here is an example of
+\fBpcretest\fP output, where the /K modifier requests the retrieval and
+outputting of (*MARK) data:
+.sp
+ /X(*MARK:A)Y|X(*MARK:B)Z/K
+ XY
+ 0: XY
+ MK: A
+ XZ
+ 0: XZ
+ MK: B
+.sp
+The (*MARK) name is tagged with "MK:" in this output, and in this example it
+indicates which of the two alternatives matched. This is a more efficient way
+of obtaining this information than putting each alternative in its own
+capturing parentheses.
+.P
+A name may also be returned after a failed match if the final path through the
+pattern involves (*MARK). However, unless (*MARK) is used in conjunction with
+(*COMMIT), this is unlikely to happen for an unanchored pattern because, as the
+starting point for matching is advanced, the final check is often with an empty
+string, causing a failure before (*MARK) is reached. For example:
+.sp
+ /X(*MARK:A)Y|X(*MARK:B)Z/K
+ XP
+ No match
+.sp
+There are three potential starting points for this match (starting with X,
+starting with P, and with an empty string). If the pattern is anchored, the
+result is different:
+.sp
+ /^X(*MARK:A)Y|^X(*MARK:B)Z/K
+ XP
+ No match, mark = B
+.sp
+PCRE's start-of-match optimizations can also interfere with this. For example,
+if, as a result of a call to \fBpcre_study()\fP, it knows the minimum
+subject length for a match, a shorter subject will not be scanned at all.
+.P
+Note that similar anomalies (though different in detail) exist in Perl, no
+doubt for the same reasons. The use of (*MARK) data after a failed match of an
+unanchored pattern is not recommended, unless (*COMMIT) is involved.
+.
+.
.SS "Verbs that act after backtracking"
.rs
.sp
The following verbs do nothing when they are encountered. Matching continues
-with what follows, but if there is no subsequent match, a failure is forced.
-The verbs differ in exactly what kind of failure occurs.
+with what follows, but if there is no subsequent match, causing a backtrack to
+the verb, a failure is forced. That is, backtracking cannot pass to the left of
+the verb. However, when one of these verbs appears inside an atomic group, its
+effect is confined to that group, because once the group has been matched,
+there is never any backtracking into it. In this situation, backtracking can
+"jump back" to the left of the entire atomic group. (Remember also, as stated
+above, that this localization also applies in subroutine calls and assertions.)
+.P
+These verbs differ in exactly what kind of failure occurs when backtracking
+reaches them.
.sp
(*COMMIT)
.sp
-This verb causes the whole match to fail outright if the rest of the pattern
-does not match. Even if the pattern is unanchored, no further attempts to find
-a match by advancing the starting point take place. Once (*COMMIT) has been
-passed, \fBpcre_exec()\fP is committed to finding a match at the current
-starting point, or not at all. For example:
+This verb, which may not be followed by a name, causes the whole match to fail
+outright if the rest of the pattern does not match. Even if the pattern is
+unanchored, no further attempts to find a match by advancing the starting point
+take place. Once (*COMMIT) has been passed, \fBpcre_exec()\fP is committed to
+finding a match at the current starting point, or not at all. For example:
.sp
a+(*COMMIT)b
.sp
This matches "xxaab" but not "aacaab". It can be thought of as a kind of
-dynamic anchor, or "I've started, so I must finish."
-.sp
- (*PRUNE)
-.sp
-This verb causes the match to fail at the current position if the rest of the
-pattern does not match. If the pattern is unanchored, the normal "bumpalong"
-advance to the next starting character then happens. Backtracking can occur as
-usual to the left of (*PRUNE), or when matching to the right of (*PRUNE), but
-if there is no match to the right, backtracking cannot cross (*PRUNE).
-In simple cases, the use of (*PRUNE) is just an alternative to an atomic
-group or possessive quantifier, but there are some uses of (*PRUNE) that cannot
-be expressed in any other way.
+dynamic anchor, or "I've started, so I must finish." The name of the most
+recently passed (*MARK) in the path is passed back when (*COMMIT) forces a
+match failure.
+.P
+Note that (*COMMIT) at the start of a pattern is not the same as an anchor,
+unless PCRE's start-of-match optimizations are turned off, as shown in this
+\fBpcretest\fP example:
+.sp
+ /(*COMMIT)abc/
+ xyzabc
+ 0: abc
+ xyzabc\eY
+ No match
+.sp
+PCRE knows that any match must start with "a", so the optimization skips along
+the subject to "a" before running the first match attempt, which succeeds. When
+the optimization is disabled by the \eY escape in the second subject, the match
+starts at "x" and so the (*COMMIT) causes it to fail without trying any other
+starting points.
+.sp
+ (*PRUNE) or (*PRUNE:NAME)
+.sp
+This verb causes the match to fail at the current starting position in the
+subject if the rest of the pattern does not match. If the pattern is
+unanchored, the normal "bumpalong" advance to the next starting character then
+happens. Backtracking can occur as usual to the left of (*PRUNE), before it is
+reached, or when matching to the right of (*PRUNE), but if there is no match to
+the right, backtracking cannot cross (*PRUNE). In simple cases, the use of
+(*PRUNE) is just an alternative to an atomic group or possessive quantifier,
+but there are some uses of (*PRUNE) that cannot be expressed in any other way.
+The behaviour of (*PRUNE:NAME) is the same as (*MARK:NAME)(*PRUNE) when the
+match fails completely; the name is passed back if this is the final attempt.
+(*PRUNE:NAME) does not pass back a name if the match succeeds. In an anchored
+pattern (*PRUNE) has the same effect as (*COMMIT).
.sp
(*SKIP)
.sp
-This verb is like (*PRUNE), except that if the pattern is unanchored, the
-"bumpalong" advance is not to the next character, but to the position in the
-subject where (*SKIP) was encountered. (*SKIP) signifies that whatever text
-was matched leading up to it cannot be part of a successful match. Consider:
+This verb, when given without a name, is like (*PRUNE), except that if the
+pattern is unanchored, the "bumpalong" advance is not to the next character,
+but to the position in the subject where (*SKIP) was encountered. (*SKIP)
+signifies that whatever text was matched leading up to it cannot be part of a
+successful match. Consider:
.sp
a+(*SKIP)b
.sp
@@ -2421,7 +2532,17 @@ effect as this example; although it would suppress backtracking during the
first match attempt, the second attempt would start at the second character
instead of skipping on to "c".
.sp
- (*THEN)
+ (*SKIP:NAME)
+.sp
+When (*SKIP) has an associated name, its behaviour is modified. If the
+following pattern fails to match, the previous path through the pattern is
+searched for the most recent (*MARK) that has the same name. If one is found,
+the "bumpalong" advance is to the subject position that corresponds to that
+(*MARK) instead of to where (*SKIP) was encountered. If no (*MARK) with a
+matching name is found, normal "bumpalong" of one character happens (the
+(*SKIP) is ignored).
+.sp
+ (*THEN) or (*THEN:NAME)
.sp
This verb causes a skip to the next alternation if the rest of the pattern does
not match. That is, it cancels pending backtracking, but only within the
@@ -2432,8 +2553,10 @@ for a pattern-based if-then-else block:
.sp
If the COND1 pattern matches, FOO is tried (and possibly further items after
the end of the group if FOO succeeds); on failure the matcher skips to the
-second alternative and tries COND2, without backtracking into COND1. If (*THEN)
-is used outside of any alternation, it acts exactly like (*PRUNE).
+second alternative and tries COND2, without backtracking into COND1. The
+behaviour of (*THEN:NAME) is exactly the same as (*MARK:NAME)(*THEN) if the
+overall match fails. If (*THEN) is not directly inside an alternation, it acts
+like (*PRUNE).
.
.
.SH "SEE ALSO"
@@ -2457,6 +2580,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 06 March 2010
+Last updated: 27 March 2010
Copyright (c) 1997-2010 University of Cambridge.
.fi
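
Reviewer note on the pcrepattern.3 hunks above: the documented difference between (*COMMIT), (*PRUNE), (*SKIP) and (*SKIP:NAME) comes down to where the next match attempt starts in an unanchored pattern. The sketch below models only that decision. All names in it (`fail_kind`, `next_start`, `NO_MORE_ATTEMPTS`, `NO_MARK`) are hypothetical and do not appear in PCRE. The forward-progress guard on the last line is an added assumption of this sketch, not something the documentation states.

```c
/* Hypothetical failure kinds, loosely mirroring the MATCH_xxx codes. */
typedef enum {
  FAIL_NOMATCH,    /* ordinary failure */
  FAIL_PRUNE,      /* backtracked onto (*PRUNE) */
  FAIL_SKIP,       /* backtracked onto (*SKIP) */
  FAIL_SKIP_ARG,   /* backtracked onto (*SKIP:NAME) */
  FAIL_COMMIT      /* backtracked onto (*COMMIT) */
} fail_kind;

#define NO_MORE_ATTEMPTS (-1L)
#define NO_MARK          (-1L)

/* Compute the subject offset of the next match attempt.
   start     offset at which the failed attempt began
   skip_pos  offset at which (*SKIP) was encountered (FAIL_SKIP only)
   mark_pos  offset of the most recent (*MARK) with the same name, or
             NO_MARK if none was passed (FAIL_SKIP_ARG only)
   The caller is still responsible for stopping at the end of the subject. */
long next_start(fail_kind kind, long start, long skip_pos, long mark_pos)
{
  long target;

  switch (kind) {
    case FAIL_COMMIT:
      return NO_MORE_ATTEMPTS;        /* no further starting points */
    case FAIL_SKIP:
      target = skip_pos;              /* jump to where (*SKIP) was */
      break;
    case FAIL_SKIP_ARG:
      target = mark_pos;              /* jump to the named (*MARK) */
      break;
    default:                          /* FAIL_NOMATCH, FAIL_PRUNE */
      target = start + 1;             /* normal one-character bumpalong */
      break;
  }

  /* Assumption of this sketch: always advance by at least one character,
     which also covers the "no matching (*MARK)" case. */
  return (target > start) ? target : start + 1;
}
```

For example, a failure at (*SKIP) encountered at offset 2 during an attempt that began at offset 0 restarts at offset 2, whereas the same failure at (*PRUNE) restarts at offset 1.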
diff --git a/doc/pcretest.1 b/doc/pcretest.1
index c07d42b..692aafd 100644
--- a/doc/pcretest.1
+++ b/doc/pcretest.1
@@ -224,6 +224,16 @@ such cases when using the \fB/g\fP modifier or the \fBsplit()\fP function.
There are yet more modifiers for controlling the way \fBpcretest\fP
operates.
.P
+The \fB/8\fP modifier causes \fBpcretest\fP to call PCRE with the PCRE_UTF8
+option set. This turns on support for UTF-8 character handling in PCRE,
+provided that it was compiled with this support enabled. This modifier also
+causes any non-printing characters in output strings to be printed using the
+\ex{hh...} notation if they are valid UTF-8 sequences.
+.P
+If the \fB/?\fP modifier is used with \fB/8\fP, it causes \fBpcretest\fP to
+call \fBpcre_compile()\fP with the PCRE_NO_UTF8_CHECK option, to suppress the
+checking of the string for UTF-8 validity.
+.P
The \fB/+\fP modifier requests that as well as outputting the substring that
matched the entire pattern, pcretest should in addition output the remainder of
the subject string. This is useful for tests where the subject contains
@@ -236,22 +246,6 @@ also present, this data is replaced by spaces. This is a special feature for
use in the automatic test scripts; it ensures that the same output is generated
for different internal link sizes.
.P
-The \fB/L\fP modifier must be followed directly by the name of a locale, for
-example,
-.sp
- /pattern/Lfr_FR
-.sp
-For this reason, it must be the last modifier. The given locale is set,
-\fBpcre_maketables()\fP is called to build a set of character tables for the
-locale, and this is then passed to \fBpcre_compile()\fP when compiling the
-regular expression. Without an \fB/L\fP modifier, NULL is passed as the tables
-pointer; that is, \fB/L\fP applies only to the expression on which it appears.
-.P
-The \fB/I\fP modifier requests that \fBpcretest\fP output information about the
-compiled pattern (whether it is anchored, has a fixed first character, and
-so on). It does this by calling \fBpcre_fullinfo()\fP after compiling a
-pattern. If the pattern is studied, the results of that are also output.
-.P
The \fB/D\fP modifier is a PCRE debugging feature, and is equivalent to
\fB/BI\fP, that is, both the \fB/B\fP and the \fB/I\fP modifiers.
.P
@@ -263,9 +257,31 @@ available when the POSIX interface to PCRE is being used, that is, when the
\fB/P\fP pattern modifier is specified. See also the section about saving and
reloading compiled patterns below.
.P
-The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the
-expression has been compiled, and the results used when the expression is
-matched.
+The \fB/I\fP modifier requests that \fBpcretest\fP output information about the
+compiled pattern (whether it is anchored, has a fixed first character, and
+so on). It does this by calling \fBpcre_fullinfo()\fP after compiling a
+pattern. If the pattern is studied, the results of that are also output.
+.P
+The \fB/K\fP modifier requests \fBpcretest\fP to show names from backtracking
+control verbs that are returned from calls to \fBpcre_exec()\fP. It causes
+\fBpcretest\fP to create a \fBpcre_extra\fP block if one has not already been
+created by a call to \fBpcre_study()\fP, and to set the PCRE_EXTRA_MARK flag
+and the \fBmark\fP field within it, every time that \fBpcre_exec()\fP is
+called. If the variable that the \fBmark\fP field points to is non-NULL for a
+match, non-match, or partial match, \fBpcretest\fP prints the string to which
+it points. For a match, this is shown on a line by itself, tagged with "MK:".
+For a non-match it is added to the message.
+.P
+The \fB/L\fP modifier must be followed directly by the name of a locale, for
+example,
+.sp
+ /pattern/Lfr_FR
+.sp
+For this reason, it must be the last modifier. The given locale is set,
+\fBpcre_maketables()\fP is called to build a set of character tables for the
+locale, and this is then passed to \fBpcre_compile()\fP when compiling the
+regular expression. Without an \fB/L\fP modifier, NULL is passed as the tables
+pointer; that is, \fB/L\fP applies only to the expression on which it appears.
.P
The \fB/M\fP modifier causes the size of memory block used to hold the compiled
pattern to be output.
@@ -276,15 +292,9 @@ API rather than its native API. When this is done, all other modifiers except
present, and REG_NEWLINE is set if \fB/m\fP is present. The wrapper functions
force PCRE_DOLLAR_ENDONLY always, and PCRE_DOTALL unless REG_NEWLINE is set.
.P
-The \fB/8\fP modifier causes \fBpcretest\fP to call PCRE with the PCRE_UTF8
-option set. This turns on support for UTF-8 character handling in PCRE,
-provided that it was compiled with this support enabled. This modifier also
-causes any non-printing characters in output strings to be printed using the
-\ex{hh...} notation if they are valid UTF-8 sequences.
-.P
-If the \fB/?\fP modifier is used with \fB/8\fP, it causes \fBpcretest\fP to
-call \fBpcre_compile()\fP with the PCRE_NO_UTF8_CHECK option, to suppress the
-checking of the string for UTF-8 validity.
+The \fB/S\fP modifier causes \fBpcre_study()\fP to be called after the
+expression has been compiled, and the results used when the expression is
+matched.
.
.
.SH "DATA LINES"
@@ -731,6 +741,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 26 September 2009
-Copyright (c) 1997-2009 University of Cambridge.
+Last updated: 26 March 2010
+Copyright (c) 1997-2010 University of Cambridge.
.fi
diff --git a/pcre.h.in b/pcre.h.in
index 0eecbbf..b99d647 100644
--- a/pcre.h.in
+++ b/pcre.h.in
@@ -200,6 +200,7 @@ these bits, just add new ones on the end, in order to remain compatible. */
#define PCRE_EXTRA_CALLOUT_DATA 0x0004
#define PCRE_EXTRA_TABLES 0x0008
#define PCRE_EXTRA_MATCH_LIMIT_RECURSION 0x0010
+#define PCRE_EXTRA_MARK 0x0020
/* Types */
@@ -225,6 +226,7 @@ typedef struct pcre_extra {
void *callout_data; /* Data passed back in callouts */
const unsigned char *tables; /* Pointer to character tables */
unsigned long int match_limit_recursion; /* Max recursive calls to match() */
+ unsigned char **mark; /* For passing back a mark pointer */
} pcre_extra;
/* The structure for passing out data via the pcre_callout_function. We use a
diff --git a/pcre_compile.c b/pcre_compile.c
index 6ea9c74..cfa207e 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -188,11 +188,14 @@ string is built from string macros so that it works in UTF-8 mode on EBCDIC
platforms. */
typedef struct verbitem {
- int len;
- int op;
+ int len; /* Length of verb name */
+ int op; /* Op when no arg, or -1 if arg mandatory */
+ int op_arg; /* Op when arg present, or -1 if not allowed */
} verbitem;
static const char verbnames[] =
+ "\0" /* Empty name is a shorthand for MARK */
+ STRING_MARK0
STRING_ACCEPT0
STRING_COMMIT0
STRING_F0
@@ -202,13 +205,15 @@ static const char verbnames[] =
STRING_THEN;
static const verbitem verbs[] = {
- { 6, OP_ACCEPT },
- { 6, OP_COMMIT },
- { 1, OP_FAIL },
- { 4, OP_FAIL },
- { 5, OP_PRUNE },
- { 4, OP_SKIP },
- { 4, OP_THEN }
+ { 0, -1, OP_MARK },
+ { 4, -1, OP_MARK },
+ { 6, OP_ACCEPT, -1 },
+ { 6, OP_COMMIT, -1 },
+ { 1, OP_FAIL, -1 },
+ { 4, OP_FAIL, -1 },
+ { 5, OP_PRUNE, OP_PRUNE_ARG },
+ { 4, OP_SKIP, OP_SKIP_ARG },
+ { 4, OP_THEN, OP_THEN_ARG }
};
static const int verbcount = sizeof(verbs)/sizeof(verbitem);
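
Reviewer note on the verb table hunk above: each entry now carries two opcodes, one for the bare verb and one for the verb with an argument, with -1 meaning "mandatory argument" in the first slot and "argument not allowed" in the second. The standalone sketch below shows how such a table can be consulted. The opcode numbers, the `verb_entry` type, the `classify_verb` function and the negative result codes are all hypothetical stand-ins chosen for illustration; the real table stores name lengths and walks a packed name string instead of holding pointers.

```c
#include <string.h>

/* Hypothetical opcode values; the real ones are defined in pcre_internal.h. */
enum {
  OP_MARK = 1, OP_ACCEPT, OP_COMMIT, OP_FAIL,
  OP_PRUNE, OP_PRUNE_ARG, OP_SKIP, OP_SKIP_ARG, OP_THEN, OP_THEN_ARG
};

/* Hypothetical result codes, numbered after the error messages in the diff. */
enum {
  VERB_ARG_FORBIDDEN = -59,   /* argument given to ACCEPT, FAIL or COMMIT */
  VERB_UNKNOWN       = -60,   /* verb not recognized */
  VERB_ARG_REQUIRED  = -66    /* (*MARK) without an argument */
};

typedef struct {
  const char *name;   /* verb name; "" is the (*:NAME) shorthand for MARK */
  int op;             /* opcode without argument, or -1 if one is mandatory */
  int op_arg;         /* opcode with argument, or -1 if not allowed */
} verb_entry;

static const verb_entry verb_table[] = {
  { "",       -1,        OP_MARK      },
  { "MARK",   -1,        OP_MARK      },
  { "ACCEPT", OP_ACCEPT, -1           },
  { "COMMIT", OP_COMMIT, -1           },
  { "F",      OP_FAIL,   -1           },
  { "FAIL",   OP_FAIL,   -1           },
  { "PRUNE",  OP_PRUNE,  OP_PRUNE_ARG },
  { "SKIP",   OP_SKIP,   OP_SKIP_ARG  },
  { "THEN",   OP_THEN,   OP_THEN_ARG  }
};

/* Return the opcode to emit for a verb name and argument length, or one of
   the negative VERB_xxx codes above. */
int classify_verb(const char *name, size_t arglen)
{
  size_t i;

  for (i = 0; i < sizeof(verb_table) / sizeof(verb_table[0]); i++) {
    if (strcmp(name, verb_table[i].name) != 0) continue;
    if (arglen == 0)
      return (verb_table[i].op < 0) ? VERB_ARG_REQUIRED : verb_table[i].op;
    return (verb_table[i].op_arg < 0) ? VERB_ARG_FORBIDDEN
                                      : verb_table[i].op_arg;
  }
  return VERB_UNKNOWN;
}
```

An empty argument is treated as no argument at all, matching the documented rule that (*VERB:) behaves as if the colon were not there.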
@@ -345,7 +350,7 @@ static const char error_texts[] =
"inconsistent NEWLINE options\0"
"\\g is not followed by a braced, angle-bracketed, or quoted name/number or by a plain number\0"
"a numbered reference must not be zero\0"
- "(*VERB) with an argument is not supported\0"
+ "an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT)\0"
/* 60 */
"(*VERB) not recognized\0"
"number is too big\0"
@@ -353,7 +358,9 @@ static const char error_texts[] =
"digit expected after (?+\0"
"] is an invalid data character in JavaScript compatibility mode\0"
/* 65 */
- "different names for subpatterns of the same number are not allowed\0";
+ "different names for subpatterns of the same number are not allowed\0"
+ "(*MARK) must have an argument\0"
+ ;
/* Table to identify digits and hex digits. This is used when compiling
patterns. Note that the tables in chartables are dependent on the locale, and
@@ -1615,7 +1622,8 @@ for (;;)
/* Otherwise, we can get the item's length from the table, except that for
repeated character types, we have to test for \p and \P, which have an extra
- two bytes of parameters. */
+ two bytes of parameters, and for MARK/PRUNE/SKIP/THEN with an argument, we
+ must add in its length. */
else
{
@@ -1639,6 +1647,13 @@ for (;;)
case OP_TYPEPOSUPTO:
if (code[3] == OP_PROP || code[3] == OP_NOTPROP) code += 2;
break;
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ code += code[1];
+ break;
}
/* Add in the fixed length from the table */
@@ -1710,7 +1725,8 @@ for (;;)
/* Otherwise, we can get the item's length from the table, except that for
repeated character types, we have to test for \p and \P, which have an extra
- two bytes of parameters. */
+ two bytes of parameters, and for MARK/PRUNE/SKIP/THEN with an argument, we
+ must add in its length. */
else
{
@@ -1734,6 +1750,13 @@ for (;;)
case OP_TYPEEXACT:
if (code[3] == OP_PROP || code[3] == OP_NOTPROP) code += 2;
break;
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ code += code[1];
+ break;
}
/* Add in the fixed length from the table */
@@ -2003,6 +2026,16 @@ for (code = first_significant_code(code + _pcre_OP_lengths[*code], NULL, 0, TRUE
break;
#endif
+ /* MARK, and PRUNE/SKIP/THEN with an argument must skip over the argument
+ string. */
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ code += code[1];
+ break;
+
/* None of the remaining opcodes are required to match a character. */
default:
@@ -4514,24 +4547,34 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
/* First deal with various "verbs" that can be introduced by '*'. */
- if (*(++ptr) == CHAR_ASTERISK && (cd->ctypes[ptr[1]] & ctype_letter) != 0)
+ if (*(++ptr) == CHAR_ASTERISK &&
+ ((cd->ctypes[ptr[1]] & ctype_letter) != 0 || ptr[1] == ':'))
{
int i, namelen;
+ int arglen = 0;
const char *vn = verbnames;
- const uschar *name = ++ptr;
+ const uschar *name = ptr + 1;
+ const uschar *arg = NULL;
previous = NULL;
while ((cd->ctypes[*++ptr] & ctype_letter) != 0) {};
+ namelen = ptr - name;
+
if (*ptr == CHAR_COLON)
{
- *errorcodeptr = ERR59; /* Not supported */
- goto FAILED;
+ arg = ++ptr;
+ while ((cd->ctypes[*ptr] & (ctype_letter|ctype_digit)) != 0
+ || *ptr == '_') ptr++;
+ arglen = ptr - arg;
}
+
if (*ptr != CHAR_RIGHT_PARENTHESIS)
{
*errorcodeptr = ERR60;
goto FAILED;
}
- namelen = ptr - name;
+
+ /* Scan the table of verb names */
+
for (i = 0; i < verbcount; i++)
{
if (namelen == verbs[i].len &&
@@ -4549,13 +4592,41 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
PUT2INC(code, 0, oc->number);
}
}
- *code++ = verbs[i].op;
- break;
+
+ /* Handle the cases with/without an argument */
+
+ if (arglen == 0)
+ {
+ if (verbs[i].op < 0) /* Argument is mandatory */
+ {
+ *errorcodeptr = ERR66;
+ goto FAILED;
+ }
+ *code++ = verbs[i].op;
+ }
+
+ else
+ {
+ if (verbs[i].op_arg < 0) /* Argument is forbidden */
+ {
+ *errorcodeptr = ERR59;
+ goto FAILED;
+ }
+ *code++ = verbs[i].op_arg;
+ *code++ = arglen;
+ memcpy(code, arg, arglen);
+ code += arglen;
+ *code++ = 0;
+ }
+
+ break; /* Found verb, exit loop */
}
+
vn += verbs[i].len + 1;
}
- if (i < verbcount) continue;
- *errorcodeptr = ERR60;
+
+ if (i < verbcount) continue; /* Successfully handled a verb */
+ *errorcodeptr = ERR60; /* Verb not recognized */
goto FAILED;
}
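
Reviewer note on the hunk above: a verb with an argument is emitted as the opcode, a one-byte length, the name bytes, and a terminating zero. That layout is what lets the other hunks step over the item with `code += code[1]` plus the fixed table length, and read the name at `ecode + 2`. The sketch below reproduces the layout in isolation. The function names and `VERB_ARG_FIXED_LEN` are hypothetical; the value 3 is inferred from the emitted bytes (opcode, length byte, terminator) rather than taken from the length table, which is not shown here. The 255-byte guard is an addition of this sketch; the hunk itself does not check the length.

```c
#include <string.h>

/* Assumed fixed part of the item: opcode + length byte + terminating zero. */
#define VERB_ARG_FIXED_LEN 3

/* Write a verb-with-argument item at 'code' and return the number of bytes
   written, or 0 if the argument cannot be described by a one-byte length. */
size_t emit_verb_arg(unsigned char *code, unsigned char op,
                     const char *arg, size_t arglen)
{
  unsigned char *start = code;

  if (arglen > 255) return 0;         /* guard added by this sketch */
  *code++ = op;                       /* opcode */
  *code++ = (unsigned char)arglen;    /* argument length */
  memcpy(code, arg, arglen);          /* argument bytes */
  code += arglen;
  *code++ = 0;                        /* terminator, so the name is a C string */
  return (size_t)(code - start);
}

/* Return a pointer to the item that follows a verb-with-argument item. */
const unsigned char *skip_verb_arg(const unsigned char *code)
{
  return code + VERB_ARG_FIXED_LEN + code[1];
}

/* Return the zero-terminated name stored in a verb-with-argument item. */
const char *verb_arg_name(const unsigned char *code)
{
  return (const char *)(code + 2);
}
```

Because the name is stored with its terminator inside the compiled code, a pointer to it can be handed back to the caller directly, which is why the name stays valid only as long as the compiled pattern does.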
@@ -5338,8 +5409,8 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
} /* End of switch for character following (? */
} /* End of (? handling */
- /* Opening parenthesis not followed by '?'. If PCRE_NO_AUTO_CAPTURE is set,
- all unadorned brackets become non-capturing and behave like (?:...)
+ /* Opening parenthesis not followed by '*' or '?'. If PCRE_NO_AUTO_CAPTURE
+ is set, all unadorned brackets become non-capturing and behave like (?:...)
brackets. */
else if ((options & PCRE_NO_AUTO_CAPTURE) != 0)
diff --git a/pcre_dfa_exec.c b/pcre_dfa_exec.c
index d953f99..4db4b2e 100644
--- a/pcre_dfa_exec.c
+++ b/pcre_dfa_exec.c
@@ -106,7 +106,7 @@ never stored, so we push them well clear of the normal opcodes. */
/* This table identifies those opcodes that are followed immediately by a
-character that is to be tested in some way. This makes is possible to
+character that is to be tested in some way. This makes it possible to
centralize the loading of these characters. In the case of Type * etc, the
"character" is the opcode for \D, \d, \S, \s, \W, or \w, which will always be a
small value. Non-zero values in the table are the offsets from the opcode where
@@ -161,8 +161,9 @@ static const uschar coptable[] = {
0, 0, /* RREF, NRREF */
0, /* DEF */
0, 0, /* BRAZERO, BRAMINZERO */
- 0, 0, 0, 0, /* PRUNE, SKIP, THEN, COMMIT */
- 0, 0, 0, 0 /* FAIL, ACCEPT, CLOSE, SKIPZERO */
+ 0, 0, 0, /* MARK, PRUNE, PRUNE_ARG, */
+ 0, 0, 0, 0, /* SKIP, SKIP_ARG, THEN, THEN_ARG, */
+ 0, 0, 0, 0, 0 /* COMMIT, FAIL, ACCEPT, CLOSE, SKIPZERO */
};
/* This table identifies those opcodes that inspect a character. It is used to
@@ -218,8 +219,9 @@ static const uschar poptable[] = {
0, 0, /* RREF, NRREF */
0, /* DEF */
0, 0, /* BRAZERO, BRAMINZERO */
- 0, 0, 0, 0, /* PRUNE, SKIP, THEN, COMMIT */
- 0, 0, 0, 0 /* FAIL, ACCEPT, CLOSE, SKIPZERO */
+ 0, 0, 0, /* MARK, PRUNE, PRUNE_ARG, */
+ 0, 0, 0, 0, /* SKIP, SKIP_ARG, THEN, THEN_ARG, */
+ 0, 0, 0, 0, 0 /* COMMIT, FAIL, ACCEPT, CLOSE, SKIPZERO */
};
/* These 2 tables allow for compact code for testing for \D, \d, \S, \s, \W,
diff --git a/pcre_exec.c b/pcre_exec.c
index 0f3176a..5704476 100644
--- a/pcre_exec.c
+++ b/pcre_exec.c
@@ -74,7 +74,16 @@ negative to avoid the external error codes. */
#define MATCH_COMMIT (-999)
#define MATCH_PRUNE (-998)
#define MATCH_SKIP (-997)
-#define MATCH_THEN (-996)
+#define MATCH_SKIP_ARG (-996)
+#define MATCH_THEN (-995)
+
+/* This is a convenience macro for code that occurs many times. */
+
+#define MRRETURN(ra) \
+ { \
+ md->mark = markptr; \
+ RRETURN(ra); \
+ }
/* Maximum number of ints of offset to save on the stack for recursive calls.
If the offset vector is bigger, malloc is used. This should be a multiple of 3,
@@ -413,14 +422,14 @@ the subject. */
if (md->partial != 0 && eptr >= md->end_subject && eptr > mstart)\
{\
md->hitend = TRUE;\
- if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);\
+ if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL);\
}
#define SCHECK_PARTIAL()\
if (md->partial != 0 && eptr > mstart)\
{\
md->hitend = TRUE;\
- if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);\
+ if (md->partial > 1) MRRETURN(PCRE_ERROR_PARTIAL);\
}
@@ -448,13 +457,14 @@ Arguments:
Returns: MATCH_MATCH if matched ) these values are >= 0
MATCH_NOMATCH if failed to match )
+ a negative MATCH_xxx value for PRUNE, SKIP, etc
a negative PCRE_ERROR_xxx value if aborted by an error condition
(e.g. stopped by repeated call or recursion limit)
*/
static int
-match(REGISTER USPTR eptr, REGISTER const uschar *ecode, USPTR mstart, USPTR
- markptr, int offset_top, match_data *md, unsigned long int ims,
+match(REGISTER USPTR eptr, REGISTER const uschar *ecode, USPTR mstart,
+ const uschar *markptr, int offset_top, match_data *md, unsigned long int ims,
eptrblock *eptrb, int flags, unsigned int rdepth)
{
/* These variables do not need to be preserved over recursion in this function,
@@ -671,32 +681,81 @@ for (;;)
switch(op)
{
+ case OP_MARK:
+ markptr = ecode + 2;
+ RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+ ims, eptrb, flags, RM51);
+
+ /* A return of MATCH_SKIP_ARG means that matching failed at SKIP with an
+ argument, and we must check whether that argument matches this MARK's
+ argument. It is passed back in md->start_match_ptr (an overloading of that
+ variable). If it does match, we reset that variable to the current subject
+ position and return MATCH_SKIP. Otherwise, pass back the return code
+ unaltered. */
+
+ if (rrc == MATCH_SKIP_ARG &&
+ strcmp((char *)markptr, (char *)(md->start_match_ptr)) == 0)
+ {
+ md->start_match_ptr = eptr;
+ RRETURN(MATCH_SKIP);
+ }
+
+ if (md->mark == NULL) md->mark = markptr;
+ RRETURN(rrc);
+
case OP_FAIL:
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
+
+ case OP_COMMIT:
+ RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
+ ims, eptrb, flags, RM52);
+ if (rrc != MATCH_NOMATCH) RRETURN(rrc);
+ MRRETURN(MATCH_COMMIT);
case OP_PRUNE:
RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
ims, eptrb, flags, RM51);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- RRETURN(MATCH_PRUNE);
+ MRRETURN(MATCH_PRUNE);
- case OP_COMMIT:
- RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
- ims, eptrb, flags, RM52);
+ case OP_PRUNE_ARG:
+ RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+ ims, eptrb, flags, RM51);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- RRETURN(MATCH_COMMIT);
+ md->mark = ecode + 2;
+ RRETURN(MATCH_PRUNE);
case OP_SKIP:
RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
ims, eptrb, flags, RM53);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
md->start_match_ptr = eptr; /* Pass back current position */
- RRETURN(MATCH_SKIP);
+ MRRETURN(MATCH_SKIP);
+ case OP_SKIP_ARG:
+ RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+ ims, eptrb, flags, RM53);
+ if (rrc != MATCH_NOMATCH) RRETURN(rrc);
+
+ /* Pass back the current skip name by overloading md->start_match_ptr and
+ returning the special MATCH_SKIP_ARG return code. This will either be
+ caught by a matching MARK, or get to the top, where it is treated the same
+ as PRUNE. */
+
+ md->start_match_ptr = ecode + 2;
+ RRETURN(MATCH_SKIP_ARG);
+
case OP_THEN:
RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md,
ims, eptrb, flags, RM54);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
+ MRRETURN(MATCH_THEN);
+
+ case OP_THEN_ARG:
+ RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode] + ecode[1], offset_top, md,
+ ims, eptrb, flags, RM54);
+ if (rrc != MATCH_NOMATCH) RRETURN(rrc);
+ md->mark = ecode + 2;
RRETURN(MATCH_THEN);
/* Handle a capturing bracket. If there is space in the offset vector, save
@@ -752,6 +811,7 @@ for (;;)
md->offset_vector[offset+1] = save_offset2;
md->offset_vector[md->offset_end - number] = save_offset3;
+ if (rrc != MATCH_THEN) md->mark = markptr;
RRETURN(MATCH_NOMATCH);
}
@@ -791,7 +851,8 @@ for (;;)
RMATCH(eptr, ecode + _pcre_OP_lengths[*ecode], offset_top, md, ims,
eptrb, flags, RM48);
- RRETURN(rrc);
+ if (rrc == MATCH_NOMATCH) md->mark = markptr;
+ RRETURN(rrc);
}
/* For non-final alternatives, continue the loop for a NOMATCH result;
@@ -834,7 +895,7 @@ for (;;)
cb.capture_top = offset_top/2;
cb.capture_last = md->capture_last;
cb.callout_data = md->callout_data;
- if ((rrc = (*pcre_callout)(&cb)) > 0) RRETURN(MATCH_NOMATCH);
+ if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
if (rrc < 0) RRETURN(rrc);
}
ecode += _pcre_OP_lengths[OP_CALLOUT];
@@ -1089,14 +1150,14 @@ for (;;)
(md->notempty ||
(md->notempty_atstart &&
mstart == md->start_subject + md->start_offset)))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
/* Otherwise, we have a match. */
md->end_match_ptr = eptr; /* Record where we ended */
md->end_offset_top = offset_top; /* and how many extracts were taken */
md->start_match_ptr = mstart; /* and the start (\K can modify) */
- RRETURN(MATCH_MATCH);
+ MRRETURN(MATCH_MATCH);
/* Change option settings */
@@ -1127,7 +1188,7 @@ for (;;)
ecode += GET(ecode, 1);
}
while (*ecode == OP_ALT);
- if (*ecode == OP_KET) RRETURN(MATCH_NOMATCH);
+ if (*ecode == OP_KET) MRRETURN(MATCH_NOMATCH);
/* If checking an assertion for a condition, return MATCH_MATCH. */
@@ -1151,7 +1212,7 @@ for (;;)
{
RMATCH(eptr, ecode + 1 + LINK_SIZE, offset_top, md, ims, NULL, 0,
RM5);
- if (rrc == MATCH_MATCH) RRETURN(MATCH_NOMATCH);
+ if (rrc == MATCH_MATCH) MRRETURN(MATCH_NOMATCH);
if (rrc == MATCH_SKIP || rrc == MATCH_PRUNE || rrc == MATCH_COMMIT)
{
do ecode += GET(ecode,1); while (*ecode == OP_ALT);
@@ -1180,7 +1241,7 @@ for (;;)
while (i-- > 0)
{
eptr--;
- if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
+ if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
BACKCHAR(eptr);
}
}
@@ -1191,7 +1252,7 @@ for (;;)
{
eptr -= GET(ecode, 1);
- if (eptr < md->start_subject) RRETURN(MATCH_NOMATCH);
+ if (eptr < md->start_subject) MRRETURN(MATCH_NOMATCH);
}
/* Save the earliest consulted character, then skip to next op code */
@@ -1220,7 +1281,7 @@ for (;;)
cb.capture_top = offset_top/2;
cb.capture_last = md->capture_last;
cb.callout_data = md->callout_data;
- if ((rrc = (*pcre_callout)(&cb)) > 0) RRETURN(MATCH_NOMATCH);
+ if ((rrc = (*pcre_callout)(&cb)) > 0) MRRETURN(MATCH_NOMATCH);
if (rrc < 0) RRETURN(rrc);
}
ecode += 2 + 2*LINK_SIZE;
@@ -1292,7 +1353,7 @@ for (;;)
md->recursive = new_recursive.prevrec;
if (new_recursive.offset_save != stacksave)
(pcre_free)(new_recursive.offset_save);
- RRETURN(MATCH_MATCH);
+ MRRETURN(MATCH_MATCH);
}
else if (rrc != MATCH_NOMATCH && rrc != MATCH_THEN)
{
@@ -1313,7 +1374,7 @@ for (;;)
md->recursive = new_recursive.prevrec;
if (new_recursive.offset_save != stacksave)
(pcre_free)(new_recursive.offset_save);
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never reaches here */
@@ -1467,7 +1528,7 @@ for (;;)
md->end_match_ptr = eptr; /* For ONCE */
md->end_offset_top = offset_top;
md->start_match_ptr = mstart;
- RRETURN(MATCH_MATCH);
+ MRRETURN(MATCH_MATCH);
}
/* For capturing groups we have to check the group number back at the start
@@ -1562,12 +1623,12 @@ for (;;)
/* Start of subject unless notbol, or after internal newline if multiline */
case OP_CIRC:
- if (md->notbol && eptr == md->start_subject) RRETURN(MATCH_NOMATCH);
+ if (md->notbol && eptr == md->start_subject) MRRETURN(MATCH_NOMATCH);
if ((ims & PCRE_MULTILINE) != 0)
{
if (eptr != md->start_subject &&
(eptr == md->end_subject || !WAS_NEWLINE(eptr)))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
}
@@ -1576,14 +1637,14 @@ for (;;)
/* Start of subject assertion */
case OP_SOD:
- if (eptr != md->start_subject) RRETURN(MATCH_NOMATCH);
+ if (eptr != md->start_subject) MRRETURN(MATCH_NOMATCH);
ecode++;
break;
/* Start of match assertion */
case OP_SOM:
- if (eptr != md->start_subject + md->start_offset) RRETURN(MATCH_NOMATCH);
+ if (eptr != md->start_subject + md->start_offset) MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1601,20 +1662,20 @@ for (;;)
if ((ims & PCRE_MULTILINE) != 0)
{
if (eptr < md->end_subject)
- { if (!IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH); }
+ { if (!IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH); }
else
- { if (md->noteol) RRETURN(MATCH_NOMATCH); }
+ { if (md->noteol) MRRETURN(MATCH_NOMATCH); }
ecode++;
break;
}
else
{
- if (md->noteol) RRETURN(MATCH_NOMATCH);
+ if (md->noteol) MRRETURN(MATCH_NOMATCH);
if (!md->endonly)
{
if (eptr != md->end_subject &&
(!IS_NEWLINE(eptr) || eptr != md->end_subject - md->nllen))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
}
@@ -1624,7 +1685,7 @@ for (;;)
/* End of subject assertion (\z) */
case OP_EOD:
- if (eptr < md->end_subject) RRETURN(MATCH_NOMATCH);
+ if (eptr < md->end_subject) MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1633,7 +1694,7 @@ for (;;)
case OP_EODN:
if (eptr != md->end_subject &&
(!IS_NEWLINE(eptr) || eptr != md->end_subject - md->nllen))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1693,21 +1754,21 @@ for (;;)
if ((*ecode++ == OP_WORD_BOUNDARY)?
cur_is_word == prev_is_word : cur_is_word != prev_is_word)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
/* Match a single character type; inline for speed */
case OP_ANY:
- if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+ if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
/* Fall through */
case OP_ALLANY:
if (eptr++ >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (utf8) while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
ecode++;
@@ -1720,7 +1781,7 @@ for (;;)
if (eptr++ >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
ecode++;
break;
@@ -1729,7 +1790,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1738,7 +1799,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_digit) != 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1746,7 +1807,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1755,7 +1816,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_digit) == 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1763,7 +1824,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1772,7 +1833,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_space) != 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1780,7 +1841,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1789,7 +1850,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_space) == 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1797,7 +1858,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1806,7 +1867,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_word) != 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1814,7 +1875,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
if (
@@ -1823,7 +1884,7 @@ for (;;)
#endif
(md->ctypes[c] & ctype_word) == 0
)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
ecode++;
break;
@@ -1831,12 +1892,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x000d:
if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
break;
@@ -1849,7 +1910,7 @@ for (;;)
case 0x0085:
case 0x2028:
case 0x2029:
- if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+ if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
break;
}
ecode++;
@@ -1859,7 +1920,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
switch(c)
@@ -1884,7 +1945,7 @@ for (;;)
case 0x202f: /* NARROW NO-BREAK SPACE */
case 0x205f: /* MEDIUM MATHEMATICAL SPACE */
case 0x3000: /* IDEOGRAPHIC SPACE */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
ecode++;
break;
@@ -1893,12 +1954,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
@@ -1927,7 +1988,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
switch(c)
@@ -1940,7 +2001,7 @@ for (;;)
case 0x85: /* NEL */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
ecode++;
break;
@@ -1949,12 +2010,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x0a: /* LF */
case 0x0b: /* VT */
case 0x0c: /* FF */
@@ -1976,7 +2037,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
{
@@ -1985,29 +2046,29 @@ for (;;)
switch(ecode[1])
{
case PT_ANY:
- if (op == OP_NOTPROP) RRETURN(MATCH_NOMATCH);
+ if (op == OP_NOTPROP) MRRETURN(MATCH_NOMATCH);
break;
case PT_LAMP:
if ((prop->chartype == ucp_Lu ||
prop->chartype == ucp_Ll ||
prop->chartype == ucp_Lt) == (op == OP_NOTPROP))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case PT_GC:
if ((ecode[2] != _pcre_ucp_gentype[prop->chartype]) == (op == OP_PROP))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case PT_PC:
if ((ecode[2] != prop->chartype) == (op == OP_PROP))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case PT_SC:
if ((ecode[2] != prop->script) == (op == OP_PROP))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
default:
@@ -2025,12 +2086,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
{
int category = UCD_CATEGORY(c);
- if (category == ucp_M) RRETURN(MATCH_NOMATCH);
+ if (category == ucp_M) MRRETURN(MATCH_NOMATCH);
while (eptr < md->end_subject)
{
int len = 1;
@@ -2109,7 +2170,7 @@ for (;;)
if (!match_ref(offset, eptr, length, md, ims))
{
CHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr += length;
continue; /* With the main loop */
@@ -2129,7 +2190,7 @@ for (;;)
if (!match_ref(offset, eptr, length, md, ims))
{
CHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr += length;
}
@@ -2147,11 +2208,11 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM14);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (!match_ref(offset, eptr, length, md, ims))
{
CHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr += length;
}
@@ -2178,7 +2239,7 @@ for (;;)
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
eptr -= length;
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -2240,16 +2301,16 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
if (c > 255)
{
- if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+ if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
}
else
{
- if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+ if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
}
}
}
@@ -2262,10 +2323,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
c = *eptr++;
- if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+ if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
}
}
@@ -2287,20 +2348,20 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM16);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
if (c > 255)
{
- if (op == OP_CLASS) RRETURN(MATCH_NOMATCH);
+ if (op == OP_CLASS) MRRETURN(MATCH_NOMATCH);
}
else
{
- if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+ if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
}
}
}
@@ -2312,14 +2373,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM17);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
c = *eptr++;
- if ((data[c/8] & (1 << (c&7))) == 0) RRETURN(MATCH_NOMATCH);
+ if ((data[c/8] & (1 << (c&7))) == 0) MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -2385,7 +2446,7 @@ for (;;)
}
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -2437,10 +2498,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
- if (!_pcre_xclass(c, data)) RRETURN(MATCH_NOMATCH);
+ if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
}
/* If max == min we can continue with the main loop without the
@@ -2457,14 +2518,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM20);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
- if (!_pcre_xclass(c, data)) RRETURN(MATCH_NOMATCH);
+ if (!_pcre_xclass(c, data)) MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
}
@@ -2493,7 +2554,7 @@ for (;;)
if (eptr-- == pp) break; /* Stop if tried at original pos */
if (utf8) BACKCHAR(eptr);
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -2512,9 +2573,9 @@ for (;;)
if (length > md->end_subject - eptr)
{
CHECK_PARTIAL(); /* Not SCHECK_PARTIAL() */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- while (length-- > 0) if (*ecode++ != *eptr++) RRETURN(MATCH_NOMATCH);
+ while (length-- > 0) if (*ecode++ != *eptr++) MRRETURN(MATCH_NOMATCH);
}
else
#endif
@@ -2524,9 +2585,9 @@ for (;;)
if (md->end_subject - eptr < 1)
{
SCHECK_PARTIAL(); /* This one can use SCHECK_PARTIAL() */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (ecode[1] != *eptr++) RRETURN(MATCH_NOMATCH);
+ if (ecode[1] != *eptr++) MRRETURN(MATCH_NOMATCH);
ecode += 2;
}
break;
@@ -2544,7 +2605,7 @@ for (;;)
if (length > md->end_subject - eptr)
{
CHECK_PARTIAL(); /* Not SCHECK_PARTIAL() */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* If the pattern character's value is < 128, we have only one byte, and
@@ -2552,7 +2613,7 @@ for (;;)
if (fc < 128)
{
- if (md->lcc[*ecode++] != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (md->lcc[*ecode++] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
}
/* Otherwise we must pick up the subject character */
@@ -2571,7 +2632,7 @@ for (;;)
#ifdef SUPPORT_UCP
if (dc != UCD_OTHERCASE(fc))
#endif
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
}
@@ -2583,9 +2644,9 @@ for (;;)
if (md->end_subject - eptr < 1)
{
SCHECK_PARTIAL(); /* This one can use SCHECK_PARTIAL() */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (md->lcc[ecode[1]] != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (md->lcc[ecode[1]] != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
ecode += 2;
}
break;
@@ -2679,7 +2740,7 @@ for (;;)
else
{
CHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
@@ -2691,7 +2752,7 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM22);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr <= md->end_subject - length &&
memcmp(eptr, charptr, length) == 0) eptr += length;
#ifdef SUPPORT_UCP
@@ -2702,7 +2763,7 @@ for (;;)
else
{
CHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -2733,7 +2794,7 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM23);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (eptr == pp) { RRETURN(MATCH_NOMATCH); }
+ if (eptr == pp) { MRRETURN(MATCH_NOMATCH); }
#ifdef SUPPORT_UCP
eptr--;
BACKCHAR(eptr);
@@ -2776,9 +2837,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
}
if (min == max) continue;
if (minimize)
@@ -2787,13 +2848,13 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM24);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc != md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (fc != md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
}
@@ -2819,7 +2880,7 @@ for (;;)
eptr--;
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
}
@@ -2833,9 +2894,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
+ if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
}
if (min == max) continue;
@@ -2846,13 +2907,13 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM26);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc != *eptr++) RRETURN(MATCH_NOMATCH);
+ if (fc != *eptr++) MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
}
@@ -2877,7 +2938,7 @@ for (;;)
eptr--;
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -2889,7 +2950,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
ecode++;
GETCHARINCTEST(c, eptr);
@@ -2899,11 +2960,11 @@ for (;;)
if (c < 256)
#endif
c = md->lcc[c];
- if (md->lcc[*ecode++] == c) RRETURN(MATCH_NOMATCH);
+ if (md->lcc[*ecode++] == c) MRRETURN(MATCH_NOMATCH);
}
else
{
- if (*ecode++ == c) RRETURN(MATCH_NOMATCH);
+ if (*ecode++ == c) MRRETURN(MATCH_NOMATCH);
}
break;
@@ -2997,11 +3058,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(d, eptr);
if (d < 256) d = md->lcc[d];
- if (fc == d) RRETURN(MATCH_NOMATCH);
+ if (fc == d) MRRETURN(MATCH_NOMATCH);
}
}
else
@@ -3014,9 +3075,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc == md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
}
}
@@ -3033,15 +3094,15 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM28);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(d, eptr);
if (d < 256) d = md->lcc[d];
- if (fc == d) RRETURN(MATCH_NOMATCH);
+ if (fc == d) MRRETURN(MATCH_NOMATCH);
}
}
else
@@ -3052,13 +3113,13 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM29);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc == md->lcc[*eptr++]) RRETURN(MATCH_NOMATCH);
+ if (fc == md->lcc[*eptr++]) MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -3120,7 +3181,7 @@ for (;;)
}
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
}
@@ -3139,10 +3200,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(d, eptr);
- if (fc == d) RRETURN(MATCH_NOMATCH);
+ if (fc == d) MRRETURN(MATCH_NOMATCH);
}
}
else
@@ -3154,9 +3215,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
+ if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
}
}
@@ -3173,14 +3234,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM32);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(d, eptr);
- if (fc == d) RRETURN(MATCH_NOMATCH);
+ if (fc == d) MRRETURN(MATCH_NOMATCH);
}
}
else
@@ -3191,13 +3252,13 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM33);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (fc == *eptr++) RRETURN(MATCH_NOMATCH);
+ if (fc == *eptr++) MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -3258,7 +3319,7 @@ for (;;)
}
}
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
/* Control never gets here */
@@ -3352,13 +3413,13 @@ for (;;)
switch(prop_type)
{
case PT_ANY:
- if (prop_fail_result) RRETURN(MATCH_NOMATCH);
+ if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
for (i = 1; i <= min; i++)
{
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
}
@@ -3370,14 +3431,14 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_chartype = UCD_CHARTYPE(c);
if ((prop_chartype == ucp_Lu ||
prop_chartype == ucp_Ll ||
prop_chartype == ucp_Lt) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3387,12 +3448,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_category = UCD_CATEGORY(c);
if ((prop_category == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3402,12 +3463,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_chartype = UCD_CHARTYPE(c);
if ((prop_chartype == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3417,12 +3478,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_script = UCD_SCRIPT(c);
if ((prop_script == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3441,11 +3502,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_category = UCD_CATEGORY(c);
- if (prop_category == ucp_M) RRETURN(MATCH_NOMATCH);
+ if (prop_category == ucp_M) MRRETURN(MATCH_NOMATCH);
while (eptr < md->end_subject)
{
int len = 1;
@@ -3472,9 +3533,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+ if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
eptr++;
while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
}
@@ -3486,7 +3547,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr++;
while (eptr < md->end_subject && (*eptr & 0xc0) == 0x80) eptr++;
@@ -3494,7 +3555,7 @@ for (;;)
break;
case OP_ANYBYTE:
- if (eptr > md->end_subject - min) RRETURN(MATCH_NOMATCH);
+ if (eptr > md->end_subject - min) MRRETURN(MATCH_NOMATCH);
eptr += min;
break;
@@ -3504,12 +3565,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x000d:
if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
break;
@@ -3522,7 +3583,7 @@ for (;;)
case 0x0085:
case 0x2028:
case 0x2029:
- if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+ if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
break;
}
}
@@ -3534,7 +3595,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
switch(c)
@@ -3559,7 +3620,7 @@ for (;;)
case 0x202f: /* NARROW NO-BREAK SPACE */
case 0x205f: /* MEDIUM MATHEMATICAL SPACE */
case 0x3000: /* IDEOGRAPHIC SPACE */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
break;
@@ -3570,12 +3631,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
@@ -3606,7 +3667,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
switch(c)
@@ -3619,7 +3680,7 @@ for (;;)
case 0x85: /* NEL */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
break;
@@ -3630,12 +3691,12 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x0a: /* LF */
case 0x0b: /* VT */
case 0x0c: /* FF */
@@ -3654,11 +3715,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
if (c < 128 && (md->ctypes[c] & ctype_digit) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3668,10 +3729,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_digit) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
/* No need to skip more bytes - we know it's a 1-byte character */
}
break;
@@ -3682,10 +3743,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (*eptr < 128 && (md->ctypes[*eptr] & ctype_space) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
}
break;
@@ -3696,10 +3757,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_space) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
/* No need to skip more bytes - we know it's a 1-byte character */
}
break;
@@ -3710,10 +3771,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (*eptr < 128 && (md->ctypes[*eptr] & ctype_word) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
while (++eptr < md->end_subject && (*eptr & 0xc0) == 0x80);
}
break;
@@ -3724,10 +3785,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (*eptr >= 128 || (md->ctypes[*eptr++] & ctype_word) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
/* No need to skip more bytes - we know it's a 1-byte character */
}
break;
@@ -3750,9 +3811,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if (IS_NEWLINE(eptr)) RRETURN(MATCH_NOMATCH);
+ if (IS_NEWLINE(eptr)) MRRETURN(MATCH_NOMATCH);
eptr++;
}
break;
@@ -3761,7 +3822,7 @@ for (;;)
if (eptr > md->end_subject - min)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr += min;
break;
@@ -3770,7 +3831,7 @@ for (;;)
if (eptr > md->end_subject - min)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
eptr += min;
break;
@@ -3781,11 +3842,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
switch(*eptr++)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x000d:
if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
break;
@@ -3795,7 +3856,7 @@ for (;;)
case 0x000b:
case 0x000c:
case 0x0085:
- if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+ if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
break;
}
}
@@ -3807,7 +3868,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
switch(*eptr++)
{
@@ -3815,7 +3876,7 @@ for (;;)
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
break;
@@ -3826,11 +3887,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
switch(*eptr++)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
@@ -3845,7 +3906,7 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
switch(*eptr++)
{
@@ -3855,7 +3916,7 @@ for (;;)
case 0x0c: /* FF */
case 0x0d: /* CR */
case 0x85: /* NEL */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
}
break;
@@ -3866,11 +3927,11 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
switch(*eptr++)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x0a: /* LF */
case 0x0b: /* VT */
case 0x0c: /* FF */
@@ -3887,9 +3948,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if ((md->ctypes[*eptr++] & ctype_digit) != 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[*eptr++] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3899,9 +3960,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if ((md->ctypes[*eptr++] & ctype_digit) == 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[*eptr++] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3911,9 +3972,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if ((md->ctypes[*eptr++] & ctype_space) != 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[*eptr++] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3923,9 +3984,9 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
- if ((md->ctypes[*eptr++] & ctype_space) == 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[*eptr++] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3935,10 +3996,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if ((md->ctypes[*eptr++] & ctype_word) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3948,10 +4009,10 @@ for (;;)
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if ((md->ctypes[*eptr++] & ctype_word) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
@@ -3980,14 +4041,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM36);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
- if (prop_fail_result) RRETURN(MATCH_NOMATCH);
+ if (prop_fail_result) MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -3996,18 +4057,18 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM37);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
prop_chartype = UCD_CHARTYPE(c);
if ((prop_chartype == ucp_Lu ||
prop_chartype == ucp_Ll ||
prop_chartype == ucp_Lt) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -4016,16 +4077,16 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM38);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
prop_category = UCD_CATEGORY(c);
if ((prop_category == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -4034,16 +4095,16 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM39);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
prop_chartype = UCD_CHARTYPE(c);
if ((prop_chartype == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -4052,16 +4113,16 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM40);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINC(c, eptr);
prop_script = UCD_SCRIPT(c);
if ((prop_script == prop_value) == prop_fail_result)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -4079,15 +4140,15 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM41);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
GETCHARINCTEST(c, eptr);
prop_category = UCD_CATEGORY(c);
- if (prop_category == ucp_M) RRETURN(MATCH_NOMATCH);
+ if (prop_category == ucp_M) MRRETURN(MATCH_NOMATCH);
while (eptr < md->end_subject)
{
int len = 1;
@@ -4111,14 +4172,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM42);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (ctype == OP_ANY && IS_NEWLINE(eptr))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
GETCHARINC(c, eptr);
switch(ctype)
{
@@ -4130,7 +4191,7 @@ for (;;)
case OP_ANYNL:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x000d:
if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
break;
@@ -4142,7 +4203,7 @@ for (;;)
case 0x0085:
case 0x2028:
case 0x2029:
- if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+ if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
break;
}
break;
@@ -4170,14 +4231,14 @@ for (;;)
case 0x202f: /* NARROW NO-BREAK SPACE */
case 0x205f: /* MEDIUM MATHEMATICAL SPACE */
case 0x3000: /* IDEOGRAPHIC SPACE */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
case OP_HSPACE:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
@@ -4212,14 +4273,14 @@ for (;;)
case 0x85: /* NEL */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
case OP_VSPACE:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x0a: /* LF */
case 0x0b: /* VT */
case 0x0c: /* FF */
@@ -4233,32 +4294,32 @@ for (;;)
case OP_NOT_DIGIT:
if (c < 256 && (md->ctypes[c] & ctype_digit) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case OP_DIGIT:
if (c >= 256 || (md->ctypes[c] & ctype_digit) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case OP_NOT_WHITESPACE:
if (c < 256 && (md->ctypes[c] & ctype_space) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case OP_WHITESPACE:
if (c >= 256 || (md->ctypes[c] & ctype_space) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case OP_NOT_WORDCHAR:
if (c < 256 && (md->ctypes[c] & ctype_word) != 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
case OP_WORDCHAR:
if (c >= 256 || (md->ctypes[c] & ctype_word) == 0)
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
break;
default:
@@ -4274,14 +4335,14 @@ for (;;)
{
RMATCH(eptr, ecode, offset_top, md, ims, eptrb, 0, RM43);
if (rrc != MATCH_NOMATCH) RRETURN(rrc);
- if (fi >= max) RRETURN(MATCH_NOMATCH);
+ if (fi >= max) MRRETURN(MATCH_NOMATCH);
if (eptr >= md->end_subject)
{
SCHECK_PARTIAL();
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
if (ctype == OP_ANY && IS_NEWLINE(eptr))
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
c = *eptr++;
switch(ctype)
{
@@ -4293,7 +4354,7 @@ for (;;)
case OP_ANYNL:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x000d:
if (eptr < md->end_subject && *eptr == 0x0a) eptr++;
break;
@@ -4304,7 +4365,7 @@ for (;;)
case 0x000b:
case 0x000c:
case 0x0085:
- if (md->bsr_anycrlf) RRETURN(MATCH_NOMATCH);
+ if (md->bsr_anycrlf) MRRETURN(MATCH_NOMATCH);
break;
}
break;
@@ -4316,14 +4377,14 @@ for (;;)
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
case OP_HSPACE:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x09: /* HT */
case 0x20: /* SPACE */
case 0xa0: /* NBSP */
@@ -4340,14 +4401,14 @@ for (;;)
case 0x0c: /* FF */
case 0x0d: /* CR */
case 0x85: /* NEL */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
break;
case OP_VSPACE:
switch(c)
{
- default: RRETURN(MATCH_NOMATCH);
+ default: MRRETURN(MATCH_NOMATCH);
case 0x0a: /* LF */
case 0x0b: /* VT */
case 0x0c: /* FF */
@@ -4358,27 +4419,27 @@ for (;;)
break;
case OP_NOT_DIGIT:
- if ((md->ctypes[c] & ctype_digit) != 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_digit) != 0) MRRETURN(MATCH_NOMATCH);
break;
case OP_DIGIT:
- if ((md->ctypes[c] & ctype_digit) == 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_digit) == 0) MRRETURN(MATCH_NOMATCH);
break;
case OP_NOT_WHITESPACE:
- if ((md->ctypes[c] & ctype_space) != 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_space) != 0) MRRETURN(MATCH_NOMATCH);
break;
case OP_WHITESPACE:
- if ((md->ctypes[c] & ctype_space) == 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_space) == 0) MRRETURN(MATCH_NOMATCH);
break;
case OP_NOT_WORDCHAR:
- if ((md->ctypes[c] & ctype_word) != 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_word) != 0) MRRETURN(MATCH_NOMATCH);
break;
case OP_WORDCHAR:
- if ((md->ctypes[c] & ctype_word) == 0) RRETURN(MATCH_NOMATCH);
+ if ((md->ctypes[c] & ctype_word) == 0) MRRETURN(MATCH_NOMATCH);
break;
default:
@@ -5038,7 +5099,7 @@ for (;;)
/* Get here if we can't make it match with any permitted repetitions */
- RRETURN(MATCH_NOMATCH);
+ MRRETURN(MATCH_NOMATCH);
}
/* Control never gets here */
@@ -5289,6 +5350,7 @@ md->notempty_atstart = (options & PCRE_NOTEMPTY_ATSTART) != 0;
md->partial = ((options & PCRE_PARTIAL_HARD) != 0)? 2 :
((options & PCRE_PARTIAL_SOFT) != 0)? 1 : 0;
md->hitend = FALSE;
+md->mark = NULL; /* In case never set */
md->recursive = NULL; /* No recursion at top level */
@@ -5659,7 +5721,7 @@ for(;;)
/* OK, we can now run the match. If "hitend" is set afterwards, remember the
first starting point for which a partial match was found. */
-
+
md->start_match_ptr = start_match;
md->start_used_ptr = start_match;
md->match_call_count = 0;
@@ -5669,11 +5731,13 @@ for(;;)
switch(rc)
{
- /* NOMATCH and PRUNE advance by one character. THEN at this level acts
- exactly like PRUNE. */
+ /* NOMATCH and PRUNE advance by one character. If MATCH_SKIP_ARG reaches
+ this level it means that a MARK that matched the SKIP's arg was not found.
+ We treat this as NOMATCH. THEN at this level acts exactly like PRUNE. */
case MATCH_NOMATCH:
case MATCH_PRUNE:
+ case MATCH_SKIP_ARG:
case MATCH_THEN:
new_start_match = start_match + 1;
#ifdef SUPPORT_UTF8
@@ -5734,7 +5798,8 @@ for(;;)
md->nllen == 2))
start_match++;
- } /* End of for(;;) "bumpalong" loop */
+ md->mark = NULL; /* Reset for start of next match attempt */
+ } /* End of for(;;) "bumpalong" loop */
/* ==========================================================================*/
@@ -5789,7 +5854,7 @@ if (rc == MATCH_MATCH)
}
DPRINTF((">>>> returning %d\n", rc));
- return rc;
+ goto RETURN_MARK;
}
/* Control gets here if there has been an error, or if the overall match
@@ -5800,27 +5865,44 @@ if (using_temporary_offsets)
DPRINTF(("Freeing temporary memory\n"));
(pcre_free)(md->offset_vector);
}
+
+/* For anything other than nomatch or partial match, just return the code. */
if (rc != MATCH_NOMATCH && rc != PCRE_ERROR_PARTIAL)
{
DPRINTF((">>>> error: returning %d\n", rc));
return rc;
}
-else if (start_partial != NULL)
+
+/* Handle partial matches - disable any mark data */
+
+if (start_partial != NULL)
{
DPRINTF((">>>> returning PCRE_ERROR_PARTIAL\n"));
+ md->mark = NULL;
if (offsetcount > 1)
{
offsets[0] = start_partial - (USPTR)subject;
offsets[1] = end_subject - (USPTR)subject;
}
- return PCRE_ERROR_PARTIAL;
+ rc = PCRE_ERROR_PARTIAL;
}
+
+/* This is the classic nomatch case */
+
else
{
DPRINTF((">>>> returning PCRE_ERROR_NOMATCH\n"));
- return PCRE_ERROR_NOMATCH;
+ rc = PCRE_ERROR_NOMATCH;
}
+
+/* Return the MARK data if it has been requested. */
+
+RETURN_MARK:
+
+if (extra_data != NULL && (extra_data->flags & PCRE_EXTRA_MARK) != 0)
+ *(extra_data->mark) = (unsigned char *)(md->mark);
+return rc;
}
/* End of pcre_exec.c */
diff --git a/pcre_internal.h b/pcre_internal.h
index 4554657..b8ae8b1 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -875,6 +875,7 @@ so that PCRE works on both ASCII and EBCDIC platforms, in non-UTF-mode only. */
#define STRING_COMMIT0 "COMMIT\0"
#define STRING_F0 "F\0"
#define STRING_FAIL0 "FAIL\0"
+#define STRING_MARK0 "MARK\0"
#define STRING_PRUNE0 "PRUNE\0"
#define STRING_SKIP0 "SKIP\0"
#define STRING_THEN "THEN"
@@ -1127,6 +1128,7 @@ only. */
#define STRING_COMMIT0 STR_C STR_O STR_M STR_M STR_I STR_T "\0"
#define STRING_F0 STR_F "\0"
#define STRING_FAIL0 STR_F STR_A STR_I STR_L "\0"
+#define STRING_MARK0 STR_M STR_A STR_R STR_K "\0"
#define STRING_PRUNE0 STR_P STR_R STR_U STR_N STR_E "\0"
#define STRING_SKIP0 STR_S STR_K STR_I STR_P "\0"
#define STRING_THEN STR_T STR_H STR_E STR_N
@@ -1378,20 +1380,24 @@ enum {
/* These are backtracking control verbs */
- OP_PRUNE, /* 107 */
- OP_SKIP, /* 108 */
- OP_THEN, /* 109 */
- OP_COMMIT, /* 110 */
+ OP_MARK, /* 107 always has an argument */
+ OP_PRUNE, /* 108 */
+ OP_PRUNE_ARG, /* 109 same, but with argument */
+ OP_SKIP, /* 110 */
+ OP_SKIP_ARG, /* 111 same, but with argument */
+ OP_THEN, /* 112 */
+ OP_THEN_ARG, /* 113 same, but with argument */
+ OP_COMMIT, /* 114 */
/* These are forced failure and success verbs */
- OP_FAIL, /* 111 */
- OP_ACCEPT, /* 112 */
- OP_CLOSE, /* 113 Used before OP_ACCEPT to close open captures */
+ OP_FAIL, /* 115 */
+ OP_ACCEPT, /* 116 */
+ OP_CLOSE, /* 117 Used before OP_ACCEPT to close open captures */
/* This is used to skip a subpattern with a {0} quantifier */
- OP_SKIPZERO, /* 114 */
+ OP_SKIPZERO, /* 118 */
/* This is not an opcode, but is used to check that tables indexed by opcode
are the correct length, in order to catch updating errors - there have been
@@ -1402,7 +1408,7 @@ enum {
/* *** NOTE NOTE NOTE *** Whenever the list above is updated, the two macro
definitions that follow must also be updated to match. There are also tables
-called "coptable" cna "poptable" in pcre_dfa_exec.c that must be updated. */
+called "coptable" and "poptable" in pcre_dfa_exec.c that must be updated. */
/* This macro defines textual names for all the opcodes. These are used only
@@ -1427,7 +1433,8 @@ for debugging. The macro is referenced only in pcre_printint.c. */
"Once", "Bra", "CBra", "Cond", "SBra", "SCBra", "SCond", \
"Cond ref", "Cond nref", "Cond rec", "Cond nrec", "Cond def", \
"Brazero", "Braminzero", \
- "*PRUNE", "*SKIP", "*THEN", "*COMMIT", "*FAIL", "*ACCEPT", \
+ "*MARK", "*PRUNE", "*PRUNE", "*SKIP", "*SKIP", \
+ "*THEN", "*THEN", "*COMMIT", "*FAIL", "*ACCEPT", \
"Close", "Skip zero"
@@ -1493,8 +1500,9 @@ in UTF-8 mode. The code that uses this table must know about such things. */
3, 3, /* RREF, NRREF */ \
1, /* DEF */ \
1, 1, /* BRAZERO, BRAMINZERO */ \
- 1, 1, 1, 1, /* PRUNE, SKIP, THEN, COMMIT, */ \
- 1, 1, 3, 1 /* FAIL, ACCEPT, CLOSE, SKIPZERO */
+ 3, 1, 3, /* MARK, PRUNE, PRUNE_ARG, */ \
+ 1, 3, 1, 3, /* SKIP, SKIP_ARG, THEN, THEN_ARG, */ \
+ 1, 1, 1, 3, 1 /* COMMIT, FAIL, ACCEPT, CLOSE, SKIPZERO */
/* A magic value for OP_RREF and OP_NRREF to indicate the "any recursion"
@@ -1512,7 +1520,7 @@ enum { ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9,
ERR30, ERR31, ERR32, ERR33, ERR34, ERR35, ERR36, ERR37, ERR38, ERR39,
ERR40, ERR41, ERR42, ERR43, ERR44, ERR45, ERR46, ERR47, ERR48, ERR49,
ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
- ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERRCOUNT };
+ ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERRCOUNT };
/* The real format of the start of the pcre block; the index of names and the
code vector run on as long as necessary after the end. We store an explicit
@@ -1674,6 +1682,7 @@ typedef struct match_data {
int eptrn; /* Next free eptrblock */
recursion_info *recursive; /* Linked list of recursion data */
void *callout_data; /* To pass back to callouts */
+ const uschar *mark; /* Mark pointer to pass back */
} match_data;
/* A similar structure is used for the same purpose by the DFA matching
diff --git a/pcre_printint.src b/pcre_printint.src
index 86b02b5..922be19 100644
--- a/pcre_printint.src
+++ b/pcre_printint.src
@@ -533,6 +533,14 @@ for(;;)
}
}
break;
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ fprintf(f, " %s %s", OP_names[*code], code + 2);
+ extra += code[1];
+ break;
/* Anything else is just an item with no data*/
diff --git a/pcre_study.c b/pcre_study.c
index bd00a53..7db319c 100644
--- a/pcre_study.c
+++ b/pcre_study.c
@@ -412,6 +412,15 @@ for (;;)
if (utf8 && cc[-1] >= 0xc0) cc += _pcre_utf8_table4[cc[-1] & 0x3f];
#endif
break;
+
+ /* Skip these, but we need to add in the name length. */
+
+ case OP_MARK:
+ case OP_PRUNE_ARG:
+ case OP_SKIP_ARG:
+ case OP_THEN_ARG:
+ cc += _pcre_OP_lengths[op] + cc[1];
+ break;
/* For the record, these are the opcodes that are matched by "default":
OP_ACCEPT, OP_CLOSE, OP_COMMIT, OP_FAIL, OP_PRUNE, OP_SET_SOM, OP_SKIP,
diff --git a/pcreposix.c b/pcreposix.c
index 76f3f87..5b022cc 100644
--- a/pcreposix.c
+++ b/pcreposix.c
@@ -135,7 +135,7 @@ static const int eint[] = {
REG_INVARG, /* inconsistent NEWLINE options */
REG_BADPAT, /* \g is not followed followed by an (optionally braced) non-zero number */
REG_BADPAT, /* a numbered reference must not be zero */
- REG_BADPAT, /* (*VERB) with an argument is not supported */
+ REG_BADPAT, /* an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT) */
/* 60 */
REG_BADPAT, /* (*VERB) not recognized */
REG_BADPAT, /* number is too big */
@@ -143,7 +143,8 @@ static const int eint[] = {
REG_BADPAT, /* digit expected after (?+ */
REG_BADPAT, /* ] is an invalid data character in JavaScript compatibility mode */
/* 65 */
- REG_BADPAT /* different names for subpatterns of the same number are not allowed */
+ REG_BADPAT, /* different names for subpatterns of the same number are not allowed */
+ REG_BADPAT, /* (*MARK) must have an argument */
};
/* Table of texts corresponding to POSIX error codes */
diff --git a/pcretest.c b/pcretest.c
index 5ed3289..0d9e744 100644
--- a/pcretest.c
+++ b/pcretest.c
@@ -1040,11 +1040,13 @@ while (!done)
#endif
const char *error;
+ unsigned char *markptr;
unsigned char *p, *pp, *ppp;
unsigned char *to_file = NULL;
const unsigned char *tables = NULL;
unsigned long int true_size, true_study_size = 0;
size_t size, regex_gotten_store;
+ int do_mark = 0;
int do_study = 0;
int do_debug = debug;
int do_G = 0;
@@ -1226,6 +1228,7 @@ while (!done)
case 'G': do_G = 1; break;
case 'I': do_showinfo = 1; break;
case 'J': options |= PCRE_DUPNAMES; break;
+ case 'K': do_mark = 1; break;
case 'M': log_store = 1; break;
case 'N': options |= PCRE_NO_AUTO_CAPTURE; break;
@@ -1419,6 +1422,19 @@ while (!done)
else if (extra != NULL)
true_study_size = ((pcre_study_data *)(extra->study_data))->size;
}
+
+ /* If /K was present, we set up for handling MARK data. */
+
+ if (do_mark)
+ {
+ if (extra == NULL)
+ {
+ extra = (pcre_extra *)malloc(sizeof(pcre_extra));
+ extra->flags = 0;
+ }
+ extra->mark = &markptr;
+ extra->flags |= PCRE_EXTRA_MARK;
+ }
/* If the 'F' option was present, we flip the bytes of all the integer
fields in the regex data block and the study block. This is to make it
@@ -2145,6 +2161,8 @@ while (!done)
for (;; gmatched++) /* Loop for /g or /G */
{
+ markptr = NULL;
+
if (timeitm > 0)
{
register int i;
@@ -2289,6 +2307,8 @@ while (!done)
}
}
}
+
+ if (markptr != NULL) fprintf(outfile, "MK: %s\n", markptr);
for (i = 0; i < 32; i++)
{
@@ -2373,7 +2393,8 @@ while (!done)
else if (count == PCRE_ERROR_PARTIAL)
{
- fprintf(outfile, "Partial match");
+ if (markptr == NULL) fprintf(outfile, "Partial match");
+ else fprintf(outfile, "Partial match, mark=%s", markptr);
if (use_size_offsets > 1)
{
fprintf(outfile, ": ");
@@ -2440,7 +2461,11 @@ while (!done)
{
if (count == PCRE_ERROR_NOMATCH)
{
- if (gmatched == 0) fprintf(outfile, "No match\n");
+ if (gmatched == 0)
+ {
+ if (markptr == NULL) fprintf(outfile, "No match\n");
+ else fprintf(outfile, "No match, mark = %s\n", markptr);
+ }
}
else fprintf(outfile, "Error %d\n", count);
break; /* Out of the /g loop */
diff --git a/perltest.pl b/perltest.pl
index 1df863e..b73646b 100755
--- a/perltest.pl
+++ b/perltest.pl
@@ -85,15 +85,19 @@ for (;;)
# The private /+ modifier means "print $' afterwards".
- $showrest = ($pattern =~ s/\+(?=[a-z]*$)//);
+ $showrest = ($pattern =~ s/\+(?=[a-zA-Z]*$)//);
# Remove /8 from a UTF-8 pattern.
- $utf8 = $pattern =~ s/8(?=[a-z]*$)//;
+ $utf8 = $pattern =~ s/8(?=[a-zA-Z]*$)//;
# Remove /J from a pattern with duplicate names.
- $pattern =~ s/J(?=[a-z]*$)//;
+ $pattern =~ s/J(?=[a-zA-Z]*$)//;
+
+ # Remove /K from a pattern (asks pcretest to check MARK data)
+
+ $pattern =~ s/K(?=[a-zA-Z]*$)//;
# Check that the pattern is valid
@@ -127,8 +131,9 @@ for (;;)
chomp;
printf $outfile "$_\n" if $infile ne "STDIN";
- s/\s+$//;
- s/^\s+//;
+ s/\s+$//; # Remove trailing space
+ s/^\s+//; # Remove leading space
+ s/\\Y//g; # Remove \Y (pcretest flag to set PCRE_NO_START_OPTIMIZE)
last if ($_ eq "");
$x = eval "\"$_\""; # To get escapes processed
diff --git a/testdata/testinput11 b/testdata/testinput11
index 3543bf7..43d08f3 100644
--- a/testdata/testinput11
+++ b/testdata/testinput11
@@ -389,4 +389,88 @@
00
0000
+/--- This one does fail, as expected, in Perl. It needs the complex item at the
+ end of the pattern. A single letter instead of (B|D) makes it not fail,
+ which I think is a Perl bug. --- /
+
+/A(*COMMIT)(B|D)/
+ ACABX
+
+/--- Check the use of names for failure ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+ ** Failers
+ AC
+ CB
+
+/(*MARK:A)(*SKIP:B)(C|X)/K
+ C
+ D
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+ ** Failers
+ CB
+
+/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
+ CB
+
+/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
+ CB
+
+/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
+that we have to have something complicated such as (B|Z) at the end because,
+for Perl, a simple character somehow causes an unwanted optimization to mess
+with the handling of backtracking verbs. ---/
+
+/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+
+/--- Test skipping over a non-matching mark. ---/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+
+/--- Check shorthand for MARK ---/
+
+/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+
+/--- This should succeed, as a non-existent skip name disables the skip ---/
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
+ AAAC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
+ AAAC
+
+/--- We use something more complicated than individual letters here, because
+that causes different behaviour in Perl. Perhaps it disables some optimization;
+anyway, the result now matches PCRE in that no tag is passed back for the
+failures. ---/
+
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+ AABC
+ XXYZ
+ ** Failers
+ XAQQ
+ XAQQXZZ
+ AXQQQ
+ AXXQQQ
+
+/--- COMMIT at the start of a pattern should act like an anchor. Again,
+however, we need the complication for Perl. ---/
+
+/(*COMMIT)(A|P)(B|P)(C|P)/
+ ABCDEFG
+ ** Failers
+ DEFGABC
+
+/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
+
+/(\w+)(?>b(*COMMIT))\w{2}/
+ abbb
+
+/(\w+)b(*COMMIT)\w{2}/
+ abbb
+
/-- End of testinput11 --/
diff --git a/testdata/testinput2 b/testdata/testinput2
index 94a18c9..306f8d4 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -2279,8 +2279,6 @@ a random value. /Ix
/a+b?(*THEN)c+(*FAIL)/C
aaabccc
-/a(*PRUNE:XXX)b/
-
/a(*MARK)b/
/(?i:A{1,}\6666666666)/
@@ -3232,4 +3230,213 @@ a random value. /Ix
/(?P<L1>(?P<L2>0|)|(?P>L2)(?P>L1))/
+/abc(*MARK:)pqr/
+
+/abc(*:)pqr/
+
+/abc(*FAIL:123)xyz/
+
+/--- This should, and does, fail. In Perl, it does not, which I think is a
+ bug because replacing the B in the pattern by (B|D) does make it fail. ---/
+
+/A(*COMMIT)B/+K
+ ACABX
+
+/--- These should be different, but in Perl 5.11 are not, which I think
+ is a bug in Perl. ---/
+
+/A(*THEN)B|A(*THEN)C/K
+ AC
+
+/A(*PRUNE)B|A(*PRUNE)C/K
+ AC
+
+/--- A whole lot of tests of verbs with arguments are here rather than in test
+ 11 because Perl doesn't seem to follow its specification entirely
+ correctly. ---/
+
+/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
+ not clear how Perl defines "involved in the failure of the match". ---/
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+ AB
+ CD
+ ** Failers
+ AC
+ CB
+
+/--- Check the use of names for success and failure. PCRE doesn't show these
+names for success, though Perl does, contrary to its spec. ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+ AB
+ CD
+ ** Failers
+ AC
+ CB
+
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+ AB
+ CD
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+
+/A(*PRUNE:A)B/K
+ ACAB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/K
+ C
+ D
+
+/(*MARK:A)(*THEN:B)(C|X)/K
+ C
+ D
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+ AAAC
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+ AAAC
+
+/--- This should fail; the SKIP advances by one, but when we get to AC, the
+ PRUNE kills it. ---/
+
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+ AAAC
+
+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+ AAAC
+
+/--- This fails in PCRE, and I think that is in accordance with Perl's
+ documentation, though in Perl it succeeds. ---/
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+ AAAC
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+ AABC
+ XXYZ
+
+/^A(*:A)B|^X(*:A)Y/K
+ ** Failers
+ XAQQ
+
+/--- A check on what happens after hitting a mark and them bumping along to
+something that does not even start. Perl reports tags after the failures here,
+though it does not when the individual letters are made into something
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+ AABC
+ XXYZ
+ ** Failers
+ XAQQ
+ XAQQXZZ
+ AXQQQ
+ AXXQQQ
+
+/--- COMMIT at the start of a pattern should be the same as an anchor. Perl
+optimizations defeat this. So does the PCRE optimization unless we disable it
+with \Y. ---/
+
+/(*COMMIT)ABC/
+ ABCDEFG
+ ** Failers
+ DEFGABC\Y
+
+/--- Repeat some tests with added studying. ---/
+
+/A(*COMMIT)B/+KS
+ ACABX
+
+/A(*THEN)B|A(*THEN)C/KS
+ AC
+
+/A(*PRUNE)B|A(*PRUNE)C/KS
+ AC
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/KS
+ AB
+ CD
+ ** Failers
+ AC
+ CB
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
+ AB
+ CD
+ ** Failers
+ AC
+ CB
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
+ AB
+ CD
+
+/A(*PRUNE:A)B/KS
+ ACAB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+ C
+ D
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+ C
+ D
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
+ AAAC
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
+ AAAC
+
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
+ AAAC
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
+ AAAC
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
+ AAAC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
+ AAAC
+
+/A(*:A)B|XX(*:B)Y/KS
+ AABC
+ XXYZ
+ ** Failers
+ XAQQ
+ XAQQXZZ
+ AXQQQ
+ AXXQQQ
+
+/(*COMMIT)ABC/
+ ABCDEFG
+ ** Failers
+ DEFGABC\Y
+
+/^(ab (c+(*THEN)cd) | xyz)/x
+ abcccd
+
+/^(ab (c+(*PRUNE)cd) | xyz)/x
+ abcccd
+
+/^(ab (c+(*FAIL)cd) | xyz)/x
+ abcccd
+
/-- End of testinput2 --/
diff --git a/testdata/testoutput11 b/testdata/testoutput11
index 313b7cb..5821d44 100644
--- a/testdata/testoutput11
+++ b/testdata/testoutput11
@@ -803,4 +803,131 @@ No match
1: 0
2: 0
+/--- This one does fail, as expected, in Perl. It needs the complex item at the
+ end of the pattern. A single letter instead of (B|D) makes it not fail,
+ which I think is a Perl bug. --- /
+
+/A(*COMMIT)(B|D)/
+ ACABX
+No match
+
+/--- Check the use of names for failure ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+ ** Failers
+No match
+ AC
+No match, mark = A
+ CB
+No match, mark = B
+
+/(*MARK:A)(*SKIP:B)(C|X)/K
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match, mark = A
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+ ** Failers
+No match
+ CB
+No match, mark = B
+
+/^(?:A(*THEN:A)B|C(*THEN:B)D)/K
+ CB
+No match, mark = B
+
+/^(?>A(*THEN:A)B|C(*THEN:B)D)/K
+ CB
+No match, mark = B
+
+/--- This should succeed, as the skip causes bump to offset 1 (the mark). Note
+that we have to have something complicated such as (B|Z) at the end because,
+for Perl, a simple character somehow causes an unwanted optimization to mess
+with the handling of backtracking verbs. ---/
+
+/A(*MARK:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+ 0: AC
+
+/--- Test skipping over a non-matching mark. ---/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+ 0: AC
+
+/--- Check shorthand for MARK ---/
+
+/A(*:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+ 0: AC
+
+/--- This should succeed, as a non-existent skip name disables the skip ---/
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC/xK
+ AAAC
+ 0: AC
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AC(*:B)/xK
+ AAAC
+ 0: AC
+MK: B
+
+/--- We use something more complicated than individual letters here, because
+that causes different behaviour in Perl. Perhaps it disables some optimization;
+anyway, the result now matches PCRE in that no tag is passed back for the
+failures. ---/
+
+/(A|P)(*:A)(B|P) | (X|P)(X|P)(*:B)(Y|P)/xK
+ AABC
+ 0: AB
+ 1: A
+ 2: B
+MK: A
+ XXYZ
+ 0: XXY
+ 1: <unset>
+ 2: <unset>
+ 3: X
+ 4: X
+ 5: Y
+MK: B
+ ** Failers
+No match
+ XAQQ
+No match
+ XAQQXZZ
+No match
+ AXQQQ
+No match
+ AXXQQQ
+No match
+
+/--- COMMIT at the start of a pattern should act like an anchor. Again,
+however, we need the complication for Perl. ---/
+
+/(*COMMIT)(A|P)(B|P)(C|P)/
+ ABCDEFG
+ 0: ABC
+ 1: A
+ 2: B
+ 3: C
+ ** Failers
+No match
+ DEFGABC
+No match
+
+/--- COMMIT inside an atomic group can't stop backtracking over the group. ---/
+
+/(\w+)(?>b(*COMMIT))\w{2}/
+ abbb
+ 0: abbb
+ 1: a
+
+/(\w+)b(*COMMIT)\w{2}/
+ abbb
+No match
+
/-- End of testinput11 --/
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index c29bd5f..beaadeb 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -8667,11 +8667,8 @@ No match
+13 ^ ^ (*FAIL)
No match
-/a(*PRUNE:XXX)b/
-Failed: (*VERB) with an argument is not supported at offset 8
-
/a(*MARK)b/
-Failed: (*VERB) not recognized at offset 7
+Failed: (*MARK) must have an argument at offset 7
/(?i:A{1,}\6666666666)/
Failed: number is too big at offset 19
@@ -10668,4 +10665,321 @@ No match
/(?P<L1>(?P<L2>0|)|(?P>L2)(?P>L1))/
Failed: recursive call could loop indefinitely at offset 31
+/abc(*MARK:)pqr/
+Failed: (*MARK) must have an argument at offset 10
+
+/abc(*:)pqr/
+Failed: (*MARK) must have an argument at offset 6
+
+/abc(*FAIL:123)xyz/
+Failed: an argument is not allowed for (*ACCEPT), (*FAIL), or (*COMMIT) at offset 13
+
+/--- This should, and does, fail. In Perl, it does not, which I think is a
+ bug because replacing the B in the pattern by (B|D) does make it fail. ---/
+
+/A(*COMMIT)B/+K
+ ACABX
+No match
+
+/--- These should be different, but in Perl 5.11 are not, which I think
+ is a bug in Perl. ---/
+
+/A(*THEN)B|A(*THEN)C/K
+ AC
+ 0: AC
+
+/A(*PRUNE)B|A(*PRUNE)C/K
+ AC
+No match
+
+/--- A whole lot of tests of verbs with arguments are here rather than in test
+ 11 because Perl doesn't seem to follow its specification entirely
+ correctly. ---/
+
+/--- Perl 5.11 sets $REGERROR on the AC failure case here; PCRE does not. It is
+ not clear how Perl defines "involved in the failure of the match". ---/
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/K
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+ ** Failers
+No match
+ AC
+No match
+ CB
+No match, mark = B
+
+/--- Check the use of names for success and failure. PCRE doesn't show these
+names for success, though Perl does, contrary to its spec. ---/
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/K
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+ ** Failers
+No match
+ AC
+No match, mark = A
+ CB
+No match, mark = B
+
+/--- An empty name does not pass back an empty string. It is the same as if no
+name were given. ---/
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/K
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+
+/--- PRUNE goes to next bumpalong; COMMIT does not. ---/
+
+/A(*PRUNE:A)B/K
+ ACAB
+ 0: AB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/K
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match, mark = B
+
+/(*MARK:A)(*THEN:B)(C|X)/K
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match, mark = B
+
+/--- This should fail, as the skip causes a bump to offset 3 (the skip) ---/
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xK
+ AAAC
+No match
+
+/--- Same --/
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xK
+ AAAC
+No match
+
+/--- This should fail; the SKIP advances by one, but when we get to AC, the
+ PRUNE kills it. ---/
+
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xK
+ AAAC
+No match
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xK
+ AAAC
+No match
+
+/--- This should fail, as a null name is the same as no name ---/
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xK
+ AAAC
+No match
+
+/--- This fails in PCRE, and I think that is in accordance with Perl's
+ documentation, though in Perl it succeeds. ---/
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xK
+ AAAC
+No match
+
+/--- Mark names can be duplicated ---/
+
+/A(*:A)B|X(*:A)Y/K
+ AABC
+ 0: AB
+MK: A
+ XXYZ
+ 0: XY
+MK: A
+
+/^A(*:A)B|^X(*:A)Y/K
+ ** Failers
+No match
+ XAQQ
+No match, mark = A
+
+/--- A check on what happens after hitting a mark and then bumping along to
+something that does not even start. Perl reports tags after the failures here,
+though it does not when the individual letters are made into something
+more complicated. ---/
+
+/A(*:A)B|XX(*:B)Y/K
+ AABC
+ 0: AB
+MK: A
+ XXYZ
+ 0: XXY
+MK: B
+ ** Failers
+No match
+ XAQQ
+No match
+ XAQQXZZ
+No match
+ AXQQQ
+No match
+ AXXQQQ
+No match
+
+/--- COMMIT at the start of a pattern should be the same as an anchor. Perl
+optimizations defeat this. So does the PCRE optimization unless we disable it
+with \Y. ---/
+
+/(*COMMIT)ABC/
+ ABCDEFG
+ 0: ABC
+ ** Failers
+No match
+ DEFGABC\Y
+No match
+
+/--- Repeat some tests with added studying. ---/
+
+/A(*COMMIT)B/+KS
+ ACABX
+No match
+
+/A(*THEN)B|A(*THEN)C/KS
+ AC
+ 0: AC
+
+/A(*PRUNE)B|A(*PRUNE)C/KS
+ AC
+No match
+
+/^(A(*THEN:A)B|C(*THEN:B)D)/KS
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+ ** Failers
+No match
+ AC
+No match
+ CB
+No match, mark = B
+
+/^(A(*PRUNE:A)B|C(*PRUNE:B)D)/KS
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+ ** Failers
+No match
+ AC
+No match, mark = A
+ CB
+No match, mark = B
+
+/^(A(*PRUNE:)B|C(*PRUNE:B)D)/KS
+ AB
+ 0: AB
+ 1: AB
+ CD
+ 0: CD
+ 1: CD
+
+/A(*PRUNE:A)B/KS
+ ACAB
+ 0: AB
+
+/(*MARK:A)(*PRUNE:B)(C|X)/KS
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match
+
+/(*MARK:A)(*THEN:B)(C|X)/KS
+ C
+ 0: C
+ 1: C
+MK: A
+ D
+No match
+
+/A(*MARK:A)A+(*SKIP)(B|Z) | AC/xKS
+ AAAC
+No match
+
+/A(*MARK:A)A+(*MARK:B)(*SKIP:B)(B|Z) | AC/xKS
+ AAAC
+No match
+
+/A(*PRUNE:A)A+(*SKIP:A)(B|Z) | AC/xKS
+ AAAC
+No match
+
+/A(*:A)A+(*SKIP)(B|Z) | AC/xKS
+ AAAC
+No match
+
+/A(*MARK:A)A+(*SKIP:)(B|Z) | AC/xKS
+ AAAC
+No match
+
+/A(*MARK:A)A+(*SKIP:B)(B|Z) | AAC/xKS
+ AAAC
+No match
+
+/A(*:A)B|XX(*:B)Y/KS
+ AABC
+ 0: AB
+MK: A
+ XXYZ
+ 0: XXY
+MK: B
+ ** Failers
+No match
+ XAQQ
+No match
+ XAQQXZZ
+No match
+ AXQQQ
+No match
+ AXXQQQ
+No match
+
+/(*COMMIT)ABC/
+ ABCDEFG
+ 0: ABC
+ ** Failers
+No match
+ DEFGABC\Y
+No match
+
+/^(ab (c+(*THEN)cd) | xyz)/x
+ abcccd
+No match
+
+/^(ab (c+(*PRUNE)cd) | xyz)/x
+ abcccd
+No match
+
+/^(ab (c+(*FAIL)cd) | xyz)/x
+ abcccd
+No match
+
/-- End of testinput2 --/