author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-08-21 11:46:08 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-08-21 11:46:08 +0000
commit c6a88bf880d462c62e00d8d7c3eeeaad60ebab49
tree 7b948602c6a0645f32441bdf9dd602a73b4fc3e7
parent 3fa5c170ee26c5a63c0efa62dc42fc9fb57ce76b

Don't advance by 2 if explicit \r or \n in the pattern. Add PCRE_INFO_HASCRORLF.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@226 2f5784b3-3f2a-0410-8824-cb99058d5e15

-rw-r--r--  ChangeLog             12
-rw-r--r--  NEWS                   7
-rw-r--r--  doc/pcreapi.3         49
-rw-r--r--  pcre.h.in              1
-rw-r--r--  pcre_compile.c        34
-rw-r--r--  pcre_dfa_exec.c       17
-rw-r--r--  pcre_exec.c           17
-rw-r--r--  pcre_fullinfo.c        4
-rw-r--r--  pcre_internal.h        1
-rw-r--r--  pcretest.c             5
-rw-r--r--  testdata/testinput2   12
-rw-r--r--  testdata/testinput7   12
-rw-r--r--  testdata/testoutput2  62
-rw-r--r--  testdata/testoutput5   7
-rw-r--r--  testdata/testoutput7  16
15 files changed, 207 insertions, 49 deletions
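
The behaviour change in this commit is in the "bumpalong" step that runs after a failed match attempt on an unanchored pattern. The following is a minimal standalone sketch of that rule, not PCRE code: the function name `next_start_offset` and its boolean parameters are hypothetical stand-ins for the `md->nltype`/`md->nllen` tests and the `PCRE_HASCRORLF` flag test in the real matchers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: returns the next start offset to try
   after a match attempt failed at offset pos in subject[0..len). */
size_t next_start_offset(const char *subject, size_t len, size_t pos,
                         bool crlf_is_newline, bool has_cr_or_lf)
{
    assert(pos < len);

    size_t next = pos + 1;    /* normal bumpalong: advance by one character */

    /* Skip the LF of a CRLF pair only when CRLF is a valid newline sequence
       and the pattern contains no explicit match for CR or LF. */
    if (next < len &&
        subject[next - 1] == '\r' &&
        subject[next] == '\n' &&
        !has_cr_or_lf &&
        crlf_is_newline)
        next++;

    return next;
}
```

With the subject "\r\nA" and CRLF as a valid newline, a pattern without explicit CR or LF matches (such as .+A) restarts at offset 2, so it never gets to try the LF, while a pattern such as [\r\n]A restarts at offset 1 and can match "\nA".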
diff --git a/ChangeLog b/ChangeLog
index b410350..51244a7 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -150,6 +150,18 @@ Version 7.3 20-Aug-07
27. Patterns such as (\P{Yi}*\277)* (group with possible zero repeat containing
\p or \P) caused a compile-time loop.
+
+28. More problems have arisen in unanchored patterns when CRLF is a valid line
+ break. For example, the unstudied pattern [\r\n]A does not match the string
+ "\r\nA" because change 7.0/46 below moves the current point on by two
+ characters after failing to match at the start. However, the pattern \nA
+ *does* match, because it doesn't start till \n, and if [\r\n]A is studied,
+ the same is true. There doesn't seem any very clean way out of this, but
+ what I have chosen to do makes the common cases work: PCRE now takes note
+ of whether there can be an explicit match for \r or \n anywhere in the
+ pattern, and if so, 7.0/46 no longer applies. As part of this change,
+ there's a new PCRE_INFO_HASCRORLF option for finding out whether a compiled
+ pattern has explicit CR or LF references.
Version 7.2 19-Jun-07
diff --git a/NEWS b/NEWS
index 86f7eb8..4472d39 100644
--- a/NEWS
+++ b/NEWS
@@ -2,7 +2,7 @@ News about PCRE releases
------------------------
-Release 7.3 16-Aug-07
+Release 7.3 20-Aug-07
---------------------
Most changes are bug fixes. Some that are not:
@@ -16,6 +16,11 @@ Most changes are bug fixes. Some that are not:
3. Checking for potential integer overflow has been made more dynamic, and as a
consequence there is no longer a hard limit on the size of a subpattern that
has a limited repeat count.
+
+4. When CRLF is a valid line-ending sequence, pcre_exec() and pcre_dfa_exec()
+ no longer advance by two characters instead of one when an unanchored match
+ fails at CRLF if there are explicit CR or LF matches within the pattern.
+ This gets rid of some anomalous effects that previously occurred.
Release 7.2 19-Jun-07
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 0f8ecbb..e101012 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -240,8 +240,13 @@ pair of characters that indicate a line break". The choice of newline
convention affects the handling of the dot, circumflex, and dollar
metacharacters, the handling of #-comments in /x mode, and, when CRLF is a
recognized line ending sequence, the match position advancement for a
-non-anchored pattern. The choice of newline convention does not affect the
-interpretation of the \en or \er escape sequences.
+non-anchored pattern. There is more detail about this in the
+.\" HTML <a href="#execoptions">
+.\" </a>
+section on \fBpcre_exec()\fP options
+.\"
+below. The choice of newline convention does not affect the interpretation of
+the \en or \er escape sequences.
.
.
.SH MULTITHREADING
@@ -882,6 +887,11 @@ table indicating a fixed set of bytes for the first byte in any matching
string, a pointer to the table is returned. Otherwise NULL is returned. The
fourth argument should point to an \fBunsigned char *\fP variable.
.sp
+ PCRE_INFO_HASCRORLF
+.sp
+Return 1 if the pattern contains any explicit matches for CR or LF characters,
+otherwise 0. The fourth argument should point to an \fBint\fP variable.
+.sp
PCRE_INFO_JCHANGED
.sp
Return 1 if the (?J) option setting is used in the pattern, otherwise 0. The
@@ -1169,6 +1179,7 @@ called. See the
.\"
documentation for a discussion of saving compiled patterns for later use.
.
+.\" HTML <a name="execoptions"></a>
.SS "Option bits for \fBpcre_exec()\fP"
.rs
.sp
@@ -1194,19 +1205,25 @@ the pattern was compiled. For details, see the description of
\fBpcre_compile()\fP above. During matching, the newline choice affects the
behaviour of the dot, circumflex, and dollar metacharacters. It may also alter
the way the match position is advanced after a match failure for an unanchored
-pattern. When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is
-set, and a match attempt fails when the current position is at a CRLF sequence,
-the match position is advanced by two characters instead of one, in other
-words, to after the CRLF.
-.P
-Anomalous effects can occur when CRLF is a valid newline sequence and explicit
-\er or \en escapes appear in the pattern. For example, the string "\er\enA"
-matches the unanchored pattern \enA but not [X\en]A. This happens because, in
-the first case, PCRE knows that the match must start with \en, and so it skips
-there before trying to match. In the second case, it has no knowledge about the
-starting character, so it starts matching at the beginning of the string, and
-on failing, skips over the CRLF as described above. However, if the pattern is
-studied, the match succeeds, because then PCRE once again knows where to start.
+pattern.
+.P
+When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is set, and a
+match attempt for an unanchored pattern fails when the current position is at a
+CRLF sequence, and the pattern contains no explicit matches for CR or NL
+characters, the match position is advanced by two characters instead of one, in
+other words, to after the CRLF.
+.P
+The above rule is a compromise that makes the most common cases work as
+expected. For example, if the pattern is .+A (and the PCRE_DOTALL option is not
+set), it does not match the string "\er\enA" because, after failing at the
+start, it skips both the CR and the LF before retrying. However, the pattern
+[\er\en]A does match that string, because it contains an explicit CR or LF
+reference, and so advances only by one character after the first failure.
+Note that an explicit CR or LF reference occurs for negated character classes
+such as [^X] because they can match CR or LF characters.
+.P
+Notwithstanding the above, anomalous effects may still occur when CRLF is a
+valid newline sequence and explicit \er or \en escapes appear in the pattern.
.sp
PCRE_NOTBOL
.sp
@@ -1895,6 +1912,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 20 August 2007
+Last updated: 21 August 2007
Copyright (c) 1997-2007 University of Cambridge.
.fi
diff --git a/pcre.h.in b/pcre.h.in
index 5e7a08e..69edca4 100644
--- a/pcre.h.in
+++ b/pcre.h.in
@@ -167,6 +167,7 @@ extern "C" {
#define PCRE_INFO_DEFAULT_TABLES 11
#define PCRE_INFO_OKPARTIAL 12
#define PCRE_INFO_JCHANGED 13
+#define PCRE_INFO_HASCRORLF 14
/* Request types for pcre_config(). Do not re-arrange, in order to remain
compatible. */
diff --git a/pcre_compile.c b/pcre_compile.c
index e344ad4..24f694e 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -3195,7 +3195,18 @@ for (;; ptr++)
*errorcodeptr = ERR6;
goto FAILED;
}
-
+
+ /* Remember whether \r or \n are in this class */
+
+ if (negate_class)
+ {
+ if ((classbits[1] & 0x24) != 0x24) cd->external_options |= PCRE_HASCRORLF;
+ }
+ else
+ {
+ if ((classbits[1] & 0x24) != 0) cd->external_options |= PCRE_HASCRORLF;
+ }
+
/* If class_charcount is 1, we saw precisely one character whose value is
less than 256. As long as there were no characters >= 128 and there was no
use of \p or \P, in other words, no use of any XCLASS features, we can
@@ -5050,6 +5061,11 @@ for (;; ptr++)
previous = code;
*code++ = ((options & PCRE_CASELESS) != 0)? OP_CHARNC : OP_CHAR;
for (c = 0; c < mclength; c++) *code++ = mcbuffer[c];
+
+ /* Remember if \r or \n were seen */
+
+ if (mcbuffer[0] == '\r' || mcbuffer[0] == '\n')
+ cd->external_options |= PCRE_HASCRORLF;
/* Set the first and required bytes appropriately. If no previous first
byte, set it from this character, but revert to none on a zero repeat.
@@ -5982,20 +5998,8 @@ case when building a production library. */
printf("Length = %d top_bracket = %d top_backref = %d\n",
length, re->top_bracket, re->top_backref);
-
-if (re->options != 0)
- {
- printf("%s%s%s%s%s%s%s%s%s\n",
- ((re->options & PCRE_NOPARTIAL) != 0)? "nopartial " : "",
- ((re->options & PCRE_ANCHORED) != 0)? "anchored " : "",
- ((re->options & PCRE_CASELESS) != 0)? "caseless " : "",
- ((re->options & PCRE_EXTENDED) != 0)? "extended " : "",
- ((re->options & PCRE_MULTILINE) != 0)? "multiline " : "",
- ((re->options & PCRE_DOTALL) != 0)? "dotall " : "",
- ((re->options & PCRE_DOLLAR_ENDONLY) != 0)? "endonly " : "",
- ((re->options & PCRE_EXTRA) != 0)? "extra " : "",
- ((re->options & PCRE_UNGREEDY) != 0)? "ungreedy " : "");
- }
+
+printf("Options=%08x\n", re->options);
if ((re->options & PCRE_FIRSTSET) != 0)
{
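
The character-class check in the pcre_compile.c hunk above relies on the mask 0x24, which selects the bits for LF (0x0a) and CR (0x0d) in the second byte of the class bitmap. The sketch below illustrates that test in isolation. It assumes, as the two branches suggest, that `classbits` holds the characters as listed in the class before any negation is applied; the helper names `class_can_match_cr_or_lf` and `class_add_char` are hypothetical and do not exist in PCRE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: mark character c as listed in a 32-byte class bitmap. */
void class_add_char(unsigned char classbits[32], unsigned char c)
{
    classbits[c / 8] |= (unsigned char)(1u << (c % 8));
}

/* Hypothetical helper mirroring the check in the hunk above. */
bool class_can_match_cr_or_lf(const unsigned char classbits[32],
                              bool negate_class)
{
    /* LF is 0x0a: byte 1, bit 2 (0x04). CR is 0x0d: byte 1, bit 5 (0x20). */
    unsigned char crlf_bits = classbits[1] & 0x24;

    /* A negated class can match CR or LF unless both are listed;
       a plain class can match one of them if either is listed. */
    return negate_class ? (crlf_bits != 0x24) : (crlf_bits != 0);
}
```

Under this reading, a class such as [^a] counts as an explicit CR or LF match, which is consistent with the new "Contains explicit CR or LF match" lines in the test output for patterns like /[^a]/I.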
diff --git a/pcre_dfa_exec.c b/pcre_dfa_exec.c
index 60cb619..03af666 100644
--- a/pcre_dfa_exec.c
+++ b/pcre_dfa_exec.c
@@ -2842,16 +2842,17 @@ for (;;)
}
if (current_subject > end_subject) break;
- /* If we have just passed a CR and the newline option is CRLF or ANY or
- ANYCRLF, and we are now at a LF, advance the match position by one more
- character. */
+ /* If we have just passed a CR and we are now at a LF, and the pattern does
+ not contain any explicit matches for \r or \n, and the newline option is CRLF
+ or ANY or ANYCRLF, advance the match position by one more character. */
if (current_subject[-1] == '\r' &&
- (md->nltype == NLTYPE_ANY ||
- md->nltype == NLTYPE_ANYCRLF ||
- md->nllen == 2) &&
- current_subject < end_subject &&
- *current_subject == '\n')
+ current_subject < end_subject &&
+ *current_subject == '\n' &&
+ (re->options & PCRE_HASCRORLF) == 0 &&
+ (md->nltype == NLTYPE_ANY ||
+ md->nltype == NLTYPE_ANYCRLF ||
+ md->nllen == 2))
current_subject++;
} /* "Bumpalong" loop */
diff --git a/pcre_exec.c b/pcre_exec.c
index 6e4a4b5..8fd0bcb 100644
--- a/pcre_exec.c
+++ b/pcre_exec.c
@@ -4785,16 +4785,17 @@ for(;;)
if (anchored || start_match > end_subject) break;
- /* If we have just passed a CR and the newline option is CRLF or ANY or
- ANYCRLF, and we are now at a LF, advance the match position by one more
- character. */
+ /* If we have just passed a CR and we are now at a LF, and the pattern does
+ not contain any explicit matches for \r or \n, and the newline option is CRLF
+ or ANY or ANYCRLF, advance the match position by one more character. */
if (start_match[-1] == '\r' &&
- (md->nltype == NLTYPE_ANY ||
- md->nltype == NLTYPE_ANYCRLF ||
- md->nllen == 2) &&
- start_match < end_subject &&
- *start_match == '\n')
+ start_match < end_subject &&
+ *start_match == '\n' &&
+ (re->options & PCRE_HASCRORLF) == 0 &&
+ (md->nltype == NLTYPE_ANY ||
+ md->nltype == NLTYPE_ANYCRLF ||
+ md->nllen == 2))
start_match++;
} /* End of for(;;) "bumpalong" loop */
diff --git a/pcre_fullinfo.c b/pcre_fullinfo.c
index 9f1f76b..b082473 100644
--- a/pcre_fullinfo.c
+++ b/pcre_fullinfo.c
@@ -152,6 +152,10 @@ switch (what)
*((int *)where) = (re->options & PCRE_JCHANGED) != 0;
break;
+ case PCRE_INFO_HASCRORLF:
+ *((int *)where) = (re->options & PCRE_HASCRORLF) != 0;
+ break;
+
default: return PCRE_ERROR_BADOPTION;
}
diff --git a/pcre_internal.h b/pcre_internal.h
index 795cb36..a2409f9 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -492,6 +492,7 @@ bits. */
#define PCRE_REQCHSET 0x20000000 /* req_byte is set */
#define PCRE_STARTLINE 0x10000000 /* start after \n for multiline */
#define PCRE_JCHANGED 0x08000000 /* j option changes within regex */
+#define PCRE_HASCRORLF 0x04000000 /* explicit \r or \n in pattern */
/* Options for the "extra" block produced by pcre_study(). */
diff --git a/pcretest.c b/pcretest.c
index 20777e0..06ded9f 100644
--- a/pcretest.c
+++ b/pcretest.c
@@ -1357,7 +1357,8 @@ while (!done)
#if !defined NOINFOCHECK
int old_first_char, old_options, old_count;
#endif
- int count, backrefmax, first_char, need_char, okpartial, jchanged;
+ int count, backrefmax, first_char, need_char, okpartial, jchanged,
+ hascrorlf;
int nameentrysize, namecount;
const uschar *nametable;
@@ -1372,6 +1373,7 @@ while (!done)
new_info(re, NULL, PCRE_INFO_NAMETABLE, (void *)&nametable);
new_info(re, NULL, PCRE_INFO_OKPARTIAL, &okpartial);
new_info(re, NULL, PCRE_INFO_JCHANGED, &jchanged);
+ new_info(re, NULL, PCRE_INFO_HASCRORLF, &hascrorlf);
#if !defined NOINFOCHECK
old_count = pcre_info(re, &old_options, &old_first_char);
@@ -1414,6 +1416,7 @@ while (!done)
}
if (!okpartial) fprintf(outfile, "Partial matching not supported\n");
+ if (hascrorlf) fprintf(outfile, "Contains explicit CR or LF match\n");
all_options = ((real_pcre *)re)->options;
if (do_flip) all_options = byteflip(all_options, sizeof(all_options));
diff --git a/testdata/testinput2 b/testdata/testinput2
index 52847ea..3b79bbc 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -2416,4 +2416,16 @@ a random value. /Ix
/(?1)\c[/
+/.+A/<crlf>
+ \r\nA
+
+/\nA/<crlf>
+ \r\nA
+
+/[\r\n]A/<crlf>
+ \r\nA
+
+/(\r|\n)A/<crlf>
+ \r\nA
+
/ End of testinput2 /
diff --git a/testdata/testinput7 b/testdata/testinput7
index 2722980..76524b7 100644
--- a/testdata/testinput7
+++ b/testdata/testinput7
@@ -4298,4 +4298,16 @@
>XY\x0aZ\x0aA\x0bNN\x0c
>\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+/.+A/<crlf>
+ \r\nA
+
+/\nA/<crlf>
+ \r\nA
+
+/[\r\n]A/<crlf>
+ \r\nA
+
+/(\r|\n)A/<crlf>
+ \r\nA
+
/ End of testinput7 /
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index c6ec398..033a016 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -166,6 +166,7 @@ Starting byte set: a b c d
/(a|[^\dZ])/IS
Capturing subpattern count = 1
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -402,6 +403,7 @@ Failed: missing terminating ] for character class at offset 4
/[^aeiou ]{3,}/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -703,6 +705,7 @@ Starting byte set: a b
/(?<=foo\n)^bar/Im
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: multiline
No first char
Need char = 'r'
@@ -719,6 +722,7 @@ No match
/^(?<=foo\n)bar/Im
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: multiline
First char at start or follows newline
Need char = 'r'
@@ -1105,6 +1109,7 @@ No need char
)?)?)?)?)?)?)?)?)?otherword/I
Capturing subpattern count = 8
Partial matching not supported
+Contains explicit CR or LF match
No options
First char = 'w'
Need char = 'd'
@@ -1347,6 +1352,7 @@ No need char
/^ab\n/Ig+
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -1356,6 +1362,7 @@ No need char
/^ab\n/Img+
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: multiline
First char at start or follows newline
Need char = 10
@@ -1433,6 +1440,7 @@ Need char = 'a'
/"([^\\"]+|\\.)*"/I
Capturing subpattern count = 1
Partial matching not supported
+Contains explicit CR or LF match
No options
First char = '"'
Need char = '"'
@@ -1708,6 +1716,7 @@ Study returned NULL
/Ix
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1737,6 +1746,7 @@ No match
/\( ( (?>[^()]+) | (?R) )* \) /Ixg
Capturing subpattern count = 1
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1752,6 +1762,7 @@ Need char = ')'
/\( (?: (?>[^()]+) | (?R) ) \) /Ix
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1771,6 +1782,7 @@ No match
/\( (?: (?>[^()]+) | (?R) )? \) /Ix
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1782,6 +1794,7 @@ Need char = ')'
/\( ( (?>[^()]+) | (?R) )* \) /Ix
Capturing subpattern count = 1
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1792,6 +1805,7 @@ Need char = ')'
/\( ( ( (?>[^()]+) | (?R) )* ) \) /Ix
Capturing subpattern count = 2
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1803,6 +1817,7 @@ Need char = ')'
/\( (123)? ( ( (?>[^()]+) | (?R) )* ) \) /Ix
Capturing subpattern count = 3
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1820,6 +1835,7 @@ Need char = ')'
/\( ( (123)? ( (?>[^()]+) | (?R) )* ) \) /Ix
Capturing subpattern count = 3
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1837,6 +1853,7 @@ Need char = ')'
/\( (((((((((( ( (?>[^()]+) | (?R) )* )))))))))) \) /Ix
Capturing subpattern count = 11
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1857,6 +1874,7 @@ Need char = ')'
/\( ( ( (?>[^()<>]+) | ((?>[^()]+)) | (?R) )* ) \) /Ix
Capturing subpattern count = 3
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1869,6 +1887,7 @@ Need char = ')'
/\( ( ( (?>[^()]+) | ((?R)) )* ) \) /Ix
Capturing subpattern count = 3
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '('
Need char = ')'
@@ -1905,6 +1924,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -1931,6 +1951,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -1952,6 +1973,7 @@ Starting byte set: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -1991,12 +2013,14 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
/[\n\x0b\x0c\x0d[:blank:]]/IS
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -2011,6 +2035,7 @@ Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -2089,6 +2114,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -2154,6 +2180,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -2167,6 +2194,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -3055,6 +3083,7 @@ Need char = 'b'
/([^()]++|\([^()]*\))+/I
Capturing subpattern count = 1
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -3065,6 +3094,7 @@ No need char
/\(([^()]++|\([^()]+\))+\)/I
Capturing subpattern count = 1
Partial matching not supported
+Contains explicit CR or LF match
No options
First char = '('
Need char = ')'
@@ -3265,6 +3295,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -3277,6 +3308,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -3284,6 +3316,7 @@ No need char
/< (?: (?(R) \d++ | [^<>]*+) | (?R)) * >/Ix
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '<'
Need char = '>'
@@ -3498,6 +3531,7 @@ Starting byte set: a b
/[^a]/I
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -3957,6 +3991,7 @@ Failed: recursive call could loop indefinitely at offset 16
/^([^()]|\((?1)*\))*$/I
Capturing subpattern count = 1
+Contains explicit CR or LF match
Options: anchored
No first char
No need char
@@ -3976,6 +4011,7 @@ No match
/^>abc>([^()]|\((?1)*\))*<xyz<$/I
Capturing subpattern count = 1
+Contains explicit CR or LF match
Options: anchored
No first char
Need char = '<'
@@ -4103,6 +4139,7 @@ No match
/((< (?: (?(R) \d++ | [^<>]*+) | (?2)) * >))/Ix
Capturing subpattern count = 2
Partial matching not supported
+Contains explicit CR or LF match
Options: extended
First char = '<'
Need char = '>'
@@ -5631,6 +5668,7 @@ No need char
/line\nbreak/I
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
First char = 'l'
Need char = 'k'
@@ -5641,6 +5679,7 @@ Need char = 'k'
/line\nbreak/If
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: firstline
First char = 'l'
Need char = 'k'
@@ -5653,6 +5692,7 @@ No match
/line\nbreak/Imf
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: multiline firstline
First char = 'l'
Need char = 'k'
@@ -5918,6 +5958,7 @@ Matched, but too many substrings
/[^()]*(?:\((?R)\)[^()]*)*/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -5931,6 +5972,7 @@ No need char
/[^()]*(?:\((?>(?R))\)[^()]*)*/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -5942,6 +5984,7 @@ No need char
/[^()]*(?:\((?R)\))*[^()]*/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -5953,6 +5996,7 @@ No need char
/(?:\((?R)\))*[^()]*/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -5966,6 +6010,7 @@ No need char
/(?:\((?R)\))|[^()]*/I
Capturing subpattern count = 0
Partial matching not supported
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -9047,4 +9092,21 @@ Failed: number is too big at offset 12
/(?1)\c[/
Failed: reference to non-existent subpattern at offset 3
+/.+A/<crlf>
+ \r\nA
+No match
+
+/\nA/<crlf>
+ \r\nA
+ 0: \x0aA
+
+/[\r\n]A/<crlf>
+ \r\nA
+ 0: \x0aA
+
+/(\r|\n)A/<crlf>
+ \r\nA
+ 0: \x0aA
+ 1: \x0a
+
/ End of testinput2 /
diff --git a/testdata/testoutput5 b/testdata/testoutput5
index cd8958a..2d9ee69 100644
--- a/testdata/testoutput5
+++ b/testdata/testoutput5
@@ -364,6 +364,7 @@ No match
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: anchored utf8
No first char
No need char
@@ -386,6 +387,7 @@ No match
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: utf8
No first char
No need char
@@ -653,6 +655,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -665,6 +668,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: utf8
No first char
No need char
@@ -788,6 +792,7 @@ Need char = 191
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
No options
No first char
No need char
@@ -800,6 +805,7 @@ No need char
End
------------------------------------------------------------------
Capturing subpattern count = 0
+Contains explicit CR or LF match
Options: utf8
No first char
No need char
@@ -936,6 +942,7 @@ Need char = 'z'
End
------------------------------------------------------------------
Capturing subpattern count = 1
+Contains explicit CR or LF match
Options: utf8
No first char
Need char = 'z'
diff --git a/testdata/testoutput7 b/testdata/testoutput7
index a77186d..39c5075 100644
--- a/testdata/testoutput7
+++ b/testdata/testoutput7
@@ -7072,4 +7072,20 @@ No match
>\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+/.+A/<crlf>
+ \r\nA
+No match
+
+/\nA/<crlf>
+ \r\nA
+ 0: \x0aA
+
+/[\r\n]A/<crlf>
+ \r\nA
+ 0: \x0aA
+
+/(\r|\n)A/<crlf>
+ \r\nA
+ 0: \x0aA
+
/ End of testinput7 /