author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-08-21 11:46:08 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-08-21 11:46:08 +0000 |
commit | c6a88bf880d462c62e00d8d7c3eeeaad60ebab49 | |
tree | 7b948602c6a0645f32441bdf9dd602a73b4fc3e7 | |
parent | 3fa5c170ee26c5a63c0efa62dc42fc9fb57ce76b | |
Don't advance by 2 if explicit \r or \n in the pattern. Add
PCRE_INFO_HASCRORLF.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@226 2f5784b3-3f2a-0410-8824-cb99058d5e15
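The advancement rule described in the commit message can be sketched in isolation. This is a minimal illustration, not the PCRE code: the helper name `next_start` and its parameters are hypothetical, it works on bytes only (no UTF-8 handling), and it folds the three newline conventions that accept CRLF into a single flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of PCRE. Returns the offset at which the
   next unanchored match attempt starts after an attempt at `pos` failed.

   subject, len      the subject string and its length in bytes
   pos               offset of the failed attempt (must be < len)
   crlf_is_newline   nonzero if the newline convention is CRLF, ANYCRLF or ANY
   has_crorlf        nonzero if the pattern has an explicit CR or LF match */
static size_t next_start(const char *subject, size_t len, size_t pos,
                         int crlf_is_newline, int has_crorlf)
{
  assert(pos < len);
  size_t next = pos + 1;            /* normal bumpalong: one character */

  /* Step over the LF of a CRLF pair only when the pattern cannot match
     CR or LF explicitly. This mirrors the condition added by this commit. */
  if (subject[next - 1] == '\r' &&
      next < len &&
      subject[next] == '\n' &&
      !has_crorlf &&
      crlf_is_newline)
    next++;

  return next;
}
```

With the subject "\r\nA", a pattern such as .+A (no explicit CR or LF) would restart at offset 2, while [\r\n]A would restart at offset 1 and so gets a chance to match at the LF.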
-rw-r--r-- | ChangeLog | 12 |
-rw-r--r-- | NEWS | 7 |
-rw-r--r-- | doc/pcreapi.3 | 49 |
-rw-r--r-- | pcre.h.in | 1 |
-rw-r--r-- | pcre_compile.c | 34 |
-rw-r--r-- | pcre_dfa_exec.c | 17 |
-rw-r--r-- | pcre_exec.c | 17 |
-rw-r--r-- | pcre_fullinfo.c | 4 |
-rw-r--r-- | pcre_internal.h | 1 |
-rw-r--r-- | pcretest.c | 5 |
-rw-r--r-- | testdata/testinput2 | 12 |
-rw-r--r-- | testdata/testinput7 | 12 |
-rw-r--r-- | testdata/testoutput2 | 62 |
-rw-r--r-- | testdata/testoutput5 | 7 |
-rw-r--r-- | testdata/testoutput7 | 16 |
15 files changed, 207 insertions, 49 deletions
@@ -150,6 +150,18 @@ Version 7.3 20-Aug-07
 27. Patterns such as (\P{Yi}*\277)* (group with possible zero repeat containing
     \p or \P) caused a compile-time loop.
+
+28. More problems have arisen in unanchored patterns when CRLF is a valid line
+    break. For example, the unstudied pattern [\r\n]A does not match the string
+    "\r\nA" because change 7.0/46 below moves the current point on by two
+    characters after failing to match at the start. However, the pattern \nA
+    *does* match, because it doesn't start till \n, and if [\r\n]A is studied,
+    the same is true. There doesn't seem any very clean way out of this, but
+    what I have chosen to do makes the common cases work: PCRE now takes note
+    of whether there can be an explicit match for \r or \n anywhere in the
+    pattern, and if so, 7.0/46 no longer applies. As part of this change,
+    there's a new PCRE_INFO_HASCRORLF option for finding out whether a compiled
+    pattern has explicit CR or LF references.
 Version 7.2 19-Jun-07
@@ -2,7 +2,7 @@ News about PCRE releases
 ------------------------
-Release 7.3 16-Aug-07
+Release 7.3 20-Aug-07
 ---------------------
 Most changes are bug fixes. Some that are not:
@@ -16,6 +16,11 @@ Most changes are bug fixes. Some that are not:
 3. Checking for potential integer overflow has been made more dynamic, and as a
    consequence there is no longer a hard limit on the size of a subpattern that
    has a limited repeat count.
+
+4. When CRLF is a valid line-ending sequence, pcre_exec() and pcre_dfa_exec()
+   no longer advance by two characters instead of one when an unanchored match
+   fails at CRLF if there are explicit CR or LF matches within the pattern.
+   This gets rid of some anomalous effects that previously occurred.
 Release 7.2 19-Jun-07
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 0f8ecbb..e101012 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -240,8 +240,13 @@ pair of characters that indicate a line break".
 The choice of newline convention affects the handling of the dot, circumflex,
 and dollar metacharacters, the handling of #-comments in /x mode, and, when
 CRLF is a recognized line ending sequence, the match position advancement for a
-non-anchored pattern. The choice of newline convention does not affect the
-interpretation of the \en or \er escape sequences.
+non-anchored pattern. There is more detail about this in the
+.\" HTML <a href="#execoptions">
+.\" </a>
+section on \fBpcre_exec()\fP options
+.\"
+below. The choice of newline convention does not affect the interpretation of
+the \en or \er escape sequences.
 .
 .
 .SH MULTITHREADING
@@ -882,6 +887,11 @@ table indicating a fixed set of bytes for the first byte in any matching
 string, a pointer to the table is returned. Otherwise NULL is returned. The
 fourth argument should point to an \fBunsigned char *\fP variable.
 .sp
+  PCRE_INFO_HASCRORLF
+.sp
+Return 1 if the pattern contains any explicit matches for CR or LF characters,
+otherwise 0. The fourth argument should point to an \fBint\fP variable.
+.sp
   PCRE_INFO_JCHANGED
 .sp
 Return 1 if the (?J) option setting is used in the pattern, otherwise 0. The
@@ -1169,6 +1179,7 @@ called. See the
 .\"
 documentation for a discussion of saving compiled patterns for later use.
 .
+.\" HTML <a name="execoptions"></a>
 .SS "Option bits for \fBpcre_exec()\fP"
 .rs
 .sp
@@ -1194,19 +1205,25 @@ the pattern was compiled. For details, see the description of
 \fBpcre_compile()\fP above. During matching, the newline choice affects the
 behaviour of the dot, circumflex, and dollar metacharacters. It may also alter
 the way the match position is advanced after a match failure for an unanchored
-pattern. When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is
-set, and a match attempt fails when the current position is at a CRLF sequence,
-the match position is advanced by two characters instead of one, in other
-words, to after the CRLF.
-.P
-Anomalous effects can occur when CRLF is a valid newline sequence and explicit
-\er or \en escapes appear in the pattern. For example, the string "\er\enA"
-matches the unanchored pattern \enA but not [X\en]A. This happens because, in
-the first case, PCRE knows that the match must start with \en, and so it skips
-there before trying to match. In the second case, it has no knowledge about the
-starting character, so it starts matching at the beginning of the string, and
-on failing, skips over the CRLF as described above. However, if the pattern is
-studied, the match succeeds, because then PCRE once again knows where to start.
+pattern.
+.P
+When PCRE_NEWLINE_CRLF, PCRE_NEWLINE_ANYCRLF, or PCRE_NEWLINE_ANY is set, and a
+match attempt for an unanchored pattern fails when the current position is at a
+CRLF sequence, and the pattern contains no explicit matches for CR or NL
+characters, the match position is advanced by two characters instead of one, in
+other words, to after the CRLF.
+.P
+The above rule is a compromise that makes the most common cases work as
+expected. For example, if the pattern is .+A (and the PCRE_DOTALL option is not
+set), it does not match the string "\er\enA" because, after failing at the
+start, it skips both the CR and the LF before retrying. However, the pattern
+[\er\en]A does match that string, because it contains an explicit CR or LF
+reference, and so advances only by one character after the first failure.
+Note than an explicit CR or LF reference occurs for negated character classes
+such as [^X] because they can match CR or LF characters.
+.P
+Notwithstanding the above, anomalous effects may still occur when CRLF is a
+valid newline sequence and explicit \er or \en escapes appear in the pattern.
 .sp
   PCRE_NOTBOL
 .sp
@@ -1895,6 +1912,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 20 August 2007
+Last updated: 21 August 2007
 Copyright (c) 1997-2007 University of Cambridge.
 .fi
@@ -167,6 +167,7 @@ extern "C" {
 #define PCRE_INFO_DEFAULT_TABLES    11
 #define PCRE_INFO_OKPARTIAL         12
 #define PCRE_INFO_JCHANGED          13
+#define PCRE_INFO_HASCRORLF         14
 /* Request types for pcre_config(). Do not re-arrange, in order to remain
 compatible. */
diff --git a/pcre_compile.c b/pcre_compile.c
index e344ad4..24f694e 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -3195,7 +3195,18 @@ for (;; ptr++)
         *errorcodeptr = ERR6;
         goto FAILED;
         }
-
+
+    /* Remember whether \r or \n are in this class */
+
+    if (negate_class)
+      {
+      if ((classbits[1] & 0x24) != 0x24) cd->external_options |= PCRE_HASCRORLF;
+      }
+    else
+      {
+      if ((classbits[1] & 0x24) != 0) cd->external_options |= PCRE_HASCRORLF;
+      }
+
     /* If class_charcount is 1, we saw precisely one character whose value is
     less than 256. As long as there were no characters >= 128 and there was no
     use of \p or \P, in other words, no use of any XCLASS features, we can
@@ -5050,6 +5061,11 @@ for (;; ptr++)
     previous = code;
     *code++ = ((options & PCRE_CASELESS) != 0)? OP_CHARNC : OP_CHAR;
     for (c = 0; c < mclength; c++) *code++ = mcbuffer[c];
+
+    /* Remember if \r or \n were seen */
+
+    if (mcbuffer[0] == '\r' || mcbuffer[0] == '\n')
+      cd->external_options |= PCRE_HASCRORLF;
     /* Set the first and required bytes appropriately. If no previous first
     byte, set it from this character, but revert to none on a zero repeat.
@@ -5982,20 +5998,8 @@ case when building a production library. */
 printf("Length = %d top_bracket = %d top_backref = %d\n",
   length, re->top_bracket, re->top_backref);
-
-if (re->options != 0)
-  {
-  printf("%s%s%s%s%s%s%s%s%s\n",
-    ((re->options & PCRE_NOPARTIAL) != 0)? "nopartial " : "",
-    ((re->options & PCRE_ANCHORED) != 0)? "anchored " : "",
-    ((re->options & PCRE_CASELESS) != 0)? "caseless " : "",
-    ((re->options & PCRE_EXTENDED) != 0)? "extended " : "",
-    ((re->options & PCRE_MULTILINE) != 0)? "multiline " : "",
-    ((re->options & PCRE_DOTALL) != 0)? "dotall " : "",
-    ((re->options & PCRE_DOLLAR_ENDONLY) != 0)? "endonly " : "",
-    ((re->options & PCRE_EXTRA) != 0)? "extra " : "",
-    ((re->options & PCRE_UNGREEDY) != 0)? "ungreedy " : "");
-  }
+
+printf("Options=%08x\n", re->options);
 if ((re->options & PCRE_FIRSTSET) != 0)
   {
diff --git a/pcre_dfa_exec.c b/pcre_dfa_exec.c
index 60cb619..03af666 100644
--- a/pcre_dfa_exec.c
+++ b/pcre_dfa_exec.c
@@ -2842,16 +2842,17 @@ for (;;)
     }
   if (current_subject > end_subject) break;
-  /* If we have just passed a CR and the newline option is CRLF or ANY or
-  ANYCRLF, and we are now at a LF, advance the match position by one more
-  character. */
+  /* If we have just passed a CR and we are now at a LF, and the pattern does
+  not contain any explicit matches for \r or \n, and the newline option is CRLF
+  or ANY or ANYCRLF, advance the match position by one more character. */
   if (current_subject[-1] == '\r' &&
-       (md->nltype == NLTYPE_ANY ||
-        md->nltype == NLTYPE_ANYCRLF ||
-        md->nllen == 2) &&
-       current_subject < end_subject &&
-       *current_subject == '\n')
+      current_subject < end_subject &&
+      *current_subject == '\n' &&
+      (re->options & PCRE_HASCRORLF) == 0 &&
+        (md->nltype == NLTYPE_ANY ||
+         md->nltype == NLTYPE_ANYCRLF ||
+         md->nllen == 2))
     current_subject++;
   }   /* "Bumpalong" loop */
diff --git a/pcre_exec.c b/pcre_exec.c
index 6e4a4b5..8fd0bcb 100644
--- a/pcre_exec.c
+++ b/pcre_exec.c
@@ -4785,16 +4785,17 @@ for(;;)
   if (anchored || start_match > end_subject) break;
-  /* If we have just passed a CR and the newline option is CRLF or ANY or
-  ANYCRLF, and we are now at a LF, advance the match position by one more
-  character. */
+  /* If we have just passed a CR and we are now at a LF, and the pattern does
+  not contain any explicit matches for \r or \n, and the newline option is CRLF
+  or ANY or ANYCRLF, advance the match position by one more character. */
   if (start_match[-1] == '\r' &&
-       (md->nltype == NLTYPE_ANY ||
-        md->nltype == NLTYPE_ANYCRLF ||
-        md->nllen == 2) &&
-       start_match < end_subject &&
-       *start_match == '\n')
+      start_match < end_subject &&
+      *start_match == '\n' &&
+      (re->options & PCRE_HASCRORLF) == 0 &&
+        (md->nltype == NLTYPE_ANY ||
+         md->nltype == NLTYPE_ANYCRLF ||
+         md->nllen == 2))
     start_match++;
   }   /* End of for(;;) "bumpalong" loop */
diff --git a/pcre_fullinfo.c b/pcre_fullinfo.c
index 9f1f76b..b082473 100644
--- a/pcre_fullinfo.c
+++ b/pcre_fullinfo.c
@@ -152,6 +152,10 @@ switch (what)
   *((int *)where) = (re->options & PCRE_JCHANGED) != 0;
   break;
+  case PCRE_INFO_HASCRORLF:
+  *((int *)where) = (re->options & PCRE_HASCRORLF) != 0;
+  break;
+
   default: return PCRE_ERROR_BADOPTION;
   }
diff --git a/pcre_internal.h b/pcre_internal.h
index 795cb36..a2409f9 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -492,6 +492,7 @@ bits. */
 #define PCRE_REQCHSET      0x20000000  /* req_byte is set */
 #define PCRE_STARTLINE     0x10000000  /* start after \n for multiline */
 #define PCRE_JCHANGED      0x08000000  /* j option changes within regex */
+#define PCRE_HASCRORLF     0x04000000  /* explicit \r or \n in pattern */
 /* Options for the "extra" block produced by pcre_study(). */
@@ -1357,7 +1357,8 @@ while (!done)
 #if !defined NOINFOCHECK
     int old_first_char, old_options, old_count;
 #endif
-    int count, backrefmax, first_char, need_char, okpartial, jchanged;
+    int count, backrefmax, first_char, need_char, okpartial, jchanged,
+      hascrorlf;
     int nameentrysize, namecount;
     const uschar *nametable;
@@ -1372,6 +1373,7 @@ while (!done)
     new_info(re, NULL, PCRE_INFO_NAMETABLE, (void *)&nametable);
     new_info(re, NULL, PCRE_INFO_OKPARTIAL, &okpartial);
     new_info(re, NULL, PCRE_INFO_JCHANGED, &jchanged);
+    new_info(re, NULL, PCRE_INFO_HASCRORLF, &hascrorlf);
 #if !defined NOINFOCHECK
     old_count = pcre_info(re, &old_options, &old_first_char);
@@ -1414,6 +1416,7 @@ while (!done)
       }
     if (!okpartial) fprintf(outfile, "Partial matching not supported\n");
+    if (hascrorlf) fprintf(outfile, "Contains explicit CR or LF match\n");
     all_options = ((real_pcre *)re)->options;
     if (do_flip) all_options = byteflip(all_options, sizeof(all_options));
diff --git a/testdata/testinput2 b/testdata/testinput2
index 52847ea..3b79bbc 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -2416,4 +2416,16 @@ a random value. /Ix
 /(?1)\c[/
+/.+A/<crlf>
+    \r\nA
+
+/\nA/<crlf>
+    \r\nA
+
+/[\r\n]A/<crlf>
+    \r\nA
+
+/(\r|\n)A/<crlf>
+    \r\nA
+
 / End of testinput2 /
diff --git a/testdata/testinput7 b/testdata/testinput7
index 2722980..76524b7 100644
--- a/testdata/testinput7
+++ b/testdata/testinput7
@@ -4298,4 +4298,16 @@
 >XY\x0aZ\x0aA\x0bNN\x0c
 >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+/.+A/<crlf>
+    \r\nA
+
+/\nA/<crlf>
+    \r\nA
+
+/[\r\n]A/<crlf>
+    \r\nA
+
+/(\r|\n)A/<crlf>
+    \r\nA
+
 / End of testinput7 /
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index c6ec398..033a016 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -166,6 +166,7 @@ Starting byte set: a b c d
 /(a|[^\dZ])/IS
 Capturing subpattern count = 1
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -402,6 +403,7 @@ Failed: missing terminating ] for character class at offset 4
 /[^aeiou ]{3,}/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -703,6 +705,7 @@ Starting byte set: a b
 /(?<=foo\n)^bar/Im
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: multiline
 No first char
 Need char = 'r'
@@ -719,6 +722,7 @@ No match
 /^(?<=foo\n)bar/Im
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: multiline
 First char at start or follows newline
 Need char = 'r'
@@ -1105,6 +1109,7 @@ No need char
 )?)?)?)?)?)?)?)?)?otherword/I
 Capturing subpattern count = 8
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 First char = 'w'
 Need char = 'd'
@@ -1347,6 +1352,7 @@ No need char
 /^ab\n/Ig+
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -1356,6 +1362,7 @@ No need char
 /^ab\n/Img+
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: multiline
 First char at start or follows newline
 Need char = 10
@@ -1433,6 +1440,7 @@ Need char = 'a'
 /"([^\\"]+|\\.)*"/I
 Capturing subpattern count = 1
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 First char = '"'
 Need char = '"'
@@ -1708,6 +1716,7 @@ Study returned NULL
 /Ix
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1737,6 +1746,7 @@ No match
 /\( ( (?>[^()]+) | (?R) )* \) /Ixg
 Capturing subpattern count = 1
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1752,6 +1762,7 @@ Need char = ')'
 /\( (?: (?>[^()]+) | (?R) ) \) /Ix
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1771,6 +1782,7 @@ No match
 /\( (?: (?>[^()]+) | (?R) )? \) /Ix
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1782,6 +1794,7 @@ Need char = ')'
 /\( ( (?>[^()]+) | (?R) )* \) /Ix
 Capturing subpattern count = 1
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1792,6 +1805,7 @@ Need char = ')'
 /\( ( ( (?>[^()]+) | (?R) )* ) \) /Ix
 Capturing subpattern count = 2
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1803,6 +1817,7 @@ Need char = ')'
 /\( (123)? ( ( (?>[^()]+) | (?R) )* ) \) /Ix
 Capturing subpattern count = 3
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1820,6 +1835,7 @@ Need char = ')'
 /\( ( (123)? ( (?>[^()]+) | (?R) )* ) \) /Ix
 Capturing subpattern count = 3
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1837,6 +1853,7 @@ Need char = ')'
 /\( (((((((((( ( (?>[^()]+) | (?R) )* )))))))))) \) /Ix
 Capturing subpattern count = 11
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1857,6 +1874,7 @@ Need char = ')'
 /\( ( ( (?>[^()<>]+) | ((?>[^()]+)) | (?R) )* ) \) /Ix
 Capturing subpattern count = 3
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1869,6 +1887,7 @@ Need char = ')'
 /\( ( ( (?>[^()]+) | ((?R)) )* ) \) /Ix
 Capturing subpattern count = 3
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '('
 Need char = ')'
@@ -1905,6 +1924,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -1931,6 +1951,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -1952,6 +1973,7 @@ Starting byte set: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -1991,12 +2013,14 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
 /[\n\x0b\x0c\x0d[:blank:]]/IS
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -2011,6 +2035,7 @@ Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -2089,6 +2114,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -2154,6 +2180,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -2167,6 +2194,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -3055,6 +3083,7 @@ Need char = 'b'
 /([^()]++|\([^()]*\))+/I
 Capturing subpattern count = 1
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -3065,6 +3094,7 @@ No need char
 /\(([^()]++|\([^()]+\))+\)/I
 Capturing subpattern count = 1
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 First char = '('
 Need char = ')'
@@ -3265,6 +3295,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -3277,6 +3308,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -3284,6 +3316,7 @@ No need char
 /< (?: (?(R) \d++ | [^<>]*+) | (?R)) * >/Ix
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '<'
 Need char = '>'
@@ -3498,6 +3531,7 @@ Starting byte set: a b
 /[^a]/I
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -3957,6 +3991,7 @@ Failed: recursive call could loop indefinitely at offset 16
 /^([^()]|\((?1)*\))*$/I
 Capturing subpattern count = 1
+Contains explicit CR or LF match
 Options: anchored
 No first char
 No need char
@@ -3976,6 +4011,7 @@ No match
 /^>abc>([^()]|\((?1)*\))*<xyz<$/I
 Capturing subpattern count = 1
+Contains explicit CR or LF match
 Options: anchored
 No first char
 Need char = '<'
@@ -4103,6 +4139,7 @@ No match
 /((< (?: (?(R) \d++ | [^<>]*+) | (?2)) * >))/Ix
 Capturing subpattern count = 2
 Partial matching not supported
+Contains explicit CR or LF match
 Options: extended
 First char = '<'
 Need char = '>'
@@ -5631,6 +5668,7 @@ No need char
 /line\nbreak/I
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 First char = 'l'
 Need char = 'k'
@@ -5641,6 +5679,7 @@ Need char = 'k'
 /line\nbreak/If
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: firstline
 First char = 'l'
 Need char = 'k'
@@ -5653,6 +5692,7 @@ No match
 /line\nbreak/Imf
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: multiline firstline
 First char = 'l'
 Need char = 'k'
@@ -5918,6 +5958,7 @@ Matched, but too many substrings
 /[^()]*(?:\((?R)\)[^()]*)*/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -5931,6 +5972,7 @@ No need char
 /[^()]*(?:\((?>(?R))\)[^()]*)*/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -5942,6 +5984,7 @@ No need char
 /[^()]*(?:\((?R)\))*[^()]*/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -5953,6 +5996,7 @@ No need char
 /(?:\((?R)\))*[^()]*/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -5966,6 +6010,7 @@ No need char
 /(?:\((?R)\))|[^()]*/I
 Capturing subpattern count = 0
 Partial matching not supported
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -9047,4 +9092,21 @@ Failed: number is too big at offset 12
 /(?1)\c[/
 Failed: reference to non-existent subpattern at offset 3
+/.+A/<crlf>
+    \r\nA
+No match
+
+/\nA/<crlf>
+    \r\nA
+ 0: \x0aA
+
+/[\r\n]A/<crlf>
+    \r\nA
+ 0: \x0aA
+
+/(\r|\n)A/<crlf>
+    \r\nA
+ 0: \x0aA
+ 1: \x0a
+
 / End of testinput2 /
diff --git a/testdata/testoutput5 b/testdata/testoutput5
index cd8958a..2d9ee69 100644
--- a/testdata/testoutput5
+++ b/testdata/testoutput5
@@ -364,6 +364,7 @@ No match
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: anchored utf8
 No first char
 No need char
@@ -386,6 +387,7 @@ No match
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: utf8
 No first char
 No need char
@@ -653,6 +655,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -665,6 +668,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: utf8
 No first char
 No need char
@@ -788,6 +792,7 @@ Need char = 191
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 No options
 No first char
 No need char
@@ -800,6 +805,7 @@ No need char
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 0
+Contains explicit CR or LF match
 Options: utf8
 No first char
 No need char
@@ -936,6 +942,7 @@ Need char = 'z'
 End
 ------------------------------------------------------------------
 Capturing subpattern count = 1
+Contains explicit CR or LF match
 Options: utf8
 No first char
 Need char = 'z'
diff --git a/testdata/testoutput7 b/testdata/testoutput7
index a77186d..39c5075 100644
--- a/testdata/testoutput7
+++ b/testdata/testoutput7
@@ -7072,4 +7072,20 @@ No match
 >\x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
  0: \x0a\x0dX\x0aY\x0a\x0bZZZ\x0aAAA\x0bNNN\x0c
+/.+A/<crlf>
+    \r\nA
+No match
+
+/\nA/<crlf>
+    \r\nA
+ 0: \x0aA
+
+/[\r\n]A/<crlf>
+    \r\nA
+ 0: \x0aA
+
+/(\r|\n)A/<crlf>
+    \r\nA
+ 0: \x0aA
+
 / End of testinput7 /
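The character-class check in the pcre_compile.c hunk relies on the mask 0x24 selecting the LF and CR bits in the second byte of the class bitmap. The sketch below reproduces that test on its own, which may also help explain why patterns such as /[^a]/ now report "Contains explicit CR or LF match" in the test output. The names `class_add` and `class_has_crorlf` are hypothetical, an ASCII character set is assumed, and the bitmap is assumed to hold the class contents before negation is applied.

```c
#include <assert.h>
#include <string.h>

/* A 256-bit class bitmap: character c is bit (c & 7) of byte (c / 8). */
enum { CLASS_BYTES = 32 };

/* Hypothetical helper: add one character to a class bitmap. */
static void class_add(unsigned char *classbits, unsigned char c)
{
  classbits[c / 8] |= (unsigned char)(1u << (c & 7));
}

/* Hypothetical helper: can this class match CR or LF?
   In ASCII, LF is 10 (byte 1, bit 2 = 0x04) and CR is 13 (byte 1,
   bit 5 = 0x20), so both live in classbits[1] under the mask 0x24. */
static int class_has_crorlf(const unsigned char *classbits, int negated)
{
  assert('\n' / 8 == 1 && '\r' / 8 == 1);   /* ASCII assumption */
  unsigned int crlf = classbits[1] & 0x24u;

  /* A positive class matches CR or LF if either bit is set. A negated
     class matches one of them unless both are listed, that is, unless
     both are excluded by the negation. */
  return negated ? (crlf != 0x24u) : (crlf != 0);
}
```

Under these assumptions, a bitmap holding only 'X' gives 0 for [X] and 1 for [^X], which agrees with the pcreapi.3 remark about negated classes.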