author | Yves Orton <demerphq@gmail.com> | 2010-11-03 10:23:00 +0100 |
---|---|---|
committer | Yves Orton <demerphq@gmail.com> | 2010-11-03 10:24:41 +0100 |
commit | e7f38d0fe17e7a846c0ed55e71ebb120a336b887 (patch) | |
tree | 4ac9dd2ac643afb71072cfdff1debb705bd80ee8 /t | |
parent | 6c48061a1225c4dfbc96c8cf94f17afae2b75c24 (diff) | |
download | perl-e7f38d0fe17e7a846c0ed55e71ebb120a336b887.tar.gz |
fix 68564: /g failure with zero-width patterns
This is based on a patch by Father Chrysostomos <sprout@cpan.org>.
The start class optimisation has two modes, "try every valid start
position" (doevery) and "flip flop mode" (!doevery) where it tries
only the first valid start position in a sequence.
Consider /(\d+)X/ and the string "123456Y". We know that if we fail
to match X after matching "123456" then we will also fail to match after
"23456" (assuming no evil tricks are in place, which disable the
optimisation anyway), so we can skip forward until the start class check
/fails/ and only then start looking for a real match. This is flip-flop
mode.
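The difference between the two modes can be sketched as a toy model in
Perl. This is not the code in regexec.c; candidate_starts() is a
hypothetical helper, and the sample string is made up for illustration.
It only lists which offsets each mode would hand to the full matcher for
a start class of \d.

```perl
use strict;
use warnings;

# Hypothetical model: return the offsets at which a full match would be
# attempted, given a start class and the chosen mode.
sub candidate_starts {
    my ($str, $class, $doevery) = @_;
    my @starts;
    my $in_run = 0;    # true while the previous character was in the class
    for my $i (0 .. length($str) - 1) {
        if (substr($str, $i, 1) =~ $class) {
            # doevery: try every offset in the run.
            # flip-flop: try only the first offset of each run.
            push @starts, $i if $doevery || !$in_run;
            $in_run = 1;
        }
        else {
            $in_run = 0;    # a failing character re-arms flip-flop mode
        }
    }
    return @starts;
}

my @every = candidate_starts("123456Y 78X", qr/\d/, 1);
my @flip  = candidate_starts("123456Y 78X", qr/\d/, 0);
print "doevery:   @every\n";    # doevery:   0 1 2 3 4 5 8 9
print "flip-flop: @flip\n";     # flip-flop: 0 8
```

For /(\d+)X/ both candidate lists lead to the same first match (at offset
8 in this sample), which is why skipping the rest of a run is safe there.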
Now consider the case with zero-width lookahead under /g: /(?=(\d+)X)/.
In this case we have an additional failure mode, that is, failure when
we match a zero-width string twice at the same pos(). So now the
"flip-flop" logic breaks, as it /is/ possible that we could match at
"23456" when we couldn't match at "123456", because of the rule against
a zero-length match twice at the same pos(). For instance:
print $_ for "123" =~ /(?=(\d+))/g
should first match "123". Since $& is zero length, pos() is not
incremented. We then match again, successfully, except that the match
is rejected despite its technical success because its $& is also zero
length and pos() has not advanced. If flip-flop mode is enabled
we won't retry until we have first found a failing character.
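The behaviour described above can be observed with a short script. This
is a minimal sketch that assumes a perl which includes this fix; the
variable names are illustrative only. The scalar-context loop records
pos() alongside $1 to show that each successful match is zero length and
that the retry moves forward by one character.

```perl
use strict;
use warnings;

# List context: collect every capture made by the lookahead under /g.
my @captures = "123" =~ /(?=(\d+))/g;
print join("-", @captures), "\n";    # 123-23-3

# Scalar context: record where each zero-length match ended.
my $str = "123";
my @seen;
while ($str =~ /(?=(\d+))/g) {
    push @seen, [ pos($str), $1 ];
}
print "pos=$_->[0] capture=$_->[1]\n" for @seen;
# pos=0 capture=123
# pos=1 capture=23
# pos=2 capture=3
```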
The point here is that it makes perfect sense to disable the
"flip-flop" mode optimisation when the start class is inside
a lookahead, as it really doesn't apply.
Diffstat (limited to 't')
-rw-r--r-- | t/re/pat_rt_report.t | 9 |
1 files changed, 8 insertions, 1 deletions
diff --git a/t/re/pat_rt_report.t b/t/re/pat_rt_report.t
index e63cd3bc3a..df99d9cd77 100644
--- a/t/re/pat_rt_report.t
+++ b/t/re/pat_rt_report.t
@@ -21,7 +21,7 @@ BEGIN {
 }
 
 
-plan tests => 2511; # Update this when adding/deleting tests.
+plan tests => 2512; # Update this when adding/deleting tests.
 
 run_tests() unless caller;
 
@@ -1218,6 +1218,13 @@ sub run_tests {
         iseq($w,undef);
     }
 
+    {
+        local $BugId = 68564; # minimal CURLYM limited to 32767 matches
+        local $Message = "stclass optimisation does not break + inside (?=)";
+        iseq join("-", " abc def " =~ /(?=(\S+))/g),
+             "abc-bc-c-def-ef-f",
+    }
+
 } # End of sub run_tests
 
 1;