author | David Mitchell <davem@iabyn.com> | 2016-08-24 13:21:04 +0100 |
---|---|---|
committer | David Mitchell <davem@iabyn.com> | 2016-08-24 13:30:33 +0100 |
commit | 71a9d1055562b01938400494965dac70b3a685c5 (patch) | |
tree | ec9af7279d3a39d2503db4736ae12a7ba20a60c5 /t | |
parent | f82c7fdb5e25e4e2974e9e3c5519a3d41b00ae4c (diff) | |
re_intuit_start(): avoid overshoot with utf8
RT #129012
re_intuit_start() is run before doing a "proper" regex match, either to
quickly reject a match or to find the earliest position in a string where
the match could occur. Part of its job is to search within the string
for a known substring which forms part of the pattern.
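
The pre-check described above can be illustrated with a minimal C sketch. This is not the code in regexec.c: `find_bytes` and `earliest_start` are hypothetical names, and the sketch assumes the simplest case, where the known substring sits at a fixed byte offset from the start of any match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: first occurrence of needle in hay[0..hay_len), or NULL. */
static const char *find_bytes(const char *hay, size_t hay_len,
                              const char *needle, size_t needle_len)
{
    if (needle_len == 0) return hay;
    if (needle_len > hay_len) return NULL;
    /* Last start position at which the needle still fits entirely. */
    const char *last = hay + (hay_len - needle_len);
    for (const char *p = hay; p <= last; p++) {
        if (memcmp(p, needle, needle_len) == 0) return p;
    }
    return NULL;
}

/* Hypothetical pre-check: every match is assumed to contain `must` exactly
 * `offset` bytes after the match start. Returns the earliest position where
 * a match could begin, or NULL if no match is possible at all. */
static const char *earliest_start(const char *s, size_t len,
                                  const char *must, size_t must_len,
                                  size_t offset)
{
    assert(s != NULL && must != NULL);
    if (offset > len) return NULL;
    /* The substring cannot start before s + offset, so search from there. */
    const char *hit = find_bytes(s + offset, len - offset, must, must_len);
    return hit ? hit - offset : NULL;
}
```

A NULL result means the full matcher never needs to run; a non-NULL result only narrows down where it should start.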
If that substring is utf8, with multiple bytes per character, then
the calculation of the highest point in the string where it's worth
searching for the substring could overshoot the end of the string.
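
One way such an overshoot can arise is by mixing character counts with byte counts. The C sketch below is an illustration under that assumption, not the actual regexec.c calculation; `hop_back`, `search_limit_naive` and `search_limit_clamped` are hypothetical names. It uses byte offsets rather than pointers so that the out-of-range value can be computed safely.

```c
#include <assert.h>
#include <stddef.h>

/* Step back `nchars` UTF-8 characters from byte offset `pos`, stopping at 0.
 * Assumes `s` holds valid UTF-8. */
static size_t hop_back(const unsigned char *s, size_t pos, size_t nchars)
{
    assert(s != NULL);
    while (nchars > 0 && pos > 0) {
        pos--;
        /* Skip continuation bytes (10xxxxxx) to reach the lead byte. */
        while (pos > 0 && (s[pos] & 0xC0) == 0x80) pos--;
        nchars--;
    }
    return pos;
}

/* Naive upper limit of the search window: back up from the end by the
 * needle's length in characters, then add its length in bytes. If the tail
 * of the string uses fewer bytes per character than the needle does, the
 * result lies beyond `len`. */
static size_t search_limit_naive(const unsigned char *s, size_t len,
                                 size_t needle_chars, size_t needle_bytes)
{
    return hop_back(s, len, needle_chars) + needle_bytes;
}

/* Same calculation, clamped so that it never exceeds the string length. */
static size_t search_limit_clamped(const unsigned char *s, size_t len,
                                   size_t needle_chars, size_t needle_bytes)
{
    size_t limit = search_limit_naive(s, len, needle_chars, needle_bytes);
    return limit > len ? len : limit;
}
```

With the values from the new test, the subject "\x{100}0000000" is 9 bytes long and ends in one-byte characters, while "\x80\x80\x80" encoded as utf8 is 3 characters but 6 bytes. Backing up 3 characters from the end moves only 3 bytes, so the naive limit lands at byte offset 12, three bytes past the end; the clamped version returns 9.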
This is a (mostly) harmless issue. Apart from reading a few bytes beyond
the end of the string (which might cause a problem if, for example, the
string is memory mapped), the only concern is that in theory (although
it is extremely unlikely) a spurious match for the substring could be
found partly beyond the end of the string. The full RE engine would then
be called to (correctly) do the match, when otherwise the match could
have been rejected more quickly.
Diffstat (limited to 't')
-rw-r--r-- | t/re/pat_rt_report.t | 12 |
-rw-r--r-- | t/re/re_tests | 1 |
2 files changed, 12 insertions, 1 deletions
diff --git a/t/re/pat_rt_report.t b/t/re/pat_rt_report.t
index cb09360f4d..addb3e226c 100644
--- a/t/re/pat_rt_report.t
+++ b/t/re/pat_rt_report.t
@@ -20,7 +20,7 @@ use warnings;
 use 5.010;
 use Config;
 
-plan tests => 2500; # Update this when adding/deleting tests.
+plan tests => 2501; # Update this when adding/deleting tests.
 
 run_tests() unless caller;
 
@@ -1113,6 +1113,16 @@ EOP
         my $s = "\x{1ff}" . "f" x 32;
         ok($s =~ /\x{1ff}[[:alpha:]]+/gca, "POSIXA pointer wrap");
     }
+
+    {
+        # RT #129012 heap-buffer-overflow Perl_fbm_instr.
+        # This test is unlikely to not pass, but it used to fail
+        # ASAN/valgrind
+
+        my $s ="\x{100}0000000";
+        ok($s !~ /00000?\x80\x80\x80/, "RT #129012");
+    }
+
 } # End of sub run_tests
 
 1;
diff --git a/t/re/re_tests b/t/re/re_tests
index b72b18a913..35948b3c23 100644
--- a/t/re/re_tests
+++ b/t/re/re_tests
@@ -1968,6 +1968,7 @@ ab(?#Comment){2}c abbc y $& abbc
 (?:.||)(?|)000000000@ 000000000@ y $& 000000000@ # [perl #126405]
 aa$|a(?R)a|a aaa y $& aaa # [perl 128420] recursive matches
 (?:\1|a)([bcd])\1(?:(?R)|e)\1 abbaccaddedcb y $& abbaccaddedcb # [perl 128420] recursive match with backreferences
+AB\s+\x{100} AB \x{100}X y - -
 
 # Keep these lines at the end of the file
 # vim: softtabstop=0 noexpandtab