author | Karl Williamson <khw@cpan.org> | 2018-03-06 12:32:58 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2018-03-06 14:37:46 -0700 |
commit | fb7e725522eb400ba57f680cea29799ad5c8e4ac (patch) | |
tree | e7834a0080d74cd24da268e4921731e424598295 /t | |
parent | e1168c111a3dbbf8383f433d611b13168096d280 (diff) | |
PATCH: [perl #132163] regex assertion failure

The original test case in this ticket has already been fixed; but
modifying it slightly showed some other issues that are now fixed by
this commit.

The deepest problem is that this code in some paths creates a string to
parse instead of the original pattern. And in some cases, it's not even
the original pattern, but something that had already been created to
parse instead of the pattern. Any messages that are raised should be
output in terms of the original. regcomp.c already has the
infrastructure to handle the case where a message is raised during
parsing of a constructed string, but it can't handle a 2nd level
constructed string. That was what led to the segfault in the original
ticket. Unrelated fixes caused the original ticket to no longer be
applicable, and so this fix adds tests for things that would still cause a
problem.

The method chosen here is to just make sure that the string constructed
here to parse is error free, so no messages will be raised. Instead it
does the error checking as it constructs the string, so if what is being
parsed to construct a new string is an already constructed one, the
existing infrastructure handles outputting the message relative to the
original pattern. Since what is being parsed is a series of hex
numbers, it's easy to find out what their values are: just accumulate a
total, shifting 4 bits each time through the loop. A side benefit is
that this fixes some unreported bugs dealing with an input code point
that overflows. Prior to this patch, it would error ungracefully.
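
The accumulate-and-shift idea described above can be sketched roughly as
follows. This is not the code from regcomp.c; it is a minimal standalone
illustration, and every name in it (hex_status, CP_MAX, parse_one_hex,
parse_hex_sequence) is hypothetical. It assumes the 64-bit limit of
0x7fffffffffffffff shown in the tests, accepts a single underscore only
between two hex digits, and treats '.' as the separator between code
points. It reports only a status, not the position of the error.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes for this sketch; not Perl's own. */
enum hex_status { HEX_OK, HEX_INVALID, HEX_TOO_LARGE };

/* Assumed limit: the permissible max reported by the 64-bit tests. */
#define CP_MAX ((uint64_t)INT64_MAX)

/* Return the value of one hex digit, or -1 if c is not a hex digit. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse one hex number from s[*pos .. len), stopping at '.' or the end.
 * On return *pos indexes the '.' or equals len. */
enum hex_status parse_one_hex(const char *s, size_t len,
                              size_t *pos, uint64_t *out)
{
    uint64_t total = 0;
    size_t i;
    int ndigits = 0;
    int prev_underscore = 0;

    assert(s != NULL && pos != NULL && out != NULL);

    for (i = *pos; i < len && s[i] != '.'; i++) {
        if (s[i] == '_') {
            /* An underscore must follow a digit, never another '_'. */
            if (ndigits == 0 || prev_underscore) return HEX_INVALID;
            prev_underscore = 1;
            continue;
        }
        int d = hex_digit_value(s[i]);
        if (d < 0) return HEX_INVALID;
        /* Check before shifting: another 4 bits would exceed CP_MAX. */
        if (total > (CP_MAX >> 4)) return HEX_TOO_LARGE;
        total = (total << 4) | (uint64_t)d;
        ndigits++;
        prev_underscore = 0;
    }
    *pos = i;
    /* Reject an empty number and a trailing underscore. */
    if (ndigits == 0 || prev_underscore) return HEX_INVALID;
    *out = total;
    return HEX_OK;
}

/* Parse a '.'-separated list of hex numbers, such as the text between
 * "U+" and "}". Stores at most max_out values; *count gets the total. */
enum hex_status parse_hex_sequence(const char *s, uint64_t *out,
                                   size_t max_out, size_t *count)
{
    size_t len = strlen(s);
    size_t pos = 0;

    *count = 0;
    for (;;) {
        uint64_t cp;
        enum hex_status st = parse_one_hex(s, len, &pos, &cp);
        if (st != HEX_OK) return st;
        if (*count < max_out) out[*count] = cp;
        (*count)++;
        if (pos >= len) return HEX_OK;
        pos++;  /* skip the '.' and parse the next number */
    }
}
```

Because the overflow test runs before each shift, the total never wraps
around, so an over-long input is rejected with a status instead of
silently producing a wrong value. With this sketch, "100.7FFFFFFFFFFFFFFF"
yields two code points, while "1_0000_0000_0000_0000" is rejected as too
large and ".", "100.", "_100" and "100_" are rejected as invalid.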
Diffstat (limited to 't')
-rw-r--r-- | t/lib/croak/regcomp | 56 |
1 files changed, 56 insertions, 0 deletions
diff --git a/t/lib/croak/regcomp b/t/lib/croak/regcomp
new file mode 100644
index 0000000000..19586d53ee
--- /dev/null
+++ b/t/lib/croak/regcomp
@@ -0,0 +1,56 @@
+__END__
+# NAME \N{U+too large} on 64-bit machine
+# SKIP ? use Config; $Config{uvsize} < 8 && "Not 64 bit"
+qr/\N{U+7FFFFFFFFFFFFFFF}/;
+qr/\N{U+1_0000_0000_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000_0000_0000 is not allowed; the permissible max is 0x7fffffffffffffff in regex; marked by <-- HERE in m/\N{U+1_0000_0000_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+too large} on 32-bit machine
+# SKIP ? use Config; $Config{uvsize} > 4 && "Not 32 bit"
+qr/\N{U+7FFFFFFF}/;
+qr/\N{U+1_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000 is not allowed; the permissible max is 0x7fffffff in regex; marked by <-- HERE in m/\N{U+1_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+100.too large} on 64-bit machine
+# SKIP ? use Config; $Config{uvsize} < 8 && "Not 64 bit"
+qr/\N{U+100.7FFFFFFFFFFFFFFF}/;
+qr/\N{U+100.1_0000_0000_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000_0000_0000 is not allowed; the permissible max is 0x7fffffffffffffff in regex; marked by <-- HERE in m/\N{U+100.1_0000_0000_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+100.too large} on 32-bit machine
+# SKIP ? use Config; $Config{uvsize} > 4 && "Not 32 bit"
+qr/\N{U+100.7FFFFFFF}/;
+qr/\N{U+100.1_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000 is not allowed; the permissible max is 0x7fffffff in regex; marked by <-- HERE in m/\N{U+100.1_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+.}
+my $p00="\\N{U+.}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+. <-- HERE }/ at - line 1.
+########
+# NAME \N{U+100.}
+my $p00="\\N{U+100.}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+100. <-- HERE }/ at - line 1.
+########
+# NAME \N{U+_100}
+my $p00="\\N{U+_100}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+_ <-- HERE 100}/ at - line 1.
+########
+# NAME \N{U+100_}
+my $p00="\\N{U+100_}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+100_ <-- HERE }/ at - line 1.
+########
+# NAME [ß\N{U+.}]
+my $p00="[ß\\N{U+.}]"; qr/$p00/ui;
+# The sharp s under /i recodes the parse, and this was causing a segfault when
+# the error message referred to the original pattern
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/[ß\N{U+. <-- HERE }]/ at - line 1.
+########