author Karl Williamson <khw@cpan.org> 2018-03-06 12:32:58 -0700
committer Karl Williamson <khw@cpan.org> 2018-03-06 14:37:46 -0700
commit fb7e725522eb400ba57f680cea29799ad5c8e4ac
tree e7834a0080d74cd24da268e4921731e424598295 /t
parent e1168c111a3dbbf8383f433d611b13168096d280
PATCH: [perl #132163] regex assertion failure
The original test case in this ticket has already been fixed, but modifying it slightly showed some other issues that are now fixed by this commit.

The deepest problem is that this code, in some paths, creates a string to parse instead of the original pattern. In some cases the input is not even the original pattern, but something that had itself already been constructed to parse in place of the pattern. Any messages that are raised should be output in terms of the original. regcomp.c already has the infrastructure to handle the case where a message is raised during parsing of a constructed string, but it can't handle a second-level constructed string. That was what led to the segfault in the original ticket. Unrelated fixes caused the original ticket to no longer be applicable, so this fix adds tests for things that would still cause a problem.

The method chosen here is to make sure that the string constructed here to parse is error free, so no messages will be raised while parsing it. Instead, the error checking is done as the string is constructed, so if what is being parsed to construct a new string is itself an already constructed one, the existing infrastructure handles outputting the message relative to the original pattern. Since what is being parsed is a series of hex numbers, it's easy to find out what their values are: just accumulate a total, shifting 4 bits each time through the loop.

A side benefit is that this fixes some unreported bugs dealing with an input code point that overflows. Prior to this patch, it would error ungracefully.
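The accumulate-and-shift check described above can be illustrated with a small standalone sketch. This is not the code from regcomp.c: the names `parse_hex_code_point`, `hex_status` and `hex_digit_value` are hypothetical, and the underscore rule (a single `_` is accepted only directly after a hex digit and must be followed by another digit) is an assumption inferred from the tests in this commit. A caller would pass each dot-separated piece of `\N{U+...}` separately, so an empty piece such as the ones in `\N{U+.}` is rejected.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
typedef enum { HEX_OK, HEX_INVALID, HEX_TOO_LARGE } hex_status;

/* Return the value of a hex digit, or -1 if c is not one. */
static int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parse one hex number of length len, validating while accumulating.
 * max is the largest permitted value (e.g. 0x7fffffffffffffff). */
hex_status parse_hex_code_point(const char *s, size_t len,
                                uint64_t max, uint64_t *out)
{
    uint64_t value = 0;
    int prev_was_digit = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        int d;

        if (s[i] == '_') {
            /* Assumed rule: '_' is legal only right after a digit. */
            if (!prev_was_digit) return HEX_INVALID;
            prev_was_digit = 0;
            continue;
        }

        d = hex_digit_value(s[i]);
        if (d < 0) return HEX_INVALID;

        /* Shifting left by 4 would exceed max: report overflow
         * before it can happen. */
        if (value > (max >> 4)) return HEX_TOO_LARGE;
        value = (value << 4) | (uint64_t)d;
        if (value > max) return HEX_TOO_LARGE;

        prev_was_digit = 1;
    }

    /* Empty input, or input ending in '_', is not a valid number. */
    if (!prev_was_digit) return HEX_INVALID;

    *out = value;
    return HEX_OK;
}
```

Because every digit is validated as it is consumed, the position of a bad character or of the overflow is known relative to the input actually being read, which is the property the commit message relies on.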
Diffstat (limited to 't')
-rw-r--r-- t/lib/croak/regcomp 56
1 files changed, 56 insertions, 0 deletions
diff --git a/t/lib/croak/regcomp b/t/lib/croak/regcomp
new file mode 100644
index 0000000000..19586d53ee
--- /dev/null
+++ b/t/lib/croak/regcomp
@@ -0,0 +1,56 @@
+__END__
+# NAME \N{U+too large} on 64-bit machine
+# SKIP ? use Config; $Config{uvsize} < 8 && "Not 64 bit"
+qr/\N{U+7FFFFFFFFFFFFFFF}/;
+qr/\N{U+1_0000_0000_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000_0000_0000 is not allowed; the permissible max is 0x7fffffffffffffff in regex; marked by <-- HERE in m/\N{U+1_0000_0000_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+too large} on 32-bit machine
+# SKIP ? use Config; $Config{uvsize} > 4 && "Not 32 bit"
+qr/\N{U+7FFFFFFF}/;
+qr/\N{U+1_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000 is not allowed; the permissible max is 0x7fffffff in regex; marked by <-- HERE in m/\N{U+1_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+100.too large} on 64-bit machine
+# SKIP ? use Config; $Config{uvsize} < 8 && "Not 64 bit"
+qr/\N{U+100.7FFFFFFFFFFFFFFF}/;
+qr/\N{U+100.1_0000_0000_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000_0000_0000 is not allowed; the permissible max is 0x7fffffffffffffff in regex; marked by <-- HERE in m/\N{U+100.1_0000_0000_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+100.too large} on 32-bit machine
+# SKIP ? use Config; $Config{uvsize} > 4 && "Not 32 bit"
+qr/\N{U+100.7FFFFFFF}/;
+qr/\N{U+100.1_0000_0000}/;
+EXPECT
+Use of code point 0x1_0000_0000 is not allowed; the permissible max is 0x7fffffff in regex; marked by <-- HERE in m/\N{U+100.1_0000_0000 <-- HERE }/ at - line 2.
+########
+# NAME \N{U+.}
+my $p00="\\N{U+.}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+. <-- HERE }/ at - line 1.
+########
+# NAME \N{U+100.}
+my $p00="\\N{U+100.}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+100. <-- HERE }/ at - line 1.
+########
+# NAME \N{U+_100}
+my $p00="\\N{U+_100}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+_ <-- HERE 100}/ at - line 1.
+########
+# NAME \N{U+100_}
+my $p00="\\N{U+100_}"; qr/$p00/;
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/\N{U+100_ <-- HERE }/ at - line 1.
+########
+# NAME [ß\N{U+.}]
+my $p00="[ß\\N{U+.}]"; qr/$p00/ui;
+# The sharp s under /i recodes the parse, and this was causing a segfault when
+# the error message referred to the original pattern
+EXPECT
+Invalid hexadecimal number in \N{U+...} in regex; marked by <-- HERE in m/[ß\N{U+. <-- HERE }]/ at - line 1.
+########