author: David Mitchell <davem@iabyn.com> 2015-04-15 08:47:18 +0100
committer: David Mitchell <davem@iabyn.com> 2015-04-15 08:47:18 +0100
commit: c7f317a9270a52c9028667b8adec18e94f450586 (patch)
tree: 836eab4f0534a79f8c51fa6aa8e9e0b8fee5b097 /t
parent: 2c0445268a1bb7696e04b8b9b324c3d6880bb18a (diff)
assertion failure on interpolated parse err
RT# 124216

When parsing the interpolated string "$X", where X is a unicode char that
is not a legal variable name, failure to restore things properly during
error recovery led to corrupted state and assertion failures.

In more detail:

When parsing a double-quoted string, S_sublex_push() saves most of the
current parser state. On parse error, the save stack is popped back,
which restores all that state. However, PL_lex_defer wasn't being saved,
so if we were in the middle of handling a forced token, PL_lex_state gets
restored from PL_lex_defer, and suddenly the lexer thinks we're back
inside an interpolated string again. So S_sublex_done() gets called
multiple times, too many scopes are popped, and things like PL_compcv are
freed prematurely.

Note that in order to reproduce:

* we must be within a double quoted context;
* we must be parsing a var (which causes a forced token);
* the variable name must be illegal, which implies unicode, as
  chr(0..255) are all legal names;
* the terminating string quote must be the last char of the input file,
  as this code:

      case LEX_INTERPSTART:
          if (PL_bufptr == PL_bufend)
              return REPORT(sublex_done());

  won't trigger an extra call to sublex_done() otherwise.

I'm sure this bug affects other cases too, but this was the only way I
found to reproduce.
Diffstat (limited to 't')
-rw-r--r-- t/uni/parser.t 26
1 files changed, 25 insertions, 1 deletions
diff --git a/t/uni/parser.t b/t/uni/parser.t
index 9c3994364b..3d89249b75 100644
--- a/t/uni/parser.t
+++ b/t/uni/parser.t
@@ -9,7 +9,7 @@ BEGIN {
skip_all_without_unicode_tables();
}
-plan (tests => 51);
+plan (tests => 52);
use utf8;
use open qw( :utf8 :std );
@@ -197,3 +197,27 @@ like( $@, qr/Bad name after Foo'/, 'Bad name after Foo\'' );
CORE::evalbytes "use charnames ':full'; use utf8; my \$x = \"\\N{abc$malformed_to_be}\"";
like( $@, qr/Malformed UTF-8 character immediately after '\\N\{abc' at .* within string/, 'Malformed UTF-8 input to \N{}');
}
+
+# RT# 124216: Perl_sv_clear: Assertion
+# If a parsing error occurred during a forced token within an interpolated
+# context, the stack unwinding failed to restore PL_lex_defer and so after
+# error recovery the state restored after the forced token was processed
+# was the wrong one, resulting in the lexer thinking we're still inside a
+# quoted string and things getting freed multiple times.
+#
+# \xe3\x80\xb0 are the utf8 bytes making up the character \x{3030}.
+# The \x{3030} char isn't a legal var name, and this triggers the error.
+#
+# NB: this only failed if the closing quote of the interpolated string is
+# the last char of the file (i.e. no trailing \n).
+
+{
+ no utf8;
+
+ fresh_perl_is(qq{use utf8; "\$\xe3\x80\xb0"}, <<EOF, { stderr => 1},
+Wide character in print at - line 1.\
+syntax error at - line 1, near "\$\xe3\x80\xb0"
+Execution of - aborted due to compilation errors.
+EOF
+ "RT# 124216");
+}