author | David Mitchell <davem@iabyn.com> | 2015-04-15 08:47:18 +0100 |
---|---|---|
committer | David Mitchell <davem@iabyn.com> | 2015-04-15 08:47:18 +0100 |
commit | c7f317a9270a52c9028667b8adec18e94f450586 (patch) | |
tree | 836eab4f0534a79f8c51fa6aa8e9e0b8fee5b097 /t | |
parent | 2c0445268a1bb7696e04b8b9b324c3d6880bb18a (diff) | |
download | perl-c7f317a9270a52c9028667b8adec18e94f450586.tar.gz |
assertion failure on interpolated parse err
RT# 124216
When parsing the interpolated string "$X", where X is a unicode char that
is not a legal variable name, failure to restore things properly during
error recovery led to corrupted state and assertion failures.
In more detail:
When parsing a double-quoted string, S_sublex_push() saves most of the
current parser state. On parse error, the save stack is popped back,
which restores all that state. However, PL_lex_defer wasn't being saved,
so if we were in the middle of handling a forced token, PL_lex_state gets
restored from PL_lex_defer, and suddenly the lexer thinks we're back
inside an interpolated string again. So S_sublex_done() gets called
multiple times, too many scopes are popped, and things like PL_compcv are
freed prematurely.
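
The sequence described above can be sketched with a toy model in C. This is an illustration only, not perl's actual code: `toy_lexer`, `toy_sublex_push()`, `toy_force_token()`, `toy_unwind()`, `toy_forced_token_done()` and `toy_scenario()` are hypothetical names, and the state enum only borrows its spelling from the commit message. The `save_defer` flag stands in for the difference between the buggy behaviour (deferred state not saved) and the repair the message implies (deferred state saved along with everything else).

```c
#include <assert.h>
#include <stddef.h>

/* Toy states; the names echo the commit message but this is NOT perl's layout. */
enum toy_state { TOY_LEX_NORMAL, TOY_LEX_INTERPNORMAL, TOY_LEX_KNOWNEXT };

struct toy_lexer {
    enum toy_state lex_state;  /* what the lexer is doing now           */
    enum toy_state lex_defer;  /* state to resume after a forced token  */
};

/* One save-stack frame, as pushed when a quoted string is entered. */
struct toy_saved {
    enum toy_state lex_state;
    enum toy_state lex_defer;
    int defer_saved;           /* 0 models the bug, 1 models the repair */
};

/* Enter a double-quoted string: snapshot state, then switch to interpolation. */
static struct toy_saved toy_sublex_push(struct toy_lexer *lx, int save_defer)
{
    struct toy_saved s = { lx->lex_state, lx->lex_defer, save_defer };
    lx->lex_state = TOY_LEX_INTERPNORMAL;
    return s;
}

/* Force a token: remember the current state so it can be resumed later. */
static void toy_force_token(struct toy_lexer *lx)
{
    lx->lex_defer = lx->lex_state;
    lx->lex_state = TOY_LEX_KNOWNEXT;
}

/* Parse error: pop the save stack, restoring only what was saved. */
static void toy_unwind(struct toy_lexer *lx, const struct toy_saved *s)
{
    assert(lx != NULL && s != NULL);
    lx->lex_state = s->lex_state;
    if (s->defer_saved)
        lx->lex_defer = s->lex_defer;
}

/* The pending forced token is finished: resume from the deferred state. */
static void toy_forced_token_done(struct toy_lexer *lx)
{
    lx->lex_state = lx->lex_defer;
}

/* Run the whole scenario and report the state the lexer ends up in. */
enum toy_state toy_scenario(int save_defer)
{
    struct toy_lexer lx = { TOY_LEX_NORMAL, TOY_LEX_NORMAL };
    struct toy_saved s = toy_sublex_push(&lx, save_defer);

    toy_force_token(&lx);        /* "$X" starts a variable               */
    toy_unwind(&lx, &s);         /* X is not a legal name: error recovery */
    toy_forced_token_done(&lx);  /* stale or restored lex_defer is used   */
    return lx.lex_state;
}
```

In this model `toy_scenario(0)` ends in `TOY_LEX_INTERPNORMAL`, i.e. the lexer believes it is still inside the string even though the frame has been popped, while `toy_scenario(1)` ends in `TOY_LEX_NORMAL`.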
Note that in order to reproduce:
    * we must be within a double quoted context;
    * we must be parsing a var (which causes a forced token);
    * the variable name must be illegal, which implies unicode, as
      chr(0..255) are all legal names;
    * the terminating string quote must be the last char of the input
      file, as this code:
            case LEX_INTERPSTART:
                if (PL_bufptr == PL_bufend)
                    return REPORT(sublex_done());
      won't trigger an extra call to sublex_done() otherwise.
I'm sure this bug affects other cases too, but this was the only way I
found to reproduce.
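
Because the last condition depends on the exact final byte of the file, it may help to generate the input programmatically rather than with an editor that appends a newline. The following is a minimal sketch in C, assuming the same program text as the test added by this commit; `repro_input`, `repro_len()` and `write_repro()` are hypothetical names used only for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* The program text used by the new test: use utf8; "$<U+3030>"
 * \xe3\x80\xb0 are the UTF-8 bytes of U+3030. There is deliberately
 * no trailing newline after the closing double quote. */
static const unsigned char repro_input[] = "use utf8; \"$\xe3\x80\xb0\"";

/* Length of the input without the C string terminator. */
size_t repro_len(void)
{
    return sizeof repro_input - 1;
}

/* Write the bytes verbatim to `path`; returns 0 on success, -1 on failure. */
int write_repro(const char *path)
{
    FILE *fp = fopen(path, "wb");   /* binary mode: no newline translation */
    if (fp == NULL)
        return -1;

    size_t written = fwrite(repro_input, 1, repro_len(), fp);
    int close_failed = fclose(fp);

    return (written == repro_len() && !close_failed) ? 0 : -1;
}
```

Calling `write_repro()` with a path of your choice and running the resulting file under an affected perl build is one way to exercise the conditions listed above; the closing quote is the final byte of the file.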
Diffstat (limited to 't')
-rw-r--r-- | t/uni/parser.t | 26 |
1 files changed, 25 insertions, 1 deletions
diff --git a/t/uni/parser.t b/t/uni/parser.t
index 9c3994364b..3d89249b75 100644
--- a/t/uni/parser.t
+++ b/t/uni/parser.t
@@ -9,7 +9,7 @@ BEGIN {
     skip_all_without_unicode_tables();
 }
 
-plan (tests => 51);
+plan (tests => 52);
 
 use utf8;
 use open qw( :utf8 :std );
@@ -197,3 +197,27 @@ like( $@, qr/Bad name after Foo'/, 'Bad name after Foo\'' );
     CORE::evalbytes "use charnames ':full'; use utf8; my \$x = \"\\N{abc$malformed_to_be}\"";
     like( $@, qr/Malformed UTF-8 character immediately after '\\N\{abc' at .* within string/, 'Malformed UTF-8 input to \N{}');
 }
+
+# RT# 124216: Perl_sv_clear: Assertion
+# If a parsing error occurred during a forced token within an interpolated
+# context, the stack unwinding failed to restore PL_lex_defer and so after
+# error recovery the state restored after the forced token was processed
+# was the wrong one, resulting in the lexer thinking we're still inside a
+# quoted string and things getting freed multiple times.
+#
+# \xe3\x80\xb0 are the utf8 bytes making up the character \x{3030}.
+# The \x{3030} char isn't a legal var name, and this triggers the error.
+#
+# NB: this only failed if the closing quote of the interpolated string is
+# the last char of the file (i.e. no trailing \n).
+
+{
+    no utf8;
+
+    fresh_perl_is(qq{use utf8; "\$\xe3\x80\xb0"}, <<EOF, { stderr => 1},
+Wide character in print at - line 1.\
+syntax error at - line 1, near "\$\xe3\x80\xb0"
+Execution of - aborted due to compilation errors.
+EOF
+        "RT# 124216");
+}