| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When parsing the special ${var{subscript}} syntax, the lexer notes
that the } matching the ${ will be a fake bracket, and should
be ignored.
In the case of ${p{};sub p}() the first syntax error causes tokens to
be popped, such that the } following the sub declaration ends up being
the one treated as a fake bracket and ignored.
The part of the lexer that deals with sub declarations treats a ( fol-
lowing the sub name as a prototype (which is a single term) if signa-
tures are disabled, but ignores it and allows the rest of the lexer to
treat it as a parenthesis if signatures are enabled.
Hence, the part of the parser (perly.y) that parses signatures knows
that a parenthesis token can only come after a sub if signatures are
enabled, and asserts as much.
In the case of an error and tokens being discarded, a parenthesis may
come after a sub name as far as the parser is concerned, even though
there was a } in between that got discarded. The sub part of the
lexer, of course did not see the parenthesis because of the interven-
ing brace, and did not treat it as a prototype. So we get an asser-
tion failure.
The simplest fix is to loosen up the assertion and allow for anomalies
after errors. It does not hurt to go ahead and parse a signature at
this point, even though the feature is disabled, because there has
been a syntax error already, so the parsed code will never run, and
the parsed sub will not be installed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When I moved subroutine signature processing into perly.y with
v5.25.3-101-gd3d9da4, I added a new lexer PL_expect state, XSIGVAR.
This indicated, when about to parse a variable, that it was a signature
element rather than a my variable; in particular, it makes ($,...)
be toked as the lone sigil '$' rather than the punctuation variable '$,'.
However this is a bit heavy-handled; so instead this commit adds a
new allowed pseudo-keyword value to PL_in_my: as well as KEY_my, KEY_our and
KEY_state, it can now be KEY_sigvar. This is a less intrusive change
to the lexer.
|
|
|
|
|
|
| |
The code snippets in perly.y are #included in a C function that has a
‘parser’ local variable. Local variables require less machine code
(especially under threads).
|
|
|
|
|
| |
assigning a char from an I32 gives a warning. Add an explicit cast
as we know the int only ever holds a char.
|
|
|
|
|
|
|
|
|
| |
During the course of parsing end exection, these values get stored
as ints and UVs, then used as SSize_t.
Standardise on IVs instead. Technically they can never be negative, but
their final use is as indices into AVs, which is SSize_t, so it's
easier to standardise on a signed value throughout.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
e.g.
a slurpy parameter may not have a default value
=>
A slurpy parameter may not have a default value
Also, split the "Too %s arguments for subroutine" diagnostic into
separate "too few" and "too many" entries in perldiag.
Based on suggestions by Father Chrysostomos.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently subroutine signature parsing emits many small discrete ops
to implement arg handling. This commit replaces them with a couple of ops
per signature element, plus an initial signature check op.
These new ops are added to the OP tree during parsing, so will be visible
to hooks called up to and including peephole optimisation. It is intended
soon that the peephole optimiser will take these per-element ops, and
replace them with a single OP_SIGNATURE op which handles the whole
signature in a single go. So normally these ops wont actually get executed
much. But adding these intermediate-level ops gives three advantages:
1) it allows the parser to efficiently generate subtrees containing
individual signature elements, which can't be done if only OP_SIGNATURE
or discrete ops are available;
2) prior to optimisation, it provides a simple and straightforward
representation of the signature;
3) hooks can mess with the signature OP subtree in ways that make it
no longer possible to optimise into an OP_SIGNATURE, but which can
still be executed, deparsed etc (if less efficiently).
This code:
use feature "signatures";
sub f($a, $, $b = 1, @c) {$a}
under 'perl -MO=Concise,f' now gives:
d <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->d
1 <;> nextstate(main 84 foo:6) v:%,469762048 ->2
2 <+> argcheck(3,1,@) v ->3
3 <;> nextstate(main 81 foo:6) v:%,469762048 ->4
4 <+> argelem(0)[$a:81,84] v/SV ->5
5 <;> nextstate(main 82 foo:6) v:%,469762048 ->6
8 <+> argelem(2)[$b:82,84] vKS/SV ->9
6 <|> argdefelem(other->7)[2] sK ->8
7 <$> const(IV 1) s ->8
9 <;> nextstate(main 83 foo:6) v:%,469762048 ->a
a <+> argelem(3)[@c:83,84] v/AV ->b
- <;> ex-nextstate(main 84 foo:6) v:%,469762048 ->b
b <;> nextstate(main 84 foo:6) v:%,469762048 ->c
c <0> padsv[$a:81,84] s ->d
The argcheck(3,1,@) op knows the number of positional params (3), the
number of optional params (1), and whether it has an array / hash slurpy
element at the end. This op is responsible for checking that @_ contains
the right number of args.
A simple argelem(0)[$a] op does the equivalent of 'my $a = $_[0]'.
Similarly, argelem(3)[@c] is equivalent to 'my @c = @_[3..$#_]'.
If it has a child, it gets its arg from the stack rather than using $_[N].
Currently the only used child is the logop argdefelem.
argdefelem(other->7)[2] is equivalent to '@_ > 2 ? $_[2] : other'.
[ These ops currently assume that the lexical var being introduced
is undef/empty and non-magival etc. This is an incorrect assumption and
is fixed in a few commits' time ]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the signature of a sub (i.e. the '($a, $b = 1)' bit) is parsed
in toke.c using a roll-your-own mini-parser. This commit makes
the signature be part of the general grammar in perly.y instead.
In theory it should still generate the same optree as before, except
that an OP_STUB is no longer appended to each signature optree: it's
unnecessary, and I assume that was a hangover from early development of
the original signature code.
Error messages have changed somewhat: the generic 'Parse error' has
changed to the generic 'syntax error', with the addition of ', near "xyz"'
now appended to each message.
Also, some specific error messages have been added; for example
(@a=1) now says that slurpy params can't have a default vale, rather than
just giving 'Parse error'.
It introduces a new lexer expect state, XSIGVAR, since otherwise when
the lexer saw something like '($, ...)' it would see the identifier
'$,' rather than the tokens '$' and ','.
Since it no longer uses parse_termexpr(), it is no longer subject to the
bug (#123010) associated with that; so sub f($x = print, $y) {}
is no longer mis-interpreted as sub f($x = print($_, $y)) {}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The enum value "WORD" can apparently clash with a enum value in windows
headers. It used to be worked around in windows by #defining YYTOKENTYPE,
which caused the
enum yytokentype {
...
WORD = ...;
}
in perly.h to be skipped, while still using the
#define WORD ...
which appears later in the same file.
In bison 3.x, the auto-generated perl.h no longer includes the #defines,
so this workaround no longer works.
Instead, change the name of the token from "WORD" to "BAREWORD".
|
|
|
|
|
|
|
|
|
|
| |
This applies to ‘my’, ‘our’, ‘state’ and ‘local’, and both to single
variable and lists of variables, in all their variations:
my \$a # equivalent to \my $a
my \($a,$b) # equivalent to \my($a, $b)
my (\($a,$b)) # same
my (\$a, $b) # equivalent to (\my $a, $b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Perl 5.000, the same token, LOCAL, was used for both ‘my’ and
‘local’, with a token value, passed to localize() as a second argu-
ment, to distinguish between them.
perl-5.003_07-9-g55497cf (inseparable changes from patch from
perl5.003_07 to perl5.003_08), for no apparent reason, split them into
two tokens, removing the token values and assigning values in perly.y
via $$ = 0 and $$ = 1. They still ultimately made their way through
the same grammar rule, as there was only one localize() call in
perly.y. The code still made sense.
perl-5.005_02-1816-g09bef84 (sub : attrlist) changed things, such that
the tokens are separate *and* they get separate token values assigned
to them. ‘local’ and ‘my’ no longer follow the same grammar rules
in perly.y, so there are separate localize() calls for the different
token types. Hence, the use of a token value to distinguish them does
not make sense. It just makes this more complicated that necessary.
So this commit removes the token values. Since the two token types
follow different paths through perly.y and have separate localize()
calls, we can hard-code the argument to localize() there, instead of
passing the value through from toke.c as a token value.
This does shrink toke.o slightly (for me it went from 876040 to
876000), and it makes this conceptually clearer.
|
|
|
|
|
|
| |
Mainly it no longer generates some tables used for debugging.
This commit also adds a new define showing what bison version was used.
|
|
|
|
|
|
| |
Previously the assignment was hidden by the not op wrapped around the
condition, but newCONDOP() is sufficiently flexible that it isn't
needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is dead code that used to allow
my $_;
...
given ($foo) {
# lexical $_ aliased to $foo here
}
Now that lexical $_ has been removed, remove the code. I've left the
signatures of the newFOO() functions unchanged; they just expect a target
of 0 to always be passed now.
|
|
|
|
|
|
| |
This token’s type is never used. We don’t bother setting the type,
either, in toke.c, so it will be garbage. Removing the type makes
it harder to use the garbage value by mistake in refactoring.
|
|
|
|
|
|
|
| |
These two tokens never use their value, and the value is not even set
in toke.c, which means it will contain a junk value from some previous
token. Removing the types prevents that junk value from being acci-
dentally used.
|
|
|
|
| |
Yay, the semicolons are back.
|
|
|
|
|
|
|
|
|
| |
This moves signatures before attributes in the grammar by
creating separate branches for the prototype and signatures
cases, so that the introduced block and the fact that signatures
do not allow for declarations can be handled properly.
Tests and regen_perly to follow.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These two operators were being translated into subst("","") and
tr("","") by the lexer. Then pmruntime in op.c would take apart the
resulting list op. Instead of constructing a list op only to take it
apart again, feed the replacement part to pmruntime separately. We
can achieve this by introducing a new token ('/') that the parser rec-
ognizes as introducing a replacement.
If we had followed this approach to begin with, then bug #123542 would
never have happened.
(Actually, it seems the parser did know about the replacement part to
begin with, but it changed in perl-5.8.0-4047-g131b3ad to fix some
overloading problems.)
|
|
|
|
|
|
| |
ck_spair also applies lvalue context to the kid ops, so we just end up
calling op_lvalue twice on the same ops. It’s harmless (being idempo-
tent), but wasteful.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Formats need the same logic applied in 34b54951568, to avoid this:
Input:
{
my $x;
format STDOUT =
@
$x
.
}
__END__
$ pbpaste|./perl -Ilib -MO=Deparse
{
my $x;
}
format STDOUT =
@
$x
.
__DATA__
- syntax OK
That $x in the format is now global, not lexical.
Also, we need to make sure the sequence numbers are correct. The
statement after the format stopped having a higher sequence number
than CvOUTSIDE_SEQ(format) in 8635e3c238f, because the ‘pending’
sequence number set aside for the nextstate op created just after com-
pile-time block exit (which nextstate op comes before the block) was
not actually being used by newFORM (unlike newATTRSUB), so the follow-
ing statement was using that number.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A block in perl usually consists of an enter/leave pair plus the con-
tents of the block:
leave
enter
nextstate
whatever
But if the contents of the block are simple enough to forego the
full block structure, a simple scope op is used, which is not
even executed:
scope
ex-nextstate
whatever
If there is a real nextstate op anywhere in the block, it resets the
stack to whatever it was at block entry, based on the value on the
context stack placed there by the enter op. That’s why we can never
have scope+nextstate (we have ex-nextstate, or a former nextstate op
that is not executed).
A for-loop (for(init; cond; cont) { ... }) executes the init section
first, and then an unstack op, which is like nextstate in that it
resets the stack based on what the context stack says is the base off-
set for this block.
If we have an unstack op, we can’t use scope, just as we can’t use it
with nextstate. But we *were* nonetheless using scope in this case.
Hence, map { for(...;...;...) {...} } ... caused the for-loop to reset
the stack to the beginning of map’s own arguments. So the for-loop
would stomp on them.
We can see the same bug with ‘for’ clobbering an outer list:
$ perl5.20.1 -le 'print 1..3, do{for(0;0;){}}, 4..6;'
0456
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8635e3c2 (5.21.6) changed the COP sequence numbers for nested blocks,
such that most BEGIN blocks (incl. ‘use’ statements) and sub declara-
tions end up in the right place. However, it had the side effect of
causing declarations at the end of the enclosing scope to fall out of
it and appear below.
This commit fixes that by adding an extra nulled COP to the end of the
enclosing scope if that scope ends with a sub, so the final declara-
tion gets deparsed before it.
The frequency of sub declarations at the end of the enclosing scope is
sufficiently low (I’m guessing a bit here) that this slight increase
in run-time memory usage is probably acceptable.
I had to change B::Deparse to deparse nulled COPs the same way it does
live COPs, which means we get more extraneous semicolons than before.
I hope to fix that in a forthcoming commit. I also ran into a B bug,
in that null ops are not presented to Perl code with the right op
class (see the blessing in the patch). I plan to fix that in a separ-
ate commit, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the op tree, a statement consists of a nextstate/dbstate op (of
class cop) followed by the contents of the statement. This cop is
created after the statement has been parsed. So if you have nested
statements, the outermost statement has the highest sequence number
(cop_seq). Every sub (including BEGIN blocks) has a sequence number
indicating where it occurs in its containing sub.
So
BEGIN { } #1
# seq 2
{
# seq 1
...
}
is indistinguishable from
# seq 2
{
BEGIN { } #1
# seq 1
...
}
because the sequence number of the BEGIN block is 1 in both examples.
By reserving a sequence number at the start of every block and using
it once the block has finished parsing, we can do this:
BEGIN { } #1
# seq 1
{
# seq 2
...
}
# seq 1
{
BEGIN { } #2
# seq 2
...
}
and now B::Deparse can tell where to put the blocks.
PL_compiling.cop_seq was unused, so this is where I am stashing
the pending sequence number.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When (\$x)=\$y is compiled, the \ on the lhs gives lvalue context to
its argument by calling op_lvalue. Then later the = gives lvalue con-
text to the \, calling op_lvalue again, which transforms the $x into
an lvref op (via op.c:S_lvref).
I just copied that logic when I extended aliasing via reference to
foreach \$x. But here, we don’t need to call op_lvalue on the $x,
because we know it is going to go through op.c:S_lvref, which doesn’t
care whether it has been through op_lvalue already or not. The end
result is the same.
|
|
|
|
| |
Some passing tests are still marked to-do. We need more tests still.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new opcode class, METHOP, which will hold class/method related
info needed at runtime to improve performance of class/object method
calls, then change OP_METHOD and OP_METHOD_NAMED from being UNOP/SVOP to
being METHOP.
Note that because OP_METHOD is a UNOP with an op_first, while
OP_METHOD_NAMED is an SVOP, the first field of the METHOP structure
is a union holding either op_first or op_sv. This was seen as less messy
than having to introduce two new op classes.
The new op class's character is '.'
Nothing has changed in functionality and/or performance by this commit.
It just introduces new structure which will be extended with extra
fields and used in later commits.
Added METHOP constructors:
- newMETHOP() for method ops with dynamic method names.
The only optype for this op is OP_METHOD.
- newMETHOP_named() for method ops with constant method names.
Optypes for this op are: OP_METHOD_NAMED (currently) and (later)
OP_METHOD_SUPER, OP_METHOD_REDIR, OP_METHOD_NEXT, OP_METHOD_NEXTCAN,
OP_METHOD_MAYBENEXT
(This commit includes fixups by davem)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All these code snippets are embedded inside a function
(perly.c:yyparse) that puts the current value of PL_parser in a local
variable named parser. So the two are equivalent, but the latter
only has to access a local variable.
Before:
$ ls -ld perly.o
-rw-r--r-- 1 sprout staff 94748 Aug 22 06:12 perly.o
After:
$ ls -ld perly.o
-rw-r--r-- 1 sprout staff 94340 Aug 22 06:15 perly.o
|
|
|
|
|
|
|
|
|
|
|
|
| |
When curly subscripts are parsed, the lexer (toke.c:yylex) notes that
the value of PL_expect needs to be set to XSTATE (expecting a state-
ment) after the final brace. When the final brace is encountered,
PL_expect is set to that recorded value. But then the parser
(perly.y) sets it to XOPERATOR immediately thereafter.
This approach requires a plethora of identical statements in perly.y.
If we just set PL_expect to the right value to begin with, we can
avoid all those assignments.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As it worked before, the parser (perly.y) would set PL_expect to
XSTATE after encountering a statement-terminating semicolon.
Two functions in op.c--package and utilize--had to set the value to
XSTATE as a result.
Also, in the case of a closing brace, the lexer emits an implicit
semicolon followed by '}' (emitted via force_next). force_next
records the value of PL_expect and restores it when emitting the
token. So in this case the value of PL_expect was flipping back and
forth between two values.
Instead of having the parser set it to XSTATE, we can have the lexer
set it to XSTATE by default when emitting an explicit semicolon. (It
was setting it to XTERM.) The parser can set it to XTERM in the only
place that matters; viz., the header of a for-loop. This simplifies
things conceptually, and makes the code a whole line shorter.
(The diff stat shows more savings in line count, but that is because
the version of bison I used to regenerate the tables produces smaller
headers than what was already committed.)
|
|
|
|
|
|
| |
MAD = Misc Attribute Decoration; unmaintained attempt at preserving
the Perl parse tree more faithfully so that automatic conversion to
Perl 6 would have been easier.
|
|
|
|
|
|
|
|
|
|
| |
Declarative syntax to unwrap argument list into lexical variables.
"sub foo ($a,$b) {...}" checks number of arguments and puts the
arguments into lexical variables. Signatures are not equivalent to the
existing idiom of "sub foo { my($a,$b) = @_; ... }". Signatures are only
available by enabling a non-default feature, and generate warnings about
being experimental. The syntactic clash with prototypes is managed by
disabling the short prototype syntax when signatures are enabled.
|
|
|
|
|
| |
It's been deprecated (and emitting a warning) since Perl v5.0.0, and
support for it consitutes nearly 3% of the grammar.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This turned out to be tricky. Normally @ at the beginning of the
interpolated code signals to the lexer to emit ‘join($",’ immediately.
With "$_->@*" we would have to retract the $ _ -> tokens upon encoun-
tering @*, which we obviously cannot do.
Waiting until we reach the end of the interpolated text before emit-
ting anything could not work either, as it may contain BEGIN blocks
that affect the way part of the interpolated code is parsed.
So what we do is introduce an egregious or clever hack, depending on
how you look at it.
Normally, the lexer turns "@foo" into:
stringify ( join ( $ " , @ foo ) )
(The " is a WORD token, representing a variable name.)
"$_" becomes:
stringify ( $ _ )
We can turn "$_->@*" into:
stringify ( $ _ -> @ * POSTJOIN )
Where POSTJOIN is a new lexer token with special handling that creates
a join op just the way join($", ...) does.
To make "foo$_->@*bar" work as well, we have to make POSTJOIN have
precedence just below ->, so that
stringify ( "foo" . $ _ -> @ * POSTJOIN . "bar" )
(what the parser sees) is equivalent to:
stringify ( "foo" . ( $ _ -> @ * POSTJOIN ) . "bar" )
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$_->$* means $$_ (and compiled down to the same op tree)
$_->@* means @$_ ( ditto ditto blah blah blah )
$_->%* means %$_ (...)
$_->&* means &$_
$_->** means *$_
$_->@[...] means @$_[...]
$_->@{...} means @$_{...}
$_->*{...} means *$_{...}
$_->@* is not always equivalent to @$_, particularly in contexts like
@foo[0], which cannot be written foo->@*[0]. (Just omit the asterisk
and it works.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This resolves tickets #28380 and #114024.
Commit 95a31aad5 did something similar to this for the new %hash{...}
syntax. This commit extends it to @ slices and combines the two
code paths.
The heuristics in toke.c can easily produce false positives. So the
op is flagged as being a candidate for the warning. Then when op.c
has the op tree available, it examines it to see whether the heuristic
may have been a false positive.
This avoids bugs with qw "foo bar baz" and sub calls triggering
the warning.
The source code is no longer available for the warning, so we recon-
struct it from the op tree, skipping the subscript if it is anything
other than a const op.
This means that @hash{$foo} comes out as @hash{...} and @hash{foo} as
@hash{"foo"}. It also meeans that @hash{"]"} is displayed correctly
instead of as @hash{"].
Commit 95a31aad5 also modified the heuristic for %hash{...} to exempt
qw altogether. But it did not exempt it if it was preceded by a tab.
So this commit rectifies that.
This commit also improves the false positive detection by exempting
any ops returning lists that can get past toke.c’s heuristic. I went
through the entire list of ops, but I may have missed some.
Also, @ slices on the lhs of = are exempt, as they change the context
and are hence actually useful.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of warning in the lexer, flag the op and then warn in op.c,
when the op tree is available, so we don’t end up warning for actual
lists or for sub calls.
Also, only warn in scalar context, as in list context $hash{$scalar}
and %hash{$scalar} do different things.
In op.c we no longer have easy access to the source code, so recon-
struct the hash/array access based on the op tree. This means
%hash{foo} becomes %hash{"foo"}. We only reconstruct constant keys,
so %hash{++$x} becomes %hash{...}. This also corrects erroneous
dumps, like %hash{"} for %hash{"}"}.
Instead of triggering the warning solely based on the op tree, we
still keep the heuristic in toke.c, so that common workarounds for
that warning (e.g., {q<key>} and {("key")}) continue to work.
The heuristic in toke.c is tweaked to avoid warning for qw().
In a future commit I plan to extend this to the existing @array[0] and
@hash{key} warnings, to avoid false positives.
|
|
|
|
|
|
| |
kvaslice operator that imlements %a[0,2,4] syntax which
result in list of index/value pairs. Implemented in
consistency with "key/value hash slice" operator.
|
|
|
|
|
|
| |
kvhslice operator that implements %h{1,2,3,4} syntax which
returns list of key value pairs rather than just values
(regular slices).
|
|
|
|
|
|
| |
NOAMP is only emitted by toke.c when there are no parentheses. If
there is a parenthesis following a word, the lexer conjures up an '&'
token from nowhere.
|
|
|
|
|
|
|
|
|
|
|
| |
Under mad builds, commit 5db1eb8 caused this warning:
$ PERL_XMLDUMP=/dev/null ./perl -Ilib -e 'foo:'
Invalid TOKEN object ignored at -e line 1.
Since I don’t understand the mad code so well, the easiest fix is to
revert back to using a PV, as we did before 5db1eb8. To record the
utf8ness, we sneak it behind the trailing null.
|
|
|
|
|
| |
They have leaked since v5.15.9-35-g5db1eb8 (which probably broke mad
dumping of labels; to be addressed in the next commit).
|
| |
|
|
|
|
| |
This token is not used any more.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pad slot for a my sub now holds a stub with a prototype CV
attached to it by proto magic.
The prototype is cloned on scope entry. The stub in the pad is used
when cloning, so any code that references the sub before scope entry
will be able to see that stub become defined, making these behave
similarly:
our $x;
BEGIN { $x = \&foo }
sub foo { }
our $x;
my sub foo { }
BEGIN { $x = \&foo }
Constants are currently not cloned, but that may cause bugs in
pad_push. I’ll have to look into that.
On scope exit, lexical CVs go through leave_scope’s SAVEt_CLEARSV sec-
tion, like lexical variables. If the sub is referenced elsewhere, it
is abandoned, and its proto magic is stolen and attached to a new stub
stored in the pad. If the sub is not referenced elsewhere, it is
undefined via cv_undef.
To clone my subs on scope entry, we create a sequence of introcv and
clonecv ops. See the huge comment in block_end that explains why we
need two separate ops for each CV.
To allow my subs to be defined in inner subs (my sub foo; sub { sub
foo {} }), pad_add_name_pvn and S_pad_findlex now upgrade the entry
for a my sub to a CV to begin with, so that fake entries added to pads
(fake entries are those that reference outer pads) can share the same
CV. Otherwise newMYSUB would have to add the CV to every pad that
closes over the ‘my sub’ declaration. newMYSUB no longer throws away
the initial value replacing it with a new one.
Prototypes are not currently visible to sub calls at compile time,
because the lexer sees the empty stub. A future commit will
solve that.
When I added name heks to CV’s I made mistakes in a few places, by not
turning on the CVf_NAMED flag, or by not clearing the field when free-
ing the hek. Those code paths were not exercised enough by state
subs, so the problems did not show up till now. So this commit fixes
those, too.
One of the tests in lexsub.t, involving foreach loops, was incorrect,
and has been fixed. Another test has been added to the end for a par-
ticular case of state subs closing over my subs that I broke when ini-
tially trying to get sibling my subs to close over each other, before
I had separate introcv and clonecv ops.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since state variables are not shared between closures, but only
between invocations of the same closure, state subs should behave
the same way.
This was a little tricky. When we clone a sub, we now clone inner
state subs at the same time. When walking through the pad, cloning
items, we cannot simply clone the inner sub when we see it, because it
may close over things we haven’t cloned yet:
sub {
state sub foo;
my $x
sub foo { $x }
}
We can’t just delay cloning it and do it afterwards, because they may
be multiple subs closing over each other:
sub {
state sub foo;
state sub bar;
sub foo { \&bar }
sub bar { \&foo }
}
So *all* the entries in the new pad must be filled before any inner
subs can be cloned.
So what we do is put a stub in place of the cloned sub. And then
in a second pass clone the inner subs, reusing the stubs from the
first pass.
|