path: root/perly.h
Commit log (most recent first). Each entry lists the commit message, author, date, number of files changed, and lines removed/added.
* perly.y: add $$ = 0 for midrule code blocks (David Mitchell, 2017-06-07, 1 file, -1/+1)

  In places where a rule contains multiple code blocks, ensure that $$ is
  assigned a valid value at the end of midrule blocks, so that "valgrind
  ./perl -Dpv ..." doesn't display zillions of "Conditional jump or move
  depends on uninitialised value" errors when perl tries to display the
  parse stack.

  I've only done the various newish top-level grammar entries - these all
  seemed to have the same defect, while from a quick glance elsewhere in
  the file, it seemed like older rules already do this.
* yyparse: only calculate yytoken on yychar change (David Mitchell, 2016-12-05, 1 file, -6/+4)

  yytoken is a translated (via lookup table) version of parser->yychar.
  So we only need to recalculate it when yychar changes (usually by
  assigning the result of yylex() to it). This means that when multiple
  reductions are done without shifting another token, we skip the extra
  overhead each time.
* Regen from the "special" regen scripts (Aaron Crane, 2016-11-11, 1 file, -1/+1)

  A few regen scripts aren't run by "make regen", either because they
  depend on an external tool, or because they must be run by the Perl
  just built. So they must be run manually.
* perly.y: remove redundant NULL casts (Lukas Mai, 2016-10-20, 1 file, -187/+112)
* [perl #129073] Assert failure: ${p{};sub p}() (Father Chrysostomos, 2016-09-04, 1 file, -110/+187)

  When parsing the special ${var{subscript}} syntax, the lexer notes that
  the } matching the ${ will be a fake bracket, and should be ignored.

  In the case of ${p{};sub p}() the first syntax error causes tokens to
  be popped, such that the } following the sub declaration ends up being
  the one treated as a fake bracket and ignored.

  The part of the lexer that deals with sub declarations treats a (
  following the sub name as a prototype (which is a single term) if
  signatures are disabled, but ignores it and allows the rest of the
  lexer to treat it as a parenthesis if signatures are enabled. Hence,
  the part of the parser (perly.y) that parses signatures knows that a
  parenthesis token can only come after a sub if signatures are enabled,
  and asserts as much.

  In the case of an error and tokens being discarded, a parenthesis may
  come after a sub name as far as the parser is concerned, even though
  there was a } in between that got discarded. The sub part of the
  lexer, of course, did not see the parenthesis because of the
  intervening brace, and did not treat it as a prototype. So we get an
  assertion failure.

  The simplest fix is to loosen up the assertion and allow for anomalies
  after errors. It does not hurt to go ahead and parse a signature at
  this point, even though the feature is disabled, because there has been
  a syntax error already, so the parsed code will never run, and the
  parsed sub will not be installed.
* signatures: eliminate XSIGVAR, add KEY_sigvar (David Mitchell, 2016-08-18, 1 file, -187/+110)

  When I moved subroutine signature processing into perly.y with
  v5.25.3-101-gd3d9da4, I added a new lexer PL_expect state, XSIGVAR.
  This indicated, when about to parse a variable, that it was a signature
  element rather than a my variable; in particular, it makes ($,...) be
  toked as the lone sigil '$' rather than the punctuation variable '$,'.

  However this is a bit heavy-handed; so instead this commit adds a new
  allowed pseudo-keyword value to PL_in_my: as well as KEY_my, KEY_our
  and KEY_state, it can now be KEY_sigvar. This is a less intrusive
  change to the lexer.
* Use parser, not PL_parser, in perly.y (Father Chrysostomos, 2016-08-04, 1 file, -110/+187)

  The code snippets in perly.y are #included in a C function that has a
  ‘parser’ local variable. Accessing the local variable requires less
  machine code (especially under threads).
* silence compiler warning in perly.y (David Mitchell, 2016-08-03, 1 file, -1/+1)

  Assigning a char from an I32 gives a warning. Add an explicit cast, as
  we know the int only ever holds a char.
* signatures: make param and optional param count IV (David Mitchell, 2016-08-03, 1 file, -1/+1)

  During the course of parsing and execution, these values get stored as
  ints and UVs, then used as SSize_t. Standardise on IVs instead.
  Technically they can never be negative, but their final use is as
  indices into AVs, which is SSize_t, so it's easier to standardise on a
  signed value throughout.
* ucfirst() new signature diagnostic messages (David Mitchell, 2016-08-03, 1 file, -1/+1)

  e.g.

      a slurpy parameter may not have a default value
      =>
      A slurpy parameter may not have a default value

  Also, split the "Too %s arguments for subroutine" diagnostic into
  separate "too few" and "too many" entries in perldiag.

  Based on suggestions by Father Chrysostomos.
* add OP_ARGELEM, OP_ARGDEFELEM, OP_ARGCHECK ops (David Mitchell, 2016-08-03, 1 file, -1/+1)

  Currently subroutine signature parsing emits many small discrete ops to
  implement arg handling. This commit replaces them with a couple of ops
  per signature element, plus an initial signature check op.

  These new ops are added to the OP tree during parsing, so will be
  visible to hooks called up to and including peephole optimisation. It
  is intended soon that the peephole optimiser will take these
  per-element ops, and replace them with a single OP_SIGNATURE op which
  handles the whole signature in a single go. So normally these ops won't
  actually get executed much. But adding these intermediate-level ops
  gives three advantages:

  1) it allows the parser to efficiently generate subtrees containing
     individual signature elements, which can't be done if only
     OP_SIGNATURE or discrete ops are available;

  2) prior to optimisation, it provides a simple and straightforward
     representation of the signature;

  3) hooks can mess with the signature OP subtree in ways that make it
     no longer possible to optimise into an OP_SIGNATURE, but which can
     still be executed, deparsed etc (if less efficiently).

  This code:

      use feature "signatures";
      sub f($a, $, $b = 1, @c) {$a}

  under 'perl -MO=Concise,f' now gives:

      d  <1> leavesub[1 ref] K/REFC,1 ->(end)
      -     <@> lineseq KP ->d
      1        <;> nextstate(main 84 foo:6) v:%,469762048 ->2
      2        <+> argcheck(3,1,@) v ->3
      3        <;> nextstate(main 81 foo:6) v:%,469762048 ->4
      4        <+> argelem(0)[$a:81,84] v/SV ->5
      5        <;> nextstate(main 82 foo:6) v:%,469762048 ->6
      8        <+> argelem(2)[$b:82,84] vKS/SV ->9
      6           <|> argdefelem(other->7)[2] sK ->8
      7              <$> const(IV 1) s ->8
      9        <;> nextstate(main 83 foo:6) v:%,469762048 ->a
      a        <+> argelem(3)[@c:83,84] v/AV ->b
      -        <;> ex-nextstate(main 84 foo:6) v:%,469762048 ->b
      b        <;> nextstate(main 84 foo:6) v:%,469762048 ->c
      c        <0> padsv[$a:81,84] s ->d

  The argcheck(3,1,@) op knows the number of positional params (3), the
  number of optional params (1), and whether it has an array / hash
  slurpy element at the end. This op is responsible for checking that @_
  contains the right number of args.

  A simple argelem(0)[$a] op does the equivalent of 'my $a = $_[0]'.
  Similarly, argelem(3)[@c] is equivalent to 'my @c = @_[3..$#_]'.

  If it has a child, it gets its arg from the stack rather than using
  $_[N]. Currently the only used child is the logop argdefelem.
  argdefelem(other->7)[2] is equivalent to '@_ > 2 ? $_[2] : other'.

  [ These ops currently assume that the lexical var being introduced is
  undef/empty and non-magical etc. This is an incorrect assumption and
  is fixed in a few commits' time ]
* sub signatures: use parser rather than lexer (David Mitchell, 2016-08-03, 1 file, -1/+1)

  Currently the signature of a sub (i.e. the '($a, $b = 1)' bit) is
  parsed in toke.c using a roll-your-own mini-parser. This commit makes
  the signature be part of the general grammar in perly.y instead.

  In theory it should still generate the same optree as before, except
  that an OP_STUB is no longer appended to each signature optree: it's
  unnecessary, and I assume that was a hangover from early development
  of the original signature code.

  Error messages have changed somewhat: the generic 'Parse error' has
  changed to the generic 'syntax error', with the addition of ', near
  "xyz"' now appended to each message. Also, some specific error
  messages have been added; for example (@a=1) now says that slurpy
  params can't have a default value, rather than just giving
  'Parse error'.

  It introduces a new lexer expect state, XSIGVAR, since otherwise when
  the lexer saw something like '($, ...)' it would see the identifier
  '$,' rather than the tokens '$' and ','.

  Since it no longer uses parse_termexpr(), it is no longer subject to
  the bug (#123010) associated with that; so

      sub f($x = print, $y) {}

  is no longer mis-interpreted as

      sub f($x = print($_, $y)) {}
* rename "WORD" lexical token to "BAREWORD"David Mitchell2016-07-201-188/+111
| | | | | | | | | | | | | | | | | | | | | | The enum value "WORD" can apparently clash with a enum value in windows headers. It used to be worked around in windows by #defining YYTOKENTYPE, which caused the enum yytokentype { ... WORD = ...; } in perly.h to be skipped, while still using the #define WORD ... which appears later in the same file. In bison 3.x, the auto-generated perl.h no longer includes the #defines, so this workaround no longer works. Instead, change the name of the token from "WORD" to "BAREWORD".
* Allow my \$a (Father Chrysostomos, 2016-07-17, 1 file, -1/+1)

  This applies to ‘my’, ‘our’, ‘state’ and ‘local’, both to single
  variables and to lists of variables, in all their variations:

      my \$a          # equivalent to \my $a
      my \($a,$b)     # equivalent to \my($a, $b)
      my (\($a,$b))   # same
      my (\$a, $b)    # equivalent to (\my $a, $b)
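  An illustrative sketch of the declaration style this allows, assuming a
  perl with the experimental refaliasing feature enabled:

      use feature 'refaliasing';
      no warnings 'experimental::refaliasing';

      my $x = 1;
      my \$alias = \$x;   # $alias is now another name for $x
      $alias = 2;         # $x is now 2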
* Simplify parser’s handling of my/local (Father Chrysostomos, 2016-05-21, 1 file, -1/+1)

  In Perl 5.000, the same token, LOCAL, was used for both ‘my’ and
  ‘local’, with a token value, passed to localize() as a second
  argument, to distinguish between them.

  perl-5.003_07-9-g55497cf (inseparable changes from patch from
  perl5.003_07 to perl5.003_08), for no apparent reason, split them into
  two tokens, removing the token values and assigning values in perly.y
  via $$ = 0 and $$ = 1. They still ultimately made their way through
  the same grammar rule, as there was only one localize() call in
  perly.y. The code still made sense.

  perl-5.005_02-1816-g09bef84 (sub : attrlist) changed things, such that
  the tokens are separate *and* they get separate token values assigned
  to them. ‘local’ and ‘my’ no longer follow the same grammar rules in
  perly.y, so there are separate localize() calls for the different
  token types. Hence, the use of a token value to distinguish them does
  not make sense. It just makes this more complicated than necessary.

  So this commit removes the token values. Since the two token types
  follow different paths through perly.y and have separate localize()
  calls, we can hard-code the argument to localize() there, instead of
  passing the value through from toke.c as a token value.

  This does shrink toke.o slightly (for me it went from 876040 to
  876000), and it makes this conceptually clearer.
* regen_perly.pl: Correct typo (Father Chrysostomos, 2016-05-16, 1 file, -43/+23)

  Sorry for the noisy patch. I can’t modify regen_perly.pl without
  regenerating stuff, because the checksum changes.
* run regen_perly.pl (David Mitchell, 2016-02-11, 1 file, -1/+1)
* Add support for bison 3.0 (David Mitchell, 2016-02-03, 1 file, -6/+8)

  Mainly it no longer generates some tables used for debugging. This
  commit also adds a new define showing what bison version was used.
* [perl #127122] warn on unless (assignment) when syntax warnings are on (Tony Cook, 2016-01-21, 1 file, -5/+5)

  Previously the assignment was hidden from the check by the not op
  wrapped around the condition, but newCONDOP() is sufficiently flexible
  that the extra not op isn't needed.
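  An illustrative sketch, assuming the usual "Found = in conditional"
  diagnostic is the one now triggered for unless:

      use warnings 'syntax';
      my $x;
      unless ($x = 1) {        # may now warn: Found = in conditional, should be ==
          warn "not reached";
      }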
* given(): remove support for lexical $_ (David Mitchell, 2015-10-02, 1 file, -22/+42)

  There is dead code that used to allow

      my $_;
      ...
      given ($foo) {
          # lexical $_ aliased to $foo here
      }

  Now that lexical $_ has been removed, remove the code.

  I've left the signatures of the newFOO() functions unchanged; they
  just expect a target of 0 to always be passed now.
* perly.y: Remove type from ';' (Father Chrysostomos, 2015-02-15, 1 file, -1/+1)

  This token’s type is never used. We don’t bother setting the type,
  either, in toke.c, so it will be garbage. Removing the type makes it
  harder to use the garbage value by mistake in refactoring.
* perly.y: Remove types for '$' and '*' (Father Chrysostomos, 2015-02-07, 1 file, -1/+1)

  These two tokens never use their value, and the value is not even set
  in toke.c, which means it will contain a junk value from some previous
  token. Removing the types prevents that junk value from being
  accidentally used.
* Parse and compile string- and num-specific bitops (Father Chrysostomos, 2015-01-31, 1 file, -42/+22)

  Yay, the semicolons are back.
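  An illustrative sketch of the user-visible syntax these grammar changes
  support, assuming the experimental 'bitwise' feature:

      use feature 'bitwise';
      no warnings 'experimental::bitwise';

      my $num = 0xF0 | 0x0F;       # numeric-only OR: 255
      my $str = "perl" ^. "    ";  # string-only XOR: "PERL"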
* perly.y changes from Lukas Mai in RT 123069 (Peter Martini, 2015-01-18, 1 file, -22/+42)

  This moves signatures before attributes in the grammar by creating
  separate branches for the prototype and signatures cases, so that the
  introduced block and the fact that signatures do not allow for
  declarations can be handled properly.

  Tests and regen_perly to follow.
* Simplify s/// and tr/// parsing logic (Father Chrysostomos, 2015-01-08, 1 file, -1/+1)

  These two operators were being translated into subst("","") and
  tr("","") by the lexer. Then pmruntime in op.c would take apart the
  resulting list op. Instead of constructing a list op only to take it
  apart again, feed the replacement part to pmruntime separately. We can
  achieve this by introducing a new token ('/') that the parser
  recognizes as introducing a replacement.

  If we had followed this approach to begin with, then bug #123542 would
  never have happened. (Actually, it seems the parser did know about the
  replacement part to begin with, but it changed in
  perl-5.8.0-4047-g131b3ad to fix some overloading problems.)
* perly.y: Don’t call op_lvalue on refgen kid (Father Chrysostomos, 2015-01-06, 1 file, -1/+1)

  ck_spair also applies lvalue context to the kid ops, so we just end up
  calling op_lvalue twice on the same ops. It’s harmless (being
  idempotent), but wasteful.
* Unify format and named-sub pad weakref code (Father Chrysostomos, 2014-12-09, 1 file, -1/+1)
* Deparse formats in the right spot (Father Chrysostomos, 2014-12-06, 1 file, -1/+1)

  Formats need the same logic applied in 34b54951568, to avoid this:

  Input:

      {
      my $x;
      format STDOUT =
      @
      $x
      .
      }
      __END__

      $ pbpaste|./perl -Ilib -MO=Deparse
      {
          my $x;
      }
      format STDOUT =
      @
      $x
      .
      __DATA__
      - syntax OK

  That $x in the format is now global, not lexical.

  Also, we need to make sure the sequence numbers are correct. The
  statement after the format stopped having a higher sequence number
  than CvOUTSIDE_SEQ(format) in 8635e3c238f, because the ‘pending’
  sequence number set aside for the nextstate op created just after
  compile-time block exit (which nextstate op comes before the block)
  was not actually being used by newFORM (unlike newATTRSUB), so the
  following statement was using that number.
* [perl #123286] Lone C-style for in a block (Father Chrysostomos, 2014-11-25, 1 file, -1/+1)

  A block in perl usually consists of an enter/leave pair plus the
  contents of the block:

      leave
          enter
          nextstate
          whatever

  But if the contents of the block are simple enough to forego the full
  block structure, a simple scope op is used, which is not even executed:

      scope
          ex-nextstate
          whatever

  If there is a real nextstate op anywhere in the block, it resets the
  stack to whatever it was at block entry, based on the value on the
  context stack placed there by the enter op. That’s why we can never
  have scope+nextstate (we have ex-nextstate, or a former nextstate op
  that is not executed).

  A for-loop (for(init; cond; cont) { ... }) executes the init section
  first, and then an unstack op, which is like nextstate in that it
  resets the stack based on what the context stack says is the base
  offset for this block.

  If we have an unstack op, we can’t use scope, just as we can’t use it
  with nextstate. But we *were* nonetheless using scope in this case.
  Hence, map { for(...;...;...) {...} } ... caused the for-loop to reset
  the stack to the beginning of map’s own arguments. So the for-loop
  would stomp on them.

  We can see the same bug with ‘for’ clobbering an outer list:

      $ perl5.20.1 -le 'print 1..3, do{for(0;0;){}}, 4..6;'
      0456
* [perl #77452] Deparse { ...; BEGIN{} } correctly (Father Chrysostomos, 2014-11-20, 1 file, -1/+1)

  8635e3c2 (5.21.6) changed the COP sequence numbers for nested blocks,
  such that most BEGIN blocks (incl. ‘use’ statements) and sub
  declarations end up in the right place. However, it had the side
  effect of causing declarations at the end of the enclosing scope to
  fall out of it and appear below.

  This commit fixes that by adding an extra nulled COP to the end of the
  enclosing scope if that scope ends with a sub, so the final
  declaration gets deparsed before it. The frequency of sub declarations
  at the end of the enclosing scope is sufficiently low (I’m guessing a
  bit here) that this slight increase in run-time memory usage is
  probably acceptable.

  I had to change B::Deparse to deparse nulled COPs the same way it does
  live COPs, which means we get more extraneous semicolons than before.
  I hope to fix that in a forthcoming commit. I also ran into a B bug,
  in that null ops are not presented to Perl code with the right op
  class (see the blessing in the patch). I plan to fix that in a
  separate commit, too.
* [perl #77452] Deparse BEGIN blocks in the right place (Father Chrysostomos, 2014-11-06, 1 file, -42/+22)

  In the op tree, a statement consists of a nextstate/dbstate op (of
  class cop) followed by the contents of the statement. This cop is
  created after the statement has been parsed. So if you have nested
  statements, the outermost statement has the highest sequence number
  (cop_seq).

  Every sub (including BEGIN blocks) has a sequence number indicating
  where it occurs in its containing sub. So

      BEGIN { } #1
      # seq 2
      {
          # seq 1
          ...
      }

  is indistinguishable from

      # seq 2
      {
          BEGIN { } #1
          # seq 1
          ...
      }

  because the sequence number of the BEGIN block is 1 in both examples.

  By reserving a sequence number at the start of every block and using
  it once the block has finished parsing, we can do this:

      BEGIN { } #1
      # seq 1
      {
          # seq 2
          ...
      }

      # seq 1
      {
          BEGIN { } #2
          # seq 2
          ...
      }

  and now B::Deparse can tell where to put the blocks.

  PL_compiling.cop_seq was unused, so this is where I am stashing the
  pending sequence number.
* rename convert to op_convert_list and APIfy (Lukas Mai, 2014-10-26, 1 file, -22/+42)
* Remove redundant op_lvalue calls in perly.y (Father Chrysostomos, 2014-10-24, 1 file, -1/+1)

  When (\$x)=\$y is compiled, the \ on the lhs gives lvalue context to
  its argument by calling op_lvalue. Then later the = gives lvalue
  context to the \, calling op_lvalue again, which transforms the $x
  into an lvref op (via op.c:S_lvref). I just copied that logic when I
  extended aliasing via reference to foreach \$x.

  But here, we don’t need to call op_lvalue on the $x, because we know
  it is going to go through op.c:S_lvref, which doesn’t care whether it
  has been through op_lvalue already or not. The end result is the same.
* foreach \$var (Father Chrysostomos, 2014-10-11, 1 file, -42/+22)

  Some passing tests are still marked to-do. We need more tests still.
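  An illustrative sketch of the aliasing loop this enables, assuming the
  experimental refaliasing feature:

      use feature 'refaliasing';
      no warnings 'experimental::refaliasing';

      my ($x, $y) = (1, 2);
      foreach \my $alias (\$x, \$y) {
          $alias *= 10;    # writes through the alias: $x and $y become 10 and 20
      }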
* Make OP_METHOD* to be of new class METHOP (syber, 2014-10-03, 1 file, -22/+42)

  Introduce a new opcode class, METHOP, which will hold class/method
  related info needed at runtime to improve performance of class/object
  method calls, then change OP_METHOD and OP_METHOD_NAMED from being
  UNOP/SVOP to being METHOP.

  Note that because OP_METHOD is a UNOP with an op_first, while
  OP_METHOD_NAMED is an SVOP, the first field of the METHOP structure is
  a union holding either op_first or op_sv. This was seen as less messy
  than having to introduce two new op classes.

  The new op class's character is '.'

  Nothing has changed in functionality and/or performance by this
  commit. It just introduces new structure which will be extended with
  extra fields and used in later commits.

  Added METHOP constructors:

    - newMETHOP() for method ops with dynamic method names.
      The only optype for this op is OP_METHOD.

    - newMETHOP_named() for method ops with constant method names.
      Optypes for this op are: OP_METHOD_NAMED (currently) and (later)
      OP_METHOD_SUPER, OP_METHOD_REDIR, OP_METHOD_NEXT, OP_METHOD_NEXTCAN,
      OP_METHOD_MAYBENEXT

  (This commit includes fixups by davem)
* In perly.y, change PL_parser to parser (Father Chrysostomos, 2014-08-24, 1 file, -1/+1)

  All these code snippets are embedded inside a function
  (perly.c:yyparse) that puts the current value of PL_parser in a local
  variable named parser. So the two are equivalent, but the latter only
  has to access a local variable.

  Before:

      $ ls -ld perly.o
      -rw-r--r--  1 sprout  staff  94748 Aug 22 06:12 perly.o

  After:

      $ ls -ld perly.o
      -rw-r--r--  1 sprout  staff  94340 Aug 22 06:15 perly.o
* Set PL_expect only once after curly subscripts (Father Chrysostomos, 2014-08-24, 1 file, -1/+1)

  When curly subscripts are parsed, the lexer (toke.c:yylex) notes that
  the value of PL_expect needs to be set to XSTATE (expecting a
  statement) after the final brace. When the final brace is encountered,
  PL_expect is set to that recorded value. But then the parser (perly.y)
  sets it to XOPERATOR immediately thereafter.

  This approach requires a plethora of identical statements in perly.y.
  If we just set PL_expect to the right value to begin with, we can
  avoid all those assignments.
* Set PL_expect less often when parsing semicolons (Father Chrysostomos, 2014-08-24, 1 file, -42/+22)

  As it worked before, the parser (perly.y) would set PL_expect to
  XSTATE after encountering a statement-terminating semicolon. Two
  functions in op.c (package and utilize) had to set the value to XSTATE
  as a result.

  Also, in the case of a closing brace, the lexer emits an implicit
  semicolon followed by '}' (emitted via force_next). force_next records
  the value of PL_expect and restores it when emitting the token. So in
  this case the value of PL_expect was flipping back and forth between
  two values.

  Instead of having the parser set it to XSTATE, we can have the lexer
  set it to XSTATE by default when emitting an explicit semicolon. (It
  was setting it to XTERM.) The parser can set it to XTERM in the only
  place that matters; viz., the header of a for-loop.

  This simplifies things conceptually, and makes the code a whole line
  shorter. (The diff stat shows more savings in line count, but that is
  because the version of bison I used to regenerate the tables produces
  smaller headers than what was already committed.)
* Remove MAD. (Jarkko Hietaniemi, 2014-06-13, 1 file, -23/+27)

  MAD = Misc Attribute Decoration; unmaintained attempt at preserving
  the Perl parse tree more faithfully so that automatic conversion to
  Perl 6 would have been easier.
* subroutine signatures (Zefram, 2014-02-01, 1 file, -25/+9)

  Declarative syntax to unwrap argument list into lexical variables.
  "sub foo ($a,$b) {...}" checks number of arguments and puts the
  arguments into lexical variables. Signatures are not equivalent to the
  existing idiom of "sub foo { my($a,$b) = @_; ... }". Signatures are
  only available by enabling a non-default feature, and generate
  warnings about being experimental. The syntactic clash with prototypes
  is managed by disabling the short prototype syntax when signatures are
  enabled.
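  An illustrative sketch, assuming the experimental signatures feature:

      use feature 'signatures';
      no warnings 'experimental::signatures';

      sub add ($x, $y) { return $x + $y }    # checks argument count, binds $x and $y
      print add(2, 3), "\n";                 # 5

      # similar to, but not identical to, the old idiom:
      sub add_old { my ($x, $y) = @_; return $x + $y }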
* Remove support for "do SUBROUTINE(LIST)" (Dagfinn Ilmari Mannsåker, 2013-12-22, 1 file, -22/+42)

  It's been deprecated (and emitting a warning) since Perl v5.0.0, and
  support for it constitutes nearly 3% of the grammar.
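  For reference, an illustrative sketch contrasting the removed form
  with the supported call syntax:

      sub greet { print "hello, @_\n" }

      # do greet("world");   # removed: no longer means a subroutine call
      greet("world");        # normal call syntax
      &greet("world");       # or the explicit & form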
* ->$#* (Father Chrysostomos, 2013-11-24, 1 file, -1/+1)
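  An illustrative sketch of the new form, assuming the experimental
  postderef feature:

      use feature 'postderef';
      no warnings 'experimental::postderef';

      my $aref = [10, 20, 30];
      my $last = $aref->$#*;    # same as $#{$aref}, i.e. 2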
* Allow ->@ ->$ interpolation under postderef_qq feature (Father Chrysostomos, 2013-10-05, 1 file, -13/+15)

  This turned out to be tricky. Normally @ at the beginning of the
  interpolated code signals to the lexer to emit ‘join($",’ immediately.
  With "$_->@*" we would have to retract the $ _ -> tokens upon
  encountering @*, which we obviously cannot do.

  Waiting until we reach the end of the interpolated text before
  emitting anything could not work either, as it may contain BEGIN
  blocks that affect the way part of the interpolated code is parsed.

  So what we do is introduce an egregious or clever hack, depending on
  how you look at it.

  Normally, the lexer turns "@foo" into:

      stringify ( join ( $ " , @ foo ) )

  (The " is a WORD token, representing a variable name.)

  "$_" becomes:

      stringify ( $ _ )

  We can turn "$_->@*" into:

      stringify ( $ _ -> @ * POSTJOIN )

  where POSTJOIN is a new lexer token with special handling that creates
  a join op just the way join($", ...) does.

  To make "foo$_->@*bar" work as well, we have to make POSTJOIN have
  precedence just below ->, so that

      stringify ( "foo" . $ _ -> @ * POSTJOIN . "bar" )

  (what the parser sees) is equivalent to:

      stringify ( "foo" . ( $ _ -> @ * POSTJOIN ) . "bar" )
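  An illustrative sketch of the interpolation this enables, assuming the
  experimental postderef_qq feature:

      use feature qw(postderef postderef_qq);
      no warnings 'experimental::postderef';

      my $aref = [1, 2, 3];
      print "values: $aref->@*\n";    # interpolates like "values: @$aref\n"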
* ->%{ ->%[ (Father Chrysostomos, 2013-10-05, 1 file, -1/+1)
* Postfix dereference syntax (Father Chrysostomos, 2013-10-05, 1 file, -1/+1)

      $_->$*      means   $$_       (and compiled down to the same op tree)
      $_->@*      means   @$_       (ditto ditto blah blah blah)
      $_->%*      means   %$_       (...)
      $_->&*      means   &$_
      $_->**      means   *$_
      $_->@[...]  means   @$_[...]
      $_->@{...}  means   @$_{...}
      $_->*{...}  means   *$_{...}

  $_->@* is not always equivalent to @$_, particularly in contexts like
  @foo[0], which cannot be written foo->@*[0]. (Just omit the asterisk
  and it works.)
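  A few of the equivalences above as an illustrative sketch, assuming
  the experimental postderef feature:

      use feature 'postderef';
      no warnings 'experimental::postderef';

      my $a = [1, 2, 3];
      my $h = { x => 1, y => 2 };
      my @all   = $a->@*;         # @$a
      my %copy  = $h->%*;         # %$h
      my @slice = $a->@[0, 2];    # @$a[0, 2]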
* Reduce false positives for @hsh{$s} and @ary[$s] warnings (Father Chrysostomos, 2013-09-14, 1 file, -1/+1)

  This resolves tickets #28380 and #114024.

  Commit 95a31aad5 did something similar to this for the new %hash{...}
  syntax. This commit extends it to @ slices and combines the two code
  paths.

  The heuristics in toke.c can easily produce false positives. So the op
  is flagged as being a candidate for the warning. Then when op.c has
  the op tree available, it examines it to see whether the heuristic may
  have been a false positive. This avoids bugs with qw "foo bar baz" and
  sub calls triggering the warning.

  The source code is no longer available for the warning, so we
  reconstruct it from the op tree, skipping the subscript if it is
  anything other than a const op. This means that @hash{$foo} comes out
  as @hash{...} and @hash{foo} as @hash{"foo"}. It also means that
  @hash{"]"} is displayed correctly instead of as @hash{"].

  Commit 95a31aad5 also modified the heuristic for %hash{...} to exempt
  qw altogether. But it did not exempt it if it was preceded by a tab.
  So this commit rectifies that.

  This commit also improves the false positive detection by exempting
  any ops returning lists that can get past toke.c’s heuristic. I went
  through the entire list of ops, but I may have missed some. Also,
  @ slices on the lhs of = are exempt, as they change the context and
  are hence actually useful.
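  An illustrative sketch of the kind of code the warning targets
  (warning text paraphrased):

      use warnings 'syntax';
      my %hash = (foo => 1);
      my $val  = @hash{'foo'};         # may warn: better written as $hash{'foo'}
      my @vals = @hash{qw(foo bar)};   # a real slice; should not warn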
* Fewer false positives for %hash{$scalar} warning (Father Chrysostomos, 2013-09-13, 1 file, -42/+22)

  Instead of warning in the lexer, flag the op and then warn in op.c,
  when the op tree is available, so we don’t end up warning for actual
  lists or for sub calls. Also, only warn in scalar context, as in list
  context $hash{$scalar} and %hash{$scalar} do different things.

  In op.c we no longer have easy access to the source code, so we
  reconstruct the hash/array access based on the op tree. This means
  %hash{foo} becomes %hash{"foo"}. We only reconstruct constant keys, so
  %hash{++$x} becomes %hash{...}. This also corrects erroneous dumps,
  like %hash{"} for %hash{"}"}.

  Instead of triggering the warning solely based on the op tree, we
  still keep the heuristic in toke.c, so that common workarounds for
  that warning (e.g., {q<key>} and {("key")}) continue to work. The
  heuristic in toke.c is tweaked to avoid warning for qw().

  In a future commit I plan to extend this to the existing @array[0] and
  @hash{key} warnings, to avoid false positives.
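  An illustrative sketch of the scalar-context case the warning now
  targets (warning text paraphrased):

      use warnings 'syntax';
      my %hash = (foo => 1);
      my $key  = 'foo';
      my $v    = %hash{$key};    # scalar context: warns that $hash{$key} was probably meant
      my %kv   = %hash{$key};    # list context key/value slice: no warning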
* index/value array slice operation (Ruslan Zakirov, 2013-09-13, 1 file, -1/+1)

  The kvaslice operator implements the %a[0,2,4] syntax, which results
  in a list of index/value pairs. Implemented for consistency with the
  "key/value hash slice" operator.
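  An illustrative sketch of the index/value array slice:

      my @a  = (10, 20, 30);
      my %iv = %a[0, 2];    # (0 => 10, 2 => 30): index/value pairs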
* key/value hash slice operation (Ruslan Zakirov, 2013-09-13, 1 file, -22/+42)

  The kvhslice operator implements the %h{1,2,3,4} syntax, which returns
  a list of key/value pairs rather than just values (as regular slices
  do).
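  An illustrative sketch of the key/value hash slice:

      my %h  = (a => 1, b => 2, c => 3);
      my %kv = %h{'a', 'c'};    # (a => 1, c => 3): key/value pairs
      my @v  = @h{'a', 'c'};    # (1, 3): the ordinary value slice, for contrast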
* Correct three sub call comments in perly.y (Father Chrysostomos, 2013-05-31, 1 file, -42/+22)

  NOAMP is only emitted by toke.c when there are no parentheses. If
  there is a parenthesis following a word, the lexer conjures up an '&'
  token from nowhere.