path: root/op.h
Commit message [Author, Age, Files, Lines]
* Make OP_METHOD* to be of new class METHOP [syber, 2014-10-03, 1 file, -0/+15]
  Introduce a new opcode class, METHOP, which will hold class/method related info needed at runtime to improve performance of class/object method calls, then change OP_METHOD and OP_METHOD_NAMED from being UNOP/SVOP to being METHOP.

  Note that because OP_METHOD is a UNOP with an op_first, while OP_METHOD_NAMED is an SVOP, the first field of the METHOP structure is a union holding either op_first or op_sv. This was seen as less messy than having to introduce two new op classes.

  The new op class's character is '.'

  Nothing has changed in functionality and/or performance by this commit. It just introduces a new structure which will be extended with extra fields and used in later commits.

  Added METHOP constructors:
  - newMETHOP() for method ops with dynamic method names. The only optype for this op is OP_METHOD.
  - newMETHOP_named() for method ops with constant method names. Optypes for this op are: OP_METHOD_NAMED (currently) and (later) OP_METHOD_SUPER, OP_METHOD_REDIR, OP_METHOD_NEXT, OP_METHOD_NEXTCAN, OP_METHOD_MAYBENEXT

  (This commit includes fixups by davem)
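  The union layout described in this commit can be sketched in isolation. This is a minimal model, not the real op.h definition: every type, enum value and function name below (SketchOp, SketchSv, SketchMethop, SK_OP_*, sketch_*) is a hypothetical stand-in, and only the idea of one op class whose first payload field is a union of op_first and op_sv comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for Perl's OP and SV; not the real definitions. */
typedef struct sketch_op { int op_type; } SketchOp;
typedef struct sketch_sv { const char *pv; } SketchSv;

enum { SK_OP_METHOD = 1, SK_OP_METHOD_NAMED = 2 };

/* One op class for both variants: the first payload field is a union. */
typedef struct sketch_methop {
    int op_type;
    union {
        SketchOp *op_first; /* SK_OP_METHOD: child op that yields the name */
        SketchSv *op_sv;    /* SK_OP_METHOD_NAMED: constant method name */
    } op_u;
} SketchMethop;

/* Models newMETHOP(): the method name is computed at run time. */
static SketchMethop sketch_new_methop(SketchOp *dynamic_meth)
{
    SketchMethop m;
    m.op_type = SK_OP_METHOD;
    m.op_u.op_first = dynamic_meth;
    return m;
}

/* Models newMETHOP_named(): the method name is a constant. */
static SketchMethop sketch_new_methop_named(SketchSv *const_meth)
{
    SketchMethop m;
    m.op_type = SK_OP_METHOD_NAMED;
    m.op_u.op_sv = const_meth;
    return m;
}

/* Returns the constant name, or NULL when the name is only known at run time. */
static const char *sketch_method_name(const SketchMethop *m)
{
    assert(m != NULL);
    return m->op_type == SK_OP_METHOD_NAMED ? m->op_u.op_sv->pv : NULL;
}
```

  The op type acts as the discriminator for the union, which is why a single class can serve both the UNOP-like and the SVOP-like variant.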
* Update comments for OPf_SPECIAL/do [Father Chrysostomos, 2014-10-02, 1 file, -1/+1]
  ‘do subname’ has been removed, so OPf_SPECIAL no longer applies to OP_ENTERSUB.
* Suppress some Solaris warnings [Karl Williamson, 2014-09-29, 1 file, -13/+13]
  We get an integer overflow message when we left shift a 1 into the highest bit of a word. This changes the 1's into 1U's to indicate unsigned. This is done for all the flag bits in the affected word, as they could get reordered by someone in the future, unintentionally reintroducing this problem.
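  A minimal sketch of the pattern, assuming an unsigned int flag word. The SKETCH_* names are hypothetical and are not the flags touched by this commit; the point is only that an unsigned literal keeps the shift into the top bit well defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical flag names, for illustration only. */
#define SKETCH_WORD_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Every flag uses 1U, so reordering the bits later cannot bring back a
 * signed shift into the highest bit of the word. */
#define SKETCH_FLAG_LOW  (1U << 0)
#define SKETCH_FLAG_TOP  (1U << (SKETCH_WORD_BITS - 1))

/* Returns nonzero when `flag` is set in `flags`. */
static int sketch_flag_is_set(unsigned int flags, unsigned int flag)
{
    return (flags & flag) != 0;
}
```

  With a plain `1`, the shift into the top bit would be performed on a signed int, which is what the compiler diagnostic is about.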
* Make space for /xx flag [Karl Williamson, 2014-09-29, 1 file, -1/+1]
  This doesn't actually use the flag yet. We no longer have to make version-dependent changes to ext/Devel-Peek/t/Peek.t, (it being in /ext) so this doesn't
* op.h: Move flag bits; comment shared-bit scheme [Karl Williamson, 2014-09-29, 1 file, -18/+50]
  This changes op.h to correspond with regexp.h. It moves all the used bits up in the word so that if a new shared bit is added, the #error will be triggered, alerting the person doing it that things need adjusting so binary compatibility is preserved.
* Only #define IS_(PADGV|CONST) if !PERL_CORE [Father Chrysostomos, 2014-09-17, 1 file, -5/+8]
  and give IS_PADGV a simpler definition.

  These are not used in the perl core any more and shouldn’t be.

  The IS_PADGV definition checked for the IN_PAD flag, which flag never made much sense (see the prev. commit’s message). Since any GV could end up with that flag, and since any GV coming near a pad would get it, it might as well have been turned on for all GVs (except copies). So just check whether the thingy is a GV.
* op.c:ck_subr: reify GVs based on call checker [Father Chrysostomos, 2014-09-15, 1 file, -1/+4]
  Instead of faking up a GV to pass to the call checker if we have a lexical sub, just get the GV from CvGV (since that will reify the GV, even for lexical subs), unless the call checker has not specifically requested GVs.

  For now, we assume the default call checker cannot handle non-GV sub names, as indeed it cannot. An imminent commit will rectify that.

  The code in scope.c was getting the name hek from the proto CV (stowed in magic on the pad name) if the CV in the pad had lost it. Now, the proto CV can lose it at compile time via CvGV, so that does not work anymore. Instead, just get it from the GV.
* mask VMS hints bits in COPs [David Mitchell, 2014-09-10, 1 file, -1/+2]
  A couple of VMS-specific hints bits are stored in op_private on COPs. Currently these are added using NATIVE_HINTS, which is defined as PL_hints >> 24. Since other hints have started using the top byte of PL_hints, this has the possibility of inadvertently setting other bits in cop->op_private. So mask out the bits we don't want.

  We need this before the next commit, which will assert valid bits on debugging builds.

  (This is VMS-specific, and has been applied blind)
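  The masking step can be sketched as follows. The bit positions and the SK_* names are assumptions made for the example and are not taken from perl.h; only the `>> 24` shift and the idea of masking the result come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hint bits living in the top byte of a 32-bit hints word. */
#define SK_HINT_VMSISH_STATUS 0x40000000u
#define SK_HINT_VMSISH_TIME   0x80000000u
#define SK_HINT_OTHER_TOP     0x01000000u /* unrelated hint, also in the top byte */

/* Models NATIVE_HINTS: the top byte of the hints word. */
#define SK_NATIVE_HINTS(h) ((uint8_t)((h) >> 24))

/* Only these bits are allowed to reach the COP's private flags. */
#define SK_PRIVATE_MASK \
    ((uint8_t)((SK_HINT_VMSISH_STATUS | SK_HINT_VMSISH_TIME) >> 24))

/* Returns the private flags for a COP compiled under `hints`. */
static uint8_t sk_cop_private(uint32_t hints)
{
    return (uint8_t)(SK_NATIVE_HINTS(hints) & SK_PRIVATE_MASK);
}
```

  Without the mask, the unrelated top-byte hint in this model would leak into the private flags as bit 0x01.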
* Automate processing of op_private flags [David Mitchell, 2014-09-10, 1 file, -197/+6]
  Add a new config file, regen/op_private, which contains all the information about the flags and descriptions for the OP op_private field.

  Previously, the flags themselves were defined in op.h, accompanied by textual descriptions (sometimes inaccurate or incomplete).

  For display purposes, there were short labels for each flag found in Concise.pm, and another set of labels for Perl_do_op_dump() in dump.c. These two sets of labels differed from each other in spelling (e.g. REFC versus REFCOUNT), and differed in completeness and accuracy.

  With this commit, all the data to generate the defines and the labels is derived from a single source, and are generated automatically by 'make regen'.

  It also contains complete data on which bits are used for what by each op. So any attempt to add a new flag for a particular op where that bit is already in use, will raise an error in make regen. This compares to the previous practice of reading the descriptions in op.h and hoping for the best.

  It also makes use of data in regen/opcodes: for example, regen/op_private specifies that all ops flagged as 'T' get the OPpTARGET_MY flag.

  Since the set of labels used by Concise and Perl_do_op_dump() differed, I've standardised on the Concise version. Thus this commit changes the output produced by Concise only marginally, while Perl_do_op_dump() is considerably different. As well as the change in labels (and missing labels), Perl_do_op_dump() formerly had a bug whereby any unrecognised bits would not be shown if there was at least one recognised bit. So while Concise displayed (and still does) "LVINTRO,2", Perl_do_op_dump() has changed:

      - PRIVATE = (INTRO)
      + PRIVATE = (LVINTRO,0x2)

  Concise has mainly changed in that a few op/bit combinations weren't being shown symbolically, and now are. I've avoided fixing the ones that would break tests; they'll be fixed up in the next few commits.

  A few new OPp* flags have been added:

      OPpARG1_MASK
      OPpARG2_MASK
      OPpARG3_MASK
      OPpARG4_MASK
      OPpHINT_M_VMSISH_STATUS
      OPpHINT_M_VMSISH_TIME
      OPpHINT_STRICT_REFS

  The last three are analogues for existing HINT_* flags. The former four reflect that many ops use some of the lower few bits of op_private to indicate how many args the op expects. While (for now) this is still displayed as, e.g. "LVINTRO,2", the definitions in regen/op_private now fully account for which ops use which bits for the arg count.

  There is a new module, B::Op_private, which allows this new data to be accessed from Perl. For example,

      use B::Op_private;
      my $name  = $B::Op_private::bits{aelem}{7}; # OPpLVAL_INTRO
      my $value = $B::Op_private::defines{$name}; # 128
      my $label = $B::Op_private::labels{$name};  # LVINTRO

  There are several new constant PL_* tables. PL_op_private_valid[] specifies for each op number, which bits are valid for that op. In a couple of commits' time, op_free() will use this on debugging builds to assert that no ops gained any private flags which we don't know about. In fact it was by using such a temporary assert repeatedly against the test suite, that I tracked down most of the inconsistencies and errors in the current flag data.

  The other PL_op_private_* tables contain a compact representation of all the ops/bits/labels in a format suitable for Perl_do_op_dump() to decode Op_private.

  Overall, the perl binary is about 500 bytes smaller on my system.
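  The validity check that PL_op_private_valid[] is meant to enable can be sketched with a tiny table. The op numbers, the table contents and the sk_* names are invented for the example; the real table is generated by 'make regen' and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical op numbers; the real ones come from regen/opcodes. */
enum { SK_OP_A, SK_OP_B, SK_OP_MAX };

/* Per-op mask of private bits that are known to be valid (made-up values). */
static const unsigned char sk_op_private_valid[SK_OP_MAX] = {
    0x83, /* SK_OP_A: bit 7 plus a two-bit arg count */
    0x01  /* SK_OP_B: a single flag */
};

/* Returns nonzero when `op_private` only uses bits valid for `optype`. */
static int sk_private_bits_ok(int optype, unsigned char op_private)
{
    assert(optype >= 0 && optype < SK_OP_MAX);
    return (op_private & ~sk_op_private_valid[optype]) == 0;
}
```

  A debugging build could assert sk_private_bits_ok() when an op is freed, which is the kind of check the commit message describes for op_free().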
* better document OA_ flags [David Mitchell, 2014-09-10, 1 file, -2/+3]
  It's a bit confusing which bits in PL_opargs are used for what, and which flags in regen/opcodes map to which OA_* value
* op.h: Correct PERL_LOADMOD_IMPORT_OPS comment [Father Chrysostomos, 2014-09-06, 1 file, -1/+4]
  This description, added in ec6d81aba, is misleading.
* Avoid vivifying stuff when looking up barewords [Father Chrysostomos, 2014-08-29, 1 file, -3/+5]
  Till now, when a bareword was looked up to see whether it was a subroutine, an rv2cv op was created (to allow PL_check hooks to override the process), which was then asked for its GV.

  Afterwards, the GV was downgraded back to nothing if possible.

  So a lot of the time a GV was autovivified and then discarded. This has been the case since f74617600 (5.12).

  If we know there is a good chance that the rv2cv op is about to be deleted, we can avoid that by passing a flag to the new op.

  Also f74617600 actually changed the behaviour by vivifying stashes that used not to be vivified:

      sub foo { print shift, "\n" }
      SUPER::foo bar if 0;
      foo SUPER;

  Output in 5.10:

      SUPER

  Output as of this commit:

      SUPER

  Output in 5.12 to 5.21.3:

      Can't locate object method "foo" via package "SUPER" at - line 3.
* Remove flagging OP_READLINE with OPf_SPECIAL [Rafael Garcia-Suarez, 2014-07-24, 1 file, -1/+0]
  This was used to distinguish forms <FILE> from <$file>, but doesn't seem to be used anymore by anything.
* Fix typo in op.h [Aaron Crane, 2014-07-09, 1 file, -1/+1]
  This caused all OP structures to be larger than intended; for example, it made `struct op` 48 bytes rather than 40 on Mac OS X x86-64.
* add op_lastsib and -DPERL_OP_PARENT [David Mitchell, 2014-07-08, 1 file, -5/+13]
  Add the boolean field op_lastsib to OPs. Within the core, this is set on the last op in an op_sibling chain (so it is synonymous with op_sibling being null). By default, its value is set but not used.

  In addition, add a new build define (not yet enabled by default), -DPERL_OP_PARENT, that forces the core to use op_lastsib to detect the last op in a sibling chain, rather than op_sibling being NULL. This frees up the last op_sibling pointer in the chain, which rather than being set to NULL, is now set to point back to the parent of the sibling chain (if any).

  This commit also adds a C-level op_parent() function and B parent() method; under default builds they just return NULL, under PERL_OP_PARENT they return the parent of the current op.

  Collectively this provides a facility not previously available from B:: nor C, of being able to follow an op tree up as well as down.
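  The PERL_OP_PARENT scheme described here can be modelled on a toy struct. This is a sketch under stated assumptions: SkOp, sk_sibling() and sk_parent() are hypothetical names, and the model assumes every sibling chain is well formed (it always ends in an op with op_lastsib set).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of an op; only the fields relevant to the scheme are kept. */
typedef struct sk_op SkOp;
struct sk_op {
    SkOp *op_sibling;        /* next sibling, or the parent if op_lastsib */
    SkOp *op_first;          /* first child, if any */
    unsigned op_lastsib : 1; /* set on the last op of a sibling chain */
};

/* Next sibling, or NULL at the end of the chain. */
static SkOp *sk_sibling(const SkOp *o)
{
    return o->op_lastsib ? NULL : o->op_sibling;
}

/* Walk to the last sibling; its op_sibling slot holds the parent (or NULL). */
static SkOp *sk_parent(SkOp *o)
{
    while (!o->op_lastsib)
        o = o->op_sibling;
    return o->op_sibling;
}
```

  Because the flag, not a NULL pointer, marks the end of the chain, the last pointer is free to carry the way back up the tree.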
* wrap op_sibling field access in OP_SIBLING* macros [David Mitchell, 2014-07-08, 1 file, -0/+13]
  Remove (almost all) direct access to the op_sibling field of OP structs, and use these three new macros instead:

      OP_SIBLING(o);
      OP_HAS_SIBLING(o);
      OP_SIBLING_set(o, new_value);

  OP_HAS_SIBLING is intended to be a slightly more efficient version of OP_SIBLING when only boolean context is needed.

  For now these three macros are just defined in the obvious way:

      #define OP_SIBLING(o)          (0 + (o)->op_sibling)
      #define OP_HAS_SIBLING(o)      (cBOOL((o)->op_sibling))
      #define OP_SIBLING_set(o, sib) ((o)->op_sibling = (sib))

  but abstracting them out will allow us shortly to make the last pointer in an op_sibling chain point back to the parent rather than being null, with a new flag indicating whether this is the last op.

  Perl_ck_fun() still has a couple of direct uses of op_sibling, since it takes the field's address, which is not covered by these macros.
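  How calling code looks once it only goes through such accessors can be sketched on a toy struct. The SK_-prefixed macros and SkOp are hypothetical stand-ins: the read accessor here uses a cast instead of the commit's `0 +` form, and a plain comparison replaces cBOOL, so this is an illustration of the abstraction rather than the committed definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy op with just a sibling pointer. */
typedef struct sk_op SkOp;
struct sk_op { SkOp *op_sibling; };

/* Accessor macros modelled on the three described in the commit. */
#define SK_OP_SIBLING(o)          ((SkOp *)(o)->op_sibling)
#define SK_OP_HAS_SIBLING(o)      ((o)->op_sibling != NULL)
#define SK_OP_SIBLING_set(o, sib) ((o)->op_sibling = (sib))

/* Counts the ops in a sibling chain without touching the field directly. */
static int sk_chain_length(SkOp *first)
{
    int n = 0;
    SkOp *o;
    for (o = first; o != NULL; o = SK_OP_SIBLING(o))
        n++;
    return n;
}
```

  Since sk_chain_length() never names the field, the macros could later change how the end of a chain is represented without this caller being edited.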
* Remove MAD. [Jarkko Hietaniemi, 2014-06-13, 1 file, -104/+0]
  MAD = Misc Attribute Decoration; unmaintained attempt at preserving the Perl parse tree more faithfully so that automatic conversion to Perl 6 would have been easier.
* Trailing comma in enum is not C89. [Jarkko Hietaniemi, 2014-05-28, 1 file, -1/+1]
* Macro for common OP checks: "is this X or was it before NULLing?" [Steffen Mueller, 2014-02-26, 1 file, -3/+38]
  For example,

      if (OP_TYPE_IS_OR_WAS(o, OP_LIST)) ...

  is now available instead of either of the following:

      if ( o
           && ( o->op_type == OP_LIST
                || (o->op_type == OP_NULL && o->op_targ == OP_LIST) ) )
          ...

      if ( o && (o->op_type == OP_NULL ? o->op_targ : o->op_type) == OP_LIST )
          ...

  In case the above logic is a bit unclear: It checks whether that OP is an OP_LIST or used to be one before being NULLed using op_null. (FTR, the resulting OP_NULLs have their op_targ set to the old OP type).

  This sort of check (and its reverse "isn't and didn't use to be") are a relatively common pattern in the part of op.c that tries to intuit structures from optimization-mangled OP trees. Hopefully, using these macros will make some code a fair amount clearer.
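  The check can be exercised on a toy op. This is a sketch, not the macro from op.h: SkOp, sk_op_null() and SK_OP_TYPE_IS_OR_WAS are hypothetical names, and only the rule that nulling stores the old type in op_targ comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

enum { SK_OP_NULL = 0, SK_OP_LIST = 1, SK_OP_CONST = 2 };

/* Toy op: just the two fields the check looks at. */
typedef struct { int op_type; int op_targ; } SkOp;

/* Models op_null(): remember the old type in op_targ, then null the op. */
static void sk_op_null(SkOp *o)
{
    o->op_targ = o->op_type;
    o->op_type = SK_OP_NULL;
}

/* True if `o` is non-NULL and is, or was before nulling, of type `type`.
 * Note that `o` is evaluated more than once. */
#define SK_OP_TYPE_IS_OR_WAS(o, type) \
    ((o) && ((o)->op_type == SK_OP_NULL ? (o)->op_targ : (o)->op_type) == (type))
```

  The same expression answers both "is a list" and "was a list", which is the case the longer hand-written conditions had to spell out.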
* Change av_len calls to av_tindex for clarity [Karl Williamson, 2014-02-20, 1 file, -1/+1]
  av_tindex is a more clearly named synonym for av_len, available starting in v5.18. This changes the core uses to it, including modules in /ext, which are not dual-lifed.
* subroutine signatures [Zefram, 2014-02-01, 1 file, -1/+1]
  Declarative syntax to unwrap argument list into lexical variables. "sub foo ($a,$b) {...}" checks number of arguments and puts the arguments into lexical variables. Signatures are not equivalent to the existing idiom of "sub foo { my($a,$b) = @_; ... }". Signatures are only available by enabling a non-default feature, and generate warnings about being experimental. The syntactic clash with prototypes is managed by disabling the short prototype syntax when signatures are enabled.
* perlapi: Consistent spaces after dots [Father Chrysostomos, 2013-12-29, 1 file, -21/+23]
  plus some typo fixes. I probably changed some things in perlintern, too.
* [perl #115736] fix undocumented param from newATTRSUB_flags [Daniel Dragan, 2013-12-23, 1 file, -1/+2]
  flags param was poorly designed and didn't have a formal api. Replace it with the bool it really is. See #115736 for details.
* Optimise out PUSHMARK/RETURN if return is the last statement in a sub. [Matthew Horsfall, 2013-12-13, 1 file, -0/+6]
  This makes:

      sub baz { return $cat; }

  Behave like:

      sub baz { $cat; }

  Which is notably faster.

  Unpatched:

      ./perl -Ilib/ ~/stuff/bench.pl
      Benchmark: timing 40000000 iterations of normal, ret...
      normal: 3 wallclock secs ( 1.60 usr + 0.01 sys = 1.61 CPU) @ 24844720.50/s (n=40000000)
      ret: 3 wallclock secs ( 2.08 usr + 0.00 sys = 2.08 CPU) @ 19230769.23/s (n=40000000)

  Patched:

      ./perl -Ilib ~/stuff/bench.pl
      Benchmark: timing 40000000 iterations of aret, normal...
      normal: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 23255813.95/s (n=40000000)
      ret: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 23255813.95/s (n=40000000)

  The difference in OP trees can be seen here:

  Unpatched:

      $ perl -MO=Concise,baz -e 'sub baz { return $cat }'
      main::baz:
      5  <1> leavesub[1 ref] K/REFC,1 ->(end)
      -     <@> lineseq KP ->5
      1        <;> nextstate(main 1 -e:1) v ->2
      4        <@> return K ->5
      2           <0> pushmark s ->3
      -           <1> ex-rv2sv sK/1 ->4
      3              <#> gvsv[*cat] s ->4
      -e syntax OK

  Patched:

      $ ./perl -Ilib -MO=Concise,baz -e 'sub baz { return $cat }'
      main::baz:
      3  <1> leavesub[1 ref] K/REFC,1 ->(end)
      -     <@> lineseq KP ->3
      1        <;> nextstate(main 1 -e:1) v ->2
      -        <@> return K ->-
      -           <0> pushmark s ->2
      -           <1> ex-rv2sv sK/1 ->-
      2              <$> gvsv(*cat) s ->3
      -e syntax OK

  (Includes some modifications from Steffen)
* fix multi-eval of Perl_custom_op_xop in XopENTRY [Daniel Dragan, 2013-11-10, 1 file, -5/+38]
  Commit 1830b3d9c8 introduced a flaw where XopENTRY calls Perl_custom_op_xop twice to retrieve the same XOP *. This is inefficient and causes extra machine code. Since I found no CPAN or upstream=blead usage of Perl_custom_op_xop, and its previous docs say it isn't 100% public, it is being converted to a macro. Most usage of Perl_custom_op_xop is to conditionally fetch a member of the XOP struct, which was previously implemented by XopENTRY. Move the XopENTRY logic and picking defaults to an expanded version of Perl_custom_op_xop. The union allows Perl_custom_op_get_field to return its result in 1 register, since the union is similar to a void * or IV, but with the machine code overhead of casting, if any, being done in the callee (Perl_custom_op_get_field), not the caller. Perl_custom_op_get_field can also return the XOP * without looking inside it to implement Perl_custom_op_xop.

  XopENTRYCUSTOM is a wrapper around Perl_custom_op_get_field with XopENTRY-like usage. XopENTRY is used by the OP_* macros, which are heavily used (but rarely called, since custom ops are rare) by Perl lang warnings system. The vararg warning arguments are usually evaluated no matter if the warning will be printed to STDERR or not. Since some people like to ignore warnings or run no strict; and warnings branches are frequent in pp_*, it is beneficial to make the OP_* macros smaller in machine code. The design of Perl_custom_op_get_field supports these goals.

  This commit does not pass judgement on Ben Morrow's unclear public or private API designation of Perl_custom_op_xop, and whether Perl_custom_op_xop should be deprecated and removed from public API. It was trivial to leave a form of Perl_custom_op_xop in the new design.

  XOPe enums are identical to XOPf constants so no conversion has to be done between the field selector parameter and the field flag to test in machine code.

  ASSUME and NOT_REACHED are being introduced. The closest to the 2 previously was "assert(0)". Perl has not used ASSUME or CC specific versions of it before. Clang, GCC >= 4.5, and Visual C are supported. For completeness, ARMCC's __promise was added, but Perl is not known to have any support for ARMCC by this committer.

  This patch is part of perl #115032.
* Make &CORE::exit respect vmsish exit hint [Father Chrysostomos, 2013-11-08, 1 file, -3/+0]
  by removing the hint from the exit op itself and just having pp_exit look in the cop hint hash, where it is already stored (as a result of having been in %^H at compile time).

  &CORE:: subs intentionally lack a nextstate op (cop) so they can see the hints in the caller’s nextstate op.
* Fix &CORE::exit/die under vmsish "hushed" [Father Chrysostomos, 2013-11-08, 1 file, -2/+9]
  This commit makes them behave like exit and die without the ampersand by moving the OPpHUSH_VMSISH hint from exit/die op to the current statement (nextstate/cop) instead.

  &CORE:: subs intentionally lack a nextstate op, so they can see the hints in the caller’s nextstate op.
* Warn for all uses of %hash{...} in scalar cx [Father Chrysostomos, 2013-11-08, 1 file, -2/+2]
  and reword the warning slightly. See <20131027204944.20489.qmail@lists-nntp.develooper.com>.

  To avoid getting a warning about scalar context for ‘delete %a[1,2]’, which dies anyway, I stopped scalar context from being applied to delete’s argument. Scalar context is not meaningful here anyway, and the context is not really scalar.

  This also means that ‘delete sort’ no longer produces a warning about scalar context before dying, so I added a test for that.
* Fix bare blocks in lvalue subs [Father Chrysostomos, 2013-10-24, 1 file, -1/+1]
  If a bare block is the last thing in an lvalue sub, OP_LEAVELOOP needs to propagate lvalue context and handle returned arrays properly, just as OP_LEAVE has done since yesterday.

  This is a follow-up to 2ec7f6f24289.

  This came up in ticket #119797.
* Add OPpLVALUE flag [Father Chrysostomos, 2013-10-23, 1 file, -0/+3]
  This flag will be used by the next commit.
* Remove OPpCONST_FOLDED [Father Chrysostomos, 2013-09-16, 1 file, -2/+0]
  Now that we have op->op_folded, we don’t need OPpCONST_FOLDED any more. In removing it, I modified B::Concise to output op_folded the way OPpCONST_FOLDED was output before, since it can be helpful to have it when reading op dumps.
* op.h: Describe entersub and rv2cv flags [Father Chrysostomos, 2013-09-15, 1 file, -0/+26]
  Entersub ops can turn into rv2cv ops at compile time. It is important that possible flag conflicts are handled properly. Since it took me a while to figure out that there were no bugs here, here are my own notes, cleaned up and nicely formatted.
* Reduce false positives for @hsh{$s} and @ary[$s] warnings [Father Chrysostomos, 2013-09-14, 1 file, -1/+1]
  This resolves tickets #28380 and #114024.

  Commit 95a31aad5 did something similar to this for the new %hash{...} syntax. This commit extends it to @ slices and combines the two code paths.

  The heuristics in toke.c can easily produce false positives. So the op is flagged as being a candidate for the warning. Then when op.c has the op tree available, it examines it to see whether the heuristic may have been a false positive.

  This avoids bugs with qw "foo bar baz" and sub calls triggering the warning.

  The source code is no longer available for the warning, so we reconstruct it from the op tree, skipping the subscript if it is anything other than a const op.

  This means that @hash{$foo} comes out as @hash{...} and @hash{foo} as @hash{"foo"}. It also means that @hash{"]"} is displayed correctly instead of as @hash{"].

  Commit 95a31aad5 also modified the heuristic for %hash{...} to exempt qw altogether. But it did not exempt it if it was preceded by a tab. So this commit rectifies that.

  This commit also improves the false positive detection by exempting any ops returning lists that can get past toke.c’s heuristic. I went through the entire list of ops, but I may have missed some.

  Also, @ slices on the lhs of = are exempt, as they change the context and are hence actually useful.
* Fewer false positives for %hash{$scalar} warning [Father Chrysostomos, 2013-09-13, 1 file, -0/+2]
  Instead of warning in the lexer, flag the op and then warn in op.c, when the op tree is available, so we don’t end up warning for actual lists or for sub calls.

  Also, only warn in scalar context, as in list context $hash{$scalar} and %hash{$scalar} do different things.

  In op.c we no longer have easy access to the source code, so reconstruct the hash/array access based on the op tree. This means %hash{foo} becomes %hash{"foo"}. We only reconstruct constant keys, so %hash{++$x} becomes %hash{...}. This also corrects erroneous dumps, like %hash{"} for %hash{"}"}.

  Instead of triggering the warning solely based on the op tree, we still keep the heuristic in toke.c, so that common workarounds for that warning (e.g., {q<key>} and {("key")}) continue to work.

  The heuristic in toke.c is tweaked to avoid warning for qw().

  In a future commit I plan to extend this to the existing @array[0] and @hash{key} warnings, to avoid false positives.
* pp_goto: document the different branches [David Mitchell, 2013-09-06, 1 file, -0/+1]
  The various different forms of goto (and dump) take different branches through this big function. Document which branches handle which variants.

  Also document the use of OPf_SPECIAL in OP_DUMP.
* Use SSize_t for arrays [Father Chrysostomos, 2013-08-25, 1 file, -1/+1]
  Make the array interface 64-bit safe by using SSize_t instead of I32 for array indices.

  This is based on a patch by Chip Salzenberg.

  This completes what the previous commit began when it changed av_extend.
* op.c: Add op_folded to BASEOP [Niels Thykier, 2013-07-19, 1 file, -2/+5]
  Add a new member, op_folded, to BASEOP. It is a replacement for OPpCONST_FOLDED (which can only be set on OP_CONST). At the moment OPpCONST_FOLDED remains, as it is exposed in B (e.g. B::Concise relies on it).

  Signed-off-by: Niels Thykier <niels@thykier.net>
* Stop using IV in pmop; remove workaround [Father Chrysostomos, 2013-07-06, 1 file, -1/+1]
  See ticket #118055 for all the detail.

  On systems where IV is bigger than a pointer, the slab allocator messes things up because it only provides pointer alignment. If pmops have an IV field, we cannot allocate them via slab on such systems.

  Pmops actually don’t need an IV, just a PADOFFSET. So we can change them and remove the workaround.

  This is obviously not suitable for maint.
* Stop split from mangling constants [Father Chrysostomos, 2013-06-22, 1 file, -0/+3]
  At compile time, if split occurs on the right-hand side of an assignment to a list of scalars, if the limit argument is a constant containing the number 0 then it is modified in place to hold one more than the number of scalars.

  This means ‘constants’ can change their values, if they happen to be in the wrong place at the wrong time:

      $ ./perl -Ilib -le 'use constant NULL => 0; ($a,$b,$c) = split //, $foo, NULL; print NULL'
      4

  I considered checking the reference count on the SV, but since XS code could create its own const ops with weak references to the same constants elsewhere, the safest way to avoid modifying someone else’s SV is to mark the split op in ck_split so we know the SV belongs to that split op alone.

  Also, to be on the safe side, turn off the read-only flag before modifying the SV, instead of relying on the special case for compile time in sv_force_normal.
* op.h: Correct comment about entersub strictures [Father Chrysostomos, 2013-06-03, 1 file, -1/+1]
  Flag 2 on entersub is HINT_STRICT_REFS, not _SUBS.
* Eliminate PL_reg_state.re_reparsing, part 1 [David Mitchell, 2013-04-12, 1 file, -0/+1]
  PL_reg_state.re_reparsing is a hacky flag used to allow runtime code blocks to be included in patterns. Basically, since code blocks are now handled by the perl parser within literal patterns, runtime patterns are handled by taking the (assembled at runtime) pattern, and feeding it back through the parser via the equivalent of

      eval q{qr'the_pattern'},

  so that run-time (?{..})'s appear to be literal code blocks. When this happens, the global flag PL_reg_state.re_reparsing is set, which modifies lexing and parsing in minor ways (such as whether \\ is stripped).

  Now, I'm in the slow process of trying to eliminate global regex state (i.e. gradually removing the fields of PL_reg_state), and also a change which will be coming a few commits ahead requires the info which this flag indicates to linger for longer (currently it is cleared immediately after the call to scan_str()).

  For those two reasons, this commit adds a new mechanism to indicate this: a new flag to eval_sv(), G_RE_REPARSING (which sets OPpEVAL_RE_REPARSING in the entereval op), which sets the EVAL_RE_REPARSING bit in PL_in_eval.

  It's still a yukky global flag hack, but it's a *different* global flag hack now.

  For this commit, we add the new flag(s) but keep the old PL_reg_state.re_reparsing flag and assert that the two mechanisms always match. The next commit will remove re_reparsing.
* rework split() special case interaction with regex engine [Yves Orton, 2013-03-27, 1 file, -1/+1]
  This patch resolves several issues at once. The parts are sufficiently interconnected that it is hard to break it down into smaller commits. The tickets open for these issues are:

      RT #94490 - split and constant folding
      RT #116086 - split "\x20" doesn't work as documented

  It additionally corrects some issues with cached regexes that were exposed by the split changes (and applied to them).

  It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171 and cccd1425414e6518c1fc8b7bcaccfb119320c513.

  Prior to this patch the special RXf_SKIPWHITE behavior of

      split(" ", $thing)

  was only available if Perl could resolve the first argument to split at compile time, meaning under various arcane situations.

  This manifested as oddities like

      my $delim = $cond ? " " : qr/\s+/;
      split $delim, $string;

  and

      split $cond ? " " : qr/\s+/, $string

  not behaving the same as:

      ($cond ? split(" ", $string) : split(/\s+/, $string))

  which isn't very convenient.

  This patch changes this by adding a new flag to the op_pmflags, PMf_SPLIT which enables pp_regcomp() to know whether it was called as part of split, which allows the RXf_SPLIT to be passed into run time regex compilation. We also preserve the original flags so pattern caching works properly, by adding a new property to the regexp structure, "compflags", and related macros for accessing it. We preserve the original flags passed into the compilation process, so we can compare when we are trying to decide if we need to recompile.

  Note that this is essentially the opposite fix from the one applied originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171. The reverted patch was meant to make:

      split( 0 || " ", $thing )              #1

  consistent with

      my $x=0; split( $x || " ", $thing )    #2

  and not with

      split( " ", $thing )                   #3

  This was reverted because it broke C<split("\x{20}", $thing)>, and because one might argue that it is not that #1 does the wrong thing, but rather that the behavior of #2 is wrong. In other words we might expect that all three should behave the same as #3, and that instead of "fixing" the behavior of #1 to be like #2, we should really fix the behavior of #2 to behave like #3. (Which is what we did.)

  Also, it doesn't make sense to move the special case detection logic further from the regex engine. We really want the regex engine to decide this stuff itself, otherwise split " ", ... wouldn't work properly with an alternate engine. (Imagine we add a special regexp meta pattern that behaves the same as " " does in a split /.../. For instance we might make split /(*SPLITWHITE)/ trigger the same behavior as split " ".)

  The other major change as result of this patch is it effectively reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE, and free up bits in the regex flags structure.

  But we don't want to get rid of these vars, and it turns out that RXf_SEEN_LOOKBEHIND is used only in the same situation as the new RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to RXf_NO_INPLACE_SUBST, and then instead of using two vars we use only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to have their bits back.
* make m?$pat? match only once under ithreads [David Mitchell, 2013-01-04, 1 file, -1/+1]
  [perl #115080]

  m?...? is only supposed to match once, until reset. Normally this is done by setting the PMf_USED flag on the PMOP. Under ithreads we can't modify ops, so instead we indicate by setting the regex's SV to readonly. (This is a bit of a hack: the flag should be associated with the PMOP, not the regex).

  This breaks with run-time regexes when the pattern gets recompiled; for example:

      for my $c (qw(a b c)) {
          print "matched $c\n" if $c =~ m?^$c$?;
      }

  outputs

      matched a

  on unthreaded, but

      matched a
      matched b
      matched c

  on threaded.

  The re_eval jumbo fix made this more noticeable by sometimes recompiling even when the pattern text hasn't changed (to make closures work ok).

  The quick fix is to propagate the readonlyness of the old re to the new re. (The proper fix would be to store the flag state in a pad slot associated with the PMOP). Needless to say, I've gone for the quick fix.
* fix warning in PmopSTASH_set() [David Mitchell, 2012-12-10, 1 file, -1/+1]
  In the threaded version of PmopSTASH_set(), the assigned value is a PADOFFSET, not a pointer; so use 0 rather than NULL for the default value. This keeps clang happy.
* Stop renamed packages from making reset() crash [Father Chrysostomos, 2012-12-05, 1 file, -34/+11]
  This only affected threaded builds. I think the comments in the added test explain well enough what was happening.

  The solution is to store a stashpad offset in the pmop, instead of the name of the stash. This is similar to what was done with cop stashes in d4d03940c58a.

  Not only does this fix the crash, but it also makes compilation faster and saves memory (no separate malloc for every m?pat?).

  I had to move Safefree(PL_stashpad) later on in perl_destruct, because freeing a pmop causes the PL_stashpad to be accessed, and pmops can be freed during sv_clean_all. Its previous location was not a problem for cops, as PL_stashpad[cop->cop_stashoff] is only accessed when PL_curcop==that_cop and Perl code is running, not when cops are freed.
* Don’t share TARGs between recursive ops (Father Chrysostomos, 2012-11-27, 1 file, -1/+2)

    I had to change the definition of IS_PADCONST to account for the
    SVf_IsCOW flag. Previously, anything marked READONLY would be
    considered a pad constant, to be shared by pads of different recursion
    levels. Some of those READONLY things were not actually read-only, as
    they were copy-on-write scalars, which are never read-only. So I
    changed the definition of IS_PADCONST in e3918bb703c to accept COWs as
    well as read-only scalars, since I was removing the READONLY flag from
    COWs.

    With the new copy-on-write scheme, it is easy for a TARG to turn into
    a COW. If that happens and then the same subroutine calls itself
    recursively for the first time after that, pad_push will see that this
    is a pad ‘constant’ and allow the next recursion level to share it. If
    pp_concat calls itself recursively, the recursive call can modify the
    scalar the outer call is in the middle of using, causing the return
    value to be doubled up (‘tmptmp’) in the test case added here.

    Since pad constants are marked PADTMP (I would like to change that
    eventually), there is no way to distinguish them from TARGs when they
    are COWs, except for the fact that pad constants that are COWs are
    always shared hash keys (SvLEN==0).
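The distinction drawn in the last paragraph can be written as a small predicate. This is a sketch only: `ToySV`, `TOY_READONLY`, `TOY_ISCOW` and `toy_is_padconst` are hypothetical names, and the predicate is a reading of the prose above, not a copy of the IS_PADCONST macro in op.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; not Perl's SVf_* values. */
#define TOY_READONLY (1u << 0)
#define TOY_ISCOW    (1u << 1)

/* Hypothetical scalar: flags plus the allocated buffer length,
 * which is 0 for a shared hash key. */
typedef struct {
    unsigned flags;
    size_t   len;
} ToySV;

/* A pad entry may be shared across recursion levels if it is truly
 * read-only, or if it is a COW that is a shared hash key (len == 0).
 * A TARG that merely became a COW has its own buffer and is rejected. */
bool toy_is_padconst(const ToySV *sv) {
    assert(sv != NULL);
    if (sv->flags & TOY_READONLY)
        return true;
    return (sv->flags & TOY_ISCOW) && sv->len == 0;
}
```

Under this reading, a COW with a non-zero buffer length is treated as a TARG and gets a fresh copy at each recursion level.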
* SVf_IsCOW (Father Chrysostomos, 2012-11-14, 1 file, -1/+1)

    As discussed in ticket #114820, instead of using READONLY+FAKE to mark
    a copy-on-write string, we should make it a separate flag.

    There are many modules in CPAN (and 1 in core, Compress::Raw::Zlib)
    that assume that SvREADONLY means read-only. Only one CPAN module,
    POSIX::pselect, will definitely be broken by this. Others may need to
    be tweaked. But I believe this is for the better.

    It causes all tests except ext/Devel-Peek/t/Peek.t (which needs a tiny
    tweak still) to pass under PERL_OLD_COPY_ON_WRITE, which is a
    prerequisite for any new COW scheme that creates COWs under the same
    circumstances.
* padrange: handle @_ directly (David Mitchell, 2012-11-10, 1 file, -0/+1)

    In a construct like

        my ($x,$y) = @_

    the pushmark/padsv/padsv is already optimised into a single padrange
    op. This commit makes the OPf_SPECIAL flag on the padrange op indicate
    that in addition, @_ should be pushed onto the stack, skipping an
    additional pushmark/gv[*_]/rv2av combination.

    So in total (including the earlier padrange work), the above construct
    goes from being

        3 <0> pushmark s
        4 <$> gv(*_) s
        5 <1> rv2av[t3] lK/1
        6 <0> pushmark sRM*/128
        7 <0> padsv[$x:1,2] lRM*/LVINTRO
        8 <0> padsv[$y:1,2] lRM*/LVINTRO
        9 <2> aassign[t4] vKS

    to

        3 <0> padrange[$x:1,2; $y:1,2] l*/LVINTRO,2 ->4
        4 <2> aassign[t4] vKS
* add SAVEt_CLEARPADRANGE (David Mitchell, 2012-11-10, 1 file, -1/+2)

    Add a new save type that does the equivalent of multiple
    SAVEt_CLEARSV's for a given target range. This makes the new padrange
    op more efficient.
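The idea of one save record covering a whole target range can be sketched as follows. The names `ToyPad`, `ToySaveRange`, `toy_save_clearpadrange` and `toy_leave_scope` are hypothetical, and representing the record as a plain (base, count) pair is an assumption for illustration; it says nothing about how Perl actually encodes the entry on its savestack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAD_SIZE 16

/* Hypothetical pad: one "live" bit per lexical slot. */
typedef struct { bool live[TOY_PAD_SIZE]; } ToyPad;

/* One save record describing a run of adjacent pad slots. */
typedef struct {
    size_t base;  /* first slot in the range */
    size_t count; /* number of slots */
} ToySaveRange;

/* Build a single record instead of `count` per-slot records. */
ToySaveRange toy_save_clearpadrange(size_t base, size_t count) {
    assert(count > 0 && base + count <= TOY_PAD_SIZE);
    ToySaveRange r = {base, count};
    return r;
}

/* On scope exit, clear every slot the record covers in one loop. */
void toy_leave_scope(ToyPad *pad, ToySaveRange r) {
    for (size_t i = 0; i < r.count; i++)
        pad->live[r.base + i] = false;
}
```

The saving comes from pushing and popping one record for the whole range; the per-slot clearing work still happens, but inside a single loop.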
* add padrange op (David Mitchell, 2012-11-10, 1 file, -0/+4)

    This single op can, in some circumstances, replace the sequence of a
    pushmark followed by one or more padsv/padav/padhv ops, and possibly a
    trailing 'list' op, but only where the targs of the pad ops form a
    continuous range.

    This is generally more efficient, but is particularly so in the case
    of void-context my declarations, such as:

        my ($a,@b);

    Formerly this would be executed as the following set of ops:

        pushmark   pushes a new mark
        padsv[$a]  pushes $a, does a SAVEt_CLEARSV
        padav[@b]  pushes all the flattened elements (i.e. none) of @b,
                   does a SAVEt_CLEARSV
        list       pops the mark, and pops all stack elements except the last
        nextstate  pops the remaining stack element

    It's now:

        padrange[$a..@b]  does two SAVEt_CLEARSV's
        nextstate         nothing needing doing to the stack

    Note that in the case above, this commit changes user-visible
    behaviour in pathological cases; in particular, it has always been
    possible to modify a lexical var *before* the my is executed, using
    goto or closure tricks. So in principle someone could tie an array,
    then could notice that FETCH is no longer being called, e.g.

        f();
        my ($s, @a); # this no longer triggers two FETCHES
        sub f {
            tie @a, ...;
            push @a, 1,2;
        }

    But I think we can live with that.

    Note also that having a padrange operator will allow us shortly to
    have a corresponding SAVEt_CLEARPADRANGE save type, that will replace
    multiple individual SAVEt_CLEARSV's.
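The precondition that the targs form a continuous range amounts to a simple check over the pad offsets of the candidate ops. The function below is a hypothetical sketch (`toy_padrange_span` is not a Perl function), and it assumes the offsets are given in op order; the real peephole optimiser checks further conditions that are not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Given the pad offsets (targs) of consecutive pad ops, report whether
 * they form an unbroken ascending run. On success the run is described
 * by *base and *count; on failure both are left untouched. */
bool toy_padrange_span(const size_t *targs, size_t n,
                       size_t *base, size_t *count) {
    if (n == 0)
        return false;
    assert(targs != NULL && base != NULL && count != NULL);
    for (size_t i = 1; i < n; i++)
        if (targs[i] != targs[0] + i)
            return false; /* gap or reordering: cannot be merged */
    *base = targs[0];
    *count = n;
    return true;
}
```

For `my ($a,@b)` with the two lexicals in adjacent pad slots, the check succeeds and the pair can be described by a single (base, count) span.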