summaryrefslogtreecommitdiff
path: root/op.h
Commit message (Collapse)AuthorAgeFilesLines
* fix multi-eval of Perl_custom_op_xop in XopENTRYDaniel Dragan2013-11-101-5/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 1830b3d9c8 introduced a flaw where XopENTRY calls Perl_custom_op_xop twice to retrieve the same XOP *. This is inefficient and causes extra machine code. Since I found no CPAN or upstream=blead usage of Perl_custom_op_xop, and its previous docs say it isn't 100% public, it is being converted to a macro. Most usage of Perl_custom_op_xop is to conditionally fetch a member of the XOP struct, which was previously implemented by XopENTRY. Move the XopENTRY logic and picking defaults to an expanded version of Perl_custom_op_xop. The union allows Perl_custom_op_get_field to return its result in 1 register, since the union is similar to a void * or IV, but with the machine code overhead of casting, if any, being done in the callee (Perl_custom_op_get_field), not the caller. Perl_custom_op_get_field can also return the XOP * without looking inside it to implement Perl_custom_op_xop. XopENTRYCUSTOM is a wrapper around Perl_custom_op_get_field with XopENTRY-like usage. XopENTRY is used by the OP_* macros, which are heavily used (but rarely called, since custom ops are rare) by Perl lang warnings system. The vararg warning arguments are usually evaluted no matter if the warning will be printed to STDERR or not. Since some people like to ignore warnings or run no strict; and warnings branches are frequent in pp_*, it is beneficial to make the OP_* macros smaller in machine code. The design of Perl_custom_op_get_field supports these goals. This commit does not pass judgement on Ben Morrow's unclear public or private API designation of Perl_custom_op_xop, and whether Perl_custom_op_xop should deprecated and removed from public API. It was trivial to leave a form of Perl_custom_op_xop in the new design. XOPe enums are identical to XOPf constants so no conversion has to be done between the field selector parameter and the field flag to test in machine code. ASSUME and NOT_REACHED are being introduced. The closest to the 2 previously was "assert(0)". Perl has not used ASSUME or CC specific versions of it before. Clang, GCC >= 4.5, and Visual C are supported. For completeness, ARMCC's __promise was added, but Perl is not known to have any support for ARMCC by this commiter. This patch is part of perl #115032.
* Make &CORE::exit respect vmsish exit hintFather Chrysostomos2013-11-081-3/+0
| | | | | | | | | by removing the hint from the exit op itself and just having pp_exit look in the cop hint hash, where it is already stored (as a result of having been in %^H at compile time). &CORE:: subs intentionally lack a nextstate op (cop) so they can see the hints in the caller’s nextstate op.
* Fix &CORE::exit/die under vmsish "hushed"Father Chrysostomos2013-11-081-2/+9
| | | | | | | This commit makes them behave like exit and die without the ampersand by moving the OPpHUSH_VMSISH hint from exit/die op to the current statement (nextstate/cop) instead. &CORE:: subs intentionally lack a nextstate op, so they can see the hints in the caller’s nextstate op.
* Warn for all uses of %hash{...} in scalar cxFather Chrysostomos2013-11-081-2/+2
| | | | | | | | | | | | | | and reword the warning slightly. See <20131027204944.20489.qmail@lists-nntp.develooper.com>. To avoid getting a warning about scalar context for ‘delete %a[1,2]’, which dies anyway, I stopped scalar context from being applied to delete’s argument. Scalar context is not meaningful here anyway, and the context is not really scalar. This also means that ‘delete sort’ no longer produces a warning about scalar context before dying, so I added a test for that.
* Fix bare blocks in lvalue subsFather Chrysostomos2013-10-241-1/+1
| | | | | | | | If a bare block is the last thing in an lvalue sub, OP_LEAVELOOP needs to propagate lvalue context and handle returned arrays properly, just as OP_LEAVE has done since yesterday. This is a follow-up to 2ec7f6f24289. This came up in ticket #119797.
* Add OPpLVALUE flagFather Chrysostomos2013-10-231-0/+3
| | | | This flag will be used by the next commit.
* Remove OPpCONST_FOLDEDFather Chrysostomos2013-09-161-2/+0
| | | | | | | Now that we have op->op_folded, we don’t need OPpCONST_FOLDED any more. In removing it, I modified B::Concise to output op_folded the way OPpCONST_FOLDED was output before, since it can be helpful to have it when reading op dumps.
* op.h: Describe entersub and rv2cv flagsFather Chrysostomos2013-09-151-0/+26
| | | | | | | Entersub ops can turn into rv2cv ops at compile time. It is important that possible flag conflicts are handled properly. Since it took me a while to figure out that there were no bugs here, here are my own notes, cleaned up and nicely formatted.
* Reduce false positives for @hsh{$s} and @ary[$s] warningsFather Chrysostomos2013-09-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This resolves tickets #28380 and #114024. Commit 95a31aad5 did something similar to this for the new %hash{...} syntax. This commit extends it to @ slices and combines the two code paths. The heuristics in toke.c can easily produce false positives. So the op is flagged as being a candidate for the warning. Then when op.c has the op tree available, it examines it to see whether the heuristic may have been a false positive. This avoids bugs with qw "foo bar baz" and sub calls triggering the warning. The source code is no longer available for the warning, so we recon- struct it from the op tree, skipping the subscript if it is anything other than a const op. This means that @hash{$foo} comes out as @hash{...} and @hash{foo} as @hash{"foo"}. It also meeans that @hash{"]"} is displayed correctly instead of as @hash{"]. Commit 95a31aad5 also modified the heuristic for %hash{...} to exempt qw altogether. But it did not exempt it if it was preceded by a tab. So this commit rectifies that. This commit also improves the false positive detection by exempting any ops returning lists that can get past toke.c’s heuristic. I went through the entire list of ops, but I may have missed some. Also, @ slices on the lhs of = are exempt, as they change the context and are hence actually useful.
* Fewer false positives for %hash{$scalar} warningFather Chrysostomos2013-09-131-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Instead of warning in the lexer, flag the op and then warn in op.c, when the op tree is available, so we don’t end up warning for actual lists or for sub calls. Also, only warn in scalar context, as in list context $hash{$scalar} and %hash{$scalar} do different things. In op.c we no longer have easy access to the source code, so recon- struct the hash/array access based on the op tree. This means %hash{foo} becomes %hash{"foo"}. We only reconstruct constant keys, so %hash{++$x} becomes %hash{...}. This also corrects erroneous dumps, like %hash{"} for %hash{"}"}. Instead of triggering the warning solely based on the op tree, we still keep the heuristic in toke.c, so that common workarounds for that warning (e.g., {q<key>} and {("key")}) continue to work. The heuristic in toke.c is tweaked to avoid warning for qw(). In a future commit I plan to extend this to the existing @array[0] and @hash{key} warnings, to avoid false positives.
* pp_goto: document the different branchesDavid Mitchell2013-09-061-0/+1
| | | | | | Te various different forms of goto (and dump) take different branches through this big function. Document which branches handle which variants. Also document the use of OPf_SPECIAL in OP_DUMP.
* Use SSize_t for arraysFather Chrysostomos2013-08-251-1/+1
| | | | | | | | | | Make the array interface 64-bit safe by using SSize_t instead of I32 for array indices. This is based on a patch by Chip Salzenberg. This completes what the previous commit began when it changed av_extend.
* op.c: Add op_folded to BASEOPNiels Thykier2013-07-191-2/+5
| | | | | | | | | Add a new member, op_folded, to BASEOP. It is replacement for OPpCONST_FOLDED (which can only be set on OP_CONST). At the moment OPpCONST_FOLDED remains, as it is exposed in B (e.g. B::Concise relies on it). Signed-off-by: Niels Thykier <niels@thykier.net>
* Stop using IV in pmop; remove workaroundFather Chrysostomos2013-07-061-1/+1
| | | | | | | | | | | See ticket #118055 for all the detail. On systems where IV is bigger than a pointer, the slab allocator messes things up because it only provides pointer alignment. If pmops have an IV field, we cannot allocate them via slab on such systems. Pmops actually don’t need an IV, just a PADOFFSET. So we can change them and remove the workaround. This is obviously not suitable for maint.
* Stop split from mangling constantsFather Chrysostomos2013-06-221-0/+3
| | | | | | | | | | | | | | | | | | | | | | | At compile time, if split occurs on the right-hand side of an assign- ment to a list of scalars, if the limit argument is a constant con- taining the number 0 then it is modified in place to hold one more than the number of scalars. This means ‘constants’ can change their values, if they happen to be in the wrong place at the wrong time: $ ./perl -Ilib -le 'use constant NULL => 0; ($a,$b,$c) = split //, $foo, NULL; print NULL' 4 I considered checking the reference count on the SV, but since XS code could create its own const ops with weak references to the same cons- tants elsewhere, the safest way to avoid modifying someone else’s SV is to mark the split op in ck_split so we know the SV belongs to that split op alone. Also, to be on the safe side, turn off the read-only flag before modi- fying the SV, instead of relying on the special case for compile time in sv_force_normal.
* op.h: Corrent comment about entersub stricturesFather Chrysostomos2013-06-031-1/+1
| | | | Flag 2 on entersub is HINT_STRICT_REFS, not _SUBS.
* Eliminate PL_reg_state.re_reparsing, part 1David Mitchell2013-04-121-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | PL_reg_state.re_reparsing is a hacky flag used to allow runtime code blocks to be included in patterns. Basically, since code blocks are now handled by the perl parser within literal patterns, runtime patterns are handled by taking the (assembled at runtime) pattern, and feeding it back through the parser via the equivalent of eval q{qr'the_pattern'}, so that run-time (?{..})'s appear to be literal code blocks. When this happens, the global flag PL_reg_state.re_reparsing is set, which modifies lexing and parsing in minor ways (such as whether \\ is stripped). Now, I'm in the slow process of trying to eliminate global regex state (i.e. gradually removing the fields of PL_reg_state), and also a change which will be coming a few commits ahead requires the info which this flag indicates to linger for longer (currently it is cleared immediately after the call to scan_str(). For those two reasons, this commit adds a new mechanism to indicate this: a new flag to eval_sv(), G_RE_REPARSING (which sets OPpEVAL_RE_REPARSING in the entereval op), which sets the EVAL_RE_REPARSING bit in PL_in_eval. Its still a yukky global flag hack, but its a *different* global flag hack now. For this commit, we add the new flag(s) but keep the old PL_reg_state.re_reparsing flag and assert that the two mechanisms always match. The next commit will remove re_reparsing.
* rework split() special case interaction with regex engineYves Orton2013-03-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch resolves several issues at once. The parts are sufficiently interconnected that it is hard to break it down into smaller commits. The tickets open for these issues are: RT #94490 - split and constant folding RT #116086 - split "\x20" doesn't work as documented It additionally corrects some issues with cached regexes that were exposed by the split changes (and applied to them). It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171 and cccd1425414e6518c1fc8b7bcaccfb119320c513. Prior to this patch the special RXf_SKIPWHITE behavior of split(" ", $thing) was only available if Perl could resolve the first argument to split at compile time, meaning under various arcane situations. This manifested as oddities like my $delim = $cond ? " " : qr/\s+/; split $delim, $string; and split $cond ? " ", qr/\s+/, $string not behaving the same as: ($cond ? split(" ", $string) : split(/\s+/, $string)) which isn't very convenient. This patch changes this by adding a new flag to the op_pmflags, PMf_SPLIT which enables pp_regcomp() to know whether it was called as part of split, which allows the RXf_SPLIT to be passed into run time regex compilation. We also preserve the original flags so pattern caching works properly, by adding a new property to the regexp structure, "compflags", and related macros for accessing it. We preserve the original flags passed into the compilation process, so we can compare when we are trying to decide if we need to recompile. Note that this essentially the opposite fix from the one applied originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171. The reverted patch was meant to make: split( 0 || " ", $thing ) #1 consistent with my $x=0; split( $x || " ", $thing ) #2 and not with split( " ", $thing ) #3 This was reverted because it broke C<split("\x{20}", $thing)>, and because one might argue that is not that #1 does the wrong thing, but rather that the behavior of #2 that is wrong. In other words we might expect that all three should behave the same as #3, and that instead of "fixing" the behavior of #1 to be like #2, we should really fix the behavior of #2 to behave like #3. (Which is what we did.) Also, it doesn't make sense to move the special case detection logic further from the regex engine. We really want the regex engine to decide this stuff itself, otherwise split " ", ... wouldn't work properly with an alternate engine. (Imagine we add a special regexp meta pattern that behaves the same as " " does in a split /.../. For instance we might make split /(*SPLITWHITE)/ trigger the same behavior as split " ". The other major change as result of this patch is it effectively reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE, which and free up bits in the regex flags structure. But we dont want to get rid of these vars, and it turns out that RXf_SEEN_LOOKBEHIND is used only in the same situation as the new RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to RXf_NO_INPLACE_SUBST, and then instead of using two vars we use only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to have their bits back.
* make m?$pat? match only once under ithreadsDavid Mitchell2013-01-041-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [perl #115080] m?...? is only supposed to match once, until reset. Normally this is done by setting the PMf_USED flag on the PMOP. Under ithreads we can't modify ops, so instead we indicate by setting the regex's SV to readonly. (This is a bit of a hack: the flag should be associated with the PMOP, not the regex). This breaks with run-time regexes when the pattern gets recompiled; for example: for my $c (qw(a b c)) { print "matched $c\n" if $c =~ m?^$c$?; } outputs matched a on unthreaded, but matched a matched b matched c on threaded. The re_eval jumbo fix made this more noticeable by sometimes recompiling even when the pattern text hasn't changed (to make closures work ok). The quick fix is to propagate the readonlyness of the old re to the new re. (The proper fix would be to store the flag state in a pad slot associated with the PMOP). Needless to say, I've gone for the quick fix.
* fix warning in PmopSTASH_set()David Mitchell2012-12-101-1/+1
| | | | | | | In the threaded version of PmopSTASH_set(), the assigned value is a PADOFFSET, not a pointer; so use 0 rather than NULL for the default value. This keeps clang happy.
* Stop renamed packages from making reset() crashFather Chrysostomos2012-12-051-34/+11
| | | | | | | | | | | | | | | | | | This only affected threaded builds. I think the comments in the added test explain well enough what was happening. The solution is to store a stashpad offset in the pmop, instead of the name of the stash. This is similar to what was done with cop stashes in d4d03940c58a. Not only does this fix the crash, but it also makes compilation faster and saves memory (no separate malloc for every m?pat?). I had to move Safefree(PL_stashpad) later on in perl_destruct, because freeing a pmop causes the PL_stashpad to be accessed, and pmops can be freed during sv_clean_all. Its previous location was not a problem for cops, as PL_stashpad[cop->cop_stashoff] is only accessed when PL_curcop==that_cop and Perl code is running, not when cops are freed.
* Don’t share TARGs between recursive opsFather Chrysostomos2012-11-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | I had to change the definition of IS_PADCONST to account for the SVf_IsCOW flag. Previously, anything marked READONLY would be consid- ered a pad constant, to be shared by pads of different recursion lev- els. Some of those READONLY things were not actually read-only, as they were copy-on-write scalars, which are never read-only. So I changed the definition of IS_PADCONST in e3918bb703c to accept COWs as well as read-only scalars, since I was removing the READONLY flag from COWs. With the new copy-on-write scheme, it is easy for a TARG to turn into a COW. If that happens and then the same subroutine calls itself recursively for the first time after that, pad_push will see that this is a pad ‘constant’ and allow the next recursion level to share it. If pp_concat calls itself recursively, the recursive call can modify the scalar the outer call is in the middle of using, causing the return value to be doubled up (‘tmptmp’) in the test case added here. Since pad constants are marked PADTMP (I would like to change that eventually), there is no way to distinguish them from TARGs when the are COWs, except for the fact that pad constants that are COWs are always shared hash keys (SvLEN==0).
* SVf_IsCOWFather Chrysostomos2012-11-141-1/+1
| | | | | | | | | | | | | | | As discussed in ticket #114820, instead of using READONLY+FAKE to mark a copy-on-write string, we should make it a separate flag. There are many modules in CPAN (and 1 in core, Compress::Raw::Zlib) that assume that SvREADONLY means read-only. Only one CPAN module, POSIX::pselect will definitely be broken by this. Others may need to be tweaked. But I believe this is for the better. It causes all tests except ext/Devel-Peek/t/Peek.t (which needs a tiny tweak still) to pass under PERL_OLD_COPY_ON_WRITE, which is a prereq- uisite for any new COW scheme that creates COWs under the same cir- cumstances.
* padrange: handle @_ directlyDavid Mitchell2012-11-101-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | In a construct like my ($x,$y) = @_ the pushmark/padsv/padsv is already optimised into a single padrange op. This commit makes the OPf_SPECIAL flag on the padrange op indicate that in addition, @_ should be pushed onto the stack, skipping an additional pushmark/gv[*_]/rv2sv combination. So in total (including the earlier padrange work), the above construct goes from being 3 <0> pushmark s 4 <$> gv(*_) s 5 <1> rv2av[t3] lK/1 6 <0> pushmark sRM*/128 7 <0> padsv[$x:1,2] lRM*/LVINTRO 8 <0> padsv[$y:1,2] lRM*/LVINTRO 9 <2> aassign[t4] vKS to 3 <0> padrange[$x:1,2; $y:1,2] l*/LVINTRO,2 ->4 4 <2> aassign[t4] vKS
* add SAVEt_CLEARPADRANGEDavid Mitchell2012-11-101-1/+2
| | | | | Add a new save type that does the equivalent of multiple SAVEt_CLEARSV's for a given target range. This makes the new padange op more efficient.
* add padrange opDavid Mitchell2012-11-101-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This single op can, in some circumstances, replace the sequence of a pushmark followed by one or more padsv/padav/padhv ops, and possibly a trailing 'list' op, but only where the targs of the pad ops form a continuous range. This is generally more efficient, but is particularly so in the case of void-context my declarations, such as: my ($a,@b); Formerly this would be executed as the following set of ops: pushmark pushes a new mark padsv[$a] pushes $a, does a SAVEt_CLEARSV padav[@b] pushes all the flattened elements (i.e. none) of @a, does a SAVEt_CLEARSV list pops the mark, and pops all stack elements except the last nextstate pops the remaining stack element It's now: padrange[$a..@b] does two SAVEt_CLEARSV's nextstate nothing needing doing to the stack Note that in the case above, this commit changes user-visible behaviour in pathological cases; in particular, it has always been possible to modify a lexical var *before* the my is executed, using goto or closure tricks. So in principle someone could tie an array, then could notice that FETCH is no longer being called, e.g. f(); my ($s, @a); # this no longer triggers two FETCHES sub f { tie @a, ...; push @a, 1,2; } But I think we can live with that. Note also that having a padrange operator will allow us shortly to have a corresponding SAVEt_CLEARPADRANGE save type, that will replace multiple individual SAVEt_CLEARSV's.
* simplify GIMME_VDavid Mitchell2012-11-041-4/+2
| | | | | | Since OPf_WANT_VOID == G_VOID etc, we can substantially simplify the OP_GIMME macro (which is what implements GIMME_V). This saves 588 bytes in the perl executable on my -O2 system.
* Re-enable static op allocation with obslabReini Urban2012-10-251-2/+5
| | | | | | obslab and the removal of the op_latefree logic, which allowed static ops, removed support for the compiler modules, which allocates ops statically. Add an op_static flag to replace the old latefree(d) op_free logic.
* Remove PMf_MAYBE_CONSTFather Chrysostomos2012-10-111-3/+0
| | | | It was added in ce862d02d but has never been used.
* [perl #94490] const fold should not trigger special split " "Father Chrysostomos2012-09-221-1/+1
| | | | | | | | | | | The easiest way to fix this was to move the special handling out of the regexp engine. Now a flag is set on the split op itself for this case. A real regexp is still created, as that is the most convenient way to propagate locale settings, and it prevents the need to rework pp_split to handle a null regexp. This also means that custom regexp plugins no longer need to handle split specially (which they all do currently).
* Correct typo in flag nameFather Chrysostomos2012-08-251-1/+1
|
* Banish boolkeysFather Chrysostomos2012-08-251-0/+2
| | | | | | | | | | | | | | Since 6ea72b3a1, rv2hv and padhv have had the ability to return boo- leans in scalar context, instead of bucket stats, if flagged the right way. sub { %hash || ... } is optimised to take advantage of this. If the || is in unknown context at compile time, the %hash is flagged as being maybe a true boolean. When flagged that way, it returns a bool- ean if block_gimme() returns G_VOID. If rv2hv and padhv can already do this, then we don’t need the boolkeys op any more. We can just flag the rv2hv to return a boolean. In all the cases where boolkeys was used, we know at compile time that it is true boolean context, so we add a new flag for that.
* Optimise %hash in sub { %hash || ... }Father Chrysostomos2012-08-251-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In %hash || $foo, the %hash is in scalar context, so it has to iterate through the buckets to produce statistics on bucket usage. If the || is in void context, the value returned by hash is only ever used as a boolean (as || doesn’t have to return it). We already opti- mise it by adding a boolkeys op when it is known at compile time that || will be in void context. In sub { %hash || $foo } it is not known at compile time that it will be in void context, so it wasn’t optimised. This commit optimises it by flagging the %hash at compile time as being possibly in ‘true boolean’ context. When that flag is set, the rv2hv and padhv ops call block_gimme() to see whether || is in void context. This speeds things up signficantly. Here is what I got after optimis- ing rv2hv but before doing padhv: $ time ./miniperl -e '%hash = 1..10000; sub { %hash || 1 }->() for 1..100000' real 0m0.179s user 0m0.101s sys 0m0.005s $ time ./miniperl -e 'my %hash = 1..10000; sub { %hash || 1 }->() for 1..100000' real 0m5.446s user 0m2.419s sys 0m0.015s (That example is slightly misleading because of the closure, but the closure version takes 1 sec. when optimised.)
* assert_(...)Father Chrysostomos2012-08-051-5/+1
| | | | | | | | This new macro expands to ‘assert(...),’ (with a trailing comma) under debugging builds; the empty string otherwise. It allows for the removal of some #ifdef DEBUGGINGs, which could not be avoided otherwise.
* Remove op_latefree(d)Father Chrysostomos2012-07-141-15/+2
| | | | | | | This was an early attempt to fix leaking of ops after syntax errors, disabled because it was deemed to fragile. The new slab allocator (8be227a) has solved this problem another way, so latefree(d) no longer serves any purpose.
* Eliminate PL_OP_SLAB_ALLOCFather Chrysostomos2012-07-121-1/+1
| | | | | | | | | | | | This commit eliminates the old slab allocator. It had bugs in it, in that ops would not be cleaned up properly after syntax errors. So why not fix it? Well, the new slab allocator *is* the old one fixed. Now that this is gone, we don’t have to worry as much about ops leak- ing when errors occur, because it won’t happen any more. Recent commits eliminated the only reason to hang on to it: PERL_DEBUG_READONLY_OPS required it.
* PERL_DEBUG_READONLY_OPS with the new allocatorFather Chrysostomos2012-07-121-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | I want to eliminate the old slab allocator (PL_OP_SLAB_ALLOC), but this useful debugging tool needs to be rewritten for the new one first. This is slightly better than under PL_OP_SLAB_ALLOC, in that CVs cre- ated after the main CV starts running will get read-only ops, too. It is when a CV finishes compiling and relinquishes ownership of the slab that the slab is made read-only, because at that point it should not be used again for allocation. BEGIN blocks are exempt, as they are processed before the Slab_to_ro call in newATTRSUB. The Slab_to_ro call must come at the very end, after LEAVE_SCOPE, because otherwise the ops freed via the stack (the SAVEFREEOP calls near the top of newATTRSUB) will make the slab writa- ble again. At that point, the BEGIN block has already been run and its slab freed. Maybe slabs belonging to BEGIN blocks can be made read-only later. Under PERL_DEBUG_READONLY_OPS, op slabs have two extra fields to record the size and readonliness of each slab. (Only the first slab in a CV’s slab chain uses the readonly flag, since it is conceptually simpler to treat them all as one unit.) Without recording this infor- mation manually, things become unbearably slow, the tests taking hours and hours instead of minutes.
* Record folded constants in the op treeFather Chrysostomos2012-07-041-0/+1
|
* CV-based slab allocation for opsFather Chrysostomos2012-06-291-10/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This addresses bugs #111462 and #112312 and part of #107000. When a longjmp occurs during lexing, parsing or compilation, any ops in C autos that are not referenced anywhere are leaked. This commit introduces op slabs that are attached to the currently- compiling CV. New ops are allocated on the slab. When an error occurs and the CV is freed, any ops remaining are freed. This is based on Nick Ing-Simmons’ old experimental op slab implemen- tation, but it had to be rewritten to work this way. The old slab allocator has a pointer before each op that points to a reference count stored at the beginning of the slab. Freed ops are never reused. When the last op on a slab is freed, the slab itself is freed. When a slab fills up, a new one is created. To allow iteration through the slab to free everything, I had to have two pointers; one points to the next item (op slot); the other points to the slab, for accessing the reference count. Ops come in different sizes, so adding sizeof(OP) to a pointer won’t work. The old slab allocator puts the ops at the end of the slab first, the idea being that the leaves are allocated first, so the order will be cache-friendly as a result. I have preserved that order for a dif- ferent reason: We don’t need to store the size of the slab (slabs vary in size; see below) if we can simply follow pointers to find the last op. I tried eliminating reference counts altogether, by having all ops implicitly attached to PL_compcv when allocated and freed when the CV is freed. That also allowed op_free to skip FreeOp altogether, free- ing ops faster. But that doesn’t work in those cases where ops need to survive beyond their CVs; e.g., re-evals. The CV also has to have a reference count on the slab. Sometimes the first op created is immediately freed. If the reference count of the slab reaches 0, then it will be freed with the CV still point- ing to it. CVs use the new CVf_SLABBED flag to indicate that the CV has a refer- ence count on the slab. When this flag is set, the slab is accessible via CvSTART when CvROOT is not set, or by subtracting two pointers (2*sizeof(I32 *)) from CvROOT when it is set. I decided to sneak the slab into CvSTART during compilation, because enlarging the xpvcv struct by another pointer would make all CVs larger, even though this patch only benefits few (programs using string eval). When the CVf_SLABBED flag is set, the CV takes responsibility for freeing the slab. If CvROOT is not set when the CV is freed or undeffed, it is assumed that a compilation error has occurred, so the op slab is traversed and all the ops are freed. Under normal circumstances, the CV forgets about its slab (decrement- ing the reference count) when the root is attached. So the slab ref- erence counting that happens when ops are freed takes care of free- ing the slab. In some cases, the CV is told to forget about the slab (cv_forget_slab) precisely so that the ops can survive after the CV is done away with. Forgetting the slab when the root is attached is not strictly neces- sary, but avoids potential problems with CvROOT being written over. There is code all over the place, both in core and on CPAN, that does things with CvROOT, so forgetting the slab makes things more robust and avoids potential problems. Since the CV takes ownership of its slab when flagged, that flag is never copied when a CV is cloned, as one CV could free a slab that another CV still points to, since forced freeing of ops ignores the reference count (but asserts that it looks right). To avoid slab fragmentation, freed ops are marked as freed and attached to the slab’s freed chain (an idea stolen from DBM::Deep). Those freed ops are reused when possible. I did consider not reusing freed ops, but realised that would result in significantly higher mem- ory using for programs with large ‘if (DEBUG) {...}’ blocks. SAVEFREEOP was slightly problematic. Sometimes it can cause an op to be freed after its CV. If the CV has forcibly freed the ops on its slab and the slab itself, then we will be fiddling with a freed slab. Making SAVEFREEOP a no-op won’t help, as sometimes an op can be savefreed when there is no compilation error, so the op would never be freed. It holds a reference count on the slab, so the whole slab would leak. So SAVEFREEOP now sets a special flag on the op (->op_savefree). The forced freeing of ops after a compilation error won’t free any ops thus marked. Since many pieces of code create tiny subroutines consisting of only a few ops, and since a huge slab would be quite a bit of baggage for those to carry around, the first slab is always very small. To avoid allocating too many slabs for a single CV, each subsequent slab is twice the size of the previous. Smartmatch expects to be able to allocate an op at run time, run it, and then throw it away. For that to work the op is simply mallocked when PL_compcv has’t been set up. So all slab-allocated ops are marked as such (->op_slabbed), to distinguish them from mallocked ops. All of this is kept under lock and key via #ifdef PERL_CORE, as it should be completely transparent. If it isn’t transparent, I would consider that a bug. I have left the old slab allocator (PL_OP_SLAB_ALLOC) in place, as it is used by PERL_DEBUG_READONLY_OPS, which I am not about to rewrite. :-) Concerning the change from A to X for slab allocation functions: Many times in the past, A has been used for functions that were not intended to be public but were used for public macros. Since PL_OP_SLAB_ALLOC is rarely used, it didn’t make sense for Perl_Slab_* to be API functions, since they were rarely actually available. To avoid propagating this mistake further, they are now X.
* Flag ops that are on the savestackFather Chrysostomos2012-06-291-2/+4
| | | | | | | | | This is to allow future commits to free dangling ops after errors. If an op is on the savestack, then it is going to be freed by scope.c, and op_free must not be called on it by anyone else. So we flag such ops new.
* add PMf_USE_RE_EVAL flagDavid Mitchell2012-06-131-1/+2
| | | | | | This isn't actually used in the pm_flags field of a PMOP, but is used in the pm_flags arg of Perl_re_op_compile() to indicate that this pattern should be compiled with "use re 'eval'" in scope
* add PMf_IS_QR flagDavid Mitchell2012-06-131-1/+6
| | | | | | | | | | | This indicates that a particular PMOP is in fact OP_QR. We should of course be able to tell this from op_type, but the regex-compiling API only gets passed op_flags. This then allows us to fix a bug where we were deciding during compilation whether to hang on to the code_blocks based on whether the PMOP was PMf_HAS_CV rather than PMf_IS_QR; the latter implies the former, but not the other way round.
* add PMf_CODELIST_PRIVATE flagDavid Mitchell2012-06-131-1/+5
| | | | | | | | | | | | | | | This indicates that the op_code_list field in a PMOP is "private"; that is, it points to a list of DO blocks that we don't own, and shouldn't free, and whose pad may not match ours. This will allow us to use the op_code_list field in the runtime case of literal code, e.g. /$runtime(?{...})/ and qr/$runtime(?{...})/. Here, at compile-time, we need to make the pre-compiled (?{..}) blocks available to pp_regcomp, but the list containing those blocks is also the list that is executed in the lead-up to executing pp_regcomp (while skipping the DO blocks), so the code is already embedded, and doesn't need freeing. Furthermore, in the qr// case, the code blocks are actually within a different sub (an anon one) than the PMOP, so the pads won't match.
* make qr/(?{})/ behave with closuresDavid Mitchell2012-06-131-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With this commit, qr// with a literal (compile-time) code block will Do the Right Thing as regards closures and the scope of lexical vars; in particular, the following now creates three regexes that match 1, 2 and 3: for my $i (0..2) { push @r, qr/^(??{$i})$/; } "1" =~ $r[1]; # matches Previously, $i would be evaluated as undef in all 3 patterns. This is achieved by wrapping the compilation of the pattern within a new anonymous CV, which is then attached to the pattern. At run-time pp_qr() clones the CV as well as copying the REGEXP; and when the code block is executed, it does so using the pad of the cloned CV. Which makes everything come out all right in the wash. The CV is stored in a new field of the REGEXP, called qr_anoncv. Note that run-time qr//s are still not fixed, e.g. qr/$foo(?{...})/; nor is it yet fixed where the qr// is embedded within another pattern: continuing with the code example from above, my $i = 99; "1" =~ $r[1]; # bare qr: matches: correct! "X99" =~ /X$r[1]/; # embedded qr: matches: whoops, it's still seeing the wrong $i
* Mostly complete fix for literal /(?{..})/ blocksDavid Mitchell2012-06-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the way that code blocks in patterns are parsed and executed, especially as regards lexical and scoping behaviour. (Note that this fix only applies to literal code blocks appearing within patterns: run-time patterns, and literals within qr//, are still done the old broken way for now). This change means that for literal /(?{..})/ and /(??{..})/: * the code block is now fully parsed in the same pass as the surrounding code, which means that the compiler no longer just does a simplistic count of balancing {} to find the limits of the code block; i.e. stuff like /(?{ $x = "{" })/ now works (in the same way that subscripts in double quoted strings always have: "$a{'{'}" ) * Error and warning messages will now appear to emanate from the main body rather than an re_eval; e.g. the output from #!/usr/bin/perl /(?{ warn "boo" })/ has changed from boo at (re_eval 1) line 1. to boo at /tmp/p line 2. * scope and closures now behave as you might expect; for example for my $x (qw(a b c)) { "" =~ /(?{ print $x })/ } now prints "abc" rather than "" * with recursion, it now finds the lexical within the appropriate depth of pad: this code now prints "012" rather than "000": sub recurse { my ($n) = @_; return if $n > 2; "" =~ /^(?{print $n})/; recurse($n+1); } recurse(0); * an earlier fix that stopped 'my' declarations within code blocks causing crashes, required the accumulating of two SAVECOMPPADs on the stack for each iteration of the code block; this is no longer needed; * UNITCHECK blocks within literal code blocks are now run as part of the main body of code (run-time code blocks still trigger an immediate call to the UNITCHECK block though) This is all achieved by building upon the efforts of the commits which led up to this; those altered the parser to parse literal code blocks directly, but up until now those code blocks were discarded by Perl_pmruntime and the block re-compiled using the original re_eval mechanism. As of this commit, for the non-qr and non-runtime variants, those code blocks are no longer thrown away. Instead: * the LISTOP generated by the parser, which contains all the code blocks plus OP_CONSTs that collectively make up the literal pattern, is now stored in a new field in PMOPs, called op_code_list. For example in /A(?{BLOCK})C/, the listop stored in op_code_list looks like LIST PUSHMARK CONST['A'] NULL/special (aka a DO block) BLOCK CONST['(?{BLOCK})'] CONST['B'] * each of the code blocks has its last op set to null and is individually run through the peephole optimiser, so each one becomes a little self-contained block of code, rather than a list of blocks that run into each other; * then in re_op_compile(), we concatenate the list of CONSTs to produce a string to be compiled, but at the same time we note any DO blocks and note the start and end positions of the corresponding CONST['(?{BLOCK})']; * (if the current regex engine isn't the built-in perl one, then we just throw away the code blocks and pass the concatenated string to the engine) * then during regex compilation, whenever we encounter a '(?{', we see if it matches the index of one of the pre-compiled blocks, and if so, we store a pointer to that block in an 'l' data slot, and use the end index to skip over the text of the code body. Conversely, if the index doesn't match, then we know that it's a run-time pattern and (for now), compile it in the old way. * During execution, when an EVAL op is encountered, if data->what is 'l', then we just use the pad that was in effect when the pattern was called; i.e. we use the current pad slot of the currently executing CV that the pattern is embedded within.
* update the editor hints for spaces, not tabsRicardo Signes2012-05-291-2/+2
| | | | | This updates the editor hints in our files for Emacs and vim to request that tabs be inserted as spaces.
* Remove OPpCONST_WARNINGFather Chrysostomos2012-05-211-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | This was added to op.h in commit 599cee73: commit 599cee73f2261c5e09cde7ceba3f9a896989e117 Author: Paul Marquess <paul.marquess@btinternet.com> Date: Wed Jul 29 10:28:45 1998 +0100 lexical warnings; tweaks to places that didn't apply correctly Message-Id: <9807290828.AA26286@claudius.bfsec.bt.co.uk> Subject: lexical warnings patch for 5.005_50 p4raw-id: //depot/perl@1773 dump.c was modified to dump in, in this commit: commit bf91b999b25fa75a3ef7a327742929592a2e7e9c Author: Simon Cozens <simon@netthink.co.uk> Date: Sun May 13 21:20:36 2001 +0100 Op private flags Message-ID: <20010513202036.A21896@netthink.co.uk> p4raw-id: //depot/perl@10117 But is apparently completely unused anywhere. And I want that bit.
* Correct comment typo in op.hFather Chrysostomos2012-05-211-1/+1
|
* Fix VMS build broken by d1718a7cf5Father Chrysostomos2012-04-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This comment in perl.c: /* Note: 20,40,80 used for NATIVE_HINTS */ (added by a0ed51b3 [Here are the long-expected Unicode/UTF-8 mod- ifications.]), has apparently always been wrong. The values in vms/vmsish.h end with 7 zeroes in hex, and are only shifted down to one zero when assigned to cop->op_private in op.c:newSTATEOP. In PL_hints they never have the values indicated in perl.h. So those are actually free bits. It’s the high versions in vmsish.h that are not free. 20 (actually 0x20000000) hasn’t been used for anything since commit 744a34f9085 (Urk -- undo previous removal of vmsish 'exit' change), which change I don’t really understand. In any case, the comment was never updated. This comment in op.h: /* NOTE: OP_NEXTSTATE and OP_DBSTATE (i.e. COPs) carry lower * bits of PL_hints in op_private */ was added by d41ff1b8ad98 (introduce $^U), which was later reverted. op_private does not carry the lower bits of PL_hints, but, rather, certain higher bits of PL_hints, shifted to fit (the NATIVE_HINTS cited above). Due to misleading comments, I ended up breaking the VMS build in com- mit d1718a7cf5, by using its bits for something else. This commit moves the bits around a bit to avoid the clash, and modi- fies the comments to reflect reality.
* Label UTF8 cleanupBrian Fraser2012-03-251-0/+3
| | | | | This meant changing LABEL's definition in perly.y, so most of this commit is actually from the regened files.