path: root/pp.c
Commit message | Author | Age | Files | Lines
* pp.c: simplify cpp conditionals | Aaron Crane | 2017-10-21 | 1 | -7/+1
* pp.c: delete dead cpp-conditional declaration | Aaron Crane | 2017-10-15 | 1 | -8/+0
  This was added in commit dfe9444ca7881e716e9e8feaf20b55da491363ca (February 1998, for Perl 5.004_60) by Andy Dougherty, and its comment says that, even then, he thought it was unneeded. But the perl5 repo has never defined the NEED_GETPID_PROTO cpp symbol that guards this declaration, so this ability has clearly never been used.
* RT#131000: splice doesn't honour read-only flag | Aaron Crane | 2017-10-15 | 1 | -0/+3
  The push and unshift builtins were correctly throwing a "Modification of a read-only value attempted" exception when modifying a read-only array, but splice was silently modifying the array. This commit adds tests that all three builtins throw such an exception.

  One discrepancy between the three remains: push has long silently accepted a push of no elements onto an array, whereas unshift throws an exception in that situation. This seems to have been originally a coincidence. The pp_unshift implementation first makes space for the elements it unshifts (which croaks for a read-only array), then copies the new values into the space thus created. The pp_push implementation, on the other hand, calls av_push() individually on each element; that implicitly croaks, but only when there's at least one element being pushed.

  The pp_push implementation has subsequently been changed: read-only checking is now done first, but that was done to fix a memory leak. (If the av_push() itself failed, then the new SV that had been allocated for pushing onto the array would get leaked.) That leak fix specifically grandfathered in the acceptance of empty-push-to-readonly-array, to avoid changing behaviour.

  I'm not fond of the inconsistency between push on the one hand and unshift & splice on the other, but I'm disinclined to make empty-push-to-readonly suddenly start throwing an exception after all these years, and it seems best not to extend that exemption-from-exception to the other builtins.
* [perl #129916] Allow sub-in-stash outside of main | Father Chrysostomos | 2017-10-08 | 1 | -1/+1
  The sub-in-stash optimization introduced in 2eaf799e only applied to subs in the main stash, not in other stashes, due to a problem with the logic in newATTRSUB.

  This comment:

      Also, we may be called from load_module at run time, so PL_curstash (which sets CvSTASH) may not point to the stash the sub is stored in.

  explains why we need the PL_curstash != CopSTASH(PL_curcop) check. (Perl_load_module will fail without it.) But that logic does not work properly at compile time (when PL_curcop == &PL_compiling).

  The value of CopSTASH(&PL_compiling) is never actually used. It is always set to the main stash. So if we check that PL_curstash != CopSTASH(PL_curcop) and forego the optimization in that case, we will never optimize subs outside of the main stash.

  What we really need is to check IN_PERL_RUNTIME && PL_curstash != CopSTASH(PL_curcop). I.e., forego the optimization at run time if the stashes differ. That is what this commit implements.

  One observable side effect of this change is that deleting a stash element no longer anonymizes the CV if the CV had no GV that it was depending on to provide its name. Since the main thing in such situations is that we do not get a crash, I think this change (arguably an improvement) is acceptable.

  -----------

  A bit of explanation of various other changes:

  gv.c:require_tie_mod needed a bit of help, since it could not handle sub refs in stashes.

  To keep localisation of stash elements working the same way, local($Stash::{foo}) now upgrades a coderef to a full GV before the localisation. (Changes in two pp*.c files and in scope.c:save_gp.)

  t/op/stash.t contains a test that makes sure that perl does not crash when a GV with a CV pointing to it gets deleted. This commit tweaks the test so that it continues to test that. (There has to be a GV for the test to test what it is meant to test.) Similarly with t/uni/caller.t and t/uni/stash.t.

  op.c:rv2cv_op_cv with the _MAYBE_NAME_GV flag was returning the calling GV in those cases where a GV-less sub is called via a GV. E.g., *main = \&Foo::foo; main(). This meant that errors like ‘Not enough arguments’ were giving the wrong sub name.

  newATTRSUB was not calling mro_method_changed_in when storing a sub as an RV.

  gv_init needs to arrange for the new GV to have the file and line number corresponding to the sub in it. These are taken from CvSTART, which may be off by a few lines, but is the closest we have to the place the sub was declared.
* (perl #131786) avoid a duplicate symbol error on _LIB_VERSION | Tony Cook | 2017-08-10 | 1 | -8/+0
  For -flto -mieee-fp builds, the _LIB_VERSION definition in perl.c and in libieee conflict, causing a build failure.

  The test we perform in Configure checks only that such a variable exists (and is declared); it doesn't check that we can *define* such a variable, which the code in pp.c tried to do.

  So rather than trying to define the variable, just initialize it during our normal interpreter initialization.
* move pp_padav(), pp_padhv() from pp.c to pp_hot.c | David Mitchell | 2017-07-27 | 1 | -120/+0
  Just a cut+paste; no code or functional changes.

  As well as being hot code, pp_padav() and pp_padhv() also have a lot of code in common with pp_rv2av() (which also implements pp_rv2hv()). Having all three functions in the same file will allow the next few commits to move some of that common code into static inline functions.
* optimise (index() == -1) | David Mitchell | 2017-07-27 | 1 | -3/+13
  Unusually, index() and rindex() return -1 on failure. So it's reasonably common to see code like

      if (index(...) != -1) { ... }

  and variants. For such code, this commit optimises away the OP_EQ and OP_CONST, and sets a couple of private flags on the index op instead, indicating:

      OPpTRUEBOOL        return a boolean which is a comparison of what the return would have been, against -1
      OPpINDEX_BOOLNEG   negate the boolean result

  It also supports OPpTRUEBOOL in conjunction with the existing OPpTARGET_MY flag, so for example in

      $lexical = (index(...) == -1)

  the padmy, sassign, eq and const ops are all optimised away.
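The interaction of the two flags can be modelled with a small self-contained C sketch. This is not the code in pp.c: the flag values and the function name `index_result` are invented for illustration, and it assumes that the un-negated boolean means "the substring was found".

```c
#include <stdbool.h>

/* Hypothetical flag values, chosen only for this sketch. */
#define SKETCH_TRUEBOOL      0x1u  /* stands in for OPpTRUEBOOL */
#define SKETCH_INDEX_BOOLNEG 0x2u  /* stands in for OPpINDEX_BOOLNEG */

/* pos is what index() would normally return: an offset, or -1 on failure.
 * Without the TRUEBOOL flag the position is returned unchanged.
 * With it, the result is 1 or 0: the comparison (pos != -1),
 * inverted when the BOOLNEG flag is also set. */
long index_result(long pos, unsigned flags)
{
    if (!(flags & SKETCH_TRUEBOOL))
        return pos;

    bool found = (pos != -1);
    if (flags & SKETCH_INDEX_BOOLNEG)
        found = !found;
    return found ? 1 : 0;
}
```

Under these assumptions, `index(...) != -1` corresponds to the TRUEBOOL flag alone and `index(...) == -1` to both flags together, so no separate comparison step is needed.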
* add boolean context support to several ops | David Mitchell | 2017-07-27 | 1 | -4/+18
  For some ops which return integer values and which have a reasonable likelihood of being used in a boolean context, set the OPpTRUEBOOL flag on the op as appropriate, and at runtime return &PL_sv_yes / &PL_sv_zero rather than an integer value.

  This is especially beneficial where the op doesn't have a targ, so has to create a mortal SV to return the integer value.

  Similarly, it's a win where it may be expensive to calculate an integer return value, such as pos() or length() converting between byte and char offset.

  Ops done:

      OP_SUBST
      OP_AASSIGN
      OP_POS
      OP_LENGTH
      OP_GREPWHILE
* pp_length: code tidy and simplify assert | David Mitchell | 2017-07-27 | 1 | -15/+26
  The STATIC_ASSERT_STMT() is basically checking that shifting the HINT_BYTES byte left 26 places gives you SVf_UTF8, so just assert that. There's no need to assert the current values of HINT_BYTES and SVf_UTF8.

  Other than that, this commit tidies up the code a bit (only whitespace changes and unnecessary brace removal), and adds/updates some code comments.
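A compile-time check of this kind can be sketched in standalone C11. The two constants below are written from memory of perl's headers and should be treated as assumptions, and `chars_wanted` is a hypothetical helper showing why the shift relationship is convenient: the "use bytes" hint can mask off the UTF-8 flag with one shift.

```c
#include <assert.h>

/* Assumed values, mirroring perl.h and sv.h as recalled; not authoritative. */
#define HINT_BYTES 0x00000008U
#define SVf_UTF8   0x20000000U

/* Assert only the relationship, not the individual values. */
static_assert((HINT_BYTES << 26) == SVf_UTF8,
              "HINT_BYTES shifted left 26 places must equal SVf_UTF8");

/* Hypothetical helper: nonzero if length should be counted in characters,
 * i.e. the string is flagged UTF-8 and the bytes hint is not in effect. */
unsigned chars_wanted(unsigned sv_flags, unsigned hints)
{
    return sv_flags & SVf_UTF8 & ~((hints & HINT_BYTES) << 26);
}
```

If either constant were renumbered so that the shift no longer lined up, the build would fail at the assertion rather than silently miscounting.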
* pp_length: only call sv_len_utf8_nomg() if needed | David Mitchell | 2017-07-27 | 1 | -1/+4
  After doing get magic, if the result is SVf_POK and non-utf8, just use SvCUR(sv).
* pp_length: use TARGi rather than sv_setiv() | David Mitchell | 2017-07-27 | 1 | -6/+4
  TARGi(i,1) is equivalent to sv_setiv_mg(TARG,i), except that it inlines some simple common cases.

  Also add a couple of tests for length on an overloaded utf8 string. I don't think it was being tested for properly.
* optimise @array in boolean context | David Mitchell | 2017-07-27 | 1 | -3/+6
  It's quicker to return (and to test for) &PL_sv_zero or &PL_sv_yes, than setting a targ to an integer value or, in the case of padav, creating a mortal sv and setting it to an integer value.

  In fact for padav, even in the scalar but non-boolean case, return &PL_sv_zero if the value is zero rather than creating and setting a mortal.
* optimise away OP_KEYS op in scalar/void context | David Mitchell | 2017-07-27 | 1 | -5/+28
  In something like

      if (keys %h) { ... }

  the 'keys %h' is implemented as the op sequences

      gv[*h] s
      rv2hv lKRM/1
      keys[t2] sK/1

  or

      padhv[%h:1,6] lRM
      keys[t2] sK/1

  It turns out that (%h) in scalar and void context now behaves very similarly to (keys %h) (except that it resets the iterator), so in these cases, convert the two ops

      rv2hv/padhv, keys

  into the single op

      rv2hv/padhv

  with a private flag indicating that the op is handling the 'keys' action by itself.

  As well as one less op to execute, this brings the boolean-context optimisation already present in padhv/rv2hv to keys. So

      if (keys %h) { ... }

  is no longer slower than

      if (%h) { ... }
* Perl_do_kv(): add asserts and more code comments | David Mitchell | 2017-07-27 | 1 | -0/+13
  This function can be called directly or indirectly by several ops. Update its code comments to explain this in detail, and assert which ops can call it. Also remove a redundant comment about OP_RKEYS/OP_RVALUES; these ops have been removed.

  Also, reformat the 'dokv = ' expressions.

  Finally, add some code comments to pp_avhvswitch explaining what it's for.

  Apart from the op_type asserts, there should be no functional changes.
* make callers of SvTRUE() more efficient | David Mitchell | 2017-07-27 | 1 | -1/+4
  Where it's obvious that the args can't be null, use SvTRUE_NN() instead. Avoid possible multiple evaluations of the arg by assigning to a local var first if necessary.
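The multiple-evaluation hazard mentioned here is a general property of C macros. The sketch below uses a made-up macro `IS_TRUE` (not perl's SvTRUE) and a counter to show how an argument with side effects is evaluated twice unless it is first stored in a local variable.

```c
/* Hypothetical macro that, like many truth-test macros, mentions its
 * argument more than once. */
#define IS_TRUE(p) ((p) != 0 && *(p) != 0)

static int calls = 0;   /* how many times next_value() has run */
static int value = 7;

/* An argument expression with a visible side effect. */
int *next_value(void)
{
    calls++;
    return &value;
}

/* Passing the call directly to the macro evaluates it twice. */
int unsafe_test(void)
{
    return IS_TRUE(next_value());
}

/* Assigning to a local first guarantees a single evaluation. */
int safe_test(void)
{
    int *p = next_value();
    return IS_TRUE(p);
}
```

Both functions return the same truth value; the difference is only in how often the argument expression runs, which matters when it is expensive or has side effects.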
* S_check_for_bool_cxt(): special-case OP_AND | David Mitchell | 2017-07-27 | 1 | -1/+1
  Re-instate the special-casing, which was removed by v5.25.8-172-gb243b19, of OP_AND in boolean-context determination.

  This is because the special-case allowed things to be more efficient sometimes, but required returning a false value as sv_2mortal(newSViv(0)) rather than &PL_sv_no. Now that PL_sv_zero has been added we can use that instead, cheaply.

  This commit adds an extra arg to S_check_for_bool_cxt() to indicate whether the op supports the special-casing of OP_AND.
* [perl #131627] extend stack in scalar-context pp_list when no args | Aaron Crane | 2017-07-16 | 1 | -0/+1
  In scalar (well, non-list) context, pp_list always yields exactly one stack element. It must therefore extend the stack for that element, in case there were no arguments on the stack when it started.
* RT #130907: Fix the Unicode Bug in split " " | Aaron Crane | 2017-07-15 | 1 | -0/+13
* extend stack on scalar empty list slice | David Mitchell | 2017-06-22 | 1 | -0/+1
  In scalar context, an empty list slice returns PL_sv_undef. Extend the stack for this return value, since if there were no elements or indices, nothing was popped from the stack.
* scalar reverse(): extend stack if no arg | David Mitchell | 2017-06-22 | 1 | -2/+5
  If we're using the implicit $_ there's nothing to pop off the stack, so there may not be a spare stack slot to push the result.
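The stack-extension fixes in this log share one pattern: an op that pushes a result must not rely on a slot having been freed by popping its arguments. The following is a toy model in plain C, not perl's stack or its EXTEND macro; the `Stack` type and both function names are hypothetical.

```c
#include <stdlib.h>

/* A toy value stack, just enough to show the idea. */
typedef struct {
    int   *base;
    size_t len;   /* slots in use */
    size_t cap;   /* slots allocated */
} Stack;

/* Ensure at least n more values can be pushed. Returns 0 on success. */
int stack_extend(Stack *s, size_t n)
{
    if (s->cap - s->len >= n)
        return 0;
    size_t newcap = s->len + n;
    int *p = realloc(s->base, newcap * sizeof *p);
    if (!p)
        return -1;
    s->base = p;
    s->cap = newcap;
    return 0;
}

/* An op that pops nargs inputs and pushes exactly one result.
 * With nargs == 0 nothing is popped, so no slot is freed and the
 * explicit extend is what makes the push safe. */
int op_one_result(Stack *s, size_t nargs, int result)
{
    if (nargs > s->len)
        return -1;
    s->len -= nargs;
    if (stack_extend(s, 1) != 0)
        return -1;
    s->base[s->len++] = result;
    return 0;
}
```

When at least one argument was popped, the extend call is a cheap no-op; it only does real work in the zero-argument case that these commits address.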
* pp_ref: do SvSETMAGIC(TARG) | David Mitchell | 2017-06-15 | 1 | -1/+1
  With v5.27.0-317-g88b1365, I changed the SvSETMAGIC(TARG) at the end of pp_ref() to a simple assert(!SvSMAGICAL(TARG)), on the grounds that it always returned a simple string or similar, which should never be magic.

  Turns out I didn't allow for taint magic. This commit restores the old behaviour and fixes Module::Runtime (which is where I spotted the issue).
* Use simple-minded approach to bitwise UTF-8 operations | Karl Williamson | 2017-06-07 | 1 | -35/+9
  Commit 5d09ee1cb7b68f5e6fd15233bfe5048612e8f949 fatalized bitwise operations of operands with wide characters in them. It retained the regular UTF-8 handling, but throws an error when a wide character is encountered. But this code is complicated because of its original intended generality. It can essentially be ripped out, replaced by code that just downgrades the operand to non-UTF-8. Then we use the regular code to do the operation. In the complement case, that's all that need be done to mimic earlier behavior, as the result has not been in UTF-8. For the other operations, the result is simply upgraded to UTF-8.

  This removes quite a few lines of code, and now the UTF-8 handling uses the same tight loops as the non-UTF-8. Downgrading and upgrading had to be done specially before, but now they are done in tight loops, before the operation, and after the operation.
* Fatalize the use of code points above 0xFF for bitwise operators. | Abigail | 2017-06-07 | 1 | -35/+14
  This commit removes quite a number of tests, mostly from t/op/bop.t, which test the behaviour of such code points in combination with bitwise operators. Since it's now fatal, the tests are no longer useful.
* make OP_REF support boolean context | David Mitchell | 2017-06-05 | 1 | -6/+37
  RT #78288

  When ref() is used in a boolean context, it's not necessary to return the name of the package which an object is blessed into; instead a simple truth value can be returned, which is faster.

  Note that it has to cope with the subtlety of an object blessed into the class "0", which should return false.

  Porting/bench.pl shows for the expression !ref($r), approximately:

      unchanged for a non-reference $r
      doubling of speed for a reference $r
      tripling of speed for a blessed reference $r

  This commit builds on the mechanism already used to set the OPpTRUEBOOL and OPpMAYBE_TRUEBOOL flags on padhv and rv2hv ops when used in boolean context.
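The class "0" subtlety follows from Perl's rule that the string "0" is false. A minimal C sketch of the truth value ref() needs to produce in boolean context might look like this; the `RefKind` enum and `ref_in_bool_cxt` are hypothetical names, not part of perl's API.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical classification of the argument to ref(). */
typedef enum { NOT_A_REF, PLAIN_REF, BLESSED_REF } RefKind;

/* Truth value of ref($r) without building the returned string.
 * class_name is only consulted for blessed references. */
bool ref_in_bool_cxt(RefKind kind, const char *class_name)
{
    switch (kind) {
    case NOT_A_REF:
        return false;   /* ref() would return the empty string */
    case PLAIN_REF:
        return true;    /* a type name such as "HASH" is always true */
    case BLESSED_REF:
        /* a class name is false only if it is exactly "0" */
        return strcmp(class_name, "0") != 0;
    }
    return false;
}
```

Only the blessed case needs to look at a name at all, which fits the larger speed-up the commit reports for blessed references.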
* S_require_tie_mod(): use a new stack | David Mitchell | 2017-06-05 | 1 | -0/+1
  RT #130861

  This function is used to load a module associated with various magic vars, like $[ and %+. Since it can be called 'unexpectedly', it should use a new stack. The issue in this ticket was equivalent to

      my $var = '[';
      $$var;

  where the symbolic dereference triggered a run-time load of arybase.pm, which grew the stack, invalidating the SP in pp_rv2sv.

  Note that most of the stuff which S_require_tie_mod() calls, such as load_module(), will do its own PUSHSTACK(); but S_require_tie_mod() also does a bit of stack manipulation itself.

  The test case includes a magic number, 125, which happens to be the exact size necessary to trigger a stack realloc in S_require_tie_mod(). In later perl versions this value may well change. But it seemed too expensive to call fresh_perl_is() 100's of times with different values of $n.

  This commit also adds a SPAGAIN to pp_rv2sv on the 'belt and braces' principle.

  This commit is based on an earlier effort by Aaron Crane.
* Define and use symbolic constants for LvFLAGS | Dagfinn Ilmari Mannsåker | 2017-06-02 | 1 | -5/+5
* Add support for deleting key/value slices (RT#131328) | Dagfinn Ilmari Mannsåker | 2017-06-02 | 1 | -5/+18
* vec(): defer lvalue out-of-range croaking | David Mitchell | 2017-03-31 | 1 | -26/+14
  RT #131083

  Recent commits v5.25.10-81-gd69c430 and v5.25.10-82-g67dd6f3 added out-of-range/overflow checks for the offset arg of vec(). However in lvalue context, these croaks now happen before the SVt_PVLV was created, rather than when its set magic was called. This means that something like

      sub f { $x = $_[0] }
      f(vec($s, -1, 8))

  now croaks even though the out-of-range value never ended up getting used in lvalue context.

  This commit fixes things by, in pp_vec(), rather than croaking, just setting flag bits in LvFLAGS() to indicate that the offset is negative / out-of-range.

  Then in Perl_magic_getvec(), return 0 if these flags are set, and in Perl_magic_setvec() croak with a suitable error.
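The deferral described above (record the problem when the lvalue is created, act on it only when it is read or written) can be sketched with a toy type. Everything here is hypothetical: `VecLv`, the flag value and the three functions are illustrative stand-ins, the buffer is fixed-size (real vec() can grow the string on assignment), and a return of -1 stands in for a croak.

```c
#include <stddef.h>

#define LV_BAD_OFFSET 0x1u  /* hypothetical flag: offset was negative */

/* Toy stand-in for an lvalue proxy onto one byte of a buffer. */
typedef struct {
    unsigned char *buf;
    size_t         len;
    size_t         off;
    unsigned       flags;
} VecLv;

/* Creating the proxy never fails; a bad offset is only recorded. */
VecLv make_lv(unsigned char *buf, size_t len, long off)
{
    VecLv lv = { buf, len, 0, 0 };
    if (off < 0)
        lv.flags |= LV_BAD_OFFSET;
    else
        lv.off = (size_t)off;
    return lv;
}

/* Reading through a flagged or past-the-end proxy yields 0. */
unsigned lv_get(const VecLv *lv)
{
    if (lv->flags || lv->off >= lv->len)
        return 0;
    return lv->buf[lv->off];
}

/* Writing is where the error finally surfaces (-1 models the croak). */
int lv_set(VecLv *lv, unsigned char v)
{
    if (lv->flags || lv->off >= lv->len)
        return -1;
    lv->buf[lv->off] = v;
    return 0;
}
```

With this shape, passing a proxy with a bad offset to a function that only reads it is harmless, which is the behaviour the commit restores.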
* Perl_do_vecget(): change offset arg to STRLEN type | David Mitchell | 2017-03-17 | 1 | -2/+38
  ... and fix up its caller, pp_vec().

  This is part of a fix for RT #130915.

  pp_vec() is responsible for extracting out the offset and size from SVs on the stack, and then calling do_vecget() with those values. (Sometimes the call is done indirectly by storing the offset in the LvTARGOFF() field of a SVt_PVLV, then later Perl_magic_getvec() passes the LvTARGOFF() value to do_vecget().)

  Now SvCUR, SvLEN and LvTARGOFF are all of type STRLEN (a.k.a. Size_t), while the offset arg of do_vecget() is of type SSize_t (i.e. there's a signed/unsigned mismatch). It makes more sense to make the arg of type STRLEN. So that is what this commit does.

  At the same time this commit fixes up pp_vec() to handle all the possibilities where the offset value can't fit into a STRLEN, returning 0 or croaking accordingly, so that do_vecget() is never called with a truncated or wrapped offset.

  The next commit will fix up the internals of do_vecget() and do_vecset(), which have to worry about offset*(2^n) wrapping or being > SvCUR().

  This commit is based on an earlier proposed fix by Aaron Crane.
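The two checks discussed here (does the user's offset fit in an unsigned size type, and does scaling it by the element width wrap) can be illustrated in portable C. This is a sketch under assumptions, not perl's code: `checked_offset`, `byte_offset` and the `OffStatus` enum are hypothetical, `size_t` stands in for STRLEN, and `bits` is assumed to be a power of two from 1 to 64.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { OFF_OK, OFF_NEGATIVE, OFF_TOO_BIG } OffStatus;

/* Convert a signed, possibly huge user offset to size_t, reporting
 * why it does not fit instead of silently truncating or wrapping. */
OffStatus checked_offset(intmax_t user_off, size_t *out)
{
    if (user_off < 0)
        return OFF_NEGATIVE;
    if ((uintmax_t)user_off > SIZE_MAX)
        return OFF_TOO_BIG;
    *out = (size_t)user_off;
    return OFF_OK;
}

/* Byte position of element `off` when elements are `bits` wide.
 * Returns 0 if the multiplication would wrap, 1 on success. */
int byte_offset(size_t off, unsigned bits, size_t *out)
{
    if (bits >= 8) {
        size_t bytes = bits / 8;
        if (off > SIZE_MAX / bytes)
            return 0;               /* off * bytes would overflow */
        *out = off * bytes;
    } else {
        *out = off / (8 / bits);    /* several elements share a byte */
    }
    return 1;
}
```

The division test `off > SIZE_MAX / bytes` detects overflow before the multiplication happens, so the wrapped value is never computed.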
* RT#130624: heap-use-after-free in 4-arg substr | Aaron Crane | 2017-02-27 | 1 | -1/+3
* Show sub name in signature arity-check error messages | Aaron Crane | 2017-02-18 | 1 | -6/+27
* Moving variables to their innermost scope. | Andy Lester | 2017-02-18 | 1 | -6/+5
  Some vars have been tagged as const because they do not change in their new scopes. In pp_reverse in pp.c, I32 tmp is only used to hold a char, so is changed to char.
* fix ord of upgraded empty string | Zefram | 2017-01-27 | 1 | -1/+1
  pp_ord fell foul of the new API stricture added by d1f8d421df731c77beff3db92d27dc6ec28589f2. Change it to avoid calling utf8n_to_uvchr() on an empty string. Fixes [perl #130545].
* (perl #130262) split scalar context stack overflow fix | Tony Cook | 2017-01-16 | 1 | -1/+1
  pp_split didn't ensure there was space for its return value in scalar context.
* In A && B, stop special-casing boolean-ness of A | David Mitchell | 2017-01-06 | 1 | -3/+1
  Some ops (currently PADHV and RV2HV) can be flagged as being in boolean context, and if so, may return a simple truth value which may be more efficient to calculate than a full scalar value. (This was originally motivated by code like if (%h) {...}, where the scalar context %h returned a bucket ratio string, which involved counting how many HvARRAY buckets were non-empty, which was slow in large hashes. It's been made less important since %h in scalar context now just returns a key count, which is quick to calculate.)

  There is an issue with the A argument of A||B, A//B and A&&B, in that, although A is checked by the logop in boolean context, depending on its truth value the original A may be passed through to the next op. So in something like $x = (%h || -1), it's not sufficient for %h to return a truth value; it must return a full scalar value which may get assigned to $x.

  So in general, we only mark the A op as being in boolean context if the logop is in void context, or if the returned A would only be consumed in boolean context; so !(A||B) would be ok for example.

  However, && is a special case of this, since it will return the original A only if A was false. Before this commit, && was special-cased to mark A as being in boolean context regardless of the context of (A&&B). The downside of this is that the A op can't just return &PL_sv_no as a false value; it has to return something that is usable in scalar context too. For example with %h, it returns sv_2mortal(newSViv(0)), which stringifies to "0" while &PL_sv_no stringifies to "".

  This commit removes that special case and makes && behave like || and // again. The upside is that some ops in boolean context will be able to more cheaply return a false value (e.g. just &PL_sv_no versus sv_2mortal(newSViv(0))).

  The main downside is that && in unknown context (typically an 'if (%h) {...}' as the last statement in a sub) will have to check the caller's context at runtime, which is slower. It will also have to return a scalar value for something like $y = (%h && $x), but that's a relatively uncommon occurrence, and now that %h in scalar context doesn't have to count used buckets, the extra cost in these rare cases is minor.
* re-implement boolean context detection | David Mitchell | 2017-01-06 | 1 | -0/+2
  When certain ops are used in a boolean context (currently just PADHV and RV2HV, implementing '%hash'), one of the private flags OPpTRUEBOOL or OPpMAYBE_TRUEBOOL is set on the op to indicate this; at runtime, the pp function can then just return a boolean value rather than a full scalar value (in the case of %hash, an element count).

  However, the code which sets these flags has had a complex history, and is a bit messy. It also sets the flags incorrectly (but safely) in many cases: principally indicating boolean context when it's in fact void, or scalar context when it's in fact boolean. Both these permutations make the code potentially slower (but still correct).

  [ As a side-note: in 5.25, a bare %hash in scalar context changed from returning a bucket count etc, to just returning a key count, which is quicker to calculate. So the boolean optimisation for %hash is not nearly as important now: it's now just the overhead of creating a temp to return a count versus returning &PL_sv_yes, rather than counting keys. However the improved and generalised boolean context detection added by this commit will be useful in future to apply boolean context to other ops. ]

  In particular, this wasn't being optimised (i.e. a 'not' of a hash within an if):

      if (!%hash) { ...}

  This commit fixes all these cases (and uncomments a load of failing tests in t/perf/optree.t which were added in the previous commit.)

  It makes the code below nearly 3 times faster:

      my $c; my %h = 1..10;
      for my $i (1..10_000_000) { if (!%h) { $c++ }; }

  It restructures the relevant code in rpeep() so that rather than switching on logops like OP_OR, then seeing if that op is preceded by PADHV/RV2HV, it instead switches on PADHV/RV2HV then sees if succeeding ops impose boolean context on that op - that is to say, in all possible execution paths after the PADHV/RV2HV pushes a scalar onto the stack, will that scalar only ever be used for a boolean test? (*). The scanning of succeeding ops is extracted out into a static function. This will make it very easy in future to apply boolean context to other ops too, or to expand the definition of boolean context (e.g. adding 'xor').

  (*) Although in theory an expression like (A && B) can return A if A is false, if A happens to be %hash, and as long as pp_padhv() etc return a boolean false value that is also usable in scalar context (so it returns 0 rather than PL_sv_no), then we can pretend that OP_AND's LH arg is never used as a scalar.
* split ' ', $foo: don't check end byte | David Mitchell | 2016-12-26 | 1 | -3/+3
  The special-cased code to skip spaces at the start of the string didn't check that s < strend, so relied on the string being \0-terminated to work correctly. The introduction of the isSPACE_utf8_safe() macro showed up this dodgy assumption by causing assert failures in regen.t under LC_ALL=en_US.UTF-8 PERL_UNICODE="".
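The bounds check at issue is the standard way to scan a buffer that is delimited by an end pointer rather than a terminator. A minimal standalone illustration follows; `skip_leading_space` is a hypothetical name, and the C library's `isspace` is used in place of perl's own character-class macros.

```c
#include <ctype.h>

/* Skip leading whitespace in the half-open range [s, strend).
 * The s < strend test comes first, so *strend is never read and the
 * buffer does not need to be NUL-terminated. */
const char *skip_leading_space(const char *s, const char *strend)
{
    while (s < strend && isspace((unsigned char)*s))
        s++;
    return s;
}
```

For an input consisting entirely of spaces, the loop stops exactly at `strend` instead of depending on whatever byte happens to follow the buffer.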
* Convert core to use toFOO_utf8_safe() | Karl Williamson | 2016-12-23 | 1 | -9/+9
* Convert some calls to test for malformations | Karl Williamson | 2016-12-23 | 1 | -1/+1
  Code review showed several places in core where a UTF-8 sequence that was for a code point below 256 could be malformed, and be blindly accepted. Convert these to use the similar macro that does the check.

  One place in regexec.c was not converted because it is working on the pattern, which perl should have generated itself, so very unlikely to be malformed.

  I didn't add tests for these, as it would be a pain to figure out how to trigger them, and this is precautionary, based on code reading rather than any known field experience.
* For character case changing, create macros and use | Karl Williamson | 2016-12-23 | 1 | -9/+9
  This creates several macros that future commits will use to provide a layer between the caller and the function.
* Convert core (except toke.c) to use isFOO_utf8_safe() | Karl Williamson | 2016-12-23 | 1 | -4/+4
  The previous commit added this feature; now this commit uses it in core. toke.c is deferred to the next commit to aid in possible future bisecting, because some of the changes there seem somewhat more likely to expose bugs.
* split was leaving PL_sv_undef in unused ary slots | David Mitchell | 2016-11-30 | 1 | -1/+1
  This:

      @a = split(/-/,"-");
      $a[1] = undef;
      $a[0] = 0;

  was giving

      Modification of a read-only value attempted at foo line 3.

  This is because:

  1) unused slots in AvARRAY between AvFILL and AvMAX should always be null; av_clear(), av_extend() etc do this; while av_store(), if storing to a slot N somewhere between AvFILL and AvMAX, doesn't bother to clear between (AvFILL+1)..(N-1) on the assumption that everyone else plays nicely.

  2) pp_split() when splitting directly to an array, sometimes over-splits and has to null out the excess elements;

  3) Since perl 5.19.4, unused AV slots are now marked with NULL rather than &PL_sv_undef;

  4) pp_split was still using &PL_sv_undef.

  The fault was with (4), and is easily fixed.
* add sv_set_undef() API function | David Mitchell | 2016-11-24 | 1 | -1/+1
  This function is equivalent to sv_setsv(sv, &PL_sv_undef), but more efficient.

  Also change the obvious places in the core to use the new idiom.
* Change white space to avoid C++ deprecation warning | Karl Williamson | 2016-11-18 | 1 | -5/+5
  C++11 requires space between the end of a string literal and a macro, so that a feature can unambiguously be added to the language. Starting in g++ 6.2, the compiler emits a warning when there isn't a space (presumably so that future versions can support C++11). Unfortunately there are many such instances in the perl core. This commit fixes those, including those in ext/, but individual commits will be used for the other modules, those in dist/ and cpan/.

  This commit also inserts space at the end of a macro before a string literal, even though that is not deprecated, and removes useless "" literals following a macro (instead of inserting a blank). The result is easier to read, making the macro stand out, and be clearer as to the intention.

  Code and modules included with the Perl core need to be compilable using C++. This is so that perl can be embedded in C++ programs. (Actually, only the hdr files need to be so compilable, but it would be hard to test that just the hdrs are compilable.) So we need to accommodate changes to the C++ language.
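The spacing rule is easiest to see with a standard format macro. The sketch below uses `PRIu64` from `<inttypes.h>` rather than any of perl's own macros, and `format_u64` is a hypothetical helper; the point is only the whitespace on either side of the macro.

```c
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit value into buf.
 * Note the spaces around PRIu64. Written without them, as "%"PRIu64,
 * a C++11 compiler may treat the identifier directly after the closing
 * quote as a user-defined literal suffix instead of expanding the macro,
 * which is what the deprecation warning is about. */
int format_u64(char *buf, size_t len, uint64_t v)
{
    return snprintf(buf, len, "value=%" PRIu64 ";", v);
}
```

In C the spaced and unspaced forms behave identically, so adding the spaces costs nothing and keeps the same source acceptable to both C and C++ compilers.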
* pp.c: use new SvPVCLEAR and constant string friendly macros | Yves Orton | 2016-10-19 | 1 | -14/+13
* Better optimise my/local @a = split() | David Mitchell | 2016-10-04 | 1 | -5/+14
  There are currently two optimisations for when the results of a split are assigned to an array.

  For the first,

      @array = split(...);

  the aassign and padav/rv2av are optimised away, and pp_split() directly assigns to the array attached to the split op (via op_pmtargetoff or op_pmtargetgv).

  For the second,

      my @array = split(...);
      local @array = split(...);
      @{$expr} = split(...);

  the aassign is optimised away, but the padav/rv2av is kept as an additional arg to split. pp_split itself then uses the first arg popped off the stack as the array (this was introduced by FC with v5.21.4-409-gef7999f).

  This commit moves these two:

      my @array = split(...);
      local @array = split(...);

  from the second case to the first case, by simply setting OPpLVAL_INTRO on the OP_SPLIT, and making pp_split() do SAVECLEARSV() or save_ary() as appropriate.

  This makes my @a = split(...) a few percent faster.
* make OP_SPLIT a PMOP, and eliminate OP_PUSHRE | David Mitchell | 2016-10-04 | 1 | -21/+17
  Most ops that execute a regex, such as match and subst, are of type PMOP. A PMOP allows the actual regex to be attached directly to that op, due to its extra fields.

  OP_SPLIT is different; it is just a plain LISTOP, but it always has an OP_PUSHRE as its first child, which *is* a PMOP and which has the regex attached.

  At runtime, pp_pushre()'s only job is to push itself (i.e. the current PL_op) onto the stack. Later pp_split() pops this to get access to the regex it wants to execute.

  This is a bit unpleasant, because we're pushing an OP* onto the stack, which is supposed to be an array of SV*'s. As a bit of a hack, on DEBUGGING builds we push a PVLV with the PL_op address embedded instead, but this still isn't very satisfactory.

  Now that regexes are first-class SVs, we could push a REGEXP onto the stack rather than PL_op. However, there is an optimisation of @array = split which eliminates the assign and embeds the array's GV/padix directly in the PUSHRE op. So split still needs access to that op. But the pushre op will always be splitop->op_first anyway, so one possibility is to just skip executing the pushre altogether, and make pp_split just directly access op_first instead to get the regex and @array info.

  But if we're doing that, then why not just go the full hog and make OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely: with the data that was spread across the two ops now combined into just the one split op.

  That is exactly what this commit does.

  For a simple compile-time pattern like split(/foo/, $s, 1), the optree looks like:

      before:
          <@> split[t2] lK
             </> pushre(/"foo"/) s/RTIME
             <0> padsv[$s:1,2] s
             <$> const(IV 1) s

      after:
          </> split(/"foo"/)[t2] lK/RTIME
             <0> padsv[$s:1,2] s
             <$> const[IV 1] s

  while for a run-time expression like split(/$pat/, $s, 1),

      before:
          <@> split[t3] lK
             </> pushre() sK/RTIME
                <|> regcomp(other->8) sK
                   <0> padsv[$pat:2,3] s
             <0> padsv[$s:1,3] s
             <$> const(IV 1) s

      after:
          </> split()[t3] lK/RTIME
             <|> regcomp(other->8) sK
                <0> padsv[$pat:2,3] s
             <0> padsv[$s:1,3] s
             <$> const[IV 1] s

  This makes the code faster and simpler.

  At the same time, two new private flags have been added for OP_SPLIT - OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the assign op has been optimised away, and if so, whether the array is lexical.

  Also, deparsing of split has been improved, to the extent that

      perl TEST -deparse op/split.t

  now passes.

  Also, a couple of panic messages in pp_split() have been replaced with asserts().
* vax-netbsd: avoid NV_INF/NV_NAN uses | Jarkko Hietaniemi | 2016-09-30 | 1 | -0/+4
* OP_AVHVSWITCH: make op_private bits 0..1 symbolic | David Mitchell | 2016-09-27 | 1 | -1/+1
  Add OPpAVHVSWITCH_MASK and make Concise etc display the offset as /offset=2 rather than /2.
* [perl #129164] Crash with splice | Father Chrysostomos | 2016-09-11 | 1 | -0/+4
  This fixes #129166 and #129167 as well.

  splice needs to take into account that arrays can hold NULLs and return &PL_sv_undef in those cases where it would have returned a NULL element.