| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
This was added in commit dfe9444ca7881e716e9e8feaf20b55da491363ca (February
1998, for Perl 5.004_60) by Andy Dougherty, and its comment says that, even
then, he thought it was unneeded. But the perl5 repo has ever defined the
NEED_GETPID_PROTO cpp symbol that guards this declaration, so this ability
has clearly never been used.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The push and unshift builtins were correctly throwing a "Modification of a
read-only value attempted" exception when modifying a read-only array, but
splice was silently modifying the array. This commit adds tests that all
three builtins throw such an exception.
One discrepancy between the three remains: push has long silently accepted
a push of no elements onto an array, whereas unshift throws an exception in
that situation. This seems to have been originally a coincidence. The
pp_unshift implementation first makes space for the elements it unshifts
(which croaks for a read-only array), then copies the new values into the
space thus created. The pp_push implementation, on the other hand, calls
av_push() individually on each element; that implicitly croaks, but only one
there's at least one element being pushed.
The pp_push implementation has subsequently been changed: read-only checking
is now done first, but that was done to fix a memory leak. (If the av_push()
itself failed, then the new SV that had been allocated for pushing onto the
array would get leaked.) That leak fix specifically grandfathered in the
acceptance of empty-push-to-readonly-array, to avoid changing behaviour.
I'm not fond of the inconsistency betwen push on the one hand and unshift &
splice on the other, but I'm disinclined to make empty-push-to-readonly
suddenly start throwing an exception after all these years, and it seems
best not to extend that exemption-from-exception to the other builtins.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The sub-in-stash optimization introduced in 2eaf799e only applied to
subs in the main stash, not in other stashes, due to a problem with
the logic in newATTRSUB.
This comment:
Also, we may be called from load_module at run time, so
PL_curstash (which sets CvSTASH) may not point to the stash the
sub is stored in.
explains why we need the PL_curstash != CopSTASH(PL_curcop) check.
(Perl_load_module will fail without it.) But that logic does not work
properly at compile time (when PL_curcop == &PL_compiling).
The value of CopSTASH(&PL_compiling) is never actually used. It is
always set to the main stash. So if we check that PL_curstash !=
CopSTASH(PL_curcop) and forego the optimization in that case, we will
never optimize subs outside of the main stash.
What we really need is to check IN_PERL_RUNTIME && PL_curstash !=
opSTASH(PL_curcop). I.e., forego the optimization at run time if the
stashes differ. That is what this commit implements.
One observable side effect of this change is that deleting a stash
element no longer anonymizes the CV if the CV had no GV that it was
depending on to provide its name. Since the main thing in such situa-
tions is that we do not get a crash, I think this change (arguably an
improvement) is acceptable.)
-----------
A bit of explanation of various other changes:
gv.c:require_tie_mod needed a bit of help, since it could not handle
sub refs in stashes.
To keep localisation of stash elements working the same way,
local($Stash::{foo}) now upgrades a coderef to a full GV before the
localisation. (Changes in two pp*.c files and in scope.c:save_gp.)
t/op/stash.t contains a test that makes sure that perl does not crash
when a GV with a CV pointing to it gets deleted. This commit tweaks
the test so that it continues to test that. (There has to be a GV for
the test to test what it is meant to test.)
Similarly with t/uni/caller.t and t/uni/stash.t.
op.c:rv2cv_op_cv with the _MAYBE_NAME_GV flag was returning the cal-
ling GV in those cases where a GV-less sub is called via a GV. E.g.,
*main = \&Foo::foo; main(). This meant that errors like ‘Not enough
arguments’ were giving the wrong sub name.
newATTRSUB was not calling mro_method_changed_in when storing a
sub as an RV.
gv_init needs to arrange for the new GV to have the file and line num-
ber corresponding to the sub in it. These are taken from CvSTART,
which may be off by a few lines, but is the closest we have to the
place the sub was declared.
|
|
|
|
|
|
|
|
|
|
|
|
| |
For -flto -mieee-fp builds, the _LIB_VERSION definition in perl.c and
in libieee conflict, causing a build failure.
The test we perform in Configure checks only that such a variable exists
(and is declared), it doesn't check that we can *define* such a variable,
which the code in pp.c tried to do.
So rather than trying to define the variable, just initialize it during
our normal interpreter initialization.
|
|
|
|
|
|
|
|
|
| |
Just a cut+paste; no code or functional changes.
As well as being hot code, pp_padav() and pp_padhv() also have a lot of
code in common with pp_rv2av() (which also implements pp_rv2hv()). Having
all three functions in the same file will allow the next few commits to
move some of that common code into static inline functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unusually, index() and rindex() return -1 on failure.
So it's reasonably common to see code like
if (index(...) != -1) { ... }
and variants.
For such code, this commit optimises away to OP_EQ and OP_CONST,
and sets a couple of private flags on the index op instead, indicating:
OPpTRUEBOOL return a boolean which is a comparison of
what the return would have been, against -1
OPpINDEX_BOOLNEG negate the boolean result
Its also supports OPpTRUEBOOL in conjunction with the existing
OPpTARGET_MY flag, so for example in
$lexical = (index(...) == -1)
the padmy, sassign, eq and const ops are all optimised away.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some ops which return integer values and which have a reasonable
likelihood of being used in a boolean context, set the OPpTRUEBOOL
flag on the op as appropriate, and at runtime return &PL_sv_yes /
&PL_sv_zero rather than an integer value.
This is especially beneficial where the op doesn't have a targ, so has
to create a mortal SV to return the integer value.
Similarly, its a win where it may be expensive to calculate an integer
return value, such as pos() or length() converting between byte and char
offset.
Ops done:
OP_SUBST
OP_AASSIGN
OP_POS
OP_LENGTH
OP_GREPWHILE
|
|
|
|
|
|
|
|
|
| |
The STATIC_ASSERT_STMT() is basically checking that shifting the
HINT_BYTES byte left 26 places gives you SVf_UTF8, so just assert that.
There's no need to assert the current values of HINT_BYTES and SVf_UTF8.
Other than that, this commit tides up the code a bit (only whitespace
changes and unnecessary brace removal), and adds/updates some code comments.
|
|
|
|
|
| |
after doing get magic, if the result is SVf_POK and non-utf8,
just use SvCUR(sv).
|
|
|
|
|
|
|
|
| |
TARGi(i,1) is equivalent to sv_setiv_mg(TARG,i), except that it inlines
some simple common cases.
Also add a couple of test for length on an overloaded utf8 string.
I don't think it was being tested for properly.
|
|
|
|
|
|
|
|
|
| |
It's quicker to return (and to test for) &PL_sv_zero or &PL_sv_yes,
than setting a targ to an integer value or, in the vase of padav,
creating a mortal sv and setting it to an integer value.
In fact for padav, even in the scalar but non-boolean case, return
&PL_sv_zero if the value is zero rather than creating and setting a mortal.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In something like
if (keys %h) { ... }
the 'keys %h' is implemented as the op sequences
gv[*h] s
rv2hv lKRM/1
keys[t2] sK/1
or
padhv[%h:1,6] lRM
keys[t2] sK/1
It turns out that (%h) in scalar and void context now behaves very
similarly to (keys %h) (except that it reset the iterator), so in these
cases, convert the two ops
rv2hv/padhv, keys
into the single op
rv2hv/padhv
with a private flag indicating that the op is handling the 'keys' action
by itself.
As well as one less op to execute, this brings the boolean-context
optimisation already present in padhv/rv2sv to keys. So
if (keys %h) { ... }
is no longer slower than
if (%h) { ... }
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function can be called directly or indirectly by several ops.
Update its code comments to explain this in detail, and assert which
ops can call it. Also remove a redundant comment about
OP_RKEYS/OP_RVALUES; these ops have been removed.
Also, reformat the 'dokv = ' expressions.
Finally, add some code comments to pp_avhvswitch explaining what its for.
Apart from the op_type asserts, there should be no functional changes.
|
|
|
|
|
|
| |
Where its obvious that the args can't be null, use SvTRUE_NN() instead.
Avoid possible multiple evaluations of the arg by assigning to a local var
first if necessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-instate the special-casing, which was removed by v5.25.8-172-gb243b19,
of OP_AND in boolean-context determination.
This is because the special-case allowed things to be more efficient
sometimes, but required returning a false value as sv_2mortal(newSViv(0)))
rather than &PL_sv_no. Now that PL_sv_zero has been added we can use
that instead, cheaply.
This commit adds an extra arg to S_check_for_bool_cxt() to indicate
whether the op supports the special-casing of OP_AND.
|
|
|
|
|
|
| |
In scalar (well, non-list) context, pp_list always yields exactly one stack
element. It must therefore extend the stack for that element, in case there
were no arguments on the stack when it started.
|
| |
|
|
|
|
|
|
| |
In scalar context, an empty list slice returns PL_sv_undef.
Extned the stack for this return value, since if there were no elements
or indices, nothing was popped from the stack
|
|
|
|
|
| |
If we';re using the implicit $_ there's nothing to pop off the stack,
so there may not be a spare stack slot to push the result.
|
|
|
|
|
|
|
|
|
| |
With v5.27.0-317-g88b1365, I changed the SvSETMAGIC(TARG) at the end of
pp_ref() to a simple assert(!SvSMAGICAL(TARG)), on the grounds that it
always returned a simple string or similar, which should never be magic.
Turns out I didn't allow for taint magic. This commit restores the old
behaviour and fixes Module::Runtime (which is where I spotted the issue).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5d09ee1cb7b68f5e6fd15233bfe5048612e8f949 fatalized bitwise
operations of operands with wide characters in them. It retained the
regular UTF-8 handling, but throws an error when a wide character is
encountered.
But this code is complicated because of its original intended
generality. It can essentially be ripped out, replaced by code that
just downgrades the operand to non-UTF-8. Then we use the regular code
to do the operation. In the complement case, that's all that need be
done to mimic earlier behavior, as the result has not been in UTF-8.
For the other operations, the result is simply upgraded to UTF-8.
This removes quite a few lines of code, and now the UTF-8 handling uses
the same tight loops as the non-UTF-8. Downgrading and upgrading had to
be done specially before, but now they are done in tight loops, before
the operation, and after the operation
|
|
|
|
|
|
| |
This commit removes quite a number of tests, mostly from t/op/bop.t,
which test the behaviour of such code points in combination of
bitwise operators. Since it's now fatal, the tests are no longer useful.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #78288
When ref() is used in a boolean context, it's not necessary to return
the name of the package which an object is blessed into; instead a simple
truth value can be returned, which is faster.
Note that it has to cope with the subtlety of an object blessed into the
class "0", which should return false.
Porting/bench.pl shows for the expression !ref($r), approximately:
unchanged for a non-reference $r
doubling of speed for a reference $r
tripling of speed for a blessed reference $r
This commit builds on the mechanism already used to set the OPpTRUEBOOL
and OPpMAYBE_TRUEBOOL flags on padhv and rv2hv ops when used in boolean
context.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #130861
This function is used to load a module associated with various magic vars,
like $[ and %+. Since it can be called 'unexpectedly', it should use a new
stack. The issue in this ticket was equivalent to
my $var = '[';
$$var;
where the symbolic dereference triggered a run-time load of arybase.pm,
which grew the stack, invalidating the SP in pp_rv2sv.
Note that most of the stuff which S_require_tie_mod() calls, such as
load_module(), will do its own PUSHSTACK(); but S_require_tie_mod() also
does a bit of stack manipulation itself.
The test case includes a magic number, 125, which happens to be the exact
size necessary to trigger a stack realloc in S_require_tie_mod(). In later
perl versions this value may well change. But it seemed too expensive
to call fresh_perl_is() 100's of times with different values of $n.
This commit also adds a SPAGAIN to pp_rv2sv on the 'belt and braces'
principle.
This commit is based on an earlier effort by Aaron Crane.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #131083
Recent commits v5.25.10-81-gd69c430 and v5.25.10-82-g67dd6f3 added
out-of-range/overflow checks for the offset arg of vec(). However in
lvalue context, these croaks now happen before the SVt_PVLV was created,
rather than when its set magic was called. This means that something like
sub f { $x = $_[0] }
f(vec($s, -1, 8))
now croaks even though the out-of-range value never ended up getting used
in lvalue context.
This commit fixes things by, in pp_vec(), rather than croaking, just set
flag bits in LvFLAGS() to indicate that the offset is -Ve / out-of-range.
Then in Perl_magic_getvec(), return 0 if these flags are set, and in
Perl_magic_setvec() croak with a suitable error.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... and fix up its caller, pp_vec().
This is part of a fix for RT #130915.
pp_vec() is responsible for extracting out the offset and size from SVs on
the stack, and then calling do_vecget() with those values. (Sometimes the
call is done indirectly by storing the offset in the LvTARGOFF() field of
a SVt_PVLV, then later Perl_magic_getvec() passes the LvTARGOFF() value to
do_vecget().)
Now SvCUR, SvLEN and LvTARGOFF are all of type STRLEN (a.k.a Size_t),
while the offset arg of do_vecget() is of type SSize_t (i.e. there's a
signed/unsigned mismatch). It makes more sense to make the arg of type
STRLEN. So that is what this commit does.
At the same time this commit fixes up pp_vec() to handle all the
possibilities where the offset value can't fit into a STRLEN, returning 0
or croaking accordingly, so that do_vecget() is never called with a
truncated or wrapped offset.
The next commit will fix up the internals of do_vecget() and do_vecset(),
which have to worry about offset*(2^n) wrapping or being > SvCUR().
This commit is based on an earlier proposed fix by Aaron Crane.
|
| |
|
| |
|
|
|
|
|
|
| |
Some vars have been tagged as const because they do not change in their
new scopes. In pp_reverse in pp.c, I32 tmp is only used to hold a char,
so is changed to char.
|
|
|
|
|
|
| |
pp_ord fell foul of the new API stricture added by
d1f8d421df731c77beff3db92d27dc6ec28589f2. Change it to avoid calling
utf8n_to_uvchr() on an empty string. Fixes [perl #130545].
|
|
|
|
|
| |
pp_split didn't ensure there was space for its return value
in scalar context.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some ops, (currently PADHV and RV2HV) can be flagged as being in boolean
context, and if so, may return a simple truth value which may be more
efficient to calculate than a full scalar value. (This was originally
motivated by code like if (%h) {...}, where the scalar context %h returned a
bucket ratio string, which involved counting how many HvARRAY buckets were
non-empty, which was slow in large hashes. It's been made less important
since %h in scalar context now just returns a key count, which is quick to
calculate.)
There is an issue with the A argument of A||B, A//B and A&&B, in that,
although A checked by the logop in boolean context, depending on its
truth value the original A may be passed through to the next op. So in
something like $x = (%h || -1), it's not sufficient for %h to return a
truth value; it must return a full scalar value which may get assigned to
$x.
So in general, we only mark the A op as being in boolean context if the
logop is in void context, or if the returned A would only be consumed in
boolean context; so !(A||B) would be ok for example.
However, && is a special case of this, since it will return the original A
only if A was false. Before this commit, && was special-cased to mark A as
being in boolean context regardless of the context of (A&&B). The downside
of this is that the A op can't just return &PL_sv_no as a false value;
it has to return something that is usable in scalar context too. For
example with %h, it returns sv_2mortal(newSViv(0))), which stringifies to
"0" while &PL_sv_no stringifies to "".
This commit removes that special case and makes && behave like || and //
again.
The upside is that some ops in boolean context will be able to more
cheaply return a false value (e.g. just &PL_sv_no verses
sv_2mortal(newSViv(0))).
The main downside is that && in unknown context (typically an
'if (%h} {...}' as the last statement in a sub) will have to check at
runtime whether the caller context is slower. It will also have to return
a scalar value for something like $y = (%h && $x), but that's a relatively
uncommon occurrence, and now that %h in scalar context doesn't have to
count used buckets, the extra cost in these rare cases is minor.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When certain ops are used in a boolean context (currently just PADHV and
RV2SV, implementing '%hash'), one of the private flags OPpTRUEBOOL or
OPpMAYBE_TRUEBOOL is set on the op to indicate this; at
runtime, the pp function can then just return a boolean value rather than
a full scalar value (in the case of %hash, an element count).
However, the code which sets these flags has had a complex history, and is
a bit messy. It also sets the flags incorrectly (but safely) in many
cases: principally indicating boolean context when it's in fact void, or
scalar context when it's in fact boolean. Both these permutations make the
code potentially slower (but still correct).
[ As a side-note: in 5.25, a bare %hash in scalar context changed from
returning a bucket count etc, to just returning a key count, which is
quicker to calculate. So the boolean optimisation for %hash is not nearly
as important now: it's now just the overhead of creating a temp to return
a count verses returning &PL_sv_yes, rather than counting keys. However
the improved and generalised boolean context detection added by this
commit will be useful in future to apply boolean context to other ops. ]
In particular, this wasn't being optimised (i.e. a 'not' of a hash within
an if):
if (!%hash) { ...}
This commit fixes all these cases (and uncomments a load of failing tests
in t/perf/optree.t which were added in the previous commit.)
It makes the code below nearly 3 times faster:
my $c; my %h = 1..10;
for my $i (1..10_000_000) { if (!%h) { $c++ }; }
It restructures the relevant code in rpeep() so that rather than switching
on logops like OP_OR, then seeing if that op is preceded by PADHV/RV2HV,
it instead switches on PADHV/RV2HV then sees if succeeding ops impose
boolean context on that op - that is to say, in all possible execution
paths after the PADHV/RV2HV pushes a scalar onto the stack, will that
scalar only ever be used for a boolean test? (*).
The scanning of succeeding ops is extracted out into a static function.
This will make it very easy in future to apply boolean context to other
ops too, or to expand the definition of boolean context (e.g. adding
'xor').
(*) Although in theory an expression like (A && B) can return A if A is
false, if A happens to be %hash, and as long as pp_padhv() etc return
a boolean false value that is also usable in scalar context (so it returns
0 rather than PL_sv_no), then we can pretend that OP_AND's LH arg is
never used as a scalar.
|
|
|
|
|
|
|
|
| |
The special-cased code to skip spaces at the start of the string
didn't check that s < strend, so relied on the string being \0-terminated
to work correctly. The introduction of the isSPACE_utf8_safe() macro
showed up this dodgy assumption by causing assert failures in regen.t
under LC_ALL=en_US.UTF-8 PERL_UNICODE="".
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Code review showed several places in core where a UTF-8 sequence that
was for a code point below 256 could be malformed, and be blindly
accepted. Convert these to use the similar macro that does the check.
One place in regexec.c was not converted because it is working on the
pattern, which perl should have generated itself, so very unlikely to be
bemalformed.
I didn't add tests for these, as it would be a pain to figure out
somehow to trigger them, and this is precautionary, based on code
reading rather than any known field experience.
|
|
|
|
|
| |
This creates several macros that future commits will use to provide a
layer between the caller and the function.
|
|
|
|
|
|
|
| |
The previous commit added this feature; now this commit uses it in core.
toke.c is deferred to the next commit to aid in possible future
bisecting, because some of the changes there seem somewhat more likely
to expose bugs.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This:
@a = split(/-/,"-");
$a[1] = undef;
$a[0] = 0;
was giving
Modification of a read-only value attempted at foo line 3.
This is because:
1) unused slots in AvARRAY between AvFILL and AvMAX should always be
null; av_clear(), av_extend() etc do this; while av_store(), if storing
to a slot N somewhere between AvFILL and AvMAX, doesn't bother to clear
between (AvFILL+1)..(N-1) on the assumption that everyone else plays
nicely.
2) pp_split() when splitting directly to an array, sometimes over-splits
and has to null out the excess elements;
3) Since perl 5.19.4, unused AV slots are now marked with NULL rather than
&PL_sv_undef;
4) pp_split was still using &PL_sv_undef;
The fault was with (4), and is easily fixed.
|
|
|
|
|
|
|
| |
This function is equivalent to sv_setsv(sv, &PL_sv_undef), but more
efficient.
Also change the obvious places in the core to use the new idiom.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
C++11 requires space between the end of a string literal and a macro, so
that a feature can unambiguously be added to the language. Starting in
g++ 6.2, the compiler emits a warning when there isn't a space
(presumably so that future versions can support C++11). Unfortunately
there are many such instances in the perl core. This commit fixes
those, including those in ext/, but individual commits will be used for
the other modules, those in dist/ and cpan/.
This commit also inserts space at the end of a macro before a string
literal, even though that is not deprecated, and removes useless ""
literals following a macro (instead of inserting a blank). The result
is easier to read, making the macro stand out, and be clearer as to the
intention.
Code and modules included with the Perl core need to be compilable using
C++. This is so that perl can be embedded in C++ programs. (Actually,
only the hdr files need to be so compilable, but it would be hard to
test that just the hdrs are compilable.) So we need to accommodate
changes to the C++ language.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are currently two optimisations for when the results of a split
are assigned to an array.
For the first,
@array = split(...);
the aassign and padav/rv2av are optimised away, and pp_split() directly
assigns to the array attached to the split op (via op_pmtargetoff or
op_pmtargetgv).
For the second,
my @array = split(...);
local @array = split(...);
@{$expr} = split(...);
The aassign is optimised away, but the padav/rv2av is kept as an additional
arg to split. pp_split itself then uses the first arg popped off the stack
as the array (This was introduced by FC with v5.21.4-409-gef7999f).
This commit moves these two:
my @array = split(...);
local @array = split(...);
from the second case to the first case, by simply setting OPpLVAL_INTRO
on the OP_SPLIT, and making pp_split() do SAVECLEARSV() or save_ary()
as appropriate.
This makes my @a = split(...) a few percent faster.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most ops that execute a regex, such as match and subst, are of type PMOP.
A PMOP allows the actual regex to be attached directly to that op, due
to its extra fields.
OP_SPLIT is different; it is just a plain LISTOP, but it always has an
OP_PUSHRE as its first child, which *is* a PMOP and which has the regex
attached.
At runtime, pp_pushre()'s only job is to push itself (i.e. the current
PL_op) onto the stack. Later pp_split() pops this to get access to the
regex it wants to execute.
This is a bit unpleasant, because we're pushing an OP* onto the stack,
which is supposed to be an array of SV*'s. As a bit of a hack, on
DEBUGGING builds we push a PVLV with the PL_op address embedded instead,
but this still isn't very satisfactory.
Now that regexes are first-class SVs, we could push a REGEXP onto the
stack rather than PL_op. However, there is an optimisation of @array =
split which eliminates the assign and embeds the array's GV/padix directly
in the PUSHRE op. So split still needs access to that op. But the pushre
op will always be splitop->op_first anyway, so one possibility is to just
skip executing the pushre altogether, and make pp_split just directly
access op_first instead to get the regex and @array info.
But if we're doing that, then why not just go the full hog and make
OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely: with the
data that was spread across the two ops now combined into just the one
split op.
That is exactly what this commit does.
For a simple compile-time pattern like split(/foo/, $s, 1), the optree
looks like:
before:
<@> split[t2] lK
</> pushre(/"foo"/) s/RTIME
<0> padsv[$s:1,2] s
<$> const(IV 1) s
after:
</> split(/"foo"/)[t2] lK/RTIME
<0> padsv[$s:1,2] s
<$> const[IV 1] s
while for a run-time expression like split(/$pat/, $s, 1),
before:
<@> split[t3] lK
</> pushre() sK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const(IV 1)s
after:
</> split()[t3] lK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const[IV 1] s
This makes the code faster and simpler.
At the same time, two new private flags have been added for OP_SPLIT -
OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
assign op has been optimised away, and if so, whether the array is
lexical.
Also, deparsing of split has been improved, to the extent that
perl TEST -deparse op/split.t
now passes.
Also, a couple of panic messages in pp_split() have been replaced with
asserts().
|
| |
|
|
|
|
|
| |
Add OPpAVHVSWITCH_MASK and make Concise etc display the offset as
/offset=2 rather than /2.
|
|
|
|
|
|
|
|
| |
This fixes #129166 and #129167 as well.
splice needs to take into account that arrays can hold NULLs and
return &PL_sv_undef in those cases where it would have returned a
NULL element.
|