| Commit message | Author | Age | Files | Lines |
| |
This commit introduces a new OP to replace cases of OP_ANONLIST and
OP_ANONHASH where there are zero elements, which is very common in
Perl code.
As an example, `my $x = {}` is currently implemented like this:
...
6 <2> sassign vKS/2 ->7
4 <@> anonhash sK* ->5
3 <0> pushmark s ->4
5 <0> padsv[$x:1,2] sRM*/LVINTRO ->6
The pushmark serves no meaningful purpose when there are zero
elements and the anonhash, besides undoing the pushmark,
performs work that is unnecessary for this special case.
The peephole optimizer, which also checks for applicability of a
related TARGMY optimization, transforms this example into:
...
- <1> ex-sassign vKS/2 ->4
3 <@> emptyavhv[$x:1,2] vK*/LVINTRO,ANONHASH,TARGMY ->4
- <0> ex-pushmark s ->3
- <0> ex-padsv sRM*/LVINTRO ->-
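As a rough illustration (not taken from the commit), these are the kinds
of zero-element constructors that should now compile down to a single
emptyavhv op when the optimisation applies:
    my $href = {};    # empty anonymous hash (OP_ANONHASH with no elements)
    my $aref = [];    # empty anonymous array (OP_ANONLIST with no elements)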
| |
This commit:
* Adds support for negative keys, as per the original AELEMFAST_LEX
* Changes an if() check for a "useless assignment to a temporary" into
an assert, since this condition should never be true when the LHS is
the result of an array fetch.
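A small sketch (not from the commit) of the negative-index form this
covers, alongside the positive case:
    my @ary = (1, 2, 3);
    $ary[0]  = "first";   # positive constant index, already handled
    $ary[-1] = "last";    # negative constant index, now handled too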
| |
This commit introduces a new OP to replace simple cases of OP_SASSIGN
and OP_AELEMFAST_LEX. (Similar concept to GH #19943)
For example, `my @ary; $ary[0] = "boo"` is currently implemented as:
7 <2> sassign vKS/2 ->8
5 <$> const[PV "boo"] s ->6
- <1> ex-aelem sKRM*/2 ->7
6 <0> aelemfast_lex[@ary:1,2] sRM ->7
- <0> ex-const s ->-
But now will be turned into:
6 <1> aelemfastlex_store[@ary:1,2] vKS ->7
5 <$> const(PV "boo") s ->6
- <1> ex-aelem sKRM*/2 ->6
- <0> ex-aelemfast_lex sRM ->6
- <0> ex-const s ->-
This is intended to be a transparent performance optimization.
It should be applicable for RHS optrees of varying complexity.
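One way to inspect the optree actually generated on a given build is
B::Concise (a sketch; the exact dump depends on the perl version):
    perl -MO=Concise -e 'my @ary; $ary[0] = "boo"'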
| |
To avoid "subroutine redefined" warning.
For: https://github.com/Perl/perl5/issues/20164
| |
This allows the existing `undef` OP to act on a pad SV. The following
two cases are optimized:
`undef my $x`, currently implemented as:
4 <1> undef vK/1 ->5
3 <0> padsv[$x:1,2] sRM/LVINTRO ->4
`my $a = undef`, currently implemented as:
5 <2> sassign vKS/2 ->6
3 <0> undef s ->4
4 <0> padsv[$x:1,2] sRM*/LVINTRO ->5
These are now just represented as:
3 <1> undef[$x:1,2] vK/SOMEFLAGS ->4
Note: The two cases are not quite functionally identical, as `$x = undef`
clears the SV flags but preserves any PV allocation for later reuse,
whereas `undef $x` does free any PV allocation. This behaviour difference
is preserved through use of the OPpUNDEF_KEEP_PV flag.
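A rough way to observe that difference from Perl code is Devel::Peek
(illustrative only; the dump output varies between builds):
    use Devel::Peek;
    my $x = "hello";
    $x = undef;     # SV flags cleared, PV buffer kept for reuse
    Dump($x);
    my $y = "hello";
    undef $y;       # PV buffer freed
    Dump($y);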
| |
This commit introduces a new OP to replace simple cases
of OP_SASSIGN and OP_PADSV.
For example, 'my $x = 1' is currently implemented as:
1 <;> nextstate(main 1 -e:1) v:{
2 <$> const(IV 1) s
3 <0> padsv[$x:1,2] sRM*/LVINTRO
4 <2> sassign vKS/2
But now will be turned into:
1 <;> nextstate(main 1 -e:1) v:{
2 <$> const(IV 1) s
3 <1> padsv_store[$x:1,2] vKMS/LVINTRO
This is intended to be a transparent performance optimization.
It should be applicable for RHS optrees of varying complexity.
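As a loose sketch (not from the commit) of the shape of code this
targets - a lexical scalar on the LHS with RHSes of varying complexity:
    my $x = 1;          # constant RHS
    my $y = $x + 2;     # expression RHS
    my $z = sqrt($y);   # function-call RHS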
| |
Also tweak the implementation of the other two boolean builtins (is_bool
& is_weak) to be slightly more efficient.
| |
Also, ensure that B::Deparse understands the OA_TARGMY optimisation of
OP_ISBOOL
| |
* Apply OA_RETSCALAR, OA_TARGLEX and OA_FOLDCONST flags
* Handle both 'get' and 'set' magic
| |
Turn builtin::true/false into OP_CONSTs
Add a dedicated OP_ISBOOL, make an efficient op version of builtin::isbool()
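A small usage sketch (builtin was experimental at the time, hence the
warnings line):
    use builtin qw(true false is_bool);
    no warnings 'experimental::builtin';
    my $t = true;                                   # now an OP_CONST rather than a sub call
    print is_bool($t) ? "bool\n" : "not bool\n";    # is_bool compiles to OP_ISBOOL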
| |
Adds syntax `defer { BLOCK }` to create a deferred block; code that is
deferred until the scope exits. This syntax is guarded by
use feature 'defer';
Adds a new opcode, `OP_PUSHDEFER`, which is a LOGOP whose `op_other` field
gives the start of an optree to be deferred until scope exit. That op
pointer will be stored on the save stack and invoked as part of scope
unwind.
Included is support for `B::Deparse` to deparse the optree back into
syntax.
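A minimal usage sketch (the feature was experimental when introduced,
hence the warnings line):
    use feature 'defer';
    no warnings 'experimental::defer';
    {
        defer { print "world\n" }   # queued until the enclosing block exits
        print "hello ";
    }
    # prints "hello world"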
| |
The benchmarks which test adding an int to a num weren't measuring
correctly because, after the first iteration, the PVIV var got upgraded
to a PVNV - thus causing the optimised 'both args NV' path to be chosen
after that.
Make the benchmark recreate the integer variable on each iteration.
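A sketch of the fixed shape of such an entry (the key name and values
are illustrative, not copied from t/perf/benchmarks):
    'expr::arith::add_int_num' => {
        desc  => 'add an integer to an NV',
        setup => 'my $n = 1.5',
        code  => 'my $i = 3; $i + $n',   # recreate $i so it stays an IV
    },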
| |
In something like
local $~ = "$~X";
i.e. where localising a magic variable whose previous value should be
used as part of a string concat on the RHS, don't fold the assign into
the multiconcat op. Otherwise the code execution path looks a bit like:
local($~) = undef;
multiconcat($~, $~, "X");
[ where multiconcat's args are (target, arg1, arg2,....) ]
and thus multiconcat sees an undef arg.
By leaving the assign out of the multiconcat, code execution now looks
like
my $targ;
multiconcat($targ, $~, "X");
local($~) = $targ;
See http://nntp.perl.org/group/perl.perl5.porters/256898,
"Bug in format introduced in 5.27.6".
Although the bug only appears with magic vars, this patch pessimises
all forms of 'local $foo = "..."', 'local $foo{bar} = "..."' etc.
Strictly speaking the bug occurs because with 'local' you end up with
two SVs (the saved one and the one currently in the glob) which both
have the same container magic and where mg_set()ing one changes the
mg_get() value of the other (for example, vars like $!). One of the two SVs
becomes an arg of multiconcat, the other becomes its target. Part of
localising the target SV (before multiconcat is called) wipes the value
of the arg SV.
| |
original merge commit: v5.31.3-198-gd2cd363728
reverted by: v5.31.4-0-g20ef288c53
The commit following this one fixes the breakage, which means the revert
can be undone.
| |
This reverts commit d2cd363728088adada85312725ac9d96c29659be, reversing
changes made to 068b48acd4bdf9e7c69b87f4ba838bdff035053c.
This change breaks installing Test::Deep:
...
not ok 37 - Test 'isa eq' completed
ok 38 - Test 'isa eq' no premature diagnostication
...
| |
This function makes use of PL_curstackinfo->si_cxsubix to avoid the
overhead of a call to block_gimme() when the context of the op is
unknown.
| |
Make the checks for "do 't/perf/benchmarks'" look more like those
suggested for 'do' in perlfunc.
In particular, this may help track down the issue in RT #133663.
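For reference, the perlfunc-style checks look roughly like this (a
sketch, not the exact code added):
    my $file = 't/perf/benchmarks';
    my $benchmarks = do $file;
    die "couldn't parse $file: $@" if $@;
    die "couldn't do $file: $!"    unless defined $benchmarks;
    die "$file did not return a true value" unless $benchmarks;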
| |
RT #132385
In something like
$overloaded . "a" . "b"
perl used to do
$overloaded->concat("a")->concat("b")
but since the introduction of OP_MULTICONCAT, started doing:
$overloaded->concat("ab")
This commit restores the old behaviour, by keeping every second adjacent
OP_CONST as an arg rather than optimising it away and adding its contents
to the constant string in the aux struct.
But note that
$overloaded .= "a" . "b"
originally, and still, constant folds.
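A rough way to observe the restored behaviour (the package below is
purely illustrative):
    package Tracker {
        use overload
            '.'  => sub { my ($self, $other) = @_; print "concat('$other')\n"; $self },
            '""' => sub { 'T' };
    }
    my $overloaded = bless {}, 'Tracker';
    my $r = $overloaded . "a" . "b";
    # restored behaviour: prints concat('a') then concat('b'), not concat('ab')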
| |
Allow multiple OP_CONCAT, OP_CONST ops, plus optionally an OP_SASSIGN
or OP_STRINGIFY, to be combined into a single OP_MULTICONCAT op, which can
make things a *lot* faster: 4x or more.
In more detail: it will optimise into a single OP_MULTICONCAT, most
expressions of the form
LHS RHS
where LHS is one of
(empty)
my $lexical =
$lexical =
$lexical .=
expression =
expression .=
and RHS is one of
(A . B . C . ...) where A,B,C etc are expressions and/or
string constants
"aAbBc..." where a,A,b,B etc are expressions and/or
string constants
sprintf "..%s..%s..", A,B,.. where the format is a constant string
containing only '%s' and '%%' elements,
and A,B, etc are scalar expressions (so
only a fixed, compile-time-known number of
args: no arrays or list context function
calls etc)
It doesn't optimise other forms, such as
($a . $b) . ($c . $d)
((($a .= $b) .= $c) .= $d);
(although sub-parts of those expressions might be converted to an
OP_MULTICONCAT). This is partly because it would be hard to maintain the
correct ordering of tie or overload calls.
The compiler uses heuristics to determine when to convert: in general,
expressions involving a single OP_CONCAT aren't converted, unless some
other saving can be made, for example if an OP_CONST can be eliminated, or
in the presence of 'my $x = .. ' which OP_MULTICONCAT can apply
OPpTARGET_MY to, but OP_CONCAT can't.
The multiconcat op is of type UNOP_AUX, with the op_aux structure directly
holding a pointer to a single constant char* string plus a list of segment
lengths. So for
"a=$a b=$b\n";
the constant string is "a= b=\n", and the segment lengths are (2,3,1).
If the constant string has different non-utf8 and utf8 representations
(such as "\x80") then both variants are pre-computed and stored in the aux
struct, along with two sets of segment lengths.
For all the above LHS types, any SASSIGN op is optimised away. For a LHS
of '$lex=', '$lex.=' or 'my $lex=', the PADSV is optimised away too.
For example where $a and $b are lexical vars, this statement:
my $c = "a=$a, b=$b\n";
formerly compiled to
const[PV "a="] s
padsv[$a:1,3] s
concat[t4] sK/2
const[PV ", b="] s
concat[t5] sKS/2
padsv[$b:1,3] s
concat[t6] sKS/2
const[PV "\n"] s
concat[t7] sKS/2
padsv[$c:2,3] sRM*/LVINTRO
sassign vKS/2
and now compiles to:
padsv[$a:1,3] s
padsv[$b:1,3] s
multiconcat("a=, b=\n",2,4,1)[$c:2,3] vK/LVINTRO,TARGMY,STRINGIFY
In terms of how much faster it is, this code:
my $a = "the quick brown fox jumps over the lazy dog";
my $b = "to be, or not to be; sorry, what was the question again?";
for my $i (1..10_000_000) {
my $c = "a=$a, b=$b\n";
}
runs 2.7 times faster, and if you throw utf8 mixtures in it gets even
better. This loop runs 4 times faster:
my $s;
my $a = "ab\x{100}cde";
my $b = "fghij";
my $c = "\x{101}klmn";
for my $i (1..10_000_000) {
$s = "\x{100}wxyz";
$s .= "foo=$a bar=$b baz=$c";
}
The main ways in which OP_MULTICONCAT gains its speed are:
* any OP_CONSTs are eliminated, and the constant bits (already in the
right encoding) are copied directly from the constant string attached to
the op's aux structure.
* It optimises away any SASSIGN op, and possibly a PADSV op on the LHS, in
all cases; OP_CONCAT only did this in very limited circumstances.
* Because it has a holistic view of the entire concatenation expression,
it can do the whole thing in one efficient go, rather than creating and
copying intermediate results. pp_multiconcat() goes to considerable
efforts to avoid inefficiencies. For example it will only SvGROW() the
target once, and to the exact size needed, no matter what mix of utf8
and non-utf8 appear on the LHS and RHS. It never allocates any
temporary SVs except possibly in the case of tie or overloading.
* It does all its own appending and utf8 handling rather than calling
out to functions like sv_catsv().
* It's very good at handling the LHS appearing on the RHS; for example in
$x = "abcd";
$x = "-$x-$x-";
It will do roughly the equivalent of the following (where targ is $x):
SvPV_force(targ);
SvGROW(targ, 11);
p = SvPVX(targ);
Move(p, p+1, 4, char);
Copy("-", p, 1, char);
Copy("-", p+5, 1, char);
Copy(p+1, p+6, 4, char);
Copy("-", p+10, 1, char);
SvCUR(targ) = 11;
p[11] = '\0';
Formerly, pp_concat would have used multiple PADTMPs or temporary SVs to
handle situations like that.
The code is quite big; both S_maybe_multiconcat() and pp_multiconcat()
(the main compile-time and runtime parts of the implementation) are over
700 lines each. It turns out that when you combine multiple ops, the
number of edge cases grows exponentially ;-)
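As a small supplementary illustration (not from the commit) of the
sprintf eligibility rule described above:
    my ($a, $b) = ('foo', 'bar');
    my $s = sprintf "a=%s b=%s\n", $a, $b;   # fixed number of scalar args: eligible
    my @args = ($a, $b);
    my $t = sprintf "a=%s b=%s\n", @args;    # array argument: not optimised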
| |
An sprintf entry in t/perf/benchmarks was missing two %s's due to an
earlier cut+paste error. Also, it was being constant folded, so use vars
rather than literals for the arguments.
| |
desc and setup are now optional; pre, post and compile have been added.
| |
If a benchmark has this flag set, measure the compile time of the
construct rather than its execution time, by wrapping the code in
eval q{ sub { ... } }
| |
These allow actions to be performed each time round the loop, just before
and after the benchmarked code, but without contributing to the timings.
For example to benchmark appending to a string, you need to reset the
string to a known state before each iteration, otherwise the string gets
bigger and bigger with each iteration:
code => '$s = ""; $s .= "foo"',
but now you're measuring both the concat and an assign. To measure just
the concat, you can now do:
pre => '$s = ""',
code => '$s .= "foo"',
Note the contrast with 'setup', which is only executed once, outside the
loop.
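Put together, an entry using both might look something like this (the
key name is illustrative):
    'expr::concat::append' => {
        desc  => 'append a constant string to a lexical',
        setup => 'my $s',        # run once, before the benchmark loop
        pre   => '$s = ""',      # run (untimed) before each iteration
        code  => '$s .= "foo"',  # the code actually being timed
    },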
| |
Any entries in the benchmarks file which don't have a 'desc' description
field will have the description set to the string for 'code'.
| |
RT #131851
It was incorrectly optimising some permutations of comparison op and 0/-1
which shouldn't have been, such as
0 < index(...);
| |
fix typo from my recent commit. Spotted by Jarkko.
| |
Recently I made it so that in expression like index(...) == -1, the
const and eq ops are optimised away and a BOOL flag is set on the index
op.
This commit expands this to various permutations of relational ops too,
such as
index(...) >= 0
index(...) < 0
index(...) <= -1
| |
move and rename
expr::hash::bool_empty_keys
expr::hash::bool_full_keys
to
func::keys::lex::bool_cxt_empty
func::keys::lex::bool_cxt
and add
func::keys::pkg::bool_cxt_empty
func::keys::pkg::bool_cxt
since it's really testing the keys() function in boolean context rather
than a hash in boolean context.
| |
A recent commit in this branch made OP_PADHV / OP_RV2HV in void/scalar
context, followed by OP_KEYS, optimise away the OP_KEYS op and set the
OPpPADHV_ISKEYS or OPpRV2HV_ISKEYS flag on the OP_PADHV / OP_RV2HV op.
However, in scalar but non-boolean context with OP_PADHV, this actually
makes it slower, because the OP_KEYS op has a target, while the OP_PADHV
op doesn't, thus it has to create a new mortal each time to return the
integer value.
This commit fixes that by, in the case of scalar padhv, retaining the
OP_KEYS node (although still not keeping it in the execution path), then
at runtime using that op's otherwise unused target.
This only works on PERL_OP_PARENT builds (now the default) as the OP_KEYS
is the parent of the OP_PADHV, so would be hard to find at runtime
otherwise.
This commit also fixes pp_padhv/pp_rv2hv in void context - formerly they
were needlessly pushing a scalar-valued count, as in scalar context.
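As a rough sketch of the two scalar contexts discussed above (variable
names are illustrative):
    my %h = (a => 1, b => 2);
    my $n = keys %h;               # scalar, non-boolean: an integer SV is needed
    print "has keys\n" if keys %h; # boolean: a plain true/false value suffices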
| |
The newish function hv_pushkv() currently just pushes all key/value pairs on
the stack, i.e. it does the equivalent of the perl code '() = %h'.
Extend it so that it can handle 'keys %h' and 'values %h' too.
This is basically moving the remaining list-context functionality out of
do_kv() and into hv_pushkv().
The rationale for this is that hv_pushkv() is a pure HV-related function,
while do_kv() is a pp function for several ops including OP_KEYS/VALUES,
and expects PL_op->op_flags/op_private to be valid.
| |
...and make pp_padhv(), pp_rv2hv() use it rather than using Perl_do_kv().
Both pp_padhv() and pp_rv2hv() (via S_padhv_rv2hv_common()) outsource the
list-context pushing/flattening of a hash onto the stack to Perl_do_kv().
Perl_do_kv() is a big function that handles all the actions of
keys, values etc. Instead, create a new function which does just the
pushing of a hash onto the stack.
At the same time, split it out into two loops, one for tied, one for
normal: the untied one can skip extending the stack on each iteration,
and use a cheaper HeVAL() instead of calling hv_iterval().
| |
Unusually, index() and rindex() return -1 on failure.
So it's reasonably common to see code like
if (index(...) != -1) { ... }
and variants.
For such code, this commit optimises away the OP_EQ and OP_CONST ops,
and sets a couple of private flags on the index op instead, indicating:
OPpTRUEBOOL return a boolean which is a comparison of
what the return would have been, against -1
OPpINDEX_BOOLNEG negate the boolean result
It also supports OPpTRUEBOOL in conjunction with the existing
OPpTARGET_MY flag, so for example in
$lexical = (index(...) == -1)
the padmy, sassign, eq and const ops are all optimised away.
| |
Whitespace-only change, plus alphabetically sort the lines of ops being
tested.
| |
For some ops which return integer values and which have a reasonable
likelihood of being used in a boolean context, set the OPpTRUEBOOL
flag on the op as appropriate, and at runtime return &PL_sv_yes /
&PL_sv_zero rather than an integer value.
This is especially beneficial where the op doesn't have a targ, so has
to create a mortal SV to return the integer value.
Similarly, it's a win where it may be expensive to calculate an integer
return value, such as pos() or length() converting between byte and char
offset.
Ops done:
OP_SUBST
OP_AASSIGN
OP_POS
OP_LENGTH
OP_GREPWHILE
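Typical boolean-context uses of most of the ops listed above (an
illustrative sketch only):
    my $str  = "foo bar";
    my @nums = (1, 12, 3);
    print "substituted\n" if $str =~ s/foo/baz/;       # OP_SUBST
    print "non-empty\n"   if length $str;              # OP_LENGTH
    print "big one\n"     if grep { $_ > 9 } @nums;    # OP_GREPWHILE
    my @hits;
    print "matched\n"     if @hits = $str =~ /a/g;     # OP_AASSIGN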
| |
When it fails to find the op it's looking for, dump the optree.
Also, enclose the grep tests in parentheses; otherwise, if the condition
itself contains parentheses, it can be parsed as the whole argument list.
e.g. for the condition ($a==$_), generating
    grep ($a==$_), 1, 2
would be misparsed, so generate
    grep (($a==$_), 1, 2)
instead.
| |
It's quicker to return (and to test for) &PL_sv_zero or &PL_sv_yes,
than setting a targ to an integer value or, in the case of padav,
creating a mortal sv and setting it to an integer value.
In fact for padav, even in the scalar but non-boolean case, return
&PL_sv_zero if the value is zero rather than creating and setting a mortal.
| |
In something like
if (keys %h) { ... }
the 'keys %h' is implemented as the op sequences
gv[*h] s
rv2hv lKRM/1
keys[t2] sK/1
or
padhv[%h:1,6] lRM
keys[t2] sK/1
It turns out that (%h) in scalar and void context now behaves very
similarly to (keys %h) (except that keys %h resets the iterator), so in these
cases, convert the two ops
rv2hv/padhv, keys
into the single op
rv2hv/padhv
with a private flag indicating that the op is handling the 'keys' action
by itself.
As well as one less op to execute, this brings the boolean-context
optimisation already present in padhv/rv2sv to keys. So
if (keys %h) { ... }
is no longer slower than
if (%h) { ... }
| |
Add a few not (!) expressions which exercise SvTRUE() for various types
of operand.
| |
Re-instate the special-casing, which was removed by v5.25.8-172-gb243b19,
of OP_AND in boolean-context determination.
This is because the special-case allowed things to be more efficient
sometimes, but required returning a false value as sv_2mortal(newSViv(0))
rather than &PL_sv_no. Now that PL_sv_zero has been added we can use
that instead, cheaply.
This commit adds an extra arg to S_check_for_bool_cxt() to indicate
whether the op supports the special-casing of OP_AND.
| |
RT #78288
When ref() is used in a boolean context, it's not necessary to return
the name of the package which an object is blessed into; instead a simple
truth value can be returned, which is faster.
Note that it has to cope with the subtlety of an object blessed into the
class "0", which should return false.
Porting/bench.pl shows for the expression !ref($r), approximately:
unchanged for a non-reference $r
doubling of speed for a reference $r
tripling of speed for a blessed reference $r
This commit builds on the mechanism already used to set the OPpTRUEBOOL
and OPpMAYBE_TRUEBOOL flags on padhv and rv2hv ops when used in boolean
context.
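A sketch of the class-"0" subtlety mentioned above (class names are
illustrative):
    my $obj  = bless {}, 'My::Class';
    my $zero = bless {}, '0';                  # legal, if unusual, class name
    print ref($obj)  ? "true\n" : "false\n";   # true
    print ref($zero) ? "true\n" : "false\n";   # false: ref() returns the string "0"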
| |
In something like
"a1b2c3d4..." =~ /(?:(\w)(\d))*..../
A WHILEM state is pushed for each iteration of the '*'. Part of this
state saving includes the previous indices for each of the captures within
the body of the thing being iterated over. So we save the following sets of
values for $1,$2:
()()
(a)(1)
(b)(2)
(c)(3)
(d)(4)
Then if at any point we backtrack, we can undo one or more iterations and
restore the older values of $1,$2.
However, when the match is non-greedy, as in A*?B, then on failure of B
and backtracking we attempt *more* A's rather than removing some already
matched A's. So there's never any need to save all the current paren state
for each iteration.
This eliminates a lot of per-iteration overhead for minimal WHILEMs and
makes the following run about 25% faster:
$s = ("a" x 1000);
$s =~ /^(?:(.)(.))*?[XY]/ for 1..10_000;