| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
These macros are marked as subject to change and are not documented
externally. I don't know what I was thinking when I named some of them,
but whatever no longer makes sense to me. Simplify them, and change so
there is only one restore macro to remember.
|
|
|
|
|
|
| |
previously it just displayed its address.
Also, when the table is in fact a swash, don't display its address
on threaded builds, as it's actually just a padix.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pumpking has determined that the CPAN breakage caused by changing
smartmatch [perl #132594] is too great for the smartmatch changes to
stay in for 5.28.
This reverts most of the merge in commit
da4e040f42421764ef069371d77c008e6b801f45. All core behaviour and
documentation is reverted. The removal of use of smartmatch from a couple
of tests (that aren't testing smartmatch) remains. Customisation of
a couple of CPAN modules to make them portable across smartmatch types
remains. A small bugfix in scope.c also remains.
|
|
|
|
|
| |
The names of ops, context types, functions, etc., all change in accordance
with the change of keyword.
|
|
|
|
|
| |
This will support the upcoming change to let loop control ops apply to
"given" blocks.
|
|
|
|
|
|
|
| |
Change it from unsigned to signed since it makes the SP-adjusting code
in pp_multiconcat easier without hitting undefined behaviour (RT #132390);
and change its size from UV to SSize_t since it represents the number
of args on the stack.
|
|
|
|
|
|
|
| |
This part of the op_aux union was added for OP_MULTICONCAT; it's actually
of type SSize_t, so rename it to ssize to better reflect that it's signed.
This should make no functional difference.
|
| |
Allow multiple OP_CONCAT, OP_CONST ops, plus optionally an OP_SASSIGN
or OP_STRINGIFY, to be combined into a single OP_MULTICONCAT op, which can
make things a *lot* faster: 4x or more.
In more detail: it will optimise into a single OP_MULTICONCAT, most
expressions of the form
LHS RHS
where LHS is one of
(empty)
my $lexical =
$lexical =
$lexical .=
expression =
expression .=
and RHS is one of
(A . B . C . ...) where A,B,C etc are expressions and/or
string constants
"aAbBc..." where a,A,b,B etc are expressions and/or
string constants
sprintf "..%s..%s..", A,B,.. where the format is a constant string
containing only '%s' and '%%' elements,
and A,B, etc are scalar expressions (so
only a fixed, compile-time-known number of
args: no arrays or list context function
calls etc)
It doesn't optimise other forms, such as
($a . $b) . ($c. $d)
((($a .= $b) .= $c) .= $d);
(although sub-parts of those expressions might be converted to an
OP_MULTICONCAT). This is partly because it would be hard to maintain the
correct ordering of tie or overload calls.
The compiler uses heuristics to determine when to convert: in general,
expressions involving a single OP_CONCAT aren't converted, unless some
other saving can be made, for example if an OP_CONST can be eliminated, or
in the presence of 'my $x = .. ' which OP_MULTICONCAT can apply
OPpTARGET_MY to, but OP_CONST can't.
The multiconcat op is of type UNOP_AUX, with the op_aux structure directly
holding a pointer to a single constant char* string plus a list of segment
lengths. So for
"a=$a b=$b\n";
the constant string is "a= b=\n", and the segment lengths are (2,3,1).
If the constant string has different non-utf8 and utf8 representations
(such as "\x80") then both variants are pre-computed and stored in the aux
struct, along with two sets of segment lengths.
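The constant-string-plus-segment-lengths layout described above can be sketched in standalone C. This is only an illustration under simplifying assumptions: join_segments() and its parameters are hypothetical names, not perl's actual code, and utf8, magic and overloading are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical illustration, not perl's actual code: interleave the
 * segments of one constant string with nargs runtime arguments.
 * seg_lens has nargs + 1 entries; segment i precedes argument i, and
 * the last segment follows the final argument.
 * Returns the number of bytes written, excluding the trailing NUL. */
static size_t
join_segments(char *out, size_t out_size, const char *consts,
              const size_t *seg_lens, const char *const *args, size_t nargs)
{
    size_t total = 0;
    size_t i;
    char *p = out;

    /* First pass: work out the exact size needed, once. */
    for (i = 0; i <= nargs; i++)
        total += seg_lens[i];
    for (i = 0; i < nargs; i++)
        total += strlen(args[i]);
    assert(total < out_size);   /* leave room for the trailing NUL */

    /* Second pass: copy constant segments and arguments alternately. */
    for (i = 0; i <= nargs; i++) {
        memcpy(p, consts, seg_lens[i]);
        p += seg_lens[i];
        consts += seg_lens[i];
        if (i < nargs) {
            size_t len = strlen(args[i]);
            memcpy(p, args[i], len);
            p += len;
        }
    }
    *p = '\0';
    return total;
}
```

Called with the constant string "a= b=\n", segment lengths (2,3,1) and two argument strings, it fills the output buffer with the same text the interpolated string would produce.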
For all the above LHS types, any SASSIGN op is optimised away. For a LHS
of '$lex=', '$lex.=' or 'my $lex=', the PADSV is optimised away too.
For example where $a and $b are lexical vars, this statement:
my $c = "a=$a, b=$b\n";
formerly compiled to
const[PV "a="] s
padsv[$a:1,3] s
concat[t4] sK/2
const[PV ", b="] s
concat[t5] sKS/2
padsv[$b:1,3] s
concat[t6] sKS/2
const[PV "\n"] s
concat[t7] sKS/2
padsv[$c:2,3] sRM*/LVINTRO
sassign vKS/2
and now compiles to:
padsv[$a:1,3] s
padsv[$b:1,3] s
multiconcat("a=, b=\n",2,4,1)[$c:2,3] vK/LVINTRO,TARGMY,STRINGIFY
In terms of how much faster it is, this code:
my $a = "the quick brown fox jumps over the lazy dog";
my $b = "to be, or not to be; sorry, what was the question again?";
for my $i (1..10_000_000) {
my $c = "a=$a, b=$b\n";
}
runs 2.7 times faster, and if you throw utf8 mixtures in it gets even
better. This loop runs 4 times faster:
my $s;
my $a = "ab\x{100}cde";
my $b = "fghij";
my $c = "\x{101}klmn";
for my $i (1..10_000_000) {
$s = "\x{100}wxyz";
$s .= "foo=$a bar=$b baz=$c";
}
The main ways in which OP_MULTICONCAT gains its speed are:
* any OP_CONSTs are eliminated, and the constant bits (already in the
right encoding) are copied directly from the constant string attached to
the op's aux structure.
* It optimises away any SASSIGN op, and possibly a PADSV op on the LHS, in
all cases; OP_CONCAT only did this in very limited circumstances.
* Because it has a holistic view of the entire concatenation expression,
it can do the whole thing in one efficient go, rather than creating and
copying intermediate results. pp_multiconcat() goes to considerable
efforts to avoid inefficiencies. For example it will only SvGROW() the
target once, and to the exact size needed, no matter what mix of utf8
and non-utf8 appear on the LHS and RHS. It never allocates any
temporary SVs except possibly in the case of tie or overloading.
* It does all its own appending and utf8 handling rather than calling
out to functions like sv_catsv().
* It's very good at handling the LHS appearing on the RHS; for example in
$x = "abcd";
$x = "-$x-$x-";
It will do roughly the equivalent of the following (where targ is $x):
SvPV_force(targ);
SvGROW(targ, 11);
p = SvPVX(targ);
Move(p, p+1, 4, char);
Copy("-", p, 1, char);
Copy("-", p+5, 1, char);
Copy(p+1, p+6, 4, char);
Copy("-", p+10, 1, char);
SvCUR(targ) = 11;
p[11] = '\0';
Formerly, pp_concat would have used multiple PADTMPs or temporary SVs to
handle situations like that.
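The Move()/Copy() sequence above has a direct plain-C analogue using memmove() and memcpy() on an ordinary char buffer. The sketch below assumes a hypothetical helper, dash_wrap_in_place(), and a caller-supplied buffer in place of an SV; it is not taken from pp_multiconcat().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C analogue of $x = "-$x-$x-" done in place.
 * buf holds a string of length len; cap is the buffer's capacity.
 * Returns the new length. */
static size_t
dash_wrap_in_place(char *buf, size_t len, size_t cap)
{
    size_t new_len = 2 * len + 3;

    assert(new_len + 1 <= cap);            /* room for the result and NUL */
    memmove(buf + 1, buf, len);            /* overlapping shift right by one */
    buf[0] = '-';
    buf[len + 1] = '-';
    memcpy(buf + len + 2, buf + 1, len);   /* second copy; no overlap */
    buf[2 * len + 2] = '-';
    buf[new_len] = '\0';
    return new_len;
}
```

No temporary copy of the original string is needed: the only overlapping copy is the initial one-byte shift, which is why that step uses memmove() rather than memcpy().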
The code is quite big; both S_maybe_multiconcat() and pp_multiconcat()
(the main compile-time and runtime parts of the implementation) are over
700 lines each. It turns out that when you combine multiple ops, the
number of edge cases grows exponentially ;-)
|
| |
|
|
|
|
|
|
|
|
|
| |
RT #131912
The (1 << i) is harmless for large i, but triggers an 'undefined-behavior'
error in clang.
So work around it.
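One common way to avoid this class of warning is to guard the shift count and shift an unsigned value. The sketch below shows that general technique only; bit_is_set() is a hypothetical helper and is not necessarily the workaround this commit used.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: test bit i of flags without ever shifting by
 * the full width of the type or more, which C leaves undefined. */
static int
bit_is_set(unsigned int flags, unsigned int i)
{
    unsigned int width = (unsigned int)(sizeof(flags) * CHAR_BIT);

    assert(width > 0);
    if (i >= width)
        return 0;                       /* out-of-range bits count as unset */
    return (flags & (1U << i)) != 0;    /* 1U: avoid signed overflow at the top bit */
}
```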
|
|
|
|
|
|
|
|
| |
When the len field of a REGEXP isn't usurped, display it (it used to
always be skipped for REGEXPs).
When it's usurped by a PVLV to point to a 'struct regexp', display it as
a pointer.
|
|
|
|
|
|
|
|
|
|
| |
it's like PL_sv_no, except that its string value is "0" rather than "".
It can be used for example where a pp function wants to push a zero return
value on the stack. The next commit will start to use it.
Also update SvIMMORTAL() to be more efficient: it now checks whether
the SV's address is in a range rather than individually checking against
&PL_sv_undef, &PL_sv_no etc.
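The range-check idea can be illustrated in isolation. The sketch assumes the special values are laid out contiguously in one array; the toy_sv type, the array and the function name are hypothetical, and the real macro may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an SV; the real struct is much richer. */
typedef struct { int dummy; } toy_sv;

/* Keep the immortal values adjacent in one array so that a single
 * address-range comparison covers all of them. */
enum { TOY_YES, TOY_UNDEF, TOY_NO, TOY_ZERO, TOY_IMMORTAL_COUNT };
static toy_sv toy_immortals[TOY_IMMORTAL_COUNT];

static int
toy_is_immortal(const toy_sv *sv)
{
    uintptr_t p, lo, hi;

    assert(sv != NULL);
    /* Compare as integers: relational operators on unrelated pointers
     * are not well defined in C. */
    p  = (uintptr_t)sv;
    lo = (uintptr_t)&toy_immortals[0];
    hi = (uintptr_t)&toy_immortals[TOY_IMMORTAL_COUNT];
    return p >= lo && p < hi;
}
```

The cost is two comparisons regardless of how many immortal values exist, whereas comparing against each address individually grows with every value added.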
|
| |
RT #131732
With v5.27.1-66-g87058c3, I introduced a DEBUGGING-only mechanism in the
runops loop for checking whether an op extended the stack by as many slots
as values it returned on the stack. It did this by setting a
high-water-mark just before calling each pp function, and checking its
result on return.
It saved and restored the old value of PL_curstackinfo->si_stack_hwm
whenever it entered or left a runops loop or did a JMPENV_PUSH /
JMPENV_POP. However, the restoring could restore to an old value that was
smaller than the current value, leading to false-positive stack-extend
panics. So only restore if the old value was larger.
In particular this was causing false positives in DBI.
|
| |
| |
On debugging builds only, add a mechanism for checking pp function calls
for insufficient stack extending. It works by:
* make the runops loop set a high-water-mark (HWM) variable equal to
PL_stack_sp just before calling each pp function;
* make EXTEND() etc update this HWM;
* on return from the pp function, panic if PL_stack_sp is > HWM.
This detects whether pp functions are pushing more items onto the stack
than they are requesting space for.
There's a possibility of false positives if the code is doing weird stuff
like direct manipulation of stacks via PL_curstack, SWITCHSTACK() etc.
It's also possible that one pp function "knows" that a previous pp
function will have already grown the stack enough. Currently the only
place in core that seems to do this is pp_enteriter, which allocates 1
stack slot so that pp_iter doesn't have to check each time it returns
&PL_sv_yes/no. To accommodate this, the new macro EXTEND_SKIP() has been
added, that tells perl that it's safely skipping an EXTEND() here.
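The three steps listed above can be modelled with a toy stack. Everything in this sketch (toy_stack, toy_extend(), toy_run_op() and the two pp functions) is hypothetical and only mirrors the shape of the mechanism; the real check panics rather than returning a status.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not perl's runops loop: all names are hypothetical. */
typedef struct {
    int    slots[16];
    size_t sp;      /* number of slots in use */
    size_t hwm;     /* high-water mark: slots the current op may use */
} toy_stack;

/* Analogue of EXTEND(): request room for n more slots and record it. */
static void
toy_extend(toy_stack *st, size_t n)
{
    assert(st->sp + n <= sizeof st->slots / sizeof st->slots[0]);
    if (st->sp + n > st->hwm)
        st->hwm = st->sp + n;
}

static void
toy_push(toy_stack *st, int v)
{
    st->slots[st->sp++] = v;
}

typedef void (*toy_pp_fn)(toy_stack *);

/* Run one op; returns 0 if it pushed more than it asked for. */
static int
toy_run_op(toy_stack *st, toy_pp_fn fn)
{
    st->hwm = st->sp;           /* set the mark just before the call */
    fn(st);
    return st->sp <= st->hwm;   /* check the mark on return */
}

static void pp_good(toy_stack *st) { toy_extend(st, 1); toy_push(st, 1); }
static void pp_bad(toy_stack *st)  { toy_push(st, 2); }  /* forgot to extend */
```

Running pp_good through toy_run_op() passes the check, while pp_bad is flagged even though the toy array happens to have spare room, which is the point of the debugging-only check.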
|
|
|
|
|
| |
t/porting/libperl.t under -DPERL_GLOBAL_STRUCT_PRIVATE doesn't like
non-const static data structures
|
|
|
|
|
|
|
|
|
|
| |
My recent commit v5.25.9-32-gabd07ec made dump.c display the op_pv
string of OP_NEXT, OP_TRANS etc ops. However, for OP_TRANS/OP_TRANSR,
the string is basically a 256-byte potentially non-null-terminated array.
This was causing a buffer read overrun and garbage to be displayed.
The simple solution is to only display the address but not contents
for a trans op. OP_NEXT etc labels continue to be displayed.
|
|
|
|
|
|
|
| |
Incidentally, it currently works on SV *'s as well because there's an
explicit cast after an SvANY. Let's not rely on that. This commit also
removes a pointless const in a cast. Again. It takes an HV * as argument.
Let's only change that if we have a strong reason to.
|
|
|
|
|
| |
dump_sub() can receive a CV ref where it's expecting a GV. Make it
handle that cleanly. Fixes [perl #129126].
|
|
|
|
|
|
| |
Since the recursive case already handles null pointers, and this function is
specifically aimed at debugging, it seems sensible to handle a null pointer
at the top level too.
|
|
|
|
| |
trivial bit of tidy-up
|
| |
In op_clear(), the ops with labels stored in the op_pv field (OP_NEXT etc)
fall through to the OP_TRANS/OP_TRANSR code, which determines whether to
free op_pv based on the OPpTRANS_FROM_UTF|OPpTRANS_TO_UTF flags, which are
only valid for OP_TRANS/OP_TRANSR. At the moment the fall-through fields
don't use either of those private bits, but in case this changes in
future, only check those flag bits for trans ops.
At the same time, enhance op_dump() to display the OP_PV field of such
ops.
Also, fix a leak I introduced in the recently-added S_gv_display()
function.
|
| |
RT #129285
These days a 'GV' can actually just be a ref to a CV when the only thing
that would be stored in the glob is a CV. Update S_do_op_dump_bar() to
handle this. Formerly it would trigger an assert on a non-threaded build.
In fact, incorporate the fixed logic into a static function,
S_gv_display(), that is shared by both S_do_op_dump_bar() and
Perl_debop(); so both
perl -Dx
and
perl -Dt
get the benefit.
Also for the -Dx case, make it display the raw address of the GV too.
|
|
|
|
|
| |
after previous commit removed an enclosing 'if' block.
Whitespace-only change
|
|
|
|
|
|
|
|
| |
between 5.14 and 5.16 pp_aelemfast changed from using OPf_SPECIAL to
using op type to distinguish between a lexical or glob arg, but op_dump()
hadn't been updated to reflect this. Also, GVSV and GV never used the
OPf_SPECIAL flag, so testing for it with those ops was wrong (but
currently harmless).
|
| |
(whitespace-only change)
Not mentioning any names to protect the guilty, but about 3 years ago some
code was committed to dump.c that had just bizarre indentation; for
example, this
if (foo)
bar
being more like
if (foo)
bar
(and this is nothing to do with tab expansion).
This commit fixes up the most glaring issues.
|
|
|
|
| |
whitespace-only change
|
| |
This is mainly used for low-level debugging these days (higher level stuff
like Concise having since been created), e.g. calling op_dump() from
within a debugger or running with -Dx. Make it display more info, and use
an ASCII-art tree to show the structure.
The main changes are:
* added 'ASCII-art' tree structure;
* it now displays each op's class and address;
* for op_next etc links, it now displays the type and address of the
linked-to op in addition to its sequence number;
* the following ops now have their op_other field displayed, like op_and
etc already do:
andassign argdefelem dor dorassign entergiven entertry enterwhen
once orassign regcomp substcont
* enteriter now has its op_redo etc fields displayed, like enterloop
already does;
Here is a sample before and after of perl -Dx -e'($x+$y) * $z'
Before:
{
1 TYPE = leave ===> NULL
TARG = 1
FLAGS = (VOID,KIDS,PARENS,SLABBED)
PRIVATE = (REFC)
REFCNT = 1
{
2 TYPE = enter ===> 3
FLAGS = (UNKNOWN,SLABBED,MORESIB)
}
{
3 TYPE = nextstate ===> 4
FLAGS = (VOID,SLABBED,MORESIB)
LINE = 1
PACKAGE = "main"
SEQ = 4294967246
}
{
5 TYPE = multiply ===> 1
TARG = 5
FLAGS = (VOID,KIDS,SLABBED)
PRIVATE = (0x2)
{
6 TYPE = add ===> 7
TARG = 3
FLAGS = (SCALAR,KIDS,PARENS,SLABBED,MORESIB)
PRIVATE = (0x2)
{
8 TYPE = null ===> (9)
(was rv2sv)
FLAGS = (SCALAR,KIDS,SLABBED,MORESIB)
PRIVATE = (0x1)
{
4 TYPE = gvsv ===> 9
FLAGS = (SCALAR,SLABBED)
PADIX = 1
}
}
{
10 TYPE = null ===> (6)
(was rv2sv)
FLAGS = (SCALAR,KIDS,SLABBED)
PRIVATE = (0x1)
{
9 TYPE = gvsv ===> 6
FLAGS = (SCALAR,SLABBED)
PADIX = 2
}
}
}
{
11 TYPE = null ===> (5)
(was rv2sv)
FLAGS = (SCALAR,KIDS,SLABBED)
PRIVATE = (0x1)
{
7 TYPE = gvsv ===> 5
FLAGS = (SCALAR,SLABBED)
PADIX = 4
}
}
}
}
After:
1 leave LISTOP(0xdecb38) ===> [0x0]
TARG = 1
FLAGS = (VOID,KIDS,PARENS,SLABBED)
PRIVATE = (REFC)
REFCNT = 1
|
2 +--enter OP(0xdecb00) ===> 3 [nextstate 0xdecb80]
| FLAGS = (UNKNOWN,SLABBED,MORESIB)
|
3 +--nextstate COP(0xdecb80) ===> 4 [gvsv 0xdeb3b8]
| FLAGS = (VOID,SLABBED,MORESIB)
| LINE = 1
| PACKAGE = "main"
| SEQ = 4294967246
|
5 +--multiply BINOP(0xdecbe0) ===> 1 [leave 0xdecb38]
TARG = 5
FLAGS = (VOID,KIDS,SLABBED)
PRIVATE = (0x2)
|
6 +--add BINOP(0xdeb2b0) ===> 7 [gvsv 0xdeb270]
| TARG = 3
| FLAGS = (SCALAR,KIDS,PARENS,SLABBED,MORESIB)
| PRIVATE = (0x2)
| |
8 | +--null (ex-rv2sv) UNOP(0xdeb378) ===> 9 [gvsv 0xdeb338]
| | FLAGS = (SCALAR,KIDS,SLABBED,MORESIB)
| | PRIVATE = (0x1)
| | |
4 | | +--gvsv PADOP(0xdeb3b8) ===> 9 [gvsv 0xdeb338]
| | FLAGS = (SCALAR,SLABBED)
| | PADIX = 1
| |
10 | +--null (ex-rv2sv) UNOP(0xdeb2f8) ===> 6 [add 0xdeb2b0]
| FLAGS = (SCALAR,KIDS,SLABBED)
| PRIVATE = (0x1)
| |
9 | +--gvsv PADOP(0xdeb338) ===> 6 [add 0xdeb2b0]
| FLAGS = (SCALAR,SLABBED)
| PADIX = 2
|
11 +--null (ex-rv2sv) UNOP(0xdeb220) ===> 5 [multiply 0xdecbe0]
FLAGS = (SCALAR,KIDS,SLABBED)
PRIVATE = (0x1)
|
7 +--gvsv PADOP(0xdeb270) ===> 5 [multiply 0xdecbe0]
FLAGS = (SCALAR,SLABBED)
PADIX = 4
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given an op, this function determines what type of struct it has been
allocated as. Returns one of the OPclass enums, such as OPclass_LISTOP.
Originally this was a static function in B.xs, but it has wider
applicability; indeed several XS modules on CPAN have cut and pasted it.
It adds the OPclass enum to op.h. In B.xs there was a similar enum, but
with names like OPc_LISTOP. I've renamed them to OPclass_LISTOP etc. so as
not to clash with the cut+paste code already on CPAN.
|
| |
C++11 requires space between the end of a string literal and a macro, so
that a feature can unambiguously be added to the language. Starting in
g++ 6.2, the compiler emits a warning when there isn't a space
(presumably so that future versions can support C++11). Unfortunately
there are many such instances in the perl core. This commit fixes
those, including those in ext/, but individual commits will be used for
the other modules, those in dist/ and cpan/.
This commit also inserts space at the end of a macro before a string
literal, even though that is not deprecated, and removes useless ""
literals following a macro (instead of inserting a blank). The result
is easier to read, making the macro stand out, and be clearer as to the
intention.
Code and modules included with the Perl core need to be compilable using
C++. This is so that perl can be embedded in C++ programs. (Actually,
only the hdr files need to be so compilable, but it would be hard to
test that just the hdrs are compilable.) So we need to accommodate
changes to the C++ language.
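The spacing rule is easy to see with a standard format macro. The sketch uses PRIu64 from <inttypes.h> as a stand-in for perl's own format macros, and format_u64() is a hypothetical helper.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Format a 64-bit value. Written as "%"PRIu64 (no space), C++11 would
 * try to parse PRIu64 as a user-defined literal suffix; the spaces
 * around the macro keep the code valid as both C and C++. */
static int
format_u64(char *buf, size_t size, uint64_t v)
{
    int n = snprintf(buf, size, "value=%" PRIu64 "\n", v);

    assert(n >= 0);
    return n;
}
```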
|
|
|
|
|
|
| |
This flag was added in 5.004 and even then it didn't seem to be used for
anything. It gets set and unset in various places, but is never tested.
I'm not even sure what it was intended for.
|
|
|
|
|
|
|
|
| |
This flag is set on an SV to indicate that it has PERL_MAGIC_bm
(fast Boyer-Moore) magic attached. Instead just directly check whether
it has such magic.
This frees up the 0x40000000 bit for anything except AVs and HVs
|
|
|
|
|
|
| |
Only use the SvTAIL() macro when we've already confirmed that
the SV is SvVALID() - this is in preparation for removing the
SVpbm_TAIL flag in the next commit
|
|
|
|
|
|
|
|
|
|
|
| |
This flag is only used to indicate that the SV holding the text of the
replacement part of a s/// has seen at least one /e.
Instead, set the IVX field in the SV to a true value.
(We already set the NVX field on that SV to indicate a multi-src-line
substitution).
This is to reduce the number of odd special cases for the SVpbm_VALID flag.
|
|
| |
When dumping a PMOP, it displays the PMOP-specific fields with
an extra set of braces and level of indentation, e.g.
{
TYPE = match ===> 1
FLAGS = (VOID,SLABBED)
PRIVATE = (RTIME)
{
PMf_PRE /abc/ (RUNTIME)
PMFLAGS = (SCANFIRST,ALL)
}
}
This is visually confusing, because child ops are shown in the same way.
This commit removes the extra indentation:
{
TYPE = match ===> 1
FLAGS = (VOID,SLABBED)
PRIVATE = (RTIME)
PMf_PRE /abc/ (RUNTIME)
PMFLAGS = (SCANFIRST,ALL)
}
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally xav_arylen was an AV field and was displayed by sv_dump.
In 2005, this field was removed, and replaced by PERL_MAGIC_arylen_p
magic when needed.
A side effect of this is that sv_dump on a magical AV adds
PERL_MAGIC_arylen_p magic to the AV, which is undesirable.
This commit just omits displaying 'ARYLEN =' altogether. Any arylen magic
will already be displayed as part of dumping the AV, so it's redundant.
|
| |
with the -R debugging flag, SVs are displayed with a reference count
(if > 1), and with a T if the SV is referenced from the temps stack.
E.g.
$ perl -DstR -e'@a = map $_,"a", "b"'
...
* <T>PV("a"\0) <T>PV("b"\0)
This commit enhances this to use both "t" and "T":
t: SV is referenced from PL_tmps_stack, but SvTEMP() not set
T: SV is referenced from PL_tmps_stack, and in addition, SvTEMP() is set
(The other permutation, SvTEMP() set but not in PL_tmps_stack, is
illegal).
This commit changes
|
| |
| |
Most ops that execute a regex, such as match and subst, are of type PMOP.
A PMOP allows the actual regex to be attached directly to that op, due
to its extra fields.
OP_SPLIT is different; it is just a plain LISTOP, but it always has an
OP_PUSHRE as its first child, which *is* a PMOP and which has the regex
attached.
At runtime, pp_pushre()'s only job is to push itself (i.e. the current
PL_op) onto the stack. Later pp_split() pops this to get access to the
regex it wants to execute.
This is a bit unpleasant, because we're pushing an OP* onto the stack,
which is supposed to be an array of SV*'s. As a bit of a hack, on
DEBUGGING builds we push a PVLV with the PL_op address embedded instead,
but this still isn't very satisfactory.
Now that regexes are first-class SVs, we could push a REGEXP onto the
stack rather than PL_op. However, there is an optimisation of @array =
split which eliminates the assign and embeds the array's GV/padix directly
in the PUSHRE op. So split still needs access to that op. But the pushre
op will always be splitop->op_first anyway, so one possibility is to just
skip executing the pushre altogether, and make pp_split just directly
access op_first instead to get the regex and @array info.
But if we're doing that, then why not just go the full hog and make
OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely: with the
data that was spread across the two ops now combined into just the one
split op.
That is exactly what this commit does.
For a simple compile-time pattern like split(/foo/, $s, 1), the optree
looks like:
before:
<@> split[t2] lK
</> pushre(/"foo"/) s/RTIME
<0> padsv[$s:1,2] s
<$> const(IV 1) s
after:
</> split(/"foo"/)[t2] lK/RTIME
<0> padsv[$s:1,2] s
<$> const[IV 1] s
while for a run-time expression like split(/$pat/, $s, 1),
before:
<@> split[t3] lK
</> pushre() sK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const(IV 1) s
after:
</> split()[t3] lK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const[IV 1] s
This makes the code faster and simpler.
At the same time, two new private flags have been added for OP_SPLIT -
OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
assign op has been optimised away, and if so, whether the array is
lexical.
Also, deparsing of split has been improved, to the extent that
perl TEST -deparse op/split.t
now passes.
Also, a couple of panic messages in pp_split() have been replaced with
asserts().
|
|
|
|
|
| |
If a CV is CvSLABBED(), then CvSTART() points to the op slab rather than a
start op. Make Perl_do_sv_dump() display this more informatively.
|
|
|
|
|
|
|
|
|
| |
Perl_do_sv_dump() (as used by Devel::Peek) dumped a logical AV - i.e.
if it was tied, it called tie methods to get its size and to get its
elements.
Instead, dump the physical fields in the AV - e.g. a tied AV will likely
have a FILL of -1 and no elements.
|
| |
Currently subroutine signature parsing emits many small discrete ops
to implement arg handling. This commit replaces them with a couple of ops
per signature element, plus an initial signature check op.
These new ops are added to the OP tree during parsing, so will be visible
to hooks called up to and including peephole optimisation. It is intended
soon that the peephole optimiser will take these per-element ops, and
replace them with a single OP_SIGNATURE op which handles the whole
signature in a single go. So normally these ops won't actually get executed
much. But adding these intermediate-level ops gives three advantages:
1) it allows the parser to efficiently generate subtrees containing
individual signature elements, which can't be done if only OP_SIGNATURE
or discrete ops are available;
2) prior to optimisation, it provides a simple and straightforward
representation of the signature;
3) hooks can mess with the signature OP subtree in ways that make it
no longer possible to optimise into an OP_SIGNATURE, but which can
still be executed, deparsed etc (if less efficiently).
This code:
use feature "signatures";
sub f($a, $, $b = 1, @c) {$a}
under 'perl -MO=Concise,f' now gives:
d <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->d
1 <;> nextstate(main 84 foo:6) v:%,469762048 ->2
2 <+> argcheck(3,1,@) v ->3
3 <;> nextstate(main 81 foo:6) v:%,469762048 ->4
4 <+> argelem(0)[$a:81,84] v/SV ->5
5 <;> nextstate(main 82 foo:6) v:%,469762048 ->6
8 <+> argelem(2)[$b:82,84] vKS/SV ->9
6 <|> argdefelem(other->7)[2] sK ->8
7 <$> const(IV 1) s ->8
9 <;> nextstate(main 83 foo:6) v:%,469762048 ->a
a <+> argelem(3)[@c:83,84] v/AV ->b
- <;> ex-nextstate(main 84 foo:6) v:%,469762048 ->b
b <;> nextstate(main 84 foo:6) v:%,469762048 ->c
c <0> padsv[$a:81,84] s ->d
The argcheck(3,1,@) op knows the number of positional params (3), the
number of optional params (1), and whether it has an array / hash slurpy
element at the end. This op is responsible for checking that @_ contains
the right number of args.
A simple argelem(0)[$a] op does the equivalent of 'my $a = $_[0]'.
Similarly, argelem(3)[@c] is equivalent to 'my @c = @_[3..$#_]'.
If it has a child, it gets its arg from the stack rather than using $_[N].
Currently the only used child is the logop argdefelem.
argdefelem(other->7)[2] is equivalent to '@_ > 2 ? $_[2] : other'.
[ These ops currently assume that the lexical var being introduced
is undef/empty and non-magical etc. This is an incorrect assumption and
is fixed in a few commits' time ]
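The arity check and default-value selection described for argcheck and argdefelem can be modelled in a few lines of C. This is a rough sketch with hypothetical names (toy_argcheck(), toy_argdefelem()); it uses ints in place of SVs and leaves out anything else the real ops may check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of argcheck(params, opt_params, slurpy):
 * returns 1 if nargs is an acceptable argument count. */
static int
toy_argcheck(size_t nargs, size_t params, size_t opt_params, int has_slurpy)
{
    assert(opt_params <= params);
    if (nargs < params - opt_params)
        return 0;                       /* too few arguments */
    if (!has_slurpy && nargs > params)
        return 0;                       /* too many arguments */
    return 1;
}

/* Hypothetical model of argdefelem: '@_ > ix ? $_[ix] : default'. */
static int
toy_argdefelem(const int *argv, size_t nargs, size_t ix, int dflt)
{
    return nargs > ix ? argv[ix] : dflt;
}
```

For argcheck(3,1,@) this accepts any call with at least two arguments, since one of the three positional parameters is optional and the slurpy array absorbs any extras.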
|
| |
This subject has a long history; see [perl #114576] for more discussion.
https://rt.perl.org/Public/Bug/Display.html?id=114576
There are a variety of reasons we want to change the return signature of
scalar(%hash). One is that it leaks implementation details about our
associative array structure. Another is that it requires us to keep track
of the used buckets in the hash, which we use for no other purpose but
for scalar(%hash). Another is that it is just odd. Almost nothing needs to
know these values. Perhaps debugging, but we have several much better
functions for introspecting the internals of a hash.
By changing the return signature we can remove all the logic related
to maintaining and updating xhv_fill_lazy. This should make hot code
paths a little faster, and maybe save some memory for traversed hashes.
In order to provide some form of backwards compatibility we add three
new functions to the Hash::Util namespace: bucket_ratio(), num_buckets()
and used_buckets(). These functions are actually implemented in
universal.c, and thus always available even if Hash::Util is not loaded.
This simplifies testing. At the same time Hash::Util contains backwards
compatible code so that the new functions are available from it should
they be needed in older perls.
There are many tests in t/op/hash.t that are more or less obsolete after
this patch as they test that xhv_fill_lazy is correctly set in various
situations. However since we have a backwards compat layer we can just
switch them to use bucket_ratio(%hash) instead of scalar(%hash) and keep
the tests, just in case they are actually testing something not tested
elsewhere.
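The trade-off behind dropping xhv_fill_lazy is that the used-bucket count can be computed on demand by the rare caller that wants it, rather than maintained on hot paths. A minimal sketch of the on-demand count follows, assuming a hypothetical chained hash table; it does not reflect perl's HV layout or the actual used_buckets() implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical chained hash table entry; not perl's HV layout. */
typedef struct toy_entry { struct toy_entry *next; } toy_entry;

/* Count occupied buckets by walking the bucket array on demand,
 * instead of maintaining a running "fill" count on every insert. */
static size_t
toy_used_buckets(toy_entry *const *buckets, size_t num_buckets)
{
    size_t used = 0;
    size_t i;

    assert(buckets != NULL || num_buckets == 0);
    for (i = 0; i < num_buckets; i++)
        if (buckets[i] != NULL)
            used++;
    return used;
}
```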
|
|
|
|
| |
They were coming out as ‘(null)’, which is incorrect and confusing.
|
| |
This commit:
1. Renames the various dtrace probe macros into a consistent and
self-documenting pattern, e.g.
ENTRY_PROBE => PERL_DTRACE_PROBE_ENTRY
RETURN_PROBE => PERL_DTRACE_PROBE_RETURN
Since they're supposed to be defined only under PERL_CORE, this shouldn't
break anything that's not being naughty.
2. Implements the main body of these macros using a real function.
They were formerly defined along the lines of
if (PERL_SUB_ENTRY_ENABLED())
PERL_SUB_ENTRY(...);
The PERL_SUB_ENTRY() part is a macro generated by the dtrace system, which
for example on linux expands to a large bunch of assembly directives.
Replace the direct macro with a function wrapper, e.g.
if (PERL_SUB_ENTRY_ENABLED())
Perl_dtrace_probe_call(aTHX_ cv, TRUE);
This reduces the number of times the macro is expanded to one.
The new functions also take simpler args and then process the values they
need using intermediate temporary vars to avoid huge macro expansions.
For example
ENTRY_PROBE(CvNAMED(cv)
? HEK_KEY(CvNAME_HEK(cv))
: GvENAME(CvGV(cv)),
CopFILE((const COP *)CvSTART(cv)),
CopLINE((const COP *)CvSTART(cv)),
CopSTASHPV((const COP *)CvSTART(cv)));
is now
PERL_DTRACE_PROBE_ENTRY(cv);
This reduces the executable size by 1K on -O2 -Dusedtrace builds,
and by 45K on -DDEBUGGING -Dusedtrace builds.
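The shape of the change, a bulky macro expanded once inside a function and a thin macro at each call site, can be sketched as follows. All names (toy_cv, TOY_RAW_PROBE, toy_probe_call(), TOY_PROBE_ENTRY) are hypothetical stand-ins for the generated dtrace macros and perl's structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for perl's CV structure. */
typedef struct { const char *name; const char *file; int line; } toy_cv;

static int toy_probe_enabled = 1;
static int toy_probe_hits = 0;

/* Stand-in for a bulky generated probe macro. */
#define TOY_RAW_PROBE(name, file, line) \
    (toy_probe_hits++, (void)(name), (void)(file), (void)(line))

/* The bulky macro is expanded exactly once, inside this function,
 * which also extracts the values it needs from the simpler argument. */
static void
toy_probe_call(const toy_cv *cv)
{
    assert(cv != NULL);
    TOY_RAW_PROBE(cv->name, cv->file, cv->line);
}

/* Call sites only pay for a test and a function call. */
#define TOY_PROBE_ENTRY(cv) \
    do { if (toy_probe_enabled) toy_probe_call(cv); } while (0)
```

Each use of TOY_PROBE_ENTRY() expands to a flag test plus one call, so the size of the raw probe no longer multiplies with the number of call sites.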
|
| |
|
|
|
|
| |
Coverity CID 135145: Missing break in switch (MISSING_BREAK)
|
|
|
|
|
|
|
|
|
| |
In runops_debug() wrap the optional printing of next op, arg stack
etc with ENTER/SAVETMPS, FREETMPS/LEAVE - so that temporaries created by
the dump output are promptly freed, and thus don't alter the tmps stack.
(I'm trying to debug some tmps stack corruption, and running with -Dst
made the problem go away).
|
| |
|
|
|
|
| |
Removes 'the' in front of parameter names in some instances.
|