| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
This function returns a string representation of the OP_MULTIDEREF op
(as used by the output of perl -Dt).
However, the stringification of a UNOP_AUX op is op-specific, and
hypothetical future UNOP_AUX-class ops will need their own functions.
So the current function name is misleading.
It should be safe to rename it, as it only been in since 5.21.7, and isn't
public.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This flag will prevent () from capturing and filling in $1, $2, etc...
Named captures will still work though, and if used will cause $1, $2, etc...
to be filled in *only* within named groups.
The motivation behind this is to allow the common construct of:
/(?:b|c)a(?:t|n)/
To be rewritten more cleanly as:
/(b|c)a(t|n)/n
When you want grouping but no memory penalty on captures.
You can also use ?n inside of a () directly to avoid capturing, and
?-n inside of a () to negate its effects if you want to capture.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This:
my $z; my @y; $y[$z]
included
<+> multideref($@y[$$z]) sK ->6
in its -MO=Concise output.
|
|
|
|
|
| |
newSVpvn_flags with SVs_TEMP takes less machine code than
sv_2mortal(newSVpv()).
|
| |
|
|
|
|
|
|
|
| |
and remove its sigil arg.
During development of OP_MULTIDEREF this function evolved; the new
name reflects its usage more accurately, and sigil is always '$'.
|
|
|
|
|
|
|
|
| |
my recent OP_MULTIDEREF patch included this bizarre thinko:
PAD *comppad = comppad = ....;
Spotted by Jarkko.
|
|
|
|
|
|
|
|
|
| |
to match the existing convention (OpREFCNT, OpSLAB).
Dave Mitchell asked me to wait until after his multideref work
was merged.
Unfortunately, there are now CPAN modules using OP_SIBLING.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This op is an optimisation for any series of one or more array or hash
lookups and dereferences, where the key/index is a simple constant or
package/lexical variable. If the first-level lookup is of a simple
array/hash variable or scalar ref, then that is included in the op too.
So all of the following are replaced with a single op:
$h{foo}
$a[$i]
$a[5][$k][$i]
$r->{$k}
local $a[0][$i]
exists $a[$i]{$k}
delete $h{foo}
while these aren't:
$a[0] already handled by OP_AELEMFAST
$a[$x+1] not a simple index
and these are partially replaced:
(expr)->[0]{$k} the bit following (expr) is replaced
$h{foo}[$x+1][0] the first and third lookups are each done with
a multideref op, while the $x+1 expression and
middle lookup are done by existing add, aelem etc
ops.
Up until now, aggregate dereferencing has been very heavyweight in ops; for
example, $r->[0]{$x} is compiled as:
gv[*r] s
rv2sv sKM/DREFAV,1
rv2av[t2] sKR/1
const[IV 0] s
aelem sKM/DREFHV,2
rv2hv sKR/1
gvsv[*x] s
helem vK/2
When executing this, in addition to the actual calls to av_fetch() and
hv_fetch(), there is a lot of overhead of pushing SVs on and off the
stack, and calling lots of little pp() functions from the runops loop
(each with its potential indirect branch miss).
The multideref op avoids that by running all the code in a loop in a
switch statement. It makes use of the new UNOP_AUX type to hold an array
of
typedef union {
PADOFFSET pad_offset;
SV *sv;
IV iv;
UV uv;
} UNOP_AUX_item;
In something like $a[7][$i]{foo}, the GVs or pad offsets for @a and $i are
stored as items in the array, along with a pointer to a const SV holding
'foo', and the UV 7 is stored directly. Along with this, some UVs are used
to store a sequence of actions (several actions are squeezed into a single
UV).
Then the main body of pp_multideref is a big while loop round a switch,
which reads actions and values from the AUX array. The two big branches in
the switch are ones that are affectively unrolled (/DREFAV, rv2av, aelem)
and (/DREFHV, rv2hv, helem) triplets. The other branches are various entry
points that handle retrieving the different types of initial value; for
example 'my %h; $h{foo}' needs to get %h from the pad, while '(expr)->{foo}'
needs to pop expr off the stack.
Note that there is a slight complication with /DEREF; in the example above
of $r->[0]{$x}, the aelem op is actually
aelem sKM/DREFHV,2
which means that the aelem, after having retrieved a (possibly undef)
value from the array, is responsible for autovivifying it into a hash,
ready for the next op. Similarly, the rv2sv that retrieves $r from the
typeglob is responsible for autovivifying it into an AV. This action
of doing the next op's work for it complicates matters somewhat. Within
pp_multideref, the autovivification action is instead included as the
first step of the current action.
In terms of benchmarking with Porting/bench.pl, a simple lexical
$a[$i][$j] shows a reduction of approx 40% in numbers of instructions
executed, while $r->[0][0][0] uses 54% fewer. The speed-up for hash
accesses is relatively more modest, since the actual hash lookup (i.e.
hv_fetch()) is more expensive than an array lookup. A lexical $h{foo}
uses 10% fewer, while $r->{foo}{bar}{baz} uses 34% fewer instructions.
Overall,
bench.pl --tests='/expr::(array|hash)/' ...
gives:
PRE POST
------ ------
Ir 100.00 145.00
Dr 100.00 165.30
Dw 100.00 175.74
COND 100.00 132.02
IND 100.00 171.11
COND_m 100.00 127.65
IND_m 100.00 203.90
with cache misses unchanged at 100%.
In general, the more lookups done, the bigger the proportionate saving.
|
|
|
|
| |
factor out the code that prints a pad var name in -Dt output
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$ ./perl -Ilib -MDevel::Peek -Mfeature=:all -Xle 'my sub f; Dump \&f'
SV = IV(0x7fd7138312b0) at 0x7fd7138312c0
REFCNT = 1
FLAGS = (TEMP,ROK)
RV = 0x7fd713831b90
SV = PVCV(0x7fd7138303d0) at 0x7fd713831b90
REFCNT = 2
FLAGS = (CLONED,DYNFILE,NAMED,LEXICAL)
COMP_STASH = 0x7fd713807ce8 "main"
ROOT = 0x0
NAME = "f"
FILE = "-e"
DEPTH = 0
FLAGS = 0x19040
OUTSIDE_SEQ = 187
PADLIST = 0x7fd7134067e8
PADNAME = 0x7fd713411f88(0x7fd713406d48) PAD = 0x7fd713807e68(0x7fd71340d2f8)
OUTSIDE = 0x0 (null)
SV = 0
That final ‘SV = 0’ is very confusing!
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was done by adding new OP_METHOD_REDIR and OP_METHOD_REDIR_SUPER optypes.
Class name to redirect is saved into METHOP as a shared hash string.
Method name is changed (class name removed) an saved into op_meth_sv as
a shared string hash.
So there is no need now to scan for '::' and calculate class and method names
at runtime (in gv_fetchmethod_*) and searching cache HV without precomputed hash.
B::* modules are changed to support new op types.
method_redir is now printed by Concise like (for threaded perl)
$obj->AAA::meth
5 <.> method_redir[PACKAGE "AAA", PV "meth"] ->6
|
|
|
|
|
|
|
|
|
|
|
| |
distinct from SV. This should fix the CPAN modules that were failing
when the PadnameLVALUE flag was added, because it shared the same
bit as SVs_OBJECT and pad names were going through code paths not
designed to handle pad names.
Unfortunately, it will probably break other CPAN modules, but I think
this change is for the better, as it makes both pad names and SVs sim-
pler and makes pad names take less memory.
|
| |
|
|
|
|
| |
This is in preparation for making PADNAME a separate type.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In ck_method:
Scan for '/::. If found SUPER::, create OP_METHOD_SUPER op
with precomputed hash value for method name.
In B::*, added support for method_super
In pp_hot.c, pp_method_*:
S_method_common removed, code related to getting stash is
moved to S_opmethod_stash, other code is moved to
pp_method_* functions.
As a result, SUPER::func() calls speeded up by 50%.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This API elevates the amount of ABI compatibility protection between XS
modules and the interp. It also makes each boot XSUB smaller in machine
code by removing function calls and factoring out code into the new
Perl_xs_handshake and Perl_xs_epilog functions.
sv.c :
- revise padlist duping code to reduce code bloat/asserts on DEBUGGING
ext/DynaLoader/dlutils.c :
- disable version checking so interp startup is faster, ABI mismatches are
impossible because DynaLoader is never available as a shared library
ext/XS-APItest/XSUB-redefined-macros.xs :
- "" means dont check the version, so switch to " " to make the test in
xsub_h.t pass, see ML thread "XS_APIVERSION_BOOTCHECK and XS_VERSION
is CPP defined but "", mow what?"
ext/re/re.xs :
- disable API version checking until #123007 is resolved
ParseXS/Utilities.pm :
109-standard_XS_defs.t :
- remove context from S_croak_xs_usage similar to core commit cb077ed296 .
CvGV doesn't need a context until 5.21.4 and commit ae77754ae2 and
by then core's croak_xs_uage API has been long available and this
backport doesn't need to account for newer perls
- fix test where lack of having PERL_IMPLICIT_CONTEXT caused it to fail
|
|
|
|
|
| |
This also allows ex-dbstate ops to have their cop fields dumped, which
was not the case before.
|
|
|
|
| |
This is very useful for debugging.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CvRESERVED is a placeholder, it will be replaced with a sentinal value
from future revised BOOTCHECK API.
CvPADLIST_set was helpful during development of this patch, so keep it
around for now.
PoisonPADLIST's magic value is from PERL_POISON 0xEF pattern. Some
PoisonPADLIST locations will get code from future BOOTCHECK API.
Make padlist_dup a NN function to avoid overhead of calling it for XSUBs
during closing.
Perl_cv_undef_flags's else if (CvISXSUB(&cvbody)) is to avoid whitespace
changes.
Filed as perl [#123059].
|
| |
|
| |
|
|
|
|
|
|
| |
Sometimes we want things to fit exactly into a specific number
of chars, elipses, quotes and all. Includes make regen update
to make dsv argument nullok.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new opcode class, METHOP, which will hold class/method related
info needed at runtime to improve performance of class/object method
calls, then change OP_METHOD and OP_METHOD_NAMED from being UNOP/SVOP to
being METHOP.
Note that because OP_METHOD is a UNOP with an op_first, while
OP_METHOD_NAMED is an SVOP, the first field of the METHOP structure
is a union holding either op_first or op_sv. This was seen as less messy
than having to introduce two new op classes.
The new op class's character is '.'
Nothing has changed in functionality and/or performance by this commit.
It just introduces new structure which will be extended with extra
fields and used in later commits.
Added METHOP constructors:
- newMETHOP() for method ops with dynamic method names.
The only optype for this op is OP_METHOD.
- newMETHOP_named() for method ops with constant method names.
Optypes for this op are: OP_METHOD_NAMED (currently) and (later)
OP_METHOD_SUPER, OP_METHOD_REDIR, OP_METHOD_NEXT, OP_METHOD_NEXTCAN,
OP_METHOD_MAYBENEXT
(This commit includes fixups by davem)
|
|
|
|
|
|
| |
This doesn't actually use the flag yet.
We no longer have to make version-dependent changes to
ext/Devel-Peek/t/Peek.t, (it being in /ext) so this doesn't
|
|
|
|
| |
SVs_PADMY is now 0, and SvPADMY means !SvPADTMP.
|
|
|
|
|
|
|
|
|
|
| |
One of the main purposes of cv_name was to provide a way for CPAN mod-
ules easily to obtain the name of a sub. As written, it was not
actually sufficient, as some modules, such as Devel::Declare, need an
unqualified name.
So I am breaking compatibility with 5.21.4 (which introduced cv_name,
but is only a dev release) by adding a flags parameter.
|
| |
|
|
|
|
| |
and free up a bit.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is what -Dt was giving:
$ ./miniperl -Dt -e 'sub foo{} &foo'
EXECUTING...
(-e:0)enter
(-e:0)nextstate
(-e:1)pushmark
(-e:1)gvAssertion failed: (isGV_with_GP(_gvstash)), function Perl_gv_fullname4, file gv.c, line 2341.
Abort trap: 6
|
|
|
|
|
| |
Nothing on CPAN uses it, but some CPAN modules use the associ-
ated macros.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See also perl5porters thread titled: "Perl MBOLism in regex engine"
In the perl 5.000 release (a0d0e21ea6ea90a22318550944fe6cb09ae10cda)
the BOL regop was split into two behaviours MBOL and SBOL, with SBOL
and BOL behaving identically. Similarly the EOL regop was split into
two behaviors SEOL and MEOL, with EOL and SEOL behaving identically.
This then resulted in various duplicative code related to flags and
case statements in various parts of the regex engine.
It appears that perhaps BOL and EOL were kept because they are the
type ("regkind") for SBOL/MBOL and SEOL/MEOL/EOS. Reworking regcomp.pl
to handle aliases for the type data so that SBOL/MBOL are of type
BOL, even though BOL == SBOL seems to cover that case without adding
to the confusion.
This means two regops, a regstate, and an internal regex flag can
be removed (and used for other things), and various logic relating
to them can be removed.
For the uninitiated, SBOL is /^/ and /\A/ (with or without /m) and
MBOL is /^/m. (I consider it a fail we have no way to say MBOL without
the /m modifier). Similarly SEOL is /$/ and MEOL is /$/m (there is
also a /\z/ which is EOS "end of string" with or without the /m).
|
|
|
|
|
|
| |
The flags are not actually stored in the GP. Dumping them as part of
it implies that they are shared between globs that share the same GP,
which is not the case.
|
| |
|
| |
|
|
|
|
| |
I should have added this in perl 5.18.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new config file, regen/op_private, which contains all the
information about the flags and descriptions for the OP op_private field.
Previously, the flags themselves were defined in op.h, accompanied by
textual descriptions (sometimes inaccurate or incomplete).
For display purposes, there were short labels for each flag found in
Concise.pm, and another set of labels for Perl_do_op_dump() in dump.c.
These two sets of labels differed from each other in spelling (e.g.
REFC verses REFCOUNT), and differed in completeness and accuracy.
With this commit, all the data to generate the defines and the labels is
derived from a single source, and are generated automatically by 'make
regen'. It also contains complete data on which bits are used for what by
each op. So any attempt to add a new flag for a particular op where that
bit is already in use, will raise an error in make regen. This compares
to the previous practice of reading the descriptions in op.h and hoping
for the best.
It also makes use of data in regen/opcodes: for example, regen/op_private
specifies that all ops flagged as 'T' get the OPpTARGET_MY flag.
Since the set of labels used by Concise and Perl_do_op_dump() differed,
I've standardised on the Concise version. Thus this commit changes the
output produced by Concise only marginally, while Perl_do_op_dump() is
considerably different. As well as the change in labels (and missing
labels), Perl_do_op_dump() formerly had a bug whereby any unrecognised
bits would not be shown if there was at least one recognised bit.
So while Concise displayed (and still does) "LVINTRO,2", Perl_do_op_dump()
has changed:
- PRIVATE = (INTRO)
+ PRIVATE = (LVINTRO,0x2)
Concise has mainly changed in that a few op/bit combinations weren't being
shown symbolically, and now are. I've avoiding fixing the ones that would
break tests; they'll be fixed up in the next few commits.
A few new OPp* flags have been added:
OPpARG1_MASK
OPpARG2_MASK
OPpARG3_MASK
OPpARG4_MASK
OPpHINT_M_VMSISH_STATUS
OPpHINT_M_VMSISH_TIME
OPpHINT_STRICT_REFS
The last three are analogues for existing HINT_* flags. The former four
reflect that many ops some of the lower few bits of op_private to indicate
how many args the op expects. While (for now) this is still displayed as,
e.g. "LVINTRO,2", the definitions in regen/op_private now fully account
for which ops use which bits for the arg count.
There is a new module, B::Op_private, which allows this new data to be
accessed from Perl. For example,
use B::Op_private;
my $name = $B::Op_private::bits{aelem}{7}; # OPpLVAL_INTRO
my $value = $B::Op_private::defines{$name}; # 128
my $label = $B::Op_private::labels{$name}; # LVINTRO
There are several new constant PL_* tables. PL_op_private_valid[]
specifies for each op number, which bits are valid for that op. In a
couple of commits' time, op_free() will use this on debugging builds to
assert that no ops gained any private flags which we don't know about.
In fact it was by using such a temporary assert repeatedly against the
test suite, that I tracked down most of the inconsistencies and errors in
the current flag data.
The other PL_op_private_* tables contain a compact representation of all
the ops/bits/labels in a format suitable for Perl_do_op_dump() to decode
Op_private. Overall, the perl binary is about 500 bytes smaller on my
system.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds to handy.h isALPHA_FOLD_EQ(c1,c2) which efficiently tests if
c1 and c2 are the same character, case-insensitively. For example
isALPHA_FOLD_EQ(c, 's') returns true if and only if <c> is 's' or 'S'.
isALPHA_FOLD_NE() is also added by this commit.
At least one of c1 and c2 must be known to be in [A-Za-z] or this macro
doesn't work properly. (There is an assert for this in the macro in
DEBUGGING builds). That is why the name includes "ALPHA", so you won't
forget when using it.
This functionality has been in regcomp.c for a while, under a different
name. I had thought that the only reason to make it more generally
available was potential speed gain, but recent gcc versions optimize to
the same code, so I thought there wasn't any point to doing so.
But I now think that using this makes things easier to read (and
certainly shorter to type in). Once you grok what this macro does, it
simplifies what you have to keep in your mind when reading logical
expressions with multiple operands. That something can be either upper
or lower case can be a distraction to understanding the larger point of
the expression.
|
|
|
|
|
|
| |
With the MAD code, these macros were each used in two places to
symbolically dump op_flags and op_private. Now that the MAD code
has been removed, they are each only used once, so inline them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the boolean field op_lastsib to OPs. Within the core, this is set
on the last op in an op_sibling chain (so it is synonymous with op_sibling
being null). By default, its value is set but not used.
In addition, add a new build define (not yet enabled by default),
-DPERL_OP_PARENT, that forces the core to use op_lastsib to detect the
last op in a sibling chain, rather than op_sibling being NULL. This frees
up the last op_sibling pointer in the chain, which rather than being set
to NULL, is now set to point back to the parent of the sibling chain (if
any).
This commit also adds a C-level op_parent() function and B parent()
method; under default builds they just return NULL, under PERL_OP_PARENT
they return the parent of the current op.
Collectively this provides a facility not previously available from B:: nor
C, of being able to follow an op tree up as well as down.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove (almost all) direct access to the op_sibling field of OP structs,
and use these three new macros instead:
OP_SIBLING(o);
OP_HAS_SIBLING(o);
OP_SIBLING_set(o, new_value);
OP_HAS_SIBLING is intended to be a slightly more efficient version of
OP_SIBLING when only boolean context is needed.
For now these three macros are just defined in the obvious way:
#define OP_SIBLING(o) (0 + (o)->op_sibling)
#define OP_HAS_SIBLING(o) (cBOOL((o)->op_sibling))
#define OP_SIBLING_set(o, sib) ((o)->op_sibling = (sib))
but abstracting them out will allow us shortly to make the last pointer in
an op_sibling chain point back to the parent rather than being null, with
a new flag indicating whether this is the last op.
Perl_ck_fun() still has a couple of direct uses of op_sibling, since it
takes the field's address, which is not covered by these macros.
|
|
|
|
|
|
|
|
| |
You need to configure with g++ *and* -Accflags=-DPERL_GLOBAL_STRUCT
or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference.
(g++ does not do the "post-annotation" form of "unused".)
The version code has some of these issues, reported upstream.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- after return/croak/die/exit, return/break are pointless
(break is not a terminator/separator, it's a goto)
- after goto, another goto (!) is pointless
- in some cases (usually function ends) introduce explicit NOT_REACHED
to make the noreturn nature clearer (do not do this everywhere, though,
since that would mean adding NOT_REACHED after every croak)
- for the added NOT_REACHED also add /* NOTREACHED */ since
NOT_REACHED is for gcc (and VC), while the comment is for linters
- declaring variables in switch blocks is just too fragile:
it kind of works for narrowing the scope (which is nice),
but breaks the moment there are initializations for the variables
(the initializations will be skipped since the flow will bypass
the start of the block); in some easy cases simply hoist the declarations
out of the block and move them earlier
Note 1: Since after this patch the core is not yet -Wunreachable-code
clean, not enabling that via cflags.SH, one needs to -Accflags=... it.
Note 2: At least with the older gcc 4.4.7 there are far too many
"unreachable code" warnings, which seem to go away with gcc 4.8,
maybe better flow control analysis. Therefore, the warning should
eventually be enabled only for modernish gccs (what about clang and
Intel cc?)
|
|
|
|
|
|
|
| |
This reverts commit 8c2b19724d117cecfa186d044abdbf766372c679.
I don't understand - smoke-me came back happy with three
separate reports... oh well, some other time.
|