| Commit message | Author | Age | Files | Lines |
| |
Introduce a new opcode class, METHOP, which will hold class/method related
info needed at runtime to improve performance of class/object method
calls, then change OP_METHOD and OP_METHOD_NAMED from being UNOP/SVOP to
being METHOP.
Note that because OP_METHOD is a UNOP with an op_first, while
OP_METHOD_NAMED is an SVOP, the first field of the METHOP structure
is a union holding either op_first or op_sv. This was seen as less messy
than having to introduce two new op classes.
The new op class's character is '.'
Nothing has changed in functionality and/or performance by this commit.
It just introduces new structure which will be extended with extra
fields and used in later commits.
Added METHOP constructors:
- newMETHOP() for method ops with dynamic method names.
The only optype for this op is OP_METHOD.
- newMETHOP_named() for method ops with constant method names.
Optypes for this op are: OP_METHOD_NAMED (currently) and (later)
OP_METHOD_SUPER, OP_METHOD_REDIR, OP_METHOD_NEXT, OP_METHOD_NEXTCAN,
OP_METHOD_MAYBENEXT
(This commit includes fixups by davem)
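The union layout described above can be sketched in miniature. All names below (DemoMethop, DemoOp, DemoSv, demo_method_name) are hypothetical stand-ins, not the real perl definitions; the sketch only assumes that one op class shares its first field between a child-op pointer and an SV pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real perl structures. */
typedef struct demo_op { int type; } DemoOp;
typedef struct demo_sv { const char *pv; } DemoSv;

enum { DEMO_METHOD = 1, DEMO_METHOD_NAMED = 2 };

typedef struct demo_methop {
    int type;                 /* DEMO_METHOD or DEMO_METHOD_NAMED */
    union {
        DemoOp *op_first;     /* dynamic name: child op that computes it */
        DemoSv *op_sv;        /* constant name: the method name itself */
    } u;
} DemoMethop;

/* Return the constant method name, or NULL when the name is dynamic. */
static const char *demo_method_name(const DemoMethop *m)
{
    assert(m != NULL);
    return m->type == DEMO_METHOD_NAMED ? m->u.op_sv->pv : NULL;
}
```

The op type decides which union member is live, so one structure can serve both kinds of method op.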
| |
‘do subname’ has been removed, so OPf_SPECIAL no longer applies to
OP_ENTERSUB.
| |
We get an integer overflow message when we left shift a 1 into the
highest bit of a word. This changes the 1's into 1U's to indicate
unsigned. This is done for all the flag bits in the affected word, as
they could get reordered by someone in the future, unintentionally
reintroducing this problem.
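A minimal sketch of the 1U idiom, assuming a 32-bit unsigned int. The flag names are hypothetical and are not the flags touched by this commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag names, for illustration only. */
#define DEMO_FLAG_LOW   (1U << 0)
#define DEMO_FLAG_MID   (1U << 15)
#define DEMO_FLAG_HIGH  (1U << 31)   /* (1 << 31) would overflow a 32-bit int */

/* Build a single-bit mask without ever shifting a signed value. */
static uint32_t demo_flag_bit(unsigned n)
{
    assert(n < 32);
    return UINT32_C(1) << n;
}
```

Writing every flag in the word as unsigned means a later reordering cannot silently move a signed 1 into the top bit.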
| |
This doesn't actually use the flag yet.
We no longer have to make version-dependent changes to
ext/Devel-Peek/t/Peek.t, (it being in /ext) so this doesn't
| |
This changes op.h to correspond with regexp.h. It moves all the used
bits up in the word so that if a new shared bit is added, the #error
will be triggered, alerting the person doing it that things need
adjusting so binary compatibility is preserved.
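One way such a compile-time guard might look is sketched below. The macro names and bit counts are hypothetical; the real layout is defined in op.h and regexp.h and is not reproduced here.

```c
#include <assert.h>

/* Hypothetical layout: shared flags occupy the low bits,
 * private flags start at DEMO_PRIVATE_BASE and grow upward. */
#define DEMO_SHARED_BITS   5   /* number of shared flag bits */
#define DEMO_PRIVATE_BASE  5   /* lowest bit used privately */

#if DEMO_PRIVATE_BASE < DEMO_SHARED_BITS
#  error "shared flags collide with private flags; move the private bits up"
#endif

/* Mask covering every shared bit. */
static unsigned demo_shared_mask(void)
{
    return (1U << DEMO_SHARED_BITS) - 1U;
}

/* The n-th private flag, counted from the private base. */
static unsigned demo_private_flag(unsigned n)
{
    assert(DEMO_PRIVATE_BASE + n < 32);
    return 1U << (DEMO_PRIVATE_BASE + n);
}
```

Raising DEMO_SHARED_BITS without also raising DEMO_PRIVATE_BASE stops the build, which is the alert the message describes.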
| |
and give IS_PADGV a simpler definition.
These are not used in the perl core any more and shouldn’t be.
The IS_PADGV definition checked for the IN_PAD flag, which flag never
made much sense (see the prev. commit’s message). Since any GV could
end up with that flag, and since any GV coming near a pad would get
it, it might as well have been turned on for all GVs (except copies).
So just check whether the thingy is a GV.
| |
Instead of faking up a GV to pass to the call checker if we have a
lexical sub, just get the GV from CvGV (since that will reify the GV,
even for lexical subs), unless the call checker has not specifically
requested GVs.
For now, we assume the default call checker cannot handle non-GV sub
names, as indeed it cannot. An imminent commit will rectify that.
The code in scope.c was getting the name hek from the proto CV (stowed
in magic on the pad name) if the CV in the pad had lost it. Now, the
proto CV can lose it at compile time via CvGV, so that does not work
anymore. Instead, just get it from the GV.
| |
A couple of VMS-specific hints bits are stored in op_private on COPs.
Currently these are added using NATIVE_HINTS, which is defined as
PL_hints >> 24.
Since other hints have started using the top byte of PL_hints, this
has the possibility of inadvertently setting other bits in cop->op_private.
So mask out the bits we don't want. We need this before the next commit,
which will assert valid bits on debugging builds.
(This is VMS-specific, and has been applied blind)
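The shift-then-mask step can be sketched as follows. The bit values are hypothetical placeholders chosen for the example, not the real VMS hint bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical positions of the two wanted bits after the >> 24 shift. */
#define DEMO_HINT_M_STATUS  0x40u
#define DEMO_HINT_M_TIME    0x80u
#define DEMO_NATIVE_MASK    (DEMO_HINT_M_STATUS | DEMO_HINT_M_TIME)

/* Take the top byte of the hints word and keep only the wanted bits. */
static uint8_t demo_native_hints(uint32_t hints)
{
    uint32_t top = hints >> 24;
    assert(top <= 0xFFu);   /* a 32-bit value shifted by 24 fits in a byte */
    return (uint8_t)(top & DEMO_NATIVE_MASK);
}
```

Without the mask, any other hint that lands in the top byte would leak into the private flags.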
| |
Add a new config file, regen/op_private, which contains all the
information about the flags and descriptions for the OP op_private field.
Previously, the flags themselves were defined in op.h, accompanied by
textual descriptions (sometimes inaccurate or incomplete).
For display purposes, there were short labels for each flag found in
Concise.pm, and another set of labels for Perl_do_op_dump() in dump.c.
These two sets of labels differed from each other in spelling (e.g.
REFC versus REFCOUNT), and differed in completeness and accuracy.
With this commit, all the data to generate the defines and the labels is
derived from a single source, and are generated automatically by 'make
regen'. It also contains complete data on which bits are used for what by
each op. So any attempt to add a new flag for a particular op where that
bit is already in use, will raise an error in make regen. This compares
to the previous practice of reading the descriptions in op.h and hoping
for the best.
It also makes use of data in regen/opcodes: for example, regen/op_private
specifies that all ops flagged as 'T' get the OPpTARGET_MY flag.
Since the set of labels used by Concise and Perl_do_op_dump() differed,
I've standardised on the Concise version. Thus this commit changes the
output produced by Concise only marginally, while Perl_do_op_dump() is
considerably different. As well as the change in labels (and missing
labels), Perl_do_op_dump() formerly had a bug whereby any unrecognised
bits would not be shown if there was at least one recognised bit.
So while Concise displayed (and still does) "LVINTRO,2", Perl_do_op_dump()
has changed:
- PRIVATE = (INTRO)
+ PRIVATE = (LVINTRO,0x2)
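The corrected display rule (show every recognised label, then any leftover bits in hex) can be sketched like this. The two-entry label table and the function name are hypothetical; the real tables are generated from regen/op_private.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical label table, for illustration only. */
struct demo_label { unsigned bit; const char *name; };
static const struct demo_label demo_labels[] = {
    { 0x80u, "LVINTRO" },
    { 0x40u, "REFC" },
};

/* Write "LABEL,LABEL,0xNN" into buf; leftover bits are always shown. */
static void demo_format_private(unsigned flags, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
    for (i = 0; i < sizeof demo_labels / sizeof demo_labels[0]; i++) {
        if (flags & demo_labels[i].bit) {
            used += (size_t)snprintf(buf + used, len - used, "%s%s",
                                     used ? "," : "", demo_labels[i].name);
            flags &= ~demo_labels[i].bit;
            if (used >= len)
                return;   /* output truncated; stop appending */
        }
    }
    if (flags)   /* unrecognised bits remain: print them in hex */
        snprintf(buf + used, len - used, "%s0x%x", used ? "," : "", flags);
}
```

With flags 0x82 the sketch produces "LVINTRO,0x2", matching the new dump output quoted above.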
Concise has mainly changed in that a few op/bit combinations weren't being
shown symbolically, and now are. I've avoided fixing the ones that would
break tests; they'll be fixed up in the next few commits.
A few new OPp* flags have been added:
OPpARG1_MASK
OPpARG2_MASK
OPpARG3_MASK
OPpARG4_MASK
OPpHINT_M_VMSISH_STATUS
OPpHINT_M_VMSISH_TIME
OPpHINT_STRICT_REFS
The last three are analogues for existing HINT_* flags. The former four
reflect that many ops use some of the lower few bits of op_private to indicate
how many args the op expects. While (for now) this is still displayed as,
e.g. "LVINTRO,2", the definitions in regen/op_private now fully account
for which ops use which bits for the arg count.
There is a new module, B::Op_private, which allows this new data to be
accessed from Perl. For example,
use B::Op_private;
my $name = $B::Op_private::bits{aelem}{7}; # OPpLVAL_INTRO
my $value = $B::Op_private::defines{$name}; # 128
my $label = $B::Op_private::labels{$name}; # LVINTRO
There are several new constant PL_* tables. PL_op_private_valid[]
specifies for each op number, which bits are valid for that op. In a
couple of commits' time, op_free() will use this on debugging builds to
assert that no ops gained any private flags which we don't know about.
In fact it was by using such a temporary assert repeatedly against the
test suite, that I tracked down most of the inconsistencies and errors in
the current flag data.
The other PL_op_private_* tables contain a compact representation of all
the ops/bits/labels in a format suitable for Perl_do_op_dump() to decode
Op_private. Overall, the perl binary is about 500 bytes smaller on my
system.
| |
It's a bit confusing which bits in PL_opargs are used for what,
and which flags in regen/opcodes map to which OA_* value
| |
This description, added in ec6d81aba, is misleading.
| |
Till now, when a bareword was looked up to see whether it was a sub-
routine, an rv2cv op was created (to allow PL_check hooks to override
the process), which was then asked for its GV.
Afterwards, the GV was downgraded back to nothing if possible.
So a lot of the time a GV was autovivified and then discarded. This
has been the case since f74617600 (5.12).
If we know there is a good chance that the rv2cv op is about to be
deleted, we can avoid that by passing a flag to the new op.
Also f74617600 actually changed the behaviour by vivifying stashes
that used not to be vivified:
sub foo { print shift, "\n" }
SUPER::foo bar if 0;
foo SUPER;
Output in 5.10:
SUPER
Output as of this commit:
SUPER
Output in 5.12 to 5.21.3:
Can't locate object method "foo" via package "SUPER" at - line 3.
| |
This was used to distinguish forms <FILE> from <$file>, but doesn't
seem to be used anymore by anything.
| |
This caused all OP structures to be larger than intended; for example, it
made `struct op` 48 bytes rather than 40 on Mac OS X x86-64.
| |
Add the boolean field op_lastsib to OPs. Within the core, this is set
on the last op in an op_sibling chain (so it is synonymous with op_sibling
being null). By default, its value is set but not used.
In addition, add a new build define (not yet enabled by default),
-DPERL_OP_PARENT, that forces the core to use op_lastsib to detect the
last op in a sibling chain, rather than op_sibling being NULL. This frees
up the last op_sibling pointer in the chain, which rather than being set
to NULL, is now set to point back to the parent of the sibling chain (if
any).
This commit also adds a C-level op_parent() function and B parent()
method; under default builds they just return NULL, under PERL_OP_PARENT
they return the parent of the current op.
Collectively this provides a facility not previously available from B:: nor
C, of being able to follow an op tree up as well as down.
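The PERL_OP_PARENT arrangement can be sketched in miniature. DemoOp and the two helpers are hypothetical; only the field names op_sibling and op_lastsib come from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature op; field names echo the commit message. */
typedef struct demo_op DemoOp;
struct demo_op {
    DemoOp *op_sibling;   /* next sibling, or the parent when op_lastsib is set */
    bool    op_lastsib;   /* true on the last op of a sibling chain */
};

/* Next sibling, or NULL at the end of the chain. */
static DemoOp *demo_sibling(const DemoOp *o)
{
    assert(o != NULL);
    return o->op_lastsib ? NULL : o->op_sibling;
}

/* Walk to the last sibling and follow its back-pointer to the parent. */
static DemoOp *demo_parent(const DemoOp *o)
{
    assert(o != NULL);
    while (!o->op_lastsib)
        o = o->op_sibling;
    return o->op_sibling;   /* NULL for a root op with no parent */
}
```

Finding the parent costs a walk along the remaining siblings, but needs no extra pointer field in each op.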
| |
Remove (almost all) direct access to the op_sibling field of OP structs,
and use these three new macros instead:
OP_SIBLING(o);
OP_HAS_SIBLING(o);
OP_SIBLING_set(o, new_value);
OP_HAS_SIBLING is intended to be a slightly more efficient version of
OP_SIBLING when only boolean context is needed.
For now these three macros are just defined in the obvious way:
#define OP_SIBLING(o) (0 + (o)->op_sibling)
#define OP_HAS_SIBLING(o) (cBOOL((o)->op_sibling))
#define OP_SIBLING_set(o, sib) ((o)->op_sibling = (sib))
but abstracting them out will allow us shortly to make the last pointer in
an op_sibling chain point back to the parent rather than being null, with
a new flag indicating whether this is the last op.
Perl_ck_fun() still has a couple of direct uses of op_sibling, since it
takes the field's address, which is not covered by these macros.
| |
MAD = Misc Attribute Decoration; unmaintained attempt at preserving
the Perl parse tree more faithfully so that automatic conversion to
Perl 6 would have been easier.
| |
For example,
if (OP_TYPE_IS_OR_WAS(o, OP_LIST))
...
is now available instead of either of the following:
if ( o
&& ( o->op_type == OP_LIST
|| (o->op_type == OP_NULL
&& o->op_targ == OP_LIST) ) )
...
if ( o &&
(o->op_type == OP_NULL ? o->op_targ : o->op_type) == OP_LIST )
...
In case the above logic is a bit unclear: It checks whether that OP is
an OP_LIST or used to be one before being NULLed using op_null.
(FTR, the resulting OP_NULLs have their op_targ set to the old OP type).
This sort of check (and its reverse "isn't and didn't use to be") is a
relatively common pattern in the part of op.c that tries to intuit
structures from optimization-mangled OP trees. Hopefully, using these
macros will make some code a fair amount clearer.
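A self-contained sketch of the "is or was" test, using a hypothetical two-field op and made-up opcode numbers rather than the real OP structure and macros:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature op and opcode numbers. */
enum { DEMO_OP_NULL = 0, DEMO_OP_LIST = 1, DEMO_OP_CONST = 2 };
typedef struct { int op_type; int op_targ; } DemoOp;

/* The op's current type, or the type it had before being nulled. */
#define DEMO_TYPE_OR_WAS(o) \
    ((o)->op_type == DEMO_OP_NULL ? (o)->op_targ : (o)->op_type)

/* NULL-safe check against the current or former type. */
#define DEMO_TYPE_IS_OR_WAS(o, type) \
    ((o) != NULL && DEMO_TYPE_OR_WAS(o) == (type))

/* Null an op, remembering its old type in op_targ. */
static void demo_null(DemoOp *o)
{
    assert(o != NULL);
    if (o->op_type == DEMO_OP_NULL)
        return;                 /* already nulled; keep the saved type */
    o->op_targ = o->op_type;
    o->op_type = DEMO_OP_NULL;
}
```

After demo_null() the op reports DEMO_OP_NULL as its type, yet the macro still recognises it as a former list op.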
| |
av_tindex is a more clearly named synonym for av_len, available starting
in v5.18. This changes the core uses to it, including modules in /ext,
which are not dual-lifed.
| |
Declarative syntax to unwrap argument list into lexical variables.
"sub foo ($a,$b) {...}" checks number of arguments and puts the
arguments into lexical variables. Signatures are not equivalent to the
existing idiom of "sub foo { my($a,$b) = @_; ... }". Signatures are only
available by enabling a non-default feature, and generate warnings about
being experimental. The syntactic clash with prototypes is managed by
disabling the short prototype syntax when signatures are enabled.
| |
plus some typo fixes. I probably changed some things in perlintern, too.
| |
flags param was poorly designed and didn't have a formal api. Replace it
with the bool it really is. See #115736 for details.
| |
This makes:
sub baz { return $cat; }
Behave like:
sub baz { $cat; }
Which is notably faster.
Unpatched:
./perl -Ilib/ ~/stuff/bench.pl
Benchmark: timing 40000000 iterations of normal, ret...
normal: 3 wallclock secs ( 1.60 usr + 0.01 sys = 1.61 CPU) @ 24844720.50/s (n=40000000)
ret: 3 wallclock secs ( 2.08 usr + 0.00 sys = 2.08 CPU) @ 19230769.23/s (n=40000000)
Patched:
./perl -Ilib ~/stuff/bench.pl
Benchmark: timing 40000000 iterations of aret, normal...
normal: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 23255813.95/s (n=40000000)
ret: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 23255813.95/s (n=40000000)
The difference in OP trees can be seen here:
Unpatched:
$ perl -MO=Concise,baz -e 'sub baz { return $cat }'
main::baz:
5 <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->5
1 <;> nextstate(main 1 -e:1) v ->2
4 <@> return K ->5
2 <0> pushmark s ->3
- <1> ex-rv2sv sK/1 ->4
3 <#> gvsv[*cat] s ->4
-e syntax OK
Patched:
$ ./perl -Ilib -MO=Concise,baz -e 'sub baz { return $cat }'
main::baz:
3 <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->3
1 <;> nextstate(main 1 -e:1) v ->2
- <@> return K ->-
- <0> pushmark s ->2
- <1> ex-rv2sv sK/1 ->-
2 <$> gvsv(*cat) s ->3
-e syntax OK
(Includes some modifications from Steffen)
| |
Commit 1830b3d9c8 introduced a flaw where XopENTRY calls
Perl_custom_op_xop twice to retrieve the same XOP *. This is inefficient
and causes extra machine code. Since I found no CPAN or upstream=blead
usage of Perl_custom_op_xop, and its previous docs say it isn't 100%
public, it is being converted to a macro.
Most usage of Perl_custom_op_xop is to conditionally fetch a member of the
XOP struct, which was previously implemented by XopENTRY. Move the XopENTRY
logic and picking defaults to an expanded version of Perl_custom_op_xop.
The union allows Perl_custom_op_get_field to return its result in 1
register, since the union is similar to a void * or IV, but with the
machine code overhead of casting, if any, being done in the callee
(Perl_custom_op_get_field), not the caller. Perl_custom_op_get_field can
also return the XOP * without looking inside it to implement
Perl_custom_op_xop.
XopENTRYCUSTOM is a wrapper around Perl_custom_op_get_field with
XopENTRY-like usage.
XopENTRY is used by the OP_* macros, which are heavily used (but rarely
called, since custom ops are rare) by Perl lang warnings system. The
vararg warning arguments are usually evaluated no matter if the warning
will be printed to STDERR or not. Since some people like to ignore warnings
or run no strict; and warnings branches are frequent in pp_*, it is
beneficial to make the OP_* macros smaller in machine code. The design
of Perl_custom_op_get_field supports these goals.
This commit does not pass judgement on Ben Morrow's unclear public or
private API designation of Perl_custom_op_xop, and whether
Perl_custom_op_xop should be deprecated and removed from public API. It was
trivial to leave a form of Perl_custom_op_xop in the new design.
XOPe enums are identical to XOPf constants so no conversion has to be
done between the field selector parameter and the field flag to test
in machine code.
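The "selector equals flag bit, result returned in a union" design can be sketched as below. Every name and every default value is hypothetical; this is not the real XOP struct or the real Perl_custom_op_get_field signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical selectors whose values double as "field present" flag bits. */
typedef enum { DEMO_E_NAME = 0x01, DEMO_E_DESC = 0x02, DEMO_E_CLASS = 0x04 } DemoField;

typedef struct {
    unsigned    flags;   /* which optional fields are filled in */
    const char *name;
    const char *desc;
    unsigned    klass;
} DemoXop;

/* One register-sized result, whichever field was asked for. */
typedef union {
    const char *str;
    unsigned    num;
} DemoFieldValue;

/* Fetch one field, falling back to a (hypothetical) default when absent. */
static DemoFieldValue demo_get_field(const DemoXop *x, DemoField f)
{
    DemoFieldValue v;
    int present;
    assert(x != NULL);
    present = (x->flags & (unsigned)f) != 0;   /* selector is the flag bit */
    switch (f) {
    case DEMO_E_NAME:  v.str = present ? x->name : "unknown";        break;
    case DEMO_E_DESC:  v.str = present ? x->desc : "no description"; break;
    case DEMO_E_CLASS: v.num = present ? x->klass : 0u;              break;
    default:           assert(!"unreachable selector"); v.num = 0u;  break;
    }
    return v;
}
```

Because the selector is the flag bit, the callee tests for presence without any lookup table, and the caller only unpacks the union member it asked for.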
ASSUME and NOT_REACHED are being introduced. The closest to the 2
previously was "assert(0)". Perl has not used ASSUME or CC specific
versions of it before. Clang, GCC >= 4.5, and Visual C are supported. For
completeness, ARMCC's __promise was added, but Perl is not known to have
any support for ARMCC by this committer.
This patch is part of perl #115032.
| |
by removing the hint from the exit op itself and just having pp_exit
look in the cop hint hash, where it is already stored (as a result of
having been in %^H at compile time).
&CORE:: subs intentionally lack a nextstate op (cop) so they can see
the hints in the caller’s nextstate op.
| |
This commit makes them behave like exit and die without the ampersand
by moving the OPpHUSH_VMSISH hint from exit/die op to the current
statement (nextstate/cop) instead. &CORE:: subs intentionally lack a
nextstate op, so they can see the hints in the caller’s nextstate op.
| |
and reword the warning slightly.
See <20131027204944.20489.qmail@lists-nntp.develooper.com>.
To avoid getting a warning about scalar context for ‘delete %a[1,2]’,
which dies anyway, I stopped scalar context from being applied to
delete’s argument. Scalar context is not meaningful here anyway, and
the context is not really scalar.
This also means that ‘delete sort’ no longer produces a warning about
scalar context before dying, so I added a test for that.
| |
If a bare block is the last thing in an lvalue sub, OP_LEAVELOOP needs
to propagate lvalue context and handle returned arrays properly, just
as OP_LEAVE has done since yesterday.
This is a follow-up to 2ec7f6f24289. This came up in ticket #119797.
| |
This flag will be used by the next commit.
| |
Now that we have op->op_folded, we don’t need OPpCONST_FOLDED any
more. In removing it, I modified B::Concise to output op_folded the
way OPpCONST_FOLDED was output before, since it can be helpful to have
it when reading op dumps.
| |
Entersub ops can turn into rv2cv ops at compile time. It is important
that possible flag conflicts are handled properly. Since it took me
a while to figure out that there were no bugs here, here are my own
notes, cleaned up and nicely formatted.
| |
This resolves tickets #28380 and #114024.
Commit 95a31aad5 did something similar to this for the new %hash{...}
syntax. This commit extends it to @ slices and combines the two
code paths.
The heuristics in toke.c can easily produce false positives. So the
op is flagged as being a candidate for the warning. Then when op.c
has the op tree available, it examines it to see whether the heuristic
may have been a false positive.
This avoids bugs with qw "foo bar baz" and sub calls triggering
the warning.
The source code is no longer available for the warning, so we recon-
struct it from the op tree, skipping the subscript if it is anything
other than a const op.
This means that @hash{$foo} comes out as @hash{...} and @hash{foo} as
@hash{"foo"}. It also means that @hash{"]"} is displayed correctly
instead of as @hash{"].
Commit 95a31aad5 also modified the heuristic for %hash{...} to exempt
qw altogether. But it did not exempt it if it was preceded by a tab.
So this commit rectifies that.
This commit also improves the false positive detection by exempting
any ops returning lists that can get past toke.c’s heuristic. I went
through the entire list of ops, but I may have missed some.
Also, @ slices on the lhs of = are exempt, as they change the context
and are hence actually useful.
| |
Instead of warning in the lexer, flag the op and then warn in op.c,
when the op tree is available, so we don’t end up warning for actual
lists or for sub calls.
Also, only warn in scalar context, as in list context $hash{$scalar}
and %hash{$scalar} do different things.
In op.c we no longer have easy access to the source code, so recon-
struct the hash/array access based on the op tree. This means
%hash{foo} becomes %hash{"foo"}. We only reconstruct constant keys,
so %hash{++$x} becomes %hash{...}. This also corrects erroneous
dumps, like %hash{"} for %hash{"}"}.
Instead of triggering the warning solely based on the op tree, we
still keep the heuristic in toke.c, so that common workarounds for
that warning (e.g., {q<key>} and {("key")}) continue to work.
The heuristic in toke.c is tweaked to avoid warning for qw().
In a future commit I plan to extend this to the existing @array[0] and
@hash{key} warnings, to avoid false positives.
| |
The various different forms of goto (and dump) take different branches
through this big function. Document which branches handle which variants.
Also document the use of OPf_SPECIAL in OP_DUMP.
| |
Make the array interface 64-bit safe by using SSize_t instead of I32
for array indices.
This is based on a patch by Chip Salzenberg.
This completes what the previous commit began when it changed
av_extend.
| |
Add a new member, op_folded, to BASEOP. It is a replacement for
OPpCONST_FOLDED (which can only be set on OP_CONST). At the moment
OPpCONST_FOLDED remains, as it is exposed in B (e.g. B::Concise relies
on it).
Signed-off-by: Niels Thykier <niels@thykier.net>
| |
See ticket #118055 for all the detail. On systems where IV is bigger
than a pointer, the slab allocator messes things up because it only
provides pointer alignment. If pmops have an IV field, we cannot
allocate them via slab on such systems. Pmops actually don’t need
an IV, just a PADOFFSET. So we can change them and remove the
workaround.
This is obviously not suitable for maint.
| |
At compile time, if split occurs on the right-hand side of an assign-
ment to a list of scalars, if the limit argument is a constant con-
taining the number 0 then it is modified in place to hold one more
than the number of scalars.
This means ‘constants’ can change their values, if they happen to be
in the wrong place at the wrong time:
$ ./perl -Ilib -le 'use constant NULL => 0; ($a,$b,$c) = split //, $foo, NULL; print NULL'
4
I considered checking the reference count on the SV, but since XS code
could create its own const ops with weak references to the same cons-
tants elsewhere, the safest way to avoid modifying someone else’s SV
is to mark the split op in ck_split so we know the SV belongs to that
split op alone.
Also, to be on the safe side, turn off the read-only flag before modi-
fying the SV, instead of relying on the special case for compile time
in sv_force_normal.
| |
Flag 2 on entersub is HINT_STRICT_REFS, not _SUBS.
| |
PL_reg_state.re_reparsing is a hacky flag used to allow runtime
code blocks to be included in patterns. Basically, since code blocks
are now handled by the perl parser within literal patterns, runtime
patterns are handled by taking the (assembled at runtime) pattern,
and feeding it back through the parser via the equivalent of
eval q{qr'the_pattern'},
so that run-time (?{..})'s appear to be literal code blocks.
When this happens, the global flag PL_reg_state.re_reparsing is set,
which modifies lexing and parsing in minor ways (such as whether \\ is
stripped).
Now, I'm in the slow process of trying to eliminate global regex state
(i.e. gradually removing the fields of PL_reg_state), and also a change
which will be coming a few commits ahead requires the info which this flag
indicates to linger for longer (currently it is cleared immediately after
the call to scan_str()).
For those two reasons, this commit adds a new mechanism to indicate this:
a new flag to eval_sv(), G_RE_REPARSING (which sets OPpEVAL_RE_REPARSING
in the entereval op), which sets the EVAL_RE_REPARSING bit in PL_in_eval.
It's still a yukky global flag hack, but it's a *different* global flag hack
now.
For this commit, we add the new flag(s) but keep the old
PL_reg_state.re_reparsing flag and assert that the two mechanisms always
match. The next commit will remove re_reparsing.
| |
This patch resolves several issues at once. The parts are
sufficiently interconnected that it is hard to break it down
into smaller commits. The tickets open for these issues are:
RT #94490 - split and constant folding
RT #116086 - split "\x20" doesn't work as documented
It additionally corrects some issues with cached regexes that
were exposed by the split changes (and applied to them).
It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171
and cccd1425414e6518c1fc8b7bcaccfb119320c513.
Prior to this patch the special RXf_SKIPWHITE behavior of
split(" ", $thing)
was only available if Perl could resolve the first argument to
split at compile time, meaning under various arcane situations.
This manifested as oddities like
my $delim = $cond ? " " : qr/\s+/;
split $delim, $string;
and
split $cond ? " " : qr/\s+/, $string
not behaving the same as:
($cond ? split(" ", $string) : split(/\s+/, $string))
which isn't very convenient.
This patch changes this by adding a new flag to the op_pmflags,
PMf_SPLIT which enables pp_regcomp() to know whether it was called
as part of split, which allows the RXf_SPLIT to be passed into run
time regex compilation. We also preserve the original flags so
pattern caching works properly, by adding a new property to the
regexp structure, "compflags", and related macros for accessing it.
We preserve the original flags passed into the compilation process,
so we can compare when we are trying to decide if we need to
recompile.
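The recompile decision can be sketched as a comparison of both the pattern text and the flags it was originally compiled with. DemoRegex, DEMO_F_SPLIT and demo_can_reuse are hypothetical names; only "compflags" is taken from the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical cached-regex record; only the fields needed for the check. */
typedef struct {
    const char *source;     /* pattern text it was compiled from */
    unsigned    compflags;  /* flags originally passed to the compiler */
} DemoRegex;

#define DEMO_F_SPLIT 0x1u   /* hypothetical: "compiled on behalf of split" */

/* Reuse the cached regex only if both the text and the flags match. */
static bool demo_can_reuse(const DemoRegex *cached, const char *source,
                           unsigned flags)
{
    assert(source != NULL);
    return cached != NULL
        && cached->compflags == flags
        && strcmp(cached->source, source) == 0;
}
```

Comparing the text alone would let a " " pattern compiled for an ordinary match be reused by split, or the reverse, with the wrong whitespace behaviour.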
Note that this is essentially the opposite fix from the one applied
originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171.
The reverted patch was meant to make:
split( 0 || " ", $thing ) #1
consistent with
my $x=0; split( $x || " ", $thing ) #2
and not with
split( " ", $thing ) #3
This was reverted because it broke C<split("\x{20}", $thing)>, and
because one might argue that it is not that #1 does the wrong thing,
but rather that the behavior of #2 is wrong. In other words
we might expect that all three should behave the same as #3, and
that instead of "fixing" the behavior of #1 to be like #2, we should
really fix the behavior of #2 to behave like #3. (Which is what we did.)
Also, it doesn't make sense to move the special case detection logic
further from the regex engine. We really want the regex engine to decide
this stuff itself, otherwise split " ", ... wouldn't work properly with
an alternate engine. (Imagine we add a special regexp meta pattern that behaves
the same as " " does in a split /.../. For instance we might make
split /(*SPLITWHITE)/ trigger the same behavior as split " ".)
The other major change as result of this patch is it effectively
reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which
was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE
and free up bits in the regex flags structure.
But we don't want to get rid of these vars, and it turns out that
RXf_SEEN_LOOKBEHIND is used only in the same situation as the new
RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to
RXf_NO_INPLACE_SUBST, and then instead of using two vars we use
only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to
have their bits back.
| |
[perl #115080]
m?...? is only supposed to match once, until reset. Normally this is done
by setting the PMf_USED flag on the PMOP. Under ithreads we can't modify
ops, so instead we indicate by setting the regex's SV to readonly. (This
is a bit of a hack: the flag should be associated with the PMOP, not the
regex).
This breaks with run-time regexes when the pattern gets recompiled; for
example:
for my $c (qw(a b c)) {
print "matched $c\n" if $c =~ m?^$c$?;
}
outputs
matched a
on unthreaded, but
matched a
matched b
matched c
on threaded.
The re_eval jumbo fix made this more noticeable by sometimes recompiling
even when the pattern text hasn't changed (to make closures work ok).
The quick fix is to propagate the readonlyness of the old re to the new
re. (The proper fix would be to store the flag state in a pad slot
associated with the PMOP).
Needless to say, I've gone for the quick fix.
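The quick fix can be sketched as follows. A plain boolean stands in for the read-only marker on the regex's SV, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical regex record; matched_once stands in for the read-only marker. */
typedef struct {
    const char *pattern;
    bool        matched_once;
} DemoRe;

/* Recompile, carrying the "already matched" state over from the old regex. */
static DemoRe demo_recompile(const DemoRe *old, const char *pattern)
{
    DemoRe fresh;
    assert(pattern != NULL);
    fresh.pattern = pattern;
    fresh.matched_once = old != NULL && old->matched_once;
    return fresh;
}

/* m?...? semantics: succeed at most once until reset. */
static bool demo_match_once(DemoRe *re, bool text_matches)
{
    assert(re != NULL);
    if (re->matched_once || !text_matches)
        return false;
    re->matched_once = true;
    return true;
}
```

Without the propagation in demo_recompile(), each recompiled regex would start with a clear marker and match again, which is the threaded misbehaviour shown above.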
| |
In the threaded version of PmopSTASH_set(), the assigned value is a
PADOFFSET, not a pointer; so use 0 rather than NULL for the default value.
This keeps clang happy.
| |
This only affected threaded builds. I think the comments in the added
test explain well enough what was happening.
The solution is to store a stashpad offset in the pmop, instead of the
name of the stash. This is similar to what was done with cop stashes
in d4d03940c58a.
Not only does this fix the crash, but it also makes compilation faster
and saves memory (no separate malloc for every m?pat?).
I had to move Safefree(PL_stashpad) later on in perl_destruct, because
freeing a pmop causes the PL_stashpad to be accessed, and pmops can be
freed during sv_clean_all. Its previous location was not a problem
for cops, as PL_stashpad[cop->cop_stashoff] is only accessed when
PL_curcop==that_cop and Perl code is running, not when cops are freed.
| |
I had to change the definition of IS_PADCONST to account for the
SVf_IsCOW flag. Previously, anything marked READONLY would be consid-
ered a pad constant, to be shared by pads of different recursion lev-
els. Some of those READONLY things were not actually read-only, as
they were copy-on-write scalars, which are never read-only. So I
changed the definition of IS_PADCONST in e3918bb703c to accept COWs
as well as read-only scalars, since I was removing the READONLY flag
from COWs.
With the new copy-on-write scheme, it is easy for a TARG to turn into
a COW. If that happens and then the same subroutine calls itself
recursively for the first time after that, pad_push will see that this
is a pad ‘constant’ and allow the next recursion level to share it.
If pp_concat calls itself recursively, the recursive call can modify
the scalar the outer call is in the middle of using, causing the
return value to be doubled up (‘tmptmp’) in the test case added here.
Since pad constants are marked PADTMP (I would like to change that
eventually), there is no way to distinguish them from TARGs when they
are COWs, except for the fact that pad constants that are COWs are
always shared hash keys (SvLEN==0).
| |
As discussed in ticket #114820, instead of using READONLY+FAKE to mark
a copy-on-write string, we should make it a separate flag.
There are many modules in CPAN (and 1 in core, Compress::Raw::Zlib)
that assume that SvREADONLY means read-only. Only one CPAN module,
POSIX::pselect will definitely be broken by this. Others may need to
be tweaked. But I believe this is for the better.
It causes all tests except ext/Devel-Peek/t/Peek.t (which needs a tiny
tweak still) to pass under PERL_OLD_COPY_ON_WRITE, which is a prereq-
uisite for any new COW scheme that creates COWs under the same cir-
cumstances.
| |
In a construct like
my ($x,$y) = @_
the pushmark/padsv/padsv is already optimised into a single padrange
op. This commit makes the OPf_SPECIAL flag on the padrange op indicate
that in addition, @_ should be pushed onto the stack, skipping an
additional pushmark/gv[*_]/rv2av combination.
So in total (including the earlier padrange work), the above construct
goes from being
3 <0> pushmark s
4 <$> gv(*_) s
5 <1> rv2av[t3] lK/1
6 <0> pushmark sRM*/128
7 <0> padsv[$x:1,2] lRM*/LVINTRO
8 <0> padsv[$y:1,2] lRM*/LVINTRO
9 <2> aassign[t4] vKS
to
3 <0> padrange[$x:1,2; $y:1,2] l*/LVINTRO,2 ->4
4 <2> aassign[t4] vKS
| |
Add a new save type that does the equivalent of multiple SAVEt_CLEARSV's
for a given target range. This makes the new padrange op more efficient.
| |
This single op can, in some circumstances, replace the sequence of a
pushmark followed by one or more padsv/padav/padhv ops, and possibly
a trailing 'list' op, but only where the targs of the pad ops form
a continuous range.
This is generally more efficient, but is particularly so in the case
of void-context my declarations, such as:
my ($a,@b);
Formerly this would be executed as the following set of ops:
pushmark pushes a new mark
padsv[$a] pushes $a, does a SAVEt_CLEARSV
padav[@b] pushes all the flattened elements (i.e. none) of @b,
does a SAVEt_CLEARSV
list pops the mark, and pops all stack elements except the last
nextstate pops the remaining stack element
It's now:
padrange[$a..@b] does two SAVEt_CLEARSV's
nextstate nothing needing doing to the stack
Note that in the case above, this commit changes user-visible behaviour in
pathological cases; in particular, it has always been possible to modify a
lexical var *before* the my is executed, using goto or closure tricks.
So in principle someone could tie an array, then could notice that FETCH
is no longer being called, e.g.
f();
my ($s, @a); # this no longer triggers two FETCHES
sub f {
tie @a, ...;
push @a, 1,2;
}
But I think we can live with that.
Note also that having a padrange operator will allow us shortly to have
a corresponding SAVEt_CLEARPADRANGE save type, that will replace multiple
individual SAVEt_CLEARSV's.
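The "continuous range" precondition can be sketched as a check over the pad offsets of consecutive pad ops. The function name and the use of a plain array are hypothetical; the real optimisation walks the op chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the targs form one run base, base+1, ... */
static bool demo_targs_contiguous(const size_t *targs, size_t n)
{
    size_t i;
    assert(targs != NULL || n == 0);
    if (n == 0)
        return false;            /* nothing to merge */
    for (i = 1; i < n; i++)
        if (targs[i] != targs[i - 1] + 1)
            return false;        /* gap or reordering: keep separate ops */
    return true;
}
```

When the check succeeds, a single op needs only the first offset and a count to cover every variable in the range.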