| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
| |
Carp calls rand().
|
|
|
|
| |
Use a code point in the test that is equally valid on EBCDIC machines
|
|
|
|
|
|
|
|
| |
There are a number of files excluded using gitignore rules that are
included in the repository. This can lead to confusion if something
other than git tries to read the ignore files.
Add rules to the gitignore files so that these files won't be ignored.
|
|
|
|
|
|
|
|
|
|
|
| |
The current "no warranty" text warning against the use of Safe or
Opcode for "security purposes" is somewhat ambiguous. These modules
are not effective sandboxing mechanisms for evaluating untrusted
perl code and should not be used in that manner.
Safe and Opcode are, at best, hardening measures that could be used
in combination with operating system level sandboxing of the perl
interpreter.
|
| |
|
|
|
|
|
|
|
| |
This is a partial revert to remove utf8->import
which breaks Storable
This reverts commit 894d8b10212a906402f4db9f9aac9efe9fa084fd.
|
| |
|
|
|
|
|
| |
This is a fixup for #17969 which wanted to load utf8_heavy.pl but
it is no longer available as of 5.31.6
|
| |
|
| |
|
|
|
|
| |
Fixes GH #17271
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This updates the bug tracker URL from http://rt.perl.org
to https://rt.perl.org.
There is a place in the code, in corelist.pl, that is sensitive
to the URL of the bug tracker. This now understands both
versions of the bug tracker URL. Ideally, this will be
consolidated once the dust settles.
This patch also updates ExtUtils::CBuilder, Safe, threads
and threads::shared to point to the new bug tracker URL.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using vars pragma is discouraged and has been superseded
by 'our' declarations available in Perl v5.6.0 or later.
Additionally using 'vars' pragma increase the memory consumption of a
program by about 700 kB for no good reason.
This commit is about replacing the usage of 'vars' pragma
by 'our' in blead where it makes sense. ( leaving 'cpan' directory
outside of the scope )
-- using vars
perl -e 'use vars qw(@ISA $AUTOLOAD $VERSION); print qx{grep RSS /proc/$$/status} '
VmRSS: 2588 kB
-- using our instead
perl -e 'our (@ISA, $AUTOLOAD, $VERSION); print qx{grep RSS /proc/$$/status} '
VmRSS: 1864 kB
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow multiple OP_CONCAT, OP_CONST ops, plus optionally an OP_SASSIGN
or OP_STRINGIFY, to be combined into a single OP_MULTICONCAT op, which can
make things a *lot* faster: 4x or more.
In more detail: it will optimise into a single OP_MULTICONCAT, most
expressions of the form
LHS RHS
where LHS is one of
(empty)
my $lexical =
$lexical =
$lexical .=
expression =
expression .=
and RHS is one of
(A . B . C . ...) where A,B,C etc are expressions and/or
string constants
"aAbBc..." where a,A,b,B etc are expressions and/or
string constants
sprintf "..%s..%s..", A,B,.. where the format is a constant string
containing only '%s' and '%%' elements,
and A,B, etc are scalar expressions (so
only a fixed, compile-time-known number of
args: no arrays or list context function
calls etc)
It doesn't optimise other forms, such as
($a . $b) . ($c. $d)
((($a .= $b) .= $c) .= $d);
(although sub-parts of those expressions might be converted to an
OP_MULTICONCAT). This is partly because it would be hard to maintain the
correct ordering of tie or overload calls.
The compiler uses heuristics to determine when to convert: in general,
expressions involving a single OP_CONCAT aren't converted, unless some
other saving can be made, for example if an OP_CONST can be eliminated, or
in the presence of 'my $x = .. ' which OP_MULTICONCAT can apply
OPpTARGET_MY to, but OP_CONST can't.
The multiconcat op is of type UNOP_AUX, with the op_aux structure directly
holding a pointer to a single constant char* string plus a list of segment
lengths. So for
"a=$a b=$b\n";
the constant string is "a= b=\n", and the segment lengths are (2,3,1).
If the constant string has different non-utf8 and utf8 representations
(such as "\x80") then both variants are pre-computed and stored in the aux
struct, along with two sets of segment lengths.
For all the above LHS types, any SASSIGN op is optimised away. For a LHS
of '$lex=', '$lex.=' or 'my $lex=', the PADSV is optimised away too.
For example where $a and $b are lexical vars, this statement:
my $c = "a=$a, b=$b\n";
formerly compiled to
const[PV "a="] s
padsv[$a:1,3] s
concat[t4] sK/2
const[PV ", b="] s
concat[t5] sKS/2
padsv[$b:1,3] s
concat[t6] sKS/2
const[PV "\n"] s
concat[t7] sKS/2
padsv[$c:2,3] sRM*/LVINTRO
sassign vKS/2
and now compiles to:
padsv[$a:1,3] s
padsv[$b:1,3] s
multiconcat("a=, b=\n",2,4,1)[$c:2,3] vK/LVINTRO,TARGMY,STRINGIFY
In terms of how much faster it is, this code:
my $a = "the quick brown fox jumps over the lazy dog";
my $b = "to be, or not to be; sorry, what was the question again?";
for my $i (1..10_000_000) {
my $c = "a=$a, b=$b\n";
}
runs 2.7 times faster, and if you throw utf8 mixtures in it gets even
better. This loop runs 4 times faster:
my $s;
my $a = "ab\x{100}cde";
my $b = "fghij";
my $c = "\x{101}klmn";
for my $i (1..10_000_000) {
$s = "\x{100}wxyz";
$s .= "foo=$a bar=$b baz=$c";
}
The main ways in which OP_MULTICONCAT gains its speed are:
* any OP_CONSTs are eliminated, and the constant bits (already in the
right encoding) are copied directly from the constant string attached to
the op's aux structure.
* It optimises away any SASSIGN op, and possibly a PADSV op on the LHS, in
all cases; OP_CONCAT only did this in very limited circumstances.
* Because it has a holistic view of the entire concatenation expression,
it can do the whole thing in one efficient go, rather than creating and
copying intermediate results. pp_multiconcat() goes to considerable
efforts to avoid inefficiencies. For example it will only SvGROW() the
target once, and to the exact size needed, no matter what mix of utf8
and non-utf8 appear on the LHS and RHS. It never allocates any
temporary SVs except possibly in the case of tie or overloading.
* It does all its own appending and utf8 handling rather than calling
out to functions like sv_catsv().
* It's very good at handling the LHS appearing on the RHS; for example in
$x = "abcd";
$x = "-$x-$x-";
It will do roughly the equivalent of the following (where targ is $x);
SvPV_force(targ);
SvGROW(targ, 11);
p = SvPVX(targ);
Move(p, p+1, 4, char);
Copy("-", p, 1, char);
Copy("-", p+5, 1, char);
Copy(p+1, p+6, 4, char);
Copy("-", p+10, 1, char);
SvCUR(targ) = 11;
p[11] = '\0';
Formerly, pp_concat would have used multiple PADTMPs or temporary SVs to
handle situations like that.
The code is quite big; both S_maybe_multiconcat() and pp_multiconcat()
(the main compile-time and runtime parts of the implementation) are over
700 lines each. It turns out that when you combine multiple ops, the
number of edge cases grows exponentially ;-)
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Use of an unqualified dump() now gives a deprecation warning. So, change
dump into CORE::dump in the tests.
|
|
|
|
|
|
|
|
|
|
| |
Switch from two-argument form. Filehandle cloning is still done with the two
argument form for backward compatibility.
Committer: Get all porting tests to pass. Increment some $VERSIONs.
Run: ./perl -Ilib regen/mk_invlists.pl; ./perl -Ilib regen/regcharclass.pl
For: RT #130122
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most ops that execute a regex, such as match and subst, are of type PMOP.
A PMOP allows the actual regex to be attached directly to that op, due
to its extra fields.
OP_SPLIT is different; it is just a plain LISTOP, but it always has an
OP_PUSHRE as its first child, which *is* a PMOP and which has the regex
attached.
At runtime, pp_pushre()'s only job is to push itself (i.e. the current
PL_op) onto the stack. Later pp_split() pops this to get access to the
regex it wants to execute.
This is a bit unpleasant, because we're pushing an OP* onto the stack,
which is supposed to be an array of SV*'s. As a bit of a hack, on
DEBUGGING builds we push a PVLV with the PL_op address embedded instead,
but this still isn't very satisfactory.
Now that regexes are first-class SVs, we could push a REGEXP onto the
stack rather than PL_op. However, there is an optimisation of @array =
split which eliminates the assign and embeds the array's GV/padix directly
in the PUSHRE op. So split still needs access to that op. But the pushre
op will always be splitop->op_first anyway, so one possibility is to just
skip executing the pushre altogether, and make pp_split just directly
access op_first instead to get the regex and @array info.
But if we're doing that, then why not just go the full hog and make
OP_SPLIT into a PMOP, and eliminate the OP_PUSHRE op entirely: with the
data that was spread across the two ops now combined into just the one
split op.
That is exactly what this commit does.
For a simple compile-time pattern like split(/foo/, $s, 1), the optree
looks like:
before:
<@> split[t2] lK
</> pushre(/"foo"/) s/RTIME
<0> padsv[$s:1,2] s
<$> const(IV 1) s
after:
</> split(/"foo"/)[t2] lK/RTIME
<0> padsv[$s:1,2] s
<$> const[IV 1] s
while for a run-time expression like split(/$pat/, $s, 1),
before:
<@> split[t3] lK
</> pushre() sK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const(IV 1)s
after:
</> split()[t3] lK/RTIME
<|> regcomp(other->8) sK
<0> padsv[$pat:2,3] s
<0> padsv[$s:1,3] s
<$> const[IV 1] s
This makes the code faster and simpler.
At the same time, two new private flags have been added for OP_SPLIT -
OPpSPLIT_ASSIGN and OPpSPLIT_LEX - which make it explicit that the
assign op has been optimised away, and if so, whether the array is
lexical.
Also, deparsing of split has been improved, to the extent that
perl TEST -deparse op/split.t
now passes.
Also, a couple of panic messages in pp_split() have been replaced with
asserts().
|
|
|
|
|
|
| |
In commit fedc1b0e2d9cec34b7e3b1fa65dd0f7eb4f539fd, I forgot that this
is dual-lifed and may be used on early Perls. This commit allows that,
but it will fail if such a Perl were to be used on an EBCDIC platform.
|
| |
|
|
|
|
|
|
|
|
|
| |
Extracted from patch submitted by Lajos Veres in RT #123693.
This commit applies those patches to files under dist/ *other than* those
pertaining to Tie-File.
Update $VERSION in Dumper.pm and Storable.pm after re-applying patches from RT
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 5bf4b3bf13bc4055684a48448b05920845ef7764.
On p5p-list, Steve Hay wrote on 2015-01-29:
"... these and other changes to Tie-File could break backwards compatibility.
The keys of %opt are passed in from user code, so we can't change the expected
key from "autodefer_threshhold" to "autodefer_threshold" without also asking
users to change their code, which is probably more hassle than it's worth."
Parts of the reverted commit will be re-committed from a new patch.
|
|
|
|
| |
Extracted from patch submitted by Lajos Veres in RT #123693.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
(or, rather, in Opcode.xs).
It was providing scalar context when invoked in void context. Test-
ing Safe->reval itself is complicated, because Opcode.xs, which is an
essential part of the fix, is not dual-life.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This op is an optimisation for any series of one or more array or hash
lookups and dereferences, where the key/index is a simple constant or
package/lexical variable. If the first-level lookup is of a simple
array/hash variable or scalar ref, then that is included in the op too.
So all of the following are replaced with a single op:
$h{foo}
$a[$i]
$a[5][$k][$i]
$r->{$k}
local $a[0][$i]
exists $a[$i]{$k}
delete $h{foo}
while these aren't:
$a[0] already handled by OP_AELEMFAST
$a[$x+1] not a simple index
and these are partially replaced:
(expr)->[0]{$k} the bit following (expr) is replaced
$h{foo}[$x+1][0] the first and third lookups are each done with
a multideref op, while the $x+1 expression and
middle lookup are done by existing add, aelem etc
ops.
Up until now, aggregate dereferencing has been very heavyweight in ops; for
example, $r->[0]{$x} is compiled as:
gv[*r] s
rv2sv sKM/DREFAV,1
rv2av[t2] sKR/1
const[IV 0] s
aelem sKM/DREFHV,2
rv2hv sKR/1
gvsv[*x] s
helem vK/2
When executing this, in addition to the actual calls to av_fetch() and
hv_fetch(), there is a lot of overhead of pushing SVs on and off the
stack, and calling lots of little pp() functions from the runops loop
(each with its potential indirect branch miss).
The multideref op avoids that by running all the code in a loop in a
switch statement. It makes use of the new UNOP_AUX type to hold an array
of
typedef union {
PADOFFSET pad_offset;
SV *sv;
IV iv;
UV uv;
} UNOP_AUX_item;
In something like $a[7][$i]{foo}, the GVs or pad offsets for @a and $i are
stored as items in the array, along with a pointer to a const SV holding
'foo', and the UV 7 is stored directly. Along with this, some UVs are used
to store a sequence of actions (several actions are squeezed into a single
UV).
Then the main body of pp_multideref is a big while loop round a switch,
which reads actions and values from the AUX array. The two big branches in
the switch are ones that are affectively unrolled (/DREFAV, rv2av, aelem)
and (/DREFHV, rv2hv, helem) triplets. The other branches are various entry
points that handle retrieving the different types of initial value; for
example 'my %h; $h{foo}' needs to get %h from the pad, while '(expr)->{foo}'
needs to pop expr off the stack.
Note that there is a slight complication with /DEREF; in the example above
of $r->[0]{$x}, the aelem op is actually
aelem sKM/DREFHV,2
which means that the aelem, after having retrieved a (possibly undef)
value from the array, is responsible for autovivifying it into a hash,
ready for the next op. Similarly, the rv2sv that retrieves $r from the
typeglob is responsible for autovivifying it into an AV. This action
of doing the next op's work for it complicates matters somewhat. Within
pp_multideref, the autovivification action is instead included as the
first step of the current action.
In terms of benchmarking with Porting/bench.pl, a simple lexical
$a[$i][$j] shows a reduction of approx 40% in numbers of instructions
executed, while $r->[0][0][0] uses 54% fewer. The speed-up for hash
accesses is relatively more modest, since the actual hash lookup (i.e.
hv_fetch()) is more expensive than an array lookup. A lexical $h{foo}
uses 10% fewer, while $r->{foo}{bar}{baz} uses 34% fewer instructions.
Overall,
bench.pl --tests='/expr::(array|hash)/' ...
gives:
PRE POST
------ ------
Ir 100.00 145.00
Dr 100.00 165.30
Dw 100.00 175.74
COND 100.00 132.02
IND 100.00 171.11
COND_m 100.00 127.65
IND_m 100.00 203.90
with cache misses unchanged at 100%.
In general, the more lookups done, the bigger the proportionate saving.
|
|
|
|
|
|
| |
the threadsv op and the PL_threadsv_names var were part of the 5.005
threads model, long since removed. Remove some remaining references to
them.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
test added.
This example hacks outside environment:
package My::Controller;
use strict;
sub jopa { return "jopa\n"; }
package main;
use Safe;
my $s = new Safe;
my $ok = $s->reval(q{
package My::Controller;
sub jopa { return "hacked\n"; }
My::Controller->jopa();
});
print My::Controller->jopa();
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
[DELTA]
2.33 Tue Apr 3 2012
- Don't eval code under 'no strict' (Father Chrysostomos)
2.32 Sat Mar 31 2012
- Make Safe play nice with Devel::Cover
|
|
|
|
|
|
| |
There has been a release of 2.32 on CPAN with changes that are not in
blead. So what bleadperl has is 2.31 plus a tiny fix that does not
affect older perl versions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With commit b50b20584, strict.pm starting putting hints in %^H to
indicate that strict mode has been enabled or disabled explicitly, so
version declarations should not change the setting.
This causes ‘Unbalanced string table refcount’ warnings when Safe.pm
encounters prohibited ops.
This happens because ops are leaking when those ops point to HEKs (in
the internal form that %^H takes when attached to ops).
This commit moves those new strict hints into $^H, to avoid those
warnings. This does simply paper over the real problem (leaked ops),
but at least it gets the warnings back down to the 5.14 amount.
Because of the new hints in $^H, B::Concise has been updated to
account for them, and so have all its tests. I modified OptreeCheck
to avoid setting the hints with ‘no strict;’, as that resulted in
slightly fewer changes to the tests. It will also result in fewer
changes to the tests in future.
Two B::Deparse tests started failing due to %^H not being localised.
Apparently there is a bug somewhere (in perl, Deparse.pm or deparse.t)
that got triggered as a result. In fact, one of the tests exhibited
*two* bugs. But for now, I’ve simply added a workaround to the two
tests so they don’t trigger those bugs (which bugs will have to wait
till after 5.16).
|
|
|
|
|
|
|
|
|
|
| |
Instead of evaluating code under ‘no strict’, we should be evaluating
it with no pragmata at all by default.
This allows ‘use 5.012’ to enable strictures in reval. It also
has the side effect of suppressing the ‘Unbalanced string table
refcount’ warnings, at least in some cases. This was brought up in
ticket #107000.
|
|
|
|
|
|
| |
For the sake of tests in the next commit, it needs runperl(), which
test.pl provides. Since this script is only run in the perl core, it
should be fine.
|
| |
|