| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 49cf1d6641a6dfd301302f616e4f25595dcc65d4, which
reverted e045dbedc7da04e20cc8cfccec8a2e3ccc62cc8b, thus reinstating the
latter commit. It turns out that the error being chased down was not
due to this commit.
Its original message was:
This reshuffles the svtype enum to remove the dummy slot created in a
previous commit, and add the new SVt_INVLIST type in its proper order.
It still is unused, but since it is an extension of SVt_PV, it must be
greater than that type's enum value. Since it can't be upgraded to any
other PV type, it comes right after SVt_PV.
Affected tables in the core are updated.
|
|
|
|
|
|
|
|
| |
This reverts commit e045dbedc7da04e20cc8cfccec8a2e3ccc62cc8b.
Blead won't compile with address sanitizer; commit
7cb47964955167736b2923b008cc8023a9b035e8 reverting an earlier commit
failed to fix that. I'm trying two more reversions to get it back
working. This is one of those
|
|
|
|
|
|
|
|
|
|
| |
This reshuffles the svtype enum to remove the dummy slot created in a
previous commit, and add the new SVt_INVLIST type in its proper order.
It still is unused, but since it is an extension of SVt_PV, it must be
greater than that type's enum value. Since it can't be upgraded to any
other PV type, it comes right after SVt_PV.
Affected tables in the core are updated.
|
|
|
|
|
|
|
|
|
| |
This produces a report on the number of OPs of a given type that were
executed at the end of a program run. This can be useful in multiple
ways. One, it can help determine hotspots for optimization (yes, I know
execution count is not equal execution time). It can also help with
determining whether a given change to perl has had the desired effect on
deterministic programs.
|
|
|
|
| |
Yes, these can hold PVs when source filters are involved.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At compile time, if split occurs on the right-hand side of an assign-
ment to a list of scalars, if the limit argument is a constant con-
taining the number 0 then it is modified in place to hold one more
than the number of scalars.
This means ‘constants’ can change their values, if they happen to be
in the wrong place at the wrong time:
$ ./perl -Ilib -le 'use constant NULL => 0; ($a,$b,$c) = split //, $foo, NULL; print NULL'
4
I considered checking the reference count on the SV, but since XS code
could create its own const ops with weak references to the same cons-
tants elsewhere, the safest way to avoid modifying someone else’s SV
is to mark the split op in ck_split so we know the SV belongs to that
split op alone.
Also, to be on the safe side, turn off the read-only flag before modi-
fying the SV, instead of relying on the special case for compile time
in sv_force_normal.
|
|
|
|
|
| |
These were only used by the study code, which was disabled in 5.16.0
and removed shortly thereafter.
|
|
|
|
|
|
| |
This avoids HvFILL() being O(n) for large n on large hashes, but also avoids
storing the value of HvFILL() in smaller hashes (ie a memory overhead on
every single object built using a hash.)
|
|
|
|
|
|
| |
This scalar type was unused. This is the first step in using the slot
freed up for another purpose. The slot is now occupied by a temporary
placeholder named SVt_DUMMY.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PERL_ASYNC_CHECK() was added to Perl_leave_scope() as part of commit
f410a2119920dd04, which moved signal dispatch from the runloop to
control flow ops, to mitigate nearly all of the speed cost of safe
signals.
The assumption was that scope exit was a safe place to dispatch signals.
However, this is not true, as parts of the regex engine call
leave_scope(), the regex engine stores some state in per-interpreter
variables, and code called within signal handlers can change these
values.
Hence remove the call to PERL_ASYNC_CHECK() from Perl_leave_scope(), and
add it explicitly in the various OPs which were relying on their call to
leave_scope() to dispatch any pending signals. Also add a
PERL_ASYNC_CHECK() to the exit of the runloop, which ensures signals
still dispatch from S_sortcv() and S_sortcv_stacked(), as well as
addressing one of the concerns in the commit message of
f410a2119920dd04:
Subtle bugs might remain - there might be constructions that enter
the runloop (where signals used to be dispatched) but don't contain
any PERL_ASYNC_CHECK() calls themselves.
Finally, move the PERL_ASYNC_CHECK(); added by that commit to pp_goto to
the end of the function, to be consistent with the positioning of all
other PERL_ASYNC_CHECK() calls - at the beginning or end of OP
functions, hence just before the return to or just after the call from
the runloop, and hence effectively at the same point as the previous
location of PERL_ASYNC_CHECK() in the runloop.
|
|
|
|
|
|
|
|
|
| |
this fix was already applied to the non-MAD branch; apply it to
the similar MAD / xml-dump code.
dump.c:2663:57: warning: comparison of constant 85 with expression of type
'svtype' is always false [-Wtautological-constant-out-of-range-compare]
else if (sv == (const SV *)0x55555555 || SvTYPE(sv) == 'U') {
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds support for PERL_PERTURB_KEYS environment variable, which in turn allows one to control
the level of randomization applied to keys() and friends.
When PERL_PERTURB_KEYS is 0 we will not randomize key order at all. The
chance that keys() changes due to an insert will be the same as in
previous perls, basically only when the bucket size is changed.
When PERL_PERTURB_KEYS is 1 we will randomize keys in a non repeatedable
way. The chance that keys() changes due to an insert will be very high.
This is the most secure and default mode.
When PERL_PERTURB_KEYS is 2 we will randomize keys in a repeatedable way.
Repititive runs of the same program should produce the same output every
time. The chance that keys changes due to an insert will be very high.
This patch also makes PERL_HASH_SEED imply a non-default
PERL_PERTURB_KEYS setting. Setting PERL_HASH_SEED=0 (exactly one 0) implies
PERL_PERTURB_KEYS=0 (hash key randomization disabled), settng PERL_HASH_SEED
to any other value, implies PERL_PERTURB_KEYS=2 (deterministic/repeatable
hash key randomization). Specifying PERL_PERTURB_KEYS explicitly to a
different level overrides this behavior.
Includes changes to allow one to compile out various aspects of the
patch. One can compile such that PERL_PERTURB_KEYS is not respected, or
can compile without hash key traversal randomization at all. Note that
support for these modes is incomplete, and currently a few tests will
fail.
Also includes a new subroutine in Hash::Util::hash_traversal_mask()
which can be used to ensure a given hash produces a predictable key
order (assuming the same hash seed is in effect). This sub acts as a
getter and a setter.
NOTE - this patch lacks tests, but I lack tuits to get them done quickly,
so I am pushing this with the hope that others can add them afterwards.
|
|
|
|
|
|
|
| |
The split patch introduced a buffer read overrun error in sv_dump() when
stringifying empty strings. This bug was always existant but was probably
never triggered because we almost always have at least one extflags set,
so it never got an empty buffer to show. Not so with the new compflags. :-(
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch resolves several issues at once. The parts are
sufficiently interconnected that it is hard to break it down
into smaller commits. The tickets open for these issues are:
RT #94490 - split and constant folding
RT #116086 - split "\x20" doesn't work as documented
It additionally corrects some issues with cached regexes that
were exposed by the split changes (and applied to them).
It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171
and cccd1425414e6518c1fc8b7bcaccfb119320c513.
Prior to this patch the special RXf_SKIPWHITE behavior of
split(" ", $thing)
was only available if Perl could resolve the first argument to
split at compile time, meaning under various arcane situations.
This manifested as oddities like
my $delim = $cond ? " " : qr/\s+/;
split $delim, $string;
and
split $cond ? " ", qr/\s+/, $string
not behaving the same as:
($cond ? split(" ", $string) : split(/\s+/, $string))
which isn't very convenient.
This patch changes this by adding a new flag to the op_pmflags,
PMf_SPLIT which enables pp_regcomp() to know whether it was called
as part of split, which allows the RXf_SPLIT to be passed into run
time regex compilation. We also preserve the original flags so
pattern caching works properly, by adding a new property to the
regexp structure, "compflags", and related macros for accessing it.
We preserve the original flags passed into the compilation process,
so we can compare when we are trying to decide if we need to
recompile.
Note that this essentially the opposite fix from the one applied
originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171.
The reverted patch was meant to make:
split( 0 || " ", $thing ) #1
consistent with
my $x=0; split( $x || " ", $thing ) #2
and not with
split( " ", $thing ) #3
This was reverted because it broke C<split("\x{20}", $thing)>, and
because one might argue that is not that #1 does the wrong thing,
but rather that the behavior of #2 that is wrong. In other words
we might expect that all three should behave the same as #3, and
that instead of "fixing" the behavior of #1 to be like #2, we should
really fix the behavior of #2 to behave like #3. (Which is what we did.)
Also, it doesn't make sense to move the special case detection logic
further from the regex engine. We really want the regex engine to decide
this stuff itself, otherwise split " ", ... wouldn't work properly with
an alternate engine. (Imagine we add a special regexp meta pattern that behaves
the same as " " does in a split /.../. For instance we might make
split /(*SPLITWHITE)/ trigger the same behavior as split " ".
The other major change as result of this patch is it effectively
reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which
was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE, which
and free up bits in the regex flags structure.
But we dont want to get rid of these vars, and it turns out that
RXf_SEEN_LOOKBEHIND is used only in the same situation as the new
RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to
RXf_NO_INPLACE_SUBST, and then instead of using two vars we use
only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to
have their bits back.
|
|
|
|
|
|
|
|
| |
* If the hash is not OOK omit any iterator status information
instead of showing -1/NULL
* If the hash is OOK then add the RAND value from the iterator
and if the LASTRAND is not the same show it too
* Tweak tests to test the above.
|
|
|
|
|
|
|
| |
This reverts commit 2b656fcc48f28912136698c28b3bd916c42d74f8.
I accidentally included the changes I was reviewing from a patch of
Reini's
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This should silence the following warning:
dump.c:459:57: warning: comparison of constant 85 with expression of
type 'svtype' is always false [-Wtautological-constant-out-of-range-compare]
The warning is a false positive, this code is /meant/ to detect
conditions that should not happen.
|
|
|
|
|
|
|
|
|
| |
Checking LVRET (which pp_aassign does, as of a few commits ago)
has no effect if OPpMAYBE_LVSUB is not set on the op. This com-
mit changes op.c:op_lvalue_flags to set this flag on aassign ops.
This makes sub:lvalue{%h=($x,$x)} behave correctly if the return
values of the sub are assigned to ($x is unmodfied).
|
|
|
|
|
| |
No uses of SvREFCNT_dec in dump.c are ever passed null SVs, so
don’t check for that.
|
|
|
|
|
|
|
|
| |
principally, proto.h was sometimes being included twice - once before
a fn decl, and once after - giving rise to a 'decl after def' warning.
Also, S_croak_memory_wrap was declared static in every source file, but
not used in some. So selectively disable the unused-function warning.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was discussed in ticket #114820.
This new copy-on-write mechanism stores a reference count for the
PV inside the PV itself, at the very end. (I was using SvEND+1
at first, but parts of the regexp engine expect to be able to do
SvCUR_set(sv,0), which causes the wrong byte of the string to be used
as the reference count.) Only 256 SVs can share the same PV this way.
Also, only strings with allocated space after the trailing null can
be used for copy-on-write.
Much of the code is shared with PERL_OLD_COPY_ON_WRITE. The restric-
tion against doing copy-on-write with magical variables has hence been
inherited, though it is not necessary. A future commit will take
care of that.
I had to modify _core_swash_init to handle $@ differently. The exist-
ing mechanism of copying $@ to a new scalar and back again was very
fragile. With copy-on-write, $@ =~ s/// can cause pp_subst’s string
pointers to become stale. So now we remove the scalar from *@ and
allow the utf8-table-loading code to autovivify a new one. Then we
restore the untouched $@ afterwards if all goes well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does the following:
*) Introduces multiple new hash functions to choose from at build
time. This includes Murmur-32, SDBM, DJB2, SipHash, SuperFast, and
One-at-a-time. Currently this is handled by muning hv.h. Configure
support hopefully to follow.
*) Changes the default hash to Murmur hash which is faster than the
old default One-at-a-time.
*) Rips out the old HvREHASH mechanism and replaces it with a
per-process random hash seed.
*) Changes the old PL_hash_seed from an interpreter value to a
global variable. This means it does not have to be copied during
interpreter setup or cloning.
*) Changes the format of the PERL_HASH_SEED variable to a hex
string so that hash seeds longer than fit in an integer are possible.
*) Changes the return of Hash::Util::hash_seed() from a number to a
string. This is to accomodate hash functions which have more bits than
can be fit in an integer.
*) Adds new functions to Hash::Util to improve introspection of hashes
-) hash_value() - returns an integer hash value for a given string.
-) bucket_info() - returns basic hash bucket utilization info
-) bucket_stats() - returns more hash bucket utilization info
-) bucket_array() - which keys are in which buckets in a hash
More details on the new hash functions can be found below:
Murmur Hash: (v3) from google, see
http://code.google.com/p/smhasher/wiki/MurmurHash3
Superfast Hash: From Paul Hsieh.
http://www.azillionmonkeys.com/qed/hash.html
DJB2: a hash function from Daniel Bernstein
http://www.cse.yorku.ca/~oz/hash.html
SDBM: a hash function sdbm.
http://www.cse.yorku.ca/~oz/hash.html
SipHash: by Jean-Philippe Aumasson and Daniel J. Bernstein.
https://www.131002.net/siphash/
They have all be converted into Perl's ugly macro format.
I have not done any rigorous testing to make sure this conversion
is correct. They seem to function as expected however.
All of them use the random hash seed.
You can force the use of a given function by defining one of
PERL_HASH_FUNC_MURMUR
PERL_HASH_FUNC_SUPERFAST
PERL_HASH_FUNC_DJB2
PERL_HASH_FUNC_SDBM
PERL_HASH_FUNC_ONE_AT_A_TIME
Setting the environment variable PERL_HASH_SEED_DEBUG to 1 will make
perl output the current seed (changed to hex) and the hash function
it has been built with.
Setting the environment variable PERL_HASH_SEED to a hex value will
cause that value to be used at the seed. Any missing bits of the seed
will be set to 0. The bits are filled in from left to right, not
the traditional right to left so setting it to FE results in a seed
value of "FE000000" not "000000FE".
Note that we do the hash seed initialization in perl_construct().
Doing it via perl_alloc() (via init_tls) causes problems under
threaded builds as the buffers used for reentrant srand48 functions
are not allocated. See also the p5p mail "Hash improvements blocker:
portable random code that doesnt depend on a functional interpreter",
Message-ID:
<CANgJU+X+wNayjsNOpKRqYHnEy_+B9UH_2irRA5O3ZmcYGAAZFQ@mail.gmail.com>
|
|
|
|
|
|
|
| |
This was introduced by 75a6ad4aa3.
C preprocessors treat the ‘aTHX_ foo’ in MACRO(aTHX_ foo) as a sin-
gle argument.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in ticket #114820, instead of using READONLY+FAKE to mark
a copy-on-write string, we should make it a separate flag.
There are many modules in CPAN (and 1 in core, Compress::Raw::Zlib)
that assume that SvREADONLY means read-only. Only one CPAN module,
POSIX::pselect will definitely be broken by this. Others may need to
be tweaked. But I believe this is for the better.
It causes all tests except ext/Devel-Peek/t/Peek.t (which needs a tiny
tweak still) to pass under PERL_OLD_COPY_ON_WRITE, which is a prereq-
uisite for any new COW scheme that creates COWs under the same cir-
cumstances.
|
|
|
|
|
| |
Combine op_flags and op_private dumper for MAD (xml) and -Dx dump.
Add some previously missing flags.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This single op can, in some circumstances, replace the sequence of a
pushmark followed by one or more padsv/padav/padhv ops, and possibly
a trailing 'list' op, but only where the targs of the pad ops form
a continuous range.
This is generally more efficient, but is particularly so in the case
of void-context my declarations, such as:
my ($a,@b);
Formerly this would be executed as the following set of ops:
pushmark pushes a new mark
padsv[$a] pushes $a, does a SAVEt_CLEARSV
padav[@b] pushes all the flattened elements (i.e. none) of @a,
does a SAVEt_CLEARSV
list pops the mark, and pops all stack elements except the last
nextstate pops the remaining stack element
It's now:
padrange[$a..@b] does two SAVEt_CLEARSV's
nextstate nothing needing doing to the stack
Note that in the case above, this commit changes user-visible behaviour in
pathological cases; in particular, it has always been possible to modify a
lexical var *before* the my is executed, using goto or closure tricks.
So in principle someone could tie an array, then could notice that FETCH
is no longer being called, e.g.
f();
my ($s, @a); # this no longer triggers two FETCHES
sub f {
tie @a, ...;
push @a, 1,2;
}
But I think we can live with that.
Note also that having a padrange operator will allow us shortly to have
a corresponding SAVEt_CLEARPADRANGE save type, that will replace multiple
individual SAVEt_CLEARSV's.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By defining NO_TAINT_SUPPORT, all the various checks that perl does for
tainting become no-ops. It's not an entirely complete change: it doesn't
attempt to remove the taint-related interpreter variables, but instead
virtually eliminates access to it.
Why, you ask? Because it appears to speed up perl's run-time
significantly by avoiding various "are we running under taint" checks
and the like.
This change is not in a state to go into blead yet. The actual way I
implemented it might raise some (valid) objections. Basically, I
replaced all uses of the global taint variables (but not PL_taint_warn!)
with an extra layer of get/set macros (TAINT_get/TAINTING_get).
Furthermore, the change is not complete:
- PL_taint_warn would likely deserve the same treatment.
- Obviously, tests fail. We have tests for -t/-T
- Right now, I added a Perl warn() on startup when -t/-T are detected
but the perl was not compiled support it. It might be argued that it
should be silently ignored! Needs some thinking.
- Code quality concerns - needs review.
- Configure support required.
- Needs thinking: How does this tie in with CPAN XS modules that use
PL_taint and friends? It's easy to backport the new macros via PPPort,
but that doesn't magically change all code out there. Might be
harmless, though, because whenever you're running under
NO_TAINT_SUPPORT, any check of PL_taint/etc is going to come up false.
Thus, the only CPAN code that SHOULD be adversely affected is code
that changes taint state.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the xpvlv and regexp structs conflict, we have to find somewhere
else to put the regexp struct.
I was going to sneak it in SvPVX, allocating a buffer large
enough to fit the regexp struct followed by the string, and have
SvPVX - sizeof(regexp) point to the struct. But that would make all
regexp flag-checking macros fatter, and those are used in hot code.
So I came up with another method. Regexp stringification is not
speed-critical. So we can move the regexp stringification out of
re->sv_u and put it in the regexp struct. Then the regexp struct
itself can be pointed to by re->sv_u. So SVt_REGEXPs will have
re->sv_any and re->sv_u pointing to the same spot. PVLVs can then
have sv->sv_any point to the xpvlv body as usual, but have sv->sv_u
point to a regexp struct. All regexp member access can go through
sv_u instead of sv_any, which will be no slower than before.
Regular expressions will no longer be SvPOK, so we give sv_2pv spec-
ial logic for regexps. We don’t need to make the regexp struct
larger, as SvLEN is currently always 0 iff mother_re is set. So we
can replace the SvLEN field with the pv.
SvFAKE is never used without SvPOK or SvSCREAM also set. So we can
use that to identify regexps.
|
|
|
|
|
|
|
|
| |
They are on longer used in core, and we need room for more flags.
The only CPAN modules that use them check whether RXf_SPLIT is set
(which no longer happens) before setting RXf_SKIPWHITE (which is
ignored).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a pattern matches, and that pattern contains captures (or $`, $&, $'
or /p are present), a copy is made of the whole original string, so
that $1 et al continue to hold the correct value even if the original
string is subsequently modified. This can have severe performance
penalties; for example, this code causes a 1Mb buffer to be allocated,
copied and freed a million times:
$&;
$x = 'x' x 1_000_000;
1 while $x =~ /(.)/g;
This commit changes this so that, where possible, only the needed
substring of the original string is copied: in the above case, only a
1-byte buffer is copied each time. Also, it now reuses or reallocs the
buffer, rather than freeing and mallocing each time.
Now that PL_sawampersand is a 3-bit flag indicating separately whether
$`, $& and $' have been seen, they each contribute only their own
individual penalty; which ones have been seen will limit the extent to
which we can avoid copying the whole buffer.
Note that the above code *without* the $& is not currently slow, but only
because the copying is artificially disabled to avoid the performance hit.
The next but one commit will remove that hack, meaning that it will still
be fast, but will now be correct in the presence of a modified original
string.
We achieve this by by adding suboffset and subcoffset fields to the
existing subbeg and sublen fields of a regex, to indicate how many bytes
and characters have been skipped from the logical start of the string till
the physical start of the buffer. To avoid copying stuff at the end, we
just reduce sublen. For example, in this:
"abcdefgh" =~ /(c)d/
subbeg points to a malloced buffer containing "c\0"; sublen == 1,
and suboffset == 2 (as does subcoffset).
while if $& has been seen,
subbeg points to a malloced buffer containing "cd\0"; sublen == 2,
and suboffset == 2.
If in addition $' has been seen, then
subbeg points to a malloced buffer containing "cdefgh\0"; sublen == 6,
and suboffset == 2.
The regex engine won't do this by default; there are two new flag bits,
REXEC_COPY_SKIP_PRE and REXEC_COPY_SKIP_POST, which in conjunction with
REXEC_COPY_STR, request that the engine skip the start or end of the
buffer (it will still copy in the presence of the relevant $`, $&, $',
/p).
Only pp_match has been enhanced to use these extra flags; substitution
can't easily benefit, since the usual action of s///g is to copy the
whole string first time round, then perform subsequent matching iterations
against the copy, without further copying. So you still need to copy most
of the buffer.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 6ea72b3a1, rv2hv and padhv have had the ability to return boo-
leans in scalar context, instead of bucket stats, if flagged the right
way. sub { %hash || ... } is optimised to take advantage of this. If
the || is in unknown context at compile time, the %hash is flagged as
being maybe a true boolean. When flagged that way, it returns a bool-
ean if block_gimme() returns G_VOID.
If rv2hv and padhv can already do this, then we don’t need the
boolkeys op any more. We can just flag the rv2hv to return a boolean.
In all the cases where boolkeys was used, we know at compile time that
it is true boolean context, so we add a new flag for that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In %hash || $foo, the %hash is in scalar context, so it has to iterate
through the buckets to produce statistics on bucket usage.
If the || is in void context, the value returned by hash is only ever
used as a boolean (as || doesn’t have to return it). We already opti-
mise it by adding a boolkeys op when it is known at compile time that
|| will be in void context.
In sub { %hash || $foo } it is not known at compile time that it will
be in void context, so it wasn’t optimised.
This commit optimises it by flagging the %hash at compile time as
being possibly in ‘true boolean’ context. When that flag is set,
the rv2hv and padhv ops call block_gimme() to see whether || is in
void context.
This speeds things up signficantly. Here is what I got after optimis-
ing rv2hv but before doing padhv:
$ time ./miniperl -e '%hash = 1..10000; sub { %hash || 1 }->() for 1..100000'
real 0m0.179s
user 0m0.101s
sys 0m0.005s
$ time ./miniperl -e 'my %hash = 1..10000; sub { %hash || 1 }->() for 1..100000'
real 0m5.446s
user 0m2.419s
sys 0m0.015s
(That example is slightly misleading because of the closure, but the
closure version takes 1 sec. when optimised.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After a while, I realised that it can be confusing for PAD_ARRAY and
PAD_MAX to take a pad argument, but for PAD_SV to take a number and
PAD_SET_CUR a padlist.
I was copying the HEK_KEY convention, which was probably a bad idea.
This is what we use elsewhere:
TypeMACRO
----=====
AvMAX
CopFILE
PmopSTASH
StashHANDLER
OpslabREFCNT_dec
Furthermore, heks are not part of the API, so what convention they use
is not so important.
So these:
PADNAMELIST_*
PADLIST_*
PADNAME_*
PAD_*
are now:
Padnamelist*
Padlist*
Padname*
Pad*
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to fix a bug, I need to add new fields to padlists. But I
cannot easily do that as long as they are AVs.
So I have created a new padlist struct.
This not only allows me to extend the padlist struct with new members
as necessary, but also saves memory, as we now have a three-pointer
struct where before we had a whole SV head (3-4 pointers) + XPVAV (5
pointers).
This will unfortunately break half of CPAN, but the pad API docs
clearly say this:
NOTE: this function is experimental and may change or be
removed without notice.
This would have broken B::Debug, but a patch sent upstream has already
been integrated into blead with commit 9d2d23d981.
|
|
|
|
|
| |
Much code relies on the fact that PADLIST is typedeffed as AV.
PADLIST should be treated as a distinct type.
|
|
|
|
|
| |
Instead of lengthening the struct, we can reuse SvCUR, which is cur-
rently unused.
|
|
|
|
|
|
|
| |
Setting a the PV on a format is meaningless, as of the previ-
ous commit.
This frees up SvCUR for other uses.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are only used for storing a string and an IV.
Making them into full-blown SVt_PVFMs is overkill.
FmLINES was only being used on these three scalars. So make it use
the SvIVX field. struct xpvfm no longer needs an xfm_lines member,
because SVt_PVFMs no longer use it.
This also causes a TODO test in taint.t to start passing, but I do
not fully understand why. But at least that’s progress. :-)
|
| |
|
|
|
|
|
|
|
| |
This was an early attempt to fix leaking of ops after syntax errors,
disabled because it was deemed to fragile. The new slab allocator
(8be227a) has solved this problem another way, so latefree(d) no
longer serves any purpose.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
these were used as part of the old "use re 'eval'" security
mechanism used by the now-eliminated PL_reginterp_cnt
|
|
|
|
|
|
|
|
|
|
|
| |
This indicates that a particular PMOP is in fact OP_QR. We should of
course be able to tell this from op_type, but the regex-compiling API
only gets passed op_flags.
This then allows us to fix a bug where we were deciding during compilation
whether to hang on to the code_blocks based on whether the PMOP was
PMf_HAS_CV rather than PMf_IS_QR; the latter implies the former, but not
the other way round.
|
| |
|