| Commit message |
|
This information is already in embed.fnc, and we know it compiles. Some
of this information is now out-of-date. Get rid of it.
There was one bit of information that was (apparently) wrong in
embed.fnc. The apidoc line asked that there be no usage example
generated for newXS. I added that flag to the embed.fnc entry.
|
ASAN -fsanitize=undefined was tripping on the second of these two lines:
    svp = AvARRAY(isa);
    end = svp + AvFILLp(isa) + 1;
In the case where svp is NULL and AvFILLp(isa) is -1, the first addition
is undefined behaviour. Add the 1 first, so that it becomes
svp + (-1+1), which is safe.
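A minimal C sketch of the reordering (the variable names here are
stand-ins for the AV internals above):

    /* ptr may be NULL and fill may be -1 */
    SV **end_bad  = ptr + fill + 1;   /* evaluates (NULL + -1) first: UB  */
    SV **end_good = ptr + (fill + 1); /* offset is 0, never goes negative */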
|
Where the length is known, we can use these functions, which relieve
the programmer and the program reader from having to count characters.
The memFOO functions should also be slightly faster than the strFOO
equivalents.
In some instances in this commit, hard-coded numbers are used. These
come from the 'case' statement values that apply to them.
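The idea, illustrated with the plain C library (the names below are
stand-ins for the actual memFOO/strFOO macros, and handle_length() is
hypothetical):

    /* strFOO style: rescans the string for its NUL terminator */
    if (strcmp(name, "length") == 0)
        handle_length();

    /* memFOO style: the known len rejects most mismatches on the
     * length test alone, and nobody hand-counts the literal */
    if (len == 6 && memcmp(name, "length", 6) == 0)
        handle_length();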
|
The original names are confusing.
See thread beginning with
http://nntp.perl.org/group/perl.perl5.porters/244335
The two macros are mapped onto just that one, with the result complemented
for the few cases where strNEs was used.
|
For the DYNAMIC_ENV_FETCH case, we don't know the number of keys
until the first iteration triggers a call to prime_env_iter(), so
piggyback on the tied magic case, which already handles extending
the stack for each iteration rather than all at once beforehand.
|
The newish function hv_pushkv() currently just pushes all key/value pairs on
the stack, i.e. it does the equivalent of the perl code '() = %h'.
Extend it so that it can handle 'keys %h' and 'values %h' too.
This is basically moving the remaining list-context functionality out of
do_kv() and into hv_pushkv().
The rationale for this is that hv_pushkv() is a pure HV-related function,
while do_kv() is a pp function for several ops including OP_KEYS/VALUES,
and expects PL_op->op_flags/op_private to be valid.
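For reference, a rough public-API equivalent of the three behaviours (a
sketch only; hv_pushkv() works at a lower level, and the WANT_* flag
names here are invented):

    /* inside a pp function, after the usual dSP bookkeeping */
    HE *entry;
    hv_iterinit(hv);
    while ((entry = hv_iternext(hv))) {
        if (want & WANT_KEYS)               /* 'keys %h'   */
            XPUSHs(hv_iterkeysv(entry));
        if (want & WANT_VALUES)             /* 'values %h' */
            XPUSHs(hv_iterval(hv, entry));
        /* both bits set gives the '() = %h' flattening */
    }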
|
Do our own mortal stack extending and handling.
|
...and make pp_padhv(), pp_rv2hv() use it rather than using Perl_do_kv().
Both pp_padhv() and pp_rv2hv() (via S_padhv_rv2hv_common()) outsource the
list-context pushing/flattening of a hash onto the stack to Perl_do_kv().
Perl_do_kv() is a big function that handles all the actions of
keys, values etc. Instead, create a new function which does just the
pushing of a hash onto the stack.
At the same time, split it out into two loops, one for tied, one for
normal: the untied one can skip extending the stack on each iteration,
and use the cheaper HeVAL() instead of calling hv_iterval().
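A sketch of the two-loop shape (abridged; the real code also deals with
placeholders and the keys/values variants):

    HE *entry;
    hv_iterinit(hv);
    if (!SvTIED_mg((const SV *)hv, PERL_MAGIC_tied)) {
        /* untied: one up-front extend, direct value access */
        EXTEND(SP, HvUSEDKEYS(hv) * 2);
        while ((entry = hv_iternext(hv))) {
            PUSHs(hv_iterkeysv(entry));
            PUSHs(HeVAL(entry));            /* no hv_iterval() call */
        }
    }
    else {
        /* tied: size unknown in advance; extend per iteration and
         * let hv_iterval() run FETCH magic */
        while ((entry = hv_iternext(hv))) {
            EXTEND(SP, 2);
            PUSHs(hv_iterkeysv(entry));
            PUSHs(hv_iterval(hv, entry));
        }
    }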
|
Where it's obvious that the args can't be null, use SvTRUE_NN() instead.
Avoid possible multiple evaluations of the arg by assigning to a local var
first if necessary.
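For example (illustrative only):

    /* SvTRUE() can expand its argument more than once, so don't hand
     * it an expression with side effects: take a copy first */
    SV * const sv = POPs;
    if (SvTRUE_NN(sv))      /* sv is known non-NULL here */
        ... do stuff ...;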
|
In places that do things like mPUSHi(0) or newSViv(0), replace them
with PUSHs(&PL_sv_zero) and &PL_sv_zero, etc.
This avoids the cost of creating and/or mortalising an SV, and/or setting
its value to 0.
This commit causes a subtle change to tainting in various places as a
side-effect. For example, grep in scalar context returns 0 if it has no
args. Formerly the zero value could in theory get tainted:
    @a = ();
    $x = ( ($^X . ""), grep { 1 } @a);
It used to be the case that $x would be tainted; now it's not.
In practice this doesn't matter - the zero value was only getting tainted
as a side-effect of tainting's "if anything in the statement uses a
tainted value, taint everything" mechanism, which gives (documented) false
positives. This commit merely removes some such false positives, and
makes the behaviour similar to functions which return &PL_sv_undef/no/yes,
which are also immune to side-effect tainting.
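The transformation itself is mechanical, e.g.:

    /* before: creates and mortalises an SV just to hold zero */
    mPUSHi(0);

    /* after: pushes the shared read-only zero SV */
    PUSHs(&PL_sv_zero);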
|
hfreeentries() reads very poorly; hv_free_entries() makes more sense too.
|
hv.c: In function ‘Perl_hv_undef_flags’:
hv.c:2053:35: warning: ‘orig_ix’ may be used uninitialized in this function [-Wmaybe-uninitialized]
     PL_tmps_stack[orig_ix] = &PL_sv_undef;
The warning is bogus, as we only use orig_ix if "save" is true,
and if "save" is true we will have initialized orig_ix. However,
initializing it in the first place avoids any issue.
|
are too long
We currently require hash keys to be less than 2**31 bytes long. But (a)
nothing actually tries to enforce that, and (b) if a Perl program tries to
create a hash with such a key (using a 64-bit system), we miscalculate the
size of a memory block, yielding a panic:
    $ ./perl -e '+{ "x" x 2**31, undef }'
    panic: malloc, size=18446744071562068026 at -e line 1.
Instead, check for this situation, and croak with an appropriate (new)
diagnostic in the unlikely event that it occurs.
This also involves changing the type of an argument to a public API function:
Perl_share_hek() previously took the key's length as an I32, but that makes
it impossible to detect over-long keys, so it must be SSize_t instead.
From Yves:
We also inject the length test into the PERL_HASH() macro, so that where
the macro is used *before* calling into any of the hv functions we can
avoid hashing a very long string only to throw an exception that it is
too long. Might as well fail fast.
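A hedged sketch of the fail-fast shape (the real check lives in the
PERL_HASH() macro and the hv functions; keylen is a hypothetical name,
and the diagnostic is along these lines):

    if (keylen > (STRLEN)I32_MAX)
        Perl_croak_nocontext("Sorry, hash keys must be smaller"
                             " than 2**31 bytes");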
|
This reverts commit e4343ef32499562ce956ba3cb9cf4454d5d2ff7f,
which was a revert of 05f97de032fe95cabe8c9f6d6c0a5897b1616194.
Prior to this patch we resized hashes when, after inserting a key,
the load factor of the hash reached 1 (load factor = keys / buckets).
This patch makes two subtle changes to this logic:
1. We split only after inserting a key into a utilized bucket,
2. and only when the maximum load factor exceeds 0.667
The intent and effect of this change is to increase our hash tables'
efficiency. Reducing the maximum load factor to 0.667 means that we should
have far fewer keys in collision overall, at the cost of some unutilized
space (2/3rds was chosen as it is easier to calculate than 0.7). On the
other hand, only splitting after a collision means in theory that we execute
the "final split" less often. Additionally, inserting a key into an unused
bucket increases the efficiency of the hash without changing the worst
case.[1] In other words, without increasing collisions we use the space
in our hashes more efficiently.
A side effect of this change is that the size of a hash is more sensitive
to key insert order. A set of keys with some collisions might be one
size if those collisions were encountered early, or another if they were
encountered later. Assuming random distribution of hash values, about
50% of hashes should be smaller than they would be without this rule.
The two changes complement each other: changing the maximum load
factor decreases the chance of a collision, but changing to only split
after a collision means that we won't waste as much of that space as we
might.
[1] Since I personally didn't find this obvious at first, here is my
explanation:
The old behavior was that we doubled the number of buckets when the
number of keys in the hash matched that of buckets. So on inserting
the Kth key into a K-bucket hash, we would double the number of
buckets. Thus the worst case prior to this patch was a hash
containing K-1 keys which all hash into a single bucket, and the
post-split worst case behavior would be having K items in a single bucket
of a hash with 2*K buckets total.
The new behavior says that we double the size of the hash once inserting
an item into an occupied bucket makes us exceed the maximum load factor
(leave aside the change in maximum load factor in this patch).
If we insert into an occupied bucket (including the worst-case bucket) then
we trigger a key split, and we have exactly the same cases as before.
If we insert into an empty bucket then we now have a worst case of K-1 items
in one bucket, and 1 item in another, in a hash with K buckets, thus the
worst case has not changed.
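Read literally, the new rule looks like this (an illustrative sketch;
the core uses an equivalent integer formulation, not this exact code):

    /* split only when the new key landed in an already-occupied
     * bucket AND keys now exceed 2/3 of the bucket count */
    if (collision && keys > buckets * 2 / 3)
        hsplit(hv);    /* the internal bucket-doubling step */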
|
This reverts commit 05f97de032fe95cabe8c9f6d6c0a5897b1616194.
Accidentally pushed while waiting for blead-unfreeze.
|
Prior to this patch we resized hashes when, after inserting a key,
the load factor of the hash reached 1 (load factor = keys / buckets).
This patch makes two subtle changes to this logic:
1. We split only after inserting a key into a utilized bucket,
2. and only when the maximum load factor exceeds 0.667
The intent and effect of this change is to increase our hash tables'
efficiency. Reducing the maximum load factor to 0.667 means that we should
have far fewer keys in collision overall, at the cost of some unutilized
space (2/3rds was chosen as it is easier to calculate than 0.7). On the
other hand, only splitting after a collision means in theory that we execute
the "final split" less often. Additionally, inserting a key into an unused
bucket increases the efficiency of the hash without changing the worst
case.[1] In other words, without increasing collisions we use the space
in our hashes more efficiently.
A side effect of this change is that the size of a hash is more sensitive
to key insert order. A set of keys with some collisions might be one
size if those collisions were encountered early, or another if they were
encountered later. Assuming random distribution of hash values, about
50% of hashes should be smaller than they would be without this rule.
The two changes complement each other: changing the maximum load
factor decreases the chance of a collision, but changing to only split
after a collision means that we won't waste as much of that space as we
might.
[1] Since I personally didn't find this obvious at first, here is my
explanation:
The old behavior was that we doubled the number of buckets when the
number of keys in the hash matched that of buckets. So on inserting
the Kth key into a K-bucket hash, we would double the number of
buckets. Thus the worst case prior to this patch was a hash
containing K-1 keys which all hash into a single bucket, and the
post-split worst case behavior would be having K items in a single bucket
of a hash with 2*K buckets total.
The new behavior says that we double the size of the hash once inserting
an item into an occupied bucket makes us exceed the maximum load factor
(leave aside the change in maximum load factor in this patch).
If we insert into an occupied bucket (including the worst-case bucket) then
we trigger a key split, and we have exactly the same cases as before.
If we insert into an empty bucket then we now have a worst case of K-1 items
in one bucket, and 1 item in another, in a hash with K buckets, thus the
worst case has not changed.
|
Incidentally, it currently works on SV *s as well, because there's an
explicit cast after an SvANY. Let's not rely on that. This commit also
removes a pointless const in a cast. Again: it takes an HV * as argument;
let's only change that if we have a strong reason to.
|
Except under cpan/ and dist/
|
For: RT #130195
|
C++11 requires space between the end of a string literal and a macro, so
that a feature can unambiguously be added to the language. Starting in
g++ 6.2, the compiler emits a warning when there isn't a space
(presumably so that future versions can support C++11). Unfortunately
there are many such instances in the perl core. This commit fixes
those, including those in ext/, but individual commits will be used for
the other modules, those in dist/ and cpan/.
This commit also inserts space at the end of a macro before a string
literal, even though that is not deprecated, and removes useless ""
literals following a macro (instead of inserting a blank). The result
is easier to read, making the macro stand out, and be clearer as to the
intention.
Code and modules included with the Perl core need to be compilable using
C++. This is so that perl can be embedded in C++ programs. (Actually,
only the hdr files need to be so compilable, but it would be hard to
test that just the hdrs are compilable.) So we need to accommodate
changes to the C++ language.
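The ambiguity in miniature (illustrative):

    #define WIDTH "10"
    printf("%" WIDTH "d\n", i);   /* fine: three separate tokens */
    printf("%"WIDTH"d\n", i);     /* C++11 may parse "%"WIDTH as a string
                                     literal with a user-defined suffix */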
|
This reverts commit d3148f758506efd28325dfd8e1b698385133f0cd.
SV keys are stored as pointers in the hek_key; on platforms with
alignment requirements (such as PA-RISC) this resulted in bus errors
early in the build.
|
av_clear(), av_undef(), hv_clear(), hv_undef() and av_make()
all have similar guards along the lines of:
    ENTER;
    SAVEFREESV(SvREFCNT_inc_simple_NN(av));
    ... do stuff ...;
    LEAVE;
to stop the AV or HV leaking or being prematurely freed while processing
its elements (e.g. FETCH() or DESTROY() might do something to it).
Introducing an extra scope and calling leave_scope() is expensive.
Instead, use a trick I introduced in my recent pp_assign() recoding:
add the AV/HV to the temps stack, then at the end of the function,
just PL_tmps_ix-- if nothing else has been pushed on the tmps stack in the
meantime, or replace the tmps stack slot with &PL_sv_undef otherwise
(which doesn't care how many times its ref count gets decremented).
This is efficient, and doesn't artificially extend the life of the SV
like sv_2mortal() would.
This commit makes this code around 5% faster:
    my @a;
    for my $i (1..3_000_000) {
        @a = (1,2,3);
        @a = ();
    }
and this code around 3% faster:
    my %h;
    for my $i (1..3_000_000) {
        %h = qw(a 1 b 2);
        %h = ();
    }
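The shape of the trick, as a hedged sketch:

    SSize_t orig_ix;
    /* park an extra reference on the temps stack, remembering where */
    EXTEND_MORTAL(1);
    PL_tmps_stack[++PL_tmps_ix] = SvREFCNT_inc_simple_NN(MUTABLE_SV(hv));
    orig_ix = PL_tmps_ix;

    /* ... do stuff; FETCH()/DESTROY() may push more temps ... */

    if (PL_tmps_ix == orig_ix)
        PL_tmps_ix--;                           /* nothing else pushed: pop */
    else
        PL_tmps_stack[orig_ix] = &PL_sv_undef;  /* slot freed harmlessly later */
    SvREFCNT_dec_NN(MUTABLE_SV(hv));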
|
Move the store of HEK_FLAGS off the end of the allocated
hek_key into the hek struct, simplifying access and providing
clarity to the code.
What is not clear is why Nicholas or perhaps Jarkko did
not do this themselves. We use similar tricks elsewhere,
so perhaps it was just continuing a tradition...
One thought is that we often have to do strcmp/memeq on these
strings, and having their start be aligned might improve
performance, whereas this patch changes them to be unaligned.
If so, perhaps we should just make flags a U32 and let the
HEKs be larger. They are shared in PL_strtab, and are
probably often sitting in malloc blocks that are large
enough that making them bigger would make no practical
difference. (All of this is worth checking.)
[with edits by Yves Orton]
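Conceptually (a simplified sketch, not the exact hv.h layout):

    /* before: the flags byte hides past the key's trailing NUL */
    struct hek_before {
        U32  hek_hash;
        I32  hek_len;
        char hek_key[1];   /* key bytes, NUL, then one flags byte */
    };

    /* after: flags become a proper member */
    struct hek_after {
        U32  hek_hash;
        I32  hek_len;
        U8   hek_flags;
        char hek_key[1];
    };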
|
autodoc doesn't find things like Perl_hv_bucket_ratio().
|
This subject has a long history; see [perl #114576] for more discussion:
https://rt.perl.org/Public/Bug/Display.html?id=114576
There are a variety of reasons we want to change the return signature of
scalar(%hash). One is that it leaks implementation details about our
associative array structure. Another is that it requires us to keep track
of the used buckets in the hash, which we use for no other purpose but
for scalar(%hash). Another is that it is just odd. Almost nothing needs to
know these values. Perhaps debugging, but we have several much better
functions for introspecting the internals of a hash.
By changing the return signature we can remove all the logic related
to maintaining and updating xhv_fill_lazy. This should make hot code
paths a little faster, and maybe save some memory for traversed hashes.
In order to provide some form of backwards compatibility we add three
new functions to the Hash::Util namespace: bucket_ratio(), num_buckets()
and used_buckets(). These functions are actually implemented in
universal.c, and thus always available even if Hash::Util is not loaded.
This simplifies testing. At the same time Hash::Util contains backwards-
compatible code so that the new functions are available from it should
they be needed in older perls.
There are many tests in t/op/hash.t that are more or less obsolete after
this patch, as they test that xhv_fill_lazy is correctly set in various
situations. However, since we have a backwards-compat layer we can just
switch them to use bucket_ratio(%hash) instead of scalar(%hash) and keep
the tests, just in case they are actually testing something not tested
elsewhere.
|
A stash’s array of names may have null for the first entry, in which
case it is not one of the effective names, and the name count will
be negative.
The ‘count > 0’ is meant to prevent hv_ename_delete from trying to
read that entry, but a precedence problem introduced in 4643eb699
stopped it from doing that.
[This commit message was written by the committer.]
|
See [perl #117341].
|
This adds a macro that converts a code point in the ASCII 128-255 range
to UTF-8, and changes existing code to use it when the range is known to
be restricted to this one, rather than the previous macro which accepted
a wider range (any code point representable by 2 bytes), but had an
extra test on EBCDIC platforms, hence was larger than necessary and
slightly slower.
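For code points in that range the two UTF-8 bytes are fixed functions of
the value, e.g. on an ASCII platform (illustrative; not the macro's
actual spelling):

    /* cp in 0x80..0xFF always encodes as exactly two bytes */
    U8 b0 = (U8)(0xC0 | (cp >> 6));     /* 0xC2 or 0xC3 */
    U8 b1 = (U8)(0x80 | (cp & 0x3F));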
|
Removes 'the' in front of parameter names in some instances.
|
The majority of perlapi uses C<> to specify these things, but a few
things used I<> instead. Standardize to C<>.
|
    740     if (return_svp) {
notnull: At condition entry, the value of entry cannot be NULL.
dead_error_condition: The condition entry must be true.
CID 104777: Logically dead code (DEADCODE)
dead_error_line: Execution cannot reach the expression NULL inside this
statement: return entry ? (void *)&ent....
    741         return entry ? (void *) &HeVAL(entry) : NULL;
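Since entry is provably non-NULL at that point, the fix is presumably to
drop the dead branch (a sketch):

    if (return_svp) {
        return (void *) &HeVAL(entry);   /* entry cannot be NULL here */
    }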
|
CID 104831: Dereference null return value (NULL_RETURNS)
43. dereference: Dereferencing a pointer that might be null Perl_mg_find(sv, 112) when calling Perl_magic_existspack. (The dereference is assumed on the basis of the 'nonnull' parameter attribute.)
499 magic_existspack(svret, mg_find(sv, PERL_MAGIC_tiedelem));
|
That set of bits sets the feature bundle to ‘custom’, which means that
the features are set by %^H, and also indicates that %^H has been
diddled with, so it’s worth looking at.
In the specific case where %^H is untouched and there is no
corresponding cop hint hash behind the scenes, Perl_feature_is_enabled
(in toke.c) ends up returning TRUE.
Commit v5.15.6-55-g94250ae sped up feature checking by allowing
refcounted_he_fetch to return a boolean when checking for existence,
instead of converting the value to a scalar, whose contents we are not
even going to use.
This was when the bug started happening. I did not update the code
path in refcounted_he_fetch that handles the absence of a hint hash.
So it was returning &PL_sv_placeholder instead of NULL; TRUE instead
of FALSE.
This did not cause problems for most code, but with the introduction
of the new bitwise ops in v5.21.8-150-g8823cb8, it started causing
uni::perl to fail, because they were implicitly enabled, making ^ a
numeric op, when it was being used as a string op.
|
An empty cpan/.dir-locals.el stops Emacs using the core defaults for
code imported from CPAN.
Committer's work:
To keep t/porting/cmp_version.t and t/porting/utils.t happy, $VERSION needed
to be incremented in many files, including throughout dist/PathTools.
perldelta entry for module updates.
Add two Emacs control files to MANIFEST; re-sort MANIFEST.
For: RT #124119.
|
When a hash has no canonical name and one effective name, the array of
names has a null pointer at the beginning. hv_ename_add was not taking
that into account, and was trying to dereference the null pointer.
|
For lots of core functions:
if a function parameter has been declared NN in embed.fnc, don't test for
nullness at the start of the function, i.e. eliminate code like
    if (!foo) ...
On debugging builds the test is redundant, as the PERL_ARGS_ASSERT_FOO
at the start of the function will already have croaked.
On optimised builds, it will skip the check (and so be slightly faster),
but if actually passed a null arg, will now crash with a null-deref SEGV
rather than doing whatever the check used to do (e.g. croak, or silently
return and let the caller's code logic go awry). But hopefully this
should never happen, as such instances will already have been detected on
debugging builds.
It also has the advantage of shutting up recent clangs, which spew forth
lots of stuff like:
    sv.c:6308:10: warning: nonnull parameter 'bigstr' will evaluate to
    'true' on first encounter [-Wpointer-bool-conversion]
        if (!bigstr)
The only exception was in dump.c, where rather than skipping the null
test, I instead changed the function def in embed.fnc to allow a null arg,
on the basis that dump functions are often used for debugging (where
pointers may unexpectedly become NULL) and it's better there to display
that this item is null than to SEGV.
See the p5p thread starting at 20150224112829.GG28599@iabyn.com.
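The typical shape of the change (Perl_do_thing and its assert are
hypothetical stand-ins):

    void
    Perl_do_thing(pTHX_ SV *sv)     /* sv declared NN in embed.fnc */
    {
        PERL_ARGS_ASSERT_DO_THING;  /* croaks on NULL under DEBUGGING */

        if (!sv)   /* removed: redundant, and clang warns that a nonnull
                      parameter always evaluates to true */
            return;
        ... do stuff ...;
    }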
|
Both needed: the macro is for compilers, the comment for static checkers.
(This doesn't address whether each spot is correct and necessary.)
|
Extracted from patch submitted by Lajos Veres in RT #123693.
|
We unroll hv_backreferences_p() in sv_get_backrefs() so the logic is simpler
(we don't need an SV ** for this function), and (hopefully) make it
C++-compliant at the same time.
|
Prior to this patch the assert was meaningless as we would
use the argument before we asserted things about it.
This patch restructures the logic so we do the asserts first
and *then* use the argument.
|