path: root/hv.c
Commit log (subject, author, date, files changed, lines -/+), most recent first:
* Revert "hv.h: rework HEK_FLAGS to a proper member in struct hek" (Tony Cook, 2016-11-03; 1 file, -1/+2)

  This reverts commit d3148f758506efd28325dfd8e1b698385133f0cd. SV keys are stored as pointers in the key_key; on platforms with alignment requirements (such as PA-RISC) this resulted in bus errors early in the build.
* speed up AV and HV clearing/undeffing (David Mitchell, 2016-10-26; 1 file, -7/+27)

  av_clear(), av_undef(), hv_clear(), hv_undef() and av_make() all have similar guards along the lines of:

      ENTER;
      SAVEFREESV(SvREFCNT_inc_simple_NN(av));
      ... do stuff ...;
      LEAVE;

  to stop the AV or HV leaking or being prematurely freed while processing its elements (e.g. FETCH() or DESTROY() might do something to it). Introducing an extra scope and calling leave_scope() is expensive. Instead, use a trick I introduced in my recent pp_assign() recoding: add the AV/HV to the temps stack, then at the end of the function, just PL_tmps_ix-- if nothing else has been pushed on the tmps stack in the meantime, or replace the tmps stack slot with &PL_sv_undef otherwise (which doesn't care how many times its ref count gets decremented). This is efficient, and doesn't artificially extend the life of the SV like sv_2mortal() would.

  This commit makes this code around 5% faster:

      my @a;
      for my $i (1..3_000_000) { @a = (1,2,3); @a = (); }

  and this code around 3% faster:

      my %h;
      for my $i (1..3_000_000) { %h = qw(a 1 b 2); %h = (); }
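The tmps-stack trick described in this commit can be sketched generically. The following is a toy model (not perl's actual code; names like `push_tmp` and `drop_guard` are invented for illustration): a guard is pushed onto a temporaries stack and its index remembered; at the end, either the stack index is simply decremented (fast path, nothing pushed since), or the slot is overwritten with a harmless sentinel.

```c
/* Toy model of the "cheap guard on the tmps stack" trick: push a guard
 * slot, remember its index, and at the end either pop it (nothing was
 * pushed in the meantime) or neutralize the slot in place. */
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16
static const char *tmps[STACK_MAX];
static ptrdiff_t tmps_ix = -1;          /* index of top of stack */
static const char sentinel[] = "undef"; /* stands in for &PL_sv_undef */

/* push an SV-like thing and return the index of its slot */
static ptrdiff_t push_tmp(const char *sv) {
    tmps[++tmps_ix] = sv;
    return tmps_ix;
}

/* end-of-function cleanup for the guard pushed at index ix */
static void drop_guard(ptrdiff_t ix) {
    if (tmps_ix == ix)
        tmps_ix--;              /* fast path: still on top, just pop */
    else
        tmps[ix] = sentinel;    /* slot gets freed later, harmlessly */
}
```

Compared with a full ENTER/SAVEFREESV/LEAVE scope, no save-stack entries are created and no leave_scope() call is needed.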
* hv.h: rework HEK_FLAGS to a proper member in struct hek (Todd Rinaldo, 2016-10-24; 1 file, -2/+1)

  Move the store of HEK_FLAGS off the end of the allocated hek_key into the hek struct, simplifying access and providing clarity to the code. What is not clear is why Nicholas or perhaps Jarkko did not do this themselves. We use similar tricks elsewhere, so perhaps it was just continuing a tradition...

  One thought is that we often do strcmp/memeq on these strings, and having their start be aligned might improve performance, whereas this patch changes them to be unaligned. If so perhaps we should just make flags a U32 and let the HEKs be larger. They are shared in PL_strtab, and are probably often sitting in malloc blocks that are sufficiently large that making them bigger would make no practical difference. (All of this is worth checking.)

  [with edits by Yves Orton]
* hv.c: use new SvPVCLEAR and constant string friendly macros (Yves Orton, 2016-10-19; 1 file, -1/+1)
* perlapi: Add entry for hv_bucket_ratio (Karl Williamson, 2016-06-30; 1 file, -1/+1)

  autodoc doesn't find things like Perl_hv_bucket_ratio().
* Change scalar(%hash) to be the same as 0+keys(%hash) (Yves Orton, 2016-06-22; 1 file, -54/+57)

  This subject has a long history; see [perl #114576] for more discussion. https://rt.perl.org/Public/Bug/Display.html?id=114576

  There are a variety of reasons we want to change the return signature of scalar(%hash). One is that it leaks implementation details about our associative array structure. Another is that it requires us to keep track of the used buckets in the hash, which we use for no other purpose than scalar(%hash). Another is that it is just odd. Almost nothing needs to know these values. Perhaps debugging, but we have several much better functions for introspecting the internals of a hash.

  By changing the return signature we can remove all the logic related to maintaining and updating xhv_fill_lazy. This should make hot code paths a little faster, and maybe save some memory for traversed hashes.

  In order to provide some form of backwards compatibility we add three new functions to the Hash::Util namespace: bucket_ratio(), num_buckets() and used_buckets(). These functions are actually implemented in universal.c, and thus always available even if Hash::Util is not loaded. This simplifies testing. At the same time Hash::Util contains backwards compatible code so that the new functions are available from it should they be needed in older perls.

  There are many tests in t/op/hash.t that are more or less obsolete after this patch, as they test that xhv_fill_lazy is correctly set in various situations. However since we have a backwards compat layer we can just switch them to use bucket_ratio(%hash) instead of scalar(%hash) and keep the tests, just in case they are actually testing something not tested elsewhere.
* [perl #128086] Fix precedence in hv_ename_delete (Hugo van der Sanden, 2016-05-15; 1 file, -1/+2)

  A stash’s array of names may have null for the first entry, in which case it is not one of the effective names, and the name count will be negative. The ‘count > 0’ check is meant to prevent hv_ename_delete from trying to read that entry, but a precedence problem introduced in 4643eb699 stopped it from doing that.

  [This commit message was written by the committer.]
* [perl #123788] update isa magic stash records when *ISA is deleted (Tony Cook, 2016-01-11; 1 file, -1/+66)
* Improve pod for [ah]v_(clear|undef) (David Mitchell, 2015-10-20; 1 file, -6/+4)

  See [perl #117341].
* Add macro for converting Latin1 to UTF-8, and use it (Karl Williamson, 2015-09-04; 1 file, -2/+2)

  This adds a macro that converts a code point in the ASCII 128-255 range to UTF-8, and changes existing code to use it when the range is known to be restricted to this one, rather than the previous macro, which accepted a wider range (any code point representable by 2 bytes) but had an extra test on EBCDIC platforms, and hence was larger than necessary and slightly slower.
* perlapi: use 'UTF-8' instead of variants of that (Karl Williamson, 2015-09-03; 1 file, -1/+1)
* Various pods: Add C<> around many typed-as-is things (Karl Williamson, 2015-09-03; 1 file, -23/+24)

  Removes 'the' in front of parameter names in some instances.
* perlapi, perlintern: Add L<> links to pod (Karl Williamson, 2015-09-03; 1 file, -7/+8)
* perlapi: Use C<> instead of I<> for parameter names, etc (Karl Williamson, 2015-08-01; 1 file, -11/+11)

  The majority of perlapi uses C<> to specify these things, but a few things used I<> instead. Standardize to C<>.
* Impossible for entry to be NULL at this point. (Jarkko Hietaniemi, 2015-06-26; 1 file, -1/+1)

  CID 104777: Logically dead code (DEADCODE)

      740  if (return_svp) {
      notnull: At condition entry, the value of entry cannot be NULL.
      dead_error_condition: The condition entry must be true.
      dead_error_line: Execution cannot reach the expression NULL inside this statement: return entry ? (void *)&ent....
      741  return entry ? (void *) &HeVAL(entry) : NULL;
* mg_find can return NULL. (Jarkko Hietaniemi, 2015-06-26; 1 file, -1/+5)

  CID 104831: Dereference null return value (NULL_RETURNS)

      43. dereference: Dereferencing a pointer that might be null Perl_mg_find(sv, 112) when calling Perl_magic_existspack. (The dereference is assumed on the basis of the 'nonnull' parameter attribute.)
      499  magic_existspack(svret, mg_find(sv, PERL_MAGIC_tiedelem));
* Stop $^H |= 0x1c020000 from enabling all features (Father Chrysostomos, 2015-03-27; 1 file, -1/+2)

  That set of bits sets the feature bundle to ‘custom’, which means that the features are set by %^H, and also indicates that %^H has been diddled with, so it’s worth looking at.

  In the specific case where %^H is untouched and there is no corresponding cop hint hash behind the scenes, Perl_feature_is_enabled (in toke.c) ends up returning TRUE. Commit v5.15.6-55-g94250ae sped up feature checking by allowing refcounted_he_fetch to return a boolean when checking for existence, instead of converting the value to a scalar, whose contents we are not even going to use. This was when the bug started happening. I did not update the code path in refcounted_he_fetch that handles the absence of a hint hash. So it was returning &PL_sv_placeholder instead of NULL; TRUE instead of FALSE.

  This did not cause problems for most code, but with the introduction of the new bitwise ops in v5.21.8-150-g8823cb8, it started causing uni::perl to fail, because they were implicitly enabled, making ^ a numeric op, when it was being used as a string op.
* Replace common Emacs file-local variables with dir-locals (Dagfinn Ilmari Mannsåker, 2015-03-22; 1 file, -6/+0)

  An empty cpan/.dir-locals.el stops Emacs using the core defaults for code imported from CPAN.

  Committer's work: To keep t/porting/cmp_version.t and t/porting/utils.t happy, $VERSION needed to be incremented in many files, including throughout dist/PathTools. perldelta entry for module updates. Add two Emacs control files to MANIFEST; re-sort MANIFEST.

  For: RT #124119.
* [perl #123847] crash with *foo::=*bar::=*with_hash (Father Chrysostomos, 2015-03-11; 1 file, -2/+5)

  When a hash has no canonical name and one effective name, the array of names has a null pointer at the beginning. hv_ename_add was not taking that into account, and was trying to dereference the null pointer.
* don't test non-null args (David Mitchell, 2015-03-11; 1 file, -23/+0)

  For lots of core functions: if a function parameter has been declared NN in embed.fnc, don't test for nullness at the start of the function, i.e. eliminate code like

      if (!foo) ...

  On debugging builds the test is redundant, as the PERL_ARGS_ASSERT_FOO at the start of the function will already have croaked. On optimised builds, it will skip the check (and so be slightly faster), but if actually passed a null arg, will now crash with a null-deref SEGV rather than doing whatever the check used to do (e.g. croak, or silently return and let the caller's code logic go awry). But hopefully this should never happen, as such instances will already have been detected on debugging builds.

  It also has the advantage of shutting up recent clangs, which spew forth lots of stuff like:

      sv.c:6308:10: warning: nonnull parameter 'bigstr' will evaluate to 'true' on first encounter [-Wpointer-bool-conversion]
          if (!bigstr)

  The only exception was in dump.c, where rather than skipping the null test, I instead changed the function def in embed.fnc to allow a null arg, on the basis that dump functions are often used for debugging (where pointers may unexpectedly become NULL) and it's better there to display that this item is null than to SEGV. See the p5p thread starting at 20150224112829.GG28599@iabyn.com.
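The pattern this commit relies on can be sketched outside perl. In the sketch below the macro name `ARGS_ASSERT_STR_LEN` is invented for illustration (perl's generated macros are named PERL_ARGS_ASSERT_FOO): the assertion fires on debugging builds and compiles to nothing on optimised ones, so the function body carries no null check of its own.

```c
/* Hedged sketch of the "assert NN args at entry" pattern: the macro
 * catches a NULL argument on debugging builds; on optimised builds a
 * NULL argument is a caller bug that will SEGV rather than being
 * silently papered over. */
#include <assert.h>
#include <string.h>
#include <stddef.h>

#ifdef DEBUGGING
#  define ARGS_ASSERT_STR_LEN(s) assert(s)
#else
#  define ARGS_ASSERT_STR_LEN(s) ((void)0)
#endif

/* s is declared non-null (the "NN" contract) */
static size_t str_len(const char *s) {
    ARGS_ASSERT_STR_LEN(s);
    /* no "if (!s) return 0;" here: the redundant runtime test is
     * exactly what the commit above removes */
    return strlen(s);
}
```

This also avoids clang's -Wpointer-bool-conversion warning, since a parameter annotated non-null is never tested for truth.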
* Consistently use NOT_REACHED; /* NOTREACHED */ (Jarkko Hietaniemi, 2015-03-04; 1 file, -1/+1)

  Both needed: the macro is for compilers, the comment for static checkers. (This doesn't address whether each spot is correct and necessary.)
* Corrections to spelling and grammatical errors. (Lajos Veres, 2015-01-28; 1 file, -1/+1)

  Extracted from patch submitted by Lajos Veres in RT #123693.
* Rework sv_get_backrefs() so it is simpler, and C++ compliant (Yves Orton, 2014-12-25; 1 file, -0/+1)

  We unroll hv_backreferences_p() in sv_get_backrefs() so the logic is simpler (we don't need a **SV for this function), and (hopefully) make it C++ compliant at the same time.
* Restructure hv_backreferences_p() so assert makes sense (Yves Orton, 2014-12-25; 1 file, -4/+4)

  Prior to this patch the assert was meaningless, as we would use the argument before we asserted things about it. This patch restructures the logic so we do the asserts first and *then* use the argument.
* faster constant hash key lookups ($hash{const}) (David Mitchell, 2014-07-08; 1 file, -18/+89)

  On something like $hash{constantstring}, at compile-time the PVX string on the SV attached to the OP_CONST is converted into a HEK (with an appropriate offset shift). At run-time on hash keying, this HEK is used to speed up the bucket search; however it turns out that this can be improved.

  Currently, the main bucket loop does:

      for (; entry; entry = HeNEXT(entry)) {
          if (HeHASH(entry) != hash)
              continue;
          if (HeKLEN(entry) != (I32)klen)
              continue;
          if (HeKEY(entry) != key && memNE(HeKEY(entry),key,klen))
              continue;
          if ((HeKFLAGS(entry) ^ masked_flags) & HVhek_UTF8)
              continue;

  The 'HeKEY(entry) != key' test is the bit that allows us to skip the memNE() when 'key' is actually part of a HEK. However, this means that in the const-HEK scenario, for a match, we do pointless hash, klen and HVhek_UTF8 tests, when HeKEY(entry) == key is sufficient for a match. Conversely, in the non-const-HEK scenario, the 'HeKEY(entry) != key' test will always fail, and so it's just dead weight in the loop.

  To work around this, this commit splits the code into two separate bucket search loops: one for const HEKs that just compares HEK pointers, and a general loop that now doesn't have to do the 'HeKEY(entry) != key' test.

  Analysing this code with cachegrind shows that with this commit, lookups of constant keys that exist (e.g. the typical perl object scenario, $self->{somefield}) take 15% less instruction reads in hv_common(), 14% less data reads and 27% less writes. A lookup with a non-existing constant key ($hash{not_exist}) is about the same as before (0.7% improvement). Non-constant existing lookup ($hash{$existing_key}) is about 5% less instructions, while $hash{$non_existing_key} is about 0.7%.
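The two-loop split can be illustrated with a toy bucket chain. The struct shape below is an invented stand-in, not perl's real HE/HEK layout: one loop compares only pointers (valid when the key is a shared, interned HEK, so pointer identity implies equality), and the general loop compares hash, length, and bytes but omits the pointer test.

```c
/* Toy sketch of the split bucket-search loops described above. */
#include <assert.h>
#include <string.h>
#include <stdint.h>
#include <stddef.h>

struct he {                 /* invented stand-in for perl's HE/HEK */
    struct he  *next;
    const char *key;        /* plays the role of HeKEY */
    size_t      klen;
    uint32_t    hash;
};

/* const-HEK path: key is interned, so pointer identity suffices */
static struct he *find_by_ptr(struct he *chain, const char *key) {
    for (; chain; chain = chain->next)
        if (chain->key == key)
            return chain;
    return NULL;
}

/* general path: no pointer test; compare hash, then length, then bytes */
static struct he *find_by_bytes(struct he *chain, const char *key,
                                size_t klen, uint32_t hash) {
    for (; chain; chain = chain->next) {
        if (chain->hash != hash) continue;
        if (chain->klen != klen) continue;
        if (memcmp(chain->key, key, klen) != 0) continue;
        return chain;
    }
    return NULL;
}
```

The fast path does one comparison per entry; the general path no longer drags along a pointer test that can never succeed for non-interned keys.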
* Remove or downgrade unnecessary dVAR. (Jarkko Hietaniemi, 2014-06-25; 1 file, -16/+4)

  You need to configure with g++ *and* -Accflags=-DPERL_GLOBAL_STRUCT or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference. (g++ does not do the "post-annotation" form of "unused".) The version code has some of these issues, reported upstream.
* Unused contexts found under PERL_GLOBAL_STRUCT. (Jarkko Hietaniemi, 2014-06-24; 1 file, -0/+1)
* PERL_UNUSED_CONTEXT -> remove interp context where possible (Daniel Dragan, 2014-06-24; 1 file, -1/+0)

  Removing context params will save machine code in the callers of these functions, and 1 ptr of stack space. Some of these funcs are heavily used, such as mg_find*. The contexts can always be re-added in the future the same way they were removed. This patch was inspired by commit dc3bf40570. Also remove PERL_UNUSED_CONTEXT when it's not needed. See removal-candidate rejection rationale in [perl #122106].

  - Perl_hv_backreferences_p uses context in S_hv_auxinit; commit 96a5add60f was wrong
  - Perl_whichsig_sv and Perl_whichsig_pv wrongly used PERL_UNUSED_CONTEXT from inception in commit 84c7b88cca
  - in the author's opinion cast_* shouldn't be public API; no CPAN grep usage, and it can't be static and/or inline optimized since it is exported
  - Perl_my_unexec: move to the block where it is needed, make the Win32 block context-free for inlining likelihood; private API and only 2 callers in core
  - Perl_my_dirfd: make all blocks context-free, then change the proto
  - Perl_bytes_cmp_utf8 wrongly used PERL_UNUSED_CONTEXT from inception in commit fed3ba5d6b
* Silence several -Wunused-parameter warnings about my_perl (Brian Fraser, 2014-06-13; 1 file, -0/+2)

  This meant sprinkling some PERL_UNUSED_CONTEXT invocations, as well as stopping some functions from getting my_perl in the first place; all of the functions in the latter category are internal (S_ prefix and s or i in embed.fnc), so this should be both safe and economical.
* Adding missing HEKfARG() invocations (Brian Fraser, 2014-06-13; 1 file, -3/+3)

  This silences a chunk of warnings under -Wformat.
* perlapi: Include general information (Karl Williamson, 2014-06-05; 1 file, -1/+0)

  Unlike other pod handling routines, autodoc requires the line following an =head1 to be non-empty for its text to be included in the paragraph started by the heading. If you fail to do this, the text will silently be omitted from perlapi. I went through the source code, and where it was apparent that the text was supposed to be in perlapi, deleted the empty line so it would be, with some revisions to make more sense. I added =cuts where I thought it best for the text not to be included.
* Cannot rotl u32 (hek_hash) by 64 bits. (Jarkko Hietaniemi, 2014-05-28; 1 file, -1/+1)

  Fix for Coverity perl5 CID 28935: Operands don't affect result (CONSTANT_EXPRESSION_RESULT)

      result_independent_of_operands: (unsigned long)entry->hent_hek->hek_hash >> 47 /* 64 - 17 */ is 0 regardless of the values of its operands. This occurs as the bitwise second operand of '|'.
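The underlying issue generalizes: shifting or rotating a 32-bit value by an amount computed for a 64-bit width shifts every bit out (and a shift count >= the type width is undefined behaviour in C). A width-correct 32-bit rotate, as a hedged sketch (the helper name is invented, not perl's):

```c
/* Sketch of a width-correct 32-bit rotate-left. Masking the count with
 * 31 keeps the shift amounts in range, and the n == 0 case avoids the
 * undefined x >> 32. */
#include <assert.h>
#include <stdint.h>

static uint32_t rotl32(uint32_t x, unsigned n) {
    n &= 31;                                 /* count modulo the width */
    return n ? (uint32_t)((x << n) | (x >> (32 - n))) : x;
}
```

With a rotate count derived from the correct width, the Coverity "result independent of operands" finding cannot occur.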
* Preallocate HvAUX() structures for large bucket arrays (Yves Orton, 2014-03-18; 1 file, -18/+43)

  The assumption is that the time/space tradeoff of not allocating the HvAUX() structure goes away for a large bucket array, where the size of the allocated buffer is much larger than the nonallocated HvAUX() "extension". This should make keys() and each() on larger hashes faster, but still preserve the essence of the original space conservation, where the assumption is a lot of small hash-based objects which will never be traversed.
* Split out part of hv_auxinit() so it can be reused (Yves Orton, 2014-03-18; 1 file, -12/+18)

  Changes nothing except that it introduces hv_auxinit_internal(), which does part of the job of hv_auxinit(), so that we can call it from somewhere else in the next commit.
* don't repeatedly call HvUSEDKEYS (Daniel Dragan, 2014-03-10; 1 file, -2/+4)

  HvUSEDKEYS contains a function call nowadays. Don't call it repeatedly.
* make core safe against HvAUX() realloc (David Mitchell, 2014-03-07; 1 file, -6/+13)

  Since the HvAUX structure is just tacked onto the end of the HvARRAY() struct, code like this can do bad things:

      aux = HvAUX();
      ... something that might split hv ...
      aux->foo = ...; /* SEGV! */

  So I've visually audited core for places where HvAUX() is saved and then re-used, and re-initialised the var if it looks like HvARRAY() could have changed in the meantime. I've been very conservative about what might be unsafe. For example, destructors or __WARN__ handlers could call perl code that modifies the hash.
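The hazard is the generic "stale pointer after realloc" pattern, sketched below with an invented buffer struct (not perl's HV): any call that may grow the underlying allocation invalidates pointers into it, so the pointer must be re-fetched afterwards, which is what the audit above enforces for HvAUX().

```c
/* Generic illustration of re-fetching a pointer after a call that may
 * realloc the underlying buffer (error checking elided for brevity). */
#include <assert.h>
#include <stdlib.h>

struct buf { size_t len; char *data; };

static void may_grow(struct buf *b, size_t need) {
    if (need > b->len) {
        b->data = realloc(b->data, need);   /* block may move */
        b->len  = need;
    }
}

static void safe_use(struct buf *b) {
    may_grow(b, 16);
    char *p = b->data;       /* take a pointer into the buffer... */
    may_grow(b, 1 << 20);    /* ...any call that can grow it... */
    p = b->data;             /* ...means re-fetching the pointer */
    p[0] = 'x';              /* safe: p is current again */
}
```

Writing through the old `p` after the second `may_grow` would be the `aux->foo = ...; /* SEGV! */` case from the commit message.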
* add aux_flags field to HVs with aux struct (David Mitchell, 2014-02-28; 1 file, -0/+1)

  Add an extra U32 general flags field to the xpvhv_aux struct (which is used on HVs such as stashes, that need extra fields).

  On 64-bit systems, this doesn't consume any extra space since there's already an odd number of I32/U32 fields. On 32-bit systems it will consume an extra 4 bytes. But of course only on those hashes that have the aux struct.

  As well as providing extra flags in the AUX case, it will also allow us to free up at least one general flag bit for HVs - see next commit.
* Do not dereference hv before ensuring it's not NULL (Rafael Garcia-Suarez, 2014-02-19; 1 file, -1/+2)

  This should fix RT #116441 and possibly other bugs.
* Use NOT_REACHED in one spot in hv.c (Father Chrysostomos, 2014-01-13; 1 file, -1/+1)

  This reduces the size of hv.o by 32 bytes under clang.
* perlapi: Consistent spaces after dots (Father Chrysostomos, 2013-12-29; 1 file, -6/+10)

  Plus some typo fixes. I probably changed some things in perlintern, too.
* When deleting via hek, pass the computed hash value (Father Chrysostomos, 2013-10-28; 1 file, -13/+6)

  In those cases where the hash key comes from a hek, we already have a computed hash value, so pass that to hv_common. The easiest way to accomplish this is to add a new macro.
* hv.c: Stop being ASCII-centric (Karl Williamson, 2013-08-29; 1 file, -12/+22)

  This uses macros which work cross-platform. This has the added advantage that what is going on is much clearer.
* Move super cache into mro meta (Father Chrysostomos, 2013-08-20; 1 file, -2/+1)

  Iterated hashes shouldn’t have to allocate space for something specific to stashes, so move the SUPER method cache from the HvAUX struct (which all iterated hashes have) into the mro meta struct (which only stashes have).
* Don’t treat COWs specially in locked hashes (Father Chrysostomos, 2013-08-11; 1 file, -3/+2)

  This is left over from when READONLY+FAKE meant copy-on-write. Read-only copy-on-write scalars (which could not occur with the old way of flagging things) must not be exempt from hash key restrictions.
* [perl #72766] Allow huge pos() settings (Father Chrysostomos, 2013-07-23; 1 file, -1/+1)

  This is part of #116907, too. It also fixes #72924 as a side effect; the next commit will explain.

  The value of pos($foo) was being stored as an I32, not allowing values above I32_MAX. Change it to SSize_t (the signed equivalent of size_t, representing the maximum string length the OS/compiler supports). This is accomplished by changing the size of the entry in the magic struct, which is the simplest fix. Other parts of the code base can benefit from this, too.

  We actually cast the pos value to STRLEN (size_t) when reading it, to allow *very* long strings. Only the value -1 is special, meaning there is no pos. So the maximum supported offset is 2**sizeof(size_t)-2.

  The regexp engine itself still cannot handle large strings, so being able to set pos to large values is useless right now. This is but one piece in a larger puzzle.

  Changing the size of mg->mg_len also requires that Perl_hv_placeholders_p change its type. This function should in fact not be in the API, since it exists solely to implement the HvPLACEHOLDERS macro. See <https://rt.perl.org/rt3/Ticket/Display.html?id=116907#txn-1237043>.
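The type change can be sketched in plain C. In this hedged sketch `ptrdiff_t` stands in for perl's SSize_t (both are signed size-width types; the helper names are invented): a 32-bit signed slot truncates offsets above I32_MAX, while a size-width signed slot keeps the full range with -1 still free to mean "no pos".

```c
/* Sketch: storing a string offset in a 32-bit signed int vs. a signed
 * size-width type. ptrdiff_t plays the role of SSize_t here. */
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

static int32_t   pos_as_i32(size_t off)   { return (int32_t)off;   }
static ptrdiff_t pos_as_ssize(size_t off) { return (ptrdiff_t)off; }
```

On a 64-bit build, offsets just past I32_MAX survive the round trip through the wider type but not through the 32-bit one; reading the value back as STRLEN (size_t), with only -1 reserved, gives the 2**N-2 maximum described above.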
* hv.c: Clarify uvar comment (Father Chrysostomos, 2013-06-06; 1 file, -1/+2)

  It was not clear that it referred to uvar magic.
* Cache HvFILL() for larger hashes, and update on insertion/deletion. (Nicholas Clark, 2013-05-29; 1 file, -18/+64)

  This avoids HvFILL() being O(n) for large n on large hashes, but also avoids storing the value of HvFILL() in smaller hashes (i.e. a memory overhead on every single object built using a hash).
* Perl_hv_fill() can return early if the hash only has 0 or 1 keys. (Nicholas Clark, 2013-05-27; 1 file, -0/+5)

  No keys implies no chains used, so the return value is 0. One key unambiguously means 1 chain used, and all the others are free. Two or more keys might share the same chain, or might not, so the calculation can't be short-circuited.
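The short-circuit reasoning above can be modelled with a toy fill counter (invented shape, not perl's Perl_hv_fill): 0 keys means 0 used chains, 1 key means exactly 1, and only 2 or more keys force an actual scan of the bucket array, since two keys may or may not share a chain.

```c
/* Toy model of the 0/1-key short-circuit: count how many buckets have
 * a non-empty chain, returning early when the key count decides it. */
#include <assert.h>
#include <stddef.h>

static size_t fill(const int *bucket_counts, size_t nbuckets,
                   size_t nkeys) {
    size_t used = 0, i;
    if (nkeys <= 1)
        return nkeys;       /* 0 keys -> 0 chains; 1 key -> 1 chain */
    for (i = 0; i < nbuckets; i++)
        if (bucket_counts[i] > 0)
            used++;         /* chain in use, however many keys share it */
    return used;
}
```

With 2 keys the answer genuinely depends on placement: both in one bucket gives fill 1, in separate buckets gives fill 2, so no early return is possible.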
* silence warnings under NO_TAINT_SUPPORT (David Mitchell, 2013-05-09; 1 file, -1/+4)

  There are lots of places where local vars aren't used when compiled with NO_TAINT_SUPPORT.
* Make it possible to disable and control hash key traversal randomization (Yves Orton, 2013-05-07; 1 file, -21/+96)

  Adds support for the PERL_PERTURB_KEYS environment variable, which in turn allows one to control the level of randomization applied to keys() and friends.

  When PERL_PERTURB_KEYS is 0 we will not randomize key order at all. The chance that keys() changes due to an insert will be the same as in previous perls, basically only when the bucket size is changed.

  When PERL_PERTURB_KEYS is 1 we will randomize keys in a non-repeatable way. The chance that keys() changes due to an insert will be very high. This is the most secure mode, and the default.

  When PERL_PERTURB_KEYS is 2 we will randomize keys in a repeatable way. Repetitive runs of the same program should produce the same output every time. The chance that keys changes due to an insert will be very high.

  This patch also makes PERL_HASH_SEED imply a non-default PERL_PERTURB_KEYS setting. Setting PERL_HASH_SEED=0 (exactly one 0) implies PERL_PERTURB_KEYS=0 (hash key randomization disabled); setting PERL_HASH_SEED to any other value implies PERL_PERTURB_KEYS=2 (deterministic/repeatable hash key randomization). Specifying PERL_PERTURB_KEYS explicitly to a different level overrides this behavior.

  Includes changes to allow one to compile out various aspects of the patch. One can compile such that PERL_PERTURB_KEYS is not respected, or compile without hash key traversal randomization at all. Note that support for these modes is incomplete, and currently a few tests will fail.

  Also includes a new subroutine in Hash::Util, hash_traversal_mask(), which can be used to ensure a given hash produces a predictable key order (assuming the same hash seed is in effect). This sub acts as a getter and a setter.

  NOTE - this patch lacks tests, but I lack tuits to get them done quickly, so I am pushing this with the hope that others can add them afterwards.