path: root/intrpvar.h
Commit message | Author | Age | Files | Lines
* bump version to 5.21.1 | Ricardo Signes | 2014-05-27 | 1 | -2/+2
|
* bump version to 5.21.0 | Ricardo Signes | 2014-05-26 | 1 | -1/+1
|
* bump version to 5.20.0, install 5.20 perldelta | Ricardo Signes | 2014-05-12 | 1 | -1/+1
|
* Bump version for 5.19.12 (not that it's expected to exist...) | Steve Hay | 2014-04-20 | 1 | -1/+1
|
* Fix comments and pod that mention 5.20 erroneously | Karl Williamson | 2014-04-01 | 1 | -1/+1
| In certain places in the documentation, "5.20" is no longer applicable. Also, a message referred to in perldiag got reworded, but our checks did not catch that perldiag should have been updated.
* Bump to Perl version 5.19.11 | Aaron Crane | 2014-03-20 | 1 | -1/+1
|
* pp_tms should use a local struct tms, instead of PL_timesbuf. | Nicholas Clark | 2014-03-01 | 1 | -0/+1
| PL_timesbuf is effectively a vestige of Perl 1, and doesn't actually need to be an interpreter variable. It will be removed early in v5.21.x, but it's a good idea to refactor the code not to use it before then.
| A local struct tms will be on the C stack, which will be in the CPU's L1 cache, whereas the relevant part of the interpreter struct may well not be in the CPU cache at all. Therefore this change might reduce cache pressure fractionally. A local variable access should also be simpler machine code on most CPU architectures.
* move PL_defgv nearer the top of intrpvar.h | David Mitchell | 2014-02-27 | 1 | -1/+1
| on the grounds that it's a reasonably hot variable.
* bump to version 5.19.10 and fix the version number reference in op.c | Tony Cook | 2014-02-20 | 1 | -1/+1
|
* Work properly under UTF-8 LC_CTYPE locales | Karl Williamson | 2014-01-27 | 1 | -0/+1
| This large (sorry, I couldn't figure out how to meaningfully split it up) commit causes Perl to fully support LC_CTYPE operations (case changing, character classification) in UTF-8 locales.
| As a side effect it resolves [perl #56820].
| The basics are easy, but there were a lot of details, and one troublesome edge case discussed below.
| What essentially happens is that when the locale is changed to a UTF-8 one, a global variable is set TRUE (FALSE when changed to a non-UTF-8 locale). Within the scope of 'use locale', this variable is checked, and if TRUE, the code that Perl uses for non-locale behavior is used instead of the code for locale behavior. Since Perl's internal representation is UTF-8, we get UTF-8 behavior for a UTF-8 locale.
| More work had to be done for regular expressions. There are three cases.
| 1) The character classes \w, [[:punct:]] needed no extra work, as the changes fall out from the base work.
| 2) Strings that are to be matched case-insensitively. These form EXACTFL regops (nodes). Notice that if such a string contains only above-Latin1 characters that match only themselves, the node can be downgraded to an EXACT-only node, which presents better optimization possibilities, as we now have a fixed string known at compile time to be required to be in the target string to match. Similarly, if all characters in the string match only other above-Latin1 characters case-insensitively, the node can be downgraded to a regular EXACTFU node (match, folding, using Unicode, not locale, rules). The code changes for this could be done without accepting UTF-8 locales fully, but there were edge cases which needed to be handled differently if I stopped there, so I continued on.
| In an EXACTFL node, all such characters are now folded at compile time (just as before this commit), while the other characters whose folds are locale-dependent are left unfolded. This means that they have to be folded at execution time based on the locale in effect at the moment. Again, this isn't a change from before. The difference is that now some of the folds that need to be done at execution time (in regexec) are potentially multi-char. Some of the code in regexec was trivial to extend to account for this because of existing infrastructure, but the part dealing with regex quantifiers had to have more work.
| Also the code that joins EXACTish nodes together had to be expanded to account for the possibility of multi-character folds within locale handling. This was fairly easy, because it already has infrastructure to handle these under somewhat different circumstances.
| 3) In bracketed character classes, represented by ANYOF nodes, a new inversion list was created giving the characters that should be matched by this node when the runtime locale is UTF-8. The list is ignored except under that circumstance. To do this, I created a new ANYOF type which has an extra SV for the inversion list.
| The edge case that caused the most difficulty is folding involving the MICRO SIGN, U+00B5. It folds to the GREEK SMALL LETTER MU, as does the GREEK CAPITAL LETTER MU. The MICRO SIGN is the only 0-255 range character that folds to outside that range. The issue is that it doesn't naturally fall out that it will match the CAP MU. If we let the CAP MU fold to the small mu at compile time (which it can because both are above-Latin1 and so the fold is the same no matter what locale is in effect), it could appear that the regnode can be downgraded away from EXACTFL to EXACTFU, but doing so would cause the MICRO SIGN to not case-insensitively match the CAP MU. This could be special cased in regcomp and regexec, but I wanted to avoid that.
| Instead the mktables tables are set up to include the CAP MU as a character whose presence forbids the downgrading, so the special casing is in mktables, and not in the C code.
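The MICRO SIGN edge case described in this commit can be reproduced with any Unicode-aware case folding. The following is a minimal sketch in Python, not Perl's regex engine: it assumes that Python's built-in `str.casefold` (Unicode full case folding) is an acceptable stand-in for the fold tables that mktables generates, and the helper names `fold` and `folds_outside_latin1` are hypothetical.

```python
MICRO_SIGN = "\u00b5"   # MICRO SIGN, inside the 0-255 (Latin-1) range
CAPITAL_MU = "\u039c"   # GREEK CAPITAL LETTER MU, above Latin-1
SMALL_MU = "\u03bc"     # GREEK SMALL LETTER MU, above Latin-1


def fold(ch):
    """Case-fold one character using Python's Unicode full case folding."""
    return ch.casefold()


def folds_outside_latin1(ch):
    """True if any character of the fold of ch lies above U+00FF."""
    return any(ord(c) > 0xFF for c in fold(ch))


# Every code point in 0-255 whose fold leaves the 0-255 range.
LATIN1_OUTLIERS = [chr(cp) for cp in range(0x100)
                   if folds_outside_latin1(chr(cp))]

# Both characters share one fold, so they must match each other under /i.
SAME_FOLD = fold(MICRO_SIGN) == fold(CAPITAL_MU) == SMALL_MU
```

Because the Latin-1 MICRO SIGN and the above-Latin1 CAPITAL MU meet at the same fold, a node containing CAPITAL MU cannot be treated as if only above-Latin1 characters were involved, which is the reason the commit gives for forbidding the downgrade.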
* bump version to 5.19.9! | Ricardo Signes | 2014-01-20 | 1 | -1/+1
|
* Remove PL_L1Posix_ptrs | Karl Williamson | 2014-01-09 | 1 | -1/+0
| This global array is no longer used, having been removed in previous commits in this series. Since it is a global, consideration needs to be given to possible uses of it outside the core. It has never been externally documented, and is an opaque structure whose internals have changed with every release. The functions used to access it are almost all static to regcomp.c; those few that aren't have been hidden from all but the few .c files that need to have access to them, via #if's.
* Consistent spaces after dots in perlintern.pod | Father Chrysostomos | 2013-12-29 | 1 | -1/+1
|
* Bump version number from 5.19.7 to 5.19.8. | Abigail | 2013-12-20 | 1 | -1/+1
|
* Bump the perl version in various places for v5.19.7 | Chris 'BinGOs' Williams | 2013-11-20 | 1 | -1/+1
|
* Bump version for Perl 5.19.6 | Steve Hay | 2013-10-20 | 1 | -1/+1
|
* Remove PL_ASCII; use existing array slots for it | Karl Williamson | 2013-09-24 | 1 | -1/+0
| PL_ASCII contains an inversion list to match the ASCII-range code points. It is unusable outside the core regular expression code because all the functions that manipulate inversion lists are defined only within a few core files. Therefore no outside code should be depending on it.
| It turns out that there are arrays of similar inversion lists, and these all have slots which should have this inversion list in them. This commit fills them, instead of using PL_ASCII.
* Add inversion list for U+80 - U+FF | Karl Williamson | 2013-09-24 | 1 | -0/+1
| This is the upper half of the Latin1 range. This simplifies some code very slightly, but will be of use in future commits.
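For readers unfamiliar with the data structure, an inversion list stores a set of code points as a sorted list of boundaries at which membership flips. The sketch below is a generic Python illustration of that idea, not Perl's internal representation; the names `ASCII`, `UPPER_LATIN1`, and `contains` are hypothetical.

```python
from bisect import bisect_right

# Each list holds the code points where membership toggles:
# in-set from list[0] up to (not including) list[1], out again until list[2], ...
ASCII = [0x00, 0x80]            # matches U+0000 through U+007F
UPPER_LATIN1 = [0x80, 0x100]    # matches U+0080 through U+00FF


def contains(inv_list, code_point):
    """Return True if code_point is in the set described by inv_list."""
    # Count the boundaries at or below code_point; an odd count means
    # the last toggle switched membership on.
    return bisect_right(inv_list, code_point) % 2 == 1
```

A lookup is a binary search over the boundaries, so the cost depends on the number of ranges rather than the number of code points in the set.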
* Removed OP_IN_REGISTER and related defines. | Brian Fraser | 2013-09-21 | 1 | -4/+0
| Added as an experiment in 462e5cf6, it never quite worked, and recently wasn't even using registers.
* Bump version for 5.19.5 | Steve Hay | 2013-09-20 | 1 | -1/+1
|
* [perl #115928] a consistent (public) rand() implementation | Tony Cook | 2013-09-13 | 1 | -0/+2
| Based on Yves's random branch work.
| This version makes the new random number generator visible to external modules, for example, List::Util's XS shuffle() implementation.
| I've also added a 64-bit implementation when HAS_QUAD is true; this should be significantly faster, even on 32-bit CPUs. This is intended to produce exactly the same sequence as the original implementation.
| The original version of this commit retained the "freebsd" name from Yves's original work for the function and data structure names. I've removed "freebsd" from most function names so the name isn't an issue if we choose to replace the implementation.
* Use SSize_t for tmps stack offsets | Father Chrysostomos | 2013-08-25 | 1 | -3/+3
| This is a partial fix for #119161.
| On 64-bit platforms, I32 is too small to hold offsets into a stack that can grow larger than I32_MAX. What happens is the offsets can wrap so we end up referencing and modifying elements with negative indices, corrupting memory, and causing crashes.
| With this commit, ()=1..1000000000000 stops crashing immediately. Instead, it gobbles up all your memory first, and then, if your computer still survives, crashes. The second crash happens because of a similar bug with the argument stack, which the next commit will take care of.
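The wrap-around described above is easy to demonstrate. This is a small Python sketch using `ctypes` fixed-width integers as stand-ins for Perl's I32 and SSize_t types; the helper names `as_i32` and `as_ssize_t` are hypothetical, and the second helper only behaves as described on platforms where `ssize_t` is 64 bits wide.

```python
import ctypes

I32_MAX = 2**31 - 1


def as_i32(offset):
    """Store offset in a signed 32-bit integer, as an I32 would."""
    # ctypes performs no overflow checking, so out-of-range values wrap.
    return ctypes.c_int32(offset).value


def as_ssize_t(offset):
    """Store offset in the platform's ssize_t (64-bit on LP64 systems)."""
    return ctypes.c_ssize_t(offset).value


# One element past I32_MAX becomes a large negative index in an I32.
wrapped = as_i32(I32_MAX + 1)
```

With a pointer-sized signed type the same offset is preserved, which is why the commit switches the tmps stack offsets to SSize_t.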
* Bump version for 5.19.4 | Steve Hay | 2013-08-20 | 1 | -1/+1
|
* Make PL_hints an alias for PL_compiling.cop_hints | Father Chrysostomos | 2013-08-11 | 1 | -2/+0
| PL_hints stores the hints at compile time that get copied into the cop_hints field of each COP (in newSTATEOP).
| Since perl-5.8.0-8053-gd5ec298, COPs have stored all the hints.
| Before that, COPs used to store only some of the hints. The hints were copied here and there into PL_compiling, a static COP-shaped buffer used during compilation, so that things like constant folding would see the correct hints. a0ed51b3 back in 1998 did that.
| Now that COPs can store all the hints, we can just use PL_compiling.cop_hints to avoid having to copy them from PL_hints from time to time. This simplifies the code and avoids creating bugs like those that a547fd219 and 1c75beb82 fixed.
* Revert "[perl #117855] Store CopFILEGV in a pad under ithreads" | Father Chrysostomos | 2013-08-09 | 1 | -3/+0
| This reverts commit c82ecf346.
| It turned out to be faulty, because a location shared between threads (the cop) was holding a reference count on a pad entry in a particular thread. So when you free the cop, how do you know where to do SvREFCNT_dec?
| In reverting c82ecf346, this commit still preserves the bug fix from 1311cfc0a7b, but shifts it around.
* [perl #117855] Store CopFILEGV in a pad under ithreads | Father Chrysostomos | 2013-08-05 | 1 | -0/+3
| This saves having to allocate a separate string buffer for every cop (control op; every statement has one).
| Under non-threaded builds, every cop has a pointer to the GV for that source file, namely *{"_<filename"}.
| Under threaded builds, the name of the GV used to be stored instead.
| Now we store an offset into the per-interpreter PL_filegvpad, which points to the GV.
| This makes no significant speed difference, but it reduces memory usage.
* bump version to v5.19.3 | Aristotle Pagaltzis | 2013-07-22 | 1 | -1/+1
|
* -DPERL_TRACE_OPS to produce reports on executed OP counts | Steffen Mueller | 2013-07-02 | 1 | -0/+9
| At the end of a program run, this produces a report on the number of OPs of each type that were executed. This can be useful in multiple ways. One, it can help determine hotspots for optimization (yes, I know execution count does not equal execution time). It can also help with determining whether a given change to perl has had the desired effect on deterministic programs.
* SV_CONST(name) and PL_sv_consts | Ruslan Zakirov | 2013-06-30 | 1 | -0/+2
| SV_CONST(XXX) returns an SV* that contains the string "XXX". SVs are built on demand and stored in the interpreter's structure for re-use. All SVs have a precomputed hash value.
| The SVs are created on demand because we don't want 35 SVs created during compile time or cloned during thread creation.
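The build-on-demand scheme described for SV_CONST can be pictured as a small per-interpreter cache. The Python sketch below only illustrates the pattern; `ConstCache`, `get`, and `built` are hypothetical names, and a `(string, hash)` tuple stands in for an SV with a precomputed hash value.

```python
class ConstCache:
    """Per-interpreter cache of constant strings, filled lazily."""

    def __init__(self):
        self._consts = {}   # name -> (string, precomputed hash)
        self.built = 0      # how many entries were actually constructed

    def get(self, name):
        """Return the cached entry for name, building it on first use."""
        entry = self._consts.get(name)
        if entry is None:
            # Built only when first requested, so unused constants cost nothing.
            entry = (name, hash(name))
            self._consts[name] = entry
            self.built += 1
        return entry
```

A fresh cache starts empty, which mirrors the stated goal: nothing is created at compile time, and a newly cloned interpreter has nothing to copy until a constant is first requested.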
* bump version to v5.19.2 | David Golden | 2013-06-20 | 1 | -1/+1
|
* better comment the remaining PL_ regex vars | David Mitchell | 2013-06-02 | 1 | -3/+4
|
* eliminate PL_regdummy | David Mitchell | 2013-06-02 | 1 | -2/+0
| This global (per-interpreter) var is just used during regex compilation as a placeholder to point RExC_emit at during the first (non-emitting) pass, to indicate that nothing should be emitted. There's no need for it to be a global var: just add it as an extra field in the RExC_state_t struct instead.
* eliminate PL_reg_state | David Mitchell | 2013-06-02 | 1 | -2/+0
| This is a struct that holds all the global state of the current regex match.
| The previous set of commits have gradually removed all the fields of this struct (by making things local rather than global state). Since the struct is now empty, the PL_reg_state var can be removed, along with the SAVEt_RE_STATE save type which was used to save and restore those fields on recursive re-entry to the regex engine.
* make PL_reg_curpm global | David Mitchell | 2013-06-02 | 1 | -0/+3
| Currently PL_reg_curpm is actually #defined to a field within PL_reg_state; promote it into a fully autonomous perl-interpreter variable.
| PL_reg_curpm points to a fake PMOP that's used to temporarily point PL_curpm to, that we can hang the current regex off, so that this works:
|     "a" =~ /^(.)(?{ print $1 })/ # prints 'a'
| It turns out that it doesn't need to be saved and restored when we recursively enter the regex engine; that is already handled by saving and restoring which regex is currently attached to PL_reg_curpm. So we just need a single global (per interpreter) placeholder. Since we're shortly going to get rid of PL_reg_state, we need to move it out of that struct.
* bump version to 5.19.1 | Ricardo Signes | 2013-05-20 | 1 | -1/+1
|
* bump version to 5.19.0 | Ricardo Signes | 2013-05-18 | 1 | -1/+1
|
* Make it possible to disable and control hash key traversal randomization | Yves Orton | 2013-05-07 | 1 | -0/+5
| Adds support for the PERL_PERTURB_KEYS environment variable, which in turn allows one to control the level of randomization applied to keys() and friends.
| When PERL_PERTURB_KEYS is 0 we will not randomize key order at all. The chance that keys() changes due to an insert will be the same as in previous perls, basically only when the bucket size is changed.
| When PERL_PERTURB_KEYS is 1 we will randomize keys in a non-repeatable way. The chance that keys() changes due to an insert will be very high. This is the most secure and default mode.
| When PERL_PERTURB_KEYS is 2 we will randomize keys in a repeatable way. Repeated runs of the same program should produce the same output every time. The chance that keys() changes due to an insert will be very high.
| This patch also makes PERL_HASH_SEED imply a non-default PERL_PERTURB_KEYS setting. Setting PERL_HASH_SEED=0 (exactly one 0) implies PERL_PERTURB_KEYS=0 (hash key randomization disabled); setting PERL_HASH_SEED to any other value implies PERL_PERTURB_KEYS=2 (deterministic/repeatable hash key randomization). Specifying PERL_PERTURB_KEYS explicitly to a different level overrides this behavior.
| Includes changes to allow one to compile out various aspects of the patch. One can compile such that PERL_PERTURB_KEYS is not respected, or can compile without hash key traversal randomization at all. Note that support for these modes is incomplete, and currently a few tests will fail.
| Also includes a new subroutine, Hash::Util::hash_traversal_mask(), which can be used to ensure a given hash produces a predictable key order (assuming the same hash seed is in effect). This sub acts as a getter and a setter.
| NOTE - this patch lacks tests, but I lack tuits to get them done quickly, so I am pushing this with the hope that others can add them afterwards.
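The interaction between the two environment variables can be summarised as a small decision function. This Python sketch encodes only the rules stated in the commit message above; the function name `resolve_perturb_keys` is hypothetical, and it assumes the variables hold the plain numeric values the message mentions, not any other spellings the real parser may accept.

```python
def resolve_perturb_keys(env):
    """Return the effective PERL_PERTURB_KEYS level (0, 1 or 2) for env."""
    explicit = env.get("PERL_PERTURB_KEYS")
    if explicit is not None:
        # An explicit setting overrides whatever PERL_HASH_SEED implies.
        return int(explicit)

    seed = env.get("PERL_HASH_SEED")
    if seed is None:
        return 1    # default: random, non-repeatable key order
    if seed == "0":
        return 0    # exactly one "0": randomization disabled
    return 2        # any other seed: deterministic, repeatable order
```

For example, `resolve_perturb_keys({"PERL_HASH_SEED": "0"})` gives level 0, while adding `"PERL_PERTURB_KEYS": "1"` to the same mapping restores level 1.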
* Re-order intrpvar.h to minimise holes in the interpreter struct. | Nicholas Clark | 2013-03-20 | 1 | -20/+23
| Commit 19bc2726ec6be805 created 32 bytes of holes (on LP64 systems).
* Harden hashes against hash seed discovery by randomizing hash iteration | Yves Orton | 2013-03-19 | 1 | -0/+1
| Adds:
| S_ptr_hash() - A new static function in hv.c which can be used to hash a pointer or integer.
| PL_hash_rand_bits - A new interpreter variable used as a cheap provider of "semi-random" state for use by the hash infrastructure.
| xpvhv_aux.xhv_rand - Used as a mask which is xored against the xpvhv_aux.riter during iteration to randomize the order the actual buckets are visited.
| PL_hash_rand_bits is initialized at interpreter start from the random hash seed, and then modified by "mixing in" the result of ptr_hash() on the bucket array pointer in the hv (HvARRAY(hv)) every time hv_auxinit() allocates a new iterator structure.
| The net result is that every hash has its own iteration order, which should make it much more difficult to determine what the current hash seed is.
| This required some tests to be restructured, as they tested for something that was not necessarily true: we never guaranteed that two hashes with the same keys would produce the same key order; we merely promised that using keys(), values(), or each() on the same hash, without any insertions in between, would produce the same order of visiting the key/values.
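The effect of xoring a per-hash mask against the iterator index can be sketched in a few lines. This is a Python illustration under the assumption that the bucket count is a power of two, which the commit message does not state; `visit_order` and its parameters are hypothetical names, not the code in hv.c.

```python
def visit_order(bucket_count, rand_mask):
    """Return the order in which buckets are visited for a given mask."""
    # Assumption: bucket_count is a power of two, so xor with a mask
    # smaller than bucket_count permutes the indices 0..bucket_count-1.
    assert bucket_count > 0 and bucket_count & (bucket_count - 1) == 0
    mask = rand_mask & (bucket_count - 1)
    return [riter ^ mask for riter in range(bucket_count)]
```

A mask of 0 leaves the natural order untouched, while different masks give different permutations of the same buckets. Every bucket is still visited exactly once, and repeating the iteration with the same mask repeats the same order.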
* reorder intrpvar.h | David Mitchell | 2013-03-09 | 1 | -109/+117
| Move more of the more commonly-used PL_ variables towards the front of the file (and thus to the top of the interpreter struct on MULTIPLICITY builds).
| This helps ensure that "hot" variables are clustered together on the same small number of cache lines, and also that the machine code to load them will have shorter offsets, which on some architectures may be achieved with shorter instructions.
| The "hotness" has been determined purely by my subjective judgement rather than any profiling. It's still open for the latter to be done.
| (Only simple shunting of whole lines has been done; no changes have been made to individual lines.)
* Prepare PL_sv_objcount removal | Steffen Mueller | 2013-03-06 | 1 | -1/+3
| This used to keep track of all objects. At least by now, that is for no particularly good reason: it merely avoided a bit of work during global destruction if no objects remained. Let's do less work at run-time instead.
| The interpreter global will remain for one deprecation cycle.
* Use native-size integers for some global counters | Steffen Mueller | 2013-02-27 | 1 | -3/+3
| It may be unlikely that a Perl program will hit 2 billion SVs, but by the time that 5.18 is ancient history, it's looking a lot more likely. This makes two global counters use native-size ints. I'm preserving signedness just for hysterical raisins: It might be deliberate.
* Rename PL_interp_size_5_16_0 to PL_interp_size_5_18_0. | Nicholas Clark | 2013-02-19 | 1 | -2/+2
|
* Re-order intrpvar.h to minimise holes in the interpreter struct. | Nicholas Clark | 2013-02-19 | 1 | -4/+6
| Holes were created by commit f59909ab8dad6ceb (April 2012) which removed PL_reginterp_cnt, commit 7dc8663964c66a69 (Nov 2012) which removed PL_rehash_seed_set, and commit 8936b48a49448f4e (Dec 2012) which removed PL_glob_index.
| There is still an unavoidable U16 sized hole on the default threaded configuration on x86_64. (U8 if PERL_SAWAMPERSAND is defined).
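The "holes" these re-ordering commits refer to are alignment padding between struct members. A rough way to see the effect is to lay out two toy structs with `ctypes`, which follows the platform's native C alignment rules. The struct and field names below are hypothetical and unrelated to the real interpreter struct.

```python
import ctypes


class Holey(ctypes.Structure):
    # A small field before a wide one forces padding after flag_a, and
    # the trailing small field forces tail padding as well.
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("counter", ctypes.c_uint64),
        ("flag_b", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same members, widest first: the two small fields share one padded tail.
    _fields_ = [
        ("counter", ctypes.c_uint64),
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
    ]


holey_size = ctypes.sizeof(Holey)
reordered_size = ctypes.sizeof(Reordered)
```

On a typical LP64 system the first layout is expected to occupy 24 bytes and the second 16, although the exact figures depend on the platform's alignment rules. Removing a variable from the middle of a struct can leave the same kind of gap, which is why these commits shuffle the remaining members.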
* regex: Add pseudo-Posix class: 'cased' | Karl Williamson | 2012-12-31 | 1 | -2/+0
| /[[:upper:]]/i and /[[:lower:]]/i should match the Unicode property \p{Cased}. This commit introduces a pseudo-Posix class, internally named 'cased', to represent this. This class isn't specifiable by the user, except through using either /[[:upper:]]/i or /[[:lower:]]/i. Debug output will say ':cased:'.
| The regex parsing of either :lower: or :upper: will change them into :cased:, where already existing logic can handle this, just like any other class.
| This commit fixes the regression introduced in 3018b823898645e44b8c37c70ac5c6302b031381, and also the fact that these have never worked under 'use locale'.
| The next commit will un-TODO the tests for these things.
* handy.h: Add full complement of isIDCONT() macros | Karl Williamson | 2012-12-23 | 1 | -0/+1
| This also changes isIDCONT_utf8() to use the Perl definition, which excludes any \W characters (the Unicode definition includes a few of these). Tests are also added. These macros remain undocumented for now.
* Use an array for some inversion lists | Karl Williamson | 2012-12-22 | 1 | -6/+2
| Previous commits have placed some inversion list pointers into arrays. This commit extends that to another group of inversion lists.
* Use an array for some inversion lists | Karl Williamson | 2012-12-22 | 1 | -29/+2
| An earlier commit placed some inversion list pointers into an array. This commit extends that to another group of inversion lists.
* Use array for some inversion lists | Karl Williamson | 2012-12-22 | 1 | -8/+1
| This patch creates an array pointing to the inversion lists that cover the Latin-1 ranges for Posix character classes, and uses it instead of the individual variables previously referred to.
* intrpvar.h: Place some swash pointers in an array | Karl Williamson | 2012-12-22 | 1 | -9/+1
|