path: root/embed.fnc
Commit message | Author | Age | Files | Lines
* [perl #115112] avoid repeated calls to path_is_absolute() and rename it | Tony Cook | 2013-06-04 | 1 | -1/+1
  A micro-optimization inspired by bulk88's perl #115112. The original proposal suggested applying two changes that removed the duplicate calls and then explicitly inlined path_is_absolute(). This version removes the duplicate calls, renames the function to better match its purpose, and asks the compiler to inline it.
* add strbeg argument to Perl_re_intuit_start() | David Mitchell | 2013-06-02 | 1 | -2/+6
  (Note that this is a change both to the perl API and the regex engine plugin API.)

  Currently, Perl_re_intuit_start() is passed an SV, plus pointers to: where in the string to start matching (strpos); and to the end of the string (strend). Unlike Perl_regexec_flags(), it doesn't also have a strbeg arg. Because of this, it guesses strbeg: based on the passed SV (if it's SvPOK()); or just set to strpos otherwise. This latter can happen if, for example, the SV is overloaded. Note also that this latter guess is wrong, and could in theory make /\b.../ fail.

  But just to confuse matters, although Perl_re_intuit_start() itself uses its guesstimate strbeg var, some of the functions it calls use the global value of PL_bostr instead. To make this work, the *callers* of Perl_re_intuit_start() currently set PL_bostr first. This is why \b doesn't actually break.

  The fix to this unholy mess is to simply add a strbeg arg to Perl_re_intuit_start(). It's also the first step to eliminating PL_bostr altogether.
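  The following is a minimal sketch, not code from regexec.c, of why guessing strbeg as strpos can break a \b check. The helper at_word_boundary() and its word-character test are hypothetical simplifications (ASCII only); they only illustrate that a boundary test at a position needs to look at the character before it, which requires knowing the true start of the string.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical, ASCII-only stand-in for a "word character" test. */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Returns 1 if there is a word boundary at pos, given the real
 * start (strbeg) and end (strend) of the string being matched. */
int at_word_boundary(const char *strbeg, const char *strend, const char *pos)
{
    assert(strbeg <= pos && pos <= strend);
    /* The character before pos only exists if pos is past strbeg. */
    int before = (pos > strbeg) && is_word_char((unsigned char)pos[-1]);
    int after  = (pos < strend) && is_word_char((unsigned char)*pos);
    return before != after;
}
```

  With the string "foobar" and pos pointing at the "b", passing the true strbeg reports no boundary, while passing pos itself as strbeg (the old guess) wrongly reports one, because the preceding "o" is never examined.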
* find_byclass, regrepeat: remove is_utf8_pat arg | David Mitchell | 2013-06-02 | 1 | -4/+2
  Remove the is_utf8_pat arg from these two static functions in regexec.c. Since both these functions are now passed a valid reginfo pointer, this info is already available as one of the fields in that struct.
* make more use of regmatch_info struct. | David Mitchell | 2013-06-02 | 1 | -1/+4
  regmatch_info is a small struct that is currently directly allocated as a local var in Perl_regexec_flags(), and has a few fields that maintain part of the state of the current pattern match. It is passed as an arg to various functions that regexec_flags() calls, such as regtry().

  In some ways it's a rival to PL_reg_state, which also maintains state for the current match, but which is a global variable (whose state needs saving and restoring whenever the regex engine goes reentrant).

  It makes more sense to store state in the regmatch_info struct, and as a first step in moving more state to there, this commit makes more use of regmatch_info. In particular, it makes Perl_re_intuit_start() also allocate such a struct, so that now *both* the main execution entry points to the regex engine make use of it. It's also now passed as an arg to more of the static functions that these two op-level ones call.

  Two changes of special note. First, whether S_find_byclass() got called with a null reginfo pointer or not indicated whether it had been called from Perl_regexec_flags() (with a valid reginfo pointer), or from Perl_re_intuit_start() (null pointer). Since they both pass non-null reginfo pointers now, instead we add an extra field, reginfo->intuit, that indicates who's the top-level caller.

  Second, to allow in future for various macros to uniformly refer to values like reginfo->foo, where the structure is actually allocated as a local var in Perl_regexec_flags(), we change reginfo from being the struct itself to being a pointer to the struct (so Perl_regexec_flags itself now uses reginfo->foo too, rather than reginfo.foo).

  In summary, all the above is essentially window dressing that makes no functional changes to the code, but will facilitate future changes.
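  As a rough illustration of the two points above, here is a sketch using hypothetical toy_ names rather than the real regmatch_info or regexec.c functions. Each entry point allocates the struct as a local, refers to it only through a pointer named reginfo, and records in an intuit field which entry point is the top-level caller, so a shared helper no longer needs a null-pointer convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down stand-in for regmatch_info. */
struct toy_reginfo {
    const char *strbeg;
    const char *strend;
    int intuit;  /* nonzero when the top-level caller is the intuit entry point */
};

/* Shared helper: always receives a valid pointer and consults the
 * flag instead of testing the pointer for NULL. */
static int toy_called_from_intuit(const struct toy_reginfo *reginfo)
{
    assert(reginfo != NULL);
    return reginfo->intuit;
}

int toy_regexec(const char *strbeg, const char *strend)
{
    struct toy_reginfo reginfo_buf;
    struct toy_reginfo *reginfo = &reginfo_buf;  /* code can now say reginfo->foo */
    reginfo->strbeg = strbeg;
    reginfo->strend = strend;
    reginfo->intuit = 0;
    return toy_called_from_intuit(reginfo);
}

int toy_intuit_start(const char *strbeg, const char *strend)
{
    struct toy_reginfo reginfo_buf;
    struct toy_reginfo *reginfo = &reginfo_buf;
    reginfo->strbeg = strbeg;
    reginfo->strend = strend;
    reginfo->intuit = 1;
    return toy_called_from_intuit(reginfo);
}
```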
* [perl #116735] Honour lexical prototypes when no parens are used | Father Chrysostomos | 2013-06-02 | 1 | -0/+1
  As Peter Martini noted in ticket #116735, lexical subs produce different op trees for ‘foo 1’ and ‘foo(1)’.

  foo(1) produces an rv2cv op with a padcv kid. The unparenthetical version produces just a padcv op.

  And the difference in op trees caused lexical sub calls to honour prototypes only in the presence of parentheses, because rv2cv_op_cv (which searches for the cv in order to check its prototype) was expecting rv2cv+padcv.

  Not realising there was a discrepancy between the two forms, and noticing that foo() produces *two* newCVREF ops, in commit 279d09bf893 I made newCVREF return just a padcv op for lexical subs. At the time I couldn’t figure out why there were two rv2cv ops, and punted on researching it.

  This is how it works for package subs: When a sub call is compiled, if there are parentheses, an implicit '&' is fed to the parser. The token that follows is a WORD token with a constant op attached to it, containing the name of the subroutine. When the parser sees '&', it calls newCVREF on the const op to create an rv2cv op. For sub calls without parentheses, the token passed to the parser is already an rv2cv op. The resulting op tree is the same either way.

  For lexical subs, I had the lexer emitting an rv2cv op in both paths, which was why we got the double rv2cv when newCVREF was returning an rv2cv for lexical subs.

  The real solution is to call newCVREF in the lexer only when there are no parentheses, since in that case the parser is not going to call newCVREF itself. That avoids a redundant newCVREF call. Hence, we can have newCVREF always return an rv2cv op. The result is that ‘foo(1)’ and ‘foo 1’ produce identical op trees for a lexical sub.

  One more thing needed to change: The lexer was not looking at the lexical prototype CV but simply the stub to be autovivified, so it couldn’t see the parameter prototype attached to the CV (the stub doesn’t have one). The lexer needs to see the parameter prototype too, in order to determine precedence.

  The logic for digging through pads to find the CV has been extracted out of rv2cv_op_cv into a separate (non-API!) routine.
* Cache HvFILL() for larger hashes, and update on insertion/deletion. | Nicholas Clark | 2013-05-29 | 1 | -1/+1
  This avoids HvFILL() being O(n) for large n on large hashes, but also avoids storing the value of HvFILL() in smaller hashes (i.e., a memory overhead on every single object built using a hash).
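  A minimal sketch of the caching idea, under stated assumptions: the struct, the threshold, and the toy_ function names are all hypothetical and are not taken from hv.c. The fill (number of non-empty buckets) is recomputed by a scan for small tables and kept up to date on insertion and deletion for larger ones. Unlike the real change, this toy struct always carries the cache field, so it shows only the update logic, not the memory saving for small hashes.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_BUCKETS     64
#define TOY_CACHE_THRESHOLD 8   /* hypothetical cutoff, in buckets */

struct toy_hash {
    size_t   nbuckets;                    /* must be <= TOY_MAX_BUCKETS */
    unsigned chain_len[TOY_MAX_BUCKETS];  /* entries chained in each bucket */
    size_t   cached_fill;                 /* maintained only for larger tables */
};

static int toy_uses_cache(const struct toy_hash *h)
{
    return h->nbuckets > TOY_CACHE_THRESHOLD;
}

/* Number of non-empty buckets: O(1) when cached, O(n) scan otherwise. */
size_t toy_fill(const struct toy_hash *h)
{
    if (toy_uses_cache(h))
        return h->cached_fill;
    size_t fill = 0;
    for (size_t i = 0; i < h->nbuckets; i++)
        if (h->chain_len[i] != 0)
            fill++;
    return fill;
}

void toy_insert(struct toy_hash *h, size_t bucket)
{
    assert(bucket < h->nbuckets);
    /* The fill changes only when an empty bucket becomes occupied. */
    if (h->chain_len[bucket]++ == 0 && toy_uses_cache(h))
        h->cached_fill++;
}

void toy_delete(struct toy_hash *h, size_t bucket)
{
    assert(bucket < h->nbuckets);
    if (h->chain_len[bucket] == 0)
        return;
    /* The fill changes only when a bucket becomes empty again. */
    if (--h->chain_len[bucket] == 0 && toy_uses_cache(h))
        h->cached_fill--;
}
```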
* Eliminate Perl_my_swabn(), as it is now unused. | Nicholas Clark | 2013-05-20 | 1 | -4/+0
  It is not marked as part of the API, and no code on CPAN is using it.
* When endian-swapping in pack, simply copy the bytes in reverse order. | Nicholas Clark | 2013-05-20 | 1 | -1/+2
  This should restore support for big endian Crays. It doesn't support mixed-endian systems.
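  The technique named in the subject can be sketched as follows. This is an illustration only; copy_reversed() is a hypothetical name and the code is not taken from pp_pack.c. Reversing the byte order works for any width on pure big-endian or little-endian machines, which matches the note that mixed-endian systems are not covered.

```c
#include <assert.h>
#include <stddef.h>

/* Copy len bytes from src to dst in reverse order.
 * The two buffers must not overlap. */
void copy_reversed(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++)
        dst[i] = src[len - 1 - i];
}
```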
* Eliminate my_{hto[bl]e,[bl]etoh}{16,32,64,s,i,l} as nothing now uses them. | Nicholas Clark | 2013-05-20 | 1 | -74/+0
* Eliminate the conditionally-compiled fallback functions for htonl etc. | Nicholas Clark | 2013-05-20 | 1 | -5/+0
  These are now only being used for mixed-endian platforms which do not provide their own htonl (etc) functions. Given that the fallbacks have been buggy since they were added in Perl 3.0, it's safe to conclude that no mixed-endian platforms were ever using these functions.

  It's also unclear why these functions were ever marked as 'A', part of the API. XS code can't call them directly, as it can't rely on them being compiled. Unsurprisingly, no code on CPAN references them.
* Expand flags parameter from boolean in _to_fold_latin1 | Karl Williamson | 2013-05-20 | 1 | -1/+1
  This will be used in future commits to pass more flags.
* embed.fnc: Slight clarification in comments | Karl Williamson | 2013-05-20 | 1 | -1/+1
* cleanup and test PERL_PERTURB_KEYS environment variable handling | Yves Orton | 2013-05-08 | 1 | -1/+1
* Make it possible to disable and control hash key traversal randomization | Yves Orton | 2013-05-07 | 1 | -0/+1
  Adds support for the PERL_PERTURB_KEYS environment variable, which in turn allows one to control the level of randomization applied to keys() and friends.

  When PERL_PERTURB_KEYS is 0 we will not randomize key order at all. The chance that keys() changes due to an insert will be the same as in previous perls, basically only when the bucket size is changed.

  When PERL_PERTURB_KEYS is 1 we will randomize keys in a non-repeatable way. The chance that keys() changes due to an insert will be very high. This is the most secure and default mode.

  When PERL_PERTURB_KEYS is 2 we will randomize keys in a repeatable way. Repeated runs of the same program should produce the same output every time. The chance that keys() changes due to an insert will be very high.

  This patch also makes PERL_HASH_SEED imply a non-default PERL_PERTURB_KEYS setting. Setting PERL_HASH_SEED=0 (exactly one 0) implies PERL_PERTURB_KEYS=0 (hash key randomization disabled); setting PERL_HASH_SEED to any other value implies PERL_PERTURB_KEYS=2 (deterministic/repeatable hash key randomization). Specifying PERL_PERTURB_KEYS explicitly to a different level overrides this behavior.

  Includes changes to allow one to compile out various aspects of the patch. One can compile such that PERL_PERTURB_KEYS is not respected, or can compile without hash key traversal randomization at all. Note that support for these modes is incomplete, and currently a few tests will fail.

  Also includes a new subroutine, Hash::Util::hash_traversal_mask(), which can be used to ensure a given hash produces a predictable key order (assuming the same hash seed is in effect). This sub acts as a getter and a setter.

  NOTE - this patch lacks tests, but I lack tuits to get them done quickly, so I am pushing this with the hope that others can add them afterwards.
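  The interaction between the two environment variables described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the code perl uses: the enum, the function name effective_mode(), and the handling of unrecognised PERL_PERTURB_KEYS values (ignored here) are all hypothetical. The caller is assumed to pass the raw variable values, or NULL when a variable is unset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum perturb_mode {
    PERTURB_NONE          = 0,  /* no randomization of key order */
    PERTURB_RANDOM        = 1,  /* non-repeatable randomization (the default) */
    PERTURB_DETERMINISTIC = 2   /* repeatable randomization */
};

/* seed:    value of PERL_HASH_SEED, or NULL if unset
 * perturb: value of PERL_PERTURB_KEYS, or NULL if unset */
enum perturb_mode effective_mode(const char *seed, const char *perturb)
{
    /* An explicit PERL_PERTURB_KEYS level overrides anything implied
     * by the seed. */
    if (perturb != NULL) {
        if (strcmp(perturb, "0") == 0) return PERTURB_NONE;
        if (strcmp(perturb, "1") == 0) return PERTURB_RANDOM;
        if (strcmp(perturb, "2") == 0) return PERTURB_DETERMINISTIC;
        /* Assumption: any other value is ignored in this sketch. */
    }
    if (seed == NULL)
        return PERTURB_RANDOM;         /* default mode */
    if (strcmp(seed, "0") == 0)
        return PERTURB_NONE;           /* exactly one "0" */
    return PERTURB_DETERMINISTIC;      /* any other seed value */
}
```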
* Remove the non-inline function S_croak_memory_wrap from inline.h. | Andy Dougherty | 2013-03-28 | 1 | -1/+1
  This appears to resolve these three related tickets:

    [perl #116989] S_croak_memory_wrap breaks gcc warning flags detection
    [perl #117319] Can't include perl.h without linking to libperl
    [perl #117331] Time::HiRes::clock_gettime not implemented on Linux (regression?)

  This patch changes S_croak_memory_wrap from a static (but not inline) function into an ordinary exported function Perl_croak_memory_wrap. This has the advantage of allowing programs (particularly probes, such as in cflags.SH and Time::HiRes) to include perl.h without linking against libperl. Since it is not a static function defined within each compilation unit, the optimizer can no longer remove it when it's not needed or inline it as needed. This likely negates some of the savings that motivated the original commit 380f764c1ead36fe3602184804292711. However, calling the simpler function Perl_croak_memory_wrap() still does take less set-up than the previous version, so it may still be a slight win. Specific cross-platform measurements are welcome.
* In Perl_re_op_compile(), tidy up after removing setjmp(). | Nicholas Clark | 2013-03-19 | 1 | -1/+1
  Remove volatile qualifiers. Remove the variable jump_ret. Move the initialisation of restudied back to the declaration. This reverts several of the changes made by commits 5d51ce98fae3de07 and bbd61b5ffb7621c2.

  However, I can't see a cleaner way to avoid code duplication when restarting the parse than the approach I've taken here - the label redo_first_pass is now inside an if (0) block, which is clear but ugly.
* In S_regclass(), create listsv as a mortal, claiming a reference if needed. | Nicholas Clark | 2013-03-19 | 1 | -1/+1
  The SV listsv is sometimes stored in an array generated near the end of S_regclass(). In other cases it is not used, and it needs to be freed if any of the warnings that S_regclass() can trigger turn out to be fatal.

  The simplest solution to this problem is to declare it from the start as a mortal, and claim a (new) reference to it if it is *not* to be freed. This permits the removal of all other code related to ensuring that it is freed at the right time, but not freed prematurely if a call to a warning returns.
* Harden hashes against hash seed discovery by randomizing hash iteration | Yves Orton | 2013-03-19 | 1 | -1/+2
  Adds:

  S_ptr_hash() - A new static function in hv.c which can be used to hash a pointer or integer.

  PL_hash_rand_bits - A new interpreter variable used as a cheap provider of "semi-random" state for use by the hash infrastructure.

  xpvhv_aux.xhv_rand - Used as a mask which is xored against the xpvhv_aux.riter during iteration to randomize the order the actual buckets are visited.

  PL_hash_rand_bits is initialized at interpreter start from the random hash seed, and then modified by "mixing in" the result of ptr_hash() on the bucket array pointer in the hv (HvARRAY(hv)) every time hv_auxinit() allocates a new iterator structure.

  The net result is that every hash has its own iteration order, which should make it much more difficult to determine what the current hash seed is.

  This required some tests to be restructured, as they tested for something that was not necessarily true: we never guaranteed that two hashes with the same keys would produce the same key order; we merely promised that using keys(), values(), or each() on the same hash, without any insertions in between, would produce the same order of visiting the key/values.
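  The xor-mask idea can be sketched in a few lines. This is an illustration with hypothetical names, not the hv.c code; it assumes, as is usual for this kind of table, that the bucket count is a power of two so that max_bucket is an all-ones bit mask. Under that assumption, xoring the iteration counter with a fixed per-hash mask permutes the buckets: every bucket is still visited exactly once, only the order changes.

```c
#include <assert.h>
#include <stddef.h>

/* Map an iteration step to the bucket visited at that step.
 * riter:      iteration counter, 0 .. max_bucket
 * rand_mask:  per-hash random mask
 * max_bucket: bucket count minus one; the count must be a power of two */
size_t bucket_for_step(size_t riter, size_t rand_mask, size_t max_bucket)
{
    assert(((max_bucket + 1) & max_bucket) == 0);
    return (riter ^ rand_mask) & max_bucket;
}
```

  Two hashes with the same keys but different masks therefore walk their buckets in different orders, while one hash with an unchanged mask repeats the same order on every traversal.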
* Fix several differences in the parsing of $.. and ${...} | Brian Fraser | 2013-03-06 | 1 | -0/+3
  Namely:

    * The first character in ${...} used to have no restrictions
    * ${foo:bar} used to be legal
    * ${foo::bar} worked, but ${foo'bar} didn't

  And possibly other subtle, so far undiscovered bugs. This was resolved by simply using the same code for both things.

  Note that this commit is not entirely useful on its own; while tests pass, it requires changes from the following commit to work entirely.
* Pass the current and desired hash sizes to S_hsplit(). | Nicholas Clark | 2013-02-26 | 1 | -1/+1
  Whilst this is slightly more work for its existing two callers, it will permit Perl_hv_ksplit() to also call it.

  Use STRLEN for the parameters, and change a local variable from I32 to STRLEN to match.
* Add av_tindex() synonym for av_top_index() | Karl Williamson | 2013-02-08 | 1 | -0/+1
  The new name is somewhat less clumsy. The old one is provided as a very clear name; the new one as a somewhat slangy version.
* Inline av_top_index() | Karl Williamson | 2013-02-08 | 1 | -1/+1
  This function is just an assert and a macro call. Avoid the function call overhead by making it inline.
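  A sketch of what "just an assert and a macro call" looks like as an inline function. The struct, the macro TOY_AvFILL, and the function name are hypothetical stand-ins, not the real AV or av_top_index(). The example also shows the naming point made in the neighbouring commits: the value is the index of the top element, so it is one less than the element count, and -1 for an empty array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an array whose "fill" field holds the
 * index of the highest element, or -1 when the array is empty. */
struct toy_av {
    long fill;
};

#define TOY_AvFILL(av) ((av)->fill)

/* Small enough that a call would cost more than the body, so it is
 * declared static inline and can be expanded at each call site. */
static inline long toy_av_top_index(const struct toy_av *av)
{
    assert(av != NULL);
    return TOY_AvFILL(av);
}
```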
* Change name 'av_top' to 'av_top_index' | Karl Williamson | 2013-02-08 | 1 | -1/+1
  In using the av_top() function created in a recent commit, I found myself being confused, and thinking it meant the top element of the array, whereas it really means the index of the top element of that array. Since the new name has not appeared in a stable release, it can be changed, without remorse, to include 'index' in it.
* embed.fnc: Remove inappropriate 'p' flags | Karl Williamson | 2013-02-08 | 1 | -2/+2
  These functions do not begin with 'Perl_'; currently this flag is ignored here.
* Add interpolations to regex sets | Karl Williamson | 2013-02-03 | 1 | -0/+1
  This commit adds the capability for '(?[ ])' to contain interpolated variables from other '(?[ ])' constructs. A set operation can thus be built up from the composition of other ones, without having to worry about precedence, etc.

  Thanks to Aaron Crane for suggesting this.
* Incorporate code review feedback for (?[]) | Karl Williamson | 2013-02-03 | 1 | -3/+3
  Thanks to Hugo van der Sanden for reviewing this new code.
* regcomp.c: Extract code into function | Karl Williamson | 2013-02-03 | 1 | -0/+1
  The code to parse the flags that occur in '(?foo)' and '(?foo:bar)' is extracted into a function; some comments were added. This is in preparation for this to be called from an additional place.
* hv.c: add some NULL check removal | bulk88 (via RT) | 2013-01-29 | 1 | -1/+1
  The purpose is fewer machine instructions and faster code.

    * S_hv_free_ent_ret() is always called with entry non-null, so change its signature to reflect this and remove a null check;
    * Add some SvREFCNT_dec_NNs;
    * In hv_clear(), refactor the code slightly to only do a SvREFCNT_dec_NN within the branch where it's already been determined that the arg is non-null; also, use the _nocontext variant of Perl_croak() to save a push instruction in threaded perls.
* Correct variable names in embed.fnc for hv_free_ent and hv_free_ent_ret. | Andy Dougherty | 2013-01-25 | 1 | -2/+2
  Make the second variable name in embed.fnc match the one used in the actual function declaration. This will matter if we add 'entry' to PERL_ARGS_ASSERT_HV_FREE_ENT_RET. Also regen headers (only proto.h is affected) to match.
* Add av_top() synonym for av_len() | Karl Williamson | 2013-01-19 | 1 | -0/+1
  av_len() is misleadingly named.
* Deprecate certain rare uses of backslashes within regexes | Karl Williamson | 2013-01-19 | 1 | -2/+4
  There are three pairs of characters that Perl recognizes as metacharacters in regular expression patterns: {}, [], and (). These can be used as well to delimit patterns, as in:

    m{foo}
    s(foo)(bar)

  Since they are metacharacters, they have special meaning to regular expression patterns, and it turns out that you can't turn off that special meaning by the normal means of preceding them with a backslash, if you use them, paired, within a pattern delimited by them. For example, in

    m{foo\{1,3\}}

  the backslashes do not change the behavior, and this matches "f", "o" followed by one to three more occurrences of "o".

  Usages like this, where they are interpreted as metacharacters, are exceedingly rare; we think there are none, for example, in all of CPAN. Hence, this deprecation should affect very little code. It does give notice, however, that any such code needs to change, which will in turn allow us to change the behavior in future Perl versions so that the backslashes do have an effect, and without fear that we are silently breaking any existing code.
* Extend strictness for qr/(?[ \N{} ])/ | Karl Williamson | 2013-01-19 | 1 | -3/+4
  This recently added regex syntax imposes stricter rules on parsing than normal. However, this did not include parsing \N{} constructs that occur within it. This commit does that, making fatal the warnings that come from \N{}.

  I will add to perldiag the newly added messages along with the others for (?[ ]) before 5.18 ships.
* PATCH: [perl 116411]: code comment for commit 518a5310cc "Silence a MSVC++-specific warning" | bulk88 (via RT) | 2013-01-16 | 1 | -0/+2
  There is no written investigation to google up for the record. I don't want to forget that the #ifdef is benign and accidentally reinvestigate it in the future. The .text section of perl517.dll was 0xC013F before and after the commit. No change.
* Silence a MSVC++-specific warning | Steve Hay | 2013-01-15 | 1 | -0/+4
  ("function declared with __declspec(noreturn) has non-void return type" / "function declared with __declspec(noreturn) has a return statement".)
* Silence a couple of warnings | Steve Hay | 2013-01-14 | 1 | -1/+1
  ("'initializing' : conversion from 'I32' to 'U8', possible loss of data" and "formal parameter n different from declaration".)
* Add warnings for "\08", /\017/ | Karl Williamson | 2013-01-14 | 1 | -0/+2
  This was discussed in thread http://perl.markmail.org/thread/avtzvtpzemvg2ki2 but I never got around to this portion of the consensus, until now.

  I did a cpan grep http://grep.cpan.me/?q=%28^|[^\\]%29\\[0-7]{1%2C2}[8-9]&page=1 and eyeballing the results, saw three cases where this warning might show up; one of which was for EBCDIC. The others looked to be false positives, such as in .css files.
* embed.fnc: Clarify that varargs suppresses embed.h | Karl Williamson | 2013-01-13 | 1 | -1/+2
  Macros don't have variable numbers of args, hence the entry in embed.h is suppressed.
* Create deprecated fncs to replace to-be-removed macros | Karl Williamson | 2013-01-12 | 1 | -0/+2
  These macros should not be used as they are prone to misuse. There are no occurrences of them in CPAN. The single use of either of them in core has recently been removed (commit 8d40577bdbdfa85ed3293f84bf26a313b1b92f55), because it was a misuse.

  Instead code should use isIDFIRST_lazy_if or isWORDCHAR_lazy_if (isALNUM_lazy_if is also available, but can be confused with the Posix alnum, which it doesn't mean).
* New regex experimental feature: (?[ ]) | Karl Williamson | 2013-01-11 | 1 | -0/+4
  This is a fancier [bracketed] character class which allows set operations, such as intersection and subtraction.

  The entry in perlre for this commit details its operation.

  Besides extending regular expressions to handle this functionality, recommended by Unicode, the intent here is to do three things:

  1) Intersection has been simulated by regexes using zero-width look-around assertions, which are non-obvious. This allows replacing those with a more powerful and clearer syntax; the compiled regexes are smaller and faster. Everything is known at compile time.

  2) Set operations have also been simulated by using user-defined Unicode properties. These are globals, have security implications, restricted names, and don't allow as complex expressions as this new feature.

  3) I hope that this feature will come to be viewed as a "better" bracketed character class. I took advantage of the fact that there is no embedded base to have to be compatible with to forbid certain iffy practices with the existing ones, while remaining mostly backwards compatible.

  The main difference is that /x is always enabled, so white space can be pretty much freely used with these, but to specify a match on white space, it must be escaped. Things that should have been illegal now are, such as \x{} and \x{abcdefghi}. Things that look like a posix specifier but don't quite meet the rules now give an error instead of silently compiling; e.g., [:digit] is an error instead of the union of the characters that compose it.

  I may have omitted things; perhaps it should be an error to have the same letter occur twice, adjacent. Since this is experimental, we can make such changes based on field feedback.

  The intent is to keep this feature, since it is strongly recommended by Unicode. The exact syntax is subject to change, so is experimental.
* regcomp.c: Add capability for regclass() to return inversion list | Karl Williamson | 2013-01-11 | 1 | -1/+2
  This is currently unused, but will have regclass() return an inversion list instead of a node.
* regcomp.c: Add capability for strict [:posix:] | Karl Williamson | 2013-01-11 | 1 | -1/+1
  This adds a parameter to regpposixcc() to enforce stricter rules on the posix class syntax. It is currently unused.
* regcomp.c: Add function to skip pattern white space | Karl Williamson | 2013-01-11 | 1 | -0/+2
  The plan is to eventually convert all of regcomp to use this for white space ignoring under /x, but this will be used for now in just the new syntax for (?[ ]), coming in a few commits. Until then, this function is unused.
* regcomp.c: Add parameter to regclass() | Karl Williamson | 2013-01-11 | 1 | -1/+2
  This parameter silences warnings for non-portable characters. It currently is always FALSE, meaning that warnings are given.
* regcomp.c: Add parameter to regclass() | Karl Williamson | 2013-01-11 | 1 | -1/+2
  This parameter allows the caller to specify whether multi-character folds should be allowed or not. In general they should, and in the case where this commit says they shouldn't, they never are returned anyway from Unicode properties. This capability will be put to real use by future commits.
* grok_bslash_[ox]: Add param to silence non-portable warnings | Karl Williamson | 2013-01-11 | 1 | -8/+12
  If a hex or octal number is too big to fit in a 32 bit word, grok_oct and grok_hex by default output a warning that it is a non-portable value. This new parameter to the grok_bslash functions can cause them to shut up those warnings. This is currently unused, but will be needed in future commits.
* Add optional strict mode to grok_bslash_[xo] | Karl Williamson | 2013-01-11 | 1 | -2/+4
  This mode croaks on any iffy constructs that currently compile. It is not currently used; documentation of the error messages will be delivered later.
* Revise calling sequences for grok_bslash_[xo] | Karl Williamson | 2013-01-11 | 1 | -2/+6
  By passing the address of the parse pointer, the functions can advance it, eliminating a parameter to the function, and simplifying the code in the caller.
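  The calling convention described here can be sketched as follows. The function parse_octal() is a hypothetical example, not grok_bslash_o() or its real signature; it only shows how taking the address of the caller's parse pointer lets the callee report how much input it consumed without a separate length parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Parse up to three octal digits starting at *pp and return their value.
 * On return, *pp has been advanced past the digits that were consumed. */
unsigned parse_octal(const char **pp)
{
    assert(pp != NULL && *pp != NULL);
    const char *p = *pp;
    unsigned value = 0;
    int digits = 0;
    while (digits < 3 && *p >= '0' && *p <= '7') {
        value = value * 8 + (unsigned)(*p - '0');
        p++;
        digits++;
    }
    *pp = p;  /* hand the updated position back to the caller */
    return value;
}
```

  The caller simply continues parsing from its own pointer after the call, with no arithmetic on a returned length.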
* regcomp.c: Use a parameter to simplify some code | Karl Williamson | 2013-01-11 | 1 | -1/+1
  When parsing \p{} outside of a bracketed character class, code in regcomp.c has pretended it is a bracketed character class by changing and restoring the parsing pointers, and then calling the charclass handler. This code can be simplified by instead passing a flag to the handler meaning to just parse one item. The faking is simpler there, with no restoring necessary. Also we can eliminate the duplicate handling of special cases.

  Future commits will make more extensive use of this mechanism.
* embed.fnc: Fix flags for _invlist_dump | Karl Williamson | 2013-01-11 | 1 | -1/+1
  This debugging function is normally #ifdef'd out, but should it be enabled, the flags were wrong.
* embed.fnc: Properly declare fcn inline | Karl Williamson | 2013-01-06 | 1 | -1/+1
  This function is specified as inline in the source code, but not in the prototypes; only one compiler seems to have noticed.