path: root/embed.h
Commit message | Author | Age | Files | Lines
* Move amagic hint checking to new function | Father Chrysostomos | 2012-01-24 | 1 | -0/+1
  so that stringification will be able to use it, too.
* [rt.cpan.org #74289] Don’t make *CORE::foo read-only | Father Chrysostomos | 2012-01-23 | 1 | -0/+1
  newATTRSUB requires the sub name to be passed to it wrapped up in a const op. Commit 8756617677dbd allowed it to accept a GV that way, since S_maybe_add_coresub (in gv.c) needed to pass it an existing GV not in the symbol table yet (to simplify code elsewhere).

  This had the inadvertent side-effect of making the GV read-only, since that’s what the check function for const ops does. Even if we were to call this a feature, it wouldn’t make sense as implemented, as GVs for non-ampable (&-able) subs like *CORE::chdir were not being made read-only.

  This commit adds a new flag to newATTRSUB, to allow a GV to be passed as the o parameter, instead of an op. While this may look as though it’s undoing the simplification in commit 8756617677dbd by adding more code, the new code is still conceptually simpler and more straightforward.

  Since newATTRSUB is in the API, I had to add a new _flags variant. (How did newATTRSUB get into the API to begin with?)

  In adding a test, I also discovered that ‘used once’ warnings were applying to these subs, which is obviously wrong. Commit 8756617677dbd caused that, too, as it was relying on the side-effect of newATTRSUB doing a GV lookup. This fixes that, too, by turning on the multi flag in S_maybe_add_coresub.
* regcomp.c: Refactor join_exact() to eliminate extra passes | Karl Williamson | 2012-01-19 | 1 | -1/+1
  The strings in every EXACTFish node are examined for certain problematic sequences and code points. Prior to this patch, this was done in several passes, but this refactors the routine to do it in a single pass.
* regexec.c: Allow for returning shared swash | Karl Williamson | 2012-01-13 | 1 | -0/+1
  This changes the function that returns the swash associated with a bracketed character class so that it returns the original swash and not a copy. The function is renamed and made accessible only from within regexec.c, and a new wrapper function with the original name is created that just calls the other one and returns a copy of the swash. Thus, all access from outside regexec.c will use a copy, which can be overwritten without harming other users, while code within regexec.c has the option of using the shared version.
* regcomp.c: Add _invlist_contents() to compactly dump inversion list | Karl Williamson | 2012-01-13 | 1 | -0/+1
  This will be used in future commits for debug traces.
* utf8.c: Add ability to pass inversion list to _core_swash_init() | Karl Williamson | 2012-01-13 | 1 | -1/+1
  Add a new parameter to _core_swash_init() that is an inversion list to add to the swash, along with a boolean to indicate if this inversion list is derived from a user-defined property. This capability will prove useful in future commits.
* utf8.c: Add flag to swash_init() to not croak on error | Karl Williamson | 2012-01-13 | 1 | -1/+1
  This adds the capability, to be used in future commits, for swash_init() to return NULL instead of croaking if it can't find a property, so that the caller can choose how to handle the situation.
* regcomp.c: Add _invlist_populate_swatch() | Karl Williamson | 2012-01-13 | 1 | -0/+1
  This function will be used in future commits.
* regcomp.c: Add invlist_search() | Karl Williamson | 2012-01-13 | 1 | -0/+1
  This function does a binary search on an inversion list. It will be used in future commits.
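A binary search over an inversion list can be sketched as follows. This is a minimal illustration and not the code of invlist_search(): the function names, the plain array representation, and the return convention are all assumptions. It relies only on the general inversion-list layout, a sorted array in which even-indexed elements start ranges inside the set and odd-indexed elements start ranges outside it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not Perl's invlist_search): returns the index of the
 * largest element that is <= cp, or -1 if cp is below the first element. */
long sketch_invlist_search(const unsigned long *list, size_t len,
                           unsigned long cp)
{
    size_t low = 0, high = len;   /* the answer lies in [low, high) */

    assert(list != NULL || len == 0);
    if (len == 0 || cp < list[0])
        return -1;

    /* Invariant: list[low] <= cp, and every element at index >= high is > cp */
    while (high - low > 1) {
        size_t mid = low + (high - low) / 2;
        if (list[mid] <= cp)
            low = mid;
        else
            high = mid;
    }
    return (long)low;
}

/* Even indexes start ranges inside the set; odd indexes start ranges outside. */
int sketch_invlist_contains(const unsigned long *list, size_t len,
                            unsigned long cp)
{
    long i = sketch_invlist_search(list, len, cp);
    return i >= 0 && (i % 2) == 0;
}
```

With the list { 0x41, 0x5B, 0xC0, 0xD7 }, code points 0x41 to 0x5A and 0xC0 to 0xD6 are members; everything else is not.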
* utf8.c: New function to retrieve non-copy of swash | Karl Williamson | 2012-01-13 | 1 | -0/+3
  Currently, swash_init returns a copy of the swash it finds. The core portions of the swash are read-only, and the non-read-only portions are derived from them. When the value for a code point is looked up, the results for it and adjacent code points are stored in a new element, so that the lookup never has to be performed again. But since a copy is returned, those results are stored only in the copy, and any other uses of the same logical swash don't have access to them, so the lookups have to be performed for each logical use.

  Here's an example. If you have 2 occurrences of /\p{Upper}/ in your program, there are 2 different swashes created, both initialized identically. As you start matching against code points, say "A" =~ /\p{Upper}/, the swashes diverge, as the results for each match are saved in the one applicable to that match. If you match "A" in each swash, it has to be looked up in each swash, and an (identical) element will be saved for it in each swash. This is wasteful of both time and memory.

  This patch renames the function and returns the original and not a copy, thus eliminating the overhead for swashes accessed through the new interface. The old function name is serviced by a new function which merely wraps the new name result with a copy, thus preserving the interface for existing calls.

  Thus, in the example above, there is only one swash, and matching "A" against it results in only one new element, and so the second use will find that, and not have to go out looking again. In a program with lots of regular expressions, the savings in time and memory can be quite large.

  The new name is restricted to use only in regcomp.c and utf8.c (unless XS code cheats the preprocessor), where we will code so as to not destroy the original's data. Otherwise, a change to that would change the definition of a Unicode property everywhere in the program.

  Note that there are no current callers of the new interface; these will be added in future commits.
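The cost difference between handing out copies and handing out the original can be illustrated with a toy cache. This is only a sketch under stated assumptions: the struct, the function names, and the ASCII-only "slow lookup" are hypothetical stand-ins and bear no relation to the real swash data structures.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_RANGE 128

/* Hypothetical stand-in for a swash: a per-code-point result cache */
struct sketch_swash {
    signed char cache[SKETCH_RANGE];  /* -1 = not yet looked up, else 0 or 1 */
    unsigned slow_lookups;            /* how often the expensive path ran */
};

void sketch_swash_init(struct sketch_swash *s)
{
    memset(s->cache, -1, sizeof s->cache);
    s->slow_lookups = 0;
}

/* Stand-in for the expensive property lookup (ASCII uppercase only) */
int sketch_slow_is_upper(unsigned cp)
{
    return cp >= 'A' && cp <= 'Z';
}

/* Consults the cache first; stores the result of a slow lookup in it */
int sketch_swash_fetch(struct sketch_swash *s, unsigned cp)
{
    assert(cp < SKETCH_RANGE);
    if (s->cache[cp] < 0) {
        s->cache[cp] = (signed char)sketch_slow_is_upper(cp);
        s->slow_lookups++;
    }
    return s->cache[cp];
}
```

Two users that each hold their own copy of the structure each pay for the slow lookup of "A"; two users that share one structure pay for it once.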
* utf8.c: Change name of static function | Karl Williamson | 2012-01-13 | 1 | -1/+1
  This function has always confused me, as it doesn't return a swash, but a swatch.
* Need backwards-compatible to_utf8_foo() | Karl Williamson | 2012-01-08 | 1 | -4/+4
  These 4 functions have been replaced by variants to_utf8_foo_flags(), but for XS code that called the old ones in the Perl_to_utf8_foo() forms, backwards compatibility versions need to be created. For calls of just the to_utf8_foo() forms, macros have been used to automatically call the new forms without the performance penalty of going through the compatibility functions.
* [perl #29070] Add vstring set-magic | Father Chrysostomos | 2011-12-23 | 1 | -0/+1
  Some operators, like pp_complement, assign their argument to TARG (which copies vstring magic), modify it in place, and then call set-magic. That’s supposed to work, but vstring magic was remaining as it was, such that ~v7 would still be treated as "v7" by vstring-aware code, even though the resulting string is not "\7".

  This commit adds vstring set-magic that checks to see whether the pv still matches the vstring. It cannot simply free the vstring magic, as that would prevent $x=v0 from working.
* Stop tell($glob_copy) from clearing PL_last_in_gv | Father Chrysostomos | 2011-12-17 | 1 | -0/+1
  This bug is a side effect of rv2gv’s starting to return an incoercible mortal copy of a coercible glob in 5.14:

      $ perl5.12.4 -le 'open FH, "t/test.pl"; $fh=*FH; tell $fh; print tell'
      0
      $ perl5.14.0 -le 'open FH, "t/test.pl"; $fh=*FH; tell $fh; print tell'
      -1

  In the first case, tell without arguments is returning the position of the filehandle.

  In the second case, tell with an explicit argument that happens to be a coercible glob (tell has an implicit rv2gv, so tell $fh is actually tell *$fh) sets PL_last_in_gv to a mortal copy thereof, which is freed at the end of the statement, setting PL_last_in_gv to null. So there is no ‘last used’ handle by the time we get to the tell without arguments.

  This commit adds a new rv2gv flag that tells it not to copy the glob. By doing it unconditionally on the kidop, this allows tell(*$fh) to work the same way.

  Let’s hope nobody does tell(*{*$fh}), which will unset PL_last_in_gv because the inner * returns a mortal copy.

  This whole area is really icky. PL_last_in_gv should be refcounted, but that would cause handles to leak out of scope, breaking programs that rely on the auto-closing ‘feature’.
* utf8.c: Allow Changed behavior of utf8 under locale | Karl Williamson | 2011-12-15 | 1 | -4/+5
  This changes the 4 case changing functions to take extra parameters to specify if the utf8 string is to be processed under locale rules when the code points are < 256. The current functions are changed to macros that call the new versions so that current behavior is unchanged. An additional, static, function is created that makes sure that the 255/256 boundary is not crossed during the case change.
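The "old name becomes a macro over the new variant" pattern can be sketched like this. It is a deliberately simplified single-byte illustration: the names are hypothetical, the real functions operate on UTF-8 strings with different signatures, and the only point shown is that callers of the old name keep their old behavior because the macro supplies the default flag.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical new variant: the extra parameter selects locale rules.
 * Only values 0..255 are accepted in this sketch. */
int sketch_to_upper_flags(int c, int use_locale)
{
    assert(c >= 0 && c < 256);
    if (use_locale)
        return toupper(c);        /* result depends on the current C locale */
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';     /* fixed rule; assumes an ASCII-compatible charset */
    return c;
}

/* The old name survives as a macro that passes the flag value which
 * preserves the previous behavior, so existing callers are unaffected. */
#define sketch_to_upper(c) sketch_to_upper_flags((c), 0)
```

A macro avoids the extra function call that a compatibility wrapper function would add, at the cost of the old name no longer being a real symbol.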
* Adjust substr offsets when using, not when creating, lvalue | Father Chrysostomos | 2011-12-04 | 1 | -0/+3
  When substr() occurs in potential lvalue context, the offsets are adjusted to the current string (negative being converted to positive, lengths reaching beyond the end of the string being shortened, etc.) as soon as the special lvalue to be returned is created.

  When that lvalue is assigned to, the original scalar is stringified once more.

  That implementation results in two bugs:

  1) Fetch is called twice in a simple substr() assignment (except in void context, due to the special optimisation of commit 24fcb59fc).

  2) These two calls are not equivalent:

      $SIG{__WARN__} = sub { warn "w ",shift};
      sub myprint { print @_; $_[0] = 1 }
      print substr("", 2);
      myprint substr("", 2);

  The second one dies. The first one only warns. That’s mean. The error is also wrong, sometimes, if the original string is going to get longer before the substr lvalue is actually used.

  The behaviour of \substr($str, -1) if $str changes length is completely undocumented. Before 5.10, it was documented as being unreliable and subject to change.

  What this commit does is make the lvalue returned by substr remember the original arguments and only adjust the offsets when the assignment happens.

  This means that the following now prints z, instead of xyz (which is actually what I would expect):

      $str = "a";
      $substr = \substr($str,-1);
      $str = "xyz";
      print $substr;
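The idea of remembering the original arguments and resolving them only at the point of use can be sketched in C. This is an assumption-laden illustration, not the Perl implementation: the struct and function names are hypothetical, and the clipping rules are simplified (Perl's real substr() has more cases, such as negative lengths).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of the arguments exactly as the caller wrote them */
struct substr_request {
    long offset;       /* may be negative: counts back from the end */
    int has_length;    /* 0 means "to the end of the string" */
    size_t length;     /* only meaningful when has_length is nonzero */
};

struct substr_span {
    size_t start;
    size_t len;
};

/* Resolves the request against the string length at the time of use.
 * Returns 0 on success, -1 if the offset falls outside the string. */
int sketch_resolve_substr(const struct substr_request *req, size_t cur_len,
                          struct substr_span *out)
{
    long start = req->offset;

    assert(req != NULL && out != NULL);
    if (start < 0)
        start += (long)cur_len;          /* negative offsets count from the end */
    if (start < 0 || (size_t)start > cur_len)
        return -1;

    out->start = (size_t)start;
    out->len = cur_len - out->start;     /* default: rest of the string */
    if (req->has_length && req->length < out->len)
        out->len = req->length;          /* shorten to the requested length */
    return 0;
}
```

A request with offset -1 resolves to start 0 against a one-character string and to start 2 against a three-character string, which mirrors the "a" then "xyz" example: the span follows the string as it is when the lvalue is used.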
* Break the -v code out from Perl_moreswitches() into S_minus_v(). | Nicholas Clark | 2011-12-04 | 1 | -0/+1
* Refactor S_usage() to take 0 parameters and exit() directly. | Nicholas Clark | 2011-12-04 | 1 | -1/+1
  This simplifies the code, as it's only called from one spot, in Perl_moreswitches().
* Make sitecustomize relocatableinc aware | Carl Hayter | 2011-12-03 | 1 | -0/+1
  When -Dusesitecustomize is used with -Duserelocatableinc, SITELIB_EXP/sitecustomize.pl is not found due to SITELIB_EXP having a '.../..' relocation path.

  This patch refactors the path relocation code from S_incpush() into S_mayberelocate() so that it can be used in both S_incpush() and in usesitecustomize's use of SITELIB_EXP.
* Make assignment over glob copies much faster | Father Chrysostomos | 2011-11-24 | 1 | -1/+1
  sv_force_normal is passed the SV_COW_DROP_PV flag if the scalar is about to be written over. That flag is not currently used. We can speed up assignment over fake GVs a lot by taking advantage of the flag.

  Before and after:

      $ time ./perl -e '$x = *foo, undef $x for 1..2000000'
      real 0m4.264s
      user 0m4.248s
      sys  0m0.007s

      $ time ./perl -e '$x = *foo, undef $x for 1..2000000'
      real 0m1.820s
      user 0m1.812s
      sys  0m0.005s
* Put sub redef warnings in one place | Father Chrysostomos | 2011-11-21 | 1 | -0/+3
  The logic surrounding subroutine redefinition warnings (to warn or not to warn?) was in three places. Over time, they drifted apart, to the point that newXS was following completely different rules. It was only warning for redefinition of functions in the autouse namespace. Recent commits have brought it into conformity with the other redefinition warnings.

  Obviously it’s about time we put it in one function.
* Make const redef warnings default in newXS | Father Chrysostomos | 2011-11-21 | 1 | -1/+1
  There is no reason why constant redefinition warnings should be default warnings for sub foo(){1}, but not for newCONSTSUB (which calls newXS, which triggers the warning).

  To make this work properly, I also had to import sv.c’s ‘are these const subs from the same SV originally?’ logic. Constants created with XS can have NULL for the SV (they return an empty list or &PL_sv_undef), which means sv.c’s logic will stop *this=\&that from warning if both this and that are such XS-created constants.

  newCONSTSUB needed to be consistent with that. It required tweaking a test I added a few commits ago, which arguably shouldn’t have warned the way it was written.

  As of this commit (and before it, too, come to think of it), newXS_len_flags’s calling convention is quite awful and would need to be thoroughly re-thunk before being made into an API, or probably simply never made into an API.
* Add newXS_len_flags | Father Chrysostomos | 2011-11-20 | 1 | -0/+1
  It accepts a length as well as a pv for the name. Since newXS_flags is marked with M in embed.fnc and is undocumented, technically policy allows me to change it, but there are files throughout cpan/ that use newXS_flags. So it seemed safer to add a new function.
* Add len flag to newCONSTSUB_flags | Father Chrysostomos | 2011-11-20 | 1 | -1/+1
  This function was added after 5.14.0, so it is not too late to change it. It is currently unused.
* Mention the variable name in the new length warnings | Father Chrysostomos | 2011-11-18 | 1 | -1/+3
* Throw a helpful warning when someone tries length(@array) or length(%hash) | Matthew Horsfall (alh) | 2011-11-18 | 1 | -0/+1
* [perl #70151] eval localises %^H at runtime | Father Chrysostomos | 2011-11-17 | 1 | -1/+1
  It doesn’t any more. Now the hints are localised in a separate inner scope surrounding the call to yyparse.

  This meant moving hint-handling code from pp_require and pp_entereval into S_doeval.

  Some tests in t/comp/hints.t were testing for the buggy behaviour, so they have been adjusted.

  Basically, this fixes

      sub import { eval "strict->import" }

  which should work the same way as

      sub import { strict->import }

  but was not working because %^H and $^H were being localised to the eval at its run time, not just its compilation. So the values assigned to %^H and $^H at the eval’s run time would simply be lost.
* embed.fnc: Make _to_upper_title_latin1() avail to pp.c | Karl Williamson | 2011-11-11 | 1 | -1/+3
  If something like this were to be made more generally available, it would be better to have two in-line functions, to_upper_latin1() and to_title_latin1() that just call this underlying one with the correct final parameter.
* utf8.c: Faster latin1 folding | Karl Williamson | 2011-11-08 | 1 | -0/+1
  This adds a function, similar to the ones for the other three case changing operations, that works on latin1 characters only and avoids having to go out to swashes. It changes to_uni_fold() and to_utf8_fold() to call it on the appropriate input.
* utf8.c: Faster latin1 upper/title casing | Karl Williamson | 2011-11-08 | 1 | -0/+1
  This creates a new function to handle upper/title casing code points in the latin1 range, and avoids using a swash to compute the case. This is because the correct values are compiled-in. And it calls this function when appropriate for both title and upper casing, in both utf8 and uni forms.

  Unlike the similar function for lower casing, it may make sense for this function to be called from outside utf8.c, but inside the core, so it is not static, but its name begins with an underscore.
* utf8.c: Refactor to_uni_lower() | Karl Williamson | 2011-11-08 | 1 | -0/+1
  The portion that deals with Latin1 range characters is refactored into a separate (static) function, so that it can be called from more than one place.
* Warn for $[ ‘version’ checks | Father Chrysostomos | 2011-11-01 | 1 | -0/+1
  Following Michael Schwern’s suggestion, here is a warning for those hapless folks who use $[ for version checks.

  It applies whenever $[ is used as an operand of one of these comparison operators: < > <= >=
* simplify op_dump() / -Dx sequencing | David Mitchell | 2011-10-17 | 1 | -2/+0
  Currently, whenever we dump an op tree, we first call sequence(), which walks the tree, creating address => sequence# mappings in PL_op_sequence. Then when individual ops or op-next fields are displayed, the sequence is looked up.

  Instead, do away with the initial walk, and just map addresses on request. This simplifies the code.

  As a deliberate side-effect, it no longer assigns a seq# of zero to null ops. This makes it easier to work out what's going on when you call op_dump() during a debugging session with partially constructed op-trees. It also removes the ambiguity in "====> 0" as to whether op_next is NULL or just points to an op_null.
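Assigning sequence numbers on request rather than in an up-front walk can be sketched as below. This is a hypothetical illustration only: the table type, its fixed capacity, and the linear scan are assumptions made for brevity and say nothing about how PL_op_sequence is actually stored.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_OPS 64

/* Hypothetical address-to-sequence table; slot i holds the op numbered i+1 */
struct seq_table {
    const void *addr[SKETCH_MAX_OPS];
    size_t used;
};

/* Returns the sequence number for addr, handing out the next free number the
 * first time an address is seen. 0 is reserved for a NULL address; in this
 * sketch it is also returned when the fixed-size table is full. */
unsigned sketch_seq_for(struct seq_table *t, const void *addr)
{
    size_t i;

    assert(t != NULL);
    if (addr == NULL)
        return 0;
    for (i = 0; i < t->used; i++)         /* linear scan keeps the sketch short */
        if (t->addr[i] == addr)
            return (unsigned)(i + 1);
    if (t->used == SKETCH_MAX_OPS)
        return 0;
    t->addr[t->used++] = addr;
    return (unsigned)t->used;
}
```

Because numbers are handed out at first lookup, no prior tree walk is needed, and every non-NULL address gets a nonzero number, so a displayed 0 can only mean NULL (as long as the table has room).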
* whichsig nul-cleanup. | Brian Fraser | 2011-10-06 | 1 | -1/+3
  This adds _sv, _pvn, and _pv versions of whichsig() in mg.c, which get both kill "NAME" and %SIG lookup nul-clean.
* Oust cv_ckproto_len | Father Chrysostomos | 2011-10-06 | 1 | -1/+0
  It is no longer used in core (having been superseded by cv_ckproto_len_flags), is unused on CPAN, and is not part of the API.

  The cv_ckproto ‘public’ macro is modified to use the _flags version. I put ‘public’ in quotes because, even before this commit, cv_ckproto was using a non-exported function, and hence could never have worked on a strict linker (or whatever you call it).
* toke.c, op.c, sv.c: Prototype parsing and checking are nul-and-UTF8 clean. | Brian Fraser | 2011-10-06 | 1 | -0/+1
  This means that eval "sub foo ($;\0whoops) { say @_ }" will correctly include \0whoops in the CV's prototype (while complaining about illegal characters), and that

      use utf8;
      BEGIN { $::{"foo"} = "\$\0L\351on" }
      BEGIN { eval "sub foo (\$\0L\x{c3}\x{a9}on) {};"; }

  will not warn about a mismatched prototype.
* universal.c: sv_does() UTF8 cleanup. | Brian Fraser | 2011-10-06 | 1 | -0/+4
  This adds _sv, _pv, and _pvn forms to sv_does, and changes it to use sv_ref() instead of sv_reftype().
* mro UTF8 cleanup. | Brian Fraser | 2011-10-06 | 1 | -1/+1
  This patch also duplicates existing mro tests with copies that use Unicode in identifiers, to test the mro code.

  Since those tests trigger it, it also fixes a bug in the parsing of *{...}: If the first character inside the braces is a non-ASCII Unicode identifier character, the inside is now implicitly quoted if it is just an identifier (just as it is with ASCII identifiers), instead of being parsed as a bareword that would violate strict subs.
* universal.c: ->isa, sv_derived_from UTF8 cleanup. | Brian Fraser | 2011-10-06 | 1 | -1/+4
  This makes them both nul-and-UTF8 clean, although the latter is somewhat superficial, as mro isn't clean yet.

  (Tests coming once ->can and ->DOES are clean)
* Add a sv_sethek() function to sv.c | Brian Fraser | 2011-10-06 | 1 | -0/+1
  This is exported so that attributes.xs can use it.
* op.c: newCONSTSUB and newXS UTF8 cleanup. | Brian Fraser | 2011-10-06 | 1 | -0/+1
  newXS was merged into newXS_flags; added a line in the docs recommending using that instead.

  newCONSTSUB got a _flags version, which generates the CV in the right glob if passed the UTF-8 flag.
* Merge multi and flags params to gv_init_* | Father Chrysostomos | 2011-10-06 | 1 | -3/+3
  Since multi is a boolean (even though it’s typed as an int), there is no need to have a separate parameter. We can just use a flag bit.
* gv.c: newGVgen_flags and a flags parameter for gv_get_super_pkg. | Brian Fraser | 2011-10-06 | 1 | -2/+2
* Remove method param from gv_autoload_* | Father Chrysostomos | 2011-10-06 | 1 | -3/+3
  method is a boolean flag (typed I32, but used as a boolean) added by commit 54310121b442.

  These new gv_autoload_* functions have a flags parameter, so there’s no reason for this extra effective bool. We can just use a flag bit.
* Remove 4 from new gv_autoload4_(sv|pvn?) functions | Father Chrysostomos | 2011-10-06 | 1 | -3/+3
  The 4 was added in commit 54310121b442 (inseparable changes during 5.003/4 development), presumably the ‘Don't look up &AUTOLOAD in @ISA when calling plain function’ part. Before that, gv_autoload had three arguments, so the 4 indicated the new version (with the method argument).

  Since these new functions don’t all have four arguments, and since they have a new naming convention, there is no reason for the 4.
* gv.c: Added gv_autoload4_(sv|pv|pvn) | Brian Fraser | 2011-10-06 | 1 | -1/+3
* gv.c: Added gv_fetchmethod_(sv|pv|pvn)_flags. | Brian Fraser | 2011-10-06 | 1 | -1/+3
  In addition to taking a flags parameter, it also takes the length of the method name; this will eventually make method lookup nul-clean.
* gv.c: Added gv_fetchmeth_(sv|pv|pvn)_autoload. | Brian Fraser | 2011-10-06 | 1 | -1/+3
* gv.c: Added gv_fetchmeth_(sv|pv|pvn). | Brian Fraser | 2011-10-06 | 1 | -1/+3
  I'm probably pushing this too early. Can't do the Perl-level tests because of that. TODO.
* gv.c: Added gv_init_(sv|pv|pvn), renamed gv_init_sv as gv_init_svtype. | Brian Fraser | 2011-10-06 | 1 | -2/+4
  gv_init_pvn() is the same as the old gv_init(), but takes a flags parameter, which will be used for the UTF-8 cleanup. The old gv_init() is now implemented as a macro in gv.h.

  Also included is some minimal testing in XS::APItest.