path: root/mg.c
Commit message (Author, Age, Files, Lines)
...
* fix 'ignoring return value' compiler warnings (David Mitchell, 2013-11-24, 1 file, -18/+30)
  Various system functions like write() are marked with the __warn_unused_result__ attribute, which causes an 'ignoring return value' warning to be emitted, even if the function call result is cast to (void). The generic solution seems to be
      int rc = write(...); PERL_UNUSED_VAR(rc);
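The idiom from this commit message can be sketched outside the perl source tree. This is a minimal illustration, not the patch itself: `UNUSED_VAR` is a hypothetical stand-in for `PERL_UNUSED_VAR`, `emit_note` is an invented helper, and `ssize_t` is used because it is the declared return type of POSIX `write()`.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for perl's PERL_UNUSED_VAR macro. */
#define UNUSED_VAR(x) ((void)(x))

/* Invented helper: write a best-effort note to fd, deliberately
 * ignoring failure. */
static void emit_note(int fd, const char *msg)
{
    /* Storing the result in a variable is what silences the
     * 'ignoring return value' warning; a bare (void) cast on the
     * call does not. */
    ssize_t rc = write(fd, msg, strlen(msg));

    /* Mark the variable as intentionally unused so the compiler
     * does not complain about it instead. */
    UNUSED_VAR(rc);
}
```

A caller would use it as `emit_note(2, "warning text\n");` when a failed write to stderr is not worth handling.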
* fix a few warnings (format strings, unused variable) (Lukas Mai, 2013-11-20, 1 file, -1/+1)
  During compilation gcc complains about the following:
      perl.c:4970: warning: format '%u' expects argument of type 'unsigned int', but argument 2 has type 'U32' [-Wformat=]
      perl.c:5075: warning: format '%u' expects argument of type 'unsigned int', but argument 2 has type 'I32' [-Wformat=]
      mg.c:1972: warning: format '%ld' expects argument of type 'long int', but argument 2 has type 'ssize_t' [-Wformat=]
      pp_ctl.c:2610: warning: unused variable 'mark' [-Wunused-variable]
      regexec.c:2275: warning: format '%ld' expects argument of type 'long int', but argument 3 has type 'int' [-Wformat=]
  This patch fixes all of them.
  Tony: warning: unused variable 'mark' was fixed in 481c819b
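One common way to resolve a '%ld' versus 'ssize_t' mismatch like the mg.c warning above is to widen the argument explicitly. The sketch below shows that approach; whether the actual patch used a cast or a different format macro is not stated in the message, so treat this as an assumption. `format_len` is a hypothetical helper name.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: format a ssize_t into buf.
 * The explicit cast makes the argument type match the %ld
 * conversion on every platform, whatever ssize_t is defined as. */
static int format_len(char *buf, size_t bufsize, ssize_t len)
{
    return snprintf(buf, bufsize, "%ld", (long)len);
}
```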
* mg.c: Fix misuse of AvARRAY in defelem_target (Father Chrysostomos, 2013-11-04, 1 file, -0/+7)
  defelem magic does not usually apply to tied arrays, but an array can be tied after a defelem has been created and points to it.
  The code for handling deferred elements was never updated for tied arrays when those were added, so it still does AvFILL and AvARRAY.
  AvFILL works on tied arrays, and calls FETCHSIZE. But AvARRAY accesses the AV’s internal structure. So AvFILL might suggest that the index is within the array, whereas it is actually past the end of AvARRAY.
  By tying the array after a deferred element with a high index has been created and then extending the tied array (so AvFILL returns a big number), we can make AvARRAY[big number] crash.
  This script:
      use Tie::Array;
      sub {
        tie @a, "Tie::StdArray";
        $#a = 20000;
        warn pre;
        "$_[0]";
        warn post
      }->($a[10000]);
  gives this output:
      pre at -e line 5.
      Segmentation fault: 11
  For tied arrays, we need to use av_fetch, rather than AvARRAY.
* [perl #119799] Set breakpoints without *DB::dbline (Father Chrysostomos, 2013-10-28, 1 file, -1/+2)
  The elements of the %{"_<..."} hashes (where ‘...’ is the filename), whose keys are line numbers, are used to set breakpoints on the given lines.
  The corresponding @{"_<..."} array contains the actual lines of source code.
  %{"_<..."} actually acts on the array of lines that @DB::dbline is aliased to. The assumption is that *DB::dbline = *{"_<..."} will have taken place first.
  Hence, all %{"_<..."} hashes are the same, when it comes to writing to keys.
  It is more useful for each %{"_<..."} hash to set breakpoints on its corresponding file’s lines regardless of whether @DB::dbline has been aliased, so that is what this commit does.
  Each hash’s mg_obj pointer in its dbfile magic now points to the array, and magic_setdbline uses it instead of PL_DBline.
* WinCE Makefile and make_ext.pl general and XS fixes (Daniel Dragan, 2013-10-21, 1 file, -1/+1)
  On a WinCE build, the 2nd nmake run, using Makefile.ce, eventually calls the Extensions target, which calls make_ext.pl. What happens is that nmake for CE is called for each module on the Desktop per-module makefile from the earlier Desktop build. Since the Desktop Perl was already built successfully, all rules/deps are met in the Desktop per-module makefile, and nothing happens during the module building phase for a CE build. Previously I used external file management tools to delete the per-module Makefiles before running Makefile.ce.
  *make_ext.pl
  - implement deleting and rebuilding the per-module makefile on a Cross build
  - use constants for constant folding; there are opportunities for other variables to be converted to constants in the future
  - fix a bug from commit baff067e71 where unlink() on a file with an open handle ($mfh) didn't delete the file from disk, and a new per-module makefile would not be built by make_ext.pl later since the per-module makefile was still on disk. This was observed on Win32. Also harden the unlink code with a new _unlink sub that is fatal if the file is still on disk after unlink supposedly deleted it.
  - var $header and the quotemeta is because of an issue in Perl #119793
  *Makefile.ce
  - bring the debugging symbol generation flags and optimization flags closer to a Desktop VC Perl build
  - ICWD is obsolete as of commit f6b3c354c9, remove it
  - MINIMOD is obsolete as of commit 7b4d95f74b, remove it
  - make a poisoned config.h so if there is an XS building mixup between a desktop and CE perl, the poisoned config.h for CE will stop the build gracefully
  - $(MINIPERL) has never been defined in Makefile.ce from day 1 (10 years); replace it with $(HPERL) everywhere. This was causing things to silently not run, since $(MINIPERL) was the empty string. Use HPERL instead of MINIPERL to allow flexibility to use the full perl binary if necessary one day
  - better cleaning on root makefile clean target
  *win32/win32.h
  *win32/win32iop.h
  - silence a lot of redefinition warnings which gave pages of warnings on each WinCE compiland
  *mg.c
  - win32_get_errno is only on WIN32 build, not WINCE
  A "nmake -f Makefile.ce all" will now build the CE interp and all modules in 1 shot with no user intervention
* Intercept assignment to $! to translate WSAExxx values to Exxx values on Windows (Steve Hay, 2013-09-16, 1 file, -0/+5)
  Since perl previously assigned WSAExxx values to $! on Windows it is quite possible that (Perl-level) user code may also manually make similar assignments, which will now cause breakage if the value put in $! is subsequently compared to Errno/POSIX constants because the latter are now the corresponding Exxx values where possible.
  An example of this is in Net::Ping::tcp_connect(), which does the following to fetch a socket-level error code:
      unpack("i", getsockopt($self->{"fh"}, SOL_SOCKET, SO_ERROR))
  and assigns the result (a WSAExxx value such as 10061) to $! and then goes wrong in the subsequent test (in ping_tcp()) for $! == ECONNREFUSED (which is now 107 rather than 10061 if perl is built with VC10 or higher).
  To avoid this we now intercept assignment to $! and convert any WSAExxx values to Exxx values first. This causes a minor oddity in that this:
      perl -le "$! = 10061; print 0+$!"
  will now output 107 (for VC10+ perls) but this is surely preferable to the alternative breakage described above.
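The interception described above amounts to a translation step applied before the value is stored. The sketch below models only the single pair of values quoted in the message (10061 and 107); the enum and function names are hypothetical, and the real code would cover the full set of WSAExxx codes using the platform's own constants.

```c
/* Illustrative values taken from the commit message: 10061 is the
 * WSAExxx code and 107 the Exxx value reported for VC10+ builds.
 * The names are hypothetical, not perl's or Windows' identifiers. */
enum {
    SKETCH_WSAECONNREFUSED = 10061,
    SKETCH_ECONNREFUSED    = 107
};

/* Translate a value being assigned to the error variable.
 * Known WSAExxx codes map to their Exxx counterparts; anything
 * else passes through unchanged. */
static int sketch_translate_errno(int code)
{
    switch (code) {
    case SKETCH_WSAECONNREFUSED:
        return SKETCH_ECONNREFUSED;
    default:
        return code;
    }
}
```

With such a step in place, user code that assigns a raw socket error and then compares against ECONNREFUSED sees matching values.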
* Simplify some code in Perl_magic_get() and Perl_magic_set(). (Nicholas Clark, 2013-09-02, 1 file, -9/+3)
  Remove the checks that avoided confusing $^P, ${^PREMATCH} and ${^POSTMATCH} now that the latter two do not take that code path.
  Remove a similar check for $^S added by commit 4ffa73a366885f68 (Feb 2003). (This commit did not add any other variable starting with a control-S.) This eliminates all uses of the variable remaining.
  Move the goto target do_numbuf_fetch inside the checks for PL_curpm, as both its comefroms have already made the same check.
* Remove now unused $` $' ${^MATCH} ${^PREMATCH} ${^POSTMATCH} code. (Nicholas Clark, 2013-09-02, 1 file, -36/+0)
  The previous commit's changes to Perl_gv_fetchpvn_flags() rendered this code in Perl_magic_get() and Perl_magic_set() unreachable.
* Store all other match vars in mg_len instead of mg_ptr/mg_len. (Nicholas Clark, 2013-09-02, 1 file, -2/+1)
  Perl_gv_fetchpvn_flags() now stores the appropriate RX_BUFF_IDX_* constant in mg_len for $` $' ${^MATCH} ${^PREMATCH} and ${^POSTMATCH}.
  This makes some code in mg.c unreachable and hence unnecessary; the next commit will remove it.
* Store the match vars in mg_len instead of calling atoi() on mg_ptr. (Nicholas Clark, 2013-09-02, 1 file, -38/+35)
  The match variables $1, $2 etc, along with many other special scalars, have magic type PERL_MAGIC_sv, with the variable's name stored in mg_ptr. The look up in mg.c involved calling atoi() on the string in mg_ptr to get the capture buffer as an integer, which is passed to the regex API.
  To avoid this repeated use of atoi() at runtime, change the storage in the MAGIC structure for $1, $2 etc and $&. Set mg_ptr to NULL, and store the capture buffer in mg_len. Other code which manipulates magic ignores mg_len if mg_ptr is NULL, so this representation does not require changes outside of the routines which set up, read and write these variables. (Perl_gv_fetchpvn_flags(), Perl_magic_get() and Perl_magic_set())
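The change of representation can be illustrated with a toy structure. This is a simplified model and not perl's actual MAGIC struct: `struct toy_magic` keeps only the two fields the message discusses, and both lookup functions are hypothetical names.

```c
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for the MAGIC struct (assumption: only the two
 * fields discussed above are modelled). */
struct toy_magic {
    const char *mg_ptr; /* variable name, or NULL */
    long        mg_len; /* name length, or the capture index when
                           mg_ptr is NULL */
};

/* Old scheme: the name ("1", "2", ...) is parsed on every lookup. */
static long capture_index_by_parsing(const struct toy_magic *mg)
{
    return atoi(mg->mg_ptr);
}

/* New scheme: the index was computed once at setup time and is
 * simply read back. */
static long capture_index_stored(const struct toy_magic *mg)
{
    return mg->mg_len;
}
```

Both functions yield the same index for the same variable; the second avoids re-parsing a string each time the variable is read or written.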
* In Perl_magic_setdbline, replace the use of atoi() with sv_2iv(). (Nicholas Clark, 2013-08-29, 1 file, -5/+12)
  The value on which atoi() is called is actually always the buffer of an SV. Hence we can use sv_2iv() instead.
* Make vivify_defelem allow &PL_sv_undef array entries (Father Chrysostomos, 2013-08-28, 1 file, -1/+1)
  This is something I failed to change in commit ce0d59f.
  I don’t know of a way to trigger this in pure-Perl code, hence the use of XS in the test. It did show up in pure-Perl code due to a bug fixed by the previous commit.
* Fix assert fail when fetching pos clobbers ref with undef (Father Chrysostomos, 2013-08-25, 1 file, -1/+1)
  pos($x) returns a special magical scalar that sets the match position on $x. Calling pos($x) twice will provide two such scalars. If we set one of them to a reference, set the other to undef, and then read the first, all hail breaks loose, because of the use of SvOK_off.
  SvOK_off is not sufficient if arbitrary values can be assigned by Perl code. Globs, refs and regexps (among others) need special handling, which sv_setsv knows how to do.
* Stop values from ‘sticking’ to @- and @+ elems (Father Chrysostomos, 2013-08-25, 1 file, -0/+2)
  These arrays are very similar to tied arrays, in that the elements are created on the fly when looked up. So push @_, \$+[0], \$+[0], will push references to two different scalars on to @_.
  That they are created on the fly prevents this bug from showing up in most code: If you reference the element you can observe that, on FETCH, it gets set to the corresponding offset *if* the last match has a set of capturing parentheses with the right number. Otherwise, the value in the element is left as-is.
  So, doing another pattern match with, say, 5 captures and then another with fewer will leave $+[5] and $-[5] holding values from the first match, if there is a FETCH in between the two matches:
      $ perl -le '" "=~/()()()()(..)/; $_ = \$+[5]; print $$_; ""=~ /()/; print $$_;'
      2
      2
  And attempts at assignment will succeed, even though they croak:
      $ perl -le 'for ($-[0]) { eval { $_ = *foo }; print $_ }'
      *main::foo
  The solution here is to make the magic ‘get’ handler set the SV no matter what, instead of just setting it when it refers to a valid offset.
* Make @- and @+ return correct offsets beyond 2**31 (Father Chrysostomos, 2013-08-25, 1 file, -5/+5)
* Stop pos() from being confused by changing utf8ness (Father Chrysostomos, 2013-08-25, 1 file, -6/+2)
  The value of pos() is stored as a byte offset. If it is stored on a tied variable or a reference (or glob), then the stringification could change, resulting in pos() now pointing to a different character offset or pointing to the middle of a character:
      $ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, a; print pos $x'
      2
      $ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, "\x{1000}"; print pos $x'
      Malformed UTF-8 character (unexpected end of string) in match position at -e line 1.
      0
  So pos() should be stored as a character offset.
  The regular expression engine expects byte offsets always, so allow it to store bytes when possible (a pure non-magical string) but use characters otherwise.
  This does result in more complexity than I should like, but the alternative (always storing a character offset) would slow down regular expressions, which is a big no-no.
* Remove null check from mg.c:magic_getvec (Father Chrysostomos, 2013-08-22, 1 file, -4/+1)
  lsv can never be null here. This null check has been here since vec’s get-magic was added in ae389c8a or 6ff81951f7.
* Fix assertion failure with $#a=\1 (Father Chrysostomos, 2013-08-22, 1 file, -1/+1)
  If the array has been freed and a reference is then assigned to the arylen scalar and then get-magic is called on that scalar, Perl_magic_getarylen misbehaves. SvOK_off is not sufficient if arbitrary values can be assigned by Perl code. Globs, refs and regexps (among others) need special handling, which sv_setsv knows how to do.
* [perl #118691] Allow defelem magic with neg indices (Father Chrysostomos, 2013-08-21, 1 file, -6/+8)
  When a nonexistent array element is passed to a subroutine, a special ‘deferred element’ scalar (implemented using something called defelem magic) is passed to the subroutine instead, which delegates to the array element. This allows some_benign_function($array[$nonexistent]) to avoid autovivifying unnecessarily.
  Whether this magic would be triggered was based on whether the element was within the range 0..$#array. Since arrays can contain nonexistent elements before $#array, this logic is incorrect. It also makes sense to allow $array[$neg] where the negative number points before the beginning of the array to create a deferred element and only croak if it is assigned to.
  This commit fixes the logic for when deferred elements are created and implements these deferred negative elements.
  Since we have to be able to store negative values in xlv_targoff, it is convenient to make it a union (with two types--signed and unsigned) and use LvSTARGOFF for defelem array indices.
* mg.c: Fix U32-to-bool assignment (Father Chrysostomos, 2013-08-12, 1 file, -1/+1)
  This was caused by 3805b5fb04. This commit restores the !=0 that was there before 2fd13eccf0.
  Thanks to Steve Hay for helping to track down the smoke failures.
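The message does not say how the missing `!= 0` caused failures. One way such a bug can arise is when the boolean type is emulated by a narrow integer, so that a flag bit above the low byte is lost on assignment. The sketch below demonstrates that mechanism under stated assumptions: `toy_bool`, `TOY_FLAG`, and both functions are hypothetical, and whether this is exactly what happened in 3805b5fb04 is not confirmed by the source.

```c
#include <stdint.h>

/* Assumption: a build where the boolean type is a narrow integer
 * (8 bits) rather than C99 _Bool. */
typedef unsigned char toy_bool;

/* Hypothetical flag bit that lies above the low byte. */
#define TOY_FLAG 0x100u

/* Buggy form: the masked 32-bit value is narrowed directly, so
 * 0x100 becomes 0 and the flag appears to be unset. */
static toy_bool flag_by_truncation(uint32_t flags)
{
    return (toy_bool)(flags & TOY_FLAG);
}

/* Fixed form: the comparison yields exactly 0 or 1 before any
 * narrowing takes place. */
static toy_bool flag_by_comparison(uint32_t flags)
{
    return (flags & TOY_FLAG) != 0;
}
```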
* Make PL_hints an alias for PL_compiling.cop_hints (Father Chrysostomos, 2013-08-11, 1 file, -1/+0)
  PL_hints stores the hints at compile time that get copied into the cop_hints field of each COP (in newSTATEOP).
  Since perl-5.8.0-8053-gd5ec298, COPs have stored all the hints.
  Before that, COPs used to store only some of the hints. The hints were copied here and there into PL_compiling, a static COP-shaped buffer used during compilation, so that things like constant folding would see the correct hints. a0ed51b3 back in 1998 did that.
  Now that COPs can store all the hints, we can just use PL_compiling.cop_hints to avoid having to copy them from PL_hints from time to time. This simplifies the code and avoids creating bugs like those that a547fd219 and 1c75beb82 fixed.
* Modifying ${^OPEN} changes the value of $^H: (Father Chrysostomos, 2013-08-11, 1 file, -0/+1)
      $ ./perl -le 'BEGIN { print $^H; ${^OPEN} = "a\0b"; print $^H}'
      256
      917760
  So changing $^H back should change the value of ${^OPEN} back to undef, right?
      $ ./perl -le 'BEGIN { ${^OPEN} = "a\0b"; $^H=256; print ${^OPEN}//"undef"}'
      ab
      $ ./perl -le 'BEGIN { ${^OPEN} = "a\0b"; $^H=256;}BEGIN{ print ${^OPEN}//"undef"}'
      undef
  Apparently you have to hop from one BEGIN block to another to see the changes.
  This happens because compile-time hints are stored in PL_hints (which $^H sets) but ${^OPEN} looks in PL_compiling.cop_hints. Setting ${^OPEN} sets both.
  The contents of PL_hints are assigned to PL_compiling.cop_hints at certain points (the start of a BEGIN block sees the right value because newSTATEOP sets it), but the two are not always kept in synch.
  The smallest fix here is to have $^H set PL_compiling.cop_hints as well as PL_hints, but the ultimate fix--to come later--is to merge the two and stop storing hints in two different places.
* Remove SvIsCOW checks from mg.c:mg_localize (Father Chrysostomos, 2013-08-11, 1 file, -1/+1)
  It no longer needs to worry about SvIsCOW. This logic is left over from when READONLY+FAKE was used for COWs.
  Since it is possible for COWs to be read-only now, this logic is actually faulty, as it doesn’t temporarily stop read-only COWs from being read-only, as it does for other read-only values.
  This actually causes discrepancies with scalar-tied locked hash keys, which differ in readonliness when localised depending on whether the previous value used copy-on-write.
  Whether such scalars should be read-only after localisation is open to debate, but it should not differ based on the means of storing the previous value.
* Remove SvIsCOW checks from mg.c:S_save_magic (Father Chrysostomos, 2013-08-11, 1 file, -4/+2)
  It no longer needs to worry about SvIsCOW. This logic is left over from when READONLY+FAKE was used for COWs.
  Since it is possible for COWs to be read-only now, this logic is actually faulty, as it doesn’t temporarily stop read-only COWs from being read-only, as it does for other read-only values.
  This actually causes bugs with scalar-tied locked hash keys, which croak on FETCH.
* Skip trailing constants when searching pads (Father Chrysostomos, 2013-07-30, 1 file, -0/+2)
  Under ithreads, constants and GVs are stored in the pad.
  When names are looked up in a pad, the search begins at the end and works its way toward the beginning, so that an $x declared later masks one declared earlier.
  If there are many constants at the end of the pad, which can happen for generated code such as lib/unicore/TestProp.pl (which has about 100,000 lines and over 500,000 pad entries for constants at the end of the file scope’s pad), it can take a long time to search through them all.
  Before commit 325e1816, constants used &PL_sv_undef ‘names’. Since that is the default value for array elements (when viewed directly through AvARRAY, rather than av_fetch), the pad allocation code did not even bother storing the ‘name’ for these. So the name pad (aka padnamelist) was not extended, leaving just 10 entries or so in the case of lib/unicore/TestProp.pl.
  Commit 325e1816 made pad constants have &PL_sv_no names, so the name pad would be implicitly extended as a result of storing &PL_sv_no, causing a huge slowdown in t/re/uniprops.t (which runs lib/unicore/TestProp.pl) under threaded builds.
  Now, normally the name pad *does* get extended to match the pad, in pad_tidy, but that is skipped for string eval (and required file scope, of course). Hence, wrapping the contents of lib/unicore/TestProp.pl in a sub or adding ‘my $x’ to the end of it will cause the same slowdown before 325e1816.
  lib/unicore/TestProp.pl just happened to be written (ok, generated) in such a way that it ended up with a small name pad.
  This commit fixes things to make them as fast as before by recording the index of the last named variable in the pad. Anything following that is disregarded in pad lookup and search begins with the last named variable. (This actually does make things faster than before for subs with many trailing constants in the pad.)
  This is not a complete fix. Adding ‘my $x’ to the end of a large file like lib/unicore/TestProp.pl will make it just as slow again. Ultimately we need another algorithm, such as a binary search.
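The lookup strategy described above can be sketched with a toy name pad. This is an illustration under simplifying assumptions and not perl's pad code: `struct toy_pad`, its fields, and `toy_pad_find` are hypothetical, and nameless slots (such as constants) are modelled as NULL entries.

```c
#include <stddef.h>
#include <string.h>

/* Toy name pad (assumption: NULL marks a slot without a name). */
struct toy_pad {
    const char **names;      /* one entry per pad slot */
    ptrdiff_t    count;      /* total number of slots */
    ptrdiff_t    last_named; /* index of the last named slot, or -1 */
};

/* Search backwards so a later declaration masks an earlier one.
 * Starting at last_named instead of count - 1 skips any run of
 * trailing nameless slots in constant time. */
static ptrdiff_t toy_pad_find(const struct toy_pad *pad, const char *name)
{
    for (ptrdiff_t i = pad->last_named; i >= 0; i--) {
        if (pad->names[i] != NULL && strcmp(pad->names[i], name) == 0)
            return i;
    }
    return -1; /* not found */
}
```

As the message notes, this only helps when the nameless slots are at the very end: a single named entry after them moves `last_named` to the end and the scan is linear again.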
* [perl #72766] Allow huge pos() settings (Father Chrysostomos, 2013-07-23, 1 file, -4/+4)
  This is part of #116907, too. It also fixes #72924 as a side effect; the next commit will explain.
  The value of pos($foo) was being stored as an I32, not allowing values above I32_MAX. Change it to SSize_t (the signed equivalent of size_t, representing the maximum string length the OS/compiler supports).
  This is accomplished by changing the size of the entry in the magic struct, which is the simplest fix. Other parts of the code base can benefit from this, too.
  We actually cast the pos value to STRLEN (size_t) when reading it, to allow *very* long strings. Only the value -1 is special, meaning there is no pos. So the maximum supported offset is 2**sizeof(size_t)-2.
  The regexp engine itself still cannot handle large strings, so being able to set pos to large values is useless right now. This is but one piece in a larger puzzle.
  Changing the size of mg->mg_len also requires that Perl_hv_placeholders_p change its type. This function should in fact not be in the API, since it exists solely to implement the HvPLACEHOLDERS macro. See <https://rt.perl.org/rt3/Ticket/Display.html?id=116907#txn-1237043>.
* [perl #27010] Make tie work through defelems (Father Chrysostomos, 2013-07-16, 1 file, -7/+15)
  When elements of @_ refer to nonexistent hash or array elements, then the magic scalar in $_[0] delegates all set/get actions to the element it represents, vivifying it if needed.
  tie/tied/untie, however, were not delegating to the element, but were tying the magical ‘deferred element’ scalar itself.
* [perl #77814] Make defelems propagate pos (Father Chrysostomos, 2013-07-15, 1 file, -14/+19)
  When elements of @_ refer to nonexistent hash or array elements, then the magic scalar in $_[0] delegates all set/get actions to the element it represents, vivifying it if needed.
  pos($_[0]), however, was not delegating the value to the element, but storing it on the magical ‘deferred element’ scalar.
* Make set-magic handle vstrings properly (Father Chrysostomos, 2013-07-15, 1 file, -4/+6)
  Assigning a vstring to a tied variable would result in a plain string in $_[1] in STORE.
  Assigning a vstring to a magic deferred element would result in a plain string in the aggregate’s actual element.
  When magic is invoked, the magic flags are temporarily turned off on the sv so that recursive calls to magic don’t happen. This makes it easier to implement functions like Perl_magic_set to read the value of the sv without triggering get-magic.
  Since vstrings are only considered vstrings when they are SvRMAGICAL, this meant that set-magic would turn vstrings temporarily into plain strings.
  Subsequent copying (e.g., in STORE) would then fail to copy the vstring magic.
  This commit changes mg_set to leave the rmagical flag on, since it does not affect the functionality of set-magic.
* PATCH: [perl #112208]: Set utf8 flag on $! appropriately (Karl Williamson, 2013-07-05, 1 file, -1/+29)
  This patch sets the utf8 flag on $! if the error string passes utf8 validity tests and has some bytes with the upper bit set. (If none have that bit set, it is an ASCII string, and whether or not it is UTF-8 is irrelevant.) This is a heuristic that could fail, but as the reference in the comments points out this is unlikely.
  One can reasonably assume that a UTF-8 locale will return a UTF-8 result. So another approach would be to look at that (but we wouldn't want to turn the flag on for a purely ASCII string anyway, as that could change the semantics from existing behavior by making the string follow Unicode rules, whereas it didn't necessarily before.) To do this, we could keep track of the utf8ness of the LC_MESSAGES locale. But until the heuristic in this patch is shown to not be good enough, I don't see the need to do this extra work.
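The heuristic described above (well-formed UTF-8 plus at least one byte with the upper bit set) can be sketched in plain C. The function name `looks_like_utf8` is hypothetical, and the validity test here is strict standard UTF-8, which is an assumption: perl's own validation rules may accept more than this does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true only if s[0..len) is well-formed UTF-8 AND contains
 * at least one non-ASCII byte. Pure ASCII returns false, since the
 * flag would make no difference to its interpretation. */
static bool looks_like_utf8(const unsigned char *s, size_t len)
{
    bool saw_high = false;
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t need; /* number of continuation bytes expected */

        if (c < 0x80) { /* ASCII byte */
            i++;
            continue;
        }
        saw_high = true;

        if (c >= 0xC2 && c <= 0xDF)      need = 1;
        else if (c >= 0xE0 && c <= 0xEF) need = 2;
        else if (c >= 0xF0 && c <= 0xF4) need = 3;
        else return false; /* stray continuation or invalid lead byte */

        if (len - i <= need)
            return false; /* sequence is cut off by end of string */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false; /* not a continuation byte */
        }

        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (c == 0xE0 && s[i + 1] < 0xA0) return false;
        if (c == 0xED && s[i + 1] > 0x9F) return false;
        if (c == 0xF0 && s[i + 1] < 0x90) return false;
        if (c == 0xF4 && s[i + 1] > 0x8F) return false;

        i += need + 1;
    }
    return saw_high;
}
```

A Latin-1 message containing a lone 0xE9 byte fails the validity test and stays unflagged, which is the case where the heuristic protects existing behavior.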
* change magic_methcall to use SV with shared hash value (Ruslan Zakirov, 2013-06-30, 1 file, -15/+14)
  Perl_magic_methcall is not public API, so there is no need to add another function and we can just change function's arguments.
* G_METHOD_NAMED flag for call_method and call_sv (Ruslan Zakirov, 2013-06-30, 1 file, -2/+2)
  Can be used when it's known that method name has no package part - just method name.
  With flag set SV with precomputed hash value is used and pp_method_named is called instead of pp_method. Method lookup is faster.
* Stop making assumptions about uids and gids. (Brian Fraser, 2013-06-04, 1 file, -23/+23)
  The code dealt rather inconsistently with uids and gids. Some places assumed that they could be safely stored in UVs, others in IVs, others in ints; all of them should've been using the macros from config.h instead. Similarly, code that created SVs or pushed values into the stack was also making incorrect assumptions. As a point of reference, only pp_stat did the right thing:
      #if Uid_t_size > IVSIZE
          mPUSHn(PL_statcache.st_uid);
      #else
      #   if Uid_t_sign <= 0
          mPUSHi(PL_statcache.st_uid);
      #   else
          mPUSHu(PL_statcache.st_uid);
      #   endif
      #endif
  The other places were potential bugs, and some were even causing warnings in some unusual OSs, like haiku or qnx.
  This commit amends the situation by introducing four new macros, SvUID(), sv_setuid(), SvGID(), and sv_setgid(), and using them where needed.
* mg.c: Use SvREFCNT_dec_NN (Father Chrysostomos, 2013-05-26, 1 file, -2/+2)
  Using SvREFCNT_dec_NN in a couple of places eliminates needless null checks.
* autodoc.pl: Add note for deprecated functions (Karl Williamson, 2013-05-20, 1 file, -3/+1)
  This causes each deprecated function to have a prominent note to that effect in its API documentation.
* mg.c: White-space only (Karl Williamson, 2013-05-20, 1 file, -2/+5)
  I found re-formatting this multi-line 'if' to be easier to understand
* silence warnings under NO_TAINT_SUPPORT (David Mitchell, 2013-05-09, 1 file, -0/+3)
  There are lots of places where local vars aren't used when compiled with NO_TAINT_SUPPORT.
* mg.c : revert ENV{x} = undef behaviour to be empty string, not key deletion (Kent Fredric, 2013-02-17, 1 file, -1/+1)
  pod/perldelta.pod: document reversion of ENV{foo} = undef behaviour in delta
  t/op/magic.t: add a test for ENV{foo} = undef
* better POD for mg_get and SvGROW (Daniel Dragan, 2012-12-11, 1 file, -1/+2)
  SvGROW unconditionally derefs SvANY to check SvLEN. A crash occurs if the sv is SVt_NULL. Also mg_get uses SvMAGIC, which has the same problem. Rather than having people find these properties out by trial and error, document them. There is no sense in adding type checks, since these 2 calls have had sv type restrictions since probably day 1, and for performance reasons. Anyone who hit the restrictions would have either fixed their code immediately, or abandoned using XS. I observed someone abandoning XS in the field over these undocumented restrictions.
* prevent multiple evaluations of ERRSV (Daniel Dragan, 2012-11-23, 1 file, -17/+20)
  Remove a large amount of machine code (~4KB for me) from funcs that use ERRSV, making Perl faster and smaller by preventing multiple evaluation. ERRSV is a macro that contains GvSVn, which eventually conditionally calls Perl_gv_add_by_type. If a SvTRUE or any other multiple evaluation macro is used on ERRSV, the expansion will, in asm, have dozens of calls to Perl_gv_add_by_type, one for each test/deref of the SV in SvTRUE. A less severe problem exists when multiple funcs (sv_set*) in a row call, each with ERRSV as an arg. It's recalculated then, Perl_gv_add_by_type and all. I think the ERRSV macro got the func call in commit f5fa9033b8, Perl RT #70862. Prior to that commit it would be pure derefs I think. Saving the SV* is still better than looking into interp->gv->gp to get the SV * after each func call.
  I received no responses to http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195724.html explaining when the SV is replaced in PL_errgv, so took a conservative view and assumed callbacks (with Perl stack/ENTER/LEAVE/eval_*/call_*) can change it. I also assume ERRSV will never return null; this allows a more efficient version of SvTRUE to be used.
  In Perl_newATTRSUB_flags a wasteful copy to C stack operation with the string was removed, and a croak_notcontext to remove push instructions to the stack. I was not sure about the interaction between ERRSV and message sv, so I didn't change it to a more efficient (instruction wise, speed, idk) format string combining of the not safe string and ERRSV in the croak call. If such an optimization is done, a compiler potentially will put the not safe string on the first, unconditionally, then check PL_in_eval, and then jump to the croak call site, or eval ERRSV, push the SV on the C stack then push the format string "%"SVf"%s". The C stack allocated const char array came from commit e1ec3a884f.
  In Perl_eval_pv, croak_on_error was checked first to not eval ERRSV unless necessary. I was not sure about the side effects of using a more efficient croak_sv instead of Perl_croak (null chars, utf8, etc) so I left a comment. nocontext used to save a push instruction on implicit sys perl.
  In S_doeval, don't open a new block to avoid large whitespace changes. The NULL assignment should optimize away unless accidental usage of errsv in the future happens through a code change. There might be a bug here from commit ecad31f018 since previously a char * was derefed to check for null char, but ERRSV will never be null, so the "Unknown error\n" branch will never be taken.
  For pp_sys.c, in pp_die a new block was opened to not eval ERRSV if "well-formed exception supplied". The else if else if else blocks all used ERRSV, so a "SV * errsv = NULL;" and an eval in the conditional with comma op thing wouldn't work (maybe it would, see toke.c comments later in this message). pp_warn, I have no comments.
  In S_compile_runtime_code, a croak_sv question comes up same as in Perl_eval_pv.
  In S_new_constant, an eval in the conditional is done to avoid evaling ERRSV if PL_in_eval short circuits. Same thing in Perl_yyerror_pvn.
  Perl__core_swash_init I have no comments.
  In the future, a SvEMPTYSTRING macro should be considered (not fully thought out by me) to replace the SvTRUEs with something smaller and faster when dealing with ERRSV. _nomg is another thing to think about.
  In S_init_main_stash there is an opportunity to prevent an extra ERRSV between "sv_grow(ERRSV, 240);" and "CLEAR_ERRSV();" that was too complicated for me to optimize.
      before perl517.dll  .text 0xc2f77  .rdata 0x212dc  .data 0x3948
      after  perl517.dll  .text 0xc20d7  .rdata 0x212dc  .data 0x3948
  Numbers are from VC 2003 x86 32 bit.
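The cost of passing a function-calling macro to a macro that evaluates its argument more than once can be shown with a small stand-alone model. Everything named here is hypothetical: `TOY_ERRSV` stands in for a macro like ERRSV that hides a lookup call, and `TOY_TRUE` stands in for a multi-test macro like SvTRUE; the counter exists only to make the repeated evaluation visible.

```c
#include <stddef.h>

static int lookup_calls = 0; /* counts how often the lookup runs */
static int slot_value = 1;

/* Hypothetical stand-in for the function call hidden in the macro. */
static int *lookup_slot(void)
{
    lookup_calls++;
    return &slot_value;
}

/* A macro that looks like a variable but is really a call. */
#define TOY_ERRSV (lookup_slot())

/* A macro that evaluates its argument twice. */
#define TOY_TRUE(p) ((p) != NULL && *(p) != 0)

/* Each appearance of the argument in the expansion becomes a
 * separate call: two lookups for one truth test. */
static int check_repeated(void)
{
    return TOY_TRUE(TOY_ERRSV);
}

/* Saving the pointer first means the lookup runs exactly once. */
static int check_cached(void)
{
    int *errsv = TOY_ERRSV;
    return TOY_TRUE(errsv);
}
```

The cached form is only safe while nothing in between can replace the underlying object, which is why the message takes the conservative view that callbacks may change it.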
* rmv context from Perl_croak_no_modify and Perl_croak_xs_usage (Daniel Dragan, 2012-11-12, 1 file, -2/+2)
  Remove the context/pTHX from Perl_croak_no_modify and Perl_croak_xs_usage. For croak_no_modify, it now has no parameters (and always has been no return), and on some compilers will now be optimized to a conditional jump. For Perl_croak_xs_usage one push asm opcode is removed at the caller. For both funcs, their footprint in their callers (which probably are hot code) is smaller, which means a tiny bit more room in the cache. My text section went from 0xC1A2F to 0xC198F after applying this. Also see http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195233.html .
* Add C define to remove taint support from perl (Steffen Mueller, 2012-11-05, 1 file, -6/+6)
  By defining NO_TAINT_SUPPORT, all the various checks that perl does for tainting become no-ops. It's not an entirely complete change: it doesn't attempt to remove the taint-related interpreter variables, but instead virtually eliminates access to them.
  Why, you ask? Because it appears to speed up perl's run-time significantly by avoiding various "are we running under taint" checks and the like.
  This change is not in a state to go into blead yet. The actual way I implemented it might raise some (valid) objections. Basically, I replaced all uses of the global taint variables (but not PL_taint_warn!) with an extra layer of get/set macros (TAINT_get/TAINTING_get). Furthermore, the change is not complete:
  - PL_taint_warn would likely deserve the same treatment.
  - Obviously, tests fail. We have tests for -t/-T
  - Right now, I added a Perl warn() on startup when -t/-T are detected but the perl was not compiled to support it. It might be argued that it should be silently ignored! Needs some thinking.
  - Code quality concerns - needs review.
  - Configure support required.
  - Needs thinking: How does this tie in with CPAN XS modules that use PL_taint and friends? It's easy to backport the new macros via PPPort, but that doesn't magically change all code out there. Might be harmless, though, because whenever you're running under NO_TAINT_SUPPORT, any check of PL_taint/etc is going to come up false. Thus, the only CPAN code that SHOULD be adversely affected is code that changes taint state.
* Don’t sv_force_normal in mg.c:S_save_magic (Father Chrysostomos, 2012-10-28, 1 file, -7/+4)
  This was added to make SvREADONLY_off safe. (I think read-only is turned off during magic so the magic scalar itself can be set without the sv_set* functions getting upset.)
  Since SvREADONLY doesn’t mean read-only for COWs, we don’t actually need to do sv_force_normal, but can simply skip SvREADONLY_off for COWs. By leaving it to sv_set* functions to do sv_force_normal, we avoid having to copy the string buffer if it is just going to be thrown away anyway. S_save_magic can’t know whether the scalar will actually be overwritten, so it has to copy the buffer.
* Call overloading once for utf8 ovld→substr assignment (Father Chrysostomos, 2012-10-01, 1 file, -1/+1)
* Make substr assignment work with changing UTF8ness (Father Chrysostomos, 2012-10-01, 1 file, -3/+2)
  Assigning to a substr lvalue scalar was invoking overload too many times if the target was a UTF8 string and the assigned substring was not.
  Since sv_insert_flags itself stringifies the scalar, the easiest way to fix this is to force the target to a PV before doing anything to it.
* mg.c:magic_setsubstr: rmv redundant null check (Father Chrysostomos, 2012-10-01, 1 file, -1/+1)
  This was added in commit 9bf12eaf4 to fix a crash, but it is not necessary any more, due to changes elsewhere.
* Remove length magic on scalars (Father Chrysostomos, 2012-10-01, 1 file, -79/+0)
  It is not possible to know how to interpret the returned length without accessing the UTF8 flag, which is not reliable until the SV has been stringified, which requires get-magic. So length magic has not made sense since utf8 support was added. I have removed all uses of length magic from the core, so this is now dead code.
* Make substr = $utf8 call get-magic once (Father Chrysostomos, 2012-10-01, 1 file, -1/+1)
* Make magic_setsubstr check UTF8 flag after stringification (Father Chrysostomos, 2012-10-01, 1 file, -2/+3)
  By checking it before, it can end up treating a UTF8 string as bytes when calculating offsets if the UTF8 flag is not turned on until the target is stringified. This can happen with overloading and typeglobs.
  This is a regression from 5.14. 5.14 itself was buggy, too, but one would have to modify the target after creating the substr lvalue but before assigning to it; and that because of another bug fixed by 83f78d1a27, which was cancelling out this one.
      package o {
          use overload '""' => sub { $_[0][0] }
      }
      my $refee = bless ["\x{100}a"], o::;
      my $substr = \substr $refee, -2;
      $$substr = "b";
      warn $refee;
  That prints:
      Wide character in warn at - line 7.
      Āb at - line 7.
  In 5.14 it prints:
      b at - line 7.
* Stop substr lvalues from being confused by changing UTF8ness (Father Chrysostomos, 2012-10-01, 1 file, -3/+3)