path: root/sv.c
Commit message (Author, Age, Files, Lines)
* clear magic flags in sv_clear (David Mitchell, 2012-03-26, 1 file, -0/+1)
| Commit 5bec93bead1c10563a402404de095bbdf398790f made temporary use of the no-longer-used SvMAGIC field while freeing an HV. This commit makes sure that the SvMAGICAL flags are turned off before this happens, because it turns out that some XS code (e.g. Glib) can leave an SV with a null SvMAGIC field but with magic flags still set.
* fix slowdown in nested hash freeing (David Mitchell, 2012-03-06, 1 file, -12/+9)
| Commit 104d7b69 made sv_clear free hashes iteratively rather than recursively; however, my code didn't record the current hash index when freeing a nested hash, which made the code go quadratic when freeing a large hash with inner hashes, e.g.:
|
|     my $r; $r->{$_} = { a => 1 } for 1..10_0000;
|
| This was noticeable on such things as CPAN.pm being very slow to exit.
|
| This commit fixes this by squirrelling away the old hash index in the now-unused SvMAGIC field of the hash being freed.
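The idea of resuming a parent scan from a saved index can be sketched outside Perl. The following is a simplified model, not the code in sv.c: the `Node` type, its `parent` and `saved_index` scratch fields, and both function names are hypothetical. `saved_index` plays the role the commit gives to the reused SvMAGIC field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a hash: each slot holds a nested node or NULL. */
typedef struct Node {
    struct Node **slots;   /* child pointers; NULL means "plain value" */
    size_t nslots;
    struct Node *parent;   /* scratch: where to return to while freeing */
    size_t saved_index;    /* scratch: where the parent's scan resumes */
} Node;

Node *node_new(size_t nslots)
{
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->slots = nslots ? calloc(nslots, sizeof *n->slots) : NULL;
    assert(nslots == 0 || n->slots != NULL);
    n->nslots = nslots;
    n->parent = NULL;
    n->saved_index = 0;
    return n;
}

/* Free a whole tree without recursion. Returns the number of slot visits,
 * so a caller can check that the work stays linear in the slot count. */
size_t node_free_iterative(Node *root)
{
    size_t visits = 0;
    Node *cur = root;
    size_t i = 0;

    if (root == NULL)
        return 0;
    root->parent = NULL;

    while (cur != NULL) {
        if (i < cur->nslots) {
            Node *child = cur->slots[i];
            visits++;
            if (child != NULL) {
                /* Descend, remembering where the parent scan resumes. */
                child->parent = cur;
                child->saved_index = i + 1;
                cur = child;
                i = 0;
            } else {
                i++;
            }
        } else {
            /* All slots handled: free this node, resume in the parent. */
            Node *parent = cur->parent;
            size_t resume = cur->saved_index;
            free(cur->slots);
            free(cur);
            cur = parent;
            i = resume;
        }
    }
    return visits;
}
```

Without the saved index, the parent scan would restart from slot 0 after every nested node, which is the quadratic behaviour the commit message describes. With it, each slot is visited exactly once.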
* Remove gete?[ug]id caching (Ævar Arnfjörð Bjarmason, 2012-02-18, 1 file, -4/+4)
| Currently we cache the UID/GID and effective UID/GID similarly to how we used to cache getpid() before v5.14.0-251-g0e21945. Remove this magical behavior in favor of always calling getuid(), getgid() etc.
|
| This resolves RT #96208. A minimal testcase for this is the following by Leon Timmermans attached to RT #96208:
|
|     eval { require 'syscall.ph'; 1 } or eval { require 'sys/syscall.ph'; 1 } or die $@;
|     if (syscall(&SYS_setuid, $ARGV[0] + 0 || 1000) >= 0 or die "$!") {
|         printf "\$< = %d, getuid = %d\n", $<, syscall(&SYS_getuid);
|     }
|
| I.e. if we call the sete?[ug]id() functions unbeknownst to perl the $<, $>, $( and $) variables won't be updated. This results in the same sort of issues we had with $$ before v5.14.0-251-g0e21945, and getppid() before my v5.15.7-407-gd7c042c patch.
|
| I'm completely eliminating the PL_egid, PL_euid, PL_gid and PL_uid variables as part of this patch. This will break some CPAN modules, but it'll be really easy to reinstate them before the v5.16.0 final. I'd like to remove them to see what breaks, and how easy it is to fix it.
|
| These variables are not part of the public API, and the modules using them could either use the Perl_gete?[ug]id() functions or are working around the bug I'm fixing with this commit.
|
| The new PL_delaymagic_(egid|euid|gid|uid) variables I'm adding are *only* intended to be used internally in the interpreter to facilitate the delaymagic in Perl_pp_sassign. There's probably some way not to export these to programs that embed perl, but I haven't found out how to do that.
* correctly clone eval context frames (Zefram, 2012-02-18, 1 file, -0/+3)
| When cloning stacks (only used for Win32 fork emulation, not for ordinary threads), the CV referenced by an eval context frame wasn't being cloned. This led to crashes when Win32 forked inside an eval [perl #109718].
* In Perl_sv_del_backref(), don't panic if tsv is already freed. (Nicholas Clark, 2012-02-17, 1 file, -0/+24)
| During global destruction it's possible for tsv, the target of this weak reference, to already be freed. This isn't a bug, and hence the interpreter should not panic.
* perl #77654: quotemeta quotes non-ASCII consistently (Karl Williamson, 2012-02-15, 1 file, -0/+1)
| As described in the pod changes in this commit, this changes quotemeta() to consistently quote non-ASCII characters when used under unicode_strings. The behavior is changed for these and UTF-8 encoded strings to more closely align with Unicode's recommendations. The end result is that we *could* at some future point start using other characters as metacharacters than the 12 we do now.
* Further eliminate POSIX-emulation under LinuxThreads (Ævar Arnfjörð Bjarmason, 2012-02-15, 1 file, -4/+0)
| Under POSIX threads the getpid() and getppid() functions return the same values across multiple threads, i.e. threads don't have their own PIDs. This is not the case under the obsolete LinuxThreads where each thread has a different PID, so getpid() and getppid() will return different values across threads.
|
| Ever since the first perl 5.0 we've returned POSIX-consistent semantics for $$, until v5.14.0-251-g0e21945 when the getpid() cache was removed. In 5.8.1 Rafael added further explicit POSIX emulation in perl-5.8.0-133-g4d76a34 [1] by explicitly caching getppid(), so that multiple threads would always return the same value.
|
| I don't think all this effort to emulate POSIX semantics is worth it. I think $$ and getppid() are OS-level functions that should always return the same as their C equivalents. I shouldn't have to use a module like Linux::Pid to get the OS version of the return values.
|
| This is pretty much a complete non-issue in practice these days. LinuxThreads was a Linux 2.4 thread implementation that nobody maintains anymore [2]; all modern Linux distros use NPTL threads, which don't suffer from this discrepancy. Debian GNU/kFreeBSD does use LinuxThreads in the 6.0 release, but they too will be moving away from it in future releases, and really, nobody uses Debian GNU/kFreeBSD anyway.
|
| This caching makes it unnecessarily tedious to fork an embedded Perl interpreter. When someone constructs an embedded perl interpreter and forks their application, the fork(2) system call isn't going to run Perl_pp_fork(), and thus the return value of $$ and getppid() doesn't reflect the current process. See [3] for a bug in uWSGI related to this, and Perl::AfterFork on the CPAN for XS code that you need to run after forking a PerlInterpreter unbeknownst to perl.
|
| We've already been failing the tests in t/op/getpid.t on these Linux systems that nobody apparently uses. The Debian GNU/kFreeBSD users did notice and filed #96270. This patch fixes that failure by changing the tests to test for different behavior under LinuxThreads. I've tested that this works on my Debian GNU/kFreeBSD 6.0.4 virtual machine.
|
| If this change is found to be unacceptable (i.e. we want to continue to emulate POSIX thread semantics for the sake of LinuxThreads) we also need to revert v5.14.0-251-g0e21945, because currently we're only emulating POSIX semantics for getppid(), not getpid(). But I don't think we should do that; both v5.14.0-251-g0e21945 and this commit are awesome.
|
| This commit includes a change to embedvar.h made by "make regen_headers".
|
| 1. http://www.nntp.perl.org/group/perl.perl5.porters/2002/08/msg64603.html
| 2. http://pauillac.inria.fr/~xleroy/linuxthreads/
| 3. http://projects.unbit.it/uwsgi/ticket/85
* Clarify the newSVpvn documentation. (Shlomi Fish, 2012-02-15, 1 file, -6/+8)
| "string" is now called "buffer", and we mention that it might contain NUL characters.
* regcomp.c: /[[:lower:]]/i should match the same as /\p{Lower}/i (Karl Williamson, 2012-02-11, 1 file, -0/+2)
| Same for [[:upper:]] and \p{Upper}. These were matching instead all of [[:alpha:]] or \p{Alpha}. What /\p{Lower}/i and /\p{Upper}/i match instead is \p{Cased}, and so that is what these should match.
* regcomp.c: Remove duplicate inversion list (Karl Williamson, 2012-02-11, 1 file, -2/+0)
| \h and \p{XPosixBlank} contain the same code points, so there is no need to have both of them.
* Add compile-time inversion lists for POSIX classes (Karl Williamson, 2012-02-09, 1 file, -1/+50)
| These will be used in regcomp.c to replace the existing bit-wise handling of these, enabling subsequent optimizations.
|
| These are compiled-in, and hence affect the memory footprint of every program, including those that don't use Unicode. The lists that aren't tiny are therefore currently restricted to only the Latin1 range; anything needed beyond that will have to be read in at execution time, just as before. The design allows for easy conversion from Latin1 to use the full Unicode range, should it be deemed desirable for some or all of these.
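For readers unfamiliar with the term, an inversion list is a sorted array of code points in which each element toggles set membership: even positions start a range that is in the set, odd positions start a range that is not. The sketch below is a generic illustration with an ASCII-only table. The names `ascii_alpha`, `invlist_contains` and `is_ascii_alpha` are made up here and are not the tables or functions that Perl compiles in.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative table for [A-Za-z]: ranges are [65,91) and [97,123). */
static const uint32_t ascii_alpha[] = { 'A', 'Z' + 1, 'a', 'z' + 1 };

/* Binary search for the number of elements <= cp. An odd count means
 * cp falls after a range start and before the matching range end. */
bool invlist_contains(const uint32_t *list, size_t len, uint32_t cp)
{
    size_t lo = 0, hi = len;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo & 1) != 0;
}

bool is_ascii_alpha(uint32_t cp)
{
    return invlist_contains(ascii_alpha,
                            sizeof ascii_alpha / sizeof ascii_alpha[0], cp);
}
```

Because the table is a `static const` array, it costs memory in every program that links it. That is the footprint concern the commit message raises when it restricts the larger lists to the Latin1 range.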
* regcomp.c: Use compile-time invlists (Karl Williamson, 2012-02-09, 1 file, -0/+3)
| This creates three simple compile-time inversion lists from the data that has been generated in a previous commit, and uses two of them. Three PL_ variables are used to store them.
* Silence compiler warnings (Robin Barker, 2012-02-09, 1 file, -1/+1)
| Cf. RT #110208.
|
| - Remove missing unused variables: op.c, regcomp.c
| - Silence -Wformat type error: sv.c
| - Cast first part of (,) expression as (void): gv.c
* Update, correct and clarify the comment in Perl_sv_setuv(). (Nicholas Clark, 2012-02-01, 1 file, -2/+5)
| See the correspondence on ticket #36459 for more details.
* Stop SvPVutf8 from forcing the POK flag (Father Chrysostomos, 2012-01-31, 1 file, -1/+3)
| It was setting this even on magical variables, causing stringification not to bother calling FETCH, because the POK flag means ‘yes, I’m a bonified [sic] string, with nothing funny going on’.
* Make SvPVbyte return bytes for non-PVs (Father Chrysostomos, 2012-01-31, 1 file, -2/+7)
| Instead of just doing SvPV on something that is not a PV, SvPVbyte should actually do what it is advertised as doing.
* [perl #108994] Stop SvPVutf8 from coercing SVs (Father Chrysostomos, 2012-01-31, 1 file, -1/+3)
| It shouldn’t destroy globs or references passed to it, or try to coerce them if they are read-only or incoercible.
|
| I added tests for SvPVbyte at the same time, even though it was not exhibiting the same problems, as sv_utf8_downgrade doesn’t try to coerce anything. (SvPVbyte has its own set of bugs, which I hope to fix in forthcoming commits.)
* Correctly escape UTF-8 in hash keys in uninitialized warnings (Rafael Garcia-Suarez, 2012-01-25, 1 file, -1/+2)
|
* [perl #108780] Make ‘no overloading’ work with qr// (Father Chrysostomos, 2012-01-24, 1 file, -1/+4)
| Traditionally, overload::StrVal(qr//) has returned Regexp=SCALAR(0xc0ffee), and later Regexp=REGEXP(0xc0c0a) when regexps were made into first-class SVs.
|
| When the overloading pragma was added in 5.10.1, qr// things were not accounted for, so they would still stringify as (?-xism:) even with ‘no overloading’ (or as (?^:) under 5.14).
|
| This commit makes the overloading pragma work with qr// things, so that they stringify the same way as overload::StrVal; i.e., as Regexp=REGEXP(0xbe600d).
* sv.c:sv_utf8_encode: simplify code (Father Chrysostomos, 2012-01-23, 1 file, -4/+1)
| sv_force_normal already croaks for read-only variables
* Don’t allow read-only regexps to be tied (Father Chrysostomos, 2012-01-23, 1 file, -2/+2)
| Since the test triggered another bug in freeing read-only regexps, this commit fixes that too.
* sv_force_normal: Don’t confuse regexps with cows (Father Chrysostomos, 2012-01-22, 1 file, -1/+1)
| Otherwise we get assertion failures and possibly corrupt string tables.
* [perl #82772] utf8::decode: Don’t read past SvCUR (Father Chrysostomos, 2012-01-20, 1 file, -1/+1)
|
* [perl #106726] Don’t crash on length(@arr) warning (Father Chrysostomos, 2012-01-17, 1 file, -2/+4)
| The RT ticket blames this on 676a678ac, but it was actually commit 579333ee9e3. 676a678ac extended this problem to evals (and modules), but it already occurred in the main program.
|
| This crashes:
|
|     ./miniperl -Ilib -we 'sub {length my @forecasts}'
|
| because it is trying to find the variable name for the warning in the CV returned by find_runcv, but this is a *compile-time* warning, so using find_runcv is just wrong. It ends up looking for the array in PL_main_cv’s pad, instead of PL_compcv.
* Provide as much diagnostic information as possible in "panic: ..." messages. (Nicholas Clark, 2012-01-16, 1 file, -3/+6)
| The convention is that when the interpreter dies with an internal error, the message starts "panic: ". Historically, many panic messages had been terse fixed strings, which means that the out-of-range values that triggered the panic are lost. Now we try to report these values, as such panics may not be repeatable, and the original error message may be the only diagnostic we get when we try to find the cause.
|
| We can't report diagnostics when the panic message is generated by something other than croak(), as we don't have *printf-style format strings. Don't attempt to report values in panics related to *printf buffer overflows, as attempting to format the values to strings may repeat or compound the original error.
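As a rough illustration of the difference between a fixed-string panic and one that carries the offending values, here is a standalone sketch. `format_panic`, its parameters and the message wording are invented for this example. The interpreter itself reports such errors through croak() with a printf-style format, which this sketch only imitates with snprintf().

```c
#include <stdio.h>

/* Build a panic message that keeps the values which triggered it.
 * Returns what snprintf() returns: the length the full message needs. */
int format_panic(char *buf, size_t buflen,
                 const char *where, long index, long limit)
{
    return snprintf(buf, buflen,
                    "panic: %s index %ld out of range (limit %ld)",
                    where, index, limit);
}
```

A terse "panic: index out of range" would lose both numbers. Here they survive in the only diagnostic a user may ever be able to send back.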
* stat $ioref should record the handle for -T _ (Father Chrysostomos, 2012-01-13, 1 file, -0/+2)
| stat $gv records the handle so that -T _ can use it. But stat $ioref hasn’t been doing that, until this commit. PL_statgv can now hold an SVt_PVIO instead of a SVt_PVGV.
* Set PL_statgv to null when freed or coerced (Father Chrysostomos, 2012-01-13, 1 file, -0/+4)
| If PL_statgv is not set to null when freed, that same SV could be reused for another GV, in which case -T _ will then use another handle unrelated to the previous stat.
|
| Similarly, if PL_statgv points to a fake glob that gets coerced into a non-glob before it is freed, it will not follow the code path in sv_free that sets PL_statgv to null. Furthermore, if it becomes a GV again, it could be a completely different filehandle, unrelated to the previous stat.
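The underlying hazard is a global "most recent object" pointer that outlives the object it points to. A minimal model of the remedy follows. The `Handle` type and all function names are hypothetical, and `last_stat_handle` merely plays the role of PL_statgv.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Handle { int fd; } Handle;

/* Global "last handle used by stat"; stands in for PL_statgv. */
static Handle *last_stat_handle = NULL;

void remember_stat(Handle *h) { last_stat_handle = h; }
Handle *stat_handle(void)     { return last_stat_handle; }

/* Free a handle, first clearing the global if it refers to this handle,
 * so a later allocation at the same address cannot be mistaken for it. */
void handle_free(Handle *h)
{
    if (h == NULL)
        return;
    if (last_stat_handle == h)
        last_stat_handle = NULL;
    free(h);
}
```

The commit's second paragraph covers a case this sketch does not model. If the object changes type before it is freed, the clearing has to happen at the point of coercion as well, because the free path that clears the global is no longer taken.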
* In Perl_sv_del_backref(), don't panic if svp is NULL during global destruction. (Nicholas Clark, 2012-01-13, 1 file, -3/+12)
| It's possible that the referencing SV is being freed partway through the freeing of the reference target. If this happens, the backreferences array of the target has already been freed, and so svp will be NULL. If this is the case, do nothing and return. Previously, this condition was not recognised and the code would panic.
* In Perl_sv_del_backref(), don't panic if the backref array is already freed. (Nicholas Clark, 2012-01-13, 1 file, -0/+3)
| During global destruction, it's possible for the array containing backreferences to be freed before the SV that owns it. If this happens, don't mistake it for a scalar backreference stored directly, and then get confused and panic because things seem inconsistent.
|
| http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2011-12/msg00039.html gives more information.
* Better panic diagnostics in Perl_sv_del_backref() (Nicholas Clark, 2012-01-13, 1 file, -2/+3)
| If panicking with a croak(), include in the panic message the values which caused the croak. This reveals something about the cause of the panic, and more subtly, which of the two possible panic locations this is.
* [perl #107366] Don’t clone GVs during thread join (Father Chrysostomos, 2012-01-01, 1 file, -0/+21)
| unless they are orphaned.
|
| This commit stops globs that still reside in their stashes from being cloned during a join.
|
| That way, a sub like sub{$::x++}, when cloned into a subthread and returned from it, will still point to the same $::x.
|
| This commit takes the conservative approach of copying on those globs that can be found under their names in the original thread. While this doesn’t work for all cases, it’s probably not possible to make it work all the time.
* [perl #103492] Make %n printf format work with Unicode (Father Chrysostomos, 2011-12-31, 1 file, -1/+1)
| It was using the internal byte count instead of the number of characters. The latter is documented. The former is useless, even for C code calling this, as later arguments could cause the current buffer to be upgraded to utf8, throwing off any offsets returned.
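The gap between the two counts is easy to see in plain C. This sketch assumes the input is valid UTF-8. The helper name `utf8_char_count` is made up here and is not the routine Perl uses internally.

```c
#include <stddef.h>

/* Count characters in a valid UTF-8 buffer by skipping continuation
 * bytes, which always have the bit pattern 10xxxxxx. */
size_t utf8_char_count(const char *s, size_t nbytes)
{
    size_t chars = 0;
    for (size_t i = 0; i < nbytes; i++) {
        if (((unsigned char)s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}
```

For the six bytes of "h\xC3\xA9llo" the function returns 5, because the two-byte sequence encodes a single character. A %n that reported 6 would disagree with the documented character count.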
* diag_listed_as galore (Father Chrysostomos, 2011-12-28, 1 file, -0/+2)
| In two instances, I actually modified the code to avoid %s for a constant string, as it should be faster that way.
* Correct spelling of sv_insert error msg (Father Chrysostomos, 2011-12-28, 1 file, -1/+1)
| It is already documented in perldiag with the right spelling.
* perldiag: Remove ‘in %s’ from bizarre copy msg (Father Chrysostomos, 2011-12-27, 1 file, -0/+1)
| so that splain can find the message even when there is no op mentioned.
* sv.c:dirp_dup: Avoid compiler warning (Father Chrysostomos, 2011-12-25, 1 file, -1/+1)
| Some compilers complain, because -1 is being assigned to an unsigned variable. This variable is not actually used before being assigned to, but we have to initialise it as some other compilers cannot detect that.
* Add diag_listed_as for non-numeric warnings (Father Chrysostomos, 2011-12-25, 1 file, -0/+2)
|
* sv.c: consistent spaces after dots in apidocs (Father Chrysostomos, 2011-12-23, 1 file, -7/+9)
|
* Don’t clobber all magic when clobbering vstring (Father Chrysostomos, 2011-12-23, 1 file, -1/+1)
| This code in sv_setsv, introduced in ece467f9b3, can’t possibly be right:
|
|     if ( SvVOK(dstr) ) {
|         /* need to nuke the magic */
|         mg_free(dstr);
|     }
|
| And here is a test to prove it:
|
|     sub TIESCALAR { bless[]}
|     sub STORE {}
|     tie $@, "";
|     $@ = v0;
|     warn tied $@; # main=ARRAY(0xc0ffee)
|     $@ = 3;
|     warn tied $@; # something’s wrong
|
| It blows away tiedness. You could do that to any variable. Let’s see:
|
|     $! = v0;
|     $! = 3;
|     open foo, 'oentuhaeontu' or die $!; # 3 at - line 3.
|
| Youch! Let’s just free vstring magic, shall we?
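The distinction between freeing every magic entry and freeing only one kind can be modelled with a small linked list. Everything in this sketch is illustrative: the `Magic` struct, the function names, and the use of 'V' and 'q' as type tags for "vstring" and "tie" entries are assumptions for the example, not Perl's MAGIC structure or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Magic {
    char type;             /* illustrative tag: 'V' = vstring, 'q' = tie */
    struct Magic *next;
} Magic;

/* Prepend a new entry and return the new head of the chain. */
Magic *magic_add(Magic *head, char type)
{
    Magic *mg = malloc(sizeof *mg);
    assert(mg != NULL);
    mg->type = type;
    mg->next = head;
    return mg;
}

/* Unlink and free only the entries of one type; return the new head. */
Magic *magic_remove_type(Magic *head, char type)
{
    Magic **link = &head;
    while (*link != NULL) {
        Magic *mg = *link;
        if (mg->type == type) {
            *link = mg->next;
            free(mg);
        } else {
            link = &mg->next;
        }
    }
    return head;
}

/* Return 1 if the chain holds an entry of the given type, else 0. */
int magic_has(const Magic *head, char type)
{
    for (; head != NULL; head = head->next)
        if (head->type == type)
            return 1;
    return 0;
}
```

Removing by type leaves the tie entry in place, which is the behaviour the commit message asks for. Freeing the whole chain would be the analogue of the mg_free() call it criticises.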
* Avoid an unused temp scalar in sv.c:S_sv_unglob (Father Chrysostomos, 2011-12-17, 1 file, -1/+1)
| This is something I missed in 804b5ed7.
* [perl #97988] Nullify PL_last_in_gv when unglobbed (Father Chrysostomos, 2011-12-17, 1 file, -0/+4)
| Code like this can cause PL_last_in_gv to point to a coercible glob:
|
|     $f{g} = *STDOUT;
|     readline $f{g};
|
| If $f{g} is then modified such that it is no longer a glob, PL_last_in_gv ends up pointing to a non-glob:
|
|     $f{g} = 3;
|
| If $f{g} is freed now, the PL_last_in_gv-nulling code in sv_clear will be skipped, as it only applies to globs.
|
|     undef %f; # now PL_last_in_gv points to a freed scalar
|
| The resulting freed scalar can be reused by another handle,
|
|     *{"foom"} = *other;
|
| causing tell() with no arguments to return the position on *other, even though *other was not the last handle read from.
|
| This commit fixes it by nulling PL_last_in_gv when a coercible glob is coerced.
* perldiag: Retitle ‘Cannot copy’ (Father Chrysostomos, 2011-12-16, 1 file, -0/+1)
| Without the ‘in %s’ it covers both forms of the error message.
* Use syntax from perlguts for testing objects (John Peacock, 2011-12-09, 1 file, -1/+1)
| The following paragraph is in perlguts.pod:
|
|     To check if you've got an object derived from a specific class you have
|     to write:
|
|         if (sv_isobject(sv) && sv_derived_from(sv, class)) { ... }
|
| which does the right thing with magical things like tied scalars.
|
| Signed-off-by: David Golden <dagolden@cpan.org>
* Clarify docs for sv_usepvn_flags (Father Chrysostomos, 2011-12-04, 1 file, -1/+3)
| Note that the string must be the start of a mallocked block of memory, and not a pointer to the middle of it.
* Remove SvTAINT from sv_sethek (Father Chrysostomos, 2011-12-03, 1 file, -4/+0)
| This was copied from sv_usepvn_flags in commit 58b643af9. It is unnecessary, and probably incorrect, as heks are not tainted.
|
| Why sv_sethek used sv_usepvn_flags to begin with I don’t know, but I imagine it was for brevity’s sake. This code was ultimately derived from newSVhek, which doesn’t use sv_usepvn_flags. Because of that, and because it is now far enough removed from sv_usepvn_flags, I have removed the comment referring to it.
* Stop calling sv_usepvn_flags from sv_sethek (Peter Martini, 2011-12-03, 1 file, -1/+7)
| sv_usepvn_flags assumes that ptr is at the head of a block of memory allocated by malloc. If perl's malloc is in use, the data structures malloc uses and the data allocated for perl are intermixed, and accounting done by malloced_size in sv_usepvn_flags will overwrite valid memory if it's called on an address that is not the start of a malloced block.
|
| The actual work being accomplished by sv_usepvn_flags, and not undone immediately after by sv_sethek, is limited to 3 calls on the SV. Inlining those calls removes the dependency on malloc.
|
| This fixes perl #104034.
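The ownership rule at stake (a buffer may only be adopted if the pointer is the start of its allocation) can be shown with a toy string holder. This sketch does not reproduce the commit's fix, which inlines three calls on the SV. It swaps in a plain copy to show the safe way to take bytes from the middle of a block. The `Str` type and the `str_*` functions are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Str { char *buf; size_t len; } Str;

/* Adopt a buffer: only valid when ptr is the START of a malloc'd block,
 * because the holder will later pass this exact pointer to free(). */
void str_adopt(Str *s, char *ptr, size_t len)
{
    free(s->buf);
    s->buf = ptr;
    s->len = len;
}

/* Copy bytes: safe for any readable pointer, including one that points
 * into the middle of somebody else's allocation. Returns 0 on success. */
int str_set_copy(Str *s, const char *src, size_t len)
{
    char *fresh = malloc(len + 1);
    if (fresh == NULL)
        return -1;
    memcpy(fresh, src, len);
    fresh[len] = '\0';
    free(s->buf);          /* done after the copy in case src aliases buf */
    s->buf = fresh;
    s->len = len;
    return 0;
}

/* Release whatever buffer the holder currently owns. */
void str_release(Str *s)
{
    free(s->buf);
    s->buf = NULL;
    s->len = 0;
}
```

Calling `str_adopt(&s, block + 4, 7)` on a pointer into the middle of `block` would eventually hand free() an address the allocator never returned. That is the same class of error the commit message describes for sv_usepvn_flags.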
* Allow COW PVMGs to be tied (Father Chrysostomos, 2011-12-01, 1 file, -3/+2)
| This logic in sv_magic is wrong:
|
|     if (SvREADONLY(sv)) {
|         if (
|             /* its okay to attach magic to shared strings; the subsequent
|              * upgrade to PVMG will unshare the string */
|             !(SvFAKE(sv) && SvTYPE(sv) < SVt_PVMG)
|             && IN_PERL_RUNTIME
|             && !PERL_MAGIC_TYPE_READONLY_ACCEPTABLE(how)
|            )
|         {
|             Perl_croak_no_modify(aTHX);
|         }
|     }
|
| There is nothing wrong with attaching magic to a shared string that will stay shared. Also, shared strings are not always < SVt_PVMG. Sometimes a PVMG or PVLV can end up with a shared string. In those cases, the logic above treats them as read-only, which they ain’t.
|
| The easiest example is a downgraded typeglob:
|
|     $x = *foo;        # now a PVGV
|     undef $x ;        # downgraded to PVMG
|     $x = __PACKAGE__; # now a shared string (COW)
|     tie $x, "main";   # bang! $x is considered read-only
|     sub main::TIESCALAR{bless[]}
* Use SvOOK_on (Father Chrysostomos, 2011-12-01, 1 file, -3/+2)
| Now that SvOOK_on has a usable definition (i.e., it leaves the NIOK flags alone), we can use it and remove the comments warning against it.
* sv.c: fix comment typo added by ce2077b184 (Father Chrysostomos, 2011-11-30, 1 file, -1/+1)
|
* sv.c/find_uninit_var: Explain kid-scanning loop better (Father Chrysostomos, 2011-11-29, 1 file, -2/+7)
| Hopefully this explanation will be clearer and will prevent clumsy individuals like me from introducing bugs like #103766.