delta/perl.git - github.com: perl/perl5.git

	Commit message	Author	Age	Files	Lines
*	Fix -Wformat-security issues	Niko Tyni	2013-05-10	1	-3/+3
	Building with -Accflags="-Wformat -Werror=format-security" triggers format string warnings from gcc. As gcc can't tell that all the strings are constant here, explicitly pass separate format strings to make it happy.
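The fix pattern this message describes can be shown in isolation. The following is a minimal sketch, not code from the Perl source: `emit_message` is a hypothetical helper, and it only illustrates passing a non-literal string as an argument to a constant `"%s"` format instead of using it as the format itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a Perl API): copies msg into out.
 * Writing snprintf(out, outsz, msg) would trip -Wformat-security and
 * would also interpret any '%' inside msg as a conversion. */
int emit_message(char *out, size_t outsz, const char *msg)
{
    assert(out != NULL && msg != NULL);
    /* The format is a literal; msg is only ever treated as data. */
    return snprintf(out, outsz, "%s", msg);
}
```

With this shape, a message such as `"100%d done"` is copied through unchanged, because the `%d` is never seen by the formatter as a directive.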
*	Remove PERL_ASYNC_CHECK() from Perl_leave_scope().	Nicholas Clark	2013-05-09	1	-1/+10
	PERL_ASYNC_CHECK() was added to Perl_leave_scope() as part of commit f410a2119920dd04, which moved signal dispatch from the runloop to control flow ops, to mitigate nearly all of the speed cost of safe signals. The assumption was that scope exit was a safe place to dispatch signals. However, this is not true, as parts of the regex engine call leave_scope(), the regex engine stores some state in per-interpreter variables, and code called within signal handlers can change these values. Hence remove the call to PERL_ASYNC_CHECK() from Perl_leave_scope(), and add it explicitly in the various OPs which were relying on their call to leave_scope() to dispatch any pending signals. Also add a PERL_ASYNC_CHECK() to the exit of the runloop, which ensures signals still dispatch from S_sortcv() and S_sortcv_stacked(), as well as addressing one of the concerns in the commit message of f410a2119920dd04: "Subtle bugs might remain - there might be constructions that enter the runloop (where signals used to be dispatched) but don't contain any PERL_ASYNC_CHECK() calls themselves." Finally, move the PERL_ASYNC_CHECK(); added by that commit to pp_goto to the end of the function, to be consistent with the positioning of all other PERL_ASYNC_CHECK() calls - at the beginning or end of OP functions, hence just before the return to or just after the call from the runloop, and hence effectively at the same point as the previous location of PERL_ASYNC_CHECK() in the runloop.
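The idea of deferring signal dispatch to known-safe points can be sketched outside the interpreter. This is an illustrative model only: `record_signal`, `async_check`, `install_deferred_handler`, `pending_signal` and `dispatched_count` are hypothetical names, and the sketch does not reproduce how PERL_ASYNC_CHECK() is actually implemented.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical state for the sketch. */
volatile sig_atomic_t pending_signal = 0;
int dispatched_count = 0;

/* The OS-level handler only records that a signal arrived. */
void record_signal(int signo)
{
    pending_signal = signo;
}

/* Called only where interpreter state is known to be consistent,
 * e.g. at the start or end of an op, never from scope teardown. */
void async_check(void)
{
    if (pending_signal) {
        pending_signal = 0;
        dispatched_count++;   /* stand-in for running the user's handler */
    }
}

void install_deferred_handler(void)
{
    signal(SIGINT, record_signal);
}
```

The point of the design is that the work done in response to a signal happens only inside `async_check()`, so choosing where that call sits decides which internal state a handler can observe or disturb.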
*	silence warnings under NO_TAINT_SUPPORT	David Mitchell	2013-05-09	1	-0/+3
	There are lots of places where local vars aren't used when compiled with NO_TAINT_SUPPORT.
*	make qr/(?{ __SUB__ })/ safe	David Mitchell	2013-04-24	1	-0/+2
	(See RT #113928) Formerly, __SUB__ within a code block within a qr// returned a pointer to the "hidden" anon CV that implements the qr// closure. Since this was never designed to be called directly, it would SEGV if you tried. The easiest way to make this safe is to skip any CXt_SUB frames that are marked as CXp_SUB_RE: i.e. skip any subs that are there just to execute code blocks. For a qr//, this means that we return the sub which the pattern match is embedded in. Also, document the behaviour of __SUB__ within code blocks as being subject to change. It could be argued for example that in these cases it should return undef. But with the 5.18.0 release a month or two away, just make it safe for now, and revisit the semantics later if necessary.
*	fix caller with re_evals.	David Mitchell	2013-04-24	1	-1/+7
	(See RT #113928) In code like sub foo { /A(?{ bar; caller(); })B/; } the regex /A(?{B})C/ is, from a scope point of view, supposed to be compiled and executed as: /A/ && do { B } && /C/; i.e. the code block in B is part of the same sub as the code surrounding the regex. Thus the result of caller() above should see the caller as whoever called foo. Due to an implementation detail, we actually push a hidden extra sub CX before calling the pattern. This detail was leaking when caller() was used. Fix it so that it ignores this extra context frame. Conversely, for a qr//, that is supposed to be seen as an extra level of anonymous sub, so add tests to ensure that is so. i.e. $r = qr/...(?{code}).../; /...$r.../ is supposed to behave like $r = sub { code }; $r->();
*	move Perl_ck_warner() before unwind [perl #113794]	Zefram	2013-04-24	1	-6/+6
	Indeed. The Perl_ck_warner() call in die_unwind() used to happen before unwinding, so would be affected by the lexical warning state at the die() site. Now it happens after unwinding, so takes the lexical warning state at the catching site. I don't have a clear idea of which behaviour is more correct. t/op/die_keeperr.t, which was introduced as part of my exception handling changes, is actually testing for the catching-site criterion, but that's not asserting that the criterion should be that. The documentation speaks of "no warnings 'misc'", but doesn't say which lexical scope matters. Assuming we want to revert this change, the easy fix is to move the conditional Perl_ck_warner() back to before unwinding. A more difficult way would be to determine the disposition of the warning before unwinding and then warn in the required manner after unwinding. I see no compelling reason to warn after unwinding rather than before, so just moving the warning code should be fine. Note from the committer: This patch was supplied by Zefram in https://rt.perl.org/rt3/Ticket/Display.html?id=113794#txn-1204749 with a note that some extra work was required for ext/XS-APItest/t/call.t before the job was done. Ricardo Signes applied this patch and followed Zefram's lead in patching ext/XS-APItest/t/call.t without being 100% certain that this was what was meant. This commit was then submitted for review.
*	fix runtime /(?{})/ with overload::constant qr	David Mitchell	2013-04-12	1	-0/+9
	There are two issues fixed here. First, when a pattern has a run-time code-block included, such as $code = '(?{...})'; /foo$code/ the mechanism used to parse those run-time blocks: of feeding the resultant pattern into a call to eval_sv() with the string qr'foo(?{...})' and then extracting out any resulting opcode trees from the returned qr object -- suffered from the re-parsed qr'..' also being subject to overload::constant qr processing, which could result in Bad Things happening. Since we now have the PL_parser->lex_re_reparsing flag in scope throughout the parsing of the pattern, this is easy to detect and avoid. The second issue is a mechanism to avoid recursion when getting false positives in S_has_runtime_code() for code like '[(?{})]'. For patterns like this, we would suspect that the pattern may have code (even though it doesn't), so feed it into qr'...' and reparse, and again it looks like runtime code, so feed it in, rinse and repeat. The thing to stop recursion was when we saw a qr with a single OP_CONST string, we assumed it couldn't have any run-time component, and thus no run-time code blocks. However, this broke qr/foo/ in the presence of overload::constant qr overloading, which could convert foo into a string containing code blocks. The fix for this is to change the recursion-avoidance mechanism (in a way which turns out to be simpler too). Basically, when we fake up a qr'...' and eval it, we turn off any 'use re eval' in scope: it's not needed, since we know the .... will be a constant string without any overloading. Then we use the lack of 'use re eval' in scope to skip calling S_has_runtime_code() and just assume that the code has no run-time patterns (if it has, then eventually the regex parser will rightly complain about 'Eval-group not allowed at runtime').
This commit also adds some fairly comprehensive tests for this.
*	Eliminate PL_reg_state.re_reparsing, part 1	David Mitchell	2013-04-12	1	-1/+3
	PL_reg_state.re_reparsing is a hacky flag used to allow runtime code blocks to be included in patterns. Basically, since code blocks are now handled by the perl parser within literal patterns, runtime patterns are handled by taking the (assembled at runtime) pattern, and feeding it back through the parser via the equivalent of eval q{qr'the_pattern'}, so that run-time (?{..})'s appear to be literal code blocks. When this happens, the global flag PL_reg_state.re_reparsing is set, which modifies lexing and parsing in minor ways (such as whether \\ is stripped). Now, I'm in the slow process of trying to eliminate global regex state (i.e. gradually removing the fields of PL_reg_state), and also a change which will be coming a few commits ahead requires the info which this flag indicates to linger for longer (currently it is cleared immediately after the call to scan_str()). For those two reasons, this commit adds a new mechanism to indicate this: a new flag to eval_sv(), G_RE_REPARSING (which sets OPpEVAL_RE_REPARSING in the entereval op), which sets the EVAL_RE_REPARSING bit in PL_in_eval. It's still a yukky global flag hack, but it's a different global flag hack now. For this commit, we add the new flag(s) but keep the old PL_reg_state.re_reparsing flag and assert that the two mechanisms always match. The next commit will remove re_reparsing.
*	rework split() special case interaction with regex engine	Yves Orton	2013-03-27	1	-2/+17
	This patch resolves several issues at once. The parts are sufficiently interconnected that it is hard to break it down into smaller commits. The tickets open for these issues are: RT #94490 (split and constant folding) and RT #116086 (split "\x20" doesn't work as documented). It additionally corrects some issues with cached regexes that were exposed by the split changes (and applied to them). It effectively reverts 5255171e6cd0accee6f76ea2980e32b3b5b8e171 and cccd1425414e6518c1fc8b7bcaccfb119320c513. Prior to this patch the special RXf_SKIPWHITE behavior of split(" ", $thing) was only available if Perl could resolve the first argument to split at compile time, which failed in various arcane situations. This manifested as oddities like my $delim = $cond ? " " : qr/\s+/; split $delim, $string; and split $cond ? " " : qr/\s+/, $string not behaving the same as: ($cond ? split(" ", $string) : split(/\s+/, $string)) which isn't very convenient. This patch changes this by adding a new flag to the op_pmflags, PMf_SPLIT, which enables pp_regcomp() to know whether it was called as part of split, which allows the RXf_SPLIT to be passed into run time regex compilation. We also preserve the original flags so pattern caching works properly, by adding a new property to the regexp structure, "compflags", and related macros for accessing it. We preserve the original flags passed into the compilation process, so we can compare when we are trying to decide if we need to recompile. Note that this is essentially the opposite fix from the one applied originally to fix #94490 in 5255171e6cd0accee6f76ea2980e32b3b5b8e171.
The reverted patch was meant to make: split( 0 || " ", $thing ) #1 consistent with my $x=0; split( $x || " ", $thing ) #2 and not with split( " ", $thing ) #3. This was reverted because it broke C<split("\x{20}", $thing)>, and because one might argue that it is not that #1 does the wrong thing, but rather that the behavior of #2 is wrong. In other words we might expect that all three should behave the same as #3, and that instead of "fixing" the behavior of #1 to be like #2, we should really fix the behavior of #2 to behave like #3. (Which is what we did.) Also, it doesn't make sense to move the special case detection logic further from the regex engine. We really want the regex engine to decide this stuff itself, otherwise split " ", ... wouldn't work properly with an alternate engine. (Imagine we add a special regexp meta pattern that behaves the same as " " does in a split /.../. For instance we might make split /(*SPLITWHITE)/ trigger the same behavior as split " ".) The other major change as a result of this patch is it effectively reverts commit cccd1425414e6518c1fc8b7bcaccfb119320c513, which was intended to get rid of RXf_SPLIT and RXf_SKIPWHITE, and free up bits in the regex flags structure. But we don't want to get rid of these vars, and it turns out that RXf_SEEN_LOOKBEHIND is used only in the same situation as the new RXf_MODIFIES_VARS. So I have renamed RXf_SEEN_LOOKBEHIND to RXf_NO_INPLACE_SUBST, and then instead of using two vars we use only the one. Which in turn allows RXf_SPLIT and RXf_SKIPWHITE to have their bits back.
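The user-visible difference between split " " and split /\s+/ that this message relies on can be modelled with a small field counter. This is a sketch under stated assumptions: `count_fields` is a hypothetical function that only mimics the documented results of split (leading whitespace skipped for " ", a leading empty field kept for /\s+/, trailing empty fields dropped in both cases). It says nothing about how RXf_SKIPWHITE is implemented in the regex engine.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical model of the number of fields split would return.
 * skipwhite != 0 models split " ", skipwhite == 0 models split /\s+/. */
size_t count_fields(const char *s, int skipwhite)
{
    size_t fields = 0;
    int in_word = 0;
    const char *p;

    assert(s != NULL);

    /* Each run of non-whitespace characters is one non-empty field. */
    for (p = s; *p; p++) {
        if (isspace((unsigned char)*p)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            fields++;
        }
    }

    /* Without the special case, leading whitespace yields one empty
     * leading field, provided a non-empty field follows it. */
    if (!skipwhite && fields > 0 && isspace((unsigned char)s[0]))
        fields++;

    return fields;
}
```

For the input `"  a  b "` the model gives 2 fields in the " " case and 3 in the /\s+/ case, which is why it matters that a run-time " " is still recognised as the special form.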
*	Extremely minor pp_goto optimization	Steffen Mueller	2013-03-06	1	-6/+6
	Makes use of the fact that the exception case is both rare and okay to be penalized (a tiny bit) instead of the common case doing an extra branch.
*	RT-116192 - If a directory in @INC already has a trailing '/', don't add another.	Matthew Horsfall (alh)	2013-02-10	1	-1/+6
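The behaviour named in this commit title can be sketched as a small path-joining helper. This is a minimal illustration, assuming a hypothetical function `join_inc_path`; it is not the code that pp_require uses.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: joins an @INC-style directory and a file name,
 * adding a '/' only when the directory does not already end in one.
 * Returns what snprintf returns (the length the result needs). */
int join_inc_path(char *out, size_t outsz, const char *dir, const char *name)
{
    size_t dirlen;
    int has_slash;

    assert(out != NULL && dir != NULL && name != NULL);

    dirlen = strlen(dir);
    has_slash = dirlen > 0 && dir[dirlen - 1] == '/';
    return snprintf(out, outsz, "%s%s%s", dir, has_slash ? "" : "/", name);
}
```

Both `"lib"` and `"lib/"` then produce `lib/Foo.pm` for the file `Foo.pm`, rather than `lib//Foo.pm` in the second case.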
*	pp_ctl.c: Silence compiler warning.	Karl Williamson	2013-01-06	1	-1/+1
	A compiler on one of our smoke platforms wants empty braces here.
*	use PERL_UNUSED_VAR rather than PERL_UNUSED_DECL	David Mitchell	2012-12-17	1	-2/+5
	PERL_UNUSED_DECL doesn't do anything under g++, so doing this silences some g++ warnings.
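The general technique of marking a variable as used with a statement, instead of annotating its declaration, can be sketched as follows. `SKETCH_UNUSED_VAR` and `count_calls` are hypothetical names, and the cast-to-void form shown is one common way to write such a macro; it is an assumption that it resembles what PERL_UNUSED_VAR expands to.

```c
#include <assert.h>

/* Hypothetical macro: evaluating the variable in a void context counts
 * as a use, so unused-variable warnings go away without relying on a
 * compiler-specific declaration attribute. */
#define SKETCH_UNUSED_VAR(x) ((void)(x))

int count_calls(int verbose)
{
    static int calls = 0;

    SKETCH_UNUSED_VAR(verbose);   /* kept in the signature, not read here */
    return ++calls;
}
```

Because the macro is an ordinary expression statement, it behaves the same in C and C++ compilers, which is the property the message cares about.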
*	Silence some g++ compiler warnings	Karl Williamson	2012-12-09	1	-2/+2
	Changing these slightly got rid of the warnings like: toke.c:9168: warning: format not a string literal and no format arguments
*	pp_goto: Call get-magic before choosing goto type	Father Chrysostomos	2012-12-05	1	-1/+2
	Deciding whether this is goto-label or goto-sub can only correctly happen after get-magic has been invoked, as get-magic can cause the argument to begin or cease to be a subroutine reference.
*	New COW mechanism	Father Chrysostomos	2012-11-27	1	-5/+5
	This was discussed in ticket #114820. This new copy-on-write mechanism stores a reference count for the PV inside the PV itself, at the very end. (I was using SvEND+1 at first, but parts of the regexp engine expect to be able to do SvCUR_set(sv,0), which causes the wrong byte of the string to be used as the reference count.) Only 256 SVs can share the same PV this way. Also, only strings with allocated space after the trailing null can be used for copy-on-write. Much of the code is shared with PERL_OLD_COPY_ON_WRITE. The restriction against doing copy-on-write with magical variables has hence been inherited, though it is not necessary. A future commit will take care of that. I had to modify _core_swash_init to handle $@ differently. The existing mechanism of copying $@ to a new scalar and back again was very fragile. With copy-on-write, $@ =~ s/// can cause pp_subst’s string pointers to become stale. So now we remove the scalar from *@ and allow the utf8-table-loading code to autovivify a new one. Then we restore the untouched $@ afterwards if all goes well.
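The layout this message describes, a one-byte count kept in the last allocated byte of the string buffer, can be sketched with a toy structure. Everything here is hypothetical (`shared_str`, `COW_COUNT`, `cow_init`, `cow_share`), and the encoding of the byte as "number of additional sharers" is an assumption chosen so that 256 owners is the limit; the real macros and encoding in the Perl source may differ.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: [string bytes][NUL][count byte].
 * The count lives in the very last allocated byte, so changing the
 * string length (cur) never moves it. */
typedef struct {
    char  *buf;   /* possibly shared buffer */
    size_t cur;   /* string length, excluding the NUL */
    size_t len;   /* allocated size, including the count byte */
} shared_str;

#define COW_COUNT(s) (((unsigned char *)(s)->buf)[(s)->len - 1])

/* Allocates a buffer with one spare byte after the NUL for the count. */
int cow_init(shared_str *s, const char *text)
{
    s->cur = strlen(text);
    s->len = s->cur + 2;
    s->buf = malloc(s->len);
    if (!s->buf)
        return 0;
    memcpy(s->buf, text, s->cur + 1);
    COW_COUNT(s) = 0;             /* assumed meaning: no extra sharers */
    return 1;
}

/* Makes dst share src's buffer; refuses once the byte is saturated. */
int cow_share(shared_str *dst, const shared_str *src)
{
    if (COW_COUNT(src) == 255)
        return 0;                 /* caller would fall back to copying */
    *dst = *src;
    COW_COUNT(dst)++;             /* same byte as seen through src */
    return 1;
}
```

Keeping the count at `buf[len - 1]` rather than just past the current end of the string is what makes it robust against code that resets the length to zero, the failure mode mentioned for SvEND+1.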
*	prevent multiple evaluations of ERRSV	Daniel Dragan	2012-11-23	1	-6/+10
	Remove a large amount of machine code (~4KB for me) from funcs that use ERRSV, making Perl faster and smaller by preventing multiple evaluation. ERRSV is a macro that contains GvSVn, which eventually conditionally calls Perl_gv_add_by_type. If a SvTRUE or any other multiple evaluation macro is used on ERRSV, the expansion will, in asm, have dozens of calls to Perl_gv_add_by_type, one for each test/deref of the SV in SvTRUE. A less severe problem exists when multiple funcs (sv_set) in a row are called, each with ERRSV as an arg. It's recalculated each time, Perl_gv_add_by_type and all. I think the ERRSV macro got the func call in commit f5fa9033b8, Perl RT #70862. Prior to that commit it would be pure derefs I think. Saving the SV is still better than looking into interp->gv->gp to get the SV * after each func call. I received no responses to http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195724.html explaining when the SV is replaced in PL_errgv, so took a conservative view and assumed callbacks (with Perl stack/ENTER/LEAVE/eval_/call_) can change it. I also assume ERRSV will never return null; this allows a more efficient version of SvTRUE to be used. In Perl_newATTRSUB_flags a wasteful copy to C stack operation with the string was removed, and a croak_notcontext to remove push instructions to the stack. I was not sure about the interaction between ERRSV and message sv, so I didn't change it to a more efficient (instruction wise, speed, idk) format string combining of the not safe string and ERRSV in the croak call.
If such an optimization is done, a compiler potentially will put the not safe string on the first, unconditionally, then check PL_in_eval, and then jump to the croak call site, or eval ERRSV, push the SV on the C stack then push the format string "%"SVf"%s". The C stack allocated const char array came from commit e1ec3a884f. In Perl_eval_pv, croak_on_error was checked first to not eval ERRSV unless necessary. I was not sure about the side effects of using a more efficient croak_sv instead of Perl_croak (null chars, utf8, etc) so I left a comment. nocontext used to save a push instruction on implicit sys perl. In S_doeval, don't open a new block to avoid large whitespace changes. The NULL assignment should optimize away unless accidental usage of errsv in the future happens through a code change. There might be a bug here from commit ecad31f018 since previously a char * was derefed to check for null char, but ERRSV will never be null, so "Unknown error\n" branch will never be taken. For pp_sys.c, in pp_die a new block was opened to not eval ERRSV if "well-formed exception supplied". The else if else if else blocks all used ERRSV, so a "SV * errsv = NULL;" and an eval in the conditional with comma op thing wouldn't work (maybe it would, see toke.c comments later in this message). pp_warn, I have no comments. In S_compile_runtime_code, a croak_sv question comes up same as in Perl_eval_pv. In S_new_constant, an eval in the conditional is done to avoid evaling ERRSV if PL_in_eval short circuits. Same thing in Perl_yyerror_pvn. Perl__core_swash_init I have no comments. In the future, a SvEMPTYSTRING macro should be considered (not fully thought out by me) to replace the SvTRUEs with something smaller and faster when dealing with ERRSV. _nomg is another thing to think about. In S_init_main_stash there is an opportunity to prevent an extra ERRSV between "sv_grow(ERRSV, 240);" and "CLEAR_ERRSV();" that was too complicated for me to optimize.
before: perl517.dll .text 0xc2f77 .rdata 0x212dc .data 0x3948
after: perl517.dll .text 0xc20d7 .rdata 0x212dc .data 0x3948
Numbers are from VC 2003 x86 32 bit.
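The multiple-evaluation problem this message targets is a general property of C macros, and can be shown with a counter. The names below (`sketch_sv`, `get_errsv`, `SKETCH_TRUE`, `lookup_calls`) are hypothetical stand-ins, not the real ERRSV or SvTRUE definitions.

```c
#include <assert.h>

typedef struct { int defined; int value; } sketch_sv;

int lookup_calls = 0;
sketch_sv the_errsv = { 1, 42 };

/* Stand-in for an accessor with a hidden function call inside it. */
sketch_sv *get_errsv(void)
{
    lookup_calls++;
    return &the_errsv;
}

/* A multiple-evaluation macro: its argument is expanded twice. */
#define SKETCH_TRUE(sv) ((sv)->defined && (sv)->value != 0)

/* The accessor call is pasted into the macro, so it runs twice here. */
int is_true_naive(void)
{
    return SKETCH_TRUE(get_errsv());
}

/* Saving the pointer first means the accessor runs exactly once. */
int is_true_cached(void)
{
    sketch_sv *errsv = get_errsv();
    return SKETCH_TRUE(errsv);
}
```

In the naive form every textual occurrence of the argument becomes another call in the generated code, which is where the size reduction reported in the numbers above comes from once the value is cached in a local.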
*	[perl #115742] Push a new pad for recursive DB::DB	Father Chrysostomos	2012-11-15	1	-1/+5
	When invoking the debugger recursively, pp_dbstate needs to push a new pad (like pp_entersub) so that DB::DB doesn’t stomp on the lexical variables belonging to the outer call.
*	Silence two build warnings on systems where ivsize > ptrsize.	Eric Brine (via RT)	2012-11-13	1	-1/+1
	[perl #115710] <URL: https://rt.perl.org:443/rt3/Ticket/Display.html?id=115710 > This is a bug report for perl from ikegami@adaelis.com, generated with the help of perlbug 1.39 running under perl 5.14.2. Attached patch silences two build warnings on systems where ivsize > ptrsize. They are safe to ignore, a side-effect of a function with a polymorphic interface. cv = find_runcv_where(FIND_RUNCV_level_eq, iv, NULL); cv = find_runcv_where(FIND_RUNCV_padid_eq, PTR2IV(p), NULL); // p is a PADNAMELIST*
*	Prune dead code in pp_ctl.c:pp_goto	Father Chrysostomos	2012-11-13	1	-5/+0
	We croak if CxTYPE(cx) == CXt_EVAL before reaching the code in question.
*	Stop goto &sub from leaking when it croaks	Father Chrysostomos	2012-11-13	1	-0/+7
*	pv->pvn for literals in pp_require and Perl_sv_derived_from_pvn	Daniel Dragan	2012-11-12	1	-1/+1
	I found these 2 strlens while stepping through the interp while running a script and both came from a pp_require. UNIVERSAL::can was not modified since it is more rarely called than pp_require. A better, more thorough investigation of version obj comparison and upgrading will need to be done in the future (new funcs needed for the derived/upg_version idiom, remove the upg_version since it was changed to always be a ver obj, etc).
*	[perl #43077] Make goto &sub leave @_ alone	Father Chrysostomos	2012-11-11	1	-55/+41
	It is a little tricky, as we have to hang on to @_ while unwinding the effects of local @_.
*	Add C define to remove taint support from perl	Steffen Mueller	2012-11-05	1	-9/+11
	By defining NO_TAINT_SUPPORT, all the various checks that perl does for tainting become no-ops. It's not an entirely complete change: it doesn't attempt to remove the taint-related interpreter variables, but instead virtually eliminates access to them. Why, you ask? Because it appears to speed up perl's run-time significantly by avoiding various "are we running under taint" checks and the like. This change is not in a state to go into blead yet. The actual way I implemented it might raise some (valid) objections. Basically, I replaced all uses of the global taint variables (but not PL_taint_warn!) with an extra layer of get/set macros (TAINT_get/TAINTING_get). Furthermore, the change is not complete: - PL_taint_warn would likely deserve the same treatment. - Obviously, tests fail. We have tests for -t/-T - Right now, I added a Perl warn() on startup when -t/-T are detected but the perl was not compiled to support it. It might be argued that it should be silently ignored! Needs some thinking. - Code quality concerns - needs review. - Configure support required. - Needs thinking: How does this tie in with CPAN XS modules that use PL_taint and friends? It's easy to backport the new macros via PPPort, but that doesn't magically change all code out there. Might be harmless, though, because whenever you're running under NO_TAINT_SUPPORT, any check of PL_taint/etc is going to come up false. Thus, the only CPAN code that SHOULD be adversely affected is code that changes taint state.
*	Stop require nonexistent::module from leaking	Father Chrysostomos	2012-11-04	1	-1/+1
	This leak was caused by v5.17.4-125-gf7ee53b.
*	Stop string eval from leaking ops	Father Chrysostomos	2012-11-02	1	-1/+0
	This was leaking: $ ./miniperl -Xe 'warn $$; while(1){eval "ok 8"};' 1915 at -e line 1. ^C This was not: $ ./miniperl -Xe 'warn $$; while(1){eval "sub {ok 8}"};' 1916 at -e line 1. ^C The sub is successfully taking care of its ops when it is freed. The eval is not. I made the mistake of having the CV relinquish ownership of the op slab after an eval syntax error. That’s precisely the situation in which the ops are likely to leak, and for which the slab allocator was designed. Duh.
*	Allow regexp-to-pvlv assignment	Father Chrysostomos	2012-10-30	1	-1/+1
	Since the xpvlv and regexp structs conflict, we have to find somewhere else to put the regexp struct. I was going to sneak it in SvPVX, allocating a buffer large enough to fit the regexp struct followed by the string, and have SvPVX - sizeof(regexp) point to the struct. But that would make all regexp flag-checking macros fatter, and those are used in hot code. So I came up with another method. Regexp stringification is not speed-critical. So we can move the regexp stringification out of re->sv_u and put it in the regexp struct. Then the regexp struct itself can be pointed to by re->sv_u. So SVt_REGEXPs will have re->sv_any and re->sv_u pointing to the same spot. PVLVs can then have sv->sv_any point to the xpvlv body as usual, but have sv->sv_u point to a regexp struct. All regexp member access can go through sv_u instead of sv_any, which will be no slower than before. Regular expressions will no longer be SvPOK, so we give sv_2pv special logic for regexps. We don’t need to make the regexp struct larger, as SvLEN is currently always 0 iff mother_re is set. So we can replace the SvLEN field with the pv. SvFAKE is never used without SvPOK or SvSCREAM also set. So we can use that to identify regexps.
*	Used pad name lists for pad ids	Father Chrysostomos	2012-10-16	1	-1/+2
	I added pad IDs so that a pad could record which pad it closes over, to avoid problems with closures closing over the wrong pad, resulting in crashes or bizarre copies. These pad IDs were shared between clones of the same pad. In commit 9ef8d56, for efficiency I made clones of the same closure share the same pad name list. It has just occurred to me that each padlist containing the same pad name list also has the same pad ID, so we can just use the pad name list itself as the ID. This makes padlists 32 bits smaller and eliminates PL_pad_generation from the interpreter struct.
*	Handle cow $_ in @INC filter	Father Chrysostomos	2012-10-12	1	-0/+1
	Setting $_ to a copy-on-write scalar in an @INC filter causes the parser to modify every other scalar sharing the same string buffer. It needs to be forced to a regular scalar before the parser sees it.
*	Don’t taint return value of s///e based on replacement	Father Chrysostomos	2012-10-11	1	-0/+1
	According to the comments about how taint works above pp_subst in pp_hot.c, the return value of s/// should not be tainted based on the taintedness of the replacement. That makes sense, because the replacement does not affect how many iterations there were. (The return value is the number of iterations). It only applies, however, to the cases where the ‘constant replacement’ optimisation applies. That means /e taints its return value: $ perl5.16.0 -MDevel::Peek -Te '$_ = "abcd"; $x = s//$^X/; Dump $x' SV = PVMG(0x822ff4) at 0x824dc0 REFCNT = 1 FLAGS = (pIOK) IV = 1 NV = 0 PV = 0 $ perl5.16.0 -MDevel::Peek -Te '$_ = "abcd"; $x = s//$^X/e; Dump $x' SV = PVMG(0x823010) at 0x824dc0 REFCNT = 1 FLAGS = (GMG,SMG,pIOK) IV = 1 NV = 0 PV = 0 MAGIC = 0x201940 MG_VIRTUAL = &PL_vtbl_taint MG_TYPE = PERL_MAGIC_taint(t) MG_LEN = 1 The number pushed on to the stack was becoming tainted due to the setting of PL_tainted. PL_tainted is assigned to and the return value explicitly tainted if appropriate shortly after the mPUSHi (which implies sv_setiv, which taints when PL_tainted is true), so setting PL_tainted to 0 just before the mPUSHi is safe.
*	Suggest cause of error requiring .pm file.	Paul Johnson	2012-09-30	1	-8/+22
	Following on from a recent thread I've put together a patch to expand the error message when a module can't be loaded. With this patch, instead of: Can't locate Stuff/Of/Dreams.pm in @INC (@INC contains: ...) You get: Can't locate Stuff/Of/Dreams.pm in @INC (you may need to install the Stuff::Of::Dreams module) (@INC contains: ...) [The committer tweaked the error message, based on a suggestion by Tony Cook. See <https://rt.perl.org/rt3/Ticket/Display.html?id=115048#txn-1157750>.]
*	Don’t crash with existent but undefined &DB::DB	Father Chrysostomos	2012-09-24	1	-1/+1
	This is a follow-up to 432d4561c48, which fixed *DB::DB without &DB::DB, but not &DB::DB without body.
*	[perl #97958] Make reset "" match its docs	Father Chrysostomos	2012-09-24	1	-3/+7
	According to the documentation, reset() with no argument resets patterns. But reset "" and reset "\0foo" were also resetting patterns. While I was at it, I fixed embedded nulls, too, though it’s not likely anyone is using this. I could not fix the bug within the existing API for sv_reset, so I created a new function and left the old one with the old behaviour. Call me pear-annoyed.
*	don't crash with -d if DB::DB is seen but not defined [perl #114990]	Jesse Luehrs	2012-09-24	1	-1/+4
*	pp_ctl.c:caller: Remove obsolete comment	Father Chrysostomos	2012-09-14	1	-3/+0
	This was added in f3aa04c29a, but stopped being relevant in d5ec2987912.
*	Make (caller $n)[9] respect std warnings	Father Chrysostomos	2012-09-14	1	-2/+3
	In commit 7e4f04509c6 I forgot about caller. This commit makes the value returned by (caller $n)[9] assignable to ${^WARNING_BITS} to produce exactly the same warnings settings, including warnings controlled by $^W.
*	Fix buggy -DPERL_POISON code in S_rxres_free(), exposed by a recent test.	Nicholas Clark	2012-09-14	1	-10/+14
	The code had been buggily attempting to overwrite just-freed memory since PERL_POISON was added by commit 94010e71b67db040 in June 2005. However, no regression test exercised this code path until recently. Also fix the offset in the array of UVs used by PERL_OLD_COPY_ON_WRITE to store RX_SAVED_COPY(). It now uses p[2]. Previously it had used p[1], directly conflicting with the use of p[1] to store RX_NPARENS(). The code is too intertwined to meaningfully do these as separate commits.
*	fix s/(.)/die/e	David Mitchell	2012-09-08	1	-1/+2
	Commit 6502e08109cd003b2cdf39bc94ef35e52203240b introduced copying just the parts of the regex string that were needed; but piggy-backing on that commit was a temporary change I made that I forgot to undo, which - it turns out - causes SEGVs and similar when the replacement part of a substitution dies. This commit reverts that change. Spotted as Bleadperl v5.17.3-255-g6502e08 breaks GAAS/URI-1.60.tar.gz (not assigned an RT ticket number yet)
*	tidy up patten match copying code	David Mitchell	2012-09-08	1	-3/+1
	(no functional changes). 1. Remove some dead code from pp_split; it's protected by an assert that it could never be called. 2. Simplify the flags settings for the call to CALLREGEXEC() in pp_substcont: on subsequent matches we always set REXEC_NOT_FIRST, which forces the regex engine not to copy anyway, so passing the REXEC_COPY_STR is pointless, as is the conditional code to set it. 3. (whitespace change): split a conditional expression over 2 lines for easier reading.
*	Don't copy all of the match string buffer	David Mitchell	2012-09-08	1	-5/+8
	When a pattern matches, and that pattern contains captures (or $`, $&, $' or /p are present), a copy is made of the whole original string, so that $1 et al continue to hold the correct value even if the original string is subsequently modified. This can have severe performance penalties; for example, this code causes a 1MB buffer to be allocated, copied and freed a million times:
	    $&;
	    $x = 'x' x 1_000_000;
	    1 while $x =~ /(.)/g;
	This commit changes this so that, where possible, only the needed substring of the original string is copied: in the above case, only a 1-byte buffer is copied each time. Also, it now reuses or reallocs the buffer, rather than freeing and mallocing each time.
	Now that PL_sawampersand is a 3-bit flag indicating separately whether $`, $& and $' have been seen, they each contribute only their own individual penalty; which ones have been seen will limit the extent to which we can avoid copying the whole buffer.
	Note that the above code without the $& is not currently slow, but only because the copying is artificially disabled to avoid the performance hit. The next but one commit will remove that hack, meaning that it will still be fast, but will now be correct in the presence of a modified original string.
	We achieve this by adding suboffset and subcoffset fields to the existing subbeg and sublen fields of a regex, to indicate how many bytes and characters have been skipped from the logical start of the string till the physical start of the buffer. To avoid copying stuff at the end, we just reduce sublen. For example, in this:
	    "abcdefgh" =~ /(c)d/
	subbeg points to a malloced buffer containing "c\0"; sublen == 1, and suboffset == 2 (as does subcoffset), while if $& has been seen, subbeg points to a malloced buffer containing "cd\0"; sublen == 2, and suboffset == 2. If in addition $' has been seen, then subbeg points to a malloced buffer containing "cdefgh\0"; sublen == 6, and suboffset == 2.
	The regex engine won't do this by default; there are two new flag bits, REXEC_COPY_SKIP_PRE and REXEC_COPY_SKIP_POST, which in conjunction with REXEC_COPY_STR, request that the engine skip the start or end of the buffer (it will still copy in the presence of the relevant $`, $&, $', /p).
	Only pp_match has been enhanced to use these extra flags; substitution can't easily benefit, since the usual action of s///g is to copy the whole string first time round, then perform subsequent matching iterations against the copy, without further copying. So you still need to copy most of the buffer.
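The windowing described in this message can be sketched in C roughly as follows. This is a minimal illustration, not the code from the commit: the `sub_copy` struct, the `SAW_*` bits and `copy_window()` are hypothetical names, only a single capture group is handled, and character offsets (subcoffset) are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the three "seen" bits of PL_sawampersand. */
enum { SAW_PRE = 1, SAW_MATCH = 2, SAW_POST = 4 };

/* Hypothetical mirror of the regex fields named in the commit message. */
typedef struct {
    char  *subbeg;     /* malloced copy of the retained window, NUL-terminated */
    size_t sublen;     /* number of bytes retained in subbeg */
    size_t suboffset;  /* number of bytes skipped before the window */
} sub_copy;

/* Copy only the part of str[0..len) that one capture [c_start, c_end) and the
 * seen variables can still refer to, for a match spanning [m_start, m_end).
 * rx->subbeg must be NULL or a buffer from an earlier call; it is reused via
 * realloc. Returns 0 on success, -1 if the allocation fails. */
static int copy_window(sub_copy *rx, const char *str, size_t len,
                       size_t m_start, size_t m_end,
                       size_t c_start, size_t c_end, unsigned saw)
{
    size_t lo = c_start, hi = c_end;
    char *buf;

    assert(m_start <= c_start && c_start <= c_end);
    assert(c_end <= m_end && m_end <= len);

    if (saw & SAW_MATCH) { lo = m_start; hi = m_end; } /* $& needs the whole match */
    if (saw & SAW_PRE)   lo = 0;                       /* $` needs the prefix */
    if (saw & SAW_POST)  hi = len;                     /* $' needs the suffix */

    buf = realloc(rx->subbeg, hi - lo + 1);            /* reuse rather than free+malloc */
    if (!buf)
        return -1;
    memcpy(buf, str + lo, hi - lo);
    buf[hi - lo] = '\0';

    rx->subbeg = buf;
    rx->sublen = hi - lo;
    rx->suboffset = lo;
    return 0;
}
```

Called for "abcdefgh" with the match at bytes 2..4 and the capture at 2..3, this yields the three buffers listed in the message as the `saw` bits change.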
*	"loading-file" and "loaded-file" DTrace probes	Shawn M Moore	2012-08-28	1	-0/+4
*	Stop (caller $n)[6] from including final "\n;"	Father Chrysostomos	2012-08-27	1	-1/+3
	String eval appends "\n;" to the string before evaluating it. (caller $n)[6], which returns the text of the eval, was giving the modified string, rather than the original. In fact, it was returning the actual string buffer that the parser uses. This commit changes it to create a new mortal SV from that string buffer, but without the last two characters.
	It unfortunately breaks this JAPH:
	    eval'BEGIN{${\(caller 2)[6]}=~y< !"$()+\-145=ACHMT^acfhinrsty{}> <nlrhta"o Pe e,\nkrcrJ uthspeia">}say if+chr(1) -int"145"!=${^MATCH}'
*	Fix format closure bug with redefined outer sub	Father Chrysostomos	2012-08-21	1	-6/+7
	CVs close over their outer CVs. So, when you write:
	    my $x = 52;
	    sub foo { sub bar { sub baz { $x } } }
	baz’s CvOUTSIDE pointer points to bar, bar’s CvOUTSIDE points to foo, and foo’s to the main cv.
	When the inner reference to $x is looked up, the CvOUTSIDE chain is followed, and each sub’s pad is looked at to see if it has an $x. (This happens at compile time.)
	It can happen that bar is undefined and then redefined:
	    undef &bar;
	    eval 'sub bar { my $x = 34 }';
	After this, baz will still refer to the main cv’s $x (52), but, if baz had ‘eval '$x'’ instead of just $x, it would see the new bar’s $x. (It’s not really a new bar, as its refaddr is the same, but it has a new body.)
	This particular case is harmless, and is obscure enough that we could define it any way we want, and it could still be considered correct.
	The real problem happens when CVs are cloned.
	When a CV is cloned, its name pad already contains the offsets into the parent pad where the values are to be found. If the outer CV has been undefined and redefined, those pad offsets can be completely bogus.
	Normally, a CV cannot be cloned except when its outer CV is running. And the outer CV cannot have been undefined without also throwing away the op that would have cloned the prototype.
	But formats can be cloned when the outer CV is not running. So it is possible for cloned formats to close over bogus entries in a new parent pad.
	In this example, \$x gives us an array ref. It shows ARRAY(0xbaff1ed) instead of SCALAR(0xdeafbee):
	    sub foo {
	        my $x;
	    format =
	    @
	    ($x,warn \$x)[0]
	    .
	    }
	    undef &foo;
	    eval 'sub foo { my @x; write }';
	    foo
	    __END__
	And if the offset that the format’s pad closes over is beyond the end of the parent’s new pad, we can even get a crash, as in this case:
	    eval
	    'sub foo {' .
	    '{my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l,$m,$n,$o,$p,$q,$r,$s,$t,$u)}'x999 .
	    q|
	        my $x;
	    format =
	    @
	    ($x,warn \$x)[0]
	    .
	    }
	    |;
	    undef &foo;
	    eval 'sub foo { my @x; my $x = 34; write }';
	    foo();
	    __END__
	So now, instead of using CvROOT to identify clones of CvOUTSIDE(format), we use the padlist ID instead. Padlists don’t actually have an ID, so we give them one. Any time a sub is cloned, the new padlist gets the same ID as the old. The format needs to remember what its outer sub’s padlist ID was, so we put that in the padlist struct, too.
*	Use PADLIST in more places	Father Chrysostomos	2012-08-21	1	-1/+1
	Much code relies on the fact that PADLIST is typedeffed as AV. PADLIST should be treated as a distinct type.
*	Omnibus removal of register declarations	Karl Williamson	2012-08-18	1	-57/+57
	This removes most register declarations in C code (and accompanying documentation) in the Perl core. Retained are those in the ext directory, Configure, and those that are associated with assembly language.
	See: http://stackoverflow.com/questions/314994/whats-a-good-example-of-register-variable-usage-in-c which says, in part:
	    There is no good example of register usage when using modern compilers (read: last 10+ years) because it almost never does any good and can do some bad. When you use register, you are telling the compiler "I know how to optimize my code better than you do" which is almost never the case. One of three things can happen when you use register:
	    - The compiler ignores it, this is most likely. In this case the only harm is that you cannot take the address of the variable in the code.
	    - The compiler honors your request and as a result the code runs slower.
	    - The compiler honors your request and the code runs faster, this is the least likely scenario.
	    Even if one compiler produces better code when you use register, there is no reason to believe another will do the same. If you have some critical code that the compiler is not optimizing well enough your best bet is probably to use assembler for that part anyway but of course do the appropriate profiling to verify the generated code is really a problem first.
*	pp_ctl.c:pp_dbstate: Don’t adjust CvDEPTH for XSUBs	Father Chrysostomos	2012-08-17	1	-2/+0
	Commit c127bd3aaa5c5 made XS DB::DB subs work. Before that, pp_dbstate assumed DB::DB was written in Perl. It adjusts CvDEPTH when calling the XSUB, which serves no purpose. It was presumably just copied from the pure-Perl-calling code. pp_entersub doesn’t do this.
*	pp_require thread safety for VMS.	Craig A. Berry	2012-08-10	1	-3/+8
	When we translate path names of required modules into Unix format, we haven't (recently) been using the optional second argument to the translation routines,[1] an argument that supplies a buffer for the translation. That causes them to use a static buffer. Which means that if two or more different threads are doing a require operation at the same time, they will be blindly sharing the same buffer.
	So allocate buffers as we need them and make them mortal so they will go away at the next state transition.
	[1] Use of an automatic variable for the buffer was removed way back in 46fc3d4c69a0ad.
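The hazard of the optional buffer argument can be shown with a toy routine. This is only a sketch under stated assumptions: `toy_to_unix()` and `PATH_BUF_LEN` are hypothetical, the "translation" merely rewrites ':' as '/', and the real VMS routines and the mortal-buffer allocation used by the commit are not reproduced here.

```c
#include <stddef.h>

#define PATH_BUF_LEN 256   /* hypothetical fixed buffer size */

/* Toy translation routine with an optional output buffer, mimicking the
 * calling convention described above. If buf is NULL the result is written
 * to a single static buffer shared by every caller (and every thread).
 * A caller-supplied buf must hold at least PATH_BUF_LEN bytes. */
static char *toy_to_unix(const char *in, char *buf)
{
    static char shared[PATH_BUF_LEN];
    char *out = buf ? buf : shared;
    size_t i;

    for (i = 0; in[i] != '\0' && i < PATH_BUF_LEN - 1; i++)
        out[i] = (in[i] == ':') ? '/' : in[i];
    out[i] = '\0';
    return out;
}
```

Two calls with a NULL buffer return the same pointer, so the second result overwrites the first; passing a separate buffer per call keeps the results independent, which is the property the commit restores for concurrent require operations.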
*	[perl #114020, #90018, #53186] Make given alias $_	Father Chrysostomos	2012-08-01	1	-2/+9
	This commit makes given() alias $_ to the argument, using a slot in the lexical pad if a lexical $_ is in scope, or $'_ otherwise. This makes it work very similarly to foreach, and eliminates the problem of List::Util functions not working inside given().
*	Avoid reading before the buffer start when generating errors from require.	Nicholas Clark	2012-08-01	1	-2/+2
	In pp_require, the error reporting code treats file names ending /\.p?h\z/ specially. The detection code for this, as refactored in 2010 by commit 686c4ca09cf9d6ae, could read one or two bytes before the start of the filename for filenames less than 3 bytes long. (Note this cannot happen with module names given to use or require, as appending ".pm" will always make the filename at least 3 bytes long.)
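A bounds-safe version of such a suffix test might look like the following. It is a sketch of the general technique rather than the actual pp_require code, and `ends_in_h_or_ph()` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if name[0..len) ends in ".h" or ".ph", i.e. matches /\.p?h\z/.
 * Every length check precedes the index it guards, so no byte before
 * name[0] is ever read, even for names shorter than 3 bytes. */
static bool ends_in_h_or_ph(const char *name, size_t len)
{
    if (len >= 2 && name[len - 1] == 'h') {
        if (name[len - 2] == '.')
            return true;                 /* ends in ".h" */
        if (len >= 3 && name[len - 2] == 'p' && name[len - 3] == '.')
            return true;                 /* ends in ".ph" */
    }
    return false;
}
```

A one-byte name such as "h" fails the first length check, and a two-byte name such as "ph" never reaches the `name[len - 3]` read.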
*	[perl #113684] Make redo/last/next/dump accept expr	Father Chrysostomos	2012-07-27	1	-63/+47
	These functions have been allowing arbitrary expressions, but would treat anything that did not resolve to a const op as the empty string. Not only were arguments swallowed up without warning, but constant folding could change the behaviour. Computed labels are allowed for goto, and there is no reason to disallow them for these other ops. This can also come in handy for certain types of code generators.
	In the process of modifying pp functions to accept arbitrary labels, I noticed that the label and loop-popping code was identical in three functions, so I moved it out into a separate static function, to make the changes easier.
	I also had to reorder newLOOPEX significantly, because code under the goto branch needed to apply to last, and vice versa. Using multiple gotos to switch between the branches created too much of a mess.
	I also eliminated the use of SP from pp_last, to avoid copying the value back and forth between SP and PL_stack_sp.