| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit v5.19.8-533-g63baef5 changed the handling of locale-dependent
regexes so that the pattern was considered tainted at compile-time, rather
than determining it each time at run-time whenever it executed a
locale-dependent node. Unfortunately due to the conflating of two flags,
RXf_TAINTED and RXf_TAINTED_SEEN, it had the side effect of permanently
marking a pattern as tainted once it had had a single tainted result.
E.g.
use re qw(taint);
use Scalar::Util qw(tainted);
for ($^X, "abc") {
/(.*)/ or die;
print "not " unless tainted("$1"); print "tainted\n";
};
which from 5.19.9 onwards output:
tainted
tainted
but with this commit (and with 5.19.8 and earlier), it now outputs:
tainted
not tainted
The RXf_TAINTED flag indicates that the pattern itself is tainted, e.g.
$r = qr/$tainted_value/
while the RXf_TAINTED_SEEN flag means that the results of the last match
are tainted, e.g.
use re 'tainted';
$tainted =~ /(.*)/;
# $1 is tainted
Pre 63baef5, the code used to look like:
at run-time:
turn off RXf_TAINTED_SEEN;
while (nodes to execute) {
switch(node) {
case
BOUNDL: /* and other locale-specific ops */
turn on RXf_TAINTED_SEEN;
...;
}
}
if (tainted || RXf_TAINTED)
turn on RXf_TAINTED_SEEN;
63baef5 changed it to:
at compile-time:
if (pattern has locale ops)
turn on RXf_TAINTED_SEEN;
at run-time:
while (nodes to execute) {
...
}
if (tainted || RXf_TAINTED)
turn on RXf_TAINTED_SEEN;
This commit changes it to:
at compile-time;
if (pattern has locale ops)
turn on RXf_TAINTED;
at run-time:
turn off RXf_TAINTED_SEEN;
while (nodes to execute) {
...
}
if (tainted || RXf_TAINTED)
turn on RXf_TAINTED_SEEN;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When PERL_EXTERNAL_GLOB is defined (currently only on VMS, or on other
platforms when running miniperl), each item returned from the glob
operation is checked against a set of glob metacharacters, and then, if
it matches any of these characters, it's checked with lstat() to see if
it actually exists. Then it's passed through if it exists but skipped
otherwise. Presumably this is because returning the pattern unchanged
is how a shell glob indicates "no match."
It appears that the lstat() is necessary because glob metacharacters
are almost always valid filename characters on Unix, so it's the
only way to distinguish a funny-looking but real filename from the no
match case (oops -- indeterminate grammar). Since these filenames are
rare on Unix, lstat() is seldom called.
Enter VMS, where "external glob" is neither external, nor is it actually
a glob. It is a native wildcard-matching search built into Perl with
vms/vms.c's Perl_vms_start_glob(), which does return the original
pattern when there is no match, but that pattern will only contain
native wildcard characters, so the check for glob metacharacters is
really not the right thing to be doing. Moreover, glob metacharacters
such as dollar signs, brackets, and semicolons are extremely common in
native VMS paths, so in many common scenarios, the lstat() to see if the
file really exists gets triggered for every single file, which is
expensive.
This commit replaces, on VMS only, the check for glob metacharacters
with a check for VMS wildcard characters. A simple glob of a Perl
source tree becomes 60% faster on the first iteration and 80% faster on
subsequent iterations.
Thanks to Hein van den Heuvel, who reported the problem and explained
that a search operation should only be looking at directory files,
whereas a stat does I/O to the individual file header, resulting in a
massive increase in unnecessary I/O.
This closes [perl #121440].
|
|
|
|
|
|
| |
Calls to Perl_do_openn() all have at least 2 unused arguments which clutter
the code and hinder easy understanding. Perl_do_open6() and
Perl_do_open_raw() each only do one job, so don't have the dead arguments.
|
|
|
|
|
|
|
|
| |
I've gone through pp_hot.c and scope.c and added LIKELY() or UNLIKELY()
to all conditionals where I understand the code well enough to know that
a particular branch is or isn't likely to be taken very often.
I also processed some of the .h files which contain commonly used macros.
|
|
|
|
|
|
|
|
|
|
| |
Use aelemfast for literal index array access where the index is in the
range -128..127, rather than 0..255.
You'd expect something like $a[-1] or $a[-2] to be a lot more common than
$a[100] say. In fact a quick CPAN grep shows 66 distributions
matching /\$\w+\[\d{3,}\]/, but "at least" 1000 matching /\$\w+\[\-\d\]/.
And most of the former appear to be table initialisations.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under threaded builds, GVs for OPs are stored in the pad rather than
being directly attached to the op. For some reason, all such GV's were
getting the SvPADTMP flag set. There seems to be be no good reason for
this, and after skipping setting the flag, all tests still pass.
The advantage of not setting this flag is that there are quite a few
hot places in the code that do
if (SvPADTMP(sv) && !IS_PADGV(sv)) {
...
I've replaced them all with
if (SvPADTMP(sv)) {
assert(!IS_PADGV(sv));
...
Since the IS_PADGV() macro expands to something quite heavyweight, this is
quite a saving: for example this commit reduces the size of pp_entersub by
111 bytes.
|
|
|
|
|
|
| |
The code that reduces the returned XS args to 1 arg in scalar context
was a bit messy. This makes it smaller, more comprehensible, with slighylu
smaller object size.
|
|
|
|
|
|
|
|
|
| |
make the scope of item be smaller;
don't assign +1 to MARK when we will shortly overwrite it;
use a local var to to store &GvAV(PL_defgv) to avoid recalculating it
use a _NN variant of SvREFCNT_inc
Shaves 2% off the object size of pp_entersub.
|
|
|
|
| |
stop repeatedly retrieving it
|
|
|
|
|
| |
the previous commit wrapped a switch statement in an if(), so re-indent
accordingly. Whitespace-only change.
|
|
|
|
|
|
|
|
|
| |
liberally sprinkle some LIKELY()s, initialise gimme closer to where
it's first needed, and shortcut the common case of a GV with valid CV,
skipping the switch statement.
With gcc and x86_64, this reduces the code size of the function by
about 5%.
|
|
|
|
|
|
| |
av_tindex is a more clearly named synonym for av_len, available starting
in v5.18. This changes the core uses to it, including modules in /ext,
which are not dual-lifed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See discussion at https://rt.perl.org/Ticket/Display.html?id=120675
There are several unresolved items in this discussion, but we did agree
that tainting should be dependent only on the regex pattern, and not the
particular input string being matched against:
"The bottom line is we are moving to the policy that tainting is based
on the operation being in locale, without regard to the particular
operand's contents passed this time to the operation. This means simpler
core code and more consistent tainting results. And it lessens the
likelihood that there are paths in the core that should taint but don't"
This commit does the minimal work to change regex pattern matching to
determine tainting at pattern compilation time. Simply put, if a
pattern contains a regnode whose match/not match depends on the run-time
locale, any attempt to match against that pattern will taint, regardless
of the actual target string or runtime locale in effect. Given this
change, there are optimizations that can be made to avoid runtime work,
but these are deferred until later.
Note that just because a regular expression is compiled under locale
doesn't mean that the generated pattern will be tainted. It depends on
the actual pattern. For example, the pattern /(.)/ doesn't taint
because it will match exactly one character of the input, regardless of
locale settings.
|
|
|
|
|
|
|
|
|
| |
If we just did an sv_setpvs on it, the SvPV_force_nolen should not do
anything useful, so let's not.
Side note: s/TARG/left/ in the enclosing block because they are the same
pointer, so why use a define that needs grokking by the reader of the
code if the local variable is guaranteed to be the same?
|
|
|
|
|
| |
The strchr() cannot not succeed unless the first test fails, so the
first test need not be tried at all.
|
|
|
|
|
|
|
|
|
|
|
| |
Various system functions like write() are marked with the
__warn_unused_result__ attribute, which causes an 'ignoring return value'
warning to be emitted, even if the function call result is cast to (void).
The generic solution seems to be
int rc = write(...);
PERL_UNUSED_VAR(rc);
|
|
|
|
|
|
|
|
| |
Under the debugger (the -d switch), if there is a DB::sub subroutine,
then %DB::lsub gets autovivified during a call to an lvalue sub. That
hash is never used. The code for vivifying the *DB::lsub glob was
copied from *DB::sub when lsub was added, and *DB::sub does have a
hash (%DB::sub), but *DB::lsub doesn’t need one.
|
|
|
|
|
|
|
|
|
|
| |
Commit perl-5.8.0-9908-g5e17dd7, in March of 2007, added a SPAGAIN
here to account for the stack shifting when hv_scalar calls tied
hashes’ SCALAR method.
It was never necessary, because commit perl-5.8.0-3008-ga3bcc51,
which added hv_scalar and magic_scalarpack in December of 2003, made
magic_scalarpack push a new stack, protecting the old one.
|
|
|
|
|
|
|
| |
rv2hv has had a TARG since perl 5.000, but it has not used it since
hv_scalar was added in perl-5.8.0-3008-ga3bcc51.
This commit removes it, saving a tiny bit of space in the pad.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the only place where tied @_ does not work, and there appears
to be no reason why it shouldn’t, apart from the fact that it hasn’t
been implemented.
Commit 67955e0c was what made &xsub work to begin with. 93965878572
introduced tied arrays and added the comment to pp_entersub saying
that @_ is not tiable.
goto &xsub has worked since perl 5.000, but 93965878572 did not make
it work with tied arrays.
|
|
|
|
| |
This avoids an extraneous warning when globbing fails for other reasons.
|
|
|
|
|
| |
I don’t have enough memory to test this, but it needs to be done even-
tually anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$ perl -e 'undef *_; &Internals::V'
Segmentation fault: 11
$ perl -e 'sub { undef *_; goto &Internals::V }->()'
$ perl5.18.1 -e 'sub { undef *_; goto &Internals::V }->()'
Segmentation fault: 11
The goto case is actually a regression from 5.16 (049bd5ffd62), as
goto used to ignore changes to *_. (Fixing one bug uncovers another.)
We shouldn’t assume that GvAV(PL_defgv) (*_{ARRAY}) gives us anything.
While we’re at it, since we have to add extra checks anyway, use them
to speed up empty @_ in goto (by checking items, rather than arg).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before ce0d59f:
$ perl -e '++$#_; &utf8::encode'
Modification of a read-only value attempted at -e line 1.
As of ce0d59f:
$ ./perl -Ilib -e '++$#_; &utf8::encode'
Assertion failed: (sv), function Perl_sv_utf8_encode, file sv.c, line 3581.
Abort trap: 6
Calling sub { utf8::encode($_[0]) } should be more or less equivalent
to calling utf8::encode, but it is not in this case:
$ ./perl -Ilib -we '++$#_; &{sub { utf8::encode($_[0]) }}'
Use of uninitialized value in subroutine entry at -e line 1.
In the first two examples above, an implementation detail is leaking
through. What you are seeing is not the array element, but a place-
holder that indicates an element that has not been assigned to yet.
We should use defelem magic so that what the XSUB assigns to will cre-
ate an array element (as happens with utf8::encode($_[0])).
All of the above applies to goto &xsub as well.
|
|
|
|
|
| |
This av can never be null here. av_len will already have failed an
assertion if it is.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of ticket #119433. This particular bug is triggered by
Data::Dump’s test suite.
Commit ce0d59f changed arrays to use NULL for nonexistent elements,
instead of &PL_sv_undef (the special scalar returned by Perl’s ‘undef’
operator).
‘foreach’ was not updated to account. It was still treating
&PL_sv_undef as a nonexistent element. This was causing ‘Modifica-
tion of non-creatable array value attempted, subscript 0’, due to a
similar bug in vivify_defelem, which the next commit will fix.
(Fixing vivify_defelem without fixing foreach will make the test pass,
but for foreach to create a defelem to begin with is inefficient and
should be addressed anyway.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
when unwinding sub and format calls.
The comments in the added test file explain what the problem is.
The fix is to call LEAVE_SCOPE in POPSUB and POPFORMAT (to free their
lexicals) before lowering CvDEPTH.
If the context has already been popped via cxstack_ix--, then
LEAVE_SCOPE could overwrite it, so accessing cx after LEAVE_SCOPE is
unsafe. Hence the changes to POPSUB and POPFORMAT are a bit involved.
Some callers of POPSUB do a temporary cxstack_ix++ first so they
can access cx afterwards. Two cases needed to be changed to
work that way.
|
| |
|
|
|
|
|
|
|
|
|
| |
Change the internal fields for storing positions so that //g in scalar
context can move past the 2**31 character threshold. Before this com-
mit, the numbers would wrap, resulting in assertion failures.
The changes in this commit are only enough to get the added test pass-
ing. Stay tuned for more.
|
|
|
|
|
|
|
|
|
|
| |
This fixes #112790 and part of #116907.
The length of the string is cast to I32, so it wraps and end up less
than the minimum length.
For now, simply skip this optimisation if minlen itself wraps and
becomes negative.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value of pos() is stored as a byte offset. If it is stored on a
tied variable or a reference (or glob), then the stringification could
change, resulting in pos() now pointing to a different character off-
set or pointing to the middle of a character:
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, a; print pos $x'
2
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, "\x{1000}"; print pos $x'
Malformed UTF-8 character (unexpected end of string) in match position at -e line 1.
0
So pos() should be stored as a character offset.
The regular expression engine expects byte offsets always, so allow it
to store bytes when possible (a pure non-magical string) but use char-
acters otherwise.
This does result in more complexity than I should like, but the alter-
native (always storing a character offset) would slow down regular
expressions, which is a big no-no.
|
|
|
|
|
|
|
|
|
|
| |
Make the array interface 64-bit safe by using SSize_t instead of I32
for array indices.
This is based on a patch by Chip Salzenberg.
This completes what the previous commit began when it changed
av_extend.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the more correct version of 1555b325 which was reverted
by 6200d5a0e.
In pp_subst, there is an initial pattern match against the target
string, followed by logic to determine which of several code paths
will handle the rest of the substitution, depending on which shortcuts
can be taken.
There is one path specifically for doing a global sort (/g) and modi-
fying the target in place. This code was skipped if the target was a
copy-on-write scalar or if the pre-match copy was enabled. The pre-
match copy is always enabled now, so this code is unreachable.
In-place substitution stringifies the rhs at the outset, just after
the first regexp match, but before any substitution. Then it uses
that string buffer, expecting it not to change.
That clearly cannot work with s/a/$&/g; it will also cause erratic
behaviour in the case of regexp code blocks (which will see the
string being modified, which doesn not happen with unoptimised subst).
That’s why the in-place optimisation has to be skipped when the
REXEC_COPY_STR flag is set.
But we can tweak that logic:
• As long as the rhs is not a magical var, its contents are not going
to change from one iteration to the next.
• If there are no code blocks, nothing will see the string during the
substitution.
So this commit adds logic to check those things, enabling this opti-
misation where possible.
Skipping this optimisation for the pre-match copy was originally added
in commit 5d5aaa5e7.
|
|
|
|
|
|
|
| |
This reverts commit 1555b325296e46f7b95bee03fe856cec348b0d57.
This is causing test failures (not on my machine), and I do not have
time right now to track them all down.
|
|
|
|
|
|
|
|
|
| |
Perl 5 has always had the ‘don't match same null twice’ comment
in pp_subst. Originally it was a parenthetical note right below
the s == m.
71be2cbc removed the parentheses. Commit f722798be moved it further
away from s == m. Now what it refers to is far from clear.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In pp_subst, there is an initial pattern match against the target
string, followed by logic to determine which of several code paths
will handle the rest of the substitution, depending on which shortcuts
can be taken.
There is one path specifically for doing a global sort (/g) and modi-
fying the target in place. This code was skipped if the target was a
copy-on-write scalar or if the pre-match copy was enabled. The pre-
match copy is always enabled now, so this code is unreachable.
There does not appear to be any reason why this path must be skipped
in the presence of the pre-match copy. The string gets copied by the
initial regexp match and $& and friends point there afterwards.
This skip was added in commit 5d5aaa5e7 (a jumbo patch, so good luck
figuring it out). This commit removes the skip, and all tests pass.
This, of course, only affects those cases where copy-on-write does
not kick in; for instance, when the string’s length is one less than
its buffer:
$ ./perl -Ilib -e 'use Devel::Peek; $x = " "; $x .= " "x22; Dump $x; $x =~ s/ /b/g; Dump $x'
SV = PV(0x7ffb0b807098) at 0x7ffb0b82eed8
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x7ffb0b4066b8 " "\0
CUR = 23
LEN = 24
SV = PV(0x7ffb0b807098) at 0x7ffb0b82eed8
REFCNT = 1
FLAGS = (POK,pPOK)
PV = 0x7ffb0b4066b8 "bbbbbbbbbbbbbbbbbbbbbbb"\0
CUR = 23
LEN = 24
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a nonexistent array element is passed to a subroutine, a special
‘deferred element’ scalar (implemented using something called defelem
magic) is passed to the subroutine instead, which delegates to the
array element. This allows some_benign_function($array[$nonexistent])
to avoid autovivifying unnecessarily.
Whether this magic would be triggered was based on whether the element
was within the range 0..$#array. Since arrays can contain nonexistent
elements before $#array, this logic is incorrect. It also makes sense
to allow $array[$neg] where the negative number points before the
beginning of the array to create a deferred element and only croak if
it is assigned to.
This commit fixes the logic for when deferred elements are created
and implements these deferred negative elements.
Since we have to be able to store negative values in xlv_targoff, it
is convenient to make it a union (with two types--signed and unsigned)
and use LvSTARGOFF for defelem array indices.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes bug #7508 and provides the groundwork for fixing
several other bugs.
Elements of @_ are aliased to the arguments, so that \$_[0] within
sub foo will reference the same scalar as \$x if the sub is called
as foo($x).
&PL_sv_undef (the global read-only undef scalar returned by the
‘undef’ operator itself) was being used to represent nonexistent
array elements. So the pattern would be broken for foo(undef), where
\$_[0] would vivify a new $_[0] element, treating it as having been
nonexistent.
This also causes other problems with constants under ithreads
(#105906) and causes a pending fix for another bug (#118691) to trig-
ger this bug.
This commit changes the internals to use a null pointer to represent a
nonexistent element.
This requires that Storable be changed to account for it. Also,
IPC::Open3 was relying on the bug. So this commit patches
both modules.
|
|
|
|
|
|
|
|
|
|
| |
This resolves the last remaining issue in ticket #78194, that
newRV is supposedly buggy because it doesn’t copy its referent.
The full implications of the PADTMP are not explained anywhere in
the API docs, and even XSUBs shouldn’t have to worry about special
handling. (E.g., what if they do SvREFCNT_dec(SvRV(sv)); SvRV(sv)=...?)
So the real solution here is not to let XSUBs see them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$ ./miniperl -Ilib -e 'for(__PACKAGE__) { s/a/a/ }'
Modification of a read-only value attempted at -e line 1.
$ ./miniperl -Ilib -e 'for(__PACKAGE__) { s/b/b/ }'
$ ./miniperl -Ilib -e 'for("main") { s/a/a/ }'
Modification of a read-only value attempted at -e line 1.
$ ./miniperl -Ilib -e 'for("main") { s/b/b/ }'
Modification of a read-only value attempted at -e line 1.
When I pass the constant "main" to s///, it croaks whether the regular
expression matches or not.
When I pass __PACKAGE__, which has the same content and is also read-
only, it only croaks when the pattern matches.
This commit removes some logic that is left over from when
READONLY+FAKE meant copy-on-write. Read-only does mean read-only now,
so copy-on-write scalars should not be exempt from read-only checks.
|
| |
|
|
|
|
|
|
|
| |
Currently before matching, we see whether the SV has any pos() magic
attached; then after the match we look it up again to update pos().
Instead just remember the previous value of mg and reuse it where
possible.
|
|
|
|
|
| |
a successful match always sets $-[0] now, so there's no need to check
whether its set
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the case where a qr// regex is directly used by PMOP (rather than being
interpolated with some other stuff and a new regex created, such as
/a$r/p), then the PMf_KEEPCOPY flag will be set on the PMOP, but the
corresponding RXf_PMf_KEEPCOPY flag *won't* be set on the regex.
Since most of the regex handling for copying the string and extracting out
${^PREMATCH} etc is done based on the RXf_PMf_KEEPCOPY flag in the regex,
this is a bit of a problem.
Prior to 5.18.0 this wasn't so noticeable, since various other bugs around
//p handling meant that ${$PREMATCH} etc often accidentally got set
anyway. 5.18.0 fixed these bugs, and so as a side-effect, exposed the
PMOP verses regex flag issue. In particular, this stopped working in
5.18.0:
my $pat = qr/a/;
'aaaa' =~ /$pat/gp or die;
print "MATCH=[${^MATCH}]\n";
(prints 'a' in 5.16.0, undef in 5.18.0).
The presence /g caused the engine to copy the string anyway by luck.
We can't just set the RXf_PMf_KEEPCOPY flag on the regex if we see the
PMf_KEEPCOPY flag on the PMOP, otherwise stuff like this will be wrong:
$r = qr/..../;
/$r/p; # set RXf_PMf_KEEPCOPY on $r
/$r/; # does a /p match by mistake
Since for 5.19.x onwards COW is enabled by default (and cheap copies are
always made regardless of /p), then this fix is mainly for PERL_NO_COW
builds and for backporting to 5.18.x. (Although it still applies to
strings that can't be COWed for whatever reason).
Since we can't set a flag in the rx, we fix this by:
1) when calling the regex engine (which may attempt to copy part or all of
the capture string), make sure we pass REXEC_COPY_STR, but neither of
REXEC_COPY_SKIP_PRE, REXEC_COPY_SKIP_POST when we call regexec() from
pp_match or pp_subst when the corresponding PMOP has PMf_KEEPCOPY set.
2) in Perl_reg_numbered_buff_fetch() etc, check for PMf_KEEPCOPY in
PL_curpm as well as for RXf_PMf_KEEPCOPY in the current rx before deciding
whether to process ${^PREMATCH} etc.
As well as adding new tests to t/re/reg_pmod.t, I also changed the
string to be matched against from being '12...' to '012...', to ensure that
the lengths of ${^PREMATCH}, ${^MATCH}, ${^POSTMATCH} would all be
different.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On something like:
$_ = "123456789";
pos = 6;
s/.(?=.\G)/X/g;
each iteration could in theory start with pos one character to the left
of the previous position, and with the substitution replacing bits that
it has already replaced. Since that way madness lies, ban any attempt by
s/// to substitute to the left of a previous position.
To implement this, add a new flag to regexec(), REXEC_FAIL_ON_UNDERFLOW.
This tells regexec() to return failure even if the match itself succeeded,
but where the start of $& is before the passed stringarg point.
This change caused one existing test to fail (which was added about a year
ago):
$_="abcdef";
s/bc|(.)\G(.)/$1 ? "[$1-$2]" : "XX"/ge;
print; # used to print "aXX[c-d][d-e][e-f]"; now prints "aXXdef"
I think that that test relies on ambiguous behaviour, and that my change
makes things saner.
Note that s/// with \G is generally very under-tested.
|
|
|
|
|
|
|
|
|
|
|
| |
pp_subst() sets the REXEC_COPY_STR flag on the first match. On the second
and subsequent matches, it doesn't set it in two out three of the branches
(including pp_susbstcont) where it calls CALLREGEXEC().
The one place where it *does* set it is a (harmless) mistake, since regexec
ignores REXEC_COPY_STR if REXEC_NOT_FIRST is set (which is it is, on all 3
brnanches).
So unset REXEC_COPY_STR in the third branch too, for consistency
|
|
|
|
| |
and slightly reduce the scope of the temporary i var.
|
| |
|
|
|
|
|
| |
it's just used a temporary var in a few blocks; declare it individually
in each block rather than being scoped to the whole function.
|