author | David Mitchell <davem@iabyn.com> | 2017-02-14 16:28:31 +0000 |
---|---|---|
committer | David Mitchell <davem@iabyn.com> | 2017-02-14 17:49:58 +0000 |
commit | bb414e1295cbc3c4c2a55aaf82d832d6c8bf76ec (patch) | |
tree | 5d20b35f13d84a8982b07509a577aadd3b5c95a9 | |
parent | cbb658a1562fa3da6a29d865ee9b0ba564affb3f (diff) | |
S_regmatch: eliminate WHILEM_B paren saving

In something like

    "a1b2c3d4..." =~ /(?:(\w)(\d))*..../

a WHILEM state is pushed for each iteration of the '*'. Part of this
state saving includes the previous indices for each of the captures within
the body of the thing being iterated over. So we save the following sets of
values for $1,$2:

    ()()
    (a)(1)
    (b)(2)
    (c)(3)
    (d)(4)

Then if at any point we backtrack, we can undo one or more iterations and
restore the older values of $1,$2.
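
To illustrate the save-and-restore idea described above, here is a minimal
sketch in C. It is a toy model, not the code in regexec.c: the names Capture,
SaveStack, save_captures, restore_captures and first_capture_after are all
hypothetical, and the real engine uses regcppush()/regcppop() on its own
save stack. The sketch assumes the subject "a1b2c3d4", so iteration i sets
$1 to offset 2*i and $2 to offset 2*i+1.

```c
#include <assert.h>
#include <string.h>

#define NPAREN    2   /* capture groups $1 and $2 */
#define MAX_DEPTH 16  /* most iterations this toy can remember */

/* One capture: offsets into the subject string, -1 when unset. */
typedef struct { int start, end; } Capture;

/* Toy checkpoint stack: one snapshot of all captures per iteration. */
typedef struct {
    Capture saved[MAX_DEPTH][NPAREN];
    int depth;
} SaveStack;

/* Save the current captures before attempting another iteration. */
static void save_captures(SaveStack *st, const Capture *cur)
{
    assert(st->depth < MAX_DEPTH);
    memcpy(st->saved[st->depth++], cur, sizeof(Capture) * NPAREN);
}

/* Undo one iteration: restore the captures saved just before it ran. */
static void restore_captures(SaveStack *st, Capture *cur)
{
    assert(st->depth > 0);
    memcpy(cur, st->saved[--st->depth], sizeof(Capture) * NPAREN);
}

/* Run `iters` iterations of (\w)(\d) over "a1b2c3d4", then backtrack
 * `undo` of them. Returns the start offset of $1, or -1 if $1 is unset.
 * Requires undo <= iters <= 4. */
static int first_capture_after(int iters, int undo)
{
    SaveStack st = { .depth = 0 };
    Capture cur[NPAREN] = { { -1, -1 }, { -1, -1 } };

    for (int i = 0; i < iters; i++) {
        save_captures(&st, cur);                 /* remember older $1,$2 */
        cur[0] = (Capture){ 2 * i,     2 * i + 1 };  /* $1: the \w char */
        cur[1] = (Capture){ 2 * i + 1, 2 * i + 2 };  /* $2: the \d char */
    }
    for (int i = 0; i < undo; i++)
        restore_captures(&st, cur);              /* backtrack one step */

    return cur[0].start;
}
```

In this model, first_capture_after(4, 2) returns 2, the offset of 'b': two
of the four iterations have been undone, leaving $1,$2 as (b)(2).
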

For /A*B/ where A is a complex sub-pattern like (\w)(\d), we currently save
the paren state each time we're about to attempt to iterate another A.
But it turns out that for non-greedy matching, i.e. A*?B, we also
save the paren state before executing B. This is unnecessary, as
B can't alter the capture state of the parens within A. So eliminate it.
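
The redundancy argued above can be sketched with a toy minimal matcher in C.
This is an illustration under stated assumptions, not the engine's logic:
A is fixed as (\w)(\d), B is a plain literal string, the match is anchored
at offset 0, and the names Result, match_minimal and save_before_b are
hypothetical. Because this toy A cannot backtrack, the per-iteration save
before A is left out; only the save around B is modelled.

```c
#include <ctype.h>
#include <string.h>

typedef struct { int start, end; } Capture;

typedef struct {
    Capture cap[2];    /* $1 and $2; only meaningful on success */
    int match_end;     /* offset just past B, or -1 on failure */
    int b_snapshots;   /* snapshots taken before trying B */
} Result;

/* Toy minimal match of (?:(\w)(\d))*?B at offset 0, with B a literal.
 * When save_before_b is non-zero, the captures are snapshotted before
 * each attempt at B and restored if B fails. */
static Result match_minimal(const char *s, const char *b, int save_before_b)
{
    Result r = { { { -1, -1 }, { -1, -1 } }, -1, 0 };
    size_t blen = strlen(b);
    int pos = 0;

    for (;;) {
        Capture snap[2];

        if (save_before_b) {            /* the step the commit removes */
            memcpy(snap, r.cap, sizeof snap);
            r.b_snapshots++;
        }

        /* Minimal matching: try B before another iteration of A. */
        if (strncmp(s + pos, b, blen) == 0) {
            r.match_end = pos + (int)blen;
            return r;
        }

        if (save_before_b)              /* B never wrote to r.cap, so */
            memcpy(r.cap, snap, sizeof snap);  /* this changes nothing */

        /* B failed: try one more iteration of A = (\w)(\d). */
        unsigned char c0 = (unsigned char)s[pos];
        if (!(isalnum(c0) || c0 == '_'))
            return r;                   /* A failed: overall failure */
        if (!isdigit((unsigned char)s[pos + 1]))
            return r;

        r.cap[0] = (Capture){ pos,     pos + 1 };
        r.cap[1] = (Capture){ pos + 1, pos + 2 };
        pos += 2;
    }
}
```

Calling match_minimal("a1b2c3XY", "XY", 1) and
match_minimal("a1b2c3XY", "XY", 0) gives the same match_end and the same
captures; the only difference is the number of snapshots taken before B.
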

If in the future some sneaky regex is found which this commit breaks,
then as well as restoring the old behaviour, you should look carefully
to see whether similar paren-saving behaviour for B should be added to
greedy matches too, i.e. A*B. It was partly the discrepancy between
saving for A*?B but not for A*B which made me suspect it was redundant.

-rw-r--r-- | regexec.c | 5 |
1 files changed, 0 insertions, 5 deletions
@@ -7571,9 +7571,6 @@ NULL
             if (cur_curlyx->u.curlyx.minmod) {
                 ST.save_curlyx = cur_curlyx;
                 cur_curlyx = cur_curlyx->u.curlyx.prev_curlyx;
-                ST.cp = regcppush(rex, ST.save_curlyx->u.curlyx.parenfloor,
-                            maxopenparen);
-                REGCP_SET(ST.lastcp);
                 PUSH_YES_STATE_GOTO(WHILEM_B_min, ST.save_curlyx->u.curlyx.B,
                                     locinput);
                 NOT_REACHED; /* NOTREACHED */
@@ -7643,8 +7640,6 @@ NULL
 
         case WHILEM_B_min_fail: /* just failed to match B in a minimal match */
             cur_curlyx = ST.save_curlyx;
-            REGCP_UNWIND(ST.lastcp);
-            regcppop(rex, &maxopenparen);
 
             if (cur_curlyx->u.curlyx.count >= /*max*/ARG2(cur_curlyx->u.curlyx.me)) {
                 /* Maximum greed exceeded */