Diffstat (limited to 'pod')
-rw-r--r-- pod/perl595delta.pod | 8
-rw-r--r-- pod/perldiag.pod | 10
-rw-r--r-- pod/perlre.pod | 17
-rw-r--r-- pod/perlreguts.pod | 11
4 files changed, 42 insertions, 4 deletions
diff --git a/pod/perl595delta.pod b/pod/perl595delta.pod
index af76cf68ee..717540cb22 100644
--- a/pod/perl595delta.pod
+++ b/pod/perl595delta.pod
@@ -113,7 +113,13 @@ quantifiers. (Yves Orton)
 
 The regex engine now supports a number of special purpose backtrack
 control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
-and (*ACCEPT). See L<perlre> for their descriptions.
+and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)
+
+=item Relative backreferences
+
+A new syntax C<\R1> ("1" being any positive decimal integer) allows
+relative backreferencing. This should make it easier to embed patterns
+that contain backreferences. (Yves Orton)
 
 =back
 
diff --git a/pod/perldiag.pod b/pod/perldiag.pod
index e9d23267bd..e6a8b0f6dd 100644
--- a/pod/perldiag.pod
+++ b/pod/perldiag.pod
@@ -3495,6 +3495,16 @@ prepend a zero to make the number at least two digits: C<\07>
 The <-- HERE shows in the regular expression about where the problem was
 discovered.
 
+=item Reference to nonexistent or unclosed group in regex; marked by <-- HERE in m/%s/
+
+(F) You used something like C<\R7> in your regular expression, but there are
+not at least seven sets of closed capturing parentheses in the expression before
+where the C<\R7> was located. It's also possible you forgot to escape the
+backslash.
+
+The <-- HERE shows in the regular expression about where the problem was
+discovered.
+
 =item Reference to nonexistent named group in regex; marked by <-- HERE in m/%s/
 
 (F) You used something like C<\k'NAME'> or C<< \k<NAME> >> in your regular
diff --git a/pod/perlre.pod b/pod/perlre.pod
index c2b968062b..7df564738e 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -246,7 +246,9 @@ X<word> X<whitespace>
              so you may end up with malformed pieces of UTF-8.
              Unsupported in lookbehind.
     \1       Backreference to a specific group.
-             '1' may actually be any positive integer.
+             '1' may actually be any positive integer.
+    \R1      Relative backreference to a preceding closed group.
+             '1' may actually be any positive integer.
     \k<name> Named backreference
     \N{name} Named unicode character, or unicode escape
     \x12     Hexadecimal escape sequence
@@ -469,7 +471,15 @@ ambiguity by interpreting \10 as a backreference only if at least 10
 left parentheses have opened before it. Likewise \11 is a
 backreference only if at least 11 left parentheses have opened
 before it. And so on. \1 through \9 are always interpreted as
-backreferences.
+backreferences.
+
+X<relative backreference>
+In Perl 5.10 it is possible to relatively address a capture buffer by
+using the C<\RNNN> notation, where C<NNN> is negative offset to a
+preceding completed capture buffer. Thus C<\R1> refers to the last
+buffer closed, C<\R2> refers to the buffer before that, and so on. Note
+especially that C</(foo)(\R1)/> refers to the capture buffer containing
+C<foo>, not to the buffer containing C<\R1>.
 
 Additionally, as of Perl 5.10 you may use named capture buffers and named
 backreferences. The notation is C<< (?<name>...) >> and C<< \k<name> >>
@@ -884,6 +894,9 @@ C<(?R)>. If PARNO is preceded by a plus or minus sign then it is assumed to
 be relative, with negative numbers indicating preceding capture buffers and
 positive ones following. Thus C<(?-1)> refers to the most recently declared
 buffer, and C<(?+1)> indicates the next buffer to be declared.
+Note that the counting for relative recursion differs from that of
+relative backreferences, in that with recursion unclosed buffers B<are>
+included.
 
 The following pattern matches a function foo() which may contain balanced
 parentheses as the argument.
diff --git a/pod/perlreguts.pod b/pod/perlreguts.pod
index 937565745c..aa54bfcb8f 100644
--- a/pod/perlreguts.pod
+++ b/pod/perlreguts.pod
@@ -747,6 +747,7 @@ F<regexp.h> contains the base structure definition:
     typedef struct regexp {
         I32 *startp;
         I32 *endp;
+        regexp_paren_ofs *swap;
         regnode *regstclass;
         struct reg_substr_data *substrs;
         char *precomp; /* pre-compilation regular expression */
@@ -802,11 +803,19 @@ These fields are used to keep track of how many paren groups could be matched
 in the pattern, which was the last open paren to be entered, and which was
 the last close paren to be entered.
 
-=item C<startp>, C<endp>
+=item C<startp>, C<endp>, C<swap>
 
 These fields store arrays that are used to hold the offsets of the begining
 and end of each capture group that has matched. -1 is used to indicate no match.
 
+C<swap> is an extra set of startp/endp stored in a C<regexp_paren_ofs>
+struct. This is used when the last successful match was from same pattern
+as the current pattern, so that a partial match doesn't overwrite the
+previous match's results. When this field is data filled the matching
+engine will swap buffers before every match attempt. If the match fails,
+then it swaps them back. If it's successful it leaves them. This field
+is populated on demand and is by default null.
+
 These are the source for @- and @+.
 
 =item C<subbeg> C<sublen> C<saved_copy>