author | Karl Williamson <khw@khw-desktop.(none)> | 2010-06-22 09:45:23 -0600 |
---|---|---|
committer | Jesse Vincent <jesse@bestpractical.com> | 2010-06-28 22:30:04 -0400 |
commit | c27a5cfe2661343fcb3b4f58478604d8b59b20de (patch) | |
tree | 6570affd331aa336f0150b98b0e527a511ea5c28 | |
parent | 3ff1c45a45226c5e55f9f22807a1b068b751e49e (diff) | |
download | perl-c27a5cfe2661343fcb3b4f58478604d8b59b20de.tar.gz |
Standardize on use of 'capture group' over 'buffer'
Both terms 'capture group' and 'capture buffer' are used in the
documentation. This patch changes most uses of the latter to the
former, as they are referenced using "\g".
-rw-r--r-- | ext/re/re.pm | 2 |
-rw-r--r-- | pod/perlfaq6.pod | 12 |
-rw-r--r-- | pod/perlglossary.pod | 11 |
-rw-r--r-- | pod/perlre.pod | 109 |
-rw-r--r-- | pod/perlreapi.pod | 8 |
-rw-r--r-- | pod/perlrebackslash.pod | 58 |
-rw-r--r-- | pod/perlreref.pod | 6 |
-rw-r--r-- | pod/perlretut.pod | 22 |
-rw-r--r-- | pod/perlvar.pod | 4 |
9 files changed, 126 insertions, 106 deletions
diff --git a/ext/re/re.pm b/ext/re/re.pm index fb0b8d264c..d9fd912a4a 100644 --- a/ext/re/re.pm +++ b/ext/re/re.pm @@ -307,7 +307,7 @@ Turns on all "extra" debugging options. =item BUFFERS -Enable debugging the capture buffer storage during match. Warning, +Enable debugging the capture group storage during match. Warning, this can potentially produce extremely large output. =item TRIEM diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod index 40965d09fc..b51884d063 100644 --- a/pod/perlfaq6.pod +++ b/pod/perlfaq6.pod @@ -535,11 +535,11 @@ The group C<< [^<>]++ >> finds one or more non-angle brackets without backtracking. Second, the new C<(?PARNO)> refers to the sub-pattern in the -particular capture buffer given by C<PARNO>. In the following regex, -the first capture buffer finds (and remembers) the balanced text, and +particular capture group given by C<PARNO>. In the following regex, +the first capture group finds (and remembers) the balanced text, and you need that same pattern within the first buffer to get past the nested text. That's the recursive part. The C<(?1)> uses the pattern -in the outer capture buffer as an independent part of the regex. +in the outer capture group as an independent part of the regex. Putting it all together, you have: @@ -552,15 +552,15 @@ Putting it all together, you have: HERE my @groups = $string =~ m/ - ( # start of capture buffer 1 + ( # start of capture group 1 < # match an opening angle bracket (?: [^<>]++ # one or more non angle brackets, non backtracking | - (?1) # found < or >, so recurse to capture buffer 1 + (?1) # found < or >, so recurse to capture group 1 )* > # match a closing angle bracket - ) # end of capture buffer 1 + ) # end of capture group 1 /xg; $" = "\n\t"; diff --git a/pod/perlglossary.pod b/pod/perlglossary.pod index b44fcd447f..858f24b12c 100644 --- a/pod/perlglossary.pod +++ b/pod/perlglossary.pod @@ -234,7 +234,8 @@ some of its high-level ideas. 
=item backreference A substring L<captured|/capturing> by a subpattern within -unadorned parentheses in a L</regex>. Backslashed decimal numbers +unadorned parentheses in a L</regex>, also referred to as a capture group. +Backslashed decimal numbers (C<\1>, C<\2>, etc.) later in the same pattern refer back to the corresponding subpattern in the current match. Outside the pattern, the numbered variables (C<$1>, C<$2>, etc.) continue to refer to these @@ -458,10 +459,16 @@ handler when some event of interest transpires. Reduced to a standard form to facilitate comparison. +=item capture buffer, capture group + +These two terms are synonymous: +a L<captured substring|/capturing> by a regex subpattern. + =item capturing The use of parentheses around a L</subpattern> in a L</regular -expression> to store the matched L</substring> as a L</backreference>. +expression> to store the matched L</substring> as a L</backreference> +or L</capture group>. (Captured strings are also returned as a list in L</list context>.) =item character diff --git a/pod/perlre.pod b/pod/perlre.pod index e2e6eb5e93..9a7e4fef06 100644 --- a/pod/perlre.pod +++ b/pod/perlre.pod @@ -271,11 +271,11 @@ X<\g> X<\k> X<\K> X<backreference> characters into their UTF-8 bytes, so you may end up with malformed pieces of UTF-8. Unsupported in lookbehind. - \1 [5] Backreference to a specific capture buffer or group. + \1 [5] Backreference to a specific capture group or buffer. '1' may actually be any positive integer. \g1 [5] Backreference to a specific or previous group, \g{-1} [5] The number may be negative indicating a relative - previous buffer and may optionally be wrapped in + previous group and may optionally be wrapped in curly brackets for safer parsing. \g{name} [5] Named backreference \k<name> [5] Named backreference @@ -308,7 +308,7 @@ See L<perlrebackslash/Misc> for details. =item [5] -See L</Capture buffers> below for details. +See L</Capture groups> below for details. 
=item [6] @@ -377,18 +377,20 @@ row. It is worth noting that C<\G> improperly used can result in an infinite loop. Take care when using patterns that include C<\G> in an alternation. -=head3 Capture buffers +=head3 Capture groups -The bracketing construct C<( ... )> creates capture buffers. To refer -to the current contents of a buffer later on, within the same pattern, -use \1 for the first, \2 for the second, and so on. +The bracketing construct C<( ... )> creates capture groups (also referred to as +capture buffers). To refer to the current contents of a group later on, within +same pattern, use \1 for the first, \2 for the second, and so on. Outside the match use "$" instead of "\". (The \<digit> notation works in certain circumstances outside the match. See L</Warning on \1 Instead of $1> below for details.) Referring back to another part of the match is called a I<backreference>. X<regex, capture buffer> X<regexp, capture buffer> +X<regex, capture group> X<regexp, capture group> X<regular expression, capture buffer> X<backreference> +X<regular expression, capture group> X<backreference> There is no limit to the number of captured substrings that you may use. However Perl also uses \10, \11, etc. as aliases for \010, @@ -413,33 +415,34 @@ following it. When N is a positive integer the C<\g{N}> notation is exactly equivalent to using normal backreferences. When N is a negative integer then it is a relative backreference referring to the previous N'th capturing group. When the bracket form is used and N is not an integer, it -is treated as a reference to a named buffer. +is treated as a reference to a named group. -Thus C<\g{-1}> refers to the last buffer, C<\g{-2}> refers to the -buffer before that. For example: +Thus C<\g{-1}> refers to the last group, C<\g{-2}> refers to the +group before that. 
For example: / - (Y) # buffer 1 - ( # buffer 2 - (X) # buffer 3 - \g{-1} # backref to buffer 3 - \g{-3} # backref to buffer 1 + (Y) # group 1 + ( # group 2 + (X) # group 3 + \g{-1} # backref to group 3 + \g{-3} # backref to group 1 ) /x and would match the same as C</(Y) ( (X) \3 \1 )/x>. -Additionally, as of Perl 5.10.0 you may use named capture buffers and named +Additionally, as of Perl 5.10.0 you may use named capture groups and named backreferences. The notation is C<< (?<name>...) >> to declare and C<< \k<name> >> to reference. You may also use apostrophes instead of angle brackets to delimit the name; and you may use the bracketed C<< \g{name} >> backreference syntax. -It's possible to refer to a named capture buffer by absolute and relative number as well. -Outside the pattern, a named capture buffer is available via the C<%+> hash. -When different buffers within the same pattern have the same name, C<$+{name}> +It's possible to refer to a named capture group by absolute and relative number as well. +Outside the pattern, a named capture group is available via the C<%+> hash. +When different groups within the same pattern have the same name, C<$+{name}> and C<< \k<name> >> refer to the leftmost defined group. (Thus it's possible -to do things with named capture buffers that would otherwise require C<(??{})> +to do things with named capture groups that would otherwise require C<(??{})> code to accomplish.) X<named capture buffer> X<regular expression, named capture buffer> +X<named capture group> X<regular expression, named capture group> X<%+> X<$+{name}> X<< \k<name> >> Examples: @@ -628,22 +631,22 @@ is equivalent to the more verbose X<(?|)> X<Branch reset> This is the "branch reset" pattern, which has the special property -that the capture buffers are numbered from the same starting point +that the capture groups are numbered from the same starting point in each alternation branch. It is available starting from perl 5.10.0. 
-Capture buffers are numbered from left to right, but inside this +Capture groups are numbered from left to right, but inside this construct the numbering is restarted for each branch. -The numbering within each branch will be as normal, and any buffers +The numbering within each branch will be as normal, and any groups following this construct will be numbered as though the construct contained only one branch, that being the one with the most capture -buffers in it. +groups in it. This construct will be useful when you want to capture one of a number of alternative matches. Consider the following pattern. The numbers underneath show in -which buffer the captured content will be stored. +which group the captured content will be stored. # before ---------------branch-reset----------- after @@ -652,7 +655,7 @@ which buffer the captured content will be stored. Be careful when using the branch reset pattern in combination with named captures. Named captures are implemented as being aliases to -numbered buffers holding the captures, and that interferes with the +numbered groups holding the captures, and that interferes with the implementation of the branch reset pattern. If you are using named captures in a branch reset pattern, it's best to use the same names, in the same order, in each of the alternations: @@ -666,8 +669,8 @@ Not doing so may lead to surprises: say $+ {a}; # Prints '12' say $+ {b}; # *Also* prints '12'. -The problem here is that both the buffer named C<< a >> and the buffer -named C<< b >> are aliases for the buffer belonging to C<< $1 >>. +The problem here is that both the group named C<< a >> and the group +named C<< b >> are aliases for the group belonging to C<< $1 >>. =item Look-Around Assertions X<look-around assertion> X<lookaround assertion> X<look-around> X<lookaround> @@ -744,18 +747,18 @@ only for fixed-width look-behind. =item C<< (?<NAME>pattern) >> X<< (?<NAME>) >> X<(?'NAME')> X<named capture> X<capture> -A named capture buffer. 
Identical in every respect to normal capturing +A named capture group. Identical in every respect to normal capturing parentheses C<()> but for the additional fact that C<%+> or C<%-> may be -used after a successful match to refer to a named buffer. See C<perlvar> +used after a successful match to refer to a named group. See C<perlvar> for more details on the C<%+> and C<%-> hashes. -If multiple distinct capture buffers have the same name then the -$+{NAME} will refer to the leftmost defined buffer in the match. +If multiple distinct capture groups have the same name then the +$+{NAME} will refer to the leftmost defined group in the match. The forms C<(?'NAME'pattern)> and C<< (?<NAME>pattern) >> are equivalent. B<NOTE:> While the notation of this construct is the same as the similar -function in .NET regexes, the behavior is not. In Perl the buffers are +function in .NET regexes, the behavior is not. In Perl the groups are numbered sequentially regardless of being named or not. Thus in the pattern @@ -892,9 +895,9 @@ This is a "postponed" regular subexpression. The C<code> is evaluated at run time, at the moment this subexpression may match. The result of evaluation is considered as a regular expression and matched as if it were inserted instead of this construct. Note that this means -that the contents of capture buffers defined inside an eval'ed pattern +that the contents of capture groups defined inside an eval'ed pattern are not available outside of the pattern, and vice versa, there is no -way for the inner pattern to refer to a capture buffer defined outside. +way for the inner pattern to refer to a capture group defined outside. 
Thus, ('a' x 100)=~/(??{'(.)' x 100})/ @@ -939,20 +942,20 @@ X<regex, recursive> X<regexp, recursive> X<regular expression, recursive> X<regex, relative recursion> Similar to C<(??{ code })> except it does not involve compiling any code, -instead it treats the contents of a capture buffer as an independent -pattern that must match at the current position. Capture buffers +instead it treats the contents of a capture group as an independent +pattern that must match at the current position. Capture groups contained by the pattern will have the value as determined by the outermost recursion. PARNO is a sequence of digits (not starting with 0) whose value reflects -the paren-number of the capture buffer to recurse to. C<(?R)> recurses to +the paren-number of the capture group to recurse to. C<(?R)> recurses to the beginning of the whole pattern. C<(?0)> is an alternate syntax for C<(?R)>. If PARNO is preceded by a plus or minus sign then it is assumed -to be relative, with negative numbers indicating preceding capture buffers +to be relative, with negative numbers indicating preceding capture groups and positive ones following. Thus C<(?-1)> refers to the most recently -declared buffer, and C<(?+1)> indicates the next buffer to be declared. +declared group, and C<(?+1)> indicates the next group to be declared. Note that the counting for relative recursion differs from that of -relative backreferences, in that with recursion unclosed buffers B<are> +relative backreferences, in that with recursion unclosed groups B<are> included. The following pattern matches a function foo() which may contain @@ -987,7 +990,7 @@ the output produced should be the following: $2 = (bar(baz)+baz(bop)) $3 = bar(baz)+baz(bop) -If there is no corresponding capture buffer defined, then it is a +If there is no corresponding capture group defined, then it is a fatal error. Recursing deeper than 50 times without consuming any input string will also result in a fatal error. 
The maximum depth is compiled into perl, so changing it requires a custom build. @@ -1030,7 +1033,7 @@ X<(?()> Conditional expression. C<(condition)> should be either an integer in parentheses (which is valid if the corresponding pair of parentheses matched), a look-ahead/look-behind/evaluate zero-width assertion, a -name in angle brackets or single quotes (which is valid if a buffer +name in angle brackets or single quotes (which is valid if a group with the given name matched), or the special symbol (R) (true when evaluated inside of recursion or eval). Additionally the R may be followed by a number, (which will be true when evaluated when recursing @@ -1043,11 +1046,11 @@ Here's a summary of the possible predicates: =item (1) (2) ... -Checks if the numbered capturing buffer has matched something. +Checks if the numbered capturing group has matched something. =item (<NAME>) ('NAME') -Checks if a buffer with the given name has matched something. +Checks if a group with the given name has matched something. =item (?{ CODE }) @@ -1112,8 +1115,8 @@ An example of how this might be used is as follows: (?<ADRESS_PAT>....) )/x -Note that capture buffers matched inside of recursion are not accessible -after the recursion returns, so the extra layer of capturing buffers is +Note that capture groups matched inside of recursion are not accessible +after the recursion returns, so the extra layer of capturing groups is necessary. Thus C<$+{NAME_PAT}> would not be defined even though C<$+{NAME}> would be. @@ -1367,7 +1370,7 @@ name of the most recently executed C<(*MARK:NAME)> that was involved in the match. This can be used to determine which branch of a pattern was matched -without using a separate capture buffer for each branch, which in turn +without using a separate capture group for each branch, which in turn can result in a performance improvement, as perl cannot optimize C</(?:(x)|(y)|(z))/> as efficiently as something like C</(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/>. 
@@ -1462,7 +1465,7 @@ whether there is actually more to match in the string. When inside of a nested pattern, such as recursion, or in a subpattern dynamically generated via C<(??{})>, only the innermost pattern is ended immediately. -If the C<(*ACCEPT)> is inside of capturing buffers then the buffers are +If the C<(*ACCEPT)> is inside of capturing groups then the groups are marked as ended at the point at which the C<(*ACCEPT)> was encountered. For instance: @@ -1952,7 +1955,7 @@ only whether or not C<S> can match is important. =item C<(??{ EXPR })>, C<(?PARNO)> The ordering is the same as for the regular expression which is -the result of EXPR, or the pattern contained by capture buffer PARNO. +the result of EXPR, or the pattern contained by capture group PARNO. =item C<(?(condition)yes-pattern|no-pattern)> @@ -2026,15 +2029,15 @@ Perl specific syntax, the following are also accepted: =item C<< (?PE<lt>NAMEE<gt>pattern) >> -Define a named capture buffer. Equivalent to C<< (?<NAME>pattern) >>. +Define a named capture group. Equivalent to C<< (?<NAME>pattern) >>. =item C<< (?P=NAME) >> -Backreference to a named capture buffer. Equivalent to C<< \g{NAME} >>. +Backreference to a named capture group. Equivalent to C<< \g{NAME} >>. =item C<< (?P>NAME) >> -Subroutine call to a named capture buffer. Equivalent to C<< (?&NAME) >>. +Subroutine call to a named capture group. Equivalent to C<< (?&NAME) >>. =back diff --git a/pod/perlreapi.pod b/pod/perlreapi.pod index d1d947b8a7..7dc9645fdd 100644 --- a/pod/perlreapi.pod +++ b/pod/perlreapi.pod @@ -243,7 +243,7 @@ perl will handle releasing anything else contained in the regexp structure. Called to get/set the value of C<$`>, C<$'>, C<$&> and their named equivalents, ${^PREMATCH}, ${^POSTMATCH} and $^{MATCH}, as well as the -numbered capture buffers (C<$1>, C<$2>, ...). +numbered capture groups (C<$1>, C<$2>, ...). The C<paren> parameter will be C<-2> for C<$`>, C<-1> for C<$'>, C<0> for C<$&>, C<1> for C<$1> and so forth. 
@@ -492,7 +492,7 @@ values. in the final match, used for optimisations */ struct reg_substr_data *substrs; - U32 nparens; /* number of capture buffers */ + U32 nparens; /* number of capture groups */ /* private engine specific data */ U32 intflags; /* Engine Specific Internal flags */ @@ -612,7 +612,7 @@ C<regexp_paren_pair> struct is defined as follows: } regexp_paren_pair; If C<< ->offs[num].start >> or C<< ->offs[num].end >> is C<-1> then that -capture buffer did not match. C<< ->offs[0].start/end >> represents C<$&> (or +capture group did not match. C<< ->offs[0].start/end >> represents C<$&> (or C<${^MATCH> under C<//p>) and C<< ->offs[paren].end >> matches C<$$paren> where C<$paren >= 1>. @@ -633,7 +633,7 @@ The relevant snippet from C<Perl_pp_regcomp>: =head2 C<paren_names> -This is a hash used internally to track named capture buffers and their +This is a hash used internally to track named capture groups and their offsets. The keys are the names of the buffers the values are dualvars, with the IV slot holding the number of buffers with the given name and the pv being an embedded array of I32. The values may also be contained diff --git a/pod/perlrebackslash.pod b/pod/perlrebackslash.pod index 6587ea97d6..5e514ceec6 100644 --- a/pod/perlrebackslash.pod +++ b/pod/perlrebackslash.pod @@ -360,41 +360,51 @@ absolutely, relatively, and by name. =head3 Absolute referencing -A backslash sequence that starts with a backslash and is followed by a -number is an absolute reference (but be aware of the caveat mentioned above). -If the number is I<N>, it refers to the Nth set of parentheses - whatever -has been matched by that set of parenthesis has to be matched by the C<\N> -as well. +Either C<\gI<N>> (starting in Perl 5.10.0), or C<\I<N>> (old-style) where I<N> +is an positive (unsigned) decimal number of any length is an absolute reference +to a capturing group. 
+ +I<N> refers to the Nth set of parentheses - or more accurately, whatever has +been matched by that set of parenthesis. Thus C<\g1> refers to the first +capture group in the regex. + +The C<\gI<N>> form can be equivalently written as C<\g{I<N>}> +which avoids ambiguity when building a regex by concatenating shorter +strings. Otherwise if you had a regex C</$a$b/>, and C<$a> contained C<"\g1">, +and C<$b> contained C<"37">, you would get C</\g137/> which is probably not +what you intended. + +In the C<\I<N>> form, I<N> must not begin with a "0", and there must be at +least I<N> capturing groups, or else I<N> will be considered an octal escape +(but something like C<\18> is the same as C<\0018>, that is the octal escape +C<"\001"> followed by a literal digit C<"8">). + +Mnemonic: I<g>roup. =head4 Examples - /(\w+) \1/; # Finds a duplicated word, (e.g. "cat cat"). - /(.)(.)\2\1/; # Match a four letter palindrome (e.g. "ABBA"). + /(\w+) \g1/; # Finds a duplicated word, (e.g. "cat cat"). + /(\w+) \1/; # Same thing; written old-style + /(.)(.)\g2\g1/; # Match a four letter palindrome (e.g. "ABBA"). =head3 Relative referencing -New in perl 5.10.0 is a different way of referring to capture buffers: C<\g>. -C<\g> takes a number as argument, with the number in curly braces (the -braces are optional). If the number (N) does not have a sign, it's a reference -to the Nth capture group (so C<\g{2}> is equivalent to C<\2> - except that -C<\g> always refers to a capture group and will never be seen as an octal -escape). If the number is negative, the reference is relative, referring to -the Nth group before the C<\g{-N}>. +C<\g-I<N>> (starting in Perl 5.10.0) is used for relative addressing. (It can +be written as C<\g{-I<N>>.) It refers to the I<N>th group before the +C<\g{-I<N>}>. 
-The big advantage of C<\g{-N}> is that it makes it much easier to write +The big advantage of this form is that it makes it much easier to write patterns with references that can be interpolated in larger patterns, even if the larger pattern also contains capture groups. -Mnemonic: I<g>roup. - =head4 Examples - /(A) # Buffer 1 - ( # Buffer 2 - (B) # Buffer 3 - \g{-1} # Refers to buffer 3 (B) - \g{-3} # Refers to buffer 1 (A) + /(A) # Group 1 + ( # Group 2 + (B) # Group 3 + \g{-1} # Refers to group 3 (B) + \g{-3} # Refers to group 1 (A) ) /x; # Matches "ABBA". @@ -403,9 +413,9 @@ Mnemonic: I<g>roup. =head3 Named referencing -Also new in perl 5.10.0 is the use of named capture buffers, which can be +Also new in perl 5.10.0 is the use of named capture groups, which can be referred to by name. This is done with C<\g{name}>, which is a -backreference to the capture buffer with the name I<name>. +backreference to the capture group with the name I<name>. To be compatible with .Net regular expressions, C<\g{name}> may also be written as C<\k{name}>, C<< \k<name> >> or C<\k'name'>. diff --git a/pod/perlreref.pod b/pod/perlreref.pod index 5ddacc5046..c6e01787a3 100644 --- a/pod/perlreref.pod +++ b/pod/perlreref.pod @@ -71,8 +71,8 @@ delimiters can be used. Must be reset with reset(). (...) Groups subexpressions for capturing to $1, $2... (?:...) Groups subexpressions without capturing (cluster) | Matches either the subexpression preceding or following it - \1, \2, \3 ... Matches the text from the Nth group \g1 or \g{1}, \g2 ... Matches the text from the Nth group + \1, \2, \3 ... Matches the text from the Nth group \g-1 or \g{-1}, \g-2 ... Matches the text from the Nth previous group \g{name} Named backreference \k<name> Named backreference @@ -281,8 +281,8 @@ specify the C</p> (preserve) modifier on your regular expression. $^R Holds the result of the last (?{...}) expr @- Offsets of starts of groups. $-[0] holds start of whole match @+ Offsets of ends of groups. 
$+[0] holds end of whole match - %+ Named capture buffers - %- Named capture buffers, as array refs + %+ Named capture groups + %- Named capture groups, as array refs Captured groups are numbered according to their I<opening> paren. diff --git a/pod/perlretut.pod b/pod/perlretut.pod index a9a3372636..9eded21002 100644 --- a/pod/perlretut.pod +++ b/pod/perlretut.pod @@ -799,18 +799,18 @@ using relative backreferences: =head2 Named backreferences -Perl 5.10 also introduced named capture buffers and named backreferences. +Perl 5.10 also introduced named capture groups and named backreferences. To attach a name to a capturing group, you write either C<< (?<name>...) >> or C<< (?'name'...) >>. The backreference may then be written as C<\g{name}>. It is permissible to attach the same name to more than one group, but then only the leftmost one of the eponymous set can be referenced. Outside of the pattern a named -capture buffer is accessible through the C<%+> hash. +capture group is accessible through the C<%+> hash. Assuming that we have to match calendar dates which may be given in one of the three formats yyyy-mm-dd, mm/dd/yyyy or dd.mm.yyyy, we can write three suitable patterns where we use 'd', 'm' and 'y' respectively as the -names of the buffers capturing the pertaining components of a date. The +names of the groups capturing the pertaining components of a date. The matching operation combines the three patterns as alternatives: $fmt1 = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)'; @@ -838,7 +838,7 @@ Consider a pattern for matching a time of the day, civil or military style: Processing the results requires an additional if statement to determine whether C<$1> and C<$2> or C<$3> and C<$4> contain the goodies. It would -be easier if we could use buffer numbers 1 and 2 in second alternative as +be easier if we could use group numbers 1 and 2 in second alternative as well, and this is exactly what the parenthesized construct C<(?|...)>, set around an alternative achieves. 
Here is an extended version of the previous pattern: @@ -847,7 +847,7 @@ previous pattern: print "hour=$1 minute=$2 zone=$3\n"; } -Within the alternative numbering group, buffer numbers start at the same +Within the alternative numbering group, group numbers start at the same position for each alternative. After the group, numbering continues with one higher than the maximum reached across all the alternatives. @@ -901,10 +901,10 @@ C<@+> instead: A group that is required to bundle a set of alternatives may or may not be useful as a capturing group. If it isn't, it just creates a superfluous -addition to the set of available capture buffer values, inside as well as +addition to the set of available capture group values, inside as well as outside the regexp. Non-capturing groupings, denoted by C<(?:regexp)>, still allow the regexp to be treated as a single unit, but don't establish -a capturing buffer at the same time. Both capturing and non-capturing +a capturing group at the same time. Both capturing and non-capturing groupings are allowed to co-exist in the same regexp. Because there is no extraction, non-capturing groupings are faster than capturing groupings. Non-capturing groupings are also handy for choosing exactly @@ -2372,7 +2372,7 @@ matched, otherwise the C<no-regexp> will be matched. The C<condition> can have several forms. The first form is simply an integer in parentheses C<(integer)>. It is true if the corresponding backreference C<\integer> matched earlier in the regexp. The same -thing can be done with a name associated with a capture buffer, written +thing can be done with a name associated with a capture group, written as C<< (<name>) >> or C<< ('name') >>. The second form is a bare zero width assertion C<(?...)>, either a lookahead, a lookbehind, or a code assertion (discussed in the next section). The third set of forms @@ -2466,8 +2466,8 @@ have the full pattern: In C<(?...)> both absolute and relative backreferences may be used. 
The entire pattern can be reinserted with C<(?R)> or C<(?0)>. -If you prefer to name your buffers, you can use C<(?&name)> to -recurse into that buffer. +If you prefer to name your groups, you can use C<(?&name)> to +recurse into that group. =head2 A bit of magic: executing Perl code in a regular expression @@ -2712,7 +2712,7 @@ it will cause to fail, just like at some mismatch between the pattern and the string. Processing of the regexp continues like after any "normal" failure, so that, for instance, the next position in the string or another alternative will be tried. As failing to match doesn't preserve capture -buffers or produce results, it may be necessary to use this in +groups or produce results, it may be necessary to use this in combination with embedded code. %count = (); diff --git a/pod/perlvar.pod b/pod/perlvar.pod index 806809956c..5823b81f0a 100644 --- a/pod/perlvar.pod +++ b/pod/perlvar.pod @@ -633,9 +633,9 @@ After a match against some variable $var: =item %- X<%-> -Similar to C<%+>, this variable allows access to the named capture buffers +Similar to C<%+>, this variable allows access to the named capture groups in the last successful match in the currently active dynamic scope. To -each capture buffer name found in the regular expression, it associates a +each capture group name found in the regular expression, it associates a reference to an array containing the list of values captured by all buffers with that name (should there be several of them), in the order where they appear. |
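
The patched documentation describes three ways to refer back to a capture group: absolutely (C<\g1>), relatively (C<\g{-1}>), and by name (C<\g{name}> or C<< \k<name> >>, with C<%+> available after the match). The following is a minimal sketch of those three forms, assuming Perl 5.10 or later. The sample strings and variable names are illustrative only and are not part of the patch.

```perl
use strict;
use warnings;

# Absolute reference: \g1 (equivalently \g{1}, or old-style \1) is group 1.
my ($repeated) = "the cat cat sat" =~ /(\w+) \g1/;

# Relative reference: \g{-1} is the group opened most recently before it,
# \g{-3} is the third group back (here, group 1).
my $is_abba = "ABBA" =~ /(A)((B)\g{-1}\g{-3})/ ? 1 : 0;

# Named group: (?<name>...) declares it, and %+ exposes the captured
# text after a successful match.
my %date;
if ("2010-06-22" =~ /^(?<y>\d{4})-(?<m>\d\d)-(?<d>\d\d)$/) {
    %date = (y => $+{y}, m => $+{m}, d => $+{d});
}

print "repeated=$repeated abba=$is_abba date=$date{y}/$date{m}/$date{d}\n";
```

The relative form is the one that survives interpolation into a larger pattern, since its meaning does not depend on how many groups precede the interpolated fragment.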