Re: Named-capture regex syntax

Message-ID: <9b18b3110612240538m5c45654br7d27171835f6664@mail.gmail.com>
p4raw-id: //depot/perl@29621
author: Yves Orton <demerphq@gmail.com> 2006-12-24 15:38:15 +0100
committer: Rafael Garcia-Suarez <rgarciasuarez@gmail.com> 2006-12-25 17:03:14 +0000
commit: 1f1031fe96c14865e4f60fdd3a6a6ce073d190c1 (patch)
tree: 1057ec70f13ea09891a734756af802113aed89ad /pod/perlre.pod
parent: 5b64f2bff5b0212a9713f87c3a9e7f6653a1e126 (diff)
1 files changed, 51 insertions, 9 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index a8762118b8..6c2049628c 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -250,6 +250,7 @@ X<word> X<whitespace>
     \g1      Backreference to a specific or previous group,
     \g{-1}   number may be negative indicating a previous buffer and may
              optionally be wrapped in curly brackets for safer parsing.
+    \g{name} Named backreference
     \k<name> Named backreference
     \N{name} Named unicode character, or unicode escape
     \x12     Hexadecimal escape sequence
@@ -486,7 +487,7 @@ backreference only if at least 11 left parentheses have opened
 before it.  And so on.  \1 through \9 are always interpreted as
 backreferences.
 
-X<\g{1}> X<\g{-1}> X<relative backreference>
+X<\g{1}> X<\g{-1}> X<\g{name}> X<relative backreference> X<named backreference>
 In order to provide a safer and easier way to construct patterns using
 backrefs, in Perl 5.10 the C<\g{N}> notation is provided. The curly
 brackets are optional, however omitting them is less safe as the meaning
@@ -494,6 +495,8 @@ of the pattern can be changed by text (such as digits) following it.
 When N is a positive integer the C<\g{N}> notation is exactly equivalent
 to using normal backreferences. When N is a negative integer then it is
 a relative backreference referring to the previous N'th capturing group.
+When the bracket form is used and N is not an integer, it is treated as a
+reference to a named buffer.
 
 Thus C<\g{-1}> refers to the last buffer, C<\g{-2}> refers to the
 buffer before that. For example:
@@ -510,11 +513,12 @@ buffer before that. For example:
 and would match the same as C</(Y) ( (X) \3 \1 )/x>.
 
 Additionally, as of Perl 5.10 you may use named capture buffers and named
-backreferences. The notation is C<< (?<name>...) >> and C<< \k<name> >>
-(you may also use single quotes instead of angle brackets to quote the
-name). The only difference with named capture buffers and unnamed ones is
+backreferences. The notation is C<< (?<name>...) >> to declare and C<< \k<name> >>
+to reference. You may also use single quotes instead of angle brackets to quote the
+name; and you may use the bracketed C<< \g{name} >> back reference syntax.
+The only difference between named capture buffers and unnamed ones is
 that multiple buffers may have the same name and that the contents of
-named capture buffers is available via the C<%+> hash. When multiple
+named capture buffers are available via the C<%+> hash. When multiple
 groups share the same name C<$+{name}> and C<< \k<name> >> refer to the
 leftmost defined group, thus it's possible to do things with named capture
 buffers that would otherwise require C<(??{})> code to accomplish. Named
@@ -751,12 +755,20 @@ pattern
 $+{foo} will be the same as $2, and $3 will contain 'z' instead of
 the opposite which is what a .NET regex hacker might expect.
 
-Currently NAME is restricted to word chars only. In other words, it
-must match C</^\w+$/>.
+Currently NAME is restricted to simple identifiers only.
+In other words, it must match C</^[_A-Za-z][_A-Za-z0-9]*\z/> or
+its Unicode extension (see L<utf8>),
+though it isn't extended by the locale (see L<perllocale>).
 
-=item C<< \k<name> >>
+B<NOTE:> In order to make things easier for programmers with experience
+with the Python or PCRE regex engines the pattern C<< (?P<NAME>pattern) >>
+may be used instead of C<< (?<NAME>pattern) >>; however this form does not
+support the use of single quotes as a delimiter for the name. This is
+only available in Perl 5.10 or later.
 
-=item C<< \k'name' >>
+=item C<< \k<NAME> >>
+
+=item C<< \k'NAME' >>
 
 Named backreference. Similar to numeric backreferences, except that
 the group is designated by name and not number. If multiple groups
@@ -768,6 +780,10 @@ earlier in the pattern.
 
 Both forms are equivalent.
 
+B<NOTE:> In order to make things easier for programmers with experience
+with the Python or PCRE regex engines the pattern C<< (?P=NAME) >>
+may be used instead of C<< \k<NAME> >> in Perl 5.10 or later.
+
 =item C<(?{ code })>
 X<(?{})> X<regex, code in> X<regexp, code in> X<regular expression, code in>
 
@@ -989,6 +1005,10 @@ the same name, then it recurses to the leftmost.
 It is an error to refer to a name that is not declared somewhere in the
 pattern.
 
+B<NOTE:> In order to make things easier for programmers with experience
+with the Python or PCRE regex engines the pattern C<< (?P>NAME) >>
+may be used instead of C<< (?&NAME) >> as of Perl 5.10.
+
 =item C<(?(condition)yes-pattern|no-pattern)>
 X<(?()>
 
@@ -1980,6 +2000,28 @@ part of this regular expression needs to be converted explicitly
     $re = customre::convert $re;
     /\Y|$re\Y|/;
 
+=head1 PCRE/Python Support
+
+As of Perl 5.10, Perl supports several Python/PCRE-specific extensions
+to the regex syntax. While Perl programmers are encouraged to use the
+Perl-specific syntax, the following are legal in Perl 5.10:
+
+=over 4
+
+=item C<< (?P<NAME>pattern) >>
+
+Define a named capture buffer. Equivalent to C<< (?<NAME>pattern) >>.
+
+=item C<< (?P=NAME) >>
+
+Backreference to a named capture buffer. Equivalent to C<< \g{NAME} >>.
+
+=item C<< (?P>NAME) >>
+
+Subroutine call to a named capture buffer. Equivalent to C<< (?&NAME) >>.
+
+=back
+
 =head1 BUGS
 
 This document varies from difficult to understand to completely
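
The syntax documented by this patch can be tried out with a short Perl sketch. It assumes Perl 5.10 or later. The capture names (`char`, `paren`), the sample strings, and the variable names are made up for illustration and are not part of the commit. The first half runs the three backreference spellings (`\k<name>`, `\g{name}`, `(?P=name)`) against the same input, and the second half compares the two recursion spellings (`(?&NAME)` and `(?P>NAME)`).

```perl
use strict;
use warnings;
use 5.010;

# Three spellings of "a word character followed by itself".
# The capture name "char" is an arbitrary example name.
my %doubled = (
    'k_angle'  => qr/(?<char>\w)\k<char>/,     # Perl-style declaration + \k<name>
    'g_braces' => qr/(?<char>\w)\g{char}/,     # bracketed \g{name} form
    'python'   => qr/(?P<char>\w)(?P=char)/,   # Python/PCRE-style spellings
);

my %found;
for my $style (sort keys %doubled) {
    if ('bookkeeper' =~ $doubled{$style}) {
        # Named captures are exposed through the %+ hash.
        $found{$style} = $+{char};
        say "$style: repeated character is '$found{$style}'";
    }
}

# Recursion into a named group: (?&NAME) and its Python/PCRE twin (?P>NAME).
# Both patterns accept a single balanced, parenthesised expression.
my $balanced_perl   = qr/\A(?<paren>\((?:[^()]++|(?&paren))*\))\z/;
my $balanced_python = qr/\A(?P<paren>\((?:[^()]++|(?P>paren))*\))\z/;

my @balance_results;
for my $text ('(a(b)c)', '(a(b') {
    my $perl_ok   = $text =~ $balanced_perl   ? 1 : 0;
    my $python_ok = $text =~ $balanced_python ? 1 : 0;
    push @balance_results, [$text, $perl_ok, $python_ok];
    say "$text: perl-style=$perl_ok python-style=$python_ok";
}
```

If the patch text is accurate, each pair of spellings should behave identically, so any difference between the Perl-style and Python-style results would point to a typo in the pattern rather than a semantic difference.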