author Karl Williamson <khw@cpan.org> 2017-01-13 11:17:25 -0700
committer Karl Williamson <khw@cpan.org> 2017-01-13 11:44:35 -0700
commit 2ab076704905c338cc874079818784698cd5bc85
tree ab9205493a154a541b4b75a924068186e3bd1f49
parent 563642b4907d9b1b6beaa96b472ae787ae81d56f
perlre: Clarifications, typos
Diffstat (limited to 'pod/perlre.pod')
-rw-r--r-- pod/perlre.pod 40
1 files changed, 36 insertions, 4 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 10783a30b8..0b1ae4ce1e 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -144,7 +144,6 @@ L<perlretut/"Using regular expressions in Perl"> are:
g - globally match the pattern repeatedly in the string
Substitution-specific modifiers described in
-
L<perlop/"s/PATTERN/REPLACEMENT/msixpodualngcer"> are:
e - evaluate the right-hand side as an expression
@@ -170,7 +169,7 @@ L</Overview> above.
C</x> tells
the regular expression parser to ignore most whitespace that is neither
backslashed nor within a bracketed character class. You can use this to
-break up your regular expression into (slightly) more readable parts.
+break up your regular expression into more readable parts.
Also, the C<"#"> character is treated as a metacharacter introducing a
comment that runs up to the pattern's closing delimiter, or to the end
of the current line if the pattern extends onto the next line. Hence,
@@ -190,6 +189,10 @@ You can use L</(?#text)> to create a comment that ends earlier than the
end of the current line, but C<text> also can't contain the closing
delimiter unless escaped with a backslash.
+A common pitfall is to forget that C<#> characters begin a comment under
+C</x> and are not matched literally. Just keep that in mind when trying
+to puzzle out why a particular C</x> pattern isn't working as expected.
+
Taken together, these features go a long way towards
making Perl's regular expressions more readable. Here's an example:
@@ -554,7 +557,6 @@ meanings:
X<metacharacter>
X<\> X<^> X<.> X<$> X<|> X<(> X<()> X<[> X<[]>
-
\ Quote the next metacharacter
^ Match the beginning of the line
. Match any character (except newline)
@@ -1075,13 +1077,30 @@ a backslash if it appears in the comment.
See L</E<sol>x> for another way to have comments in patterns.
+Note that a comment can go just about anywhere, except in the middle of
+an escape sequence. Examples:
+
+ qr/foo(?#comment)bar/' # Matches 'foobar'
+
+ # The pattern below matches 'abcd', 'abccd', or 'abcccd'
+ qr/abc(?#comment between literal and its quantifier){1,3}d/
+
+ # The pattern below generates a syntax error, because the '\p' must
+ # be followed immediately by a '{'.
+ qr/\p(?#comment between \p and its property name){Any}/
+
+ # The pattern below generates a syntax error, because the initial
+ # '\(' is a literal opening parenthesis, and so there is nothing
+ # for the closing ')' to match
+ qr/\(?#the backslash means this isn't a comment)p{Any}/
+
=item C<(?adlupimnsx-imnsx)>
=item C<(?^alupimnsx)>
X<(?)> X<(?^)>
One or more embedded pattern-match modifiers, to be turned on (or
-turned off, if preceded by C<"-">) for the remainder of the pattern or
+turned off if preceded by C<"-">) for the remainder of the pattern or
the remainder of the enclosing pattern group (if any).
This is particularly useful for dynamically-generated patterns,
@@ -1111,6 +1130,15 @@ These modifiers do not carry over into named subpatterns called in the
enclosing group. In other words, a pattern such as C<((?i)(?&NAME))> does not
change the case-sensitivity of the C<"NAME"> pattern.
+A modifier is overridden by later occurrences of this construct in the
+same scope containing the same modifier, so that
+
+ /((?im)foo(?-m)bar)/
+
+matches all of C<foobar> case insensitively, but uses C</m> rules for
+only the C<foo> portion. The C<a> flag overrides C<aa> as well;
+likewise C<aa> overrides C<a>.
+
Any of these modifiers can be set to apply globally to all regular
expressions compiled within the scope of a C<use re>. See
L<re/"'/flags' mode">.
@@ -1165,6 +1193,10 @@ is equivalent to the more verbose
Note that any C<()> constructs enclosed within this one will still
capture unless the C</n> modifier is in effect.
+
+Like the L</(?adlupimnsx-imnsx)> construct, C<aa> and C<a> override each
+other.
+
Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately
after the C<"?"> is a shorthand equivalent to C<d-imnsx>. Any positive
flags (except C<"d">) may follow the caret, so