Diffstat (limited to 'pod/perlre.pod')
-rw-r--r-- | pod/perlre.pod | 27 |
1 files changed, 16 insertions, 11 deletions
diff --git a/pod/perlre.pod b/pod/perlre.pod
index 2b24379c8b..14892a8846 100644
--- a/pod/perlre.pod
+++ b/pod/perlre.pod
@@ -136,7 +136,7 @@ also work:
     \L lowercase till \E (think vi)
     \U uppercase till \E (think vi)
     \E end case modification (think vi)
-    \Q quote regexp metacharacters till \E
+    \Q quote (disable) regexp metacharacters till \E
 
 If C<use locale> is in effect, the case map used by C<\l>, C<\L>, C<\u>
 and <\U> is taken from the current locale. See L<perllocale>.
@@ -226,19 +226,20 @@ you've used them once, use them at will, because you've already paid
 the price.
 
 You will note that all backslashed metacharacters in Perl are
-alphanumeric, such as C<\b>, C<\w>, C<\n>. Unlike some other regular expression
-languages, there are no backslashed symbols that aren't alphanumeric.
-So anything that looks like \\, \(, \), \E<lt>, \E<gt>, \{, or \} is always
-interpreted as a literal character, not a metacharacter. This makes it
-simple to quote a string that you want to use for a pattern but that
-you are afraid might contain metacharacters. Quote simply all the
+alphanumeric, such as C<\b>, C<\w>, C<\n>. Unlike some other regular
+expression languages, there are no backslashed symbols that aren't
+alphanumeric. So anything that looks like \\, \(, \), \E<lt>, \E<gt>,
+\{, or \} is always interpreted as a literal character, not a
+metacharacter. This was once used in a common idiom to disable or
+quote the special meanings of regular expression metacharacters in a
+string that you want to use for a pattern. Simply quote all the
 non-alphanumeric characters:
 
     $pattern =~ s/(\W)/\\$1/g;
 
-You can also use the builtin quotemeta() function to do this.
-An even easier way to quote metacharacters right in the match operator
-is to say
+Now it is much more common to see either the quotemeta() function or
+the \Q escape sequence used to disable the metacharacters special
+meanings like this:
 
     /$unquoted\Q$quoted\E$unquoted/
 
@@ -515,7 +516,11 @@ in C<[]>, which will match any one of the characters in the list. If the
 first character after the "[" is "^", the class matches any character not
 in the list. Within a list, the "-" character is used to specify a
 range, so that C<a-z> represents all the characters between "a" and "z",
-inclusive.
+inclusive. If you want "-" itself to be a member of a class, put it
+at the start or end of the list, or escape it with a backslash. (The
+following all specify the same class of three characters: C<[-az]>,
+C<[az-]>, and C<[a\-z]>. All are different from C<[a-z]>, which
+specifies a class containing twenty-six characters.)
 
 Characters may be specified using a metacharacter syntax much like that
 used in C: "\n" matches a newline, "\t" a tab, "\r" a carriage return,
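
The behaviour this patch documents can be checked with a short script. What follows is only a sketch, not part of the commit: the sample string and the helper names contains_literal() and in_class() are made up for illustration. It compares the older s/(\W)/\\$1/g idiom with quotemeta() and \Q...\E on an ASCII string, and tests the three spellings of a class containing a literal "-".

```perl
use strict;
use warnings;

# Hypothetical input: a literal string that happens to contain metacharacters.
my $literal = 'a.b*c';

# Older idiom quoted in the patch: backslash every non-word character.
(my $escaped = $literal) =~ s/(\W)/\\$1/g;

# quotemeta() is the builtin the new text points to.
my $quoted = quotemeta($literal);

# Returns 1 if $needle occurs in $text as a literal substring, else 0.
# \Q...\E disables any metacharacters inside the interpolated $needle.
sub contains_literal {
    my ($text, $needle) = @_;
    return $text =~ /\Q$needle\E/ ? 1 : 0;
}

# Returns 1 if the single character $char is a member of the
# bracketed character class written in $class, else 0.
sub in_class {
    my ($class, $char) = @_;
    return $char =~ /\A$class\z/ ? 1 : 0;
}

print "$escaped\n";
print contains_literal('xa.b*cx', $literal), "\n";   # literal text is present
print contains_literal('xaXbbcx', $literal), "\n";   # would match only if unquoted

# "-" at the start, at the end, or backslashed is a literal member.
print in_class('[-az]', '-'), in_class('[az-]', '-'), in_class('[a\-z]', '-'), "\n";

# "m" is inside the range a-z but not in the three-character class.
print in_class('[a-z]', 'm'), in_class('[-az]', 'm'), "\n";
```

For this ASCII input the older idiom and quotemeta() yield the same string. The second contains_literal() call is the case \Q guards against: without quoting, the pattern a.b*c would match "aXbbc".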