author Karl Williamson <public@khwilliamson.com> 2010-10-30 10:13:48 -0600
committer Father Chrysostomos <sprout@cpan.org> 2010-10-31 06:11:43 -0700
commit d5944336d74c819152158dabfd806d49ad0ecb21 (patch)
tree 78a523b14eed8ebe3f6d2336b31d2d5462d2c13e /pod/perlrecharclass.pod
parent b6dac59a93d03037bfa91e14bd72ebe78feb54ea (diff)
download perl-d5944336d74c819152158dabfd806d49ad0ecb21.tar.gz
Add consistent synonyms for \p{PosxFOO}
This patch adds a set of synonyms \p{XPosixFOO} for the full extended Unicode version of \p{PosixFOO}, so only one rule need be remembered. Similarly, \p{XPerlSpace} is added to preserve the rule for the one similar class that doesn't have Posix in its name.
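As a rough illustration of the naming rule (not part of the patch itself), the sketch below compares the ASCII-range form with its new full-range synonym. It assumes a perl that already includes this change; the helper name classify_alpha is hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the patch): report whether a single
# character matches the ASCII-range and the full-range alpha classes.
sub classify_alpha {
    my ($char) = @_;
    return {
        posix  => ($char =~ /\A\p{PosixAlpha}\z/  ? 1 : 0),
        xposix => ($char =~ /\A\p{XPosixAlpha}\z/ ? 1 : 0),
    };
}

my $ascii = classify_alpha("a");
my $greek = classify_alpha("\x{3B1}");    # GREEK SMALL LETTER ALPHA

# ASCII letters match both forms; non-ASCII letters match only XPosix.
printf "a:     posix=%d xposix=%d\n", $ascii->{posix}, $ascii->{xposix};
printf "alpha: posix=%d xposix=%d\n", $greek->{posix}, $greek->{xposix};
```

The same comparison should carry over to the other rows of the table by swapping the class name, for example PosixDigit against XPosixDigit.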
Diffstat (limited to 'pod/perlrecharclass.pod')
-rw-r--r-- pod/perlrecharclass.pod 52
1 files changed, 30 insertions, 22 deletions
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index 0b88cc46a5..7f96b4b5ea 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -522,7 +522,8 @@ The other counterpart, in the column labelled "Full-range Unicode", matches any
appropriate characters in the full Unicode character set. For example,
C<\p{Alpha}> will match not just the ASCII alphabetic characters, but any
character in the entire Unicode character set that is considered to be
-alphabetic.
+alphabetic. The backslash sequence column is a (short) synonym for
+the Full-range Unicode form.
(Each of the counterparts has various synonyms as well.
L<perluniprops/Properties accessible through \p{} and \P{}> lists all the
@@ -548,25 +549,25 @@ EBCDIC code page is present, they will behave in accordance with those; if
absent, the classes will match only their ASCII-range counterparts. If you
disagree with this proposal, send email to C<perl5-porters@perl.org>.
- [[:...:]] ASCII-range          Full-range      backslash  Note
-            Unicode              Unicode        sequence
+ [[:...:]] ASCII-range          Full-range         backslash  Note
+            Unicode              Unicode           sequence
  -----------------------------------------------------
- alpha     \p{PosixAlpha}       \p{Alpha}
- alnum     \p{PosixAlnum}       \p{Alnum}
+ alpha     \p{PosixAlpha}       \p{XPosixAlpha}
+ alnum     \p{PosixAlnum}       \p{XPosixAlnum}
  ascii     \p{ASCII}
- blank     \p{PosixBlank}       \p{Blank} =                [1]
-                                \p{HorizSpace}  \h         [1]
- cntrl     \p{PosixCntrl}       \p{Cntrl}                  [2]
- digit     \p{PosixDigit}       \p{Digit}       \d
- graph     \p{PosixGraph}       \p{Graph}                  [3]
- lower     \p{PosixLower}       \p{Lower}
- print     \p{PosixPrint}       \p{Print}                  [4]
- punct     \p{PosixPunct}       \p{Punct}                  [5]
-           \p{PerlSpace}        \p{SpacePerl}   \s         [6]
- space     \p{PosixSpace}       \p{Space}                  [6]
- upper     \p{PosixUpper}       \p{Upper}
- word      \p{PerlWord}         \p{Word}        \w
- xdigit    \p{ASCII_Hex_Digit}  \p{XDigit}
+ blank     \p{PosixBlank}       \p{XPosixBlank}    \h         [1]
+                                or \p{HorizSpace}             [1]
+ cntrl     \p{PosixCntrl}       \p{XPosixCntrl}               [2]
+ digit     \p{PosixDigit}       \p{XPosixDigit}    \d
+ graph     \p{PosixGraph}       \p{XPosixGraph}               [3]
+ lower     \p{PosixLower}       \p{XPosixLower}
+ print     \p{PosixPrint}       \p{XPosixPrint}               [4]
+ punct     \p{PosixPunct}       \p{XPosixPunct}               [5]
+           \p{PerlSpace}        \p{XPerlSpace}     \s         [6]
+ space     \p{PosixSpace}       \p{XPosixSpace}               [6]
+ upper     \p{PosixUpper}       \p{XPosixUpper}
+ word      \p{PosixWord}        \p{XPosixWord}     \w
+ xdigit    \p{ASCII_Hex_Digit}  \p{XPosixXDigit}
=over 4
@@ -621,6 +622,11 @@ matches the vertical tab, C<\cK>. Same for the two ASCII-only range forms.
=back
+There are various other synonyms that can be used for these besides
+C<\p{HorizSpace}> and \C<\p{XPosixBlank}>. For example
+C<\p{PosixAlpha}> can be written as C<\p{Alpha}>. All are listed
+in L<perluniprops/Properties accessible through \p{} and \P{}>.
+
=head4 Negation
X<character class, negation>
@@ -631,10 +637,12 @@ Some examples:
  POSIX         ASCII-range      Full-range        backslash
                 Unicode          Unicode          sequence
  -----------------------------------------------------
- [[:^digit:]]  \P{PosixDigit}   \P{Digit}         \D
- [[:^space:]]  \P{PosixSpace}   \P{Space}
-               \P{PerlSpace}    \P{SpacePerl}     \S
- [[:^word:]]   \P{PerlWord}     \P{Word}          \W
+ [[:^digit:]]  \P{PosixDigit}   \P{XPosixDigit}   \D
+ [[:^space:]]  \P{PosixSpace}   \P{XPosixSpace}
+               \P{PerlSpace}    \P{XPerlSpace}    \S
+ [[:^word:]]   \P{PerlWord}     \P{XPosixWord}    \W
+
+Again, the backslash sequence means Full-range Unicode.
=head4 [= =] and [. .]