author | Karl Williamson <public@khwilliamson.com> | 2010-10-30 10:13:48 -0600 |
---|---|---|
committer | Father Chrysostomos <sprout@cpan.org> | 2010-10-31 06:11:43 -0700 |
commit | d5944336d74c819152158dabfd806d49ad0ecb21 | |
tree | 78a523b14eed8ebe3f6d2336b31d2d5462d2c13e /pod/perlrecharclass.pod | |
parent | b6dac59a93d03037bfa91e14bd72ebe78feb54ea | |
Add consistent synonyms for \p{PosixFOO}
This patch adds a set of synonyms \p{XPosixFOO} for the full extended
Unicode version of \p{PosixFOO}, so only one rule need be remembered.
Similarly, \p{XPerlSpace} is added to preserve the rule for the one
similar class that doesn't have Posix in its name.
Diffstat (limited to 'pod/perlrecharclass.pod')
-rw-r--r-- | pod/perlrecharclass.pod | 52 |
1 files changed, 30 insertions, 22 deletions
diff --git a/pod/perlrecharclass.pod b/pod/perlrecharclass.pod
index 0b88cc46a5..7f96b4b5ea 100644
--- a/pod/perlrecharclass.pod
+++ b/pod/perlrecharclass.pod
@@ -522,7 +522,8 @@ The other counterpart, in the column labelled "Full-range Unicode", matches any
 appropriate characters in the full Unicode character set. For example,
 C<\p{Alpha}> will match not just the ASCII alphabetic characters, but any
 character in the entire Unicode character set that is considered to be
-alphabetic.
+alphabetic. The backslash sequence column is a (short) synonym for
+the Full-range Unicode form.
 
 (Each of the counterparts has various synonyms as well.
 L<perluniprops/Properties accessible through \p{} and \P{}> lists all the
@@ -548,25 +549,25 @@ EBCDIC code page is present, they will behave in accordance with those; if
 absent, the classes will match only their ASCII-range counterparts. If you
 disagree with this proposal, send email to C<perl5-porters@perl.org>.
 
-    [[:...:]]  ASCII-range          Full-range      backslash  Note
-               Unicode              Unicode         sequence
+    [[:...:]]  ASCII-range          Full-range        backslash  Note
+               Unicode              Unicode           sequence
     -----------------------------------------------------
-    alpha      \p{PosixAlpha}       \p{Alpha}
-    alnum      \p{PosixAlnum}       \p{Alnum}
+    alpha      \p{PosixAlpha}       \p{XPosixAlpha}
+    alnum      \p{PosixAlnum}       \p{XPosixAlnum}
     ascii      \p{ASCII}
-    blank      \p{PosixBlank}       \p{Blank} =                [1]
-                                    \p{HorizSpace}  \h         [1]
-    cntrl      \p{PosixCntrl}       \p{Cntrl}                  [2]
-    digit      \p{PosixDigit}       \p{Digit}       \d
-    graph      \p{PosixGraph}       \p{Graph}                  [3]
-    lower      \p{PosixLower}       \p{Lower}
-    print      \p{PosixPrint}       \p{Print}                  [4]
-    punct      \p{PosixPunct}       \p{Punct}                  [5]
-               \p{PerlSpace}        \p{SpacePerl}   \s         [6]
-    space      \p{PosixSpace}       \p{Space}                  [6]
-    upper      \p{PosixUpper}       \p{Upper}
-    word       \p{PerlWord}         \p{Word}        \w
-    xdigit     \p{ASCII_Hex_Digit}  \p{XDigit}
+    blank      \p{PosixBlank}       \p{XPosixBlank}   \h         [1]
+                                 or \p{HorizSpace}               [1]
+    cntrl      \p{PosixCntrl}       \p{XPosixCntrl}              [2]
+    digit      \p{PosixDigit}       \p{XPosixDigit}   \d
+    graph      \p{PosixGraph}       \p{XPosixGraph}              [3]
+    lower      \p{PosixLower}       \p{XPosixLower}
+    print      \p{PosixPrint}       \p{XPosixPrint}              [4]
+    punct      \p{PosixPunct}       \p{XPosixPunct}              [5]
+               \p{PerlSpace}        \p{XPerlSpace}    \s         [6]
+    space      \p{PosixSpace}       \p{XPosixSpace}              [6]
+    upper      \p{PosixUpper}       \p{XPosixUpper}
+    word       \p{PosixWord}        \p{XPosixWord}    \w
+    xdigit     \p{ASCII_Hex_Digit}  \p{XPosixXDigit}
 
 =over 4
 
@@ -621,6 +622,11 @@ matches the vertical tab, C<\cK>. Same for the two ASCII-only range forms.
 
 =back
 
+There are various other synonyms that can be used for these besides
+C<\p{HorizSpace}> and \C<\p{XPosixBlank}>. For example
+C<\p{PosixAlpha}> can be written as C<\p{Alpha}>. All are listed
+in L<perluniprops/Properties accessible through \p{} and \P{}>.
+
 =head4 Negation
 X<character class, negation>
 
@@ -631,10 +637,12 @@ Some examples:
     POSIX         ASCII-range      Full-range        backslash
                   Unicode          Unicode           sequence
     -----------------------------------------------------
-    [[:^digit:]]  \P{PosixDigit}   \P{Digit}         \D
-    [[:^space:]]  \P{PosixSpace}   \P{Space}
-                  \P{PerlSpace}    \P{SpacePerl}     \S
-    [[:^word:]]   \P{PerlWord}     \P{Word}          \W
+    [[:^digit:]]  \P{PosixDigit}   \P{XPosixDigit}   \D
+    [[:^space:]]  \P{PosixSpace}   \P{XPosixSpace}
+                  \P{PerlSpace}    \P{XPerlSpace}    \S
+    [[:^word:]]   \P{PerlWord}     \P{XPosixWord}    \W
+
+Again, the backslash sequence means Full-range Unicode.
 
 =head4 [= =] and [. .]
 