author | Jeff Pinyan <japhy@pobox.com> | 2004-04-22 10:31:30 -0400 |
---|---|---|
committer | Rafael Garcia-Suarez <rgarciasuarez@gmail.com> | 2004-04-27 08:43:38 +0000 |
commit | 12ac2576dfc10fd43d91903e7602870c10b4f00f (patch) | |
tree | 892346cd2c9b2ff37d1269e1845007129a60e263 /pod | |
parent | 88567e60ed3ba016aaedace3242715b8ab2023f7 (diff) | |
candidate for TR18 compliance
Date: Thu, 22 Apr 2004 14:31:30 -0400 (EDT)
Message-ID: <Pine.LNX.4.44.0404221429040.10466-101000@perlmonk.org>
Date: Mon, 26 Apr 2004 12:37:21 -0400 (EDT)
Message-ID: <Pine.LNX.4.44.0404261222320.7154-400000@perlmonk.org>
p4raw-id: //depot/perl@22744
Diffstat (limited to 'pod')
-rw-r--r-- | pod/perlunicode.pod | 53 |
1 files changed, 29 insertions, 24 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 0817bb36e9..46ea68216c 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -207,6 +207,7 @@ for instance, are identical.
     Short       Long
 
     L           Letter
+    LC          CasedLetter
     Lu          UppercaseLetter
     Ll          LowercaseLetter
     Lt          TitlecaseLetter
@@ -254,7 +255,8 @@ for instance, are identical.
 
 Single-letter properties match all characters in any of the
 two-letter sub-properties starting with the same letter.
-C<L&> is a special case, which is an alias for C<Ll>, C<Lu>, and C<Lt>.
+C<LC> and C<L&> are special cases, which are aliases for the set of
+C<Ll>, C<Lu>, and C<Lt>.
 
 Because Perl hides the need for the user to understand the internal
 representation of Unicode characters, there is no need to implement
@@ -262,31 +264,32 @@ the somewhat messy concept of surrogates. C<Cs> is therefore not
 supported.
 
 Because scripts differ in their directionality--Hebrew is
-written right to left, for example--Unicode supplies these properties:
+written right to left, for example--Unicode supplies these properties in
+the BidiClass class:
 
     Property    Meaning
 
-    BidiL       Left-to-Right
-    BidiLRE     Left-to-Right Embedding
-    BidiLRO     Left-to-Right Override
-    BidiR       Right-to-Left
-    BidiAL      Right-to-Left Arabic
-    BidiRLE     Right-to-Left Embedding
-    BidiRLO     Right-to-Left Override
-    BidiPDF     Pop Directional Format
-    BidiEN      European Number
-    BidiES      European Number Separator
-    BidiET      European Number Terminator
-    BidiAN      Arabic Number
-    BidiCS      Common Number Separator
-    BidiNSM     Non-Spacing Mark
-    BidiBN      Boundary Neutral
-    BidiB       Paragraph Separator
-    BidiS       Segment Separator
-    BidiWS      Whitespace
-    BidiON      Other Neutrals
-
-For example, C<\p{BidiR}> matches characters that are normally
+    L           Left-to-Right
+    LRE         Left-to-Right Embedding
+    LRO         Left-to-Right Override
+    R           Right-to-Left
+    AL          Right-to-Left Arabic
+    RLE         Right-to-Left Embedding
+    RLO         Right-to-Left Override
+    PDF         Pop Directional Format
+    EN          European Number
+    ES          European Number Separator
+    ET          European Number Terminator
+    AN          Arabic Number
+    CS          Common Number Separator
+    NSM         Non-Spacing Mark
+    BN          Boundary Neutral
+    B           Paragraph Separator
+    S           Segment Separator
+    WS          Whitespace
+    ON          Other Neutrals
+
+For example, C<\p{BidiClass:R}> matches characters that are normally
 written right to left.
 
 =back
@@ -824,7 +827,9 @@ Level 1 - Basic Unicode Support
         [ 1] \x{...}
         [ 2] \N{...}
         [ 3] . \p{...} \P{...}
-        [ 4] now scripts (see UTR#24 Script Names) in addition to blocks
+        [ 4] support for scripts (see UTR#24 Script Names), blocks,
+             binary properties, enumerated non-binary properties, and
+             numeric properties (as listed in UTR#18 Other Properties)
         [ 5] have negation
         [ 6] can use regular expression look-ahead [a]
              or user-defined character properties [b] to emulate subtraction
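
The two documented changes in this patch, the C<LC> alias and the C<BidiClass:...> spelling, can be checked with a short script. This is only a sketch, not part of the commit: the helper C<count_matches> and the sample strings are hypothetical, and it assumes a perl that already accepts C<\p{LC}> and C<\p{BidiClass:R}> (older releases may only know the C<BidiR> form removed above).

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters in $string matched by $regex.
sub count_matches {
    my ($string, $regex) = @_;
    my $count = () = $string =~ /$regex/g;
    return $count;
}

# Sample data (an assumption for illustration): one Ll, one Lu,
# one Lt (U+01C5), then a digit and an underscore.
my $mixed = "aB\x{01C5}1_";

# LC and L& should both behave like the union of Ll, Lu and Lt.
my $lc_count    = count_matches($mixed, qr/\p{LC}/);
my $lamp_count  = count_matches($mixed, qr/\p{L&}/);
my $union_count = count_matches($mixed, qr/[\p{Ll}\p{Lu}\p{Lt}]/);

# HEBREW LETTER ALEF (U+05D0) is written right to left; "A" is not.
my $alef_is_rtl  = "\x{05D0}" =~ /\p{BidiClass:R}/ ? 1 : 0;
my $latin_is_rtl = "A"        =~ /\p{BidiClass:R}/ ? 1 : 0;

print "LC=$lc_count L&=$lamp_count union=$union_count\n";
print "alef RTL=$alef_is_rtl, A RTL=$latin_is_rtl\n";
```

Comparing each alias against the explicit C<[\p{Ll}\p{Lu}\p{Lt}]> class, rather than against a hard-coded count, keeps the check tied to the definition given in the POD text.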