Move the full \p\P lists to perlunicode.

p4raw-id: //depot/perl@10520
author: Jarkko Hietaniemi <jhi@iki.fi> 2001-06-11 17:55:47 +0000
committer: Jarkko Hietaniemi <jhi@iki.fi> 2001-06-11 17:55:47 +0000
commit: 3229381570cd559d546c04f62dc3e86718ceccd8 (patch)
tree: de91f8808ec4f2646f2b485c97fb4e4d90fd94af /pod/perlunicode.pod
parent: fd71b04b8dfd287735727332d78eb1f4dd10bbbf (diff)
download: perl-3229381570cd559d546c04f62dc3e86718ceccd8.tar.gz
1 files changed, 168 insertions, 1 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 12bee5c7a3..d629cabe9f 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -165,6 +165,173 @@ names of the C<In> classes are the official Unicode block names but
 with all non-alphanumeric characters removed, for example the block
 name C<"Latin-1 Supplement"> becomes C<\p{InLatin1Supplement}>.
 
+Here is the list as of Unicode 3.1.0 (the two-letter classes) and
+Perl 5.8.0 (the one-letter classes):
+
+   L  Letter
+   Lu Letter, Uppercase
+   Ll Letter, Lowercase
+   Lt Letter, Titlecase
+   Lm Letter, Modifier
+   Lo Letter, Other
+   M  Mark
+   Mn Mark, Non-Spacing
+   Mc Mark, Spacing Combining
+   Me Mark, Enclosing
+   N  Number
+   Nd Number, Decimal Digit
+   Nl Number, Letter
+   No Number, Other
+   P  Punctuation
+   Pc Punctuation, Connector
+   Pd Punctuation, Dash
+   Ps Punctuation, Open
+   Pe Punctuation, Close
+   Pi Punctuation, Initial quote
+       (may behave like Ps or Pe depending on usage)
+   Pf Punctuation, Final quote
+       (may behave like Ps or Pe depending on usage)
+   Po Punctuation, Other
+   S  Symbol
+   Sm Symbol, Math
+   Sc Symbol, Currency
+   Sk Symbol, Modifier
+   So Symbol, Other
+   Z  Separator
+   Zs Separator, Space
+   Zl Separator, Line
+   Zp Separator, Paragraph
+   C  Other
+   Cc Other, Control
+   Cf Other, Format
+   Cs Other, Surrogate
+   Co Other, Private Use
+   Cn Other, Not Assigned (Unicode defines no Cn characters)
+
+Additionally, because scripts differ in their directionality
+(for example Hebrew is written right to left), all characters
+have their directionality defined:
+
+   BidiL   Left-to-Right
+   BidiLRE Left-to-Right Embedding
+   BidiLRO Left-to-Right Override
+   BidiR   Right-to-Left
+   BidiAL  Right-to-Left Arabic
+   BidiRLE Right-to-Left Embedding
+   BidiRLO Right-to-Left Override
+   BidiPDF Pop Directional Format
+   BidiEN  European Number
+   BidiES  European Number Separator
+   BidiET  European Number Terminator
+   BidiAN  Arabic Number
+   BidiCS  Common Number Separator
+   BidiNSM Non-Spacing Mark
+   BidiBN  Boundary Neutral
+   BidiB   Paragraph Separator
+   BidiS   Segment Separator
+   BidiWS  Whitespace
+   BidiON  Other Neutrals
+
+The blocks available for C<\p{InBlock}> and C<\P{InBlock}>, for
+example C<\p{InCyrillic}>, are as follows:
+
+    BasicLatin
+    Latin1Supplement
+    LatinExtendedA
+    LatinExtendedB
+    IPAExtensions
+    SpacingModifierLetters
+    CombiningDiacriticalMarks
+    Greek
+    Cyrillic
+    Armenian
+    Hebrew
+    Arabic
+    Syriac
+    Thaana
+    Devanagari
+    Bengali
+    Gurmukhi
+    Gujarati
+    Oriya
+    Tamil
+    Telugu
+    Kannada
+    Malayalam
+    Sinhala
+    Thai
+    Lao
+    Tibetan
+    Myanmar
+    Georgian
+    HangulJamo
+    Ethiopic
+    Cherokee
+    UnifiedCanadianAboriginalSyllabics
+    Ogham
+    Runic
+    Khmer
+    Mongolian
+    LatinExtendedAdditional
+    GreekExtended
+    GeneralPunctuation
+    SuperscriptsandSubscripts
+    CurrencySymbols
+    CombiningMarksforSymbols
+    LetterlikeSymbols
+    NumberForms
+    Arrows
+    MathematicalOperators
+    MiscellaneousTechnical
+    ControlPictures
+    OpticalCharacterRecognition
+    EnclosedAlphanumerics
+    BoxDrawing
+    BlockElements
+    GeometricShapes
+    MiscellaneousSymbols
+    Dingbats
+    BraillePatterns
+    CJKRadicalsSupplement
+    KangxiRadicals
+    IdeographicDescriptionCharacters
+    CJKSymbolsandPunctuation
+    Hiragana
+    Katakana
+    Bopomofo
+    HangulCompatibilityJamo
+    Kanbun
+    BopomofoExtended
+    EnclosedCJKLettersandMonths
+    CJKCompatibility
+    CJKUnifiedIdeographsExtensionA
+    CJKUnifiedIdeographs
+    YiSyllables
+    YiRadicals
+    HangulSyllables
+    HighSurrogates
+    HighPrivateUseSurrogates
+    LowSurrogates
+    PrivateUse
+    CJKCompatibilityIdeographs
+    AlphabeticPresentationForms
+    ArabicPresentationFormsA
+    CombiningHalfMarks
+    CJKCompatibilityForms
+    SmallFormVariants
+    ArabicPresentationFormsB
+    Specials
+    HalfwidthandFullwidthForms
+    OldItalic
+    Gothic
+    Deseret
+    ByzantineMusicalSymbols
+    MusicalSymbols
+    MathematicalAlphanumericSymbols
+    CJKUnifiedIdeographsExtensionB
+    CJKCompatibilityIdeographsSupplement
+    Tags
+
 =item *
 
 The special pattern C<\X> matches any extended Unicode sequence
@@ -253,6 +420,6 @@ tend to run slower.  Avoidance of locales is strongly encouraged.
 
 =head1 SEE ALSO
 
-L<bytes>, L<utf8>, L<perlvar/"${^WIDE_SYSTEM_CALLS}">
+L<bytes>, L<utf8>, L<perlretut>, L<perlvar/"${^WIDE_SYSTEM_CALLS}">
 
 =cut
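
To illustrate how the property classes added by this patch are used in a pattern, here is a minimal sketch. The helpers C<classes_of> and C<count_non_letters> are hypothetical names invented for this example and are not part of perlunicode. The sketch assumes the running perl accepts the one- and two-letter general category names and the C<InCyrillic> block name exactly as spelled in the lists above. It deliberately leaves out the C<Bidi> names, since their spelling may differ between Perl releases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perlunicode): report which of a few
# property classes from the lists in the patch match a single character.
sub classes_of {
    my ($char) = @_;
    my @hits;
    push @hits, 'L'          if $char =~ /\p{L}/;           # any letter
    push @hits, 'Lu'         if $char =~ /\p{Lu}/;          # uppercase letter
    push @hits, 'Ll'         if $char =~ /\p{Ll}/;          # lowercase letter
    push @hits, 'Nd'         if $char =~ /\p{Nd}/;          # decimal digit
    push @hits, 'InCyrillic' if $char =~ /\p{InCyrillic}/;  # Cyrillic block
    return @hits;
}

# Hypothetical helper: \P{...} is the negation of \p{...}, so this counts
# the characters in a string that are not letters.
sub count_non_letters {
    my ($string) = @_;
    my $count = () = $string =~ /\P{L}/g;
    return $count;
}

# Code points are written in hex so the script itself stays plain ASCII.
for my $cp (0x41, 0x61, 0x37, 0x416) {
    printf "U+%04X: %s\n", $cp, join(' ', classes_of(chr $cp));
}
printf "non-letters in 'a1b2': %d\n", count_non_letters('a1b2');
```

With these assumptions, U+0416 (CYRILLIC CAPITAL LETTER ZHE) reports C<L Lu InCyrillic>, and the string C<a1b2> contains two non-letters.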