author | Karl Williamson <khw@khw-desktop.(none)> | 2009-12-02 21:36:17 -0700 |
---|---|---|
committer | Rafael Garcia-Suarez <rgs@consttype.org> | 2009-12-03 11:08:42 +0100 |
commit | 283b82dc9f704fb99591ed28497a6a263e6ef519 (patch) | |
tree | 163772fbf089f2287a3eab20c9805c636d6d4268 /lib/unicore/auxiliary | |
parent | c9930541bfa04399c3b648e83c9b750cee1154fb (diff) | |
Unicode 5.2
Diffstat (limited to 'lib/unicore/auxiliary')
-rw-r--r-- | lib/unicore/auxiliary/GraphemeBreakProperty.txt | 97 |
-rw-r--r-- | lib/unicore/auxiliary/SentenceBreakProperty.txt | 168 |
-rw-r--r-- | lib/unicore/auxiliary/WordBreakProperty.txt | 132 |
3 files changed, 325 insertions, 72 deletions
diff --git a/lib/unicore/auxiliary/GraphemeBreakProperty.txt b/lib/unicore/auxiliary/GraphemeBreakProperty.txt
index 50477a15ea..57da65828d 100644
--- a/lib/unicore/auxiliary/GraphemeBreakProperty.txt
+++ b/lib/unicore/auxiliary/GraphemeBreakProperty.txt
@@ -1,10 +1,10 @@
-# GraphemeBreakProperty-5.1.0.txt
-# Date: 2008-03-03, 21:57:47 GMT [MD]
+# GraphemeBreakProperty-5.2.0.txt
+# Date: 2009-06-09, 21:40:09 GMT [MD]
 #
 # Unicode Character Database
-# Copyright (c) 1991-2008 Unicode, Inc.
+# Copyright (c) 1991-2009 Unicode, Inc.
 # For terms of use, see http://www.unicode.org/terms_of_use.html
-# For documentation, see UCD.html
+# For documentation, see http://www.unicode.org/reports/tr44/
 
 # ================================================
 
@@ -47,11 +47,12 @@
 206A..206F ; Control # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
 FEFF ; Control # Cf ZERO WIDTH NO-BREAK SPACE
 FFF9..FFFB ; Control # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
+110BD ; Control # Cf KAITHI NUMBER SIGN
 1D173..1D17A ; Control # Cf [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
 E0001 ; Control # Cf LANGUAGE TAG
 E0020..E007F ; Control # Cf [96] TAG SPACE..CANCEL TAG
 
-# Total code points: 202
+# Total code points: 203
 
 # ================================================
 
@@ -75,11 +76,15 @@ E0020..E007F ; Control # Cf [96] TAG SPACE..CANCEL TAG
 0730..074A ; Extend # Mn [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
 07A6..07B0 ; Extend # Mn [11] THAANA ABAFILI..THAANA SUKUN
 07EB..07F3 ; Extend # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
-0901..0902 ; Extend # Mn [2] DEVANAGARI SIGN CANDRABINDU..DEVANAGARI SIGN ANUSVARA
+0816..0819 ; Extend # Mn [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
+081B..0823 ; Extend # Mn [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
+0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
+0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
+0900..0902 ; Extend # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA
 093C ; Extend # Mn DEVANAGARI SIGN NUKTA
 0941..0948 ; Extend # Mn [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
 094D ; Extend # Mn DEVANAGARI SIGN VIRAMA
-0951..0954 ; Extend # Mn [4] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI ACUTE ACCENT
+0951..0955 ; Extend # Mn [5] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN CANDRA LONG E
 0962..0963 ; Extend # Mn [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL
 0981 ; Extend # Mn BENGALI SIGN CANDRABINDU
 09BC ; Extend # Mn BENGALI SIGN NUKTA
@@ -170,6 +175,7 @@ E0020..E007F ; Control # Cf [96] TAG SPACE..CANCEL TAG
 1082 ; Extend # Mn MYANMAR CONSONANT SIGN SHAN MEDIAL WA
 1085..1086 ; Extend # Mn [2] MYANMAR VOWEL SIGN SHAN E ABOVE..MYANMAR VOWEL SIGN SHAN FINAL Y
 108D ; Extend # Mn MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE
+109D ; Extend # Mn MYANMAR VOWEL SIGN AITON AI
 135F ; Extend # Mn ETHIOPIC COMBINING GEMINATION MARK
 1712..1714 ; Extend # Mn [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA
 1732..1734 ; Extend # Mn [3] HANUNOO VOWEL SIGN I..HANUNOO SIGN PAMUDPOD
@@ -186,6 +192,13 @@ E0020..E007F ; Control # Cf [96] TAG SPACE..CANCEL TAG
 1932 ; Extend # Mn LIMBU SMALL LETTER ANUSVARA
 1939..193B ; Extend # Mn [3] LIMBU SIGN MUKPHRENG..LIMBU SIGN SA-I
 1A17..1A18 ; Extend # Mn [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U
+1A56 ; Extend # Mn TAI THAM CONSONANT SIGN MEDIAL LA
+1A58..1A5E ; Extend # Mn [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA
+1A60 ; Extend # Mn TAI THAM SIGN SAKOT
+1A62 ; Extend # Mn TAI THAM VOWEL SIGN MAI SAT
+1A65..1A6C ; Extend # Mn [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW
+1A73..1A7C ; Extend # Mn [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN
+1A7F ; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
 1B00..1B03 ; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
 1B34 ; Extend # Mn BALINESE SIGN REREKAN
 1B36..1B3A ; Extend # Mn [5] BALINESE VOWEL SIGN ULU..BALINESE VOWEL SIGN RA REPA
@@ -197,32 +210,51 @@ E0020..E007F ; Control # Cf [96] TAG SPACE..CANCEL TAG
 1BA8..1BA9 ; Extend # Mn [2] SUNDANESE VOWEL SIGN PAMEPET..SUNDANESE VOWEL SIGN PANEULEUNG
 1C2C..1C33 ; Extend # Mn [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T
 1C36..1C37 ; Extend # Mn [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA
+1CD0..1CD2 ; Extend # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA
+1CD4..1CE0 ; Extend # Mn [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA
+1CE2..1CE8 ; Extend # Mn [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL
+1CED ; Extend # Mn VEDIC SIGN TIRYAK
 1DC0..1DE6 ; Extend # Mn [39] COMBINING DOTTED GRAVE ACCENT..COMBINING LATIN SMALL LETTER Z
-1DFE..1DFF ; Extend # Mn [2] COMBINING LEFT ARROWHEAD ABOVE..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
+1DFD..1DFF ; Extend # Mn [3] COMBINING ALMOST EQUAL TO BELOW..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
 200C..200D ; Extend # Cf [2] ZERO WIDTH NON-JOINER..ZERO WIDTH JOINER
 20D0..20DC ; Extend # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20DD..20E0 ; Extend # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
 20E1 ; Extend # Mn COMBINING LEFT RIGHT ARROW ABOVE
 20E2..20E4 ; Extend # Me [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE
 20E5..20F0 ; Extend # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
+2CEF..2CF1 ; Extend # Mn [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS
 2DE0..2DFF ; Extend # Mn [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS
 302A..302F ; Extend # Mn [6] IDEOGRAPHIC LEVEL TONE MARK..HANGUL DOUBLE DOT TONE MARK
 3099..309A ; Extend # Mn [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
 A66F ; Extend # Mn COMBINING CYRILLIC VZMET
 A670..A672 ; Extend # Me [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN
 A67C..A67D ; Extend # Mn [2] COMBINING CYRILLIC KAVYKA..COMBINING CYRILLIC PAYEROK
+A6F0..A6F1 ; Extend # Mn [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS
 A802 ; Extend # Mn SYLOTI NAGRI SIGN DVISVARA
 A806 ; Extend # Mn SYLOTI NAGRI SIGN HASANTA
 A80B ; Extend # Mn SYLOTI NAGRI SIGN ANUSVARA
 A825..A826 ; Extend # Mn [2] SYLOTI NAGRI VOWEL SIGN U..SYLOTI NAGRI VOWEL SIGN E
 A8C4 ; Extend # Mn SAURASHTRA SIGN VIRAMA
+A8E0..A8F1 ; Extend # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
 A926..A92D ; Extend # Mn [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
 A947..A951 ; Extend # Mn [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
+A980..A982 ; Extend # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
+A9B3 ; Extend # Mn JAVANESE SIGN CECAK TELU
+A9B6..A9B9 ; Extend # Mn [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT
+A9BC ; Extend # Mn JAVANESE VOWEL SIGN PEPET
 AA29..AA2E ; Extend # Mn [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE
 AA31..AA32 ; Extend # Mn [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE
 AA35..AA36 ; Extend # Mn [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA
 AA43 ; Extend # Mn CHAM CONSONANT SIGN FINAL NG
 AA4C ; Extend # Mn CHAM CONSONANT SIGN FINAL M
+AAB0 ; Extend # Mn TAI VIET MAI KANG
+AAB2..AAB4 ; Extend # Mn [3] TAI VIET VOWEL I..TAI VIET VOWEL U
+AAB7..AAB8 ; Extend # Mn [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA
+AABE..AABF ; Extend # Mn [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK
+AAC1 ; Extend # Mn TAI VIET TONE MAI THO
+ABE5 ; Extend # Mn MEETEI MAYEK VOWEL SIGN ANAP
+ABE8 ; Extend # Mn MEETEI MAYEK VOWEL SIGN UNAP
+ABED ; Extend # Mn MEETEI MAYEK APUN IYEK
 FB1E ; Extend # Mn HEBREW POINT JUDEO-SPANISH VARIKA
 FE00..FE0F ; Extend # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
 FE20..FE26 ; Extend # Mn [7] COMBINING LIGATURE LEFT HALF..COMBINING CONJOINING MACRON
@@ -233,6 +265,9 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 10A0C..10A0F ; Extend # Mn [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA
 10A38..10A3A ; Extend # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
 10A3F ; Extend # Mn KHAROSHTHI VIRAMA
+11080..11081 ; Extend # Mn [2] KAITHI SIGN CANDRABINDU..KAITHI SIGN ANUSVARA
+110B3..110B6 ; Extend # Mn [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI
+110B9..110BA ; Extend # Mn [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA
 1D165 ; Extend # Mc MUSICAL SYMBOL COMBINING STEM
 1D167..1D169 ; Extend # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
 1D16E..1D172 ; Extend # Mc [5] MUSICAL SYMBOL COMBINING FLAG-1..MUSICAL SYMBOL COMBINING FLAG-5
@@ -242,20 +277,24 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 1D242..1D244 ; Extend # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
 E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 
-# Total code points: 1075
+# Total code points: 1205
 
 # ================================================
 
 0E40..0E44 ; Prepend # Lo [5] THAI CHARACTER SARA E..THAI CHARACTER SARA AI MAIMALAI
 0EC0..0EC4 ; Prepend # Lo [5] LAO VOWEL SIGN E..LAO VOWEL SIGN AI
+AAB5..AAB6 ; Prepend # Lo [2] TAI VIET VOWEL E..TAI VIET VOWEL O
+AAB9 ; Prepend # Lo TAI VIET VOWEL UEA
+AABB..AABC ; Prepend # Lo [2] TAI VIET VOWEL AUE..TAI VIET VOWEL AY
 
-# Total code points: 10
+# Total code points: 15
 
 # ================================================
 
 0903 ; SpacingMark # Mc DEVANAGARI SIGN VISARGA
 093E..0940 ; SpacingMark # Mc [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
 0949..094C ; SpacingMark # Mc [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU
+094E ; SpacingMark # Mc DEVANAGARI VOWEL SIGN PRISHTHAMATRA E
 0982..0983 ; SpacingMark # Mc [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA
 09BF..09C0 ; SpacingMark # Mc [2] BENGALI VOWEL SIGN I..BENGALI VOWEL SIGN II
 09C7..09C8 ; SpacingMark # Mc [2] BENGALI VOWEL SIGN E..BENGALI VOWEL SIGN AI
@@ -302,6 +341,7 @@ E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 1083..1084 ; SpacingMark # Mc [2] MYANMAR VOWEL SIGN SHAN AA..MYANMAR VOWEL SIGN SHAN E
 1087..108C ; SpacingMark # Mc [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3
 108F ; SpacingMark # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5
+109A..109C ; SpacingMark # Mc [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A
 17B6 ; SpacingMark # Mc KHMER VOWEL SIGN AA
 17BE..17C5 ; SpacingMark # Mc [8] KHMER VOWEL SIGN OE..KHMER VOWEL SIGN AU
 17C7..17C8 ; SpacingMark # Mc [2] KHMER SIGN REAHMUK..KHMER SIGN YUUKALEAPINTU
@@ -312,6 +352,11 @@ E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 19B0..19C0 ; SpacingMark # Mc [17] NEW TAI LUE VOWEL SIGN VOWEL SHORTENER..NEW TAI LUE VOWEL SIGN IY
 19C8..19C9 ; SpacingMark # Mc [2] NEW TAI LUE TONE MARK-1..NEW TAI LUE TONE MARK-2
 1A19..1A1B ; SpacingMark # Mc [3] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN AE
+1A55 ; SpacingMark # Mc TAI THAM CONSONANT SIGN MEDIAL RA
+1A57 ; SpacingMark # Mc TAI THAM CONSONANT SIGN LA TANG LAI
+1A61 ; SpacingMark # Mc TAI THAM VOWEL SIGN A
+1A63..1A64 ; SpacingMark # Mc [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA
+1A6D..1A72 ; SpacingMark # Mc [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI
 1B04 ; SpacingMark # Mc BALINESE SIGN BISAH
 1B35 ; SpacingMark # Mc BALINESE VOWEL SIGN TEDUNG
 1B3B ; SpacingMark # Mc BALINESE VOWEL SIGN RA REPA TEDUNG
@@ -323,37 +368,53 @@ E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 1BAA ; SpacingMark # Mc SUNDANESE SIGN PAMAAEH
 1C24..1C2B ; SpacingMark # Mc [8] LEPCHA SUBJOINED LETTER YA..LEPCHA VOWEL SIGN UU
 1C34..1C35 ; SpacingMark # Mc [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG
+1CE1 ; SpacingMark # Mc VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA
+1CF2 ; SpacingMark # Mc VEDIC SIGN ARDHAVISARGA
 A823..A824 ; SpacingMark # Mc [2] SYLOTI NAGRI VOWEL SIGN A..SYLOTI NAGRI VOWEL SIGN I
 A827 ; SpacingMark # Mc SYLOTI NAGRI VOWEL SIGN OO
 A880..A881 ; SpacingMark # Mc [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA
 A8B4..A8C3 ; SpacingMark # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU
 A952..A953 ; SpacingMark # Mc [2] REJANG CONSONANT SIGN H..REJANG VIRAMA
+A983 ; SpacingMark # Mc JAVANESE SIGN WIGNYAN
+A9B4..A9B5 ; SpacingMark # Mc [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG
+A9BA..A9BB ; SpacingMark # Mc [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE
+A9BD..A9C0 ; SpacingMark # Mc [4] JAVANESE CONSONANT SIGN KERET..JAVANESE PANGKON
 AA2F..AA30 ; SpacingMark # Mc [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI
 AA33..AA34 ; SpacingMark # Mc [2] CHAM CONSONANT SIGN YA..CHAM CONSONANT SIGN RA
 AA4D ; SpacingMark # Mc CHAM CONSONANT SIGN FINAL H
+AA7B ; SpacingMark # Mc MYANMAR SIGN PAO KAREN TONE
+ABE3..ABE4 ; SpacingMark # Mc [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP
+ABE6..ABE7 ; SpacingMark # Mc [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP
+ABE9..ABEA ; SpacingMark # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG
+ABEC ; SpacingMark # Mc MEETEI MAYEK LUM IYEK
+11082 ; SpacingMark # Mc KAITHI SIGN VISARGA
+110B0..110B2 ; SpacingMark # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
+110B7..110B8 ; SpacingMark # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
 1D166 ; SpacingMark # Mc MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
 1D16D ; SpacingMark # Mc MUSICAL SYMBOL COMBINING AUGMENTATION DOT
 
-# Total code points: 217
+# Total code points: 257
 
 # ================================================
 
-1100..1159 ; L # Lo [90] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINHIEUH
-115F ; L # Lo HANGUL CHOSEONG FILLER
+1100..115F ; L # Lo [96] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG FILLER
+A960..A97C ; L # Lo [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
 
-# Total code points: 91
+# Total code points: 125
 
 # ================================================
 
-1160..11A2 ; V # Lo [67] HANGUL JUNGSEONG FILLER..HANGUL JUNGSEONG SSANGARAEA
+1160..11A7 ; V # Lo [72] HANGUL JUNGSEONG FILLER..HANGUL JUNGSEONG O-YAE
+D7B0..D7C6 ; V # Lo [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E
 
-# Total code points: 67
+# Total code points: 95
 
 # ================================================
 
-11A8..11F9 ; T # Lo [82] HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORINHIEUH
+11A8..11FF ; T # Lo [88] HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG SSANGNIEUN
+D7CB..D7FB ; T # Lo [49] HANGUL JONGSEONG NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH
 
-# Total code points: 82
+# Total code points: 137
 
 # ================================================
 
diff --git a/lib/unicore/auxiliary/SentenceBreakProperty.txt b/lib/unicore/auxiliary/SentenceBreakProperty.txt
index 77c68dbd17..50e830c549 100644
--- a/lib/unicore/auxiliary/SentenceBreakProperty.txt
+++ b/lib/unicore/auxiliary/SentenceBreakProperty.txt
@@ -1,10 +1,10 @@
-# SentenceBreakProperty-5.1.0.txt
-# Date: 2008-03-20, 17:55:34 GMT [MD]
+# SentenceBreakProperty-5.2.0.txt
+# Date: 2009-08-22, 04:58:44 GMT [MD]
 #
 # Unicode Character Database
-# Copyright (c) 1991-2008 Unicode, Inc.
+# Copyright (c) 1991-2009 Unicode, Inc.
 # For terms of use, see http://www.unicode.org/terms_of_use.html
-# For documentation, see UCD.html
+# For documentation, see http://www.unicode.org/reports/tr44/
 
 # ================================================
 
@@ -49,14 +49,19 @@
 0730..074A ; Extend # Mn [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
 07A6..07B0 ; Extend # Mn [11] THAANA ABAFILI..THAANA SUKUN
 07EB..07F3 ; Extend # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
-0901..0902 ; Extend # Mn [2] DEVANAGARI SIGN CANDRABINDU..DEVANAGARI SIGN ANUSVARA
+0816..0819 ; Extend # Mn [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
+081B..0823 ; Extend # Mn [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
+0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
+0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
+0900..0902 ; Extend # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA
 0903 ; Extend # Mc DEVANAGARI SIGN VISARGA
 093C ; Extend # Mn DEVANAGARI SIGN NUKTA
 093E..0940 ; Extend # Mc [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
 0941..0948 ; Extend # Mn [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
 0949..094C ; Extend # Mc [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU
 094D ; Extend # Mn DEVANAGARI SIGN VIRAMA
-0951..0954 ; Extend # Mn [4] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI ACUTE ACCENT
+094E ; Extend # Mc DEVANAGARI VOWEL SIGN PRISHTHAMATRA E
+0951..0955 ; Extend # Mn [5] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN CANDRA LONG E
 0962..0963 ; Extend # Mn [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL
 0981 ; Extend # Mn BENGALI SIGN CANDRABINDU
 0982..0983 ; Extend # Mc [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA
@@ -181,6 +186,8 @@
 1087..108C ; Extend # Mc [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3
 108D ; Extend # Mn MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE
 108F ; Extend # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5
+109A..109C ; Extend # Mc [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A
+109D ; Extend # Mn MYANMAR VOWEL SIGN AITON AI
 135F ; Extend # Mn ETHIOPIC COMBINING GEMINATION MARK
 1712..1714 ; Extend # Mn [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA
 1732..1734 ; Extend # Mn [3] HANUNOO VOWEL SIGN I..HANUNOO SIGN PAMUDPOD
@@ -207,6 +214,18 @@
 19C8..19C9 ; Extend # Mc [2] NEW TAI LUE TONE MARK-1..NEW TAI LUE TONE MARK-2
 1A17..1A18 ; Extend # Mn [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U
 1A19..1A1B ; Extend # Mc [3] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN AE
+1A55 ; Extend # Mc TAI THAM CONSONANT SIGN MEDIAL RA
+1A56 ; Extend # Mn TAI THAM CONSONANT SIGN MEDIAL LA
+1A57 ; Extend # Mc TAI THAM CONSONANT SIGN LA TANG LAI
+1A58..1A5E ; Extend # Mn [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA
+1A60 ; Extend # Mn TAI THAM SIGN SAKOT
+1A61 ; Extend # Mc TAI THAM VOWEL SIGN A
+1A62 ; Extend # Mn TAI THAM VOWEL SIGN MAI SAT
+1A63..1A64 ; Extend # Mc [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA
+1A65..1A6C ; Extend # Mn [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW
+1A6D..1A72 ; Extend # Mc [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI
+1A73..1A7C ; Extend # Mn [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN
+1A7F ; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT
 1B00..1B03 ; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG
 1B04 ; Extend # Mc BALINESE SIGN BISAH
 1B34 ; Extend # Mn BALINESE SIGN REREKAN
@@ -229,20 +248,28 @@
 1C2C..1C33 ; Extend # Mn [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T
 1C34..1C35 ; Extend # Mc [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG
 1C36..1C37 ; Extend # Mn [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA
+1CD0..1CD2 ; Extend # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA
+1CD4..1CE0 ; Extend # Mn [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA
+1CE1 ; Extend # Mc VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA
+1CE2..1CE8 ; Extend # Mn [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL
+1CED ; Extend # Mn VEDIC SIGN TIRYAK
+1CF2 ; Extend # Mc VEDIC SIGN ARDHAVISARGA
 1DC0..1DE6 ; Extend # Mn [39] COMBINING DOTTED GRAVE ACCENT..COMBINING LATIN SMALL LETTER Z
-1DFE..1DFF ; Extend # Mn [2] COMBINING LEFT ARROWHEAD ABOVE..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
+1DFD..1DFF ; Extend # Mn [3] COMBINING ALMOST EQUAL TO BELOW..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW
 200C..200D ; Extend # Cf [2] ZERO WIDTH NON-JOINER..ZERO WIDTH JOINER
 20D0..20DC ; Extend # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE
 20DD..20E0 ; Extend # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH
 20E1 ; Extend # Mn COMBINING LEFT RIGHT ARROW ABOVE
 20E2..20E4 ; Extend # Me [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE
 20E5..20F0 ; Extend # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE
+2CEF..2CF1 ; Extend # Mn [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS
 2DE0..2DFF ; Extend # Mn [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS
 302A..302F ; Extend # Mn [6] IDEOGRAPHIC LEVEL TONE MARK..HANGUL DOUBLE DOT TONE MARK
 3099..309A ; Extend # Mn [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK
 A66F ; Extend # Mn COMBINING CYRILLIC VZMET
 A670..A672 ; Extend # Me [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN
 A67C..A67D ; Extend # Mn [2] COMBINING CYRILLIC KAVYKA..COMBINING CYRILLIC PAYEROK
+A6F0..A6F1 ; Extend # Mn [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS
 A802 ; Extend # Mn SYLOTI NAGRI SIGN DVISVARA
 A806 ; Extend # Mn SYLOTI NAGRI SIGN HASANTA
 A80B ; Extend # Mn SYLOTI NAGRI SIGN ANUSVARA
@@ -252,9 +279,18 @@ A827 ; Extend # Mc SYLOTI NAGRI VOWEL SIGN OO
 A880..A881 ; Extend # Mc [2] SAURASHTRA SIGN ANUSVARA..SAURASHTRA SIGN VISARGA
 A8B4..A8C3 ; Extend # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU
 A8C4 ; Extend # Mn SAURASHTRA SIGN VIRAMA
+A8E0..A8F1 ; Extend # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA
 A926..A92D ; Extend # Mn [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU
 A947..A951 ; Extend # Mn [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R
 A952..A953 ; Extend # Mc [2] REJANG CONSONANT SIGN H..REJANG VIRAMA
+A980..A982 ; Extend # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR
+A983 ; Extend # Mc JAVANESE SIGN WIGNYAN
+A9B3 ; Extend # Mn JAVANESE SIGN CECAK TELU
+A9B4..A9B5 ; Extend # Mc [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG
+A9B6..A9B9 ; Extend # Mn [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT
+A9BA..A9BB ; Extend # Mc [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE
+A9BC ; Extend # Mn JAVANESE VOWEL SIGN PEPET
+A9BD..A9C0 ; Extend # Mc [4] JAVANESE CONSONANT SIGN KERET..JAVANESE PANGKON
 AA29..AA2E ; Extend # Mn [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE
 AA2F..AA30 ; Extend # Mc [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI
 AA31..AA32 ; Extend # Mn [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE
@@ -263,6 +299,19 @@ AA35..AA36 ; Extend # Mn [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA
 AA43 ; Extend # Mn CHAM CONSONANT SIGN FINAL NG
 AA4C ; Extend # Mn CHAM CONSONANT SIGN FINAL M
 AA4D ; Extend # Mc CHAM CONSONANT SIGN FINAL H
+AA7B ; Extend # Mc MYANMAR SIGN PAO KAREN TONE
+AAB0 ; Extend # Mn TAI VIET MAI KANG
+AAB2..AAB4 ; Extend # Mn [3] TAI VIET VOWEL I..TAI VIET VOWEL U
+AAB7..AAB8 ; Extend # Mn [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA
+AABE..AABF ; Extend # Mn [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK
+AAC1 ; Extend # Mn TAI VIET TONE MAI THO
+ABE3..ABE4 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP
+ABE5 ; Extend # Mn MEETEI MAYEK VOWEL SIGN ANAP
+ABE6..ABE7 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP
+ABE8 ; Extend # Mn MEETEI MAYEK VOWEL SIGN UNAP
+ABE9..ABEA ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG
+ABEC ; Extend # Mc MEETEI MAYEK LUM IYEK
+ABED ; Extend # Mn MEETEI MAYEK APUN IYEK
 FB1E ; Extend # Mn HEBREW POINT JUDEO-SPANISH VARIKA
 FE00..FE0F ; Extend # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16
 FE20..FE26 ; Extend # Mn [7] COMBINING LIGATURE LEFT HALF..COMBINING CONJOINING MACRON
@@ -273,6 +322,12 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 10A0C..10A0F ; Extend # Mn [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA
 10A38..10A3A ; Extend # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW
 10A3F ; Extend # Mn KHAROSHTHI VIRAMA
+11080..11081 ; Extend # Mn [2] KAITHI SIGN CANDRABINDU..KAITHI SIGN ANUSVARA
+11082 ; Extend # Mc KAITHI SIGN VISARGA
+110B0..110B2 ; Extend # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II
+110B3..110B6 ; Extend # Mn [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI
+110B7..110B8 ; Extend # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU
+110B9..110BA ; Extend # Mn [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA
 1D165..1D166 ; Extend # Mc [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM
 1D167..1D169 ; Extend # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3
 1D16D..1D172 ; Extend # Mc [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5
@@ -282,7 +337,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT
 1D242..1D244 ; Extend # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME
 E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 
-# Total code points: 1285
+# Total code points: 1455
 
 # ================================================
 
@@ -306,11 +361,12 @@ E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
 206A..206F ; Format # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES
 FEFF ; Format # Cf ZERO WIDTH NO-BREAK SPACE
 FFF9..FFFB ; Format # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR
+110BD ; Format # Cf KAITHI NUMBER SIGN
 1D173..1D17A ; Format # Cf [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE
 E0001 ; Format # Cf LANGUAGE TAG
 E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG
 
-# Total code points: 137
+# Total code points: 138
 
 # ================================================
 
@@ -598,6 +654,7 @@ E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG
 051F ; Lower # L& CYRILLIC SMALL LETTER ALEUT KA
 0521 ; Lower # L& CYRILLIC SMALL LETTER EL WITH MIDDLE HOOK
 0523 ; Lower # L& CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK
+0525 ; Lower # L& CYRILLIC SMALL LETTER PE WITH DESCENDER
 0561..0587 ; Lower # L& [39] ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LIGATURE ECH YIWN
 1D00..1D2B ; Lower # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL
 1D2C..1D61 ; Lower # Lm [54] MODIFIER LETTER CAPITAL A..MODIFIER LETTER SMALL CHI
@@ -749,8 +806,6 @@ E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG
 1FE0..1FE7 ; Lower # L& [8] GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND PERISPOMENI
 1FF2..1FF4 ; Lower # L& [3] GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
 1FF6..1FF7 ; Lower # L& [2] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI
-2071 ; Lower # L& SUPERSCRIPT LATIN SMALL LETTER I
-207F ; Lower # L& SUPERSCRIPT LATIN SMALL LETTER N
 2090..2094 ; Lower # Lm [5] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER SCHWA
 210A ; Lower # L& SCRIPT SMALL G
 210E..210F ; Lower # L& [2] PLANCK CONSTANT..PLANCK CONSTANT OVER TWO PI
@@ -824,6 +879,8 @@ E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG
 2CDF ; Lower # L& COPTIC SMALL LETTER OLD NUBIAN NGI
 2CE1 ; Lower # L& COPTIC SMALL LETTER OLD NUBIAN NYI
 2CE3..2CE4 ; Lower # L& [2] COPTIC SMALL LETTER OLD NUBIAN WAU..COPTIC SYMBOL KAI
+2CEC ; Lower # L& COPTIC SMALL LETTER CRYPTOGRAMMIC SHEI
+2CEE ; Lower # L& COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA
 2D00..2D25 ; Lower # L& [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE
 A641 ; Lower # L& CYRILLIC SMALL LETTER ZEMLYA
 A643 ; Lower # L& CYRILLIC SMALL LETTER DZELO
@@ -940,7 +997,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 1D7C4..1D7C9 ; Lower # L& [6] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL SANS-SERIF BOLD ITALIC PI SYMBOL
 1D7CB ; Lower # L& MATHEMATICAL BOLD SMALL DIGAMMA
 
-# Total code points: 1906
+# Total code points: 1907
 
 # ================================================
 
@@ -1208,6 +1265,7 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 051E ; Upper # L& CYRILLIC CAPITAL LETTER ALEUT KA
 0520 ; Upper # L& CYRILLIC CAPITAL LETTER EL WITH MIDDLE HOOK
 0522 ; Upper # L& CYRILLIC CAPITAL LETTER EN WITH MIDDLE HOOK
+0524 ; Upper # L& CYRILLIC CAPITAL LETTER PE WITH DESCENDER
 0531..0556 ; Upper # L& [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH
 10A0..10C5 ; Upper # L& [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE
 1E00 ; Upper # L& LATIN CAPITAL LETTER A WITH RING BELOW
@@ -1374,10 +1432,10 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 2C67 ; Upper # L& LATIN CAPITAL LETTER H WITH DESCENDER
 2C69 ; Upper # L& LATIN CAPITAL LETTER K WITH DESCENDER
 2C6B ; Upper # L& LATIN CAPITAL LETTER Z WITH DESCENDER
-2C6D..2C6F ; Upper # L& [3] LATIN CAPITAL LETTER ALPHA..LATIN CAPITAL LETTER TURNED A
+2C6D..2C70 ; Upper # L& [4] LATIN CAPITAL LETTER ALPHA..LATIN CAPITAL LETTER TURNED ALPHA
 2C72 ; Upper # L& LATIN CAPITAL LETTER W WITH HOOK
 2C75 ; Upper # L& LATIN CAPITAL LETTER HALF H
-2C80 ; Upper # L& COPTIC CAPITAL LETTER ALFA
+2C7E..2C80 ; Upper # L& [3] LATIN CAPITAL LETTER S WITH SWASH TAIL..COPTIC CAPITAL LETTER ALFA
 2C82 ; Upper # L& COPTIC CAPITAL LETTER VIDA
 2C84 ; Upper # L& COPTIC CAPITAL LETTER GAMMA
 2C86 ; Upper # L& COPTIC CAPITAL LETTER DALDA
@@ -1427,6 +1485,8 @@ FF41..FF5A ; Lower # L& [26] FULLWIDTH LATIN SMALL LETTER A..FULLWIDTH LATIN
 2CDE ; Upper # L& COPTIC CAPITAL LETTER OLD NUBIAN NGI
 2CE0 ; Upper # L& COPTIC CAPITAL LETTER OLD NUBIAN NYI
 2CE2 ; Upper # L& COPTIC CAPITAL LETTER OLD NUBIAN WAU
+2CEB ; Upper # L& COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI
+2CED ; Upper # L& COPTIC CAPITAL LETTER CRYPTOGRAMMIC GANGIA
 A640 ; Upper # L& CYRILLIC CAPITAL LETTER ZEMLYA
 A642 ; Upper # L& CYRILLIC CAPITAL LETTER DZELO
 A644 ; Upper # L& CYRILLIC CAPITAL LETTER REVERSED DZE
@@ -1541,7 +1601,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1D790..1D7A8 ; Upper # L& [25] MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC CAPITAL OMEGA
 1D7CA ; Upper # L& MATHEMATICAL BOLD CAPITAL DIGAMMA
 
-# Total code points: 1494
+# Total code points: 1500
 
 # ================================================
 
@@ -1574,13 +1634,17 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 07CA..07EA ; OLetter # Lo [33] NKO LETTER A..NKO LETTER JONA RA
 07F4..07F5 ; OLetter # Lm [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE
 07FA ; OLetter # Lm NKO LAJANYALAN
+0800..0815 ; OLetter # Lo [22] SAMARITAN LETTER ALAF..SAMARITAN LETTER TAAF
+081A ; OLetter # Lm SAMARITAN MODIFIER LETTER EPENTHETIC YUT
+0824 ; OLetter # Lm SAMARITAN MODIFIER LETTER SHORT A
+0828 ; OLetter # Lm SAMARITAN MODIFIER LETTER I
 0904..0939 ; OLetter # Lo [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA
 093D ; OLetter # Lo DEVANAGARI SIGN AVAGRAHA
 0950 ; OLetter # Lo DEVANAGARI OM
 0958..0961 ; OLetter # Lo [10] DEVANAGARI LETTER QA..DEVANAGARI LETTER VOCALIC LL
 0971 ; OLetter # Lm DEVANAGARI SIGN HIGH SPACING DOT
 0972 ; OLetter # Lo DEVANAGARI LETTER CANDRA A
-097B..097F ; OLetter # Lo [5] DEVANAGARI LETTER GGA..DEVANAGARI LETTER BBA
+0979..097F ; OLetter # Lo [7] DEVANAGARI LETTER ZHA..DEVANAGARI LETTER BBA
 0985..098C ; OLetter # Lo [8] BENGALI LETTER A..BENGALI LETTER VOCALIC L
 098F..0990 ; OLetter # Lo [2] BENGALI LETTER E..BENGALI LETTER AI
 0993..09A8 ; OLetter # Lo [22] BENGALI LETTER O..BENGALI LETTER NA
@@ -1696,10 +1760,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 108E ; OLetter # Lo MYANMAR LETTER RUMAI PALAUNG FA
 10D0..10FA ; OLetter # Lo [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN
 10FC ; OLetter # Lm MODIFIER LETTER GEORGIAN NAR
-1100..1159 ; OLetter # Lo [90] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINHIEUH
-115F..11A2 ; OLetter # Lo [68] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG SSANGARAEA
-11A8..11F9 ; OLetter # Lo [82] HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORINHIEUH
-1200..1248 ; OLetter # Lo [73] ETHIOPIC SYLLABLE HA..ETHIOPIC SYLLABLE QWA
+1100..1248 ; OLetter # Lo [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA
 124A..124D ; OLetter # Lo [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE
 1250..1256 ; OLetter # Lo [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO
 1258 ; OLetter # Lo ETHIOPIC SYLLABLE QHWA
@@ -1718,7 +1779,7 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1380..138F ; OLetter # Lo [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE
 13A0..13F4 ; OLetter # Lo [85] CHEROKEE LETTER A..CHEROKEE LETTER YV
 1401..166C ; OLetter # Lo [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA
-166F..1676 ; OLetter # Lo [8] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS NNGAA
+166F..167F ; OLetter # Lo [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W
 1681..169A ; OLetter # Lo [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH
 16A0..16EA ; OLetter # Lo [75] RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X
 16EE..16F0 ; OLetter # Nl [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL
@@ -1736,12 +1797,15 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1844..1877 ; OLetter # Lo [52] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER MANCHU ZHA
 1880..18A8 ; OLetter # Lo [41] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER MANCHU ALI GALI BHA
 18AA ; OLetter # Lo MONGOLIAN LETTER MANCHU ALI GALI LHA
+18B0..18F5 ; OLetter # Lo [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S
 1900..191C ; OLetter # Lo [29] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER HA
 1950..196D ; OLetter # Lo [30] TAI LE LETTER KA..TAI LE LETTER AI
 1970..1974 ; OLetter # Lo [5] TAI LE LETTER TONE-2..TAI LE LETTER TONE-6
-1980..19A9 ; OLetter # Lo [42] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW XVA
+1980..19AB ; OLetter # Lo [44] NEW TAI LUE LETTER HIGH QA..NEW TAI LUE LETTER LOW SUA
 19C1..19C7 ; OLetter # Lo [7] NEW TAI LUE LETTER FINAL V..NEW TAI LUE LETTER FINAL B
 1A00..1A16 ; OLetter # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA
+1A20..1A54 ; OLetter # Lo [53] TAI THAM LETTER HIGH KA..TAI THAM LETTER GREAT SA
+1AA7 ; OLetter # Lm TAI THAM SIGN MAI YAMOK
 1B05..1B33 ; OLetter # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA
 1B45..1B4B ; OLetter # Lo [7] BALINESE LETTER KAF SASAK..BALINESE LETTER ASYURA SASAK
 1B83..1BA0 ; OLetter # Lo [30] SUNDANESE LETTER A..SUNDANESE LETTER HA
@@ -1750,6 +1814,10 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT
 1C4D..1C4F ; OLetter # Lo [3] LEPCHA LETTER TTA..LEPCHA LETTER DDA
 1C5A..1C77 ; OLetter # Lo [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH
 1C78..1C7D ; OLetter # Lm [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD
+1CE9..1CEC ; OLetter # Lo [4] VEDIC SIGN ANUSVARA ANTARGOMUKHA..VEDIC SIGN ANUSVARA VAMAGOMUKHA WITH TAIL
+1CEE..1CF1 ; OLetter # Lo [4] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ANUSVARA UBHAYATO MUKHA
+2071 ; OLetter # Lm SUPERSCRIPT LATIN SMALL LETTER I
+207F ; OLetter # Lm SUPERSCRIPT LATIN SMALL LETTER N
 2135..2138 ; OLetter # Lo [4] ALEF SYMBOL..DALET SYMBOL
 2180..2182 ; OLetter # Nl [3] ROMAN NUMERAL ONE
THOUSAND C D..ROMAN NUMERAL TEN THOUSAND 2185..2188 ; OLetter # Nl [4] ROMAN NUMERAL SIX LATE FORM..ROMAN NUMERAL ONE HUNDRED THOUSAND @@ -1784,16 +1852,20 @@ FF21..FF3A ; Upper # L& [26] FULLWIDTH LATIN CAPITAL LETTER A..FULLWIDTH LAT 31A0..31B7 ; OLetter # Lo [24] BOPOMOFO LETTER BU..BOPOMOFO FINAL LETTER H 31F0..31FF ; OLetter # Lo [16] KATAKANA LETTER SMALL KU..KATAKANA LETTER SMALL RO 3400..4DB5 ; OLetter # Lo [6582] CJK UNIFIED IDEOGRAPH-3400..CJK UNIFIED IDEOGRAPH-4DB5 -4E00..9FC3 ; OLetter # Lo [20932] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FC3 +4E00..9FCB ; OLetter # Lo [20940] CJK UNIFIED IDEOGRAPH-4E00..CJK UNIFIED IDEOGRAPH-9FCB A000..A014 ; OLetter # Lo [21] YI SYLLABLE IT..YI SYLLABLE E A015 ; OLetter # Lm YI SYLLABLE WU A016..A48C ; OLetter # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR +A4D0..A4F7 ; OLetter # Lo [40] LISU LETTER BA..LISU LETTER OE +A4F8..A4FD ; OLetter # Lm [6] LISU LETTER TONE MYA TI..LISU LETTER TONE MYA JEU A500..A60B ; OLetter # Lo [268] VAI SYLLABLE EE..VAI SYLLABLE NG A60C ; OLetter # Lm VAI SYLLABLE LENGTHENER A610..A61F ; OLetter # Lo [16] VAI SYLLABLE NDOLE FA..VAI SYMBOL JONG A62A..A62B ; OLetter # Lo [2] VAI SYLLABLE NDOLE MA..VAI SYLLABLE NDOLE DO A66E ; OLetter # Lo CYRILLIC LETTER MULTIOCULAR O A67F ; OLetter # Lm CYRILLIC PAYEROK +A6A0..A6E5 ; OLetter # Lo [70] BAMUM LETTER A..BAMUM LETTER KI +A6E6..A6EF ; OLetter # Nl [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM A717..A71F ; OLetter # Lm [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK A788 ; OLetter # Lm MODIFIER LETTER LOW CIRCUMFLEX ACCENT A7FB..A801 ; OLetter # Lo [7] LATIN EPIGRAPHIC LETTER REVERSED F..SYLOTI NAGRI LETTER I @@ -1802,14 +1874,34 @@ A807..A80A ; OLetter # Lo [4] SYLOTI NAGRI LETTER KO..SYLOTI NAGRI LETTER G A80C..A822 ; OLetter # Lo [23] SYLOTI NAGRI LETTER CO..SYLOTI NAGRI LETTER HO A840..A873 ; OLetter # Lo [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU A882..A8B3 ; OLetter # Lo [50] SAURASHTRA 
LETTER A..SAURASHTRA LETTER LLA +A8F2..A8F7 ; OLetter # Lo [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA +A8FB ; OLetter # Lo DEVANAGARI HEADSTROKE A90A..A925 ; OLetter # Lo [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO A930..A946 ; OLetter # Lo [23] REJANG LETTER KA..REJANG LETTER A +A960..A97C ; OLetter # Lo [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH +A984..A9B2 ; OLetter # Lo [47] JAVANESE LETTER A..JAVANESE LETTER HA +A9CF ; OLetter # Lm JAVANESE PANGRANGKEP AA00..AA28 ; OLetter # Lo [41] CHAM LETTER A..CHAM LETTER HA AA40..AA42 ; OLetter # Lo [3] CHAM LETTER FINAL K..CHAM LETTER FINAL NG AA44..AA4B ; OLetter # Lo [8] CHAM LETTER FINAL CH..CHAM LETTER FINAL SS +AA60..AA6F ; OLetter # Lo [16] MYANMAR LETTER KHAMTI GA..MYANMAR LETTER KHAMTI FA +AA70 ; OLetter # Lm MYANMAR MODIFIER LETTER KHAMTI REDUPLICATION +AA71..AA76 ; OLetter # Lo [6] MYANMAR LETTER KHAMTI XA..MYANMAR LOGOGRAM KHAMTI HM +AA7A ; OLetter # Lo MYANMAR LETTER AITON RA +AA80..AAAF ; OLetter # Lo [48] TAI VIET LETTER LOW KO..TAI VIET LETTER HIGH O +AAB1 ; OLetter # Lo TAI VIET VOWEL AA +AAB5..AAB6 ; OLetter # Lo [2] TAI VIET VOWEL E..TAI VIET VOWEL O +AAB9..AABD ; OLetter # Lo [5] TAI VIET VOWEL UEA..TAI VIET VOWEL AN +AAC0 ; OLetter # Lo TAI VIET TONE MAI NUENG +AAC2 ; OLetter # Lo TAI VIET TONE MAI SONG +AADB..AADC ; OLetter # Lo [2] TAI VIET SYMBOL KON..TAI VIET SYMBOL NUENG +AADD ; OLetter # Lm TAI VIET SYMBOL SAM +ABC0..ABE2 ; OLetter # Lo [35] MEETEI MAYEK LETTER KOK..MEETEI MAYEK LETTER I LONSUM AC00..D7A3 ; OLetter # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH +D7B0..D7C6 ; OLetter # Lo [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E +D7CB..D7FB ; OLetter # Lo [49] HANGUL JONGSEONG NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH F900..FA2D ; OLetter # Lo [302] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA2D -FA30..FA6A ; OLetter # Lo [59] CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPATIBILITY IDEOGRAPH-FA6A 
+FA30..FA6D ; OLetter # Lo [62] CJK COMPATIBILITY IDEOGRAPH-FA30..CJK COMPATIBILITY IDEOGRAPH-FA6D FA70..FAD9 ; OLetter # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9 FB1D ; OLetter # Lo HEBREW LETTER YOD WITH HIRIQ FB1F..FB28 ; OLetter # Lo [10] HEBREW LIGATURE YIDDISH YOD YOD PATAH..HEBREW LETTER WIDE TAV @@ -1858,19 +1950,27 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 1080A..10835 ; OLetter # Lo [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO 10837..10838 ; OLetter # Lo [2] CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE 1083C ; OLetter # Lo CYPRIOT SYLLABLE ZA -1083F ; OLetter # Lo CYPRIOT SYLLABLE ZO +1083F..10855 ; OLetter # Lo [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW 10900..10915 ; OLetter # Lo [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU 10920..10939 ; OLetter # Lo [26] LYDIAN LETTER A..LYDIAN LETTER C 10A00 ; OLetter # Lo KHAROSHTHI LETTER A 10A10..10A13 ; OLetter # Lo [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA 10A15..10A17 ; OLetter # Lo [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA 10A19..10A33 ; OLetter # Lo [27] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER TTTHA +10A60..10A7C ; OLetter # Lo [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH +10B00..10B35 ; OLetter # Lo [54] AVESTAN LETTER A..AVESTAN LETTER HE +10B40..10B55 ; OLetter # Lo [22] INSCRIPTIONAL PARTHIAN LETTER ALEPH..INSCRIPTIONAL PARTHIAN LETTER TAW +10B60..10B72 ; OLetter # Lo [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW +10C00..10C48 ; OLetter # Lo [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH +11083..110AF ; OLetter # Lo [45] KAITHI LETTER A..KAITHI LETTER HA 12000..1236E ; OLetter # Lo [879] CUNEIFORM SIGN A..CUNEIFORM SIGN ZUM 12400..12462 ; OLetter # Nl [99] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN OLD ASSYRIAN ONE QUARTER +13000..1342E ; OLetter # Lo [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032 20000..2A6D6 ; OLetter # Lo 
[42711] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6D6 +2A700..2B734 ; OLetter # Lo [4149] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B734 2F800..2FA1D ; OLetter # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D -# Total code points: 90320 +# Total code points: 96405 # ================================================ @@ -1896,7 +1996,9 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL 17E0..17E9 ; Numeric # Nd [10] KHMER DIGIT ZERO..KHMER DIGIT NINE 1810..1819 ; Numeric # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE 1946..194F ; Numeric # Nd [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE -19D0..19D9 ; Numeric # Nd [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE +19D0..19DA ; Numeric # Nd [11] NEW TAI LUE DIGIT ZERO..NEW TAI LUE THAM DIGIT ONE +1A80..1A89 ; Numeric # Nd [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE +1A90..1A99 ; Numeric # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE 1B50..1B59 ; Numeric # Nd [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE 1BB0..1BB9 ; Numeric # Nd [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE 1C40..1C49 ; Numeric # Nd [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE @@ -1904,11 +2006,13 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL A620..A629 ; Numeric # Nd [10] VAI DIGIT ZERO..VAI DIGIT NINE A8D0..A8D9 ; Numeric # Nd [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE A900..A909 ; Numeric # Nd [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE +A9D0..A9D9 ; Numeric # Nd [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE AA50..AA59 ; Numeric # Nd [10] CHAM DIGIT ZERO..CHAM DIGIT NINE +ABF0..ABF9 ; Numeric # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE 104A0..104A9 ; Numeric # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE 1D7CE..1D7FF ; Numeric # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE -# Total code points: 362 +# Total code points: 403 # 
================================================

@@ -1946,17 +2050,23 @@ FF0E ; ATerm # Po FULLWIDTH FULL STOP
 2047..2049 ; STerm # Po [3] DOUBLE QUESTION MARK..EXCLAMATION QUESTION MARK
 2E2E ; STerm # Po REVERSED QUESTION MARK
 3002 ; STerm # Po IDEOGRAPHIC FULL STOP
+A4FF ; STerm # Po LISU PUNCTUATION FULL STOP
 A60E..A60F ; STerm # Po [2] VAI FULL STOP..VAI QUESTION MARK
+A6F3 ; STerm # Po BAMUM FULL STOP
+A6F7 ; STerm # Po BAMUM QUESTION MARK
 A876..A877 ; STerm # Po [2] PHAGS-PA MARK SHAD..PHAGS-PA MARK DOUBLE SHAD
 A8CE..A8CF ; STerm # Po [2] SAURASHTRA DANDA..SAURASHTRA DOUBLE DANDA
 A92F ; STerm # Po KAYAH LI SIGN SHYA
+A9C8..A9C9 ; STerm # Po [2] JAVANESE PADA LINGSA..JAVANESE PADA LUNGSI
 AA5D..AA5F ; STerm # Po [3] CHAM PUNCTUATION DANDA..CHAM PUNCTUATION TRIPLE DANDA
+ABEB ; STerm # Po MEETEI MAYEK CHEIKHEI
 FE56..FE57 ; STerm # Po [2] SMALL QUESTION MARK..SMALL EXCLAMATION MARK
 FF01 ; STerm # Po FULLWIDTH EXCLAMATION MARK
 FF1F ; STerm # Po FULLWIDTH QUESTION MARK
 FF61 ; STerm # Po HALFWIDTH IDEOGRAPHIC FULL STOP
+110BE..110C1 ; STerm # Po [4] KAITHI SECTION MARK..KAITHI DOUBLE DANDA

-# Total code points: 53
+# Total code points: 63

 # ================================================

diff --git a/lib/unicore/auxiliary/WordBreakProperty.txt b/lib/unicore/auxiliary/WordBreakProperty.txt
index 2768f72a85..e38cb939f3 100644
--- a/lib/unicore/auxiliary/WordBreakProperty.txt
+++ b/lib/unicore/auxiliary/WordBreakProperty.txt
@@ -1,10 +1,10 @@
-# WordBreakProperty-5.1.0.txt
-# Date: 2008-03-20, 17:55:36 GMT [MD]
+# WordBreakProperty-5.2.0.txt
+# Date: 2009-07-12, 04:17:35 GMT [MD]
 #
 # Unicode Character Database
-# Copyright (c) 1991-2008 Unicode, Inc.
+# Copyright (c) 1991-2009 Unicode, Inc.
 # For terms of use, see http://www.unicode.org/terms_of_use.html
-# For documentation, see UCD.html
+# For documentation, see http://www.unicode.org/reports/tr44/

 # ================================================

@@ -58,14 +58,19 @@
 0730..074A ; Extend # Mn [27] SYRIAC PTHAHA ABOVE..SYRIAC BARREKH
 07A6..07B0 ; Extend # Mn [11] THAANA ABAFILI..THAANA SUKUN
 07EB..07F3 ; Extend # Mn [9] NKO COMBINING SHORT HIGH TONE..NKO COMBINING DOUBLE DOT ABOVE
-0901..0902 ; Extend # Mn [2] DEVANAGARI SIGN CANDRABINDU..DEVANAGARI SIGN ANUSVARA
+0816..0819 ; Extend # Mn [4] SAMARITAN MARK IN..SAMARITAN MARK DAGESH
+081B..0823 ; Extend # Mn [9] SAMARITAN MARK EPENTHETIC YUT..SAMARITAN VOWEL SIGN A
+0825..0827 ; Extend # Mn [3] SAMARITAN VOWEL SIGN SHORT A..SAMARITAN VOWEL SIGN U
+0829..082D ; Extend # Mn [5] SAMARITAN VOWEL SIGN LONG I..SAMARITAN MARK NEQUDAA
+0900..0902 ; Extend # Mn [3] DEVANAGARI SIGN INVERTED CANDRABINDU..DEVANAGARI SIGN ANUSVARA
 0903 ; Extend # Mc DEVANAGARI SIGN VISARGA
 093C ; Extend # Mn DEVANAGARI SIGN NUKTA
 093E..0940 ; Extend # Mc [3] DEVANAGARI VOWEL SIGN AA..DEVANAGARI VOWEL SIGN II
 0941..0948 ; Extend # Mn [8] DEVANAGARI VOWEL SIGN U..DEVANAGARI VOWEL SIGN AI
 0949..094C ; Extend # Mc [4] DEVANAGARI VOWEL SIGN CANDRA O..DEVANAGARI VOWEL SIGN AU
 094D ; Extend # Mn DEVANAGARI SIGN VIRAMA
-0951..0954 ; Extend # Mn [4] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI ACUTE ACCENT
+094E ; Extend # Mc DEVANAGARI VOWEL SIGN PRISHTHAMATRA E
+0951..0955 ; Extend # Mn [5] DEVANAGARI STRESS SIGN UDATTA..DEVANAGARI VOWEL SIGN CANDRA LONG E
 0962..0963 ; Extend # Mn [2] DEVANAGARI VOWEL SIGN VOCALIC L..DEVANAGARI VOWEL SIGN VOCALIC LL
 0981 ; Extend # Mn BENGALI SIGN CANDRABINDU
 0982..0983 ; Extend # Mc [2] BENGALI SIGN ANUSVARA..BENGALI SIGN VISARGA
@@ -190,6 +195,8 @@
 1087..108C ; Extend # Mc [6] MYANMAR SIGN SHAN TONE-2..MYANMAR SIGN SHAN COUNCIL TONE-3
 108D ; Extend # Mn MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE
 108F ; Extend # Mc MYANMAR SIGN RUMAI PALAUNG TONE-5
+109A..109C ; Extend # Mc [3] MYANMAR SIGN KHAMTI TONE-1..MYANMAR VOWEL SIGN AITON A +109D ; Extend # Mn MYANMAR VOWEL SIGN AITON AI 135F ; Extend # Mn ETHIOPIC COMBINING GEMINATION MARK 1712..1714 ; Extend # Mn [3] TAGALOG VOWEL SIGN I..TAGALOG SIGN VIRAMA 1732..1734 ; Extend # Mn [3] HANUNOO VOWEL SIGN I..HANUNOO SIGN PAMUDPOD @@ -216,6 +223,18 @@ 19C8..19C9 ; Extend # Mc [2] NEW TAI LUE TONE MARK-1..NEW TAI LUE TONE MARK-2 1A17..1A18 ; Extend # Mn [2] BUGINESE VOWEL SIGN I..BUGINESE VOWEL SIGN U 1A19..1A1B ; Extend # Mc [3] BUGINESE VOWEL SIGN E..BUGINESE VOWEL SIGN AE +1A55 ; Extend # Mc TAI THAM CONSONANT SIGN MEDIAL RA +1A56 ; Extend # Mn TAI THAM CONSONANT SIGN MEDIAL LA +1A57 ; Extend # Mc TAI THAM CONSONANT SIGN LA TANG LAI +1A58..1A5E ; Extend # Mn [7] TAI THAM SIGN MAI KANG LAI..TAI THAM CONSONANT SIGN SA +1A60 ; Extend # Mn TAI THAM SIGN SAKOT +1A61 ; Extend # Mc TAI THAM VOWEL SIGN A +1A62 ; Extend # Mn TAI THAM VOWEL SIGN MAI SAT +1A63..1A64 ; Extend # Mc [2] TAI THAM VOWEL SIGN AA..TAI THAM VOWEL SIGN TALL AA +1A65..1A6C ; Extend # Mn [8] TAI THAM VOWEL SIGN I..TAI THAM VOWEL SIGN OA BELOW +1A6D..1A72 ; Extend # Mc [6] TAI THAM VOWEL SIGN OY..TAI THAM VOWEL SIGN THAM AI +1A73..1A7C ; Extend # Mn [10] TAI THAM VOWEL SIGN OA ABOVE..TAI THAM SIGN KHUEN-LUE KARAN +1A7F ; Extend # Mn TAI THAM COMBINING CRYPTOGRAMMIC DOT 1B00..1B03 ; Extend # Mn [4] BALINESE SIGN ULU RICEM..BALINESE SIGN SURANG 1B04 ; Extend # Mc BALINESE SIGN BISAH 1B34 ; Extend # Mn BALINESE SIGN REREKAN @@ -238,20 +257,28 @@ 1C2C..1C33 ; Extend # Mn [8] LEPCHA VOWEL SIGN E..LEPCHA CONSONANT SIGN T 1C34..1C35 ; Extend # Mc [2] LEPCHA CONSONANT SIGN NYIN-DO..LEPCHA CONSONANT SIGN KANG 1C36..1C37 ; Extend # Mn [2] LEPCHA SIGN RAN..LEPCHA SIGN NUKTA +1CD0..1CD2 ; Extend # Mn [3] VEDIC TONE KARSHANA..VEDIC TONE PRENKHA +1CD4..1CE0 ; Extend # Mn [13] VEDIC SIGN YAJURVEDIC MIDLINE SVARITA..VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA +1CE1 ; Extend # Mc VEDIC TONE ATHARVAVEDIC INDEPENDENT 
SVARITA +1CE2..1CE8 ; Extend # Mn [7] VEDIC SIGN VISARGA SVARITA..VEDIC SIGN VISARGA ANUDATTA WITH TAIL +1CED ; Extend # Mn VEDIC SIGN TIRYAK +1CF2 ; Extend # Mc VEDIC SIGN ARDHAVISARGA 1DC0..1DE6 ; Extend # Mn [39] COMBINING DOTTED GRAVE ACCENT..COMBINING LATIN SMALL LETTER Z -1DFE..1DFF ; Extend # Mn [2] COMBINING LEFT ARROWHEAD ABOVE..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW +1DFD..1DFF ; Extend # Mn [3] COMBINING ALMOST EQUAL TO BELOW..COMBINING RIGHT ARROWHEAD AND DOWN ARROWHEAD BELOW 200C..200D ; Extend # Cf [2] ZERO WIDTH NON-JOINER..ZERO WIDTH JOINER 20D0..20DC ; Extend # Mn [13] COMBINING LEFT HARPOON ABOVE..COMBINING FOUR DOTS ABOVE 20DD..20E0 ; Extend # Me [4] COMBINING ENCLOSING CIRCLE..COMBINING ENCLOSING CIRCLE BACKSLASH 20E1 ; Extend # Mn COMBINING LEFT RIGHT ARROW ABOVE 20E2..20E4 ; Extend # Me [3] COMBINING ENCLOSING SCREEN..COMBINING ENCLOSING UPWARD POINTING TRIANGLE 20E5..20F0 ; Extend # Mn [12] COMBINING REVERSE SOLIDUS OVERLAY..COMBINING ASTERISK ABOVE +2CEF..2CF1 ; Extend # Mn [3] COPTIC COMBINING NI ABOVE..COPTIC COMBINING SPIRITUS LENIS 2DE0..2DFF ; Extend # Mn [32] COMBINING CYRILLIC LETTER BE..COMBINING CYRILLIC LETTER IOTIFIED BIG YUS 302A..302F ; Extend # Mn [6] IDEOGRAPHIC LEVEL TONE MARK..HANGUL DOUBLE DOT TONE MARK 3099..309A ; Extend # Mn [2] COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK..COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK A66F ; Extend # Mn COMBINING CYRILLIC VZMET A670..A672 ; Extend # Me [3] COMBINING CYRILLIC TEN MILLIONS SIGN..COMBINING CYRILLIC THOUSAND MILLIONS SIGN A67C..A67D ; Extend # Mn [2] COMBINING CYRILLIC KAVYKA..COMBINING CYRILLIC PAYEROK +A6F0..A6F1 ; Extend # Mn [2] BAMUM COMBINING MARK KOQNDON..BAMUM COMBINING MARK TUKWENTIS A802 ; Extend # Mn SYLOTI NAGRI SIGN DVISVARA A806 ; Extend # Mn SYLOTI NAGRI SIGN HASANTA A80B ; Extend # Mn SYLOTI NAGRI SIGN ANUSVARA @@ -261,9 +288,18 @@ A827 ; Extend # Mc SYLOTI NAGRI VOWEL SIGN OO A880..A881 ; Extend # Mc [2] SAURASHTRA SIGN 
ANUSVARA..SAURASHTRA SIGN VISARGA A8B4..A8C3 ; Extend # Mc [16] SAURASHTRA CONSONANT SIGN HAARU..SAURASHTRA VOWEL SIGN AU A8C4 ; Extend # Mn SAURASHTRA SIGN VIRAMA +A8E0..A8F1 ; Extend # Mn [18] COMBINING DEVANAGARI DIGIT ZERO..COMBINING DEVANAGARI SIGN AVAGRAHA A926..A92D ; Extend # Mn [8] KAYAH LI VOWEL UE..KAYAH LI TONE CALYA PLOPHU A947..A951 ; Extend # Mn [11] REJANG VOWEL SIGN I..REJANG CONSONANT SIGN R A952..A953 ; Extend # Mc [2] REJANG CONSONANT SIGN H..REJANG VIRAMA +A980..A982 ; Extend # Mn [3] JAVANESE SIGN PANYANGGA..JAVANESE SIGN LAYAR +A983 ; Extend # Mc JAVANESE SIGN WIGNYAN +A9B3 ; Extend # Mn JAVANESE SIGN CECAK TELU +A9B4..A9B5 ; Extend # Mc [2] JAVANESE VOWEL SIGN TARUNG..JAVANESE VOWEL SIGN TOLONG +A9B6..A9B9 ; Extend # Mn [4] JAVANESE VOWEL SIGN WULU..JAVANESE VOWEL SIGN SUKU MENDUT +A9BA..A9BB ; Extend # Mc [2] JAVANESE VOWEL SIGN TALING..JAVANESE VOWEL SIGN DIRGA MURE +A9BC ; Extend # Mn JAVANESE VOWEL SIGN PEPET +A9BD..A9C0 ; Extend # Mc [4] JAVANESE CONSONANT SIGN KERET..JAVANESE PANGKON AA29..AA2E ; Extend # Mn [6] CHAM VOWEL SIGN AA..CHAM VOWEL SIGN OE AA2F..AA30 ; Extend # Mc [2] CHAM VOWEL SIGN O..CHAM VOWEL SIGN AI AA31..AA32 ; Extend # Mn [2] CHAM VOWEL SIGN AU..CHAM VOWEL SIGN UE @@ -272,6 +308,19 @@ AA35..AA36 ; Extend # Mn [2] CHAM CONSONANT SIGN LA..CHAM CONSONANT SIGN WA AA43 ; Extend # Mn CHAM CONSONANT SIGN FINAL NG AA4C ; Extend # Mn CHAM CONSONANT SIGN FINAL M AA4D ; Extend # Mc CHAM CONSONANT SIGN FINAL H +AA7B ; Extend # Mc MYANMAR SIGN PAO KAREN TONE +AAB0 ; Extend # Mn TAI VIET MAI KANG +AAB2..AAB4 ; Extend # Mn [3] TAI VIET VOWEL I..TAI VIET VOWEL U +AAB7..AAB8 ; Extend # Mn [2] TAI VIET MAI KHIT..TAI VIET VOWEL IA +AABE..AABF ; Extend # Mn [2] TAI VIET VOWEL AM..TAI VIET TONE MAI EK +AAC1 ; Extend # Mn TAI VIET TONE MAI THO +ABE3..ABE4 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN ONAP..MEETEI MAYEK VOWEL SIGN INAP +ABE5 ; Extend # Mn MEETEI MAYEK VOWEL SIGN ANAP +ABE6..ABE7 ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN 
YENAP..MEETEI MAYEK VOWEL SIGN SOUNAP +ABE8 ; Extend # Mn MEETEI MAYEK VOWEL SIGN UNAP +ABE9..ABEA ; Extend # Mc [2] MEETEI MAYEK VOWEL SIGN CHEINAP..MEETEI MAYEK VOWEL SIGN NUNG +ABEC ; Extend # Mc MEETEI MAYEK LUM IYEK +ABED ; Extend # Mn MEETEI MAYEK APUN IYEK FB1E ; Extend # Mn HEBREW POINT JUDEO-SPANISH VARIKA FE00..FE0F ; Extend # Mn [16] VARIATION SELECTOR-1..VARIATION SELECTOR-16 FE20..FE26 ; Extend # Mn [7] COMBINING LIGATURE LEFT HALF..COMBINING CONJOINING MACRON @@ -282,6 +331,12 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 10A0C..10A0F ; Extend # Mn [4] KHAROSHTHI VOWEL LENGTH MARK..KHAROSHTHI SIGN VISARGA 10A38..10A3A ; Extend # Mn [3] KHAROSHTHI SIGN BAR ABOVE..KHAROSHTHI SIGN DOT BELOW 10A3F ; Extend # Mn KHAROSHTHI VIRAMA +11080..11081 ; Extend # Mn [2] KAITHI SIGN CANDRABINDU..KAITHI SIGN ANUSVARA +11082 ; Extend # Mc KAITHI SIGN VISARGA +110B0..110B2 ; Extend # Mc [3] KAITHI VOWEL SIGN AA..KAITHI VOWEL SIGN II +110B3..110B6 ; Extend # Mn [4] KAITHI VOWEL SIGN U..KAITHI VOWEL SIGN AI +110B7..110B8 ; Extend # Mc [2] KAITHI VOWEL SIGN O..KAITHI VOWEL SIGN AU +110B9..110BA ; Extend # Mn [2] KAITHI SIGN VIRAMA..KAITHI SIGN NUKTA 1D165..1D166 ; Extend # Mc [2] MUSICAL SYMBOL COMBINING STEM..MUSICAL SYMBOL COMBINING SPRECHGESANG STEM 1D167..1D169 ; Extend # Mn [3] MUSICAL SYMBOL COMBINING TREMOLO-1..MUSICAL SYMBOL COMBINING TREMOLO-3 1D16D..1D172 ; Extend # Mc [6] MUSICAL SYMBOL COMBINING AUGMENTATION DOT..MUSICAL SYMBOL COMBINING FLAG-5 @@ -291,7 +346,7 @@ FF9E..FF9F ; Extend # Lm [2] HALFWIDTH KATAKANA VOICED SOUND MARK..HALFWIDT 1D242..1D244 ; Extend # Mn [3] COMBINING GREEK MUSICAL TRISEME..COMBINING GREEK MUSICAL PENTASEME E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256 -# Total code points: 1285 +# Total code points: 1455 # ================================================ @@ -300,13 +355,13 @@ E0100..E01EF ; Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256 06DD ; Format 
# Cf ARABIC END OF AYAH 070F ; Format # Cf SYRIAC ABBREVIATION MARK 17B4..17B5 ; Format # Cf [2] KHMER VOWEL INHERENT AQ..KHMER VOWEL INHERENT AA -200B ; Format # Cf ZERO WIDTH SPACE 200E..200F ; Format # Cf [2] LEFT-TO-RIGHT MARK..RIGHT-TO-LEFT MARK 202A..202E ; Format # Cf [5] LEFT-TO-RIGHT EMBEDDING..RIGHT-TO-LEFT OVERRIDE 2060..2064 ; Format # Cf [5] WORD JOINER..INVISIBLE PLUS 206A..206F ; Format # Cf [6] INHIBIT SYMMETRIC SWAPPING..NOMINAL DIGIT SHAPES FEFF ; Format # Cf ZERO WIDTH NO-BREAK SPACE FFF9..FFFB ; Format # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLINEAR ANNOTATION TERMINATOR +110BD ; Format # Cf KAITHI NUMBER SIGN 1D173..1D17A ; Format # Cf [8] MUSICAL SYMBOL BEGIN BEAM..MUSICAL SYMBOL END PHRASE E0001 ; Format # Cf LANGUAGE TAG E0020..E007F ; Format # Cf [96] TAG SPACE..CANCEL TAG @@ -362,7 +417,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 038E..03A1 ; ALetter # L& [20] GREEK CAPITAL LETTER UPSILON WITH TONOS..GREEK CAPITAL LETTER RHO 03A3..03F5 ; ALetter # L& [83] GREEK CAPITAL LETTER SIGMA..GREEK LUNATE EPSILON SYMBOL 03F7..0481 ; ALetter # L& [139] GREEK CAPITAL LETTER SHO..CYRILLIC SMALL LETTER KOPPA -048A..0523 ; ALetter # L& [154] CYRILLIC CAPITAL LETTER SHORT I WITH TAIL..CYRILLIC SMALL LETTER EN WITH MIDDLE HOOK +048A..0525 ; ALetter # L& [156] CYRILLIC CAPITAL LETTER SHORT I WITH TAIL..CYRILLIC SMALL LETTER PE WITH DESCENDER 0531..0556 ; ALetter # L& [38] ARMENIAN CAPITAL LETTER AYB..ARMENIAN CAPITAL LETTER FEH 0559 ; ALetter # Lm ARMENIAN MODIFIER LETTER LEFT HALF RING 0561..0587 ; ALetter # L& [39] ARMENIAN SMALL LETTER AYB..ARMENIAN SMALL LIGATURE ECH YIWN @@ -386,13 +441,17 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 07CA..07EA ; ALetter # Lo [33] NKO LETTER A..NKO LETTER JONA RA 07F4..07F5 ; ALetter # Lm [2] NKO HIGH TONE APOSTROPHE..NKO LOW TONE APOSTROPHE 07FA ; ALetter # Lm NKO LAJANYALAN +0800..0815 ; ALetter # Lo [22] SAMARITAN LETTER ALAF..SAMARITAN 
LETTER TAAF +081A ; ALetter # Lm SAMARITAN MODIFIER LETTER EPENTHETIC YUT +0824 ; ALetter # Lm SAMARITAN MODIFIER LETTER SHORT A +0828 ; ALetter # Lm SAMARITAN MODIFIER LETTER I 0904..0939 ; ALetter # Lo [54] DEVANAGARI LETTER SHORT A..DEVANAGARI LETTER HA 093D ; ALetter # Lo DEVANAGARI SIGN AVAGRAHA 0950 ; ALetter # Lo DEVANAGARI OM 0958..0961 ; ALetter # Lo [10] DEVANAGARI LETTER QA..DEVANAGARI LETTER VOCALIC LL 0971 ; ALetter # Lm DEVANAGARI SIGN HIGH SPACING DOT 0972 ; ALetter # Lo DEVANAGARI LETTER CANDRA A -097B..097F ; ALetter # Lo [5] DEVANAGARI LETTER GGA..DEVANAGARI LETTER BBA +0979..097F ; ALetter # Lo [7] DEVANAGARI LETTER ZHA..DEVANAGARI LETTER BBA 0985..098C ; ALetter # Lo [8] BENGALI LETTER A..BENGALI LETTER VOCALIC L 098F..0990 ; ALetter # Lo [2] BENGALI LETTER E..BENGALI LETTER AI 0993..09A8 ; ALetter # Lo [22] BENGALI LETTER O..BENGALI LETTER NA @@ -479,10 +538,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 10A0..10C5 ; ALetter # L& [38] GEORGIAN CAPITAL LETTER AN..GEORGIAN CAPITAL LETTER HOE 10D0..10FA ; ALetter # Lo [43] GEORGIAN LETTER AN..GEORGIAN LETTER AIN 10FC ; ALetter # Lm MODIFIER LETTER GEORGIAN NAR -1100..1159 ; ALetter # Lo [90] HANGUL CHOSEONG KIYEOK..HANGUL CHOSEONG YEORINHIEUH -115F..11A2 ; ALetter # Lo [68] HANGUL CHOSEONG FILLER..HANGUL JUNGSEONG SSANGARAEA -11A8..11F9 ; ALetter # Lo [82] HANGUL JONGSEONG KIYEOK..HANGUL JONGSEONG YEORINHIEUH -1200..1248 ; ALetter # Lo [73] ETHIOPIC SYLLABLE HA..ETHIOPIC SYLLABLE QWA +1100..1248 ; ALetter # Lo [329] HANGUL CHOSEONG KIYEOK..ETHIOPIC SYLLABLE QWA 124A..124D ; ALetter # Lo [4] ETHIOPIC SYLLABLE QWI..ETHIOPIC SYLLABLE QWE 1250..1256 ; ALetter # Lo [7] ETHIOPIC SYLLABLE QHA..ETHIOPIC SYLLABLE QHO 1258 ; ALetter # Lo ETHIOPIC SYLLABLE QHWA @@ -501,7 +557,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 1380..138F ; ALetter # Lo [16] ETHIOPIC SYLLABLE SEBATBEIT MWA..ETHIOPIC SYLLABLE PWE 13A0..13F4 ; ALetter # Lo [85] 
CHEROKEE LETTER A..CHEROKEE LETTER YV 1401..166C ; ALetter # Lo [620] CANADIAN SYLLABICS E..CANADIAN SYLLABICS CARRIER TTSA -166F..1676 ; ALetter # Lo [8] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS NNGAA +166F..167F ; ALetter # Lo [17] CANADIAN SYLLABICS QAI..CANADIAN SYLLABICS BLACKFOOT W 1681..169A ; ALetter # Lo [26] OGHAM LETTER BEITH..OGHAM LETTER PEITH 16A0..16EA ; ALetter # Lo [75] RUNIC LETTER FEHU FEOH FE F..RUNIC LETTER X 16EE..16F0 ; ALetter # Nl [3] RUNIC ARLAUG SYMBOL..RUNIC BELGTHOR SYMBOL @@ -516,6 +572,7 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 1844..1877 ; ALetter # Lo [52] MONGOLIAN LETTER TODO E..MONGOLIAN LETTER MANCHU ZHA 1880..18A8 ; ALetter # Lo [41] MONGOLIAN LETTER ALI GALI ANUSVARA ONE..MONGOLIAN LETTER MANCHU ALI GALI BHA 18AA ; ALetter # Lo MONGOLIAN LETTER MANCHU ALI GALI LHA +18B0..18F5 ; ALetter # Lo [70] CANADIAN SYLLABICS OY..CANADIAN SYLLABICS CARRIER DENTAL S 1900..191C ; ALetter # Lo [29] LIMBU VOWEL-CARRIER LETTER..LIMBU LETTER HA 1A00..1A16 ; ALetter # Lo [23] BUGINESE LETTER KA..BUGINESE LETTER HA 1B05..1B33 ; ALetter # Lo [47] BALINESE LETTER AKARA..BALINESE LETTER HA @@ -526,6 +583,8 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK 1C4D..1C4F ; ALetter # Lo [3] LEPCHA LETTER TTA..LEPCHA LETTER DDA 1C5A..1C77 ; ALetter # Lo [30] OL CHIKI LETTER LA..OL CHIKI LETTER OH 1C78..1C7D ; ALetter # Lm [6] OL CHIKI MU TTUDDAG..OL CHIKI AHAD +1CE9..1CEC ; ALetter # Lo [4] VEDIC SIGN ANUSVARA ANTARGOMUKHA..VEDIC SIGN ANUSVARA VAMAGOMUKHA WITH TAIL +1CEE..1CF1 ; ALetter # Lo [4] VEDIC SIGN HEXIFORM LONG ANUSVARA..VEDIC SIGN ANUSVARA UBHAYATO MUKHA 1D00..1D2B ; ALetter # L& [44] LATIN LETTER SMALL CAPITAL A..CYRILLIC LETTER SMALL CAPITAL EL 1D2C..1D61 ; ALetter # Lm [54] MODIFIER LETTER CAPITAL A..MODIFIER LETTER SMALL CHI 1D62..1D77 ; ALetter # L& [22] LATIN SUBSCRIPT SMALL LETTER I..LATIN SMALL LETTER TURNED G @@ -551,8 +610,8 @@ FF71..FF9D ; Katakana # Lo [45] 
HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
 1FE0..1FEC ; ALetter # L& [13] GREEK SMALL LETTER UPSILON WITH VRACHY..GREEK CAPITAL LETTER RHO WITH DASIA
 1FF2..1FF4 ; ALetter # L& [3] GREEK SMALL LETTER OMEGA WITH VARIA AND YPOGEGRAMMENI..GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI
 1FF6..1FFC ; ALetter # L& [7] GREEK SMALL LETTER OMEGA WITH PERISPOMENI..GREEK CAPITAL LETTER OMEGA WITH PROSGEGRAMMENI
-2071 ; ALetter # L& SUPERSCRIPT LATIN SMALL LETTER I
-207F ; ALetter # L& SUPERSCRIPT LATIN SMALL LETTER N
+2071 ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER I
+207F ; ALetter # Lm SUPERSCRIPT LATIN SMALL LETTER N
 2090..2094 ; ALetter # Lm [5] LATIN SUBSCRIPT SMALL LETTER A..LATIN SUBSCRIPT SMALL LETTER SCHWA
 2102 ; ALetter # L& DOUBLE-STRUCK CAPITAL C
 2107 ; ALetter # L& EULER CONSTANT
@@ -575,10 +634,10 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
 24B6..24E9 ; ALetter # So [52] CIRCLED LATIN CAPITAL LETTER A..CIRCLED LATIN SMALL LETTER Z
 2C00..2C2E ; ALetter # L& [47] GLAGOLITIC CAPITAL LETTER AZU..GLAGOLITIC CAPITAL LETTER LATINATE MYSLITE
 2C30..2C5E ; ALetter # L& [47] GLAGOLITIC SMALL LETTER AZU..GLAGOLITIC SMALL LETTER LATINATE MYSLITE
-2C60..2C6F ; ALetter # L& [16] LATIN CAPITAL LETTER L WITH DOUBLE BAR..LATIN CAPITAL LETTER TURNED A
-2C71..2C7C ; ALetter # L& [12] LATIN SMALL LETTER V WITH RIGHT HOOK..LATIN SUBSCRIPT SMALL LETTER J
+2C60..2C7C ; ALetter # L& [29] LATIN CAPITAL LETTER L WITH DOUBLE BAR..LATIN SUBSCRIPT SMALL LETTER J
 2C7D ; ALetter # Lm MODIFIER LETTER CAPITAL V
-2C80..2CE4 ; ALetter # L& [101] COPTIC CAPITAL LETTER ALFA..COPTIC SYMBOL KAI
+2C7E..2CE4 ; ALetter # L& [103] LATIN CAPITAL LETTER S WITH SWASH TAIL..COPTIC SYMBOL KAI
+2CEB..2CEE ; ALetter # L& [4] COPTIC CAPITAL LETTER CRYPTOGRAMMIC SHEI..COPTIC SMALL LETTER CRYPTOGRAMMIC GANGIA
 2D00..2D25 ; ALetter # L& [38] GEORGIAN SMALL LETTER AN..GEORGIAN SMALL LETTER HOE
 2D30..2D65 ; ALetter # Lo [54] TIFINAGH LETTER YA..TIFINAGH LETTER YAZZ
 2D6F ; ALetter # Lm TIFINAGH MODIFIER LETTER LABIALIZATION MARK
@@ -601,6 +660,8 @@ FF71..FF9D ; Katakana # Lo [45] HALFWIDTH KATAKANA LETTER A..HALFWIDTH KATAK
 A000..A014 ; ALetter # Lo [21] YI SYLLABLE IT..YI SYLLABLE E
 A015 ; ALetter # Lm YI SYLLABLE WU
 A016..A48C ; ALetter # Lo [1143] YI SYLLABLE BIT..YI SYLLABLE YYR
+A4D0..A4F7 ; ALetter # Lo [40] LISU LETTER BA..LISU LETTER OE
+A4F8..A4FD ; ALetter # Lm [6] LISU LETTER TONE MYA TI..LISU LETTER TONE MYA JEU
 A500..A60B ; ALetter # Lo [268] VAI SYLLABLE EE..VAI SYLLABLE NG
 A60C ; ALetter # Lm VAI SYLLABLE LENGTHENER
 A610..A61F ; ALetter # Lo [16] VAI SYLLABLE NDOLE FA..VAI SYMBOL JONG
@@ -610,6 +671,8 @@ A662..A66D ; ALetter # L& [12] CYRILLIC CAPITAL LETTER SOFT DE..CYRILLIC SMA
 A66E ; ALetter # Lo CYRILLIC LETTER MULTIOCULAR O
 A67F ; ALetter # Lm CYRILLIC PAYEROK
 A680..A697 ; ALetter # L& [24] CYRILLIC CAPITAL LETTER DWE..CYRILLIC SMALL LETTER SHWE
+A6A0..A6E5 ; ALetter # Lo [70] BAMUM LETTER A..BAMUM LETTER KI
+A6E6..A6EF ; ALetter # Nl [10] BAMUM LETTER MO..BAMUM LETTER KOGHOM
 A717..A71F ; ALetter # Lm [9] MODIFIER LETTER DOT VERTICAL BAR..MODIFIER LETTER LOW INVERTED EXCLAMATION MARK
 A722..A76F ; ALetter # L& [78] LATIN CAPITAL LETTER EGYPTOLOGICAL ALEF..LATIN SMALL LETTER CON
 A770 ; ALetter # Lm MODIFIER LETTER US
@@ -622,12 +685,20 @@ A807..A80A ; ALetter # Lo [4] SYLOTI NAGRI LETTER KO..SYLOTI NAGRI LETTER G
 A80C..A822 ; ALetter # Lo [23] SYLOTI NAGRI LETTER CO..SYLOTI NAGRI LETTER HO
 A840..A873 ; ALetter # Lo [52] PHAGS-PA LETTER KA..PHAGS-PA LETTER CANDRABINDU
 A882..A8B3 ; ALetter # Lo [50] SAURASHTRA LETTER A..SAURASHTRA LETTER LLA
+A8F2..A8F7 ; ALetter # Lo [6] DEVANAGARI SIGN SPACING CANDRABINDU..DEVANAGARI SIGN CANDRABINDU AVAGRAHA
+A8FB ; ALetter # Lo DEVANAGARI HEADSTROKE
 A90A..A925 ; ALetter # Lo [28] KAYAH LI LETTER KA..KAYAH LI LETTER OO
 A930..A946 ; ALetter # Lo [23] REJANG LETTER KA..REJANG LETTER A
+A960..A97C ; ALetter # Lo [29] HANGUL CHOSEONG TIKEUT-MIEUM..HANGUL CHOSEONG SSANGYEORINHIEUH
+A984..A9B2 ; ALetter # Lo [47] JAVANESE LETTER A..JAVANESE LETTER HA
+A9CF ; ALetter # Lm JAVANESE PANGRANGKEP
 AA00..AA28 ; ALetter # Lo [41] CHAM LETTER A..CHAM LETTER HA
 AA40..AA42 ; ALetter # Lo [3] CHAM LETTER FINAL K..CHAM LETTER FINAL NG
 AA44..AA4B ; ALetter # Lo [8] CHAM LETTER FINAL CH..CHAM LETTER FINAL SS
+ABC0..ABE2 ; ALetter # Lo [35] MEETEI MAYEK LETTER KOK..MEETEI MAYEK LETTER I LONSUM
 AC00..D7A3 ; ALetter # Lo [11172] HANGUL SYLLABLE GA..HANGUL SYLLABLE HIH
+D7B0..D7C6 ; ALetter # Lo [23] HANGUL JUNGSEONG O-YEO..HANGUL JUNGSEONG ARAEA-E
+D7CB..D7FB ; ALetter # Lo [49] HANGUL JONGSEONG NIEUN-RIEUL..HANGUL JONGSEONG PHIEUPH-THIEUTH
 FB00..FB06 ; ALetter # L& [7] LATIN SMALL LIGATURE FF..LATIN SMALL LIGATURE ST
 FB13..FB17 ; ALetter # L& [5] ARMENIAN SMALL LIGATURE MEN NOW..ARMENIAN SMALL LIGATURE MEN XEH
 FB1D ; ALetter # Lo HEBREW LETTER YOD WITH HIRIQ
@@ -677,15 +748,22 @@ FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1080A..10835 ; ALetter # Lo [44] CYPRIOT SYLLABLE KA..CYPRIOT SYLLABLE WO
 10837..10838 ; ALetter # Lo [2] CYPRIOT SYLLABLE XA..CYPRIOT SYLLABLE XE
 1083C ; ALetter # Lo CYPRIOT SYLLABLE ZA
-1083F ; ALetter # Lo CYPRIOT SYLLABLE ZO
+1083F..10855 ; ALetter # Lo [23] CYPRIOT SYLLABLE ZO..IMPERIAL ARAMAIC LETTER TAW
 10900..10915 ; ALetter # Lo [22] PHOENICIAN LETTER ALF..PHOENICIAN LETTER TAU
 10920..10939 ; ALetter # Lo [26] LYDIAN LETTER A..LYDIAN LETTER C
 10A00 ; ALetter # Lo KHAROSHTHI LETTER A
 10A10..10A13 ; ALetter # Lo [4] KHAROSHTHI LETTER KA..KHAROSHTHI LETTER GHA
 10A15..10A17 ; ALetter # Lo [3] KHAROSHTHI LETTER CA..KHAROSHTHI LETTER JA
 10A19..10A33 ; ALetter # Lo [27] KHAROSHTHI LETTER NYA..KHAROSHTHI LETTER TTTHA
+10A60..10A7C ; ALetter # Lo [29] OLD SOUTH ARABIAN LETTER HE..OLD SOUTH ARABIAN LETTER THETH
+10B00..10B35 ; ALetter # Lo [54] AVESTAN LETTER A..AVESTAN LETTER HE
+10B40..10B55 ; ALetter # Lo [22] INSCRIPTIONAL PARTHIAN LETTER ALEPH..INSCRIPTIONAL PARTHIAN LETTER TAW
+10B60..10B72 ; ALetter # Lo [19] INSCRIPTIONAL PAHLAVI LETTER ALEPH..INSCRIPTIONAL PAHLAVI LETTER TAW
+10C00..10C48 ; ALetter # Lo [73] OLD TURKIC LETTER ORKHON A..OLD TURKIC LETTER ORKHON BASH
+11083..110AF ; ALetter # Lo [45] KAITHI LETTER A..KAITHI LETTER HA
 12000..1236E ; ALetter # Lo [879] CUNEIFORM SIGN A..CUNEIFORM SIGN ZUM
 12400..12462 ; ALetter # Nl [99] CUNEIFORM NUMERIC SIGN TWO ASH..CUNEIFORM NUMERIC SIGN OLD ASSYRIAN ONE QUARTER
+13000..1342E ; ALetter # Lo [1071] EGYPTIAN HIEROGLYPH A001..EGYPTIAN HIEROGLYPH AA032
 1D400..1D454 ; ALetter # L& [85] MATHEMATICAL BOLD CAPITAL A..MATHEMATICAL ITALIC SMALL G
 1D456..1D49C ; ALetter # L& [71] MATHEMATICAL ITALIC SMALL I..MATHEMATICAL SCRIPT CAPITAL A
 1D49E..1D49F ; ALetter # L& [2] MATHEMATICAL SCRIPT CAPITAL C..MATHEMATICAL SCRIPT CAPITAL D
@@ -717,7 +795,7 @@ FFDA..FFDC ; ALetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
 1D7AA..1D7C2 ; ALetter # L& [25] MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA..MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL OMEGA
 1D7C4..1D7CB ; ALetter # L& [8] MATHEMATICAL SANS-SERIF BOLD ITALIC EPSILON SYMBOL..MATHEMATICAL BOLD SMALL DIGAMMA

-# Total code points: 21903
+# Total code points: 23694

 # ================================================

@@ -788,7 +866,9 @@ FF0E ; MidNumLet # Po FULLWIDTH FULL STOP
 17E0..17E9 ; Numeric # Nd [10] KHMER DIGIT ZERO..KHMER DIGIT NINE
 1810..1819 ; Numeric # Nd [10] MONGOLIAN DIGIT ZERO..MONGOLIAN DIGIT NINE
 1946..194F ; Numeric # Nd [10] LIMBU DIGIT ZERO..LIMBU DIGIT NINE
-19D0..19D9 ; Numeric # Nd [10] NEW TAI LUE DIGIT ZERO..NEW TAI LUE DIGIT NINE
+19D0..19DA ; Numeric # Nd [11] NEW TAI LUE DIGIT ZERO..NEW TAI LUE THAM DIGIT ONE
+1A80..1A89 ; Numeric # Nd [10] TAI THAM HORA DIGIT ZERO..TAI THAM HORA DIGIT NINE
+1A90..1A99 ; Numeric # Nd [10] TAI THAM THAM DIGIT ZERO..TAI THAM THAM DIGIT NINE
 1B50..1B59 ; Numeric # Nd [10] BALINESE DIGIT ZERO..BALINESE DIGIT NINE
 1BB0..1BB9 ; Numeric # Nd [10] SUNDANESE DIGIT ZERO..SUNDANESE DIGIT NINE
 1C40..1C49 ; Numeric # Nd [10] LEPCHA DIGIT ZERO..LEPCHA DIGIT NINE
@@ -796,11 +876,13 @@ FF0E ; MidNumLet # Po FULLWIDTH FULL STOP
 A620..A629 ; Numeric # Nd [10] VAI DIGIT ZERO..VAI DIGIT NINE
 A8D0..A8D9 ; Numeric # Nd [10] SAURASHTRA DIGIT ZERO..SAURASHTRA DIGIT NINE
 A900..A909 ; Numeric # Nd [10] KAYAH LI DIGIT ZERO..KAYAH LI DIGIT NINE
+A9D0..A9D9 ; Numeric # Nd [10] JAVANESE DIGIT ZERO..JAVANESE DIGIT NINE
 AA50..AA59 ; Numeric # Nd [10] CHAM DIGIT ZERO..CHAM DIGIT NINE
+ABF0..ABF9 ; Numeric # Nd [10] MEETEI MAYEK DIGIT ZERO..MEETEI MAYEK DIGIT NINE
 104A0..104A9 ; Numeric # Nd [10] OSMANYA DIGIT ZERO..OSMANYA DIGIT NINE
 1D7CE..1D7FF ; Numeric # Nd [50] MATHEMATICAL BOLD DIGIT ZERO..MATHEMATICAL MONOSPACE DIGIT NINE

-# Total code points: 361
+# Total code points: 402

 # ================================================

|