author | Bruno Haible <bruno@clisp.org> | 2021-12-30 16:45:39 +0100 |
---|---|---|
committer | Bruno Haible <bruno@clisp.org> | 2021-12-30 18:20:55 +0100 |
commit | ecbed643ffd4e817a924a645832f73dc6a0abdd0 (patch) | |
tree | 878cfb2fcf77bebdf7505194de8b2324f5a05428 /tests/unictype/test-pr_punctuation.c | |
parent | ef4c53b0329bd6ce418bebbbac3fdf8b52aeb2aa (diff) | |
download | gnulib-ecbed643ffd4e817a924a645832f73dc6a0abdd0.tar.gz |
Update to Unicode 11.0.0.
* lib/gen-uni-tables.c (is_property_default_ignorable_code_point):
Simplify by use of PROP_PREPENDED_CONCATENATION_MARK.
(UC_JOINING_GROUP_HANIFI_ROHINGYA_PA,
UC_JOINING_GROUP_HANIFI_ROHINGYA_KINNA_YA): New enum values.
(fill_arabicshaping, joining_group_as_c_identifier): Recognize these
joining groups.
(get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
(WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): Remove enum values.
(WBP_WSS): New enum value.
(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
(debug_output_wbp, fill_org_wbp, debug_output_org_wbp, output_wbp):
Update for changed enum values.
* lib/unictype.in.h (UC_JOINING_GROUP_HANIFI_ROHINGYA_*): New enum
values.
* lib/unictype/joininggroup_name.h: Add the HANIFI_ROHINGYA_* joining
groups.
* lib/unictype/joininggroup_byname.gperf: Likewise.
* lib/unigbrk.in.h: Mark 4 enum values as obsolete.
* lib/unigbrk/u-grapheme-breaks.h (FUNC): Handle emoji modifier sequence
according to Unicode 11.0.0.
* lib/unigbrk/u8-grapheme-breaks.c: Include <stdbool.h>, unictype.h.
* lib/unigbrk/u16-grapheme-breaks.c: Likewise.
* lib/unigbrk/u32-grapheme-breaks.c: Likewise.
* lib/unigbrk/uc-grapheme-breaks.c: Likewise.
* modules/unigbrk/u8-grapheme-breaks (Depends-on): Add
unictype/property-extended-pictographic, stdbool.
* modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
* tests/unigbrk/test-u8-grapheme-breaks.c (main): Add test for emoji
modifier / ZWJ sequence.
* tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-uc-is-grapheme-break.c: Include <stdbool.h>,
unictype.h.
(main): Update workaround logic to match the one in
lib/unigbrk/u-grapheme-breaks.h.
* modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add
unictype/property-extended-pictographic, stdbool.
* lib/uniwbrk.in.h: Mark 4 enum values as obsolete.
(WBP_WSS): New enum value.
* lib/uniwbrk/u-wordbreaks.h (FUNC): Handle emoji ZWJ sequences and
horizontal whitespace according to Unicode 11.0.0.
* lib/uniwbrk/u8-wordbreaks.c: Include unictype.h.
* lib/uniwbrk/u16-wordbreaks.c: Likewise.
* lib/uniwbrk/u32-wordbreaks.c: Likewise.
* lib/uniwbrk/wbrktable.c (uniwbrk_prop_index, uniwbrk_table): Add a row
and column for WBP_WSS.
* lib/uniwbrk/wbrktable.h (uniwbrk_prop_index, uniwbrk_table): Update
declarations.
* modules/uniwbrk/u8-wordbreaks (Depends-on): Add
unictype/property-extended-pictographic.
* modules/uniwbrk/u16-wordbreaks (Depends-on): Likewise.
* modules/uniwbrk/u32-wordbreaks (Depends-on): Likewise.
* tests/uniwbrk/test-u8-wordbreaks.c (main): Update expected results.
* tests/uniwbrk/test-u16-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-u32-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string):
Update.
* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop):
Handle ZWJ according to Unicode 11.0.0.
* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
Likewise.
* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
Likewise.
* lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind):
Update.
(uc_width): Assign width 2 to the characters 0x187ED..0x187F1, 0x1F6F9,
0x1F9E7..0x1F9FF.
* tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters
0x07FD, 0x08D3, 0x09FE, 0x0C04, 0xA8FF, 0x10D24..0x10D27,
0x10F46..0x10F50, 0x110CD, 0x111C9, 0x1133B, 0x1145E, 0x1182F..0x11837,
0x11839..0x1183A, 0x11D90..0x11D91, 0x11D95, 0x11D97, 0x11EF3..0x11EF4.
Expect width 2 for the characters 0x187ED..0x187F1, 0x1F6F9,
0x1F9E7..0x1F9FF.
* All generated files under lib/uni* and tests/uni*: Regenerate.
* tests/uniname/NameAliases.txt: Update.
* tests/uniname/UnicodeData.txt: Update.
* tests/uninorm/NormalizationTest.txt: Update.
* tests/unigbrk/GraphemeBreakTest.txt: Update.
* tests/uniwbrk/WordBreakTest.txt: Update.
* All the affected modules: Bump required libunistring version.
Diffstat (limited to 'tests/unictype/test-pr_punctuation.c')
-rw-r--r-- | tests/unictype/test-pr_punctuation.c | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/tests/unictype/test-pr_punctuation.c b/tests/unictype/test-pr_punctuation.c
index 9d8324ffb0..f904d26f7a 100644
--- a/tests/unictype/test-pr_punctuation.c
+++ b/tests/unictype/test-pr_punctuation.c
@@ -54,7 +54,9 @@
     { 0x0964, 0x0965 },
     { 0x0970, 0x0970 },
     { 0x09FD, 0x09FD },
+    { 0x0A76, 0x0A76 },
     { 0x0AF0, 0x0AF0 },
+    { 0x0C84, 0x0C84 },
     { 0x0DF4, 0x0DF4 },
     { 0x0E4F, 0x0E4F },
     { 0x0E5A, 0x0E5B },
@@ -103,7 +105,7 @@
     { 0x2CFE, 0x2CFF },
     { 0x2D70, 0x2D70 },
     { 0x2E00, 0x2E2E },
-    { 0x2E30, 0x2E49 },
+    { 0x2E30, 0x2E4E },
     { 0x3001, 0x3003 },
     { 0x3008, 0x3011 },
     { 0x3014, 0x301F },
@@ -157,12 +159,13 @@
     { 0x10AF0, 0x10AF6 },
     { 0x10B39, 0x10B3F },
     { 0x10B99, 0x10B9C },
+    { 0x10F55, 0x10F59 },
     { 0x11047, 0x1104D },
     { 0x110BB, 0x110BC },
     { 0x110BE, 0x110C1 },
     { 0x11140, 0x11143 },
     { 0x11174, 0x11175 },
-    { 0x111C5, 0x111C9 },
+    { 0x111C5, 0x111C8 },
     { 0x111CD, 0x111CD },
     { 0x111DB, 0x111DB },
     { 0x111DD, 0x111DF },
@@ -176,16 +179,19 @@
     { 0x11641, 0x11643 },
     { 0x11660, 0x1166C },
     { 0x1173C, 0x1173E },
+    { 0x1183B, 0x1183B },
     { 0x11A3F, 0x11A46 },
     { 0x11A9A, 0x11A9C },
     { 0x11A9E, 0x11AA2 },
     { 0x11C41, 0x11C45 },
     { 0x11C70, 0x11C71 },
+    { 0x11EF7, 0x11EF8 },
     { 0x12470, 0x12474 },
     { 0x16A6E, 0x16A6F },
     { 0x16AF5, 0x16AF5 },
     { 0x16B37, 0x16B3B },
     { 0x16B44, 0x16B44 },
+    { 0x16E97, 0x16E9A },
     { 0x1BC9F, 0x1BC9F },
     { 0x1DA87, 0x1DA8B },
     { 0x1E95E, 0x1E95F }
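The diff above edits a sorted list of inclusive code point ranges, { first, last }. As an illustration only, the following sketch shows one way such a list could be queried with a binary search. The names range_t, punct_sample, punct_sample_count and in_ranges are hypothetical and are not taken from gnulib; the actual test harness around this table is not shown in this diff and may work differently. The sample holds only a handful of the post-patch ranges, not the full table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type: one inclusive code point range, { first, last }.  */
typedef struct
{
  unsigned int start;
  unsigned int end;
} range_t;

/* A few of the ranges touched by the diff, in their post-patch form.
   This is a small sample, not the complete punctuation table.  */
static const range_t punct_sample[] =
  {
    { 0x09FD, 0x09FD },
    { 0x0A76, 0x0A76 },
    { 0x0AF0, 0x0AF0 },
    { 0x0C84, 0x0C84 },
    { 0x2E30, 0x2E4E },
    { 0x111C5, 0x111C8 },
    { 0x16E97, 0x16E9A }
  };

static const size_t punct_sample_count =
  sizeof punct_sample / sizeof punct_sample[0];

/* Return true if UC lies inside one of the N ranges in TABLE.
   Assumes TABLE is sorted in ascending order and its ranges
   do not overlap.  */
static bool
in_ranges (const range_t *table, size_t n, unsigned int uc)
{
  size_t lo = 0;
  size_t hi = n;

  assert (n == 0 || table != NULL);

  /* Invariant: any range containing UC has an index in [lo, hi).  */
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (uc < table[mid].start)
        hi = mid;
      else if (uc > table[mid].end)
        lo = mid + 1;
      else
        return true;
    }
  return false;
}
```

With this sample, in_ranges (punct_sample, punct_sample_count, 0x2E4E) is true, reflecting the range extended from 0x2E49 to 0x2E4E, while the same call with 0x111C9 is false, reflecting the range shortened to end at 0x111C8.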