Update to Unicode 11.0.0.

* lib/gen-uni-tables.c (is_property_default_ignorable_code_point): Simplify by use of PROP_PREPENDED_CONCATENATION_MARK.
(UC_JOINING_GROUP_HANIFI_ROHINGYA_PA, UC_JOINING_GROUP_HANIFI_ROHINGYA_KINNA_YA): New enum values.
(fill_arabicshaping, joining_group_as_c_identifier): Recognize these joining groups.
(get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
(WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): Remove enum values.
(WBP_WSS): New enum value.
(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
(debug_output_wbp, fill_org_wbp, debug_output_org_wbp, output_wbp): Update for changed enum values.
* lib/unictype.in.h (UC_JOINING_GROUP_HANIFI_ROHINGYA_*): New enum values.
* lib/unictype/joininggroup_name.h: Add the HANIFI_ROHINGYA_* joining groups.
* lib/unictype/joininggroup_byname.gperf: Likewise.
* lib/unigbrk.in.h: Mark 4 enum values as obsolete.
* lib/unigbrk/u-grapheme-breaks.h (FUNC): Handle emoji modifier sequence according to Unicode 11.0.0.
* lib/unigbrk/u8-grapheme-breaks.c: Include <stdbool.h>, unictype.h.
* lib/unigbrk/u16-grapheme-breaks.c: Likewise.
* lib/unigbrk/u32-grapheme-breaks.c: Likewise.
* lib/unigbrk/uc-grapheme-breaks.c: Likewise.
* modules/unigbrk/u8-grapheme-breaks (Depends-on): Add unictype/property-extended-pictographic, stdbool.
* modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
* tests/unigbrk/test-u8-grapheme-breaks.c (main): Add test for emoji modifier / ZWJ sequence.
* tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-uc-is-grapheme-break.c: Include <stdbool.h>, unictype.h.
(main): Update workaround logic to match the one in lib/unigbrk/u-grapheme-breaks.h.
* modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add unictype/property-extended-pictographic, stdbool.
* lib/uniwbrk.in.h: Mark 4 enum values as obsolete.
(WBP_WSS): New enum value.
* lib/uniwbrk/u-wordbreaks.h (FUNC): Handle emoji ZWJ sequences and horizontal whitespace according to Unicode 11.0.0.
* lib/uniwbrk/u8-wordbreaks.c: Include unictype.h.
* lib/uniwbrk/u16-wordbreaks.c: Likewise.
* lib/uniwbrk/u32-wordbreaks.c: Likewise.
* lib/uniwbrk/wbrktable.c (uniwbrk_prop_index, uniwbrk_table): Add a row and column for WBP_WSS.
* lib/uniwbrk/wbrktable.h (uniwbrk_prop_index, uniwbrk_table): Update declarations.
* modules/uniwbrk/u8-wordbreaks (Depends-on): Add unictype/property-extended-pictographic.
* modules/uniwbrk/u16-wordbreaks (Depends-on): Likewise.
* modules/uniwbrk/u32-wordbreaks (Depends-on): Likewise.
* tests/uniwbrk/test-u8-wordbreaks.c (main): Update expected results.
* tests/uniwbrk/test-u16-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-u32-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Update.
* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Handle ZWJ according to Unicode 11.0.0.
* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop): Likewise.
* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop): Likewise.
* lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
(uc_width): Assign width 2 to the characters 0x187ED..0x187F1, 0x1F6F9, 0x1F9E7..0x1F9FF.
* tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x07FD, 0x08D3, 0x09FE, 0x0C04, 0xA8FF, 0x10D24..0x10D27, 0x10F46..0x10F50, 0x110CD, 0x111C9, 0x1133B, 0x1145E, 0x1182F..0x11837, 0x11839..0x1183A, 0x11D90..0x11D91, 0x11D95, 0x11D97, 0x11EF3..0x11EF4.
Expect width 2 for the characters 0x187ED..0x187F1, 0x1F6F9, 0x1F9E7..0x1F9FF.
* All generated files under lib/uni* and tests/uni*: Regenerate.
* tests/uniname/NameAliases.txt: Update.
* tests/uniname/UnicodeData.txt: Update.
* tests/uninorm/NormalizationTest.txt: Update.
* tests/unigbrk/GraphemeBreakTest.txt: Update.
* tests/uniwbrk/WordBreakTest.txt: Update.
* All the affected modules: Bump required libunistring version.

author: Bruno Haible <bruno@clisp.org> 2021-12-30 16:45:39 +0100
committer: Bruno Haible <bruno@clisp.org> 2021-12-30 18:20:55 +0100
commit: ecbed643ffd4e817a924a645832f73dc6a0abdd0 (patch)
tree: 878cfb2fcf77bebdf7505194de8b2324f5a05428 /tests/unictype/test-pr_punctuation.c
parent: ef4c53b0329bd6ce418bebbbac3fdf8b52aeb2aa (diff)
download: gnulib-ecbed643ffd4e817a924a645832f73dc6a0abdd0.tar.gz
1 files changed, 8 insertions, 2 deletions
diff --git a/tests/unictype/test-pr_punctuation.c b/tests/unictype/test-pr_punctuation.c
index 9d8324ffb0..f904d26f7a 100644
--- a/tests/unictype/test-pr_punctuation.c
+++ b/tests/unictype/test-pr_punctuation.c
@@ -54,7 +54,9 @@
     { 0x0964, 0x0965 },
     { 0x0970, 0x0970 },
     { 0x09FD, 0x09FD },
+    { 0x0A76, 0x0A76 },
     { 0x0AF0, 0x0AF0 },
+    { 0x0C84, 0x0C84 },
     { 0x0DF4, 0x0DF4 },
     { 0x0E4F, 0x0E4F },
     { 0x0E5A, 0x0E5B },
@@ -103,7 +105,7 @@
     { 0x2CFE, 0x2CFF },
     { 0x2D70, 0x2D70 },
     { 0x2E00, 0x2E2E },
-    { 0x2E30, 0x2E49 },
+    { 0x2E30, 0x2E4E },
     { 0x3001, 0x3003 },
     { 0x3008, 0x3011 },
     { 0x3014, 0x301F },
@@ -157,12 +159,13 @@
     { 0x10AF0, 0x10AF6 },
     { 0x10B39, 0x10B3F },
     { 0x10B99, 0x10B9C },
+    { 0x10F55, 0x10F59 },
     { 0x11047, 0x1104D },
     { 0x110BB, 0x110BC },
     { 0x110BE, 0x110C1 },
     { 0x11140, 0x11143 },
     { 0x11174, 0x11175 },
-    { 0x111C5, 0x111C9 },
+    { 0x111C5, 0x111C8 },
     { 0x111CD, 0x111CD },
     { 0x111DB, 0x111DB },
     { 0x111DD, 0x111DF },
@@ -176,16 +179,19 @@
     { 0x11641, 0x11643 },
     { 0x11660, 0x1166C },
     { 0x1173C, 0x1173E },
+    { 0x1183B, 0x1183B },
     { 0x11A3F, 0x11A46 },
     { 0x11A9A, 0x11A9C },
     { 0x11A9E, 0x11AA2 },
     { 0x11C41, 0x11C45 },
     { 0x11C70, 0x11C71 },
+    { 0x11EF7, 0x11EF8 },
     { 0x12470, 0x12474 },
     { 0x16A6E, 0x16A6F },
     { 0x16AF5, 0x16AF5 },
     { 0x16B37, 0x16B3B },
     { 0x16B44, 0x16B44 },
+    { 0x16E97, 0x16E9A },
     { 0x1BC9F, 0x1BC9F },
     { 0x1DA87, 0x1DA8B },
     { 0x1E95E, 0x1E95F }