author: Bruno Haible <bruno@clisp.org> 2021-12-30 16:45:39 +0100
committer: Bruno Haible <bruno@clisp.org> 2021-12-30 18:20:55 +0100
commit: ecbed643ffd4e817a924a645832f73dc6a0abdd0 (patch)
tree: 878cfb2fcf77bebdf7505194de8b2324f5a05428 /ChangeLog
parent: ef4c53b0329bd6ce418bebbbac3fdf8b52aeb2aa (diff)
Update to Unicode 11.0.0.

* lib/gen-uni-tables.c (is_property_default_ignorable_code_point):
Simplify by use of PROP_PREPENDED_CONCATENATION_MARK.
(UC_JOINING_GROUP_HANIFI_ROHINGYA_PA,
UC_JOINING_GROUP_HANIFI_ROHINGYA_KINNA_YA): New enum values.
(fill_arabicshaping, joining_group_as_c_identifier): Recognize these
joining groups.
(get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
(WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): Remove enum values.
(WBP_WSS): New enum value.
(get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
(debug_output_wbp, fill_org_wbp, debug_output_org_wbp, output_wbp):
Update for changed enum values.

* lib/unictype.in.h (UC_JOINING_GROUP_HANIFI_ROHINGYA_*): New enum
values.
* lib/unictype/joininggroup_name.h: Add the HANIFI_ROHINGYA_* joining
groups.
* lib/unictype/joininggroup_byname.gperf: Likewise.

* lib/unigbrk.in.h: Mark 4 enum values as obsolete.
* lib/unigbrk/u-grapheme-breaks.h (FUNC): Handle emoji modifier sequence
according to Unicode 11.0.0.
* lib/unigbrk/u8-grapheme-breaks.c: Include <stdbool.h>, unictype.h.
* lib/unigbrk/u16-grapheme-breaks.c: Likewise.
* lib/unigbrk/u32-grapheme-breaks.c: Likewise.
* lib/unigbrk/uc-grapheme-breaks.c: Likewise.
* modules/unigbrk/u8-grapheme-breaks (Depends-on): Add
unictype/property-extended-pictographic, stdbool.
* modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
* modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
* tests/unigbrk/test-u8-grapheme-breaks.c (main): Add test for emoji
modifier / ZWJ sequence.
* tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
* tests/unigbrk/test-uc-is-grapheme-break.c: Include <stdbool.h>,
unictype.h.
(main): Update workaround logic to match the one in
lib/unigbrk/u-grapheme-breaks.h.
* modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add
unictype/property-extended-pictographic, stdbool.

* lib/uniwbrk.in.h: Mark 4 enum values as obsolete.
(WBP_WSS): New enum value.
* lib/uniwbrk/u-wordbreaks.h (FUNC): Handle emoji ZWJ sequences and
horizontal whitespace according to Unicode 11.0.0.
* lib/uniwbrk/u8-wordbreaks.c: Include unictype.h.
* lib/uniwbrk/u16-wordbreaks.c: Likewise.
* lib/uniwbrk/u32-wordbreaks.c: Likewise.
* lib/uniwbrk/wbrktable.c (uniwbrk_prop_index, uniwbrk_table): Add a row
and column for WBP_WSS.
* lib/uniwbrk/wbrktable.h (uniwbrk_prop_index, uniwbrk_table): Update
declarations.
* modules/uniwbrk/u8-wordbreaks (Depends-on): Add
unictype/property-extended-pictographic.
* modules/uniwbrk/u16-wordbreaks (Depends-on): Likewise.
* modules/uniwbrk/u32-wordbreaks (Depends-on): Likewise.
* tests/uniwbrk/test-u8-wordbreaks.c (main): Update expected results.
* tests/uniwbrk/test-u16-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-u32-wordbreaks.c (main): Likewise.
* tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string):
Update.

* lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop):
Handle ZWJ according to Unicode 11.0.0.
* lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
Likewise.
* lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
Likewise.

* lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind):
Update.
(uc_width): Assign width 2 to the characters 0x187ED..0x187F1, 0x1F6F9,
0x1F9E7..0x1F9FF.
* tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters
0x07FD, 0x08D3, 0x09FE, 0x0C04, 0xA8FF, 0x10D24..0x10D27,
0x10F46..0x10F50, 0x110CD, 0x111C9, 0x1133B, 0x1145E, 0x1182F..0x11837,
0x11839..0x1183A, 0x11D90..0x11D91, 0x11D95, 0x11D97, 0x11EF3..0x11EF4.
Expect width 2 for the characters 0x187ED..0x187F1, 0x1F6F9,
0x1F9E7..0x1F9FF.

* All generated files under lib/uni* and tests/uni*: Regenerate.
* tests/uniname/NameAliases.txt: Update.
* tests/uniname/UnicodeData.txt: Update.
* tests/uninorm/NormalizationTest.txt: Update.
* tests/unigbrk/GraphemeBreakTest.txt: Update.
* tests/uniwbrk/WordBreakTest.txt: Update.

* All the affected modules: Bump required libunistring version.
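
The width expectations listed for tests/uniwidth/test-uc_width2.sh can be read as a small range table. The following is only an illustrative sketch of that reading, not the code in lib/uniwidth/width.c: the struct, the table name and the function expected_width_unicode11 are hypothetical, and the return value -1 is an assumed convention meaning "this commit message states no expectation for the character".

```c
#include <stddef.h>

/* Hypothetical illustration only; gnulib's uc_width is implemented
   differently (see lib/uniwidth/width.c).  Each entry is an inclusive
   code point range taken from the commit message above, together with
   the width the updated test expects for it.  */
struct width_range
{
  unsigned int first;
  unsigned int last;
  int width;
};

static const struct width_range unicode11_width_changes[] =
{
  /* Expected width 0.  */
  { 0x07FD, 0x07FD, 0 },   { 0x08D3, 0x08D3, 0 },
  { 0x09FE, 0x09FE, 0 },   { 0x0C04, 0x0C04, 0 },
  { 0xA8FF, 0xA8FF, 0 },   { 0x10D24, 0x10D27, 0 },
  { 0x10F46, 0x10F50, 0 }, { 0x110CD, 0x110CD, 0 },
  { 0x111C9, 0x111C9, 0 }, { 0x1133B, 0x1133B, 0 },
  { 0x1145E, 0x1145E, 0 }, { 0x1182F, 0x11837, 0 },
  { 0x11839, 0x1183A, 0 }, { 0x11D90, 0x11D91, 0 },
  { 0x11D95, 0x11D95, 0 }, { 0x11D97, 0x11D97, 0 },
  { 0x11EF3, 0x11EF4, 0 },
  /* Expected width 2.  */
  { 0x187ED, 0x187F1, 2 }, { 0x1F6F9, 0x1F6F9, 2 },
  { 0x1F9E7, 0x1F9FF, 2 }
};

/* Return the width listed for UC in the commit message, or -1 (an
   assumed sentinel) if UC is not in any of the listed ranges.  */
static int
expected_width_unicode11 (unsigned int uc)
{
  size_t n = sizeof unicode11_width_changes
             / sizeof unicode11_width_changes[0];
  for (size_t i = 0; i < n; i++)
    if (uc >= unicode11_width_changes[i].first
        && uc <= unicode11_width_changes[i].last)
      return unicode11_width_changes[i].width;
  return -1;
}
```

A linear scan is enough for a table of this size; a real lookup over all of Unicode would more likely use a multi-level table or a binary search.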
Diffstat (limited to 'ChangeLog')
-rw-r--r-- ChangeLog 94
1 files changed, 94 insertions, 0 deletions
diff --git a/ChangeLog b/ChangeLog
index 9466a87427..b5f7f8eb2f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,99 @@
2021-12-30 Bruno Haible <bruno@clisp.org>
+ Update to Unicode 11.0.0.
+
+ * lib/gen-uni-tables.c (is_property_default_ignorable_code_point):
+ Simplify by use of PROP_PREPENDED_CONCATENATION_MARK.
+ (UC_JOINING_GROUP_HANIFI_ROHINGYA_PA,
+ UC_JOINING_GROUP_HANIFI_ROHINGYA_KINNA_YA): New enum values.
+ (fill_arabicshaping, joining_group_as_c_identifier): Recognize these
+ joining groups.
+ (get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
+ (WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): Remove enum values.
+ (WBP_WSS): New enum value.
+ (get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
+ (debug_output_wbp, fill_org_wbp, debug_output_org_wbp, output_wbp):
+ Update for changed enum values.
+
+ * lib/unictype.in.h (UC_JOINING_GROUP_HANIFI_ROHINGYA_*): New enum
+ values.
+ * lib/unictype/joininggroup_name.h: Add the HANIFI_ROHINGYA_* joining
+ groups.
+ * lib/unictype/joininggroup_byname.gperf: Likewise.
+
+ * lib/unigbrk.in.h: Mark 4 enum values as obsolete.
+ * lib/unigbrk/u-grapheme-breaks.h (FUNC): Handle emoji modifier sequence
+ according to Unicode 11.0.0.
+ * lib/unigbrk/u8-grapheme-breaks.c: Include <stdbool.h>, unictype.h.
+ * lib/unigbrk/u16-grapheme-breaks.c: Likewise.
+ * lib/unigbrk/u32-grapheme-breaks.c: Likewise.
+ * lib/unigbrk/uc-grapheme-breaks.c: Likewise.
+ * modules/unigbrk/u8-grapheme-breaks (Depends-on): Add
+ unictype/property-extended-pictographic, stdbool.
+ * modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
+ * modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
+ * modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
+ * tests/unigbrk/test-u8-grapheme-breaks.c (main): Add test for emoji
+ modifier / ZWJ sequence.
+ * tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
+ * tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
+ * tests/unigbrk/test-uc-is-grapheme-break.c: Include <stdbool.h>,
+ unictype.h.
+ (main): Update workaround logic to match the one in
+ lib/unigbrk/u-grapheme-breaks.h.
+ * modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add
+ unictype/property-extended-pictographic, stdbool.
+
+ * lib/uniwbrk.in.h: Mark 4 enum values as obsolete.
+ (WBP_WSS): New enum value.
+ * lib/uniwbrk/u-wordbreaks.h (FUNC): Handle emoji ZWJ sequences and
+ horizontal whitespace according to Unicode 11.0.0.
+ * lib/uniwbrk/u8-wordbreaks.c: Include unictype.h.
+ * lib/uniwbrk/u16-wordbreaks.c: Likewise.
+ * lib/uniwbrk/u32-wordbreaks.c: Likewise.
+ * lib/uniwbrk/wbrktable.c (uniwbrk_prop_index, uniwbrk_table): Add a row
+ and column for WBP_WSS.
+ * lib/uniwbrk/wbrktable.h (uniwbrk_prop_index, uniwbrk_table): Update
+ declarations.
+ * modules/uniwbrk/u8-wordbreaks (Depends-on): Add
+ unictype/property-extended-pictographic.
+ * modules/uniwbrk/u16-wordbreaks (Depends-on): Likewise.
+ * modules/uniwbrk/u32-wordbreaks (Depends-on): Likewise.
+ * tests/uniwbrk/test-u8-wordbreaks.c (main): Update expected results.
+ * tests/uniwbrk/test-u16-wordbreaks.c (main): Likewise.
+ * tests/uniwbrk/test-u32-wordbreaks.c (main): Likewise.
+ * tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string):
+ Update.
+
+ * lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop):
+ Handle ZWJ according to Unicode 11.0.0.
+ * lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop):
+ Likewise.
+ * lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop):
+ Likewise.
+
+ * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind):
+ Update.
+ (uc_width): Assign width 2 to the characters 0x187ED..0x187F1, 0x1F6F9,
+ 0x1F9E7..0x1F9FF.
+ * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters
+ 0x07FD, 0x08D3, 0x09FE, 0x0C04, 0xA8FF, 0x10D24..0x10D27,
+ 0x10F46..0x10F50, 0x110CD, 0x111C9, 0x1133B, 0x1145E, 0x1182F..0x11837,
+ 0x11839..0x1183A, 0x11D90..0x11D91, 0x11D95, 0x11D97, 0x11EF3..0x11EF4.
+ Expect width 2 for the characters 0x187ED..0x187F1, 0x1F6F9,
+ 0x1F9E7..0x1F9FF.
+
+ * All generated files under lib/uni* and tests/uni*: Regenerate.
+ * tests/uniname/NameAliases.txt: Update.
+ * tests/uniname/UnicodeData.txt: Update.
+ * tests/uninorm/NormalizationTest.txt: Update.
+ * tests/unigbrk/GraphemeBreakTest.txt: Update.
+ * tests/uniwbrk/WordBreakTest.txt: Update.
+
+ * All the affected modules: Bump required libunistring version.
+
+2021-12-30 Bruno Haible <bruno@clisp.org>
+
unictype: Add Emoji properties from Unicode 11.0.0.
* lib/gen-uni-tables.c (PROP_EMOJI*, PROP_EXTENDED_PICTOGRAPHIC): New
enum values.