path: root/tests/unictype
Commit message | Author | Age | Files | Lines
* Update to Unicode 15.0.0. | Bruno Haible | 2022-10-16 | 61 | -292/+657
  * lib/gen-uni-tables.c (is_property_default_ignorable_code_point): Exclude 0x13439..0x1343F.
    (get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x0ECE, 0x10EFD..0x10EFF, 0x11241, 0x11F00..0x11F01, 0x11F36..0x11F3A, 0x11F40, 0x11F42, 0x13439..0x13440, 0x13447..0x13455, 0x1E08F, 0x1E4EC..0x1E4EF.
  * All the affected modules: Bump required libunistring version.
* Rely on new stdbool behavior | Paul Eggert | 2022-09-10 | 1 | -1/+0
  Prefer the C23 style to the C99 style, since the stdbool module now supports C23.
  * lib/acl-internal.h, lib/acl.h, lib/argmatch.c, lib/argmatch.h:
  * lib/argp-help.c, lib/argv-iter.h, lib/asyncsafe-spin.c:
  * lib/backup-internal.h, lib/backupfile.c, lib/base32.h:
  * lib/base64.h, lib/basename-lgpl.c, lib/bitset/base.h:
  * lib/c-ctype.h, lib/c-strcasestr.c, lib/canonicalize-lgpl.c:
  * lib/canonicalize.c, lib/chdir-long.c, lib/chown.c:
  * lib/classpath.h, lib/clean-temp-private.h:
  * lib/clean-temp-simple.c, lib/clean-temp-simple.h:
  * lib/clean-temp.c, lib/clean-temp.h, lib/cloexec.h:
  * lib/close-stream.c, lib/closein.c, lib/closeout.c, lib/closeout.h:
  * lib/csharpcomp.h, lib/csharpexec.h, lib/cycle-check.c:
  * lib/cycle-check.h, lib/des.h, lib/dfa.h, lib/diffseq.h:
  * lib/dirname.h, lib/exclude.c, lib/exclude.h, lib/execute.c:
  * lib/execute.h, lib/execvpe.c, lib/fatal-signal.c, lib/fchdir.c:
  * lib/file-set.h, lib/filevercmp.c, lib/findprog-in.c:
  * lib/findprog.c, lib/findprog.h, lib/fma.c, lib/fnmatch.c:
  * lib/fopen.c, lib/freadable.h, lib/freading.h, lib/freopen-safer.c:
  * lib/fstrcmp.c, lib/fsusage.h, lib/fts.c, lib/fwritable.h:
  * lib/fwriteerror.c, lib/fwriting.h, lib/gen-uni-tables.c:
  * lib/getaddrinfo.c, lib/getcwd.c, lib/getloadavg.c:
  * lib/getndelim2.c, lib/getpass.c, lib/getrandom.c:
  * lib/git-merge-changelog.c, lib/gl_list.h, lib/gl_map.h:
  * lib/gl_omap.h, lib/gl_oset.h, lib/gl_set.h, lib/glob.c:
  * lib/glthread/cond.h, lib/hamt.h, lib/hard-locale.h:
  * lib/hash-triple.h, lib/hash.h, lib/human.h, lib/i-ring.h:
  * lib/isapipe.c, lib/javacomp.h, lib/javaexec.h, lib/javaversion.c:
  * lib/lchown.c, lib/localeinfo.h, lib/localename.c:
  * lib/long-options.h, lib/malloc/dynarray.h, lib/mbchar.h:
  * lib/mbfile.h, lib/mbiter.h, lib/mbmemcasecoll.h, lib/mbscasestr.c:
  * lib/mbsstr.c, lib/mbuiter.h, lib/mkdir-p.h, lib/modechange.h:
  * lib/mountlist.h, lib/nanosleep.c, lib/nonblocking.h:
  * lib/nstrftime.c, lib/openat.c, lib/openat.h, lib/os2-spawn.c:
  * lib/parse-datetime.h, lib/pipe-filter-aux.c, lib/pipe-filter-gi.c:
  * lib/pipe-filter-ii.c, lib/pipe-filter.h, lib/posixtm.h:
  * lib/priv-set.c, lib/progreloc.c, lib/propername.c:
  * lib/pthread-spin.c, lib/quotearg.c, lib/readtokens.c:
  * lib/readtokens0.h, lib/readutmp.c, lib/regex-quote.h:
  * lib/regex_internal.h, lib/relocwrapper.c, lib/rename.c:
  * lib/renameatu.c, lib/rpmatch.c, lib/same.c, lib/same.h:
  * lib/save-cwd.c, lib/savewd.c, lib/savewd.h, lib/spawn-pipe.h:
  * lib/spawni.c, lib/stack.h, lib/stat.c, lib/stdckdint.in.h:
  * lib/strcasestr.c, lib/strfmon_l.c, lib/striconveh.c:
  * lib/striconveha.h, lib/string-buffer.h, lib/strptime.c:
  * lib/strstr.c, lib/strtod.c, lib/supersede.h, lib/system-quote.c:
  * lib/tempname.c, lib/term-style-control.c:
  * lib/term-style-control.h, lib/textstyle.in.h, lib/time_rz.c:
  * lib/tmpdir.c, lib/tmpdir.h, lib/tmpfile.c, lib/unicase.in.h:
  * lib/unicase/caseprop.h, lib/unicase/invariant.h:
  * lib/unicase/u16-casemap.c, lib/unicase/u16-ct-totitle.c:
  * lib/unicase/u16-is-invariant.c, lib/unicase/u32-casemap.c:
  * lib/unicase/u32-ct-totitle.c, lib/unicase/u32-is-invariant.c:
  * lib/unicase/u8-casemap.c, lib/unicase/u8-ct-totitle.c:
  * lib/unicase/u8-is-invariant.c, lib/unictype.in.h:
  * lib/unigbrk.in.h, lib/unigbrk/u16-grapheme-breaks.c:
  * lib/unigbrk/u32-grapheme-breaks.c:
  * lib/unigbrk/u8-grapheme-breaks.c:
  * lib/unigbrk/uc-grapheme-breaks.c, lib/uniname/uniname.c:
  * lib/unistr.in.h, lib/unlinkdir.h, lib/userspec.h, lib/utime.c:
  * lib/utimecmp.c, lib/utimens.c, lib/wait-process.h:
  * lib/windows-cond.c, lib/windows-spawn.c, lib/windows-spawn.h:
  * lib/windows-timedrwlock.c, lib/write-any-file.h, lib/xbinary-io.c:
  * lib/xstrtod.h, lib/yesno.h:
  * tests/nap.h, tests/qemu.h, tests/test-areadlink-with-size.c:
  * tests/test-areadlink.c, tests/test-areadlinkat-with-size.c:
  * tests/test-areadlinkat.c, tests/test-base32.c:
  * tests/test-base64.c, tests/test-ceil2.c, tests/test-ceilf2.c:
  * tests/test-chown.c, tests/test-dirname.c, tests/test-dup-safer.c:
  * tests/test-dup3.c, tests/test-exclude.c:
  * tests/test-execute-child.c, tests/test-execute-main.c:
  * tests/test-execute-script.c, tests/test-explicit_bzero.c:
  * tests/test-fchownat.c, tests/test-fcntl-safer.c:
  * tests/test-fcntl.c, tests/test-fdutimensat.c:
  * tests/test-filenamecat.c, tests/test-floor2.c:
  * tests/test-floorf2.c, tests/test-fstatat.c, tests/test-fstrcmp.c:
  * tests/test-futimens.c, tests/test-getlogin.h, tests/test-getopt.h:
  * tests/test-hard-locale.c, tests/test-hash.c:
  * tests/test-idpriv-drop.c, tests/test-idpriv-droptemp.c:
  * tests/test-immutable.c, tests/test-intprops.c:
  * tests/test-lchown.c, tests/test-link.c, tests/test-linkat.c:
  * tests/test-lstat.c, tests/test-mbmemcasecmp.c:
  * tests/test-mbmemcasecoll.c, tests/test-mkdir.c:
  * tests/test-mkdirat.c, tests/test-mkfifo.c, tests/test-mkfifoat.c:
  * tests/test-mknod.c, tests/test-nonblocking-pipe-child.c:
  * tests/test-nonblocking-pipe-main.c:
  * tests/test-nonblocking-socket-child.c:
  * tests/test-nonblocking-socket-main.c, tests/test-open.c:
  * tests/test-openat.c, tests/test-pipe.c, tests/test-pipe2.c:
  * tests/test-poll.c, tests/test-posix_spawn-chdir.c:
  * tests/test-posix_spawn-dup2-stdin.c:
  * tests/test-posix_spawn-dup2-stdout.c:
  * tests/test-posix_spawn-fchdir.c, tests/test-posix_spawn-open1.c:
  * tests/test-posix_spawn-open2.c, tests/test-quotearg-simple.c:
  * tests/test-quotearg.c, tests/test-readlink.c:
  * tests/test-readlinkat.c, tests/test-readtokens.c:
  * tests/test-rename.c, tests/test-renameat.c:
  * tests/test-renameatu.c, tests/test-rmdir.c, tests/test-round2.c:
  * tests/test-select.h, tests/test-spawn-pipe-child.c:
  * tests/test-spawn-pipe-main.c, tests/test-spawn-pipe-script.c:
  * tests/test-stack.c, tests/test-stat.c, tests/test-supersede.c:
  * tests/test-symlink.c, tests/test-symlinkat.c:
  * tests/test-system-quote-main.c:
  * tests/test-term-style-control-hello.c:
  * tests/test-term-style-control-yes.c, tests/test-timespec.c:
  * tests/test-trunc2.c, tests/test-truncf2.c, tests/test-unlink.c:
  * tests/test-unlinkat.c, tests/test-userspec.c, tests/test-utime.c:
  * tests/test-utimens.c, tests/test-utimensat.c:
  * tests/unictype/test-categ_byname.c:
  * tests/unigbrk/test-uc-is-grapheme-break.c: Don’t include stdbool.h.
  * modules/acl, modules/xgetcwd: Don’t depend on stdbool, as these modules don’t use bool.
  * modules/argp, modules/bitset, modules/diffseq, modules/file-has-acl:
  * modules/gen-uni-tables, modules/getrandom:
  * modules/hash-triple-simple, modules/posix_spawn-internal:
  * modules/strcasestr, modules/supersede, modules/system-quote:
  * modules/uniconv/base, modules/uniname/uniname, modules/utime:
  * modules/windows-timedrwlock: Depend on stdbool, as these modules use bool.
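The stdbool change above removes `#include <stdbool.h>` from sources because, in C23, `bool`, `true`, and `false` are keywords. A minimal sketch of the idea follows. It assumes a plain preprocessor check as a stand-in for what Gnulib's stdbool module arranges on older compilers, and the helper `all_ascii_digits` is hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* In C23, bool/true/false are keywords.  On older standards this sketch
   falls back to <stdbool.h>.  ASSUMPTION: in Gnulib the stdbool module
   provides this, so the check below is only for illustration.  */
#if !defined __STDC_VERSION__ || __STDC_VERSION__ < 202311L
# include <stdbool.h>
#endif

/* Hypothetical helper: the code that uses bool does not mention
   <stdbool.h> itself.  Returns true if S consists only of ASCII digits
   (an empty string also yields true).  */
bool
all_ascii_digits (const char *s)
{
  assert (s != NULL);
  for (; *s != '\0'; s++)
    if (!(*s >= '0' && *s <= '9'))
      return false;
  return true;
}
```

With a C23 compiler the `#if` branch is skipped entirely and the function compiles unchanged.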
* unictype/category-none tests: Fix a link error on MSVC. | Bruno Haible | 2022-09-05 | 1 | -0/+7
  * tests/unictype/test-categ_none.c (main): Disable the test on MSVC.
* license: fix GPLv3 texts to use a comma instead of semicolon. | Bernhard Voelker | 2022-01-05 | 189 | -189/+189
  See: https://www.gnu.org/licenses/gpl-3.0.html#howto
  Run:
    $ git grep -l 'Foundation; either version 3' \
        | xargs sed -i '/Foundation; either version 3/ s/n; e/n, e/'
  * All files using GPLv3: Adjust via the above command.
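The sed command in the license fix rewrites only lines that contain `Foundation; either version 3`, and on those lines it replaces the first `n; e` with `n, e`. The same transformation can be sketched in C. The function name `fix_gpl_notice_line` is hypothetical; the edit can be done in place because the replacement has the same length as the match:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper mirroring
     sed '/Foundation; either version 3/ s/n; e/n, e/'
   on a single line.  Returns 1 if LINE was modified, 0 otherwise.  */
int
fix_gpl_notice_line (char *line)
{
  assert (line != NULL);

  /* Address part: only lines containing this text are touched.  */
  if (strstr (line, "Foundation; either version 3") == NULL)
    return 0;

  /* Substitution part: first occurrence of "n; e" on the line.
     It exists because the address text itself contains it.  */
  char *p = strstr (line, "n; e");
  p[1] = ',';   /* "n; e" -> "n, e", same length */
  return 1;
}
```

Lines without the address text are left untouched, which is why the command could be run over every file that `git grep -l` reported.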
* maint: Update copyright notices in code generating programs. | Bruno Haible | 2022-01-01 | 150 | -150/+150
  * lib/gen-uni-tables.c (output_predicate_test, output_predicate, output_category, output_combclass, output_bidi_category, output_decimal_digit_test, output_decimal_digit, output_digit_test, output_digit, output_numeric_test, output_numeric, output_mirror, output_joining_type_test, output_joining_type, output_joining_group_test, output_joining_group, output_scripts, output_scripts_byname, output_blocks, output_ident_category, output_nonspacing_property, output_lbrk_tables, output_lbrk_rules_as_tables, output_wbrk_tables, output_gbp_test, output_gbp_table, output_decomposition_tables, output_composition_tables, output_simple_mapping_test, output_simple_mapping, output_casing_rules): Extend copyright year of generated file to 2022.
  * lib/uniname/gen-uninames.lisp (main): Likewise. Produce license notice that is consistent with the gnulib/etc/license-notices/ files.
  * All files regenerated.
* Update to Unicode 14.0.0. | Bruno Haible | 2021-12-31 | 82 | -680/+1442
  * lib/gen-uni-tables.c (UC_JOINING_GROUP_THIN_YEH, UC_JOINING_GROUP_VERTICAL_TAIL): New enum values.
    (fill_arabicshaping, joining_group_as_c_identifier): Recognize these joining groups.
  * lib/unictype.in.h (UC_JOINING_GROUP_THIN_YEH, UC_JOINING_GROUP_VERTICAL_TAIL): New enum values.
  * lib/unictype/joininggroup_name.h: Add the THIN_YEH, VERTICAL_TAIL joining groups.
  * lib/unictype/joininggroup_byname.gperf: Likewise.
  * lib/gen-uni-tables.c (LBP_ID1, LBP_ID2): New enum values.
    (LBP_ID): Assign artificial value.
    (get_lbp): Use the extended_pictographic property to assign LBP_ID1, LBP_ID2 instead of LBP_ID. Update such that unilbrk/lbrkprop.txt comes out as expected.
    (debug_output_lbp): Print either LBP_ID1 or LBP_ID2 as LBP_ID.
    (lbp_value_to_string): Handle LBP_ID1, LBP_ID2 instead of LBP_ID.
    (output_lbrk_rules_as_tables): Treat LBP_ID as macro that maps to two table rows/columns. In rule LB30b, use LBP_ID2 in addition to LBP_EB. Remove redundant part of rule LB27.
  * lib/unilbrk/lbrktables.h (LBP_ID1, LBP_ID2): New enum values.
    (LBP_ID): Remove enum value.
    (unilbrk_table): Update declaration.
  * lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Use LBP_ID1 instead of LBP_ID.
  * lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop): Likewise.
  * lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop): Likewise.
  * tests/unilbrk/test-u8-possible-linebreaks.c (test_function): Add a test of potential future emoji.
  * tests/unilbrk/test-u16-possible-linebreaks.c (test_function): Likewise.
  * tests/unilbrk/test-u32-possible-linebreaks.c (test_function): Likewise.
  * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
    (uc_width): Assign width 2 to the characters 0x1AFF0..0x1AFF3, 0x1AFF5..0x1AFFB, 0x1AFFD..0x1AFFE, 0x1B120..0x1B122, 0x1F6DD..0x1F6DF, 0x1F7F0, 0x1FA7B..0x1FA7C, 0x1FAA9..0x1FAAC, 0x1FAB7..0x1FABA, 0x1FAC3..0x1FAC5, 0x1FAD7..0x1FAD9, 0x1FAE0..0x1FAE7, 0x1FAF0..0x1FAF6.
  * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x0890..0x0891, 0x0898..0x089F, 0x08CA..0x0902, 0x0C3C, 0x180F, 0x1AC1..0x1ACE, 0x1DFA, 0x10F82..0x10F85, 0x11070, 0x11073..0x11074, 0x110C2, 0x1CF00..0x1CF2D, 0x1CF30..0x1CF46, 0x1E2AE. Expect ambiguous width for the character 0x1734. Expect width 2 for the characters 0x1AFF0..0x1AFF3, 0x1AFF5..0x1AFFB, 0x1AFFD..0x1AFFE, 0x1B120..0x1B122, 0x1F6DD..0x1F6DF, 0x1F7F0, 0x1FA7B..0x1FA7C, 0x1FAA9..0x1FAAC, 0x1FAB7..0x1FABA, 0x1FAC3..0x1FAC5, 0x1FAD7..0x1FAD9, 0x1FAE0..0x1FAE7, 0x1FAF0..0x1FAF6.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
* Update to Unicode 13.0.0. | Bruno Haible | 2021-12-31 | 74 | -542/+1036
  * lib/gen-uni-tables.c (is_WBP_MIDLETTER): Add character 0x055F.
    (get_wbp): Assign value WBP_ALETTER to the characters 0x02E5..0x02EB, 0x055A, 0x058A, 0xA708..0xA716.
  * lib/gen-uni-tables.c (LBP_CP1, LBP_CP2, LBP_OP1, LBP_OP2): New enum values.
    (LBP_OP, LBP_CP): Assign artificial values.
    (get_lbp): Use the unicode_width[] table to assign LBP_CP1, LBP_CP2 instead of LBP_CP, and LBP_OP1, LBP_OP2 instead of LBP_OP. Update such that unilbrk/lbrkprop.txt comes out as expected.
    (debug_output_lbp): Print either LBP_CP1 or LBP_CP2 as LBP_CP. Print either LBP_OP1 or LBP_OP2 as LBP_OP.
    (lbp_value_to_string): Handle LBP_CP1, LBP_CP2, LBP_OP1, LBP_OP2 instead of LBP_CP, LBP_OP.
    (output_lbrk_rules_as_tables): Treat LBP_CP and LBP_OP as macros that map to two table rows/columns. In rule LB30, use only LBP_OP1 instead of LBP_OP, and only LBP_CP1 instead of LBP_CP. Simplify rule LB22.
  * lib/unilbrk/lbrktables.h (LBP_CP1, LBP_CP2, LBP_OP1, LBP_OP2): New enum values.
    (LBP_OP, LBP_CP): Remove enum values.
    (unilbrk_table): Update declaration.
  * lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Add a test for East Asian opening parenthesis.
  * lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop): Likewise.
  * lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop): Likewise.
  * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
    (uc_width): Assign width 2 to the characters 0x16FF0..0x16FF1, 0x18AF3..0x18CD5, 0x18D00..0x18D08, 0x1F6D6..0x1F6D7, 0x1F6FB..0x1F6FC, 0x1F90C, 0x1FA74, 0x1FA83..0x1FA86, 0x1FA96..0x1FAA8, 0x1FAB0..0x1FAB6, 0x1FAC0..0x1FAC2, 0x1FAD0..0x1FAD6. Assign width 1 to the characters 0x1F93B, 0x1F946.
  * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x0B55, 0x0D81, 0x1ABF..0x1AC0, 0xA82C, 0x10EAB..0x10EAC, 0x111CF, 0x1193B..0x1193C, 0x1193E, 0x11943, 0x16FE4. Expect width 2 for the characters 0x16FF0..0x16FF1, 0x18AF3..0x18CD5, 0x18D00..0x18D08, 0x1F6D6..0x1F6D7, 0x1F6FB..0x1F6FC, 0x1F90C, 0x1FA74, 0x1FA83..0x1FA86, 0x1FA96..0x1FAA8, 0x1FAB0..0x1FAB6, 0x1FAC0..0x1FAC2, 0x1FAD0..0x1FAD6. Expect width 1 for the characters 0x1F93B, 0x1F946.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
* Update to Unicode 12.1.0. | Bruno Haible | 2021-12-30 | 15 | -21/+12
  * lib/gen-uni-tables.c: Update comments.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
* Update to Unicode 12.0.0. | Bruno Haible | 2021-12-30 | 73 | -591/+1070
  * lib/gen-uni-tables.c (is_property_default_ignorable_code_point): Exclude 0x13430..0x13438.
    (get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
    (get_wbp): Map 0xFF10..0xFF19 to WBP_NUMERIC.
  * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
    (uc_width): Assign width 2 to the characters 0x16FE2..0x16FE3, 0x187F2..0x187F7, 0x1B150..0x1B152, 0x1B164..0x1B167, 0x1F6D5, 0x1F6FA, 0x1F7E0..0x1F7EB, 0x1F90D..0x1F90F, 0x1FA70..0x1FA73, 0x1FA78..0x1FA7A, 0x1FA80..0x1FA82, 0x1FA90..0x1FA95.
  * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x0EBA, 0xA9BD, 0x119D4..0x119D7, 0x119DA..0x119DB, 0x119E0, 0x13430..0x13438, 0x16F4F, 0x1E130..0x1E136, 0x1E2EC..0x1E2EF. Expect width 2 for the characters 0x16FE2..0x16FE3, 0x187F2..0x187F7, 0x1B150..0x1B152, 0x1B164..0x1B167, 0x1F6D5, 0x1F6FA, 0x1F7E0..0x1F7EB, 0x1F90D..0x1F90F, 0x1FA70..0x1FA73, 0x1FA78..0x1FA7A, 0x1FA80..0x1FA82, 0x1FA90..0x1FA95.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
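The width expectations listed for Unicode 12.0.0 amount to a lookup over sorted code point ranges. The sketch below shows that idea with a binary search over the ranges named in the entry above. It is not how `uc_width` is implemented (the entry mentions `nonspacing_table_data` and `nonspacing_table_ind`, whose layout is not shown here); the names `struct range`, `in_ranges`, and `sketch_width_12_0` are hypothetical:

```c
#include <assert.h>
#include <stddef.h>

struct range { unsigned int lo, hi; };   /* inclusive bounds */

/* Ranges the 12.0.0 entry lists as width 2 (sorted, disjoint).  */
static const struct range wide_12_0[] = {
  { 0x16FE2, 0x16FE3 }, { 0x187F2, 0x187F7 }, { 0x1B150, 0x1B152 },
  { 0x1B164, 0x1B167 }, { 0x1F6D5, 0x1F6D5 }, { 0x1F6FA, 0x1F6FA },
  { 0x1F7E0, 0x1F7EB }, { 0x1F90D, 0x1F90F }, { 0x1FA70, 0x1FA73 },
  { 0x1FA78, 0x1FA7A }, { 0x1FA80, 0x1FA82 }, { 0x1FA90, 0x1FA95 }
};

/* Ranges the 12.0.0 entry lists as width 0 (sorted, disjoint).  */
static const struct range zero_12_0[] = {
  { 0x0EBA, 0x0EBA }, { 0xA9BD, 0xA9BD }, { 0x119D4, 0x119D7 },
  { 0x119DA, 0x119DB }, { 0x119E0, 0x119E0 }, { 0x13430, 0x13438 },
  { 0x16F4F, 0x16F4F }, { 0x1E130, 0x1E136 }, { 0x1E2EC, 0x1E2EF }
};

/* Binary search for UC in the sorted range table T of N entries.  */
static int
in_ranges (const struct range *t, size_t n, unsigned int uc)
{
  size_t lo = 0, hi = n;
  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      if (uc < t[mid].lo)
        hi = mid;
      else if (uc > t[mid].hi)
        lo = mid + 1;
      else
        return 1;
    }
  return 0;
}

/* Returns 0 or 2 for code points named in the 12.0.0 entry, and -1 for
   everything else (this sketch knows nothing about other characters).  */
int
sketch_width_12_0 (unsigned int uc)
{
  assert (uc <= 0x10FFFF);
  if (in_ranges (zero_12_0, sizeof zero_12_0 / sizeof zero_12_0[0], uc))
    return 0;
  if (in_ranges (wide_12_0, sizeof wide_12_0 / sizeof wide_12_0[0], uc))
    return 2;
  return -1;
}
```

A test script such as tests/uniwidth/test-uc_width2.sh checks the real `uc_width` against expectations of this kind; the sketch only makes the range notation `0x1F7E0..0x1F7EB` concrete.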
* Update to Unicode 11.0.0. | Bruno Haible | 2021-12-30 | 68 | -520/+1235
  * lib/gen-uni-tables.c (is_property_default_ignorable_code_point): Simplify by use of PROP_PREPENDED_CONCATENATION_MARK.
    (UC_JOINING_GROUP_HANIFI_ROHINGYA_PA, UC_JOINING_GROUP_HANIFI_ROHINGYA_KINNA_YA): New enum values.
    (fill_arabicshaping, joining_group_as_c_identifier): Recognize these joining groups.
    (get_lbp): Update such that unilbrk/lbrkprop.txt comes out as expected.
    (WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): Remove enum values.
    (WBP_WSS): New enum value.
    (get_wbp): Update such that uniwbrk/wbrkprop.txt comes out as expected.
    (debug_output_wbp, fill_org_wbp, debug_output_org_wbp, output_wbp): Update for changed enum values.
  * lib/unictype.in.h (UC_JOINING_GROUP_HANIFI_ROHINGYA_*): New enum values.
  * lib/unictype/joininggroup_name.h: Add the HANIFI_ROHINGYA_* joining groups.
  * lib/unictype/joininggroup_byname.gperf: Likewise.
  * lib/unigbrk.in.h: Mark 4 enum values as obsolete.
  * lib/unigbrk/u-grapheme-breaks.h (FUNC): Handle emoji modifier sequence according to Unicode 11.0.0.
  * lib/unigbrk/u8-grapheme-breaks.c: Include <stdbool.h>, unictype.h.
  * lib/unigbrk/u16-grapheme-breaks.c: Likewise.
  * lib/unigbrk/u32-grapheme-breaks.c: Likewise.
  * lib/unigbrk/uc-grapheme-breaks.c: Likewise.
  * modules/unigbrk/u8-grapheme-breaks (Depends-on): Add unictype/property-extended-pictographic, stdbool.
  * modules/unigbrk/u16-grapheme-breaks (Depends-on): Likewise.
  * modules/unigbrk/u32-grapheme-breaks (Depends-on): Likewise.
  * modules/unigbrk/uc-grapheme-breaks (Depends-on): Likewise.
  * tests/unigbrk/test-u8-grapheme-breaks.c (main): Add test for emoji modifier / ZWJ sequence.
  * tests/unigbrk/test-u16-grapheme-breaks.c (main): Likewise.
  * tests/unigbrk/test-u32-grapheme-breaks.c (main): Likewise.
  * tests/unigbrk/test-uc-is-grapheme-break.c: Include <stdbool.h>, unictype.h.
    (main): Update workaround logic to match the one in lib/unigbrk/u-grapheme-breaks.h.
  * modules/unigbrk/uc-is-grapheme-break-tests (Depends-on): Add unictype/property-extended-pictographic, stdbool.
  * lib/uniwbrk.in.h: Mark 4 enum values as obsolete.
    (WBP_WSS): New enum value.
  * lib/uniwbrk/u-wordbreaks.h (FUNC): Handle emoji ZWJ sequences and horizontal whitespace according to Unicode 11.0.0.
  * lib/uniwbrk/u8-wordbreaks.c: Include unictype.h.
  * lib/uniwbrk/u16-wordbreaks.c: Likewise.
  * lib/uniwbrk/u32-wordbreaks.c: Likewise.
  * lib/uniwbrk/wbrktable.c (uniwbrk_prop_index, uniwbrk_table): Add a row and column for WBP_WSS.
  * lib/uniwbrk/wbrktable.h (uniwbrk_prop_index, uniwbrk_table): Update declarations.
  * modules/uniwbrk/u8-wordbreaks (Depends-on): Add unictype/property-extended-pictographic.
  * modules/uniwbrk/u16-wordbreaks (Depends-on): Likewise.
  * modules/uniwbrk/u32-wordbreaks (Depends-on): Likewise.
  * tests/uniwbrk/test-u8-wordbreaks.c (main): Update expected results.
  * tests/uniwbrk/test-u16-wordbreaks.c (main): Likewise.
  * tests/uniwbrk/test-u32-wordbreaks.c (main): Likewise.
  * tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Update.
  * lib/unilbrk/u8-possible-linebreaks.c (u8_possible_linebreaks_loop): Handle ZWJ according to Unicode 11.0.0.
  * lib/unilbrk/u16-possible-linebreaks.c (u16_possible_linebreaks_loop): Likewise.
  * lib/unilbrk/u32-possible-linebreaks.c (u32_possible_linebreaks_loop): Likewise.
  * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
    (uc_width): Assign width 2 to the characters 0x187ED..0x187F1, 0x1F6F9, 0x1F9E7..0x1F9FF.
  * tests/uniwidth/test-uc_width2.sh: Expect width 0 for the characters 0x07FD, 0x08D3, 0x09FE, 0x0C04, 0xA8FF, 0x10D24..0x10D27, 0x10F46..0x10F50, 0x110CD, 0x111C9, 0x1133B, 0x1145E, 0x1182F..0x11837, 0x11839..0x1183A, 0x11D90..0x11D91, 0x11D95, 0x11D97, 0x11EF3..0x11EF4. Expect width 2 for the characters 0x187ED..0x187F1, 0x1F6F9, 0x1F9E7..0x1F9FF.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
* unictype: Add Emoji properties from Unicode 11.0.0. | Bruno Haible | 2021-12-30 | 6 | -0/+478
  * lib/gen-uni-tables.c (PROP_EMOJI*, PROP_EXTENDED_PICTOGRAPHIC): New enum values.
    (fill_properties): Don't require a space between the property name and the comment. Handle the property names from emoji-data.txt.
    (is_property_emoji, is_property_emoji_presentation, is_property_emoji_modifier, is_property_emoji_modifier_base, is_property_emoji_component, is_property_extended_pictographic): New declarations.
    (output_properties): Emit the properties emoji, emoji_presentation, emoji_modifier, emoji_modifier_base, emoji_component, extended_pictographic.
    (get_lbp): Use the emoji_modifier property.
    (main): Expect one more argument, for the emoji-data.txt file.
  * lib/unictype.in.h (UC_PROPERTY_EMOJI, UC_PROPERTY_EMOJI_PRESENTATION, UC_PROPERTY_EMOJI_MODIFIER, UC_PROPERTY_EMOJI_MODIFIER_BASE, UC_PROPERTY_EMOJI_COMPONENT, UC_PROPERTY_EXTENDED_PICTOGRAPHIC, uc_is_property_emoji, uc_is_property_emoji_presentation, uc_is_property_emoji_modifier, uc_is_property_emoji_modifier_base, uc_is_property_emoji_component, uc_is_property_extended_pictographic): New declarations.
  * lib/unictype/pr_emoji.c: New file.
  * lib/unictype/pr_emoji_presentation.c: New file.
  * lib/unictype/pr_emoji_modifier.c: New file.
  * lib/unictype/pr_emoji_modifier_base.c: New file.
  * lib/unictype/pr_emoji_component.c: New file.
  * lib/unictype/pr_extended_pictographic.c: New file.
  * modules/unictype/property-emoji: New file.
  * modules/unictype/property-emoji-tests: New file.
  * modules/unictype/property-emoji-presentation: New file.
  * modules/unictype/property-emoji-presentation-tests: New file.
  * modules/unictype/property-emoji-modifier: New file.
  * modules/unictype/property-emoji-modifier-tests: New file.
  * modules/unictype/property-emoji-modifier-base: New file.
  * modules/unictype/property-emoji-modifier-base-tests: New file.
  * modules/unictype/property-emoji-component: New file.
  * modules/unictype/property-emoji-component-tests: New file.
  * modules/unictype/property-extended-pictographic: New file.
  * modules/unictype/property-extended-pictographic-tests: New file.
  * modules/unictype/property-all (Depends-on): Depend on the new modules.
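The emoji entry adds predicates such as `uc_is_property_emoji_modifier` to unictype.in.h, backed by generated tables. As a standalone illustration of what such a predicate answers, the sketch below hard-codes the Emoji_Modifier property, which covers the five skin tone modifiers U+1F3FB..U+1F3FF. The function names are hypothetical stand-ins and do not use the library's generated tables:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a property predicate: true if UC has the
   Emoji_Modifier property (U+1F3FB..U+1F3FF).  */
bool
sketch_is_emoji_modifier (unsigned int uc)
{
  return uc >= 0x1F3FB && uc <= 0x1F3FF;
}

/* Counts the emoji modifiers in the N code points starting at S.  */
size_t
sketch_count_emoji_modifiers (const unsigned int *s, size_t n)
{
  assert (s != NULL || n == 0);
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    if (sketch_is_emoji_modifier (s[i]))
      count++;
  return count;
}
```

In real code one would link against the library and call the declared predicate instead; the other five properties follow the same calling pattern but have much larger code point sets.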
* gen-uni-tables: Produce license notices suitable for Gnulib. | Bruno Haible | 2021-12-28 | 144 | -720/+720
  * lib/gen-uni-tables.c (output_library_license, output_tests_license): Produce license notices that are consistent with the gnulib/etc/license-notices/ files.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
* Update to Unicode 10.0.0. | Bruno Haible | 2021-12-26 | 54 | -340/+788
  * lib/gen-uni-tables.c (PROP_REGIONAL_INDICATOR): New enum value.
    (fill_properties): Recognize property "Regional_Indicator".
    (is_property_regional_indicator): New function.
    (output_properties): Also output the data for regional_indicator.
    (UC_JOINING_GROUP_MALAYALAM_*): New enum values.
    (fill_arabicshaping, joining_group_as_c_identifier): Recognize these joining groups.
  * lib/unictype/pr_regional_indicator.c: New file.
  * modules/unictype/property-regional-indicator: New file.
  * modules/unictype/property-regional-indicator-tests: New file.
  * modules/unictype/property-all (Depends-on): Add unictype/property-regional-indicator.
  * lib/unictype.in.h (UC_JOINING_GROUP_MALAYALAM_*): New enum values.
  * lib/unictype/joininggroup_name.h: Add the MALAYALAM_* joining groups.
  * lib/unictype/joininggroup_byname.gperf: Likewise.
  * lib/uniwidth/width.c (nonspacing_table_data, nonspacing_table_ind): Update.
  * tests/uniwidth/test-uc_width2.sh: Update.
  * All generated files under lib/uni* and tests/uni*: Regenerate.
  * tests/uniname/NameAliases.txt: Update.
  * tests/uniname/UnicodeData.txt: Update.
  * tests/uninorm/NormalizationTest.txt: Update.
  * tests/unigbrk/GraphemeBreakTest.txt: Update.
  * tests/uniwbrk/WordBreakTest.txt: Update.
  * All the affected modules: Bump required libunistring version.
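The Regional_Indicator property recognized in the 10.0.0 entry covers the 26 code points U+1F1E6..U+1F1FF, one per letter A..Z; pairs of them encode region flags. A small sketch of the property test and the letter mapping follows, with hypothetical function names (the library's own predicate is table-driven and is not reproduced here):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in: true if UC is a regional indicator symbol.  */
bool
sketch_is_regional_indicator (unsigned int uc)
{
  return uc >= 0x1F1E6 && uc <= 0x1F1FF;
}

/* Maps a regional indicator symbol to its ASCII letter 'A'..'Z'.
   The caller must pass a code point with the property.  */
char
sketch_regional_indicator_letter (unsigned int uc)
{
  assert (sketch_is_regional_indicator (uc));
  return (char) ('A' + (uc - 0x1F1E6));
}
```

For example, the pair U+1F1E9 U+1F1EA maps to the letters D and E.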
* Generate correct license notices in libunistring files. | Bruno Haible | 2021-06-04 | 5 | -0/+80
  * lib/gen-uni-tables.c (output_library_license, output_tests_license): New functions.
    (output_predicate, output_category, output_combclass, output_bidi_category, output_decimal_digit, output_digit, output_numeric, output_mirror, output_joining_type, output_joining_group, output_scripts, output_scripts_byname, output_blocks, output_ident_category, output_lbrk_tables, output_wbrk_tables, output_gbp_table, output_decomposition_tables, output_composition_tables, output_simple_mapping, output_casing_rules): Invoke output_library_license.
    (output_predicate_test, output_decimal_digit_test, output_digit_test, output_numeric_test, output_joining_type_test, output_joining_group_test, output_gbp_test, output_simple_mapping_test): Invoke output_tests_license.
  * lib/uni*/*.h, lib/uni*/*.gperf: Regenerated.
  * tests/uni*/*.h: Likewise.
* libunistring: update to Unicode 9.0.0 | Daiki Ueno | 2017-11-27 | 70 | -306/+1133
  * lib/gen-uni-tables.c (fill_properties): Recognize Sentence_Terminal and Prepended_Concatenation_Mark.
    (is_property_default_ignorable_code_point): Exclude U+08E2.
    (fill_arabicshaping): Allow missing whitespace when parsing; recognize "AFRICAN FEH", "AFRICAN QAF", and "AFRICAN MOON".
    (output_blocks): Increase the element size of the level1 table to accommodate more blocks.
    (get_lbp): Recognize ZWJ, E_Base, and E_Modifier characters; Update each class according to the standard.
    (get_wbp): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters.
    (output_gbp_table): Recognize ZWJ, E_Base, E_Modifier, Glue_After_Zwj, and E_Base_GAZ characters.
  * lib/unictype.in.h (UC_JOINING_GROUP_AFRICAN_FEH) (UC_JOINING_GROUP_AFRICAN_QAF, UC_JOINING_GROUP_AFRICAN_MOON): New enum value.
  * lib/unilbrk/lbrktables.h (LBP_ZWJ, LBP_EB, LBP_EM): New enum value.
  * lib/unilbrk/lbrktables.c (unilbrk_table): Extend the table with LBP_ZWJ, LBP_EB, and LBP_EM.
  * lib/uniwbrk.in.h (WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, WBP_EBG): New enum value.
  * lib/uniwbrk/u-wordbreaks.h: Implement WB3c, WB15, and WB16.
  * lib/uniwbrk/wbrktable.h (uniwbrk_prop_index): New variable declaration.
  * lib/uniwbrk/wbrktable.c (uniwbrk_prop_index): New variable.
    (uniwbrk_table): Implement WB14.
  * tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Check WBP_ZWJ, WBP_EB, WBP_EM, WBP_GAZ, and WBP_EBG.
  * modules/unigbrk/u{32,16,8}-grapheme-breaks: No longer depend on uc-is-grapheme-break.
  * modules/unigbrk/uc-grapheme-breaks: New module.
  * modules/unigbrk/uc-grapheme-breaks-tests: New module.
  * lib/unigbrk.in.h (GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, GBP_EBG): New enum value.
    (uc_grapheme_breaks): New function, replacing uc_is_grapheme_break.
  * lib/unigbrk/u-grapheme-breaks.h: New file.
  * lib/unigbrk/u{32,16,8}-grapheme-breaks.c: Rewrite using u-grapheme-breaks.h instead of uc_is_grapheme_break.
  * lib/unigbrk/uc-grapheme-breaks.c: New file.
  * lib/unigbrk/uc-is-grapheme-break.c: Partially update to TR29 rev 29.
  * tests/unigbrk/test-uc-gbrk-prop.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG.
  * tests/unigbrk/test-uc-grapheme-breaks.c: New test.
  * tests/unigbrk/test-uc-is-grapheme-break.c (graphemebreakproperty_to_string): Check GBP_ZWJ, GBP_EB, GBP_EM, GBP_GAZ, and GBP_EBG.
    (main): Skip unsupported rules involving 3 or more characters, namely GB10, GB12, and GB13.
  * lib/uniwidth/width.c (nonspacing_table_data): Update.
  * all generated files under lib/uni* and tests/uni*: Regenerate.
  * all the dependent modules: Bump version.
* all: prefer https: URLs | Paul Eggert | 2017-09-13 | 177 | -177/+177
* libunistring: update to Unicode 8.0.0 | Daiki Ueno | 2015-06-18 | 68 | -520/+1076
  * lib/gen-uni-tables.c (SIZEOF): New macro.
    (output_numeric): Increase the maximum number of fractions from 128 to 160. Increase the level3 value width from 7 bits to 8 bits. Use SIZEOF instead of a hard-coded integer.
    (output_blocks): Decrease the cut-off threshold from 0x30000 to 0x28000.
    (fill_blocks): Increase the maximum number of blocks from 256 to 384. Use SIZEOF instead of a hard-coded integer.
    (get_lbp): Adjust to new characters added in Unicode 8.0.0.
  * lib/unictype/numeric.c (uc_numeric_value): Adjust the level3 value width.
  * lib/unilbrk/lbrktables.c (unilbrk_table): Implement LBP21b and a new case added to LBP22.
  * lib/uniwidth/width.c (nonspacing_table_data): Add U+08E3, U+A69E, U+FE2E..U+FE2F, U+111CA..U+111CC, U+11300, U+115DC..U+115DD, U+1171D..U+1171F, U+11722..U+11725, U+11727..U+1172B, U+1DA00..U+1DA36, U+1DA3B..U+1DA6C, U+1DA75, U+1DA84, U+1DA9B..U+1DA9F, and U+1DAA1..U+1DAAF.
  * tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
  * all generated files under lib/uni* and tests/uni*: Regenerate.
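The 8.0.0 entry replaces hard-coded array limits with a `SIZEOF` macro, so that raising a capacity (for example, the block table from 256 to 384) needs only one edit. The sketch below assumes the conventional element-count definition, since the entry does not show the macro body; `struct block`, `blocks`, `add_block`, and `block_capacity` are hypothetical names for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* ASSUMPTION: the usual element-count idiom.  Only valid for true
   arrays, not for pointers.  */
#define SIZEOF(a) (sizeof (a) / sizeof ((a)[0]))

struct block { unsigned int start, end; const char *name; };

/* Capacity is stated once, in the array declaration.  */
static struct block blocks[384];
static size_t numblocks;

/* Appends a block; returns 0 on success, -1 if the table is full.
   The bound check follows the array size automatically.  */
int
add_block (unsigned int start, unsigned int end, const char *name)
{
  assert (start <= end);
  if (numblocks == SIZEOF (blocks))
    return -1;
  blocks[numblocks].start = start;
  blocks[numblocks].end = end;
  blocks[numblocks].name = name;
  numblocks++;
  return 0;
}

/* Reports the capacity derived from the declaration.  */
size_t
block_capacity (void)
{
  return SIZEOF (blocks);
}
```

With a hard-coded `256` in the bound check, enlarging the array without updating the check would have silently kept the old limit; deriving the bound from the declaration removes that failure mode.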
* libunistring: update to Unicode 7.0.0 | Daiki Ueno | 2015-01-15 | 85 | -885/+2933
  * lib/unictype/joininggroup_byname.gperf: Add Straight Waw and Manichaean names.
  * lib/unictype/joininggroup_name.h: Likewise.
  * lib/unictype.in.h (UC_JOINING_GROUP_STRAIGHT_WAW) (UC_JOINING_GROUP_MANICHAEAN_ALEPH): New enumeration values.
  * lib/gen-uni-tables.c (UC_JOINING_GROUP_STRAIGHT_WAW) (UC_JOINING_GROUP_MANICHAEAN_*): New enumeration values.
    (fill_arabicshaping, joining_group_as_c_identifier): Support those enum values.
    (is_property_alphabetic): Accept newly added characters to cuneiform numeric signs.
    (is_property_default_ignorable_code_point): Reject U+0605.
    (FIELDLEN): Increase from 120 to 160.
  * lib/uniwidth/width.c (nonspacing_table_data): Add U+0605, U+08FF, U+0C00, U+0C81, U+0D01, U+1AB0..U+1ABE, U+1BAC..U+1BAD, U+1CF8..U+1CF9, U+1DE7..U+1DF5, U+A9E5, U+AA7C, U+FE27..U+FE2D, U+102E0, U+10376..U+1037A, U+10AE5..U+10AE6, U+1107F, U+11173, U+1122F..U+11231, U+11234, U+11236..U+11237, U+112DF, U+112E3..U+112EA, U+11301, U+1133C, U+11340, U+11366..U+1136C, U+11370..U+11374, U+114B3..U+114B8, U+114BA, U+114BF..U+114C0, U+114C2..U+114C3, U+115B2..U+115B5, U+115BC..U+115C0, U+11633..U+1163A, U+1163D, U+1163F..U+11640, U+16AF0..U+16AF4, U+16B30..U+16B36, U+1BC9D..U+1BC9E, U+1BCA0..U+1BCA3, and U+1E8D0..U+1E8D6.
    (uc_width): Adjust nonspacing_table_ind boundary from 240 to 248.
  * tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
  * all generated files under lib/uni* and tests/uni*: Regenerate.
* libunistring: update to Unicode 6.3.0  Daiki Ueno  2015-01-15  41  -43/+249
    * lib/uniwbrk.in.h (WBP_DQ, WBP_SQ, WBP_HL): New enumeration values.
    * lib/uniwbrk/u-wordbreaks.h (FUNC): Support WB7a, WB7b, and WB7c. Update WB5, WB6, WB7, WB9, WB11, WB12, WB13a, and WB13b.
    * lib/uniwbrk/wbrktable.h (uniwbrk_table): Adjust table size.
    * lib/uniwbrk/wbrktable.c (uniwbrk_table): Support rule WB7a. Update WB5, WB9, WB10, WB13a, and WB13b.
    * tests/uniwbrk/test-uc-wordbreaks.c (wordbreakproperty_to_string): Support WBP_DQ, WBP_SQ, and WBP_HL.
    * lib/gen-uni-tables.c (UC_BIDI_LRI, UC_BIDI_RLI, UC_BIDI_FSI) (UC_BIDI_PDI): New enumeration values.
    (bidi_category_byname): Support those enum values.
    (is_WBP_MIDNUMLET): Exclude 0x0027 (SINGLE QUOTE), which is now assigned a dedicated property.
    (is_property_case_ignorable): Check 0x0027.
    (WBP_DQ, WBP_SQ, WBP_HL): New enumeration values.
    (get_wbp, debug_output_wbp, fill_org_wbp, debug_output_org_wbp) (output_wbp): Support those enum values.
    * lib/unictype.in.h (UC_BIDI_LRI, UC_BIDI_RLI, UC_BIDI_FSI) (UC_BIDI_PDI): New enumeration values.
    * lib/unictype/bidi_byname.gperf: Add those property names.
    * lib/uniwidth/width.c (nonspacing_table_data): Add U+061C, U+180E, U+1A1B, and U+2066..U+2069.
    * tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
    * all generated files under lib/uni* and tests/uni*: Regenerate.
* libunistring: update to Unicode 6.2.0  Daiki Ueno  2015-01-15  18  -20/+22
    * lib/unilbrk/lbrktables.h (LBP_RI): New enumeration value.
    (unilbrk_table): Adjust table size.
    * lib/unilbrk/lbrktables.c (unilbrk_table): Add a row and column for LBP_RI.
    * lib/uniwbrk.in.h (WBP_RI): New enumeration value.
    * lib/uniwbrk/u-wordbreaks.h (FUNC): Support rule WB13c. Normalize table index skipping ignored properties.
    * lib/uniwbrk/wbrktable.c (uniwbrk_table): Support WBP_RI. Remove WBP_EXTEND and WBP_FORMAT, which are now computed without using the table.
    * lib/uniwbrk/wbrktable.h: Adjust table size.
    * lib/unigbrk.in.h (GBP_RI): New enumeration value.
    * lib/unigbrk/uc-is-grapheme-break.c (UC_IS_GRAPHEME_BREAK): Support rule GB8a.
    (UC_GRAPHEME_BREAKS_FOR, gb_table): Support GBP_RI.
    * tests/unigbrk/test-uc-is-grapheme-break.c (graphemebreakproperty_to_string): Support GBP_RI.
    * lib/gen-uni-tables.c (LBP_RI): New enumeration value.
    (get_lbp, debug_output_lbp, fill_org_lbp, debug_output_org_lbp) (output_lbp): Support LBP_RI. Adjust some characters changed from LBP_AL to LBP_ID.
    (output_lbp): Support LBP_RI.
    (WBP_RI): New enumeration value.
    (debug_output_wbp, fill_org_wbp, debug_output_org_wbp) (output_wbp): Support WBP_RI.
    (GBP_RI): New enumeration value.
    (output_gbp_test, fill_org_gbp): Support GBP_RI.
    * all generated files under lib/uni* and tests/uni*: Regenerate.
* libunistring: update to Unicode 6.1.0  Daiki Ueno  2015-01-15  80  -558/+1793
    * lib/gen-uni-tables.c (output_joining_group): Switch to 3-level table to accommodate joining groups defined with higher codepoint value. Since there are only 88 groups defined in Unicode 7.0.0, use 7-bit packed format for level3 entries.
    (get_lbp): Update for Unicode 6.1.0.
    * lib/unictype/joininggroup_of.c (uc_joining_group): Adjust to use 3-level table.
    * lib/unictype/joininggroup_byname.gperf: Add Rohingya Yeh joining group name.
    * lib/unictype/joininggroup_name.h: Likewise.
    * lib/unilbrk/lbrktables.h (LBP_HL): New enumeration value.
    (unilbrk_table): Adjust table size.
    * lib/unilbrk/lbrktables.c (unilbrk_table): Add a row and column for LBP_HL.
    * lib/uniwidth/width.c (nonspacing_table_data): Add U+0604, U+08E4..U+08FE, U+1BAB, U+1CF4, U+A674..U+A67B, U+A69F, U+AAEC..U+AAED, U+AAF6, U+11100..U+11102, U+11127..U+1112B, U+1112D..U+11134, U+11180..U+11181, U+111B6..U+111BE, U+116AB, U+116AD, U+116B0..U+116B5, U+116B7, U+16F8F..U+16F92. Remove U+302E..U+302F.
    * tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
    * all generated files under lib/uni* and tests/uni*: Regenerate.
    * modules/uni*/* (configure.ac): Bump minimum version to 0.9.5.
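The "7-bit packed format for level3 entries" works because 88 distinct values fit in 7 bits (2^7 = 128), so consecutive entries can be stored back to back in a bit stream instead of one byte each. The sketch below shows one way to read and write such entries in an array of 16-bit words. It is an illustration under assumptions, not the generated table code: the helper names `packed7_get` and `packed7_set` are hypothetical, the three-level index computation is omitted, and the array is assumed to carry one spare trailing word so that reading two adjacent words is always in bounds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the i-th 7-bit entry from a bit stream stored in 16-bit words.
   An entry may straddle two words, so two adjacent words are combined
   before shifting.  The array needs one spare word at the end.  */
static unsigned int
packed7_get (const uint16_t *words, size_t i)
{
  size_t bit = i * 7;            /* bit offset of the entry */
  size_t w = bit >> 4;           /* index of the word holding its first bit */
  unsigned int shift = bit & 15; /* offset of that bit inside the word */
  uint32_t pair = words[w] | ((uint32_t) words[w + 1] << 16);
  return (pair >> shift) & 0x7f;
}

/* Write the i-th 7-bit entry; the inverse of packed7_get.  */
static void
packed7_set (uint16_t *words, size_t i, unsigned int value)
{
  size_t bit = i * 7;
  size_t w = bit >> 4;
  unsigned int shift = bit & 15;
  uint32_t pair = words[w] | ((uint32_t) words[w + 1] << 16);
  pair &= ~((uint32_t) 0x7f << shift);       /* clear the old entry */
  pair |= (uint32_t) (value & 0x7f) << shift; /* store the new entry */
  words[w] = (uint16_t) pair;
  words[w + 1] = (uint16_t) (pair >> 16);
}
```

Sixteen entries occupy 112 bits, i.e. seven words plus the spare one, compared with sixteen bytes in an unpacked layout; the saving is modest per block but adds up across a large table.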
* unictype/joininggroup-byname: Allow hyphens, omitted word separators.  Bruno Haible  2011-03-26  1  -0/+17
    * lib/unictype/joininggroup_byname.c (uc_joining_group_byname): Convert also hyphens to space.
    * lib/unictype/joininggroup_byname.gperf: Recognize the names also without spaces.
    * tests/unictype/test-joininggroup_byname.c (main): Add more tests.
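The effect of this change is that a joining group name is matched regardless of whether its words are separated by spaces, hyphens, or nothing at all. The commit achieves this by converting hyphens to spaces and adding space-free spellings to the gperf table. The sketch below uses a different, simpler technique to illustrate the same matching behaviour: it skips separators while comparing. The function `loose_name_equal` is hypothetical and is not part of libunistring; treating underscores as separators and ignoring case are additional assumptions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>

/* Characters treated as optional word separators (assumption).  */
static bool
is_separator (char c)
{
  return c == ' ' || c == '-' || c == '_';
}

/* Compare two names, ignoring case and any separators.  */
static bool
loose_name_equal (const char *a, const char *b)
{
  for (;;)
    {
      while (is_separator (*a))
        a++;
      while (is_separator (*b))
        b++;
      if (*a == '\0' || *b == '\0')
        return *a == *b;        /* equal only if both are exhausted */
      if (tolower ((unsigned char) *a) != tolower ((unsigned char) *b))
        return false;
      a++;
      b++;
    }
}
```

Under these assumptions, "Yeh With Tail", "yeh-with-tail" and "YEHWITHTAIL" all compare equal, while "Yeh" and "Yeh Barree" do not.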
* unictype/joiningtype-byname: Recognize long names as well.  Bruno Haible  2011-03-26  1  -0/+24
    * lib/unictype.in.h (uc_joiningtype_class_byname): Allow argument to be a long name.
    * lib/unictype/joiningtype_byname.c: Include <string.h>, unictype/joiningtype_byname.h.
    (uc_joiningtype_class_byname): Use uc_joining_type_lookup.
    * lib/unictype/joiningtype_byname.gperf: New file.
    * modules/unictype/joiningtype-byname (Files): Add lib/unictype/joiningtype_byname.gperf.
    (Depends-on): Add gperf.
    (Makefile.am): Add rule for generating unictype/joiningtype_byname.h.
    * tests/unictype/test-joiningtype_byname.c (main): Test the recognition of long names.
* Tests for module 'unictype/joiningtype-longname'.  Bruno Haible  2011-03-26  1  -0/+39
    * modules/unictype/joiningtype-longname-tests: New file.
    * tests/unictype/test-joiningtype_longname.c: New file.
* unictype/bidiclass-byname: Recognize long names as well.  Bruno Haible  2011-03-26  1  -0/+78
    * lib/unictype.in.h (uc_bidi_class_byname): Allow argument to be a long name.
    * lib/unictype/bidi_byname.c: Include <string.h>, unictype/bidi_byname.h.
    (uc_bidi_class_byname): Use uc_bidi_class_lookup.
    * lib/unictype/bidi_byname.gperf: New file.
    * modules/unictype/bidiclass-byname (Files): Add lib/unictype/bidi_byname.gperf.
    (Depends-on): Add gperf.
    (Makefile.am): Add rule for generating unictype/bidi_byname.h.
    * tests/unictype/test-bidi_byname.c (main): Test the recognition of long names.
* Tests for module 'unictype/bidiclass-longname'.  Bruno Haible  2011-03-26  1  -0/+50
    * modules/unictype/bidiclass-longname-tests: New file.
    * tests/unictype/test-bidi_longname.c: New file.
* Tests for module 'unictype/combining-class-byname'.  Bruno Haible  2011-03-26  1  -0/+118
    * modules/unictype/combining-class-byname-tests: New file.
    * tests/unictype/test-combiningclass_byname.c: New file.
* Tests for module 'unictype/combining-class-longname'.  Bruno Haible  2011-03-26  1  -0/+53
    * modules/unictype/combining-class-longname-tests: New file.
    * tests/unictype/test-combiningclass_longname.c: New file.
* Tests for module 'unictype/combining-class-name'.  Bruno Haible  2011-03-26  1  -0/+53
    * modules/unictype/combining-class-name-tests: New file.
    * tests/unictype/test-combiningclass_name.c: New file.
* unictype/combining-class: Rename source files.  Bruno Haible  2011-03-26  1  -0/+0
    * lib/gen-uni-tables.c (main): Emit unictype/combiningclass.h instead of unictype/combining.h.
    * lib/unictype/combiningclass.c: Renamed from lib/unictype/combining.c. Update.
    * lib/unictype/combiningclass.h: Renamed from lib/unictype/combining.h.
    * modules/unictype/combining-class (Description): Fix.
    (Files, Makefile.am): Update.
    * tests/unictype/test-combiningclass.c: Renamed from tests/unictype/test-combining.c.
    * modules/unictype/combining-class-tests (Files, Makefile.am): Update.
* unictype/category-byname: Recognize long names as well.  Bruno Haible  2011-03-25  1  -1/+133
    * lib/unictype.in.h (uc_general_category_byname): Allow argument to be a long name.
    * lib/unictype/categ_byname.c: Include <stdlib.h>, <string.h>, unictype/categ_byname.h.
    (UC_CATEGORY_INDEX_*): New enumeration values.
    (uc_general_category_byname): Use uc_general_category_lookup and convert from index to value.
    * lib/unictype/categ_byname.gperf: New file.
    * modules/unictype/category-byname (Files): Add lib/unictype/categ_byname.gperf.
    (Depends-on): Add gperf.
    (Makefile.am): Add rule for generating unictype/categ_byname.h.
    * tests/unictype/test-categ_byname.c (main): Test the recognition of long names.
* Tests for module 'unictype/category-longname'.  Bruno Haible  2011-03-25  1  -0/+33
    * modules/unictype/category-longname-tests: New file.
    * tests/unictype/test-categ_longname.c: New file.
* Tests for module 'unictype/category-LC'.  Bruno Haible  2011-03-25  1  -0/+132
    * modules/unictype/category-LC-tests: New file.
    * tests/unictype/test-categ_LC.c: New file, automatically generated.
* unictype/bidi*: Rename functions.  Bruno Haible  2011-03-23  4  -40/+40
    * lib/unictype.in.h (uc_bidi_class_name, uc_bidi_class_byname, uc_bidi_class, uc_is_bidi_class): New declarations.
    * lib/unictype/bidi_byname.c (uc_bidi_class_byname): Renamed from uc_bidi_category_byname.
    (uc_bidi_category_byname): New function.
    * lib/unictype/bidi_name.c (u_bidi_class_name): Renamed from u_bidi_category_name.
    (uc_bidi_class_name): Renamed from uc_bidi_category_name.
    (uc_bidi_category_name): New function.
    * lib/unictype/bidi_of.c (uc_bidi_class): Renamed from uc_bidi_category.
    (uc_bidi_category): New function.
    * lib/unictype/bidi_test.c (uc_is_bidi_class): Renamed from uc_is_bidi_category. Invoke uc_bidi_class.
    (uc_is_bidi_category): New function.
    * tests/unictype/test-bidi_byname.c (main): Test uc_bidi_class_byname instead of uc_bidi_category_byname.
    * tests/unictype/test-bidi_name.c (main): Test uc_bidi_class_name instead of uc_bidi_category_name.
    * tests/unictype/test-bidi_of.c (main): Test uc_bidi_class instead of uc_bidi_category.
    * tests/unictype/test-bidi_test.c (main): Test uc_is_bidi_class instead of uc_is_bidi_category.
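Each "Renamed from ..." entry above is paired with a "New function" entry under the old name, which suggests that the old names were kept for existing callers. The log does not show how the old functions are implemented; one common way to do this is a thin forwarding wrapper, sketched below. The `demo_` names and the toy classification rule are hypothetical and stand in for the real functions.

```c
#include <assert.h>

/* New name.  Toy rule for illustration only: code points in the Hebrew
   block map to 1, everything else to 0.  */
int
demo_bidi_class (unsigned int uc)
{
  return (uc >= 0x0590 && uc <= 0x05FF) ? 1 : 0;
}

/* Old name, kept so that existing callers keep compiling and linking.
   It simply forwards to the new function.  */
int
demo_bidi_category (unsigned int uc)
{
  return demo_bidi_class (uc);
}
```

Because the wrapper contains no logic of its own, the two names cannot drift apart, and the tests only need to cover the new name, as the test updates in this commit do.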
* Tests for module 'unictype/joininggroup-of'.  Bruno Haible  2011-03-21  2  -0/+285
    * modules/unictype/joininggroup-of-tests: New file.
    * tests/unictype/test-joininggroup_of.c: New file.
    * tests/unictype/test-joininggroup_of.h: New file, automatically generated by gen-uni-tables.
* Tests for module 'unictype/joininggroup-byname'.  Bruno Haible  2011-03-21  1  -0/+161
    * modules/unictype/joininggroup-byname-tests: New file.
    * tests/unictype/test-joininggroup_byname.c: New file.
* Tests for module 'unictype/joininggroup-name'.  Bruno Haible  2011-03-21  1  -0/+90
    * modules/unictype/joininggroup-name-tests: New file.
    * tests/unictype/test-joininggroup_name.c: New file.
* Tests for module 'unictype/joiningtype-of'.  Bruno Haible  2011-03-21  2  -0/+344
    * modules/unictype/joiningtype-of-tests: New file.
    * tests/unictype/test-joiningtype_of.c: New file.
    * tests/unictype/test-joiningtype_of.h: New file, automatically generated by gen-uni-tables.
* Tests for module 'unictype/joiningtype-byname'.  Bruno Haible  2011-03-21  1  -0/+40
    * modules/unictype/joiningtype-byname-tests: New file.
    * tests/unictype/test-joiningtype_byname.c: New file.
* Tests for module 'unictype/joiningtype-name'.  Bruno Haible  2011-03-21  1  -0/+39
    * modules/unictype/joiningtype-name-tests: New file.
    * tests/unictype/test-joiningtype_name.c: New file.
* Update to Unicode 6.0.0.  Bruno Haible  2011-01-09  73  -740/+1298
    * lib/gen-uni-tables.c (symbolic_width): Fix bounds of planes.
    (get_lbp): Update for Unicode 6.0.0.
    * lib/uniwidth/width.c (nonspacing_table_data): Add U+065F, U+0859..U+085B, U+093A, U+0956..U+0957, U+0F8D..U+0F8F, U+135D..U+135E, U+1BE6, U+1BE8..U+1BE9, U+1BED, U+1BEF..U+1BF1, U+1DFC, U+2D7F, U+11001, U+11038..U+11046. Remove U+06DE.
    (uc_width): Fix bounds of planes.
    * tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
    * tests/unigbrk/GraphemeBreakTest.txt: Copied from Unicode 6.0.0, with trailing whitespace removed.
    * tests/uninorm/NormalizationTest.txt: Update from Unicode 6.0.0, without comments, but with the original copyright notice.
    * lib/unicase/cased.h: Regenerated for Unicode 6.0.0.
    * lib/unicase/ignorable.h: Likewise.
    * lib/unicase/tocasefold.h: Likewise.
    * lib/unicase/tolower.h: Likewise.
    * lib/unicase/totitle.h: Likewise.
    * lib/unicase/toupper.h: Likewise.
    * lib/unictype/bidi_of.h: Likewise.
    * lib/unictype/blocks.h: Likewise.
    * lib/unictype/categ_C.h: Likewise.
    * lib/unictype/categ_Cn.h: Likewise.
    * lib/unictype/categ_L.h: Likewise.
    * lib/unictype/categ_Ll.h: Likewise.
    * lib/unictype/categ_Lm.h: Likewise.
    * lib/unictype/categ_Lo.h: Likewise.
    * lib/unictype/categ_Lu.h: Likewise.
    * lib/unictype/categ_M.h: Likewise.
    * lib/unictype/categ_Mc.h: Likewise.
    * lib/unictype/categ_Me.h: Likewise.
    * lib/unictype/categ_Mn.h: Likewise.
    * lib/unictype/categ_N.h: Likewise.
    * lib/unictype/categ_Nd.h: Likewise.
    * lib/unictype/categ_No.h: Likewise.
    * lib/unictype/categ_P.h: Likewise.
    * lib/unictype/categ_Po.h: Likewise.
    * lib/unictype/categ_S.h: Likewise.
    * lib/unictype/categ_Sc.h: Likewise.
    * lib/unictype/categ_Sk.h: Likewise.
    * lib/unictype/categ_Sm.h: Likewise.
    * lib/unictype/categ_So.h: Likewise.
    * lib/unictype/categ_of.h: Likewise.
    * lib/unictype/combining.h: Likewise.
    * lib/unictype/ctype_alnum.h: Likewise.
    * lib/unictype/ctype_alpha.h: Likewise.
    * lib/unictype/ctype_graph.h: Likewise.
    * lib/unictype/ctype_lower.h: Likewise.
    * lib/unictype/ctype_print.h: Likewise.
    * lib/unictype/ctype_punct.h: Likewise.
    * lib/unictype/ctype_upper.h: Likewise.
    * lib/unictype/decdigit.h: Likewise.
    * lib/unictype/digit.h: Likewise.
    * lib/unictype/numeric.h: Likewise.
    * lib/unictype/pr_alphabetic.h: Likewise.
    * lib/unictype/pr_bidi_arabic_digit.h: Likewise.
    * lib/unictype/pr_bidi_arabic_right_to_left.h: Likewise.
    * lib/unictype/pr_bidi_boundary_neutral.h: Likewise.
    * lib/unictype/pr_bidi_eur_num_terminator.h: Likewise.
    * lib/unictype/pr_bidi_hebrew_right_to_left.h: Likewise.
    * lib/unictype/pr_bidi_left_to_right.h: Likewise.
    * lib/unictype/pr_bidi_non_spacing_mark.h: Likewise.
    * lib/unictype/pr_bidi_other_neutral.h: Likewise.
    * lib/unictype/pr_case_ignorable.h: Likewise.
    * lib/unictype/pr_cased.h: Likewise.
    * lib/unictype/pr_changes_when_casefolded.h: Likewise.
    * lib/unictype/pr_changes_when_casemapped.h: Likewise.
    * lib/unictype/pr_changes_when_lowercased.h: Likewise.
    * lib/unictype/pr_changes_when_titlecased.h: Likewise.
    * lib/unictype/pr_changes_when_uppercased.h: Likewise.
    * lib/unictype/pr_combining.h: Likewise.
    * lib/unictype/pr_composite.h: Likewise.
    * lib/unictype/pr_currency_symbol.h: Likewise.
    * lib/unictype/pr_decimal_digit.h: Likewise.
    * lib/unictype/pr_deprecated.h: Likewise.
    * lib/unictype/pr_format_control.h: Likewise.
    * lib/unictype/pr_grapheme_base.h: Likewise.
    * lib/unictype/pr_grapheme_extend.h: Likewise.
    * lib/unictype/pr_grapheme_link.h: Likewise.
    * lib/unictype/pr_id_continue.h: Likewise.
    * lib/unictype/pr_id_start.h: Likewise.
    * lib/unictype/pr_ideographic.h: Likewise.
    * lib/unictype/pr_lowercase.h: Likewise.
    * lib/unictype/pr_math.h: Likewise.
    * lib/unictype/pr_numeric.h: Likewise.
    * lib/unictype/pr_other_alphabetic.h: Likewise.
    * lib/unictype/pr_other_id_continue.h: Likewise.
    * lib/unictype/pr_other_math.h: Likewise.
    * lib/unictype/pr_punctuation.h: Likewise.
    * lib/unictype/pr_sentence_terminal.h: Likewise.
    * lib/unictype/pr_terminal_punctuation.h: Likewise.
    * lib/unictype/pr_unassigned_code_value.h: Likewise.
    * lib/unictype/pr_unified_ideograph.h: Likewise.
    * lib/unictype/pr_uppercase.h: Likewise.
    * lib/unictype/pr_xid_continue.h: Likewise.
    * lib/unictype/pr_xid_start.h: Likewise.
    * lib/unictype/scripts.h: Likewise.
    * lib/unictype/scripts_byname.gperf: Likewise.
    * lib/unictype/sy_java_ident.h: Likewise.
    * lib/unigbrk/gbrkprop.h: Likewise.
    * lib/unilbrk/lbrkprop1.h: Likewise.
    * lib/unilbrk/lbrkprop2.h: Likewise.
    * lib/uninorm/decomposition-table2.h: Likewise.
    * lib/uniwbrk/wbrkprop.h: Likewise.
    * tests/unicase/test-cased.c: Likewise.
    * tests/unicase/test-ignorable.c: Likewise.
    * tests/unicase/test-uc_tolower.c: Likewise.
    * tests/unicase/test-uc_totitle.c: Likewise.
    * tests/unicase/test-uc_toupper.c: Likewise.
    * tests/unictype/test-categ_C.c: Likewise.
    * tests/unictype/test-categ_Cn.c: Likewise.
    * tests/unictype/test-categ_L.c: Likewise.
    * tests/unictype/test-categ_Ll.c: Likewise.
    * tests/unictype/test-categ_Lm.c: Likewise.
    * tests/unictype/test-categ_Lo.c: Likewise.
    * tests/unictype/test-categ_Lu.c: Likewise.
    * tests/unictype/test-categ_M.c: Likewise.
    * tests/unictype/test-categ_Mc.c: Likewise.
    * tests/unictype/test-categ_Me.c: Likewise.
    * tests/unictype/test-categ_Mn.c: Likewise.
    * tests/unictype/test-categ_N.c: Likewise.
    * tests/unictype/test-categ_Nd.c: Likewise.
    * tests/unictype/test-categ_No.c: Likewise.
    * tests/unictype/test-categ_P.c: Likewise.
    * tests/unictype/test-categ_Po.c: Likewise.
    * tests/unictype/test-categ_S.c: Likewise.
    * tests/unictype/test-categ_Sc.c: Likewise.
    * tests/unictype/test-categ_Sk.c: Likewise.
    * tests/unictype/test-categ_Sm.c: Likewise.
    * tests/unictype/test-categ_So.c: Likewise.
    * tests/unictype/test-ctype_alnum.c: Likewise.
    * tests/unictype/test-ctype_alpha.c: Likewise.
    * tests/unictype/test-ctype_graph.c: Likewise.
    * tests/unictype/test-ctype_lower.c: Likewise.
    * tests/unictype/test-ctype_print.c: Likewise.
    * tests/unictype/test-ctype_punct.c: Likewise.
    * tests/unictype/test-ctype_upper.c: Likewise.
    * tests/unictype/test-decdigit.h: Likewise.
    * tests/unictype/test-digit.h: Likewise.
    * tests/unictype/test-numeric.h: Likewise.
    * tests/unictype/test-pr_alphabetic.c: Likewise.
    * tests/unictype/test-pr_bidi_arabic_digit.c: Likewise.
    * tests/unictype/test-pr_bidi_arabic_right_to_left.c: Likewise.
    * tests/unictype/test-pr_bidi_boundary_neutral.c: Likewise.
    * tests/unictype/test-pr_bidi_eur_num_terminator.c: Likewise.
    * tests/unictype/test-pr_bidi_hebrew_right_to_left.c: Likewise.
    * tests/unictype/test-pr_bidi_left_to_right.c: Likewise.
    * tests/unictype/test-pr_bidi_non_spacing_mark.c: Likewise.
    * tests/unictype/test-pr_bidi_other_neutral.c: Likewise.
    * tests/unictype/test-pr_case_ignorable.c: Likewise.
    * tests/unictype/test-pr_cased.c: Likewise.
    * tests/unictype/test-pr_changes_when_casefolded.c: Likewise.
    * tests/unictype/test-pr_changes_when_casemapped.c: Likewise.
    * tests/unictype/test-pr_changes_when_lowercased.c: Likewise.
    * tests/unictype/test-pr_changes_when_titlecased.c: Likewise.
    * tests/unictype/test-pr_changes_when_uppercased.c: Likewise.
    * tests/unictype/test-pr_combining.c: Likewise.
    * tests/unictype/test-pr_composite.c: Likewise.
    * tests/unictype/test-pr_currency_symbol.c: Likewise.
    * tests/unictype/test-pr_decimal_digit.c: Likewise.
    * tests/unictype/test-pr_deprecated.c: Likewise.
    * tests/unictype/test-pr_format_control.c: Likewise.
    * tests/unictype/test-pr_grapheme_base.c: Likewise.
    * tests/unictype/test-pr_grapheme_extend.c: Likewise.
    * tests/unictype/test-pr_grapheme_link.c: Likewise.
    * tests/unictype/test-pr_id_continue.c: Likewise.
    * tests/unictype/test-pr_id_start.c: Likewise.
    * tests/unictype/test-pr_ideographic.c: Likewise.
    * tests/unictype/test-pr_lowercase.c: Likewise.
    * tests/unictype/test-pr_math.c: Likewise.
    * tests/unictype/test-pr_numeric.c: Likewise.
    * tests/unictype/test-pr_other_alphabetic.c: Likewise.
    * tests/unictype/test-pr_other_id_continue.c: Likewise.
    * tests/unictype/test-pr_other_math.c: Likewise.
    * tests/unictype/test-pr_punctuation.c: Likewise.
    * tests/unictype/test-pr_sentence_terminal.c: Likewise.
    * tests/unictype/test-pr_terminal_punctuation.c: Likewise.
    * tests/unictype/test-pr_unassigned_code_value.c: Likewise.
    * tests/unictype/test-pr_unified_ideograph.c: Likewise.
    * tests/unictype/test-pr_uppercase.c: Likewise.
    * tests/unictype/test-pr_xid_continue.c: Likewise.
    * tests/unictype/test-pr_xid_start.c: Likewise.
    * tests/unigbrk/test-uc-gbrk-prop.h: Likewise.
    * lib/unicase/special-casing-table.gperf: Regenerated; only comment changes.
    * lib/unictype/categ_Cc.h: Likewise.
    * lib/unictype/categ_Cf.h: Likewise.
    * lib/unictype/categ_Co.h: Likewise.
    * lib/unictype/categ_Cs.h: Likewise.
    * lib/unictype/categ_Lt.h: Likewise.
    * lib/unictype/categ_Nl.h: Likewise.
    * lib/unictype/categ_Pc.h: Likewise.
    * lib/unictype/categ_Pd.h: Likewise.
    * lib/unictype/categ_Pe.h: Likewise.
    * lib/unictype/categ_Pf.h: Likewise.
    * lib/unictype/categ_Pi.h: Likewise.
    * lib/unictype/categ_Ps.h: Likewise.
    * lib/unictype/categ_Z.h: Likewise.
    * lib/unictype/categ_Zl.h: Likewise.
    * lib/unictype/categ_Zp.h: Likewise.
    * lib/unictype/categ_Zs.h: Likewise.
    * lib/unictype/ctype_blank.h: Likewise.
    * lib/unictype/ctype_cntrl.h: Likewise.
    * lib/unictype/ctype_digit.h: Likewise.
    * lib/unictype/ctype_space.h: Likewise.
    * lib/unictype/ctype_xdigit.h: Likewise.
    * lib/unictype/mirror.h: Likewise.
    * lib/unictype/pr_ascii_hex_digit.h: Likewise.
    * lib/unictype/pr_bidi_block_separator.h: Likewise.
    * lib/unictype/pr_bidi_common_separator.h: Likewise.
    * lib/unictype/pr_bidi_control.h: Likewise.
    * lib/unictype/pr_bidi_embedding_or_override.h: Likewise.
    * lib/unictype/pr_bidi_eur_num_separator.h: Likewise.
    * lib/unictype/pr_bidi_european_digit.h: Likewise.
    * lib/unictype/pr_bidi_pdf.h: Likewise.
    * lib/unictype/pr_bidi_segment_separator.h: Likewise.
    * lib/unictype/pr_bidi_whitespace.h: Likewise.
    * lib/unictype/pr_dash.h: Likewise.
    * lib/unictype/pr_default_ignorable_code_point.h: Likewise.
    * lib/unictype/pr_diacritic.h: Likewise.
    * lib/unictype/pr_extender.h: Likewise.
    * lib/unictype/pr_hex_digit.h: Likewise.
    * lib/unictype/pr_hyphen.h: Likewise.
    * lib/unictype/pr_ids_binary_operator.h: Likewise.
    * lib/unictype/pr_ids_trinary_operator.h: Likewise.
    * lib/unictype/pr_ignorable_control.h: Likewise.
    * lib/unictype/pr_iso_control.h: Likewise.
    * lib/unictype/pr_join_control.h: Likewise.
    * lib/unictype/pr_left_of_pair.h: Likewise.
    * lib/unictype/pr_line_separator.h: Likewise.
    * lib/unictype/pr_logical_order_exception.h: Likewise.
    * lib/unictype/pr_non_break.h: Likewise.
    * lib/unictype/pr_not_a_character.h: Likewise.
    * lib/unictype/pr_other_default_ignorable_code_point.h: Likewise.
    * lib/unictype/pr_other_grapheme_extend.h: Likewise.
    * lib/unictype/pr_other_id_start.h: Likewise.
    * lib/unictype/pr_other_lowercase.h: Likewise.
    * lib/unictype/pr_other_uppercase.h: Likewise.
    * lib/unictype/pr_paired_punctuation.h: Likewise.
    * lib/unictype/pr_paragraph_separator.h: Likewise.
    * lib/unictype/pr_pattern_syntax.h: Likewise.
    * lib/unictype/pr_pattern_white_space.h: Likewise.
    * lib/unictype/pr_private_use.h: Likewise.
    * lib/unictype/pr_quotation_mark.h: Likewise.
    * lib/unictype/pr_radical.h: Likewise.
    * lib/unictype/pr_soft_dotted.h: Likewise.
    * lib/unictype/pr_space.h: Likewise.
    * lib/unictype/pr_titlecase.h: Likewise.
    * lib/unictype/pr_variation_selector.h: Likewise.
    * lib/unictype/pr_white_space.h: Likewise.
    * lib/unictype/pr_zero_width.h: Likewise.
    * lib/unictype/sy_c_ident.h: Likewise.
    * lib/unictype/sy_c_whitespace.h: Likewise.
    * lib/unictype/sy_java_whitespace.h: Likewise.
    * lib/uninorm/composition-table.gperf: Likewise.
    * lib/uninorm/decomposition-table1.h: Likewise.
    * tests/unilbrk/test-u8-possible-linebreaks.c (main): Add test for rule LB8.
    * tests/unilbrk/test-u16-possible-linebreaks.c (main): Likewise.
    * tests/unilbrk/test-u32-possible-linebreaks.c (main): Likewise.
    * modules/unictype/*: Bump version number of expected libunistring version.
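The two "Fix bounds of planes" entries in this commit concern range checks on whole Unicode planes; the log does not say which bounds were wrong. As a minimal sketch of how such a check can be written without hand-typed range limits, the plane number can be derived from the code point, since every plane spans exactly 0x10000 code points. The function names are hypothetical, and planes 2 and 3 are chosen purely as an example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Plane number of a code point: each plane holds 0x10000 code points,
   so the plane is everything above the low 16 bits.  */
static unsigned int
demo_plane (uint32_t uc)
{
  return uc >> 16;
}

/* True for code points in planes 2 and 3, i.e. U+20000..U+3FFFF.  */
static bool
demo_in_plane_2_or_3 (uint32_t uc)
{
  unsigned int plane = demo_plane (uc);
  return plane == 2 || plane == 3;
}
```

Writing the test in terms of the plane number avoids off-by-one mistakes in literals such as 0x2FFFF versus 0x3FFFF, which is the kind of error a "fix bounds" change typically addresses.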
* New module 'unictype/property-changes-when-casemapped'.  Bruno Haible  2011-01-09  1  -0/+119
    * modules/unictype/property-changes-when-casemapped: New file.
    * lib/unictype/pr_changes_when_casemapped.c: New file.
    * lib/unictype/pr_changes_when_casemapped.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-changes-when-casemapped-tests: New file.
    * tests/unictype/test-pr_changes_when_casemapped.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-changes-when-casefolded'.  Bruno Haible  2011-01-09  1  -0/+590
    * modules/unictype/property-changes-when-casefolded: New file.
    * lib/unictype/pr_changes_when_casefolded.c: New file.
    * lib/unictype/pr_changes_when_casefolded.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-changes-when-casefolded-tests: New file.
    * tests/unictype/test-pr_changes_when_casefolded.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-changes-when-titlecased'.  Bruno Haible  2011-01-09  1  -0/+596
    * modules/unictype/property-changes-when-titlecased: New file.
    * lib/unictype/pr_changes_when_titlecased.c: New file.
    * lib/unictype/pr_changes_when_titlecased.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-changes-when-titlecased-tests: New file.
    * tests/unictype/test-pr_changes_when_titlecased.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-changes-when-uppercased'.  Bruno Haible  2011-01-09  1  -0/+595
    * modules/unictype/property-changes-when-uppercased: New file.
    * lib/unictype/pr_changes_when_uppercased.c: New file.
    * lib/unictype/pr_changes_when_uppercased.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-changes-when-uppercased-tests: New file.
    * tests/unictype/test-pr_changes_when_uppercased.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-changes-when-lowercased'.  Bruno Haible  2011-01-09  1  -0/+579
    * modules/unictype/property-changes-when-lowercased: New file.
    * lib/unictype/pr_changes_when_lowercased.c: New file.
    * lib/unictype/pr_changes_when_lowercased.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-changes-when-lowercased-tests: New file.
    * tests/unictype/test-pr_changes_when_lowercased.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-case-ignorable'.  Bruno Haible  2011-01-09  1  -0/+288
    * modules/unictype/property-case-ignorable: New file.
    * lib/unictype/pr_case_ignorable.c: New file.
    * lib/unictype/pr_case_ignorable.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-case-ignorable-tests: New file.
    * tests/unictype/test-pr_case_ignorable.c: New file, automatically generated by gen-uni-tables.
* New module 'unictype/property-cased'.  Bruno Haible  2011-01-09  1  -0/+132
    * modules/unictype/property-cased: New file.
    * lib/unictype/pr_cased.c: New file.
    * lib/unictype/pr_cased.h: New file, automatically generated by gen-uni-tables.
    * modules/unictype/property-cased-tests: New file.
    * tests/unictype/test-pr_cased.c: New file, automatically generated by gen-uni-tables.
* Update to Unicode 5.2.0.Bruno Haible2011-01-0967-557/+1886
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * lib/gen-uni-tables.c (output_predicate, output_category, output_combclass, output_bidi_category, output_decimal_digit_test, output_decimal_digit, output_digit_test, output_digit, output_numeric_test, output_numeric, output_mirror, output_scripts, output_scripts_byname, output_blocks, output_ident_category): Fix comment header. (is_WBP_MIDNUMLET, is_WBP_MIDLETTER): New functions, extracted from get_wbp. (PROP_CASED, PROP_CASE_IGNORABLE, PROP_CHANGES_WHEN_*): New enumeration items. (fill_properties): Also fill the peoperties Cased, Case_Ignorable, Changes_When_Lowercased, Changes_When_Uppercased, Changes_When_Titlecased, Changes_When_Casefolded, Changes_When_Casemapped. (is_property_alphabetic, is_property_default_ignorable_code_point): Update for Unicode 5.2.0. (is_property_cased, is_property_case_ignorable, is_property_changes_when_lowercased, is_property_changes_when_uppercased, is_property_changes_when_titlecased, is_property_changes_when_casefolded, is_property_changes_when_casemapped): New functions. (output_properties): Output also the properties cased, case_ignorable, changes_when_lowercased, changes_when_uppercased, changes_when_titlecased, changes_when_casefolded, changes_when_casemapped. (symbolic_width): Update for Unicode 5.2.0, incorporating changes from Unicode TR#11 revision 17 -> 19. (LBP_CP): New enumeration value. (LBP_*): Adjust values accordingly. 
(get_lbp): Update for Unicode 5.2.0, incorporating changes from Unicode TR#14 revision 22 -> 24. (debug_output_lbp): Allow for LBP_* bits >= 32. Support LBP_CP. (fill_org_lbp, debug_output_org_lbp, output_lbp): Support LBP_CP. (get_wbp): Update for Unicode 5.2.0, incorporating changes from Unicode TR#29 revision 13 -> 15. Use functions is_WBP_MIDNUMLET, is_WBP_MIDLETTER. (output_composition_tables): Allow for 24 bits instead of 16 bits in the code1 and code2 of each composition rule. * lib/unicase/cased.h: Regenerated for Unicode 5.2.0. * lib/unicase/ignorable.h: Likewise. * lib/unicase/tocasefold.h: Likewise. * lib/unicase/tolower.h: Likewise. * lib/unicase/totitle.h: Likewise. * lib/unicase/toupper.h: Likewise. * lib/unictype/bidi_of.h: Likewise. * lib/unictype/blocks.h: Likewise. * lib/unictype/categ_C.h: Likewise. * lib/unictype/categ_Cf.h: Likewise. * lib/unictype/categ_Cn.h: Likewise. * lib/unictype/categ_L.h: Likewise. * lib/unictype/categ_Ll.h: Likewise. * lib/unictype/categ_Lm.h: Likewise. * lib/unictype/categ_Lo.h: Likewise. * lib/unictype/categ_Lu.h: Likewise. * lib/unictype/categ_M.h: Likewise. * lib/unictype/categ_Mc.h: Likewise. * lib/unictype/categ_Mn.h: Likewise. * lib/unictype/categ_N.h: Likewise. * lib/unictype/categ_Nd.h: Likewise. * lib/unictype/categ_Nl.h: Likewise. * lib/unictype/categ_No.h: Likewise. * lib/unictype/categ_P.h: Likewise. * lib/unictype/categ_Pd.h: Likewise. * lib/unictype/categ_Po.h: Likewise. * lib/unictype/categ_S.h: Likewise. * lib/unictype/categ_Sc.h: Likewise. * lib/unictype/categ_So.h: Likewise. * lib/unictype/categ_of.h: Likewise. * lib/unictype/combining.h: Likewise. * lib/unictype/ctype_alnum.h: Likewise. * lib/unictype/ctype_alpha.h: Likewise. * lib/unictype/ctype_graph.h: Likewise. * lib/unictype/ctype_lower.h: Likewise. * lib/unictype/ctype_print.h: Likewise. * lib/unictype/ctype_punct.h: Likewise. * lib/unictype/ctype_upper.h: Likewise. * lib/unictype/decdigit.h: Likewise. * lib/unictype/digit.h: Likewise. 
* lib/unictype/numeric.h: Likewise. * lib/unictype/pr_alphabetic.h: Likewise. * lib/unictype/pr_bidi_arabic_digit.h: Likewise. * lib/unictype/pr_bidi_eur_num_terminator.h: Likewise. * lib/unictype/pr_bidi_european_digit.h: Likewise. * lib/unictype/pr_bidi_hebrew_right_to_left.h: Likewise. * lib/unictype/pr_bidi_left_to_right.h: Likewise. * lib/unictype/pr_bidi_non_spacing_mark.h: Likewise. * lib/unictype/pr_bidi_other_neutral.h: Likewise. * lib/unictype/pr_combining.h: Likewise. * lib/unictype/pr_composite.h: Likewise. * lib/unictype/pr_currency_symbol.h: Likewise. * lib/unictype/pr_dash.h: Likewise. * lib/unictype/pr_decimal_digit.h: Likewise. * lib/unictype/pr_deprecated.h: Likewise. * lib/unictype/pr_diacritic.h: Likewise. * lib/unictype/pr_extender.h: Likewise. * lib/unictype/pr_grapheme_base.h: Likewise. * lib/unictype/pr_grapheme_extend.h: Likewise. * lib/unictype/pr_grapheme_link.h: Likewise. * lib/unictype/pr_id_continue.h: Likewise. * lib/unictype/pr_id_start.h: Likewise. * lib/unictype/pr_ideographic.h: Likewise. * lib/unictype/pr_ignorable_control.h: Likewise. * lib/unictype/pr_logical_order_exception.h: Likewise. * lib/unictype/pr_lowercase.h: Likewise. * lib/unictype/pr_numeric.h: Likewise. * lib/unictype/pr_other_alphabetic.h: Likewise. * lib/unictype/pr_punctuation.h: Likewise. * lib/unictype/pr_sentence_terminal.h: Likewise. * lib/unictype/pr_terminal_punctuation.h: Likewise. * lib/unictype/pr_unassigned_code_value.h: Likewise. * lib/unictype/pr_unified_ideograph.h: Likewise. * lib/unictype/pr_uppercase.h: Likewise. * lib/unictype/pr_xid_continue.h: Likewise. * lib/unictype/pr_xid_start.h: Likewise. * lib/unictype/pr_zero_width.h: Likewise. * lib/unictype/scripts.h: Likewise. * lib/unictype/scripts_byname.gperf: Likewise. * lib/unictype/sy_java_ident.h: Likewise. * lib/unigbrk/gbrkprop.h: Likewise. * lib/unilbrk/lbrkprop1.h: Likewise. * lib/unilbrk/lbrkprop2.h: Likewise. * lib/unilbrk/lbrktables.h: Likewise. 
* lib/unilbrk/lbrktables.c (unilbrk_table): Add a row and column for LBP_CP. Implement rule LB30.
* lib/uniwidth/width.c (nonspacing_table_data): Add U+0816..U+0819, U+081B..U+0823, U+0825..U+0827, U+0829..U+082D, U+0900, U+0955, U+109D, U+1A56, U+1A58..U+1A5E, U+1A60, U+1A62, U+1A65..U+1A6C, U+1A73..U+1A7C, U+1A7F, U+1CD0..U+1CD2, U+1CD4..U+1CE0, U+1CE2..U+1CE8, U+1CED, U+1DFD, U+2CEF..U+2CF1, U+A6F0..U+A6F1, U+A8E0..U+A8F1, U+A980..U+A982, U+A9B3, U+A9B6..U+A9B9, U+A9BC, U+AAB0, U+AAB2..U+AAB4, U+AAB7..U+AAB8, U+AABE..U+AABF, U+AAC1, U+ABE5, U+ABE8, U+ABED, U+11080..U+11081, U+110B3..U+110B6, U+110B9..U+110BA, U+110BD.
(uc_width): Return 2 also for unassigned code points of planes 2 and 3.
* lib/uninorm/composition-table.gperf: Regenerated for Unicode 5.2.0.
* lib/uninorm/composition.c (struct composition_rule): Allow for 24 bits instead of 16 bits in the code1 and code2 of each composition rule.
(uc_composition): Update for Unicode 5.2.0.
* lib/uninorm/decomposition-table1.h: Regenerated for Unicode 5.2.0.
* lib/uninorm/decomposition-table2.h: Likewise.
* lib/uniwbrk/wbrkprop.h: Likewise.
* tests/unicase/test-cased.c: Likewise.
* tests/unicase/test-ignorable.c: Likewise.
* tests/unicase/test-uc_tolower.c: Likewise.
* tests/unicase/test-uc_totitle.c: Likewise.
* tests/unicase/test-uc_toupper.c: Likewise.
* tests/unictype/test-categ_C.c: Likewise.
* tests/unictype/test-categ_Cf.c: Likewise.
* tests/unictype/test-categ_Cn.c: Likewise.
* tests/unictype/test-categ_L.c: Likewise.
* tests/unictype/test-categ_Ll.c: Likewise.
* tests/unictype/test-categ_Lm.c: Likewise.
* tests/unictype/test-categ_Lo.c: Likewise.
* tests/unictype/test-categ_Lu.c: Likewise.
* tests/unictype/test-categ_M.c: Likewise.
* tests/unictype/test-categ_Mc.c: Likewise.
* tests/unictype/test-categ_Mn.c: Likewise.
* tests/unictype/test-categ_N.c: Likewise.
* tests/unictype/test-categ_Nd.c: Likewise.
* tests/unictype/test-categ_Nl.c: Likewise.
* tests/unictype/test-categ_No.c: Likewise.
* tests/unictype/test-categ_P.c: Likewise.
* tests/unictype/test-categ_Pd.c: Likewise.
* tests/unictype/test-categ_Po.c: Likewise.
* tests/unictype/test-categ_S.c: Likewise.
* tests/unictype/test-categ_Sc.c: Likewise.
* tests/unictype/test-categ_So.c: Likewise.
* tests/unictype/test-ctype_alnum.c: Likewise.
* tests/unictype/test-ctype_alpha.c: Likewise.
* tests/unictype/test-ctype_graph.c: Likewise.
* tests/unictype/test-ctype_lower.c: Likewise.
* tests/unictype/test-ctype_print.c: Likewise.
* tests/unictype/test-ctype_punct.c: Likewise.
* tests/unictype/test-ctype_upper.c: Likewise.
* tests/unictype/test-decdigit.h: Likewise.
* tests/unictype/test-digit.h: Likewise.
* tests/unictype/test-numeric.h: Likewise.
* tests/unictype/test-pr_alphabetic.c: Likewise.
* tests/unictype/test-pr_bidi_arabic_digit.c: Likewise.
* tests/unictype/test-pr_bidi_eur_num_terminator.c: Likewise.
* tests/unictype/test-pr_bidi_european_digit.c: Likewise.
* tests/unictype/test-pr_bidi_hebrew_right_to_left.c: Likewise.
* tests/unictype/test-pr_bidi_left_to_right.c: Likewise.
* tests/unictype/test-pr_bidi_non_spacing_mark.c: Likewise.
* tests/unictype/test-pr_bidi_other_neutral.c: Likewise.
* tests/unictype/test-pr_combining.c: Likewise.
* tests/unictype/test-pr_composite.c: Likewise.
* tests/unictype/test-pr_currency_symbol.c: Likewise.
* tests/unictype/test-pr_dash.c: Likewise.
* tests/unictype/test-pr_decimal_digit.c: Likewise.
* tests/unictype/test-pr_deprecated.c: Likewise.
* tests/unictype/test-pr_diacritic.c: Likewise.
* tests/unictype/test-pr_extender.c: Likewise.
* tests/unictype/test-pr_grapheme_base.c: Likewise.
* tests/unictype/test-pr_grapheme_extend.c: Likewise.
* tests/unictype/test-pr_grapheme_link.c: Likewise.
* tests/unictype/test-pr_id_continue.c: Likewise.
* tests/unictype/test-pr_id_start.c: Likewise.
* tests/unictype/test-pr_ideographic.c: Likewise.
* tests/unictype/test-pr_ignorable_control.c: Likewise.
* tests/unictype/test-pr_logical_order_exception.c: Likewise.
* tests/unictype/test-pr_lowercase.c: Likewise.
* tests/unictype/test-pr_numeric.c: Likewise.
* tests/unictype/test-pr_other_alphabetic.c: Likewise.
* tests/unictype/test-pr_punctuation.c: Likewise.
* tests/unictype/test-pr_sentence_terminal.c: Likewise.
* tests/unictype/test-pr_terminal_punctuation.c: Likewise.
* tests/unictype/test-pr_unassigned_code_value.c: Likewise.
* tests/unictype/test-pr_unified_ideograph.c: Likewise.
* tests/unictype/test-pr_uppercase.c: Likewise.
* tests/unictype/test-pr_xid_continue.c: Likewise.
* tests/unictype/test-pr_xid_start.c: Likewise.
* tests/unictype/test-pr_zero_width.c: Likewise.
* tests/unigbrk/test-uc-gbrk-prop.h: Likewise.
* tests/unilbrk/test-u8-possible-linebreaks.c (main): Update for changed behaviour: line breaking is now disallowed between a letter or '=' and '('.
* tests/unilbrk/test-u16-possible-linebreaks.c (main): Likewise.
* tests/unilbrk/test-u32-possible-linebreaks.c (main): Likewise.
* tests/unilbrk/test-ulc-possible-linebreaks.c (main): Likewise.
* tests/unilbrk/test-ulc-width-linebreaks.c (main): Likewise.
* tests/uniwidth/test-uc_width2.sh: Same updates as in lib/uniwidth/width.c.
* tests/uninorm/NormalizationTest.txt: Update from Unicode 5.2.0, without comments, but with the original copyright notice.
* lib/unicase/special-casing-table.gperf: Regenerated; only comment changes.
* lib/unictype/categ_Cc.h: Likewise.
* lib/unictype/categ_Co.h: Likewise.
* lib/unictype/categ_Cs.h: Likewise.
* lib/unictype/categ_Lt.h: Likewise.
* lib/unictype/categ_Me.h: Likewise.
* lib/unictype/categ_Pc.h: Likewise.
* lib/unictype/categ_Pe.h: Likewise.
* lib/unictype/categ_Pf.h: Likewise.
* lib/unictype/categ_Pi.h: Likewise.
* lib/unictype/categ_Ps.h: Likewise.
* lib/unictype/categ_Sk.h: Likewise.
* lib/unictype/categ_Sm.h: Likewise.
* lib/unictype/categ_Z.h: Likewise.
* lib/unictype/categ_Zl.h: Likewise.
* lib/unictype/categ_Zp.h: Likewise.
* lib/unictype/categ_Zs.h: Likewise.
* lib/unictype/ctype_blank.h: Likewise.
* lib/unictype/ctype_cntrl.h: Likewise.
* lib/unictype/ctype_digit.h: Likewise.
* lib/unictype/ctype_space.h: Likewise.
* lib/unictype/ctype_xdigit.h: Likewise.
* lib/unictype/mirror.h: Likewise.
* lib/unictype/pr_ascii_hex_digit.h: Likewise.
* lib/unictype/pr_bidi_arabic_right_to_left.h: Likewise.
* lib/unictype/pr_bidi_block_separator.h: Likewise.
* lib/unictype/pr_bidi_boundary_neutral.h: Likewise.
* lib/unictype/pr_bidi_common_separator.h: Likewise.
* lib/unictype/pr_bidi_control.h: Likewise.
* lib/unictype/pr_bidi_embedding_or_override.h: Likewise.
* lib/unictype/pr_bidi_eur_num_separator.h: Likewise.
* lib/unictype/pr_bidi_pdf.h: Likewise.
* lib/unictype/pr_bidi_segment_separator.h: Likewise.
* lib/unictype/pr_bidi_whitespace.h: Likewise.
* lib/unictype/pr_default_ignorable_code_point.h: Likewise.
* lib/unictype/pr_format_control.h: Likewise.
* lib/unictype/pr_hex_digit.h: Likewise.
* lib/unictype/pr_hyphen.h: Likewise.
* lib/unictype/pr_ids_binary_operator.h: Likewise.
* lib/unictype/pr_ids_trinary_operator.h: Likewise.
* lib/unictype/pr_iso_control.h: Likewise.
* lib/unictype/pr_join_control.h: Likewise.
* lib/unictype/pr_left_of_pair.h: Likewise.
* lib/unictype/pr_line_separator.h: Likewise.
* lib/unictype/pr_math.h: Likewise.
* lib/unictype/pr_non_break.h: Likewise.
* lib/unictype/pr_not_a_character.h: Likewise.
* lib/unictype/pr_other_default_ignorable_code_point.h: Likewise.
* lib/unictype/pr_other_grapheme_extend.h: Likewise.
* lib/unictype/pr_other_id_continue.h: Likewise.
* lib/unictype/pr_other_id_start.h: Likewise.
* lib/unictype/pr_other_lowercase.h: Likewise.
* lib/unictype/pr_other_math.h: Likewise.
* lib/unictype/pr_other_uppercase.h: Likewise.
* lib/unictype/pr_paired_punctuation.h: Likewise.
* lib/unictype/pr_paragraph_separator.h: Likewise.
* lib/unictype/pr_pattern_syntax.h: Likewise.
* lib/unictype/pr_pattern_white_space.h: Likewise.
* lib/unictype/pr_private_use.h: Likewise.
* lib/unictype/pr_quotation_mark.h: Likewise.
* lib/unictype/pr_radical.h: Likewise.
* lib/unictype/pr_soft_dotted.h: Likewise.
* lib/unictype/pr_space.h: Likewise.
* lib/unictype/pr_titlecase.h: Likewise.
* lib/unictype/pr_variation_selector.h: Likewise.
* lib/unictype/pr_white_space.h: Likewise.
* lib/unictype/sy_c_ident.h: Likewise.
* lib/unictype/sy_c_whitespace.h: Likewise.
* lib/unictype/sy_java_whitespace.h: Likewise.
* modules/uni*/*: Bump version number of expected libunistring version.
Reported by Simon Josefsson.
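The composition-rule entries in this commit widen code1 and code2 from 16 to 24 bits, which is what lets a rule name code points above U+FFFF. A minimal sketch of that idea follows, assuming a 6-byte big-endian key layout; the helper names pack_pair and unpack_code are hypothetical and are not taken from lib/uninorm/composition.c, and the sample code points are chosen only because they lie above U+FFFF.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not gnulib code): store two code points in a
   6-byte key, 3 bytes (24 bits) each, most significant byte first.
   The byte order is an assumption made for this sketch.  */
static void
pack_pair (uint32_t code1, uint32_t code2, unsigned char key[6])
{
  /* 24 bits cover the whole Unicode range, U+0000..U+10FFFF.  */
  assert (code1 <= 0xFFFFFF && code2 <= 0xFFFFFF);
  key[0] = (code1 >> 16) & 0xFF;
  key[1] = (code1 >> 8) & 0xFF;
  key[2] = code1 & 0xFF;
  key[3] = (code2 >> 16) & 0xFF;
  key[4] = (code2 >> 8) & 0xFF;
  key[5] = code2 & 0xFF;
}

/* Hypothetical helper: read back one 24-bit code point from 3 bytes.  */
static uint32_t
unpack_code (const unsigned char *p)
{
  return ((uint32_t) p[0] << 16) | ((uint32_t) p[1] << 8) | p[2];
}
```

Calling pack_pair (0x11099, 0x110BA, key) and then unpack_code (key) and unpack_code (key + 3) returns the two inputs unchanged, whereas a 16-bit field would have truncated both.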
* Add more tests.  Bruno Haible  2010-01-10  1  -2/+51
|