author | Karl Williamson <khw@cpan.org> | 2022-03-17 11:55:54 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2022-03-19 23:17:51 -0600 |
commit | 9153861ee6e7388524ed557a08b6498ce3d8988e (patch) | |
tree | 3f88dc020dc2bfac1787de6690a9325a9d5b2bea /regen/unicode_constants.pl | |
parent | 50b179151ab7c50465ad9dcd16636805978a0ada (diff) | |
download | perl-9153861ee6e7388524ed557a08b6498ce3d8988e.tar.gz |
Directionality pres/abs-ence can mean paired delimiters
Another way Unicode indicates that a character has horizontal
directionality is by adding LEFT or RIGHT to the name of a base
character. Hence we get RIGHT SPEAKER vs just plain SPEAKER.
Presumably this comes about when they didn't consider directionality at
first, and then realized later it was needed.
This commit makes the script look for these kinds of character pairs.
Because the current Unicode version only has this characteristic for
Symbols, and symbols must be included explicitly, no change in what
gets paired ensues. But if you turn on the outputting of characters not
chosen, that list will now include things meeting this new criterion.
Fewer than a handful actually are like this.
Diffstat (limited to 'regen/unicode_constants.pl')
-rw-r--r-- | regen/unicode_constants.pl | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/regen/unicode_constants.pl b/regen/unicode_constants.pl
index 82b59f953f..c777cdc95a 100644
--- a/regen/unicode_constants.pl
+++ b/regen/unicode_constants.pl
@@ -548,11 +548,22 @@ foreach my $list (qw(Punctuation Symbol)) {
                 next CODE_POINT;
             }
 
+            # If no mate was found, it could be that it's like the case of
+            # SPEAKER vs RIGHT SPEAKER (which probably means the mirror was added
+            # in a later version than the original. Check by removing all
+            # directionality and trying to see if there is a character with that
+            # name.
             if (! defined $mirror_code_point) {
+                $mirror =~ s/$directional_re //;
+                $mirror_code_point = charnames::vianame($mirror);
+                if (! defined $mirror_code_point) {
+
+                    # Still no mate.
                 $discards{$code_point} = { reason => $unpaired,
                                            mirror => undef
                                          };
                 next;
+                }
             }
 
             if ($code_point == $mirror_code_point) {
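The fallback lookup this commit adds can be tried in isolation with a small standalone sketch. It is not the code in regen/unicode_constants.pl: the `find_mate` helper, the `%opposite` table, and the simplified `$directional_re` below are hypothetical stand-ins (the real script's regex and surrounding loop are not shown in this diff). Only `charnames::vianame` and `charnames::viacode` are taken from the core `charnames` module. It also assumes a perl whose Unicode tables include RIGHT SPEAKER.

```perl
use strict;
use warnings;
use charnames ();

# Hypothetical, simplified stand-in for the script's $directional_re.
my $directional_re = qr/\b(?:LEFT|RIGHT)\b/;
my %opposite = (LEFT => 'RIGHT', RIGHT => 'LEFT');

# Return the code point of the mate of the character called $name,
# or undef if none can be found.
sub find_mate {
    my ($name) = @_;
    return undef unless $name =~ $directional_re;

    # First try the name with the direction swapped,
    # e.g. LEFT PARENTHESIS -> RIGHT PARENTHESIS.
    (my $mirror = $name) =~ s/\b(LEFT|RIGHT)\b/$opposite{$1}/g;
    my $mirror_code_point = charnames::vianame($mirror);
    return $mirror_code_point if defined $mirror_code_point;

    # No mate was found; fall back to removing the directionality
    # entirely, e.g. LEFT SPEAKER -> SPEAKER.
    $mirror =~ s/$directional_re //;
    return charnames::vianame($mirror);
}

for my $name ('LEFT PARENTHESIS', 'RIGHT SPEAKER', 'SPEAKER') {
    my $mate = find_mate($name);
    printf "%-18s => %s\n", $name,
        defined $mate
            ? sprintf('U+%04X %s', $mate, charnames::viacode($mate))
            : 'no mate found';
}
```

In this sketch a name with no directional word is simply reported as having no mate. The real script instead records unpaired characters in `%discards`.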