author Karl Williamson <khw@cpan.org> 2022-03-17 11:55:54 -0600
committer Karl Williamson <khw@cpan.org> 2022-03-19 23:17:51 -0600
commit 9153861ee6e7388524ed557a08b6498ce3d8988e
tree 3f88dc020dc2bfac1787de6690a9325a9d5b2bea /regen/unicode_constants.pl
parent 50b179151ab7c50465ad9dcd16636805978a0ada
Directionality pres/abs-ence can mean paired delimiters
Another way Unicode indicates that a character has horizontal directionality is by adding LEFT or RIGHT to the name of a base character. Hence we get RIGHT SPEAKER vs just plain SPEAKER. Presumably this comes about when they didn't consider directionality at first, and then realized later it was needed. This commit makes the script look for these kinds of character pairs.

Because the current Unicode version has this characteristic only for Symbols, and symbols must be included explicitly, no change in what gets paired ensues. But if you turn on the outputting of characters not chosen, that list will now include the characters that meet this new criterion. Fewer than a handful actually are like this.
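The fallback lookup described above can be sketched in isolation roughly as follows. This is a minimal illustration, not the script's actual code: `$direction_re` is a hypothetical stand-in for the script's `$directional_re` (whose definition is not part of this diff and may cover more words), and `undirected_mate` is a helper name invented for the example. Only `charnames::vianame` and `charnames::viacode` are real API.

```perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical stand-in for the script's $directional_re; the real
# pattern is not shown in this diff and may match more words.
my $direction_re = qr/\b(?:LEFT|RIGHT)\b/;

# Hypothetical helper: drop the first directional word (and the space
# after it) from a character name, then look up the remaining name.
# Returns the code point of the undirected character, or undef when the
# name has no directional word or no character bears the stripped name.
sub undirected_mate {
    my ($name) = @_;
    my $base = $name;
    return unless $base =~ s/$direction_re //;
    return charnames::vianame($base);
}

my $mate = undirected_mate("RIGHT SPEAKER");
printf "U+%04X %s\n", $mate, charnames::viacode($mate) if defined $mate;
```

In the script itself this step runs only after the ordinary search for a mirrored name has already failed, so it acts as a last resort before the code point is discarded as unpaired.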
Diffstat (limited to 'regen/unicode_constants.pl')
-rw-r--r-- regen/unicode_constants.pl 11
1 files changed, 11 insertions, 0 deletions
diff --git a/regen/unicode_constants.pl b/regen/unicode_constants.pl
index 82b59f953f..c777cdc95a 100644
--- a/regen/unicode_constants.pl
+++ b/regen/unicode_constants.pl
@@ -548,11 +548,22 @@ foreach my $list (qw(Punctuation Symbol)) {
next CODE_POINT;
}
+ # If no mate was found, it could be that it's like the case of
+ # SPEAKER vs RIGHT SPEAKER (which probably means the mirror was added
+ # in a later version than the original). Check by removing all
+ # directionality and trying to see if there is a character with that
+ # name.
if (! defined $mirror_code_point) {
+ $mirror =~ s/$directional_re //;
+ $mirror_code_point = charnames::vianame($mirror);
+ if (! defined $mirror_code_point) {
+
+ # Still no mate.
$discards{$code_point} = { reason => $unpaired,
mirror => undef
};
next;
+ }
}
if ($code_point == $mirror_code_point) {