author | Karl Williamson <public@khwilliamson.com> | 2010-10-14 19:36:04 -0600 |
---|---|---|
committer | Father Chrysostomos <sprout@cpan.org> | 2010-10-21 05:54:20 -0700 |
commit | 53d84487fbdd2060c1a666eacaef6e34ce4a1483 (patch) | |
tree | 763a20d52bc4763c68af086590ec76e4d05be95e /lib/charnames.t | |
parent | 969a34cc81e0088cad802e53b05376b1280dedbb (diff) | |
charnames::viacode returning less correct name
There are several cases where more than one name is valid for a code
point. This usually happens when the original name was published with a
typo in it. It's best for viacode to return the revised name, though
the original remains valid.
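As an illustration of such a pair, the sketch below uses U+1D0C5, the code point that appears in the test file touched by this commit. The misspelled original name (FHTORA) is not quoted in this commit message; it is supplied here from the Unicode character database as an assumption, and the behavior shown assumes a perl that includes this change, where vianame returns the code point as a number.

```perl
use strict;
use warnings;
use charnames ':full';

# Corrected name (appears in lib/charnames.t) and the originally published
# name with the typo (assumed here from the Unicode data, not from this commit).
my $corrected = 'BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS';
my $original  = 'BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS';

# Name-to-code lookups: both spellings are expected to stay valid.
my $cp_corrected = charnames::vianame($corrected);
my $cp_original  = charnames::vianame($original);
printf "corrected name -> U+%04X\n", $cp_corrected;
printf "original name  -> U+%04X\n", $cp_original;

# Code-to-name lookup: only one name can come back, and the intent of this
# commit is that it is the corrected one.
my $name = charnames::viacode(0x1D0C5);
print "viacode(0x1D0C5) -> $name\n";
```

The asymmetry is the point: vianame accepts either spelling, while viacode has to pick one.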
The names data is in a table generated by mktables exclusively for
charnames, including vianame (and its kin) and viacode. The fix is to
change mktables to put the more correct name first in the table, so that
it is found first and returned by viacode().
When I originally designed this code, I thought the correct name should
come last in the tables, so someone looping and reading it could just
overwrite the less correct one with the more correct one.
But to save memory, the same table is shared by viacode and
vianame, and vianame has to recognize both names, so both entries
are needed. viacode could do an rindex to find the more correct name,
but experiments showed that this was twice as slow as searching in the
other direction. Therefore, this patch is for speed.
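The ordering argument can be sketched with a toy table. Everything below is hypothetical: the "HEX, tab, NAME, newline" layout, the table contents, and the helper names sketch_viacode and sketch_vianame are assumptions made for illustration, not the actual mktables output or the charnames implementation.

```perl
use strict;
use warnings;

# Hypothetical miniature of the shared table. Each entry is "HEX\tNAME\n",
# and the table starts with "\n" so every entry is preceded by a newline.
# For U+1D0C5 the corrected name is placed before the original one.
my $table = join '',
    "\n",
    "00041\tLATIN CAPITAL LETTER A\n",
    "0005A\tLATIN CAPITAL LETTER Z\n",
    "1D0C5\tBYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS\n",  # corrected
    "1D0C5\tBYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS\n";  # original

# Code -> name: a forward index() stops at the first entry for the code
# point, which is the corrected name because of the table order.
sub sketch_viacode {
    my ($code) = @_;
    my $key   = sprintf "\n%05X\t", $code;
    my $start = index $table, $key;
    return if $start < 0;
    $start += length $key;
    my $end = index $table, "\n", $start;
    return substr $table, $start, $end - $start;
}

# Name -> code: either spelling is found, since both entries are present.
sub sketch_vianame {
    my ($name) = @_;
    my $pos = index $table, "\t$name\n";
    return if $pos < 0;
    # Back up to the newline that precedes this entry's hex field.
    my $line_start = rindex($table, "\n", $pos) + 1;
    return hex substr $table, $line_start, $pos - $line_start;
}

print sketch_viacode(0x1D0C5), "\n";
printf "U+%04X\n", sketch_vianame('BYZANTINE MUSICAL SYMBOL FHTORA SKLIRON CHROMA VASIS');
```

With the corrected entry listed last instead, sketch_viacode would need rindex to reach it, which is the slower direction described above.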
If the tables for vianame and viacode were ever to be split, this patch
could be reverted, if desired, to put things back to the reverse order.
Diffstat (limited to 'lib/charnames.t')
-rw-r--r-- | lib/charnames.t | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/lib/charnames.t b/lib/charnames.t
index 49442665ef..883740e73d 100644
--- a/lib/charnames.t
+++ b/lib/charnames.t
@@ -960,6 +960,14 @@ is("\N{U+1D0C5}", "\N{BYZANTINE MUSICAL SYMBOL FTHORA SKLIRON CHROMA VASIS}");
         # just read) return the same code point.
         test_vianame($i, $hex, $name);
         test_vianame($i, $hex, $names[$i]);
+
+        # Set up so that a test below of this code point will use the alias
+        # instead of the less-correct original. We can't test here that
+        # viacode is correct, because the alias file may contain multiple
+        # aliases for the same code point, and viacode should return only the
+        # final one. So don't do it here; instead rely on the loop below to
+        # pick up the test.
+        $names[$i] = $name;
     }
     close $fh;