author | Karl Williamson <public@khwilliamson.com> | 2012-02-13 09:54:22 -0700 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2012-02-13 10:42:19 -0700 |
commit | 898b2fa7ca685c63900bc063ed519d98af0db541 (patch) | |
tree | 7a0ee33c04f0b5ea307ccaa27329557249bc2796 /lib/charnames.t | |
parent | 1722e378f962c2c0bd9735fe63e69fa95671f5e2 (diff) | |
download | perl-898b2fa7ca685c63900bc063ed519d98af0db541.tar.gz |
mktables: viacode() return unparenthesized names for 4 controls
This commit changes the viacode() returned name for four control characters, as
follows:
Code point | Old Name | New Name |
---|---|---|
U+000A | LINE FEED (LF) | LINE FEED |
U+000C | FORM FEED (FF) | FORM FEED |
U+000D | CARRIAGE RETURN (CR) | CARRIAGE RETURN |
U+0085 | NEXT LINE (NEL) | NEXT LINE |
Only the return from viacode is affected. All the names are accepted as
input, as they always have been.
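
As a rough illustration of the asymmetry described above (input accepts both spellings, output uses only the new one), here is a minimal sketch. It assumes a perl that already contains this change and an ASCII platform; the helper name `viacode_roundtrip` is hypothetical and is not part of charnames.

```perl
use strict;
use warnings;
use charnames ':full';

# Hypothetical helper: look a name up with vianame(), then map the
# resulting code point back to a name with viacode().
sub viacode_roundtrip {
    my ($name) = @_;
    my $code_point = charnames::vianame($name);
    return undef unless defined $code_point;
    return charnames::viacode($code_point);
}

# Old (parenthesized) spelling => new spelling, as listed in the table.
my %old_to_new = (
    'LINE FEED (LF)'       => 'LINE FEED',
    'FORM FEED (FF)'       => 'FORM FEED',
    'CARRIAGE RETURN (CR)' => 'CARRIAGE RETURN',
    'NEXT LINE (NEL)'      => 'NEXT LINE',
);

for my $old (sort keys %old_to_new) {
    my $new = $old_to_new{$old};
    # Both spellings should be accepted as input and name the same character;
    # with this change, viacode() should report only the new spelling.
    printf "%-22s -> %s\n", $old, viacode_roundtrip($old) // '(unknown)';
    printf "%-22s -> %s\n", $new, viacode_roundtrip($new) // '(unknown)';
}
```

Going through vianame() first, rather than hard-coding code points, mirrors the approach taken in the test added by this commit.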
Unicode 6.1 now has official names for all the controls, and the new
names match those. The old names were the ones that were recommended by
TR18 prior to 6.1, and still are, sort of. This change uses the
official names in preference to the TR18 ones. We probably wouldn't
bother except that the old names were problematic--the only names in the
whole universe of names containing parentheses, and not matching
traditional usage. The new names have always been accepted as inputs by
Perl.
I actually doubt that Unicode ever grokked that they were recommending
these ugly names, and they haven't paid much attention to TR18 anyway,
breaking it in version 6.0 by encoding one of the recommended names
(BELL) as an official name for another code point, and without realizing
it. TR18 now is in limbo, still wrongly recommending BELL, with a
rewrite being promised for many months now. It's unclear what will
happen with it.
It was agreed on p5p to go with the cleaner, now official names, instead
of the older, likely obsolete, TR18 names. I did a search of
CPAN; it was unclear whether this change (which, again, is only for viacode())
mattered to any code there or not. There were a few instances of the
old names, but none of those were apparently associated with viacode().
Diffstat (limited to 'lib/charnames.t')
-rw-r--r-- | lib/charnames.t | 13 |
1 files changed, 13 insertions, 0 deletions
diff --git a/lib/charnames.t b/lib/charnames.t
index 9d37daa58c..09a4314905 100644
--- a/lib/charnames.t
+++ b/lib/charnames.t
@@ -313,6 +313,19 @@ is("\N{BOM}", chr(0xFEFF), 'Verify "\N{BOM}" is correct');
 is(charnames::viacode(0xFEFF), "ZERO WIDTH NO-BREAK SPACE",
    'Verify viacode(0xFEFF) is correct');
 
+# These test that the changes to these in 6.1 are recognized. (The double
+# test of using viacode and vianame is less than optimal as two errors could
+# cancel each other out, but later each is tested individually, and this
+# sidesteps and EBCDIC issues.
+is(charnames::viacode(charnames::vianame("CR")), "CARRIAGE RETURN",
+   'Verify viacode(vianame("CR")) is "CARRIAGE RETURN"');
+is(charnames::viacode(charnames::vianame("LF")), "LINE FEED",
+   'Verify viacode(vianame("LF")) is "LINE FEED"');
+is(charnames::viacode(charnames::vianame("FF")), "FORM FEED",
+   'Verify viacode(vianame("FF")) is "FORM FEED"');
+is(charnames::viacode(charnames::vianame("NEL")), "NEXT LINE",
+   'Verify viacode(vianame("NEL")) is "NEXT LINE"');
+
 {
     use warnings;
     cmp_ok(ord("\N{BOM}"), '==', 0xFEFF, 'Verify \N{BOM} is correct');