EBCDIC has the Unicode bug too

We have not had a working modern Perl on EBCDIC for some years. When I started out, comments and code led me to conclude, erroneously, that Perl on EBCDIC natively supported semantics for all 256 characters 0-255. It turns out that I was wrong: natively (at least on some platforms) it has the same rules (essentially none) for the characters that don't correspond to ASCII ones as ASCII platforms have for their non-ASCII characters. This commit is documentation only, mostly just removing the special mentions of EBCDIC.
author: Karl Williamson <public@khwilliamson.com> 2013-03-11 21:13:38 -0600
committer: Karl Williamson <public@khwilliamson.com> 2013-03-11 21:21:03 -0600
commit: 4b9734bf16232aac75ed56df6352c09d1caad7b3 (patch)
tree: b1fe580a02d6ae63b54c758d033c9722519e86b5 /handy.h
parent: 020c4f9110283940e8755ca2f70f6e943b42efe3 (diff)
download: perl-4b9734bf16232aac75ed56df6352c09d1caad7b3.tar.gz
1 files changed, 9 insertions, 13 deletions
diff --git a/handy.h b/handy.h
index a65523e492..a969d1a2ec 100644
--- a/handy.h
+++ b/handy.h
@@ -489,19 +489,15 @@ Perl rules.  If the input is a number that doesn't fit in an octet, FALSE is
 always returned.
 
 Variant C<isFOO_A> (e.g., C<isALPHA_A()>) will return TRUE only if the input is
-also in the ASCII character set.  For ASCII platforms, the base function with
-no suffix and the one with the C<_A> suffix are identical.  On EBCDIC
-platforms, the C<_A> suffix function will not return true unless the specified
-character also has an ASCII equivalent.
-
-Variant C<isFOO_L1> operates on the full Latin1 character set.  For EBCDIC
-platforms, the base function with no suffix and the one with the C<_L1> suffix
-are identical.  For ASCII platforms, the C<_L1> suffix imposes the Latin-1
-character set onto the platform.  That is, the code points that are ASCII are
-unaffected, since ASCII is a subset of Latin-1.  But the non-ASCII code points
-are treated as if they are Latin-1 characters.  For example, C<isSPACE_L1()>
-will return true when called with the code point 0xA0, which is the Latin-1
-NO-BREAK SPACE.
+also in the ASCII character set.  The base function with no suffix and the one
+with the C<_A> suffix are identical.
+
+Variant C<isFOO_L1> imposes the Latin-1 (or EBCDIC equivalent) character set
+onto the platform.  That is, the code points that are ASCII are unaffected,
+since ASCII is a subset of Latin-1.  But the non-ASCII code points are treated
+as if they are Latin-1 characters.  For example, C<isWORDCHAR_L1()> will return
+true when called with the code point 0xDF, which is a word character in both
+ASCII and EBCDIC (though it represents different characters in each).
 
 Variant C<isFOO_uni> is like the C<isFOO_L1> variant, but accepts any UV code
 point as input.  If the code point is larger than 255, Unicode rules are used
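The distinction the revised documentation draws between the C<_A> and C<_L1> variants can be illustrated with a small standalone sketch. It does not use the real handy.h macros: sketch_isWORDCHAR_A and sketch_isWORDCHAR_L1 are hypothetical stand-ins, and the code assumes an ASCII platform (on EBCDIC the code point values would differ). The Latin-1 ranges are the letters and letter-like characters above 0x7F.

```c
#include <assert.h>

/* Hypothetical stand-in for the _A variant (NOT the real handy.h macro):
 * true only for ASCII letters, digits, and underscore.
 * Assumes an ASCII platform. */
static int sketch_isWORDCHAR_A(unsigned int cp)
{
    return (cp >= '0' && cp <= '9')
        || (cp >= 'A' && cp <= 'Z')
        || (cp >= 'a' && cp <= 'z')
        || cp == '_';
}

/* Hypothetical stand-in for the _L1 variant: ASCII code points behave as
 * above, and code points 0x80-0xFF are treated as Latin-1 characters.
 * Anything that does not fit in an octet yields false. */
static int sketch_isWORDCHAR_L1(unsigned int cp)
{
    if (cp > 0xFF)
        return 0;
    if (cp < 0x80)
        return sketch_isWORDCHAR_A(cp);
    return cp == 0xAA                     /* FEMININE ORDINAL INDICATOR */
        || cp == 0xB5                     /* MICRO SIGN */
        || cp == 0xBA                     /* MASCULINE ORDINAL INDICATOR */
        || (cp >= 0xC0 && cp <= 0xD6)     /* letters before MULTIPLICATION SIGN */
        || (cp >= 0xD8 && cp <= 0xF6)     /* letters before DIVISION SIGN */
        || (cp >= 0xF8 && cp <= 0xFF);    /* remaining letters */
}

/* Small self-check mirroring the 0xDF example in the documentation. */
static void sketch_selfcheck(void)
{
    assert(sketch_isWORDCHAR_L1(0xDF));   /* word character under Latin-1 rules */
    assert(!sketch_isWORDCHAR_A(0xDF));   /* not in the ASCII range */
    assert(sketch_isWORDCHAR_A('a') && sketch_isWORDCHAR_L1('a'));
}
```

In this sketch the two variants agree on every code point below 0x80 and differ only above it, which is the behavior the C<_A> and C<_L1> paragraphs describe.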