Update the Unicode vs EBCDIC situation.

p4raw-id: //depot/perl@15313
author: Jarkko Hietaniemi <jhi@iki.fi> 2002-03-19 04:58:22 +0000
committer: Jarkko Hietaniemi <jhi@iki.fi> 2002-03-19 04:58:22 +0000
commit: 64c66fb6d001b6ad9c6dcec93084b647d4c6eb13 (patch)
tree: a4f708a0a0e71b216bf952783da5418c7b9b9495 /pod/perlebcdic.pod
parent: c3939953e4b292b4e1d7b8bbc60cede4dc14fcaf (diff)
download: perl-64c66fb6d001b6ad9c6dcec93084b647d4c6eb13.tar.gz
1 files changed, 20 insertions, 0 deletions
diff --git a/pod/perlebcdic.pod b/pod/perlebcdic.pod
index 6339bb4c2e..0053d91a38 100644
--- a/pod/perlebcdic.pod
+++ b/pod/perlebcdic.pod
@@ -97,6 +97,26 @@ for VM/ESA.  CCSID 1047 differs from CCSID 0037 in eight places.
 The EBCDIC code page in use on Siemens' BS2000 system is distinct from
 1047 and 0037.  It is identified below as the POSIX-BC set.
 
+=head2 Unicode code points versus EBCDIC code points
+
+In Unicode terminology a I<code point> is the number assigned to a
+character: for example, in EBCDIC the character "A" is usually assigned
+the number 193.  In Unicode the character "A" is assigned the number 65.
+This causes a problem with the semantics of the pack/unpack "U", which
+are supposed to pack Unicode code points to characters and back to numbers.
+The problem is: which code points to use for code points less than 256?
+(For 256 and over there's no problem: Unicode code points are used.)
+In EBCDIC, for the low 256 the EBCDIC code points are used.  This
+means that the equivalences
+
+	pack("U", ord($character)) eq $character
+	unpack("U", $character) == ord $character
+
+will hold.  (If Unicode code points were applied consistently over
+all the possible code points, pack("U",ord("A")) would in EBCDIC
+equal I<A with acute> or chr(101), and unpack("U", "A") would equal
+65, or I<non-breaking space>, not 193, or ord "A".)
+
 =head2 Unicode and UTF
 
 UTF is a Unicode Transformation Format.  UTF-8 is a Unicode conforming
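
Note on the added section: the numbers it quotes (193, 101, 65, non-breaking space) can be checked from an ASCII platform with a short sketch like the one below. This is not part of the patch. It assumes the core Encode module's "cp1047" table is an adequate stand-in for the native code page of an EBCDIC perl, and the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumption: run on an ASCII platform; Encode's "cp1047" table
# stands in for the native code page of an EBCDIC perl.

# EBCDIC code point of the character "A".
my $ebcdic_A = ord(encode("cp1047", "A"));

# Unicode code point 193 is "A with acute"; find where that
# character sits in the EBCDIC table.
my $acute_in_ebcdic = ord(encode("cp1047", chr(193)));

# EBCDIC code point 65 read as an EBCDIC byte; find the Unicode
# code point of the character it represents.
my $ebcdic_65_as_unicode = ord(decode("cp1047", chr(65)));

printf "EBCDIC code point of A:        %d\n", $ebcdic_A;
printf "EBCDIC code point of U+00C1:   %d\n", $acute_in_ebcdic;
printf "Unicode code point of EBCDIC 65: U+%04X\n", $ebcdic_65_as_unicode;

# The equivalences from the patch; these hold on the platform
# the script is run on.
my $character = "A";
my $pack_ok   = pack("U", ord($character)) eq $character;
my $unpack_ok = unpack("U", $character) == ord $character;
print "pack/unpack equivalences hold\n" if $pack_ok && $unpack_ok;
```

If the three lookups come back as 193, 101 and U+00A0 (non-breaking space), they match the figures in the parenthetical remark at the end of the new section.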