author: Karl Williamson <khw@khw-desktop.(none)> 2009-12-24 22:54:58 -0700
committer: Abigail <abigail@abigail.be> 2009-12-25 10:07:41 +0100
commit: e1b711dac329baf9cf4ea3e4628e6c713e24b342
tree: b12ce1b41c2d6c0582296ddad541efd2ae3f71e2 /pod/perlebcdic.pod
parent: 27bca3226281a592aed848b7e68ea50f27381dac
Update .pods
Signed-off-by: Abigail <abigail@abigail.be>
Diffstat (limited to 'pod/perlebcdic.pod')
-rw-r--r-- pod/perlebcdic.pod 20
1 files changed, 12 insertions, 8 deletions
diff --git a/pod/perlebcdic.pod b/pod/perlebcdic.pod
index 962244ae2c..28d47b9551 100644
--- a/pod/perlebcdic.pod
+++ b/pod/perlebcdic.pod
@@ -11,6 +11,10 @@ than some discussion of UTF-8 and UTF-EBCDIC.
Portions that are still incomplete are marked with XXX.
+Perl used to work on EBCDIC machines, but there are now areas of the code where
+it doesn't. If you want to use Perl on an EBCDIC machine, please let us know
+by sending mail to perlbug@perl.org
+
=head1 COMMON CHARACTER CODE SETS
=head2 ASCII
@@ -57,13 +61,13 @@ also known as CCSID 819 (or sometimes 0819 or even 00819).
=head2 EBCDIC
The Extended Binary Coded Decimal Interchange Code refers to a
-large collection of slightly different single and multi byte
-coded character sets that are different from ASCII or ISO 8859-1
-and typically run on host computers. The EBCDIC encodings derive
-from 8 bit byte extensions of Hollerith punched card encodings.
-The layout on the cards was such that high bits were set for the
-upper and lower case alphabet characters [a-z] and [A-Z], but there
-were gaps within each Latin alphabet range.
+large collection of single and multi byte coded character sets that are
+different from ASCII or ISO 8859-1 and are all slightly different from each
+other; they typically run on host computers. The EBCDIC encodings derive from
+8 bit byte extensions of Hollerith punched card encodings. The layout on the
+cards was such that high bits were set for the upper and lower case alphabet
+characters [a-z] and [A-Z], but there were gaps within each Latin alphabet
+range.
Some IBM EBCDIC character sets may be known by character code set
identification numbers (CCSID numbers) or code page numbers. Leading
@@ -160,7 +164,7 @@ value when encoded as when not.
mentioned above.)
For example, the ordinal value of 'A' is 193 in most EBCDIC code pages,
and also is 193 when encoded in UTF-EBCDIC.
-All other code points occupy at least two bytes when encoded.
+All variant code points occupy at least two bytes when encoded.
In UTF-8, the code points corresponding to the lowest 128
ordinal numbers (0 - 127: the ASCII characters) are invariant.
In UTF-EBCDIC, there are 160 invariant characters.