Diffstat (limited to 'pod/perlebcdic.pod')
-rw-r--r-- | pod/perlebcdic.pod | 20 |
1 files changed, 12 insertions, 8 deletions
diff --git a/pod/perlebcdic.pod b/pod/perlebcdic.pod
index 962244ae2c..28d47b9551 100644
--- a/pod/perlebcdic.pod
+++ b/pod/perlebcdic.pod
@@ -11,6 +11,10 @@ than some discussion of UTF-8 and UTF-EBCDIC.
 
 Portions that are still incomplete are marked with XXX.
 
+Perl used to work on EBCDIC machines, but there are now areas of the code where
+it doesn't. If you want to use Perl on an EBCDIC machine, please let us know
+by sending mail to perlbug@perl.org
+
 =head1 COMMON CHARACTER CODE SETS
 
 =head2 ASCII
@@ -57,13 +61,13 @@ also known as CCSID 819 (or sometimes 0819 or even 00819).
 =head2 EBCDIC
 
 The Extended Binary Coded Decimal Interchange Code refers to a
-large collection of slightly different single and multi byte
-coded character sets that are different from ASCII or ISO 8859-1
-and typically run on host computers. The EBCDIC encodings derive
-from 8 bit byte extensions of Hollerith punched card encodings.
-The layout on the cards was such that high bits were set for the
-upper and lower case alphabet characters [a-z] and [A-Z], but there
-were gaps within each Latin alphabet range.
+large collection of single and multi byte coded character sets that are
+different from ASCII or ISO 8859-1 and are all slightly different from each
+other; they typically run on host computers. The EBCDIC encodings derive from
+8 bit byte extensions of Hollerith punched card encodings. The layout on the
+cards was such that high bits were set for the upper and lower case alphabet
+characters [a-z] and [A-Z], but there were gaps within each Latin alphabet
+range.
 
 Some IBM EBCDIC character sets may be known by character code set
 identification numbers (CCSID numbers) or code page numbers. Leading
@@ -160,7 +164,7 @@ value when encoded as when not.
 mentioned above.) For example, the ordinal value of 'A' is 193 in most EBCDIC
 code pages, and also is 193 when encoded in UTF-EBCDIC.
 
-All other code points occupy at least two bytes when encoded.
+All variant code points occupy at least two bytes when encoded.
 In UTF-8, the code points corresponding to the lowest 128
 ordinal numbers (0 - 127: the ASCII characters) are invariant.
 In UTF-EBCDIC, there are 160 invariant characters.