author | Karl Williamson <khw@cpan.org> | 2015-09-02 18:10:11 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2015-09-02 18:23:39 -0600 |
commit | c229c17811a602799827802511e046065bf0259b (patch) | |
tree | f9f1f7ec7c43c0988989aa5911e8823de69d4e9b /utfebcdic.h | |
parent | b5804ad61d2f8076a2596d0fbc58e267d1e5e072 (diff) | |
utfebcdic.h: Clarify comment
Diffstat (limited to 'utfebcdic.h')
-rw-r--r-- | utfebcdic.h | 10 |
1 files changed, 6 insertions, 4 deletions
diff --git a/utfebcdic.h b/utfebcdic.h
index 39eb30cdc5..d9e1402ce2 100644
--- a/utfebcdic.h
+++ b/utfebcdic.h
@@ -11,10 +11,12 @@
  * http://www.unicode.org/unicode/reports/tr16
  *
  * To summarize, the way it works is:
- * To convert an EBCDIC character to UTF-EBCDIC:
- *  1) convert to Unicode.  The table in the generated file 'ebcdic_tables.h'
- *     that does this for EBCDIC bytes is PL_e2a (with inverse PL_a2e).  The
- *     'a' stands for ASCII platform, meaning latin1.
+ * To convert an EBCDIC code point to UTF-EBCDIC:
+ *  1) convert to Unicode.  No conversion is necesary for code points above
+ *     255, as Unicode and EBCDIC are identical in this range.  For smaller
+ *     code points, the conversion is done by lookup in the PL_e2a table (with
+ *     inverse PL_a2e) in the generated file 'ebcdic_tables.h'.  The 'a'
+ *     stands for ASCII platform, meaning 0-255 Unicode.
  *  2) convert that to a utf8-like string called I8 ('I' stands for
  *     intermediate) with variant characters occupying multiple bytes.  This
  *     step is similar to the utf8-creating step from Unicode, but the details
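
Step 1 of the revised comment (code points above 255 pass through unchanged, smaller ones go through a 256-entry lookup) can be sketched roughly as follows. This is an illustration only, not code from the Perl source: `demo_fill_e2a` and `demo_native_to_unicode` are hypothetical names, and the demo table is a stand-in for `PL_e2a` that sets only a handful of entries assumed to match the common EBCDIC code pages 037 and 1047. All other entries are left as identity placeholders and do not reflect any real code page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PL_e2a. The real table is generated into
 * 'ebcdic_tables.h'; this one only fills in a few well-known entries. */
static void demo_fill_e2a(uint8_t table[256])
{
    /* Placeholder identity mapping: NOT a real EBCDIC code page. */
    for (int i = 0; i < 256; i++)
        table[i] = (uint8_t)i;

    /* A few entries assumed to hold for code pages 037 and 1047. */
    table[0x40] = 0x20;   /* space */
    table[0x81] = 0x61;   /* 'a'   */
    table[0xC1] = 0x41;   /* 'A'   */
    table[0xF0] = 0x30;   /* '0'   */
}

/* Step 1 only: map a native (EBCDIC) code point to Unicode.
 * Above 255 the two are identical, so the value passes through;
 * otherwise it is looked up in the 256-entry table. */
static uint32_t demo_native_to_unicode(uint32_t cp, const uint8_t e2a[256])
{
    assert(e2a != NULL);
    if (cp > 255)
        return cp;
    return e2a[cp];
}
```

With the demo table, `demo_native_to_unicode(0xC1, table)` yields 0x41, while a code point such as 0x263A is returned unchanged. The later steps (building the I8 string and the final byte permutation) are not covered by this sketch.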