author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-12-16 14:39:34 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-12-16 14:39:34 +0000 |
commit | 4ef28c72ed49aa6b9d3f54cb581962eceee8c546 | |
tree | 14b35e251d5bd09fafaaf1ebd63d1302b91d0a90 /lib | |
parent | d8d29d4f846a79d680ffc51bd1c4737df419932b | |
More documentation for the encoding pragma.
p4raw-id: //depot/perl@13719
Diffstat (limited to 'lib')
-rw-r--r-- | lib/encoding.pm | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/lib/encoding.pm b/lib/encoding.pm
index 4938bfd350..642726da7e 100644
--- a/lib/encoding.pm
+++ b/lib/encoding.pm
@@ -77,6 +77,13 @@ since the C<\xDF> on the left will B<not> be upgraded to C<\x{3af}>
 because of the C<\x{100}> on the left. You should not be mixing your
 legacy data and Unicode in the same string.
 
+This pragma also affects encoding of the 0x80..0xFF code point range:
+normally characters in that range are left as eight-bit bytes (unless
+they are combined with characters with code points 0x100 or larger,
+in which case all characters need to become UTF-8 encoded), but if
+the C<encoding> pragma is present, even the 0x80..0xFF range always
+gets UTF-8 encoded.
+
 If no encoding is specified, the environment variable L<PERL_ENCODING>
 is consulted. If that fails, "latin1" (ISO 8859-1) is assumed.
 If no encoding can be found, C<Unknown encoding '...'> error will be thrown.
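
The storage difference described in the added paragraph can be sketched in plain Perl. This is only an illustration, not part of the commit: internal_bytes is a hypothetical helper, and utf8::upgrade is used as a stand-in for the effect the documentation attributes to the pragma, because the C<encoding> pragma itself was deprecated in later Perl releases and may not behave as documented here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of encoding.pm): return the octets Perl
# stores internally for a string, as space-separated hex.
sub internal_bytes {
    my ($str) = @_;    # work on a copy so the caller's string is untouched
    # For an upgraded string the internal form is UTF-8, so encoding the
    # copy exposes exactly those octets. A non-upgraded string is already
    # stored one byte per character.
    utf8::encode($str) if utf8::is_utf8($str);
    return join ' ', map { sprintf '%02X', ord($_) } split //, $str;
}

# A lone character in 0x80..0xFF normally stays a single eight-bit byte.
my $legacy = "\xDF";

# Combined with a code point of 0x100 or larger, every character in the
# string becomes UTF-8 encoded.
my $mixed = "\xDF\x{100}";

# Assumption: utf8::upgrade mimics the "always UTF-8 encoded" storage that
# the documentation describes for strings under the pragma.
my $upgraded = "\xDF";
utf8::upgrade($upgraded);

print internal_bytes($legacy),   "\n";    # DF
print internal_bytes($mixed),    "\n";    # C3 9F C4 80
print internal_bytes($upgraded), "\n";    # C3 9F
```

In all three cases the character U+00DF is the same; only its internal storage changes, which is why the surrounding documentation warns against mixing legacy data and Unicode in one string.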