author Jarkko Hietaniemi <jhi@iki.fi> 2001-12-16 14:39:34 +0000
committer Jarkko Hietaniemi <jhi@iki.fi> 2001-12-16 14:39:34 +0000
commit 4ef28c72ed49aa6b9d3f54cb581962eceee8c546
tree 14b35e251d5bd09fafaaf1ebd63d1302b91d0a90 /lib
parent d8d29d4f846a79d680ffc51bd1c4737df419932b
More documentation for the encoding pragma.
p4raw-id: //depot/perl@13719
Diffstat (limited to 'lib')
-rw-r--r-- lib/encoding.pm 7
1 files changed, 7 insertions, 0 deletions
diff --git a/lib/encoding.pm b/lib/encoding.pm
index 4938bfd350..642726da7e 100644
--- a/lib/encoding.pm
+++ b/lib/encoding.pm
@@ -77,6 +77,13 @@ since the C<\xDF> on the left will B<not> be upgraded to C<\x{3af}>
 because of the C<\x{100}> on the left. You should not be mixing your
 legacy data and Unicode in the same string.
 
+This pragma also affects encoding of the 0x80..0xFF code point range:
+normally characters in that range are left as eight-bit bytes (unless
+they are combined with characters with code points 0x100 or larger,
+in which case all characters need to become UTF-8 encoded), but if
+the C<encoding> pragma is present, even the 0x80..0xFF range always
+gets UTF-8 encoded.
+
 If no encoding is specified, the environment variable L<PERL_ENCODING>
 is consulted. If that fails, "latin1" (ISO 8859-1) is assumed. If no
 encoding can be found, C<Unknown encoding '...'> error will be thrown.
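
The paragraph added by this patch contrasts the default handling of the 0x80..0xFF range with the handling under the C<encoding> pragma. The following is a minimal sketch of the default case only (no pragma loaded), assuming a Perl where the core C<utf8::is_utf8> and C<bytes::length> functions are available. The helper name C<storage_of> and the variable names are hypothetical and are not part of the patch.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling byte semantics

# Hypothetical helper: report how Perl stores a string internally.
sub storage_of {
    my ($str) = @_;
    return utf8::is_utf8($str) ? "utf8" : "bytes";
}

my $legacy = "\xDF";             # code point in the 0x80..0xFF range
my $wide   = "\x{100}";          # code point 0x100 or larger
my $mixed  = $legacy . $wide;    # combining the two forces UTF-8 storage

for my $pair ( [ legacy => $legacy ], [ wide => $wide ], [ mixed => $mixed ] ) {
    my ( $name, $str ) = @$pair;
    printf "%-6s storage=%-5s chars=%d octets=%d\n",
        $name, storage_of($str), length($str), bytes::length($str);
}
```

Without the pragma, C<$legacy> stays a single eight-bit byte, while C<$mixed> is stored as UTF-8, so its C<\xDF> takes two octets. According to the added documentation, with the C<encoding> pragma in effect the 0x80..0xFF range would be UTF-8 encoded even when it stands alone.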