author Juerd Waalboer <#####@juerd.nl> 2007-03-04 17:00:19 +0100
committer H.Merijn Brand <h.m.brand@xs4all.nl> 2007-03-07 13:23:23 +0000
commit 2575c402a8f9be55f848bdfb219afbf912c50ac1
tree c21a19c42deaa2dba098c38d74338a7c01328c28 /pod/perluniintro.pod
parent 2a6a970fa1b36c99c83fd3fdd48253c1b567db9b
Re: [PATCH] (Re: [PATCH] unicode/utf8 pod)
Message-ID: <20070304150019.GN4723@c4.convolution.nl>
p4raw-id: //depot/perl@30493
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- pod/perluniintro.pod 22
1 files changed, 4 insertions, 18 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index b0d5859065..9337e5f919 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -278,21 +278,7 @@ encodings, I/O, and certain special cases:
When you combine legacy data and Unicode the legacy data needs
to be upgraded to Unicode. Normally ISO 8859-1 (or EBCDIC, if
-applicable) is assumed. You can override this assumption by
-using the C<encoding> pragma, for example
-
- use encoding 'latin2'; # ISO 8859-2
-
-in which case literals (string or regular expressions), C<chr()>,
-and C<ord()> in your whole script are assumed to produce Unicode
-characters from ISO 8859-2 code points. Note that the matching for
-encoding names is forgiving: instead of C<latin2> you could have
-said C<Latin 2>, or C<iso8859-2>, or other variations. With just
-
- use encoding;
-
-the environment variable C<PERL_ENCODING> will be consulted.
-If that variable isn't set, the encoding pragma will fail.
+applicable) is assumed.
The C<Encode> module knows about many encodings and has interfaces
for doing conversions between those encodings:
@@ -404,8 +390,8 @@ the file "text.utf8", encoded as UTF-8:
while (<$nihongo>) { print $unicode $_ }
The naming of encodings, both by the C<open()> and by the C<open>
-pragma, is similar to the C<encoding> pragma in that it allows for
-flexible names: C<koi8-r> and C<KOI8R> will both be understood.
+pragma allows for flexible names: C<koi8-r> and C<KOI8R> will both be
+understood.
Common encodings recognized by ISO, MIME, IANA, and various other
standardisation organisations are recognised; for a more detailed
@@ -885,7 +871,7 @@ to UTF-8 bytes and back, the code works even with older Perl 5 versions.
=head1 SEE ALSO
-L<perlunicode>, L<Encode>, L<encoding>, L<open>, L<utf8>, L<bytes>,
+L<perlunitut>, L<perlunicode>, L<Encode>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlrun>, L<Unicode::Collate>, L<Unicode::Normalize>,
L<Unicode::UCD>