author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-11-16 15:26:41 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-11-16 15:26:41 +0000 |
commit | 1d7919c50afce3283e44737a6095660e99d8c972 (patch) | |
tree | e6bdad2f0c1a2ef525ec999575c69c71afd4e543 /pod/perluniintro.pod | |
parent | 9a0edf7c98d243cb5ee8bd7fa3422d7a9fecbc66 (diff) | |
Update perluniintro on the UTF-8 output matters
(that -w will warn unless the stream is explicitly UTF-8-ified).
p4raw-id: //depot/perl@13051
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- | pod/perluniintro.pod | 25 |
1 files changed, 17 insertions, 8 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index cdd0b4075e..cd978d0861 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -236,11 +236,19 @@ for doing conversions between those encodings:
 
 Normally writing out Unicode data
 
-    print chr(0x100), "\n";
+    print FH chr(0x100), "\n";
 
-will print out the raw UTF-8 bytes.
+will print out the raw UTF-8 bytes, but you will get a warning
+out of that if you use C<-w> or C<use warnings>. To avoid the
+warning open the stream explicitly in UTF-8:
 
-But reading in correctly formed UTF-8 data will not magically turn
+    open FH, ">:utf8", "file";
+
+and on already open streams use C<binmode()>:
+
+    binmode(STDOUT, ":utf8");
+
+Reading in correctly formed UTF-8 data will not magically turn
 the data into Unicode in Perl's eyes.
 
 You can use either the C<':utf8'> I/O discipline when opening files
@@ -251,11 +259,11 @@ You can use either the C<':utf8'> I/O discipline when opening files
 The I/O disciplines can also be specified more flexibly with
 the C<open> pragma; see L<open>:
 
-    use open ':utf8'; # input and output will be UTF-8
-    open X, ">utf8";
-    print X chr(0x100), "\n"; # this would have been UTF-8 without the pragma
+    use open ':utf8'; # input and output default discipline will be UTF-8
+    open X, ">file";
+    print X chr(0x100), "\n";
     close X;
-    open Y, "<utf8";
+    open Y, "<file";
     printf "%#x\n", ord(<Y>); # this should print 0x100
     close Y;
 
@@ -329,7 +337,8 @@ by repeatedly encoding it in UTF-8:
     close F;
 
 If you run this code twice, the contents of the F<file> will be twice
-UTF-8 encoded. A C<use open ':utf8'> would have avoided the bug.
+UTF-8 encoded. A C<use open ':utf8'> would have avoided the bug, or
+explicitly opening also the F<file> for input as UTF-8.
 
 =head2 Special Cases
 
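
The behavior this patch documents can be tried with a short round trip. What follows is only a sketch under stated assumptions, not part of the commit: the scratch file name is hypothetical, and lexical filehandles are used in place of the bareword handles (FH, X, Y) that appear in the POD. It writes chr(0x100) through a stream opened with ":utf8", which should avoid the warning mentioned in the commit message, then reads it back through the same discipline.

```perl
use strict;
use warnings;

# Hypothetical scratch file name, used only for this illustration.
my $path = "perluniintro-demo.$$.txt";

# Open the output stream explicitly as UTF-8, so that printing a
# character above 0xFF does not warn under -w / "use warnings".
open(my $out, ">:utf8", $path) or die "cannot write $path: $!";
print {$out} chr(0x100), "\n";
close($out) or die "cannot close $path: $!";

# Read the file back through the same discipline, so the two UTF-8
# bytes are decoded into a single character again.
open(my $in, "<:utf8", $path) or die "cannot read $path: $!";
my $line = <$in>;
close($in) or die "cannot close $path: $!";
unlink $path;

printf "%#x\n", ord($line);    # prints 0x100
```

Opening the input side with "<:utf8" is the point of the last hunk: without it, the first character read would be the byte 0xC4 rather than the character 0x100.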