author | Audrey Tang <cpan@audreyt.org> | 2003-12-10 04:39:16 +0800 |
---|---|---|
committer | Rafael Garcia-Suarez <rgarciasuarez@gmail.com> | 2003-12-09 21:33:22 +0000 |
commit | 990e18f721a7d2ee48d50ea4262bd5d109e9f89c (patch) | |
tree | 4334f1521da4188b6bebdae304b622e37030f8ca /ext/Encode/encoding.pm | |
parent | 4e2344ada78d8742c0023d545c1baed6597bae39 (diff) | |
download | perl-990e18f721a7d2ee48d50ea4262bd5d109e9f89c.tar.gz |
Implicit upgrading docs
Message-ID: <20031209123915.GA1454@not.autrijus.org>
p4raw-id: //depot/perl@21873
Diffstat (limited to 'ext/Encode/encoding.pm')
-rw-r--r-- | ext/Encode/encoding.pm | 19 |
1 files changed, 19 insertions, 0 deletions
diff --git a/ext/Encode/encoding.pm b/ext/Encode/encoding.pm
index f203cb3d7e..93662524fa 100644
--- a/ext/Encode/encoding.pm
+++ b/ext/Encode/encoding.pm
@@ -192,6 +192,25 @@ not "\x{99F1}\x{99DD} is the symbol of perl.\n".
 
 You can override this by giving extra arguments; see below.
 
+=head2 Implicit upgrading for byte strings
+
+By default, if strings operating under byte semantics and strings
+with Unicode character data are concatenated, the new string will
+be created by decoding the byte strings as I<ISO 8859-1 (Latin-1)>.
+
+The B<encoding> pragma changes this to use the specified encoding
+instead. For example:
+
+    use encoding 'utf8';
+    my $string = chr(20000); # a Unicode string
+    utf8::encode($string); # now it's a UTF-8 encoded byte string
+    # concatenate with another Unicode string
+    print length($string . chr(20000));
+
+Will print C<2>, because C<$string> is upgraded as UTF-8. Without
+C<use encoding 'utf8';>, it will print C<4> instead, since C<$string>
+is three octets when interpreted as Latin-1.
+
 =head1 FEATURES THAT REQUIRE 5.8.1
 
 Some of the features offered by this pragma requires perl 5.8.1. Most
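
The arithmetic in the added POD (4 versus 2) can be checked without the pragma. The following is a minimal sketch, not part of the patch: it assumes the core Encode module is available, and the variable names ($char, $bytes, $default, $decoded) are illustrative only. It calls decode() explicitly, which approximates the UTF-8 upgrade the pragma is documented to perform rather than reproducing the pragma's own mechanism.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $char  = chr(20000);   # one Unicode character (U+4E20)
my $bytes = $char;
utf8::encode($bytes);     # now a byte string holding three UTF-8 octets

# Default behaviour: concatenating a byte string with a Unicode string
# upgrades the byte string as Latin-1, so each octet becomes one character.
my $default = $bytes . $char;

# Assumption for illustration: decoding the octets as UTF-8 first stands in
# for what the documentation says happens under "use encoding 'utf8'".
my $decoded = decode('UTF-8', $bytes) . $char;

printf "octets in byte string:     %d\n", length $bytes;     # 3
printf "default (Latin-1) upgrade: %d\n", length $default;   # 4
printf "decoded as UTF-8 first:    %d\n", length $decoded;   # 2
```

The explicit decode() keeps the sketch independent of the pragma, so it should behave the same way on perls where the pragma is unavailable or discouraged.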