author: Jarkko Hietaniemi <jhi@iki.fi> 2002-02-14 22:13:18 +0000
committer: Jarkko Hietaniemi <jhi@iki.fi> 2002-02-14 22:13:18 +0000
commit: 1eda90df89aea387a3d959817af94a372619a3af
tree: 8209ac9ceb51c20e64e85967225c65361cf119f4 /pod/perluniintro.pod
parent: 90133b69afb5dccc00b1483d3839904e458ba347
Document pack U0U.
p4raw-id: //depot/perl@14696
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- pod/perluniintro.pod | 16
1 files changed, 14 insertions, 2 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 5f2f34031c..ee900bba1c 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -221,6 +221,17 @@ Note that both C<\x{...}> and C<\N{...}> are compile-time string
constants: you cannot use variables in them. if you want similar
run-time functionality, use C<chr()> and C<charnames::vianame()>.
+Also note that if all the code points for pack "U" are below 0x100,
+bytes will be generated, just like if you were using C<chr()>.
+
+ my $bytes = pack("U*", 0x80, 0xFF);
+
+If you want to force the result to Unicode characters, use the special
+C<"U0"> prefix. It consumes no arguments but forces the result to be
+in Unicode characters, instead of bytes.
+
+ my $chars = pack("U0U*", 0x80, 0xFF);
+
=head2 Handling Unicode
Handling Unicode is for the most part transparent: just use the
@@ -611,8 +622,9 @@ For UTF-8 only, you can use:
If invalid, a C<Malformed UTF-8 character (byte 0x##) in
unpack> is produced. The "U0" means "expect strictly UTF-8
-encoded Unicode". Without that the C<unpack("U*", ...)>
-would accept also data like C<chr(0xFF>).
+encoded Unicode". Without that the C<unpack("U*", ...)>
+would accept also data like C<chr(0xFF>), similarly to the
+C<pack> as we saw earlier.
=item How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa?