author | Jarkko Hietaniemi <jhi@iki.fi> | 2002-02-14 22:13:18 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2002-02-14 22:13:18 +0000 |
commit | 1eda90df89aea387a3d959817af94a372619a3af (patch) | |
tree | 8209ac9ceb51c20e64e85967225c65361cf119f4 /pod/perluniintro.pod | |
parent | 90133b69afb5dccc00b1483d3839904e458ba347 (diff) | |
Document pack U0U.
p4raw-id: //depot/perl@14696
Diffstat (limited to 'pod/perluniintro.pod')
-rw-r--r-- | pod/perluniintro.pod | 16 |
1 files changed, 14 insertions, 2 deletions
diff --git a/pod/perluniintro.pod b/pod/perluniintro.pod
index 5f2f34031c..ee900bba1c 100644
--- a/pod/perluniintro.pod
+++ b/pod/perluniintro.pod
@@ -221,6 +221,17 @@ Note that both C<\x{...}> and C<\N{...}> are compile-time string
 constants: you cannot use variables in them. if you want similar
 run-time functionality, use C<chr()> and C<charnames::vianame()>.
 
+Also note that if all the code points for pack "U" are below 0x100,
+bytes will be generated, just like if you were using C<chr()>.
+
+    my $bytes = pack("U*", 0x80, 0xFF);
+
+If you want to force the result to Unicode characters, use the special
+C<"U0"> prefix. It consumes no arguments but forces the result to be
+in Unicode characters, instead of bytes.
+
+    my $chars = pack("U0U*", 0x80, 0xFF);
+
 =head2 Handling Unicode
 
 Handling Unicode is for the most part transparent: just use the
@@ -611,8 +622,9 @@ For UTF-8 only, you can use:
 
 If invalid, a C<Malformed UTF-8 character (byte 0x##) in
 unpack> is produced. The "U0" means "expect strictly UTF-8
-encoded Unicode". Without that the C<unpack("U*", ...)>
-would accept also data like C<chr(0xFF>).
+encoded Unicode". Without that the C<unpack("U*", ...)>
+would accept also data like C<chr(0xFF>), similarly to the
+C<pack> as we saw earlier.
 
 =item How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa?
 
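
To try the two templates from the first hunk, a small script along these lines may help. It is only a sketch: `describe` is a hypothetical helper invented for this illustration, not part of the patch, and it assumes a Perl recent enough to provide `utf8::is_utf8`. The patch describes `pack` as it behaved when the commit was made; later releases may treat a leading `U` differently, so the script prints what it observes rather than asserting the byte-versus-character distinction.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): summarise a string by its
# length in characters, its code points, and Perl's internal UTF-8 flag.
sub describe {
    my ($label, $str) = @_;
    my $points = join ' ', map { sprintf('0x%02X', ord($_)) } split(//, $str);
    my $flag   = utf8::is_utf8($str) ? 'on' : 'off';
    return sprintf('%-5s length=%d points=[%s] utf8-flag=%s',
                   $label, length($str), $points, $flag);
}

# The two calls documented by the patch.
my $plain  = pack("U*",   0x80, 0xFF);   # no prefix
my $forced = pack("U0U*", 0x80, 0xFF);   # with the "U0" prefix

print describe('U*',   $plain),  "\n";
print describe('U0U*', $forced), "\n";
```

Comparing the `utf8-flag` column for the two lines shows whether the `U0` prefix changes the result on the Perl at hand.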