author | Jarkko Hietaniemi <jhi@iki.fi> | 2002-03-26 01:19:57 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2002-03-26 01:19:57 +0000 |
commit | 3990cdf50f04b3556c0bb3f25d178926ef5d1117 (patch) | |
tree | 1d0d2b15d5657960eab42b5654598319b9fb3666 /pod/perlunicode.pod | |
parent | 2ac72d6ee10eac553987a271a333c11a24d55989 (diff) | |
Mention the effect of Unicode keys on hashes.
p4raw-id: //depot/perl@15507
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- | pod/perlunicode.pod | 13 |
1 files changed, 13 insertions, 0 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 9ba32ee3e0..dd2a896224 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -137,6 +137,19 @@ This works for all characters that have names.
 
 =item *
 
+If Unicode is used in hash keys, there is a subtle effect on the hashes.
+The hash becomes "Unicode-sticky" so that keys retrieved from the hash
+(either by %hash, each %hash, or keys %hash) will be in Unicode, not
+in bytes, even when the keys were bytes went they "went in". This
+"stickiness" persists unless the hash is completely emptied, either by
+using delete() or clearing the with undef() or assigning an empty list
+to the hash. Most of the time this difference is negligible, but
+there are few places where it matters: for example the regular
+expression character classes like C<\w> behave differently for
+bytes and characters.
+
+=item *
+
 If an appropriate L<encoding> is specified, identifiers within the
 Perl script may contain Unicode alphanumeric characters, including
 ideographs. (You are currently on your own when it comes to using the
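
The last sentence of the added paragraph (C<\w> behaving differently for bytes and characters) can be illustrated with a minimal sketch. It assumes a perl that provides utf8::upgrade() and that the regular expression runs under the default rules (no locale, no "unicode_strings" feature, no /u or /a modifier). The variable names are illustrative only. The sketch does not try to reproduce the hash "stickiness" itself, since that behaviour depends on the Perl version in use.

```perl
use strict;
use warnings;

# One character, code point 0xE9 (e with acute), stored as a single byte.
my $bytes = "\xE9";

# The same code point, upgraded to Perl's internal UTF-8 representation.
my $chars = "\xE9";
utf8::upgrade($chars);

# The two strings still compare equal as strings.
my $equal = ($bytes eq $chars) ? 1 : 0;

# Under the default rules, \w uses byte semantics for the first string
# and Unicode semantics for the upgraded one.
my $bytes_match = ($bytes =~ /\w/) ? 1 : 0;
my $chars_match = ($chars =~ /\w/) ? 1 : 0;

printf "equal=%d bytes_match=%d chars_match=%d\n",
    $equal, $bytes_match, $chars_match;
```

Under those assumptions the two strings are expected to compare equal while only the upgraded one matches C<\w>. This is the kind of place where a key coming back from a hash "in Unicode" rather than "in bytes" would be observable.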