author Jarkko Hietaniemi <jhi@iki.fi> 2002-03-26 01:19:57 +0000
committer Jarkko Hietaniemi <jhi@iki.fi> 2002-03-26 01:19:57 +0000
commit 3990cdf50f04b3556c0bb3f25d178926ef5d1117
tree 1d0d2b15d5657960eab42b5654598319b9fb3666 /pod/perlunicode.pod
parent 2ac72d6ee10eac553987a271a333c11a24d55989
Mention the effect of Unicode keys on hashes.
p4raw-id: //depot/perl@15507
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- pod/perlunicode.pod 13
1 files changed, 13 insertions, 0 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 9ba32ee3e0..dd2a896224 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -137,6 +137,19 @@ This works for all characters that have names.
=item *
+If Unicode is used in hash keys, there is a subtle effect on the hashes.
+The hash becomes "Unicode-sticky" so that keys retrieved from the hash
+(either by %hash, each %hash, or keys %hash) will be in Unicode, not
+in bytes, even when the keys were bytes when they "went in". This
+"stickiness" persists unless the hash is completely emptied, either by
+using delete(), by clearing it with undef(), or by assigning an empty list
+to the hash. Most of the time this difference is negligible, but
+there are a few places where it matters: for example the regular
+expression character classes like C<\w> behave differently for
+bytes and characters.
+
+=item *
+
If an appropriate L<encoding> is specified, identifiers within the
Perl script may contain Unicode alphanumeric characters, including
ideographs. (You are currently on your own when it comes to using the