author Karl Williamson <khw@cpan.org> 2015-05-07 17:07:16 -0600
committer Karl Williamson <khw@cpan.org> 2015-05-07 17:32:48 -0600
commit 6e31cdd1306e50af630ec6ef415b48d1ad6c978d
tree df474e66da3869c065423a914c61613bc13f98b5 /pod/perlguts.pod
parent a6a7eedc7e11636c834ac840a3a04d5d2931932a
perlguts: Add links to perlunicode
Diffstat (limited to 'pod/perlguts.pod')
-rw-r--r-- pod/perlguts.pod 3
1 files changed, 2 insertions, 1 deletions
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index cd7a512ff6..a58d7ade9d 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -2832,6 +2832,7 @@ C<v194.128>; this continues up to character 191, which is
C<v194.191>. Now we've run out of bits (191 is binary
C<10111111>) so we move on; character 192 is C<v195.128>. And
so it goes on, moving to three bytes at character 2048.
+L<perlunicode/Unicode Encodings> has pictures of how this works.
Assuming you know you're dealing with a UTF-8 string, you can find out
how long the first character in it is with the C<UTF8SKIP> macro:
@@ -2957,7 +2958,7 @@ to support it.
And this isn't the whole story. Starting in Perl v5.12, strings that
aren't encoded in UTF-8 may also be treated as Unicode under various
-conditions.
+conditions (see L<perlunicode/ASCII Rules versus Unicode Rules>).
This is only really a problem for characters whose ordinals are between
128 and 255, and their behavior varies under ASCII versus Unicode rules
in ways that your code cares about (see L<perlunicode/The "Unicode Bug">).