author: Anton Tagunov <tagunov@motor.ru> 2002-03-04 08:41:41 +0300
committer: Jarkko Hietaniemi <jhi@iki.fi> 2002-03-04 03:46:31 +0000
commit: f1e62f77e429d3d8456955e82037ca65bbe65d82
tree: 30e25107972589aa410bc8477cb7330f3b1bbe0d /pod/perlunicode.pod
parent: d83fe81478642d61ea67b41c692afcf14811c9af
[ID 20020303.006] [Doc][utf8::up/down grade][use encoding] application for clarification
Date: Mon, 4 Mar 2002 05:41:41 +0300
Message-Id: <7916563907.20020304054141@motor.ru>

Subject: [ID 20020303.005] Patch perlinicode C API description
From: Anton Tagunov <tagunov@motor.ru>
Date: Mon, 4 Mar 2002 06:08:23 +0300
Message-Id: <2018165510.20020304060823@motor.ru>

p4raw-id: //depot/perl@14981
Diffstat (limited to 'pod/perlunicode.pod')
-rw-r--r-- pod/perlunicode.pod 20
1 files changed, 10 insertions, 10 deletions
diff --git a/pod/perlunicode.pod b/pod/perlunicode.pod
index 7ea87141f0..c170d2c1da 100644
--- a/pod/perlunicode.pod
+++ b/pod/perlunicode.pod
@@ -848,16 +848,16 @@ the following C APIs useful (see perlapi for details):
=item *
-DO_UTF8(sv) returns true if the UTF8 flag is on and the bytes
-pragma is not in effect. SvUTF8(sv) returns true is the UTF8
-flag is on, the bytes pragma is ignored. Remember that UTF8
-flag being on does not mean that there would be any characters
-of code points greater than 255 or 127 in the scalar, or that
-there even are any characters in the scalar. The UTF8 flag
-means that any characters added to the string will be encoded
-in UTF8 if the code points of the characters are greater than
-255. Not "if greater than 127", since Perl's Unicode model
-is not to use UTF-8 until it's really necessary.
+DO_UTF8(sv) returns true if the UTF8 flag is on and the bytes pragma
+is not in effect. SvUTF8(sv) returns true is the UTF8 flag is on, the
+bytes pragma is ignored. The UTF8 flag being on does B<not> mean that
+there are any characters of code points greater than 255 (or 127) in the
+scalar, or that there even are any characters in the scalar. What the
+UTF8 flag means is that the sequence of octets in the representation
+of the scalar should be treated as UTF-8 encoding of a string.
+The UTF8 flag being off means that each octet in this representation
+encodes a single character with codepoint 0..255 within the string.
+Perl's Unicode model is not to use UTF-8 until it's really necessary.
=item *
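
Note on the hunk above (commentary, not part of the patch): the new wording says the UTF8 flag only selects how the scalar's octets are read. The sketch below is one way to picture that in plain C. It is a toy model under stated assumptions: `toy_scalar`, `toy_char_count` and `toy_demo` are hypothetical names invented for illustration, the struct is not the real SV layout, and none of this is Perl's API. It also assumes the octets are well-formed UTF-8 whenever the flag is on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scalar's string buffer.
 * NOT the real Perl SV structure; illustration only. */
typedef struct {
    const unsigned char *octets; /* raw representation */
    size_t len;                  /* number of octets */
    int utf8_flag;               /* models the UTF8 flag */
} toy_scalar;

/* Count characters in the toy scalar.
 * Flag off: every octet is one character with code point 0..255.
 * Flag on: the octets are read as UTF-8, so only non-continuation
 * bytes (those not of the form 10xxxxxx) start a character.
 * Assumes well-formed UTF-8 when the flag is on. */
size_t toy_char_count(const toy_scalar *s)
{
    size_t i, n = 0;

    if (!s->utf8_flag)
        return s->len;

    for (i = 0; i < s->len; i++) {
        if ((s->octets[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Same two octets, two readings, plus an empty flagged scalar. */
void toy_demo(void)
{
    static const unsigned char e_acute[] = { 0xC3, 0xA9 };
    toy_scalar flag_on  = { e_acute, 2, 1 };
    toy_scalar flag_off = { e_acute, 2, 0 };
    toy_scalar empty    = { e_acute, 0, 1 };

    assert(toy_char_count(&flag_on) == 1);  /* U+00E9, code point 233 */
    assert(toy_char_count(&flag_off) == 2); /* code points 0xC3 and 0xA9 */
    assert(toy_char_count(&empty) == 0);    /* flag on, no characters */
}
```

In this model the flagged scalar holds a single character whose code point (233) is below 256, and the empty scalar carries the flag with no characters at all. Those are the two cases the patched paragraph warns about.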