summaryrefslogtreecommitdiff
path: root/pod/perlguts.pod
diff options
context:
space:
mode:
authorKarl Williamson <public@khwilliamson.com>2012-03-19 15:38:06 -0600
committerKarl Williamson <public@khwilliamson.com>2012-03-19 18:23:44 -0600
commit4b88fb76efce8c436e63b907c9842345d4fa77c7 (patch)
tree67d8be3146bf0c32e93bd8209c141ed72c5a0ae2 /pod/perlguts.pod
parent27d6c58a7e12243bef66c58b38e7d1415d9ca07e (diff)
downloadperl-4b88fb76efce8c436e63b907c9842345d4fa77c7.tar.gz
Use the new utf8 to code point functions
These functions should be used in preference to the old ones which can read beyond the end of the input string.
Diffstat (limited to 'pod/perlguts.pod')
-rw-r--r--pod/perlguts.pod7
1 files changed, 4 insertions, 3 deletions
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index ee938ea137..908fa1f0bd 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -2670,17 +2670,18 @@ character like this (the UTF8_IS_INVARIANT() is a macro that tests
whether the byte can be encoded as a single byte even in UTF-8):
U8 *utf;
+ U8 *utf_end; /* 1 beyond buffer pointed to by utf */
UV uv; /* Note: a UV, not a U8, not a char */
STRLEN len; /* length of character in bytes */
if (!UTF8_IS_INVARIANT(*utf))
/* Must treat this as UTF-8 */
- uv = utf8_to_uvchr(utf, &len);
+ uv = utf8_to_uvchr_buf(utf, utf_end, &len);
else
/* OK to treat this character as a byte */
uv = *utf;
-You can also see in that example that we use C<utf8_to_uvchr> to get the
+You can also see in that example that we use C<utf8_to_uvchr_buf> to get the
value of the character; the inverse function C<uvchr_to_utf8> is available
for putting a UV into UTF-8:
@@ -2792,7 +2793,7 @@ it's not - if you pass on the PV to somewhere, pass on the flag too.
=item *
-If a string is UTF-8, B<always> use C<utf8_to_uvchr> to get at the value,
+If a string is UTF-8, B<always> use C<utf8_to_uvchr_buf> to get at the value,
unless C<UTF8_IS_INVARIANT(*s)> in which case you can use C<*s>.
=item *