author Karl Williamson <public@khwilliamson.com> 2012-03-19 16:31:18 -0600
committer Karl Williamson <public@khwilliamson.com> 2012-03-19 18:23:44 -0600
commit 977c1d31fff4d41aa42e40c904fe08b509e3a34e
tree 6dc75d828245a74564c25954ec7096938adc2572 /utf8.c
parent 4b88fb76efce8c436e63b907c9842345d4fa77c7
Deprecate utf8_to_uvchr() and utf8_to_uvuni()
These functions can read beyond the end of their input strings if presented with malformed UTF-8 input. Perl core code has been converted to use other functions instead of these.
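The overread risk described above comes from decoding without knowing where the buffer ends: the first byte of a UTF-8 sequence announces how many bytes follow, and a decoder with no end pointer has to trust it. The sketch below is not Perl's implementation. It is a minimal standalone illustration, with hypothetical names (`demo_expected_len`, `demo_decode_buf`), of how an end pointer lets a decoder reject a truncated sequence before reading past the buffer. It mirrors the convention in the documentation being patched of returning zero and setting the length to -1 on failure, and it deliberately skips checks such as overlong encodings and surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes a UTF-8 sequence claims to need,
 * judged from its first byte alone; 0 if the byte cannot start a sequence. */
size_t demo_expected_len(uint8_t first)
{
    if (first < 0x80)           return 1;  /* 0xxxxxxx */
    if ((first & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((first & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((first & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;                              /* continuation or invalid byte */
}

/* Hypothetical bounded decoder. 'send' points one past the last readable
 * byte. On success, returns the code point and stores its byte length in
 * *retlen. On malformed or truncated input, returns 0 and stores
 * (size_t)-1 in *retlen. Overlong forms and surrogates are NOT rejected;
 * this is a sketch of the bounds check only. */
uint32_t demo_decode_buf(const uint8_t *s, const uint8_t *send, size_t *retlen)
{
    assert(s < send);  /* caller must supply at least one readable byte */

    size_t len = demo_expected_len(*s);

    /* The key step: compare the claimed length against the bytes that
     * are actually available before touching s[1..len-1]. */
    if (len == 0 || (size_t)(send - s) < len) {
        *retlen = (size_t)-1;
        return 0;
    }

    /* Payload bits of the first byte: all 8 for ASCII, else 5, 4, or 3. */
    uint32_t cp = (len == 1) ? *s : (uint32_t)(*s & (0xFF >> (len + 1)));

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) {       /* not a continuation byte */
            *retlen = (size_t)-1;
            return 0;
        }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    *retlen = len;
    return cp;
}
```

Given the two bytes `0xE2 0x82` (a three-byte sequence with its last byte missing) and `send` set to the end of that two-byte buffer, `demo_decode_buf` reports failure without reading a third byte. A decoder lacking the `send` parameter would have no way to make that check.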
Diffstat (limited to 'utf8.c')
-rw-r--r-- utf8.c 12
1 files changed, 8 insertions, 4 deletions
diff --git a/utf8.c b/utf8.c
index 85bf2f00c8..1d646a88d3 100644
--- a/utf8.c
+++ b/utf8.c
@@ -835,13 +835,15 @@ Perl_valid_utf8_to_uvchr(pTHX_ const U8 *s, STRLEN *retlen)
 /*
 =for apidoc utf8_to_uvchr
 
+DEPRECATED!
+
 Returns the native code point of the first character in the string C<s>
 which is assumed to be in UTF-8 encoding; C<retlen> will be set to the
 length, in bytes, of that character.
 
 Some, but not all, UTF-8 malformations are detected, and in fact, some
-malformed input could cause reading beyond the end of the input buffer.
-Use L</utf8_to_uvchr_buf> instead.
+malformed input could cause reading beyond the end of the input buffer, which
+is why this function is deprecated. Use L</utf8_to_uvchr_buf> instead.
 
 If C<s> points to one of the detected malformations, zero is
 returned and C<retlen> is set, if possible, to -1.
@@ -901,13 +903,15 @@ Perl_valid_utf8_to_uvuni(pTHX_ const U8 *s, STRLEN *retlen)
 /*
 =for apidoc utf8_to_uvuni
 
+DEPRECATED!
+
 Returns the Unicode code point of the first character in the string C<s>
 which is assumed to be in UTF-8 encoding; C<retlen> will be set to the
 length, in bytes, of that character.
 
 Some, but not all, UTF-8 malformations are detected, and in fact, some
-malformed input could cause reading beyond the end of the input buffer.
-Use L</utf8_to_uvuni_buf> instead.
+malformed input could cause reading beyond the end of the input buffer, which
+is why this function is deprecated. Use L</utf8_to_uvuni_buf> instead.
 
 If C<s> points to one of the detected malformations, zero is
 returned and C<retlen> is set, if possible, to -1.