author | Karl Williamson <public@khwilliamson.com> | 2013-02-26 11:02:33 -0700 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2013-08-29 09:55:57 -0600 |
commit | 4f83cdcd5c1f4154a1ecc18f39f9e5c3f21bc4b3 (patch) | |
tree | 2cfb559f611fb57257b30fde275e0e11f4fb0fd9 /utf8.c | |
parent | 5495102afe3b4589647ff274c9692632113ce6f4 (diff) | |
Deprecate utf8_to_uni_buf()
Now that the tables are stored in native order, there is almost no need
for code to be dealing in Unicode order.
According to grep.cpan.me, there are no uses of this function in CPAN.
Diffstat (limited to 'utf8.c')
-rw-r--r-- | utf8.c | 16 |
1 files changed, 8 insertions, 8 deletions
@@ -996,13 +996,14 @@ Perl_utf8_to_uvchr(pTHX_ const U8 *s, STRLEN *retlen)
 /*
 =for apidoc utf8_to_uvuni_buf
 
-Returns the Unicode code point of the first character in the string C<s> which
+Only in very rare circumstances should code need to be dealing in the Unicode
+code point. Use L</utf8_to_uvchr_buf> instead.
+
+Returns the Unicode (not-native) code point of the first character in the
+string C<s> which
 is assumed to be in UTF-8 encoding; C<send> points to 1 beyond the end of C<s>.
 C<retlen> will be set to the length, in bytes, of that character.
 
-This function should only be used when the returned UV is considered
-an index into the Unicode semantic tables (e.g. swashes).
-
 If C<s> does not point to a well-formed UTF-8 character and UTF8 warnings are
 enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't NULL) to -1.
 If those warnings are off, the computed value if well-defined (or
@@ -1046,12 +1047,11 @@ Returns the Unicode code point of the first character in the string C<s> which
 is assumed to be in UTF-8 encoding; C<retlen> will be set to the length, in
 bytes, of that character.
 
-This function should only be used when the returned UV is considered
-an index into the Unicode semantic tables (e.g. swashes).
-
 Some, but not all, UTF-8 malformations are detected, and in fact, some
 malformed input could cause reading beyond the end of the input buffer, which
-is why this function is deprecated. Use L</utf8_to_uvuni_buf> instead.
+is one reason why this function is deprecated. The other is that only in
+extremely limited circumstances should the Unicode versus native code point be
+of any interest to you. Use L</utf8_to_uvchr_buf> instead.
 
 If C<s> points to one of the detected malformations, and UTF8 warnings are
 enabled, zero is returned and C<*retlen> is set (if C<retlen> doesn't point to
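
The documentation above notes that the variant without a C<send> argument can read beyond the end of the input buffer on malformed input. The following standalone C sketch illustrates why passing an end pointer avoids that. It is not Perl's implementation: decode_utf8_buf is a hypothetical helper, it assumes an ASCII platform (where native and Unicode code points coincide), and it is stricter than the Perl functions in that it rejects surrogates and overlong forms outright. It mirrors only the documented convention of returning zero and setting the length to -1 on failure.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the Perl API): decode one UTF-8
 * character from the half-open range [s, send).  On success, returns the
 * code point and stores its byte length in *retlen.  On malformed or
 * truncated input, returns 0 and stores (size_t)-1 in *retlen. */
static uint32_t
decode_utf8_buf(const unsigned char *s, const unsigned char *send,
                size_t *retlen)
{
    size_t len, i;
    uint32_t cp, min;

    if (retlen)
        *retlen = (size_t)-1;          /* assume failure until proven otherwise */
    if (s >= send)
        return 0;                      /* empty input */

    /* The start byte determines the expected sequence length. */
    if (s[0] < 0x80)                { len = 1; cp = s[0];        min = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else
        return 0;                      /* stray continuation or invalid start byte */

    /* The key check: never read past send, even if the start byte
     * promises more bytes than the buffer holds. */
    if ((size_t)(send - s) < len)
        return 0;

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                  /* not a continuation byte */
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (cp < min)
        return 0;                      /* overlong encoding */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;                      /* outside Unicode range or a surrogate */

    if (retlen)
        *retlen = len;
    return cp;
}
```

Because zero is also the valid result for a NUL character, a caller of this sketch would distinguish the two cases by checking the stored length rather than the return value alone.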