perlapi: Clarify NUL handling for 2 fcns; nits

The string input to these two functions must be NUL terminated when the length parameter is 0.
author: Karl Williamson <khw@cpan.org> 2014-04-23 13:51:48 -0600
committer: Karl Williamson <khw@cpan.org> 2014-04-23 17:08:08 -0600
commit: 75200dff8561a9c5d6eaa86a0ac75874bf13282b (patch)
tree: 1b821bbc87435dd0afac2cd0ef1d981f7d53551b /utf8.c
parent: 70d95cc994e515d77711476f6c853c3f1f1f1458 (diff)
download: perl-75200dff8561a9c5d6eaa86a0ac75874bf13282b.tar.gz
1 files changed, 10 insertions, 8 deletions
diff --git a/utf8.c b/utf8.c
index fa5b4a7323..dab538789a 100644
--- a/utf8.c
+++ b/utf8.c
@@ -57,7 +57,9 @@ or not the string is encoded in UTF-8 (or UTF-EBCDIC on EBCDIC machines).  That
 is, if they are invariant.  On ASCII-ish machines, only ASCII characters
 fit this definition, hence the function's name.
 
-If C<len> is 0, it will be calculated using C<strlen(s)>.  
+If C<len> is 0, it will be calculated using C<strlen(s)>, (which means if you
+use this option, that C<s> can't have embedded C<NUL> characters and has to
+have a terminating C<NUL> byte).
 
 See also L</is_utf8_string>(), L</is_utf8_string_loclen>(), and L</is_utf8_string_loc>().
 
@@ -401,9 +403,9 @@ Perl_is_utf8_char(const U8 *s)
 
 Returns true if the first C<len> bytes of string C<s> form a valid
 UTF-8 string, false otherwise.  If C<len> is 0, it will be calculated
-using C<strlen(s)> (which means if you use this option, that C<s> has to have a
-terminating NUL byte).  Note that all characters being ASCII constitute 'a
-valid UTF-8 string'.
+using C<strlen(s)> (which means if you use this option, that C<s> can't have
+embedded C<NUL> characters and has to have a terminating C<NUL> byte).  Note
+that all characters being ASCII constitute 'a valid UTF-8 string'.
 
 See also L</is_ascii_string>(), L</is_utf8_string_loclen>(), and L</is_utf8_string_loc>().
 
@@ -548,11 +550,11 @@ flags) malformation is found.  If this flag is set, the routine assumes that
 the caller will raise a warning, and this function will silently just set
 C<retlen> to C<-1> (cast to C<STRLEN>) and return zero.
 
-Note that this API requires disambiguation between successful decoding a NUL
+Note that this API requires disambiguation between successful decoding a C<NUL>
 character, and an error return (unless the UTF8_CHECK_ONLY flag is set), as
 in both cases, 0 is returned.  To disambiguate, upon a zero return, see if the
-first byte of C<s> is 0 as well.  If so, the input was a NUL; if not, the input
-had an error.
+first byte of C<s> is 0 as well.  If so, the input was a C<NUL>; if not, the
+input had an error.
 
 Certain code points are considered problematic.  These are Unicode surrogates,
 Unicode non-characters, and code points above the Unicode maximum of 0x10FFFF.
@@ -1400,7 +1402,7 @@ UTF-8.
 Returns a pointer to the newly-created string, and sets C<len> to
 reflect the new length in bytes.
 
-A NUL character will be written after the end of the string.
+A C<NUL> character will be written after the end of the string.
 
 If you want to convert to UTF-8 from encodings other than
 the native (Latin1 or EBCDIC),
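
The C<len> == 0 convention clarified in the first two hunks can be illustrated with a small stand-alone sketch. It does not use Perl's headers or code: C<ascii_only> and C<sample> are hypothetical names invented for this example, and the function only imitates the documented behavior (a length of 0 means the length is taken from C<strlen(s)>). It shows why a buffer with an embedded C<NUL> is checked only up to that byte when the length is passed as 0.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in, NOT Perl's implementation: returns 1 if the first
 * len bytes of s are all ASCII (below 0x80), 0 otherwise.  Following the
 * documented convention, len == 0 means "use strlen(s) as the length". */
static int ascii_only(const unsigned char *s, size_t len)
{
    assert(s != NULL);
    if (len == 0)
        len = strlen((const char *)s);  /* stops at the first NUL byte */
    for (size_t i = 0; i < len; i++) {
        if (s[i] >= 0x80)
            return 0;
    }
    return 1;
}

/* 'a', 'b', an embedded NUL, the non-ASCII byte 0xE9, and an explicit
 * terminating NUL: four data bytes followed by the terminator. */
static const unsigned char sample[] = { 'a', 'b', '\0', 0xE9, '\0' };
```

With this buffer, C<ascii_only(sample, 0)> examines only "ab" because C<strlen> stops at the embedded C<NUL>, so the byte 0xE9 is never seen; C<ascii_only(sample, 4)> examines all four data bytes and reports the non-ASCII byte. Passing an explicit length is therefore the only way to check data that may contain C<NUL> bytes.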