author: Karl Williamson <public@khwilliamson.com> 2013-12-06 14:23:43 -0700
committer: Karl Williamson <public@khwilliamson.com> 2013-12-06 14:29:11 -0700
commit: 2a590426b39988d335b9a105074d0e3b4c7fbeb8
tree: 40deddd569842ddb5a9f193752262a516c2262fa
parent: 360633e8a6ccddff927f53d4151443234b709f93
perlapi, sv.c: Comments and API documentation
Diffstat (limited to 'sv.c')
-rw-r--r-- sv.c 46
1 files changed, 25 insertions, 21 deletions
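The documentation touched by this patch describes three behaviours: characters that are "variant" in UTF-8 make the string expand, the grow version guarantees `extra` unused bytes after the string, and the return value is the converted length without those spares. The following is a minimal standalone sketch of those ideas, not Perl's implementation. The names `toy_str`, `toy_from_bytes`, and `toy_upgrade_grow` are hypothetical, and the code assumes an ASCII platform with Latin-1 input, where the variant bytes are exactly those at or above 0x80.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an SV's string buffer; not Perl's real struct. */
typedef struct {
    unsigned char *buf; /* string bytes */
    size_t len;         /* bytes in use, excluding any trailing NUL */
    size_t cap;         /* bytes allocated */
} toy_str;

/* Copy `len` bytes into a freshly allocated toy_str (buf is NULL on failure). */
static toy_str toy_from_bytes(const unsigned char *src, size_t len)
{
    toy_str s;
    s.buf = malloc(len + 1);
    s.len = len;
    s.cap = len + 1;
    if (s.buf) {
        memcpy(s.buf, src, len);
        s.buf[len] = '\0';
    }
    return s;
}

/* Convert the Latin-1 bytes in `s` to UTF-8, leaving at least `extra` unused
 * bytes after the string. Returns the converted length in bytes, not counting
 * the spares, or (size_t)-1 if allocation fails. */
static size_t toy_upgrade_grow(toy_str *s, size_t extra)
{
    size_t variants = 0, i, newlen;
    unsigned char *out, *d;

    assert(s != NULL && s->buf != NULL);

    /* Count variant bytes: each one becomes two bytes in UTF-8. A caller
     * that already knows variants exist still needs this count here, since
     * it determines the size of the new buffer. */
    for (i = 0; i < s->len; i++)
        if (s->buf[i] >= 0x80)
            variants++;

    newlen = s->len + variants;
    out = malloc(newlen + extra + 1); /* +1 for the trailing NUL */
    if (out == NULL)
        return (size_t)-1;

    d = out;
    for (i = 0; i < s->len; i++) {
        unsigned char c = s->buf[i];
        if (c < 0x80) {
            *d++ = c;                               /* invariant: unchanged */
        } else {
            *d++ = (unsigned char)(0xC0 | (c >> 6)); /* lead byte */
            *d++ = (unsigned char)(0x80 | (c & 0x3F)); /* continuation */
        }
    }
    *d = '\0';

    free(s->buf);
    s->buf = out;
    s->len = newlen;
    s->cap = newlen + extra + 1;
    return newlen;
}
```

In this sketch, upgrading the four Latin-1 bytes `c a f 0xE9` with `extra` set to 8 yields a five-byte string, because only the last byte is variant, and the buffer keeps eight free bytes beyond the trailing NUL for the caller to fill without another allocation.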
diff --git a/sv.c b/sv.c
index 2c8a7bd8c0..c2ee630ba1 100644
--- a/sv.c
+++ b/sv.c
@@ -3231,35 +3231,39 @@ Always sets the SvUTF8 flag to avoid future validity checks even
if all the bytes are invariant in UTF-8.
If C<flags> has C<SV_GMAGIC> bit set,
will C<mg_get> on C<sv> if appropriate, else not.
-Returns the number of bytes in the converted string
-C<sv_utf8_upgrade> and
-C<sv_utf8_upgrade_nomg> are implemented in terms of this function.
+
+If C<flags> has SV_FORCE_UTF8_UPGRADE set, this function assumes that the PV
+will expand when converted to UTF-8, and skips the extra work of checking for
+that. Typically this flag is used by a routine that has already parsed the
+string and found such characters, and passes this information on so that the
+work doesn't have to be repeated.
+
+Returns the number of bytes in the converted string.
This is not a general purpose byte encoding to Unicode interface:
use the Encode extension for that.
-=cut
+=for apidoc sv_utf8_upgrade_flags_grow
-The grow version is currently not externally documented. It adds a parameter,
-extra, which is the number of unused bytes the string of 'sv' is guaranteed to
-have free after it upon return. This allows the caller to reserve extra space
-that it intends to fill, to avoid extra grows.
+Like sv_utf8_upgrade_flags, but has an additional parameter C<extra>, which is
+the number of unused bytes the string of 'sv' is guaranteed to have free after
+it upon return. This allows the caller to reserve extra space that it intends
+to fill, to avoid extra grows.
-Also externally undocumented for the moment is the flag SV_FORCE_UTF8_UPGRADE,
-which can be used to tell this function to not first check to see if there are
-any characters that are different in UTF-8 (variant characters) which would
-force it to allocate a new string to sv, but to assume there are. Typically
-this flag is used by a routine that has already parsed the string to find that
-there are such characters, and passes this information on so that the work
-doesn't have to be repeated.
+C<sv_utf8_upgrade>, C<sv_utf8_upgrade_nomg>, and C<sv_utf8_upgrade_flags>
+are implemented in terms of this function.
+
+Returns the number of bytes in the converted string (not including the spares).
+
+=cut
(One might think that the calling routine could pass in the position of the
-first such variant, so it wouldn't have to be found again. But that is not the
-case, because typically when the caller is likely to use this flag, it won't be
-calling this routine unless it finds something that won't fit into a byte.
-Otherwise it tries to not upgrade and just use bytes. But some things that
-do fit into a byte are variants in utf8, and the caller may not have been
-keeping track of these.)
+first variant character when it has set SV_FORCE_UTF8_UPGRADE, so it wouldn't
+have to be found again. But that is not the case, because typically when the
+caller is likely to use this flag, it won't be calling this routine unless it
+finds something that won't fit into a byte. Otherwise it tries to not upgrade
+and just use bytes. But some things that do fit into a byte are variants in
+utf8, and the caller may not have been keeping track of these.)
If the routine itself changes the string, it adds a trailing NUL. Such a NUL
isn't guaranteed due to having other routines do the work in some input cases,