perlapi: Fix up pod for utf8n_to_uvchr_error()

There are two return flags signalling that the input UTF-8 was malformed by being too short. This commit adds detail comparing and contrasting the meanings of the two.
author: Karl Williamson <khw@cpan.org> 2018-08-04 13:19:58 -0600
committer: Karl Williamson <khw@cpan.org> 2018-08-05 05:52:48 -0600
commit: 00d976bbd170bbdc283618817b02b1b8f46bddd4 (patch)
tree: 2d22c0e7768b711e9a9f75a7729088871f0cecbf /utf8.c
parent: a4ee4fb50b9465c0ded09ff38f68516952970ec8 (diff)
download: perl-00d976bbd170bbdc283618817b02b1b8f46bddd4.tar.gz
1 files changed, 30 insertions, 1 deletions
diff --git a/utf8.c b/utf8.c
index 06b77689c0..cba1523aa6 100644
--- a/utf8.c
+++ b/utf8.c
@@ -1365,7 +1365,8 @@ C<UTF8_DISALLOW_NONCHAR> or the C<UTF8_WARN_NONCHAR> flags.
 =item C<UTF8_GOT_NON_CONTINUATION>
 
 The input sequence was malformed in that a non-continuation type byte was found
-in a position where only a continuation type one should be.
+in a position where only a continuation type one should be.  See also
+L</C<UTF8_GOT_SHORT>>.
 
 =item C<UTF8_GOT_OVERFLOW>
 
@@ -1378,6 +1379,34 @@ The input sequence was malformed in that C<curlen> is smaller than required for
 a complete sequence.  In other words, the input is for a partial character
 sequence.
 
+
+C<UTF8_GOT_SHORT> and C<UTF8_GOT_NON_CONTINUATION> both indicate a too short
+sequence.  The difference is that C<UTF8_GOT_NON_CONTINUATION> indicates always
+that there is an error, while C<UTF8_GOT_SHORT> means that an incomplete
+sequence was looked at.   If no other flags are present, it means that the
+sequence was valid as far as it went.  Depending on the application, this could
+mean one of three things:
+
+=over
+
+=item *
+
+The C<curlen> length parameter passed in was too small, and the function was
+prevented from examining all the necessary bytes.
+
+=item *
+
+The buffer being looked at is based on reading data, and the data received so
+far stopped in the middle of a character, so that the next read will
+read the remainder of this character.  (It is up to the caller to deal with the
+split bytes somehow.)
+
+=item *
+
+This is a real error, and the partial sequence is all we're going to get.
+
+=back
+
 =item C<UTF8_GOT_SUPER>
 
 The input sequence was malformed in that it is for a non-Unicode code point;
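The distinction the new pod text draws can be illustrated with a small standalone sketch. It does not call `utf8n_to_uvchr_error()` or use any Perl header; the enum `seq_status`, the helpers `expected_len()` and `classify_sequence()`, and the `needed` out-parameter are hypothetical names invented for this illustration, and the status values are not Perl's actual flag bits. As a simplification it reports a single status rather than a set of flags, and it checks only the lead byte and the continuation bytes (no overlong, surrogate, or range checks).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, loosely modelled on the two flags discussed
 * in the pod above.  They are NOT Perl's UTF8_GOT_* values. */
enum seq_status {
    SEQ_OK,                /* complete, as far as this sketch checks        */
    SEQ_SHORT,             /* ran out of bytes; what was seen is valid      */
    SEQ_NON_CONTINUATION,  /* a byte that must be a continuation was not    */
    SEQ_BAD_START          /* first byte cannot begin a sequence            */
};

/* Length implied by the lead byte, or 0 if it cannot start a sequence. */
static size_t expected_len(unsigned char first)
{
    if (first < 0x80)                  return 1;
    if (first >= 0xC2 && first <= 0xDF) return 2;
    if (first >= 0xE0 && first <= 0xEF) return 3;
    if (first >= 0xF0 && first <= 0xF4) return 4;
    return 0;
}

/* Classify the first character in s[0 .. curlen-1].
 * On SEQ_SHORT, *needed is the number of bytes still missing
 * (1 when curlen is 0, since at least a lead byte is required). */
static enum seq_status classify_sequence(const unsigned char *s,
                                         size_t curlen, size_t *needed)
{
    assert(s != NULL && needed != NULL);
    *needed = 0;

    if (curlen == 0) {
        *needed = 1;
        return SEQ_SHORT;
    }

    size_t want = expected_len(s[0]);
    if (want == 0)
        return SEQ_BAD_START;

    /* Examine only the bytes the caller says are available. */
    size_t avail = curlen < want ? curlen : want;
    for (size_t i = 1; i < avail; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return SEQ_NON_CONTINUATION;   /* always a real error */
    }

    if (curlen < want) {
        /* Valid so far; the caller decides whether more data may arrive. */
        *needed = want - curlen;
        return SEQ_SHORT;
    }
    return SEQ_OK;
}
```

In this sketch, a caller reading from a stream might keep the trailing bytes and retry after the next read when it sees `SEQ_SHORT`, and treat the same status as an error once end of input has been reached. `SEQ_NON_CONTINUATION` is never retried, because a wrong byte has already been seen.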