perlapi: Add explanation for why certain macros don't exist.

This also fixes some orphaned references.
author: Karl Williamson <khw@cpan.org> 2016-12-18 13:57:46 -0700
committer: Karl Williamson <khw@cpan.org> 2016-12-19 12:08:15 -0700
commit: 21da7284d6090cdcf2be93de47dfe3e32cbaf6c4 (patch)
tree: 83d171c1e7f69d68e88bb70e144551a33ffeadcd /handy.h
parent: a5ab225509d435feb91e537a15f319832528ca1f (diff)
download: perl-21da7284d6090cdcf2be93de47dfe3e32cbaf6c4.tar.gz
1 files changed, 26 insertions, 8 deletions
diff --git a/handy.h b/handy.h
index 1eb88923bf..848050f333 100644
--- a/handy.h
+++ b/handy.h
@@ -783,6 +783,16 @@ Returns the value of an ASCII-range hex digit and advances the string pointer.
 Behaviour is only well defined when isXDIGIT(*str) is true.
 
 =head1 Character case changing
+Perl uses "full" Unicode case mappings.  This means that converting a single
+character to another case may result in a sequence of more than one character.
+For example, the uppercase of C<E<223>> (LATIN SMALL LETTER SHARP S) is the two
+character sequence C<SS>.  This presents some complications.  The lowercase of
+all characters in the range 0..255 is a single character, and thus
+C<L</toLOWER_L1>> is furnished.  But, C<toUPPER_L1> can't exist, as it couldn't
+return a valid result for all legal inputs.  Instead C<L</toUPPER_uvchr>> has
+an API that does allow every possible legal result to be returned.  Likewise
+no other function that is crippled by not being able to give the correct
+results for the full range of possible inputs has been implemented here.
 
 =for apidoc Am|U8|toUPPER|U8 ch
 Converts the specified character to uppercase.  If the input is anything but an
@@ -797,7 +807,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the uppercase version may be longer than the original character.
 
 The first code point of the uppercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toUPPER_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its uppercase version, and
@@ -806,7 +817,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the uppercase version may be longer than the original character.
 
 The first code point of the uppercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -824,7 +836,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the foldcase version may be longer than the original character.
 
 The first code point of the foldcased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toFOLD_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its foldcase version, and
@@ -833,7 +846,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the foldcase version may be longer than the original character.
 
 The first code point of the foldcased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -858,7 +872,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the lowercase version may be longer than the original character.
 
 The first code point of the lowercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toLOWER_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its lowercase version, and
@@ -867,7 +882,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the lowercase version may be longer than the original character.
 
 The first code point of the lowercased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
 
@@ -886,7 +902,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the titlecase version may be longer than the original character.
 
 The first code point of the titlecased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 =for apidoc Am|UV|toTITLE_utf8|U8* p|U8* s|STRLEN* lenp
 Converts the UTF-8 encoded character at C<p> to its titlecase version, and
@@ -895,7 +912,8 @@ that the buffer pointed to by C<s> needs to be at least C<UTF8_MAXBYTES_CASE+1>
 bytes since the titlecase version may be longer than the original character.
 
 The first code point of the titlecased version is returned
-(but note, as explained just above, that there may be more.)
+(but note, as explained at L<the top of this section|/Character case
+changing>, that there may be more).
 
 The input character at C<p> is assumed to be well-formed.
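
The point the new documentation makes, that a one-value-in, one-value-out uppercase macro cannot cover the whole 0..255 range, can be illustrated with a small standalone sketch. This is not Perl's implementation: `toy_upper_latin1` and `TOY_MAX_CASE_LEN` are hypothetical names invented for this example, and the function only mimics the general shape of an API that writes the full result to a caller-supplied buffer, reports its length, and returns the first code point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only; not taken from the Perl source. */
#define TOY_MAX_CASE_LEN 2   /* longest result this toy function produces */

/* Writes the full uppercase mapping of the Latin-1 code point cp into out,
 * stores the number of code points written in *lenp, and returns the first
 * code point of the result. */
static unsigned
toy_upper_latin1(unsigned char cp, unsigned out[TOY_MAX_CASE_LEN], size_t *lenp)
{
    assert(out != NULL && lenp != NULL);

    if (cp == 0xDF) {                 /* LATIN SMALL LETTER SHARP S -> "SS" */
        out[0] = 'S';
        out[1] = 'S';
        *lenp = 2;
    }
    else if (cp == 0xFF) {            /* y with diaeresis -> U+0178 (above 255) */
        out[0] = 0x178;
        *lenp = 1;
    }
    else if (cp == 0xB5) {            /* MICRO SIGN -> U+039C (above 255) */
        out[0] = 0x39C;
        *lenp = 1;
    }
    else if ((cp >= 'a' && cp <= 'z')
             || (cp >= 0xE0 && cp <= 0xFE && cp != 0xF7))
    {
        out[0] = cp - 0x20;           /* simple one-to-one mapping */
        *lenp = 1;
    }
    else {                            /* no uppercase form: unchanged */
        out[0] = cp;
        *lenp = 1;
    }
    return out[0];
}
```

A function returning a single `U8` would have no way to express the two-code-point result for 0xDF, nor the results above 255 for 0xFF and 0xB5. Every lowercase mapping of an input in 0..255 fits in one byte, which is why a Latin-1 lowercase macro is possible while an uppercase one is not.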