From c7171788844166655de22f8bf356408afb659f77 Mon Sep 17 00:00:00 2001
From: vlefevre <vlefevre@280ebfd0-de03-0410-8827-d642c229c3f4>
Date: Mon, 26 Apr 2021 12:17:08 +0000
Subject: [doc] Update about "case insensitive" and issue with Turkish locales
 for "I" / "i".

  * mpfr.texi: added "with the rules of the C locale" in the
    mpfr_strtofr description.
  * README.dev: completed information about Turkish locales.

git-svn-id: https://scm.gforge.inria.fr/anonscm/svn/mpfr/trunk@14505 280ebfd0-de03-0410-8827-d642c229c3f4
---
 doc/README.dev | 14 ++++++++++----
 doc/mpfr.texi  |  9 ++++++++-
 2 files changed, 18 insertions(+), 5 deletions(-)
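
Note (illustration only, not part of the commit): the Turkish-locale issue
described in the README.dev hunk can be sidestepped by mapping letters
through the required characters of the basic character set instead of
calling tolower/toupper. The following is a minimal sketch in C under that
assumption. The names ascii_tolower and ascii_caseeq are hypothetical, are
not MPFR functions, and are not claimed to reflect how mpfr_strtofr is
actually implemented.

```c
#include <stddef.h>

/* Hypothetical helper (not an MPFR function): lowercase a character using
   only the 26 Latin letters of the basic character set. The current
   locale is never consulted, so 'I' always maps to 'i'. */
static int ascii_tolower (int c)
{
  static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
  size_t i;

  for (i = 0; upper[i] != '\0'; i++)
    if (c == upper[i])
      return lower[i];
  return c;  /* not an uppercase Latin letter: returned unchanged */
}

/* Hypothetical helper: return 1 if s and t are equal when the case of
   Latin letters is ignored, 0 otherwise. Any other byte must match
   exactly. */
static int ascii_caseeq (const char *s, const char *t)
{
  for (; *s != '\0' && *t != '\0'; s++, t++)
    if (ascii_tolower ((unsigned char) *s)
        != ascii_tolower ((unsigned char) *t))
      return 0;
  return *s == *t;  /* both strings must end at the same position */
}
```

With such helpers, ascii_caseeq ("INF", "inf") holds whatever LC_CTYPE is
set to, whereas a comparison based on tolower could fail in a
tr_TR.iso88599 locale, where tolower ('I') may return the dotless "i".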

diff --git a/doc/README.dev b/doc/README.dev
index d833c477d..caa8255ac 100644
--- a/doc/README.dev
+++ b/doc/README.dev
@@ -872,10 +872,16 @@ Conversely, do not use locale-dependent functions when the result must
 not depend on the locales. In particular, the alphanumeric characters
 used in number strings (as created by mpfr_get_str) must be those of
 the required characters from the basic character set (see ISO C99
-standard Section 5.2.1 "Character sets"). And tolower(letter) does
-not necessarily return the corresponding lowercase letter from these
-required characters. For instance, tolower('I') returns a dotless 'i'
-in Turkish tr_TR.iso88599 locales.
+standard Section 5.2.1 "Character sets").
+
+Note that in Turkish locales on some systems:
+  * the uppercase version of "i" is "İ" (an "I" with a dot above);
+  * the lowercase version of "I" is "ı" (a dotless "i").
+These characters are available in ISO-8859-9, thus as "char" in the
+tr_TR.iso88599 locale. However, in UTF-8, they are not available as
+(8-bit) "char"; thus toupper('i') gives 'i' and tolower('I') gives 'I'.
+So, when writing code and testing, these two encodings need to be
+considered, as they can give different behaviors.
 
 ===========================================================================
 
diff --git a/doc/mpfr.texi b/doc/mpfr.texi
index 18b23f908..31e141b74 100644
--- a/doc/mpfr.texi
+++ b/doc/mpfr.texi
@@ -1540,7 +1540,8 @@ stops at the character @samp{0}, thus 0 is read.
 Special data (for infinities and NaN) can be @samp{@@inf@@} or
 @samp{@@nan@@(n-char-sequence-opt)}, and if @math{@var{base} @le{} 16},
 it can also be @samp{infinity}, @samp{inf}, @samp{nan} or
-@samp{nan(n-char-sequence-opt)}, all case insensitive.
+@samp{nan(n-char-sequence-opt)}, all case insensitive with the rules of
+the C locale.
 A @samp{n-char-sequence-opt} is a possibly empty string containing only digits,
 Latin letters and the underscore (0, 1, 2, @dots{}, 9, a, b, @dots{}, z,
 A, B, @dots{}, Z, _). Note: one has an optional sign for all data, even
@@ -1548,6 +1549,12 @@ NaN@.
 For example, @samp{-@@nAn@@(This_Is_Not_17)} is a valid representation for NaN
 in base 17.
 
+@c Note about the "case insensitive with the rules of the C locale":
+@c The reason is that in Turkish locales on some systems, the uppercase
+@c version of "i" is an "I" with a dot above, and the lowercase version
+@c of "I" is a dotless "i". We do not follow these rules here.
+@c See README.dev for additional information.
+
 @end deftypefun
 
 @deftypefun void mpfr_set_nan (mpfr_t @var{x})
-- 
cgit v1.2.1