Merged the latest changes from the trunk, including some old changesets
related to mpfr_zeta that were skipped, resolving conflicts.

Added RNDF support to new code introduced by this merge:
  * mpfr_mul_1n in src/mul.c (from r11281);
  * mpfr_sqr_1n in src/sqr.c (from r11283);
  * mpfr_div_1n in src/div.c (from r11284);
  * mpfr_sqrt1n in src/sqrt.c (from r11293).

git-svn-id: svn://scm.gforge.inria.fr/svn/mpfr/branches/faithful@11456 280ebfd0-de03-0410-8827-d642c229c3f4
author: vlefevre <vlefevre@280ebfd0-de03-0410-8827-d642c229c3f4> 2017-05-04 09:40:05 +0000
committer: vlefevre <vlefevre@280ebfd0-de03-0410-8827-d642c229c3f4> 2017-05-04 09:40:05 +0000
commit: af5a1593331d686b9cc5fbbbbdc47e1733a4644e
tree: ff8210e41ae8ced432dbcd42e8be2a919f8dddc6
parent: 87ff38458263c9a9ed79a7ebd547fd32a66ae843
parent: d79a8111e6b7851b15bac211d8dca0e67a2979b5
51 files changed, 2955 insertions, 788 deletions
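
Note on the mpfr_sum prototype change (NEWS, doc/mpfr.texi, doc/sum.txt): the
diff moves "const" so that the parameter reads "const mpfr_ptr *x", i.e. a
pointer to const pointers to non-const objects. The sketch below illustrates
why callers with an ordinary array of pointers are unaffected. It does not use
MPFR: num_struct, num_ptr, sum_values and demo_sum are hypothetical stand-ins
chosen for this illustration, and only the shape of the parameter type mirrors
the prototype in the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for __mpfr_struct and mpfr_ptr (not the MPFR types). */
typedef struct { double value; } num_struct;
typedef num_struct *num_ptr;

/* Same parameter shape as the fixed prototype: "const num_ptr *x" means
   "num_struct *const *x". The array of pointers is read-only here, while
   the pointed-to objects are not const-qualified. */
double
sum_values (const num_ptr *x, unsigned long n)
{
  double s = 0.0;
  unsigned long i;

  assert (x != NULL || n == 0);
  for (i = 0; i < n; i++)
    s += x[i]->value;
  return s;
}

/* Pass a plain array of non-const pointers. Converting "num_ptr *" to
   "const num_ptr *" only adds a qualifier at the first level of
   indirection, so no cast is needed. */
double
demo_sum (void)
{
  num_struct a = { 1.5 }, b = { 2.25 }, c = { -0.75 };
  num_ptr tab[3];

  tab[0] = &a;
  tab[1] = &b;
  tab[2] = &c;
  return sum_values (tab, 3);
}
```

By contrast, a parameter declared as a pointer to pointers to const objects
(the "const mpfr_srcptr *x" form discussed in doc/sum.txt) would reject such
an array without a cast, which is the situation described in the
comp.lang.c FAQ entry cited there.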
diff --git a/NEWS b/NEWS
index f8c343836..f9056d32a 100644
--- a/NEWS
+++ b/NEWS
@@ -46,7 +46,8 @@ Changes from versions 3.1.* to version 4.0.0:
   following normal and exponential distributions respectively.
 - New functions mpfr_fmma and mpfr_fmms to compute a*b+c*d and a*b-c*d.
 - New functions mpfr_log_ui to compute the logarithm of an integer,
-  and mpfr_gamma_inc for the incomplete Gamma function.
+  mpfr_gamma_inc for the incomplete Gamma function, and mpfr_beta for the
+  Beta function.
 - The mpfr_eint function now returns the value of the E1/eint1 function
   for negative argument.
 - The behavior of the mpfr_set_exp function changed, as it could easily
@@ -57,7 +58,9 @@ Changes from versions 3.1.* to version 4.0.0:
   old one could take all the memory and/or crash with inputs of different
   magnitudes in case of huge cancellation or table maker's dilemma). The
   sign of an exact zero result is now specified, and the return value is
-  now the usual ternary value.
+  now the usual ternary value. Note that the position of "const" in the
+  mpfr_sum prototype has been fixed (the manual was correct); user code
+  should not be affected.
 - Internally, improved caching: a minimum of 10% increase of the precision
   is guaranteed to avoid too many recomputations; added mpz_t caching.
 - Added configure option --enable-assert=none to avoid checking any assertion.
@@ -84,7 +87,9 @@ Changes from versions 3.1.* to version 4.0.0:
 - Bug fixes. In particular: a speed improvement when the --enable-assert
   or --enable-assert=full configure option is used with GCC; mpfr_get_str
   now sets the NaN flag on NaN input and the inexact flag when the conversion
-  is inexact.
+  is inexact. For a full list, see http://www.mpfr.org/mpfr-3.1.5/#fixed
+  and the same section for any previous 3.1.x version (follow the links
+  in the "Changes..." sections).
 - MinGW (MS Windows): Added support for thread-safe DLL (shared library).
 - Limited pkg-config support.
 
diff --git a/TODO b/TODO
index b1de1f5d2..c00e3f8cc 100644
--- a/TODO
+++ b/TODO
@@ -101,8 +101,16 @@ Table of contents:
     native type (float / double / long double) has been recognized and
     which format it is?
   * For functions that return a native floating-point value (mpfr_get_flt,
-    mpfr_get_d, mpfr_get_ld, mpfr_get_decimal64), raise exception flags
-    with feraiseexcept(), when supported.
+    mpfr_get_d, mpfr_get_ld, mpfr_get_decimal64), in case of underflow or
+    overflow, follow the convention used for the functions in <math.h>?
+    See §7.12.1 "Treatment of error conditions" of ISO C11, which provides
+    two ways of handling error conditions, depending on math_errhandling:
+    errno (to be set to ERANGE here) and floating-point exceptions.
+    If floating-point exceptions need to be generated, do not use
+    feraiseexcept(), as this function may require the math library (-lm);
+    use a floating-point expression instead, such as DBL_MIN * DBL_MIN
+    (underflow) or DBL_MAX * DBL_MAX (overflow), which are probably safe
+    as used in the GNU libc implementation.
   * For testing the lack of subnormal support:
     see the -mfpu GCC option for ARM and
     https://en.wikipedia.org/wiki/Denormal_number#Disabling_denormal_floats_at_the_code_level
@@ -149,6 +157,10 @@ Table of contents:
 - new functions of IEEE 754-2008, and more generally functions of the
   C binding draft TS 18661-4:
     http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1946.pdf
+  Some propositions about rootn: mpfr_rootn_si, mpfr_rootn_sj, mpfr_rootn_z,
+  and perhaps versions with an unsigned integer: mpfr_rootn_ui (in which
+  case mpfr_root could be declared as deprecated) and mpfr_rootn_uj.
+  Warning! rootn(-0,positive even) is +0, not -0 like in mpfr_root.
 - functions defined in the LIA-2 standard
   + minimum and maximum (5.2.2): max, min, max_seq, min_seq, mmax_seq
     and mmin_seq (mpfr_min and mpfr_max correspond to mmin and mmax);
@@ -257,10 +269,9 @@ Table of contents:
   match this standard.
 
 - from gnumeric (www.gnome.org/projects/gnumeric/doc/function-reference.html):
-  - beta
-    (also incomplete beta function, see message from Martin Maechler
+  - incomplete beta function, see message from Martin Maechler
     <maechler@stat.math.ethz.ch> on 18 Jan 2016, and Section 6.6 in
-    Abramowitz & Stegun)
+    Abramowitz & Stegun
   - betaln
   - degrees
   - radians
@@ -298,6 +309,8 @@ Table of contents:
   Also mpfr_div uses the remainder computed by mpn_divrem. A workaround would
   be to first try with mpn_div_q, and if we cannot (easily) compute the
   rounding, then use the current code with mpn_divrem.
+- improve atanh(x) for small x by using atanh(x) = log1p(2x/(1-x)),
+  and log1p should also be improved for small arguments.
 - compute exp by using the series for cosh or sinh, which has half the terms
   (see Exercise 4.11 from Modern Computer Arithmetic, version 0.3)
   The same method can be used for log, using the series for atanh, i.e.,
diff --git a/doc/README.dev b/doc/README.dev
index d0ddcc847..971020aa1 100644
--- a/doc/README.dev
+++ b/doc/README.dev
@@ -229,9 +229,9 @@ To make a release (for the MPFR team):
      platforms where the endianness is unknown (or can't be specified
      without AC_CONFIG_HEADERS).
      Check also without mpz_t caching (-DMPFR_MY_MPZ_INIT=0).
-     Check with -DMPFR_GENERIC_ABI to test the generic code, not tied to
-     a particular ABI; this is useful when there is specific MPFR code
-     for both GMP_NUMB_BITS == 32 and GMP_NUMB_BITS == 64.
+     Check with -DMPFR_GENERIC_ABI to test only the generic code; this is
+     useful because most tests are written for low precision, thus would
+     use only specific code of some MPFR operations.
 
      Check that make and make check pass with a C++ compiler, for example:
      ./configure CC=g++ (MPFR 2.3.2 did not).
@@ -443,6 +443,11 @@ Format of long double.
 
 + MPFR_GENERIC_ABI:     Define to disable code that is tied to a specific
                         ABI (e.g. GMP_NUMB_BITS value).
+                        Note: Currently it is also used to disable code
+                        specific to low precision, i.e. to use only generic
+                        code. This is useful because most tests are written
+                        for low precision, meaning that without this macro,
+                        the generic code would not sufficiently be tested.
 
 List of macros used for checking MPFR:
 
@@ -580,6 +585,14 @@ user-defined string literal in C++11:
 
    =====================================================================
 
+Setting errno is safe to signal some error information (as in the
+formatted output functions), but errno must not be read (unless we
+have just modified it) as this may yield undefined behavior in some
+corner cases out of our control (ISO C99 / C11, 7.14.1.1p5, also
+mentioned in J.2).
+
+   =====================================================================
+
 C-Reduce may be useful to try to identify whether a bug comes from the
 compiler.
 
diff --git a/doc/algorithms.tex b/doc/algorithms.tex
index af56ea2dc..d5311d1c3 100644
--- a/doc/algorithms.tex
+++ b/doc/algorithms.tex
@@ -3243,14 +3243,35 @@ of Eq.~(\ref{eq:legendre_cf}) is bounded by
 
 \subsection{The Riemann Zeta function}
 
-The algorithm for the Riemann Zeta function is due to Jean-Luc R\'emy
-and Sapphorain P\'etermann \cite{PeRe06,PeRe07}. For $s < 1/2$ we use the
-functional equation
-\[ \zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1-s)
-   \zeta(1-s) \]
-(in that case, one should take care of cancellation when $\sin(\pi s/2)$ is
-small, i.e., when $s$ is near an even negative integer).
-For $s \geq 1/2$ we use the Euler-MacLaurin summation formula, applied
+\subsubsection{Special cases}
+
+As usual, special inputs are first taken into account: NaN, infinities,
+zeros, but also values close enough to $0$ (see below), even negative
+integers, and $1$ (pole).
+
+Let us focus on $\zeta(s)$, where $s$ has a small exponent.
+In theory, this case could be handled by the reflection formula
+(\textsection\ref{zeta:reflection}), but as the term $\zeta(1-s)$
+of this formula is close to the pole of $\zeta$ at $1$, this method would
+be slow (and could even yield an internal overflow for tiny $s$), while
+the correctly rounded result can be determined very quickly when $s$ is
+close enough to $0$. We have around $0$:
+\[ \zeta(s) = -\frac{1}{2} - \frac{1}{2} \log(2 \pi) s + \ldots \]
+and for $|s| \leq 2^{-4}$, we have $|\zeta(s) + 1/2| \leq |s|$. Thus
+if $|s| \leq 2^{-4}$ and $|s| \leq \frac{1}{4} \ulp(1/2)$ in the target
+precision $p$, we can deduce the correct rounding for any rounding mode.
+The second condition can be rewritten: $|s| \leq 2^{-2-p}$. For $p \geq 2$,
+the second condition implies the first one, and it is sufficient to have
+$\Exp(s) \leq -2-p$. For $p = 1$, if we assume $\Exp(s) \leq -2-p$, then
+$|s| \leq \frac{1}{2} 2^{-2-p} = 2^{-4}$, so that both conditions are
+also satisfied. Thus, for any target precision $p \geq 1$, a sufficient
+condition is $\Exp(s) \leq -2-p$, or equivalently $\Exp(s) + 1 < -p$.
+
+\subsubsection{Case $s \geq 1/2$}
+
+The algorithm for the Riemann Zeta function for $s \geq 1/2$ is due to
+Jean-Luc R\'emy and Sapphorain P\'etermann \cite{PeRe06,PeRe07}.
+We use the Euler-MacLaurin summation formula, applied
 to the real function $f(x) = x^{-s}$ for $s > 1$:
 \[ \zeta(s) = \sum_{k=1}^{N-1} \frac{1}{k^s} + \frac{1}{2N^s}
 + \frac{1}{(s-1)N^{s-1}} + \sum_{k=1}^p \frac{B_{2k}}{2k}
@@ -3282,7 +3303,52 @@ the approximation computed, we have
 \[ |\zeta(s) - z| \leq 2^{-\Pi} |\zeta(s)|. \]
 \end{theorem}
 
-\subsubsection{The integer argument case.}
+\subsubsection{Case $s < 1/2$}
+\label{zeta:reflection}
+
+For $s < 1/2$, we use the reflection formula:
+\[ \zeta(s) = 2^s \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1-s)
+   \zeta(1-s). \]
+
+For simplicity, to avoid taking into account the error from the
+$\Gamma$ and $\zeta$ inputs, we will ensure that $1-s$ is represented
+exactly. Thus its precision may need to be much larger than the target
+precision. However, for efficiency reasons, the internal working
+precision $q$ will still be based only on the target precision as usual.
+
+So, assuming that no underflows nor overflows occur, the terms
+$\Gamma(1-s)$ and $\zeta(1-s)$ will each have an error factor of
+the form $1+\theta$, with $|\theta| \leq 2^{-q}$.
+
+Assuming that $\Gamma$ and $\zeta$ have a larger complexity than
+the other terms, we would like the other error factors not to be
+significantly larger than $1+2^{-q}$. Otherwise we would have wasted
+time by computing $\Gamma(1-s)$ and $\zeta(1-s)$ with more precision
+than really needed.
+
+Concerning the term $\pi^{s-1}$, if the constant $\pi$ is represented
+by a variable $x$ with a precision $p_x$ to the nearest, then
+$(1-2^{-p_x}) x \leq \pi \leq (1+2^{-p_x}) x$. So,
+\[ (1+2^{-p_x})^{s-1} x^{s-1} \leq \pi^{s-1} \leq
+   (1-2^{-p_x})^{s-1} x^{s-1}, \]
+i.e.,
+\[ (1-2^{-p_x})^{1-s} \pi^{s-1} \leq x^{s-1} \leq
+   (1+2^{-p_x})^{1-s} \pi^{s-1}. \]
+We want $(1-s) 2^{-p_x}$ to be small and of the order of $2^{-q}$.
+We will choose $p_x = q - \Exp(1-s)$, so that $(1-s) 2^{-p_x} < 2^{-q}$.
+The value $x^{s-1}$ will be computed and rounded to the working precision
+by the \texttt{mpfr\_pow} function, giving another error term of the form
+$1+\theta$, with $|\theta| \leq 2^{-q}$.
+
+Concerning the term $\sin\left(\frac{\pi s}{2}\right)$, because of the
+factor $\pi$ in the argument, we will do the range reduction ourselves:
+it will much simpler and faster than the one in \texttt{mpfr\_sin}, and
+this will allow us to select the intermediate precision more accurately.
+
+% To be removed when complete and correctly implemented.
+(WORK IN PROGRESS -- May not correspond to the implementation yet.)
+
+\subsubsection{The integer argument case}
 In case of an integer argument $s \geq 2$,
 the \texttt{mpfr\_zeta\_ui} function computes
 $\zeta(s)$ using the following formula from \cite{Borwein95}:
@@ -3536,6 +3602,8 @@ is an exponent loss in the final subtraction $r = \circ(v_{n+1}-w)$.
 
 \subsection{The Bessel functions}
 
+% see https://people.eecs.berkeley.edu/~fateman/papers/hermite.pdf
+
 \subsubsection{Bessel function $J_n(z)$ of first kind}
 
 The Bessel function $J_n(z)$ of first kind and integer order $n$
diff --git a/doc/mpfr.texi b/doc/mpfr.texi
index f627f8577..fea812d82 100644
--- a/doc/mpfr.texi
+++ b/doc/mpfr.texi
@@ -3,7 +3,7 @@
 @setfilename mpfr.info
 @documentencoding UTF-8
 @set VERSION 4.0.0-dev
-@set UPDATED-MONTH February 2017
+@set UPDATED-MONTH April 2017
 @settitle GNU MPFR @value{VERSION}
 @synindex tp fn
 @iftex
@@ -276,7 +276,8 @@ from most arbitrary precision floating-point software tools, are:
 @item the MPFR code is portable, i.e., the result of any operation
 does not depend on the machine word size
 @code{mp_bits_per_limb} (64 on most current processors), possibly
-except in faithful rounding;
+except in faithful rounding.
+It does not depend either on the machine rounding mode or rounding precision;
 
 @item the precision in bits can be set @emph{exactly} to any valid value
 for each variable (including very small precision);
@@ -1044,8 +1045,10 @@ As a consequence, if two variables are used to store
 only a few significant bits, and their product is stored in a variable with large
 precision, then MPFR will still compute the result with full precision.
 
-The value of the standard C macro @code{errno} may be set to non-zero by
-any MPFR function or macro, whether or not there is an error.
+The value of the standard C macro @code{errno} may be set to non-zero after
+calling any MPFR function or macro, whether or not there is an error. Except
+when documented, MPFR will not set @code{errno}, but functions called by the
+MPFR code (libc functions, memory allocator, etc.) may do so.
 
 @menu
 * Initialization Functions::
@@ -2159,6 +2162,11 @@ function on @var{op}, rounded in the direction @var{rnd}.
 When @var{op} is a negative integer, set @var{rop} to NaN@.
 @end deftypefun
 
+@deftypefun int mpfr_beta (mpfr_t @var{rop}, mpfr_t @var{op1}, mpfr_t @var{op2}, mpfr_rnd_t @var{rnd})
+Set @var{rop} to the value of the Beta function at arguments @var{op1} and
+@var{op2}.
+@end deftypefun
+
 @deftypefun int mpfr_zeta (mpfr_t @var{rop}, mpfr_t @var{op}, mpfr_rnd_t @var{rnd})
 @deftypefunx int mpfr_zeta_ui (mpfr_t @var{rop}, unsigned long @var{op}, mpfr_rnd_t @var{rnd})
 Set @var{rop} to the value of the Riemann Zeta function on @var{op},
@@ -2300,7 +2308,7 @@ or the caches common to all threads (if @var{way} is
 @code{MPFR_FREE_GLOBAL_CACHE}).
 @end deftypefun
 
-@deftypefun int mpfr_sum (mpfr_t @var{rop}, mpfr_ptr const @var{tab}[], unsigned long int @var{n}, mpfr_rnd_t @var{rnd})
+@deftypefun int mpfr_sum (mpfr_t @var{rop}, const mpfr_ptr @var{tab}[], unsigned long int @var{n}, mpfr_rnd_t @var{rnd})
 Set @var{rop} to the sum of all elements of @var{tab}, whose size is @var{n},
 correctly rounded in the direction @var{rnd}. Warning: for efficiency reasons,
 @var{tab} is an array of pointers
@@ -2607,8 +2615,9 @@ specifiers @samp{f}, @samp{F}, @samp{g}, and @samp{G} is 6.
 For all the following functions, if the number of characters which ought to be
 written appears to exceed the maximum limit for an @code{int}, nothing is
 written in the stream (resp.@: to @code{stdout}, to @var{buf}, to @var{str}),
-the function returns @minus{}1, sets the @emph{erange} flag, and (in
-POSIX system only) @code{errno} is set to @code{EOVERFLOW}.
+the function returns @minus{}1, sets the @emph{erange} flag, and @code{errno}
+is set to @code{EOVERFLOW} if the @code{EOVERFLOW} macro is defined (such as
+in POSIX systems).
 
 @deftypefun int mpfr_fprintf (FILE *@var{stream}, const char *@var{template}, @dots{})
 @deftypefunx int mpfr_vfprintf (FILE *@var{stream}, const char *@var{template}, va_list @var{ap})
@@ -2656,7 +2665,7 @@ the control of the template string @var{template}, and print it in
 @var{buf}. If @var{n} is zero, nothing is
 written and @var{buf} may be a null pointer, otherwise, the @var{n}@minus{}1
 first characters are written in @var{buf} and the @var{n}-th is a null character.
-Return the number of characters that would have been written had @var{n} be
+Return the number of characters that would have been written had @var{n} been
 sufficiently large, @emph{not counting}
 the terminating null character, or a negative value if an error occurred.
 @c If the number of characters produced by the
@@ -2926,12 +2935,13 @@ representable in precision @var{prec}, then one can use the following
 trick to determine the (non-zero) @ref{ternary value} in any rounding
 mode (@code{MPFR_RNDZ} can be replaced by any directed rounding mode):
 @example
-if (mpfr_can_round (b, err, rnd1, MPFR_RNDZ, prec + (rnd2 == MPFR_RNDN)))
-@{
+if (mpfr_can_round (b, err, rnd1, MPFR_RNDZ,
+                    prec + (rnd2 == MPFR_RNDN)))
+  @{
     /* round the approximation 'b' to the result 'r' of 'prec' bits
        with rounding mode 'rnd2' and get the ternary value 'inex' */
     inex = mpfr_set (r, b, rnd2);
-@}
+  @}
 @end example
 Indeed, if @var{rnd2} is @code{MPFR_RNDN}, this will check if one can
 round to @var{prec}+1 bits with a directed rounding:
@@ -3771,6 +3781,8 @@ that were added after MPFR 2.2, and in which MPFR version.
 
 @item @code{mpfr_asprintf} in MPFR 2.4.
 
+@item @code{mpfr_beta} in MPFR 4.0.
+
 @item @code{mpfr_buildopt_decimal_p} in MPFR 3.0.
 
 @item @code{mpfr_buildopt_float128_p} in MPFR 4.0.
diff --git a/doc/sum.txt b/doc/sum.txt
index d46c85cb8..fbb2a639b 100644
--- a/doc/sum.txt
+++ b/doc/sum.txt
@@ -97,12 +97,30 @@ Specification
 The prototype:
 
 int
-mpfr_sum (mpfr_ptr sum, mpfr_ptr *const x, unsigned long n, mpfr_rnd_t rnd)
+mpfr_sum (mpfr_ptr sum, const mpfr_ptr *x, unsigned long n, mpfr_rnd_t rnd)
 
 where sum will contain the result, x is an array of pointers to the
 inputs, n is the length of this array, and rnd is the rounding mode.
 The return value of type int will be the usual ternary value.
 
+Note: One uses
+
+  const mpfr_ptr *x      i.e.:  __mpfr_struct *const *x
+
+instead of
+
+  const mpfr_srcptr *x   i.e.:  const __mpfr_struct *const *x
+
+because here one has a double indirection and the type matching rules
+from the C standard in such a case are stricter and they would yield
+annoying errors for the user in practice. See:
+
+  Why can't I pass a char ** to a function which expects a const char **?
+
+in the comp.lang.c FAQ:
+
+  http://c-faq.com/ansi/constmismatch.html
+
 If n = 0, then the result is +0, whatever the rounding mode. This is
 equivalent to mpfr_set_ui and mpfr_set_si on the integer 0, and this
 choice is consistent with IEEE 754-2008's sum reduction operation of
@@ -282,7 +300,7 @@ that a number starting with a sequence of 1's is close to 0 (secondary
 term when the TMD occurs) and that each bit except the first one has a
 positive value (for the error bound).
 
-The precision of the accumulator needs to be a bit larger than the
+The precision of the accumulator needs to be slightly larger than the
 output precision, denoted sq, for two reasons:
 
   * We need some additional bits on the side of the most significant
@@ -531,7 +549,7 @@ code.
 After the loop over the inputs, we need to see whether the accuracy
 of the truncated sum is sufficient. We first determine the number of
 cancelled bits, defined as the number of consecutive identical bits
-starting with the most significant one in the accumulator. At the
+starting with the most significant bit in the accumulator. At the
 same time, we can determine whether the truncated sum is 0 (all the
 bits are identical and their value is 0). If it is 0, we have two
 cases: if maxexp2 is equal to MPFR_EXP_MIN (meaning no more tails),
@@ -1031,7 +1049,7 @@ We distinguish two cases:
     Let us recall that the d−1 bits from exponent u−2 to u−d (= err)
     are identical. We distinguish two subcases:
 
-      * err ≥ minexp. The last two over the d−1 identical bits and the
+      * err ≥ minexp. The last two of the d−1 identical bits and the
         following bits, i.e., the bits from err+1 to minexp, are copied
         (possibly with a shift) to the most significant part of the new
         accumulator.
diff --git a/src/Makefile.am b/src/Makefile.am
index f7e21f3ea..c002950b6 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -60,7 +60,7 @@ scale2.c set_z_exp.c ai.c gammaonethird.c ieee_floats.h			\
 grandom.c fpif.c set_float128.c get_float128.c rndna.c nrandom.c        \
 random_deviate.h random_deviate.c erandom.c mpfr-mini-gmp.c             \
 mpfr-mini-gmp.h fmma.c log_ui.c gamma_inc.c ubf.c invert_limb.h 	\
-invsqrt_limb.h
+invsqrt_limb.h beta.c odd_p.c
 
 libmpfr_la_LIBADD = @LIBOBJS@
 
diff --git a/src/add1.c b/src/add1.c
index 066255f69..c4b053dd0 100644
--- a/src/add1.c
+++ b/src/add1.c
@@ -48,6 +48,8 @@ mpfr_add1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
   else
     exp = MPFR_GET_EXP (b);
 
+  MPFR_ASSERTD (exp <= __gmpfr_emax);
+
   MPFR_TMP_MARK(marker);
 
   aq = MPFR_GET_PREC (a);
@@ -561,6 +563,15 @@ mpfr_add1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
     }
 
  set_exponent:
+  if (MPFR_UNLIKELY (exp < __gmpfr_emin))  /* possible if b and c are UBF's */
+    {
+      if (rnd_mode == MPFR_RNDN &&
+          (exp < __gmpfr_emin - 1 ||
+           (inex >= 0 && mpfr_powerof2_raw (a))))
+        rnd_mode = MPFR_RNDZ;
+      inex = mpfr_underflow (a, rnd_mode, MPFR_SIGN(a));
+      goto end_of_add;
+    }
   MPFR_SET_EXP (a, exp);
 
  end_of_add:
diff --git a/src/add1sp.c b/src/add1sp.c
index be0e5529a..80606281d 100644
--- a/src/add1sp.c
+++ b/src/add1sp.c
@@ -102,6 +102,8 @@ int mpfr_add1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 # define DEBUG(x) /**/
 #endif
 
+#if !defined(MPFR_GENERIC_ABI)
+
 /* same as mpfr_add1sp, but for p < GMP_NUMB_BITS */
 static int
 mpfr_add1sp1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
@@ -614,6 +616,8 @@ mpfr_add1sp3 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
     }
 }
 
+#endif /* !defined(MPFR_GENERIC_ABI) */
+
 /* compute sign(b) * (|b| + |c|).
    Returns 0 iff result is exact,
    a negative value when the result is less than the exact value,
@@ -642,6 +646,8 @@ mpfr_add1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 
   /* Read prec and num of limbs */
   p = MPFR_GET_PREC (b);
+
+#if !defined(MPFR_GENERIC_ABI)
   if (p < GMP_NUMB_BITS)
     return mpfr_add1sp1 (a, b, c, rnd_mode, p);
 
@@ -655,6 +661,7 @@ mpfr_add1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 
   if (2 * GMP_NUMB_BITS < p && p < 3 * GMP_NUMB_BITS)
     return mpfr_add1sp3 (a, b, c, rnd_mode, p);
+#endif
 
   /* We need to get the sign before the possible exchange. */
   neg = MPFR_IS_NEG (b);
diff --git a/src/atanh.c b/src/atanh.c
index ae8e860e7..2a5910c55 100644
--- a/src/atanh.c
+++ b/src/atanh.c
@@ -23,11 +23,16 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 #define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
- /* The computation of atanh is done by
-       atanh= 1/2*ln(x+1)-1/2*ln(1-x)   */
+/* The computation of atanh is done by:
+   atanh = ln((1+x)/(1-x)) / 2
+   except when x is very small, in which case atanh = x + tiny error.
+   TODO: When x is small (but x + tiny error cannot be used), the above
+   formula is slow due to the large error as (1+x)/(1-x) is close to 1;
+   one should use: log1p(2x/(1-x)).
+*/
 
 int
-mpfr_atanh (mpfr_ptr y, mpfr_srcptr xt , mpfr_rnd_t rnd_mode)
+mpfr_atanh (mpfr_ptr y, mpfr_srcptr xt, mpfr_rnd_t rnd_mode)
 {
   int inexact;
   mpfr_t x, t, te;
@@ -85,27 +90,25 @@ mpfr_atanh (mpfr_ptr y, mpfr_srcptr xt , mpfr_rnd_t rnd_mode)
   MPFR_TMP_INIT_ABS (x, xt);
   Ny = MPFR_PREC (y);
   Nt = MAX (Nx, Ny);
-  /* the optimal number of bits : see algorithms.ps */
   Nt = Nt + MPFR_INT_CEIL_LOG2 (Nt) + 4;
 
   /* initialize of intermediary variable */
   mpfr_init2 (t, Nt);
   mpfr_init2 (te, Nt);
 
-  /* First computation of cosh */
   MPFR_ZIV_INIT (loop, Nt);
   for (;;)
     {
       /* compute atanh */
-      mpfr_ui_sub (te, 1, x, MPFR_RNDU);   /* (1-xt)*/
-      mpfr_add_ui (t,  x, 1, MPFR_RNDD);   /* (xt+1)*/
-      mpfr_div (t, t, te, MPFR_RNDN);      /* (1+xt)/(1-xt)*/
-      mpfr_log (t, t, MPFR_RNDN);          /* ln((1+xt)/(1-xt))*/
-      mpfr_div_2ui (t, t, 1, MPFR_RNDN);   /* (1/2)*ln((1+xt)/(1-xt))*/
+      mpfr_ui_sub (te, 1, x, MPFR_RNDU);   /* (1-x) with x = |xt| */
+      mpfr_add_ui (t, x, 1, MPFR_RNDD);    /* (1+x) */
+      mpfr_div (t, t, te, MPFR_RNDN);      /* (1+x)/(1-x) */
+      mpfr_log (t, t, MPFR_RNDN);          /* ln((1+x)/(1-x)) */
+      mpfr_div_2ui (t, t, 1, MPFR_RNDN);   /* ln((1+x)/(1-x)) / 2 */
 
       /* error estimate: see algorithms.tex */
       /* FIXME: this does not correspond to the value in algorithms.tex!!! */
-      /* err=Nt-__gmpfr_ceil_log2(1+5*pow(2,1-MPFR_EXP(t)));*/
+      /* err = Nt - __gmpfr_ceil_log2(1+5*pow(2,1-MPFR_EXP(t))); */
       err = Nt - (MAX (4 - MPFR_GET_EXP (t), 0) + 1);
 
       if (MPFR_LIKELY (MPFR_IS_ZERO (t)
@@ -121,10 +124,9 @@ mpfr_atanh (mpfr_ptr y, mpfr_srcptr xt , mpfr_rnd_t rnd_mode)
 
   inexact = mpfr_set4 (y, t, rnd_mode, MPFR_SIGN (xt));
 
-  mpfr_clear(t);
-  mpfr_clear(te);
+  mpfr_clear (t);
+  mpfr_clear (te);
 
   MPFR_SAVE_EXPO_FREE (expo);
   return mpfr_check_range (y, inexact, rnd_mode);
 }
-
diff --git a/src/beta.c b/src/beta.c
new file mode 100644
index 000000000..3fb9b9a77
--- /dev/null
+++ b/src/beta.c
@@ -0,0 +1,345 @@
+/* mpfr_beta -- beta function
+
+Copyright 2017 Free Software Foundation, Inc.
+Contributed by the AriC and Caramba projects, INRIA.
+
+This file is part of the GNU MPFR Library.
+
+The GNU MPFR Library is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as published by
+the Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+The GNU MPFR Library is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
+License for more details.
+
+You should have received a copy of the GNU Lesser General Public License
+along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
+http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
+51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
+
+#define MPFR_NEED_LONGLONG_H /* for MPFR_INT_CEIL_LOG2 */
+#include "mpfr-impl.h"
+
+/* use formula (6.2.2) from Abramowitz & Stegun:
+   beta(z,w) = gamma(z)*gamma(w)/gamma(z+w) */
+int
+mpfr_beta (mpfr_ptr r, mpfr_srcptr z, mpfr_srcptr w, mpfr_rnd_t rnd_mode)
+{
+  mpfr_exp_t emin, emax;
+  mpfr_uexp_t pmin;
+  mpfr_prec_t prec;
+  mpfr_t z_plus_w, tmp, tmp2;
+  int inex, w_integer;
+  MPFR_GROUP_DECL (group);
+  MPFR_ZIV_DECL (loop);
+  MPFR_SAVE_EXPO_DECL (expo);
+
+  if (mpfr_less_p (z, w))
+    return mpfr_beta (r, w, z, rnd_mode);
+
+  /* Now, either z and w are unordered (at least one is a NaN), or z >= w. */
+
+  if (MPFR_ARE_SINGULAR (z, w))
+    {
+      /* if z or w is NaN, return NaN */
+      if (MPFR_IS_NAN (z) || MPFR_IS_NAN (w))
+        {
+          MPFR_SET_NAN (r);
+          MPFR_RET_NAN;
+        }
+      else if (MPFR_IS_INF (z) || MPFR_IS_INF (w))
+        {
+          /* Since we have z >= w:
+             if z = +Inf and w > 0, then r = +0 (including w = +Inf);
+             if z = +Inf and w = 0, then r = NaN
+               [beta(z,1/log(z)) tends to +Inf whereas
+                beta(z,1/log(log(z))) tends to +0]
+             if z = +Inf and w < 0:
+                if w is an integer or -Inf: r = NaN
+                if -2k-1 < w < -2k:   r = -Inf
+                if -2k-2 < w < -2k-1: r = +Inf
+             if w = -Inf and z is finite and not an integer:
+                beta(z,t) for t going to -Inf oscillates between positive and
+                negative values, with poles around integer values of t, thus
+                beta(z,w) gives NaN;
+             if w = -Inf and z is an integer:
+                beta(z,w) gives +0 for z even > 0, -0 for z odd > 0,
+                NaN for z <= 0;
+             if z = -Inf (then w = -Inf too): r = NaN */
+          if (MPFR_IS_INF (z) && MPFR_IS_POS(z)) /* z = +Inf */
+            {
+              if (mpfr_cmp_ui (w, 0) > 0)
+                {
+                  MPFR_SET_ZERO(r);
+                  MPFR_SET_POS(r);
+                  MPFR_RET(0);
+                }
+              else if (MPFR_IS_ZERO(w) || MPFR_IS_INF(w) || mpfr_integer_p (w))
+                {
+                  MPFR_SET_NAN(r);
+                  MPFR_RET_NAN;
+                }
+              else
+                {
+                  long q;
+                  mpfr_t t;
+
+                  MPFR_SAVE_EXPO_MARK (expo);
+                  mpfr_init2 (t, MPFR_PREC_MIN);
+                  mpfr_set_ui (t, 1, MPFR_RNDN);
+                  mpfr_fmodquo (t, &q, w, t, MPFR_RNDD);
+                  mpfr_clear (t);
+                  MPFR_SAVE_EXPO_FREE (expo);
+                  /* q contains the low bits of trunc(w) where trunc() rounds
+                     towards zero, thus if q is odd, then -2k-2 < w < -2k-1 */
+                  MPFR_SET_INF(r);
+                  if ((unsigned long) q & 1)
+                    MPFR_SET_NEG(r);
+                  else
+                    MPFR_SET_POS(r);
+                  MPFR_RET(0);
+                }
+            }
+          else if (MPFR_IS_INF(w)) /* w = -Inf */
+            {
+              if (mpfr_cmp_ui (z, 0) <= 0 || !mpfr_integer_p (z))
+                {
+                  MPFR_SET_NAN(r);
+                  MPFR_RET_NAN;
+                }
+              else
+                {
+                  MPFR_SET_ZERO(r);
+                  if (mpfr_odd_p (z))
+                    MPFR_SET_NEG(r);
+                  else
+                    MPFR_SET_POS(r);
+                  MPFR_RET(0);
+                }
+            }
+        }
+      else /* z or w is 0 */
+        {
+          /* If x is not a nonpositive integer, Gamma(x) is regular, so that
+             when y -> 0 with either y >= 0 or y <= 0,
+               Beta(x,y) ~ Gamma(x) * Gamma(y) / Gamma(x) = Gamma(y)
+             Gamma(y) tends to an infinity of the same sign as y.
+             Thus Beta(x,y) should be an infinity of the same sign as y.
+           */
+          if (mpfr_cmp_ui (z, 0) != 0) /* then w is +0 or -0 and z > 0 */
+            {
+              /* beta(z,+0) = +Inf, beta(z,-0) = -Inf (see above) */
+              MPFR_SET_INF(r);
+              MPFR_SET_SAME_SIGN(r,w);
+              MPFR_SET_DIVBY0 ();
+              MPFR_RET(0);
+            }
+          else if (mpfr_cmp_ui (w, 0) != 0) /* then z is +0 or -0 and w < 0 */
+            {
+              if (mpfr_integer_p (w))
+                {
+                  /* For small u > 0, Beta(2u,w+u) and Beta(2u,w-u) have
+                     opposite signs, so that they tend to infinities of
+                     opposite signs when u -> 0. Thus the result is NaN. */
+                  MPFR_SET_NAN(r);
+                  MPFR_RET_NAN;
+                }
+              else
+                {
+                  /* beta(+0,w) = +Inf, beta(-0,w) = -Inf (see above) */
+                  MPFR_SET_INF(r);
+                  MPFR_SET_SAME_SIGN(r,z);
+                  MPFR_SET_DIVBY0 ();
+                  MPFR_RET(0);
+                }
+            }
+          else /* w = z = 0:
+                  beta(+0,+0) = +Inf
+                  beta(-0,-0) = -Inf
+                  beta(+0,-0) = NaN */
+            {
+              if (MPFR_SIGN(z) == MPFR_SIGN(w))
+                {
+                  MPFR_SET_INF(r);
+                  MPFR_SET_SAME_SIGN(r,z);
+                  MPFR_SET_DIVBY0 ();
+                  MPFR_RET(0);
+                }
+              else
+                {
+                  MPFR_SET_NAN(r);
+                  MPFR_RET_NAN;
+                }
+            }
+        }
+    }
+
+  /* special case when w is a negative integer */
+  w_integer = mpfr_integer_p (w);
+  if (w_integer && MPFR_IS_NEG(w))
+    {
+      /* if z < 0 or z+w > 0, or z is not an integer, return NaN */
+      if (MPFR_IS_NEG(z) || mpfr_cmpabs (z, w) > 0 || !mpfr_integer_p (z))
+        {
+          MPFR_SET_NAN(r);
+          MPFR_RET_NAN;
+        }
+      /* If z+w = 0, the result is 1/z. */
+      if (mpfr_cmpabs (z, w) == 0)
+        return mpfr_ui_div (r, 1, z, rnd_mode);
+      /* Now z is an integer and z+w <= 0: return (-1)^z*beta(z,1-w-z).
+         Since z and w are of opposite signs, |z+w| <= max(|z|,|w|). */
+      emax = MAX (MPFR_EXP(z), MPFR_EXP(w));
+      mpfr_init2 (z_plus_w, (mpfr_prec_t) emax);
+      inex = mpfr_add (z_plus_w, z, w, MPFR_RNDN);
+      MPFR_ASSERTN(inex == 0);
+      inex = mpfr_ui_sub (z_plus_w, 1, z_plus_w, MPFR_RNDN);
+      MPFR_ASSERTN(inex == 0);
+      if (mpfr_odd_p (z))
+        {
+          inex = -mpfr_beta (r, z, z_plus_w, MPFR_INVERT_RND (rnd_mode));
+          MPFR_CHANGE_SIGN(r);
+        }
+      else
+        inex = mpfr_beta (r, z, z_plus_w, rnd_mode);
+      mpfr_clear (z_plus_w);
+      return inex;
+    }
+
+  /* special case when z is a negative integer: here w < z and w is not an
+     integer */
+  if (mpfr_integer_p (z) && MPFR_IS_NEG(z))
+    {
+      MPFR_SET_NAN(r);
+      MPFR_RET_NAN;
+    }
+
+  MPFR_SAVE_EXPO_MARK (expo);
+
+  /* compute the smallest precision such that z + w is exact */
+  emax = MAX (MPFR_EXP(z), MPFR_EXP(w));
+  emin = MIN (MPFR_EXP(z) - MPFR_PREC(z), MPFR_EXP(w) - MPFR_PREC(w));
+  MPFR_ASSERTD (emax >= emin);
+  /* Thus the math value of emax - emin is representable in mpfr_uexp_t. */
+  pmin = (mpfr_uexp_t) emax - emin;
+  /* If z and w have same sign, their sum can have exponent emax + 1. */
+  pmin += 1;
+  if (pmin > MPFR_PREC_MAX) /* FIXME: check if result can differ from NaN. */
+    {
+      MPFR_SAVE_EXPO_FREE (expo);
+      MPFR_SET_NAN(r);
+      MPFR_RET_NAN;
+    }
+  MPFR_ASSERTN (pmin <= MPFR_PREC_MAX);  /* detect integer overflow */
+  mpfr_init2 (z_plus_w, (mpfr_prec_t) pmin);
+  inex = mpfr_add (z_plus_w, z, w, MPFR_RNDN);
+  /* if z+w overflows with rounding to nearest, then w must be larger than
+     1/2*ulp(z), thus we have an underflow. */
+  if (MPFR_IS_INF(z_plus_w))
+    {
+      mpfr_clear (z_plus_w);
+      MPFR_SAVE_EXPO_FREE (expo);
+      return mpfr_underflow (r, rnd_mode, 1);
+    }
+  MPFR_ASSERTN(inex == 0);
+
+  /* If z+w is 0 or a negative integer, return +0 when w (and thus z) is not
+     an integer. Indeed, gamma(z) and gamma(w) are regular numbers, and
+     gamma(z+w) is Inf, thus 1/gamma(z+w) is zero. Unless there is a rule
+     to choose the sign of 0, we choose +0. */
+  if (mpfr_cmp_ui (z_plus_w, 0) <= 0 && !w_integer
+      && mpfr_integer_p (z_plus_w))
+    {
+      mpfr_clear (z_plus_w);
+      MPFR_SAVE_EXPO_FREE (expo);
+      MPFR_SET_ZERO(r);
+      MPFR_SET_POS(r);
+      MPFR_RET(0);
+    }
+
+  prec = MPFR_PREC(r);
+  prec += MPFR_INT_CEIL_LOG2 (prec);
+  MPFR_GROUP_INIT_2 (group, prec, tmp, tmp2);
+  MPFR_ZIV_INIT (loop, prec);
+  for (;;)
+    {
+      unsigned int inex2;  /* unsigned due to bitwise operations */
+
+      MPFR_GROUP_REPREC_2 (group, prec, tmp, tmp2);
+      inex2 = mpfr_gamma (tmp, z, MPFR_RNDN);
+      /* tmp = gamma(z) * (1 + theta) with |theta| <= 2^-prec */
+      inex2 |= mpfr_gamma (tmp2, w, MPFR_RNDN);
+      /* tmp2 = gamma(w) * (1 + theta2) with |theta2| <= 2^-prec */
+      inex2 |= mpfr_mul (tmp, tmp, tmp2, MPFR_RNDN);
+      /* tmp = gamma(z)*gamma(w) * (1 + theta3)^3 with |theta3| <= 2^-prec */
+      inex2 |= mpfr_gamma (tmp2, z_plus_w, MPFR_RNDN);
+      /* tmp2 = gamma(z+w) * (1 + theta4) with |theta4| <= 2^-prec */
+      inex2 |= mpfr_div (tmp, tmp, tmp2, MPFR_RNDN);
+      /* tmp = gamma(z)*gamma(w)/gamma(z+w) * (1 + theta5)^5
+         with |theta5| <= 2^-prec. For prec >= 3, we have
+         |(1 + theta5)^5 - 1| <= 7 * 2^(-prec), thus the error is bounded
+         by 7 ulps */
+
+      if (MPFR_IS_NAN(tmp)) /* FIXME: most probably gamma(z)*gamma(w) = +-Inf,
+                               and gamma(z+w) = +-Inf, can we do better? */
+        {
+          mpfr_clear (z_plus_w);
+          MPFR_ZIV_FREE (loop);
+          MPFR_GROUP_CLEAR (group);
+          MPFR_SAVE_EXPO_FREE (expo);
+          MPFR_SET_NAN(r);
+          MPFR_RET_NAN;
+        }
+
+      MPFR_ASSERTN(mpfr_regular_p (tmp));
+
+      /* if inex2 = 0, then tmp is exactly beta(z,w) */
+      if (inex2 == 0 ||
+          MPFR_LIKELY (MPFR_CAN_ROUND (tmp, prec - 3, MPFR_PREC(r), rnd_mode)))
+        break;
+
+      /* beta(1,+/-2^(-k)) = +/-2^k is exact, and cannot be detected above
+         since gamma(+/-2^(-k)) is not exact */
+      if (mpfr_cmp_ui (z, 1) == 0)
+        {
+          mpfr_exp_t expw = mpfr_get_exp (w);
+          if (mpfr_cmp_ui_2exp (w, 1, expw - 1) == 0)
+            {
+              /* since z >= w, this will only match w <= 1 */
+              mpfr_set_ui_2exp (tmp, 1, 1 - expw, MPFR_RNDN);
+              break;
+            }
+          else if (mpfr_cmp_si_2exp (w, -1, expw - 1) == 0)
+            {
+              mpfr_set_si_2exp (tmp, -1, 1 - expw, MPFR_RNDN);
+              break;
+            }
+        }
+
+      /* beta(2^k,1) = 1/2^k for k > 0 (k <= 0 was already tested above) */
+      if (mpfr_cmp_ui (w, 1) == 0 &&
+          mpfr_cmp_ui_2exp (z, 1, MPFR_EXP(z) - 1) == 0)
+        {
+          mpfr_set_ui_2exp (tmp, 1, 1 - MPFR_EXP(z), MPFR_RNDN);
+          break;
+        }
+
+      /* beta(2,-0.5) = -4 */
+      if (mpfr_cmp_ui (z, 2) == 0 && mpfr_cmp_si_2exp (w, -1, -1) == 0)
+        {
+          mpfr_set_si_2exp (tmp, -1, 2, MPFR_RNDN);
+          break;
+        }
+
+      MPFR_ZIV_NEXT (loop, prec);
+    }
+  MPFR_ZIV_FREE (loop);
+  inex = mpfr_set (r, tmp, rnd_mode);
+  MPFR_GROUP_CLEAR (group);
+  mpfr_clear (z_plus_w);
+  MPFR_SAVE_EXPO_FREE (expo);
+  return mpfr_check_range (r, inex, rnd_mode);
+}
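Note on the error analysis in the Ziv loop of mpfr_beta above: the comment claims that five accumulated relative errors of at most 2^-prec stay within 7 * 2^-prec once prec >= 3. The sketch below is a numerical spot check of that arithmetic in plain C doubles, not a proof and not MPFR code. The helper name beta_error_bound_holds is hypothetical, and the third test (one error factor in the divisor) is an extra conservative case added here for illustration.

```c
/* Hypothetical helper, not part of MPFR: spot-check the "7 ulps" bound
   from the Ziv loop comment for t = 2^-prec. */
static int
beta_error_bound_holds (int prec)
{
  double t = 1.0, up4 = 1.0, up5, down5 = 1.0;
  int i;

  for (i = 0; i < prec; i++)
    t /= 2.0;                       /* t = 2^-prec */
  for (i = 0; i < 4; i++)
    up4 *= 1.0 + t;                 /* (1 + t)^4 */
  up5 = up4 * (1.0 + t);            /* (1 + t)^5 */
  for (i = 0; i < 5; i++)
    down5 *= 1.0 - t;               /* (1 - t)^5 */

  return up5 - 1.0 <= 7.0 * t
    && 1.0 - down5 <= 7.0 * t
    && up4 / (1.0 - t) - 1.0 <= 7.0 * t;  /* one factor in the divisor */
}
```

With prec = 2 the check fails ((1.25)^5 - 1 is about 2.05, above 7/4), which is consistent with the "prec >= 3" restriction in the comment.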
diff --git a/src/div.c b/src/div.c
index 07756e3b7..890d8984d 100644
--- a/src/div.c
+++ b/src/div.c
@@ -205,7 +205,7 @@ mpfr_div_1 (mpfr_ptr q, mpfr_srcptr u, mpfr_srcptr v, mpfr_rnd_t rnd_mode)
 
   MPFR_EXP (q) = qx; /* Don't use MPFR_SET_EXP since qx might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(qx >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -245,6 +245,149 @@ mpfr_div_1 (mpfr_ptr q, mpfr_srcptr u, mpfr_srcptr v, mpfr_rnd_t rnd_mode)
     }
 }
 
+/* Special code for PREC(q) = GMP_NUMB_BITS,
+   with PREC(u), PREC(v) <= GMP_NUMB_BITS. */
+static int
+mpfr_div_1n (mpfr_ptr q, mpfr_srcptr u, mpfr_srcptr v, mpfr_rnd_t rnd_mode)
+{
+  mpfr_limb_ptr qp = MPFR_MANT(q);
+  mpfr_exp_t qx = MPFR_GET_EXP(u) - MPFR_GET_EXP(v);
+  mp_limb_t u0 = MPFR_MANT(u)[0];
+  mp_limb_t v0 = MPFR_MANT(v)[0];
+  mp_limb_t q0, rb, sb, l;
+  int extra;
+
+  MPFR_ASSERTD(MPFR_PREC(q) == GMP_NUMB_BITS);
+  MPFR_ASSERTD(MPFR_PREC(u) <= GMP_NUMB_BITS);
+  MPFR_ASSERTD(MPFR_PREC(v) <= GMP_NUMB_BITS);
+
+  if ((extra = (u0 >= v0)))
+    u0 -= v0;
+
+#if GMP_NUMB_BITS == 64 /* __gmpfr_invert_limb_approx only exists for 64-bit */
+  {
+    mp_limb_t inv, h;
+
+    /* First compute an approximate quotient. */
+    __gmpfr_invert_limb_approx (inv, v0);
+    umul_ppmm (rb, sb, u0, inv);
+    q0 = u0 + rb;
+    /* rb does not exceed the true quotient floor(u0*2^GMP_NUMB_BITS/v0),
+       with error at most 2, which means the rational quotient q satisfies
+       rb <= q < rb + 3, thus the true quotient is rb, rb+1 or rb+2 */
+    umul_ppmm (h, l, q0, v0);
+    MPFR_ASSERTD(h < u0 || (h == u0 && l == MPFR_LIMB_ZERO));
+    /* subtract {h,l} from {u0,0} */
+    sub_ddmmss (h, l, u0, 0, h, l);
+    /* the remainder {h, l} should be < v0 */
+    /* This while loop is executed at most two times, but does not seem
+       slower than two consecutive identical if-statements. */
+    while (h || l >= v0)
+      {
+        q0 ++;
+        h -= (l < v0);
+        l -= v0;
+      }
+    MPFR_ASSERTD(h == 0 && l < v0);
+  }
+#else
+  udiv_qrnnd (q0, l, u0, 0, v0);
+#endif
+
+  /* now (u0 - extra*v0) * 2^GMP_NUMB_BITS = q0*v0 + l with 0 <= l < v0 */
+
+  /* If extra=0, the quotient is q0, the round bit is 1 if l >= v0/2,
+     and sb are the remaining bits from l.
+     If extra=1, the quotient is MPFR_LIMB_HIGHBIT + (q0 >> 1), the round bit
+     is the least significant bit of q0, and sb is l. */
+
+  if (extra == 0)
+    {
+      qp[0] = q0;
+      /* If "l + l < l", then there is a carry in l + l, thus 2*l > v0.
+         Otherwise if there is no carry, we check whether 2*l >= v0. */
+      rb = (l + l < l) || (l + l >= v0);
+      sb = (rb) ? l + l - v0 : l;
+    }
+  else
+    {
+      qp[0] = MPFR_LIMB_HIGHBIT | (q0 >> 1);
+      rb = q0 & MPFR_LIMB_ONE;
+      sb = l;
+      qx ++;
+    }
+
+  MPFR_SIGN(q) = MPFR_MULT_SIGN (MPFR_SIGN (u), MPFR_SIGN (v));
+
+  /* rounding */
+  if (MPFR_UNLIKELY(qx > __gmpfr_emax))
+    return mpfr_overflow (q, rnd_mode, MPFR_SIGN(q));
+
+  /* Warning: underflow should be checked *after* rounding, thus when rounding
+     away and when q > 0.111...111*2^(emin-1), or when rounding to nearest and
+     q >= 0.111...111[1]*2^(emin-1), there is no underflow. */
+  if (MPFR_UNLIKELY(qx < __gmpfr_emin))
+    {
+      /* Note: the case 0.111...111*2^(emin-1) < q < 2^(emin-1) is not possible
+         here since (up to exponent) this would imply 1 - 2^(-p) < u/v < 1,
+         thus v - 2^(-p)*v < u < v, and since we can assume 1/2 <= v < 1, it
+         would imply v - 2^(-p) = v - ulp(v) < u < v, which has no solution. */
+
+      /* For RNDN, mpfr_underflow always rounds away, thus for |q|<=2^(emin-2)
+         we have to change to RNDZ. This corresponds to:
+         (a) either qx < emin - 1
+         (b) or qx = emin - 1 and qp[0] = 1000....000 and rb = sb = 0.
+         Note: in case (b), it suffices to check whether sb = 0, since rb = 1
+         and sb = 0 is not possible (the exact quotient would have p+1 bits,
+         thus u would need at least p+1 bits). */
+      if (rnd_mode == MPFR_RNDN &&
+          (qx < __gmpfr_emin - 1 || (qp[0] == MPFR_LIMB_HIGHBIT && sb == 0)))
+        rnd_mode = MPFR_RNDZ;
+      return mpfr_underflow (q, rnd_mode, MPFR_SIGN(q));
+    }
+
+  MPFR_EXP (q) = qx; /* Don't use MPFR_SET_EXP since qx might be < __gmpfr_emin
+                        in the cases "goto rounding" above. */
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
+    {
+      MPFR_ASSERTD(qx >= __gmpfr_emin);
+      return 0; /* idem than MPFR_RET(0) but faster */
+    }
+  else if (rnd_mode == MPFR_RNDN)
+    {
+      /* It is not possible to have rb <> 0 and sb = 0 here, since it would
+         mean an n-bit by n-bit division gives an exact (n+1)-bit number.
+         And since the case rb = sb = 0 was already dealt with, we cannot
+         have sb = 0. Thus we cannot be in the middle of two numbers. */
+      MPFR_ASSERTD(sb != 0);
+      if (rb == 0)
+        goto truncate;
+      else
+        goto add_one_ulp;
+    }
+  else if (MPFR_IS_LIKE_RNDZ(rnd_mode, MPFR_IS_NEG(q)))
+    {
+    truncate:
+      MPFR_ASSERTD(qx >= __gmpfr_emin);
+      MPFR_RET(-MPFR_SIGN(q));
+    }
+  else /* round away from zero */
+    {
+    add_one_ulp:
+      qp[0] += MPFR_LIMB_ONE;
+      if (qp[0] == 0)
+        {
+          qp[0] = MPFR_LIMB_HIGHBIT;
+          if (MPFR_UNLIKELY(qx + 1 > __gmpfr_emax))
+            return mpfr_overflow (q, rnd_mode, MPFR_SIGN(q));
+          MPFR_ASSERTD(qx + 1 <= __gmpfr_emax);
+          MPFR_ASSERTD(qx + 1 >= __gmpfr_emin);
+          MPFR_SET_EXP (q, qx + 1);
+        }
+      MPFR_RET(MPFR_SIGN(q));
+    }
+}
+
 /* Special code for GMP_NUMB_BITS < PREC(q) < 2*GMP_NUMB_BITS and
    PREC(u) = PREC(v) = PREC(q) */
 static int
@@ -468,7 +611,7 @@ mpfr_div_2 (mpfr_ptr q, mpfr_srcptr u, mpfr_srcptr v, mpfr_rnd_t rnd_mode)
 
   MPFR_EXP (q) = qx; /* Don't use MPFR_SET_EXP since qx might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(qx >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -826,7 +969,10 @@ mpfr_div (mpfr_ptr q, mpfr_srcptr u, mpfr_srcptr v, mpfr_rnd_t rnd_mode)
 
       if (GMP_NUMB_BITS < MPFR_GET_PREC(q) &&
           MPFR_GET_PREC(q) < 2 * GMP_NUMB_BITS)
-    return mpfr_div_2 (q, u, v, rnd_mode);
+        return mpfr_div_2 (q, u, v, rnd_mode);
+
+      if (MPFR_GET_PREC(q) == GMP_NUMB_BITS)
+        return mpfr_div_1n (q, u, v, rnd_mode);
     }
 #endif /* !defined(MPFR_GENERIC_ABI) */
 
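Note on mpfr_div_1n above: the way the quotient limb, round bit and sticky bit are derived from a single-limb division can be sketched with 32-bit limbs and a 64-bit intermediate. This is an illustration under assumptions, not the MPFR code: the type div1_result and the function div_one_limb are hypothetical names, the limb width is 32 instead of GMP_NUMB_BITS, and a plain 64-bit division stands in for __gmpfr_invert_limb_approx / udiv_qrnnd. Both inputs are assumed normalized (most significant bit set).

```c
#include <stdint.h>

/* Hypothetical result type: quotient limb, round bit, sticky flag, and the
   amount to add to EXP(u) - EXP(v). */
typedef struct { uint32_t q; int rb; int sb; int exp_adjust; } div1_result;

/* u0 and v0 must be normalized, i.e. have their most significant bit set. */
static div1_result
div_one_limb (uint32_t u0, uint32_t v0)
{
  div1_result r;
  int extra = (u0 >= v0);
  uint64_t num;
  uint32_t q0, l;

  if (extra)
    u0 -= v0;                       /* now u0 < v0 in both cases */
  num = (uint64_t) u0 << 32;        /* (u0 - extra*v0) * 2^32 */
  q0 = (uint32_t) (num / v0);       /* fits in one limb since u0 < v0 */
  l = (uint32_t) (num % v0);        /* 0 <= l < v0 */

  if (!extra)
    {
      /* 2*l is formed in 64 bits, so no carry test is needed here. */
      uint64_t twice = (uint64_t) l << 1;
      r.q = q0;
      r.rb = twice >= v0;
      r.sb = (r.rb ? twice - v0 : l) != 0;
      r.exp_adjust = 0;
    }
  else
    {
      /* The quotient is in [1,2): shift right, the low bit of q0 becomes
         the round bit and the remainder is the sticky part. */
      r.q = UINT32_C (0x80000000) | (q0 >> 1);
      r.rb = q0 & 1;
      r.sb = l != 0;
      r.exp_adjust = 1;
    }
  return r;
}
```

For example, with u0 = 0x80000000 and v0 = 0xC0000000 (ratio 2/3 = 0.1010...b), the quotient limb is 0xAAAAAAAA with both the round bit and the sticky flag set.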
diff --git a/src/fma.c b/src/fma.c
index 945b8bb39..958176035 100644
--- a/src/fma.c
+++ b/src/fma.c
@@ -20,6 +20,7 @@ along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
 http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
 
+#define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
 /* The fused-multiply-add (fma) of x, y and z is defined by:
@@ -126,32 +127,76 @@ mpfr_fma (mpfr_ptr s, mpfr_srcptr x, mpfr_srcptr y, mpfr_srcptr z,
      we assume mpn_mul_n is faster up to 4*MPFR_MUL_THRESHOLD).
      Since |EXP(x)|, |EXP(y)| < 2^(k-2) on a k-bit computer,
      |EXP(x)+EXP(y)| < 2^(k-1), thus cannot overflow nor underflow. */
-  n = MPFR_LIMB_SIZE(x);
-  if (n <= 4 * MPFR_MUL_THRESHOLD && MPFR_PREC(x) == MPFR_PREC(y) &&
-      e <= __gmpfr_emax && e > __gmpfr_emin)
+  if (MPFR_PREC(x) == MPFR_PREC(y) && e <= __gmpfr_emax && e > __gmpfr_emin)
     {
-      mp_size_t un = n + n;
-      mpfr_limb_ptr up;
-      MPFR_TMP_DECL(marker);
-
-      MPFR_TMP_MARK(marker);
-      MPFR_TMP_INIT (up, u, un * GMP_NUMB_BITS, un);
-      up = MPFR_MANT(u);
-      /* multiply x*y exactly into u */
-      mpn_mul_n (up, MPFR_MANT(x), MPFR_MANT(y), n);
-      if (MPFR_LIMB_MSB (up[un - 1]) == 0)
+      if (MPFR_PREC(x) < GMP_NUMB_BITS && MPFR_PREC(z) == MPFR_PREC(x))
         {
-          mpn_lshift (up, up, un, 1);
-          MPFR_EXP(u) = e - 1;
+          mp_limb_t umant[2], zmant[2];
+          mpfr_t zz;
+          int inex;
+
+          umul_ppmm (umant[1], umant[0], MPFR_MANT(x)[0], MPFR_MANT(y)[0]);
+          MPFR_PREC(u) = MPFR_PREC(zz) = 2 * MPFR_PREC(x);
+          MPFR_MANT(u) = umant;
+          MPFR_MANT(zz) = zmant;
+          MPFR_SIGN(u) = MPFR_MULT_SIGN( MPFR_SIGN(x) , MPFR_SIGN(y) );
+          MPFR_SIGN(zz) = MPFR_SIGN(z);
+          MPFR_EXP(zz) = MPFR_EXP(z);
+          if (MPFR_PREC(zz) <= GMP_NUMB_BITS) /* zz fits in one limb */
+            {
+              if ((umant[1] & MPFR_LIMB_HIGHBIT) == 0)
+                {
+                  umant[0] = umant[1] << 1;
+                  MPFR_EXP(u) = e - 1;
+                }
+              else
+                {
+                  umant[0] = umant[1];
+                  MPFR_EXP(u) = e;
+                }
+              zmant[0] = MPFR_MANT(z)[0];
+            }
+          else
+            {
+              zmant[1] = MPFR_MANT(z)[0];
+              zmant[0] = MPFR_LIMB_ZERO;
+              if ((umant[1] & MPFR_LIMB_HIGHBIT) == 0)
+                {
+                  umant[1] = (umant[1] << 1) | (umant[0] >> (GMP_NUMB_BITS - 1));
+                  umant[0] = umant[0] << 1;
+                  MPFR_EXP(u) = e - 1;
+                }
+              else
+                MPFR_EXP(u) = e;
+            }
+          inex = mpfr_add (u, u, zz, rnd_mode);
+          return mpfr_set_1_2 (s, u, rnd_mode, inex);
+        }
+      else if ((n = MPFR_LIMB_SIZE(x)) <= 4 * MPFR_MUL_THRESHOLD)
+        {
+          mpfr_limb_ptr up;
+          mp_size_t un = n + n;
+          MPFR_TMP_DECL(marker);
+
+          MPFR_TMP_MARK(marker);
+          MPFR_TMP_INIT (up, u, un * GMP_NUMB_BITS, un);
+          up = MPFR_MANT(u);
+          /* multiply x*y exactly into u */
+          mpn_mul_n (up, MPFR_MANT(x), MPFR_MANT(y), n);
+          if (MPFR_LIMB_MSB (up[un - 1]) == 0)
+            {
+              mpn_lshift (up, up, un, 1);
+              MPFR_EXP(u) = e - 1;
+            }
+          else
+            MPFR_EXP(u) = e;
+          MPFR_SIGN(u) = MPFR_MULT_SIGN( MPFR_SIGN(x) , MPFR_SIGN(y) );
+          /* The above code does not generate any exception.
+             The exceptions will come only from mpfr_add. */
+          inexact = mpfr_add (s, u, z, rnd_mode);
+          MPFR_TMP_FREE(marker);
+          return inexact;
         }
-      else
-        MPFR_EXP(u) = e;
-      MPFR_SIGN(u) = MPFR_MULT_SIGN( MPFR_SIGN(x) , MPFR_SIGN(y) );
-      /* The above code does not generate any exception.
-         The exceptions will come only from mpfr_add. */
-      inexact = mpfr_add (s, u, z, rnd_mode);
-      MPFR_TMP_FREE(marker);
-      return inexact;
     }
 
   /* If we take prec(u) >= prec(x) + prec(y), the product u <- x*y
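Note on the new single-limb path in mpfr_fma above: after umul_ppmm, the double-width product of two normalized significands has its top bit either set or clear, and in the latter case it is shifted left by one with the exponent decremented. A minimal sketch of that normalization step, assuming 32-bit limbs and a 64-bit product in place of umul_ppmm; the function name mul_normalize is hypothetical and not part of MPFR.

```c
#include <stdint.h>

/* Multiply two normalized 32-bit significands (most significant bit set)
   and return the normalized 64-bit product. *exp_adjust receives 0, or -1
   when a left shift by one bit was needed. */
static uint64_t
mul_normalize (uint32_t x0, uint32_t y0, int *exp_adjust)
{
  uint64_t p = (uint64_t) x0 * y0;  /* lies in [2^62, 2^64) */

  if ((p >> 63) == 0)
    {
      p <<= 1;                      /* at most one shift is ever needed */
      *exp_adjust = -1;
    }
  else
    *exp_adjust = 0;
  return p;
}
```

Since both inputs are at least 2^31, the product is at least 2^62, which is why a single shift always suffices.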
diff --git a/src/fmma.c b/src/fmma.c
index 9198ccefa..6168c2af0 100644
--- a/src/fmma.c
+++ b/src/fmma.c
@@ -27,8 +27,10 @@ mpfr_fmma_aux (mpfr_ptr z, mpfr_srcptr a, mpfr_srcptr b, mpfr_srcptr c,
                mpfr_srcptr d, mpfr_rnd_t rnd, int neg)
 {
   mpfr_ubf_t u, v;
+  mpfr_t zz;
+  mpfr_prec_t prec_z = MPFR_PREC(z);
   mp_size_t un, vn;
-  mpfr_limb_ptr up, vp;
+  mpfr_limb_ptr up, vp, zp;
   int inex;
   MPFR_TMP_DECL(marker);
 
@@ -52,7 +54,17 @@ mpfr_fmma_aux (mpfr_ptr z, mpfr_srcptr a, mpfr_srcptr b, mpfr_srcptr c,
   mpfr_ubf_mul_exact (v, c, d);
   if (neg)
     MPFR_CHANGE_SIGN (v);
-  inex = mpfr_add (z, (mpfr_srcptr) u, (mpfr_srcptr) v, rnd);
+  if (prec_z == MPFR_PREC(a) && prec_z == MPFR_PREC(b) &&
+      prec_z == MPFR_PREC(c) && prec_z == MPFR_PREC(d) &&
+      un == MPFR_PREC2LIMBS(2 * prec_z))
+    {
+      MPFR_TMP_INIT (zp, zz, 2 * prec_z, un);
+      MPFR_PREC(u) = MPFR_PREC(v) = 2 * prec_z;
+      inex = mpfr_add (zz, (mpfr_srcptr) u, (mpfr_srcptr) v, rnd);
+      inex = mpfr_set_1_2 (z, zz, rnd, inex);
+    }
+  else
+    inex = mpfr_add (z, (mpfr_srcptr) u, (mpfr_srcptr) v, rnd);
 
   MPFR_UBF_CLEAR_EXP (u);
   MPFR_UBF_CLEAR_EXP (v);
diff --git a/src/fms.c b/src/fms.c
index 6f7c5e6a4..5f05fbb23 100644
--- a/src/fms.c
+++ b/src/fms.c
@@ -24,7 +24,7 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 
 /* The fused-multiply-subtract (fms) of x, y and z is defined by:
    fms(x,y,z)= x*y - z
-   Note: this is neither in IEEE754R, nor in LIA-2, but both the
+   Note: this is neither in IEEE 754-2008, nor in LIA-2, but both the
    PowerPC and the Itanium define fms as x*y - z.
 */
 int
diff --git a/src/gamma.c b/src/gamma.c
index 7f42509e7..d3ad0c516 100644
--- a/src/gamma.c
+++ b/src/gamma.c
@@ -158,14 +158,25 @@ mpfr_gamma (mpfr_ptr gamma, mpfr_srcptr x, mpfr_rnd_t rnd_mode)
         }
     }
 
-  /* Check for tiny arguments, where gamma(x) ~ 1/x - euler + ....
+  /* Check for tiny arguments, where gamma(x) ~ 1/x - euler + ... can be
+     approximated by 1/x, with some error term ~= - euler.
+     We need to make sure that there are no breakpoints (discontinuity
+     points of the rounding function) between gamma(x) and 1/x (included),
+     where the possible breakpoints (for all rounding modes) are the numbers
+     that fit on PREC(gamma)+1 bits. There will be a special case when |x|
+     is a power of two, since such values are breakpoints. We choose the
+     minimum n such that x fits on n bits and the breakpoints fit on n+1 bits,
+     thus
+       n = MAX(MPFR_PREC(x), MPFR_PREC(gamma)).
      We know from "Bound on Runs of Zeros and Ones for Algebraic Functions",
      Proceedings of Arith15, T. Lang and J.-M. Muller, 2001, that the maximal
-     number of consecutive zeroes or ones after the round bit is n-1 for an
-     input of n bits. But we need a more precise lower bound. Assume x has
-     n bits, and 1/x is near a floating-point number y of n+1 bits. We can
-     write x = X*2^e, y = Y/2^f with X, Y integers of n and n+1 bits.
-     Thus X*Y^2^(e-f) is near from 1, i.e., X*Y is near from 2^(f-e).
+     number of consecutive zeroes or ones after the round bit for 1/x is n-1
+     for an input x of n bits [this is actually a much older result!].
+     But we need a more precise lower bound. Assume that 1/x is near a
+     breakpoint y. From the definition of n, the input x fits on n bits
+     and the breakpoint y fits on n+1 bits. We can write x = X*2^e,
+     y = Y/2^f with X, Y integers of n and n+1 bits respectively.
+     Thus X*Y*2^(e-f) is near 1, i.e., X*Y is near the integer 2^(f-e).
      Two cases can happen:
      (i) either X*Y is exactly 2^(f-e), but this can happen only if X and Y
          are themselves powers of two, i.e., x is a power of two;
@@ -173,13 +184,21 @@ mpfr_gamma (mpfr_ptr gamma, mpfr_srcptr x, mpfr_rnd_t rnd_mode)
           |xy-1| >= 2^(e-f), or |y-1/x| >= 2^(e-f)/x = 2^(-f)/X >= 2^(-f-n).
           Since ufp(y) = 2^(n-f) [ufp = unit in first place], this means
           that the distance |y-1/x| >= 2^(-2n) ufp(y).
-          Now assuming |gamma(x)-1/x| <= 1, which is true for x <= 1,
-          if 2^(-2n) ufp(y) >= 2, the error is at most 2^(-2n-1) ufp(y),
-          and round(1/x) with precision >= 2n+2 gives the correct result.
-          If x < 2^E, then y > 2^(-E), thus ufp(y) > 2^(-E-1).
-          A sufficient condition is thus EXP(x) + 2 <= -2 MAX(PREC(x),PREC(Y)).
+          Now, assuming |gamma(x)-1/x| < 1, which is true for 0 < x <= 1,
+          if 2^(-2n) ufp(y) >= 1, then gamma(x) and 1/x round in the same
+          way, so that rounding 1/x gives the correct result and correct
+          (nonzero) ternary value.
+          If x < 2^E, then y >= 2^(-E), thus ufp(y) >= 2^(-E).
+          A sufficient condition is thus EXP(x) <= -2n, where
+          n = MAX(MPFR_PREC(x), MPFR_PREC(gamma)).
   */
-  if (MPFR_GET_EXP (x) + 2
+  /* TODO: The above proof uses the same precision for input and output.
+     Without this assumption, one might obtain a bound like
+     PREC(x) + PREC(y) instead of 2 MAX(PREC(x),PREC(y)). */
+  /* TODO: Handle the very small arguments that do not satisfy the condition,
+     by using the approximation 1/x - euler and a Ziv loop. Otherwise, after
+     some tests, even Gamma(1+x)/x would be faster than the generic code. */
+  if (MPFR_GET_EXP (x)
       <= -2 * (mpfr_exp_t) MAX(MPFR_PREC(x), MPFR_PREC(gamma)))
     {
       int sign = MPFR_SIGN (x); /* retrieve sign before possible override */
@@ -197,7 +216,7 @@ mpfr_gamma (mpfr_ptr gamma, mpfr_srcptr x, mpfr_rnd_t rnd_mode)
         mpfr_powerof2_raw (x);
 
       MPFR_BLOCK (flags, inex = mpfr_ui_div (gamma, 1, x, rnd_mode));
-      if (inex == 0) /* x is a power of two */
+      if (inex == 0) /* |x| is a power of two */
         {
           /* return RND(1/x - euler) = RND(+/- 2^k - eps) with eps > 0 */
           if (rnd_mode == MPFR_RNDN || MPFR_IS_LIKE_RNDU (rnd_mode, sign))
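Note on the revised tiny-argument test in mpfr_gamma above: the new sufficient condition is EXP(x) <= -2n with n = MAX(PREC(x), PREC(gamma)), where the old code required EXP(x) + 2 <= -2n. The predicate can be sketched on plain integers as follows; the function name tiny_gamma_shortcut and the use of long for exponents and precisions are assumptions for illustration, not MPFR definitions.

```c
/* Returns 1 when the 1/x shortcut condition from the comment holds:
   exp_x <= -2 * max(prec_x, prec_gamma). */
static int
tiny_gamma_shortcut (long exp_x, long prec_x, long prec_gamma)
{
  long n = prec_x > prec_gamma ? prec_x : prec_gamma;

  return exp_x <= -2 * n;
}
```

For a 53-bit input and a 53-bit result, the shortcut applies from EXP(x) = -106 downwards; raising the output precision to 64 bits moves the threshold to -128.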
diff --git a/src/get_d64.c b/src/get_d64.c
index 35d3e3cfa..9da7650b2 100644
--- a/src/get_d64.c
+++ b/src/get_d64.c
@@ -1,5 +1,5 @@
 /* mpfr_get_decimal64 -- convert a multiple precision floating-point number
-                         to a IEEE 754r decimal64 float
+                         to an IEEE 754-2008 decimal64 float
 
 See https://gcc.gnu.org/ml/gcc/2006-06/msg00691.html,
 https://gcc.gnu.org/onlinedocs/gcc/Decimal-Float.html,
diff --git a/src/jn.c b/src/jn.c
index 21940aacd..b0aff5ef8 100644
--- a/src/jn.c
+++ b/src/jn.c
@@ -311,6 +311,11 @@ mpfr_jn (mpfr_ptr res, long n, mpfr_srcptr z, mpfr_rnd_t r)
           MPFR_ASSERTN (! exception);
           exception = 1;
         }
+      /* The expected number of lost bits is k0. If err is larger than k0,
+         there is most probably a cancellation in the series, thus we add
+         err - k0 bits to prec. */
+      if (err > k0)
+        prec = MPFR_ADD_PREC (prec, err - k0);
       MPFR_ZIV_NEXT (loop, prec);
     }
   MPFR_ZIV_FREE (loop);
diff --git a/src/log1p.c b/src/log1p.c
index cbeae17c6..8612ca3cb 100644
--- a/src/log1p.c
+++ b/src/log1p.c
@@ -23,8 +23,13 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 #define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
- /* The computation of log1p is done by
-    log1p(x)=log(1+x)                      */
+/* The computation of log1p is done by
+   log1p(x) = log(1+x)
+   except when x is very small, in which case log1p(x) = x + tiny error.
+   TODO: When x is small (but x + tiny error cannot be used), the above
+   formula is slow due to the absorption in 1+x and cancellation in the
+   log. The Taylor expansion may be faster.
+*/
 
 int
 mpfr_log1p (mpfr_ptr y, mpfr_srcptr x, mpfr_rnd_t rnd_mode)
diff --git a/src/mpfr-impl.h b/src/mpfr-impl.h
index affb806b4..c241ae34e 100644
--- a/src/mpfr-impl.h
+++ b/src/mpfr-impl.h
@@ -730,7 +730,8 @@ static double double_zero = 0.0;
    optimizing anything. */
 #ifdef WANT_LONGDOUBLE_VOLATILE
 # ifdef volatile
-__MPFR_DECLSPEC long double __gmpfr_longdouble_volatile (long double) MPFR_CONST_ATTR;
+__MPFR_DECLSPEC long double
+  __gmpfr_longdouble_volatile (long double) MPFR_CONST_ATTR;
 #  define LONGDOUBLE_VOLATILE(x)  (__gmpfr_longdouble_volatile (x))
 #  define WANT_GMPFR_LONGDOUBLE_VOLATILE 1
 # else
@@ -837,8 +838,8 @@ union ieee_double_decimal64 { double d; _Decimal64 d64; };
 #define MPFR_UEXP(X) (MPFR_ASSERTD ((X) >= 0), (mpfr_uexp_t) (X))
 
 #if _MPFR_EXP_FORMAT <= 3
-typedef long int mpfr_eexp_t;
-typedef unsigned long int mpfr_ueexp_t;
+typedef long mpfr_eexp_t;
+typedef unsigned long mpfr_ueexp_t;
 # define mpfr_get_exp_t(x,r) mpfr_get_si((x),(r))
 # define mpfr_set_exp_t(x,e,r) mpfr_set_si((x),(e),(r))
 # define MPFR_EXP_FSPEC "l"
@@ -1175,8 +1176,9 @@ typedef union { mp_size_t s; mp_limb_t l; } mpfr_size_limb_t;
    Since one does not know what is behind the associated typedef name,
    one cannot provide an explicit initialization for such a type. Two
    possible solutions:
-     1. Use a union whose first member is a char and initialize the
-        union with: { 0 }
+     1. Encapsulate the type in a structure or a union and use the
+        universal zero initializer: { 0 }
+        But: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80454
      2. Use designated initializers when supported. But this needs a
         configure test.
 */
@@ -1215,7 +1217,7 @@ asm (".section predict_data, \"aw\"; .previous\n"
      ".section predict_file, \"a\"; .previous");
 # if defined __x86_64__
 #  define MPFR_DEBUGPRED__(e,E)                                         \
-  ({ long int _e = !!(e);                                               \
+  ({ long _e = !!(e);                                                   \
     asm volatile (".pushsection predict_data\n"                         \
                   "..predictcnt%=: .quad 0; .quad 0\n"                  \
                   ".section predict_line; .quad %c1\n"                  \
@@ -1226,7 +1228,7 @@ asm (".section predict_data, \"aw\"; .previous\n"
   })
 # elif defined __i386__
 #  define MPFR_DEBUGPRED__(e,E)                                         \
-  ({ long int _e = !!(e);                                               \
+  ({ long _e = !!(e);                                                   \
     asm volatile (".pushsection predict_data\n"                         \
                   "..predictcnt%=: .long 0; .long 0\n"                  \
                   ".section predict_line; .long %c1\n"                  \
@@ -2097,19 +2099,18 @@ MPFR_COLD_FUNCTION_ATTR __MPFR_DECLSPEC int
 MPFR_COLD_FUNCTION_ATTR __MPFR_DECLSPEC int
   mpfr_overflow (mpfr_ptr, mpfr_rnd_t, int);
 
-__MPFR_DECLSPEC int mpfr_add1 (mpfr_ptr, mpfr_srcptr,
-                               mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub1 (mpfr_ptr, mpfr_srcptr,
-                               mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_add1sp (mpfr_ptr, mpfr_srcptr,
-                                 mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub1sp (mpfr_ptr, mpfr_srcptr,
-                                 mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_add1 (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub1 (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_add1sp (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub1sp (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                 mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_can_round_raw (const mp_limb_t *,
              mp_size_t, int, mpfr_exp_t, mpfr_rnd_t, mpfr_rnd_t, mpfr_prec_t);
 
-__MPFR_DECLSPEC int mpfr_cmp2 (mpfr_srcptr, mpfr_srcptr,
-                               mpfr_prec_t *);
+__MPFR_DECLSPEC int mpfr_set_1_2 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t, int);
+
+__MPFR_DECLSPEC int mpfr_cmp2 (mpfr_srcptr, mpfr_srcptr, mpfr_prec_t *);
 
 __MPFR_DECLSPEC long          __gmpfr_ceil_log2     (double);
 __MPFR_DECLSPEC long          __gmpfr_floor_log2    (double);
@@ -2120,13 +2121,13 @@ __MPFR_DECLSPEC int       __gmpfr_int_ceil_log2 (unsigned long);
 
 __MPFR_DECLSPEC mpfr_exp_t mpfr_ceil_mul (mpfr_exp_t, int, int);
 
-__MPFR_DECLSPEC int mpfr_exp_2 (mpfr_ptr, mpfr_srcptr,mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_exp_3 (mpfr_ptr, mpfr_srcptr,mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_exp_2 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_exp_3 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_powerof2_raw (mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_powerof2_raw2 (const mp_limb_t *, mp_size_t);
 
-__MPFR_DECLSPEC int mpfr_pow_general (mpfr_ptr, mpfr_srcptr,
-                           mpfr_srcptr, mpfr_rnd_t, int, mpfr_save_expo_t *);
+__MPFR_DECLSPEC int mpfr_pow_general (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                      mpfr_rnd_t, int, mpfr_save_expo_t *);
 
 __MPFR_DECLSPEC void mpfr_setmax (mpfr_ptr, mpfr_exp_t);
 __MPFR_DECLSPEC void mpfr_setmin (mpfr_ptr, mpfr_exp_t);
@@ -2138,14 +2139,14 @@ __MPFR_DECLSPEC long mpfr_mpn_exp (mp_limb_t *, mpfr_exp_t *, int,
 __MPFR_DECLSPEC void mpfr_fprint_binary (FILE *, mpfr_srcptr);
 #endif
 __MPFR_DECLSPEC void mpfr_print_binary (mpfr_srcptr);
-__MPFR_DECLSPEC void mpfr_print_mant_binary (const char*,
-                                          const mp_limb_t*, mpfr_prec_t);
+__MPFR_DECLSPEC void mpfr_print_mant_binary (const char*, const mp_limb_t*,
+                                             mpfr_prec_t);
 __MPFR_DECLSPEC void mpfr_set_str_binary (mpfr_ptr, const char*);
 
 __MPFR_DECLSPEC int mpfr_round_raw (mp_limb_t *,
        const mp_limb_t *, mpfr_prec_t, int, mpfr_prec_t, mpfr_rnd_t, int *);
-__MPFR_DECLSPEC int mpfr_round_raw_2 (const mp_limb_t *,
-             mpfr_prec_t, int, mpfr_prec_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_round_raw_2 (const mp_limb_t *, mpfr_prec_t, int,
+                                      mpfr_prec_t, mpfr_rnd_t);
 /* No longer defined (see round_prec.c).
    Uncomment if it needs to be defined again.
 __MPFR_DECLSPEC int mpfr_round_raw_3 (const mp_limb_t *,
@@ -2174,23 +2175,21 @@ __MPFR_DECLSPEC void mpfr_init_cache (mpfr_cache_t,
                                       int(*)(mpfr_ptr,mpfr_rnd_t));
 #endif
 __MPFR_DECLSPEC void mpfr_clear_cache (mpfr_cache_t);
-__MPFR_DECLSPEC int  mpfr_cache (mpfr_ptr, mpfr_cache_t,
-                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int  mpfr_cache (mpfr_ptr, mpfr_cache_t, mpfr_rnd_t);
 
-__MPFR_DECLSPEC void mpfr_mulhigh_n (mpfr_limb_ptr,
-                        mpfr_limb_srcptr, mpfr_limb_srcptr, mp_size_t);
-__MPFR_DECLSPEC void mpfr_mullow_n  (mpfr_limb_ptr,
-                        mpfr_limb_srcptr, mpfr_limb_srcptr, mp_size_t);
-__MPFR_DECLSPEC void mpfr_sqrhigh_n (mpfr_limb_ptr,
-                        mpfr_limb_srcptr, mp_size_t);
-__MPFR_DECLSPEC mp_limb_t mpfr_divhigh_n (mpfr_limb_ptr,
-                        mpfr_limb_ptr, mpfr_limb_ptr, mp_size_t);
+__MPFR_DECLSPEC void mpfr_mulhigh_n (mpfr_limb_ptr, mpfr_limb_srcptr,
+                                     mpfr_limb_srcptr, mp_size_t);
+__MPFR_DECLSPEC void mpfr_mullow_n  (mpfr_limb_ptr, mpfr_limb_srcptr,
+                                     mpfr_limb_srcptr, mp_size_t);
+__MPFR_DECLSPEC void mpfr_sqrhigh_n (mpfr_limb_ptr, mpfr_limb_srcptr,
+                                     mp_size_t);
+__MPFR_DECLSPEC mp_limb_t mpfr_divhigh_n (mpfr_limb_ptr, mpfr_limb_ptr,
+                                          mpfr_limb_ptr, mp_size_t);
 
-__MPFR_DECLSPEC int mpfr_round_p (mp_limb_t *, mp_size_t,
-                                  mpfr_exp_t, mpfr_prec_t);
+__MPFR_DECLSPEC int mpfr_round_p (mp_limb_t *, mp_size_t, mpfr_exp_t,
+                                  mpfr_prec_t);
 
-__MPFR_DECLSPEC int mpfr_round_near_x (mpfr_ptr, mpfr_srcptr,
-                                       mpfr_uexp_t, int,
+__MPFR_DECLSPEC int mpfr_round_near_x (mpfr_ptr, mpfr_srcptr, mpfr_uexp_t, int,
                                        mpfr_rnd_t);
 __MPFR_DECLSPEC MPFR_COLD_FUNCTION_ATTR MPFR_NORETURN void
   mpfr_abort_prec_max (void);
@@ -2201,24 +2200,26 @@ __MPFR_DECLSPEC void mpfr_rand_raw (mpfr_limb_ptr, gmp_randstate_t,
 __MPFR_DECLSPEC mpz_srcptr mpfr_bernoulli_cache (unsigned long);
 __MPFR_DECLSPEC void mpfr_bernoulli_freecache (void);
 
-__MPFR_DECLSPEC int mpfr_sincos_fast (mpfr_t, mpfr_t,
-                                      mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sincos_fast (mpfr_t, mpfr_t, mpfr_srcptr, mpfr_rnd_t);
 
 __MPFR_DECLSPEC double mpfr_scale2 (double, int);
 
-__MPFR_DECLSPEC void mpfr_div_ui2 (mpfr_ptr, mpfr_srcptr,
-                                   unsigned long int, unsigned long int,
-                                   mpfr_rnd_t);
+__MPFR_DECLSPEC void mpfr_div_ui2 (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                   unsigned long, mpfr_rnd_t);
 
-__MPFR_DECLSPEC void mpfr_gamma_one_and_two_third (mpfr_ptr, mpfr_ptr, mpfr_prec_t);
+__MPFR_DECLSPEC void mpfr_gamma_one_and_two_third (mpfr_ptr, mpfr_ptr,
+                                                   mpfr_prec_t);
 
 __MPFR_DECLSPEC void mpfr_mpz_init (mpz_ptr);
 __MPFR_DECLSPEC void mpfr_mpz_init2 (mpz_t, mp_bitcnt_t);
 __MPFR_DECLSPEC void mpfr_mpz_clear (mpz_ptr);
 
+__MPFR_DECLSPEC int mpfr_odd_p (mpfr_srcptr);
+
 #ifdef _MPFR_H_HAVE_VA_LIST
 /* Declared only if <stdarg.h> has been included. */
-__MPFR_DECLSPEC int mpfr_vasnprintf_aux (char**, char*, size_t, const char*, va_list);
+__MPFR_DECLSPEC int mpfr_vasnprintf_aux (char**, char*, size_t, const char*,
+                                         va_list);
 #endif
 
 #if defined (__cplusplus)
@@ -2364,6 +2365,6 @@ __MPFR_DECLSPEC mpfr_exp_t mpfr_ubf_diff_exp (mpfr_srcptr, mpfr_srcptr);
    ((mpfr_ubf_ptr) (x))->_mpfr_zexp)
 
 #define MPFR_UBF_CLEAR_EXP(x) \
-  ((void) (MPFR_IS_UBF (u) && (mpz_clear (MPFR_ZEXP (x)), 0)))
+  ((void) (MPFR_IS_UBF (x) && (mpz_clear (MPFR_ZEXP (x)), 0)))
 
 #endif /* __MPFR_IMPL_H__ */
diff --git a/src/mpfr.h b/src/mpfr.h
index ed22b0094..c54d24aca 100644
--- a/src/mpfr.h
+++ b/src/mpfr.h
@@ -351,8 +351,7 @@ __MPFR_DECLSPEC mpfr_exp_t mpfr_get_emax_max (void);
 
 __MPFR_DECLSPEC void mpfr_set_default_rounding_mode (mpfr_rnd_t);
 __MPFR_DECLSPEC mpfr_rnd_t mpfr_get_default_rounding_mode (void);
-__MPFR_DECLSPEC const char *
-   mpfr_print_rnd_mode (mpfr_rnd_t);
+__MPFR_DECLSPEC const char * mpfr_print_rnd_mode (mpfr_rnd_t);
 
 __MPFR_DECLSPEC void mpfr_clear_flags (void);
 __MPFR_DECLSPEC void mpfr_clear_underflow (void);
@@ -383,8 +382,7 @@ __MPFR_DECLSPEC mpfr_flags_t mpfr_flags_save (void);
 __MPFR_DECLSPEC void mpfr_flags_restore (mpfr_flags_t,
                                          mpfr_flags_t);
 
-__MPFR_DECLSPEC int
-  mpfr_check_range (mpfr_ptr, int, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_check_range (mpfr_ptr, int, mpfr_rnd_t);
 
 __MPFR_DECLSPEC void mpfr_init2 (mpfr_ptr, mpfr_prec_t);
 __MPFR_DECLSPEC void mpfr_init (mpfr_ptr);
@@ -397,11 +395,9 @@ __MPFR_DECLSPEC void
 __MPFR_DECLSPEC void
   mpfr_clears (mpfr_ptr, ...) __MPFR_SENTINEL_ATTR;
 
-__MPFR_DECLSPEC int
-  mpfr_prec_round (mpfr_ptr, mpfr_prec_t, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_can_round (mpfr_srcptr, mpfr_exp_t, mpfr_rnd_t, mpfr_rnd_t,
-                  mpfr_prec_t);
+__MPFR_DECLSPEC int mpfr_prec_round (mpfr_ptr, mpfr_prec_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_can_round (mpfr_srcptr, mpfr_exp_t, mpfr_rnd_t,
+                                    mpfr_rnd_t, mpfr_prec_t);
 __MPFR_DECLSPEC mpfr_prec_t mpfr_min_prec (mpfr_srcptr);
 
 __MPFR_DECLSPEC mpfr_exp_t mpfr_get_exp (mpfr_srcptr);
@@ -417,107 +413,77 @@ __MPFR_DECLSPEC int mpfr_set_flt (mpfr_ptr, float, mpfr_rnd_t);
 #ifdef MPFR_WANT_DECIMAL_FLOATS
 /* _Decimal64 is not defined in C++,
    cf https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51364 */
-__MPFR_DECLSPEC int mpfr_set_decimal64 (mpfr_ptr, _Decimal64,
-                                        mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_decimal64 (mpfr_ptr, _Decimal64, mpfr_rnd_t);
 #endif
-__MPFR_DECLSPEC int
-  mpfr_set_ld (mpfr_ptr, long double, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_ld (mpfr_ptr, long double, mpfr_rnd_t);
 #ifdef MPFR_WANT_FLOAT128
-__MPFR_DECLSPEC int
-  mpfr_set_float128 (mpfr_ptr, __float128, mpfr_rnd_t);
-__MPFR_DECLSPEC __float128
-  mpfr_get_float128 (mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_float128 (mpfr_ptr, __float128, mpfr_rnd_t);
+__MPFR_DECLSPEC __float128 mpfr_get_float128 (mpfr_srcptr, mpfr_rnd_t);
 #endif
-__MPFR_DECLSPEC int
-  mpfr_set_z (mpfr_ptr, mpz_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_z_2exp (mpfr_ptr, mpz_srcptr, mpfr_exp_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_z (mpfr_ptr, mpz_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_z_2exp (mpfr_ptr, mpz_srcptr, mpfr_exp_t,
+                                     mpfr_rnd_t);
 __MPFR_DECLSPEC void mpfr_set_nan (mpfr_ptr);
 __MPFR_DECLSPEC void mpfr_set_inf (mpfr_ptr, int);
 __MPFR_DECLSPEC void mpfr_set_zero (mpfr_ptr, int);
 
 #ifndef MPFR_USE_MINI_GMP
   /* mini-gmp does not provide mpf_t, we disable the following functions */
-__MPFR_DECLSPEC int
-  mpfr_set_f (mpfr_ptr, mpf_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_f (mpfr_ptr, mpf_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cmp_f (mpfr_srcptr, mpf_srcptr);
-__MPFR_DECLSPEC int
-  mpfr_get_f (mpf_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_get_f (mpf_ptr, mpfr_srcptr, mpfr_rnd_t);
 #endif
 __MPFR_DECLSPEC int mpfr_set_si (mpfr_ptr, long, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_si_2exp (mpfr_ptr, long, mpfr_exp_t, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_ui_2exp (mpfr_ptr, unsigned long, mpfr_exp_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_si_2exp (mpfr_ptr, long, mpfr_exp_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_ui_2exp (mpfr_ptr, unsigned long, mpfr_exp_t,
+                                      mpfr_rnd_t);
 #ifndef MPFR_USE_MINI_GMP
   /* mini-gmp does not provide mpq_t, we disable the following functions */
-__MPFR_DECLSPEC int
-  mpfr_set_q (mpfr_ptr, mpq_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_q (mpfr_ptr, mpfr_srcptr,
-                                mpq_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_q (mpfr_ptr, mpfr_srcptr,
-                                mpq_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_add_q (mpfr_ptr, mpfr_srcptr,
-                                mpq_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub_q (mpfr_ptr, mpfr_srcptr,
-                                mpq_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_q (mpfr_ptr, mpq_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_q (mpfr_ptr, mpfr_srcptr, mpq_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_q (mpfr_ptr, mpfr_srcptr, mpq_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_add_q (mpfr_ptr, mpfr_srcptr, mpq_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub_q (mpfr_ptr, mpfr_srcptr, mpq_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cmp_q (mpfr_srcptr, mpq_srcptr);
 #endif
-__MPFR_DECLSPEC int
-  mpfr_set_str (mpfr_ptr, const char *, int, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_init_set_str (mpfr_ptr, const char *, int,
-                     mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set4 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t, int);
-__MPFR_DECLSPEC int
-  mpfr_abs (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_str (mpfr_ptr, const char *, int, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_init_set_str (mpfr_ptr, const char *, int,
+                                       mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set4 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t, int);
+__MPFR_DECLSPEC int mpfr_abs (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_neg (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_signbit (mpfr_srcptr);
-__MPFR_DECLSPEC int
-  mpfr_setsign (mpfr_ptr, mpfr_srcptr, int, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_copysign (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_setsign (mpfr_ptr, mpfr_srcptr, int, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_copysign (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                   mpfr_rnd_t);
 
 __MPFR_DECLSPEC mpfr_exp_t mpfr_get_z_2exp (mpz_ptr, mpfr_srcptr);
 __MPFR_DECLSPEC float mpfr_get_flt (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC double mpfr_get_d (mpfr_srcptr, mpfr_rnd_t);
 #ifdef MPFR_WANT_DECIMAL_FLOATS
-__MPFR_DECLSPEC _Decimal64 mpfr_get_decimal64 (mpfr_srcptr,
-                                               mpfr_rnd_t);
+__MPFR_DECLSPEC _Decimal64 mpfr_get_decimal64 (mpfr_srcptr, mpfr_rnd_t);
 #endif
-__MPFR_DECLSPEC long double mpfr_get_ld (mpfr_srcptr,
-                                         mpfr_rnd_t);
+__MPFR_DECLSPEC long double mpfr_get_ld (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC double mpfr_get_d1 (mpfr_srcptr);
-__MPFR_DECLSPEC double mpfr_get_d_2exp (long*, mpfr_srcptr,
-                                        mpfr_rnd_t);
-__MPFR_DECLSPEC long double mpfr_get_ld_2exp (long*, mpfr_srcptr,
-                                              mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_frexp (mpfr_exp_t*, mpfr_ptr,
-                                mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC double mpfr_get_d_2exp (long*, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC long double mpfr_get_ld_2exp (long*, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_frexp (mpfr_exp_t*, mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC long mpfr_get_si (mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC unsigned long mpfr_get_ui (mpfr_srcptr,
-                                           mpfr_rnd_t);
-__MPFR_DECLSPEC char*mpfr_get_str (char*, mpfr_exp_t*, int, size_t,
-                                   mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_get_z (mpz_ptr z, mpfr_srcptr f,
-                                mpfr_rnd_t);
+__MPFR_DECLSPEC unsigned long mpfr_get_ui (mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC char * mpfr_get_str (char*, mpfr_exp_t*, int, size_t,
+                                     mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_get_z (mpz_ptr z, mpfr_srcptr f, mpfr_rnd_t);
 
 __MPFR_DECLSPEC void mpfr_free_str (char *);
 
-__MPFR_DECLSPEC int mpfr_urandom (mpfr_ptr, gmp_randstate_t,
-                                  mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_urandom (mpfr_ptr, gmp_randstate_t, mpfr_rnd_t);
 MPFR_DEPRECATED
 __MPFR_DECLSPEC int mpfr_grandom (mpfr_ptr, mpfr_ptr, gmp_randstate_t,
                                   mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_nrandom (mpfr_ptr, gmp_randstate_t,
-                                  mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_erandom (mpfr_ptr, gmp_randstate_t,
-                                  mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_nrandom (mpfr_ptr, gmp_randstate_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_erandom (mpfr_ptr, gmp_randstate_t, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_urandomb (mpfr_ptr, gmp_randstate_t);
 
 __MPFR_DECLSPEC void mpfr_nextabove (mpfr_ptr);
@@ -526,81 +492,56 @@ __MPFR_DECLSPEC void mpfr_nexttoward (mpfr_ptr, mpfr_srcptr);
 
 #ifndef MPFR_USE_MINI_GMP
 __MPFR_DECLSPEC int mpfr_printf (const char*, ...);
-__MPFR_DECLSPEC int mpfr_asprintf (char**, const char*,
-                                   ...);
-__MPFR_DECLSPEC int mpfr_sprintf (char*, const char*,
-                                  ...);
-__MPFR_DECLSPEC int mpfr_snprintf (char*, size_t,
-                                   const char*, ...);
+__MPFR_DECLSPEC int mpfr_asprintf (char**, const char*, ...);
+__MPFR_DECLSPEC int mpfr_sprintf (char*, const char*, ...);
+__MPFR_DECLSPEC int mpfr_snprintf (char*, size_t, const char*, ...);
 #endif
 
-__MPFR_DECLSPEC int mpfr_pow (mpfr_ptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_pow_si (mpfr_ptr, mpfr_srcptr,
-                                 long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_pow_ui (mpfr_ptr, mpfr_srcptr,
-                                 unsigned long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_ui_pow_ui (mpfr_ptr, unsigned long int,
-                                    unsigned long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_ui_pow (mpfr_ptr, unsigned long int,
-                                 mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_pow_z (mpfr_ptr, mpfr_srcptr,
-                                mpz_srcptr, mpfr_rnd_t);
-
-__MPFR_DECLSPEC int mpfr_sqrt (mpfr_ptr, mpfr_srcptr,
-                               mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sqrt_ui (mpfr_ptr, unsigned long,
-                                  mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_rec_sqrt (mpfr_ptr, mpfr_srcptr,
-                                   mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_pow (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_pow_si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_pow_ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_ui_pow_ui (mpfr_ptr, unsigned long, unsigned long,
+                                    mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_ui_pow (mpfr_ptr, unsigned long, mpfr_srcptr,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_pow_z (mpfr_ptr, mpfr_srcptr, mpz_srcptr, mpfr_rnd_t);
+
+__MPFR_DECLSPEC int mpfr_sqrt (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sqrt_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rec_sqrt (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int mpfr_add (mpfr_ptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub (mpfr_ptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul (mpfr_ptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div (mpfr_ptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-
-__MPFR_DECLSPEC int mpfr_add_ui (mpfr_ptr, mpfr_srcptr,
-                                 unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub_ui (mpfr_ptr, mpfr_srcptr,
-                                 unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_ui_sub (mpfr_ptr, unsigned long,
-                                 mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_ui (mpfr_ptr, mpfr_srcptr,
-                                 unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_ui (mpfr_ptr, mpfr_srcptr,
-                                 unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_ui_div (mpfr_ptr, unsigned long,
-                                 mpfr_srcptr, mpfr_rnd_t);
-
-__MPFR_DECLSPEC int mpfr_add_si (mpfr_ptr, mpfr_srcptr,
-                                 long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub_si (mpfr_ptr, mpfr_srcptr,
-                                 long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_si_sub (mpfr_ptr, long int,
-                                 mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_si (mpfr_ptr, mpfr_srcptr,
-                                 long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_si (mpfr_ptr, mpfr_srcptr,
-                                 long int, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_si_div (mpfr_ptr, long int,
-                                 mpfr_srcptr, mpfr_rnd_t);
-
-__MPFR_DECLSPEC int mpfr_add_d (mpfr_ptr, mpfr_srcptr,
-                                double, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub_d (mpfr_ptr, mpfr_srcptr,
-                                double, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_d_sub (mpfr_ptr, double,
-                                mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_d (mpfr_ptr, mpfr_srcptr,
-                                double, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_d (mpfr_ptr, mpfr_srcptr,
-                                double, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_d_div (mpfr_ptr, double,
-                                mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_add (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+
+__MPFR_DECLSPEC int mpfr_add_ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub_ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_ui_sub (mpfr_ptr, unsigned long, mpfr_srcptr,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_ui_div (mpfr_ptr, unsigned long, mpfr_srcptr,
+                                 mpfr_rnd_t);
+
+__MPFR_DECLSPEC int mpfr_add_si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub_si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_si_sub (mpfr_ptr, long, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_si_div (mpfr_ptr, long, mpfr_srcptr, mpfr_rnd_t);
+
+__MPFR_DECLSPEC int mpfr_add_d (mpfr_ptr, mpfr_srcptr, double, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub_d (mpfr_ptr, mpfr_srcptr, double, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_d_sub (mpfr_ptr, double, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_d (mpfr_ptr, mpfr_srcptr, double, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_d (mpfr_ptr, mpfr_srcptr, double, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_d_div (mpfr_ptr, double, mpfr_srcptr, mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_sqr (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 
@@ -609,24 +550,18 @@ __MPFR_DECLSPEC int mpfr_const_log2 (mpfr_ptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_const_euler (mpfr_ptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_const_catalan (mpfr_ptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int mpfr_agm (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_agm (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_log (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_log2 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_log10 (mpfr_ptr, mpfr_srcptr,
-                                mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_log1p (mpfr_ptr, mpfr_srcptr,
-                                mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_log_ui (mpfr_ptr, unsigned long,
-                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_log10 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_log1p (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_log_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_exp (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_exp2 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_exp10 (mpfr_ptr, mpfr_srcptr,
-                                mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_expm1 (mpfr_ptr, mpfr_srcptr,
-                                mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_exp10 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_expm1 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_eint (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_li2 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 
@@ -637,28 +572,23 @@ __MPFR_DECLSPEC int mpfr_cmp_ld (mpfr_srcptr, long double);
 __MPFR_DECLSPEC int mpfr_cmpabs (mpfr_srcptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_cmp_ui (mpfr_srcptr, unsigned long);
 __MPFR_DECLSPEC int mpfr_cmp_si (mpfr_srcptr, long);
-__MPFR_DECLSPEC int mpfr_cmp_ui_2exp (mpfr_srcptr, unsigned long,
-                                      mpfr_exp_t);
-__MPFR_DECLSPEC int mpfr_cmp_si_2exp (mpfr_srcptr, long,
-                                      mpfr_exp_t);
-__MPFR_DECLSPEC void mpfr_reldiff (mpfr_ptr, mpfr_srcptr,
-                                   mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_eq (mpfr_srcptr, mpfr_srcptr,
-                             unsigned long);
+__MPFR_DECLSPEC int mpfr_cmp_ui_2exp (mpfr_srcptr, unsigned long, mpfr_exp_t);
+__MPFR_DECLSPEC int mpfr_cmp_si_2exp (mpfr_srcptr, long, mpfr_exp_t);
+__MPFR_DECLSPEC void mpfr_reldiff (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                   mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_eq (mpfr_srcptr, mpfr_srcptr, unsigned long);
 __MPFR_DECLSPEC int mpfr_sgn (mpfr_srcptr);
 
-__MPFR_DECLSPEC int mpfr_mul_2exp (mpfr_ptr, mpfr_srcptr,
-                                   unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_2exp (mpfr_ptr, mpfr_srcptr,
-                                   unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_2ui (mpfr_ptr, mpfr_srcptr,
-                                  unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_2ui (mpfr_ptr, mpfr_srcptr,
-                                  unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_mul_2si (mpfr_ptr, mpfr_srcptr,
-                                  long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_2si (mpfr_ptr, mpfr_srcptr,
-                                  long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_2exp (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                   mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_2exp (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                   mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_2ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                  mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_2ui (mpfr_ptr, mpfr_srcptr, unsigned long,
+                                  mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_2si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_2si (mpfr_ptr, mpfr_srcptr, long, mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_rint (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_roundeven (mpfr_ptr, mpfr_srcptr);
@@ -666,27 +596,20 @@ __MPFR_DECLSPEC int mpfr_round (mpfr_ptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_trunc (mpfr_ptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_ceil (mpfr_ptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_floor (mpfr_ptr, mpfr_srcptr);
-__MPFR_DECLSPEC int mpfr_rint_roundeven (mpfr_ptr, mpfr_srcptr,
-                                         mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_rint_round (mpfr_ptr, mpfr_srcptr,
-                                     mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_rint_trunc (mpfr_ptr, mpfr_srcptr,
-                                     mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_rint_ceil (mpfr_ptr, mpfr_srcptr,
-                                    mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_rint_floor (mpfr_ptr, mpfr_srcptr,
-                                     mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rint_roundeven (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rint_round (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rint_trunc (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rint_ceil (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_rint_floor (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_frac (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_modf (mpfr_ptr, mpfr_ptr, mpfr_srcptr,
-                               mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_remquo (mpfr_ptr, long*, mpfr_srcptr,
-                                 mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_remainder (mpfr_ptr, mpfr_srcptr,
-                                    mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fmod (mpfr_ptr, mpfr_srcptr,
-                               mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fmodquo (mpfr_ptr, long*, mpfr_srcptr,
-                                  mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_modf (mpfr_ptr, mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_remquo (mpfr_ptr, long*, mpfr_srcptr, mpfr_srcptr,
+                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_remainder (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                    mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fmod (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fmodquo (mpfr_ptr, long*, mpfr_srcptr, mpfr_srcptr,
+                                  mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_fits_ulong_p (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_fits_slong_p (mpfr_srcptr, mpfr_rnd_t);
@@ -697,8 +620,7 @@ __MPFR_DECLSPEC int mpfr_fits_sshort_p (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_fits_uintmax_p (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_fits_intmax_p (mpfr_srcptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC void mpfr_extract (mpz_ptr, mpfr_srcptr,
-                                   unsigned int);
+__MPFR_DECLSPEC void mpfr_extract (mpz_ptr, mpfr_srcptr, unsigned int);
 __MPFR_DECLSPEC void mpfr_swap (mpfr_ptr, mpfr_ptr);
 __MPFR_DECLSPEC void mpfr_dump (mpfr_srcptr);
 
@@ -710,8 +632,7 @@ __MPFR_DECLSPEC int mpfr_zero_p (mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_regular_p (mpfr_srcptr);
 
 __MPFR_DECLSPEC int mpfr_greater_p (mpfr_srcptr, mpfr_srcptr);
-__MPFR_DECLSPEC int mpfr_greaterequal_p (mpfr_srcptr,
-                                         mpfr_srcptr);
+__MPFR_DECLSPEC int mpfr_greaterequal_p (mpfr_srcptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_less_p (mpfr_srcptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_lessequal_p (mpfr_srcptr, mpfr_srcptr);
 __MPFR_DECLSPEC int mpfr_lessgreater_p (mpfr_srcptr, mpfr_srcptr);
@@ -724,8 +645,8 @@ __MPFR_DECLSPEC int mpfr_asinh (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cosh (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_sinh (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_tanh (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sinh_cosh (mpfr_ptr, mpfr_ptr,
-                                    mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sinh_cosh (mpfr_ptr, mpfr_ptr, mpfr_srcptr,
+                                    mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_sech (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_csch (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
@@ -735,96 +656,80 @@ __MPFR_DECLSPEC int mpfr_acos (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_asin (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_atan (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_sin (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sin_cos (mpfr_ptr, mpfr_ptr,
-                                  mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sin_cos (mpfr_ptr, mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cos (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_tan (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_atan2 (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                                mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_atan2 (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_sec (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_csc (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cot (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int mpfr_hypot (mpfr_ptr, mpfr_srcptr,
-                                mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_hypot (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_erf (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_erfc (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cbrt (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_root (mpfr_ptr, mpfr_srcptr, unsigned long,
                                mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_gamma (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_gamma_inc (mpfr_ptr, mpfr_srcptr,
-                                    mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_gamma_inc (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
+                                    mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_beta (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_lngamma (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_lgamma (mpfr_ptr, int *, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_digamma (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_zeta (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_zeta_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fac_ui (mpfr_ptr, unsigned long int,
-                                 mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fac_ui (mpfr_ptr, unsigned long, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_j0 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_j1 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_jn (mpfr_ptr, long, mpfr_srcptr,
-                             mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_jn (mpfr_ptr, long, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_y0 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_y1 (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_yn (mpfr_ptr, long, mpfr_srcptr,
-                             mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_yn (mpfr_ptr, long, mpfr_srcptr, mpfr_rnd_t);
 
 __MPFR_DECLSPEC int mpfr_ai (mpfr_ptr, mpfr_srcptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int mpfr_min (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_max (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_dim (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_min (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_max (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_dim (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int mpfr_mul_z (mpfr_ptr, mpfr_srcptr,
-                                mpz_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_div_z (mpfr_ptr, mpfr_srcptr,
-                                mpz_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_add_z (mpfr_ptr, mpfr_srcptr,
-                                mpz_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sub_z (mpfr_ptr, mpfr_srcptr,
-                                mpz_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_z_sub (mpfr_ptr, mpz_srcptr,
-                                mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_mul_z (mpfr_ptr, mpfr_srcptr, mpz_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_div_z (mpfr_ptr, mpfr_srcptr, mpz_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_add_z (mpfr_ptr, mpfr_srcptr, mpz_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sub_z (mpfr_ptr, mpfr_srcptr, mpz_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_z_sub (mpfr_ptr, mpz_srcptr, mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_cmp_z (mpfr_srcptr, mpz_srcptr);
 
-__MPFR_DECLSPEC int mpfr_fma (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fms (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                              mpfr_srcptr, mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fmma (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                               mpfr_srcptr, mpfr_srcptr,
-                               mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_fmms (mpfr_ptr, mpfr_srcptr, mpfr_srcptr,
-                               mpfr_srcptr, mpfr_srcptr,
-                               mpfr_rnd_t);
-__MPFR_DECLSPEC int mpfr_sum (mpfr_ptr, mpfr_ptr *const,
-                              unsigned long, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fma (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_srcptr,
+                              mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fms (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_srcptr,
+                              mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fmma (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_srcptr,
+                               mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_fmms (mpfr_ptr, mpfr_srcptr, mpfr_srcptr, mpfr_srcptr,
+                               mpfr_srcptr, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_sum (mpfr_ptr, const mpfr_ptr *, unsigned long,
+                              mpfr_rnd_t);
 
 __MPFR_DECLSPEC void mpfr_free_cache (void);
 __MPFR_DECLSPEC void mpfr_free_cache2 (mpfr_free_cache_t);
 
-__MPFR_DECLSPEC int  mpfr_subnormalize (mpfr_ptr, int,
-                                        mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_subnormalize (mpfr_ptr, int, mpfr_rnd_t);
 
-__MPFR_DECLSPEC int  mpfr_strtofr (mpfr_ptr, const char *,
-                                   char **, int, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_strtofr (mpfr_ptr, const char *, char **, int,
+                                  mpfr_rnd_t);
 
 __MPFR_DECLSPEC void mpfr_round_nearest_away_begin (mpfr_t);
-__MPFR_DECLSPEC int  mpfr_round_nearest_away_end   (mpfr_t, int);
+__MPFR_DECLSPEC int mpfr_round_nearest_away_end (mpfr_t, int);
 
-__MPFR_DECLSPEC size_t mpfr_custom_get_size   (mpfr_prec_t);
-__MPFR_DECLSPEC void   mpfr_custom_init    (void *, mpfr_prec_t);
+__MPFR_DECLSPEC size_t mpfr_custom_get_size (mpfr_prec_t);
+__MPFR_DECLSPEC void mpfr_custom_init (void *, mpfr_prec_t);
 __MPFR_DECLSPEC void * mpfr_custom_get_significand (mpfr_srcptr);
-__MPFR_DECLSPEC mpfr_exp_t mpfr_custom_get_exp  (mpfr_srcptr);
-__MPFR_DECLSPEC void   mpfr_custom_move       (mpfr_ptr, void *);
-__MPFR_DECLSPEC void   mpfr_custom_init_set   (mpfr_ptr, int,
-                                             mpfr_exp_t, mpfr_prec_t, void *);
-__MPFR_DECLSPEC int    mpfr_custom_get_kind   (mpfr_srcptr);
+__MPFR_DECLSPEC mpfr_exp_t mpfr_custom_get_exp (mpfr_srcptr);
+__MPFR_DECLSPEC void mpfr_custom_move (mpfr_ptr, void *);
+__MPFR_DECLSPEC void mpfr_custom_init_set (mpfr_ptr, int, mpfr_exp_t,
+                                           mpfr_prec_t, void *);
+__MPFR_DECLSPEC int    mpfr_custom_get_kind (mpfr_srcptr);
 
 #if defined (__cplusplus)
 }
@@ -1124,11 +1029,9 @@ extern "C" {
 #define mpfr_get_sj __gmpfr_mpfr_get_sj
 #define mpfr_get_uj __gmpfr_mpfr_get_uj
 __MPFR_DECLSPEC int mpfr_set_sj (mpfr_t, intmax_t, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_sj_2exp (mpfr_t, intmax_t, intmax_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_sj_2exp (mpfr_t, intmax_t, intmax_t, mpfr_rnd_t);
 __MPFR_DECLSPEC int mpfr_set_uj (mpfr_t, uintmax_t, mpfr_rnd_t);
-__MPFR_DECLSPEC int
-  mpfr_set_uj_2exp (mpfr_t, uintmax_t, intmax_t, mpfr_rnd_t);
+__MPFR_DECLSPEC int mpfr_set_uj_2exp (mpfr_t, uintmax_t, intmax_t, mpfr_rnd_t);
 __MPFR_DECLSPEC intmax_t mpfr_get_sj (mpfr_srcptr, mpfr_rnd_t);
 __MPFR_DECLSPEC uintmax_t mpfr_get_uj (mpfr_srcptr, mpfr_rnd_t);
 
@@ -1151,19 +1054,17 @@ extern "C" {
 
 #define mpfr_inp_str __gmpfr_inp_str
 #define mpfr_out_str __gmpfr_out_str
-__MPFR_DECLSPEC size_t mpfr_inp_str (mpfr_ptr, FILE*, int,
+__MPFR_DECLSPEC size_t mpfr_inp_str (mpfr_ptr, FILE*, int, mpfr_rnd_t);
+__MPFR_DECLSPEC size_t mpfr_out_str (FILE*, int, size_t, mpfr_srcptr,
                                      mpfr_rnd_t);
-__MPFR_DECLSPEC size_t mpfr_out_str (FILE*, int, size_t,
-                                     mpfr_srcptr, mpfr_rnd_t);
 #ifndef MPFR_USE_MINI_GMP
 #define mpfr_fprintf __gmpfr_fprintf
-__MPFR_DECLSPEC int mpfr_fprintf (FILE*, const char*,
-                                  ...);
+__MPFR_DECLSPEC int mpfr_fprintf (FILE*, const char*, ...);
 #endif
 #define mpfr_fpif_export __gmpfr_fpif_export
 #define mpfr_fpif_import __gmpfr_fpif_import
-__MPFR_DECLSPEC int    mpfr_fpif_export (FILE*, mpfr_ptr);
-__MPFR_DECLSPEC int    mpfr_fpif_import (mpfr_ptr, FILE*);
+__MPFR_DECLSPEC int mpfr_fpif_export (FILE*, mpfr_ptr);
+__MPFR_DECLSPEC int mpfr_fpif_import (mpfr_ptr, FILE*);
 
 #if defined (__cplusplus)
 }
@@ -1187,12 +1088,9 @@ extern "C" {
 #define mpfr_vsprintf __gmpfr_vsprintf
 #define mpfr_vsnprintf __gmpfr_vsnprintf
 __MPFR_DECLSPEC int mpfr_vprintf (const char*, va_list);
-__MPFR_DECLSPEC int mpfr_vasprintf (char**, const char*,
-                                    va_list);
-__MPFR_DECLSPEC int mpfr_vsprintf (char*, const char*,
-                                   va_list);
-__MPFR_DECLSPEC int mpfr_vsnprintf (char*, size_t,
-                                    const char*, va_list);
+__MPFR_DECLSPEC int mpfr_vasprintf (char**, const char*, va_list);
+__MPFR_DECLSPEC int mpfr_vsprintf (char*, const char*, va_list);
+__MPFR_DECLSPEC int mpfr_vsnprintf (char*, size_t, const char*, va_list);
 
 #if defined (__cplusplus)
 }
@@ -1213,8 +1111,7 @@ extern "C" {
 #endif
 
 #define mpfr_vfprintf __gmpfr_vfprintf
-__MPFR_DECLSPEC int mpfr_vfprintf (FILE*, const char*,
-                                   va_list);
+__MPFR_DECLSPEC int mpfr_vfprintf (FILE*, const char*, va_list);
 
 #if defined (__cplusplus)
 }
diff --git a/src/mul.c b/src/mul.c
index c2c453141..c2e87d08c 100644
--- a/src/mul.c
+++ b/src/mul.c
@@ -208,6 +208,8 @@ mpfr_mul (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 
 /* Multiply 2 mpfr_t */
 
+#if !defined(MPFR_GENERIC_ABI)
+
 /* Special code for prec(a) < GMP_NUMB_BITS and
    prec(b), prec(c) <= GMP_NUMB_BITS.
    Note: this code was copied in sqr.c, function mpfr_sqr_1 (this saves a few cycles
@@ -219,8 +221,8 @@ mpfr_mul_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
 {
   mp_limb_t a0;
   mpfr_limb_ptr ap = MPFR_MANT(a);
-  mpfr_limb_ptr bp = MPFR_MANT(b);
-  mpfr_limb_ptr cp = MPFR_MANT(c);
+  mp_limb_t b0 = MPFR_MANT(b)[0];
+  mp_limb_t c0 = MPFR_MANT(c)[0];
   mpfr_exp_t ax;
   mpfr_prec_t sh = GMP_NUMB_BITS - p;
   mp_limb_t rb, sb, mask = MPFR_LIMB_MASK(sh);
@@ -228,12 +230,11 @@ mpfr_mul_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
   /* When prec(b), prec(c) <= GMP_NUMB_BITS / 2, we could replace umul_ppmm
      by a limb multiplication as follows, but we assume umul_ppmm is as fast
      as a limb multiplication on modern processors:
-      a0 = (bp[0] >> (GMP_NUMB_BITS / 2))
-        * (cp[0] >> (GMP_NUMB_BITS / 2));
+      a0 = (b0 >> (GMP_NUMB_BITS / 2)) * (c0 >> (GMP_NUMB_BITS / 2));
       sb = 0;
   */
   ax = MPFR_GET_EXP(b) + MPFR_GET_EXP(c);
-  umul_ppmm (a0, sb, bp[0], cp[0]);
+  umul_ppmm (a0, sb, b0, c0);
   if (a0 < MPFR_LIMB_HIGHBIT)
     {
       ax --;
@@ -276,7 +277,7 @@ mpfr_mul_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -311,6 +312,101 @@ mpfr_mul_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
     }
 }
 
+/* Special code for prec(a) = GMP_NUMB_BITS and
+   prec(b), prec(c) <= GMP_NUMB_BITS. */
+static int
+mpfr_mul_1n (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
+{
+  mp_limb_t a0;
+  mpfr_limb_ptr ap = MPFR_MANT(a);
+  mp_limb_t b0 = MPFR_MANT(b)[0];
+  mp_limb_t c0 = MPFR_MANT(c)[0];
+  mpfr_exp_t ax;
+  mp_limb_t rb, sb;
+
+  ax = MPFR_GET_EXP(b) + MPFR_GET_EXP(c);
+  umul_ppmm (a0, sb, b0, c0);
+  if (a0 < MPFR_LIMB_HIGHBIT)
+    {
+      ax --;
+      /* TODO: This is actually an addition with carry (no shifts and no OR
+         needed in asm). Make sure that GCC generates optimized code once
+         it supports carry-in. */
+      a0 = (a0 << 1) | (sb >> (GMP_NUMB_BITS - 1));
+      sb = sb << 1;
+    }
+  rb = sb & MPFR_LIMB_HIGHBIT;
+  sb = sb & ~MPFR_LIMB_HIGHBIT;
+  ap[0] = a0;
+
+  MPFR_SIGN(a) = MPFR_MULT_SIGN (MPFR_SIGN (b), MPFR_SIGN (c));
+
+  /* rounding */
+  if (MPFR_UNLIKELY(ax > __gmpfr_emax))
+    return mpfr_overflow (a, rnd_mode, MPFR_SIGN(a));
+
+  /* Warning: underflow should be checked *after* rounding, thus when rounding
+     away and when a > 0.111...111*2^(emin-1), or when rounding to nearest and
+     a >= 0.111...111[1]*2^(emin-1), there is no underflow.
+     Note: this case can only occur when the initial a0 (after the umul_ppmm
+     call above) had its most significant bit 0, since the largest a0 is
+     obtained for b0 = c0 = B-1 where B=2^GMP_NUMB_BITS, thus b0*c0 <= (B-1)^2
+     thus a0 <= B-2. */
+  if (MPFR_UNLIKELY(ax < __gmpfr_emin))
+    {
+      if (ax == __gmpfr_emin - 1 && ap[0] == ~MPFR_LIMB_ZERO &&
+          ((rnd_mode == MPFR_RNDN && rb) ||
+           (MPFR_IS_LIKE_RNDA(rnd_mode, MPFR_IS_NEG (a)) && (rb | sb))))
+        goto rounding; /* no underflow */
+      /* For RNDN, mpfr_underflow always rounds away, thus for |a| <= 2^(emin-2)
+         we have to change to RNDZ. This corresponds to:
+         (a) either ax < emin - 1
+         (b) or ax = emin - 1 and ap[0] = 1000....000 and rb = sb = 0 */
+      if (rnd_mode == MPFR_RNDN &&
+          (ax < __gmpfr_emin - 1 ||
+           (ap[0] == MPFR_LIMB_HIGHBIT && (rb | sb) == 0)))
+        rnd_mode = MPFR_RNDZ;
+      return mpfr_underflow (a, rnd_mode, MPFR_SIGN(a));
+    }
+
+ rounding:
+  MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
+                        in the cases "goto rounding" above. */
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
+    {
+      MPFR_ASSERTD(ax >= __gmpfr_emin);
+      return 0; /* same as MPFR_RET(0) but faster */
+    }
+  else if (rnd_mode == MPFR_RNDN)
+    {
+      if (rb == 0 || (sb == 0 && (ap[0] & MPFR_LIMB_ONE) == 0))
+        goto truncate;
+      else
+        goto add_one_ulp;
+    }
+  else if (MPFR_IS_LIKE_RNDZ(rnd_mode, MPFR_IS_NEG(a)))
+    {
+    truncate:
+      MPFR_ASSERTD(ax >= __gmpfr_emin);
+      MPFR_RET(-MPFR_SIGN(a));
+    }
+  else /* round away from zero */
+    {
+    add_one_ulp:
+      ap[0] += MPFR_LIMB_ONE;
+      if (ap[0] == 0)
+        {
+          ap[0] = MPFR_LIMB_HIGHBIT;
+          if (MPFR_UNLIKELY(ax + 1 > __gmpfr_emax))
+            return mpfr_overflow (a, rnd_mode, MPFR_SIGN(a));
+          MPFR_ASSERTD(ax + 1 <= __gmpfr_emax);
+          MPFR_ASSERTD(ax + 1 >= __gmpfr_emin);
+          MPFR_SET_EXP (a, ax + 1);
+        }
+      MPFR_RET(MPFR_SIGN(a));
+    }
+}
+
 /* Special code for GMP_NUMB_BITS < prec(a) < 2*GMP_NUMB_BITS and
    GMP_NUMB_BITS < prec(b), prec(c) <= 2*GMP_NUMB_BITS.
    Note: this code was copied in sqr.c, function mpfr_sqr_2 (this saves a few cycles
@@ -402,7 +498,7 @@ mpfr_mul_2 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -539,7 +635,7 @@ mpfr_mul_3 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -576,6 +672,8 @@ mpfr_mul_3 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
     }
 }
 
+#endif /* !defined(MPFR_GENERIC_ABI) */
+
 /* Note: mpfr_sqr will call mpfr_mul if bn > MPFR_SQR_THRESHOLD,
    in order to use Mulders' mulhigh, which is handled only here
    to avoid partial code duplication. There is some overhead due
@@ -651,18 +749,20 @@ mpfr_mul (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
   cq = MPFR_GET_PREC (c);
 
 #if !defined(MPFR_GENERIC_ABI)
-  if (aq < GMP_NUMB_BITS && bq <= GMP_NUMB_BITS && cq <= GMP_NUMB_BITS)
-    return mpfr_mul_1 (a, b, c, rnd_mode, aq);
-
-  if (GMP_NUMB_BITS < aq && aq < 2 * GMP_NUMB_BITS &&
-      GMP_NUMB_BITS < bq && bq <= 2 * GMP_NUMB_BITS &&
-      GMP_NUMB_BITS < cq && cq <= 2 * GMP_NUMB_BITS)
-    return mpfr_mul_2 (a, b, c, rnd_mode, aq);
-
-  if (2 * GMP_NUMB_BITS < aq && aq < 3 * GMP_NUMB_BITS &&
-      2 * GMP_NUMB_BITS < bq && bq <= 3 * GMP_NUMB_BITS &&
-      2 * GMP_NUMB_BITS < cq && cq <= 3 * GMP_NUMB_BITS)
-    return mpfr_mul_3 (a, b, c, rnd_mode, aq);
+  if (aq == bq && aq == cq)
+    {
+      if (aq < GMP_NUMB_BITS)
+        return mpfr_mul_1 (a, b, c, rnd_mode, aq);
+
+      if (GMP_NUMB_BITS < aq && aq < 2 * GMP_NUMB_BITS)
+        return mpfr_mul_2 (a, b, c, rnd_mode, aq);
+
+      if (aq == GMP_NUMB_BITS)
+        return mpfr_mul_1n (a, b, c, rnd_mode);
+
+      if (2 * GMP_NUMB_BITS < aq && aq < 3 * GMP_NUMB_BITS)
+        return mpfr_mul_3 (a, b, c, rnd_mode, aq);
+    }
 #endif
 
   sign = MPFR_MULT_SIGN (MPFR_SIGN (b), MPFR_SIGN (c));
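
Note on the mpfr_mul_1n hunk above: a minimal sketch of the multiply, normalize, and round/sticky extraction steps, assuming a 32-bit limb so that the double-limb product fits in a uint64_t. The name mul_limb_sketch and its out-parameters are hypothetical and not part of MPFR; the 64-bit product stands in for umul_ppmm, and the exponent handling, underflow/overflow checks and final rounding of the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of MPFR: multiply two normalized 32-bit
   significands (most significant bit set), as mpfr_mul_1n does for one
   full limb.  Returns the normalized high limb; *rb receives the round
   bit, *sb tells whether any sticky bit is set, and *dec is 1 when the
   exponent of the result has to be decreased by one. */
static uint32_t
mul_limb_sketch (uint32_t b0, uint32_t c0, int *rb, int *sb, int *dec)
{
  uint64_t prod;
  uint32_t hi, lo;

  assert ((b0 >> 31) == 1 && (c0 >> 31) == 1);  /* inputs are normalized */

  prod = (uint64_t) b0 * c0;      /* plays the role of umul_ppmm */
  hi = (uint32_t) (prod >> 32);
  lo = (uint32_t) prod;

  *dec = 0;
  if (hi < UINT32_C(0x80000000))  /* top bit clear: shift left by one */
    {
      *dec = 1;
      hi = (hi << 1) | (lo >> 31);
      lo <<= 1;
    }
  *rb = (lo >> 31) != 0;                    /* first discarded bit */
  *sb = (lo & UINT32_C(0x7fffffff)) != 0;   /* OR of the remaining bits */
  return hi;
}
```

With rb and sb in hand, the rounding decision depends only on the rounding mode, the sign, and the low bit of the returned limb, which is what the "rounding:" part of the hunk implements.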
diff --git a/src/odd_p.c b/src/odd_p.c
new file mode 100644
index 000000000..411f8555d
--- /dev/null
+++ b/src/odd_p.c
@@ -0,0 +1,73 @@
+/* mpfr_odd_p -- check for odd integers
+
+Copyright 2001-2017 Free Software Foundation, Inc.
+Contributed by the AriC and Caramba projects, INRIA.
+
+This file is part of the GNU MPFR Library.
+
+The GNU MPFR Library is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as published by
+the Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+The GNU MPFR Library is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
+License for more details.
+
+You should have received a copy of the GNU Lesser General Public License
+along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
+http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
+51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
+
+#define MPFR_NEED_LONGLONG_H
+#include "mpfr-impl.h"
+
+/* Return 1 if y is an odd integer, 0 otherwise.
+   Assumes y is not singular. */
+int
+mpfr_odd_p (mpfr_srcptr y)
+{
+  mpfr_exp_t expo;
+  mpfr_prec_t prec;
+  mp_size_t yn;
+  mp_limb_t *yp;
+
+  /* NAN, INF or ZERO are not allowed */
+  MPFR_ASSERTD (!MPFR_IS_SINGULAR (y));
+
+  expo = MPFR_GET_EXP (y);
+  if (expo <= 0)
+    return 0;  /* |y| < 1 and not 0 */
+
+  prec = MPFR_PREC(y);
+  if ((mpfr_prec_t) expo > prec)
+    return 0;  /* y is a multiple of 2^(expo-prec), thus not odd */
+
+  /* 0 < expo <= prec:
+     y = 1xxxxxxxxxt.zzzzzzzzzzzzzzzzzz[000]
+          expo bits   (prec-expo) bits
+
+     We have to check that:
+     (a) the bit 't' is set
+     (b) all the 'z' bits are zero
+  */
+
+  prec = MPFR_PREC2LIMBS (prec) * GMP_NUMB_BITS - expo;
+  /* number of z+0 bits */
+
+  yn = prec / GMP_NUMB_BITS;
+  MPFR_ASSERTN(yn >= 0);
+  /* yn is the index of limb containing the 't' bit */
+
+  yp = MPFR_MANT(y);
+  /* if expo is a multiple of GMP_NUMB_BITS, t is bit 0 */
+  if (expo % GMP_NUMB_BITS == 0 ? (yp[yn] & 1) == 0
+      : yp[yn] << ((expo % GMP_NUMB_BITS) - 1) != MPFR_LIMB_HIGHBIT)
+    return 0;
+  while (--yn >= 0)
+    if (yp[yn] != 0)
+      return 0;
+  return 1;
+}
+
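
Note on src/odd_p.c above: a minimal sketch of the same test on a simplified single-limb model, assuming the value is mant * 2^(expo - 64) with the top bit of mant set. The name odd_p_sketch and this representation are hypothetical; the real mpfr_odd_p works on an mpfr_srcptr with several limbs and also skips the unused trailing bits of the last limb.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical single-limb model, not MPFR's API: the value is
   mant * 2^(expo - 64), with the most significant bit of mant set.
   Returns 1 if the value is an odd integer, 0 otherwise. */
static int
odd_p_sketch (uint64_t mant, long expo)
{
  uint64_t t;

  assert ((mant >> 63) == 1);   /* normalized significand */

  if (expo <= 0)
    return 0;                   /* |value| < 1 and not 0 */
  if (expo > 64)
    return 0;                   /* multiple of 2^(expo-64), thus even */

  /* 0 < expo <= 64: the bit 't' of weight 2^0 is at position 64 - expo */
  t = UINT64_C(1) << (64 - expo);
  if ((mant & t) == 0)
    return 0;                   /* (a) the bit 't' must be set */
  return (mant & (t - 1)) == 0; /* (b) all bits below 't' must be zero */
}
```

For instance, 3 is represented as mant = 0xC000000000000000 with expo = 2 and is reported as odd, whereas 1.5 (same mant, expo = 1) is rejected by check (b).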
diff --git a/src/pow.c b/src/pow.c
index 8ba7f3c3c..6d5c05f96 100644
--- a/src/pow.c
+++ b/src/pow.c
@@ -107,53 +107,6 @@ mpfr_pow_is_exact (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y,
   return res;
 }
 
-/* Return 1 if y is an odd integer, 0 otherwise. */
-static int
-is_odd (mpfr_srcptr y)
-{
-  mpfr_exp_t expo;
-  mpfr_prec_t prec;
-  mp_size_t yn;
-  mp_limb_t *yp;
-
-  /* NAN, INF or ZERO are not allowed */
-  MPFR_ASSERTD (!MPFR_IS_SINGULAR (y));
-
-  expo = MPFR_GET_EXP (y);
-  if (expo <= 0)
-    return 0;  /* |y| < 1 and not 0 */
-
-  prec = MPFR_PREC(y);
-  if ((mpfr_prec_t) expo > prec)
-    return 0;  /* y is a multiple of 2^(expo-prec), thus not odd */
-
-  /* 0 < expo <= prec:
-     y = 1xxxxxxxxxt.zzzzzzzzzzzzzzzzzz[000]
-          expo bits   (prec-expo) bits
-
-     We have to check that:
-     (a) the bit 't' is set
-     (b) all the 'z' bits are zero
-  */
-
-  prec = MPFR_PREC2LIMBS (prec) * GMP_NUMB_BITS - expo;
-  /* number of z+0 bits */
-
-  yn = prec / GMP_NUMB_BITS;
-  MPFR_ASSERTN(yn >= 0);
-  /* yn is the index of limb containing the 't' bit */
-
-  yp = MPFR_MANT(y);
-  /* if expo is a multiple of GMP_NUMB_BITS, t is bit 0 */
-  if (expo % GMP_NUMB_BITS == 0 ? (yp[yn] & 1) == 0
-      : yp[yn] << ((expo % GMP_NUMB_BITS) - 1) != MPFR_LIMB_HIGHBIT)
-    return 0;
-  while (--yn >= 0)
-    if (yp[yn] != 0)
-      return 0;
-  return 1;
-}
-
 /* Assumes that the exponent range has already been extended and if y is
    an integer, then the result is not exact in unbounded exponent range.
    If x < 0, assumes y is an integer.
@@ -191,7 +144,7 @@ mpfr_pow_general (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y,
   if (MPFR_IS_NEG (x))
     {
       MPFR_ASSERTD (y_is_integer);
-      if (is_odd (y))
+      if (mpfr_odd_p (y))
         {
           neg_result = 1;
           rnd_mode = MPFR_INVERT_RND (rnd_mode);
@@ -505,7 +458,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
         {
           int negative;
           /* Determine the sign now, in case y and z are the same object */
-          negative = MPFR_IS_NEG (x) && is_odd (y);
+          negative = MPFR_IS_NEG (x) && mpfr_odd_p (y);
           if (MPFR_IS_POS (y))
             MPFR_SET_INF (z);
           else
@@ -521,7 +474,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
           int negative;
           MPFR_ASSERTD (MPFR_IS_ZERO (x));
           /* Determine the sign now, in case y and z are the same object */
-          negative = MPFR_IS_NEG(x) && is_odd (y);
+          negative = MPFR_IS_NEG(x) && mpfr_odd_p (y);
           if (MPFR_IS_NEG (y))
             {
               MPFR_ASSERTD (! MPFR_IS_INF (y));
@@ -552,7 +505,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
 
   cmp_x_1 = mpfr_cmpabs (x, __gmpfr_one);
   if (cmp_x_1 == 0)
-    return mpfr_set_si (z, MPFR_IS_NEG (x) && is_odd (y) ? -1 : 1, rnd_mode);
+    return mpfr_set_si (z, MPFR_IS_NEG (x) && mpfr_odd_p (y) ? -1 : 1, rnd_mode);
 
   /* now we have:
      (1) either x > 0
@@ -608,7 +561,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
       if (overflow)
         {
           MPFR_LOG_MSG (("early overflow detection\n", 0));
-          negative = MPFR_IS_NEG (x) && is_odd (y);
+          negative = MPFR_IS_NEG (x) && mpfr_odd_p (y);
           return mpfr_overflow (z, rnd_mode, negative ? -1 : 1);
         }
     }
@@ -651,7 +604,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
           MPFR_LOG_MSG (("early underflow detection\n", 0));
           return mpfr_underflow (z,
                                  rnd_mode == MPFR_RNDN ? MPFR_RNDZ : rnd_mode,
-                                 MPFR_IS_NEG (x) && is_odd (y) ? -1 : 1);
+                                 MPFR_IS_NEG (x) && mpfr_odd_p (y) ? -1 : 1);
         }
     }
 
@@ -701,7 +654,7 @@ mpfr_pow (mpfr_ptr z, mpfr_srcptr x, mpfr_srcptr y, mpfr_rnd_t rnd_mode)
     MPFR_CLEAR_FLAGS ();
     inexact = mpfr_exp2 (z, tmp, rnd_mode);
     mpfr_clear (tmp);
-    if (sgnx < 0 && is_odd (y))
+    if (sgnx < 0 && mpfr_odd_p (y))
       {
         mpfr_neg (z, z, rnd_mode);
         inexact = -inexact;
diff --git a/src/root.c b/src/root.c
index d3b5b171a..88e903bcc 100644
--- a/src/root.c
+++ b/src/root.c
@@ -63,6 +63,8 @@ mpfr_root (mpfr_ptr y, mpfr_srcptr x, unsigned long k, mpfr_rnd_t rnd_mode)
     {
       if (k == 0)
         {
+          /* x^(1/0) = NaN since 0 is not signed, thus 1/0 might be +Inf or
+             -Inf */
           MPFR_SET_NAN (y);
           MPFR_RET_NAN;
         }
diff --git a/src/set.c b/src/set.c
index 326b06ce1..39b93e14b 100644
--- a/src/set.c
+++ b/src/set.c
@@ -79,3 +79,121 @@ mpfr_abs (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode)
 {
   return mpfr_set4 (a, b, rnd_mode, MPFR_SIGN_POS);
 }
+
+/* Round (u, inex) into s with rounding mode rnd, where inex is the ternary
+   value associated to u with the *same* rounding mode.
+   Assumes PREC(u) = 2*PREC(s).
+   The main algorithm is the following:
+   rnd=RNDZ: inex2 = mpfr_set (s, u, rnd_mode); return inex2 | inex;
+             (a negative value, if any, is preserved in inex2 | inex)
+   rnd=RNDA: idem
+   rnd=RNDN: inex2 = mpfr_set (s, u, rnd_mode);
+             if (inex2) return inex2; else return inex; */
+int
+mpfr_set_1_2 (mpfr_ptr s, mpfr_srcptr u, mpfr_rnd_t rnd_mode, int inex)
+{
+  mpfr_prec_t p = MPFR_PREC(s);
+  mpfr_prec_t sh = GMP_NUMB_BITS - p;
+  mp_limb_t rb, sb;
+  mp_limb_t *sp = MPFR_MANT(s);
+  mp_limb_t *up = MPFR_MANT(u);
+  mp_limb_t mask;
+  int inex2;
+
+  if (MPFR_UNLIKELY(MPFR_IS_SINGULAR(u)))
+    {
+      mpfr_set (s, u, rnd_mode);
+      return inex;
+    }
+
+  MPFR_ASSERTD(MPFR_PREC(u) == 2 * MPFR_PREC(s));
+
+  if (MPFR_PREC(s) < GMP_NUMB_BITS)
+    {
+      mask = MPFR_LIMB_MASK(sh);
+
+      if (MPFR_PREC(u) <= GMP_NUMB_BITS)
+        {
+          mp_limb_t u0 = up[0];
+
+          /* it suffices to round (u0, inex) */
+          rb = u0 & (MPFR_LIMB_ONE << (sh - 1));
+          sb = (u0 & mask) ^ rb;
+          sp[0] = u0 & ~mask;
+        }
+      else
+        {
+          mp_limb_t u1 = up[1];
+
+          /* we need to round (u1, u0, inex) */
+          mask = MPFR_LIMB_MASK(sh);
+          rb = u1 & (MPFR_LIMB_ONE << (sh - 1));
+          sb = ((u1 & mask) ^ rb) | up[0];
+          sp[0] = u1 & ~mask;
+        }
+
+      inex2 = inex * MPFR_SIGN(u);
+      MPFR_SIGN(s) = MPFR_SIGN(u);
+      MPFR_EXP(s) = MPFR_EXP(u);
+
+      /* in case inex2 > 0, the value of u is rounded away,
+         thus we need to subtract something from (u0, rb, sb):
+         (a) if sb is not zero, since the subtracted value is < 1, we can leave
+         sb as it is;
+         (b) if rb <> 0 and sb = 0: change to rb = 0 and sb = 1
+         (c) if rb = sb = 0: change to rb = 1 and sb = 1, and subtract 1 */
+      if (inex2 > 0)
+        {
+          if (rb && sb == 0)
+            {
+              rb = 0;
+              sb = 1;
+            }
+        }
+      else /* inex2 <= 0 */
+        sb |= inex;
+
+      /* now rb, sb are the round and sticky bits, together with the value of
+         sp[0], except possibly in the case rb = sb = 0 and inex2 > 0 */
+      if (rb == 0 && sb == 0)
+        {
+          if (inex2 <= 0)
+            MPFR_RET(0);
+          else /* inex2 > 0 can only occur for RNDN and RNDA:
+                  RNDN: return sp[0] and inex
+                  RNDA: return sp[0] and inex */
+            MPFR_RET(inex);
+        }
+      else if (rnd_mode == MPFR_RNDN)
+        {
+          if (rb == 0 || (sb == 0 && (sp[0] & (MPFR_LIMB_ONE << sh)) == 0))
+            goto truncate;
+          else
+            goto add_one_ulp;
+        }
+      else if (MPFR_IS_LIKE_RNDZ(rnd_mode, MPFR_IS_NEG(s)))
+        {
+        truncate:
+          MPFR_RET(-MPFR_SIGN(s));
+        }
+      else /* round away from zero */
+        {
+        add_one_ulp:
+          sp[0] += MPFR_LIMB_ONE << sh;
+          if (MPFR_UNLIKELY(sp[0] == 0))
+            {
+              sp[0] = MPFR_LIMB_HIGHBIT;
+              if (MPFR_EXP(s) + 1 <= __gmpfr_emax)
+                MPFR_SET_EXP (s, MPFR_EXP(s) + 1);
+              else /* overflow */
+                return mpfr_overflow (s, rnd_mode, MPFR_SIGN(s));
+            }
+          MPFR_RET(MPFR_SIGN(s));
+        }
+    }
+
+  /* general case PREC(s) >= GMP_NUMB_BITS */
+  inex2 = mpfr_set (s, u, rnd_mode);
+  return (rnd_mode != MPFR_RNDN) ? inex | inex2
+    : (inex2) ? inex2 : inex;
+}
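
Note on mpfr_set_1_2 above: a minimal sketch of how the general case combines the two ternary values, mirroring the final return statement of the function. The name combine_ternary_sketch and the to_nearest flag are hypothetical and not part of MPFR; the sketch only restates the rule from the comment of the patch and does not model the special handling done for PREC(s) < GMP_NUMB_BITS.

```c
#include <assert.h>

/* Hypothetical helper, not part of MPFR: combine the ternary value inex
   of a first rounding with the ternary value inex2 of a second rounding
   done with the same rounding mode. */
static int
combine_ternary_sketch (int inex, int inex2, int to_nearest)
{
  if (to_nearest)
    return inex2 != 0 ? inex2 : inex;

  /* Directed rounding: both roundings go in the same direction, thus
     inex and inex2 never have opposite signs (assumption of this sketch,
     checked here). */
  assert ((inex <= 0 && inex2 <= 0) || (inex >= 0 && inex2 >= 0));

  /* The bitwise OR is zero only if both values are zero, and a negative
     value, if any, is preserved. */
  return inex | inex2;
}
```

The OR is enough for the directed modes because only the sign of a ternary value matters to the caller, not its magnitude.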
diff --git a/src/set_d64.c b/src/set_d64.c
index 321c121e4..2e3e37e01 100644
--- a/src/set_d64.c
+++ b/src/set_d64.c
@@ -1,4 +1,4 @@
-/* mpfr_set_decimal64 -- convert a IEEE 754r decimal64 float to
+/* mpfr_set_decimal64 -- convert an IEEE 754-2008 decimal64 float to
                          a multiple precision floating-point number
 
 See https://gcc.gnu.org/ml/gcc/2006-06/msg00691.html,
diff --git a/src/sqr.c b/src/sqr.c
index d3b710733..416f14e5b 100644
--- a/src/sqr.c
+++ b/src/sqr.c
@@ -23,6 +23,8 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 #define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
+#if !defined(MPFR_GENERIC_ABI) && (GMP_NUMB_BITS == 32 || GMP_NUMB_BITS == 64)
+
 /* Special code for prec(a) < GMP_NUMB_BITS and prec(b) <= GMP_NUMB_BITS.
    Note: this function was copied from mpfr_mul_1 in file mul.c, thus any change
    here should be done also in mpfr_mul_1. */
@@ -31,7 +33,7 @@ mpfr_sqr_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
 {
   mp_limb_t a0;
   mpfr_limb_ptr ap = MPFR_MANT(a);
-  mpfr_limb_ptr bp = MPFR_MANT(b);
+  mp_limb_t b0 = MPFR_MANT(b)[0];
   mpfr_exp_t ax;
   mpfr_prec_t sh = GMP_NUMB_BITS - p;
   mp_limb_t rb, sb, mask = MPFR_LIMB_MASK(sh);
@@ -39,11 +41,11 @@ mpfr_sqr_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
   /* When prec(b) <= GMP_NUMB_BITS / 2, we could replace umul_ppmm
      by a limb multiplication as follows, but we assume umul_ppmm is as fast
      as a limb multiplication on modern processors:
-      a0 = (bp[0] >> (GMP_NUMB_BITS / 2)) * (bp[0] >> (GMP_NUMB_BITS / 2));
+      a0 = (b0 >> (GMP_NUMB_BITS / 2)) * (b0 >> (GMP_NUMB_BITS / 2));
       sb = 0;
   */
   ax = MPFR_GET_EXP(b) * 2;
-  umul_ppmm (a0, sb, bp[0], bp[0]);
+  umul_ppmm (a0, sb, b0, b0);
   if (a0 < MPFR_LIMB_HIGHBIT)
     {
       ax --;
@@ -90,7 +92,7 @@ mpfr_sqr_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -125,6 +127,98 @@ mpfr_sqr_1 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
     }
 }
 
+/* special code for PREC(a) = GMP_NUMB_BITS */
+static int
+mpfr_sqr_1n (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode)
+{
+  mp_limb_t a0;
+  mpfr_limb_ptr ap = MPFR_MANT(a);
+  mp_limb_t b0 = MPFR_MANT(b)[0];
+  mpfr_exp_t ax;
+  mp_limb_t rb, sb;
+
+  ax = MPFR_GET_EXP(b) * 2;
+  umul_ppmm (a0, sb, b0, b0);
+  if (a0 < MPFR_LIMB_HIGHBIT)
+    {
+      ax --;
+      a0 = (a0 << 1) | (sb >> (GMP_NUMB_BITS - 1));
+      sb = sb << 1;
+    }
+  rb = sb & MPFR_LIMB_HIGHBIT;
+  sb = sb & ~MPFR_LIMB_HIGHBIT;
+  ap[0] = a0;
+
+  MPFR_SIGN(a) = MPFR_SIGN_POS;
+
+  /* rounding */
+  if (MPFR_UNLIKELY(ax > __gmpfr_emax))
+    return mpfr_overflow (a, rnd_mode, MPFR_SIGN_POS);
+
+  /* Warning: underflow should be checked *after* rounding, thus when rounding
+     away and when a > 0.111...111*2^(emin-1), or when rounding to nearest and
+     a >= 0.111...111[1]*2^(emin-1), there is no underflow. */
+  if (MPFR_UNLIKELY(ax < __gmpfr_emin))
+    {
+      /* As seen in mpfr_mul_1, we cannot have a0 = 111...111 here if there
+         was no exponent decrease (ax--) above.
+         In the case of an exponent decrease, it is not possible for
+         GMP_NUMB_BITS=32 since the largest b0 such that b0^2 < 2^(2*32-1)
+         is b0=3037000499, but its square has only 30 leading ones.
+         For GMP_NUMB_BITS=64 it is possible: the largest b0 is
+         13043817825332782212, and its square has 64 leading ones. */
+      if ((ax == __gmpfr_emin - 1) && (ap[0] == ~MPFR_LIMB_ZERO) &&
+          ((rnd_mode == MPFR_RNDN && rb) ||
+           (MPFR_IS_LIKE_RNDA(rnd_mode, MPFR_IS_NEG (a)) && (rb | sb))))
+        goto rounding; /* no underflow */
+      /* For RNDN, mpfr_underflow always rounds away, thus for |a| <= 2^(emin-2)
+         we have to change to RNDZ. This corresponds to:
+         (a) either ax < emin - 1
+         (b) or ax = emin - 1 and ap[0] = 1000....000 and rb = sb = 0 */
+      if (rnd_mode == MPFR_RNDN &&
+          (ax < __gmpfr_emin - 1 || (ap[0] == MPFR_LIMB_HIGHBIT && (rb | sb) == 0)))
+        rnd_mode = MPFR_RNDZ;
+      return mpfr_underflow (a, rnd_mode, MPFR_SIGN_POS);
+    }
+
+ rounding:
+  MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
+                        in the cases "goto rounding" above. */
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
+    {
+      MPFR_ASSERTD(ax >= __gmpfr_emin);
+      return 0; /* same as MPFR_RET(0) but faster */
+    }
+  else if (rnd_mode == MPFR_RNDN)
+    {
+      if (rb == 0 || (sb == 0 && (ap[0] & MPFR_LIMB_ONE) == 0))
+        goto truncate;
+      else
+        goto add_one_ulp;
+    }
+  else if (MPFR_IS_LIKE_RNDZ(rnd_mode, MPFR_IS_NEG(a)))
+    {
+    truncate:
+      MPFR_ASSERTD(ax >= __gmpfr_emin);
+      MPFR_RET(-MPFR_SIGN_POS);
+    }
+  else /* round away from zero */
+    {
+    add_one_ulp:
+      ap[0] += MPFR_LIMB_ONE;
+      if (ap[0] == 0)
+        {
+          ap[0] = MPFR_LIMB_HIGHBIT;
+          if (MPFR_UNLIKELY(ax + 1 > __gmpfr_emax))
+            return mpfr_overflow (a, rnd_mode, MPFR_SIGN_POS);
+          MPFR_ASSERTD(ax + 1 <= __gmpfr_emax);
+          MPFR_ASSERTD(ax + 1 >= __gmpfr_emin);
+          MPFR_SET_EXP (a, ax + 1);
+        }
+      MPFR_RET(MPFR_SIGN_POS);
+    }
+}
+
 /* Special code for GMP_NUMB_BITS < prec(a) < 2*GMP_NUMB_BITS and
    GMP_NUMB_BITS < prec(b) <= 2*GMP_NUMB_BITS.
    Note: this function was copied and optimized from mpfr_mul_2 in file mul.c,
@@ -219,7 +313,7 @@ mpfr_sqr_2 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -358,7 +452,7 @@ mpfr_sqr_3 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
  rounding:
   MPFR_EXP (a) = ax; /* Don't use MPFR_SET_EXP since ax might be < __gmpfr_emin
                         in the cases "goto rounding" above. */
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if ((rb == 0 && sb == 0) || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(ax >= __gmpfr_emin);
       return 0; /* idem than MPFR_RET(0) but faster */
@@ -395,6 +489,8 @@ mpfr_sqr_3 (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode, mpfr_prec_t p)
     }
 }
 
+#endif /* !defined(MPFR_GENERIC_ABI) && ... */
+
 /* Note: mpfr_sqr will call mpfr_mul if bn > MPFR_SQR_THRESHOLD,
    in order to use Mulders' mulhigh, which is handled only here
    to avoid partial code duplication. There is some overhead due
@@ -435,6 +531,7 @@ mpfr_sqr (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode)
   aq = MPFR_GET_PREC(a);
   bq = MPFR_GET_PREC(b);
 
+#if !defined(MPFR_GENERIC_ABI) && (GMP_NUMB_BITS == 32 || GMP_NUMB_BITS == 64)
   if (aq < GMP_NUMB_BITS && bq <= GMP_NUMB_BITS)
     return mpfr_sqr_1 (a, b, rnd_mode, aq);
 
@@ -442,9 +539,13 @@ mpfr_sqr (mpfr_ptr a, mpfr_srcptr b, mpfr_rnd_t rnd_mode)
       && GMP_NUMB_BITS < bq && bq <= 2 * GMP_NUMB_BITS)
     return mpfr_sqr_2 (a, b, rnd_mode, aq);
 
+  if (aq == GMP_NUMB_BITS && bq <= GMP_NUMB_BITS)
+    return mpfr_sqr_1n (a, b, rnd_mode);
+
   if (2 * GMP_NUMB_BITS < aq && aq < 3 * GMP_NUMB_BITS
       && 2 * GMP_NUMB_BITS < bq && bq <= 3 * GMP_NUMB_BITS)
     return mpfr_sqr_3 (a, b, rnd_mode, aq);
+#endif
 
   ax = 2 * MPFR_GET_EXP (b);
   MPFR_ASSERTN (2 * (mpfr_uprec_t) bq <= MPFR_PREC_MAX);
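
Reviewer note on the comment in mpfr_sqr_1n about GMP_NUMB_BITS=32: the claim that the square of b0=3037000499 has only 30 leading ones can be checked with plain 64-bit arithmetic. The following is a minimal sketch, not MPFR code; the helper names leading_ones and sqr32_leading_ones are hypothetical, and the 64-bit case is left out because it would need a 128-bit product.

```c
#include <assert.h>
#include <stdint.h>

/* Count the leading one bits of a 64-bit word. */
static int
leading_ones (uint64_t x)
{
  int n = 0;

  while (n < 64 && (x >> 63) != 0)
    {
      x <<= 1;
      n++;
    }
  return n;
}

/* Leading ones of b0^2 after normalization, for a 32-bit limb b0 whose
   square is below 2^63 (the "exponent decrease" case of the comment). */
static int
sqr32_leading_ones (uint32_t b0)
{
  uint64_t sq = (uint64_t) b0 * b0;

  assert (sq < (UINT64_C (1) << 63));  /* otherwise no exponent decrease */
  return leading_ones (sq << 1);       /* shift = normalization step */
}
```

Calling sqr32_leading_ones (3037000499u) gives the count the comment refers to; a full limb of ones would need 32.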
diff --git a/src/sqrt.c b/src/sqrt.c
index fa30defc5..9b487af65 100644
--- a/src/sqrt.c
+++ b/src/sqrt.c
@@ -127,7 +127,8 @@ mpfr_sqrt1 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
   sb |= (r0 & mask) ^ rb;
   rp[0] = r0 & ~mask;
 
-  /* rounding */
+  /* rounding: sb = 0 implies rb = 0, since (rb,sb)=(1,0) is not possible */
+  MPFR_ASSERTD (rb == 0 || sb != 0);
 
   /* Note: if 1 and 2 are in [emin,emax], no overflow nor underflow
      is possible */
@@ -154,16 +155,18 @@ mpfr_sqrt1 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
 
  rounding:
   MPFR_EXP (r) = exp_r;
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if (sb == 0 /* implies rb = 0 */ || rnd_mode == MPFR_RNDF)
     {
+      MPFR_ASSERTD (rb == 0 || rnd_mode == MPFR_RNDF);
       MPFR_ASSERTD(exp_r >= __gmpfr_emin);
       MPFR_ASSERTD(exp_r <= __gmpfr_emax);
       return 0; /* idem than MPFR_RET(0) but faster */
     }
   else if (rnd_mode == MPFR_RNDN)
     {
-      if (rb == 0 || (rb && sb == 0 &&
-                      (rp[0] & (MPFR_LIMB_ONE << sh)) == 0))
+      /* since sb <> 0, only rb is needed to decide how to round, and the exact
+         middle is not possible */
+      if (rb == 0)
         goto truncate;
       else
         goto add_one_ulp;
@@ -192,6 +195,133 @@ mpfr_sqrt1 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
     }
 }
 
+/* Special code for prec(r) = GMP_NUMB_BITS and prec(u) <= GMP_NUMB_BITS. */
+static int
+mpfr_sqrt1n (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
+{
+  mpfr_prec_t exp_u = MPFR_EXP(u), exp_r;
+  mp_limb_t u0, r0, rb, sb, low;
+  mpfr_limb_ptr rp = MPFR_MANT(r);
+
+  MPFR_STAT_STATIC_ASSERT (GMP_NUMB_BITS == 64);
+  MPFR_ASSERTD(MPFR_PREC(r) == GMP_NUMB_BITS);
+  MPFR_ASSERTD(MPFR_PREC(u) <= GMP_NUMB_BITS);
+
+  /* first make the exponent even */
+  u0 = MPFR_MANT(u)[0];
+  if (((unsigned int) exp_u & 1) != 0)
+    {
+      low = u0 << (GMP_NUMB_BITS - 1);
+      u0 >>= 1;
+      exp_u ++;
+    }
+  else
+    low = 0; /* low part of u0 */
+  MPFR_ASSERTD (((unsigned int) exp_u & 1) == 0);
+  exp_r = exp_u / 2;
+
+  /* then compute an approximation of the integer square root of
+     u0*2^GMP_NUMB_BITS */
+  __gmpfr_sqrt_limb_approx (r0, u0);
+
+  /* the exact square root is in [r0, r0 + 7] */
+
+  /* first ensure r0 has its most significant bit set */
+  if (MPFR_UNLIKELY(r0 < MPFR_LIMB_HIGHBIT))
+    r0 = MPFR_LIMB_HIGHBIT;
+
+  umul_ppmm (rb, sb, r0, r0);
+  sub_ddmmss (rb, sb, u0, low, rb, sb);
+  /* for the exact square root, we should have 0 <= rb:sb <= 2*r0 */
+  while (!(rb == 0 || (rb == 1 && sb <= 2 * r0)))
+    {
+      /* subtract 2*r0+1 from rb:sb: subtract r0 before incrementing r0,
+         then r0 after (which is r0+1) */
+      rb -= (sb < r0);
+      sb -= r0;
+      r0 ++;
+      rb -= (sb < r0);
+      sb -= r0;
+    }
+  /* now we have u0*2^64+low = r0^2 + rb*2^64+sb, with rb*2^64+sb <= 2*r0 */
+  MPFR_ASSERTD(rb == 0 || (rb == 1 && sb <= 2 * r0));
+
+  /* We can't have the middle case u0*2^64 = (r0 + 1/2)^2 since
+     (r0 + 1/2)^2 is not an integer.
+     We thus have rb = 1 whenever u0*2^64 > (r0 + 1/2)^2, i.e., whenever
+     rb*2^64 + sb > r0, and the sticky bit is always 1, unless rb = sb = 0. */
+
+  rb = rb || (sb > r0);
+  sb = rb | sb;
+  rp[0] = r0;
+
+  /* rounding */
+
+  /* Note: if 1 and 2 are in [emin,emax], no overflow nor underflow
+     is possible */
+  if (MPFR_UNLIKELY (exp_r > __gmpfr_emax))
+    return mpfr_overflow (r, rnd_mode, 1);
+
+  /* See comments in mpfr_div_1 */
+  if (MPFR_UNLIKELY (exp_r < __gmpfr_emin))
+    {
+      if (rnd_mode == MPFR_RNDN)
+        {
+          /* the case rp[0] = 111...111 and rb = 1 cannot happen, since it
+             would imply u0 >= (2^64-1/2)^2/2^64 thus u0 >= 2^64 */
+          if (exp_r < __gmpfr_emin - 1 || (rp[0] == MPFR_LIMB_HIGHBIT && sb == 0))
+            rnd_mode = MPFR_RNDZ;
+        }
+      else if (MPFR_IS_LIKE_RNDA(rnd_mode, 0))
+        {
+          if ((exp_r == __gmpfr_emin - 1) && (rp[0] == ~MPFR_LIMB_ZERO) && (rb | sb))
+            goto rounding; /* no underflow */
+        }
+      return mpfr_underflow (r, rnd_mode, 1);
+    }
+
+  /* sb = 0 can only occur when the square root is exact, i.e., rb = 0 */
+
+ rounding:
+  MPFR_EXP (r) = exp_r;
+  if (sb == 0 /* implies rb = 0 */ || rnd_mode == MPFR_RNDF)
+    {
+      MPFR_ASSERTD(exp_r >= __gmpfr_emin);
+      MPFR_ASSERTD(exp_r <= __gmpfr_emax);
+      return 0; /* same as MPFR_RET(0) but faster */
+    }
+  else if (rnd_mode == MPFR_RNDN)
+    {
+      /* we can't have sb = 0, thus rb is enough */
+      if (rb == 0)
+        goto truncate;
+      else
+        goto add_one_ulp;
+    }
+  else if (MPFR_IS_LIKE_RNDZ(rnd_mode, 0))
+    {
+    truncate:
+      MPFR_ASSERTD(exp_r >= __gmpfr_emin);
+      MPFR_ASSERTD(exp_r <= __gmpfr_emax);
+      MPFR_RET(-1);
+    }
+  else /* round away from zero */
+    {
+    add_one_ulp:
+      rp[0] += MPFR_LIMB_ONE;
+      if (rp[0] == 0)
+        {
+          rp[0] = MPFR_LIMB_HIGHBIT;
+          if (MPFR_UNLIKELY(exp_r + 1 > __gmpfr_emax))
+            return mpfr_overflow (r, rnd_mode, 1);
+          MPFR_ASSERTD(exp_r + 1 <= __gmpfr_emax);
+          MPFR_ASSERTD(exp_r + 1 >= __gmpfr_emin);
+          MPFR_SET_EXP (r, exp_r + 1);
+        }
+      MPFR_RET(1);
+    }
+}
+
 /* Special code for GMP_NUMB_BITS < prec(r) < 2*GMP_NUMB_BITS,
    and GMP_NUMB_BITS < prec(u) <= 2*GMP_NUMB_BITS.
    Assumes GMP_NUMB_BITS=64. */
@@ -303,7 +433,7 @@ mpfr_sqrt2 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
 
  rounding:
   MPFR_EXP (r) = exp_r;
-  if ((rb == 0 && sb == 0) || (rnd_mode == MPFR_RNDF))
+  if (sb == 0 /* implies rb = 0 */ || rnd_mode == MPFR_RNDF)
     {
       MPFR_ASSERTD(exp_r >= __gmpfr_emin);
       MPFR_ASSERTD(exp_r <= __gmpfr_emax);
@@ -311,8 +441,8 @@ mpfr_sqrt2 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
     }
   else if (rnd_mode == MPFR_RNDN)
     {
-      if (rb == 0 || (rb && sb == 0 &&
-                      (rp[0] & (MPFR_LIMB_ONE << sh)) == 0))
+      /* since sb <> 0 now, only rb is needed */
+      if (rb == 0)
         goto truncate;
       else
         goto add_one_ulp;
@@ -341,6 +471,7 @@ mpfr_sqrt2 (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
       MPFR_RET(1);
     }
 }
+
 #endif /* !defined(MPFR_GENERIC_ABI) && GMP_NUMB_BITS == 64 */
 
 int
@@ -362,6 +493,7 @@ mpfr_sqrt (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
   int sh; /* number of extra bits in rp[0] */
   int inexact; /* return ternary flag */
   mpfr_exp_t expr;
+  mpfr_prec_t rq = MPFR_GET_PREC (r);
   MPFR_TMP_DECL(marker);
 
   MPFR_LOG_FUNC
@@ -405,16 +537,25 @@ mpfr_sqrt (mpfr_ptr r, mpfr_srcptr u, mpfr_rnd_t rnd_mode)
   MPFR_SET_POS(r);
 
 #if !defined(MPFR_GENERIC_ABI) && GMP_NUMB_BITS == 64
-  if (MPFR_GET_PREC (r) < GMP_NUMB_BITS && MPFR_GET_PREC (u) < GMP_NUMB_BITS)
-    return mpfr_sqrt1 (r, u, rnd_mode);
+  {
+    mpfr_prec_t uq = MPFR_GET_PREC (u);
+
+    if (rq == uq)
+      {
+        if (rq < GMP_NUMB_BITS)
+          return mpfr_sqrt1 (r, u, rnd_mode);
+
+        if (GMP_NUMB_BITS < rq && rq < 2*GMP_NUMB_BITS)
+          return mpfr_sqrt2 (r, u, rnd_mode);
 
-  if (GMP_NUMB_BITS < MPFR_GET_PREC (r) && MPFR_GET_PREC (r) < 2*GMP_NUMB_BITS
-      && MPFR_LIMB_SIZE(u) == 2)
-    return mpfr_sqrt2 (r, u, rnd_mode);
+        if (rq == GMP_NUMB_BITS)
+          return mpfr_sqrt1n (r, u, rnd_mode);
+      }
+  }
 #endif
 
   MPFR_TMP_MARK (marker);
-  MPFR_UNSIGNED_MINUS_MODULO (sh, MPFR_GET_PREC (r));
+  MPFR_UNSIGNED_MINUS_MODULO (sh, rq);
   if (sh == 0 && rnd_mode == MPFR_RNDN)
     sh = GMP_NUMB_BITS; /* ugly case */
   rsize = MPFR_LIMB_SIZE(r) + (sh == GMP_NUMB_BITS);
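
Reviewer note on mpfr_sqrt1n: the correction loop (subtract 2*r0+1 from the remainder until it is at most 2*r0) and the derivation of the round and sticky bits can be illustrated on a smaller scale. This is only a sketch under assumptions: it uses a 64-bit radicand and a 32-bit root instead of MPFR's two-limb remainder, the caller supplies the low starting guess in place of __gmpfr_sqrt_limb_approx, and the names sqrt_result and isqrt_refine are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
  uint32_t root;  /* floor(sqrt(n)) */
  int rb;         /* round bit: 1 iff sqrt(n) > root + 1/2 */
  int sb;         /* sticky bit: 1 iff n is not a perfect square */
} sqrt_result;

/* Refine a low guess r0 (with r0^2 <= n) into floor(sqrt(n)). */
static sqrt_result
isqrt_refine (uint64_t n, uint32_t r0)
{
  sqrt_result res;
  uint64_t r = r0;
  uint64_t rem;

  assert (r * r <= n);
  rem = n - r * r;            /* invariant: n = r^2 + rem */
  while (rem > 2 * r)         /* root still too small */
    {
      rem -= 2 * r + 1;       /* (r+1)^2 = r^2 + 2r + 1 */
      r++;
    }
  res.root = (uint32_t) r;
  /* n > (r + 1/2)^2 = r^2 + r + 1/4 is equivalent to rem > r, and the
     exact middle cannot occur since (r + 1/2)^2 is not an integer. */
  res.rb = rem > r;
  res.sb = rem != 0;
  return res;
}
```

As in the patch, sb = 0 only for an exact square root, so in round-to-nearest the round bit alone decides between truncating and adding one ulp.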
diff --git a/src/sub1.c b/src/sub1.c
index c38e2f979..f67fcd9f0 100644
--- a/src/sub1.c
+++ b/src/sub1.c
@@ -34,7 +34,7 @@ int
 mpfr_sub1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 {
   int sign;
-  mpfr_exp_t diff_exp, exp_b;
+  mpfr_exp_t diff_exp, exp_a, exp_b;
   mpfr_prec_t cancel, cancel1;
   mp_size_t cancel2, an, bn, cn, cn0;
   mp_limb_t *ap, *bp, *cp;
@@ -652,15 +652,19 @@ mpfr_sub1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
   /* we have to set MPFR_EXP(a) to MPFR_EXP(b) - cancel + add_exp, taking
      care of underflows/overflows in that computation, and of the allowed
      exponent range */
+  MPFR_TMP_FREE (marker);
   if (MPFR_LIKELY(cancel))
     {
-      mpfr_exp_t exp_a;
-
       cancel -= add_exp; /* OK: add_exp is an int equal to 0 or 1 */
       exp_a = exp_b - cancel;
+      /* The following assertion corresponds to a limitation of the MPFR
+         implementation. It may fail with a 32-bit ABI and huge precisions,
+         but this is practically impossible with a 64-bit ABI. This kind
+         of issue is not specific to this function. */
+      MPFR_ASSERTN (exp_b != MPFR_EXP_MAX || exp_a > __gmpfr_emax);
       if (MPFR_UNLIKELY (exp_a < __gmpfr_emin))
         {
-          MPFR_TMP_FREE (marker);
+        underflow:
           if (rnd_mode == MPFR_RNDN &&
               (exp_a < __gmpfr_emin - 1 ||
                (inexact >= 0 && mpfr_powerof2_raw (a))))
@@ -669,25 +673,28 @@ mpfr_sub1 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
         }
       if (MPFR_UNLIKELY (exp_a > __gmpfr_emax))
         {
-          MPFR_TMP_FREE (marker);
           return mpfr_overflow (a, rnd_mode, MPFR_SIGN (a));
         }
-      MPFR_SET_EXP (a, exp_a);
     }
   else /* cancel = 0: MPFR_EXP(a) <- MPFR_EXP(b) + add_exp */
     {
+
       /* in case cancel = 0, add_exp can still be 1, in case b is just
          below a power of two, c is very small, prec(a) < prec(b),
          and rnd=away or nearest */
       MPFR_ASSERTD (add_exp == 0 || add_exp == 1);
-      if (MPFR_UNLIKELY (add_exp && exp_b >= __gmpfr_emax))
+      /* Overflow iff exp_b + add_exp > __gmpfr_emax in Z, but we do
+         a subtraction below to avoid a potential integer overflow in
+         the case exp_b == MPFR_EXP_MAX. */
+      if (MPFR_UNLIKELY (exp_b > __gmpfr_emax - add_exp))
         {
-          MPFR_TMP_FREE (marker);
           return mpfr_overflow (a, rnd_mode, MPFR_SIGN (a));
         }
-      MPFR_SET_EXP (a, exp_b + add_exp);
+      exp_a = exp_b + add_exp;
+      if (MPFR_UNLIKELY (exp_a < __gmpfr_emin))
+        goto underflow;
     }
-  MPFR_TMP_FREE(marker);
+  MPFR_SET_EXP (a, exp_a);
   /* check that result is msb-normalized */
   MPFR_ASSERTD(ap[an-1] > ~ap[an-1]);
   MPFR_RET (inexact * MPFR_INT_SIGN (a));
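
Reviewer note on the new overflow test in mpfr_sub1: comparing exp_b against __gmpfr_emax - add_exp avoids forming exp_b + add_exp, which could overflow when exp_b is the largest representable value. A minimal sketch of the idea, assuming long as a stand-in for mpfr_exp_t and a hypothetical helper name exp_exceeds_max:

```c
#include <assert.h>
#include <limits.h>

/* Return nonzero iff e + add > emax holds mathematically, for add in
   {0, 1}, without ever computing e + add.  Assumes emax > LONG_MIN so
   that emax - add cannot overflow either. */
static int
exp_exceeds_max (long e, int add, long emax)
{
  assert (add == 0 || add == 1);
  assert (emax > LONG_MIN);
  return e > emax - add;
}
```

With e = LONG_MAX and add = 1 the naive form e + add would be undefined behavior in C, while the rewritten comparison stays well defined.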
diff --git a/src/sub1sp.c b/src/sub1sp.c
index 511469a41..956a52884 100644
--- a/src/sub1sp.c
+++ b/src/sub1sp.c
@@ -92,57 +92,7 @@ int mpfr_sub1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 # define DEBUG(x) /**/
 #endif
 
-/* Rounding Sub */
-
-/*
-   compute sgn(b)*(|b| - |c|) if |b|>|c| else -sgn(b)*(|c| -|b|)
-   Returns 0 iff result is exact,
-   a negative value when the result is less than the exact value,
-   a positive value otherwise.
-*/
-
-/* A0...Ap-1
- *          Cp Cp+1 ....
- *             <- C'p+1 ->
- * Cp = -1 if calculated from c mantissa
- * Cp = 0  if 0 from a or c
- * Cp = 1  if calculated from a.
- * C'p+1 = First bit not null or 0 if there isn't one
- *
- * Can't have Cp=-1 and C'p+1=1*/
-
-/* RND = MPFR_RNDZ:
- *  + if Cp=0 and C'p+1=0,1,  Truncate.
- *  + if Cp=0 and C'p+1=-1,   SubOneUlp
- *  + if Cp=-1,               SubOneUlp
- *  + if Cp=1,                AddOneUlp
- * RND = MPFR_RNDA (Away)
- *  + if Cp=0 and C'p+1=0,-1, Truncate
- *  + if Cp=0 and C'p+1=1,    AddOneUlp
- *  + if Cp=1,                AddOneUlp
- *  + if Cp=-1,               Truncate
- * RND = MPFR_RNDN
- *  + if Cp=0,                Truncate
- *  + if Cp=1 and C'p+1=1,    AddOneUlp
- *  + if Cp=1 and C'p+1=-1,   Truncate
- *  + if Cp=1 and C'p+1=0,    Truncate if Ap-1=0, AddOneUlp else
- *  + if Cp=-1 and C'p+1=-1,  SubOneUlp
- *  + if Cp=-1 and C'p+1=0,   Truncate if Ap-1=0, SubOneUlp else
- *
- * If AddOneUlp:
- *   If carry, then it is 11111111111 + 1 = 10000000000000
- *      ap[n-1]=MPFR_HIGHT_BIT
- * If SubOneUlp:
- *   If we lose one bit, it is 1000000000 - 1 = 0111111111111
- *      Then shift, and put as last bit x which is calculated
- *              according Cp, Cp-1 and rnd_mode.
- * If Truncate,
- *    If it is a power of 2,
- *       we may have to suboneulp in some special cases.
- *
- * To simplify, we don't use Cp = 1.
- *
- */
+#if !defined(MPFR_GENERIC_ABI)
 
 /* special code for p < GMP_NUMB_BITS */
 static int
@@ -1095,6 +1045,60 @@ mpfr_sub1sp3 (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode,
     }
 }
 
+#endif /* !defined(MPFR_GENERIC_ABI) */
+
+/* Rounding Sub */
+
+/*
+   compute sgn(b)*(|b| - |c|) if |b|>|c| else -sgn(b)*(|c| -|b|)
+   Returns 0 iff result is exact,
+   a negative value when the result is less than the exact value,
+   a positive value otherwise.
+*/
+
+/* A0...Ap-1
+ *          Cp Cp+1 ....
+ *             <- C'p+1 ->
+ * Cp = -1 if calculated from c mantissa
+ * Cp = 0  if 0 from a or c
+ * Cp = 1  if calculated from a.
+ * C'p+1 = First bit not null or 0 if there isn't one
+ *
+ * Can't have Cp=-1 and C'p+1=1*/
+
+/* RND = MPFR_RNDZ:
+ *  + if Cp=0 and C'p+1=0,1,  Truncate.
+ *  + if Cp=0 and C'p+1=-1,   SubOneUlp
+ *  + if Cp=-1,               SubOneUlp
+ *  + if Cp=1,                AddOneUlp
+ * RND = MPFR_RNDA (Away)
+ *  + if Cp=0 and C'p+1=0,-1, Truncate
+ *  + if Cp=0 and C'p+1=1,    AddOneUlp
+ *  + if Cp=1,                AddOneUlp
+ *  + if Cp=-1,               Truncate
+ * RND = MPFR_RNDN
+ *  + if Cp=0,                Truncate
+ *  + if Cp=1 and C'p+1=1,    AddOneUlp
+ *  + if Cp=1 and C'p+1=-1,   Truncate
+ *  + if Cp=1 and C'p+1=0,    Truncate if Ap-1=0, AddOneUlp else
+ *  + if Cp=-1 and C'p+1=-1,  SubOneUlp
+ *  + if Cp=-1 and C'p+1=0,   Truncate if Ap-1=0, SubOneUlp else
+ *
+ * If AddOneUlp:
+ *   If carry, then it is 11111111111 + 1 = 10000000000000
+ *      ap[n-1]=MPFR_HIGHT_BIT
+ * If SubOneUlp:
+ *   If we lose one bit, it is 1000000000 - 1 = 0111111111111
+ *      Then shift, and put as last bit x which is calculated
+ *              according Cp, Cp-1 and rnd_mode.
+ * If Truncate,
+ *    If it is a power of 2,
+ *       we may have to suboneulp in some special cases.
+ *
+ * To simplify, we don't use Cp = 1.
+ *
+ */
+
 MPFR_HOT_FUNCTION_ATTR int
 mpfr_sub1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
 {
@@ -1119,6 +1123,7 @@ mpfr_sub1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
   /* Read prec and num of limbs */
   p = MPFR_GET_PREC (b);
 
+#if !defined(MPFR_GENERIC_ABI)
   /* special case for p < GMP_NUMB_BITS */
   if (p < GMP_NUMB_BITS)
     return mpfr_sub1sp1 (a, b, c, rnd_mode, p);
@@ -1135,6 +1140,7 @@ mpfr_sub1sp (mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_rnd_t rnd_mode)
   /* special case for 2*GMP_NUMB_BITS < p < 3*GMP_NUMB_BITS */
   if (2 * GMP_NUMB_BITS < p && p < 3 * GMP_NUMB_BITS)
     return mpfr_sub1sp3 (a, b, c, rnd_mode, p);
+#endif
 
   n = MPFR_PREC2LIMBS (p);
   /* Fast cmp of |b| and |c| */
diff --git a/src/sum.c b/src/sum.c
index 9fd9d693d..a77c116ab 100644
--- a/src/sum.c
+++ b/src/sum.c
@@ -23,6 +23,25 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 #define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
+/* Note: In the prototypes, one uses
+ *
+ *   const mpfr_ptr *x      i.e.:  __mpfr_struct *const *x
+ *
+ * instead of
+ *
+ *   const mpfr_srcptr *x   i.e.:  const __mpfr_struct *const *x
+ *
+ * because here one has a double indirection and the type matching rules
+ * from the C standard in such a case are stricter and they would yield
+ * annoying errors for the user in practice. See:
+ *
+ *   Why can't I pass a char ** to a function which expects a const char **?
+ *
+ * in the comp.lang.c FAQ:
+ *
+ *   http://c-faq.com/ansi/constmismatch.html
+ */
+
 /* See the doc/sum.txt file for the algorithm and a part of its proof
 (this will later go into algorithms.tex).
 
@@ -127,7 +146,7 @@ int __gmpfr_cov_sum_tmd[MPFR_RND_MAX][2][2][3][2][2] = { 0 };
  *   iteration (= maxexp2 of the last iteration).
  */
 static mpfr_prec_t
-sum_raw (mp_limb_t *wp, mp_size_t ws, mpfr_prec_t wq, mpfr_ptr *const x,
+sum_raw (mp_limb_t *wp, mp_size_t ws, mpfr_prec_t wq, const mpfr_ptr *x,
          unsigned long n, mpfr_exp_t minexp, mpfr_exp_t maxexp,
          mp_limb_t *tp, mp_size_t ts, int logn, mpfr_prec_t prec,
          mpfr_exp_t *ep, mpfr_exp_t *minexpp, mpfr_exp_t *maxexpp)
@@ -465,6 +484,7 @@ sum_raw (mp_limb_t *wp, mp_size_t ws, mpfr_prec_t wq, mpfr_ptr *const x,
 
                 MPFR_ASSERTD (diffexp < cancel - 2);
                 shiftq = cancel - 2 - (mpfr_prec_t) diffexp;
+                /* equivalent to: minexp + wq - 2 - max(e,err) */
                 MPFR_ASSERTD (shiftq > 0);
                 shifts = shiftq / GMP_NUMB_BITS;
                 shiftc = shiftq % GMP_NUMB_BITS;
@@ -503,7 +523,7 @@ sum_raw (mp_limb_t *wp, mp_size_t ws, mpfr_prec_t wq, mpfr_ptr *const x,
 /* Generic case: all the inputs are finite numbers,
    with at least 3 regular numbers. */
 static int
-sum_aux (mpfr_ptr sum, mpfr_ptr *const x, unsigned long n, mpfr_rnd_t rnd,
+sum_aux (mpfr_ptr sum, const mpfr_ptr *x, unsigned long n, mpfr_rnd_t rnd,
          mpfr_exp_t maxexp, unsigned long rn)
 {
   mp_limb_t *sump;
@@ -1236,7 +1256,7 @@ sum_aux (mpfr_ptr sum, mpfr_ptr *const x, unsigned long n, mpfr_rnd_t rnd,
 /**********************************************************************/
 
 int
-mpfr_sum (mpfr_ptr sum, mpfr_ptr *const x, unsigned long n, mpfr_rnd_t rnd)
+mpfr_sum (mpfr_ptr sum, const mpfr_ptr *x, unsigned long n, mpfr_rnd_t rnd)
 {
   MPFR_LOG_FUNC
     (("n=%lu rnd=%d", n, rnd),
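
Reviewer note on the prototype change in sum.c: with "const mpfr_ptr *x" only the array slots are const-qualified, so a caller holding a plain array of pointers can pass it without a cast. The sketch below mimics that shape with hypothetical types (num_t and num_ptr stand in for __mpfr_struct and mpfr_ptr); it is an illustration, not the MPFR API.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { double v; } num_t;   /* stand-in for __mpfr_struct */
typedef num_t *num_ptr;               /* plays the role of mpfr_ptr */

/* Same shape as the new prototypes: "const num_ptr *x" means
   "num_t *const *x", so only the array slots are read-only. */
static double
sum_values (const num_ptr *x, unsigned long n)
{
  double s = 0.0;
  unsigned long i;

  assert (x != NULL || n == 0);
  for (i = 0; i < n; i++)
    s += x[i]->v;
  return s;
}

/* A caller with a plain array of num_ptr needs no cast. */
static double
demo_sum (void)
{
  num_t a = { 1.5 }, b = { 2.5 };
  num_ptr tab[2];

  tab[0] = &a;
  tab[1] = &b;
  return sum_values (tab, 2);
}
```

Had the parameter been declared as a pointer to const pointers to const objects (the mpfr_srcptr variant), the same call would need an explicit cast in C, which is the annoyance the comment in the patch refers to.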
diff --git a/src/ubf.c b/src/ubf.c
index 78eb90e37..2c01707c0 100644
--- a/src/ubf.c
+++ b/src/ubf.c
@@ -20,6 +20,7 @@ along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
 http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
 
+#define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
 /* Note: In MPFR math functions, even if UBF code is not called first,
@@ -32,7 +33,7 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 
 /* This function does not change the flags. */
 static void
-mpfr_get_zexp (mpz_ptr ez, mpfr_srcptr x)
+mpfr_init_get_zexp (mpz_ptr ez, mpfr_srcptr x)
 {
   mpz_init (ez);
 
@@ -128,17 +129,32 @@ mpfr_ubf_mul_exact (mpfr_ubf_ptr a, mpfr_srcptr b, mpfr_srcptr c)
 
       ap = MPFR_MANT (a);
 
-      u = (bn >= cn) ?
-        mpn_mul (ap, MPFR_MANT (b), bn, MPFR_MANT (c), cn) :
-        mpn_mul (ap, MPFR_MANT (c), cn, MPFR_MANT (b), bn);
-      if (MPFR_UNLIKELY (MPFR_LIMB_MSB (u) == 0))
+      if (bn == 1 && cn == 1)
         {
-          m = 1;
-          MPFR_DBGRES (v = mpn_lshift (ap, ap, bn + cn, 1));
-          MPFR_ASSERTD (v == 0);
+          umul_ppmm (ap[1], ap[0], MPFR_MANT(b)[0], MPFR_MANT(c)[0]);
+          if (ap[1] & MPFR_LIMB_HIGHBIT)
+            m = 0;
+          else
+            {
+              ap[1] = (ap[1] << 1) | (ap[0] >> (GMP_NUMB_BITS - 1));
+              ap[0] = ap[0] << 1;
+              m = 1;
+            }
         }
       else
-        m = 0;
+        {
+          u = (bn >= cn) ?
+            mpn_mul (ap, MPFR_MANT (b), bn, MPFR_MANT (c), cn) :
+            mpn_mul (ap, MPFR_MANT (c), cn, MPFR_MANT (b), bn);
+          if (MPFR_LIMB_MSB (u) == 0)
+            {
+              m = 1;
+              MPFR_DBGRES (v = mpn_lshift (ap, ap, bn + cn, 1));
+              MPFR_ASSERTD (v == 0);
+            }
+          else
+            m = 0;
+        }
 
       if (! MPFR_IS_UBF (b) && ! MPFR_IS_UBF (c) &&
           (e = MPFR_GET_EXP (b) + MPFR_GET_EXP (c) - m,
@@ -154,8 +170,8 @@ mpfr_ubf_mul_exact (mpfr_ubf_ptr a, mpfr_srcptr b, mpfr_srcptr c)
 
           /* This may involve copies of mpz_t, but exponents should not be
              very large integers anyway. */
-          mpfr_get_zexp (be, b);
-          mpfr_get_zexp (ce, c);
+          mpfr_init_get_zexp (be, b);
+          mpfr_init_get_zexp (ce, c);
           mpz_add (MPFR_ZEXP (a), be, ce);
           mpz_clear (be);
           mpz_clear (ce);
@@ -171,8 +187,8 @@ mpfr_ubf_exp_less_p (mpfr_srcptr x, mpfr_srcptr y)
   mpz_t xe, ye;
   int c;
 
-  mpfr_get_zexp (xe, x);
-  mpfr_get_zexp (ye, y);
+  mpfr_init_get_zexp (xe, x);
+  mpfr_init_get_zexp (ye, y);
   c = mpz_cmp (xe, ye) < 0;
   mpz_clear (xe);
   mpz_clear (ye);
@@ -216,8 +232,8 @@ mpfr_ubf_diff_exp (mpfr_srcptr x, mpfr_srcptr y)
   mpz_t xe, ye;
   mpfr_exp_t e;
 
-  mpfr_get_zexp (xe, x);
-  mpfr_get_zexp (ye, y);
+  mpfr_init_get_zexp (xe, x);
+  mpfr_init_get_zexp (ye, y);
   mpz_sub (xe, xe, ye);
   mpz_clear (ye);
   e = mpfr_ubf_zexp2exp (xe);
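
Reviewer note on the new bn == 1 && cn == 1 branch of mpfr_ubf_mul_exact: the product of two normalized limbs has its top bit in one of the two highest positions, so at most one left shift renormalizes it, and m records whether the shift happened. A minimal sketch with 32-bit limbs, using a 64-bit product in place of umul_ppmm; prod_t and mul_1limb_normalized are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

typedef struct
{
  uint32_t hi, lo;  /* normalized two-limb product */
  int m;            /* 1 if a one-bit left shift was needed, else 0 */
} prod_t;

/* Multiply two normalized 32-bit limbs (most significant bit set). */
static prod_t
mul_1limb_normalized (uint32_t b, uint32_t c)
{
  prod_t res;
  uint64_t p;

  assert ((b >> 31) == 1 && (c >> 31) == 1);
  p = (uint64_t) b * c;      /* p >= 2^62, so bit 63 or bit 62 is set */
  res.m = 0;
  if ((p >> 63) == 0)
    {
      p <<= 1;               /* bit 62 was the top bit: shift it up */
      res.m = 1;
    }
  res.hi = (uint32_t) (p >> 32);
  res.lo = (uint32_t) p;
  return res;
}
```

The exponent of the result is then the sum of the input exponents minus m, as in the line computing e in the patch.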
diff --git a/src/vasprintf.c b/src/vasprintf.c
index e654c6647..0886567f5 100644
--- a/src/vasprintf.c
+++ b/src/vasprintf.c
@@ -21,6 +21,24 @@ along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
 http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
 
+/* If the number of output characters is larger than INT_MAX, the
+   ISO C99 standard is silent, but POSIX says concerning the snprintf()
+   function:
+   "[EOVERFLOW] The value of n is greater than {INT_MAX} or the
+   number of bytes needed to hold the output excluding the
+   terminating null is greater than {INT_MAX}." See:
+   http://www.opengroup.org/onlinepubs/009695399/functions/fprintf.html
+   But it doesn't say anything concerning the other printf-like functions.
+   A defect report has been submitted to austin-review-l (item 2532).
+   So, for the time being, we return a negative value and set the erange
+   flag, and set errno to EOVERFLOW on POSIX systems. */
+
+/* Note: Due to limitations from the C standard and GMP, if
+   size_t < unsigned int (which is allowed by the C standard but unlikely
+   to occur on any platform), the behavior is undefined for output that
+   would reach SIZE_MAX (if the result cannot be delivered, there should
+   be an assertion failure, but this could not be tested). */
+
 #ifdef HAVE_CONFIG_H
 # include "config.h"
 #endif
@@ -158,7 +176,7 @@ struct printf_spec
 
   int width;                    /* Width */
   int prec;                     /* Precision */
-  int size;                     /* Wanted size (0 iff snprintf with size=0) */
+  size_t size;                  /* Wanted size (0 iff snprintf with size=0) */
 
   enum arg_t arg_type;          /* Type of argument */
   mpfr_rnd_t rnd_mode;          /* Rounding mode */
@@ -476,7 +494,10 @@ typedef wint_t mpfr_va_wint;
   } while (0)
 
 /* process the format part which does not deal with mpfr types,
-   jump to external label 'error' if gmp_asprintf return -1. */
+   jump to external label 'error' if gmp_asprintf returns -1.
+   Note: start and end are pointers to the format string, so that
+   size_t is the best type to express the difference.
+ */
 #define FLUSH(flag, start, end, ap, buf_ptr)                            \
   do {                                                                  \
     const size_t n = (end) - (start);                                   \
@@ -507,15 +528,43 @@ struct string_buffer
   char *start;                  /* beginning of the buffer */
   char *curr;                   /* null terminating character */
   size_t size;                  /* buffer capacity */
+  int len;                      /* string length or -1 if overflow */
 };
 
 static void
 buffer_init (struct string_buffer *b, size_t s)
 {
-  b->start = (char *) (*__gmp_allocate_func) (s);
-  b->start[0] = '\0';
-  b->curr = b->start;
+  if (s != 0)
+    {
+      b->start = (char *) (*__gmp_allocate_func) (s);
+      b->start[0] = '\0';
+      b->curr = b->start;
+    }
   b->size = s;
+  b->len = 0;
+}
+
+/* Increase the len field of the buffer. Return non-zero iff overflow. */
+static int
+buffer_incr_len (struct string_buffer *b, size_t len)
+{
+  if (b->len == -1)
+    return 1;
+  else
+    {
+      size_t newlen = (size_t) b->len + len;
+
+      /* size_t is unsigned, thus the above is valid, but one has
+         newlen < len in case of overflow. */
+
+      if (MPFR_UNLIKELY (newlen < len || newlen > INT_MAX))
+        return 1;
+      else
+        {
+          b->len = newlen;
+          return 0;
+        }
+    }
 }
 
 /* Increase buffer size by a number of character being the least multiple of
@@ -525,9 +574,14 @@ buffer_widen (struct string_buffer *b, size_t len)
 {
   const size_t pos = b->curr - b->start;
   const size_t n = 0x1000 + (len & ~((size_t) 0xfff));
+
+  /* An overflow is not possible since it would have been detected
+     in buffer_incr_len, called first (see buffer_* functions). */
+  MPFR_ASSERTD (n >= 0x1000 && n >= len);
+
+  MPFR_ASSERTD (*b->curr == '\0');
   MPFR_ASSERTD (pos < b->size);
 
-  MPFR_ASSERTN ((len & ~((size_t) 4095)) <= (size_t)(SIZE_MAX - 4096));
   MPFR_ASSERTN (b->size < SIZE_MAX - n);
 
   b->start =
@@ -540,106 +594,145 @@ buffer_widen (struct string_buffer *b, size_t len)
 }
 
 /* Concatenate the LEN first characters of the string S to the buffer B and
-   expand it if needed. */
-static void
+   expand it if needed. Return non-zero if overflow. */
+static int
 buffer_cat (struct string_buffer *b, const char *s, size_t len)
 {
-  MPFR_ASSERTD (len != 0);
+  MPFR_ASSERTD (len > 0);
   MPFR_ASSERTD (len <= strlen (s));
 
-  if (MPFR_UNLIKELY ((b->curr + len) >= (b->start + b->size)))
-    buffer_widen (b, len);
+  if (buffer_incr_len (b, len))
+    return 1;
 
-  strncat (b->curr, s, len);
-  b->curr += len;
+  if (b->size != 0)
+    {
+      MPFR_ASSERTD (*b->curr == '\0');
+      MPFR_ASSERTN (b->size < SIZE_MAX - len);
+      if (MPFR_UNLIKELY (b->curr + len >= b->start + b->size))
+        buffer_widen (b, len);
+
+      /* strncat is similar to strncpy here, except that strncat ensures
+         that the buffer will be null-terminated. */
+      strncat (b->curr, s, len);
+      b->curr += len;
+
+      MPFR_ASSERTD (b->curr < b->start + b->size);
+      MPFR_ASSERTD (*b->curr == '\0');
+    }
 
-  MPFR_ASSERTD (b->curr < b->start + b->size);
-  MPFR_ASSERTD (*b->curr == '\0');
+  return 0;
 }
 
-/* Add N characters C to the end of buffer B */
-static void
+/* Add N characters C to the end of buffer B. Return non-zero if overflow. */
+static int
 buffer_pad (struct string_buffer *b, const char c, const size_t n)
 {
-  MPFR_ASSERTD (n != 0);
+  MPFR_ASSERTD (n > 0);
 
-  MPFR_ASSERTN (b->size < SIZE_MAX - n - 1);
-  if (MPFR_UNLIKELY ((b->curr + n + 1) > (b->start + b->size)))
-    buffer_widen (b, n);
+  if (buffer_incr_len (b, n))
+    return 1;
 
-  if (n == 1)
-    *b->curr = c;
-  else
-    memset (b->curr, c, n);
-  b->curr += n;
-  *b->curr = '\0';
+  if (b->size != 0)
+    {
+      MPFR_ASSERTD (*b->curr == '\0');
+      MPFR_ASSERTN (b->size < SIZE_MAX - n);
+      if (MPFR_UNLIKELY (b->curr + n >= b->start + b->size))
+        buffer_widen (b, n);
 
-  MPFR_ASSERTD (b->curr < b->start + b->size);
+      if (n == 1)
+        *b->curr = c;
+      else
+        memset (b->curr, c, n);
+      b->curr += n;
+      *b->curr = '\0';
+
+      MPFR_ASSERTD (b->curr < b->start + b->size);
+    }
+
+  return 0;
 }
 
 /* Form a string by concatenating the first LEN characters of STR to TZ
    zero(s), insert into one character C each 3 characters starting from end
    to beginning and concatenate the result to the buffer B. */
-static void
+static int
 buffer_sandwich (struct string_buffer *b, char *str, size_t len,
                  const size_t tz, const char c)
 {
-  const size_t step = 3;
-  const size_t size = len + tz;
-  const size_t r = size % step == 0 ? step : size % step;
-  const size_t q = size % step == 0 ? size / step - 1 : size / step;
-  size_t i;
+  MPFR_ASSERTD (len <= strlen (str));
 
-  MPFR_ASSERTD (size != 0);
   if (c == '\0')
-    {
-      buffer_cat (b, str, len);
+    return
+      buffer_cat (b, str, len) ||
       buffer_pad (b, '0', tz);
-      return;
-    }
+  else
+    {
+      const size_t step = 3;
+      const size_t size = len + tz;
+      const size_t r = size % step == 0 ? step : size % step;
+      const size_t q = size % step == 0 ? size / step - 1 : size / step;
+      const size_t fullsize = size + q;
+      size_t i;
 
-  MPFR_ASSERTN (b->size < SIZE_MAX - size - 1 - q);
-  MPFR_ASSERTD (len <= strlen (str));
-  if (MPFR_UNLIKELY ((b->curr + size + 1 + q) > (b->start + b->size)))
-    buffer_widen (b, size + q);
+      MPFR_ASSERTD (size > 0);
 
-  /* first R significant digits */
-  memcpy (b->curr, str, r);
-  b->curr += r;
-  str += r;
-  len -= r;
+      if (buffer_incr_len (b, fullsize))
+        return 1;
 
-  /* blocks of thousands. Warning: STR might end in the middle of a block */
-  for (i = 0; i < q; ++i)
-    {
-      *b->curr++ = c;
-      if (MPFR_LIKELY (len > 0))
+      if (b->size != 0)
         {
-          if (MPFR_LIKELY (len >= step))
-            /* step significant digits */
-            {
-              memcpy (b->curr, str, step);
-              len -= step;
-            }
-          else
-            /* last digits in STR, fill up thousand block with zeros */
+          char *oldcurr;
+
+          MPFR_ASSERTD (*b->curr == '\0');
+          MPFR_ASSERTN (b->size < SIZE_MAX - fullsize);
+          if (MPFR_UNLIKELY (b->curr + fullsize >= b->start + b->size))
+            buffer_widen (b, fullsize);
+
+          MPFR_DBGRES (oldcurr = b->curr);
+
+          /* first R significant digits */
+          memcpy (b->curr, str, r);
+          b->curr += r;
+          str += r;
+          len -= r;
+
+          /* blocks of thousands. Warning: STR might end in the middle of a block */
+          for (i = 0; i < q; ++i)
             {
-              memcpy (b->curr, str, len);
-              memset (b->curr + len, '0', step - len);
-              len = 0;
+              *b->curr++ = c;
+              if (MPFR_LIKELY (len > 0))
+                {
+                  if (MPFR_LIKELY (len >= step))
+                    /* step significant digits */
+                    {
+                      memcpy (b->curr, str, step);
+                      len -= step;
+                    }
+                  else
+                    /* last digits in STR, fill up thousand block with zeros */
+                    {
+                      memcpy (b->curr, str, len);
+                      memset (b->curr + len, '0', step - len);
+                      len = 0;
+                    }
+                }
+              else
+                /* trailing zeros */
+                memset (b->curr, '0', step);
+
+              b->curr += step;
+              str += step;
             }
-        }
-      else
-        /* trailing zeros */
-        memset (b->curr, '0', step);
 
-      b->curr += step;
-      str += step;
-    }
+          MPFR_ASSERTD (b->curr - oldcurr == fullsize);
 
-  *b->curr = '\0';
+          *b->curr = '\0';
 
-  MPFR_ASSERTD (b->curr < b->start + b->size);
+          MPFR_ASSERTD (b->curr < b->start + b->size);
+        }
+
+      return 0;
+    }
 }
 
 /* let gmp_xprintf process the part it can understand */
@@ -650,8 +743,8 @@ sprntf_gmp (struct string_buffer *b, const char *fmt, va_list ap)
   char *s;
 
   length = gmp_vasprintf (&s, fmt, ap);
-  if (length > 0)
-    buffer_cat (b, s, length);
+  if (length > 0 && buffer_cat (b, s, length))
+    length = -1;  /* overflow in buffer_cat */
 
   mpfr_free_str (s);
   return length;
@@ -909,7 +1002,8 @@ regular_ab (struct number_parts *np, mpfr_srcptr p,
          - if no given precision, let mpfr_get_str determine it;
          - if a non-zero precision is specified, then one digit before decimal
          point plus SPEC.PREC after it. */
-      nsd = spec.prec < 0 ? 0 : spec.prec + np->ip_size;
+      MPFR_ASSERTD (np->ip_size == 1); /* thus no integer overflow below */
+      nsd = spec.prec < 0 ? 0 : (size_t) spec.prec + np->ip_size;
       str = mpfr_get_str_aux (&exp, base, nsd, p, spec);
       register_string (np->sl, str);
       np->ip_ptr = MPFR_IS_NEG (p) ? ++str : str;  /* skip sign if any */
@@ -1114,7 +1208,8 @@ regular_eg (struct number_parts *np, mpfr_srcptr p,
          plus SPEC.PREC after it.
          We use the fact here that mpfr_get_str allows us to ask for only one
          significant digit when the base is not a power of 2. */
-      nsd = (spec.prec < 0) ? 0 : spec.prec + np->ip_size;
+      MPFR_ASSERTD (np->ip_size == 1); /* thus no integer overflow below */
+      nsd = spec.prec < 0 ? 0 : (size_t) spec.prec + np->ip_size;
       str = mpfr_get_str_aux (&exp, 10, nsd, p, spec);
       register_string (np->sl, str);
     }
@@ -1329,15 +1424,25 @@ regular_fg (struct number_parts *np, mpfr_srcptr p,
                   np->fp_leading_zeros = spec.prec;
                 }
             }
-          else
+          else  /* exp >= -spec.prec */
             /* the most significant digits are the last
                spec.prec + exp + 1 digits in fractional part */
             {
               char *ptr;
               size_t str_len;
+
+              MPFR_ASSERTD (exp >= -spec.prec);
               if (dec_info == NULL)
                 {
-                  size_t nsd = spec.prec + exp + 1;
+                  size_t nsd;
+
+                  /* Consequences of earlier assertions (in r11307).
+                     They guarantee that the integers are representable
+                     (i.e., no integer overflow), assuming size_t >= int
+                     as usual. */
+                  MPFR_ASSERTD (exp <= -1);
+                  MPFR_ASSERTD (spec.prec + (exp + 1) >= 0);
+                  nsd = spec.prec + (exp + 1);
                   /* WARNING: nsd may equal 1, but here we use the
                      fact that mpfr_get_str can return one digit with
                      base ten (undocumented feature, see comments in
@@ -1656,13 +1761,37 @@ partition_number (struct number_parts *np, mpfr_srcptr p,
              where T is the threshold computed below and X is the exponent
              that would be displayed with style 'e' and precision T-1. */
           int threshold;
-          mpfr_exp_t x;
+          mpfr_exp_t x, e, k;
           struct decimal_info dec_info;
 
           threshold = (spec.prec < 0) ? 6 : (spec.prec == 0) ? 1 : spec.prec;
-          /* here we cannot call mpfr_get_str_aux since we need the full
-             significand in dec_info.str */
-          dec_info.str = mpfr_get_str (NULL, &dec_info.exp, 10, threshold,
+
+          /* Here we cannot call mpfr_get_str_aux since we need the full
+             significand in dec_info.str.
+             Moreover, threshold may be huge while one can know that the
+             number of digits that are not trailing zeros remains limited;
+             such a limit occurs in practical cases, e.g. with numbers
+             representable in the IEEE 754-2008 basic formats. Since the
+             trailing zeros are not necessarily output, we do not want to
+             waste time and memory by making mpfr_get_str generate them.
+             So, let us try to find a smaller threshold for mpfr_get_str.
+             |p| < 2^EXP(p) = 10^(EXP(p)*log10(2)). So, the integer part
+             takes at most ceil(EXP(p)*log10(2)) digits (unless p rounds
+             to the next power of 10, but in this case any threshold will
+             be OK). Since log10(2) < 1/3, for the integer part we take
+             the upper bound max(0,floor((EXP(p)+2)/3)).
+             Let k = PREC(p) - EXP(p), so that the last bit of p has
+             weight 2^(-k). If k <= 0, then p is an integer, otherwise
+             the fractional part in base 10 may have up to k digits
+             (this bound is reached if the last bit is 1).
+             Note: The bound could be improved, but this is not critical. */
+          e = MPFR_GET_EXP (p);
+          k = MPFR_PREC (p) - e;
+          e = e <= 0 ? k : (e + 2) / 3 + (k <= 0 ? 0 : k);
+          MPFR_ASSERTD (e >= 1);
+
+          dec_info.str = mpfr_get_str (NULL, &dec_info.exp, 10,
+                                       e < threshold ? e : threshold,
                                        p, spec.rnd_mode);
           register_string (np->sl, dec_info.str);
           /* mpfr_get_str corresponds to a significand between 0.1 and 1,
@@ -1756,9 +1885,11 @@ sprnt_fp (struct string_buffer *buf, mpfr_srcptr p,
   if (length < 0)
     return -1;
 
-  if (spec.size == 0) /* no need to fill the buffer */
+  if (spec.size == 0)
     {
-      buf->curr += length;
+      /* No need to fill the buffer: length is known, so just count it
+         (this is equivalent to what the code below would do). */
+      buffer_incr_len (buf, length);
       goto clear_and_exit;
     }
 
@@ -1818,7 +1949,7 @@ sprnt_fp (struct string_buffer *buf, mpfr_srcptr p,
 
  clear_and_exit:
   clear_string_list (np.sl);
-  return length;
+  return buf->len == -1 ? -1 : length;
 }
 
 /* the following internal function implements both mpfr_vasprintf and
@@ -1847,8 +1978,7 @@ mpfr_vasnprintf_aux (char **ptr, char *Buf, size_t size, const char *fmt,
   MPFR_SAVE_EXPO_DECL (expo);
   MPFR_SAVE_EXPO_MARK (expo);
 
-  nbchar = 0;
-  buffer_init (&buf, 4096);
+  buffer_init (&buf, ptr != NULL || size != 0 ? 4096 : 0);
   xgmp_fmt_flag = 0;
   va_copy (ap2, ap);
   start = fmt;
@@ -2044,6 +2174,8 @@ mpfr_vasnprintf_aux (char **ptr, char *Buf, size_t size, const char *fmt,
           char format[MPFR_PREC_FORMAT_SIZE + 6]; /* see examples below */
           size_t length;
           mpfr_prec_t prec;
+          int err;
+
           prec = va_arg (ap, mpfr_prec_t);
 
           FLUSH (xgmp_fmt_flag, start, end, ap2, &buf);
@@ -2061,16 +2193,11 @@ mpfr_vasnprintf_aux (char **ptr, char *Buf, size_t size, const char *fmt,
           format[4 + MPFR_PREC_FORMAT_SIZE] = spec.spec;
           format[5 + MPFR_PREC_FORMAT_SIZE] = '\0';
           length = gmp_asprintf (&s, format, spec.width, spec.prec, prec);
-          if (buf.size <= INT_MAX - length)
-            {
-              buffer_cat (&buf, s, length);
-              mpfr_free_str (s);
-            }
-          else
-            {
-              mpfr_free_str (s);
-              goto overflow_error;
-            }
+          MPFR_ASSERTN (length >= 0);  /* guaranteed by GMP 6 */
+          err = buffer_cat (&buf, s, length);
+          mpfr_free_str (s);
+          if (err)
+            goto overflow_error;
         }
       else if (spec.arg_type == MPFR_ARG)
         /* output a mpfr_t variable */
@@ -2111,54 +2238,34 @@ mpfr_vasnprintf_aux (char **ptr, char *Buf, size_t size, const char *fmt,
     FLUSH (xgmp_fmt_flag, start, fmt, ap2, &buf);
 
   va_end (ap2);
-  nbchar = buf.curr - buf.start;
+  MPFR_ASSERTD (buf.len >= 0);  /* overflow already detected */
+  nbchar = buf.len;
 
-  if (ptr == NULL) /* implement mpfr_vsnprintf */
+  if (ptr != NULL)  /* implement mpfr_vasprintf */
+    {
+      MPFR_ASSERTD (nbchar == strlen (buf.start));
+      *ptr = (char *)
+        (*__gmp_reallocate_func) (buf.start, buf.size, nbchar + 1);
+    }
+  else if (size > 0)  /* implement mpfr_vsnprintf */
     {
-      if (size > 0)
+      if (nbchar < size)
         {
-          if (nbchar < size)
-            {
-              strncpy (Buf, buf.start, nbchar);
-              Buf[nbchar] = '\0';
-            }
-          else
-            {
-              strncpy (Buf, buf.start, size - 1);
-              Buf[size-1] = '\0';
-            }
+          strncpy (Buf, buf.start, nbchar);
+          Buf[nbchar] = '\0';
+        }
+      else
+        {
+          strncpy (Buf, buf.start, size - 1);
+          Buf[size-1] = '\0';
         }
-      MPFR_SAVE_EXPO_FREE (expo);
       (*__gmp_free_func) (buf.start, buf.size);
-      return nbchar; /* return the number of characters that would have been
-                        written had 'size' be sufficiently large, not counting
-                        the terminating null character */
     }
 
-  MPFR_ASSERTD (nbchar == strlen (buf.start));
-  buf.start =
-    (char *) (*__gmp_reallocate_func) (buf.start, buf.size, nbchar + 1);
-  buf.size = nbchar + 1; /* update needed for __gmp_free_func below when
-                            nbchar is too large (overflow_error) */
-  
-  /* below we implement mpfr_vasprintf */
-  *ptr = buf.start;
-
-  /* If nbchar is larger than INT_MAX, the ISO C99 standard is silent, but
-     POSIX says concerning the snprintf() function:
-     "[EOVERFLOW] The value of n is greater than {INT_MAX} or the
-     number of bytes needed to hold the output excluding the
-     terminating null is greater than {INT_MAX}." See:
-     http://www.opengroup.org/onlinepubs/009695399/functions/fprintf.html
-     But it doesn't say anything concerning the other printf-like functions.
-     A defect report has been submitted to austin-review-l (item 2532).
-     So, for the time being, we return a negative value and set the erange
-     flag, and set errno to EOVERFLOW in POSIX system. */
-  if (nbchar <= INT_MAX)
-    {
-      MPFR_SAVE_EXPO_FREE (expo);
-      return nbchar;
-    }
+  MPFR_SAVE_EXPO_FREE (expo);
+  return nbchar; /* return the number of characters that would have been
+                    written had 'size' been sufficiently large, not counting
+                    the terminating null character */
 
  overflow_error:
   MPFR_SAVE_EXPO_UPDATE_FLAGS(expo, MPFR_FLAGS_ERANGE);
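
Note on buffer_sandwich above: the digit grouping it performs can be
sketched in isolation as follows. This is only an illustration under stated
assumptions, not MPFR code: group_digits is a hypothetical helper that
writes into a caller-provided array instead of a struct string_buffer, so
it has no widening and no overflow reporting. It takes the first LEN
characters of STR, appends TZ zeros, and inserts the separator C between
groups of 3 digits counted from the right.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of MPFR.
   OUT must have room for size + (size - 1) / 3 + 1 bytes, where
   size = len + tz. Returns the number of characters written, not
   counting the terminating null character. */
static size_t
group_digits (char *out, const char *str, size_t len, size_t tz, char c)
{
  const size_t step = 3;
  const size_t size = len + tz;
  size_t q, i, pos = 0;

  assert (size > 0);
  assert (len <= strlen (str));

  /* Number of separators; for size > 0 this has the same value as the
     patch's "size % step == 0 ? size / step - 1 : size / step". */
  q = (size - 1) / step;

  for (i = 0; i < size; i++)
    {
      /* A separator precedes digit i when the digits from i to the end
         form whole groups of STEP digits. */
      if (i != 0 && (size - i) % step == 0)
        out[pos++] = c;
      /* Digits beyond LEN are the trailing zeros. */
      out[pos++] = i < len ? str[i] : '0';
    }
  out[pos] = '\0';

  assert (pos == size + q);  /* "fullsize" in the patch */
  return pos;
}
```

For instance, group_digits (buf, "12", 2, 3, ',') stores "12,000" in buf.
The sketch handles one digit at a time for readability, whereas the patch
copies whole blocks with memcpy and memset.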
diff --git a/src/zeta.c b/src/zeta.c
index 2e8fe064e..950ab718f 100644
--- a/src/zeta.c
+++ b/src/zeta.c
@@ -20,6 +20,8 @@ along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
 http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
 
+#include <float.h> /* for DBL_MAX */
+
 #define MPFR_NEED_LONGLONG_H
 #include "mpfr-impl.h"
 
@@ -287,6 +289,137 @@ mpfr_zeta_pos (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
   return inex;
 }
 
+/* return add = 1 + floor(log(c^3*(13+m1))/log(2))
+   where c = (1+eps)*(1+eps*max(8,m1)),
+   m1 = 1 + max(1/eps,2*sd)*(1+eps),
+   eps = 2^(-precz-14)
+   sd = abs(s-1)
+ */
+static long
+compute_add (mpfr_srcptr s, mpfr_prec_t precz)
+{
+  mpfr_t t, u, m1;
+  long add;
+
+  mpfr_inits2 (64, t, u, m1, (mpfr_ptr) 0);
+  if (mpfr_cmp_ui (s, 1) >= 0)
+    mpfr_sub_ui (t, s, 1, MPFR_RNDU);
+  else
+    mpfr_ui_sub (t, 1, s, MPFR_RNDU);
+  /* now t = sd = abs(s-1), rounded up */
+  mpfr_set_ui_2exp (u, 1, - precz - 14, MPFR_RNDU);
+  /* u = eps */
+  /* since 1/eps = 2^(precz+14), if EXP(sd) >= precz+14, then
+     sd >= 1/2*2^(precz+14) thus 2*sd >= 2^(precz+14) >= 1/eps */
+  if (mpfr_get_exp (t) >= precz + 14)
+    mpfr_mul_2exp (t, t, 1, MPFR_RNDU);
+  else
+    mpfr_set_ui_2exp (t, 1, precz + 14, MPFR_RNDU);
+  /* now t = max(1/eps,2*sd) */
+  mpfr_add_ui (u, u, 1, MPFR_RNDU); /* u = 1+eps, rounded up */
+  mpfr_mul (t, t, u, MPFR_RNDU); /* t = max(1/eps,2*sd)*(1+eps) */
+  mpfr_add_ui (m1, t, 1, MPFR_RNDU);
+  if (mpfr_get_exp (m1) <= 3)
+    mpfr_set_ui (t, 8, MPFR_RNDU);
+  else
+    mpfr_set (t, m1, MPFR_RNDU);
+  /* now t = max(8,m1) */
+  mpfr_div_2exp (t, t, precz + 14, MPFR_RNDU); /* eps*max(8,m1) */
+  mpfr_add_ui (t, t, 1, MPFR_RNDU); /* 1+eps*max(8,m1) */
+  mpfr_mul (t, t, u, MPFR_RNDU); /* t = c */
+  mpfr_add_ui (u, m1, 13, MPFR_RNDU); /* 13+m1 */
+  mpfr_mul (u, u, t, MPFR_RNDU); /* c*(13+m1) */
+  mpfr_sqr (t, t, MPFR_RNDU); /* c^2 */
+  mpfr_mul (u, u, t, MPFR_RNDU); /* c^3*(13+m1) */
+  add = mpfr_get_exp (u);
+  mpfr_clears (t, u, m1, (mpfr_ptr) 0);
+  return add;
+}
+
+/* return in z a lower bound (for rnd = RNDD) or upper bound (for rnd = RNDU)
+   of |zeta(s)|/2, using:
+   log(|zeta(s)|/2) = (s-1)*log(2*Pi) + lngamma(1-s)
+   + log(|sin(Pi*s/2)| * zeta(1-s)).
+   Assumes s < 1/2 and s1 = 1-s exactly, thus s1 > 1/2.
+   y and p are temporary variables.
+   At input, p is Pi rounded down.
+   The comments in the code are for rnd = RNDD. */
+static void
+mpfr_reflection_overflow (mpfr_t z, mpfr_t s1, const mpfr_t s, mpfr_t y,
+                          mpfr_t p, mpfr_rnd_t rnd)
+{
+  mpz_t sint;
+
+  MPFR_ASSERTD (rnd == MPFR_RNDD || rnd == MPFR_RNDU);
+
+  /* Since log is increasing, we want lower bounds on |sin(Pi*s/2)| and
+     zeta(1-s). */
+  mpz_init (sint);
+  mpfr_get_z (sint, s, MPFR_RNDD); /* sint = floor(s) */
+  /* We first compute a lower bound of |sin(Pi*s/2)|, which is a periodic
+     function of period 2. Thus:
+     if 2k < s < 2k+1, then |sin(Pi*s/2)| is increasing;
+     if 2k-1 < s < 2k, then |sin(Pi*s/2)| is decreasing.
+     These cases are distinguished by testing bit 0 of floor(s) as if
+     represented in two's complement (or equivalently, as an unsigned
+     integer mod 2):
+     0: sint = 0 mod 2, thus 2k < s < 2k+1 and |sin(Pi*s/2)| is increasing;
+     1: sint = 1 mod 2, thus 2k-1 < s < 2k and |sin(Pi*s/2)| is decreasing.
+     Let's recall that the comments are for rnd = RNDD. */
+  if (mpz_tstbit (sint, 0) == 0) /* |sin(Pi*s/2)| is increasing: round down
+                                    Pi*s to get a lower bound. */
+    {
+      mpfr_mul (y, p, s, rnd);
+      if (rnd == MPFR_RNDD)
+        mpfr_nextabove (p); /* we will need p rounded above afterwards */
+    }
+  else /* |sin(Pi*s/2)| is decreasing: round up Pi*s to get a lower bound. */
+    {
+      if (rnd == MPFR_RNDD)
+        mpfr_nextabove (p);
+      mpfr_mul (y, p, s, MPFR_INVERT_RND(rnd));
+    }
+  mpfr_div_2ui (y, y, 1, MPFR_RNDN); /* exact, rounding mode doesn't matter */
+  /* The rounding direction of sin depends on its sign. We have:
+     if -4k-2 < s < -4k, then -2k-1 < s/2 < -2k, thus sin(Pi*s/2) < 0;
+     if -4k < s < -4k+2, then -2k < s/2 < -2k+1, thus sin(Pi*s/2) > 0.
+     These cases are distinguished by testing bit 1 of floor(s) as if
+     represented in two's complement (or equivalently, as an unsigned
+     integer mod 4):
+     0: sint = {0,1} mod 4, thus -2k < s/2 < -2k+1 and sin(Pi*s/2) > 0;
+     1: sint = {2,3} mod 4, thus -2k-1 < s/2 < -2k and sin(Pi*s/2) < 0.
+     Let's recall that the comments are for rnd = RNDD. */
+  if (mpz_tstbit (sint, 1) == 0) /* -2k < s/2 < -2k+1; sin(Pi*s/2) > 0 */
+    {
+      /* Round sin down to get a lower bound of |sin(Pi*s/2)|. */
+      mpfr_sin (y, y, rnd);
+    }
+  else /* -2k-1 < s/2 < -2k; sin(Pi*s/2) < 0 */
+    {
+      /* Round sin up to get a lower bound of |sin(Pi*s/2)|. */
+      mpfr_sin (y, y, MPFR_INVERT_RND(rnd));
+      mpfr_abs (y, y, MPFR_RNDN); /* exact, rounding mode doesn't matter */
+    }
+  mpz_clear (sint);
+  /* now y <= |sin(Pi*s/2)| when rnd=RNDD, y >= |sin(Pi*s/2)| when rnd=RNDU */
+  mpfr_zeta_pos (z, s1, rnd); /* zeta(1-s) */
+  mpfr_mul (z, z, y, rnd);
+  /* now z <= |sin(Pi*s/2)|*zeta(1-s) */
+  mpfr_log (z, z, rnd);
+  /* now z <= log(|sin(Pi*s/2)|*zeta(1-s)) */
+  mpfr_lngamma (y, s1, rnd);
+  mpfr_add (z, z, y, rnd);
+  /* z <= lngamma(1-s) + log(|sin(Pi*s/2)|*zeta(1-s)) */
+  /* since s-1 < 0, we want to round log(2*pi) upwards */
+  mpfr_mul_2ui (y, p, 1, MPFR_INVERT_RND(rnd));
+  mpfr_log (y, y, MPFR_INVERT_RND(rnd));
+  mpfr_mul (y, y, s1, MPFR_INVERT_RND(rnd));
+  mpfr_sub (z, z, y, rnd);
+  mpfr_exp (z, z, rnd);
+  if (rnd == MPFR_RNDD)
+    mpfr_nextbelow (p); /* restore original p */
+}
+
 int
 mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
 {
@@ -328,10 +461,9 @@ mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
   /* s is neither Nan, nor Inf, nor Zero */
 
   /* check tiny s: we have zeta(s) = -1/2 - 1/2 log(2 Pi) s + ... around s=0,
-     and for |s| <= 0.074, we have |zeta(s) + 1/2| <= |s|.
-     Thus if |s| <= 1/4*ulp(1/2), we can deduce the correct rounding
-     (the 1/4 covers the case where |zeta(s)| < 1/2 and rounding to nearest).
-     A sufficient condition is that EXP(s) + 1 < -PREC(z). */
+     and for |s| <= 2^(-4), we have |zeta(s) + 1/2| <= |s|.
+     EXP(s) + 1 < -PREC(z) is a sufficient condition to be able to round
+     correctly, for any PREC(z) >= 1 (see algorithms.tex for details). */
   if (MPFR_GET_EXP (s) + 1 < - (mpfr_exp_t) MPFR_PREC(z))
     {
       int signs = MPFR_SIGN(s);
@@ -402,6 +534,20 @@ mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
       /* Precision precs1 needed to represent 1 - s, and s + 2,
          without any truncation */
       precs1 = precs + 2 + MAX (0, - MPFR_GET_EXP (s));
+      /* FIXME: For the error analysis, use MPFR instead of the native
+         double type. The code below can yield overflows on doubles
+         when s is large enough (its precision also needs to be large
+         enough, otherwise s is an even integer, which has already been
+         taken into account). In particular, on platforms where overflow
+         is trapped (or if the user has chosen to trap overflow), this
+         can make the application crash.
+         Moreover, does the error computation need to be accurate, such as
+         the multiplications by (1.0 + eps)? If yes, what about rounding
+         directions when using double? If no, the expression could probably
+         be simplified, so that using native integer arithmetic with
+         mpfr_exp_t may be sufficient instead of using MPFR.
+         Note: This FIXME can be made obsolete by rewriting the code
+         (see algorithms.tex, still very incomplete). */
       sd = mpfr_get_d (s, MPFR_RNDN) - 1.0;
       if (sd < 0.0)
         sd = -sd; /* now sd = abs(s-1.0) */
@@ -409,36 +555,98 @@ mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
          it ensures a final precision prec1 - add for zeta(s) */
       /* eps = pow (2.0, - (double) precz - 14.0); */
       eps = __gmpfr_ceil_exp2 (- (double) precz - 14.0);
-      m1 = 1.0 + MAX(1.0 / eps,  2.0 * sd) * (1.0 + eps);
+      m1 = 1.0 + MAX(1.0 / eps, 2.0 * sd) * (1.0 + eps);
       c = (1.0 + eps) * (1.0 + eps * MAX(8.0, m1));
       /* add = 1 + floor(log(c*c*c*(13 + m1))/log(2)); */
-      add = __gmpfr_ceil_log2 (c * c * c * (13.0 + m1));
+      c = c * c * c * (13.0 + m1);
+      add = (c <= DBL_MAX) ? __gmpfr_ceil_log2 (c) : compute_add (s, precz);
       prec1 = precz + add;
-      /* FIXME: to avoid that the working precision (prec1) depends on the
+      /* FIXME: To avoid that the working precision (prec1) depends on the
          input precision, one would need to take into account the error made
          when s1 is not exactly 1-s when computing zeta(s1) and gamma(s1)
-         below, and also in the case y=Inf (i.e. when gamma(s1) overflows). */
+         below, and also in the case y=Inf (i.e. when gamma(s1) overflows).
+         Make sure that underflows do not occur in intermediate computations.
+         Due to the limited precision, they are probably not possible
+         in practice; add some MPFR_ASSERTN's to be sure that problems
+         do not remain undetected? */
       prec1 = MAX (prec1, precs1) + 10;
 
       MPFR_GROUP_INIT_4 (group, prec1, z_pre, s1, y, p);
       MPFR_ZIV_INIT (loop, prec1);
       for (;;)
         {
+          mpfr_exp_t ey;
+          mpfr_t z_up;
+
+          mpfr_const_pi (p, MPFR_RNDD); /* p is Pi */
+
           mpfr_sub (s1, __gmpfr_one, s, MPFR_RNDN); /* s1 = 1-s */
-          mpfr_zeta_pos (z_pre, s1, MPFR_RNDN);   /* zeta(1-s)  */
           mpfr_gamma (y, s1, MPFR_RNDN);          /* gamma(1-s) */
-          if (MPFR_IS_INF (y)) /* Zeta(s) < 0 for -4k-2 < s < -4k,
-                                  Zeta(s) > 0 for -4k < s < -4k+2 */
+          if (MPFR_IS_INF (y)) /* zeta(s) < 0 for -4k-2 < s < -4k,
+                                  zeta(s) > 0 for -4k < s < -4k+2 */
             {
-              mpfr_div_2ui (s1, s, 2, MPFR_RNDN); /* s/4, exact */
-              mpfr_frac (s1, s1, MPFR_RNDN); /* exact, -1 < s1 < 0 */
-              overflow = (mpfr_cmp_si_2exp (s1, -1, -1) > 0) ? -1 : 1;
-              break;
+              /* FIXME: An overflow in gamma(s1) does not imply that
+                 zeta(s) will overflow. A solution:
+                 1. Compute
+                   log(|zeta(s)|/2) = (s-1)*log(2*pi) + lngamma(1-s)
+                     + log(abs(sin(Pi*s/2)) * zeta(1-s))
+                 (possibly sharing computations with the normal case)
+                 with a rather good accuracy (see (2)).
+                 Memorize the sign of sin(...) for the final sign.
+                 2. Take the exponential, ~= |zeta(s)|/2. If there is an
+                 overflow, then this means an overflow on the final result
+                 (due to the multiplication by 2, which has not been done
+                 yet).
+                 3. Ziv test.
+                 4. Correct the sign from the sign of sin(...).
+                 5. Round then multiply by 2. Here, an overflow in either
+                 operation means a real overflow. */
+              mpfr_reflection_overflow (z_pre, s1, s, y, p, MPFR_RNDD);
+              /* z_pre is a lower bound of |zeta(s)|/2, thus if it overflows,
+                 or has exponent emax, then |zeta(s)| overflows too. */
+              if (MPFR_IS_INF (z_pre) || MPFR_GET_EXP(z_pre) == __gmpfr_emax)
+                { /* determine the sign of overflow */
+                  mpfr_div_2ui (s1, s, 2, MPFR_RNDN); /* s/4, exact */
+                  mpfr_frac (s1, s1, MPFR_RNDN); /* exact, -1 < s1 < 0 */
+                  overflow = (mpfr_cmp_si_2exp (s1, -1, -1) > 0) ? -1 : 1;
+                  break;
+                }
+              else /* EXP(z_pre) < __gmpfr_emax */
+                {
+                  int ok = 0;
+                  mpfr_t z_down;
+                  mpfr_init2 (z_up, mpfr_get_prec (z_pre));
+                  mpfr_reflection_overflow (z_up, s1, s, y, p, MPFR_RNDU);
+                  /* if the lower approximation z_pre does not overflow, but
+                     z_up does, we need more precision */
+                  if (MPFR_IS_INF (z_up) || MPFR_GET_EXP(z_up) == __gmpfr_emax)
+                    { mpfr_clear (z_up); goto next_loop; }
+                  /* check if z_pre and z_up round to the same number */
+                  mpfr_init2 (z_down, precz);
+                  mpfr_set (z_down, z_pre, rnd_mode);
+                  /* Note: it might be that EXP(z_down) = emax here, in that
+                     case we will have overflow below when we multiply by 2 */
+                  mpfr_prec_round (z_up, precz, rnd_mode);
+                  ok = mpfr_cmp (z_down, z_up) == 0;
+                  mpfr_clear (z_up);
+                  mpfr_clear (z_down);
+                  if (ok)
+                    {
+                      /* get correct sign and multiply by 2 */
+                      mpfr_div_2ui (s1, s, 2, MPFR_RNDN); /* s/4, exact */
+                      mpfr_frac (s1, s1, MPFR_RNDN); /* exact, -1 < s1 < 0 */
+                      if (mpfr_cmp_si_2exp (s1, -1, -1) > 0)
+                        mpfr_neg (z_pre, z_pre, rnd_mode);
+                      mpfr_mul_2ui (z_pre, z_pre, 1, rnd_mode);
+                      break;
+                    }
+                  else
+                    goto next_loop;
+                }
             }
+          mpfr_zeta_pos (z_pre, s1, MPFR_RNDN);   /* zeta(1-s)  */
           mpfr_mul (z_pre, z_pre, y, MPFR_RNDN);  /* gamma(1-s)*zeta(1-s) */
 
-          mpfr_const_pi (p, MPFR_RNDD); /* p is Pi */
-
           /* multiply z_pre by 2^s*Pi^(s-1) where p=Pi, s1=1-s */
           mpfr_mul_2ui (y, p, 1, MPFR_RNDN);      /* 2*Pi */
           mpfr_neg (s1, s1, MPFR_RNDN);           /* s-1 */
@@ -450,12 +658,18 @@ mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
           mpfr_mul (y, s, p, MPFR_RNDN);
           mpfr_div_2ui (p, y, 1, MPFR_RNDN);      /* p = s*Pi/2 */
           /* FIXME: sinpi will be available, we should replace the mpfr_sin
-             call below by mpfr_sinpi(s/2), where s/2 will be exact */
+             call below by mpfr_sinpi(s/2), where s/2 will be exact.
+             Can mpfr_sin underflow? Moreover, the code below should be
+             improved so that the "if" condition becomes unlikely, e.g.
+             by taking a slightly larger working precision. */
           mpfr_sin (y, p, MPFR_RNDN);             /* y = sin(Pi*s/2) */
-          if (MPFR_GET_EXP(y) < 0) /* take account of cancellation in sin(p) */
+          ey = MPFR_GET_EXP (y);
+          if (ey < 0) /* take account of cancellation in sin(p) */
             {
               mpfr_t t;
-              mpfr_init2 (t, prec1 - MPFR_GET_EXP(y));
+
+              MPFR_ASSERTN (- ey < MPFR_PREC_MAX - prec1);
+              mpfr_init2 (t, prec1 - ey);
               mpfr_const_pi (t, MPFR_RNDD);
               mpfr_mul (t, s, t, MPFR_RNDN);
               mpfr_div_2ui (t, t, 1, MPFR_RNDN);
@@ -468,6 +682,7 @@ mpfr_zeta (mpfr_t z, mpfr_srcptr s, mpfr_rnd_t rnd_mode)
                                            rnd_mode)))
             break;
 
+        next_loop:
           MPFR_ZIV_NEXT (loop, prec1);
           MPFR_GROUP_REPREC_4 (group, prec1, z_pre, s1, y, p);
         }
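
Note on mpfr_reflection_overflow above: the two bit tests on floor(s) can
be checked with ordinary C types. The sketch below is an illustration only,
not MPFR code: classify_sin and struct sin_info are hypothetical names, and
s is assumed to be a non-integer double of small magnitude. The conversion
to unsigned long is well defined in C (reduction modulo a power of 2), so
it yields the two's complement bit pattern that the comments in the patch
refer to, also for negative floor(s).

```c
#include <assert.h>

/* Hypothetical types and names, not part of MPFR. */
struct sin_info
{
  int abs_increasing;  /* 1 if |sin(Pi*s/2)| is increasing around s */
  int positive;        /* 1 if sin(Pi*s/2) > 0 */
};

static struct sin_info
classify_sin (double s)
{
  struct sin_info r;
  unsigned long bits;
  long fl;

  assert (s > -1e9 && s < 1e9);  /* so that floor(s) fits in a long */

  /* floor(s) without <math.h>: truncate toward zero, then adjust. */
  fl = (long) s;
  if ((double) fl > s)
    fl--;
  assert ((double) fl != s);  /* s must not be an integer */

  bits = (unsigned long) fl;  /* two's complement bit pattern */

  /* bit 0 = 0: 2k < s < 2k+1, |sin(Pi*s/2)| is increasing;
     bit 0 = 1: 2k-1 < s < 2k, |sin(Pi*s/2)| is decreasing. */
  r.abs_increasing = (bits & 1) == 0;

  /* bit 1 = 0: floor(s) = {0,1} mod 4, sin(Pi*s/2) > 0;
     bit 1 = 1: floor(s) = {2,3} mod 4, sin(Pi*s/2) < 0. */
  r.positive = ((bits >> 1) & 1) == 0;

  return r;
}
```

For s = -2.5, floor(s) = -3, whose low bits are 01: |sin(Pi*s/2)| is
decreasing and sin(Pi*s/2) = sin(-5*Pi/4) is positive. The patch uses
mpz_tstbit on an mpz_t for the same tests, which removes the range
restriction assumed here.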
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 265b89509..f2c0502a6 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -33,7 +33,7 @@ check_PROGRAMS = tversion tabort_prec_max tassert tabort_defalloc1	\
      tsprintf tsqr tsqrt tsqrt_ui tstckintc tstdint tstrtofr tsub	\
      tsub1sp tsub_d tsub_ui tsubnormal tsum tswap ttan ttanh ttrunc	\
      tui_div tui_pow tui_sub turandom tvalist ty0 ty1 tyn tzeta		\
-     tzeta_ui
+     tzeta_ui tbeta
 
 # Before Automake 1.13, we ran tversion at the beginning and at the end
 # of the tests, and output from tversion appeared at the same place as
diff --git a/tests/memory.c b/tests/memory.c
index d31ef2407..82e18380b 100644
--- a/tests/memory.c
+++ b/tests/memory.c
@@ -34,6 +34,14 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
    when tests_memory_end() is called.  Test programs must be sure to have
    "clear"s for all temporary variables used.  */
 
+/* Note about error messages
+   -------------------------
+   Error messages in MPFR are usually written to stdout. However, those
+   coming from the memory allocator need to be written to stderr in order
+   to be visible when the standard output is redirected, e.g. in the tests
+   of I/O functions (like tprintf). For consistency, all error messages in
+   this file should be written to stderr. */
+
 struct header {
   void           *ptr;
   size_t         size;
@@ -59,7 +67,8 @@ mpfr_default_allocate (size_t size)
   ret = malloc (size);
   if (MPFR_UNLIKELY (ret == NULL))
     {
-      fprintf (stderr, "MPFR: Can't allocate memory (size=%lu)\n",
+      fprintf (stderr, "[MPFR] mpfr_default_allocate(): "
+               "can't allocate memory (size=%lu)\n",
                (unsigned long) size);
       abort ();
     }
@@ -73,8 +82,8 @@ mpfr_default_reallocate (void *oldptr, size_t old_size, size_t new_size)
   ret = realloc (oldptr, new_size);
   if (MPFR_UNLIKELY(ret == NULL))
     {
-      fprintf (stderr,
-               "MPFR: Can't reallocate memory (old_size=%lu new_size=%lu)\n",
+      fprintf (stderr, "[MPFR] mpfr_default_reallocate(): "
+               "can't reallocate memory (old_size=%lu new_size=%lu)\n",
                (unsigned long) old_size, (unsigned long) new_size);
       abort ();
     }
@@ -119,7 +128,8 @@ tests_addsize (size_t size)
     {
       /* The total size taken by MPFR on the heap is more than 4 MB:
          either a bug or a huge inefficiency. */
-      printf ("MPFR: too much memory (%lu bytes)\n",
+      fprintf (stderr, "[MPFR] tests_addsize(): "
+               "too much memory (%lu bytes)\n",
               (unsigned long) tests_total_size);
       abort ();
     }
@@ -134,7 +144,8 @@ tests_allocate (size_t size)
 
   if (size == 0)
     {
-      printf ("tests_allocate(): attempt to allocate 0 bytes\n");
+      fprintf (stderr, "[MPFR] tests_allocate(): "
+               "attempt to allocate 0 bytes\n");
       abort ();
     }
 
@@ -161,7 +172,8 @@ tests_reallocate (void *ptr, size_t old_size, size_t new_size)
 
   if (new_size == 0)
     {
-      printf ("tests_reallocate(): attempt to reallocate 0x%lX to 0 bytes\n",
+      fprintf (stderr, "[MPFR] tests_reallocate(): "
+               "attempt to reallocate 0x%lX to 0 bytes\n",
               (unsigned long) ptr);
       abort ();
     }
@@ -169,7 +181,8 @@ tests_reallocate (void *ptr, size_t old_size, size_t new_size)
   hp = tests_memory_find (ptr);
   if (hp == NULL)
     {
-      printf ("tests_reallocate(): attempt to reallocate bad pointer 0x%lX\n",
+      fprintf (stderr, "[MPFR] tests_reallocate(): "
+               "attempt to reallocate bad pointer 0x%lX\n",
               (unsigned long) ptr);
       abort ();
     }
@@ -179,7 +192,8 @@ tests_reallocate (void *ptr, size_t old_size, size_t new_size)
     {
       /* Note: we should use the standard %zu to print sizes, but
          this is not supported by old C implementations. */
-      printf ("tests_reallocate(): bad old size %lu, should be %lu\n",
+      fprintf (stderr, "[MPFR] tests_reallocate(): "
+               "bad old size %lu, should be %lu\n",
               (unsigned long) old_size, (unsigned long) h->size);
       abort ();
     }
@@ -201,7 +215,8 @@ tests_free_find (void *ptr)
   struct header  **hp = tests_memory_find (ptr);
   if (hp == NULL)
     {
-      printf ("tests_free(): attempt to free bad pointer 0x%lX\n",
+      fprintf (stderr, "[MPFR] tests_free(): "
+               "attempt to free bad pointer 0x%lX\n",
               (unsigned long) ptr);
       abort ();
     }
@@ -235,7 +250,7 @@ tests_free (void *ptr, size_t size)
     {
       /* Note: we should use the standard %zu to print sizes, but
          this is not supported by old C implementations. */
-      printf ("tests_free(): bad size %lu, should be %lu\n",
+      fprintf (stderr, "[MPFR] tests_free(): bad size %lu, should be %lu\n",
               (unsigned long) size, (unsigned long) h->size);
       abort ();
     }
@@ -271,13 +286,13 @@ tests_memory_end (void)
       struct header  *h;
       unsigned  count;
 
-      printf ("tests_memory_end(): not all memory freed\n");
+      fprintf (stderr, "[MPFR] tests_memory_end(): not all memory freed\n");
 
       count = 0;
       for (h = tests_memory_list; h != NULL; h = h->next)
         count++;
 
-      printf ("    %u blocks remaining\n", count);
+      fprintf (stderr, "[MPFR]    %u blocks remaining\n", count);
       abort ();
     }
 }
diff --git a/tests/tbeta.c b/tests/tbeta.c
new file mode 100644
index 000000000..45897a888
--- /dev/null
+++ b/tests/tbeta.c
@@ -0,0 +1,374 @@
+/* Test file for the beta function
+
+Copyright 2017 Free Software Foundation, Inc.
+Contributed by ChemicalDevelopment.
+
+This file is part of the GNU MPFR Library.
+
+The GNU MPFR Library is free software; you can redistribute it and/or modify
+it under the terms of the GNU Lesser General Public License as published by
+the Free Software Foundation; either version 3 of the License, or (at your
+option) any later version.
+
+The GNU MPFR Library is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser General Public
+License for more details.
+
+You should have received a copy of the GNU Lesser General Public License
+along with the GNU MPFR Library; see the file COPYING.LESSER.  If not, see
+http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
+51 Franklin St, Fifth Floor, Boston, MA 02110-1301, USA. */
+
+#include "mpfr-test.h"
+
+/* TODO: Test the ternary value and the flags. Add tgeneric tests. */
+
+#define FAILED(p, r, z, w, expected, rnd_mode) do {                     \
+    printf ("prec=%d, rnd=%s case failed for:",                         \
+            (int) p, mpfr_print_rnd_mode (rnd_mode));                   \
+    printf("\n z  =");                                                  \
+    mpfr_out_str (stdout, 2, 0, z, MPFR_RNDN);                          \
+    printf("\n w  =");                                                  \
+    mpfr_out_str (stdout, 2, 0, w, MPFR_RNDN);                          \
+    printf("\n ex.=");                                                  \
+    mpfr_out_str (stdout, 2, 0, expected, MPFR_RNDN);                   \
+    printf("\n ac.=");                                                  \
+    mpfr_out_str (stdout, 2, 0, r, MPFR_RNDN);                          \
+    printf("\n\n");                                                     \
+  } while (0)
+
+#define TEST(p, r, z, w, expected) TESTRND(p, r, z, w, expected, MPFR_RNDN)
+
+#define TESTRND(p, r, z, w, expected, rnd_mode) do {                    \
+    mpfr_beta (r, z, w, rnd_mode);                                      \
+    if (not_same (r, expected))                                         \
+      FAILED(p, r, z, w, expected, rnd_mode);                           \
+  } while (0)
+
+static int
+not_same (mpfr_t a, mpfr_t b)
+{
+  int res = 0;
+
+  if (mpfr_cmp(a, b) != 0)
+    res = 1;
+  if (! mpfr_nan_p(a) != ! mpfr_nan_p(b))
+    res = 1;
+  if (! mpfr_equal_p(a, b) && (! mpfr_nan_p(a) && ! mpfr_nan_p(b)))
+    res = 1;
+  if ((! mpfr_signbit(a) != ! mpfr_signbit(b)) &&
+      (! mpfr_nan_p(a) && ! mpfr_nan_p(b)))
+    res = 1;
+
+  return res;
+}
+
+static void
+test_beta_special (mpfr_prec_t prec)
+{
+  mpfr_t z, w, r, expect;
+
+  mpfr_init2 (r, prec);
+  mpfr_init2 (z, prec);
+  mpfr_init2 (w, prec);
+  mpfr_init2 (expect, prec);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_inf (w, 1);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_zero (w, 1);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_zero (w, -1);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_str (w, "-.1e0", 2, MPFR_RNDN);
+  mpfr_set_inf (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_str (w, "-1.1e0", 2, MPFR_RNDN);
+  mpfr_set_inf (expect, -1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_str (w, "-1e0", 2, MPFR_RNDN);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, 1);
+  mpfr_set_str (w, "-2e0", 2, MPFR_RNDN);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  if (prec > 81)
+    {
+      mpfr_set_inf (z, 1);
+      mpfr_set_str (w, "-1e80", 2, MPFR_RNDN);
+      mpfr_set_nan (expect);
+      TEST(prec, r, z, w, expect);
+
+      mpfr_set_inf (z, 1);
+      mpfr_set_str (w, "-1e80", 2, MPFR_RNDN);
+      mpfr_sub_d (w, w, .1, MPFR_RNDN);
+      mpfr_set_inf (expect, 1);
+      TEST(prec, r, z, w, expect);
+
+      mpfr_set_str (w, "-1e80", 2, MPFR_RNDN);
+      mpfr_sub_d (w, w, 1.1, MPFR_RNDN);
+      mpfr_set_inf (expect, -1);
+      TEST(prec, r, z, w, expect);
+    }
+
+  mpfr_set_str (z, "1.1e0", 2, MPFR_RNDN);
+  mpfr_set_inf (w, -1);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_str (z, "11e0", 2, MPFR_RNDN);
+  mpfr_set_inf (w, -1);
+  mpfr_set_zero (expect, -1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_str (z, "10e0", 2, MPFR_RNDN);
+  mpfr_set_inf (w, -1);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_set_inf (z, -1);
+  mpfr_set_inf (w, -1);
+  mpfr_set_nan (expect);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_clear (r);
+  mpfr_clear (z);
+  mpfr_clear (w);
+  mpfr_clear (expect);
+}
+
+static void
+test_beta_2exp (mpfr_prec_t prec, int trials, int spread)
+{
+  mpfr_t r, z, w, expect;
+  int i;
+
+  mpfr_init2 (r, prec);
+  mpfr_init2 (z, prec);
+  mpfr_init2 (w, prec);
+  mpfr_init2 (expect, prec);
+  for (i = -(spread*trials)/2; spread*i < trials / 2; i += spread)
+    {
+      mpfr_set_si_2exp (z, 1, i, MPFR_RNDN);
+      mpfr_set_ui (w, 1, MPFR_RNDN);
+      mpfr_set_si_2exp (expect, 1, -i, MPFR_RNDN);
+
+      TEST(prec, r, z, w, expect);
+    }
+
+  mpfr_clear (r);
+  mpfr_clear (z);
+  mpfr_clear (w);
+  mpfr_clear (expect);
+}
+
+/*
+Tests values such that z and w are not integers, but (z+w) is.
+
+An example that was given:
+beta(-.3, -1.7) = gamma(-0.3)*gamma(-1.7)/gamma(-2)
+
+Sage gives this as 0, and Lefevre said that we should return +0
+
+*/
+static void
+test_beta_zw_sum_int (mpfr_prec_t prec)
+{
+  mpfr_t r, z, w, expect;
+  int sum;
+
+  if (prec < 4)
+    prec = 4;
+
+  mpfr_init2 (r, prec);
+  mpfr_init2 (z, prec);
+  mpfr_init2 (w, prec);
+  mpfr_init2 (expect, prec);
+
+  sum = -3;
+  mpfr_set_str (z, "-1.1e0", 2, MPFR_RNDN);
+  mpfr_si_sub (w, sum, z, MPFR_RNDN);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  sum = -12;
+  mpfr_set_str (z, "-1.101e0", 2, MPFR_RNDN);
+  mpfr_si_sub (w, sum, z, MPFR_RNDN);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  sum = -1;
+  mpfr_set_str (z, "-.11e0", 2, MPFR_RNDN);
+  mpfr_si_sub (w, sum, z, MPFR_RNDN);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  sum = -13;
+  mpfr_set_str (z, "-.001e0", 2, MPFR_RNDN);
+  mpfr_si_sub (w, sum, z, MPFR_RNDN);
+  mpfr_set_zero (expect, 1);
+  TEST(prec, r, z, w, expect);
+
+  mpfr_clear (r);
+  mpfr_clear (z);
+  mpfr_clear (w);
+  mpfr_clear (expect);
+}
+
+
+static void
+test_beta_hardcoded (mpfr_prec_t prec)
+{
+  mpfr_t r, z, w, expect;
+  mpfr_prec_t oprec = 1;
+
+  if (prec < 10)
+    prec = 10;
+
+  mpfr_init2 (z, prec);
+  mpfr_init2 (w, prec);
+  mpfr_init2 (r, oprec);
+  mpfr_init2 (expect, oprec);
+
+  mpfr_set_ui (z, 3, MPFR_RNDN);
+  mpfr_set_ui (w, 3, MPFR_RNDN);
+  mpfr_set_str (expect, "1e-5", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDN);
+
+  mpfr_set_str (expect, "1.1e-5", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDU);
+
+  mpfr_set_str (expect, "1e-5", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDD);
+
+  mpfr_set_ui (z, 5, MPFR_RNDN);
+  mpfr_set_ui (w, 27, MPFR_RNDN);
+  mpfr_set_str (expect, "1e-20", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDN);
+
+  mpfr_set_str (expect, "1e-19", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDU);
+
+  mpfr_set_ui (z, 5, MPFR_RNDN);
+  mpfr_set_ui (w, 27, MPFR_RNDN);
+  mpfr_set_str (expect, "1e-20", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDN);
+
+
+  mpfr_set_ui (z, 121, MPFR_RNDN);
+  mpfr_set_ui (w, 2, MPFR_RNDN);
+  mpfr_set_str (expect, "1e-14", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDN);
+
+  mpfr_set_ui (z, 121, MPFR_RNDN);
+  mpfr_set_ui (w, 151, MPFR_RNDN);
+  mpfr_set_str (expect, "1e-271", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDN);
+
+  mpfr_set_str (expect, "1e-272", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDD);
+
+  mpfr_set_str (expect, "1e-271", 2, MPFR_RNDN);
+  TESTRND(prec, r, z, w, expect, MPFR_RNDU);
+
+  mpfr_clear (r);
+  mpfr_clear (z);
+  mpfr_clear (w);
+  mpfr_clear (expect);
+}
+
+/* makes sure beta(a, b) = beta(b, a) */
+static void
+test_beta_refl (mpfr_prec_t prec, mpfr_rnd_t rnd_mode)
+{
+  mpfr_t r, z, w, expect;
+
+  mpfr_init2 (z, prec);
+  mpfr_init2 (w, prec);
+  mpfr_init2 (r, prec);
+  mpfr_init2 (expect, prec);
+
+  mpfr_set_ui (z, 3, MPFR_RNDN);
+  mpfr_set_ui (w, 3, MPFR_RNDN);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+
+  mpfr_set_ui (z, 5, MPFR_RNDN);
+  mpfr_set_ui (w, 100, MPFR_RNDN);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+
+  mpfr_set_nan (z);
+  mpfr_set_ui (w, 100, MPFR_RNDN);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+
+  mpfr_set_nan (z);
+  mpfr_set_ui (w, 1, MPFR_RNDN);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+
+  mpfr_set_nan (z);
+  mpfr_set_nan (w);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+  mpfr_set_nan (z);
+  mpfr_set_nan (w);
+  mpfr_beta (expect, w, z, rnd_mode);
+  TESTRND(prec, r, z, w, expect, rnd_mode);
+
+  mpfr_clear (r);
+  mpfr_clear (z);
+  mpfr_clear (w);
+  mpfr_clear (expect);
+}
+
+#define TEST_FUNCTION mpfr_beta
+#define TWO_ARGS
+#define TEST_RANDOM_EMIN -16
+#define TEST_RANDOM_EMAX 16
+#include "tgeneric.c"
+
+int
+main (void)
+{
+  tests_start_mpfr ();
+
+  test_beta_special (10);
+  test_beta_special (100);
+  test_beta_special (1000);
+
+  test_beta_2exp (1, 10, 1);
+  test_beta_2exp (100, 40, 3);
+
+  test_beta_hardcoded (10);
+  test_beta_hardcoded (100);
+
+  test_beta_refl (1, MPFR_RNDN);
+  test_beta_refl (100, MPFR_RNDD);
+
+  test_beta_zw_sum_int (10);
+  test_beta_zw_sum_int (100);
+
+  test_generic (MPFR_PREC_MIN, 100, 20);
+
+  tests_end_mpfr ();
+  return 0;
+}
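Reviewer note on tests/tbeta.c: the expected values in test_beta_2exp and the first case of test_beta_hardcoded can be checked by hand. This sketch assumes the standard definition of the Beta function in terms of Gamma; the patch itself does not state it.

```latex
B(z, w) = \frac{\Gamma(z)\,\Gamma(w)}{\Gamma(z + w)} = B(w, z),
\qquad
B(z, 1) = \frac{\Gamma(z)\,\Gamma(1)}{\Gamma(z + 1)}
        = \frac{\Gamma(z)}{z\,\Gamma(z)} = \frac{1}{z}
\;\Longrightarrow\; B(2^{i}, 1) = 2^{-i},
\qquad
B(3, 3) = \frac{2!\,2!}{5!} = \frac{1}{30}.
```

Since 2^-5 < 1/30 < 2^-4 and 1/30 is closer to 2^-5, a 1-bit result of 2^-5 for MPFR_RNDN and MPFR_RNDD is consistent with the hardcoded "1e-5" expectations.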
diff --git a/tests/tbuildopt.c b/tests/tbuildopt.c
index 2ce7c97ae..47bd59885 100644
--- a/tests/tbuildopt.c
+++ b/tests/tbuildopt.c
@@ -80,9 +80,12 @@ check_gmpinternals_p (void)
 int
 main (void)
 {
+  tests_start_mpfr ();
+
   check_tls_p();
   check_decimal_p();
   check_gmpinternals_p();
 
+  tests_end_mpfr ();
   return 0;
 }
diff --git a/tests/tfmma.c b/tests/tfmma.c
index efc72a9c1..737f05768 100644
--- a/tests/tfmma.c
+++ b/tests/tfmma.c
@@ -103,7 +103,7 @@ random_tests (void)
       mpfr_urandomb (b, RANDS);
       mpfr_urandomb (c, RANDS);
       mpfr_urandomb (d, RANDS);
-      RND_LOOP (r)
+      RND_LOOP_NO_RNDF (r)
         random_test (a, b, c, d, (mpfr_rnd_t) r);
       mpfr_clears (a, b, c, d, (mpfr_ptr) 0);
     }
@@ -418,6 +418,57 @@ half_plus_half (void)
   set_emin (emin);
 }
 
+/* check that result has exponent in [emin, emax]
+   (see https://sympa.inria.fr/sympa/arc/mpfr/2017-04/msg00016.html)
+   Overflow detection in sub1.c was incorrect (only for UBF cases);
+   fixed in r11414. */
+static void
+bug20170405 (void)
+{
+  mpfr_t x, y, z;
+
+  mpfr_inits2 (866, x, y, z, (mpfr_ptr) 0);
+
+  mpfr_set_str_binary (x, "-0.10010101110110001111000110010100011001111011110001010100010000111110001010111110100001000000011010001000010000101110000000001100001011000110010000100111001100000101110000000001001101101101010110000110100010010111011001101101010011111000101100000010001100010000011100000000011110100010111011101011000101101011110110001010011001101110011101100001111000011000000011000010101010000101001001010000111101100001000001011110011000110010001100001101101001001010000111100101000010111001001101010011001110110001000010101001100000101010110000100100100010101011111001001100010001010110011000000001011110011000110001000100101000111010010111111110010111001101110101010010101101010100111001011100101101010111010011001000001010011001010001101000111011010010100110011001111111000011101111001010111001001011011011110101101001100011010001010110011100001101100100001001100111010100010100E768635654");
+  mpfr_set_str_binary (y, "-0.11010001010111110010110101010011000010010011010011011101100100110000110101100110111010001001110101110000011101100010110100100011001101111010100011111001011100111101110101101001000101011110101101101011010100110010111111010011011100101111110011001001010101011101111100011101100001010010011000110010110101001110010001101111111001100100000101010100110011101101101010011001000110100001001100000010110010101111000110110000111011000110001000100100100101111110001111100101011100100100110111010000010110110001110010001001101000000110111000101000110101111110000110001110100010101111010110001111010111111111010011001001100110011000110010110011000101110001010001101000100010000110011101010010010011110100000111100000101100110001111010000100011111000001101111110100000011011110010100010010011111111000010110000000011010011001100110001110111111010111110000111110010110011001000010E768635576");
+  /* since emax = 1073741821, x^2-y^2 should overflow */
+  mpfr_fmms (z, x, x, y, y, MPFR_RNDN);
+  MPFR_ASSERTN(mpfr_inf_p (z) && mpfr_sgn (z) > 0);
+
+  /* same test when z has a different precision */
+  mpfr_set_prec (z, 867);
+  mpfr_fmms (z, x, x, y, y, MPFR_RNDN);
+  MPFR_ASSERTN(mpfr_inf_p (z) && mpfr_sgn (z) > 0);
+
+  mpfr_set_prec (x, 564);
+  mpfr_set_prec (y, 564);
+  mpfr_set_prec (z, 2256);
+  mpfr_set_str_binary (x, "1.10010000111100110011001101111111101000111001011000110100110010000101000100010001000000111100010000101001011011111001111000110101111100101111001100001100011101100100011110000000011000010110111100111000100101010001011111010111011001110010001011101111001011001110110000010000011100010001010001011100100110111110101001001111001011101111110011101110101010110100010010111011111100010101111100011110111001011111101110101101101110100101111010000101011110100000000110111101000001100001000100010110100111010011011010110011100111010000101110010101111001011100110101100001100e-737194993");
+  mpfr_set_str_binary (y, "-1.00101000100001001101011110100010110011101010011011010111100110101011111100000100101000111010111101100100110010001110011011100100110110000001011001000111101111101111110101100110111000000011000001101001010100010010001110001000011010000100111001001100101111111100010101110101001101101101111010100011011110001000010000010100011000011000010110101100000111111110111001100100100001101011111011100101110111000100101010110100010011101010110010100110100111000000100111101101101000000011110000100110100100011000010011110010001010000110100011111101101101110001110001101101010e-737194903");
+  mpfr_fmma (z, x, y, x, y, MPFR_RNDN);
+  /* we should get -0 as result */
+  MPFR_ASSERTN(mpfr_zero_p (z) && mpfr_signbit (z));
+
+  mpfr_set_prec (x, 2);
+  mpfr_set_prec (y, 2);
+  mpfr_set_prec (z, 2);
+  /* (a+i*b)*(c+i*d) with:
+     a=0.10E1
+     b=0.10E-536870912
+     c=0.10E-536870912
+     d=0.10E1 */
+  mpfr_set_str_binary (x, "0.10E1"); /* x = a = d */
+  mpfr_set_str_binary (y, "0.10E-536870912"); /* y = b = c */
+  /* real part is a*c-b*d = x*y-y*x */
+  mpfr_fmms (z, x, y, y, x, MPFR_RNDN);
+  MPFR_ASSERTN(mpfr_zero_p (z) && !mpfr_signbit (z));
+  /* imaginary part is a*d+b*c = x*x+y*y */
+  mpfr_fmma (z, x, x, y, y, MPFR_RNDN);
+  MPFR_ASSERTN(mpfr_cmp_ui (z, 1) == 0);
+
+  mpfr_clears (x, y, z, (mpfr_ptr) 0);
+}
+
 int
 main (int argc, char *argv[])
 {
@@ -428,6 +479,7 @@ main (int argc, char *argv[])
   max_tests ();
   overflow_tests ();
   half_plus_half ();
+  bug20170405 ();
 
   tests_end_mpfr ();
   return 0;
diff --git a/tests/tinits.c b/tests/tinits.c
index 3c089ce22..61a6f3fb9 100644
--- a/tests/tinits.c
+++ b/tests/tinits.c
@@ -40,9 +40,21 @@ main (void)
   large_prec = 2147483647;
   if (getenv ("MPFR_CHECK_LARGEMEM") != NULL)
     {
+      size_t min_memory_limit;
+
       /* We assume that the precision won't be increased internally. */
       if (large_prec > MPFR_PREC_MAX)
         large_prec = MPFR_PREC_MAX;
+
+      /* Increase tests_memory_limit if need be in order to avoid an
+         obvious failure due to insufficient memory, by choosing a bit
+         more than the memory used for the variables a and b. Note
+         that such an increase is necessary, but is not guaranteed to
+         be sufficient in all cases (e.g. with logging activated). */
+      min_memory_limit = 2 * (large_prec / MPFR_BYTES_PER_MP_LIMB) + 65536;
+      if (tests_memory_limit > 0 && tests_memory_limit < min_memory_limit)
+        tests_memory_limit = min_memory_limit;
+
       mpfr_inits2 (large_prec, a, b, (mpfr_ptr) 0);
       mpfr_set_ui (a, 17, MPFR_RNDN);
       mpfr_set (b, a, MPFR_RNDN);
diff --git a/tests/tprintf.c b/tests/tprintf.c
index df3af0acf..7767a9052 100644
--- a/tests/tprintf.c
+++ b/tests/tprintf.c
@@ -120,6 +120,9 @@ check_long_string (void)
      in memory (~2.5 GB) */
   mpfr_t x;
   long large_prec = 2147483647;
+  size_t min_memory_limit, old_memory_limit;
+
+  old_memory_limit = tests_memory_limit;
 
   /* With a 32-bit (4GB) address space, a realloc failure has been noticed
      with a 2G precision (though allocating up to 4GB is possible):
@@ -134,6 +137,18 @@ check_long_string (void)
   if (large_prec > MPFR_PREC_MAX)
     large_prec = MPFR_PREC_MAX;
 
+  /* Increase tests_memory_limit if need be in order to avoid an
+     obvious failure due to insufficient memory. Note that such an
+     increase is necessary, but is not guaranteed to be sufficient
+     in all cases (e.g. with logging activated). */
+  min_memory_limit = large_prec / MPFR_BYTES_PER_MP_LIMB;
+  if (min_memory_limit > (size_t) -1 / 12)
+    min_memory_limit = (size_t) -1;
+  else
+    min_memory_limit *= 12;
+  if (tests_memory_limit > 0 && tests_memory_limit < min_memory_limit)
+    tests_memory_limit = min_memory_limit;
+
   mpfr_init2 (x, large_prec);
 
   mpfr_set_ui (x, 1, MPFR_RNDN);
@@ -146,6 +161,7 @@ check_long_string (void)
     }
 
   mpfr_clear (x);
+  tests_memory_limit = old_memory_limit;
 }
 
 static void
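Reviewer note on tests/tprintf.c: the min_memory_limit computation above multiplies by 12 only after checking against (size_t) -1 / 12, so the product saturates instead of wrapping. A minimal standalone sketch of that pattern follows; saturating_mul is a hypothetical helper name used for illustration and is not part of MPFR.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MPFR): multiply n by k, clamping the
   result to the largest size_t value instead of wrapping around.
   k must be nonzero. */
static size_t
saturating_mul (size_t n, size_t k)
{
  const size_t max = (size_t) -1;   /* same idiom as in the patch */

  if (n > max / k)
    return max;                     /* n * k would not fit in size_t */
  return n * k;
}
```

With this helper, the patch's logic would read as saturating_mul (large_prec / MPFR_BYTES_PER_MP_LIMB, 12). Whether a factor of 12 is enough in every configuration is not something this sketch addresses.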
diff --git a/tests/tsprintf.c b/tests/tsprintf.c
index a90cf708e..a2bb09c0b 100644
--- a/tests/tsprintf.c
+++ b/tests/tsprintf.c
@@ -329,6 +329,8 @@ decimal (void)
   /* sign or space, decimal point, left justified */
   check_sprintf (" 1.8E+07   ", "%- #11.1RDE", x);
   check_sprintf (" 1.E+07    ", "%- #11.0RDE", x);
+  /* large requested precision */
+  check_sprintf ("18993474.61279296875", "%.2147483647Rg", x);
 
   /* negative numbers */
   mpfr_mul_si (x, x, -1, MPFR_RNDD);
@@ -1316,6 +1318,54 @@ bug21056 (void)
   /* since trailing zeros are removed with %g, we get less digits */
   MPFR_ASSERTN(r == 309);
 
+  ndigits = INT_MAX;
+  r = mpfr_snprintf (0, 0, "%.*RDg", ndigits, x);
+  /* since trailing zeros are removed with %g, we get fewer digits */
+  MPFR_ASSERTN(r == 309);
+
+  ndigits = INT_MAX - 1;
+  r = mpfr_snprintf (0, 0, "%#.*RDg", ndigits, x);
+  MPFR_ASSERTN(r == ndigits + 1);
+
+  mpfr_clear (x);
+}
+
+/* Fails for i = 5, i.e. t[i] = (size_t) UINT_MAX + 1,
+   with r11427 on 64-bit machines (4-byte int, 8-byte size_t).
+   On such machines, t[5] converted to int typically gives 0.
+   Note: the assumed behavior corresponds to the snprintf behavior
+   in ISO C, but this conflicts with POSIX:
+     https://sourceware.org/bugzilla/show_bug.cgi?id=14771#c2
+     http://austingroupbugs.net/view.php?id=761
+*/
+static void
+snprintf_size (void)
+{
+  mpfr_t x;
+  char buf[12];
+  const char s[] = "17.00000000";
+  size_t t[] = { 11, 12, 64, INT_MAX, (size_t) INT_MAX + 1,
+                 (size_t) UINT_MAX + 1, (size_t) UINT_MAX + 2,
+                 (size_t) -1 };
+  int i, r;
+
+  mpfr_init2 (x, 64);
+  mpfr_set_ui (x, 17, MPFR_RNDN);
+
+  for (i = 0; i < sizeof (t) / sizeof (*t); i++)
+    {
+      memset (buf, 0, sizeof (buf));
+      /* r = snprintf (buf, t[i], "%.8f", 17.0); */
+      r = mpfr_snprintf (buf, t[i], "%.8Rf", x);
+      if (r != 11 || (t[i] > 11 && strcmp (buf, s) != 0))
+        {
+          printf ("Error in snprintf_size for i = %d:\n", i);
+          printf ("expected r = 11, \"%s\"\n", s);
+          printf ("got      r = %d, \"%s\"\n", r, buf);
+          exit (1);
+        }
+    }
+
   mpfr_clear (x);
 }
 
@@ -1341,6 +1391,7 @@ main (int argc, char **argv)
   check_emin ();
   test20161214 ();
   bug21056 ();
+  snprintf_size ();
 
 #if defined(HAVE_LOCALE_H) && defined(HAVE_SETLOCALE)
 #if MPFR_LCONV_DPTS
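Reviewer note on tests/tsprintf.c: snprintf_size relies on the ISO C rule that snprintf returns the length the full output would have had, whatever the buffer size. The sketch below shows the same rule with the standard snprintf, mirroring the commented-out call in the test. format_17 is a hypothetical helper, and the sketch assumes the default "C" locale and a size argument of at most 12.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not in the MPFR test suite): format 17.0 with
   "%.8f" into out, writing at most n bytes (n must be <= 12), and
   return the value returned by snprintf. */
static int
format_17 (char out[12], size_t n)
{
  memset (out, 0, 12);                      /* as in snprintf_size */
  return snprintf (out, n, "%.8f", 17.0);   /* full length is 11 */
}
```

The return value is 11 for every n, including n = 0; only the buffer contents depend on n, which is why the test compares buf with s only when t[i] > 11.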
diff --git a/tests/tsqrt.c b/tests/tsqrt.c
index 135b9fe1e..87c178208 100644
--- a/tests/tsqrt.c
+++ b/tests/tsqrt.c
@@ -655,6 +655,41 @@ testall_rndf (mpfr_prec_t pmax)
     }
 }
 
+/* test the case prec = GMP_NUMB_BITS */
+static void
+test_sqrt1n (void)
+{
+  mpfr_t r, u;
+  int inex;
+
+  mpfr_init2 (r, GMP_NUMB_BITS);
+  mpfr_init2 (u, GMP_NUMB_BITS);
+
+  inex = mpfr_set_ui_2exp (u, 17 * 17, 2 * GMP_NUMB_BITS - 10, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  inex = mpfr_sqrt (r, u, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  MPFR_ASSERTN(mpfr_cmp_ui_2exp (r, 17, GMP_NUMB_BITS - 5) == 0);
+
+  inex = mpfr_set_ui_2exp (u, 1, GMP_NUMB_BITS - 2, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  inex = mpfr_add_ui (u, u, 1, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  inex = mpfr_mul_2exp (u, u, GMP_NUMB_BITS, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  /* u = 2^(2*GMP_NUMB_BITS-2) + 2^GMP_NUMB_BITS, thus
+     u = r^2 + 2^GMP_NUMB_BITS with r = 2^(GMP_NUMB_BITS-1).
+     Should round to r+1 to nearest. */
+  inex = mpfr_sqrt (r, u, MPFR_RNDN);
+  MPFR_ASSERTN(inex > 0);
+  inex = mpfr_sub_ui (r, r, 1, MPFR_RNDN);
+  MPFR_ASSERTN(inex == 0);
+  MPFR_ASSERTN(mpfr_cmp_ui_2exp (r, 1, GMP_NUMB_BITS - 1) == 0);
+
+  mpfr_clear (r);
+  mpfr_clear (u);
+}
+
 #define TEST_FUNCTION test_sqrt
 #define TEST_RANDOM_POS 8
 #include "tgeneric.c"
@@ -793,6 +828,7 @@ main (void)
 
   bug20160120 ();
   bug20160908 ();
+  test_sqrt1n ();
 
   tests_end_mpfr ();
   return 0;
diff --git a/tests/tstrtofr.c b/tests/tstrtofr.c
index e21f11add..ab5ffd46d 100644
--- a/tests/tstrtofr.c
+++ b/tests/tstrtofr.c
@@ -1189,24 +1189,63 @@ bug20120829 (void)
   mpfr_clears (e, x1, x2, (mpfr_ptr) 0);
 }
 
-/* Note: the number is 5^47/2^9. */
+/* https://sympa.inria.fr/sympa/arc/mpfr/2016-12/msg00043.html
+   mpfr_strtofr can return an incorrect ternary value.
+   Note: As a consequence, the value can also be incorrect if the current
+   exponent range is not the maximum one (since the ternary value is used
+   to resolve double rounding in mpfr_check_range); this can happen only
+   if the value is a midpoint between 0 and the minimum positive number
+   or the opposite. */
 static void
 bug20161217 (void)
 {
   mpfr_t fp, z;
   static const char * num = "0.1387778780781445675529539585113525390625e31";
+  /* The above number is 5^47/2^9. */
   int inex;
 
   mpfr_init2 (fp, 110);
   mpfr_init2 (z, 110);
+
   inex = mpfr_strtofr (fp, num, NULL, 10, MPFR_RNDN);
   MPFR_ASSERTN(inex == 0);
   mpfr_set_str_binary (z, "10001100001000010011110110011101101001010000001011011110010001010100010100100110111101000010001011001100001101E-9");
   MPFR_ASSERTN(mpfr_equal_p (fp, z));
+
+  /* try with 109 bits */
+  mpfr_set_prec (fp, 109);
+  inex = mpfr_strtofr (fp, num, NULL, 10, MPFR_RNDN);
+  MPFR_ASSERTN(inex < 0);
+  mpfr_set_str_binary (z, "10001100001000010011110110011101101001010000001011011110010001010100010100100110111101000010001011001100001100E-9");
+  MPFR_ASSERTN(mpfr_equal_p (fp, z));
+
   mpfr_clear (fp);
   mpfr_clear (z);
 }
 
+/* Check that a bug in MPFR 3.1.5 is fixed, cf.
+   https://sympa.inria.fr/sympa/arc/mpfr/2017-03/msg00009.html
+   Note: same bug as bug20161217. See also the comments of bug20161217;
+   here, this is a case where the value is incorrect. */
+static void
+bug20170308 (void)
+{
+  mpfr_exp_t emin;
+  /* the following is slightly larger than 2^-1075, thus should be rounded
+     to 0.5*2^-1074, with ternary value < 0 */
+  char str[] = "2.47032822920623272089E-324";
+  mpfr_t z;
+  int inex;
+
+  emin = mpfr_get_emin ();
+  mpfr_set_emin (-1073);
+  mpfr_set_emin (emin);
+  mpfr_init2 (z, 53);
+  inex = mpfr_strtofr (z, str, NULL, 10, MPFR_RNDN);
+  MPFR_ASSERTN(inex < 0 && mpfr_cmp_ui_2exp (z, 1, -1075) == 0);
+  mpfr_clear (z);
+}
+
 int
 main (int argc, char *argv[])
 {
@@ -1222,6 +1261,7 @@ main (int argc, char *argv[])
   bug20120814 ();
   bug20120829 ();
   bug20161217 ();
+  bug20170308 ();
 
   tests_end_mpfr ();
   return 0;
diff --git a/tests/tsub1sp.c b/tests/tsub1sp.c
index 52ec3398a..bace43931 100644
--- a/tests/tsub1sp.c
+++ b/tests/tsub1sp.c
@@ -167,7 +167,7 @@ compare_sub_sub1sp (void)
   mpfr_prec_t p;
   unsigned long d;
   int i, inex_ref, inex;
-  mpfr_rnd_t r;
+  int r;
 
   for (p = 1; p <= 3*GMP_NUMB_BITS; p++)
     {
@@ -194,14 +194,14 @@ compare_sub_sub1sp (void)
               {
                 /* increase the precision of b to ensure sub1sp is not used */
                 mpfr_prec_round (b, p + 1, MPFR_RNDN);
-                inex_ref = mpfr_sub (a_ref, b, c, r);
+                inex_ref = mpfr_sub (a_ref, b, c, (mpfr_rnd_t) r);
                 inex = mpfr_prec_round (b, p, MPFR_RNDN);
                 MPFR_ASSERTN(inex == 0);
-                inex = mpfr_sub1sp (a, b, c, r);
+                inex = mpfr_sub1sp (a, b, c, (mpfr_rnd_t) r);
                 if (inex != inex_ref)
                   {
                     printf ("mpfr_sub and mpfr_sub1sp differ for r=%s\n",
-                            mpfr_print_rnd_mode (r));
+                            mpfr_print_rnd_mode ((mpfr_rnd_t) r));
                     printf ("b="); mpfr_dump (b);
                     printf ("c="); mpfr_dump (c);
                     printf ("expected inex=%d and ", inex_ref);
diff --git a/tests/tzeta.c b/tests/tzeta.c
index 542eabf80..d670c0a3b 100644
--- a/tests/tzeta.c
+++ b/tests/tzeta.c
@@ -184,6 +184,48 @@ test2 (void)
   mpfr_clears (x, y, (mpfr_ptr) 0);
 }
 
+/* The following test attempts to trigger an intermediate overflow in
+   Gamma(s1) in the reflection formula with a 32-bit ABI (the example
+   depends on the extended exponent range): r10804 fails when the
+   exponent field is on 32 bits. */
+static void
+intermediate_overflow (void)
+{
+  mpfr_t x, y1, y2;
+  mpfr_flags_t flags1, flags2;
+  int inex1, inex2;
+
+  mpfr_inits2 (64, x, y1, y2, (mpfr_ptr) 0);
+
+  mpfr_set_si (x, -44787928, MPFR_RNDN);
+  mpfr_nextabove (x);
+
+  mpfr_set_str (y1, "0x3.0a6ab0ab281742acp+954986780", 0, MPFR_RNDN);
+  inex1 = -1;
+  flags1 = MPFR_FLAGS_INEXACT;
+
+  mpfr_clear_flags ();
+  inex2 = mpfr_zeta (y2, x, MPFR_RNDN);
+  flags2 = __gmpfr_flags;
+
+  if (!(mpfr_equal_p (y1, y2) &&
+        SAME_SIGN (inex1, inex2) &&
+        flags1 == flags2))
+    {
+      printf ("Error in intermediate_overflow\n");
+      printf ("Expected ");
+      mpfr_dump (y1);
+      printf ("with inex = %d and flags =", inex1);
+      flags_out (flags1);
+      printf ("Got      ");
+      mpfr_dump (y2);
+      printf ("with inex = %d and flags =", inex2);
+      flags_out (flags2);
+      exit (1);
+    }
+  mpfr_clears (x, y1, y2, (mpfr_ptr) 0);
+}
+
 #define TEST_FUNCTION mpfr_zeta
 #define TEST_RANDOM_EMIN -48
 #define TEST_RANDOM_EMAX 31
@@ -198,6 +240,7 @@ main (int argc, char *argv[])
   mpfr_t s, y, z;
   mpfr_prec_t prec;
   mpfr_rnd_t rnd_mode;
+  mpfr_flags_t flags;
   int inex;
 
   tests_start_mpfr ();
@@ -411,6 +454,24 @@ main (int argc, char *argv[])
         }
     }
 
+  /* The following test yields an overflow in the error computation.
+     With r10864, this is detected and one gets an assertion failure. */
+  mpfr_set_prec (s, 1025);
+  mpfr_set_si_2exp (s, -1, 1024, MPFR_RNDN);
+  mpfr_nextbelow (s);  /* -(2^1024 + 1) */
+  mpfr_clear_flags ();
+  inex = mpfr_zeta (z, s, MPFR_RNDN);
+  flags = __gmpfr_flags;
+  if (flags != (MPFR_FLAGS_OVERFLOW | MPFR_FLAGS_INEXACT) ||
+      ! mpfr_inf_p (z) || MPFR_IS_POS (z) || inex >= 0)
+    {
+      printf ("Error in mpfr_zeta for s = -(2^1024 + 1)\nGot ");
+      mpfr_dump (z);
+      printf ("with inex = %d and flags =", inex);
+      flags_out (flags);
+      exit (1);
+    }
+
   mpfr_clear (s);
   mpfr_clear (y);
   mpfr_clear (z);
@@ -421,6 +482,8 @@ main (int argc, char *argv[])
   test_generic (MPFR_PREC_MIN, 70, 1);
   test2 ();
 
+  intermediate_overflow ();
+
   tests_end_mpfr ();
   return 0;
 }
diff --git a/tools/mbench/Makefile b/tools/mbench/Makefile
index c126e84c1..826f2862a 100644
--- a/tools/mbench/Makefile
+++ b/tools/mbench/Makefile
@@ -1,4 +1,4 @@
-# Copyright 2005-2010, 2014 Free Software Foundation, Inc.
+# Copyright 2005-2017 Free Software Foundation, Inc.
 # Contributed by Patrick Pelissier, INRIA.
 # 
 # This file is part of the MPFR Library.
@@ -22,7 +22,11 @@ AR=ar
 CC=gcc
 CXX=g++
 RANLIB=ranlib
-CFLAGS=-O2 -fomit-frame-pointer -Wall -g -static
+# Added -march=native to CFLAGS to properly detect CPUs
+# (the -march=native option is not supported on all architectures,
+# but if this is an issue, this could be solved by a test of the
+# architecture with an ifeq).
+CFLAGS=-O2 -fomit-frame-pointer -Wall -g -static -march=native
 LDFLAGS=
 RM=rm -f
 CP=cp -f
@@ -38,6 +42,8 @@ ARPREC=$(GMP)
 CRLIBM=$(GMP)
 LIDIA=/localdisk/lidia/
 
+GMP_INCLUDE=$(GMP)/include
+
 MPFR_INCLUDE=$(MPFR)/include
 MPFR_LIB=$(MPFR)/lib
 
diff --git a/tools/mbench/mfv5-mpfr.cc b/tools/mbench/mfv5-mpfr.cc
index 48496bae0..4e417fd8a 100644
--- a/tools/mbench/mfv5-mpfr.cc
+++ b/tools/mbench/mfv5-mpfr.cc
@@ -135,16 +135,14 @@ public:
   }
 };
 
-#ifdef mpfr_fmma
+#if MPFR_VERSION_MAJOR >= 4
 class mpfr_fmma_test {
 public:
   int func(mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_srcptr d, mpfr_srcptr e, mp_rnd_t r) {
     return mpfr_fmma (a,b,c,d,e,r);
   }
 };
-#endif
 
-#ifdef mpfr_fmms
 class mpfr_fmms_test {
 public:
   int func(mpfr_ptr a, mpfr_srcptr b, mpfr_srcptr c, mpfr_srcptr d, mpfr_srcptr e, mp_rnd_t r) {
@@ -281,10 +279,8 @@ static mpfr_test<mpfr_sub_test> test2 ("mpfr_sub");
 static mpfr_test<mpfr_mul_test> test3 ("mpfr_mul");
 static mpfr_test3<mpfr_fma_test> test10 ("mpfr_fma");
 static mpfr_test3<mpfr_fms_test> test11 ("mpfr_fms");
-#ifdef mpfr_fmma
+#if MPFR_VERSION_MAJOR >= 4
 static mpfr_test4<mpfr_fmma_test> test12 ("mpfr_fmma");
-#endif
-#ifdef mpfr_fmms
 static mpfr_test4<mpfr_fmms_test> test13 ("mpfr_fmms");
 #endif
 static mpfr_test<mpfr_sqr_test> test14 ("mpfr_sqr");
diff --git a/tools/mbench/timp.h b/tools/mbench/timp.h
index 1a0151227..858f803b4 100644
--- a/tools/mbench/timp.h
+++ b/tools/mbench/timp.h
@@ -1,5 +1,5 @@
 /*
-Copyright 2005-2009 Free Software Foundation, Inc.
+Copyright 2005-2017 Free Software Foundation, Inc.
 Contributed by Patrick Pelissier, INRIA.
 
 This file is part of the MPFR Library.
@@ -51,8 +51,9 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 
 #elif defined (__i386__) || defined(__amd64__)
 
-#if !defined(corei7)
+#if !defined(corei7) && !defined(__core_avx2__)
 
+/* the following implements Section 3.2.3 of the article cited below */
 #define timp_rdtsc_before(time)           \
         __asm__ __volatile__(             \
                 ".align 64\n\t"           \
@@ -81,7 +82,8 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
                 : "eax", "ebx", "ecx", "edx", "memory")
 #else
 
-/* corei7 offers newer instruction rdtscp, which should be better */
+/* corei7 and corei5 offer newer instruction rdtscp, which should be better,
+   see https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-32-ia-64-benchmark-code-execution-paper.pdf */
 #define timp_rdtsc_before(time)           \
         __asm__ __volatile__(             \
                 ".align 64\n\t"           \
@@ -130,11 +132,11 @@ http://www.gnu.org/licenses/ or write to the Free Software Foundation, Inc.,
 #endif
 
 /* We do several measures and keep the minimum to avoid counting
- * hardware interruption cycles.
+ * hardware interrupt cycles.
  * The filling of the CPU cache is done because we do several loops,
  * and get the minimum.
  * Declaring num_cycle as "volatile" is to avoid optimization when it is
- * possible (To properly calcul overhead).
+ * possible (to properly compute overhead).
  * overhead is calculated outside by a call to:
  *   overhead = MEASURE("overhead", ;)
  * Use a lot the preprocessor.
diff --git a/tools/mpfrlint b/tools/mpfrlint
index 801f2c609..fdc19fc5b 100755
--- a/tools/mpfrlint
+++ b/tools/mpfrlint
@@ -211,7 +211,8 @@ done
 # Even on platforms where it is available, the prototype
 # may not be included (e.g. with gcc -ansi), so that the
 # code may be compiled incorrectly.
-err-if-output -t "snprintf" grep '[^a-z_]snprintf *([^)]' $srctests
+grep '[^a-z_]snprintf *([^)]' $srctests | \
+  err-if-output -t "snprintf" grep -v '/\*.*[^a-z_]snprintf *([^)]'
 
 # Constant checking should use either MPFR_STAT_STATIC_ASSERT
 # or MPFR_ASSERTN(0) for not yet implemented corner cases.
@@ -288,6 +289,23 @@ do
     { echo "Missing '#include \"mpfr-impl.h\"' in $file?" && err=1 }
 done
 
+# Check that the usual test programs call tests_start_mpfr and tests_end_mpfr.
+tprg=($(sed -n '/^check_PROGRAMS/,/[^\\]$/ {
+  s/.*=//
+  s/\\//
+  p
+  }' tests/Makefile.am))
+[[ -n $tprg ]]
+for t in $tprg
+do
+  [[ $t != mpf*_compat ]] || continue
+  tc=tests/$t.c
+  for fn in tests_start_mpfr tests_end_mpfr
+  do
+    err-if-output "missing call to $fn in $tc" grep -q "$fn *();" $tc
+  done
+done
+
 # mpfr_printf-like functions shouldn't be used in the tests,
 # as they need <stdarg.h> (HAVE_STDARG defined).
 for file in tests/*.c
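
The mpfrlint hunk above adds a check that each program listed in check_PROGRAMS calls tests_start_mpfr and tests_end_mpfr, skipping the mpf*_compat programs. The logic can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script's actual implementation: the helper names (list_test_programs, missing_calls, check) are hypothetical, the inputs are in-memory strings rather than files, and the continuation handling only roughly mirrors the sed range expression.

```python
import fnmatch
import re


def list_test_programs(makefile_am):
    """Collect the names assigned to check_PROGRAMS, following backslash continuations."""
    names = []
    collecting = False
    for line in makefile_am.splitlines():
        if not collecting:
            if not line.startswith("check_PROGRAMS"):
                continue
            collecting = True
            # Drop everything up to the '=' on the first line of the assignment.
            line = line.split("=", 1)[1] if "=" in line else ""
        continued = line.rstrip().endswith("\\")
        names.extend(line.replace("\\", " ").split())
        if not continued:
            break
    return names


def missing_calls(source, required=("tests_start_mpfr", "tests_end_mpfr")):
    """Return the required functions with no 'fn ();' style call in the source text."""
    return [fn for fn in required
            if not re.search(re.escape(fn) + r" *\(\);", source)]


def check(makefile_am, sources):
    """Report missing calls; 'sources' maps a path to file contents (hypothetical input)."""
    problems = []
    for name in list_test_programs(makefile_am):
        if fnmatch.fnmatchcase(name, "mpf*_compat"):
            continue  # the compat programs are exempt, as in the hunk
        path = "tests/%s.c" % name
        for fn in missing_calls(sources.get(path, "")):
            problems.append("missing call to %s in %s" % (fn, path))
    return problems
```

For instance, calling check with a Makefile.am fragment listing tadd and tsub, where the text for tests/tsub.c lacks a tests_end_mpfr call, yields a single message naming that file.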