summaryrefslogtreecommitdiff
path: root/sysdeps/powerpc/powerpc32/power4
Commit message (Collapse)AuthorAgeFilesLines
* powerpc: Remove strncmp variantsAdhemerval Zanella Netto2023-03-026-321/+1
| | | | | | | | | | | | | The default, power4, and power7 implementation just adds word aligned access when inputs have the same aligment. The unaligned case is still done by byte operations. This is already covered by the generic implementation, which also add the unaligned input optimization. Checked on powerpc-linux-gnu built without multi-arch for powerpc, power4, and power7. Reviewed-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
* string: Add libc_hidden_proto for memrchrAdhemerval Zanella2023-02-082-6/+16
| | | | | | | Although static linker can optimize it to local call, it follows the internal scheme to provide hidden proto and definitions. Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
* string: Add libc_hidden_proto for strchrnulAdhemerval Zanella2023-02-082-8/+16
| | | | | | | Although static linker can optimize it to local call, it follows the internal scheme to provide hidden proto and definitions. Reviewed-by: Carlos Eduardo Seo <carlos.seo@linaro.org>
* string: Improve generic strnlen with memchrAdhemerval Zanella2023-02-061-7/+7
| | | | | | | | It also cleanups the multiple inclusion by leaving the ifunc implementation to undef the weak_alias and libc_hidden_def. Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* string: Improve generic memchrAdhemerval Zanella2023-02-061-10/+4
| | | | | | | | | | | | | | | | | New algorithm read the first aligned address and mask off the unwanted bytes (this strategy is similar to arch-specific implementations used on powerpc, sparc, and sh). The loop now read word-aligned address and check using the has_eq macro. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc-linux-gnu, and powerpc64-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* string: Improve generic strchrnulAdhemerval Zanella2023-02-061-4/+0
| | | | | | | | | | | | | | | | | New algorithm read the first aligned address and mask off the unwanted bytes (this strategy is similar to arch-specific implementations used on powerpc, sparc, and sh). The loop now read word-aligned address and check using the has_zero_eq function. Checked on x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu, and powerpc-linux-gnu by removing the arch-specific assembly implementation and disabling multi-arch (it covers both LE and BE for 64 and 32 bits). Co-authored-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Noah Goldstein <goldstein.w.n@gmail.com>
* Parameterize OP_T_THRES from memcopy.hRichard Henderson2023-02-061-5/+0
| | | | | | | | | | | | It moves OP_T_THRES out of memcopy.h to its own header and adjust each architecture that redefines it. Checked with a build and check with run-built-tests=no for all major Linux ABIs. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
* Update copyright dates with scripts/update-copyrightsJoseph Myers2023-01-0698-98/+98
|
* Disable use of -fsignaling-nans if compiler does not support itAdhemerval Zanella2022-11-011-2/+2
| | | | Reviewed-by: Fangrui Song <maskray@google.com>
* Add bounds check to __libc_ifunc_impl_listWilco Dijkstra2022-06-101-7/+2
| | | | | | | | | | | | Add a proper bounds check to __libc_ifunc_impl_list. This makes MAX_IFUNC redundant and fixes several targets that will write outside the array. To avoid unnecessary large diffs, pass the maximum in the argument 'i' to IFUNC_IMPL_ADD - 'max' can be used in new ifunc definitions and existing ones can be updated if desired. Passes buildmanyglibc. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
* powerpc: Remove powerpc32 bzero optimizationsAdhemerval Zanella2022-02-236-131/+2
| | | | | The symbol is not present in current POSIX specification and compiler already generates memset call.
* Update copyright dates with scripts/update-copyrightsPaul Eggert2022-01-01102-102/+102
| | | | | | | | | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 7061 files FOO. I then removed trailing white space from math/tgmath.h, support/tst-support-open-dev-null-range.c, and sysdeps/x86_64/multiarch/strlen-vec.S, to work around the following obscure pre-commit check failure diagnostics from Savannah. I don't know why I run into these diagnostics whereas others evidently do not. remote: *** 912-#endif remote: *** 913: remote: *** 914- remote: *** error: lines with trailing whitespace found ... remote: *** error: sysdeps/unix/sysv/linux/statx_cp.c: trailing lines
* math: Remove powerpc e_hypotAdhemerval Zanella2021-12-137-162/+1
| | | | | | | | | | | | | | | | | | | | | | | The generic implementation is shows only slight worse performance: POWER10 reciprocal-throughput latency master 8.28478 13.7253 new hypot 7.21945 13.1933 POWER9 reciprocal-throughput latency master 13.4024 14.0967 new hypot 14.8479 15.8061 POWER8 reciprocal-throughput latency master 15.5767 16.8885 new hypot 16.5371 18.4057 One way to improve might to make gcc generate xsmaxdp/xsmindp for fmax/fmin (it onl does for -ffast-math, clang does for default options). Checked on powerpc64-linux-gnu (power8) and powerpc64le-linux-gnu (power9).
* String: Add hidden defs for __memcmpeq() to enable internal usageNoah Goldstein2021-10-262-0/+4
| | | | | | | | No bug. This commit adds hidden defs for all declarations of __memcmpeq. This enables usage of __memcmpeq without the PLT for usage internal to GLIBC.
* String: Add support for __memcmpeq() ABI on all targetsNoah Goldstein2021-10-263-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | No bug. This commit adds support for __memcmpeq() as a new ABI for all targets. In this commit __memcmpeq() is implemented only as an alias to the corresponding targets memcmp() implementation. __memcmpeq() is added as a new symbol starting with GLIBC_2.35 and defined in string.h with comments explaining its behavior. Basic tests that it is callable and works where added in string/tester.c As discussed in the proposal "Add new ABI '__memcmpeq()' to libc" __memcmpeq() is essentially a reserved namespace for bcmp(). The means is shares the same specifications as memcmp() except the return value for non-equal byte sequences is any non-zero value. This is less strict than memcmp()'s return value specification and can be better optimized when a boolean return is all that is needed. __memcmpeq() is meant to only be called by compilers if they can prove that the return value of a memcmp() call is only used for its boolean value. All tests in string/tester.c passed. As well build succeeds on x86_64-linux-gnu target.
* Remove "Contributed by" linesSiddhesh Poyarekar2021-09-039-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | We stopped adding "Contributed by" or similar lines in sources in 2012 in favour of git logs and keeping the Contributors section of the glibc manual up to date. Removing these lines makes the license header a bit more consistent across files and also removes the possibility of error in attribution when license blocks or files are copied across since the contributed-by lines don't actually reflect reality in those cases. Move all "Contributed by" and similar lines (Written by, Test by, etc.) into a new file CONTRIBUTED-BY to retain record of these contributions. These contributors are also mentioned in manual/contrib.texi, so we just maintain this additional record as a courtesy to the earlier developers. The following scripts were used to filter a list of files to edit in place and to clean up the CONTRIBUTED-BY file respectively. These were not added to the glibc sources because they're not expected to be of any use in future given that this is a one time task: https://gist.github.com/siddhesh/b5ecac94eabfd72ed2916d6d8157e7dc https://gist.github.com/siddhesh/15ea1f5e435ace9774f485030695ee02 Reviewed-by: Carlos O'Donell <carlos@redhat.com>
* Update copyright dates with scripts/update-copyrightsPaul Eggert2021-01-02108-108/+108
| | | | | | | | | | | | | | | | I used these shell commands: ../glibc/scripts/update-copyrights $PWD/../gnulib/build-aux/update-copyright (cd ../glibc && git commit -am"[this commit message]") and then ignored the output, which consisted lines saying "FOO: warning: copyright statement not found" for each of 6694 files FOO. I then removed trailing white space from benchtests/bench-pthread-locks.c and iconvdata/tst-iconv-big5-hkscs-to-2ucs4.c, to work around this diagnostic from Savannah: remote: *** pre-commit check failed ... remote: *** error: lines with trailing whitespace found remote: error: hook declined to update refs/heads/master
* powerpc: Protect dl_powerpc_cpu_features on INIT_ARCH() [BZ #26615]Raphael Moreira Zinsly2020-09-221-1/+1
| | | | | | | dl_powerpc_cpu_features also needs to be protected by __GLRO to check for the _rtld_global_ro realocation before accessing it. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
* Add libm_alias_finite for _finite symbolsWilco Dijkstra2020-01-036-16/+4
| | | | | | | | | | | | | | | | | | | This patch adds a new macro, libm_alias_finite, to define all _finite symbol. It sets all _finite symbol as compat symbol based on its first version (obtained from the definition at built generated first-versions.h). The <fn>f128_finite symbols were introduced in GLIBC 2.26 and so need special treatment in code that is shared between long double and float128. It is done by adding a list, similar to internal symbol redifinition, on sysdeps/ieee754/float128/float128_private.h. Alpha also needs some tricky changes to ensure we still emit 2 compat symbols for sqrt(f). Passes buildmanyglibc. Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Reviewed-by: Siddhesh Poyarekar <siddhesh@sourceware.org>
* Update copyright dates with scripts/update-copyrights.Joseph Myers2020-01-01108-108/+108
|
* Prefer https to http for gnu.org and fsf.org URLsPaul Eggert2019-09-07108-108/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Also, change sources.redhat.com to sourceware.org. This patch was automatically generated by running the following shell script, which uses GNU sed, and which avoids modifying files imported from upstream: sed -ri ' s,(http|ftp)(://(.*\.)?(gnu|fsf|sourceware)\.org($|[^.]|\.[^a-z])),https\2,g s,(http|ftp)(://(.*\.)?)sources\.redhat\.com($|[^.]|\.[^a-z]),https\2sourceware.org\4,g ' \ $(find $(git ls-files) -prune -type f \ ! -name '*.po' \ ! -name 'ChangeLog*' \ ! -path COPYING ! -path COPYING.LIB \ ! -path manual/fdl-1.3.texi ! -path manual/lgpl-2.1.texi \ ! -path manual/texinfo.tex ! -path scripts/config.guess \ ! -path scripts/config.sub ! -path scripts/install-sh \ ! -path scripts/mkinstalldirs ! -path scripts/move-if-change \ ! -path INSTALL ! -path locale/programs/charmap-kw.h \ ! -path po/libc.pot ! -path sysdeps/gnu/errlist.c \ ! '(' -name configure \ -execdir test -f configure.ac -o -f configure.in ';' ')' \ ! '(' -name preconfigure \ -execdir test -f preconfigure.ac ';' ')' \ -print) and then by running 'make dist-prepare' to regenerate files built from the altered files, and then executing the following to cleanup: chmod a+x sysdeps/unix/sysv/linux/riscv/configure # Omit irrelevant whitespace and comment-only changes, # perhaps from a slightly-different Autoconf version. git checkout -f \ sysdeps/csky/configure \ sysdeps/hppa/configure \ sysdeps/riscv/configure \ sysdeps/unix/sysv/linux/csky/configure # Omit changes that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/powerpc/powerpc64/ppc-mcount.S: trailing lines git checkout -f \ sysdeps/powerpc/powerpc64/ppc-mcount.S \ sysdeps/unix/sysv/linux/s390/s390-64/syscall.S # Omit change that caused a pre-commit check to fail like this: # remote: *** error: sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S: last line does not end in newline git checkout -f sysdeps/sparc/sparc64/multiarch/memcpy-ultra3.S
* powerpc: refactor logb{f,l}Adhemerval Zanella2019-07-083-21/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The power7 logb implementation does not show a performance gain on ISA 2.07+ chips with faster floating-point to GRP instructions (currently POWER8 and POWER9). This patch moves the POWER7 implementation to generic one and enables it for POWER7. It also add some cleanup to use inline floating-point number instead of define them using static const. The performance difference is for POWER9: - Without patch: "logb": { "subnormal": { "duration": 4.99202e+09, "iterations": 8.83662e+08, "max": 75.194, "min": 5.501, "mean": 5.64925 }, "normal": { "duration": 4.97063e+09, "iterations": 9.97094e+08, "max": 46.489, "min": 4.956, "mean": 4.98512 } } - With patch: "logb": { "subnormal": { "duration": 4.97226e+09, "iterations": 9.92036e+08, "max": 77.209, "min": 4.892, "mean": 5.01218 }, "normal": { "duration": 4.96192e+09, "iterations": 1.07545e+09, "max": 12.361, "min": 4.593, "mean": 4.61382 } } The ifunc implementation is also enabled only for powerpc64. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/power7/fpu/s_logb.c: Move to ... * sysdeps/powerpc/fpu/s_logb.c: ... here. Use inline FP constants. * sysdeps/powerpc/power7/fpu/s_logbf.c: Move to ... * sysdeps/powerpc/fpu/s_logbf.c: ... here. Use inline FP constants. * sysdeps/powerpc/power7/fpu/s_logbl.c: Move to ... * sysdeps/powerpc/fpu/s_logbl.c: ... here. Use inline FP constants. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logb-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbl-power7.c: Adjust implementation path. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_log* objects. (CFLAGS-s_logbf-power7.c, CFLAGS-s_logbl-power7.c, CFLAGS-s_logb-power7.c): New fule. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logb.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logb.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-power7.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-power7.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbl.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_logbl.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile: Remove file. * sysdeps/powerpc/powerpc64/power7/fpu/s_logb.c: Remove file. * sysdeps/powerpc/powerpc64/power7/fpu/s_logbf.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_logbl.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Refactor modf{f}Adhemerval Zanella2019-07-082-20/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The modf{f} optimization is not an optimization for ISA 2.07+. This patch move the IFUNC for powerpc64 only, move the power5+ to generic location, and include the generic implementation for ISA 2.07+. The performance changes are based on modf benchtests: * POWER9 - ppc64 "modf": { "": { "duration": 4.97057e+09, "iterations": 1.00688e+09, "max": 28.76, "min": 4.912, "mean": 4.9366 } } * POWER9 - power5+ "modf": { "": { "duration": 4.98291e+09, "iterations": 9.32818e+08, "max": 15.058, "min": 5.107, "mean": 5.34178 } } * POWER8 - ppc64 "modf": { "": { "duration": 5.05329e+09, "iterations": 8.38814e+08, "max": 518.051, "min": 5.79, "mean": 6.02433 } } * POWER8 - power5+ "modf": { "": { "duration": 5.05573e+09, "iterations": 8.35254e+08, "max": 63.141, "min": 5.873, "mean": 6.05293 } } * POWER7 - ppc64 "modf": { "": { "duration": 4.89818e+09, "iterations": 1.08408e+09, "max": 57.556, "min": 3.953, "mean": 4.51827 } } * POWER7 - power5+ "modf": { "": { "duration": 4.83789e+09, "iterations": 1.33409e+09, "max": 46.608, "min": 2.224, "mean": 3.62636 } } Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/power5+/fpu/s_modf.c: Move to ... * sysdeps/powerpc/fpu/s_modf.c: ... here. Add ISA 2.07 optimization. * sysdeps/powerpc/power5+/fpu/s_modff.c: Move to ... * sysdeps/powerpc/fpu/s_modff.c: ... here. Add ISA 2.07 optimization. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modf-power5+.c: Adjust include. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff-power5+.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (sysdep_calls, sysdep_routines): Add s_modf* objects. (CFLAGS-s_modf-power5+.c, CFLAGS-s_modff-power5+.c, CFLAGS-s_modf-ppc64.c, CFLAGS-s_modff-ppc64.c): New rule. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf-power5+.c: Movo to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf-power5+.c: Move ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-power5+.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-power5+.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff-ppc64.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff-ppc64.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_modff.c: ... here. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Refactor powerpc32 lround/lroundf/llround/llroundfAdhemerval Zanella2019-06-2613-263/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llround{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llround{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lround.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llround.c (__llround): Add power5+ and fctidz optimization. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lround.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llround-power6.c, CFLAGS-s_llround-power5+.c, CFLAGS-s_llround-ppc32.c, CFLAGS-s_lround-ppc32.c, CFLAGS-s_lround-power5+.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lround-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llroundf.S: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llroundf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: Refactor powerpc32 lrint/lrintf/llrint/llrintfAdhemerval Zanella2019-06-1713-223/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc llrint{f} implementations on the generic sysdeps/powerpc/powerpc32/fpu/s_llrint{f}. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cpu and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_lrintf.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/s_lrintf.c: Move to ... * sysdeps/powerpc/fpu/s_lrintf.c: ... here. * sysdeps/powerpc/powerpc32/fpu/Makefile [$(subdir) == math] (CFLAGS-s_lrint.c): New rule. * sysdeps/powerpc/powerpc32/fpu/s_llrint.c (__llrint): Add power4 optimization. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c (__llrintf): Likewise. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_lrint.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_llrintf-power6.c, CFLAGS-s_llrintf-ppc32.c, CFLAGS-s_llrint-power6.c, CFLAGS-s_llrint-ppc32.c, CFLAGS-s_lrint-ppc32.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrint.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-power6.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrint-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrint-ppc32.c: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized finiteAdhemerval Zanella2019-06-126-193/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc finite optimization do not show much gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc64-linux-gnu-power7 (--with-cpu=power7): - Generic sysdeps/ieee754 implementation: "isfinite": { "": { "duration": 5.0082e+09, "iterations": 2.45299e+09, "max": 43.824, "min": 2.008, "mean": 2.04167 }, "INF": { "duration": 4.66554e+09, "iterations": 2.28288e+09, "max": 35.73, "min": 2.008, "mean": 2.04371 }, "NAN": { "duration": 4.66274e+09, "iterations": 2.28716e+09, "max": 34.161, "min": 2.009, "mean": 2.03866 } } - power7 optimized one: "isfinite": { "": { "duration": 4.99111e+09, "iterations": 2.65566e+09, "max": 25.015, "min": 1.716, "mean": 1.87942 }, "INF": { "duration": 4.6783e+09, "iterations": 2.0999e+09, "max": 35.264, "min": 1.868, "mean": 2.22787 }, "NAN": { "duration": 4.67915e+09, "iterations": 2.08678e+09, "max": 38.099, "min": 1.869, "mean": 2.24228 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_finite* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_finite* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power7.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finite.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_finitef.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_finitef.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finite.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_finitef.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized isinfAdhemerval Zanella2019-06-126-186/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc isinf optimizations onyl adds complexity: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power7 uses ftdiv to optimize for some input pattern and branch implementation for INF and denormal that does: return (ix & UINT64_C (0x7fffffffffffffff)) == UINT64_C (0x7ff0000000000000) Although it does show slight better latency than generic algorithm (as below), it is only for power7 and requires it to override it for power8. - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-power7.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_call): Remove s_isinf* and s_isinf* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isinff.c: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isinff.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isinff.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: Remove optimized isnanAdhemerval Zanella2019-06-129-287/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The powerpc isnan optimizations are not really a gain: - GCC will call libm iff -fsignaling-nans is used. This usage pattern is usually not performance oriented and for such calls PLT overhead should dominate execution time. - The power5, power6, and power6x are just micro-optimization to improve the Load-Hit-Store hazards from floating-point to general register transfer, and current GCC already has support to minimize it by inserting either extra nops or group dispatch instructions. - The power7 uses ftdiv to optimize for some input patterns, but at cost of others. Comparing against generic C implementation built for powerpc-linux-gnu-power4 (which uses the hp-timing support on benchtests): - Generic sysdeps/ieee754 implementation: "isnan": { "": { "duration": 4.98415e+09, "iterations": 2.34516e+09, "max": 45.925, "min": 2.052, "mean": 2.12529 }, "INF": { "duration": 4.74057e+09, "iterations": 1.69761e+09, "max": 91.01, "min": 2.052, "mean": 2.79249 }, "NAN": { "duration": 4.74071e+09, "iterations": 1.68768e+09, "max": 282.343, "min": 2.052, "mean": 2.809 } } - power7 optimized one: $ ./testrun.sh benchtests/bench-isnan "isnan": { "": { "duration": 4.96842e+09, "iterations": 2.56297e+09, "max": 50.048, "min": 1.872, "mean": 1.93854 }, "INF": { "duration": 4.76648e+09, "iterations": 1.54213e+09, "max": 373.408, "min": 2.661, "mean": 3.09084 }, "NAN": { "duration": 4.76845e+09, "iterations": 1.54515e+09, "max": 51.016, "min": 2.736, "mean": 3.08607 } } So it basically optimizes marginally for normal numbers while increasing the latency for other kind of FP. - The generic implementation requires getting the floating point status, disable the invalid operation bit, and restore the floating-point status. Each operation is costly and requires flushing the FP pipeline. Using the same scenarion for the previous analysis: "isnan": { "": { "duration": 5.08284e+09, "iterations": 6.2898e+08, "max": 41.844, "min": 8.057, "mean": 8.08108 }, "INF": { "duration": 4.97904e+09, "iterations": 6.16176e+08, "max": 39.661, "min": 8.057, "mean": 8.08055 }, "NAN": { "duration": 4.98695e+09, "iterations": 5.95866e+08, "max": 29.728, "min": 8.345, "mean": 8.36925 } } - The power8 implementation is just the generic implementation using ISA 2.07 mfvsrd instruction (which GCC uses for generic implementation). So generic implementation is the best option for powerpc64le. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_isnan.c: Remove file. * sysdeps/powerpc/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdeps_routines, libm-sysdep_routines): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power5.S: Remove file * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power5.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf-power6.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power5/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc32/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdep_calls): Remove s_isnan-* and s_isnanf-* objects. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power5.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power6x.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power7.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-power8.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnan.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_isnanf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power5/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power6x/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power7/fpu/s_isnanf.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnan.S: Likewise. * sysdeps/powerpc/powerpc64/power8/fpu/s_isnanf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: copysign cleanupAdhemerval Zanella2019-06-125-149/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GCC always expand copysign{f} for all possible cpus, so calling the libm is only done if user explicitly states to disable the builtin (which is done usually not for performance reason). So to provide ifunc variant for copysign is just unrequired complexity, since libm will be called on non-performance critical code. This patch removes both powerpc32 and powerpc64 ifunc variants and consolidates the powerpc implementation on sysdeps/powerpc/fpu/s_copysign{f}.c using compiler builtins. Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/s_copysign.c: New file. * sysdeps/powerpc/fpu/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (sysdep_routines, libm-sysdep_routines): Remove s_copysign-power6 and s_copysign-ppc32. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (sysdeps_calls): Remove s_copysign-power6 s_copysign-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-power6.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_copysignf.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Likewise. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysignf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabrielftg@linux.ibm.com>
* powerpc: trunc/truncf refactorAdhemerval Zanella2019-05-098-117/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc trunc{f} implementations on the generic sysdeps/powerpc/fpu/s_trunc{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/trunc_to_integer.h (set_fenv_mode): Add TRUNC handling. (round_mode): Add definition for TRUNC. * sysdeps/powerpc/fpu/s_trunc.c: New file. * sysdeps/powerpc/fpu/s_truncf.c: New file. * sysdeps/powerpc/powerpc32/fpu/s_trunc.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-power5+.c: New file. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_trunc-ppc32.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-power5+.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_truncf-ppc32.c: Likewise. * sysdep/powerpc/powerpc32/power5+/fpu/s_trunc.S: Remove file. * sysdep/powerpc/powerpc32/power5+/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_trunc-power5+, s_trunc-ppc64, s_truncf-power5+, and s_truncf-ppc64. (CFLAGS-s_trunc-power5+.c, CFLAGS-s_truncf-power5+.c): New rule. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_trunc-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_trunc.c: ... here. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_truncf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_truncf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_trunc-power5+, s_trunc-ppc64, s_truncf-power5+, and s_truncf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-power5+.S: Remove file. * sysdep/powerpc/powerpc64/fpu/multiarch/s_trunc-ppc64.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-power5+.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_truncf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_trunc.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_trunc.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_truncf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: round/roundf refactorAdhemerval Zanella2019-05-098-117/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc round{f} implementations on the generic sysdeps/powerpc/fpu/s_round{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add ROUND handling. (round_mode): Add definition for ROUND. (round_to_integer_float): Likewise. * sysdeps/powerpc/fpu/s_round.c: New file. * sysdeps/powerpc/fpu/s_roundf.c: New file. * sysdeps/powerpc/powerpc32/fpu/s_round.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.S: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-power5+.c: New file. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_round-ppc32.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-power5+.c: Likewise. * sysdep/powerpc/powepc32/power4/fpu/multiarch/s_roundf-ppc32.c: Likewise. * sysdep/powerpc/powerpc32/power5+/fpu/s_round.S: Remove file. * sysdep/powerpc/powerpc32/power5+/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_round-power5+, s_round-ppc64, s_roundf-power5+, and s_roundf-ppc64. (CFLAGS-s_round-power5+.c, CFLAGS-s_roundf-power5+.c): New rule. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_round-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_round.c: ... here. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-power5+.c: New file. * sysdep/powerpc/powercp64/be/fpu/multiarch/s_roundf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_roundf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_round-power5+, s_round-ppc64, s_roundf-power5+, and s_roundf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_round-power5+.S: Remove file. * sysdep/powerpc/powerpc64/fpu/multiarch/s_round-ppc64.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-power5+.S: Likewise. * sysdep/powerpc/powerpc64/fpu/multiarch/s_roundf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_round.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_round.S: Likewise. * sysdep/powerpc/powerpc64/power5+/fpu/s_roundf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: floor/floorf refactorAdhemerval Zanella2019-05-098-117/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc floor{f} implementations on the generic sysdeps/powerpc/fpu/s_floor{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frim instruction) or a generic implementation which uses FP only operations. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/round_to_integer.h (set_fenv_mode): Add FLOOR option. (round_mode): Add definition for FLOOR. * sysdeps/powerpc/fpu/s_floor.c: New file. * sysdeps/powerpc/fpu/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_floor.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.S: Likewise * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floor.S: Remove file. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Remove file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile (libm-sysdep_routines): Add s_floor-power5+, s_floor-ppc64, s_floorf-power5+, and s_floorf-ppc64. (CFLAGS-s_floor-power5+.c, CFLAGS-s_floorf-power5+.c): New rule. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-power5+.c: New file. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floor-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floor.c: ... here. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-power5+.c: New file. * sysdep/powerpc/powerpc64/be/fpu/multiarch/s_floorf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_floorf.c: ... here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_floor-power5+, s_floor-ppc64, s_floorf-power5+, and s_floorf-ppc64. * sysdep/powerpc/powerpc64/fpu/multiarch/s_floor-power5+.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor-ppc64.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-power5+.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floor.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: ceil/ceilf refactorAdhemerval Zanella2019-04-299-117/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patches consolidates all the powerpc ceil{f} implementations on the generic sysdeps/powerpc/fpu/s_ceil{f}. The generic implementation uses either the compiler builts for ISA 2.03+ (which generates the frip instruction) or a generic implementation which uses FP only operations. It adds a generic implementation (round_to_integer.h) which is shared with other rounding to integer routines. The resulting code should be similar in term os performance to previous assembly one. The IFUNC organization for powerpc64 is also change to be enabled only for powerpc64 and not for powerpc64le (since minium ISA of 2.08 does not require the fallback generic implementation). Checked on powerpc-linux-gnu (built without --with-cpu, with --with-cpu=power4 and with --with-cpu=power5+ and --disable-multi-arch), powerpc64-linux-gnu (built without --with-cp and with --with-cpu=power5+ and --disable-multi-arch). * sysdeps/powerpc/fpu/fenv_libc.h (__fesetround_inline_nocheck): New function. * sysdeps/powerpc/fpu/round_to_integer.h: New file. * sysdeps/powerpc/fpu/s_ceil.c: Likewise. * sysdeps/powerpc/fpu/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc32/fpu/s_ceil.S: Remove file. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/Makefile (CFLAGS-s_ceil-power5+.c, CFLAGS-s_ceilf-power5+.c): New rule. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.S: Remove file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.S: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-power5+.c: New file. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-power5+.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceil.S: Remove file. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/Makefile: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-power5+.c: Likewise. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceil.c: ... here. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-power5+.c: New file. * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Move to ... * sysdeps/powerpc/powerpc64/be/fpu/multiarch/s_ceilf.c: ... * here. * sysdeps/powerpc/powerpc64/fpu/multiarch/Makefile (libm-sysdep_routines): Remove s_ceil-power5+, s_ceil-ppc64, s_ceilf-power5+, and s_ceilf-ppc64. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-power5+.S: Remove file. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-power5+.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf-ppc64.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceil.S: Likewise. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Likewise. Reviewed-by: Gabriel F. T. Gomes <gabriel@inconstante.eti.br>
* powerpc: Use generic wcsrchr optimizationAdhemerval Zanella2019-04-046-120/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the power6 wcsrchr optimization and uses generic implementation instead. Currently, both power6 and power7 IFUNC variant resulting binary are essentially the same and the generic implementation with unrolling loop set to 8 also results in similar performance. Checked on powerpc64-linux-gnu. * sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcsrchr.c): New rule. * sysdeps/powerpc/power6/wcsrchr.c: Remove file. * sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-power7.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcsrchr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power6.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcsrchr-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcsrchr-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcsrchr.c: Likewise. * sysdeps/powerpc/powerpc64/power6/wcsrchr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/Makefile [$(subdir) == wcsmbs] (sysdeps_routines): Remove wcsrchr-power6 and wcsrchr-power7. (CFLAGS-wcsrchr-power7.c, CFLAGS-wcsrchr-power6.c): Remove rule. * sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c: Remove wcsrchr optimizations. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
* powerpc: Use generic wcschr optimizationAdhemerval Zanella2019-04-046-151/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the power6 wcschr optimization and uses generic implementation instead. Currently, both power6 and power7 IFUNC variant resulting binary are essentially the same and the generic implementation with unrolling loop set to 8 also results in similar performance. Checked on powerpc64-linux-gnu. * sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcschr.c): New rule. * sysdeps/powerpc/power6/wcschr.c: Remove file. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-power7.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcschr.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr-power6.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcschr.c: Likewise. * sysdeps/powerpc/powerpc64/power6/wcschr.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/Makefile [$(subdir) == wcsmbs] (sysdeps_routines): Remove wcschr-power6 and wcschr-power7. (CFLAGS-wcschr-power7.c, CFLAGS-wcschr-power6.c): Remove rule. * sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c: Remove wcschr optimizations. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
* powerpc: Use generic wcscpy optimizationAdhemerval Zanella2019-04-046-122/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes the power6 wcscpy optimization and uses generic implementation instead. Currently, both power6 and power7 IFUNC variant resulting binary are essentially the same and the generic implementation with unrolling loop set to 8 also results in similar performance. Checked on powerpc64-linux-gnu. * sysdeps/powerpc/Makefile [$(subdir) == wcsmbs] (CFLAGS-wcscpy.c): New rule. * sysdeps/powerpc/power6/wcscpy.c: Remove file. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power6.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-power7.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcscpy-power6.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcscpy-power7.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcscpy-ppc64.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/wcscpy.c: Likewise. * sysdeps/powerpc/powerpc64/power6/wcscpy.c: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/Makefile [$(subdir) == wcsmbs] (sysdeps_routines): Remove wcscpy-power6 and wcscpy-power7. (CFLAGS-wcscpy-power7.c, CFLAGS-wcscpy-power6.c): Remove rule. * sysdeps/powerpc/powerpc64/multiarch/Makefile: Likewise. * sysdeps/powerpc/powerpc32/power4/multiarch/ifunc-impl-list.c: Remove wcscpy optimizations. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Likewise.
* Refactor hp-timing rtld usageAdhemerval Zanella2019-03-221-4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch refactor how hp-timing is used on loader code for statistics report. The HP_TIMING_AVAIL and HP_SMALL_TIMING_AVAIL are removed and HP_TIMING_INLINE is used instead to check for hp-timing avaliability. For alpha, which only defines HP_SMALL_TIMING_AVAIL, the HP_TIMING_INLINE is set iff for IS_IN(rtld). Checked on aarch64-linux-gnu, x86_64-linux-gnu, and i686-linux-gnu. I also checked the builds for all afected ABIs. * benchtests/bench-timing.h: Replace HP_TIMING_AVAIL with HP_TIMING_INLINE. * nptl/descr.h: Likewise. * elf/rtld.c (RLTD_TIMING_DECLARE, RTLD_TIMING_NOW, RTLD_TIMING_DIFF, RTLD_TIMING_ACCUM_NT, RTLD_TIMING_SET): Define. (dl_start_final_info, _dl_start_final, dl_main, print_statistics): Abstract hp-timing usage with RTLD_* macros. * sysdeps/alpha/hp-timing.h (HP_TIMING_INLINE): Define iff IS_IN(rtld). (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Remove. * sysdeps/generic/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL, HP_TIMING_NONAVAIL): Likewise. * sysdeps/ia64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/powerpc/powerpc32/power4/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/powerpc/powerpc64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/sparc/sparc32/sparcv9/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/sparc/sparc64/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/x86/hp-timing.h (HP_TIMING_AVAIL, HP_SMALL_TIMING_AVAIL): Likewise. * sysdeps/generic/hp-timing-common.h: Update comment with HP_TIMING_AVAIL removal.
* powerpc: Fix linknamespace introduced by 4d8015639a75Adhemerval Zanella2019-02-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | This patch fixes the linknamespace issues add on wcscpy refactor for powerpc-linux-gnu-power4 as shown by the tests: FAIL: conform/POSIX/fnmatch.h/linknamespace FAIL: conform/POSIX/glob.h/linknamespace FAIL: conform/POSIX/wordexp.h/linknamespace FAIL: conform/XPG4/fnmatch.h/linknamespace FAIL: conform/XPG4/glob.h/linknamespace FAIL: conform/XPG4/wordexp.h/linknamespace FAIL: conform/XPG42/fnmatch.h/linknamespace FAIL: conform/XPG42/glob.h/linknamespace FAIL: conform/XPG42/wordexp.h/linknamespace [initial] wordexp -> [libc.a(wordexp.o)] fnmatch -> [libc.a(fnmatch.o)] __wcscat -> [libc.a(wcscat.o)] __wcscpy -> [libc.a(wcscpy.o)] wcscpy Checked on powerpc-linux-gnu-power4. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c: Define ifunc symbol as __wcspcy instead of wcscpy.
* wcsmbs: optimize wcscatAdhemerval Zanella2019-02-272-16/+17
| | | | | | | | | | | | | | | | | | | | | | | | This patch rewrites wcscat using wcslen and wcscpy. This is similar to the optimization done on strcat by 6e46de42fe. The strcpy changes are mainly to add the internal alias to avoid PLT calls. Checked on x86_64-linux-gnu and a build against the affected architectures. * include/wchar.h (__wcscpy): New prototype. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy-ppc32.c (__wcscpy): Route internal symbol to generic implementation. * sysdeps/powerpc/powerpc32/power4/multiarch/wcscpy.c (wcscpy): Add internal __wcscpy alias. * sysdeps/powerpc/powerpc64/multiarch/wcscpy.c (wcscpy): Likewise. * sysdeps/s390/wcscpy.c (wcscpy): Likewise. * sysdeps/x86_64/multiarch/wcscpy.c (wcscpy): Likewise. * wcsmbs/wcscpy.c (wcscpy): Add * sysdeps/x86_64/multiarch/wcscpy-c.c (WCSCPY): Adjust macro to use generic implementation. * wcsmbs/wcscat.c (wcscat): Rewrite using wcslen and wcscpy.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2019-01-01171-171/+171
| | | | | | | * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
* Use copysign functions not __copysign functions in glibc libm.Joseph Myers2018-09-272-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __copysign functions to call the corresponding copysign names instead, with asm redirection to __copysign when the calls are not inlined (all cases are inlined except for IBM long double for powerpc soft-float / e500v1). This eliminates the need for an inline function defining __copysign in terms of __builtin_copysign. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_BINARY_ARGS): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (copysign): Redirect using MATH_REDIRECT. * sysdeps/alpha/fpu/s_copysign.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/alpha/fpu/s_copysignf.c: Likewise. * sysdeps/ieee754/dbl-64/s_copysign.c: Likewise. * sysdeps/ieee754/float128/s_copysignf128.c: Likewise. * sysdeps/ieee754/flt-32/s_copysignf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_copysignl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_copysignl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/riscv/rvd/s_copysign.c: Likewise. * sysdeps/riscv/rvf/s_copysignf.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysign.c: Likewise. * sysdeps/sparc/sparc32/sparcv9/fpu/multiarch/s_copysignf.c: Likewise. * sysdeps/generic/math_private_calls.h [!__MATH_DECLARING_LONG_DOUBLE || !NO_LONG_DOUBLE] (__copysign): Do not declare and define as an inline function. * math/divtc3.c (__divtc3): Use copysign functions instead of __copysign variants. * math/multc3.c (__multc3): Likewise. * sysdeps/generic/math-type-macros.h (M_COPYSIGN): Likewise. * sysdeps/ieee754/dbl-64/e_atan2.c (signArctan2): Likewise. * sysdeps/ieee754/dbl-64/e_atanh.c (__ieee754_atanh): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Likewise. * sysdeps/ieee754/dbl-64/e_jn.c (__ieee754_jn): Likewise. (__ieee754_yn): Likewise. * sysdeps/ieee754/dbl-64/s_asinh.c (__asinh): Likewise. * sysdeps/ieee754/dbl-64/s_atan.c (__signArctan): Likewise. * sysdeps/ieee754/dbl-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Likewise. (__sin): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_nearbyint.c (__nearbyint): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbln.c (__scalbln): Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_scalbn.c (__scalbn): Likewise. * sysdeps/ieee754/flt-32/e_atanhf.c (__ieee754_atanhf): Likewise. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/flt-32/e_jnf.c (__ieee754_jnf): Likewise. (__ieee754_ynf): Likewise. * sysdeps/ieee754/flt-32/s_asinhf.c (__asinhf): Likewise. * sysdeps/ieee754/flt-32/s_scalbnf.c (__scalbnf): Likewise. * sysdeps/ieee754/k_standard.c (__kernel_standard): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_fmal.c (__fmal): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_scalbnl.c (__scalbnl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_jnl.c (__ieee754_jnl): Likewise. (__ieee754_ynl) * sysdeps/ieee754/ldbl-96/s_asinhl.c (__asinhl): Likewise. * sysdeps/ieee754/ldbl-96/s_scalblnl.c (__scalblnl): Likewise. * sysdeps/ieee754/ldbl-opt/nldbl-copysign.c (copysignl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Use round functions not __round functions in glibc libm.Joseph Myers2018-09-272-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __round functions to call the corresponding round names instead, with asm redirection to __round when the calls are not inlined. An additional complication arises in sysdeps/ieee754/ldbl-128ibm/e_expl.c, where a call to roundl, with the result converted to int, gets converted by the compiler to call lroundl in the case of 32-bit long, so resulting in localplt test failures. It's logically correct to let the compiler make such an optimization; an appropriate asm redirection of lroundl to __lroundl is thus added to that file (it's not needed anywhere else). Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (round): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_round.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_roundf.c: Likewise. * sysdeps/ieee754/dbl-64/s_round.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_round.c: Likewise. * sysdeps/ieee754/float128/s_roundf128.c: Likewise. * sysdeps/ieee754/flt-32/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_roundl.c: Likewise. * sysdeps/ieee754/ldbl-96/s_roundl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_round.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_round.c: Likewise. * sysdeps/riscv/rvf/s_roundf.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_roundl.c: Likewise. (round): Redirect to __round. (__roundl): Call round instead of __round. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__round): Remove macro. [_ARCH_PWR5X] (__roundf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use round functions instead of __round variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/x86/fpu/powl_helper.c (__powl_helper): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_expl.c (lroundl): Redirect to __lroundl. (__ieee754_expl): Call roundl instead of __roundl.
* Use trunc functions not __trunc functions in glibc libm.Joseph Myers2018-09-202-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __trunc functions to call the corresponding trunc names instead, with asm redirection to __trunc when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (trunc): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_trunc.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_truncf.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_trunc.c: Likewise. * sysdeps/ieee754/float128/s_truncf128.c: Likewise. * sysdeps/ieee754/dbl-64/s_trunc.c: Likewise. * sysdeps/ieee754/flt-32/s_truncf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_truncl.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_trunc.c: Likewise. * sysdeps/riscv/rvf/s_truncf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_trunc.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_truncf.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_trunc_template.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c: Likewise. (ceil): Redirect to __ceil. (floor): Redirect to __floor. (trunc): Redirect to __trunc. (__truncl): Call trunc instead of __trunc. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__trunc): Remove macro. [_ARCH_PWR5X] (__truncf): Likewise. * sysdeps/ieee754/dbl-64/e_gamma_r.c (__ieee754_gamma_r): Use trunc functions instead of __trunc variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (__ieee754_gammaf_r): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (__ieee754_gammal_r): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (__ieee754_gammal_r): Likewise.
* Use ceil functions not __ceil functions in glibc libm.Joseph Myers2018-09-172-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the move to use, within libm, public names for libm functions that can be inlined as built-in functions on many architectures, this patch moves calls to __ceil functions to call the corresponding ceil names instead, with asm redirection to __ceil when the calls are not inlined. Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (ceil): Redirect using MATH_REDIRECT. * sysdeps/aarch64/fpu/s_ceil.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_ceilf.c: Likewise. * sysdeps/ieee754/dbl-64/s_ceil.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_ceil.c: Likewise. * sysdeps/ieee754/float128/s_ceilf128.c: Likewise. * sysdeps/ieee754/flt-32/s_ceilf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_ceill.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_ceill.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_ceil_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_ceil.c: Likewise. * sysdeps/riscv/rvf/s_ceilf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceil.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_ceilf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__ceil): Remove macro. * sysdeps/ieee754/dbl-64/e_gamma_r.c (gamma_positive): Use ceil functions instead of __ceil variants. * sysdeps/ieee754/flt-32/e_gammaf_r.c (gammaf_positive): Likewise. * sysdeps/ieee754/ldbl-128/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_gammal_r.c (gammal_positive): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Use floor functions not __floor functions in glibc libm.Joseph Myers2018-09-142-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Similar to the changes that were made to call sqrt functions directly in glibc, instead of __ieee754_sqrt variants, so that the compiler could inline them automatically without needing special inline definitions in lots of math_private.h headers, this patch makes libm code call floor functions directly instead of __floor variants, removing the inlines / macros for x86_64 (SSE4.1) and powerpc (POWER5). The redirection used to ensure that __ieee754_sqrt does still get called when the compiler doesn't inline a built-in function expansion is refactored so it can be applied to other functions; the refactoring is arranged so it's not limited to unary functions either (it would be reasonable to use this mechanism for copysign - removing the inline in math_private_calls.h but also eliminating unnecessary local PLT entry use in the cases (powerpc soft-float and e500v1, for IBM long double) where copysign calls don't get inlined). The point of this change is that more architectures can get floor calls inlined where they weren't previously (AArch64, for example), without needing special inline definitions in their math_private.h, and existing such definitions in math_private.h headers can be removed. Note that it's possible that in some cases an inline may be used where an IFUNC call was previously used - this is the case on x86_64, for example. I think the direct calls to floor are still appropriate; if there's any significant performance cost from inline SSE2 floor instead of an IFUNC call ending up with SSE4.1 floor, that indicates that either the function should be doing something else that's faster than using floor at all, or it should itself have IFUNC variants, or that the compiler choice of inlining for generic tuning should change to allow for the possibility that, by not inlining, an SSE4.1 IFUNC might be called at runtime - but not that glibc should avoid calling floor internally. (After all, all the same considerations would apply to any user program calling floor, where it might either be inlined or left as an out-of-line call allowing for a possible IFUNC.) Tested for x86_64, and with build-many-glibcs.py. * include/math.h [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT): New macro. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_LDBL): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_F128): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (MATH_REDIRECT_UNARY_ARGS): Likewise. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (sqrt): Redirect using MATH_REDIRECT. [!_ISOMAC && !(__FINITE_MATH_ONLY__ && __FINITE_MATH_ONLY__ > 0) && !NO_MATH_REDIRECT] (floor): Likewise. * sysdeps/aarch64/fpu/s_floor.c: Define NO_MATH_REDIRECT before header inclusion. * sysdeps/aarch64/fpu/s_floorf.c: Likewise. * sysdeps/ieee754/dbl-64/s_floor.c: Likewise. * sysdeps/ieee754/dbl-64/wordsize-64/s_floor.c: Likewise. * sysdeps/ieee754/float128/s_floorf128.c: Likewise. * sysdeps/ieee754/flt-32/s_floorf.c: Likewise. * sysdeps/ieee754/ldbl-128/s_floorl.c: Likewise. * sysdeps/ieee754/ldbl-128ibm/s_floorl.c: Likewise. * sysdeps/m68k/m680x0/fpu/s_floor_template.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/riscv/rv64/rvd/s_floor.c: Likewise. * sysdeps/riscv/rvf/s_floorf.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/sparc/sparc64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floor.c: Likewise. * sysdeps/x86_64/fpu/multiarch/s_floorf.c: Likewise. * sysdeps/powerpc/fpu/math_private.h [_ARCH_PWR5X] (__floor): Remove macro. [_ARCH_PWR5X] (__floorf): Likewise. * sysdeps/x86_64/fpu/math_private.h [__SSE4_1__] (__floor): Remove inline function. [__SSE4_1__] (__floorf): Likewise. * math/w_lgamma_main.c (LGFUNC (__lgamma)): Use floor functions instead of __floor variants. * math/w_lgamma_r_compat.c (__lgamma_r): Likewise. * math/w_lgammaf_main.c (LGFUNC (__lgammaf)): Likewise. * math/w_lgammaf_r_compat.c (__lgammaf_r): Likewise. * math/w_lgammal_main.c (LGFUNC (__lgammal)): Likewise. * math/w_lgammal_r_compat.c (__lgammal_r): Likewise. * math/w_tgamma_compat.c (__tgamma): Likewise. * math/w_tgamma_template.c (M_DECL_FUNC (__tgamma)): Likewise. * math/w_tgammaf_compat.c (__tgammaf): Likewise. * math/w_tgammal_compat.c (__tgammal): Likewise. * sysdeps/ieee754/dbl-64/e_lgamma_r.c (sin_pi): Likewise. * sysdeps/ieee754/dbl-64/k_rem_pio2.c (__kernel_rem_pio2): Likewise. * sysdeps/ieee754/dbl-64/lgamma_neg.c (__lgamma_neg): Likewise. * sysdeps/ieee754/flt-32/e_lgammaf_r.c (sin_pif): Likewise. * sysdeps/ieee754/flt-32/lgamma_negf.c (__lgamma_negf): Likewise. * sysdeps/ieee754/ldbl-128/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_lgammal_r.c (__ieee754_lgammal_r): Likewise. * sysdeps/ieee754/ldbl-128ibm/e_powl.c (__ieee754_powl): Likewise. * sysdeps/ieee754/ldbl-128ibm/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_expm1l.c (__expm1l): Likewise. * sysdeps/ieee754/ldbl-128ibm/s_truncl.c (__truncl): Likewise. * sysdeps/ieee754/ldbl-96/e_lgammal_r.c (sin_pi): Likewise. * sysdeps/ieee754/ldbl-96/lgamma_negl.c (__lgamma_negl): Likewise. * sysdeps/powerpc/power5+/fpu/s_modf.c (__modf): Likewise. * sysdeps/powerpc/power5+/fpu/s_modff.c (__modff): Likewise.
* Don't include math.h/math_private.h in math_ldbl_opt.h.Zack Weinberg2018-03-102-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The sysdeps/ieee754/ldbl-opt version of math_ldbl_opt.h includes math.h and math_private.h, despite not having any need for those headers itself; the sysdeps/generic version doesn't. About 20 files are relying on math_ldbl_opt.h to include math.h and/or math_private.h for them, even though none of them necessarily used on a platform that needs ldbl-opt support. * sysdeps/ieee754/ldbl-opt/math_ldbl_opt.h: Don't include math.h or math_private.h. * sysdeps/alpha/fpu/s_isnan.c * sysdeps/ieee754/ldbl-128ibm/s_ceill.c * sysdeps/ieee754/ldbl-128ibm/s_floorl.c * sysdeps/ieee754/ldbl-128ibm/s_llrintl.c * sysdeps/ieee754/ldbl-128ibm/s_llroundl.c * sysdeps/ieee754/ldbl-128ibm/s_lrintl.c * sysdeps/ieee754/ldbl-128ibm/s_lroundl.c * sysdeps/ieee754/ldbl-128ibm/s_rintl.c * sysdeps/ieee754/ldbl-128ibm/s_roundl.c * sysdeps/ieee754/ldbl-128ibm/s_truncl.c * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypot.c * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/e_hypotf.c: * sysdeps/powerpc/powerpc64/fpu/multiarch/e_expf.c * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypot.c * sysdeps/powerpc/powerpc64/fpu/multiarch/e_hypotf.c: Include math_private.h. * sysdeps/ieee754/ldbl-64-128/s_finitel.c * sysdeps/ieee754/ldbl-64-128/s_fpclassifyl.c * sysdeps/ieee754/ldbl-64-128/s_isinfl.c * sysdeps/ieee754/ldbl-64-128/s_isnanl.c * sysdeps/ieee754/ldbl-64-128/s_signbitl.c * sysdeps/powerpc/power7/fpu/s_logb.c: Include math.h and math_private.h.
* powerpc: Fix llround spurious inexact on 32-bit POWER4 [BZ #22697]Tulio Magno Quites Machado Filho2018-01-121-0/+5
| | | | | | | | | | This issue is similar to BZ #19235, where spurious exceptions are created from adding 0.5 then converting to an integer. The solution is based on Joseph's fix for BZ #19235. [BZ #22697] * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S (__llround): Do not add 0.5 to integer or out-of-range arguments.
* Update copyright dates with scripts/update-copyrights.Joseph Myers2018-01-01171-171/+171
| | | | | | | * All files with FSF copyright notices: Update copyright dates using scripts/update-copyrights. * locale/programs/charmap-kw.h: Regenerated. * locale/programs/locfile-kw.h: Likewise.
* powerpc: POWER8 memcpy optimization for cached memoryAdhemerval Zanella2017-12-111-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On POWER8, unaligned memory accesses to cached memory has little impact on performance as opposed to its ancestors. It is disabled by default and will only be available when the tunable glibc.tune.cached_memopt is set to 1. __memcpy_power8_cached __memcpy_power7 ============================================================ max-size=4096: 33325.70 ( 12.65%) 38153.00 max-size=8192: 32878.20 ( 11.17%) 37012.30 max-size=16384: 33782.20 ( 11.61%) 38219.20 max-size=32768: 33296.20 ( 11.30%) 37538.30 max-size=65536: 33765.60 ( 10.53%) 37738.40 * manual/tunables.texi (Hardware Capability Tunables): Document glibc.tune.cached_memopt. * sysdeps/powerpc/cpu-features.c: New file. * sysdeps/powerpc/cpu-features.h: New file. * sysdeps/powerpc/dl-procinfo.c [!IS_IN(ldconfig)]: Add _dl_powerpc_cpu_features. * sysdeps/powerpc/dl-tunables.list: New file. * sysdeps/powerpc/ldsodefs.h: Include cpu-features.h. * sysdeps/powerpc/powerpc32/power4/multiarch/init-arch.h (INIT_ARCH): Initialize use_aligned_memopt. * sysdeps/powerpc/powerpc64/dl-machine.h [defined(SHARED && IS_IN(rtld))]: Restrict dl_platform_init availability and initialize CPU features used by tunables. * sysdeps/powerpc/powerpc64/multiarch/Makefile (sysdep_routines): Add memcpy-power8-cached. * sysdeps/powerpc/powerpc64/multiarch/ifunc-impl-list.c: Add __memcpy_power8_cached. * sysdeps/powerpc/powerpc64/multiarch/memcpy.c: Likewise. * sysdeps/powerpc/powerpc64/multiarch/memcpy-power8-cached.S: New file. Reviewed-by: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
* Use libm_alias_float for powerpc.Joseph Myers2017-12-0513-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuing the preparation for additional _FloatN / _FloatNx function aliases, this patch makes powerpc libm function implementations use libm_alias_float to define function aliases. Tested with build-many-glibcs.py that installed stripped shared libraries are unchanged for all its hard-float powerpc configurations. * sysdeps/powerpc/fpu/s_cosf.c: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_fabs.S: Include <libm-alias-float.h>. (fabsf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_fmaf.S: Include <libm-alias-float.h>. (fmaf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_rintf.c: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/fpu/s_sinf.c: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/powerpc/power5+/fpu/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/power7/fpu/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_lrint.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_nearbyintf.S: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_rintf.S: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_copysignf.c: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lrintf.c: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_lroundf.c: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_roundf.c: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power4/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power5+/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_llrintf.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lrint.S: Include <libm-alias-float.h>. (lrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc32/power6x/fpu/s_lround.S: Include <libm-alias-float.h>. (lroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_ceilf.c: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_copysignf.c: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_cosf.c: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_floorf.c: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llrintf.c: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_llroundf.c: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_logbf.c: Include <libm-alias-float.h>. (logbf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_modff.c: Include <libm-alias-float.h>. (modff): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_roundf.c: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_sinf.c: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/multiarch/s_truncf.c: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_llroundf.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_nearbyintf.S: Include <libm-alias-float.h>. (nearbyintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_rintf.S: Include <libm-alias-float.h>. (rintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_ceilf.S: Include <libm-alias-float.h>. (ceilf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_floorf.S: Include <libm-alias-float.h>. (floorf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_roundf.S: Include <libm-alias-float.h>. (roundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power5+/fpu/s_truncf.S: Include <libm-alias-float.h>. (truncf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6/fpu/s_copysign.S: Include <libm-alias-float.h>. (copysignf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power6x/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_cosf.S: Include <libm-alias-float.h>. (cosf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_llrint.S: Include <libm-alias-float.h>. (llrintf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_llround.S: Include <libm-alias-float.h>. (llroundf): Define using libm_alias_float. * sysdeps/powerpc/powerpc64/power8/fpu/s_sinf.S: Include <libm-alias-float.h>. (sinf): Define using libm_alias_float.