author: Rafael Cardoso Fernandes Sousa <rafaelcfsousa@ibm.com> 2022-02-15 19:07:58 -0600
committer: Rafael Cardoso Fernandes Sousa <rafaelcfsousa@ibm.com> 2022-04-15 22:42:38 -0500
commit: a14d04752036c9f1b4eb000d079b27da3bacedf2
tree: cb918438e3bff523fbff923eed84c9b51686f171 /numpy/core/setup.py
parent: 1ab7e8fbf90ac4a81d2ffdde7d78ec464dccb02e
ENH,SIMD: Vectorize modulo/divide using the universal intrinsics
This commit optimizes the operations below:
  - fmod (signed/unsigned integers)
  - remainder (signed/unsigned integers)
  - divmod (signed/unsigned integers)
  - floor_divide (signed integers)
using the VSX4/Power10 integer vector division/modulo instructions.

See the improvements below (maximum speedup):
  - numpy.fmod
    - arr OP arr: signed (1.17x), unsigned (1.13x)
    - arr OP scalar: signed (1.34x), unsigned (1.29x)
  - numpy.remainder
    - arr OP arr: signed (4.19x), unsigned (1.17x)
    - arr OP scalar: signed (4.87x), unsigned (1.29x)
  - numpy.divmod
    - arr OP arr: signed (4.73x), unsigned (1.23x)
    - arr OP scalar: signed (5.05x), unsigned (1.31x)
  - numpy.floor_divide
    - arr OP arr: signed (4.44x)

The times above were collected using the benchmark tool available in NumPy.
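For readers unfamiliar with how the four operations differ on signed integers, the following is a minimal pure-Python sketch of their scalar semantics. It is not the code added by this commit (which lives in loops_modulo.dispatch.c.src and uses universal intrinsics). The helper names `trunc_mod`, `floor_mod`, and `floor_divmod` are hypothetical and chosen for illustration only. Division by zero is not handled here.

```python
# Illustrative scalar semantics only; helper names are hypothetical,
# not NumPy APIs. Assumes b != 0.

def trunc_mod(a: int, b: int) -> int:
    """fmod-style remainder: the result takes the sign of the dividend a."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def floor_mod(a: int, b: int) -> int:
    """remainder-style result: takes the sign of the divisor b.

    Python's % operator already follows this floored convention.
    """
    return a % b


def floor_divmod(a: int, b: int) -> tuple[int, int]:
    """divmod-style pair: (floor_divide, remainder), so q * b + r == a."""
    return a // b, a % b


if __name__ == "__main__":
    for a, b in [(7, 3), (-7, 3), (7, -3), (-7, -3)]:
        print(a, b, trunc_mod(a, b), floor_mod(a, b), floor_divmod(a, b))
```

For unsigned inputs the truncated and floored conventions coincide, so the sign fix-up above only matters for signed types.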
Diffstat (limited to 'numpy/core/setup.py')
-rw-r--r-- numpy/core/setup.py | 1
1 file changed, 1 insertion, 0 deletions
diff --git a/numpy/core/setup.py b/numpy/core/setup.py
index f6b31075d..fe52fde0d 100644
--- a/numpy/core/setup.py
+++ b/numpy/core/setup.py
@@ -1014,6 +1014,7 @@ def configuration(parent_package='',top_path=None):
             join('src', 'umath', 'loops_umath_fp.dispatch.c.src'),
             join('src', 'umath', 'loops_exponent_log.dispatch.c.src'),
             join('src', 'umath', 'loops_hyperbolic.dispatch.c.src'),
+            join('src', 'umath', 'loops_modulo.dispatch.c.src'),
             join('src', 'umath', 'matmul.h.src'),
             join('src', 'umath', 'matmul.c.src'),
             join('src', 'umath', 'clip.h'),