author sheaf <sam.derbyshire@gmail.com> 2023-04-08 13:42:58 +0200
committer Marge Bot <ben+marge-bot@smart-cactus.org> 2023-05-11 11:55:22 -0400
commit 87eebf98cb485f7c9175330051736e147ade9848
tree ffa226b3fefa8b0a03e1798fa4f55affbddf654b /libraries
parent 630b1fea1e41a1e00860a30742b6ab8ade8a0de0
Add fused multiply-add instructions
This patch adds eight new primops that fuse a multiplication and an
addition or subtraction:

  - `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#`

fmadd x y z is x * y + z, computed with a single rounding step.

This patch implements code generation for these primops in the following
backends:

  - X86, AArch64 and PowerPC NCG,
  - LLVM
  - C

WASM uses the C implementation. The primops are unsupported in the
JavaScript backend.

The following constant folding rules are also provided:

  - compute a * b + c when a, b, c are all literals,
  - x * y + 0 ==> x * y,
  - ±1 * y + z ==> z ± y and x * ±1 + z ==> z ± x.

NB: the constant folding rules incorrectly handle signed zero.
This is a known limitation with GHC's floating-point constant folding
rules (#21227), which we hope to resolve in the future.
Diffstat (limited to 'libraries')
-rw-r--r-- libraries/ghc-prim/changelog.md | 18
1 files changed, 18 insertions, 0 deletions
diff --git a/libraries/ghc-prim/changelog.md b/libraries/ghc-prim/changelog.md
index 1cf411c029..39e5face03 100644
--- a/libraries/ghc-prim/changelog.md
+++ b/libraries/ghc-prim/changelog.md
@@ -23,6 +23,24 @@
- `copyAddrToAddrNonOverlapping#`
- `setAddrRange#`
+- New primops for fused multiply-add operations. These primops combine a
+ multiplication and an addition, compiling to a single instruction when
+ the `-mfma` flag is enabled and the architecture supports it.
+
+ The new primops are `fmaddFloat#, fmsubFloat#, fnmaddFloat#, fnmsubFloat# :: Float# -> Float# -> Float# -> Float#`
+ and `fmaddDouble#, fmsubDouble#, fnmaddDouble#, fnmsubDouble# :: Double# -> Double# -> Double# -> Double#`.
+
+ These implement the following operations, while performing one single
+ rounding at the end, leading to a more accurate result:
+
+ - `fmaddFloat# x y z`, `fmaddDouble# x y z` compute `x * y + z`.
+ - `fmsubFloat# x y z`, `fmsubDouble# x y z` compute `x * y - z`.
+ - `fnmaddFloat# x y z`, `fnmaddDouble# x y z` compute `- x * y + z`.
+ - `fnmsubFloat# x y z`, `fnmsubDouble# x y z` compute `- x * y - z`.
+
+ Warning: on unsupported architectures, the software emulation provided by
+ the fallback to the C standard library is not guaranteed to be IEEE-compliant.
+
## 0.10.0
- Shipped with GHC 9.6.1
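
As an illustration of the "single rounding" behaviour described in the changelog entry, here is a minimal sketch that compares `fmaddDouble#` with a separate multiply and add. It assumes a GHC release that ships these primops (9.8 or later) and that `GHC.Exts` re-exports them. The wrapper names `fused` and `unfused` are hypothetical and not part of the patch. The inputs are chosen so that the exact product, 1 - 2^-60, is not representable as a `Double` and rounds to 1 when the multiplication is rounded on its own.

```haskell
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: GHC 9.8 or later, where GHC.Exts re-exports the FMA primops.
import GHC.Exts (Double (D#), fmaddDouble#, (*##), (+##))

-- Hypothetical wrapper: x * y + z via the fused primop (one rounding step).
fused :: Double -> Double -> Double -> Double
fused (D# x) (D# y) (D# z) = D# (fmaddDouble# x y z)
{-# NOINLINE fused #-}

-- Hypothetical wrapper: the product is rounded first, then the sum is rounded.
unfused :: Double -> Double -> Double -> Double
unfused (D# x) (D# y) (D# z) = D# ((x *## y) +## z)
{-# NOINLINE unfused #-}

main :: IO ()
main = do
  let eps = encodeFloat 1 (-30) :: Double  -- exactly 2^-30
      x   = 1 + eps
      y   = 1 - eps
      z   = -1
  -- Exact value of x * y + z is -2^-60; the unfused version loses it
  -- because x * y is rounded to 1 before z is added.
  putStrLn ("unfused: " ++ show (unfused x y z))
  putStrLn ("fused:   " ++ show (fused x y z))
```

With a correctly rounded fused operation, `fused` should return a small negative number (-2^-60) where `unfused` returns zero. Per the changelog, a single hardware instruction is only emitted when `-mfma` is enabled and the architecture supports it; otherwise the C library fallback is used, and its result is not guaranteed to be IEEE-compliant.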