| author | Takenobu Tani <takenobu.hs@gmail.com> | 2021-09-04 15:07:31 +0900 |
|---|---|---|
| committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2022-02-16 05:27:17 -0500 |
| commit | acb482cc198bc6fbc5b0f7d93fd493853c77671f (patch) | |
| tree | 52da660543b2bf19394a3354d10484a4c1560ef1 /rts | |
| parent | ef5cf55d71e84a0a42596b4ec253ecb0d63f149b (diff) | |
Relax load_load_barrier for aarch64
This patch relaxes the instruction used by load_load_barrier().
The current load_load_barrier() implements a full barrier with `dmb sy`,
which is stronger than necessary for ordering load-load instructions.
We can relax it by using `dmb ld` instead.
If the current load_load_barrier() is also relied upon as a full barrier
(a load/store - load/store barrier), this patch is not suitable.
See also the Linux kernel's smp_rmb() implementation:
https://github.com/torvalds/linux/blob/v5.14/arch/arm64/include/asm/barrier.h#L90
It would likely be better still to use `dmb ishld` rather than `dmb ld`
to improve performance; however, I could not validate the effect on
a real many-core Arm machine.
Diffstat (limited to 'rts')
-rw-r--r-- | rts/include/stg/SMP.h | 2 |
1 file changed, 1 insertion, 1 deletion
```diff
diff --git a/rts/include/stg/SMP.h b/rts/include/stg/SMP.h
index f672009c76..696ec737ac 100644
--- a/rts/include/stg/SMP.h
+++ b/rts/include/stg/SMP.h
@@ -438,7 +438,7 @@ load_load_barrier(void) {
 #elif defined(arm_HOST_ARCH)
     __asm__ __volatile__ ("dmb" : : : "memory");
 #elif defined(aarch64_HOST_ARCH)
-    __asm__ __volatile__ ("dmb sy" : : : "memory");
+    __asm__ __volatile__ ("dmb ld" : : : "memory");
 #elif defined(riscv64_HOST_ARCH)
     __asm__ __volatile__ ("fence r,r" : : : "memory");
 #else
```