author | Artem Pyanykh <artem.pyanykh@gmail.com> | 2019-04-04 13:43:38 +0300 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2019-04-09 10:30:13 -0400 |
commit | af4cea7f1411e5b99e2417d7c2d3d0e697093103 | |
tree | ec9ef85347e5c8915e864573997c15aaa8cc5a73 /compiler/codeGen | |
parent | 36d380475d9056fdf93305985be3def00aaf6cf7 | |
codegen: fix memset unroll for small bytearrays, add 64-bit sets
Fixes #16052
When the offset in `setByteArray#` is statically known, we can provide
better alignment guarantees than just 1 byte.
Also, memset can now do 64-bit wide sets.
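As a rough illustration of the first point: the alignment implied by a statically known offset is the largest power of two that divides it, capped at the word size. The sketch below is not GHC's `byteAlignment`. `staticAlign` is a hypothetical helper, and it assumes that the array payload itself is word-aligned, that the word size is a power of two, and that a zero offset counts as fully aligned. A `Nothing` offset models the case where the offset is not a literal.

```haskell
import Data.Bits (countTrailingZeros, shiftL)

-- | Alignment (in bytes) that can be assumed for base + offset.
-- Hypothetical helper for illustration only; assumes a word-aligned base.
staticAlign :: Int        -- ^ word size in bytes (assumed power of two)
            -> Maybe Int  -- ^ offset, if statically known
            -> Int
staticAlign _ Nothing = 1                 -- unknown offset: only 1-byte alignment
staticAlign wordSize (Just off)
  | off == 0  = wordSize                  -- zero offset keeps the base alignment
  | otherwise = min wordSize (1 `shiftL` countTrailingZeros off)
```

Under these assumptions, an offset of `4#` on a 64-bit target would allow 4-byte sets, while an offset of `1#` still allows only 1-byte alignment.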
The current memset intrinsic is not optimal, however, and can be
improved for the case when we know that we are dealing with
(baseAddress at known alignment) + offset.
For instance, on 64-bit,
`setByteArray# s 1# 23# 0#`,
given that the bytearray is 8-byte aligned, could be unrolled into
`movb, movw, movl, movq, movq`; but currently it is
`movb x23`, since an alignment of 1 is all we can embed into the MO_Memset op.
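The ideal unrolling in this example can be reproduced with a small model. This is only a sketch of the idea, not the code generator's logic: `posAlign` and `unrollWidths` are hypothetical names, and the model assumes the payload starts at a word-aligned address and that the word size is a power of two. It greedily picks, at each position, the widest naturally aligned store that still fits in the remaining length.

```haskell
import Data.Bits (countTrailingZeros, shiftL)

-- | Alignment of a byte position relative to a word-aligned base,
-- capped at the word size. Hypothetical helper for illustration.
posAlign :: Int -> Int -> Int
posAlign wordSize pos
  | pos == 0  = wordSize
  | otherwise = min wordSize (1 `shiftL` countTrailingZeros pos)

-- | Split the range [off, off + len) into naturally aligned stores and
-- return the width in bytes of each store, in order.
unrollWidths :: Int -> Int -> Int -> [Int]
unrollWidths wordSize = go
  where
    go pos len
      | len <= 0  = []
      | otherwise = w : go (pos + w) (len - w)
      where
        w = shrink (posAlign wordSize pos)
        -- Halve the candidate width until it fits in the remaining length.
        -- Terminates because a width of 1 always fits when len >= 1.
        shrink c
          | c <= len  = c
          | otherwise = shrink (c `div` 2)
```

With these definitions, `unrollWidths 8 1 23` evaluates to `[1,2,4,8,8]`, which corresponds to the `movb, movw, movl, movq, movq` sequence above.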
Diffstat (limited to 'compiler/codeGen')
-rw-r--r-- | compiler/codeGen/StgCmmPrim.hs | 16 |
1 files changed, 12 insertions, 4 deletions
diff --git a/compiler/codeGen/StgCmmPrim.hs b/compiler/codeGen/StgCmmPrim.hs
index 4a07c7893e..1abef3a90a 100644
--- a/compiler/codeGen/StgCmmPrim.hs
+++ b/compiler/codeGen/StgCmmPrim.hs
@@ -2073,10 +2073,18 @@ doCopyAddrToByteArrayOp src_p dst dst_off bytes = do
 -- character.
 doSetByteArrayOp :: CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
                  -> FCode ()
-doSetByteArrayOp ba off len c
-    = do dflags <- getDynFlags
-         p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off
-         emitMemsetCall p c len 1
+doSetByteArrayOp ba off len c = do
+    dflags <- getDynFlags
+    let maxAlign = wORD_SIZE dflags
+        align = minimum [maxAlign, possibleAlign]
+
+    p <- assignTempE $ cmmOffsetExpr dflags (cmmOffsetB dflags ba (arrWordsHdrSize dflags)) off
+
+    emitMemsetCall p c len align
+  where
+    possibleAlign = case off of
+      CmmLit (CmmInt intOff _) -> fromIntegral $ byteAlignment (fromIntegral intOff)
+      _ -> 1
 
 -- ----------------------------------------------------------------------------
 -- Allocating arrays