diff options
author | John Ky <newhoggy@gmail.com> | 2018-01-21 11:55:45 -0500 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2018-01-21 11:56:11 -0500 |
commit | f855769690eb998ea25818ee794714957852af48 (patch) | |
tree | 969d0c0bafa72929f5ec50f1794d51e6e1799840 /compiler/codeGen | |
parent | 5e8ea6a62e948bcc0da1279f06844fd1d8e979bd (diff) | |
download | haskell-f855769690eb998ea25818ee794714957852af48.tar.gz |
Add new mbmi and mbmi2 compiler flags
This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Implement x86 code generator for pdep and pext. Properly initialise
bmiVersion field.
pdep and pext test cases
Fix pattern match for pdep and pext instructions
Fix build of pdep and pext code for 32-bit architectures
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, angerman
Reviewed By: bgamari
Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4236
Diffstat (limited to 'compiler/codeGen')
-rw-r--r-- | compiler/codeGen/StgCmmPrim.hs | 28 |
1 files changed, 28 insertions, 0 deletions
diff --git a/compiler/codeGen/StgCmmPrim.hs b/compiler/codeGen/StgCmmPrim.hs index 0a6ac9dba3..948af2aba0 100644 --- a/compiler/codeGen/StgCmmPrim.hs +++ b/compiler/codeGen/StgCmmPrim.hs @@ -584,6 +584,20 @@ emitPrimOp _ [res] PopCnt32Op [w] = emitPopCntCall res w W32 emitPrimOp _ [res] PopCnt64Op [w] = emitPopCntCall res w W64 emitPrimOp dflags [res] PopCntOp [w] = emitPopCntCall res w (wordWidth dflags) +-- Parallel bit deposit +emitPrimOp _ [res] Pdep8Op [src, mask] = emitPdepCall res src mask W8 +emitPrimOp _ [res] Pdep16Op [src, mask] = emitPdepCall res src mask W16 +emitPrimOp _ [res] Pdep32Op [src, mask] = emitPdepCall res src mask W32 +emitPrimOp _ [res] Pdep64Op [src, mask] = emitPdepCall res src mask W64 +emitPrimOp dflags [res] PdepOp [src, mask] = emitPdepCall res src mask (wordWidth dflags) + +-- Parallel bit extract +emitPrimOp _ [res] Pext8Op [src, mask] = emitPextCall res src mask W8 +emitPrimOp _ [res] Pext16Op [src, mask] = emitPextCall res src mask W16 +emitPrimOp _ [res] Pext32Op [src, mask] = emitPextCall res src mask W32 +emitPrimOp _ [res] Pext64Op [src, mask] = emitPextCall res src mask W64 +emitPrimOp dflags [res] PextOp [src, mask] = emitPextCall res src mask (wordWidth dflags) + -- count leading zeros emitPrimOp _ [res] Clz8Op [w] = emitClzCall res w W8 emitPrimOp _ [res] Clz16Op [w] = emitClzCall res w W16 @@ -2266,6 +2280,20 @@ emitPopCntCall res x width = do (MO_PopCnt width) [ x ] +emitPdepCall :: LocalReg -> CmmExpr -> CmmExpr -> Width -> FCode () +emitPdepCall res x y width = do + emitPrimCall + [ res ] + (MO_Pdep width) + [ x, y ] + +emitPextCall :: LocalReg -> CmmExpr -> CmmExpr -> Width -> FCode () +emitPextCall res x y width = do + emitPrimCall + [ res ] + (MO_Pext width) + [ x, y ] + emitClzCall :: LocalReg -> CmmExpr -> Width -> FCode () emitClzCall res x width = do emitPrimCall |