path: root/compiler/codeGen/StgCmmPrim.hs
Commit message (Author, Age, Files, Lines)
* StgCmmPrim: remove an unnecessary instruction in doNewArrayOp (Michal Terepeta, 2019-04-19, 1 file, -5/+2)
  Previously we would generate a local variable pointing after the array header and use it to initialize the array elements. But we already use stores with offset, so it's easy to just add the header to those offsets during compilation and avoid generating the local variable (which would become a LEA instruction when using native codegen; LLVM already optimizes it away).

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
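The offset folding described in this commit can be modelled in a few lines. This is only a sketch: `initStoreOffsets` is a hypothetical name, and the header and word sizes are parameters chosen for illustration, not values read from GHC.

```haskell
-- Hypothetical model, not GHC's code: the byte offsets (relative to the
-- start of the heap object) of the stores that initialise an n-element
-- array. The header size is folded into each constant offset, so no
-- separate "pointer past the header" variable is needed.
initStoreOffsets :: Int -> Int -> Int -> [Int]
initStoreOffsets hdrBytes wordBytes n =
  [hdrBytes + i * wordBytes | i <- [0 .. n - 1]]
```

With an assumed 16-byte header and 8-byte words, `initStoreOffsets 16 8 3` lists one store per element, each at a compile-time constant offset.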
* codegen: unroll memcpy calls for small bytearrays (Artem Pyanykh, 2019-04-14, 1 file, -24/+26)
* removing x87 register support from native code gen (Carter Schonwald, 2019-04-10, 1 file, -0/+30)
  * Simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 constructors.
  * Makes -msse2 assumed/default for x86 platforms, fixing a long-standing nondeterminism in rounding behavior in 32-bit Haskell code.
  * Removes the 80-bit floating point representation from the supported float sizes.
  * There's still one tiny bit of x87 support needed, for handling float and double return values in FFI calls wrt the C ABI on x86_32, but this one piece does not leak into the rest of NCG.
  * Lots of code that's not been touched in a long time got deleted as a consequence of all of this.

  All in all, this change paves the way towards a lot of future further improvements in how GHC handles floating point computations, along with making the native code gen more accessible to a larger pool of contributors.
* codegen: use newtype for Alignment in BasicTypes (Artem Pyanykh, 2019-04-09, 1 file, -10/+9)
* codegen: fix memset unroll for small bytearrays, add 64-bit sets (Artem Pyanykh, 2019-04-09, 1 file, -4/+12)
  Fixes #16052

  When the offset in `setByteArray#` is statically known, we can provide better alignment guarantees than just 1 byte. Also, memset can now do 64-bit wide sets.

  The current memset intrinsic is not optimal, however, and can be improved for the case when we know that we deal with

      (baseAddress at known alignment) + offset

  For instance, on 64-bit, `setByteArray# s 1# 23# 0#`, given that the bytearray is 8-byte aligned, could be unrolled into `movb, movw, movl, movq, movq`; but currently it is `movb x23`, since an alignment of 1 is all we can embed into the MO_Memset op.
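The `movb, movw, movl, movq, movq` sequence mentioned above follows from picking, at each step, the widest naturally aligned store that still fits. A minimal sketch of that planning step, assuming the array payload is 8-byte aligned; `unrolledStores` is a hypothetical helper, not part of GHC:

```haskell
-- Hypothetical helper, not GHC's code: the widths (in bytes) of the
-- stores needed to fill `len` bytes starting at byte offset `off`,
-- assuming the base address itself is 8-byte aligned.
unrolledStores :: Int -> Int -> [Int]
unrolledStores off len
  | len <= 0  = []
  | otherwise = w : unrolledStores (off + w) (len - w)
  where
    -- Widest power-of-two store that is aligned at `off` and fits in
    -- the remaining length. Width 1 always qualifies when len >= 1.
    w = last [c | c <- [1, 2, 4, 8], off `mod` c == 0, c <= len]
```

For the example in the commit message, `unrolledStores 1 23` gives `[1,2,4,8,8]`, that is, one byte, one 16-bit, one 32-bit and two 64-bit stores.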
* Generate straightline code for inline array allocation (Michal Terepeta, 2019-04-08, 1 file, -11/+5)
  GHC has an optimization for allocating arrays when the size is statically known -- it'll generate the code allocating and initializing the array inline (instead of a call to a procedure from `rts/PrimOps.cmm`).

  However, the generated code uses a loop to do the initialization. Since we already check that the requested size is small (we check against `maxInlineAllocSize`), we can generate faster straightline code instead. This brings about 15% improvement for `newSmallArray#` in my testing and slightly simplifies the code in GHC.

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
* Add support for bitreverse primop (Alexandre, 2019-04-01, 1 file, -0/+13)
  This commit includes the necessary changes in code and documentation to support a primop that reverses a word's bits. It also includes a test.
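For readers unfamiliar with the operation, the intended semantics of reversing a word's bits can be stated as a slow reference function. This is an illustrative model only; `bitReverseRef` is a hypothetical name and says nothing about the code the primop compiles to.

```haskell
import Data.Bits (finiteBitSize, setBit, testBit)
import Data.Word (Word64)

-- Reference semantics only: bit i of the input becomes bit (n - 1 - i)
-- of the output, where n is the word width.
bitReverseRef :: Word64 -> Word64
bitReverseRef w = foldl step 0 [0 .. n - 1]
  where
    n = finiteBitSize w
    step acc i
      | testBit w i = setBit acc (n - 1 - i)
      | otherwise   = acc
```

Reversing twice is the identity, which makes the function easy to sanity-check.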
* Minor refactoring in copy array primops: (Ömer Sinan Ağacan, 2019-03-27, 1 file, -15/+17)
  - `emitCopySmallArray` now checks size before generating code and doesn't generate any code when size is 0. `emitCopyArray` already does this so this makes small/large array cases the same in argument checking.
  - In both `emitCopySmallArray` and `emitCopyArray` read the `dflags` after checking the argument.
* PPC NCG: Generate MO_?_QuotRem for subword sizes (Peter Trommler, 2018-12-11, 1 file, -22/+13)
  Handle Int*QuotRemOp and Word*QuotRemOp in PPC NCG. Refactor common code with remainder operation.

  Test Plan: validate (I validated on Linux powerpc64le and x86_64)
  Reviewers: erikd, hvr, bgamari, simonmar
  Reviewed By: bgamari
  Subscribers: rwbarton, carter
  Differential Revision: https://phabricator.haskell.org/D5323
* LLVM: Use generic code for small size quot-rem ops (Peter Trommler, 2018-11-22, 1 file, -2/+2)
* Introduce Int16# and Word16# (Abhiroop Sarkar, 2018-11-17, 1 file, -0/+45)
  This builds off of D4475.

  Bumps binary submodule.

  Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar
  Subscribers: rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D5006
* [LlvmCodeGen] Fixes for Int8#/Word8# (Michal Terepeta, 2018-11-07, 1 file, -3/+3)
  This fixes two issues:

  - Using bitcast for MO_XX_Conv
    Arguments to a bitcast must be of the same size. We should be using `trunc` and `zext` instead.
  - Using unsupported MO_*_QuotRem for LLVM
    The two primops `MO_*_QuotRem` are not supported by the LLVM backend, so we shouldn't use them for `Int8#`/`Word8#` (just as we do not use them for `Int#`/`Word#`).

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

  Test Plan: manually run tests with WAY=llvm
  Reviewers: bgamari, simonmar
  Reviewed By: bgamari
  Subscribers: rwbarton, carter
  GHC Trac Issues: #15864
  Differential Revision: https://phabricator.haskell.org/D5304
* Add Int8# and Word8# (Michal Terepeta, 2018-11-02, 1 file, -14/+60)
  This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74

  The main highlights/changes:

  - primops.txt.pp gets two new sections for two new primitive types for signed and unsigned 8-bit integers (Int8# and Word8# respectively) along with basic arithmetic and comparison operations. PrimRep/RuntimeRep get two new constructors for them. All of the primops translate into the existing MachOPs.
  - For CmmCalls the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back to their original width.
  - x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work.

  This is the second attempt at merging this, after the first attempt in D4475 had to be backed out due to regressions on i386.

  Bumps binary submodule.

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

  Test Plan: ./validate (on both x86-{32,64})
  Reviewers: bgamari, hvr, goldfire, simonmar
  Subscribers: rwbarton, carter
  Differential Revision: https://phabricator.haskell.org/D5258
* Fix dataToTag# argument evaluation (Ömer Sinan Ağacan, 2018-10-10, 1 file, -6/+0)
  See #15696 for more details. We now always enter dataToTag# argument (done in generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag# applications are done by the simplifier. Looking at tag bits (instead of reading the info table) for small types is left to another diff.

  Incorrect test T14626 is removed. We no longer do this optimisation (see comment:44, comment:45, comment:60).

  Comments and notes about special cases around dataToTag# are removed. We no longer have any special cases around it in Core.

  Other changes related to evaluating primops (seq# and dataToTag#) will be pursued in follow-up diffs.

  Test Plan: Validates with three regression tests
  Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer
  Reviewed By: simonmar
  Subscribers: rwbarton, carter
  GHC Trac Issues: #15696
  Differential Revision: https://phabricator.haskell.org/D5201
* Revert "Add Int8# and Word8#" (Ben Gamari, 2018-10-09, 1 file, -60/+14)
  This unfortunately broke i386 support since it introduced references to byte-sized registers that don't exist on that architecture.

  Reverts binary submodule.

  This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06.
* Add Int8# and Word8# (Michal Terepeta, 2018-10-07, 1 file, -14/+60)
  This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74

  The main highlights/changes:

  - `primops.txt.pp` gets two new sections for two new primitive types for signed and unsigned 8-bit integers (`Int8#` and `Word8#` respectively) along with basic arithmetic and comparison operations. `PrimRep`/`RuntimeRep` get two new constructors for them. All of the primops translate into the existing `MachOP`s.
  - For `CmmCall`s the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back to their original width.
  - x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work.

  Bumps binary submodule.

  Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>

  Test Plan: ./validate with new tests
  Reviewers: hvr, goldfire, bgamari, simonmar
  Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4475
* Add a missing write barrier to small array writes (Ömer Sinan Ağacan, 2018-10-06, 1 file, -0/+1)
  Write barriers for large array writes were added in D2525, as a part of #12469. However, it seems we forgot about small arrays. This patch adds the same write barrier to small array writes.

  Reviewers: simonmar, bgamari
  Reviewed By: simonmar
  Subscribers: rwbarton, carter
  GHC Trac Issues: #12469
  Differential Revision: https://phabricator.haskell.org/D5209
* Finish stable split (David Feuer, 2018-08-29, 1 file, -8/+13)
  Long ago, the stable name table and stable pointer tables were one. Now, they are separate, and have significantly different implementations. I believe the time has come to finish the split that began in #7674.

  * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`.
  * Give each table its own mutex.
  * Add FFI functions `hs_lock_stable_ptr_table` and `hs_unlock_stable_ptr_table` and document them. These are intended to replace the previously undocumented `hs_lock_stable_tables` and `hs_unlock_stable_tables`, which are now documented as deprecated synonyms.
  * Make `eqStableName#` use pointer equality instead of unnecessarily comparing stable name table indices.

  Reviewers: simonmar, bgamari, erikd
  Reviewed By: bgamari
  Subscribers: rwbarton, carter
  GHC Trac Issues: #15555
  Differential Revision: https://phabricator.haskell.org/D5084
* Fix precision of asinh/acosh/atanh by making them primops (Artem Pelenitsyn, 2018-08-21, 1 file, -0/+6)
  Reviewers: hvr, bgamari, simonmar, jrtc27
  Reviewed By: bgamari
  Subscribers: alpmestan, rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D5034
* Turn on MonadFail desugaring by default (Herbert Valerio Riedel, 2018-08-07, 1 file, -16/+13)
  Summary: This contains two commits:

  ----

  Make GHC's code-base compatible w/ `MonadFail`

  There were a couple of use-sites which implicitly used pattern-matches in `do`-notation even though the underlying `Monad` didn't explicitly support `fail`.

  This refactoring turns those use-sites into explicit case discriminations and adds a `MonadFail` instance for `UniqSM` (`UniqSM` was the worst offender so this has been postponed for a follow-up refactoring).

  ---

  Turn on MonadFail desugaring by default

  This finally implements the phase scheduled for GHC 8.6 according to https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy

  This also preserves some tests that assumed MonadFail desugaring to be active; all ghc boot libs were already made compatible with this `MonadFail` long ago, so no changes were needed there.

  Test Plan: Locally performed ./validate --fast
  Reviewers: bgamari, simonmar, jrtc27, RyanGlScott
  Reviewed By: bgamari
  Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D5028
* Rename some mutable closure types for consistency (Ömer Sinan Ağacan, 2018-06-05, 1 file, -8/+8)
      SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY
      SMALL_MUT_ARR_PTRS_FROZEN  -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN
      MUT_ARR_PTRS_FROZEN0       -> MUT_ARR_PTRS_FROZEN_DIRTY
      MUT_ARR_PTRS_FROZEN        -> MUT_ARR_PTRS_FROZEN_CLEAN

  Naming is now consistent with other CLEAN/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS).

  (Alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR.)

  Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type.

  Reviewers: bgamari, simonmar, erikd
  Reviewed By: simonmar
  Subscribers: rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4784
* Add 'addWordC#' PrimOp (Sebastian Graf, 2018-05-05, 1 file, -10/+62)
  This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'.

  I found 'plusWord2#' while implementing this, which both lacks documentation and has a slightly different specification than 'addWordC#', which means the generic implementation is unnecessarily complex.

  While I was at it, I also added lacking meta-information on PrimOps and refactored 'subWordC#'s generic implementation to be branchless.

  Reviewers: bgamari, simonmar, jrtc27, dfeuer
  Reviewed By: bgamari, dfeuer
  Subscribers: dfeuer, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4592
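To illustrate what a branchless carry or borrow computation can look like, here is a boxed model using a well-known bit-twiddling formulation (the carry and borrow formulas from "Hacker's Delight"). It is a sketch of the technique, not a transcription of the Cmm that GHC generates; `addWordCarry` and `subWordBorrow` are hypothetical names.

```haskell
import Data.Bits (complement, shiftR, (.&.), (.|.))
import Data.Word (Word64)

-- Wrapping sum plus carry-out (0 or 1), computed without a branch:
-- the carry is the top bit of (a & b) | ((a | b) & ~r).
addWordCarry :: Word64 -> Word64 -> (Word64, Word64)
addWordCarry a b = (r, c)
  where
    r = a + b
    c = ((a .&. b) .|. ((a .|. b) .&. complement r)) `shiftR` 63

-- Wrapping difference plus borrow-out (0 or 1):
-- the borrow is the top bit of (~a & b) | ((~a | b) & r).
subWordBorrow :: Word64 -> Word64 -> (Word64, Word64)
subWordBorrow a b = (r, c)
  where
    r = a - b
    c = ((complement a .&. b) .|. ((complement a .|. b) .&. r)) `shiftR` 63
```

The flag is produced by a fixed sequence of bitwise operations and one shift, so no conditional jump is needed on any target.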
* Add unaligned bytearray access primops. Fixes #4442. (Reiner Pope, 2018-03-25, 1 file, -0/+53)
  Reviewers: bgamari, simonmar
  Reviewed By: bgamari
  Subscribers: dfeuer, rwbarton, thomie, carter
  GHC Trac Issues: #4442
  Differential Revision: https://phabricator.haskell.org/D4488
* myThreadId# is trivial; make it an inline primop (Simon Marlow, 2018-02-18, 1 file, -0/+3)
  The pattern `threadCapability =<< myThreadId` is used a lot in code that uses `hs_try_putmvar`; I want to make it cheaper.

  Test Plan: validate
  Reviewers: bgamari, erikd
  Reviewed By: bgamari
  Subscribers: rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4381
* Tidy up and consolidate canned CmmReg and CmmGlobals (Simon Marlow, 2018-02-18, 1 file, -9/+9)
  Test Plan: validate
  Reviewers: bgamari, erikd
  Reviewed By: bgamari
  Subscribers: rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4380
* Add ptr-eq short-cut to `compareByteArrays#` primitive (Herbert Valerio Riedel, 2018-01-26, 1 file, -0/+43)
  This is an obvious optimisation whose overhead is negligible but which significantly simplifies the common uses of `compareByteArrays#`, which would otherwise require *careful* use of `reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#`, which can result in less optimal assembler code being generated.

  Test Plan: carefully examined generated cmm/asm code; validate via phab
  Reviewers: alexbiehl, bgamari, simonmar
  Reviewed By: bgamari, simonmar
  Subscribers: rwbarton, thomie, carter
  Differential Revision: https://phabricator.haskell.org/D4319
* Add new mbmi and mbmi2 compiler flags (John Ky, 2018-01-21, 1 file, -0/+28)
  This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines.

  Implement x86 code generator for pdep and pext. Properly initialise bmiVersion field.

  pdep and pext test cases

  Fix pattern match for pdep and pext instructions

  Fix build of pdep and pext code for 32-bit architectures

  Test Plan: Validate
  Reviewers: austin, simonmar, bgamari, angerman
  Reviewed By: bgamari
  Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy
  GHC Trac Issues: #14206
  Differential Revision: https://phabricator.haskell.org/D4236
* Get rid of some stuttering in comments and docs (Gabor Greif, 2017-12-19, 1 file, -1/+1)
* Revert "Add new mbmi and mbmi2 compiler flags" (Ben Gamari, 2017-11-22, 1 file, -78/+0)
  This broke the 32-bit build.

  This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
* Add new mbmi and mbmi2 compiler flags (John Ky, 2017-11-15, 1 file, -0/+78)
  This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines.

  Test Plan: Validate
  Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd
  Reviewed By: bgamari
  Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie
  GHC Trac Issues: #14206
  Differential Revision: https://phabricator.haskell.org/D4063
* Turn `compareByteArrays#` out-of-line primop into inline primop (alexbiehl, 2017-10-29, 1 file, -1/+40)
  Depends on D4090

  Reviewers: austin, bgamari, erikd, simonmar, alexbiehl
  Reviewed By: bgamari
  Subscribers: rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D4091
* compiler: introduce custom "GhcPrelude" Prelude (Herbert Valerio Riedel, 2017-09-19, 1 file, -2/+2)
  This switches the compiler/ component to get compiled with -XNoImplicitPrelude and inserts an `import GhcPrelude` into all modules.

  This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))`, which would cause lots of name clashes in every module which also imports `Outputable`.

  Reviewers: austin, goldfire, bgamari, alanz, simonmar
  Reviewed By: bgamari
  Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
  Differential Revision: https://phabricator.haskell.org/D3989
* Use lengthIs and friends in more places (Ryan Scott, 2017-06-02, 1 file, -2/+2)
  While investigating #12545, I discovered several places in the code that performed length-checks like so:

  ```
  length ts == 4
  ```

  This is not ideal, since the length of `ts` could be much longer than 4, and we'd be doing way more work than necessary! There are already a slew of helper functions in `Util` such as `lengthIs` that are designed to do this efficiently, so I found every place where they ought to be used and did just that. I also defined a couple more utility functions for list length that were common patterns (e.g., `ltLength`).

  Test Plan: ./validate
  Reviewers: austin, hvr, goldfire, bgamari, simonmar
  Reviewed By: bgamari, simonmar
  Subscribers: goldfire, rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D3622
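The point of a helper like `lengthIs` is that it stops walking the list as soon as the answer is known. The following is a sketch written in that spirit; it is not copied from GHC's `Util` module, whose exact definition may differ.

```haskell
-- Does the list have exactly n elements? Inspects at most n + 1 cells,
-- so it terminates even on very long or infinite lists.
lengthIs :: [a] -> Int -> Bool
lengthIs xs n
  | n < 0     = False
  | otherwise = go xs n
  where
    go []       k = k == 0
    go (_ : ys) k = k > 0 && go ys (k - 1)
```

By contrast, `length ts == 4` always traverses the whole of `ts` before comparing.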
* PPC NCG: Lower MO_*_Fabs as PowerPC fabs instruction (Peter Trommler, 2017-05-01, 1 file, -2/+4)
  In Phab:D3265 we introduced MO_F32_Fabs and MO_F64_Fabs. This patch improves code generation by generating PowerPC fabs instructions.

  Test Plan: run numeric/should_run/numrun015 or validate
  Reviewers: austin, bgamari, hvr, simonmar, erikd
  Reviewed By: erikd
  Subscribers: rwbarton, thomie
  Differential Revision: https://phabricator.haskell.org/D3512
* PPC NCG: Implement callish prim ops (Peter Trommler, 2017-04-25, 1 file, -8/+20)
  Provide PowerPC optimised implementations of callish prim ops.

  MO_?_QuotRem
    The generic implementation of quotient remainder prim ops uses a division and a remainder operation. There is no remainder on PowerPC and so we need to implement remainder "by hand" which results in a duplication of the divide operation when using the generic code. Avoid this duplication by implementing the prim op in the native code generator.

  MO_U_Mul2
    Use PowerPC's instructions for long multiplication.

  Addition and subtraction
    Use PowerPC add/subtract with carry/overflow instructions.

  MO_Clz and MO_Ctz
    Use PowerPC's CNTLZ instruction and implement count trailing zeros using count leading zeros.

  MO_QuotRem2
    Implement an algorithm given by Henry Warren in "Hacker's Delight" using PowerPC divide instruction. TODO: Use long division instructions when available (POWER7 and later).

  Test Plan: validate on AIX and 32-bit Linux
  Reviewers: simonmar, erikd, hvr, austin, bgamari
  Reviewed By: erikd, hvr, bgamari
  Subscribers: trofi, kgardas, thomie
  Differential Revision: https://phabricator.haskell.org/D2973
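Two of the ideas above are easy to model at the Haskell level. The sketch below assumes 64-bit words; `quotRemOneDivide` and `ctzViaClz` are hypothetical names, and the count-trailing-zeros identity shown is one standard formulation, not necessarily the exact instruction sequence the PPC backend emits.

```haskell
import Data.Bits (complement, countLeadingZeros, (.&.))
import Data.Word (Word64)

-- Quotient and remainder with a single division: the remainder is
-- recovered "by hand" as n - q * d. The divisor must be non-zero.
quotRemOneDivide :: Int -> Int -> (Int, Int)
quotRemOneDivide n d = (q, n - q * d)
  where
    q = n `quot` d

-- Count trailing zeros using only count leading zeros:
-- (~x & (x - 1)) has ones exactly in the trailing-zero positions of x.
ctzViaClz :: Word64 -> Int
ctzViaClz x = 64 - countLeadingZeros (complement x .&. (x - 1))
```

For a zero input, `ctzViaClz` returns the word width, matching the usual convention for count-trailing-zeros.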
* Generate better fp abs for X86 and llvm with default cmm otherwise (Dominic Steinitz, 2017-03-07, 1 file, -0/+34)
  Currently we have this in libraries/base/GHC/Float.hs:

  ```
  abs x | x == 0    = 0 -- handles (-0.0)
        | x >  0    = x
        | otherwise = negateFloat x
  ```

  But 3-4 years ago it was noted that this was inefficient: https://mail.haskell.org/pipermail/libraries/2013-April/019690.html

  We can generate better code for X86 and llvm and for others generate some custom cmm code which is similar to what the compiler generates now.

  Reviewers: austin, simonmar, hvr, bgamari
  Reviewed By: bgamari
  Subscribers: dfeuer, thomie
  Differential Revision: https://phabricator.haskell.org/D3265
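One way to see why a branch-free absolute value is possible: in the IEEE 754 binary64 format the sign is a single bit, so clearing it yields the magnitude, including for negative zero. The sketch below illustrates that idea in plain Haskell; `fabsBits` is a hypothetical name and this is not claimed to be the code the backends emit.

```haskell
import Data.Bits (clearBit)
import GHC.Float (castDoubleToWord64, castWord64ToDouble)

-- Absolute value of a Double by clearing the sign bit (bit 63) of its
-- IEEE 754 representation. No comparisons, so -0.0 needs no special case.
fabsBits :: Double -> Double
fabsBits = castWord64ToDouble . (`clearBit` 63) . castDoubleToWord64
```

Unlike the guarded definition quoted in the commit message, this version performs no comparison against zero at all.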
* Use newBlockId instead of newLabelC (Ben Gamari, 2016-11-29, 1 file, -1/+2)
  This seems like a clearer name and the fewer functions that one needs to remember, the better.

  Test Plan: validate
  Reviewers: austin, simonmar, michalt
  Reviewed By: simonmar, michalt
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2735
* StgCmmPrim: Add missing write barrier. (Peter Trommler, 2016-10-19, 1 file, -0/+4)
  On architectures with weak memory consistency a write barrier is needed before the write to the pointer array.

  Fixes #12469

  Test Plan: rebuilt Stackage nightly twice on powerpc64le
  Reviewers: hvr, rrnewton, erikd, austin, simonmar, bgamari
  Reviewed By: erikd, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2525
  GHC Trac Issues: #12469
* StgCmmPrim: Add missing MO_WriteBarrier (Ben Gamari, 2016-08-31, 1 file, -2/+5)
  Test Plan: Good question
  Reviewers: austin, trommler, simonmar, rrnewton
  Reviewed By: simonmar
  Subscribers: RyanGlScott, thomie
  Differential Revision: https://phabricator.haskell.org/D2495
  GHC Trac Issues: #12469
* Remove StgRubbishArg and CmmArg (Ömer Sinan Ağacan, 2016-08-10, 1 file, -13/+12)
  The idea behind adding special "rubbish" arguments was that, in unboxed sum types, some arguments are not used depending on the tag, and we don't want to move special values (like 0 for literals and some special pointer for boxed slots) for those arguments to stack locations or registers. "StgRubbishArg" was an indicator to the code generator that the value won't be used. During Stg-to-Cmm we were then not generating any move or store instructions at all.

  This caused problems in the register allocator because some variables were only initialized in some code paths. As an example, suppose we have this STG (after unarise):

      Lib.$WT =
          \r [dt_sit]
              case
                  case dt_sit of {
                    Lib.F dt_siv [Occ=Once] ->
                        (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#];
                    Lib.I dt_siw [Occ=Once] ->
                        (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw];
                  }
              of
              dt_six
              { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE];
              };

  This basically unpacks a sum type to an unboxed sum with 3 fields, and then moves the unboxed sum to a constructor (`Lib.T`).

  This is the Cmm for the inner case expression (the case expression in the scrutinee position of the outer case):

      ciN:
          ...
          -- look at dt_sit's tag
          if (_ciT::P64 != 1) goto ciS; else goto ciR;
      ciS: -- Tag is 2, i.e. Lib.I
          _siw::I64 = I64[_siu::P64 + 6];
          _giE::I64 = _siw::I64;
          _giD::P64 = stg_RUBBISH_ENTRY_info;
          _giC::I64 = 2;
          goto ciU;
      ciR: -- Tag is 1, i.e. Lib.F
          _siv::P64 = P64[_siu::P64 + 7];
          _giD::P64 = _siv::P64;
          _giC::I64 = 1;
          goto ciU;

  Here one of the blocks `ciS` and `ciR` is executed and then execution continues to `ciU`, but only `ciS` initializes `_giE`; in the other branch `_giE` is not initialized, because it's "rubbish" in the STG and so we don't generate an assignment during code generation. The code generator then panics during register allocation:

      ghc-stage1: panic! (the 'impossible' happened)
        (GHC version 8.1.20160722 for x86_64-unknown-linux):
              LocalReg's live-in to graph ciY {_giE::I64}

  (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a pointer slot; we have to initialize it, otherwise the garbage collector follows the pointer to some random place. So we only remove the assignment if the "rubbish" arg has unboxed type.)

  This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize rubbish slots. If the slot is for boxed types we use the existing `absentError`, otherwise we initialize the slot with literal 0.

  Reviewers: simonpj, erikd, austin, simonmar, bgamari
  Reviewed By: erikd
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D2446
* Implement unboxed sum primitive type (Ömer Sinan Ağacan, 2016-07-21, 1 file, -12/+13)
  Summary: This patch implements primitive unboxed sum types, as described in https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.

  Main changes are:

  - Add new syntax for unboxed sums types, terms and patterns. Hidden behind `-XUnboxedSums`.
  - Add unlifted unboxed sum type constructors and data constructors, extend type and pattern checkers and desugarer.
  - Add new RuntimeRep for unboxed sums.
  - Extend unarise pass to translate unboxed sums to unboxed tuples right before code generation.
  - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better code generation when sum values are involved.
  - Add user manual section for unboxed sums.

  Some other changes:

  - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to `MultiValAlt` to be able to use those with both sums and tuples.
  - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really wrong; given an `Any` `TyCon`, there's no way to tell what its kind is, but `kindPrimRep` and in turn `tyConPrimRep` return `PtrRep`.
  - Fix some bugs on the way: #12375.

  Not included in this patch:

  - Update Haddock for the new unboxed sum syntax.
  - `TemplateHaskell` support is left as future work.

  For reviewers:

  - Front-end code is mostly trivial and adapted from unboxed tuple code for type checking, pattern checking, renaming, desugaring etc.
  - Main translation routines are in `RepType` and `UnariseStg`. Documentation in `UnariseStg` should be enough for understanding what's going on.

  Credits:

  - Johan Tibell wrote the initial front-end and interface file extensions.
  - Simon Peyton Jones reviewed this patch many times, wrote some code, and helped with debugging.

  Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin, simonmar, hvr, erikd
  Reviewed By: simonpj
  Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire, thomie, mpickering
  Differential Revision: https://phabricator.haskell.org/D2259
* Compact Regions (Giovanni Campagna, 2016-07-20, 1 file, -1/+4)
  This brings in initial support for compact regions, as described in the ICFP 2015 paper "Efficient Communication and Collection with Compact Normal Forms" (Edward Z. Yang et al.) and implemented by Giovanni Campagna.

  Some things may change before the 8.2 release, but I (Simon M.) wanted to get the main patch committed so that we can iterate.

  What documentation exists is in the Data.Compact module in the new compact package. We'll need to extend and polish the documentation before the release.

  Test Plan: validate (new test cases included)
  Reviewers: ezyang, simonmar, hvr, bgamari, austin
  Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
  Differential Revision: https://phabricator.haskell.org/D1264
  GHC Trac Issues: #11493
* Drop pre-AMP compatibility CPP conditionals (Herbert Valerio Riedel, 2015-12-31, 1 file, -2/+0)
  Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP compatibility CPP-mess for good!

  Reviewers: austin, goldfire, bgamari
  Subscribers: goldfire, thomie, erikd
  Differential Revision: https://phabricator.haskell.org/D1724
* Add subWordC# on x86ish (Nikita Karetnikov, 2015-10-31, 1 file, -0/+17)
  This adds a subWordC# primop which implements subtraction with overflow reporting.

  Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr
  Reviewed By: bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1334
  GHC Trac Issues: #10962
* s/StgArrWords/StgArrBytes/ (Siddhanathan Shanmugam, 2015-09-11, 1 file, -4/+4)
  Rename StgArrWords to StgArrBytes (see Trac #8552)

  Reviewed By: austin
  Differential Revision: https://phabricator.haskell.org/D1233
  GHC Trac Issues: #8552
* Fix trac #10413 (Ben Gamari, 2015-09-02, 1 file, -2/+5)
  Test Plan: Validate.
  Reviewers: austin, tibbe, bgamari
  Reviewed By: tibbe, bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1194
  GHC Trac Issues: #10413
* Implement getSizeofMutableByteArrayOp primop (Ben Gamari, 2015-08-21, 1 file, -0/+5)
  Now since ByteArrays are mutable we need to be more explicit about when the size is queried.

  Test Plan: Add testcase and validate
  Reviewers: goldfire, hvr, austin
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1139
  GHC Trac Issues: #9447
* Support MO_U_QuotRem2 in LLVM backend (Michal Terepeta, 2015-08-03, 1 file, -1/+2)
  This adds support for MO_U_QuotRem2 in the LLVM backend. Similarly to MO_U_Mul2 we use the standard LLVM instructions (in this case 'udiv' and 'urem') but do the computation on double the word width (e.g., for 64-bit we will do them on 128-bit registers).

  Test Plan: validate
  Reviewers: rwbarton, austin, bgamari
  Reviewed By: bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1100
  GHC Trac Issues: #9430
* LlvmCodeGen: add support for MO_U_Mul2 CallishMachOp (Michal Terepeta, 2015-07-20, 1 file, -1/+2)
  This adds support for MO_U_Mul2 to the LLVM backend by simply using the 'mul' instruction but operating at twice the bit width (e.g., for 64-bit words we will generate a mul that operates on 128 bits and then extract the two 64-bit values for the result of the CallishMachOp).

  Test Plan: validate
  Reviewers: rwbarton, austin, bgamari
  Reviewed By: bgamari
  Subscribers: thomie
  Differential Revision: https://phabricator.haskell.org/D1068
  GHC Trac Issues: #9430
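The "multiply at twice the width, then split" approach can be shown at a smaller scale. The sketch below uses 32-bit words widened to 64 bits, since `Word64` is readily available in Haskell; the commit itself concerns 64-bit words widened to 128 bits in LLVM, and `mul2` here is a hypothetical helper.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word32, Word64)

-- Full 32x32 -> 64 bit unsigned multiplication, returned as
-- (high word, low word). The product is computed at double width and
-- then split, mirroring the idea described in the commit message.
mul2 :: Word32 -> Word32 -> (Word32, Word32)
mul2 a b = (fromIntegral (wide `shiftR` 32), fromIntegral wide)
  where
    wide :: Word64
    wide = fromIntegral a * fromIntegral b
```

The low word is obtained by truncation and the high word by a right shift of one word width, so no overflow checks are needed.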
* Support MO_{Add,Sub}IntC and MO_Add2 in the LLVM backend (Michal Terepeta, 2015-07-04, 1 file, -4/+9)
  This includes:

  - Adding a new LlvmType called LMStructP that represents an unpacked struct (this is necessary since LLVM intrinsics such as llvm.sadd.with.overflow.* return an unpacked struct).
  - Modifications to LlvmCodeGen.CodeGen to generate the LLVM instructions for the primops.
  - Modifications to StgCmmPrim to actually use those three instructions if we use the LLVM backend (so far they were only used for NCG).

  Test Plan: validate
  Reviewers: austin, rwbarton, bgamari
  Reviewed By: bgamari
  Subscribers: thomie, bgamari
  Differential Revision: https://phabricator.haskell.org/D991
  GHC Trac Issues: #9430