summaryrefslogtreecommitdiff
path: root/compiler/codeGen
Commit message (Collapse)AuthorAgeFilesLines
* Use ByteString to represent Cmm string literals (#16198)Sylvain Henry2019-01-313-12/+11
| | | | Also used ByteString in some other relevant places
* PPC NCG: Remove Darwin supportPeter Trommler2019-01-012-27/+5
| | | | | | | Support for Mac OS X on PowerPC has been dropped by Apple years ago. We follow suit and remove PowerPC support for Darwin. Fixes #16106.
* Typo fix, replace a foldl with foldl'Ömer Sinan Ağacan2018-12-121-3/+3
|
* PPC NCG: Generate MO_?_QuotRem for subword sizesPeter Trommler2018-12-111-22/+13
| | | | | | | | | | | | | | | Handle Int*QuotRemOP and Word*QuotRemOp in PPC NCG. Refactor common code with remainder operation. Test Plan: validate (I validated on Linux powerpc64le and x86_64) Reviewers: erikd, hvr, bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5323
* Don't use a generic apply thunk for known callsSebastian Graf2018-12-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, an AP thunk like `sat = f a b c` will not have its own entry point and info pointer and will instead reuse a generic apply thunk like `stg_ap_4_upd`. That's great from a code size perspective, but if `f` is a known function, a specialised entry point with a plain call can be much faster than figuring out the arity and doing dynamic dispatch. This looks at `f`s arity to figure out if it is a known function and if so, it will not lower it to a generic apply function. Benchmark results are encouraging: No changes to allocation, but 0.2% less counted instructions. Test Plan: Validates locally Reviewers: simonmar, osa1, simonpj, bgamari Reviewed By: simonpj Subscribers: rwbarton, carter GHC Trac Issues: #16005 Differential Revision: https://phabricator.haskell.org/D5414
* Deduplicate decision to count thunks in `-ticky`Sebastian Graf2018-11-301-8/+14
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, the logic that checks whether a thunk has a counter or not was duplicated in multiple functions. This led to thunk enters being accounted to their enclosing functions in `StgCmmTicky.tickyEnterThunk`, because the outer call to `withNewTickyCounterThunk` didn't set the counter label for the thunk. And rightly so! `tickyEnterThunk` should only account thunk enters to a counter if `-ticky-dyn-thunk` is on. This patch extracts the logic that was already present in its most general form in `withNewTickyCounterThunk` into its own functions and lets all other call sites checking for `-ticky-dyn-thunk` call this new function named `thunkHasCounter` instead. Reviewers: bgamari, simonmar Reviewed By: simonmar Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5392
* Comments onlySimon Peyton Jones2018-11-282-19/+23
|
* Add Note [Dead case binders in -O0]Sebastian Graf2018-11-281-1/+13
| | | | | | | | | | | | | | | After reverting Phab:D5358, Simon (Peyton Jones) asked for a Note summarising why we want to keep the dead case binder check in `cgCase`. Summary from mail conversation: * Phab:D5324 means that we no longer /recompute/ dead-ness of case-binders in STG-land * But TidyPgm preserves dead-ness info (see CoreTidy.tidyIdBndr) * And so we can take advantage of it to avoid a redundant load. This load would be eliminated by CmmSink, but that only happens with -O
* Revert "Remove redundant check in cgCase"Ömer Sinan Ağacan2018-11-261-4/+7
| | | | | | This reverts commit d13b7d60650cb84af11ee15b3f51c3511548cfdb. (See discussion in D5358)
* Implement late lambda liftSebastian Graf2018-11-233-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This implements a selective lambda-lifting pass late in the STG pipeline. Lambda lifting has the effect of avoiding closure allocation at the cost of having to make former free vars available at call sites, possibly enlarging closures surrounding call sites in turn. We identify beneficial cases by means of an analysis that estimates closure growth. There's a Wiki page at https://ghc.haskell.org/trac/ghc/wiki/LateLamLift. Reviewers: simonpj, bgamari, simonmar Reviewed By: simonpj Subscribers: rwbarton, carter GHC Trac Issues: #9476 Differential Revision: https://phabricator.haskell.org/D5224
* Fix unused-import warningsDavid Eichmann2018-11-221-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch fixes a fairly long-standing bug (dating back to 2015) in RdrName.bestImport, namely commit 9376249b6b78610db055a10d05f6592d6bbbea2f Author: Simon Peyton Jones <simonpj@microsoft.com> Date: Wed Oct 28 17:16:55 2015 +0000 Fix unused-import stuff in a better way In that patch got the sense of the comparison back to front, and thereby failed to implement the unused-import rules described in Note [Choosing the best import declaration] in RdrName This led to Trac #13064 and #15393 Fixing this bug revealed a bunch of unused imports in libraries; the ones in the GHC repo are part of this commit. The two important changes are * Fix the bug in bestImport * Modified the rules by adding (a) in Note [Choosing the best import declaration] in RdrName Reason: the previosu rules made Trac #5211 go bad again. And the new rule (a) makes sense to me. In unravalling this I also ended up doing a few other things * Refactor RnNames.ImportDeclUsage to use a [GlobalRdrElt] for the things that are used, rather than [AvailInfo]. This is simpler and more direct. * Rename greParentName to greParent_maybe, to follow GHC naming conventions * Delete dead code RdrName.greUsedRdrName Bumps a few submodules. Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27 Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5312
* LLVM: Use generic code for small size quot-rem opsPeter Trommler2018-11-221-2/+2
|
* Rename literal constructorsSylvain Henry2018-11-222-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | In a previous patch we replaced some built-in literal constructors (MachInt, MachWord, etc.) with a single LitNumber constructor. In this patch we replace the `Mach` prefix of the remaining constructors with `Lit` for consistency (e.g., LitChar, LitLabel, etc.). Sadly the name `LitString` was already taken for a kind of FastString and it would become misleading to have both `LitStr` (literal constructor renamed after `MachStr`) and `LitString` (FastString variant). Hence this patch renames the FastString variant `PtrString` (which is more accurate) and the literal string constructor now uses the least surprising `LitString` name. Both `Literal` and `LitString/PtrString` have recently seen breaking changes so doing this kind of renaming now shouldn't harm much. Reviewers: hvr, goldfire, bgamari, simonmar, jrtc27, tdammers Subscribers: tdammers, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4881
* Remove redundant check in cgCaseÖmer Sinan Ağacan2018-11-201-7/+4
| | | | | | | | | | | | | D5339 (part of D5324) removed the dead case binder analysis done during CoreToStg so this condition always holds now. Test Plan: Validated locally. Reviewers: sgraf, bgamari, simonmar Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5358
* Don't track free variables in STG syntax by defaultSebastian Graf2018-11-194-29/+35
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, `CoreToStg` annotates `StgRhsClosure`s with their set of non-global free variables. This free variable information is only needed in the final code generation step (i.e. `StgCmm.codeGen`), which leads to transformations such as `StgCse` and `StgUnarise` having to maintain this information. This is tiresome and unnecessary, so this patch introduces a trees-to-grow-like approach that only introduces the free variable set into the syntax tree in the code gen pass, along with a free variable analysis on STG terms to generate that information. Fixes #15754. Reviewers: simonpj, osa1, bgamari, simonmar Reviewed By: osa1 Subscribers: rwbarton, carter GHC Trac Issues: #15754 Differential Revision: https://phabricator.haskell.org/D5324
* Introduce Int16# and Word16#Abhiroop Sarkar2018-11-172-0/+47
| | | | | | | | | | | | This builds off of D4475. Bumps binary submodule. Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D5006
* Remove StgBinderInfo and related computation in CoreToStgÖmer Sinan Ağacan2018-11-124-97/+10
| | | | | | | | | | | | | | | | | | | | | | - The StgBinderInfo type was never used in the code gen, so the type, related computation in CoreToStg, and some comments about it are removed. See #15770 for more details. - Simplified CoreToStg after removing the StgBinderInfo computation: removed StgBinderInfo arguments and mfix stuff. The StgBinderInfo values were not used in the code gen, but I still run nofib just to make sure: 0.0% change in allocations and binary sizes. Test Plan: Validated locally Reviewers: simonpj, simonmar, bgamari, sgraf Reviewed By: sgraf Subscribers: AndreasK, sgraf, rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5232
* [LlvmCodeGen] Fixes for Int8#/Word8#Michal Terepeta2018-11-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two isssues: - Using bitcast for MO_XX_Conv Arguments to a bitcast must be of the same size. We should be using `trunc` and `zext` instead. - Using unsupported MO_*_QuotRem for LLVM The two primops `MO_*_QuotRem` are not supported by the LLVM backend, so we shouldn't use them for `Int8#`/`Word8#` (just as we do not use them for `Int#`/`Word#`). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: manually run tests with WAY=llvm Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15864 Differential Revision: https://phabricator.haskell.org/D5304
* Add Int8# and Word8#Michal Terepeta2018-11-022-14/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: primops.txt.pp gets two new sections for two new primitive types for signed and unsigned 8-bit integers (Int8# and Word8 respectively) along with basic arithmetic and comparison operations. PrimRep/RuntimeRep get two new constructors for them. All of the primops translate into the existing MachOPs. For CmmCalls the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. This is the second attempt at merging this, after the first attempt in D4475 had to be backed out due to regressions on i386. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate (on both x86-{32,64}) Reviewers: bgamari, hvr, goldfire, simonmar Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5258
* Fix some broken links (#15733)Fangyi Zhou2018-10-251-1/+1
| | | | | | | | | | | | | | | | Change some URLs from hackage.haskell.org/trac to ghc.haskell.org/trac Test Plan: manually verify links work Reviewers: bgamari, simonmar, mpickering Reviewed By: bgamari, mpickering Subscribers: mpickering, rwbarton, carter GHC Trac Issues: #15733 Differential Revision: https://phabricator.haskell.org/D5257
* Add RubbishLit for absent bindings of UnliftedRepSebastian Graf2018-10-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Trac #9279 reminded us that the worker wrapper transformation copes really badly with absent unlifted boxed bindings. As `Note [Absent errors]` in WwLib.hs points out, we can't just use `absentError` for unlifted bindings because there is no bottom to hide the error in. So instead, we synthesise a new `RubbishLit` of type `forall (a :: TYPE 'UnliftedRep). a`, which code-gen may subsitute for any boxed value. We choose `()`, so that there is a good chance that the program crashes instead instead of leading to corrupt data, should absence analysis have been too optimistic (#11126). Reviewers: simonpj, hvr, goldfire, bgamari, simonmar Reviewed By: simonpj Subscribers: osa1, rwbarton, carter GHC Trac Issues: #15627, #9279, #4306, #11126 Differential Revision: https://phabricator.haskell.org/D5153
* Fix dataToTag# argument evaluationÖmer Sinan Ağacan2018-10-102-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | See #15696 for more details. We now always enter dataToTag# argument (done in generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag# applications are done by the simplifier. Looking at tag bits (instead of reading the info table) for small types is left to another diff. Incorrect test T14626 is removed. We no longer do this optimisation (see comment:44, comment:45, comment:60). Comments and notes about special cases around dataToTag# are removed. We no longer have any special cases around it in Core. Other changes related to evaluating primops (seq# and dataToTag#) will be pursued in follow-up diffs. Test Plan: Validates with three regression tests Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #15696 Differential Revision: https://phabricator.haskell.org/D5201
* Revert "Add Int8# and Word8#"Ben Gamari2018-10-092-62/+14
| | | | | | | | | This unfortunately broke i386 support since it introduced references to byte-sized registers that don't exist on that architecture. Reverts binary submodule This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06.
* Add Int8# and Word8#Michal Terepeta2018-10-072-14/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: - `primops.txt.pp` gets two new sections for two new primitive types for signed and unsigned 8-bit integers (`Int8#` and `Word8` respectively) along with basic arithmetic and comparison operations. `PrimRep`/`RuntimeRep` get two new constructors for them. All of the primops translate into the existing `MachOP`s. - For `CmmCall`s the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. - x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate with new tests Reviewers: hvr, goldfire, bgamari, simonmar Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4475
* Add a missing write barrier to small array writesÖmer Sinan Ağacan2018-10-061-0/+1
| | | | | | | | | | | | | | | | Write barriers for large array writes were added in D2525, as a part of #12469. However it seems we forgot about small arrays. This patch adds the same write barrier to small array writes. Reviewers: simonmar, bgamari Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #12469 Differential Revision: https://phabricator.haskell.org/D5209
* Finish stable splitDavid Feuer2018-08-291-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Long ago, the stable name table and stable pointer tables were one. Now, they are separate, and have significantly different implementations. I believe the time has come to finish the split that began in #7674. * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`. * Give each table its own mutex. * Add FFI functions `hs_lock_stable_ptr_table` and `hs_unlock_stable_ptr_table` and document them. These are intended to replace the previously undocumented `hs_lock_stable_tables` and `hs_lock_stable_tables`, which are now documented as deprecated synonyms. * Make `eqStableName#` use pointer equality instead of unnecessarily comparing stable name table indices. Reviewers: simonmar, bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15555 Differential Revision: https://phabricator.haskell.org/D5084
* Fix precision of asinh/acosh/atanh by making them primopsArtem Pelenitsyn2018-08-211-0/+6
| | | | | | | | | | Reviewers: hvr, bgamari, simonmar, jrtc27 Reviewed By: bgamari Subscribers: alpmestan, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5034
* Replace most occurences of foldl with foldl'.klebinger.andreas@gmx.at2018-08-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds foldl' to GhcPrelude and changes must occurences of foldl to foldl'. This leads to better performance especially for quick builds where GHC does not perform strictness analysis. It does change strictness behaviour when we use foldl' to turn a argument list into function applications. But this is only a drawback if code looks ONLY at the last argument but not at the first. And as the benchmarks show leads to fewer allocations in practice at O2. Compiler performance for Nofib: O2 Allocations: -1 s.d. ----- -0.0% +1 s.d. ----- -0.0% Average ----- -0.0% O2 Compile Time: -1 s.d. ----- -2.8% +1 s.d. ----- +1.3% Average ----- -0.8% O0 Allocations: -1 s.d. ----- -0.2% +1 s.d. ----- -0.1% Average ----- -0.2% Test Plan: ci Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal Reviewed By: bgamari, monoidal Subscribers: tdammers, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4929
* Turn on MonadFail desugaring by defaultHerbert Valerio Riedel2018-08-073-19/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This contains two commits: ---- Make GHC's code-base compatible w/ `MonadFail` There were a couple of use-sites which implicitly used pattern-matches in `do`-notation even though the underlying `Monad` didn't explicitly support `fail` This refactoring turns those use-sites into explicit case discrimations and adds an `MonadFail` instance for `UniqSM` (`UniqSM` was the worst offender so this has been postponed for a follow-up refactoring) --- Turn on MonadFail desugaring by default This finally implements the phase scheduled for GHC 8.6 according to https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy This also preserves some tests that assumed MonadFail desugaring to be active; all ghc boot libs were already made compatible with this `MonadFail` long ago, so no changes were needed there. Test Plan: Locally performed ./validate --fast Reviewers: bgamari, simonmar, jrtc27, RyanGlScott Reviewed By: bgamari Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5028
* Built-in Natural literals in CoreSylvain Henry2018-06-152-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for built-in Natural literals in Core. - Replace MachInt,MachWord, LitInteger, etc. with a single LitNumber constructor with a LitNumType field - Support built-in Natural literals - Add desugar warning for negative literals - Move Maybe(..) from GHC.Base to GHC.Maybe for module dependency reasons This patch introduces only a few rules for Natural literals (compared to Integer's rules). Factorization of the built-in rules for numeric literals will be done in another patch as this one is already big to review. Test Plan: validate test build with integer-simple Reviewers: hvr, bgamari, goldfire, Bodigrim, simonmar Reviewed By: bgamari Subscribers: phadej, simonpj, RyanGlScott, carter, hsyl20, rwbarton, thomie GHC Trac Issues: #14170, #14465 Differential Revision: https://phabricator.haskell.org/D4212
* Rename some mutable closure types for consistencyÖmer Sinan Ağacan2018-06-051-8/+8
| | | | | | | | | | | | | | | | | | | | | | | SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS). (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR) Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type. Reviewers: bgamari, simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4784
* remove dead maybeIsLFConGabor Greif2018-05-291-6/+1
|
* Comments and refactoring onlySimon Marlow2018-05-171-1/+1
| | | | Addressing review comments on D4637
* Merge FUN_STATIC closure with its SRTSimon Marlow2018-05-162-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The idea here is to save a little code size and some work in the GC, by collapsing FUN_STATIC closures and their SRTs. This is (4) in a series; see D4632 for more details. There's a tradeoff here: more complexity in the compiler in exchange for a modest code size reduction (probably around 0.5%). Results: * GHC binary itself (statically linked) is 1% smaller * -0.2% binary sizes in nofib (-0.5% module sizes) Full nofib results comparing D4634 with this: P177 (ignore runtimes, these aren't stable on my laptop) Test Plan: validate, nofib Reviewers: bgamari, niteria, simonpj, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4637
* An overhaul of the SRT representationSimon Marlow2018-05-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
* Add 'addWordC#' PrimOpSebastian Graf2018-05-051-10/+62
| | | | | | | | | | | | | | | | | | | This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'. I found 'plusWord2#' while implementing this, which both lacks documentation and has a slightly different specification than 'addWordC#', which means the generic implementation is unnecessarily complex. While I was at it, I also added lacking meta-information on PrimOps and refactored 'subWordC#'s generic implementation to be branchless. Reviewers: bgamari, simonmar, jrtc27, dfeuer Reviewed By: bgamari, dfeuer Subscribers: dfeuer, thomie, carter Differential Revision: https://phabricator.haskell.org/D4592
* Add unaligned bytearray access primops. Fixes #4442.Reiner Pope2018-03-251-0/+53
| | | | | | | | | | | | Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: dfeuer, rwbarton, thomie, carter GHC Trac Issues: #4442 Differential Revision: https://phabricator.haskell.org/D4488
* Fix seq# case of exprOkForSpeculationSimon Peyton Jones2018-03-201-6/+9
| | | | | | | | | This subtle patch fixes Trac #5129 (again; comment:20 and following). I took the opportunity to document seq# properly; see Note [seq# magic] in PrelRules, and Note [seq# and expr_ok] in CoreUtils.
* Improve accuracy of get/setAllocationCounterBen Gamari2018-03-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: hvr, erikd, simonmar, jrtc27, trommler Reviewed By: simonmar Subscribers: trommler, jrtc27, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4363
* Get rid of more CPP in cmm/ and codeGen/Michal Terepeta2018-03-198-33/+11
| | | | | | | | | | | | | | | | | | This removes a bunch of unnecessary includes of `HsVersions.h` along with unnecessary CPP (e.g., due to checking for DEBUG which can be achieved by looking at `debugIsOn`) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4462
* Fix interpreter with profilingSimon Marlow2018-03-063-16/+28
| | | | | | | | | | | | | | | | | | | This was broken by D3746 and/or D3809, but unfortunately we didn't notice because CI at the time wasn't building the profiling way. Test Plan: ``` cd testsuite/test/profiling/should_run make WAY=ghci-ext-prof ``` Reviewers: bgamari, michalt, hvr, erikd Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14705 Differential Revision: https://phabricator.haskell.org/D4437
* myThreadId# is trivial; make it an inline primopSimon Marlow2018-02-181-0/+3
| | | | | | | | | | | | | | | The pattern `threadCapability =<< myThreadId` is used a lot in code that uses `hs_try_putmvar`, I want to make it cheaper. Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4381
* Tidy up and consolidate canned CmmReg and CmmGlobalsSimon Marlow2018-02-189-66/+45
| | | | | | | | | | | | Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4380
* Add ptr-eq short-cut to `compareByteArrays#` primitiveHerbert Valerio Riedel2018-01-261-0/+43
| | | | | | | | | | | | | | | | | | This is an obvious optimisation whose overhead is neglectable but which significantly simplifies the common uses of `compareByteArrays#` which would otherwise require to make *careful* use of `reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#` which can result in less optimal assembler code being generated. Test Plan: carefully examined generated cmm/asm code; validate via phab Reviewers: alexbiehl, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4319
* Add new mbmi and mbmi2 compiler flagsJohn Ky2018-01-211-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Implement x86 code generator for pdep and pext. Properly initialise bmiVersion field. pdep and pext test cases Fix pattern match for pdep and pext instructions Fix build of pdep and pext code for 32-bit architectures Test Plan: Validate Reviewers: austin, simonmar, bgamari, angerman Reviewed By: bgamari Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4236
* Remove unused extern cost centre collectionÖmer Sinan Ağacan2018-01-181-1/+1
| | | | | | | | | | Reviewers: bgamari, simonmar Reviewed By: simonmar Subscribers: alexbiehl, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4309
* Revert "Improve accuracy of get/setAllocationCounter"Ben Gamari2018-01-181-2/+2
| | | | This reverts commit a1a689dda48113f3735834350fb562bb1927a633.
* Typos in commentsGabor Greif2018-01-171-1/+1
|
* Improve accuracy of get/setAllocationCounterSimon Marlow2018-01-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: niteria, bgamari, hvr, erikd Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4288
* Cache the number of data cons in DataTyCon and SumTyConBartosz Nitka2018-01-041-8/+3
| | | | | | | | | | | | | | | | This is a follow-up after faf60e85 - Make tagForCon non-linear. On the mailing list @simonpj suggested to solve the linear behavior by caching the sizes. Test Plan: ./validate Reviewers: simonpj, simonmar, bgamari, austin Reviewed By: simonpj Subscribers: carter, goldfire, rwbarton, thomie, simonpj Differential Revision: https://phabricator.haskell.org/D4131