summaryrefslogtreecommitdiff
path: root/compiler/codeGen
Commit message (Collapse)AuthorAgeFilesLines
...
* Introduce Int16# and Word16#Abhiroop Sarkar2018-11-172-0/+47
| | | | | | | | | | | | This builds off of D4475. Bumps binary submodule. Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D5006
* Remove StgBinderInfo and related computation in CoreToStgÖmer Sinan Ağacan2018-11-124-97/+10
| | | | | | | | | | | | | | | | | | | | | | - The StgBinderInfo type was never used in the code gen, so the type, related computation in CoreToStg, and some comments about it are removed. See #15770 for more details. - Simplified CoreToStg after removing the StgBinderInfo computation: removed StgBinderInfo arguments and mfix stuff. The StgBinderInfo values were not used in the code gen, but I still run nofib just to make sure: 0.0% change in allocations and binary sizes. Test Plan: Validated locally Reviewers: simonpj, simonmar, bgamari, sgraf Reviewed By: sgraf Subscribers: AndreasK, sgraf, rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5232
* [LlvmCodeGen] Fixes for Int8#/Word8#Michal Terepeta2018-11-071-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes two isssues: - Using bitcast for MO_XX_Conv Arguments to a bitcast must be of the same size. We should be using `trunc` and `zext` instead. - Using unsupported MO_*_QuotRem for LLVM The two primops `MO_*_QuotRem` are not supported by the LLVM backend, so we shouldn't use them for `Int8#`/`Word8#` (just as we do not use them for `Int#`/`Word#`). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: manually run tests with WAY=llvm Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15864 Differential Revision: https://phabricator.haskell.org/D5304
* Add Int8# and Word8#Michal Terepeta2018-11-022-14/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: primops.txt.pp gets two new sections for two new primitive types for signed and unsigned 8-bit integers (Int8# and Word8 respectively) along with basic arithmetic and comparison operations. PrimRep/RuntimeRep get two new constructors for them. All of the primops translate into the existing MachOPs. For CmmCalls the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. This is the second attempt at merging this, after the first attempt in D4475 had to be backed out due to regressions on i386. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate (on both x86-{32,64}) Reviewers: bgamari, hvr, goldfire, simonmar Subscribers: rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5258
* Fix some broken links (#15733)Fangyi Zhou2018-10-251-1/+1
| | | | | | | | | | | | | | | | Change some URLs from hackage.haskell.org/trac to ghc.haskell.org/trac Test Plan: manually verify links work Reviewers: bgamari, simonmar, mpickering Reviewed By: bgamari, mpickering Subscribers: mpickering, rwbarton, carter GHC Trac Issues: #15733 Differential Revision: https://phabricator.haskell.org/D5257
* Add RubbishLit for absent bindings of UnliftedRepSebastian Graf2018-10-141-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Trac #9279 reminded us that the worker wrapper transformation copes really badly with absent unlifted boxed bindings. As `Note [Absent errors]` in WwLib.hs points out, we can't just use `absentError` for unlifted bindings because there is no bottom to hide the error in. So instead, we synthesise a new `RubbishLit` of type `forall (a :: TYPE 'UnliftedRep). a`, which code-gen may subsitute for any boxed value. We choose `()`, so that there is a good chance that the program crashes instead instead of leading to corrupt data, should absence analysis have been too optimistic (#11126). Reviewers: simonpj, hvr, goldfire, bgamari, simonmar Reviewed By: simonpj Subscribers: osa1, rwbarton, carter GHC Trac Issues: #15627, #9279, #4306, #11126 Differential Revision: https://phabricator.haskell.org/D5153
* Fix dataToTag# argument evaluationÖmer Sinan Ağacan2018-10-102-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | See #15696 for more details. We now always enter dataToTag# argument (done in generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag# applications are done by the simplifier. Looking at tag bits (instead of reading the info table) for small types is left to another diff. Incorrect test T14626 is removed. We no longer do this optimisation (see comment:44, comment:45, comment:60). Comments and notes about special cases around dataToTag# are removed. We no longer have any special cases around it in Core. Other changes related to evaluating primops (seq# and dataToTag#) will be pursued in follow-up diffs. Test Plan: Validates with three regression tests Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #15696 Differential Revision: https://phabricator.haskell.org/D5201
* Revert "Add Int8# and Word8#"Ben Gamari2018-10-092-62/+14
| | | | | | | | | This unfortunately broke i386 support since it introduced references to byte-sized registers that don't exist on that architecture. Reverts binary submodule This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06.
* Add Int8# and Word8#Michal Terepeta2018-10-072-14/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is the first step of implementing: https://github.com/ghc-proposals/ghc-proposals/pull/74 The main highlights/changes: - `primops.txt.pp` gets two new sections for two new primitive types for signed and unsigned 8-bit integers (`Int8#` and `Word8` respectively) along with basic arithmetic and comparison operations. `PrimRep`/`RuntimeRep` get two new constructors for them. All of the primops translate into the existing `MachOP`s. - For `CmmCall`s the codegen will now zero-extend the values at call site (so that they can be moved to the right register) and then truncate them back their original width. - x86 native codegen needed some updates, since it wasn't able to deal with the new widths, but all the changes are quite localized. LLVM backend seems to just work. Bumps binary submodule. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate with new tests Reviewers: hvr, goldfire, bgamari, simonmar Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4475
* Add a missing write barrier to small array writesÖmer Sinan Ağacan2018-10-061-0/+1
| | | | | | | | | | | | | | | | Write barriers for large array writes were added in D2525, as a part of #12469. However it seems we forgot about small arrays. This patch adds the same write barrier to small array writes. Reviewers: simonmar, bgamari Reviewed By: simonmar Subscribers: rwbarton, carter GHC Trac Issues: #12469 Differential Revision: https://phabricator.haskell.org/D5209
* Finish stable splitDavid Feuer2018-08-291-8/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Long ago, the stable name table and stable pointer tables were one. Now, they are separate, and have significantly different implementations. I believe the time has come to finish the split that began in #7674. * Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`. * Give each table its own mutex. * Add FFI functions `hs_lock_stable_ptr_table` and `hs_unlock_stable_ptr_table` and document them. These are intended to replace the previously undocumented `hs_lock_stable_tables` and `hs_lock_stable_tables`, which are now documented as deprecated synonyms. * Make `eqStableName#` use pointer equality instead of unnecessarily comparing stable name table indices. Reviewers: simonmar, bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, carter GHC Trac Issues: #15555 Differential Revision: https://phabricator.haskell.org/D5084
* Fix precision of asinh/acosh/atanh by making them primopsArtem Pelenitsyn2018-08-211-0/+6
| | | | | | | | | | Reviewers: hvr, bgamari, simonmar, jrtc27 Reviewed By: bgamari Subscribers: alpmestan, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5034
* Replace most occurences of foldl with foldl'.klebinger.andreas@gmx.at2018-08-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds foldl' to GhcPrelude and changes must occurences of foldl to foldl'. This leads to better performance especially for quick builds where GHC does not perform strictness analysis. It does change strictness behaviour when we use foldl' to turn a argument list into function applications. But this is only a drawback if code looks ONLY at the last argument but not at the first. And as the benchmarks show leads to fewer allocations in practice at O2. Compiler performance for Nofib: O2 Allocations: -1 s.d. ----- -0.0% +1 s.d. ----- -0.0% Average ----- -0.0% O2 Compile Time: -1 s.d. ----- -2.8% +1 s.d. ----- +1.3% Average ----- -0.8% O0 Allocations: -1 s.d. ----- -0.2% +1 s.d. ----- -0.1% Average ----- -0.2% Test Plan: ci Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal Reviewed By: bgamari, monoidal Subscribers: tdammers, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4929
* Turn on MonadFail desugaring by defaultHerbert Valerio Riedel2018-08-073-19/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This contains two commits: ---- Make GHC's code-base compatible w/ `MonadFail` There were a couple of use-sites which implicitly used pattern-matches in `do`-notation even though the underlying `Monad` didn't explicitly support `fail` This refactoring turns those use-sites into explicit case discrimations and adds an `MonadFail` instance for `UniqSM` (`UniqSM` was the worst offender so this has been postponed for a follow-up refactoring) --- Turn on MonadFail desugaring by default This finally implements the phase scheduled for GHC 8.6 according to https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy This also preserves some tests that assumed MonadFail desugaring to be active; all ghc boot libs were already made compatible with this `MonadFail` long ago, so no changes were needed there. Test Plan: Locally performed ./validate --fast Reviewers: bgamari, simonmar, jrtc27, RyanGlScott Reviewed By: bgamari Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D5028
* Built-in Natural literals in CoreSylvain Henry2018-06-152-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for built-in Natural literals in Core. - Replace MachInt,MachWord, LitInteger, etc. with a single LitNumber constructor with a LitNumType field - Support built-in Natural literals - Add desugar warning for negative literals - Move Maybe(..) from GHC.Base to GHC.Maybe for module dependency reasons This patch introduces only a few rules for Natural literals (compared to Integer's rules). Factorization of the built-in rules for numeric literals will be done in another patch as this one is already big to review. Test Plan: validate test build with integer-simple Reviewers: hvr, bgamari, goldfire, Bodigrim, simonmar Reviewed By: bgamari Subscribers: phadej, simonpj, RyanGlScott, carter, hsyl20, rwbarton, thomie GHC Trac Issues: #14170, #14465 Differential Revision: https://phabricator.haskell.org/D4212
* Rename some mutable closure types for consistencyÖmer Sinan Ağacan2018-06-051-8/+8
| | | | | | | | | | | | | | | | | | | | | | | SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS). (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR) Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type. Reviewers: bgamari, simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4784
* remove dead maybeIsLFConGabor Greif2018-05-291-6/+1
|
* Comments and refactoring onlySimon Marlow2018-05-171-1/+1
| | | | Addressing review comments on D4637
* Merge FUN_STATIC closure with its SRTSimon Marlow2018-05-162-18/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The idea here is to save a little code size and some work in the GC, by collapsing FUN_STATIC closures and their SRTs. This is (4) in a series; see D4632 for more details. There's a tradeoff here: more complexity in the compiler in exchange for a modest code size reduction (probably around 0.5%). Results: * GHC binary itself (statically linked) is 1% smaller * -0.2% binary sizes in nofib (-0.5% module sizes) Full nofib results comparing D4634 with this: P177 (ignore runtimes, these aren't stable on my laptop) Test Plan: validate, nofib Reviewers: bgamari, niteria, simonpj, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4637
* An overhaul of the SRT representationSimon Marlow2018-05-161-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
* Add 'addWordC#' PrimOpSebastian Graf2018-05-051-10/+62
| | | | | | | | | | | | | | | | | | | This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'. I found 'plusWord2#' while implementing this, which both lacks documentation and has a slightly different specification than 'addWordC#', which means the generic implementation is unnecessarily complex. While I was at it, I also added lacking meta-information on PrimOps and refactored 'subWordC#'s generic implementation to be branchless. Reviewers: bgamari, simonmar, jrtc27, dfeuer Reviewed By: bgamari, dfeuer Subscribers: dfeuer, thomie, carter Differential Revision: https://phabricator.haskell.org/D4592
* Add unaligned bytearray access primops. Fixes #4442.Reiner Pope2018-03-251-0/+53
| | | | | | | | | | | | Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: dfeuer, rwbarton, thomie, carter GHC Trac Issues: #4442 Differential Revision: https://phabricator.haskell.org/D4488
* Fix seq# case of exprOkForSpeculationSimon Peyton Jones2018-03-201-6/+9
| | | | | | | | | This subtle patch fixes Trac #5129 (again; comment:20 and following). I took the opportunity to document seq# properly; see Note [seq# magic] in PrelRules, and Note [seq# and expr_ok] in CoreUtils.
* Improve accuracy of get/setAllocationCounterBen Gamari2018-03-191-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: hvr, erikd, simonmar, jrtc27, trommler Reviewed By: simonmar Subscribers: trommler, jrtc27, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4363
* Get rid of more CPP in cmm/ and codeGen/Michal Terepeta2018-03-198-33/+11
| | | | | | | | | | | | | | | | | | This removes a bunch of unnecessary includes of `HsVersions.h` along with unnecessary CPP (e.g., due to checking for DEBUG which can be achieved by looking at `debugIsOn`) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4462
* Fix interpreter with profilingSimon Marlow2018-03-063-16/+28
| | | | | | | | | | | | | | | | | | | This was broken by D3746 and/or D3809, but unfortunately we didn't notice because CI at the time wasn't building the profiling way. Test Plan: ``` cd testsuite/test/profiling/should_run make WAY=ghci-ext-prof ``` Reviewers: bgamari, michalt, hvr, erikd Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14705 Differential Revision: https://phabricator.haskell.org/D4437
* myThreadId# is trivial; make it an inline primopSimon Marlow2018-02-181-0/+3
| | | | | | | | | | | | | | | The pattern `threadCapability =<< myThreadId` is used a lot in code that uses `hs_try_putmvar`, I want to make it cheaper. Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4381
* Tidy up and consolidate canned CmmReg and CmmGlobalsSimon Marlow2018-02-189-66/+45
| | | | | | | | | | | | Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4380
* Add ptr-eq short-cut to `compareByteArrays#` primitiveHerbert Valerio Riedel2018-01-261-0/+43
| | | | | | | | | | | | | | | | | | This is an obvious optimisation whose overhead is neglectable but which significantly simplifies the common uses of `compareByteArrays#` which would otherwise require to make *careful* use of `reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#` which can result in less optimal assembler code being generated. Test Plan: carefully examined generated cmm/asm code; validate via phab Reviewers: alexbiehl, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4319
* Add new mbmi and mbmi2 compiler flagsJohn Ky2018-01-211-0/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Implement x86 code generator for pdep and pext. Properly initialise bmiVersion field. pdep and pext test cases Fix pattern match for pdep and pext instructions Fix build of pdep and pext code for 32-bit architectures Test Plan: Validate Reviewers: austin, simonmar, bgamari, angerman Reviewed By: bgamari Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4236
* Remove unused extern cost centre collectionÖmer Sinan Ağacan2018-01-181-1/+1
| | | | | | | | | | Reviewers: bgamari, simonmar Reviewed By: simonmar Subscribers: alexbiehl, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4309
* Revert "Improve accuracy of get/setAllocationCounter"Ben Gamari2018-01-181-2/+2
| | | | This reverts commit a1a689dda48113f3735834350fb562bb1927a633.
* Typos in commentsGabor Greif2018-01-171-1/+1
|
* Improve accuracy of get/setAllocationCounterSimon Marlow2018-01-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: get/setAllocationCounter didn't take into account allocations in the current block. This was known at the time, but it turns out to be important to have more accuracy when using these in a fine-grained way. Test Plan: New unit test to test incrementally larger allocaitons. Before I got results like this: ``` +0 +0 +0 +0 +0 +4096 +0 +0 +0 +0 +0 +4064 +0 +0 +4088 +4056 +0 +0 +0 +4088 +4096 +4056 +4096 ``` Notice how the results aren't always monotonically increasing. After this patch: ``` +344 +416 +488 +560 +632 +704 +776 +848 +920 +992 +1064 +1136 +1208 +1280 +1352 +1424 +1496 +1568 +1640 +1712 +1784 +1856 +1928 +2000 +2072 +2144 ``` Reviewers: niteria, bgamari, hvr, erikd Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4288
* Cache the number of data cons in DataTyCon and SumTyConBartosz Nitka2018-01-041-8/+3
| | | | | | | | | | | | | | | | This is a follow-up after faf60e85 - Make tagForCon non-linear. On the mailing list @simonpj suggested to solve the linear behavior by caching the sizes. Test Plan: ./validate Reviewers: simonpj, simonmar, bgamari, austin Reviewed By: simonpj Subscribers: carter, goldfire, rwbarton, thomie, simonpj Differential Revision: https://phabricator.haskell.org/D4131
* Get rid of some stuttering in comments and docsGabor Greif2017-12-191-1/+1
|
* typos in local varGabor Greif2017-12-141-1/+1
|
* Revert "Add new mbmi and mbmi2 compiler flags"Ben Gamari2017-11-221-78/+0
| | | | | | This broke the 32-bit build. This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
* Add new mbmi and mbmi2 compiler flagsJohn Ky2017-11-151-0/+78
| | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Test Plan: Validate Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4063
* Allow packing constructor fieldsMichal Terepeta2017-10-294-36/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is another step for fixing #13825 and is based on D38 by Simon Marlow. The change allows storing multiple constructor fields within the same word. This currently applies only to `Float`s, e.g., ``` data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float ``` on 64-bit arch, will now store both fields within the same constructor word. For `WordX/IntX` we'll need to introduce new primop types. Main changes: - We now use sizes in bytes when we compute the offsets for constructor fields in `StgCmmLayout` and introduce padding if necessary (word-sized fields are still word-aligned) - `ByteCodeGen` had to be updated to correctly construct the data types. This required some new bytecode instructions to allow pushing things that are not full words onto the stack (and updating `Interpreter.c`). Note that we only use the packed stuff when constructing data types (i.e., for `PACK`), in all other cases the behavior should not change. - `RtClosureInspect` was changed to handle the new layout when extracting subterms. This seems to be used by things like `:print`. I've also added a test for this. - I deviated slightly from Simon's approach and use `PrimRep` instead of `ArgRep` for computing the size of fields. This seemed more natural and in the future we'll probably want to introduce new primitive types (e.g., `Int8#`) and `PrimRep` seems like a better place to do that (where we already have `Int64Rep` for example). `ArgRep` on the other hand seems to be more focused on calling functions. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: maoe, rwbarton, thomie GHC Trac Issues: #13825 Differential Revision: https://phabricator.haskell.org/D3809
* Turn `compareByteArrays#` out-of-line primop into inline primopalexbiehl2017-10-291-1/+40
| | | | | | | | | | | | Depends on D4090 Reviewers: austin, bgamari, erikd, simonmar, alexbiehl Reviewed By: bgamari Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4091
* Make tagForCon non-linearBartosz Nitka2017-10-271-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Computing the number of constructors for TyCon is linear in the number of constructors. That's wasteful if all you want to check is if that number is smaller than what fits in tag bits (usually 8 things). What this change does is to use a function that can determine the ineqaulity without computing the size. This improves compile time on a module with a data type that has 10k constructors. The variance in total time is (suspiciously) high, but going by the best of 3 the numbers are 8.186s vs 7.511s. For 1000 constructors the difference isn't noticeable: 0.646s vs 0.624s. The hot spots were cgDataCon and cgEnumerationTyCon where tagForCon is called in a loop. One alternative would be to pass down the size. Test Plan: harbormaster Reviewers: bgamari, simonmar, austin Reviewed By: simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4116
* Simplify, no functionality changeGabor Greif2017-10-171-7/+6
|
* StgCmmMonad: Remove unnecessary use of unboxed tuplesBen Gamari2017-09-262-55/+24
| | | | | | | | | | | | | | | The simplifier can simplify this without any trouble. Moreover, the unboxed tuples cause bootstrapping issues due #14123. I also went ahead and inlined a few definitions into the Monad instance. Test Plan: Validate Reviewers: austin, simonmar Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4026
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-1927-13/+53
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Fix typos in diagnostics, testsuite and commentsGabor Greif2017-09-071-1/+1
|
* Add support for producing position-independent executablesBen Gamari2017-08-221-2/+2
| | | | | | | | | | | | | | | | Previously due to #12759 we disabled PIE support entirely. However, this breaks the user's ability to produce PIEs. Add an explicit flag, -fPIE, allowing the user to build PIEs. Test Plan: Validate Reviewers: rwbarton, austin, simonmar Subscribers: trommler, simonmar, trofi, jrtc27, thomie GHC Trac Issues: #12759, #13702 Differential Revision: https://phabricator.haskell.org/D3589
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-233-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* Use lengthIs and friends in more placesRyan Scott2017-06-025-7/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | While investigating #12545, I discovered several places in the code that performed length-checks like so: ``` length ts == 4 ``` This is not ideal, since the length of `ts` could be much longer than 4, and we'd be doing way more work than necessary! There are already a slew of helper functions in `Util` such as `lengthIs` that are designed to do this efficiently, so I found every place where they ought to be used and did just that. I also defined a couple more utility functions for list length that were common patterns (e.g., `ltLength`). Test Plan: ./validate Reviewers: austin, hvr, goldfire, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: goldfire, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3622
* PPC NCG: Lower MO_*_Fabs as PowerPC fabs instructionPeter Trommler2017-05-011-2/+4
| | | | | | | | | | | | | | | | In Phab:D3265 we introduced MO_F32_Fabs and MO_F64_Fabs. This patch improves code generation by generating PowerPC fabs instructions. Test Plan: run numeric/should_run/numrun015 or validate Reviewers: austin, bgamari, hvr, simonmar, erikd Reviewed By: erikd Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3512