summaryrefslogtreecommitdiff
path: root/compiler/codeGen/StgCmmBind.hs
Commit message (Collapse)AuthorAgeFilesLines
* Module hierarchy: StgToCmm (#13009)Sylvain Henry2019-09-101-753/+0
| | | | | | Add StgToCmm module hierarchy. Platform modules that are used in several other places (NCG, LLVM codegen, Cmm transformations) are put into GHC.Platform.
* Dont gather ticks when only striping them in STG.Andreas Klebinger2019-07-041-1/+1
| | | | | | | | | Adds stripStgTicksTopE which only returns the stripped expression. So far we also allocated a list for the stripped ticks which was never used. Allocation difference is as expected very small but present. About 0.02% difference when compiling with -O.
* Correct closure observation, construction, and mutation on weak memory machines.Travis Whitaker2019-06-281-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here the following changes are introduced: - A read barrier machine op is added to Cmm. - The order in which a closure's fields are read and written is changed. - Memory barriers are added to RTS code to ensure correctness on out-or-order machines with weak memory ordering. Cmm has a new CallishMachOp called MO_ReadBarrier. On weak memory machines, this is lowered to an instruction that ensures memory reads that occur after said instruction in program order are not performed before reads coming before said instruction in program order. On machines with strong memory ordering properties (e.g. X86, SPARC in TSO mode) no such instruction is necessary, so MO_ReadBarrier is simply erased. However, such an instruction is necessary on weakly ordered machines, e.g. ARM and PowerPC. Weam memory ordering has consequences for how closures are observed and mutated. For example, consider a closure that needs to be updated to an indirection. In order for the indirection to be safe for concurrent observers to enter, said observers must read the indirection's info table before they read the indirectee. Furthermore, the entering observer makes assumptions about the closure based on its info table contents, e.g. an INFO_TYPE of IND imples the closure has an indirectee pointer that is safe to follow. When a closure is updated with an indirection, both its info table and its indirectee must be written. With weak memory ordering, these two writes can be arbitrarily reordered, and perhaps even interleaved with other threads' reads and writes (in the absence of memory barrier instructions). Consider this example of a bad reordering: - An updater writes to a closure's info table (INFO_TYPE is now IND). - A concurrent observer branches upon reading the closure's INFO_TYPE as IND. - A concurrent observer reads the closure's indirectee and enters it. (!!!) - An updater writes the closure's indirectee. Here the update to the indirectee comes too late and the concurrent observer has jumped off into the abyss. Speculative execution can also cause us issues, consider: - An observer is about to case on a value in closure's info table. - The observer speculatively reads one or more of closure's fields. - An updater writes to closure's info table. - The observer takes a branch based on the new info table value, but with the old closure fields! - The updater writes to the closure's other fields, but its too late. Because of these effects, reads and writes to a closure's info table must be ordered carefully with respect to reads and writes to the closure's other fields, and memory barriers must be placed to ensure that reads and writes occur in program order. Specifically, updates to a closure must follow the following pattern: - Update the closure's (non-info table) fields. - Write barrier. - Update the closure's info table. Observing a closure's fields must follow the following pattern: - Read the closure's info pointer. - Read barrier. - Read the closure's (non-info table) fields. This patch updates RTS code to obey this pattern. This should fix long-standing SMP bugs on ARM (specifically newer aarch64 microarchitectures supporting out-of-order execution) and PowerPC. This fixes issue #15449. Co-Authored-By: Ben Gamari <ben@well-typed.com>
* Simplify link_caf and mkForeignLabel functionsÖmer Sinan Ağacan2019-06-251-3/+2
|
* asm-emit-time IND_STATIC eliminationGabor Greif2019-04-151-1/+3
| | | | | | | | | | | | When a new closure identifier is being established to a local or exported closure already emitted into the same module, refrain from adding an IND_STATIC closure, and instead emit an assembly-language alias. Inter-module IND_STATIC objects still remain, and need to be addressed by other measures. Binary-size savings on nofib are around 0.1%.
* Don't use a generic apply thunk for known callsSebastian Graf2018-12-061-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, an AP thunk like `sat = f a b c` will not have its own entry point and info pointer and will instead reuse a generic apply thunk like `stg_ap_4_upd`. That's great from a code size perspective, but if `f` is a known function, a specialised entry point with a plain call can be much faster than figuring out the arity and doing dynamic dispatch. This looks at `f`s arity to figure out if it is a known function and if so, it will not lower it to a generic apply function. Benchmark results are encouraging: No changes to allocation, but 0.2% less counted instructions. Test Plan: Validates locally Reviewers: simonmar, osa1, simonpj, bgamari Reviewed By: simonpj Subscribers: rwbarton, carter GHC Trac Issues: #16005 Differential Revision: https://phabricator.haskell.org/D5414
* Implement late lambda liftSebastian Graf2018-11-231-5/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This implements a selective lambda-lifting pass late in the STG pipeline. Lambda lifting has the effect of avoiding closure allocation at the cost of having to make former free vars available at call sites, possibly enlarging closures surrounding call sites in turn. We identify beneficial cases by means of an analysis that estimates closure growth. There's a Wiki page at https://ghc.haskell.org/trac/ghc/wiki/LateLamLift. Reviewers: simonpj, bgamari, simonmar Reviewed By: simonpj Subscribers: rwbarton, carter GHC Trac Issues: #9476 Differential Revision: https://phabricator.haskell.org/D5224
* Don't track free variables in STG syntax by defaultSebastian Graf2018-11-191-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: Currently, `CoreToStg` annotates `StgRhsClosure`s with their set of non-global free variables. This free variable information is only needed in the final code generation step (i.e. `StgCmm.codeGen`), which leads to transformations such as `StgCse` and `StgUnarise` having to maintain this information. This is tiresome and unnecessary, so this patch introduces a trees-to-grow-like approach that only introduces the free variable set into the syntax tree in the code gen pass, along with a free variable analysis on STG terms to generate that information. Fixes #15754. Reviewers: simonpj, osa1, bgamari, simonmar Reviewed By: osa1 Subscribers: rwbarton, carter GHC Trac Issues: #15754 Differential Revision: https://phabricator.haskell.org/D5324
* Remove StgBinderInfo and related computation in CoreToStgÖmer Sinan Ağacan2018-11-121-8/+7
| | | | | | | | | | | | | | | | | | | | | | - The StgBinderInfo type was never used in the code gen, so the type, related computation in CoreToStg, and some comments about it are removed. See #15770 for more details. - Simplified CoreToStg after removing the StgBinderInfo computation: removed StgBinderInfo arguments and mfix stuff. The StgBinderInfo values were not used in the code gen, but I still run nofib just to make sure: 0.0% change in allocations and binary sizes. Test Plan: Validated locally Reviewers: simonpj, simonmar, bgamari, sgraf Reviewed By: sgraf Subscribers: AndreasK, sgraf, rwbarton, carter Differential Revision: https://phabricator.haskell.org/D5232
* Merge FUN_STATIC closure with its SRTSimon Marlow2018-05-161-12/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | Summary: The idea here is to save a little code size and some work in the GC, by collapsing FUN_STATIC closures and their SRTs. This is (4) in a series; see D4632 for more details. There's a tradeoff here: more complexity in the compiler in exchange for a modest code size reduction (probably around 0.5%). Results: * GHC binary itself (statically linked) is 1% smaller * -0.2% binary sizes in nofib (-0.5% module sizes) Full nofib results comparing D4634 with this: P177 (ignore runtimes, these aren't stable on my laptop) Test Plan: validate, nofib Reviewers: bgamari, niteria, simonpj, erikd Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4637
* Get rid of more CPP in cmm/ and codeGen/Michal Terepeta2018-03-191-4/+0
| | | | | | | | | | | | | | | | | | This removes a bunch of unnecessary includes of `HsVersions.h` along with unnecessary CPP (e.g., due to checking for DEBUG which can be achieved by looking at `debugIsOn`) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4462
* Fix interpreter with profilingSimon Marlow2018-03-061-6/+8
| | | | | | | | | | | | | | | | | | | This was broken by D3746 and/or D3809, but unfortunately we didn't notice because CI at the time wasn't building the profiling way. Test Plan: ``` cd testsuite/test/profiling/should_run make WAY=ghci-ext-prof ``` Reviewers: bgamari, michalt, hvr, erikd Subscribers: rwbarton, thomie, carter GHC Trac Issues: #14705 Differential Revision: https://phabricator.haskell.org/D4437
* Tidy up and consolidate canned CmmReg and CmmGlobalsSimon Marlow2018-02-181-6/+5
| | | | | | | | | | | | Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4380
* Allow packing constructor fieldsMichal Terepeta2017-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is another step for fixing #13825 and is based on D38 by Simon Marlow. The change allows storing multiple constructor fields within the same word. This currently applies only to `Float`s, e.g., ``` data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float ``` on 64-bit arch, will now store both fields within the same constructor word. For `WordX/IntX` we'll need to introduce new primop types. Main changes: - We now use sizes in bytes when we compute the offsets for constructor fields in `StgCmmLayout` and introduce padding if necessary (word-sized fields are still word-aligned) - `ByteCodeGen` had to be updated to correctly construct the data types. This required some new bytecode instructions to allow pushing things that are not full words onto the stack (and updating `Interpreter.c`). Note that we only use the packed stuff when constructing data types (i.e., for `PACK`), in all other cases the behavior should not change. - `RtClosureInspect` was changed to handle the new layout when extracting subterms. This seems to be used by things like `:print`. I've also added a test for this. - I deviated slightly from Simon's approach and use `PrimRep` instead of `ArgRep` for computing the size of fields. This seemed more natural and in the future we'll probably want to introduce new primitive types (e.g., `Int8#`) and `PrimRep` seems like a better place to do that (where we already have `Int64Rep` for example). `ArgRep` on the other hand seems to be more focused on calling functions. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: maoe, rwbarton, thomie GHC Trac Issues: #13825 Differential Revision: https://phabricator.haskell.org/D3809
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-191-2/+2
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Use newBlockId instead of newLabelCBen Gamari2016-11-291-1/+2
| | | | | | | | | | | | | | | This seems like a clearer name and the fewer functions that one needs to remember, the better. Test Plan: validate Reviewers: austin, simonmar, michalt Reviewed By: simonmar, michalt Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2735
* Codegen for case: Remove redundant void id checksÖmer Sinan Ağacan2016-09-201-8/+12
| | | | | | | | | | | | | | | | New unarise (714bebf) eliminates void binders in patterns already, so no need to eliminate them here. I leave assertions to make sure this is the case. Assertion failure -> bug in unarise Reviewers: bgamari, simonpj, austin, simonmar, hvr Reviewed By: simonpj Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2416
* Remove StgRubbishArg and CmmArgÖmer Sinan Ağacan2016-08-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea behind adding special "rubbish" arguments was in unboxed sum types depending on the tag some arguments are not used and we don't want to move some special values (like 0 for literals and some special pointer for boxed slots) for those arguments (to stack locations or registers). "StgRubbishArg" was an indicator to the code generator that the value won't be used. During Stg-to-Cmm we were then not generating any move or store instructions at all. This caused problems in the register allocator because some variables were only initialized in some code paths. As an example, suppose we have this STG: (after unarise) Lib.$WT = \r [dt_sit] case case dt_sit of { Lib.F dt_siv [Occ=Once] -> (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#]; Lib.I dt_siw [Occ=Once] -> (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw]; } of dt_six { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE]; }; This basically unpacks a sum type to an unboxed sum with 3 fields, and then moves the unboxed sum to a constructor (`Lib.T`). This is the Cmm for the inner case expression (case expression in the scrutinee position of the outer case): ciN: ... -- look at dt_sit's tag if (_ciT::P64 != 1) goto ciS; else goto ciR; ciS: -- Tag is 2, i.e. Lib.F _siw::I64 = I64[_siu::P64 + 6]; _giE::I64 = _siw::I64; _giD::P64 = stg_RUBBISH_ENTRY_info; _giC::I64 = 2; goto ciU; ciR: -- Tag is 1, i.e. Lib.I _siv::P64 = P64[_siu::P64 + 7]; _giD::P64 = _siv::P64; _giC::I64 = 1; goto ciU; Here one of the blocks `ciS` and `ciR` is executed and then the execution continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch `_giE` is not initialized, because it's "rubbish" in the STG and so we don't generate an assignment during code generator. The code generator then panics during the register allocations: ghc-stage1: panic! (the 'impossible' happened) (GHC version 8.1.20160722 for x86_64-unknown-linux): LocalReg's live-in to graph ciY {_giE::I64} (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a pointer slot, we have to initialize it otherwise garbage collector follows the pointer to some random place. So we only remove assignment if the "rubbish" arg has unboxed type.) This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize rubbish slots. If the slot is for boxed types we use the existing `absentError`, otherwise we initialize the slot with literal 0. Reviewers: simonpj, erikd, austin, simonmar, bgamari Reviewed By: erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2446
* StgCmmBind: Some minor simplificationsÖmer Sinan Ağacan2016-07-221-8/+2
|
* Comments re ApThunks + small refactor in mkRhsClosureSimon Peyton Jones2016-07-211-15/+21
|
* Implement unboxed sum primitive typeÖmer Sinan Ağacan2016-07-211-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements primitive unboxed sum types, as described in https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes. Main changes are: - Add new syntax for unboxed sums types, terms and patterns. Hidden behind `-XUnboxedSums`. - Add unlifted unboxed sum type constructors and data constructors, extend type and pattern checkers and desugarer. - Add new RuntimeRep for unboxed sums. - Extend unarise pass to translate unboxed sums to unboxed tuples right before code generation. - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better code generation when sum values are involved. - Add user manual section for unboxed sums. Some other changes: - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to `MultiValAlt` to be able to use those with both sums and tuples. - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really wrong, given an `Any` `TyCon`, there's no way to tell what its kind is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`. - Fix some bugs on the way: #12375. Not included in this patch: - Update Haddock for new the new unboxed sum syntax. - `TemplateHaskell` support is left as future work. For reviewers: - Front-end code is mostly trivial and adapted from unboxed tuple code for type checking, pattern checking, renaming, desugaring etc. - Main translation routines are in `RepType` and `UnariseStg`. Documentation in `UnariseStg` should be enough for understanding what's going on. Credits: - Johan Tibell wrote the initial front-end and interface file extensions. - Simon Peyton Jones reviewed this patch many times, wrote some code, and helped with debugging. Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin, simonmar, hvr, erikd Reviewed By: simonpj Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D2259
* Ticky: Do not count every entry twiceJoachim Breitner2016-03-291-2/+1
| | | | | (likely introduced by 99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176, possibly due to a merge mistake).
* Be more explicit about closure types in ticky-ticky-reportJoachim Breitner2016-03-291-5/+10
| | | | | | | | | | The report now distinguishes thunks (in the variants single-entry and standard thunks), constructors and functions (possibly single-entry). Forthermore, for standard thunks (AP and selector), do not count an entry when they are allocated. It is not possible to count their entries, as their code is shared, but better count nothing than count the wrong thing.
* Revert "Various ticky-related work"Ben Gamari2016-03-241-7/+6
| | | | | This reverts commit 6c2c853b11fe25c106469da7b105e2be596c17de which was supposed to be merged as individual commits.
* Various ticky-related workJoachim Breitner2016-03-241-6/+7
| | | | | | | | | | | | | | | | | | this Diff contains small, self-contained changes as I work towards fixing #10613. It is mostly created to let harbormaster do its job, but feedback is welcome as well. Please do not merge this via arc; I’d like to push the individual patches as layed out here. I might push mostly trivial ones even without review, as long as the build passes. Reviewers: austin, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2014
* Remove "use mask" from StgAlt syntaxÖmer Sinan Ağacan2016-02-241-1/+1
| | | | | | | | | | Reviewers: austin, bgamari, simonpj Reviewed By: simonpj Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1933
* Remove unused LiveVars and SRT fields of StgCaseÖmer Sinan Ağacan2016-02-081-2/+2
| | | | | | | | | | | | | | | | | | | | | | | We also need to update `stgBindHasCafRefs` assertion with this change, as we no longer have the pre-computed SRT, LiveVars etc. We rename it to `topStgBindHasCafRefs` and implement it like this: A non-updatable top-level binding may refer to a CAF by referring to a top-level definition with CAFs. A top-level definition may have CAFs if it's updatable. At this point (because this is done after TidyPgm) top-level Ids (whether imported or defined in this module) are GlobalIds, so the top-levelness test is easy. (see also comments in the code) Reviewers: bgamari, simonpj, austin Reviewed By: simonpj Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1889 GHC Trac Issues: #11550
* Revert "Remove unused LiveVars and SRT fields of StgCase and StgLetNoEscape"Ömer Sinan Ağacan2016-02-061-2/+2
| | | | This reverts commit 4f9967aa3d1f7cfd539d0c173cafac0fe290e26f.
* Remove unused LiveVars and SRT fields of StgCase and StgLetNoEscapeÖmer Sinan Ağacan2016-02-041-2/+2
| | | | | | | | | | | | | | | | | | Also remove the functions and types that became useless after removing the fields: - SRT functions - LiveInfo type and functions - freeVarsToLiveVars - unariseLives and unariseSRT Reviewers: bgamari, simonpj, austin Reviewed By: simonpj Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1880
* Drop pre-AMP compatibility CPP conditionalsHerbert Valerio Riedel2015-12-311-2/+0
| | | | | | | | | | | | Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP compatibility CPP-mess for good! Reviewers: austin, goldfire, bgamari Subscribers: goldfire, thomie, erikd Differential Revision: https://phabricator.haskell.org/D1724
* Remove comments and flag for GranSimThomas Miedema2015-03-191-4/+1
| | | | | | | | | The GranSim code was removed in dd56e9ab and 297b05a9 in 2009, and perhaps other commits I couldn't find. Reviewed By: austin Differential Revision: https://phabricator.haskell.org/D737
* Replace .lhs with .hs in compiler commentsYuri de Wit2015-02-091-1/+1
| | | | | | | | | | | | | | Summary: It looks like during .lhs -> .hs switch the comments were not updated. So doing exactly that. Reviewers: austin, jstolarek, hvr, goldfire Reviewed By: austin, jstolarek Subscribers: thomie, goldfire Differential Revision: https://phabricator.haskell.org/D621 GHC Trac Issues: #9986
* Tick scopesPeter Wortmann2014-12-161-1/+3
| | | | | | | | | | | | | | | | | | | | | | This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
* Source notes (CorePrep and Stg support)Peter Wortmann2014-12-161-20/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is basically just about continuing maintaining source notes after the Core stage. Unfortunately, this is more involved as it might seem, as there are more restrictions on where ticks are allowed to show up. Notes: * We replace the StgTick / StgSCC constructors with a unified StgTick that can carry any tickish. * For handling constructor or lambda applications, we generally float ticks out. * Note that thanks to the NonLam placement, we know that source notes can never appear on lambdas. This means that as long as we are careful to always use mkTick, we will never violate CorePrep invariants. * This is however not automatically true for eta expansion, which needs to somewhat awkwardly strip, then re-tick the expression in question. * Where CorePrep floats out lets, we make sure to wrap them in the same spirit as FloatOut. * Detecting selector thunks becomes a bit more involved, as we can run into ticks at multiple points. (From Phabricator D169)
* Revert "Place static closures in their own section."Edward Z. Yang2014-10-201-2/+2
| | | | | | | | | | This reverts commit b23ba2a7d612c6b466521399b33fe9aacf5c4f75. Conflicts: compiler/cmm/PprCmmDecl.hs compiler/nativeGen/PPC/Ppr.hs compiler/nativeGen/SPARC/Ppr.hs compiler/nativeGen/X86/Ppr.hs
* Place static closures in their own section.Edward Z. Yang2014-10-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Summary: The primary reason for doing this is assisting debuggability: if static closures are all in the same section, they are guaranteed to be adjacent to one another. This will help later when we add some code that takes section start/end and uses this to sanity-check the sections. Part of remove HEAP_ALLOCED patch set (#8199) Signed-off-by: Edward Z. Yang <ezyang@mit.edu> Test Plan: validate Reviewers: simonmar, austin Subscribers: simonmar, ezyang, carter, thomie Differential Revision: https://phabricator.haskell.org/D263 GHC Trac Issues: #8199
* Make Applicative a superclass of MonadAustin Seipp2014-09-091-0/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This includes pretty much all the changes needed to make `Applicative` a superclass of `Monad` finally. There's mostly reshuffling in the interests of avoid orphans and boot files, but luckily we can resolve all of them, pretty much. The only catch was that Alternative/MonadPlus also had to go into Prelude to avoid this. As a result, we must update the hsc2hs and haddock submodules. Signed-off-by: Austin Seipp <austin@well-typed.com> Test Plan: Build things, they might not explode horribly. Reviewers: hvr, simonmar Subscribers: simonmar Differential Revision: https://phabricator.haskell.org/D13
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-0/+2
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Add SmallArray# and SmallMutableArray# typesJohan Tibell2014-03-291-4/+4
| | | | | | | | | | | | | | | These array types are smaller than Array# and MutableArray# and are faster when the array size is small, as they don't have the overhead of a card table. Having no card table reduces the closure size with 2 words in the typical small array case and leads to less work when updating or GC:ing the array. Reduces both the runtime and memory allocation by 8.8% on my insert benchmark for the HashMap type in the unordered-containers package, which makes use of lots of small arrays. With tuned GC settings (i.e. `+RTS -A6M`) the runtime reduction is 15%. Fixes #8923.
* Represent offsets into heap objects with byte, not word, offsetsSimon Marlow2014-03-111-6/+7
| | | | | I'd like to be able to pack together non-pointer fields that are less than a word in size, and this is a necessary prerequisite.
* Cleaned up Maybes.lhsBaldur Blöndal2014-02-131-2/+2
|
* Loopification jump between stack and heap checksJan Stolarek2014-02-011-9/+5
| | | | | | | | | | | | Fixes #8585 When emmiting label of a self-recursive tail call (ie. when performing loopification optimization) we emit the loop header label after a stack check but before the heap check. The reason is that tail-recursive functions use constant amount of stack space so we don't need to repeat the check in every loop. But they can grow the heap so heap check must be repeated in every call. See Note [Self-recursive tail calls] and [Self-recursive loop header].
* Update and deduplicate the comments on CAF management (#8590)Patrick Palka2013-12-041-31/+4
|
* Move the allocation of CAF blackholes into 'newCAF' (#8590)Patrick Palka2013-12-041-30/+10
| | | | | | | | | | We now do the allocation of the blackhole indirection closure inside the RTS procedure 'newCAF' instead of generating the allocation code inline in the closure body of each CAF. This slightly decreases code size in modules with a lot of CAFs. As a result of this change, for example, the size of DynFlags.o drops by ~60KB and HsExpr.o by ~100KB.
* Move the LDV code below the self-loop label (#8275)Patrick Palka2013-12-011-1/+1
|
* Don't explicitly refer to nodeReg in ldvEnterClosurePatrick Palka2013-12-011-2/+2
|
* Document solution to #8275Jan Stolarek2013-12-011-2/+13
|
* Fix loopification with profiling and enable it by default (#8275)Patrick Palka2013-12-011-4/+2
|
* Explicit import lists for StgCmmProf.Edward Z. Yang2013-09-011-1/+2
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Optimize self-recursive tail callsJan Stolarek2013-08-291-2/+14
| | | | | | | | | | | | | | | | | | | | | | This patch implements loopification optimization. It was described in "Low-level code optimisations in the Glasgow Haskell Compiler" by Krzysztof Woś, but we use a different approach here. Krzysztof's approach was to perform optimization as a Cmm-to-Cmm pass. Our approach is to generate properly optimized tail calls in the code generator, which saves us the trouble of processing Cmm. This idea was proposed by Simon Marlow. Implementation details are explained in Note [Self-recursive tail calls]. Performance of most nofib benchmarks is not affected. There are some benchmarks that show 5-7% improvement, with an average improvement of 2.6%. It would require some further investigation to check if this is related to benchamrking noise or does this optimization really help make some class of programs faster. As a minor cleanup, this patch renames forkProc to forkLneBody. It also moves some data declarations from StgCmmMonad to StgCmmClosure, because they are needed there and it seems that StgCmmClosure is on top of the whole StgCmm* hierarchy.