summaryrefslogtreecommitdiff
path: root/compiler/codeGen/StgCmmHeap.hs
Commit message (Collapse)AuthorAgeFilesLines
* Module hierarchy: StgToCmm (#13009)Sylvain Henry2019-09-101-680/+0
| | | | | | Add StgToCmm module hierarchy. Platform modules that are used in several other places (NCG, LLVM codegen, Cmm transformations) are put into GHC.Platform.
* Update Trac ticket URLs to point to GitLabRyan Scott2019-03-151-1/+1
| | | | | This moves all URL references to Trac tickets to their corresponding GitLab counterparts.
* Fix some broken links (#15733)Fangyi Zhou2018-10-251-1/+1
| | | | | | | | | | | | | | | | Change some URLs from hackage.haskell.org/trac to ghc.haskell.org/trac Test Plan: manually verify links work Reviewers: bgamari, simonmar, mpickering Reviewed By: bgamari, mpickering Subscribers: mpickering, rwbarton, carter GHC Trac Issues: #15733 Differential Revision: https://phabricator.haskell.org/D5257
* Get rid of more CPP in cmm/ and codeGen/Michal Terepeta2018-03-191-4/+0
| | | | | | | | | | | | | | | | | | This removes a bunch of unnecessary includes of `HsVersions.h` along with unnecessary CPP (e.g., due to checking for DEBUG which can be achieved by looking at `debugIsOn`) Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4462
* Tidy up and consolidate canned CmmReg and CmmGlobalsSimon Marlow2018-02-181-5/+4
| | | | | | | | | | | | Test Plan: validate Reviewers: bgamari, erikd Reviewed By: bgamari Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4380
* Typos in commentsGabor Greif2018-01-171-1/+1
|
* Allow packing constructor fieldsMichal Terepeta2017-10-291-14/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is another step for fixing #13825 and is based on D38 by Simon Marlow. The change allows storing multiple constructor fields within the same word. This currently applies only to `Float`s, e.g., ``` data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float ``` on 64-bit arch, will now store both fields within the same constructor word. For `WordX/IntX` we'll need to introduce new primop types. Main changes: - We now use sizes in bytes when we compute the offsets for constructor fields in `StgCmmLayout` and introduce padding if necessary (word-sized fields are still word-aligned) - `ByteCodeGen` had to be updated to correctly construct the data types. This required some new bytecode instructions to allow pushing things that are not full words onto the stack (and updating `Interpreter.c`). Note that we only use the packed stuff when constructing data types (i.e., for `PACK`), in all other cases the behavior should not change. - `RtClosureInspect` was changed to handle the new layout when extracting subterms. This seems to be used by things like `:print`. I've also added a test for this. - I deviated slightly from Simon's approach and use `PrimRep` instead of `ArgRep` for computing the size of fields. This seemed more natural and in the future we'll probably want to introduce new primitive types (e.g., `Int8#`) and `PrimRep` seems like a better place to do that (where we already have `Int64Rep` for example). `ArgRep` on the other hand seems to be more focused on calling functions. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: maoe, rwbarton, thomie GHC Trac Issues: #13825 Differential Revision: https://phabricator.haskell.org/D3809
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-191-2/+2
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* Fix typos in diagnostics, testsuite and commentsGabor Greif2017-09-071-1/+1
|
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* Use newBlockId instead of newLabelCBen Gamari2016-11-291-6/+7
| | | | | | | | | | | | | | | This seems like a clearer name and the fewer functions that one needs to remember, the better. Test Plan: validate Reviewers: austin, simonmar, michalt Reviewed By: simonmar, michalt Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2735
* LLVM generate llvm.expect for conditional branchesAlex Biehl2016-11-171-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds likeliness annotations to heap and and stack checks and modifies the llvm codegen to recognize those to help it generate better code. So with this patch ``` ... if ((Sp + 8) - 24 < SpLim) (likely: False) goto c23c; else goto c23d; ... ``` roughly generates: ``` %ln23k = icmp ult i64 %ln23j, %SpLim_Arg %ln23m = call ccc i1 (i1, i1) @llvm.expect.i1( i1 %ln23k, i1 0 ) br i1 %ln23m, label %c23c, label %c23d ``` Note the call to `llvm.expect` which denotes the expected result for the comparison. Test Plan: Look at assembler code with and without this patch. If the heap-checks moved out of the way we are happy. Reviewers: austin, simonmar, bgamari Reviewed By: bgamari Subscribers: michalt, thomie Differential Revision: https://phabricator.haskell.org/D2688 GHC Trac Issues: #8321
* Remove StgRubbishArg and CmmArgÖmer Sinan Ağacan2016-08-101-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The idea behind adding special "rubbish" arguments was in unboxed sum types depending on the tag some arguments are not used and we don't want to move some special values (like 0 for literals and some special pointer for boxed slots) for those arguments (to stack locations or registers). "StgRubbishArg" was an indicator to the code generator that the value won't be used. During Stg-to-Cmm we were then not generating any move or store instructions at all. This caused problems in the register allocator because some variables were only initialized in some code paths. As an example, suppose we have this STG: (after unarise) Lib.$WT = \r [dt_sit] case case dt_sit of { Lib.F dt_siv [Occ=Once] -> (#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#]; Lib.I dt_siw [Occ=Once] -> (#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw]; } of dt_six { (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE]; }; This basically unpacks a sum type to an unboxed sum with 3 fields, and then moves the unboxed sum to a constructor (`Lib.T`). This is the Cmm for the inner case expression (case expression in the scrutinee position of the outer case): ciN: ... -- look at dt_sit's tag if (_ciT::P64 != 1) goto ciS; else goto ciR; ciS: -- Tag is 2, i.e. Lib.F _siw::I64 = I64[_siu::P64 + 6]; _giE::I64 = _siw::I64; _giD::P64 = stg_RUBBISH_ENTRY_info; _giC::I64 = 2; goto ciU; ciR: -- Tag is 1, i.e. Lib.I _siv::P64 = P64[_siu::P64 + 7]; _giD::P64 = _siv::P64; _giC::I64 = 1; goto ciU; Here one of the blocks `ciS` and `ciR` is executed and then the execution continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch `_giE` is not initialized, because it's "rubbish" in the STG and so we don't generate an assignment during code generator. The code generator then panics during the register allocations: ghc-stage1: panic! (the 'impossible' happened) (GHC version 8.1.20160722 for x86_64-unknown-linux): LocalReg's live-in to graph ciY {_giE::I64} (`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a pointer slot, we have to initialize it otherwise garbage collector follows the pointer to some random place. So we only remove assignment if the "rubbish" arg has unboxed type.) This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize rubbish slots. If the slot is for boxed types we use the existing `absentError`, otherwise we initialize the slot with literal 0. Reviewers: simonpj, erikd, austin, simonmar, bgamari Reviewed By: erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2446
* Implement unboxed sum primitive typeÖmer Sinan Ağacan2016-07-211-10/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: This patch implements primitive unboxed sum types, as described in https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes. Main changes are: - Add new syntax for unboxed sums types, terms and patterns. Hidden behind `-XUnboxedSums`. - Add unlifted unboxed sum type constructors and data constructors, extend type and pattern checkers and desugarer. - Add new RuntimeRep for unboxed sums. - Extend unarise pass to translate unboxed sums to unboxed tuples right before code generation. - Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better code generation when sum values are involved. - Add user manual section for unboxed sums. Some other changes: - Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to `MultiValAlt` to be able to use those with both sums and tuples. - Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really wrong, given an `Any` `TyCon`, there's no way to tell what its kind is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`. - Fix some bugs on the way: #12375. Not included in this patch: - Update Haddock for new the new unboxed sum syntax. - `TemplateHaskell` support is left as future work. For reviewers: - Front-end code is mostly trivial and adapted from unboxed tuple code for type checking, pattern checking, renaming, desugaring etc. - Main translation routines are in `RepType` and `UnariseStg`. Documentation in `UnariseStg` should be enough for understanding what's going on. Credits: - Johan Tibell wrote the initial front-end and interface file extensions. - Simon Peyton Jones reviewed this patch many times, wrote some code, and helped with debugging. Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin, simonmar, hvr, erikd Reviewed By: simonpj Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D2259
* Drop pre-AMP compatibility CPP conditionalsHerbert Valerio Riedel2015-12-311-2/+0
| | | | | | | | | | | | Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP compatibility CPP-mess for good! Reviewers: austin, goldfire, bgamari Subscribers: goldfire, thomie, erikd Differential Revision: https://phabricator.haskell.org/D1724
* Rename package key to unit ID, and installed package ID to component ID.Edward Z. Yang2015-10-141-1/+1
| | | | | | Comes with Haddock submodule update. Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
* StgCmmHeap: Re-add check for large static allocationsBen Gamari2015-08-291-0/+9
| | | | | | | This should at least help alleviate the annoyance of #4505. This reintroduces a compile-time check originally added in a278f3f02d09bc32b0a75d4a04d710090cde250f but dropped with the new code generator.
* Eliminate zero_static_objects_list()Simon Marlow2015-07-281-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Summary: [Revised version of D1076 that was committed and then backed out] In a workload with a large amount of code, zero_static_objects_list() takes a significant amount of time, and furthermore it is in the single-threaded part of the GC. This patch uses a slightly fiddly scheme for marking objects on the static object lists, using a flag in the low 2 bits that flips between two states to indicate whether an object has been visited during this GC or not. We also have to take into account objects that have not been visited yet, which might appear at any time due to runtime linking. Test Plan: validate Reviewers: austin, ezyang, rwbarton, bgamari, thomie Reviewed By: bgamari, thomie Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1106
* Revert "Eliminate zero_static_objects_list()"Simon Marlow2015-07-271-3/+2
| | | | This reverts commit b949c96b4960168a3b399fe14485b24a2167b982.
* Eliminate zero_static_objects_list()Simon Marlow2015-07-221-2/+3
| | | | | | | | | | | | | | | | | | | | | Summary: In a workload with a large amount of code, zero_static_objects_list() takes a significant amount of time, and furthermore it is in the single-threaded part of the GC. This patch uses a slightly fiddly scheme for marking objects on the static object lists, using a flag in the low 2 bits that flips between two states to indicate whether an object has been visited during this GC or not. We also have to take into account objects that have not been visited yet, which might appear at any time due to runtime linking. Test Plan: validate Reviewers: austin, bgamari, ezyang, rwbarton Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1076
* Tick scopesPeter Wortmann2014-12-161-3/+5
| | | | | | | | | | | | | | | | | | | | | | This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
* Make Applicative a superclass of MonadAustin Seipp2014-09-091-0/+4
| | | | | | | | | | | | | | | | | | | | | Summary: This includes pretty much all the changes needed to make `Applicative` a superclass of `Monad` finally. There's mostly reshuffling in the interests of avoid orphans and boot files, but luckily we can resolve all of them, pretty much. The only catch was that Alternative/MonadPlus also had to go into Prelude to avoid this. As a result, we must update the hsc2hs and haddock submodules. Signed-off-by: Austin Seipp <austin@well-typed.com> Test Plan: Build things, they might not explode horribly. Reviewers: hvr, simonmar Subscribers: simonmar Differential Revision: https://phabricator.haskell.org/D13
* Rename PackageId to PackageKey, distinguishing it from Cabal's PackageId.Edward Z. Yang2014-07-211-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Previously, both Cabal and GHC defined the type PackageId, and we expected them to be roughly equivalent (but represented differently). This refactoring separates these two notions. A package ID is a user-visible identifier; it's the thing you write in a Cabal file, e.g. containers-0.9. The components of this ID are semantically meaningful, and decompose into a package name and a package vrsion. A package key is an opaque identifier used by GHC to generate linking symbols. Presently, it just consists of a package name and a package version, but pursuant to #9265 we are planning to extend it to record other information. Within a single executable, it uniquely identifies a package. It is *not* an InstalledPackageId, as the choice of a package key affects the ABI of a package (whereas an InstalledPackageId is computed after compilation.) Cabal computes a package key for the package and passes it to GHC using -package-name (now *extremely* misnamed). As an added bonus, we don't have to worry about shadowing anymore. As a follow on, we should introduce -current-package-key having the same role as -package-name, and deprecate the old flag. This commit is just renaming. The haddock submodule needed to be updated. Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu> Test Plan: validate Reviewers: simonpj, simonmar, hvr, austin Subscribers: simonmar, relrod, carter Differential Revision: https://phabricator.haskell.org/D79 Conflicts: compiler/main/HscTypes.lhs compiler/main/Packages.lhs utils/haddock
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-0/+2
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Comments on virtHp, realHp (Trac #8864)Simon Peyton Jones2014-03-131-1/+1
| | | | | | | Documentation in response to Johan's questions Plus, don't export hpRel from StgCmmHeap, StgCmmLayout (it is only used locally in StgCmmLayout)
* Fix incorrect loop condition in inline array allocationJohan Tibell2014-03-111-2/+3
| | | | | Also make sure allocHeapClosure updates profiling counters with the memory allocated.
* Refactor inline array allocationSimon Marlow2014-03-111-38/+46
| | | | | | | | | | - Move array representation knowledge into SMRep - Separate out low-level heap-object allocation so that we can reuse it from doNewArrayOp - remove card-table initialisation, we can safely ignore the card table for newly allocated arrays.
* Represent offsets into heap objects with byte, not word, offsetsSimon Marlow2014-03-111-5/+4
| | | | | I'd like to be able to pack together non-pointer fields that are less than a word in size, and this is a necessary prerequisite.
* Loopification jump between stack and heap checksJan Stolarek2014-02-011-4/+37
| | | | | | | | | | | | Fixes #8585 When emmiting label of a self-recursive tail call (ie. when performing loopification optimization) we emit the loop header label after a stack check but before the heap check. The reason is that tail-recursive functions use constant amount of stack space so we don't need to repeat the check in every loop. But they can grow the heap so heap check must be repeated in every call. See Note [Self-recursive tail calls] and [Self-recursive loop header].
* More comments about stack layoutSimon Peyton Jones2013-10-181-9/+27
|
* Comments (about the stack overflow check) onlySimon Peyton Jones2013-10-181-15/+23
|
* Generate (old + 0) instead of Sp in stack checksJan Stolarek2013-10-161-2/+25
| | | | | | | | | | | | | | | | | | | | When compiling a function we can determine how much stack space it will use. We therefore need to perform only a single stack check at the beginning of a function to see if we have enough stack space. Instead of referring directly to Sp - as we used to do in the past - the code generator uses (old + 0) in the stack check. Stack layout phase turns (old + 0) into Sp. The idea here is that, while we need to perform only one stack check for each function, we could in theory place more stack checks later in the function. They would be redundant, but not incorrect (in a sense that they should not change program behaviour). We need to make sure however that a stack check inserted after incrementing the stack pointer checks for a respectively smaller stack space. This would not be the case if the code generator produced direct references to Sp. By referencing (old + 0) we make sure that we always check for a correct amount of stack: when converting (old + 0) to Sp the stack layout phase takes into account changes already made to stack pointer. The idea for this change came from observations made while debugging #8275.
* Explicit import lists for StgCmmProf.Edward Z. Yang2013-09-011-1/+1
| | | | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Remove unused moduleJan Stolarek2013-08-201-10/+2
| | | | | | This commit removes module StgCmmGran which has only no-op functions. According to comments in the module, it was used by GpH, but GpH project seems to be dead for a couple of years now.
* Trailing whitespaces, code formatting, detabifyJan Stolarek2013-08-201-1/+1
| | | | | A major cleanup of trailing whitespaces and tabs in codeGen/ directory. I also adjusted code formatting in some places.
* extended ticky to also track "let"s that are not conventional closuresNicolas Frisby2013-05-021-6/+8
| | | | | | | This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag. (This is 024df664b600a with a small bug fix.)
* Revert "extended ticky to also track "let"s that are not closures"Nicolas Frisby2013-04-121-8/+6
| | | | | | This reverts commit 024df664b600a622cb8189ccf31789688505fc1c. Of course I gaff on my last day...
* extended ticky to also track "let"s that are not closuresNicolas Frisby2013-04-121-6/+8
| | | | | This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.
* added ticky counters for heap and stack checksNicolas Frisby2013-04-111-1/+2
|
* ticky enhancementsNicolas Frisby2013-03-291-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes * updated the documentation in the wiki and change the user guide to point there * moved the allocation enters for ticky and CCS to after the heap check * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is * standardized all (active?) ticky alloc totals to bytes * in order to avoid double counting StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr * I resurrected the SLOW_CALL counters * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause) * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL * added ALLOC_RTS_ctr and _tot ticky counters * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM * added -ticky and -DTICKY_TICKY in ways.mk for debug ways * added a ticky counter for total LNE entries * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE * all off by default * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure * -ticky-dyn-thunk tracks dynamic thunks as if they were functions * -ticky-LNE tracks LNEs as if they were functions * updated the ticky report format, including making the argument categories (more?) accurate again * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
* Remove stale, commented-out code about heap checksSimon Peyton Jones2013-03-091-83/+0
|
* Code-size optimisation for top-level indirections (#7308)Simon Marlow2012-11-191-4/+2
| | | | | | | | | | | | | | | Top-level indirections are often generated when there is a cast, e.g. foo :: T foo = bar `cast` (some coercion) For these we were generating a full-blown CAF, which is a fair chunk of code. This patch makes these indirections generate a single IND_STATIC closure (4 words) instead. This is exactly what the CAF would evaluate to eventually anyway, we're just shortcutting the whole process.
* Fix the Slow calling convention (#7192)Simon Marlow2012-11-131-8/+4
| | | | | | | | The Slow calling convention passes the closure in R1, but we were ignoring this and hoping it would work, which it often did. However, this bug seems to have been the cause of #7192, because the graph-colouring allocator is more sensitive to having correct liveness information on jumps.
* Attach global register liveness info to Cmm procedures.Geoffrey Mainland2012-10-301-1/+1
| | | | | | | All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are use to pass arguments are included in this list.
* Some alpha renamingIan Lynagh2012-10-161-1/+1
| | | | | Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names
* Produce new-style Cmm from the Cmm parserSimon Marlow2012-10-081-77/+84
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Partially fix #367 by adding HpLim checks to entry with -fno-omit-yields.Edward Z. Yang2012-09-261-20/+37
| | | | | | | | | | | | | | | | | | | | | The current fix is relatively dumb as far as where to add HpLim checks: it will always perform a check unless we know that we're returning from a closure or we are doing a non let-no-escape case analysis. The performance impact on the nofib suite looks like this: Min +5.7% -0.0% -6.5% -6.4% -50.0% Max +6.3% +5.8% +5.0% +5.5% +0.8% Geometric Mean +6.2% +0.1% +0.5% +0.5% -0.8% Overall, the executable bloat is the biggest problem, so we keep the old omit-yields optimization on by default. Remember that if you need an interruptibility guarantee, you need to recompile all of your libraries with -fno-omit-yields. A better fix would involve only inserting the yields necessary to break loops; this is left as future work. Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Move wORD_SIZE into platformConstantsIan Lynagh2012-09-161-3/+2
|
* Pass DynFlags down to wordWidthIan Lynagh2012-09-121-40/+39
|
* Pass DynFlags down to bWordIan Lynagh2012-09-121-13/+16
| | | | | | I've switched to passing DynFlags rather than Platform, as (a) it's simpler to not have to extract targetPlatform in so many places, and (b) it may be useful to have DynFlags around in future.