path: root/compiler/codeGen
Commit message | Author | Age | Files | Lines
...
* Fix the GHC package DLL-splitting | Ian Lynagh | 2013-05-14 | 1 | -1/+2
| There's now an internal -dll-split flag, which we use to tell GHC how the GHC package is split into 2 separate DLLs. This is used by Packages.isDllName to determine whether a call is within the same DLL, or whether it is a call to another DLL.
* extended ticky to also track "let"s that are not conventional closures | Nicolas Frisby | 2013-05-02 | 6 | -47/+71
| This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.
|
| (This is 024df664b600a with a small bug fix.)
* In CMM, only allow foreign calls to labels, not arbitrary expressions | Ian Lynagh | 2013-04-24 | 3 | -10/+8
| I'm not sure if we want to make this change permanently, but for now it fixes the unreg build.
|
| I've also removed some redundant special-case code that generated prototypes for foreign functions. The standard pprTempAndExternDecls now generates them.
* Small refactoring in StgCmmExtCode | Ian Lynagh | 2013-04-23 | 1 | -6/+7
|
* Don't duplicate decls unnecessarily in the environment | Ian Lynagh | 2013-04-23 | 1 | -1/+1
| In loopDecls, as far as I can see the globalDecls will always already be in the environment, so don't add them again.
* Make CmmParse abstract | Ian Lynagh | 2013-04-23 | 1 | -1/+1
|
* Revert "extended ticky to also track "let"s that are not closures" | Nicolas Frisby | 2013-04-12 | 6 | -69/+47
| This reverts commit 024df664b600a622cb8189ccf31789688505fc1c.
|
| Of course I gaff on my last day...
* extended ticky to also track "let"s that are not closures | Nicolas Frisby | 2013-04-12 | 6 | -47/+69
| This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.
* added ticky counters for heap and stack checks | Nicolas Frisby | 2013-04-11 | 2 | -1/+11
|
* ticky enhancements | Nicolas Frisby | 2013-03-29 | 9 | -348/+614
|   * the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
|   * updated the documentation in the wiki and changed the user guide to point there
|   * moved the allocation enters for ticky and CCS to after the heap check
|     * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
|   * standardized all (active?) ticky alloc totals to bytes
|   * in order to avoid double counting, StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
|   * I resurrected the SLOW_CALL counters
|     * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
|     * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
|   * added ALLOC_RTS_ctr and _tot ticky counters
|     * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
|   * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
|   * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
|   * added a ticky counter for total LNE entries
|   * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
|     * all off by default
|     * -ticky-allocd: tracks allocation *of* closure in addition to allocation *by* that closure
|     * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
|     * -ticky-LNE tracks LNEs as if they were functions
|   * updated the ticky report format, including making the argument categories (more?) accurate again
|   * the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
* Typo-fix for panic. | Edward Z. Yang | 2013-03-11 | 1 | -1/+1
| Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Remove unnecessary DynFlags arg to mkCgIdInfo | Simon Peyton Jones | 2013-03-09 | 1 | -4/+4
|
* Remove stale, commented-out code about heap checks | Simon Peyton Jones | 2013-03-09 | 1 | -83/+0
|
* Remove unused functions cmmConstrTag, cmmGetTag | Simon Peyton Jones | 2013-03-09 | 1 | -2/+2
| Patch offered by Boris Sukholitko <boriss@gmail.com>
| Trac #7757
* Remove cg_tag from CgIdInfo | Boris Sukholitko | 2013-03-09 | 2 | -7/+3
|
* Detabify StgCmmEnv | Boris Sukholitko | 2013-03-09 | 1 | -63/+55
|
* Detabify StgCmmMonad | Boris Sukholitko | 2013-03-09 | 1 | -175/+168
|
* Satisfy the invariant on CmmUnsafeForeignCall arguments | Simon Marlow | 2013-03-06 | 1 | -30/+23
| There was potentially a bug here, but no actual failures were identified in the wild.
|
| See Note [Register Parameter Passing]
* Primitive bitwise operations on Int# (Fixes #7689) | Jan Stolarek | 2013-02-18 | 1 | -0/+4
|
* some more typos | Gabor Greif | 2013-02-02 | 1 | -1/+1
|
* Add prefetch primops. | Geoffrey Mainland | 2013-02-01 | 1 | -0/+47
|
* Add support for passing SSE vectors in registers. | Geoffrey Mainland | 2013-02-01 | 2 | -5/+20
| This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int64X2# primitive type and associated primops. | Geoffrey Mainland | 2013-02-01 | 1 | -0/+37
|
* Add the DoubleX2# primitive type and associated primops. | Geoffrey Mainland | 2013-02-01 | 1 | -0/+36
|
* Add the Int32X4# primitive type and associated primops. | Paul Monday | 2013-02-01 | 1 | -0/+37
|
* Add the Float32X4# primitive type and associated primops. | Geoffrey Mainland | 2013-02-01 | 1 | -137/+337
| This patch lays the groundwork needed for primop support for SIMD vectors. In addition to the groundwork, we add support for the FloatX4# primitive type and associated primops.
|
|   * Add the FloatX4# primitive type and associated primops.
|   * Add CodeGen support for Float vectors.
|   * Compile vector operations to LLVM vector operations in the LLVM code generator.
|   * Make the x86 native backend fail gracefully when encountering vector primops.
|   * Only generate primop wrappers for vector primops when using LLVM.
* Always pass vector values on the stack. | Geoffrey Mainland | 2013-02-01 | 1 | -28/+36
| Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* Tidy up: move info-table related stuff to CmmInfo | Simon Marlow | 2013-01-23 | 4 | -121/+4
| Prep for #709
* White space only | Simon Peyton Jones | 2013-01-15 | 1 | -1/+1
|
* Inline some FastBytes/ByteString wrappers | Ian Lynagh | 2012-12-14 | 1 | -1/+2
| Working towards removing FastBytes
* Implement word2Float# and word2Double# | Johan Tibell | 2012-12-13 | 1 | -0/+6
|
* Code-size optimisation for top-level indirections (#7308) | Simon Marlow | 2012-11-19 | 4 | -19/+48
| Top-level indirections are often generated when there is a cast, e.g.
|
|     foo :: T
|     foo = bar `cast` (some coercion)
|
| For these we were generating a full-blown CAF, which is a fair chunk of code.
|
| This patch makes these indirections generate a single IND_STATIC closure (4 words) instead. This is exactly what the CAF would evaluate to eventually anyway, we're just shortcutting the whole process.
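As a rough illustration of where such a binding can come from, the sketch below defines a top-level value purely by coercing another top-level value. The names `Age`, `bar` and `foo` are hypothetical, and whether GHC actually desugars `foo` to `bar` under a cast depends on the compiler version and optimisation settings, so treat this as an assumption rather than a guaranteed outcome.

```haskell
module Main where

import Data.Coerce (coerce)

-- Hypothetical newtype, chosen only for illustration.
newtype Age = Age Int
  deriving (Eq, Show)

-- An ordinary top-level value.
bar :: Int
bar = 42

-- A top-level binding that is just `bar` viewed at another type.
-- The coercion costs nothing at run time, so in Core this kind of
-- binding can end up as `foo = bar `cast` co` (an assumption about
-- what the simplifier produces, not a guarantee).
foo :: Age
foo = coerce bar

main :: IO ()
main = print foo
```

Compiling a module like this with `-ddump-simpl` is one way to check what Core a particular GHC version produces for `foo`.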
* Fix the Slow calling convention (#7192) | Simon Marlow | 2012-11-13 | 4 | -21/+18
| The Slow calling convention passes the closure in R1, but we were ignoring this and hoping it would work, which it often did. However, this bug seems to have been the cause of #7192, because the graph-colouring allocator is more sensitive to having correct liveness information on jumps.
* Remove OldCmm, convert backends to consume new Cmm | Simon Marlow | 2012-11-12 | 1 | -59/+28
| This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm.
|
| The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt.
|
| We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
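The conditional inversion described in this message can be sketched with a toy model. The types `Cond` and `Branch` and the function `fallthroughFalse` below are hypothetical and are not GHC's Cmm data types or the code in CmmContFlowOpt; the model also assumes integer comparisons, where negating a condition is always valid.

```haskell
module Main where

-- Toy model only: hypothetical types, not GHC's Cmm representation.
type Label = String

data Cond = Eq | Ne | Lt | Ge
  deriving (Eq, Show)

-- A conditional branch with explicit true and false successors.
data Branch = Branch Cond Label Label
  deriving (Eq, Show)

-- Negate a condition (valid for integer comparisons).
invert :: Cond -> Cond
invert Eq = Ne
invert Ne = Eq
invert Lt = Ge
invert Ge = Lt

-- Given the label of the block that is laid out next, arrange for the
-- false successor to be that block when possible, so that the
-- branch-not-taken case simply falls through.
fallthroughFalse :: Label -> Branch -> Branch
fallthroughFalse next b@(Branch c t f)
  | f == next = b                      -- already falls through
  | t == next = Branch (invert c) f t  -- swap successors, negate test
  | otherwise = b                      -- neither successor is next

main :: IO ()
main = print (fallthroughFalse "L2" (Branch Lt "L2" "L3"))
```

When neither successor is the next block, the branch is left alone; a real code generator would need an extra unconditional jump in that case.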
* loadThreadState should set HpAlloc=0 | Simon Marlow | 2012-11-05 | 1 | -1/+7
|
* Fix popcnt calls | Ian Lynagh | 2012-11-01 | 1 | -10/+5
| We don't want to narrow the argument size before making the foreign call: Word8 still gets passed as a Word-sized argument
* Whitespace only in codeGen/StgCmmPrim.hs | Ian Lynagh | 2012-11-01 | 1 | -90/+83
|
* Draw STG F and D registers from the same pool of available SSE registers on x86-64. | Geoffrey Mainland | 2012-10-30 | 1 | -2/+8
| On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers.
|
| This patch breaks the LLVM back end. The next patch will fix this breakage.
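The effect of sharing the pool can be sketched with a small counting model. Everything below is hypothetical (`ArgTy`, `Loc`, `assignSeparate`, `assignShared`) and is not GHC's calling-convention code. The limits of two Double registers and six shared registers come from the message above; the separate Float limit of four used in `main` is an assumption made only for the example.

```haskell
module Main where

-- Toy model only: hypothetical types, not GHC's calling convention.
data ArgTy = F | D
  deriving (Eq, Show)

data Loc = InReg | OnStack
  deriving (Eq, Show)

-- Separate pools: Float and Double arguments each have their own
-- register budget (nF and nD), tracked independently.
assignSeparate :: Int -> Int -> [ArgTy] -> [Loc]
assignSeparate nF nD = go 0 0
  where
    go _ _ [] = []
    go usedF usedD (F : rest)
      | usedF < nF = InReg   : go (usedF + 1) usedD rest
      | otherwise  = OnStack : go usedF usedD rest
    go usedF usedD (D : rest)
      | usedD < nD = InReg   : go usedF (usedD + 1) rest
      | otherwise  = OnStack : go usedF usedD rest

-- Shared pool: the first n Float or Double arguments, in any mix,
-- go in registers; the rest go on the stack.
assignShared :: Int -> [ArgTy] -> [Loc]
assignShared n args =
  [ if i <= n then InReg else OnStack | (i, _) <- zip [1 :: Int ..] args ]

main :: IO ()
main = do
  let sixDoubles = replicate 6 D
  -- The Float limit of 4 is an assumption for illustration.
  print (assignSeparate 4 2 sixDoubles)
  print (assignShared 6 sixDoubles)
```

With six Double arguments, the separate-pool model places only the first two in registers, while the shared-pool model places all six.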
* Attach global register liveness info to Cmm procedures. | Geoffrey Mainland | 2012-10-30 | 6 | -17/+17
| All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are used to pass arguments are included in this list.
* Remove the old codegen | Simon Marlow | 2012-10-19 | 25 | -10495/+13
| Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM backends, and should probably be moved somewhere else.
* Some alpha renaming | Ian Lynagh | 2012-10-16 | 24 | -53/+53
| Mostly d -> g (matching DynFlag -> GeneralFlag).
| Also renamed if* to when*, matching the Haskell if/when names
* Fix copyArray# bug in new code generator | Roman Leshchinskiy | 2012-10-08 | 1 | -17/+22
|
* Fix copyArray# bug in old code generator | Roman Leshchinskiy | 2012-10-08 | 1 | -16/+19
|
* expand tabs | Simon Marlow | 2012-10-08 | 1 | -58/+58
|
* Produce new-style Cmm from the Cmm parser | Simon Marlow | 2012-10-08 | 22 | -357/+359
| The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:
|
|     foo ( gcptr a, bits32 b )
|     {
|       if (b > 0) {
|          // we can make tail calls passing arguments:
|          jump stg_ap_0_fast(a);
|       }
|
|       return (x,y);
|     }
|
| More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.
|
| The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.
|
|     jump %ENTRY_CODE(Sp(0)) [R1];
|
| Again, more details in Note [Syntax of .cmm files].
|
| I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.
|
| Some other changes in this batch:
|
|   - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
|   - CmmSink now does constant-folding (should fix #7219)
|   - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
|   - RET_DYN frames are removed from the RTS, lots of code goes away
|   - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Partially fix #367 by adding HpLim checks to entry with -fno-omit-yields. | Edward Z. Yang | 2012-09-26 | 2 | -23/+38
| The current fix is relatively dumb as far as where to add HpLim checks: it will always perform a check unless we know that we're returning from a closure or we are doing a non let-no-escape case analysis.
|
| The performance impact on the nofib suite looks like this:
|
|                Min    +5.7%   -0.0%   -6.5%   -6.4%  -50.0%
|                Max    +6.3%   +5.8%   +5.0%   +5.5%   +0.8%
|     Geometric Mean    +6.2%   +0.1%   +0.5%   +0.5%   -0.8%
|
| Overall, the executable bloat is the biggest problem, so we keep the old omit-yields optimization on by default. Remember that if you need an interruptibility guarantee, you need to recompile all of your libraries with -fno-omit-yields.
|
| A better fix would involve only inserting the yields necessary to break loops; this is left as future work.
|
| Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Remove some old-codegen cruft | Simon Marlow | 2012-09-25 | 2 | -238/+4
|
* Misc tidyup | Simon Marlow | 2012-09-24 | 1 | -6/+1
|
* non-tablesNextToCode fix for returns in the new codegen | Simon Marlow | 2012-09-20 | 1 | -1/+3
|
* Change some "else return ()"s to use when/unless | Ian Lynagh | 2012-09-20 | 2 | -3/+3
|