path: root/compiler/cmm
Commit message (Author, Age, Files, Lines)
...
* Add Cmm support for representing 128-bit-wide SIMD vectors. (Geoffrey Mainland, 2013-02-01, 6 files, -15/+89)
|
* Merge branch 'master' of http://darcs.haskell.org/ghc (Simon Peyton Jones, 2013-01-30, 1 file, -2/+5)
|\
| |   Conflicts:
| |       compiler/types/Coercion.lhs
| * hopefully fix #7620 (Simon Marlow, 2013-01-29, 1 file, -2/+5)
| |
* | Merge branch 'master' of http://darcs.haskell.org/ghc (Simon Peyton Jones, 2013-01-24, 3 files, -2/+156)
|\ \
| |/
| * Tidy up: move info-table related stuff to CmmInfo (Simon Marlow, 2013-01-23, 3 files, -2/+156)
| |   Prep for #709
* | Introduce CPR for sum types (Trac #5075) (Simon Peyton Jones, 2013-01-24, 1 file, -1/+0)
|/
|     The main payload of this patch is to extend CPR so that it detects
|     when a function always returns a result constructed with the *same*
|     constructor, even if the constructor comes from a sum type.  This
|     doesn't matter very often, but it does improve some things (results
|     below).
|
|     Binary sizes increase a little bit, I think because there are more
|     wrappers.  This is with -split-objs.  Without -split-objs, binary
|     sizes increased by 6% even for HelloWorld.hs.  It's hard to see
|     exactly why, but I think it was because System.Posix.Types.o got
|     included in the linked binary, whereas it didn't before.
|
|     Program              Size    Allocs   Runtime   Elapsed  TotalMem
|     fluid               +1.8%     -0.3%      0.01      0.01     +0.0%
|     tak                 +2.2%     -0.2%      0.02      0.02     +0.0%
|     ansi                +1.7%     -0.3%      0.00      0.00     +0.0%
|     cacheprof           +1.6%     -0.3%     +0.6%     +0.5%     +1.4%
|     parstof             +1.4%     -4.4%      0.00      0.00     +0.0%
|     reptile             +2.0%     +0.3%      0.02      0.02     +0.0%
|     ----------------------------------------------------------------
|     Min                 +1.1%     -4.4%     -4.7%     -4.7%    -15.0%
|     Max                 +2.3%     +0.3%     +8.3%     +9.4%    +50.0%
|     Geometric Mean      +1.9%     -0.1%     +0.6%     +0.7%     +0.3%
|
|     Other things in this commit
|     ~~~~~~~~~~~~~~~~~~~~~~~~~~~
|     * Got rid of the Lattice class in Demand
|     * Refactored the way that products and newtypes are decomposed
|       (no change in functionality)
* Rename all of the 'cmmz' flags and make them more consistent. (Austin Seipp, 2012-12-19, 2 files, -20/+19)
|     There's only a single compiler backend now, so the 'z' suffix means
|     nothing. Also, the flags were confusingly named ('cmm-foo' vs
|     'foo-cmm'), and counter-intuitively, '-ddump-cmm' did not do at all
|     what you expected since the new backend went live.
|
|     Basically, all of the -ddump-cmmz-* flags are now -ddump-cmm-*. Some
|     were renamed to be more consistent.
|
|     This doesn't update the manual; it already mentions '-ddump-cmm' and
|     that flag implies all the others anyway, which is probably what you
|     want.
|
|     Signed-off-by: Austin Seipp <mad.one@gmail.com>
* Implement word2Float# and word2Double# (Johan Tibell, 2012-12-13, 2 files, -0/+3)
|
* Pessimistically assume that unknown arches can't do unaligned loads (Ian Lynagh, 2012-12-07, 1 file, -0/+3)
|
* Tweak comments (Ian Lynagh, 2012-12-02, 1 file, -2/+3)
|
* Fix broken -fPIC on Darwin/PPC (#7442) (PHO, 2012-11-24, 1 file, -4/+12)
|     The workaround described in note [darwin-x86-pic] applies to
|     Darwin/PPC too.
* C backend: put the entry block first (Simon Marlow, 2012-11-19, 1 file, -1/+1)
|
* Code-size optimisation for top-level indirections (#7308) (Simon Marlow, 2012-11-19, 2 files, -2/+12)
|     Top-level indirections are often generated when there is a cast, e.g.
|
|         foo :: T
|         foo = bar `cast` (some coercion)
|
|     For these we were generating a full-blown CAF, which is a fair chunk
|     of code.
|
|     This patch makes these indirections generate a single IND_STATIC
|     closure (4 words) instead.  This is exactly what the CAF would
|     evaluate to eventually anyway, we're just shortcutting the whole
|     process.
* C backend: ignore MO_Touch (Simon Marlow, 2012-11-16, 1 file, -0/+2)
|
* fix syntax error in generated C (#7407) (Simon Marlow, 2012-11-16, 1 file, -2/+2)
|
* Tell the compiler about alpha, mipseb and mipsel again; fixes #7339 (Ian Lynagh, 2012-11-13, 1 file, -0/+3)
|     This reverts the compiler parts of
|
|         commit 7b594a5d7ac29972db39228e9c8b7f384313f39b
|         Author: David Terei <davidterei@gmail.com>
|         Date:   Mon Nov 21 12:05:18 2011 -0800
|
|             Remove registerised code for dead architectures: mips, ia64,
|             alpha, hppa1, m68k
|
|     In particular, we want to know whether bewareLoadStoreAlignment
|     should return True or False for them.
|
|     It also reverts
|
|         commit 3fc68b5c356b39b2b52a86d953367d0021c13262
|         Author: Simon Marlow <marlowsd@gmail.com>
|         Date:   Wed Jan 4 11:44:02 2012 +0000
|
|             Remove missing archs (mipseb, mipsel, alpha) (#5734)
|
|             It doesn't hurt to map these to ArchUnknown since we don't
|             need to know anything specific about them, and adding them
|             would be a pain (there are a bunch of places where we have to
|             case-match on all the arches to avoid warnings).
* Fix the Slow calling convention (#7192) (Simon Marlow, 2012-11-13, 3 files, -28/+12)
|     The Slow calling convention passes the closure in R1, but we were
|     ignoring this and hoping it would work, which it often did.  However,
|     this bug seems to have been the cause of #7192, because the
|     graph-colouring allocator is more sensitive to having correct
|     liveness information on jumps.
* replaceLabels: null out the cml_cont field of CmmCall (Simon Marlow, 2012-11-12, 1 file, -1/+4)
|     This fixes a CmmLint complaint when doing proc-point splitting.
* Fix warnings (Simon Marlow, 2012-11-12, 2 files, -1/+3)
|
* Remove OldCmm, convert backends to consume new Cmm (Simon Marlow, 2012-11-12, 14 files, -1031/+196)
|     This removes the OldCmm data type and the CmmCvt pass that converts
|     new Cmm to OldCmm.  The backends (NCGs, LLVM and C) have all been
|     converted to consume new Cmm.
|
|     The main difference between the two data types is that conditional
|     branches in new Cmm have both true/false successors, whereas in
|     OldCmm the false case was a fallthrough.  To generate slightly better
|     code we occasionally need to invert a conditional to ensure that the
|     branch-not-taken becomes a fallthrough; this was previously done in
|     CmmCvt, and it is now done in CmmContFlowOpt.
|
|     We could go further and use the Hoopl Block representation for native
|     code, which would mean that we could use Hoopl's postorderDfs and
|     analyses for native code, but for now I've left it as is, using the
|     old ListGraph representation for native code.
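The conditional inversion described in the message above can be sketched roughly as follows. This is an illustrative model only, not GHC's CmmContFlowOpt: the `CondBranch` type, the `NEGATE` table and `prefer_fallthrough` are hypothetical names, and the sketch assumes integer comparisons (negating a floating-point comparison is not safe in the presence of NaN).

```python
from dataclasses import dataclass


# Hypothetical mini-IR for a two-way conditional branch (not GHC's Cmm types).
@dataclass(frozen=True)
class CondBranch:
    op: str        # integer comparison operator, e.g. "<"
    lhs: str
    rhs: str
    if_true: str   # label taken when the comparison holds
    if_false: str  # label taken otherwise


# Logical negation of each integer comparison.
NEGATE = {"<": ">=", ">=": "<", ">": "<=", "<=": ">", "==": "!=", "!=": "=="}


def prefer_fallthrough(branch: CondBranch, next_label: str) -> CondBranch:
    """Return an equivalent branch whose false successor is next_label.

    If the block laid out next is the *true* successor, negate the test and
    swap the successors so that the not-taken case falls through. Otherwise
    the branch is returned unchanged.
    """
    if branch.if_true == next_label and branch.if_false != next_label:
        return CondBranch(
            NEGATE[branch.op], branch.lhs, branch.rhs,
            if_true=branch.if_false, if_false=branch.if_true,
        )
    return branch
```

For instance, a branch `if (x < y) goto L1 else goto L2` followed immediately by block `L1` becomes `if (x >= y) goto L2 else goto L1`, so the not-taken path needs no jump.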
* fix 'return' in cmm code when tablesNextToCode==False (Simon Marlow, 2012-11-05, 2 files, -7/+7)
|
* Draw STG F and D registers from the same pool of available SSE registers on x86-64. (Geoffrey Mainland, 2012-10-30, 2 files, -18/+44)
|     On x86-64 F and D registers are both drawn from SSE registers, so
|     there is no reason not to draw them from the same pool of available
|     SSE registers. This means that whereas previously a function could
|     only receive two Double arguments in registers even if it did not
|     have any Float arguments, now it can receive up to 6 arguments that
|     are any mix of Float and Double in registers.
|
|     This patch breaks the LLVM back end. The next patch will fix this
|     breakage.
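A rough sketch of the difference between the two argument-passing schemes described above. This is not GHC's calling-convention code: the function names and the register names (`F1`, `D1`, `sse1`) are hypothetical, and the size of the old Float pool (4) is an assumption; the message only states the old Double limit (2) and the new shared limit (6).

```python
def assign_separate(arg_kinds, n_float=4, n_double=2):
    """Old scheme: Float ("F") and Double ("D") arguments use separate pools.

    n_float=4 is an assumption for illustration; n_double=2 is from the
    commit message.
    """
    limit = {"F": n_float, "D": n_double}
    used = {"F": 0, "D": 0}
    locations = []
    for kind in arg_kinds:
        if used[kind] < limit[kind]:
            used[kind] += 1
            locations.append(f"{kind}{used[kind]}")
        else:
            locations.append("stack")  # pool exhausted: pass on the stack
    return locations


def assign_shared(arg_kinds, n_sse=6):
    """New scheme: Float and Double arguments share one pool of SSE registers."""
    locations = []
    used = 0
    for _kind in arg_kinds:
        if used < n_sse:
            used += 1
            locations.append(f"sse{used}")
        else:
            locations.append("stack")
    return locations
```

With three Double arguments, `assign_separate(["D", "D", "D"])` spills the third to the stack, while `assign_shared(["D", "D", "D"])` keeps all three in registers.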
* Attach global register liveness info to Cmm procedures. (Geoffrey Mainland, 2012-10-30, 14 files, -41/+56)
|     All Cmm procedures now include the set of global registers that are
|     live on procedure entry, i.e., the global registers used to pass
|     arguments to the procedure. Only global registers that are used to
|     pass arguments are included in this list.
* Generalize register sets and liveness calculations. (Geoffrey Mainland, 2012-10-30, 8 files, -167/+267)
|     We would like to calculate register liveness for global registers as
|     well as local registers, so this patch generalizes the existing
|     infrastructure to set the stage.
* Cmm jumps always have live register information. (Geoffrey Mainland, 2012-10-30, 3 files, -4/+4)
|     Jumps now always have live register information attached, so drop
|     Maybes.
* INFO_TABLE_RET should generate a CmmRetInfoLabel, not a CmmInfoLabel (Simon Marlow, 2012-10-30, 1 file, -2/+2)
|     Fixes this, when building unregisterised:
|
|         rts/dist/build/AutoApply.hc:87:1:
|              error: ‘stg_ap_v_entry’ undeclared (first use in this function)
* Fix a bug in CmmSink exposed by a recent optimisation (#7366) (Simon Marlow, 2012-10-25, 1 file, -0/+10)
|
* Fix bug in 88a6f863d9f127fc1b03a1e2f068fd20ecbe096c (#7366) (Simon Marlow, 2012-10-25, 1 file, -20/+20)
|
* Comment to explain why we need to split proc points on x86/Darwin with -fPIC (Simon Marlow, 2012-10-24, 1 file, -1/+31)
|
* Add a case for CmmLabelDiffOff to cmmOffsetLit (Simon Marlow, 2012-10-24, 1 file, -0/+2)
|
* Merge branch 'master' of darcs.haskell.org:/srv/darcs//ghc (Ian Lynagh, 2012-10-23, 2 files, -6/+19)
|\
| * Small optimisation: always sink/inline reg1 = reg2 assignments (Simon Marlow, 2012-10-23, 1 file, -6/+5)
| |
| * a small -fPIC optimisation (Simon Marlow, 2012-10-23, 1 file, -0/+14)
| |   (PicBaseReg + lit) + N  ==>  PicBaseReg + (lit+N)
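The rewrite rule above can be illustrated with a small sketch over a toy expression tree. The representation is hypothetical and is not GHC's CmmExpr: expressions are tuples, a literal is modelled as `("lit", label, offset)`, and `fold_pic_offset` is an invented name.

```python
PIC_BASE = "PicBaseReg"  # stand-in for the PIC base register


def fold_pic_offset(expr):
    """Rewrite (PicBaseReg + lit) + N into PicBaseReg + (lit + N).

    Toy encoding (an assumption for illustration):
      ("add", e1, e2)         addition
      ("lit", label, offset)  label literal with a constant byte offset
      int                     integer constant N
    Any expression not matching the pattern is returned unchanged.
    """
    if not (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "add"):
        return expr
    _, inner, n = expr
    if not isinstance(n, int):
        return expr
    if not (isinstance(inner, tuple) and len(inner) == 3 and inner[0] == "add"):
        return expr
    _, base, lit = inner
    if base != PIC_BASE:
        return expr
    if not (isinstance(lit, tuple) and len(lit) == 3 and lit[0] == "lit"):
        return expr
    _, label, offset = lit
    # Fold the constant into the literal so only one addition remains at run time.
    return ("add", PIC_BASE, ("lit", label, offset + n))
```

The point of the rule is that `lit + N` is itself a link-time constant, so the folded form needs a single addition to the base register instead of two.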
* | Merge branch 'master' of http://darcs.haskell.org/ghc (Ian Lynagh, 2012-10-23, 1 file, -20/+23)
|\ \
| |/
| * Avoid calling toInfoLbl on the entry label (#7313) (Simon Marlow, 2012-10-23, 1 file, -20/+23)
| |
* | Fix -fPIC on OS X x86 (Ian Lynagh, 2012-10-23, 1 file, -0/+6)
|/
* Foreign calls can clobber heap & stack memory too (Simon Marlow, 2012-10-22, 1 file, -2/+17)
|     We were making an aggressive assumption that foreign calls cannot
|     clobber heap or stack memory, which for the majority of foreign calls
|     is true, but we violate the assumption in the implementation of
|     primops in the RTS.  This was causing crashes in some STM tests.
* Remove the old codegen (Simon Marlow, 2012-10-19, 3 files, -7/+3)
|     Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM
|     backends, and should probably be moved somewhere else.
* Refactor the way dump flags are handled (Ian Lynagh, 2012-10-18, 1 file, -4/+4)
|     We were being inconsistent about how we tested whether dump flags
|     were enabled; in particular, sometimes we also checked the verbosity,
|     and sometimes we didn't.
|
|     This led to oddities such as "ghc -v4" printing an "Asm code" section
|     which didn't contain any code, and "-v4" enabled some parts of
|     "-ddump-deriv" but not others.
|
|     Now all the tests use dopt, which also takes the verbosity into
|     account as appropriate.
* Some alpha renaming (Ian Lynagh, 2012-10-16, 6 files, -12/+12)
|     Mostly d -> g (matching DynFlag -> GeneralFlag).
|     Also renamed if* to when*, matching the Haskell if/when names
* Rename DynFlag to GeneralFlag (Ian Lynagh, 2012-10-16, 1 file, -2/+2)
|     This avoids confusion due to [DynFlag] and DynFlags being completely
|     different types.
* Add a type signature needed when using GADTs (Simon Peyton Jones, 2012-10-12, 1 file, -0/+1)
|
* untab (Simon Marlow, 2012-10-08, 1 file, -253/+253)
|
* Produce new-style Cmm from the Cmm parser (Simon Marlow, 2012-10-08, 24 files, -779/+1118)
|     The main change here is that the Cmm parser now allows high-level cmm
|     code with argument-passing and function calls.  For example:
|
|         foo ( gcptr a, bits32 b )
|         {
|           if (b > 0) {
|              // we can make tail calls passing arguments:
|              jump stg_ap_0_fast(a);
|           }
|
|           return (x,y);
|         }
|
|     More details on the new cmm syntax are in Note [Syntax of .cmm files]
|     in CmmParse.y.
|
|     The old syntax is still more-or-less supported for those occasional
|     code fragments that really need to explicitly manipulate the stack.
|     However there are a couple of differences: it is now obligatory to
|     give a list of live GlobalRegs on every jump, e.g.
|
|         jump %ENTRY_CODE(Sp(0)) [R1];
|
|     Again, more details in Note [Syntax of .cmm files].
|
|     I have rewritten most of the .cmm files in the RTS into the new
|     syntax, except for AutoApply.cmm which is generated by the genapply
|     program: this file could be generated in the new syntax instead and
|     would probably be better off for it, but I ran out of enthusiasm.
|
|     Some other changes in this batch:
|
|      - The PrimOp calling convention is gone, primops now use the ordinary
|        NativeNodeCall convention.  This means that primops and "foreign
|        import prim" code must be written in high-level cmm, but they can
|        now take more than 10 arguments.
|
|      - CmmSink now does constant-folding (should fix #7219)
|
|      - .cmm files now go through the cmmPipeline, and as a result we
|        generate better code in many cases.  All the object files generated
|        for the RTS .cmm files are now smaller.  Performance should be
|        better too, but I haven't measured it yet.
|
|      - RET_DYN frames are removed from the RTS, lots of code goes away
|
|      - we now have some more canned GC points to cover unboxed-tuples with
|        2-4 pointers, which will reduce code size a little.
* Add a ToDo comment (Simon Marlow, 2012-10-05, 1 file, -0/+21)
|
* Remove some old-codegen cruft (Simon Marlow, 2012-09-25, 1 file, -269/+0)
|
* When -split-objs is on, make one SRT per split, not one per module (Simon Marlow, 2012-09-25, 2 files, -18/+16)
|     This is a hopefully temporary measure until the new SRT design is
|     implemented.
* Misc tidyup (Simon Marlow, 2012-09-24, 5 files, -13/+20)
|
* no functional changes (Simon Marlow, 2012-09-24, 1 file, -7/+16)
|
* add a missing entryCode (Simon Marlow, 2012-09-20, 1 file, -1/+3)
|