delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message	Author	Age	Files	Lines
...
*	Merge cgTailCall and cgLneJump into one function	Jan Stolarek	2013-08-20	1	-1/+1
\|	Previously the logic of these functions was something like this:
\|
\|	    cgIdApp x = case x of
\|	                  A -> cgLneJump x
\|	                  _ -> cgTailCall x
\|
\|	    cgTailCall x = case x of
\|	                     B -> ...
\|	                     C -> ...
\|	                     _ -> ...
\|
\|	After merging there is no nesting of cases:
\|
\|	    cgIdApp x = case x of
\|	                  A -> -- body of cgLneJump
\|	                  B -> ...
\|	                  C -> ...
\|	                  _ -> ...
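The refactoring described in this message can be sketched in isolation. The following is a minimal illustration only, not the code from StgCmmExpr: the `CallKind` type, its constructors, and the string results are hypothetical stand-ins for the cases `A`, `B`, `C` and their bodies.

```haskell
-- Hypothetical stand-in for the alternatives A, B, C, _ in the message.
data CallKind = LneJump | Enter | DirectCall | SlowCall
  deriving (Eq, Show)

-- Before: the outer function dispatches on one case and delegates
-- everything else to a second function with its own case expression.
cgIdAppNested :: CallKind -> String
cgIdAppNested k = case k of
  LneJump -> cgLneJump k
  _       -> cgTailCall k

cgLneJump :: CallKind -> String
cgLneJump _ = "jump to let-no-escape"

cgTailCall :: CallKind -> String
cgTailCall k = case k of
  Enter      -> "enter closure"
  DirectCall -> "direct call"
  _          -> "slow call"

-- After: a single flat case expression with the helper bodies inlined.
cgIdAppFlat :: CallKind -> String
cgIdAppFlat k = case k of
  LneJump    -> "jump to let-no-escape"
  Enter      -> "enter closure"
  DirectCall -> "direct call"
  _          -> "slow call"
```

Both versions return the same result for every constructor; the flat one simply makes all alternatives visible in one place.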
*	Remove unused module	Jan Stolarek	2013-08-20	1	-3/+0
\|	This commit removes the module StgCmmGran, which contains only no-op functions. According to comments in the module, it was used by GpH, but the GpH project seems to have been dead for a couple of years now.
*	Cleanup StgCmm pass	Jan Stolarek	2013-08-20	1	-19/+19
\|	This cleanup includes:
\|
\|	* removing dead code. This includes the forkStatics function, which was in fact one big no-op, and global bindings in CgInfoDownwards,
\|
\|	* converting functions that used the FCode monad only to access DynFlags into functions that take DynFlags as a parameter and don't work in a monad,
\|
\|	* the addBindC function is now smarter. It extracts the Id from the CgIdInfo passed to it in the same way addBindsC does. Previously this was done at every call site, which was redundant.
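The second bullet (turning monadic functions that only read DynFlags into plain functions) can be sketched as follows. This is an illustration under stated assumptions: `Flags`, `Code`, `getFlags`, and both `bytesForWords` functions are hypothetical miniatures, not GHC's DynFlags, FCode, or any function touched by the commit.

```haskell
-- Hypothetical, tiny stand-in for DynFlags.
newtype Flags = Flags { wordSizeInBytes :: Int }

-- Hypothetical stand-in for FCode, reduced to a reader over Flags.
newtype Code a = Code { runCode :: Flags -> a }

instance Functor Code where
  fmap f (Code g) = Code (f . g)

instance Applicative Code where
  pure x = Code (const x)
  Code f <*> Code g = Code (\fl -> f fl (g fl))

instance Monad Code where
  Code g >>= k = Code (\fl -> runCode (k (g fl)) fl)

-- Read the flags from the monadic environment.
getFlags :: Code Flags
getFlags = Code id

-- Before: lives in the monad although it only needs the flags.
bytesForWordsM :: Int -> Code Int
bytesForWordsM n = do
  flags <- getFlags
  return (n * wordSizeInBytes flags)

-- After: a pure function that takes the flags as a parameter.
bytesForWords :: Flags -> Int -> Int
bytesForWords flags n = n * wordSizeInBytes flags
```

The pure version can be called from monadic and non-monadic code alike, and its type makes it clear that it has no other effects.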
*	Trailing whitespaces, code formatting, detabify	Jan Stolarek	2013-08-20	1	-4/+4
\|	A major cleanup of trailing whitespace and tabs in the codeGen/ directory. I also adjusted code formatting in some places.
*	Add final remaining bits to fix #7978.	Geoffrey Mainland	2013-07-22	1	-30/+1
\|
*	Add a work-around for #7978.	Geoffrey Mainland	2013-06-22	1	-2/+7
\|	This patch fixes profiling at the cost of losing cost centre accounting in a very small number of cases. I am working on a better fix.
*	Wibbles (merg-os) to ticky-ticky	Simon Peyton Jones	2013-06-06	1	-2/+2
\|
*	Implement cardinality analysis	Simon Peyton Jones	2013-06-06	1	-10/+34
\|	This major patch implements the cardinality analysis described in our paper "Higher order cardinality analysis". It is joint work with Ilya Sergey and Dimitrios Vytiniotis.
\|
\|	The basic idea is to augment the absence-analysis part of the demand analyser so that it can tell when something is used
\|	    never
\|	    at most once
\|	    some other way
\|
\|	The "at most once" information is used
\|	    a) to enable transformations, and in particular to identify one-shot lambdas
\|	    b) to allow updates on thunks to be omitted.
\|
\|	There are two new flags, mainly there so you can do performance comparisons:
\|	    -fkill-absence   stops GHC doing absence analysis at all
\|	    -fkill-one-shot  stops GHC spotting one-shot lambdas and single-entry thunks
\|
\|	The big changes are:
\|
\|	* The Demand type is substantially refactored. In particular the UseDmd is factored as follows
\|
\|	      data UseDmd
\|	        = UCall Count UseDmd
\|	        \| UProd [MaybeUsed]
\|	        \| UHead
\|	        \| Used
\|
\|	      data MaybeUsed = Abs \| Use Count UseDmd
\|
\|	      data Count = One \| Many
\|
\|	  Notice that UCall recurses straight to UseDmd, whereas UProd goes via MaybeUsed.
\|
\|	  The "Count" embodies the "at most once" or "many" idea.
\|
\|	* The demand analyser itself was refactored a lot
\|
\|	* The previously ad-hoc stuff in the occurrence analyser for foldr and build goes away entirely. Before, if we had build (\cn -> ...x... ) then the "\cn" was hackily made one-shot (by spotting 'build' as special). That's essential to allow x to be inlined. Now the occurrence analyser propagates info gotten from 'build's strictness signature (so build isn't special); and that strictness sig is in turn derived entirely automatically. Much nicer!
\|
\|	* The ticky stuff is improved to count single-entry thunks separately.
\|
\|	One shortcoming is that there is no DEBUG way to spot if an allegedly-single-entry thunk is actually entered more than once. It would not be hard to generate a bit of code to check for this, and it would be reassuring. But it's fiddly and I have not done it.
\|
\|	Despite all this fuss, the performance numbers are rather under-whelming. See the paper for more discussion.
\|
\|	       nucleic2          -0.8%    -10.9%      0.10      0.10     +0.0%
\|	         sphere          -0.7%     -1.5%      0.08      0.08     +0.0%
\|	--------------------------------------------------------------------------------
\|	            Min          -4.7%    -10.9%     -9.3%     -9.3%    -50.0%
\|	            Max          -0.4%     +0.5%     +2.2%     +2.3%     +7.4%
\|	 Geometric Mean          -0.8%     -0.2%     -1.3%     -1.3%     -1.8%
\|
\|	I don't quite know how much credence to place in the runtime changes, but movement seems generally in the right direction.
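To make the shape of the refactored usage type concrete, here is a small standalone sketch. The three data declarations follow the commit message; the `deriving` clauses and the helper `atMostOnce` are assumptions added for illustration and are not taken from GHC's Demand module.

```haskell
-- How often something is used: at most once, or possibly many times.
data Count = One | Many
  deriving (Eq, Show)

-- Usage demand, as given in the commit message.
-- UCall recurses straight to UseDmd; UProd goes via MaybeUsed.
data UseDmd
  = UCall Count UseDmd
  | UProd [MaybeUsed]
  | UHead
  | Used
  deriving (Eq, Show)

-- A component is either absent (never used) or used with a cardinality.
data MaybeUsed = Abs | Use Count UseDmd
  deriving (Eq, Show)

-- Hypothetical helper: True when the value is used never or at most once.
-- This is the kind of fact that would let an update on a thunk be omitted.
atMostOnce :: MaybeUsed -> Bool
atMostOnce Abs          = True
atMostOnce (Use One _)  = True
atMostOnce (Use Many _) = False
```

For example, `atMostOnce (Use One (UProd [Abs, Use Many Used]))` is `True`: the outer value is demanded once, even though one of its fields may be used many times.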
*	extended ticky to also track "let"s that are not conventional closures	Nicolas Frisby	2013-05-02	1	-9/+13
\|	This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag. (This is 024df664b600a with a small bug fix.)
*	In CMM, only allow foreign calls to labels, not arbitrary expressions	Ian Lynagh	2013-04-24	1	-1/+1
\|	I'm not sure if we want to make this change permanently, but for now it fixes the unreg build.
\|
\|	I've also removed some redundant special-case code that generated prototypes for foreign functions. The standard pprTempAndExternDecls now generates them.
*	Revert "extended ticky to also track "let"s that are not closures"	Nicolas Frisby	2013-04-12	1	-14/+9
\|	This reverts commit 024df664b600a622cb8189ccf31789688505fc1c.
\|
\|	Of course I gaff on my last day...
*	extended ticky to also track "let"s that are not closures	Nicolas Frisby	2013-04-12	1	-9/+14
\|	This includes selector, ap, and constructor thunks. They are still guarded by the -ticky-dyn-thk flag.
*	ticky enhancements	Nicolas Frisby	2013-03-29	1	-26/+17
\|	* the new StgCmmArgRep module breaks a dependency cycle; I also untabified it, but made no real changes
\|
\|	* updated the documentation in the wiki and changed the user guide to point there
\|
\|	* moved the allocation enters for ticky and CCS to after the heap check
\|
\|	  * I left LDV where it was, which was before the heap check at least once, since I have no idea what it is
\|
\|	* standardized all (active?) ticky alloc totals to bytes
\|
\|	* in order to avoid double counting StgCmmLayout.adjustHpBackwards no longer bumps ALLOC_HEAP_ctr
\|
\|	* I resurrected the SLOW_CALL counters
\|
\|	  * the new module StgCmmArgRep breaks cyclic dependency between Layout and Ticky (which the SLOW_CALL counters cause)
\|
\|	  * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
\|
\|	* added ALLOC_RTS_ctr and _tot ticky counters
\|
\|	  * eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap__info
\|
\|	* resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
\|
\|	* added -ticky and -DTICKY_TICKY in ways.mk for debug ways
\|
\|	* added a ticky counter for total LNE entries
\|
\|	* new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
\|
\|	  * all off by default
\|
\|	  * -ticky-allocd: tracks allocation of closure in addition to allocation by that closure
\|
\|	  * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
\|
\|	  * -ticky-LNE tracks LNEs as if they were functions
\|
\|	* updated the ticky report format, including making the argument categories (more?) accurate again
\|
\|	* the printed name for things in the report include the unique of their ticky parent as well as if they are not top-level
*	Tidy up: move info-table related stuff to CmmInfo	Simon Marlow	2013-01-23	1	-0/+1
\|	Prep for #709
*	Code-size optimisation for top-level indirections (#7308)	Simon Marlow	2012-11-19	1	-9/+31
\|	Top-level indirections are often generated when there is a cast, e.g.
\|
\|	    foo :: T
\|	    foo = bar `cast` (some coercion)
\|
\|	For these we were generating a full-blown CAF, which is a fair chunk of code.
\|
\|	This patch makes these indirections generate a single IND_STATIC closure (4 words) instead. This is exactly what the CAF would evaluate to eventually anyway, we're just shortcutting the whole process.
*	Fix the Slow calling convention (#7192)	Simon Marlow	2012-11-13	1	-10/+11
\|	The Slow calling convention passes the closure in R1, but we were ignoring this and hoping it would work, which it often did. However, this bug seems to have been the cause of #7192, because the graph-colouring allocator is more sensitive to having correct liveness information on jumps.
*	Some alpha renaming	Ian Lynagh	2012-10-16	1	-5/+5
\|	Mostly d -> g (matching DynFlag -> GeneralFlag).
\|
\|	Also renamed if* to when*, matching the Haskell if/when names
*	Produce new-style Cmm from the Cmm parser	Simon Marlow	2012-10-08	1	-16/+25
\|	The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:
\|
\|	    foo ( gcptr a, bits32 b )
\|	    {
\|	      if (b > 0) {
\|	         // we can make tail calls passing arguments:
\|	         jump stg_ap_0_fast(a);
\|	      }
\|
\|	      return (x,y);
\|	    }
\|
\|	More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.
\|
\|	The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.
\|
\|	    jump %ENTRY_CODE(Sp(0)) [R1];
\|
\|	Again, more details in Note [Syntax of .cmm files].
\|
\|	I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.
\|
\|	Some other changes in this batch:
\|
\|	- The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
\|
\|	- CmmSink now does constant-folding (should fix #7219)
\|
\|	- .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
\|
\|	- RET_DYN frames are removed from the RTS, lots of code goes away
\|
\|	- we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
*	Change some "else return ()"s to use when/unless	Ian Lynagh	2012-09-20	1	-2/+1
\|
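The kind of rewrite named in the subject line above ("else return ()" replaced by when/unless) can be sketched as follows. The functions and the use of an `IORef` counter are hypothetical and chosen only to make the example runnable; they are not the code the commit changed.

```haskell
import Control.Monad (when, unless)
import Data.IORef (IORef, modifyIORef, readIORef, writeIORef)

-- Before: an explicit do-nothing else branch.
bumpIfPositiveOld :: IORef Int -> Int -> IO ()
bumpIfPositiveOld ref n =
  if n > 0 then modifyIORef ref (+ n) else return ()

-- After: `when` runs the action only if the condition holds.
bumpIfPositive :: IORef Int -> Int -> IO ()
bumpIfPositive ref n = when (n > 0) (modifyIORef ref (+ n))

-- Before: the do-nothing branch comes first.
clampOld :: IORef Int -> IO ()
clampOld ref = do
  v <- readIORef ref
  if v >= 0 then return () else writeIORef ref 0

-- After: `unless` runs the action only if the condition fails.
clamp :: IORef Int -> IO ()
clamp ref = do
  v <- readIORef ref
  unless (v >= 0) (writeIORef ref 0)
```

The behaviour is unchanged in both cases; the combinators only remove the redundant branch.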
*	Move tAG_BITS into platformConstants	Ian Lynagh	2012-09-16	1	-2/+2
\|
*	Move wORD_SIZE into platformConstants	Ian Lynagh	2012-09-16	1	-2/+1
\|
*	Start moving other constants from (Haskell)Constants to platformConstants	Ian Lynagh	2012-09-14	1	-2/+2
\|
*	Use oFFSET_* from platformConstants rather than Constants	Ian Lynagh	2012-09-13	1	-1/+1
\|
*	Use sIZEOF_* from platformConstants rather than Constants	Ian Lynagh	2012-09-13	1	-1/+1
\|
*	Pass DynFlags down to wordWidth	Ian Lynagh	2012-09-12	1	-4/+4
\|
*	Pass DynFlags down to bWord	Ian Lynagh	2012-09-12	1	-9/+11
\|	I've switched to passing DynFlags rather than Platform, as (a) it's simpler to not have to extract targetPlatform in so many places, and (b) it may be useful to have DynFlags around in future.
*	Cleanup: add mkIntExpr and zeroExpr utils	Simon Marlow	2012-08-31	1	-1/+1
\|
*	remove tabs	Simon Marlow	2012-08-21	1	-124/+117
\|
*	Remove uses of fixC from the codeGen, and make the FCode monad strict	Simon Marlow	2012-08-09	1	-74/+110
\|
*	fix warning	Simon Marlow	2012-08-07	1	-1/+1
\|
*	entryHeapCheck: fix calls to stg_gc_fun and stg_gc_enter_1	Simon Marlow	2012-08-07	1	-2/+2
\|	We weren't passing the arguments correctly to the GC functions, which usually happened to work because the arguments were in the right registers already.
\|
\|	After this fix the profiling tests go through with the new code generator.
*	Small optimisation	Simon Marlow	2012-08-07	1	-5/+6
\|	When calling newCAF, refer to the closure using its LocalReg rather than R1. Using R1 here was preventing the register allocator from coalescing the assignment x=R1 at the beginning of the function.
*	fix a warning	Simon Marlow	2012-08-07	1	-1/+1
\|
*	Fix update frames for profiling	Simon Marlow	2012-08-07	1	-12/+16
\|
*	Cleanup and fixes to profiling	Simon Marlow	2012-08-07	1	-1/+5
\|
*	A closure with void args only should be a function, not a thunk	Simon Marlow	2012-08-07	1	-4/+3
\|
*	Generate one fewer temps per heap allocation	Simon Marlow	2012-08-07	1	-5/+6
\|	This saves compile time and can make a big difference in some pathological cases (T4801)
*	Add "Unregisterised" as a field in the settings file	Ian Lynagh	2012-08-07	1	-2/+3
\|	To explicitly choose whether you want an unregisterised build you now need to use the "--enable-unregisterised"/"--disable-unregisterised" configure flags.
*	Make tablesNextToCode "dynamic"	Ian Lynagh	2012-08-06	1	-3/+4
\|	This is a bit odd by itself, but it's a stepping stone on the way to putting "target unregisterised" into the settings file.
*	Use "ReturnedTo" when generating safe foreign calls	Simon Marlow	2012-08-06	1	-4/+3
\|
*	Add a comment to explain why the FCode monad is lazy	Simon Marlow	2012-08-06	1	-1/+2
\|
*	Explicitly share some return continuations	Simon Marlow	2012-08-02	1	-2/+4
\|	Instead of relying on common-block-elimination to share return continuations in the common case (case-alternative heap checks) we do it explicitly. This isn't hard to do, is more robust, and saves some compilation time. Full commentary in Note [sharing continuations].
*	Small optimisation to the code generated for CAFs	Simon Marlow	2012-07-30	1	-9/+14
\|
*	New codegen: do not split proc-points when using the NCG	Simon Marlow	2012-07-30	1	-1/+1
\|	Proc-point splitting is only required by backends that do not support having proc-points within a code block (that is, everything except the native backend, i.e. LLVM and C).
\|
\|	Not doing proc-point splitting saves some compilation time, and might produce slightly better code in some cases.
*	Make -fscc-profiling a dynamic flag	Ian Lynagh	2012-07-24	1	-23/+28
\|	All the flags that 'ways' imply are now dynamic
*	remove some redundant SRT-related stuff	Simon Marlow	2012-07-11	1	-5/+3
\|
*	Merge remote-tracking branch 'origin/master' into newcg	Simon Marlow	2012-07-04	1	-22/+25
\|\	* origin/master: (756 commits)
\| \|	  don't crash if argv[0] == NULL (#7037)
\| \|	  -package P was loading all versions of P in GHCi (#7030)
\| \|	  Add a Note, copying text from #2437
\| \|	  improve the --help docs a bit (#7008)
\| \|	  Copy Data.HashTable's hashString into our Util module
\| \|	  Build fix
\| \|	  Build fixes
\| \|	  Parse error: suggest brackets and indentation.
\| \|	  Don't build the ghc DLL on Windows; works around trac #5987
\| \|	  On Windows, detect if DLLs have too many symbols; trac #5987
\| \|	  Add some more Integer rules; fixes #6111
\| \|	  Fix PA dfun construction with silent superclass args
\| \|	  Add silent superclass parameters to the vectoriser
\| \|	  Add silent superclass parameters (again)
\| \|	  Mention Generic1 in the user's guide
\| \|	  Make the GHC API a little more powerful.
\| \|	  tweak llvm version warning message
\| \|	  New version of the patch for #5461.
\| \|	  Fix Word64ToInteger conversion rule.
\| \|	  Implemented feature request on reconfigurable pretty-printing in GHCi (#5461)
\| \|	  ...
\| \|
\| \|	Conflicts:
\| \|	    compiler/basicTypes/UniqSupply.lhs
\| \|	    compiler/cmm/CmmBuildInfoTables.hs
\| \|	    compiler/cmm/CmmLint.hs
\| \|	    compiler/cmm/CmmOpt.hs
\| \|	    compiler/cmm/CmmPipeline.hs
\| \|	    compiler/cmm/CmmStackLayout.hs
\| \|	    compiler/cmm/MkGraph.hs
\| \|	    compiler/cmm/OldPprCmm.hs
\| \|	    compiler/codeGen/CodeGen.lhs
\| \|	    compiler/codeGen/StgCmm.hs
\| \|	    compiler/codeGen/StgCmmBind.hs
\| \|	    compiler/codeGen/StgCmmLayout.hs
\| \|	    compiler/codeGen/StgCmmUtils.hs
\| \|	    compiler/main/CodeOutput.lhs
\| \|	    compiler/main/HscMain.hs
\| \|	    compiler/nativeGen/AsmCodeGen.lhs
\| \|	    compiler/simplStg/SimplStg.lhs
\| *	Remove some unnecessary platform arguments	Ian Lynagh	2012-06-13	1	-7/+3
\| \|
\| *	Pass DynFlags down to showSDocDump	Ian Lynagh	2012-06-12	1	-6/+10
\| \|	To help with this, we now also pass DynFlags around inside the SpecM monad.
\| *	Fix for earger blackholing of thunks with no free variables (#6146)	Simon Marlow	2012-06-07	1	-7/+10
\| \|	A thunk with no free variables was not getting blackholed when -feager-blackholing was on, but we were nevertheless pushing the stg_bh_upd_frame version of the update frame that expects to see a black hole.
\| \|
\| \|	I fixed this twice for good measure:
\| \|
\| \|	- we now call blackHoleOnEntry when pushing the update frame to check whether the closure was actually blackholed, and so that we use the same predicate in both places
\| \|
\| \|	- we now black hole thunks even if they have no free variables. These only occur when optimisation is off, but presumably if you say -feager-blackholing then that's what you want to happen.