delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Allow allocNursery() to allocate single blocks (#7257)	Simon Marlow	2012-09-21	1	-1/+7
	Forcing large allocations here can create serious fragmentation in some cases, and since the large allocations are only a small optimisation we should allow the nursery to hoover up small blocks before allocating large chunks.
*	Lots of nat -> StgWord changes	Simon Marlow	2012-09-07	1	-14/+14
*	Deprecate lnat, and use StgWord instead	Simon Marlow	2012-09-07	1	-3/+3
	lnat was originally "long unsigned int" but we were using it when we wanted a 64-bit type on a 64-bit machine. This broke on Windows x64, where long == int == 32 bits.
	Using types of unspecified size is bad, but what we really wanted was a type with N bits on an N-bit machine. StgWord is exactly that.
	lnat was mentioned in some APIs that clients might be using (e.g. StackOverflowHook()), so we leave it defined but with a comment to say that it's deprecated.
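The width mismatch described in this message can be checked with a small C sketch. The name word_t is a hypothetical stand-in for StgWord, approximated here with uintptr_t; GHC's real definition lives in its own headers and may differ.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for StgWord: an unsigned integer as wide as a
 * pointer. uintptr_t is only a portable approximation of that idea. */
typedef uintptr_t word_t;

/* Width of the word type in bits. */
unsigned word_bits(void)
{
    return (unsigned)(sizeof(word_t) * CHAR_BIT);
}

/* Width of "unsigned long" in bits. This is 64 on LP64 platforms, but
 * 32 on LLP64 platforms such as Windows x64. */
unsigned ulong_bits(void)
{
    return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Returns 1 when unsigned long happens to be as wide as a word. */
int ulong_is_word_sized(void)
{
    return ulong_bits() == word_bits();
}
```

On a platform where ulong_is_word_sized() returns 0, any code that stores a word-sized quantity in an unsigned long silently truncates it, which is the failure mode the commit describes.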
*	Some further tweaks to reduce fragmentation when allocating the nursery	Simon Marlow	2012-09-07	1	-16/+32
*	Reduce fragmentation when using +RTS -H (with or without a size)	Simon Marlow	2012-08-21	1	-0/+35
*	improve debug output	Simon Marlow	2012-08-21	1	-1/+1
*	update debugging code for fragmentation	Simon Marlow	2011-01-25	1	-2/+3
*	Implement stack chunks and separate TSO/STACK objects	Simon Marlow	2010-12-15	1	-42/+0
	This patch makes two changes to the way stacks are managed:
	1. The stack is now stored in a separate object from the TSO. This means that it is easier to replace the stack object for a thread when the stack overflows or underflows; we don't have to leave behind the old TSO as an indirection any more. Consequently, we can remove ThreadRelocated and deRefTSO(), which were a pain. This is obviously the right thing, but the last time I tried to do it it made performance worse. This time I seem to have cracked it.
	2. Stacks are now represented as a chain of chunks, rather than a single monolithic object. The big advantage here is that individual chunks are marked clean or dirty according to whether they contain pointers to the young generation, and the GC can avoid traversing clean stack chunks during a young-generation collection. This means that programs with deep stacks will see a big saving in GC overhead when using the default GC settings. A secondary advantage is that there is much less copying involved as the stack grows. Programs that quickly grow a deep stack will see big improvements.
	In some ways the implementation is simpler, as nothing special needs to be done to reclaim stack as the stack shrinks (the GC just recovers the dead stack chunks). On the other hand, we have to manage stack underflow between chunks, so there's a new stack frame (UNDERFLOW_FRAME), and we now have separate TSO and STACK objects. The total amount of code is probably about the same as before.
	There are new RTS flags:
	  -ki<size>  Sets the initial thread stack size (default 1k), e.g. -ki4k, -ki2m
	  -kc<size>  Sets the stack chunk size (default 32k)
	  -kb<size>  Sets the stack chunk buffer size (default 1k)
	-ki was previously called just -k, and the old name is still accepted for backwards compatibility. These new options are documented.
*	fix another sanity error, and refactor/tidy up	Simon Marlow	2010-12-09	1	-9/+8
*	sanity: fix places where we weren't filling fresh memory with 0xaa	Simon Marlow	2010-10-29	1	-0/+2
*	On Windows, when returning memory to the OS, we try to release it	Ian Lynagh	2010-11-01	1	-0/+2
	as well as decommitting it.
*	Return memory to the OS; trac #698	Ian Lynagh	2010-08-13	1	-0/+34
*	Cast some more nats to StgWord to be on the safe side	Simon Marlow	2010-06-24	1	-3/+13
	And add a comment about the dangers of int overflow
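The kind of overflow these casts guard against can be illustrated with a minimal C sketch. It uses uint32_t and uint64_t as stand-ins for nat and StgWord, and BLOCK_SIZE is an assumed value chosen for illustration; neither is taken from the RTS source.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u  /* assumed block size in bytes, illustration only */

/* Risky: the product is stored in a 32-bit value, so it wraps around
 * before being widened to the 64-bit return type. */
uint64_t bytes_narrow(uint32_t n_blocks)
{
    uint32_t bytes = n_blocks * BLOCK_SIZE;
    return bytes;
}

/* Safe: one operand is widened first, so the multiplication itself is
 * carried out in 64 bits. */
uint64_t bytes_wide(uint32_t n_blocks)
{
    return (uint64_t)n_blocks * BLOCK_SIZE;
}
```

With these assumed sizes, a heap of 2^20 blocks is 4 GB: bytes_wide() reports that correctly, while bytes_narrow() wraps to zero.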
*	comments only	Simon Marlow	2010-06-24	1	-3/+2
*	Fix an arithmetic overflow bug causing crashes with multi-GB heaps	Simon Marlow	2010-06-24	1	-1/+1
*	rts/sm/BlockAlloc.c: Small comment correction.	Marco Túlio Gontijo e Silva	2010-05-26	1	-1/+1
*	GC refactoring, remove "steps"	Simon Marlow	2009-12-03	1	-2/+2
	The GC had a two-level structure, G generations each of T steps. Steps are for aging within a generation, mostly to avoid premature promotion.
	Measurements show that more than 2 steps is almost never worthwhile, and 1 step is usually worse than 2. In theory fractional steps are possible, so the ideal number of steps is somewhere between 1 and 3. GHC's default has always been 2.
	We can implement 2 steps quite straightforwardly by having each block point to the generation to which objects in that block should be promoted, so blocks in the nursery point to generation 0, and blocks in gen 0 point to gen 1, and so on.
	This commit removes the explicit step structures, merging generations with steps, thus simplifying a lot of code. Performance is unaffected. The tunable number of steps is now gone, although it may be replaced in the future by a way to tune the aging in generation 0.
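The promotion rule in this message (nursery blocks point to generation 0, gen 0 blocks point to gen 1, and so on) can be sketched as a small C function. N_GENERATIONS, NURSERY and dest_generation() are hypothetical names, and the choice that the oldest generation promotes to itself is an assumption made here, not something the message states.

```c
#include <assert.h>

#define N_GENERATIONS 3   /* assumed number of generations */
#define NURSERY (-1)      /* hypothetical marker for nursery blocks */

/* Generation number that live objects in a block are promoted to.
 * Nursery blocks promote to generation 0, generation g promotes to
 * g + 1, and (by assumption) the oldest generation promotes to itself. */
int dest_generation(int block_gen)
{
    assert(block_gen >= NURSERY && block_gen < N_GENERATIONS);
    if (block_gen == NURSERY)
        return 0;
    if (block_gen == N_GENERATIONS - 1)
        return block_gen;
    return block_gen + 1;
}
```

Storing the result of such a rule in each block descriptor, rather than recomputing it, is what lets the collector find the destination with a single lookup.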
*	Refactoring only	Simon Marlow	2009-12-02	1	-0/+34
*	Store a destination step in the block descriptor	Simon Marlow	2009-11-29	1	-0/+1
	At the moment, this just saves a memory reference in the GC inner loop (worth a percent or two of GC time). Later, it will hopefully let me experiment with partial steps, and with simplifying the generation/step infrastructure.
*	RTS tidyup sweep, first phase	Simon Marlow	2009-08-02	1	-21/+23
	The first phase of this tidyup is focussed on the header files, and in particular making sure we are exposing publicly exactly what we need to, and no more.
	- Rts.h now includes everything that the RTS exposes publicly, rather than a random subset of it.
	- Most of the public header files have moved into subdirectories, and many of them have been renamed. But clients should not need to include any of the other headers directly, just #include the main public headers: Rts.h, HsFFI.h, RtsAPI.h.
	- All the headers needed for via-C compilation have moved into the stg subdirectory, which is self-contained. Most of the headers for the rest of the RTS APIs have moved into the rts subdirectory.
	- I left MachDeps.h where it is, because it is so widely used in Haskell code.
	- I left a deprecated stub for RtsFlags.h in place. The flag structures are now exposed by Rts.h.
	- Various internal APIs are no longer exposed by public header files.
	- Various bits of dead code and declarations have been removed.
	- More gcc warnings are turned on, and the RTS code is more warning-clean.
	- More source files #include "PosixSource.h", and hence only use standard POSIX (1003.1c-1995) interfaces.
	There is a lot more tidying up still to do, this is just the first pass. I also intend to standardise the names for external RTS APIs (e.g. use the rts_ prefix consistently), and declare the internal APIs as hidden for shared libraries.
*	Fix some bugs in the stack-reducing code (#2571)	Simon Marlow	2008-09-12	1	-4/+12
*	when a memory leak is detected, report which blocks are unreachable	Simon Marlow	2008-09-09	1	-0/+32
*	fix a tiny bug spotted by gcc 4.3	Simon Marlow	2008-06-19	1	-1/+1
*	fix allocated blocks calculation, and add more sanity checks	Simon Marlow	2008-06-08	1	-10/+24
*	refactoring	Simon Marlow	2008-04-16	1	-12/+11
*	add debugging code to check for fragmentation	Simon Marlow	2008-04-16	1	-0/+8
*	update copyrights in rts/sm	Simon Marlow	2008-04-16	1	-1/+1
*	faster block allocator, by dividing the free list into buckets	Simon Marlow	2008-04-16	1	-165/+165
*	Release some of the memory allocated to a stack when it shrinks (#2090)	simonmar@microsoft.com	2008-02-28	1	-0/+34
	When a stack is occupying less than 1/4 of the memory it owns, and is larger than a megablock, we release half of it. Shrinking is O(1); it doesn't need to copy the stack.
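The shrinking rule stated in this message can be written down as a small C predicate. This is only a sketch of the rule as described: MEGABLOCK_BYTES is an assumed size, and stack_bytes_to_release() is a hypothetical name rather than a function from the RTS.

```c
#include <assert.h>
#include <stddef.h>

#define MEGABLOCK_BYTES ((size_t)1 << 20)  /* assumed megablock size */

/* Number of bytes to hand back for a stack that uses `used` bytes of
 * the `owned` bytes it holds: half of the owned memory when the stack
 * is larger than a megablock and less than a quarter full, else 0. */
size_t stack_bytes_to_release(size_t used, size_t owned)
{
    assert(used <= owned);
    if (owned > MEGABLOCK_BYTES && used < owned / 4)
        return owned / 2;
    return 0;
}
```

Because the decision depends only on two sizes, it costs constant time; the sketch does not model how the memory is actually returned.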
*	Initial parallel GC support	Simon Marlow	2007-10-31	1	-2/+4
	E.g. use +RTS -g2 -RTS for 2 threads. Only major GCs are parallelised; minor GCs are still sequential. Don't use more threads than you have CPUs.
	It works most of the time, although you won't see much speedup yet. Tuning and more work on stability are still required.
*	Rework the block allocator	Simon Marlow	2006-12-14	1	-203/+480
	The main goal here is to reduce fragmentation, which turns out to be the cause of #743. While I was here I found some opportunities to improve performance too. The code is rather more complex, but it also contains a long comment describing the strategy, so please take a look at that for the details.
*	small fix to DEBUG case in coalesce/freeGroup patch	Simon Marlow	2006-11-21	1	-1/+3
*	optimisation to freeGroup() to avoid an O(N^2) pathological case	Simon Marlow	2006-11-21	1	-11/+14
	In the free list, we don't strictly speaking need to have every block in a coalesced group point to the head block, although this is an invariant for non-free blocks. Dropping this invariant for the free list means that coalesce() is O(1) rather than O(N), and freeGroup() is therefore O(N) not O(N^2).
	The bad case probably didn't happen most of the time; indeed, it has never shown up in a profile that I've seen. I had a report from a while back that this was a problem with really large heaps, though. Fortunately the fix is easy.
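The idea of constant-time coalescing can be illustrated with a toy model in C. This is a sketch under stated assumptions, not the code in BlockAlloc.c: the descriptor layout, the fixed-size heap array, and the choice that only the tail of a free group records its head are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define N_BLOCKS 16   /* size of the toy heap, in blocks */

/* Toy block descriptor. For a free group only the first (head) and
 * last (tail) descriptors carry valid data; interior ones are ignored. */
typedef struct {
    int    is_free;  /* set on the head and tail of a free group */
    size_t blocks;   /* group length, valid on the head */
    size_t head;     /* index of the head, valid on the tail */
} desc;

desc heap[N_BLOCKS];  /* zero-initialised: every block starts allocated */

/* Free the group [start, start + n) and merge it with free neighbours.
 * Returns the index of the head of the resulting free group. Only
 * boundary descriptors are touched, so the cost does not depend on n. */
size_t free_group(size_t start, size_t n)
{
    assert(n > 0 && start + n <= N_BLOCKS);
    size_t end = start + n;  /* one past the last block of the group */

    /* The block right after us is the head of the next group. */
    if (end < N_BLOCKS && heap[end].is_free) {
        size_t next_len = heap[end].blocks;
        heap[end].is_free = 0;        /* old head becomes interior */
        end += next_len;
    }

    /* The block right before us is the tail of the previous group,
     * and the tail records where that group's head is. */
    if (start > 0 && heap[start - 1].is_free) {
        size_t prev_head = heap[start - 1].head;
        heap[start - 1].is_free = 0;  /* old tail becomes interior */
        start = prev_head;
    }

    /* Update only the two boundary descriptors of the merged group. */
    heap[start].is_free = 1;
    heap[start].blocks = end - start;
    heap[end - 1].is_free = 1;
    heap[end - 1].head = start;
    return start;
}
```

Freeing blocks 4-5, then 8-9, then 6-7 leaves a single free group of six blocks headed at index 4, and no step walks the interior of a group. Had every block been required to point at the head, each merge would have needed to rewrite all of them.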
*	Split GC.c, and move storage manager into sm/ directory	Simon Marlow	2006-10-24	1	-0/+391
	In preparation for parallel GC, split up the monolithic GC.c file into smaller parts. Also in this patch (and difficult to separate, unfortunately):
	- Don't include Stable.h in Rts.h, instead just include it where necessary.
	- Consistently use STATIC_INLINE in source files, and INLINE_HEADER in header files. STATIC_INLINE is now turned off when DEBUG is on, to make debugging easier.
	- The GC no longer takes the get_roots function as an argument. We weren't making use of this generalisation.