delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message	Author	Age	Files	Lines
*	Merge the smp and threaded RTS ways	Simon Marlow	2006-02-09	1	-3/+2
	Now, the threaded RTS also includes SMP support. The -smp flag is a synonym for -threaded. The performance implications of this are small to negligible, and it results in a code cleanup and reduces the number of combinations we have to test.
*	[project @ 2005-10-21 14:02:17 by simonmar]	simonmar	2005-10-21	1	-2/+44
	Big re-hash of the threaded/SMP runtime
	This is a significant reworking of the threaded and SMP parts of the runtime. There are two overall goals here:
	- To push down the scheduler lock, reducing contention and allowing more parts of the system to run without locks. In particular, the scheduler does not require a lock any more in the common case.
	- To improve affinity, so that running Haskell threads stick to the same OS threads as much as possible.
	At this point we have the basic structure working, but there are some pieces missing. I believe it's reasonably stable - the important parts of the testsuite pass in all the (normal,threaded,SMP) ways.
	In more detail:
	- Each capability now has a run queue, instead of one global run queue. The Capability and Task APIs have been completely rewritten; see Capability.h and Task.h for the details.
	- Each capability has its own pool of worker Tasks. Hence, Haskell threads on a Capability's run queue will run on the same worker Task(s). As long as the OS is doing something reasonable, this should mean they usually stick to the same CPU. Another way to look at this is that we're assuming each Capability is associated with a fixed CPU.
	- What used to be StgMainThread is now part of the Task structure. Every OS thread in the runtime has an associated Task, and it can ask for its current Task at any time with myTask().
	- removed RTS_SUPPORTS_THREADS symbol, use THREADED_RTS instead (it is now defined for SMP too).
	- The RtsAPI has had to change; we must explicitly pass a Capability around now. The previous interface assumed some global state. SchedAPI has also changed a lot.
	- The OSThreads API now supports thread-local storage, used to implement myTask(), although it could be done more efficiently using gcc's __thread extension when available.
	- I've moved some POSIX-specific stuff into the posix subdirectory, moving in the direction of separating out platform-specific implementations.
	- lots of lock-debugging and assertions in the runtime. In particular, when DEBUG is on, we catch multiple ACQUIRE_LOCK()s, and there is also an ASSERT_LOCK_HELD() call.
	What's missing so far:
	- I have almost certainly broken the Win32 build, will fix soon.
	- any kind of thread migration or load balancing. This is high up the agenda, though.
	- various performance tweaks to do
	- throwTo and forkProcess still do not work in SMP mode
*	[project @ 2005-07-27 15:46:19 by simonmar]	simonmar	2005-07-27	1	-100/+57
	back out revision 1.22; it led to very bad memory fragmentation. A rethink is in order.
*	[project @ 2005-06-13 12:29:48 by simonmar]	simonmar	2005-06-13	1	-57/+100
	Block allocator performance fix: instead of keeping the free list ordered, keep it doubly-linked, and introduce a new flag BF_FREE so we can tell when a block is free. We can still coalesce blocks on the free list because block descriptors are kept consecutively in memory, so we can tell based on the BF_FREE flag whether to coalesce with the next higher/lower blocks when freeing a block. This (almost) makes freeChain O(n) rather than O(n^2), and has been reported to help a lot when dealing with very large heaps.
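	The coalescing scheme described in this message can be sketched roughly as follows. This is only an illustration, not the RTS code: the `pool` array, the `head` back-index, the helper names `fl_unlink`/`fl_push`/`free_group`, and the numeric value of `BF_FREE` are all assumptions made for the example. The one idea taken from the message is that descriptors sit consecutively in memory, so a freed group can test the flag on its immediate neighbours instead of walking an ordered list.

```c
#include <assert.h>
#include <stddef.h>

#define NBLOCKS 16   /* hypothetical pool size for this sketch */
#define BF_FREE 1u   /* flag name from the commit message; the value is assumed */

typedef struct bdescr_ {
    unsigned flags;               /* BF_FREE on the head and tail of a free group */
    size_t blocks;                /* group length in the head descriptor, 0 elsewhere */
    size_t head;                  /* in the tail of a free group: index of its head (assumed field) */
    struct bdescr_ *prev, *next;  /* unordered, doubly-linked free list */
} bdescr;

static bdescr pool[NBLOCKS];      /* descriptors are consecutive in memory */
static bdescr *free_list = NULL;

/* Remove a group head from the free list in O(1). */
static void fl_unlink(bdescr *bd) {
    if (bd->prev) bd->prev->next = bd->next; else free_list = bd->next;
    if (bd->next) bd->next->prev = bd->prev;
    bd->prev = bd->next = NULL;
}

/* Push a group head on the front of the free list in O(1). */
static void fl_push(bdescr *bd) {
    bd->prev = NULL;
    bd->next = free_list;
    if (free_list) free_list->prev = bd;
    free_list = bd;
}

/* Free the n-block group whose head is pool[i], merging with free neighbours. */
static void free_group(size_t i, size_t n) {
    bdescr *bd = &pool[i];
    bd->blocks = n;

    /* Next higher group: its head descriptor follows our last block. */
    if (i + n < NBLOCKS && (pool[i + n].flags & BF_FREE)) {
        bdescr *hi = &pool[i + n];
        fl_unlink(hi);
        bd->blocks += hi->blocks;
        hi->flags = 0;            /* no longer a group boundary */
        hi->blocks = 0;           /* non-head blocks carry blocks == 0 */
    }

    /* Next lower group: its tail descriptor precedes our head. */
    if (i > 0 && (pool[i - 1].flags & BF_FREE)) {
        bdescr *lo = &pool[pool[i - 1].head];
        fl_unlink(lo);
        pool[i - 1].flags = 0;    /* old tail becomes an interior block */
        lo->blocks += bd->blocks;
        bd->blocks = 0;
        bd = lo;
    }

    /* Mark the boundaries of the merged group and put it on the free list. */
    size_t h = (size_t)(bd - pool);
    size_t t = h + bd->blocks - 1;
    pool[h].flags = BF_FREE;
    pool[t].flags = BF_FREE;
    pool[t].head = h;
    fl_push(bd);
}
```

	In this sketch each free does a constant amount of work regardless of the free list's length, which is where the improvement over an ordered list would come from. Freeing groups at indices 0, 4 and then 2 (two blocks each) leaves a single six-block group headed by `pool[0]`.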
*	[project @ 2005-02-10 13:01:52 by simonmar]	simonmar	2005-02-10	1	-5/+4
	GC changes: instead of threading old-generation mutable lists through objects in the heap, keep them in a separate flat array. This has some advantages:
	- the IND_OLDGEN object is now only 2 words, so the minimum size of a THUNK is now 2 words instead of 3. This saves some amount of allocation (about 2% on average according to my measurements), and is more friendly to the cache by squashing objects together more.
	- keeping the mutable list separate from the IND object will be necessary for our multiprocessor implementation.
	- removing the mut_link field makes the layout of some objects more uniform, leading to less complexity and special cases.
	- I also unified the two mutable lists (mut_once_list and mut_list) into a single mutable list, which led to more simplifications in the GC.
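	As a rough illustration of the idea of a mutable list kept outside the heap objects, here is a minimal sketch. The `Closure` and `MutList` types and the functions `mutlist_record`, `mutlist_scan` and `bump` are hypothetical names invented for this example, and the `realloc`-grown array is an assumption; the message only says the list lives in a separate flat array.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heap object; note there is no link field in it. */
typedef struct { int payload; } Closure;

/* The mutable list is a flat, growable array of pointers to objects. */
typedef struct {
    Closure **elems;
    size_t len, cap;
} MutList;

/* Record an old-generation object that has been mutated. Returns 0 on success. */
static int mutlist_record(MutList *ml, Closure *c) {
    if (ml->len == ml->cap) {
        size_t ncap = ml->cap ? ml->cap * 2 : 8;
        Closure **p = realloc(ml->elems, ncap * sizeof *p);
        if (!p) return -1;        /* out of memory: list is left unchanged */
        ml->elems = p;
        ml->cap = ncap;
    }
    ml->elems[ml->len++] = c;
    return 0;
}

/* At GC time, visit every recorded object once, then empty the list. */
static size_t mutlist_scan(MutList *ml, void (*visit)(Closure *)) {
    size_t n = ml->len;
    for (size_t i = 0; i < n; i++) visit(ml->elems[i]);
    ml->len = 0;
    return n;
}

/* Example visitor, standing in for "scavenge this object". */
static void bump(Closure *c) { c->payload++; }
```

	Because the objects themselves hold no link field, their layout does not depend on whether they are on the list, which is the uniformity the message refers to.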
*	[project @ 2004-09-12 11:27:10 by panne]	panne	2004-09-12	1	-1/+0
	Removed the annoying "Id" CVS keywords, they're a real PITA when it comes to merging...
*	[project @ 2004-09-06 11:00:21 by simonmar]	simonmar	2004-09-06	1	-7/+8
	eliminate some more gcc 3.4 warnings
*	[project @ 2004-09-03 15:28:18 by simonmar]	simonmar	2004-09-03	1	-3/+3
	Cleanup: all (well, most) messages from the RTS now go through the functions in RtsUtils: barf(), debugBelch() and errorBelch(). The latter two were previously called belch() and prog_belch() respectively. See the comments for the right usage of these message functions. One reason for doing this is so that we can avoid spurious uses of stdout/stderr by Haskell apps on platforms where we shouldn't be using them (eg. non-console apps on Windows).
*	[project @ 2003-11-12 17:49:05 by sof]	sof	2003-11-12	1	-3/+3
	Tweaks to have RTS (C) sources compile with MSVC. Apart from wibbles related to the handling of 'inline', changed Schedule.h:POP_RUN_QUEUE() not to use expression-level statement blocks.
*	[project @ 2003-02-18 05:47:53 by sof]	sof	2003-02-18	1	-2/+2
	make use of MBLOCK_ROUND_DOWN()
*	[project @ 2003-01-28 17:04:58 by simonmar]	simonmar	2003-01-28	1	-3/+3
	Make it multi-init-safe
*	[project @ 2002-07-17 09:21:48 by simonmar]	simonmar	2002-07-17	1	-1/+3
	Remove most #includes of system headers from Stg.h, and instead #include any required headers directly in each RTS source file. The idea is to (a) reduce namespace pollution from system headers that we don't need, (b) be clearer about dependencies on system things in the RTS, and (c) improve via-C compilation times (maybe). In practice though, HsBase.h #includes everything anyway, so the difference from the point of view of .hc source is minimal. However, this makes it easier to move to zero-includes if we wanted to (see discussion on the FFI list; I'm still not sure that's possible but at least this is a step in the right direction).
*	[project @ 2001-11-08 14:42:11 by simonmar]	simonmar	2001-11-08	1	-3/+17
	Fix a bug in the previous commit, and add some more sanity checking.
*	[project @ 2001-11-08 12:41:07 by simonmar]	simonmar	2001-11-08	1	-1/+2
	(addendum to the previous commit) also set bd->blocks to zero in coalesce().
*	[project @ 2001-11-08 10:18:49 by simonmar]	simonmar	2001-11-08	1	-1/+2
	For each non-head block in a block group, set its 'blocks' field to zero (as per comments elsewhere).
*	[project @ 2001-08-14 13:40:07 by sewardj]	sewardj	2001-08-14	1	-1/+2
	Change the story about POSIX headers in C compilation.
	Until now, all C code in the RTS and library cbits has by default been compiled with settings for POSIXness enabled, that is:
	    #define _POSIX_SOURCE 1
	    #define _POSIX_C_SOURCE 199309L
	    #define _ISOC9X_SOURCE
	If you wanted to negate this, you'd have to define NON_POSIX_SOURCE before including headers. This scheme has some bad effects:
	* It means that ccall-unfoldings exported via interfaces from a module compiled with -DNON_POSIX_SOURCE may not compile when imported into a module which does not -DNON_POSIX_SOURCE.
	* It overlaps with the feature tests we do with autoconf.
	* It seems to have caused borkage in the Solaris builds for some considerable period of time.
	The New Way is:
	* The default changes to not-being-in-Posix mode.
	* If you want to force a C file into Posix mode, #include as the first include the new file ghc/includes/PosixSource.h. Most of the RTS C sources have this include now.
	* NON_POSIX_SOURCE is almost totally expunged. Unfortunately we have to retain some vestiges of it in ghc/compiler so that modules compiled via C on Solaris using older compilers don't break.
*	[project @ 2001-07-23 17:23:19 by simonmar]	simonmar	2001-07-23	1	-1/+3
	Add a compacting garbage collector. It isn't enabled by default, as there are still a couple of problems: there's a fallback case I haven't implemented yet which means it will occasionally bomb out, and speed-wise it's quite a bit slower than the copying collector (about 1.8x slower). Until I can make it go faster, it'll only be useful when you're actually running low on real memory. '+RTS -c' to enable it. Oh, and I cleaned up a few things in the RTS while I was there, and fixed one or two possibly real bugs in the existing GC.
*	[project @ 2001-07-23 10:47:16 by simonmar]	simonmar	2001-07-23	1	-2/+2
	Small changes to improve GC performance slightly:
	- store the generation number in the block descriptor rather than a pointer to the generation structure, since the most common operation is to pull out the generation number, and it's one less indirection this way.
	- cache the generation number in the step structure too, which avoids an extra indirection in several places.
*	[project @ 2000-01-30 10:17:44 by simonmar]	simonmar	2000-01-30	1	-2/+5
	The bd->free field of a block descriptor is supposed to be set to -1 for free blocks, if we're #ifdef DEBUGging. Sometimes it wasn't.
*	[project @ 1999-07-01 13:48:22 by panne]	panne	1999-07-01	1	-2/+4
	The allocator for mega groups now checks if consecutive megablocks on the free list are contiguous in memory. The omission of this check caused all kinds of funny runtime errors and took away at least five happy years of my life... :-{
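	The kind of contiguity check this message describes might look like the sketch below. It is not the RTS code: the function `find_contiguous`, the sorted array of base addresses, and the 1 MB `MBLOCK_SIZE` are assumptions chosen for the example. The point it shows is that being adjacent on the free list is not enough; a run only counts if each megablock starts exactly one megablock after the previous one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MBLOCK_SIZE ((uintptr_t)1 << 20)   /* assumed megablock size for the sketch */

/*
 * free_mb[] holds the base addresses of free megablocks in ascending order.
 * Return the index of the first run of n megablocks that are contiguous in
 * memory, or -1 if no such run exists.
 */
static long find_contiguous(const uintptr_t *free_mb, size_t count, size_t n) {
    if (n == 0 || count < n) return -1;
    size_t start = 0;                       /* first index of the current run */
    for (size_t i = 1; i <= count; i++) {
        if (i - start >= n) return (long)start;
        /* A gap in addresses breaks the run, even though the entries are
         * neighbours on the list. */
        if (i < count && free_mb[i] != free_mb[i - 1] + MBLOCK_SIZE) start = i;
    }
    return -1;
}
```

	With free megablocks at 1, 2, 4, 5 and 6 MB, a request for three megablocks must skip the first two entries and return the run starting at 4 MB; a request for four fails even though five megablocks are free.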
*	[project @ 1999-03-26 14:54:43 by simonm]	simonm	1999-03-26	1	-3/+5
	Fix bug in allocGroup() when allocating an entire megablock in one go.
*	[project @ 1999-02-05 16:02:18 by simonm]	simonm	1999-02-05	1	-1/+3
	Copyright police.
*	[project @ 1999-01-13 17:25:37 by simonm]	simonm	1999-01-13	1	-4/+21
	Added a generational garbage collector. The collector is reliable but fairly untuned as yet. It works with an arbitrary number of generations: use +RTS -G<gens> to change the number of generations used (default 2).
	Stats: +RTS -Sstderr is quite useful, but to really see what's going on compile the RTS with -DDEBUG and use +RTS -D32.
	ARR_PTRS removed - it wasn't used anywhere.
	Sanity checking improved:
	- free blocks are now spammed when sanity checking is turned on
	- a check for leaking blocks is performed after each GC.
*	[project @ 1998-12-02 13:17:09 by simonm]	simonm	1998-12-02	1	-0/+304
	Move 4.01 onto the main trunk.