path: root/includes
Commit message | Author | Age | Files | Lines
* Run sparks in batches, instead of creating a new thread for each oneSimon Marlow2008-11-061-0/+1
| | | | | Significantly reduces the overhead for par, which means that we can make use of parallelism at a much finer granularity.
* Refactoring and reorganisation of the schedulerSimon Marlow2008-10-221-39/+7
| Change the way we look for work in the scheduler. Previously, checking to see whether there was anything to do was a non-side-effecting operation, but this has changed now that we do work-stealing. This led to a refactoring of the inner loop of the scheduler.
|
| Also, lots of cleanup in the new work-stealing code, but no functional changes.
|
| One new statistic is added to the +RTS -s output:
|
|   SPARKS: 1430 (2 converted, 1427 pruned)
|
| It tells you something about the use of `par` in the program.
* Work stealing for sparksberthold@mathematik.uni-marburg.de2008-09-152-95/+35
| Spark stealing support for PARALLEL_HASKELL and THREADED_RTS versions of the RTS.
|
| Spark pools are per capability, separately allocated and held in the Capability structure. The implementation uses Double-Ended Queues (deque) and cas-protected access.
|
| The write end of the queue (position bottom) can only be used with mutual exclusion, i.e. by exactly one caller at a time. Multiple readers can steal()/findSpark() from the read end (position top), and are synchronised without a lock, based on a cas of the top position. One reader wins, the others return NULL for a failure.
|
| Work stealing is called when Capabilities find no other work (inside yieldCapability), and tries all capabilities 0..n-1 twice, unless a theft succeeds.
|
| Inside schedulePushWork, all considered capabilities (those which were idle and could be grabbed) are woken up. Future versions should wake up capabilities immediately when putting a new spark in the local pool, from newSpark().
|
| Patch has been re-recorded due to conflicting bugfixes in sparks.c, also fixing a (strange) conflict in the scheduler.
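A minimal sketch of the deque discipline described in this message, not the code from the patch: one owner writes at `bottom`, and any number of thieves race on `top` with a compare-and-swap, where the loser returns NULL. The names `SparkDeque`, `deque_push`, `deque_steal` and the fixed `POOL_SIZE` are assumptions made for illustration, and the owner's own pop operation is left out.

```c
#include <stdatomic.h>
#include <stddef.h>

#define POOL_SIZE 64  /* hypothetical fixed capacity, chosen for the sketch */

typedef struct {
    atomic_size_t top;                 /* read end: thieves advance this with a CAS */
    atomic_size_t bottom;              /* write end: only the owner stores here */
    _Atomic(void *) elems[POOL_SIZE];  /* circular buffer of spark pointers */
} SparkDeque;

void deque_init(SparkDeque *q) {
    atomic_init(&q->top, 0);
    atomic_init(&q->bottom, 0);
}

/* Owner only. Returns 1 on success, 0 if the pool is full. */
int deque_push(SparkDeque *q, void *spark) {
    size_t b = atomic_load(&q->bottom);
    size_t t = atomic_load(&q->top);
    if (b - t >= POOL_SIZE) return 0;
    atomic_store(&q->elems[b % POOL_SIZE], spark);
    atomic_store(&q->bottom, b + 1);   /* publish only after the slot is written */
    return 1;
}

/* Any thread. Returns NULL if the pool is empty or another thief won the race. */
void *deque_steal(SparkDeque *q) {
    size_t t = atomic_load(&q->top);
    size_t b = atomic_load(&q->bottom);
    if (t >= b) return NULL;
    void *spark = atomic_load(&q->elems[t % POOL_SIZE]);
    size_t next = t + 1;
    /* Exactly one thief moves top from t to t+1; the others fail here. */
    if (!atomic_compare_exchange_strong(&q->top, &t, next)) return NULL;
    return spark;
}
```

Elements come out of the read end in the order they were pushed, so thieves take the oldest sparks first.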
* add readTVarIO :: TVar a -> IO aSimon Marlow2008-10-102-0/+3
|
* Remove #define _BSD_SOURCE from Stg.hIan Lynagh2008-10-061-3/+0
| | | | It's no longer needed, as base no longer #includes it
* On Linux use libffi for allocating executable memory (fixed #738)Simon Marlow2008-09-192-2/+2
|
* Move the context_switch flag into the CapabilitySimon Marlow2008-09-193-2/+2
| | | | | Fixes a long-standing bug that could in some cases cause sub-optimal scheduling behaviour.
* Fix MacOS X build: don't believe __GNUC_GNU_INLINE__ on MacOS XSimon Marlow2008-09-181-1/+5
|
* FIX #2469: sort out our static/extern inline storySimon Marlow2008-09-162-15/+22
| | | | | | gcc has changed the meaning of "extern inline" when certain flags are on (e.g. --std=gnu99), and this broke our use of it in the header files.
* when a memory leak is detected, report which blocks are unreachableSimon Marlow2008-09-091-1/+2
|
* More sanity checking for the TSO write barrierSimon Marlow2008-09-091-0/+2
| | | | Check that all threads marked as dirty are really on the mutable list.
* Make LOOKS_LIKE_{INFO,CLOSURE}_PTR into inline functions, instead of macrosSimon Marlow2008-09-081-9/+18
| | | | | | The macros were duplicating their arguments, which was normally harmless, but in the parallel GC was actually wrong and caused spurious assertion failures.
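To illustrate the argument-duplication problem, here is a small sketch. The macro and function below are hypothetical stand-ins, not the RTS definitions of LOOKS_LIKE_INFO_PTR or LOOKS_LIKE_CLOSURE_PTR. The macro evaluates its argument twice, so if the argument reads shared memory that another GC thread can update, the two reads may presumably see different values; the inline function reads it once.

```c
#include <stdint.h>

/* Hypothetical pointer check written as a macro: `p` appears twice. */
#define LOOKS_OK_MACRO(p) ((uintptr_t)(p) != 0 && ((uintptr_t)(p) & 3) == 0)

/* The same check as an inline function: the argument is evaluated once. */
static inline int looks_ok_fn(const void *p) {
    uintptr_t w = (uintptr_t)p;
    return w != 0 && (w & 3) == 0;
}

static int reads;  /* counts how often the argument expression is evaluated */

/* Stand-in for "load a pointer from a slot that other threads may write". */
const void *read_slot(const void **slot) {
    reads++;
    return *slot;
}
```

With a valid pointer in the slot, `LOOKS_OK_MACRO(read_slot(&slot))` performs two loads, while `looks_ok_fn(read_slot(&slot))` performs one.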
* Define _BSD_SOURCE in Stg.hIan Lynagh2008-09-041-1/+5
| | | | This means S_ISSOCK gets defined on Linux
* bindists are now some way towards workingIan Lynagh2008-08-101-4/+2
|
* FIX #2332: avoid overflow on 64-bit machines in the memory allocatorSimon Marlow2008-07-291-4/+4
|
* add threadStatus# primop, for querying the status of a ThreadId#Simon Marlow2008-07-102-0/+2
|
* add new primop: asyncExceptionsBlocked# :: IO BoolSimon Marlow2008-07-091-0/+1
|
* FIX part of #2301, and #1619Simon Marlow2008-07-091-0/+4
| 2301: Control-C now causes the new exception (AsyncException UserInterrupt) to be raised in the main thread. The signal handler is set up by GHC.TopHandler.runMainIO, and can be overridden in the usual way by installing a new signal handler. The advantage is that now all programs will get a chance to clean up on ^C.
|
| When UserInterrupt is caught by the topmost handler, we now exit the program via kill(getpid(),SIGINT), which tells the parent process that we exited as a result of ^C, so the parent can take appropriate action (it might want to exit too, for example).
|
| One subtlety is that we have to use a weak reference to the ThreadId for the main thread, so that the signal handler doesn't prevent the main thread from being subject to deadlock detection.
|
| 1619: we now ignore SIGPIPE by default. Although POSIX says that a SIGPIPE should terminate the process by default, I wonder if this decision was made because many C applications failed to check the exit code from write(). In Haskell a failed write due to a closed pipe will generate an exception anyway, so the main difference is that we now get a useful error message instead of silent program termination. See #1619 for more discussion.
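The two POSIX behaviours this message relies on can be sketched in plain C. This is an illustration under assumptions, not the RTS code: the function names are made up. With SIGPIPE ignored, a write to a pipe with no reader fails with EPIPE instead of killing the process, and re-raising SIGINT with the default disposition makes the process terminate "by signal" so the parent can see it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Writes one byte to a pipe whose read end is already closed.
   Returns the errno reported by write(), 0 if the write unexpectedly
   succeeds, or -1 if the pipe could not be created. */
int write_to_closed_pipe(void) {
    int fds[2];
    if (pipe(fds) != 0) return -1;
    close(fds[0]);
    signal(SIGPIPE, SIG_IGN);  /* with the default action the process would be killed */
    ssize_t n = write(fds[1], "x", 1);
    int err = (n < 0) ? errno : 0;
    close(fds[1]);
    return err;
}

/* Exit so that the parent sees "terminated by SIGINT" rather than a
   normal exit code. Does not return if the signal is delivered. */
void exit_as_interrupted(void) {
    signal(SIGINT, SIG_DFL);   /* restore the default action first */
    kill(getpid(), SIGINT);
}
```

The caller of `write_to_closed_pipe` gets EPIPE back and can report it, which corresponds to the "useful error message instead of silent program termination" point above.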
* FIX #2313 do not include BFD symbols in RTS when the BFD library is not available for linkingKarel Gardas2008-05-281-1/+1
|
* Fix up inlines for gcc 4.3Simon Marlow2008-06-193-21/+44
| | | | | | | | | gcc 4.3 emits warnings for static inline functions that its heuristics decided not to inline. The workaround is to either mark appropriate functions as "hot" (a new attribute in gcc 4.3), or sometimes to use "extern inline" instead. With this fix I can validate with gcc 4.3 on Fedora 9.
* Experimental "mark-region" strategy for the old generationSimon Marlow2008-06-093-3/+11
| | | | Sometimes better than the default copying, enabled by +RTS -w
* remove EVACUATED: store the forwarding pointer in the info pointerSimon Marlow2008-04-173-7/+5
|
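A sketch of the general idea of keeping a forwarding pointer in the header word, assuming the low bit is used as the tag (the `Obj` layout and helper names are hypothetical). Because a real info pointer is aligned, its low bit is zero, so a set low bit can mark "this object has been evacuated and the rest of the word is its new address" without a separate EVACUATED header.

```c
#include <stdint.h>

typedef struct Obj {
    uintptr_t header;  /* normally an aligned info pointer; low bit 0 */
    int payload;
} Obj;

/* True if the header holds a tagged forwarding pointer. */
int is_forwarded(const Obj *o) {
    return (o->header & 1) != 0;
}

/* Overwrite the header of the old copy with the tagged new address. */
void set_forward(Obj *from, Obj *to) {
    from->header = (uintptr_t)to | 1;
}

/* Strip the tag to recover the new address. */
Obj *forward_target(const Obj *o) {
    return (Obj *)(o->header - 1);
}
```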
* Don't traverse the entire list of threads on every GC (phase 1)Simon Marlow2008-04-161-0/+3
| | | | | | Instead of keeping a single list of all threads, keep one per step and only look at the threads belonging to steps that we are collecting.
* Add a write barrier to the TSO link field (#1589)Simon Marlow2008-04-165-7/+28
|
* pad step_workspace to 64 bytes, to speed up access to gct->steps[]Simon Marlow2008-04-161-0/+6
|
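One common way to pad a structure to a fixed size is a union with a byte array. The sketch below is an assumption about the technique, not the actual step_workspace definition, and the field names are invented. With a 64-byte element, indexing an array of workspaces is a multiply by a power of two, and each element can occupy its own cache line on machines with 64-byte lines.

```c
#include <stddef.h>

#define WORKSPACE_PAD 64  /* target size in bytes, as in the commit title */

/* Hypothetical workspace fields; the real structure has different members. */
typedef struct {
    void  *todo_block;
    void  *scan_block;
    size_t n_words;
} WorkspaceFields;

/* The union is at least as large as its largest member, so it is 64 bytes
   as long as the fields themselves fit in 64 bytes. */
typedef union {
    WorkspaceFields w;
    char pad[WORKSPACE_PAD];
} PaddedWorkspace;
```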
* Reorganisation to fix problems related to the gct register variableSimon Marlow2008-04-162-5/+6
| - GCAux.c contains code not compiled with the gct register enabled; it is callable from outside the GC
| - marking functions are moved to their relevant subsystems, outside the GC
| - mark_root needs to save the gct register, as it is called from outside the GC
* improvements to +RTS -s outputSimon Marlow2008-04-161-0/+1
| - count and report number of parallel collections
| - calculate bytes scanned in addition to bytes copied per thread
| - calculate "work balance factor"
| - tidy up the formatting a bit
* Keep track of an accurate count of live words in each stepSimon Marlow2008-04-161-0/+1
| | | | | This means we can calculate slop easily, and also improve predictability of GC.
* Allow work units smaller than a block to improve load balancingSimon Marlow2008-04-162-0/+4
|
* use RTS_VAR()Simon Marlow2008-04-161-1/+1
|
* treat the global work list as a queue rather than a stackSimon Marlow2008-04-161-0/+1
|
* GC: move static object processing into thread-local storageSimon Marlow2008-04-161-1/+0
|
* Add +RTS -vg flag for requesting some GC trace messages, outside DEBUGSimon Marlow2008-04-161-0/+1
| | | | | | | DEBUG imposes a significant performance hit in the GC, yet we often want some of the debugging output, so -vg gives us the cheap trace messages without the sanity checking of DEBUG, just like -vs for the scheduler.
* GC: rearrange storage to reduce memory accesses in the inner loopSimon Marlow2008-04-161-6/+15
|
* Add profiling of spinlocksSimon Marlow2008-04-161-0/+4
|
* rename StgSync to SpinLockSimon Marlow2008-04-161-24/+19
|
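A minimal spinlock with a spin counter, in the spirit of the SpinLock rename and the spinlock profiling commit above it. This sketch uses C11 atomics and invented names (`SpinLockSketch`, `spin_acquire`, and so on); the RTS implementation is not shown in this log and may differ.

```c
#include <stdatomic.h>

typedef struct {
    atomic_flag  held;   /* set while some thread owns the lock */
    atomic_ulong spins;  /* profiling: number of failed acquisition attempts */
} SpinLockSketch;

void spin_init(SpinLockSketch *l) {
    atomic_flag_clear(&l->held);
    atomic_init(&l->spins, 0);
}

/* Busy-wait until the lock is free, counting every failed attempt. */
void spin_acquire(SpinLockSketch *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        atomic_fetch_add_explicit(&l->spins, 1, memory_order_relaxed);
    }
}

/* Single attempt. Returns 1 if the lock was taken, 0 otherwise. */
int spin_try_acquire(SpinLockSketch *l) {
    if (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        atomic_fetch_add_explicit(&l->spins, 1, memory_order_relaxed);
        return 0;
    }
    return 1;
}

void spin_release(SpinLockSketch *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A high `spins` count relative to the number of acquisitions suggests contention on that lock.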
* Release some of the memory allocated to a stack when it shrinks (#2090)simonmar@microsoft.com2008-02-282-9/+21
| | | | | | When a stack is occupying less than 1/4 of the memory it owns, and is larger than a megablock, we release half of it. Shrinking is O(1): it doesn't need to copy the stack.
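The shrinking rule stated here can be written as a small size calculation. This is only a sketch of the policy: the function name is made up, sizes are counted in words, and the megablock size of 1 MB with 8-byte words is an assumption. How the RTS actually returns the memory is not shown.

```c
#include <stddef.h>

#define MBLOCK_WORDS ((size_t)1 << 17)  /* assumed: 1 MB megablock / 8-byte words */

/* Given the words a stack owns and the words it currently uses, return the
   number of words it should own afterwards. */
size_t shrunk_stack_size(size_t owned, size_t used) {
    if (used < owned / 4 && owned > MBLOCK_WORDS) {
        return owned / 2;  /* release half; the live part still fits easily */
    }
    return owned;          /* otherwise leave the stack alone */
}
```

Since the live portion is under a quarter of the stack, it fits in the half that is kept, which is consistent with the claim that no copying is needed.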
* round_to_mblocks: should use StgWord not natSimon Marlow2008-02-201-2/+2
|
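Why the integer type matters for a rounding helper can be shown with a generic round-up-to-multiple calculation. The code below is illustrative and is not round_to_mblocks itself: `nat_t` and `word_t` are stand-ins for a 32-bit `nat` and a 64-bit StgWord, and the 1 MB unit is an assumption. In 32-bit arithmetic the addition wraps around for large inputs, while the word-sized version gives the expected result.

```c
#include <stdint.h>

#define UNIT ((uint64_t)1 << 20)  /* assumed rounding unit: 1 MB */

typedef uint32_t nat_t;   /* stand-in for a 32-bit `nat` */
typedef uint64_t word_t;  /* stand-in for StgWord on a 64-bit machine */

/* Round up to a multiple of UNIT using 32-bit arithmetic (can wrap). */
nat_t round_up_narrow(nat_t n) {
    nat_t mask = (nat_t)(UNIT - 1);
    return (n + mask) & ~mask;
}

/* The same calculation in 64-bit arithmetic. */
word_t round_up_wide(word_t n) {
    word_t mask = UNIT - 1;
    return (n + mask) & ~mask;
}
```

For an input just under 4 GB the narrow version wraps to 0, which is the kind of failure a word-sized type avoids on 64-bit machines.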
* add ROUNDUP_BYTES_TO_WDSsimonmar@microsoft.com2008-02-151-1/+3
|
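A bytes-to-words round-up macro is usually written as a ceiling division. The definition below is a sketch with a hypothetical name and a stand-in word type, given to show the arithmetic rather than to quote the header.

```c
#include <stddef.h>
#include <stdint.h>

typedef uintptr_t word_t;  /* stand-in for the machine word type */

/* Number of whole words needed to hold n bytes (ceiling division). */
#define ROUNDUP_BYTES_TO_WDS_SKETCH(n) \
    (((n) + sizeof(word_t) - 1) / sizeof(word_t))
```

For example, one byte needs one word, and `sizeof(word_t) + 1` bytes need two.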
* memInventory: optionally dump the memory inventorysimonmar@microsoft.com2008-01-301-1/+1
| | | | in addition to checking for leaks
* recordMutableGen_GC: we must call the spinlocked version of allocBlock()Simon Marlow2008-01-111-1/+18
|
* calculate wastage due to unused memory at the end of each blocksimonmar@microsoft.com2007-12-141-1/+3
|
* remove declarations for variables that no longer existsimonmar@microsoft.com2007-12-131-3/+0
|
* improvements to PAPI supportsimonmar@microsoft.com2007-11-201-2/+7
| - major (multithreaded) GC is measured separately from minor GC
| - events to measure can now be specified on the command line, e.g.
|     prog +RTS -a+PAPI_TOT_CYC
* Initial parallel GC supportSimon Marlow2007-10-311-1/+2
| E.g. use +RTS -g2 -RTS for 2 threads. Only major GCs are parallelised; minor GCs are still sequential. Don't use more threads than you have CPUs.
|
| It works most of the time, although you won't see much speedup yet. Tuning and more work on stability still required.
* Refactoring of the GC in preparation for parallel GCSimon Marlow2007-10-312-40/+60
| | | | | | | | | | | | This patch localises the state of the GC into a gc_thread structure, and reorganises the inner loop of the GC to scavenge one block at a time from global work lists in each "step". The gc_thread structure has a "workspace" for each step, in which it collects evacuated objects until it has a full block to push out to the step's global list. Details of the algorithm will be on the wiki in due course. At the moment, THREADED_RTS does not compile, but the single-threaded GC works (and is 10-20% slower than before).
* move GetRoots() to GC.cSimon Marlow2007-10-301-2/+2
|
* Fix conversions between Double/Float and simple-integerIan Lynagh2008-06-142-1/+3
|
* Fix unreg buildSimon Marlow2008-06-041-0/+1
|
* FIX #1861: floating-point constants for infinity and NaN in via-CSimon Marlow2008-05-121-0/+3
|