path: root/rts
Commit message    Author    Age    Files    Lines
* Fix a scheduling bug in the threaded RTS    Simon Marlow    2011-12-01    7    -22/+65
| The parallel GC was using setContextSwitches() to stop all the other threads, which sets the context_switch flag on every Capability. That had the side effect of causing every Capability to also switch threads, and since GCs can be much more frequent than context switches, this increased the context switch frequency. When context switches are expensive (because the switch is between two bound threads or a bound and unbound thread), the difference is quite noticeable.
|
| The fix is to have a separate flag to indicate that a Capability should stop and return to the scheduler, but not switch threads. I've called this the "interrupt" flag.
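The difference between the two flags in the scheduling fix can be sketched in C as follows. This is a minimal model only: `CapModel`, `Action`, and the three functions are hypothetical names invented for illustration and are not the RTS's actual definitions; the only behaviour taken from the commit message is that a GC stop request should return a Capability to the scheduler without forcing a thread switch.

```c
#include <stdbool.h>

/* Hypothetical model of the per-Capability flags; not the real RTS struct. */
typedef struct {
    bool context_switch; /* ask the running thread to yield to another thread */
    bool interrupt;      /* ask the Capability to return to the scheduler only */
} CapModel;

typedef enum { KEEP_RUNNING, RETURN_TO_SCHEDULER, SWITCH_THREADS } Action;

/* Stop request for GC: sets only the interrupt flag, so no switch is forced. */
static void request_gc_stop(CapModel *cap) { cap->interrupt = true; }

/* Timer-driven request: this one is meant to cause a thread switch. */
static void request_context_switch(CapModel *cap) { cap->context_switch = true; }

/* Decide what the Capability does at its next safe point, clearing both flags. */
static Action poll_flags(CapModel *cap) {
    Action a = KEEP_RUNNING;
    if (cap->context_switch) {
        a = SWITCH_THREADS;
    } else if (cap->interrupt) {
        a = RETURN_TO_SCHEDULER;
    }
    cap->context_switch = false;
    cap->interrupt = false;
    return a;
}
```

In this model a GC that calls `request_gc_stop()` on every Capability no longer raises the context switch frequency, because only `request_context_switch()` can produce `SWITCH_THREADS`.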
* loadArchive: need to allocate executable memory on Win32 (#5371)    Simon Marlow    2011-12-01    1    -0/+5
|
* Fix potential crash on Windows: off-by-one in malloc()    Simon Marlow    2011-12-01    1    -1/+1
| Spotted by gdb's malloc debugger while I was looking for something else.
* Add missing newline in RTS help output.    Edward Z. Yang    2011-12-01    1    -1/+1
| Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
* Turn a bunch of ints into longs to avoid overflow (#5656)    Simon Marlow    2011-11-30    1    -18/+18
|
* Add a new primop: getCCCS# :: State# s -> (# State# s, Addr# #)    Simon Marlow    2011-11-29    1    -0/+3
| Returns a pointer to the current cost-centre stack when profiling, NULL otherwise.
* Another fix to the stg_enter_checkbh frame    Simon Marlow    2011-11-29    1    -1/+8
|
* Make profiling work with multiple capabilities (+RTS -N)    Simon Marlow    2011-11-29    15    -101/+165
| This means that both time and heap profiling work for parallel programs. Main internal changes:
|
|  - CCCS is no longer a global variable; it is now another pseudo-register in the StgRegTable struct. Thus every Capability has its own CCCS.
|
|  - There is a new built-in CCS called "IDLE", which records ticks for Capabilities in the idle state. If you profile a single-threaded program with +RTS -N2, you'll see about 50% of time in "IDLE".
|
|  - There is appropriate locking in rts/Profiling.c to protect the shared cost-centre-stack data structures.
|
| This patch does enough to get it working, but I have cut one big corner: the cost-centre-stack data structure is still shared amongst all Capabilities, which means that multiple Capabilities will race when updating the "allocations" and "entries" fields of a CCS. Not only does this give unpredictable results, but it runs very slowly due to cache line bouncing.
|
| It is strongly recommended that you use -fno-prof-count-entries to disable the "entries" count when profiling parallel programs. (I shall add a note to this effect to the docs).
* stg_enter_checkbh: fix offsets for profiling    Simon Marlow    2011-11-29    1    -2/+2
|
* Time handling overhaul    Simon Marlow    2011-11-25    23    -439/+317
| Terminology cleanup: the type "Ticks" has been renamed "Time", which is an StgWord64 in units of TIME_RESOLUTION (currently nanoseconds). The terminology "tick" is now used consistently to mean the interval between timer signals.
|
| The ticker now always ticks in realtime (actually CLOCK_MONOTONIC if we have it). Before it used CPU time in the non-threaded RTS and realtime in the threaded RTS, but I've discovered that the CPU timer has terrible resolution (at least on Linux) and isn't much use for profiling. So now we always use realtime. This should also fix
|
| The default tick interval is now 10ms, except when profiling where we drop it to 1ms. This gives more accurate profiles without affecting runtime too much (<1%).
|
| Lots of cleanups - the resolution of Time is now in one place only (Rts.h) rather than having calculations that depend on the resolution scattered all over the RTS. I hope I found them all.
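The idea of keeping the resolution of Time in one place can be illustrated with a small C sketch. The names `TimeModel`, `TIME_RESOLUTION_MODEL`, `ms_to_time`, `time_to_ms`, and `tick_interval` are hypothetical and chosen for this example; the commit message only states that Time is a 64-bit value in units of TIME_RESOLUTION (currently nanoseconds) and that the tick interval is 10ms by default and 1ms when profiling.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the RTS's Time type and TIME_RESOLUTION. */
typedef uint64_t TimeModel;
#define TIME_RESOLUTION_MODEL UINT64_C(1000000000) /* units per second (ns) */

/* All conversions derive from the single resolution constant above. */
static TimeModel ms_to_time(uint64_t ms) {
    return ms * (TIME_RESOLUTION_MODEL / 1000);
}

static uint64_t time_to_ms(TimeModel t) {
    return t / (TIME_RESOLUTION_MODEL / 1000);
}

/* Tick intervals from the commit message: 10ms normally, 1ms when profiling. */
static TimeModel tick_interval(int profiling) {
    return ms_to_time(profiling ? 1 : 10);
}
```

Changing the resolution then means editing one constant, which is the kind of cleanup the commit describes for Rts.h.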
* Fixes for NetBSD    Ian Lynagh    2011-11-25    1    -1/+1
| Based on a patch from Arnaud Degroote <degroote@NetBSD.org> in trac #5480.
* Fix bug in flushStdHandles()    Simon Marlow    2011-11-24    1    -1/+1
| Was causing occasional failure in some threaded2 tests.
* Drop ".exe" extension from eventlog file name    Duncan Coutts    2011-11-22    1    -3/+18
| Fixes ticket #5472
* Remove registerised code for dead architectures: mips, ia64, alpha,    David Terei    2011-11-22    1    -378/+4
| hppa1, m68k
* Tabs -> Spaces    David Terei    2011-11-22    1    -24/+24
|
* Remove some old comments about the mangler    David Terei    2011-11-22    1    -2/+0
|
* merge    Simon Marlow    2011-11-22    4    -6/+35
|\
| * Enable pthread_getspecific() tls for LLVM compiler    David M Peixotto    2011-10-07    5    -8/+47
| | LLVM does not support the __thread attribute for thread local storage and may generate incorrect code for global register variables. We want to allow building the runtime with LLVM-based compilers such as llvm-gcc and clang, particularly for MacOS.
| |
| | This patch changes the gct variable used by the garbage collector to use pthread_getspecific() for thread local storage when an llvm based compiler is used to build the runtime.
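The pthread-based alternative to a `__thread` variable can be sketched as below. This is an illustration under stated assumptions, not the patch itself: `GcThreadModel`, `gct_key`, `set_gct`, and `get_gct` are hypothetical names, and the struct is a stand-in for whatever per-thread state `gct` points to. Only `pthread_key_create()`, `pthread_once()`, `pthread_setspecific()`, and `pthread_getspecific()` are real POSIX calls.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-thread GC state that `gct` refers to. */
typedef struct { int thread_index; } GcThreadModel;

static pthread_key_t gct_key;
static pthread_once_t gct_key_once = PTHREAD_ONCE_INIT;

/* Create the key exactly once; no destructor is registered in this sketch. */
static void make_gct_key(void) { (void)pthread_key_create(&gct_key, NULL); }

/* Store the calling thread's pointer; returns 0 on success. */
static int set_gct(GcThreadModel *t) {
    pthread_once(&gct_key_once, make_gct_key);
    return pthread_setspecific(gct_key, t);
}

/* Fetch the calling thread's pointer (NULL if it was never set). */
static GcThreadModel *get_gct(void) {
    pthread_once(&gct_key_once, make_gct_key);
    return (GcThreadModel *)pthread_getspecific(gct_key);
}
```

Each access goes through a function call rather than a register or a `__thread` slot, which trades some speed for portability across compilers.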
* | Fix bug in the handling of TSOs in the compacting GC (#5644)    Simon Marlow    2011-11-21    1    -1/+2
| |
* | Simplify a regexp and improve a couple of comments    Ian Lynagh    2011-11-20    1    -4/+5
| |
* | fix new warnings with gcc 4.6    Simon Marlow    2011-11-18    1    -4/+14
| |
* | Better documentation for stack alignment design    David Terei    2011-11-17    1    -41/+41
| |
* | Tabs -> Spaces + formatting fixes    David Terei    2011-11-17    1    -395/+387
| |
* | Add a getStablePtr for flushStdHandles_closure    Simon Marlow    2011-11-17    1    -0/+1
| |
* | Fix #4211: No need to fixup stack using mangler on OSX    David Terei    2011-11-17    1    -1/+1
| | We now manage the stack correctly on both x86 and i386, keeping the stack aligned at (16n bytes - word size) on function entry and at (16n bytes) on function calls. This gives us compatibility with LLVM and GCC.
* | Remove executable mode from some files    David Terei    2011-11-16    2    -0/+0
| |
* | Fix trashing of the masking state in STM (#5238)    Simon Marlow    2011-11-16    1    -18/+21
| |
* | Generate the C main() function when linking a binary (fixes #5373)    Simon Marlow    2011-11-16    8    -91/+42
| | Rather than have main() be statically compiled as part of the RTS, we now generate it into the tiny C file that we compile when linking a binary.
| |
| | The main motivation is that we want to pass the settings for the -rtsopts and -with-rtsopts flags into the RTS, rather than relying on fragile linking semantics to override the defaults, which don't work with DLLs on Windows (#5373).
| |
| | In order to do this, we need to extend the API for initialising the RTS, so now we have:
| |
| |     void hs_init_ghc (int *argc, char **argv[],   // program arguments
| |                       RtsConfig rts_config);      // RTS configuration
| |
| | hs_init_ghc() can optionally be used instead of hs_init(), and allows passing in configuration options for the RTS. RtsConfig is a struct, which currently has two fields:
| |
| |     typedef struct {
| |         RtsOptsEnabledEnum rts_opts_enabled;
| |         const char *rts_opts;
| |     } RtsConfig;
| |
| | but might have more in the future. There is a default value for the struct, defaultRtsConfig, the idea being that you start with this and override individual fields as necessary.
| |
| | In fact, main() was in a separate static library, libHSrtsmain.a. That's now gone.
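The "start from the default and override individual fields" pattern described for RtsConfig can be sketched without GHC's headers. Everything below is a stand-in so that the example compiles on its own: `RtsOptsEnabledModel` and its constants, `RtsConfigModel`, `default_config_model`, and `make_config` are hypothetical names, and the default values are assumed for illustration. Only the two struct fields mirror the commit message.

```c
#include <stddef.h>

/* Assumed enum constants; the commit message does not list the real ones. */
typedef enum { OptsNone, OptsSafeOnly, OptsAll } RtsOptsEnabledModel;

/* Same two fields as the RtsConfig struct quoted in the commit message. */
typedef struct {
    RtsOptsEnabledModel rts_opts_enabled;
    const char *rts_opts;
} RtsConfigModel;

/* Assumed defaults, playing the role that defaultRtsConfig plays in the RTS. */
static const RtsConfigModel default_config_model = { OptsSafeOnly, NULL };

/* Copy the default, then override only the fields the caller cares about. */
static RtsConfigModel make_config(int enable_all, const char *baked_in_opts) {
    RtsConfigModel conf = default_config_model;
    if (enable_all) {
        conf.rts_opts_enabled = OptsAll;
    }
    if (baked_in_opts != NULL) {
        conf.rts_opts = baked_in_opts;
    }
    return conf;
}
```

Because callers copy the default first, adding a third field to the struct later would not require changes at existing call sites. With the real headers, a struct built this way would be passed as the third argument of hs_init_ghc().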
* | further fixes to the #5505 fix.    Simon Marlow    2011-11-15    1    -4/+3
| |
* | Avoid generating chains of indirections in stack squeezing (#5505)    Simon Marlow    2011-11-15    1    -60/+73
| |
* | +RTS -xc: print the closure type of the exception too    Simon Marlow    2011-11-14    4    -5/+27
| |
* | Close the handle for the ticker thread (#5604)    Simon Marlow    2011-11-11    1    -1/+2
| |
* | fix dynamic way on Win32 (missing bits from flushStdHandles changes)    Dimitrios Vytiniotis    2011-11-09    3    -1/+5
| |
* | add -u flag for the new flushStdHandles reference    Simon Marlow    2011-11-09    1    -0/+2
| | (fix build failure with -split-objs)
* | Close some handle leaks (#5604)    Simon Marlow    2011-11-09    1    -9/+21
| | Also, use the Win32 API (CreateThread) instead of the CRT API (_beginthreadex) for thread creation.
* | Flush stdout and stderr during hs_exit() (#5594)    Simon Marlow    2011-11-08    2    -0/+17
| | Ensures that these handles are flushed even when the RTS is being used as a library, with no main. Needs a corresponding change to libraries/base.
* | get the column widths right for Unicode SCC labels/modules    Simon Marlow    2011-11-08    1    -7/+29
| |
* | Add eventlog event for thread labels    Duncan Coutts    2011-11-04    8    -7/+98
| | The existing GHC.Conc.labelThread will now also emit the thread label into the eventlog. Profiling tools like ThreadScope could then use the thread labels rather than thread numbers.
* | fix disassembling of large instructions    Simon Marlow    2011-11-02    1    -8/+36
| |
* | fix BCO_GET_LARGE_ARG (seems to be completely wrong)    Simon Marlow    2011-11-02    1    -2/+2
| |
* | Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints    Simon Marlow    2011-11-02    13    -609/+677
| | User visible changes
| | ====================
| |
| | Profiling
| | ---------
| |
| | Flags renamed (the old ones are still accepted for now):
| |
| |   OLD          NEW
| |   ---------    ------------
| |   -auto-all    -fprof-auto
| |   -auto        -fprof-exported
| |   -caf-all     -fprof-cafs
| |
| | New flags:
| |
| |   -fprof-auto              Annotates all bindings (not just top-level ones) with SCCs
| |   -fprof-top               Annotates just top-level bindings with SCCs
| |   -fprof-exported          Annotates just exported bindings with SCCs
| |   -fprof-no-count-entries  Do not maintain entry counts when profiling (can make profiled code go faster; useful with heap profiling where entry counts are not used)
| |
| | Cost-centre stacks have a new semantics, which should in most cases result in more useful and intuitive profiles. If you find this not to be the case, please let me know. This is the area where I have been experimenting most, and the current solution is probably not the final version, however it does address all the outstanding bugs and seems to be better than GHC 7.2.
| |
| | Stack traces
| | ------------
| |
| | +RTS -xc now gives more information. If the exception originates from a CAF (as is common, because GHC tends to lift exceptions out to the top-level), then the RTS walks up the stack and reports the stack in the enclosing update frame(s).
| |
| | Result: +RTS -xc is much more useful now - but you still have to compile for profiling to get it. I've played around a little with adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem quite accurately.
| |
| | I plan to add more facilities for stack tracing (e.g. in GHCi) in the future.
| |
| | Coverage (HPC)
| | --------------
| |
| |  * derived instances are now coloured yellow if they weren't used
| |  * likewise record field names
| |  * entry counts are more accurate (hpc --fun-entry-count)
| |  * tab width is now correct (markup was previously off in source with tabs)
| |
| | Internal changes
| | ================
| |
| | In Core, the Note constructor has been replaced by
| |
| |     Tick (Tickish b) (Expr b)
| |
| | which is used to represent all the kinds of source annotation we support: profiling SCCs, HPC ticks, and GHCi breakpoints.
| |
| | Depending on the properties of the Tickish, different transformations apply to Tick. See CoreUtils.mkTick for details.
| |
| | Tickets
| | =======
| |
| | This commit closes the following tickets, test cases to follow:
| |
| |  - Close #2552: not a bug, but the behaviour is now more intuitive (test is T2552)
| |  - Close #680 (test is T680)
| |  - Close #1531 (test is result001)
| |  - Close #949 (test is T949)
| |  - Close #2466: test case has bitrotted (doesn't compile against current version of vector-space package)
* | fix time calculation for retainer profiling    Simon Marlow    2011-11-02    1    -3/+11
| |
* | fix for a deadlock when using +RTS -hb with -prof -threaded    Simon Marlow    2011-11-02    1    -2/+5
| |
* | Change stack alignment to 16+8 bytes in STG code    David M Peixotto    2011-11-01    1    -20/+26
| | This patch changes the STG code so that %rsp is aligned to a 16-byte boundary + 8. This is the alignment required by the x86_64 ABI on entry to a function. Previously we kept %rsp aligned to a 16-byte boundary, but this was causing problems for the LLVM backend (see #4211).
| |
| | We now don't need to invoke llvm stack mangler on x86_64 targets. Since the stack is now 16+8 byte aligned in STG land on x86_64, we don't need to mangle the stack manipulations with the llvm mangler.
| |
| | This patch only modifies the alignment for x86_64 backends.
| |
| | Signed-off-by: David Terei <davidterei@gmail.com>
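The "16-byte boundary + 8" rule can be checked with simple arithmetic, sketched here in C. The function names are hypothetical and the stack pointer is modelled as a plain integer; the sketch only restates the alignment relationship from the commit message (16-byte aligned at the call site, 16n + 8 on function entry once the 8-byte return address has been pushed).

```c
#include <stdint.h>

/* True if sp has the x86_64 function-entry alignment: 16-byte boundary + 8. */
static int aligned_on_entry(uintptr_t sp) { return (sp % 16) == 8; }

/* True if sp is on a 16-byte boundary, as expected just before a call. */
static int aligned_at_call(uintptr_t sp) { return (sp % 16) == 0; }

/* Model of a `call`: pushing the 8-byte return address moves sp down by 8. */
static uintptr_t after_call(uintptr_t sp) { return sp - 8; }
```

A stack pointer that satisfies `aligned_at_call()` always satisfies `aligned_on_entry()` after `after_call()`, which is why code that keeps %rsp at 16n + 8 on entry lines up with what compilers such as LLVM assume.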
* | Fix unused var warning on windows    Duncan Coutts    2011-10-31    1    -2/+2
| |
* | Fix recent rts flags changes on windows    Duncan Coutts    2011-10-31    1    -1/+1
| | I naively assumed that mingw would not have unistd.h or sys/types but it has both, yet does not have getuid() and friends.
* | Allow the -t -T -s -S flags (without <file> arg!) in -rtsopts=some mode    Duncan Coutts    2011-10-27    1    -4/+10
| | Without any <file> arg, these flags just dump info to stderr so are at most a mild information disclosure danger. We disallow a <file> arg in the default -rtsopts=some mode since that will overwrite the given file.
* | Change what +RTS options are available by default    Duncan Coutts    2011-10-27    1    -21/+110
| | Ticket #3910 originally pointed out that the RTS options are a potential security problem. For example the -t -s or -S flags can be used to overwrite files. This would be bad in the context of CGI scripts or setuid binaries. So we introduced a system where +RTS processing is more or less disabled unless you pass the -rtsopts flag at link time.
| |
| | This scheme is safe enough but it also really annoys users. They have to use -rtsopts in many circumstances: with -threaded to use -N, with -eventlog to use -l, with -prof to use any of the profiling flags. Many users just set -rtsopts globally or in project .cabal files. Apart from annoying users it reduces security because it means that deployed binaries will have all RTS options enabled rather than just profiling ones.
| |
| | This patch relaxes the set of RTS options that are available in the default -rtsopts=some case. For "deployment" ways like vanilla and -threaded we remain quite conservative. Only --info -? --help are allowed for vanilla. For -threaded, -N and -N<x> are allowed with a check that x <= num cpus.
| |
| | For "developer" ways like -debug, -eventlog, -prof, we allow all the options that are special to that way. Some of these allow writing files, but the file written is not directly under the control of the attacker. For the setuid case (where the attacker would have control over binary name, current dir, local symlinks etc) we check if the process is running setuid/setgid and refuse all RTS option processing. Users would need to use -rtsopts=all in this case.
| |
| | We are making the assumption that developers will not deploy binaries built in the -debug, -eventlog, -prof ways. And even if they do, the damage should be limited to DoS, information disclosure and writing files like <progname>.eventlog, not arbitrary files.
* | Use signed comparison for +RTS -N x <= 0 test    Duncan Coutts    2011-10-27    1    -4/+5
| | Otherwise we can use +RTS -N-1 and get 2^32 or 2^64 capabilities which doesn't work out so well...
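The pitfall behind the signed-comparison fix can be reproduced in a few lines of C. This is a sketch under assumptions, not the RTS's flag parser: `parse_n_caps` and `as_unsigned` are hypothetical helpers, and the argument is taken to be the text that follows `-N`.

```c
#include <stdlib.h>

/* Parse the numeric part of a hypothetical "-N<x>" flag.
   Returns the capability count, or -1 if the value is not a positive number. */
static long parse_n_caps(const char *arg) {
    char *end;
    long n = strtol(arg, &end, 10);   /* signed parse keeps "-1" negative */
    if (end == arg || *end != '\0') {
        return -1;                    /* empty or trailing garbage */
    }
    if (n <= 0) {
        return -1;                    /* signed test rejects zero and negatives */
    }
    return n;
}

/* The pitfall: the same value held in an unsigned type wraps to a huge count,
   so a "<= 0" test on it can only ever catch zero. */
static unsigned long as_unsigned(long n) { return (unsigned long)n; }
```

With the signed check in place, `-N-1` is rejected up front instead of being read as an enormous number of capabilities.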
* | Remove +RTS --help text for -De flag which no longer exists    Duncan Coutts    2011-10-27    1    -1/+0
| |