path: root/rts
Commit message (Collapse)AuthorAgeFilesLines
* linker: only keep rtl exception tables if they have been relocatedTamar Christina2022-06-201-5/+6
|
* ghc-heap: Don't Box NULL pointersBen Gamari2022-06-181-6/+11
| | | | | | | | | Previously we could construct a `Box` of a NULL pointer from the `link` field of `StgWeak`. Now we take care to avoid ever introducing such pointers in `collect_pointers` and ensure that the `link` field is represented as a `Maybe` in the `Closure` type. Fixes #21622
* getProcessCPUTime: Fix the getrusage fallback to account for system CPU timeMatthew Pickering2022-06-091-1/+2
| clock_gettime reports the combined total of user AND system time, so in
| order to replicate it with getrusage we need to add both system and user
| time together.
|
| See https://stackoverflow.com/questions/7622371/getrusage-vs-clock-gettime
|
| Some sample measurements when building Cabal with this patch
|
| t1: rusage
| t2: clock_gettime
|
| t1: 62347518000; t2: 62347520873
| t1: 62395687000; t2: 62395690171
| t1: 62432435000; t2: 62432437313
| t1: 62478489000; t2: 62478492465
| t1: 62514990000; t2: 62514992534
| t1: 62515479000; t2: 62515480327
| t1: 62515485000; t2: 62515486344
|
| Fixes #21656
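The point of this commit can be sketched in C as follows. This is a minimal illustration, not the RTS code: the helper names `cpu_time_rusage_ns` and `cpu_time_clock_ns` are hypothetical, and only the POSIX calls `getrusage` and `clock_gettime` are real.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>

/* Convert a struct timeval (seconds + microseconds) to nanoseconds. */
static int64_t timeval_to_ns(struct timeval tv)
{
    return (int64_t)tv.tv_sec * 1000000000LL + (int64_t)tv.tv_usec * 1000LL;
}

/* Process CPU time via getrusage: user time PLUS system time.
 * Returns -1 on failure. Hypothetical helper, not the RTS function. */
int64_t cpu_time_rusage_ns(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0) return -1;
    return timeval_to_ns(ru.ru_utime) + timeval_to_ns(ru.ru_stime);
}

/* Process CPU time via clock_gettime, for comparison.
 * Returns -1 on failure. */
int64_t cpu_time_clock_ns(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0) return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

Calling both helpers back to back should give values that differ only by the resolution of `getrusage` (microseconds), which is the pattern visible in the t1/t2 measurements above.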
* Split out `GHC.HsToCore.{Breakpoints,Coverage}` and use `SizedSeq`John Ericson2022-06-021-1/+1
| | | | | | | | | | | | | | | | | | | As proposed in https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7508#note_432877 and https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7508#note_434676, `GHC.HsToCore.Ticks` is about ticks, breakpoints are separate and backend-specific (only for the bytecode interpreter), and mix entry writing is just for HPC. With this split we separate out those interpreter- and HPC-specific bits, and keep the main `GHC.HsToCore.Ticks` agnostic. Also, instead of passing the reversed list and count around, we use `SizedSeq` which abstracts over the algorithm. This is much nicer to avoid noise and prevents bugs. (The bugs are not just hypothetical! I messed up the reverses on an earlier draft of this commit.)
* Rename `HsToCore.{Coverage -> Ticks}`John Ericson2022-06-021-1/+1
| | | | | | The old name made it confusing why disabling HPC didn't disable the entire pass. The new name makes it clear: there are other reasons to add ticks in addition.
* typosEric Lindblad2022-06-0131-42/+42
|
* rts: Remove explicit timescale for deprecating -h flagMatthew Pickering2022-05-301-2/+2
| | | | | | | | We originally planned to remove the flag in 9.4 but there's actually no great rush to do so and it's probably less confusing (forever) to keep the message around suggesting an explicit profiling option. Fixes #21545
* Enable USE_INLINE_SRT_FIELD on ARM64Sylvain Henry2022-05-301-5/+1
| | | | | | | It was previously disabled because of: - a confusion about "SRT inlining" (see removed comment in this commit) - a linker bug (overflow) in the handling of ARM64_RELOC_SUBTRACTOR relocation: fixed by a previous commit.
* MachO linker: fix handling of ARM64_RELOC_SUBTRACTORSylvain Henry2022-05-301-29/+73
| ARM64_RELOC_SUBTRACTOR relocations are paired with an
| ARM64_RELOC_UNSIGNED relocation to implement: addend + sym1 - sym2
|
| The linker was doing it in two steps, basically:
|
|   *addend <- *addend - sym2
|   *addend <- *addend + sym1
|
| The first operation was likely to overflow. For example when the
| relocation target was 32-bit and both sym1/sym2 were 64-bit addresses.
| With the small memory model, (sym1-sym2) would fit in 32 bits but
| (*addend-sym2) may not.
|
| Now the linker does it in one step:
|
|   *addend <- *addend + sym1 - sym2
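The overflow described above can be illustrated with a small C sketch. It assumes a signed 32-bit relocation target and a linker that range-checks every value it writes back; the commit message does not say whether the old code checked or silently truncated, and the function names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v can be stored in a signed 32-bit relocation target. */
static bool fits_int32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Two-step scheme: the intermediate value is written back to the 32-bit
 * target, so it has to fit on its own. Returns false when it does not,
 * leaving the target untouched. */
bool relocate_two_step(int32_t *target, int64_t sym1, int64_t sym2)
{
    int64_t step1 = (int64_t)*target - sym2;   /* *addend - sym2 */
    if (!fits_int32(step1)) return false;
    int64_t step2 = step1 + sym1;              /* ... + sym1 */
    if (!fits_int32(step2)) return false;
    *target = (int32_t)step2;
    return true;
}

/* One-step scheme: compute in 64 bits and range-check only the result. */
bool relocate_one_step(int32_t *target, int64_t sym1, int64_t sym2)
{
    int64_t value = (int64_t)*target + sym1 - sym2;
    if (!fits_int32(value)) return false;
    *target = (int32_t)value;
    return true;
}
```

With an addend of 8, sym1 = 0x100004000 and sym2 = 0x100001000, the two-step version fails on the intermediate value while the one-step version stores 8 + 0x3000.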
* Allow passing -po outside profiling wayTeo Camarasu2022-05-241-0/+17
| | | | Resolves #21455
* nonmoving: Fix documentation of GC statistics fieldsBen Gamari2022-05-201-14/+9
| | | | | | These were previously incorrect. Fixes #21553.
* base: Introduce [sg]etFinalizerExceptionHandlerBen Gamari2022-05-194-6/+7
| | | | | This introduces a global hook which is called when an exception is thrown during finalization.
* Give all EXTERN_INLINE closure macros prototypesAndreas Klebinger2022-05-171-12/+36
|
* codeGen: Ensure that static datacon apps are included in SRTsBen Gamari2022-05-171-0/+6
| | | | | | | | | | | | | | | | When generating an SRT for a recursive group, GHC.Cmm.Info.Build.oneSRT filters out recursive references, as described in Note [recursive SRTs]. However, doing so for static functions would be unsound, for the reason described in Note [Invalid optimisation: shortcutting]. However, the same argument applies to static data constructor applications, as we discovered in #20959. Fix this by ensuring that static data constructor applications are included in recursive SRTs. The approach here is not entirely satisfactory, but it is a starting point. Fixes #20959.
* driver: Introduce pgmcxxBen Gamari2022-05-171-0/+1
| | | | | | | | | | Here we introduce proper support for compilation of C++ objects. This includes: * logic in `configure` to detect the C++ toolchain and propagating this information into the `settings` file * logic in the driver to use the C++ toolchain when compiling C++ sources
* Make closure macros EXTERN_INLINE to make debugging easierAndreas Klebinger2022-05-161-34/+34
| | | | | | | | | | | | | | Implements #21424. The RTS macros get_itbl and friends are extremely helpful during debugging. However only a select few of those were available in the compiled RTS as actual symbols as the rest were INLINE macros. This commit marks all of them as EXTERN_INLINE. This will still inline them at use sites but allow us to use their compiled counterparts during debugging. This allows us to use things like `p get_fun_itbl(ptr)` in the gdb shell since `get_fun_itbl` will now be available as symbol!
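For readers unfamiliar with the idea, here is a minimal C99 sketch of how an inline function can also be emitted as a real symbol. It is not the RTS's definition of `EXTERN_INLINE`; the types `InfoTable` and `Closure` and the function `get_info` are hypothetical stand-ins for `get_itbl` and friends.

```c
#include <assert.h>

/* Hypothetical, heavily simplified closure layout. */
typedef struct { int type; } InfoTable;
typedef struct { const InfoTable *info; } Closure;

/* Inline definition: call sites that see this may inline it. */
inline const InfoTable *get_info(const Closure *c)
{
    return c->info;
}

/* In C99, an extern declaration of the same function in this translation
 * unit makes the definition above an external definition, so the compiler
 * also emits an out-of-line symbol that a debugger can call. */
extern inline const InfoTable *get_info(const Closure *c);
```

With the symbol present in the binary, an expression such as `p get_info(ptr)` becomes something a debugger can evaluate, which is the benefit the commit describes for `get_fun_itbl`.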
* rts/PEi386: Fix handling of weak symbolsBen Gamari2022-05-132-0/+13
| | | | | | | | Previously we would flag the symbol as weak but failed to set its address, which must be computed from an "auxiliary" symbol entry that follows the weak symbol. Fixes #21556.
* rts: Add debug output on ocResolve failureBen Gamari2022-05-131-1/+4
| | | | This makes it easier to see how resolution failures nest.
* Add mention of -hi to RTS --helpMatthew Pickering2022-05-111-0/+2
| | | | Fixes #21546
* rts: Drop setExecutableBen Gamari2022-05-113-29/+0
| | | | | Since f6e366c058b136f0789a42222b8189510a3693d1 setExecutable has been dead code. Drop it.
* hadrian: Only copy and install libffi headers when using in-tree libffiMatthew Pickering2022-05-101-3/+3
| | | | | | | | When passed `--use-system-libffi` then we shouldn't copy and install the headers from the system package. Instead the headers are expected to be available as a runtime dependency on the user's system. Fixes #21485 #21487
* Respect -po when heap profiling (#21446)Teo Camarasu2022-05-091-15/+21
|
* rts: Ensure that XMM registers are preserved on Win64Ben Gamari2022-05-052-27/+38
| | | | | | | | Previously we only preserved the bottom 64-bits of the callee-saved 128-bit XMM registers, in violation of the Win64 calling convention. Fix this. Fixes #21465.
* adjustors: align comment about number of integer like arguments with ↵Adam Sandberg Ericsson2022-05-051-3/+3
| | | | implementation for Amd64+MinGW implementation
* rts/ghc.mk: Only build StgCRunAsm.S when it is neededBen Gamari2022-05-041-0/+3
| | | | | | | | Previously the make build system unconditionally included StgCRunAsm.S in the link, meaning that the RTS would require an execstack unnecessarily. Fixes #21478.
* rts/m32: Fix assertion failureBen Gamari2022-04-301-0/+3
| | | | | | | | | | This fixes an assertion failure in the m32 allocator due to the imprecisely specified preconditions of `m32_allocator_push_filled_list`. Specifically, the caller must ensure that the page type is set to filled prior to calling `m32_allocator_push_filled_list`. While this issue did result in an assertion failure in the debug RTS, the issue is in fact benign.
* rts: Refactor handling of dead threads' stacksBen Gamari2022-04-295-11/+31
| | | | | | | | | | | | | | | | This fixes a bug that @JunmingZhao42 and I noticed while working on her MMTK port. Specifically, in stg_stop_thread we used stg_enter_info as a sentinel at the tail of a stack after a thread has completed. However, stg_enter_info expects to have a two-field payload, which we do not push. Consequently, if the GC ends up somehow scanning the stack it will attempt to interpret data past the end of the stack as the frame's fields, resulting in unsound behavior. To fix this I eliminate this hacky use of `stg_stop_thread` and instead introduce a new stack frame type, `stg_dead_thread_info`. Not only does this eliminate the potential for the previously mentioned memory unsoundness but it also more clearly captures the intended structure of the dead threads' stacks.
* Revert "rts: Refactor handling of dead threads' stacks"Matthew Pickering2022-04-285-29/+9
| | | | This reverts commit e09afbf2a998beea7783e3de5dce5dd3c6ff23db.
* rts: add some more documentation to StgWeak closure typeAdam Sandberg Ericsson2022-04-271-2/+13
|
* Enable eventlog support in all ways by defaultBen Gamari2022-04-274-9/+10
| | | | | | | | | | | | | | | | | Here we deprecate the eventlogging RTS ways and instead enable eventlog support in the remaining ways. This simplifies packaging and reduces GHC compilation times (as we can eliminate two whole compilations of the RTS) while simplifying the end-user story. The trade-off is a small increase in binary sizes in the case that the user does not want eventlogging support, but we think that this is a fine trade-off. This also revealed a latent RTS bug: some files which included `Cmm.h` also assumed that it defined various macros which were in fact defined by `Config.h`, which `Cmm.h` did not include. Fixing this in turn revealed that `StgMiscClosures.cmm` failed to import various spinlock statistics counters, as evidenced by the failed unregisterised build. Closes #18948.
* rts/eventlog: Don't attempt to flush if there is no writerBen Gamari2022-04-271-0/+8
| | | | If the user has not configured a writer then there is nothing to flush.
* rts: state explicitly what evacuate and scavenge mean in the copying gcAdam Sandberg Ericsson2022-04-272-1/+9
|
* Add note about inefficiency in returnMemoryToOSFabian Thorand2022-04-271-0/+8
|
* Defer freeing of mega block groupsFabian Thorand2022-04-273-35/+245
| Solves the quadratic worst case performance of freeing megablocks that
| was described in issue #19897.
|
| During GC runs, we now keep a secondary free list for megablocks that is
| neither sorted nor coalesced. That way, free becomes an O(1) operation at
| the expense of not being able to reuse memory for larger allocations. At
| the end of a GC run, the secondary free list is sorted and then merged
| into the actual free list in a single pass. That way, our worst case
| performance is O(n log(n)) rather than O(n^2).
|
| We postulate that temporarily losing coalescence during a single GC run
| won't have any adverse effects in practice because:
|
| - We would need to release enough memory during the GC, and then after
|   that (but within the same GC run) allocate a megablock group of more
|   than one megablock. This seems unlikely, as large objects are not
|   copied during GC, and so we shouldn't need such large allocations
|   during a GC run.
| - Allocations of megablock groups of more than one megablock are rare.
|   They only happen when a single heap object is large enough to require
|   that amount of space. Any allocation areas that are supposed to hold
|   more than one heap object cannot use megablock groups, because only
|   the first megablock of a megablock group has valid `bdescr`s. Thus,
|   heap objects can only start in the first megablock of a group, not in
|   later ones.
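The deferred-free scheme described above can be sketched in C roughly as follows. This is an illustration under simplifying assumptions, not the block allocator's code: groups are modelled as (start, count) pairs in a fixed-size array rather than linked `bdescr`s, and all names (`Group`, `GroupList`, `deferred_free`, `commit_deferred`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A free group of contiguous megablocks: [start, start + count). */
typedef struct { size_t start; size_t count; } Group;

#define MAX_GROUPS 64
typedef struct { Group items[MAX_GROUPS]; size_t len; } GroupList;

/* O(1) free during GC: append without sorting or coalescing. */
void deferred_free(GroupList *deferred, size_t start, size_t count)
{
    assert(deferred->len < MAX_GROUPS);
    deferred->items[deferred->len++] = (Group){ start, count };
}

/* qsort comparator: order groups by start address. */
static int by_start(const void *a, const void *b)
{
    size_t x = ((const Group *)a)->start, y = ((const Group *)b)->start;
    return (x > y) - (x < y);
}

/* Append g to out, coalescing with the last group if they are adjacent. */
static void push_coalesced(GroupList *out, Group g)
{
    if (out->len > 0) {
        Group *last = &out->items[out->len - 1];
        if (last->start + last->count == g.start) {
            last->count += g.count;
            return;
        }
    }
    assert(out->len < MAX_GROUPS);
    out->items[out->len++] = g;
}

/* End of GC: sort the deferred list (O(n log n)), then merge it into the
 * already sorted free list in a single pass, coalescing neighbours. */
void commit_deferred(GroupList *free_list, GroupList *deferred)
{
    GroupList merged = { .len = 0 };
    size_t i = 0, j = 0;

    qsort(deferred->items, deferred->len, sizeof(Group), by_start);
    while (i < free_list->len || j < deferred->len) {
        int take_free = j == deferred->len ||
            (i < free_list->len &&
             free_list->items[i].start < deferred->items[j].start);
        push_coalesced(&merged, take_free ? free_list->items[i++]
                                          : deferred->items[j++]);
    }
    *free_list = merged;
    deferred->len = 0;
}
```

The trade-off is the one the commit message names: between `deferred_free` and `commit_deferred`, adjacent free groups are not visible as one larger group, so they cannot satisfy a larger allocation until the end of the GC run.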
* rts: Improve documentation of closure typesBen Gamari2022-04-251-13/+35
| | | | Also drops the unused TREC_COMMITTED transaction state.
* rts: Refactor handling of dead threads' stacksBen Gamari2022-04-255-9/+29
| | | | | | | | | | | | | | | | This fixes a bug that @JunmingZhao42 and I noticed while working on her MMTK port. Specifically, in stg_stop_thread we used stg_enter_info as a sentinel at the tail of a stack after a thread has completed. However, stg_enter_info expects to have a two-field payload, which we do not push. Consequently, if the GC ends up somehow scanning the stack it will attempt to interpret data past the end of the stack as the frame's fields, resulting in unsound behavior. To fix this I eliminate this hacky use of `stg_stop_thread` and instead introduce a new stack frame type, `stg_dead_thread_info`. Not only does this eliminate the potential for the previously mentioned memory unsoundness but it also more clearly captures the intended structure of the dead threads' stacks.
* Drop libtool path from settings fileBen Gamari2022-04-251-1/+0
| | | | | GHC no longer uses libtool for linking and therefore this is no longer necessary.
* Ensure that wired-in exception closures aren't GC'dBen Gamari2022-04-252-0/+20
| | | | | | | | | | | | | | | As described in Note [Wired-in exceptions are not CAFfy], a small set of built-in exception closures get special treatment in the code generator, being declared as non-CAFfy despite potentially containing CAF references. The original intent of this treatment was for the RTS to then add StablePtrs for each of the closures, ensuring that they are not GC'd. However, this logic was not applied consistently and eventually removed entirely in 951c1fb0. This led to #21141. Here we fix this bug by reintroducing the StablePtrs and document the status quo. Closes #21141.
* rts: Factor out built-in GC rootsBen Gamari2022-04-251-35/+41
|
* hadrian: Clean up handling of libffi dependenciesBen Gamari2022-04-251-1/+4
|
* rts: Mark closureFlags array as constBen Gamari2022-04-222-2/+2
|
* rts: Introduce ip_STACK_FRAMEBen Gamari2022-04-222-66/+68
| | | | | | While debugging it is very useful to be able to determine whether a given info table is a stack frame or not. We have spare bits in the closure flags array anyway, so use one for this information.
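As a rough picture of what "use a spare bit in the closure flags array" means, consider the following C sketch. The closure types, flag names and table contents are hypothetical and far smaller than the RTS's real tables; only the technique (a const per-type flags table with one bit reserved for "is a stack frame") follows the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical closure types; the real RTS has many more. */
enum { TYPE_CONSTR, TYPE_FUN, TYPE_RET_SMALL, TYPE_UPDATE_FRAME, N_TYPES };

#define FLAG_HNF         (1u << 0)  /* head normal form */
#define FLAG_STACK_FRAME (1u << 1)  /* the spare bit put to use */

/* One flags word per closure type, indexed by the type number. */
static const uint16_t closure_flags[N_TYPES] = {
    [TYPE_CONSTR]       = FLAG_HNF,
    [TYPE_FUN]          = FLAG_HNF,
    [TYPE_RET_SMALL]    = FLAG_STACK_FRAME,
    [TYPE_UPDATE_FRAME] = FLAG_STACK_FRAME,
};

/* True if the given closure type denotes a stack frame. */
bool is_stack_frame(unsigned type)
{
    assert(type < N_TYPES);
    return (closure_flags[type] & FLAG_STACK_FRAME) != 0;
}
```

A lookup like this is cheap enough to call from assertions or from a debugger session when deciding how to interpret an unknown info table.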
* [ci skip] Drop outdated TODO in RtsAPI.cCheng Shao2022-04-211-4/+0
|
* rts: Ensure that the interpreter doesn't disregard tagsBen Gamari2022-04-151-4/+4
| | | | | Previously the interpreter's handling of `RET_BCO` stack frames would throw away the tag of the returned closure. This resulted in #21390.
* Only enable PROF_SPIN in DEBUGDylan Yudaken2022-04-151-0/+2
|
* rts: Fix off-by-one in snwprintf usagewip/windows-finalwip/windows-clang-joinBen Gamari2022-04-071-2/+5
|
* rts: Fallback to ucrtbase not msvcrtBen Gamari2022-04-071-3/+4
| | | | | Since we have switched to Clang the toolchain now links against ucrt rather than msvcrt.
* rts/CloneStack: Ensure that Rts.h is #included firstBen Gamari2022-04-071-2/+2
| | | | As is necessary on Windows.
*---. Merge branches 'wip/windows-high-codegen', 'wip/windows-high-linker', ↵Ben Gamari2022-04-0733-802/+1080
|\ \ \ | | | | | | | | | | | | 'wip/windows-clang-2' and 'wip/lint-rts-includes' into wip/windows-clang-join
| | | * rts: Fix various #include issuesBen Gamari2022-04-0616-30/+28
| | | | | | | | | | | | | | | | This fixes various violations of the newly-added RTS includes linter.