| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
Previously we could construct a `Box` of a NULL pointer from the `link`
field of `StgWeak`. Now we take care to avoid ever introducing such
pointers in `collect_pointers` and ensure that the `link` field is
represented as a `Maybe` in the `Closure` type.
Fixes #21622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
clock_gettime reports the combined total or user AND system time so in
order to replicate it with getrusage we need to add both system and user
time together.
See https://stackoverflow.com/questions/7622371/getrusage-vs-clock-gettime
Some sample measurements when building Cabal with this patch
t1: rusage
t2: clock_gettime
t1: 62347518000; t2: 62347520873
t1: 62395687000; t2: 62395690171
t1: 62432435000; t2: 62432437313
t1: 62478489000; t2: 62478492465
t1: 62514990000; t2: 62514992534
t1: 62515479000; t2: 62515480327
t1: 62515485000; t2: 62515486344
Fixes #21656
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As proposed in
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7508#note_432877 and
https://gitlab.haskell.org/ghc/ghc/-/merge_requests/7508#note_434676,
`GHC.HsToCore.Ticks` is about ticks, breakpoints are separate and
backend-specific (only for the bytecode interpreter), and mix entry
writing is just for HPC.
With this split we separate out those interpreter- and HPC-specific
its, and keep the main `GHC.HsToCore.Ticks` agnostic.
Also, instead of passing the reversed list and count around, we use
`SizedSeq` which abstracts over the algorithm. This is much nicer to
avoid noise and prevents bugs.
(The bugs are not just hypothetical! I missed up the reverses on an
earlier draft of this commit.)
|
|
|
|
|
|
| |
The old name made it confusing why disabling HPC didn't disable the
entire pass. The name makes it clear --- there are other reasons to add
ticks in addition.
|
| |
|
|
|
|
|
|
|
|
| |
We originally planned to remove the flag in 9.4 but there's actually no
great rush to do so and it's probably less confusing (forever) to keep
the message around suggesting an explicit profiling option.
Fixes #21545
|
|
|
|
|
|
|
| |
It was previously disabled because of:
- a confusion about "SRT inlining" (see removed comment in this commit)
- a linker bug (overflow) in the handling of ARM64_RELOC_SUBTRACTOR
relocation: fixed by a previous commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARM64_RELOC_SUBTRACTOR relocations are paired with an
AMR64_RELOC_UNSIGNED relocation to implement: addend + sym1 - sym2
The linker was doing it in two steps, basically:
*addend <- *addend - sym2
*addend <- *addend + sym1
The first operation was likely to overflow. For example when the
relocation target was 32-bit and both sym1/sym2 were 64-bit addresses.
With the small memory model, (sym1-sym2) would fit in 32 bits but
(*addend-sym2) may not.
Now the linker does it in one step:
*addend <- *addend + sym1 - sym2
|
|
|
|
| |
Resolves #21455
|
|
|
|
|
|
| |
These were previously incorrect.
Fixes #21553.
|
|
|
|
|
| |
This introduces a global hook which is called when an exception is
thrown during finalization.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When generating an SRT for a recursive group, GHC.Cmm.Info.Build.oneSRT
filters out recursive references, as described in Note [recursive SRTs].
However, doing so for static functions would be unsound, for the reason
described in Note [Invalid optimisation: shortcutting].
However, the same argument applies to static data constructor
applications, as we discovered in #20959. Fix this by ensuring that
static data constructor applications are included in recursive SRTs.
The approach here is not entirely satisfactory, but it is a starting
point.
Fixes #20959.
|
|
|
|
|
|
|
|
|
|
| |
Here we introduce proper support for compilation of C++ objects. This
includes:
* logic in `configure` to detect the C++ toolchain and propagating this
information into the `settings` file
* logic in the driver to use the C++ toolchain when compiling C++
sources
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements #21424.
The RTS macros get_itbl and friends are extremely helpful during debugging.
However only a select few of those were available in the compiled RTS as actual symbols
as the rest were INLINE macros.
This commit marks all of them as EXTERN_INLINE. This will still inline them at use sites
but allow us to use their compiled counterparts during debugging.
This allows us to use things like `p get_fun_itbl(ptr)` in the gdb shell
since `get_fun_itbl` will now be available as symbol!
|
|
|
|
|
|
|
|
| |
Previously we would flag the symbol as weak but failed
to set its address, which must be computed from an "auxiliary"
symbol entry the follows the weak symbol.
Fixes #21556.
|
|
|
|
| |
This makes it easier to see how resolution failures nest.
|
|
|
|
| |
Fixes #21546
|
|
|
|
|
| |
Since f6e366c058b136f0789a42222b8189510a3693d1 setExecutable has been
dead code. Drop it.
|
|
|
|
|
|
|
|
| |
When passed `--use-system-libffi` then we shouldn't copy and install the
headers from the system package. Instead the headers are expected to be
available as a runtime dependency on the users system.
Fixes #21485 #21487
|
| |
|
|
|
|
|
|
|
|
| |
Previously we only preserved the bottom 64-bits of the callee-saved
128-bit XMM registers, in violation of the Win64 calling convention.
Fix this.
Fixes #21465.
|
|
|
|
| |
implementation for Amd4+MinGW implementation
|
|
|
|
|
|
|
|
| |
Previously the make build system unconditionally included StgCRunAsm.S
in the link, meaning that the RTS would require an execstack
unnecessarily.
Fixes #21478.
|
|
|
|
|
|
|
|
|
|
| |
This fixes an assertion failure in the m32 allocator due to the
imprecisely specified preconditions of `m32_allocator_push_filled_list`.
Specifically, the caller must ensure that the page type is set to filled
prior to calling `m32_allocator_push_filled_list`.
While this issue did result in an assertion failure in the debug RTS,
the issue is in fact benign.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug that @JunmingZhao42 and I noticed while working on her
MMTK port. Specifically, in stg_stop_thread we used stg_enter_info as a
sentinel at the tail of a stack after a thread has completed. However,
stg_enter_info expects to have a two-field payload, which we do not
push. Consequently, if the GC ends up somehow the stack it will attempt
to interpret data past the end of the stack as the frame's fields,
resulting in unsound behavior.
To fix this I eliminate this hacky use of `stg_stop_thread` and instead
introduce a new stack frame type, `stg_dead_thread_info`. Not only does
this eliminate the potential for the previously mentioned memory
unsoundness but it also more clearly captures the intended structure of
the dead threads' stacks.
|
|
|
|
| |
This reverts commit e09afbf2a998beea7783e3de5dce5dd3c6ff23db.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we deprecate the eventlogging RTS ways and instead enable eventlog
support in the remaining ways. This simplifies packaging and reduces GHC
compilation times (as we can eliminate two whole compilations of the RTS)
while simplifying the end-user story. The trade-off is a small increase
in binary sizes in the case that the user does not want eventlogging
support, but we think that this is a fine trade-off.
This also revealed a latent RTS bug: some files which included `Cmm.h`
also assumed that it defined various macros which were in fact defined
by `Config.h`, which `Cmm.h` did not include. Fixing this in turn
revealed that `StgMiscClosures.cmm` failed to import various spinlock
statistics counters, as evidenced by the failed unregisterised build.
Closes #18948.
|
|
|
|
| |
If the user has not configured a writer then there is nothing to flush.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Solves the quadratic worst case performance of freeing megablocks that
was described in issue #19897.
During GC runs, we now keep a secondary free list for megablocks that is
neither sorted, nor coalesced. That way, free becomes an O(1) operation
at the expense of not being able to reuse memory for larger allocations.
At the end of a GC run, the secondary free list is sorted and then
merged into the actual free list in a single pass.
That way, our worst case performance is O(n log(n)) rather than O(n^2).
We postulate that temporarily losing coalescense during a single GC run
won't have any adverse effects in practice because:
- We would need to release enough memory during the GC, and then after
that (but within the same GC run) allocate a megablock group of more
than one megablock. This seems unlikely, as large objects are not
copied during GC, and so we shouldn't need such large allocations
during a GC run.
- Allocations of megablock groups of more than one megablock are rare.
They only happen when a single heap object is large enough to require
that amount of space. Any allocation areas that are supposed to hold
more than one heap object cannot use megablock groups, because only
the first megablock of a megablock group has valid `bdescr`s. Thus,
heap object can only start in the first megablock of a group, not in
later ones.
|
|
|
|
| |
Also drops the unused TREC_COMMITTED transaction state.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug that @JunmingZhao42 and I noticed while working on her
MMTK port. Specifically, in stg_stop_thread we used stg_enter_info as a
sentinel at the tail of a stack after a thread has completed. However,
stg_enter_info expects to have a two-field payload, which we do not
push. Consequently, if the GC ends up somehow the stack it will attempt
to interpret data past the end of the stack as the frame's fields,
resulting in unsound behavior.
To fix this I eliminate this hacky use of `stg_stop_thread` and instead
introduce a new stack frame type, `stg_dead_thread_info`. Not only does
this eliminate the potential for the previously mentioned memory
unsoundness but it also more clearly captures the intended structure of
the dead threads' stacks.
|
|
|
|
|
| |
GHC no longers uses libtool for linking and therefore this is no longer
necessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As described in Note [Wired-in exceptions are not CAFfy], a small set of
built-in exception closures get special treatment in the code generator,
being declared as non-CAFfy despite potentially containing CAF
references. The original intent of this treatment for the RTS to then
add StablePtrs for each of the closures, ensuring that they are not
GC'd. However, this logic was not applied consistently and eventually
removed entirely in 951c1fb0. This lead to #21141.
Here we fix this bug by reintroducing the StablePtrs and document the
status quo.
Closes #21141.
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
While debugging it is very useful to be able to determine whether a
given info table is a stack frame or not. We have spare bits in the
closure flags array anyways, use one for this information.
|
| |
|
|
|
|
|
| |
Previously the interpreter's handling of `RET_BCO` stack frames would
throw away the tag of the returned closure. This resulted in #21390.
|
| |
|
| |
|
|
|
|
|
| |
Since we have switched to Clang the toolchain now links against
ucrt rather than msvcrt.
|
|
|
|
| |
As is necessary on Windows.
|
|\ \ \
| | | |
| | | |
| | | | |
'wip/windows-clang-2' and 'wip/lint-rts-includes' into wip/windows-clang-join
|
| | | |
| | | |
| | | |
| | | | |
This fixes various violations of the newly-added RTS includes linter.
|