| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Previously it didn't enable/disable nonmoving_gc and ticky event types
Fixes #21813
|
|
|
|
|
| |
When getuid() is not present, don't do checkSuid since it doesn't make
sense anyway on that target.
|
|
|
|
|
|
|
| |
Instead of sprinkling the codebase with
`GNU(C3)_ATTRIBUTE(__noreturn__)`, add a `STG_NORETURN` macro (for,
basically, the same thing) similar to `STG_UNUSED` and others, and
update the code to use this macro where applicable.
|
| |
|
|
|
|
|
|
|
|
| |
We originally planned to remove the flag in 9.4 but there's actually no
great rush to do so and it's probably less confusing (forever) to keep
the message around suggesting an explicit profiling option.
Fixes #21545
|
|
|
|
| |
Resolves #21455
|
|
|
|
| |
Fixes #21546
|
| |
|
|
|
|
|
|
|
| |
Align flag descriptions and acknowledge that some flags may not be
available unless the user linked with `-rtsopts` (as noted in #20961).
Fixes #20961.
|
|
|
|
|
|
|
| |
Just a tiny cleanup inspired by the following comment:
https://gitlab.haskell.org/ghc/ghc/-/issues/19437#note_334271
I was just getting familiar with rts code base so I
thought might as well do this.
|
|
|
|
|
| |
Otherwise the user may be surprised by the missing context provided by
the latter.
|
|
|
|
|
|
|
|
| |
This splits the -Dl RTS debug output into two distinct flags:
* `+RTS -Dl` shows errors and debug output which scales with at most
O(# objects)
* `+RTS -DL` shows debug output which scales with O(# symbols)t
|
| |
|
| |
|
|
|
|
|
|
| |
is used outside of the rts so we do this rather than just fish it out of
the repo in ad-hoc way, in order to make packages in this repo more
self-contained.
|
|
|
|
| |
Fixes #19995
|
|
|
|
| |
Fixes #20006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Related to #19381 #19359 #14702
After a spike in memory usage we have been conservative about returning
allocated blocks to the OS in case we are still allocating a lot and would
end up just reallocating them. The result of this was that up to 4 * live_bytes
of blocks would be retained once they were allocated even if memory usage ended up
a lot lower.
For a heap of size ~1.5G, this would result in OS memory reporting 6G which is
both misleading and worrying for users.
In long-lived server applications this results in consistent high memory
usage when the live data size is much more reasonable (for example ghcide)
Therefore we have a new (2021) strategy which starts by retaining up to 4 * live_bytes
of blocks before gradually returning uneeded memory back to the OS on subsequent
major GCs which are NOT caused by a heap overflow.
Each major GC which is NOT caused by heap overflow increases the consec_idle_gcs
counter and the amount of memory which is retained is inversely proportional to this number.
By default the excess memory retained is
oldGenFactor (controlled by -F) / 2 ^ (consec_idle_gcs * returnDecayFactor)
On a major GC caused by a heap overflow, the `consec_idle_gcs` variable is reset to 0
(as we could continue to allocate more, so retaining all the memory might make sense).
Therefore setting bigger values for `-Fd` makes the rate at which memory is returned slower.
Smaller values make it get returned faster. Setting `-Fd0` disables the
memory return completely, which is the behaviour of older GHC versions.
The default is `-Fd4` which results in the following scaling:
> mapM print [(x, 1/ (2**(x / 4))) | x <- [1 :: Double ..20]]
(1.0,0.8408964152537146)
(2.0,0.7071067811865475)
(3.0,0.5946035575013605)
(4.0,0.5)
(5.0,0.4204482076268573)
(6.0,0.35355339059327373)
(7.0,0.29730177875068026)
(8.0,0.25)
(9.0,0.21022410381342865)
(10.0,0.17677669529663687)
(11.0,0.14865088937534013)
(12.0,0.125)
(13.0,0.10511205190671433)
(14.0,8.838834764831843e-2)
(15.0,7.432544468767006e-2)
(16.0,6.25e-2)
(17.0,5.255602595335716e-2)
(18.0,4.4194173824159216e-2)
(19.0,3.716272234383503e-2)
(20.0,3.125e-2)
So after 13 consecutive GCs only 0.1 of the maximum memory used will be retained.
Further to this decay factor, the amount of memory we attempt to retain is
also influenced by the GC strategy for the oldest generation. If we are using
a copying strategy then we will need at least 2 * live_bytes for copying to take
place, so we always keep that much. If using compacting or nonmoving then we need a lower number,
so we just retain at least `1.2 * live_bytes` for some protection.
In future we might want to make this behaviour more aggressive, some
relevant literature is
> Ulan Degenbaev, Jochen Eisinger, Manfred Ernst, Ross McIlroy, and Hannes Payer. 2016. Idle time garbage collection scheduling. SIGPLAN Not. 51, 6 (June 2016), 570–583. DOI:https://doi.org/10.1145/2980983.2908106
which describes the "memory reducer" in the V8 javascript engine which
on an idle collection immediately returns as much memory as possible.
|
|
|
|
|
|
|
|
| |
This profiling mode creates bands by the address of the info table for
each closure. This provides a much more fine-grained profiling output
than any of the other profiling modes.
The `-hi` profiling mode does not require a profiling build.
|
|
|
|
|
|
|
|
|
|
| |
This patch exposes three new functions in `GHC.Profiling` which allow
heap profiling to be enabled and disabled dynamically.
1. startHeapProfTimer - Starts heap profiling with the given RTS options
2. stopHeapProfTimer - Stops heap profiling
3. requestHeapCensus - Perform a heap census on the next context
switch, regardless of whether the timer is enabled or not.
|
|
|
|
|
|
|
| |
It should be left to tooling to perform the filtering to remove these
specific closure types from the profile if desired.
Fixes #16795
|
|
|
|
|
|
|
| |
This introduces a flag, --eventlog-flush-interval, which can be used to
set an upper bound on the amount of time for which an eventlog event
will remain enqueued. This can be useful in real-time monitoring
settings.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
It is confusing that it defaults to two different things depending on
whether we are in the profiling way or not.
Use -hc if you have a profiling build
Use -hT if you have a normal build
Fixes #19031
|
|
|
|
|
|
|
| |
As noted in #9666, the mark-region GC is not compatible with heap
profiling. Also add documentation for this flag.
Closes #9666.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This gives a small increase in performance under most circumstances.
For single threaded GC the improvement is on the order of 1-2%.
For multi threaded GC the results are quite noisy but seem to
fall into the same ballpark.
Fixes #16499
|
|
|
|
|
|
|
|
| |
We currently only post the entry counters, not the other global
counters as in my experience the former are more useful. We use the heap
profiler's census period to decide when to dump.
Also spruces up the documentation surrounding ticky-ticky a bit.
|
|
|
|
| |
Fixes #18281.
|
| |
|
|
|
|
|
| |
Previously this was not easily available to the user. Fix this.
Non-moving collection lifecycle events are now reported with -lg.
|
| |
|
|
|
|
|
|
|
|
|
| |
Stack squeezing is done on context switch, not on GC or stack overflow.
Fix the documentation.
Fixes #17685
[ci skip]
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
As noted in #17606, Docker disallows the get_mempolicy syscall by
default. This caused numerous tests to fail under CI in the `debug_numa`
way. Avoid this by disabling the NUMA probing logic when --debug-numa is
in use, instead setting n_numa_nodes in RtsFlags.c.
Fixes #17606.
|
|
|
|
|
| |
Previously things like `+RTS --numa-debug` would enable NUMA support,
despite being an invalid flag.
|
| |
|
|
|
|
|
| |
The old flag, `-xn`, was quite cryptic. Here we add `--nonmoving-gc` in
addition.
|
|
|
|
|
|
| |
Sets `MiscFlags.disableDelayedOsMemoryReturn`.
See the added `Note [MADV_FREE and MADV_DONTNEED]` for details.
|
| |
|
| |
|
|
|
|
| |
This flag will enable the use of a non-moving oldest generation.
|
| |
|
|
|
|
| |
Zeros heap memory after gc freed it.
|
| |
|
|
|
|
|
|
|
|
|
| |
Previously there were numerous places in the RTS where we would fopen
with the "w" flag string. This is wrong as it will not truncate the
file. Consequently if we write less data than the previous length of the
file we will leave garbage at its end.
Fixes #16993.
|
| |
|
|
|
|
|
| |
This moves all URL references to Trac tickets to their corresponding
GitLab counterparts.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This re-applies {D5195} with fixes for i386:
* Fix unused label warnings, see {D5230} or {D5273}
* Fix a silly bug introduced by moving `#if`
{P190}
Add a RTS option -xp to load PIC object anywhere in address space. We do
this by relaxing the requirement of <0x80000000 result of
`mmapForLinker` and implying USE_CONTIGUOUS_MMAP.
We also need to change calls to `ocInit` and `ocGetNames` to avoid
dangling pointers when the address of `oc->image` is changed by
`ocAllocateSymbolExtra`.
Test Plan:
See {D5195}, also test under i386:
```
$ uname -a
Linux watashi-arch32 4.18.5-arch1-1.0-ARCH #1 SMP PREEMPT Tue Aug 28
20:45:30 CEST 2018 i686 GNU/Linux
$ cd testsuite/tests/th/ && make test
...
```
will run `./validate` on stacked diff.
Reviewers: simonmar, bgamari, alpmestan, trommler, hvr, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5289
|