author | Douglas Wilson <douglas.wilson@gmail.com> | 2021-04-02 11:48:45 +0100 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2021-05-12 18:43:44 -0400 |
commit | 13095b9b92b609ce527adae441dbcb0ea72a5c22 (patch) | |
tree | b0d42e6c04772bf9349815c2ce679b8e92c450a9 | |
parent | 581679a80f60b5c21ebf683d896022dfbc171ec2 (diff) | |
[docs] release notes for !4729 + !3678
Also includes small unrelated type fix
(cherry picked from commit 70c39e220aa75608e277f311b53f8428c2abc4ff)
-rw-r--r-- | docs/users_guide/9.2.1-notes.rst | 32 |
1 files changed, 32 insertions, 0 deletions
diff --git a/docs/users_guide/9.2.1-notes.rst b/docs/users_guide/9.2.1-notes.rst
index f133449250..198137f79f 100644
--- a/docs/users_guide/9.2.1-notes.rst
+++ b/docs/users_guide/9.2.1-notes.rst
@@ -129,6 +129,15 @@ Language
 Compiler
 ~~~~~~~~
 
+- Performance of the compiler in :ghc-flag:`--make` mode with
+  :ghc-flag:`-j[⟨n⟩]` is significantly improved by improvements to the parallel
+  garbage collector noted below.
+
+  Benchmarks show a 20% decrease in wall clock time, and a 40% decrease in cpu
+  time, when compiling Cabal with ``-j4`` on linux. Improvements are more dramatic
+  with higher parallelism, and we no longer see significant degradation in wall
+  clock time as parallelism increases above 4.
+
 - New :ghc-flag:`-Wredundant-bang-patterns` flag that enables checks for "dead" bangs.
   For instance, given this program: ::
 
@@ -191,6 +200,26 @@ GHCi
 Runtime system
 ~~~~~~~~~~~~~~
 
+- The parallel garbage collector is now significantly more performant. Heavily
+  contended spinlocks have been replaced with mutexes and condition variables.
+  For most programs compiled with the threaded runtime, and run with more than
+  four capabilities, we expect minor GC pauses and GC cpu time both to be reduced.
+
+  For very short running programs (in the order of 10s of milliseconds), we have seen
+  some performance regressions. We recommend programs affected by this to either
+  compile with the single threaded runtime, or otherwise to disable the parallel
+  garbage collector with :rts-flag:`-qg ⟨gen⟩`.
+
+  We don't expect any other performance regressions, however only limited
+  benchmarking has been done. We have only benchmarked GHC and nofib and only on
+  linux.
+
+  Users are advised to reconsider the rts flags that programs are run with. If
+  you have been mitigating poor parallel GC performance by: using large
+  nurseries (:rts-flag:`-A <-A ⟨size⟩>`), disabling load balancing (:rts-flag:`-qb ⟨gen⟩`), or
+  limiting parallel GC to older generations (:rts-flag:`-qg ⟨gen⟩`); then you may
+  find these mitigations are no longer necessary.
+
 - The heap profiler now has proper treatment of pinned ``ByteArray#``\ s. Such
   heap objects will now be correctly attributed to their appropriate cost
   centre instead of merely being lumped into the ``PINNED`` category.
@@ -212,6 +241,9 @@ Runtime system
   is returned is controlled by the :rts-flag:`-Fd ⟨factor⟩`. Memory return is
   triggered by consecutive idle collections.
 
+- The default nursery size, :rts-flag:`-A <-A ⟨size⟩>`, has been increased from
+  1mb to 4mb.
+
 Template Haskell
 ~~~~~~~~~~~~~~~~
 
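
The notes in this patch advise users to reconsider the RTS flags their programs run with. A first step is to check which GC settings a program actually ends up with. The following is a minimal sketch, not part of the commit, that prints them using ``GHC.RTSFlags`` from ``base``. The ``report`` helper and the label wording are illustrative assumptions, and the comment about units reflects my understanding of the RTS rather than anything stated in the patch.

```haskell
-- Sketch only: print the GC-related RTS settings of the running process.
-- `report` and its label text are hypothetical names chosen for this example.
import GHC.RTSFlags (GCFlags (..), ParFlags (..), getGCFlags, getParFlags)

-- Build one human-readable line per setting mentioned in the release notes.
report :: GCFlags -> ParFlags -> [String]
report gc par =
  [ -- As I understand it, the RTS stores this value in blocks, not bytes.
    "nursery size (-A), in RTS blocks: " ++ show (minAllocAreaSize gc)
  , "capabilities (-N): "                ++ show (nCapabilities par)
  , "parallel GC enabled (-qg): "        ++ show (parGcEnabled par)
  , "parallel GC from generation: "      ++ show (parGcGen par)
  , "load balancing enabled (-qb): "     ++ show (parGcLoadBalancingEnabled par)
  , "load balancing from generation: "   ++ show (parGcLoadBalancingGen par)
  ]

main :: IO ()
main = do
  gc  <- getGCFlags
  par <- getParFlags
  mapM_ putStrLn (report gc par)
```

Assuming the file is saved as ``Main.hs``, it could be built with ``ghc -threaded -rtsopts Main.hs`` and run once as ``./Main +RTS -N8 -RTS`` and once with the old mitigation flags added, to compare the effective settings before deciding whether to drop them.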