author | Sebastian Graf <sebastian.graf@kit.edu> | 2021-04-28 14:55:26 +0200 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2021-10-24 01:26:46 -0400 |
commit | 3bab222c585343f8febe2a627d280b7be9401e92 (patch) | |
tree | bb95653710d6ac277a88f8011c4e491a73531a64 /docs/users_guide/using-optimisation.rst | |
parent | 8300ca2e3bcc3e74f7524116f85688da6167bb2f (diff) | |
DmdAnal: Implement Boxity Analysis (#19871)
This patch fixes some abundant reboxing of `DynFlags` in
`GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic
of #19407) by introducing a Boxity analysis to GHC, done as part of demand
analysis. This allows the ad-hoc unboxing decisions previously made in
worker/wrapper to be captured accurately in demand analysis, where the boxity
info can propagate through demand signatures.
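For readers unfamiliar with the term, the following is a minimal sketch of the kind of situation in which reboxing can arise. It is not taken from the patch or from `warnAboutOverflowedLit`; the names `describe` and `f` are hypothetical, and whether GHC actually unboxes here depends on the compiler version and flags.

```haskell
module Main where

-- Hypothetical consumer that needs the pair as a whole (i.e. its box).
describe :: (Int, Int) -> String
describe p = "big: " ++ show p
{-# NOINLINE describe #-}

-- 'f' is strict in both components of the pair, so a strictness-only
-- heuristic would unbox the argument. One branch passes the pair on
-- unchanged, though; a worker taking x and y separately would have to
-- allocate (x, y) again for that call. That reallocation is "reboxing".
f :: (Int, Int) -> String
f p@(x, y)
  | x + y > 100 = describe p          -- uses the box of p
  | otherwise   = show (x * y)        -- uses only the fields
{-# NOINLINE f #-}

main :: IO ()
main = do
  putStrLn (f (3, 4))
  putStrLn (f (60, 70))
```

A boxity analysis can record that `p` is used boxed in one branch and let that information flow through the demand signature of `f`, rather than deciding in worker/wrapper from strictness alone.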
See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in
`Note [No lazy, Unboxed demand in demand signature]`, but
`Note [Finalising boxity for demand signature]` is probably a better entry-point.
To support the fix for #19407, I had to change (what was)
`Note [Add demands for strict constructors]` a bit
(now `Note [Unboxing evaluated arguments]`). In particular, we now take care of
it in `finaliseBoxity` (which is only called from demand analysis) instead of
`wantToUnboxArg`.
I also had to resurrect `Note [Product demands for function body]` and rename
it to `Note [Unboxed demand on function bodies returning small products]` to
avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again.
See the updated Note for details.
A nice side-effect is that the worker/wrapper transformation no longer needs to
look at strictness info and other bits such as `InsideInlineableFun` flags
(needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects
boxity info from argument demands and interprets them with a severely simplified
`wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved
to DmdAnal completely, if it weren't for the call to `dubiousDataConInstArgTys`,
which would be awkward to export.
I spent some time figuring out why `T16197` failed prior to my amendments to
`Note [Unboxing evaluated arguments]`. Once I had figured it out, I minimised
it a bit and added `T16197b`, which simply compares computed
strictness signatures and thus should be far simpler to eyeball.
The 12% ghc/alloc regression in T11545 is because of the additional `Boxity`
field in `Poly` and `Prod` that results in more allocation during `lubSubDmd`
and `plusSubDmd`. I made sure in the ticky profiles that the number of calls
to those functions stayed the same. We can bear such an increase here, as we
recently improved it by -68% (in b760c1f).
T18698* regress slightly because there is more unboxing of dictionaries
happening and that causes Lint (mostly) to allocate more.
Fixes #19871, #19407, #4267, #16859, #18907 and #13331.
Metric Increase:
T11545
T18698a
T18698b
Metric Decrease:
T12425
T16577
T18223
T18282
T4267
T9961
Diffstat (limited to 'docs/users_guide/using-optimisation.rst')
-rw-r--r-- | docs/users_guide/using-optimisation.rst | 14 |
1 files changed, 14 insertions, 0 deletions
diff --git a/docs/users_guide/using-optimisation.rst b/docs/users_guide/using-optimisation.rst
index a57225da25..6d33c5b5bc 100644
--- a/docs/users_guide/using-optimisation.rst
+++ b/docs/users_guide/using-optimisation.rst
@@ -816,6 +816,20 @@ by saying ``-fno-wombat``.
     more detailed list. Usually that identifies the loop quite accurately,
     because some numbers are very large.
 
+.. ghc-flag:: -fdmd-unbox-width=⟨n⟩
+    :shortdesc: *default: 3.* Boxity analysis pretends that returned records
+        with this many fields can be unboxed.
+    :type: dynamic
+    :category:
+
+    :default: 3
+
+    Boxity analysis optimistically pretends that a function returning a record
+    with at most ``-fdmd-unbox-width`` fields has only call sites that don't
+    need the box of the returned record. That may in turn allow more argument
+    unboxing to happen. Set to 0 to be completely conservative (which guarantees
+    that no reboxing will happen due to this mechanism).
+
 .. ghc-flag:: -fspec-constr
     :shortdesc: Turn on the SpecConstr transformation. Implied by :ghc-flag:`-O2`.
     :type: dynamic
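
As an illustration of the documented behaviour of ``-fdmd-unbox-width``, here is a small sketch of a function that returns a two-field record (a pair) and, in one equation, returns its argument unchanged. The function `step` is hypothetical and not part of the patch; the comments describe what the flag documentation suggests should happen, not verified compiler output.

```haskell
module Main where

-- Hypothetical example: a loop that threads a pair through recursion.
-- The first equation returns the argument pair as is, which on its own
-- counts as a use of the pair's box. With a width of at least 2, boxity
-- analysis is documented to pretend that callers do not need the box of
-- the returned pair, which may allow the pair argument to be unboxed.
-- With a width of 0 the analysis stays conservative.
step :: Int -> (Int, Int) -> (Int, Int)
step 0 p      = p
step n (a, b) = step (n - 1) (b, a + b)

main :: IO ()
main = print (step 10 (0, 1))   -- prints (55,89)
```

One way to compare the two settings, assuming a GHC that includes this patch, is to compile once with `ghc -O -fdmd-unbox-width=0 -ddump-simpl Main.hs` and once with the default width, then compare the argument types of the generated worker for `step`.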