author Sebastian Graf <sebastian.graf@kit.edu> 2021-04-28 14:55:26 +0200
committer Marge Bot <ben+marge-bot@smart-cactus.org> 2021-10-24 01:26:46 -0400
commit 3bab222c585343f8febe2a627d280b7be9401e92
tree bb95653710d6ac277a88f8011c4e491a73531a64 /docs/users_guide/using-optimisation.rst
parent 8300ca2e3bcc3e74f7524116f85688da6167bb2f
DmdAnal: Implement Boxity Analysis (#19871)
This patch fixes some abundant reboxing of `DynFlags` in `GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic of #19407) by introducing a Boxity analysis to GHC, done as part of demand analysis. This allows us to accurately capture, in demand analysis, the ad-hoc unboxing decisions previously made in worker/wrapper, so that the boxity info can propagate through demand signatures. See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in `Note [No lazy, Unboxed demand in demand signature]`, but `Note [Finalising boxity for demand signature]` is probably a better entry point.

To support the fix for #19407, I had to change (what was) `Note [Add demands for strict constructors]` a bit (now `Note [Unboxing evaluated arguments]`). In particular, we now take care of it in `finaliseBoxity` (which is only called from demand analysis) instead of `wantToUnboxArg`.

I also had to resurrect `Note [Product demands for function body]` and rename it to `Note [Unboxed demand on function bodies returning small products]` to avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again. See the updated Note for details.

A nice side effect is that the worker/wrapper transformation no longer needs to look at strictness info and other bits such as `InsideInlineableFun` flags (needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects boxity info from argument demands and interprets it with a severely simplified `wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved to DmdAnal completely, if it weren't for the call to `dubiousDataConInstArgTys`, which would be awkward to export.

I spent some time figuring out why `T16197` failed prior to my amendments to `Note [Unboxing evaluated arguments]`. Having figured it out, I minimised it a bit and added `T16197b`, which simply compares computed strictness signatures and thus should be far simpler to eyeball.

The 12% ghc/alloc regression in T11545 is because of the additional `Boxity` field in `Poly` and `Prod` that results in more allocation during `lubSubDmd` and `plusSubDmd`. I made sure in the ticky profiles that the number of calls to those functions stayed the same. We can bear such an increase here, as we recently improved it by -68% (in b760c1f). T18698* regress slightly because there is more unboxing of dictionaries happening and that causes Lint (mostly) to allocate more.

Fixes #19871, #19407, #4267, #16859, #18907 and #13331.

Metric Increase:
    T11545
    T18698a
    T18698b

Metric Decrease:
    T12425
    T16577
    T18223
    T18282
    T4267
    T9961
Diffstat (limited to 'docs/users_guide/using-optimisation.rst')
-rw-r--r-- docs/users_guide/using-optimisation.rst 14
1 files changed, 14 insertions, 0 deletions
diff --git a/docs/users_guide/using-optimisation.rst b/docs/users_guide/using-optimisation.rst
index a57225da25..6d33c5b5bc 100644
--- a/docs/users_guide/using-optimisation.rst
+++ b/docs/users_guide/using-optimisation.rst
@@ -816,6 +816,20 @@ by saying ``-fno-wombat``.
more detailed list. Usually that identifies the loop quite
accurately, because some numbers are very large.
+.. ghc-flag:: -fdmd-unbox-width=⟨n⟩
+ :shortdesc: *default: 3.* Boxity analysis pretends that returned records
+ with this many fields can be unboxed.
+ :type: dynamic
+ :category:
+
+ :default: 3
+
+ Boxity analysis optimistically pretends that a function returning a record
+ with at most ``-fdmd-unbox-width`` fields has only call sites that don't
+ need the box of the returned record. That may in turn allow more argument
+ unboxing to happen. Set to 0 to be completely conservative (which guarantees
+ that no reboxing will happen due to this mechanism).
+
.. ghc-flag:: -fspec-constr
:shortdesc: Turn on the SpecConstr transformation. Implied by :ghc-flag:`-O2`.
:type: dynamic
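
As an illustration of the ``-fdmd-unbox-width`` paragraph added by this patch, here is a minimal sketch of a function that returns a small record (a pair) and sometimes returns its argument unchanged. The function ``orderPair`` is hypothetical and is not taken from the GHC sources or test suite. The sketch assumes a GHC version that includes this patch, so that the flag is recognised. Setting the flag to 0 in the ``OPTIONS_GHC`` pragma asks for the conservative behaviour described above. Whether GHC actually unboxes or reboxes ``p`` under the default setting depends on the compiler version and on other optimisations.

```haskell
{-# OPTIONS_GHC -O -fdmd-unbox-width=0 #-}
-- Assumption: the compiler in use already knows -fdmd-unbox-width.
-- With width 0, boxity analysis does not pretend that callers of
-- 'orderPair' can do without the box of the returned pair.

-- Hypothetical example: returns a record with two fields.
-- In the first branch the argument 'p' is returned as is, so its box
-- is needed there; unboxing 'p' could force the function to rebox it.
orderPair :: (Int, Int) -> (Int, Int)
orderPair p@(x, y)
  | x <= y    = p        -- the boxed argument itself is the result
  | otherwise = (y, x)   -- a fresh pair is built from the fields
{-# NOINLINE orderPair #-}

main :: IO ()
main = print (orderPair (3, 1))
```

To compare the two settings, one could compile this file once as written and once with the pragma changed to ``-fdmd-unbox-width=3``, passing ``-ddump-simpl`` each time, and then inspect how the argument of ``orderPair`` is passed.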