| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The update of the Outputable instance resulted in a slew of
documentation changes within Notes that used the old syntax.
The most important doc changes are to `Note [Demand notation]`
and the user's guide.
Fixes #19016.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As outlined in #18903, interleaving usage and strictness demands not
only means a more compact demand representation, but also allows us to
express demands that we weren't easily able to express before.
Call demands are *relative* in the sense that a call demand `Cn(cd)`
on `g` says "`g` is called `n` times. *Whenever `g` is called*, the
result is used according to `cd`". Example from #18903:
```hs
h :: Int -> Int
h m =
let g :: Int -> (Int,Int)
g 1 = (m, 0)
g n = (2 * n, 2 `div` n)
{-# NOINLINE g #-}
in case m of
1 -> 0
2 -> snd (g m)
_ -> uncurry (+) (g m)
```
Without the interleaved representation, we would just get `L` for the
strictness demand on `g`. Now we are able to express that whenever
`g` is called, its second component is used strictly in denoting `g`
by `1C1(P(1P(U),SP(U)))`. This would allow Nested CPR to unbox the
division, for example.
Fixes #18903.
While fixing regressions, I also discovered and fixed #18957.
Metric Decrease:
T13253-spj
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead, look through expandable unfoldings in `cprTransform`.
See the new Note [CPR for expandable unfoldings]:
```
Long static data structures (whether top-level or not) like
xs = x1 : xs1
xs1 = x2 : xs2
xs2 = x3 : xs3
should not get CPR signatures, because they
* Never get WW'd, so their CPR signature should be irrelevant after analysis
(in fact the signature might even be harmful for that reason)
* Would need to be inlined/expanded to see their constructed product
* Recording CPR on them blows up interface file sizes and is redundant with
their unfolding. In case of Nested CPR, this blow-up can be quadratic!
But we can't just stop giving DataCon application bindings the CPR property,
for example
fac 0 = 1
fac n = n * fac (n-1)
fac certainly has the CPR property and should be WW'd! But FloatOut will
transform the first clause to
lvl = 1
fac 0 = lvl
If lvl doesn't have the CPR property, fac won't either. But lvl doesn't have a
CPR signature to extrapolate into a CPR transformer ('cprTransform'). So
instead we keep on cprAnal'ing through *expandable* unfoldings for these arity
0 bindings via 'cprExpandUnfolding_maybe'.
In practice, GHC generates a lot of (nested) TyCon and KindRep bindings, one
for each data declaration. It's wasteful to attach CPR signatures to each of
them (and intractable in case of Nested CPR).
```
Fixes #18154.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reasons for that can be found in the wiki:
https://gitlab.haskell.org/ghc/ghc/wikis/nested-cpr/split-off-cpr
We now run CPR after demand analysis (except for after the final demand
analysis run just before code gen). CPR got its own dump flags
(`-ddump-cpr-anal`, `-ddump-cpr-signatures`), but not its own flag to
activate/deactivate. It will run with `-fstrictness`/`-fworker-wrapper`.
As explained on the wiki page, this step is necessary for a sane Nested
CPR analysis. And it has quite positive impact on compiler performance:
Metric Decrease:
T9233
T9675
T9961
T15263
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disabling worker/wrapper for NOINLINE things can cause unnecessary
reboxing of values. Consider
{-# NOINLINE f #-}
f :: Int -> a
f x = error (show x)
g :: Bool -> Bool -> Int -> Int
g True True p = f p
g False True p = p + 1
g b False p = g b True p
the strictness analysis will discover f and g are strict, but because f
has no wrapper, the worker for g will rebox p. So we get
$wg x y p# =
let p = I# p# in -- Yikes! Reboxing!
case x of
False ->
case y of
False -> $wg False True p#
True -> +# p# 1#
True ->
case y of
False -> $wg True True p#
True -> case f p of { }
g x y p = case p of (I# p#) -> $wg x y p#
Now, in this case the reboxing will float into the True branch, an so
the allocation will only happen on the error path. But it won't float
inwards if there are multiple branches that call (f p), so the reboxing
will happen on every call of g. Disaster.
Solution: do worker/wrapper even on NOINLINE things; but move the
NOINLINE pragma to the worker.
Test Plan: make test TEST="13143"
Reviewers: simonpj, bgamari, dfeuer, austin
Reviewed By: simonpj, bgamari
Subscribers: dfeuer, thomie
Differential Revision: https://phabricator.haskell.org/D3046
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
in order to have precise used-once information in the exported
strictness signatures, as well as precise used-once information on
thunks. This avoids the bad effects of #11731.
The subsequent worker-wrapper pass is responsible for removing the
demand environment part of the strictness signature. It does not run
after the final demand analyzer pass, so remove this also in CoreTidy.
The subsequent worker-wrapper pass is also responsible for removing
used-once-information from the demands and strictness signatures, as
these might not be preserved by the simplifier. This is _not_ done by
CoreTidy, because we _do_ want this information, as produced by the last
round of the demand analyzer, to be available to the code generator.
Differential Revision: https://phabricator.haskell.org/D2073
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second attempt at merging D757.
This patch implements the idea floated in Trac #9858, namely that we
should generate type-representation information at the data type
declaration site, rather than when solving a Typeable constraint.
However, this turned out quite a bit harder than I expected. I still
think it's the right thing to do, and it's done now, but it was quite
a struggle.
See particularly
* Note [Grand plan for Typeable] in TcTypeable (which is a new module)
* Note [The overall promotion story] in DataCon (clarifies existing
stuff)
The most painful bit was that to generate Typeable instances (ie
TyConRepName bindings) for every TyCon is tricky for types in ghc-prim
etc:
* We need to have enough data types around to *define* a TyCon
* Many of these types are wired-in
Also, to minimise the code generated for each data type, I wanted to
generate pure data, not CAFs with unpackCString# stuff floating about.
Performance
~~~~~~~~~~~
Three perf/compiler tests start to allocate quite a bit more. This isn't
surprising, because they all allocate zillions of data types, with
practically no other code, esp. T1969
* T1969: GHC allocates 19% more
* T4801: GHC allocates 13% more
* T5321FD: GHC allocates 13% more
* T9675: GHC allocates 11% more
* T783: GHC allocates 11% more
* T5642: GHC allocates 10% more
I'm treating this as acceptable. The payoff comes in Typeable-heavy
code.
Remaining to do
~~~~~~~~~~~~~~~
* I think that "TyCon" and "Module" are over-generic names to use for
the runtime type representations used in GHC.Typeable. Better might
be
"TrTyCon" and "TrModule". But I have not yet done this
* Add more info the the "TyCon" e.g. source location where it was
defined
* Use the new "Module" type to help with Trac Trac #10068
* It would be possible to generate TyConRepName (ie Typeable
instances) selectively rather than all the time. We'd need to persist
the information in interface files. Lacking a motivating reason I
have
not done this, but it would not be difficult.
Refactoring
~~~~~~~~~~~
As is so often the case, I ended up refactoring more than I intended.
In particular
* In TyCon, a type *family* (whether type or data) is repesented by a
FamilyTyCon
* a algebraic data type (including data/newtype instances) is
represented by AlgTyCon This wasn't true before; a data family
was represented as an AlgTyCon. There are some corresponding
changes in IfaceSyn.
* Also get rid of the (unhelpfully named) tyConParent.
* In TyCon define 'Promoted', isomorphic to Maybe, used when things are
optionally promoted; and use it elsewhere in GHC.
* Cleanup handling of knownKeyNames
* Each TyCon, including promoted TyCons, contains its TyConRepName, if
it has one. This is, in effect, the name of its Typeable instance.
Updates haddock submodule
Test Plan: Let Harbormaster validate
Reviewers: austin, hvr, goldfire
Subscribers: goldfire, thomie
Differential Revision: https://phabricator.haskell.org/D1404
GHC Trac Issues: #9858
|
|
|
|
|
|
|
|
| |
This reverts commit bef2f03e4d56d88a7e9752a7afd6a0a35616da6c.
This merge was botched
Also reverts haddock submodule.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the idea floated in Trac #9858, namely that we
should generate type-representation information at the data type
declaration site, rather than when solving a Typeable constraint.
However, this turned out quite a bit harder than I expected. I still
think it's the right thing to do, and it's done now, but it was quite
a struggle.
See particularly
* Note [Grand plan for Typeable] in TcTypeable (which is a new module)
* Note [The overall promotion story] in DataCon (clarifies existing stuff)
The most painful bit was that to generate Typeable instances (ie
TyConRepName bindings) for every TyCon is tricky for types in ghc-prim
etc:
* We need to have enough data types around to *define* a TyCon
* Many of these types are wired-in
Also, to minimise the code generated for each data type, I wanted to
generate pure data, not CAFs with unpackCString# stuff floating about.
Performance
~~~~~~~~~~~
Three perf/compiler tests start to allocate quite a bit more. This isn't
surprising, because they all allocate zillions of data types, with
practically no other code, esp. T1969
* T3294: GHC allocates 110% more (filed #11030 to track this)
* T1969: GHC allocates 30% more
* T4801: GHC allocates 14% more
* T5321FD: GHC allocates 13% more
* T783: GHC allocates 12% more
* T9675: GHC allocates 12% more
* T5642: GHC allocates 10% more
* T9961: GHC allocates 6% more
* T9203: Program allocates 54% less
I'm treating this as acceptable. The payoff comes in Typeable-heavy
code.
Remaining to do
~~~~~~~~~~~~~~~
* I think that "TyCon" and "Module" are over-generic names to use for
the runtime type representations used in GHC.Typeable. Better might be
"TrTyCon" and "TrModule". But I have not yet done this
* Add more info the the "TyCon" e.g. source location where it was
defined
* Use the new "Module" type to help with Trac Trac #10068
* It would be possible to generate TyConRepName (ie Typeable
instances) selectively rather than all the time. We'd need to persist
the information in interface files. Lacking a motivating reason I have
not done this, but it would not be difficult.
Refactoring
~~~~~~~~~~~
As is so often the case, I ended up refactoring more than I intended.
In particular
* In TyCon, a type *family* (whether type or data) is repesented by a
FamilyTyCon
* a algebraic data type (including data/newtype instances) is
represented by AlgTyCon This wasn't true before; a data family
was represented as an AlgTyCon. There are some corresponding
changes in IfaceSyn.
* Also get rid of the (unhelpfully named) tyConParent.
* In TyCon define 'Promoted', isomorphic to Maybe, used when things are
optionally promoted; and use it elsewhere in GHC.
* Cleanup handling of knownKeyNames
* Each TyCon, including promoted TyCons, contains its TyConRepName, if
it has one. This is, in effect, the name of its Typeable instance.
Requires update of the haddock submodule.
Differential Revision: https://phabricator.haskell.org/D757
|
|
catching mistakes that I had during my refactoring, and which I do not
want to do again.
|