| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In #19969 we discovered that GHC has has a bug *forever* that means it
sometimes essentially discarded INLINE pragams. This happened when you have
* Two more more mutually recursive functions
* Some of which (presumably not all!) have an INLINE pragma
* Completely monomorphic.
This hits a particular case in GHC.HsToCore.Binds.dsAbsBinds, which was
simply wrong -- it put the INLINE pragma on the wrong binder.
This patch fixes the bug, rather easily, by adjusting the
no-tyvar, no-dict case of GHC.HsToCore.Binds.dsAbsBinds.
I also discovered that the GHC.Core.Opt.Pipeline.shortOutIndirections
was not doing a good job for
{-# INLINE lcl_id #-}
lcl_id = BIG
gbl_id = lcl_id
Here we want to transfer the stable unfolding to gbl_id (we do), but
we also want to remove it from lcl_id (we were not doing that).
Otherwise both Ids have large stable unfoldings. Easily fixed.
Note [Transferring IdInfo] explains.
|
|
|
|
| |
fixes #19756, updates haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For reasons described in GHC.Core.Opt.Simplify
Historical Note [Case binders and join points],
we used to keep a Core unfolding in one of the lambda-binders
for a join point. But this was always a gross hack -- it's
very odd to have an unfolding in a lambda binder, that refers to
earlier lambda binders.
The hack bit us in various ways:
* Most seriously, it is incompatible with linear types in Core.
* It complicated demand analysis, and could worsen results
* It required extra care in the simplifier (simplLamBinder)
* It complicated !5641 (look for "join binder unfoldings")
So this patch just removes the hack. Happily, doind so turned out to
have no effect on performance.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce LogFlags as a independent subset of DynFlags used for logging.
As a consequence in many places we don't have to pass both Logger and
DynFlags anymore.
The main reason for this refactoring is that I want to refactor the
systools interfaces: for now many systools functions use DynFlags both
to use the Logger and to fetch their parameters (e.g. ldInputs for the
linker). I'm interested in refactoring the way they fetch their
parameters (i.e. use dedicated XxxOpts data types instead of DynFlags)
for #19877. But if I did this refactoring before refactoring the Logger,
we would have duplicate parameters (e.g. ldInputs from DynFlags and
linkerInputs from LinkerOpts). Hence this patch first.
Some flags don't really belong to LogFlags because they are subsystem
specific (e.g. most DumpFlags). For example -ddump-asm should better be
passed in NCGConfig somehow. This patch doesn't fix this tight coupling:
the dump flags are part of the UI but they are passed all the way down
for example to infer the file name for the dumps.
Because LogFlags are a subset of the DynFlags, we must update the former
when the latter changes (not so often). As a consequence we now use
accessors to read/write DynFlags in HscEnv instead of using `hsc_dflags`
directly.
In the process I've also made some subsystems less dependent on DynFlags:
- CmmToAsm: by passing some missing flags via NCGConfig (see new fields
in GHC.CmmToAsm.Config)
- Core.Opt.*:
- by passing -dinline-check value into UnfoldingOpts
- by fixing some Core passes interfaces (e.g. CallArity, FloatIn)
that took DynFlags argument for no good reason.
- as a side-effect GHC.Core.Opt.Pipeline.doCorePass is much less
convoluted.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider this (from #19824)
let t = ...big...
in ...(f t x)...
were `f` ignores its first argument. With luck f's wrapper will inline
thereby dropping `t`, but maybe not: the arguments to f all look boring.
So we pre-empt the problem by replacing t's RHS with an absent filler
during w/w. Simple and effective.
The main payload is the new `isAbsDmd` case in `tryWw`, but there are
some other minor refactorings:
* To implment this I had to refactor `mk_absent_let` to
`mkAbsentFiller`, which can be called from `tryWW`.
* wwExpr took both WwOpts and DynFlags which seems silly. I combined
them into one.
* I renamed the historical mkInineRule to mkWrapperUnfolding
|
|
|
|
|
|
|
|
|
|
| |
As #19882 pointed out, we were simply doing rubbish literals wrong.
(I'll refrain from explaining the wrong-ness here -- see the ticket.)
This patch fixes it by adding a Type (of kind RuntimeRep) as field of
LitRubbish, rather than [PrimRep].
The Note [Rubbish literals] in GHC.Types.Literal explains the details.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit:
commit c6faa42bfb954445c09c5680afd4fb875ef03758
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Mon Mar 9 10:20:42 2020 +0000
Avoid useless w/w split
This patch is just a tidy-up for the post-strictness-analysis
worker wrapper split. Consider
f x = x
Strictnesss analysis does not lead to a w/w split, so the
obvious thing is to leave it 100% alone. But actually, because
the RHS is small, we ended up adding a StableUnfolding for it.
There is some reason to do this if we choose /not/ do to w/w
on the grounds that the function is small. See
Note [Don't w/w inline small non-loop-breaker things]
But there is no reason if we would not have done w/w anyway.
This patch just moves the conditional to later. Easy.
turns out to have a bug in it. Instead of /moving/ the conditional,
I /duplicated/ it. Then in a subsequent unrelated tidy-up
(087ac4eb) I removed the second (redundant) test!
This patch does what I originally intended.
There is also a small refactoring in GHC.Core.Unfold, to make the
code clearer, but with no change in behaviour.
It does, however, have a generally good effect on compile times,
because we aren't dealing with so many silly stable unfoldings.
Here are the non-zero changes:
Metrics: compile_time/bytes allocated
-------------------------------------
Baseline
Test Metric value New value Change
---------------------------------------------------------------------------
ManyAlternatives(normal) ghc/alloc 791969344.0 792665048.0 +0.1%
ManyConstructors(normal) ghc/alloc 4351126824.0 4358303528.0 +0.2%
PmSeriesG(normal) ghc/alloc 50362552.0 50482208.0 +0.2%
PmSeriesS(normal) ghc/alloc 63733024.0 63619912.0 -0.2%
T10421(normal) ghc/alloc 121224624.0 119695448.0 -1.3% GOOD
T10421a(normal) ghc/alloc 85256392.0 83714224.0 -1.8%
T10547(normal) ghc/alloc 29253072.0 29258256.0 +0.0%
T10858(normal) ghc/alloc 189343152.0 187972328.0 -0.7%
T11195(normal) ghc/alloc 281208248.0 279727584.0 -0.5%
T11276(normal) ghc/alloc 141966952.0 142046224.0 +0.1%
T11303b(normal) ghc/alloc 46228360.0 46259024.0 +0.1%
T11545(normal) ghc/alloc 2663128768.0 2667412656.0 +0.2%
T11822(normal) ghc/alloc 138686944.0 138760176.0 +0.1%
T12227(normal) ghc/alloc 482836000.0 475421056.0 -1.5% GOOD
T12234(optasm) ghc/alloc 60710520.0 60781808.0 +0.1%
T12425(optasm) ghc/alloc 104089000.0 104022424.0 -0.1%
T12545(normal) ghc/alloc 1711759416.0 1705711528.0 -0.4%
T12707(normal) ghc/alloc 991541120.0 991921776.0 +0.0%
T13035(normal) ghc/alloc 108199872.0 108370704.0 +0.2%
T13056(optasm) ghc/alloc 414642544.0 412580384.0 -0.5%
T13253(normal) ghc/alloc 361701272.0 355838624.0 -1.6%
T13253-spj(normal) ghc/alloc 157710168.0 157397768.0 -0.2%
T13379(normal) ghc/alloc 370984400.0 371345888.0 +0.1%
T13701(normal) ghc/alloc 2439764144.0 2441351984.0 +0.1%
T14052(ghci) ghc/alloc 2154090896.0 2156671400.0 +0.1%
T15164(normal) ghc/alloc 1478517688.0 1440317696.0 -2.6% GOOD
T15630(normal) ghc/alloc 178053912.0 172489808.0 -3.1%
T16577(normal) ghc/alloc 7859948896.0 7854524080.0 -0.1%
T17516(normal) ghc/alloc 1271520128.0 1202096488.0 -5.5% GOOD
T17836(normal) ghc/alloc 1123320632.0 1123922480.0 +0.1%
T17836b(normal) ghc/alloc 54526280.0 54576776.0 +0.1%
T17977b(normal) ghc/alloc 42706752.0 42730544.0 +0.1%
T18140(normal) ghc/alloc 108834568.0 108693816.0 -0.1%
T18223(normal) ghc/alloc 5539629264.0 5579500872.0 +0.7%
T18304(normal) ghc/alloc 97589720.0 97196944.0 -0.4%
T18478(normal) ghc/alloc 770755472.0 771232888.0 +0.1%
T18698a(normal) ghc/alloc 408691160.0 374364992.0 -8.4% GOOD
T18698b(normal) ghc/alloc 492419768.0 458809408.0 -6.8% GOOD
T18923(normal) ghc/alloc 72177032.0 71368824.0 -1.1%
T1969(normal) ghc/alloc 803523496.0 804655112.0 +0.1%
T3064(normal) ghc/alloc 198411784.0 198608512.0 +0.1%
T4801(normal) ghc/alloc 312416688.0 312874976.0 +0.1%
T5321Fun(normal) ghc/alloc 325230680.0 325474448.0 +0.1%
T5631(normal) ghc/alloc 592064448.0 593518968.0 +0.2%
T5837(normal) ghc/alloc 37691496.0 37710904.0 +0.1%
T783(normal) ghc/alloc 404629536.0 405064432.0 +0.1%
T9020(optasm) ghc/alloc 266004608.0 266375592.0 +0.1%
T9198(normal) ghc/alloc 49221336.0 49268648.0 +0.1%
T9233(normal) ghc/alloc 913464984.0 742680256.0 -18.7% GOOD
T9675(optasm) ghc/alloc 552296608.0 466322000.0 -15.6% GOOD
T9872a(normal) ghc/alloc 1789910616.0 1793924472.0 +0.2%
T9872b(normal) ghc/alloc 2315141376.0 2310338056.0 -0.2%
T9872c(normal) ghc/alloc 1840422424.0 1841567224.0 +0.1%
T9872d(normal) ghc/alloc 556713248.0 556838432.0 +0.0%
T9961(normal) ghc/alloc 383809160.0 384601600.0 +0.2%
WWRec(normal) ghc/alloc 773751272.0 753949608.0 -2.6% GOOD
Residency goes down too:
Metrics: compile_time/max_bytes_used
------------------------------------
Baseline
Test Metric value New value Change
-----------------------------------------------------------
T10370(optasm) ghc/max 42058448.0 39481672.0 -6.1%
T11545(normal) ghc/max 43641392.0 43634752.0 -0.0%
T15304(normal) ghc/max 29895824.0 29439032.0 -1.5%
T15630(normal) ghc/max 8822568.0 8772328.0 -0.6%
T18698a(normal) ghc/max 13882536.0 13787112.0 -0.7%
T18698b(normal) ghc/max 14714112.0 13836408.0 -6.0%
T1969(normal) ghc/max 24724128.0 24733496.0 +0.0%
T3064(normal) ghc/max 14041152.0 14034768.0 -0.0%
T3294(normal) ghc/max 32769248.0 32760312.0 -0.0%
T9630(normal) ghc/max 41605120.0 41572184.0 -0.1%
T9675(optasm) ghc/max 18652296.0 17253480.0 -7.5%
Metric Decrease:
T10421
T12227
T15164
T17516
T18698a
T18698b
T9233
T9675
WWRec
Metric Increase:
T12545
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As the now historic part of `NOTE [aBSENT_ERROR_ID]` explains, we used to have
`exprIsHNF` respond True to `absentError` and give it a non-bottoming demand
signature, in order to perform case-to-let on certain `case`s we used to emit
that scrutinised `absentError` (Urgh).
What changed, why don't we emit these questionable absent errors anymore?
The absent errors in question filled in for binders that would end up in
strict fields after being seq'd. Apparently, the old strictness analyser would
give these binders an absent demand, but today we give them head-strict demand
`1A` and thus don't replace with absent errors at all.
This fixes items (1) and (2) of #19853.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GHC's internal State monad benefits from oneShot annotations on its
state, allowing for more aggressive eta expansion.
We currently don't have monad transformers with the same optimisation,
so we only change uses of the pure State monad here.
See #19657 and 19380.
Metric Decrease:
hie002
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
We need to match on DataCon workers for the rules to be triggered.
T13701 ghc/alloc decreases by ~2.5% on some archs
Metric Decrease:
T13701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the common case this is a straight performance win
at a compile time cost so we enable it at -O2.
In rare cases it can lead to compile time regressions
because of changed inlining behaviour. Which can very
rarely also affect runtime performance.
Increasing the inlining threshold can help to avoid this
which is documented in the user guide.
In terms of measured results this reduced instructions executed
for nofib by 1%.
However for some cases (e.g. Cabal) enabling this
by default increases compile time by 2-3% so we enable it only
at -O2 where it's clear that a user is willing to trade compile
time for runtime.
Most of the testsuite runs without -O2 so there are few
perf changes.
Increases:
T12545/T18698: We perform more WW work because dicts are now treated strict.
T9198: Also some more work because functions are now subject to W/W
Decreases:
T14697: Compiling empty modules. Probably because of changes inside ghc.
T9203: I can't reproduce this improvement locally. Might be spurious.
-------------------------
Metric Decrease:
T12227
T14697
T9203
Metric Increase:
T9198
T12545
T18698a
T18698b
-------------------------
|
|
|
|
| |
Fixes #19851
|
|
|
|
|
|
| |
Fixes #19849
Co-authored-by: Krzysztof Gogolewski <krzysztof.gogolewski@tweag.io>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In #19822, we realised that the Simplifier's new habit of floating cases into
`runRW#` continuations inhibits CPR analysis from giving key functions of `text`
the CPR property, such as `singleton`.
This patch fixes that by anticipating part of !5667 (Nested CPR) to give
`runRW#` the proper CPR transformer it now deserves: Namely, `runRW# (\s -> e)`
should have the CPR property iff `e` has it.
The details are in `Note [Simplification of runRW#]` in GHC.CoreToStg.Prep.
The output of T18086 changed a bit: `panic` (which calls `runRW#`) now has
`botCpr`. As outlined in Note [Bottom CPR iff Dead-Ending Divergence], that's
OK.
Fixes #19822.
Metric Decrease:
T9872d
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit modifies interface files so that *only* direct information
about modules and packages is stored in the interface file.
* Only direct module and direct package dependencies are stored in the
interface files.
* Trusted packages are now stored separately as they need to be checked
transitively.
* hs-boot files below the compiled module in the home module are stored
so that eps_is_boot can be calculated in one-shot mode without loading
all interface files in the home package.
* The transitive closure of signatures is stored separately
This is important for two reasons
* Less recompilation is needed, as motivated by #16885, a lot of
redundant compilation was triggered when adding new imports deep in the
module tree as all the parent interface files had to be redundantly
updated.
* Checking an interface file is cheaper because you don't have to
perform a transitive traversal to check the dependencies are up-to-date.
In the code, places where we would have used the transitive closure, we
instead compute the necessary transitive closure. The closure is not
computed very often, was already happening in checkDependencies, and
was already happening in getLinkDeps.
Fixes #16885
-------------------------
Metric Decrease:
MultiLayerModules
T13701
T13719
-------------------------
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Replace uses of WARN macro with calls to:
warnPprTrace :: Bool -> SDoc -> a -> a
Remove the now unused HsVersions.h
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no reason to use CPP. __LINE__ and __FILE__ macros are now
better replaced with GHC's CallStack. As a bonus, assert error messages
now contain more information (function name, column).
Here is the mapping table (HasCallStack omitted):
* ASSERT: assert :: Bool -> a -> a
* MASSERT: massert :: Bool -> m ()
* ASSERTM: assertM :: m Bool -> m ()
* ASSERT2: assertPpr :: Bool -> SDoc -> a -> a
* MASSERT2: massertPpr :: Bool -> SDoc -> m ()
* ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
|
|
|
|
|
|
|
|
|
|
|
| |
tryWW used to always returns an Id with a zapped:
* DmdEnv
* Used Once info
except in the case where the ID was guaranteed to be inlined.
We now also zap the info in that case.
Fixes #19818.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GHC.Core.Opt.SpecConstr.spec_one we were giving join-points an
incorrect join-arity -- this was fallout from
commit c71b220491a6ae46924cc5011b80182bcc773a58
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Thu Apr 8 23:36:24 2021 +0100
Improvements in SpecConstr
* Allow under-saturated calls to specialise
See Note [SpecConstr call patterns]
This just allows a bit more specialisation to take place.
and showed up in #19780. I refactored the code to make the new
function calcSpecInfo which treats join points separately.
In doing this I discovered two other small bugs:
* In the Var case of argToPat we were treating UnkOcc as
uninteresting, but (by omission) NoOcc as interesting. As a
result we were generating SpecConstr specialisations for functions
with unused arguments. But the absence anlyser does that much
better; doing it here just generates more code. Easily fixed.
* The lifted/unlifted test in GHC.Core.Opt.WorkWrap.Utils.mkWorkerArgs
was back to front (#19794). Easily fixed.
* In the same function, mkWorkerArgs, we were adding an extra argument
nullary join points, which isn't necessary. I added a test for
this. That in turn meant I had to remove an ASSERT in
CoreToStg.mkStgRhs for nullary join points, which was always bogus
but now trips; I added a comment to explain.
|
|
|
|
| |
Fixes #19817
|
|
|
|
| |
Fixes #19586
|
|
|
|
|
|
| |
This patch just does the tidying up from #19805.
No change in behaviour.
|
|
|
|
| |
Not perfect. But I consider this to be a documentation fix for #19789.
|
|
|
|
|
|
|
|
| |
The eta-reduction we do for newype axioms was generating
an inhomogeneous axiom: see #19739.
This patch fixes it in a simple way; see GHC.Tc.TyCl.Build
Note [Newtype eta and homogeneous axioms]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch was driven by #18481, to allow visible type application
for levity-polymorphic newtypes. As so often, it started simple
but grew:
* Significant refactor: I removed HsConLikeOut from the
client-independent Language.Haskell.Syntax.Expr, and put it where it
belongs, as a new constructor `ConLikeTc` in the GHC-specific extension
data type for expressions, `GHC.Hs.Expr.XXExprGhcTc`.
That changed touched a lot of files in a very superficial way.
* Note [Typechecking data constructors] explains the main payload.
The eta-expansion part is no longer done by the typechecker, but
instead deferred to the desugarer, via `ConLikeTc`
* A little side benefit is that I was able to restore VTA for
data types with a "stupid theta": #19775. Not very important,
but the code in GHC.Tc.Gen.Head.tcInferDataCon is is much, much
more elegant now.
* I had to refactor the levity-polymorphism checking code in
GHC.HsToCore.Expr, see
Note [Checking for levity-polymorphic functions]
Note [Checking levity-polymorphic data constructors]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CorePrepProv is only created in CorePrep, so I thought it wouldn't be
needed in IfaceUnivCoProv. But actually IfaceSyn is used during
pretty-printing, and we can certainly pretty-print things after
CorePrep as #19768 showed.
So the simplest thing is to represent CorePrepProv in IfaceSyn.
To improve what Lint can do I also added a boolean to CorePrepProv, to
record whether it is homogeneously kinded or not. It is introduced in
two distinct ways (see Note [Unsafe coercions] in GHC.CoreToStg.Prep),
one of which may be hetero-kinded (e.g. Int ~ Int#) beause it is
casting a divergent expression; but the other is not. The boolean
keeps track.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible that the type variables bound by a class header will map to
something different in the typechecker in the presence of
`StandaloneKindSignatures`. `tcClassDecl2` was not aware of this, however,
leading to #19738. To fix it, in `tcTyClDecls` we map each class `TcTyCon` to
its `tcTyConScopedTyVars` as a `ClassScopedTVEnv`. We then plumb that
`ClassScopedTVEnv` to `tcClassDecl2` where it can be used.
Fixes #19738.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. `text` is as efficient as `ptext . sLit` thanks to the rewrite rules
2. `text` is visually nicer than `ptext . sLit`
3. `ptext . sLit` encourages using one `ptext` for several `sLit` as in:
ptext $ case xy of
... -> sLit ...
... -> sLit ...
which may allocate SDoc's TextBeside constructors at runtime instead
of sharing them into CAFs.
|
|
|
|
|
|
|
|
| |
Doing so is important to maintain invariants (EQ3) and (EQ4) from
`Note [Respecting definitional equality]` in `GHC.Core.TyCo.Rep`. For the
details, see the new `Note [Using coreView in mk_cast_ty]`.
Fixes #19742.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This gives a more precise type signature to `magicDict` as proposed in #16646.
In addition, this replaces the constant-folding rule for `magicDict` in
`GHC.Core.Opt.ConstantFold` with a special case in the desugarer in
`GHC.HsToCore.Expr.dsHsWrapped`. I have also renamed `magicDict` to `withDict`
in light of the discussion in
https://mail.haskell.org/pipermail/ghc-devs/2021-April/019833.html.
All of this has the following benefits:
* `withDict` is now more type safe than before. Moreover, if a user applies
`withDict` at an incorrect type, the special-casing in `dsHsWrapped` will
now throw an error message indicating what the user did incorrectly.
* `withDict` can now work with classes that have multiple type arguments, such
as `Typeable @k a`. This means that `Data.Typeable.Internal.withTypeable` can
now be implemented in terms of `withDict`.
* Since the special-casing for `withDict` no longer needs to match on the
structure of the expression passed as an argument to `withDict`, it no
longer cares about the presence or absence of `Tick`s. In effect, this
obsoletes the fix for #19667.
The new `T16646` test case demonstrates the new version of `withDict` in
action, both in terms of `base` functions defined in terms of `withDict`
as well as in terms of functions from the `reflection` and `singletons`
libraries. The `T16646Fail` test case demonstrates the error message that GHC
throws when `withDict` is applied incorrectly.
This fixes #16646. By adding more tests for `withDict`, this also
fixes #19673 as a side effect.
|
|
|
|
|
|
|
|
| |
Previously -ddump-inlinings and -dverbose-core2core used in conjunction
would have the side-effect of dumping additional information about all
inlinings considered by the simplifier. However, I have sometimes wanted
this inlining information without the firehose of information produced by
-dverbose-core2core. Introduce a new dump flag for this purpose.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main idea here is to avoid treating
* case e of {}
* case unsafeEqualityProof of UnsafeRefl co -> blah
specially in CoreToStg. Instead, nail them in CorePrep,
by converting
case e of {}
==> e |> unsafe-co
case unsafeEqualityProof of UnsafeRefl cv -> blah
==> blah[unsafe-co/cv]
in GHC.Core.Prep. Now expressions that we want to treat as trivial
really are trivial. We can get rid of cpExprIsTrivial.
And we fix #19700.
A downside is that, at least under unsafeEqualityProof, we substitute
in types and coercions, which is more work. But a big advantage is
that it's all very simple and principled: CorePrep really gets rid of
the unsafeCoerce stuff, as it does empty case, runRW#, lazyId etc.
I've updated the overview in GHC.Core.Prep, and added
Note [Unsafe coercions] in GHC.Core.Prep
Note [Implementing unsafeCoerce] in base:Unsafe.Coerce
We get 3% fewer bytes allocated when compiling perf/compiler/T5631,
which uses a lot of unsafeCoerces. (It's a happy-generated parser.)
Metric Decrease:
T5631
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In OccurAnal the function occAnalApp was failing to reset occ_encl to
OccVanilla. This omission sometimes resulted in over-pessimistic
occurrence information.
I tripped over this when analysing eta-expansions.
Compile times in perf/compiler fell slightly (no increases)
PmSeriesG(normal) ghc/alloc 50738104.0 50580440.0 -0.3%
PmSeriesS(normal) ghc/alloc 64045284.0 63739384.0 -0.5%
PmSeriesT(normal) ghc/alloc 94430324.0 93800688.0 -0.7%
PmSeriesV(normal) ghc/alloc 63051056.0 62758240.0 -0.5%
T10547(normal) ghc/alloc 29322840.0 29307784.0 -0.1%
T10858(normal) ghc/alloc 191988716.0 189801744.0 -1.1%
T11195(normal) ghc/alloc 282654016.0 281839440.0 -0.3%
T11276(normal) ghc/alloc 142994648.0 142338688.0 -0.5%
T11303b(normal) ghc/alloc 46435532.0 46343376.0 -0.2%
T11374(normal) ghc/alloc 256866536.0 255653056.0 -0.5%
T11822(normal) ghc/alloc 140210356.0 138935296.0 -0.9%
T12234(optasm) ghc/alloc 60753880.0 60720648.0 -0.1%
T14052(ghci) ghc/alloc 2235105796.0 2230906584.0 -0.2%
T17096(normal) ghc/alloc 297725396.0 296237112.0 -0.5%
T17836(normal) ghc/alloc 1127785292.0 1125316160.0 -0.2%
T17836b(normal) ghc/alloc 54761928.0 54637592.0 -0.2%
T17977(normal) ghc/alloc 47529464.0 47397048.0 -0.3%
T17977b(normal) ghc/alloc 42906972.0 42809824.0 -0.2%
T18478(normal) ghc/alloc 777385708.0 774219280.0 -0.4%
T18698a(normal) ghc/alloc 415097664.0 409009120.0 -1.5% GOOD
T18698b(normal) ghc/alloc 500082104.0 493124016.0 -1.4% GOOD
T18923(normal) ghc/alloc 72252364.0 72216016.0 -0.1%
T1969(normal) ghc/alloc 811581860.0 804883136.0 -0.8%
T5837(normal) ghc/alloc 37688048.0 37666288.0 -0.1%
Nice!
Metric Decrease:
T18698a
T18698b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In another small step towards bringing a manageable variant of Nested
CPR into GHC, this patch refactors worker/wrapper to be able to exploit
Nested CPR signatures. See the new Note [Worker/wrapper for CPR].
The nested code path is currently not triggered, though, because all
signatures that we annotate are still flat. So purely a refactoring.
I am very confident that it works, because I ripped it off !1866 95%
unchanged.
A few test case outputs changed, but only it's auxiliary names only.
I also added test cases for #18109 and #18401.
There's a 2.6% metric increase in T13056 after a rebase, caused by an
additional Simplifier run. It appears b1d0b9c saw a similar additional
iteration. I think it's just a fluke.
Metric Increase:
T13056
|
| |
|
|
|
|
|
|
|
|
|
|
| |
I renamed `wantToUnbox` to `wantToUnboxArg` and then introduced
`wantToUnboxResult`, which we call in `mkWWcpr_one` now.
I also deleted `splitArgType_maybe` (the single call site outside of
`wantToUnboxArg` actually cared about the result type of a function, not
an argument) and `splitResultType_maybe` (which is entirely superceded
by `wantToUnboxResult`.
|
|
|
|
|
|
|
|
|
| |
Plus a few minor refactorings:
* Introduce `normSplitTyConApp_maybe` to Core.Utils
* Reduce boolean blindness in the Bool argument to `wantToUnbox`
* Let `wantToUnbox` also decide when to drop an argument, cleaning up
`mkWWstr_one`
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Allow under-saturated calls to specialise
See Note [SpecConstr call patterns]
This just allows a bit more specialisation to take place.
* Don't discard calls from un-specialised RHSs. This was
a plain bug in `specialise`, again leading to loss of
specialisation. Refactoring yields an `otherwise`
case that is easier to grok.
* I refactored CallPat to become a proper data type, not a tuple.
All this came up when I was working on eta-reduction. The ticket
is #19672.
The nofib results are mostly zero, with a couple of big wins:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
awards +0.2% -0.1% -18.7% -18.8% 0.0%
comp_lab_zift +0.2% -0.2% -23.9% -23.9% 0.0%
fft2 +0.2% -1.0% -34.9% -36.6% 0.0%
hpg +0.2% -0.3% -18.4% -18.4% 0.0%
mate +0.2% -15.7% -19.3% -19.3% +11.1%
parser +0.2% +0.6% -16.3% -16.3% 0.0%
puzzle +0.4% -19.7% -33.7% -34.0% 0.0%
rewrite +0.2% -0.5% -20.7% -20.7% 0.0%
--------------------------------------------------------------------------------
Min +0.2% -19.7% -48.1% -48.9% 0.0%
Max +0.4% +0.6% -1.2% -1.1% +11.1%
Geometric Mean +0.2% -0.4% -21.0% -21.1% +0.1%
I investigated the 0.6% increase on 'parser'. It comes because SpecConstr
has a limit of 3 specialisations. With HEAD, hsDoExpr has 2
specialisations, and then a further several from the specialised
bodies, of which 1 is picked. With this patch we get 3
specialisations right off the bat, so we discard all from the
recursive calls. Turns out that that's not the best choice, but there
is no way to tell that. I'm accepting it.
NB: these figures actually come from this patch plus the preceding one for
StgCSE, but I think the gains come from SpecConstr.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ticket #13873 unexpectedly showed that a SPECIALISE pragma made a
program run (a lot) slower, because less specialisation took place
overall. It turned out that the specialiser was missing opportunities
because of quantified type variables.
It was quite easy to fix. The story is given in
Note [Specialising polymorphic dictionaries]
Two other minor fixes in the specialiser
* There is no benefit in specialising data constructor /wrappers/.
(They can appear overloaded because they are given a dictionary
to store in the constructor.) Small guard in canSpecImport.
* There was a buglet in the UnspecArg case of specHeader, in the
case where there is a dead binder. We need a LitRubbish filler
for the specUnfolding stuff. I expanded
Note [Drop dead args from specialisations] to explain.
There is a 4% increase in compile time for T13056, because we generate
more specialised code. This seems OK.
Metric Increase:
T13056
|
|
|
|
|
|
|
|
|
|
|
| |
The problem was that ghci inserts some ticks around the crucial bit of
the expression. Just like in some of the other rules we now strip the
ticks so that the rule fires more reliably.
It was possible to defeat magicDict by using -fhpc as well, so not just an
issue in ghci.
Fixes #19667 and related to #19673
|
| |
|
|
|
|
|
|
|
| |
This allows us to use the unsafe shifts in non-debug builds for performance.
For older versions of base we instead export Data.Bits
See also #19618
|
|
|
|
|
| |
Metric Decrease:
T16577
|
| |
|