summaryrefslogtreecommitdiff
path: root/compiler/GHC/Core
Commit message (Collapse)AuthorAgeFilesLines
* Fix INLINE pragmas in desugarerSimon Peyton Jones2021-06-101-9/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | In #19969 we discovered that GHC has has a bug *forever* that means it sometimes essentially discarded INLINE pragams. This happened when you have * Two more more mutually recursive functions * Some of which (presumably not all!) have an INLINE pragma * Completely monomorphic. This hits a particular case in GHC.HsToCore.Binds.dsAbsBinds, which was simply wrong -- it put the INLINE pragma on the wrong binder. This patch fixes the bug, rather easily, by adjusting the no-tyvar, no-dict case of GHC.HsToCore.Binds.dsAbsBinds. I also discovered that the GHC.Core.Opt.Pipeline.shortOutIndirections was not doing a good job for {-# INLINE lcl_id #-} lcl_id = BIG gbl_id = lcl_id Here we want to transfer the stable unfolding to gbl_id (we do), but we also want to remove it from lcl_id (we were not doing that). Otherwise both Ids have large stable unfoldings. Easily fixed. Note [Transferring IdInfo] explains.
* Reword: representation instead of levitysheaf2021-06-109-67/+75
| | | | fixes #19756, updates haddock submodule
* Do not add unfoldings to lambda-bindersSimon Peyton Jones2021-06-104-88/+21
| | | | | | | | | | | | | | | | | | For reasons described in GHC.Core.Opt.Simplify Historical Note [Case binders and join points], we used to keep a Core unfolding in one of the lambda-binders for a join point. But this was always a gross hack -- it's very odd to have an unfolding in a lambda binder, that refers to earlier lambda binders. The hack bit us in various ways: * Most seriously, it is incompatible with linear types in Core. * It complicated demand analysis, and could worsen results * It required extra care in the simplifier (simplLamBinder) * It complicated !5641 (look for "join binder unfoldings") So this patch just removes the hack. Happily, doind so turned out to have no effect on performance.
* Make Logger independent of DynFlagsSylvain Henry2021-06-0710-231/+149
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce LogFlags as a independent subset of DynFlags used for logging. As a consequence in many places we don't have to pass both Logger and DynFlags anymore. The main reason for this refactoring is that I want to refactor the systools interfaces: for now many systools functions use DynFlags both to use the Logger and to fetch their parameters (e.g. ldInputs for the linker). I'm interested in refactoring the way they fetch their parameters (i.e. use dedicated XxxOpts data types instead of DynFlags) for #19877. But if I did this refactoring before refactoring the Logger, we would have duplicate parameters (e.g. ldInputs from DynFlags and linkerInputs from LinkerOpts). Hence this patch first. Some flags don't really belong to LogFlags because they are subsystem specific (e.g. most DumpFlags). For example -ddump-asm should better be passed in NCGConfig somehow. This patch doesn't fix this tight coupling: the dump flags are part of the UI but they are passed all the way down for example to infer the file name for the dumps. Because LogFlags are a subset of the DynFlags, we must update the former when the latter changes (not so often). As a consequence we now use accessors to read/write DynFlags in HscEnv instead of using `hsc_dflags` directly. In the process I've also made some subsystems less dependent on DynFlags: - CmmToAsm: by passing some missing flags via NCGConfig (see new fields in GHC.CmmToAsm.Config) - Core.Opt.*: - by passing -dinline-check value into UnfoldingOpts - by fixing some Core passes interfaces (e.g. CallArity, FloatIn) that took DynFlags argument for no good reason. - as a side-effect GHC.Core.Opt.Pipeline.doCorePass is much less convoluted.
* Drop absent bindings in worker/wrapperSimon Peyton Jones2021-06-056-75/+138
| | | | | | | | | | | | | | | | | | | | | | | Consider this (from #19824) let t = ...big... in ...(f t x)... were `f` ignores its first argument. With luck f's wrapper will inline thereby dropping `t`, but maybe not: the arguments to f all look boring. So we pre-empt the problem by replacing t's RHS with an absent filler during w/w. Simple and effective. The main payload is the new `isAbsDmd` case in `tryWw`, but there are some other minor refactorings: * To implment this I had to refactor `mk_absent_let` to `mkAbsentFiller`, which can be called from `tryWW`. * wwExpr took both WwOpts and DynFlags which seems silly. I combined them into one. * I renamed the historical mkInineRule to mkWrapperUnfolding
* Re-do rubbish literalsSimon Peyton Jones2021-06-054-22/+38
| | | | | | | | | | As #19882 pointed out, we were simply doing rubbish literals wrong. (I'll refrain from explaining the wrong-ness here -- see the ticket.) This patch fixes it by adding a Type (of kind RuntimeRep) as field of LitRubbish, rather than [PrimRep]. The Note [Rubbish literals] in GHC.Types.Literal explains the details.
* Avoid useless w/w split, take 2Simon Peyton Jones2021-06-052-20/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit: commit c6faa42bfb954445c09c5680afd4fb875ef03758 Author: Simon Peyton Jones <simonpj@microsoft.com> Date: Mon Mar 9 10:20:42 2020 +0000 Avoid useless w/w split This patch is just a tidy-up for the post-strictness-analysis worker wrapper split. Consider f x = x Strictnesss analysis does not lead to a w/w split, so the obvious thing is to leave it 100% alone. But actually, because the RHS is small, we ended up adding a StableUnfolding for it. There is some reason to do this if we choose /not/ do to w/w on the grounds that the function is small. See Note [Don't w/w inline small non-loop-breaker things] But there is no reason if we would not have done w/w anyway. This patch just moves the conditional to later. Easy. turns out to have a bug in it. Instead of /moving/ the conditional, I /duplicated/ it. Then in a subsequent unrelated tidy-up (087ac4eb) I removed the second (redundant) test! This patch does what I originally intended. There is also a small refactoring in GHC.Core.Unfold, to make the code clearer, but with no change in behaviour. It does, however, have a generally good effect on compile times, because we aren't dealing with so many silly stable unfoldings. Here are the non-zero changes: Metrics: compile_time/bytes allocated ------------------------------------- Baseline Test Metric value New value Change --------------------------------------------------------------------------- ManyAlternatives(normal) ghc/alloc 791969344.0 792665048.0 +0.1% ManyConstructors(normal) ghc/alloc 4351126824.0 4358303528.0 +0.2% PmSeriesG(normal) ghc/alloc 50362552.0 50482208.0 +0.2% PmSeriesS(normal) ghc/alloc 63733024.0 63619912.0 -0.2% T10421(normal) ghc/alloc 121224624.0 119695448.0 -1.3% GOOD T10421a(normal) ghc/alloc 85256392.0 83714224.0 -1.8% T10547(normal) ghc/alloc 29253072.0 29258256.0 +0.0% T10858(normal) ghc/alloc 189343152.0 187972328.0 -0.7% T11195(normal) ghc/alloc 281208248.0 279727584.0 -0.5% T11276(normal) ghc/alloc 141966952.0 142046224.0 +0.1% T11303b(normal) ghc/alloc 46228360.0 46259024.0 +0.1% T11545(normal) ghc/alloc 2663128768.0 2667412656.0 +0.2% T11822(normal) ghc/alloc 138686944.0 138760176.0 +0.1% T12227(normal) ghc/alloc 482836000.0 475421056.0 -1.5% GOOD T12234(optasm) ghc/alloc 60710520.0 60781808.0 +0.1% T12425(optasm) ghc/alloc 104089000.0 104022424.0 -0.1% T12545(normal) ghc/alloc 1711759416.0 1705711528.0 -0.4% T12707(normal) ghc/alloc 991541120.0 991921776.0 +0.0% T13035(normal) ghc/alloc 108199872.0 108370704.0 +0.2% T13056(optasm) ghc/alloc 414642544.0 412580384.0 -0.5% T13253(normal) ghc/alloc 361701272.0 355838624.0 -1.6% T13253-spj(normal) ghc/alloc 157710168.0 157397768.0 -0.2% T13379(normal) ghc/alloc 370984400.0 371345888.0 +0.1% T13701(normal) ghc/alloc 2439764144.0 2441351984.0 +0.1% T14052(ghci) ghc/alloc 2154090896.0 2156671400.0 +0.1% T15164(normal) ghc/alloc 1478517688.0 1440317696.0 -2.6% GOOD T15630(normal) ghc/alloc 178053912.0 172489808.0 -3.1% T16577(normal) ghc/alloc 7859948896.0 7854524080.0 -0.1% T17516(normal) ghc/alloc 1271520128.0 1202096488.0 -5.5% GOOD T17836(normal) ghc/alloc 1123320632.0 1123922480.0 +0.1% T17836b(normal) ghc/alloc 54526280.0 54576776.0 +0.1% T17977b(normal) ghc/alloc 42706752.0 42730544.0 +0.1% T18140(normal) ghc/alloc 108834568.0 108693816.0 -0.1% T18223(normal) ghc/alloc 5539629264.0 5579500872.0 +0.7% T18304(normal) ghc/alloc 97589720.0 97196944.0 -0.4% T18478(normal) ghc/alloc 770755472.0 771232888.0 +0.1% T18698a(normal) ghc/alloc 408691160.0 374364992.0 -8.4% GOOD T18698b(normal) ghc/alloc 492419768.0 458809408.0 -6.8% GOOD T18923(normal) ghc/alloc 72177032.0 71368824.0 -1.1% T1969(normal) ghc/alloc 803523496.0 804655112.0 +0.1% T3064(normal) ghc/alloc 198411784.0 198608512.0 +0.1% T4801(normal) ghc/alloc 312416688.0 312874976.0 +0.1% T5321Fun(normal) ghc/alloc 325230680.0 325474448.0 +0.1% T5631(normal) ghc/alloc 592064448.0 593518968.0 +0.2% T5837(normal) ghc/alloc 37691496.0 37710904.0 +0.1% T783(normal) ghc/alloc 404629536.0 405064432.0 +0.1% T9020(optasm) ghc/alloc 266004608.0 266375592.0 +0.1% T9198(normal) ghc/alloc 49221336.0 49268648.0 +0.1% T9233(normal) ghc/alloc 913464984.0 742680256.0 -18.7% GOOD T9675(optasm) ghc/alloc 552296608.0 466322000.0 -15.6% GOOD T9872a(normal) ghc/alloc 1789910616.0 1793924472.0 +0.2% T9872b(normal) ghc/alloc 2315141376.0 2310338056.0 -0.2% T9872c(normal) ghc/alloc 1840422424.0 1841567224.0 +0.1% T9872d(normal) ghc/alloc 556713248.0 556838432.0 +0.0% T9961(normal) ghc/alloc 383809160.0 384601600.0 +0.2% WWRec(normal) ghc/alloc 773751272.0 753949608.0 -2.6% GOOD Residency goes down too: Metrics: compile_time/max_bytes_used ------------------------------------ Baseline Test Metric value New value Change ----------------------------------------------------------- T10370(optasm) ghc/max 42058448.0 39481672.0 -6.1% T11545(normal) ghc/max 43641392.0 43634752.0 -0.0% T15304(normal) ghc/max 29895824.0 29439032.0 -1.5% T15630(normal) ghc/max 8822568.0 8772328.0 -0.6% T18698a(normal) ghc/max 13882536.0 13787112.0 -0.7% T18698b(normal) ghc/max 14714112.0 13836408.0 -6.0% T1969(normal) ghc/max 24724128.0 24733496.0 +0.0% T3064(normal) ghc/max 14041152.0 14034768.0 -0.0% T3294(normal) ghc/max 32769248.0 32760312.0 -0.0% T9630(normal) ghc/max 41605120.0 41572184.0 -0.1% T9675(optasm) ghc/max 18652296.0 17253480.0 -7.5% Metric Decrease: T10421 T12227 T15164 T17516 T18698a T18698b T9233 T9675 WWRec Metric Increase: T12545
* WW: Mark absent errors as diverging againSebastian Graf2021-06-023-42/+43
| | | | | | | | | | | | | | | As the now historic part of `NOTE [aBSENT_ERROR_ID]` explains, we used to have `exprIsHNF` respond True to `absentError` and give it a non-bottoming demand signature, in order to perform case-to-let on certain `case`s we used to emit that scrutinised `absentError` (Urgh). What changed, why don't we emit these questionable absent errors anymore? The absent errors in question filled in for binders that would end up in strict fields after being seq'd. Apparently, the old strictness analyser would give these binders an absent demand, but today we give them head-strict demand `1A` and thus don't replace with absent errors at all. This fixes items (1) and (2) of #19853.
* Use GHC's State monad consistentlyBen Gamari2021-05-291-1/+1
| | | | | | | | | | | | | GHC's internal State monad benefits from oneShot annotations on its state, allowing for more aggressive eta expansion. We currently don't have monad transformers with the same optimisation, so we only change uses of the pure State monad here. See #19657 and 19380. Metric Decrease: hie002
* Split GHC.Utils.Monad.State into .Strict and .LazyBen Gamari2021-05-291-1/+1
|
* Fix and slight improvement to datacon worker/wrapper notesSylvain Henry2021-05-291-0/+3
|
* Bignum: match on DataCon workers in rules (#19892)Sylvain Henry2021-05-291-7/+26
| | | | | | | | | We need to match on DataCon workers for the rules to be triggered. T13701 ghc/alloc decreases by ~2.5% on some archs Metric Decrease: T13701
* Enable strict dicts by default at -O2.Andreas Klebinger2021-05-271-17/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the common case this is a straight performance win at a compile time cost so we enable it at -O2. In rare cases it can lead to compile time regressions because of changed inlining behaviour. Which can very rarely also affect runtime performance. Increasing the inlining threshold can help to avoid this which is documented in the user guide. In terms of measured results this reduced instructions executed for nofib by 1%. However for some cases (e.g. Cabal) enabling this by default increases compile time by 2-3% so we enable it only at -O2 where it's clear that a user is willing to trade compile time for runtime. Most of the testsuite runs without -O2 so there are few perf changes. Increases: T12545/T18698: We perform more WW work because dicts are now treated strict. T9198: Also some more work because functions are now subject to W/W Decreases: T14697: Compiling empty modules. Probably because of changes inside ghc. T9203: I can't reproduce this improvement locally. Might be spurious. ------------------------- Metric Decrease: T12227 T14697 T9203 Metric Increase: T9198 T12545 T18698a T18698b -------------------------
* constant folding: Make shiftRule for Word8/16/32# types return correct typeMatthew Pickering2021-05-191-6/+6
| | | | Fixes #19851
* Make setBndrsDemandInfo work with only type variablesMatthew Pickering2021-05-191-4/+4
| | | | | | Fixes #19849 Co-authored-by: Krzysztof Gogolewski <krzysztof.gogolewski@tweag.io>
* CPR: Detect constructed products in `runRW#` apps (#19822)Sebastian Graf2021-05-191-23/+43
| | | | | | | | | | | | | | | | | | | | | In #19822, we realised that the Simplifier's new habit of floating cases into `runRW#` continuations inhibits CPR analysis from giving key functions of `text` the CPR property, such as `singleton`. This patch fixes that by anticipating part of !5667 (Nested CPR) to give `runRW#` the proper CPR transformer it now deserves: Namely, `runRW# (\s -> e)` should have the CPR property iff `e` has it. The details are in `Note [Simplification of runRW#]` in GHC.CoreToStg.Prep. The output of T18086 changed a bit: `panic` (which calls `runRW#`) now has `botCpr`. As outlined in Note [Bottom CPR iff Dead-Ending Divergence], that's OK. Fixes #19822. Metric Decrease: T9872d
* Remove transitive information about modules and packages from interface filesMatthew Pickering2021-05-192-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit modifies interface files so that *only* direct information about modules and packages is stored in the interface file. * Only direct module and direct package dependencies are stored in the interface files. * Trusted packages are now stored separately as they need to be checked transitively. * hs-boot files below the compiled module in the home module are stored so that eps_is_boot can be calculated in one-shot mode without loading all interface files in the home package. * The transitive closure of signatures is stored separately This is important for two reasons * Less recompilation is needed, as motivated by #16885, a lot of redundant compilation was triggered when adding new imports deep in the module tree as all the parent interface files had to be redundantly updated. * Checking an interface file is cheaper because you don't have to perform a transitive traversal to check the dependencies are up-to-date. In the code, places where we would have used the transitive closure, we instead compute the necessary transitive closure. The closure is not computed very often, was already happening in checkDependencies, and was already happening in getLinkDeps. Fixes #16885 ------------------------- Metric Decrease: MultiLayerModules T13701 T13719 -------------------------
* Remove useless {-# LANGUAGE CPP #-} pragmasSylvain Henry2021-05-1247-58/+41
|
* Fully remove HsVersions.hSylvain Henry2021-05-1248-151/+52
| | | | | | | | | | Replace uses of WARN macro with calls to: warnPprTrace :: Bool -> SDoc -> a -> a Remove the now unused HsVersions.h Bump haddock submodule
* Replace CPP assertions with Haskell functionsSylvain Henry2021-05-1237-307/+342
| | | | | | | | | | | | | | | There is no reason to use CPP. __LINE__ and __FILE__ macros are now better replaced with GHC's CallStack. As a bonus, assert error messages now contain more information (function name, column). Here is the mapping table (HasCallStack omitted): * ASSERT: assert :: Bool -> a -> a * MASSERT: massert :: Bool -> m () * ASSERTM: assertM :: m Bool -> m () * ASSERT2: assertPpr :: Bool -> SDoc -> a -> a * MASSERT2: massertPpr :: Bool -> SDoc -> m () * ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
* W/W: Always zap useless idInfos.Andreas Klebinger2021-05-121-2/+28
| | | | | | | | | | | tryWW used to always returns an Id with a zapped: * DmdEnv * Used Once info except in the case where the ID was guaranteed to be inlined. We now also zap the info in that case. Fixes #19818.
* Fix strictness and arity info in SpecConstrSimon Peyton Jones2021-05-112-60/+85
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In GHC.Core.Opt.SpecConstr.spec_one we were giving join-points an incorrect join-arity -- this was fallout from commit c71b220491a6ae46924cc5011b80182bcc773a58 Author: Simon Peyton Jones <simonpj@microsoft.com> Date: Thu Apr 8 23:36:24 2021 +0100 Improvements in SpecConstr * Allow under-saturated calls to specialise See Note [SpecConstr call patterns] This just allows a bit more specialisation to take place. and showed up in #19780. I refactored the code to make the new function calcSpecInfo which treats join points separately. In doing this I discovered two other small bugs: * In the Var case of argToPat we were treating UnkOcc as uninteresting, but (by omission) NoOcc as interesting. As a result we were generating SpecConstr specialisations for functions with unused arguments. But the absence anlyser does that much better; doing it here just generates more code. Easily fixed. * The lifted/unlifted test in GHC.Core.Opt.WorkWrap.Utils.mkWorkerArgs was back to front (#19794). Easily fixed. * In the same function, mkWorkerArgs, we were adding an extra argument nullary join points, which isn't necessary. I added a test for this. That in turn meant I had to remove an ASSERT in CoreToStg.mkStgRhs for nullary join points, which was always bogus but now trips; I added a comment to explain.
* Document unfolding treatment of simplLamBndr.Andreas Klebinger2021-05-111-3/+5
| | | | Fixes #19817
* Don't warn about ClassOp bindings not specialising.Andreas Klebinger2021-05-111-0/+34
| | | | Fixes #19586
* Minor refactoring in WorkWrapSimon Peyton Jones2021-05-111-15/+7
| | | | | | This patch just does the tidying up from #19805. No change in behaviour.
* Expand Note [Data con representation].Andreas Klebinger2021-05-111-4/+30
| | | | Not perfect. But I consider this to be a documentation fix for #19789.
* Fix newtype eta-reductionSimon Peyton Jones2021-05-071-12/+18
| | | | | | | | The eta-reduction we do for newype axioms was generating an inhomogeneous axiom: see #19739. This patch fixes it in a simple way; see GHC.Tc.TyCl.Build Note [Newtype eta and homogeneous axioms]
* Allow visible type application for levity-poly data consSimon Peyton Jones2021-05-073-17/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | This patch was driven by #18481, to allow visible type application for levity-polymorphic newtypes. As so often, it started simple but grew: * Significant refactor: I removed HsConLikeOut from the client-independent Language.Haskell.Syntax.Expr, and put it where it belongs, as a new constructor `ConLikeTc` in the GHC-specific extension data type for expressions, `GHC.Hs.Expr.XXExprGhcTc`. That changed touched a lot of files in a very superficial way. * Note [Typechecking data constructors] explains the main payload. The eta-expansion part is no longer done by the typechecker, but instead deferred to the desugarer, via `ConLikeTc` * A little side benefit is that I was able to restore VTA for data types with a "stupid theta": #19775. Not very important, but the code in GHC.Tc.Gen.Head.tcInferDataCon is is much, much more elegant now. * I had to refactor the levity-polymorphism checking code in GHC.HsToCore.Expr, see Note [Checking for levity-polymorphic functions] Note [Checking levity-polymorphic data constructors]
* Persist CorePrepProv into IfaceUnivCoProvSimon Peyton Jones2021-05-049-26/+30
| | | | | | | | | | | | | | | | CorePrepProv is only created in CorePrep, so I thought it wouldn't be needed in IfaceUnivCoProv. But actually IfaceSyn is used during pretty-printing, and we can certainly pretty-print things after CorePrep as #19768 showed. So the simplest thing is to represent CorePrepProv in IfaceSyn. To improve what Lint can do I also added a boolean to CorePrepProv, to record whether it is homogeneously kinded or not. It is introduced in two distinct ways (see Note [Unsafe coercions] in GHC.CoreToStg.Prep), one of which may be hetero-kinded (e.g. Int ~ Int#) beause it is casting a divergent expression; but the other is not. The boolean keeps track.
* Make sized division primops ok-for-spec (#19026)Sylvain Henry2021-04-301-14/+1
|
* Bring tcTyConScopedTyVars into scope in tcClassDecl2Ryan Scott2021-04-301-0/+30
| | | | | | | | | | | It is possible that the type variables bound by a class header will map to something different in the typechecker in the presence of `StandaloneKindSignatures`. `tcClassDecl2` was not aware of this, however, leading to #19738. To fix it, in `tcTyClDecls` we map each class `TcTyCon` to its `tcTyConScopedTyVars` as a `ClassScopedTVEnv`. We then plumb that `ClassScopedTVEnv` to `tcClassDecl2` where it can be used. Fixes #19738.
* Replace (ptext .. sLit) with `text`Sylvain Henry2021-04-296-15/+11
| | | | | | | | | | | | | | | 1. `text` is as efficient as `ptext . sLit` thanks to the rewrite rules 2. `text` is visually nicer than `ptext . sLit` 3. `ptext . sLit` encourages using one `ptext` for several `sLit` as in: ptext $ case xy of ... -> sLit ... ... -> sLit ... which may allocate SDoc's TextBeside constructors at runtime instead of sharing them into CAFs.
* Expand synonyms in mkCastTy when necessaryRyan Scott2021-04-291-20/+51
| | | | | | | | Doing so is important to maintain invariants (EQ3) and (EQ4) from `Note [Respecting definitional equality]` in `GHC.Core.TyCo.Rep`. For the details, see the new `Note [Using coreView in mk_cast_ty]`. Fixes #19742.
* Redesign withDict (formerly magicDict)Ryan Scott2021-04-291-21/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This gives a more precise type signature to `magicDict` as proposed in #16646. In addition, this replaces the constant-folding rule for `magicDict` in `GHC.Core.Opt.ConstantFold` with a special case in the desugarer in `GHC.HsToCore.Expr.dsHsWrapped`. I have also renamed `magicDict` to `withDict` in light of the discussion in https://mail.haskell.org/pipermail/ghc-devs/2021-April/019833.html. All of this has the following benefits: * `withDict` is now more type safe than before. Moreover, if a user applies `withDict` at an incorrect type, the special-casing in `dsHsWrapped` will now throw an error message indicating what the user did incorrectly. * `withDict` can now work with classes that have multiple type arguments, such as `Typeable @k a`. This means that `Data.Typeable.Internal.withTypeable` can now be implemented in terms of `withDict`. * Since the special-casing for `withDict` no longer needs to match on the structure of the expression passed as an argument to `withDict`, it no longer cares about the presence or absence of `Tick`s. In effect, this obsoletes the fix for #19667. The new `T16646` test case demonstrates the new version of `withDict` in action, both in terms of `base` functions defined in terms of `withDict` as well as in terms of functions from the `reflection` and `singletons` libraries. The `T16646Fail` test case demonstrates the error message that GHC throws when `withDict` is applied incorrectly. This fixes #16646. By adding more tests for `withDict`, this also fixes #19673 as a side effect.
* Introduce -ddump-verbose-inliningsBen Gamari2021-04-271-1/+1
| | | | | | | | Previously -ddump-inlinings and -dverbose-core2core used in conjunction would have the side-effect of dumping additional information about all inlinings considered by the simplifier. However, I have sometimes wanted this inlining information without the firehose of information produced by -dverbose-core2core. Introduce a new dump flag for this purpose.
* Eliminate unsafeEqualityProof in CorePrepSimon Peyton Jones2021-04-2611-3/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main idea here is to avoid treating * case e of {} * case unsafeEqualityProof of UnsafeRefl co -> blah specially in CoreToStg. Instead, nail them in CorePrep, by converting case e of {} ==> e |> unsafe-co case unsafeEqualityProof of UnsafeRefl cv -> blah ==> blah[unsafe-co/cv] in GHC.Core.Prep. Now expressions that we want to treat as trivial really are trivial. We can get rid of cpExprIsTrivial. And we fix #19700. A downside is that, at least under unsafeEqualityProof, we substitute in types and coercions, which is more work. But a big advantage is that it's all very simple and principled: CorePrep really gets rid of the unsafeCoerce stuff, as it does empty case, runRW#, lazyId etc. I've updated the overview in GHC.Core.Prep, and added Note [Unsafe coercions] in GHC.Core.Prep Note [Implementing unsafeCoerce] in base:Unsafe.Coerce We get 3% fewer bytes allocated when compiling perf/compiler/T5631, which uses a lot of unsafeCoerces. (It's a happy-generated parser.) Metric Decrease: T5631
* Fix occAnalAppSimon Peyton Jones2021-04-201-5/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In OccurAnal the function occAnalApp was failing to reset occ_encl to OccVanilla. This omission sometimes resulted in over-pessimistic occurrence information. I tripped over this when analysing eta-expansions. Compile times in perf/compiler fell slightly (no increases) PmSeriesG(normal) ghc/alloc 50738104.0 50580440.0 -0.3% PmSeriesS(normal) ghc/alloc 64045284.0 63739384.0 -0.5% PmSeriesT(normal) ghc/alloc 94430324.0 93800688.0 -0.7% PmSeriesV(normal) ghc/alloc 63051056.0 62758240.0 -0.5% T10547(normal) ghc/alloc 29322840.0 29307784.0 -0.1% T10858(normal) ghc/alloc 191988716.0 189801744.0 -1.1% T11195(normal) ghc/alloc 282654016.0 281839440.0 -0.3% T11276(normal) ghc/alloc 142994648.0 142338688.0 -0.5% T11303b(normal) ghc/alloc 46435532.0 46343376.0 -0.2% T11374(normal) ghc/alloc 256866536.0 255653056.0 -0.5% T11822(normal) ghc/alloc 140210356.0 138935296.0 -0.9% T12234(optasm) ghc/alloc 60753880.0 60720648.0 -0.1% T14052(ghci) ghc/alloc 2235105796.0 2230906584.0 -0.2% T17096(normal) ghc/alloc 297725396.0 296237112.0 -0.5% T17836(normal) ghc/alloc 1127785292.0 1125316160.0 -0.2% T17836b(normal) ghc/alloc 54761928.0 54637592.0 -0.2% T17977(normal) ghc/alloc 47529464.0 47397048.0 -0.3% T17977b(normal) ghc/alloc 42906972.0 42809824.0 -0.2% T18478(normal) ghc/alloc 777385708.0 774219280.0 -0.4% T18698a(normal) ghc/alloc 415097664.0 409009120.0 -1.5% GOOD T18698b(normal) ghc/alloc 500082104.0 493124016.0 -1.4% GOOD T18923(normal) ghc/alloc 72252364.0 72216016.0 -0.1% T1969(normal) ghc/alloc 811581860.0 804883136.0 -0.8% T5837(normal) ghc/alloc 37688048.0 37666288.0 -0.1% Nice! Metric Decrease: T18698a T18698b
* Worker/wrapper: Refactor CPR WW to work for nested CPR (#18174)wip/nested-cpr-wwSebastian Graf2021-04-202-296/+464
| | | | | | | | | | | | | | | | | | | | | In another small step towards bringing a manageable variant of Nested CPR into GHC, this patch refactors worker/wrapper to be able to exploit Nested CPR signatures. See the new Note [Worker/wrapper for CPR]. The nested code path is currently not triggered, though, because all signatures that we annotate are still flat. So purely a refactoring. I am very confident that it works, because I ripped it off !1866 95% unchanged. A few test case outputs changed, but only it's auxiliary names only. I also added test cases for #18109 and #18401. There's a 2.6% metric increase in T13056 after a rebase, caused by an additional Simplifier run. It appears b1d0b9c saw a similar additional iteration. I think it's just a fluke. Metric Increase: T13056
* Worker/wrapper: Consistent namesSebastian Graf2021-04-207-16/+16
|
* Refactor around `wantToUnbox`Sebastian Graf2021-04-203-81/+69
| | | | | | | | | | I renamed `wantToUnbox` to `wantToUnboxArg` and then introduced `wantToUnboxResult`, which we call in `mkWWcpr_one` now. I also deleted `splitArgType_maybe` (the single call site outside of `wantToUnboxArg` actually cared about the result type of a function, not an argument) and `splitResultType_maybe` (which is entirely superceded by `wantToUnboxResult`.
* Factor out DynFlags from WorkWrap.UtilsSebastian Graf2021-04-206-112/+148
| | | | | | | | | Plus a few minor refactorings: * Introduce `normSplitTyConApp_maybe` to Core.Utils * Reduce boolean blindness in the Bool argument to `wantToUnbox` * Let `wantToUnbox` also decide when to drop an argument, cleaning up `mkWWstr_one`
* Fix Haddock referenceBen Gamari2021-04-181-1/+1
|
* Improvements in SpecConstrwip/T19672Simon Peyton Jones2021-04-172-90/+132
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Allow under-saturated calls to specialise See Note [SpecConstr call patterns] This just allows a bit more specialisation to take place. * Don't discard calls from un-specialised RHSs. This was a plain bug in `specialise`, again leading to loss of specialisation. Refactoring yields an `otherwise` case that is easier to grok. * I refactored CallPat to become a proper data type, not a tuple. All this came up when I was working on eta-reduction. The ticket is #19672. The nofib results are mostly zero, with a couple of big wins: Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- awards +0.2% -0.1% -18.7% -18.8% 0.0% comp_lab_zift +0.2% -0.2% -23.9% -23.9% 0.0% fft2 +0.2% -1.0% -34.9% -36.6% 0.0% hpg +0.2% -0.3% -18.4% -18.4% 0.0% mate +0.2% -15.7% -19.3% -19.3% +11.1% parser +0.2% +0.6% -16.3% -16.3% 0.0% puzzle +0.4% -19.7% -33.7% -34.0% 0.0% rewrite +0.2% -0.5% -20.7% -20.7% 0.0% -------------------------------------------------------------------------------- Min +0.2% -19.7% -48.1% -48.9% 0.0% Max +0.4% +0.6% -1.2% -1.1% +11.1% Geometric Mean +0.2% -0.4% -21.0% -21.1% +0.1% I investigated the 0.6% increase on 'parser'. It comes because SpecConstr has a limit of 3 specialisations. With HEAD, hsDoExpr has 2 specialisations, and then a further several from the specialised bodies, of which 1 is picked. With this patch we get 3 specialisations right off the bat, so we discard all from the recursive calls. Turns out that that's not the best choice, but there is no way to tell that. I'm accepting it. NB: these figures actually come from this patch plus the preceding one for StgCSE, but I think the gains come from SpecConstr.
* Add isInjectiveTyCon check to opt_univ (fixes #19509)Adam Gundry2021-04-141-0/+18
|
* Make the specialiser handle polymorphic specialisationSimon Peyton Jones2021-04-131-50/+172
| | | | | | | | | | | | | | | | | | | | | | | | | | | Ticket #13873 unexpectedly showed that a SPECIALISE pragma made a program run (a lot) slower, because less specialisation took place overall. It turned out that the specialiser was missing opportunities because of quantified type variables. It was quite easy to fix. The story is given in Note [Specialising polymorphic dictionaries] Two other minor fixes in the specialiser * There is no benefit in specialising data constructor /wrappers/. (They can appear overloaded because they are given a dictionary to store in the constructor.) Small guard in canSpecImport. * There was a buglet in the UnspecArg case of specHeader, in the case where there is a dead binder. We need a LitRubbish filler for the specUnfolding stuff. I expanded Note [Drop dead args from specialisations] to explain. There is a 4% increase in compile time for T13056, because we generate more specialised code. This seems OK. Metric Increase: T13056
* Fix magicDict in ghci (and in the presence of other ticks)Matthew Pickering2021-04-101-2/+2
| | | | | | | | | | | The problem was that ghci inserts some ticks around the crucial bit of the expression. Just like in some of the other rules we now strip the ticks so that the rule fires more reliably. It was possible to defeat magicDict by using -fhpc as well, so not just an issue in ghci. Fixes #19667 and related to #19673
* Add missing relational constant folding for sized numeric typesJohn Ericson2021-04-101-11/+67
|
* Re-export GHC.Bits from GHC.Prelude with custom shift implementation.Andreas Klebinger2021-04-091-13/+12
| | | | | | | This allows us to use the unsafe shifts in non-debug builds for performance. For older versions of base we instead export Data.Bits See also #19618
* CoreTidy: handle special cases to preserve more sharing.Sylvain Henry2021-04-091-1/+2
| | | | | Metric Decrease: T16577
* CoreTidy: enhance strictness noteBen Gamari2021-04-091-5/+15
|