delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Simple subsumptionwip/T17775	Simon Peyton Jones	2020-06-05	4	-9/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch simplifies GHC to use simple subsumption. Ticket #17775 Implements GHC proposal #287 https://github.com/ghc-proposals/ghc-proposals/blob/master/ proposals/0287-simplify-subsumption.rst All the motivation is described there; I will not repeat it here. The implementation payload: * tcSubType and friends become noticably simpler, because it no longer uses eta-expansion when checking subsumption. * No deeplyInstantiate or deeplySkolemise That in turn means that some tests fail, by design; they can all be fixed by eta expansion. There is a list of such changes below. Implementing the patch led me into a variety of sticky corners, so the patch includes several othe changes, some quite significant: * I made String wired-in, so that "foo" :: String rather than "foo" :: [Char] This improves error messages, and fixes #15679 * The pattern match checker relies on knowing about in-scope equality constraints, andd adds them to the desugarer's environment using addTyCsDs. But the co_fn in a FunBind was missed, and for some reason simple-subsumption ends up with dictionaries there. So I added a call to addTyCsDs. This is really part of #18049. * I moved the ic_telescope field out of Implication and into ForAllSkol instead. This is a nice win; just expresses the code much better. * There was a bug in GHC.Tc.TyCl.Instance.tcDataFamInstHeader. We called checkDataKindSig inside tc_kind_sig, /before/ solveEqualities and zonking. Obviously wrong, easily fixed. * solveLocalEqualitiesX: there was a whole mess in here, around failing fast enough. I discovered a bad latent bug where we could successfully kind-check a type signature, and use it, but have unsolved constraints that could fill in coercion holes in that signature -- aargh. It's all explained in Note [Failure in local type signatures] in GHC.Tc.Solver. Much better now. * I fixed a serious bug in anonymous type holes. IN f :: Int -> (forall a. a -> _) -> Int that "_" should be a unification variable at the /outer/ level; it cannot be instantiated to 'a'. This was plain wrong. New fields mode_lvl and mode_holes in TcTyMode, and auxiliary data type GHC.Tc.Gen.HsType.HoleMode. This fixes #16292, but makes no progress towards the more ambitious #16082 * I got sucked into an enormous refactoring of the reporting of equality errors in GHC.Tc.Errors, especially in mkEqErr1 mkTyVarEqErr misMatchMsg misMatchMsgOrCND In particular, the very tricky mkExpectedActualMsg function is gone. It took me a full day. But the result is far easier to understand. (Still not easy!) This led to various minor improvements in error output, and an enormous number of test-case error wibbles. One particular point: for occurs-check errors I now just say Can't match 'a' against '[a]' rather than using the intimidating language of "occurs check". * Pretty-printing AbsBinds Tests review * Eta expansions T11305: one eta expansion T12082: one eta expansion (undefined) T13585a: one eta expansion T3102: one eta expansion T3692: two eta expansions (tricky) T2239: two eta expansions T16473: one eta determ004: two eta expansions (undefined) annfail06: two eta (undefined) T17923: four eta expansions (a strange program indeed!) tcrun035: one eta expansion * Ambiguity check at higher rank. Now that we have simple subsumption, a type like f :: (forall a. Eq a => Int) -> Int is no longer ambiguous, because we could write g :: (forall a. Eq a => Int) -> Int g = f and it'd typecheck just fine. But f's type is a bit suspicious, and we might want to consider making the ambiguity check do a check on each sub-term. Meanwhile, these tests are accepted, whereas they were previously rejected as ambiguous: T7220a T15438 T10503 T9222 * Some more interesting error message wibbles T13381: Fine: one error (Int ~ Exp Int) rather than two (Int ~ Exp Int, Exp Int ~ Int) T9834: Small change in error (improvement) T10619: Improved T2414: Small change, due to order of unification, fine T2534: A very simple case in which a change of unification order means we get tow unsolved constraints instead of one tc211: bizarre impredicative tests; just accept this for now Updates Cabal and haddock submodules. Metric Increase: T12150 T12234 T5837 haddock.base Metric Decrease: haddock.compiler haddock.Cabal haddock.base Merge note: This appears to break the `UnliftedNewtypesDifficultUnification` test. It has been marked as broken in the interest of merging. (cherry picked from commit 66b7b195cb3dce93ed5078b80bf568efae904cc5)
*	FloatOut: Only eta-expand dead-end RHS if arity will increase (#18231)	Sebastian Graf	2020-05-28	3	-0/+52
\| \| \| \| \| \| \| \|	Otherwise we risk turning trivial RHS into non-trivial RHS, introducing unnecessary bindings in the next Simplifier run, resulting in more churn. Fixes #18231.
*	CprAnal: Don't attach CPR sigs to expandable bindings (#18154)	Sebastian Graf	2020-05-13	11	-41/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead, look through expandable unfoldings in `cprTransform`. See the new Note [CPR for expandable unfoldings]: ``` Long static data structures (whether top-level or not) like xs = x1 : xs1 xs1 = x2 : xs2 xs2 = x3 : xs3 should not get CPR signatures, because they * Never get WW'd, so their CPR signature should be irrelevant after analysis (in fact the signature might even be harmful for that reason) * Would need to be inlined/expanded to see their constructed product * Recording CPR on them blows up interface file sizes and is redundant with their unfolding. In case of Nested CPR, this blow-up can be quadratic! But we can't just stop giving DataCon application bindings the CPR property, for example fac 0 = 1 fac n = n * fac (n-1) fac certainly has the CPR property and should be WW'd! But FloatOut will transform the first clause to lvl = 1 fac 0 = lvl If lvl doesn't have the CPR property, fac won't either. But lvl doesn't have a CPR signature to extrapolate into a CPR transformer ('cprTransform'). So instead we keep on cprAnal'ing through expandable unfoldings for these arity 0 bindings via 'cprExpandUnfolding_maybe'. In practice, GHC generates a lot of (nested) TyCon and KindRep bindings, one for each data declaration. It's wasteful to attach CPR signatures to each of them (and intractable in case of Nested CPR). ``` Fixes #18154.
*	Fix specialisation for DFuns	Simon Peyton Jones	2020-05-08	2	-0/+35
\| \| \| \| \| \| \|	When specialising a DFun we must take care to saturate the unfolding. See Note [Specialising DFuns] in Specialise. Fixes #18120
*	Remove SpecConstrAnnotation (#13681)	Sylvain Henry	2020-05-05	2	-9/+2
\| \| \| \| \| \| \| \| \|	This has been deprecated since 2013. Use GHC.Types.SPEC instead. Make GHC.Exts "not-home" for haddock Metric Decrease: haddock.base
*	Mark rule args as non-tail-called	Simon Peyton Jones	2020-04-30	2	-0/+79
\| \| \| \| \| \| \| \|	This was just an omission...b I'd failed to call markAllNonTailCall on rule args. I think this bug has been here a long time, but it's quite hard to trigger. Fixes #18098
*	Unit: split and rename modules	Sylvain Henry	2020-04-30	5	-10/+10
\| \| \| \| \| \| \|	Introduce GHC.Unit.* hierarchy for everything concerning units, packages and modules. Update Haddock submodule
*	PPC NCG: Add DWARF constants and debug labels	Peter Trommler	2020-04-22	2	-19/+7
\| \| \| \|	Fixes #11261
*	Mark DataCon wrappers CONLIKE	Alexis King	2020-04-22	5	-3/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that DataCon wrappers don’t inline until phase 0 (see commit b78cc64e923716ac0512c299f42d4d0012306c05), it’s important that case-of-known-constructor and RULE matching be able to see saturated applications of DataCon wrappers in unfoldings. Making them conlike is a natural way to do it, since they are, in fact, precisely the sort of thing the CONLIKE pragma exists to solve. Fixes #18012. This also bumps the version of the parsec submodule to incorporate a patch that avoids a metric increase on the haddock perf tests. The increase was not really a flaw in this patch, as parsec was implicitly relying on inlining heuristics. The patch to parsec just adds some INLINABLE pragmas, and we get a nice performance bump out of it (well beyond the performance we lost from this patch). Metric Decrease: T12234 WWRec haddock.Cabal haddock.base haddock.compiler
*	GHC.Core.Opt renaming	Sylvain Henry	2020-04-18	6	-7/+7
\| \| \| \| \| \| \| \| \| \| \|	* GHC.Core.Op => GHC.Core.Opt * GHC.Core.Opt.Simplify.Driver => GHC.Core.Opt.Driver * GHC.Core.Opt.Tidy => GHC.Core.Tidy * GHC.Core.Opt.WorkWrap.Lib => GHC.Core.Opt.WorkWrap.Utils As discussed in: * https://mail.haskell.org/pipermail/ghc-devs/2020-April/018758.html * https://gitlab.haskell.org/ghc/ghc/issues/13009#note_264650
*	Add an INLINE pragma to Control.Category.>>>	Alexis King	2020-04-12	4	-0/+287
\| \| \| \| \| \| \|	This fixes #18013 by adding INLINE pragmas to both Control.Category.>>> and GHC.Desugar.>>>. The functional change in this patch is tiny (just two lines of pragmas!), but an accompanying Note explains in gory detail what’s going on.
*	Modules: type-checker (#13009)	Sylvain Henry	2020-04-07	1	-1/+1
\| \| \| \|	Update Haddock submodule
*	simplifier: Kill off ufKeenessFactor	Ben Gamari	2020-04-07	4	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We used to have another factor, ufKeenessFactor, which would scale the discounts before they were subtracted from the size. This was justified with the following comment: -- We multiple the raw discounts (args_discount and result_discount) -- ty opt_UnfoldingKeenessFactor because the former have to do with -- size whereas the discounts imply that there's some extra -- efficiency to be gained (e.g. beta reductions, case reductions) -- by inlining. However, this is highly suspect since it means that we subtract a scaled size from an absolute size, resulting in crazy (e.g. negative) scores in some cases (#15304). We consequently killed off ufKeenessFactor and bumped up the ufUseThreshold to compensate. Adjustment of unfolding use threshold ===================================== Since this removes a discount from our inlining heuristic, I revisited our default choice of -funfolding-use-threshold to minimize the change in overall inlining behavior. Specifically, I measured runtime allocations and executable size of nofib and the testsuite performance tests built using compilers (and core libraries) built with several values of -funfolding-use-threshold. This comes as a result of a quantitative comparison of testsuite performance and code size as a function of ufUseThreshold, comparing GHC trees using values of 50, 60, 70, 80, 90, and 100. The test set consisted of nofib and the testsuite performance tests. A full summary of these measurements are found in the description of !2608 Comparing executable sizes (relative to the base commit) across all nofib tests, we see that sizes are similar to the baseline: gmean min max median thresh 50 -6.36% -7.04% -4.82% -6.46% 60 -5.04% -5.97% -3.83% -5.11% 70 -2.90% -3.84% -2.31% -2.92% 80 -0.75% -2.16% -0.42% -0.73% 90 +0.24% -0.41% +0.55% +0.26% 100 +1.36% +0.80% +1.64% +1.37% baseline +0.00% +0.00% +0.00% +0.00% Likewise, looking at runtime allocations we see that 80 gives slightly better optimisation than the baseline: gmean min max median thresh 50 +0.16% -0.16% +4.43% +0.00% 60 +0.09% -0.00% +3.10% +0.00% 70 +0.04% -0.09% +2.29% +0.00% 80 +0.02% -1.17% +2.29% +0.00% 90 -0.02% -2.59% +1.86% +0.00% 100 +0.00% -2.59% +7.51% -0.00% baseline +0.00% +0.00% +0.00% +0.00% Finally, I had to add a NOINLINE in T4306 to ensure that `upd` is worker-wrappered as the test expects. This makes me wonder whether the inlining heuristic is now too liberal as `upd` is quite a large function. The same measure was taken in T12600. Wall clock time compiling Cabal with -O0 thresh 50 60 70 80 90 100 baseline build-Cabal 93.88 89.58 92.59 90.09 100.26 94.81 89.13 Also, this change happens to avoid the spurious test output in `plugin-recomp-change` and `plugin-recomp-change-prof` (see #17308). Metric Decrease: hie002 T12234 T13035 T13719 T14683 T4801 T5631 T5642 T9020 T9872d T9961 Metric Increase: T12150 T12425 T13701 T14697 T15426 T1969 T3064 T5837 T6048 T9203 T9872a T9872b T9872c T9872d haddock.Cabal haddock.base haddock.compiler
*	Fix an tricky specialiser loop	Simon Peyton Jones	2020-04-06	4	-0/+226
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Issue #17151 was a very tricky example of a bug in which the specialiser accidentally constructs a recurive dictionary, so that everything turns into bottom. I have fixed variants of this bug at least twice before: see Note [Avoiding loops]. It was a bit of a struggle to isolate the problem, greatly aided by the work that Alexey Kuleshevich did in distilling a test case. Once I'd understood the problem, it was not difficult to fix, though it did lead me a bit of refactoring in specImports.
*	Major improvements to the specialiser	Simon Peyton Jones	2020-04-03	10	-0/+178
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch is joint work of Alexis King and Simon PJ. It does some significant refactoring of the type-class specialiser. Main highlights: * We can specialise functions with types like f :: Eq a => a -> Ord b => b => blah where the classes aren't all at the front (#16473). Here we can correctly specialise 'f' based on a call like f @Int @Bool dEqInt x dOrdBool This change really happened in an earlier patch commit 2d0cf6252957b8980d89481ecd0b79891da4b14b Author: Sandy Maguire <sandy@sandymaguire.me> Date: Thu May 16 12:12:10 2019 -0400 work that this new patch builds directly on that work, and refactors it a bit. * We can specialise functions with implicit parameters (#17930) g :: (?foo :: Bool, Show a) => a -> String Previously we could not, but now they behave just like a non-class argument as in 'f' above. * We can specialise under-saturated calls, where some (but not all of the dictionary arguments are provided (#17966). For example, we can specialise the above 'f' based on a call map (f @Int dEqInt) xs even though we don't (and can't) give Ord dictionary. This may sound exotic, but #17966 is a program from the wild, and showed significant perf loss for functions like f, if you need saturation of all dictionaries. * We fix a buglet in which a floated dictionary had a bogus demand (#17810), by using zapIdDemandInfo in the NonRec case of specBind. * A tiny side benefit: we can drop dead arguments to specialised functions; see Note [Drop dead args from specialisations] * Fixed a bug in deciding what dictionaries are "interesting"; see Note [Keep the old dictionaries interesting] This is all achieved by by building on Sandy Macguire's work in defining SpecArg, which mkCallUDs uses to describe the arguments of the call. Main changes: * Main work is in specHeader, which marched down the [InBndr] from the function definition and the [SpecArg] from the call site, together. * specCalls no longer has an arity check; the entire mechanism now handles unders-saturated calls fine. * mkCallUDs decides on an argument-by-argument basis whether to specialise a particular dictionary argument; this is new. See mk_spec_arg in mkCallUDs. It looks as if there are many more lines of code, but I think that all the extra lines are comments!
*	Re-engineer the binder-swap transformation	Simon Peyton Jones	2020-04-02	3	-35/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The binder-swap transformation is implemented by the occurrence analyser -- see Note [Binder swap] in OccurAnal. However it had a very nasty corner in it, for the case where the case scrutinee was a GlobalId. This led to trouble and hacks, and ultimately to #16296. This patch re-engineers how the occurrence analyser implements the binder-swap, by actually carrying out a substitution rather than by adding a let-binding. It's all described in Note [The binder-swap substitution]. I did a few other things along the way * Fix a bug in StgCse, which could allow a loop breaker to be CSE'd away. See Note [Care with loop breakers] in StgCse. I think it can only show up if occurrence analyser sets up bad loop breakers, but still. * Better commenting in SimplUtils.prepareAlts * A little refactoring in CoreUnfold; nothing significant e.g. rename CoreUnfold.mkTopUnfolding to mkFinalUnfolding * Renamed CoreSyn.isFragileUnfolding to hasCoreUnfolding * Move mkRuleInfo to CoreFVs We observed respectively 4.6% and 5.9% allocation decreases for the following tests: Metric Decrease: T9961 haddock.base
*	Demand analysis: simplify the demand for a RHS	Simon Peyton Jones	2020-03-29	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ticket #17932 showed that we were using a stupid demand for the RHS of a let-binding, when the result is a product. This was the result of a "fix" in 2013, which (happily) turns out to no longer be necessary. So I just deleted the code, which simplifies the demand analyser, and fixes #17932. That in turn uncovered that the anticipation of worker/wrapper in CPR analysis was inaccurate, hence the logic that decides whether to unbox an argument in WW was extracted into a function `wantToUnbox`, now consulted by CPR analysis. I tried nofib, and got 0.0% perf changes. All this came up when messing about with !2873 (ticket #17917), but is idependent of it. Unfortunately, this patch regresses #4267 and realised that it is now blocked on #16335.
*	Modules: Types (#13009)	Sylvain Henry	2020-03-29	1	-1/+1
\| \| \| \| \| \| \|	Update Haddock submodule Metric Increase: haddock.compiler
*	Modules: Core operations (#13009)	Sylvain Henry	2020-03-18	6	-7/+7
\|
*	Improve CSE.combineAlts	Simon Peyton Jones	2020-03-14	4	-0/+45
\| \| \| \| \| \| \| \| \| \| \| \| \|	This patch improves the way that CSE combines identical alternatives. See #17901. I'm still not happy about the duplication between CSE.combineAlts and GHC.Core.Utils.combineIdenticalAlts; see the Notes with those functions. But this patch is a step forward. Metric Decrease: T12425 T5642
*	Fix Lint	Simon Peyton Jones	2020-03-14	2	-0/+40
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Ticket #17590 pointed out a bug in the way the linter dealt with type lets, exposed by the new uniqAway story. The fix is described in Note [Linting type lets]. I ended up putting the in-scope Ids in a different env field, le_ids, rather than (as before) sneaking them into the TCvSubst. Surprisingly tiresome, but done. Metric Decrease: hie002
*	Typos in comments [skip ci]	Krzysztof Gogolewski	2020-03-10	1	-1/+1
\|
*	Simplify IfaceIdInfo type	Ömer Sinan Ağacan	2020-02-29	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IfaceIdInfo type is confusing: there's practically no difference between `NoInfo` and `HasInfo []`. The comments say NoInfo is used when -fomit-interface-pragmas is enabled, but we don't need to distinguish `NoInfo` from `HasInfo []` in when reading the interface so the distinction is not important. This patch simplifies the type by removing NoInfo. When we have no info we use an empty list. With this change we no longer read the info list lazily when reading an IfaceInfoItem, but when reading an IfaceId the ifIdInfo field is read lazily, so I doubt this is going to be a problem.
*	PmCheck: Implement Long-distance information with Covered sets	Sebastian Graf	2020-02-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Consider ```hs data T = A \| B \| C f :: T -> Int f A = 1 f x = case x of A -> 2 B -> 3 C -> 4 ``` Clearly, the RHS returning 2 is redundant. But we don't currently see that, because our approximation to the covered set of the inner case expression just picks up the positive information from surrounding pattern matches. It lacks the context sensivity that `x` can't be `A` anymore! Therefore, we adopt the conceptually and practically superior approach of reusing the covered set of a particular GRHS from an outer pattern match. In this case, we begin checking the `case` expression with the covered set of `f`s second clause, which encodes the information that `x` can't be `A` anymore. After this MR, we will successfully warn about the RHS returning 2 being redundant. Perhaps surprisingly, this was a great simplification to the code of both the coverage checker and the desugarer. Found a redundant case alternative in `unix` submodule, so we have to bump it with a fix. Metric Decrease: T12227
*	Treat coercions as arguments for floating and inlining	Alexis King	2020-02-26	3	-0/+11
\| \| \| \| \|	This reverts commit 8924224ecfa065ebc67b96a90d01cf9d2edd0e77 and fixes #17787.
*	Modules: Driver (#13009)	Sylvain Henry	2020-02-21	1	-1/+1
\| \| \| \|	submodule updates: nofib, haddock
*	Fix #17724 by having occAnal preserve used bindings.	Andreas Klebinger	2020-02-20	4	-0/+121
\| \| \| \| \| \|	It sometimes happened that occAnal would remove bindings as dead code by relying on bindings to be in dependency order. The fix was contributed by SPJ.
*	Re-implement unsafe coercions in terms of unsafe equality proofs	Simon Peyton Jones	2020-02-20	3	-1/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(Commit message written by Omer, most of the code is written by Simon and Richard) See Note [Implementing unsafeCoerce] for how unsafe equality proofs and the new unsafeCoerce# are implemented. New notes added: - [Checking for levity polymorphism] in CoreLint.hs - [Implementing unsafeCoerce] in base/Unsafe/Coerce.hs - [Patching magic definitions] in Desugar.hs - [Wiring in unsafeCoerce#] in Desugar.hs Only breaking change in this patch is unsafeCoerce# is not exported from GHC.Exts, instead of GHC.Prim. Fixes #17443 Fixes #16893 NoFib ----- -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS -0.1% 0.0% -0.0% -0.0% -0.0% CSD -0.1% 0.0% -0.0% -0.0% -0.0% FS -0.1% 0.0% -0.0% -0.0% -0.0% S -0.1% 0.0% -0.0% -0.0% -0.0% VS -0.1% 0.0% -0.0% -0.0% -0.0% VSD -0.1% 0.0% -0.0% -0.0% -0.1% VSM -0.1% 0.0% -0.0% -0.0% -0.0% anna -0.0% 0.0% -0.0% -0.0% -0.0% ansi -0.1% 0.0% -0.0% -0.0% -0.0% atom -0.1% 0.0% -0.0% -0.0% -0.0% awards -0.1% 0.0% -0.0% -0.0% -0.0% banner -0.1% 0.0% -0.0% -0.0% -0.0% bernouilli -0.1% 0.0% -0.0% -0.0% -0.0% binary-trees -0.1% 0.0% -0.0% -0.0% -0.0% boyer -0.1% 0.0% -0.0% -0.0% -0.0% boyer2 -0.1% 0.0% -0.0% -0.0% -0.0% bspt -0.1% 0.0% -0.0% -0.0% -0.0% cacheprof -0.1% 0.0% -0.0% -0.0% -0.0% calendar -0.1% 0.0% -0.0% -0.0% -0.0% cichelli -0.1% 0.0% -0.0% -0.0% -0.0% circsim -0.1% 0.0% -0.0% -0.0% -0.0% clausify -0.1% 0.0% -0.0% -0.0% -0.0% comp_lab_zift -0.1% 0.0% -0.0% -0.0% -0.0% compress -0.1% 0.0% -0.0% -0.0% -0.0% compress2 -0.1% 0.0% -0.0% -0.0% -0.0% constraints -0.1% 0.0% -0.0% -0.0% -0.0% cryptarithm1 -0.1% 0.0% -0.0% -0.0% -0.0% cryptarithm2 -0.1% 0.0% -0.0% -0.0% -0.0% cse -0.1% 0.0% -0.0% -0.0% -0.0% digits-of-e1 -0.1% 0.0% -0.0% -0.0% -0.0% digits-of-e2 -0.1% 0.0% -0.0% -0.0% -0.0% dom-lt -0.1% 0.0% -0.0% -0.0% -0.0% eliza -0.1% 0.0% -0.0% -0.0% -0.0% event -0.1% 0.0% -0.0% -0.0% -0.0% exact-reals -0.1% 0.0% -0.0% -0.0% -0.0% exp3_8 -0.1% 0.0% -0.0% -0.0% -0.0% expert -0.1% 0.0% -0.0% -0.0% -0.0% fannkuch-redux -0.1% 0.0% -0.0% -0.0% -0.0% fasta -0.1% 0.0% -0.5% -0.3% -0.4% fem -0.1% 0.0% -0.0% -0.0% -0.0% fft -0.1% 0.0% -0.0% -0.0% -0.0% fft2 -0.1% 0.0% -0.0% -0.0% -0.0% fibheaps -0.1% 0.0% -0.0% -0.0% -0.0% fish -0.1% 0.0% -0.0% -0.0% -0.0% fluid -0.1% 0.0% -0.0% -0.0% -0.0% fulsom -0.1% 0.0% +0.0% +0.0% +0.0% gamteb -0.1% 0.0% -0.0% -0.0% -0.0% gcd -0.1% 0.0% -0.0% -0.0% -0.0% gen_regexps -0.1% 0.0% -0.0% -0.0% -0.0% genfft -0.1% 0.0% -0.0% -0.0% -0.0% gg -0.1% 0.0% -0.0% -0.0% -0.0% grep -0.1% 0.0% -0.0% -0.0% -0.0% hidden -0.1% 0.0% -0.0% -0.0% -0.0% hpg -0.1% 0.0% -0.0% -0.0% -0.0% ida -0.1% 0.0% -0.0% -0.0% -0.0% infer -0.1% 0.0% -0.0% -0.0% -0.0% integer -0.1% 0.0% -0.0% -0.0% -0.0% integrate -0.1% 0.0% -0.0% -0.0% -0.0% k-nucleotide -0.1% 0.0% -0.0% -0.0% -0.0% kahan -0.1% 0.0% -0.0% -0.0% -0.0% knights -0.1% 0.0% -0.0% -0.0% -0.0% lambda -0.1% 0.0% -0.0% -0.0% -0.0% last-piece -0.1% 0.0% -0.0% -0.0% -0.0% lcss -0.1% 0.0% -0.0% -0.0% -0.0% life -0.1% 0.0% -0.0% -0.0% -0.0% lift -0.1% 0.0% -0.0% -0.0% -0.0% linear -0.1% 0.0% -0.0% -0.0% -0.0% listcompr -0.1% 0.0% -0.0% -0.0% -0.0% listcopy -0.1% 0.0% -0.0% -0.0% -0.0% maillist -0.1% 0.0% -0.0% -0.0% -0.0% mandel -0.1% 0.0% -0.0% -0.0% -0.0% mandel2 -0.1% 0.0% -0.0% -0.0% -0.0% mate -0.1% 0.0% -0.0% -0.0% -0.0% minimax -0.1% 0.0% -0.0% -0.0% -0.0% mkhprog -0.1% 0.0% -0.0% -0.0% -0.0% multiplier -0.1% 0.0% -0.0% -0.0% -0.0% n-body -0.1% 0.0% -0.0% -0.0% -0.0% nucleic2 -0.1% 0.0% -0.0% -0.0% -0.0% para -0.1% 0.0% -0.0% -0.0% -0.0% paraffins -0.1% 0.0% -0.0% -0.0% -0.0% parser -0.1% 0.0% -0.0% -0.0% -0.0% parstof -0.1% 0.0% -0.0% -0.0% -0.0% pic -0.1% 0.0% -0.0% -0.0% -0.0% pidigits -0.1% 0.0% -0.0% -0.0% -0.0% power -0.1% 0.0% -0.0% -0.0% -0.0% pretty -0.1% 0.0% -0.1% -0.1% -0.1% primes -0.1% 0.0% -0.0% -0.0% -0.0% primetest -0.1% 0.0% -0.0% -0.0% -0.0% prolog -0.1% 0.0% -0.0% -0.0% -0.0% puzzle -0.1% 0.0% -0.0% -0.0% -0.0% queens -0.1% 0.0% -0.0% -0.0% -0.0% reptile -0.1% 0.0% -0.0% -0.0% -0.0% reverse-complem -0.1% 0.0% -0.0% -0.0% -0.0% rewrite -0.1% 0.0% -0.0% -0.0% -0.0% rfib -0.1% 0.0% -0.0% -0.0% -0.0% rsa -0.1% 0.0% -0.0% -0.0% -0.0% scc -0.1% 0.0% -0.1% -0.1% -0.1% sched -0.1% 0.0% -0.0% -0.0% -0.0% scs -0.1% 0.0% -0.0% -0.0% -0.0% simple -0.1% 0.0% -0.0% -0.0% -0.0% solid -0.1% 0.0% -0.0% -0.0% -0.0% sorting -0.1% 0.0% -0.0% -0.0% -0.0% spectral-norm -0.1% 0.0% -0.0% -0.0% -0.0% sphere -0.1% 0.0% -0.0% -0.0% -0.0% symalg -0.1% 0.0% -0.0% -0.0% -0.0% tak -0.1% 0.0% -0.0% -0.0% -0.0% transform -0.1% 0.0% -0.0% -0.0% -0.0% treejoin -0.1% 0.0% -0.0% -0.0% -0.0% typecheck -0.1% 0.0% -0.0% -0.0% -0.0% veritas -0.0% 0.0% -0.0% -0.0% -0.0% wang -0.1% 0.0% -0.0% -0.0% -0.0% wave4main -0.1% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 -0.1% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 -0.1% 0.0% -0.0% -0.0% -0.0% x2n1 -0.1% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.5% -0.3% -0.4% Max -0.0% 0.0% +0.0% +0.0% +0.0% Geometric Mean -0.1% -0.0% -0.0% -0.0% -0.0% Test changes ------------ - break006 is marked as broken, see #17833 - The compiler allocates less when building T14683 (an unsafeCoerce#- heavy happy-generated code) on 64-platforms. Allocates more on 32-bit platforms. - Rest of the increases are tiny amounts (still enough to pass the threshold) in micro-benchmarks. I briefly looked at each one in a profiling build: most of the increased allocations seem to be because of random changes in the generated code. Metric Decrease: T14683 Metric Increase: T12150 T12234 T12425 T13035 T14683 T5837 T6048 Co-Authored-By: Richard Eisenberg <rae@cs.brynmawr.edu> Co-Authored-By: Ömer Sinan Ağacan <omeragacan@gmail.com>
*	testsuite: Mark T7702 as fragile on Windows	Ben Gamari	2020-02-14	1	-1/+1
\| \| \| \| \|	Due to #16799. There was previously an attempt to mark it as broken but the `opsys` name was incorrect.
*	Always display inferred variables using braces	Krzysztof Gogolewski	2020-02-12	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	We now always show "forall {a}. T" for inferred variables, previously this was controlled by -fprint-explicit-foralls. This implements part 1 of https://github.com/ghc-proposals/ghc-proposals/pull/179. Part of GHC ticket #16320. Furthermore, when printing a levity restriction error, we now display the HsWrap of the expression. This lets users see the full elaboration with -fprint-typechecker-elaboration (see also #17670)
*	Separate CPR analysis from the Demand analyserwip/sep-cpr	Sebastian Graf	2020-02-12	12	-59/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The reasons for that can be found in the wiki: https://gitlab.haskell.org/ghc/ghc/wikis/nested-cpr/split-off-cpr We now run CPR after demand analysis (except for after the final demand analysis run just before code gen). CPR got its own dump flags (`-ddump-cpr-anal`, `-ddump-cpr-signatures`), but not its own flag to activate/deactivate. It will run with `-fstrictness`/`-fworker-wrapper`. As explained on the wiki page, this step is necessary for a sane Nested CPR analysis. And it has quite positive impact on compiler performance: Metric Decrease: T9233 T9675 T9961 T15263
*	testsuite: Fix -Wcompat-unqualified-imports issues	Ben Gamari	2020-02-08	1	-1/+1
\|
*	Do CafInfo/SRT analysis in Cmm	Ömer Sinan Ağacan	2020-01-31	11	-91/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch removes all CafInfo predictions and various hacks to preserve predicted CafInfos from the compiler and assigns final CafInfos to interface Ids after code generation. SRT analysis is extended to support static data, and Cmm generator is modified to allow generating static_link fields after SRT analysis. This also fixes `-fcatch-bottoms`, which introduces error calls in case expressions in CorePrep, which runs after CoreTidy (which is where we decide on CafInfos) and turns previously non-CAFFY things into CAFFY. Fixes #17648 Fixes #9718 Evaluation ========== NoFib ----- Boot with: `make boot mode=fast` Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1` -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS -0.0% 0.0% -0.0% -0.0% -0.0% CSD -0.0% 0.0% -0.0% -0.0% -0.0% FS -0.0% 0.0% -0.0% -0.0% -0.0% S -0.0% 0.0% -0.0% -0.0% -0.0% VS -0.0% 0.0% -0.0% -0.0% -0.0% VSD -0.0% 0.0% -0.0% -0.0% -0.5% VSM -0.0% 0.0% -0.0% -0.0% -0.0% anna -0.1% 0.0% -0.0% -0.0% -0.0% ansi -0.0% 0.0% -0.0% -0.0% -0.0% atom -0.0% 0.0% -0.0% -0.0% -0.0% awards -0.0% 0.0% -0.0% -0.0% -0.0% banner -0.0% 0.0% -0.0% -0.0% -0.0% bernouilli -0.0% 0.0% -0.0% -0.0% -0.0% binary-trees -0.0% 0.0% -0.0% -0.0% -0.0% boyer -0.0% 0.0% -0.0% -0.0% -0.0% boyer2 -0.0% 0.0% -0.0% -0.0% -0.0% bspt -0.0% 0.0% -0.0% -0.0% -0.0% cacheprof -0.0% 0.0% -0.0% -0.0% -0.0% calendar -0.0% 0.0% -0.0% -0.0% -0.0% cichelli -0.0% 0.0% -0.0% -0.0% -0.0% circsim -0.0% 0.0% -0.0% -0.0% -0.0% clausify -0.0% 0.0% -0.0% -0.0% -0.0% comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0% compress -0.0% 0.0% -0.0% -0.0% -0.0% compress2 -0.0% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0% cse -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0% dom-lt -0.0% 0.0% -0.0% -0.0% -0.0% eliza -0.0% 0.0% -0.0% -0.0% -0.0% event -0.0% 0.0% -0.0% -0.0% -0.0% exact-reals -0.0% 0.0% -0.0% -0.0% -0.0% exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0% expert -0.0% 0.0% -0.0% -0.0% -0.0% fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0% fasta -0.0% 0.0% -0.0% -0.0% -0.0% fem -0.0% 0.0% -0.0% -0.0% -0.0% fft -0.0% 0.0% -0.0% -0.0% -0.0% fft2 -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% fish -0.0% 0.0% -0.0% -0.0% -0.0% fluid -0.1% 0.0% -0.0% -0.0% -0.0% fulsom -0.0% 0.0% -0.0% -0.0% -0.0% gamteb -0.0% 0.0% -0.0% -0.0% -0.0% gcd -0.0% 0.0% -0.0% -0.0% -0.0% gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0% genfft -0.0% 0.0% -0.0% -0.0% -0.0% gg -0.0% 0.0% -0.0% -0.0% -0.0% grep -0.0% 0.0% -0.0% -0.0% -0.0% hidden -0.0% 0.0% -0.0% -0.0% -0.0% hpg -0.1% 0.0% -0.0% -0.0% -0.0% ida -0.0% 0.0% -0.0% -0.0% -0.0% infer -0.0% 0.0% -0.0% -0.0% -0.0% integer -0.0% 0.0% -0.0% -0.0% -0.0% integrate -0.0% 0.0% -0.0% -0.0% -0.0% k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0% kahan -0.0% 0.0% -0.0% -0.0% -0.0% knights -0.0% 0.0% -0.0% -0.0% -0.0% lambda -0.0% 0.0% -0.0% -0.0% -0.0% last-piece -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% life -0.0% 0.0% -0.0% -0.0% -0.0% lift -0.0% 0.0% -0.0% -0.0% -0.0% linear -0.1% 0.0% -0.0% -0.0% -0.0% listcompr -0.0% 0.0% -0.0% -0.0% -0.0% listcopy -0.0% 0.0% -0.0% -0.0% -0.0% maillist -0.0% 0.0% -0.0% -0.0% -0.0% mandel -0.0% 0.0% -0.0% -0.0% -0.0% mandel2 -0.0% 0.0% -0.0% -0.0% -0.0% mate -0.0% 0.0% -0.0% -0.0% -0.0% minimax -0.0% 0.0% -0.0% -0.0% -0.0% mkhprog -0.0% 0.0% -0.0% -0.0% -0.0% multiplier -0.0% 0.0% -0.0% -0.0% -0.0% n-body -0.0% 0.0% -0.0% -0.0% -0.0% nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0% para -0.0% 0.0% -0.0% -0.0% -0.0% paraffins -0.0% 0.0% -0.0% -0.0% -0.0% parser -0.1% 0.0% -0.0% -0.0% -0.0% parstof -0.1% 0.0% -0.0% -0.0% -0.0% pic -0.0% 0.0% -0.0% -0.0% -0.0% pidigits -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% pretty -0.0% 0.0% -0.3% -0.4% -0.4% primes -0.0% 0.0% -0.0% -0.0% -0.0% primetest -0.0% 0.0% -0.0% -0.0% -0.0% prolog -0.0% 0.0% -0.0% -0.0% -0.0% puzzle -0.0% 0.0% -0.0% -0.0% -0.0% queens -0.0% 0.0% -0.0% -0.0% -0.0% reptile -0.0% 0.0% -0.0% -0.0% -0.0% reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0% rewrite -0.0% 0.0% -0.0% -0.0% -0.0% rfib -0.0% 0.0% -0.0% -0.0% -0.0% rsa -0.0% 0.0% -0.0% -0.0% -0.0% scc -0.0% 0.0% -0.3% -0.5% -0.4% sched -0.0% 0.0% -0.0% -0.0% -0.0% scs -0.0% 0.0% -0.0% -0.0% -0.0% simple -0.1% 0.0% -0.0% -0.0% -0.0% solid -0.0% 0.0% -0.0% -0.0% -0.0% sorting -0.0% 0.0% -0.0% -0.0% -0.0% spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0% sphere -0.0% 0.0% -0.0% -0.0% -0.0% symalg -0.0% 0.0% -0.0% -0.0% -0.0% tak -0.0% 0.0% -0.0% -0.0% -0.0% transform -0.0% 0.0% -0.0% -0.0% -0.0% treejoin -0.0% 0.0% -0.0% -0.0% -0.0% typecheck -0.0% 0.0% -0.0% -0.0% -0.0% veritas -0.0% 0.0% -0.0% -0.0% -0.0% wang -0.0% 0.0% -0.0% -0.0% -0.0% wave4main -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0% x2n1 -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.3% -0.5% -0.5% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim -0.1% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% gc_bench -0.0% 0.0% -0.0% -0.0% -0.0% hash -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% spellcheck -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.0% -0.0% -0.0% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0% Manual inspection of programs in testsuite/tests/programs --------------------------------------------------------- I built these programs with a bunch of dump flags and `-O` and compared STG, Cmm, and Asm dumps and file sizes. (Below the numbers in parenthesis show number of modules in the program) These programs have identical compiler (same .hi and .o sizes, STG, and Cmm and Asm dumps): - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3), andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2), jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4), jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1), bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1), strict_anns (1), thurston-module-arith (2), okeefe_neural (1), joao-circular (6), 10queens (1) Programs with different compiler outputs: - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level `[Int]` closures as CAFFY for no reason. With this patch we no longer make them CAFFY and generate less SRT entries. For some reason Main.o is slightly larger with this patch (1.3%) and the executable sizes are the same. (I'd expect both to be smaller) - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the executable sizes are the same. - galois_raytrace (13): Differences are in the Parse module. There are a lot, but some of the changes are caused by the fact that for some reason (I think a bug) GHC HEAD marks the dictionary for `Functor Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the same. - north_array: We now generate less SRT entries because some of array primops used in this program like `NewArrayOp` get eliminated during Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24% larger (9224 bytes from 9000 bytes), executable sizes are the same. - seward-space-leak: Difference in this program is better shown by this smaller example: module Lib where data CDS = Case [CDS] [(Int, CDS)] \| Call CDS CDS instance Eq CDS where Case sels1 rets1 == Case sels2 rets2 = sels1 == sels2 && rets1 == rets2 Call a1 b1 == Call a2 b2 = a1 == a2 && b1 == b2 _ == _ = False In this program GHC HEAD builds a new SRT for the recursive group of `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==` in its SRT field, and `==` uses the SRT object as its SRT. With this patch we use the closure for `/=` as the SRT and add `==` there. Then `/=` gets an empty SRT field and `==` points to `/=` in its SRT field. This change looks fine to me. Main.o gets 0.07% larger, executable sizes are identical. head.hackage ------------ head.hackage's CI script builds 428 packages from Hackage using this patch with no failures. Compiler performance -------------------- The compiler perf tests report that the compiler allocates slightly more (worst case observed so far is 4%). However most programs in the test suite are small, single file programs. To benchmark compiler performance on something more realistic I build Cabal (the library, 236 modules) with different optimisation levels. For the "max residency" row I run GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows are generated with just `-s`. (This is because `-i0` causes running GC much more frequently and as a result "bytes copied" gets inflated by more than 25x in some cases) * -O0 \| \| GHC HEAD \| This MR \| Diff \| \| --------------- \| -------------- \| -------------- \| ------ \| \| Bytes allocated \| 54,413,350,872 \| 54,701,099,464 \| +0.52% \| \| Bytes copied \| 4,926,037,184 \| 4,990,638,760 \| +1.31% \| \| Max residency \| 421,225,624 \| 424,324,264 \| +0.73% \| * -O1 \| \| GHC HEAD \| This MR \| Diff \| \| --------------- \| --------------- \| --------------- \| ------ \| \| Bytes allocated \| 245,849,209,992 \| 246,562,088,672 \| +0.28% \| \| Bytes copied \| 26,943,452,560 \| 27,089,972,296 \| +0.54% \| \| Max residency \| 982,643,440 \| 991,663,432 \| +0.91% \| * -O2 \| \| GHC HEAD \| This MR \| Diff \| \| --------------- \| --------------- \| --------------- \| ------ \| \| Bytes allocated \| 291,044,511,408 \| 291,863,910,912 \| +0.28% \| \| Bytes copied \| 37,044,237,616 \| 36,121,690,472 \| -2.49% \| \| Max residency \| 1,071,600,328 \| 1,086,396,256 \| +1.38% \| Extra compiler allocations -------------------------- Runtime allocations of programs are as reported above (NoFib section). The compiler now allocates more than before. Main source of allocation in this patch compared to base commit is the new SRT algorithm (GHC.Cmm.Info.Build). Below is some of the extra work we do with this patch, numbers generated by profiled stage 2 compiler when building a pathological case (the test 'ManyConstructors') with '-O2': - We now sort the final STG for a module, which means traversing the entire program, generating free variable set for each top-level binding, doing SCC analysis, and re-ordering the program. In ManyConstructors this step allocates 97,889,952 bytes. - We now do SRT analysis on static data, which in a program like ManyConstructors causes analysing 10,000 bindings that we would previously just skip. This step allocates 70,898,352 bytes. - We now maintain an SRT map for the entire module as we compile Cmm groups: data ModuleSRTInfo = ModuleSRTInfo { ... , moduleSRTMap :: SRTMap } (SRTMap is just a strict Map from the 'containers' library) This map gets an entry for most bindings in a module (exceptions are THUNKs and CAFFY static functions). For ManyConstructors this map gets 50015 entries. - Once we're done with code generation we generate a NameSet from SRTMap for the non-CAFFY names in the current module. This set gets the same number of entries as the SRTMap. - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using the NameSet generated in the previous step. This usually does the least amount of allocation among the work listed here. Only place with this patch where we do less work in the CAF analysis in the tidying pass (CoreTidy). However that doesn't save us much, as the pass still needs to traverse the whole program and update IdInfos for other reasons. Only thing we don't here do is the `hasCafRefs` pass over the RHS of bindings, which is a stateless pass that returns a boolean value, so it doesn't allocate much. (Metric changes blow are all increased allocations) Metric changes -------------- Metric Increase: ManyAlternatives ManyConstructors T13035 T14683 T1969 T9961
*	Print Core type applications with no whitespace after @ (#17643)	Ryan Scott	2020-01-08	17	-66/+66
\| \| \| \| \| \| \| \| \| \| \|	This brings the pretty-printer for Core in line with how visible type applications are normally printed: namely, with no whitespace after the `@` character (i.e., `f @a` instead of `f @ a`). While I'm in town, I also give the same treatment to type abstractions (i.e., `\(@a)` instead of `\(@ a)`) and coercion applications (i.e., `f @~x` instead of `f @~ x`). Fixes #17643.
*	Fix typos, via a Levenshtein-style corrector	Brian Wignall	2020-01-04	2	-2/+2
\|
*	Take care to not eta-reduce jumps in CorePrep	Matheus Magalhães de Alcantara	2019-11-23	2	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	CorePrep already had a check to prevent it from eta-reducing Ids that respond true to hasNoBinding (foreign calls, constructors for unboxed sums and products, and Ids with compulsory unfoldings). It did not, however, consider join points as ids that 'must be saturated'. Checking whether the Id responds True to 'isJoinId' should prevent CorePrep from turning saturated jumps like the following (from #17429) into undersaturated ones: (\ eta_XP -> join { mapped_s1vo _ = lvl_s1vs } in jump mapped_s1vo eta_XP)
*	Give seq a more precise type and remove magic	Ben Gamari	2019-11-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`GHC.Prim.seq` previously had the rather plain type: seq :: forall a b. a -> b -> b However, it also had a special typing rule to applications where `b` is not of kind `Type`. Issue #17440 noted that levity polymorphism allows us to rather give it the more precise type: seq :: forall (r :: RuntimeRep) a (b :: TYPE r). a -> b -> b This allows us to remove the special typing rule that we previously required to allow applications on unlifted arguments. T9404 contains a non-Type application of `seq` which should verify that this works as expected. Closes #17440.
*	testsuite: Drop T7995	Ben Gamari	2019-11-09	3	-12/+0
\| \| \| \| \| \| \| \| \| \| \|	This test is quite sensitive to the build configuration as it requires that ghc have unfoldings, which isn't true in the quick build flavours. I considered various options to make the test more robust but none of them seemed particularly appealing. Moreover, Simon PJ was a bit skeptical of the value of the test to begin with and I strongly suspect that any regression in #7995 would be accompanied by failures in our other compiler performance tests. Closes #17399.
*	SetLevels: Don't set context level when floating cases	Ben Gamari	2019-11-08	3	-1/+20
\| \| \| \| \| \| \| \| \| \| \|	When floating a single-alternative case we previously would set the context level to the level where we were floating the case. However, this is wrong as we are only moving the case and its binders. This resulted in #16978, where the disrepancy caused us to unnecessarily abstract over some free variables of the case body, resulting in shadowing and consequently Core Lint failures. (cherry picked from commit a2a0e6f3bb2d02a9347dec4c7c4f6d4480bc2421)
*	Make CSE delay inlining less	Simon Peyton Jones	2019-11-01	4	-1/+20
\| \| \| \| \| \| \| \| \|	CSE delays inlining a little bit, to avoid losing vital specialisations; see Note [Delay inlining after CSE] in CSE. But it was being over-enthusiastic. This patch makes the delay only apply to Ids with specialisation rules, which avoids unnecessary delay (#17409).
*	Whitespace forward compatibility for proposal #229	Vladislav Zavialov	2019-10-30	1	-3/+3
\| \| \| \| \| \| \| \|	GHC Proposal #229 changes the lexical rules of Haskell, which may require slight whitespace adjustments in certain cases. This patch changes formatting in a few places in GHC and its testsuite in a way that enables it to compile under the proposed rules.
*	Add new flag for unarised STG dumps	Ömer Sinan Ağacan	2019-10-23	1	-34/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Previously -ddump-stg would dump pre and post-unarise STGs. Now we have a new flag for post-unarise STG and -ddump-stg only dumps coreToStg output. STG dump flags after this commit: - -ddump-stg: Dumps CoreToStg output - -ddump-stg-unarised: Unarise output - -ddump-stg-final: STG right before code gen (includes CSE and lambda lifting)
*	testsuite: Ensure that makefile tests get run	Ben Gamari	2019-10-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously `makefile_test` and `run_command` tests could easily end up in a situation where they wouldn't be run if the user used the `only_ways` modifier. The reason is to build the set of a ways to run the test in we first start with a candidate set determined by the test type (e.g. `makefile_test`, `compile_run`, etc.) and then filter that set with the constraints given by the test's modifiers. `makefile_test` and `run_command` tests' candidate sets were simply `{normal}`, and consequently most uses of `only_ways` would result in the test being never run. To avoid this we rather use all ways as the candidate sets for these test types. This may result in a few more testcases than we would like (given that some `run_command` tests are insensitive to way) but this can be fixed by adding modifiers and we would much rather run too many tests than too few. This fixes #16042 and a number of other tests afflicted by the same issue. However, there were a few cases that required special attention: * `T14028` is currently failing and is therefore marked as broken due to #17300 * `T-signals-child` is fragile in the `threaded1` and `threaded2` ways (tracked in #17307)
*	testsuite: Add testcase for #17206	Ben Gamari	2019-09-19	3	-0/+19
\|
*	Add a test to make sure we don't regress on #17140 in the future	Alp Mestanogullari	2019-09-13	5	-0/+29
\|
*	Add test for #16893	Ömer Sinan Ağacan	2019-08-10	3	-0/+93
\|
*	testsuite: Add testsuite for #16978	Ben Gamari	2019-08-04	2	-0/+30
\|
*	Apply a missing substitution in mkEtaWW (#16979)	Simon Peyton Jones	2019-07-30	3	-0/+266
\| \| \| \| \| \| \| \| \| \|	The `mkEtaWW` case for newtypes forgot to apply the substitution to the newtype coercion, resulting in the Core Lint errors observed in #16979. Easily fixed. Fixes #16979. Co-authored-by: Ryan Scott <ryan.gl.scott@gmail.com>
*	Expunge #ifdef and #ifndef from the codebase	John Ericson	2019-07-14	2	-3/+3
\| \| \| \| \| \| \| \|	These are unexploded minds as far as the linter is concerned. I don't want to hit in my MRs by mistake! I did this with `sed`, and then rolled back some changes in the docs, config.guess, and the linter itself.
*	Fix erroneous float in CoreOpt	Simon Peyton Jones	2019-07-09	3	-0/+34
\| \| \| \| \| \| \| \| \| \| \|	The simple optimiser was making an invalid transformation to join points -- yikes. The fix is easy. I also added some documentation about the fact that GHC uses a slightly more restrictive version of join points than does the paper. Fix #16918