| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
various perf tests have been broken over the course of the past few
months. This updates the numbers.
Test Plan: ./validate
Reviewers: austin, bgamari
Subscribers: thomie, #ghc_windows_task_force
Differential Revision: https://phabricator.haskell.org/D3160
|
|
|
|
|
| |
Previously the comment was correct, but the expected value itself was never
updated.
|
|
|
|
|
| |
These are right on the edge of acceptance and are only reproducible on a
stressed machine.
|
|
|
|
|
| |
These things are simply too expensive to generate at the moment. More
work is needed here; see #13276 and #13261.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This at long last realizes the ideas for type-indexed Typeable discussed in A
Reflection on Types (#11011). The general sketch of the project is described on
the Wiki (Typeable/BenGamari). The general idea is that we are adding a type
index to `TypeRep`,
data TypeRep (a :: k)
This index allows the typechecker to reason about the type represented by the `TypeRep`.
This index representation mechanism is exposed as `Type.Reflection`, which also provides
a number of patterns for inspecting `TypeRep`s,
```lang=haskell
pattern TRFun :: forall k (fun :: k). ()
=> forall (r1 :: RuntimeRep) (r2 :: RuntimeRep)
(arg :: TYPE r1) (res :: TYPE r2).
(k ~ Type, fun ~~ (arg -> res))
=> TypeRep arg
-> TypeRep res
-> TypeRep fun
pattern TRApp :: forall k2 (t :: k2). ()
=> forall k1 (a :: k1 -> k2) (b :: k1). (t ~ a b)
=> TypeRep a -> TypeRep b -> TypeRep t
-- | Pattern match on a type constructor.
pattern TRCon :: forall k (a :: k). TyCon -> TypeRep a
-- | Pattern match on a type constructor including its instantiated kind
-- variables.
pattern TRCon' :: forall k (a :: k). TyCon -> [SomeTypeRep] -> TypeRep a
```
In addition, we give the user access to the kind of a `TypeRep` (#10343),
typeRepKind :: TypeRep (a :: k) -> TypeRep k
Moreover, all of this plays nicely with 8.2's levity polymorphism, including the
newly levity polymorphic (->) type constructor.
Library changes
---------------
The primary change here is the introduction of a Type.Reflection module to base.
This module provides access to the new type-indexed TypeRep introduced in this
patch. We also continue to provide the unindexed Data.Typeable interface, which
is simply a type synonym for the existentially quantified SomeTypeRep,
data SomeTypeRep where SomeTypeRep :: TypeRep a -> SomeTypeRep
Naturally, this change also touched Data.Dynamic, which can now export the
Dynamic data constructor. Moreover, I removed a blanket reexport of
Data.Typeable from Data.Dynamic (which itself doesn't even import Data.Typeable
now).
We also add a kind heterogeneous type equality type, (:~~:), to
Data.Type.Equality.
Implementation
--------------
The implementation strategy is described in Note [Grand plan for Typeable] in
TcTypeable. None of it was difficult, but it did exercise a number of parts of
the new levity polymorphism story which had not yet been exercised, which took
some sorting out.
The rough idea is that we augment the TyCon produced for each type constructor
with information about the constructor's kind (which we call a KindRep). This
allows us to reconstruct the monomorphic result kind of an particular
instantiation of a type constructor given its kind arguments.
Unfortunately all of this takes a fair amount of work to generate and send
through the compilation pipeline. In particular, the KindReps can unfortunately
get quite large. Moreover, the simplifier will float out various pieces of them,
resulting in numerous top-level bindings. Consequently we mark the KindRep
bindings as noinline, ensuring that the float-outs don't make it into the
interface file. This is important since there is generally little benefit to
inlining KindReps and they would otherwise strongly affect compiler performance.
Performance
-----------
Initially I was hoping to also clear up the remaining holes in Typeable's
coverage by adding support for both unboxed tuples (#12409) and unboxed sums
(#13276). While the former was fairly straightforward, the latter ended up being
quite difficult: while the implementation can support them easily, enabling this
support causes thousands of Typeable bindings to be emitted to the GHC.Types as
each arity-N sum tycon brings with it N promoted datacons, each of which has a
KindRep whose size which itself scales with N. Doing this was simply too
expensive to be practical; consequently I've disabled support for the time
being.
Even after disabling sums this change regresses compiler performance far more
than I would like. In particular there are several testcases in the testsuite
which consist mostly of types which regress by over 30% in compiler allocations.
These include (considering the "bytes allocated" metric),
* T1969: +10%
* T10858: +23%
* T3294: +19%
* T5631: +41%
* T6048: +23%
* T9675: +20%
* T9872a: +5.2%
* T9872d: +12%
* T9233: +10%
* T10370: +34%
* T12425: +30%
* T12234: +16%
* 13035: +17%
* T4029: +6.1%
I've spent quite some time chasing down the source of this regression and while
I was able to make som improvements, I think this approach of generating
Typeable bindings at time of type definition is doomed to give us unnecessarily
large compile-time overhead.
In the future I think we should consider moving some of all of the Typeable
binding generation logic back to the solver (where it was prior to
91c6b1f54aea658b0056caec45655475897f1972). I've opened #13261 documenting this
proposal.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is generalizes the kind of `(->)`, as discussed in #11714.
This involves a few things,
* Generalizing the kind of `funTyCon`, adding two new `RuntimeRep`
binders,
```lang=haskell
(->) :: forall (r1 :: RuntimeRep) (r2 :: RuntimeRep)
(a :: TYPE r1) (b :: TYPE r2).
a -> b -> *
```
* Unsaturated applications of `(->)` are expressed as explicit
`TyConApp`s
* Saturated applications of `(->)` are expressed as `FunTy` as they are
currently
* Saturated applications of `(->)` are expressed by a new `FunCo`
constructor in coercions
* `splitTyConApp` needs to ensure that `FunTy`s are split to a
`TyConApp`
of `(->)` with the appropriate `RuntimeRep` arguments
* Teach CoreLint to check that all saturated applications of `(->)` are
represented with `FunTy`
At the moment I assume that `Constraint ~ *`, which is an annoying
source of complexity. This will
be simplified once D3023 is resolved.
Also, this introduces two known regressions,
`tcfail181`, `T10403`
=====================
Only shows the instance,
instance Monad ((->) r) -- Defined in ‘GHC.Base’
in its error message when -fprint-potential-instances is used. This is
because its instance head now mentions 'LiftedRep which is not in scope.
I'm not entirely sure of the right way to fix this so I'm just accepting
the new output for now.
T5963 (Typeable)
================
T5963 is now broken since Data.Typeable.Internals.mkFunTy computes its
fingerprint without the RuntimeRep variables that (->) expects. This
will be fixed with the merge of D2010.
Haddock performance
===================
The `haddock.base` and `haddock.Cabal` tests regress in allocations by
about 20%. This certainly hurts, but it's also not entirely unexpected:
the size of every function type grows with this patch and Haddock has a
lot of functions in its heap.
|
|
|
|
|
|
|
|
|
|
|
| |
I think this is due to
commit 6bab649bde653f13c15eba30d5007bef4a9a9d3a
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Thu Feb 16 09:42:32 2017 +0000
Improve checking of joins in Core Lint
Improvement is around 5%.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We previously checked type family instance declarations
in a module for consistency with all instances that we happened
to have read into the EPS or HPT. It was possible to arrange that
an imported type family instance (used by an imported function)
was in a module whose interface file was never read during
compilation; then we wouldn't check consistency of local instances
with this imported instance and as a result type safety was lost.
With this patch, we still check consistency of local type family
instances with all type family instances that we have loaded; but
we make sure to load the interface files of all our imports that
define family instances first. More selective consistency checking
is left to #13102.
On the other hand, we can now safely assume when we import a module
that it has been checked for consistency with its imports. So we
can save checking in checkFamInstConsistency, and overall we should
have less work to do now.
This patch also adds a note describing the Plan for ensuring type
family consistency.
Test Plan: Two new tests added; harbormaster
Reviewers: austin, simonpj, bgamari
Reviewed By: simonpj, bgamari
Subscribers: ggreif, thomie
Differential Revision: https://phabricator.haskell.org/D2992
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When determining whether an expression is used saturatedly, count the
number of value arguments that the occurrence analyser sees, and add
the number of one-shot arguments that we know (from the strictness
analyser) are passed from the context.
perf results suggest no noticable change in allocations, reduction of
code sizes, and performance regression possibliy due to loss of join
points.
Test Plan: perf.haskell.org
Reviewers: simonpj, austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D3089
|
|
|
|
|
|
|
|
|
| |
Simon decreased this earlier today but Harbormaster doesn't reproduce his
number. I've done two things here:
1. increased the allocations number to the Harbormaster value
2. increased the acceptance threshold from 5% to 7%, since Simon saw a 6.6%
change in his environment.
|
|
|
|
| |
Unfortunately it's not clear what regressed this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In my recent commit
commit 432f952ef64641be9f32152a0fbf2b8496d8fe9c
Float unboxed expressions by boxing
I changed how float_me in lvlMFE worked. That was right, but
it exposed another bug: an error expression wasn't getting floated
as it should from a case alternative. And that led to a collection
of minor improvements
* I found a much better way to cast it, by using lvlFloatRhs for
top-level bindinds as well as nested ones, which is
(a) more consistent and
(b) works correctly.
See Note [Floating from a RHS]
* I also found some delicacy in the "floating to the top" stuff, so I
greatly elaborated the Note [Floating to the top].
* I simplified the "bottoming-float" stuff; the change is in the treatment
of bottoming lambdas (\x y. error blah), where we now float the
(error blah) part instead of the whole lambda (which risks just making
duplicate lambdas. See Note [Bottoming floats], esp (2).
Perf effects are minor.
* perf/compiler/T13056 improved sligtly (about 2%) in compiler
allocations. Also T9233 improved by 1%. I'm not sure why.
* Some small nofib changes:
- Generally some very small reductions in run-time
allocation, except k-nucleotide, which halves for some
reason. (I did try to look but it's a big complicated
function and it was far from obvious. Had it been a loss
I would have looked harder!
NB: there's a nearby patch "Do not inline bottoming things" that could
also be responsible for either or both. I didn't think it was worth
more testing to distinguish.
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
grep +0.1% -0.2% 0.00 0.00 +0.0%
mandel -0.1% -1.4% 0.13 0.13 +0.0%
k-nucleotide +0.1% -51.6% -1.0% -1.0% +0.0%
--------------------------------------------------------------------------------
Min -0.3% -51.6% -9.4% -9.1% -4.0%
Max +0.2% +0.0% +31.8% +32.7% +0.0%
Geometric Mean -0.0% -0.8% +1.4% +1.4% -0.1%
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using the default definition of `<$` for derived `Functor`
instance is very bad for recursive data types. Derive
the definition instead.
Fixes #13218
Reviewers: austin, bgamari, RyanGlScott
Reviewed By: RyanGlScott
Subscribers: RyanGlScott, thomie
Differential Revision: https://phabricator.haskell.org/D3072
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disabling worker/wrapper for NOINLINE things can cause unnecessary
reboxing of values. Consider
{-# NOINLINE f #-}
f :: Int -> a
f x = error (show x)
g :: Bool -> Bool -> Int -> Int
g True True p = f p
g False True p = p + 1
g b False p = g b True p
the strictness analysis will discover f and g are strict, but because f
has no wrapper, the worker for g will rebox p. So we get
$wg x y p# =
let p = I# p# in -- Yikes! Reboxing!
case x of
False ->
case y of
False -> $wg False True p#
True -> +# p# 1#
True ->
case y of
False -> $wg True True p#
True -> case f p of { }
g x y p = case p of (I# p#) -> $wg x y p#
Now, in this case the reboxing will float into the True branch, an so
the allocation will only happen on the error path. But it won't float
inwards if there are multiple branches that call (f p), so the reboxing
will happen on every call of g. Disaster.
Solution: do worker/wrapper even on NOINLINE things; but move the
NOINLINE pragma to the worker.
Test Plan: make test TEST="13143"
Reviewers: simonpj, bgamari, dfeuer, austin
Reviewed By: simonpj, bgamari
Subscribers: dfeuer, thomie
Differential Revision: https://phabricator.haskell.org/D3046
|
| |
|
|
|
|
|
|
| |
according to the graph at perf.haskell.org, it has regressed due to join
points, which moved it very very close to the +10% mark and hence made it
fail just sometimes.
|
|
|
|
| |
This has been failing on Darwin since 748b79741652028827b6225c36b8ab55d22bdeb0.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This major patch implements Join Points, as described in
https://ghc.haskell.org/trac/ghc/wiki/SequentCore. You have
to read that page, and especially the paper it links to, to
understand what's going on; but it is very cool.
It's Luke Maurer's work, but done in close collaboration with Simon PJ.
This Phab is a squash-merge of wip/join-points branch of
http://github.com/lukemaurer/ghc. There are many, many interdependent
changes.
Reviewers: goldfire, mpickering, bgamari, simonmar, dfeuer, austin
Subscribers: simonpj, dfeuer, mpickering, Mikolaj, thomie
Differential Revision: https://phabricator.haskell.org/D2853
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces a new flag `-fsolve-constant-dicts` which makes the
constraint solver solve super class constraints with available dictionaries if
possible. The flag is enabled by `-O1`.
The motivation of this patch is that the compiler can produce more efficient
code if the constraint solver used top-level instance declarations to solve
constraints that are currently solved givens and their superclasses. In
particular, as it currently stands, the compiler imposes a performance penalty
on the common use-case where superclasses are bundled together for user
convenience. The performance penalty applies to constraint synonyms as
well. This example illustrates the issue:
```
{-# LANGUAGE ConstraintKinds, MultiParamTypeClasses, FlexibleContexts #-}
module B where
class M a b where m :: a -> b
type C a b = (Num a, M a b)
f :: C Int b => b -> Int -> Int
f _ x = x + 1
```
Output without the patch, notice that we get the instance for `Num Int` by
using the class selector `p1`.
```
f :: forall b_arz. C Int b_arz => b_arz -> Int -> Int
f =
\ (@ b_a1EB) ($d(%,%)_a1EC :: C Int b_a1EB) _ (eta1_B1 :: Int) ->
+ @ Int
(GHC.Classes.$p1(%,%) @ (Num Int) @ (M Int b_a1EB) $d(%,%)_a1EC)
eta1_B1
B.f1
```
Output with the patch, nicely optimised code!
```
f :: forall b. C Int b => b -> Int -> Int
f =
\ (@ b) _ _ (x_azg :: Int) ->
case x_azg of { GHC.Types.I# x1_a1DP ->
GHC.Types.I# (GHC.Prim.+# x1_a1DP 1#)
}
```
Reviewers: simonpj, bgamari, austin
Reviewed By: simonpj
Subscribers: mpickering, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D2714
GHC Trac Issues: #12791, #5835
|
|
|
|
|
| |
Gipeda suggests that this is due to the recent top-level string literals in Core
patch.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch is a refinement of the original commit (which
was reverted):
commit 6b976eb89fe72827f226506d16d3721ba4e28bab
Date: Fri Jan 13 08:56:53 2017 +0000
Record evaluated-ness on workers and wrappers
In Trac #13027, comment:20, I noticed that wrappers created after
demand analysis weren't recording the evaluated-ness of strict
constructor arguments. In the ticket that led to a (debatable)
Lint error but in general the more we know about evaluated-ness
the better we can optimise.
This commit adds that info
* both in the worker (on args)
* and in the wrapper (on CPR result patterns).
See Note [Record evaluated-ness in worker/wrapper] in WwLib
On the way I defined Id.setCaseBndrEvald, and used it to shorten
the code in a few other places
Then I added test T13077a to test the CPR aspect of this patch,
but I found that Lint failed!
Reason: simpleOptExpr was discarding evaluated-ness info on
lambda binders because zapFragileIdInfo was discarding an
Unfolding of (OtherCon _). But actually that's a robust
unfolding; there is no need to discard it. To fix this:
* zapFragileIdInfo only zaps fragile unfoldings
* Replace isClosedUnfolding with isFragileUnfolding (the latter
is just the negation of the former, but the nomenclature is
more consistent). Better documentation too
Note [Fragile unfoldings]
* And Simplify.simplLamBndr can now look at isFragileUnfolding
to decide whether to use the longer route of simplUnfolding.
For some reason perf/compiler/T9233 improves in compile-time
allocation by 10%. Hooray
Nofib: essentially no change:
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
cacheprof +0.0% -0.3% +0.9% +0.4% +0.0%
--------------------------------------------------------------------------------
Min +0.0% -0.3% -2.4% -2.4% +0.0%
Max +0.0% +0.0% +9.8% +11.4% +2.4%
Geometric Mean +0.0% -0.0% +1.1% +1.0% +0.0%
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The `clean_cmd` and `extra_clean` setup functions don't do anything.
Remove them from .T files.
Created using https://github.com/thomie/refactor-ghc-testsuite. This
diff is a test for the .T-file parser/processor/pretty-printer in that
repository.
find . -name '*.T' -exec ~/refactor-ghc-testsuite/Main "{}" \;
Tests containing inline comments or multiline strings are not modified.
Preparation for #12223.
Test Plan: Harbormaster
Reviewers: austin, hvr, simonmar, mpickering, bgamari
Reviewed By: mpickering
Subscribers: mpickering
Differential Revision: https://phabricator.haskell.org/D3000
GHC Trac Issues: #12223
|
|
|
|
|
| |
These are only failing on Darwin, strangely enough, but do so quite
reproducibly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commits relaxes the invariants of the Core syntax so that a
top-level variable can be bound to a primitive string literal of type
Addr#.
This commit:
* Relaxes the invatiants of the Core, and allows top-level bindings whose
type is Addr# as long as their RHS is either a primitive string literal or
another variable.
* Allows the simplifier and the full-laziness transformer to float out
primitive string literals to the top leve.
* Introduces the new StgGenTopBinding type to accomodate top-level Addr#
bindings.
* Introduces a new type of labels in the object code, with the suffix "_bytes",
for exported top-level Addr# bindings.
* Makes some built-in rules more robust. This was necessary to keep them
functional after the above changes.
This is a continuation of D2554.
Rebasing notes:
This had two slightly suspicious performance regressions:
* T12425: bytes allocated regressed by roughly 5%
* T4029: bytes allocated regressed by a bit over 1%
* T13035: bytes allocated regressed by a bit over 5%
These deserve additional investigation.
Rebased by: bgamari.
Test Plan: ./validate --slow
Reviewers: goldfire, trofi, simonmar, simonpj, austin, hvr, bgamari
Reviewed By: trofi, simonpj, bgamari
Subscribers: trofi, simonpj, gridaphobe, thomie
Differential Revision: https://phabricator.haskell.org/D2605
GHC Trac Issues: #8472
|
|
|
|
| |
Unfortunately it's not clear why but this has been failing on Harbormaster.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Suppose we have
```
data Node a = Node2 !Int a a | Node3 !Int a a a
instance Traversable Node where
traverse f (Node2 s x y) = Node2 s <$> f x <*> f y
...
```
Since `Node2` is partially applied, we wouldn't inline its
wrapper. The result was that we'd box up the `Int#` to put
the box in the closure passed to `fmap`. We now allow the wrapper
to inline when partially applied, so GHC stores the `Int#`
directly in the closure.
Reviewers: rwbarton, mpickering, simonpj, austin, bgamari
Reviewed By: simonpj, bgamari
Subscribers: mpickering, thomie
Differential Revision: https://phabricator.haskell.org/D2891
GHC Trac Issues: #12990
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When fusion rules successfully fire, we are left with calls to
*FB functions. They are higher-order functions, and therefore they
often benefit from inlining. This is particularly important when
then final consumer is a strict fold (foldl', length, etc.), because
not inlining these functions means allocating a function closure
for each element in the list, which often is more costly than what
fusion eliminates.
Nofib shows a slight increase in the binary size:
------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
------------------------------------------------------------------------
gen_regexps -0.3% 0.0% 0.000 0.000 0.0%
puzzle +0.8% 0.0% 0.089 0.090 0.0%
reptile +0.8% -0.0% 0.008 0.008 0.0%
------------------------------------------------------------------------
Min -0.3% -0.0% -7.3% -7.1% 0.0%
Max +0.8% +0.0% +7.8% +7.7% +1.8%
Geometric Mean +0.0% -0.0% +0.2% +0.2% +0.0%
------------------------------------------------------------------------
Reviewers: simonpj, austin, hvr, bgamari
Reviewed By: simonpj
Subscribers: simonpj, thomie
Differential Revision: https://phabricator.haskell.org/D2951
GHC Trac Issues: #13001
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
This performance regression was fixed by commit
517d03e41b4f5c144d1ad684539340421be2be2a (#12234). Let's add a performance test
to ensure that it doesn't break again.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Richard pointed out (comment:12 of Trac #13025) that my
implementation of the coercion "push rules", newly added
in exprIsConAppMaybe by commit b4c3a66, wasn't quite right.
But in fact that means that the implementation of those same
rules in Simplify.simplCast was wrong too.
Hence this commit:
* Refactor the push rules so they are implemented in just
one place (CoreSubst.pushCoArgs, pushCoTyArg, pushCoValArg)
The code in Simplify gets simpler, which is nice.
* Fix the bug that Richard pointed out (to do with hetero-kinded
coercions)
Then compiler performance worsened, which led mt do discover
two performance bugs:
* The smart constructor Coercion.mkNthCo didn't have a case
for ForAllCos, which meant we stupidly build a complicated
coercion where a simple one would do
* In OptCoercion there was one place where we used CoherenceCo
(the data constructor) rather than mkCoherenceCo (the smart
constructor), which meant that the the stupid complicated
coercion wasn't optimised away
For reasons I don't fully understand, T5321Fun did 2% less compiler
allocation after all this, which is good.
|
|
|
|
|
|
| |
Trac #13035 showed up a nasty case where we took exponentially
long to normalise a (actually rather simple) type. Fortunately
it was easy to fix: see Note [Normalisation and type synonyms].
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary: See https://phabricator.haskell.org/rGHCd3b546b1a605
Reviewers: nomeata, austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2883
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This is a complete fix based off of
ed7af26606b3a605a4511065ca1a43b1c0f3b51d for handling
shadowing and out-of-order -package-db flags simultaneously.
The general strategy is we first put all databases together,
overriding packages as necessary. Once this is done, we successfully
prune out broken packages, including packages which depend on a package
whose ABI differs from the ABI we need.
Our check gracefully degrades in the absence of abi-depends, as
we only check deps which are recorded in abi-depends.
Contains time and Cabal submodule update.
Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
Test Plan: validate
Reviewers: niteria, austin, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2846
GHC Trac Issues: #12485
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improved compiler allocations by abut 5%. It comes from one of
1a4c04b1 Fix 'SPECIALISE instance'
c48595ee Never apply worker/wrapper to DFuns
05d233e8 Move InId/OutId to CoreSyn
e07ad4db Don't eta-expand in stable unfoldings
d250d493 Add INLINE pragamas on Traversable default methods
c66dd05c Move typeSize/coercionSize into TyCoRep
I think d250d493.
But it's good anyway.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This test seems to have much different allocation behavior on Windows
and Linux. Previously we had widened the acceptance window to 7% to
accomodate this, but even this isn't enough any more. Instead of further
widening the window let's just give an expected number for each
platform. Really, this is precisely the issue with our performance
testing model which I've been complaining about in #12758.
Fixes test for #5205 on 64-bit Windows.
Test Plan: Validate on Windows
Reviewers: austin
Subscribers: thomie, Phyx
Differential Revision: https://phabricator.haskell.org/D2848
GHC Trac Issues: #5205
|
|
|
|
|
| |
This deviated by 12% from the expected allocations on Windows.
Yet another case of #12758.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces new rules to perform constant folding through
case-expressions.
E.g.,
```
case t -# 10# of _ { ===> case t of _ {
5# -> e1 15# -> e1
8# -> e2 18# -> e2
DEFAULT -> e DEFAULT -> e
```
The initial motivation is that it allows "Merge Nested Cases"
optimization to kick in and to further simplify the code
(see Trac #12877).
Currently we recognize the following operations for Word# and Int#: Add,
Sub, Xor, Not and Negate (for Int# only).
Test Plan: validate
Reviewers: simonpj, austin, bgamari
Reviewed By: simonpj, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2762
GHC Trac Issues: #12877
|
|
|
|
| |
Refactoring only
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trac #12425 and #12234 showed up a major and long-standing
bug in the occurrence analyser, whereby it could generate
explonentially large program!
There's a lot of commentary on #12425; and it's all described
in Note [Loop breakers, node scoring, and stability]
I did quite a lot of refactoring to make the code comprehensibe
again (its structure had bit-rotted rather), so the patch
looks bigger than it really is.
Hurrah!
I did a nofib run to check that I hadn't inadertently ruined
anything:
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
fluid -0.3% -1.5% 0.01 0.01 +0.0%
parser -0.9% +0.6% 0.04 0.04 +0.0%
prolog -0.1% +1.2% 0.00 0.00 +0.0%
--------------------------------------------------------------------------------
Min -0.9% -1.5% -8.6% -8.7% +0.0%
Max +0.1% +1.2% +7.7% +7.8% +2.4%
Geometric Mean -0.2% -0.0% -0.2% -0.3% +0.0%
I checked what happened in 'prolog'. It seems that we have a
recursive data structure something like this
f :: [blah]
f x = build (\cn. ...g... )
g :: [blah2]
g y = ....(foldr k z (f y))....
If we inline 'f' into 'g' we get better fusion than the other
way round, but we don't have any way to spot that at the moment.
(I wonder if we could do worker/wrapper for functions returning
a 'build'?) It was happening before by a fluke.
Anyway I decided to accept this; it's relatively rare I think.
|
|
|
|
| |
The previous attempt updated the comment but not the value. Silly me.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following D2684 these two tests need to be updated:
* T7037: timeout.exe now waits until all processes are finished.
this makes T7037 reliable. So enabled.
* T876: Unknown reason, allocations are much lower than before.
Test Plan: ./validate
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: thomie, #ghc_windows_task_force
Differential Revision: https://phabricator.haskell.org/D2759
GHC Trac Issues: #12725, #12004
|
|
|
|
| |
Unfortunately it's quite unclear what caused this.
|
|
|
|
|
| |
I think this wave of commits just made these two a little better;
they must have been close to the threshold before.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch takes further my refactoring of the constraint
solver, which I've been doing over the last couple of months
in consultation with Richard.
It fixes a number of tricky bugs that made the constraint
solver actually go into a loop, including
Trac #12526
Trac #12444
Trac #12538
The main changes are these
* Flatten unification variables (fmvs/fuvs) appear on the LHS
of a tvar/tyvar equality; thus
fmv ~ alpha
and not
alpha ~ fmv
See Note [Put flatten unification variables on the left]
in TcUnify. This is implemented by TcUnify.swapOverTyVars.
* Don't reduce a "loopy" CFunEqCan where the fsk appears on
the LHS:
F t1 .. tn ~ fsk
where 'fsk' is free in t1..tn.
See Note [FunEq occurs-check principle] in TcInteract
This neatly stops some infinite loops that people reported;
and it allows us to delete some crufty code in reduce_top_fun_eq.
And it appears to be no loss whatsoever.
As well as fixing loops, ContextStack2 and T5837 both terminate
when they didn't before.
* Previously we generated "derived shadow" constraints from
Wanteds, but we could (and sometimes did; Trac #xxxx) repeatedly
generate a derived shadow from the same Wanted.
A big change in this patch is to have two kinds of Wanteds:
[WD] behaves like a pair of a Wanted and a Derived
[W] behaves like a Wanted only
See CtFlavour and ShadowInfo in TcRnTypes, and the ctev_nosh
field of a Wanted.
This turned out to be a lot simpler. A [WD] gets split into a
[W] and a [D] in TcSMonad.maybeEmitShaodow.
See TcSMonad Note [The improvement story and derived shadows]
* Rather than have a separate inert_model in the InertCans, I've
put the derived equalities back into inert_eqs. We weren't
gaining anything from a separate field.
* Previously we had a mode for the constraint solver in which it
would more aggressively solve Derived constraints; it was used
for simplifying the context of a 'deriving' clause, or a 'default'
delcaration, for example.
But the complexity wasn't worth it; now I just make proper Wanted
constraints. See TcMType.cloneWC
* Don't generate injectivity improvement for Givens; see
Note [No FunEq improvement for Givens] in TcInteract
* solveSimpleWanteds leaves the insolubles in-place rather than
returning them. Simpler.
I also did lots of work on comments, including fixing Trac #12821.
|
|
|
|
| |
Ostensibly due to IfaceType rework, but I have my doubts.
|