diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2020-05-21 12:53:35 +0100 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2020-06-10 04:17:06 -0400 |
commit | 6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c (patch) | |
tree | 3129da29bee5d7f34666133a4247a60418d30212 /testsuite/tests/perf/compiler | |
parent | 9454511b0bdfcd79a1899d7f24bf65a3eb0d06e3 (diff) | |
download | haskell-6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c.tar.gz |
Implement cast worker/wrapper properly
The cast worker/wrapper transformation transforms
x = e |> co
into
y = e
x = y |> co
This is done by the simplifier, but we were being
careless about transferring IdInfo from x to y,
and about what to do if x is a NOINLNE function.
This resulted in a series of bugs:
#17673, #18093, #18078.
This patch fixes all that:
* Main change is in GHC.Core.Opt.Simplify, and
the new prepareBinding function, which does this
cast worker/wrapper transform.
See Note [Cast worker/wrappers].
* There is quite a bit of refactoring around
prepareRhs, makeTrivial etc. It's nicer now.
* Some wrappers from strictness and cast w/w, notably those for
a function with a NOINLINE, should inline very late. There
wasn't really a mechanism for that, which was an existing bug
really; so I invented a new finalPhase = Phase (-1). It's used
for all simplifier runs after the user-visible phase 2,1,0 have
run. (No new runs of the simplifier are introduced thereby.)
See new Note [Compiler phases] in GHC.Types.Basic;
the main changes are in GHC.Core.Opt.Driver
* Doing this made me trip over two places where the AnonArgFlag on a
FunTy was being lost so we could end up with (Num a -> ty)
rather than (Num a => ty)
- In coercionLKind/coercionRKind
- In contHoleType in the Simplifier
I fixed the former by defining mkFunctionType and using it in
coercionLKind/RKind.
I could have done the same for the latter, but the information
is almost to hand. So I fixed the latter by
- adding sc_hole_ty to ApplyToVal (like ApplyToTy),
- adding as_hole_ty to ValArg (like TyArg)
- adding sc_fun_ty to StrictArg
Turned out I could then remove ai_type from ArgInfo. This is
just moving the deck chairs around, but it worked out nicely.
See the new Note [AnonArgFlag] in GHC.Types.Var
* When looking at the 'arity decrease' thing (#18093) I discovered
that stable unfoldings had a much lower arity than the actual
optimised function. That's what led to the arity-decrease
message. Simple solution: eta-expand.
It's described in Note [Eta-expand stable unfoldings]
in GHC.Core.Opt.Simplify
* I also discovered that unsafeCoerce wasn't being inlined if
the context was boring. So (\x. f (unsafeCoerce x)) would
create a thunk -- yikes! I fixed that by making inlineBoringOK
a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold.
I also found that unsafeCoerceName was unused, so I removed it.
I made a test case for #18078, and a very similar one for #17673.
The net effect of all this on nofib is very modest, but positive:
--------------------------------------------------------------------------------
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
anna -0.4% -0.1% -3.1% -3.1% 0.0%
fannkuch-redux -0.4% -0.3% -0.1% -0.1% 0.0%
maillist -0.4% -0.1% -7.8% -1.0% -14.3%
primetest -0.4% -15.6% -7.1% -6.6% 0.0%
--------------------------------------------------------------------------------
Min -0.9% -15.6% -13.3% -14.2% -14.3%
Max -0.3% 0.0% +12.1% +12.4% 0.0%
Geometric Mean -0.4% -0.2% -2.3% -2.2% -0.1%
All following metric decreases are compile-time allocation decreases
between -1% and -3%:
Metric Decrease:
T5631
T13701
T14697
T15164
Diffstat (limited to 'testsuite/tests/perf/compiler')
-rw-r--r-- | testsuite/tests/perf/compiler/T16473.stdout | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/testsuite/tests/perf/compiler/T16473.stdout b/testsuite/tests/perf/compiler/T16473.stdout index 2d14bc5fe7..755017fd83 100644 --- a/testsuite/tests/perf/compiler/T16473.stdout +++ b/testsuite/tests/perf/compiler/T16473.stdout @@ -1,10 +1,10 @@ Rule fired: Class op fmap (BUILTIN) Rule fired: Class op liftA2 (BUILTIN) -Rule fired: Class op $p1Applicative (BUILTIN) Rule fired: Class op <*> (BUILTIN) -Rule fired: Class op <$ (BUILTIN) Rule fired: Class op $p1Applicative (BUILTIN) +Rule fired: Class op <$ (BUILTIN) Rule fired: Class op <*> (BUILTIN) +Rule fired: Class op $p1Applicative (BUILTIN) Rule fired: Class op fmap (BUILTIN) Rule fired: Class op pure (BUILTIN) Rule fired: Class op pure (BUILTIN) |