summaryrefslogtreecommitdiff
path: root/testsuite/tests/perf
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2020-05-21 12:53:35 +0100
committerMarge Bot <ben+marge-bot@smart-cactus.org>2020-06-10 04:17:06 -0400
commit6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c (patch)
tree3129da29bee5d7f34666133a4247a60418d30212 /testsuite/tests/perf
parent9454511b0bdfcd79a1899d7f24bf65a3eb0d06e3 (diff)
downloadhaskell-6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c.tar.gz
Implement cast worker/wrapper properly
The cast worker/wrapper transformation transforms x = e |> co into y = e x = y |> co This is done by the simplifier, but we were being careless about transferring IdInfo from x to y, and about what to do if x is a NOINLNE function. This resulted in a series of bugs: #17673, #18093, #18078. This patch fixes all that: * Main change is in GHC.Core.Opt.Simplify, and the new prepareBinding function, which does this cast worker/wrapper transform. See Note [Cast worker/wrappers]. * There is quite a bit of refactoring around prepareRhs, makeTrivial etc. It's nicer now. * Some wrappers from strictness and cast w/w, notably those for a function with a NOINLINE, should inline very late. There wasn't really a mechanism for that, which was an existing bug really; so I invented a new finalPhase = Phase (-1). It's used for all simplifier runs after the user-visible phase 2,1,0 have run. (No new runs of the simplifier are introduced thereby.) See new Note [Compiler phases] in GHC.Types.Basic; the main changes are in GHC.Core.Opt.Driver * Doing this made me trip over two places where the AnonArgFlag on a FunTy was being lost so we could end up with (Num a -> ty) rather than (Num a => ty) - In coercionLKind/coercionRKind - In contHoleType in the Simplifier I fixed the former by defining mkFunctionType and using it in coercionLKind/RKind. I could have done the same for the latter, but the information is almost to hand. So I fixed the latter by - adding sc_hole_ty to ApplyToVal (like ApplyToTy), - adding as_hole_ty to ValArg (like TyArg) - adding sc_fun_ty to StrictArg Turned out I could then remove ai_type from ArgInfo. This is just moving the deck chairs around, but it worked out nicely. See the new Note [AnonArgFlag] in GHC.Types.Var * When looking at the 'arity decrease' thing (#18093) I discovered that stable unfoldings had a much lower arity than the actual optimised function. That's what led to the arity-decrease message. Simple solution: eta-expand. It's described in Note [Eta-expand stable unfoldings] in GHC.Core.Opt.Simplify * I also discovered that unsafeCoerce wasn't being inlined if the context was boring. So (\x. f (unsafeCoerce x)) would create a thunk -- yikes! I fixed that by making inlineBoringOK a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold. I also found that unsafeCoerceName was unused, so I removed it. I made a test case for #18078, and a very similar one for #17673. The net effect of all this on nofib is very modest, but positive: -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- anna -0.4% -0.1% -3.1% -3.1% 0.0% fannkuch-redux -0.4% -0.3% -0.1% -0.1% 0.0% maillist -0.4% -0.1% -7.8% -1.0% -14.3% primetest -0.4% -15.6% -7.1% -6.6% 0.0% -------------------------------------------------------------------------------- Min -0.9% -15.6% -13.3% -14.2% -14.3% Max -0.3% 0.0% +12.1% +12.4% 0.0% Geometric Mean -0.4% -0.2% -2.3% -2.2% -0.1% All following metric decreases are compile-time allocation decreases between -1% and -3%: Metric Decrease: T5631 T13701 T14697 T15164
Diffstat (limited to 'testsuite/tests/perf')
-rw-r--r--testsuite/tests/perf/compiler/T16473.stdout4
1 files changed, 2 insertions, 2 deletions
diff --git a/testsuite/tests/perf/compiler/T16473.stdout b/testsuite/tests/perf/compiler/T16473.stdout
index 2d14bc5fe7..755017fd83 100644
--- a/testsuite/tests/perf/compiler/T16473.stdout
+++ b/testsuite/tests/perf/compiler/T16473.stdout
@@ -1,10 +1,10 @@
Rule fired: Class op fmap (BUILTIN)
Rule fired: Class op liftA2 (BUILTIN)
-Rule fired: Class op $p1Applicative (BUILTIN)
Rule fired: Class op <*> (BUILTIN)
-Rule fired: Class op <$ (BUILTIN)
Rule fired: Class op $p1Applicative (BUILTIN)
+Rule fired: Class op <$ (BUILTIN)
Rule fired: Class op <*> (BUILTIN)
+Rule fired: Class op $p1Applicative (BUILTIN)
Rule fired: Class op fmap (BUILTIN)
Rule fired: Class op pure (BUILTIN)
Rule fired: Class op pure (BUILTIN)