From 6d49d5be904c0c01788fa7aae1b112d5b4dfaf1c Mon Sep 17 00:00:00 2001 From: Simon Peyton Jones Date: Thu, 21 May 2020 12:53:35 +0100 Subject: Implement cast worker/wrapper properly The cast worker/wrapper transformation transforms x = e |> co into y = e x = y |> co This is done by the simplifier, but we were being careless about transferring IdInfo from x to y, and about what to do if x is a NOINLNE function. This resulted in a series of bugs: #17673, #18093, #18078. This patch fixes all that: * Main change is in GHC.Core.Opt.Simplify, and the new prepareBinding function, which does this cast worker/wrapper transform. See Note [Cast worker/wrappers]. * There is quite a bit of refactoring around prepareRhs, makeTrivial etc. It's nicer now. * Some wrappers from strictness and cast w/w, notably those for a function with a NOINLINE, should inline very late. There wasn't really a mechanism for that, which was an existing bug really; so I invented a new finalPhase = Phase (-1). It's used for all simplifier runs after the user-visible phase 2,1,0 have run. (No new runs of the simplifier are introduced thereby.) See new Note [Compiler phases] in GHC.Types.Basic; the main changes are in GHC.Core.Opt.Driver * Doing this made me trip over two places where the AnonArgFlag on a FunTy was being lost so we could end up with (Num a -> ty) rather than (Num a => ty) - In coercionLKind/coercionRKind - In contHoleType in the Simplifier I fixed the former by defining mkFunctionType and using it in coercionLKind/RKind. I could have done the same for the latter, but the information is almost to hand. So I fixed the latter by - adding sc_hole_ty to ApplyToVal (like ApplyToTy), - adding as_hole_ty to ValArg (like TyArg) - adding sc_fun_ty to StrictArg Turned out I could then remove ai_type from ArgInfo. This is just moving the deck chairs around, but it worked out nicely. See the new Note [AnonArgFlag] in GHC.Types.Var * When looking at the 'arity decrease' thing (#18093) I discovered that stable unfoldings had a much lower arity than the actual optimised function. That's what led to the arity-decrease message. Simple solution: eta-expand. It's described in Note [Eta-expand stable unfoldings] in GHC.Core.Opt.Simplify * I also discovered that unsafeCoerce wasn't being inlined if the context was boring. So (\x. f (unsafeCoerce x)) would create a thunk -- yikes! I fixed that by making inlineBoringOK a bit cleverer: see Note [Inline unsafeCoerce] in GHC.Core.Unfold. I also found that unsafeCoerceName was unused, so I removed it. I made a test case for #18078, and a very similar one for #17673. The net effect of all this on nofib is very modest, but positive: -------------------------------------------------------------------------------- Program Size Allocs Runtime Elapsed TotalMem -------------------------------------------------------------------------------- anna -0.4% -0.1% -3.1% -3.1% 0.0% fannkuch-redux -0.4% -0.3% -0.1% -0.1% 0.0% maillist -0.4% -0.1% -7.8% -1.0% -14.3% primetest -0.4% -15.6% -7.1% -6.6% 0.0% -------------------------------------------------------------------------------- Min -0.9% -15.6% -13.3% -14.2% -14.3% Max -0.3% 0.0% +12.1% +12.4% 0.0% Geometric Mean -0.4% -0.2% -2.3% -2.2% -0.1% All following metric decreases are compile-time allocation decreases between -1% and -3%: Metric Decrease: T5631 T13701 T14697 T15164 --- testsuite/tests/perf/compiler/T16473.stdout | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) (limited to 'testsuite/tests/perf') diff --git a/testsuite/tests/perf/compiler/T16473.stdout b/testsuite/tests/perf/compiler/T16473.stdout index 2d14bc5fe7..755017fd83 100644 --- a/testsuite/tests/perf/compiler/T16473.stdout +++ b/testsuite/tests/perf/compiler/T16473.stdout @@ -1,10 +1,10 @@ Rule fired: Class op fmap (BUILTIN) Rule fired: Class op liftA2 (BUILTIN) -Rule fired: Class op $p1Applicative (BUILTIN) Rule fired: Class op <*> (BUILTIN) -Rule fired: Class op <$ (BUILTIN) Rule fired: Class op $p1Applicative (BUILTIN) +Rule fired: Class op <$ (BUILTIN) Rule fired: Class op <*> (BUILTIN) +Rule fired: Class op $p1Applicative (BUILTIN) Rule fired: Class op fmap (BUILTIN) Rule fired: Class op pure (BUILTIN) Rule fired: Class op pure (BUILTIN) -- cgit v1.2.1