From 4dc2002aeb08b8be399f1f535b86a671d18eac04 Mon Sep 17 00:00:00 2001 From: Simon Peyton Jones Date: Thu, 11 Feb 2021 09:47:42 +0000 Subject: Fix over-eager inlining in SimpleOpt In GHC.Core.SimpleOpt, I found that its inlining could duplicate an arbitary redex inside a lambda! Consider (\xyz. x+y). The occurrence-analysis treats the lamdda as a group, and says that both x and y occur once, even though the occur under the lambda-z. See Note [Occurrence analysis for lambda binders] in OccurAnal. When the lambda is under-applied in a call, the Simplifier is careful to zap the occ-info on x,y, because they appear under the \z. (See the call to zapLamBndrs in simplExprF1.) But SimpleOpt missed this test, resulting in #19347. So this patch * commons up the binder-zapping in GHC.Core.Utils.zapLamBndrs. * Calls this new function from GHC.Core.Opt.Simplify * Adds a call to zapLamBndrs to GHC.Core.SimpleOpt.simple_app This change makes test T12990 regress somewhat, but it was always very delicate, so I'm going to put up with that. In this voyage I also discovered a small, rather unrelated infelicity in the Simplifier: * In GHC.Core.Opt.Simplify.simplNonRecX we should apply isStrictId to the OutId not the InId. See Note [Dark corner with levity polymorphism] It may never "bite", because SimpleOpt should have inlined all the levity-polymorphic compulsory inlnings already, but somehow it bit me at one point and it's generally a more solid thing to do. Fixing the main bug increases runtime allocation in test perf/should_run/T12990, for (acceptable) reasons explained in a comement on Metric Increase: T12990 --- testsuite/tests/perf/should_run/T19347.hs | 30 +++++++++++++++++++++++++++ testsuite/tests/perf/should_run/T19347.stdout | 1 + testsuite/tests/perf/should_run/all.T | 5 +++++ 3 files changed, 36 insertions(+) create mode 100644 testsuite/tests/perf/should_run/T19347.hs create mode 100644 testsuite/tests/perf/should_run/T19347.stdout (limited to 'testsuite/tests/perf') diff --git a/testsuite/tests/perf/should_run/T19347.hs b/testsuite/tests/perf/should_run/T19347.hs new file mode 100644 index 0000000000..c885eac724 --- /dev/null +++ b/testsuite/tests/perf/should_run/T19347.hs @@ -0,0 +1,30 @@ +{-# LANGUAGE UnboxedTuples #-} + +module Main where + +data T = MkT !Int Int + +-- An expensive recursive function +g :: Int -> Int -> (# Int, Int #) +g x 0 = (# x, 33 #) +g x n = g (x+n) (n-1) + +-- 'foo' calls 'h' often +foo h 0 = 0 +foo h n = h n `seq` foo h (n-1) + +main = print (foo (MkT (case g 1 200 of (# a,b #) -> a)) + 200) + +{- In main, we don't want to eta-expand the MkT to + (\x. MkT (case g 1 200 of (# a,b #) -> a) x) +because then that call to g may be made more often +The faffing with unboxed tuples is to defeat full +laziness which would otherwise lift the call to g +out to top level + +Before fixing #19347, running this program gave + 2,012,096 bytes allocated in the heap +after it gave + 101,712 bytes allocated in the heap +-} diff --git a/testsuite/tests/perf/should_run/T19347.stdout b/testsuite/tests/perf/should_run/T19347.stdout new file mode 100644 index 0000000000..573541ac97 --- /dev/null +++ b/testsuite/tests/perf/should_run/T19347.stdout @@ -0,0 +1 @@ +0 diff --git a/testsuite/tests/perf/should_run/all.T b/testsuite/tests/perf/should_run/all.T index 75044776ca..0cb7c7a73a 100644 --- a/testsuite/tests/perf/should_run/all.T +++ b/testsuite/tests/perf/should_run/all.T @@ -385,3 +385,8 @@ test('T18574', compile_and_run, ['-O']) +test('T19347', + [collect_stats('bytes allocated', 5), only_ways(['normal'])], + compile_and_run, + ['-O']) + -- cgit v1.2.1