Make DeriveFunctor-generated code require fewer beta reductions

Issue #17880 demonstrates that `DeriveFunctor`-generated code is surprisingly fragile when rank-_n_ types are involved. The culprit is that `$fmap` (the algorithm used to generate `fmap` implementations) was too keen on applying arguments with rank-_n_ types to lambdas, which fail to typecheck more often than not. In this patch, I change `$fmap` (both the specification and the implementation) to produce code that avoids creating as many lambdas, avoiding problems when rank-_n_ field types arise. See the comments titled "Functor instances" in `TcGenFunctor` for a more detailed description. Not only does this fix #17880, but it also ensures that the code that `DeriveFunctor` generates will continue to work after simplified subsumption is implemented (see #17775). What is truly amazing is that #17880 is actually a regression (introduced in GHC 7.6.3) caused by commit 49ca2a37bef18aa57235ff1dbbf1cc0434979b1e, the fix #7436. Prior to that commit, the version of `$fmap` that was used was almost identical to the one used in this patch! Why did that commit change `$fmap` then? It was to avoid severe performance issues that would arise for recursive `fmap` implementations, such as in the example below: ```hs data List a = Nil | Cons a (List a) deriving Functor -- ===> instance Functor List where fmap f Nil = Nil fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs) ``` The fact that `\y -> f y` was eta expanded caused significant performance overheads. Commit 49ca2a37bef18aa57235ff1dbbf1cc0434979b1e fixed this performance issue, but it went too far. As a result, this patch partially reverts 49ca2a37bef18aa57235ff1dbbf1cc0434979b1e. To ensure that the performance issues pre-#7436 do not resurface, I have taken some precautionary measures: * I have added a special case to `$fmap` for situations where the last type variable in an application of some type occurs directly. If this special case fires, we avoid creating a lambda expression. This ensures that we generate `fmap f (Cons x xs) = Cons (f x) (fmap f xs)` in the derived `Functor List` instance above. For more details, see `Note [Avoid unnecessary eta expansion in derived fmap implementations]` in `TcGenFunctor`. * I have added a `T7436b` test case to ensure that the performance of this derived `Functor List`-style code does not regress. When implementing this, I discovered that `$replace`, the algorithm which generates implementations of `(<$)`, has a special case that is very similar to the `$fmap` special case described above. `$replace` marked this special case with a custom `Replacer` data type, which was a bit overkill. In order to use the same machinery for both `Functor` methods, I ripped out `Replacer` and instead implemented a simple way to detect the special case. See the updated commentary in `Note [Deriving <$]` for more details.
author: Ryan Scott <ryan.gl.scott@gmail.com> 2020-03-06 11:47:56 -0500
committer: Marge Bot <ben+marge-bot@smart-cactus.org> 2020-03-11 20:34:14 -0400
commit: cb93a1a4405b448e83cad973f93dab3f7f050736 (patch)
tree: 11fecca8bf924585c8acfefa18784fa1ecdcedf7 /testsuite/tests/perf
parent: a6989971379c26d8c288551d536149675e009e34 (diff)
download: haskell-cb93a1a4405b448e83cad973f93dab3f7f050736.tar.gz
3 files changed, 28 insertions, 0 deletions
diff --git a/testsuite/tests/perf/should_run/T7436b.hs b/testsuite/tests/perf/should_run/T7436b.hs
new file mode 100644
index 0000000000..e5f09aba9b
--- /dev/null
+++ b/testsuite/tests/perf/should_run/T7436b.hs
@@ -0,0 +1,20 @@
+{-# LANGUAGE DeriveFunctor #-}
+-- A variation on T7436 that tests a derived Functor instance.
+module Main where
+
+data List a = Nil | Cons a (List a)
+    deriving Functor
+
+mkList :: Int -> List Int
+mkList 0 = Nil
+mkList n = Cons n (mkList (n-1))
+
+sumList :: List Int -> Int
+sumList = go 0
+  where
+    go a Nil = a
+    go a (Cons n ns) = a `seq` go (a+n) ns
+
+main :: IO ()
+main = print $ sumList . fmap id $ mkList n
+  where n = 40000
diff --git a/testsuite/tests/perf/should_run/T7436b.stdout b/testsuite/tests/perf/should_run/T7436b.stdout
new file mode 100644
index 0000000000..3d6a314caa
--- /dev/null
+++ b/testsuite/tests/perf/should_run/T7436b.stdout
@@ -0,0 +1 @@
+800020000
diff --git a/testsuite/tests/perf/should_run/all.T b/testsuite/tests/perf/should_run/all.T
index 89f1fc8ecd..d0dec5b2a6 100644
--- a/testsuite/tests/perf/should_run/all.T
+++ b/testsuite/tests/perf/should_run/all.T
@@ -198,6 +198,13 @@ test('T7436',
      compile_and_run,
      ['-O'])
 
+test('T7436b',
+     [collect_stats('max_bytes_used',4),
+      only_ways(['normal'])
+      ],
+     compile_and_run,
+     ['-O'])
+
 test('T7797',
       [collect_stats('bytes allocated',5),
       extra_clean(['T7797a.hi', 'T7797a.o']),
author	Ryan Scott <ryan.gl.scott@gmail.com>	2020-03-06 11:47:56 -0500
committer	Marge Bot <ben+marge-bot@smart-cactus.org>	2020-03-11 20:34:14 -0400
commit	cb93a1a4405b448e83cad973f93dab3f7f050736 (patch)
tree	11fecca8bf924585c8acfefa18784fa1ecdcedf7 /testsuite/tests/perf
parent	a6989971379c26d8c288551d536149675e009e34 (diff)
download	haskell-cb93a1a4405b448e83cad973f93dab3f7f050736.tar.gz