Allow simplification through runRW#

Because runRW# inlines so late, we were previously able to do very little simplification across it. For instance, given even a simple program like case runRW# (\s -> let n = I# 42# in n) of I# n# -> f n# we previously had no way to avoid the allocation of the I#. This patch allows the simplifier to push strict contexts into the continuation of a runRW# application, as explained in in Note [Simplification of runRW#] in GHC.CoreToStg.Prep. Fixes #15127. Metric Increase: T9961 Metric Decrease: ManyConstructors Co-Authored-By: Simon Peyton-Jone <simonpj@microsoft.com>
author: Ben Gamari <ben@smart-cactus.org> 2020-04-19 11:16:32 -0400
committer: Marge Bot <ben+marge-bot@smart-cactus.org> 2020-05-29 01:39:19 -0400
commit: 46720997a0b1fa2971a884adf43de096ce130a7e (patch)
tree: f4ff59bdb6979c03f535b7682c170a057d0f5607 /compiler/GHC/CoreToStg/Prep.hs
parent: c49f7df02ff02c8f09e6a6e00a271b867ca6b092 (diff)
download: haskell-46720997a0b1fa2971a884adf43de096ce130a7e.tar.gz
1 files changed, 83 insertions, 5 deletions
diff --git a/compiler/GHC/CoreToStg/Prep.hs b/compiler/GHC/CoreToStg/Prep.hs
index 32bbfff214..ac74dd6723 100644
--- a/compiler/GHC/CoreToStg/Prep.hs
+++ b/compiler/GHC/CoreToStg/Prep.hs
@@ -829,14 +829,23 @@ cpeApp top_env expr
         -- rather than the far superior "f x y".  Test case is par01.
         = let (terminal, args', depth') = collect_args arg
           in cpe_app env terminal (args' ++ args) (depth + depth' - 1)
-    cpe_app env (Var f) [CpeApp _runtimeRep@Type{}, CpeApp _type@Type{}, CpeApp arg] 1
+    cpe_app env (Var f) (CpeApp _runtimeRep@Type{} : CpeApp _type@Type{} : CpeApp arg : rest) n
         | f `hasKey` runRWKey
+        -- N.B. While it may appear that n == 1 in the case of runRW#
+        -- applications, keep in mind that we may have applications that return
+        , n >= 1
         -- See Note [runRW magic]
         -- Replace (runRW# f) by (f realWorld#), beta reducing if possible (this
         -- is why we return a CorePrepEnv as well)
         = case arg of
-            Lam s body -> cpe_app (extendCorePrepEnv env s realWorldPrimId) body [] 0
-            _          -> cpe_app env arg [CpeApp (Var realWorldPrimId)] 1
+            Lam s body -> cpe_app (extendCorePrepEnv env s realWorldPrimId) body rest (n-2)
+            _          -> cpe_app env arg (CpeApp (Var realWorldPrimId) : rest) (n-1)
+             -- TODO: What about casts?
+
+    cpe_app _env (Var f) args n
+        | f `hasKey` runRWKey
+        = pprPanic "cpe_app(runRW#)" (ppr args $$ ppr n)
+
     cpe_app env (Var v) args depth
       = do { v1 <- fiddleCCall v
            ; let e2 = lookupCorePrepEnv env v1
@@ -965,8 +974,77 @@ pragma.  It is levity-polymorphic.
            => (State# RealWorld -> (# State# RealWorld, o #))
                               -> (# State# RealWorld, o #)
 
-It needs no special treatment in GHC except this special inlining here
-in CorePrep (and in GHC.CoreToByteCode).
+It's correctness needs no special treatment in GHC except this special inlining
+here in CorePrep (and in GHC.CoreToByteCode).
+
+However, there are a variety of optimisation opportunities that the simplifier
+takes advantage of. See Note [Simplification of runRW#].
+
+
+Note [Simplification of runRW#]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider the program,
+
+    case runRW# (\s -> let n = I# 42# in n) of
+      I# n# -> f n#
+
+There is no reason why we should allocate an I# constructor given that we
+immediately destructure it. To avoid this the simplifier will push strict
+contexts into runRW's continuation. That is, it transforms
+
+    K[ runRW# @r @ty cont ]
+              ~>
+    runRW# @r @ty K[cont]
+
+This has a few interesting implications. Consider, for instance, this program:
+
+    join j = ...
+    in case runRW# @r @ty cont of
+         result -> jump j result
+
+Performing the transform described above would result in:
+
+    join j x = ...
+    in runRW# @r @ty (\s ->
+         case cont of in
+           result -> jump j result
+       )
+
+If runRW# were a "normal" function this call to join point j would not be
+allowed in its continuation argument. However, since runRW# is inlined (as
+described in Note [runRW magic] above), such join point occurences are
+completely fine. Both occurrence analysis and Core Lint have special treatment
+for runRW# applications. See Note [Linting of runRW#] for details on the latter.
+
+Moreover, it's helpful to ensure that runRW's continuation isn't floated out
+(since doing so would then require a call, whereas we would otherwise end up
+with straight-line). Consequently, GHC.Core.Opt.SetLevels.lvlApp has special
+treatment for runRW# applications, ensure the arguments are not floated if
+MFEs.
+
+Other considered designs
+------------------------
+
+One design that was rejected was to *require* that runRW#'s continuation be
+headed by a lambda. However, this proved to be quite fragile. For instance,
+SetLevels is very eager to float bottoming expressions. For instance given
+something of the form,
+
+    runRW# @r @ty (\s -> case expr of x -> undefined)
+
+SetLevels will see that the body the lambda is bottoming and will consequently
+float it to the top-level (assuming expr has no free coercion variables which
+prevent this). We therefore end up with
+
+    runRW# @r @ty (\s -> lvl s)
+
+Which the simplifier will beta reduce, leaving us with
+
+    runRW# @r @ty lvl
+
+Breaking our desired invariant. Ultimately we decided to simply accept that
+the continuation may not be a manifest lambda.
+
 
 -- ---------------------------------------------------------------------------
 --      CpeArg: produces a result satisfying CpeArg
author	Ben Gamari <ben@smart-cactus.org>	2020-04-19 11:16:32 -0400
committer	Marge Bot <ben+marge-bot@smart-cactus.org>	2020-05-29 01:39:19 -0400
commit	46720997a0b1fa2971a884adf43de096ce130a7e (patch)
tree	f4ff59bdb6979c03f535b7682c170a057d0f5607 /compiler/GHC/CoreToStg/Prep.hs
parent	c49f7df02ff02c8f09e6a6e00a271b867ca6b092 (diff)
download	haskell-46720997a0b1fa2971a884adf43de096ce130a7e.tar.gz