Eta-reduce PAPs

This patch arranges to eta-reduce if doing so produces a PAP. Thus \x. foldr e1 e2 x ==> foldr e1 e2 In other direction we are already careful not to eta-expand foldr e1 e2 ==> \x. foldr e1 e2 x See Note [Do not eta-expand PAPs] in GHC.Core.Opt.Simplify.Utils So this patch just makes it work symmetrically when considering eta-reduction. I noticed this when examining #18993 and, although it is delicate, this patch does fix the regression in #18993. But that's not the main point here. Specifics: * In GHC.Core.Utils.tryEtaReduce, allow eta-reducing if we get a PAP. This changes the function ok_fun a bit. I also moved tryEtaReduce from GHC.Core.Utils to GHC.Core.Opt.Arity, where it properly belongs. See Note [Eta reduce PAPs] * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, pull out the code for when eta-expansion is wanted, to make wantEtaExpansion, and all that same function in GHC.Core.Opt.Simplify.simplStableUnfolding. It was previously inconsistent, but it's doing the same thing. * I did a substantial refactor of ArityType; see Note [ArityType]. This allowed me to do away with the somewhat mysterious takeOneShots; more generally it allows arityType to describe the function, leaving its clients to decide how to use that information. I made ArityType abstract, so that clients have to use functions to access it. * Make GHC.Core.Opt.Simplify.Utils.rebuildLam (was stupidly called mkLam before) aware of the floats that the simplifier builds up, so that it can still do eta-reduction even if there are some floats. (Previously that would not happen.) That means passing the floats to rebuildLam, and an extra check when eta-reducting (etaFloatOk). * I made the CallCtxt which the Simplifier uses distinguish between recursive and non-recursive right-hand sides. I can't now quite remember why, but it must have seemed important at the time. It affects two things: - We only eta-reduce non-recursive RHS, rather than eta-reducing every lambda - We call an RHS context interesting only if it is non-recursive see Note [RHS of lets] in GHC.Core.Unfold * Now that we eta-reduce to expose PAPs in GHC.Core.Opt.Arity, we no longer need to do so in GHC.CoreToStg.Prep, a welcome simplification. See Note [No eta reduction needed in rhsToBody] in GHC.CoreToStg.Prep. * In GHC.Core.Opt.Simplify.Utils.tryEtaExpandRhs, make use of call-info in the idDemandInfo of the binder, as well as the CallArity info. The occurrence analyser did this but we were failing to take advantage here. In the end I moved the heavy lifting to GHC.Core.Opt.Arity.findRhsArity; see Note [Combining arityType with demand info], and functions idDemandOneShots and combineWithDemandOneShots. (These changes partly drove my refactoring of ArityType.) Other incidental changes * Fix a fairly long-standing outright bug in the ApplyToVal case of GHC.Core.Opt.Simplify.mkDupableContWithDmds. I was failing to take the tail of 'dmds' in the recursive call, which meant the demands were All Wrong. I have no idea why this has not caused problems before now. * Delete dead function GHC.Core.Opt.Simplify.Utils.contIsRhsOrArg
author: Simon Peyton Jones <simonpj@microsoft.com> 2021-06-16 12:32:08 +0100
committer: Simon Peyton Jones <simonpj@microsoft.com> 2021-07-20 11:44:10 +0100
commit: cc8ea077a0ee58a76dbb7f4ae56c1f571f9f720f (patch)
tree: 95c95f9e20123f0d7ddbc7ad99c87baea05d8946
parent: 58b960d2af0ebfc37104ec68a4df377a074951dd (diff)
download: haskell-cc8ea077a0ee58a76dbb7f4ae56c1f571f9f720f.tar.gz
24 files changed, 1545 insertions, 539 deletions
diff --git a/compiler/GHC/Core/Lint.hs b/compiler/GHC/Core/Lint.hs
index f20dbcc62b..73897c9e63 100644
--- a/compiler/GHC/Core/Lint.hs
+++ b/compiler/GHC/Core/Lint.hs
@@ -1895,7 +1895,6 @@ try to trim the forall'd binder list.
 
 Note [Rules for join points]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
 A join point cannot be partially applied. However, the left-hand side of a rule
 for a join point is effectively a *pattern*, not a piece of code, so there's an
 argument to be made for allowing a situation like this:
diff --git a/compiler/GHC/Core/Opt/Arity.hs b/compiler/GHC/Core/Opt/Arity.hs
index e5e63aca26..0b1c78c46d 100644
--- a/compiler/GHC/Core/Opt/Arity.hs
+++ b/compiler/GHC/Core/Opt/Arity.hs
@@ -11,14 +11,21 @@
 
 -- | Arity and eta expansion
 module GHC.Core.Opt.Arity
-   ( manifestArity, joinRhsArity, exprArity, typeArity
-   , exprEtaExpandArity, findRhsArity
-   , etaExpand, etaExpandAT
+   ( -- Finding arity
+     manifestArity, joinRhsArity, exprArity, typeArity
+   , findRhsArity
    , exprBotStrictness_maybe
 
+   -- ** Eta expansion
+   , exprEtaExpandArity, etaExpand, etaExpandAT
+
+   -- ** Eta reduction
+   , tryEtaReduce
+
    -- ** ArityType
-   , ArityType(..), mkBotArityType, mkTopArityType, expandableArityType
-   , arityTypeArity, maxWithArity, idArityType
+   , ArityType, mkBotArityType, mkManifestArityType
+   , expandableArityType
+   , arityTypeArity, arityTypeArityDiv, idArityType
 
    -- ** Join points
    , etaExpandToJoinPoint, etaExpandToJoinPointRule
@@ -39,7 +46,7 @@ import GHC.Core.Utils
 import GHC.Core.DataCon
 import GHC.Core.TyCon     ( tyConArity )
 import GHC.Core.TyCon.RecWalk     ( initRecTc, checkRecTc )
-import GHC.Core.Predicate ( isDictTy )
+import GHC.Core.Predicate ( isDictTy, isEvVar )
 import GHC.Core.Multiplicity
 
 -- We have two sorts of substitution:
@@ -50,9 +57,8 @@ import GHC.Core.Type     as Type
 import GHC.Core.Coercion as Type
 
 import GHC.Types.Demand
-import GHC.Types.Var
-import GHC.Types.Var.Env
 import GHC.Types.Id
+import GHC.Types.Var.Env
 import GHC.Types.Var.Set
 import GHC.Types.Basic
 import GHC.Types.Tickish
@@ -183,9 +189,8 @@ exprBotStrictness_maybe e
   where
     sig ar = mkClosedDmdSig (replicate ar topDmd) botDiv
 
-{-
-Note [exprArity invariant]
-~~~~~~~~~~~~~~~~~~~~~~~~~~
+{- Note [exprArity invariant]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 exprArity has the following invariants:
 
   (1) If typeArity (exprType e) = n,
@@ -453,19 +458,60 @@ but not to introduce a new lambda.
 
 Note [ArityType]
 ~~~~~~~~~~~~~~~~
+ArityType can be thought of as an abstraction of an expression.
+The ArityType
+   AT [ (IsCheap,     NoOneShotInfo)
+      , (IsExpensive, OneShotLam)
+      , (IsCheap,     OneShotLam) ] Dunno)
+
+abstracts an expression like
+   \x. let <expensive> in
+       \y{os}.
+       \z{os}. blah
+
+In general we have (AT prs div).  Then
+* In prs :: [(Cost,OneShotInfo)]
+  * The Cost flag describes the part of the expression down
+    to the first (value) lambda.
+  * The OneShotInfo flag gives the one-shot info on that lambda.
+
+* If 'div; is dead-ending ('isDeadEndDiv'), then application to
+  'length prs' arguments will surely diverge, similar to the situation
+  with 'DmdType'.
+
 ArityType is the result of a compositional analysis on expressions,
 from which we can decide the real arity of the expression (extracted
 with function exprEtaExpandArity).
 
 We use the following notation:
-  at  ::= \o1..on.div
+  at  ::= \p1..pn.div
   div ::= T | x | ⊥
-  o   ::= ? | 1
+  p   ::= (c o)
+  c   ::= X | C    -- Expensive or Cheap
+  o   ::= ? | 1    -- NotOneShot or OneShotLam
+
 And omit the \. if n = 0. Examples:
-  \?11.T stands for @AT [NoOneShotInfo,OneShotLam,OneShotLam] topDiv@
-  ⊥      stands for @AT [] botDiv@
+  \(C?)(CX)(C1).T
+stands for
+  @AT [(IsCheap,NoOneShotInfo),(IsExpensive,OneShotLam),(IsCheap,OneShotLam)] topDiv@
+
+And ⊥ stands for @AT [] botDiv@
 See the 'Outputable' instance for more information. It's pretty simple.
 
+Example:
+      f = \x\y. let v = <expensive> in
+          \s(one-shot) \t(one-shot). blah
+      'f' has arity type \(C?)(C?)(X1)(C1).T
+      The one-shot-ness means we can, in effect, push that
+      'let' inside the \st, and expand to arity 4
+
+
+Suppose f = \xy. x+y
+Then  f             :: \(C?)(C?).T
+      f v           :: \(C?).T
+      f <expensive> :: \(X?).T
+
+
 Here is what the fields mean. If an arbitrary expression 'f' has
 ArityType 'at', then
 
@@ -478,9 +524,9 @@ ArityType 'at', then
        let x = <expensive> in \y. error (g x y)
        ==> \y. let x = <expensive> in error (g x y)
 
- * If @at = AT [o1,..,on] topDiv@ (notation: \o1..on.T), then expanding 'f'
-   to @\x1..xn. f x1..xn@ loses no sharing, assuming the calls of f respect
-   the one-shot-ness o1..on of its definition.
+ * If `f` has ArityType `at` we can eta-expand `f` by (aritTypeOneShots at)
+   without losing sharing. This function checks that the either there
+   are no expensive expressions, or the lambdas are one-shots.
 
    NB 'f' is an arbitrary expression, eg @f = g e1 e2@.  This 'f' can have
    arity type @AT oss _@, with @length oss > 0@, only if e1 e2 are themselves
@@ -493,18 +539,6 @@ ArityType 'at', then
    So eta expansion is dynamically ok; see Note [State hack and
    bottoming functions], the part about catch#
 
-Example:
-      f = \x\y. let v = <expensive> in
-          \s(one-shot) \t(one-shot). blah
-      'f' has arity type \??11.T
-      The one-shot-ness means we can, in effect, push that
-      'let' inside the \st.
-
-
-Suppose f = \xy. x+y
-Then  f             :: \??.T
-      f v           :: \?.T
-      f <expensive> :: T
 -}
 
 
@@ -537,8 +571,8 @@ Then  f             :: \??.T
 --
 -- We rely on this lattice structure for fixed-point iteration in
 -- 'findRhsArity'. For the semantics of 'ArityType', see Note [ArityType].
-data ArityType
-  = AT ![OneShotInfo] !Divergence
+data ArityType  -- See Note [ArityType]
+  = AT ![(Cost,OneShotInfo)] !Divergence
   -- ^ @AT oss div@ means this value can safely be eta-expanded @length oss@
   -- times, provided use sites respect the 'OneShotInfo's in @oss@.
   -- A 'OneShotLam' annotation can come from two sources:
@@ -553,6 +587,17 @@ data ArityType
   -- with 'DmdType'.
   deriving Eq
 
+data Cost = IsCheap | IsExpensive
+          deriving( Eq )
+
+isCheap :: Cost -> Bool
+isCheap IsCheap     = True
+isCheap IsExpensive = False
+
+addCost :: Cost -> Cost -> Cost
+addCost IsCheap IsCheap = IsCheap
+addCost _       _       = IsExpensive
+
 -- | This is the BNF of the generated output:
 --
 -- @
@@ -571,54 +616,59 @@ instance Outputable ArityType where
       pp_div Diverges = char '⊥'
       pp_div ExnOrDiv = char 'x'
       pp_div Dunno    = char 'T'
-      pp_os OneShotLam    = char '1'
-      pp_os NoOneShotInfo = char '?'
+      pp_os (IsCheap,     OneShotLam)    = text "(C1)"
+      pp_os (IsExpensive, OneShotLam)    = text "(X1)"
+      pp_os (IsCheap,     NoOneShotInfo) = text "(C?)"
+      pp_os (IsExpensive, NoOneShotInfo) = text "(X?)"
 
 mkBotArityType :: [OneShotInfo] -> ArityType
-mkBotArityType oss = AT oss botDiv
+mkBotArityType oss = AT [(IsCheap,os) | os <- oss] botDiv
 
 botArityType :: ArityType
 botArityType = mkBotArityType []
 
-mkTopArityType :: [OneShotInfo] -> ArityType
-mkTopArityType oss = AT oss topDiv
+mkManifestArityType :: [OneShotInfo] -> ArityType
+mkManifestArityType oss = AT [(IsCheap,os) | os <- oss] topDiv
 
 topArityType :: ArityType
-topArityType = mkTopArityType []
+topArityType = AT [] topDiv
 
 -- | The number of value args for the arity type
 arityTypeArity :: ArityType -> Arity
-arityTypeArity (AT oss _) = length oss
+arityTypeArity at = length (arityTypeOneShots at)
+
+arityTypeArityDiv :: ArityType -> (Arity, Divergence)
+arityTypeArityDiv at@(AT oss div)
+  = (length oss', div')
+  where
+    oss' = arityTypeOneShots at
+    div' | oss `equalLength` oss' = div
+         | otherwise              = topDiv
+
+arityTypeOneShots :: ArityType -> [OneShotInfo]
+-- Returns a list only as long as the arity should be
+arityTypeOneShots (AT prs _)
+  = go IsCheap prs
+  where
+    go :: Cost -> [(Cost,OneShotInfo)] -> [OneShotInfo]
+    go _ [] = []
+    go ch1 ((ch2,os):prs)
+       = case (ch1 `addCost` ch2, os) of
+           (IsExpensive, NoOneShotInfo) -> []
+           (ch,          _)             -> os : go ch prs
 
 -- | True <=> eta-expansion will add at least one lambda
 expandableArityType :: ArityType -> Bool
-expandableArityType at = arityTypeArity at /= 0
-
--- | See Note [Dead ends] in "GHC.Types.Demand".
--- Bottom implies a dead end.
-isDeadEndArityType :: ArityType -> Bool
-isDeadEndArityType (AT _ div) = isDeadEndDiv div
-
--- | Expand a non-bottoming arity type so that it has at least the given arity.
-maxWithArity :: ArityType -> Arity -> ArityType
-maxWithArity at@(AT oss div) !ar
-  | isDeadEndArityType at    = at
-  | oss `lengthAtLeast` ar   = at
-  | otherwise                = AT (take ar $ oss ++ repeat NoOneShotInfo) div
+expandableArityType at = not (null (arityTypeOneShots at))
 
 -- | Trim an arity type so that it has at most the given arity.
 -- Any excess 'OneShotInfo's are truncated to 'topDiv', even if they end in
 -- 'ABot'.
-minWithArity :: ArityType -> Arity -> ArityType
-minWithArity at@(AT oss _) ar
+trimArityType :: ArityType -> Arity -> ArityType
+trimArityType at@(AT oss _) ar
   | oss `lengthAtMost` ar = at
   | otherwise             = AT (take ar oss) topDiv
 
-takeWhileOneShot :: ArityType -> ArityType
-takeWhileOneShot (AT oss div)
-  | isDeadEndDiv div = AT (takeWhile isOneShotInfo oss) topDiv
-  | otherwise        = AT (takeWhile isOneShotInfo oss) div
-
 -- | The Arity returned is the number of value args the
 -- expression can be applied to without doing much work
 exprEtaExpandArity :: DynFlags -> CoreExpr -> ArityType
@@ -632,21 +682,40 @@ getBotArity (AT oss div)
   | isDeadEndDiv div = Just $ length oss
   | otherwise        = Nothing
 
-----------------------
-findRhsArity :: DynFlags -> Id -> CoreExpr -> Arity -> ArityType
+
+{- *********************************************************************
+*                                                                      *
+                   findRhsArity
+*                                                                      *
+********************************************************************* -}
+
+findRhsArity :: DynFlags -> RecFlag -> Id -> CoreExpr -> Arity -> ArityType
 -- This implements the fixpoint loop for arity analysis
 -- See Note [Arity analysis]
 -- If findRhsArity e = (n, is_bot) then
 --  (a) any application of e to <n arguments will not do much work,
 --      so it is safe to expand e  ==>  (\x1..xn. e x1 .. xn)
 --  (b) if is_bot=True, then e applied to n args is guaranteed bottom
-findRhsArity dflags bndr rhs old_arity
-  = go 0 botArityType
-      -- We always do one step, but usually that produces a result equal to
-      -- old_arity, and then we stop right away, because old_arity is assumed
-      -- to be sound. In other words, arities should never decrease.
-      -- Result: the common case is that there is just one iteration
+findRhsArity dflags is_rec bndr rhs old_arity
+  = rhs_arity_type `combineWithDemandOneShots` idDemandOneShots bndr
+       -- combineWithDemandOneShots: take account of the demand on the
+       -- binder.  Perhaps it is always called with 2 args
+       --   let f = \x. blah in (f 3 4, f 1 9)
+       -- f's demand-info says how many args it is called with
+
   where
+    init_env :: ArityEnv
+    init_env = findRhsArityEnv dflags
+
+    rhs_arity_type = case is_rec of
+                        Recursive    -> go 0 botArityType
+                        NonRecursive -> arityType init_env rhs
+      -- In the recursive case we always do one step, but usually that
+      -- produces a result equal to old_arity, and then we stop right
+      -- away, because old_arity is assumed to be sound. In other
+      -- words, arities should never decrease.  Result: the common
+      -- case is that there is just one iteration
+
     go :: Int -> ArityType -> ArityType
     go !n cur_at@(AT oss div)
       | not (isDeadEndDiv div)           -- the "stop right away" case
@@ -662,14 +731,44 @@ findRhsArity dflags bndr rhs old_arity
         next_at = step cur_at
 
     step :: ArityType -> ArityType
-    step at = -- pprTrace "step" (ppr bndr <+> ppr at <+> ppr (arityType env rhs)) $
-              arityType env rhs
+    step cur_at = arityType env rhs
       where
-        env = extendSigEnv (findRhsArityEnv dflags) bndr at
+        env = extendSigEnv init_env bndr cur_at
 
-{-
-Note [Arity analysis]
-~~~~~~~~~~~~~~~~~~~~~
+combineWithDemandOneShots :: ArityType -> [OneShotInfo] -> ArityType
+-- See Note [Combining arity type with demand info]
+combineWithDemandOneShots (AT prs div) oss
+  = AT (zip_prs prs oss) div
+  where
+    zip_prs prs [] = prs
+    zip_prs [] oss = [(IsExpensive,os) | os <- oss]   -- False <=> expensive
+    zip_prs ((ch,os1):prs) (os2:oss)
+      = (ch, os1 `bestOneShot` os2) : zip_prs prs oss
+
+idDemandOneShots :: Id -> [OneShotInfo]
+idDemandOneShots bndr
+  = call_arity_one_shots `zip_oss` dmd_one_shots
+  where
+    call_arity_one_shots :: [OneShotInfo]
+    call_arity_one_shots
+      | call_arity == 0 = []
+      | otherwise       = NoOneShotInfo : replicate (call_arity-1) OneShotLam
+    -- Call Arity analysis says the function is always called
+    -- applied to this many arguments
+    call_arity = idCallArity bndr
+
+    dmd_one_shots :: [OneShotInfo]
+    -- If the demand info is Cx(C1(C1(.))) then we know that an
+    -- application to one arg is also an application to three
+    dmd_one_shots = argOneShots (idDemandInfo bndr)
+
+    -- Take the *longer* list
+    zip_oss (os1:oss1) (os2:oss2) = (os1 `bestOneShot` os2) : zip_oss oss1 oss2
+    zip_oss []         oss2       = oss2
+    zip_oss oss1       []         = oss1
+
+{- Note [Arity analysis]
+~~~~~~~~~~~~~~~~~~~~~~~~
 The motivating example for arity analysis is this:
 
   f = \x. let g = f (x+1)
@@ -731,57 +830,81 @@ to floatIn the non-cheap let-binding.  Which is all perfectly benign, but
 means we do two iterations (well, actually 3 'step's to detect we are stable)
 and don't want to emit the warning.
 
-Note [Eta expanding through dictionaries]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-If the experimental -fdicts-cheap flag is on, we eta-expand through
-dictionary bindings.  This improves arities. Thereby, it also
-means that full laziness is less prone to floating out the
-application of a function to its dictionary arguments, which
-can thereby lose opportunities for fusion.  Example:
-        foo :: Ord a => a -> ...
-     foo = /\a \(d:Ord a). let d' = ...d... in \(x:a). ....
-        -- So foo has arity 1
+Note [Combining arity type with demand info]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+   let f = \x. let y = <expensive> in \p \q{os}. blah
+   in ...(f a b)...(f c d)...
 
-     f = \x. foo dInt $ bar x
+* From the RHS we get an ArityType like
+    AT [ (True,?), (False,?), (True,OneShotLam) ] Dunno
+  where "?" means NoOneShotInfo
 
-The (foo DInt) is floated out, and makes ineffective a RULE
-     foo (bar x) = ...
+* From the body, the demand analyser, or Call Arity, will tell us
+  that the function is always applied to at least two arguments.
 
-One could go further and make exprIsCheap reply True to any
-dictionary-typed expression, but that's more work.
+Combining these two pieces of info, we can get the final ArityType
+    AT [ (True,?), (False,OneShotLam), (True,OneShotLam) ] Dunno
+result: arity=3, which is better than we could do from either
+source alone.
+
+The "combining" part is done by combineWithDemandOneShots.
 -}
 
+
+{- *********************************************************************
+*                                                                      *
+                   arityType
+*                                                                      *
+********************************************************************* -}
+
 arityLam :: Id -> ArityType -> ArityType
-arityLam id (AT oss div) = AT (idStateHackOneShotInfo id : oss) div
+arityLam id (AT oss div)
+  = AT ((IsCheap, idStateHackOneShotInfo id) : oss) div
 
 floatIn :: Bool -> ArityType -> ArityType
 -- We have something like (let x = E in b),
 -- where b has the given arity type.
-floatIn cheap at
-  | isDeadEndArityType at || cheap = at
-  -- If E is not cheap, keep arity only for one-shots
-  | otherwise                      = takeWhileOneShot at
+floatIn cheap at | cheap     = at
+                 | otherwise = addWork at
+
+addWork :: ArityType -> ArityType
+addWork at@(AT prs div)
+  = case prs of
+      []      -> at
+      pr:prs' -> AT (add_work pr : prs') div
+  where
+    add_work (_,os) = (IsExpensive,os)
 
 arityApp :: ArityType -> Bool -> ArityType
 -- Processing (fun arg) where at is the ArityType of fun,
 -- Knock off an argument and behave like 'let'
-arityApp (AT (_:oss) div) cheap = floatIn cheap (AT oss div)
-arityApp at               _     = at
+arityApp (AT ((ch1,_):oss) div) ch2 = floatIn (isCheap ch1 && ch2) (AT oss div)
+arityApp at                     _   = at
 
 -- | Least upper bound in the 'ArityType' lattice.
 -- See the haddocks on 'ArityType' for the lattice.
 --
 -- Used for branches of a @case@.
 andArityType :: ArityType -> ArityType -> ArityType
-andArityType (AT (os1:oss1) div1) (AT (os2:oss2) div2)
-  | AT oss' div' <- andArityType (AT oss1 div1) (AT oss2 div2)
-  = AT ((os1 `bestOneShot` os2) : oss') div' -- See Note [Combining case branches]
-andArityType (AT []         div1) at2
-  | isDeadEndDiv div1 = at2                  -- Note [ABot branches: max arity wins]
-  | otherwise         = takeWhileOneShot at2 -- See Note [Combining case branches]
-andArityType at1                  (AT []         div2)
-  | isDeadEndDiv div2 = at1                  -- Note [ABot branches: max arity wins]
-  | otherwise         = takeWhileOneShot at1 -- See Note [Combining case branches]
+andArityType (AT (pr1:prs1) div1) (AT (pr2:prs2) div2)
+  | AT prs' div' <- andArityType (AT prs1 div1) (AT prs2 div2)
+  = AT ((pr1 `and_pr` pr2) : prs') div' -- See Note [Combining case branches]
+  where
+    (ch1,os1) `and_pr` (ch2,os2)
+      = ( ch1 `addCost` ch2, os1 `bestOneShot` os2)
+
+andArityType (AT [] div1) at2 = andWithTail div1 at2
+andArityType at1 (AT [] div2) = andWithTail div2 at1
+
+andWithTail :: Divergence -> ArityType -> ArityType
+andWithTail div1 at2@(AT oss2 div2)
+  | isDeadEndDiv div1     -- case x of { T -> error; F -> \y.e }
+  = at2
+  | otherwise  -- case x of { T -> plusInt <expensive>; F -> \y.e }
+  = addWork (AT oss2 (div1 `lubDivergence` div2))
+        -- Note [ABot branches: max arity wins]
+        -- See Note [Combining case branches]
 
 {- Note [ABot branches: max arity wins]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -835,6 +958,25 @@ topDiv.
 
 Historical note: long ago, we unconditionally switched to topDiv when we
 encountered a cast, but that is far too conservative: see #5475
+
+Note [Eta expanding through dictionaries]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If the experimental -fdicts-cheap flag is on, we eta-expand through
+dictionary bindings.  This improves arities. Thereby, it also
+means that full laziness is less prone to floating out the
+application of a function to its dictionary arguments, which
+can thereby lose opportunities for fusion.  Example:
+        foo :: Ord a => a -> ...
+     foo = /\a \(d:Ord a). let d' = ...d... in \(x:a). ....
+        -- So foo has arity 1
+
+     f = \x. foo dInt $ bar x
+
+The (foo DInt) is floated out, and makes ineffective a RULE
+     foo (bar x) = ...
+
+One could go further and make exprIsCheap reply True to any
+dictionary-typed expression, but that's more work.
 -}
 
 ---------------------------
@@ -867,6 +1009,7 @@ data AnalysisMode
   -- ^ Used for regular, fixed-point arity analysis ('findRhsArity').
   --   See Note [Arity analysis] for details about fixed-point iteration.
   --   INVARIANT: Disjoint with 'ae_joins'.
+  -- am_dicts_cheap: see Note [Eta expanding through dictionaries]
 
 data ArityEnv
   = AE
@@ -964,6 +1107,7 @@ myExprIsCheap AE{ae_mode = mode} e mb_ty = case mode of
   _             -> cheap_dict || cheap_fun e
     where
       cheap_dict = am_dicts_cheap mode && fmap isDictTy mb_ty == Just True
+                   -- See Note [Eta expanding through dictionaries]
       cheap_fun e = case mode of
 #if __GLASGOW_HASKELL__ <= 900
         BotStrictness                -> panic "impossible"
@@ -976,21 +1120,24 @@ myExprIsCheap AE{ae_mode = mode} e mb_ty = case mode of
 -- it's important.
 myIsCheapApp :: IdEnv ArityType -> CheapAppFun
 myIsCheapApp sigs fn n_val_args = case lookupVarEnv sigs fn of
+
   -- Nothing means not a local function, fall back to regular
   -- 'GHC.Core.Utils.isCheapApp'
-  Nothing         -> isCheapApp fn n_val_args
+  Nothing -> isCheapApp fn n_val_args
+
   -- @Just at@ means local function with @at@ as current ArityType.
   -- Roughly approximate what 'isCheapApp' is doing.
   Just (AT oss div)
     | isDeadEndDiv div -> True -- See Note [isCheapApp: bottoming functions] in GHC.Core.Utils
-    | n_val_args < length oss -> True -- Essentially isWorkFreeApp
-    | otherwise -> False
+    | n_val_args == 0         -> True -- Essentially
+    | n_val_args < length oss -> True -- isWorkFreeApp
+    | otherwise               -> False
 
 ----------------
 arityType :: ArityEnv -> CoreExpr -> ArityType
 
 arityType env (Cast e co)
-  = minWithArity (arityType env e) co_arity -- See Note [Arity trimming]
+  = trimArityType (arityType env e) co_arity -- See Note [Arity trimming]
   where
     co_arity = length (typeArity (coercionRKind co))
     -- See Note [exprArity invariant] (2); must be true of
@@ -1017,7 +1164,10 @@ arityType env (Lam x e)
 arityType env (App fun (Type _))
    = arityType env fun
 arityType env (App fun arg )
-   = arityApp (arityType env fun) (myExprIsCheap env arg Nothing)
+   = arityApp fun_at cheap_arg
+   where
+     fun_at    = arityType env fun
+     cheap_arg = myExprIsCheap env arg Nothing
 
         -- Case/Let; keep arity if either the expression is cheap
         -- or it's a 1-shot lambda
@@ -1030,14 +1180,16 @@ arityType env (App fun arg )
 arityType env (Case scrut bndr _ alts)
   | exprIsDeadEnd scrut || null alts
   = botArityType    -- Do not eta expand. See Note [Dealing with bottom (1)]
+
   | not (pedanticBottoms env)  -- See Note [Dealing with bottom (2)]
   , myExprIsCheap env scrut (Just (idType bndr))
   = alts_type
+
   | exprOkForSpeculation scrut
   = alts_type
 
-  | otherwise                  -- In the remaining cases we may not push
-  = takeWhileOneShot alts_type -- evaluation of the scrutinee in
+  | otherwise            -- In the remaining cases we may not push
+  = addWork alts_type -- evaluation of the scrutinee in
   where
     env' = delInScope env bndr
     arity_type_alt (Alt _con bndrs rhs) = arityType (delInScopeList env' bndrs) rhs
@@ -1146,8 +1298,8 @@ idArityType v
   | otherwise
   = AT (take (idArity v) one_shots) topDiv
   where
-    one_shots :: [OneShotInfo]  -- One-shot-ness derived from the type
-    one_shots = typeArity (idType v)
+    one_shots :: [(Cost,OneShotInfo)]  -- One-shot-ness derived from the type
+    one_shots = repeat IsCheap `zip` typeArity (idType v)
 
 {-
 %************************************************************************
@@ -1279,9 +1431,9 @@ see Note [The one-shot state monad trick] in GHC.Utils.Monad.
 etaExpand   :: Arity     -> CoreExpr -> CoreExpr
 etaExpandAT :: ArityType -> CoreExpr -> CoreExpr
 
-etaExpand   n          orig_expr = eta_expand (replicate n NoOneShotInfo) orig_expr
-etaExpandAT (AT oss _) orig_expr = eta_expand oss                         orig_expr
-                           -- See Note [Eta expansion with ArityType]
+etaExpand   n  orig_expr = eta_expand (replicate n NoOneShotInfo) orig_expr
+etaExpandAT at orig_expr = eta_expand (arityTypeOneShots at)      orig_expr
+                                   -- See Note [Eta expansion with ArityType]
 
 -- etaExpand arity e = res
 -- Then 'res' has at least 'arity' lambdas at the top
@@ -1605,6 +1757,238 @@ mkEtaWW orig_oss ppr_orig_expr in_scope orig_ty
         -- with an explicit lambda having a non-function type
 
 
+{-
+************************************************************************
+*                                                                      *
+                Eta reduction
+*                                                                      *
+************************************************************************
+
+Note [Eta reduction conditions]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We try for eta reduction here, but *only* if we get all the way to an
+trivial expression.  We don't want to remove extra lambdas unless we
+are going to avoid allocating this thing altogether.
+
+There are some particularly delicate points here:
+
+* We want to eta-reduce if doing so leaves
+         a trivial expression,
+     or  a PAP: see Note [Eta reduce PAPs]
+  *including* a cast.  For example
+       \x. f |> co  -->  f |> co
+  (provided co doesn't mention x)
+
+  c.f. Note [Which RHSs do we eta-expand?] in GHC.Core.Opt.Simplify.Utils.
+  If we eta-reduce to 'e', we don't want to eta-expand 'e'!
+
+* Eta reduction is not valid in general:
+        \x. bot  /=  bot
+  This matters, partly for old-fashioned correctness reasons but,
+  worse, getting it wrong can yield a seg fault. Consider
+        f = \x.f x
+        h y = case (case y of { True -> f `seq` True; False -> False }) of
+                True -> ...; False -> ...
+
+  If we (unsoundly) eta-reduce f to get f=f, the strictness analyser
+  says f=bottom, and replaces the (f `seq` True) with just
+  (f `cast` unsafe-co).  BUT, as thing stand, 'f' got arity 1, and it
+  *keeps* arity 1 (perhaps also wrongly).  So CorePrep eta-expands
+  the definition again, so that it does not terminate after all.
+  Result: seg-fault because the boolean case actually gets a function value.
+  See #1947.
+
+  So it's important to do the right thing.
+
+* With linear types, eta-reduction can break type-checking:
+        f :: A ⊸ B
+        g :: A -> B
+        g = \x. f x
+
+  The above is correct, but eta-reducing g would yield g=f, the linter will
+  complain that g and f don't have the same type.
+
+* Note [Arity care]: we need to be careful if we just look at f's
+  arity. Currently (Dec07), f's arity is visible in its own RHS (see
+  Note [Arity robustness] in GHC.Core.Opt.Simplify.Env) so we must *not* trust the
+  arity when checking that 'f' is a value.  Otherwise we will
+  eta-reduce
+      f = \x. f x
+  to
+      f = f
+  Which might change a terminating program (think (f `seq` e)) to a
+  non-terminating one.  So we check for being a loop breaker first.
+
+  However for GlobalIds we can look at the arity.
+
+* Type and dictionary abstraction.
+  Regardless of whether 'f' is a value, we always want to reduce
+        (/\a -> f a)  -->   f
+  This came up in a RULE: foldr (build (/\a -> g a))
+  did not match           foldr (build (/\b -> ...something complex...))
+  The type checker can insert these eta-expanded versions,
+  with both type and dictionary lambdas; hence the slightly
+  ad-hoc (all ok_lam bndrs)
+
+* Never *reduce* arity. For example
+      f = \xy. g x y
+  Then if g has arity 1 we don't want to eta-reduce because then
+  f's arity would decrease, and that is bad
+
+These delicacies are why we don't use exprIsTrivial and exprIsHNF here.
+Alas.
+
+Note [Eta reduce PAPs]
+~~~~~~~~~~~~~~~~~~~~~~
+We eta-reduce if the result is a PAP:
+   \x. f e1 e2 x  ==>   f e1 e2
+
+This reduces clutter, sometimes a lot. See Note [Do not eta-expand PAPs]
+in GHC.Core.Opt.Simplify.Utils, where we are careful not to eta-expand
+a PAP.  If eta-expanding is bad, then eta-reducing is good!
+
+Also the code generator likes eta-reduced PAPs; see GHC.CoreToStg.Prep
+Note [No eta reduction needed in rhsToBody].
+
+See also #18993.
+
+
+Note [Eta reduction with casted arguments]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Consider
+    (\(x:t3). f (x |> g)) :: t3 -> t2
+  where
+    f :: t1 -> t2
+    g :: t3 ~ t1
+This should be eta-reduced to
+
+    f |> (sym g -> t2)
+
+So we need to accumulate a coercion, pushing it inward (past
+variable arguments only) thus:
+   f (x |> co_arg) |> co  -->  (f |> (sym co_arg -> co)) x
+   f (x:t)         |> co  -->  (f |> (t -> co)) x
+   f @ a           |> co  -->  (f |> (forall a.co)) @ a
+   f @ (g:t1~t2)   |> co  -->  (f |> (t1~t2 => co)) @ (g:t1~t2)
+These are the equations for ok_arg.
+
+It's true that we could also hope to eta reduce these:
+    (\xy. (f x |> g) y)
+    (\xy. (f x y) |> g)
+But the simplifier pushes those casts outwards, so we don't
+need to address that here.
+
+Note [Eta reduction of an eval'd function]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In Haskell it is not true that    f = \x. f x
+because f might be bottom, and 'seq' can distinguish them.
+
+But it *is* true that   f = f `seq` \x. f x
+and we'd like to simplify the latter to the former.  This amounts
+to the rule that
+  * when there is just *one* value argument,
+  * f is not bottom
+we can eta-reduce    \x. f x  ===>  f
+
+This turned up in #7542.
+-}
+
+tryEtaReduce :: [Var] -> CoreExpr -> Maybe CoreExpr
+tryEtaReduce bndrs body
+  = go (reverse bndrs) body refl_co
+  where
+    refl_co = mkRepReflCo (exprType body)
+    incoming_arity = count isId bndrs
+
+    go :: [Var]            -- Binders, innermost first, types [a3,a2,a1]
+       -> CoreExpr         -- Of type tr
+       -> Coercion         -- Of type tr ~ ts
+       -> Maybe CoreExpr   -- Of type a1 -> a2 -> a3 -> ts
+    -- See Note [Eta reduction with casted arguments]
+    -- for why we have an accumulating coercion
+    go [] fun co
+      | all ok_lam bndrs || ok_fun incoming_arity fun
+      , let etad_expr = mkCast fun co
+            used_vars = exprFreeVars etad_expr
+      , not (any (`elemVarSet` used_vars) bndrs)
+      = Just etad_expr
+
+    go bs@(_ : _) (Tick t e) co
+      | tickishFloatable t
+      = fmap (Tick t) $ go bs e co
+      -- Float app ticks: \x -> Tick t (e x) ==> Tick t e
+
+    go (b : bs) (App fun arg) co
+      | Just (co', ticks) <- ok_arg b arg co (exprType fun)
+      = fmap (flip (foldr mkTick) ticks) $ go bs fun co'
+            -- Float arg ticks: \x -> e (Tick t x) ==> Tick t e
+
+    go _ _ _  = Nothing         -- Failure!
+
+    ---------------
+    -- Note [Eta reduction conditions]
+    ok_fun :: Arity -> CoreExpr -> Bool
+    ok_fun n (App fun arg)
+      | isTypeArg arg         = ok_fun n fun
+      | otherwise           = False
+--      | otherwise             = ok_fun (n+1) fun
+    ok_fun n (Cast fun _)  = ok_fun n fun
+    ok_fun n (Tick _ expr) = ok_fun n expr
+    ok_fun n (Var fun_id)  = ok_fun_id n fun_id
+    ok_fun _ _ = False
+
+    ---------------
+    ok_fun_id n fun = fun_arity fun >= n
+
+    ---------------
+    fun_arity fun             -- See Note [Arity care]
+--       | isLocalId fun
+--       , isStrongLoopBreaker (idOccInfo fun) = 0
+       | arity > 0                           = arity
+       | isEvaldUnfolding (idUnfolding fun)  = 1
+            -- See Note [Eta reduction of an eval'd function]
+       | otherwise                           = 0
+       where
+         arity = idArity fun
+
+    ---------------
+    ok_lam v = isTyVar v || isEvVar v
+    -- See Note [Eta reduction conditions]:
+    -- bullet on Type and dictionary abstractions
+
+    ---------------
+    ok_arg :: Var              -- Of type bndr_t
+           -> CoreExpr         -- Of type arg_t
+           -> Coercion         -- Of kind (t1~t2)
+           -> Type             -- Type of the function to which the argument is applied
+           -> Maybe (Coercion  -- Of type (arg_t -> t1 ~  bndr_t -> t2)
+                               --   (and similarly for tyvars, coercion args)
+                    , [CoreTickish])
+    -- See Note [Eta reduction with casted arguments]
+    ok_arg bndr (Type ty) co _
+       | Just tv <- getTyVar_maybe ty
+       , bndr == tv  = Just (mkHomoForAllCos [tv] co, [])
+    ok_arg bndr (Var v) co fun_ty
+       | bndr == v
+       , let mult = idMult bndr
+       , Just (fun_mult, _, _) <- splitFunTy_maybe fun_ty
+       , mult `eqType` fun_mult -- There is no change in multiplicity, otherwise we must abort
+       = let reflCo = mkRepReflCo (idType bndr)
+         in Just (mkFunCo Representational (multToCo mult) reflCo co, [])
+    ok_arg bndr (Cast e co_arg) co fun_ty
+       | (ticks, Var v) <- stripTicksTop tickishFloatable e
+       , Just (fun_mult, _, _) <- splitFunTy_maybe fun_ty
+       , bndr == v
+       , fun_mult `eqType` idMult bndr
+       = Just (mkFunCo Representational (multToCo fun_mult) (mkSymCo co_arg) co, ticks)
+       -- The simplifier combines multiple casts into one,
+       -- so we can have a simple-minded pattern match here
+    ok_arg bndr (Tick t arg) co fun_ty
+       | tickishFloatable t, Just (co', ticks) <- ok_arg bndr arg co fun_ty
+       = Just (co', t:ticks)
+
+    ok_arg _ _ _ _ = Nothing
+
 {- *********************************************************************
 *                                                                      *
               The "push rules"
diff --git a/compiler/GHC/Core/Opt/OccurAnal.hs b/compiler/GHC/Core/Opt/OccurAnal.hs
index cc67802309..08495682d0 100644
--- a/compiler/GHC/Core/Opt/OccurAnal.hs
+++ b/compiler/GHC/Core/Opt/OccurAnal.hs
@@ -610,7 +610,7 @@ both j's RHS and in its stable unfolding.  We want to discover
 j2 as a join point.  So we must do the adjustRhsUsage thing
 on j's RHS.  That's why we pass mb_join_arity to calcUnfolding.
 
-Aame with rules. Suppose we have:
+Same with rules. Suppose we have:
 
   let j :: Int -> Int
       j y = 2 * y
@@ -622,7 +622,7 @@ Aame with rules. Suppose we have:
 We identify k as a join point, and we want j to be a join point too.
 Without the RULE it would be, and we don't want the RULE to mess it
 up.  So provided the join-point arity of k matches the args of the
-rule we can allow the tail-cal info from the RHS of the rule to
+rule we can allow the tail-call info from the RHS of the rule to
 propagate.
 
 * Wrinkle for Rec case. In the recursive case we don't know the
diff --git a/compiler/GHC/Core/Opt/Simplify.hs b/compiler/GHC/Core/Opt/Simplify.hs
index 1bbb728de6..4643488335 100644
--- a/compiler/GHC/Core/Opt/Simplify.hs
+++ b/compiler/GHC/Core/Opt/Simplify.hs
@@ -36,9 +36,9 @@ import GHC.Core.Ppr     ( pprCoreExpr )
 import GHC.Core.Unfold
 import GHC.Core.Unfold.Make
 import GHC.Core.Utils
-import GHC.Core.Opt.Arity ( ArityType(..)
+import GHC.Core.Opt.Arity ( ArityType, arityTypeArityDiv, exprArity
                           , pushCoTyArg, pushCoValArg
-                          , etaExpandAT )
+                          , arityTypeArity, etaExpandAT )
 import GHC.Core.SimpleOpt ( exprIsConApp_maybe, joinPointBinding_maybe, joinPointBindings_maybe )
 import GHC.Core.FVs     ( mkRuleInfo )
 import GHC.Core.Rules   ( lookupRule, getRules, initRuleOpts )
@@ -298,7 +298,7 @@ simplRecOrTopPair env top_lvl is_rec mb_cont old_bndr new_bndr rhs
   = {-#SCC "simplRecOrTopPair-join" #-}
     assert (isNotTopLevel top_lvl && isJoinId new_bndr )
     trace_bind "join" $
-    simplJoinBind env cont old_bndr new_bndr rhs env
+    simplJoinBind env is_rec cont old_bndr new_bndr rhs env
 
   | otherwise
   = {-#SCC "simplRecOrTopPair-normal" #-}
@@ -354,7 +354,7 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
                 -- See Note [Floating and type abstraction] in GHC.Core.Opt.Simplify.Utils
 
         -- Simplify the RHS
-        ; let rhs_cont = mkRhsStop (substTy body_env (exprType body))
+        ; let rhs_cont = mkRhsStop (substTy body_env (exprType body)) is_rec
         ; (body_floats0, body0) <- {-#SCC "simplExprF" #-} simplExprF body_env body rhs_cont
 
               -- Never float join-floats out of a non-join let-binding (which this is)
@@ -372,8 +372,7 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
             <-  if not (doFloatFromRhs top_lvl is_rec False body_floats2 body2)
                 then                    -- No floating, revert to body1
                      {-#SCC "simplLazyBind-no-floating" #-}
-                     do { rhs' <- mkLam env tvs' (wrapFloats body_floats2 body1) rhs_cont
-                        ; return (emptyFloats env, rhs') }
+                     rebuildLam env tvs' (emptyFloats rhs_env) (wrapFloats body_floats1 body1) rhs_cont
 
                 else if null tvs then   -- Simple floating
                      {-#SCC "simplLazyBind-simple-floating" #-}
@@ -385,26 +384,28 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
                      do { tick LetFloatFromLet
                         ; (poly_binds, body3) <- abstractFloats (seUnfoldingOpts env) top_lvl
                                                                 tvs' body_floats2 body2
-                        ; let floats = foldl' extendFloats (emptyFloats env) poly_binds
-                        ; rhs' <- mkLam env tvs' body3 rhs_cont
-                        ; return (floats, rhs') }
+                        ; let poly_floats = foldl' extendFloats (emptyFloats env) poly_binds
+                        ; (_empty_floats, rhs') <- rebuildLam env tvs' (emptyFloats body_env) body3 rhs_cont
+                        ; assertPpr (isEmptyFloats _empty_floats) (ppr _empty_floats) $
+                              -- rebuildLam returns emptyFloats if given emptyFloats
+                          return (poly_floats, rhs') }
 
         ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
-                                             top_lvl Nothing bndr bndr1 rhs'
+                                             top_lvl is_rec Nothing bndr bndr1 rhs'
         ; return (rhs_floats `addFloats` bind_float, env2) }
 
 --------------------------
-simplJoinBind :: SimplEnv
+simplJoinBind :: SimplEnv -> RecFlag
               -> SimplCont
               -> InId -> OutId          -- Binder, both pre-and post simpl
                                         -- The OutId has IdInfo, except arity,
                                         --   unfolding
               -> InExpr -> SimplEnv     -- The right hand side and its env
               -> SimplM (SimplFloats, SimplEnv)
-simplJoinBind env cont old_bndr new_bndr rhs rhs_se
+simplJoinBind env is_rec cont old_bndr new_bndr rhs rhs_se
   = do  { let rhs_env = rhs_se `setInScopeFromE` env
         ; rhs' <- simplJoinRhs rhs_env old_bndr rhs cont
-        ; completeBind env NotTopLevel (Just cont) old_bndr new_bndr rhs' }
+        ; completeBind env NotTopLevel is_rec (Just cont) old_bndr new_bndr rhs' }
 
 --------------------------
 simplNonRecX :: SimplEnv
@@ -464,7 +465,7 @@ completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
                      return (emptyFloats env, wrapFloats floats new_rhs)
 
         ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
-                                             NotTopLevel Nothing
+                                             NotTopLevel NonRecursive Nothing
                                              old_bndr new_bndr rhs2
         ; return (rhs_floats `addFloats` bind_float, env2) }
 
@@ -813,7 +814,7 @@ makeTrivialBinding mode top_lvl occ_fs info expr expr_ty
 
         -- Now something very like completeBind,
         -- but without the postInlineUnconditionally part
-        ; (arity_type, expr2) <- tryEtaExpandRhs mode var expr1
+        ; (arity_type, expr2) <- tryEtaExpandRhs mode NonRecursive var expr1
         ; unf <- mkLetUnfolding (sm_uf_opts mode) top_lvl InlineRhs var expr2
 
         ; let final_id = addLetBndrInfo var arity_type unf
@@ -882,6 +883,7 @@ Nor does it do the atomic-argument thing
 
 completeBind :: SimplEnv
              -> TopLevelFlag            -- Flag stuck into unfolding
+             -> RecFlag
              -> MaybeJoinCont           -- Required only for join point
              -> InId                    -- Old binder
              -> OutId -> OutExpr        -- New binder and RHS
@@ -892,7 +894,7 @@ completeBind :: SimplEnv
 --
 -- Binder /can/ be a JoinId
 -- Precondition: rhs obeys the let/app invariant
-completeBind env top_lvl mb_cont old_bndr new_bndr new_rhs
+completeBind env top_lvl is_rec mb_cont old_bndr new_bndr new_rhs
  | isCoVar old_bndr
  = case new_rhs of
      Coercion co -> return (emptyFloats env, extendCvSubst env old_bndr co)
@@ -907,7 +909,7 @@ completeBind env top_lvl mb_cont old_bndr new_bndr new_rhs
 
          -- Do eta-expansion on the RHS of the binding
          -- See Note [Eta-expanding at let bindings] in GHC.Core.Opt.Simplify.Utils
-      ; (new_arity, eta_rhs) <- tryEtaExpandRhs mode new_bndr new_rhs
+      ; (new_arity, eta_rhs) <- tryEtaExpandRhs mode is_rec new_bndr new_rhs
 
         -- Simplify the unfolding
       ; new_unfolding <- simplLetUnfolding env top_lvl mb_cont old_bndr
@@ -934,8 +936,7 @@ addLetBndrInfo :: OutId -> ArityType -> Unfolding -> OutId
 addLetBndrInfo new_bndr new_arity_type new_unf
   = new_bndr `setIdInfo` info5
   where
-    AT oss div = new_arity_type
-    new_arity  = length oss
+    (new_arity, div) = arityTypeArityDiv new_arity_type
 
     info1 = idInfo new_bndr `setArityInfo` new_arity
 
@@ -1635,10 +1636,12 @@ simplLam env bndrs body (TickIt tickish cont)
 
         -- Not enough args, so there are real lambdas left to put in the result
 simplLam env bndrs body cont
-  = do  { (env', bndrs') <- simplLamBndrs env bndrs
-        ; body' <- simplExpr env' body
-        ; new_lam <- mkLam env bndrs' body' cont
-        ; rebuild env' new_lam cont }
+  = do  { (env', bndrs')  <- simplLamBndrs env bndrs
+        ; let body_ty' = substTy env' (exprType body)
+        ; (floats, body') <- simplExprF env' body (mkBoringStop body_ty')
+        ; (floats1, new_lam) <- rebuildLam env bndrs' floats body' cont
+        ; (floats2, expr')   <- rebuild (env `setInScopeFromF` floats1) new_lam cont
+        ; return (floats1 `addFloats` floats2, expr') }
 
 -------------
 simplLamBndr :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr)
@@ -1821,7 +1824,7 @@ simplNonRecJoinPoint env bndr rhs body cont
               res_ty = contResultType cont
         ; (env1, bndr1)    <- simplNonRecJoinBndr env bndr mult res_ty
         ; (env2, bndr2)    <- addBndrRules env1 bndr bndr1 (Just cont)
-        ; (floats1, env3)  <- simplJoinBind env2 cont bndr bndr2 rhs env
+        ; (floats1, env3)  <- simplJoinBind env2 NonRecursive cont bndr bndr2 rhs env
         ; (floats2, body') <- simplExprF env3 body cont
         ; return (floats1 `addFloats` floats2, body') }
 
@@ -3505,8 +3508,8 @@ mkDupableContWithDmds env dmds
         --              let a = ...arg...
         --              in [...hole...] a
         -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable
-    do  { let (dmd:_) = dmds   -- Never fails
-        ; (floats1, cont') <- mkDupableContWithDmds env dmds cont
+    do  { let (dmd:cont_dmds) = dmds   -- Never fails
+        ; (floats1, cont') <- mkDupableContWithDmds env cont_dmds cont
         ; let env' = env `setInScopeFromF` floats1
         ; (_, se', arg') <- simplArg env' dup se arg
         ; (let_floats2, arg'') <- makeTrivial (getMode env) NotTopLevel dmd (fsLit "karg") arg'
@@ -4081,11 +4084,12 @@ simplStableUnfolding env top_lvl mb_cont id rhs_ty id_arity unf
          -- See Note [Simplifying inside stable unfoldings] in GHC.Core.Opt.Simplify.Utils
 
     -- See Note [Eta-expand stable unfoldings]
-    eta_expand expr
-      | not eta_on         = expr
-      | exprIsTrivial expr = expr
-      | otherwise          = etaExpandAT id_arity expr
-    eta_on = sm_eta_expand (getMode env)
+    eta_expand expr | sm_eta_expand (getMode env)
+                    , exprArity expr < arityTypeArity id_arity
+                    , wantEtaExpansion expr
+                    = etaExpandAT id_arity expr
+                    | otherwise
+                    = expr
 
 {- Note [Eta-expand stable unfoldings]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/compiler/GHC/Core/Opt/Simplify/Env.hs b/compiler/GHC/Core/Opt/Simplify/Env.hs
index 6d325d02bb..19b1c90403 100644
--- a/compiler/GHC/Core/Opt/Simplify/Env.hs
+++ b/compiler/GHC/Core/Opt/Simplify/Env.hs
@@ -20,7 +20,7 @@ module GHC.Core.Opt.Simplify.Env (
         getSimplRules,
 
         -- * Substitution results
-        SimplSR(..), mkContEx, substId, lookupRecBndr, refineFromInScope,
+        SimplSR(..), mkContEx, substId, lookupRecBndr,
 
         -- * Simplifying 'Id' binders
         simplNonRecBndr, simplNonRecJoinBndr, simplRecBndrs, simplRecJoinBndrs,
@@ -32,7 +32,8 @@ module GHC.Core.Opt.Simplify.Env (
         SimplFloats(..), emptyFloats, mkRecFloats,
         mkFloatBind, addLetFloats, addJoinFloats, addFloats,
         extendFloats, wrapFloats,
-        doFloatFromRhs, getTopFloatBinds,
+        isEmptyFloats, isEmptyJoinFloats, isEmptyLetFloats,
+        doFloatFromRhs, getTopFloatBinds, etaFloatOk,
 
         -- * LetFloats
         LetFloats, letFloatBinds, emptyLetFloats, unitLetFloat,
@@ -49,6 +50,7 @@ import GHC.Core.Opt.Simplify.Monad
 import GHC.Core.Opt.Monad        ( SimplMode(..) )
 import GHC.Core
 import GHC.Core.Utils
+import GHC.Core.FVs
 import GHC.Core.Multiplicity     ( scaleScaled )
 import GHC.Core.Unfold
 import GHC.Types.Var
@@ -75,6 +77,8 @@ import GHC.Types.Unique.FM      ( pprUniqFM )
 
 import Data.List (mapAccumL)
 
+import GHC.Utils.Trace( pprTrace )
+
 {-
 ************************************************************************
 *                                                                      *
@@ -139,6 +143,10 @@ emptyFloats env
                 , sfJoinFloats = emptyJoinFloats
                 , sfInScope    = seInScope env }
 
+isEmptyFloats :: SimplFloats -> Bool
+isEmptyFloats (SimplFloats { sfLetFloats = lf, sfJoinFloats = jf })
+  = isEmptyLetFloats lf && isEmptyJoinFloats jf
+
 pprSimplEnv :: SimplEnv -> SDoc
 -- Used for debugging; selective
 pprSimplEnv env
@@ -495,6 +503,23 @@ doFloatFromRhs lvl rec str (SimplFloats { sfLetFloats = LetFloats fs ff }) rhs
                    FltOkSpec  -> isNotTopLevel lvl && isNonRec rec
                    FltCareful -> isNotTopLevel lvl && isNonRec rec && str
 
+etaFloatOk :: [Id] -> SimplFloats -> Bool
+etaFloatOk bndrs (SimplFloats { sfLetFloats = LetFloats let_floats float_flag
+                              , sfJoinFloats = join_floats })
+  =  isEmptyJoinFloats join_floats
+  && case float_flag of { FltCareful -> False; _ -> True }
+  && bndr_set `disjointVarSet` let_float_fvs
+  && bndr_set `disjointVarSet` let_float_bndrs
+  where
+    bndr_set        = mkVarSet bndrs
+    let_float_bndrs = foldr (unionVarSet . mkVarSet . bindersOf) emptyVarSet let_floats
+    let_float_fvs   = foldr (unionVarSet . bindFreeVars) emptyVarSet let_floats
+         -- This formulation may return a set that is slightly too large,
+         -- by not deleting variables bound by the let's, but that is rare
+         -- and at worst we miss an eta-reduction
+
+
+
 {-
 Note [Float when cheap or expandable]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -510,9 +535,15 @@ so we must take the 'or' of the two.
 emptyLetFloats :: LetFloats
 emptyLetFloats = LetFloats nilOL FltLifted
 
+isEmptyLetFloats :: LetFloats -> Bool
+isEmptyLetFloats (LetFloats fs _) = isNilOL fs
+
 emptyJoinFloats :: JoinFloats
 emptyJoinFloats = nilOL
 
+isEmptyJoinFloats :: JoinFloats -> Bool
+isEmptyJoinFloats = isNilOL
+
 unitLetFloat :: OutBind -> LetFloats
 -- This key function constructs a singleton float with the right form
 unitLetFloat bind = assert (all (not . isJoinId) (bindersOf bind)) $
@@ -702,7 +733,8 @@ refineFromInScope :: InScopeSet -> Var -> Var
 refineFromInScope in_scope v
   | isLocalId v = case lookupInScope in_scope v of
                   Just v' -> v'
-                  Nothing -> pprPanic "refineFromInScope" (ppr in_scope $$ ppr v)
+                  Nothing -> pprTrace "refineFromInScope" (ppr in_scope $$ ppr v) v
+                             -- pprPanic "refineFromInScope" (ppr in_scope $$ ppr v)
                              -- c.f #19074 for a subtle place where this went wrong
   | otherwise = v
 
@@ -789,7 +821,6 @@ simplRecBndrs env@(SimplEnv {}) ids
     do  { let (!env1, ids1) = mapAccumL substIdBndr env ids
         ; seqIds ids1 `seq` return env1 }
 
-
 ---------------
 substIdBndr :: SimplEnv -> InBndr -> (SimplEnv, OutBndr)
 -- Might be a coercion variable
@@ -1016,7 +1047,7 @@ getTCvSubst (SimplEnv { seInScope = in_scope, seTvSubst = tv_env
                       , seCvSubst = cv_env })
   = mkTCvSubst in_scope (tv_env, cv_env)
 
-substTy :: SimplEnv -> Type -> Type
+substTy :: HasDebugCallStack => SimplEnv -> Type -> Type
 substTy env ty = Type.substTy (getTCvSubst env) ty
 
 substTyVar :: SimplEnv -> TyVar -> Type
diff --git a/compiler/GHC/Core/Opt/Simplify/Utils.hs b/compiler/GHC/Core/Opt/Simplify/Utils.hs
index 39f62d8744..94e6636cc7 100644
--- a/compiler/GHC/Core/Opt/Simplify/Utils.hs
+++ b/compiler/GHC/Core/Opt/Simplify/Utils.hs
@@ -8,7 +8,8 @@ The simplifier utilities
 
 module GHC.Core.Opt.Simplify.Utils (
         -- Rebuilding
-        mkLam, mkCase, prepareAlts, tryEtaExpandRhs,
+        rebuildLam, mkCase, prepareAlts,
+        tryEtaExpandRhs, wantEtaExpansion,
 
         -- Inlining,
         preInlineUnconditionally, postInlineUnconditionally,
@@ -22,7 +23,7 @@ module GHC.Core.Opt.Simplify.Utils (
         contIsDupable, contResultType, contHoleType, contHoleScaling,
         contIsTrivial, contArgs,
         countArgs,
-        mkBoringStop, mkRhsStop, mkLazyArgStop, contIsRhsOrArg,
+        mkBoringStop, mkRhsStop, mkLazyArgStop,
         interestingCallContext,
 
         -- ArgInfo
@@ -394,22 +395,16 @@ mkFunRules rs = Just (n_required, rs)
 mkBoringStop :: OutType -> SimplCont
 mkBoringStop ty = Stop ty BoringCtxt
 
-mkRhsStop :: OutType -> SimplCont       -- See Note [RHS of lets] in GHC.Core.Unfold
-mkRhsStop ty = Stop ty RhsCtxt
+mkRhsStop :: OutType -> RecFlag -> SimplCont       -- See Note [RHS of lets] in GHC.Core.Unfold
+mkRhsStop ty is_rec = Stop ty (RhsCtxt is_rec)
 
 mkLazyArgStop :: OutType -> CallCtxt -> SimplCont
 mkLazyArgStop ty cci = Stop ty cci
 
 -------------------
-contIsRhsOrArg :: SimplCont -> Bool
-contIsRhsOrArg (Stop {})       = True
-contIsRhsOrArg (StrictBind {}) = True
-contIsRhsOrArg (StrictArg {})  = True
-contIsRhsOrArg _               = False
-
-contIsRhs :: SimplCont -> Bool
-contIsRhs (Stop _ RhsCtxt) = True
-contIsRhs _                = False
+contIsRhs :: SimplCont -> Maybe RecFlag
+contIsRhs (Stop _ (RhsCtxt is_rec)) = Just is_rec
+contIsRhs _                         = Nothing
 
 -------------------
 contIsStop :: SimplCont -> Bool
@@ -697,7 +692,7 @@ strictArgContext (ArgInfo { ai_encl = encl_rules, ai_discs = discs })
 -- Use this for strict arguments
   | encl_rules                = RuleArgCtxt
   | disc:_ <- discs, disc > 0 = DiscArgCtxt  -- Be keener here
-  | otherwise                 = RhsCtxt
+  | otherwise                 = RhsCtxt NonRecursive
       -- Why RhsCtxt?  if we see f (g x) (h x), and f is strict, we
       -- want to be a bit more eager to inline g, because it may
       -- expose an eval (on x perhaps) that can be eliminated or
@@ -1553,55 +1548,77 @@ won't inline because 'e' is too big.
 ************************************************************************
 -}
 
-mkLam :: SimplEnv -> [OutBndr] -> OutExpr -> SimplCont -> SimplM OutExpr
--- mkLam tries three things
+rebuildLam :: SimplEnv
+           -> [OutBndr] -> SimplFloats -> OutExpr
+           -> SimplCont -> SimplM (SimplFloats, OutExpr)
+-- (rebuildLam env bndrs floats body cont)
+-- returns an expression that means the same as
+--      \bndrs. let floats in body
+-- But it tries
 --      a) eta reduction, if that gives a trivial expression
 --      b) eta expansion [only if there are some value lambdas]
+--
+-- Invariant: emptyFloats in => emptyFloats out
+rebuildLam _env [] floats body _cont
+  = return (floats, body)
 
-mkLam _env [] body _cont
-  = return body
-mkLam env bndrs body cont
+rebuildLam env bndrs floats body cont
   = do { dflags <- getDynFlags
        ; mkLam' dflags bndrs body }
   where
-    mkLam' :: DynFlags -> [OutBndr] -> OutExpr -> SimplM OutExpr
+
+    mb_rhs :: Maybe RecFlag   -- Just => continuation is the RHS of a let
+    mb_rhs = contIsRhs cont
+
+    mkLam' :: DynFlags -> [OutBndr] -> OutExpr -> SimplM (SimplFloats, OutExpr)
     mkLam' dflags bndrs (Cast body co)
       | not (any bad bndrs)
         -- Note [Casts and lambdas]
-      = do { lam <- mkLam' dflags bndrs body
-           ; return (mkCast lam (mkPiCos Representational bndrs co)) }
+      = do { (floats, lam) <- mkLam' dflags bndrs body
+           ; return (floats, mkCast lam (mkPiCos Representational bndrs co)) }
       where
         co_vars  = tyCoVarsOfCo co
         bad bndr = isCoVar bndr && bndr `elemVarSet` co_vars
 
     mkLam' dflags bndrs body@(Lam {})
+      | isEmptyFloats floats   -- \xs. let floats in \ys. blah
+                               -- Do not combine these lambdas
       = mkLam' dflags (bndrs ++ bndrs1) body1
       where
         (bndrs1, body1) = collectBinders body
 
     mkLam' dflags bndrs (Tick t expr)
       | tickishFloatable t
-      = mkTick t <$> mkLam' dflags bndrs expr
+      = do { (floats, expr') <- mkLam' dflags bndrs expr
+           ; return (floats, mkTick t expr') }
 
     mkLam' dflags bndrs body
       | gopt Opt_DoEtaReduction dflags
+      , case mb_rhs of { Just Recursive -> False; _ -> True }
+            -- Is this lambda the RHS of a non-recursive let?
+            -- See Note [Eta reduce PAPs] in GHC.Core.Opt.Arity, and
+            -- Note [Do not eta-expand PAPs] in this module
+            -- If so try eta-reduction; but not otherwise
+      , etaFloatOk bndrs floats  -- Can the floats go outside the lambdas?
       , Just etad_lam <- tryEtaReduce bndrs body
       = do { tick (EtaReduction (head bndrs))
-           ; return etad_lam }
+           ; return (floats, etad_lam) }
 
-      | not (contIsRhs cont)   -- See Note [Eta-expanding lambdas]
+      | Nothing <- mb_rhs  -- See Note [Eta-expanding lambdas]
       , sm_eta_expand (getMode env)
-      , any isRuntimeVar bndrs
-      , let body_arity = exprEtaExpandArity dflags body
-      , expandableArityType body_arity
+      , any isRuntimeVar bndrs  -- Only when there is at least one value lambda already
+      , let full_body  = wrapFloats floats body
+            body_arity = exprEtaExpandArity dflags full_body
+      , expandableArityType body_arity  -- This guard is only so that we only do
+                                        -- a tick if there so something to do
       = do { tick (EtaExpansion (head bndrs))
-           ; let res = mkLams bndrs (etaExpandAT body_arity body)
-           ; traceSmpl "eta expand" (vcat [text "before" <+> ppr (mkLams bndrs body)
+           ; let res = mkLams bndrs (etaExpandAT body_arity full_body)
+           ; traceSmpl "eta expand" (vcat [text "before" <+> ppr (mkLams bndrs full_body)
                                           , text "after" <+> ppr res])
-           ; return res }
+           ; return (emptyFloats env, res) }
 
       | otherwise
-      = return (mkLams bndrs body)
+      = return (emptyFloats env, mkLams bndrs (wrapFloats floats body))
 
 {-
 Note [Eta expanding lambdas]
@@ -1664,18 +1681,18 @@ because the latter is not well-kinded.
 ************************************************************************
 -}
 
-tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr
+tryEtaExpandRhs :: SimplMode -> RecFlag -> OutId -> OutExpr
                 -> SimplM (ArityType, OutExpr)
 -- See Note [Eta-expanding at let bindings]
 -- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then
 --   (a) rhs' has manifest arity n
 --   (b) if is_bot is True then rhs' applied to n args is guaranteed bottom
-tryEtaExpandRhs mode bndr rhs
+tryEtaExpandRhs mode is_rec bndr rhs
   | Just join_arity <- isJoinId_maybe bndr
   = do { let (join_bndrs, join_body) = collectNBinders join_arity rhs
              oss   = [idOneShotInfo id | id <- join_bndrs, isId id]
              arity_type | exprIsDeadEnd join_body = mkBotArityType oss
-                        | otherwise               = mkTopArityType oss
+                        | otherwise               = mkManifestArityType oss
        ; return (arity_type, rhs) }
          -- Note [Do not eta-expand join points]
          -- But do return the correct arity and bottom-ness, because
@@ -1684,7 +1701,7 @@ tryEtaExpandRhs mode bndr rhs
 
   | sm_eta_expand mode      -- Provided eta-expansion is on
   , new_arity > old_arity   -- And the current manifest arity isn't enough
-  , want_eta rhs
+  , wantEtaExpansion rhs
   = do { tick (EtaExpansion bndr)
        ; return (arity_type, etaExpandAT arity_type rhs) }
 
@@ -1695,24 +1712,19 @@ tryEtaExpandRhs mode bndr rhs
     dflags    = sm_dflags mode
     old_arity = exprArity rhs
 
-    arity_type = findRhsArity dflags bndr rhs old_arity
-                 `maxWithArity` idCallArity bndr
-    new_arity = arityTypeArity arity_type
-
-    -- See Note [Which RHSs do we eta-expand?]
-    want_eta (Cast e _)                  = want_eta e
-    want_eta (Tick _ e)                  = want_eta e
-    want_eta (Lam b e) | isTyVar b       = want_eta e
-    want_eta (App e a) | exprIsTrivial a = want_eta e
-    want_eta (Var {})                    = False
-    want_eta (Lit {})                    = False
-    want_eta _ = True
-{-
-    want_eta _ = case arity_type of
-                   ATop (os:_) -> isOneShotInfo os
-                   ATop []     -> False
-                   ABot {}     -> True
--}
+    arity_type = findRhsArity dflags is_rec bndr rhs old_arity
+    new_arity  = arityTypeArity arity_type
+
+wantEtaExpansion :: CoreExpr -> Bool
+-- Mostly True; but False of PAPs which will immediately eta-reduce again
+-- See Note [Which RHSs do we eta-expand?]
+wantEtaExpansion (Cast e _)             = wantEtaExpansion e
+wantEtaExpansion (Tick _ e)             = wantEtaExpansion e
+wantEtaExpansion (Lam b e) | isTyVar b  = wantEtaExpansion e
+wantEtaExpansion (App e _)              = wantEtaExpansion e
+wantEtaExpansion (Var {})               = False
+wantEtaExpansion (Lit {})               = False
+wantEtaExpansion _                      = True
 
 {-
 Note [Eta-expanding at let bindings]
diff --git a/compiler/GHC/Core/Unfold.hs b/compiler/GHC/Core/Unfold.hs
index 4e054ea709..d5bb69871a 100644
--- a/compiler/GHC/Core/Unfold.hs
+++ b/compiler/GHC/Core/Unfold.hs
@@ -50,7 +50,7 @@ import GHC.Core.DataCon
 import GHC.Types.Literal
 import GHC.Builtin.PrimOps
 import GHC.Types.Id.Info
-import GHC.Types.Basic  ( Arity, InlineSpec(..), inlinePragmaSpec )
+import GHC.Types.Basic  ( Arity, InlineSpec(..), RecFlag(..), inlinePragmaSpec )
 import GHC.Core.Type
 import GHC.Builtin.Names
 import GHC.Builtin.Types.Prim ( realWorldStatePrimTy )
@@ -1081,7 +1081,7 @@ nonTriv _       = True
 
 data CallCtxt
   = BoringCtxt
-  | RhsCtxt             -- Rhs of a let-binding; see Note [RHS of lets]
+  | RhsCtxt RecFlag     -- Rhs of a let-binding; see Note [RHS of lets]
   | DiscArgCtxt         -- Argument of a function with non-zero arg discount
   | RuleArgCtxt         -- We are somewhere in the argument of a function with rules
 
@@ -1096,7 +1096,7 @@ instance Outputable CallCtxt where
   ppr CaseCtxt    = text "CaseCtxt"
   ppr ValAppCtxt  = text "ValAppCtxt"
   ppr BoringCtxt  = text "BoringCtxt"
-  ppr RhsCtxt     = text "RhsCtxt"
+  ppr (RhsCtxt ir)= text "RhsCtxt" <> parens (ppr ir)
   ppr DiscArgCtxt = text "DiscArgCtxt"
   ppr RuleArgCtxt = text "RuleArgCtxt"
 
@@ -1324,7 +1324,8 @@ tryUnfolding logger opts !case_depth id lone_variable
               ValAppCtxt -> True                           -- Note [Cast then apply]
               RuleArgCtxt -> uf_arity > 0  -- See Note [Unfold info lazy contexts]
               DiscArgCtxt -> uf_arity > 0  -- Note [Inlining in ArgCtxt]
-              RhsCtxt     -> uf_arity > 0  --
+              RhsCtxt NonRecursive
+                          -> uf_arity > 0  -- See Note [RHS of lets]
               _other      -> False         -- See Note [Nested functions]
 
 
@@ -1341,7 +1342,9 @@ only we can't see it.  Also
 could be expensive whereas
      x = case v of (a,b) -> a
 is patently cheap and may allow more eta expansion.
-So we treat the RHS of a let as not-totally-boring.
+
+So we treat the RHS of a /non-recursive/ let as not-totally-boring.
+A recursive let isn't going be inlined so there is much less point.
 
 Note [Unsaturated applications]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/compiler/GHC/Core/Utils.hs b/compiler/GHC/Core/Utils.hs
index 12efdddcd4..a62bc6db52 100644
--- a/compiler/GHC/Core/Utils.hs
+++ b/compiler/GHC/Core/Utils.hs
@@ -38,8 +38,8 @@ module GHC.Core.Utils (
         cheapEqExpr, cheapEqExpr', eqExpr,
         diffExpr, diffBinds,
 
-        -- * Lambdas and eta reduction
-        tryEtaReduce, zapLamBndrs,
+        -- * Lambdas
+        zapLamBndrs,
 
         -- * Manipulating data constructors and types
         exprToType, exprToCoercion_maybe,
@@ -69,11 +69,9 @@ import GHC.Platform
 
 import GHC.Core
 import GHC.Core.Ppr
-import GHC.Core.FVs( exprFreeVars )
 import GHC.Core.DataCon
 import GHC.Core.Type as Type
 import GHC.Core.FamInstEnv
-import GHC.Core.Predicate
 import GHC.Core.TyCo.Rep( TyCoBinder(..), TyBinder )
 import GHC.Core.Coercion
 import GHC.Core.TyCon
@@ -2317,214 +2315,6 @@ locBind loc b1 b2 diffs = map addLoc diffs
         bindLoc | b1 == b2  = ppr b1
                 | otherwise = ppr b1 <> char '/' <> ppr b2
 
-{-
-************************************************************************
-*                                                                      *
-                Eta reduction
-*                                                                      *
-************************************************************************
-
-Note [Eta reduction conditions]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We try for eta reduction here, but *only* if we get all the way to an
-trivial expression.  We don't want to remove extra lambdas unless we
-are going to avoid allocating this thing altogether.
-
-There are some particularly delicate points here:
-
-* We want to eta-reduce if doing so leaves a trivial expression,
-  *including* a cast.  For example
-       \x. f |> co  -->  f |> co
-  (provided co doesn't mention x)
-
-* Eta reduction is not valid in general:
-        \x. bot  /=  bot
-  This matters, partly for old-fashioned correctness reasons but,
-  worse, getting it wrong can yield a seg fault. Consider
-        f = \x.f x
-        h y = case (case y of { True -> f `seq` True; False -> False }) of
-                True -> ...; False -> ...
-
-  If we (unsoundly) eta-reduce f to get f=f, the strictness analyser
-  says f=bottom, and replaces the (f `seq` True) with just
-  (f `cast` unsafe-co).  BUT, as thing stand, 'f' got arity 1, and it
-  *keeps* arity 1 (perhaps also wrongly).  So CorePrep eta-expands
-  the definition again, so that it does not terminate after all.
-  Result: seg-fault because the boolean case actually gets a function value.
-  See #1947.
-
-  So it's important to do the right thing.
-
-* With linear types, eta-reduction can break type-checking:
-        f :: A ⊸ B
-        g :: A -> B
-        g = \x. f x
-
-  The above is correct, but eta-reducing g would yield g=f, the linter will
-  complain that g and f don't have the same type.
-
-* Note [Arity care]: we need to be careful if we just look at f's
-  arity. Currently (Dec07), f's arity is visible in its own RHS (see
-  Note [Arity robustness] in GHC.Core.Opt.Simplify.Env) so we must *not* trust the
-  arity when checking that 'f' is a value.  Otherwise we will
-  eta-reduce
-      f = \x. f x
-  to
-      f = f
-  Which might change a terminating program (think (f `seq` e)) to a
-  non-terminating one.  So we check for being a loop breaker first.
-
-  However for GlobalIds we can look at the arity; and for primops we
-  must, since they have no unfolding.
-
-* Regardless of whether 'f' is a value, we always want to
-  reduce (/\a -> f a) to f
-  This came up in a RULE: foldr (build (/\a -> g a))
-  did not match           foldr (build (/\b -> ...something complex...))
-  The type checker can insert these eta-expanded versions,
-  with both type and dictionary lambdas; hence the slightly
-  ad-hoc isDictId
-
-* Never *reduce* arity. For example
-      f = \xy. g x y
-  Then if h has arity 1 we don't want to eta-reduce because then
-  f's arity would decrease, and that is bad
-
-These delicacies are why we don't use exprIsTrivial and exprIsHNF here.
-Alas.
-
-Note [Eta reduction with casted arguments]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Consider
-    (\(x:t3). f (x |> g)) :: t3 -> t2
-  where
-    f :: t1 -> t2
-    g :: t3 ~ t1
-This should be eta-reduced to
-
-    f |> (sym g -> t2)
-
-So we need to accumulate a coercion, pushing it inward (past
-variable arguments only) thus:
-   f (x |> co_arg) |> co  -->  (f |> (sym co_arg -> co)) x
-   f (x:t)         |> co  -->  (f |> (t -> co)) x
-   f @ a           |> co  -->  (f |> (forall a.co)) @ a
-   f @ (g:t1~t2)   |> co  -->  (f |> (t1~t2 => co)) @ (g:t1~t2)
-These are the equations for ok_arg.
-
-It's true that we could also hope to eta reduce these:
-    (\xy. (f x |> g) y)
-    (\xy. (f x y) |> g)
-But the simplifier pushes those casts outwards, so we don't
-need to address that here.
--}
-
--- When updating this function, make sure to update
--- CorePrep.tryEtaReducePrep as well!
-tryEtaReduce :: [Var] -> CoreExpr -> Maybe CoreExpr
-tryEtaReduce bndrs body
-  = go (reverse bndrs) body (mkRepReflCo (exprType body))
-  where
-    incoming_arity = count isId bndrs
-
-    go :: [Var]            -- Binders, innermost first, types [a3,a2,a1]
-       -> CoreExpr         -- Of type tr
-       -> Coercion         -- Of type tr ~ ts
-       -> Maybe CoreExpr   -- Of type a1 -> a2 -> a3 -> ts
-    -- See Note [Eta reduction with casted arguments]
-    -- for why we have an accumulating coercion
-    go [] fun co
-      | ok_fun fun
-      , let used_vars = exprFreeVars fun `unionVarSet` tyCoVarsOfCo co
-      , not (any (`elemVarSet` used_vars) bndrs)
-      = Just (mkCast fun co)   -- Check for any of the binders free in the result
-                               -- including the accumulated coercion
-
-    go bs (Tick t e) co
-      | tickishFloatable t
-      = fmap (Tick t) $ go bs e co
-      -- Float app ticks: \x -> Tick t (e x) ==> Tick t e
-
-    go (b : bs) (App fun arg) co
-      | Just (co', ticks) <- ok_arg b arg co (exprType fun)
-      = fmap (flip (foldr mkTick) ticks) $ go bs fun co'
-            -- Float arg ticks: \x -> e (Tick t x) ==> Tick t e
-
-    go _ _ _  = Nothing         -- Failure!
-
-    ---------------
-    -- Note [Eta reduction conditions]
-    ok_fun (App fun (Type {})) = ok_fun fun
-    ok_fun (Cast fun _)        = ok_fun fun
-    ok_fun (Tick _ expr)       = ok_fun expr
-    ok_fun (Var fun_id)        = ok_fun_id fun_id || all ok_lam bndrs
-    ok_fun _fun                = False
-
-    ---------------
-    ok_fun_id fun = fun_arity fun >= incoming_arity
-
-    ---------------
-    fun_arity fun             -- See Note [Arity care]
-       | isLocalId fun
-       , isStrongLoopBreaker (idOccInfo fun) = 0
-       | arity > 0                           = arity
-       | isEvaldUnfolding (idUnfolding fun)  = 1
-            -- See Note [Eta reduction of an eval'd function]
-       | otherwise                           = 0
-       where
-         arity = idArity fun
-
-    ---------------
-    ok_lam v = isTyVar v || isEvVar v
-
-    ---------------
-    ok_arg :: Var              -- Of type bndr_t
-           -> CoreExpr         -- Of type arg_t
-           -> Coercion         -- Of kind (t1~t2)
-           -> Type             -- Type of the function to which the argument is applied
-           -> Maybe (Coercion  -- Of type (arg_t -> t1 ~  bndr_t -> t2)
-                               --   (and similarly for tyvars, coercion args)
-                    , [CoreTickish])
-    -- See Note [Eta reduction with casted arguments]
-    ok_arg bndr (Type ty) co _
-       | Just tv <- getTyVar_maybe ty
-       , bndr == tv  = Just (mkHomoForAllCos [tv] co, [])
-    ok_arg bndr (Var v) co fun_ty
-       | bndr == v
-       , let mult = idMult bndr
-       , Just (fun_mult, _, _) <- splitFunTy_maybe fun_ty
-       , mult `eqType` fun_mult -- There is no change in multiplicity, otherwise we must abort
-       = let reflCo = mkRepReflCo (idType bndr)
-         in Just (mkFunCo Representational (multToCo mult) reflCo co, [])
-    ok_arg bndr (Cast e co_arg) co fun_ty
-       | (ticks, Var v) <- stripTicksTop tickishFloatable e
-       , Just (fun_mult, _, _) <- splitFunTy_maybe fun_ty
-       , bndr == v
-       , fun_mult `eqType` idMult bndr
-       = Just (mkFunCo Representational (multToCo fun_mult) (mkSymCo co_arg) co, ticks)
-       -- The simplifier combines multiple casts into one,
-       -- so we can have a simple-minded pattern match here
-    ok_arg bndr (Tick t arg) co fun_ty
-       | tickishFloatable t, Just (co', ticks) <- ok_arg bndr arg co fun_ty
-       = Just (co', t:ticks)
-
-    ok_arg _ _ _ _ = Nothing
-
-{-
-Note [Eta reduction of an eval'd function]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In Haskell it is not true that    f = \x. f x
-because f might be bottom, and 'seq' can distinguish them.
-
-But it *is* true that   f = f `seq` \x. f x
-and we'd like to simplify the latter to the former.  This amounts
-to the rule that
-  * when there is just *one* value argument,
-  * f is not bottom
-we can eta-reduce    \x. f x  ===>  f
-
-This turned up in #7542.
--}
 
 {- *********************************************************************
 *                                                                      *
diff --git a/compiler/GHC/CoreToStg.hs b/compiler/GHC/CoreToStg.hs
index 38050e79e1..57b59d0a66 100644
--- a/compiler/GHC/CoreToStg.hs
+++ b/compiler/GHC/CoreToStg.hs
@@ -549,7 +549,7 @@ coreToStgApp f args ticks = do
                 -- Some primitive operator that might be implemented as a library call.
                 -- As noted by Note [Eta expanding primops] in GHC.Builtin.PrimOps
                 -- we require that primop applications be saturated.
-                PrimOpId op      -> assert saturated $
+                PrimOpId op      -> assertPpr saturated (ppr f <+> ppr args) $
                                     StgOpApp (StgPrimOp op) args' res_ty
 
                 -- A call to some primitive Cmm function.
diff --git a/compiler/GHC/CoreToStg/Prep.hs b/compiler/GHC/CoreToStg/Prep.hs
index 30c28a6db2..0f7def7a6c 100644
--- a/compiler/GHC/CoreToStg/Prep.hs
+++ b/compiler/GHC/CoreToStg/Prep.hs
@@ -35,7 +35,6 @@ import GHC.Builtin.Types.Prim ( realWorldStatePrimTy )
 
 import GHC.Core.Utils
 import GHC.Core.Opt.Arity
-import GHC.Core.FVs
 import GHC.Core.Opt.Monad ( CoreToDo(..) )
 import GHC.Core.Lint    ( endPassIO )
 import GHC.Core
@@ -63,7 +62,6 @@ import GHC.Utils.Trace
 
 import GHC.Types.Demand
 import GHC.Types.Var
-import GHC.Types.Var.Set
 import GHC.Types.Var.Env
 import GHC.Types.Id
 import GHC.Types.Id.Info
@@ -782,7 +780,7 @@ cpeRhsE env expr@(Lit (LitNumber nt i))
       Just e  -> cpeRhsE env e
 cpeRhsE _env expr@(Lit {}) = return (emptyFloats, expr)
 cpeRhsE env expr@(Var {})  = cpeApp env expr
-cpeRhsE env expr@(App {}) = cpeApp env expr
+cpeRhsE env expr@(App {})  = cpeApp env expr
 
 cpeRhsE env (Let bind body)
   = do { (env', bind_floats, maybe_bind') <- cpeBind NotTopLevel env bind
@@ -916,9 +914,7 @@ rhsToBody (Cast e co)
   = do { (floats, e') <- rhsToBody e
        ; return (floats, Cast e' co) }
 
-rhsToBody expr@(Lam {})
-  | Just no_lam_result <- tryEtaReducePrep bndrs body
-  = return (emptyFloats, no_lam_result)
+rhsToBody expr@(Lam {})   -- See Note [No eta reduction needed in rhsToBody]
   | all isTyVar bndrs           -- Type lambdas are ok
   = return (emptyFloats, expr)
   | otherwise                   -- Some value lambdas
@@ -927,11 +923,30 @@ rhsToBody expr@(Lam {})
        ; let float = FloatLet (NonRec fn rhs)
        ; return (unitFloat float, Var fn) }
   where
-    (bndrs,body) = collectBinders expr
+    (bndrs,_) = collectBinders expr
 
 rhsToBody expr = return (emptyFloats, expr)
 
 
+{- Note [No eta reduction needed in rhsToBody]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Historical note.  In the olden days we used to have a Prep-specific
+eta-reduction step in rhsToBody:
+  rhsToBody expr@(Lam {})
+    | Just no_lam_result <- tryEtaReducePrep bndrs body
+    = return (emptyFloats, no_lam_result)
+
+The goal was to reduce
+        case x of { p -> \xs. map f xs }
+    ==> case x of { p -> map f }
+
+to avoid allocating a lambda.  Of course, we'd allocate a PAP
+instead, which is hardly better, but that's the way it was.
+
+Anyway, now we eta-reduce PAPs in the Simplifier (see
+Note [Eta reduce PAPs] in GHC.Core.Opt.Arity), so there is
+no need to do so here.
+-}
 
 -- ---------------------------------------------------------------------------
 --              CpeApp: produces a result satisfying CpeApp
@@ -1085,7 +1100,7 @@ cpeApp top_env expr
        case head of
          Just fn_id -> do { sat_app <- maybeSaturate fn_id app depth
                           ; return (floats, sat_app) }
-         _other              -> return (floats, app)
+         _other     -> return (floats, app)
 
     -- Deconstruct and rebuild the application, floating any non-atomic
     -- arguments to the outside.  We collect the type of the expression,
@@ -1452,7 +1467,7 @@ maybeSaturate fn expr n_args
 
 Note [Eta expansion]
 ~~~~~~~~~~~~~~~~~~~~~
-Eta expand to match the arity claimed by the binder Remember,
+Eta expand to match the arity claimed by the binder. Remember,
 CorePrep must not change arity
 
 Eta expansion might not have happened already, because it is done by
@@ -1460,7 +1475,7 @@ the simplifier only when there at least one lambda already.
 
 NB1:we could refrain when the RHS is trivial (which can happen
     for exported things).  This would reduce the amount of code
-    generated (a little) and make things a little words for
+    generated (a little) and make things a little worse for
     code compiled without -O.  The case in point is data constructor
     wrappers.
 
@@ -1493,50 +1508,6 @@ cpeEtaExpand arity expr
   | arity == 0 = expr
   | otherwise  = etaExpand arity expr
 
-{-
--- -----------------------------------------------------------------------------
---      Eta reduction
--- -----------------------------------------------------------------------------
-
-Why try eta reduction?  Hasn't the simplifier already done eta?
-But the simplifier only eta reduces if that leaves something
-trivial (like f, or f Int).  But for deLam it would be enough to
-get to a partial application:
-        case x of { p -> \xs. map f xs }
-    ==> case x of { p -> map f }
--}
-
--- When updating this function, make sure it lines up with
--- GHC.Core.Utils.tryEtaReduce!
-tryEtaReducePrep :: [CoreBndr] -> CoreExpr -> Maybe CoreExpr
-tryEtaReducePrep bndrs expr@(App _ _)
-  | ok_to_eta_reduce f
-  , n_remaining >= 0
-  , and (zipWith ok bndrs last_args)
-  , not (any (`elemVarSet` fvs_remaining) bndrs)
-  , exprIsHNF remaining_expr   -- Don't turn value into a non-value
-                               -- else the behaviour with 'seq' changes
-  = Just remaining_expr
-  where
-    (f, args) = collectArgs expr
-    remaining_expr = mkApps f remaining_args
-    fvs_remaining = exprFreeVars remaining_expr
-    (remaining_args, last_args) = splitAt n_remaining args
-    n_remaining = length args - length bndrs
-
-    ok bndr (Var arg) = bndr == arg
-    ok _    _         = False
-
-    -- We can't eta reduce something which must be saturated.
-    ok_to_eta_reduce (Var f) = not (hasNoBinding f) && not (isLinearType (idType f))
-    ok_to_eta_reduce _       = False -- Safe. ToDo: generalise
-
-
-tryEtaReducePrep bndrs (Tick tickish e)
-  | tickishFloatable tickish
-  = fmap (mkTick tickish) $ tryEtaReducePrep bndrs e
-
-tryEtaReducePrep _ _ = Nothing
 
 {-
 ************************************************************************
diff --git a/testsuite/tests/arityanal/should_compile/Arity03.stderr b/testsuite/tests/arityanal/should_compile/Arity03.stderr
index e5e3e754dd..74084d1004 100644
--- a/testsuite/tests/arityanal/should_compile/Arity03.stderr
+++ b/testsuite/tests/arityanal/should_compile/Arity03.stderr
@@ -19,19 +19,14 @@ fac [InlPrag=[2]] :: Int -> Int
 [GblId,
  Arity=1,
  Str=<1P(1L)>,
- Cpr=m1,
+ Cpr=1,
  Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
-         Tmpl= \ (w [Occ=Once1!] :: Int) -> case w of { GHC.Types.I# ww1 [Occ=Once1] -> case F3.$wfac ww1 of ww2 [Occ=Once1] { __DEFAULT -> GHC.Types.I# ww2 } }}]
-fac = \ (w :: Int) -> case w of { GHC.Types.I# ww1 -> case F3.$wfac ww1 of ww2 { __DEFAULT -> GHC.Types.I# ww2 } }
+         Tmpl= \ (w [Occ=Once1!] :: Int) -> case w of { GHC.Types.I# ww [Occ=Once1] -> case F3.$wfac ww of ww1 [Occ=Once1] { __DEFAULT -> GHC.Types.I# ww1 } }}]
+fac = \ (w :: Int) -> case w of { GHC.Types.I# ww -> case F3.$wfac ww of ww1 { __DEFAULT -> GHC.Types.I# ww1 } }
 
 -- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
 f3 :: Int -> Int
-[GblId,
- Arity=1,
- Str=<1P(1L)>,
- Cpr=m1,
- Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=True)
-         Tmpl= fac}]
+[GblId, Arity=1, Str=<1P(1L)>, Cpr=1, Unf=Unf{Src=<vanilla>, TopLvl=True, Value=True, ConLike=True, WorkFree=True, Expandable=True, Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=True)}]
 f3 = fac
 
 
diff --git a/testsuite/tests/arityanal/should_compile/Arity11.stderr b/testsuite/tests/arityanal/should_compile/Arity11.stderr
index da35b40ab8..088896dc17 100644
--- a/testsuite/tests/arityanal/should_compile/Arity11.stderr
+++ b/testsuite/tests/arityanal/should_compile/Arity11.stderr
@@ -124,4 +124,7 @@ f11 :: (Integer, Integer)
 f11 = (F11.f4, F11.f1)
 
 
+------ Local rules for imported ids --------
+"SPEC fib @Integer @Integer" forall ($dEq :: Eq Integer) ($dNum :: Num Integer) ($dNum1 :: Num Integer). fib @Integer @Integer $dEq $dNum $dNum1 = F11.f11_fib
+
 
diff --git a/testsuite/tests/codeGen/should_compile/debug.stdout b/testsuite/tests/codeGen/should_compile/debug.stdout
index 3dca62a419..25df0c258f 100644
--- a/testsuite/tests/codeGen/should_compile/debug.stdout
+++ b/testsuite/tests/codeGen/should_compile/debug.stdout
@@ -18,6 +18,7 @@ src<debug.hs:4:9>
 src<debug.hs:5:21-29>
 src<debug.hs:5:9-29>
 src<debug.hs:6:1-21>
+src<debug.hs:6:16-21>
 == CBE ==
 src<debug.hs:4:9>
 89
diff --git a/testsuite/tests/deSugar/should_compile/T19969.hs b/testsuite/tests/deSugar/should_compile/T19969.hs
index ad9546c84a..c6a8ac9c05 100644
--- a/testsuite/tests/deSugar/should_compile/T19969.hs
+++ b/testsuite/tests/deSugar/should_compile/T19969.hs
@@ -5,10 +5,10 @@ module T19969 where
 -- Three mutually recursive functions
 -- We want to inline g, h, keeping f as the loop breaker
 
+f [] = []
 f x = reverse (g (x:: [Int])) :: [Int]
 
 {-# INLINE g #-}
-
 g x = reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse (reverse (h x))))))))))))
 
 {-# INLINE h #-}
diff --git a/testsuite/tests/deSugar/should_compile/T19969.stderr b/testsuite/tests/deSugar/should_compile/T19969.stderr
index 5e23785472..3c70f95163 100644
--- a/testsuite/tests/deSugar/should_compile/T19969.stderr
+++ b/testsuite/tests/deSugar/should_compile/T19969.stderr
@@ -1,38 +1,425 @@
 
 ==================== Tidy Core ====================
 Result size of Tidy Core
-  = {terms: 12, types: 18, coercions: 0, joins: 0/0}
+  = {terms: 196, types: 204, coercions: 0, joins: 0/0}
 
 Rec {
--- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
-f [Occ=LoopBreaker] :: [Int] -> [Int]
-[GblId, Arity=1, Str=<B>b, Cpr=b, Unf=OtherCon []]
-f = \ (x :: [Int]) -> f x
+-- RHS size: {terms: 55, types: 53, coercions: 0, joins: 0/0}
+T19969.f_$sf [Occ=LoopBreaker] :: Int -> [Int] -> [Int]
+[GblId, Arity=2, Str=<B><B>b, Unf=OtherCon []]
+T19969.f_$sf
+  = \ (sc :: Int) (sc1 :: [Int]) ->
+      GHC.List.reverse1
+        @Int
+        (GHC.List.reverse1
+           @Int
+           (GHC.List.reverse1
+              @Int
+              (GHC.List.reverse1
+                 @Int
+                 (GHC.List.reverse1
+                    @Int
+                    (GHC.List.reverse1
+                       @Int
+                       (GHC.List.reverse1
+                          @Int
+                          (GHC.List.reverse1
+                             @Int
+                             (GHC.List.reverse1
+                                @Int
+                                (GHC.List.reverse1
+                                   @Int
+                                   (GHC.List.reverse1
+                                      @Int
+                                      (GHC.List.reverse1
+                                         @Int
+                                         (GHC.List.reverse1
+                                            @Int
+                                            (GHC.List.reverse1
+                                               @Int
+                                               (GHC.List.reverse1
+                                                  @Int
+                                                  (GHC.List.reverse1
+                                                     @Int
+                                                     (GHC.List.reverse1
+                                                        @Int
+                                                        (GHC.List.reverse1
+                                                           @Int
+                                                           (GHC.List.reverse1
+                                                              @Int
+                                                              (GHC.List.reverse1
+                                                                 @Int
+                                                                 (GHC.List.reverse1
+                                                                    @Int
+                                                                    (GHC.List.reverse1
+                                                                       @Int
+                                                                       (GHC.List.reverse1
+                                                                          @Int
+                                                                          (GHC.List.reverse1
+                                                                             @Int
+                                                                             (GHC.List.reverse1
+                                                                                @Int
+                                                                                (T19969.f_$sf
+                                                                                   sc sc1)
+                                                                                (GHC.Types.[] @Int))
+                                                                             (GHC.Types.[] @Int))
+                                                                          (GHC.Types.[] @Int))
+                                                                       (GHC.Types.[] @Int))
+                                                                    (GHC.Types.[] @Int))
+                                                                 (GHC.Types.[] @Int))
+                                                              (GHC.Types.[] @Int))
+                                                           (GHC.Types.[] @Int))
+                                                        (GHC.Types.[] @Int))
+                                                     (GHC.Types.[] @Int))
+                                                  (GHC.Types.[] @Int))
+                                               (GHC.Types.[] @Int))
+                                            (GHC.Types.[] @Int))
+                                         (GHC.Types.[] @Int))
+                                      (GHC.Types.[] @Int))
+                                   (GHC.Types.[] @Int))
+                                (GHC.Types.[] @Int))
+                             (GHC.Types.[] @Int))
+                          (GHC.Types.[] @Int))
+                       (GHC.Types.[] @Int))
+                    (GHC.Types.[] @Int))
+                 (GHC.Types.[] @Int))
+              (GHC.Types.[] @Int))
+           (GHC.Types.[] @Int))
+        (GHC.Types.[] @Int)
 end Rec }
 
--- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
-g [InlPrag=INLINE (sat-args=1)] :: [Int] -> [Int]
+-- RHS size: {terms: 59, types: 58, coercions: 0, joins: 0/0}
+f :: [Int] -> [Int]
+[GblId, Arity=1, Str=<1L>, Unf=OtherCon []]
+f = \ (ds :: [Int]) ->
+      case ds of {
+        [] -> GHC.Types.[] @Int;
+        : ipv ipv1 ->
+          GHC.List.reverse1
+            @Int
+            (GHC.List.reverse1
+               @Int
+               (GHC.List.reverse1
+                  @Int
+                  (GHC.List.reverse1
+                     @Int
+                     (GHC.List.reverse1
+                        @Int
+                        (GHC.List.reverse1
+                           @Int
+                           (GHC.List.reverse1
+                              @Int
+                              (GHC.List.reverse1
+                                 @Int
+                                 (GHC.List.reverse1
+                                    @Int
+                                    (GHC.List.reverse1
+                                       @Int
+                                       (GHC.List.reverse1
+                                          @Int
+                                          (GHC.List.reverse1
+                                             @Int
+                                             (GHC.List.reverse1
+                                                @Int
+                                                (GHC.List.reverse1
+                                                   @Int
+                                                   (GHC.List.reverse1
+                                                      @Int
+                                                      (GHC.List.reverse1
+                                                         @Int
+                                                         (GHC.List.reverse1
+                                                            @Int
+                                                            (GHC.List.reverse1
+                                                               @Int
+                                                               (GHC.List.reverse1
+                                                                  @Int
+                                                                  (GHC.List.reverse1
+                                                                     @Int
+                                                                     (GHC.List.reverse1
+                                                                        @Int
+                                                                        (GHC.List.reverse1
+                                                                           @Int
+                                                                           (GHC.List.reverse1
+                                                                              @Int
+                                                                              (GHC.List.reverse1
+                                                                                 @Int
+                                                                                 (GHC.List.reverse1
+                                                                                    @Int
+                                                                                    (T19969.f_$sf
+                                                                                       ipv ipv1)
+                                                                                    (GHC.Types.[]
+                                                                                       @Int))
+                                                                                 (GHC.Types.[]
+                                                                                    @Int))
+                                                                              (GHC.Types.[] @Int))
+                                                                           (GHC.Types.[] @Int))
+                                                                        (GHC.Types.[] @Int))
+                                                                     (GHC.Types.[] @Int))
+                                                                  (GHC.Types.[] @Int))
+                                                               (GHC.Types.[] @Int))
+                                                            (GHC.Types.[] @Int))
+                                                         (GHC.Types.[] @Int))
+                                                      (GHC.Types.[] @Int))
+                                                   (GHC.Types.[] @Int))
+                                                (GHC.Types.[] @Int))
+                                             (GHC.Types.[] @Int))
+                                          (GHC.Types.[] @Int))
+                                       (GHC.Types.[] @Int))
+                                    (GHC.Types.[] @Int))
+                                 (GHC.Types.[] @Int))
+                              (GHC.Types.[] @Int))
+                           (GHC.Types.[] @Int))
+                        (GHC.Types.[] @Int))
+                     (GHC.Types.[] @Int))
+                  (GHC.Types.[] @Int))
+               (GHC.Types.[] @Int))
+            (GHC.Types.[] @Int)
+      }
+
+-- RHS size: {terms: 27, types: 26, coercions: 0, joins: 0/0}
+h [InlPrag=INLINE (sat-args=1)] :: [Int] -> [Int]
 [GblId,
  Arity=1,
- Str=<B>b,
- Cpr=b,
+ Str=<1L>,
  Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
          WorkFree=True, Expandable=True,
-         Guidance=ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=True)
-         Tmpl= \ (x [Occ=Once1] :: [Int]) -> f x}]
-g = \ (x :: [Int]) -> f x
+         Guidance=ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=False)
+         Tmpl= \ (x [Occ=Once1] :: [Int]) ->
+                 GHC.List.reverse1
+                   @Int
+                   (GHC.List.reverse1
+                      @Int
+                      (GHC.List.reverse1
+                         @Int
+                         (GHC.List.reverse1
+                            @Int
+                            (GHC.List.reverse1
+                               @Int
+                               (GHC.List.reverse1
+                                  @Int
+                                  (GHC.List.reverse1
+                                     @Int
+                                     (GHC.List.reverse1
+                                        @Int
+                                        (GHC.List.reverse1
+                                           @Int
+                                           (GHC.List.reverse1
+                                              @Int
+                                              (GHC.List.reverse1
+                                                 @Int
+                                                 (GHC.List.reverse1 @Int (f x) (GHC.Types.[] @Int))
+                                                 (GHC.Types.[] @Int))
+                                              (GHC.Types.[] @Int))
+                                           (GHC.Types.[] @Int))
+                                        (GHC.Types.[] @Int))
+                                     (GHC.Types.[] @Int))
+                                  (GHC.Types.[] @Int))
+                               (GHC.Types.[] @Int))
+                            (GHC.Types.[] @Int))
+                         (GHC.Types.[] @Int))
+                      (GHC.Types.[] @Int))
+                   (GHC.Types.[] @Int)}]
+h = \ (x :: [Int]) ->
+      GHC.List.reverse1
+        @Int
+        (GHC.List.reverse1
+           @Int
+           (GHC.List.reverse1
+              @Int
+              (GHC.List.reverse1
+                 @Int
+                 (GHC.List.reverse1
+                    @Int
+                    (GHC.List.reverse1
+                       @Int
+                       (GHC.List.reverse1
+                          @Int
+                          (GHC.List.reverse1
+                             @Int
+                             (GHC.List.reverse1
+                                @Int
+                                (GHC.List.reverse1
+                                   @Int
+                                   (GHC.List.reverse1
+                                      @Int
+                                      (GHC.List.reverse1 @Int (f x) (GHC.Types.[] @Int))
+                                      (GHC.Types.[] @Int))
+                                   (GHC.Types.[] @Int))
+                                (GHC.Types.[] @Int))
+                             (GHC.Types.[] @Int))
+                          (GHC.Types.[] @Int))
+                       (GHC.Types.[] @Int))
+                    (GHC.Types.[] @Int))
+                 (GHC.Types.[] @Int))
+              (GHC.Types.[] @Int))
+           (GHC.Types.[] @Int))
+        (GHC.Types.[] @Int)
 
--- RHS size: {terms: 3, types: 2, coercions: 0, joins: 0/0}
-h [InlPrag=INLINE (sat-args=1)] :: [Int] -> [Int]
+-- RHS size: {terms: 51, types: 50, coercions: 0, joins: 0/0}
+g [InlPrag=INLINE (sat-args=1)] :: [Int] -> [Int]
 [GblId,
  Arity=1,
- Str=<B>b,
- Cpr=b,
+ Str=<1L>,
  Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
          WorkFree=True, Expandable=True,
-         Guidance=ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=True)
-         Tmpl= \ (x [Occ=Once1] :: [Int]) -> f x}]
-h = \ (x :: [Int]) -> f x
+         Guidance=ALWAYS_IF(arity=1,unsat_ok=False,boring_ok=False)
+         Tmpl= \ (x [Occ=Once1] :: [Int]) ->
+                 GHC.List.reverse1
+                   @Int
+                   (GHC.List.reverse1
+                      @Int
+                      (GHC.List.reverse1
+                         @Int
+                         (GHC.List.reverse1
+                            @Int
+                            (GHC.List.reverse1
+                               @Int
+                               (GHC.List.reverse1
+                                  @Int
+                                  (GHC.List.reverse1
+                                     @Int
+                                     (GHC.List.reverse1
+                                        @Int
+                                        (GHC.List.reverse1
+                                           @Int
+                                           (GHC.List.reverse1
+                                              @Int
+                                              (GHC.List.reverse1
+                                                 @Int
+                                                 (GHC.List.reverse1
+                                                    @Int
+                                                    (GHC.List.reverse1
+                                                       @Int
+                                                       (GHC.List.reverse1
+                                                          @Int
+                                                          (GHC.List.reverse1
+                                                             @Int
+                                                             (GHC.List.reverse1
+                                                                @Int
+                                                                (GHC.List.reverse1
+                                                                   @Int
+                                                                   (GHC.List.reverse1
+                                                                      @Int
+                                                                      (GHC.List.reverse1
+                                                                         @Int
+                                                                         (GHC.List.reverse1
+                                                                            @Int
+                                                                            (GHC.List.reverse1
+                                                                               @Int
+                                                                               (GHC.List.reverse1
+                                                                                  @Int
+                                                                                  (GHC.List.reverse1
+                                                                                     @Int
+                                                                                     (GHC.List.reverse1
+                                                                                        @Int
+                                                                                        (f x)
+                                                                                        (GHC.Types.[]
+                                                                                           @Int))
+                                                                                     (GHC.Types.[]
+                                                                                        @Int))
+                                                                                  (GHC.Types.[]
+                                                                                     @Int))
+                                                                               (GHC.Types.[] @Int))
+                                                                            (GHC.Types.[] @Int))
+                                                                         (GHC.Types.[] @Int))
+                                                                      (GHC.Types.[] @Int))
+                                                                   (GHC.Types.[] @Int))
+                                                                (GHC.Types.[] @Int))
+                                                             (GHC.Types.[] @Int))
+                                                          (GHC.Types.[] @Int))
+                                                       (GHC.Types.[] @Int))
+                                                    (GHC.Types.[] @Int))
+                                                 (GHC.Types.[] @Int))
+                                              (GHC.Types.[] @Int))
+                                           (GHC.Types.[] @Int))
+                                        (GHC.Types.[] @Int))
+                                     (GHC.Types.[] @Int))
+                                  (GHC.Types.[] @Int))
+                               (GHC.Types.[] @Int))
+                            (GHC.Types.[] @Int))
+                         (GHC.Types.[] @Int))
+                      (GHC.Types.[] @Int))
+                   (GHC.Types.[] @Int)}]
+g = \ (x :: [Int]) ->
+      GHC.List.reverse1
+        @Int
+        (GHC.List.reverse1
+           @Int
+           (GHC.List.reverse1
+              @Int
+              (GHC.List.reverse1
+                 @Int
+                 (GHC.List.reverse1
+                    @Int
+                    (GHC.List.reverse1
+                       @Int
+                       (GHC.List.reverse1
+                          @Int
+                          (GHC.List.reverse1
+                             @Int
+                             (GHC.List.reverse1
+                                @Int
+                                (GHC.List.reverse1
+                                   @Int
+                                   (GHC.List.reverse1
+                                      @Int
+                                      (GHC.List.reverse1
+                                         @Int
+                                         (GHC.List.reverse1
+                                            @Int
+                                            (GHC.List.reverse1
+                                               @Int
+                                               (GHC.List.reverse1
+                                                  @Int
+                                                  (GHC.List.reverse1
+                                                     @Int
+                                                     (GHC.List.reverse1
+                                                        @Int
+                                                        (GHC.List.reverse1
+                                                           @Int
+                                                           (GHC.List.reverse1
+                                                              @Int
+                                                              (GHC.List.reverse1
+                                                                 @Int
+                                                                 (GHC.List.reverse1
+                                                                    @Int
+                                                                    (GHC.List.reverse1
+                                                                       @Int
+                                                                       (GHC.List.reverse1
+                                                                          @Int
+                                                                          (GHC.List.reverse1
+                                                                             @Int
+                                                                             (f x)
+                                                                             (GHC.Types.[] @Int))
+                                                                          (GHC.Types.[] @Int))
+                                                                       (GHC.Types.[] @Int))
+                                                                    (GHC.Types.[] @Int))
+                                                                 (GHC.Types.[] @Int))
+                                                              (GHC.Types.[] @Int))
+                                                           (GHC.Types.[] @Int))
+                                                        (GHC.Types.[] @Int))
+                                                     (GHC.Types.[] @Int))
+                                                  (GHC.Types.[] @Int))
+                                               (GHC.Types.[] @Int))
+                                            (GHC.Types.[] @Int))
+                                         (GHC.Types.[] @Int))
+                                      (GHC.Types.[] @Int))
+                                   (GHC.Types.[] @Int))
+                                (GHC.Types.[] @Int))
+                             (GHC.Types.[] @Int))
+                          (GHC.Types.[] @Int))
+                       (GHC.Types.[] @Int))
+                    (GHC.Types.[] @Int))
+                 (GHC.Types.[] @Int))
+              (GHC.Types.[] @Int))
+           (GHC.Types.[] @Int))
+        (GHC.Types.[] @Int)
+
 
+------ Local rules for imported ids --------
+"SC:f0"
+    forall (sc :: Int) (sc1 :: [Int]).
+      f (GHC.Types.: @Int sc sc1)
+      = T19969.f_$sf sc sc1
 
 
diff --git a/testsuite/tests/driver/inline-check.stderr b/testsuite/tests/driver/inline-check.stderr
index 40b5b59d19..a65d39ea6f 100644
--- a/testsuite/tests/driver/inline-check.stderr
+++ b/testsuite/tests/driver/inline-check.stderr
@@ -1,6 +1,6 @@
 Considering inlining: foo
   arg infos [ValueArg]
-  interesting continuation RhsCtxt
+  interesting continuation RhsCtxt(NonRecursive)
   some_benefit True
   is exp: True
   is work-free: True
@@ -19,7 +19,7 @@ Inactive unfolding: foo1
 Inactive unfolding: foo1
 Considering inlining: foo
   arg infos []
-  interesting continuation RhsCtxt
+  interesting continuation RhsCtxt(NonRecursive)
   some_benefit False
   is exp: True
   is work-free: True
diff --git a/testsuite/tests/simplCore/should_compile/Makefile b/testsuite/tests/simplCore/should_compile/Makefile
index 08d1798fa8..1667c11cdf 100644
--- a/testsuite/tests/simplCore/should_compile/Makefile
+++ b/testsuite/tests/simplCore/should_compile/Makefile
@@ -165,12 +165,12 @@ T5298:
 .PHONY: T5327
 T5327:
 	$(RM) -f T5327.hi T5327.o
-	'$(TEST_HC)' $(TEST_HC_OPTS) -c T5327.hs -O -ddump-simpl | grep -c '># 34# '
+	'$(TEST_HC)' $(TEST_HC_OPTS) -c T5327.hs -O -ddump-simpl | grep -c '34#'
 
 .PHONY: T16254
 T16254:
 	$(RM) -f T16254.hi T16254.o
-	'$(TEST_HC)' $(TEST_HC_OPTS) -c T16254.hs -O -ddump-simpl | grep -c '># 34# '
+	'$(TEST_HC)' $(TEST_HC_OPTS) -c T16254.hs -O -ddump-simpl | grep -c '34#'
 
 .PHONY: T5623
 T5623:
diff --git a/testsuite/tests/simplCore/should_compile/T16254.hs b/testsuite/tests/simplCore/should_compile/T16254.hs
index 3c1490c17c..a877eee6ab 100644
--- a/testsuite/tests/simplCore/should_compile/T16254.hs
+++ b/testsuite/tests/simplCore/should_compile/T16254.hs
@@ -8,7 +8,12 @@ newtype Size a b where
 {-# INLINABLE val2 #-}
 val2 = Size 17
 
--- In the core, we should see a comparison against 34#, i.e. constant
--- folding should have happened. We actually see it twice: Once in f's
--- definition, and once in its unfolding.
+-- In the core, we should see 34#, i.e. constant folding
+-- should have happened.
+--
+-- We actually get eta-reduction thus:
+--    tmp = I# 34#
+--    f = gtInt tmp
+-- beucase gtInt is marked INLINE with two parameters.
+-- But that's ok
 f n = case val2 of Size s -> s + s > n
diff --git a/testsuite/tests/simplCore/should_compile/T5327.hs b/testsuite/tests/simplCore/should_compile/T5327.hs
index a2d9c018ae..a533a2fe32 100644
--- a/testsuite/tests/simplCore/should_compile/T5327.hs
+++ b/testsuite/tests/simplCore/should_compile/T5327.hs
@@ -5,8 +5,13 @@ newtype Size = Size Int
 {-# INLINABLE val2 #-}
 val2 = Size 17
 
--- In the core, we should see a comparison against 34#, i.e. constant
--- folding should have happened. We actually see it twice: Once in f's
--- definition, and once in its unfolding.
+-- In the core, we should see 34#, i.e. constant folding
+-- should have happened.
+--
+-- We actually get eta-reduction thus:
+--    tmp = I# 34#
+--    f = gtInt tmp
+-- beucase gtInt is marked INLINE with two parameters.
+-- But that's ok
 f n = case val2 of Size s -> s + s > n
 
diff --git a/testsuite/tests/simplCore/should_compile/T7785.stderr b/testsuite/tests/simplCore/should_compile/T7785.stderr
index f0187fe958..53efb3bab4 100644
--- a/testsuite/tests/simplCore/should_compile/T7785.stderr
+++ b/testsuite/tests/simplCore/should_compile/T7785.stderr
@@ -1,8 +1,335 @@
 
-==================== Tidy Core rules ====================
-"SPEC shared @[]"
-    forall ($dMyFunctor :: MyFunctor []) (irred :: Domain [] Int).
-      shared @[] $dMyFunctor irred
-      = bar_$sshared
+==================== Common sub-expression ====================
+Result size of Common sub-expression
+  = {terms: 181, types: 89, coercions: 5, joins: 0/1}
+
+-- RHS size: {terms: 5, types: 10, coercions: 0, joins: 0/0}
+$cmyfmap_aG7
+  :: forall a b. (Domain [] a, Domain [] b) => (a -> b) -> [a] -> [b]
+[LclId,
+ Arity=4,
+ Str=<A><A><U><SU>,
+ Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True,
+         Guidance=ALWAYS_IF(arity=2,unsat_ok=True,boring_ok=True)
+         Tmpl= \ (@a_aGa)
+                 (@b_aGb)
+                 _ [Occ=Dead]
+                 _ [Occ=Dead]
+                 (eta_B0 [Occ=Once1, OS=OneShot] :: a_aGa -> b_aGb)
+                 (eta_B1 [Occ=Once1, OS=OneShot] :: [a_aGa]) ->
+                 GHC.Base.build
+                   @b_aGb
+                   (\ (@b1_aHe)
+                      (c_aHf [Occ=Once1, OS=OneShot] :: b_aGb -> b1_aHe -> b1_aHe)
+                      (n_aHg [Occ=Once1, OS=OneShot] :: b1_aHe) ->
+                      GHC.Base.foldr
+                        @a_aGa
+                        @b1_aHe
+                        (GHC.Base.mapFB @b_aGb @b1_aHe @a_aGa c_aHf eta_B0)
+                        n_aHg
+                        eta_B1)}]
+$cmyfmap_aG7
+  = \ (@a_aGa)
+      (@b_aGb)
+      _ [Occ=Dead, Dmd=A]
+      _ [Occ=Dead, Dmd=A, OS=OneShot] ->
+      map @a_aGa @b_aGb
+
+-- RHS size: {terms: 1, types: 0, coercions: 3, joins: 0/0}
+Foo.$fMyFunctor[] [InlPrag=INLINE (sat-args=0)] :: MyFunctor []
+[LclIdX[DFunId(nt)],
+ Arity=4,
+ Str=<A><A><U><SU>,
+ Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True,
+         Guidance=ALWAYS_IF(arity=0,unsat_ok=False,boring_ok=True)
+         Tmpl= $cmyfmap_aG7
+               `cast` (Sym (Foo.N:MyFunctor[0] <[]>_N)
+                       :: (forall a b.
+                           (Domain [] a, Domain [] b) =>
+                           (a -> b) -> [a] -> [b])
+                          ~R# MyFunctor [])}]
+Foo.$fMyFunctor[]
+  = $cmyfmap_aG7
+    `cast` (Sym (Foo.N:MyFunctor[0] <[]>_N)
+            :: (forall a b.
+                (Domain [] a, Domain [] b) =>
+                (a -> b) -> [a] -> [b])
+               ~R# MyFunctor [])
+
+-- RHS size: {terms: 2, types: 2, coercions: 0, joins: 0/0}
+$sshared_sHD :: [Int] -> [Int]
+[LclId,
+ Arity=1,
+ Str=<SU>,
+ Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True,
+         Guidance=ALWAYS_IF(arity=0,unsat_ok=True,boring_ok=False)
+         Tmpl= map @Int @Int GHC.Num.$fNumInt_$cnegate}]
+$sshared_sHD = map @Int @Int GHC.Num.$fNumInt_$cnegate
+
+-- RHS size: {terms: 115, types: 15, coercions: 2, joins: 0/1}
+shared
+  :: forall (f :: * -> *).
+     (MyFunctor f, Domain f Int) =>
+     f Int -> f Int
+[LclIdX,
+ Arity=2,
+ Str=<UC1(CS(CS(U)))><U>,
+ RULES: "SPEC shared @[]"
+            forall ($dMyFunctor_sHz :: MyFunctor [])
+                   (irred_sHA :: Domain [] Int).
+              shared @[] $dMyFunctor_sHz irred_sHA
+              = $sshared_sHD]
+shared
+  = \ (@(f_ayh :: * -> *))
+      ($dMyFunctor_ayi [Dmd=UC1(CS(CS(U)))] :: MyFunctor f_ayh)
+      (irred_ayj :: Domain f_ayh Int) ->
+      let {
+        f_sHy :: f_ayh Int -> f_ayh Int
+        [LclId]
+        f_sHy
+          = ($dMyFunctor_ayi
+             `cast` (Foo.N:MyFunctor[0] <f_ayh>_N
+                     :: MyFunctor f_ayh
+                        ~R# (forall a b.
+                             (Domain f_ayh a, Domain f_ayh b) =>
+                             (a -> b) -> f_ayh a -> f_ayh b)))
+              @Int @Int irred_ayj irred_ayj GHC.Num.$fNumInt_$cnegate } in
+      \ (x_X4N :: f_ayh Int) ->
+        f_sHy
+          (f_sHy
+             (f_sHy
+                (f_sHy
+                   (f_sHy
+                      (f_sHy
+                         (f_sHy
+                            (f_sHy
+                               (f_sHy
+                                  (f_sHy
+                                     (f_sHy
+                                        (f_sHy
+                                           (f_sHy
+                                              (f_sHy
+                                                 (f_sHy
+                                                    (f_sHy
+                                                       (f_sHy
+                                                          (f_sHy
+                                                             (f_sHy
+                                                                (f_sHy
+                                                                   (f_sHy
+                                                                      (f_sHy
+                                                                         (f_sHy
+                                                                            (f_sHy
+                                                                               (f_sHy
+                                                                                  (f_sHy
+                                                                                     (f_sHy
+                                                                                        (f_sHy
+                                                                                           (f_sHy
+                                                                                              (f_sHy
+                                                                                                 (f_sHy
+                                                                                                    (f_sHy
+                                                                                                       (f_sHy
+                                                                                                          (f_sHy
+                                                                                                             (f_sHy
+                                                                                                                (f_sHy
+                                                                                                                   (f_sHy
+                                                                                                                      (f_sHy
+                                                                                                                         (f_sHy
+                                                                                                                            (f_sHy
+                                                                                                                               (f_sHy
+                                                                                                                                  (f_sHy
+                                                                                                                                     (f_sHy
+                                                                                                                                        (f_sHy
+                                                                                                                                           (f_sHy
+                                                                                                                                              (f_sHy
+                                                                                                                                                 (f_sHy
+                                                                                                                                                    (f_sHy
+                                                                                                                                                       (f_sHy
+                                                                                                                                                          (f_sHy
+                                                                                                                                                             (f_sHy
+                                                                                                                                                                (f_sHy
+                                                                                                                                                                   (f_sHy
+                                                                                                                                                                      (f_sHy
+                                                                                                                                                                         (f_sHy
+                                                                                                                                                                            (f_sHy
+                                                                                                                                                                               (f_sHy
+                                                                                                                                                                                  (f_sHy
+                                                                                                                                                                                     (f_sHy
+                                                                                                                                                                                        (f_sHy
+                                                                                                                                                                                           (f_sHy
+                                                                                                                                                                                              (f_sHy
+                                                                                                                                                                                                 (f_sHy
+                                                                                                                                                                                                    (f_sHy
+                                                                                                                                                                                                       (f_sHy
+                                                                                                                                                                                                          (f_sHy
+                                                                                                                                                                                                             (f_sHy
+                                                                                                                                                                                                                (f_sHy
+                                                                                                                                                                                                                   (f_sHy
+                                                                                                                                                                                                                      (f_sHy
+                                                                                                                                                                                                                         (f_sHy
+                                                                                                                                                                                                                            (f_sHy
+                                                                                                                                                                                                                               (f_sHy
+                                                                                                                                                                                                                                  (f_sHy
+                                                                                                                                                                                                                                     (f_sHy
+                                                                                                                                                                                                                                        (f_sHy
+                                                                                                                                                                                                                                           (f_sHy
+                                                                                                                                                                                                                                              (f_sHy
+                                                                                                                                                                                                                                                 (f_sHy
+                                                                                                                                                                                                                                                    (f_sHy
+                                                                                                                                                                                                                                                       (f_sHy
+                                                                                                                                                                                                                                                          (f_sHy
+                                                                                                                                                                                                                                                             (f_sHy
+                                                                                                                                                                                                                                                                (f_sHy
+                                                                                                                                                                                                                                                                   (f_sHy
+                                                                                                                                                                                                                                                                      (f_sHy
+                                                                                                                                                                                                                                                                         (f_sHy
+                                                                                                                                                                                                                                                                            (f_sHy
+                                                                                                                                                                                                                                                                               (f_sHy
+                                                                                                                                                                                                                                                                                  (f_sHy
+                                                                                                                                                                                                                                                                                     (f_sHy
+                                                                                                                                                                                                                                                                                        (f_sHy
+                                                                                                                                                                                                                                                                                           (f_sHy
+                                                                                                                                                                                                                                                                                              (f_sHy
+                                                                                                                                                                                                                                                                                                 (f_sHy
+                                                                                                                                                                                                                                                                                                    (f_sHy
+                                                                                                                                                                                                                                                                                                       (f_sHy
+                                                                                                                                                                                                                                                                                                          (f_sHy
+                                                                                                                                                                                                                                                                                                             (f_sHy
+                                                                                                                                                                                                                                                                                                                (f_sHy
+                                                                                                                                                                                                                                                                                                                   (f_sHy
+                                                                                                                                                                                                                                                                                                                      (f_sHy
+                                                                                                                                                                                                                                                                                                                         (f_sHy
+                                                                                                                                                                                                                                                                                                                            (f_sHy
+                                                                                                                                                                                                                                                                                                                               (f_sHy
+                                                                                                                                                                                                                                                                                                                                  x_X4N))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+lvl_sHI :: Int
+[LclId]
+lvl_sHI = GHC.Types.I# 0#
+
+-- RHS size: {terms: 6, types: 5, coercions: 0, joins: 0/0}
+foo :: [Int] -> [Int]
+[LclIdX,
+ Arity=1,
+ Str=<U>,
+ Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True,
+         Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
+         Tmpl= \ (xs_awV [Occ=Once1] :: [Int]) ->
+                 GHC.Base.build
+                   @Int
+                   (\ (@b1_aHe)
+                      (c_aHf [Occ=Once1, OS=OneShot] :: Int -> b1_aHe -> b1_aHe)
+                      (n_aHg [Occ=Once1, OS=OneShot] :: b1_aHe) ->
+                      GHC.Base.foldr
+                        @Int
+                        @b1_aHe
+                        (GHC.Base.mapFB @Int @b1_aHe @Int c_aHf GHC.Num.$fNumInt_$cnegate)
+                        n_aHg
+                        (GHC.Types.: @Int lvl_sHI xs_awV))}]
+foo
+  = \ (xs_awV :: [Int]) ->
+      map
+        @Int
+        @Int
+        GHC.Num.$fNumInt_$cnegate
+        (GHC.Types.: @Int lvl_sHI xs_awV)
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+lvl_sHJ :: Int
+[LclId]
+lvl_sHJ = lvl_sHI
+
+-- RHS size: {terms: 6, types: 5, coercions: 0, joins: 0/0}
+bar :: [Int] -> [Int]
+[LclIdX,
+ Arity=1,
+ Str=<1U>,
+ Cpr=m2,
+ Unf=Unf{Src=InlineStable, TopLvl=True, Value=True, ConLike=True,
+         WorkFree=True, Expandable=True,
+         Guidance=ALWAYS_IF(arity=1,unsat_ok=True,boring_ok=False)
+         Tmpl= \ (xs_awW [Occ=Once1] :: [Int]) ->
+                 GHC.Types.:
+                   @Int
+                   lvl_sHI
+                   (GHC.Base.build
+                      @Int
+                      (\ (@b1_aHe)
+                         (c_aHf [Occ=Once1, OS=OneShot] :: Int -> b1_aHe -> b1_aHe)
+                         (n_aHg [Occ=Once1, OS=OneShot] :: b1_aHe) ->
+                         GHC.Base.foldr
+                           @Int
+                           @b1_aHe
+                           (GHC.Base.mapFB @Int @b1_aHe @Int c_aHf GHC.Num.$fNumInt_$cnegate)
+                           n_aHg
+                           xs_awW))}]
+bar
+  = \ (xs_awW :: [Int]) ->
+      GHC.Types.:
+        @Int lvl_sHI (map @Int @Int GHC.Num.$fNumInt_$cnegate xs_awW)
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+$trModule_sHr :: GHC.Prim.Addr#
+[LclId]
+$trModule_sHr = "main"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+$trModule_sHs :: GHC.Types.TrName
+[LclId]
+$trModule_sHs = GHC.Types.TrNameS $trModule_sHr
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+$trModule_sHt :: GHC.Prim.Addr#
+[LclId]
+$trModule_sHt = "Foo"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+$trModule_sHu :: GHC.Types.TrName
+[LclId]
+$trModule_sHu = GHC.Types.TrNameS $trModule_sHt
+
+-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
+Foo.$trModule :: GHC.Types.Module
+[LclIdX]
+Foo.$trModule = GHC.Types.Module $trModule_sHs $trModule_sHu
+
+-- RHS size: {terms: 3, types: 1, coercions: 0, joins: 0/0}
+$krep_aGF [InlPrag=[~]] :: GHC.Types.KindRep
+[LclId]
+$krep_aGF
+  = GHC.Types.KindRepTyConApp
+      GHC.Types.$tcConstraint (GHC.Types.[] @GHC.Types.KindRep)
+
+-- RHS size: {terms: 3, types: 0, coercions: 0, joins: 0/0}
+$krep_aGE [InlPrag=[~]] :: GHC.Types.KindRep
+[LclId]
+$krep_aGE = GHC.Types.KindRepFun GHC.Types.krep$*Arr* $krep_aGF
+
+-- RHS size: {terms: 1, types: 0, coercions: 0, joins: 0/0}
+$tcMyFunctor_sHv :: GHC.Prim.Addr#
+[LclId]
+$tcMyFunctor_sHv = "MyFunctor"#
+
+-- RHS size: {terms: 2, types: 0, coercions: 0, joins: 0/0}
+$tcMyFunctor_sHw :: GHC.Types.TrName
+[LclId]
+$tcMyFunctor_sHw = GHC.Types.TrNameS $tcMyFunctor_sHv
+
+-- RHS size: {terms: 7, types: 0, coercions: 0, joins: 0/0}
+Foo.$tcMyFunctor :: GHC.Types.TyCon
+[LclIdX]
+Foo.$tcMyFunctor
+  = GHC.Types.TyCon
+      12837160846121910345##
+      787075802864859973##
+      Foo.$trModule
+      $tcMyFunctor_sHw
+      0#
+      $krep_aGE
+
 
 
diff --git a/testsuite/tests/simplCore/should_compile/T8331.stderr b/testsuite/tests/simplCore/should_compile/T8331.stderr
index 7219016651..99a8432708 100644
--- a/testsuite/tests/simplCore/should_compile/T8331.stderr
+++ b/testsuite/tests/simplCore/should_compile/T8331.stderr
@@ -2,39 +2,91 @@
 ==================== Tidy Core rules ====================
 "SPEC $c*> @(ST s) _"
     forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
-      $fApplicativeReaderT_$c*> @(ST s) @r $dApplicative
-      = ($fApplicativeReaderT3 @s @r)
+      $fApplicativeReaderT5 @(ST s) @r $dApplicative
+      = ($fApplicativeReaderT6 @s @r)
         `cast` (forall (a :: <*>_N) (b :: <*>_N).
                 <ReaderT r (ST s) a>_R
                 %<'Many>_N ->_R <ReaderT r (ST s) b>_R
-                %<'Many>_N ->_R <r>_R %<'Many>_N ->_R Sym (N:ST[0] <s>_N <b>_R)
-                                ; Sym (N:ReaderT[0] <*>_N <r>_R <ST s>_R <b>_N)
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <b>_R)
                 :: Coercible
                      (forall {a} {b}.
                       ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> STRep s b)
                      (forall {a} {b}.
-                      ReaderT r (ST s) a -> ReaderT r (ST s) b -> ReaderT r (ST s) b))
+                      ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> ST s b))
+"SPEC $c<$ @(ST s) _"
+    forall (@s) (@r) ($dFunctor :: Functor (ST s)).
+      $fFunctorReaderT1 @(ST s) @r $dFunctor
+      = ($fFunctorReaderT2 @s @r)
+        `cast` (forall (a :: <*>_N) (b :: <*>_N).
+                <a>_R
+                %<'Many>_N ->_R <ReaderT r (ST s) b>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <a>_R)
+                :: Coercible
+                     (forall {a} {b}. a -> ReaderT r (ST s) b -> r -> STRep s a)
+                     (forall {a} {b}. a -> ReaderT r (ST s) b -> r -> ST s a))
+"SPEC $c<*> @(ST s) _"
+    forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+      $fApplicativeReaderT8 @(ST s) @r $dApplicative
+      = ($fApplicativeReaderT9 @s @r)
+        `cast` (forall (a :: <*>_N) (b :: <*>_N).
+                <ReaderT r (ST s) (a -> b)>_R
+                %<'Many>_N ->_R <ReaderT r (ST s) a>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <b>_R)
+                :: Coercible
+                     (forall {a} {b}.
+                      ReaderT r (ST s) (a -> b) -> ReaderT r (ST s) a -> r -> STRep s b)
+                     (forall {a} {b}.
+                      ReaderT r (ST s) (a -> b) -> ReaderT r (ST s) a -> r -> ST s b))
 "SPEC $c>> @(ST s) _"
     forall (@s) (@r) ($dMonad :: Monad (ST s)).
       $fMonadReaderT1 @(ST s) @r $dMonad
       = $fMonadAbstractIOSTReaderT_$s$c>> @s @r
+"SPEC $c>>= @(ST s) _"
+    forall (@s) (@r) ($dMonad :: Monad (ST s)).
+      $fMonadReaderT2 @(ST s) @r $dMonad
+      = ($fMonadReaderT3 @s @r)
+        `cast` (forall (a :: <*>_N) (b :: <*>_N).
+                <ReaderT r (ST s) a>_R
+                %<'Many>_N ->_R <a -> ReaderT r (ST s) b>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <b>_R)
+                :: Coercible
+                     (forall {a} {b}.
+                      ReaderT r (ST s) a -> (a -> ReaderT r (ST s) b) -> r -> STRep s b)
+                     (forall {a} {b}.
+                      ReaderT r (ST s) a -> (a -> ReaderT r (ST s) b) -> r -> ST s b))
+"SPEC $cfmap @(ST s) _"
+    forall (@s) (@r) ($dFunctor :: Functor (ST s)).
+      $fFunctorReaderT3 @(ST s) @r $dFunctor
+      = ($fFunctorReaderT4 @s @r)
+        `cast` (forall (a :: <*>_N) (b :: <*>_N).
+                <a -> b>_R
+                %<'Many>_N ->_R <ReaderT r (ST s) a>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <b>_R)
+                :: Coercible
+                     (forall {a} {b}. (a -> b) -> ReaderT r (ST s) a -> r -> STRep s b)
+                     (forall {a} {b}. (a -> b) -> ReaderT r (ST s) a -> r -> ST s b))
 "SPEC $cliftA2 @(ST s) _"
     forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
-      $fApplicativeReaderT_$cliftA2 @(ST s) @r $dApplicative
+      $fApplicativeReaderT4 @(ST s) @r $dApplicative
       = ($fApplicativeReaderT1 @s @r)
         `cast` (forall (a :: <*>_N) (b :: <*>_N) (c :: <*>_N).
                 <a -> b -> c>_R
                 %<'Many>_N ->_R <ReaderT r (ST s) a>_R
                 %<'Many>_N ->_R <ReaderT r (ST s) b>_R
-                %<'Many>_N ->_R <r>_R %<'Many>_N ->_R Sym (N:ST[0] <s>_N <c>_R)
-                                ; Sym (N:ReaderT[0] <*>_N <r>_R <ST s>_R <c>_N)
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <c>_R)
                 :: Coercible
                      (forall {a} {b} {c}.
                       (a -> b -> c)
                       -> ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> STRep s c)
                      (forall {a} {b} {c}.
                       (a -> b -> c)
-                      -> ReaderT r (ST s) a -> ReaderT r (ST s) b -> ReaderT r (ST s) c))
+                      -> ReaderT r (ST s) a -> ReaderT r (ST s) b -> r -> ST s c))
 "SPEC $cp1Applicative @(ST s) _"
     forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
       $fApplicativeReaderT_$cp1Applicative @(ST s) @r $dApplicative
@@ -43,6 +95,26 @@
     forall (@s) (@r) ($dMonad :: Monad (ST s)).
       $fMonadReaderT_$cp1Monad @(ST s) @r $dMonad
       = $fApplicativeReaderT_$s$fApplicativeReaderT @s @r
+"SPEC $cpure @(ST s) _"
+    forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
+      $fApplicativeReaderT10 @(ST s) @r $dApplicative
+      = ($fApplicativeReaderT11 @s @r)
+        `cast` (forall (a :: <*>_N).
+                <a>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <a>_R)
+                :: Coercible
+                     (forall {a}. a -> r -> STRep s a) (forall {a}. a -> r -> ST s a))
+"SPEC $creturn @(ST s) _"
+    forall (@s) (@r) ($dMonad :: Monad (ST s)).
+      $fMonadReaderT1 @(ST s) @r $dMonad
+      = ($fApplicativeReaderT11 @s @r)
+        `cast` (forall (a :: <*>_N).
+                <a>_R
+                %<'Many>_N ->_R <r>_R
+                %<'Many>_N ->_R Sym (N:ST[0] <s>_N <a>_R)
+                :: Coercible
+                     (forall {a}. a -> r -> STRep s a) (forall {a}. a -> r -> ST s a))
 "SPEC $fApplicativeReaderT @(ST s) _"
     forall (@s) (@r) ($dApplicative :: Applicative (ST s)).
       $fApplicativeReaderT @(ST s) @r $dApplicative
diff --git a/testsuite/tests/simplCore/should_compile/all.T b/testsuite/tests/simplCore/should_compile/all.T
index 5f742742d1..408cc4bd4e 100644
--- a/testsuite/tests/simplCore/should_compile/all.T
+++ b/testsuite/tests/simplCore/should_compile/all.T
@@ -136,7 +136,14 @@ test('T5366',
 test('T7796', [], makefile_test, ['T7796'])
 test('T5550', omit_ways(prof_ways), compile, [''])
 test('T7865', normal, makefile_test, ['T7865'])
-test('T7785', only_ways(['optasm']), compile, ['-ddump-rules'])
+
+# T7785: we want to check that we specialise 'shared'. But Tidy discards the
+# rule (see Note [Trimming auto-rules] in GHC.Iface.Tidy)
+# So, rather arbitrarily, we dump the output of CSE and grep for SPEC
+test('T7785', [ only_ways(['optasm']),
+               grep_errmsg(r'SPEC') ],
+      compile, ['-ddump-cse'])
+
 test('T7702',
      [extra_files(['T7702plugin']),
       only_ways([config.ghc_plugin_way]),
diff --git a/testsuite/tests/simplCore/should_run/T18012.hs b/testsuite/tests/simplCore/should_run/T18012.hs
index 9118b75ff4..9ce1f1fb9d 100644
--- a/testsuite/tests/simplCore/should_run/T18012.hs
+++ b/testsuite/tests/simplCore/should_run/T18012.hs
@@ -32,10 +32,10 @@ notRule x = x
 {-# INLINE [0] notRule #-}
 {-# RULES "notRule/False" [~0] notRule False = True #-}
 
-f :: T -> () -> Bool
-f (D a) () = notRule a
+f :: () -> T -> Bool
+f () (D a) = notRule a
 {-# INLINE [100] f #-} -- so it isn’t inlined before FloatOut
 
 g :: () -> Bool
-g x = f (D False) x
+g x = f x (D False)
 {-# NOINLINE g #-}
diff --git a/testsuite/tests/stranal/should_compile/T18894b.hs b/testsuite/tests/stranal/should_compile/T18894b.hs
index e90f34e3fd..99a4bf954d 100644
--- a/testsuite/tests/stranal/should_compile/T18894b.hs
+++ b/testsuite/tests/stranal/should_compile/T18894b.hs
@@ -17,4 +17,14 @@ f :: Int -> Int
 f 1 = 0
 f m
   | odd m     = eta m 2
-  | otherwise = eta 2 m
+  | otherwise = eta m m
+
+{-
+An earlier version of this test had (eta 2 m) in the otherwise case.
+But then (eta 2) could be floated out; and indeed if 'f' is applied
+many times, then sharing (eta 2) might be good.  And if we inlined
+eta, we certainly would share (expensive 2).
+
+So I made the test more robust at testing what we actually want here,
+by changing to (eta m m).
+-}
author	Simon Peyton Jones <simonpj@microsoft.com>	2021-06-16 12:32:08 +0100
committer	Simon Peyton Jones <simonpj@microsoft.com>	2021-07-20 11:44:10 +0100
commit	cc8ea077a0ee58a76dbb7f4ae56c1f571f9f720f (patch)
tree	95c95f9e20123f0d7ddbc7ad99c87baea05d8946
parent	58b960d2af0ebfc37104ec68a4df377a074951dd (diff)
download	haskell-cc8ea077a0ee58a76dbb7f4ae56c1f571f9f720f.tar.gz