Refactor the Mighty Simplifier

Triggered by #12150, and the knock-on effects of join points, I did a major refactoring of the Simplifier. This is a big patch that change a lot of Simplify.hs: I did a lot of other re-organisation. The main event ~~~~~~~~~~~~~~ Since the dawn of time we have had simplExpr :: SimplEnv -> InExpr -> SimplCont -> SimplM (SimplEnv, OutExpr) What's that SimplEnv in the result? When simplifying an expression the simplifier add floated let-bindings to the SimplEnv, extending the in-scope set appropriately, and hence needs to resturn the SimplEnv at the end. The mode, flags, substitution in the returned SimplEnv were all irrelevant: it was just the floating bindings. It's strange to accumulate part of the /result/ in the /environment/ argument! And indeed its leads to all manner of mysterious calls to zapFloats and transferring of floats from one SimplEnv to another. It got worse with join points, so I finally bit the bullet and refactored. Now we have simplExpr :: SimplEnv -> InExpr -> SimplCont -> SimplM (SimplFloats, OutExpr) -- See Note [The big picture] and the SimplEnv no longer has floats in it. The code is no shorter, but it /is/ easier to understand. Main changes * Remove seLetFloats field from SimplEnv * Define new data type SimplFloats, and functions over it * Change the types of simplExpr, simplBind, and their many variants, to follow the above plan Bottoming bindings ~~~~~~~~~~~~~~~~~~ I made one other significant change in SimplUtils (not just refactoring), related to Trac #12150 comment:16. Given x = <rhs> where <rhs> turns out to be a bottoming expression, propagate that information to x's IdInfo immediately. That's always good, because it makes x be inlined less (we don't inline bottoming things), and it allows (case x of ...) to drop the dead alterantives immediately. Moreover, we are doing the analysis anyway, in tryEtaExpandRhs, which calls CoreArity.findRhsArity, which already does simple bottom analysis. So we are generating the information; all we need do is to atach the bottoming info to the IdInfo. See Note [Bottoming bindings] Smaller refactoring ~~~~~~~~~~~~~~~~~~~ * Rename SimplifierMode to SimplMode * Put DynFlags as a new field in SimplMode, to make fewer monadic calls to getDynFlags. * Move the code in addPolyBind into abstractFloats * Move the "don't eta-expand join points" into tryEtaExpandRhs
author: Simon Peyton Jones <simonpj@microsoft.com> 2017-08-25 09:22:03 +0100
committer: Simon Peyton Jones <simonpj@microsoft.com> 2017-08-25 12:57:11 +0100
commit: 33452dfc6cf891b59d63fa9fe138b18cbce4df81 (patch)
tree: e6cc85f4e02d0f792d3e44a28958b7261cf3fb44
parent: 407c11b880325f4f327982d4f6b9f9cba4564016 (diff)
download: haskell-33452dfc6cf891b59d63fa9fe138b18cbce4df81.tar.gz
10 files changed, 1273 insertions, 1164 deletions
diff --git a/compiler/coreSyn/CoreArity.hs b/compiler/coreSyn/CoreArity.hs
index 3f429d1ad2..a2ad5f7fff 100644
--- a/compiler/coreSyn/CoreArity.hs
+++ b/compiler/coreSyn/CoreArity.hs
@@ -521,61 +521,60 @@ mk_cheap_fn dflags cheap_app
 
 
 ----------------------
-findRhsArity :: DynFlags -> Id -> CoreExpr -> Arity -> Arity
+findRhsArity :: DynFlags -> Id -> CoreExpr -> Arity -> (Arity, Bool)
 -- This implements the fixpoint loop for arity analysis
 -- See Note [Arity analysis]
+-- If findRhsArity e = (n, is_bot) then
+--  (a) any application of e to <n arguments will not do much work,
+--      so it is safe to expand e  ==>  (\x1..xn. e x1 .. xn)
+--  (b) if is_bot=True, then e applied to n args is guaranteed bottom
 findRhsArity dflags bndr rhs old_arity
-  = go (rhsEtaExpandArity dflags init_cheap_app rhs)
+  = go (get_arity init_cheap_app)
        -- We always call exprEtaExpandArity once, but usually
        -- that produces a result equal to old_arity, and then
        -- we stop right away (since arities should not decrease)
        -- Result: the common case is that there is just one iteration
   where
+    is_lam = has_lam rhs
+
+    has_lam (Tick _ e) = has_lam e
+    has_lam (Lam b e)  = isId b || has_lam e
+    has_lam _          = False
+
     init_cheap_app :: CheapAppFun
     init_cheap_app fn n_val_args
       | fn == bndr = True   -- On the first pass, this binder gets infinite arity
       | otherwise  = isCheapApp fn n_val_args
 
-    go :: Arity -> Arity
-    go cur_arity
-      | cur_arity <= old_arity = cur_arity
-      | new_arity == cur_arity = cur_arity
+    go :: (Arity, Bool) -> (Arity, Bool)
+    go cur_info@(cur_arity, _)
+      | cur_arity <= old_arity = cur_info
+      | new_arity == cur_arity = cur_info
       | otherwise = ASSERT( new_arity < cur_arity )
 #if defined(DEBUG)
                     pprTrace "Exciting arity"
                        (vcat [ ppr bndr <+> ppr cur_arity <+> ppr new_arity
-                                                    , ppr rhs])
+                             , ppr rhs])
 #endif
-                    go new_arity
+                    go new_info
       where
-        new_arity = rhsEtaExpandArity dflags cheap_app rhs
+        new_info@(new_arity, _) = get_arity cheap_app
 
         cheap_app :: CheapAppFun
         cheap_app fn n_val_args
           | fn == bndr = n_val_args < cur_arity
           | otherwise  = isCheapApp fn n_val_args
 
--- ^ The Arity returned is the number of value args the
--- expression can be applied to without doing much work
-rhsEtaExpandArity :: DynFlags -> CheapAppFun -> CoreExpr -> Arity
--- exprEtaExpandArity is used when eta expanding
---      e  ==>  \xy -> e x y
-rhsEtaExpandArity dflags cheap_app e
-  = case (arityType env e) of
-      ATop (os:oss)
-        | isOneShotInfo os || has_lam e -> 1 + length oss
-                                   -- Don't expand PAPs/thunks
-                                   -- Note [Eta expanding thunks]
-        | otherwise       -> 0
-      ATop []             -> 0
-      ABot n              -> n
-  where
-    env = AE { ae_cheap_fn = mk_cheap_fn dflags cheap_app
-             , ae_ped_bot  = gopt Opt_PedanticBottoms dflags }
-
-    has_lam (Tick _ e) = has_lam e
-    has_lam (Lam b e)  = isId b || has_lam e
-    has_lam _          = False
+    get_arity :: CheapAppFun -> (Arity, Bool)
+    get_arity cheap_app
+      = case (arityType env rhs) of
+          ABot n -> (n, True)
+          ATop (os:oss) | isOneShotInfo os || is_lam
+                  -> (1 + length oss, False)    -- Don't expand PAPs/thunks
+          ATop _  -> (0,              False)    -- Note [Eta expanding thunks]
+       where
+         env = AE { ae_cheap_fn = mk_cheap_fn dflags cheap_app
+                  , ae_ped_bot  = gopt Opt_PedanticBottoms dflags }
 
 {-
 Note [Arity analysis]
diff --git a/compiler/simplCore/CoreMonad.hs b/compiler/simplCore/CoreMonad.hs
index c689eea346..82c636c232 100644
--- a/compiler/simplCore/CoreMonad.hs
+++ b/compiler/simplCore/CoreMonad.hs
@@ -9,7 +9,7 @@
 module CoreMonad (
     -- * Configuration of the core-to-core passes
     CoreToDo(..), runWhen, runMaybe,
-    SimplifierMode(..),
+    SimplMode(..),
     FloatOutSwitches(..),
     pprPassDetails,
 
@@ -107,7 +107,7 @@ data CoreToDo           -- These are diff core-to-core passes,
 
   = CoreDoSimplify      -- The core-to-core simplifier.
         Int                    -- Max iterations
-        SimplifierMode
+        SimplMode
   | CoreDoPluginPass String PluginPass
   | CoreDoFloatInwards
   | CoreDoFloatOutwards FloatOutSwitches
@@ -163,17 +163,19 @@ pprPassDetails (CoreDoSimplify n md) = vcat [ text "Max iterations =" <+> int n
                                             , ppr md ]
 pprPassDetails _ = Outputable.empty
 
-data SimplifierMode             -- See comments in SimplMonad
+data SimplMode             -- See comments in SimplMonad
   = SimplMode
         { sm_names      :: [String] -- Name(s) of the phase
         , sm_phase      :: CompilerPhase
+        , sm_dflags     :: DynFlags -- Just for convenient non-monadic
+                                    -- access; we don't override these
         , sm_rules      :: Bool     -- Whether RULES are enabled
         , sm_inline     :: Bool     -- Whether inlining is enabled
         , sm_case_case  :: Bool     -- Whether case-of-case is enabled
         , sm_eta_expand :: Bool     -- Whether eta-expansion is enabled
         }
 
-instance Outputable SimplifierMode where
+instance Outputable SimplMode where
     ppr (SimplMode { sm_phase = p, sm_names = ss
                    , sm_rules = r, sm_inline = i
                    , sm_eta_expand = eta, sm_case_case = cc })
diff --git a/compiler/simplCore/SimplCore.hs b/compiler/simplCore/SimplCore.hs
index c1513b8af6..6f481c5e06 100644
--- a/compiler/simplCore/SimplCore.hs
+++ b/compiler/simplCore/SimplCore.hs
@@ -142,6 +142,7 @@ getCoreToDo dflags
 
     base_mode = SimplMode { sm_phase      = panic "base_mode"
                           , sm_names      = []
+                          , sm_dflags     = dflags
                           , sm_rules      = rules_on
                           , sm_eta_expand = eta_expand_on
                           , sm_inline     = True
@@ -619,7 +620,7 @@ simplExprGently :: SimplEnv -> CoreExpr -> SimplM CoreExpr
 --      (b) the LHS and RHS of a RULE
 --      (c) Template Haskell splices
 --
--- The name 'Gently' suggests that the SimplifierMode is SimplGently,
+-- The name 'Gently' suggests that the SimplMode is SimplGently,
 -- and in fact that is so.... but the 'Gently' in simplExprGently doesn't
 -- enforce that; it just simplifies the expression twice
 
@@ -754,8 +755,8 @@ simplifyPgmIO pass@(CoreDoSimplify max_iterations mode)
                 -- Simplify the program
            ((binds1, rules1), counts1) <-
              initSmpl dflags (mkRuleEnv rule_base2 vis_orphs) fam_envs us1 sz $
-               do { env1 <- {-# SCC "SimplTopBinds" #-}
-                            simplTopBinds simpl_env tagged_binds
+               do { (floats, env1) <- {-# SCC "SimplTopBinds" #-}
+                                      simplTopBinds simpl_env tagged_binds
 
                       -- Apply the substitution to rules defined in this module
                       -- for imported Ids.  Eg  RULE map my_f = blah
@@ -763,7 +764,7 @@ simplifyPgmIO pass@(CoreDoSimplify max_iterations mode)
                       -- apply it to the rule to, or it'll never match
                   ; rules1 <- simplRules env1 Nothing rules
 
-                  ; return (getFloatBinds env1, rules1) } ;
+                  ; return (getTopFloatBinds floats, rules1) } ;
 
                 -- Stop if nothing happened; don't dump output
            if isZeroSimplCount counts1 then
diff --git a/compiler/simplCore/SimplEnv.hs b/compiler/simplCore/SimplEnv.hs
index 9316ec08af..21ba4bc539 100644
--- a/compiler/simplCore/SimplEnv.hs
+++ b/compiler/simplCore/SimplEnv.hs
@@ -8,14 +8,14 @@
 
 module SimplEnv (
         -- * The simplifier mode
-        setMode, getMode, updMode,
+        setMode, getMode, updMode, seDynFlags,
 
         -- * Environments
         SimplEnv(..), StaticEnv, pprSimplEnv,   -- Temp not abstract
         mkSimplEnv, extendIdSubst,
         SimplEnv.extendTvSubst, SimplEnv.extendCvSubst,
         zapSubstEnv, setSubstEnv,
-        getInScope, setInScopeAndZapFloats,
+        getInScope, setInScopeFromE, setInScopeFromF,
         setInScopeSet, modifyInScope, addNewInScopeIds,
         getSimplRules,
 
@@ -29,19 +29,24 @@ module SimplEnv (
         substCo, substCoVar,
 
         -- * Floats
-        Floats, emptyFloats, isEmptyFloats,
-        addNonRec, addLetFloats, addFloats, extendFloats, addFlts,
-        wrapFloats, setFloats, zapFloats, addRecFloats, mapFloats,
-        doFloatFromRhs, getFloatBinds,
-
-        JoinFloat, JoinFloats, emptyJoinFloats, isEmptyJoinFloats,
-        wrapJoinFloats, wrapJoinFloatsX, zapJoinFloats, addJoinFloats
+        SimplFloats(..), emptyFloats, mkRecFloats,
+        mkFloatBind, addLetFloats, addJoinFloats, addFloats,
+        extendFloats, wrapFloats,
+        doFloatFromRhs, getTopFloatBinds,
+
+        -- * LetFloats
+        LetFloats, letFloatBinds, emptyLetFloats, unitLetFloat,
+        addLetFlts,  mapLetFloats,
+
+        -- * JoinFloats
+        JoinFloat, JoinFloats, emptyJoinFloats,
+        wrapJoinFloats, wrapJoinFloatsX, unitJoinFloat, addJoinFlts
     ) where
 
 #include "HsVersions.h"
 
 import SimplMonad
-import CoreMonad                ( SimplifierMode(..) )
+import CoreMonad                ( SimplMode(..) )
 import CoreSyn
 import CoreUtils
 import Var
@@ -50,6 +55,7 @@ import VarSet
 import OrdList
 import Id
 import MkCore                   ( mkWildValBinder )
+import DynFlags                 ( DynFlags )
 import TysWiredIn
 import qualified Type
 import Type hiding              ( substTy, substTyVar, substTyVarBndr )
@@ -77,12 +83,12 @@ data SimplEnv
      -- Static in the sense of lexically scoped,
      -- wrt the original expression
 
-        seMode      :: SimplifierMode,
+        seMode      :: SimplMode
 
         -- The current substitution
-        seTvSubst   :: TvSubstEnv,      -- InTyVar |--> OutType
-        seCvSubst   :: CvSubstEnv,      -- InCoVar |--> OutCoercion
-        seIdSubst   :: SimplIdSubst,    -- InId    |--> OutExpr
+      , seTvSubst   :: TvSubstEnv      -- InTyVar |--> OutType
+      , seCvSubst   :: CvSubstEnv      -- InCoVar |--> OutCoercion
+      , seIdSubst   :: SimplIdSubst    -- InId    |--> OutExpr
 
      ----------- Dynamic part of the environment -----------
      -- Dynamic in the sense of describing the setup where
@@ -90,23 +96,42 @@ data SimplEnv
 
         -- The current set of in-scope variables
         -- They are all OutVars, and all bound in this module
-        seInScope   :: InScopeSet,      -- OutVars only
-                -- Includes all variables bound
-                -- by seLetFloats and seJoinFloats
+      , seInScope   :: InScopeSet       -- OutVars only
+    }
 
-        -- Ordinary bindings
-        seLetFloats  :: Floats,
-                -- See Note [Simplifier floats]
+type StaticEnv = SimplEnv       -- Just the static part is relevant
+
+data SimplFloats
+  = SimplFloats
+      { -- Ordinary let bindings
+        sfLetFloats  :: LetFloats
+                -- See Note [LetFloats]
 
         -- Join points
-        seJoinFloats :: JoinFloats
+      , sfJoinFloats :: JoinFloats
                 -- Handled separately; they don't go very far
-                -- We consider these to be /inside/ seLetFloats
+                -- We consider these to be /inside/ sfLetFloats
                 -- because join points can refer to ordinary bindings,
                 -- but not vice versa
-    }
 
-type StaticEnv = SimplEnv       -- Just the static part is relevant
+        -- Includes all variables bound by sfLetFloats and
+        -- sfJoinFloats, plus at least whatever is in scope where
+        -- these bindings land up.
+      , sfInScope :: InScopeSet  -- All OutVars
+      }
+
+instance Outputable SimplFloats where
+  ppr (SimplFloats { sfLetFloats = lf, sfJoinFloats = jf, sfInScope = is })
+    = text "SimplFloats"
+      <+> braces (vcat [ text "lets: " <+> ppr lf
+                       , text "joins:" <+> ppr jf
+                       , text "in_scope:" <+> ppr is ])
+
+emptyFloats :: SimplEnv -> SimplFloats
+emptyFloats env
+  = SimplFloats { sfLetFloats  = emptyLetFloats
+                , sfJoinFloats = emptyJoinFloats
+                , sfInScope    = seInScope env }
 
 pprSimplEnv :: SimplEnv -> SDoc
 -- Used for debugging; selective
@@ -241,12 +266,10 @@ need to know at the occurrence site that the variable is a join point
 so that we know to drop the context. Thus we remember which join
 points we're substituting. -}
 
-mkSimplEnv :: SimplifierMode -> SimplEnv
+mkSimplEnv :: SimplMode -> SimplEnv
 mkSimplEnv mode
   = SimplEnv { seMode = mode
              , seInScope = init_in_scope
-             , seLetFloats = emptyFloats
-             , seJoinFloats = emptyJoinFloats
              , seTvSubst = emptyVarEnv
              , seCvSubst = emptyVarEnv
              , seIdSubst = emptyVarEnv }
@@ -276,13 +299,16 @@ wild-ids before doing much else.
 It's a very dark corner of GHC.  Maybe it should be cleaned up.
 -}
 
-getMode :: SimplEnv -> SimplifierMode
+getMode :: SimplEnv -> SimplMode
 getMode env = seMode env
 
-setMode :: SimplifierMode -> SimplEnv -> SimplEnv
+seDynFlags :: SimplEnv -> DynFlags
+seDynFlags env = sm_dflags (seMode env)
+
+setMode :: SimplMode -> SimplEnv -> SimplEnv
 setMode mode env = env { seMode = mode }
 
-updMode :: (SimplifierMode -> SimplifierMode) -> SimplEnv -> SimplEnv
+updMode :: (SimplMode -> SimplMode) -> SimplEnv -> SimplEnv
 updMode upd env = env { seMode = upd (seMode env) }
 
 ---------------------
@@ -308,19 +334,11 @@ getInScope env = seInScope env
 setInScopeSet :: SimplEnv -> InScopeSet -> SimplEnv
 setInScopeSet env in_scope = env {seInScope = in_scope}
 
-setInScopeAndZapFloats :: SimplEnv -> SimplEnv -> SimplEnv
--- Set the in-scope set, and *zap* the floats
-setInScopeAndZapFloats env env_with_scope
-  = env { seInScope    = seInScope env_with_scope,
-          seLetFloats  = emptyFloats,
-          seJoinFloats = emptyJoinFloats }
+setInScopeFromE :: SimplEnv -> SimplEnv -> SimplEnv
+setInScopeFromE env env' = env { seInScope = seInScope env' }
 
-setFloats :: SimplEnv -> SimplEnv -> SimplEnv
--- Set the in-scope set *and* the floats
-setFloats env env_with_floats
-  = env { seInScope    = seInScope env_with_floats,
-          seLetFloats  = seLetFloats  env_with_floats,
-          seJoinFloats = seJoinFloats env_with_floats }
+setInScopeFromF :: SimplEnv -> SimplFloats -> SimplEnv
+setInScopeFromF env floats = env { seInScope = sfInScope floats }
 
 addNewInScopeIds :: SimplEnv -> [CoreBndr] -> SimplEnv
         -- The new Ids are guaranteed to be freshly allocated
@@ -353,13 +371,13 @@ mkContEx (SimplEnv { seTvSubst = tvs, seCvSubst = cvs, seIdSubst = ids }) e = Co
 {-
 ************************************************************************
 *                                                                      *
-\subsection{Floats}
+\subsection{LetFloats}
 *                                                                      *
 ************************************************************************
 
-Note [Simplifier floats]
-~~~~~~~~~~~~~~~~~~~~~~~~~
-The Floats is a bunch of bindings, classified by a FloatFlag.
+Note [LetFloats]
+~~~~~~~~~~~~~~~~
+The LetFloats is a bunch of bindings, classified by a FloatFlag.
 
 * All of them satisfy the let/app invariant
 
@@ -378,8 +396,8 @@ Can't happen:
   NonRec x# (f y)       -- Might diverge; does not satisfy let/app
 -}
 
-data Floats = Floats (OrdList OutBind) FloatFlag
-        -- See Note [Simplifier floats]
+data LetFloats = LetFloats (OrdList OutBind) FloatFlag
+                 -- See Note [LetFloats]
 
 type JoinFloat  = OutBind
 type JoinFloats = OrdList JoinFloat
@@ -401,12 +419,12 @@ data FloatFlag
                 --      and not guaranteed cheap
                 --      Do not float these bindings out of a lazy let
 
-instance Outputable Floats where
-  ppr (Floats binds ff) = ppr ff $$ ppr (fromOL binds)
+instance Outputable LetFloats where
+  ppr (LetFloats binds ff) = ppr ff $$ ppr (fromOL binds)
 
 instance Outputable FloatFlag where
-  ppr FltLifted = text "FltLifted"
-  ppr FltOkSpec = text "FltOkSpec"
+  ppr FltLifted  = text "FltLifted"
+  ppr FltOkSpec  = text "FltOkSpec"
   ppr FltCareful = text "FltCareful"
 
 andFF :: FloatFlag -> FloatFlag -> FloatFlag
@@ -415,9 +433,9 @@ andFF FltOkSpec  FltCareful = FltCareful
 andFF FltOkSpec  _          = FltOkSpec
 andFF FltLifted  flt        = flt
 
-doFloatFromRhs :: TopLevelFlag -> RecFlag -> Bool -> OutExpr -> SimplEnv -> Bool
+doFloatFromRhs :: TopLevelFlag -> RecFlag -> Bool -> SimplFloats -> OutExpr -> Bool
 -- If you change this function look also at FloatIn.noFloatFromRhs
-doFloatFromRhs lvl rec str rhs (SimplEnv {seLetFloats = Floats fs ff})
+doFloatFromRhs lvl rec str (SimplFloats { sfLetFloats = LetFloats fs ff }) rhs
   =  not (isNilOL fs) && want_to_float && can_float
   where
      want_to_float = isTopLevel lvl || exprIsCheap rhs || exprIsExpandable rhs
@@ -439,16 +457,16 @@ But there are
 so we must take the 'or' of the two.
 -}
 
-emptyFloats :: Floats
-emptyFloats = Floats nilOL FltLifted
+emptyLetFloats :: LetFloats
+emptyLetFloats = LetFloats nilOL FltLifted
 
 emptyJoinFloats :: JoinFloats
 emptyJoinFloats = nilOL
 
-unitFloat :: OutBind -> Floats
+unitLetFloat :: OutBind -> LetFloats
 -- This key function constructs a singleton float with the right form
-unitFloat bind = ASSERT(all (not . isJoinId) (bindersOf bind))
-                 Floats (unitOL bind) (flag bind)
+unitLetFloat bind = ASSERT(all (not . isJoinId) (bindersOf bind))
+                    LetFloats (unitOL bind) (flag bind)
   where
     flag (Rec {})                = FltLifted
     flag (NonRec bndr rhs)
@@ -465,138 +483,132 @@ unitJoinFloat :: OutBind -> JoinFloats
 unitJoinFloat bind = ASSERT(all isJoinId (bindersOf bind))
                      unitOL bind
 
-addNonRec :: SimplEnv -> OutId -> OutExpr -> SimplEnv
--- Add a non-recursive binding and extend the in-scope set
--- The latter is important; the binder may already be in the
--- in-scope set (although it might also have been created with newId)
--- but it may now have more IdInfo
-addNonRec env@(SimplEnv { seLetFloats  = floats
-                        , seJoinFloats = jfloats
-                        , seInScope = in_scope })
-          id rhs
-  | isJoinId id  -- This test incidentally forces the Id, and hence
-                 -- its IdInfo, and hence any inner substitutions
-  = env { seInScope    = in_scope'
-        , seLetFloats  = floats
-        , seJoinFloats = jfloats' }
-  | otherwise
-  = env { seInScope    = in_scope'
-        , seLetFloats  = floats'
-        , seJoinFloats = jfloats }
+mkFloatBind :: SimplEnv -> OutBind -> (SimplFloats, SimplEnv)
+-- Make a singleton SimplFloats, and
+-- extend the incoming SimplEnv's in-scope set with its binders
+-- These binders may already be in the in-scope set,
+-- but may have by now been augmented with more IdInfo
+mkFloatBind env bind
+  = (floats, env { seInScope = in_scope' })
   where
-    bind      = NonRec id rhs
-    in_scope' = extendInScopeSet in_scope id
-    floats'   = floats  `addFlts`     unitFloat     bind
-    jfloats'  = jfloats `addJoinFlts` unitJoinFloat bind
-
-extendFloats :: SimplEnv -> OutBind -> SimplEnv
+    floats
+      | isJoinBind bind
+      = SimplFloats { sfLetFloats  = emptyLetFloats
+                    , sfJoinFloats = unitJoinFloat bind
+                    , sfInScope    = in_scope' }
+      | otherwise
+      = SimplFloats { sfLetFloats  = unitLetFloat bind
+                    , sfJoinFloats = emptyJoinFloats
+                    , sfInScope    = in_scope' }
+
+    in_scope' = seInScope env `extendInScopeSetBind` bind
+
+extendFloats :: SimplFloats -> OutBind -> SimplFloats
 -- Add this binding to the floats, and extend the in-scope env too
-extendFloats env@(SimplEnv { seLetFloats  = floats
-                           , seJoinFloats = jfloats
-                           , seInScope = in_scope })
+extendFloats (SimplFloats { sfLetFloats  = floats
+                          , sfJoinFloats = jfloats
+                          , sfInScope    = in_scope })
              bind
   | isJoinBind bind
-  = env { seInScope    = in_scope'
-        , seLetFloats  = floats
-        , seJoinFloats = jfloats' }
+  = SimplFloats { sfInScope    = in_scope'
+                , sfLetFloats  = floats
+                , sfJoinFloats = jfloats' }
   | otherwise
-  = env { seInScope    = in_scope'
-        , seLetFloats  = floats'
-        , seJoinFloats = jfloats }
+  = SimplFloats { sfInScope    = in_scope'
+                , sfLetFloats  = floats'
+                , sfJoinFloats = jfloats }
   where
-    bndrs = bindersOf bind
-
-    in_scope' = extendInScopeSetList in_scope bndrs
-    floats'   = floats  `addFlts`     unitFloat bind
+    in_scope' = in_scope `extendInScopeSetBind` bind
+    floats'   = floats  `addLetFlts`  unitLetFloat bind
     jfloats'  = jfloats `addJoinFlts` unitJoinFloat bind
 
-addLetFloats :: SimplEnv -> SimplEnv -> SimplEnv
+addLetFloats :: SimplFloats -> LetFloats -> SimplFloats
 -- Add the let-floats for env2 to env1;
 -- *plus* the in-scope set for env2, which is bigger
 -- than that for env1
-addLetFloats env1 env2
-  = env1 { seLetFloats = seLetFloats env1 `addFlts` seLetFloats env2
-         , seInScope   = seInScope env2 }
-
-addFloats :: SimplEnv -> SimplEnv -> SimplEnv
+addLetFloats floats let_floats@(LetFloats binds _)
+  = floats { sfLetFloats = sfLetFloats floats `addLetFlts` let_floats
+           , sfInScope   = foldlOL extendInScopeSetBind
+                                   (sfInScope floats) binds }
+
+addJoinFloats :: SimplFloats -> JoinFloats -> SimplFloats
+addJoinFloats floats join_floats
+  = floats { sfJoinFloats = sfJoinFloats floats `addJoinFlts` join_floats
+           , sfInScope    = foldlOL extendInScopeSetBind
+                                    (sfInScope floats) join_floats }
+
+extendInScopeSetBind :: InScopeSet -> CoreBind -> InScopeSet
+extendInScopeSetBind in_scope bind
+  = extendInScopeSetList in_scope (bindersOf bind)
+
+addFloats :: SimplFloats -> SimplFloats -> SimplFloats
 -- Add both let-floats and join-floats for env2 to env1;
 -- *plus* the in-scope set for env2, which is bigger
 -- than that for env1
-addFloats env1 env2
-  = env1 { seLetFloats  = seLetFloats env1 `addFlts` seLetFloats env2
-         , seJoinFloats = seJoinFloats env1 `addJoinFlts` seJoinFloats env2
-         , seInScope    = seInScope env2 }
+addFloats (SimplFloats { sfLetFloats = lf1, sfJoinFloats = jf1 })
+          (SimplFloats { sfLetFloats = lf2, sfJoinFloats = jf2, sfInScope = in_scope })
+  = SimplFloats { sfLetFloats  = lf1 `addLetFlts` lf2
+                , sfJoinFloats = jf1 `addJoinFlts` jf2
+                , sfInScope    = in_scope }
 
-addFlts :: Floats -> Floats -> Floats
-addFlts (Floats bs1 l1) (Floats bs2 l2)
-  = Floats (bs1 `appOL` bs2) (l1 `andFF` l2)
+addLetFlts :: LetFloats -> LetFloats -> LetFloats
+addLetFlts (LetFloats bs1 l1) (LetFloats bs2 l2)
+  = LetFloats (bs1 `appOL` bs2) (l1 `andFF` l2)
+
+letFloatBinds :: LetFloats -> [CoreBind]
+letFloatBinds (LetFloats bs _) = fromOL bs
 
 addJoinFlts :: JoinFloats -> JoinFloats -> JoinFloats
 addJoinFlts = appOL
 
-zapFloats :: SimplEnv -> SimplEnv
-zapFloats env = env { seLetFloats  = emptyFloats
-                    , seJoinFloats = emptyJoinFloats }
-
-zapJoinFloats :: SimplEnv -> SimplEnv
-zapJoinFloats env = env { seJoinFloats = emptyJoinFloats }
-
-addJoinFloats :: SimplEnv -> JoinFloats -> SimplEnv
-addJoinFloats env@(SimplEnv { seJoinFloats = fb1 }) fb2
-  = env { seJoinFloats = fb1 `addJoinFlts` fb2 }
-
-addRecFloats :: SimplEnv -> SimplEnv -> SimplEnv
+mkRecFloats :: SimplFloats -> SimplFloats
 -- Flattens the floats from env2 into a single Rec group,
--- prepends the floats from env1, and puts the result back in env2
--- This is all very specific to the way recursive bindings are
--- handled; see Simplify.simplRecBind
-addRecFloats env1 env2@(SimplEnv {seLetFloats  = Floats bs ff
-                                 ,seJoinFloats = jbs })
+-- They must either all be lifted LetFloats or all JoinFloats
+mkRecFloats floats@(SimplFloats { sfLetFloats  = LetFloats bs ff
+                                , sfJoinFloats = jbs
+                                , sfInScope    = in_scope })
   = ASSERT2( case ff of { FltLifted -> True; _ -> False }, ppr (fromOL bs) )
-    env2 {seLetFloats = seLetFloats env1 `addFlts` floats'
-         ,seJoinFloats = seJoinFloats env1 `addJoinFlts` jfloats'}
+    ASSERT2( isNilOL bs || isNilOL jbs, ppr floats )
+    SimplFloats { sfLetFloats  = floats'
+                , sfJoinFloats = jfloats'
+                , sfInScope    = in_scope }
   where
-    floats'  | isNilOL bs  = emptyFloats
-             | otherwise   = unitFloat (Rec (flattenBinds (fromOL bs)))
+    floats'  | isNilOL bs  = emptyLetFloats
+             | otherwise   = unitLetFloat (Rec (flattenBinds (fromOL bs)))
     jfloats' | isNilOL jbs = emptyJoinFloats
              | otherwise   = unitJoinFloat (Rec (flattenBinds (fromOL jbs)))
 
-wrapFloats :: SimplEnv -> OutExpr -> OutExpr
+wrapFloats :: SimplFloats -> OutExpr -> OutExpr
 -- Wrap the floats around the expression; they should all
 -- satisfy the let/app invariant, so mkLets should do the job just fine
-wrapFloats (SimplEnv { seLetFloats  = Floats bs _
-                     , seJoinFloats = jbs }) body
+wrapFloats (SimplFloats { sfLetFloats  = LetFloats bs _
+                        , sfJoinFloats = jbs }) body
   = foldrOL Let (wrapJoinFloats jbs body) bs
      -- Note: Always safe to put the joins on the inside
      -- since the values can't refer to them
 
-wrapJoinFloatsX :: SimplEnv -> OutExpr -> (SimplEnv, OutExpr)
--- Wrap the seJoinFloats of the env around the expression,
+wrapJoinFloatsX :: SimplFloats -> OutExpr -> (SimplFloats, OutExpr)
+-- Wrap the sfJoinFloats of the env around the expression,
 -- and take them out of the SimplEnv
-wrapJoinFloatsX env@(SimplEnv { seJoinFloats = jbs }) body
-  = (zapJoinFloats env, wrapJoinFloats jbs body)
+wrapJoinFloatsX floats body
+  = ( floats { sfJoinFloats = emptyJoinFloats }
+    , wrapJoinFloats (sfJoinFloats floats) body )
 
 wrapJoinFloats :: JoinFloats -> OutExpr -> OutExpr
--- Wrap the seJoinFloats of the env around the expression,
+-- Wrap the sfJoinFloats of the env around the expression,
 -- and take them out of the SimplEnv
 wrapJoinFloats join_floats body
   = foldrOL Let body join_floats
 
-getFloatBinds :: SimplEnv -> [CoreBind]
-getFloatBinds (SimplEnv {seLetFloats = Floats bs _, seJoinFloats = jbs})
-  = fromOL bs ++ fromOL jbs
-
-isEmptyFloats :: SimplEnv -> Bool
-isEmptyFloats env@(SimplEnv {seLetFloats = Floats bs _})
-  = isNilOL bs && isEmptyJoinFloats env
-
-isEmptyJoinFloats :: SimplEnv -> Bool
-isEmptyJoinFloats (SimplEnv {seJoinFloats = jbs})
-  = isNilOL jbs
+getTopFloatBinds :: SimplFloats -> [CoreBind]
+getTopFloatBinds (SimplFloats { sfLetFloats  = lbs
+                              , sfJoinFloats = jbs})
+  = ASSERT( isNilOL jbs )  -- Can't be any top-level join bindings
+    letFloatBinds lbs
 
-mapFloats :: Floats -> ((Id,CoreExpr) -> (Id,CoreExpr)) -> Floats
-mapFloats (Floats fs ff) fun
-   = Floats (mapOL app fs) ff
+mapLetFloats :: LetFloats -> ((Id,CoreExpr) -> (Id,CoreExpr)) -> LetFloats
+mapLetFloats (LetFloats fs ff) fun
+   = LetFloats (mapOL app fs) ff
    where
     app (NonRec b e) = case fun (b,e) of (b',e') -> NonRec b' e'
     app (Rec bs)     = Rec (map fun bs)
diff --git a/compiler/simplCore/SimplUtils.hs b/compiler/simplCore/SimplUtils.hs
index b01955c8be..70e1134814 100644
--- a/compiler/simplCore/SimplUtils.hs
+++ b/compiler/simplCore/SimplUtils.hs
@@ -18,7 +18,7 @@ module SimplUtils (
 
         -- The continuation type
         SimplCont(..), DupFlag(..),
-        isSimplified,
+        isSimplified, contIsStop,
         contIsDupable, contResultType, contHoleType,
         contIsTrivial, contArgs,
         countArgs,
@@ -36,7 +36,7 @@ module SimplUtils (
 #include "HsVersions.h"
 
 import SimplEnv
-import CoreMonad        ( SimplifierMode(..), Tick(..) )
+import CoreMonad        ( SimplMode(..), Tick(..) )
 import DynFlags
 import CoreSyn
 import qualified CoreSubst
@@ -57,6 +57,7 @@ import DataCon          ( dataConWorkId, isNullaryRepDataCon )
 import VarSet
 import BasicTypes
 import Util
+import OrdList          ( isNilOL )
 import MonadUtils
 import Outputable
 import Pair
@@ -345,6 +346,10 @@ contIsRhs (Stop _ RhsCtxt) = True
 contIsRhs _                = False
 
 -------------------
+contIsStop :: SimplCont -> Bool
+contIsStop (Stop {}) = True
+contIsStop _         = False
+
 contIsDupable :: SimplCont -> Bool
 contIsDupable (Stop {})                         = True
 contIsDupable (ApplyToTy  { sc_cont = k })      = contIsDupable k
@@ -546,14 +551,31 @@ since we can just eliminate this case instead (x is in WHNF).  Similar
 applies when x is bound to a lambda expression.  Hence
 contIsInteresting looks for case expressions with just a single
 default case.
+
+Note [No case of case is boring]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we see
+   case f x of <alts>
+
+we'd usually treat the context as interesting, to encourage 'f' to
+inline.  But if case-of-case is off, it's really not so interesting
+after all, because we are unlikely to be able to push the case
+expression into the branches of any case in f's unfolding.  So, to
+reduce unnecessary code expansion, we just make the context look boring.
+This made a small compile-time perf improvement in perf/compiler/T6048,
+and it looks plausible to me.
 -}
 
-interestingCallContext :: SimplCont -> CallCtxt
+interestingCallContext :: SimplEnv -> SimplCont -> CallCtxt
 -- See Note [Interesting call context]
-interestingCallContext cont
+interestingCallContext env cont
   = interesting cont
   where
-    interesting (Select {})     = CaseCtxt
+    interesting (Select {})
+       | sm_case_case (getMode env) = CaseCtxt
+       | otherwise                  = BoringCtxt
+       -- See Note [No case of case is boring]
+
     interesting (ApplyToVal {}) = ValAppCtxt
         -- Can happen if we have (f Int |> co) y
         -- If f has an INLINE prag we need to give it some
@@ -694,11 +716,11 @@ interestingArg env e = go env 0 e
 {-
 ************************************************************************
 *                                                                      *
-                  SimplifierMode
+                  SimplMode
 *                                                                      *
 ************************************************************************
 
-The SimplifierMode controls several switches; see its definition in
+The SimplMode controls several switches; see its definition in
 CoreMonad
         sm_rules      :: Bool     -- Whether RULES are enabled
         sm_inline     :: Bool     -- Whether inlining is enabled
@@ -708,19 +730,20 @@ CoreMonad
 
 simplEnvForGHCi :: DynFlags -> SimplEnv
 simplEnvForGHCi dflags
-  = mkSimplEnv $ SimplMode { sm_names = ["GHCi"]
-                           , sm_phase = InitialPhase
-                           , sm_rules = rules_on
+  = mkSimplEnv $ SimplMode { sm_names  = ["GHCi"]
+                           , sm_phase  = InitialPhase
+                           , sm_dflags = dflags
+                           , sm_rules  = rules_on
                            , sm_inline = False
                            , sm_eta_expand = eta_expand_on
-                           , sm_case_case = True }
+                           , sm_case_case  = True }
   where
     rules_on      = gopt Opt_EnableRewriteRules   dflags
     eta_expand_on = gopt Opt_DoLambdaEtaExpansion dflags
    -- Do not do any inlining, in case we expose some unboxed
    -- tuple stuff that confuses the bytecode interpreter
 
-updModeForStableUnfoldings :: Activation -> SimplifierMode -> SimplifierMode
+updModeForStableUnfoldings :: Activation -> SimplMode -> SimplMode
 -- See Note [Simplifying inside stable unfoldings]
 updModeForStableUnfoldings inline_rule_act current_mode
   = current_mode { sm_phase      = phaseFromActivation inline_rule_act
@@ -733,7 +756,7 @@ updModeForStableUnfoldings inline_rule_act current_mode
     phaseFromActivation (ActiveAfter _ n) = Phase n
     phaseFromActivation _                 = InitialPhase
 
-updModeForRules :: SimplifierMode -> SimplifierMode
+updModeForRules :: SimplMode -> SimplMode
 -- See Note [Simplifying rules]
 updModeForRules current_mode
   = current_mode { sm_phase  = InitialPhase
@@ -1054,16 +1077,16 @@ is a term (not a coercion) so we can't necessarily inline the latter in
 the former.
 -}
 
-preInlineUnconditionally :: DynFlags -> SimplEnv -> TopLevelFlag -> InId -> InExpr -> Bool
+preInlineUnconditionally :: SimplEnv -> TopLevelFlag -> InId -> InExpr -> Bool
 -- Precondition: rhs satisfies the let/app invariant
 -- See Note [CoreSyn let/app invariant] in CoreSyn
 -- Reason: we don't want to inline single uses, or discard dead bindings,
 --         for unlifted, side-effect-ful bindings
-preInlineUnconditionally dflags env top_lvl bndr rhs
+preInlineUnconditionally env top_lvl bndr rhs
+  | not pre_inline_unconditionally           = False
   | not active                               = False
   | isStableUnfolding (idUnfolding bndr)     = False -- Note [Stable unfoldings and preInlineUnconditionally]
   | isTopLevel top_lvl && isBottomingId bndr = False -- Note [Top-level bottoming Ids]
-  | not (gopt Opt_SimplPreInlining dflags)   = False
   | isCoVar bndr                             = False -- Note [Do not inline CoVars unconditionally]
   | otherwise = case idOccInfo bndr of
                   IAmDead                    -> True -- Happens in ((\x.1) v)
@@ -1072,7 +1095,8 @@ preInlineUnconditionally dflags env top_lvl bndr rhs
                                                          (occ_int_cxt occ)
                   _                          -> False
   where
-    mode = getMode env
+    pre_inline_unconditionally = gopt Opt_SimplPreInlining (seDynFlags env)
+    mode   = getMode env
     active = isActive (sm_phase mode) act
              -- See Note [pre/postInlineUnconditionally in gentle mode]
     act = idInlineActivation bndr
@@ -1163,18 +1187,16 @@ story for now.
 -}
 
 postInlineUnconditionally
-    :: DynFlags -> SimplEnv -> TopLevelFlag
-    -> OutId            -- The binder (an InId would be fine too)
-                        --            (*not* a CoVar)
+    :: SimplEnv -> TopLevelFlag
+    -> OutId            -- The binder (*not* a CoVar), including its unfolding
     -> OccInfo          -- From the InId
     -> OutExpr
-    -> Unfolding
     -> Bool
 -- Precondition: rhs satisfies the let/app invariant
 -- See Note [CoreSyn let/app invariant] in CoreSyn
 -- Reason: we don't want to inline single uses, or discard dead bindings,
 --         for unlifted, side-effect-ful bindings
-postInlineUnconditionally dflags env top_lvl bndr occ_info rhs unfolding
+postInlineUnconditionally env top_lvl bndr occ_info rhs
   | not active                  = False
   | isWeakLoopBreaker occ_info  = False -- If it's a loop-breaker of any kind, don't inline
                                         -- because it might be referred to "earlier"
@@ -1242,7 +1264,9 @@ postInlineUnconditionally dflags env top_lvl bndr occ_info rhs unfolding
 -- Alas!
 
   where
-    active = isActive (sm_phase (getMode env)) (idInlineActivation bndr)
+    unfolding = idUnfolding bndr
+    dflags    = seDynFlags env
+    active    = isActive (sm_phase (getMode env)) (idInlineActivation bndr)
         -- See Note [pre/postInlineUnconditionally in gentle mode]
 
 {-
@@ -1414,40 +1438,46 @@ because the latter is not well-kinded.
 ************************************************************************
 -}
 
-tryEtaExpandRhs :: SimplEnv -> RecFlag -> OutId -> OutExpr
-                -> SimplM (Arity, OutExpr)
+tryEtaExpandRhs :: SimplMode -> OutId -> OutExpr
+                -> SimplM (Arity, Bool, OutExpr)
 -- See Note [Eta-expanding at let bindings]
-tryEtaExpandRhs env is_rec bndr rhs
-  = do { dflags <- getDynFlags
-       ; (new_arity, new_rhs) <- try_expand dflags
+-- If tryEtaExpandRhs rhs = (n, is_bot, rhs') then
+--   (a) rhs' has manifest arity
+--   (b) if is_bot is True then rhs' applied to n args is guaranteed bottom
+tryEtaExpandRhs mode bndr rhs
+  | isJoinId bndr
+  = return (manifestArity rhs, False, rhs)
+    -- Note [Do not eta-expand join points]
+
+  | otherwise
+  = do { (new_arity, is_bot, new_rhs) <- try_expand
 
        ; WARN( new_arity < old_id_arity,
                (text "Arity decrease:" <+> (ppr bndr <+> ppr old_id_arity
                 <+> ppr old_arity <+> ppr new_arity) $$ ppr new_rhs) )
                         -- Note [Arity decrease] in Simplify
-         return (new_arity, new_rhs) }
+         return (new_arity, is_bot, new_rhs) }
   where
-    try_expand dflags
+    try_expand
       | exprIsTrivial rhs
-      = return (exprArity rhs, rhs)
-
-      | sm_eta_expand (getMode env)      -- Provided eta-expansion is on
-      , let new_arity1 = findRhsArity dflags bndr rhs old_arity
-            new_arity2 = idCallArity bndr
-            new_arity  = max new_arity1 new_arity2
-      , new_arity > old_arity      -- And the current manifest arity isn't enough
-      = if is_rec == Recursive && isJoinId bndr
-           then WARN(True, text "Can't eta-expand recursive join point:" <+>
-                             ppr bndr)
-                return (old_arity, rhs)
-           else do { tick (EtaExpansion bndr)
-                   ; return (new_arity, etaExpand new_arity rhs) }
+      = return (exprArity rhs, False, rhs)
+
+      | sm_eta_expand mode      -- Provided eta-expansion is on
+      , new_arity > old_arity   -- And the current manifest arity isn't enough
+      = do { tick (EtaExpansion bndr)
+           ; return (new_arity, is_bot, etaExpand new_arity rhs) }
+
       | otherwise
-      = return (old_arity, rhs)
+      = return (old_arity, is_bot && new_arity == old_arity, rhs)
 
+    dflags       = sm_dflags mode
     old_arity    = exprArity rhs -- See Note [Do not expand eta-expand PAPs]
     old_id_arity = idArity bndr
 
+    (new_arity1, is_bot) = findRhsArity dflags bndr rhs old_arity
+    new_arity2 = idCallArity bndr
+    new_arity  = max new_arity1 new_arity2
+
 {-
 Note [Eta-expanding at let bindings]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1473,6 +1503,44 @@ because then 'genMap' will inline, and it really shouldn't: at least
 as far as the programmer is concerned, it's not applied to two
 arguments!
 
+Note [Do not eta-expand join points]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Similarly to CPR (see Note [Don't CPR join points] in WorkWrap), a join point
+stands well to gain from its outer binding's eta-expansion, and eta-expanding a
+join point is fraught with issues like how to deal with a cast:
+
+    let join $j1 :: IO ()
+             $j1 = ...
+             $j2 :: Int -> IO ()
+             $j2 n = if n > 0 then $j1
+                              else ...
+
+    =>
+
+    let join $j1 :: IO ()
+             $j1 = (\eta -> ...)
+                     `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ())
+                                 ~  IO ()
+             $j2 :: Int -> IO ()
+             $j2 n = (\eta -> if n > 0 then $j1
+                                       else ...)
+                     `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ())
+                                 ~  IO ()
+
+The cast here can't be pushed inside the lambda (since it's not casting to a
+function type), so the lambda has to stay, but it can't because it contains a
+reference to a join point. In fact, $j2 can't be eta-expanded at all. Rather
+than try and detect this situation (and whatever other situations crop up!), we
+don't bother; again, any surrounding eta-expansion will improve these join
+points anyway, since an outer cast can *always* be pushed inside. By the time
+CorePrep comes around, the code is very likely to look more like this:
+
+    let join $j1 :: State# RealWorld -> (# State# RealWorld, ())
+             $j1 = (...) eta
+             $j2 :: Int -> State# RealWorld -> (# State# RealWorld, ())
+             $j2 = if n > 0 then $j1
+                            else (...) eta
+
 Note [Do not eta-expand PAPs]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 We used to have old_arity = manifestArity rhs, which meant that we
@@ -1603,22 +1671,25 @@ new binding is abstracted.  Note that
     which is obviously bogus.
 -}
 
-abstractFloats :: [OutTyVar] -> SimplEnv -> OutExpr -> SimplM ([OutBind], OutExpr)
-abstractFloats main_tvs body_env body
+abstractFloats :: DynFlags -> TopLevelFlag -> [OutTyVar] -> SimplFloats
+              -> OutExpr -> SimplM ([OutBind], OutExpr)
+abstractFloats dflags top_lvl main_tvs floats body
   = ASSERT( notNull body_floats )
+    ASSERT( isNilOL (sfJoinFloats floats) )
     do  { (subst, float_binds) <- mapAccumLM abstract empty_subst body_floats
         ; return (float_binds, CoreSubst.substExpr (text "abstract_floats1") subst body) }
   where
+    is_top_lvl  = isTopLevel top_lvl
     main_tv_set = mkVarSet main_tvs
-    body_floats = getFloatBinds body_env
-    empty_subst = CoreSubst.mkEmptySubst (seInScope body_env)
+    body_floats = letFloatBinds (sfLetFloats floats)
+    empty_subst = CoreSubst.mkEmptySubst (sfInScope floats)
 
     abstract :: CoreSubst.Subst -> OutBind -> SimplM (CoreSubst.Subst, OutBind)
     abstract subst (NonRec id rhs)
-      = do { (poly_id, poly_app) <- mk_poly tvs_here id
-           ; let poly_rhs = mkLams tvs_here rhs'
-                 subst'   = CoreSubst.extendIdSubst subst id poly_app
-           ; return (subst', (NonRec poly_id poly_rhs)) }
+      = do { (poly_id1, poly_app) <- mk_poly1 tvs_here id
+           ; let (poly_id2, poly_rhs) = mk_poly2 poly_id1 tvs_here rhs'
+                 subst' = CoreSubst.extendIdSubst subst id poly_app
+           ; return (subst', NonRec poly_id2 poly_rhs) }
       where
         rhs' = CoreSubst.substExpr (text "abstract_floats2") subst rhs
 
@@ -1629,11 +1700,13 @@ abstractFloats main_tvs body_env body
                    exprSomeFreeVarsList isTyVar rhs'
 
     abstract subst (Rec prs)
-       = do { (poly_ids, poly_apps) <- mapAndUnzipM (mk_poly tvs_here) ids
+       = do { (poly_ids, poly_apps) <- mapAndUnzipM (mk_poly1 tvs_here) ids
             ; let subst' = CoreSubst.extendSubstList subst (ids `zip` poly_apps)
-                  poly_rhss = [mkLams tvs_here (CoreSubst.substExpr (text "abstract_floats3") subst' rhs)
-                              | rhs <- rhss]
-            ; return (subst', Rec (poly_ids `zip` poly_rhss)) }
+                  poly_pairs = [ mk_poly2 poly_id tvs_here rhs'
+                               | (poly_id, rhs) <- poly_ids `zip` rhss
+                               , let rhs' = CoreSubst.substExpr (text "abstract_floats")
+                                                                subst' rhs ]
+            ; return (subst', Rec poly_pairs) }
        where
          (ids,rhss) = unzip prs
                 -- For a recursive group, it's a bit of a pain to work out the minimal
@@ -1651,7 +1724,8 @@ abstractFloats main_tvs body_env body
                 -- Here, we must abstract 'x' over 'a'.
          tvs_here = toposortTyVars main_tvs
 
-    mk_poly tvs_here var
+    mk_poly1 :: [TyVar] -> Id -> SimplM (Id, CoreExpr)
+    mk_poly1 tvs_here var
       = do { uniq <- getUniqueM
            ; let  poly_name = setNameUnique (idName var) uniq           -- Keep same name
                   poly_ty   = mkInvForAllTys tvs_here (idType var) -- But new type of course
@@ -1671,6 +1745,21 @@ abstractFloats main_tvs body_env body
                 -- the occurrences of x' will be just the occurrences originally
                 -- pinned on x.
 
+    mk_poly2 :: Id -> [TyVar] -> CoreExpr -> (Id, CoreExpr)
+    mk_poly2 poly_id tvs_here rhs
+      = (poly_id `setIdUnfolding` unf, poly_rhs)
+      where
+        poly_rhs = mkLams tvs_here rhs
+        unf = mkUnfolding dflags InlineRhs is_top_lvl False poly_rhs
+
+        -- We want the unfolding.  Consider
+        --      let
+        --            x = /\a. let y = ... in Just y
+        --      in body
+        -- Then we float the y-binding out (via abstractFloats and addPolyBind)
+        -- but 'x' may well then be inlined in 'body' in which case we'd like the
+        -- opportunity to inline 'y' too.
+
 {-
 Note [Abstract over coercions]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/compiler/simplCore/Simplify.hs b/compiler/simplCore/Simplify.hs
index 1fc9112fcf..b17f2afa17 100644
--- a/compiler/simplCore/Simplify.hs
+++ b/compiler/simplCore/Simplify.hs
@@ -27,24 +27,20 @@ import Coercion hiding  ( substCo, substCoVar )
 import OptCoercion      ( optCoercion )
 import FamInstEnv       ( topNormaliseType_maybe )
 import DataCon          ( DataCon, dataConWorkId, dataConRepStrictness, dataConRepArgTys )
---import TyCon            ( isEnumerationTyCon ) -- temporalily commented out. See #8326
-import CoreMonad        ( Tick(..), SimplifierMode(..) )
+import CoreMonad        ( Tick(..), SimplMode(..) )
 import CoreSyn
 import Demand           ( StrictSig(..), dmdTypeDepth, isStrictDmd )
 import PprCore          ( pprCoreExpr )
 import CoreUnfold
 import CoreUtils
-import CoreArity
 import CoreOpt          ( pushCoTyArg, pushCoValArg
                         , joinPointBinding_maybe, joinPointBindings_maybe )
---import PrimOp           ( tagToEnumKey ) -- temporalily commented out. See #8326
 import Rules            ( mkRuleInfo, lookupRule, getRules )
---import TysPrim          ( intPrimTy ) -- temporalily commented out. See #8326
+import Demand           ( mkClosedStrictSig, topDmd, exnRes )
 import BasicTypes       ( TopLevelFlag(..), isNotTopLevel, isTopLevel,
-                          RecFlag(..) )
-import MonadUtils       ( foldlM, mapAccumLM, liftIO )
-import Maybes           ( isJust, fromJust, orElse, catMaybes )
---import Unique           ( hasKey ) -- temporalily commented out. See #8326
+                          RecFlag(..), Arity )
+import MonadUtils       ( mapAccumLM, liftIO )
+import Maybes           (  orElse )
 import Control.Monad
 import Outputable
 import FastString
@@ -57,142 +53,50 @@ import Module          ( moduleName, pprModuleName )
 The guts of the simplifier is in this module, but the driver loop for
 the simplifier is in SimplCore.hs.
 
+Note [The big picture]
+~~~~~~~~~~~~~~~~~~~~~~
+The general shape of the simplifier is this:
 
------------------------------------------
-        *** IMPORTANT NOTE ***
------------------------------------------
-The simplifier used to guarantee that the output had no shadowing, but
-it does not do so any more.   (Actually, it never did!)  The reason is
-documented with simplifyArgs.
-
-
------------------------------------------
-        *** IMPORTANT NOTE ***
------------------------------------------
-Many parts of the simplifier return a bunch of "floats" as well as an
-expression. This is wrapped as a datatype SimplUtils.FloatsWith.
-
-All "floats" are let-binds, not case-binds, but some non-rec lets may
-be unlifted (with RHS ok-for-speculation).
-
-
-
------------------------------------------
-        ORGANISATION OF FUNCTIONS
------------------------------------------
-simplTopBinds
-  - simplify all top-level binders
-  - for NonRec, call simplRecOrTopPair
-  - for Rec,    call simplRecBind
-
-
-        ------------------------------
-simplExpr (applied lambda)      ==> simplNonRecBind
-simplExpr (Let (NonRec ...) ..) ==> simplNonRecBind
-simplExpr (Let (Rec ...)    ..) ==> simplify binders; simplRecBind
-
-        ------------------------------
-simplRecBind    [binders already simplfied]
-  - use simplRecOrTopPair on each pair in turn
-
-simplRecOrTopPair [binder already simplified]
-  Used for: recursive bindings (top level and nested)
-            top-level non-recursive bindings
-  Returns:
-  - check for PreInlineUnconditionally
-  - simplLazyBind
-
-simplNonRecBind
-  Used for: non-top-level non-recursive bindings
-            beta reductions (which amount to the same thing)
-  Because it can deal with strict arts, it takes a
-        "thing-inside" and returns an expression
-
-  - check for PreInlineUnconditionally
-  - simplify binder, including its IdInfo
-  - if strict binding
-        simplStrictArg
-        mkAtomicArgs
-        completeNonRecX
-    else
-        simplLazyBind
-        addFloats
-
-simplNonRecX:   [given a *simplified* RHS, but an *unsimplified* binder]
-  Used for: binding case-binder and constr args in a known-constructor case
-  - check for PreInLineUnconditionally
-  - simplify binder
-  - completeNonRecX
-
-        ------------------------------
-simplLazyBind:  [binder already simplified, RHS not]
-  Used for: recursive bindings (top level and nested)
-            top-level non-recursive bindings
-            non-top-level, but *lazy* non-recursive bindings
-        [must not be strict or unboxed]
-  Returns floats + an augmented environment, not an expression
-  - substituteIdInfo and add result to in-scope
-        [so that rules are available in rec rhs]
-  - simplify rhs
-  - mkAtomicArgs
-  - float if exposes constructor or PAP
-  - completeBind
-
-
-completeNonRecX:        [binder and rhs both simplified]
-  - if the the thing needs case binding (unlifted and not ok-for-spec)
-        build a Case
-   else
-        completeBind
-        addFloats
-
-completeBind:   [given a simplified RHS]
-        [used for both rec and non-rec bindings, top level and not]
-  - try PostInlineUnconditionally
-  - add unfolding [this is the only place we add an unfolding]
-  - add arity
-
-
-
-Right hand sides and arguments
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In many ways we want to treat
-        (a) the right hand side of a let(rec), and
-        (b) a function argument
-in the same way.  But not always!  In particular, we would
-like to leave these arguments exactly as they are, so they
-will match a RULE more easily.
-
-        f (g x, h x)
-        g (+ x)
-
-It's harder to make the rule match if we ANF-ise the constructor,
-or eta-expand the PAP:
+  simplExpr :: SimplEnv -> InExpr -> SimplCont -> SimplM (SimplFloats, OutExpr)
+  simplBind :: SimplEnv -> InBind -> SimplM (SimplFloats, SimplEnv)
 
-        f (let { a = g x; b = h x } in (a,b))
-        g (\y. + x y)
+ * SimplEnv contains
+     - Simplifier mode (which includes DynFlags for convenience)
+     - Ambient substitution
+     - InScopeSet
 
-On the other hand if we see the let-defns
+ * SimplFloats contains
+     - Let-floats (which includes ok-for-spec case-floats)
+     - Join floats
+     - InScopeSet (including all the floats)
 
-        p = (g x, h x)
-        q = + x
+ * Expressions
+      simplExpr :: SimplEnv -> InExpr -> SimplCont
+                -> SimplM (SimplFloats, OutExpr)
+   The result of simplifying an /expression/ is (floats, expr)
+      - A bunch of floats (let bindings, join bindings)
+      - A simplified expression.
+   The overall result is effectively (let floats in expr)
 
-then we *do* want to ANF-ise and eta-expand, so that p and q
-can be safely inlined.
+ * Bindings
+      simplBind :: SimplEnv -> InBind -> SimplM (SimplFloats, SimplEnv)
+   The result of simplifying a binding is
+     - A bunch of floats, the last of which is the simplified binding
+       There may be auxiliary bindings too; see prepareRhs
+     - An environment suitable for simplifying the scope of the binding
 
-Even floating lets out is a bit dubious.  For let RHS's we float lets
-out if that exposes a value, so that the value can be inlined more vigorously.
-For example
+   The floats may also be empty, if the binding is inlined unconditionally;
+   in that case the returned SimplEnv will have an augmented substitution.
 
-        r = let x = e in (x,x)
+   The returned floats and env both have an in-scope set, and they are
+   guaranteed to be the same.
 
-Here, if we float the let out we'll expose a nice constructor. We did experiments
-that showed this to be a generally good thing.  But it was a bad thing to float
-lets out unconditionally, because that meant they got allocated more often.
 
-For function arguments, there's less reason to expose a constructor (it won't
-get inlined).  Just possibly it might make a rule match, but I'm pretty skeptical.
-So for the moment we don't float lets out of function arguments either.
+Note [Shadowing]
+~~~~~~~~~~~~~~~~
+The simplifier used to guarantee that the output had no shadowing, but
+it does not do so any more.   (Actually, it never did!)  The reason is
+documented with simplifyArgs.
 
 
 Eta expansion
@@ -206,36 +110,6 @@ lambdas together.  And in general that's a good thing to do.  Perhaps
 we should eta expand wherever we find a (value) lambda?  Then the eta
 expansion at a let RHS can concentrate solely on the PAP case.
 
-
-Case-of-case and join points
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-When we perform the case-of-case transform (or otherwise push continuations
-inward), we want to treat join points specially. Since they're always
-tail-called and we want to maintain this invariant, we can do this (for any
-evaluation context E):
-
-  E[join j = e
-    in case ... of
-         A -> jump j 1
-         B -> jump j 2
-         C -> f 3]
-
-    -->
-
-  join j = E[e]
-  in case ... of
-       A -> jump j 1
-       B -> jump j 2
-       C -> E[f 3]
-
-As is evident from the example, there are two components to this behavior:
-
-  1. When entering the RHS of a join point, copy the context inside.
-  2. When a join point is invoked, discard the outer context.
-
-Clearly we need to be very careful here to remain consistent---neither part is
-optional!
-
 ************************************************************************
 *                                                                      *
 \subsection{Bindings}
@@ -243,8 +117,8 @@ optional!
 ************************************************************************
 -}
 
-simplTopBinds :: SimplEnv -> [InBind] -> SimplM SimplEnv
-
+simplTopBinds :: SimplEnv -> [InBind] -> SimplM (SimplFloats, SimplEnv)
+-- See Note [The big picture]
 simplTopBinds env0 binds0
   = do  {       -- Put all the top-level binders into scope at the start
                 -- so that if a transformation rule has unexpectedly brought
@@ -252,18 +126,19 @@ simplTopBinds env0 binds0
                 -- It's rather as if the top-level binders were imported.
                 -- See note [Glomming] in OccurAnal.
         ; env1 <- simplRecBndrs env0 (bindersOfBinds binds0)
-        ; env2 <- simpl_binds env1 binds0
+        ; (floats, env2) <- simpl_binds env1 binds0
         ; freeTick SimplifierDone
-        ; return env2 }
+        ; return (floats, env2) }
   where
         -- We need to track the zapped top-level binders, because
         -- they should have their fragile IdInfo zapped (notably occurrence info)
         -- That's why we run down binds and bndrs' simultaneously.
         --
-    simpl_binds :: SimplEnv -> [InBind] -> SimplM SimplEnv
-    simpl_binds env []           = return env
-    simpl_binds env (bind:binds) = do { env' <- simpl_bind env bind
-                                      ; simpl_binds env' binds }
+    simpl_binds :: SimplEnv -> [InBind] -> SimplM (SimplFloats, SimplEnv)
+    simpl_binds env []           = return (emptyFloats env, env)
+    simpl_binds env (bind:binds) = do { (float,  env1) <- simpl_bind env bind
+                                      ; (floats, env2) <- simpl_binds env1 binds
+                                      ; return (float `addFloats` floats, env2) }
 
     simpl_bind env (Rec pairs)  = simplRecBind env TopLevel Nothing pairs
     simpl_bind env (NonRec b r) = do { (env', b') <- addBndrRules env b (lookupRecBndr env b)
@@ -274,7 +149,7 @@ simplTopBinds env0 binds0
 {-
 ************************************************************************
 *                                                                      *
-\subsection{Lazy bindings}
+        Lazy bindings
 *                                                                      *
 ************************************************************************
 
@@ -284,13 +159,11 @@ simplRecBind is used for
 
 simplRecBind :: SimplEnv -> TopLevelFlag -> Maybe SimplCont
              -> [(InId, InExpr)]
-             -> SimplM SimplEnv
+             -> SimplM (SimplFloats, SimplEnv)
 simplRecBind env0 top_lvl mb_cont pairs0
   = do  { (env_with_info, triples) <- mapAccumLM add_rules env0 pairs0
-        ; env1 <- go (zapFloats env_with_info) triples
-        ; return (env0 `addRecFloats` env1) }
-        -- addRecFloats adds the floats from env1,
-        -- _and_ updates env0 with the in-scope set from env1
+        ; (rec_floats, env1) <- go env_with_info triples
+        ; return (mkRecFloats rec_floats, env1) }
   where
     add_rules :: SimplEnv -> (InBndr,InExpr) -> SimplM (SimplEnv, (InBndr, OutBndr, InExpr))
         -- Add the (substituted) rules to the binder
@@ -298,12 +171,13 @@ simplRecBind env0 top_lvl mb_cont pairs0
         = do { (env', bndr') <- addBndrRules env bndr (lookupRecBndr env bndr)
              ; return (env', (bndr, bndr', rhs)) }
 
-    go env [] = return env
+    go env [] = return (emptyFloats env, env)
 
     go env ((old_bndr, new_bndr, rhs) : pairs)
-        = do { env' <- simplRecOrTopPair env top_lvl Recursive mb_cont
-                                         old_bndr new_bndr rhs
-             ; go env' pairs }
+        = do { (float, env1) <- simplRecOrTopPair env top_lvl Recursive mb_cont
+                                                  old_bndr new_bndr rhs
+             ; (floats, env2) <- go env1 pairs
+             ; return (float `addFloats` floats, env2) }
 
 {-
 simplOrTopPair is used for
@@ -316,57 +190,36 @@ It assumes the binder has already been simplified, but not its IdInfo.
 simplRecOrTopPair :: SimplEnv
                   -> TopLevelFlag -> RecFlag -> Maybe SimplCont
                   -> InId -> OutBndr -> InExpr  -- Binder and rhs
-                  -> SimplM SimplEnv    -- Returns an env that includes the binding
+                  -> SimplM (SimplFloats, SimplEnv)
 
 simplRecOrTopPair env top_lvl is_rec mb_cont old_bndr new_bndr rhs
-  = do { dflags <- getDynFlags
-       ; trace_bind dflags $
-           if preInlineUnconditionally dflags env top_lvl old_bndr rhs
-                    -- Check for unconditional inline
-           then do tick (PreInlineUnconditionally old_bndr)
-                   return (extendIdSubst env old_bndr (mkContEx env rhs))
-           else simplBind env top_lvl is_rec mb_cont old_bndr new_bndr rhs env }
-  where
-    trace_bind dflags thing_inside
-      | not (dopt Opt_D_verbose_core2core dflags)
-      = thing_inside
-      | otherwise
-      = pprTrace "SimplBind" (ppr old_bndr) thing_inside
-        -- trace_bind emits a trace for each top-level binding, which
-        -- helps to locate the tracing for inlining and rule firing
+  | preInlineUnconditionally env top_lvl old_bndr rhs
+  = trace_bind "pre-inline-uncond" $
+    do { tick (PreInlineUnconditionally old_bndr)
+       ; return ( emptyFloats env
+                , extendIdSubst env old_bndr (mkContEx env rhs)) }
 
-{-
-simplBind is used for
-  * [simplRecOrTopPair] recursive bindings (whether top level or not)
-  * [simplRecOrTopPair] top-level non-recursive bindings
-  * [simplNonRecE]      non-top-level *lazy* non-recursive bindings
+  | Just cont <- mb_cont
+  = ASSERT( isNotTopLevel top_lvl && isJoinId new_bndr )
+    trace_bind "join" $
+    simplJoinBind env cont old_bndr new_bndr rhs
 
-Nota bene:
-    1. It assumes that the binder is *already* simplified,
-       and is in scope, and its IdInfo too, except unfolding
-
-    2. It assumes that the binder type is lifted.
+  | otherwise
+  = trace_bind "normal" $
+    simplLazyBind env top_lvl is_rec old_bndr new_bndr rhs env
 
-    3. It does not check for pre-inline-unconditionally;
-       that should have been done already.
--}
+  where
+    dflags = seDynFlags env
 
-simplBind :: SimplEnv
-          -> TopLevelFlag -> RecFlag -> Maybe SimplCont
-          -> InId -> OutId      -- Binder, both pre-and post simpl
-                                -- Can be a JoinId
-                                -- The OutId has IdInfo, except arity, unfolding
-                                -- Ids only, no TyVars
-          -> InExpr -> SimplEnv -- The RHS and its environment
-          -> SimplM SimplEnv
-simplBind env top_lvl is_rec mb_cont bndr bndr1 rhs rhs_se
-  | ASSERT( isId bndr1 )
-    isJoinId bndr1
-  = ASSERT(isNotTopLevel top_lvl && isJust mb_cont)
-    simplJoinBind env is_rec (fromJust mb_cont) bndr bndr1 rhs rhs_se
-  | otherwise
-  = simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
+    -- trace_bind emits a trace for each top-level binding, which
+    -- helps to locate the tracing for inlining and rule firing
+    trace_bind what thing_inside
+      | not (dopt Opt_D_verbose_core2core dflags)
+      = thing_inside
+      | otherwise
+      = pprTrace ("SimplBind " ++ what) (ppr old_bndr) thing_inside
 
+--------------------------
 simplLazyBind :: SimplEnv
               -> TopLevelFlag -> RecFlag
               -> InId -> OutId          -- Binder, both pre-and post simpl
@@ -374,7 +227,7 @@ simplLazyBind :: SimplEnv
                                         -- The OutId has IdInfo, except arity, unfolding
                                         -- Ids only, no TyVars
               -> InExpr -> SimplEnv     -- The RHS and its environment
-              -> SimplM SimplEnv
+              -> SimplM (SimplFloats, SimplEnv)
 -- Precondition: not a JoinId
 -- Precondition: rhs obeys the let/app invariant
 -- NOT used for JoinIds
@@ -382,7 +235,7 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
   = ASSERT( isId bndr )
     ASSERT2( not (isJoinId bndr), ppr bndr )
     -- pprTrace "simplLazyBind" ((ppr bndr <+> ppr bndr1) $$ ppr rhs $$ ppr (seIdSubst rhs_se)) $
-    do  { let   rhs_env     = rhs_se `setInScopeAndZapFloats` env
+    do  { let   rhs_env     = rhs_se `setInScopeFromE` env
                 (tvs, body) = case collectTyAndValBinders rhs of
                                 (tvs, [], body)
                                   | surely_not_lam body -> (tvs, body)
@@ -404,146 +257,111 @@ simplLazyBind env top_lvl is_rec bndr bndr1 rhs rhs_se
 
         -- Simplify the RHS
         ; let rhs_cont = mkRhsStop (substTy body_env (exprType body))
-        ; (body_env0, body0) <- simplExprF body_env body rhs_cont
-        ; let (body_env1, body1) = wrapJoinFloatsX body_env0 body0
-
-        -- ANF-ise a constructor or PAP rhs
-        ; (body_env2, body2) <- prepareRhs top_lvl body_env1 bndr1 body1
+        ; (body_floats0, body0) <- simplExprF body_env body rhs_cont
 
-        -- We need body_env2 for its let-floats (only);
-        -- we've dealt with its join-floats, which are now empty
-        ; (env', rhs')
-            <-  if not (doFloatFromRhs top_lvl is_rec False body2 body_env2)
-                then                            -- No floating, revert to body1
-                     do { rhs' <- mkLam env tvs' (wrapFloats body_env1 body1) rhs_cont
-                        ; return (env, rhs') }
+              -- Never float join-floats out of a non-join let-binding
+              -- So wrap the body in the join-floats right now
+              -- Henc: body_floats1 consists only of let-floats
+        ; let (body_floats1, body1) = wrapJoinFloatsX body_floats0 body0
 
-                else if null tvs then           -- Simple floating
+        -- ANF-ise a constructor or PAP rhs
+        -- We get at most one float per argument here
+        ; (let_floats, body2) <- prepareRhs (getMode env) top_lvl
+                                            (getOccFS bndr1) (idInfo bndr1) body1
+        ; let body_floats2 = body_floats1 `addLetFloats` let_floats
+
+        ; (rhs_floats, rhs')
+            <-  if not (doFloatFromRhs top_lvl is_rec False body_floats2 body2)
+                then                    -- No floating, revert to body1
+                     do { rhs' <- mkLam env tvs' (wrapFloats body_floats2 body1) rhs_cont
+                        ; return (emptyFloats env, rhs') }
+
+                else if null tvs then   -- Simple floating
                      do { tick LetFloatFromLet
-                        ; return (addLetFloats env body_env2, body2) }
+                        ; return (body_floats2, body2) }
 
-                else                            -- Do type-abstraction first
+                else                    -- Do type-abstraction first
                      do { tick LetFloatFromLet
-                        ; (poly_binds, body3) <- abstractFloats tvs' body_env2 body2
+                        ; (poly_binds, body3) <- abstractFloats (seDynFlags env) top_lvl
+                                                                tvs' body_floats2 body2
+                        ; let floats = foldl extendFloats (emptyFloats env) poly_binds
                         ; rhs' <- mkLam env tvs' body3 rhs_cont
-                        ; env' <- foldlM (addPolyBind top_lvl) env poly_binds
-                        ; return (env', rhs') }
+                        ; return (floats, rhs') }
 
-        ; completeBind env' top_lvl is_rec Nothing bndr bndr1 rhs' }
+        ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
+                                             top_lvl Nothing bndr bndr1 rhs'
+        ; return (rhs_floats `addFloats` bind_float, env2) }
 
+--------------------------
 simplJoinBind :: SimplEnv
-              -> RecFlag
               -> SimplCont
               -> InId -> OutId          -- Binder, both pre-and post simpl
                                         -- The OutId has IdInfo, except arity,
                                         --   unfolding
-              -> InExpr -> SimplEnv     -- The RHS and its environment
-              -> SimplM SimplEnv
-simplJoinBind env is_rec cont bndr bndr1 rhs rhs_se
-  = -- pprTrace "simplLazyBind" ((ppr bndr <+> ppr bndr1) $$
-    --                           ppr rhs $$ ppr (seIdSubst rhs_se)) $
-    do  { let rhs_env = rhs_se `setInScopeAndZapFloats` env
-        ; rhs' <- simplJoinRhs rhs_env bndr rhs cont
-        ; completeBind env NotTopLevel is_rec (Just cont) bndr bndr1 rhs' }
-
-{-
-A specialised variant of simplNonRec used when the RHS is already simplified,
-notably in knownCon.  It uses case-binding where necessary.
--}
+              -> InExpr
+              -> SimplM (SimplFloats, SimplEnv)
+simplJoinBind env cont old_bndr new_bndr rhs
+  = do  { rhs' <- simplJoinRhs env old_bndr rhs cont
+        ; completeBind env NotTopLevel (Just cont) old_bndr new_bndr rhs' }
 
+--------------------------
 simplNonRecX :: SimplEnv
              -> InId            -- Old binder; not a JoinId
              -> OutExpr         -- Simplified RHS
-             -> SimplM SimplEnv
+             -> SimplM (SimplFloats, SimplEnv)
+-- A specialised variant of simplNonRec used when the RHS is already
+-- simplified, notably in knownCon.  It uses case-binding where necessary.
+--
 -- Precondition: rhs satisfies the let/app invariant
+
 simplNonRecX env bndr new_rhs
   | ASSERT2( not (isJoinId bndr), ppr bndr )
     isDeadBinder bndr   -- Not uncommon; e.g. case (a,b) of c { (p,q) -> p }
-  = return env    --  Here c is dead, and we avoid creating
-                  --   the binding c = (a,b)
+  = return (emptyFloats env, env)    --  Here c is dead, and we avoid
+                                         --  creating the binding c = (a,b)
 
   | Coercion co <- new_rhs
-  = return (extendCvSubst env bndr co)
+  = return (emptyFloats env, extendCvSubst env bndr co)
 
   | otherwise
   = do  { (env', bndr') <- simplBinder env bndr
         ; completeNonRecX NotTopLevel env' (isStrictId bndr) bndr bndr' new_rhs }
                 -- simplNonRecX is only used for NotTopLevel things
 
+--------------------------
 completeNonRecX :: TopLevelFlag -> SimplEnv
                 -> Bool
                 -> InId                 -- Old binder; not a JoinId
                 -> OutId                -- New binder
                 -> OutExpr              -- Simplified RHS
-                -> SimplM SimplEnv      -- The new binding extends the seLetFloats
-                                        -- of the resulting SimpleEnv
+                -> SimplM (SimplFloats, SimplEnv)    -- The new binding is in the floats
 -- Precondition: rhs satisfies the let/app invariant
 --               See Note [CoreSyn let/app invariant] in CoreSyn
 
 completeNonRecX top_lvl env is_strict old_bndr new_bndr new_rhs
   = ASSERT2( not (isJoinId new_bndr), ppr new_bndr )
-    do  { (env1, rhs1) <- prepareRhs top_lvl (zapFloats env) new_bndr new_rhs
-        ; (env2, rhs2) <-
-                if doFloatFromRhs NotTopLevel NonRecursive is_strict rhs1 env1
-                then do { tick LetFloatFromLet
-                        ; return (addLetFloats env env1, rhs1) }   -- Add the floats to the main env
-                else return (env, wrapFloats env1 rhs1)         -- Wrap the floats around the RHS
-        ; completeBind env2 NotTopLevel NonRecursive Nothing
-                       old_bndr new_bndr rhs2 }
-
-{- No, no, no!  Do not try preInlineUnconditionally in completeNonRecX
-   Doing so risks exponential behaviour, because new_rhs has been simplified once already
-   In the cases described by the following comment, postInlineUnconditionally will
-   catch many of the relevant cases.
-        -- This happens; for example, the case_bndr during case of
-        -- known constructor:  case (a,b) of x { (p,q) -> ... }
-        -- Here x isn't mentioned in the RHS, so we don't want to
-        -- create the (dead) let-binding  let x = (a,b) in ...
-        --
-        -- Similarly, single occurrences can be inlined vigourously
-        -- e.g.  case (f x, g y) of (a,b) -> ....
-        -- If a,b occur once we can avoid constructing the let binding for them.
-
-   Furthermore in the case-binding case preInlineUnconditionally risks extra thunks
-        -- Consider     case I# (quotInt# x y) of
-        --                I# v -> let w = J# v in ...
-        -- If we gaily inline (quotInt# x y) for v, we end up building an
-        -- extra thunk:
-        --                let w = J# (quotInt# x y) in ...
-        -- because quotInt# can fail.
-
-  | preInlineUnconditionally env NotTopLevel bndr new_rhs
-  = thing_inside (extendIdSubst env bndr (DoneEx new_rhs))
--}
-
-----------------------------------
-{- Note [Avoiding exponential behaviour]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-One way in which we can get exponential behaviour is if we simplify a
-big expression, and the re-simplify it -- and then this happens in a
-deeply-nested way.  So we must be jolly careful about re-simplifying
-an expression.  That is why completeNonRecX does not try
-preInlineUnconditionally.
-
-Example:
-  f BIG, where f has a RULE
-Then
- * We simplify BIG before trying the rule; but the rule does not fire
- * We inline f = \x. x True
- * So if we did preInlineUnconditionally we'd re-simplify (BIG True)
+    do  { (prepd_floats, rhs1) <- prepareRhs (getMode env) top_lvl (getOccFS new_bndr)
+                                             (idInfo new_bndr) new_rhs
+        ; let floats = emptyFloats env `addLetFloats` prepd_floats
+        ; (rhs_floats, rhs2) <-
+                if doFloatFromRhs NotTopLevel NonRecursive is_strict floats rhs1
+                then    -- Add the floats to the main env
+                     do { tick LetFloatFromLet
+                        ; return (floats, rhs1) }
+                else    -- Do not float; wrap the floats around the RHS
+                     return (emptyFloats env, wrapFloats floats rhs1)
 
-However, if BIG has /not/ already been simplified, we'd /like/ to
-simplify BIG True; maybe good things happen.  That is why
+        ; (bind_float, env2) <- completeBind (env `setInScopeFromF` rhs_floats)
+                                             NotTopLevel Nothing
+                                             old_bndr new_bndr rhs2
+        ; return (rhs_floats `addFloats` bind_float, env2) }
 
-* simplLam has
-    - a case for (isSimplified dup), which goes via simplNonRecX, and
-    - a case for the un-simplified case, which goes via simplNonRecE
 
-* We go to some efforts to avoid unnecessarily simplifying ApplyToVal,
-  in at least two places
-    - In simplCast/addCoerce, where we check for isReflCo
-    - In rebuildCall we avoid simplifying arguments before we have to
-      (see Note [Trying rewrite rules])
+{- *********************************************************************
+*                                                                      *
+           prepareRhs, makeTrivial
+*                                                                      *
+************************************************************************
 
 Note [prepareRhs]
 ~~~~~~~~~~~~~~~~~
@@ -563,57 +381,62 @@ Here we want to make e1,e2 trivial and get
 That's what the 'go' loop in prepareRhs does
 -}
 
-prepareRhs :: TopLevelFlag -> SimplEnv -> OutId -> OutExpr -> SimplM (SimplEnv, OutExpr)
+prepareRhs :: SimplMode -> TopLevelFlag
+           -> FastString   -- Base for any new variables
+           -> IdInfo       -- IdInfo for the LHS of this binding
+           -> OutExpr
+           -> SimplM (LetFloats, OutExpr)
+-- Transforms a RHS into a better RHS by adding floats
+-- e.g        x = Just e
+-- becomes    a = e
+--            x = Just a
 -- See Note [prepareRhs]
--- Adds new floats to the env iff that allows us to return a good RHS
-prepareRhs top_lvl env id (Cast rhs co)    -- Note [Float coercions]
-  | Pair ty1 _ty2 <- coercionKind co       -- Do *not* do this if rhs has an unlifted type
-  , not (isUnliftedType ty1)            -- see Note [Float coercions (unlifted)]
-  = do  { (env', rhs') <- makeTrivialWithInfo top_lvl env (getOccFS id) sanitised_info rhs
-        ; return (env', Cast rhs' co) }
+prepareRhs mode top_lvl occ info (Cast rhs co)  -- Note [Float coercions]
+  | Pair ty1 _ty2 <- coercionKind co         -- Do *not* do this if rhs has an unlifted type
+  , not (isUnliftedType ty1)                 -- see Note [Float coercions (unlifted)]
+  = do  { (floats, rhs') <- makeTrivialWithInfo mode top_lvl occ sanitised_info rhs
+        ; return (floats, Cast rhs' co) }
   where
     sanitised_info = vanillaIdInfo `setStrictnessInfo` strictnessInfo info
-                                   `setDemandInfo` demandInfo info
-    info = idInfo id
+                                   `setDemandInfo`     demandInfo info
 
-prepareRhs top_lvl env0 id rhs0
-  = do  { (_is_exp, env1, rhs1) <- go 0 env0 rhs0
-        ; return (env1, rhs1) }
+prepareRhs mode top_lvl occ _ rhs0
+  = do  { (_is_exp, floats, rhs1) <- go 0 rhs0
+        ; return (floats, rhs1) }
   where
-    go n_val_args env (Cast rhs co)
-        = do { (is_exp, env', rhs') <- go n_val_args env rhs
-             ; return (is_exp, env', Cast rhs' co) }
-    go n_val_args env (App fun (Type ty))
-        = do { (is_exp, env', rhs') <- go n_val_args env fun
-             ; return (is_exp, env', App rhs' (Type ty)) }
-    go n_val_args env (App fun arg)
-        = do { (is_exp, env', fun') <- go (n_val_args+1) env fun
+    go :: Int -> OutExpr -> SimplM (Bool, LetFloats, OutExpr)
+    go n_val_args (Cast rhs co)
+        = do { (is_exp, floats, rhs') <- go n_val_args rhs
+             ; return (is_exp, floats, Cast rhs' co) }
+    go n_val_args (App fun (Type ty))
+        = do { (is_exp, floats, rhs') <- go n_val_args fun
+             ; return (is_exp, floats, App rhs' (Type ty)) }
+    go n_val_args (App fun arg)
+        = do { (is_exp, floats1, fun') <- go (n_val_args+1) fun
              ; case is_exp of
-                True -> do { (env'', arg') <- makeTrivial top_lvl env' (getOccFS id) arg
-                           ; return (True, env'', App fun' arg') }
-                False -> return (False, env, App fun arg) }
-    go n_val_args env (Var fun)
-        = return (is_exp, env, Var fun)
+                False -> return (False, emptyLetFloats, App fun arg)
+                True  -> do { (floats2, arg') <- makeTrivial mode top_lvl occ arg
+                            ; return (True, floats1 `addLetFlts` floats2, App fun' arg') } }
+    go n_val_args (Var fun)
+        = return (is_exp, emptyLetFloats, Var fun)
         where
           is_exp = isExpandableApp fun n_val_args   -- The fun a constructor or PAP
                         -- See Note [CONLIKE pragma] in BasicTypes
                         -- The definition of is_exp should match that in
                         -- OccurAnal.occAnalApp
 
-    go n_val_args env (Tick t rhs)
+    go n_val_args (Tick t rhs)
         -- We want to be able to float bindings past this
         -- tick. Non-scoping ticks don't care.
         | tickishScoped t == NoScope
-        = do { (is_exp, env', rhs') <- go n_val_args env rhs
-             ; return (is_exp, env', Tick t rhs') }
+        = do { (is_exp, floats, rhs') <- go n_val_args rhs
+             ; return (is_exp, floats, Tick t rhs') }
 
         -- On the other hand, for scoping ticks we need to be able to
         -- copy them on the floats, which in turn is only allowed if
         -- we can obtain non-counting ticks.
         | (not (tickishCounts t) || tickishCanSplit t)
-        = do { (is_exp, env', rhs') <- go n_val_args (zapFloats env) rhs
-                    -- env' has the extra let-bindings from
-                    -- the makeTrivial calls in 'go'; no join floats
+        = do { (is_exp, floats, rhs') <- go n_val_args rhs
              ; let tickIt (id, expr)
                        -- we have to take care not to tick top-level literal
                        -- strings. See Note [CoreSyn top-level string literals].
@@ -621,12 +444,11 @@ prepareRhs top_lvl env0 id rhs0
                      = (id, expr)
                      | otherwise
                      = (id, mkTick (mkNoCount t) expr)
-                   floats' = seLetFloats env `addFlts`
-                             mapFloats (seLetFloats env') tickIt
-             ; return (is_exp, env' { seLetFloats = floats' }, Tick t rhs') }
+                   floats' = mapLetFloats floats tickIt
+             ; return (is_exp, floats', Tick t rhs') }
 
-    go _ env other
-        = return (False, env, other)
+    go _ other
+        = return (False, emptyLetFloats, other)
 
 {-
 Note [Float coercions]
@@ -679,50 +501,55 @@ These strange casts can happen as a result of case-of-case
                 (# p,q #) -> p+q
 -}
 
-makeTrivialArg :: SimplEnv -> ArgSpec -> SimplM (SimplEnv, ArgSpec)
-makeTrivialArg env (ValArg e) = do
-    { (env', e') <- makeTrivial NotTopLevel env (fsLit "arg") e
-    ; return (env', ValArg e') }
-makeTrivialArg env arg        = return (env, arg)  -- CastBy, TyArg
-
-makeTrivial :: TopLevelFlag -> SimplEnv
-            -> FastString  -- ^ a "friendly name" to build the new binder from
-            -> OutExpr -> SimplM (SimplEnv, OutExpr)
+makeTrivialArg :: SimplMode -> ArgSpec -> SimplM (LetFloats, ArgSpec)
+makeTrivialArg mode (ValArg e)
+  = do { (floats, e') <- makeTrivial mode NotTopLevel (fsLit "arg") e
+       ; return (floats, ValArg e') }
+makeTrivialArg _ arg
+  = return (emptyLetFloats, arg)  -- CastBy, TyArg
+
+makeTrivial :: SimplMode -> TopLevelFlag
+            -> FastString  -- ^ A "friendly name" to build the new binder from
+            -> OutExpr     -- ^ This expression satisfies the let/app invariant
+            -> SimplM (LetFloats, OutExpr)
 -- Binds the expression to a variable, if it's not trivial, returning the variable
-makeTrivial top_lvl env context expr =
-    makeTrivialWithInfo top_lvl env context vanillaIdInfo expr
-
-makeTrivialWithInfo :: TopLevelFlag -> SimplEnv
-                    -> FastString
-                    -- ^ a "friendly name" to build the new binder from
-                    -> IdInfo -> OutExpr -> SimplM (SimplEnv, OutExpr)
+makeTrivial mode top_lvl context expr
+ = makeTrivialWithInfo mode top_lvl context vanillaIdInfo expr
+
+makeTrivialWithInfo :: SimplMode -> TopLevelFlag
+                    -> FastString  -- ^ a "friendly name" to build the new binder from
+                    -> IdInfo
+                    -> OutExpr     -- ^ This expression satisfies the let/app invariant
+                    -> SimplM (LetFloats, OutExpr)
 -- Propagate strictness and demand info to the new binder
 -- Note [Preserve strictness when floating coercions]
 -- Returned SimplEnv has same substitution as incoming one
-makeTrivialWithInfo top_lvl env context info expr
+makeTrivialWithInfo mode top_lvl occ_fs info expr
   | exprIsTrivial expr                          -- Already trivial
   || not (bindingOk top_lvl expr expr_ty)       -- Cannot trivialise
                                                 --   See Note [Cannot trivialise]
-  = return (env, expr)
-
-  | otherwise           -- See Note [Take care] below
-  = do  { uniq <- getUniqueM
-        ; let name = mkSystemVarName uniq context
-              var = mkLocalIdOrCoVarWithInfo name expr_ty info
-        ; env'  <- completeNonRecX top_lvl env False var var expr
-        ; expr' <- simplVar env' var
-        ; return (env', expr') }
-        -- The simplVar is needed because we're constructing a new binding
-        --     a = rhs
-        -- And if rhs is of form (rhs1 |> co), then we might get
-        --     a1 = rhs1
-        --     a = a1 |> co
-        -- and now a's RHS is trivial and can be substituted out, and that
-        -- is what completeNonRecX will do
-        -- To put it another way, it's as if we'd simplified
-        --    let var = e in var
-  where
-    expr_ty = exprType expr
+  = return (emptyLetFloats, expr)
+
+  | otherwise
+  = do  { (floats, expr1) <- prepareRhs mode top_lvl occ_fs info expr
+        ; if   exprIsTrivial expr1  -- See Note [Trivial after prepareRhs]
+          then return (floats, expr1)
+          else do
+        { uniq <- getUniqueM
+        ; let name = mkSystemVarName uniq occ_fs
+              var  = mkLocalIdOrCoVarWithInfo name expr_ty info
+
+        -- Now something very like completeBind,
+        -- but without the postInlineUnconditinoally part
+        ; (arity, is_bot, expr2) <- tryEtaExpandRhs mode var expr1
+        ; unf <- mkLetUnfolding (sm_dflags mode) top_lvl InlineRhs var expr2
+
+        ; let final_id = addLetBndrInfo var arity is_bot unf
+              bind     = NonRec final_id expr2
+
+        ; return ( floats `addLetFlts` unitLetFloat bind, Var final_id ) }}
+   where
+     expr_ty = exprType expr
 
 bindingOk :: TopLevelFlag -> CoreExpr -> Type -> Bool
 -- True iff we can have a binding of this expression at this level
@@ -731,10 +558,16 @@ bindingOk top_lvl expr expr_ty
   | isTopLevel top_lvl = exprIsTopLevelBindable expr expr_ty
   | otherwise          = True
 
-{-
+{- Note [Trivial after prepareRhs]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we call makeTrival on (e |> co), the recursive use of prepareRhs
+may leave us with
+   { a1 = e }  and   (a1 |> co)
+Now the latter is trivial, so we don't want to let-bind it.
+
 Note [Cannot trivialise]
 ~~~~~~~~~~~~~~~~~~~~~~~~
-Consider tih
+Consider:
    f :: Int -> Addr#
 
    foo :: Bar
@@ -760,7 +593,7 @@ See Note [CoreSyn top-level string literals] in CoreSyn.
 
 ************************************************************************
 *                                                                      *
-\subsection{Completing a lazy binding}
+          Completing a lazy binding
 *                                                                      *
 ************************************************************************
 
@@ -786,22 +619,21 @@ Nor does it do the atomic-argument thing
 
 completeBind :: SimplEnv
              -> TopLevelFlag            -- Flag stuck into unfolding
-             -> RecFlag                 -- Recursive binding?
              -> Maybe SimplCont         -- Required only for join point
              -> InId                    -- Old binder
              -> OutId -> OutExpr        -- New binder and RHS
-             -> SimplM SimplEnv
+             -> SimplM (SimplFloats, SimplEnv)
 -- completeBind may choose to do its work
 --      * by extending the substitution (e.g. let x = y in ...)
 --      * or by adding to the floats in the envt
 --
 -- Binder /can/ be a JoinId
 -- Precondition: rhs obeys the let/app invariant
-completeBind env top_lvl is_rec mb_cont old_bndr new_bndr new_rhs
+completeBind env top_lvl mb_cont old_bndr new_bndr new_rhs
  | isCoVar old_bndr
  = case new_rhs of
-     Coercion co -> return (extendCvSubst env old_bndr co)
-     _           -> return (addNonRec env new_bndr new_rhs)
+     Coercion co -> return (emptyFloats env, extendCvSubst env old_bndr co)
+     _           -> return (mkFloatBind env (NonRec new_bndr new_rhs))
 
  | otherwise
  = ASSERT( isId new_bndr )
@@ -809,87 +641,61 @@ completeBind env top_lvl is_rec mb_cont old_bndr new_bndr new_rhs
             old_unf  = unfoldingInfo old_info
             occ_info = occInfo old_info
 
-        -- Do eta-expansion on the RHS of the binding
-        -- See Note [Eta-expanding at let bindings] in SimplUtils
-      ; (new_arity, final_rhs) <- if isJoinId new_bndr
-                                    then return (manifestArity new_rhs, new_rhs)
-                                         -- Note [Don't eta-expand join points]
-                                    else tryEtaExpandRhs env is_rec
-                                                         new_bndr new_rhs
+         -- Do eta-expansion on the RHS of the binding
+         -- See Note [Eta-expanding at let bindings] in SimplUtils
+      ; (new_arity, is_bot, final_rhs) <- tryEtaExpandRhs (getMode env)
+                                                          new_bndr new_rhs
 
         -- Simplify the unfolding
       ; new_unfolding <- simplLetUnfolding env top_lvl mb_cont old_bndr
                                            final_rhs old_unf
 
-      ; dflags <- getDynFlags
-      ; if postInlineUnconditionally dflags env top_lvl new_bndr occ_info
-                                     final_rhs new_unfolding
+      ; let final_bndr = addLetBndrInfo new_bndr new_arity is_bot new_unfolding
+
+      ; if postInlineUnconditionally env top_lvl final_bndr occ_info final_rhs
 
-                        -- Inline and discard the binding
-        then do  { tick (PostInlineUnconditionally old_bndr)
-                 ; return (extendIdSubst env old_bndr
-                              (DoneEx final_rhs (isJoinId_maybe new_bndr))) }
+        then -- Inline and discard the binding
+             do  { tick (PostInlineUnconditionally old_bndr)
+                 ; return ( emptyFloats env
+                          , extendIdSubst env old_bndr $
+                            DoneEx final_rhs (isJoinId_maybe new_bndr)) }
                 -- Use the substitution to make quite, quite sure that the
                 -- substitution will happen, since we are going to discard the binding
-        else
-   do { let info1 = idInfo new_bndr `setArityInfo` new_arity
-
-              -- Unfolding info: Note [Setting the new unfolding]
-            info2 = info1 `setUnfoldingInfo` new_unfolding
-
-              -- Demand info: Note [Setting the demand info]
-              --
-              -- We also have to nuke demand info if for some reason
-              -- eta-expansion *reduces* the arity of the binding to less
-              -- than that of the strictness sig. This can happen: see Note [Arity decrease].
-            info3 | isEvaldUnfolding new_unfolding
-                    || (case strictnessInfo info2 of
-                          StrictSig dmd_ty -> new_arity < dmdTypeDepth dmd_ty)
-                  = zapDemandInfo info2 `orElse` info2
-                  | otherwise
-                  = info2
-
-              -- Zap call arity info. We have used it by now (via
-              -- `tryEtaExpandRhs`), and the simplifier can invalidate this
-              -- information, leading to broken code later (e.g. #13479)
-            info4 = zapCallArityInfo info3
-
-            final_id = new_bndr `setIdInfo` info4
-
-      ; -- pprTrace "Binding" (ppr final_id <+> ppr new_unfolding) $
-        return (addNonRec env final_id final_rhs) } }
-                -- The addNonRec adds it to the in-scope set too
-
-------------------------------
-addPolyBind :: TopLevelFlag -> SimplEnv -> OutBind -> SimplM SimplEnv
--- Add a new binding to the environment, complete with its unfolding
--- but *do not* do postInlineUnconditionally, because we have already
--- processed some of the scope of the binding
--- We still want the unfolding though.  Consider
---      let
---            x = /\a. let y = ... in Just y
---      in body
--- Then we float the y-binding out (via abstractFloats and addPolyBind)
--- but 'x' may well then be inlined in 'body' in which case we'd like the
--- opportunity to inline 'y' too.
---
--- INVARIANT: the arity is correct on the incoming binders
 
-addPolyBind top_lvl env (NonRec poly_id rhs)
-  = do  { unfolding <- simplLetUnfolding env top_lvl Nothing poly_id rhs
-                                         noUnfolding
-                        -- Assumes that poly_id did not have an INLINE prag
-                        -- which is perhaps wrong.  ToDo: think about this
-        ; let final_id = setIdInfo poly_id $
-                         idInfo poly_id `setUnfoldingInfo` unfolding
+        else -- Keep the binding
+             -- pprTrace "Binding" (ppr final_bndr <+> ppr new_unfolding) $
+             return (mkFloatBind env (NonRec final_bndr final_rhs)) }
+
+addLetBndrInfo :: OutId -> Arity -> Bool -> Unfolding -> OutId
+addLetBndrInfo new_bndr new_arity is_bot new_unf
+  = new_bndr `setIdInfo` info5
+  where
+    info1 = idInfo new_bndr `setArityInfo` new_arity
+
+    -- Unfolding info: Note [Setting the new unfolding]
+    info2 = info1 `setUnfoldingInfo` new_unf
+
+    -- Demand info: Note [Setting the demand info]
+    -- We also have to nuke demand info if for some reason
+    -- eta-expansion *reduces* the arity of the binding to less
+    -- than that of the strictness sig. This can happen: see Note [Arity decrease].
+    info3 | isEvaldUnfolding new_unf
+            || (case strictnessInfo info2 of
+                  StrictSig dmd_ty -> new_arity < dmdTypeDepth dmd_ty)
+          = zapDemandInfo info2 `orElse` info2
+          | otherwise
+          = info2
+
+    -- Bottoming bindings: see Note [Bottoming bindings]
+    info4 | is_bot    = info3 `setStrictnessInfo`
+                        mkClosedStrictSig (replicate new_arity topDmd) exnRes
+          | otherwise = info3
 
-        ; return (addNonRec env final_id rhs) }
+     -- Zap call arity info. We have used it by now (via
+     -- `tryEtaExpandRhs`), and the simplifier can invalidate this
+     -- information, leading to broken code later (e.g. #13479)
+    info5 = zapCallArityInfo info4
 
-addPolyBind _ env bind@(Rec _)
-  = return (extendFloats env bind)
-        -- Hack: letrecs are more awkward, so we extend "by steam"
-        -- without adding unfoldings etc.  At worst this leads to
-        -- more simplifier iterations
 
 {- Note [Arity decrease]
 ~~~~~~~~~~~~~~~~~~~~~~~~
@@ -914,6 +720,26 @@ Here opInt has arity 1; but when we apply the rule its arity drops to 0.
 That's why Specialise goes to a little trouble to pin the right arity
 on specialised functions too.
 
+Note [Bottoming bindings]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+Suppose we have
+   let x = error "urk"
+   in ...(case x of <alts>)...
+or
+   let f = \x. error (x ++ "urk")
+   in ...(case f "foo" of <alts>)...
+
+Then we'd like to drop the dead <alts> immediately.  So it's good to
+propagate the info that x's RHS is bottom to x's IdInfo as rapidly as
+possible.
+
+We use tryEtaExpandRhs on every binding, and it turns ou that the
+arity computation it performs (via CoreArity.findRhsArity) already
+does a simple bottoming-expression analysis.  So all we need to do
+is propagate that info to the binder's IdInfo.
+
+This showed up in Trac #12150; see comment:16.
+
 Note [Setting the demand info]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 If the unfolding is a value, the demand info may
@@ -930,44 +756,6 @@ After inlining f at some of its call sites the original binding may
 (for example) be no longer strictly demanded.
 The solution here is a bit ad hoc...
 
-Note [Don't eta-expand join points]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Similarly to CPR (see Note [Don't CPR join points] in WorkWrap), a join point
-stands well to gain from its outer binding's eta-expansion, and eta-expanding a
-join point is fraught with issues like how to deal with a cast:
-
-    let join $j1 :: IO ()
-             $j1 = ...
-             $j2 :: Int -> IO ()
-             $j2 n = if n > 0 then $j1
-                              else ...
-
-    =>
-
-    let join $j1 :: IO ()
-             $j1 = (\eta -> ...)
-                     `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ())
-                                 ~  IO ()
-             $j2 :: Int -> IO ()
-             $j2 n = (\eta -> if n > 0 then $j1
-                                       else ...)
-                     `cast` N:IO :: State# RealWorld -> (# State# RealWorld, ())
-                                 ~  IO ()
-
-The cast here can't be pushed inside the lambda (since it's not casting to a
-function type), so the lambda has to stay, but it can't because it contains a
-reference to a join point. In fact, $j2 can't be eta-expanded at all. Rather
-than try and detect this situation (and whatever other situations crop up!), we
-don't bother; again, any surrounding eta-expansion will improve these join
-points anyway, since an outer cast can *always* be pushed inside. By the time
-CorePrep comes around, the code is very likely to look more like this:
-
-    let join $j1 :: State# RealWorld -> (# State# RealWorld, ())
-             $j1 = (...) eta
-             $j2 :: Int -> State# RealWorld -> (# State# RealWorld, ())
-             $j2 = if n > 0 then $j1
-                            else (...) eta
 
 ************************************************************************
 *                                                                      *
@@ -1033,17 +821,17 @@ simplExprC :: SimplEnv
         -- Simplify an expression, given a continuation
 simplExprC env expr cont
   = -- pprTrace "simplExprC" (ppr expr $$ ppr cont {- $$ ppr (seIdSubst env) -} $$ ppr (seLetFloats env) ) $
-    do  { (env', expr') <- simplExprF (zapFloats env) expr cont
+    do  { (floats, expr') <- simplExprF env expr cont
         ; -- pprTrace "simplExprC ret" (ppr expr $$ ppr expr') $
           -- pprTrace "simplExprC ret3" (ppr (seInScope env')) $
           -- pprTrace "simplExprC ret4" (ppr (seLetFloats env')) $
-          return (wrapFloats env' expr') }
+          return (wrapFloats floats expr') }
 
 --------------------------------------------------
 simplExprF :: SimplEnv
            -> InExpr     -- A term-valued expression, never (Type ty)
            -> SimplCont
-           -> SimplM (SimplEnv, OutExpr)
+           -> SimplM (SimplFloats, OutExpr)
 
 simplExprF env e cont
   = {- pprTrace "simplExprF" (vcat
@@ -1053,12 +841,11 @@ simplExprF env e cont
       , text "tvsubst =" <+> ppr (seTvSubst env)
       , text "idsubst =" <+> ppr (seIdSubst env)
       , text "cvsubst =" <+> ppr (seCvSubst env)
-      {- , ppr (seLetFloats env) -}
       ]) $ -}
     simplExprF1 env e cont
 
 simplExprF1 :: SimplEnv -> InExpr -> SimplCont
-            -> SimplM (SimplEnv, OutExpr)
+            -> SimplM (SimplFloats, OutExpr)
 
 simplExprF1 _ (Type ty) _
   = pprPanic "simplExprF: type" (ppr ty)
@@ -1115,18 +902,15 @@ simplExprF1 env expr@(Lam {}) cont
           | otherwise = zapLamIdInfo b
 
 simplExprF1 env (Case scrut bndr _ alts) cont
-  | sm_case_case (getMode env)
   = simplExprF env scrut (Select { sc_dup = NoDup, sc_bndr = bndr
                                  , sc_alts = alts
                                  , sc_env = env, sc_cont = cont })
-  | otherwise
-  = do { (env', scrut') <- simplExprF (zapFloats env) scrut $
-                           mkBoringStop (substTy env (idType bndr))
-       ; let scrut'' = wrapJoinFloats (seJoinFloats env') scrut'
-             env''   = env `addLetFloats` env'
-       ; rebuildCase env'' scrut'' bndr alts cont }
 
 simplExprF1 env (Let (Rec pairs) body) cont
+  | Just pairs' <- joinPointBindings_maybe pairs
+  = simplRecJoinPoint env pairs' body cont
+
+  | otherwise
   = simplRecE env pairs body cont
 
 simplExprF1 env (Let (NonRec bndr rhs) body) cont
@@ -1135,6 +919,9 @@ simplExprF1 env (Let (NonRec bndr rhs) body) cont
     do { ty' <- simplType env ty
        ; simplExprF (extendTvSubst env bndr ty') body cont }
 
+  | Just (bndr', rhs') <- joinPointBinding_maybe bndr rhs
+  = simplNonRecJoinPoint env bndr' rhs' body cont
+
   | otherwise
   = simplNonRecE env bndr (rhs, env) ([], body) cont
 
@@ -1212,7 +999,7 @@ simplType env ty
 
 ---------------------------------
 simplCoercionF :: SimplEnv -> InCoercion -> SimplCont
-               -> SimplM (SimplEnv, OutExpr)
+               -> SimplM (SimplFloats, OutExpr)
 simplCoercionF env co cont
   = do { co' <- simplCoercion env co
        ; rebuild env (Coercion co') cont }
@@ -1228,7 +1015,7 @@ simplCoercion env co
 -- optimisations apply.
 
 simplTick :: SimplEnv -> Tickish Id -> InExpr -> SimplCont
-          -> SimplM (SimplEnv, OutExpr)
+          -> SimplM (SimplFloats, OutExpr)
 simplTick env tickish expr cont
   -- A scoped tick turns into a continuation, so that we can spot
   -- (scc t (\x . e)) in simplLam and eliminate the scc.  If we didn't do
@@ -1255,8 +1042,8 @@ simplTick env tickish expr cont
   -- application context, allowing the normal case and application
   -- optimisations to fire.
   | tickish `tickishScopesLike` SoftScope
-  = do { (env', expr') <- simplExprF env expr cont
-       ; return (env', mkTick tickish expr')
+  = do { (floats, expr') <- simplExprF env expr cont
+       ; return (floats, mkTick tickish expr')
        }
 
   -- Push tick inside if the context looks like this will allow us to
@@ -1294,8 +1081,8 @@ simplTick env tickish expr cont
 
   no_floating_past_tick =
     do { let (inc,outc) = splitCont cont
-       ; (env1, expr1) <- simplExprF (zapFloats env) expr inc
-       ; let expr2    = wrapFloats env1 expr1
+       ; (floats, expr1) <- simplExprF env expr inc
+       ; let expr2    = wrapFloats floats expr1
              tickish' = simplTickish env tickish
        ; rebuild env (mkTick tickish' expr2) outc
        }
@@ -1377,27 +1164,28 @@ simplTick env tickish expr cont
 ************************************************************************
 -}
 
-rebuild :: SimplEnv -> OutExpr -> SimplCont -> SimplM (SimplEnv, OutExpr)
--- At this point the substitution in the SimplEnv should be irrelevant
--- only the in-scope set and floats should matter
+rebuild :: SimplEnv -> OutExpr -> SimplCont -> SimplM (SimplFloats, OutExpr)
+-- At this point the substitution in the SimplEnv should be irrelevant;
+-- only the in-scope set matters
 rebuild env expr cont
   = case cont of
-      Stop {}          -> return (env, expr)
+      Stop {}          -> return (emptyFloats env, expr)
       TickIt t cont    -> rebuild env (mkTick t expr) cont
       CastIt co cont   -> rebuild env (mkCast expr co) cont
                        -- NB: mkCast implements the (Coercion co |> g) optimisation
 
       Select { sc_bndr = bndr, sc_alts = alts, sc_env = se, sc_cont = cont }
-        -> rebuildCase (se `setFloats` env) expr bndr alts cont
+        -> rebuildCase (se `setInScopeFromE` env) expr bndr alts cont
 
       StrictArg { sc_fun = fun, sc_cont = cont }
         -> rebuildCall env (fun `addValArgTo` expr) cont
       StrictBind { sc_bndr = b, sc_bndrs = bs, sc_body = body
                  , sc_env = se, sc_cont = cont }
-        -> do { env' <- simplNonRecX (se `setFloats` env) b expr
-                             -- expr satisfies let/app since it started life
-                             -- in a call to simplNonRecE
-              ; simplLam env' bs body cont }
+        -> do { (floats1, env') <- simplNonRecX (se `setInScopeFromE` env) b expr
+                                  -- expr satisfies let/app since it started life
+                                  -- in a call to simplNonRecE
+              ; (floats2, expr') <- simplLam env' bs body cont
+              ; return (floats1 `addFloats` floats2, expr') }
 
       ApplyToTy  { sc_arg_ty = ty, sc_cont = cont}
         -> rebuild env (App expr (Type ty)) cont
@@ -1416,7 +1204,7 @@ rebuild env expr cont
 -}
 
 simplCast :: SimplEnv -> InExpr -> Coercion -> SimplCont
-          -> SimplM (SimplEnv, OutExpr)
+          -> SimplM (SimplFloats, OutExpr)
 simplCast env body co0 cont0
   = do  { co1   <- simplCoercion env co0
         ; cont1 <- addCoerce co1 cont0
@@ -1470,7 +1258,7 @@ simplArg env dup_flag arg_env arg
   | isSimplified dup_flag
   = return (dup_flag, arg_env, arg)
   | otherwise
-  = do { arg' <- simplExpr (arg_env `setInScopeAndZapFloats` env) arg
+  = do { arg' <- simplExpr (arg_env `setInScopeFromE` env) arg
        ; return (Simplified, zapSubstEnv arg_env, arg') }
 
 {-
@@ -1479,27 +1267,13 @@ simplArg env dup_flag arg_env arg
 \subsection{Lambdas}
 *                                                                      *
 ************************************************************************
-
-Note [Zap unfolding when beta-reducing]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Lambda-bound variables can have stable unfoldings, such as
-   $j = \x. \b{Unf=Just x}. e
-See Note [Case binders and join points] below; the unfolding for lets
-us optimise e better.  However when we beta-reduce it we want to
-revert to using the actual value, otherwise we can end up in the
-stupid situation of
-          let x = blah in
-          let b{Unf=Just x} = y
-          in ...b...
-Here it'd be far better to drop the unfolding and use the actual RHS.
 -}
 
 simplLam :: SimplEnv -> [InId] -> InExpr -> SimplCont
-         -> SimplM (SimplEnv, OutExpr)
-
-simplLam env [] body cont = simplExprF env body cont
+         -> SimplM (SimplFloats, OutExpr)
 
-        -- Beta reduction
+simplLam env [] body cont
+  = simplExprF env body cont
 
 simplLam env (bndr:bndrs) body (ApplyToTy { sc_arg_ty = arg_ty, sc_cont = cont })
   = do { tick (BetaReduction bndr)
@@ -1510,8 +1284,9 @@ simplLam env (bndr:bndrs) body (ApplyToVal { sc_arg = arg, sc_env = arg_se
   | isSimplified dup  -- Don't re-simplify if we've simplified it once
                       -- See Note [Avoiding exponential behaviour]
   = do  { tick (BetaReduction bndr)
-        ; env' <- simplNonRecX env zapped_bndr arg
-        ; simplLam env' bndrs body cont }
+        ; (floats1, env') <- simplNonRecX env zapped_bndr arg
+        ; (floats2, expr') <- simplLam env' bndrs body cont
+        ; return (floats1 `addFloats` floats2, expr') }
 
   | otherwise
   = do  { tick (BetaReduction bndr)
@@ -1521,7 +1296,7 @@ simplLam env (bndr:bndrs) body (ApplyToVal { sc_arg = arg, sc_env = arg_se
       | isId bndr = zapStableUnfolding bndr
       | otherwise = bndr
 
-      -- discard a non-counting tick on a lambda.  This may change the
+      -- Discard a non-counting tick on a lambda.  This may change the
       -- cost attribution slightly (moving the allocation of the
       -- lambda elsewhere), but we don't care: optimisation changes
       -- cost attribution all the time.
@@ -1536,9 +1311,6 @@ simplLam env bndrs body cont
         ; new_lam <- mkLam env bndrs' body' cont
         ; rebuild env' new_lam cont }
 
-simplLamBndrs :: SimplEnv -> [InBndr] -> SimplM (SimplEnv, [OutBndr])
-simplLamBndrs env bndrs = mapAccumLM simplLamBndr env bndrs
-
 -------------
 simplLamBndr :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr)
 -- Used for lambda binders.  These sometimes have unfoldings added by
@@ -1550,7 +1322,7 @@ simplLamBndr :: SimplEnv -> InBndr -> SimplM (SimplEnv, OutBndr)
 simplLamBndr env bndr
   | isId bndr && isFragileUnfolding old_unf   -- Special case
   = do { (env1, bndr1) <- simplBinder env bndr
-       ; unf'          <- simplUnfolding env1 NotTopLevel Nothing bndr old_unf
+       ; unf'          <- simplStableUnfolding env1 NotTopLevel Nothing bndr old_unf
        ; let bndr2 = bndr1 `setIdUnfolding` unf'
        ; return (modifyInScope env1 bndr2, bndr2) }
 
@@ -1559,18 +1331,21 @@ simplLamBndr env bndr
   where
     old_unf = idUnfolding bndr
 
+simplLamBndrs :: SimplEnv -> [InBndr] -> SimplM (SimplEnv, [OutBndr])
+simplLamBndrs env bndrs = mapAccumLM simplLamBndr env bndrs
+
 ------------------
 simplNonRecE :: SimplEnv
              -> InId                    -- The binder, always an Id
-                                        -- Can be a join point
+                                        -- Never a join point
              -> (InExpr, SimplEnv)      -- Rhs of binding (or arg of lambda)
              -> ([InBndr], InExpr)      -- Body of the let/lambda
                                         --      \xs.e
              -> SimplCont
-             -> SimplM (SimplEnv, OutExpr)
+             -> SimplM (SimplFloats, OutExpr)
 
 -- simplNonRecE is used for
---  * non-top-level non-recursive lets in expressions
+--  * non-top-level non-recursive non-join-point lets in expressions
 --  * beta reduction
 --
 -- simplNonRec env b (rhs, rhs_se) (bs, body) k
@@ -1589,74 +1364,249 @@ simplNonRecE :: SimplEnv
 --       the call to simplLam in simplExprF (Lam ...)
 
 simplNonRecE env bndr (rhs, rhs_se) (bndrs, body) cont
-  = ASSERT( isId bndr )
-    do dflags <- getDynFlags
-       case () of
-         _ | preInlineUnconditionally dflags env NotTopLevel bndr rhs
-           -> do { tick (PreInlineUnconditionally bndr)
-                 ; -- pprTrace "preInlineUncond" (ppr bndr <+> ppr rhs) $
-                  simplLam (extendIdSubst env bndr (mkContEx rhs_se rhs)) bndrs body cont }
-
-           -- Deal with join points
-           | Just (bndr', rhs') <- joinPointBinding_maybe bndr rhs
-           -> ASSERT( null bndrs )  -- Must be a let-binding;
-                                    -- join points are never lambda-bound
-              do { (env1, cont') <- prepareJoinCont env cont
-
-                   -- We push cont_dup into the join RHS and the body;
-                   -- and wrap cont_nodup around the whole thing
-                 ; let res_ty = contResultType cont'
-                 ; (env2, bndr1) <- simplNonRecJoinBndr env1 res_ty bndr'
-                 ; (env3, bndr2) <- addBndrRules env2 bndr' bndr1
-                 ; env4 <- simplJoinBind env3 NonRecursive cont'
-                                         bndr' bndr2 rhs' rhs_se
-                 ; simplExprF env4 body cont' }
-
-           -- Deal with strict bindings
-           | isStrictId bndr          -- Includes coercions
-           , sm_case_case (getMode env)
-           -> simplExprF (rhs_se `setFloats` env) rhs
-                         (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs, sc_body = body
-                                     , sc_env = env, sc_cont = cont, sc_dup = NoDup })
-
-           -- Deal with lazy bindings
-           | otherwise
-           -> ASSERT( not (isTyVar bndr) )
-              do { (env1, bndr1) <- simplNonRecBndr env bndr
-                 ; (env2, bndr2) <- addBndrRules env1 bndr bndr1
-                 ; env3 <- simplLazyBind env2 NotTopLevel NonRecursive bndr bndr2 rhs rhs_se
-                 ; simplLam env3 bndrs body cont }
+  | ASSERT( isId bndr && not (isJoinId bndr) )
+    preInlineUnconditionally env NotTopLevel bndr rhs
+  = do { tick (PreInlineUnconditionally bndr)
+       ; -- pprTrace "preInlineUncond" (ppr bndr <+> ppr rhs) $
+         simplLam (extendIdSubst env bndr (mkContEx rhs_se rhs)) bndrs body cont }
+
+  -- Deal with strict bindings
+  | isStrictId bndr          -- Includes coercions
+  , sm_case_case (getMode env)
+  = simplExprF (rhs_se `setInScopeFromE` env) rhs
+               (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs, sc_body = body
+                           , sc_env = env, sc_cont = cont, sc_dup = NoDup })
+
+  -- Deal with lazy bindings
+  | otherwise
+  = ASSERT( not (isTyVar bndr) )
+    do { (env1, bndr1) <- simplNonRecBndr env bndr
+       ; (env2, bndr2) <- addBndrRules env1 bndr bndr1
+       ; (floats1, env3) <- simplLazyBind env2 NotTopLevel NonRecursive bndr bndr2 rhs rhs_se
+       ; (floats2, expr') <- simplLam env3 bndrs body cont
+       ; return (floats1 `addFloats` floats2, expr') }
 
 ------------------
 simplRecE :: SimplEnv
           -> [(InId, InExpr)]
           -> InExpr
           -> SimplCont
-          -> SimplM (SimplEnv, OutExpr)
+          -> SimplM (SimplFloats, OutExpr)
 
 -- simplRecE is used for
 --  * non-top-level recursive lets in expressions
 simplRecE env pairs body cont
-  | Just pairs' <- joinPointBindings_maybe pairs
-  = do  { (env1, cont') <- prepareJoinCont env cont
-        ; let bndrs' = map fst pairs'
-              res_ty = contResultType cont
-        ; env2 <- simplRecJoinBndrs env1 res_ty bndrs'
-                -- NB: bndrs' don't have unfoldings or rules
-                -- We add them as we go down
-        ; env3 <- simplRecBind env2 NotTopLevel (Just cont') pairs'
-        ; simplExprF env3 body cont' }
-
-  | otherwise
   = do  { let bndrs = map fst pairs
         ; MASSERT(all (not . isJoinId) bndrs)
         ; env1 <- simplRecBndrs env bndrs
                 -- NB: bndrs' don't have unfoldings or rules
                 -- We add them as we go down
-        ; env2 <- simplRecBind env1 NotTopLevel Nothing pairs
-        ; simplExprF env2 body cont }
+        ; (floats1, env2) <- simplRecBind env1 NotTopLevel Nothing pairs
+        ; (floats2, expr') <- simplExprF env2 body cont
+        ; return (floats1 `addFloats` floats2, expr') }
+
+{- Note [Avoiding exponential behaviour]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+One way in which we can get exponential behaviour is if we simplify a
+big expression, and the re-simplify it -- and then this happens in a
+deeply-nested way.  So we must be jolly careful about re-simplifying
+an expression.  That is why completeNonRecX does not try
+preInlineUnconditionally.
+
+Example:
+  f BIG, where f has a RULE
+Then
+ * We simplify BIG before trying the rule; but the rule does not fire
+ * We inline f = \x. x True
+ * So if we did preInlineUnconditionally we'd re-simplify (BIG True)
+
+However, if BIG has /not/ already been simplified, we'd /like/ to
+simplify BIG True; maybe good things happen.  That is why
+
+* simplLam has
+    - a case for (isSimplified dup), which goes via simplNonRecX, and
+    - a case for the un-simplified case, which goes via simplNonRecE
+
+* We go to some efforts to avoid unnecessarily simplifying ApplyToVal,
+  in at least two places
+    - In simplCast/addCoerce, where we check for isReflCo
+    - In rebuildCall we avoid simplifying arguments before we have to
+      (see Note [Trying rewrite rules])
+
+
+Note [Zap unfolding when beta-reducing]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Lambda-bound variables can have stable unfoldings, such as
+   $j = \x. \b{Unf=Just x}. e
+See Note [Case binders and join points] below; the unfolding for lets
+us optimise e better.  However when we beta-reduce it we want to
+revert to using the actual value, otherwise we can end up in the
+stupid situation of
+          let x = blah in
+          let b{Unf=Just x} = y
+          in ...b...
+Here it'd be far better to drop the unfolding and use the actual RHS.
+
+************************************************************************
+*                                                                      *
+                     Join points
+*                                                                      *
+********************************************************************* -}
+
+simplNonRecJoinPoint :: SimplEnv -> InId -> InExpr
+                     -> InExpr -> SimplCont
+                     -> SimplM (SimplFloats, OutExpr)
+simplNonRecJoinPoint env bndr rhs body cont
+  | ASSERT( isJoinId bndr )
+    preInlineUnconditionally env NotTopLevel bndr rhs
+  = do { tick (PreInlineUnconditionally bndr)
+       ; simplExprF (extendIdSubst env bndr (mkContEx env rhs)) body cont }
+
+   | otherwise
+   = wrapJoinCont env cont $ \ cont ->
+     do { -- We push join_cont into the join RHS and the body;
+          -- and wrap wrap_cont around the whole thing
+        ; let res_ty = contResultType cont
+        ; (env1, bndr1)    <- simplNonRecJoinBndr env res_ty bndr
+        ; (env2, bndr2)    <- addBndrRules env1 bndr bndr1
+        ; (floats1, env3)  <- simplJoinBind env2 cont bndr bndr2 rhs
+        ; (floats2, body') <- simplExprF env3 body cont
+        ; return (floats1 `addFloats` floats2, body') }
+
+
+------------------
+simplRecJoinPoint :: SimplEnv -> [(InId, InExpr)]
+                  -> InExpr -> SimplCont
+                  -> SimplM (SimplFloats, OutExpr)
+simplRecJoinPoint env pairs body cont
+  = wrapJoinCont env cont $ \ cont ->
+    do { let bndrs = map fst pairs
+             res_ty = contResultType cont
+       ; env1 <- simplRecJoinBndrs env res_ty bndrs
+               -- NB: bndrs' don't have unfoldings or rules
+               -- We add them as we go down
+       ; (floats1, env2)  <- simplRecBind env1 NotTopLevel (Just cont) pairs
+       ; (floats2, body') <- simplExprF env2 body cont
+       ; return (floats1 `addFloats` floats2, body') }
+
+--------------------
+wrapJoinCont :: SimplEnv -> SimplCont
+             -> (SimplCont -> SimplM (SimplFloats, OutExpr))
+             -> SimplM (SimplFloats, OutExpr)
+-- Deal with making the continuation duplicable if necessary,
+-- and with the no-case-of-case situation.
+wrapJoinCont env cont thing_inside
+  | contIsStop cont        -- Common case; no need for fancy footwork
+  = thing_inside cont
+
+  | not (sm_case_case (getMode env))
+    -- See Note [Join points wih -fno-case-of-case]
+  = do { (floats1, expr1) <- thing_inside (mkBoringStop (contHoleType cont))
+       ; let (floats2, expr2) = wrapJoinFloatsX floats1 expr1
+       ; (floats3, expr3) <- rebuild (env `setInScopeFromF` floats2) expr2 cont
+       ; return (floats2 `addFloats` floats3, expr3) }
+
+  | otherwise
+    -- Normal case; see Note [Join points and case-of-case]
+  = do { (floats1, cont')  <- mkDupableCont env cont
+       ; (floats2, result) <- thing_inside cont'
+       ; return (floats1 `addFloats` floats2, result) }
+
+
+--------------------
+trimJoinCont :: Id -> Maybe JoinArity -> SimplCont -> SimplCont
+-- Drop outer context from join point invocation (jump)
+-- See Note [Join points and case-of-case]
+
+trimJoinCont _ Nothing cont
+  = cont -- Not a jump
+trimJoinCont var (Just arity) cont
+  = trim arity cont
+  where
+    trim 0 cont@(Stop {})
+      = cont
+    trim 0 cont
+      = mkBoringStop (contResultType cont)
+    trim n cont@(ApplyToVal { sc_cont = k })
+      = cont { sc_cont = trim (n-1) k }
+    trim n cont@(ApplyToTy { sc_cont = k })
+      = cont { sc_cont = trim (n-1) k } -- join arity counts types!
+    trim _ cont
+      = pprPanic "completeCall" $ ppr var $$ ppr cont
+
+
+{- Note [Join points and case-of-case]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When we perform the case-of-case transform (or otherwise push continuations
+inward), we want to treat join points specially. Since they're always
+tail-called and we want to maintain this invariant, we can do this (for any
+evaluation context E):
+
+  E[join j = e
+    in case ... of
+         A -> jump j 1
+         B -> jump j 2
+         C -> f 3]
+
+    -->
+
+  join j = E[e]
+  in case ... of
+       A -> jump j 1
+       B -> jump j 2
+       C -> E[f 3]
+
+As is evident from the example, there are two components to this behavior:
+
+  1. When entering the RHS of a join point, copy the context inside.
+  2. When a join point is invoked, discard the outer context.
+
+We need to be very careful here to remain consistent---neither part is
+optional!
+
+We need do make the continuation E duplicable (since we are duplicating it)
+with mkDuableCont.
+
+
+Note [Join points wih -fno-case-of-case]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Supose case-of-case is switched off, and we are simplifying
+
+    case (join j x = <j-rhs> in
+          case y of
+             A -> j 1
+             B -> j 2
+             C -> e) of <outer-alts>
+
+Usually, we'd push the outer continuation (case . of <outer-alts>) into
+both the RHS and the body of the join point j.  But since we aren't doing
+case-of-case we may then end up with this totally bogus result
+
+    join x = case <j-rhs> of <outer-alts> in
+    case (case y of
+             A -> j 1
+             B -> j 2
+             C -> e) of <outer-alts>
+
+This would be OK in the language of the paper, but not in GHC: j is no longer
+a join point.  We can only do the "push contination into the RHS of the
+join point j" if we also push the contination right down to the /jumps/ to
+j, so that it can evaporate there.  If we are doing case-of-case, we'll get to
+
+    join x = case <j-rhs> of <outer-alts> in
+    case y of
+      A -> j 1
+      B -> j 2
+      C -> case e of <outer-alts>
+
+which is great.
+
+Bottom line: if case-of-case is off, we must stop pushing the continuation
+inwards altogether at any join point.  Instead simplify the (join ... in ...)
+with a Stop continuation, and wrap the original continuation around the
+outside.  Surprisingly tricky!
+
 
-{-
 ************************************************************************
 *                                                                      *
                      Variables
@@ -1675,67 +1625,53 @@ simplVar env var
         DoneId var1          -> return (Var var1)
         DoneEx e _           -> return e
 
-simplIdF :: SimplEnv -> InId -> SimplCont -> SimplM (SimplEnv, OutExpr)
+simplIdF :: SimplEnv -> InId -> SimplCont -> SimplM (SimplFloats, OutExpr)
 simplIdF env var cont
   = case substId env var of
-        ContEx tvs cvs ids e -> simplExprF (setSubstEnv env tvs cvs ids) e cont
-                                  -- Don't trim; haven't already simplified e,
-                                  -- so the cont is not embodied in e
-
-        DoneId var1      -> completeCall env var1 (trim_cont (isJoinId_maybe var1))
-        DoneEx e mb_join -> simplExprF (zapSubstEnv env) e (trim_cont mb_join)
-                -- Note [zapSubstEnv]
-                -- The template is already simplified, so don't re-substitute.
-                -- This is VITAL.  Consider
-                --      let x = e in
-                --      let y = \z -> ...x... in
-                --      \ x -> ...y...
-                -- We'll clone the inner \x, adding x->x' in the id_subst
-                -- Then when we inline y, we must *not* replace x by x' in
-                -- the inlined copy!!
-  where
-    trim_cont (Just arity) = trim arity cont
-    trim_cont Nothing      = cont
-
-    -- Drop outer context from join point invocation
-    -- Note [Case-of-case and join points]
-    trim 0 cont@(Stop {})
-      = cont
-    trim 0 cont
-      = mkBoringStop (contResultType cont)
-    trim n cont@(ApplyToVal { sc_cont = k })
-      = cont { sc_cont = trim (n-1) k }
-    trim n cont@(ApplyToTy { sc_cont = k })
-      = cont { sc_cont = trim (n-1) k } -- join arity counts types!
-    trim _ cont
-      = pprPanic "completeCall" $ ppr var $$ ppr cont
+      ContEx tvs cvs ids e -> simplExprF (setSubstEnv env tvs cvs ids) e cont
+                                -- Don't trim; haven't already simplified e,
+                                -- so the cont is not embodied in e
+
+      DoneId var1 -> completeCall env var1 (trimJoinCont var (isJoinId_maybe var1) cont)
+
+      DoneEx e mb_join -> simplExprF (zapSubstEnv env) e (trimJoinCont var mb_join cont)
+              -- Note [zapSubstEnv]
+              -- The template is already simplified, so don't re-substitute.
+              -- This is VITAL.  Consider
+              --      let x = e in
+              --      let y = \z -> ...x... in
+              --      \ x -> ...y...
+              -- We'll clone the inner \x, adding x->x' in the id_subst
+              -- Then when we inline y, we must *not* replace x by x' in
+              -- the inlined copy!!
 
 ---------------------------------------------------------
 --      Dealing with a call site
 
-completeCall :: SimplEnv -> OutId -> SimplCont -> SimplM (SimplEnv, OutExpr)
+completeCall :: SimplEnv -> OutId -> SimplCont -> SimplM (SimplFloats, OutExpr)
 completeCall env var cont
-  = do  {   ------------- Try inlining ----------------
-          dflags <- getDynFlags
-        ; let  (lone_variable, arg_infos, call_cont) = contArgs cont
-               n_val_args = length arg_infos
-               interesting_cont = interestingCallContext call_cont
-               unfolding    = activeUnfolding env var
-               maybe_inline = callSiteInline dflags var unfolding
-                                             lone_variable arg_infos interesting_cont
-        ; case maybe_inline of
-            Just expr      -- There is an inlining!
-              ->  do { checkedTick (UnfoldingDone var)
-                     ; dump_inline dflags expr cont
-                     ; simplExprF (zapSubstEnv env) expr cont }
-
-            ; Nothing -> do { rule_base <- getSimplRules
-                            ; let info = mkArgInfo var (getRules rule_base var)
-                                                   n_val_args call_cont
-                            ; rebuildCall env info cont }
-    }
+  | Just expr <- callSiteInline dflags var unfolding
+                                lone_variable arg_infos interesting_cont
+  -- Inline the variable's RHS
+  = do { checkedTick (UnfoldingDone var)
+       ; dump_inline expr cont
+       ; simplExprF (zapSubstEnv env) expr cont }
+
+  | otherwise
+  -- Don't inline; instead rebuild the call
+  = do { rule_base <- getSimplRules
+       ; let info = mkArgInfo var (getRules rule_base var)
+                              n_val_args call_cont
+       ; rebuildCall env info cont }
+
   where
-    dump_inline dflags unfolding cont
+    dflags = seDynFlags env
+    (lone_variable, arg_infos, call_cont) = contArgs cont
+    n_val_args       = length arg_infos
+    interesting_cont = interestingCallContext env call_cont
+    unfolding        = activeUnfolding env var
+
+    dump_inline unfolding cont
       | not (dopt Opt_D_dump_inlinings dflags) = return ()
       | not (dopt Opt_D_verbose_core2core dflags)
       = when (isExternalName (idName var)) $
@@ -1750,7 +1686,7 @@ completeCall env var cont
 rebuildCall :: SimplEnv
             -> ArgInfo
             -> SimplCont
-            -> SimplM (SimplEnv, OutExpr)
+            -> SimplM (SimplFloats, OutExpr)
 -- We decided not to inline, so
 --    - simplify the arguments
 --    - try rewrite rules
@@ -1772,7 +1708,7 @@ rebuildCall env (ArgInfo { ai_fun = fun, ai_args = rev_args, ai_strs = [] }) con
                                  -- continuation to discard, else we do it
                                  -- again and again!
   = seqType cont_ty `seq`        -- See Note [Avoiding space leaks in OutType]
-    return (env, castBottomExpr res cont_ty)
+    return (emptyFloats env, castBottomExpr res cont_ty)
   where
     res     = argInfoExpr fun rev_args
     cont_ty = contResultType cont
@@ -1811,10 +1747,10 @@ rebuildCall env info@(ArgInfo { ai_encl = encl_rules, ai_type = fun_ty
   | isSimplified dup_flag     -- See Note [Avoid redundant simplification]
   = rebuildCall env (addValArgTo info' arg) cont
 
-  | str                 -- Strict argument
+  | str         -- Strict argument
   , sm_case_case (getMode env)
   = -- pprTrace "Strict Arg" (ppr arg $$ ppr (seIdSubst env) $$ ppr (seInScope env)) $
-    simplExprF (arg_se `setFloats` env) arg
+    simplExprF (arg_se `setInScopeFromE` env) arg
                (StrictArg { sc_fun = info', sc_cci = cci_strict
                           , sc_dup = Simplified, sc_cont = cont })
                 -- Note [Shadowing]
@@ -1824,7 +1760,7 @@ rebuildCall env info@(ArgInfo { ai_encl = encl_rules, ai_type = fun_ty
         -- There is no benefit (unlike in a let-binding), and we'd
         -- have to be very careful about bogus strictness through
         -- floating a demanded let.
-  = do  { arg' <- simplExprC (arg_se `setInScopeAndZapFloats` env) arg
+  = do  { arg' <- simplExprC (arg_se `setInScopeFromE` env) arg
                              (mkLazyArgStop arg_ty cci_lazy)
         ; rebuildCall env (addValArgTo info' arg') cont }
   where
@@ -1935,13 +1871,13 @@ tryRules :: SimplEnv -> [CoreRule]
 tryRules env rules fn args call_cont
   | null rules
   = return Nothing
+
 {- Disabled until we fix #8326
   | fn `hasKey` tagToEnumKey   -- See Note [Optimising tagToEnum#]
   , [_type_arg, val_arg] <- args
   , Select dup bndr ((_,[],rhs1) : rest_alts) se cont <- call_cont
   , isDeadBinder bndr
-  = do { dflags <- getDynFlags
-       ; let enum_to_tag :: CoreAlt -> CoreAlt
+  = do { let enum_to_tag :: CoreAlt -> CoreAlt
                 -- Takes   K -> e  into   tagK# -> e
                 -- where tagK# is the tag of constructor K
              enum_to_tag (DataAlt con, [], rhs)
@@ -1956,35 +1892,39 @@ tryRules env rules fn args call_cont
                  -- The binder is dead, but should have the right type
       ; return (Just (val_arg, Select dup new_bndr new_alts se cont)) }
 -}
-  | otherwise
-  = do { dflags <- getDynFlags
-       ; case lookupRule dflags (getUnfoldingInRuleMatch env) (activeRule env)
-                         fn (argInfoAppArgs args) rules of {
-           Nothing ->
-             do { nodump dflags  -- This ensures that an empty file is written
-                ; return Nothing } ;  -- No rule matches
-           Just (rule, rule_rhs) ->
-             do { checkedTick (RuleFired (ruleName rule))
-                ; let cont' = pushSimplifiedArgs zapped_env
-                                                 (drop (ruleArity rule) args)
-                                                 call_cont
-                              -- (ruleArity rule) says how
-                              -- many args the rule consumed
-
-                      occ_anald_rhs = occurAnalyseExpr rule_rhs
-                          -- See Note [Occurrence-analyse after rule firing]
-                ; dump dflags rule rule_rhs
-                ; return (Just (zapped_env, occ_anald_rhs, cont')) }}}
-                     -- The occ_anald_rhs and cont' are all Out things
-                     -- hence zapping the environment
+
+  | Just (rule, rule_rhs) <- lookupRule dflags (getUnfoldingInRuleMatch env)
+                                        (activeRule env) fn
+                                        (argInfoAppArgs args) rules
+  -- Fire a rule for the function
+  = do { checkedTick (RuleFired (ruleName rule))
+       ; let cont' = pushSimplifiedArgs zapped_env
+                                        (drop (ruleArity rule) args)
+                                        call_cont
+                     -- (ruleArity rule) says how
+                     -- many args the rule consumed
+
+             occ_anald_rhs = occurAnalyseExpr rule_rhs
+                 -- See Note [Occurrence-analyse after rule firing]
+       ; dump rule rule_rhs
+       ; return (Just (zapped_env, occ_anald_rhs, cont')) }
+            -- The occ_anald_rhs and cont' are all Out things
+            -- hence zapping the environment
+
+  | otherwise  -- No rule fires
+  = do { nodump  -- This ensures that an empty file is written
+       ; return Nothing }
+
   where
+    dflags     = seDynFlags env
     zapped_env = zapSubstEnv env  -- See Note [zapSubstEnv]
 
-    printRuleModule rule =
-      parens
-        (maybe (text "BUILTIN") (pprModuleName . moduleName) (ruleModule rule))
+    printRuleModule rule
+      = parens (maybe (text "BUILTIN")
+                      (pprModuleName . moduleName)
+                      (ruleModule rule))
 
-    dump dflags rule rule_rhs
+    dump rule rule_rhs
       | dopt Opt_D_dump_rule_rewrites dflags
       = log_rule dflags Opt_D_dump_rule_rewrites "Rule fired" $ vcat
           [ text "Rule:" <+> ftext (ruleName rule)
@@ -2001,7 +1941,7 @@ tryRules env rules fn args call_cont
       | otherwise
       = return ()
 
-    nodump dflags
+    nodump
       | dopt Opt_D_dump_rule_rewrites dflags
       = liftIO $ dumpSDoc dflags alwaysQualify Opt_D_dump_rule_rewrites "" empty
 
@@ -2333,7 +2273,7 @@ rebuildCase, reallyRebuildCase
    -> InId             -- Case binder
    -> [InAlt]          -- Alternatives (inceasing order)
    -> SimplCont
-   -> SimplM (SimplEnv, OutExpr)
+   -> SimplM (SimplFloats, OutExpr)
 
 --------------------------------------------------
 --      1. Eliminate the case if there's a known constructor
@@ -2360,10 +2300,11 @@ rebuildCase env scrut case_bndr alts cont
         }
   where
     simple_rhs bs rhs = ASSERT( null bs )
-                        do { env' <- simplNonRecX env case_bndr scrut
+                        do { (floats1, env') <- simplNonRecX env case_bndr scrut
                                -- scrut is a constructor application,
                                -- hence satisfies let/app invariant
-                           ; simplExprF env' rhs cont }
+                           ; (floats2, expr') <- simplExprF env' rhs cont
+                           ; return (floats1 `addFloats` floats2, expr') }
 
 
 --------------------------------------------------
@@ -2397,8 +2338,9 @@ rebuildCase env scrut case_bndr alts@[(_, bndrs, rhs)] cont
     else exprIsHNF scrut             -- See Note [Case elimination: lifted case]
       || scrut_is_demanded_var scrut
   = do { tick (CaseElim case_bndr)
-       ; env' <- simplNonRecX env case_bndr scrut
-       ; simplExprF env' rhs cont }
+       ; (floats1, env') <- simplNonRecX env case_bndr scrut
+       ; (floats2, expr') <- simplExprF env' rhs cont
+       ; return (floats1 `addFloats` floats2, expr') }
 
   -- 2c. Try the seq rules if
   --     a) it binds only the case binder
@@ -2429,23 +2371,16 @@ rebuildCase env scrut case_bndr alts cont
 --------------------------------------------------
 
 reallyRebuildCase env scrut case_bndr alts cont
-  = do  {       -- Prepare the continuation;
-                -- The new subst_env is in place
-          (env, alt_cont, wrap_cont) <- prepareCaseCont env alts cont
-
-        -- Simplify the alternatives
-        ; (scrut', case_bndr', alts') <- simplAlts env scrut case_bndr alts alt_cont
-
-        ; dflags <- getDynFlags
-        ; let alts_ty' = contResultType alt_cont
-        -- See Note [Avoiding space leaks in OutType]
-        ; case_expr <- seqType alts_ty' `seq`
-                       mkCase dflags scrut' case_bndr' alts_ty' alts'
+  | not (sm_case_case (getMode env))
+  = do { case_expr <- simplAlts env scrut case_bndr alts
+                                (mkBoringStop (contHoleType cont))
+       ; rebuild env case_expr cont }
 
-        -- Notice that rebuild gets the in-scope set from env', not alt_env
-        -- (which in any case is only build in simplAlts)
-        -- The case binder *not* scope over the whole returned case-expression
-        ; rebuild env case_expr wrap_cont }
+  | otherwise
+  = do { (floats, cont') <- mkDupableCaseCont env alts cont
+       ; case_expr <- simplAlts (env `setInScopeFromF` floats)
+                                scrut case_bndr alts cont'
+       ; return (floats, case_expr) }
 
 {-
 simplCaseBinder checks whether the scrutinee is a variable, v.  If so,
@@ -2527,19 +2462,14 @@ robust here.  (Otherwise, there's a danger that we'll simply drop the
 -}
 
 simplAlts :: SimplEnv
-          -> OutExpr
-          -> InId                       -- Case binder
-          -> [InAlt]                    -- Non-empty
+          -> OutExpr         -- Scrutinee
+          -> InId            -- Case binder
+          -> [InAlt]         -- Non-empty
           -> SimplCont
-          -> SimplM (OutExpr, OutId, [OutAlt])  -- Includes the continuation
--- Like simplExpr, this just returns the simplified alternatives;
--- it does not return an environment
--- The returned alternatives can be empty, none are possible
-
-simplAlts env scrut case_bndr alts cont'
-  = do  { let env0 = zapFloats env
+          -> SimplM OutExpr  -- Returns the complete simplified case expression
 
-        ; (env1, case_bndr1) <- simplBinder env0 case_bndr
+simplAlts env0 scrut case_bndr alts cont'
+  = do  { (env1, case_bndr1) <- simplBinder env0 case_bndr
         ; let case_bndr2 = case_bndr1 `setIdUnfolding` evaldUnfolding
               env2       = modifyInScope env1 case_bndr2
               -- See Note [Case binder evaluated-ness]
@@ -2554,7 +2484,11 @@ simplAlts env scrut case_bndr alts cont'
 
         ; alts' <- mapM (simplAlt alt_env' (Just scrut') imposs_deflt_cons case_bndr' cont') in_alts
         ; -- pprTrace "simplAlts" (ppr case_bndr $$ ppr alts_ty $$ ppr alts_ty' $$ ppr alts $$ ppr cont') $
-          return (scrut', case_bndr', alts') }
+
+        ; let alts_ty' = contResultType cont'
+        -- See Note [Avoiding space leaks in OutType]
+        ; seqType alts_ty' `seq`
+          mkCase (seDynFlags env0) scrut' case_bndr' alts_ty' alts' }
 
 
 ------------------------------------
@@ -2647,19 +2581,20 @@ simplAlt env scrut' _ case_bndr' cont' (DataAlt con, vs, rhs)
 
 addAltUnfoldings :: SimplEnv -> Maybe OutExpr -> OutId -> OutExpr -> SimplM SimplEnv
 addAltUnfoldings env scrut case_bndr con_app
-  = do { dflags <- getDynFlags
-       ; let con_app_unf = mkSimpleUnfolding dflags con_app
+  = do { let con_app_unf = mk_simple_unf con_app
              env1 = addBinderUnfolding env case_bndr con_app_unf
 
              -- See Note [Add unfolding for scrutinee]
              env2 = case scrut of
                       Just (Var v)           -> addBinderUnfolding env1 v con_app_unf
                       Just (Cast (Var v) co) -> addBinderUnfolding env1 v $
-                                                mkSimpleUnfolding dflags (Cast con_app (mkSymCo co))
+                                                mk_simple_unf (Cast con_app (mkSymCo co))
                       _                      -> env1
 
        ; traceSmpl "addAltUnf" (vcat [ppr case_bndr <+> ppr scrut, ppr con_app])
        ; return env2 }
+  where
+    mk_simple_unf = mkSimpleUnfolding (seDynFlags env)
 
 addBinderUnfolding :: SimplEnv -> Id -> Unfolding -> SimplEnv
 addBinderUnfolding env bndr unf
@@ -2755,17 +2690,18 @@ knownCon :: SimplEnv
          -> DataCon -> [OutType] -> [OutExpr]   -- The scrutinee (in pieces)
          -> InId -> [InBndr] -> InExpr          -- The alternative
          -> SimplCont
-         -> SimplM (SimplEnv, OutExpr)
+         -> SimplM (SimplFloats, OutExpr)
 
 knownCon env scrut dc dc_ty_args dc_args bndr bs rhs cont
-  = do  { env'  <- bind_args env bs dc_args
-        ; env'' <- bind_case_bndr env'
-        ; simplExprF env'' rhs cont }
+  = do  { (floats1, env1)  <- bind_args env bs dc_args
+        ; (floats2, env2) <- bind_case_bndr env1
+        ; (floats3, expr') <- simplExprF env2 rhs cont
+        ; return (floats1 `addFloats` floats2 `addFloats` floats3, expr') }
   where
     zap_occ = zapBndrOccInfo (isDeadBinder bndr)    -- bndr is an InId
 
                   -- Ugh!
-    bind_args env' [] _  = return env'
+    bind_args env' [] _  = return (emptyFloats env', env')
 
     bind_args env' (b:bs') (Type ty : args)
       = ASSERT( isTyVar b )
@@ -2783,8 +2719,9 @@ knownCon env scrut dc dc_ty_args dc_args bndr bs rhs cont
              -- it via postInlineUnconditionally.
              -- Nevertheless we must keep it if the case-binder is alive,
              -- because it may be used in the con_app.  See Note [knownCon occ info]
-           ; env'' <- simplNonRecX env' b' arg  -- arg satisfies let/app invariant
-           ; bind_args env'' bs' args }
+           ; (floats1, env2) <- simplNonRecX env' b' arg  -- arg satisfies let/app invariant
+           ; (floats2, env3)  <- bind_args env2 bs' args
+           ; return (floats1 `addFloats` floats2, env3) }
 
     bind_args _ _ _ =
       pprPanic "bind_args" $ ppr dc $$ ppr bs $$ ppr dc_args $$
@@ -2798,8 +2735,9 @@ knownCon env scrut dc dc_ty_args dc_args bndr bs rhs cont
        -- about duplicating the arg redexes; in that case, make
        -- a new con-app from the args
     bind_case_bndr env
-      | isDeadBinder bndr   = return env
-      | exprIsTrivial scrut = return (extendIdSubst env bndr (DoneEx scrut Nothing))
+      | isDeadBinder bndr   = return (emptyFloats env, env)
+      | exprIsTrivial scrut = return (emptyFloats env
+                                     , extendIdSubst env bndr (DoneEx scrut Nothing))
       | otherwise           = do { dc_args <- mapM (simplVar env) bs
                                          -- dc_ty_args are aready OutTypes,
                                          -- but bs are InBndrs
@@ -2809,7 +2747,8 @@ knownCon env scrut dc dc_ty_args dc_args bndr bs rhs cont
                                  ; simplNonRecX env bndr con_app }
 
 -------------------
-missingAlt :: SimplEnv -> Id -> [InAlt] -> SimplCont -> SimplM (SimplEnv, OutExpr)
+missingAlt :: SimplEnv -> Id -> [InAlt] -> SimplCont
+           -> SimplM (SimplFloats, OutExpr)
                 -- This isn't strictly an error, although it is unusual.
                 -- It's possible that the simplifier might "see" that
                 -- an inner case has no accessible alternatives before
@@ -2819,7 +2758,8 @@ missingAlt env case_bndr _ cont
   = WARN( True, text "missingAlt" <+> ppr case_bndr )
     -- See Note [Avoiding space leaks in OutType]
     let cont_ty = contResultType cont
-    in seqType cont_ty `seq` return (env, mkImpossibleExpr cont_ty)
+    in seqType cont_ty `seq`
+       return (emptyFloats env, mkImpossibleExpr cont_ty)
 
 {-
 ************************************************************************
@@ -2839,7 +2779,7 @@ and will split it into
    join floats:  $j1 = e1, $j2 = e2
    non_dupable:  let x* = [] in b; stop
 
-Putting this back togeher would give
+Putting this back together would give
    let x* = let { $j1 = e1; $j2 = e2 } in
             case e of { True -> $j1; False -> $j2 }
    in b
@@ -2849,57 +2789,23 @@ inner expression, and not around the whole thing.
 
 In contrast, any let-bindings introduced by mkDupableCont can wrap
 around the entire thing.
--}
-
-
-prepareCaseCont :: SimplEnv -> [InAlt] -> SimplCont
-                -> SimplM (SimplEnv,
-                           SimplCont,  -- For the alternatives
-                           SimplCont)  -- Wraps the entire case
--- We are considering
---     K[ case _ of { p1 -> r1; ...; pn -> rn } ]
--- where K is some enclosing continuation for the case
--- Goal: split K into two pieces Kdup,Knodup so that
---       a) Kdup can be duplicated
---       b) Knodup[Kdup[e]] = K[e]
--- The idea is that we'll transform thus:
---          Knodup[ (case _ of { p1 -> Kdup[r1]; ...; pn -> Kdup[rn] }
---
--- We may also return some extra value bindings in SimplEnv (that scope over
--- the entire continuation) as well as some join points (thus must *not* float
--- past the continuation!).
--- Hence, the full story is this:
---     K[case _ of { p1 -> r1; ...; pn -> rn }] ==>
---     F_v[Knodup[F_j[ (case _ of { p1 -> Kdup[r1]; ...; pn -> Kdup[rn] }) ]]]
--- Here F_v represents some values that got floated out and F_j represents some
--- join points that got floated out.
---
--- When case-of-case is off, just make the entire continuation non-dupable
 
-prepareCaseCont env alts cont
-  | not (altsWouldDup alts)
-  = return (env, cont, mkBoringStop (contResultType cont))
-  | otherwise
-  = do { (env', cont') <- mkDupableCont env cont
-       ; return (env', cont', mkBoringStop (contResultType cont)) }
-
-prepareJoinCont :: SimplEnv -> SimplCont -> SimplM (SimplEnv, SimplCont)
-
--- Similar to prepareCaseCont, only for
---     K[let { j1 = r1; ...; jn -> rn } in _]
--- If the js are join points, this will turn into
---     Knodup[join { j1 = Kdup[r1]; ...; jn = Kdup[rn] } in Kdup[_]].
---
--- When case-of-case is off and it's a join binding, just make the entire
--- continuation non-dupable. This is necessary because otherwise
---     case (join j = ... in case e of { A -> jump j 1; ... }) of { B -> ... }
--- becomes
---     join j = case ... of { B -> ... } in
---     case (case e of { A -> jump j 1; ... }) of { B -> ... },
--- and the reference to j is invalid.
+Note [Bottom alternatives]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+When we have
+     case (case x of { A -> error .. ; B -> e; C -> error ..)
+       of alts
+then we can just duplicate those alts because the A and C cases
+will disappear immediately.  This is more direct than creating
+join points and inlining them away.  See Trac #4930.
+-}
 
-prepareJoinCont env cont
-  = mkDupableCont env cont
+--------------------
+mkDupableCaseCont :: SimplEnv -> [InAlt] -> SimplCont
+                  -> SimplM (SimplFloats, SimplCont)
+mkDupableCaseCont env alts cont
+  | altsWouldDup alts = mkDupableCont env cont
+  | otherwise         = return (emptyFloats env, cont)
 
 altsWouldDup :: [InAlt] -> Bool -- True iff strictly > 1 non-bottom alternative
 altsWouldDup []  = False        -- See Note [Bottom alternatives]
@@ -2910,98 +2816,89 @@ altsWouldDup (alt:alts)
   where
     is_bot_alt (_,_,rhs) = exprIsBottom rhs
 
-{-
-Note [Bottom alternatives]
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-When we have
-     case (case x of { A -> error .. ; B -> e; C -> error ..)
-       of alts
-then we can just duplicate those alts because the A and C cases
-will disappear immediately.  This is more direct than creating
-join points and inlining them away.  See Trac #4930.
--}
-
 -------------------------
 mkDupableCont :: SimplEnv -> SimplCont
-              -> SimplM ( SimplEnv    -- Incoming SimplEnv augmented with
-                                      --   extra let/join-floats and in-scope variables
-                        , SimplCont)  -- dup_cont: duplicable continuation
-mkDupableCont env cont
-  = mk_dupable_cont env cont
+              -> SimplM ( SimplFloats  -- Incoming SimplEnv augmented with
+                                       --   extra let/join-floats and in-scope variables
+                        , SimplCont)   -- dup_cont: duplicable continuation
 
--------------------------
-mk_dupable_cont :: SimplEnv -> SimplCont
-                -> SimplM (SimplEnv, SimplCont)
-mk_dupable_cont env cont
+mkDupableCont env cont
   | contIsDupable cont
-  = return (env, cont)
+  = return (emptyFloats env, cont)
 
-mk_dupable_cont _ (Stop {}) = panic "mkDupableCont"     -- Handled by previous eqn
+mkDupableCont _ (Stop {}) = panic "mkDupableCont"     -- Handled by previous eqn
 
-mk_dupable_cont env (CastIt ty cont)
-  = do  { (env', cont') <- mk_dupable_cont env cont
-        ; return (env', CastIt ty cont') }
+mkDupableCont env (CastIt ty cont)
+  = do  { (floats, cont') <- mkDupableCont env cont
+        ; return (floats, CastIt ty cont') }
 
 -- Duplicating ticks for now, not sure if this is good or not
-mk_dupable_cont env (TickIt t cont)
-  = do  { (env', cont') <- mk_dupable_cont env cont
-        ; return (env', TickIt t cont') }
+mkDupableCont env (TickIt t cont)
+  = do  { (floats, cont') <- mkDupableCont env cont
+        ; return (floats, TickIt t cont') }
 
-mk_dupable_cont env (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs
-                                , sc_body = body, sc_env = se, sc_cont = cont})
+mkDupableCont env (StrictBind { sc_bndr = bndr, sc_bndrs = bndrs
+                              , sc_body = body, sc_env = se, sc_cont = cont})
   -- See Note [Duplicating StrictBind]
-  = do { let sb_env = se `setInScopeAndZapFloats` env
+  = do { let sb_env = se `setInScopeFromE` env
        ; (sb_env1, bndr') <- simplBinder sb_env bndr
-       ; (sb_env', join_inner) <- simplLam sb_env1 bndrs body cont
-          -- No need to use mk_dupable_cont before simplLam; we
+       ; (floats1, join_inner) <- simplLam sb_env1 bndrs body cont
+          -- No need to use mkDupableCont before simplLam; we
           -- use cont once here, and then share the result if necessary
-       ; let join_body = wrapFloats sb_env' join_inner
+
+       ; let join_body = wrapFloats floats1 join_inner
              res_ty    = contResultType cont
-       ; dflags <- getDynFlags
-       ; (env2, body2)
-            <- if exprIsDupable dflags join_body
-               then return (env, join_body)
+
+       ; (floats2, body2)
+            <- if exprIsDupable (seDynFlags env) join_body
+               then return (emptyFloats env, join_body)
                else do { join_bndr <- newJoinId [bndr'] res_ty
                        ; let join_call = App (Var join_bndr) (Var bndr')
                              join_rhs  = Lam (setOneShotLambda bndr') join_body
-                       ; return (addNonRec env join_bndr join_rhs, join_call) }
-       ; return ( env2
+                             join_bind = NonRec join_bndr join_rhs
+                             floats    = emptyFloats env `extendFloats` join_bind
+                       ; return (floats, join_call) }
+       ; return ( floats2
                 , StrictBind { sc_bndr = bndr', sc_bndrs = []
                              , sc_body = body2
                              , sc_env  = zapSubstEnv se
                              , sc_dup  = OkToDup
                              , sc_cont = mkBoringStop res_ty } ) }
 
-mk_dupable_cont env (StrictArg { sc_fun = info, sc_cci = cci, sc_cont = cont })
+mkDupableCont env (StrictArg { sc_fun = info, sc_cci = cci, sc_cont = cont })
         -- See Note [Duplicating StrictArg]
         -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable
-  = do { (env', cont')  <- mk_dupable_cont env cont
-       ; (env'', args') <- mapAccumLM makeTrivialArg env' (ai_args info)
-       ; return (env'', StrictArg { sc_fun = info { ai_args = args' }
-                                  , sc_cci = cci
-                                  , sc_cont = cont'
-                                  , sc_dup = OkToDup} ) }
-
-mk_dupable_cont env (ApplyToTy { sc_cont = cont
-                               , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty })
-  = do  { (env', cont') <- mk_dupable_cont env cont
-        ; return (env', ApplyToTy { sc_cont = cont'
-                                  , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty }) }
-
-mk_dupable_cont env (ApplyToVal { sc_arg = arg, sc_dup = dup
-                                , sc_env = se, sc_cont = cont })
+  = do { (floats1, cont') <- mkDupableCont env cont
+       ; (floats_s, args') <- mapAndUnzipM (makeTrivialArg (getMode env))
+                                           (ai_args info)
+       ; return ( foldl addLetFloats floats1 floats_s
+                , StrictArg { sc_fun = info { ai_args = args' }
+                            , sc_cci = cci
+                            , sc_cont = cont'
+                            , sc_dup = OkToDup} ) }
+
+mkDupableCont env (ApplyToTy { sc_cont = cont
+                             , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty })
+  = do  { (floats, cont') <- mkDupableCont env cont
+        ; return (floats, ApplyToTy { sc_cont = cont'
+                                    , sc_arg_ty = arg_ty, sc_hole_ty = hole_ty }) }
+
+mkDupableCont env (ApplyToVal { sc_arg = arg, sc_dup = dup
+                              , sc_env = se, sc_cont = cont })
   =     -- e.g.         [...hole...] (...arg...)
         --      ==>
         --              let a = ...arg...
         --              in [...hole...] a
         -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable
-    do  { (env', cont') <- mk_dupable_cont env cont
+    do  { (floats1, cont') <- mkDupableCont env cont
+        ; let env' = env `setInScopeFromF` floats1
         ; (_, se', arg') <- simplArg env' dup se arg
-        ; (env'', arg'') <- makeTrivial NotTopLevel env' (fsLit "karg") arg'
-        ; return (env'', ApplyToVal { sc_arg = arg'', sc_env = se'
-                                    , sc_dup = OkToDup, sc_cont = cont' }) }
+        ; (let_floats2, arg'') <- makeTrivial (getMode env) NotTopLevel (fsLit "karg") arg'
+        ; return ( floats1 `addLetFloats` let_floats2
+                 , ApplyToVal { sc_arg = arg'', sc_env = se'
+                              , sc_dup = OkToDup, sc_cont = cont' }) }
 
-mk_dupable_cont env (Select { sc_bndr = case_bndr, sc_alts = alts
+mkDupableCont env (Select { sc_bndr = case_bndr, sc_alts = alts
                             , sc_env = se, sc_cont = cont })
   =     -- e.g.         (case [...hole...] of { pi -> ei })
         --      ===>
@@ -3009,16 +2906,12 @@ mk_dupable_cont env (Select { sc_bndr = case_bndr, sc_alts = alts
         --              in case [...hole...] of { pi -> ji xij }
         -- NB: sc_dup /= OkToDup; that is caught earlier by contIsDupable
     do  { tick (CaseOfCase case_bndr)
-        ; (env', alt_cont, wrap_cont) <- prepareCaseCont env alts cont
-                -- NB: We call prepareCaseCont here.  If there is only one
-                -- alternative, then dup_cont may be big, but that's ok
-                -- because we push it into the single alternative, and then
-                -- use mkDupableAlt to turn that simplified alternative into
-                -- a join point if it's too big to duplicate.
+        ; (floats, alt_cont) <- mkDupableCaseCont env alts cont
+                -- NB: We call mkDupableCaseCont here to make cont duplicable
+                --     (if necessary, depending on the number of alts)
                 -- And this is important: see Note [Fusing case continuations]
 
-        ; let alt_env = se `setInScopeAndZapFloats` env'
-
+        ; let alt_env = se `setInScopeFromF` floats
         ; (alt_env', case_bndr') <- simplBinder alt_env case_bndr
         ; alts' <- mapM (simplAlt alt_env' Nothing [] case_bndr' alt_cont) alts
         -- Safe to say that there are no handled-cons for the DEFAULT case
@@ -3033,27 +2926,22 @@ mk_dupable_cont env (Select { sc_bndr = case_bndr, sc_alts = alts
         -- NB: we don't use alt_env further; it has the substEnv for
         --     the alternatives, and we don't want that
 
-        ; (join_binds, alts'') <- mkDupableAlts case_bndr' alts'
-        ; let env'' = foldl (\env (j,r) -> addNonRec env j r) env' join_binds
+        ; (join_floats, alts'') <- mapAccumLM (mkDupableAlt (seDynFlags env) case_bndr')
+                                              emptyJoinFloats alts'
 
-        ; return (env'',  -- Note [Duplicated env]
+        ; return (floats `addJoinFloats` join_floats,  -- Note [Duplicated env]
                   Select { sc_dup  = OkToDup
-                         , sc_bndr = case_bndr', sc_alts = alts''
-                         , sc_env  = zapSubstEnv env''
-                         , sc_cont = wrap_cont } ) }
-
-mkDupableAlts :: OutId -> [OutAlt] -> SimplM ([(JoinId, OutExpr)], [OutAlt])
-mkDupableAlts case_bndr' the_alts
-  = do { dflags <- getDynFlags
-       ; (mb_join_floats, dup_alts)
-             <- mapAndUnzipM (mkDupableAlt dflags case_bndr') the_alts
-       ; return (catMaybes mb_join_floats, dup_alts) }
-
-mkDupableAlt :: DynFlags -> OutId -> OutAlt
-             -> SimplM (Maybe (JoinId,OutExpr), OutAlt)
-mkDupableAlt dflags case_bndr (con, bndrs', rhs')
+                         , sc_bndr = case_bndr'
+                         , sc_alts = alts''
+                         , sc_env  = zapSubstEnv env
+                         , sc_cont = mkBoringStop (contResultType cont) } ) }
+
+mkDupableAlt :: DynFlags -> OutId
+             -> JoinFloats -> OutAlt
+             -> SimplM (JoinFloats, OutAlt)
+mkDupableAlt dflags case_bndr jfloats (con, bndrs', rhs')
   | exprIsDupable dflags rhs'  -- Note [Small alternative rhs]
-  = return (Nothing, (con, bndrs', rhs'))
+  = return (jfloats, (con, bndrs', rhs'))
 
   | otherwise
   = do  { let rhs_ty'  = exprType rhs'
@@ -3098,7 +2986,8 @@ mkDupableAlt dflags case_bndr (con, bndrs', rhs')
         ; let join_call = mkApps (Var join_bndr) final_args
               alt'      = (con, bndrs', join_call)
 
-        ; return (Just (join_bndr, join_rhs), alt') }
+        ; return ( jfloats `addJoinFlts` unitJoinFloat (NonRec join_bndr join_rhs)
+                 , alt') }
                 -- See Note [Duplicated env]
 
 {-
@@ -3352,11 +3241,16 @@ simplLetUnfolding :: SimplEnv-> TopLevelFlag
                   -> Unfolding -> SimplM Unfolding
 simplLetUnfolding env top_lvl cont_mb id new_rhs unf
   | isStableUnfolding unf
-  = simplUnfolding env top_lvl cont_mb id unf
+  = simplStableUnfolding env top_lvl cont_mb id unf
   | otherwise
+  = mkLetUnfolding (seDynFlags env) top_lvl InlineRhs id new_rhs
+
+-------------------
+mkLetUnfolding :: DynFlags -> TopLevelFlag -> UnfoldingSource
+               -> InId -> OutExpr -> SimplM Unfolding
+mkLetUnfolding dflags top_lvl src id new_rhs
   = is_bottoming `seq`  -- See Note [Force bottoming field]
-    do { dflags <- getDynFlags
-       ; return (mkUnfolding dflags InlineRhs is_top_lvl is_bottoming new_rhs) }
+    return (mkUnfolding dflags src is_top_lvl is_bottoming new_rhs)
             -- We make an  unfolding *even for loop-breakers*.
             -- Reason: (a) It might be useful to know that they are WHNF
             --         (b) In TidyPgm we currently assume that, if we want to
@@ -3367,16 +3261,17 @@ simplLetUnfolding env top_lvl cont_mb id new_rhs unf
     is_top_lvl   = isTopLevel top_lvl
     is_bottoming = isBottomingId id
 
-simplUnfolding :: SimplEnv -> TopLevelFlag
-               -> Maybe SimplCont  -- Just k => a join point with continuation k
-               -> InId
-               -> Unfolding -> SimplM Unfolding
+-------------------
+simplStableUnfolding :: SimplEnv -> TopLevelFlag
+                     -> Maybe SimplCont  -- Just k => a join point with continuation k
+                     -> InId
+                     -> Unfolding -> SimplM Unfolding
 -- Note [Setting the new unfolding]
-simplUnfolding env top_lvl mb_cont id unf
+simplStableUnfolding env top_lvl mb_cont id unf
   = case unf of
-      NoUnfolding -> return unf
+      NoUnfolding   -> return unf
       BootUnfolding -> return unf
-      OtherCon {} -> return unf
+      OtherCon {}   -> return unf
 
       DFunUnfolding { df_bndrs = bndrs, df_con = con, df_args = args }
         -> do { (env', bndrs') <- simplBinders rule_env bndrs
@@ -3401,19 +3296,17 @@ simplUnfolding env top_lvl mb_cont id unf
                             -- See Note [Top-level flag on inline rules] in CoreUnfold
 
                   _other              -- Happens for INLINABLE things
-                     -> is_bottoming `seq` -- See Note [Force bottoming field]
-                        do { dflags <- getDynFlags
-                           ; return (mkUnfolding dflags src is_top_lvl is_bottoming expr') } }
+                     -> mkLetUnfolding dflags top_lvl src id expr' }
                 -- If the guidance is UnfIfGoodArgs, this is an INLINABLE
                 -- unfolding, and we need to make sure the guidance is kept up
                 -- to date with respect to any changes in the unfolding.
 
         | otherwise -> return noUnfolding   -- Discard unstable unfoldings
   where
-    is_top_lvl   = isTopLevel top_lvl
-    is_bottoming = isBottomingId id
-    act          = idInlineActivation id
-    rule_env     = updMode (updModeForStableUnfoldings act) env
+    dflags     = seDynFlags env
+    is_top_lvl = isTopLevel top_lvl
+    act        = idInlineActivation id
+    rule_env   = updMode (updModeForStableUnfoldings act) env
          -- See Note [Simplifying inside stable unfoldings] in SimplUtils
 
 {-
diff --git a/testsuite/tests/perf/compiler/T12150.hs b/testsuite/tests/perf/compiler/T12150.hs
new file mode 100644
index 0000000000..a0d4ed5ce4
--- /dev/null
+++ b/testsuite/tests/perf/compiler/T12150.hs
@@ -0,0 +1,103 @@
+module T12150 where
+
+data Result a = Success a | Error String
+
+{- 80 guards
+
+   ghc-7.10.3 -O :  0.3s
+   ghc-8.0.1 -O  :  1.8s
+-}
+
+instance Functor Result where
+    {-# INLINE fmap #-}
+    fmap | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+         | bool = f
+
+      where
+        bool = undefined
+        f = undefined
diff --git a/testsuite/tests/perf/compiler/all.T b/testsuite/tests/perf/compiler/all.T
index baca57cf5e..1da2883edb 100644
--- a/testsuite/tests/perf/compiler/all.T
+++ b/testsuite/tests/perf/compiler/all.T
@@ -1106,6 +1106,16 @@ test('T12707',
      compile,
      [''])
 
+test('T12150',
+     [ only_ways(['optasm']),
+       compiler_stats_num_field('bytes allocated',
+          [(wordsize(64), 70773000, 5)
+          # initial:  70773000
+          ]),
+     ],
+    compile,
+     [''])
+
 test('T13379',
      [ compiler_stats_num_field('bytes allocated',
           [(platform('x86_64-apple-darwin'), 453166912, 10),
diff --git a/testsuite/tests/simplCore/should_compile/T3234.stderr b/testsuite/tests/simplCore/should_compile/T3234.stderr
index 0e5dc178cf..223650af6d 100644
--- a/testsuite/tests/simplCore/should_compile/T3234.stderr
+++ b/testsuite/tests/simplCore/should_compile/T3234.stderr
@@ -10,7 +10,7 @@
 
 
 ==================== Grand total simplifier statistics ====================
-Total ticks:     55
+Total ticks:     52
 
 18 PreInlineUnconditionally
   1 g
@@ -39,7 +39,7 @@ Total ticks:     55
   1 fold/build
   1 unpack
   1 unpack-list
-5 LetFloatFromLet 5
+2 LetFloatFromLet 2
 25 BetaReduction
   1 a
   1 g
diff --git a/testsuite/tests/simplCore/should_compile/rule2.stderr b/testsuite/tests/simplCore/should_compile/rule2.stderr
index 7444cc90a4..35bcec7835 100644
--- a/testsuite/tests/simplCore/should_compile/rule2.stderr
+++ b/testsuite/tests/simplCore/should_compile/rule2.stderr
@@ -10,12 +10,12 @@
 
 
 ==================== Grand total simplifier statistics ====================
-Total ticks:     13
+Total ticks:     11
 
 1 PreInlineUnconditionally 1 f
 1 UnfoldingDone 1 Roman.bar
 1 RuleFired 1 foo/bar
-3 LetFloatFromLet 3
+1 LetFloatFromLet 1
 1 EtaReduction 1 ds
 6 BetaReduction
   1 f
author	Simon Peyton Jones <simonpj@microsoft.com>	2017-08-25 09:22:03 +0100
committer	Simon Peyton Jones <simonpj@microsoft.com>	2017-08-25 12:57:11 +0100
commit	33452dfc6cf891b59d63fa9fe138b18cbce4df81 (patch)
tree	e6cc85f4e02d0f792d3e44a28958b7261cf3fb44
parent	407c11b880325f4f327982d4f6b9f9cba4564016 (diff)
download	haskell-33452dfc6cf891b59d63fa9fe138b18cbce4df81.tar.gz