Turn coercion optimization into an optimization pass (#19909)wip/amg/T19909

This distinguishes between "simple" and "non-simple" coercion optimizations, with the former applied by the simple optimizer (e.g. immediately after desugaring and during simplification) while the latter are applied during a separate core-to-core pass. At the moment the only coercion optimizations not considered "simple" are the `opt_trans_rule` cases for pushing transitivity inside axioms. There are now explicit `-fopt-coercion-simple` and `-fopt-coercion-full` options to control these passes. The existing `-fno-opt-coercion` disables them both, and a new `-fopt-coercion` enables them both. By default `-fopt-coercion-simple` is enabled for all optimization levels, while `-fopt-coercion-full` is enabled only for `-O1` and higher. Open questions: * When should we run the non-simple optimization pass? At the moment it is run only once, at the beginning of the pipeline. * Should any other coercion optimizations be "non-simple"? * Should we perform non-simple optimization of coercions inside types? I've so far refrained from doing so, because I have a hunch that it will be a waste of time, but I haven't got data to back this up.
author: Adam Gundry <adam@well-typed.com> 2021-05-31 13:21:54 +0100
committer: Adam Gundry <adam@well-typed.com> 2021-06-01 14:57:36 +0100
commit: a7b206099937a2cf8cd6319052b3149fd7ecbaa7 (patch)
tree: 2aa0d82b2a033229ebe1bc3593497cf12277c64d
parent: cc1ba576d26b90c0c01aa43e7100c94ee3a287ad (diff)
download: haskell-wip/amg/T19909.tar.gz
9 files changed, 345 insertions, 190 deletions
diff --git a/compiler/GHC/Core/Coercion/Opt.hs b/compiler/GHC/Core/Coercion/Opt.hs
index 66ba8ee8ab..dc76c8683a 100644
--- a/compiler/GHC/Core/Coercion/Opt.hs
+++ b/compiler/GHC/Core/Coercion/Opt.hs
@@ -6,6 +6,7 @@ module GHC.Core.Coercion.Opt
    ( optCoercion
    , checkAxInstCo
    , OptCoercionOpts (..)
+   , optCoercionProgram
    )
 where
 
@@ -15,6 +16,7 @@ import GHC.Prelude
 
 import GHC.Driver.Ppr
 
+import GHC.Core
 import GHC.Core.TyCo.Rep
 import GHC.Core.TyCo.Subst
 import GHC.Core.Coercion
@@ -40,6 +42,68 @@ import GHC.Utils.Panic
 %*                                                                      *
 %************************************************************************
 
+The type-checker and simplifier can potentially produce very large coercion
+objects to witness relatively simple equalities.  These large coercions must
+then be traversed repeatedly (e.g. to apply substitutions during
+simplification), and may end up being written out to interface files as part of
+unfoldings.
+
+To avoid this problem, coercion optimisation attempts to make the representation
+of a coercion smaller, without changing its kind.  It is described in:
+
+    Vytiniotis and Peyton Jones. Evidence normalization in System FC. (RTA'13)
+    https://www.microsoft.com/en-us/research/publication/evidence-normalization-system-fc-2/
+
+Note [Simple and full coercion optimisation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If we are not careful, coercion optimisation itself can end up being expensive
+(see #19909).  We do not want to spend lots of time repeatedly trying and
+failing to simplify coercions, but want to strike a balance.
+
+Thus we run the coercion optimiser in two modes:
+
+ * "simple" coercion optimisation is called by the very simple optimiser (see
+   Note [The simple optimiser] in GHC.Core.SimpleOpt).  In particular this
+   happens immediately after desugaring (deSugar calls simpleOptPgm), and during
+   simplification (simplCoercion).  This is enabled by default at all
+   optimization levels; it can be controlled explicitly with
+   `-f[no-]opt-coercion-simple`.
+
+ * "full" coercion optimisation is called as a normal optimisation pass in the
+   pipeline (see the uses of CoreDoOptCoercion in GHC.Core.Opt.Pipeline).  This
+   is disabled by default for -O0 but enabled by default for -O1 and higher.  It
+   can be controlled explicitly with `-f[no-]opt-coercion-full`.
+
+The `-f[no-]opt-coercion` flag enables or disables both simple and full coercion
+optimisation.
+
+There is currently one class of optimisations performed in "full" mode but not
+in "simple" mode, namely the opt_trans_rule cases for pushing coercions inside
+axioms.  These are the [Sym]AxSuck(L|R) rules described in the paper, which take
+a coercion like:
+
+    C gamma1...gammaN ; delta    (where C is an axiom)
+
+and attempt to convert delta into a coercion of the form C epsilon1...epsilonN
+so that the entire thing can become:
+
+    C (gamma1;epsilon1 ... gammaN;epsilonN)
+
+Unfortunately this is rather difficult to check (see the calls to matchAxiom and
+checkAxInstCo, and Note [Why call checkAxInstCo during optimisation]).
+Moreover, it may well not be possible to convert delta into the appropriate
+form, in which case these rules cannot fire.
+
+Long transitive chains of coercions including some axioms are common in some
+programs involving type families.  Since we call coercion optimisation
+repeatedly, we could end up repeatedly trying and failing to apply the AxSuck
+rules, thereby wasting a lot of time.  Thus we skip these rules during "simple"
+coercion optimisation, and apply them only during "full" optimisation.
+
+Further experimentation to gather concrete data about when it is useful to
+enable "simple" vs "full" optimisation would be helpful.
+
+
 Note [Optimising coercion optimisation]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Looking up a coercion's role or kind is linear in the size of the
@@ -111,22 +175,29 @@ So we substitute the coercion variable c for the coercion
 (h1 ~N (n1; h2; sym n2)) in g.
 -}
 
--- | Coercion optimisation options
-newtype OptCoercionOpts = OptCoercionOpts
-   { optCoercionEnabled :: Bool  -- ^ Enable coercion optimisation (reduce its size)
-   }
+-- | Coercion optimisation modes.
+-- See Note [Simple and full coercion optimisation].
+data OptCoercionOpts
+  = NoCoercionOpt      -- ^ No coercion optimization at all
+  | SimpleCoercionOpt  -- ^ Only perform "simple" optimizations
+  | FullCoercionOpt    -- ^ Perform all optimizations, including those that are not "simple"
+
+optCoercionEnabled :: OptCoercionOpts -> Bool
+optCoercionEnabled NoCoercionOpt = False
+optCoercionEnabled _             = True
+
+optCoercionNonSimple :: OptCoercionOpts -> Bool
+optCoercionNonSimple FullCoercionOpt = True
+optCoercionNonSimple _               = False
+
 
 optCoercion :: OptCoercionOpts -> TCvSubst -> Coercion -> NormalCo
 -- ^ optCoercion applies a substitution to a coercion,
 --   *and* optimises it to reduce its size
 optCoercion opts env co
-  | optCoercionEnabled opts = optCoercion' env co
-  | otherwise               = substCo env co
-
-optCoercion' :: TCvSubst -> Coercion -> NormalCo
-optCoercion' env co
+  | not (optCoercionEnabled opts) = substCo env co
   | debugIsOn
-  = let out_co = opt_co1 lc False co
+  = let out_co = opt_co1 opts lc False co
         (Pair in_ty1  in_ty2,  in_role)  = coercionKindRole co
         (Pair out_ty1 out_ty2, out_role) = coercionKindRole out_co
     in
@@ -143,7 +214,7 @@ optCoercion' env co
              $$ hang (text "subst:") 2 (ppr env) )
     out_co
 
-  | otherwise         = opt_co1 lc False co
+  | otherwise         = opt_co1 opts lc False co
   where
     lc = mkSubstLiftingContext env
 
@@ -163,59 +234,61 @@ type ReprFlag = Bool
 
 -- | Optimize a coercion, making no assumptions. All coercions in
 -- the lifting context are already optimized (and sym'd if nec'y)
-opt_co1 :: LiftingContext
+opt_co1 :: OptCoercionOpts
+        -> LiftingContext
         -> SymFlag
         -> Coercion -> NormalCo
-opt_co1 env sym co = opt_co2 env sym (coercionRole co) co
+opt_co1 opts env sym co = opt_co2 opts env sym (coercionRole co) co
 
 -- See Note [Optimising coercion optimisation]
 -- | Optimize a coercion, knowing the coercion's role. No other assumptions.
-opt_co2 :: LiftingContext
+opt_co2 :: OptCoercionOpts
+        -> LiftingContext
         -> SymFlag
         -> Role   -- ^ The role of the input coercion
         -> Coercion -> NormalCo
-opt_co2 env sym Phantom co = opt_phantom env sym co
-opt_co2 env sym r       co = opt_co3 env sym Nothing r co
+opt_co2 opts env sym Phantom co = opt_phantom opts env sym co
+opt_co2 opts env sym r       co = opt_co3 opts env sym Nothing r co
 
 -- See Note [Optimising coercion optimisation]
 -- | Optimize a coercion, knowing the coercion's non-Phantom role.
-opt_co3 :: LiftingContext -> SymFlag -> Maybe Role -> Role -> Coercion -> NormalCo
-opt_co3 env sym (Just Phantom)          _ co = opt_phantom env sym co
-opt_co3 env sym (Just Representational) r co = opt_co4_wrap env sym True  r co
+opt_co3 :: OptCoercionOpts -> LiftingContext -> SymFlag -> Maybe Role -> Role -> Coercion -> NormalCo
+opt_co3 opts env sym (Just Phantom)          _ co = opt_phantom opts env sym co
+opt_co3 opts env sym (Just Representational) r co = opt_co4_wrap opts env sym True  r co
   -- if mrole is Just Nominal, that can't be a downgrade, so we can ignore
-opt_co3 env sym _                       r co = opt_co4_wrap env sym False r co
+opt_co3 opts env sym _                       r co = opt_co4_wrap opts env sym False r co
 
 -- See Note [Optimising coercion optimisation]
 -- | Optimize a non-phantom coercion.
-opt_co4, opt_co4_wrap :: LiftingContext -> SymFlag -> ReprFlag -> Role -> Coercion -> NormalCo
+opt_co4, opt_co4_wrap :: OptCoercionOpts -> LiftingContext -> SymFlag -> ReprFlag -> Role -> Coercion -> NormalCo
 
 opt_co4_wrap = opt_co4
 {-
-opt_co4_wrap env sym rep r co
+opt_co4_wrap opts env sym rep r co
   = pprTrace "opt_co4_wrap {"
     ( vcat [ text "Sym:" <+> ppr sym
            , text "Rep:" <+> ppr rep
            , text "Role:" <+> ppr r
            , text "Co:" <+> ppr co ]) $
     ASSERT( r == coercionRole co )
-    let result = opt_co4 env sym rep r co in
+    let result = opt_co4 opts env sym rep r co in
     pprTrace "opt_co4_wrap }" (ppr co $$ text "---" $$ ppr result) $
     result
 -}
 
-opt_co4 env _   rep r (Refl ty)
+opt_co4 _ env _   rep r (Refl ty)
   = ASSERT2( r == Nominal, text "Expected role:" <+> ppr r    $$
                            text "Found role:" <+> ppr Nominal $$
                            text "Type:" <+> ppr ty )
     liftCoSubst (chooseRole rep r) env ty
 
-opt_co4 env _   rep r (GRefl _r ty MRefl)
+opt_co4 _ env _   rep r (GRefl _r ty MRefl)
   = ASSERT2( r == _r, text "Expected role:" <+> ppr r $$
                       text "Found role:" <+> ppr _r   $$
                       text "Type:" <+> ppr ty )
     liftCoSubst (chooseRole rep r) env ty
 
-opt_co4 env sym  rep r (GRefl _r ty (MCo co))
+opt_co4 opts env sym  rep r (GRefl _r ty (MCo co))
   = ASSERT2( r == _r, text "Expected role:" <+> ppr r $$
                       text "Found role:" <+> ppr _r   $$
                       text "Type:" <+> ppr ty )
@@ -225,56 +298,56 @@ opt_co4 env sym  rep r (GRefl _r ty (MCo co))
   where
     r'  = chooseRole rep r
     ty' = substTy (lcSubstLeft env) ty
-    co' = opt_co4 env False False Nominal co
+    co' = opt_co4 opts env False False Nominal co
 
-opt_co4 env sym rep r (SymCo co)  = opt_co4_wrap env (not sym) rep r co
+opt_co4 opts env sym rep r (SymCo co)  = opt_co4_wrap opts env (not sym) rep r co
   -- surprisingly, we don't have to do anything to the env here. This is
   -- because any "lifting" substitutions in the env are tied to ForAllCos,
   -- which treat their left and right sides differently. We don't want to
   -- exchange them.
 
-opt_co4 env sym rep r g@(TyConAppCo _r tc cos)
+opt_co4 opts env sym rep r g@(TyConAppCo _r tc cos)
   = ASSERT( r == _r )
     case (rep, r) of
       (True, Nominal) ->
         mkTyConAppCo Representational tc
-                     (zipWith3 (opt_co3 env sym)
+                     (zipWith3 (opt_co3 opts env sym)
                                (map Just (tyConRolesRepresentational tc))
                                (repeat Nominal)
                                cos)
       (False, Nominal) ->
-        mkTyConAppCo Nominal tc (map (opt_co4_wrap env sym False Nominal) cos)
+        mkTyConAppCo Nominal tc (map (opt_co4_wrap opts env sym False Nominal) cos)
       (_, Representational) ->
                       -- must use opt_co2 here, because some roles may be P
                       -- See Note [Optimising coercion optimisation]
-        mkTyConAppCo r tc (zipWith (opt_co2 env sym)
+        mkTyConAppCo r tc (zipWith (opt_co2 opts env sym)
                                    (tyConRolesRepresentational tc)  -- the current roles
                                    cos)
       (_, Phantom) -> pprPanic "opt_co4 sees a phantom!" (ppr g)
 
-opt_co4 env sym rep r (AppCo co1 co2)
-  = mkAppCo (opt_co4_wrap env sym rep r co1)
-            (opt_co4_wrap env sym False Nominal co2)
+opt_co4 opts env sym rep r (AppCo co1 co2)
+  = mkAppCo (opt_co4_wrap opts env sym rep r co1)
+            (opt_co4_wrap opts env sym False Nominal co2)
 
-opt_co4 env sym rep r (ForAllCo tv k_co co)
-  = case optForAllCoBndr env sym tv k_co of
+opt_co4 opts env sym rep r (ForAllCo tv k_co co)
+  = case optForAllCoBndr opts env sym tv k_co of
       (env', tv', k_co') -> mkForAllCo tv' k_co' $
-                            opt_co4_wrap env' sym rep r co
+                            opt_co4_wrap opts env' sym rep r co
      -- Use the "mk" functions to check for nested Refls
 
-opt_co4 env sym rep r (FunCo _r cow co1 co2)
+opt_co4 opts env sym rep r (FunCo _r cow co1 co2)
   = ASSERT( r == _r )
     if rep
     then mkFunCo Representational cow' co1' co2'
     else mkFunCo r cow' co1' co2'
   where
-    co1' = opt_co4_wrap env sym rep r co1
-    co2' = opt_co4_wrap env sym rep r co2
-    cow' = opt_co1 env sym cow
+    co1' = opt_co4_wrap opts env sym rep r co1
+    co2' = opt_co4_wrap opts env sym rep r co2
+    cow' = opt_co1 opts env sym cow
 
-opt_co4 env sym rep r (CoVarCo cv)
+opt_co4 opts env sym rep r (CoVarCo cv)
   | Just co <- lookupCoVar (lcTCvSubst env) cv
-  = opt_co4_wrap (zapLiftingContext env) sym rep r co
+  = opt_co4_wrap opts (zapLiftingContext env) sym rep r co
 
   | ty1 `eqType` ty2   -- See Note [Optimise CoVarCo to Refl]
   = mkReflCo (chooseRole rep r) ty1
@@ -294,10 +367,10 @@ opt_co4 env sym rep r (CoVarCo cv)
                          cv
           -- cv1 might have a substituted kind!
 
-opt_co4 _ _ _ _ (HoleCo h)
+opt_co4 _ _ _ _ _ (HoleCo h)
   = pprPanic "opt_univ fell into a hole" (ppr h)
 
-opt_co4 env sym rep r (AxiomInstCo con ind cos)
+opt_co4 opts env sym rep r (AxiomInstCo con ind cos)
     -- Do *not* push sym inside top-level axioms
     -- e.g. if g is a top-level axiom
     --   g a : f a ~ a
@@ -307,25 +380,25 @@ opt_co4 env sym rep r (AxiomInstCo con ind cos)
     wrapSym sym $
                        -- some sub-cos might be P: use opt_co2
                        -- See Note [Optimising coercion optimisation]
-    AxiomInstCo con ind (zipWith (opt_co2 env False)
+    AxiomInstCo con ind (zipWith (opt_co2 opts env False)
                                  (coAxBranchRoles (coAxiomNthBranch con ind))
                                  cos)
       -- Note that the_co does *not* have sym pushed into it
 
-opt_co4 env sym rep r (UnivCo prov _r t1 t2)
+opt_co4 opts env sym rep r (UnivCo prov _r t1 t2)
   = ASSERT( r == _r )
-    opt_univ env sym prov (chooseRole rep r) t1 t2
+    opt_univ opts env sym prov (chooseRole rep r) t1 t2
 
-opt_co4 env sym rep r (TransCo co1 co2)
+opt_co4 opts env sym rep r (TransCo co1 co2)
                       -- sym (g `o` h) = sym h `o` sym g
-  | sym       = opt_trans in_scope co2' co1'
-  | otherwise = opt_trans in_scope co1' co2'
+  | sym       = opt_trans opts in_scope co2' co1'
+  | otherwise = opt_trans opts in_scope co1' co2'
   where
-    co1' = opt_co4_wrap env sym rep r co1
-    co2' = opt_co4_wrap env sym rep r co2
+    co1' = opt_co4_wrap opts env sym rep r co1
+    co2' = opt_co4_wrap opts env sym rep r co2
     in_scope = lcInScopeSet env
 
-opt_co4 env _sym rep r (NthCo _r n co)
+opt_co4 _ env _sym rep r (NthCo _r n co)
   | Just (ty, _) <- isReflCo_maybe co
   , Just (_tc, args) <- ASSERT( r == _r )
                         splitTyConApp_maybe ty
@@ -337,22 +410,22 @@ opt_co4 env _sym rep r (NthCo _r n co)
       -- works for both tyvar and covar
   = liftCoSubst (chooseRole rep r) env (varType tv)
 
-opt_co4 env sym rep r (NthCo r1 n (TyConAppCo _ _ cos))
+opt_co4 opts env sym rep r (NthCo r1 n (TyConAppCo _ _ cos))
   = ASSERT( r == r1 )
-    opt_co4_wrap env sym rep r (cos `getNth` n)
+    opt_co4_wrap opts env sym rep r (cos `getNth` n)
 
 -- see the definition of GHC.Builtin.Types.Prim.funTyCon
-opt_co4 env sym rep r (NthCo r1 n (FunCo _r2 w co1 co2))
+opt_co4 opts env sym rep r (NthCo r1 n (FunCo _r2 w co1 co2))
   = ASSERT( r == r1 )
-    opt_co4_wrap env sym rep r (mkNthCoFunCo n w co1 co2)
+    opt_co4_wrap opts env sym rep r (mkNthCoFunCo n w co1 co2)
 
-opt_co4 env sym rep r (NthCo _r n (ForAllCo _ eta _))
+opt_co4 opts env sym rep r (NthCo _r n (ForAllCo _ eta _))
       -- works for both tyvar and covar
   = ASSERT( r == _r )
     ASSERT( n == 0 )
-    opt_co4_wrap env sym rep Nominal eta
+    opt_co4_wrap opts env sym rep Nominal eta
 
-opt_co4 env sym rep r (NthCo _r n co)
+opt_co4 opts env sym rep r (NthCo _r n co)
   | Just nth_co <- case co' of
       TyConAppCo _ _ cos -> Just (cos `getNth` n)
       FunCo _ w co1 co2  -> Just (mkNthCoFunCo n w co1 co2)
@@ -360,36 +433,36 @@ opt_co4 env sym rep r (NthCo _r n co)
       _                  -> Nothing
   = if rep && (r == Nominal)
       -- keep propagating the SubCo
-    then opt_co4_wrap (zapLiftingContext env) False True Nominal nth_co
+    then opt_co4_wrap opts (zapLiftingContext env) False True Nominal nth_co
     else nth_co
 
   | otherwise
   = wrapRole rep r $ NthCo r n co'
   where
-    co' = opt_co1 env sym co
+    co' = opt_co1 opts env sym co
 
-opt_co4 env sym rep r (LRCo lr co)
+opt_co4 opts env sym rep r (LRCo lr co)
   | Just pr_co <- splitAppCo_maybe co
   = ASSERT( r == Nominal )
-    opt_co4_wrap env sym rep Nominal (pick_lr lr pr_co)
+    opt_co4_wrap opts env sym rep Nominal (pick_lr lr pr_co)
   | Just pr_co <- splitAppCo_maybe co'
   = ASSERT( r == Nominal )
     if rep
-    then opt_co4_wrap (zapLiftingContext env) False True Nominal (pick_lr lr pr_co)
+    then opt_co4_wrap opts (zapLiftingContext env) False True Nominal (pick_lr lr pr_co)
     else pick_lr lr pr_co
   | otherwise
   = wrapRole rep Nominal $ LRCo lr co'
   where
-    co' = opt_co4_wrap env sym False Nominal co
+    co' = opt_co4_wrap opts env sym False Nominal co
 
     pick_lr CLeft  (l, _) = l
     pick_lr CRight (_, r) = r
 
 -- See Note [Optimising InstCo]
-opt_co4 env sym rep r (InstCo co1 arg)
+opt_co4 opts env sym rep r (InstCo co1 arg)
     -- forall over type...
   | Just (tv, kind_co, co_body) <- splitForAllCo_ty_maybe co1
-  = opt_co4_wrap (extendLiftingContext env tv
+  = opt_co4_wrap opts (extendLiftingContext env tv
                     (mkCoherenceRightCo Nominal t2 (mkSymCo kind_co) sym_arg))
                    -- mkSymCo kind_co :: k1 ~ k2
                    -- sym_arg :: (t1 :: k1) ~ (t2 :: k2)
@@ -400,15 +473,15 @@ opt_co4 env sym rep r (InstCo co1 arg)
   | Just (cv, kind_co, co_body) <- splitForAllCo_co_maybe co1
   , CoercionTy h1 <- t1
   , CoercionTy h2 <- t2
-  = let new_co = mk_new_co cv (opt_co4_wrap env sym False Nominal kind_co) h1 h2
-    in opt_co4_wrap (extendLiftingContext env cv new_co) sym rep r co_body
+  = let new_co = mk_new_co cv (opt_co4_wrap opts env sym False Nominal kind_co) h1 h2
+    in opt_co4_wrap opts (extendLiftingContext env cv new_co) sym rep r co_body
 
     -- See if it is a forall after optimization
     -- If so, do an inefficient one-variable substitution, then re-optimize
 
     -- forall over type...
   | Just (tv', kind_co', co_body') <- splitForAllCo_ty_maybe co1'
-  = opt_co4_wrap (extendLiftingContext (zapLiftingContext env) tv'
+  = opt_co4_wrap opts (extendLiftingContext (zapLiftingContext env) tv'
                     (mkCoherenceRightCo Nominal t2' (mkSymCo kind_co') arg'))
             False False r' co_body'
 
@@ -417,14 +490,14 @@ opt_co4 env sym rep r (InstCo co1 arg)
   , CoercionTy h1' <- t1'
   , CoercionTy h2' <- t2'
   = let new_co = mk_new_co cv' kind_co' h1' h2'
-    in opt_co4_wrap (extendLiftingContext (zapLiftingContext env) cv' new_co)
+    in opt_co4_wrap opts (extendLiftingContext (zapLiftingContext env) cv' new_co)
                     False False r' co_body'
 
   | otherwise = InstCo co1' arg'
   where
-    co1'    = opt_co4_wrap env sym rep r co1
+    co1'    = opt_co4_wrap opts env sym rep r co1
     r'      = chooseRole rep r
-    arg'    = opt_co4_wrap env sym False Nominal arg
+    arg'    = opt_co4_wrap opts env sym False Nominal arg
     sym_arg = wrapSym sym arg'
 
     -- Performance note: don't be alarmed by the two calls to coercionKind
@@ -452,25 +525,25 @@ opt_co4 env sym rep r (InstCo co1 arg)
          in mkProofIrrelCo Nominal (Refl (coercionType h1)) h1
                            (n1 `mkTransCo` h2 `mkTransCo` (mkSymCo n2))
 
-opt_co4 env sym _rep r (KindCo co)
+opt_co4 opts env sym _rep r (KindCo co)
   = ASSERT( r == Nominal )
     let kco' = promoteCoercion co in
     case kco' of
-      KindCo co' -> promoteCoercion (opt_co1 env sym co')
-      _          -> opt_co4_wrap env sym False Nominal kco'
+      KindCo co' -> promoteCoercion (opt_co1 opts env sym co')
+      _          -> opt_co4_wrap opts env sym False Nominal kco'
   -- This might be able to be optimized more to do the promotion
   -- and substitution/optimization at the same time
 
-opt_co4 env sym _ r (SubCo co)
+opt_co4 opts env sym _ r (SubCo co)
   = ASSERT( r == Representational )
-    opt_co4_wrap env sym True Nominal co
+    opt_co4_wrap opts env sym True Nominal co
 
 -- This could perhaps be optimized more.
-opt_co4 env sym rep r (AxiomRuleCo co cs)
+opt_co4 opts env sym rep r (AxiomRuleCo co cs)
   = ASSERT( r == coaxrRole co )
     wrapRole rep r $
     wrapSym sym $
-    AxiomRuleCo co (zipWith (opt_co2 env False) (coaxrAsmpRoles co) cs)
+    AxiomRuleCo co (zipWith (opt_co2 opts env False) (coaxrAsmpRoles co) cs)
 
 {- Note [Optimise CoVarCo to Refl]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -484,9 +557,9 @@ in GHC.Core.Coercion.
 -------------
 -- | Optimize a phantom coercion. The input coercion may not necessarily
 -- be a phantom, but the output sure will be.
-opt_phantom :: LiftingContext -> SymFlag -> Coercion -> NormalCo
-opt_phantom env sym co
-  = opt_univ env sym (PhantomProv (mkKindCo co)) Phantom ty1 ty2
+opt_phantom :: OptCoercionOpts -> LiftingContext -> SymFlag -> Coercion -> NormalCo
+opt_phantom opts env sym co
+  = opt_univ opts env sym (PhantomProv (mkKindCo co)) Phantom ty1 ty2
   where
     Pair ty1 ty2 = coercionKind co
 
@@ -521,17 +594,18 @@ See #19509.
 
  -}
 
-opt_univ :: LiftingContext -> SymFlag -> UnivCoProvenance -> Role
+opt_univ :: OptCoercionOpts -> LiftingContext -> SymFlag -> UnivCoProvenance -> Role
          -> Type -> Type -> Coercion
-opt_univ env sym (PhantomProv h) _r ty1 ty2
+opt_univ opts env sym (PhantomProv h) _r ty1 ty2
   | sym       = mkPhantomCo h' ty2' ty1'
   | otherwise = mkPhantomCo h' ty1' ty2'
   where
-    h' = opt_co4 env sym False Nominal h
+    h' = opt_co4 opts env sym False Nominal h
     ty1' = substTy (lcSubstLeft  env) ty1
     ty2' = substTy (lcSubstRight env) ty2
 
-opt_univ env sym prov role oty1 oty2
+opt_univ opts env sym prov role oty1 oty2
+
   | Just (tc1, tys1) <- splitTyConApp_maybe oty1
   , Just (tc2, tys2) <- splitTyConApp_maybe oty2
   , tc1 == tc2
@@ -541,7 +615,7 @@ opt_univ env sym prov role oty1 oty2
       -- Phantom is already taken care of, and ProofIrrel doesn't relate tyconapps
   = let roles    = tyConRolesX role tc1
         arg_cos  = zipWith3 (mkUnivCo prov') roles tys1 tys2
-        arg_cos' = zipWith (opt_co4 env sym False) roles arg_cos
+        arg_cos' = zipWith (opt_co4 opts env sym False) roles arg_cos
     in
     mkTyConAppCo role tc1 arg_cos'
 
@@ -556,9 +630,9 @@ opt_univ env sym prov role oty1 oty2
           -- eta gets opt'ed soon, but not yet.
         ty2' = substTyWith [tv2] [TyVarTy tv1 `mkCastTy` eta] ty2
 
-        (env', tv1', eta') = optForAllCoBndr env sym tv1 eta
+        (env', tv1', eta') = optForAllCoBndr opts env sym tv1 eta
     in
-    mkForAllCo tv1' eta' (opt_univ env' sym prov' role ty1 ty2')
+    mkForAllCo tv1' eta' (opt_univ opts env' sym prov' role ty1 ty2')
 
   | Just (cv1, ty1) <- splitForAllCoVar_maybe oty1
   , Just (cv2, ty2) <- splitForAllCoVar_maybe oty2
@@ -574,9 +648,9 @@ opt_univ env sym prov role oty1 oty2
                 (mkNthCo r' 3 eta_d)
         ty2'  = substTyWithCoVars [cv2] [n_co] ty2
 
-        (env', cv1', eta') = optForAllCoBndr env sym cv1 eta
+        (env', cv1', eta') = optForAllCoBndr opts env sym cv1 eta
     in
-    mkForAllCo cv1' eta' (opt_univ env' sym prov' role ty1 ty2')
+    mkForAllCo cv1' eta' (opt_univ opts env' sym prov' role ty1 ty2')
 
   | otherwise
   = let ty1 = substTyUnchecked (lcSubstLeft  env) oty1
@@ -590,80 +664,80 @@ opt_univ env sym prov role oty1 oty2
     prov' = case prov of
 #if __GLASGOW_HASKELL__ < 901
 -- This alt is redundant with the first match of the FunDef
-      PhantomProv kco    -> PhantomProv $ opt_co4_wrap env sym False Nominal kco
+      PhantomProv kco    -> PhantomProv $ opt_co4_wrap opts env sym False Nominal kco
 #endif
-      ProofIrrelProv kco -> ProofIrrelProv $ opt_co4_wrap env sym False Nominal kco
+      ProofIrrelProv kco -> ProofIrrelProv $ opt_co4_wrap opts env sym False Nominal kco
       PluginProv _       -> prov
 
 -------------
-opt_transList :: HasDebugCallStack => InScopeSet -> [NormalCo] -> [NormalCo] -> [NormalCo]
-opt_transList is = zipWithEqual "opt_transList" (opt_trans is)
+opt_transList :: HasDebugCallStack => OptCoercionOpts -> InScopeSet -> [NormalCo] -> [NormalCo] -> [NormalCo]
+opt_transList opts is = zipWithEqual "opt_transList" (opt_trans opts is)
   -- The input lists must have identical length.
 
-opt_trans :: InScopeSet -> NormalCo -> NormalCo -> NormalCo
-opt_trans is co1 co2
+opt_trans :: OptCoercionOpts -> InScopeSet -> NormalCo -> NormalCo -> NormalCo
+opt_trans opts is co1 co2
   | isReflCo co1 = co2
     -- optimize when co1 is a Refl Co
-  | otherwise    = opt_trans1 is co1 co2
+  | otherwise    = opt_trans1 opts is co1 co2
 
-opt_trans1 :: InScopeSet -> NormalNonIdCo -> NormalCo -> NormalCo
+opt_trans1 :: OptCoercionOpts -> InScopeSet -> NormalNonIdCo -> NormalCo -> NormalCo
 -- First arg is not the identity
-opt_trans1 is co1 co2
+opt_trans1 opts is co1 co2
   | isReflCo co2 = co1
     -- optimize when co2 is a Refl Co
-  | otherwise    = opt_trans2 is co1 co2
+  | otherwise    = opt_trans2 opts is co1 co2
 
-opt_trans2 :: InScopeSet -> NormalNonIdCo -> NormalNonIdCo -> NormalCo
+opt_trans2 :: OptCoercionOpts -> InScopeSet -> NormalNonIdCo -> NormalNonIdCo -> NormalCo
 -- Neither arg is the identity
-opt_trans2 is (TransCo co1a co1b) co2
+opt_trans2 opts is (TransCo co1a co1b) co2
     -- Don't know whether the sub-coercions are the identity
-  = opt_trans is co1a (opt_trans is co1b co2)
+  = opt_trans opts is co1a (opt_trans opts is co1b co2)
 
-opt_trans2 is co1 co2
-  | Just co <- opt_trans_rule is co1 co2
+opt_trans2 opts is co1 co2
+  | Just co <- opt_trans_rule opts is co1 co2
   = co
 
-opt_trans2 is co1 (TransCo co2a co2b)
-  | Just co1_2a <- opt_trans_rule is co1 co2a
+opt_trans2 opts is co1 (TransCo co2a co2b)
+  | Just co1_2a <- opt_trans_rule opts is co1 co2a
   = if isReflCo co1_2a
     then co2b
-    else opt_trans1 is co1_2a co2b
+    else opt_trans1 opts is co1_2a co2b
 
-opt_trans2 _ co1 co2
+opt_trans2 _ _ co1 co2
   = mkTransCo co1 co2
 
 ------
 -- Optimize coercions with a top-level use of transitivity.
-opt_trans_rule :: InScopeSet -> NormalNonIdCo -> NormalNonIdCo -> Maybe NormalCo
+opt_trans_rule :: OptCoercionOpts -> InScopeSet -> NormalNonIdCo -> NormalNonIdCo -> Maybe NormalCo
 
-opt_trans_rule is in_co1@(GRefl r1 t1 (MCo co1)) in_co2@(GRefl r2 _ (MCo co2))
+opt_trans_rule opts is in_co1@(GRefl r1 t1 (MCo co1)) in_co2@(GRefl r2 _ (MCo co2))
   = ASSERT( r1 == r2 )
     fireTransRule "GRefl" in_co1 in_co2 $
-    mkGReflRightCo r1 t1 (opt_trans is co1 co2)
+    mkGReflRightCo r1 t1 (opt_trans opts is co1 co2)
 
 -- Push transitivity through matching destructors
-opt_trans_rule is in_co1@(NthCo r1 d1 co1) in_co2@(NthCo r2 d2 co2)
+opt_trans_rule opts is in_co1@(NthCo r1 d1 co1) in_co2@(NthCo r2 d2 co2)
   | d1 == d2
   , coercionRole co1 == coercionRole co2
   , co1 `compatible_co` co2
   = ASSERT( r1 == r2 )
     fireTransRule "PushNth" in_co1 in_co2 $
-    mkNthCo r1 d1 (opt_trans is co1 co2)
+    mkNthCo r1 d1 (opt_trans opts is co1 co2)
 
-opt_trans_rule is in_co1@(LRCo d1 co1) in_co2@(LRCo d2 co2)
+opt_trans_rule opts is in_co1@(LRCo d1 co1) in_co2@(LRCo d2 co2)
   | d1 == d2
   , co1 `compatible_co` co2
   = fireTransRule "PushLR" in_co1 in_co2 $
-    mkLRCo d1 (opt_trans is co1 co2)
+    mkLRCo d1 (opt_trans opts is co1 co2)
 
 -- Push transitivity inside instantiation
-opt_trans_rule is in_co1@(InstCo co1 ty1) in_co2@(InstCo co2 ty2)
+opt_trans_rule opts is in_co1@(InstCo co1 ty1) in_co2@(InstCo co2 ty2)
   | ty1 `eqCoercion` ty2
   , co1 `compatible_co` co2
   = fireTransRule "TrPushInst" in_co1 in_co2 $
-    mkInstCo (opt_trans is co1 co2) ty1
+    mkInstCo (opt_trans opts is co1 co2) ty1
 
-opt_trans_rule is in_co1@(UnivCo p1 r1 tyl1 _tyr1)
+opt_trans_rule opts is in_co1@(UnivCo p1 r1 tyl1 _tyr1)
                   in_co2@(UnivCo p2 r2 _tyl2 tyr2)
   | Just prov' <- opt_trans_prov p1 p2
   = ASSERT( r1 == r2 )
@@ -672,50 +746,50 @@ opt_trans_rule is in_co1@(UnivCo p1 r1 tyl1 _tyr1)
   where
     -- if the provenances are different, opt'ing will be very confusing
     opt_trans_prov (PhantomProv kco1)    (PhantomProv kco2)
-      = Just $ PhantomProv $ opt_trans is kco1 kco2
+      = Just $ PhantomProv $ opt_trans opts is kco1 kco2
     opt_trans_prov (ProofIrrelProv kco1) (ProofIrrelProv kco2)
-      = Just $ ProofIrrelProv $ opt_trans is kco1 kco2
+      = Just $ ProofIrrelProv $ opt_trans opts is kco1 kco2
     opt_trans_prov (PluginProv str1)     (PluginProv str2)     | str1 == str2 = Just p1
     opt_trans_prov _ _ = Nothing
 
 -- Push transitivity down through matching top-level constructors.
-opt_trans_rule is in_co1@(TyConAppCo r1 tc1 cos1) in_co2@(TyConAppCo r2 tc2 cos2)
+opt_trans_rule opts is in_co1@(TyConAppCo r1 tc1 cos1) in_co2@(TyConAppCo r2 tc2 cos2)
   | tc1 == tc2
   = ASSERT( r1 == r2 )
     fireTransRule "PushTyConApp" in_co1 in_co2 $
-    mkTyConAppCo r1 tc1 (opt_transList is cos1 cos2)
+    mkTyConAppCo r1 tc1 (opt_transList opts is cos1 cos2)
 
-opt_trans_rule is in_co1@(FunCo r1 w1 co1a co1b) in_co2@(FunCo r2 w2 co2a co2b)
+opt_trans_rule opts is in_co1@(FunCo r1 w1 co1a co1b) in_co2@(FunCo r2 w2 co2a co2b)
   = ASSERT( r1 == r2)   -- Just like the TyConAppCo/TyConAppCo case
     fireTransRule "PushFun" in_co1 in_co2 $
-    mkFunCo r1 (opt_trans is w1 w2) (opt_trans is co1a co2a) (opt_trans is co1b co2b)
+    mkFunCo r1 (opt_trans opts is w1 w2) (opt_trans opts is co1a co2a) (opt_trans opts is co1b co2b)
 
-opt_trans_rule is in_co1@(AppCo co1a co1b) in_co2@(AppCo co2a co2b)
+opt_trans_rule opts is in_co1@(AppCo co1a co1b) in_co2@(AppCo co2a co2b)
   -- Must call opt_trans_rule_app; see Note [EtaAppCo]
-  = opt_trans_rule_app is in_co1 in_co2 co1a [co1b] co2a [co2b]
+  = opt_trans_rule_app opts is in_co1 in_co2 co1a [co1b] co2a [co2b]
 
 -- Eta rules
-opt_trans_rule is co1@(TyConAppCo r tc cos1) co2
+opt_trans_rule opts is co1@(TyConAppCo r tc cos1) co2
   | Just cos2 <- etaTyConAppCo_maybe tc co2
   = fireTransRule "EtaCompL" co1 co2 $
-    mkTyConAppCo r tc (opt_transList is cos1 cos2)
+    mkTyConAppCo r tc (opt_transList opts is cos1 cos2)
 
-opt_trans_rule is co1 co2@(TyConAppCo r tc cos2)
+opt_trans_rule opts is co1 co2@(TyConAppCo r tc cos2)
   | Just cos1 <- etaTyConAppCo_maybe tc co1
   = fireTransRule "EtaCompR" co1 co2 $
-    mkTyConAppCo r tc (opt_transList is cos1 cos2)
+    mkTyConAppCo r tc (opt_transList opts is cos1 cos2)
 
-opt_trans_rule is co1@(AppCo co1a co1b) co2
+opt_trans_rule opts is co1@(AppCo co1a co1b) co2
   | Just (co2a,co2b) <- etaAppCo_maybe co2
-  = opt_trans_rule_app is co1 co2 co1a [co1b] co2a [co2b]
+  = opt_trans_rule_app opts is co1 co2 co1a [co1b] co2a [co2b]
 
-opt_trans_rule is co1 co2@(AppCo co2a co2b)
+opt_trans_rule opts is co1 co2@(AppCo co2a co2b)
   | Just (co1a,co1b) <- etaAppCo_maybe co1
-  = opt_trans_rule_app is co1 co2 co1a [co1b] co2a [co2b]
+  = opt_trans_rule_app opts is co1 co2 co1a [co1b] co2a [co2b]
 
 -- Push transitivity inside forall
 -- forall over types.
-opt_trans_rule is co1 co2
+opt_trans_rule opts is co1 co2
   | Just (tv1, eta1, r1) <- splitForAllCo_ty_maybe co1
   , Just (tv2, eta2, r2) <- etaForAllCo_ty_maybe co2
   = push_trans tv1 eta1 r1 tv2 eta2 r2
@@ -732,14 +806,14 @@ opt_trans_rule is co1 co2
     -- Wanted:
     --   /\tv1 : (eta1;eta2).  (r1; r2[tv2 |-> tv1 |> eta1])
     = fireTransRule "EtaAllTy_ty" co1 co2 $
-      mkForAllCo tv1 (opt_trans is eta1 eta2) (opt_trans is' r1 r2')
+      mkForAllCo tv1 (opt_trans opts is eta1 eta2) (opt_trans opts is' r1 r2')
     where
       is' = is `extendInScopeSet` tv1
       r2' = substCoWithUnchecked [tv2] [mkCastTy (TyVarTy tv1) eta1] r2
 
 -- Push transitivity inside forall
 -- forall over coercions.
-opt_trans_rule is co1 co2
+opt_trans_rule opts is co1 co2
   | Just (cv1, eta1, r1) <- splitForAllCo_co_maybe co1
   , Just (cv2, eta2, r2) <- etaForAllCo_co_maybe co2
   = push_trans cv1 eta1 r1 cv2 eta2 r2
@@ -758,7 +832,7 @@ opt_trans_rule is co1 co2
     --   n2 = nth 3 eta1
     --   nco = /\ cv1 : (eta1;eta2). (r1; r2[cv2 |-> (sym n1);cv1;n2])
     = fireTransRule "EtaAllTy_co" co1 co2 $
-      mkForAllCo cv1 (opt_trans is eta1 eta2) (opt_trans is' r1 r2')
+      mkForAllCo cv1 (opt_trans opts is eta1 eta2) (opt_trans opts is' r1 r2')
     where
       is'  = is `extendInScopeSet` cv1
       role = coVarRole cv1
@@ -770,42 +844,46 @@ opt_trans_rule is co1 co2
                     r2
 
 -- Push transitivity inside axioms
-opt_trans_rule is co1 co2
+opt_trans_rule opts is co1 co2
 
   -- See Note [Why call checkAxInstCo during optimisation]
-  -- TrPushSymAxR
-  | Just (sym, con, ind, cos1) <- co1_is_axiom_maybe
+  -- TrPushSymAxR (SymAxSuckR)
+  | optCoercionNonSimple opts
+  , Just (sym, con, ind, cos1) <- co1_is_axiom_maybe
   , True <- sym
   , Just cos2 <- matchAxiom sym con ind co2
-  , let newAxInst = AxiomInstCo con ind (opt_transList is (map mkSymCo cos2) cos1)
+  , let newAxInst = AxiomInstCo con ind (opt_transList opts is (map mkSymCo cos2) cos1)
   , Nothing <- checkAxInstCo newAxInst
   = fireTransRule "TrPushSymAxR" co1 co2 $ SymCo newAxInst
 
-  -- TrPushAxR
-  | Just (sym, con, ind, cos1) <- co1_is_axiom_maybe
+  -- TrPushAxR (AxSuckR)
+  | optCoercionNonSimple opts
+  , Just (sym, con, ind, cos1) <- co1_is_axiom_maybe
   , False <- sym
   , Just cos2 <- matchAxiom sym con ind co2
-  , let newAxInst = AxiomInstCo con ind (opt_transList is cos1 cos2)
+  , let newAxInst = AxiomInstCo con ind (opt_transList opts is cos1 cos2)
   , Nothing <- checkAxInstCo newAxInst
   = fireTransRule "TrPushAxR" co1 co2 newAxInst
 
-  -- TrPushSymAxL
-  | Just (sym, con, ind, cos2) <- co2_is_axiom_maybe
+  -- TrPushSymAxL (SymAxSuckL)
+  | optCoercionNonSimple opts
+  , Just (sym, con, ind, cos2) <- co2_is_axiom_maybe
   , True <- sym
   , Just cos1 <- matchAxiom (not sym) con ind co1
-  , let newAxInst = AxiomInstCo con ind (opt_transList is cos2 (map mkSymCo cos1))
+  , let newAxInst = AxiomInstCo con ind (opt_transList opts is cos2 (map mkSymCo cos1))
   , Nothing <- checkAxInstCo newAxInst
   = fireTransRule "TrPushSymAxL" co1 co2 $ SymCo newAxInst
 
-  -- TrPushAxL
-  | Just (sym, con, ind, cos2) <- co2_is_axiom_maybe
+  -- TrPushAxL (AxSuckL)
+  | optCoercionNonSimple opts
+  , Just (sym, con, ind, cos2) <- co2_is_axiom_maybe
   , False <- sym
   , Just cos1 <- matchAxiom (not sym) con ind co1
-  , let newAxInst = AxiomInstCo con ind (opt_transList is cos1 cos2)
+  , let newAxInst = AxiomInstCo con ind (opt_transList opts is cos1 cos2)
   , Nothing <- checkAxInstCo newAxInst
   = fireTransRule "TrPushAxL" co1 co2 newAxInst
 
-  -- TrPushAxSym/TrPushSymAx
+  -- TrPushAxSym/TrPushSymAx (AxSym/SymAx)
   | Just (sym1, con1, ind1, cos1) <- co1_is_axiom_maybe
   , Just (sym2, con2, ind2, cos2) <- co2_is_axiom_maybe
   , con1 == con2
@@ -819,16 +897,16 @@ opt_trans_rule is co1 co2
   , all (`elemVarSet` pivot_tvs) qtvs
   = fireTransRule "TrPushAxSym" co1 co2 $
     if sym2
-       -- TrPushAxSym
-    then liftCoSubstWith role qtvs (opt_transList is cos1 (map mkSymCo cos2)) lhs
-       -- TrPushSymAx
-    else liftCoSubstWith role qtvs (opt_transList is (map mkSymCo cos1) cos2) rhs
+       -- TrPushAxSym (AxSym)
+    then liftCoSubstWith role qtvs (opt_transList opts is cos1 (map mkSymCo cos2)) lhs
+       -- TrPushSymAx (SymAx)
+    else liftCoSubstWith role qtvs (opt_transList opts is (map mkSymCo cos1) cos2) rhs
   where
     co1_is_axiom_maybe = isAxiom_maybe co1
     co2_is_axiom_maybe = isAxiom_maybe co2
     role = coercionRole co1 -- should be the same as coercionRole co2!
 
-opt_trans_rule _ co1 co2        -- Identity rule
+opt_trans_rule _ _ co1 co2        -- Identity rule
   | let ty1 = coercionLKind co1
         r   = coercionRole co1
         ty2 = coercionRKind co2
@@ -836,10 +914,11 @@ opt_trans_rule _ co1 co2        -- Identity rule
   = fireTransRule "RedTypeDirRefl" co1 co2 $
     mkReflCo r ty2
 
-opt_trans_rule _ _ _ = Nothing
+opt_trans_rule _ _ _ _ = Nothing
 
 -- See Note [EtaAppCo]
-opt_trans_rule_app :: InScopeSet
+opt_trans_rule_app :: OptCoercionOpts
+                   -> InScopeSet
                    -> Coercion   -- original left-hand coercion (printing only)
                    -> Coercion   -- original right-hand coercion (printing only)
                    -> Coercion   -- left-hand coercion "function"
@@ -847,14 +926,14 @@ opt_trans_rule_app :: InScopeSet
                    -> Coercion   -- right-hand coercion "function"
                    -> [Coercion] -- right-hand coercion "args"
                    -> Maybe Coercion
-opt_trans_rule_app is orig_co1 orig_co2 co1a co1bs co2a co2bs
+opt_trans_rule_app opts is orig_co1 orig_co2 co1a co1bs co2a co2bs
   | AppCo co1aa co1ab <- co1a
   , Just (co2aa, co2ab) <- etaAppCo_maybe co2a
-  = opt_trans_rule_app is orig_co1 orig_co2 co1aa (co1ab:co1bs) co2aa (co2ab:co2bs)
+  = opt_trans_rule_app opts is orig_co1 orig_co2 co1aa (co1ab:co1bs) co2aa (co2ab:co2bs)
 
   | AppCo co2aa co2ab <- co2a
   , Just (co1aa, co1ab) <- etaAppCo_maybe co1a
-  = opt_trans_rule_app is orig_co1 orig_co2 co1aa (co1ab:co1bs) co2aa (co2ab:co2bs)
+  = opt_trans_rule_app opts is orig_co1 orig_co2 co1aa (co1ab:co1bs) co2aa (co2ab:co2bs)
 
   | otherwise
   = ASSERT( co1bs `equalLength` co2bs )
@@ -875,8 +954,8 @@ opt_trans_rule_app is orig_co1 orig_co2 co1a co1bs co2a co2bs
         co2bs'  = zipWith3 mkGReflLeftCo rt2bs lt2bs kcobs
         co2bs'' = zipWith mkTransCo co2bs' co2bs
     in
-    mkAppCos (opt_trans is co1a co2a')
-             (zipWith (opt_trans is) co1bs co2bs'')
+    mkAppCos (opt_trans opts is co1a co2a')
+             (zipWith (opt_trans opts is) co1bs co2bs'')
 
 fireTransRule :: String -> Coercion -> Coercion -> Coercion -> Maybe Coercion
 fireTransRule _rule _co1 _co2 res
@@ -1236,7 +1315,40 @@ and these two imply
 
 -}
 
-optForAllCoBndr :: LiftingContext -> Bool
+optForAllCoBndr :: OptCoercionOpts -> LiftingContext -> Bool
                 -> TyCoVar -> Coercion -> (LiftingContext, TyCoVar, Coercion)
-optForAllCoBndr env sym
-  = substForAllCoBndrUsingLC sym (opt_co4_wrap env sym False Nominal) env
+optForAllCoBndr opts env sym
+  = substForAllCoBndrUsingLC sym (opt_co4_wrap opts env sym False Nominal) env
+
+
+-- | A core-to-core pass that runs the full coercion optimizer (including
+-- optimizations that are not "simple").
+optCoercionProgram :: CoreProgram -> CoreProgram
+optCoercionProgram binds = map go_bind binds
+  where
+    go_bind :: CoreBind -> CoreBind
+    go_bind (NonRec v e) = NonRec v (go e)
+    go_bind (Rec pairs) = Rec (map (fmap go) pairs)
+
+    go :: CoreExpr -> CoreExpr
+    go e@(Var{})    = e
+    go e@(Lit {})   = e
+    go (Type ty)    = Type (go_ty ty)
+    go (Coercion c) = Coercion (go_co c)
+    go (Cast e' c) = Cast (go e') (go_co c)
+    go (Tick t e') = Tick t (go e')
+    go (App e1 e2) = App (go e1) (go e2)
+    go (Lam v e')  = Lam v (go e')
+    go (Let bind e') = Let (go_bind bind) (go e')
+    go (Case scrut bndr ty alts) = Case (go scrut) bndr (go_ty ty) (map go_alt alts)
+
+    go_alt :: CoreAlt -> CoreAlt
+    go_alt (Alt dc pats rhs) = Alt dc pats (go rhs)
+
+    -- TODO: should we walk over types to look for CastTy and CoercionTy?  I
+    -- think they may be sufficiently uncommon that it is better not to bother.
+    go_ty :: Type -> Type
+    go_ty = id
+
+    go_co :: Coercion -> Coercion
+    go_co = optCoercion FullCoercionOpt emptyTCvSubst
diff --git a/compiler/GHC/Core/Lint.hs b/compiler/GHC/Core/Lint.hs
index 3438f372fc..dfd80b03d4 100644
--- a/compiler/GHC/Core/Lint.hs
+++ b/compiler/GHC/Core/Lint.hs
@@ -362,6 +362,7 @@ coreDumpFlag CoreTidy                 = Just Opt_D_dump_simpl
 coreDumpFlag CorePrep                 = Just Opt_D_dump_prep
 coreDumpFlag CoreOccurAnal            = Just Opt_D_dump_occur_anal
 
+coreDumpFlag CoreDoOptCoercion        = Nothing
 coreDumpFlag CoreAddCallerCcs         = Nothing
 coreDumpFlag CoreDoPrintCore          = Nothing
 coreDumpFlag (CoreDoRuleCheck {})     = Nothing
diff --git a/compiler/GHC/Core/Opt/Monad.hs b/compiler/GHC/Core/Opt/Monad.hs
index 3c6ff07a65..32b7b9d3d4 100644
--- a/compiler/GHC/Core/Opt/Monad.hs
+++ b/compiler/GHC/Core/Opt/Monad.hs
@@ -132,6 +132,7 @@ data CoreToDo           -- These are diff core-to-core passes,
   | CorePrep
   | CoreAddCallerCcs
   | CoreOccurAnal
+  | CoreDoOptCoercion
 
 instance Outputable CoreToDo where
   ppr (CoreDoSimplify _ _)     = text "Simplifier"
@@ -154,6 +155,7 @@ instance Outputable CoreToDo where
   ppr CoreAddCallerCcs         = text "Add caller cost-centres"
   ppr CorePrep                 = text "CorePrep"
   ppr CoreOccurAnal            = text "Occurrence analysis"
+  ppr CoreDoOptCoercion        = text "Coercion optimization"
   ppr CoreDoPrintCore          = text "Print core"
   ppr (CoreDoRuleCheck {})     = text "Rule check"
   ppr CoreDoNothing            = text "CoreDoNothing"
diff --git a/compiler/GHC/Core/Opt/Pipeline.hs b/compiler/GHC/Core/Opt/Pipeline.hs
index f81f45eba2..aa6956a580 100644
--- a/compiler/GHC/Core/Opt/Pipeline.hs
+++ b/compiler/GHC/Core/Opt/Pipeline.hs
@@ -19,6 +19,7 @@ import GHC.Driver.Env
 import GHC.Platform.Ways  ( hasWay, Way(WayProf) )
 
 import GHC.Core
+import GHC.Core.Coercion.Opt ( optCoercionProgram )
 import GHC.Core.Opt.CSE  ( cseProgram )
 import GHC.Core.Rules   ( mkRuleBase, unionRuleBase,
                           extendRuleBaseList, ruleCheckProgram, addRuleInfo,
@@ -152,6 +153,7 @@ getCoreToDo logger dflags
     eta_expand_on = gopt Opt_DoLambdaEtaExpansion         dflags
     pre_inline_on = gopt Opt_SimplPreInlining             dflags
     ww_on         = gopt Opt_WorkerWrapper                dflags
+    opt_coercion  = gopt Opt_OptCoercionFull              dflags
     static_ptrs   = xopt LangExt.StaticPointers           dflags
     profiling     = ways dflags `hasWay` WayProf
 
@@ -226,7 +228,8 @@ getCoreToDo logger dflags
 
     core_todo =
      if opt_level == 0 then
-       [ static_ptrs_float_outwards,
+       [ runWhen opt_coercion CoreDoOptCoercion,
+         static_ptrs_float_outwards,
          CoreDoSimplify max_iter
              (base_mode { sm_phase = FinalPhase
                         , sm_names = ["Non-opt simplification"] })
@@ -235,6 +238,8 @@ getCoreToDo logger dflags
 
      else {- opt_level >= 1 -} [
 
+       runWhen opt_coercion CoreDoOptCoercion,
+
     -- We want to do the static argument transform before full laziness as it
     -- may expose extra opportunities to float things outwards. However, to fix
     -- up the output of the transformation we need at do at least one simplify
@@ -523,6 +528,9 @@ doCorePass pass guts = do
     CoreAddCallerCcs          -> {-# SCC "AddCallerCcs" #-}
                                  addCallerCostCentres guts
 
+    CoreDoOptCoercion         -> {-# SCC "OptCoercion" #-}
+                                 doPass optCoercionProgram guts
+
     CoreDoPrintCore           -> observe (printCore logger) guts
 
     CoreDoRuleCheck phase pat -> ruleCheckPass phase pat guts
diff --git a/compiler/GHC/Core/SimpleOpt.hs b/compiler/GHC/Core/SimpleOpt.hs
index 9e3e92d247..ffb0ee1844 100644
--- a/compiler/GHC/Core/SimpleOpt.hs
+++ b/compiler/GHC/Core/SimpleOpt.hs
@@ -99,8 +99,7 @@ data SimpleOpts = SimpleOpts
 defaultSimpleOpts :: SimpleOpts
 defaultSimpleOpts = SimpleOpts
    { so_uf_opts = defaultUnfoldingOpts
-   , so_co_opts = OptCoercionOpts
-      { optCoercionEnabled = False }
+   , so_co_opts = NoCoercionOpt
    }
 
 simpleOptExpr :: HasDebugCallStack => SimpleOpts -> CoreExpr -> CoreExpr
diff --git a/compiler/GHC/Driver/Config.hs b/compiler/GHC/Driver/Config.hs
index 6bd8988add..46a0a85ac7 100644
--- a/compiler/GHC/Driver/Config.hs
+++ b/compiler/GHC/Driver/Config.hs
@@ -13,11 +13,13 @@ import GHC.Core.SimpleOpt
 import GHC.Core.Coercion.Opt
 import GHC.Parser.Lexer
 
--- | Initialise coercion optimiser configuration from DynFlags
+-- | Initialise coercion optimiser configuration from DynFlags, for use when
+-- performing "simple" coercion optimisation.
+-- See Note [Simple and full coercion optimisation] in GHC.Core.Coercion.Opt.
 initOptCoercionOpts :: DynFlags -> OptCoercionOpts
-initOptCoercionOpts dflags = OptCoercionOpts
-   { optCoercionEnabled = not (hasNoOptCoercion dflags)
-   }
+initOptCoercionOpts dflags
+  | gopt Opt_OptCoercionSimple dflags = SimpleCoercionOpt
+  | otherwise                         = NoCoercionOpt
 
 -- | Initialise Simple optimiser configuration from DynFlags
 initSimpleOpts :: DynFlags -> SimpleOpts
diff --git a/compiler/GHC/Driver/Flags.hs b/compiler/GHC/Driver/Flags.hs
index 393927e1b2..a18f5ee682 100644
--- a/compiler/GHC/Driver/Flags.hs
+++ b/compiler/GHC/Driver/Flags.hs
@@ -204,6 +204,9 @@ data GeneralFlag
    | Opt_AlignmentSanitisation
    | Opt_CatchBottoms
    | Opt_NumConstantFolding
+   | Opt_OptCoercionFull   -- ^ Run the full coercion optimizer (as an optimization pass)
+   | Opt_OptCoercionSimple -- ^ Run the simple coercion optimizer (as part of the simple optimizer / simplifier).
+                           -- See Note [Simple and full coercion optimisation] in GHC.Core.Coercion.Opt.
 
    -- PreInlining is on by default. The option is there just to see how
    -- bad things get if you turn it off!
@@ -355,7 +358,6 @@ data GeneralFlag
    | Opt_PluginTrustworthy
 
    | Opt_G_NoStateHack
-   | Opt_G_NoOptCoercion
    deriving (Eq, Show, Enum)
 
 -- Check whether a flag should be considered an "optimisation flag"
diff --git a/compiler/GHC/Driver/Session.hs b/compiler/GHC/Driver/Session.hs
index 969d63a54b..ea82f8d535 100644
--- a/compiler/GHC/Driver/Session.hs
+++ b/compiler/GHC/Driver/Session.hs
@@ -28,7 +28,7 @@ module GHC.Driver.Session (
         ProfAuto(..),
         glasgowExtsFlags,
         warningGroups, warningHierarchies,
-        hasPprDebug, hasNoDebugOutput, hasNoStateHack, hasNoOptCoercion,
+        hasPprDebug, hasNoDebugOutput, hasNoStateHack,
         dopt, dopt_set, dopt_unset,
         gopt, gopt_set, gopt_unset, setGeneralFlag', unSetGeneralFlag',
         wopt, wopt_set, wopt_unset,
@@ -1398,9 +1398,6 @@ hasNoDebugOutput = dopt Opt_D_no_debug_output
 hasNoStateHack :: DynFlags -> Bool
 hasNoStateHack = gopt Opt_G_NoStateHack
 
-hasNoOptCoercion :: DynFlags -> Bool
-hasNoOptCoercion = gopt Opt_G_NoOptCoercion
-
 
 -- | Test whether a 'DumpFlag' is set
 dopt :: DumpFlag -> DynFlags -> Bool
@@ -2236,8 +2233,10 @@ dynamic_flags_deps = [
         (NoArg (setGeneralFlag Opt_NoHsMain))
   , make_ord_flag defGhcFlag "fno-state-hack"
         (NoArg (setGeneralFlag Opt_G_NoStateHack))
+  , make_ord_flag defGhcFlag "fopt-coercion"
+        (NoArg (setGeneralFlag Opt_OptCoercionSimple >> setGeneralFlag Opt_OptCoercionFull))
   , make_ord_flag defGhcFlag "fno-opt-coercion"
-        (NoArg (setGeneralFlag Opt_G_NoOptCoercion))
+        (NoArg (unSetGeneralFlag Opt_OptCoercionSimple >> unSetGeneralFlag Opt_OptCoercionFull))
   , make_ord_flag defGhcFlag "with-rtsopts"
         (HasArg setRtsOpts)
   , make_ord_flag defGhcFlag "rtsopts"
@@ -3299,6 +3298,8 @@ fFlagsDeps = [
   flagSpec "block-layout-weightless"          Opt_WeightlessBlocklayout,
   flagSpec "omit-interface-pragmas"           Opt_OmitInterfacePragmas,
   flagSpec "omit-yields"                      Opt_OmitYields,
+  flagSpec "opt-coercion-full"                Opt_OptCoercionFull,
+  flagSpec "opt-coercion-simple"              Opt_OptCoercionSimple,
   flagSpec "optimal-applicative-do"           Opt_OptimalApplicativeDo,
   flagSpec "pedantic-bottoms"                 Opt_PedanticBottoms,
   flagSpec "pre-inlining"                     Opt_SimplPreInlining,
@@ -3868,6 +3869,9 @@ optLevelFlags -- see Note [Documenting optimisation flags]
     , ([1,2],   Opt_SolveConstantDicts)
     , ([1,2],   Opt_NumConstantFolding)
 
+    , ([0,1,2], Opt_OptCoercionSimple)
+    , ([1,2],   Opt_OptCoercionFull)
+
     , ([2],     Opt_LiberateCase)
     , ([2],     Opt_SpecConstr)
 --  , ([2],     Opt_RegsGraph)
diff --git a/docs/users_guide/using-optimisation.rst b/docs/users_guide/using-optimisation.rst
index 2722a285df..9cea41166b 100644
--- a/docs/users_guide/using-optimisation.rst
+++ b/docs/users_guide/using-optimisation.rst
@@ -664,14 +664,39 @@ by saying ``-fno-wombat``.
     value arguments of the resulting worker exceeds both that of the original
     function and this setting.
 
-.. ghc-flag:: -fno-opt-coercion
-    :shortdesc: Turn off the coercion optimiser
+.. ghc-flag:: -fopt-coercion
+    :shortdesc: Perform coercion optimisation.
     :type: dynamic
+    :reverse: -fno-opt-coercion
     :category:
 
-    :default: coercion optimisation enabled.
+    :default: off
+    :since: 9.4.1
+
+    Equivalent to both :ghc-flag:`-fopt-coercion-simple` and :ghc-flag:`-fopt-coercion-full`.
+
+.. ghc-flag:: -fopt-coercion-simple
+    :shortdesc: Perform simple coercion optimisations.
+    :type: dynamic
+    :reverse: -fno-opt-coercion-simple
+    :category:
+
+    :default: on
+    :since: 9.4.1
+
+    Run the "simple" coercion optimiser which performs certain coercion optimisations
+    after desugaring and during simplification.
+
+.. ghc-flag:: -fopt-coercion-full
+    :shortdesc: Perform a full coercion optimisation pass.
+    :type: dynamic
+    :reverse: -fno-opt-coercion-full
+    :category:
+
+    :default: off but enabled with :ghc-flag:`-O`.
+    :since: 9.4.1
 
-    Turn off the coercion optimiser.
+    Run the coercion optimiser as an optimisation pass, including all optimisations.
 
 .. ghc-flag:: -fno-pre-inlining
     :shortdesc: Turn off pre-inlining
author	Adam Gundry <adam@well-typed.com>	2021-05-31 13:21:54 +0100
committer	Adam Gundry <adam@well-typed.com>	2021-06-01 14:57:36 +0100
commit	a7b206099937a2cf8cd6319052b3149fd7ecbaa7 (patch)
tree	2aa0d82b2a033229ebe1bc3593497cf12277c64d
parent	cc1ba576d26b90c0c01aa43e7100c94ee3a287ad (diff)
download	haskell-wip/amg/T19909.tar.gz