diff options
-rw-r--r-- | compiler/GHC/Core/Opt/DmdAnal.hs | 100 | ||||
-rw-r--r-- | compiler/GHC/CoreToStg/Prep.hs | 4 | ||||
-rw-r--r-- | testsuite/tests/stranal/should_run/T22549.hs | 37 | ||||
-rw-r--r-- | testsuite/tests/stranal/should_run/T22549.stdout | 1 | ||||
-rw-r--r-- | testsuite/tests/stranal/should_run/all.T | 2 |
5 files changed, 106 insertions, 38 deletions
diff --git a/compiler/GHC/Core/Opt/DmdAnal.hs b/compiler/GHC/Core/Opt/DmdAnal.hs index 74316e1e1b..3738f8b3ed 100644 --- a/compiler/GHC/Core/Opt/DmdAnal.hs +++ b/compiler/GHC/Core/Opt/DmdAnal.hs @@ -290,8 +290,10 @@ dmdAnalBind dmdAnalBind top_lvl env dmd bind anal_body = case bind of NonRec id rhs | useLetUp top_lvl id - -> dmdAnalBindLetUp top_lvl env id rhs anal_body - _ -> dmdAnalBindLetDown top_lvl env dmd bind anal_body + -> dmdAnalBindLetUp top_lvl env_rhs id rhs anal_body + _ -> dmdAnalBindLetDown top_lvl env_rhs dmd bind anal_body + where + env_rhs = enterDFun bind env -- | Annotates uninteresting top level functions ('isInterestingTopLevelFn') -- with 'topDmd', the rest with the given demand. @@ -1759,6 +1761,10 @@ We never unbox class dictionaries in worker/wrapper. occur without INLINABLE, when we use -fexpose-all-unfoldings and -fspecialise-aggressively to do vigorous cross-module specialisation. +3. #18421 found that unboxing a dictionary can also make the worker less likely + to inline; the inlining heuristics seem to prefer to inline a function + applied to a dictionary over a function applied to a bunch of functions. + TL;DR we /never/ unbox class dictionaries. Unboxing the dictionary, and passing a raft of higher-order functions isn't a huge win anyway -- you really want to specialise the function. @@ -2242,14 +2248,6 @@ addLazyFVs dmd_ty lazy_fvs -- L demand doesn't get both'd with the Bot coming up from the inner -- call to f. So we just get an L demand for x for g. -{- -Note [Do not strictify the argument dictionaries of a dfun] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The typechecker can tie recursive knots involving dfuns, so we do the -conservative thing and refrain from strictifying a dfun's argument -dictionaries. --} - setBndrsDemandInfo :: HasCallStack => [Var] -> [Demand] -> [Var] setBndrsDemandInfo (b:bs) ds | isTyVar b = b : setBndrsDemandInfo bs ds @@ -2388,6 +2386,16 @@ emptyAnalEnv opts fam_envs , ae_rec_dc = memoiseUniqueFun (isRecDataCon fam_envs 3) } +-- | Unset the 'dmd_strict_dicts' flag if any of the given bindings is a DFun +-- binding. Part of the mechanism that detects +-- Note [Do not strictify a DFun's parameter dictionaries]. +enterDFun :: CoreBind -> AnalEnv -> AnalEnv +enterDFun bind env + | any isDFunId (bindersOf bind) + = env { ae_opts = (ae_opts env) { dmd_strict_dicts = False } } + | otherwise + = env + emptySigEnv :: SigEnv emptySigEnv = emptyVarEnv @@ -2439,31 +2447,29 @@ findBndrDmd env dmd_ty id id_ty = idType id strictify dmd - -- See Note [Making dictionaries strict] + -- See Note [Making dictionary parameters strict] + -- and Note [Do not strictify a DFun's parameter dictionaries] | dmd_strict_dicts (ae_opts env) - -- We never want to strictify a recursive let. At the moment - -- findBndrDmd is never called for recursive lets; if that - -- changes, we need a RecFlag parameter and another guard here. = strictifyDictDmd id_ty dmd | otherwise = dmd fam_envs = ae_fam_envs env -{- Note [Making dictionaries strict] -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +{- Note [Making dictionary parameters strict] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The Opt_DictsStrict flag makes GHC use call-by-value for dictionaries. Why? * Generally CBV is more efficient. -* Dictionaries are always non-bottom; and never take much work to - compute. E.g. a dfun from an instance decl always returns a dictionary +* A datatype dictionary is always non-bottom and never takes much work to + compute. E.g. a DFun from an instance decl always returns a dictionary record immediately. See DFunUnfolding in CoreSyn. See also Note [Recursive superclasses] in TcInstDcls. -* The strictness analyser will then unbox dictionaries and pass the - methods individually, rather than in a bundle. If there are a lot of - methods that might be bad; but worker/wrapper already does throttling. +See #17758 for more background and perf numbers. + +Wrinkles: * A newtype dictionary is *not* always non-bottom. E.g. class C a where op :: a -> a @@ -2471,26 +2477,44 @@ The Opt_DictsStrict flag makes GHC use call-by-value for dictionaries. Why? Now a value of type (C Int) is just a newtype wrapper (a cast) around the error thunk. Don't strictify these! -See #17758 for more background and perf numbers. +* Strictifying DFuns risks destroying the invariant that DFuns never take much + work to compute, so we don't do it. + See Note [Do not strictify a DFun's parameter dictionaries] for details. + +* Although worker/wrapper *could* unbox strictly used dictionaries, we do not do + so; see Note [Do not unbox class dictionaries]. The implementation is extremely simple: just make the strictness analyser strictify the demand on a dictionary binder in -'findBndrDmd'. - -However there is one case where this can make performance worse. -For the principle consider some function at the core level: - myEq :: Eq a => a -> a -> Bool - myEq eqDict x y = ((==) eqDict) x y -If we make the dictionary strict then WW can fire turning this into: - $wmyEq :: (a -> a -> Bool) -> a -> a -> Bool - $wmyEq eq x y = eq x y -Which *usually* performs better. However if the dictionary is known we -are far more likely to inline a function applied to the dictionary than -to inline one applied to a function. Sometimes this makes just enough -of a difference to stop a function from inlining. This is documented in #18421. - -It's somewhat similar to Note [Do not unbox class dictionaries] although -here our problem is with the inliner, not the specializer. +'findBndrDmd' if the binder does not belong to a DFun. + +Note [Do not strictify a DFun's parameter dictionaries] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The typechecker can tie recursive knots involving (non-recursive) DFuns, so +we must not strictify a DFun's parameter dictionaries (#22549). +T22549 has an example involving undecidable instances that <<loop>>s when we +strictify the DFun of, e.g., `$fEqSeqT`: + + Main.$fEqSeqT + = \@m @a ($dEq :: Eq (m (ViewT m a))) ($dMonad :: Monad m) -> + GHC.Classes.C:Eq @(SeqT m a) ($c== @m @a $dEq $dMonad) + ($c/= @m @a $dEq $dMonad) + + Rec { + $dEq_a = Main.$fEqSeqT @Identity @Int $dEq_b Main.$fMonadIdentity + $dEq_b = ... $dEq_a ... <another strict context due to DFun> + } + +If we make `$fEqSeqT` strict in `$dEq`, we'll collapse the Rec group into a +giant, <<loop>>ing thunk. + +To prevent that, we never strictify dictionary params when inside a DFun. +That is implemented by unsetting 'dmd_strict_dicts' when entering a DFun. + +See also Note [Speculative evaluation] in GHC.CoreToStg.Prep which has a rather +similar example in #20836. We may never speculate *arguments* of (recursive) +DFun calls, likewise we should not mark *formal parameters* of recursive DFuns +as strict. Note [Initialising strictness] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/compiler/GHC/CoreToStg/Prep.hs b/compiler/GHC/CoreToStg/Prep.hs index d0a56f9166..0b9259bc0e 100644 --- a/compiler/GHC/CoreToStg/Prep.hs +++ b/compiler/GHC/CoreToStg/Prep.hs @@ -1708,6 +1708,10 @@ Fortunately, pce_rec_ids already has all the information we need in that case. The problem is very similar to Note [Eta reduction in recursive RHSs]. Here as well as there it is *unsound* to change the termination properties of the very function whose termination properties we are exploiting. + +It is also similar to Note [Do not strictify a DFun's parameter dictionaries], +where marking recursive DFuns (of undecidable *instances*) strict in dictionary +*parameters* leads to quite the same change in termination as above. -} data FloatingBind diff --git a/testsuite/tests/stranal/should_run/T22549.hs b/testsuite/tests/stranal/should_run/T22549.hs new file mode 100644 index 0000000000..841ead007b --- /dev/null +++ b/testsuite/tests/stranal/should_run/T22549.hs @@ -0,0 +1,37 @@ +{-# LANGUAGE FlexibleContexts #-} +{-# LANGUAGE StandaloneDeriving #-} +{-# LANGUAGE LambdaCase #-} +{-# LANGUAGE UndecidableInstances #-} +{-# LANGUAGE MonoLocalBinds #-} +{-# LANGUAGE DeriveFunctor #-} + +import Data.Function (on) + +newtype Identity a = Identity a deriving (Eq, Functor) +instance Applicative Identity where + pure = Identity + Identity f <*> Identity a = Identity $ f a +instance Monad Identity where + Identity a >>= f = f a + +data ViewT m a + = Empty + | a :< SeqT m a +newtype SeqT m a = SeqT [m (ViewT m a)] + +toViewT :: Monad m => SeqT m a -> m (ViewT m a) +toViewT (SeqT []) = pure Empty +toViewT (SeqT (h : t)) = h >>= \case + Empty -> toViewT (SeqT t) + hi :< SeqT ti -> pure (hi :< SeqT (ti ++ t)) + +instance (Eq (m (ViewT m a)), Monad m) => Eq (SeqT m a) where + (==) = (==) `on` toViewT + +deriving instance (Eq a, Eq (SeqT m a)) => Eq (ViewT m a) + +example :: SeqT Identity Int +example = SeqT [] + +main :: IO () +main = print (example == example) diff --git a/testsuite/tests/stranal/should_run/T22549.stdout b/testsuite/tests/stranal/should_run/T22549.stdout new file mode 100644 index 0000000000..0ca95142bb --- /dev/null +++ b/testsuite/tests/stranal/should_run/T22549.stdout @@ -0,0 +1 @@ +True diff --git a/testsuite/tests/stranal/should_run/all.T b/testsuite/tests/stranal/should_run/all.T index a2b8e4cfc5..9da7863314 100644 --- a/testsuite/tests/stranal/should_run/all.T +++ b/testsuite/tests/stranal/should_run/all.T @@ -30,3 +30,5 @@ test('T19053', normal, compile_and_run, ['']) test('T21717b', normal, compile_and_run, ['']) test('T22475', normal, compile_and_run, ['']) test('T22475b', normal, compile_and_run, ['']) +# T22549: Do not strictify DFuns, otherwise we will <<loop>> +test('T22549', normal, compile_and_run, ['-fdicts-strict -fno-specialise']) |