author | Matthew Pickering <matthewtpickering@gmail.com> | 2021-04-07 10:57:06 +0100 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2021-04-14 05:07:45 -0400 |
commit | 726da09e76d0832b5aedd5b78624435695ac04e7 (patch) | |
tree | 61c013968fc4a218562a647c1860696ef9ff95a8 /compiler | |
parent | b665d9833b13d9d4241ff56585bbf45d2fcf2278 (diff) | |
download | haskell-726da09e76d0832b5aedd5b78624435695ac04e7.tar.gz |
Always generate ModDetails from ModIface
This vastly reduces memory usage when compiling in `--make` mode: compiling
Cabal drops from about 900M to about 300M.
As a matter of uniformity, it also ensures that reading from an
interface performs the same as using the in-memory cache. We can also
delete all the horrible knot-tying in updateIdInfos.
Goes some way to fixing #13586
Accept new output of tests fixing some bugs along the way
-------------------------
Metric Decrease:
T12545
-------------------------
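The memory saving described in the commit message relies on rebuilding the detailed view lazily from the compact interface. The following is a minimal sketch of that idea only, not GHC's implementation: `ToyIface`, `ToyDetails`, `elaborate`, and `detailsFromIface` are hypothetical stand-ins invented for illustration, with association lists in place of GHC's real `ModIface` and `ModDetails`.

```haskell
-- Hypothetical toy model: none of these names exist in GHC.

-- Compact, serialisable view: declaration names paired with raw text.
type ToyIface = [(String, String)]

-- Detailed view: declaration names paired with elaborated values.
type ToyDetails = [(String, Int)]

-- Stand-in for an expensive elaboration step. It fails loudly on bad
-- input so that we can observe whether a declaration was ever forced.
elaborate :: String -> Int
elaborate raw = case reads raw of
  [(n, "")] -> n
  _         -> error ("cannot elaborate: " ++ raw)

-- Rebuild the detailed view from the compact one. Each elaborated value
-- is an unevaluated thunk until somebody actually demands it.
detailsFromIface :: ToyIface -> ToyDetails
detailsFromIface iface = [ (name, elaborate raw) | (name, raw) <- iface ]

demoIface :: ToyIface
demoIface = [("unused", "not a number"), ("used", "42")]

-- Only the "used" declaration is elaborated; "unused" is skipped by key
-- comparison alone, so its failing elaboration never runs.
lookupUsed :: Maybe Int
lookupUsed = lookup "used" (detailsFromIface demoIface)

main :: IO ()
main = print lookupUsed
```

In this sketch only the compact `ToyIface` has to stay fully in memory; detailed entries are materialised on demand, which is the same trade-off the commit message describes for `--make` mode.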
Diffstat (limited to 'compiler')
-rw-r--r-- | compiler/GHC/Driver/Main.hs | 71 |
-rw-r--r-- | compiler/GHC/Driver/Pipeline.hs | 22 |
-rw-r--r-- | compiler/GHC/Driver/Pipeline/Monad.hs | 9 |
-rw-r--r-- | compiler/GHC/Iface/UpdateIdInfos.hs | 160 |
-rw-r--r-- | compiler/GHC/Unit/Module/Status.hs | 1 |
-rw-r--r-- | compiler/ghc.cabal.in | 1 |
6 files changed, 73 insertions, 191 deletions
diff --git a/compiler/GHC/Driver/Main.hs b/compiler/GHC/Driver/Main.hs
index 07f1e7acda..296a855acf 100644
--- a/compiler/GHC/Driver/Main.hs
+++ b/compiler/GHC/Driver/Main.hs
@@ -42,6 +42,7 @@ module GHC.Driver.Main
     , Messager, batchMsg
     , HscStatus (..)
     , hscIncrementalCompile
+    , initModDetails
     , hscMaybeWriteIface
     , hscCompileCmmFile
 
@@ -804,16 +805,7 @@ hscIncrementalCompile always_do_basic_recompilation_check m_tc_result
         -- We didn't need to do any typechecking; the old interface
         -- file on disk was good enough.
         Left iface -> do
-            -- Knot tying! See Note [Knot-tying typecheckIface]
-            details <- liftIO . fixIO $ \details' -> do
-                let act hpt = addToHpt hpt (ms_mod_name mod_summary)
-                                           (HomeModInfo iface details' Nothing)
-                let hsc_env' = hscUpdateHPT act hsc_env
-                -- NB: This result is actually not that useful
-                -- in one-shot mode, since we're not going to do
-                -- any further typechecking. It's much more useful
-                -- in make mode, since this HMI will go into the HPT.
-                genModDetails hsc_env' iface
+            details <- liftIO $ initModDetails hsc_env mod_summary iface
             return (HscUpToDate iface details, hsc_env')
         -- We finished type checking. (mb_old_hash is the hash of
         -- the interface that existed on disk; it's possible we had
@@ -823,6 +815,64 @@ hscIncrementalCompile always_do_basic_recompilation_check m_tc_result
             status <- finish mod_summary tc_result mb_old_hash
             return (status, hsc_env)
 
+-- Knot tying! See Note [Knot-tying typecheckIface]
+-- See Note [ModDetails and --make mode]
+initModDetails :: HscEnv -> ModSummary -> ModIface -> IO ModDetails
+initModDetails hsc_env mod_summary iface =
+  fixIO $ \details' -> do
+    let act hpt = addToHpt hpt (ms_mod_name mod_summary)
+                               (HomeModInfo iface details' Nothing)
+    let hsc_env' = hscUpdateHPT act hsc_env
+    -- NB: This result is actually not that useful
+    -- in one-shot mode, since we're not going to do
+    -- any further typechecking. It's much more useful
+    -- in make mode, since this HMI will go into the HPT.
+    genModDetails hsc_env' iface
+
+
+{-
+Note [ModDetails and --make mode]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An interface file consists of two parts
+
+* The `ModIface` which ends up getting written to disk.
+  The `ModIface` is a completely acyclic tree, which can be serialised
+  and de-serialised completely straightforwardly. The `ModIface` is
+  also the structure that is finger-printed for recompilation control.
+
+* The `ModDetails` which provides a more structured view that is suitable
+  for usage during compilation. The `ModDetails` is heavily cyclic:
+  An `Id` contains a `Type`, which mentions a `TyCon` that contains kind
+  that mentions other `TyCons`; the `Id` also includes an unfolding that
+  in turn mentions more `Id`s; And so on.
+
+The `ModIface` can be created from the `ModDetails` and the `ModDetails` from
+a `ModIface`.
+
+During tidying, just before interfaces are written to disk,
+the ModDetails is calculated and then converted into a ModIface (see GHC.Iface.Make.mkIface_).
+Then when GHC needs to restart typechecking from a certain point it can read the
+interface file, and regenerate the ModDetails from the ModIface (see GHC.IfaceToCore.typecheckIface).
+The key part about the loading is that the ModDetails is regenerated lazily
+from the ModIface, so that there's only a detailed in-memory representation
+for declarations which are actually used from the interface. This mode is
+also used when reading interface files from external packages.
+
+In the old --make mode implementation, the interface was written after compiling a module
+but the in-memory ModDetails which was used to compute the ModIface was retained.
+The result was that --make mode used much more memory than `-c` mode, because a large amount of
+information about a module would be kept in the ModDetails but never used.
+
+The new idea is that even in `--make` mode, when there is an in-memory `ModDetails`
+at hand, we re-create the `ModDetails` from the `ModIface`. Doing this means that
+we only have to keep the `ModIface` decls in memory and then lazily load
+detailed representations if needed. It turns out this makes a really big difference
+to memory usage, halving maximum memory used in some cases.
+
+See !5492 and #13586
+-}
+
 -- Runs the post-typechecking frontend (desugar and simplify). We want to
 -- generate most of the interface as late as possible. This gets us up-to-date
 -- and good unfoldings and other info in the interface file.
@@ -876,7 +926,6 @@ finish summary tc_result mb_old_hash = do
 
           return HscRecomp { hscs_guts = cg_guts,
                              hscs_mod_location = ms_location summary,
-                             hscs_mod_details = details,
                              hscs_partial_iface = partial_iface,
                              hscs_old_iface_hash = mb_old_hash
                            }
diff --git a/compiler/GHC/Driver/Pipeline.hs b/compiler/GHC/Driver/Pipeline.hs
index 0a75b62248..e6b7be62ef 100644
--- a/compiler/GHC/Driver/Pipeline.hs
+++ b/compiler/GHC/Driver/Pipeline.hs
@@ -87,7 +87,6 @@ import GHC.Data.StringBuffer ( hGetStringBuffer, hPutStringBuffer )
 import GHC.Data.Maybe ( expectJust )
 
 import GHC.Iface.Make ( mkFullIface )
-import GHC.Iface.UpdateIdInfos ( updateModDetailsIdInfos )
 
 import GHC.Types.Basic ( SuccessFlag(..) )
 import GHC.Types.Target
@@ -100,7 +99,6 @@ import GHC.Unit.Env
 import GHC.Unit.State
 import GHC.Unit.Finder
 import GHC.Unit.Module.ModSummary
-import GHC.Unit.Module.ModDetails
 import GHC.Unit.Module.ModIface
 import GHC.Unit.Module.Graph (needsTemplateHaskellOrQQ)
 import GHC.Unit.Module.Deps
@@ -258,13 +256,15 @@ compileOne' m_tc_result mHscMessage
             return $! HomeModInfo iface hmi_details (Just linkable)
         (HscRecomp { hscs_guts = cgguts,
                      hscs_mod_location = mod_location,
-                     hscs_mod_details = hmi_details,
                      hscs_partial_iface = partial_iface,
                      hscs_old_iface_hash = mb_old_iface_hash
                    }, Interpreter) -> do
             -- In interpreted mode the regular codeGen backend is not run so we
             -- generate a interface without codeGen info.
             final_iface <- mkFullIface hsc_env' partial_iface Nothing
+            -- Reconstruct the `ModDetails` from the just-constructed `ModIface`
+            -- See Note [ModDetails and --make mode]
+            hmi_details <- liftIO $ initModDetails hsc_env' summary final_iface
             liftIO $ hscMaybeWriteIface logger dflags True final_iface mb_old_iface_hash (ms_location summary)
 
             (hasStub, comp_bc, spt_entries) <- hscInteractive hsc_env' cgguts mod_location
@@ -291,7 +291,7 @@ compileOne' m_tc_result mHscMessage
                                 (Temporary TFL_CurrentModule)
                                 basename dflags next_phase (Just location)
             -- We're in --make mode: finish the compilation pipeline.
-            (_, _, Just (iface, details)) <- runPipeline StopLn hsc_env'
+            (_, _, Just iface) <- runPipeline StopLn hsc_env'
                               (output_fn,
                                Nothing,
                                Just (HscOut src_flavour mod_name status))
@@ -302,6 +302,8 @@ compileOne' m_tc_result mHscMessage
             -- The object filename comes from the ModLocation
             o_time <- getModificationUTCTime object_filename
             let !linkable = LM o_time this_mod [DotO object_filename]
+            -- See Note [ModDetails and --make mode]
+            details <- initModDetails hsc_env' summary iface
             return $! HomeModInfo iface details (Just linkable)
 
  where dflags0 = ms_hspp_opts summary
@@ -712,7 +714,7 @@ runPipeline
   -> PipelineOutput -- ^ Output filename
   -> Maybe ModLocation -- ^ A ModLocation, if this is a Haskell module
   -> [FilePath] -- ^ foreign objects
-  -> IO (DynFlags, FilePath, Maybe (ModIface, ModDetails))
+  -> IO (DynFlags, FilePath, Maybe ModIface)
   -- ^ (final flags, output filename, interface)
 runPipeline stop_phase hsc_env0 (input_fn, mb_input_buf, mb_phase)
              mb_basename output maybe_loc foreign_os
@@ -842,7 +844,7 @@ runPipeline'
   -> FilePath -- ^ Input filename
   -> Maybe ModLocation -- ^ A ModLocation, if this is a Haskell module
   -> [FilePath] -- ^ foreign objects, if we have one
-  -> IO (DynFlags, FilePath, Maybe (ModIface, ModDetails))
+  -> IO (DynFlags, FilePath, Maybe ModIface)
   -- ^ (final flags, output filename, interface)
 runPipeline' start_phase hsc_env env input_fn
              maybe_loc foreign_os
@@ -1374,7 +1376,6 @@ runPhase (HscOut src_flavour mod_name result) _ = do
                     return (RealPhase StopLn, o_file)
             HscRecomp { hscs_guts = cgguts,
                         hscs_mod_location = mod_location,
-                        hscs_mod_details = mod_details,
                         hscs_partial_iface = partial_iface,
                         hscs_old_iface_hash = mb_old_iface_hash
                       }
@@ -1387,12 +1388,7 @@ runPhase (HscOut src_flavour mod_name result) _ = do
 
                     let dflags = hsc_dflags hsc_env'
                     final_iface <- liftIO (mkFullIface hsc_env' partial_iface (Just cg_infos))
-                    let final_mod_details
-                          | gopt Opt_OmitInterfacePragmas dflags
-                          = mod_details
-                          | otherwise = {-# SCC updateModDetailsIdInfos #-}
-                                        updateModDetailsIdInfos cg_infos mod_details
-                    setIface final_iface final_mod_details
+                    setIface final_iface
 
                     -- See Note [Writing interface files]
                     liftIO $ hscMaybeWriteIface logger dflags False final_iface mb_old_iface_hash mod_location
diff --git a/compiler/GHC/Driver/Pipeline/Monad.hs b/compiler/GHC/Driver/Pipeline/Monad.hs
index 4a33543527..d95f9a3973 100644
--- a/compiler/GHC/Driver/Pipeline/Monad.hs
+++ b/compiler/GHC/Driver/Pipeline/Monad.hs
@@ -27,7 +27,6 @@ import GHC.Utils.TmpFs (TempFileLifetime)
 import GHC.Types.SourceFile
 
 import GHC.Unit.Module
-import GHC.Unit.Module.ModDetails
 import GHC.Unit.Module.ModIface
 import GHC.Unit.Module.Status
 
@@ -82,7 +81,7 @@ data PipeState = PipeState {
        -- ^ additional object files resulting from compiling foreign
        -- code. They come from two sources: foreign stubs, and
        -- add{C,Cxx,Objc,Objcxx}File from template haskell
-       iface :: Maybe (ModIface, ModDetails)
+       iface :: Maybe ModIface
        -- ^ Interface generated by HscOut phase. Only available after the
        -- phase runs.
   }
@@ -90,7 +89,7 @@ data PipeState = PipeState {
 pipeStateDynFlags :: PipeState -> DynFlags
 pipeStateDynFlags = hsc_dflags . hsc_env
 
-pipeStateModIface :: PipeState -> Maybe (ModIface, ModDetails)
+pipeStateModIface :: PipeState -> Maybe ModIface
 pipeStateModIface = iface
 
 data PipelineOutput
@@ -139,5 +138,5 @@ setForeignOs :: [FilePath] -> CompPipeline ()
 setForeignOs os = P $ \_env state ->
   return (state{ foreign_os = os }, ())
 
-setIface :: ModIface -> ModDetails -> CompPipeline ()
-setIface iface details = P $ \_env state -> return (state{ iface = Just (iface, details) }, ())
+setIface :: ModIface -> CompPipeline ()
+setIface iface = P $ \_env state -> return (state{ iface = Just iface }, ())
diff --git a/compiler/GHC/Iface/UpdateIdInfos.hs b/compiler/GHC/Iface/UpdateIdInfos.hs
deleted file mode 100644
index 0c70b5caeb..0000000000
--- a/compiler/GHC/Iface/UpdateIdInfos.hs
+++ /dev/null
@@ -1,160 +0,0 @@
-{-# LANGUAGE CPP, BangPatterns, Strict, RecordWildCards #-}
-
-module GHC.Iface.UpdateIdInfos
-  ( updateModDetailsIdInfos
-  ) where
-
-import GHC.Prelude
-
-import GHC.Core
-import GHC.Core.InstEnv
-
-import GHC.StgToCmm.Types (CgInfos (..))
-
-import GHC.Types.Id
-import GHC.Types.Id.Info
-import GHC.Types.Name.Env
-import GHC.Types.Name.Set
-import GHC.Types.Var
-import GHC.Types.TypeEnv
-import GHC.Types.TyThing
-
-import GHC.Unit.Module.ModDetails
-
-import GHC.Utils.Misc
-import GHC.Utils.Outputable
-import GHC.Utils.Panic
-
-#include "HsVersions.h"
-
--- | Update CafInfos and LFInfos of all occurrences (in rules, unfoldings, class
--- instances).
---
--- See Note [Conveying CAF-info and LFInfo between modules] in
--- GHC.StgToCmm.Types.
-updateModDetailsIdInfos
-  :: CgInfos
-  -> ModDetails -- ^ ModDetails to update
-  -> ModDetails
-
-updateModDetailsIdInfos cg_infos mod_details =
-  let
-    ModDetails{ md_types = type_env -- for unfoldings
-              , md_insts = insts
-              , md_rules = rules
-              } = mod_details
-
-    -- type TypeEnv = NameEnv TyThing
-    type_env' = mapNameEnv (updateTyThingIdInfos type_env' cg_infos) type_env
-    -- NB: Knot-tied! The result, type_env', is passed right back into into
-    -- updateTyThingIdInfos, so that that occurrences of any Ids (e.g. in
-    -- IdInfos, etc) can be looked up in the tidied env
-
-    !insts' = strictMap (updateInstIdInfos type_env' cg_infos) insts
-    !rules' = strictMap (updateRuleIdInfos type_env') rules
-  in
-    mod_details{ md_types = type_env'
-               , md_insts = insts'
-               , md_rules = rules'
-               }
-
---------------------------------------------------------------------------------
--- Rules
---------------------------------------------------------------------------------
-
-updateRuleIdInfos :: TypeEnv -> CoreRule -> CoreRule
-updateRuleIdInfos _ rule@BuiltinRule{} = rule
-updateRuleIdInfos type_env Rule{ .. } = Rule { ru_rhs = updateGlobalIds type_env ru_rhs, .. }
-
---------------------------------------------------------------------------------
--- Instances
---------------------------------------------------------------------------------
-
-updateInstIdInfos :: TypeEnv -> CgInfos -> ClsInst -> ClsInst
-updateInstIdInfos type_env cg_infos =
-    updateClsInstDFun (updateIdUnfolding type_env . updateIdInfo cg_infos)
-
---------------------------------------------------------------------------------
--- TyThings
---------------------------------------------------------------------------------
-
-updateTyThingIdInfos :: TypeEnv -> CgInfos -> TyThing -> TyThing
-
-updateTyThingIdInfos type_env cg_infos (AnId id) =
-    AnId (updateIdUnfolding type_env (updateIdInfo cg_infos id))
-
-updateTyThingIdInfos _ _ other = other -- AConLike, ATyCon, ACoAxiom
-
---------------------------------------------------------------------------------
--- Unfoldings
---------------------------------------------------------------------------------
-
-updateIdUnfolding :: TypeEnv -> Id -> Id
-updateIdUnfolding type_env id =
-    case idUnfolding id of
-      CoreUnfolding{ .. } ->
-        setIdUnfolding id CoreUnfolding{ uf_tmpl = updateGlobalIds type_env uf_tmpl, .. }
-      DFunUnfolding{ .. } ->
-        setIdUnfolding id DFunUnfolding{ df_args = map (updateGlobalIds type_env) df_args, .. }
-      _ -> id
-
---------------------------------------------------------------------------------
--- Expressions
---------------------------------------------------------------------------------
-
-updateIdInfo :: CgInfos -> Id -> Id
-updateIdInfo CgInfos{ cgNonCafs = NonCaffySet non_cafs, cgLFInfos = lf_infos } id =
-    let
-      not_caffy = elemNameSet (idName id) non_cafs
-      mb_lf_info = lookupNameEnv lf_infos (idName id)
-
-      id1 = if not_caffy then setIdCafInfo id NoCafRefs else id
-      id2 = case mb_lf_info of
-              Nothing -> id1
-              Just lf_info -> setIdLFInfo id1 lf_info
-    in
-      id2
-
---------------------------------------------------------------------------------
-
-updateGlobalIds :: NameEnv TyThing -> CoreExpr -> CoreExpr
--- Update occurrences of GlobalIds as directed by 'env'
--- The 'env' maps a GlobalId to a version with accurate CAF info
--- (and in due course perhaps other back-end-related info)
-updateGlobalIds env e = go env e
-  where
-    go_id :: NameEnv TyThing -> Id -> Id
-    go_id env var =
-      case lookupNameEnv env (varName var) of
-        Nothing -> var
-        Just (AnId id) -> id
-        Just other -> pprPanic "UpdateIdInfos.updateGlobalIds" $
-          text "Found a non-Id for Id Name" <+> ppr (varName var) $$
-          nest 4 (text "Id:" <+> ppr var $$
-                  text "TyThing:" <+> ppr other)
-
-    go :: NameEnv TyThing -> CoreExpr -> CoreExpr
-    go env (Var v) = Var (go_id env v)
-    go _ e@Lit{} = e
-    go env (App e1 e2) = App (go env e1) (go env e2)
-    go env (Lam b e) = assertNotInNameEnv env [b] (Lam b (go env e))
-    go env (Let bs e) = Let (go_binds env bs) (go env e)
-    go env (Case e b ty alts) =
-        assertNotInNameEnv env [b] (Case (go env e) b ty (map go_alt alts))
-      where
-        go_alt (Alt k bs e) = assertNotInNameEnv env bs (Alt k bs (go env e))
-    go env (Cast e c) = Cast (go env e) c
-    go env (Tick t e) = Tick t (go env e)
-    go _ e@Type{} = e
-    go _ e@Coercion{} = e
-
-    go_binds :: NameEnv TyThing -> CoreBind -> CoreBind
-    go_binds env (NonRec b e) =
-      assertNotInNameEnv env [b] (NonRec b (go env e))
-    go_binds env (Rec prs) =
-      assertNotInNameEnv env (map fst prs) (Rec (mapSnd (go env) prs))
-
--- In `updateGlobaLIds` Names of local binders should not shadow Name of
--- globals. This assertion is to check that.
-assertNotInNameEnv :: NameEnv a -> [Id] -> b -> b
-assertNotInNameEnv env ids x = ASSERT(not (any (\id -> elemNameEnv (idName id) env) ids)) x
diff --git a/compiler/GHC/Unit/Module/Status.hs b/compiler/GHC/Unit/Module/Status.hs
index 539158fdb1..52938154b4 100644
--- a/compiler/GHC/Unit/Module/Status.hs
+++ b/compiler/GHC/Unit/Module/Status.hs
@@ -28,7 +28,6 @@ data HscStatus
         -- ^ Information for the code generator.
       , hscs_mod_location :: !ModLocation
         -- ^ Module info
-      , hscs_mod_details :: !ModDetails
       , hscs_partial_iface :: !PartialModIface
         -- ^ Partial interface
       , hscs_old_iface_hash :: !(Maybe Fingerprint)
diff --git a/compiler/ghc.cabal.in b/compiler/ghc.cabal.in
index 4178b9d0f6..f260600ba5 100644
--- a/compiler/ghc.cabal.in
+++ b/compiler/ghc.cabal.in
@@ -456,7 +456,6 @@ Library
         GHC.Iface.Tidy.StaticPtrTable
         GHC.IfaceToCore
         GHC.Iface.Type
-        GHC.Iface.UpdateIdInfos
         GHC.Linker
         GHC.Linker.Dynamic
         GHC.Linker.ExtraObj
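The `initModDetails` function introduced in this diff ties a knot with `fixIO`: the details being built are placed into the environment that is then used to build them. The sketch below reproduces only that shape with toy types; `KnotDetails`, `KnotEnv`, `genKnotDetails`, and `initKnotDetails` are hypothetical names made up for illustration and are not GHC's `HscEnv`, `HomeModInfo`, or `genModDetails`. The only real library call is `fixIO` from `System.IO`.

```haskell
import System.IO (fixIO)

-- Hypothetical toy model: none of these names exist in GHC.

-- Details for a module; kdSelf lazily points back at the module's own
-- entry in the environment, which is what makes the structure cyclic.
data KnotDetails = KnotDetails
  { kdName :: String
  , kdSelf :: Maybe KnotDetails
  }

-- Stand-in for the home package table: module names to details.
type KnotEnv = [(String, KnotDetails)]

-- Stand-in for genModDetails: builds details that may refer to entries
-- in the environment. The lookup sits in a lazy field, so it is not
-- forced while the details are still under construction.
genKnotDetails :: KnotEnv -> String -> IO KnotDetails
genKnotDetails env name =
  pure KnotDetails { kdName = name, kdSelf = lookup name env }

-- Same shape as initModDetails: extend the environment with the
-- not-yet-finished result, then build the result in that environment.
initKnotDetails :: KnotEnv -> String -> IO KnotDetails
initKnotDetails env name =
  fixIO $ \details' -> do
    let env' = (name, details') : env
    genKnotDetails env' name

main :: IO ()
main = do
  details <- initKnotDetails [] "A"
  -- Following the self-reference lands back on the same module.
  print (fmap kdName (kdSelf details))
```

The sketch works only because nothing inside the `fixIO` action forces `details'` before the action returns; forcing it early would make `fixIO` fail at run time, which is why laziness matters for this pattern.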