diff options
author | Matthew Pickering <matthewtpickering@gmail.com> | 2021-07-20 11:49:22 +0200 |
---|---|---|
committer | Matthew Pickering <matthewtpickering@gmail.com> | 2021-12-28 09:47:53 +0000 |
commit | fd42ab5fa1df847a6b595dfe4b63d9c7eecbf400 (patch) | |
tree | 3bd7add640ee4e1340de079a16a05fd34548925f /compiler/GHC/Linker | |
parent | 3219610e3ba6cb6a5cd1f4e32e2b4befea5bd384 (diff) | |
download | haskell-fd42ab5fa1df847a6b595dfe4b63d9c7eecbf400.tar.gz |
Multiple Home Units
Multiple home units allows you to load different packages which may depend on
each other into one GHC session. This will allow both GHCi and HLS to support
multi component projects more naturally.
Public Interface
~~~~~~~~~~~~~~~~
In order to specify multiple units, the -unit @⟨filename⟩ flag
is given multiple times with a response file containing the arguments for each unit.
The response file contains a newline separated list of arguments.
```
ghc -unit @unitLibCore -unit @unitLib
```
where the `unitLibCore` response file contains the normal arguments that cabal would pass to `--make` mode.
```
-this-unit-id lib-core-0.1.0.0
-i
-isrc
LibCore.Utils
LibCore.Types
```
The response file for lib, can specify a dependency on lib-core, so then modules in lib can use modules from lib-core.
```
-this-unit-id lib-0.1.0.0
-package-id lib-core-0.1.0.0
-i
-isrc
Lib.Parse
Lib.Render
```
Then when the compiler starts in --make mode it will compile both units lib and lib-core.
There is also very basic support for multiple home units in GHCi, at the
moment you can start a GHCi session with multiple units but only the
:reload is supported. Most commands in GHCi assume a single home unit,
and so it is additional work to work out how to modify the interface to
support multiple loaded home units.
Options used when working with Multiple Home Units
There are a few extra flags which have been introduced specifically for
working with multiple home units. The flags allow a home unit to pretend
it’s more like an installed package, for example, specifying the package
name, module visibility and reexported modules.
-working-dir ⟨dir⟩
It is common to assume that a package is compiled in the directory
where its cabal file resides. Thus, all paths used in the compiler
are assumed to be relative to this directory. When there are
multiple home units the compiler is often not operating in the
standard directory and instead where the cabal.project file is
located. In this case the -working-dir option can be passed which
specifies the path from the current directory to the directory the
unit assumes to be it’s root, normally the directory which contains
the cabal file.
When the flag is passed, any relative paths used by the compiler are
offset by the working directory. Notably this includes -i and
-I⟨dir⟩ flags.
-this-package-name ⟨name⟩
This flag papers over the awkward interaction of the PackageImports
and multiple home units. When using PackageImports you can specify
the name of the package in an import to disambiguate between modules
which appear in multiple packages with the same name.
This flag allows a home unit to be given a package name so that you
can also disambiguate between multiple home units which provide
modules with the same name.
-hidden-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which
modules in a home unit should not be visible outside of the unit it
belongs to.
The main use of this flag is to be able to recreate the difference
between an exposed and hidden module for installed packages.
-reexported-module ⟨module name⟩
This flag can be supplied multiple times in order to specify which
modules are not defined in a unit but should be reexported. The
effect is that other units will see this module as if it was defined
in this unit.
The use of this flag is to be able to replicate the reexported
modules feature of packages with multiple home units.
Offsetting Paths in Template Haskell splices
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When using Template Haskell to embed files into your program,
traditionally the paths have been interpreted relative to the directory
where the .cabal file resides. This causes problems for multiple home
units as we are compiling many different libraries at once which have
.cabal files in different directories.
For this purpose we have introduced a way to query the value of the
-working-dir flag to the Template Haskell API. By using this function we
can implement a makeRelativeToProject function which offsets a path
which is relative to the original project root by the value of
-working-dir.
```
import Language.Haskell.TH.Syntax ( makeRelativeToProject )
foo = $(makeRelativeToProject "./relative/path" >>= embedFile)
```
> If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units.
A similar function already exists in the file-embed library. The
function in template-haskell implements this function in a more robust
manner by honouring the -working-dir flag rather than searching the file
system.
Closure Property for Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For tools or libraries using the API there is one very important closure
property which must be adhered to:
> Any dependency which is not a home unit must not (transitively) depend
on a home unit.
For example, if you have three packages p, q and r, then if p depends on
q which depends on r then it is illegal to load both p and r as home
units but not q, because q is a dependency of the home unit p which
depends on another home unit r.
If you are using GHC by the command line then this property is checked,
but if you are using the API then you need to check this property
yourself. If you get it wrong you will probably get some very confusing
errors about overlapping instances.
Limitations of Multiple Home Units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are a few limitations of the initial implementation which will be smoothed out on user demand.
* Package thinning/renaming syntax is not supported
* More complicated reexports/renaming are not yet supported.
* It’s more common to run into existing linker bugs when loading a
large number of packages in a session (for example #20674, #20689)
* Backpack is not yet supported when using multiple home units.
* Dependency chasing can be quite slow with a large number of
modules and packages.
* Loading wired-in packages as home units is currently not supported
(this only really affects GHC developers attempting to load
template-haskell).
* Barely any normal GHCi features are supported, it would be good to
support enough for ghcid to work correctly.
Despite these limitations, the implementation works already for nearly
all packages. It has been testing on large dependency closures,
including the whole of head.hackage which is a total of 4784 modules
from 452 packages.
Internal Changes
~~~~~~~~~~~~~~~~
* The biggest change is that the HomePackageTable is replaced with the
HomeUnitGraph. The HomeUnitGraph is a map from UnitId to HomeUnitEnv,
which contains information specific to each home unit.
* The HomeUnitEnv contains:
- A unit state, each home unit can have different package db flags
- A set of dynflags, each home unit can have different flags
- A HomePackageTable
* LinkNode: A new node type is added to the ModuleGraph, this is used to
place the linking step into the build plan so linking can proceed in
parralel with other packages being built.
* New invariant: Dependencies of a ModuleGraphNode can be completely
determined by looking at the value of the node. In order to achieve
this, downsweep now performs a more complete job of downsweeping and
then the dependenices are recorded forever in the node rather than
being computed again from the ModSummary.
* Some transitive module calculations are rewritten to use the
ModuleGraph which is more efficient.
* There is always an active home unit, which simplifies modifying a lot
of the existing API code which is unit agnostic (for example, in the
driver).
The road may be bumpy for a little while after this change but the
basics are well-tested.
One small metric increase, which we accept and also submodule update to
haddock which removes ExtendedModSummary.
Closes #10827
-------------------------
Metric Increase:
MultiLayerModules
-------------------------
Co-authored-by: Fendor <power.walross@gmail.com>
Diffstat (limited to 'compiler/GHC/Linker')
-rw-r--r-- | compiler/GHC/Linker/Loader.hs | 165 | ||||
-rw-r--r-- | compiler/GHC/Linker/Static.hs | 29 | ||||
-rw-r--r-- | compiler/GHC/Linker/Static/Utils.hs | 31 |
3 files changed, 156 insertions, 69 deletions
diff --git a/compiler/GHC/Linker/Loader.hs b/compiler/GHC/Linker/Loader.hs index 80e303b046..6fc324e27a 100644 --- a/compiler/GHC/Linker/Loader.hs +++ b/compiler/GHC/Linker/Loader.hs @@ -54,7 +54,6 @@ import GHC.Tc.Utils.Monad import GHC.Runtime.Interpreter import GHCi.RemoteTypes -import GHC.Iface.Load import GHC.ByteCode.Linker import GHC.ByteCode.Asm @@ -72,7 +71,6 @@ import GHC.Utils.Outputable import GHC.Utils.Panic import GHC.Utils.Panic.Plain import GHC.Utils.Constants (isWindowsHost, isDarwinHost) -import GHC.Utils.Misc import GHC.Utils.Error import GHC.Utils.Logger import GHC.Utils.TmpFs @@ -82,7 +80,6 @@ import GHC.Unit.Finder import GHC.Unit.Module import GHC.Unit.Module.ModIface import GHC.Unit.Module.Deps -import GHC.Unit.Home import GHC.Unit.Home.ModInfo import GHC.Unit.State as Packages @@ -119,6 +116,12 @@ import GHC.Utils.Exception import qualified Data.Map as M import Data.Either (partitionEithers) +import GHC.Unit.Module.Graph +import GHC.Types.SourceFile +import GHC.Utils.Misc +import GHC.Iface.Load +import GHC.Unit.Home + uninitialised :: a uninitialised = panic "Loader not initialised" @@ -210,7 +213,6 @@ loadDependencies -> IO (LoaderState, SuccessFlag) loadDependencies interp hsc_env pls span needed_mods = do -- initLoaderState (hsc_dflags hsc_env) dl - let hpt = hsc_HPT hsc_env let dflags = hsc_dflags hsc_env -- The interpreter and dynamic linker can only handle object code built -- the "normal" way, i.e. no non-std ways like profiling or ticky-ticky. @@ -219,11 +221,11 @@ loadDependencies interp hsc_env pls span needed_mods = do maybe_normal_osuf <- checkNonStdWay dflags interp (fst span) -- Find what packages and linkables are required - (lnks, all_lnks, pkgs) <- getLinkDeps hsc_env hpt pls + (lnks, all_lnks, pkgs) <- getLinkDeps hsc_env pls maybe_normal_osuf (fst span) needed_mods let pls1 = - case (snd span) of + case snd span of Just mn -> pls { module_deps = M.insertWith (++) mn all_lnks (module_deps pls) } Nothing -> pls @@ -310,8 +312,9 @@ reallyInitLoaderState interp hsc_env = do -- (a) initialise the C dynamic linker initObjLinker interp + -- (b) Load packages from the command-line (Note [preload packages]) - pls <- loadPackages' interp hsc_env (preloadUnits (hsc_units hsc_env)) pls0 + pls <- unitEnv_foldWithKey (\k u env -> k >>= \pls' -> loadPackages' interp (hscSetActiveUnitId u hsc_env) (preloadUnits (homeUnitEnv_units env)) pls') (return pls0) (hsc_HUG hsc_env) -- steps (c), (d) and (e) loadCmdLineLibs' interp hsc_env pls @@ -323,13 +326,33 @@ loadCmdLineLibs interp hsc_env = do modifyLoaderState_ interp $ \pls -> loadCmdLineLibs' interp hsc_env pls -loadCmdLineLibs' + +loadCmdLineLibs' :: Interp -> HscEnv -> LoaderState -> IO LoaderState +loadCmdLineLibs' interp hsc_env pls = snd <$> + foldM + (\(done', pls') cur_uid -> load done' cur_uid pls') + (Set.empty, pls) + (hsc_all_home_unit_ids hsc_env) + + where + load :: Set.Set UnitId -> UnitId -> LoaderState -> IO (Set.Set UnitId, LoaderState) + load done uid pls | uid `Set.member` done = return (done, pls) + load done uid pls = do + let hsc' = hscSetActiveUnitId uid hsc_env + -- Load potential dependencies first + (done', pls') <- foldM (\(done', pls') uid -> load done' uid pls') (done, pls) + (homeUnitDepends (hsc_units hsc')) + pls'' <- loadCmdLineLibs'' interp hsc' pls' + return $ (Set.insert uid done', pls'') + +loadCmdLineLibs'' :: Interp -> HscEnv -> LoaderState -> IO LoaderState -loadCmdLineLibs' interp hsc_env pls = +loadCmdLineLibs'' interp hsc_env pls = do + let dflags@(DynFlags { ldInputs = cmdline_ld_inputs , libraryPaths = lib_paths_base}) = hsc_dflags hsc_env @@ -661,7 +684,7 @@ failNonStd dflags srcspan = dieWith dflags srcspan $ Prof -> "with -prof" Dyn -> "with -dynamic" -getLinkDeps :: HscEnv -> HomePackageTable +getLinkDeps :: HscEnv -> LoaderState -> Maybe FilePath -- replace object suffixes? -> SrcSpan -- for error messages @@ -669,13 +692,21 @@ getLinkDeps :: HscEnv -> HomePackageTable -> IO ([Linkable], [Linkable], [UnitId]) -- ... then link these first -- Fails with an IO exception if it can't find enough files -getLinkDeps hsc_env hpt pls replace_osuf span mods +getLinkDeps hsc_env pls replace_osuf span mods -- Find all the packages and linkables that a set of modules depends on = do { -- 1. Find the dependent home-pkg-modules/packages from each iface -- (omitting modules from the interactive package, which is already linked) - ; (mods_s, pkgs_s) <- follow_deps (filterOut isInteractiveModule mods) - emptyUniqDSet emptyUniqDSet; + ; (mods_s, pkgs_s) <- + -- Why two code paths here? There is a significant amount of repeated work + -- performed calculating transitive dependencies + -- if --make uses the oneShot code path (see MultiLayerModulesTH_* tests) + if isOneShot (ghcMode dflags) + then follow_deps (filterOut isInteractiveModule mods) + emptyUniqDSet emptyUniqDSet; + else do + (pkgs, mmods) <- unzip <$> mapM get_mod_info all_home_mods + return (catMaybes mmods, Set.toList (Set.unions (init_pkg_set : pkgs))) ; let -- 2. Exclude ones already linked @@ -683,11 +714,11 @@ getLinkDeps hsc_env hpt pls replace_osuf span mods (mods_needed, mods_got) = partitionEithers (map split_mods mods_s) pkgs_needed = pkgs_s `minusList` pkgs_loaded pls - split_mods mod_name = - let is_linked = find ((== mod_name) . (moduleName . linkableModule)) (objs_loaded pls ++ bcos_loaded pls) + split_mods mod = + let is_linked = find ((== mod) . (linkableModule)) (objs_loaded pls ++ bcos_loaded pls) in case is_linked of Just linkable -> Right linkable - Nothing -> Left mod_name + Nothing -> Left mod -- 3. For each dependent module, find its linkable -- This will either be in the HPT or (in the case of one-shot @@ -698,16 +729,62 @@ getLinkDeps hsc_env hpt pls replace_osuf span mods ; return (lnks_needed, mods_got ++ lnks_needed, pkgs_needed) } where dflags = hsc_dflags hsc_env + mod_graph = hsc_mod_graph hsc_env - -- The ModIface contains the transitive closure of the module dependencies - -- within the current package, *except* for boot modules: if we encounter - -- a boot module, we have to find its real interface and discover the - -- dependencies of that. Hence we need to traverse the dependency - -- tree recursively. See bug #936, testcase ghci/prog007. + -- This code is used in `--make` mode to calculate the home package and unit dependencies + -- for a set of modules. + -- + -- It is significantly more efficient to use the shared transitive dependency + -- calculation than to compute the transitive dependency set in the same manner as oneShot mode. + + -- It is also a matter of correctness to use the module graph so that dependencies between home units + -- is resolved correctly. + make_deps_loop :: (Set.Set UnitId, Set.Set NodeKey) -> [ModNodeKeyWithUid] -> (Set.Set UnitId, Set.Set NodeKey) + make_deps_loop found [] = found + make_deps_loop found@(found_units, found_mods) (nk:nexts) + | NodeKey_Module nk `Set.member` found_mods = make_deps_loop found nexts + | otherwise = + case M.lookup (NodeKey_Module nk) (mgTransDeps mod_graph) of + Just trans_deps -> + let deps = Set.insert (NodeKey_Module nk) trans_deps + -- See #936 and the ghci.prog007 test for why we have to continue traversing through + -- boot modules. + todo_boot_mods = [ModNodeKeyWithUid (GWIB mn NotBoot) uid | NodeKey_Module (ModNodeKeyWithUid (GWIB mn IsBoot) uid) <- Set.toList trans_deps] + in make_deps_loop (found_units, deps `Set.union` found_mods) (todo_boot_mods ++ nexts) + Nothing -> + let (ModNodeKeyWithUid _ uid) = nk + in make_deps_loop (uid `Set.insert` found_units, found_mods) nexts + + mkNk m = ModNodeKeyWithUid (GWIB (moduleName m) NotBoot) (moduleUnitId m) + (init_pkg_set, all_deps) = make_deps_loop (Set.empty, Set.empty) $ map mkNk (filterOut isInteractiveModule mods) + + all_home_mods = [with_uid | NodeKey_Module with_uid <- Set.toList all_deps] + + get_mod_info (ModNodeKeyWithUid gwib uid) = + case lookupHug (hsc_HUG hsc_env) uid (gwib_mod gwib) of + Just hmi -> + let iface = (hm_iface hmi) + mmod = case mi_hsc_src iface of + HsBootFile -> link_boot_mod_error (mi_module iface) + _ -> return $ Just (mi_module iface) + + in (dep_direct_pkgs (mi_deps iface),) <$> mmod + Nothing -> + let err = text "getLinkDeps: Home module not loaded" <+> ppr (gwib_mod gwib) <+> ppr uid + in throwGhcExceptionIO (ProgramError (showSDoc dflags err)) + + + -- This code is used in one-shot mode to traverse downwards through the HPT + -- to find all link dependencies. + -- The ModIface contains the transitive closure of the module dependencies + -- within the current package, *except* for boot modules: if we encounter + -- a boot module, we have to find its real interface and discover the + -- dependencies of that. Hence we need to traverse the dependency + -- tree recursively. See bug #936, testcase ghci/prog007. follow_deps :: [Module] -- modules to follow - -> UniqDSet ModuleName -- accum. module dependencies + -> UniqDSet Module -- accum. module dependencies -> UniqDSet UnitId -- accum. package dependencies - -> IO ([ModuleName], [UnitId]) -- result + -> IO ([Module], [UnitId]) -- result follow_deps [] acc_mods acc_pkgs = return (uniqDSetToList acc_mods, uniqDSetToList acc_pkgs) follow_deps (mod:mods) acc_mods acc_pkgs @@ -727,23 +804,28 @@ getLinkDeps hsc_env hpt pls replace_osuf span mods pkg_deps = dep_direct_pkgs deps (boot_deps, mod_deps) = flip partitionWith (Set.toList (dep_direct_mods deps)) $ \case - GWIB m IsBoot -> Left m - GWIB m NotBoot -> Right m - - mod_deps' = filter (not . (`elementOfUniqDSet` acc_mods)) (boot_deps ++ mod_deps) - acc_mods' = addListToUniqDSet acc_mods (moduleName mod : mod_deps) + (_, GWIB m IsBoot) -> Left m + (_, GWIB m NotBoot) -> Right m + + mod_deps' = case hsc_home_unit_maybe hsc_env of + Nothing -> [] + Just home_unit -> filter (not . (`elementOfUniqDSet` acc_mods)) (map (mkHomeModule home_unit) $ (boot_deps ++ mod_deps)) + acc_mods' = case hsc_home_unit_maybe hsc_env of + Nothing -> acc_mods + Just home_unit -> addListToUniqDSet acc_mods (mod : map (mkHomeModule home_unit) mod_deps) acc_pkgs' = addListToUniqDSet acc_pkgs (Set.toList pkg_deps) - -- - case ue_home_unit (hsc_unit_env hsc_env) of - Just home_unit - | isHomeUnit home_unit pkg - -> follow_deps (map (mkHomeModule home_unit) mod_deps' ++ mods) acc_mods' acc_pkgs' - _ -> follow_deps mods acc_mods (addOneToUniqDSet acc_pkgs' (toUnitId pkg)) + + case hsc_home_unit_maybe hsc_env of + Just home_unit | isHomeUnit home_unit pkg -> follow_deps (mod_deps' ++ mods) + acc_mods' acc_pkgs' + _ -> follow_deps mods acc_mods (addOneToUniqDSet acc_pkgs' (toUnitId pkg)) where - msg = text "need to link module" <+> ppr mod <+> + msg = text "need to link module" <+> ppr mod <+> text "due to use of Template Haskell" + + link_boot_mod_error :: Module -> IO a link_boot_mod_error mod = throwGhcExceptionIO (ProgramError (showSDoc dflags ( text "module" <+> ppr mod <+> @@ -759,22 +841,23 @@ getLinkDeps hsc_env hpt pls replace_osuf span mods -- This one is a build-system bug - get_linkable osuf mod_name -- A home-package module - | Just mod_info <- lookupHpt hpt mod_name + get_linkable osuf mod -- A home-package module + | Just mod_info <- lookupHugByModule mod (hsc_HUG hsc_env) = adjust_linkable (Maybes.expectJust "getLinkDeps" (hm_linkable mod_info)) | otherwise = do -- It's not in the HPT because we are in one shot mode, -- so use the Finder to get a ModLocation... - case ue_home_unit (hsc_unit_env hsc_env) of - Nothing -> no_obj mod_name + case hsc_home_unit_maybe hsc_env of + Nothing -> no_obj mod Just home_unit -> do + let fc = hsc_FC hsc_env let dflags = hsc_dflags hsc_env let fopts = initFinderOpts dflags - mb_stuff <- findHomeModule fc fopts home_unit mod_name + mb_stuff <- findHomeModule fc fopts home_unit (moduleName mod) case mb_stuff of Found loc mod -> found loc mod - _ -> no_obj mod_name + _ -> no_obj (moduleName mod) where found loc mod = do { -- ...and then find the linkable for it diff --git a/compiler/GHC/Linker/Static.hs b/compiler/GHC/Linker/Static.hs index 108dbec525..5d63d59461 100644 --- a/compiler/GHC/Linker/Static.hs +++ b/compiler/GHC/Linker/Static.hs @@ -2,7 +2,6 @@ module GHC.Linker.Static ( linkBinary , linkBinary' , linkStaticLib - , exeFileName ) where @@ -29,6 +28,7 @@ import GHC.Linker.Unit import GHC.Linker.Dynamic import GHC.Linker.ExtraObj import GHC.Linker.Windows +import GHC.Linker.Static.Utils import GHC.Driver.Session @@ -306,30 +306,3 @@ linkStaticLib logger dflags unit_env o_files dep_units = do -- run ranlib over the archive. write*Ar does *not* create the symbol index. runRanlib logger dflags [GHC.SysTools.FileOption "" output_fn] - - - --- | Compute the output file name of a program. --- --- StaticLink boolean is used to indicate if the program is actually a static library --- (e.g., on iOS). --- --- Use the provided filename (if any), otherwise use "main.exe" (Windows), --- "a.out (otherwise without StaticLink set), "liba.a". In every case, add the --- extension if it is missing. -exeFileName :: Platform -> Bool -> Maybe FilePath -> FilePath -exeFileName platform staticLink output_fn - | Just s <- output_fn = - case platformOS platform of - OSMinGW32 -> s <?.> "exe" - _ -> if staticLink - then s <?.> "a" - else s - | otherwise = - if platformOS platform == OSMinGW32 - then "main.exe" - else if staticLink - then "liba.a" - else "a.out" - where s <?.> ext | null (takeExtension s) = s <.> ext - | otherwise = s diff --git a/compiler/GHC/Linker/Static/Utils.hs b/compiler/GHC/Linker/Static/Utils.hs new file mode 100644 index 0000000000..6439d197d8 --- /dev/null +++ b/compiler/GHC/Linker/Static/Utils.hs @@ -0,0 +1,31 @@ +module GHC.Linker.Static.Utils where + +import GHC.Prelude +import GHC.Platform +import System.FilePath + +-- | Compute the output file name of a program. +-- +-- StaticLink boolean is used to indicate if the program is actually a static library +-- (e.g., on iOS). +-- +-- Use the provided filename (if any), otherwise use "main.exe" (Windows), +-- "a.out (otherwise without StaticLink set), "liba.a". In every case, add the +-- extension if it is missing. +exeFileName :: Platform -> Bool -> Maybe FilePath -> FilePath +exeFileName platform staticLink output_fn + | Just s <- output_fn = + case platformOS platform of + OSMinGW32 -> s <?.> "exe" + _ -> if staticLink + then s <?.> "a" + else s + | otherwise = + if platformOS platform == OSMinGW32 + then "main.exe" + else if staticLink + then "liba.a" + else "a.out" + where s <?.> ext | null (takeExtension s) = s <.> ext + | otherwise = s + |