diff options
author | Matthew Pickering <matthewtpickering@gmail.com> | 2022-07-22 16:52:07 +0100 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2022-08-04 13:58:01 -0400 |
commit | 0b1f5fd1ba67d04c534625740344a3c9ad525148 (patch) | |
tree | efb60dac43f49864aa83345cea89e5a4aa81aa5d /testsuite/tests/driver | |
parent | 0f43837f50d16b1fa24458d6ca2a042b52c95270 (diff) | |
download | haskell-0b1f5fd1ba67d04c534625740344a3c9ad525148.tar.gz |
Fix leaks in --make mode when there are module loops
This patch fixes quite a tricky leak where we would end up retaining
stale ModDetails due to rehydrating modules against non-finalised
interfaces.
== Loops with multiple boot files
It is possible for a module graph to have a loop (SCC, when ignoring boot files)
which requires multiple boot files to break. In this case we must perform the
necessary hydration steps before and after compiling modules which have boot files
which are described above for corectness but also perform an additional hydration step
at the end of the SCC to remove space leaks.
Consider the following example:
┌───────┐ ┌───────┐
│ │ │ │
│ A │ │ B │
│ │ │ │
└─────┬─┘ └───┬───┘
│ │
┌────▼─────────▼──┐
│ │
│ C │
└────┬─────────┬──┘
│ │
┌────▼──┐ ┌───▼───┐
│ │ │ │
│ A-boot│ │ B-boot│
│ │ │ │
└───────┘ └───────┘
A, B and C live together in a SCC. Say we compile the modules in order
A-boot, B-boot, C, A, B then when we compile A we will perform the hydration steps
(because A has a boot file). Therefore C will be hydrated relative to A, and the
ModDetails for A will reference C/A. Then when B is compiled C will be rehydrated again,
and so B will reference C/A,B, its interface will be hydrated relative to both A and B.
Now there is a space leak because say C is a very big module, there are now two different copies of
ModDetails kept alive by modules A and B.
The way to avoid this space leak is to rehydrate an entire SCC together at the
end of compilation so that all the ModDetails point to interfaces for .hs files.
In this example, when we hydrate A, B and C together then both A and B will refer to
C/A,B.
See #21900 for some more discussion.
-------------------------------------------------------
In addition to this simple case, there is also the potential for a leak
during parallel upsweep which is also fixed by this patch. Transcibed is
Note [ModuleNameSet, efficiency and space leaks]
Note [ModuleNameSet, efficiency and space leaks]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
During unsweep the results of compiling modules are placed into a MVar, to find
the environment the module needs to compile itself in the MVar is consulted and
the HomeUnitGraph is set accordingly. The reason we do this is that precisely tracking
module dependencies and recreating the HUG from scratch each time is very expensive.
In serial mode (-j1), this all works out fine because a module can only be compiled after
its dependencies have finished compiling and not interleaved with compiling module loops.
Therefore when we create the finalised or no loop interfaces, the HUG only contains
finalised interfaces.
In parallel mode, we have to be more careful because the HUG variable can contain
non-finalised interfaces which have been started by another thread. In order to avoid
a space leak where a finalised interface is compiled against a HPT which contains a
non-finalised interface we have to restrict the HUG to only the visible modules.
The visible modules is recording in the ModuleNameSet, this is propagated upwards
whilst compiling and explains which transitive modules are visible from a certain point.
This set is then used to restrict the HUG before the module is compiled to only
the visible modules and thus avoiding this tricky space leak.
Efficiency of the ModuleNameSet is of utmost importance because a union occurs for
each edge in the module graph. Therefore the set is represented directly as an IntSet
which provides suitable performance, even using a UniqSet (which is backed by an IntMap) is
too slow. The crucial test of performance here is the time taken to a do a no-op build in --make mode.
See test "jspace" for an example which used to trigger this problem.
Fixes #21900
Diffstat (limited to 'testsuite/tests/driver')
-rw-r--r-- | testsuite/tests/driver/j-space/Makefile | 9 | ||||
-rw-r--r-- | testsuite/tests/driver/j-space/all.T | 1 | ||||
-rwxr-xr-x | testsuite/tests/driver/j-space/genJspace | 33 | ||||
-rw-r--r-- | testsuite/tests/driver/j-space/jspace.hs | 55 |
4 files changed, 98 insertions, 0 deletions
diff --git a/testsuite/tests/driver/j-space/Makefile b/testsuite/tests/driver/j-space/Makefile new file mode 100644 index 0000000000..0e65cc72a6 --- /dev/null +++ b/testsuite/tests/driver/j-space/Makefile @@ -0,0 +1,9 @@ +TOP=../../.. +include $(TOP)/mk/boilerplate.mk +include $(TOP)/mk/test.mk + +jspace: + ./genJspace + "$(TEST_HC)" $(TEST_HC_OPTS) -Wall -Werror -v0 jspace.hs -rtsopts -package ghc -threaded + ./jspace$(exeext) +RTS -N -hT --no-automatic-heap-samples -RTS "`'$(TEST_HC)' --print-libdir | tr -d '\r'`" + diff --git a/testsuite/tests/driver/j-space/all.T b/testsuite/tests/driver/j-space/all.T new file mode 100644 index 0000000000..7864ebf73a --- /dev/null +++ b/testsuite/tests/driver/j-space/all.T @@ -0,0 +1 @@ +test('jspace', [extra_files(['genJspace']), req_smp], makefile_test, ['jspace']) diff --git a/testsuite/tests/driver/j-space/genJspace b/testsuite/tests/driver/j-space/genJspace new file mode 100755 index 0000000000..3056883367 --- /dev/null +++ b/testsuite/tests/driver/j-space/genJspace @@ -0,0 +1,33 @@ +#!/usr/bin/env bash +# Generates the following module graph: +# - A module cycle H_d - ... - H_1 - H_0 - H_d-boot. +# - A w-wide collection of modules W_1 ... W_w all importing a common module W. +# - A module J importing H_d and all of W_1, ..., W_w. +# This module graph makes it so that GHC has to compile one of the W_i modules at the same time +# as it is compiling the loop of H modules. +DEPTH=8 +WIDTH=40 +echo "module JSpaceTest where" > JSpaceTest.hs +echo "module W where" > W.hs + for j in $(seq -w 1 300); do + echo "w$j = 123" >> W.hs; + done +for i in $(seq -w 1 $WIDTH); do + echo "module W$i where" > W$i.hs; + echo "import W" >> W$i.hs; + echo "import W$i" >> JSpaceTest.hs; + for j in $(seq -w 1 1000); do + echo "w$j = 123" >> W$i.hs; + done +done +echo "module H0 where" > H0.hs; +echo "import {-# SOURCE #-} H$DEPTH" >> H0.hs; +echo "import H$DEPTH" >> JSpaceTest.hs; +for i in $(seq -w 1 $DEPTH); do + echo "module H$i where" > H$i.hs; + echo "import H$((i-1))" >> H$i.hs; + for j in $(seq -w 1 100); do + echo "h$j = 123" >> H$i.hs; + done +done +echo "module H$DEPTH where" > H$DEPTH.hs-boot; diff --git a/testsuite/tests/driver/j-space/jspace.hs b/testsuite/tests/driver/j-space/jspace.hs new file mode 100644 index 0000000000..d8a4fc9779 --- /dev/null +++ b/testsuite/tests/driver/j-space/jspace.hs @@ -0,0 +1,55 @@ +module Main where + +import GHC +import GHC.Driver.Monad +import System.Environment +import GHC.Driver.Env.Types +import GHC.Profiling +import System.Mem +import Data.List (isPrefixOf) +import Control.Monad +import System.Exit +import GHC.Platform + +main :: IO () +main = do + [libdir] <- getArgs + runGhc (Just libdir) $ do + initGhcM ["JSpaceTest.hs", "-O", "-j", "-v0"] + + +initGhcM :: [String] -> Ghc () +initGhcM xs = do + session <- getSession + df1 <- getSessionDynFlags + let cmdOpts = ["-fforce-recomp"] ++ xs + (df2, leftovers, _) <- parseDynamicFlags (hsc_logger session) df1 (map noLoc cmdOpts) + setSessionDynFlags df2 + ts <- mapM (\s -> guessTarget s Nothing Nothing) $ map unLoc leftovers + setTargets ts + _ <- load LoadAllTargets + let plat :: Platform + plat = targetPlatform df2 + word_size = case platformWordSize plat of + PW8 -> 8 + PW4 -> 4 + liftIO $ do + requestHeapCensus + performGC + [ys] <- filter (isPrefixOf "ghc:GHC.Unit.Module.ModDetails.ModDetails") . lines <$> readFile "jspace.hp" + let (n :: Int) = read (last (words ys)) + -- The output should be 50 * 8 * word_size (i.e. 3200, or 1600 on 32-bit architectures): + -- the test contains DEPTH + WIDTH + 2 = 50 modules J, H_0, .., H_DEPTH, W_1, .., W_WIDTH, + -- and each ModDetails contains 1 (info table) + 7 word-sized fields. + -- If this number changes DO NOT ACCEPT THE TEST, you have introduced a space leak. + -- + -- There is some unexplained behaviour where the result is infrequently 3264.. but + -- this resisted investigation using ghc-debug so the test actually checks whether there + -- are less than 51 live ModDetails which is still a big improvement over before. + when (n >= (51 * word_size * 8)) $ do + putStrLn "Space leak detetched by jspace test:" + putStrLn $ (show (n `div` (word_size * 8))) ++ " live ModDetails when <= 51 are expected" + exitFailure + return () + + |