summaryrefslogtreecommitdiff
path: root/testsuite/tests/driver
diff options
context:
space:
mode:
authorMatthew Pickering <matthewtpickering@gmail.com>2022-07-22 16:52:07 +0100
committerMarge Bot <ben+marge-bot@smart-cactus.org>2022-08-04 13:58:01 -0400
commit0b1f5fd1ba67d04c534625740344a3c9ad525148 (patch)
treeefb60dac43f49864aa83345cea89e5a4aa81aa5d /testsuite/tests/driver
parent0f43837f50d16b1fa24458d6ca2a042b52c95270 (diff)
downloadhaskell-0b1f5fd1ba67d04c534625740344a3c9ad525148.tar.gz
Fix leaks in --make mode when there are module loops
This patch fixes quite a tricky leak where we would end up retaining stale ModDetails due to rehydrating modules against non-finalised interfaces. == Loops with multiple boot files It is possible for a module graph to have a loop (SCC, when ignoring boot files) which requires multiple boot files to break. In this case we must perform the necessary hydration steps before and after compiling modules which have boot files which are described above for corectness but also perform an additional hydration step at the end of the SCC to remove space leaks. Consider the following example: ┌───────┐ ┌───────┐ │ │ │ │ │ A │ │ B │ │ │ │ │ └─────┬─┘ └───┬───┘ │ │ ┌────▼─────────▼──┐ │ │ │ C │ └────┬─────────┬──┘ │ │ ┌────▼──┐ ┌───▼───┐ │ │ │ │ │ A-boot│ │ B-boot│ │ │ │ │ └───────┘ └───────┘ A, B and C live together in a SCC. Say we compile the modules in order A-boot, B-boot, C, A, B then when we compile A we will perform the hydration steps (because A has a boot file). Therefore C will be hydrated relative to A, and the ModDetails for A will reference C/A. Then when B is compiled C will be rehydrated again, and so B will reference C/A,B, its interface will be hydrated relative to both A and B. Now there is a space leak because say C is a very big module, there are now two different copies of ModDetails kept alive by modules A and B. The way to avoid this space leak is to rehydrate an entire SCC together at the end of compilation so that all the ModDetails point to interfaces for .hs files. In this example, when we hydrate A, B and C together then both A and B will refer to C/A,B. See #21900 for some more discussion. ------------------------------------------------------- In addition to this simple case, there is also the potential for a leak during parallel upsweep which is also fixed by this patch. Transcibed is Note [ModuleNameSet, efficiency and space leaks] Note [ModuleNameSet, efficiency and space leaks] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ During unsweep the results of compiling modules are placed into a MVar, to find the environment the module needs to compile itself in the MVar is consulted and the HomeUnitGraph is set accordingly. The reason we do this is that precisely tracking module dependencies and recreating the HUG from scratch each time is very expensive. In serial mode (-j1), this all works out fine because a module can only be compiled after its dependencies have finished compiling and not interleaved with compiling module loops. Therefore when we create the finalised or no loop interfaces, the HUG only contains finalised interfaces. In parallel mode, we have to be more careful because the HUG variable can contain non-finalised interfaces which have been started by another thread. In order to avoid a space leak where a finalised interface is compiled against a HPT which contains a non-finalised interface we have to restrict the HUG to only the visible modules. The visible modules is recording in the ModuleNameSet, this is propagated upwards whilst compiling and explains which transitive modules are visible from a certain point. This set is then used to restrict the HUG before the module is compiled to only the visible modules and thus avoiding this tricky space leak. Efficiency of the ModuleNameSet is of utmost importance because a union occurs for each edge in the module graph. Therefore the set is represented directly as an IntSet which provides suitable performance, even using a UniqSet (which is backed by an IntMap) is too slow. The crucial test of performance here is the time taken to a do a no-op build in --make mode. See test "jspace" for an example which used to trigger this problem. Fixes #21900
Diffstat (limited to 'testsuite/tests/driver')
-rw-r--r--testsuite/tests/driver/j-space/Makefile9
-rw-r--r--testsuite/tests/driver/j-space/all.T1
-rwxr-xr-xtestsuite/tests/driver/j-space/genJspace33
-rw-r--r--testsuite/tests/driver/j-space/jspace.hs55
4 files changed, 98 insertions, 0 deletions
diff --git a/testsuite/tests/driver/j-space/Makefile b/testsuite/tests/driver/j-space/Makefile
new file mode 100644
index 0000000000..0e65cc72a6
--- /dev/null
+++ b/testsuite/tests/driver/j-space/Makefile
@@ -0,0 +1,9 @@
+TOP=../../..
+include $(TOP)/mk/boilerplate.mk
+include $(TOP)/mk/test.mk
+
+jspace:
+ ./genJspace
+ "$(TEST_HC)" $(TEST_HC_OPTS) -Wall -Werror -v0 jspace.hs -rtsopts -package ghc -threaded
+ ./jspace$(exeext) +RTS -N -hT --no-automatic-heap-samples -RTS "`'$(TEST_HC)' --print-libdir | tr -d '\r'`"
+
diff --git a/testsuite/tests/driver/j-space/all.T b/testsuite/tests/driver/j-space/all.T
new file mode 100644
index 0000000000..7864ebf73a
--- /dev/null
+++ b/testsuite/tests/driver/j-space/all.T
@@ -0,0 +1 @@
+test('jspace', [extra_files(['genJspace']), req_smp], makefile_test, ['jspace'])
diff --git a/testsuite/tests/driver/j-space/genJspace b/testsuite/tests/driver/j-space/genJspace
new file mode 100755
index 0000000000..3056883367
--- /dev/null
+++ b/testsuite/tests/driver/j-space/genJspace
@@ -0,0 +1,33 @@
+#!/usr/bin/env bash
+# Generates the following module graph:
+# - A module cycle H_d - ... - H_1 - H_0 - H_d-boot.
+# - A w-wide collection of modules W_1 ... W_w all importing a common module W.
+# - A module J importing H_d and all of W_1, ..., W_w.
+# This module graph makes it so that GHC has to compile one of the W_i modules at the same time
+# as it is compiling the loop of H modules.
+DEPTH=8
+WIDTH=40
+echo "module JSpaceTest where" > JSpaceTest.hs
+echo "module W where" > W.hs
+ for j in $(seq -w 1 300); do
+ echo "w$j = 123" >> W.hs;
+ done
+for i in $(seq -w 1 $WIDTH); do
+ echo "module W$i where" > W$i.hs;
+ echo "import W" >> W$i.hs;
+ echo "import W$i" >> JSpaceTest.hs;
+ for j in $(seq -w 1 1000); do
+ echo "w$j = 123" >> W$i.hs;
+ done
+done
+echo "module H0 where" > H0.hs;
+echo "import {-# SOURCE #-} H$DEPTH" >> H0.hs;
+echo "import H$DEPTH" >> JSpaceTest.hs;
+for i in $(seq -w 1 $DEPTH); do
+ echo "module H$i where" > H$i.hs;
+ echo "import H$((i-1))" >> H$i.hs;
+ for j in $(seq -w 1 100); do
+ echo "h$j = 123" >> H$i.hs;
+ done
+done
+echo "module H$DEPTH where" > H$DEPTH.hs-boot;
diff --git a/testsuite/tests/driver/j-space/jspace.hs b/testsuite/tests/driver/j-space/jspace.hs
new file mode 100644
index 0000000000..d8a4fc9779
--- /dev/null
+++ b/testsuite/tests/driver/j-space/jspace.hs
@@ -0,0 +1,55 @@
+module Main where
+
+import GHC
+import GHC.Driver.Monad
+import System.Environment
+import GHC.Driver.Env.Types
+import GHC.Profiling
+import System.Mem
+import Data.List (isPrefixOf)
+import Control.Monad
+import System.Exit
+import GHC.Platform
+
+main :: IO ()
+main = do
+ [libdir] <- getArgs
+ runGhc (Just libdir) $ do
+ initGhcM ["JSpaceTest.hs", "-O", "-j", "-v0"]
+
+
+initGhcM :: [String] -> Ghc ()
+initGhcM xs = do
+ session <- getSession
+ df1 <- getSessionDynFlags
+ let cmdOpts = ["-fforce-recomp"] ++ xs
+ (df2, leftovers, _) <- parseDynamicFlags (hsc_logger session) df1 (map noLoc cmdOpts)
+ setSessionDynFlags df2
+ ts <- mapM (\s -> guessTarget s Nothing Nothing) $ map unLoc leftovers
+ setTargets ts
+ _ <- load LoadAllTargets
+ let plat :: Platform
+ plat = targetPlatform df2
+ word_size = case platformWordSize plat of
+ PW8 -> 8
+ PW4 -> 4
+ liftIO $ do
+ requestHeapCensus
+ performGC
+ [ys] <- filter (isPrefixOf "ghc:GHC.Unit.Module.ModDetails.ModDetails") . lines <$> readFile "jspace.hp"
+ let (n :: Int) = read (last (words ys))
+ -- The output should be 50 * 8 * word_size (i.e. 3200, or 1600 on 32-bit architectures):
+ -- the test contains DEPTH + WIDTH + 2 = 50 modules J, H_0, .., H_DEPTH, W_1, .., W_WIDTH,
+ -- and each ModDetails contains 1 (info table) + 7 word-sized fields.
+ -- If this number changes DO NOT ACCEPT THE TEST, you have introduced a space leak.
+ --
+ -- There is some unexplained behaviour where the result is infrequently 3264.. but
+ -- this resisted investigation using ghc-debug so the test actually checks whether there
+ -- are less than 51 live ModDetails which is still a big improvement over before.
+ when (n >= (51 * word_size * 8)) $ do
+ putStrLn "Space leak detetched by jspace test:"
+ putStrLn $ (show (n `div` (word_size * 8))) ++ " live ModDetails when <= 51 are expected"
+ exitFailure
+ return ()
+
+