From b3b4d3c1b1fbe1fa3a04d8233ef78dcd12299753 Mon Sep 17 00:00:00 2001 From: Andreas Klebinger Date: Tue, 12 Jan 2021 20:02:42 +0100 Subject: SimplM: Create uniques via IO instead of threading --- compiler/GHC/Types/Unique/Supply.hs | 83 ++++++++++++++++++++++++++++++++++--- 1 file changed, 78 insertions(+), 5 deletions(-) (limited to 'compiler/GHC/Types/Unique') diff --git a/compiler/GHC/Types/Unique/Supply.hs b/compiler/GHC/Types/Unique/Supply.hs index 4b146edd9f..0a10fde9b3 100644 --- a/compiler/GHC/Types/Unique/Supply.hs +++ b/compiler/GHC/Types/Unique/Supply.hs @@ -86,9 +86,35 @@ lazily-evaluated infinite tree. * The fresh node * A thunk for each sub-tree -Note [Optimising the unique supply] +Note [How unique supplies are used] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The general design (used throughout GHC) is to: + +* For creating new uniques either a UniqSupply is used and threaded through + or for monadic code a MonadUnique instance might conjure up uniques using + `uniqFromMask`. +* Different parts of the compiler will use a UniqSupply or MonadUnique instance + with a specific mask. This way the different parts of the compiler will + generate uniques with different masks. + +If different code shares the same mask then care has to be taken that all uniques +still get distinct numbers. Usually this is done by relying on genSym which +has *one* counter per GHC invocation that is relied on by all calls to it. +But using something like the address for pinned objects works as well and in fact is done +for fast strings. + +This is important for example in the simplifier. Most passes of the simplifier use +the same mask 's'. However in some places we create a unique supply using `mkSplitUniqSupply` +and thread it through the code, while in GHC.Core.Opt.Simplify.Monad we use the +`instance MonadUnique SimplM`, which uses `mkSplitUniqSupply` in getUniqueSupplyM +and `uniqFromMask` in getUniqeM. + +Ultimately all these boil down to each new unique consisting of the mask and the result from +a call to `genSym`. The later producing a distinct number for each invocation ensuring +uniques are distinct. +Note [Optimising the unique supply] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The inner loop of mkSplitUniqSupply is a function closure mk_supply s0 = @@ -117,6 +143,46 @@ result. It was very brittle and required enabling -fno-state-hack globally. So it has been rewritten using lower level constructs to explicitly state what we want. +Note [Optimising use of unique supplies] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +When it comes to having a way to generate new Uniques +there are generally three ways to deal with this: + +For pure code the only good approach is to take an UniqSupply +as argument. Then thread it through the code splitting it +for sub-passes or when creating uniques. +The code for this is about as optimized as it gets, but we can't +get around the need to allocate one `UniqSupply` for each Unique +we need. + +For code in IO we can improve on this by threading only the *mask* +we are going to use for Uniques. Using `uniqFromMask` to +generate uniques as needed. This gets rid of the overhead of +allocating a new UniqSupply for each unique generated. It also avoids +frequent state updates when the Unique/Mask is part of the state in a +state monad. + +For monadic code in IO which always uses the same mask we can go further +and hardcode the mask into the MonadUnique instance. On top of all the +benefits of threading the mask this *also* has the benefit of avoiding +the mask getting captured in thunks, or being passed around at runtime. +It does however come at the cost of having to use a fixed Mask for all +code run in this Monad. But rememeber, the Mask is purely cosmetic: +See Note [Uniques and masks]. + +NB: It's *not* an optimization to pass around the UniqSupply inside an +IORef instead of the mask. While this would avoid frequent state updates +it still requires allocating one UniqSupply per Unique. On top of some +overhead for reading/writing to/from the IORef. + +All of this hinges on the assumption that UniqSupply and +uniqFromMask use the same source of distinct numbers (`genSym`) which +allows both to be used at the same time, with the same mask, while still +ensuring distinct uniques. +One might consider this fact to be an "accident". But GHC worked like this +as far back as source control history goes. It also allows the later two +optimizations to be used. So it seems safe to depend on this fact. + -} @@ -132,9 +198,16 @@ data UniqSupply -- when split => these two supplies mkSplitUniqSupply :: Char -> IO UniqSupply --- ^ Create a unique supply out of thin air. The character given must --- be distinct from those of all calls to this function in the compiler --- for the values generated to be truly unique. +-- ^ Create a unique supply out of thin air. +-- The "mask" (Char) supplied is purely cosmetic, making it easier +-- to figure out where a Unique was born. See +-- Note [Uniques and masks]. +-- +-- The payload part of the Uniques allocated from this UniqSupply are +-- guaranteed distinct wrt all other supplies, regardless of their "mask". +-- This is achieved by allocating the payload part from +-- a single source of Uniques, namely `genSym`, shared across +-- all UniqSupply's. -- See Note [How the unique supply works] -- See Note [Optimising the unique supply] @@ -187,7 +260,7 @@ initUniqSupply counter inc = do poke ghc_unique_inc inc uniqFromMask :: Char -> IO Unique -uniqFromMask mask +uniqFromMask !mask = do { uqNum <- genSym ; return $! mkUnique mask uqNum } -- cgit v1.2.1