diff options
Diffstat (limited to 'compiler/specialise/Specialise.lhs')
-rw-r--r-- | compiler/specialise/Specialise.lhs | 1236 |
1 files changed, 1236 insertions, 0 deletions
diff --git a/compiler/specialise/Specialise.lhs b/compiler/specialise/Specialise.lhs new file mode 100644 index 0000000000..0e66b0bc78 --- /dev/null +++ b/compiler/specialise/Specialise.lhs @@ -0,0 +1,1236 @@ +% +% (c) The GRASP/AQUA Project, Glasgow University, 1993-1998 +% +\section[Specialise]{Stamping out overloading, and (optionally) polymorphism} + +\begin{code} +module Specialise ( specProgram ) where + +#include "HsVersions.h" + +import DynFlags ( DynFlags, DynFlag(..) ) +import Id ( Id, idName, idType, mkUserLocal ) +import TcType ( Type, mkTyVarTy, tcSplitSigmaTy, + tyVarsOfTypes, tyVarsOfTheta, isClassPred, + tcCmpType, isUnLiftedType + ) +import CoreSubst ( Subst, mkEmptySubst, extendTvSubstList, lookupIdSubst, + substBndr, substBndrs, substTy, substInScope, + cloneIdBndr, cloneIdBndrs, cloneRecIdBndrs + ) +import VarSet +import VarEnv +import CoreSyn +import CoreUtils ( applyTypeToArgs, mkPiTypes ) +import CoreFVs ( exprFreeVars, exprsFreeVars, idRuleVars ) +import CoreTidy ( tidyRules ) +import CoreLint ( showPass, endPass ) +import Rules ( addIdSpecialisations, mkLocalRule, lookupRule, emptyRuleBase, rulesOfBinds ) +import PprCore ( pprRules ) +import UniqSupply ( UniqSupply, + UniqSM, initUs_, thenUs, returnUs, getUniqueUs, + getUs, mapUs + ) +import Name ( nameOccName, mkSpecOcc, getSrcLoc ) +import MkId ( voidArgId, realWorldPrimId ) +import FiniteMap +import Maybes ( catMaybes, maybeToBool ) +import ErrUtils ( dumpIfSet_dyn ) +import BasicTypes ( Activation( AlwaysActive ) ) +import Bag +import List ( partition ) +import Util ( zipEqual, zipWithEqual, cmpList, lengthIs, + equalLength, lengthAtLeast, notNull ) +import Outputable +import FastString + +infixr 9 `thenSM` +\end{code} + +%************************************************************************ +%* * +\subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]} +%* * +%************************************************************************ + +These notes describe how we implement specialisation to eliminate +overloading. + +The specialisation pass works on Core +syntax, complete with all the explicit dictionary application, +abstraction and construction as added by the type checker. The +existing type checker remains largely as it is. + +One important thought: the {\em types} passed to an overloaded +function, and the {\em dictionaries} passed are mutually redundant. +If the same function is applied to the same type(s) then it is sure to +be applied to the same dictionary(s)---or rather to the same {\em +values}. (The arguments might look different but they will evaluate +to the same value.) + +Second important thought: we know that we can make progress by +treating dictionary arguments as static and worth specialising on. So +we can do without binding-time analysis, and instead specialise on +dictionary arguments and no others. + +The basic idea +~~~~~~~~~~~~~~ +Suppose we have + + let f = <f_rhs> + in <body> + +and suppose f is overloaded. + +STEP 1: CALL-INSTANCE COLLECTION + +We traverse <body>, accumulating all applications of f to types and +dictionaries. + +(Might there be partial applications, to just some of its types and +dictionaries? In principle yes, but in practice the type checker only +builds applications of f to all its types and dictionaries, so partial +applications could only arise as a result of transformation, and even +then I think it's unlikely. In any case, we simply don't accumulate such +partial applications.) + + +STEP 2: EQUIVALENCES + +So now we have a collection of calls to f: + f t1 t2 d1 d2 + f t3 t4 d3 d4 + ... +Notice that f may take several type arguments. To avoid ambiguity, we +say that f is called at type t1/t2 and t3/t4. + +We take equivalence classes using equality of the *types* (ignoring +the dictionary args, which as mentioned previously are redundant). + +STEP 3: SPECIALISATION + +For each equivalence class, choose a representative (f t1 t2 d1 d2), +and create a local instance of f, defined thus: + + f@t1/t2 = <f_rhs> t1 t2 d1 d2 + +f_rhs presumably has some big lambdas and dictionary lambdas, so lots +of simplification will now result. However we don't actually *do* that +simplification. Rather, we leave it for the simplifier to do. If we +*did* do it, though, we'd get more call instances from the specialised +RHS. We can work out what they are by instantiating the call-instance +set from f's RHS with the types t1, t2. + +Add this new id to f's IdInfo, to record that f has a specialised version. + +Before doing any of this, check that f's IdInfo doesn't already +tell us about an existing instance of f at the required type/s. +(This might happen if specialisation was applied more than once, or +it might arise from user SPECIALIZE pragmas.) + +Recursion +~~~~~~~~~ +Wait a minute! What if f is recursive? Then we can't just plug in +its right-hand side, can we? + +But it's ok. The type checker *always* creates non-recursive definitions +for overloaded recursive functions. For example: + + f x = f (x+x) -- Yes I know its silly + +becomes + + f a (d::Num a) = let p = +.sel a d + in + letrec fl (y::a) = fl (p y y) + in + fl + +We still have recusion for non-overloaded functions which we +speciailise, but the recursive call should get specialised to the +same recursive version. + + +Polymorphism 1 +~~~~~~~~~~~~~~ + +All this is crystal clear when the function is applied to *constant +types*; that is, types which have no type variables inside. But what if +it is applied to non-constant types? Suppose we find a call of f at type +t1/t2. There are two possibilities: + +(a) The free type variables of t1, t2 are in scope at the definition point +of f. In this case there's no problem, we proceed just as before. A common +example is as follows. Here's the Haskell: + + g y = let f x = x+x + in f y + f y + +After typechecking we have + + g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x + in +.sel a d (f a d y) (f a d y) + +Notice that the call to f is at type type "a"; a non-constant type. +Both calls to f are at the same type, so we can specialise to give: + + g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x + in +.sel a d (f@a y) (f@a y) + + +(b) The other case is when the type variables in the instance types +are *not* in scope at the definition point of f. The example we are +working with above is a good case. There are two instances of (+.sel a d), +but "a" is not in scope at the definition of +.sel. Can we do anything? +Yes, we can "common them up", a sort of limited common sub-expression deal. +This would give: + + g a (d::Num a) (y::a) = let +.sel@a = +.sel a d + f@a (x::a) = +.sel@a x x + in +.sel@a (f@a y) (f@a y) + +This can save work, and can't be spotted by the type checker, because +the two instances of +.sel weren't originally at the same type. + +Further notes on (b) + +* There are quite a few variations here. For example, the defn of + +.sel could be floated ouside the \y, to attempt to gain laziness. + It certainly mustn't be floated outside the \d because the d has to + be in scope too. + +* We don't want to inline f_rhs in this case, because +that will duplicate code. Just commoning up the call is the point. + +* Nothing gets added to +.sel's IdInfo. + +* Don't bother unless the equivalence class has more than one item! + +Not clear whether this is all worth it. It is of course OK to +simply discard call-instances when passing a big lambda. + +Polymorphism 2 -- Overloading +~~~~~~~~~~~~~~ +Consider a function whose most general type is + + f :: forall a b. Ord a => [a] -> b -> b + +There is really no point in making a version of g at Int/Int and another +at Int/Bool, because it's only instancing the type variable "a" which +buys us any efficiency. Since g is completely polymorphic in b there +ain't much point in making separate versions of g for the different +b types. + +That suggests that we should identify which of g's type variables +are constrained (like "a") and which are unconstrained (like "b"). +Then when taking equivalence classes in STEP 2, we ignore the type args +corresponding to unconstrained type variable. In STEP 3 we make +polymorphic versions. Thus: + + f@t1/ = /\b -> <f_rhs> t1 b d1 d2 + +We do this. + + +Dictionary floating +~~~~~~~~~~~~~~~~~~~ +Consider this + + f a (d::Num a) = let g = ... + in + ...(let d1::Ord a = Num.Ord.sel a d in g a d1)... + +Here, g is only called at one type, but the dictionary isn't in scope at the +definition point for g. Usually the type checker would build a +definition for d1 which enclosed g, but the transformation system +might have moved d1's defn inward. Solution: float dictionary bindings +outwards along with call instances. + +Consider + + f x = let g p q = p==q + h r s = (r+s, g r s) + in + h x x + + +Before specialisation, leaving out type abstractions we have + + f df x = let g :: Eq a => a -> a -> Bool + g dg p q = == dg p q + h :: Num a => a -> a -> (a, Bool) + h dh r s = let deq = eqFromNum dh + in (+ dh r s, g deq r s) + in + h df x x + +After specialising h we get a specialised version of h, like this: + + h' r s = let deq = eqFromNum df + in (+ df r s, g deq r s) + +But we can't naively make an instance for g from this, because deq is not in scope +at the defn of g. Instead, we have to float out the (new) defn of deq +to widen its scope. Notice that this floating can't be done in advance -- it only +shows up when specialisation is done. + +User SPECIALIZE pragmas +~~~~~~~~~~~~~~~~~~~~~~~ +Specialisation pragmas can be digested by the type checker, and implemented +by adding extra definitions along with that of f, in the same way as before + + f@t1/t2 = <f_rhs> t1 t2 d1 d2 + +Indeed the pragmas *have* to be dealt with by the type checker, because +only it knows how to build the dictionaries d1 and d2! For example + + g :: Ord a => [a] -> [a] + {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-} + +Here, the specialised version of g is an application of g's rhs to the +Ord dictionary for (Tree Int), which only the type checker can conjure +up. There might not even *be* one, if (Tree Int) is not an instance of +Ord! (All the other specialision has suitable dictionaries to hand +from actual calls.) + +Problem. The type checker doesn't have to hand a convenient <f_rhs>, because +it is buried in a complex (as-yet-un-desugared) binding group. +Maybe we should say + + f@t1/t2 = f* t1 t2 d1 d2 + +where f* is the Id f with an IdInfo which says "inline me regardless!". +Indeed all the specialisation could be done in this way. +That in turn means that the simplifier has to be prepared to inline absolutely +any in-scope let-bound thing. + + +Again, the pragma should permit polymorphism in unconstrained variables: + + h :: Ord a => [a] -> b -> b + {-# SPECIALIZE h :: [Int] -> b -> b #-} + +We *insist* that all overloaded type variables are specialised to ground types, +(and hence there can be no context inside a SPECIALIZE pragma). +We *permit* unconstrained type variables to be specialised to + - a ground type + - or left as a polymorphic type variable +but nothing in between. So + + {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-} + +is *illegal*. (It can be handled, but it adds complication, and gains the +programmer nothing.) + + +SPECIALISING INSTANCE DECLARATIONS +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + + instance Foo a => Foo [a] where + ... + {-# SPECIALIZE instance Foo [Int] #-} + +The original instance decl creates a dictionary-function +definition: + + dfun.Foo.List :: forall a. Foo a -> Foo [a] + +The SPECIALIZE pragma just makes a specialised copy, just as for +ordinary function definitions: + + dfun.Foo.List@Int :: Foo [Int] + dfun.Foo.List@Int = dfun.Foo.List Int dFooInt + +The information about what instance of the dfun exist gets added to +the dfun's IdInfo in the same way as a user-defined function too. + + +Automatic instance decl specialisation? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Can instance decls be specialised automatically? It's tricky. +We could collect call-instance information for each dfun, but +then when we specialised their bodies we'd get new call-instances +for ordinary functions; and when we specialised their bodies, we might get +new call-instances of the dfuns, and so on. This all arises because of +the unrestricted mutual recursion between instance decls and value decls. + +Still, there's no actual problem; it just means that we may not do all +the specialisation we could theoretically do. + +Furthermore, instance decls are usually exported and used non-locally, +so we'll want to compile enough to get those specialisations done. + +Lastly, there's no such thing as a local instance decl, so we can +survive solely by spitting out *usage* information, and then reading that +back in as a pragma when next compiling the file. So for now, +we only specialise instance decls in response to pragmas. + + +SPITTING OUT USAGE INFORMATION +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To spit out usage information we need to traverse the code collecting +call-instance information for all imported (non-prelude?) functions +and data types. Then we equivalence-class it and spit it out. + +This is done at the top-level when all the call instances which escape +must be for imported functions and data types. + +*** Not currently done *** + + +Partial specialisation by pragmas +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +What about partial specialisation: + + k :: (Ord a, Eq b) => [a] -> b -> b -> [a] + {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-} + +or even + + {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-} + +Seems quite reasonable. Similar things could be done with instance decls: + + instance (Foo a, Foo b) => Foo (a,b) where + ... + {-# SPECIALIZE instance Foo a => Foo (a,Int) #-} + {-# SPECIALIZE instance Foo b => Foo (Int,b) #-} + +Ho hum. Things are complex enough without this. I pass. + + +Requirements for the simplifer +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The simplifier has to be able to take advantage of the specialisation. + +* When the simplifier finds an application of a polymorphic f, it looks in +f's IdInfo in case there is a suitable instance to call instead. This converts + + f t1 t2 d1 d2 ===> f_t1_t2 + +Note that the dictionaries get eaten up too! + +* Dictionary selection operations on constant dictionaries must be + short-circuited: + + +.sel Int d ===> +Int + +The obvious way to do this is in the same way as other specialised +calls: +.sel has inside it some IdInfo which tells that if it's applied +to the type Int then it should eat a dictionary and transform to +Int. + +In short, dictionary selectors need IdInfo inside them for constant +methods. + +* Exactly the same applies if a superclass dictionary is being + extracted: + + Eq.sel Int d ===> dEqInt + +* Something similar applies to dictionary construction too. Suppose +dfun.Eq.List is the function taking a dictionary for (Eq a) to +one for (Eq [a]). Then we want + + dfun.Eq.List Int d ===> dEq.List_Int + +Where does the Eq [Int] dictionary come from? It is built in +response to a SPECIALIZE pragma on the Eq [a] instance decl. + +In short, dfun Ids need IdInfo with a specialisation for each +constant instance of their instance declaration. + +All this uses a single mechanism: the SpecEnv inside an Id + + +What does the specialisation IdInfo look like? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The SpecEnv of an Id maps a list of types (the template) to an expression + + [Type] |-> Expr + +For example, if f has this SpecInfo: + + [Int, a] -> \d:Ord Int. f' a + +it means that we can replace the call + + f Int t ===> (\d. f' t) + +This chucks one dictionary away and proceeds with the +specialised version of f, namely f'. + + +What can't be done this way? +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +There is no way, post-typechecker, to get a dictionary for (say) +Eq a from a dictionary for Eq [a]. So if we find + + ==.sel [t] d + +we can't transform to + + eqList (==.sel t d') + +where + eqList :: (a->a->Bool) -> [a] -> [a] -> Bool + +Of course, we currently have no way to automatically derive +eqList, nor to connect it to the Eq [a] instance decl, but you +can imagine that it might somehow be possible. Taking advantage +of this is permanently ruled out. + +Still, this is no great hardship, because we intend to eliminate +overloading altogether anyway! + + + +A note about non-tyvar dictionaries +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Some Ids have types like + + forall a,b,c. Eq a -> Ord [a] -> tau + +This seems curious at first, because we usually only have dictionary +args whose types are of the form (C a) where a is a type variable. +But this doesn't hold for the functions arising from instance decls, +which sometimes get arguements with types of form (C (T a)) for some +type constructor T. + +Should we specialise wrt this compound-type dictionary? We used to say +"no", saying: + "This is a heuristic judgement, as indeed is the fact that we + specialise wrt only dictionaries. We choose *not* to specialise + wrt compound dictionaries because at the moment the only place + they show up is in instance decls, where they are simply plugged + into a returned dictionary. So nothing is gained by specialising + wrt them." + +But it is simpler and more uniform to specialise wrt these dicts too; +and in future GHC is likely to support full fledged type signatures +like + f ;: Eq [(a,b)] => ... + + +%************************************************************************ +%* * +\subsubsection{The new specialiser} +%* * +%************************************************************************ + +Our basic game plan is this. For let(rec) bound function + f :: (C a, D c) => (a,b,c,d) -> Bool + +* Find any specialised calls of f, (f ts ds), where + ts are the type arguments t1 .. t4, and + ds are the dictionary arguments d1 .. d2. + +* Add a new definition for f1 (say): + + f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2 + + Note that we abstract over the unconstrained type arguments. + +* Add the mapping + + [t1,b,t3,d] |-> \d1 d2 -> f1 b d + + to the specialisations of f. This will be used by the + simplifier to replace calls + (f t1 t2 t3 t4) da db + by + (\d1 d1 -> f1 t2 t4) da db + + All the stuff about how many dictionaries to discard, and what types + to apply the specialised function to, are handled by the fact that the + SpecEnv contains a template for the result of the specialisation. + +We don't build *partial* specialisations for f. For example: + + f :: Eq a => a -> a -> Bool + {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-} + +Here, little is gained by making a specialised copy of f. +There's a distinct danger that the specialised version would +first build a dictionary for (Eq b, Eq c), and then select the (==) +method from it! Even if it didn't, not a great deal is saved. + +We do, however, generate polymorphic, but not overloaded, specialisations: + + f :: Eq a => [a] -> b -> b -> b + {#- SPECIALISE f :: [Int] -> b -> b -> b #-} + +Hence, the invariant is this: + + *** no specialised version is overloaded *** + + +%************************************************************************ +%* * +\subsubsection{The exported function} +%* * +%************************************************************************ + +\begin{code} +specProgram :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind] +specProgram dflags us binds + = do + showPass dflags "Specialise" + + let binds' = initSM us (go binds `thenSM` \ (binds', uds') -> + returnSM (dumpAllDictBinds uds' binds')) + + endPass dflags "Specialise" Opt_D_dump_spec binds' + + dumpIfSet_dyn dflags Opt_D_dump_rules "Top-level specialisations" + (pprRules (tidyRules emptyTidyEnv (rulesOfBinds binds'))) + + return binds' + where + -- We need to start with a Subst that knows all the things + -- that are in scope, so that the substitution engine doesn't + -- accidentally re-use a unique that's already in use + -- Easiest thing is to do it all at once, as if all the top-level + -- decls were mutually recursive + top_subst = mkEmptySubst (mkInScopeSet (mkVarSet (bindersOfBinds binds))) + + go [] = returnSM ([], emptyUDs) + go (bind:binds) = go binds `thenSM` \ (binds', uds) -> + specBind top_subst bind uds `thenSM` \ (bind', uds') -> + returnSM (bind' ++ binds', uds') +\end{code} + +%************************************************************************ +%* * +\subsubsection{@specExpr@: the main function} +%* * +%************************************************************************ + +\begin{code} +specVar :: Subst -> Id -> CoreExpr +specVar subst v = lookupIdSubst subst v + +specExpr :: Subst -> CoreExpr -> SpecM (CoreExpr, UsageDetails) +-- We carry a substitution down: +-- a) we must clone any binding that might flaot outwards, +-- to avoid name clashes +-- b) we carry a type substitution to use when analysing +-- the RHS of specialised bindings (no type-let!) + +---------------- First the easy cases -------------------- +specExpr subst (Type ty) = returnSM (Type (substTy subst ty), emptyUDs) +specExpr subst (Var v) = returnSM (specVar subst v, emptyUDs) +specExpr subst (Lit lit) = returnSM (Lit lit, emptyUDs) + +specExpr subst (Note note body) + = specExpr subst body `thenSM` \ (body', uds) -> + returnSM (Note (specNote subst note) body', uds) + + +---------------- Applications might generate a call instance -------------------- +specExpr subst expr@(App fun arg) + = go expr [] + where + go (App fun arg) args = specExpr subst arg `thenSM` \ (arg', uds_arg) -> + go fun (arg':args) `thenSM` \ (fun', uds_app) -> + returnSM (App fun' arg', uds_arg `plusUDs` uds_app) + + go (Var f) args = case specVar subst f of + Var f' -> returnSM (Var f', mkCallUDs subst f' args) + e' -> returnSM (e', emptyUDs) -- I don't expect this! + go other args = specExpr subst other + +---------------- Lambda/case require dumping of usage details -------------------- +specExpr subst e@(Lam _ _) + = specExpr subst' body `thenSM` \ (body', uds) -> + let + (filtered_uds, body'') = dumpUDs bndrs' uds body' + in + returnSM (mkLams bndrs' body'', filtered_uds) + where + (bndrs, body) = collectBinders e + (subst', bndrs') = substBndrs subst bndrs + -- More efficient to collect a group of binders together all at once + -- and we don't want to split a lambda group with dumped bindings + +specExpr subst (Case scrut case_bndr ty alts) + = specExpr subst scrut `thenSM` \ (scrut', uds_scrut) -> + mapAndCombineSM spec_alt alts `thenSM` \ (alts', uds_alts) -> + returnSM (Case scrut' case_bndr' (substTy subst ty) alts', uds_scrut `plusUDs` uds_alts) + where + (subst_alt, case_bndr') = substBndr subst case_bndr + -- No need to clone case binder; it can't float like a let(rec) + + spec_alt (con, args, rhs) + = specExpr subst_rhs rhs `thenSM` \ (rhs', uds) -> + let + (uds', rhs'') = dumpUDs args uds rhs' + in + returnSM ((con, args', rhs''), uds') + where + (subst_rhs, args') = substBndrs subst_alt args + +---------------- Finally, let is the interesting case -------------------- +specExpr subst (Let bind body) + = -- Clone binders + cloneBindSM subst bind `thenSM` \ (rhs_subst, body_subst, bind') -> + + -- Deal with the body + specExpr body_subst body `thenSM` \ (body', body_uds) -> + + -- Deal with the bindings + specBind rhs_subst bind' body_uds `thenSM` \ (binds', uds) -> + + -- All done + returnSM (foldr Let body' binds', uds) + +-- Must apply the type substitution to coerceions +specNote subst (Coerce t1 t2) = Coerce (substTy subst t1) (substTy subst t2) +specNote subst note = note +\end{code} + +%************************************************************************ +%* * +\subsubsection{Dealing with a binding} +%* * +%************************************************************************ + +\begin{code} +specBind :: Subst -- Use this for RHSs + -> CoreBind + -> UsageDetails -- Info on how the scope of the binding + -> SpecM ([CoreBind], -- New bindings + UsageDetails) -- And info to pass upstream + +specBind rhs_subst bind body_uds + = specBindItself rhs_subst bind (calls body_uds) `thenSM` \ (bind', bind_uds) -> + let + bndrs = bindersOf bind + all_uds = zapCalls bndrs (body_uds `plusUDs` bind_uds) + -- It's important that the `plusUDs` is this way round, + -- because body_uds may bind dictionaries that are + -- used in the calls passed to specDefn. So the + -- dictionary bindings in bind_uds may mention + -- dictionaries bound in body_uds. + in + case splitUDs bndrs all_uds of + + (_, ([],[])) -- This binding doesn't bind anything needed + -- in the UDs, so put the binding here + -- This is the case for most non-dict bindings, except + -- for the few that are mentioned in a dict binding + -- that is floating upwards in body_uds + -> returnSM ([bind'], all_uds) + + (float_uds, (dict_binds, calls)) -- This binding is needed in the UDs, so float it out + -> returnSM ([], float_uds `plusUDs` mkBigUD bind' dict_binds calls) + + +-- A truly gruesome function +mkBigUD bind@(NonRec _ _) dbs calls + = -- Common case: non-recursive and no specialisations + -- (if there were any specialistions it would have been made recursive) + MkUD { dict_binds = listToBag (mkDB bind : dbs), + calls = listToCallDetails calls } + +mkBigUD bind dbs calls + = -- General case + MkUD { dict_binds = unitBag (mkDB (Rec (bind_prs bind ++ dbsToPairs dbs))), + -- Make a huge Rec + calls = listToCallDetails calls } + where + bind_prs (NonRec b r) = [(b,r)] + bind_prs (Rec prs) = prs + + dbsToPairs [] = [] + dbsToPairs ((bind,_):dbs) = bind_prs bind ++ dbsToPairs dbs + +-- specBindItself deals with the RHS, specialising it according +-- to the calls found in the body (if any) +specBindItself rhs_subst (NonRec bndr rhs) call_info + = specDefn rhs_subst call_info (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) -> + let + new_bind | null spec_defns = NonRec bndr' rhs' + | otherwise = Rec ((bndr',rhs'):spec_defns) + -- bndr' mentions the spec_defns in its SpecEnv + -- Not sure why we couln't just put the spec_defns first + in + returnSM (new_bind, spec_uds) + +specBindItself rhs_subst (Rec pairs) call_info + = mapSM (specDefn rhs_subst call_info) pairs `thenSM` \ stuff -> + let + (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff + spec_defns = concat spec_defns_s + spec_uds = plusUDList spec_uds_s + new_bind = Rec (spec_defns ++ pairs') + in + returnSM (new_bind, spec_uds) + + +specDefn :: Subst -- Subst to use for RHS + -> CallDetails -- Info on how it is used in its scope + -> (Id, CoreExpr) -- The thing being bound and its un-processed RHS + -> SpecM ((Id, CoreExpr), -- The thing and its processed RHS + -- the Id may now have specialisations attached + [(Id,CoreExpr)], -- Extra, specialised bindings + UsageDetails -- Stuff to fling upwards from the RHS and its + ) -- specialised versions + +specDefn subst calls (fn, rhs) + -- The first case is the interesting one + | rhs_tyvars `lengthIs` n_tyvars -- Rhs of fn's defn has right number of big lambdas + && rhs_bndrs `lengthAtLeast` n_dicts -- and enough dict args + && notNull calls_for_me -- And there are some calls to specialise + +-- At one time I tried not specialising small functions +-- but sometimes there are big functions marked INLINE +-- that we'd like to specialise. In particular, dictionary +-- functions, which Marcin is keen to inline +-- && not (certainlyWillInline fn) -- And it's not small + -- If it's small, it's better just to inline + -- it than to construct lots of specialisations + = -- Specialise the body of the function + specExpr subst rhs `thenSM` \ (rhs', rhs_uds) -> + + -- Make a specialised version for each call in calls_for_me + mapSM spec_call calls_for_me `thenSM` \ stuff -> + let + (spec_defns, spec_uds, spec_rules) = unzip3 stuff + + fn' = addIdSpecialisations fn spec_rules + in + returnSM ((fn',rhs'), + spec_defns, + rhs_uds `plusUDs` plusUDList spec_uds) + + | otherwise -- No calls or RHS doesn't fit our preconceptions + = specExpr subst rhs `thenSM` \ (rhs', rhs_uds) -> + returnSM ((fn, rhs'), [], rhs_uds) + + where + fn_type = idType fn + (tyvars, theta, _) = tcSplitSigmaTy fn_type + n_tyvars = length tyvars + n_dicts = length theta + + (rhs_tyvars, rhs_ids, rhs_body) + = collectTyAndValBinders (dropInline rhs) + -- It's important that we "see past" any INLINE pragma + -- else we'll fail to specialise an INLINE thing + + rhs_dicts = take n_dicts rhs_ids + rhs_bndrs = rhs_tyvars ++ rhs_dicts + body = mkLams (drop n_dicts rhs_ids) rhs_body + -- Glue back on the non-dict lambdas + + calls_for_me = case lookupFM calls fn of + Nothing -> [] + Just cs -> fmToList cs + + ---------------------------------------------------------- + -- Specialise to one particular call pattern + spec_call :: (CallKey, ([DictExpr], VarSet)) -- Call instance + -> SpecM ((Id,CoreExpr), -- Specialised definition + UsageDetails, -- Usage details from specialised body + CoreRule) -- Info for the Id's SpecEnv + spec_call (CallKey call_ts, (call_ds, call_fvs)) + = ASSERT( call_ts `lengthIs` n_tyvars && call_ds `lengthIs` n_dicts ) + -- Calls are only recorded for properly-saturated applications + + -- Suppose f's defn is f = /\ a b c d -> \ d1 d2 -> rhs + -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [dx1, dx2] + + -- Construct the new binding + -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b d -> rhs) + -- PLUS the usage-details + -- { d1' = dx1; d2' = dx2 } + -- where d1', d2' are cloned versions of d1,d2, with the type substitution applied. + -- + -- Note that the substitution is applied to the whole thing. + -- This is convenient, but just slightly fragile. Notably: + -- * There had better be no name clashes in a/b/c/d + -- + let + -- poly_tyvars = [b,d] in the example above + -- spec_tyvars = [a,c] + -- ty_args = [t1,b,t3,d] + poly_tyvars = [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts] + spec_tyvars = [tv | (tv, Just _) <- rhs_tyvars `zip` call_ts] + ty_args = zipWithEqual "spec_call" mk_ty_arg rhs_tyvars call_ts + where + mk_ty_arg rhs_tyvar Nothing = Type (mkTyVarTy rhs_tyvar) + mk_ty_arg rhs_tyvar (Just ty) = Type ty + rhs_subst = extendTvSubstList subst (spec_tyvars `zip` [ty | Just ty <- call_ts]) + in + cloneBinders rhs_subst rhs_dicts `thenSM` \ (rhs_subst', rhs_dicts') -> + let + inst_args = ty_args ++ map Var rhs_dicts' + + -- Figure out the type of the specialised function + body_ty = applyTypeToArgs rhs fn_type inst_args + (lam_args, app_args) -- Add a dummy argument if body_ty is unlifted + | isUnLiftedType body_ty -- C.f. WwLib.mkWorkerArgs + = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [realWorldPrimId]) + | otherwise = (poly_tyvars, poly_tyvars) + spec_id_ty = mkPiTypes lam_args body_ty + in + newIdSM fn spec_id_ty `thenSM` \ spec_f -> + specExpr rhs_subst' (mkLams lam_args body) `thenSM` \ (spec_rhs, rhs_uds) -> + let + -- The rule to put in the function's specialisation is: + -- forall b,d, d1',d2'. f t1 b t3 d d1' d2' = f1 b d + spec_env_rule = mkLocalRule (mkFastString ("SPEC " ++ showSDoc (ppr fn))) + AlwaysActive (idName fn) + (poly_tyvars ++ rhs_dicts') + inst_args + (mkVarApps (Var spec_f) app_args) + + -- Add the { d1' = dx1; d2' = dx2 } usage stuff + final_uds = foldr addDictBind rhs_uds (my_zipEqual "spec_call" rhs_dicts' call_ds) + + -- NOTE: we don't add back in any INLINE pragma on the RHS, so even if + -- the original function said INLINE, the specialised copies won't. + -- The idea is that the point of inlining was precisely to specialise + -- the function at its call site, and that's not so important for the + -- specialised copies. But it still smells like an ad hoc decision. + + in + returnSM ((spec_f, spec_rhs), + final_uds, + spec_env_rule) + + where + my_zipEqual doc xs ys + | not (equalLength xs ys) = pprPanic "my_zipEqual" (ppr xs $$ ppr ys $$ (ppr fn <+> ppr call_ts) $$ ppr rhs) + | otherwise = zipEqual doc xs ys + +dropInline :: CoreExpr -> CoreExpr +dropInline (Note InlineMe rhs) = rhs +dropInline rhs = rhs +\end{code} + +%************************************************************************ +%* * +\subsubsection{UsageDetails and suchlike} +%* * +%************************************************************************ + +\begin{code} +data UsageDetails + = MkUD { + dict_binds :: !(Bag DictBind), + -- Floated dictionary bindings + -- The order is important; + -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1 + -- (Remember, Bags preserve order in GHC.) + + calls :: !CallDetails + } + +type DictBind = (CoreBind, VarSet) + -- The set is the free vars of the binding + -- both tyvars and dicts + +type DictExpr = CoreExpr + +emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM } + +type ProtoUsageDetails = ([DictBind], + [(Id, CallKey, ([DictExpr], VarSet))] + ) + +------------------------------------------------------------ +type CallDetails = FiniteMap Id CallInfo +newtype CallKey = CallKey [Maybe Type] -- Nothing => unconstrained type argument +type CallInfo = FiniteMap CallKey + ([DictExpr], VarSet) -- Dict args and the vars of the whole + -- call (including tyvars) + -- [*not* include the main id itself, of course] + -- The finite maps eliminate duplicates + -- The list of types and dictionaries is guaranteed to + -- match the type of f + +-- Type isn't an instance of Ord, so that we can control which +-- instance we use. That's tiresome here. Oh well +instance Eq CallKey where + k1 == k2 = case k1 `compare` k2 of { EQ -> True; other -> False } + +instance Ord CallKey where + compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2 + where + cmp Nothing Nothing = EQ + cmp Nothing (Just t2) = LT + cmp (Just t1) Nothing = GT + cmp (Just t1) (Just t2) = tcCmpType t1 t2 + +unionCalls :: CallDetails -> CallDetails -> CallDetails +unionCalls c1 c2 = plusFM_C plusFM c1 c2 + +singleCall :: Id -> [Maybe Type] -> [DictExpr] -> CallDetails +singleCall id tys dicts + = unitFM id (unitFM (CallKey tys) (dicts, call_fvs)) + where + call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs + tys_fvs = tyVarsOfTypes (catMaybes tys) + -- The type args (tys) are guaranteed to be part of the dictionary + -- types, because they are just the constrained types, + -- and the dictionary is therefore sure to be bound + -- inside the binding for any type variables free in the type; + -- hence it's safe to neglect tyvars free in tys when making + -- the free-var set for this call + -- BUT I don't trust this reasoning; play safe and include tys_fvs + -- + -- We don't include the 'id' itself. + +listToCallDetails calls + = foldr (unionCalls . mk_call) emptyFM calls + where + mk_call (id, tys, dicts_w_fvs) = unitFM id (unitFM tys dicts_w_fvs) + -- NB: the free vars of the call are provided + +callDetailsToList calls = [ (id,tys,dicts) + | (id,fm) <- fmToList calls, + (tys, dicts) <- fmToList fm + ] + +mkCallUDs subst f args + | null theta + || not (all isClassPred theta) + -- Only specialise if all overloading is on class params. + -- In ptic, with implicit params, the type args + -- *don't* say what the value of the implicit param is! + || not (spec_tys `lengthIs` n_tyvars) + || not ( dicts `lengthIs` n_dicts) + || maybeToBool (lookupRule (\act -> True) (substInScope subst) emptyRuleBase f args) + -- There's already a rule covering this call. A typical case + -- is where there's an explicit user-provided rule. Then + -- we don't want to create a specialised version + -- of the function that overlaps. + = emptyUDs -- Not overloaded, or no specialisation wanted + + | otherwise + = MkUD {dict_binds = emptyBag, + calls = singleCall f spec_tys dicts + } + where + (tyvars, theta, _) = tcSplitSigmaTy (idType f) + constrained_tyvars = tyVarsOfTheta theta + n_tyvars = length tyvars + n_dicts = length theta + + spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args] + dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)] + + mk_spec_ty tyvar ty + | tyvar `elemVarSet` constrained_tyvars = Just ty + | otherwise = Nothing + +------------------------------------------------------------ +plusUDs :: UsageDetails -> UsageDetails -> UsageDetails +plusUDs (MkUD {dict_binds = db1, calls = calls1}) + (MkUD {dict_binds = db2, calls = calls2}) + = MkUD {dict_binds = d, calls = c} + where + d = db1 `unionBags` db2 + c = calls1 `unionCalls` calls2 + +plusUDList = foldr plusUDs emptyUDs + +-- zapCalls deletes calls to ids from uds +zapCalls ids uds = uds {calls = delListFromFM (calls uds) ids} + +mkDB bind = (bind, bind_fvs bind) + +bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs) +bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs + where + bndrs = map fst prs + rhs_fvs = unionVarSets (map pair_fvs prs) + +pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idRuleVars bndr + -- Don't forget variables mentioned in the + -- rules of the bndr. C.f. OccAnal.addRuleUsage + + +addDictBind (dict,rhs) uds = uds { dict_binds = mkDB (NonRec dict rhs) `consBag` dict_binds uds } + +dumpAllDictBinds (MkUD {dict_binds = dbs}) binds + = foldrBag add binds dbs + where + add (bind,_) binds = bind : binds + +dumpUDs :: [CoreBndr] + -> UsageDetails -> CoreExpr + -> (UsageDetails, CoreExpr) +dumpUDs bndrs uds body + = (free_uds, foldr add_let body dict_binds) + where + (free_uds, (dict_binds, _)) = splitUDs bndrs uds + add_let (bind,_) body = Let bind body + +splitUDs :: [CoreBndr] + -> UsageDetails + -> (UsageDetails, -- These don't mention the binders + ProtoUsageDetails) -- These do + +splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs, + calls = orig_calls}) + + = if isEmptyBag dump_dbs && null dump_calls then + -- Common case: binder doesn't affect floats + (uds, ([],[])) + + else + -- Binders bind some of the fvs of the floats + (MkUD {dict_binds = free_dbs, + calls = listToCallDetails free_calls}, + (bagToList dump_dbs, dump_calls) + ) + + where + bndr_set = mkVarSet bndrs + + (free_dbs, dump_dbs, dump_idset) + = foldlBag dump_db (emptyBag, emptyBag, bndr_set) orig_dbs + -- Important that it's foldl not foldr; + -- we're accumulating the set of dumped ids in dump_set + + -- Filter out any calls that mention things that are being dumped + orig_call_list = callDetailsToList orig_calls + (dump_calls, free_calls) = partition captured orig_call_list + captured (id,tys,(dicts, fvs)) = fvs `intersectsVarSet` dump_idset + || id `elemVarSet` dump_idset + + dump_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs) + | dump_idset `intersectsVarSet` fvs -- Dump it + = (free_dbs, dump_dbs `snocBag` db, + extendVarSetList dump_idset (bindersOf bind)) + + | otherwise -- Don't dump it + = (free_dbs `snocBag` db, dump_dbs, dump_idset) +\end{code} + + +%************************************************************************ +%* * +\subsubsection{Boring helper functions} +%* * +%************************************************************************ + +\begin{code} +type SpecM a = UniqSM a + +thenSM = thenUs +returnSM = returnUs +getUniqSM = getUniqueUs +mapSM = mapUs +initSM = initUs_ + +mapAndCombineSM f [] = returnSM ([], emptyUDs) +mapAndCombineSM f (x:xs) = f x `thenSM` \ (y, uds1) -> + mapAndCombineSM f xs `thenSM` \ (ys, uds2) -> + returnSM (y:ys, uds1 `plusUDs` uds2) + +cloneBindSM :: Subst -> CoreBind -> SpecM (Subst, Subst, CoreBind) +-- Clone the binders of the bind; return new bind with the cloned binders +-- Return the substitution to use for RHSs, and the one to use for the body +cloneBindSM subst (NonRec bndr rhs) + = getUs `thenUs` \ us -> + let + (subst', bndr') = cloneIdBndr subst us bndr + in + returnUs (subst, subst', NonRec bndr' rhs) + +cloneBindSM subst (Rec pairs) + = getUs `thenUs` \ us -> + let + (subst', bndrs') = cloneRecIdBndrs subst us (map fst pairs) + in + returnUs (subst', subst', Rec (bndrs' `zip` map snd pairs)) + +cloneBinders subst bndrs + = getUs `thenUs` \ us -> + returnUs (cloneIdBndrs subst us bndrs) + +newIdSM old_id new_ty + = getUniqSM `thenSM` \ uniq -> + let + -- Give the new Id a similar occurrence name to the old one + name = idName old_id + new_id = mkUserLocal (mkSpecOcc (nameOccName name)) uniq new_ty (getSrcLoc name) + in + returnSM new_id +\end{code} + + + Old (but interesting) stuff about unboxed bindings + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +What should we do when a value is specialised to a *strict* unboxed value? + + map_*_* f (x:xs) = let h = f x + t = map f xs + in h:t + +Could convert let to case: + + map_*_Int# f (x:xs) = case f x of h# -> + let t = map f xs + in h#:t + +This may be undesirable since it forces evaluation here, but the value +may not be used in all branches of the body. In the general case this +transformation is impossible since the mutual recursion in a letrec +cannot be expressed as a case. + +There is also a problem with top-level unboxed values, since our +implementation cannot handle unboxed values at the top level. + +Solution: Lift the binding of the unboxed value and extract it when it +is used: + + map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h# + t = map f xs + in case h of + _Lift h# -> h#:t + +Now give it to the simplifier and the _Lifting will be optimised away. + +The benfit is that we have given the specialised "unboxed" values a +very simplep lifted semantics and then leave it up to the simplifier to +optimise it --- knowing that the overheads will be removed in nearly +all cases. + +In particular, the value will only be evaluted in the branches of the +program which use it, rather than being forced at the point where the +value is bound. For example: + + filtermap_*_* p f (x:xs) + = let h = f x + t = ... + in case p x of + True -> h:t + False -> t + ==> + filtermap_*_Int# p f (x:xs) + = let h = case (f x) of h# -> _Lift h# + t = ... + in case p x of + True -> case h of _Lift h# + -> h#:t + False -> t + +The binding for h can still be inlined in the one branch and the +_Lifting eliminated. + + +Question: When won't the _Lifting be eliminated? + +Answer: When they at the top-level (where it is necessary) or when +inlining would duplicate work (or possibly code depending on +options). However, the _Lifting will still be eliminated if the +strictness analyser deems the lifted binding strict. + |