diff options
author | Simon Peyton Jones <simonpj@microsoft.com> | 2013-01-17 10:54:07 +0000 |
---|---|---|
committer | Simon Peyton Jones <simonpj@microsoft.com> | 2013-01-17 10:54:07 +0000 |
commit | 0831a12ea2fc73c33652eeec1adc79fa19700578 (patch) | |
tree | 6382f3cd4cb7070d101e22d7de2876aa8cbbbc39 /compiler/iface | |
parent | aef38d130b0ff74b0a5f2478be985e96b40c0f97 (diff) | |
download | haskell-0831a12ea2fc73c33652eeec1adc79fa19700578.tar.gz |
Major patch to implement the new Demand Analyser
This patch is the result of Ilya Sergey's internship at MSR. It
constitutes a thorough overhaul and simplification of the demand
analyser. It makes a solid foundation on which we can now build.
Main changes are
* Instead of having one combined type for Demand, a Demand is
now a pair (JointDmd) of
- a StrDmd and
- an AbsDmd.
This allows strictness and absence to be though about quite
orthogonally, and greatly reduces brain melt-down.
* Similarly in the DmdResult type, it's a pair of
- a PureResult (indicating only divergence/non-divergence)
- a CPRResult (which deals only with the CPR property
* In IdInfo, the
strictnessInfo field contains a StrictSig, not a Maybe StrictSig
demandInfo field contains a Demand, not a Maybe Demand
We don't need Nothing (to indicate no strictness/demand info)
any more; topSig/topDmd will do.
* Remove "boxity" analysis entirely. This was an attempt to
avoid "reboxing", but it added complexity, is extremely
ad-hoc, and makes very little difference in practice.
* Remove the "unboxing strategy" computation. This was an an
attempt to ensure that a worker didn't get zillions of
arguments by unboxing big tuples. But in fact removing it
DRAMATICALLY reduces allocation in an inner loop of the
I/O library (where the threshold argument-count had been
set just too low). It's exceptional to have a zillion arguments
and I don't think it's worth the complexity, especially since
it turned out to have a serious performance hit.
* Remove quite a bit of ad-hoc cruft
* Move worthSplittingFun, worthSplittingThunk from WorkWrap to
Demand. This allows JointDmd to be fully abstract, examined
only inside Demand.
Everything else really follows from these changes.
All of this is really just refactoring, so we don't expect
big performance changes, but acutally the numbers look quite
good. Here is a full nofib run with some highlights identified:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
expert -2.6% -15.5% 0.00 0.00 +0.0%
fluid -2.4% -7.1% 0.01 0.01 +0.0%
gg -2.5% -28.9% 0.02 0.02 -33.3%
integrate -2.6% +3.2% +2.6% +2.6% +0.0%
mandel2 -2.6% +4.2% 0.01 0.01 +0.0%
nucleic2 -2.0% -16.3% 0.11 0.11 +0.0%
para -2.6% -20.0% -11.8% -11.7% +0.0%
parser -2.5% -17.9% 0.05 0.05 +0.0%
prolog -2.6% -13.0% 0.00 0.00 +0.0%
puzzle -2.6% +2.2% +0.8% +0.8% +0.0%
sorting -2.6% -35.9% 0.00 0.00 +0.0%
treejoin -2.6% -52.2% -9.8% -9.9% +0.0%
--------------------------------------------------------------------------------
Min -2.7% -52.2% -11.8% -11.7% -33.3%
Max -1.8% +4.2% +10.5% +10.5% +7.7%
Geometric Mean -2.5% -2.8% -0.4% -0.5% -0.4%
Things to note
* Binary sizes are smaller. I don't know why, but it's good.
* Allocation is sometiemes a *lot* smaller. I believe that all the big numbers
(I checked treejoin, gg, sorting) arise from one place, namely a function
GHC.IO.Encoding.UTF8.utf8_decode, which is strict in two Buffers both of
which have several arugments. Not w/w'ing both arguments (which is what
we did before) has a big effect. So the big win in actually somewhat
accidental, gained by removing the "unboxing strategy" code.
* A couple of benchmarks allocate slightly more. This turns out
to be due to reboxing (integrate). But the biggest increase is
mandel2, and *that* turned out also to be a somewhat accidental
loss of CSE, and pointed the way to doing better CSE: see Trac
#7596.
* Runtimes are never very reliable, but seem to improve very slightly.
All in all, a good piece of work. Thank you Ilya!
Diffstat (limited to 'compiler/iface')
-rw-r--r-- | compiler/iface/BinIface.hs | 100 | ||||
-rw-r--r-- | compiler/iface/IfaceSyn.lhs | 22 | ||||
-rw-r--r-- | compiler/iface/MkIface.lhs | 8 | ||||
-rw-r--r-- | compiler/iface/TcIface.lhs | 22 |
4 files changed, 29 insertions, 123 deletions
diff --git a/compiler/iface/BinIface.hs b/compiler/iface/BinIface.hs index cf9402a74e..39801bf64c 100644 --- a/compiler/iface/BinIface.hs +++ b/compiler/iface/BinIface.hs @@ -30,13 +30,11 @@ import TysWiredIn import IfaceEnv import HscTypes import BasicTypes -import Demand import Annotations import IfaceSyn import Module import Name import Avail -import VarEnv import DynFlags import UniqFM import UniqSupply @@ -389,7 +387,6 @@ data BinSymbolTable = BinSymbolTable { -- indexed by Name } - putFastString :: BinDictionary -> BinHandle -> FastString -> IO () putFastString dict bh fs = allocateFastString dict fs >>= put_ bh @@ -427,12 +424,6 @@ data BinDictionary = BinDictionary { {-! for StrictnessMark derive: Binary !-} {-! for Activation derive: Binary !-} --- Demand -{-! for Demand derive: Binary !-} -{-! for Demands derive: Binary !-} -{-! for DmdResult derive: Binary !-} -{-! for StrictSig derive: Binary !-} - -- Class {-! for DefMeth derive: Binary !-} @@ -818,87 +809,6 @@ instance Binary Fixity where ab <- get bh return (Fixity aa ab) -------------------------------------------------------------------------- --- Types from: Demand -------------------------------------------------------------------------- - -instance Binary DmdType where - -- Ignore DmdEnv when spitting out the DmdType - put bh (DmdType _ ds dr) = do p <- put bh ds; put_ bh dr; return (castBin p) - get bh = do ds <- get bh; dr <- get bh; return (DmdType emptyVarEnv ds dr) - -instance Binary Demand where - put_ bh Top = do - putByte bh 0 - put_ bh Abs = do - putByte bh 1 - put_ bh (Call aa) = do - putByte bh 2 - put_ bh aa - put_ bh (Eval ab) = do - putByte bh 3 - put_ bh ab - put_ bh (Defer ac) = do - putByte bh 4 - put_ bh ac - put_ bh (Box ad) = do - putByte bh 5 - put_ bh ad - put_ bh Bot = do - putByte bh 6 - get bh = do - h <- getByte bh - case h of - 0 -> do return Top - 1 -> do return Abs - 2 -> do aa <- get bh - return (Call aa) - 3 -> do ab <- get bh - return (Eval ab) - 4 -> do ac <- get bh - return (Defer ac) - 5 -> do ad <- get bh - return (Box ad) - _ -> do return Bot - -instance Binary Demands where - put_ bh (Poly aa) = do - putByte bh 0 - put_ bh aa - put_ bh (Prod ab) = do - putByte bh 1 - put_ bh ab - get bh = do - h <- getByte bh - case h of - 0 -> do aa <- get bh - return (Poly aa) - _ -> do ab <- get bh - return (Prod ab) - -instance Binary DmdResult where - put_ bh TopRes = do - putByte bh 0 - put_ bh RetCPR = do - putByte bh 1 - put_ bh BotRes = do - putByte bh 2 - get bh = do - h <- getByte bh - case h of - 0 -> do return TopRes - 1 -> do return RetCPR -- Really use RetCPR even if -fcpr-off - -- The wrapper was generated for CPR in - -- the imported module! - _ -> do return BotRes - -instance Binary StrictSig where - put_ bh (StrictSig aa) = do - put_ bh aa - get bh = do - aa <- get bh - return (StrictSig aa) - ------------------------------------------------------------------------- -- Types from: CostCentre @@ -1219,11 +1129,11 @@ instance Binary IfaceIdInfo where _ -> lazyGet bh >>= (return . HasInfo) -- NB lazyGet instance Binary IfaceInfoItem where - put_ bh (HsArity aa) = putByte bh 0 >> put_ bh aa - put_ bh (HsStrictness ab) = putByte bh 1 >> put_ bh ab - put_ bh (HsUnfold lb ad) = putByte bh 2 >> put_ bh lb >> put_ bh ad - put_ bh (HsInline ad) = putByte bh 3 >> put_ bh ad - put_ bh HsNoCafRefs = putByte bh 4 + put_ bh (HsArity aa) = putByte bh 0 >> put_ bh aa + put_ bh (HsStrictness ab) = putByte bh 1 >> put_ bh ab + put_ bh (HsUnfold lb ad) = putByte bh 2 >> put_ bh lb >> put_ bh ad + put_ bh (HsInline ad) = putByte bh 3 >> put_ bh ad + put_ bh HsNoCafRefs = putByte bh 4 get bh = do h <- getByte bh case h of diff --git a/compiler/iface/IfaceSyn.lhs b/compiler/iface/IfaceSyn.lhs index e72f95c9a8..9ef5ef66f4 100644 --- a/compiler/iface/IfaceSyn.lhs +++ b/compiler/iface/IfaceSyn.lhs @@ -230,11 +230,11 @@ data IfaceIdInfo -- and so gives a new version. data IfaceInfoItem - = HsArity Arity - | HsStrictness StrictSig - | HsInline InlinePragma - | HsUnfold Bool -- True <=> isStrongLoopBreaker is true - IfaceUnfolding -- See Note [Expose recursive functions] + = HsArity Arity + | HsStrictness StrictSig + | HsInline InlinePragma + | HsUnfold Bool -- True <=> isStrongLoopBreaker is true + IfaceUnfolding -- See Note [Expose recursive functions] | HsNoCafRefs -- NB: Specialisations and rules come in separately and are @@ -750,13 +750,13 @@ instance Outputable IfaceIdInfo where <+> ptext (sLit "-}") instance Outputable IfaceInfoItem where - ppr (HsUnfold lb unf) = ptext (sLit "Unfolding") - <> ppWhen lb (ptext (sLit "(loop-breaker)")) - <> colon <+> ppr unf - ppr (HsInline prag) = ptext (sLit "Inline:") <+> ppr prag - ppr (HsArity arity) = ptext (sLit "Arity:") <+> int arity + ppr (HsUnfold lb unf) = ptext (sLit "Unfolding") + <> ppWhen lb (ptext (sLit "(loop-breaker)")) + <> colon <+> ppr unf + ppr (HsInline prag) = ptext (sLit "Inline:") <+> ppr prag + ppr (HsArity arity) = ptext (sLit "Arity:") <+> int arity ppr (HsStrictness str) = ptext (sLit "Strictness:") <+> pprIfaceStrictSig str - ppr HsNoCafRefs = ptext (sLit "HasNoCafRefs") + ppr HsNoCafRefs = ptext (sLit "HasNoCafRefs") instance Outputable IfaceUnfolding where ppr (IfCompulsory e) = ptext (sLit "<compulsory>") <+> parens (ppr e) diff --git a/compiler/iface/MkIface.lhs b/compiler/iface/MkIface.lhs index d5b302406e..2200577c59 100644 --- a/compiler/iface/MkIface.lhs +++ b/compiler/iface/MkIface.lhs @@ -1695,7 +1695,7 @@ toIfaceIdDetails other = pprTrace "toIfaceIdDetails" (p toIfaceIdInfo :: IdInfo -> IfaceIdInfo toIfaceIdInfo id_info - = case catMaybes [arity_hsinfo, caf_hsinfo, strict_hsinfo, + = case catMaybes [arity_hsinfo, caf_hsinfo, strict_hsinfo, inline_hsinfo, unfold_hsinfo] of [] -> NoInfo infos -> HasInfo infos @@ -1715,9 +1715,9 @@ toIfaceIdInfo id_info ------------ Strictness -------------- -- No point in explicitly exporting TopSig - strict_hsinfo = case strictnessInfo id_info of - Just sig | not (isTopSig sig) -> Just (HsStrictness sig) - _other -> Nothing + sig_info = strictnessInfo id_info + strict_hsinfo | not (isTopSig sig_info) = Just (HsStrictness sig_info) + | otherwise = Nothing ------------ Unfolding -------------- unfold_hsinfo = toIfUnfolding loop_breaker (unfoldingInfo id_info) diff --git a/compiler/iface/TcIface.lhs b/compiler/iface/TcIface.lhs index d6acc06688..930bb1e2a2 100644 --- a/compiler/iface/TcIface.lhs +++ b/compiler/iface/TcIface.lhs @@ -31,8 +31,8 @@ import CoreSyn import CoreUtils import CoreUnfold import CoreLint -import WorkWrap -import MkCore( castBottomExpr ) +import WorkWrap ( mkWrapper ) +import MkCore ( castBottomExpr ) import Id import MkId import IdInfo @@ -1205,7 +1205,7 @@ tcIdInfo ignore_prags name ty info tcPrag :: IdInfo -> IfaceInfoItem -> IfL IdInfo tcPrag info HsNoCafRefs = return (info `setCafInfo` NoCafRefs) tcPrag info (HsArity arity) = return (info `setArityInfo` arity) - tcPrag info (HsStrictness str) = return (info `setStrictnessInfo` Just str) + tcPrag info (HsStrictness str) = return (info `setStrictnessInfo` str) tcPrag info (HsInline prag) = return (info `setInlinePragInfo` prag) -- The next two are lazy, so they don't transitively suck stuff in @@ -1226,12 +1226,11 @@ tcUnfolding name _ info (IfCoreUnfold stable if_expr) Nothing -> NoUnfolding Just expr -> mkUnfolding dflags unf_src True {- Top level -} - is_bottoming expr) } + is_bottoming + expr) } where -- Strictness should occur before unfolding! - is_bottoming = case strictnessInfo info of - Just sig -> isBottomingSig sig - Nothing -> False + is_bottoming = isBottomingSig $ strictnessInfo info tcUnfolding name _ _ (IfCompulsory if_expr) = do { mb_expr <- tcPragExpr name if_expr @@ -1278,12 +1277,9 @@ tcIfaceWrapper name ty info arity get_worker = mkWwInlineRule wkr_id (initUs_ us (mkWrapper dflags ty strict_sig) wkr_id) arity - - -- Again we rely here on strictness info always appearing - -- before unfolding - strict_sig = case strictnessInfo info of - Just sig -> sig - Nothing -> pprPanic "Worker info but no strictness for" (ppr name) + -- Again we rely here on strictness info + -- always appearing before unfolding + strict_sig = strictnessInfo info \end{code} For unfoldings we try to do the job lazily, so that we never type check |