summaryrefslogtreecommitdiff
path: root/compiler/stgSyn
diff options
context:
space:
mode:
authorSimon Peyton Jones <simonpj@microsoft.com>2013-05-03 14:50:58 +0100
committerSimon Peyton Jones <simonpj@microsoft.com>2013-06-06 14:29:53 +0100
commit99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176 (patch)
tree62098e1b36c61fe1a978a29d955f57b629c5ec79 /compiler/stgSyn
parentda4ff650ae77930a5a10d4886c8bc7d37f081db7 (diff)
downloadhaskell-99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176.tar.gz
Implement cardinality analysis
This major patch implements the cardinality analysis described in our paper "Higher order cardinality analysis". It is joint work with Ilya Sergey and Dimitrios Vytiniotis. The basic is augment the absence-analysis part of the demand analyser so that it can tell when something is used never at most once some other way The "at most once" information is used a) to enable transformations, and in particular to identify one-shot lambdas b) to allow updates on thunks to be omitted. There are two new flags, mainly there so you can do performance comparisons: -fkill-absence stops GHC doing absence analysis at all -fkill-one-shot stops GHC spotting one-shot lambdas and single-entry thunks The big changes are: * The Demand type is substantially refactored. In particular the UseDmd is factored as follows data UseDmd = UCall Count UseDmd | UProd [MaybeUsed] | UHead | Used data MaybeUsed = Abs | Use Count UseDmd data Count = One | Many Notice that UCall recurses straight to UseDmd, whereas UProd goes via MaybeUsed. The "Count" embodies the "at most once" or "many" idea. * The demand analyser itself was refactored a lot * The previously ad-hoc stuff in the occurrence analyser for foldr and build goes away entirely. Before if we had build (\cn -> ...x... ) then the "\cn" was hackily made one-shot (by spotting 'build' as special. That's essential to allow x to be inlined. Now the occurrence analyser propagates info gotten from 'build's stricness signature (so build isn't special); and that strictness sig is in turn derived entirely automatically. Much nicer! * The ticky stuff is improved to count single-entry thunks separately. One shortcoming is that there is no DEBUG way to spot if an allegedly-single-entry thunk is acually entered more than once. It would not be hard to generate a bit of code to check for this, and it would be reassuring. But it's fiddly and I have not done it. Despite all this fuss, the performance numbers are rather under-whelming. See the paper for more discussion. nucleic2 -0.8% -10.9% 0.10 0.10 +0.0% sphere -0.7% -1.5% 0.08 0.08 +0.0% -------------------------------------------------------------------------------- Min -4.7% -10.9% -9.3% -9.3% -50.0% Max -0.4% +0.5% +2.2% +2.3% +7.4% Geometric Mean -0.8% -0.2% -1.3% -1.3% -1.8% I don't quite know how much credence to place in the runtime changes, but movement seems generally in the right direction.
Diffstat (limited to 'compiler/stgSyn')
-rw-r--r--compiler/stgSyn/CoreToStg.lhs30
1 files changed, 18 insertions, 12 deletions
diff --git a/compiler/stgSyn/CoreToStg.lhs b/compiler/stgSyn/CoreToStg.lhs
index 1e737f91e9..1395c22580 100644
--- a/compiler/stgSyn/CoreToStg.lhs
+++ b/compiler/stgSyn/CoreToStg.lhs
@@ -38,6 +38,7 @@ import FastString
import Util
import DynFlags
import ForeignCall
+import Demand ( isSingleUsed )
import PrimOp ( PrimCall(..) )
\end{code}
@@ -245,7 +246,7 @@ coreToTopStgRhs dflags this_mod scope_fv_info (bndr, rhs)
= do { (new_rhs, rhs_fvs, _) <- coreToStgExpr rhs
; lv_info <- freeVarsToLiveVars rhs_fvs
- ; let stg_rhs = mkTopStgRhs dflags this_mod rhs_fvs (mkSRT lv_info) bndr_info new_rhs
+ ; let stg_rhs = mkTopStgRhs dflags this_mod rhs_fvs (mkSRT lv_info) bndr bndr_info new_rhs
stg_arity = stgRhsArity stg_rhs
; return (ASSERT2( arity_ok stg_arity, mk_arity_msg stg_arity) stg_rhs,
rhs_fvs) }
@@ -272,26 +273,31 @@ coreToTopStgRhs dflags this_mod scope_fv_info (bndr, rhs)
ptext (sLit "STG arity:") <+> ppr stg_arity]
mkTopStgRhs :: DynFlags -> Module -> FreeVarsInfo
- -> SRT -> StgBinderInfo -> StgExpr
+ -> SRT -> Id -> StgBinderInfo -> StgExpr
-> StgRhs
-mkTopStgRhs _ _ rhs_fvs srt binder_info (StgLam bndrs body)
+mkTopStgRhs _ _ rhs_fvs srt _ binder_info (StgLam bndrs body)
= StgRhsClosure noCCS binder_info
(getFVs rhs_fvs)
ReEntrant
srt
bndrs body
-mkTopStgRhs dflags this_mod _ _ _ (StgConApp con args)
+mkTopStgRhs dflags this_mod _ _ _ _ (StgConApp con args)
| not (isDllConApp dflags this_mod con args) -- Dynamic StgConApps are updatable
= StgRhsCon noCCS con args
-mkTopStgRhs _ _ rhs_fvs srt binder_info rhs
+mkTopStgRhs _ _ rhs_fvs srt bndr binder_info rhs
= StgRhsClosure noCCS binder_info
(getFVs rhs_fvs)
- Updatable
+ (getUpdateFlag bndr)
srt
[] rhs
+
+getUpdateFlag :: Id -> UpdateFlag
+getUpdateFlag bndr
+ = if isSingleUsed (idDemandInfo bndr)
+ then SingleEntry else Updatable
\end{code}
@@ -781,27 +787,27 @@ coreToStgRhs :: FreeVarsInfo -- Free var info for the scope of the bi
coreToStgRhs scope_fv_info binders (bndr, rhs) = do
(new_rhs, rhs_fvs, rhs_escs) <- coreToStgExpr rhs
lv_info <- freeVarsToLiveVars (binders `minusFVBinders` rhs_fvs)
- return (mkStgRhs rhs_fvs (mkSRT lv_info) bndr_info new_rhs,
+ return (mkStgRhs rhs_fvs (mkSRT lv_info) bndr bndr_info new_rhs,
rhs_fvs, lv_info, rhs_escs)
where
bndr_info = lookupFVInfo scope_fv_info bndr
-mkStgRhs :: FreeVarsInfo -> SRT -> StgBinderInfo -> StgExpr -> StgRhs
+mkStgRhs :: FreeVarsInfo -> SRT -> Id -> StgBinderInfo -> StgExpr -> StgRhs
-mkStgRhs _ _ _ (StgConApp con args) = StgRhsCon noCCS con args
+mkStgRhs _ _ _ _ (StgConApp con args) = StgRhsCon noCCS con args
-mkStgRhs rhs_fvs srt binder_info (StgLam bndrs body)
+mkStgRhs rhs_fvs srt _ binder_info (StgLam bndrs body)
= StgRhsClosure noCCS binder_info
(getFVs rhs_fvs)
ReEntrant
srt bndrs body
-mkStgRhs rhs_fvs srt binder_info rhs
+mkStgRhs rhs_fvs srt bndr binder_info rhs
= StgRhsClosure noCCS binder_info
(getFVs rhs_fvs)
upd_flag srt [] rhs
where
- upd_flag = Updatable
+ upd_flag = getUpdateFlag bndr
{-
SDM: disabled. Eval/Apply can't handle functions with arity zero very
well; and making these into simple non-updatable thunks breaks other