author | Sergei Trofimovich <slyfox@gentoo.org> | 2016-09-02 18:47:56 +0100 |
---|---|---|
committer | Sergei Trofimovich <siarheit@google.com> | 2016-09-02 21:42:44 +0100 |
commit | f93c363fab1ac8ce6f0b474f5967b0b097995827 (patch) | |
tree | 8c8b7f5d9ff70ad614fa6b66897d69df8563c620 /compiler/specialise | |
parent | 133a5cc6647a2ea5a63b8d81f9f357f89cb541ef (diff) | |
extend '-fmax-worker-args' limit to specialiser (Trac #11565)
It's a complementary change to
a48de37dcca98e7d477040b0ed298bcd1b3ab303
restore -fmax-worker-args handling (Trac #11565)
I don't have a small example, but I noticed another
discrepancy while profiling GHC for performance:
cmmExprNative :: ReferenceKind -> CmmExpr -> CmmOptM CmmExpr
was specialised by 'spec_one' down to a function with arity 159.
As a result, 'perf record' pointed at it as the slowest
function in the whole ghc library.
I've extended the effect of -fmax-worker-args to 'spec_one',
as it does the same worker/wrapper split to push
arguments to the heap.
The change decreases heap usage on a synth.bash benchmark
(Trac #9221) from 67G down to 64G (-4%). Benchmark runtime
decreased from 14.5 s down to 14.s (-7%).
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Reviewers: ezyang, simonpj, austin, goldfire, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2507
GHC Trac Issues: #11565
Diffstat (limited to 'compiler/specialise')
-rw-r--r-- | compiler/specialise/SpecConstr.hs | 12 |
1 files changed, 8 insertions, 4 deletions
diff --git a/compiler/specialise/SpecConstr.hs b/compiler/specialise/SpecConstr.hs
index 8cc393cb44..1cf3d448bf 100644
--- a/compiler/specialise/SpecConstr.hs
+++ b/compiler/specialise/SpecConstr.hs
@@ -29,7 +29,7 @@ import CoreFVs ( exprsFreeVarsList )
 import CoreMonad
 import Literal ( litIsLifted )
 import HscTypes ( ModGuts(..) )
-import WwLib ( mkWorkerArgs )
+import WwLib ( isWorkerSmallEnough, mkWorkerArgs )
 import DataCon
 import Coercion hiding( substCo )
 import Rules
@@ -1533,10 +1533,14 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
   | Just all_calls <- lookupVarEnv bind_calls fn
   = -- pprTrace "specialise entry {" (ppr fn <+> ppr (length all_calls)) $
-    do { (boring_call, pats) <- callsToPats env specs arg_occs all_calls
-
+    do { (boring_call, all_pats) <- callsToPats env specs arg_occs all_calls
                 -- Bale out if too many specialisations
-       ; let n_pats      = length pats
+       ; let pats = filter (is_small_enough . fst) all_pats
+             is_small_enough vars = isWorkerSmallEnough (sc_dflags env) vars
+                  -- We are about to construct w/w pair in 'spec_one'.
+                  -- Omit specialisation leading to high arity workers.
+                  -- See Note [Limit w/w arity]
+             n_pats      = length pats
              spec_count' = n_pats + spec_count
        ; case sc_count env of
            Just max | not (sc_force env) && spec_count' > max
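
The filtering step in the hunk above can be illustrated outside GHC with a small, self-contained sketch. Everything in it is a hypothetical stand-in: `Var`, `CallPat`, `Limits`, `isSmallEnough` and `keepSmallPats` are not GHC names, and the size check simply counts every variable, which may differ from what the real `isWorkerSmallEnough` counts. It only shows the order of operations the patch introduces: drop call patterns whose worker would take too many arguments, then count what remains.

```haskell
-- Hypothetical stand-ins for illustration; none of these are GHC's types.
type Var = String

-- A call pattern: the variables the specialised worker would take as
-- arguments, paired with the (abstracted) argument expressions of the call.
type CallPat = ([Var], [String])

-- Stand-in for the limit configured by -fmax-worker-args.
newtype Limits = Limits { maxWorkerArgs :: Int }

-- Assumed, simplified size check: accept a worker only if its argument
-- count stays within the limit. This sketch counts every variable.
isSmallEnough :: Limits -> [Var] -> Bool
isSmallEnough limits vars = length vars <= maxWorkerArgs limits

-- Filter first, count second, as in the patch: patterns that would give
-- a high-arity worker never contribute to the specialisation count.
keepSmallPats :: Limits -> [CallPat] -> ([CallPat], Int)
keepSmallPats limits allPats = (pats, length pats)
  where
    pats = filter (isSmallEnough limits . fst) allPats
```

Loaded into GHCi, `keepSmallPats (Limits 3)` applied to three patterns binding two, four and zero variables keeps the first and last and reports a count of 2. Because the count is taken after filtering, oversized patterns also do not push the total towards the "too many specialisations" bail-out that follows in `specialise`.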