author Sergei Trofimovich <slyfox@gentoo.org> 2016-09-02 18:47:56 +0100
committer Sergei Trofimovich <siarheit@google.com> 2016-09-02 21:42:44 +0100
commit f93c363fab1ac8ce6f0b474f5967b0b097995827
tree 8c8b7f5d9ff70ad614fa6b66897d69df8563c620 /compiler/specialise
parent 133a5cc6647a2ea5a63b8d81f9f357f89cb541ef
extend '-fmax-worker-args' limit to specialiser (Trac #11565)
It's a complementary change to
a48de37dcca98e7d477040b0ed298bcd1b3ab303
    restore -fmax-worker-args handling (Trac #11565)

I don't have a small example, but I noticed another discrepancy
while profiling GHC for performance:

    cmmExprNative :: ReferenceKind -> CmmExpr -> CmmOptM CmmExpr

was specialised by 'spec_one' down to a function with arity 159.
As a result, 'perf record' pointed at it as the slowest function
in the whole ghc library.

I've extended the effect of -fmax-worker-args to 'spec_one', as it
does the same worker/wrapper split to push arguments to the heap.

The change decreases heap usage on a synth.bash benchmark
(Trac #9221) from 67G down to 64G (-4%). Benchmark runtime
decreased from 14.5 s down to 14.s (-7%).

Signed-off-by: Sergei Trofimovich <siarheit@google.com>

Reviewers: ezyang, simonpj, austin, goldfire, bgamari

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2507

GHC Trac Issues: #11565
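The effect described in the commit message can be illustrated with a small standalone sketch. It is not GHC code: `Var`, `CallPat`, `workerSmallEnough` and `keepSmallPats` are hypothetical stand-ins, the limit of 10 is an arbitrary value chosen for the example, and the rule that only term-level binders count towards the limit is an assumption of this sketch. It only shows the shape of the change: call patterns whose quantified variables would produce an over-wide worker are dropped before they are counted and specialised.

```haskell
module Main where

-- Hypothetical stand-in for GHC's Var: it only records whether a binder
-- is term-level (passed at runtime) or type-level (erased).
data Var = TermVar String | TypeVar String
  deriving (Show, Eq)

isTermVar :: Var -> Bool
isTermVar (TermVar _) = True
isTermVar (TypeVar _) = False

-- Simplified model of the arity check. Assumption of this sketch:
-- only term-level binders count towards the limit.
workerSmallEnough :: Int -> [Var] -> Bool
workerSmallEnough maxArgs vars = length (filter isTermVar vars) <= maxArgs

-- Hypothetical call pattern: the variables the specialised function
-- would take as arguments, plus a label standing in for the pattern.
type CallPat = ([Var], String)

-- Same shape as the filter added in the diff: keep only patterns
-- whose worker would stay within the argument limit.
keepSmallPats :: Int -> [CallPat] -> [CallPat]
keepSmallPats maxArgs = filter (workerSmallEnough maxArgs . fst)

main :: IO ()
main = do
  let small = ([TypeVar "a", TermVar "x", TermVar "y"], "small")
      huge  = (map (TermVar . ("v" ++) . show) [1 .. 159 :: Int], "huge")
  -- Only the pattern with two runtime arguments survives the filter.
  print (map snd (keepSmallPats 10 [small, huge]))
```

Filtering before `length pats` is taken means the discarded patterns also do not count towards the specialisation limit checked right after.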
Diffstat (limited to 'compiler/specialise')
-rw-r--r-- compiler/specialise/SpecConstr.hs 12
1 files changed, 8 insertions, 4 deletions
diff --git a/compiler/specialise/SpecConstr.hs b/compiler/specialise/SpecConstr.hs
index 8cc393cb44..1cf3d448bf 100644
--- a/compiler/specialise/SpecConstr.hs
+++ b/compiler/specialise/SpecConstr.hs
@@ -29,7 +29,7 @@ import CoreFVs ( exprsFreeVarsList )
 import CoreMonad
 import Literal ( litIsLifted )
 import HscTypes ( ModGuts(..) )
-import WwLib ( mkWorkerArgs )
+import WwLib ( isWorkerSmallEnough, mkWorkerArgs )
 import DataCon
 import Coercion hiding( substCo )
 import Rules
@@ -1533,10 +1533,14 @@ specialise env bind_calls (RI { ri_fn = fn, ri_lam_bndrs = arg_bndrs
   | Just all_calls <- lookupVarEnv bind_calls fn
   = -- pprTrace "specialise entry {" (ppr fn <+> ppr (length all_calls)) $
-    do { (boring_call, pats) <- callsToPats env specs arg_occs all_calls
-
+    do { (boring_call, all_pats) <- callsToPats env specs arg_occs all_calls
                -- Bale out if too many specialisations
-       ; let n_pats = length pats
+       ; let pats = filter (is_small_enough . fst) all_pats
+             is_small_enough vars = isWorkerSmallEnough (sc_dflags env) vars
+                 -- We are about to construct w/w pair in 'spec_one'.
+                 -- Omit specialisation leading to high arity workers.
+                 -- See Note [Limit w/w arity]
+             n_pats = length pats
              spec_count' = n_pats + spec_count
        ; case sc_count env of
            Just max | not (sc_force env) && spec_count' > max