diff options
author | Simon Marlow <marlowsd@gmail.com> | 2011-10-27 13:47:27 +0100 |
---|---|---|
committer | Simon Marlow <marlowsd@gmail.com> | 2011-11-02 16:34:05 +0000 |
commit | 7bb0447df9a783c222c2a077e35e5013c7c68d91 (patch) | |
tree | 78d6d2a14f7e42df5cda32199c71ced973f169ef /compiler/codeGen/CgProf.hs | |
parent | bd72eeb184a95ae0ae79ccad19c8ccc2b45a12e0 (diff) | |
download | haskell-7bb0447df9a783c222c2a077e35e5013c7c68d91.tar.gz |
Overhaul of infrastructure for profiling, coverage (HPC) and breakpoints
User visible changes
====================
Profilng
--------
Flags renamed (the old ones are still accepted for now):
OLD NEW
--------- ------------
-auto-all -fprof-auto
-auto -fprof-exported
-caf-all -fprof-cafs
New flags:
-fprof-auto Annotates all bindings (not just top-level
ones) with SCCs
-fprof-top Annotates just top-level bindings with SCCs
-fprof-exported Annotates just exported bindings with SCCs
-fprof-no-count-entries Do not maintain entry counts when profiling
(can make profiled code go faster; useful with
heap profiling where entry counts are not used)
Cost-centre stacks have a new semantics, which should in most cases
result in more useful and intuitive profiles. If you find this not to
be the case, please let me know. This is the area where I have been
experimenting most, and the current solution is probably not the
final version, however it does address all the outstanding bugs and
seems to be better than GHC 7.2.
Stack traces
------------
+RTS -xc now gives more information. If the exception originates from
a CAF (as is common, because GHC tends to lift exceptions out to the
top-level), then the RTS walks up the stack and reports the stack in
the enclosing update frame(s).
Result: +RTS -xc is much more useful now - but you still have to
compile for profiling to get it. I've played around a little with
adding 'head []' to GHC itself, and +RTS -xc does pinpoint the problem
quite accurately.
I plan to add more facilities for stack tracing (e.g. in GHCi) in the
future.
Coverage (HPC)
--------------
* derived instances are now coloured yellow if they weren't used
* likewise record field names
* entry counts are more accurate (hpc --fun-entry-count)
* tab width is now correct (markup was previously off in source with
tabs)
Internal changes
================
In Core, the Note constructor has been replaced by
Tick (Tickish b) (Expr b)
which is used to represent all the kinds of source annotation we
support: profiling SCCs, HPC ticks, and GHCi breakpoints.
Depending on the properties of the Tickish, different transformations
apply to Tick. See CoreUtils.mkTick for details.
Tickets
=======
This commit closes the following tickets, test cases to follow:
- Close #2552: not a bug, but the behaviour is now more intuitive
(test is T2552)
- Close #680 (test is T680)
- Close #1531 (test is result001)
- Close #949 (test is T949)
- Close #2466: test case has bitrotted (doesn't compile against current
version of vector-space package)
Diffstat (limited to 'compiler/codeGen/CgProf.hs')
-rw-r--r-- | compiler/codeGen/CgProf.hs | 195 |
1 files changed, 27 insertions, 168 deletions
diff --git a/compiler/codeGen/CgProf.hs b/compiler/codeGen/CgProf.hs index b58fbb4238..b43751361c 100644 --- a/compiler/codeGen/CgProf.hs +++ b/compiler/codeGen/CgProf.hs @@ -10,13 +10,13 @@ module CgProf ( mkCCostCentre, mkCCostCentreStack, -- Cost-centre Profiling - dynProfHdr, profDynAlloc, profAlloc, staticProfHdr, initUpdFrameProf, - enterCostCentre, enterCostCentrePAP, enterCostCentreThunk, - chooseDynCostCentres, - costCentreFrom, + dynProfHdr, profDynAlloc, profAlloc, staticProfHdr, initUpdFrameProf, + enterCostCentreThunk, + enterCostCentreFun, + costCentreFrom, curCCS, curCCSAddr, emitCostCentreDecl, emitCostCentreStackDecl, - emitSetCCC, emitCCS, + emitSetCCC, -- Lag/drag/void stuff ldvEnter, ldvEnterClosure, ldvRecordCreate @@ -40,10 +40,8 @@ import OldCmm import OldCmmUtils import CLabel -import Id import qualified Module import CostCentre -import StgSyn import StaticFlags import FastString import Module @@ -108,6 +106,9 @@ profDynAlloc cl_info ccs -- | Record the allocation of a closure (size is given by a CmmExpr) -- The size must be in words, because the allocation counter in a CCS counts -- in words. +-- +-- This API is used by the @CCS_ALLOC()@ macro in @.cmm@ code. +-- profAlloc :: CmmExpr -> CmmExpr -> Code profAlloc words ccs = ifProfiling $ @@ -121,160 +122,21 @@ profAlloc words ccs where alloc_rep = typeWidth REP_CostCentreStack_mem_alloc --- ---------------------------------------------------------------------- --- Setting the cost centre in a new closure - -chooseDynCostCentres :: CostCentreStack - -> [Id] -- Args - -> StgExpr -- Body - -> FCode (CmmExpr, CmmExpr) --- Called when alllcating a closure --- Tells which cost centre to put in the object, and which --- to blame the cost of allocation on -chooseDynCostCentres ccs args body = do - -- Cost-centre we record in the object - use_ccs <- emitCCS ccs - - -- Cost-centre on whom we blame the allocation - let blame_ccs - | null args && isBox body = CmmLit (mkCCostCentreStack overheadCCS) - | otherwise = use_ccs - - return (use_ccs, blame_ccs) - - --- Some CostCentreStacks are a sequence of pushes on top of CCCS. --- These pushes must be performed before we can refer to the stack in --- an expression. -emitCCS :: CostCentreStack -> FCode CmmExpr -emitCCS ccs = push_em (ccsExpr ccs') (reverse cc's) - where - (cc's, ccs') = decomposeCCS ccs - - push_em ccs [] = return ccs - push_em ccs (cc:rest) = do - tmp <- newTemp bWord -- TODO FIXME NOW - pushCostCentre tmp ccs cc - push_em (CmmReg (CmmLocal tmp)) rest - -ccsExpr :: CostCentreStack -> CmmExpr -ccsExpr ccs - | isCurrentCCS ccs = curCCS - | otherwise = CmmLit (mkCCostCentreStack ccs) - - -isBox :: StgExpr -> Bool --- If it's an utterly trivial RHS, then it must be --- one introduced by boxHigherOrderArgs for profiling, --- so we charge it to "OVERHEAD". --- This looks like a GROSS HACK to me --SDM -isBox (StgApp _ []) = True -isBox _ = False - - -- ----------------------------------------------------------------------- -- Setting the current cost centre on entry to a closure --- For lexically scoped profiling we have to load the cost centre from --- the closure entered, if the costs are not supposed to be inherited. --- This is done immediately on entering the fast entry point. - --- Load current cost centre from closure, if not inherited. --- Node is guaranteed to point to it, if profiling and not inherited. - -enterCostCentre - :: ClosureInfo - -> CostCentreStack - -> StgExpr -- The RHS of the closure - -> Code - --- We used to have a special case for bindings of form --- f = g True --- where g has arity 2. The RHS is a thunk, but we don't --- need to update it; and we want to subsume costs. --- We don't have these sort of PAPs any more, so the special --- case has gone away. - -enterCostCentre closure_info ccs body - = ifProfiling $ - ASSERT2(not (noCCSAttached ccs), ppr (closureName closure_info) <+> ppr ccs) - enter_cost_centre closure_info ccs body - -enter_cost_centre :: ClosureInfo -> CostCentreStack -> StgExpr -> Code -enter_cost_centre closure_info ccs body - | isSubsumedCCS ccs - = ASSERT(isToplevClosure closure_info) - ASSERT(re_entrant) - enter_ccs_fsub - - | isDerivedFromCurrentCCS ccs - = do { - if re_entrant && not is_box - then - enter_ccs_fun node_ccs - else - stmtC (CmmStore curCCSAddr node_ccs) - - -- don't forget to bump the scc count. This closure might have been - -- of the form let x = _scc_ "x" e in ...x..., which the SCCfinal - -- pass has turned into simply let x = e in ...x... and attached - -- the _scc_ as PushCostCentre(x,CCCS) on the x closure. So that - -- we don't lose the scc counter, bump it in the entry code for x. - -- ToDo: for a multi-push we should really bump the counter for - -- each of the intervening CCSs, not just the top one. - ; when (not (isCurrentCCS ccs)) $ - stmtC (bumpSccCount curCCS) - } - - | isCafCCS ccs - = ASSERT(isToplevClosure closure_info) - ASSERT(not re_entrant) - do { -- This is just a special case of the isDerivedFromCurrentCCS - -- case above. We could delete this, but it's a micro - -- optimisation and saves a bit of code. - stmtC (CmmStore curCCSAddr enc_ccs) - ; stmtC (bumpSccCount node_ccs) - } - - | otherwise - = panic "enterCostCentre" - where - enc_ccs = CmmLit (mkCCostCentreStack ccs) - re_entrant = closureReEntrant closure_info - node_ccs = costCentreFrom (cmmOffsetB (CmmReg nodeReg) (-node_tag)) - is_box = isBox body - - -- if this is a function, then node will be tagged; we must subract the tag - node_tag = funTag closure_info - --- set the current CCS when entering a PAP -enterCostCentrePAP :: CmmExpr -> Code -enterCostCentrePAP closure = - ifProfiling $ do - enter_ccs_fun (costCentreFrom closure) - enteringPAP 1 - enterCostCentreThunk :: CmmExpr -> Code enterCostCentreThunk closure = ifProfiling $ do stmtC $ CmmStore curCCSAddr (costCentreFrom closure) -enter_ccs_fun :: CmmExpr -> Code -enter_ccs_fun stack = emitRtsCall rtsPackageId (fsLit "EnterFunCCS") [CmmHinted stack AddrHint] False - -- ToDo: vols - -enter_ccs_fsub :: Code -enter_ccs_fsub = enteringPAP 0 - --- When entering a PAP, EnterFunCCS is called by both the PAP entry --- code and the function entry code; we don't want the function's --- entry code to also update CCCS in the event that it was called via --- a PAP, so we set the flag entering_PAP to indicate that we are --- entering via a PAP. -enteringPAP :: Integer -> Code -enteringPAP n - = stmtC (CmmStore (CmmLit (CmmLabel (mkCmmDataLabel rtsPackageId (fsLit "entering_PAP")))) - (CmmLit (CmmInt n cIntWidth))) +enterCostCentreFun :: CostCentreStack -> CmmExpr -> Code +enterCostCentreFun ccs closure = + ifProfiling $ do + if isCurrentCCS ccs + then emitRtsCall rtsPackageId (fsLit "enterFunCCS") + [CmmHinted (costCentreFrom closure) AddrHint] False + else return () -- top-level function, nothing to do ifProfiling :: Code -> Code ifProfiling code @@ -286,7 +148,6 @@ ifProfilingL xs | opt_SccProfilingOn = xs | otherwise = [] - -- --------------------------------------------------------------------------- -- Initialising Cost Centres & CCSs @@ -306,15 +167,15 @@ emitCostCentreDecl cc = do modl, -- char *module, zero, -- StgWord time_ticks zero64, -- StgWord64 mem_alloc - subsumed, -- StgInt is_caf - zero -- struct _CostCentre *link + is_caf, -- StgInt is_caf + zero -- struct _CostCentre *link ] ; emitDataLits (mkCCLabel cc) lits } where - subsumed | isCafCC cc = mkIntCLit (ord 'c') -- 'c' == is a CAF - | otherwise = mkIntCLit (ord 'B') -- 'B' == is boring - + is_caf | isCafCC cc = mkIntCLit (ord 'c') -- 'c' == is a CAF + | otherwise = zero + emitCostCentreStackDecl :: CostCentreStack @@ -349,23 +210,21 @@ sizeof_ccs_words -- --------------------------------------------------------------------------- -- Set the current cost centre stack -emitSetCCC :: CostCentre -> Code -emitSetCCC cc +emitSetCCC :: CostCentre -> Bool -> Bool -> Code +emitSetCCC cc tick push | not opt_SccProfilingOn = nopC | otherwise = do tmp <- newTemp bWord -- TODO FIXME NOW - ASSERT( sccAbleCostCentre cc ) - pushCostCentre tmp curCCS cc - stmtC (CmmStore curCCSAddr (CmmReg (CmmLocal tmp))) - when (isSccCountCostCentre cc) $ - stmtC (bumpSccCount curCCS) + pushCostCentre tmp curCCS cc + when tick $ stmtC (bumpSccCount (CmmReg (CmmLocal tmp))) + when push $ stmtC (CmmStore curCCSAddr (CmmReg (CmmLocal tmp))) pushCostCentre :: LocalReg -> CmmExpr -> CostCentre -> Code pushCostCentre result ccs cc = emitRtsCallWithResult result AddrHint rtsPackageId - (fsLit "PushCostCentre") [CmmHinted ccs AddrHint, - CmmHinted (CmmLit (mkCCostCentre cc)) AddrHint] + (fsLit "pushCostCentre") [CmmHinted ccs AddrHint, + CmmHinted (CmmLit (mkCCostCentre cc)) AddrHint] False bumpSccCount :: CmmExpr -> CmmStmt |