diff options
author | Simon Marlow <marlowsd@gmail.com> | 2016-12-15 11:17:19 -0500 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2016-12-15 11:17:29 -0500 |
commit | 394231b301efb6b56654b0a480ab794fe3b7e4db (patch) | |
tree | aae8fd1ee76ed8a06b9c87beb5dd2cb1aa8187e7 /testsuite/tests/profiling | |
parent | 5c76f834b5b7f2ee9712d0888a8b1b186b77dee5 (diff) | |
download | haskell-394231b301efb6b56654b0a480ab794fe3b7e4db.tar.gz |
Fix cost-centre-stacks bug (#5654)
This fixes some cases of wrong stacks being generated by the profiler.
For background and details on the fix see
`Note [Evaluating functions with profiling]` in `rts/Apply.cmm`.
This does have an impact on allocations for some programs when
profiling. nofib results:
```
k-nucleotide +0.0% +8.8% +11.0% +11.0% 0.0%
puzzle +0.0% +12.5% 0.244 0.246 0.0%
typecheck 0.0% +8.7% +16.1% +16.2% 0.0%
------------------------------------------------------------------------
--------
Min -0.0% -0.0% -34.4% -35.5% -25.0%
Max +0.0% +12.5% +48.9% +49.4% +10.6%
Geometric Mean +0.0% +0.6% +2.0% +1.8% -0.3%
```
But runtimes don't seem to be affected much, and the examples I looked
at were completely legitimate. For example, in puzzle we have this:
```
position :: ItemType -> StateType -> BankType
position Bono = bonoPos
position Edge = edgePos
position Larry = larryPos
position Adam = adamPos
```
where the identifiers on the rhs are all record selectors. Previously
the profiler gave a stack that looked like
```
position
bonoPos
...
```
i.e. `bonoPos` was at the same level of the call stack as `position`,
but now it looks like
```
position
bonoPos
...
```
I used the normaliser from the testsuite to diff the profiling output
from other nofib programs and they all looked better.
Test Plan:
* the broken test passes
* validate
* compiled and ran all of nofib, measured perf, diff'd several .prof
files
Reviewers: niteria, erikd, austin, scpmw, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2804
GHC Trac Issues: #5654, #10007
Diffstat (limited to 'testsuite/tests/profiling')
-rw-r--r-- | testsuite/tests/profiling/should_run/T5654-O0.hs (renamed from testsuite/tests/profiling/should_run/T5654.hs) | 0 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/T5654-O0.prof.sample | 28 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/T5654-O1.hs | 14 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/T5654-O1.prof.sample | 27 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/T5654.prof.sample | 28 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/T680.prof.sample | 50 | ||||
-rw-r--r-- | testsuite/tests/profiling/should_run/all.T | 4 |
7 files changed, 97 insertions, 54 deletions
diff --git a/testsuite/tests/profiling/should_run/T5654.hs b/testsuite/tests/profiling/should_run/T5654-O0.hs index d7f83bf9fa..d7f83bf9fa 100644 --- a/testsuite/tests/profiling/should_run/T5654.hs +++ b/testsuite/tests/profiling/should_run/T5654-O0.hs diff --git a/testsuite/tests/profiling/should_run/T5654-O0.prof.sample b/testsuite/tests/profiling/should_run/T5654-O0.prof.sample new file mode 100644 index 0000000000..2344d260f5 --- /dev/null +++ b/testsuite/tests/profiling/should_run/T5654-O0.prof.sample @@ -0,0 +1,28 @@ + Thu Dec 8 11:35 2016 Time and Allocation Profiling Report (Final) + + T5654-O0 +RTS -p -RTS + + total time = 0.00 secs (0 ticks @ 1000 us, 1 processor) + total alloc = 39,184 bytes (excludes profiling overheads) + +COST CENTRE MODULE SRC %time %alloc + +MAIN MAIN <built-in> 0.0 1.0 +CAF GHC.IO.Handle.FD <entire-module> 0.0 88.3 +CAF GHC.IO.Encoding <entire-module> 0.0 7.1 +CAF GHC.Conc.Signal <entire-module> 0.0 1.6 + + + individual inherited +COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc + +MAIN MAIN <built-in> 104 0 0.0 1.0 0.0 100.0 + CAF Main <entire-module> 207 0 0.0 0.9 0.0 1.4 + f Main T5654-O0.hs:7:1-5 209 1 0.0 0.1 0.0 0.1 + main Main T5654-O0.hs:13:1-21 208 1 0.0 0.3 0.0 0.4 + f Main T5654-O0.hs:7:1-5 210 0 0.0 0.0 0.0 0.1 + g Main T5654-O0.hs:11:1-11 211 1 0.0 0.1 0.0 0.1 + CAF GHC.Conc.Signal <entire-module> 201 0 0.0 1.6 0.0 1.6 + CAF GHC.IO.Encoding <entire-module> 191 0 0.0 7.1 0.0 7.1 + CAF GHC.IO.Encoding.Iconv <entire-module> 189 0 0.0 0.6 0.0 0.6 + CAF GHC.IO.Handle.FD <entire-module> 181 0 0.0 88.3 0.0 88.3 diff --git a/testsuite/tests/profiling/should_run/T5654-O1.hs b/testsuite/tests/profiling/should_run/T5654-O1.hs new file mode 100644 index 0000000000..d7f83bf9fa --- /dev/null +++ b/testsuite/tests/profiling/should_run/T5654-O1.hs @@ -0,0 +1,14 @@ +-- Tests for a bug in the handling of cost-centre stacks in the +-- runtime, where we lose the current cost-centre stack when +-- evaluating a function. + +{-# NOINLINE f #-} +f :: Int -> Int +f = g -- here we should remember the stack under which g was evaluated + +{-# NOINLINE g #-} +g :: Int -> Int +g x = x + 1 + +main = return $! f 3 + diff --git a/testsuite/tests/profiling/should_run/T5654-O1.prof.sample b/testsuite/tests/profiling/should_run/T5654-O1.prof.sample new file mode 100644 index 0000000000..0e65631521 --- /dev/null +++ b/testsuite/tests/profiling/should_run/T5654-O1.prof.sample @@ -0,0 +1,27 @@ + Thu Dec 8 11:34 2016 Time and Allocation Profiling Report (Final) + + T5654-O1 +RTS -p -RTS + + total time = 0.00 secs (0 ticks @ 1000 us, 1 processor) + total alloc = 39,064 bytes (excludes profiling overheads) + +COST CENTRE MODULE SRC %time %alloc + +MAIN MAIN <built-in> 0.0 1.9 +CAF GHC.IO.Handle.FD <entire-module> 0.0 88.6 +CAF GHC.IO.Encoding <entire-module> 0.0 7.1 +CAF GHC.Conc.Signal <entire-module> 0.0 1.6 + + + individual inherited +COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc + +MAIN MAIN <built-in> 104 0 0.0 1.9 0.0 100.0 + CAF Main <entire-module> 207 0 0.0 0.0 0.0 0.2 + main Main T5654-O1.hs:13:1-21 208 1 0.0 0.1 0.0 0.2 + f Main T5654-O1.hs:7:1-5 209 1 0.0 0.0 0.0 0.0 + g Main T5654-O1.hs:11:1-11 210 1 0.0 0.0 0.0 0.0 + CAF GHC.Conc.Signal <entire-module> 201 0 0.0 1.6 0.0 1.6 + CAF GHC.IO.Encoding <entire-module> 191 0 0.0 7.1 0.0 7.1 + CAF GHC.IO.Encoding.Iconv <entire-module> 189 0 0.0 0.6 0.0 0.6 + CAF GHC.IO.Handle.FD <entire-module> 181 0 0.0 88.6 0.0 88.6 diff --git a/testsuite/tests/profiling/should_run/T5654.prof.sample b/testsuite/tests/profiling/should_run/T5654.prof.sample deleted file mode 100644 index 7d12acdaab..0000000000 --- a/testsuite/tests/profiling/should_run/T5654.prof.sample +++ /dev/null @@ -1,28 +0,0 @@ - Wed Nov 23 11:52 2011 Time and Allocation Profiling Report (Final) - - scc004 +RTS -p -RTS - - total time = 0.00 secs (0 ticks @ 20 ms) - total alloc = 39,400 bytes (excludes profiling overheads) - -COST CENTRE MODULE %time %alloc - -MAIN MAIN 0.0 1.7 -CAF GHC.Conc.Signal 0.0 1.7 -CAF GHC.IO.Encoding 0.0 7.1 -CAF GHC.IO.Handle.FD 0.0 88.7 - - - individual inherited -COST CENTRE MODULE no. entries %time %alloc %time %alloc - -MAIN MAIN 98 0 0.0 1.7 0.0 100.0 - CAF GHC.IO.Handle.FD 124 0 0.0 88.7 0.0 88.7 - CAF GHC.IO.Encoding.Iconv 116 0 0.0 0.6 0.0 0.6 - CAF GHC.IO.Encoding 114 0 0.0 7.1 0.0 7.1 - CAF GHC.Conc.Signal 106 0 0.0 1.7 0.0 1.7 - CAF Main 104 0 0.0 0.1 0.0 0.2 - f Main 197 1 0.0 0.0 0.0 0.0 - main Main 196 1 0.0 0.0 0.0 0.1 - f Main 198 0 0.0 0.0 0.0 0.0 - g Main 199 1 0.0 0.0 0.0 0.0 diff --git a/testsuite/tests/profiling/should_run/T680.prof.sample b/testsuite/tests/profiling/should_run/T680.prof.sample index d14bad9bdd..1d52d7157d 100644 --- a/testsuite/tests/profiling/should_run/T680.prof.sample +++ b/testsuite/tests/profiling/should_run/T680.prof.sample @@ -1,9 +1,9 @@ - Sat Jun 4 11:59 2016 Time and Allocation Profiling Report (Final) + Thu Dec 8 15:23 2016 Time and Allocation Profiling Report (Final) T680 +RTS -hc -p -RTS total time = 0.00 secs (0 ticks @ 1000 us, 1 processor) - total alloc = 752,952 bytes (excludes profiling overheads) + total alloc = 753,032 bytes (excludes profiling overheads) COST CENTRE MODULE SRC %time %alloc @@ -17,26 +17,26 @@ foo.bar.\ Main T680.hs:(8,11)-(9,38) 0.0 38.2 individual inherited COST CENTRE MODULE SRC no. entries %time %alloc %time %alloc -MAIN MAIN <built-in> 46 0 0.0 0.1 0.0 100.0 - CAF Main <entire-module> 91 0 0.0 0.0 0.0 93.6 - k Main T680.hs:12:1-17 95 1 0.0 0.0 0.0 0.0 - foo Main T680.hs:(2,1)-(9,38) 96 1 0.0 0.0 0.0 0.0 - foo.bar Main T680.hs:(5,3)-(9,38) 100 1 0.0 0.0 0.0 0.0 - foo.bar.k' Main T680.hs:6:9-34 104 1 0.0 0.0 0.0 0.0 - k.\ Main T680.hs:12:16 105 1 0.0 0.0 0.0 0.0 - main Main T680.hs:20:1-14 92 1 0.0 0.0 0.0 0.0 - r Main T680.hs:18:1-26 94 1 0.0 0.0 0.0 93.5 - k Main T680.hs:12:1-17 97 0 0.0 0.0 0.0 93.5 - foo Main T680.hs:(2,1)-(9,38) 98 0 0.0 0.0 0.0 93.5 - foo.\ Main T680.hs:3:12-40 99 4001 0.0 25.5 0.0 25.5 - foo.bar Main T680.hs:(5,3)-(9,38) 101 0 0.0 29.8 0.0 68.0 - foo.bar.\ Main T680.hs:(8,11)-(9,38) 102 4001 0.0 38.2 0.0 38.2 - foo.bar.\.k'' Main T680.hs:8:15-27 103 4000 0.0 0.0 0.0 0.0 - foo.bar.k' Main T680.hs:6:9-34 106 0 0.0 0.0 0.0 0.0 - CAF GHC.IO.Handle.FD <entire-module> 84 0 0.0 4.6 0.0 4.6 - CAF GHC.IO.Handle.Text <entire-module> 83 0 0.0 0.0 0.0 0.0 - CAF GHC.Conc.Signal <entire-module> 80 0 0.0 0.1 0.0 0.1 - CAF GHC.IO.Encoding <entire-module> 78 0 0.0 0.4 0.0 0.4 - CAF GHC.Show <entire-module> 76 0 0.0 0.0 0.0 0.0 - CAF GHC.IO.Encoding.Iconv <entire-module> 62 0 0.0 0.0 0.0 0.0 - main Main T680.hs:20:1-14 93 0 0.0 1.2 0.0 1.2 +MAIN MAIN <built-in> 104 0 0.0 0.1 0.0 100.0 + CAF Main <entire-module> 207 0 0.0 0.0 0.0 93.5 + k Main T680.hs:12:1-17 211 1 0.0 0.0 0.0 0.0 + foo Main T680.hs:(2,1)-(9,38) 212 1 0.0 0.0 0.0 0.0 + foo.bar Main T680.hs:(5,3)-(9,38) 216 1 0.0 0.0 0.0 0.0 + foo.bar.k' Main T680.hs:6:9-34 220 1 0.0 0.0 0.0 0.0 + k.\ Main T680.hs:12:16 221 1 0.0 0.0 0.0 0.0 + main Main T680.hs:20:1-14 208 1 0.0 0.0 0.0 0.0 + r Main T680.hs:18:1-26 210 1 0.0 0.0 0.0 93.5 + k Main T680.hs:12:1-17 213 0 0.0 0.0 0.0 93.5 + foo Main T680.hs:(2,1)-(9,38) 214 0 0.0 0.0 0.0 93.5 + foo.\ Main T680.hs:3:12-40 215 4001 0.0 25.5 0.0 25.5 + foo.bar Main T680.hs:(5,3)-(9,38) 217 0 0.0 29.8 0.0 68.0 + foo.bar.\ Main T680.hs:(8,11)-(9,38) 218 4001 0.0 38.2 0.0 38.2 + foo.bar.\.k'' Main T680.hs:8:15-27 219 4000 0.0 0.0 0.0 0.0 + foo.bar.k' Main T680.hs:6:9-34 222 0 0.0 0.0 0.0 0.0 + CAF GHC.Conc.Signal <entire-module> 201 0 0.0 0.1 0.0 0.1 + CAF GHC.IO.Encoding <entire-module> 191 0 0.0 0.4 0.0 0.4 + CAF GHC.IO.Encoding.Iconv <entire-module> 189 0 0.0 0.0 0.0 0.0 + CAF GHC.IO.Handle.FD <entire-module> 181 0 0.0 4.6 0.0 4.6 + CAF GHC.IO.Handle.Text <entire-module> 179 0 0.0 0.0 0.0 0.0 + CAF GHC.Show <entire-module> 165 0 0.0 0.0 0.0 0.0 + main Main T680.hs:20:1-14 209 0 0.0 1.2 0.0 1.2 diff --git a/testsuite/tests/profiling/should_run/all.T b/testsuite/tests/profiling/should_run/all.T index 151b75b858..5faca297b6 100644 --- a/testsuite/tests/profiling/should_run/all.T +++ b/testsuite/tests/profiling/should_run/all.T @@ -47,7 +47,9 @@ test('scc002', [], compile_and_run, ['']) test('scc003', [], compile_and_run, ['-fno-state-hack']) # Note [consistent stacks] -test('T5654', [expect_broken(5654)], compile_and_run, ['']) +test('T5654-O0', [only_ways(['prof'])], compile_and_run, ['']) + +test('T5654-O1', [only_ways(['profasm'])], compile_and_run, ['']) test('T5654b-O0', [only_ways(['prof'])], compile_and_run, ['']) |