| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Summary:
This re-applies {D5195} with fixes for i386:
* Fix unused label warnings, see {D5230} or {D5273}
* Fix a silly bug introduced by moving `#if`
{P190}
Add an RTS option `-xp` to load PIC objects anywhere in the address space. We do
this by relaxing the requirement that the result of `mmapForLinker` lie below
0x80000000 and by implying USE_CONTIGUOUS_MMAP.
We also need to change calls to `ocInit` and `ocGetNames` to avoid
dangling pointers when the address of `oc->image` is changed by
`ocAllocateSymbolExtra`.
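The dangling-pointer hazard described here is the usual one with buffers that can move. A minimal sketch in plain C, assuming a hypothetical `DemoObject` record (not the RTS `ObjectCode` type) whose image may be relocated by `realloc`: positions are kept as offsets and turned into pointers only after the image has reached its final address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object-code record; not the RTS ObjectCode. */
typedef struct {
    unsigned char *image;  /* start of the loaded image */
    size_t size;           /* bytes currently in use */
    size_t name_offset;    /* position of a name, relative to image */
} DemoObject;

/* Grow the image by `extra` zeroed bytes. realloc may move the buffer, so
 * any raw pointer into the old image must be treated as invalid afterwards. */
int demo_grow_image(DemoObject *oc, size_t extra)
{
    unsigned char *moved = realloc(oc->image, oc->size + extra);
    if (moved == NULL) return -1;
    memset(moved + oc->size, 0, extra);
    oc->image = moved;
    oc->size += extra;
    return 0;
}

/* Derive the pointer from the current image address on every use. */
const char *demo_name(const DemoObject *oc)
{
    return (const char *)(oc->image + oc->name_offset);
}

/* Returns 1 if the name is still readable after the image has grown. */
int demo_roundtrip(void)
{
    static const char text[] = "..symbol";
    DemoObject oc;
    int ok;

    oc.size = sizeof text;
    oc.image = malloc(oc.size);
    if (oc.image == NULL) return -1;
    memcpy(oc.image, text, sizeof text);
    oc.name_offset = 2;

    if (demo_grow_image(&oc, (size_t)1 << 20) != 0) {
        free(oc.image);
        return -1;
    }
    ok = strcmp(demo_name(&oc), "symbol") == 0;
    free(oc.image);
    return ok;
}
```

The same effect can be had by ordering the calls so that nothing captures an address before the last possible move, which is what reordering initialisation around the allocation amounts to.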
Test Plan:
See {D5195}, also test under i386:
```
$ uname -a
Linux watashi-arch32 4.18.5-arch1-1.0-ARCH #1 SMP PREEMPT Tue Aug 28
20:45:30 CEST 2018 i686 GNU/Linux
$ cd testsuite/tests/th/ && make test
...
```
will run `./validate` on stacked diff.
Reviewers: simonmar, bgamari, alpmestan, trommler, hvr, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5289
|
|
|
|
| |
This reverts commit 76c8fd674435a652c75a96c85abbf26f1f221876.
| |
- Remove REGISTER_CC and REGISTER_CCS macros, add functions registerCC
and registerCCS to Profiling.c.
- Reduce scope of symbols: CC_LIST, CCS_LIST, CC_ID, CCS_ID
- Document CC_LIST and CCS_LIST
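Turning a registration macro into a function usually lets the list head and the ID counter become file-local. A minimal sketch of that pattern, assuming hypothetical `Demo*` names and an `id == 0` convention for "not yet registered" (neither is taken from Profiling.c):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost-centre record; the real RTS structure has more fields. */
typedef struct DemoCostCentre {
    const char *label;
    long id;                       /* 0 until registered (assumed convention) */
    struct DemoCostCentre *link;   /* next entry in the registration list */
} DemoCostCentre;

/* File-local state: callers can no longer touch the list or the counter. */
static DemoCostCentre *demo_cc_list = NULL;
static long demo_next_cc_id = 1;

/* Register a cost centre once; repeated calls are no-ops. */
void demo_register_cc(DemoCostCentre *cc)
{
    assert(cc != NULL);
    if (cc->id != 0) return;       /* already on the list */
    cc->id = demo_next_cc_id++;
    cc->link = demo_cc_list;
    demo_cc_list = cc;
}

/* Count registered cost centres by walking the list. */
size_t demo_cc_count(void)
{
    size_t n = 0;
    const DemoCostCentre *cc;
    for (cc = demo_cc_list; cc != NULL; cc = cc->link) n++;
    return n;
}
```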
|
| |
Apple dropped support for Mac OS X on PowerPC years ago. We
follow suit and remove PowerPC support for Darwin.
Fixes #16106.
|
| |
Apparently clang doesn't enable implicit fallthrough warnings by default
(http://llvm.org/viewvc/llvm-project?revision=167655&view=revision) when compiling
C, and the attributes cause warnings of their own (#16019).
|
| |
`EventLogWriter.h` doesn't use anything from `Rts.h`, so the include is
redundant. This include is ignored when we include
```
Rts.h -> RtsAPI.h -> rts/EventLogWriter.h -> Rts.h
```
but can cause problems when we include `RtsAPI.h` directly, with
errors like
```
In file included from /usr/lib/ghc-8.6.2/include/RtsAPI.h:20:
In file included from
/usr/lib/ghc-8.6.2/include/rts/EventLogWriter.h:14:
In file included from /usr/lib/ghc-8.6.2/include/Rts.h:185:
/usr/lib/ghc-8.6.2/include/rts/storage/GC.h:187:29: error: unknown type
name 'Capability'
StgPtr allocate ( Capability *cap, W_ n );
```
Test Plan: ./validate
Reviewers: simonmar, bgamari, afarmer, erikd, alexbiehl
Reviewed By: bgamari, alexbiehl
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5395
|
| |
We now allocate the key to the spt on the C stack rather than in the Haskell
heap, which avoids allocating in `unsafeLookupStaticPtr`. This should be
slightly more efficient.
Test Plan: Validated locally
Reviewers: simonmar, hvr, bgamari, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5333
|
| |
This is the first step of implementing:
https://github.com/ghc-proposals/ghc-proposals/pull/74
The main highlights/changes:
primops.txt.pp gets two new sections for two new primitive types for
signed and unsigned 8-bit integers (Int8# and Word8# respectively) along
with basic arithmetic and comparison operations. PrimRep/RuntimeRep get
two new constructors for them. All of the primops translate into the
existing MachOPs.
For CmmCalls the codegen will now zero-extend the values at the call
site (so that they can be moved to the right register) and then truncate
them back to their original width.
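The widen-then-narrow step can be pictured in C. This is only an illustration of zero extension and truncation on fixed-width integers, with hypothetical function names; it is not the code the Cmm code generator emits.

```c
#include <assert.h>
#include <stdint.h>

/* Widen an 8-bit value to a full 64-bit word; the upper 56 bits become 0. */
uint64_t demo_zero_extend8(uint8_t v)
{
    return (uint64_t)v;
}

/* Narrow a word back to 8 bits, discarding everything above bit 7. */
uint8_t demo_truncate8(uint64_t w)
{
    return (uint8_t)(w & 0xFFu);
}

/* Widening followed by narrowing gives back the original 8-bit pattern. */
uint8_t demo_roundtrip8(uint8_t v)
{
    return demo_truncate8(demo_zero_extend8(v));
}
```

Note that zero extension treats the bit pattern of a negative 8-bit value as unsigned: the pattern of -1 widens to 0xFF rather than to an all-ones word, and truncation restores the same eight bits.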
x86 native codegen needed some updates, since it wasn't able to deal
with the new widths, but all the changes are quite localized. LLVM
backend seems to just work.
This is the second attempt at merging this, after the first attempt in
D4475 had to be backed out due to regressions on i386.
Bumps binary submodule.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate (on both x86-{32,64})
Reviewers: bgamari, hvr, goldfire, simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5258
|
| |
Instead of using the GCC `/* fallthrough */` comment syntax we now use
`__attribute__((fallthrough))`, which Phyx says should be more portable
than the former.
Also adds a missing fallthrough annotation in the MachO linker,
fixing #14613.
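One common way to use the attribute without upsetting compilers that lack it is to hide it behind a feature-test macro. A minimal sketch, assuming a hypothetical `DEMO_FALLTHROUGH` macro (the name the RTS uses is not shown in this log):

```c
#include <assert.h>

/* Use the attribute only where the compiler reports support for it;
 * otherwise expand to a harmless no-op expression. */
#if defined(__has_attribute)
#  if __has_attribute(fallthrough)
#    define DEMO_FALLTHROUGH __attribute__((fallthrough))
#  endif
#endif
#ifndef DEMO_FALLTHROUGH
#  define DEMO_FALLTHROUGH ((void)0)
#endif

/* Case 2 deliberately continues into case 1. */
int demo_weight(int kind)
{
    int w = 0;
    switch (kind) {
    case 2:
        w += 10;
        DEMO_FALLTHROUGH;
    case 1:
        w += 1;
        break;
    default:
        break;
    }
    return w;
}
```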
Reviewers: erikd, simonmar
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #14613
Differential Revision: https://phabricator.haskell.org/D5292
|
| |
This introduces the `+RTS -ol` flag, which allows the user to specify the
destination file for eventlog output.
Test Plan: Validate with included test
Reviewers: simonmar, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5293
|
|
|
|
| |
This reverts commit 5403a8636fe82f971234873564f3a05393b89b7a.
|
| |
Summary:
Fixes #14784. Note that C++11 is quite conservative; we could likely accept
C++03 as well.
Test Plan:
```
$ cat >hi.c <<EOF
#include <Rts.h>
EOF
$ g++ -std=c++11 hi.c
```
Reviewers: simonmar, hvr
Subscribers: rwbarton, carter
GHC Trac Issues: #14784
Differential Revision: https://phabricator.haskell.org/D5244
|
| |
Add an RTS option `-xp` to load PIC objects anywhere in the address space. We do
this by relaxing the requirement that the result of `mmapForLinker` lie below
0x80000000 and by implying USE_CONTIGUOUS_MMAP.
We also need to change calls to `ocInit` and `ocGetNames` to avoid
dangling pointers when the address of `oc->image` is changed by
`ocAllocateSymbolExtra`.
Test Plan:
```
$ uname -a
Linux localhost 4.18.8-arch1-1-ARCH #1 SMP PREEMPT Sat Sep 15 20:34:48
UTC 2018 x86_64 GNU/Linux
$ cat mk/build.mk
DYNAMIC_GHC_PROGRAMS = NO
DYNAMIC_BY_DEFAULT = NO
GhcRTSWays += thr_debug
EXTRA_HC_OPTS += -debug
WAY_p_HC_OPTS += -fPIC -fexternal-dynamic-refs
$ inplace/bin/ghc-stage2 --interactive -prof +RTS -xp
GHCi, version 8.7.20180928: http://www.haskell.org/ghc/ :? for help
ghc-stage2: R_X86_64_32 relocation out of range:
ghczmprim_GHCziTypes_ZMZN_closure = 7f690bffab59
Recompile
/data/users/watashi/ghc/libraries/ghc-prim/dist-install/build/HSghc-prim
-0.5.3.o with -fPIC -fexternal-dynamic-refs.
ghc-stage2: unable to load package `ghc-prim-0.5.3'
$ strace -f -e open,mmap inplace/bin/ghc-stage2 --interactive -prof
-fexternal-interpreter -opti+RTS -opti-xp
...
[pid 1355283]
open("/data/users/watashi/ghc/libraries/base/dist-install/build/libHSbas
e-4.12.0.0_p.a", O_RDONLY) = 14
[pid 1355283] mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6a84842000
[pid 1355283]
open("/data/users/watashi/ghc/libraries/base/dist-install/build/libHSbas
e-4.12.0.0_p.a", O_RDONLY) = 14
[pid 1355283] mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6a84676000
...
Prelude> System.Posix.Process.getProcessID
...
[pid 1355283]
open("/data/users/watashi/ghc/libraries/unix/dist-install/build/libHSuni
x-2.7.2.2_p.a", O_RDONLY) = 14
[pid 1355283] mmap(NULL, 45056, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6a67d60000
[pid 1355283]
open("/data/users/watashi/ghc/libraries/unix/dist-install/build/libHSuni
x-2.7.2.2_p.a", O_RDONLY) = 14
[pid 1355283] mmap(NULL, 57344, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f6a67d52000
...
```
```
$ uname -a
Darwin watashis-iMac.local 18.0.0 Darwin Kernel Version 18.0.0: Wed Aug
22 20:13:40 PDT 2018; root:xnu-4903.201.2~1/RELEASE_X86_64 x86_64
$ mv
/Users/watashi/gao/ghc/libraries/integer-gmp/dist-install/build/HSintege
r-gmp-1.0.2.0.o{,._DISABLE_GHC_ISSUE_15105}
$ inplace/bin/ghc-stage2 --interactive +RTS -xp
GHCi, version 8.7.20181003: http://www.haskell.org/ghc/ :? for help
Prelude> System.Posix.Process.getProcessID
42791
Prelude> Data.Set.fromList [1 .. 10]
fromList [1,2,3,4,5,6,7,8,9,10]
Prelude>
Leaving GHCi.
$ inplace/bin/ghc-stage2 --interactive -prof -fexternal-interpreter
GHCi, version 8.7.20181003: http://www.haskell.org/ghc/ :? for help
Prelude> System.Posix.Process.getProcessID
42806
Prelude> Data.Set.fromList [1 .. 10]
fromList [1,2,3,4,5,6,7,8,9,10]
Prelude>
Leaving GHCi.
```
Also tested with something that used to hit the 2 GB limit; it loads
and runs without problems.
Reviewers: simonmar, bgamari, angerman, Phyx, hvr, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5195
|
| |
Reviewers: simonmar
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5186
|
| |
Reviewers: simonmar, bgamari, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #15508
Differential Revision: https://phabricator.haskell.org/D5178
|
| |
As #15571 reports, eager blackholing breaks sanity checks as we can't
zero the payload when eagerly blackholing (because we'll be using the
payload after blackholing), but by the time we blackhole a previously
eagerly blackholed object (in `threadPaused()`) we don't have the
correct size information for the object (because the object's type
becomes BLACKHOLE when we eagerly blackhole it) so can't properly zero
the slop.
This problem can be solved for AP_STACK eager blackholing (which unlike
eager blackholing in general, is not optional) by zeroing the payload
after entering the stack. This patch implements this idea.
Fixes #15571.
Test Plan:
Previously concprog001 when compiled and run with sanity checks
ghc-stage2 Mult.hs -debug -rtsopts
./Mult +RTS -DS
was failing with
Mult: internal error: checkClosure: stack frame
(GHC version 8.7.20180821 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
This patch fixes this panic. The test still panics, but it runs for a while
before panicking (instead of panicking immediately as before), and the new
problem seems unrelated:
Mult: internal error: ASSERTION FAILED: file rts/sm/Sanity.c, line 296
(GHC version 8.7.20180919 for x86_64_unknown_linux)
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
The new problem will be fixed in another diff.
I also tried slow validate (which requires D5164): this does not introduce any
new failures.
Reviewers: simonmar, bgamari, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #15571
Differential Revision: https://phabricator.haskell.org/D5165
|
| |
Long ago, the stable name table and stable pointer tables were one.
Now, they are separate, and have significantly different
implementations. I believe the time has come to finish the split
that began in #7674.
* Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`.
* Give each table its own mutex.
* Add FFI functions `hs_lock_stable_ptr_table` and
`hs_unlock_stable_ptr_table` and document them.
These are intended to replace the previously undocumented
`hs_lock_stable_tables` and `hs_unlock_stable_tables`,
which are now documented as deprecated synonyms.
* Make `eqStableName#` use pointer equality instead of unnecessarily
comparing stable name table indices.
Reviewers: simonmar, bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15555
Differential Revision: https://phabricator.haskell.org/D5084
|
| |
This adds a new primop called traceBinaryEvent# that takes the length
of binary data and a pointer to the data, then emits it to the eventlog.
There is some example code that uses this primop and the new event:
* [traceBinaryEventIO][1] that calls `traceBinaryEvent#`
* [A patch to ghc-events][2] that parses the new `EVENT_USER_BINARY_MSG`
There's no corresponding issue on Trac but it was discussed at
ghc-devs [3].
[1] https://github.com/maoe/ghc-trace-events/blob/fb226011ef1f85a97b4da7cc9d5f98f9fe6316ae/src/Debug/Trace/Binary.hs#L29
[2] https://github.com/maoe/ghc-events/commit/239ca77c24d18cdd10d6d85a0aef98e4a7c56ae6
[3] https://mail.haskell.org/pipermail/ghc-devs/2018-May/015791.html
Reviewers: bgamari, erikd, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5007
|
| |
Test Plan: Validate
Reviewers: simonmar
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5000
|
| |
Reviewers: simonmar, hvr, bgamari, erikd, fryguybob, rrnewton
Reviewed By: simonmar
Subscribers: fryguybob, rwbarton, thomie, carter
GHC Trac Issues: #15364
Differential Revision: https://phabricator.haskell.org/D4884
|
| |
It seems like we currently support string literals in Cmm, so we can use
the `__FILE__` CPP macro in assertion macros. This improves error messages
that previously looked like
ASSERTION FAILED: file (null), line 1302
The (null) part now shows the actual file name.
Also inline some single-use string literals in PrimOps.cmm.
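For comparison, this is how the same kind of message is typically built in plain C, where `__FILE__` expands to a string literal and `__LINE__` to an integer. The helper and macro names are hypothetical, and this is C rather than Cmm, so it only illustrates where the file name in the message comes from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Format an assertion-failure message; a missing file name shows up as
 * "(null)", which is what the old messages looked like. */
int demo_format_failure(char *buf, size_t n, const char *file, unsigned line)
{
    return snprintf(buf, n, "ASSERTION FAILED: file %s, line %u",
                    file != NULL ? file : "(null)", line);
}

/* Capture the location of the macro use via the standard CPP macros. */
#define DEMO_FAILURE_HERE(buf, n) \
    demo_format_failure((buf), (n), __FILE__, (unsigned)__LINE__)

/* Returns 1 if the formatted message for `file`/`line` equals `expected`. */
int demo_message_matches(const char *file, unsigned line, const char *expected)
{
    char buf[128];
    demo_format_failure(buf, sizeof buf, file, line);
    return strcmp(buf, expected) == 0;
}
```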
Reviewers: bgamari, simonmar, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4862
|
| |
SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY
SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN
MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY
MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN
Naming is now consistent with other CLEAN/DIRTY objects (MVAR, MUT_VAR,
MUT_ARR_PTRS).
(alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR)
Removed a few comments in Scav.c about FROZEN0 being on the mut_list
because it's now clear from the closure type.
Reviewers: bgamari, simonmar, erikd
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4784
|
| |
This feature has some very serious correctness issues (#14310),
introduces a great deal of complexity, and hasn't seen wide usage.
Consequently we are removing it, as proposed in Proposal #77 [1]. This
is heavily based on a patch from fryguybob.
Updates stm submodule.
[1] https://github.com/ghc-proposals/ghc-proposals/pull/77
Test Plan: Validate
Reviewers: erikd, simonmar, hvr
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14310
Differential Revision: https://phabricator.haskell.org/D4760
|
| |
Make it clear that max_live_bytes is updated after a major GC whereas
live_bytes is updated after all GCs (including minor collections) and
considers data in uncollected generations as live.
Reviewers: bgamari, simonmar, hvr
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4734
|
| |
Unfortunately, this optimisation is infeasible on MachO platforms (e.g.
Darwin) due to an object format limitation. Specifically, linking fails
with errors of the form:
error: unsupported relocation with subtraction expression, symbol
'_integerzmgmp_GHCziIntegerziType_quotInteger_closure' can not be
undefined in a subtraction expression
Apparently MachO does not permit relocations' subtraction expressions to
refer to undefined symbols. As far as I can tell this means that it is
essentially impossible to express an offset between symbols living in
different compilation units. This means that we likely can't use this
optimisation on MachO platforms.
Test Plan: Validate on Darwin
Reviewers: simonmar, erikd
Subscribers: rwbarton, thomie, carter, angerman
GHC Trac Issues: #15169
Differential Revision: https://phabricator.haskell.org/D4715
|
| |
This pulls parts of Joachim Breitner's ghc-heap-view library inside GHC.
The bits added are the C hooks into the RTS and a basic Haskell wrapper
to these C hooks. The main reason for these to be added to GHC proper
is that the code needs to be kept in sync with the closure types
defined by the RTS. It is expected that the version of HeapView shipped
with GHC will always work with that version of GHC and that extra
functionality can be layered on top with a library like ghc-heap-view
distributed via Hackage.
Test Plan: validate
Reviewers: simonmar, hvr, nomeata, austin, Phyx, bgamari, erikd
Reviewed By: bgamari
Subscribers: carter, patrickdoc, tmcgilchrist, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3055
|
| |
Summary:
See the new note.
This should fix cb5c2fe875965b7aedbc189012803fc62e48fb3f enough
to unbreak Windows and OS X builds.
Test Plan: manual testing with patched gdb
Reviewers: bgamari, simonmar, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4694
|
| |
Summary:
The idea here is to save a little code size and some work in the GC,
by collapsing FUN_STATIC closures and their SRTs.
This is (4) in a series; see D4632 for more details.
There's a tradeoff here: more complexity in the compiler in exchange
for a modest code size reduction (probably around 0.5%).
Results:
* GHC binary itself (statically linked) is 1% smaller
* -0.2% binary sizes in nofib (-0.5% module sizes)
Full nofib results comparing D4634 with this: P177 (ignore runtimes,
these aren't stable on my laptop)
Test Plan: validate, nofib
Reviewers: bgamari, niteria, simonpj, erikd
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4637
|
| |
Summary:
An info table with an SRT normally looks like this:
StgWord64 srt_offset
StgClosureInfo layout
StgWord32 layout
StgWord32 has_srt
But we only need 32 bits for srt_offset on x86_64, because the small
memory model requires that code segments are at most 2GB. So we can
optimise this to
StgClosureInfo layout
StgWord32 layout
StgWord32 srt_offset
saving a word. We can tell whether the info table has an SRT or not,
because zero is not a valid srt_offset, so zero still indicates that
there's no SRT.
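The saving can be checked with two C structs that mirror the listings above. The struct and field names are illustrative only (the second 32-bit `layout` field is called `layout32` here to keep the names distinct), and the real info table declarations in the RTS headers differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with a separate 64-bit offset and a 32-bit "has SRT" flag. */
typedef struct {
    uint64_t srt_offset;
    uint64_t layout;      /* stands in for StgClosureInfo */
    uint32_t layout32;
    uint32_t has_srt;
} DemoInfoWide;

/* Layout with a 32-bit offset stored in the slot the flag used to occupy. */
typedef struct {
    uint64_t layout;      /* stands in for StgClosureInfo */
    uint32_t layout32;
    uint32_t srt_offset;  /* 0 means "no SRT" */
} DemoInfoCompact;

/* Zero is never a valid offset, so it doubles as the absence marker. */
int demo_has_srt(const DemoInfoCompact *info)
{
    return info->srt_offset != 0;
}

/* Bytes saved per info table by the compact layout. */
size_t demo_bytes_saved(void)
{
    return sizeof(DemoInfoWide) - sizeof(DemoInfoCompact);
}
```

On the usual 64-bit ABIs the first struct occupies 24 bytes and the second 16, which is the one-word saving described above.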
Test Plan:
* validate
* For results, see D4632.
Reviewers: bgamari, niteria, osa1, erikd
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4634
|
| |
Summary:
- Previously we would have a single big table of pointers per module,
with a set of bitmaps to reference entries within it. The new
representation is identical to a static constructor, which is much
simpler for the GC to traverse, and we get to remove the complicated
bitmap-traversal code from the GC.
- Rewrite all the code to generate SRTs in CmmBuildInfoTables, and
document it much better (see Note [SRTs]). This has been something
I've wanted to do since we moved to the new code generator, I
finally had the opportunity to finish it while on a transatlantic
flight recently :)
There are a series of 4 diffs:
1. D4632 (this one), which does the bulk of the changes
2. D4633 which adds support for smaller `CmmLabelDiffOff` constants
3. D4634 which takes advantage of D4632 and D4633 to save a word in
info tables that have an SRT on x86_64. This is where most of the
binary size improvement comes from.
4. D4637 which makes a further optimisation to merge some SRTs with
static FUN closures. This adds some complexity and the benefits
are fairly modest, so it's not clear yet whether we should do this.
Results (after (3), on x86_64)
- GHC itself (statically linked) is 5.2% smaller
- -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176
- I measured the overhead of traversing all the static objects in a
major GC in GHC itself by doing `replicateM_ 1000 performGC` as the
first thing in `Main.main`. The new version was 5-10% faster, but
the results did vary quite a bit.
- I'm not sure if there's a compile-time difference, the results are
too unreliable.
Test Plan: validate
Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4632
|
| |
GCC 8 now generates warnings for incompatible function pointer casts
[-Werror=cast-function-type]. Apparently there are a few of those in the rts
code, which makes `./validate` unhappy (since we compile with `-Werror`).
This commit tries to fix these issues by changing the functions to have
the correct type (and, if necessary, moving the casts into those
functions).
For instance, the hash/comparison functions are declared (in `Hash.h`) to take
`StgWord` but we want to use `StgWord64[2]` in `StaticPtrTable.c`.
Instead of casting the function pointers, we can cast the `StgWord`
parameter to `StgWord*`. I think this should be ok since `StgWord`
should be the same size as a pointer.
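A sketch of the approach, with hypothetical `Demo*` types standing in for the declarations in `Hash.h`: the callbacks keep the exact signature the table expects, and the cast from word to pointer happens inside them. It assumes, as the message does, that the word type is pointer-sized (`uintptr_t` here).

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t DemoWord;  /* pointer-sized word, standing in for StgWord */

/* Signatures a hypothetical hash table expects from its callbacks. */
typedef unsigned DemoHashFn(unsigned buckets, DemoWord key);
typedef int DemoCompareFn(DemoWord a, DemoWord b);

/* Hash a two-word fingerprint; the key arrives as a word and is cast to
 * a pointer here, so the function itself needs no function-pointer cast. */
unsigned demo_hash_fingerprint(unsigned buckets, DemoWord key)
{
    const uint64_t *fp = (const uint64_t *)key;
    assert(buckets != 0);
    return (unsigned)((fp[0] ^ fp[1]) % buckets);
}

/* Compare two fingerprints passed as words. */
int demo_compare_fingerprint(DemoWord a, DemoWord b)
{
    const uint64_t *x = (const uint64_t *)a;
    const uint64_t *y = (const uint64_t *)b;
    return x[0] == y[0] && x[1] == y[1];
}

/* Caller side: the key can live in a local array on the C stack. */
unsigned demo_bucket_for(DemoHashFn *hash, unsigned buckets,
                         const uint64_t key[2])
{
    return hash(buckets, (DemoWord)key);
}
```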
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, erikd, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4673
|
| |
This reverts commit cb5c2fe875965b7aedbc189012803fc62e48fb3f.
It appears to have broken OSX and Windows builds.
|
| |
See the new note.
Test Plan:
manual testing with patched gdb
Reviewers: bgamari, simonmar, erikd
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4666
|
| |
Test Plan: Passes validate
Reviewers: simonmar, bgamari, erikd
Subscribers: thomie, carter
GHC Trac Issues: #10296
Differential Revision: https://phabricator.haskell.org/D4627
|
| |
Reviewers: bgamari, simonmar, erikd
Reviewed By: bgamari, simonmar
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4539
|
|
|
|
| |
[skip ci]
|
| |
There should be no change in the output of the '+RTS -s' (summary)
report, or the '+RTS -t' (one-line) report.
All data shown in the summary report is now shown in the machine
readable report.
All data in RTSStats is now shown in the machine readable report.
init times are added to RTSStats and added to GHC.Stats.
Example of the new output:
```
[("bytes allocated", "375016384")
,("num_GCs", "113")
,("average_bytes_used", "148348")
,("max_bytes_used", "206552")
,("num_byte_usage_samples", "2")
,("peak_megabytes_allocated", "6")
,("init_cpu_seconds", "0.001642")
,("init_wall_seconds", "0.001027")
,("mut_cpu_seconds", "3.020166")
,("mut_wall_seconds", "0.757244")
,("GC_cpu_seconds", "0.037750")
,("GC_wall_seconds", "0.009569")
,("exit_cpu_seconds", "0.000890")
,("exit_wall_seconds", "0.002551")
,("total_cpu_seconds", "3.060452")
,("total_wall_seconds", "0.770395")
,("major_gcs", "2")
,("allocated_bytes", "375016384")
,("max_live_bytes", "206552")
,("max_large_objects_bytes", "159344")
,("max_compact_bytes", "0")
,("max_slop_bytes", "59688")
,("max_mem_in_use_bytes", "6291456")
,("cumulative_live_bytes", "296696")
,("copied_bytes", "541024")
,("par_copied_bytes", "493976")
,("cumulative_par_max_copied_bytes", "104104")
,("cumulative_par_balanced_copied_bytes", "274456")
,("fragmentation_bytes", "2112")
,("alloc_rate", "124170795")
,("productivity_cpu_percent", "0.986838")
,("productivity_wall_percent", "0.982935")
,("bound_task_count", "1")
,("sparks_count", "5836258")
,("sparks_converted", "237")
,("sparks_overflowed", "1990408")
,("sparks_dud ", "0")
,("sparks_gcd", "3455553")
,("sparks_fizzled", "390060")
,("work_balance", "0.555606")
,("n_capabilities", "4")
,("task_count", "10")
,("peak_worker_count", "9")
,("worker_count", "9")
,("gc_alloc_block_sync_spin", "162")
,("gc_alloc_block_sync_yield", "0")
,("gc_alloc_block_sync_spin", "162")
,("gc_spin_spin", "18840855")
,("gc_spin_yield", "10355")
,("mut_spin_spin", "70331392")
,("mut_spin_yield", "61700")
,("waitForGcThreads_spin", "241")
,("waitForGcThreads_yield", "2797")
,("whitehole_gc_spin", "0")
,("whitehole_lockClosure_spin", "0")
,("whitehole_lockClosure_yield", "0")
,("whitehole_executeMessage_spin", "0")
,("whitehole_threadPaused_spin", "0")
,("any_work", "1667")
,("no_work", "1662")
,("scav_find_work", "1026")
,("gen_0_collections", "111")
,("gen_0_par_collections", "111")
,("gen_0_cpu_seconds", "0.036126")
,("gen_0_wall_seconds", "0.036126")
,("gen_0_max_pause_seconds", "0.036126")
,("gen_0_avg_pause_seconds", "0.000081")
,("gen_0_sync_spin", "21")
,("gen_0_sync_yield", "0")
,("gen_1_collections", "2")
,("gen_1_par_collections", "1")
,("gen_1_cpu_seconds", "0.001624")
,("gen_1_wall_seconds", "0.001624")
,("gen_1_max_pause_seconds", "0.001624")
,("gen_1_avg_pause_seconds", "0.000272")
,("gen_1_sync_spin", "3")
,("gen_1_sync_yield", "0")
]
```
Test Plan: Ensure that one-line and summary reports are unchanged.
Reviewers: erikd, simonmar, hvr
Subscribers: duog, carter, thomie, rwbarton
GHC Trac Issues: #14660
Differential Revision: https://phabricator.haskell.org/D4529
|
|
|
|
| |
This reverts commit 2d4bda2e4ac68816baba0afab00da6f769ea75a7.
|
| |
There should be no change in the output of the '+RTS -s' (summary)
report, or the '+RTS -t' (one-line) report.
All data shown in the summary report is now shown in the machine
readable report.
All data in RTSStats is now shown in the machine readable report.
init times are added to RTSStats and added to GHC.Stats.
Example of the new output:
```
[("bytes allocated", "375016384")
,("num_GCs", "113")
,("average_bytes_used", "148348")
,("max_bytes_used", "206552")
,("num_byte_usage_samples", "2")
,("peak_megabytes_allocated", "6")
,("init_cpu_seconds", "0.001642")
,("init_wall_seconds", "0.001027")
,("mut_cpu_seconds", "3.020166")
,("mut_wall_seconds", "0.757244")
,("GC_cpu_seconds", "0.037750")
,("GC_wall_seconds", "0.009569")
,("exit_cpu_seconds", "0.000890")
,("exit_wall_seconds", "0.002551")
,("total_cpu_seconds", "3.060452")
,("total_wall_seconds", "0.770395")
,("major_gcs", "2")
,("allocated_bytes", "375016384")
,("max_live_bytes", "206552")
,("max_large_objects_bytes", "159344")
,("max_compact_bytes", "0")
,("max_slop_bytes", "59688")
,("max_mem_in_use_bytes", "6291456")
,("cumulative_live_bytes", "296696")
,("copied_bytes", "541024")
,("par_copied_bytes", "493976")
,("cumulative_par_max_copied_bytes", "104104")
,("cumulative_par_balanced_copied_bytes", "274456")
,("fragmentation_bytes", "2112")
,("alloc_rate", "124170795")
,("productivity_cpu_percent", "0.986838")
,("productivity_wall_percent", "0.982935")
,("bound_task_count", "1")
,("sparks_count", "5836258")
,("sparks_converted", "237")
,("sparks_overflowed", "1990408")
,("sparks_dud ", "0")
,("sparks_gcd", "3455553")
,("sparks_fizzled", "390060")
,("work_balance", "0.555606")
,("n_capabilities", "4")
,("task_count", "10")
,("peak_worker_count", "9")
,("worker_count", "9")
,("gc_alloc_block_sync_spin", "162")
,("gc_alloc_block_sync_yield", "0")
,("gc_alloc_block_sync_spin", "162")
,("gc_spin_spin", "18840855")
,("gc_spin_yield", "10355")
,("mut_spin_spin", "70331392")
,("mut_spin_yield", "61700")
,("waitForGcThreads_spin", "241")
,("waitForGcThreads_yield", "2797")
,("whitehole_gc_spin", "0")
,("whitehole_lockClosure_spin", "0")
,("whitehole_lockClosure_yield", "0")
,("whitehole_executeMessage_spin", "0")
,("whitehole_threadPaused_spin", "0")
,("any_work", "1667")
,("no_work", "1662")
,("scav_find_work", "1026")
,("gen_0_collections", "111")
,("gen_0_par_collections", "111")
,("gen_0_cpu_seconds", "0.036126")
,("gen_0_wall_seconds", "0.036126")
,("gen_0_max_pause_seconds", "0.036126")
,("gen_0_avg_pause_seconds", "0.000081")
,("gen_0_sync_spin", "21")
,("gen_0_sync_yield", "0")
,("gen_1_collections", "2")
,("gen_1_par_collections", "1")
,("gen_1_cpu_seconds", "0.001624")
,("gen_1_wall_seconds", "0.001624")
,("gen_1_max_pause_seconds", "0.001624")
,("gen_1_avg_pause_seconds", "0.000272")
,("gen_1_sync_spin", "3")
,("gen_1_sync_yield", "0")
]
```
Test Plan: Ensure that one-line and summary reports are unchanged.
Reviewers: bgamari, erikd, simonmar, hvr
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14660
Differential Revision: https://phabricator.haskell.org/D4303
|
| |
Summary:
get/setAllocationCounter didn't take into account allocations in the
current block. This was known at the time, but it turns out to be
important to have more accuracy when using these in a fine-grained
way.
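The accuracy problem can be sketched in a few lines. Assume, hypothetically, a per-thread total that is only updated when an allocation block is finished, plus a pointer into the block currently being filled; none of these names come from the RTS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread allocation state. */
typedef struct {
    int64_t finished_bytes;   /* bytes in blocks that are already full */
    const char *block_start;  /* start of the block being filled */
    const char *alloc_ptr;    /* next free byte in that block */
} DemoAllocState;

/* Coarse reading: ignores the partially filled current block, so it only
 * changes in block-sized jumps. */
int64_t demo_coarse_allocated(const DemoAllocState *t)
{
    return t->finished_bytes;
}

/* Precise reading: adds what has been used of the current block so far. */
int64_t demo_precise_allocated(const DemoAllocState *t)
{
    ptrdiff_t in_block = t->alloc_ptr - t->block_start;
    assert(in_block >= 0);
    return t->finished_bytes + (int64_t)in_block;
}
```

With the coarse reading, two measurements taken within the same block differ by 0, which matches the runs of `+0` in the "before" output of the test plan.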
Test Plan:
New unit test to test incrementally larger allocations. Before I got
results like this:
```
+0
+0
+0
+0
+0
+4096
+0
+0
+0
+0
+0
+4064
+0
+0
+4088
+4056
+0
+0
+0
+4088
+4096
+4056
+4096
```
Notice how the results aren't always monotonically increasing. After
this patch:
```
+344
+416
+488
+560
+632
+704
+776
+848
+920
+992
+1064
+1136
+1208
+1280
+1352
+1424
+1496
+1568
+1640
+1712
+1784
+1856
+1928
+2000
+2072
+2144
```
Reviewers: hvr, erikd, simonmar, jrtc27, trommler
Reviewed By: simonmar
Subscribers: trommler, jrtc27, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4363
|
| |
The existing internal counters:
* gc_alloc_block_sync
* whitehole_spin
* gen[0].sync
* gen[1].sync
are now not shown in the -s report unless --internal-counters is also passed.
If --internal-counters is passed we now show the counters above, reformatted, as
well as several other counters. In particular, we now count the yieldThread()
calls that SpinLocks do as well as their spins.
The added counters are:
* gc_spin (spin and yield)
* mut_spin (spin and yield)
* whitehole_threadPaused (spin only)
* whitehole_executeMessage (spin only)
* whitehole_lockClosure (spin only)
* waitForGcThreads (spin and yield)
As well as the following, which are not SpinLock-like things:
* any_work
* no_work
* scav_find_work
See the Note for descriptions of what these counters are.
We add busy_wait_nops in these loops along with the counter increment where it
was absent.
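A spin lock that counts both kinds of waiting might look like the sketch below. It uses C11 atomics and POSIX `sched_yield()` as a stand-in for `yieldThread()`; the type, the constant and the bounded acquire helper are all hypothetical and are not the RTS SpinLock implementation.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_SPINS_PER_YIELD 1000  /* assumed tuning constant */

typedef struct {
    atomic_flag held;
    atomic_uint_fast64_t spins;    /* failed attempts while busy-waiting */
    atomic_uint_fast64_t yields;   /* times the waiter gave up the CPU */
} DemoSpinLock;

void demo_spin_init(DemoSpinLock *l)
{
    atomic_flag_clear(&l->held);
    atomic_init(&l->spins, 0);
    atomic_init(&l->yields, 0);
}

/* Try to take the lock, yielding at most `max_yields` times.
 * Returns 1 on success and 0 if the lock was still held at the end. */
int demo_spin_acquire_bounded(DemoSpinLock *l, unsigned max_yields)
{
    unsigned rounds = 0;
    for (;;) {
        int i;
        for (i = 0; i < DEMO_SPINS_PER_YIELD; i++) {
            if (!atomic_flag_test_and_set_explicit(&l->held,
                                                   memory_order_acquire)) {
                return 1;
            }
            atomic_fetch_add_explicit(&l->spins, 1, memory_order_relaxed);
        }
        if (rounds == max_yields) return 0;
        rounds++;
        atomic_fetch_add_explicit(&l->yields, 1, memory_order_relaxed);
        sched_yield();
    }
}

void demo_spin_release(DemoSpinLock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

uint64_t demo_spin_spins(DemoSpinLock *l)  { return atomic_load(&l->spins); }
uint64_t demo_spin_yields(DemoSpinLock *l) { return atomic_load(&l->yields); }
```

Keeping the two counters separate is what makes a table with distinct "Spins" and "Yields" columns possible: many spins with few yields suggests short waits, while a high yield count suggests the lock is held for long stretches.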
Old internal counters output:
```
gc_alloc_block_sync: 0
whitehole_gc_spin: 0
gen[0].sync: 0
gen[1].sync: 0
```
New internal counters output:
```
Internal Counters:
Spins Yields
gc_alloc_block_sync 323 0
gc_spin 9016713 752
mut_spin 57360944 47716
whitehole_gc 0 n/a
whitehole_threadPaused 0 n/a
whitehole_executeMessage 0 n/a
whitehole_lockClosure 0 0
waitForGcThreads 2 415
gen[0].sync 6 0
gen[1].sync 1 0
any_work 2017
no_work 2014
scav_find_work 1004
```
Test Plan:
./validate
Check it builds with #define PROF_SPIN removed from includes/rts/Config.h
Reviewers: bgamari, erikd, simonmar, hvr
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #3553, #9221
Differential Revision: https://phabricator.haskell.org/D4302
|