| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves all URL references to Trac Wiki to their corresponding
GitLab counterparts.
This substitution is classified as follows:
1. Automated substitution using sed with Ben's mapping rule [1]
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...
2. Manual substitution for URLs containing `#` index
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz
3. Manual substitution for strings starting with `Commentary`
Old: Commentary/XxxYyy...
New: commentary/xxx-yyy...
See also !539
[1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
|
| |
|
|
|
|
| |
Our new CPP linter enforces this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This both says what we mean and silences a bunch of spurious CPP linting
warnings. This pragma is supported by all CPP implementations which we
support.
Reviewers: austin, erikd, simonmar, hvr
Reviewed By: simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3482
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes Cmm generation required to produce histograms when
compiling with -ticky flag, strips dead code from rts/Ticky.c and
reworks it to use a shared constant in both C and Haskell code.
Fixes #8308.
Test Plan: T8308
Reviewers: jstolarek, simonpj, austin
Reviewed By: simonpj
Subscribers: mpickering, simonpj, bgamari, mlen, thomie, jstolarek
Differential Revision: https://phabricator.haskell.org/D931
GHC Trac Issues: #8308
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This major patch implements the cardinality analysis described
in our paper "Higher order cardinality analysis". It is joint
work with Ilya Sergey and Dimitrios Vytiniotis.
The basic is augment the absence-analysis part of the demand
analyser so that it can tell when something is used
never
at most once
some other way
The "at most once" information is used
a) to enable transformations, and
in particular to identify one-shot lambdas
b) to allow updates on thunks to be omitted.
There are two new flags, mainly there so you can do performance
comparisons:
-fkill-absence stops GHC doing absence analysis at all
-fkill-one-shot stops GHC spotting one-shot lambdas
and single-entry thunks
The big changes are:
* The Demand type is substantially refactored. In particular
the UseDmd is factored as follows
data UseDmd
= UCall Count UseDmd
| UProd [MaybeUsed]
| UHead
| Used
data MaybeUsed = Abs | Use Count UseDmd
data Count = One | Many
Notice that UCall recurses straight to UseDmd, whereas
UProd goes via MaybeUsed.
The "Count" embodies the "at most once" or "many" idea.
* The demand analyser itself was refactored a lot
* The previously ad-hoc stuff in the occurrence analyser for foldr and
build goes away entirely. Before if we had build (\cn -> ...x... )
then the "\cn" was hackily made one-shot (by spotting 'build' as
special. That's essential to allow x to be inlined. Now the
occurrence analyser propagates info gotten from 'build's stricness
signature (so build isn't special); and that strictness sig is
in turn derived entirely automatically. Much nicer!
* The ticky stuff is improved to count single-entry thunks separately.
One shortcoming is that there is no DEBUG way to spot if an
allegedly-single-entry thunk is acually entered more than once. It
would not be hard to generate a bit of code to check for this, and it
would be reassuring. But it's fiddly and I have not done it.
Despite all this fuss, the performance numbers are rather under-whelming.
See the paper for more discussion.
nucleic2 -0.8% -10.9% 0.10 0.10 +0.0%
sphere -0.7% -1.5% 0.08 0.08 +0.0%
--------------------------------------------------------------------------------
Min -4.7% -10.9% -9.3% -9.3% -50.0%
Max -0.4% +0.5% +2.2% +2.3% +7.4%
Geometric Mean -0.8% -0.2% -1.3% -1.3% -1.8%
I don't quite know how much credence to place in the runtime changes,
but movement seems generally in the right direction.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* the new StgCmmArgRep module breaks a dependency cycle; I also
untabified it, but made no real changes
* updated the documentation in the wiki and change the user guide to
point there
* moved the allocation enters for ticky and CCS to after the heap check
* I left LDV where it was, which was before the heap check at least
once, since I have no idea what it is
* standardized all (active?) ticky alloc totals to bytes
* in order to avoid double counting StgCmmLayout.adjustHpBackwards
no longer bumps ALLOC_HEAP_ctr
* I resurrected the SLOW_CALL counters
* the new module StgCmmArgRep breaks cyclic dependency between
Layout and Ticky (which the SLOW_CALL counters cause)
* renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
* added ALLOC_RTS_ctr and _tot ticky counters
* eg allocation by Storage.c:allocate or a BUILD_PAP in stg_ap_*_info
* resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and
ALLOC_PRIM
* added -ticky and -DTICKY_TICKY in ways.mk for debug ways
* added a ticky counter for total LNE entries
* new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
* all off by default
* -ticky-allocd: tracks allocation *of* closure in addition to
allocation *by* that closure
* -ticky-dyn-thunk tracks dynamic thunks as if they were functions
* -ticky-LNE tracks LNEs as if they were functions
* updated the ticky report format, including making the argument
categories (more?) accurate again
* the printed name for things in the report include the unique of
their ticky parent as well as if they are not top-level
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes two changes to the way stacks are managed:
1. The stack is now stored in a separate object from the TSO.
This means that it is easier to replace the stack object for a thread
when the stack overflows or underflows; we don't have to leave behind
the old TSO as an indirection any more. Consequently, we can remove
ThreadRelocated and deRefTSO(), which were a pain.
This is obviously the right thing, but the last time I tried to do it
it made performance worse. This time I seem to have cracked it.
2. Stacks are now represented as a chain of chunks, rather than
a single monolithic object.
The big advantage here is that individual chunks are marked clean or
dirty according to whether they contain pointers to the young
generation, and the GC can avoid traversing clean stack chunks during
a young-generation collection. This means that programs with deep
stacks will see a big saving in GC overhead when using the default GC
settings.
A secondary advantage is that there is much less copying involved as
the stack grows. Programs that quickly grow a deep stack will see big
improvements.
In some ways the implementation is simpler, as nothing special needs
to be done to reclaim stack as the stack shrinks (the GC just recovers
the dead stack chunks). On the other hand, we have to manage stack
underflow between chunks, so there's a new stack frame
(UNDERFLOW_FRAME), and we now have separate TSO and STACK objects.
The total amount of code is probably about the same as before.
There are new RTS flags:
-ki<size> Sets the initial thread stack size (default 1k) Egs: -ki4k -ki2m
-kc<size> Sets the stack chunk size (default 32k)
-kb<size> Sets the stack chunk buffer size (default 1k)
-ki was previously called just -k, and the old name is still accepted
for backwards compatibility. These new options are documented.
|
|
|
|
|
|
|
| |
This is a follow up to the patch tha fixes Trac #3439.
We had forgotten the dynamic linker, which needs to
know all these ticky symbols too.
|
| |
|
| |
|
|
|
|
|
|
|
| |
I've updated the wiki page about the RTS headers
http://hackage.haskell.org/trac/ghc/wiki/Commentary/SourceTree/Includes
to reflect the new layout and explain some of the rationale. All the
header files now point to this page.
|
|
The first phase of this tidyup is focussed on the header files, and in
particular making sure we are exposinng publicly exactly what we need
to, and no more.
- Rts.h now includes everything that the RTS exposes publicly,
rather than a random subset of it.
- Most of the public header files have moved into subdirectories, and
many of them have been renamed. But clients should not need to
include any of the other headers directly, just #include the main
public headers: Rts.h, HsFFI.h, RtsAPI.h.
- All the headers needed for via-C compilation have moved into the
stg subdirectory, which is self-contained. Most of the headers for
the rest of the RTS APIs have moved into the rts subdirectory.
- I left MachDeps.h where it is, because it is so widely used in
Haskell code.
- I left a deprecated stub for RtsFlags.h in place. The flag
structures are now exposed by Rts.h.
- Various internal APIs are no longer exposed by public header files.
- Various bits of dead code and declarations have been removed
- More gcc warnings are turned on, and the RTS code is more
warning-clean.
- More source files #include "PosixSource.h", and hence only use
standard POSIX (1003.1c-1995) interfaces.
There is a lot more tidying up still to do, this is just the first
pass. I also intend to standardise the names for external RTS APIs
(e.g use the rts_ prefix consistently), and declare the internal APIs
as hidden for shared libraries.
|