| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch in in preparation for the fix to Trac #13397
The code generator has a special case for
case tagToEnum (a>#b) of
False -> e1
True -> e2
but it was not doing nearly so well on
case a>#b of
DEFAULT -> e1
1# -> e2
This patch arranges to behave essentially identically in
both cases. In due course we can eliminate the special
case for tagToEnum#, once we've completed Trac #13397.
The changes are:
* Make CmmSink swizzle the order of a conditional where necessary;
see Note [Improving conditionals] in CmmSink
* Hack the general case of StgCmmExpr.cgCase so that it use
NoGcInAlts for conditionals. This doesn't seem right, but it's
the same choice as the tagToEnum version. Without it, code size
increases a lot (more heap checks).
There's a loose end here.
* Add comments in CmmOpt.cmmMachOpFoldM
|
|
|
|
| |
Just a simple refactoring to remove duplication
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide PowerPC optimised implementations of callish prim ops.
MO_?_QuotRem
The generic implementation of quotient remainder prim ops uses
a division and a remainder operation. There is no remainder on
PowerPC and so we need to implement remainder "by hand" which
results in a duplication of the divide operation when using the
generic code.
Avoid this duplication by implementing the prim op in the native
code generator.
MO_U_Mul2
Use PowerPC's instructions for long multiplication.
Addition and subtraction
Use PowerPC add/subtract with carry/overflow instructions
MO_Clz and MO_Ctz
Use PowerPC's CNTLZ instruction and implement count trailing
zeros using count leading zeros
MO_QuotRem2
Implement an algorithm given by Henry Warren in "Hacker's Delight"
using PowerPC divide instruction. TODO: Use long division instructions
when available (POWER7 and later).
Test Plan: validate on AIX and 32-bit Linux
Reviewers: simonmar, erikd, hvr, austin, bgamari
Reviewed By: erikd, hvr, bgamari
Subscribers: trofi, kgardas, thomie
Differential Revision: https://phabricator.haskell.org/D2973
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before ghc-7.2 hs_add_root() had to be used to initialize haskell
modules when haskell was called from FFI.
commit a52ff7619e8b7d74a9d933d922eeea49f580bca8
("Change the way module initialisation is done (#3252, #4417)")
removed needs for hs_add_root() and made function a no-op.
For backward compatibility '__stginit_<module>' symbol was
not removed.
This change removes no-op hs_add_root() function and unused
'__stginit_<module>' symbol from each haskell module.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Test Plan: ./validate
Reviewers: simonmar, austin, bgamari, erikd
Reviewed By: simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3460
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This refactoring makes it more obvious when we are constructing
a Node for the digraph rather than a less useful 3-tuple.
Reviewers: austin, goldfire, bgamari, simonmar, dfeuer
Reviewed By: dfeuer
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3414
|
|
|
|
|
|
|
|
| |
Trac #13394, comment:4 showed up another place where we were testing
for the representation of of a type; and it turned out to be a JoinId
which can be rep-polymorphic.
Just putting the test in the right places solves this easily.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have this in libraries/base/GHC/Float.hs:
```
abs x | x == 0 = 0 -- handles (-0.0)
| x > 0 = x
| otherwise = negateFloat x
```
But 3-4 years ago it was noted that this was inefficient:
https://mail.haskell.org/pipermail/libraries/2013-April/019690.html
We can generate better code for X86 and llvm and for others generate
some custom cmm code which is similar to what the compiler generates
now.
Reviewers: austin, simonmar, hvr, bgamari
Reviewed By: bgamari
Subscribers: dfeuer, thomie
Differential Revision: https://phabricator.haskell.org/D3265
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And use to mark `stg_stack_underflow_frame`, which we are unable to
determine a caller from.
To simplify parsing at the moment we steal the `return` keyword to
indicate an undefined unwind value. Perhaps this should be revisited.
Reviewers: scpmw, simonmar, austin, erikd
Subscribers: dfeuer, thomie
Differential Revision: https://phabricator.haskell.org/D2738
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack
unwinding information produced by GHC is currently quite approximate.
Essentially we assume that register values do not change at all within a
basic block. While this is somewhat true in normal Haskell code, blocks
containing foreign calls often break this assumption. This results in
unreliable call stacks, especially in the code containing foreign calls.
This is worse than it sounds as unreliable unwinding information can at
times result in segmentation faults.
This patch set attempts to improve this situation by tracking unwinding
information with finer granularity. By dispensing with the assumption of
one unwinding table per block, we allow the compiler to accurately
represent the areas surrounding foreign calls.
Towards this end we generalize the representation of unwind information
in the backend in three ways,
* Multiple CmmUnwind nodes can occur per block
* CmmUnwind nodes can now carry unwind information for multiple
registers (while not strictly necessary; this makes emitting
unwinding information a bit more convenient in the compiler)
* The NCG backend is given an opportunity to modify the unwinding
records since it may need to make adjustments due to, for instance,
native calling convention requirements for foreign calls (see
#11353).
This sets the stage for resolving #11337 and #11338.
Test Plan: Validate
Reviewers: scpmw, simonmar, austin, erikd
Subscribers: qnikst, thomie
Differential Revision: https://phabricator.haskell.org/D2741
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commits relaxes the invariants of the Core syntax so that a
top-level variable can be bound to a primitive string literal of type
Addr#.
This commit:
* Relaxes the invatiants of the Core, and allows top-level bindings whose
type is Addr# as long as their RHS is either a primitive string literal or
another variable.
* Allows the simplifier and the full-laziness transformer to float out
primitive string literals to the top leve.
* Introduces the new StgGenTopBinding type to accomodate top-level Addr#
bindings.
* Introduces a new type of labels in the object code, with the suffix "_bytes",
for exported top-level Addr# bindings.
* Makes some built-in rules more robust. This was necessary to keep them
functional after the above changes.
This is a continuation of D2554.
Rebasing notes:
This had two slightly suspicious performance regressions:
* T12425: bytes allocated regressed by roughly 5%
* T4029: bytes allocated regressed by a bit over 1%
* T13035: bytes allocated regressed by a bit over 5%
These deserve additional investigation.
Rebased by: bgamari.
Test Plan: ./validate --slow
Reviewers: goldfire, trofi, simonmar, simonpj, austin, hvr, bgamari
Reviewed By: trofi, simonpj, bgamari
Subscribers: trofi, simonpj, gridaphobe, thomie
Differential Revision: https://phabricator.haskell.org/D2605
GHC Trac Issues: #8472
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit implements the proposal in
https://github.com/ghc-proposals/ghc-proposals/pull/29 and
https://github.com/ghc-proposals/ghc-proposals/pull/35.
Here are some of the pieces of that proposal:
* Some of RuntimeRep's constructors have been shortened.
* TupleRep and SumRep are now parameterized over a list of RuntimeReps.
* This
means that two types with the same kind surely have the same
representation.
Previously, all unboxed tuples had the same kind, and thus the fact
above was
false.
* RepType.typePrimRep and friends now return a *list* of PrimReps. These
functions can now work successfully on unboxed tuples. This change is
necessary because we allow abstraction over unboxed tuple types and so
cannot
always handle unboxed tuples specially as we did before.
* We sometimes have to create an Id from a PrimRep. I thus split PtrRep
* into
LiftedRep and UnliftedRep, so that the created Ids have the right
strictness.
* The RepType.RepType type was removed, as it didn't seem to help with
* much.
* The RepType.repType function is also removed, in favor of typePrimRep.
* I have waffled a good deal on whether or not to keep VoidRep in
TyCon.PrimRep. In the end, I decided to keep it there. PrimRep is *not*
represented in RuntimeRep, and typePrimRep will never return a list
including
VoidRep. But it's handy to have in, e.g., ByteCodeGen and friends. I can
imagine another design choice where we have a PrimRepV type that is
PrimRep
with an extra constructor. That seemed to be a heavier design, though,
and I'm
not sure what the benefit would be.
* The last, unused vestiges of # (unliftedTypeKind) have been removed.
* There were several pretty-printing bugs that this change exposed;
* these are fixed.
* We previously checked for levity polymorphism in the types of binders.
* But we
also must exclude levity polymorphism in function arguments. This is
hard to check
for, requiring a good deal of care in the desugarer. See Note [Levity
polymorphism
checking] in DsMonad.
* In order to efficiently check for levity polymorphism in functions, it
* was necessary
to add a new bit of IdInfo. See Note [Levity info] in IdInfo.
* It is now safe for unlifted types to be unsaturated in Core. Core Lint
* is updated
accordingly.
* We can only know strictness after zonking, so several checks around
* strictness
in the type-checker (checkStrictBinds, the check for unlifted variables
under a ~
pattern) have been moved to the desugarer.
* Along the way, I improved the treatment of unlifted vs. banged
* bindings. See
Note [Strict binds checks] in DsBinds and #13075.
* Now that we print type-checked source, we must be careful to print
* ConLikes correctly.
This is facilitated by a new HsConLikeOut constructor to HsExpr.
Particularly troublesome
are unlifted pattern synonyms that get an extra void# argument.
* Includes a submodule update for haddock, getting rid of #.
* New testcases:
typecheck/should_fail/StrictBinds
typecheck/should_fail/T12973
typecheck/should_run/StrictPats
typecheck/should_run/T12809
typecheck/should_fail/T13105
patsyn/should_fail/UnliftedPSBind
typecheck/should_fail/LevPolyBounded
typecheck/should_compile/T12987
typecheck/should_compile/T11736
* Fixed tickets:
#12809
#12973
#11736
#13075
#12987
* This also adds a test case for #13105. This test case is
* "compile_fail" and
succeeds, because I want the testsuite to monitor the error message.
When #13105 is fixed, the test case will compile cleanly.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some cases of wrong stacks being generated by the profiler.
For background and details on the fix see
`Note [Evaluating functions with profiling]` in `rts/Apply.cmm`.
This does have an impact on allocations for some programs when
profiling. nofib results:
```
k-nucleotide +0.0% +8.8% +11.0% +11.0% 0.0%
puzzle +0.0% +12.5% 0.244 0.246 0.0%
typecheck 0.0% +8.7% +16.1% +16.2% 0.0%
------------------------------------------------------------------------
--------
Min -0.0% -0.0% -34.4% -35.5% -25.0%
Max +0.0% +12.5% +48.9% +49.4% +10.6%
Geometric Mean +0.0% +0.6% +2.0% +1.8% -0.3%
```
But runtimes don't seem to be affected much, and the examples I looked
at were completely legitimate. For example, in puzzle we have this:
```
position :: ItemType -> StateType -> BankType
position Bono = bonoPos
position Edge = edgePos
position Larry = larryPos
position Adam = adamPos
```
where the identifiers on the rhs are all record selectors. Previously
the profiler gave a stack that looked like
```
position
bonoPos
...
```
i.e. `bonoPos` was at the same level of the call stack as `position`,
but now it looks like
```
position
bonoPos
...
```
I used the normaliser from the testsuite to diff the profiling output
from other nofib programs and they all looked better.
Test Plan:
* the broken test passes
* validate
* compiled and ran all of nofib, measured perf, diff'd several .prof
files
Reviewers: niteria, erikd, austin, scpmw, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2804
GHC Trac Issues: #5654, #10007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This seems like a clearer name and the fewer functions that
one needs to remember, the better.
Test Plan: validate
Reviewers: austin, simonmar, michalt
Reviewed By: simonmar, michalt
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2735
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds likeliness annotations to heap and and stack checks and
modifies the llvm codegen to recognize those to help it generate better
code.
So with this patch
```
...
if ((Sp + 8) - 24 < SpLim) (likely: False) goto c23c; else goto c23d;
...
```
roughly generates:
```
%ln23k = icmp ult i64 %ln23j, %SpLim_Arg
%ln23m = call ccc i1 (i1, i1) @llvm.expect.i1( i1 %ln23k, i1 0 )
br i1 %ln23m, label %c23c, label %c23d
```
Note the call to `llvm.expect` which denotes the expected result for
the comparison.
Test Plan: Look at assembler code with and without this patch. If the
heap-checks moved out of the way we are happy.
Reviewers: austin, simonmar, bgamari
Reviewed By: bgamari
Subscribers: michalt, thomie
Differential Revision: https://phabricator.haskell.org/D2688
GHC Trac Issues: #8321
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
We currently have two info tables for a constructor
* XXX_con_info: the info table for a heap-resident instance of the
constructor, It has type CONSTR, or one of the specialised types like
CONSTR_1_0
* XXX_static_info: the info table for a static instance of this
constructor, which has type CONSTR_STATIC or CONSTR_STATIC_NOCAF.
I'm getting rid of the latter, and using the `con_info` info table for
both static and dynamic constructors. For rationale and more details
see Note [static constructors] in SMRep.hs.
I also removed these macros: `isSTATIC()`, `ip_STATIC()`,
`closure_STATIC()`, since they relied on the CONSTR/CONSTR_STATIC
distinction, and anyway HEAP_ALLOCED() does the same job.
Test Plan: validate
Reviewers: bgamari, simonpj, austin, gcampax, hvr, niteria, erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2690
GHC Trac Issues: #12455
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`Lcall` enters the closure. If it has tags we jump directly to `Lret`.
Confirmed with some generated cmm code:
```
R1 = _s2pP::P64;
Sp = Sp - 8;
if (R1 & 7 != 0) goto c2x0; else goto c2x1;
c2x1:
call (I64[R1])(R1) returns to c2x0, args: 8, res: 8, upd: 8;
c2x0:
_s2pQ::P64 = R1;
```
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On architectures with weak memory consistency a write barrier
is needed before the write to the pointer array.
Fixes #12469
Test Plan: rebuilt Stackage nightly twice on powerpc64le
Reviewers: hvr, rrnewton, erikd, austin, simonmar, bgamari
Reviewed By: erikd, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2525
GHC Trac Issues: #12469
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New unarise (714bebf) eliminates void binders in patterns already, so no
need to eliminate them here. I leave assertions to make sure this is the
case.
Assertion failure -> bug in unarise
Reviewers: bgamari, simonpj, austin, simonmar, hvr
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2416
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test Plan: Good question
Reviewers: austin, trommler, simonmar, rrnewton
Reviewed By: simonmar
Subscribers: RyanGlScott, thomie
Differential Revision: https://phabricator.haskell.org/D2495
GHC Trac Issues: #12469
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea behind adding special "rubbish" arguments was in unboxed sum types
depending on the tag some arguments are not used and we don't want to move some
special values (like 0 for literals and some special pointer for boxed slots)
for those arguments (to stack locations or registers). "StgRubbishArg" was an
indicator to the code generator that the value won't be used. During Stg-to-Cmm
we were then not generating any move or store instructions at all.
This caused problems in the register allocator because some variables were only
initialized in some code paths. As an example, suppose we have this STG: (after
unarise)
Lib.$WT =
\r [dt_sit]
case
case dt_sit of {
Lib.F dt_siv [Occ=Once] ->
(#,,#) [1# dt_siv StgRubbishArg::GHC.Prim.Int#];
Lib.I dt_siw [Occ=Once] ->
(#,,#) [2# StgRubbishArg::GHC.Types.Any dt_siw];
}
of
dt_six
{ (#,,#) us_giC us_giD us_giE -> Lib.T [us_giC us_giD us_giE];
};
This basically unpacks a sum type to an unboxed sum with 3 fields, and then
moves the unboxed sum to a constructor (`Lib.T`).
This is the Cmm for the inner case expression (case expression in the scrutinee
position of the outer case):
ciN:
...
-- look at dt_sit's tag
if (_ciT::P64 != 1) goto ciS; else goto ciR;
ciS: -- Tag is 2, i.e. Lib.F
_siw::I64 = I64[_siu::P64 + 6];
_giE::I64 = _siw::I64;
_giD::P64 = stg_RUBBISH_ENTRY_info;
_giC::I64 = 2;
goto ciU;
ciR: -- Tag is 1, i.e. Lib.I
_siv::P64 = P64[_siu::P64 + 7];
_giD::P64 = _siv::P64;
_giC::I64 = 1;
goto ciU;
Here one of the blocks `ciS` and `ciR` is executed and then the execution
continues to `ciR`, but only `ciS` initializes `_giE`, in the other branch
`_giE` is not initialized, because it's "rubbish" in the STG and so we don't
generate an assignment during code generator. The code generator then panics
during the register allocations:
ghc-stage1: panic! (the 'impossible' happened)
(GHC version 8.1.20160722 for x86_64-unknown-linux):
LocalReg's live-in to graph ciY {_giE::I64}
(`_giD` is also "rubbish" in `ciS`, but it's still initialized because it's a
pointer slot, we have to initialize it otherwise garbage collector follows the
pointer to some random place. So we only remove assignment if the "rubbish" arg
has unboxed type.)
This patch removes `StgRubbishArg` and `CmmArg`. We now always initialize
rubbish slots. If the slot is for boxed types we use the existing `absentError`,
otherwise we initialize the slot with literal 0.
Reviewers: simonpj, erikd, austin, simonmar, bgamari
Reviewed By: erikd
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2446
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This patch implements primitive unboxed sum types, as described in
https://ghc.haskell.org/trac/ghc/wiki/UnpackedSumTypes.
Main changes are:
- Add new syntax for unboxed sums types, terms and patterns. Hidden
behind `-XUnboxedSums`.
- Add unlifted unboxed sum type constructors and data constructors,
extend type and pattern checkers and desugarer.
- Add new RuntimeRep for unboxed sums.
- Extend unarise pass to translate unboxed sums to unboxed tuples right
before code generation.
- Add `StgRubbishArg` to `StgArg`, and a new type `CmmArg` for better
code generation when sum values are involved.
- Add user manual section for unboxed sums.
Some other changes:
- Generalize `UbxTupleRep` to `MultiRep` and `UbxTupAlt` to
`MultiValAlt` to be able to use those with both sums and tuples.
- Don't use `tyConPrimRep` in `isVoidTy`: `tyConPrimRep` is really
wrong, given an `Any` `TyCon`, there's no way to tell what its kind
is, but `kindPrimRep` and in turn `tyConPrimRep` returns `PtrRep`.
- Fix some bugs on the way: #12375.
Not included in this patch:
- Update Haddock for new the new unboxed sum syntax.
- `TemplateHaskell` support is left as future work.
For reviewers:
- Front-end code is mostly trivial and adapted from unboxed tuple code
for type checking, pattern checking, renaming, desugaring etc.
- Main translation routines are in `RepType` and `UnariseStg`.
Documentation in `UnariseStg` should be enough for understanding
what's going on.
Credits:
- Johan Tibell wrote the initial front-end and interface file
extensions.
- Simon Peyton Jones reviewed this patch many times, wrote some code,
and helped with debugging.
Reviewers: bgamari, alanz, goldfire, RyanGlScott, simonpj, austin,
simonmar, hvr, erikd
Reviewed By: simonpj
Subscribers: Iceland_jack, ggreif, ezyang, RyanGlScott, goldfire,
thomie, mpickering
Differential Revision: https://phabricator.haskell.org/D2259
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings in initial support for compact regions, as described in the
ICFP 2015 paper "Efficient Communication and Collection with Compact
Normal Forms" (Edward Z. Yang et.al.) and implemented by Giovanni
Campagna.
Some things may change before the 8.2 release, but I (Simon M.) wanted
to get the main patch committed so that we can iterate.
What documentation there is is in the Data.Compact module in the new
compact package. We'll need to extend and polish the documentation
before the release.
Test Plan:
validate
(new test cases included)
Reviewers: ezyang, simonmar, hvr, bgamari, austin
Subscribers: vikraman, Yuras, RyanGlScott, qnikst, mboes, facundominguez, rrnewton, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D1264
GHC Trac Issues: #11493
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to remove the `Ord Unique` instance because there's
no way to implement it in deterministic way and it's too
easy to use by accident.
We sometimes compute SCC for datatypes whose Ord instance
is implemented in terms of Unique. The Ord constraint on
SCC is just an artifact of some internal data structures.
We can have an alternative implementation with a data
structure that uses Uniquable instead.
This does exactly that and I'm pleased that I didn't have
to introduce any duplication to do that.
Test Plan:
./validate
I looked at performance tests and it's a tiny bit better.
Reviewers: bgamari, simonmar, ezyang, austin, goldfire
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2359
GHC Trac Issues: #4012
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With TypeInType Richard combined ForAllTy and FunTy, but that was often
awkward, and yielded little benefit becuase in practice the two were
always treated separately. This patch re-introduces FunTy. Specfically
* New type
data TyVarBinder = TvBndr TyVar VisibilityFlag
This /always/ has a TyVar it. In many places that's just what
what we want, so there are /lots/ of TyBinder -> TyVarBinder changes
* TyBinder still exists:
data TyBinder = Named TyVarBinder | Anon Type
* data Type = ForAllTy TyVarBinder Type
| FunTy Type Type
| ....
There are a LOT of knock-on changes, but they are all routine.
The Haddock submodule needs to be updated too
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The histogram types are defined in `Ticky.c` as `StgInt` values.
```
EXTERN StgInt RET_NEW_hst[TICKY_BIN_COUNT] INIT({0});
EXTERN StgInt RET_OLD_hst[TICKY_BIN_COUNT] INIT({0});
EXTERN StgInt RET_UNBOXED_TUP_hst[TICKY_BIN_COUNT] INIT({0});
```
which means they'll be `32-bits` on `x86` and `64-bits` on `x86_64`.
However the `bumpHistogram` in `StgCmmTicky` is incrementing them as if
they're a `cLong`. A long on Windows `x86_64` is `32-bit`.
As such when then value for the `_hst_1` is being set what it's actually doing
is setting the value of the high bits of the first entry.
This ends up giving us `0b100000000000000000000000000000000` or `4294967296`
as is displayed in the ticket on #8308.
Since `StgInt` is defined using the `WORD` size. Just use that directly in
`bumpHistogram`.
Also since `cLong` is no longer used after this commit it will also be dropped.
Test Plan: make TEST=T8308
Reviewers: mlen, jstolarek, bgamari, thomie, goldfire, simonmar, austin
Reviewed By: bgamari, thomie
Subscribers: #ghc_windows_task_force
Differential Revision: https://phabricator.haskell.org/D2318
GHC Trac Issues: #8308
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: austin, bgamari, simonmar, hvr
Reviewed By: hvr
Subscribers: hvr, thomie
Differential Revision: https://phabricator.haskell.org/D2285
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Say we have a record like this:
data Rec = Rec
{ f1 :: Int
, f2 :: Int
, f3 :: Int
, f4 :: Int
, f5 :: Int
}
Before this patch, the code generated for `f1` looked like this:
f1_entry()
{offset
...
cJT:
_sI6::P64 = R1;
_sI7::P64 = P64[_sI6::P64 + 7];
_sI8::P64 = P64[_sI6::P64 + 15];
_sI9::P64 = P64[_sI6::P64 + 23];
_sIa::P64 = P64[_sI6::P64 + 31];
_sIb::P64 = P64[_sI6::P64 + 39];
R1 = _sI7::P64 & (-8);
Sp = Sp + 8;
call (I64[R1])(R1) args: 8, res: 0, upd: 8;
}
Note how all fields of the record are moved to local variables, even though
they're never used. These moves make it to the final assembly:
f1_info:
...
_cJT:
movq 7(%rbx),%rax
movq 15(%rbx),%rcx
movq 23(%rbx),%rcx
movq 31(%rbx),%rcx
movq 39(%rbx),%rbx
movq %rax,%rbx
andq $-8,%rbx
addq $8,%rbp
jmp *(%rbx)
With this patch we stop generating these move instructions. Cmm becomes this:
f1_entry()
{offset
...
cJT:
_sI6::P64 = R1;
_sI7::P64 = P64[_sI6::P64 + 7];
R1 = _sI7::P64 & (-8);
Sp = Sp + 8;
call (I64[R1])(R1) args: 8, res: 0, upd: 8;
}
Assembly becomes this:
f1_info:
...
_cJT:
movq 7(%rbx),%rax
movq %rax,%rbx
andq $-8,%rbx
addq $8,%rbp
jmp *(%rbx)
It turns out CmmSink already optimizes this, but it's better to generate
better code in the first place.
Reviewers: simonmar, simonpj, austin, bgamari
Reviewed By: simonmar, simonpj
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D2269
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've changed the functions to their nonDet equivalents and explained
why they're OK there. This allowed me to remove foldNameSet,
foldVarEnv, foldVarEnv_Directly, foldVarSet and foldUFM_Directly.
Test Plan: ./validate, there should be no change in behavior
Reviewers: simonpj, simonmar, austin, goldfire, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2244
GHC Trac Issues: #4012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes Cmm generation required to produce histograms when
compiling with -ticky flag, strips dead code from rts/Ticky.c and
reworks it to use a shared constant in both C and Haskell code.
Fixes #8308.
Test Plan: T8308
Reviewers: jstolarek, simonpj, austin
Reviewed By: simonpj
Subscribers: mpickering, simonpj, bgamari, mlen, thomie, jstolarek
Differential Revision: https://phabricator.haskell.org/D931
GHC Trac Issues: #8308
|
|
|
|
|
| |
(likely introduced by 99d4e5b4a0bd32813ff8c74e91d2dcf6b3555176, possibly
due to a merge mistake).
|
|
|
|
|
|
|
|
|
|
| |
The report now distinguishes thunks (in the variants single-entry and
standard thunks), constructors and functions (possibly single-entry).
Forthermore, for standard thunks (AP and selector), do not count an
entry when they are allocated. It is not possible to count their
entries, as their code is shared, but better count nothing than count
the wrong thing.
|
| |
|
|
|
|
|
| |
This reverts commit 6c2c853b11fe25c106469da7b105e2be596c17de which was
supposed to be merged as individual commits.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this Diff contains small, self-contained changes as I work towards
fixing #10613. It is mostly created to let harbormaster do its job, but
feedback is welcome as well.
Please do not merge this via arc; I’d like to push the individual
patches as layed out here. I might push mostly trivial ones even without
review, as long as the build passes.
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D2014
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: austin, bgamari, simonpj
Reviewed By: simonpj
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1933
|