| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
| |
This builds off of D4475.
Bumps binary submodule.
Reviewers: carter, AndreasK, hvr, goldfire, bgamari, simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D5006
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- The StgBinderInfo type was never used in the code gen, so the type, related
computation in CoreToStg, and some comments about it are removed. See #15770
for more details.
- Simplified CoreToStg after removing the StgBinderInfo computation: removed
StgBinderInfo arguments and mfix stuff.
The StgBinderInfo values were not used in the code gen, but I still run nofib
just to make sure: 0.0% change in allocations and binary sizes.
Test Plan: Validated locally
Reviewers: simonpj, simonmar, bgamari, sgraf
Reviewed By: sgraf
Subscribers: AndreasK, sgraf, rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5232
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes two isssues:
- Using bitcast for MO_XX_Conv
Arguments to a bitcast must be of the same size. We should be using
`trunc` and `zext` instead.
- Using unsupported MO_*_QuotRem for LLVM
The two primops `MO_*_QuotRem` are not supported by the LLVM backend,
so
we shouldn't use them for `Int8#`/`Word8#` (just as we do not use
them for
`Int#`/`Word#`).
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: manually run tests with WAY=llvm
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15864
Differential Revision: https://phabricator.haskell.org/D5304
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first step of implementing:
https://github.com/ghc-proposals/ghc-proposals/pull/74
The main highlights/changes:
primops.txt.pp gets two new sections for two new primitive types for
signed and unsigned 8-bit integers (Int8# and Word8 respectively) along
with basic arithmetic and comparison operations. PrimRep/RuntimeRep get
two new constructors for them. All of the primops translate into the
existing MachOPs.
For CmmCalls the codegen will now zero-extend the values at call
site (so that they can be moved to the right register) and then truncate
them back their original width.
x86 native codegen needed some updates, since it wasn't able to deal
with the new widths, but all the changes are quite localized. LLVM
backend seems to just work.
This is the second attempt at merging this, after the first attempt in
D4475 had to be backed out due to regressions on i386.
Bumps binary submodule.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate (on both x86-{32,64})
Reviewers: bgamari, hvr, goldfire, simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5258
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change some URLs from hackage.haskell.org/trac to ghc.haskell.org/trac
Test Plan: manually verify links work
Reviewers: bgamari, simonmar, mpickering
Reviewed By: bgamari, mpickering
Subscribers: mpickering, rwbarton, carter
GHC Trac Issues: #15733
Differential Revision: https://phabricator.haskell.org/D5257
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Trac #9279 reminded us that the worker wrapper transformation copes
really badly with absent unlifted boxed bindings.
As `Note [Absent errors]` in WwLib.hs points out, we can't just use
`absentError` for unlifted bindings because there is no bottom to hide
the error in.
So instead, we synthesise a new `RubbishLit` of type
`forall (a :: TYPE 'UnliftedRep). a`, which code-gen may subsitute for
any boxed value. We choose `()`, so that there is a good chance that
the program crashes instead instead of leading to corrupt data, should
absence analysis have been too optimistic (#11126).
Reviewers: simonpj, hvr, goldfire, bgamari, simonmar
Reviewed By: simonpj
Subscribers: osa1, rwbarton, carter
GHC Trac Issues: #15627, #9279, #4306, #11126
Differential Revision: https://phabricator.haskell.org/D5153
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See #15696 for more details. We now always enter dataToTag# argument (done in
generated Cmm, in StgCmmExpr). Any high-level optimisations on dataToTag#
applications are done by the simplifier. Looking at tag bits (instead of
reading the info table) for small types is left to another diff.
Incorrect test T14626 is removed. We no longer do this optimisation (see
comment:44, comment:45, comment:60).
Comments and notes about special cases around dataToTag# are removed. We no
longer have any special cases around it in Core.
Other changes related to evaluating primops (seq# and dataToTag#) will be
pursued in follow-up diffs.
Test Plan: Validates with three regression tests
Reviewers: simonpj, simonmar, hvr, bgamari, dfeuer
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #15696
Differential Revision: https://phabricator.haskell.org/D5201
|
|
|
|
|
|
|
|
|
| |
This unfortunately broke i386 support since it introduced references to
byte-sized registers that don't exist on that architecture.
Reverts binary submodule
This reverts commit 5d5307f943d7581d7013ffe20af22233273fba06.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the first step of implementing:
https://github.com/ghc-proposals/ghc-proposals/pull/74
The main highlights/changes:
- `primops.txt.pp` gets two new sections for two new primitive types
for signed and unsigned 8-bit integers (`Int8#` and `Word8`
respectively) along with basic arithmetic and comparison
operations. `PrimRep`/`RuntimeRep` get two new constructors for
them. All of the primops translate into the existing `MachOP`s.
- For `CmmCall`s the codegen will now zero-extend the values at call
site (so that they can be moved to the right register) and then
truncate them back their original width.
- x86 native codegen needed some updates, since it wasn't able to deal
with the new widths, but all the changes are quite localized. LLVM
backend seems to just work.
Bumps binary submodule.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate with new tests
Reviewers: hvr, goldfire, bgamari, simonmar
Subscribers: Abhiroop, dfeuer, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4475
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Write barriers for large array writes were added in D2525, as a part of #12469.
However it seems we forgot about small arrays. This patch adds the same write
barrier to small array writes.
Reviewers: simonmar, bgamari
Reviewed By: simonmar
Subscribers: rwbarton, carter
GHC Trac Issues: #12469
Differential Revision: https://phabricator.haskell.org/D5209
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Long ago, the stable name table and stable pointer tables were one.
Now, they are separate, and have significantly different
implementations. I believe the time has come to finish the split
that began in #7674.
* Divide `rts/Stable` into `rts/StableName` and `rts/StablePtr`.
* Give each table its own mutex.
* Add FFI functions `hs_lock_stable_ptr_table` and
`hs_unlock_stable_ptr_table` and document them.
These are intended to replace the previously undocumented
`hs_lock_stable_tables` and `hs_lock_stable_tables`,
which are now documented as deprecated synonyms.
* Make `eqStableName#` use pointer equality instead of unnecessarily
comparing stable name table indices.
Reviewers: simonmar, bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, carter
GHC Trac Issues: #15555
Differential Revision: https://phabricator.haskell.org/D5084
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: hvr, bgamari, simonmar, jrtc27
Reviewed By: bgamari
Subscribers: alpmestan, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5034
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds foldl' to GhcPrelude and changes must occurences
of foldl to foldl'. This leads to better performance especially
for quick builds where GHC does not perform strictness analysis.
It does change strictness behaviour when we use foldl' to turn
a argument list into function applications. But this is only a
drawback if code looks ONLY at the last argument but not at the first.
And as the benchmarks show leads to fewer allocations in practice
at O2.
Compiler performance for Nofib:
O2 Allocations:
-1 s.d. ----- -0.0%
+1 s.d. ----- -0.0%
Average ----- -0.0%
O2 Compile Time:
-1 s.d. ----- -2.8%
+1 s.d. ----- +1.3%
Average ----- -0.8%
O0 Allocations:
-1 s.d. ----- -0.2%
+1 s.d. ----- -0.1%
Average ----- -0.2%
Test Plan: ci
Reviewers: goldfire, bgamari, simonmar, tdammers, monoidal
Reviewed By: bgamari, monoidal
Subscribers: tdammers, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4929
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This contains two commits:
----
Make GHC's code-base compatible w/ `MonadFail`
There were a couple of use-sites which implicitly used pattern-matches
in `do`-notation even though the underlying `Monad` didn't explicitly
support `fail`
This refactoring turns those use-sites into explicit case
discrimations and adds an `MonadFail` instance for `UniqSM`
(`UniqSM` was the worst offender so this has been postponed for a
follow-up refactoring)
---
Turn on MonadFail desugaring by default
This finally implements the phase scheduled for GHC 8.6 according to
https://prime.haskell.org/wiki/Libraries/Proposals/MonadFail#Transitionalstrategy
This also preserves some tests that assumed MonadFail desugaring to be
active; all ghc boot libs were already made compatible with this
`MonadFail` long ago, so no changes were needed there.
Test Plan: Locally performed ./validate --fast
Reviewers: bgamari, simonmar, jrtc27, RyanGlScott
Reviewed By: bgamari
Subscribers: bgamari, RyanGlScott, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D5028
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for built-in Natural literals in Core.
- Replace MachInt,MachWord, LitInteger, etc. with a single LitNumber
constructor with a LitNumType field
- Support built-in Natural literals
- Add desugar warning for negative literals
- Move Maybe(..) from GHC.Base to GHC.Maybe for module dependency
reasons
This patch introduces only a few rules for Natural literals (compared
to Integer's rules). Factorization of the built-in rules for numeric
literals will be done in another patch as this one is already big to
review.
Test Plan:
validate
test build with integer-simple
Reviewers: hvr, bgamari, goldfire, Bodigrim, simonmar
Reviewed By: bgamari
Subscribers: phadej, simonpj, RyanGlScott, carter, hsyl20, rwbarton,
thomie
GHC Trac Issues: #14170, #14465
Differential Revision: https://phabricator.haskell.org/D4212
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY
SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN
MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY
MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN
Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR,
MUT_ARR_PTRS).
(alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR)
Removed a few comments in Scav.c about FROZEN0 being on the mut_list
because it's now clear from the closure type.
Reviewers: bgamari, simonmar, erikd
Reviewed By: simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4784
|
| |
|
|
|
|
| |
Addressing review comments on D4637
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
The idea here is to save a little code size and some work in the GC,
by collapsing FUN_STATIC closures and their SRTs.
This is (4) in a series; see D4632 for more details.
There's a tradeoff here: more complexity in the compiler in exchange
for a modest code size reduction (probably around 0.5%).
Results:
* GHC binary itself (statically linked) is 1% smaller
* -0.2% binary sizes in nofib (-0.5% module sizes)
Full nofib results comparing D4634 with this: P177 (ignore runtimes,
these aren't stable on my laptop)
Test Plan: validate, nofib
Reviewers: bgamari, niteria, simonpj, erikd
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4637
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
- Previously we would hvae a single big table of pointers per module,
with a set of bitmaps to reference entries within it. The new
representation is identical to a static constructor, which is much
simpler for the GC to traverse, and we get to remove the complicated
bitmap-traversal code from the GC.
- Rewrite all the code to generate SRTs in CmmBuildInfoTables, and
document it much better (see Note [SRTs]). This has been something
I've wanted to do since we moved to the new code generator, I
finally had the opportunity to finish it while on a transatlantic
flight recently :)
There are a series of 4 diffs:
1. D4632 (this one), which does the bulk of the changes
2. D4633 which adds support for smaller `CmmLabelDiffOff` constants
3. D4634 which takes advantage of D4632 and D4633 to save a word in
info tables that have an SRT on x86_64. This is where most of the
binary size improvement comes from.
4. D4637 which makes a further optimisation to merge some SRTs with
static FUN closures. This adds some complexity and the benefits
are fairly modest, so it's not clear yet whether we should do this.
Results (after (3), on x86_64)
- GHC itself (staticaly linked) is 5.2% smaller
- -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176
- I measured the overhead of traversing all the static objects in a
major GC in GHC itself by doing `replicateM_ 1000 performGC` as the
first thing in `Main.main`. The new version was 5-10% faster, but
the results did vary quite a bit.
- I'm not sure if there's a compile-time difference, the results are
too unreliable.
Test Plan: validate
Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1
Subscribers: thomie, carter
Differential Revision: https://phabricator.haskell.org/D4632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'.
I found 'plusWord2#' while implementing this, which both lacks
documentation and has a slightly different specification than
'addWordC#', which means the generic implementation is unnecessarily
complex.
While I was at it, I also added lacking meta-information on PrimOps
and refactored 'subWordC#'s generic implementation to be branchless.
Reviewers: bgamari, simonmar, jrtc27, dfeuer
Reviewed By: bgamari, dfeuer
Subscribers: dfeuer, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4592
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: dfeuer, rwbarton, thomie, carter
GHC Trac Issues: #4442
Differential Revision: https://phabricator.haskell.org/D4488
|
|
|
|
|
|
|
|
|
| |
This subtle patch fixes Trac #5129 (again; comment:20
and following).
I took the opportunity to document seq# properly; see
Note [seq# magic] in PrelRules, and Note [seq# and expr_ok]
in CoreUtils.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
get/setAllocationCounter didn't take into account allocations in the
current block. This was known at the time, but it turns out to be
important to have more accuracy when using these in a fine-grained
way.
Test Plan:
New unit test to test incrementally larger allocaitons. Before I got
results like this:
```
+0
+0
+0
+0
+0
+4096
+0
+0
+0
+0
+0
+4064
+0
+0
+4088
+4056
+0
+0
+0
+4088
+4096
+4056
+4096
```
Notice how the results aren't always monotonically increasing. After
this patch:
```
+344
+416
+488
+560
+632
+704
+776
+848
+920
+992
+1064
+1136
+1208
+1280
+1352
+1424
+1496
+1568
+1640
+1712
+1784
+1856
+1928
+2000
+2072
+2144
```
Reviewers: hvr, erikd, simonmar, jrtc27, trommler
Reviewed By: simonmar
Subscribers: trommler, jrtc27, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4363
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes a bunch of unnecessary includes of `HsVersions.h` along
with unnecessary CPP (e.g., due to checking for DEBUG which can be
achieved by looking at `debugIsOn`)
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4462
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was broken by D3746 and/or D3809, but unfortunately we didn't
notice because CI at the time wasn't building the profiling way.
Test Plan:
```
cd testsuite/test/profiling/should_run
make WAY=ghci-ext-prof
```
Reviewers: bgamari, michalt, hvr, erikd
Subscribers: rwbarton, thomie, carter
GHC Trac Issues: #14705
Differential Revision: https://phabricator.haskell.org/D4437
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pattern `threadCapability =<< myThreadId` is used a lot in code
that uses `hs_try_putmvar`, I want to make it cheaper.
Test Plan: validate
Reviewers: bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4381
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test Plan: validate
Reviewers: bgamari, erikd
Reviewed By: bgamari
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4380
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is an obvious optimisation whose overhead is neglectable but
which significantly simplifies the common uses of `compareByteArrays#`
which would otherwise require to make *careful* use of
`reallyUnsafePtrEquality#` or (equally fragile) `byteArrayContents#`
which can result in less optimal assembler code being generated.
Test Plan: carefully examined generated cmm/asm code; validate via phab
Reviewers: alexbiehl, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4319
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Implement x86 code generator for pdep and pext. Properly initialise
bmiVersion field.
pdep and pext test cases
Fix pattern match for pdep and pext instructions
Fix build of pdep and pext code for 32-bit architectures
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, angerman
Reviewed By: bgamari
Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4236
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: bgamari, simonmar
Reviewed By: simonmar
Subscribers: alexbiehl, rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4309
|
|
|
|
| |
This reverts commit a1a689dda48113f3735834350fb562bb1927a633.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
get/setAllocationCounter didn't take into account allocations in the
current block. This was known at the time, but it turns out to be
important to have more accuracy when using these in a fine-grained
way.
Test Plan:
New unit test to test incrementally larger allocaitons. Before I got
results like this:
```
+0
+0
+0
+0
+0
+4096
+0
+0
+0
+0
+0
+4064
+0
+0
+4088
+4056
+0
+0
+0
+4088
+4096
+4056
+4096
```
Notice how the results aren't always monotonically increasing. After
this patch:
```
+344
+416
+488
+560
+632
+704
+776
+848
+920
+992
+1064
+1136
+1208
+1280
+1352
+1424
+1496
+1568
+1640
+1712
+1784
+1856
+1928
+2000
+2072
+2144
```
Reviewers: niteria, bgamari, hvr, erikd
Subscribers: rwbarton, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4288
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a follow-up after faf60e85 - Make tagForCon non-linear.
On the mailing list @simonpj suggested to solve the
linear behavior by caching the sizes.
Test Plan: ./validate
Reviewers: simonpj, simonmar, bgamari, austin
Reviewed By: simonpj
Subscribers: carter, goldfire, rwbarton, thomie, simonpj
Differential Revision: https://phabricator.haskell.org/D4131
|
| |
|
| |
|
|
|
|
|
|
| |
This broke the 32-bit build.
This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd
Reviewed By: bgamari
Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4063
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another step for fixing #13825 and is based on D38 by Simon
Marlow.
The change allows storing multiple constructor fields within the same
word. This currently applies only to `Float`s, e.g.,
```
data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float
```
on 64-bit arch, will now store both fields within the same constructor
word. For `WordX/IntX` we'll need to introduce new primop types.
Main changes:
- We now use sizes in bytes when we compute the offsets for
constructor fields in `StgCmmLayout` and introduce padding if
necessary (word-sized fields are still word-aligned)
- `ByteCodeGen` had to be updated to correctly construct the data
types. This required some new bytecode instructions to allow pushing
things that are not full words onto the stack (and updating
`Interpreter.c`). Note that we only use the packed stuff when
constructing data types (i.e., for `PACK`), in all other cases the
behavior should not change.
- `RtClosureInspect` was changed to handle the new layout when
extracting subterms. This seems to be used by things like `:print`.
I've also added a test for this.
- I deviated slightly from Simon's approach and use `PrimRep` instead
of `ArgRep` for computing the size of fields. This seemed more
natural and in the future we'll probably want to introduce new
primitive types (e.g., `Int8#`) and `PrimRep` seems like a better
place to do that (where we already have `Int64Rep` for example).
`ArgRep` on the other hand seems to be more focused on calling
functions.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd
Reviewed By: bgamari
Subscribers: maoe, rwbarton, thomie
GHC Trac Issues: #13825
Differential Revision: https://phabricator.haskell.org/D3809
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on D4090
Reviewers: austin, bgamari, erikd, simonmar, alexbiehl
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Computing the number of constructors for TyCon is linear
in the number of constructors.
That's wasteful if all you want to check is if that
number is smaller than what fits in tag bits
(usually 8 things).
What this change does is to use a function that can
determine the ineqaulity without computing the size.
This improves compile time on a module with a
data type that has 10k constructors.
The variance in total time is (suspiciously) high,
but going by the best of 3 the numbers are 8.186s vs 7.511s.
For 1000 constructors the difference isn't noticeable:
0.646s vs 0.624s.
The hot spots were cgDataCon and cgEnumerationTyCon
where tagForCon is called in a loop.
One alternative would be to pass down the size.
Test Plan: harbormaster
Reviewers: bgamari, simonmar, austin
Reviewed By: simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4116
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The simplifier can simplify this without any trouble. Moreover, the
unboxed tuples cause bootstrapping issues due #14123.
I also went ahead and inlined a few definitions into the Monad instance.
Test Plan: Validate
Reviewers: austin, simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4026
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
modulewhich imports also `Outputable`
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously due to #12759 we disabled PIE support entirely. However, this
breaks the user's ability to produce PIEs. Add an explicit flag, -fPIE,
allowing the user to build PIEs.
Test Plan: Validate
Reviewers: rwbarton, austin, simonmar
Subscribers: trommler, simonmar, trofi, jrtc27, thomie
GHC Trac Issues: #12759, #13702
Differential Revision: https://phabricator.haskell.org/D3589
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies the subset of Hoopl's functionality needed by GHC to
`cmm/Hoopl` and removes the dependency on the Hoopl package.
The main motivation for this change is the confusing/noisy interface
between GHC and Hoopl:
- Hoopl has `Label` which is GHC's `BlockId` but different than
GHC's `CLabel`
- Hoopl has `Unique` which is different than GHC's `Unique`
- Hoopl has `Unique{Map,Set}` which are different than GHC's
`Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
needed just to filter the exposed functions (filter out some of the
Hoopl's and add the GHC ones)
With this change, we'll be able to simplify this significantly.
It'll also be much easier to do invasive changes (Hoopl is a public
package on Hackage with users that depend on the current behavior)
This should introduce no changes in functionality - it merely
copies the relevant code.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: simonpj, kavon, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While investigating #12545, I discovered several places in the code
that performed length-checks like so:
```
length ts == 4
```
This is not ideal, since the length of `ts` could be much longer than 4,
and we'd be doing way more work than necessary! There are already a slew
of helper functions in `Util` such as `lengthIs` that are designed to do
this efficiently, so I found every place where they ought to be used and
did just that. I also defined a couple more utility functions for list
length that were common patterns (e.g., `ltLength`).
Test Plan: ./validate
Reviewers: austin, hvr, goldfire, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: goldfire, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3622
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Phab:D3265 we introduced MO_F32_Fabs and MO_F64_Fabs.
This patch improves code generation by generating PowerPC fabs
instructions.
Test Plan: run numeric/should_run/numrun015 or validate
Reviewers: austin, bgamari, hvr, simonmar, erikd
Reviewed By: erikd
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3512
|