| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These pragmas were having the perverse effect of having these
performance
critical modules be LESS optimized in builds with -O2.
Test Plan: Check on gipedia whether this is worthwhile.
Reviewers: austin, bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4156
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixing #14244 required the newer gcc atomic built-ins that are provided
from 4.7 and up. This updates the test to check for minimum gcc version
4.7.
The version tests for 3.4 (!), 4.4, and 4.6 are no longer needed and can
be removed. This makes the build system simpler.
Test Plan: validate
Reviewers: austin, bgamari, hvr, simonmar
Reviewed By: bgamari
Subscribers: rwbarton, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D4165
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This flag reintroduces the verbose module name output produced by GHCi's
:load command behind a new flag, -show-mods-loaded. This was originally
removed in D3651 but apparently some tools (e.g. haskell-mode) rely on
this output.
Addresses #14427.
Test Plan: Validate
Reviewers: svenpanne
Reviewed By: svenpanne
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4164
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: hvr, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4163
|
|
|
|
|
|
|
|
|
|
| |
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4157
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that we don't consider lists of equal length to be equal when
they are not. I noticed these while working on the fix for #14361.
Reviewers: austin, simonmar, michalt
Reviewed By: michalt
Subscribers: rwbarton, thomie
GHC Trac Issues: #14361
Differential Revision: https://phabricator.haskell.org/D4153
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously CBE computed equality by taking the lists of middle nodes of
the blocks being compared and zipping them together. It would then map
over this list with the equality relation, and accumulate the result.
However, this is completely wrong: Consider what will happen when we
compare a block with no middle nodes with one with one or more. The
result of `zip` will be empty and consequently the pass may conclude
that the two are indeed equivalent (if their last nodes also match).
This is very bad and the cause of #14361.
The solution I chose was just to write out an explicit recursion, like I
distinctly recall considering doing when I first wrote this code.
Unfortunately I was feeling clever at the time.
Unfortunately this case was just rare enough not to be triggered by the
testsuite. I still need to find a testcase that doesn't have external
dependencies.
Test Plan: Need to find a more minimal testcase
Reviewers: austin, simonmar, michalt
Reviewed By: michalt
Subscribers: michalt, rwbarton, thomie, hvr
GHC Trac Issues: #14361
Differential Revision: https://phabricator.haskell.org/D4152
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Test Plan: Validate
Reviewers: Phyx, austin, erikd, simonmar
Reviewed By: Phyx
Subscribers: rwbarton, thomie
GHC Trac Issues: #14415
Differential Revision: https://phabricator.haskell.org/D4151
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we could only deserialize `TypeRep (a -> b)` if
both `a` and `b` had kind `Type`. Now, we do it regardless of
their runtime representations.
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4137
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Give `TypeRep` constructor fields names, and use them when pattern
matching and constructing values. This is a bit verbose, but makes
it obvious which field means what.
Reviewers: austin, hvr, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4136
|
|
|
|
|
|
|
|
| |
Reviewers: austin, bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4144
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier this year Edward Kmett requested [1] that we enable passing of
vector values in vector registers by default. The GHC calling convention
changes have been in LLVM for a number of years now so let's just flip
the switch.
[1] https://mail.haskell.org/pipermail/ghc-devs/2017-March/013905.html
Reviewers: austin
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4142
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CmmProcs which have *lots* of local variables take a considerable
amount of time in CmmSink. This was noticed by @tdammers in #7258
while compiling files with large records (~200-400 fields).
Before:
```
Sun Oct 29 19:58 2017 Time and Allocation Profiling Report (Final)
ghc-stage2 +RTS -p -RTS
-B/Users/alexbiehl/git/ghc/inplace/lib /Users/alexbiehl/Downloads/W2.hs
-fforce-recomp -O2
total time = 26.00 secs (25996 ticks @ 1000 us, 1 processor)
total alloc = 14,921,627,912 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
sink CmmPipeline
compiler/cmm/CmmPipeline.hs:(104,13)-(105,59) 55.7 15.9
SimplTopBinds SimplCore compiler/simplCore/SimplCore.hs:761:39-74 19.5 30.6
FloatOutwards SimplCore compiler/simplCore/SimplCore.hs:471:40-66 4.2 9.0
RegAlloc-linear AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(658,27)-(660,55) 4.0 11.1
pprNativeCode AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(529,37)-(530,65) 2.8 6.3
NewStranal SimplCore compiler/simplCore/SimplCore.hs:480:40-63 1.6 3.7
OccAnal SimplCore compiler/simplCore/SimplCore.hs:(739,22)-(740,67) 1.5 3.5
StgCmm HscMain compiler/main/HscMain.hs:(1426,13)-(1427,62) 1.2 2.4
regLiveness AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(591,17)-(593,52) 1.2 1.9
genMachCode AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(580,17)-(582,62) 0.9 1.8
NativeCodeGen CodeOutput compiler/main/CodeOutput.hs:171:18-78 0.9 2.1
CoreTidy HscMain compiler/main/HscMain.hs:1253:27-67 0.8 1.9
```
After:
```
Sun Oct 29 19:18 2017 Time and Allocation Profiling Report (Final)
ghc-stage2 +RTS -p -RTS
-B/Users/alexbiehl/git/ghc/inplace/lib /Users/alexbiehl/Downloads/W2.hs
-fforce-recomp -O2
total time = 13.31 secs (13307 ticks @ 1000 us, 1 processor)
total alloc = 15,772,184,488 bytes (excludes profiling overheads)
COST CENTRE MODULE SRC %time %alloc
SimplTopBinds SimplCore
compiler/simplCore/SimplCore.hs:761:39-74 38.3 29.0
sink CmmPipeline compiler/cmm/CmmPipeline.hs:(104,13)-(105,59) 13.2 20.3
RegAlloc-linear AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(658,27)-(660,55) 8.3 10.5
FloatOutwards SimplCore compiler/simplCore/SimplCore.hs:471:40-66 8.1 8.5
pprNativeCode AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(529,37)-(530,65) 5.4 5.9
NewStranal SimplCore compiler/simplCore/SimplCore.hs:480:40-63 3.1 3.5
OccAnal SimplCore compiler/simplCore/SimplCore.hs:(739,22)-(740,67) 2.9 3.3
StgCmm HscMain compiler/main/HscMain.hs:(1426,13)-(1427,62) 2.3 2.3
regLiveness AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(591,17)-(593,52) 2.1 1.8
NativeCodeGen CodeOutput compiler/main/CodeOutput.hs:171:18-78 1.7 2.0
genMachCode AsmCodeGen compiler/nativeGen/AsmCodeGen.hs:(580,17)-(582,62) 1.6 1.7
CoreTidy HscMain compiler/main/HscMain.hs:1253:27-67 1.4 1.8
foldNodesBwdOO Hoopl.Dataflow compiler/cmm/Hoopl/Dataflow.hs:(397,1)-(403,17) 1.1 0.8
```
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari
Subscribers: duog, rwbarton, thomie, tdammers
GHC Trac Issues: #7258
Differential Revision: https://phabricator.haskell.org/D4145
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the change UNREG ghc build failed as:
```
rts_dist_HC rts/dist/build/PrimOps.o
/tmp/ghc2370_0/ghc_4.hc: In function 'stg_newByteArrayzh':
/tmp/ghc2370_0/ghc_4.hc:26:13: error:
error: 'base_GHCziIOziException_heapOverflow_closure'
undeclared (first use in this function)
R1.w = (W_)&base_GHCziIOziException_heapOverflow_closure;
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
26 | R1.w = (W_)&base_GHCziIOziException_heapOverflow_closure;
| ^
```
It's an UNREG-specific failure because C backend always requires
declarations to be known.
Added missing declaration.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
|
|
|
|
|
|
|
|
| |
Unfortunately this (ironically) ended up breaking bindist testing since
we didn't have a package-data.mk. Unfortunately there is no easy way to
fix this.
This reverts commit 1e9f90af7311c33de0f7f5b7dba594725596d675.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `libraries/ghc-prim/cbits/atomic.c` no barriers were issued for
atomic read and write operations. Starting with gcc 4.7 compiler
intrinsics are offered. The atomic intrinisics are also available in
clang. Use these to implement `hs_atomicread*` and `hs_atomicwrite`.
Test Plan: validate on OSX and Windows
Reviewers: austin, bgamari, simonmar, hvr, erikd, dfeuer
Reviewed By: bgamari
Subscribers: dfeuer, rwbarton, thomie
GHC Trac Issues: #14244
Differential Revision: https://phabricator.haskell.org/D4009
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This breaks out control over STG free variable list output from
-dppr-debug into its own distinct flag. This makes it more discoverable
and easier to change independently from other dump output.
Test Plan: Validate
Reviewers: austin
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4140
|
|
|
|
|
|
|
|
| |
Reviewers: austin
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4141
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally, `fixIO f` throws `BlockedIndefinitelyOnMVar` if
`f` is strict. This is not particularly friendly, since the
`MVar` in question is just part of the way `fixIO` happens to be
implemented. Instead, throw a new `FixIOException` with a better
explanation of the problem.
Reviewers: austin, hvr, bgamari
Subscribers: rwbarton, thomie
GHC Trac Issues: #14356
Differential Revision: https://phabricator.haskell.org/D4113
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this while tinkering in haddock. This might be a relict from
ancient times where newtypes wouldn't optimize well.
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement hexadecmial floating point literals.
The digits of the mantissa are hexadecimal.
The exponent is written in base 10, and the base for the exponentiation is 2.
Hexadecimal literals look a lot like ordinary decimal literals, except that
they use hexadecmial digits, and the exponent is written using `p` rather than `e`.
The specification of the feature is available here:
https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0004-hexFloats.rst
For a discussion of the various choices:
https://github.com/ghc-proposals/ghc-proposals/pull/37
Reviewers: mpickering, goldfire, austin, bgamari, hvr
Reviewed By: bgamari
Subscribers: mpickering, thomie
Differential Revision: https://phabricator.haskell.org/D3066
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement AtomicRMW ops, atomic read, atomic write
in PowerPC native code generator. Also implement
branch prediction because we need it in atomic ops
anyway.
This patch improves the issue in #12537 a bit but
does not fix it entirely.
The fallback operations for atomicread and atomicwrite
in libraries/ghc-prim/cbits/atomic.c are incorrect.
This patch avoids those functions by implementing the
operations directly in the native code generator. This
is also what the x86/amd64 NCG and the LLVM backend
do.
Test Plan: validate on AIX and PowerPC (32-bit) Linux
Reviewers: erikd, hvr, austin, bgamari, simonmar
Reviewed By: hvr, bgamari
Subscribers: rwbarton, thomie
GHC Trac Issues: #12537
Differential Revision: https://phabricator.haskell.org/D3984
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements the `EmptyDataDeriving` proposal put forth in
https://github.com/ghc-proposals/ghc-proposals/blob/dbf51608/proposals/0006-deriving-empty.rst.
This has two major changes:
* The introduction of an `EmptyDataDeriving` extension, which
permits directly deriving `Eq`, `Ord`, `Read`, and `Show` instances
for empty data types.
* An overhaul in the code that is emitted in derived instances for
empty data types. To see an overview of the changes brought forth,
refer to the changes to the 8.4.1 release notes.
Test Plan: ./validate
Reviewers: bgamari, dfeuer, austin, hvr, goldfire
Reviewed By: bgamari
Subscribers: rwbarton, thomie
GHC Trac Issues: #7401, #10577, #13117
Differential Revision: https://phabricator.haskell.org/D4047
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit
commit 85aa1f4253163985fe07d172f8da73b784bb7b4b
Date: Sun Oct 29 20:48:19 2017 -0400
Fix #14390 by making toIfaceTyCon aware of equality
was a bit over-complicated. This patch simplifies the (horribly
ad-hoc) treatement of IfaceEqualityTyCon, and documents it better.
No visible change in behaviour.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
and in comments
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another step for fixing #13825 and is based on D38 by Simon
Marlow.
The change allows storing multiple constructor fields within the same
word. This currently applies only to `Float`s, e.g.,
```
data Foo = Foo {-# UNPACK #-} !Float {-# UNPACK #-} !Float
```
on 64-bit arch, will now store both fields within the same constructor
word. For `WordX/IntX` we'll need to introduce new primop types.
Main changes:
- We now use sizes in bytes when we compute the offsets for
constructor fields in `StgCmmLayout` and introduce padding if
necessary (word-sized fields are still word-aligned)
- `ByteCodeGen` had to be updated to correctly construct the data
types. This required some new bytecode instructions to allow pushing
things that are not full words onto the stack (and updating
`Interpreter.c`). Note that we only use the packed stuff when
constructing data types (i.e., for `PACK`), in all other cases the
behavior should not change.
- `RtClosureInspect` was changed to handle the new layout when
extracting subterms. This seems to be used by things like `:print`.
I've also added a test for this.
- I deviated slightly from Simon's approach and use `PrimRep` instead
of `ArgRep` for computing the size of fields. This seemed more
natural and in the future we'll probably want to introduce new
primitive types (e.g., `Int8#`) and `PrimRep` seems like a better
place to do that (where we already have `Int64Rep` for example).
`ArgRep` on the other hand seems to be more focused on calling
functions.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: bgamari, simonmar, austin, hvr, goldfire, erikd
Reviewed By: bgamari
Subscribers: maoe, rwbarton, thomie
GHC Trac Issues: #13825
Differential Revision: https://phabricator.haskell.org/D3809
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GHC was panicking when pretty-printing a heterogeneous
equality type constructor (#14390) because the function which
produced the type constructor, `toIfaceTyCon`, wasn't attaching the
appropriate `IfaceTyConSort` for equality type constructors, which
is `IfaceEqualityTyCon`. This is fixed easily enough.
Test Plan: make test TEST=T14390
Reviewers: austin, bgamari
Reviewed By: bgamari
Subscribers: rwbarton, thomie
GHC Trac Issues: #14390
Differential Revision: https://phabricator.haskell.org/D4132
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on D4090
Reviewers: austin, bgamari, erikd, simonmar, alexbiehl
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we add a flag to instruct the native code generator to add
alignment checks in all info table dereferences. This is helpful in
catching pointer tagging issues.
Thanks to @jrtc27 for uncovering the tagging issues on Sparc which
inspired this flag.
Test Plan: Validate
Reviewers: simonmar, austin, erikd
Reviewed By: simonmar
Subscribers: rwbarton, trofi, thomie, jrtc27
Differential Revision: https://phabricator.haskell.org/D4101
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Hopefully these are more robust to NFS malfunction than BSD flock-style
locks. See #13945.
Test Plan: Validate via @simonpj
Reviewers: austin, hvr
Subscribers: rwbarton, thomie, erikd, simonpj
GHC Trac Issues: #13945
Differential Revision: https://phabricator.haskell.org/D4129
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea is described in #14152, and can be summarized: Float the exit
path out of a joinrec, so that the simplifier can do more with it.
See the test case for a nice example.
The floating goes against what the simplifier usually does, hence we
need to be careful not inline them back.
The position of exitification in the pipeline was chosen after a small
amount of experimentation, but may need to be improved. For example,
exitification can allow rewrite rules to fire, but for that it would
have to happen before the `simpl_phases`.
Perf.haskell.org reports these nice performance wins:
Nofib allocations
fannkuch-redux 78446640 - 99.92% 64560
k-nucleotide 109466384 - 91.32% 9502040
simple 72424696 - 5.96% 68109560
Nofib instruction counts
fannkuch-redux 1744331636 - 3.86% 1676999519
k-nucleotide 2318221965 - 6.30% 2172067260
scs 1978470869 - 3.35% 1912263779
simple 669858104 - 3.38% 647206739
spectral-norm 186423292 - 5.37% 176411536
Differential Revision: https://phabricator.haskell.org/D3903
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, (since 33452df), simplNonRecJoinPoint would do the wrong
thing in the presence of shadowing: It analyzed the RHS of a join
binding with the environment for the body. In particular, with
foo x =
join x = x * x
in x
where there is shadowing, it renames the inner x to x1, and should
produce
foo x =
join x1 = x * x
in x1
but because the substitution (x ↦ x1) is also used on the RHS we get the
bogus
foo x =
join x1 = x1 * x1
in x1
Fixed this by adding a `rhs_se` parameter, analogous to `simplNonRecE`
and `simplLazyBind`.
Differential Revision: https://phabricator.haskell.org/D4130
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Computing the number of constructors for TyCon is linear
in the number of constructors.
That's wasteful if all you want to check is if that
number is smaller than what fits in tag bits
(usually 8 things).
What this change does is to use a function that can
determine the ineqaulity without computing the size.
This improves compile time on a module with a
data type that has 10k constructors.
The variance in total time is (suspiciously) high,
but going by the best of 3 the numbers are 8.186s vs 7.511s.
For 1000 constructors the difference isn't noticeable:
0.646s vs 0.624s.
The hot spots were cgDataCon and cgEnumerationTyCon
where tagForCon is called in a loop.
One alternative would be to pass down the size.
Test Plan: harbormaster
Reviewers: bgamari, simonmar, austin
Reviewed By: simonmar
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4116
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
It's simple to treat BodyStmt just like a BindStmt with a wildcard
pattern, which is enough to fix #12143 without going all the way to
using `<*` and `*>` (#10892).
Test Plan:
* new test cases in `ado004.hs`
* validate
Reviewers: niteria, simonpj, bgamari, austin, erikd
Subscribers: rwbarton, thomie
GHC Trac Issues: #12143
Differential Revision: https://phabricator.haskell.org/D4128
|
|
|
|
|
|
|
|
|
| |
Trac #14379 showed a case where use of "forcing" to do
"damn the torpedos" specialisation without resource limits
(which 'vector' does a lot) led to exponential blowup.
The fix is easy. Finding it wasn't. See Note [Forcing
specialisation] and the one-line change in decreaseSpecCount.
|