Commit messages
| |
GHC Proposal #229 changes the lexical rules of Haskell, which may
require slight whitespace adjustments in certain cases.
This patch changes formatting in a few places in GHC and its testsuite
in a way that enables them to compile under the proposed rules.
| |
This is part two of fixing #17334.
There are two parts to this commit:
- A bugfix for computing loop levels
- A bugfix of basic block invariants in the NCG.
-----------------------------------------------------------
In the first bug we ended up with a CFG of the sort: [A -> B -> C]
This was represented via maps as fromList [(A,B),(B,C)] and later
transformed into an adjacency array. However, the transformation did
not include block C in the array, since we only looked at the keys of
the map.
This was still fine until we looked up the successors of C, which read
outside the bounds of the array.
In order to prevent this in the future I refactored the code to include
all nodes as keys in the map representation, and made this an invariant
that is checked in a few places.
Overall I expect this to make the code more robust, as any failed
lookup now represents an error, versus failed lookups sometimes being
expected and sometimes not.
In terms of performance this makes some things cheaper (getting a list
of all nodes) and others more expensive (adding a new edge). Overall
this adds up to no noticeable performance difference.
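
For illustration, a minimal sketch of the new map invariant (the types are simplified stand-ins, not GHC's actual CFG module): every node is present as a key, even when it has no outgoing edges, so any failed lookup really is a bug.

    import qualified Data.Map.Strict as M

    type Node = Int
    type CFG  = M.Map Node [Node]   -- successor map; every node appears as a key

    -- Insert an edge while keeping the invariant: both endpoints become keys.
    addEdge :: Node -> Node -> CFG -> CFG
    addEdge from to cfg =
      M.insertWith (++) from [to] (M.insertWith (++) to [] cfg)

    -- With the invariant, a missing key signals a broken CFG, not a leaf node.
    successors :: Node -> CFG -> [Node]
    successors n cfg =
      case M.lookup n cfg of
        Just ss -> ss
        Nothing -> error "successors: node not in CFG (invariant violated)"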
-----------------------------------------------------------
Part 2: When the NCG generated a new basic block, it did not always
insert a NEWBLOCK meta instruction into the stream, which caused a
quite subtle bug.
During instruction selection a statement `s`
in a block B with control of the sort: B -> C
will sometimes result in control
flow of the sort:
┌ < ┐
v ^
B -> B1 ┴ -> C
as is the case for some atomic operations.
Now, to keep the CFG in sync when introducing B1, we clearly
want to insert it between B and C. However, there is
a catch when we have to deal with self-loops.
We might start with code and a CFG of these forms:
loop:
stmt1 ┌ < ┐
.... v ^
stmtX loop ┘
stmtY
....
goto loop:
Now we introduce B1:
┌ ─ ─ ─ ─ ─┐
loop: │ ┌ < ┐ │
instrs v │ │ ^
.... loop ┴ B1 ┴ ┘
instrsFromX
stmtY
goto loop:
This is simple: all outgoing edges from loop now simply
start from B1 instead, and the code generator knows which
new edges it introduced for the self-loop of B1.
Disaster strikes if statement Y follows the same pattern.
If we apply the same rule that all outgoing edges change, then
we end up with:
loop ─> B1 ─> B2 ┬─┐
│ │ └─<┤ │
│ └───<───┘ │
└───────<────────┘
This is problematic. The edge B1->B1 is modified as expected;
however, the modification is wrong!
The assembly in this case looked like this:
_loop:
<instrs>
_B1:
...
cmpxchgq ...
jne _B1
<instrs>
<end _B1>
_B2:
...
cmpxchgq ...
jne _B2
<instrs>
jmp loop
There is no edge _B2 -> _B1 here. It's still a self loop onto _B1.
The problem here is that really B1 should be two basic blocks.
Otherwise we have control flow in the *middle* of a basic block.
A contradiction!
So to account for this we add yet another basic block marker:
_B:
<instrs>
_B1:
...
cmpxchgq ...
jne _B1
jmp _B1'
_B1':
<instrs>
<end _B1>
_B2:
...
Now when inserting B2 we will only look at the outgoing edges of B1' and
everything will work out nicely.
You might also wonder why we don't insert jumps at the end of _B1'. There is
no way another block ends up jumping to the labels _B1 or _B2 since they are
essentially invisible to other blocks. View them as control flow labels local
to the basic block if you'd like.
Not doing this ultimately caused (part 2 of) #17334.
| |
This fixes #16512.
There are lots of parts of this patch:
* The main payload is in FamInst. See
Note [Coverage condition for injective type families] there
for the overview. But it doesn't fix the bug.
* We now bump the reduction depth every time we discharge
a CFunEqCan. See Note [Flatten when discharging CFunEqCan]
in TcInteract.
* Exploration of this revealed a new, easy-to-maintain invariant
for CTyEqCans. See Note [Almost function-free] in TcRnTypes.
* We also realized that type inference for injectivity was a
bit incomplete, so we exchanged lookupFlattenTyVar for
rewriteTyVar. See Note [rewriteTyVar] in TcFlatten. The new
function is monadic while the previous one was pure, necessitating
some faff in TcInteract. Nothing too bad.
* zonkCt did not maintain invariants on CTyEqCan. It's not worth
the bother doing so, so we just transmute CTyEqCans to
CNonCanonicals.
* The pure unifier was finding the fixpoint of the returned
substitution, even when doing one-way matching (in tcUnifyTysWithTFs).
Fixed now.
Test cases: typecheck/should_fail/T16512{a,b}
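
For readers unfamiliar with the feature, a small illustrative injective type family (not taken from the ticket); the `| r -> a` annotation is what the coverage condition governs:

    {-# LANGUAGE TypeFamilyDependencies #-}
    module Inj where

    -- The result determines the argument, so GHC may infer 'a' from 'F a'.
    type family F a = r | r -> a where
      F Int  = Bool
      F Bool = Int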
| |
For backends that maintain the CFG during codegen
we can now find loops and their nesting levels.
This is based on the Cmm CFG and dominator analysis.
As a result we can estimate edge frequencies a lot better
for methods, resulting in far better code layout.
Speedup on nofib: ~1.5%
Increase in compile times: ~1.9%
To make this feasible this commit adds:
* Dominator analysis based on the Lengauer-Tarjan algorithm.
* An algorithm estimating global edge frequencies from branch
  probabilities - in CFG.hs.
* A few static branch prediction heuristics (sketched below):
  - Expect to take the backedge in loops.
  - Expect to take the branch NOT exiting a loop.
  - Expect integer vs. constant comparisons to be false.
We also treat heap/stack checks specially for branch prediction,
to avoid them being treated as loops.
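
A hedged sketch of those heuristics; the edge classification type below is hypothetical and not GHC's actual CFG representation.

    -- Rough static guesses for how likely an edge is to be taken.
    data EdgeKind
      = Backedge      -- jumps back to the loop header
      | LoopExit      -- leaves the loop
      | EqConstTrue   -- taken when an integer-vs-constant comparison holds
      | OtherEdge

    staticLikelihood :: EdgeKind -> Double
    staticLikelihood Backedge    = 0.9   -- expect to take the backedge
    staticLikelihood LoopExit    = 0.1   -- expect NOT to exit the loop
    staticLikelihood EqConstTrue = 0.3   -- expect the comparison to be false
    staticLikelihood OtherEdge   = 0.5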
| |
To avoid polluting the macro namespace
| |
- Remove unneeded ones.
- Use <..> for inter-package includes.
Besides general cleanup, this helps distinguish between the RTS we link
against vs. the RTS we compile for.
| |
This commit refactors interface file generation to allow information
from the later passes (NCG, STG) to be stored in interface files.
We achieve this by splitting interface file generation into two parts:
* Partial interfaces, built based on the result of the core pipeline
* A fully instantiated interface, which also contains the final
fingerprints and can optionally contain information produced by the backend.
This change is required by !1304 and !1530.
-dynamic-too handling is refactored too: previously when generating code
we'd branch on -dynamic-too *before* code generation, but now we do it
after.
(Original code written by @AndreasK in !1530)
Performance
~~~~~~~~~~~
Before this patch interface files were created and immediately flushed
to disk, which made space leaks impossible.
With this change we instead use NFData to force all iface-related data
structures, to avoid space leaks.
In the process of refactoring it was discovered that the code in the
ToIface module allocated a lot of thunks which were immediately forced
when writing/forcing the interface file. So we made this module more
strict, to avoid creating many of those thunks.
The bottom line is that allocations go down by about 0.1% compared to
master.
Residency is not meaningfully different after this patch.
Runtime was not benchmarked.
Co-Authored-By: Andreas Klebinger <klebinger.andreas@gmx.at>
Co-Authored-By: Ömer Sinan Ağacan <omer@well-typed.com>
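
A minimal sketch of the forcing idea using Control.DeepSeq (the function name is illustrative; this is not the actual GHC code):

    import Control.DeepSeq (NFData, force)
    import Control.Exception (evaluate)

    -- Fully evaluate an interface before retaining it, so no thunks keep
    -- large compiler state alive and cause a space leak.
    retainIface :: NFData iface => iface -> IO iface
    retainIface = evaluate . force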
| |
Using EvVars for capturing type constraints implied side-effects in DsM
when we just wanted to *construct* type constraints.
But giving names to type constraints is only necessary when passing
Givens to the type checker, of which the majority of the pattern match
checker should be unaware.
Thus, we simply generate `newtype TyCt = TyCt PredType`, which are
nicely stateless. But at the same time this means we have to allocate
EvVars when we want to query the type oracle! So we keep the type oracle
state as `newtype TyState = TySt (Bag EvVar)`, which nicely makes a
distinction between new, unchecked `TyCt`s and the inert set in
`TyState`.
| |
Previously, we had an elaborate mechanism for selecting the warnings to
generate in the presence of different `COMPLETE` matching groups that,
albeit finely-tuned, produced wrong results from an end user's
perspective in some cases (#13363).
The underlying issue is that at the point where the `ConVar` case has to
commit to a particular `COMPLETE` group, there's not enough information
to do so and the status quo was to just enumerate all possible complete
sets nondeterministically. The `getResult` function would then pick the
outcome according to metrics defined in accordance with the user's guide.
But crucially, it lacked knowledge about the order in which affected
clauses appear, leading to the surprising behavior in #13363.
In !1010 we taught the term oracle to reason about literal values a
variable can certainly not take on. This MR extends that idea to
`ConLike`s and thereby fixes #13363: Instead of committing to a
particular `COMPLETE` group in the `ConVar` case, we now split off the
matching constructor incrementally and record the newly covered case as
a refutable shape in the oracle. Whenever the set of refutable shapes
covers any `COMPLETE` set, the oracle recognises vacuosity of the
uncovered set.
This patch goes a step further: Since at this point the information
in value abstractions is merely a cut down representation of what the
oracle knows, value abstractions degenerate to a single `Id`, the
semantics of which is determined by the oracle state `Delta`.
Value vectors become lists of `Id`s, given meaning by a single
`Delta`; value set abstractions (of which the uncovered set is an
instance) correspond to a union of `Delta`s which instantiate the
same `[Id]` (akin to models of a formula).
Fixes #11528, #13021, #13363, #13965, #14059, #14253, #14851, #15753, #17096, #17149
-------------------------
Metric Decrease:
ManyAlternatives
T11195
-------------------------
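
For context, an illustrative COMPLETE group (not taken from the ticket); the checker has to decide whether to check `len` against the built-in list constructors or against this user-supplied set:

    {-# LANGUAGE PatternSynonyms #-}
    module CompleteExample where

    pattern Nil :: [a]
    pattern Nil = []

    pattern Cons :: a -> [a] -> [a]
    pattern Cons x xs = x : xs

    {-# COMPLETE Nil, Cons #-}

    -- Exhaustive with respect to the COMPLETE set above.
    len :: [a] -> Int
    len Nil         = 0
    len (Cons _ xs) = 1 + len xs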
| |
07ee15915d5a0d6d1aeee137541eec6e9c153e65 started the transition, but the
job was never finished.
| |
Haven't been used since 16206a6603e87e15d61c57456267c5f7ba68050e.
| |
1) FastStrings are always UTF-8 encoded now.
2) Clarify what is meant by "hashed"
3) Add mention of lazy z-enc
| |
Having an IORef in FastString to memoize the z-encoded version is
unnecessary, because there is this amazing thing Haskell can do natively;
it's called "laziness" :)
We simply remove the UNPACK and strictness annotations from the constructor
field corresponding to the z-encoding, making it lazy, and store the
(pure) z-encoded string there.
The only complication here is 'hasZEncoding', which allows checking whether a
z-encoding was computed for a given string. Since this is only used for
compiler performance statistics, though, it's not actually necessary to have
the current per-string granularity.
Instead I add a global IORef counter to the FastStringTable and use
unsafePerformIO to increment the counter whenever a lazy z-encoding is
forced.
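
A hedged sketch of the memoisation-via-laziness idea; the field and type names are simplified stand-ins, not GHC's actual FastString definition.

    import Data.ByteString (ByteString)

    type ZEncoded = String   -- stand-in for the z-encoded form

    data FastString = FastString
      { fs_bytes :: !ByteString  -- strict UTF-8 contents
      , fs_zenc  :: ZEncoded     -- lazy: computed only when first demanded
      }

    mkFastString :: (ByteString -> ZEncoded) -> ByteString -> FastString
    mkFastString zEncode bs = FastString bs (zEncode bs)  -- stays a thunk until used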
| |
Those constructors have been added after GHC 8.8. The version guards
in `binary` are correct, see https://github.com/kolmodin/binary/pull/167/files.
| |
This generalizes code generators (outputAsm, outputLlvm, outputC, and
the call site codeOutput) so that they'll return the return values of
the passed Cmm streams.
This allows accumulating data during Cmm generation and returning it to
the call site in HscMain.
Previously the Cmm streams were assumed to return (), so the code
generators returned () as well.
This change is required by !1304 and !1530.
Skipping CI as this was tested before and I only updated the commit
message.
[skip ci]
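
A hedged sketch of the idea with a simplified stream type (not GHC's actual Stream module): the stream yields Cmm groups and, once exhausted, returns a value the call site can use.

    -- Yields values of type 'a' in monad 'm' and finally returns an 'r'.
    newtype Stream m a r = Stream { runStream :: m (Either r (a, Stream m a r)) }

    -- Process every yielded element and hand back the stream's final result.
    consume :: Monad m => (a -> m ()) -> Stream m a r -> m r
    consume f str = do
      step <- runStream str
      case step of
        Left r          -> pure r
        Right (a, rest) -> f a >> consume f rest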
| |
Use LEB128 encoding for Int/Word variants. This reduces
the size of interface files significantly (~19%).
Also includes a few small optimizations, noticed while looking at the
Core, to make unboxing work better.
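
A minimal sketch of unsigned LEB128 (illustrative, not the exact Binary instances in this patch): seven payload bits per byte, with the high bit set while more bytes follow.

    import Data.Bits ((.&.), (.|.), shiftR)
    import Data.Word (Word8, Word64)

    encodeLEB128 :: Word64 -> [Word8]
    encodeLEB128 w
      | rest == 0 = [fromIntegral low]
      | otherwise = (fromIntegral low .|. 0x80) : encodeLEB128 rest
      where
        low  = w .&. 0x7f
        rest = w `shiftR` 7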
| |
This adds a Stream.consume function, uses it in the LLVM and C code
generators, and removes the use of the Stream.collect function, which was
used to collect streaming Cmm generation results into a list.
LLVM and C backends now properly use streamed Cmm generation, instead of
collecting Cmm groups into a list before generating LLVM/C code.
| |
Make `exactLog2` faster (use `countLeadingZeros` and Int32 bit-ops).
On my Core i7-9700k, Criterion reports a ~50% speedup (from 16 ns to 8 ns).
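
A hedged sketch of the approach (not GHC's exact code): a mask test detects a power of two and `countLeadingZeros` recovers the exponent.

    import Data.Bits (countLeadingZeros, (.&.))
    import Data.Word (Word64)

    exactLog2 :: Word64 -> Maybe Int
    exactLog2 x
      | x == 0             = Nothing
      | x .&. (x - 1) /= 0 = Nothing   -- more than one bit set
      | otherwise          = Just (63 - countLeadingZeros x)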
| |
We used to serialise large Integers as strings. Now they are serialised
as a list of bytes.
This changes the size of an Integer in the upper 64-bit range from 77 to
9 bytes when written to disk.
The impact on the general case is small (<1% for interface files), as we
don't use many Integers. But for code that uses many, this should be a
nice benefit.
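
A minimal sketch of the representation for a non-negative Integer (the real Binary instance also handles the sign and the small cases): store base-256 digits instead of a decimal string.

    import Data.Bits ((.&.), shiftR)
    import Data.Word (Word8)

    -- Little-endian base-256 digits of a non-negative Integer.
    integerToBytes :: Integer -> [Word8]
    integerToBytes 0 = [0]
    integerToBytes n = go n
      where
        go 0 = []
        go m = fromIntegral (m .&. 0xff) : go (m `shiftR` 8)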
| |
The `UniqDFM` is deterministic, of course, while we provide an unsafe
`NonDetUniqFM` wrapper for `UniqFM` to opt into nondeterministic instances.
| |
This should help identify a few cases where this is throwing warnings
| |
This prepares the way for making Int32# and Word32# the actual size they
claim to be.
Updates binary submodule for (de)serializing the new runtime reps.
| |
These are unexploded mines as far as the linter is concerned. I don't
want to hit them in my MRs by mistake!
I did this with `sed`, and then rolled back some changes in the docs,
config.guess, and the linter itself.
| |
In particular, we very often pass one empty list, and in these
cases we want to avoid the overhead of computing `xs ++ []`.
This should fix #14759 and #16911.
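
A minimal sketch of the idea (illustrative; the actual patch special-cases the call sites in question): check for an empty argument before appending, so `xs ++ []` never rebuilds `xs`.

    appendCheap :: [a] -> [a] -> [a]
    appendCheap [] ys = ys
    appendCheap xs [] = xs          -- avoid rebuilding xs via (++)
    appendCheap xs ys = xs ++ ys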
| |
This matches GHC itself getting the target platform from there.
| |
ghc-pkg needs to be aware of platforms so it can figure out which
subdirectory within the user package db to use. This is admittedly
roundabout, but maybe Cabal could use the same notion of a platform as
GHC to good effect too.
| |
As discussed in #16331, the GHCI macro, defined through 'ghci' flags
in ghc.cabal.in, ghc-bin.cabal.in and ghci.cabal.in, is supposed to indicate
whether GHC is built with support for an internal interpreter, which runs in
the same process. It is, however, overloaded in a few places to mean
"there is an interpreter available", regardless of whether it's an internal
or external interpreter.
For the sake of clarity and with the hope of more easily being able to
build stage 1 GHCs with external interpreter support, this patch splits
the previous GHCI macro into 3 different ones:
- HAVE_INTERNAL_INTERPRETER: GHC is built with an internal interpreter
- HAVE_EXTERNAL_INTERPRETER: GHC is built with support for external interpreters
- HAVE_INTERPRETER: HAVE_INTERNAL_INTERPRETER || HAVE_EXTERNAL_INTERPRETER
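
A hedged sketch of how code might branch on the new macros (the module and strings are illustrative only):

    {-# LANGUAGE CPP #-}
    module InterpreterFlavour (flavour) where

    -- Which interpreter support this build was configured with.
    flavour :: String
    #if defined(HAVE_INTERNAL_INTERPRETER)
    flavour = "internal interpreter"
    #elif defined(HAVE_EXTERNAL_INTERPRETER)
    flavour = "external interpreter"
    #else
    flavour = "no interpreter"
    #endif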
| |
The `PmExprEq` business was a huge hack and was at the same time vastly
too powerful and not powerful enough to encode negative term equalities,
i.e. facts of the form "forall y. x ≁ Just y".
This patch introduces the concept of 'refutable shapes': What matters
for the pattern match checker is being able to encode knowledge of the
kind "x can no longer be the literal 5". We encode this knowledge in a
`PmRefutEnv`, mapping each variable to a set of newly introduced
`PmAltCon`s (which are just `PmLit`s at the moment) that denote the
above inequalities.
So, say we have `x ≁ 42 ∈ refuts` in the term oracle context and
try to solve an equality like `x ~ 42`. The entry in the refutable
environment will immediately lead to a contradiction.
This machinery renders the whole `PmExprEq` and `ComplexEq` business
unnecessary, getting rid of a lot of (mostly dead) code.
See the Note [Refutable shapes] in TmOracle for a place to start.
Metric Decrease:
T11195
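
A minimal sketch of the refutable-shape environment with heavily simplified types (GHC's PmRefutEnv and PmAltCon are richer than this):

    import qualified Data.Map.Strict as M

    type Var   = String    -- stand-in for GHC's Id
    type PmLit = Integer   -- stand-in for PmAltCon (only literals here)

    -- For each variable, the shapes it can no longer have.
    type PmRefutEnv = M.Map Var [PmLit]

    -- Record "x is not lit".
    addRefut :: Var -> PmLit -> PmRefutEnv -> PmRefutEnv
    addRefut x lit = M.insertWith (++) x [lit]

    -- Solving "x ~ lit" later contradicts a recorded refutation.
    contradicts :: Var -> PmLit -> PmRefutEnv -> Bool
    contradicts x lit env = lit `elem` M.findWithDefault [] x env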
| |
As far as I can tell, the fields within `Settings` aren't *intrinsically*
related. They just happen to be initialized the same way (in particular
prior to the rest of `DynFlags`), and that is why they are grouped
together.
Within `Settings`, however, there are groups of settings that clearly do
share something in common, regardless of how anything is
initialized.
In the spirit of GHC being a library, where the end consumer may choose
to initialize this configuration in arbitrary ways, I made some new data
types for those groups internal to `Settings`, and used them to define
`Settings` instead. Hopefully this is a baby step towards a general
decoupling of the stateful and stateless parts of GHC.
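
A hedged sketch of the regrouping (the group and field names here are illustrative, not necessarily the ones the patch introduces):

    data FileSettings = FileSettings
      { fileSettings_topDir      :: FilePath
      , fileSettings_globalPkgDb :: FilePath
      }

    data ToolSettings = ToolSettings
      { toolSettings_pgm_cc :: String
      , toolSettings_pgm_ld :: String
      }

    -- 'Settings' is now defined in terms of the smaller, coherent groups.
    data Settings = Settings
      { sFileSettings :: FileSettings
      , sToolSettings :: ToolSettings
      }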
| |
Unboxed tuples and sums take extra RuntimeRep arguments,
which must be manually passed in a few places.
This was not done in deSugar/Check.
This error was hidden because the zipping functions in TyCoRep
ignored lists of mismatching lengths. This is now fixed;
the lengths are now checked by calling zipEqual.
As suggested in #16565, I moved checking for isTyVar and
isCoVar to zipTyEnv and zipCoEnv.
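
A minimal sketch of a length-checked zip in the spirit of the `zipEqual` mentioned above (not necessarily its exact signature):

    zipEqual :: String -> [a] -> [b] -> [(a, b)]
    zipEqual _   []     []     = []
    zipEqual who (x:xs) (y:ys) = (x, y) : zipEqual who xs ys
    zipEqual who _      _      = error ("zipEqual: unequal lengths: " ++ who)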
| |
This commit splits out the subset of GhcException which does not depend on
pretty-printing (SDoc), as a new datatype called
PlainGhcException. These exceptions can be caught as GhcException,
because 'fromException' will convert them.
The motivation for this change is that the Panic module
transitively depends on many modules, primarily due to pretty-printing
code. It's on the order of about 130 modules. This large set of
dependencies has a few implications:
1. To avoid cycles / use of boot files, these dependencies cannot
throw GhcException.
2. There are some utility modules that use UnboxedTuples and also use
`panic`. This means that when loading GHC into GHCi, about 130
additional modules would need to be compiled instead of
interpreted. Splitting the non-pprint exception throwing into a new
module resolves this issue. See #13101
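
A hedged sketch of the catch-as-GhcException mechanism with heavily simplified types (the real exceptions have more constructors and carry SDocs):

    import Control.Exception (Exception (..), SomeException (..))
    import Data.Typeable (cast)

    newtype PlainGhcException = PlainPanic String deriving Show
    newtype GhcException     = Panic      String deriving Show

    instance Exception PlainGhcException

    instance Exception GhcException where
      fromException (SomeException e)
        | Just ghc            <- cast e = Just ghc         -- a real GhcException
        | Just (PlainPanic s) <- cast e = Just (Panic s)   -- converted on the fly
        | otherwise                     = Nothing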
| |
When printf debugging this can be helpful.
| |
Fixes #16052
When the offset in `setByteArray#` is statically known, we can provide
better alignment guarantees than just 1 byte.
Also, memset can now do 64-bit wide sets.
The current memset intrinsic is not optimal, however, and can be
improved for the case where we know that we are dealing with
(baseAddress at known alignment) + offset
For instance, on 64-bit,
`setByteArray# s 1# 23# 0#`,
given that the bytearray is 8-byte aligned, could be unrolled into
`movb, movw, movl, movq, movq`; but currently it is
`movb x23`, since an alignment of 1 is all we can embed into the MO_Memset op.
| |
Also removes a couple unnecessary MagicHash pragmas
| |
This moves all URL references to Trac Wiki to their corresponding
GitLab counterparts.
This substitution is classified as follows:
1. Automated substitution using sed with Ben's mapping rule [1]
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...
2. Manual substitution for URLs containing `#` index
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz
3. Manual substitution for strings starting with `Commentary`
Old: Commentary/XxxYyy...
New: commentary/xxx-yyy...
See also !539
[1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
| |
As discussed in #16450, this feature regresses CI on Windows, causing
non-deterministic failures due to missing files.
| |
This moves all URL references to Trac tickets to their corresponding
GitLab counterparts.
| |
The GHC native code generator generates .incbin and .file directives. We
need to escape the strings in those directives correctly on Windows (see #16389).
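
A minimal sketch of the kind of escaping involved (illustrative; the real fix lives in the NCG pretty-printer): backslashes in Windows paths and any embedded quotes must not reach the directive unescaped.

    escapeAsmString :: String -> String
    escapeAsmString = concatMap esc
      where
        esc '\\' = "\\\\"   -- Windows path separators
        esc '"'  = "\\\""   -- embedded quotes
        esc c    = [c]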