| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
See Note [Infinitary substitution in lookup] in GHC.Core.InstEnv
and Note [Unification result] in GHC.Core.Unify.
Test case: typecheck/should_compile/T190{44,52}
Close #19044
Close #19052
|
|
|
|
| |
This should fix #19002.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimization either returns Nothing if nothing is to be done or
`Just <cmmExpr>` otherwise. There is no point in being lazy in
`cmmExpr`. We usually inspect this element so the thunk gets forced
not long after.
We might eliminate it as dead code once in a blue moon but that's
not a case worth optimizing for.
Overall the impact of this is rather low. As Cmm.Opt doesn't allocate
much (compared to the rest of GHC) to begin with.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sinking requires us to track live local regs after each
cmm statement. We used to do this via "Set LocalReg".
However we can replace this with a solution based on IntSet
which is overall more efficient without losing much. The thing
we lose is width of the variables, which isn't used by the sinking
pass anyway.
I also reworked how we keep assignments to regs mentioned in
skipped assignments. I put the details into
Note [Keeping assignemnts mentioned in skipped RHSs].
The gist of it is instead of keeping track of it via the use count
which is a `IntMap Int` we now use the live regs set (IntSet) which
is quite a bit faster.
I think it also matches the semantics a lot better. The skipped
(not discarded) assignment does in fact keep the regs on it's rhs
alive so keeping track of this in the live set seems like the clearer
solution as well.
Improves allocations for T3294 by yet another 1%.
|
|
|
|
|
|
| |
About 0.6% reduction in allocations for the code I was looking at.
Not a huge difference but no need to throw away performance.
|
|
|
|
|
| |
Helps avoid allocating the folding function. Improves
perf for T3294 by about 1%.
|
|
|
|
|
|
|
|
|
| |
Reduces allocation for the test case I was looking at by about 1.2%.
Mostly from avoiding allocation of some folding functions which turn
into let-no-escape bindings which just reuse their environment instead.
We also force inlining in a few key places in CmmSink which helps a bit
more.
|
|
|
|
|
| |
The ghc binary requires the eventlog rts since
fc644b1a643128041cfec25db84e417851e28bab
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes several aspects of kind inference for data type
declarations, especially data /instance/ declarations
Specifically
1. In kcConDecls/kcConDecl make it clear that the tc_res_kind argument
is only used in the H98 case; and in that case there is no result
kind signature; and hence no need for the disgusting splitPiTys in
kcConDecls (now thankfully gone).
The GADT case is a bit different to before, and much nicer.
This is what fixes #18891.
See Note [kcConDecls: kind-checking data type decls]
2. Do not look at the constructor decls of a data/newtype instance
in tcDataFamInstanceHeader. See GHC.Tc.TyCl.Instance
Note [Kind inference for data family instances]. This was a
new realisation that arose when doing (1)
This causes a few knock-on effects in the tests suite, because
we require more information than before in the instance /header/.
New user-manual material about this in "Kind inference in data type
declarations" and "Kind inference for data/newtype instance
declarations".
3. Minor improvement in kcTyClDecl, combining GADT and H98 cases
4. Fix #14111 and #8707 by allowing the header of a data instance
to affect kind inferece for the the data constructor signatures;
as described at length in Note [GADT return types] in GHC.Tc.TyCl
This led to a modest refactoring of the arguments (and argument
order) of tcConDecl/tcConDecls.
5. Fix #19000 by inverting the sense of the test in new_locs
in GHC.Tc.Solver.Canonical.canDecomposableTyConAppOK.
|
| |
|
| |
|
|
|
|
| |
Ensuring that the right toolchain is used.
|
|
|
|
| |
Also be more consistent in quoting.
|
| |
|
|
|
|
| |
Instead of relying on RTS_LINKER_USE_MMAP
|
| |
|
|
|
|
| |
Consolidates munmap calls to ensure consistent error handling.
|
|
|
|
|
|
|
|
| |
Previously most of the uses of mmapForLinker were mapping anonymous
memory, resulting in a great deal of unnecessary repetition. Factor this
out into a new helper.
Also fixes a few places where error checking was missing or suboptimal.
|
|
|
|
|
|
|
|
| |
Now that flattening doesn't produce flattening variables,
it's not really flattening anything: it's rewriting. This
change also means that the rewriter can no longer be confused
the core flattener (in GHC.Core.Unify), which is sometimes used
during type-checking.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch redesigns the flattener to simplify type family applications
directly instead of using flattening meta-variables and skolems. The key new
innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS
is either a type variable or exactly-saturated type family application; either
can now be rewritten using a CEqCan constraint in the inert set.
Because the flattener no longer reduces all type family applications to
variables, there was some performance degradation if a lengthy type family
application is now flattened over and over (not making progress). To
compensate, this patch contains some extra optimizations in the flattener,
leading to a number of performance improvements.
Close #18875.
Close #18910.
There are many extra parts of the compiler that had to be affected in writing
this patch:
* The family-application cache (formerly the flat-cache) sometimes stores
coercions built from Given inerts. When these inerts get kicked out, we must
kick out from the cache as well. (This was, I believe, true previously, but
somehow never caused trouble.) Kicking out from the cache requires adding a
filterTM function to TrieMap.
* This patch obviates the need to distinguish "blocking" coercion holes from
non-blocking ones (which, previously, arose from CFunEqCans). There is thus
some simplification around coercion holes.
* Extra commentary throughout parts of the code I read through, to preserve
the knowledge I gained while working.
* A change in the pure unifier around unifying skolems with other types.
Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented
in Note [Binding when looking up instances] in GHC.Core.InstEnv.
* Some more use of MCoercion where appropriate.
* Previously, class-instance lookup automatically noticed that e.g. C Int was
a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to
a variable. Now, a little more care must be taken around checking for
unifying instances.
* Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly,
because (=>) is not a tycon in Haskell. Fixed now, but there are some
knock-on changes in e.g. TrieMap code and in the canonicaliser.
* New function anyFreeVarsOf{Type,Co} to check whether a free variable
satisfies a certain predicate.
* Type synonyms now remember whether or not they are "forgetful"; a forgetful
synonym drops at least one argument. This is useful when flattening; see
flattenView.
* The pattern-match completeness checker invokes the solver. This invocation
might need to look through newtypes when checking representational equality.
Thus, the desugarer needs to keep track of the in-scope variables to know
what newtype constructors are in scope. I bet this bug was around before but
never noticed.
* Extra-constraints wildcards are no longer simplified before printing.
See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.
* Whether or not there are Given equalities has become slightly subtler.
See the new HasGivenEqs datatype.
* Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical
explains a significant new wrinkle in the new approach.
* See Note [What might match later?] in GHC.Tc.Solver.Interact, which
explains the fix to #18910.
* The inert_count field of InertCans wasn't actually used, so I removed
it.
Though I (Richard) did the implementation, Simon PJ was very involved
in design and review.
This updates the Haddock submodule to avoid #18932 by adding
a type signature.
-------------------------
Metric Decrease:
T12227
T5030
T9872a
T9872b
T9872c
Metric Increase:
T9872d
-------------------------
|
|
|
|
|
|
|
|
| |
The previous value of 75 meant that a feature branch with
more than 75 commits would get spurious CI passes.
This affects #18692, but does not fix that ticket, because
if a baseline cannot be found, we should fail, not succeed.
|
|
|
|
|
|
|
|
|
|
| |
This sets the stage for a later change, where this
algorithm will be needed from GHC.Core.InstEnv.
This commit also splits GHC.Core.Map into
GHC.Core.Map.Type and GHC.Core.Map.Expr,
in order to avoid module import cycles
with GHC.Core.
|
| |
|
|
|
|
| |
This previously resulted in warnings due to spurious unmap failures.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
While the original head and tail of the TSO queue may be in the same
generation as the MVAR, interior elements of the queue could be younger
after a GC run and may then be exposed by putMVar operation that updates
the queue head.
Resolves #18919
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR concerns the following functions from `Data.Foldable`:
* minimum
* maximum
* sum
* product
* minimumBy
* maximumBy
- Default implementations of these functions now use `foldl'` or `foldMap'`.
- All have been marked with INLINEABLE to make room for further optimisations.
|
|
|
|
| |
See #18973.
|
|
|
|
|
|
|
|
| |
In the past some people have confused ASSERT, which is for checking
internal invariants, which CHECK, which should be used when checking
things that might fail due to bad input (and therefore should be enabled
even in the release compiler). Change some of these cases in the linker
to use CHECK.
|
|
|
|
| |
Use the GHC wrappers instead of <assert.h>.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, in an attempt to reduce fragmentation, each new allocator
would map a region of M32_MAX_PAGES fresh pages to seed itself. However,
this ends up being extremely wasteful since it turns out that we often
use fewer than this. Consequently, these pages end up getting freed
which, ends up fragmenting our address space more than than we would
have if we had naively allocated pages on-demand.
Here we refactor m32 to avoid this waste while achieving the
fragmentation mitigation previously desired. In particular, we move all
page allocation into the global m32_alloc_page, which will pull a page
from the free page pool. If the free page pool is empty we then refill
it by allocating a region of M32_MAP_PAGES and adding them to the pool.
Furthermore, we do away with the initial seeding entirely. That is, the
allocator starts with no active pages: pages are rather allocated on an
as-needed basis.
On the whole this ends up being a pleasingly simple change,
simultaneously making m32 more efficient, more robust, and simpler.
Fixes #18980.
|
|
|
|
| |
See Note [Non-moving GC: Marking evacuated objects].
|
| |
|
|
|
|
| |
The mark thread is not joinable as we detach from it on creation.
|
|
|
|
| |
pthread_join returns its error code and apparently doesn't set errno.
|
|
|
|
|
| |
Ensure that the the free variables have been pushed to the update
remembered set before we zero the slop.
|
| |
|
|
|
|
|
|
| |
After a THROWTO message has been handle the message closure is
overwritten by a NULL message. We must ensure that the original
closure's pointers continue to be visible to the nonmoving GC.
|
|
|
|
|
|
|
| |
The TSAN rework (specifically aad1f803) introduced a subtle regression
in GC.c, swapping `g0` in place of `gen`. Whoops!
Fixes #18997.
|
|
|
|
|
|
|
|
| |
When threadPaused blackholes a thunk it calls `OVERWRITING_CLOSURE` to
zero the slop for the benefit of the sanity checker. Previously this was
done *before* pushing the thunk's free variables to the update
remembered set. Consequently we would pull zero'd pointers to the update
remembered set.
|
|
|
|
|
|
|
| |
This will allow us to back out the allocations per compiler pass from
the eventlog. Note that we dump the allocation counter rather than the
difference since this will allow us to determine how much work is done
*between* `withTiming` blocks.
|
|
|
|
| |
Due to #18953.
|
|
|
|
|
|
|
| |
Harmonize the internal (big sum type) names of the native vs fixed-sized
number primops a bit. (Mainly by renaming the former.)
No user-facing names are changed.
|
| |
|
|
|
|
|
|
|
|
| |
Inside `regsUsedIn` we can avoid some thunks by specializing the
recursion. In particular we avoid the thunk for `(f e z)` in the
MachOp/Load branches, where we know this will evaluate to z.
Reduces allocations for T3294 by ~1%.
|
|
|
|
|
|
| |
Co-authored-by: Sven Tennie <sven.tennie@gmail.com>
Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>
|