Commit message (Author, Age, Files, Lines)
* Fix #19044 by tweaking unification in inst lookup [wip/T19044] (Richard Eisenberg, 2020-12-16, 5 files, -52/+168)
  See Note [Infinitary substitution in lookup] in GHC.Core.InstEnv and Note [Unification result] in GHC.Core.Unify.
  Test case: typecheck/should_compile/T190{44,52}
  Close #19044
  Close #19052
* Bump time submodule. (Andreas Klebinger, 2020-12-08, 1 file, -0/+0)
  This should fix #19002.
* GHC.Cmm.Opt: Be stricter in results. (Andreas Klebinger, 2020-12-08, 1 file, -51/+51)
  Optimization either returns Nothing if nothing is to be done or `Just <cmmExpr>` otherwise. There is no point in being lazy in `cmmExpr`. We usually inspect this element so the thunk gets forced not long after. We might eliminate it as dead code once in a blue moon but that's not a case worth optimizing for.
  Overall the impact of this is rather low, as Cmm.Opt doesn't allocate much (compared to the rest of GHC) to begin with.
* Cmm.Sink: Optimize retaining of assignments, live sets. (Andreas Klebinger, 2020-12-08, 5 files, -52/+174)
  Sinking requires us to track live local regs after each cmm statement. We used to do this via "Set LocalReg". However we can replace this with a solution based on IntSet which is overall more efficient without losing much. The thing we lose is width of the variables, which isn't used by the sinking pass anyway.
  I also reworked how we keep assignments to regs mentioned in skipped assignments. I put the details into Note [Keeping assignemnts mentioned in skipped RHSs]. The gist of it is that instead of keeping track of it via the use count, which is an `IntMap Int`, we now use the live regs set (IntSet), which is quite a bit faster.
  I think it also matches the semantics a lot better. The skipped (not discarded) assignment does in fact keep the regs on its rhs alive, so keeping track of this in the live set seems like the clearer solution as well.
  Improves allocations for T3294 by yet another 1%.
* Cmm: Make a few types and utility function slightly stricter. (Andreas Klebinger, 2020-12-08, 2 files, -9/+11)
  About 0.6% reduction in allocations for the code I was looking at. Not a huge difference but no need to throw away performance.
* CmmSink: Force inlining of foldRegsDefd (Andreas Klebinger, 2020-12-08, 1 file, -6/+45)
  Helps avoid allocating the folding function. Improves perf for T3294 by about 1%.
* CodeGen: Make folds User/DefinerOfRegs INLINEABLE. (Andreas Klebinger, 2020-12-08, 2 files, -0/+7)
  Reduces allocation for the test case I was looking at by about 1.2%. Mostly from avoiding allocation of some folding functions which turn into let-no-escape bindings which just reuse their environment instead.
  We also force inlining in a few key places in CmmSink which helps a bit more.
* hadrian: build the _l and _thr_l rts flavours in the develN flavours (Adam Sandberg Ericsson, 2020-12-08, 1 file, -1/+1)
  The ghc binary requires the eventlog rts since fc644b1a643128041cfec25db84e417851e28bab
* Fix kind inference for data types. Again. (Simon Peyton Jones, 2020-12-08, 32 files, -331/+735)
  This patch fixes several aspects of kind inference for data type declarations, especially data /instance/ declarations.
  Specifically:
  1. In kcConDecls/kcConDecl make it clear that the tc_res_kind argument is only used in the H98 case; and in that case there is no result kind signature; and hence no need for the disgusting splitPiTys in kcConDecls (now thankfully gone). The GADT case is a bit different to before, and much nicer. This is what fixes #18891. See Note [kcConDecls: kind-checking data type decls].
  2. Do not look at the constructor decls of a data/newtype instance in tcDataFamInstanceHeader. See GHC.Tc.TyCl.Instance Note [Kind inference for data family instances]. This was a new realisation that arose when doing (1). This causes a few knock-on effects in the test suite, because we require more information than before in the instance /header/. New user-manual material about this in "Kind inference in data type declarations" and "Kind inference for data/newtype instance declarations".
  3. Minor improvement in kcTyClDecl, combining GADT and H98 cases.
  4. Fix #14111 and #8707 by allowing the header of a data instance to affect kind inference for the data constructor signatures, as described at length in Note [GADT return types] in GHC.Tc.TyCl. This led to a modest refactoring of the arguments (and argument order) of tcConDecl/tcConDecls.
  5. Fix #19000 by inverting the sense of the test in new_locs in GHC.Tc.Solver.Canonical.canDecomposableTyConAppOK.
* testsuite: Add a test for #18923 (Ben Gamari, 2020-12-05, 2 files, -0/+20)
* Fix bad span calculations of post qualified imports (Shayne Fletcher, 2020-12-05, 4 files, -8/+83)
* gitlab-ci: Run linters through ci.sh (Ben Gamari, 2020-12-03, 2 files, -9/+12)
  Ensuring that the right toolchain is used.
* gitlab-ci: Fix copy-paste error (Ben Gamari, 2020-12-03, 1 file, -6/+6)
  Also be more consistent in quoting.
* rts/linker: Use m32 to allocate symbol extras in PEi386 (Ben Gamari, 2020-12-01, 4 files, -33/+20)
* rts/m32: Introduce NEEDS_M32 macro (Ben Gamari, 2020-12-01, 5 files, -27/+32)
  Instead of relying on RTS_LINKER_USE_MMAP
* rts/Linker: Introduce Windows implementations for mmapForLinker, et al. (Ben Gamari, 2020-12-01, 1 file, -1/+32)
* rts/linker: Introduce munmapForLinker (Ben Gamari, 2020-12-01, 7 files, -30/+22)
  Consolidates munmap calls to ensure consistent error handling.
* rts: Introduce mmapAnonForLinker (Ben Gamari, 2020-12-01, 8 files, -26/+41)
  Previously most of the uses of mmapForLinker were mapping anonymous memory, resulting in a great deal of unnecessary repetition. Factor this out into a new helper.
  Also fixes a few places where error checking was missing or suboptimal.
* Rename the flattener to become the rewriter. (Richard Eisenberg, 2020-12-01, 16 files, -492/+459)
  Now that flattening doesn't produce flattening variables, it's not really flattening anything: it's rewriting. This change also means that the rewriter can no longer be confused with the core flattener (in GHC.Core.Unify), which is sometimes used during type-checking.
* Remove flattening variables (Richard Eisenberg, 2020-12-01, 119 files, -3752/+3693)
  This patch redesigns the flattener to simplify type family applications directly instead of using flattening meta-variables and skolems. The key innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS is either a type variable or an exactly-saturated type family application; either can now be rewritten using a CEqCan constraint in the inert set.
  Because the flattener no longer reduces all type family applications to variables, there was some performance degradation if a lengthy type family application is now flattened over and over (not making progress). To compensate, this patch contains some extra optimizations in the flattener, leading to a number of performance improvements.
  Close #18875.
  Close #18910.
  There are many extra parts of the compiler that had to be affected in writing this patch:
  * The family-application cache (formerly the flat-cache) sometimes stores coercions built from Given inerts. When these inerts get kicked out, we must kick out from the cache as well. (This was, I believe, true previously, but somehow never caused trouble.) Kicking out from the cache requires adding a filterTM function to TrieMap.
  * This patch obviates the need to distinguish "blocking" coercion holes from non-blocking ones (which, previously, arose from CFunEqCans). There is thus some simplification around coercion holes.
  * Extra commentary throughout parts of the code I read through, to preserve the knowledge I gained while working.
  * A change in the pure unifier around unifying skolems with other types. Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented in Note [Binding when looking up instances] in GHC.Core.InstEnv.
  * Some more use of MCoercion where appropriate.
  * Previously, class-instance lookup automatically noticed that e.g. C Int was a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to a variable. Now, a little more care must be taken around checking for unifying instances.
  * Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly, because (=>) is not a tycon in Haskell. Fixed now, but there are some knock-on changes in e.g. TrieMap code and in the canonicaliser.
  * New function anyFreeVarsOf{Type,Co} to check whether a free variable satisfies a certain predicate.
  * Type synonyms now remember whether or not they are "forgetful"; a forgetful synonym drops at least one argument. This is useful when flattening; see flattenView.
  * The pattern-match completeness checker invokes the solver. This invocation might need to look through newtypes when checking representational equality. Thus, the desugarer needs to keep track of the in-scope variables to know what newtype constructors are in scope. I bet this bug was around before but never noticed.
  * Extra-constraints wildcards are no longer simplified before printing. See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.
  * Whether or not there are Given equalities has become slightly subtler. See the new HasGivenEqs datatype.
  * Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical explains a significant new wrinkle in the new approach.
  * See Note [What might match later?] in GHC.Tc.Solver.Interact, which explains the fix to #18910.
  * The inert_count field of InertCans wasn't actually used, so I removed it.
  Though I (Richard) did the implementation, Simon PJ was very involved in design and review.
  This updates the Haddock submodule to avoid #18932 by adding a type signature.
  -------------------------
  Metric Decrease: T12227 T5030 T9872a T9872b T9872c
  Metric Increase: T9872d
  -------------------------
* Bump the # of commits searched for perf baseline (Richard Eisenberg, 2020-12-01, 1 file, -1/+1)
  The previous value of 75 meant that a feature branch with more than 75 commits would get spurious CI passes.
  This affects #18692, but does not fix that ticket, because if a baseline cannot be found, we should fail, not succeed.
* Move core flattening algorithm to Core.Unify (Richard Eisenberg, 2020-12-01, 17 files, -693/+731)
  This sets the stage for a later change, where this algorithm will be needed from GHC.Core.InstEnv.
  This commit also splits GHC.Core.Map into GHC.Core.Map.Type and GHC.Core.Map.Expr, in order to avoid module import cycles with GHC.Core.
* Include tried paths in findToolDir error (jneira, 2020-11-30, 1 file, -6/+8)
* rts/linker: Don't declare dynamic objects with image_mapped (GHC GitLab CI, 2020-11-30, 1 file, -1/+1)
  This previously resulted in warnings due to spurious unmap failures.
* rts/linker: Move shared library loading logic into Elf.c (Ben Gamari, 2020-11-30, 5 files, -184/+197)
* rts/linker: Initialise CCSs from native shared objects (Ben Gamari, 2020-11-30, 2 files, -1/+7)
* rts/linker: Don't allow shared libraries to be loaded multiple times (Ben Gamari, 2020-11-30, 1 file, -0/+9)
* dirty MVAR after mutating TSO queue head (Viktor Dukhovni, 2020-11-30, 2 files, -15/+28)
  While the original head and tail of the TSO queue may be in the same generation as the MVAR, interior elements of the queue could be younger after a GC run and may then be exposed by a putMVar operation that updates the queue head.
  Resolves #18919
* Apply suggestion to libraries/base/Data/Foldable.hs (chessai, 2020-11-30, 1 file, -1/+1)
* Apply suggestion to libraries/base/Data/Foldable.hs (chessai, 2020-11-30, 1 file, -1/+1)
* Optimisations in Data.Foldable (T17867) (chessai, 2020-11-30, 1 file, -20/+28)
  This PR concerns the following functions from `Data.Foldable`:
  * minimum
  * maximum
  * sum
  * product
  * minimumBy
  * maximumBy
  - Default implementations of these functions now use `foldl'` or `foldMap'`.
  - All have been marked with INLINEABLE to make room for further optimisations.
* Allow deploy:pages job to fail (Ryan Scott, 2020-11-30, 1 file, -0/+2)
  See #18973.
* rts/linker: Replace some ASSERTs with CHECK (Ben Gamari, 2020-11-30, 5 files, -38/+35)
  In the past some people have confused ASSERT, which is for checking internal invariants, with CHECK, which should be used when checking things that might fail due to bad input (and therefore should be enabled even in the release compiler). Change some of these cases in the linker to use CHECK.
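The distinction drawn in this message can be sketched in C roughly as follows. This is an illustrative sketch only, not the RTS definitions: the `SKETCH_` macro names, the `SKETCH_DEBUG` switch, the failure counter, and `sketch_section_in_range` are all hypothetical, and a real always-on check would normally abort rather than just record the failure.

```c
#include <stdio.h>

/* Hypothetical stand-ins; names and behaviour are assumptions, not the RTS macros. */
int sketch_check_failures = 0;

/* Records and reports a failed check. A real implementation would likely abort. */
void sketch_check_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
    sketch_check_failures++;
}

/* Always evaluated, even in release builds: for conditions bad input can violate. */
#define SKETCH_CHECK(cond) \
    do { if (!(cond)) sketch_check_failed(#cond, __FILE__, __LINE__); } while (0)

/* Only evaluated when SKETCH_DEBUG is defined: for internal invariants. */
#ifdef SKETCH_DEBUG
#define SKETCH_ASSERT(cond) SKETCH_CHECK(cond)
#else
#define SKETCH_ASSERT(cond) ((void)0)
#endif

/* Validates an index read from untrusted input. Returns 1 if in range, else 0. */
int sketch_section_in_range(int index, int n_sections)
{
    SKETCH_ASSERT(n_sections >= 0);                  /* internal invariant */
    SKETCH_CHECK(index >= 0 && index < n_sections);  /* depends on the input */
    return (index >= 0 && index < n_sections) ? 1 : 0;
}
```

Calling `sketch_section_in_range(7, 4)` reports the failed check in every build, whereas the invariant on `n_sections` is only verified when `SKETCH_DEBUG` is defined.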
* rts: Use CHECK instead of assert (Ben Gamari, 2020-11-30, 2 files, -30/+28)
  Use the GHC wrappers instead of <assert.h>.
* rts/m32: Refactor handling of allocator seeding (Ben Gamari, 2020-11-30, 1 file, -25/+36)
  Previously, in an attempt to reduce fragmentation, each new allocator would map a region of M32_MAX_PAGES fresh pages to seed itself. However, this ends up being extremely wasteful since it turns out that we often use fewer than this. Consequently, these pages end up getting freed, which ends up fragmenting our address space more than we would have if we had naively allocated pages on-demand.
  Here we refactor m32 to avoid this waste while achieving the fragmentation mitigation previously desired. In particular, we move all page allocation into the global m32_alloc_page, which will pull a page from the free page pool. If the free page pool is empty we then refill it by allocating a region of M32_MAP_PAGES and adding them to the pool.
  Furthermore, we do away with the initial seeding entirely. That is, the allocator starts with no active pages: pages are rather allocated on an as-needed basis.
  On the whole this ends up being a pleasingly simple change, simultaneously making m32 more efficient, more robust, and simpler.
  Fixes #18980.
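The pool-refill scheme described in this message might look roughly like the C sketch below. It is not the m32 code: all `sketch_` names are hypothetical, `SKETCH_MAP_PAGES` is an assumed stand-in for M32_MAP_PAGES with an arbitrary value, and `malloc` stands in for the real page mapping so that the sketch stays portable.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096
#define SKETCH_MAP_PAGES 32   /* assumed stand-in for M32_MAP_PAGES */

/* Free pages are threaded into a singly linked list through their first word. */
struct sketch_free_page { struct sketch_free_page *next; };

struct sketch_free_page *sketch_pool = NULL;
size_t sketch_pool_len = 0;
size_t sketch_regions_mapped = 0;

/* Obtains one region of SKETCH_MAP_PAGES pages and adds each page to the pool.
   malloc is an assumption here; the real allocator maps memory instead. */
int sketch_refill_pool(void)
{
    char *region = malloc((size_t)SKETCH_MAP_PAGES * SKETCH_PAGE_SIZE);
    if (region == NULL) return -1;
    sketch_regions_mapped++;
    for (size_t i = 0; i < SKETCH_MAP_PAGES; i++) {
        struct sketch_free_page *p =
            (struct sketch_free_page *)(region + i * SKETCH_PAGE_SIZE);
        p->next = sketch_pool;
        sketch_pool = p;
        sketch_pool_len++;
    }
    return 0;
}

/* Pulls a page from the pool, refilling the pool first if it is empty. */
void *sketch_alloc_page(void)
{
    if (sketch_pool == NULL && sketch_refill_pool() != 0) return NULL;
    struct sketch_free_page *p = sketch_pool;
    sketch_pool = p->next;
    sketch_pool_len--;
    return p;
}

/* Returns a page to the pool instead of releasing it to the system. */
void sketch_free_page(void *page)
{
    struct sketch_free_page *p = page;
    p->next = sketch_pool;
    sketch_pool = p;
    sketch_pool_len++;
}
```

There is no up-front seeding in this sketch: the first call to `sketch_alloc_page` triggers the only region allocation, and freed pages are reused before any further region is requested.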
* nonmoving: Ensure that evacuated large objects are marked (GHC GitLab CI, 2020-11-29, 2 files, -7/+60)
  See Note [Non-moving GC: Marking evacuated objects].
* nonmoving: Add reference to Ueno 2016 (Ben Gamari, 2020-11-29, 1 file, -2/+7)
* nonmoving: Don't join to mark_thread on shutdown (GHC GitLab CI, 2020-11-29, 1 file, -1/+0)
  The mark thread is not joinable as we detach from it on creation.
* OSThreads: Fix error code checking (GHC GitLab CI, 2020-11-29, 1 file, -2/+3)
  pthread_join returns its error code and apparently doesn't set errno.
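The behaviour noted in this message, that pthread_join reports failure through its return value, can be illustrated with a small C sketch. The helper names (`sketch_worker`, `sketch_join`, `sketch_spawn_and_join`) are hypothetical and this is not the OSThreads code.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body, present only so that there is something to join. */
void *sketch_worker(void *arg)
{
    return arg;
}

/* Joins a thread; returns 0 on success or the error number from pthread_join. */
int sketch_join(pthread_t tid)
{
    int err = pthread_join(tid, NULL);   /* the error number is the return value */
    if (err != 0) {
        /* errno is deliberately not consulted: pthread_join is not specified to set it. */
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
    }
    return err;
}

/* Creates a joinable thread with default attributes and joins it. */
int sketch_spawn_and_join(void)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, sketch_worker, NULL);
    if (err != 0) return err;
    return sketch_join(tid);
}
```

Depending on the platform, building this may require the compiler's `-pthread` option.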
* Updates: Don't zero slop until closure has been pushed (GHC GitLab CI, 2020-11-29, 1 file, -1/+1)
  Ensure that the free variables have been pushed to the update remembered set before we zero the slop.
* nonmoving: Add missing write barrier in shrinkSmallByteArray (GHC GitLab CI, 2020-11-29, 1 file, -0/+15)
* rts/Messages: Add missing write barrier in THROWTO message update (GHC GitLab CI, 2020-11-29, 3 files, -6/+14)
  After a THROWTO message has been handled, the message closure is overwritten by a NULL message. We must ensure that the original closure's pointers continue to be visible to the nonmoving GC.
* nonmoving: Fix regression from TSAN work (GHC GitLab CI, 2020-11-29, 1 file, -7/+2)
  The TSAN rework (specifically aad1f803) introduced a subtle regression in GC.c, swapping `g0` in place of `gen`. Whoops!
  Fixes #18997.
* ThreadPaused: Don't zero slop until free vars are pushed (GHC GitLab CI, 2020-11-29, 2 files, -6/+11)
  When threadPaused blackholes a thunk it calls `OVERWRITING_CLOSURE` to zero the slop for the benefit of the sanity checker. Previously this was done *before* pushing the thunk's free variables to the update remembered set. Consequently we would push zeroed pointers to the update remembered set.
* withTimings: Emit allocations counter (Ben Gamari, 2020-11-29, 1 file, -5/+14)
  This will allow us to back out the allocations per compiler pass from the eventlog. Note that we dump the allocation counter rather than the difference since this will allow us to determine how much work is done *between* `withTiming` blocks.
* testsuite: Mark T14702 as fragile on Windows (Ben Gamari, 2020-11-28, 1 file, -0/+1)
  Due to #18953.
* Cleanup some primop constructor names (John Ericson, 2020-11-28, 3 files, -67/+67)
  Harmonize the internal (big sum type) names of the native vs fixed-sized number primops a bit. (Mainly by renaming the former.)
  No user-facing names are changed.
* Make primop handler indentation more consistent (John Ericson, 2020-11-28, 1 file, -49/+49)
* Small optimization to CmmSink. (Andreas Klebinger, 2020-11-28, 1 file, -4/+11)
  Inside `regsUsedIn` we can avoid some thunks by specializing the recursion. In particular we avoid the thunk for `(f e z)` in the MachOp/Load branches, where we know this will evaluate to z.
  Reduces allocations for T3294 by ~1%.
* ghc-heap: partial TSO/STACK decoding (David Eichmann, 2020-11-28, 22 files, -22/+1046)
  Co-authored-by: Sven Tennie <sven.tennie@gmail.com>
  Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
  Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>