Commit message | Author | Age | Files | Lines
* Remove flattening variables | Richard Eisenberg | 2020-12-01 | 119 | -3752/+3693
  This patch redesigns the flattener to simplify type family applications directly instead of using flattening meta-variables and skolems. The key new innovation is the CanEqLHS type and the new CEqCan constraint (Ct). A CanEqLHS is either a type variable or exactly-saturated type family application; either can now be rewritten using a CEqCan constraint in the inert set.

  Because the flattener no longer reduces all type family applications to variables, there was some performance degradation if a lengthy type family application is now flattened over and over (not making progress). To compensate, this patch contains some extra optimizations in the flattener, leading to a number of performance improvements.

  Close #18875.
  Close #18910.

  There are many extra parts of the compiler that had to be affected in writing this patch:

  * The family-application cache (formerly the flat-cache) sometimes stores coercions built from Given inerts. When these inerts get kicked out, we must kick out from the cache as well. (This was, I believe, true previously, but somehow never caused trouble.) Kicking out from the cache requires adding a filterTM function to TrieMap.
  * This patch obviates the need to distinguish "blocking" coercion holes from non-blocking ones (which, previously, arose from CFunEqCans). There is thus some simplification around coercion holes.
  * Extra commentary throughout parts of the code I read through, to preserve the knowledge I gained while working.
  * A change in the pure unifier around unifying skolems with other types. Unifying a skolem now leads to SurelyApart, not MaybeApart, as documented in Note [Binding when looking up instances] in GHC.Core.InstEnv.
  * Some more use of MCoercion where appropriate.
  * Previously, class-instance lookup automatically noticed that e.g. C Int was a "unifier" to a target [W] C (F Bool), because the F Bool was flattened to a variable. Now, a little more care must be taken around checking for unifying instances.
  * Previously, tcSplitTyConApp_maybe would split (Eq a => a). This is silly, because (=>) is not a tycon in Haskell. Fixed now, but there are some knock-on changes in e.g. TrieMap code and in the canonicaliser.
  * New function anyFreeVarsOf{Type,Co} to check whether a free variable satisfies a certain predicate.
  * Type synonyms now remember whether or not they are "forgetful"; a forgetful synonym drops at least one argument. This is useful when flattening; see flattenView.
  * The pattern-match completeness checker invokes the solver. This invocation might need to look through newtypes when checking representational equality. Thus, the desugarer needs to keep track of the in-scope variables to know what newtype constructors are in scope. I bet this bug was around before but never noticed.
  * Extra-constraints wildcards are no longer simplified before printing. See Note [Do not simplify ConstraintHoles] in GHC.Tc.Solver.
  * Whether or not there are Given equalities has become slightly subtler. See the new HasGivenEqs datatype.
  * Note [Type variable cycles in Givens] in GHC.Tc.Solver.Canonical explains a significant new wrinkle in the new approach.
  * See Note [What might match later?] in GHC.Tc.Solver.Interact, which explains the fix to #18910.
  * The inert_count field of InertCans wasn't actually used, so I removed it.

  Though I (Richard) did the implementation, Simon PJ was very involved in design and review.

  This updates the Haddock submodule to avoid #18932 by adding a type signature.

  -------------------------
  Metric Decrease:
      T12227
      T5030
      T9872a
      T9872b
      T9872c
  Metric Increase:
      T9872d
  -------------------------
* Bump the # of commits searched for perf baseline | Richard Eisenberg | 2020-12-01 | 1 | -1/+1
  The previous value of 75 meant that a feature branch with more than 75 commits would get spurious CI passes.

  This affects #18692, but does not fix that ticket, because if a baseline cannot be found, we should fail, not succeed.
* Move core flattening algorithm to Core.Unify | Richard Eisenberg | 2020-12-01 | 17 | -693/+731
  This sets the stage for a later change, where this algorithm will be needed from GHC.Core.InstEnv.

  This commit also splits GHC.Core.Map into GHC.Core.Map.Type and GHC.Core.Map.Expr, in order to avoid module import cycles with GHC.Core.
* Include tried paths in findToolDir error | jneira | 2020-11-30 | 1 | -6/+8
* rts/linker: Don't declare dynamic objects with image_mapped | GHC GitLab CI | 2020-11-30 | 1 | -1/+1
  This previously resulted in warnings due to spurious unmap failures.
* rts/linker: Move shared library loading logic into Elf.c | Ben Gamari | 2020-11-30 | 5 | -184/+197
* rts/linker: Initialise CCSs from native shared objects | Ben Gamari | 2020-11-30 | 2 | -1/+7
* rts/linker: Don't allow shared libraries to be loaded multiple times | Ben Gamari | 2020-11-30 | 1 | -0/+9
* dirty MVAR after mutating TSO queue head | Viktor Dukhovni | 2020-11-30 | 2 | -15/+28
  While the original head and tail of the TSO queue may be in the same generation as the MVAR, interior elements of the queue could be younger after a GC run and may then be exposed by a putMVar operation that updates the queue head.

  Resolves #18919
* Apply suggestion to libraries/base/Data/Foldable.hs | chessai | 2020-11-30 | 1 | -1/+1
* Apply suggestion to libraries/base/Data/Foldable.hs | chessai | 2020-11-30 | 1 | -1/+1
* Optimisations in Data.Foldable (T17867) | chessai | 2020-11-30 | 1 | -20/+28
  This PR concerns the following functions from `Data.Foldable`:

  * minimum
  * maximum
  * sum
  * product
  * minimumBy
  * maximumBy

  - Default implementations of these functions now use `foldl'` or `foldMap'`.
  - All have been marked with INLINEABLE to make room for further optimisations.
* Allow deploy:pages job to fail | Ryan Scott | 2020-11-30 | 1 | -0/+2
  See #18973.
* rts/linker: Replace some ASSERTs with CHECK | Ben Gamari | 2020-11-30 | 5 | -38/+35
  In the past some people have confused ASSERT, which is for checking internal invariants, with CHECK, which should be used when checking things that might fail due to bad input (and therefore should be enabled even in the release compiler).

  Change some of these cases in the linker to use CHECK.
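The distinction drawn in this commit message can be sketched in a few lines of C. This is only an illustrative sketch: the macro definitions are simplified stand-ins rather than the RTS's actual ones, and `DEBUG_BUILD`, `check_failed`, and `section_offset` are hypothetical names introduced here.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical failure handler; the real RTS reports failures through its
 * own machinery. */
static void check_failed(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: check failed: %s\n", file, line, expr);
    abort();
}

/* CHECK is always compiled in, so it still fires in release builds. */
#define CHECK(p) ((p) ? (void)0 : check_failed(#p, __FILE__, __LINE__))

/* ASSERT is only active when the (assumed) DEBUG_BUILD macro is defined. */
#ifdef DEBUG_BUILD
#define ASSERT(p) CHECK(p)
#else
#define ASSERT(p) ((void)0)
#endif

/* Hypothetical helper: `index` comes from an untrusted object file, so it is
 * validated with CHECK; the offset property is an internal invariant, so it
 * only needs ASSERT. */
int section_offset(int n_sections, int index)
{
    CHECK(index >= 0 && index < n_sections);
    int offset = index * 16;
    ASSERT(offset >= 0);
    return offset;
}
```

With these definitions, a malformed index aborts even in a release build, while the invariant check costs nothing unless debugging is enabled.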
* rts: Use CHECK instead of assert | Ben Gamari | 2020-11-30 | 2 | -30/+28
  Use the GHC wrappers instead of <assert.h>.
* rts/m32: Refactor handling of allocator seeding | Ben Gamari | 2020-11-30 | 1 | -25/+36
  Previously, in an attempt to reduce fragmentation, each new allocator would map a region of M32_MAX_PAGES fresh pages to seed itself. However, this ends up being extremely wasteful since it turns out that we often use fewer than this. Consequently, these pages end up getting freed, which ends up fragmenting our address space more than we would have if we had naively allocated pages on-demand.

  Here we refactor m32 to avoid this waste while achieving the fragmentation mitigation previously desired. In particular, we move all page allocation into the global m32_alloc_page, which will pull a page from the free page pool. If the free page pool is empty we then refill it by allocating a region of M32_MAP_PAGES and adding them to the pool.

  Furthermore, we do away with the initial seeding entirely. That is, the allocator starts with no active pages: pages are rather allocated on an as-needed basis.

  On the whole this ends up being a pleasingly simple change, simultaneously making m32 more efficient, more robust, and simpler.

  Fixes #18980.
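The pool-with-refill scheme described above can be sketched roughly as follows. Only the names `m32_alloc_page` and `M32_MAP_PAGES` come from the commit message; the signature, the linked-list pool, the constant values, the use of `malloc` in place of a real page mapping, and the helpers `refill_pool`, `m32_release_page`, and `m32_pool_size` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE ((size_t)4096)  /* assumed page size */
#define M32_MAP_PAGES    ((size_t)32)    /* pages added per refill; illustrative value */

/* A free page is linked into the pool through its own first word. */
struct free_page { struct free_page *next; };

static struct free_page *free_pool = NULL;
static size_t pool_size = 0;

/* Refill the pool from one contiguous region. malloc stands in for the
 * mmap-style mapping a real allocator would use. Returns 0 on success. */
static int refill_pool(void)
{
    char *region = malloc(M32_MAP_PAGES * SKETCH_PAGE_SIZE);
    if (region == NULL) return -1;
    for (size_t i = 0; i < M32_MAP_PAGES; i++) {
        struct free_page *p = (struct free_page *)(region + i * SKETCH_PAGE_SIZE);
        p->next = free_pool;
        free_pool = p;
        pool_size++;
    }
    return 0;
}

/* Pull one page from the pool, refilling it first if it is empty.
 * No pages are mapped until the first request arrives. */
void *m32_alloc_page(void)
{
    if (free_pool == NULL && refill_pool() != 0) return NULL;
    struct free_page *p = free_pool;
    free_pool = p->next;
    pool_size--;
    return p;
}

/* Hypothetical counterpart: return a page to the pool instead of unmapping it. */
void m32_release_page(void *page)
{
    struct free_page *p = page;
    p->next = free_pool;
    free_pool = p;
    pool_size++;
}

/* Hypothetical accessor, used only to observe the sketch's behaviour. */
size_t m32_pool_size(void) { return pool_size; }
```

Because nothing is mapped until the first call, an allocator that needs only a couple of pages never causes a large region to be mapped and then mostly freed.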
* nonmoving: Ensure that evacuated large objects are marked | GHC GitLab CI | 2020-11-29 | 2 | -7/+60
  See Note [Non-moving GC: Marking evacuated objects].
* nonmoving: Add reference to Ueno 2016 | Ben Gamari | 2020-11-29 | 1 | -2/+7
* nonmoving: Don't join to mark_thread on shutdown | GHC GitLab CI | 2020-11-29 | 1 | -1/+0
  The mark thread is not joinable as we detach from it on creation.
* OSThreads: Fix error code checking | GHC GitLab CI | 2020-11-29 | 1 | -2/+3
  pthread_join returns its error code and apparently doesn't set errno.
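As a minimal sketch of the convention this fix relies on: POSIX thread functions such as `pthread_create` and `pthread_join` return the error number directly, so the return value is what must be checked. The helper names `worker` and `spawn_and_join` are hypothetical and are not taken from the RTS.

```c
#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Trivial thread body: hand the argument back as the thread's result. */
static void *worker(void *arg)
{
    return arg;
}

/* Hypothetical helper: returns 0 on success, or the error number that the
 * pthread call itself returned. */
int spawn_and_join(void *arg, void **result)
{
    pthread_t tid;
    int err = pthread_create(&tid, NULL, worker, arg);
    if (err != 0) {
        fprintf(stderr, "pthread_create: %s\n", strerror(err));
        return err;
    }
    /* The error number is the return value; errno is not consulted. */
    err = pthread_join(tid, result);
    if (err != 0) {
        fprintf(stderr, "pthread_join: %s\n", strerror(err));
    }
    return err;
}
```

On older toolchains this may need to be compiled and linked with `-pthread`.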
* Updates: Don't zero slop until closure has been pushed | GHC GitLab CI | 2020-11-29 | 1 | -1/+1
  Ensure that the free variables have been pushed to the update remembered set before we zero the slop.
* nonmoving: Add missing write barrier in shrinkSmallByteArray | GHC GitLab CI | 2020-11-29 | 1 | -0/+15
* rts/Messages: Add missing write barrier in THROWTO message update | GHC GitLab CI | 2020-11-29 | 3 | -6/+14
  After a THROWTO message has been handled the message closure is overwritten by a NULL message. We must ensure that the original closure's pointers continue to be visible to the nonmoving GC.
* nonmoving: Fix regression from TSAN work | GHC GitLab CI | 2020-11-29 | 1 | -7/+2
  The TSAN rework (specifically aad1f803) introduced a subtle regression in GC.c, swapping `g0` in place of `gen`. Whoops!

  Fixes #18997.
* ThreadPaused: Don't zero slop until free vars are pushed | GHC GitLab CI | 2020-11-29 | 2 | -6/+11
  When threadPaused blackholes a thunk it calls `OVERWRITING_CLOSURE` to zero the slop for the benefit of the sanity checker. Previously this was done *before* pushing the thunk's free variables to the update remembered set. Consequently we would push zeroed pointers to the update remembered set.
* withTimings: Emit allocations counter | Ben Gamari | 2020-11-29 | 1 | -5/+14
  This will allow us to back out the allocations per compiler pass from the eventlog. Note that we dump the allocation counter rather than the difference since this will allow us to determine how much work is done *between* `withTiming` blocks.
* testsuite: Mark T14702 as fragile on Windows | Ben Gamari | 2020-11-28 | 1 | -0/+1
  Due to #18953.
* Cleanup some primop constructor names | John Ericson | 2020-11-28 | 3 | -67/+67
  Harmonize the internal (big sum type) names of the native vs fixed-sized number primops a bit. (Mainly by renaming the former.)

  No user-facing names are changed.
* Make primop handler indentation more consistent | John Ericson | 2020-11-28 | 1 | -49/+49
* Small optimization to CmmSink. | Andreas Klebinger | 2020-11-28 | 1 | -4/+11
  Inside `regsUsedIn` we can avoid some thunks by specializing the recursion. In particular we avoid the thunk for `(f e z)` in the MachOp/Load branches, where we know this will evaluate to z.

  Reduces allocations for T3294 by ~1%.
* ghc-heap: partial TSO/STACK decoding | David Eichmann | 2020-11-28 | 22 | -22/+1046
  Co-authored-by: Sven Tennie <sven.tennie@gmail.com>
  Co-authored-by: Matthew Pickering <matthewtpickering@gmail.com>
  Co-authored-by: Ben Gamari <bgamari.foss@gmail.com>
* gitlab-ci: Only deploy GitLab Pages in ghc/ghc> | Ben Gamari | 2020-11-28 | 1 | -1/+3
  The deployments are quite large and yet are currently only served for the ghc/ghc> project.
* gitlab-ci: Introduce a nightly cross-compilation job | Ben Gamari | 2020-11-28 | 2 | -5/+47
  This adds a job to test cross-compilation from x86-64 to AArch64 with Hadrian.

  Fixes #18234
* hadrian: fix ghc-pkg uses (#17601) | Sylvain Henry | 2020-11-28 | 1 | -6/+24
  Make sure ghc-pkg doesn't read the compiler "settings" file by passing --no-user-package-db.
* Hadrian: fix detection of ghc-pkg for cross-compilers | Sylvain Henry | 2020-11-28 | 1 | -4/+12
* rts: Allocate MBlocks with MAP_TOP_DOWN on Windows | Ben Gamari | 2020-11-27 | 1 | -1/+4
  As noted in #18991, we would previously allocate heap in low memory. Due to this, the linker, which typically *needs* low memory, would end up competing with the heap. In longer builds we end up running out of low memory entirely, leading to linking failures.
* RegAlloc: Add missing raPlatform field to RegAllocStatsSpill | Andreas Klebinger | 2020-11-26 | 2 | -2/+7
  Fixes #18994

  Co-Author: Benjamin Maurer <maurer.benjamin@gmail.com>
* Split Up getClosureDataFromHeapRep | Matthew Pickering | 2020-11-26 | 1 | -9/+18
  Motivation

  1. Don't enforce the repeated decoding of an info table when the client can cache it (ghc-debug).
  2. Allow the constructor information decoding to be overridden; this causes segfaults in ghc-debug.
* Remove special case for GHC.ByteCode.Instr | Matthew Pickering | 2020-11-26 | 1 | -3/+1
  This was added in https://github.com/nomeata/ghc-heap-view/commit/34935206e51b9c86902481d84d2f368a6fd93423

  GHC.ByteCode.Instr.BreakInfo no longer exists so the special case is dead code. Any check like this can be easily dealt with in client code.
* rts: Use RTS_LIKELY in CHECK | Ben Gamari | 2020-11-26 | 1 | -2/+2
  Most compilers probably already infer that `barf` diverges but it nevertheless doesn't hurt to be explicit.
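For readers unfamiliar with branch hints, the idea can be sketched as below. This is not the RTS's definition of `RTS_LIKELY` or `CHECK`: `LIKELY`, `CHECK_LIKELY`, `fail`, and `checked_div` are hypothetical names, and the sketch assumes a compiler that provides `__builtin_expect` (GCC or Clang), falling back to the bare condition elsewhere.

```c
#include <stdio.h>
#include <stdlib.h>

/* Branch-prediction hint: tell the compiler the condition is usually true. */
#if defined(__GNUC__)
#define LIKELY(p) __builtin_expect(!!(p), 1)
#else
#define LIKELY(p) (p)
#endif

/* Hypothetical diverging failure handler, standing in for `barf`. */
static void fail(const char *expr)
{
    fprintf(stderr, "check failed: %s\n", expr);
    abort();
}

/* The failure path is explicitly marked as the unlikely one. */
#define CHECK_LIKELY(p) do { if (!LIKELY(p)) fail(#p); } while (0)

/* Hypothetical caller: the check passes on the hot path. */
int checked_div(int num, int den)
{
    CHECK_LIKELY(den != 0);
    return num / den;
}
```

The hint does not change behaviour; it only lets the compiler lay out the success path as the fall-through case.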
* Set dynamic users-guide TOC spacing (fixes #18554) | Tim Barnes | 2020-11-26 | 1 | -0/+3
* Fix toArgRep to support 64-bit reps on all systems | Sylvain Henry | 2020-11-26 | 6 | -65/+81
  [This is @Ericson2314 writing a commit message for @hsyl20's patch.]

  (Progress towards #11953, #17377, #17375)

  `Int64Rep` and `Word64Rep` are currently broken on 64-bit systems. This is because they should use "native arg rep" but instead use "large arg rep" as they do on 32-bit systems, which is either a non-concept or a 128-bit rep depending on one's vantage point.

  Now, these reps currently aren't used during 64-bit compilation, so the brokenness isn't observed, but I don't think that constitutes reasons not to fix it. Firstly, in the linked issues there is a clearly expressed desire to use explicit-bitwidth constructs in more places. Secondly, per [1], there are other bugs that *do* manifest from not threading explicit-bitwidth information all the way through the compilation pipeline. One can therefore view this as one piece of the larger effort to do that, improve ergonomics, and squash remaining bugs.

  Also, this is needed for !3658. I could just merge this as part of that, but I'm keen on merging fixes "as they are ready" so the fixes that aren't ready are isolated and easier to debug.

  [1]: https://mail.haskell.org/pipermail/ghc-devs/2020-October/019332.html
* RTS: Fix failed inlining of copy_tag. | Andreas Klebinger | 2020-11-26 | 2 | -8/+18
  On Windows, gcc-10 failed to inline copy_tag into evacuate.

  To fix this we now set the always_inline attribute for the various copy* functions in Evac.c. The main motivation here is not the overhead of the function call, but rather that this allows the code to "specialize" for the size of the closure we copy, which is often known at compile time.

  An earlier commit also tried to avoid inlining evacuate_large, but didn't quite succeed. So I also marked evacuate_large as noinline.

  Fixes #12416
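The effect described here can be sketched with GCC/Clang function attributes. The functions below are hypothetical miniatures, not the code in Evac.c: `copy_words` plays the role of a copy* helper that is forced inline so that a constant size propagates into it, and `copy_large` plays the role of a rarely taken path kept out of line.

```c
#include <stddef.h>
#include <string.h>

/* Attribute spellings assume GCC or Clang; other compilers get plain
 * `static inline` and no noinline marker. */
#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#define NOINLINE __attribute__((noinline))
#else
#define ALWAYS_INLINE static inline
#define NOINLINE
#endif

/* Generic word-by-word copy. Once inlined into a caller that passes a
 * constant n_words, the compiler can unroll the loop for that size. */
ALWAYS_INLINE void copy_words(unsigned long *dst, const unsigned long *src,
                              size_t n_words)
{
    for (size_t i = 0; i < n_words; i++) {
        dst[i] = src[i];
    }
}

/* The size is known at compile time here, so the inlined body can be
 * specialised to two stores. */
void copy_pair(unsigned long *dst, const unsigned long *src)
{
    copy_words(dst, src, 2);
}

/* A cold path kept out of line so that it does not bloat its callers. */
NOINLINE void copy_large(unsigned long *dst, const unsigned long *src,
                         size_t n_words)
{
    memcpy(dst, src, n_words * sizeof *dst);
}
```

Whether the specialisation actually happens depends on the compiler and optimisation level; the attribute only guarantees the inlining that makes it possible.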
* [Sized Cmm] properly retain sizes. | Moritz Angermann | 2020-11-26 | 71 | -582/+1067
  This replaces all Word<N> = W<N># Word# and Int<N> = I<N># Int# with Word<N> = W<N># Word<N># and Int<N> = I<N># Int<N>#, thus providing us with properly sized primitives in the code generator instead of pretending they are all full machine words.

  This came up when implementing darwinpcs for arm64. The darwinpcs requires us to pack function arguments in excess of registers on the stack. While most procedure call standards (pcs) assume arguments are just passed in 8-byte slots, and thus the caller need not know the exact signature to make the call, darwinpcs requires us to adhere to the prototype, and thus have the correct sizes. If we specify CInt in the FFI call, it should correspond to the C int, and not just be Word sized, when it's only half the size.

  This does change the expected output of T16402 but the new result is no less correct as it eliminates the narrowing (instead of the `and` as was previously done).

  Bumps the array, bytestring, text, and binary submodules.

  Co-Authored-By: Ben Gamari <ben@well-typed.com>

  Metric Increase:
      T13701
      T14697
* CmmToLlvm: Declare signature for memcmp [wip/angerman/arm64] | Ben Gamari | 2020-11-24 | 6 | -8/+47
  Otherwise `opt` fails with:

      error: use of undefined value '@memcmp$def'
* gitlab-ci: Run LLVM builds on Debian 10 | Ben Gamari | 2020-11-24 | 1 | -17/+17
  The current Debian 9 image doesn't provide LLVM 7.
* gitlab-ci: Run LLVM job on appropriately-labelled MRs | Ben Gamari | 2020-11-24 | 1 | -2/+3
  Namely, those marked with the ~"LLVM backend" label
* rts: Flush eventlog buffers from flushEventLog | Ben Gamari | 2020-11-24 | 11 | -11/+75
  As noted in #18043, flushTrace failed to flush anything beyond the writer. This means that a significant amount of data sitting in capability-local event buffers may never get flushed, despite the users' pleas for us to flush.

  Fix this by making flushEventLog flush all of the event buffers before flushing the writer.

  Fixes #18043.
* hadrian: Drop redundant flavour definitions | Ben Gamari | 2020-11-22 | 6 | -94/+5
  Drop the profiled, LLVM, and ThreadSanitizer flavour definitions as these can now be realized with flavour transformers.
* hadrian: Add profiled_ghc and no_dynamic_ghc modifiers | Ben Gamari | 2020-11-22 | 2 | -0/+27