delta/haskell.git - gitlab.haskell.org: ghc/ghc.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fix #17405 by not checking imported equations	Richard Eisenberg	2019-11-10	5	-104/+125
\| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, we checked all imported type family equations for injectivity. This is very silly. Now, we check only for conflicts. Before I could even imagine doing the fix, I needed to untangle several functions that were (in my opinion) overly complicated. It's still not quite as perfect as I'd like, but it's good enough for now. Test case: typecheck/should_compile/T17405
*	Improve SPECIALIZE pragma error messages (Fixes #12126)	Alina Banerjee	2019-11-10	4	-6/+8
\|
*	Use the right type in :force	Simon Peyton Jones	2019-11-09	1	-3/+5
\| \| \| \| \| \| \| \|	A missing prime meant that we were considering the wrong type in the GHCi debugger, when doing :force on multiple arguments (issue #17431). The fix is trivial.
*	SetLevels: Don't set context level when floating cases	Ben Gamari	2019-11-08	1	-4/+33
\| \| \| \| \| \| \| \| \| \| \|	When floating a single-alternative case we previously would set the context level to the level where we were floating the case. However, this is wrong as we are only moving the case and its binders. This resulted in #16978, where the disrepancy caused us to unnecessarily abstract over some free variables of the case body, resulting in shadowing and consequently Core Lint failures. (cherry picked from commit a2a0e6f3bb2d02a9347dec4c7c4f6d4480bc2421)
*	Set correct length of DWARF .debug_aranges section (fixes #17428)	Szymon Nowicki-Korgol	2019-11-08	1	-1/+2
\|
*	FlagChecker: Add ticky flags to hashed flags	Ben Gamari	2019-11-07	1	-1/+5
\| \| \| \|	These affect output and therefore should be part of the flag hash.
*	For s390x issue a warning if LLVM 9 or older is used	Stefan Schulze Frielinghaus	2019-11-07	1	-0/+6
\| \| \| \| \|	For s390x the GHC calling convention is only supported since LLVM version 10. Issue a warning in case an older version of LLVM is used.
*	Clean up TH's treatment of unary tuples (or, #16881 part two)	Ryan Scott	2019-11-07	7	-39/+86
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	!1906 left some loose ends in regards to Template Haskell's treatment of unary tuples. This patch ends to tie up those loose ends: * In addition to having `TupleT 1` produce unary tuples, `TupE [exp]` and `TupP [pat]` also now produce unary tuples. * I have added various special cases in GHC's pretty-printers to ensure that explicit 1-tuples are printed using the `Unit` type. See `testsuite/tests/th/T17380`. * The GHC 8.10.1 release notes entry has been tidied up a little. Fixes #16881. Fixes #17371. Fixes #17380.
*	CoreTidy: hide tidyRule	Ömer Sinan Ağacan	2019-11-05	1	-1/+1
\|
*	TidyPgm: replace an explicit loop with mapAccumL	Ömer Sinan Ağacan	2019-11-05	1	-7/+2
\|
*	Check EmptyCase by simply adding a non-void constraint	Sebastian Graf	2019-11-05	5	-301/+300
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We can handle non-void constraints since !1733, so we can now express the strictness of `-XEmptyCase` just by adding a non-void constraint to the initial Uncovered set. For `case x of {}` we thus check that the Uncovered set `{ x \| x /~ ⊥ }` is non-empty. This is conceptually simpler than the plan outlined in #17376, because it talks to the oracle directly. In order for this patch to pass the testsuite, I had to fix handling of newtypes in the pattern-match checker (#17248). Since we use a different code path (well, the main code path) for `-XEmptyCase` now, we apparently also handle #13717 correctly. There's also some dead code that we can get rid off now. `provideEvidence` has been updated to provide output more in line with the old logic, which used `inhabitationCandidates` under the hood. A consequence of the shift away from the `UncoveredPatterns` type is that we don't report reduced type families for empty case matches, because the pretty printer is pure and only knows the match variable's type. Fixes #13717, #17248, #17386
*	SysTools: Only apply Windows-specific workaround on Windows	Ben Gamari	2019-11-04	1	-1/+7
\| \| \| \| \| \| \| \| \|	Issue #1110 was apparently due to a bug in Vista which prevented GCC from finding its binaries unless we explicitly added it to PATH. However, this workaround was incorrectly applied on non-Windows platforms as well, resulting in ill-formed PATHs (#17266). Fixes #17266.
*	Update Note references -- comments only	Richard Eisenberg	2019-11-02	1	-2/+2
\| \| \| \|	Follow-on from !2041.
*	Separate `LPat` from `Pat` on the type-level	Sebastian Graf	2019-11-02	10	-69/+55
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Since the Trees That Grow effort started, we had `type LPat = Pat`. This is so that `SrcLoc`s would only be annotated in GHC's AST, which is the reason why all GHC passes use the extension constructor `XPat` to attach source locations. See #15495 for the design discussion behind that. But now suddenly there are `XPat`s everywhere! There are several functions which dont't cope with `XPat`s by either crashing (`hsPatType`) or simply returning incorrect results (`collectEvVarsPat`). This issue was raised in #17330. I also came up with a rather clean and type-safe solution to the problem: We define ```haskell type family XRec p (f :: * -> *) = r \| r -> p f type instance XRec (GhcPass p) f = Located (f (GhcPass p)) type instance XRec TH f = f p type LPat p = XRec p Pat ``` This is a rather modular embedding of the old "ping-pong" style, while we only pay for the `Located` wrapper within GHC. No ping-ponging in a potential Template Haskell AST, for example. Yet, we miss no case where we should've handled a `SrcLoc`: `hsPatType` and `collectEvVarsPat` are not callable at an `LPat`. Also, this gets rid of one indirection in `Located` variants: Previously, we'd have to go through `XPat` and `Located` to get from `LPat` to the wrapped `Pat`. Now it's just `Located` again. Thus we fix #17330.
*	Make CSE delay inlining less	Simon Peyton Jones	2019-11-01	1	-7/+49
\| \| \| \| \| \| \| \| \|	CSE delays inlining a little bit, to avoid losing vital specialisations; see Note [Delay inlining after CSE] in CSE. But it was being over-enthusiastic. This patch makes the delay only apply to Ids with specialisation rules, which avoids unnecessary delay (#17409).
*	template-haskell: require at least 1 GADT constructor name (#17379)	Adam Sandberg Eriksson	2019-11-01	1	-0/+6
\|
*	Describe optimisation of demand analysis of noinline	Ben Gamari	2019-11-01	1	-0/+8
\| \| \| \|	As described in #16588.
*	Fix a bad error in tcMatchTy	Simon Peyton Jones	2019-11-01	1	-4/+28
\| \| \| \| \| \| \| \| \| \| \|	This patch fixes #17395, a very subtle and hard-to-trigger bug in tcMatchTy. It's all explained in Note [Matching in the presence of casts (2)] I have not added a regression test because it is very hard to trigger it, until we have the upcoming mkAppTyM patch, after which lacking this patch means you can't even compile the libraries.
*	Makes Lint less chatty:	Simon Peyton Jones	2019-11-01	1	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	I found in #17415 that Lint was printing out truly gigantic warnings, unmanageably huge, with repeated copies of the same thing. This patch makes Lint less chatty, especially for warnings: * For warnings, I don't print details of the location, unless you add `-dppr-debug`. * For errors, I still print all the info. They are fatal and stop exection, whereas warnings appear repeatedly. * I've made much less use of `AnExpr` in `LintLocInfo`; the expression can be gigantic.
*	Whitespace forward compatibility for proposal #229	Vladislav Zavialov	2019-10-30	2	-4/+4
\| \| \| \| \| \| \| \|	GHC Proposal #229 changes the lexical rules of Haskell, which may require slight whitespace adjustments in certain cases. This patch changes formatting in a few places in GHC and its testsuite in a way that enables it to compile under the proposed rules.
*	MkIface: Remove redundant parameter and outdated comments from addFingerprints	Ömer Sinan Ağacan	2019-10-29	1	-8/+8
\|
*	HscMain: Move a comment closer to the relevant site	Ömer Sinan Ağacan	2019-10-29	1	-4/+4
\|
*	Remove unused DynFlags arg of lookupIfaceByModule	Ömer Sinan Ağacan	2019-10-29	6	-17/+12
\|
*	Return ModIface in compilation pipeline, remove IORef hack for generating ↵	Ömer Sinan Ağacan	2019-10-29	4	-97/+114
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	ModIfaces The compilation phases now optionally return ModIface (for phases that generate an interface, currently only HscOut when (re)compiling a file). The value is then used by compileOne' to return the generated interface with HomeModInfo (which is then used by the batch mode compiler when building rest of the tree). hscIncrementalMode also returns a DynFlags with plugin info, to be used in the rest of the pipeline. Unfortunately this introduces a (perhaps less bad) hack in place of the previous IORef: we now record the DynFlags used to generate the partial infterface in HscRecomp and use the same DynFlags when generating the full interface. I spent almost three days trying to understand what's changing in DynFlags that causes a backpack test to fail, but I couldn't figure it out. There's a FIXME added next to the field so hopefully someone who understands this better than I do will fix it leter.
*	Refactor HscRecomp constructors:	Ömer Sinan Ağacan	2019-10-29	3	-47/+45
\| \| \| \| \| \| \| \| \| \| \|	Make it evident in the constructors that the final interface is only available when HscStatus is not HscRecomp. (When HscStatus == HscRecomp we need to finish the compilation to get the final interface) `Maybe ModIface` return value of hscIncrementalCompile and the partial `expectIface` function are removed.
*	Better arity for join points	Simon Peyton Jones	2019-10-28	7	-24/+49
\| \| \| \| \| \|	A join point was getting too large an arity, leading to #17294. I've tightened up the invariant: see CoreSyn, Note [Invariants on join points], invariant 2b
*	Use FlexibleInstances for `Outputable (* p)` instead of match-all instances ↵	Sebastian Graf	2019-10-28	16	-205/+209
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	with equality constraints In #17304, Richard and Simon dicovered that using `-XFlexibleInstances` for `Outputable` instances of AST data types means users can provide orphan `Outputable` instances for passes other than `GhcPass`. Type inference doesn't currently to suffer, and Richard gave an example in #17304 that shows how rare a case would be where the slightly worse type inference would matter. So I went ahead with the refactoring, attempting to fix #17304.
*	Attach API Annotations for {-# SOURCE #-} import pragma	Alan Zimmerman	2019-10-28	1	-14/+13
\| \| \| \| \| \| \|	Attach the API annotations for the start and end locations of the {-# SOURCE #-} pragma in an ImportDecl. Closes #17388
*	Refactor TcDeriv to validity-check less in anyclass/via deriving (#13154)	Ryan Scott	2019-10-28	3	-300/+499
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to the way `DerivEnv` is currently structured, there is an invariant that every derived instance must consist of a class applied to a non-empty list of argument types, where the last argument must be an application of a type constructor to some arguments. This works for many cases, but there are also some design patterns in standalone `anyclass`/`via` deriving that are made impossible due to enforcing this invariant, as documented in #13154. This fixes #13154 by refactoring `TcDeriv` and friends to perform fewer validity checks when using the `anyclass` or `via` strategies. The highlights are as followed: * Five fields of `DerivEnv` have been factored out into a new `DerivInstTys` data type. These fields only make sense for instances that satisfy the invariant mentioned above, so `DerivInstTys` is now only used in `stock` and `newtype` deriving, but not in other deriving strategies. * There is now a `Note [DerivEnv and DerivSpecMechanism]` describing the bullet point above in more detail, as well as explaining the exact requirements that each deriving strategy imposes. * I've refactored `mkEqnHelp`'s call graph to be slightly less complicated. Instead of the previous `mkDataTypeEqn`/`mkNewTypeEqn` dichotomy, there is now a single entrypoint `mk_eqn`. * Various bits of code were tweaked so as not to use fields that are specific to `DerivInstTys` so that they may be used by all deriving strategies, since not all deriving strategies use `DerivInstTys`.
*	Fix #15344: use fail when desugaring applicative-do	Josef Svenningsson	2019-10-28	8	-59/+156
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Applicative-do has a bug where it fails to use the monadic fail method when desugaring patternmatches which can fail. See #15344. This patch fixes that problem. It required more rewiring than I had expected. Applicative-do happens mostly in the renamer; that's where decisions about scheduling are made. This schedule is then carried through the typechecker and into the desugarer which performs the actual translation. Fixing this bug required sending information about the fail method from the renamer, through the type checker and into the desugarer. Previously, the desugarer didn't have enough information to actually desugar pattern matches correctly. As a side effect, we also fix #16628, where GHC wouldn't catch missing MonadFail instances with -XApplicativeDo.
*	Parenthesize nullary constraint tuples using sigPrec (#17403)	Ryan Scott	2019-10-27	1	-1/+1
\| \| \| \| \| \| \| \| \|	We were using `appPrec`, not `sigPrec`, as the precedence when determining whether or not to parenthesize `() :: Constraint`, which lead to the parentheses being omitted in function contexts like `(() :: Constraint) => String`. Easily fixed. Fixes #17403.
*	Remove redundant -fno-cse options	Ömer Sinan Ağacan	2019-10-26	2	-4/+0
\| \| \| \| \|	These were probably added with some GLOBAL_VARs, but those GLOBAL_VARs are now gone.
*	Implement shrinkSmallMutableArray# and resizeSmallMutableArray#.	Andrew Martin	2019-10-26	2	-1/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a part of GHC Proposal #25: "Offer more array resizing primitives". Resources related to the proposal: - Discussion: https://github.com/ghc-proposals/ghc-proposals/pull/121 - Proposal: https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0025-resize-boxed.rst Only shrinkSmallMutableArray# is implemented as a primop since a library-space implementation of resizeSmallMutableArray# (in GHC.Exts) is no less efficient than a primop would be. This may be replaced by a primop in the future if someone devises a strategy for growing arrays in-place. The library-space implementation always copies the array when growing it. This commit also tweaks the documentation of the deprecated sizeofMutableByteArray#, removing the mention of concurrency. That primop is unsound even in single-threaded applications. Additionally, the non-negativity assertion on the existing shrinkMutableByteArray# primop has been removed since this predicate is trivially always true.
*	Mark promoted InfixT names as IsPromoted (#17394)	Ryan Scott	2019-10-24	1	-8/+13
\| \| \| \| \|	We applied a similar fix for `ConT` in #15572 but forgot to apply the fix to `InfixT` as well. This patch fixes #17394 by doing just that.
*	Make isTcLevPoly more conservative with newtypes (#17360)	Ryan Scott	2019-10-24	2	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \|	`isTcLevPoly` gives an approximate answer for when a type constructor is levity polymorphic when fully applied, where `True` means "possibly levity polymorphic" and `False` means "definitely not levity polymorphic". `isTcLevPoly` returned `False` for newtypes, which is incorrect in the presence of `UnliftedNewtypes`, leading to #17360. This patch tweaks `isTcLevPoly` to return `True` for newtypes instead. Fixes #17360.
*	Parenthesize GADT return types in pprIfaceConDecl (#17384)	Ryan Scott	2019-10-24	1	-1/+1
\| \| \| \| \| \| \|	We were using `pprIfaceAppArgs` instead of `pprParendIfaceAppArgs` in `pprIfaceConDecl`. Oops. Fixes #17384.
*	Merge non-moving garbage collector	Ben Gamari	2019-10-23	4	-6/+105
\|\ \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces a concurrent mark & sweep garbage collector to manage the old generation. The concurrent nature of this collector typically results in significantly reduced maximum and mean pause times in applications with large working sets. Due to the large and intricate nature of the change I have opted to preserve the fully-buildable history, including merge commits, which is described in the "Branch overview" section below. Collector design ================ The full design of the collector implemented here is described in detail in a technical note > B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell > Compiler" (2018) This document can be requested from @bgamari. The basic heap structure used in this design is heavily inspired by > K. Ueno & A. Ohori. "A fully concurrent garbage collector for > functional programs on multicore processors." /ACM SIGPLAN Notices/ > Vol. 51. No. 9 (presented at ICFP 2016) This design is intended to allow both marking and sweeping concurrent to execution of a multi-core mutator. Unlike the Ueno design, which requires no global synchronization pauses, the collector introduced here requires a stop-the-world pause at the beginning and end of the mark phase. To avoid heap fragmentation, the allocator consists of a number of fixed-size /sub-allocators/. Each of these sub-allocators allocators into its own set of /segments/, themselves allocated from the block allocator. Each segment is broken into a set of fixed-size allocation blocks (which back allocations) in addition to a bitmap (used to track the liveness of blocks) and some additional metadata (used also used to track liveness). This heap structure enables collection via mark-and-sweep, which can be performed concurrently via a snapshot-at-the-beginning scheme (although concurrent collection is not implemented in this patch). Implementation structure ======================== The majority of the collector is implemented in a handful of files: * `rts/Nonmoving.c` is the heart of the beast. It implements the entry-point to the nonmoving collector (`nonmoving_collect`), as well as the allocator (`nonmoving_allocate`) and a number of utilities for manipulating the heap. * `rts/NonmovingMark.c` implements the mark queue functionality, update remembered set, and mark loop. * `rts/NonmovingSweep.c` implements the sweep loop. * `rts/NonmovingScav.c` implements the logic necessary to scavenge the nonmoving heap. Branch overview =============== ``` * wip/gc/opt-pause: \| A variety of small optimisations to further reduce pause times. \| * wip/gc/compact-nfdata: \| Introduce support for compact regions into the non-moving \|\ collector \| \ \| \ \| \| * wip/gc/segment-header-to-bdescr: \| \| \| Another optimization that we are considering, pushing \| \| \| some segment metadata into the segment descriptor for \| \| \| the sake of locality during mark \| \| \| \| * \| wip/gc/shortcutting: \| \| \| Support for indirection shortcutting and the selector optimization \| \| \| in the non-moving heap. \| \| \| * \| \| wip/gc/docs: \| \|/ Work on implementation documentation. \| / \|/ * wip/gc/everything: \| A roll-up of everything below. \|\ \| \ \| \|\ \| \| \ \| \| * wip/gc/optimize: \| \| \| A variety of optimizations, primarily to the mark loop. \| \| \| Some of these are microoptimizations but a few are quite \| \| \| significant. In particular, the prefetch patches have \| \| \| produced a nontrivial improvement in mark performance. \| \| \| \| \| * wip/gc/aging: \| \| \| Enable support for aging in major collections. \| \| \| \| * \| wip/gc/test: \| \| \| Fix up the testsuite to more or less pass. \| \| \| * \| \| wip/gc/instrumentation: \| \| \| A variety of runtime instrumentation including statistics \| \| / support, the nonmoving census, and eventlog support. \| \|/ \| / \|/ * wip/gc/nonmoving-concurrent: \| The concurrent write barriers. \| * wip/gc/nonmoving-nonconcurrent: \| The nonmoving collector without the write barriers necessary \| for concurrent collection. \| * wip/gc/preparation: \| A merge of the various preparatory patches that aren't directly \| implementing the GC. \| \| * GHC HEAD . . . ```
\| *	rts: Implement concurrent collection in the nonmoving collector	Ben Gamari	2019-10-20	4	-6/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This extends the non-moving collector to allow concurrent collection. The full design of the collector implemented here is described in detail in a technical note B. Gamari. "A Concurrent Garbage Collector For the Glasgow Haskell Compiler" (2018) This extension involves the introduction of a capability-local remembered set, known as the /update remembered set/, which tracks objects which may no longer be visible to the collector due to mutation. To maintain this remembered set we introduce a write barrier on mutations which is enabled while a concurrent mark is underway. The update remembered set representation is similar to that of the nonmoving mark queue, being a chunked array of `MarkEntry`s. Each `Capability` maintains a single accumulator chunk, which it flushed when it (a) is filled, or (b) when the nonmoving collector enters its post-mark synchronization phase. While the write barrier touches a significant amount of code it is conceptually straightforward: the mutator must ensure that the referee of any pointer it overwrites is added to the update remembered set. However, there are a few details: * In the case of objects with a dirty flag (e.g. `MVar`s) we can exploit the fact that only the first mutation requires a write barrier. * Weak references, as usual, complicate things. In particular, we must ensure that the referee of a weak object is marked if dereferenced by the mutator. For this we (unfortunately) must introduce a read barrier, as described in Note [Concurrent read barrier on deRefWeak#] (in `NonMovingMark.c`). * Stable names are also a bit tricky as described in Note [Sweeping stable names in the concurrent collector] (`NonMovingSweep.c`). We take quite some pains to ensure that the high thread count often seen in parallel Haskell applications doesn't affect pause times. To this end we allow thread stacks to be marked either by the thread itself (when it is executed or stack-underflows) or the concurrent mark thread (if the thread owning the stack is never scheduled). There is a non-trivial handshake to ensure that this happens without racing which is described in Note [StgStack dirtiness flags and concurrent marking]. Co-Authored-by: Ömer Sinan Ağacan <omer@well-typed.com>
* \|	Add new flag for unarised STG dumps	Ömer Sinan Ağacan	2019-10-23	3	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously -ddump-stg would dump pre and post-unarise STGs. Now we have a new flag for post-unarise STG and -ddump-stg only dumps coreToStg output. STG dump flags after this commit: - -ddump-stg: Dumps CoreToStg output - -ddump-stg-unarised: Unarise output - -ddump-stg-final: STG right before code gen (includes CSE and lambda lifting)
* \|	Make dynflag argument for withTiming pure.	Andreas Klebinger	2019-10-23	21	-71/+93
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	19 times out of 20 we already have dynflags in scope. We could just always use `return dflags`. But this is in fact not free. When looking at some STG code I noticed that we always allocate a closure for this expression in the heap. Clearly a waste in these cases. For the other cases we can either just modify the callsite to get dynflags or use the _D variants of withTiming I added which will use getDynFlags under the hood.
* \|	Fix bug in the x86 backend involving the CFG.	Andreas Klebinger	2019-10-23	7	-70/+304
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is part two of fixing #17334. There are two parts to this commit: - A bugfix for computing loop levels - A bugfix of basic block invariants in the NCG. ----------------------------------------------------------- In the first bug we ended up with a CFG of the sort: [A -> B -> C] This was represented via maps as fromList [(A,B),(B,C)] and later transformed into a adjacency array. However the transformation did not include block C in the array (since we only looked at the keys of the map). This was still fine until we tried to look up successors for C and tried to read outside of the array bounds when accessing C. In order to prevent this in the future I refactored to code to include all nodes as keys in the map representation. And make this a invariant which is checked in a few places. Overall I expect this to make the code more robust as now any failed lookup will represent an error, versus failed lookups sometimes being expected and sometimes not. In terms of performance this makes some things cheaper (getting a list of all nodes) and others more expensive (adding a new edge). Overall this adds up to no noteable performance difference. ----------------------------------------------------------- Part 2: When the NCG generated a new basic block, it did not always insert a NEWBLOCK meta instruction in the stream which caused a quite subtle bug. During instruction selection a statement `s` in a block B with control of the sort: B -> C will sometimes result in control flow of the sort: ┌ < ┐ v ^ B -> B1 ┴ -> C as is the case for some atomic operations. Now to keep the CFG in sync when introducing B1 we clearly want to insert it between B and C. However there is a catch when we have to deal with self loops. We might start with code and a CFG of these forms: loop: stmt1 ┌ < ┐ .... v ^ stmtX loop ┘ stmtY .... goto loop: Now we introduce B1: ┌ ─ ─ ─ ─ ─┐ loop: │ ┌ < ┐ │ instrs v │ │ ^ .... loop ┴ B1 ┴ ┘ instrsFromX stmtY goto loop: This is simple, all outgoing edges from loop now simply start from B1 instead and the code generator knows which new edges it introduced for the self loop of B1. Disaster strikes if the statement Y follows the same pattern. If we apply the same rule that all outgoing edges change then we end up with: loop ─> B1 ─> B2 ┬─┐ │ │ └─<┤ │ │ └───<───┘ │ └───────<────────┘ This is problematic. The edge B1->B1 is modified as expected. However the modification is wrong! The assembly in this case looked like this: _loop: <instrs> _B1: ... cmpxchgq ... jne _B1 <instrs> <end _B1> _B2: ... cmpxchgq ... jne _B2 <instrs> jmp loop There is no edge _B2 -> _B1 here. It's still a self loop onto _B1. The problem here is that really B1 should be two basic blocks. Otherwise we have control flow in the middle of a basic block. A contradiction! So to account for this we add yet another basic block marker: _B: <instrs> _B1: ... cmpxchgq ... jne _B1 jmp _B1' _B1': <instrs> <end _B1> _B2: ... Now when inserting B2 we will only look at the outgoing edges of B1' and everything will work out nicely. You might also wonder why we don't insert jumps at the end of _B1'. There is no way another block ends up jumping to the labels _B1 or _B2 since they are essentially invisible to other blocks. View them as control flow labels local to the basic block if you'd like. Not doing this ultimately caused (part 2 of) #17334.
* \|	Reify oversaturated data family instances correctly (#17296)	Ryan Scott	2019-10-23	2	-21/+118
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	`TcSplice` was not properly handling oversaturated data family instances, such as the example in #17296, as it dropped arguments due to carelessly zipping data family instance arguments with `tyConTyVars`. For data families, the number of `tyConTyVars` can sometimes be less than the number of arguments it can accept in a data family instance due to the fact that data family instances can be oversaturated. To account for this, `TcSplice.mkIsPolyTvs` has now been renamed to `tyConArgsPolyKinded` and now factors in `tyConResKind` in addition to `tyConTyVars`. I've also added `Note [Reified instances and explicit kind signatures]` which explains the various subtleties in play here. Fixes #17296.
* \|	compiler: introduce DynFlags plugins	Alp Mestanogullari	2019-10-23	2	-1/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	They have type '[CommandLineOpts] -> Maybe (DynFlags -> IO DynFlags)'. All plugins that supply a non-Nothing 'dynflagsPlugin' will see their updates applied to the current DynFlags right after the plugins are loaded. One use case for this is to superseede !1580 for registering hooks from a plugin. Frontend/parser plugins were considered to achieve this but they respectively conflict with how this plugin is going to be used and don't allow overriding/modifying the DynFlags, which is how hooks have to be registered. This commit comes with a test, 'test-hook-plugin', that registers a "fake" meta hook that replaces TH expressions with the 0 integer literal.
* \|	Implement a coverage checker for injectivity	Richard Eisenberg	2019-10-23	17	-227/+528
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes #16512. There are lots of parts of this patch: * The main payload is in FamInst. See Note [Coverage condition for injective type families] there for the overview. But it doesn't fix the bug. * We now bump the reduction depth every time we discharge a CFunEqCan. See Note [Flatten when discharging CFunEqCan] in TcInteract. * Exploration of this revealed a new, easy to maintain invariant for CTyEqCans. See Note [Almost function-free] in TcRnTypes. * We also realized that type inference for injectivity was a bit incomplete. This means we exchanged lookupFlattenTyVar for rewriteTyVar. See Note [rewriteTyVar] in TcFlatten. The new function is monadic while the previous one was pure, necessitating some faff in TcInteract. Nothing too bad. * zonkCt did not maintain invariants on CTyEqCan. It's not worth the bother doing so, so we just transmute CTyEqCans to CNonCanonicals. * The pure unifier was finding the fixpoint of the returned substitution, even when doing one-way matching (in tcUnifyTysWithTFs). Fixed now. Test cases: typecheck/should_fail/T16512{a,b}
* \|	Warn about missing profiled libs when using the Interpreter.	Andreas Klebinger	2019-10-23	1	-1/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When GHC itself, or it's interpreter is profiled we need to load profiled libraries as well. This requirement is not always obvious, especially when TH implicilty uses the interpreter. When the libs were not found we fall back to assuming the are in a DLL. This is usually not the case so now we warn users when we do so. This makes it more obvious what is happening and gives users a way to fix the issue. This fixes #17121.
* \|	Implement s390x LLVM backend.	Stefan Schulze Frielinghaus	2019-10-22	9	-1/+31
\| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for the s390x architecture for the LLVM code generator. The patch includes a register mapping of STG registers onto s390x machine registers which enables a registerised build.
* \|	Windows: Update tarballs to GCC 9.2 and remove MAX_PATH limit.	Tamar Christina	2019-10-20	1	-1/+1
\|/
*	Tiny fixes to comments around flattening.	Richard Eisenberg	2019-10-17	2	-3/+3
\|
*	Make Coverage.TM a newtype	Ryan Scott	2019-10-16	1	-1/+1
\|
*	Break up TcRnTypes, among other modules.	Richard Eisenberg	2019-10-16	64	-2890/+3067
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This introduces three new modules: - basicTypes/Predicate.hs describes predicates, moving this logic out of Type. Predicates don't really exist in Core, and so don't belong in Type. - typecheck/TcOrigin.hs describes the origin of constraints and types. It was easy to remove from other modules and can often be imported instead of other, scarier modules. - typecheck/Constraint.hs describes constraints as used in the solver. It is taken from TcRnTypes. No work other than module splitting is in this patch. This is the first step toward homogeneous equality, which will rely more strongly on predicates. And homogeneous equality is the next step toward a dependently typed core language.