List all blocks on the non-moving heap.
Resolves #22627
|
When using the copying collector there is still a lot of data which
isn't copied (such as pinned, compacted, and large objects). The logic
to decide how much memory to retain didn't take into account that
these wouldn't be copied. Therefore we pessimistically retained twice
the amount of memory for these blocks even though they wouldn't be
copied by the collector.

The solution is to split the heap into two parts: the part which will
be copied and the part which won't be copied. The appropriate factor
is then applied to each part individually (2 for copying and 1.2 for
not copying).

The T23221 test demonstrates this improvement with a program which
first allocates many unpinned ByteArray# followed by many pinned
ByteArray# and observes the difference in the ultimate memory baseline
between the two.

There are some charts on #23221.

Fixes #23221
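
To make the arithmetic concrete, here is a small standalone C sketch
of the split accounting described above (function and policy names are
illustrative, not GHC's actual code):

```c
#include <stdio.h>

/* Illustrative sketch of the retention decision described above.
 * The factors match the commit message: copied data needs headroom
 * for to-space (factor 2), while uncopied data (pinned, compacted,
 * large objects) only needs a modest growth factor (1.2). */
static size_t retained_bytes(size_t copied_live, size_t uncopied_live)
{
    return 2 * copied_live + (size_t)(1.2 * (double)uncopied_live);
}

int main(void)
{
    /* 100 MB of copyable data and 400 MB pinned: the old policy
     * retained 2 * 500 = 1000 MB; the split policy retains
     * 2*100 + 1.2*400 = 680 MB. */
    size_t old_policy = 2 * (100 + 400);
    size_t new_policy = retained_bytes(100, 400);
    printf("old: %zu MB, new: %zu MB\n", old_policy, new_policy);
    return 0;
}
```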
|
This patch adds eight new primops that fuse a multiplication and an
addition or subtraction:

- `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#`

`fmadd x y z` is `x * y + z`, computed with a single rounding step.

This patch implements code generation for these primops in the
following backends:

- X86, AArch64 and PowerPC NCG
- LLVM
- C

WASM uses the C implementation. The primops are unsupported in the
JavaScript backend.

The following constant folding rules are also provided:

- compute `a * b + c` when `a`, `b`, `c` are all literals,
- `x * y + 0 ==> x * y`,
- `±1 * y + z ==> z ± y` and `x * ±1 + z ==> z ± x`.

NB: the constant folding rules incorrectly handle signed zero. This is
a known limitation with GHC's floating-point constant folding rules
(#21227), which we hope to resolve in the future.
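
For intuition about the single rounding step, C's standard `fma`
(which a C-backend implementation can lean on) computes the same fused
operation. This standalone example, mine rather than part of the
patch, shows a case where the fused form preserves a term that the
two-rounding form loses:

```c
#include <math.h>
#include <stdio.h>

int main(void)   /* link with -lm */
{
    /* (1 + 2^-52) * (1 - 2^-52) = 1 - 2^-104, which rounds to 1.0
     * in double precision, so x * y + z loses the -2^-104 term. */
    double x = 1.0 + 0x1p-52;
    double y = 1.0 - 0x1p-52;
    double z = -1.0;

    double two_roundings = x * y + z;     /* 0.0: product was rounded */
    double one_rounding  = fma(x, y, z);  /* -2^-104: exact product */

    printf("x*y + z      = %a\n", two_roundings);
    printf("fma(x, y, z) = %a\n", one_rounding);
    return 0;
}
```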
|
Previously we failed to account direct mutator allocations into the
nonmoving heap against the mutator's allocation limit and
`cap->total_allocated`. This only manifests during CAF evaluation (since
we allocate the CAF's blackhole directly into the nonmoving heap).
Fixes #23312.
|
As requested by @treeowl in CLC#139.
|
As noticed by @Terrorjack, `hs_init_ghc` previously used non-atomic
increment/decrement on the RTS's initialization count. This may go wrong
in a multithreaded program which initializes the runtime multiple times.
Closes #22756.
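
A minimal sketch of the atomic counting pattern involved, using C11
atomics; the function and counter names here are hypothetical, not the
RTS's actual symbols:

```c
#include <stdatomic.h>

/* Hypothetical init counter; the real RTS keeps its own. */
static atomic_int init_count = 0;

void my_hs_init(void)
{
    /* Atomic increment: only the first caller performs real init,
     * even if several threads race to initialize the runtime. */
    if (atomic_fetch_add(&init_count, 1) == 0) {
        /* ... one-time runtime initialization ... */
    }
}

void my_hs_exit(void)
{
    /* Atomic decrement: only the last caller tears down. */
    if (atomic_fetch_sub(&init_count, 1) == 1) {
        /* ... one-time runtime teardown ... */
    }
}
```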
|
While fixing these I've also changed the way we store addresses into
ByteArray#. An Addr# is composed of two parts: a JavaScript array and
an offset (a 32-bit number).

Suppose we want to store an Addr# in a ByteArray# foo at offset i.
Before this patch, we were storing both fields as a tuple in the "arr"
array field:

    foo.arr[i] = [addr_arr, addr_offset];

Now we only store the array part in the "arr" field and the offset
directly in the array:

    foo.dv.setInt32(i, addr_offset);
    foo.arr[i] = addr_arr;

This avoids wasting space on the tuple.
|
* For ByteArray-based bounds-checking, the JavaScript backend must use
  the `len` field, instead of the built-in JavaScript `length` field.
* Range-based operations must also check both the start and end of the
  range for bounds.
* All indices are valid for ranges of size zero, since they are
  essentially no-ops.
* For cases of ByteArray accesses (e.g. read as Int), the end index is
  (i * sizeof(type) + sizeof(type) - 1), while the previous
  implementation used (i + sizeof(type) - 1). In the Int32 example,
  this is (i * 4 + 3) (see the sketch after this list).
* IndexByteArrayOp_Word8As* primitives use byte array indices (unlike
  the previous point), but now check both start and end indices.
* Byte array copies now check if the arrays are the same by identity
  and then if the ranges overlap.
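
A hedged C sketch of the two bounds rules above (not the JS backend's
actual code; it also ignores size_t overflow for readability):

```c
#include <stdbool.h>
#include <stddef.h>

/* An element access at index i with element size elem_size touches
 * bytes [i*elem_size, i*elem_size + elem_size - 1]; the last byte
 * must lie within the array's len bytes. */
static bool element_access_ok(size_t len, size_t i, size_t elem_size)
{
    size_t start = i * elem_size;          /* first byte touched */
    size_t end   = start + elem_size - 1;  /* last byte, e.g. i*4 + 3 */
    return end < len;                      /* start < len follows */
}

/* Ranges of size zero are always in bounds, per the rule above. */
static bool range_access_ok(size_t len, size_t off, size_t n)
{
    if (n == 0) return true;       /* zero-sized range: no-op */
    if (off >= len) return false;  /* start out of bounds */
    return n <= len - off;         /* end in bounds, overflow-safe */
}
```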
|
This patch does a few things:
- Always build 64-bit atomic ops in rts/ghc-prim, even on 32-bit
platforms
- Remove legacy "64bit" cabal flag of rts package
- Fix hs_xchg64 function prototype for 32-bit platforms
- Fix AtomicFetch test for wasm32
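
A hedged sketch of what a 64-bit atomic exchange helper like the
hs_xchg64 mentioned above can look like (the RTS's real prototype may
differ; `__atomic_exchange_n` is the GCC/Clang builtin):

```c
#include <stdint.h>

/* On 32-bit platforms the compiler lowers this to a library call or
 * a CAS loop, which is why the 64-bit variants must still be built
 * there rather than being gated behind a "64bit" flag. */
uint64_t my_xchg64(uint64_t *ptr, uint64_t val)
{
    return __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST);
}
```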
|
Previously the implementation of listThreads# failed to initialize the
header of the created array, leading to various nastiness.
Fixes #23071
|
implementation
|
Fixes #23142
|
As noted in #23170, the nonmoving GC can race with a mutator zeroing the
slop of an updated thunk (in much the same way that two mutators would
race). Consequently, we must disable slop-zeroing when the nonmoving GC
is in use.
Closes #23170
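
A heavily hedged sketch of the guard this implies (the flag name is
hypothetical, not necessarily what the patch uses):

```c
#include <stdbool.h>
#include <stddef.h>

extern bool use_nonmoving_gc;   /* hypothetical runtime flag */

static void zero_slop(char *slop_start, size_t n)
{
    /* With the concurrent nonmoving collector, the marker may read
     * this memory while we write it, so zeroing would be a data
     * race; skip it entirely in that configuration. */
    if (use_nonmoving_gc)
        return;
    for (size_t i = 0; i < n; i++)
        slop_start[i] = 0;
}
```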
|
Fixes #21054. Additionally, we can now check for range overlap
when generating Cmm for primops that use memcpy internally.
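
For reference, a small C sketch of the overlap predicate such a check
amounts to (my formulation, not the generated Cmm): two ranges
[dst, dst+n) and [src, src+n) overlap unless one ends before the other
begins, and memcpy requires non-overlapping ranges (overlapping copies
need memmove).

```c
#include <stdbool.h>
#include <stddef.h>

static bool ranges_overlap(const char *dst, const char *src, size_t n)
{
    /* Empty ranges never overlap; otherwise each must start before
     * the other ends. */
    return n != 0 && dst < src + n && src < dst + n;
}
```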
|
Previously `zeroSlop` examined `RtsFlags` to determine whether the
program was single-threaded. This is wrong; a program may be started
with `+RTS -N1` yet the process may later increase the capability
count with `setNumCapabilities`. This led to quite subtle and rare
crashes.

Fixes #23088.
|
Previously we relied on calling EXTERN_INLINE functions defined in
ClosureMacros.h from Cmm to zero slop. However, as far as I can tell,
this is no longer safe to do in C99 as EXTERN_INLINE definitions may be emitted
in each compilation unit.
Fix this by explicitly declaring a new set of non-inline functions in
ZeroSlop.c which can be called from Cmm and marking the ClosureMacros.h
definitions as INLINE_HEADER.
In the future we should try to eliminate EXTERN_INLINE.
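
A hedged C sketch of the two linkage patterns involved (the macro
definitions follow common RTS conventions, not necessarily the exact
ones in the headers):

```c
#include <stddef.h>

/* In the header: INLINE_HEADER gives each translation unit its own
 * private copy, so no out-of-line symbol is ever emitted or needed. */
#define INLINE_HEADER static inline

INLINE_HEADER void zeroSlop(char *p, size_t n)
{
    for (size_t i = 0; i < n; i++)
        p[i] = 0;
}

/* In ZeroSlop.c: a real, non-inline function with a stable symbol
 * that hand-written Cmm code can safely call. */
void zeroSlopCmm(char *p, size_t n)
{
    zeroSlop(p, n);
}
```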
|
This patch does a few things:

- Add the missing RtsSymbols.c entry for performBlockingMajorGC
- Make hs_perform_gc call performBlockingMajorGC, which restores the
  previous behavior
- Use hs_perform_gc in ffi023
- Remove the rts_clearMemory() call in ffi023; it now works again in
  some test ways previously marked as broken

Fixes #23089
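
The restored behavior amounts to a one-line wrapper; a sketch, with
both signatures assumed to be the usual no-argument void forms:

```c
extern void performBlockingMajorGC(void);

/* hs_perform_gc once again performs a blocking major collection. */
void hs_perform_gc(void)
{
    performBlockingMajorGC();
}
```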
|
Previously IND and IND_STATIC lacked the acquire barriers enjoyed by
BLACKHOLE. As noted in the (now updated) Note [Heap memory barriers],
this barrier is critical to ensure that the indirectee is visible to the
entering core.
Fixes #22872.
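
A hedged sketch of the barrier pairing this describes, written with
C11 atomics rather than the RTS's own macros (types are stand-ins for
the real closure layout):

```c
#include <stdatomic.h>

struct closure;                                   /* opaque closure */
struct ind { struct closure *_Atomic indirectee; };

/* The updater publishes the indirectee with a release store ... */
static void update_ind(struct ind *p, struct closure *value)
{
    atomic_store_explicit(&p->indirectee, value, memory_order_release);
}

/* ... and whoever enters the IND must use an acquire load, so the
 * indirectee's fields are guaranteed visible to the entering core. */
static struct closure *enter_ind(struct ind *p)
{
    return atomic_load_explicit(&p->indirectee, memory_order_acquire);
}
```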
|
Make sure that we keep track of the size of large and compact objects
that have been moved onto the nonmoving heap. We keep track of their
size and add it to the amount of live bytes in nonmoving segments to
get the total size of the live nonmoving heap.

Resolves #17574
|
The number of distinct arguments passed to GarbageCollect was getting a
bit out of hand.
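
One common shape for such a cleanup, sketched here with entirely
hypothetical field names, is to bundle the flags into a single
configuration struct passed by value:

```c
#include <stdbool.h>

typedef struct {
    unsigned int collect_gen;     /* which generation to collect */
    bool         do_heap_census;  /* take a heap census? */
    bool         deadlock_detect; /* deadlock-detection GC? */
    bool         nonconcurrent;   /* force a nonconcurrent collection? */
} GcConfig;

void GarbageCollect(GcConfig config /* , ... */);
```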
|
This should be INLINE_HEADER lest we get unused declaration warnings.
|
To reflect the fact that these are to do with the nonmoving collector,
now that they are exposed and no longer static.
|
This makes it a bit easier to add instrumentation on this spinlock
while debugging.
|
When the nonmoving GC is in use we do not call `checkUnload` (since we
don't unload code) and therefore should not call `prepareUnloadCheck`,
lest we run into assertions.
|
The nonmoving collector does not use `oldest_gen->blocks` to track its
block list. However, it nevertheless updates `oldest_gen->n_blocks` to
ensure that its size is accounted for by the storage manager.
Consequently, we must not attempt to assert consistency between the two.
|
Some references to Note [Deadlock detection under the non-moving
collector] were missing an article.
|
The current segments are conceptually owned by the mutator, not the
collector. Consequently, it was quite tricky to prove that the mutator
would not race with the collector due to this shared state. It turns
out that such races are possible: when resizing the current segment
array we may concurrently try to take a heap census. This will attempt
to walk the current segment array, causing a data race.

Fix this by moving the current segment array into `Capability`, where
it belongs.

Fixes #22926.
|
Here we significantly improve the bound on sync phase pause times by
imposing a limit on the amount of work that we can perform during the
sync. If we find that we have exceeded our marking budget then we allow
the mutators to resume, return to concurrent marking, and try
synchronizing again later.
Fixes #22929.
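
A hedged sketch of the bounded sync this describes (all names are
hypothetical): marking during the sync pause is capped by a budget,
and running out of budget abandons the sync rather than extending the
pause.

```c
#include <stdbool.h>

typedef struct mark_queue mark_queue;
extern bool mark_queue_empty(mark_queue *q);
extern void mark_one_object(mark_queue *q);

static bool try_complete_sync(mark_queue *q, long budget)
{
    while (!mark_queue_empty(q)) {
        if (budget-- <= 0)
            return false;   /* over budget: resume mutators, retry
                             * from concurrent marking later */
        mark_one_object(q);
    }
    return true;            /* marking finished within the pause */
}
```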
|
Previously we left various segment link pointers dangling. None of
this was wrong per se, but it did make it harder than necessary to
debug.
|
This fixes the selector optimisation, adding a few write barriers which
are necessary for soundness. See the inline comments for details.
Fixes #22930.
|
Previously `storageAddCapabilities` (called by `setNumCapabilities`) would
clobber the update remembered sets of existing capabilities when
increasing the capability count. Fix this by only initializing the
update remembered sets of the newly-created capabilities.
Fixes #22927.
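
A sketch of the fix's shape, with hypothetical types and names: when
growing the capability count, only the newly created capabilities get
fresh remembered sets, since reinitializing existing ones would drop
entries they have already accumulated.

```c
typedef struct { void *entries; /* ... */ } UpdRemSet;
typedef struct { UpdRemSet upd_rem_set; /* ... */ } Capability;

extern void initUpdRemSet(UpdRemSet *rs);

static void initNewCapabilityRemSets(Capability **caps,
                                     unsigned old_n, unsigned new_n)
{
    /* Initialize only indices [old_n, new_n); leave existing
     * capabilities' remembered sets untouched. */
    for (unsigned i = old_n; i < new_n; i++)
        initUpdRemSet(&caps[i]->upd_rem_set);
}
```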
|
We must conservatively assume that new closures are reachable since we
are not guaranteed to mark such blocks.
|
Previously we only updated the state of the segment at the head of each
allocator's filled list.
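
In sketch form (hypothetical names), the fix is to walk the whole
filled list rather than touching only its head:

```c
#include <stddef.h>

typedef struct segment {
    int state;              /* e.g. filled vs. active */
    struct segment *link;   /* next segment in the filled list */
} segment;

enum { SEG_FILLED = 1 };

static void mark_all_filled(segment *filled_head)
{
    /* Previously only filled_head had its state updated. */
    for (segment *seg = filled_head; seg != NULL; seg = seg->link)
        seg->state = SEG_FILLED;
}
```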
|
Assert that entries in the nonmoving generation's generational
remembered set (a.k.a. mutable list) live in the nonmoving generation.
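
A hedged sketch of such an assertion, following the block-descriptor
scheme the RTS uses (the flag value and accessor here are stand-ins,
not the real definitions):

```c
#include <assert.h>

#define BF_NONMOVING 1024               /* illustrative flag value */
typedef struct { unsigned flags; } bdescr;
extern bdescr *Bdescr(void *p);         /* block descriptor lookup */

static void check_mut_list_entry(void *closure)
{
    /* Every closure on the nonmoving generation's mutable list
     * should itself reside in a nonmoving block. */
    assert(Bdescr(closure)->flags & BF_NONMOVING);
}
```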