path: root/rts
Commit message (Author, Age, Files, Lines)
* Use a uniform return convention in bytecode for unary results (Alexis King, 2023-05-13, 6 files, -133/+83)
| | | | fixes #22958
* rts: Teach listAllBlocks about nonmoving heap (Teo Camarasu, 2023-05-12, 1 file, -2/+29)
| | | | | | List all blocks on the non-moving heap. Resolves #22627
* rts: Ensure non-moving gc is not running when pausing (Teo Camarasu, 2023-05-12, 1 file, -0/+15)
* rts: Refine memory retention behaviour to account for pinned/compacted objects (Matthew Pickering, 2023-05-11, 3 files, -12/+59)
| | | | | | | | | | | | | | | | | | | | | | | When using the copying collector there is still a lot of data which isn't copied (such as pinned, compacted, large objects etc.). The logic to decide how much memory to retain didn't take into account that these wouldn't be copied. Therefore we pessimistically retained 2x the amount of memory for these blocks even though they wouldn't be copied by the collector. The solution is to split up the heap into two parts, the parts which will be copied and the parts which won't be copied. Then the appropriate factor is applied to each part individually (2x for copying and 1.2x for not copying). The T23221 test demonstrates this improvement with a program which first allocates many unpinned ByteArray# followed by many pinned ByteArray# and observes the difference in the ultimate memory baseline between the two. There are some charts on #23221. Fixes #23221
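The split described above can be sketched in a few lines of C. This is only an illustration of the arithmetic, not the RTS code: the function names and the block-count parameters are hypothetical, and the 1.2 factor is written as the integer ratio 12/10 so the result is exact.

```c
#include <assert.h>
#include <stddef.h>

/* Old behaviour (as described in the commit message): every live block
 * is scaled by the copying factor of 2, whether or not it is copied. */
static size_t retain_uniform(size_t live_blocks)
{
    return 2 * live_blocks;
}

/* New behaviour: blocks that will be copied are scaled by 2, blocks that
 * will not be copied (pinned, compacted, large) are scaled by 1.2.
 * Both names and the block-based units are assumptions for this sketch. */
static size_t retain_split(size_t live_blocks, size_t uncopied_blocks)
{
    assert(uncopied_blocks <= live_blocks);
    size_t copied_blocks = live_blocks - uncopied_blocks;
    return 2 * copied_blocks + (uncopied_blocks * 12) / 10;
}
```

With 1000 live blocks of which 500 are pinned, the uniform rule retains 2000 blocks while the split rule retains 1600.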
* Add fused multiply-add instructions (sheaf, 2023-05-11, 2 files, -2/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds eight new primops that fuse a multiplication and an addition or subtraction: - `{fmadd,fmsub,fnmadd,fnmsub}{Float,Double}#` fmadd x y z is x * y + z, computed with a single rounding step. This patch implements code generation for these primops in the following backends: - X86, AArch64 and PowerPC NCG, - LLVM - C WASM uses the C implementation. The primops are unsupported in the JavaScript backend. The following constant folding rules are also provided: - compute a * b + c when a, b, c are all literals, - x * y + 0 ==> x * y, - ±1 * y + z ==> z ± y and x * ±1 + z ==> z ± x. NB: the constant folding rules incorrectly handle signed zero. This is a known limitation with GHC's floating-point constant folding rules (#21227), which we hope to resolve in the future.
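The signed-zero caveat mentioned in the NB can be reproduced with ordinary IEEE 754 doubles. The C sketch below is not GHC code and does not use the new primops; it only shows why rewriting `x * y + 0` to `x * y` changes the result when the product is negative zero. All function names are made up for the illustration.

```c
#include <assert.h>

/* Detect negative zero using arithmetic only: 1 / -0.0 is -infinity. */
static int is_negative_zero(double v)
{
    return v == 0.0 && 1.0 / v < 0.0;
}

/* x * y + 0 evaluated as written. The volatile temporary keeps the
 * compiler from merging the multiply and the add. */
static double mul_add_zero(double x, double y)
{
    volatile double product = x * y;
    return product + 0.0;
}

/* The folded form that the rule "x * y + 0 ==> x * y" produces. */
static double mul_only(double x, double y)
{
    return x * y;
}

/* -1 * 0 is -0.0, but -0.0 + 0.0 is +0.0 under round-to-nearest,
 * so the folded and unfolded expressions differ in sign. */
static void signed_zero_demo(void)
{
    assert(is_negative_zero(mul_only(-1.0, 0.0)));
    assert(!is_negative_zero(mul_add_zero(-1.0, 0.0)));
}
```

A fused evaluation gives the same +0.0 as the unfolded form here, since the exact product -0 plus +0 rounds to +0.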
* nonmoving: Account for mutator allocations in bytes_allocated (Ben Gamari, 2023-05-09, 3 files, -1/+7)
| | | | | | | | | Previously we failed to account direct mutator allocations into the nonmoving heap against the mutator's allocation limit and `cap->total_allocated`. This only manifests during CAF evaluation (since we allocate the CAF's blackhole directly into the nonmoving heap). Fixes #23312.
* Make atomicSwapMutVar# an inline primop [wip/ioref-swap-xchg] (Ben Gamari, 2023-05-09, 3 files, -13/+0)
* compiler: Implement atomicSwapIORef with xchg (Ben Gamari, 2023-05-09, 4 files, -0/+15)
| | | | As requested by @treeowl in CLC#139.
* rts: Fix data-race in hs_init_ghc [wip/T22756] (Ben Gamari, 2023-05-08, 1 file, -8/+11)
| | | | | | | | As noticed by @Terrorjack, `hs_init_ghc` previously used non-atomic increment/decrement on the RTS's initialization count. This may go wrong in a multithreaded program which initializes the runtime multiple times. Closes #22756.
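A minimal sketch of the pattern, assuming C11 atomics: an initialization count that several threads may bump must be updated with an atomic read-modify-write, and only the caller that moves it from 0 to 1 (or back from 1 to 0) does the real work. The names below are hypothetical and are not the RTS's own.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the runtime's initialization count. */
static atomic_int init_count = 0;

/* Returns true only for the caller that must perform the real setup.
 * atomic_fetch_add returns the value held before the increment. */
static bool runtime_init(void)
{
    return atomic_fetch_add(&init_count, 1) == 0;
}

/* Returns true only for the caller that must perform the real teardown. */
static bool runtime_exit(void)
{
    int previous = atomic_fetch_sub(&init_count, 1);
    assert(previous > 0);   /* exit without a matching init is a bug */
    return previous == 1;
}
```

With a plain `init_count++`, two threads could both read 0 and both believe they are the first initializer; the atomic version hands the value 0 to exactly one of them.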
* Fix remaining issues with bound checking (#23123) (Sylvain Henry, 2023-05-04, 5 files, -69/+40)
| | | | | | | | | | | | | | | | | | | | While fixing these I've also changed the way we store addresses into ByteArray#. Addr# are composed of two parts: a JavaScript array and an offset (32-bit number). Suppose we want to store an Addr# in a ByteArray# foo at offset i. Before this patch, we were storing both fields as a tuple in the "arr" array field: foo.arr[i] = [addr_arr, addr_offset]; Now we only store the array part in the "arr" field and the offset directly in the array: foo.dv.setInt32(i, addr_offset); foo.arr[i] = addr_arr; It avoids wasting space for the tuple.
* JS: fix bounds checking (Issue 23123) (Josh Meredith, 2023-05-04, 1 file, -0/+12)
| | | | | | | | | | | | | | | | | | | | * For ByteArray-based bounds-checking, the JavaScript backend must use the `len` field, instead of the built-in JavaScript `length` field. * Range-based operations must also check both the start and end of the range for bounds * All indices are valid for ranges of size zero, since they are essentially no-ops * For cases of ByteArray accesses (e.g. read as Int), the end index is (i * sizeof(type) + sizeof(type) - 1), while the previous implementation used (i + sizeof(type) - 1). In the Int32 example, this is (i * 4 + 3) * IndexByteArrayOp_Word8As* primitives use byte array indices (unlike the previous point), but now check both start and end indices * Byte array copies now check if the arrays are the same by identity and then if the ranges overlap.
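The index arithmetic in these rules can be written out as a small C sketch. The patch itself is in the JavaScript backend; the helpers below are hypothetical and only restate the checks: `len` is the array length in bytes, element accesses are indexed in units of the element size, and ranges are given in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Element i of width `size` occupies bytes i*size .. i*size + size - 1.
 * Written without multiplication so that large i cannot overflow. */
static bool element_in_bounds(size_t len, size_t i, size_t size)
{
    assert(size > 0);
    return size <= len && i <= (len - size) / size;
}

/* A range of n bytes starting at byte offset off. Empty ranges are
 * no-ops, so every offset is accepted for n == 0. */
static bool range_in_bounds(size_t len, size_t off, size_t n)
{
    if (n == 0) return true;
    return off < len && n <= len - off;
}

/* Do two n-byte ranges inside the same array overlap? */
static bool ranges_overlap(size_t src_off, size_t dst_off, size_t n)
{
    if (n == 0) return false;
    return src_off < dst_off ? dst_off - src_off < n
                             : src_off - dst_off < n;
}
```

For an 8-byte array read as Int32, index 2 ends at byte 2 * 4 + 3 = 11 and is rejected, whereas the earlier formula i + 3 = 5 would have accepted it.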
* rts: always build 64-bit atomic ops (Cheng Shao, 2023-04-24, 2 files, -24/+17)
| | | | | | | | | | This patch does a few things: - Always build 64-bit atomic ops in rts/ghc-prim, even on 32-bit platforms - Remove legacy "64bit" cabal flag of rts package - Fix hs_xchg64 function prototype for 32-bit platforms - Fix AtomicFetch test for wasm32
* rts: Initialize Array# header in listThreads# (Ben Gamari, 2023-04-20, 1 file, -0/+1)
| | | | | | | Previously the implementation of listThreads# failed to initialize the header of the created array, leading to various nastiness. Fixes #23071
* JS: fix thread-related primops (Sylvain Henry, 2023-04-19, 2 files, -12/+26)
* rts: improve memory ordering and add some comments in the StablePtr implementation (Adam Sandberg Ericsson, 2023-04-14, 1 file, -10/+36)
* Add missing cases in -Di prettyprinter (Krzysztof Gogolewski, 2023-04-11, 1 file, -0/+51)
| | | | Fixes #23142
* nonmoving: Disable slop-zeroing (Ben Gamari, 2023-04-06, 1 file, -4/+8)
| | | | | | | | | As noted in #23170, the nonmoving GC can race with a mutator zeroing the slop of an updated thunk (in much the same way that two mutators would race). Consequently, we must disable slop-zeroing when the nonmoving GC is in use. Closes #23170
* StgToCmm: Upgrade -fcheck-prim-bounds behavior (Matthew Craven, 2023-04-04, 1 file, -0/+9)
| | | | | Fixes #21054. Additionally, we can now check for range overlap when generating Cmm for primops that use memcpy internally.
* rts: Fix capability-count check in zeroSlop (Ben Gamari, 2023-03-25, 1 file, -3/+2)
| | | | | | | | | Previously `zeroSlop` examined `RtsFlags` to determine whether the program was single-threaded. This is wrong; a program may be started with `+RTS -N1` yet the process may later increase the capability count with `setNumCapabilities`. This led to quite subtle and rare crashes. Fixes #23088.
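The difference between the two checks can be shown with a toy model in C. Everything here is hypothetical (the struct, its fields and the function names are not the RTS's); the sketch only contrasts a value fixed at startup with the live capability count.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant runtime state. */
struct rts_state_sketch {
    unsigned caps_at_startup;  /* what +RTS -N requested at startup */
    unsigned caps_now;         /* live count, may be raised later */
};

/* Old-style check: trusts the startup flag, which can go stale. */
static bool single_threaded_by_flag(const struct rts_state_sketch *s)
{
    return s->caps_at_startup == 1;
}

/* Fixed-style check: asks how many capabilities exist right now. */
static bool single_threaded_now(const struct rts_state_sketch *s)
{
    assert(s->caps_now >= 1);
    return s->caps_now == 1;
}

/* Toy analogue of raising the capability count at run time. */
static void set_num_capabilities(struct rts_state_sketch *s, unsigned n)
{
    assert(n >= 1);
    s->caps_now = n;
}
```

After `set_num_capabilities(&s, 4)` on a state started with one capability, the flag-based check still reports a single-threaded program while the live check does not.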
* rts: Don't rely on EXTERN_INLINE for slop-zeroing logic (Ben Gamari, 2023-03-25, 4 files, -23/+47)
| | | | | | | | | | | | | Previously we relied on calling EXTERN_INLINE functions defined in ClosureMacros.h from Cmm to zero slop. However, as far as I can tell, this is no longer safe to do in C99 as EXTERN_INLINE definitions may be emitted in each compilation unit. Fix this by explicitly declaring a new set of non-inline functions in ZeroSlop.c which can be called from Cmm and marking the ClosureMacros.h definitions as INLINE_HEADER. In the future we should try to eliminate EXTERN_INLINE.
* rts: use performBlockingMajorGC in hs_perform_gc and fix ffi023 (Cheng Shao, 2023-03-25, 2 files, -2/+3)
| | | | | | | | | | | This patch does a few things: - Add the missing RtsSymbols.c entry of performBlockingMajorGC - Make hs_perform_gc call performBlockingMajorGC, which restores previous behavior - Use hs_perform_gc in ffi023 - Remove rts_clearMemory() call in ffi023, it now works again in some test ways previously marked as broken. Fixes #23089
* rts: Fix barriers of IND and IND_STATIC (Ben Gamari, 2023-03-25, 2 files, -9/+11)
| | | | | | | | | Previously IND and IND_STATIC lacked the acquire barriers enjoyed by BLACKHOLE. As noted in the (now updated) Note [Heap memory barriers], this barrier is critical to ensure that the indirectee is visible to the entering core. Fixes #22872.
* fix: account for large and compact object stats with nonmoving gc (Teo Camarasu, 2023-03-25, 5 files, -7/+36)
| | | | | | | Make sure that we keep track of the size of large and compact objects that have been moved onto the nonmoving heap. We keep track of their size and add it to the amount of live bytes in nonmoving segments to get the total size of the live nonmoving heap. Resolves #17574
* JS: remove dead code for old integer-gmp (Sylvain Henry, 2023-03-10, 1 file, -16/+2)
* nonmoving: Non-concurrent collection (Ben Gamari, 2023-03-08, 7 files, -82/+132)
* rts: Capture GC configuration in a struct (Ben Gamari, 2023-03-08, 3 files, -19/+34)
| | | | | The number of distinct arguments passed to GarbageCollect was getting a bit out of hand.
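The general refactoring can be sketched as follows. The struct and its field names are assumptions made for illustration and are not copied from the RTS; the point is only that a configuration struct with designated initializers replaces a long list of positional arguments.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative configuration struct; field names are hypothetical. */
struct gc_config_sketch {
    unsigned collect_gen;      /* oldest generation to collect */
    bool     do_heap_census;
    bool     deadlock_detect;
    bool     parallel;
};

/* A consumer takes one argument instead of one per option. Here a
 * collection counts as major when it reaches the oldest generation. */
static bool is_major_gc(struct gc_config_sketch config,
                        unsigned n_generations)
{
    assert(n_generations >= 1);
    return config.collect_gen == n_generations - 1;
}
```

A call site can then name only the options it cares about, for example `(struct gc_config_sketch){ .collect_gen = 1, .parallel = true }`, and every unnamed field defaults to zero or false.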
* rts: Fix incorrect STATIC_INLINE (Ben Gamari, 2023-03-08, 1 file, -1/+1)
| | | | This should be INLINE_HEADER lest we get unused declaration warnings.
* rts: Rename clear_segment(_free_blocks)? (Ben Gamari, 2023-03-08, 3 files, -9/+9)
| | | | | To reflect the fact that these belong to the nonmoving collector, now that they are exposed and no longer static.
* nonmoving: Split out nonmovingAllocateGC (Ben Gamari, 2023-03-08, 4 files, -15/+55)
* nonmoving: Move allocator into new source file (Ben Gamari, 2023-03-08, 7 files, -198/+237)
* nonmoving: Ensure that sanity checker accounts for saved_filled segments (Ben Gamari, 2023-03-08, 1 file, -0/+1)
* nonmoving: Fix unregisterised build (Ben Gamari, 2023-03-08, 1 file, -0/+4)
* rts: Encapsulate block allocator spinlock (Ben Gamari, 2023-03-08, 7 files, -21/+28)
| | | | | This makes it a bit easier to add instrumentation on this spinlock while debugging.
* nonmoving: Don't call prepareUnloadCheck (Ben Gamari, 2023-03-08, 1 file, -1/+2)
| | | | | | When the nonmoving GC is in use we do not call `checkUnload` (since we don't unload code) and therefore should not call `prepareUnloadCheck`, lest we run into assertions.
* rts/Sanity: Fix block count assertion with non-moving collector (Ben Gamari, 2023-03-08, 1 file, -3/+13)
| | | | | | | The nonmoving collector does not use `oldest_gen->blocks` to track its block list. However, it nevertheless updates `oldest_gen->n_blocks` to ensure that its size is accounted for by the storage manager. Consequently, we must not attempt to assert consistency between the two.
* nonmoving: Fix Note references (Ben Gamari, 2023-03-08, 7 files, -8/+8)
| | | | | Some references to Note [Deadlock detection under the non-moving collector] were missing an article.
* nonmoving: Move current segment array into Capability (Ben Gamari, 2023-03-08, 11 files, -137/+89)
| | | | | | | | | | | | | | The current segments are conceptually owned by the mutator, not the collector. Consequently, it was quite tricky to prove that the mutator would not race with the collector due to this shared state. It turns out that such races are possible: when resizing the current segment array we may concurrently try to take a heap census. This will attempt to walk the current segment array, causing a data race. Fix this by moving the current segment array into `Capability`, where it belongs. Fixes #22926.
* rts: Reenable assertion (Ben Gamari, 2023-03-08, 1 file, -1/+1)
* nonmoving: Allow pinned gen0 objects to be WEAK keys (Ben Gamari, 2023-03-08, 1 file, -4/+14)
* nonmoving: Sync-phase mark budgeting (Ben Gamari, 2023-03-08, 3 files, -12/+86)
| | | | | | | | | | Here we significantly improve the bound on sync phase pause times by imposing a limit on the amount of work that we can perform during the sync. If we find that we have exceeded our marking budget then we allow the mutators to resume, return to concurrent marking, and try synchronizing again later. Fixes #22929.
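The control flow described above can be sketched with a toy mark queue in C. The queue (a bare counter), the budget values and the function names are all hypothetical; the real collector marks heap objects and coordinates with mutator threads, which this sketch leaves out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy mark queue: only counts how many objects are still pending. */
struct mark_queue_sketch { size_t pending; };

/* Mark at most `budget` objects. Returns true if the queue was drained,
 * false if the budget ran out first. */
static bool mark_with_budget(struct mark_queue_sketch *q, size_t budget)
{
    while (q->pending > 0) {
        if (budget == 0) return false;
        q->pending--;           /* stand-in for marking one object */
        budget--;
    }
    return true;
}

/* Try to finish marking during a sync. If the sync budget is exceeded,
 * go back to concurrent marking and retry later. Returns the number of
 * sync attempts that were needed. */
static unsigned sync_until_done(struct mark_queue_sketch *q,
                                size_t sync_budget, size_t concurrent_step)
{
    assert(sync_budget > 0 || concurrent_step > 0);
    unsigned attempts = 0;
    for (;;) {
        attempts++;
        if (mark_with_budget(q, sync_budget))
            return attempts;    /* drained while mutators were paused */
        /* Budget exceeded: mutators would resume here while marking
         * continues concurrently, then the sync is attempted again. */
        mark_with_budget(q, concurrent_step);
    }
}
```

The pause length is then bounded by the sync budget rather than by the amount of outstanding marking work.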
* nonmoving: Be more paranoid in segment tracking (Ben Gamari, 2023-03-08, 3 files, -1/+7)
| | | | | Previously we left various segment link pointers dangling. None of this was wrong per se, but it did make it harder than necessary to debug.
* nonmoving: Don't push if nonmoving collector isn't enabled (Ben Gamari, 2023-03-08, 1 file, -1/+1)
* nonmoving: Avoid n_caps race (Ben Gamari, 2023-03-08, 1 file, -4/+4)
* nonmoving: Post-sweep sanity checking (Ben Gamari, 2023-03-08, 1 file, -1/+13)
* nonmoving: Add missing write barriers in selector optimisation (Ben Gamari, 2023-03-08, 2 files, -6/+62)
| | | | | | | This fixes the selector optimisation, adding a few write barriers which are necessary for soundness. See the inline comments for details. Fixes #22930.
* nonmoving: Don't clobber update rem sets of old capabilities (Ben Gamari, 2023-03-08, 1 file, -1/+1)
| | | | | | | | | Previously `storageAddCapabilities` (called by `setNumCapabilities`) would clobber the update remembered sets of existing capabilities when increasing the capability count. Fix this by only initializing the update remembered sets of the newly-created capabilities. Fixes #22927.
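The shape of the fix can be sketched in C as a loop that starts at the old capability count instead of at zero. The struct and function names are hypothetical, and the remembered set is reduced to a length field for the purpose of the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Toy capability: the update remembered set is just an entry count. */
struct cap_sketch { size_t upd_rem_set_len; };

/* Reset one capability's update remembered set to empty. */
static void init_upd_rem_set(struct cap_sketch *c)
{
    c->upd_rem_set_len = 0;
}

/* Grow from `from` to `to` capabilities. Only the newly created
 * capabilities (indices from .. to-1) are initialized; starting the
 * loop at 0 would wipe the entries of the existing ones. */
static void add_capabilities(struct cap_sketch *caps, size_t from, size_t to)
{
    assert(from <= to);
    for (size_t i = from; i < to; i++)
        init_upd_rem_set(&caps[i]);
}
```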
* nonmoving: Handle new closures in nonmovingIsNowAlive (Ben Gamari, 2023-03-08, 2 files, -8/+18)
| | | | | We must conservatively assume that new closures are reachable since we are not guaranteed to mark such blocks.
* nonmoving: Assert state of swept segments (Ben Gamari, 2023-03-08, 2 files, -0/+3)
* nonmoving: Fix tracking of FILLED_SWEEPING segments (Ben Gamari, 2023-03-08, 1 file, -1/+1)
| | | | | Previously we only updated the state of the segment at the head of each allocator's filled list.
* nonmoving: Don't show occupancy if we didn't collect live words (Ben Gamari, 2023-03-08, 3 files, -17/+41)