summaryrefslogtreecommitdiff
path: root/rts/Compact.cmm
Commit message (Collapse)AuthorAgeFilesLines
* Fix a few Note inconsistenciesBen Gamari2022-02-011-1/+1
|
* Correct closure observation, construction, and mutation on weak memory machines.Travis Whitaker2019-06-281-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Here the following changes are introduced: - A read barrier machine op is added to Cmm. - The order in which a closure's fields are read and written is changed. - Memory barriers are added to RTS code to ensure correctness on out-or-order machines with weak memory ordering. Cmm has a new CallishMachOp called MO_ReadBarrier. On weak memory machines, this is lowered to an instruction that ensures memory reads that occur after said instruction in program order are not performed before reads coming before said instruction in program order. On machines with strong memory ordering properties (e.g. X86, SPARC in TSO mode) no such instruction is necessary, so MO_ReadBarrier is simply erased. However, such an instruction is necessary on weakly ordered machines, e.g. ARM and PowerPC. Weam memory ordering has consequences for how closures are observed and mutated. For example, consider a closure that needs to be updated to an indirection. In order for the indirection to be safe for concurrent observers to enter, said observers must read the indirection's info table before they read the indirectee. Furthermore, the entering observer makes assumptions about the closure based on its info table contents, e.g. an INFO_TYPE of IND imples the closure has an indirectee pointer that is safe to follow. When a closure is updated with an indirection, both its info table and its indirectee must be written. With weak memory ordering, these two writes can be arbitrarily reordered, and perhaps even interleaved with other threads' reads and writes (in the absence of memory barrier instructions). Consider this example of a bad reordering: - An updater writes to a closure's info table (INFO_TYPE is now IND). - A concurrent observer branches upon reading the closure's INFO_TYPE as IND. - A concurrent observer reads the closure's indirectee and enters it. (!!!) - An updater writes the closure's indirectee. Here the update to the indirectee comes too late and the concurrent observer has jumped off into the abyss. Speculative execution can also cause us issues, consider: - An observer is about to case on a value in closure's info table. - The observer speculatively reads one or more of closure's fields. - An updater writes to closure's info table. - The observer takes a branch based on the new info table value, but with the old closure fields! - The updater writes to the closure's other fields, but its too late. Because of these effects, reads and writes to a closure's info table must be ordered carefully with respect to reads and writes to the closure's other fields, and memory barriers must be placed to ensure that reads and writes occur in program order. Specifically, updates to a closure must follow the following pattern: - Update the closure's (non-info table) fields. - Write barrier. - Update the closure's info table. Observing a closure's fields must follow the following pattern: - Read the closure's info pointer. - Read barrier. - Read the closure's (non-info table) fields. This patch updates RTS code to obey this pattern. This should fix long-standing SMP bugs on ARM (specifically newer aarch64 microarchitectures supporting out-of-order execution) and PowerPC. This fixes issue #15449. Co-Authored-By: Ben Gamari <ben@well-typed.com>
* Rename some mutable closure types for consistencyÖmer Sinan Ağacan2018-06-051-4/+4
| | | | | | | | | | | | | | | | | | | | | | | SMALL_MUT_ARR_PTRS_FROZEN0 -> SMALL_MUT_ARR_PTRS_FROZEN_DIRTY SMALL_MUT_ARR_PTRS_FROZEN -> SMALL_MUT_ARR_PTRS_FROZEN_CLEAN MUT_ARR_PTRS_FROZEN0 -> MUT_ARR_PTRS_FROZEN_DIRTY MUT_ARR_PTRS_FROZEN -> MUT_ARR_PTRS_FROZEN_CLEAN Naming is now consistent with other CLEAR/DIRTY objects (MVAR, MUT_VAR, MUT_ARR_PTRS). (alternatively we could rename MVAR_DIRTY/MVAR_CLEAN etc. to MVAR0/MVAR) Removed a few comments in Scav.c about FROZEN0 being on the mut_list because it's now clear from the closure type. Reviewers: bgamari, simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4784
* rts: Fix compaction of SmallMutArrPtrsBen Gamari2018-05-201-13/+15
| | | | | | | | | | | | | | | | | | | | | | | | This was blatantly wrong due to copy-paste blindness: * labels were shadowed, which GHC doesn't warn about(!), resulting in plainly wrong behavior * the sharing check was omitted * the wrong closure layout was being used Moreover, the test wasn't being run due to its primitive dependency, so I didn't even notice. Sillyness. Test Plan: install `primitive`, `make test TEST=compact_small_array` Reviewers: simonmar, erikd Reviewed By: simonmar Subscribers: rwbarton, thomie, carter GHC Trac Issues: #13857. Differential Revision: https://phabricator.haskell.org/D4702
* Add likely annotation to cmm files in a few obvious places.klebinger.andreas@gmx.at2018-01-291-3/+3
| | | | | | | | | | | | | | | Provide information about paths more likely to be taken in the cmm files used by the rts. This leads to slightly better assembly being generated. Reviewers: bgamari, erikd, simonmar Subscribers: alexbiehl, rwbarton, thomie, carter GHC Trac Issues: #14672 Differential Revision: https://phabricator.haskell.org/D4324
* whitespace onlyGabor Greif2017-10-181-2/+1
|
* CNF: Implement compaction for small pointer arraysBen Gamari2017-08-241-3/+21
| | | | | | | | | | | | | | Test Plan: Validate Reviewers: austin, erikd, simonmar, dfeuer Reviewed By: dfeuer Subscribers: rwbarton, andrewthad, thomie, dfeuer GHC Trac Issues: #13860, #13857 Differential Revision: https://phabricator.haskell.org/D3888
* Prefer #if defined to #ifdefBen Gamari2017-04-281-2/+2
| | | | Our new CPP linter enforces this.
* rts: Fix "ASSERT ("sBen Gamari2017-04-231-4/+4
| | | | | | | | | | Reviewers: austin, erikd, simonmar Reviewed By: erikd Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3486
* rts/Compact.cmm: fix UNREG build failureSergei Trofimovich2016-12-171-54/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The change does the following: - Add explicit declaration of exception closures from base. C backend needs those symbols to be visible. - Reorder cmm functions in use order. Again C backend needs symbol declaration/definition before use. even for module-local cmm functions. Fixes the following build failure: rts_dist_HC rts/dist/build/Compact.o In file included from /tmp/ghc3348_0/ghc_4.hc:3:0: error: /tmp/ghc3348_0/ghc_4.hc: In function 'stg_compactAddWithSharingzh': /tmp/ghc3348_0/ghc_4.hc:27:11: error: error: 'stg_compactAddWorkerzh' undeclared (first use in this function) JMP_((W_)&stg_compactAddWorkerzh); ^ ... /tmp/ghc3348_0/ghc_4.hc:230:13: error: error: 'base_GHCziIOziException_cannotCompactMutable_closure' undeclared (first use in this function) R1.w = (W_)&base_GHCziIOziException_cannotCompactMutable_closure; ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Sergei Trofimovich <siarheit@google.com>
* Overhaul of Compact Regions (#12455)Simon Marlow2016-12-071-0/+437
Summary: This commit makes various improvements and addresses some issues with Compact Regions (aka Compact Normal Forms). This was the most important thing I wanted to fix. Compaction previously prevented GC from running until it was complete, which would be a problem in a multicore setting. Now, we compact using a hand-written Cmm routine that can be interrupted at any point. When a GC is triggered during a sharing-enabled compaction, the GC has to traverse and update the hash table, so this hash table is now stored in the StgCompactNFData object. Previously, compaction consisted of a deepseq using the NFData class, followed by a traversal in C code to copy the data. This is now done in a single pass with hand-written Cmm (see rts/Compact.cmm). We no longer use the NFData instances, instead the Cmm routine evaluates components directly as it compacts. The new compaction is about 50% faster than the old one with no sharing, and a little faster on average with sharing (the cost of the hash table dominates when we're doing sharing). Static objects that don't (transitively) refer to any CAFs don't need to be copied into the compact region. In particular this means we often avoid copying Char values and small Int values, because these are static closures in the runtime. Each Compact# object can support a single compactAdd# operation at any given time, so the Data.Compact library now enforces mutual exclusion using an MVar stored in the Compact object. We now get exceptions rather than killing everything with a barf() when we encounter an object that cannot be compacted (a function, or a mutable object). We now also detect pinned objects, which can't be compacted either. The Data.Compact API has been refactored and cleaned up. A new compactSize operation returns the size (in bytes) of the compact object. Most of the documentation is in the Haddock docs for the compact library, which I've expanded and improved here. Various comments in the code have been improved, especially the main Note [Compact Normal Forms] in rts/sm/CNF.c. I've added a few tests, and expanded a few of the tests that were there. We now also run the tests with GHCi, and in a new test way that enables sanity checking (+RTS -DS). There's a benchmark in libraries/compact/tests/compact_bench.hs for measuring compaction speed and comparing sharing vs. no sharing. The field totalDataW in StgCompactNFData was unnecessary. Test Plan: * new unit tests * validate * tested manually that we can compact Data.Aeson data Reviewers: gcampax, bgamari, ezyang, austin, niteria, hvr, erikd Subscribers: thomie, simonpj Differential Revision: https://phabricator.haskell.org/D2751 GHC Trac Issues: #12455