diff options
author | Ben Gamari <bgamari.foss@gmail.com> | 2017-07-03 19:10:07 -0400 |
---|---|---|
committer | Ben Gamari <ben@smart-cactus.org> | 2017-07-03 19:42:22 -0400 |
commit | fd7a7a6363d8dde1813bc23cb4ef00ebb70a49c0 (patch) | |
tree | eab8f5e155dbe389bfc43d577dee38970474dded | |
parent | 0836bfbd480b00a690937060fc98df5e26453078 (diff) | |
download | haskell-fd7a7a6363d8dde1813bc23cb4ef00ebb70a49c0.tar.gz |
Eagerly blackhole AP_STACKs
This fixes #13615. See the rather lengthy Note [AP_STACKs must be
eagerly blackholed] for details.
Reviewers: simonmar, austin, erikd, dfeuer
Subscribers: duog, dfeuer, hsyl20, rwbarton, thomie
GHC Trac Issues: #13615
Differential Revision: https://phabricator.haskell.org/D3695
-rw-r--r-- | rts/Apply.cmm | 180 | ||||
-rw-r--r-- | rts/RaiseAsync.c | 1 | ||||
-rw-r--r-- | rts/ThreadPaused.c | 9 |
3 files changed, 190 insertions, 0 deletions
diff --git a/rts/Apply.cmm b/rts/Apply.cmm index f14bb8f331..a0b498a820 100644 --- a/rts/Apply.cmm +++ b/rts/Apply.cmm @@ -454,6 +454,172 @@ for: directly to a function, so we have to enter it using stg_ap_0. -------------------------------------------------------------------------- */ +/* + Note [AP_STACKs must be eagerly blackholed] + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Trac #13615 describes a nasty concurrency issue where we can enter into the +middle of an ST action multiple times, resulting in duplication of effects. +In short, the construction of an AP_STACK allows us to suspend a computation +which should not be duplicated. When running with lazy blackholing, we can then +enter this AP_STACK multiple times, duplicating the computation with potentially +disasterous consequences. + +For instance, consider the case of a simple ST program which computes a sum +using in─place mutation, + + inplaceSum :: Num a => [a] ─> a + inplaceSum xs0 = runST $ do + y <─ newSTRef 0 + let go [] = readSTRef y + go (x : xs) = do + modifySTRef y (+x) + go xs + go xs0 + +Of course, it is fine if we enter an inplaceSum thunk more than once: the two +threads will inhabit different worlds with different STRefs. However, if we +suspend some part of inplaceSum (for instance, due to the heap check at the +beginning of go) and then multiple threads resume that suspension (as is safe in +pure computation) we will have multiple threads concurrently mutating the same +STRef. Disaster! + +Let's consider how this might happen: Consider this situation, + + ┌─────────┐ ┌───────┐ ┌───────┐ ┌─────────┐ + │ TSO 1 │ ╭────→│ go │ │ fun │ │ TSO 2 │ + └─────────┘ │ └───────┘ └───────┘ └─────────┘ + │ │ + ┌─────────┐ │ │ ┌─────────┐ + │ │──────╯ ╰──────────────│ │ + ├─────────┤ ┌─────────┐ ├─────────┤ + │ UPDATE_ │──────────→│ THUNK A │ ╭────────│ UPDATE_ │ + │ FRAME │ updatee └─────────┘ │updatee │ FRAME │ + ├─────────┤ │ ├─────────┤ + │ ... │ │ │ etc. │ + ├─────────┤ updatee ┌─────────┐ │ + │ UPDATE_ │─────────────────────→│ THUNK B │←───╯ + │ FRAME │ └─────────┘ + ├─────────┤ + │ etc. │ + +Here we have two threads (TSO 1 and TSO 2) which are in currently pausing (e.g. +in threadPaused). Since they are pausing, their stacks are headed by a pointer +to the continuation code which we will run on resumption (go and fun, +respectively). We also see that there are two thunks on the heap: THUNK A and +THUNK B where THUNK A is reachable from THUNK B (for instance, as an argument or +free variable). We see that thread 1 has THUNK A under evaluation, and both +threads have THUNK B under evaluation. + +As each thread enters threadPaused, threadPaused will walk its stack looking for +duplicate computation (see Note [suspend duplicate work], although there is some +background described below as well). Let's consider what this check does: + +Say that TSO 2 begins this check first. The check will walk TSO 2's stack, until +it finds the first update frame, which updates THUNK B. Upon finding this frame, +it will try to lock THUNK B, replacing it with a BLACKHOLE owned by its TSO. We +now have, + + ┌─────────┐ ┌───────┐ ┌───────┐ ┌─────────┐ + │ TSO 1 │ ╭────→│ go │ │ fun │ ╭────────→│ TSO 2 │ + └─────────┘ │ └───────┘ └───────┘ │ └─────────┘ + │ ↑ ╭─────╯ + ┌─────────┐ │ │ │ ┌─────────┐ + │ │──────╯ ╰─────────────────│ │ + ├─────────┤ updatee ┌─────────┐ │ ├─────────┤ + │ UPDATE_ │──────────→│ THUNK A │ │ ╭──────────│ UPDATE_ │ + │ FRAME │ └─────────┘ │ │ updatee │ FRAME │ + ├─────────┤ │ │ ├─────────┤ + │ ... │ owner│ │ │ etc. │ + ├─────────┤ updatee ┌────────────┐ │ + │ UPDATE_ │──────────────────→│ BLACKHOLE │←─╯ + │ FRAME │ └────────────┘ + ├─────────┤ + │ etc. │ + +Now consider what happens when TSO 1 runs its duplicate-computation check. +Again, we start walking the stack from the top, where we we find the update +frame updating THUNK A. We will lock this thunk, replacing it with a BLACKHOLE +owned by its TSO. We now have, + + ┌─────────┐ ┌───────┐ ┌───────┐ ┌─────────┐ + │ TSO 1 │←──╮ ╭────→│ go │ │ fun │ ╭────────→│ TSO 2 │ + └─────────┘ │ │ └───────┘ └───────┘ │ └─────────┘ + │ │ ↑ ╭─────╯ + ┌─────────┐ ╰──│─────────╮ │ │ ┌─────────┐ + │ │──────╯ │owner ╰─────────────────│ │ + ├─────────┤ ┌───────────┐ │ ├─────────┤ + │ UPDATE_ │──────────→│ BLACKHOLE │ │ ╭──────────│ UPDATE_ │ + │ FRAME │ updatee └───────────┘ │ │ updatee │ FRAME │ + ├─────────┤ │ │ ├─────────┤ + │ ... │ owner│ │ │ etc. │ + ├─────────┤ updatee ┌────────────┐ │ + │ UPDATE_ │──────────────────→│ BLACKHOLE │←─╯ + │ FRAME │ └────────────┘ + ├─────────┤ + │ etc. │ + +Now we will continue walking down TSO 1's stack, next coming across the second +update frame, pointing to the now-BLACKHOLE'd THUNK B. At this point +threadPaused will correctly conclude that TSO 1 is duplicating a computation +being carried out by TSO 2 and attempt to suspend it. + +The suspension process proceeds by invoking raiseAsync, which walks the stack +from the top looking for update frames. For each update frame we take any stack +frames preceeding it and construct an AP_STACK heap object from them. We then +replace the updatee of the frame with an indirection pointing to the AP_STACK. +So, after suspending the first update frame we have, + + ┌─────────┐ ┌───────┐ ┌───────┐ ┌─────────┐ + │ TSO 1 │ ╭────────→│ go │←─╮ │ fun │ ╭───────→│ TSO 2 │ + └─────────┘ │ └───────┘ │ └───────┘ │ └─────────┘ + │ ┌───────────┐ │ ↑ ╭─────╯ + ┌─────────┐ │ │ AP_STACK │ │ │ │ ┌─────────┐ + │ │──╯ ├───────────┤ │ ╰────────────────│ │ + ├─────────┤ │ │─╯ │ ├─────────┤ + │ UPDATE_ │───────╮ └───────────┘ │ ╭──────────│ UPDATE_ │ + │ FRAME │updatee│ ↑ │ │ updatee │ FRAME │ + ├─────────┤ │ │indirectee │ │ ├─────────┤ + │ ... │ ╰→┌───────────┐ │ │ │ etc. │ + ├─────────┤updatee │ BLACKHOLE │ │ │ + │ UPDATE_ │──╮ └───────────┘ owner│ │ + │ FRAME │ │ ┌────────────┐ │ + ├─────────┤ ╰───────────────→│ BLACKHOLE │←─╯ + │ etc. │ └────────────┘ + +Finally, we will replace the second update frame with a blackhole so that TSO 1 +will block on TSO 2's computation of THUNK B, + + ┌─────────┐ ┌───────┐ ┌───────┐ ┌─────────┐ + │ TSO 1 │ ╭────────→│ go │←─╮ │ fun │ ╭───────→│ TSO 2 │ + └─────────┘ │ └───────┘ │ └───────┘ │ └─────────┘ + │ ┌───────────┐ │ ↑ ╭─────╯ + ┌─────────┐ │ │ AP_STACK │ │ │ │ ┌─────────┐ + │ │──╯ ├───────────┤ │ ╰────────────────│ │ + ├─────────┤ │ │─╯ │ ├─────────┤ + │ UPDATE_ │───────╮ └───────────┘ │ ╭──────────│ UPDATE_ │ + │ FRAME │updatee│ ↑ │ │ updatee │ FRAME │ + ├─────────┤ │ │indirectee │ │ ├─────────┤ + │ ... │ ╰→┌───────────┐ │ │ │ etc. │ + ├─────────┤ │ BLACKHOLE │ │ │ + │ BLACK │ └───────────┘ owner│ │ + │ HOLE │───────────╮ ┌────────────┐ │ + ├─────────┤indirectee ╰──────→│ BLACKHOLE │←─╯ + │ etc. │ └────────────┘ + +At first glance there's still nothing terribly alarming here. However, consider +what would happen if some other closure held a reference to THUNK A. We would +now have leaked an AP_STACK capturing the state of a potentially +non-duplicatable computation to heap. Even worse, if two threads had references +to THUNK A and both attempted to enter at the same time, they would both succeed +if we allowed AP_STACKs to be lazily blackholed. This is the reason why we must +be very careful when entering AP_STACKS: they introduce the possibility that we +duplicate a computation which could never otherwise be duplicated. + +For this reason we employ an atomic blackholing strategy when entering AP_STACK +closures. + */ + + INFO_TABLE(stg_AP_STACK,/*special layout*/0,0,AP_STACK,"AP_STACK","AP_STACK") /* no args => explicit stack */ { @@ -477,6 +643,20 @@ INFO_TABLE(stg_AP_STACK,/*special layout*/0,0,AP_STACK,"AP_STACK","AP_STACK") PUSH_UPD_FRAME(Sp - SIZEOF_StgUpdateFrame, R1); Sp = Sp - SIZEOF_StgUpdateFrame - WDS(Words); + /* + * It is imperative that we blackhole lest we may duplicate computation which + * must not be duplicated. See Note [AP_STACKs must be eagerly blackholed]. + */ + W_ old_info; + (old_info) = prim %cmpxchgW(ap, stg_AP_STACK_info, stg_WHITEHOLE_info); + if (old_info != stg_AP_STACK_info) { + /* someone else beat us to it */ + jump ENTRY_LBL(stg_WHITEHOLE) (ap); + } + StgInd_indirectee(ap) = CurrentTSO; + prim_write_barrier; + SET_INFO(ap, __stg_EAGER_BLACKHOLE_info); + TICK_ENT_AP(); LDV_ENTER(ap); ENTER_CCS_THUNK(ap); diff --git a/rts/RaiseAsync.c b/rts/RaiseAsync.c index e04a875e49..6f1ab79691 100644 --- a/rts/RaiseAsync.c +++ b/rts/RaiseAsync.c @@ -873,6 +873,7 @@ raiseAsync(Capability *cap, StgTSO *tso, StgClosure *exception, ap->size = words; ap->fun = (StgClosure *)sp[0]; + sp++; for(i=0; i < words; ++i) { ap->payload[i] = (StgClosure *)*sp++; diff --git a/rts/ThreadPaused.c b/rts/ThreadPaused.c index 2483466255..d01be291d3 100644 --- a/rts/ThreadPaused.c +++ b/rts/ThreadPaused.c @@ -275,6 +275,9 @@ threadPaused(Capability *cap, StgTSO *tso) // deadlocked on itself. See #5226 for an instance of // this bug. // + // Note that great care is required when entering computations + // suspended by this mechanism. See Note [AP_STACKs must be eagerly + // blackholed] for details. if (((bh_info == &stg_BLACKHOLE_info) && ((StgInd*)bh)->indirectee != (StgClosure*)tso) || (bh_info == &stg_WHITEHOLE_info)) @@ -303,6 +306,12 @@ threadPaused(Capability *cap, StgTSO *tso) continue; } + // We should never have made it here in the event of blackholes that + // we already own; they should have been marked when we blackholed + // them and consequently we should have stopped our stack walk + // above. + ASSERT(!((bh_info == &stg_BLACKHOLE_info) + && (((StgInd*)bh)->indirectee == (StgClosure*)tso))); // zero out the slop so that the sanity checker can tell // where the next closure is. |