From 598472908ebb08f6811b892f285490554c290ae3 Mon Sep 17 00:00:00 2001
From: Simon Marlow
Date: Sat, 3 Jun 2017 20:26:13 +0100
Subject: Fix a lost-wakeup bug in BLACKHOLE handling (#13751)

Summary:
The problem occurred when
* Threads A & B evaluate the same thunk
* Thread A context-switches, so the thunk gets blackholed
* Thread C enters the blackhole, creates a BLOCKING_QUEUE attached to
  the blackhole and thread A's `tso->bq` queue
* Thread B updates the blackhole with a value, overwriting the
  BLOCKING_QUEUE
* We GC, replacing A's update frame with stg_enter_checkbh
* Throw an exception in A, which ignores the stg_enter_checkbh frame

Now we have C blocked on A's tso->bq queue, but we forgot to check the
queue because the stg_enter_checkbh frame has been thrown away by the
exception.

The solution and alternative designs are discussed in Note
[upd-black-hole].

This also exposed a bug in the interpreter, whereby we were sometimes
context-switching without calling `threadPaused()`. I've fixed this
and added some Notes.

Test Plan:
* `cd testsuite/tests/concurrent && make slow`
* validate

Reviewers: niteria, bgamari, austin, erikd

Reviewed By: erikd

Subscribers: rwbarton, thomie

GHC Trac Issues: #13751

Differential Revision: https://phabricator.haskell.org/D3630
---
 rts/Messages.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

(limited to 'rts/Messages.c')

diff --git a/rts/Messages.c b/rts/Messages.c
index 0508a20d20..8fab314bc4 100644
--- a/rts/Messages.c
+++ b/rts/Messages.c
@@ -289,7 +289,9 @@ loop:
             recordClosureMutated(cap,(StgClosure*)bq);
         }
 
-        debugTraceCap(DEBUG_sched, cap, "thread %d blocked on thread %d",
+        debugTraceCap(DEBUG_sched, cap,
+                      "thread %d blocked on existing BLOCKING_QUEUE "
+                      "owned by thread %d",
                       (W_)msg->tso->id, (W_)owner->id);
 
         // See above, #3838
-- 
cgit v1.2.1