fix race crash for accepting new connections

Inspired by a patch by Kazuki Ohta. Summary from a mail sent to the list by Shigeki: [Example Scenario] 1. throw alot clients (well over connection limit) to connect to memcached. 2. memcached's file descriptors reaches maximum setting 3. main thread calls accept_new_conns(false) to stop polling sfd 4. main thread's event_base_loop stops accepting incoming request 5. main thread stops to acceess main_base at this point 6. a client disconnects 7. worker thread calls accept_new_conns(true) to start polling sfd 8. accept_new_conns uses mutex to protect main_base's race condition 9. worker thread starts loop with listen_conn 10. worker thread calls update_event with first conn 11. after first update_event(), main thread start polling sfd and starts to access main_base <- PROBLEM 12. Worker thread continues to call update_event() with second conn At this point, worker thread and main thread both acccess and modify main_base. --- The original patch coupled polling with the once-per-second clock timer. My patch creates a 10ms poller which kicks off after the listener is disabled. Switching for a conditional would be too much rewiring for 1.4, as 1.6 solves this in a better way.
author: dormando <dormando@rydia.net> 2011-07-12 00:26:54 -0700
committer: dormando <dormando@rydia.net> 2011-07-12 00:26:54 -0700
commit: ac939723874f1ebe1f0738af3855e81c026e9bc4 (patch)
tree: 0d8045cde98022eab00492902c2e511f9bab0d96
parent: 0d45e55dee7b380efccee1859fe6caa4ddace17a (diff)
download: memcached-ac939723874f1ebe1f0738af3855e81c026e9bc4.tar.gz
1 files changed, 26 insertions, 1 deletions
diff --git a/memcached.c b/memcached.c
index df1ee7b..39012b0 100644
--- a/memcached.c
+++ b/memcached.c
@@ -120,6 +120,27 @@ enum transmit_result {
 
 static enum transmit_result transmit(conn *c);
 
+/* This reduces the latency without adding lots of extra wiring to be able to
+ * notify the listener thread of when to listen again.
+ * Also, the clock timer could be broken out into its own thread and we
+ * can block the listener via a condition.
+ */
+static volatile bool allow_new_conns = true;
+static struct event maxconnsevent;
+static void maxconns_handler(const int fd, const short which, void *arg) {
+    struct timeval t = {.tv_sec = 0, .tv_usec = 10000};
+
+    if (allow_new_conns == false) {
+        /* reschedule in 10ms if we need to keep polling */
+        evtimer_set(&maxconnsevent, maxconns_handler, 0);
+        event_base_set(main_base, &maxconnsevent);
+        evtimer_add(&maxconnsevent, &t);
+    } else {
+        evtimer_del(&maxconnsevent);
+        accept_new_conns(true);
+    }
+}
+
 #define REALTIME_MAXDELTA 60*60*24*30
 
 /*
@@ -509,7 +530,9 @@ static void conn_close(conn *c) {
 
     MEMCACHED_CONN_RELEASE(c->sfd);
     close(c->sfd);
-    accept_new_conns(true);
+    pthread_mutex_lock(&conn_lock);
+    allow_new_conns = true;
+    pthread_mutex_unlock(&conn_lock);
     conn_cleanup(c);
 
     /* if the connection has big buffers, just free it */
@@ -3363,6 +3386,8 @@ void do_accept_new_conns(const bool do_accept) {
         stats.accepting_conns = false;
         stats.listen_disabled_num++;
         STATS_UNLOCK();
+        allow_new_conns = false;
+        maxconns_handler(0, 0, 0);
     }
 }
author	dormando <dormando@rydia.net>	2011-07-12 00:26:54 -0700
committer	dormando <dormando@rydia.net>	2011-07-12 00:26:54 -0700
commit	ac939723874f1ebe1f0738af3855e81c026e9bc4 (patch)
tree	0d8045cde98022eab00492902c2e511f9bab0d96
parent	0d45e55dee7b380efccee1859fe6caa4ddace17a (diff)
download	memcached-ac939723874f1ebe1f0738af3855e81c026e9bc4.tar.gz