Freeze time sampling during command execution, and scripts (#10300)

Freeze time during execution of scripts and all other commands. This means that a key is either expired or not, and doesn't change state during a script execution. resolves #10182 This PR try to add a new `commandTimeSnapshot` function. The function logic is extracted from `keyIsExpired`, but the related calls to `fixed_time_expire` and `mstime()` are removed, see below. In commands, we will avoid calling `mstime()` multiple times and just use the one that sampled in call. The background is, e.g. using `PEXPIRE 1` with valgrind sometimes result in the key being deleted rather than expired. The reason is that both `PEXPIRE` command and `checkAlreadyExpired` call `mstime()` separately. There are other more important changes in this PR: 1. Eliminate `fixed_time_expire`, it is no longer needed. When we want to sample time we should always use a time snapshot. We will use `in_nested_call` instead to update the cached time in `call`. 2. Move the call for `updateCachedTime` from `serverCron` to `afterSleep`. Now `commandTimeSnapshot` will always return the sample time, the `lookupKeyReadWithFlags` call in `getNodeByQuery` will get a outdated cached time (because `processCommand` is out of the `call` context). We put the call to `updateCachedTime` in `aftersleep`. 3. Cache the time each time the module lock Redis. Call `updateCachedTime` in `moduleGILAfterLock`, affecting `RM_ThreadSafeContextLock` and `RM_ThreadSafeContextTryLock` Currently the commandTimeSnapshot change affects the following TTL commands: - SET EX / SET PX - EXPIRE / PEXPIRE - SETEX / PSETEX - GETEX EX / GETEX PX - TTL / PTTL - EXPIRETIME / PEXPIRETIME - RESTORE key TTL And other commands just use the cached mstime (including TIME). This is considered to be a breaking change since it can break a script that uses a loop to wait for a key to expire.
author: Binbin <binloveplay1314@qq.com> 2022-10-09 13:18:34 +0800
committer: GitHub <noreply@github.com> 2022-10-09 08:18:34 +0300
commit: 35b3fbd90c2ad2c503c9e3d28bfbffff13099925 (patch)
tree: 8ca44f6572095f44f29fcddcefb8bdac3490b806 /src/server.c
parent: d2ad01ab3e24ca537efdd1fc6ff1ae2c657f4a51 (diff)
download: redis-35b3fbd90c2ad2c503c9e3d28bfbffff13099925.tar.gz
1 files changed, 33 insertions, 15 deletions
diff --git a/src/server.c b/src/server.c
index cb1e0447a..617a5e3bc 100644
--- a/src/server.c
+++ b/src/server.c
@@ -201,6 +201,32 @@ mstime_t mstime(void) {
     return ustime()/1000;
 }
 
+/* Return the command time snapshot in milliseconds.
+ * The time the command started is the logical time it runs,
+ * and all the time readings during the execution time should
+ * reflect the same time.
+ * More details can be found in the comments below. */
+mstime_t commandTimeSnapshot(void) {
+    /* If we are in the context of a Lua script, we pretend that time is
+     * blocked to when the Lua script started. This way a key can expire
+     * only the first time it is accessed and not in the middle of the
+     * script execution, making propagation to slaves / AOF consistent.
+     * See issue #1525 on Github for more information. */
+    if (server.script_caller) {
+        return scriptTimeSnapshot();
+    }
+    /* If we are in the middle of a command execution, we still want to use
+     * a reference time that does not change: in that case we just use the
+     * cached time, that we update before each call in the call() function.
+     * This way we avoid that commands such as RPOPLPUSH or similar, that
+     * may re-open the same key multiple times, can invalidate an already
+     * open object in a next call, if the next call will see the key expired,
+     * while the first did not. */
+    else {
+        return server.mstime;
+    }
+}
+
 /* After an RDB dump or AOF rewrite we exit from children using _exit() instead of
  * exit(), because the latter may interact with the same file objects used by
  * the parent process. However if we are testing the coverage normal exit() is
@@ -1171,9 +1197,6 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
      * handler if we don't return here fast enough. */
     if (server.watchdog_period) watchdogScheduleSignal(server.watchdog_period);
 
-    /* Update the time cache. */
-    updateCachedTime(1);
-
     server.hz = server.config_hz;
     /* Adapt the server.hz value to the number of configured clients. If we have
      * many clients, we want to call serverCron() with an higher frequency. */
@@ -1656,6 +1679,9 @@ void beforeSleep(struct aeEventLoop *eventLoop) {
 void afterSleep(struct aeEventLoop *eventLoop) {
     UNUSED(eventLoop);
 
+    /* Update the time cache. */
+    updateCachedTime(1);
+
     /* Do NOT add anything above moduleAcquireGIL !!! */
 
     /* Acquire the modules GIL so that their threads won't touch anything. */
@@ -2408,7 +2434,6 @@ void initServer(void) {
     server.main_thread_id = pthread_self();
     server.current_client = NULL;
     server.errors = raxNew();
-    server.fixed_time_expire = 0;
     server.in_nested_call = 0;
     server.clients = listCreate();
     server.clients_index = raxNew();
@@ -3333,7 +3358,7 @@ void call(client *c, int flags) {
 
     /* Update cache time, in case we have nested calls we want to
      * update only on the first call*/
-    if (server.fixed_time_expire++ == 0) {
+    if (server.in_nested_call++ == 0) {
         updateCachedTimeWithUs(0,call_timer);
     }
 
@@ -3341,7 +3366,6 @@ void call(client *c, int flags) {
     if (monotonicGetType() == MONOTONIC_CLOCK_HW)
         monotonic_start = getMonotonicUs();
 
-    server.in_nested_call++;
     c->cmd->proc(c);
     server.in_nested_call--;
 
@@ -3487,7 +3511,6 @@ void call(client *c, int flags) {
         }
     }
 
-    server.fixed_time_expire--;
     server.stat_numcommands++;
 
     /* Record peak memory after each command and before the eviction that runs
@@ -4348,14 +4371,9 @@ void echoCommand(client *c) {
 }
 
 void timeCommand(client *c) {
-    struct timeval tv;
-
-    /* gettimeofday() can only fail if &tv is a bad address so we
-     * don't check for errors. */
-    gettimeofday(&tv,NULL);
     addReplyArrayLen(c,2);
-    addReplyBulkLongLong(c,tv.tv_sec);
-    addReplyBulkLongLong(c,tv.tv_usec);
+    addReplyBulkLongLong(c, server.unixtime);
+    addReplyBulkLongLong(c, server.ustime-server.unixtime*1000000);
 }
 
 typedef struct replyFlagNames {
@@ -5368,7 +5386,7 @@ sds genRedisInfoString(dict *section_dict, int all_sections, int everything) {
         if (isShutdownInitiated()) {
             info = sdscatfmt(info,
                 "shutdown_in_milliseconds:%I\r\n",
-                (int64_t)(server.shutdown_mstime - server.mstime));
+                (int64_t)(server.shutdown_mstime - commandTimeSnapshot()));
         }
 
         /* get all the listeners information */
author	Binbin <binloveplay1314@qq.com>	2022-10-09 13:18:34 +0800
committer	GitHub <noreply@github.com>	2022-10-09 08:18:34 +0300
commit	35b3fbd90c2ad2c503c9e3d28bfbffff13099925 (patch)
tree	8ca44f6572095f44f29fcddcefb8bdac3490b806 /src/server.c
parent	d2ad01ab3e24ca537efdd1fc6ff1ae2c657f4a51 (diff)
download	redis-35b3fbd90c2ad2c503c9e3d28bfbffff13099925.tar.gz