diff options
author | Binbin <binloveplay1314@qq.com> | 2022-10-09 13:18:34 +0800 |
---|---|---|
committer | GitHub <noreply@github.com> | 2022-10-09 08:18:34 +0300 |
commit | 35b3fbd90c2ad2c503c9e3d28bfbffff13099925 (patch) | |
tree | 8ca44f6572095f44f29fcddcefb8bdac3490b806 /src/server.c | |
parent | d2ad01ab3e24ca537efdd1fc6ff1ae2c657f4a51 (diff) | |
download | redis-35b3fbd90c2ad2c503c9e3d28bfbffff13099925.tar.gz |
Freeze time sampling during command execution, and scripts (#10300)
Freeze time during execution of scripts and all other commands.
This means that a key is either expired or not, and doesn't change
state during a script execution. resolves #10182
This PR try to add a new `commandTimeSnapshot` function.
The function logic is extracted from `keyIsExpired`, but the related
calls to `fixed_time_expire` and `mstime()` are removed, see below.
In commands, we will avoid calling `mstime()` multiple times
and just use the one that sampled in call. The background is,
e.g. using `PEXPIRE 1` with valgrind sometimes result in the key
being deleted rather than expired. The reason is that both `PEXPIRE`
command and `checkAlreadyExpired` call `mstime()` separately.
There are other more important changes in this PR:
1. Eliminate `fixed_time_expire`, it is no longer needed.
When we want to sample time we should always use a time snapshot.
We will use `in_nested_call` instead to update the cached time in `call`.
2. Move the call for `updateCachedTime` from `serverCron` to `afterSleep`.
Now `commandTimeSnapshot` will always return the sample time, the
`lookupKeyReadWithFlags` call in `getNodeByQuery` will get a outdated
cached time (because `processCommand` is out of the `call` context).
We put the call to `updateCachedTime` in `aftersleep`.
3. Cache the time each time the module lock Redis.
Call `updateCachedTime` in `moduleGILAfterLock`, affecting `RM_ThreadSafeContextLock`
and `RM_ThreadSafeContextTryLock`
Currently the commandTimeSnapshot change affects the following TTL commands:
- SET EX / SET PX
- EXPIRE / PEXPIRE
- SETEX / PSETEX
- GETEX EX / GETEX PX
- TTL / PTTL
- EXPIRETIME / PEXPIRETIME
- RESTORE key TTL
And other commands just use the cached mstime (including TIME).
This is considered to be a breaking change since it can break a script
that uses a loop to wait for a key to expire.
Diffstat (limited to 'src/server.c')
-rw-r--r-- | src/server.c | 48 |
1 files changed, 33 insertions, 15 deletions
diff --git a/src/server.c b/src/server.c index cb1e0447a..617a5e3bc 100644 --- a/src/server.c +++ b/src/server.c @@ -201,6 +201,32 @@ mstime_t mstime(void) { return ustime()/1000; } +/* Return the command time snapshot in milliseconds. + * The time the command started is the logical time it runs, + * and all the time readings during the execution time should + * reflect the same time. + * More details can be found in the comments below. */ +mstime_t commandTimeSnapshot(void) { + /* If we are in the context of a Lua script, we pretend that time is + * blocked to when the Lua script started. This way a key can expire + * only the first time it is accessed and not in the middle of the + * script execution, making propagation to slaves / AOF consistent. + * See issue #1525 on Github for more information. */ + if (server.script_caller) { + return scriptTimeSnapshot(); + } + /* If we are in the middle of a command execution, we still want to use + * a reference time that does not change: in that case we just use the + * cached time, that we update before each call in the call() function. + * This way we avoid that commands such as RPOPLPUSH or similar, that + * may re-open the same key multiple times, can invalidate an already + * open object in a next call, if the next call will see the key expired, + * while the first did not. */ + else { + return server.mstime; + } +} + /* After an RDB dump or AOF rewrite we exit from children using _exit() instead of * exit(), because the latter may interact with the same file objects used by * the parent process. However if we are testing the coverage normal exit() is @@ -1171,9 +1197,6 @@ int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) { * handler if we don't return here fast enough. */ if (server.watchdog_period) watchdogScheduleSignal(server.watchdog_period); - /* Update the time cache. */ - updateCachedTime(1); - server.hz = server.config_hz; /* Adapt the server.hz value to the number of configured clients. If we have * many clients, we want to call serverCron() with an higher frequency. */ @@ -1656,6 +1679,9 @@ void beforeSleep(struct aeEventLoop *eventLoop) { void afterSleep(struct aeEventLoop *eventLoop) { UNUSED(eventLoop); + /* Update the time cache. */ + updateCachedTime(1); + /* Do NOT add anything above moduleAcquireGIL !!! */ /* Acquire the modules GIL so that their threads won't touch anything. */ @@ -2408,7 +2434,6 @@ void initServer(void) { server.main_thread_id = pthread_self(); server.current_client = NULL; server.errors = raxNew(); - server.fixed_time_expire = 0; server.in_nested_call = 0; server.clients = listCreate(); server.clients_index = raxNew(); @@ -3333,7 +3358,7 @@ void call(client *c, int flags) { /* Update cache time, in case we have nested calls we want to * update only on the first call*/ - if (server.fixed_time_expire++ == 0) { + if (server.in_nested_call++ == 0) { updateCachedTimeWithUs(0,call_timer); } @@ -3341,7 +3366,6 @@ void call(client *c, int flags) { if (monotonicGetType() == MONOTONIC_CLOCK_HW) monotonic_start = getMonotonicUs(); - server.in_nested_call++; c->cmd->proc(c); server.in_nested_call--; @@ -3487,7 +3511,6 @@ void call(client *c, int flags) { } } - server.fixed_time_expire--; server.stat_numcommands++; /* Record peak memory after each command and before the eviction that runs @@ -4348,14 +4371,9 @@ void echoCommand(client *c) { } void timeCommand(client *c) { - struct timeval tv; - - /* gettimeofday() can only fail if &tv is a bad address so we - * don't check for errors. */ - gettimeofday(&tv,NULL); addReplyArrayLen(c,2); - addReplyBulkLongLong(c,tv.tv_sec); - addReplyBulkLongLong(c,tv.tv_usec); + addReplyBulkLongLong(c, server.unixtime); + addReplyBulkLongLong(c, server.ustime-server.unixtime*1000000); } typedef struct replyFlagNames { @@ -5368,7 +5386,7 @@ sds genRedisInfoString(dict *section_dict, int all_sections, int everything) { if (isShutdownInitiated()) { info = sdscatfmt(info, "shutdown_in_milliseconds:%I\r\n", - (int64_t)(server.shutdown_mstime - server.mstime)); + (int64_t)(server.shutdown_mstime - commandTimeSnapshot())); } /* get all the listeners information */ |