summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Bump vmactions/freebsd-vm from 0.2.0 to 0.2.3 (#11072)dependabot[bot]2022-08-021-3/+3
| | | | | | | | | | | | | | | | Bumps [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm) from 0.2.0 to 0.2.3. - [Release notes](https://github.com/vmactions/freebsd-vm/releases) - [Commits](https://github.com/vmactions/freebsd-vm/compare/v0.2.0...v0.2.3) --- updated-dependencies: - dependency-name: vmactions/freebsd-vm dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* Solve usleep compilation warning in keyspace_events.c (#11073)Binbin2022-08-021-0/+1
| | | | | | | | | | There is a -Wimplicit-function-declaration warning in here: ``` keyspace_events.c: In function ‘KeySpace_NotificationGeneric’: keyspace_events.c:67:9: warning: implicit declaration of function ‘usleep’; did you mean ‘sleep’? [-Wimplicit-function-declaration] 67 | usleep(1); | ^~~~~~ | sleep ```
* Fix wrong commands json docs for CLIENT KILL (#10970)Rudi Floren2022-08-012-70/+95
| | | | | | | The docs state that there is a new and an old argument format. The current state of the arguments allows mixing the old and new format, thus the need for two additional oneof blocks. One for differentiating the new from the old format and then one to allow setting multiple filters using the new format.
* Tests (cluster / sentinel): add --stop and--loop options (#11070)Binbin2022-08-011-0/+19
| | | | | --stop: Blocks once the first test fails. --loop: Execute the specified set of tests forever. It is useful when we debug some test failures.
* Optimization: moduleLoadString should try to create embedded string if not ↵Valentino Geron2022-07-311-3/+1
| | | | | | | | | | | | plain (#11050) Before this change, if the module has an embedded string, then uses RedisModule_SaveString and RedisModule_LoadString, the result would be a raw string instead of an embedded string. Now the `RDB_LOAD_ENC` flag to `moduleLoadString` only affects integer encoding, but not embedded strings (which still hold an sds in the robj ptr, so they're actually still raw strings for anyone who reads them). Co-authored-by: Valentino Geron <valentino@redis.com>
* tracking pending invalidation message of flushdb sent by (#11068)Huang Zhw2022-07-312-6/+12
| | | trackingHandlePendingKeyInvalidations should use proto.
* Avoid false positive out-of-bounds in writeForgottenNodePingExt (#11053)Binbin2022-07-281-2/+2
| | | | | | In clusterMsgPingExtForgottenNode, sizeof(name) is CLUSTER_NAMELEN, and sizeof(clusterMsgPingExtForgottenNode) is > CLUSTER_NAMELEN. Doing a (name + sizeof(clusterMsgPingExtForgottenNode)) sanitizer generates an out-of-bounds error which is a false positive in here
* Fix bgsaveerr issue in psync wrong offset test (#11043)Binbin2022-07-271-4/+0
| | | | | | | | | | | | | The kill above is sometimes successful and sometimes already too late. The PING in pysnc wrong offset test got rejected by bgsaveerr because lastbgsave_status is C_ERR. In theory, using diskless can avoid PING being affected, because when the replica is dropped, we will kill the child with SIGUSR1, and this will not affect lastbgsave_status. Anyway, this kill is not particularly needed here, dropping the kill is the best one, since we do have the waitForBgsave, so just let it take care of the bgsave. No need for fast termination.
* Adds RM_Microseconds and RM_CachedMicroseconds (#11016)guybe72022-07-273-2/+37
| | | | | | | | | | | | RM_Microseconds Return the wall-clock Unix time, in microseconds RM_CachedMicroseconds Returns a cached copy of the Unix time, in microseconds. It is updated in the server cron job and before executing a command. It is useful for complex call stacks, such as a command causing a key space notification, causing a module to execute a RedisModule_Call, causing another notification, etc. It makes sense that all these callbacks would use the same clock.
* Change the return value of rdbLoad function to enums (#11039)Binbin2022-07-265-6/+13
| | | | | | | | | | The reason we do this is because in #11036, we added error log message when failing to open RDB file for reading. In loadDdataFromDisk we call rdbLoad and also check errno, now the logging corrupts errno (reported in alpine daily). It is not safe to rely on errno as we do today, so we change the return value of rdbLoad function to enums, like we have when loading an AOF.
* When client tracking is on, invalidation message of flushdb in a (#11038)Huang Zhw2022-07-262-1/+20
| | | | | | When FLUSHDB / FLUSHALL / SWAPDB is inside MULTI / EXEC, the client side tracking invalidation message was interleaved with transaction response.
* Fix #11030, use lua_rawget to avoid triggering metatables and crash. (#11032)Meir Shpilraien (Spielrein)2022-07-262-10/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix #11030, use lua_rawget to avoid triggering metatables. #11030 shows how return `_G` from the Lua script (either function or eval), cause the Lua interpreter to Panic and the Redis processes to exit with error code 1. Though return `_G` only panic on Redis 7 and 6.2.7, the underline issue exists on older versions as well (6.0 and 6.2). The underline issue is returning a table with a metatable such that the metatable raises an error. The following example demonstrate the issue: ``` 127.0.0.1:6379> eval "local a = {}; setmetatable(a,{__index=function() foo() end}) return a" 0 Error: Server closed the connection ``` ``` PANIC: unprotected error in call to Lua API (user_script:1: Script attempted to access nonexistent global variable 'foo') ``` The Lua panic happened because when returning the result to the client, Redis needs to introspect the returning table and transform the table into a resp. In order to scan the table, Redis uses `lua_gettable` api which might trigger the metatable (if exists) and might raise an error. This code is not running inside `pcall` (Lua protected call), so raising an error causes the Lua to panic and exit. Notice that this is not a crash, its a Lua panic that exit with error code 1. Returning `_G` panics on Redis 7 and 6.2.7 because on those versions `_G` has a metatable that raises error when trying to fetch a none existing key. ### Solution Instead of using `lua_gettable` that might raise error and cause the issue, use `lua_rawget` that simply return the value from the table without triggering any metatable logic. This is promised not to raise and error. The downside of this solution is that it might be considered as breaking change, if someone rely on metatable in the returned value. An alternative solution is to wrap this entire logic with `pcall` (Lua protected call), this alternative require a much bigger refactoring. ### Back Porting The same fix will work on older versions as well (6.2, 6.0). Notice that on those version, the issue can cause Redis to crash if inside the metatable logic there is an attempt to accesses Redis (`redis.call`). On 7.0, there is not crash and the `redis.call` is executed as if it was done from inside the script itself. ### Tests Tests was added the verify the fix
* Gossip forgotten nodes on `CLUSTER FORGET` (#10869)Viktor Söderqvist2022-07-263-6/+79
| | | | | | | | Gossip the cluster node blacklist in ping and pong messages. This means that CLUSTER FORGET doesn't need to be sent to all nodes in a cluster. It can be sent to one or more nodes and then be propagated to the rest of them. For each blacklisted node, its node id and its remaining blacklist TTL is gossiped in a cluster bus ping extension (introduced in #9530).
* Add error log message when failing to open RDB file for reading (#11036)YaacovHazan2022-07-251-1/+5
| | | | When failing to open the rdb file, there was no specific error printed (unlike a corrupt file), so it was not clear what failed and why.
* fsync the old aof file when open a new INCR AOF (#11004)Binbin2022-07-254-23/+49
| | | | | | | | | | | | | | | | | | | | | In rewriteAppendOnlyFileBackground, after flushAppendOnlyFile(1), and before openNewIncrAofForAppend, we should call redis_fsync to fsync the aof file. Because we may open a new INCR AOF in openNewIncrAofForAppend, in the case of using everysec policy, the old AOF file may not be fsynced in time (or even at all). When using everysec, we don't want to pay the disk latency from the main thread, so we will do a background fsync. Adding a argument for bioCreateCloseJob, a `need_fsync` flag to indicate that a fsync is required before the file is closed. So we will fsync the old AOF file before we close it. A cleanup, we make union become a union, since the free_* args and the fd / fsync args are never used together. Co-authored-by: Oran Agra <oran@redislabs.com>
* Register abs-expire apis (#11025)chenyang80942022-07-241-0/+2
| | | | RM_SetAbsExpire and RM_GetAbsExpire were not actually operational since they were introduced, due to omission in API registration.
* fixed complexity of bzmpop and zmpop commands (#11026)Pavel Krush2022-07-243-4/+4
| | | | | Swap M and N in the complexity formula of [B]ZMPOP Co-authored-by: Pavel Krush <neon@pushwoosh.com>
* Don't update node ip when peer fd is closed (#10696)Tian2022-07-202-5/+19
|
* Adds LASTID to XCLAIM docs (#11017)guybe72022-07-202-0/+7
| | | It seems it was overlooked when we first created the json files
* Make cluster config file saving atomic and fsync acl (#10924)Tian2022-07-202-27/+60
| | | | | | | | | | As an outstanding part mentioned in #10737, we could just make the cluster config file and ACL file saving done with a more safe and atomic pattern (write to temp file, fsync, rename, fsync dir). The cluster config file uses an in-place overwrite and truncation (which was also used by the main config file before #7824). The ACL file is using the temp file and rename approach, but was missing an fsync. Co-authored-by: 朱天 <zhutian03@meituan.com>
* Fix EVALSHA_RO and EVAL_RO command json file (#11015)Wen Hui2022-07-193-4/+8
| | | | | these are missing from the RO_ commands, present in the other ones. Co-authored-by: Ubuntu <lucas.guang.yang1@huawei.com>
* CI: Update vmaction. (#11013)Yossi Gottlieb2022-07-191-6/+6
|
* Set RM_StringCompare input args as const (#11010)Binbin2022-07-193-6/+6
| | | | | | Following #10996, it forgot to modify RM_StringCompare in module.c Modified RM_StringCompare, compareStringObjectsWithFlags, compareStringObjects and collateStringObjects.
* Fix timing issue in cluster test (#11008)Binbin2022-07-182-7/+16
| | | | | | | | | | | | | | | | | | A timing issue like this was reported in freebsd daily CI: ``` *** [err]: Sanity test push cmd after resharding in tests/unit/cluster/cli.tcl Expected 'CLUSTERDOWN The cluster is down' to match '*MOVED*' ``` We additionally wait for each node to reach a consensus on the cluster state in wait_for_condition to avoid the cluster down error. The fix just like #10495, quoting madolson's comment: Cluster check just verifies the the config state is self-consistent, waiting for cluster_state to be okay is an independent check that all the nodes actually believe each other are healthy. At the same time i noticed that unit/moduleapi/cluster.tcl has an exact same test, may have the same problem, also modified it.
* Fix heap overflow corruption in XAUTOCLAIM (CVE-2022-31144) (#11002)Oran Agra2022-07-182-2/+20
| | | | | | | | | The temporary array for deleted entries reply of XAUTOCLAIM was insufficient, but also in fact the COUNT argument should be used to control the size of the reply, so instead of terminating the loop by only counting the claimed entries, we'll count deleted entries as well. Fix #10968 Addresses CVE-2022-31144
* Avoid using unsafe C functions (#10932)ranshid2022-07-1819-113/+227
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | replace use of: sprintf --> snprintf strcpy/strncpy --> redis_strlcpy strcat/strncat --> redis_strlcat **why are we making this change?** Much of the code uses some unsafe variants or deprecated buffer handling functions. While most cases are probably not presenting any issue on the known path programming errors and unterminated strings might lead to potential buffer overflows which are not covered by tests. **As part of this PR we change** 1. added implementation for redis_strlcpy and redis_strlcat based on the strl implementation: https://linux.die.net/man/3/strl 2. change all occurrences of use of sprintf with use of snprintf 3. change occurrences of use of strcpy/strncpy with redis_strlcpy 4. change occurrences of use of strcat/strncat with redis_strlcat 5. change the behavior of ll2string/ull2string/ld2string so that it will always place null termination ('\0') on the output buffer in the first index. this was done in order to make the use of these functions more safe in cases were the user will not check the output returned by them (for example in rdbRemoveTempFile) 6. we added a compiler directive to issue a deprecation error in case a use of sprintf/strcpy/strcat is found during compilation which will result in error during compile time. However keep in mind that since the deprecation attribute is not supported on all compilers, this is expected to fail during push workflows. **NOTE:** while this is only an initial milestone. We might also consider using the *_s implementation provided by the C11 Extensions (however not yet widly supported). I would also suggest to start looking at static code analyzers to track unsafe use cases. For example LLVM clang checker supports security.insecureAPI.DeprecatedOrUnsafeBufferHandling which can help locate unsafe function usage. https://clang.llvm.org/docs/analyzer/checkers.html#security-insecureapi-deprecatedorunsafebufferhandling-c The main reason not to onboard it at this stage is that the alternative excepted by clang is to use the C11 extensions which are not always supported by stdlib.
* remove boolean usage and use 0/1 instead (#10997)Valentino Geron2022-07-171-2/+2
| | | | | | If we do not use jemalloc (mostly with valgrind) and use an old compiler that does not support C11 we will get compilation error Co-authored-by: Valentino Geron <valentino@redis.com>
* Set RedisModule_StringCompare input args as const (#10996)Guy Korland2022-07-171-1/+1
|
* Fix cluster hostnames test causing failover while running valgrind (#10991)Madelyn Olson2022-07-171-0/+6
| | | | | | | In the newly added cluster hostnames test, the primary is failing over during the reboot for valgrind so we are validating the wrong node. This change just sets the replica to prevent taking over, which seems to fix the test. We could have also set the timeout higher, but it slows down the test.
* Add optional for FCALL and FCALL_RO command json file (#10988)Wen Hui2022-07-173-4/+8
| | | | | | | | | According to the Redis functions documentation, FCALL command format could be FCALL function_name numberOfKeys [key1, key2, key3.....] [arg1, arg2, arg3.....] So in the json file of fcall and fcall_ro, we should add optional for key and arg part. Just like EVAL... Co-authored-by: Binbin <binloveplay1314@qq.com>
* Add an option to specify multiple skip files using `--skipfile` (#10975)Valentino Geron2022-07-171-2/+2
| | | | | | `--skipfile` can be repeated. For example: ./runtests --skipfile file1.txt --skipfile file2.txt Co-authored-by: Valentino Geron <valentino@redis.com>
* Avoid valgrind fishy value warning on corrupt restore payloads (#10937)Oran Agra2022-07-132-6/+23
| | | | | | | | | | | | | | The corrupt dump fuzzer uncovered a valgrind warning saying: ``` ==76370== Argument 'size' of function malloc has a fishy (possibly negative) value: -3744781444216323815 ``` This allocation would have failed (returning NULL) and being handled properly by redis (even before this change), but we also want to silence the valgrind warnings (which are checking that casting to ssize_t produces a non-negative value). The solution i opted for is to explicitly fail these allocations (returning NULL), before even reaching `malloc` (which would have failed and return NULL too). The implication is that we will not be able to support a single allocation of more than 2GB on a 32bit system (which i don't think is a realistic scenario). i.e. i do think we could be facing cases were redis consumes more than 2gb on a 32bit system, but not in a single allocation. The byproduct of this, is that i dropped the overflow assertions, since these will now lead to the same OOM panic we have for failed allocations.
* Add range check for server port in redis-cli/benchmark (#9854)Binbin2022-07-122-0/+16
| | | Validating inputs ahead of time, to give the end user a slightly more useful error.
* Cluster test improvements (#10920)Madelyn Olson2022-07-1210-210/+235
| | | * Restructured testing to allow running cluster tests easily as part of the normal testing
* Normalize the style of help information with other documentation (#10965)Wen Hui2022-07-121-1/+1
|
* Simplify arithmetic expression on LP_REPLACE case in lpinsert (#6327)jimgreen20132022-07-121-2/+1
| | | Remove unnecessary variable name.
* Trying to fix cluster test (#10963)Binbin2022-07-111-0/+4
| | | | | | | | | #10942 break the new test added in #10449 ``` Testing unit: 29-slot-migration-response.tcl Cluster Join and auto-discovery test: FAILED: Cluster failed to join into a full mesh. ``` It looks like we need to wait for the cluster in 28 to become stable.
* update help.h (#10961)Oran Agra2022-07-111-4/+4
|
* Add cluster-port support to redis-cli --cluster (#10344)Binbin2022-07-113-56/+153
| | | | | | | | | | | | | | | In #9389, we add a new `cluster-port` config and make cluster bus port configurable, and currently redis-cli --cluster create/add-node doesn't support with a configurable `cluster-port` instance. Because redis-cli uses the old way (port + 10000) to send the `CLUSTER MEET` command. Now we add this support on redis-cli `--cluster`, note we don't need to explicitly pass in the `cluster-port` parameter, we can get the real `cluster-port` of the node in `clusterManagerNodeLoadInfo`, so the `--cluster create` and `--cluster add-node` interfaces have not changed. We will use the `cluster-port` when we are doing `CLUSTER MEET`, also note that `CLUSTER MEET` bus-port parameter was added in 4.0, so if the bus_port (the one in redis-cli) is 0, or equal (port + 10000), we just call `CLUSTER MEET` with 2 arguments, using the old form. Co-authored-by: Madelyn Olson <34459052+madolson@users.noreply.github.com>
* Add --check-system in redis-server usage (#10960)Binbin2022-07-111-1/+3
|
* Fix crash during handshake and cluster shards call (#10942)Madelyn Olson2022-07-102-3/+20
| | | * Fix an engine crash when there are nodes in handshaking and a user calls cluster shards
* Only print ACL syntax errors once and include command names in errors (#10922)Madelyn Olson2022-07-091-6/+16
| | | * Only print ACL syntax errors once and include command names in errors
* Fix some outdated comments and some typo (#10946)Binbin2022-07-066-17/+11
| | | * Fix some outdated comments and some typo
* Renaming conduct to code of conduct and contributing files to .md (#10941)adasarpan4042022-07-063-3/+3
| | | | | | CONTRIBUTING to get better formatting CONDUCT also because github doesn't seem recognize the code of conduct page Co-authored-by: Oran Agra <oran@redislabs.com>
* Add pubsubshard_channels field in INFO STATS (#10929)Binbin2022-07-062-1/+3
| | | | | | We already have `pubsub_channels` and `pubsub_patterns` in INFO stats, now add `pubsubshard_channels` (symmetry). Sharded pubsub was added in #8621
* TLS: Notify clients on connection shutdown. (#10931)Yossi Gottlieb2022-07-051-0/+2
| | | | | Use SSL_shutdown(), in a best-effort manner, when closing a TLS connection. This change better supports OpenSSL 3.x clients that will not silently ignore the socket-level EOF.
* Avoid double multiplication of alloc_count (#10934)Harkrishn Patro2022-07-041-1/+1
|
* Add tests for error messages during slot migrations (#10449)Wen Hui2022-07-041-0/+50
| | | | | | * Add tests for error messages during slot migrations Co-authored-by: Ubuntu <lucas.guang.yang1@huawei.com> Co-authored-by: Madelyn Olson <madelyneolson@gmail.com>
* Optimize number of realloc syscall during multi/exec flow (#10921)Harkrishn Patro2022-07-042-3/+14
| | | | | | | | | | | | | | | | | | | | | ## Issue During the MULTI/EXEC flow, each command gets queued until the `EXEC` command is received and during this phase on every command queue, a `realloc` is being invoked. This could be expensive based on the realloc behavior (if copy to a new memory location). ## Solution In order to reduce the no. of syscall, couple of optimization I've used. 1. By default, reserve memory for atleast two commands. `MULTI/EXEC` for a single command doesn't have any significance. Hence, I believe customer wouldn't use it. 2. For further reservation, increase the memory allocation in exponent growth (power of 2). This reduces the no. of `realloc` call from `N` to `log(N)` times. ## Other changes: * Include multi exec queued command array in client memory consumption calculation (affects client eviction too)
* Unlock cluster config file upon server shutdown. (#10912)Qu Chen2022-07-042-4/+11
| | | | | | | | | | | | | | Currently in cluster mode, Redis process locks the cluster config file when starting up and holds the lock for the entire lifetime of the process. When the server shuts down, it doesn't explicitly release the lock on the cluster config file. We noticed a problem with restart testing that if you shut down a very large redis-server process (i.e. with several hundred GB of data stored), it takes the OS a while to free the resources and unlock the cluster config file. So if we immediately try to restart the redis server process, it might fail to acquire the lock on the cluster config file and fail to come up. This fix explicitly releases the lock on the cluster config file upon a shutdown rather than relying on the OS to release the lock, which is a cleaner and safer approach to free up resources acquired.