delta/redis.git - github.com: antirez/redis.git

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	Fix sentinel function that compares hostnames (if failed resolve) (#11419)	Moti Cohen	2022-10-23	1	-3/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Funcion sentinelAddrEqualsHostname() of sentinel makes DNS resolve and based on it determines if two IP addresses are equal. Now, If the DNS resolve command fails, the function simply returns 0, even if the hostnames are identical. This might become an issue in case of failover such that sentinel might receives from Redis instance, response to regular INFO query it sent, and wrongly decide that the instance is pointing to is different leader than the one recorded because of this function, yet hostnames are identical. In turn sentinel disconnects the connection between sentinel and valid slave which leads to -failover-abort-no-good-slave. See issue #11241. I managed to reproduce only part of the flow in which the function return wrong result and trigger +fix-slave-config. The fix is even if the function failed to resolve then compare based on hostnames. That is our best effort as long as the server is unavailable for some reason. It is fine since Redis instance cannot have multiple hostnames for a given setup
*	Fix wrong tips when the user pass wrong # of arguments to redis.set_repl(). ↵	qetu3790	2022-10-23	1	-1/+1
\| \| \| \| \|	(#11415) redis.set_repl() needs one arg, but the tips says two.
*	Make PFMERGE source key optional in docs, add tests with one input key, add ↵	Binbin	2022-10-22	3	-2/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	tests on missing source keys (#11205) The following two cases will create an empty destkey HLL: 1. called with no source keys, like `pfmerge destkey` 2. called with non-existing source keys, like `pfmerge destkey non-existing-source-key` In the first case, in `PFMERGE`, the dest key is actually one of the source keys too. So `PFMERGE k1 k2` is equivalent to `SUNIONSTORE k1 k1 k2`, and `PFMERGE k1` is equivalent to `SUNIONSTORE k1 k1`. So the first case is reasonable, the source key is actually optional. And the second case, `PFMERGE` on missing keys should succeed and create an empty dest. This is consistent with `PFCOUNT`, and also with `SUNIONSTORE`, no need to change.
*	Fix crash due to to reuse iterator entry after list deletion in module (#11383)	sundb	2022-10-22	1	-13/+22
\| \| \| \| \| \| \| \| \| \| \|	In the module, we will reuse the list iterator entry for RM_ListDelete, but `listTypeDelete` will only update `quicklistEntry->zi` but not `quicklistEntry->node`, which will result in `quicklistEntry->node` pointing to a freed memory address if the quicklist node is deleted. This PR sync `key->u.list.index` and `key->u.list.entry` to list iterator after `RM_ListDelete`. This PR also optimizes the release code of the original list iterator. Co-authored-by: Viktor Söderqvist <viktor@zuiderkwast.se>
*	fix the size of variable merge_sz in quicklist.c (#11285)	FutabaRio	2022-10-19	1	-3/+3
\| \| \| \|	11 was the size of header/trailer in the old structure Ziplist, but now the size of header/trailer in the new structure Listpack should be 7.
*	Blocked module clients should be aware when a key is deleted (#11310)	guybe7	2022-10-18	9	-128/+174
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The use case is a module that wants to implement a blocking command on a key that necessarily exists and wants to unblock the client in case the key is deleted (much like what we implemented for XREADGROUP in #10306) New module API: * RedisModule_BlockClientOnKeysWithFlags Flags: * REDISMODULE_BLOCK_UNBLOCK_NONE * REDISMODULE_BLOCK_UNBLOCK_DELETED ### Detailed description of code changes blocked.c: 1. Both module and stream functions are called whether the key exists or not, regardless of its type. We do that in order to allow modules/stream to unblock the client in case the key is no longer present or has changed type (the behavior for streams didn't change, just code that moved into serveClientsBlockedOnStreamKey) 2. Make sure afterCommand is called in serveClientsBlockedOnKeyByModule, in order to propagate actions from moduleTryServeClientBlockedOnKey. 3. handleClientsBlockedOnKeys: call propagatePendingCommands directly after lookupKeyReadWithFlags to prevent a possible lazy-expire DEL from being mixed with any command propagated by the preceding functions. 4. blockForKeys: Caller can specifiy that it wants to be awakened if key is deleted. Minor optimizations (use dictAddRaw). 5. signalKeyAsReady became signalKeyAsReadyLogic which can take a boolean in case the key is deleted. It will only signal if there's at least one client that awaits key deletion (to save calls to handleClientsBlockedOnKeys). Minor optimizations (use dictAddRaw) db.c: 1. scanDatabaseForDeletedStreams is now scanDatabaseForDeletedKeys and will signalKeyAsReady for any key that was removed from the database or changed type. It is the responsibility of the code in blocked.c to ignore or act on deleted/type-changed keys. 2. Use the new signalDeletedKeyAsReady where needed blockedonkey.c + tcl: 1. Added test of new capabilities (FSL.BPOPGT now requires the key to exist in order to work)
*	Avoid saving module aux on RDB if no aux data was saved by the module. (#11374)	Meir Shpilraien (Spielrein)	2022-10-18	5	-16/+71
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	### Background The issue is that when saving an RDB with module AUX data, the module AUX metadata (moduleid, when, ...) is saved to the RDB even though the module did not saved any actual data. This prevent loading the RDB in the absence of the module (although there is no actual data in the RDB that requires the module to be loaded). ### Solution The solution suggested in this PR is that module AUX will be saved on the RDB only if the module actually saved something during `aux_save` function. To support backward compatibility, we introduce `aux_save2` callback that acts the same as `aux_save` with the tiny change of avoid saving the aux field if no data was actually saved by the module. Modules can use the new API to make sure that if they have no data to save, then it will be possible to load the created RDB even without the module. ### Concerns A module may register for the aux load and save hooks just in order to be notified when saving or loading starts or completed (there are better ways to do that, but it still possible that someone used it). However, if a module didn't save a single field in the save callback, it means it's not allowed to read in the read callback, since it has no way to distinguish between empty and non-empty payloads. furthermore, it means that if the module did that, it must never change it, since it'll break compatibility with it's old RDB files, so this is really not a valid use case. Since some modules (ones who currently save one field indicating an empty payload), need to know if saving an empty payload is valid, and if Redis is gonna ignore an empty payload or store it, we opted to add a new API (rather than change behavior of an existing API and expect modules to check the redis version) ### Technical Details To avoid saving AUX data on RDB, we change the code to first save the AUX metadata (moduleid, when, ...) into a temporary buffer. The buffer is then flushed to the rio at the first time the module makes a write operation inside the `aux_save` function. If the module saves nothing (and `aux_save2` was used), the entire temporary buffer is simply dropped and no data about this AUX field is saved to the RDB. This make it possible to load the RDB even in the absence of the module. Test was added to verify the fix.
*	keyIsExpired checks server.loading before calling getExpire (#11393)	Shuning	2022-10-18	1	-3/+3
\| \| \| \| \| \|	Seems excessive to call getExpire if we don't need it. This can maybe have some speedup on AOF file loading (saving a dictFind call) Co-authored-by: lvshuning <lvshuning@meituan.com>
*	fix malloc macro in listpack.c (#11398)	DarrenJiang13	2022-10-18	1	-5/+5
\| \| \| \|	fix some malloc macros in `listpack.c`. listpack has it's own malloc aliases, but in some places normal redis malloc calls have slipped in.
*	Bump codespell from 2.2.1 to 2.2.2 in /.codespell (#11399)	Binbin	2022-10-18	1	-1/+1
\| \| \| \|	And fix a few newly detected typo. Closes #11394
*	Unify ACL failure error messaging. (#11160)	Shaya Potter	2022-10-16	5	-49/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Motivation: for applications that use RM ACL verification functions, they would want to return errors back to the user, in ways that are consistent with Redis. While investigating how we should return ACL errors to the user, we realized that Redis isn't consistent, and currently returns ACL error strings in 3 primary ways. [For the actual implications of this change, see the "Impact" section at the bottom] 1. how it returns an error when calling a command normally ACL_DENIED_CMD -> "this user has no permissions to run the '%s' command" ACL_DENIED_KEY -> "this user has no permissions to access one of the keys used as arguments" ACL_DENIED_CHANNEL -> "this user has no permissions to access one of the channels used as arguments" 2. how it returns an error when calling via 'acl dryrun' command ACL_DENIED_CMD -> "This user has no permissions to run the '%s' command" ACL_DENIED_KEY -> "This user has no permissions to access the '%s' key" ACL_DENIED_CHANNEL -> "This user has no permissions to access the '%s' channel" 3. how it returns an error via RM_Call (and scripting is similar). ACL_DENIED_CMD -> "can't run this command or subcommand"; ACL_DENIED_KEY -> "can't access at least one of the keys mentioned in the command arguments"; ACL_DENIED_CHANNEL -> "can't publish to the channel mentioned in the command"; In addition, if one wants to use RM_Call's "dry run" capability instead of the RM ACL functions directly, one also sees a different problem than it returns ACL errors with a -ERR, not a -PERM, so it can't be returned directly to the caller. This PR modifies the code to generate a base message in a common manner with the ability to set verbose flag for acl dry run errors, and keep it unset for normal/rm_call/script cases ```c sds getAclErrorMessage(int acl_res, user user, struct redisCommand cmd, sds errored_val, int verbose) { switch (acl_res) { case ACL_DENIED_CMD: return sdscatfmt(sdsempty(), "User %S has no permissions to run " "the '%S' command", user->name, cmd->fullname); case ACL_DENIED_KEY: if (verbose) { return sdscatfmt(sdsempty(), "User %S has no permissions to access " "the '%S' key", user->name, errored_val); } else { return sdsnew("No permissions to access a key"); } case ACL_DENIED_CHANNEL: if (verbose) { return sdscatfmt(sdsempty(), "User %S has no permissions to access " "the '%S' channel", user->name, errored_val); } else { return sdsnew("No permissions to access a channel"); } } ``` The caller can append/prepend the message (adding NOPERM for normal/RM_Call or indicating it's within a script). Impact: - Plain commands, as well as scripts and RM_Call now include the user name. - ACL DRYRUN remains the only one that's verbose (mentions the offending channel or key name) - Changes RM_Call ACL errors from being a `-ERR` to being `-NOPERM` (besides for textual changes) This somewhat a breaking change, but it only affects the RM_Call with both `C` and `E`, or `D` - Changes ACL errors in scripts textually from being `The user executing the script <old non unified text>` to `ACL failure in script: <new unified text>`
*	Fix wrong replication on cluster slotmap changes with module KSN propagation ↵	Meir Shpilraien (Spielrein)	2022-10-16	1	-1/+9
\| \| \| \| \| \| \| \|	(#11377) As discussed on #11084, `propagatePendingCommands` should happened after the del notification is fired so that the notification effect and the `del` will be replicated inside MULTI EXEC. Test was added to verify the fix.
*	Fixes build warning when CACHE_LINE_SIZE is already defined. (#11389)	David CARLIER	2022-10-16	1	-0/+6
\| \| \| \| \| \| \|	* Fixes build warning when CACHE_LINE_SIZE is already defined * Fixes wrong CACHE_LINE_SIZE on some FreeBSD systems where it could be set to 128 (e.g. on MIPS) * Fixes wrong CACHE_LINE_SIZE on Apple M1 (use 128 instead of 64) Wrong cache line size in that case can some false sharing of array elements between threads, see #10892
*	optimizing d2string() and addReplyDouble() with grisu2: double to string ↵	filipe oliveira	2022-10-15	8	-19/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	conversion based on Florian Loitsch's Grisu-algorithm (#10587) All commands / use cases that heavily rely on double to a string representation conversion, (e.g. meaning take a double-precision floating-point number like 1.5 and return a string like "1.5" ), could benefit from a performance boost by swapping snprintf(buf,len,"%.17g",value) by the equivalent [fpconv_dtoa](https://github.com/night-shift/fpconv) or any other algorithm that ensures 100% coverage of conversion. This is a well-studied topic and Projects like MongoDB. RedPanda, PyTorch leverage libraries ( fmtlib ) that use the optimized double to string conversion underneath. The positive impact can be substantial. This PR uses the grisu2 approach ( grisu explained on https://www.cs.tufts.edu/~nr/cs257/archive/florian-loitsch/printf.pdf section 5 ). test suite changes: Despite being compatible, in some cases it produces a different result from printf, and some tests had to be adjusted. one case is that `%.17g` (which means %e or %f which ever is shorter), chose to use `5000000000` instead of 5e+9, which sounds like a bug? In other cases, we changed TCL to compare numbers instead of strings to ignore minor rounding issues (`expr 0.8 == 0.79999999999999999`)
*	MIGTATE with AUTH that contains "keys" is getting wrong key names in ↵	C Charles	2022-10-13	1	-6/+26
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	migrateGetKeys, leads to ACL errors (#11253) When using the MIGRATE, with a destination Redis that has the user name or password set to the string "keys", Redis would have determine the wrong set of key names the command is gonna access. This lead to ACL returning wrong authentication result. Destination instance: ``` 127.0.0.1:6380> acl setuser default >keys OK 127.0.0.1:6380> acl setuser keys on nopass ~* &* +@all OK ``` Source instance: ``` 127.0.0.1:6379> set a 123 OK 127.0.0.1:6379> acl setuser cc on nopass ~a* +@all OK 127.0.0.1:6379> auth cc 1 OK 127.0.0.1:6379> migrate 127.0.0.1 6380 "" 0 1000 auth keys keys a (error) NOPERM this user has no permissions to access one of the keys used as arguments 127.0.0.1:6379> migrate 127.0.0.1 6380 "" 0 1000 auth2 keys pswd keys a (error) NOPERM this user has no permissions to access one of the keys used as arguments ``` Using `acl dryrun` we know that the parameters of `auth` and `auth2` are mistaken for the `keys` option. ``` 127.0.0.1:6379> acl dryrun cc migrate whatever whatever "" 0 1000 auth keys keys a "This user has no permissions to access the 'keys' key" 127.0.0.1:6379> acl dryrun cc migrate whatever whatever "" 0 1000 auth2 keys pswd keys a "This user has no permissions to access the 'pswd' key" ``` Fix the bug by editing db.c/migrateGetKeys function, which finds the `keys` option and all the keys following.
*	Improve linux overcommit check and warning (#11357)	Oran Agra	2022-10-13	1	-2/+5
\| \| \| \| \| \|	1. show the overcommit warning when overcommit is disabled (2), not just when it is set to heuristic (0). 2. improve warning text to mention the issue with jemalloc causing VM mapping fragmentation when set to 2.
*	Fix crash on RM_Call inside module load (#11346)	Meir Shpilraien (Spielrein)	2022-10-12	3	-18/+25
\| \| \| \| \| \| \| \| \| \| \|	PR #9320 introduces initialization order changes. Now cluster is initialized after modules. This changes causes a crash if the module uses RM_Call inside the load function on cluster mode (the code will try to access `server.cluster` which at this point is NULL). To solve it, separate cluster initialization into 2 phases: 1. Structure initialization that happened before the modules initialization 2. Listener initialization that happened after. Test was added to verify the fix.
*	Fix typo in scan commands complexity statement (#11370)	Nikita Tolkachev	2022-10-11	4	-6/+6
\| \| \|	remove double period at the end of sentence.
*	Fix TIME command microseconds overflow under 32-bits (#11368)	Binbin	2022-10-09	1	-1/+1
\| \| \| \|	The old `server.unixtime*1000000` will overflow in 32-bits. This was introduced in #10300 (not released).
*	Fix redis-benchmark hang when it fails to connect to redis (#11366)	yancz2000	2022-10-09	1	-1/+1
\| \| \| \| \|	Forgot to start redis-server when testing performance. When opening the benchmark for testing, it will always be stuck, and the process cpu will reach 100%.
*	Freeze time sampling during command execution, and scripts (#10300)	Binbin	2022-10-09	12	-72/+61
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Freeze time during execution of scripts and all other commands. This means that a key is either expired or not, and doesn't change state during a script execution. resolves #10182 This PR try to add a new `commandTimeSnapshot` function. The function logic is extracted from `keyIsExpired`, but the related calls to `fixed_time_expire` and `mstime()` are removed, see below. In commands, we will avoid calling `mstime()` multiple times and just use the one that sampled in call. The background is, e.g. using `PEXPIRE 1` with valgrind sometimes result in the key being deleted rather than expired. The reason is that both `PEXPIRE` command and `checkAlreadyExpired` call `mstime()` separately. There are other more important changes in this PR: 1. Eliminate `fixed_time_expire`, it is no longer needed. When we want to sample time we should always use a time snapshot. We will use `in_nested_call` instead to update the cached time in `call`. 2. Move the call for `updateCachedTime` from `serverCron` to `afterSleep`. Now `commandTimeSnapshot` will always return the sample time, the `lookupKeyReadWithFlags` call in `getNodeByQuery` will get a outdated cached time (because `processCommand` is out of the `call` context). We put the call to `updateCachedTime` in `aftersleep`. 3. Cache the time each time the module lock Redis. Call `updateCachedTime` in `moduleGILAfterLock`, affecting `RM_ThreadSafeContextLock` and `RM_ThreadSafeContextTryLock` Currently the commandTimeSnapshot change affects the following TTL commands: - SET EX / SET PX - EXPIRE / PEXPIRE - SETEX / PSETEX - GETEX EX / GETEX PX - TTL / PTTL - EXPIRETIME / PEXPIRETIME - RESTORE key TTL And other commands just use the cached mstime (including TIME). This is considered to be a breaking change since it can break a script that uses a loop to wait for a key to expire.
*	`RedisModule_ResetDataset` should not clear the functions. (#11268)	Meir Shpilraien (Spielrein)	2022-10-09	1	-1/+1
\| \| \| \| \|	As mentioned on docs, `RM_ResetDataset` Performs similar operation to FLUSHALL. As FLUSHALL do not clean the function, `RM_ResetDataset` should not clean the functions as well.
*	fix arm build warning due to new compiler optimizations (#11362)	Oran Agra	2022-10-07	2	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Build fails with warnings in ARM CI after adding more aggressive optimizations (#11350) probably a result of more aggressive inlining ``` ziplist.c: In function ‘pop.constprop’: ziplist.c:1770:13: error: ‘vlong’ may be used uninitialized in this function [-Werror=maybe-uninitialized] printf("%lld", vlong); ^~~~~~~~~~~~~~~~~~~~~ ``` ``` listpack.c: In function ‘lpInsert.constprop’: listpack.c:406:9: error: argument 2 null where non-null expected [-Werror=nonnull] memcpy(buf+1,s,len); ^~~~~~~~~~~~~~~~~~~ ```
*	Added authentication failure and access denied metrics (#11288)	aradz44	2022-10-07	3	-0/+49
\| \| \|	Added authentication failure and access denied metrics
*	Improve BLMPOP/BZMPOP/WAIT timeout overflow handling and error messages (#11338)	Moti Cohen	2022-10-06	1	-2/+13
\| \| \| \| \| \| \| \| \| \| \|	Refine getTimeoutFromObjectOrReply() out-of-range check. Timeout is parsed (and verifies out of range) as double and multiplied by 1000, added mstime() and stored in long-long which might lead to out-of-range value of long-long. Co-authored-by: moticless <moticless@github.com> Co-authored-by: Oran Agra <oran@redislabs.com> Co-authored-by: Ozan Tezcan <ozantezcan@gmail.com>
*	Pass -flto flag to the linker (#11350)	Ozan Tezcan	2022-10-06	3	-5/+9
\| \| \| \| \| \| \|	Currently, we add -flto to the compile flags only. We are supposed to add it to the linker flags as well. Clang build fails because of this. Added a change to add -flto to REDIS_CFLAGS and REDIS_LDFLAGS if the build optimization flag is -O3. (noopt build will not use -flto)
*	Stabilize cluster hostnames tests (#11307)	Madelyn Olson	2022-10-03	3	-1/+5
\| \| \| \| \| \| \| \|	This PR introduces a couple of changes to improve cluster test stability: 1. Increase the cluster node timeout to 3 seconds, which is similar to the normal cluster tests, but introduce a new mechanism to increase the ping period so that the tests are still fast. This new config is a debug config. 2. Set `cluster-replica-no-failover yes` on a wider array of tests which are sensitive to failovers. This was occurring on the ARM CI.
*	register debug support on illumos/solaris. (#11335)	David CARLIER	2022-10-02	3	-1/+37
\|
*	Change compiler optimizations to -O3 -flto (#11207)	Maria Markova	2022-10-02	3	-3/+5
\| \| \| \| \| \| \| \| \| \|	Optimization update from -O2 to -O3 -flto gives up to 5% performance gain in 'redis-benchmarks-spec-client-runner' tests geomean where GCC 9.4.0 is used for build * Fix for false-positive warning in bitops.c Warning appeared with O3, on CentOS during inlininig procedure * Fixed unitialized streamID within streamTrim() (#1) Co-authored-by: filipe oliveira <filipecosta.90@gmail.com>
*	code, typo and comment cleanups (#11280)	Binbin	2022-10-02	7	-9/+8
\| \| \| \| \| \| \| \| \|	- fix `the the` typo - `LPOPRPUSH` does not exist, should be `RPOPLPUSH` - `CLUSTER GETKEYINSLOT` 's time complexity should be O(N) - `there bytes` should be `three bytes`, this closes #11266 - `slave` word to `replica` in log, modified the front and missed the back - remove useless aofReadDiffFromParent in server.h - `trackingHandlePendingKeyInvalidations` method adds a void parameter
*	fix some commands json file (#11201)	Huang Zhw	2022-10-02	6	-12/+99
\| \| \| \| \| \|	- BITOP: turn argument `operation` from string to oneof - CLIENT KILL: turn argument `skipme` from string to oneof - COMMAND GETKEYS / GETKEYSANDFLAGS: change arguments to optional, and change arity to -3 (fixes regression in redis 7.0) - CLIENT PAUSE: this command was added in v3.0.0
*	Update CLUSTER NODES help message (#11341)	Binbin	2022-09-30	1	-1/+1
\| \| \|	We will always show the bus-port, and if the node hostname exists, it will also show it.
*	Add redis-cli hints to ACL DRYRUN, COMMAND GETKEYS, COMMAND GETKEYSANDFLAGS ↵	Huang Zhw	2022-09-29	1	-2/+12
\| \| \| \| \| \| \| \| \| \|	(#11232) Better redis-cli hints for commands that take other commands as arguments. ``` command getkeysandflags hello [protover [AUTH username password]] acl dryrun user hello [protover [AUTH username password]] ```
*	Update sentinel-config.json to consistent with Config Get and Set operation ↵	Wen Hui	2022-09-29	2	-6/+3
\| \| \| \| \| \|	(#11334) The sentinel CONFIG GET command doesn't support multiple arguments, but the json file did. remove that.
*	Avoid crash on crash report when a bad function pointer was called (#11298)	Meir Shpilraien (Spielrein)	2022-09-29	1	-22/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If Redis crashes due to calling an invalid function pointer, the `backtrace` function will try to dereference this invalid pointer which will cause a crash inside the crash report and will kill the processes without having all the crash report information. Example: ``` === REDIS BUG REPORT START: Cut & paste starting from here === 198672:M 19 Sep 2022 18:06:12.936 # Redis 255.255.255 crashed by signal: 11, si_code: 1 198672:M 19 Sep 2022 18:06:12.936 # Accessing address: 0x1 198672:M 19 Sep 2022 18:06:12.936 # Crashed running the instruction at: 0x1 // here the processes is crashing ``` This PR tries to fix this crash be: 1. Identify the issue when it happened. 2. Replace the invalid pointer with a pointer to some dummy function so that `backtrace` will not crash. I identification is done by comparing `eip` to `info->si_addr`, if they are the same we know that the crash happened on the same address it tries to accesses and we can conclude that it tries to call and invalid function pointer. To replace the invalid pointer we introduce a new function, `setMcontextEip`, which is very similar to `getMcontextEip` and it knows to set the Eip for the different supported OS's. After printing the trace we retrieve the old `Eip` value.
*	Fix the missing server.dirty increment and redundant signalModifiedKey in ↵	sundb	2022-09-28	2	-38/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	serveClientBlockedOnList (#11326) Mainly fix two minor bug 1. When handle BL*POP/BLMOVE commands with blocked clients, we should increment server.dirty. 2. `listPopRangeAndReplyWithKey()` in `serveClientBlockedOnList()` should not repeat calling `signalModifiedKey()` which has been called when an element was pushed on the list. (was skipped in all bpop commands, other than blmpop) Other optimization add `signal` param for `listElementsRemoved` to skip `signalModifiedKey()` to unify all pop operation. Unifying all pop operations also prepares us for #11303, so that we can avoid having to deal with the conversion from quicklist to listpack() in various places when the list shrinks.
*	RM_CreateCommand should not set CMD_KEY_VARIABLE_FLAGS automatically (#11320)	guybe7	2022-09-28	2	-10/+4
\| \| \| \| \| \| \| \| \| \| \|	The original idea behind auto-setting the default (first,last,step) spec was to use the most "open" flags when the user didn't provide any key-spec flags information. While the above idea is a good approach, it really makes no sense to set CMD_KEY_VARIABLE_FLAGS if the user didn't provide the getkeys-api flag: in this case there's not way to retrieve these variable flags, so what's the point? Internally in redis there was code to ignore this already, so this fix doesn't change redis's behavior, it only affects the output of COMMAND command.
*	fix: redis-cli --memkeys-samples add check lastarg (#11269)	Phoeniwx	2022-09-28	1	-1/+1
\| \| \| \| \|	doing redis-cli --memkeys-samples without any additional arguments would have lead to a crash of the cli.
*	Remove redundant arity checks in XINFO (#11331)	guybe7	2022-09-28	1	-8/+0
\| \| \| \|	The arity in the JSON files of the subcommands reneder this code unreachable
*	Fixing compilation by removing flock() when compiling on Solaris (#11327)	Steffen Moser	2022-09-27	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	SunOS/Solaris and its relatives don't support the flock() function. While "redis" has been excluding setting up the lock using flock() on the cluster configuration file when compiling under Solaris, it was still using flock() in the unlock call while shutting down. This pull request eliminates the flock() call also in the unlocking stage for Oracle Solaris and its relatives. Fix compilation regression from #10912
*	Ignore RM_Call deny-oom flag if maxmemory is zero (#11319)	Ozan Tezcan	2022-09-26	1	-1/+1
\| \| \| \| \| \| \| \| \|	If a command gets an OOM response and then if we set maxmemory to zero to disable the limit, server.pre_command_oom_state never gets updated and it stays true. As RM_Call() calls with "respect deny-oom" flag checks server.pre_command_oom_state, all calls will fail with OOM. Added server.maxmemory check in RM_Call() to process deny-oom flag only if maxmemory is configured.
*	Fix CLUSTER SHARDS showing empty hostname (#11297)	Binbin	2022-09-22	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	* Fix CLUSTER SHARDS showing empty hostname In #10290, we changed clusterNode hostname from `char*` to `sds`, and the old `node->hostname` was changed to `sdslen(node->hostname)!=0`. But in `addNodeDetailsToShardReply` it is missing. It results in the return of an empty string hostname in CLUSTER SHARDS command if it unavailable. Like this (note that we listed it as optional in the doc): ``` 9) "hostname" 10) "" ```
*	Add RM_SetContextUser to support acl validation in RM_Call (and scripts) ↵	Shaya Potter	2022-09-22	6	-74/+222
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	(#10966) Adds a number of user management/ACL validaiton/command execution functions to improve a Redis module's ability to enforce ACLs correctly and easily. * RM_SetContextUser - sets a RedisModuleUser on the context, which RM_Call will use to both validate ACLs (if requested and set) as well as assign to the client so that scripts executed via RM_Call will have proper ACL validation. * RM_SetModuleUserACLString - Enables one to pass an entire ACL string, not just a single OP and have it applied to the user * RM_GetModuleUserACLString - returns a stringified version of the user's ACL (same format as dump and list). Contains an optimization to cache the stringified version until the underlying ACL is modified. * Slightly re-purpose the "C" flag to RM_Call from just being about ACL check before calling the command, to actually running the command with the right user, so that it also affects commands inside EVAL scripts. see #11231
*	Fix heap overflow vulnerability in XAUTOCLAIM (CVE-2022-35951) (#11301)	Oran Agra	2022-09-22	1	-3/+10
\| \| \| \| \| \|	Executing an XAUTOCLAIM command on a stream key in a specific state, with a specially crafted COUNT argument may cause an integer overflow, a subsequent heap overflow, and potentially lead to remote code execution. The problem affects Redis versions 7.0.0 or newer.
*	Replica that asks for rdb only should be closed right after the rdb part ↵	Valentino Geron	2022-09-22	2	-14/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	(#11296) The bug is that the the server keeps on sending newlines to the client. As a result, the receiver might not find the EOF marker since it searches for it only on the end of each payload it reads from the socket. The but only affects `CLIENT_REPL_RDBONLY`. This affects `redis-cli --rdb` (depending on timing) The fixed consist of two steps: 1. The `CLIENT_REPL_RDBONLY` should be closed ASAP (we cannot always call to `freeClient` so we use `freeClientAsync`) 2. Add new replication state `SLAVE_STATE_RDB_TRANSMITTED`
*	ACL default newly created user set USER_FLAG_SANITIZE_PAYLOAD flag (#11279)	Binbin	2022-09-22	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Starting from 6.2, after ACL SETUSER user reset, the user will carry the sanitize-payload flag. It was added in #7807, and then ACL SETUSER reset is inconsistent with default newly created user which missing sanitize-payload flag. Same as `off` and `on` these two bits are mutually exclusive, the default created user needs to have sanitize-payload flag. Adds USER_FLAG_SANITIZE_PAYLOAD flag to ACLCreateUser. Note that the bug don't have any real implications, since the code in rdb.c (rdbLoadObject) checks for `USER_FLAG_SANITIZE_PAYLOAD_SKIP`, so the fact that `USER_FLAG_SANITIZE_PAYLOAD` is missing doesn't really matters. Added tests to make sure it won't be broken in the future, and updated the comment in ACLSetUser and redis.conf
*	Fix missing sections for INFO ALL with module (#11291)	Shay Fadida	2022-09-21	1	-3/+7
\| \| \| \| \| \| \| \|	When using `INFO ALL <section>`, when `section` is a specific module section. Redis will not print the additional section(s). The fix in this case, will search the modules info sections if the user provided additional sections to `ALL`. Co-authored-by: Oran Agra <oran@redislabs.com>
*	Fix Invalid node address specified in redis-cli --cluster create/add-node ↵	Binbin	2022-09-19	1	-3/+3
\| \| \| \| \| \| \| \| \|	(#11151) This bug was introduced in #10344 (7.0.3), and it breaks the redis-cli --cluster create usage in #10436 (7.0 RC3). At the same time, the cluster-port support introduced in #10344 cannot use the DNS lookup brought by #10436.
*	Fix crash due to delete entry from compress quicklistNode and wrongly split ↵	sundb	2022-09-19	3	-3/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	quicklistNode (#11242) This PR mainly deals with 2 crashes introduced in #9357, and fix the QUICKLIST-PACKED-THRESHOLD mess in external test mode. 1. Fix crash due to deleting an entry from a compress quicklistNode When inserting a large element, we need to create a new quicklistNode first, and then delete its previous element, if the node where the deleted element is located is compressed, it will cause a crash. Now add `dont_compress` to quicklistNode, if we want to use a quicklistNode after some operation, we can use this flag like following: ```c node->dont_compress = 1; /* Prevent to be compressed / some_operation(node); / This operation might try to compress this node / some_other_operation(node); / We can use this node without decompress it / node->dont_compress = 0; / Re-able compression */ quicklistCompressNode(node); ``` Perhaps in the future, we could just disable the current entry from being compressed during the iterator loop, but that would require more work. 2. Fix crash due to wrongly split quicklist before #9357, the offset param of _quicklistSplitNode() will not negative. For now, when offset is negative, the split extent will be wrong. following example: ```c int orig_start = after ? offset + 1 : 0; int orig_extent = after ? -1 : offset; int new_start = after ? 0 : offset; int new_extent = after ? offset + 1 : -1; # offset: -2, after: 1, node->count: 2 # current wrong range: [-1,-1] [0,-1] # correct range: [1,-1] [0, 1] ``` Because only `_quicklistInsert()` splits the quicklistNode and only `quicklistInsertAfter()`, `quicklistInsertBefore()` call _quicklistInsert(), so `quicklistReplaceEntry()` and `listTypeInsert()` might occur this crash. But the iterator of `listTypeInsert()` is alway from head to tail(iter->offset is always positive), so it is not affected. The final conclusion is this crash only occur when we insert a large element with negative index into a list, that affects `LSET` command and `RM_ListSet` module api. 3. In external test mode, we need to restore quicklist packed threshold after when the end of test. 4. Show `node->count` in quicklistRepr(). 5. Add new tcl proc `config_get_set` to support restoring config in tests.
*	fix infinite sleep in performEvictions when have lazyfree jobs (#11237)	zhaozhao.zz	2022-09-18	1	-3/+9
\| \| \| \| \| \| \| \|	This bug is introduced in #7653. (Redis 6.2.0) When `server.maxmemory_eviction_tenacity` is 100, `eviction_time_limit_us` is `ULONG_MAX`, and if we cannot find the best key to delete (e.g. maxmemory-policy is `volatile-lru` and all keys with ttl have been evicted), in `cant_free` redis will sleep forever if some items are being freed in the lazyfree thread.