* proxy: send CLIENT_ERROR when necessary [staging] (dormando, 2023-04-03, 4 files, -18/+24)
  A few code paths were returning SERVER_ERROR (a retryable error) when they should have been CLIENT_ERROR (bad protocol syntax).
* proxy: print lua error on reload failure (dormando, 2023-04-03, 1 file, -1/+1)
  When failing to run mcp_config_routes on a worker thread, we were eating the lua error message. This is now consistent with the rest of the code.
* proxy: some TODO/FIXME updates (dormando, 2023-03-26, 2 files, -16/+3)
  rest seem honestly reasonable. no huge red flags anymore.
* proxy: rip out io_uring code (dormando, 2023-03-26, 4 files, -663/+3)
  with the event handler rewrite the IO thread scales much better (up to 8-12 worker threads), leaving the io_uring code in the dust. realistically io_uring won't be able to beat the event code if you're using kernels older than 6.2, which is brand new.
  Instead of carrying all this code around and having people randomly try it to get more performance, I want to rip it out of the way and add it back in later when it makes sense.
  I am using mcshredder as a platform to learn and keep up to date with io_uring, and will port over its usage pattern when it's time.
* proxy: overhaul backend error handling (dormando, 2023-03-26, 6 files, -56/+211)
  Cleans up logic around response handling in general. Allows returning server-sent error messages upstream for handling.
  In general SERVER_ERROR means we can keep the connection to the backend. The rest of the errors are protocol errors, and while some are perfectly safe to whitelist, clients should not be causing those sorts of errors and we should cycle the backend regardless.
* mcmc: upstream updates (dormando, 2023-03-26, 2 files, -3/+55)
* proxy: fix reversal of pipelined backend queries (dormando, 2023-03-26, 2 files, -13/+10)
  If a client sends multiple requests in the same packet, the proxy would reverse the requests before sending them to the backend. They would return to the client in the correct order because top level responses are sent in the order they were created.
  In practice I guess this is rarely noticed. If a client sends a series of commands where the first one generates a syntax error, all prior commands would still succeed. It would also trip people up when testing pipelined commands, as read-your-write would fail because the write gets ordered after the read.
  Did run into this before, but I thought it was just the ascii multiget code reversing keys, which would be harmless as the whole command has to complete regardless of key order.
* proxy: add request and buffer memory limits (dormando, 2023-03-26, 10 files, -20/+400)
  Adds:
      mcp.active_req_limit(count)
      mcp.buffer_memory_limit(kilobytes)
  Divides by the number of worker threads and creates a per-worker-thread limit for the number of concurrent proxy requests, and how many bytes used specifically for value bytes. This does not represent total memory usage but will be close.
  Buffer memory for inbound set requests is not accounted for until after the object has been read from the socket; to be improved in a future update. This should be fine unless clients send just the SET request and then hang without sending further data.
  Limits should be live-adjustable via configuration reloads.
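  A minimal illustration of calling these two limits from the proxy's Lua configuration; the values below are arbitrary assumptions, not taken from the commit, and each limit is divided across worker threads internally:
      mcp.active_req_limit(8192)        -- cap on concurrent proxied requests
      mcp.buffer_memory_limit(131072)   -- cap on value buffer memory, in kilobytes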
* proxy: restrict functions for lua config vs route (dormando, 2023-03-26, 8 files, -81/+94)
  Also changes the way the global context and thread contexts are fetched from lua; via the VM extra space instead of upvalues, which is a little faster and more universal.
  It was always erroneous to run a lot of the config functions from routes and vice versa, but there was no consistent strictness so users could get into trouble.
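  For context, a hedged sketch of the config/route split this strictness applies to; exactly which functions are restricted to each side is an assumption here, and the backend address is illustrative:
      -- config side: construct backends and pools
      function mcp_config_pools(old)
          local b1 = mcp.backend("b1", "127.0.0.1", 11211)
          return mcp.pool({b1})
      end
      -- worker/route side: attach routing logic only; no pool/backend creation
      function mcp_config_routes(pool)
          mcp.attach(mcp.CMD_ANY_STORAGE, function(r)
              return pool(r)
          end)
      end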
* proxy: fix bug ignoring -R setting for proxy reqs (dormando, 2023-03-24, 1 file, -0/+2)
  The client connection state machine loops through a few states when handling pipelined requests. To start:
      conn_waiting -> conn_read -> conn_parse_cmd (execution)
  After conn_parse_cmd, we can enter:
      conn_nread (read a mutation payload from the network) -> conn_new_cmd
  or directly:
      conn_new_cmd
  conn_new_cmd checks the limit specified in -R, flushing the pipeline if we exceed that limit. Else it wraps back to conn_parse_cmd.
  The proxy code was _not_ resetting state to conn_new_cmd after any non-mutation command. If a value was set it would properly run through nread -> conn_new_cmd.
  This means that clients issuing requests against a proxy server have unlimited pipelines, and the proxy will buffer the entire result set before beginning to return data to the client. Especially if requests are for very large items, this can cause a very high Time To First Byte in the response to the client.
* proxy: add conntimeout error (dormando, 2023-03-21, 2 files, -2/+4)
  use a specific error when timeouts happen during connection stage vs read/write stage. it even had a test!
* log: Add a new watcher to watch for deletions. (Hemal Shah, 2023-03-15, 6 files, -3/+102)
  `watch deletions`: would log all keys which are deleted using either the `delete` or `md` command. The log line would contain the command used, the key, the clsid and size of the deleted item. Items which result in a delete miss or are marked as stale wouldn't show up in the logs.
* meta: N flag changes append/prepend. ms s flag. (dormando, 2023-03-11, 9 files, -23/+128)
  Sending 's' flag to metaset now returns the size of the item stored. Useful if you want to know how large an append/prepended item now is.
  If the 'N' flag is supplied while in append/prepend mode, allows autovivifying (with exptime supplied from N) for append/prepend style keys that don't need headers created first.
* proxy: repair t/proxyconfig.t (dormando, 2023-03-09, 1 file, -0/+1)
  somehow missed from the earlier change for marking dead backends.
* core: fix another dtrace compilation issue [1.6.19] (dormando, 2023-03-08, 3 files, -6/+4)
* core: disable some dtraces (dormando, 2023-03-08, 2 files, -3/+3)
  No longer have access to the client object there. I need to rewire things and honestly not sure if anyone even uses the traces anymore. Will make a decision on deleting or updating them soon; if you read this and care please reach out.
* replace 2&>1 by 2>&1 (Patrice Duroux, 2023-03-08, 1 file, -1/+1)
  Hi, this follows the point here: https://lists.debian.org/debian-qa/2023/02/msg00052.html Thanks!
* log: fix race condition while incrementing log entries dropped (Ramasai, 2023-03-08, 1 file, -1/+1)
* Add new pkg-config dependencies to dockerfiles (Olof Nord, 2023-03-08, 4 files, -5/+5)
  As is documented in https://github.com/memcached/memcached/issues/932, the images do not build without this.
* remove unnecessary HAVE_UNISTD_H check (xuesenliang, 2023-03-08, 1 file, -6/+0)
* rm_lru_maintainer_initialized (xuesenliang, 2023-03-08, 3 files, -10/+0)
* bugfix: size.c: struct size error (xuesenliang, 2023-03-08, 1 file, -1/+1)
* declare item_lock_hashpower as a static variable (xuesenliang, 2023-03-08, 2 files, -2/+1)
* remove useless function declaration do_assoc_move_next_bucket() (xuesenliang, 2023-03-08, 1 file, -1/+3)
* fix a few uninitialized data issues. (David Carlier, 2023-03-08, 2 files, -1/+3)
* Document missing flags of Meta Arithmetic (Mate Borcsok, 2023-03-08, 1 file, -0/+2)
* configure.ac: add --enable-werror (Fabrice Fontaine, 2023-03-08, 1 file, -2/+9)
  Disables -Werror by default. Allows the user to conditionally enable -Werror.
  The change was originally made to avoid the following build failure:
      In file included from hash.c:7:
      xxhash.h:2667:5: error: #warning is a GCC extension [-Werror]
       2667 | # warning "XXH3 is highly inefficient without ARM or Thumb-2."
            |   ^~~~~~~
      xxhash.h:2667:5: error: #warning "XXH3 is highly inefficient without ARM or Thumb-2." [-Werror=cpp]
  Fixes:
  - http://autobuild.buildroot.org/results/3124bae73c207f1a118e57e41e222ef464ccb297
  Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
* proxy: mcp.internal fixes and tests (dormando, 2023-03-06, 5 files, -8/+252)
  - Refcount leak on sets
  - Move the response elapsed timer back closer to when the response was processed as to not clobber the wrong IO object data
  - Restores error messages from set/ms
  - Adds start of unit tests
  Requests will look like they run a tiiiiny bit faster than they do, but I need to get the elapsed time there for a later change.
* proxy: reduce noise for dead backends (dormando, 2023-03-06, 1 file, -14/+11)
  only log until a backend is marked bad. was previously ticking the "bad" counter on every retry attempt as well. turned out to be trivial.
* proxy: more await unit tests (dormando, 2023-03-01, 2 files, -1/+113)
  test FASTGOOD and some set scenarios
* proxy: fix trailingdata error with ascii multiget (dormando, 2023-02-28, 2 files, -0/+20)
  One of the side effects of pre-warming all of the tests I did with multiget, and not having done a second round on the unit tests, is that we somehow never tried an ascii multiget against a damn miss. Easy to test, easy to fix.
* crawler: mgdump documentation (dormando, 2023-02-27, 1 file, -0/+25)
* crawler: add "lru_crawler mgdump" commanddormando2023-02-273-4/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | The "metadump" command was designed primarily for doing analysis on what's in cache, but it's also used for pulling the data out for various reasons. The string format is a bit onerous: key=value (for futureproofing) and URI encoded keys (which may or may not be binary internally) This adds a command "mgdump", which dumps keys in the format: "mg key\r\nmg key2\r\n" if a key is binary encoded, it uses the meta binary encoding scheme of base64-ing keys and appends a "b" flag: "mg 44OG44K544OI b\r\n" when the dump is complete it prints an "EN\r\n" clients wishing to stream or fetch data can take the mg commands, strip the \r\n, append any flags they care about, then send the command back to the server to fetch the full key data. This seems to use 30-40% less CPU time on the server for the same key dumps.
* crawler: don't hold lock while writing to network (dormando, 2023-02-27, 1 file, -77/+103)
  LRU crawler in per-LRU mode only flushes the write buffer to the client socket while not holding important locks. The hash table iterator version of the crawler was accidentally holding an item lock while flushing to the network. Item locks must NOT be held for long periods of time as they will cause the daemon to lag.
  Originally the code used a circular buffer for writing to the network; this allowed it to easily do partial write flushes to the socket and continue filling the other half of the buffer. Fixing this requires the buffer be resizeable, so we instead use a straight buffer allocation.
  The write buffer must be large enough to handle all items within a hash table bucket. Hash table buckets are _supposed_ to max out at an average depth of 1.5 items, so in theory it should never resize. However it's possible to go higher if a user clamps the hash table size. There could also be larger than average buckets naturally due to the hash algorithm and luck.
* proxy: add mcp.internal(r) API (dormando, 2023-02-25, 5 files, -14/+1807)
  local res = mcp.internal(r) - takes a request object and executes it against the proxy's internal cache instance.
  Experimental as of this commit. Needs more test coverage and benchmarking.
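  A hedged sketch of what a route using the internal cache could look like; the hit-check-then-fallback shape and the use of res:hit() are illustrative assumptions, not taken from the commit:
      function mcp_config_routes(pool)
          mcp.attach(mcp.CMD_MG, function(r)
              local res = mcp.internal(r)   -- serve from the proxy's own cache first
              if res:hit() then
                  return res
              end
              return pool(r)                -- otherwise forward to the backend pool
          end)
      end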
* proxy: allow workers to run IO optionally (dormando, 2023-02-24, 12 files, -174/+470)
  `mcp.pool(p, { dist = etc, iothread = true })`
  By default the IO thread is not used; instead a backend connection is created for each worker thread. This can be overridden by setting `iothread = true` when creating a pool.
  `mcp.pool(p, { dist = etc, beprefix = "etc" })`
  If a `beprefix` is added to pool arguments, it will create unique backend connections for this pool. This allows you to create multiple sockets per backend by making multiple pools with unique prefixes.
  There are legitimate use cases for sharing backend connections across different pools, which is why that is the default behavior.
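  A brief sketch combining the two options described above; the backend definitions, addresses, and prefix value are assumptions for illustration:
      local b1 = mcp.backend("b1", "127.0.0.1", 11211)
      local b2 = mcp.backend("b2", "127.0.0.1", 11212)
      -- route this pool's traffic through the shared IO thread
      local p_shared = mcp.pool({b1, b2}, { iothread = true })
      -- give this pool its own dedicated backend sockets
      local p_own = mcp.pool({b1, b2}, { beprefix = "own" })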
* proxy: redo libevent handling code (dormando, 2023-02-22, 4 files, -85/+157)
  The event handling code was unoptimized and temporary; it was slated for a rewrite for performance and non-critical bugs alone. However the old code may be causing critical bugs so it's being rewritten now.
  Fixes:
  - backend disconnects are detected immediately instead of on the next time they are used.
  - backend reconnects happen _after_ the retry timeout, not before
  - use a persistent read handler and a temporary write handler to avoid constantly calling epoll_ctl syscalls, for a potential performance boost.
  Updated some tests for proxyconfig.t as it was picking up the disconnects immediately. This is unrelated to a timing issue I resolved in the benchmark.
* proxy: fix "missingend" error on reading responsesdormando2023-02-172-1/+56
| | | | | | | | | | | | | | If the backend handler reads an incomplete response from the network, it changes state to wait for more data. The want_read state was considering the data completed if "data read" was bigger than "value length", but it should have been "value + result line". This means if the response buffer landed in a bullseye where it has read more than the size of the value but less than the total size of the request (typically a span of 200 bytes or less), it would consider the request complete and look for the END\r\n marker. This change has been... here forever.
* proxy: add read buffer data to backend errors (dormando, 2023-02-15, 4 files, -4/+98)
  Errors like "trailing data" or "missingend" or etc are only useful if you're in a debugger and can break and inspect. This adds the detail, uriencoded, to the log message when applicable.
* proxy: fix partial responses on backend timeouts (dormando, 2023-02-14, 3 files, -9/+26)
  Response object error conditions were not being checked before looking at the response buffer. If a response was partially filled and then the backend timed out, a partial response could be sent instead of the proper backend error.
* proxy: fix write flushing bugs (dormando, 2023-02-08, 1 file, -8/+11)
  1) the event flags were not being used from the result of the flush_pending_write() function, causing it to not listen on WRITE events in some cases.
  2) be->can_write flag was not being set on short writes; old iterations of code would loop-write until EAGAIN but this does not. This prevented the code from listening for WRITE events.
  3) if many short writes are happening, a successful read event was overwriting the WRITE event with a READ event, instead of READ|WRITE. fixed via the be->can_write flag fix from 2).
* proxy: disallow overriding mn command (dormando, 2023-02-01, 1 file, -0/+5)
  When using CMD_ANY_STORAGE to enable the proxy, overriding MN causes the command to no longer work as intended; the proxy eats the command and does not flush the client response pipeline.
* proxy: add mcp.backend(t) for more overrides (dormando, 2023-02-01, 6 files, -98/+186)
  ie:
      local b1 = mcp.backend({ label = "b1", host = "127.0.0.1", port = 11511,
                               connecttimeout = 1, retrytimeout = 0.5,
                               readtimeout = 0.1, failurelimit = 11 })
  ... to allow for overriding connect/retry/etc tunables on a per-backend basis. If not passed in, the global settings are used.
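  For comparison, a hypothetical pool mixing the tuned backend above with one that simply inherits the global settings; b2 and the pool call here are illustrative, not from the commit:
      local b2 = mcp.backend({ label = "b2", host = "127.0.0.1", port = 11512 })
      local p = mcp.pool({ b1, b2 })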
* tests: timedrun SIGHUP pass-thru (dormando, 2023-02-01, 1 file, -0/+10)
  timedrun would attempt to exit after passing along any type of signal to the child. allow SIGHUP to pass through and continue waiting.
* proxy: add mcp.await_logerrors() (dormando, 2023-01-27, 5 files, -2/+89)
  Logs any backgrounded requests that resulted in an error.
  Note that this may be a temporary interface, and could be deprecated in the future.
* proxy: new integration tests. (dormando, 2023-01-25, 5 files, -0/+933)
  uses mocked backend servers so we can test:
  - end to end client to backend proxying
  - lua API functions
  - configuration reload
  - various error conditions
* proxy: fix mismatched responses after bad write (dormando, 2023-01-22, 1 file, -12/+18)
  Regression from the IO thread performance fix (again...) back in early december.
  Was getting corrupt backends if IO's were flushed in a very specific way, which would give bad data to clients. Once traffic stops the backends would timeout (waiting for responses that were never coming) and reset themselves.
  The optimization added was to "fast skip" IO's that were already flushed to the network by tracking a pointer into the list of IO's.
  The bug requires a series of events:
  1) the "prep write command" function notes a pointer into the top of the backend IO stack.
  2) a write to the backend socket resulting in an EAGAIN (no bytes written, try again later).
  3) reads then complete from the backend, changing the list of IO objects.
  4) "prep write command" tries again from a now invalid backend object.
  The fix:
  1) only set the offset pointer _post flush_ to the last specifically non-flushed IO object, so if the list changes it should always be behind the IO pointer.
  2) the IO pointer is nulled out immediately if flushing is complete.
  Took staring at it for a long time to understand this. I've rewritten this change once. I will split the stacks for "to-write queue" and "to-read queue" soon. That should be safer.
* proxy: fix stats deadlock caused by await code (dormando, 2023-01-20, 3 files, -19/+19)
  - specifically the WSTAT_DECR in proxy_await.c's return code could potentially use the wrong thread's lock
  This is why I've been swapping c with thread as lock/function arguments all over the code lately; it's very accident prone. Am reasonably sure this causes the deadlock but need to attempt to verify more.
* proxy: clean logic around lua yielding (dormando, 2023-01-12, 4 files, -47/+46)
  We were duck typing the response code for a coroutine yield before. It would also pile random logic for overriding IO's in certain cases. This now makes everything explicit and more clear.
* core: remove *c from some response code (dormando, 2023-01-12, 2 files, -33/+53)
  Allow freeing client response objects without the client object. Clean some confusing logic around clearing memory. Also exposes an interface for allocating unlinked response objects.