| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Re-accepting backends was really common.
|
|
|
|
|
|
|
| |
New backend connections returned 'conntimeout' whether the TCP connection
timed out while being established or the connection died waiting for the
"version\r\n" response. Now a 'readvalidate' error is given if the
connection was already properly established.
|
|
|
|
|
|
|
|
|
|
|
| |
A long sleep in the unix socket startup code made backends hit the connection
timeout before they were configured.
Make all the proxy tests use the unix socket instead of listening on a
hardcoded port. The proxy code is completely equivalent from the client
standpoint.
This fix should also make the whole test suite run a bit faster.
|
|
|
|
|
| |
These functions landed on the wrong side of the "pool or routes" move commit
a while back. They are also lacking test coverage.
|
|
|
|
| |
If there was a numerical gap it would print junk.
|
|
|
|
|
|
| |
Apparently I don't typically run this one much. I think it should be
deprecated in favor of the newer style used in proxyunits.t etc., but that
needs to be a concerted effort.
|
|
|
|
|
| |
Allow using a pre-existing tarball instead of fetching from GitHub, and
also verify the SHA before using it.
|
|
|
|
|
|
|
|
|
|
|
| |
The connect timeout is supposed to apply to TCP connect() calls.
The "retry timeout" is a poorly named variable for how long to wait
before retrying a backend that has been marked bad.
The retry timeout was accidentally being applied to connect() calls made
while retrying broken connections.
This now uses the appropriate timeout in each case.
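For context, these timeouts are tuned where the backend is defined. A minimal sketch, assuming the table form of mcp.backend with illustrative option names (connecttimeout/retrytimeout/readtimeout are assumptions; check the proxy documentation for the real keys):

    -- hypothetical option names, shown only to illustrate which timeout applies where
    local b1 = mcp.backend({
        label = "b1", host = "127.0.0.1", port = 11212,
        connecttimeout = 1,   -- applied to the TCP connect() call
        retrytimeout   = 3,   -- how long to wait before retrying a backend marked bad
        readtimeout    = 0.5, -- waiting on a response once connected
    })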
|
|
|
|
|
|
|
|
|
|
|
|
| |
The connect timeout won't fire when a backend is blocked from connecting
in these tests; the proxy will connect, send a version command to validate,
then time out on the read.
With the read timeout set to 0.1s it would sometimes fail before the
restart finished, clogging log lines and causing test failures.
Now we wait for the watcher, remove a sleep, and use a longer read
timeout.
|
|
|
|
|
|
| |
The previous fix broke memory accounting by underflowing the counter
after an OOM. This intermittently broke the final test in t/proxyconfig.t.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bug introduced in 6c80728: use-after-free of the response buffer under
concurrency.
The await code wraps up a lua coroutine differently than a standard
response, so it was not managing the lifecycle of the response object
properly, causing data buffers to be reused before being written back to
the client.
This fix separates the accounting of memory from the freeing of the
buffer, so there is no longer a race.
Further restructuring is needed to make this less bug-prone and to keep
memory accounting in lock step with memory freeing.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for sys/auxv.h to avoid the following uclibc build failure on
aarch64:
crc32c.c:277:10: fatal error: sys/auxv.h: No such file or directory
277 | #include <sys/auxv.h>
| ^~~~~~~~~~~~
Fixes:
- http://autobuild.buildroot.org/results/08591fbf9677ff126492c50c15170c641bcab56a
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When mcp.pool() is called in its two-argument form, i.e. mcp.pool({b1,
b2}, { foo = bar }), backend objects would not be properly cached
internally, causing objects to leak.
Further, it was setting the objects into the cache table indexed by the
object itself, so they would not be cleaned up by garbage collection.
The bug was introduced as part of 6442017c (allow workers to run IO
optionally).
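For illustration, the two-argument form in a fuller sketch (addresses and the option shown are placeholders):

    local b1 = mcp.backend("b1", "127.0.0.1", 11212)
    local b2 = mcp.backend("b2", "127.0.0.1", 11213)
    -- backend list plus an options table; the backends here should be
    -- cached and reused across reloads rather than leaked
    local p  = mcp.pool({ b1, b2 }, { iothread = true })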
|
|
|
|
|
| |
A few code paths were returning SERVER_ERROR (a retryable error)
when they should have returned CLIENT_ERROR (bad protocol syntax).
|
|
|
|
|
|
|
| |
When failing to run mcp_config_routes on a worker thread,
we were eating the lua error message.
This is now consistent with the rest of the code.
|
|
|
|
| |
The rest seem honestly reasonable; no huge red flags anymore.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the event handler rewrite the IO thread scales much better (up to
8-12 worker threads), leaving the io_uring code in the dust.
Realistically io_uring won't be able to beat the event code if you're
using kernels older than 6.2, which is brand new. Instead of carrying
all this code around and having people randomly try it to get more
performance, I want to rip it out of the way and add it back in later
when it makes sense.
I am using mcshredder as a platform to learn and keep up to date with
io_uring, and will port over its usage pattern when it's time.
|
|
|
|
|
|
|
|
|
|
| |
Cleans up logic around response handling in general. Allows returning
server-sent error messages upstream for handling.
In general SERVER_ERROR means we can keep the connection to the backend.
The rest of the errors are protocol errors, and while some are perfectly
safe to whitelist, clients should not be causing those sorts of errors
and we should cycle the backend regardless.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a client sends multiple requests in the same packet, the proxy would
reverse the requests before sending them to the backend. They would still
return to the client in the correct order because top-level responses are
sent in the order they were created.
In practice I guess this is rarely noticed. If a client sends a series
of commands where the first one generates a syntax error, all prior
commands would still succeed.
It would also trip up people testing pipelined commands, as
read-your-write would fail when the write gets ordered after the read.
I did run into this before, but thought it was just the ascii multiget
code reversing keys, which would be harmless as the whole command has to
complete regardless of key order.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds:
mcp.active_req_limit(count)
mcp.buffer_memory_limit(kilobytes)
Each limit is divided by the number of worker threads to create a
per-worker-thread limit on the number of concurrent proxy requests and on
the bytes used specifically for values. This does not represent total
memory usage but will be close.
Buffer memory for inbound set requests is not accounted for until after
the object has been read from the socket; to be improved in a future
update. This should be fine unless clients send just the SET request and
then hang without sending further data.
Limits should be live-adjustable via configuration reloads.
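A configuration sketch (values are illustrative; with 4 worker threads the request limit below works out to 256 concurrent requests per worker):

    function mcp_config_pools()
        mcp.active_req_limit(1024)      -- total in-flight proxy requests
        mcp.buffer_memory_limit(65536)  -- value-buffer memory in kilobytes (~64MB)
        -- ... backend and pool setup continues as usual ...
    end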
|
|
|
|
|
|
|
|
|
|
| |
Also changes the way the global context and thread contexts are fetched
from lua; via the VM extra space instead of upvalues, which is a little
faster and more universal.
It was always erroneous to run a lot of the config functions from routes
and vice versa, but there was no consistent strictness so users could
get into trouble.
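A sketch of the split this enforces (the pool-call idiom in the route handler is the usual config pattern, assumed here for illustration):

    function mcp_config_pools()
        -- config context: constructing backends and pools belongs here
        local b1 = mcp.backend("b1", "127.0.0.1", 11212)
        return mcp.pool({ b1 })
    end

    function mcp_config_routes(p)
        mcp.attach(mcp.CMD_ANY_STORAGE, function(r)
            -- route context: calling config-only functions from here is
            -- now consistently rejected instead of loosely allowed
            return p(r)
        end)
    end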
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The client connection state machine loops through a few states when
handling pipelined requests.
To start:
conn_waiting -> conn_read -> conn_parse_cmd (execution)
After conn_parse_cmd, we can enter:
conn_nread (read a mutation payload from the network) -> conn_new_cmd
or directly: conn_new_cmd
conn_new_cmd checks the limit specified in -R, flushing the pipeline if
we exceed that limit. Otherwise it wraps back to conn_parse_cmd.
The proxy code was _not_ resetting state to conn_new_cmd after any
non-mutation command. If a value was set it would properly run through
nread -> conn_new_cmd.
This meant that clients issuing requests against a proxy server had
unlimited pipelines, and the proxy would buffer the entire result set
before beginning to return data to the client. Especially when requests
are for very large items, this can cause a very high Time To First Byte
in the response to the client.
|
|
|
|
|
| |
Use a specific error when timeouts happen during the connection stage vs.
the read/write stage. It even had a test!
|
|
|
|
|
|
|
| |
`watch deletions`: logs all keys which are deleted using either the `delete` or `md` command.
The log line contains the command used, the key, the clsid, and the size of the deleted item.
Items which result in a delete miss or are marked as stale do not show up in the logs.
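A minimal illustration over the text protocol (log output is elided; per the description above, each line carries the command used, the key, the clsid, and the item size):

    watch deletions
    OK
    ...deletion log lines stream here as items are deleted...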
|
|
|
|
|
|
|
|
|
| |
Sending the 's' flag to metaset now returns the size of the item stored.
Useful if you want to know how large an appended/prepended item now is.
If the 'N' flag is supplied while in append/prepend mode, it allows
autovivification (with the exptime supplied by N) for append/prepend style
keys that don't need headers created first.
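A hedged protocol sketch (key, sizes, and exptime are illustrative; assumes "foo" already holds a 5-byte value):

    ms foo 6 MA N60 s
    append
    HD s11

MA selects append mode, N60 would autovivify the key with a 60s exptime if it were missing, and the returned s flag reports the total stored size after the append.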
|
|
|
|
| |
Somehow missed from the earlier change marking dead backends.
|
| |
|
|
|
|
|
|
|
|
| |
We no longer have access to the client object there. I need to rewire
things and honestly I'm not sure if anyone even uses the traces anymore.
I will make a decision on deleting or updating them soon; if you read this
and care, please reach out.
|
|
|
|
|
|
| |
Hi,
This follows the point here: https://lists.debian.org/debian-qa/2023/02/msg00052.html
Thanks!
|
| |
|
|
|
|
| |
As documented in https://github.com/memcached/memcached/issues/932, the images do not build without this.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Disables -Werror by default and allows the user to conditionally enable it.
The change was originally made to avoid the following build failure:
In file included from hash.c:7:
xxhash.h:2667:5: error: #warning is a GCC extension [-Werror]
2667 | # warning "XXH3 is highly inefficient without ARM or Thumb-2."
| ^~~~~~~
xxhash.h:2667:5: error: #warning "XXH3 is highly inefficient without ARM or Thumb-2." [-Werror=cpp]
Fixes:
- http://autobuild.buildroot.org/results/3124bae73c207f1a118e57e41e222ef464ccb297
Signed-off-by: Fabrice Fontaine <fontaine.fabrice@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
- Fixes a refcount leak on sets
- Moves the response elapsed timer back closer to when the response was
  processed, so as not to clobber the wrong IO object's data
- Restores error messages from set/ms
- Adds the start of unit tests
Requests will look like they run a tiiiiny bit faster than they do, but
I need to get the elapsed time there for a later change.
|
|
|
|
|
|
|
| |
Only log until a backend is marked bad. The "bad" counter was previously
being ticked on every retry attempt as well.
Turned out to be trivial.
|
|
|
|
| |
Test FASTGOOD and some set scenarios.
|
|
|
|
|
|
|
|
| |
One of the side effects of pre-warming all of the tests I did with
multiget, and not having done a second round on the unit tests, is that
we somehow never tried an ascii multiget against a damn miss.
Easy to test, easy to fix.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "metadump" command was designed primarily for doing analysis on
what's in cache, but it's also used for pulling the data out for various
reasons.
The string format is a bit onerous: key=value (for futureproofing) and
URI encoded keys (which may or may not be binary internally)
This adds a command "mgdump", which dumps keys in the format:
"mg key\r\nmg key2\r\n"
if a key is binary encoded, it uses the meta binary encoding scheme of
base64-ing keys and appends a "b" flag:
"mg 44OG44K544OI b\r\n"
when the dump is complete it prints an "EN\r\n"
clients wishing to stream or fetch data can take the mg commands, strip
the \r\n, append any flags they care about, then send the command back
to the server to fetch the full key data.
This seems to use 30-40% less CPU time on the server for the same key
dumps.
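For illustration, round-tripping one dumped key (the 'v' flag and the 5-byte value are illustrative): the client takes the dumped "mg 44OG44K544OI b" line, strips the \r\n, appends whatever flags it wants, and sends it back.

    mg 44OG44K544OI b v
    VA 5
    hello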
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The LRU crawler in per-LRU mode only flushes the write buffer to the client
socket while not holding important locks. The hash table iterator
version of the crawler was accidentally holding an item lock while
flushing to the network. Item locks must NOT be held for long periods of
time, as they will cause the daemon to lag.
Originally the code used a circular buffer for writing to the
network; this allowed it to easily do partial write flushes to the
socket and continue filling the other half of the buffer.
Fixing this requires the buffer to be resizeable, so we instead use a
straight buffer allocation. The write buffer must be large enough to
handle all items within a hash table bucket.
Hash table buckets are _supposed_ to max out at an average depth of 1.5
items, so in theory it should never resize. However, it's possible to go
higher if a user clamps the hash table size, and there can also be
larger-than-average buckets naturally, due to the hash algorithm and luck.
|
|
|
|
|
|
|
|
| |
local res = mcp.internal(r) - takes a request object and executes it
against the proxy's internal cache instance.
Experimental as of this commit. Needs more test coverage and
benchmarking.
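A minimal route sketch using the new call (the attach pattern shown is the usual config idiom, assumed here for illustration):

    function mcp_config_routes()
        mcp.attach(mcp.CMD_ANY_STORAGE, function(r)
            -- serve every storage command from the proxy's built-in cache
            return mcp.internal(r)
        end)
    end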
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`mcp.pool(p, { dist = etc, iothread = true })`
By default the IO thread is not used; instead a backend connection is
created for each worker thread. This can be overridden by setting
`iothread = true` when creating a pool.
`mcp.pool(p, { dist = etc, beprefix = "etc" })`
If a `beprefix` is added to the pool arguments, it will create unique
backend connections for this pool. This allows you to create multiple
sockets per backend by making multiple pools with unique prefixes.
There are legitimate use cases for sharing backend connections across
different pools, which is why that is the default behavior.
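A short sketch of both options (backend address and labels are placeholders):

    local b1 = mcp.backend("b1", "127.0.0.1", 11212)
    -- default: each worker thread opens its own connection to b1
    local direct = mcp.pool({ b1 })
    -- opt back into routing through the shared IO thread for this pool
    local shared = mcp.pool({ b1 }, { iothread = true })
    -- create a separate, uniquely prefixed set of backend sockets for this pool
    local extra  = mcp.pool({ b1 }, { beprefix = "aux" })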
|