path: root/memcached.h
Commit message (author, age; files, lines)
* proxy: add request and buffer memory limits (dormando, 2023-03-26; 1 file, -0/+4)
|
| Adds:
|   mcp.active_req_limit(count)
|   mcp.buffer_memory_limit(kilobytes)
|
| Divides by the number of worker threads and creates a per-worker-thread
| limit for the number of concurrent proxy requests, and how many bytes
| used specifically for value bytes. This does not represent total memory
| usage but will be close.
|
| Buffer memory for inbound set requests is not accounted for until after
| the object has been read from the socket; to be improved in a future
| update. This should be fine unless clients send just the SET request and
| then hang without sending further data.
|
| Limits should be live-adjustable via configuration reloads.
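The per-worker split described above can be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: the function names are hypothetical, integer division rounding down is an assumption, and one kilobyte is assumed to be 1024 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a global limit evenly across worker threads.
 * Integer division rounds down; the commit message does not state how the
 * real code rounds, so this is an assumption. */
uint64_t per_thread_limit(uint64_t global_limit, unsigned int nthreads) {
    assert(nthreads > 0);
    return global_limit / nthreads;
}

/* The buffer limit is configured in kilobytes; convert to bytes first.
 * 1 KB = 1024 bytes is assumed here. */
uint64_t per_thread_buffer_bytes(uint64_t limit_kilobytes, unsigned int nthreads) {
    return per_thread_limit(limit_kilobytes * 1024, nthreads);
}
```

With 4 worker threads, a global request limit of 4000 would give each worker 1000 concurrent requests under these assumptions.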
* proxy: restrict functions for lua config vs route (dormando, 2023-03-26; 1 file, -1/+2)
|
| Also changes the way the global context and thread contexts are fetched
| from lua; via the VM extra space instead of upvalues, which is a little
| faster and more universal.
|
| It was always erroneous to run a lot of the config functions from routes
| and vice versa, but there was no consistent strictness so users could get
| into trouble.
* log: Add a new watcher to watch for deletions. (Hemal Shah, 2023-03-15; 1 file, -0/+3)
|
| `watch deletions` logs all keys which are deleted using either the
| `delete` or the `md` command. The log line contains the command used, the
| key, the clsid and the size of the deleted item. Items which result in a
| delete miss or are marked as stale do not show up in the logs.
* meta: N flag changes append/prepend. ms s flag. (dormando, 2023-03-11; 1 file, -2/+4)
|
| Sending 's' flag to metaset now returns the size of the item stored.
| Useful if you want to know how large an append/prepended item now is.
|
| If the 'N' flag is supplied while in append/prepend mode, allows
| autovivifying (with exptime supplied from N) for append/prepend style
| keys that don't need headers created first.
* remove unnecessary HAVE_UNISTD_H check (xuesenliang, 2023-03-08; 1 file, -6/+0)
* crawler: add "lru_crawler mgdump" command (dormando, 2023-02-27; 1 file, -1/+1)
|
| The "metadump" command was designed primarily for doing analysis on
| what's in cache, but it's also used for pulling the data out for various
| reasons. The string format is a bit onerous: key=value (for
| futureproofing) and URI encoded keys (which may or may not be binary
| internally).
|
| This adds a command "mgdump", which dumps keys in the format:
|   "mg key\r\nmg key2\r\n"
|
| If a key is binary encoded, it uses the meta binary encoding scheme of
| base64-ing keys and appends a "b" flag:
|   "mg 44OG44K544OI b\r\n"
|
| When the dump is complete it prints an "EN\r\n".
|
| Clients wishing to stream or fetch data can take the mg commands, strip
| the \r\n, append any flags they care about, then send the command back to
| the server to fetch the full key data.
|
| This seems to use 30-40% less CPU time on the server for the same key
| dumps.
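The client-side step described above (strip the `\r\n`, append flags, send back) might look like the sketch below. The function name `mgdump_line_to_request` is hypothetical and not part of memcached; the flags passed in are whatever meta-get flags the client cares about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Turn one mgdump line ("mg <key>[ b]\r\n") into a meta-get request with
 * extra flags appended. Returns the number of bytes written (excluding the
 * terminating NUL), or -1 if the line is not an "mg" line, lacks the
 * trailing "\r\n", or does not fit in `out`. Hypothetical helper. */
int mgdump_line_to_request(const char *line, const char *flags,
                           char *out, size_t outlen) {
    assert(line != NULL && out != NULL);
    size_t len = strlen(line);

    /* Require and strip the trailing "\r\n". */
    if (len < 2 || line[len - 2] != '\r' || line[len - 1] != '\n') return -1;
    len -= 2;

    /* Anything that is not an "mg" line (for example the final "EN") is
     * not a request and is rejected. */
    if (strncmp(line, "mg ", 3) != 0) return -1;

    int n;
    if (flags != NULL && flags[0] != '\0')
        n = snprintf(out, outlen, "%.*s %s\r\n", (int)len, line, flags);
    else
        n = snprintf(out, outlen, "%.*s\r\n", (int)len, line);

    if (n < 0 || (size_t)n >= outlen) return -1;
    return n;
}
```

For the binary-key line from the commit message, appending the flags `v t` would produce `mg 44OG44K544OI b v t\r\n`, which the client can then write to the server socket.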
* proxy: allow workers to run IO optionally (dormando, 2023-02-24; 1 file, -0/+2)
|
| `mcp.pool(p, { dist = etc, iothread = true })`
|
| By default the IO thread is not used; instead a backend connection is
| created for each worker thread. This can be overridden by setting
| `iothread = true` when creating a pool.
|
| `mcp.pool(p, { dist = etc, beprefix = "etc" })`
|
| If a `beprefix` is added to pool arguments, it will create unique backend
| connections for this pool. This allows you to create multiple sockets per
| backend by making multiple pools with unique prefixes.
|
| There are legitimate use cases for sharing backend connections across
| different pools, which is why that is the default behavior.
* core: remove *c from some response code (dormando, 2023-01-12; 1 file, -2/+4)
|
| Allow freeing client response objects without the client object.
| Clean some confusing logic around clearing memory.
| Also exposes an interface for allocating unlinked response objects.
* core: simplify background IO API (dormando, 2023-01-11; 1 file, -19/+9)
|
| - removes unused "completed" IO callback handler
| - moves primary post-IO callback handlers from the queue definition to
|   the actual IO objects.
| - allows IO object callbacks to be handled generically instead of based
|   on the queue they were submitted from.
* core: remove *conn object from cache commands (dormando, 2023-01-11; 1 file, -10/+14)
|
| We want to start using cache commands in contexts without a client
| connection, but the client object has always been passed to all
| functions.
|
| In most cases we only need the worker thread (LIBEVENT_THREAD *t), so
| this change adjusts the arguments passed in.
* proxy: add proxy_await_active stat (dormando, 2022-12-01; 1 file, -1/+2)
|
| proxy_req_active shows the number of active proxy requests, but if those
| proxy requests make sub-requests via mcp.await() they are not accounted
| for.
|
| This gives the number of awaits active, but not the total number of
| in-flight async requests.
* core: give threads unique names (dormando, 2022-11-01; 1 file, -0/+1)
|
| Allows users to differentiate thread functions externally to memcached.
| Useful for setting priorities or pinning threads to CPUs.
* meta: remove "meta_response_old" start option (dormando, 2022-10-20; 1 file, -1/+0)
|
| was a temporary hidden option for a few early adopters to use while
| migrating from OK to HD status codes.
* extstore: make defaults more aggressive (dormando, 2022-08-25; 1 file, -2/+2)
|
| extstore has a background thread which examines slab classes for items to
| flush to disk. The thresholds for flushing to disk are managed by a
| specialized "slab automove" algorithm. This algorithm was written in 2017
| and not tuned since.
|
| Most serious users set "ext_item_age=0" and force flush all items. This
| is partially because the defaults do not flush aggressively enough, which
| causes memory to run out and evictions to happen.
|
| This change simplifies the slab automove portion. Instead of balancing
| free chunks of memory per slab class, it sets a target of a certain
| number of free global pages.
|
| The extstore flusher thread also uses the page pool and some low chunk
| limits to decide when to start flushing. Its sleep routines have also
| been adjusted as it could oversleep too easily.
|
| A few other small changes were required to avoid over-moving slab pages
| around.
* sock ip filtering tagging support for FBSD/OBSD (David CARLIER, 2022-08-25; 1 file, -0/+11)
|
| Also linux.
* core: allow forcing protocol per listener socket (dormando, 2022-08-24; 1 file, -2/+2)
|
| -l proto[ascii]:127.0.0.1:11211
|
| accepts:
| - ascii
| - binary
| - negotiating
| - proxy
|
| Allows running proxy on default listeners but direct to memcached on a
| specific port, or binary and ascii on different ports, or etc.
* core: add tagging to listener sockets (dormando, 2022-08-24; 1 file, -2/+3)
|
| -l tag[asdfasdf]:0.0.0.0:11211
|
| not presently used for anything outside of the proxy code.
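Both listener forms above (`proto[...]:addr` and `tag[...]:addr`) share a `name[value]:rest` shape. The sketch below shows one way such a prefix could be split; the function name is hypothetical and memcached's own parser may work differently (for example in how it treats bracketed IPv6 addresses, which this sketch simply reports as "no prefix").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Split "name[value]:rest" into its parts. Copies name and value into the
 * caller's buffers and returns a pointer to "rest" inside `arg`.
 * Returns NULL when there is no well-formed prefix or a buffer is too
 * small. Hypothetical helper, not memcached's parser. */
const char *parse_listener_prefix(const char *arg,
                                  char *name, size_t namelen,
                                  char *value, size_t valuelen) {
    assert(arg != NULL && name != NULL && value != NULL);

    const char *lbracket = strchr(arg, '[');
    if (lbracket == NULL) return NULL;
    const char *rbracket = strchr(lbracket, ']');
    if (rbracket == NULL || rbracket[1] != ':') return NULL;

    size_t nlen = (size_t)(lbracket - arg);
    size_t vlen = (size_t)(rbracket - lbracket - 1);
    /* An empty name means the bracket belongs to something else,
     * such as an IPv6 literal like "[::1]:11211". */
    if (nlen == 0 || nlen >= namelen || vlen >= valuelen) return NULL;

    memcpy(name, arg, nlen);
    name[nlen] = '\0';
    memcpy(value, lbracket + 1, vlen);
    value[vlen] = '\0';
    return rbracket + 2; /* skip "]:" */
}
```

Given `proto[ascii]:127.0.0.1:11211`, this yields the name `proto`, the value `ascii`, and the remaining address `127.0.0.1:11211`.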
* tls: Add switch to opt-in to kernel TLS on OpenSSL 3.0.0+ (Kevin Lin, 2022-07-03; 1 file, -0/+1)
* proxy: mcp.log_req* API interface (dormando, 2022-04-08; 1 file, -0/+1)
|
| Lua level API for logging full context of a request/response.
| Provides log_req() for simple logging and log_reqsample() for
| conditional logging.
* storage: parameterize the compaction thread sleep (dormando, 2022-02-21; 1 file, -0/+1)
|
| Allows tests to run faster, and lets users make it sleep for more or less
| time.
|
| Also cuts the sleep time down when actively compacting and coming from
| high idle.
* proxy: track in-flight requests (dormando, 2022-02-11; 1 file, -1/+2)
|
| I wanted to do this via lua with some on-close hooks on the coroutine but
| this might work for now.
|
| not 100% sure I caught all of the incr/decr cases properly. Was trying to
| avoid hitting the counters too hard as well.
* proxy: add stats for commands seen (dormando, 2022-02-11; 1 file, -1/+2)
|
| added to "stats proxy" output. counters of commands seen at inbound.
* proxy: more misc fixes (dormando, 2022-02-04; 1 file, -1/+2)
|
| - fixes potential memory leaks if an error is generated while creating a
|   pool object.
| - misc comment updates and error handling.
| - avoid crash if attempting to route commands that don't have a key.
* proxy: fix bug/crash for set commands (dormando, 2022-02-04; 1 file, -0/+3)
|
| if a conn goes to sleep while reading set data from the network, its
| coroutine would be lost, causing a crash or corruption on the next
| resume.
|
| this now properly handles lifetime/cleanup of the coroutine.
* proxy: `-o proxy_uring` to enable io_uring (dormando, 2022-01-26; 1 file, -0/+1)
|
| instead of automatically attempting to use io_uring if compiled in,
| require a start option.
* Track store errors in thread stats (Kevin Lin, 2021-11-23; 1 file, -1/+3)
|
| Add two new stat keys, `store_too_large` and `store_no_memory`, to track
| occurrences of storage request rejections due to writing too large of a
| value and writing beyond available provisioned memory, respectively.
* proxy: initial commit. (dormando, 2021-10-05; 1 file, -2/+29)
|
| See BUILD for compilation details.
| See t/startfile.lua for configuration examples.
|
| (see also https://github.com/memcached/memcached-proxylibs for
| extensions, config libraries, more examples)
|
| NOTE: io_uring mode is _not stable_, will crash.
|
| As of this commit it is not recommended to run the proxy in production.
| If you are interested please let us know, as we are actively stabilizing
| for production use.
* Expose number of currently active watchers in stats [tag: 1.6.11] (Kevin Lin, 2021-09-27; 1 file, -0/+1)
|
| The stat key `log_watchers` indicates the number of active connected
| `watch` clients.
* Configurable minimum supported TLS protocol version (Kevin Lin, 2021-09-27; 1 file, -0/+1)
|
| `-o ssl_min_version` can be used to configure the server to only accept
| handshakes from clients with a minimum TLS protocol version. Currently
| supported options are TLS v1.0, TLS v1.1, TLS v1.2, and TLS v1.3
| (OpenSSL 1.1.1+ only).
* core: io_queue flow second attempt (dormando, 2021-08-09; 1 file, -37/+49)
|
| probably squash into previous commit.
|
| io->c->thread can change for orphaned IOs, so we had to directly add the
| original worker thread as a reference.
|
| also tried again to split callbacks onto the thread and off of the
| connection for similar reasons; sometimes we just need the callbacks,
| sometimes we need both.
* core: io_queue_t flow mode (dormando, 2021-08-09; 1 file, -5/+9)
|
| instead of passing ownership of (io_queue_t)*q to the side thread, the
| ownership of IO objects is passed to the side thread, which are then
| individually returned. The worker thread runs return_cb() on each,
| determining when it's done with the response batch.
|
| this interface could use more explicit functions to make it more clear.
| Ownership of *q isn't actually "passed" anywhere, it's just used or not
| used depending on which return function the owner wants.
* thread: use eventfd for worker notify if available (dormando, 2021-08-09; 1 file, -0/+4)
|
| now that all of the read/writes to the notify pipe are in one place, we
| can easily use linux eventfd if available. This also allows batching
| events so we're not firing the same notifier constantly.
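The batching property mentioned above comes from how a Linux eventfd accumulates writes into a single counter. The sketch below is Linux-only and purely illustrative (the function name is hypothetical, and it is not how memcached wires the descriptor into its event loop): several notifications are posted and then drained with one `read`.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Post `n` notifications to a fresh eventfd, then drain them with a single
 * read. Returns the counter value read, or -1 on error. Each write adds 1
 * to the kernel-side counter; one read returns the sum and resets it. */
int64_t notify_and_drain(unsigned int n) {
    assert(n > 0);
    int fd = eventfd(0, EFD_NONBLOCK);
    if (fd == -1) return -1;

    const uint64_t one = 1;
    for (unsigned int i = 0; i < n; i++) {
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }

    uint64_t count = 0;
    ssize_t r = read(fd, &count, sizeof(count));
    close(fd);
    if (r != (ssize_t)sizeof(count)) return -1;
    return (int64_t)count;
}
```

With a pipe, three notifications would leave three bytes to read; with an eventfd the reader wakes once and learns that three events are pending.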
* thread: unify worker notify interface (dormando, 2021-08-09; 1 file, -0/+1)
|
| worker notification was a mix of reading data from a pipe or examining
| an object queue stack. now it's all one interface.
|
| this is necessary to switch signalling to eventfd or similar, since we
| won't have that pipe to work with.
* thread: per-worker-thread connection event queues (dormando, 2021-08-09; 1 file, -1/+1)
|
| help scalability a bit by having a per-worker-thread freelist and queue
| for connection event items (new conns, etc). Also removes a hand-rolled
| linked list and uses cache.c for freelist handling to cull some
| redundancy.
* Implement LOG_CONNEVENTS watcher flag for connection state transitions (Kevin Lin, 2021-08-07; 1 file, -0/+8)
|
| Add support for `watch connevents` to report opened (`conn_new`) and
| closed (`conn_close`) client connections. Event log lines indicate the
| connection's remote IP, remote port, and transport type. `conn_close`
| events additionally supply a reason for closing the connection.
* Fix typos in doc/code comments (tem->item, etc) (Tyson Andre, 2021-08-05; 1 file, -1/+1)
|
| Note: Do not fix typos in crc32.c because it's copied from an upstream
| source
* meta: response code OK -> HD (dormando, 2021-06-10; 1 file, -0/+1)
|
| I had the response code as "HD" in the past, but standardized on OK while
| merging a number of "OK-like" rescodes together. This was a mistake; as
| many "generic" memcached response codes use "OK". Most of these are
| management or specialized uncommon commands.
|
| With this, a client response parser can know for sure if a response is to
| a meta command, or some other command.
|
| `-o meta_response_old` starttime option has been added, valid for the
| next 3 months, which switches the response code back from HD to OK. In
| case any existing users depended on this and need time to migrate.
* The total number of UDP datagrams required for the message is calculated incorrectly. (tom, 2021-06-07; 1 file, -0/+1)
|
| UDP_MAX_PAYLOAD_SIZE actually contains the length of the private UDP
| header, but resp->tosend only contains the length of the data part.
|
| The number of required UDP packets calculated by the original code will
| be less than the actual need.
|
| E.g:
|   1000000/1400 = 714.2, ceil 715
|   1000000/1392 = 718.3, ceil 719
|
| Actually 719 datagrams are needed, and 715 is wrong.
|
| Signed-off-by: AK Deng <ttttabcd@protonmail.com>
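The arithmetic in the example above can be reproduced with a small ceiling-division helper. This is a sketch only: the function name and parameters are hypothetical, and the 8-byte header length is inferred from the commit's own numbers (1400 - 1392), not taken from the code.

```c
#include <assert.h>
#include <stddef.h>

/* Number of datagrams needed to send `tosend` data bytes when each
 * datagram holds at most `max_payload` bytes, of which `header_len`
 * bytes are taken by a per-datagram header. Uses integer ceiling
 * division. Hypothetical helper for illustration. */
size_t udp_datagrams_needed(size_t tosend, size_t max_payload, size_t header_len) {
    assert(max_payload > header_len);
    size_t data_per_datagram = max_payload - header_len;
    return (tosend + data_per_datagram - 1) / data_per_datagram;
}
```

Calling it with `(1000000, 1400, 8)` gives 719, while ignoring the header, `(1000000, 1400, 0)`, gives the too-small 715 described in the commit message.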
* meta: allow base64'ed binary keys with 'b' flag (dormando, 2021-06-07; 1 file, -0/+2)
|
| ie: ms [key] b
|
| if 'k' flag is given and key is binary, returns as binary encoded.
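To send a binary key with the 'b' flag, a client first has to base64-encode it. The sketch below assumes the standard RFC 4648 alphabet with `=` padding, which matches the `44OG44K544OI` example shown in the mgdump commit message; the function name is hypothetical and this is not memcached's own encoder.

```c
#include <assert.h>
#include <stddef.h>

/* Encode `inlen` bytes as standard base64 (RFC 4648, '=' padded) into
 * `out`, NUL-terminated. Returns the encoded length, or 0 if `out` is too
 * small. Hypothetical client-side helper. */
size_t base64_encode_key(const unsigned char *in, size_t inlen,
                         char *out, size_t outlen) {
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    assert(in != NULL && out != NULL);

    size_t need = ((inlen + 2) / 3) * 4;
    if (outlen < need + 1) return 0;

    size_t o = 0;
    for (size_t i = 0; i < inlen; i += 3) {
        /* Pack up to three input bytes into a 24-bit group. */
        unsigned long v = (unsigned long)in[i] << 16;
        if (i + 1 < inlen) v |= (unsigned long)in[i + 1] << 8;
        if (i + 2 < inlen) v |= (unsigned long)in[i + 2];

        /* Emit four 6-bit symbols, padding where input ran out. */
        out[o++] = tbl[(v >> 18) & 0x3f];
        out[o++] = tbl[(v >> 12) & 0x3f];
        out[o++] = (i + 1 < inlen) ? tbl[(v >> 6) & 0x3f] : '=';
        out[o++] = (i + 2 < inlen) ? tbl[v & 0x3f] : '=';
    }
    out[o] = '\0';
    return o;
}
```

The nine bytes `E3 83 86 E3 82 B9 E3 83 88` encode to `44OG44K544OI`, so the resulting command would start with `ms 44OG44K544OI` and carry the `b` flag.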
* Added debugtime command (minkikim89, 2021-06-07; 1 file, -0/+5)
* core: support malloc'ed blobs for body read (dormando, 2020-12-11; 1 file, -0/+1)
|
| conn_nread state handles c->item like an item, but allow it to be a
| temporary malloced blob via setting c->item_malloced = true.
|
| to be used for buffering value reads in the proxy code.
* Expose memory_file path in stats settings (Kevin Lin, 2020-11-20; 1 file, -0/+1)
* net: fix compile failures when missing NAPI define (dormando, 2020-11-11; 1 file, -0/+5)
|
| was defined in memcached.c, but also used in thread.c.
* Introduce NAPI ID based worker thread selection (Sridhar Samudrala, 2020-11-02; 1 file, -0/+4)
|
| By default memcached assigns connections to worker threads in a
| round-robin manner. This patch introduces an option to select a worker
| thread based on the incoming connection's NAPI ID if
| SO_INCOMING_NAPI_ID socket option is supported by the OS.
|
| This allows a memcached worker thread to be associated with a NIC HW
| receive queue and service all the connection requests received on a
| specific RX queue. This mapping between a memcached thread and a HW NIC
| queue streamlines the flow of data from the NIC to the application. In
| addition, an optimal path with reduced context switches is possible, if
| epoll based busy polling
| (sysctl -w net.core.busy_poll = <non-zero value>) is also enabled.
|
| This feature is enabled via a new command line parameter -N <num> or
| "--napi_ids=<num>", where <num> is the number of available/assigned NIC
| hardware RX queues through which the connections can be received. The
| number of napi_ids specified cannot be greater than the number of worker
| threads specified using -t/--threads option. If the option is not
| specified, or the conditions not met, the code defaults to round robin
| thread selection.
|
| Signed-off-by: Kiran Patil <kiran.patil@intel.com>
| Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
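The selection policy described above (map each NAPI ID to a fixed worker, otherwise fall back to round robin) can be sketched as below. Everything here is an assumption for illustration: the struct, the function name, the fixed-size table, and treating a NAPI ID of 0 as "not available" are not taken from the patch. In real code the ID would come from `getsockopt()` with `SO_INCOMING_NAPI_ID` on the accepted socket.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NAPI_IDS 64 /* illustrative cap, not from the patch */

/* Hypothetical selector state. */
struct napi_selector {
    unsigned int napi_ids[MAX_NAPI_IDS]; /* NAPI ID bound to each slot, 0 = free */
    size_t num_napi_ids;                 /* value given with -N */
    size_t num_threads;                  /* value given with -t */
    size_t last_rr;                      /* next round-robin thread */
};

/* Pick a worker thread index for a new connection. */
size_t select_thread(struct napi_selector *s, unsigned int napi_id) {
    assert(s != NULL && s->num_threads > 0);

    int napi_usable = s->num_napi_ids > 0 &&
                      s->num_napi_ids <= s->num_threads &&
                      s->num_napi_ids <= MAX_NAPI_IDS &&
                      napi_id != 0;
    if (napi_usable) {
        /* Slots fill in order, so the first free slot means "not seen yet". */
        for (size_t i = 0; i < s->num_napi_ids; i++) {
            if (s->napi_ids[i] == napi_id) return i;
            if (s->napi_ids[i] == 0) {
                s->napi_ids[i] = napi_id;
                return i;
            }
        }
    }

    /* Option unset, conditions not met, or table full: round robin. */
    size_t t = s->last_rr;
    s->last_rr = (s->last_rr + 1) % s->num_threads;
    return t;
}
```

With two NAPI IDs configured and four threads, connections arriving on the same RX queue keep landing on the same worker, while connections without a usable NAPI ID are spread across all four.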
* queue: replace c->io_pending to avoid a mutex (dormando, 2020-10-30; 1 file, -3/+18)
|
| since multiple queues can be sent to different sidethreads, we need a new
| mechanism for knowing when to return everything. In the common case only
| one queue will be active, so adding a mutex would be excessive.
* core: restructure IO queue callbacks (dormando, 2020-10-30; 1 file, -13/+18)
|
| mc_resp is the proper owner of a pending IO once it's been initialized;
| release it during resp_finish(). Also adds a completion callback which
| runs on the submitted stack after returning to the worker thread but
| before the response is transmitted.
|
| allows re-queueing for pending IO if processing a response generates
| another pending IO. also allows a further refactor to run more extstore
| code on the worker thread instead of the IO threads.
|
| uses proper conn_io_queue state to describe connections waiting for
| pending IO's.
* core: io_pending_t is an embeddable struct (dormando, 2020-10-30; 1 file, -7/+1)
|
| reserve space in an io_pending_t. users cast it to a more specific
| structure, avoiding extra allocations for local data. In this case what
| might require 3 allocs stays as just 1.
* core: compile io_queue code by default (dormando, 2020-10-30; 1 file, -8/+4)
|
| don't gate on EXTSTORE for the deferred io_queue code. removes a number
| of ifdef's and allows more clean usage of the interface.
* core: move more storage functions to storage.c (dormando, 2020-10-30; 1 file, -6/+0)
|
| extstore.h is now only used from storage.c.
|
| starting a path towards getting the storage interface to be more
| generalized.
|
| should be no functional changes.
* core: generalize extstore's deferred IO queue (dormando, 2020-10-30; 1 file, -7/+25)
|
| want to reuse the deferred IO system for extstore for something else.
|
| Should allow evolving into a more plugin-centric system.
|
| step one of three(?) - replace in place and tests pass with extstore
| enabled.
|
| step two should move more extstore code into storage.c
|
| step three should build the IO queue code without ifdef gating.