path: root/crawler.c
Commit message (Author, Age, Files, Lines)
* crawler: add "lru_crawler mgdump" command (dormando, 2023-02-27, 1 file, -3/+55)
  The "metadump" command was designed primarily for doing analysis on what's in cache, but it's also used for pulling the data out for various reasons. The string format is a bit onerous: key=value (for futureproofing) and URI encoded keys (which may or may not be binary internally).
  This adds a command "mgdump", which dumps keys in the format: "mg key\r\nmg key2\r\n"
  If a key is binary encoded, it uses the meta binary encoding scheme of base64-ing keys and appends a "b" flag: "mg 44OG44K544OI b\r\n"
  When the dump is complete it prints an "EN\r\n".
  Clients wishing to stream or fetch data can take the mg commands, strip the \r\n, append any flags they care about, then send the command back to the server to fetch the full key data.
  This seems to use 30-40% less CPU time on the server for the same key dumps.
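The client-side step described in this commit message (strip the line ending, append flags, send the command back) could look roughly like the following. This is a minimal sketch, not code from memcached: the function name `mgdump_to_request` and its return convention are hypothetical, and the flags in the usage note ("v" for the value, "t" for the remaining TTL) are just one choice of meta-get flags.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: turn one line of "lru_crawler mgdump" output into a
 * full meta-get request.
 *   line  - a dump line such as "mg key\r\n" or "mg 44OG44K544OI b\r\n"
 *   flags - caller-chosen flag string such as "v t" (may be NULL or empty)
 * Returns the number of bytes written to out (excluding the NUL),
 * 0 for the terminating "EN" line, or -1 if the line is malformed or
 * out is too small.
 */
int mgdump_to_request(const char *line, const char *flags,
                      char *out, size_t outlen) {
    size_t len = strcspn(line, "\r\n");  /* length without the line ending */
    int n;

    if (len == 2 && strncmp(line, "EN", 2) == 0)
        return 0;                        /* end of dump */
    if (len < 4 || strncmp(line, "mg ", 3) != 0)
        return -1;                       /* not an "mg <key>" line */

    if (flags != NULL && flags[0] != '\0')
        n = snprintf(out, outlen, "%.*s %s\r\n", (int)len, line, flags);
    else
        n = snprintf(out, outlen, "%.*s\r\n", (int)len, line);

    if (n < 0 || (size_t)n >= outlen)
        return -1;                       /* output would be truncated */
    return n;
}
```

With flags "v t", the dump line "mg key\r\n" becomes the request "mg key v t\r\n". A "b" flag already present on a base64-encoded key is carried through unchanged, since everything before the line ending is copied as-is.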
* crawler: don't hold lock while writing to network (dormando, 2023-02-27, 1 file, -77/+103)
  LRU crawler in per-LRU mode only flushes the write buffer to the client socket while not holding important locks. The hash table iterator version of the crawler was accidentally holding an item lock while flushing to the network. Item locks must NOT be held for long periods of time as they will cause the daemon to lag.
  Originally the code used a circular buffer for writing to the network; this allowed it to easily do partial write flushes to the socket and continue filling the other half of the buffer. Fixing this requires the buffer be resizeable, so we instead use a straight buffer allocation.
  The write buffer must be large enough to handle all items within a hash table bucket. Hash table buckets are _supposed_ to max out at an average depth of 1.5 items, so in theory it should never resize. However it's possible to go higher if a user clamps the hash table size. There could also be larger than average buckets naturally due to the hash algorithm and luck.
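The "straight buffer allocation" that grows on demand can be illustrated with a small append routine. This is a sketch under assumptions, not the crawler's actual buffer code: `struct wbuf`, `wbuf_append`, the 64-byte starting size, and the doubling policy are all hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical linear write buffer: data[0..used) is pending output. */
struct wbuf {
    char  *data;
    size_t used;
    size_t size;
};

/*
 * Append len bytes, growing the allocation when they do not fit.
 * The caller would hold the item lock only while appending a bucket's
 * items, then release it before flushing data[0..used) to the socket.
 * Returns 0 on success, -1 if the reallocation fails (buffer unchanged).
 */
int wbuf_append(struct wbuf *b, const char *src, size_t len) {
    if (b->used + len > b->size) {
        size_t nsize = b->size ? b->size : 64;   /* assumed starting size */
        char *ndata;

        while (nsize < b->used + len)
            nsize *= 2;                          /* assumed growth policy */
        ndata = realloc(b->data, nsize);
        if (ndata == NULL)
            return -1;
        b->data = ndata;
        b->size = nsize;
    }
    memcpy(b->data + b->used, src, len);
    b->used += len;
    return 0;
}
```

Unlike a circular buffer, this layout cannot be partially flushed while it is still being filled, which is why it has to be able to hold a whole bucket's output at once.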
* core: give threads unique names (dormando, 2022-11-01, 1 file, -0/+1)
  allow users to differentiate thread functions externally to memcached. Useful for setting priorities or pinning threads to CPUs.
* crawler: remove bad mutex unlock during error (dormando, 2020-11-20, 1 file, -1/+0)
  Detail in: https://github.com/memcached/memcached/issues/741
* item crawler hash table walk mode (dormando, 2020-11-20, 1 file, -12/+89)
  specifying 'hash' instead of 'all' will make the LRU crawler iterate over every bucket in the hash table once, instead of what's in the LRU. This also doesn't suffer from missing items because of LRU reordering or high lock contention.
* core: move more storage functions to storage.c (dormando, 2020-10-30, 1 file, -3/+2)
  extstore.h is now only used from storage.c. starting a path towards getting the storage interface to be more generalized.
  should be no functional changes.
* Remove multiple double-initializations of condition variables and mutexes (Daniel Schemmel, 2019-11-10, 1 file, -5/+0)
  - `slabs_rebalance_lock`
  - `slab_rebalance_cond`
  - `maintenance_lock`
  - `lru_crawler_lock`
  - `lru_crawler_cond`
  - `lru_maintainer_lock`
* restartable cache (dormando, 2019-09-17, 1 file, -3/+12)
  "-e /path/to/tmpfsmnt/file"
  SIGUSR1 for graceful stop
  restart requires the same memory limit, slab sizes, and some other infrequently changed details. Most other options and features can change between restarts. Binary can be upgraded between restarts.
  Restart does some fixup work on start for every item in cache. Can take over a minute with more than a few hundred million items in cache.
  Keep in mind when a cache is down it may be missing invalidations, updates, and so on.
* Use correct buffer size for internal URI encoding. (Tharanga Gamaethige, 2019-05-20, 1 file, -2/+2)
  Modified Logger and Crawler to use the correct buffer length when they are printing URI encoded keys.
  Fixes #471
* Basic implementation of TLS for memcached. (Tharanga Gamaethige, 2019-04-15, tag 1.5.13, 1 file, -2/+2)
  Most of the work done by Tharanga. Some commits squashed in by dormando. Also reviewed by dormando.
  Tested, working, but experimental implementation of TLS for memcached.
  Enable with ./configure --enable-tls
  Requires OpenSSL 1.1.0 or better.
  See `memcached -h` output for usage.
* remove bad assert from crawler (dormando, 2018-07-03, 1 file, -1/+0)
* Fixes decrement-before-check problem (issue #362). (Calin Iorgulescu, 2018-03-19, 1 file, -0/+8)
  Adds test for issue #362.
* Apply the cast to the whole expressions, fixing build with clang (David Carlier, 2018-02-19, 1 file, -2/+2)
* limit crawls for metadumper (dormando, 2018-02-12, 1 file, -3/+5)
  LRU crawler metadumper is used for getting snapshot-y looks at the LRUs. Since there's no default limit, it'll get any new items added or bumped since the roll started.
  With this change it limits the number of items dumped to the number that existed in that LRU when the roll was kicked off. You still end up with an approximation, but not a terrible one:
  - items bumped after the crawler passes them likely won't be revisited
  - items bumped before the crawler passes them will likely be visited toward the end, or mixed with new items.
  - deletes are somewhere in the middle.
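The limit described in this commit amounts to snapshotting the item count when the crawl starts and stopping once that many items have been visited, even if the list has grown since. A minimal sketch of that idea, assuming a plain singly linked list; `struct lru_node` and `crawl_limited` are hypothetical names, not memcached's:

```c
#include <stddef.h>

/* Hypothetical stand-in for an LRU entry. */
struct lru_node {
    struct lru_node *next;
};

/*
 * Walk the list but visit at most `limit` items, where `limit` is the
 * number of items the LRU held when the crawl was kicked off.
 * Returns the number of items actually visited.
 */
size_t crawl_limited(const struct lru_node *head, size_t limit) {
    size_t visited = 0;
    const struct lru_node *it;

    for (it = head; it != NULL && visited < limit; it = it->next) {
        /* a real crawler would dump or examine the item here */
        visited++;
    }
    return visited;
}
```

Items added after the snapshot are simply never reached once the budget is spent, which is what keeps the dump bounded.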
* build fixes (dormando, 2017-12-20, tag 1.5.4, 1 file, -1/+1)
* extstore: crawler fix and ext_low_ttl option (dormando, 2017-11-28, 1 file, -0/+3)
  LRU crawler was not marking reclaimed expired items as removed from the storage engine. This could cause fragmentation to persist much longer than it should, but would not cause any problems once compaction started.
  Adds "ext_low_ttl" option. Items with a remaining expiration age below this value are grouped into special pages. If you have a mixed TTL workload this would help prevent low TTL items from causing excess fragmentation/compaction. Pages with low ttl items are excluded from compaction.
* external storage base commit (dormando, 2017-11-28, 1 file, -1/+20)
  been squashing, reorganizing, and pulling code off to go upstream ahead of merging the whole branch.
* metadump: don't crash if client lost (dormando, 2017-11-28, 1 file, -4/+6)
  fixes previous commit :|
* lru_crawler metadump output expansion (dormando, 2017-11-16, 1 file, -3/+14)
  add class id and total item size. adds "END\r\n" to signify end of output.
* interface code for flash branch (dormando, 2017-09-26, 1 file, -1/+1)
  removes a few ifdef's and upstreams small internal interface tweaks for easy rebase.
* LRU crawler scheduling improvements (dormando, 2017-05-29, 1 file, -57/+76)
  when trying to manually run a crawl, the internal autocrawler is now blocked from restarting for 60 seconds.
  the internal autocrawl now independently schedules LRUs, and can re-schedule sub-LRUs while others are still running. should allow much better memory control when some sub-LRUs (such as TEMP or WARM) are small, or slab classes are differently sized.
  this also makes the crawler drop its lock frequently. this fixes an issue where a long crawl happening at the same time as a hash table expansion could hang the server until the crawl finished.
  to improve still:
  - elapsed time can be wrong in the logger entry
  - need to cap number of entries scanned. enough set pressure and a crawl may never finish.
* check index when incrementing crawlerstats_t->histo (Fumihiro Ito, 2017-05-22, 1 file, -1/+3)
  Very rarely, current_time will be greater than the expiration time of an item. If an item expires after the crawler thread has checked its expiration, crawlerstats_t->histo is accessed with an invalid index, and a segmentation fault occurs as a result of the access violation.
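The failure mode here is a histogram index computed from a TTL that has gone negative between the expiry check and the increment. A guarded increment might be sketched as follows. The struct, the function name, the 61-bucket size, and the one-minute bucket width are assumptions for illustration and are not taken from the actual `crawlerstats_t` definition.

```c
#include <stdint.h>

#define HISTO_BUCKETS 61   /* assumed: one bucket per minute, 0..60 */

/* Hypothetical stand-in for the crawler's per-class stats. */
struct ttl_histo {
    uint64_t histo[HISTO_BUCKETS];
    uint64_t out_of_range;   /* already expired, or TTL beyond the histogram */
};

/*
 * Record an item's remaining TTL, validating the index before touching
 * the array. Returns 1 if a bucket was incremented, 0 otherwise.
 */
int histo_record(struct ttl_histo *h, int64_t exptime, int64_t now) {
    int64_t bucket;

    if (exptime < now) {          /* expired after the earlier check */
        h->out_of_range++;
        return 0;
    }
    bucket = (exptime - now) / 60;
    if (bucket >= HISTO_BUCKETS) {
        h->out_of_range++;
        return 0;
    }
    h->histo[bucket]++;
    return 1;
}
```

The explicit `exptime < now` test matters because C integer division truncates toward zero, so a TTL of -10 seconds would otherwise land in bucket 0 instead of being rejected.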
* stop using atomics for item refcount management (dormando, 2017-01-22, 1 file, -5/+5)
  when I first split the locks up further I had a trick where "item_remove()" did not require holding the associated item lock. If an item were to be freed, it would then do the necessary work.
  Since then, all calls to refcount_incr and refcount_decr only happen while the item is locked. This was mostly due to the slab mover being very tricky with locks.
  The atomic is no longer needed as the refcount is only ever checked after a lock to the item. Calling atomics is pretty expensive, especially in multicore/multisocket scenarios. This yields a notable performance increase.
* metadump: Fix preventing dumping of class 63 (dormando, 2017-01-07, 1 file, -1/+1)
  Off by one :|
* metadump: ensure buffer is flushed before finish (dormando, 2017-01-06, 1 file, -5/+5)
  if more than one write/poll() worth of data is in the bipbuf but the crawl is complete, the client might get released back with an incomplete write.
* don't double free in lru_crawler on closed clients (dormando, 2016-12-16, 1 file, -1/+4)
  during finalization, a poll and deliberate close are run. if a client is closed during the poll it might double free.
* fix segfault if metadump client goes away (dormando, 2016-12-16, 1 file, -0/+3)
  missing else branch caused the first slab class to hit a closed client to terminate, but didn't kill the run and the next slab class would try to print to the missing client.
* pull LRU crawler out into its own file. (dormando, 2016-08-19, 1 file, -0/+630)
  ~600 lines gone from items.c makes it a lot more manageable. this change is almost purely moving code around and renaming functions. very little logic has changed.