| Commit message | Author | Age | Files | Lines |

Updates the slab mover for the new method.
1.4.10 lacks some crucial protection around item freeing and removal,
resulting in potential crashes. Moving the cache_lock around item_remove
caused a 30% performance drop, so refcounting has been reimplemented with GCC
atomics. A refcount of 1 now means an item is linked but has no references,
which lets us treat an atomic sub-and-fetch reaching 0 as a clear indicator
of when to free an item.
I re-implemented a linked list for the slab freelist, since we don't need to
manage the tail, check the previous item, or use it as a FIFO. However,
prev/next must still be managed so the slab mover is safe.
I neglected to clear prev on a fetch, so if the slab mover was zeroing the
head of the freelist, it would relink the next item in the freelist with one
in the main LRU.
Which results in chaos.
do_item_update could decide to update an item, then wait on the cache_lock,
but the item could be unlinked in the meantime.
I caused this to happen on purpose by flooding with sets, then flushing
repeatedly; flush has to unlink items until it hits the previous second.
popular items could stall the slab mover forever, so if a move is in progress,
check whether the item we're fetching should be unlinked instead.
Add human-parseable strings to the errors for slabs reassign. Also prevent
reassigning memory to the same source and destination.
Enable at startup with -o slab_reassign,slab_automove
Enable or disable at runtime with "slabs automove 1\r\n"
Has many weaknesses: it only pulls from slabs which have had zero recent
evictions, it's slow, it's not tunable, etc. Use the scripts/mc_slab_mover
example to write your own external automover if this doesn't satisfy.
Adds a "slabs reassign src dst" manual command, and a thread to safely process
slab moves in the background.
- slab freelist is now a linked list, reusing the item structure
- if -o slab_reassign is enabled, an extra background thread is started
- the thread attempts to safely free up items when it's been told to move a
page from one slab class to another
-o slab_automove is stubbed.
There are some limitations. Most notable is that you cannot repeatedly move
pages around without first having items use up the memory. Slab classes with
newly assigned memory work off of a pointer, handing out chunks individually.
We would need to change that to quickly split chunks for all newly assigned
pages into that class's freelist.
Further testing is required to ensure that is possible without impacting
performance.
the race here is absolutely insane:
- do_item_get and do_item_alloc are called at the same time, against
different items
- do_item_get wins the cache_lock race and returns an item for internal
testing
- do_item_alloc runs next and pulls an item off the tail of a slab class,
which is the same item do_item_get just got
- do_item_alloc sees refcount == 0, since do_item_get only incrs it at the
bottom, and starts messing with the item
- do_item_get runs its tests, maybe even refcount++'s, and returns the item
- evil shit happens.
This race is much more likely to hit during the slab reallocation work, so I'm
fixing it even though it's almost impossible to cause.
Also cleaned up the logic so it's not testing the item for NULL more than
once. Far fewer branches now, though I did not examine gcc's output to see if
it is optimized differently.
Fix an unlikely bug where search == NULL and the first alloc fails, which then
attempts to use search.
Also reorders branches from most likely to least likely, and removes all
redundant tests that I can see. No longer double checks things like refcount
or exptime for the eviction case.
after pulling an item off of the LRU, there's no reason to hold the cache lock
while we initialize a few values and memcpy some junk.
the fix for issue 140 only helped in the case of you poking at memcached with
a handful of items (or this particular test). On real instances you could
easily exhaust the 50-item search and still come up with a crap item.
It was removed because adding the proper locks back in that place is
difficult, and it makes "stats items" take longer under a gross lock anyway.
Directly use the hash for accessing the table. Performance seems unchanged
from before but this is more proper. It also scales the hash table a bit as
worker threads are increased.
easy win without restructuring item_alloc more: push the lock down after it's
done fiddling with snprintf.
push cache_lock deeper into the abyss
Code checked 50 items, then checked up to 50 more items to expire one if none
were expired. Given the shallow search depth (50) on any sizeable cache (as
low as 1000 items, even), I believe that whole optimization was pointless.
Flattening it into a single pass is shorter code and benches a bit faster,
since it holds the lock for less time.
I may have made a mess of the logic; it could be cleaned up a little.
been hard to measure while using the intel hash (since it's very fast), but
should help with the software hash.
Partly by Ripduman Sohan.
Appears to significantly help prevent performance dropoff from additional
threads, but only when the locks are frequently contested and held briefly.
The \0 test in the loop was accounting for 2% of memcached's CPU usage
according to callgrind. glibc's strlen uses vectorized (SSE4.2) code paths
and can sniff out that null byte quickly.
All the other tests did... just this one didn't. You really shouldn't build
this thing as root.
we need time travel tests so all of these can go away.
If main event loop bombs out early with an error, exit to the OS with an
error.
SIGH. sorry.
after the doublefork was added, the pidfile contained the intermediary pid,
not the pid of the process forked after setsid.
An audit turned up that the LRU bump to move just-accessed items to the front
was missing from the binary get command. It was also missing from incr/decr
and the new touch commands.
If someone was using the binary protocol exclusively, memcached would be
acting as a FIFO for stored items.
4 billion evictions should be enough for anybody!
The two stats represent items which expired and whose memory was reused, and
valid items which were evicted but never touched by get/incr/append/etc
operations in the meantime.
Useful for seeing how many wasted items are being set and then rolling out
through the bottom of the LRUs.
Instances which run many millions of items can now have their hash table
presized. This can avoid some minor memory churn during the warmup
period.
Now users can tell how much memory is being used for the hash table structure.
It also exposes the current hash power level, which is useful for presizing
the structure.
This happens when we allocate a new item instead of reusing the space
of an existing one, but still set the CAS from the original
item's CAS (which is being discarded).
without setting write_and_go, multiple commands in one packet weren't always
being processed.
Also fixes the -c option to allow reducing the maximum connection limit.
This gives a new option "-o maxconns_fast", which changes how memcached
handles hitting the maximum connection limit. By default, it disables the
accept listener and new connections will wait in the listen queue. With
maxconns_fast enabled, new connections over the limit have an error written
to them and are immediately closed by the listener thread.
This is currently experimental, as we aren't sure how clients will handle the
change. It may become the default in the future.
Not doing GAT for now since I'd have to iterate through gat/gats/multigat/etc.
If there's demand, we can add it.
Apparently nothing tests GETK/GETKQ, so tests still have to be added.
1.6 doesn't have GATK/GATKQ because the membase folks didn't need it. I'm
adding them for completeness and because I don't want to argue about why
people can't have it. If you're reading this, please use opaques :)
Taken from the 1.6 branch, partly written by Trond. I hope the CAS handling is
correct.
I've still removed the "set the time now" stuff that the flush_all commands
do. They push to one second in the past, and with some startup fudge the tests
all pass.
Relying on libevent's firing of clock_handler was drifting ~5ms per tick.
Fudging it further wouldn't be a great idea.
gettimeofday() can flip around all willy-nilly, and it's actually common for
users to cause this by having memcached start on boot before ntp launches
and corrects system time. libevent fires events on a monotonically increasing
clock, so we can more or less safely tick our internal timer up by one second
every time the handler runs.
Unfortunately we support expiration by date. If memcached's start time isn't
in sync with the rest of the world this feature won't work well, but it never
did.
I was originally going to make this optional, but I can't come up with a great
reason to do so. If it turns out this isn't "accurate enough", we can add the
clock_gettime() code inline.
Dustin's set clock stuff would be nice :P
Negative expiration values whose magnitude exceeded the server start time
used to become immortal. Now the value is set to REALTIME_MAXDELTA + 1 in an
attempt to immediately expire it.
previously hardcoded to 40. now will iterate up through all of them.
The debianish start script redirects STDERR/STDOUT to make "logfiles", but in
doing so doesn't break free of the launching session. This patch fixes that.