delta/ceph.git - github.com: ceph/ceph.git

	Commit message	Author	Age	Files	Lines
*	mon: OSDMonitor: get rid of encode_full() as we don't use it. (wip-5704-cuttlefish)	Joao Eduardo Luis	2013-07-23	2	-12/+5
\| \| \| \| \| \| \| \|	We have delegated this to encode_trim_extra() since 7fb3804fb860dcd0340dd3f7c39eec4315f8e4b6 -- no need to keep this code around. Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com>
*	mon: OSDMonitor: update the osdmap's latest_full with the new full version	Joao Eduardo Luis	2013-07-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	We used to do this on encode_full(), but since [1] we no longer rely on PaxosService to manage the full maps for us. And we forgot to write down the latest_full version to the store, leaving it in a truly outdated state. Fixes: #5704 Backport: cuttlefish [1] - 7fb3804fb860dcd0340dd3f7c39eec4315f8e4b6
*	v0.61.5 (v0.61.5)	Gary Lowell	2013-07-17	2	-1/+7
\|
*	ceph-disk: rely on /dev/disk/by-partuuid instead of special-casing journal symlinks	Sage Weil	2013-07-16	1	-34/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was necessary when ceph-disk-udev didn't create the by-partuuid (and other) symlinks for us, but now it is fragile and error-prone. (It also appears to be broken on a certain customer RHEL VM.) See d7f7d613512fe39ec883e11d201793c75ee05db1. Instead, just use the by-partuuid symlinks that we spent all that ugly effort generating. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> (cherry picked from commit 64379e701b3ed862c05f156539506d3382f77aa8)
*	mon: Monitor: StoreConverter: clearer debug message on 'needs_conversion()'	Joao Eduardo Luis	2013-07-16	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	The previous debug message outputted the function's name, as often our functions do. This was however a source of bewilderment, as users would see those in logs and think their stores would need conversion. Changing this message is trivial enough and it will make ceph users happier log readers. Backport: cuttlefish Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit ad1392f68170b391d11df0ce5523c2d1fb57f60e)
*	mon: Monitor: do not reopen MonitorDBStore during conversion	Joao Eduardo Luis	2013-07-16	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already open the store on ceph_mon.cc, before we start the conversion. Given we are unable to reproduce this every time a conversion is triggered, we are led to believe that this causes a race in leveldb that will lead to 'store.db/LOCK' being locked upon the open this patch removes. Regardless, reopening the db here is pointless as we already did it when we reach Monitor::StoreConverter::convert(). Fixes: #5640 Backport: cuttlefish Signed-off-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit 036e6739a4e873863bae3d7d00f310c015dfcdb3)
*	messages/MClientReconnect: clear data when encoding	Sage Weil	2013-07-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MClientReconnect puts everything in the data payload portion of the message and nothing in the front portion. That means that if the message is resent (socket failure or something), the messenger thinks it hasn't been encoded yet (front empty) and reencodes, which means everything gets added (again) to the data portion. The decoder keeps decoding until it runs out of data, so the second copy means we decode garbage snap realms, leading to the crash in this bug. Clearing data each time around resolves the problem, although it does mean we do the encoding work multiple times. We could alternatively (or also) stick some data in the front portion of the payload (ignored), but that changes the wire protocol and I would rather not do that. Fixes: #4565 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit 314cf046b0b787ca69665e8751eab6fe7adb4037)
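The double-encode failure described in this commit can be reproduced with a toy model. This is only a sketch: `ToyReconnect`, `send`, and the list-based `front`/`data` fields are hypothetical stand-ins for illustration, not the real Message or messenger classes.

```python
class ToyReconnect:
    """Toy message that, like MClientReconnect, encodes only into `data`."""

    def __init__(self, realms, clear_on_encode):
        self.realms = list(realms)
        self.clear_on_encode = clear_on_encode
        self.front = []   # never filled, so the sender always re-encodes
        self.data = []

    def encode_payload(self):
        if self.clear_on_encode:
            self.data.clear()          # the fix: start from an empty payload
        self.data.extend(self.realms)  # everything goes into the data portion


def send(msg):
    """Toy messenger: an empty front is taken to mean 'not encoded yet'."""
    if not msg.front:
        msg.encode_payload()
    return list(msg.data)


buggy = ToyReconnect(["realm-a", "realm-b"], clear_on_encode=False)
send(buggy)
resent_buggy = send(buggy)   # resend after a simulated socket failure

fixed = ToyReconnect(["realm-a", "realm-b"], clear_on_encode=True)
send(fixed)
resent_fixed = send(fixed)
```

Without the clear, the resent payload carries every realm twice; with it, the payload is rebuilt from scratch at the cost of encoding again.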
*	mon: once sync full is chosen, make sure we don't change our mind	Sage Weil	2013-07-15	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible for a sequence like: - probe - first probe reply has paxos trim that indicates a full sync is needed - start sync - clear store - something happens that makes us abort and bootstrap (e.g., the provider mon restarts) - probe - first probe reply has older paxos trim bound and we call an election - on election completion, we crash because we have no data. Non-determinism of the probe decision aside, we need to ensure that the info we share during probe (fc, lc) is accurate, and that once we clear the store we know we must do a full sync. This is a backport of aa60f940ec1994a61624345586dc70d261688456. Fixes: #5621 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
*	mon: do not scrub if scrub is in progress	Sage Weil	2013-07-13	1	-0/+5
\| \| \| \| \| \| \| \| \|	This prevents an assert from unexpected scrub results from the previous scrub on the leader. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 00ae543b3e32f89d906a0e934792cc5309f57696)
*	messages/MPGStats: do not set paxos version to osdmap epoch	Sage Weil	2013-07-13	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \|	The PaxosServiceMessage version field is meant for client-coordinated ordering of messages when switching between monitors (and is rarely used). Do not fill it with the osdmap epoch lest it be compared to a pgmap version, which may cause the mon to (near) indefinitely put it on a wait queue until the pgmap version catches up. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit b36338be43f43b6dd4ee87c97f2eaa23b467c386)
*	osd/OSDmap: fix OSDMap::Incremental::dump() for new pool names	Sage Weil	2013-07-13	1	-1/+8
\| \| \| \| \| \| \| \| \| \|	The name is always present when pools are created, but not when they are modified. Also, a name may be present with a new_pools entry if the pool is just renamed. Separate it out completely in the dump. Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 3e4a29111e89588385e63f8d92ce3d67739dd679)
*	mon/PaxosService: prevent reads until initial service commit is done	Sage Weil	2013-07-13	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Do not process reads (or, by PaxosService::dispatch() implication, writes) until we have committed the initial service state. This avoids things like EPERM due to missing keys when we race with mon creation, triggered by teuthology tests doing their health check after startup. Fixes: #5515 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit d08b6d6df7dba06dad73bdec2c945f24afc02717)
*	client: send all request put's through put_request()	Sage Weil	2013-07-13	3	-20/+31
\| \| \| \| \| \| \| \|	Make sure all MetaRequest reference put's go through the same path that releases inode references, including all of the error paths. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 87217e1e3cb2785b79d0dec49bd3f23a827551f5)
*	client: fix remaining Inode::put() caller, and make method pseudo-private	Sage Weil	2013-07-13	2	-3/+5
\| \| \| \| \| \| \| \| \|	Not sure I can make this actually private and make Client::put_inode() a friend method (making all of Client a friend would defeat the purpose). This works well enough, though! Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 9af3b86b25574e4d2cdfd43e61028cffa19bdeb1)
*	client: use put_inode on MetaRequest inode refs	Sage Weil	2013-07-13	3	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \|	When we drop the request inode refs, we need to use put_inode() to ensure they get cleaned up properly (removed from inode_map, caps released, etc.). Do this explicitly here (as we do with all other inode put() paths that matter). Fixes: #5381 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 81bee6487fb1ce9e090b030d61bda128a3cf4982)
*	mon: be smarter about calculating last_epoch_clean lower bound	Sage Weil	2013-07-12	2	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \|	We need to take PGs whose mapping has not changed in a long time into account. For them, the pg state will indicate it was clean at the time of the report, in which case we can use that as a lower-bound on their actual latest epoch clean. If they are not currently clean (at report time), use the last_epoch_clean value. Fixes: #5519 Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit cc0006deee3153e06ddd220bf8a40358ba830135)
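The rule in this message can be sketched as follows. The dictionary fields (`clean`, `reported_epoch`, `last_epoch_clean`) and the choice to take the minimum across PGs are assumptions made for illustration; they are not taken from the monitor code.

```python
def last_epoch_clean_floor(pg_stats):
    """Return the smallest per-PG lower bound, or None if there are no PGs."""
    floors = []
    for stat in pg_stats:
        if stat["clean"]:
            # Clean at report time: the report epoch is a valid lower bound.
            floors.append(stat["reported_epoch"])
        else:
            # Not clean at report time: fall back to last_epoch_clean.
            floors.append(stat["last_epoch_clean"])
    return min(floors) if floors else None


pgs = [
    # Mapping unchanged for a long time, but clean when it last reported.
    {"clean": True, "reported_epoch": 900, "last_epoch_clean": 120},
    # Currently not clean, so only its last_epoch_clean can be trusted.
    {"clean": False, "reported_epoch": 905, "last_epoch_clean": 850},
]
floor = last_epoch_clean_floor(pgs)
```

In this made-up data, using only `last_epoch_clean` would pin the bound at 120, whereas the rule above yields 850.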
*	osd: report pg stats to mon at least every N (=500) epochs	Sage Weil	2013-07-12	2	-1/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The mon needs a moderately accurate last_epoch_clean value in order to trim old osdmaps. To prevent a PG that hasn't peered or received IO in forever from preventing this, send pg stats at some minimum frequency. This will increase the pg stat report workload for the mon over an idle pool, but should be no worse than a cluster that is getting actual IO and sees these updates from normal stat updates. This makes the reported update a bit more aggressive/useful in that the epoch is the last map epoch processed by this PG and not just one that is >= the current interval. Note that the semantics of this field are pretty useless at this point. See #5519 Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit da81228cc73c95737f26c630e5c3eccf6ae1aaec)
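A minimal sketch of the "at least every N epochs" check, assuming a helper named `should_report` (hypothetical) and the N = 500 given in the commit title:

```python
def should_report(last_reported_epoch, current_epoch, max_epochs_between_reports=500):
    """True once an idle PG has gone N epochs without sending stats."""
    return current_epoch - last_reported_epoch >= max_epochs_between_reports


idle_pg_due = should_report(last_reported_epoch=100, current_epoch=600)
idle_pg_not_due = should_report(last_reported_epoch=100, current_epoch=599)
```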
*	osd: fix warning	Sage Weil	2013-07-12	1	-1/+1
\| \| \| \| \| \| \| \|	From 653e04a79430317e275dd77a46c2b17c788b860b Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit bc291d3fc3fc1cac838565cbe0f25f71d855a6e3)
*	Merge remote-tracking branch 'gh/wip-mon-sync-2' into cuttlefish	Sage Weil	2013-07-12	6	-32/+83
\|\ \| \| \| \| \| \| \| \|	Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com>
\| *	Merge remote-tracking branch 'gh/cuttlefish' into wip-mon-sync-2	Sage Weil	2013-07-10	24	-78/+419
\| \|\
\| * \|	mon/Paxos: fix sync restart	Sage Weil	2013-07-05	2	-9/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we have a sync going, and an election intervenes, the client will try to continue by sending a new start_chunks request. In order to ensure that we get all of the paxos commits from our original starting point (and thus properly update the keys from which they started), only pay attention if they also send their current last_committed version. Otherwise, start them at the beginning. Signed-off-by: Sage Weil <sage@inktank.com>
\| * \|	mon: uninline _trim_enable and Paxos::trim_{enable,disable} so we can debug them	Sage Weil	2013-07-05	4	-13/+30
\| \| \| \| \| \| \| \| \| \| \| \|	Signed-off-by: Sage Weil <sage@inktank.com>
\| * \|	mon/Paxos: increase paxos max join drift	Sage Weil	2013-07-04	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A value of 10 is too aggressive for large, long-running syncs. 100 is about 2 minutes of activity at most, which should be a more forgiving buffer. Signed-off-by: Sage Weil <sage@inktank.com>
\| * \|	mon/Paxos: configure minimum paxos txns separately	Sage Weil	2013-07-04	2	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were using paxos_max_join_drift to control the minimum number of paxos transactions to keep around. Instead, make this explicit, and separate from the join drift. Signed-off-by: Sage Weil <sage@inktank.com>
\| * \|	mon: include any new paxos commits in each sync CHUNK message	Sage Weil	2013-07-04	1	-1/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We already take note of the paxos version when we begin the sync. As sync progresses and there are new paxos commits/txns, include those and update last_committed, so that when sync completes we will have a full view of everything that happened during sync. Note that this does not introduce any compatibility change. This change only affects the provider. The key difference is that at the end of the sync, the provider will set version to the latest version, and not the version from the start of the sync (as was done previously). Signed-off-by: Sage Weil <sage@inktank.com>
\| * \|	mon/MonitorDBStore: expose get_chunk_tx()	Sage Weil	2013-07-04	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow users to get the transaction unencoded. Signed-off-by: Sage Weil <sage@inktank.com>
* \| \|	Get device-by-path by looking for it instead of assuming 3rd entry.	Sandon Van Ness	2013-07-10	1	-1/+7
\| \|/ \|/\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some systems (virtual machines so far) the device-by-path entry from udevadm is not always in the same spot, so actually look for the right output instead of blindly assuming that it's a specific field in the output. Signed-off-by: Sandon Van Ness <sandon@inktank.com> Reviewed-by: Gary Lowell <gary.lowell@inktank.com>
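The difference between picking a fixed field and searching for the entry can be sketched like this. The sample strings are invented for illustration and the helper name `find_by_path` is hypothetical; the real udevadm output on a given system may look different.

```python
def find_by_path(udev_fields):
    """Return the first whitespace-separated field containing 'by-path', or None."""
    for field in udev_fields.split():
        if "by-path" in field:
            return field
    return None


# Hypothetical output where the by-path entry happens to be the 3rd field.
bare_metal = "disk/by-id/ata-demo disk/by-uuid/1234 disk/by-path/pci-0000:00:1f.2-scsi-0:0:0:0"
# Hypothetical VM output where the same entry comes first.
virtual = "disk/by-path/pci-0000:00:05.0-virtio-pci-virtio2 disk/by-uuid/1234"

bare_metal_path = find_by_path(bare_metal)
virtual_path = find_by_path(virtual)
```

Indexing the third field would work for the first sample and fail for the second, which is the fragility the commit removes.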
* \|	osd: limit number of inc osdmaps sent to peers, clients	Sage Weil	2013-07-10	2	-5/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We should not send an unbounded number of inc maps to our peers or clients. In particular, if a peer is not contacted for a while, we may think they have a very old map (say, 10000 epochs ago) and send thousands of inc maps when the distribution shifts and we need to peer. Note that if we do not send enough maps, the peers will make do by requesting the map from somewhere else (currently the mon). Regardless of the source, however, we must limit the amount that we speculatively share as it usually is not needed. Backport: cuttlefish, bobtail Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Samuel Just <sam.just@inktank.com> (cherry picked from commit 653e04a79430317e275dd77a46c2b17c788b860b)
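A rough sketch of capping the number of incremental maps shared with a peer. The function name `epochs_to_share`, the default cap of 100, and the choice to keep the newest epochs are all assumptions for illustration; the commit message only states that the amount must be limited.

```python
def epochs_to_share(peer_epoch, our_epoch, max_epochs=100):
    """Return the incremental map epochs to send, capped at max_epochs."""
    first = peer_epoch + 1
    if our_epoch - peer_epoch > max_epochs:
        # The peer is far behind: send only the most recent max_epochs maps
        # and let it fetch the rest from another source (e.g. the mon).
        first = our_epoch - max_epochs + 1
    return range(first, our_epoch + 1)


stale_peer = epochs_to_share(peer_epoch=5000, our_epoch=15000)
recent_peer = epochs_to_share(peer_epoch=14990, our_epoch=15000)
```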
* \|	rgw: Fix return value for swift user not found	Christophe Courtaut	2013-07-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	http://tracker.ceph.com/issues/1779 fixes #1779 Adjust the return value from rgw_get_user_info_by_swift call in RGW_SWIFT_Auth_Get::execute() to have the correct return code in response. (cherry picked from commit 4089001de1f22d6acd0b9f09996b71c716235551)
* \|	mon/OSDMonitor: make 'osd crush rm ...' slightly more idempotent	Sage Weil	2013-07-09	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a manual backport of 18a624fd8b90d9959de51f07622cf0839e6bd9aa. Do not return immediately if we are looking at uncommitted state. Signed-off-by: Sage Weil <sage@inktank.com>
* \|	mon/OSDMonitor: fix base case for loading full osdmap	Sage Weil	2013-07-08	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right after cluster creation, first_committed is 1 and latest stashed is 0, but we don't have the initial full map yet. Thereafter, we do (because we write it with trim). Fixes afd6c7d8247075003e5be439ad59976c3d123218. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit 43fa7aabf1f7e5deb844c1f52d451bab9e7d1006)
* \|	mon: fix osdmap stash, trim to retain complete history of full maps	Sage Weil	2013-07-08	4	-20/+47
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The current interaction between sync and stashing full osdmaps only on active mons means that a sync can result in an incomplete osdmap_full history: - mon.c starts a full sync - during sync, active osdmap service should_stash_full() is true and includes a full in the txn - mon.c sync finishes - mon.c update_from_paxos gets "latest" stashed that it got from the paxos txn - mon.c does not walk to previous inc maps to complete its collection of full maps. To fix this, we disable the periodic/random stash of full maps by the osdmap service. This introduces a new problem: we must have at least one full map (the first one) in order for a mon that just synced to build its full collection. Extend the encode_trim() process to allow the osdmap service to include the oldest full map with the trim txn. This is more complex than just writing the full maps in the txn, but cheaper--we only write the full map at trim time. This might be related to previous bugs where the full osdmap was missing, or cases where leveldb keys seemed to 'disappear'. Fixes: #5512 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit afd6c7d8247075003e5be439ad59976c3d123218)
* \|	mon: implement simple 'scrub' command	Sage Weil	2013-07-08	9	-4/+309
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compare all keys within the sync'ed prefixes across members of the quorum and compare the key counts and CRC for inconsistencies. Currently this is a one-shot inefficient hammer. We'll want to make this work in chunks before it is usable in production environments. Protect with a feature bit to avoid sending MMonScrub to mons who can't decode it. Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit a9906641a1dce150203b72682da05651e4d68ff5) Conflicts: src/mon/MonCommands.h src/mon/Monitor.cc
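The one-shot comparison described here can be sketched with plain dictionaries. Everything below is illustrative: `scrub_summary` and `mismatched_mons` are hypothetical names, each store is modeled as a dict keyed by `(prefix, key)` with bytes values, and `zlib.crc32` stands in for whatever CRC the monitor actually uses.

```python
import zlib


def scrub_summary(store, prefixes):
    """Return {prefix: (key_count, crc)} over the given prefixes of one store."""
    summary = {}
    for prefix in prefixes:
        count, crc = 0, 0
        # Walk keys in sorted order so every mon folds them into the CRC identically.
        for key in sorted(k for k in store if k[0] == prefix):
            count += 1
            crc = zlib.crc32(repr(key).encode(), crc)
            crc = zlib.crc32(store[key], crc)
        summary[prefix] = (count, crc)
    return summary


def mismatched_mons(summaries, leader):
    """Return the names of mons whose summary differs from the leader's."""
    reference = summaries[leader]
    return sorted(name for name, s in summaries.items() if s != reference)


good = {("osdmap", "1"): b"full-1", ("osdmap", "2"): b"inc-2", ("paxos", "10"): b"txn"}
bad = dict(good)
bad[("osdmap", "2")] = b"corrupt"

prefixes = ["osdmap", "paxos"]
summaries = {
    "a": scrub_summary(good, prefixes),
    "b": scrub_summary(dict(good), prefixes),
    "c": scrub_summary(bad, prefixes),
}
inconsistent = mismatched_mons(summaries, leader="a")
```

Like the command in the commit, this reads every key in one pass, so it would need chunking before being practical on a large store.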
* \|	Elector.h: features are 64 bit	Samuel Just	2013-07-08	2	-3/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: #5497 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Luis <joao.luis@inktank.com> (cherry picked from commit 3564e304e3f50642e4d9ff25e529d5fc60629093)
* \|	ceph_features.h: declare all features as ULL	Samuel Just	2013-07-08	1	-33/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Otherwise, the first 32 get \|'d together as ints. Then, the result ((int)-1) is sign extended to ((long long int)-1) before being \|'d with the 1LL entries. This results in ~((uint64_t)0). Fixes: #5497 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Luis <joao.luis@inktank.com> (cherry picked from commit 4255b5c2fb54ae40c53284b3ab700fdfc7e61748)
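The sign-extension effect described above can be emulated outside of C++. Python integers are unbounded, so this sketch uses two hypothetical helpers, `as_int32` and `as_uint64`, to mimic the fixed-width conversions; it is an illustration of the arithmetic, not code from ceph_features.h.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit int."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def as_uint64(value):
    """Reinterpret a (possibly negative) integer as an unsigned 64-bit value."""
    return value & MASK64


low_bits = 0
for bit in range(32):
    low_bits |= 1 << bit          # features 0..31

high_bit = 1 << 32                # one feature declared with a 64-bit type

# Buggy path: the low 32 features are combined as a 32-bit int (-1),
# sign-extended, and only then OR'd with the 64-bit entry.
buggy_mask = as_uint64(as_int32(low_bits) | high_bit)

# Fixed path: every feature is 64 bits wide from the start.
fixed_mask = as_uint64(low_bits | high_bit)
```

The buggy mask claims all 64 feature bits, while the fixed mask contains only the 33 bits that were actually set.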
* \|	Pipe: use uint64_t not unsigned when setting features	Samuel Just	2013-07-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes: #5497 Signed-off-by: Samuel Just <sam.just@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Luis <joao.luis@inktank.com> (cherry picked from commit bc3e2f09f8860555d8b3b49b2eea164b4118d817)
* \|	client: remove O_LAZY (wip-lazy-cuttlefish)	Sage Weil	2013-07-08	4	-10/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The once-upon-a-time unique O_LAZY value I chose forever ago is now O_NOATIME, which means that some clients are choosing relaxed consistency without meaning to. It is highly unlikely that a real O_LAZY will ever exist, and we can select it in the ceph case with the ioctl or libcephfs call, so drop any support for doing this via open(2) flags. Update doc/lazy_posix.txt file re: lazy io. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Greg Farnum <greg@inktank.com> (cherry picked from commit 94afedf02d07ad4678222aa66289a74b87768810)
* \|	osd/osd_types: fix pg_stat_t::dump for last_epoch_clean	Sage Weil	2013-07-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 69a55445439fce0dd6a3d32ff4bf436da42f1b11)
* \|	mon: remove bad assert about monmap version	Sage Weil	2013-07-08	1	-1/+0
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	It is possible to start a sync when our newest monmap is 0. Usually we see e0 from probe, but that isn't always published as part of the very first paxos transaction due to the way PaxosService::_active generates its first initial commit. In any case, having e0 here is harmless. Fixes: #5509 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit 85a1d6cc5d3852c94d1287b566656c5b5024fa13)
*	mon: enable leveldb cache by default	Sage Weil	2013-07-03	1	-1/+1
\| \| \| \| \| \| \| \| \|	256 is not as large as the upstream 512 MB, but will help significantly and be less disruptive for existing cuttlefish clusters. Sort-of backport of e93730b7ffa48b53c8da2f439a60cb6805facf5a. Signed-off-by: Sage Weil <sage@inktank.com>
*	mon/Paxos: make 'paxos trim disabled max versions' much much larger	Sage Weil	2013-07-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	108000 is about 3 hours if paxos is going full-bore (1 proposal/second). That ought to be pretty safe. Otherwise, we start trimming too soon and a slow sync will just have to restart when it finishes. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Joao Eduardo Luis <joao.luis@inktank.com> (cherry picked from commit 71ebfe7e1abe4795b46cf00dfe1b03d1893368b0) Conflicts: src/common/config_opts.h
*	mon: do not reopen MonitorDBStore during startup	Sage Weil	2013-07-03	3	-55/+58
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	level doesn't seem to like this when it races with an internal compaction attempt (see below). Instead, let the store get opened by the ceph_mon caller, and pull a bit of the logic into the caller to make the flow a little easier to follow. -2> 2013-06-25 17:49:25.184490 7f4d439f8780 10 needs_conversion -1> 2013-06-25 17:49:25.184495 7f4d4065c700 5 asok(0x13b1460) entry start 0> 2013-06-25 17:49:25.316908 7f4d3fe5b700 -1 * Caught signal (Segmentation fault) in thread 7f4d3fe5b700 ceph version 0.64-667-g089cba8 (089cba8fc0e8ae8aef9a3111cba7342ecd0f8314) 1: ceph-mon() [0x649f0a] 2: (()+0xfcb0) [0x7f4d435dccb0] 3: (leveldb::Table::BlockReader(void, leveldb::ReadOptions const&, leveldb::Slice const&)+0x154) [0x806e54] 4: ceph-mon() [0x808840] 5: ceph-mon() [0x808b39] 6: ceph-mon() [0x806540] 7: (leveldb::DBImpl::DoCompactionWork(leveldb::DBImpl::CompactionState)+0xdd) [0x7f363d] 8: (leveldb::DBImpl::BackgroundCompaction()+0x2c0) [0x7f4210] 9: (leveldb::DBImpl::BackgroundCall()+0x68) [0x7f4cc8] 10: ceph-mon() [0x80b3af] 11: (()+0x7e9a) [0x7f4d435d4e9a] 12: (clone()+0x6d) [0x7f4d4196bccd] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit ea1f316e5de21487ae034a1aa929068ba23ac525)
*	sysvinit, upstart: handle symlinks to dirs in /var/lib/ceph/*	Sage Weil	2013-07-02	5	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Match a symlink to a dir, not just dirs. This fixes the osd case of e.g., creating an osd in /data/osd$id in which ceph-disk makes a symlink from /var/lib/ceph/osd/ceph-$id. Fix proposed by Matt Thompson <matt.thompson@mandiant.com>; extended to include the upstart users too. Fixes: #5490 Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> (cherry picked from commit 87c98e92d1375c8bc76196bbbf06f677bef95e64)
*	rgw: add RGWFormatter_Plain allocation to sidestep cranky strlen()	Sage Weil	2013-07-02	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	Valgrind complains about an invalid read when we don't pad the allocation, and because it is inlined we can't whitelist it for valgrind. Work around the warning by just padding our allocations a bit. Fixes: #5346 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 49ff63b1750789070a8c6fef830c9526ae0f6d9f)
*	mds: warn on unconnected snap realms	Yan, Zheng	2013-07-01	1	-1/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	When there is more than one active MDS, restarting an MDS triggers the assertion "reconnected_snaprealms.empty()" quite often. If there is no snapshot in the FS, the items left in reconnected_snaprealms should be other MDS' mdsdir. I think it's harmless. If there are snapshots in the FS, the assertion probably can catch real bugs. But at present the snapshot feature is broken, and fixing it is non-trivial. So replace the assertion with a warning. Signed-off-by: Yan, Zheng <zheng.z.yan@intel.com> (cherry picked from commit 26effc0e583b0a3dade6ec81ef26dec1c94ac8b2)
*	mon/PGMonitor: use post_paxos_update, not init, to refresh from osdmap	Sage Weil	2013-06-28	3	-7/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We do two things here: - make init a one-time unconditional init method, which is what the health service expects/needs. - switch PGMonitor::init to be post_paxos_update() which is called after the other services update, which is what PGMonitor really needs. This is a new version of the fix originally in commit a2fe0137946541e7b3b537698e1865fbce974ca6 (and those around it). That is, this re-fixes a problem where osds do not see pg creates from their subscribe due to map_pg_creates() not getting called. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit e635c47851d185eda557e36bdc4bf3775f7b87a2) Conflicts: src/mon/PGMonitor.cc src/mon/PGMonitor.h
*	mon/PaxosService: add post_paxos_update() hook	Sage Weil	2013-06-28	2	-0/+11
\| \| \| \| \| \| \| \| \| \|	Some services need to update internal state based on other service's state, and thus need to be run after everyone has pulled their info out of paxos. Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> (cherry picked from commit 131686980f0a930d5de7cbce8234fead5bd438b6)
*	ceph-disk: s/else if/elif/	Greg Farnum	2013-06-27	1	-1/+1
\| \| \| \| \| \| \|	Signed-off-by: Greg Farnum <greg@inktank.com> Reviewed-by: Joao Luis <joao.luis@inktank.com> (cherry picked from commit bd8255a750de08c1b8ee5e9c9a0a1b9b16171462) (cherry picked from commit 9e604ee6943fdb131978afbec51321050faddfc6)
*	rgw: fix radosgw-admin buckets list	Yehuda Sadeh	2013-06-26	2	-7/+13
\| \| \| \| \| \| \| \| \| \| \|	Fixes: #5455 Backport: cuttlefish This commit fixes a regression, where radosgw-admin buckets list operation wasn't returning any data. Signed-off-by: Yehuda Sadeh <yehuda@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com> (cherry picked from commit e1f9fe58d2860fcbb18c92d3eb3946236b49a6ce)
*	ceph-disk: use unix lock instead of lockfile class	Sage Weil	2013-06-26	1	-3/+25
\| \| \| \| \| \| \| \| \| \| \| \| \|	The lockfile class relies on file system trickery to get safe mutual exclusion. However, the unix syscalls do this for us. More importantly, the unix locks go away when the owning process dies, which is behavior that we want here. Fixes: #5387 Backport: cuttlefish Signed-off-by: Sage Weil <sage@inktank.com> Reviewed-by: Dan Mick <dan.mick@inktank.com> (cherry picked from commit 2a4953b697a3464862fd3913336edfd7eede2487)
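A minimal sketch of kernel-backed locking in Python, assuming `fcntl.lockf` as the locking call (the commit message does not say which syscall it uses) and a hypothetical helper named `exclusive_lock`. It requires a Unix-like system.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive advisory lock on path for the duration of the block."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)   # blocks until the lock is granted
        yield fd
    finally:
        # Closing the descriptor drops the lock. The kernel does the same
        # if the process dies, so no stale lock is left behind.
        os.close(fd)


lock_path = os.path.join(tempfile.mkdtemp(), "demo.lock")
with exclusive_lock(lock_path) as fd:
    held_fd = fd   # critical section would go here
```

Unlike a lock that is represented by the mere existence of a file, the leftover `demo.lock` file here is harmless: the lock itself lives in the kernel and is gone once the holder exits.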